CaptainSandwich

Members
  • Posts: 7
Everything posted by CaptainSandwich

  1. I too am experiencing similar crashes with a WDS500G2B0B, using a Marvell controller, as a cache drive. The system becomes unresponsive and must be rebooted, with no errors at all in the logs. Did either of you confirm that swapping your drives resolved the problem?

     EDIT: For me, it doesn't look like the SSD controller was the problem. I removed it from the system, changed USB keys and tried a few other troubleshooting steps, but the problem remained. Ultimately, the problem appears to have been related to Ryzen C-states or idle power. I have a Ryzen 1700X and an ASRock X470 Taichi (Rev 2.0 BIOS), and changing the 'Power Supply Idle Control' setting to 'typical current' looks to have resolved the issue. C-states were left on auto. The SSD is back in the system and has been stable for 36 hours.
  2. Seems to be working OK for passthrough; I'm not sure whether that's because the GPU BIOS has been passed through as well.

     Additionally, I've added the below to /etc/libvirt/hooks/qemu to re-enable persistence mode when the VM releases resources (see https://libvirt.org/hooks.html). This allows the card to enter the P8 state, where it draws ~4W whilst idle in Unraid, compared to ~9W in the default P0 state. I'm not sure how robust this is, or how it'll work once the second card is added (I may need to specify the address of the target GPU as well), but it appears to work for now.

     Added snippet:

         if (isset($argv[2], $argv[3]) && $argv[2] == 'release' && $argv[3] == 'end') {
             shell_exec('date +"%b %d %H:%M:%S libvirt hook: Setting nVidia persistence mode to 1" >> /var/log/syslog');
             shell_exec('nvidia-smi --persistence-mode=1');
         }

     Full script at /etc/libvirt/hooks/qemu:

         #!/usr/bin/env php
         <?php
         // Once a VM has released all its resources, re-enable nVidia
         // persistence mode so the card can drop to a low-power state.
         if (isset($argv[2], $argv[3]) && $argv[2] == 'release' && $argv[3] == 'end') {
             shell_exec('date +"%b %d %H:%M:%S libvirt hook: Setting nVidia persistence mode to 1" >> /var/log/syslog');
             shell_exec('nvidia-smi --persistence-mode=1');
         }

         // The rest of the hook only applies to the 'start' operation.
         if (!isset($argv[2]) || $argv[2] != 'start') {
             exit(0);
         }

         // libvirt passes the domain XML on stdin.
         $strXML = file_get_contents('php://stdin');
         $doc = new DOMDocument();
         $doc->loadXML($strXML);
         $xpath = new DOMXpath($doc);
         $args = $xpath->evaluate("//domain/*[name()='qemu:commandline']/*[name()='qemu:arg']/@value");

         for ($i = 0; $i < $args->length; $i++) {
             $arg_list = explode(',', $args->item($i)->nodeValue);
             if ($arg_list[0] !== 'vfio-pci') {
                 continue;
             }
             foreach ($arg_list as $arg) {
                 $keypair = explode('=', $arg);
                 if ($keypair[0] == 'host' && !empty($keypair[1])) {
                     vfio_bind($keypair[1]);
                     break;
                 }
  3. Did you ever find out the answer to this question? I'm looking to set up a script to set nvidia driver persistence mode when a VM shuts down, so the GPU can enter a low power state. Thanks.

     EDIT: It looks like /etc/libvirt/hooks/qemu is the one you want to edit. I modified the existing script to the below to set nvidia persistence mode when a VM releases all resources. The script update appears to persist between boots.

         #!/usr/bin/env php
         <?php
         // Once a VM has released all its resources, re-enable nVidia
         // persistence mode so the card can drop to a low-power state.
         if (isset($argv[2], $argv[3]) && $argv[2] == 'release' && $argv[3] == 'end') {
             shell_exec('date +"%b %d %H:%M:%S libvirt hook: Setting nVidia persistence mode to 1" >> /var/log/syslog');
             shell_exec('nvidia-smi --persistence-mode=1');
         }

         // The rest of the hook only applies to the 'start' operation.
         if (!isset($argv[2]) || $argv[2] != 'start') {
             exit(0);
         }

         // libvirt passes the domain XML on stdin.
         $strXML = file_get_contents('php://stdin');
         $doc = new DOMDocument();
         $doc->loadXML($strXML);
         $xpath = new DOMXpath($doc);
         $args = $xpath->evaluate("//domain/*[name()='qemu:commandline']/*[name()='qemu:arg']/@value");

         for ($i = 0; $i < $args->length; $i++) {
             $arg_list = explode(',', $args->item($i)->nodeValue);
             if ($arg_list[0] !== 'vfio-pci') {
                 continue;
             }
             foreach ($arg_list as $arg) {
                 $keypair = explode('=', $arg);
                 if ($keypair[0] == 'host' && !empty($keypair[1])) {
                     vfio_bind($keypair[1]);
                     break;
                 }
  4. Just had a look at VFIO-PCI Config, and it looked like the GPU was being bound to the vfio-pci driver. I unbound it, restarted, and everything seems to be working fine again.
  5. Hi itimpi, the intent is to use a secondary GPU for transcoding in Emby; the 1050 super is just for game streaming. Whilst not running a VM, the nvidia drivers allow the card to enter a low power state. Is this not correct? I have power cycled the machine and still received the nvidia-smi failed message prior to starting the VM.
  6. I'm having the same issue with nVidia Unraid 6.8.3, driver version 440.59, although I'm using a GTX 1050 super. It worked OK immediately after installing the build, but I have since passed the card through to a VM. Upon stopping the VM, nvidia-smi failed. After rebooting Unraid, nvidia-smi still fails, even before the card is passed through to a VM.
  7. I've finally managed to get my setup to a stage where the disks in the array spin down pretty consistently, but I think I'm now running into an issue with Fan Control. There are 2 SSDs not impacted by the array fan: the cache drive has been excluded from Auto Fan Control, but an NVMe drive is mounted via Unassigned Devices and does not show up in the list for exclusion. The log seems to indicate that when all the array drives have spun down, Auto Fan Control refers to the NVMe drive's temperature to determine fan speed, which means fan speed and noise increase with the array spun down. Is there any way to exclude drives mounted in Unassigned Devices from Auto Fan Control?
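
Following up on post 4 (the GPU being bound to vfio-pci): one way to confirm which kernel driver has claimed a card is to read the `driver` symlink for its PCI address under sysfs. This is only a rough sketch, assuming bash and the standard `/sys/bus/pci/devices` layout. The function name `bound_driver`, the `SYSFS_PCI_ROOT` override and the address `0000:01:00.0` are made up for illustration and are not part of Unraid or the plugin.

```shell
#!/usr/bin/env bash
# Print the kernel driver bound to a PCI device, or "none" if no driver is bound.
# SYSFS_PCI_ROOT is a hypothetical override (handy for dry runs); it defaults to
# the standard sysfs location.
bound_driver() {
    local addr="$1"
    local root="${SYSFS_PCI_ROOT:-/sys/bus/pci/devices}"
    local link="$root/$addr/driver"
    if [ -L "$link" ]; then
        # The symlink points at .../drivers/<name>; keep only the last component.
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# Example call with a hypothetical address (substitute your card's address):
# bound_driver 0000:01:00.0
```

If this prints `vfio-pci` for the card while no VM is running, the host nvidia driver never gets the device, which would fit nvidia-smi failing straight after a reboot.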