6.9.2 broke my VMs with NVIDIA GPUs


Hello.

I decided to take the jump to 6.9.2 today, as the bug reports in the threads seem to have slowed down. Everything worked just fine on 6.8.3. After the first reboot into 6.9.2, my main Windows 10 VM was really slow. I checked the SSD and it seemed to be affected by the 1 MiB alignment "bug", so I moved everything off it, reformatted it using the Unassigned Devices plugin, and moved my VMs back.
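For readers who want to check the alignment themselves: a partition is 1 MiB-aligned when its starting byte offset is a multiple of 1,048,576. The snippet below is only a sketch of that arithmetic. It assumes 512-byte sectors, the helper name `is_mib_aligned` is made up for illustration, and the two example start sectors (64 and 2048) are assumptions about a typical older and newer partition layout, not values taken from the diagnostics in this post.

```python
MIB = 1024 * 1024  # bytes in one mebibyte


def is_mib_aligned(start_sector: int, sector_size: int = 512) -> bool:
    """Return True if the partition's first byte sits on a 1 MiB boundary.

    start_sector: first sector of the partition (hypothetical input).
    sector_size:  bytes per sector; 512 is an assumption, not read from disk.
    """
    return (start_sector * sector_size) % MIB == 0


if __name__ == "__main__":
    # Example start sectors are illustrative assumptions only.
    for start in (64, 2048):
        offset = start * 512
        print(f"start sector {start}: byte offset {offset}, "
              f"1 MiB aligned: {is_mib_aligned(start)}")
```

On Linux, the start sector of a partition can be read from `/sys/block/<disk>/<partition>/start`, which is expressed in 512-byte units.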

 

Everything seemed fine for around 30 minutes, and then it just crashed on me. The funny thing is that it works just fine using VNC; there are no problems at all. Everything works right up until the GPU driver is loaded in Windows. Then everything freezes or crashes. I have tried rebooting, and removing the GPU from vfio and adding it back in. Nothing seems to help. I then found this thread and followed the guide: I removed all the GPU drivers using VNC and added the GPU back in. Windows then booted just fine with display on my screens.

Then when I tried installing the GPU driver, it just crashed. And now I'm back to square one. I have a flash drive backup from before upgrading, so I might just end up downgrading again. But I hope someone can help me out here.

I get this line almost every time I boot up a VM with my GPU passed through to it.

Tower kernel: vfio-pci 0000:81:00.0: vfio_ecap_init: hiding ecap 0x19@0x900

I'm not sure whether this is intended. I have not seen it before, but it could of course be a 6.9 thing.
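To make the log line easier to read: the message comes from the vfio-pci driver, and the `0x19@0x900` part is a PCIe extended capability ID followed by its offset in the device's config space. The sketch below just pulls those fields out of such a line so they can be compared across boots or kernel versions. The function name `parse_hidden_ecap` and the regular expression are illustrative assumptions based only on the single line quoted above, not an official log format.

```python
import re

# Regex modelled on the one log line quoted in this post (an assumption,
# not a documented format): "<pci address>: vfio_ecap_init: hiding ecap ID@OFFSET"
ECAP_RE = re.compile(
    r"vfio-pci (?P<bdf>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]): "
    r"vfio_ecap_init: hiding ecap (?P<cap>0x[0-9a-fA-F]+)@(?P<off>0x[0-9a-fA-F]+)"
)


def parse_hidden_ecap(line: str):
    """Return device address, capability ID and offset, or None if no match."""
    m = ECAP_RE.search(line)
    if m is None:
        return None
    return {
        "device": m["bdf"],            # PCI address, e.g. domain:bus:slot.func
        "cap_id": int(m["cap"], 16),   # extended capability ID
        "offset": int(m["off"], 16),   # offset in PCI config space
    }


if __name__ == "__main__":
    line = ("Tower kernel: vfio-pci 0000:81:00.0: "
            "vfio_ecap_init: hiding ecap 0x19@0x900")
    print(parse_hidden_ecap(line))
```

Running the same parser over the syslog from a 6.8.3 boot and a 6.9.2 boot would show whether the message really is new in 6.9.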

 

Best regards,

Brydezen

tower-diagnostics-20210410-2236.zip

  • 1 month later...

VM: Ubuntu 20.04 LTS server VM with an Nvidia 1050 Ti set to pass-through.
Configuration: IPC NVR with TensorFlow for object detection and email alerts.
Behavior: With the 6.9.2 release applied, the TensorFlow alerts were delayed by half an hour to an hour. The attached email screenshots did not show the captured object as expected. After rolling Unraid back to the 6.8.3 release, object detection and alerts worked as expected.
Notes: On 6.9.2, the CPU usage of the programs managed by PM2 showed TensorFlow consistently using 200% CPU. In contrast, on 6.8.3 PM2 reported TensorFlow below 100% CPU usage, averaging 95-98%.
Conclusion: The only change to the Unraid server hosting this VM was the upgrade from 6.8.3 to 6.9.2. Downgrading to 6.8.3 was the only way to resolve this issue.

Edited by Thirs