• 6.6 Upgrade - GPU Passthrough Issues


    CSeK
    • Solved Minor

    I upgraded from 6.5.3 to the 6.6 stable release. Everything looked good and ran as expected until I tried to start my Windows 10 VM with GPU passthrough (GTX 1080).

    Threadripper 1950X CPU

    PCIe ACS override set to downstream, multifunction

    vfio-pci bound to the passed-through device IDs at boot

     

    This worked smoothly in 6.5.3 and earlier versions.

    With the exact same settings in 6.6, the VM fails to start and the log reports the GPU stuck in the D3 power state.

     

    Reverted to 6.5.3: the VM starts as expected again.

    Upgraded again to 6.6: the issue persists.

    Reverted to 6.5.3: works fine.

     

    I was still running 6.5.3 when I generated the attached diagnostics. Let me know if I should upgrade, attempt to boot the VM, and then run diagnostics again.

    unraid-diagnostics-20180920-1449.zip





    Attached is the 6.6 diag.

    Upgraded to 6.6

    Booted the VM and terminated it after it got stuck in D3

    Logs say...

    2018-09-20T20:42:51.541859Z qemu-system-x86_64: vfio: Unable to power on device, stuck in D3
    2018-09-20T20:43:03.206433Z qemu-system-x86_64: terminating on signal 15 from pid 7961 (/usr/sbin/libvirtd)

    Attempting to kill the VM fails, stating that the resource is busy
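For anyone who wants to check their own VM log for the same symptom, here is a minimal Python sketch. It only searches for the exact message quoted above; the helper name `find_d3_errors` and the inline sample are illustrative assumptions, not part of Unraid or QEMU.

```python
import re

# Matches the vfio message quoted in this report.
D3_PATTERN = re.compile(r"vfio: Unable to power on device, stuck in D3")


def find_d3_errors(log_text):
    """Return the log lines that report a passthrough device stuck in D3."""
    return [line.strip() for line in log_text.splitlines()
            if D3_PATTERN.search(line)]


# Sample taken from the two log lines posted above.
sample_log = """\
2018-09-20T20:42:51.541859Z qemu-system-x86_64: vfio: Unable to power on device, stuck in D3
2018-09-20T20:43:03.206433Z qemu-system-x86_64: terminating on signal 15 from pid 7961 (/usr/sbin/libvirtd)
"""

for hit in find_d3_errors(sample_log):
    print(hit)
```

To use it on a real log, read the file contents into a string and pass that string to `find_d3_errors`.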

    Downloaded the diag.

    unraid-diagnostics-20180920-1643.zip


    You scared me a little, as I came here looking for info on the new PCIe ACS Override options (e.g. Downstream, Multifunction) without any luck, but I just checked my VM and GPU passthrough is still working after the upgrade. One difference I noticed between our setups was the ACS Override settings...

     

    Yours

    Quote

    BOOT_IMAGE=/bzimage pcie_acs_override=downstream,multifunction vfio-pci.ids=10de:1b80 vfio-pci.ids=10de:10f0 vfio-pci.ids=1b73:1100 vfio_iommu_type1.allow_unsafe_interrupts=1 initrd=/bzroot

     

    Mine

    Quote

    BOOT_IMAGE=/bzimage pcie_acs_override=downstream initrd=/bzroot

     

    I'm not sure if you manually added the other portions or not, but mine is much simpler. Both your 6.5.3 and 6.6.0 installs appear to have the same settings, so maybe that's not the issue? I'll attach my diagnostics bundle in case it helps you out.
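The comparison of the two boot lines above can also be done mechanically. This is a rough Python sketch under simple assumptions: `parse_cmdline` is a hypothetical helper, and it splits on whitespace only, so it does not handle quoted values.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a dict of key -> list of values.

    A list is used because a key (here vfio-pci.ids) can appear more
    than once. Flags without '=' map to an empty list.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        values = params.setdefault(key, [])
        if sep:
            values.append(value)
    return params


# The two boot lines quoted above.
yours = ("BOOT_IMAGE=/bzimage pcie_acs_override=downstream,multifunction "
         "vfio-pci.ids=10de:1b80 vfio-pci.ids=10de:10f0 vfio-pci.ids=1b73:1100 "
         "vfio_iommu_type1.allow_unsafe_interrupts=1 initrd=/bzroot")
mine = "BOOT_IMAGE=/bzimage pcie_acs_override=downstream initrd=/bzroot"

a, b = parse_cmdline(yours), parse_cmdline(mine)

# Print only the parameters that differ between the two setups.
for key in sorted(set(a) | set(b)):
    if a.get(key) != b.get(key):
        print(f"{key}: yours={a.get(key)} mine={b.get(key)}")
```

On a running system the active command line can be read from `/proc/cmdline` and passed to the same function.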

     

    undou-diagnostics-20180921-0213.zip


    On older versions of Unraid I had to use the pcie_acs_override=downstream,multifunction option to break up my IOMMU groups and get each GPU into its own group. With the newest BIOS (AGESA 1.1.0.0) + Unraid 6.6 this isn't needed anymore, at least for me. My network adapters are still all in one group, and so are the USB controllers, which I don't pass through anyway. Maybe check if you're on an up-to-date BIOS? @CSeK I also disabled the ACS Override option in the VM Manager settings and everything works.
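To see how devices are grouped with and without the ACS override, the IOMMU groups can be read from sysfs. The following is a minimal Python sketch, assuming a Linux host that exposes `/sys/kernel/iommu_groups`; the function name `list_iommu_groups` is made up for illustration.

```python
import os


def list_iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses it contains."""
    groups = {}
    if not os.path.isdir(root):
        return groups  # IOMMU not enabled, or not a Linux host
    for name in os.listdir(root):
        devices_dir = os.path.join(root, name, "devices")
        if name.isdigit() and os.path.isdir(devices_dir):
            groups[int(name)] = sorted(os.listdir(devices_dir))
    return groups


# Print one line per group: group number followed by its PCI addresses.
for group, devices in sorted(list_iommu_groups().items()):
    print(group, " ".join(devices))
```

A GPU is generally easiest to pass through when its group contains only the GPU's own functions (typically video and audio).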

     

    Edit:

    Your logs show you're on an old BIOS

    Sep 20 16:41:34 unRAID kernel: DMI: System manufacturer System Product Name/ROG ZENITH EXTREME, BIOS 0902 12/21/2017

     

    ASUS released a new BIOS version in August with the newer AGESA version.

    ROG ZENITH EXTREME BIOS 1402
    Update AGESA 1.1.0.1 Patch A to support AMD 2nd Gen Ryzen™ Threadripper™ processors
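The BIOS version check above can be scripted against the syslog. This is a small Python sketch; the regular expression is an assumption based only on the `DMI:` line quoted in this post, and `parse_dmi_line` is a hypothetical helper name.

```python
import re

# Pattern modelled on the "DMI:" kernel line quoted above:
# "<system>/<board>, BIOS <version> <MM/DD/YYYY>"
DMI_PATTERN = re.compile(
    r"DMI: (?P<system>.+), BIOS (?P<version>\S+) (?P<date>\d{2}/\d{2}/\d{4})"
)


def parse_dmi_line(line):
    """Return system, BIOS version and BIOS date from a DMI log line, or None."""
    match = DMI_PATTERN.search(line)
    return match.groupdict() if match else None


line = ("Sep 20 16:41:34 unRAID kernel: DMI: System manufacturer System Product Name/"
        "ROG ZENITH EXTREME, BIOS 0902 12/21/2017")

info = parse_dmi_line(line)
print(info["version"], info["date"])
```

The extracted version can then be compared by hand against the latest release listed by the board vendor.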

     

     

    Edited by bastl

    Updated the BIOS

    Re-applied BIOS settings

    • Server Virtualization
    • Enumerate IOMMU

    Booted Unraid

    Booted VM

    Profit.

    @bastl - thanks for the suggestion there

     

    Side note: I removed the ACS override before attempting the BIOS update and got the same result.

    Side note 2: since I had already removed the ACS override, I did not re-enable it after the BIOS update either. (I now have the updated BIOS and no ACS overrides set.)




