• Windows 10 VM with Passthrough RX480 Full System Lockup


    Interstellar
    • Annoyance

    Note: This isn't a bug report per se, more a thread looking for potential fixes.

     

    I've been struggling with this for a while, and none of the 'solutions' I've found on the forums so far have worked.

    Background:

    I've been passing through an RX480 without a dongle (i.e. with VNC enabled) absolutely fine for months (uptime of 40+ days at one point). However, now that I want to use it as a remote Windows workstation, I've changed the setup to make Parsec work properly (4K HDMI dongle + Parsec).

     

    However, in that configuration it totally locks up the machine, with zero information displayed in the log or on the IPMI KVM view screen (it's a total and immediate lockup... which is useful).

     

    In addition, any attempt to close a Remote Desktop session also results in the machine totally locking up; a reboot is the only solution. Following the notes on changing the WDDM setting hasn't made a difference (in any case, an update KB is installed that allegedly fixes it).

     

    I have the AMD reset plugin installed and I'm running 6.10-rc2 (rc1 prior to that).

     

    Two cores + HT isolated (CPUs 2, 3, 6, 7) and assigned to the VM, along with 6144MB RAM.

    Q35-6.1 (i440fx never seems to work for me, gets stuck at the boot screen or Windows stops responding)

    OVMF TPM

    USB Controller: 3.0 (qemu XHCI)

    Windows 10 21H2 - totally 100% up-to-date, latest AMD drivers, latest VirtIO drivers, etc...
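    For anyone wanting to double-check the pinning listed above, here is a minimal Python sketch. It assumes the host exposes the usual Linux sysfs file `/sys/devices/system/cpu/cpuN/topology/thread_siblings_list`; the helper names are made up for illustration. It reports any core/HT sibling group that is only partly covered by the CPUs given to the VM.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Parse a kernel-style CPU list such as '2,6' or '2-3,6-7' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def incomplete_sibling_groups(assigned, siblings_of):
    """Return the sibling groups that `assigned` covers only partially.

    `siblings_of` is any callable mapping a CPU number to the set of
    hardware threads sharing its physical core (including itself).
    """
    incomplete = []
    seen = set()
    for cpu in sorted(assigned):
        group = frozenset(siblings_of(cpu))
        if group in seen:
            continue
        seen.add(group)
        if not group <= assigned:
            incomplete.append(sorted(group))
    return incomplete


def read_siblings(cpu):
    """Read one CPU's sibling list from sysfs (Linux only; path is an assumption)."""
    path = Path(f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list")
    return parse_cpu_list(path.read_text())
```

    On the server itself, `incomplete_sibling_groups(parse_cpu_list("2,3,6,7"), read_siblings)` should return an empty list if the four CPUs really are two complete core + HT pairs. This only validates the pinning; it says nothing about the lockup itself.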

     

    I've now taken the dongle out and added the VNC server back in and I'll test across the next two weeks (Parsec also disabled).

     

    Does anyone have any thoughts on what is causing UnRAID (or Windows to cause UnRAID) to completely lock up, either when closing a Remote Desktop connection or randomly at some point when VNC is not enabled?

     

    If it weren't for the fact that GPU prices are insane at the moment I'd pick up something newer, but alas we are where we are.

     

    I'll post the diagnostics plus information on other server configs when I have more time, but for now, if anyone has any ideas, please let me know and I'll try them!




    User Feedback

    Recommended Comments

    I've also now tried assigning more/less memory to the VM and changing vm.dirty_ratio and vm.dirty_background_ratio: no change. It still crashes within an hour.

     

    Memtest86+ passed and in any case it is ECC memory.

    Back to VNC without an HDMI plug: no issues 3 hours later...

     

    Again, nothing in the syslog saved to flash, nothing in an SSH session running "watch -n 0.1 tail /var/log/syslog", and nothing on the IPMI view: a total machine lockup.
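    As a possible alternative to polling with `watch`, here is a rough Python sketch of a log follower that copies each new syslog line to a second file and forces it to disk immediately. The function name, its parameters and the destination path are hypothetical. It does not handle log rotation, and an instant hard lockup may still leave nothing behind to capture.

```python
import os
import time


def follow(path, dest, poll_seconds=0.1, from_start=False, max_idle_polls=None):
    """Copy lines appended to `path` into `dest`, fsyncing after every line.

    Runs until interrupted unless `max_idle_polls` is set, in which case it
    stops after that many consecutive polls with no new data.
    Returns the number of lines copied.
    """
    copied = 0
    idle = 0
    with open(path, "r", errors="replace") as src, open(dest, "a") as out:
        if not from_start:
            src.seek(0, os.SEEK_END)  # start at the current end, like `tail -f`
        while max_idle_polls is None or idle < max_idle_polls:
            line = src.readline()
            if not line:
                idle += 1
                time.sleep(poll_seconds)
                continue
            idle = 0
            out.write(line)
            out.flush()
            os.fsync(out.fileno())  # ask the OS to push the line to the device now
            copied += 1
    return copied
```

    For example, `follow("/var/log/syslog", "/path/to/persistent/syslog-copy.log")` would run until interrupted; the destination is a placeholder for storage that survives a reboot.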

     

     

     

    cmdline.txt btrfs-usage.txt plugins.txt motherboard.txt iommu_groups.txt ethtool.txt folders.txt lsmod.txt lsscsi.txt loads.txt memory.txt lspci.txt meminfo.txt lsusb.txt urls.txt top.txt ps.txt vars.txt lscpu.txt df.txt ifconfig.txt Windows 10 Gaming.txt

    Windows 10 Gaming.xml

    Edited by Interstellar

    Running with the VNC option as GPU #1 has been perfect. No lockup.

     

    Does anyone have any ideas?

     

    I'm going to plug a keyboard, mouse and monitor in directly and play around next.


