killeriq Posted June 3, 2020

I was playing around with that, but I can't definitely tell which step did the fix... I assume that when I moved to 6.9.0 beta 1 it upgraded the kernel and fixed it somehow.
mattz (Author) Posted June 4, 2020 (edited)

Wanted to follow up. The cause of my issue (the Ryzen 3900X hanging while trying to pass through the USB 3.0 controller) was definitely the FLR issue posted above. Luckily, someone on this forum had already compiled a kernel with a temporary fix, and I used that. Find that custom kernel for Unraid 6.8.3 here:

On 6/3/2020 at 6:50 AM, killeriq said:
I was playing around with that, but I can't definitely tell which step did the fix... I assume that when I moved to 6.9.0 beta 1 it upgraded the kernel and fixed it somehow.

Note that I tried Unraid 6.9.0-beta1 and it did not yet have the FLR fix in the Linux kernel. The fix will eventually make it into the Linux kernel, but probably not until 5.8, so it might be a while before it makes it into Unraid. Read more about the commit: https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/commit/?h=pci/virtualization&id=0d14f06cd665

@killeriq - not sure how you got it to work with Unraid 6.9.0 beta 1, but if it works, I would say that's the important part.

Edited June 4, 2020 by mattz: additional note about original problem with 3900x
Handl3vogn Posted June 30, 2020 (edited)

Hello, I found a beta BIOS with AGESA 1004b for the Asus Prime X370-Pro. I tried it and passthrough is now working again: https://www.hardwareluxx.de/community/threads/prime-x370-pro-combopi_1004patchb.1251325/

Just wanted to share in case anyone needs it.

Asus Prime X370-Pro + Ryzen 1700X

Edited June 30, 2020 by Handl3vogn
killeriq Posted July 5, 2020

On 6/3/2020 at 2:53 AM, mattz said:
@killeriq - I think I'm in the same boat now. I just upgraded my X470 board from the Ryzen 2700X to the 3900X (wanted the cores!). However, I am no longer able to pass through my motherboard's USB 3.0 controller the same way I did with the 2700X. I now get the same error you had, and the whole system locks up, requiring a hard reboot:
kernel: vfio-pci 0000:0c:00.0: not ready 1023ms after FLR; waiting
It is something others are encountering. The only way to fix it is to avoid passing through that particular USB controller and use other USB controllers, if you can:
There is also a kernel patch, it appears, that could fix it. So, I am not sure, does the latest Unraid BIOS fix it for you? It could be the kernel patch made it in...

After I added a 2nd GPU (I needed to do some testing), all was good. Then I removed it and kept only one, and the same issue started again, with freezes. Reading through your notes, some custom patch has to be applied (for version 6.8.3). I was already on 6.9.1b22, so I wasn't able to revert two versions back.

Anyway, I'm not really sure how I was able to run it before without any patch, but I assume this is the way. I wasn't able to start the VM module; as soon as I tried, the server froze with the error below. So what to do:

1. Disable IOMMU in the BIOS.
2. Start Unraid.
3. Start the VM module. For every VM that has "AMD Starship/Matisse PCIe Dummy Function | Non-Essential Instrumentation (0c:00.0)" assigned, disable autostart, then restart Unraid.
4. Enable IOMMU in the BIOS.
5. Unraid should boot and the VM module should be visible. Edit the VMs and look for "AMD Starship/Matisse PCIe Dummy Function | Non-Essential Instrumentation (0c:00.0)" added to your VM image. Untick it, then save. The next time you edit the VM, the device is no longer present.
6. Start the VM and all should be running fine.

I added limetech to my reply so the patch gets included, as it seems all users with the new Ryzen 3xxx series have the same problem.

"AMD Starship/Matisse PCIe Dummy Function | Non-Essential Instrumentation (0c:00.0)" is the source of the issues:

Jul 5 13:02:30 unRAIDTower kernel: vfio-pci 0000:0c:00.0: not ready 1023ms after FLR; waiting
Jul 5 13:02:32 unRAIDTower kernel: vfio-pci 0000:0c:00.0: not ready 2047ms after FLR; waiting
Jul 5 13:02:35 unRAIDTower kernel: vfio-pci 0000:0c:00.0: not ready 4095ms after FLR; waiting
Jul 5 13:02:40 unRAIDTower kernel: vfio-pci 0000:0c:00.0: not ready 8191ms after FLR; waiting
Jul 5 13:02:50 unRAIDTower kernel: vfio-pci 0000:0c:00.0: not ready 16383ms after FLR; waiting
Jul 5 13:03:07 unRAIDTower kernel: vfio-pci 0000:0c:00.0: not ready 32767ms after FLR; waiting
Jul 5 13:03:42 unRAIDTower kernel: vfio-pci 0000:0c:00.0: not ready 65535ms after FLR; giving up
Jul 5 13:03:43 unRAIDTower kernel: clocksource: timekeeping watchdog on CPU10: Marking clocksource 'tsc' as unstable because the skew is too large:
Jul 5 13:03:43 unRAIDTower kernel: clocksource: 'hpet' wd_now: b4700ed2 wd_last: b3954a18 mask: ffffffff
Jul 5 13:03:43 unRAIDTower kernel: clocksource: 'tsc' cs_now: 1d337ecfa60 cs_last: 1d337dd658c mask: ffffffffffffffff
Jul 5 13:03:43 unRAIDTower kernel: tsc: Marking TSC unstable due to clocksource watchdog
Jul 5 13:03:43 unRAIDTower kernel: TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
Jul 5 13:03:43 unRAIDTower kernel: sched_clock: Marking unstable (510899129422, -8570651)<-(510996221197, -105679272)
Jul 5 13:03:45 unRAIDTower kernel: clocksource: Switched to clocksource hpet
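The log above shows the kernel doubling its wait after each failed FLR until it gives up at 65535 ms. A minimal Python sketch for spotting that pattern in a syslog follows. The function name `flr_failures` is made up for illustration, and the regular expression is modeled only on the lines quoted in this thread, so other kernels may word the message differently.

```python
import re

# Modeled on the vfio-pci lines quoted in this thread, for example:
# "vfio-pci 0000:0c:00.0: not ready 65535ms after FLR; giving up"
FLR_RE = re.compile(
    r"vfio-pci (?P<addr>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"not ready (?P<ms>\d+)ms after FLR; (?P<state>waiting|giving up)"
)


def flr_failures(lines):
    """Return {pci_address: wait_ms} for devices where the kernel gave up on FLR."""
    failed = {}
    for line in lines:
        m = FLR_RE.search(line)
        if m and m.group("state") == "giving up":
            addr = m.group("addr")
            failed[addr] = max(failed.get(addr, 0), int(m.group("ms")))
    return failed


# Two lines copied from the log in the post above.
sample = [
    "Jul 5 13:02:30 unRAIDTower kernel: vfio-pci 0000:0c:00.0: not ready 1023ms after FLR; waiting",
    "Jul 5 13:03:42 unRAIDTower kernel: vfio-pci 0000:0c:00.0: not ready 65535ms after FLR; giving up",
]
print(flr_failures(sample))  # {'0000:0c:00.0': 65535}
```

On a live server the same function could be fed an open syslog file instead of the sample list; the path to that file is left out here because it depends on the setup.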
mattz (Author) Posted July 6, 2020

13 hours ago, killeriq said:
I added limetech to my reply so the patch gets included, as it seems all users with the new Ryzen 3xxx series have the same problem. "AMD Starship/Matisse PCIe Dummy Function | Non-Essential Instrumentation (0c:00.0)" is the source of the issues.

Good idea adding limetech. They may wait for the fix to be included in the Linux kernel, which should happen based on that commit I referenced. However, with the Ryzen 3600 and others SO CHEAP and performant, I am sure there are quite a few people moving to them.

BTW, good points about those steps you had to take. Super annoying; it's because the VM image will "remember" devices that are "removed". You can also edit the XML directly to remove the reference so you don't need the checkbox; however, it's a little bit of guesswork to figure out which XML element(s) it is.
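For the XML route mentioned above, the element in question is normally a `<hostdev>` whose `<source><address>` carries the host PCI address of the device. Below is a rough Python sketch that removes such an entry with the standard library. The helper name `remove_hostdev` and the sample domain XML are hypothetical, written in the shape of the hostdev entries posted later in this thread; it is an illustration, not the way Unraid itself edits VMs.

```python
import xml.etree.ElementTree as ET


def remove_hostdev(domain_xml, bus, slot, function):
    """Drop <hostdev> entries whose host-side <source><address> matches.

    Returns the new XML text and the number of entries removed.
    """
    root = ET.fromstring(domain_xml)
    devices = root.find("devices")
    if devices is None:
        return domain_xml, 0
    removed = 0
    for hostdev in list(devices.findall("hostdev")):
        addr = hostdev.find("source/address")
        if addr is None:
            continue
        found = tuple(int(addr.get(k, "0"), 16) for k in ("bus", "slot", "function"))
        if found == (bus, slot, function):
            devices.remove(hostdev)
            removed += 1
    return ET.tostring(root, encoding="unicode"), removed


# Hypothetical VM definition with two passed-through devices.
sample = """<domain type='kvm'>
  <devices>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x0c' slot='0x00' function='0x0'/>
      </source>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x09' slot='0x00' function='0x0'/>
      </source>
    </hostdev>
  </devices>
</domain>"""

# Remove the 0c:00.0 device and keep the one at 09:00.0.
new_xml, count = remove_hostdev(sample, bus=0x0C, slot=0x00, function=0x0)
print(count)
```

Matching on the host-side address rather than on the position of the element avoids most of the guesswork, since 0c:00.0 is the same address that appears in the kernel log.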
killeriq Posted July 6, 2020

11 hours ago, mattz said:
BTW, good points about those steps you had to take. Super annoying; it's because the VM image will "remember" devices that are "removed". You can also edit the XML directly to remove the reference so you don't need the checkbox; however, it's a little bit of guesswork to figure out which XML element(s) it is.

I was in the state where I had the VM module off; as soon as I enabled it, the server froze and needed a reboot. So I couldn't get into any VM config via the WebUI.

FYI: someone replied that Limetech will fix it in the next release. But Ryzen 3xxx CPUs have been on the market for over 6 months and still have this issue... Everyone complains about Windows, but hardware support seems to arrive much faster there. Linux has always been delayed, unless you are a Linux guru who compiles his own kernel.
RaidBoi1904 Posted July 21, 2020

Does anyone know when the next release is? I've used the kernel that was linked by another member here, and that allowed me to finally pass my audio card (kernel + pcie_no_flr=1022:1487, because pcie_no_flr=1022:149c,1022:1487 crashes the system). But trying to pass both the audio card and the USB controller crashes unRaid. I'm done testing random things; I've rebooted my poor server out of more than 20 unRaid hangups, and I'm going to kill my array if I keep this up. I just bought my own copy of unRaid 3 days ago. I love it for dockers and arrays, but VMs have been an absolute nightmare.
mattz (Author) Posted July 21, 2020

16 minutes ago, RaidBoi1904 said:
Does anyone know when the next release is? I've used the kernel that was linked by another member here, and that allowed me to finally pass my audio card (kernel + pcie_no_flr=1022:1487, because pcie_no_flr=1022:149c,1022:1487 crashes the system). But trying to pass both the audio card and the USB controller crashes unRaid. I'm done testing random things; I've rebooted my poor server out of more than 20 unRaid hangups, and I'm going to kill my array if I keep this up. I just bought my own copy of unRaid 3 days ago. I love it for dockers and arrays, but VMs have been an absolute nightmare.

@RaidBoi1904 You are a champ for jumping head-first into this issue with a new UnRaid setup. And sorry to hear the problems hit all at once... they are not so bad when they pop up once every 2 years after a major hardware upgrade. But your first time out can be rough.

So, to pass through audio and USB (or anything), you will need to isolate them (in addition to the no_flr hack needed right now for this mobo/CPU combo). It looks like you know where you're going: Main > Flash > Syslinux Configuration to add these lines.

My setup looks like this for just the USB. Notice the vfio-pci.ids for isolation. I don't know if I need all of them, but I do them as a group and it works:

pcie_no_flr=1022:149c,1022:1487,1022:1485 vfio-pci.ids=1022:149c,1022:1487,1022:1485

You will also need to isolate the audio device to pass it through. On my mobo it looks like it's 10de:10f0, so you would add that to vfio-pci.ids:

I use the Arctis Pro Wireless headset, which has an external USB driver, so I don't need the audio controller.
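The two parameters above take the same comma-separated list of vendor:device IDs, so the list is easy to mistype in one place and not the other. Here is a small Python sketch that builds the line from a single list. The helper `build_append` is hypothetical, the `append ... initrd=/bzroot` layout is assumed from the syslinux stanza RaidBoi1904 posts further down, and `pcie_no_flr` is, as far as this thread shows, only understood by the patched custom kernel.

```python
import re

# vendor:device, four hex digits each, as in 1022:149c
ID_RE = re.compile(r"^[0-9a-f]{4}:[0-9a-f]{4}$")


def build_append(ids, no_flr=True, initrd="/bzroot"):
    """Build a syslinux 'append' line from vendor:device IDs.

    no_flr adds pcie_no_flr=..., which assumes the patched kernel from
    this thread; set it to False for a stock kernel.
    """
    ids = [i.lower() for i in ids]
    bad = [i for i in ids if not ID_RE.match(i)]
    if bad:
        raise ValueError(f"not vendor:device IDs: {bad}")
    joined = ",".join(dict.fromkeys(ids))  # de-duplicate, keep order
    parts = []
    if no_flr:
        parts.append(f"pcie_no_flr={joined}")
    parts.append(f"vfio-pci.ids={joined}")
    parts.append(f"initrd={initrd}")
    return "append " + " ".join(parts)


print(build_append(["1022:149c", "1022:1487", "1022:1485"]))
# append pcie_no_flr=1022:149c,1022:1487,1022:1485 vfio-pci.ids=1022:149c,1022:1487,1022:1485 initrd=/bzroot
```

Adding the audio device from the post would then just mean appending "10de:10f0" to the list passed in.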
RaidBoi1904 Posted July 28, 2020 (edited)

@mattz Thank you for your reply. Since I wrote that message I have tried a ton of things to get this to work, and out of desperation I went back and tried all of those things once again! I did learn a few things, so it is not all wasted, but I still haven't achieved a working solution (I did have a mobo brick itself via a wifi update; that was cool lol, and apparently also common!).

On several occasions I got all the devices installed with their drivers, but the computer always ended up freezing and refused to reboot (displaying a blue screen with a kernel security error). It turns out I can't pass any of the USB hubs to my VM. They appear to share an ID, so regardless of which USB hub I pass, as soon as one is passed unRaid is no longer able to access the unRaid USB. As I write this, I'm thinking I should go change the groupings in the VM settings to see if that gives them different IDs.

At any rate, I'm back from trying beta .22, .23, and .24 with regular and custom kernels.
I'm now on the stable branch with a custom kernel and the following flags:

label Unraid OS
  menu default
  kernel /bzimage
  append pcie_no_flr=1022:1487,144d:a808,8086:2526 vfio-pci.ids=1022:1487,144d:a808,8086:2526 initrd=/bzroot

This is how my current VM looks:

I've put my video card and its sound card on the same bus in the XML like so:

<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x09' slot='0x00' function='0x0'/>
  </source>
  <rom file='/mnt/user/isos/zotak-1070.rom'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0' multifunction='on'/>
</hostdev>
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x09' slot='0x00' function='0x1'/>
  </source>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x1'/>
</hostdev>

Right now I'm stuck with error 43, although I've passed this card to 20+ different VMs with this BIOS and this XML without issues (passing the video card is the only thing that always worked, and now it doesn't lol). So I'm somehow worse off than a week ago.

---------------------

Update #1: I don't like this at all, but the fix for my error 43 was disabling UEFI boot... that makes a lot of sense to me. I will attempt to pass audio now, giving up on the USB hubs for now, as changing the grouping method didn't change the "1022:149c" ID shared by all of my USB hubs =/.

Update #2: I'm unable to install drivers for the sound device I passed through, but it appears to be working OK with the generic Windows driver at the moment. I will start testing some games to see if this attempt doesn't crash. Oh, I was going to pass group 39, but that is the USB hub with the same vendor ID as group 31. As I type this, I will go see if I can find a fix for passing devices with the same vendor ID and duct-tape one more thing to this VM!

Edited July 28, 2020 by RaidBoi1904
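The shared "1022:149c" ID matters because `vfio-pci.ids` matches by vendor:device ID, so every controller carrying that ID is claimed at once, including the one the Unraid flash drive is plugged into. A rough Python sketch for spotting such collisions in `lspci -n` style output follows. The function `shared_ids` and the sample listing are hypothetical; in particular the slot addresses are invented for illustration and are not taken from anyone's system in this thread.

```python
import re
from collections import defaultdict

# `lspci -n` style line: "<slot> <class>: <vendor>:<device> (rev xx)"
LINE_RE = re.compile(
    r"^(?P<slot>\S+)\s+[0-9a-f]{4}:\s+(?P<id>[0-9a-f]{4}:[0-9a-f]{4})"
)


def shared_ids(lspci_n_output):
    """Return {vendor:device: [slots]} for IDs used by more than one device."""
    by_id = defaultdict(list)
    for line in lspci_n_output.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            by_id[m.group("id")].append(m.group("slot"))
    return {dev_id: slots for dev_id, slots in by_id.items() if len(slots) > 1}


# Hypothetical listing: one GPU with its audio function, two USB controllers.
sample = """\
09:00.0 0300: 10de:1b81 (rev a1)
09:00.1 0403: 10de:10f0 (rev a1)
0b:00.3 0c03: 1022:149c
0c:00.3 0c03: 1022:149c
"""
print(shared_ids(sample))  # {'1022:149c': ['0b:00.3', '0c:00.3']}
```

Any ID that shows up here cannot be isolated for a single device through `vfio-pci.ids` alone; binding by PCI address rather than by ID is the usual way around that.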
mattz (Author) Posted March 22, 2021

Wanted to close the loop on this. I *think* this issue has been fully resolved with the release of UnRaid 6.9.0, since it uses the Linux kernel 5.10.x branch: https://wiki.unraid.net/Unraid_OS_6.9.0

The original Linux kernel fix for the AMD 3xxx/Zen CPUs was implemented in 5.8.x, so we should be good now: https://github.com/torvalds/linux/commit/39a1af76195086349c4302f01e498a5fcbcb11d6

I have not yet tried it, but I will when I have a few days free to absorb the frustration if I have to revert.
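The reasoning in this post is a version comparison: the fix landed in 5.8.x, and 5.10.x is newer. A minimal Python sketch of that check follows. The function name `kernel_at_least` and the example release strings are illustrative assumptions, and a version number alone cannot prove that a particular patch is present (custom or backported kernels can differ either way).

```python
import re


def kernel_at_least(release, minimum=(5, 8)):
    """Compare the major.minor part of a kernel release string to a minimum."""
    m = re.match(r"(\d+)\.(\d+)", release)
    if not m:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return (int(m.group(1)), int(m.group(2))) >= minimum


# Illustrative release strings, not read from a real server.
print(kernel_at_least("5.10.21-Unraid"))   # True
print(kernel_at_least("4.19.107-Unraid"))  # False
```

To check the running system, `platform.release()` from the standard library returns the release string of the current kernel and can be passed straight in.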
xsinmyeyes Posted March 22, 2021

7 hours ago, mattz said:
Wanted to close the loop on this. I *think* this issue has been fully resolved with the release of UnRaid 6.9.0, since it uses the Linux kernel 5.10.x branch: https://wiki.unraid.net/Unraid_OS_6.9.0 The original Linux kernel fix for the AMD 3xxx/Zen CPUs was implemented in 5.8.x, so we should be good now: https://github.com/torvalds/linux/commit/39a1af76195086349c4302f01e498a5fcbcb11d6 I have not yet tried it, but I will when I have a few days free to absorb the frustration if I have to revert.

I updated the BIOS on my Asus ROG STRIX X370-F, which previously had this issue and forced me to run a BIOS from 2018. My Win10 VM with GPU passthrough is running issue-free on 6.9.1.
mattz (Author) Posted March 23, 2021

21 hours ago, xsinmyeyes said:
I updated the BIOS on my Asus ROG STRIX X370-F, which previously had this issue and forced me to run a BIOS from 2018. My Win10 VM with GPU passthrough is running issue-free on 6.9.1.

Just did my upgrade to Unraid 6.9.1, and it is all running smoothly!

I have not yet removed the kernel VFIO definitions in the boot flash, but I will switch over to the new, integrated menu in Settings when I get a chance, because I should be able to remove both pcie_no_flr (no longer needed) and vfio-pci.ids (now in Settings > VFIO-PCI Config).

🍻