Rhynri Posted December 29, 2022 (edited)
Hello - and thanks for taking a peek! I'm running into an issue: although I can see my 6 SATA drives and 3 NVMe drives on the BIOS POST screen, only two of the NVMe drives show up in the OS itself. I'm also noticing that none of the USB devices attached to the system, aside from the boot drive, are visible, including USB drives plugged in after boot. An NVMe drive plugged into a PCIe slot via a PCIe -> M.2 adapter did appear. What's even weirder is that the third "missing" NVMe controller does seem to show up, but it doesn't appear as a drive.
The motherboard is a Pro WS WRX80E-SAGE SE WIFI with an AMD Threadripper Pro 5965WX, 128GB of system RAM, and no PCIe slot devices installed at the moment (for testing). I've tried making a new Unraid USB - the new sticks boot fine but do not resolve the issue.
I see a lot of "pci 0000:22:06.0: BAR 14: failed to assign"-type messages in the syslog, so I'm going to fiddle with the RAM a bit and see if I have a bad stick or something, since it's my understanding that bad RAM can cause this. Worth a shot. Diags [Edit: from Safe Mode] attached: tower-diagnostics-20221228-2321.zip
Edited December 29, 2022 by Rhynri: add "Safe Mode".
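For anyone triaging the same symptom, here is a minimal sketch of how the "failed to assign" messages could be tallied per PCI device from a saved syslog. The function name `count_bar_failures` is hypothetical, and the pattern assumes the "BAR N: failed to assign" wording quoted in the post; other kernel versions may word the message differently.

```python
import re
from collections import Counter

# Assumed message shape, based on the line quoted in the post:
#   pci 0000:22:06.0: BAR 14: failed to assign ...
BAR_FAIL = re.compile(
    r"pci (?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"BAR (?P<bar>\d+): failed to assign"
)


def count_bar_failures(log_text: str) -> Counter:
    """Map each PCI address to its number of failed BAR assignments."""
    return Counter(m.group("dev") for m in BAR_FAIL.finditer(log_text))
```

Feeding it the contents of the syslog from a diagnostics zip and printing `count_bar_failures(text).most_common()` would show which bridges or devices are affected most often, which can help narrow down the slot or controller involved.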
Vr2Io Posted December 29, 2022
Try disabling SVM (IOMMU), then check whether anything changes. If it does, then try each of the SVM options in turn.
Rhynri (Author) Posted December 29, 2022 (edited)
I tried disabling IOMMU first (it's a separate setting - or rather a half dozen - on my board) and then SVM. No change. For the record, booting into Windows (I had a disposable install on one of the NVMe drives that's going to be in the cache) with the current settings shows all devices as expected. So this is an Unraid-only issue - the hardware itself seems to be working fine.
Edited December 29, 2022 by Rhynri: tweaks for clarity.
Vr2Io Posted December 29, 2022
Yes, I mixed up the name SVM.
Rhynri (Author) Posted December 29, 2022
No worries, just trying to be very clear about what I attempted, for posterity.
SimonF Posted December 29, 2022 (edited) [Solution]
Have you tried `pci=realloc=off`?
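As a rough illustration of where such a parameter goes: on Unraid, kernel parameters are normally added to the `append` line of the syslinux configuration on the flash drive. The sketch below assumes a config of the usual `append initrd=/bzroot` shape; the helper name `add_kernel_param` is hypothetical, and it edits every `append` line it finds, so review the result before saving it.

```python
def add_kernel_param(cfg_text: str, param: str = "pci=realloc=off") -> str:
    """Insert `param` into each syslinux `append` line that lacks it."""
    out = []
    for line in cfg_text.splitlines():
        stripped = line.lstrip()
        tokens = stripped.split()
        # Syslinux keywords are case-insensitive, so compare in lowercase.
        if tokens and tokens[0].lower() == "append" and param not in tokens[1:]:
            indent = line[: len(line) - len(stripped)]
            rest = " ".join(tokens[1:])
            line = f"{indent}{tokens[0]} {param} {rest}".rstrip()
        out.append(line)
    return "\n".join(out) + "\n"
```

Given a boot entry containing `append initrd=/bzroot`, the function returns the same text with the line changed to `append pci=realloc=off initrd=/bzroot`. Running it a second time leaves the text unchanged, so it is safe to re-apply.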
Rhynri (Author) Posted December 29, 2022
SimonF said: Have you tried `pci=realloc=off`?
Yes! That worked! What exactly does that do, and how would one go about figuring out that they need it in the absence of wonderful people like you? (i.e. - is there a list of boot options and their functions?)
SimonF Posted December 29, 2022
I haven't had to use it myself, but the details are below.
realloc= Enable/disable reallocating PCI bridge resources if allocations done by BIOS are too small to accommodate resources required by all child devices.
    off: Turn realloc off
    on: Turn realloc on
realloc same as realloc=on
https://static.lwn.net/kerneldoc/admin-guide/kernel-parameters.html
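To check which value is actually in effect on a running system, the kernel command line can be inspected. The sketch below is one way to do that, assuming the `pci=` parameter carries a comma-separated option list as described in the kernel documentation linked above; the function name `pci_realloc_setting` is hypothetical.

```python
from typing import Optional


def pci_realloc_setting(cmdline: str) -> Optional[str]:
    """Return the realloc value found in a kernel command line, or None."""
    setting = None
    for token in cmdline.split():
        if not token.startswith("pci="):
            continue
        # pci= takes a comma-separated list, e.g. pci=realloc=off,noaer
        for opt in token[len("pci="):].split(","):
            if opt == "realloc":
                setting = "on"  # bare "realloc" is the same as realloc=on
            elif opt.startswith("realloc="):
                setting = opt.split("=", 1)[1]
    return setting
```

On Linux, passing it the contents of `/proc/cmdline` shows whether the option made it onto the boot line. A result of `None` means no `pci=realloc` option was given and the kernel default applies.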
Rhynri (Author) Posted December 29, 2022
Thanks for the follow-up. I had to sleep a bit, but it looks like @JorgeB kindly marked your post as the solution. I had a look again today, and the setting also restored the USB devices. After looking at this kernel PCI patch thread, I'll experiment with the 4G BIOS settings to see if I can run without this setting.
Rhynri (Author) Posted December 29, 2022 (edited)
Having experimented in the BIOS, I was able to boot with full functionality and without `pci=realloc=off` by going into my PCI Subsystem settings and changing the following:
1) Above 4G Decoding - ENABLED
2) Re-Size BAR Support - AUTO
3) SR-IOV Support - ENABLED
4) BME DMA Mitigation - DISABLED [Edit - see below for explanation]
5) Hot-Plug Support - DISABLED
It's also worth noting that I'm not booting in EFI - I prefer the way the onboard ASPEED VGA outputs a functional console in legacy boot.
Edited December 30, 2022 by Rhynri: fix BME DMA to not mislead anyone.
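After changing BIOS settings like these, one way to confirm that the original symptom (an NVMe controller that is present but exposes no drive) is gone is to compare controllers and block devices in sysfs. This is a minimal sketch, assuming controllers appear under `/sys/class/nvme` and namespaces under `/sys/block` with names like `nvme0n1`; systems using NVMe multipath may name things differently. The function name and the `sys_root` parameter are hypothetical.

```python
import os
import re
from typing import List

# Block device names such as nvme0n1 belong to controller nvme0.
NAMESPACE = re.compile(r"^(nvme\d+)n\d+$")


def controllers_without_namespaces(sys_root: str = "/sys") -> List[str]:
    """List NVMe controllers that have no matching block device."""
    ctrl_dir = os.path.join(sys_root, "class", "nvme")
    block_dir = os.path.join(sys_root, "block")
    controllers = sorted(os.listdir(ctrl_dir)) if os.path.isdir(ctrl_dir) else []
    blocks = os.listdir(block_dir) if os.path.isdir(block_dir) else []
    with_ns = {m.group(1) for m in map(NAMESPACE.match, blocks) if m}
    return [c for c in controllers if c not in with_ns]
```

An empty list means every detected controller has at least one drive attached to it. Any name returned is a controller in the same state as the third "missing" NVMe device described at the start of the thread.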
Rhynri (Author) Posted December 30, 2022
So there is another level of complexity here. It turns out SR-IOV Support is the setting that actually solves this. Without it on, the symptoms return. If you have BME DMA Mitigation on, then when you pass through the cards they won't return to the system (and can't be reset).
However, I now crash hard in Windows when trying to game. At first it games fine (I'm hitting my monitor refresh rate in Deep Rock Galactic), but then I get a VIDEO_DXGKRNL_FATAL_ERROR. After some tinkering I now get one when shutting down, and the video card will not return to the system when this happens, but I can still boot. I'm going to load up a blank Windows 11 install and see if it has the same issues.
Rhynri (Author) Posted December 30, 2022
So after much testing and experimentation, it appears that the existing install of Windows is broken - a fresh Win11 install is a straight beast on the new hardware and solid as a rock. It could also be that running the old install off of the M.2 to PCIe slot adapter is what's broken; I'm not sure which. As an experiment, I'm going to rip the drive to a QEMU disk image and boot from that. I'll keep recording my journey here for posterity in case it helps someone else.
Rhynri (Author) Posted January 4, 2023 (edited)
To follow up - the VIDEO_DXGKRNL_FATAL_ERROR was actually caused by AquaComputer's Aquasuite. It turns out it has some video and audio services that try to access low-level resources in the system and cause the graphics subsystem to crash. Once these are disabled, the problem goes away. I still have the BAR issues and a nasty hard-lock crash detailed here.
Edited January 4, 2023 by Rhynri: clarified fix.
SimonF Posted January 4, 2023
Are you able to post your current syslog so I can also look at your spindown issue?
SohailS Posted January 4
On 12/29/2022 at 7:19 AM, Rhynri said: I tried disabling IOMMU first (it's a separate setting - or rather a half dozen - on my board) and then SVM. No change. For the record, booting into Windows (I had a disposable install on one of the NVMe drives that's going to be in the cache) with the current settings shows all devices as expected. So this is an Unraid-only issue - the hardware itself seems to be working fine.
I'm trying to enable IOMMU on this board, the ASUS WRX80E-SAGE. What are those settings? You say there are a few?
Rhynri (Author) Posted January 5
I sold the original and have a Sage WIFI II now. As far as I recall, there are several IOMMU settings in different sections. Just look for anything IOMMU- and virtualization-related. If you have a SAGE II, I can give you my mobo settings.
alessandro15 Posted January 12
On 1/5/2024 at 6:30 AM, Rhynri said: I sold the original and have a Sage WIFI II now. As far as I recall, there are several IOMMU settings in different sections. Just look for anything IOMMU- and virtualization-related. If you have a SAGE II, I can give you my mobo settings.
Hi, I have the SAGE II. Have you made any particular changes?