Rhynri Posted January 4, 2023

Hello! I've been experiencing some wonkiness with my Unraid server, including hard locks that take out the VMs (displays go black), the HTTP interface, and SSH. In the logs I'm seeing many lines like these:

Jan 4 10:49:11 Tower kernel: pci 0000:00:03.1: BAR 15: no space for [mem size 0x00600000 64bit pref]
Jan 4 10:49:11 Tower kernel: pci 0000:00:03.1: BAR 15: failed to assign [mem size 0x00600000 64bit pref]
Jan 4 10:49:11 Tower kernel: pci 0000:00:03.1: BAR 13: no space for [io size 0x2000]
Jan 4 10:49:11 Tower kernel: pci 0000:00:03.1: BAR 13: failed to assign [io size 0x2000]
Jan 4 10:49:11 Tower kernel: clipped [mem size 0x00000000 64bit pref] to [mem size 0xfffffffffffc0000 64bit pref] for e820 entry [mem 0x000a0000-0x000fffff]
Jan 4 10:49:11 Tower kernel: clipped [mem size 0x00020000 64bit pref] to [mem size 0xfffffffffffe0000 64bit pref] for e820 entry [mem 0x000a0000-0x000fffff]
Jan 4 10:49:11 Tower kernel: pci 0000:00:03.1: BAR 15: no space for [mem size 0x00200000 64bit pref]
Jan 4 10:49:11 Tower kernel: pci 0000:00:03.1: BAR 15: failed to assign [mem size 0x00200000 64bit pref]
Jan 4 10:49:11 Tower kernel: pci 0000:00:03.1: BAR 13: no space for [io size 0x2000]
Jan 4 10:49:11 Tower kernel: pci 0000:00:03.1: BAR 13: failed to assign [io size 0x2000]

It's almost like Unraid is booting in 32-bit mode or something and running out of memory space - although I wouldn't think this is possible. In a previous post here, I detailed some workarounds I tried to get the system to boot, but no combination of pci=realloc=off, SR-IOV, and my motherboard's PCIe settings seems to resolve this. Either I get a partial boot (with pci=realloc=off, for example, I lose a GPU) or none of the drives/GPUs are visible. The memory test came out perfectly clean - no errors in SMT or normal mode. Most of the attached hardware was in my previous 1950X Threadripper build, which was solid as a rock.

I'd appreciate any help. I have a syslog server up on a Raspberry Pi to try to catch one of the crashes directly if possible, but I am thinking that all these PCIe issues can't be helping.

tower-diagnostics-20230104-1113.zip

Edited January 4, 2023 by Rhynri: Attach Diagnostics
JorgeB Posted January 4, 2023

Try adding pci=realloc=off to syslinux.cfg.
Rhynri (Author) Posted January 4, 2023

Thanks for the reply! To clarify - if I do that with SR-IOV off, one of the GPUs (and probably some other devices, I haven't looked that closely) does not show up in Unraid. If I do it with SR-IOV on, the BAR errors are still present. For the sake of completeness, I've attached new diagnostics with `pci=realloc=off` in the boot string.

tower-diagnostics-20230104-1205.zip
JorgeB Posted January 4, 2023

There was a recent similar thread where the user found that changing some BIOS options also helped, see here:
globadyne Posted February 3

On 1/4/2023 at 1:24 PM, JorgeB said:
Try adding pci=realloc=off to syslinux.cfg

What section of it does it get added to?
JorgeB Posted February 3

6 hours ago, globadyne said:
What Section of it does it get added to?

To the boot option you are using, usually Unraid OS. You can also add it to multiple sections if you regularly boot from more than one option.
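
As a rough illustration of the above, the relevant part of syslinux.cfg might look something like the sketch below. This assumes a stock layout in which the default entry is labeled Unraid OS and its append line loads /bzroot; your file may have different labels or extra parameters, so treat the surrounding lines as an assumption and only add the pci=realloc=off parameter to the append line of the entry you actually boot.

```
label Unraid OS
  menu default
  kernel /bzimage
  append pci=realloc=off initrd=/bzroot
```

The change only takes effect on the next boot. If you also boot from other entries (GUI mode, for example), the same parameter would need to go on their append lines as well.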