Jerem Posted October 14
Hi. I had an Unraid server that ran stably for several years, 7 days a week, 365 days a year. I recently upgraded to new, more powerful hardware based on an AMD CPU. Since then I keep getting roughly weekly crashes in which Unraid becomes totally unresponsive on the network. I have already made a few changes suggested on the forums (Global C-state disabled, RAM frequency reduced just in case, although I am NOT overclocking anyway), but the issue is still occurring. This is driving me nuts. That said, when the server becomes totally unresponsive it still responds to the physical power button and then performs a graceful shutdown. Attached is an extract of the syslog from the moment the unresponsiveness is triggered. Any help would be appreciated.
Attachment: log.txt
JorgeB Posted October 14
The problem is the NIC getting dropped. I assume it is onboard?
Oct 14 15:28:29 Tower kernel: igc 0000:0b:00.0 eth0: PCIe link lost, device now detached
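For anyone who wants to check how often this message shows up in their own syslog, here is a minimal Python sketch. It assumes every event has the same shape as the line JorgeB quoted (timestamp, hostname, `kernel:`, driver, PCI address, interface). The names `LinkLost` and `find_link_lost` are made up for this illustration, and other kernel versions may word the message differently.

```python
import re
from typing import Iterable, List, NamedTuple


class LinkLost(NamedTuple):
    """One 'PCIe link lost' event pulled out of a syslog line (hypothetical helper type)."""
    timestamp: str
    driver: str
    pci_address: str
    interface: str


# Pattern modelled on the single line quoted in this thread; treat it as an assumption.
PATTERN = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \S+ kernel: "
    r"(?P<driver>\S+) "
    r"(?P<pci>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) "
    r"(?P<iface>\S+): PCIe link lost"
)


def find_link_lost(lines: Iterable[str]) -> List[LinkLost]:
    """Return every line that reports a lost PCIe link, in file order."""
    events = []
    for line in lines:
        match = PATTERN.match(line.strip())
        if match:
            events.append(
                LinkLost(match["timestamp"], match["driver"], match["pci"], match["iface"])
            )
    return events


# The line from this thread, used as a quick self-check.
SAMPLE = "Oct 14 15:28:29 Tower kernel: igc 0000:0b:00.0 eth0: PCIe link lost, device now detached"
for event in find_link_lost([SAMPLE]):
    print(event.timestamp, event.driver, event.pci_address, event.interface)
```

To run it against a real log, pass an open file instead of the sample list, for example `find_link_lost(open(path))`, where `path` points at your saved syslog. Comparing the timestamps with the times the server went unresponsive should show whether every hang lines up with a NIC drop.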
JayDee73 Posted October 14
I am dealing with the same issue, in my case on an Intel-based CPU (Gigabyte motherboard with an onboard 2.5G Intel 225 NIC). Same occasional crashes, same syslog error message. About 14 days ago I deactivated all ASPM settings and all C-state settings in the BIOS, as well as everything related to powertop tweaking in Unraid. Since then, no crash. You may want to give this a try yourself… Nevertheless, this should not be the final solution for me, as the server consumes more energy than necessary… but to track things down I started „from scratch“. I am abroad at the moment, so I cannot tweak anything in the BIOS. I will try to re-activate things as soon as I come back.
Veah Posted October 14
Try disabling Wake-on-LAN in the BIOS.
wewantrice Posted October 14
I'm having similar issues with server instability. I tried replacing a 5-year-old flash drive, but no joy. The server has to be hard reset at least once a week. This is lowering the WAF, and of course I will be away from home for the next few months and unable to troubleshoot. Syslog attached if anyone could offer any insights.
Attachment: syslog-192.168.0.150.log
JorgeB Posted October 14
There appears to be a container constantly restarting. Check the uptimes to see if you can find out which one it is.
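One way to compare container uptimes without clicking through each one is to rank the output of `docker inspect`. The Python sketch below is only an illustration: it assumes the JSON contains the usual `Name`, `RestartCount` and `State.StartedAt` fields and that `StartedAt` is a UTC timestamp. The function name `recent_starts` is invented for this example.

```python
import json
from datetime import datetime, timezone
from typing import List, Tuple


def recent_starts(inspect_json: str, now: datetime) -> List[Tuple[str, int, float]]:
    """Return (name, restart_count, uptime_seconds) tuples, shortest uptime first.

    inspect_json is the text printed by `docker inspect` for one or more containers.
    """
    rows = []
    for container in json.loads(inspect_json):
        # StartedAt carries nanoseconds; keep only the first 19 characters
        # (YYYY-MM-DDTHH:MM:SS) and treat the value as UTC (assumption).
        started = datetime.strptime(
            container["State"]["StartedAt"][:19], "%Y-%m-%dT%H:%M:%S"
        ).replace(tzinfo=timezone.utc)
        uptime = (now - started).total_seconds()
        name = container["Name"].lstrip("/")
        rows.append((name, container.get("RestartCount", 0), uptime))
    return sorted(rows, key=lambda row: row[2])


# Usage sketch (run on the server, file name is arbitrary):
#   docker inspect $(docker ps -q) > containers.json
#   rows = recent_starts(open("containers.json").read(), datetime.now(timezone.utc))
#   for name, restarts, uptime in rows:
#       print(f"{name}: up {uptime / 60:.0f} min, {restarts} restarts")
```

A container that always sits at the top of the list with an uptime of a few minutes, while the others have been up for days, is the likely candidate.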
Jerem (Author) Posted October 14
14 hours ago, JorgeB said: "Problem with the NIC getting dropped, I assume this is onboard? Oct 14 15:28:29 Tower kernel: igc 0000:0b:00.0 eth0: PCIe link lost, device now detached"
Yes, this is the Intel 2.5 Gb Ethernet NIC onboard the ROG STRIX X670E-E motherboard.
Jerem (Author) Posted October 14
12 hours ago, JayDee73 said: "I am dealing with the same issue. In my case running an Intel-based CPU (Gigabyte Mobo with 2.5G Intel 225 NIC onboard). Same occasional crashes, same syslog error message. About 14 days ago I deactivated all ASPM BIOS settings and all C-State settings. Also everything related to powertop tweaking in Unraid. Since then, no crash. You may give this a try yourself… Nevertheless this should not be the final solution for me, as the server consumes more energy than necessary…but to track things down I started „from scratch“. I am abroad at the moment, so I cannot tweak anything in BIOS. I will try to re-activate things as soon as I come back."
Thank you very much. I have just deactivated ASPM in the BIOS (C-states were already disabled in the BIOS, and I have not installed powertop in Unraid). I will continue to monitor the server and report back to this forum with updates.
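To confirm from the running system that the BIOS change actually reached the NIC, one option is to read the `LnkCtl` line that `lspci -vv` prints for the device. The Python sketch below assumes the usual `LnkCtl: ASPM <state>;` layout of that output and that the device header starts with the slot address without the PCI domain (for the NIC in this thread, `0b:00.0` rather than `0000:0b:00.0`). The helper name `aspm_state` is hypothetical.

```python
import re
from typing import Optional


def aspm_state(lspci_vv: str, slot: str) -> Optional[str]:
    """Return the ASPM state reported in LnkCtl for the given slot, or None if not found.

    lspci_vv is the text printed by `lspci -vv`; slot is an address such as "0b:00.0".
    """
    in_device = False
    for line in lspci_vv.splitlines():
        # Device headers start in column 0; detail lines are indented.
        if line and not line[0].isspace():
            in_device = line.startswith(slot + " ")
            continue
        if in_device:
            match = re.search(r"LnkCtl:\s*ASPM ([^;]+);", line)
            if match:
                return match.group(1).strip()
    return None


# Usage sketch (run as root so lspci can read the capability registers):
#   lspci -vv -s 0b:00.0 > nic.txt
#   print(aspm_state(open("nic.txt").read(), "0b:00.0"))
```

If the result reads `Disabled`, ASPM is off on that link. If it still lists `L1` or `L0s L1` as enabled, the BIOS setting may not cover that port and is worth re-checking.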
wewantrice Posted October 18
The crashes continue, and I am now getting out-of-memory errors as well? Diagnostics and the post-crash syslog are attached.
Attachments: tower-diagnostics-20241018-1148.zip, syslog-192.168.0.150.log
JorgeB Posted October 18
1 hour ago, wewantrice said: "Crashes continue"
Please start your own thread. The OP's issue may not be resolved yet, and it can be confusing to try to help different users at the same time.