Yizura Posted December 29, 2023
Hi all,
I'm slowly pulling my hair out over a chronic server issue: I seem to get an unclean shutdown every few days. I have seemingly tried everything (replacing disks, changing the filesystem, changing ipvlan/macvlan for Docker) and still get the error. Unfortunately I can't find anything in the log files, so I would appreciate a more experienced look. Thanks!
Attachments: Syslogs.zip, yizura-labrys-diagnostics-20231229-1019.zip
Kev600 Posted December 29, 2023 (Solution, edited)
If this relates to the array failing to stop fully before shutdown, then you could try changing Settings > Disk Settings > 'Default spin down delay' to 'Never'. I see you have shutdownTimeout set to '90', which should be OK.
Or is your shutdown unexpected/unwanted?
itimpi Posted December 29, 2023
Have you tried the troubleshooting steps for this outlined here in the online documentation, accessible via the Manual link at the bottom of the Unraid GUI? In addition, every forum page has a DOCS link at the top and a Documentation link at the bottom.
Yizura Posted December 29, 2023 (Author)
9 minutes ago, Kev600 said: "If this relates to the array failing to stop fully before shutdown, then you could try changing Settings > Disk Settings > 'Default spin down delay' to 'Never'. I see you have shutdownTimeout set to '90', which should be OK. Or is your shutdown unexpected/unwanted?"
Hi Kev, entirely unwanted.
Yizura Posted December 29, 2023 (Author)
7 minutes ago, itimpi said: "Have you tried the troubleshooting steps for this outlined here in the online documentation, accessible via the Manual link at the bottom of the Unraid GUI? In addition, every forum page has a DOCS link at the top and a Documentation link at the bottom."
Thanks itimpi. I have run through the troubleshooting (and run extensive memtests with no errors) and reviewed other forum posts. Unfortunately I'm still very stumped.
itimpi Posted December 29, 2023
2 minutes ago, Yizura said: "entirely unwanted"
By this do you mean that the system shuts itself down unexpectedly, or something else (e.g. a crash)?
Yizura Posted December 29, 2023 (Author)
Just now, itimpi said: "By this do you mean that the system shuts itself down unexpectedly, or something else (e.g. a crash)?"
Thanks for asking. The system reboots itself, so I only realise when I check the server and see it's doing a parity check with a short uptime. Checking the logs didn't reveal anything I could understand. Occasionally the behaviour leads to a black screen and the system must be rebooted manually. I'm assuming it's some kind of crash, after which the BIOS auto-startup triggers and boots the machine again.
itimpi Posted December 29, 2023
6 minutes ago, Yizura said: "The system reboots itself, so I only realise when I check the server and see it's doing a parity check with a short uptime. Checking the logs didn't reveal anything I could understand."
The system rebooting itself strongly indicates a hardware error. The most likely suspects would be either thermal (overheating) issues with the CPU or PSU-related problems.
Do you have the BIOS set to boot automatically if power is lost and then restored? If so, a UPS might help if you are getting intermittent power cuts.
Yizura Posted December 29, 2023 (Author)
1 minute ago, itimpi said: "The system rebooting itself strongly indicates a hardware error. The most likely suspects would be either thermal (overheating) issues with the CPU or PSU-related problems."
Thanks, I'll double-check those two components. The system was previously working for about a year with no issues.
itimpi Posted December 29, 2023
4 minutes ago, Yizura said: "Thanks, I'll double-check those two components. The system was previously working for about a year with no issues."
Yes, but PSUs can degrade, and fans can stop working properly.
Kev600 Posted December 29, 2023
This is a pain in the ass issue to troubleshoot, mate.
Check your BIOS temps/thermal protection thresholds, if available.
Check your CPU/PSU cooler fans and fins/vents: are they caked in dust?
Can you boot to a different OS and hammer the system? Hopefully you'll see the same behaviour and at least confirm you have a serious hardware issue.
Yizura Posted January 4 (Author)
Feeding back for completeness (and for anyone searching/googling): it takes a long time to revalidate, but it's almost certainly the spin-up of the HDDs crashing the server. I have four 7.2k HDDs on the main array, and nightly scheduled jobs (like Docker updates) cause a full spin-up that restarts the server.
To Kev's and itimpi's points, the likely culprit is the PSU ("be quiet! Pure Power 11 (400 W)"), although it should be specced to handle this on a Ryzen server with no GPU.
I will play around with it to see if I can get spin-down stable, and will update if I do. Otherwise, turning off spin-down seems to fix it for the moment.
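For anyone wanting to sanity-check a PSU against simultaneous drive spin-up, a rough back-of-the-envelope sketch follows. The figures in it are illustrative assumptions only: roughly 2 A per drive on the 12 V rail at spin-up, a placeholder for the rest of the system's 12 V draw, and a placeholder rail capacity. None of them are measurements from this server or values from the be quiet! datasheet, so substitute the numbers printed on your own drive and PSU labels. The function names are hypothetical.

```python
def spinup_peak_amps(drive_count, amps_per_drive_12v=2.0, other_12v_amps=0.0):
    """Worst-case 12 V current if every drive spins up at the same moment.

    amps_per_drive_12v is an assumed spin-up figure; read the real value
    from the drive label or datasheet.
    """
    return drive_count * amps_per_drive_12v + other_12v_amps


def headroom_amps(rail_capacity_amps, drive_count,
                  amps_per_drive_12v=2.0, other_12v_amps=0.0):
    """12 V headroom left during spin-up; a negative result means overload."""
    peak = spinup_peak_amps(drive_count, amps_per_drive_12v, other_12v_amps)
    return rail_capacity_amps - peak


if __name__ == "__main__":
    # Assumed example: 4 drives, 8 A for CPU/board/fans, 20 A usable on 12 V.
    peak = spinup_peak_amps(4, 2.0, other_12v_amps=8.0)
    spare = headroom_amps(20.0, 4, 2.0, other_12v_amps=8.0)
    print(f"peak 12 V draw: {peak:.1f} A, headroom: {spare:.1f} A")
```

A label rating is a best case: as noted earlier in the thread, PSUs can degrade, so a unit that looks adequate on paper may still sag under a brief spin-up surge.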
Yizura Posted January 4 (Author)
And thanks to all for the help.