
Consistent crashes every ~2 days


Yizura
Solved by Kev600


Hi All

 

I'm slowly pulling my hair out over a chronic server issue: I get an unclean shutdown every few days. I have tried seemingly everything (replacing disks, changing the filesystem, switching between ipvlan and macvlan for Docker) and still get the error. Unfortunately I can't find anything in the log files, so I would appreciate a more experienced look. Thanks!

Syslogs.zip yizura-labrys-diagnostics-20231229-1019.zip

Link to comment
  • Solution

If this relates to the array failing to stop fully before shutdown, you could try changing Settings > Disk Settings > 'Default spin down delay' to 'Never'.

I see you have shutdownTimeout set to '90', which should be OK.

Or is your shutdown unexpected/unwanted?

Edited by Kev600
Link to comment
9 minutes ago, Kev600 said:

If this relates to the array failing to stop fully before shutdown, you could try changing Settings > Disk Settings > 'Default spin down delay' to 'Never'.

I see you have shutdownTimeout set to '90', which should be OK.

Or is your shutdown unexpected/unwanted?

Hi Kev, entirely unwanted :)

Link to comment
7 minutes ago, itimpi said:

Have you tried the troubleshooting steps for this outlined in the online documentation, accessible via the Manual link at the bottom of the Unraid GUI? In addition, every forum page has a DOCS link at the top and a Documentation link at the bottom.

Thanks itimpi, I have run through the troubleshooting (and run extensive memtests with no errors) and reviewed other forum posts, but unfortunately I'm still very stumped.

Link to comment
Just now, itimpi said:

 

By this do you mean that the system shuts itself down unexpectedly, or something else (e.g. a crash)?

Thanks for asking

 

The system reboots itself (so I only realise when I check the server and see it's doing a parity check, with a small uptime). Checking the logs didn't reveal anything I could understand.

Occasionally the behaviour leads to a black screen and the system must be manually rebooted.

I'm assuming it's some kind of crash, and then the BIOS auto-startup triggers to boot the machine again.
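For anyone trying to pin down when one of these silent reboots happened: the "small uptime" check can be turned into an approximate boot timestamp and compared against scheduled jobs. This is only a minimal sketch, assuming Python 3 is available on the machine and that it runs Linux (so `/proc/uptime` exists). The function names are made up for illustration.

```python
from datetime import datetime, timedelta


def parse_uptime_seconds(proc_uptime_text: str) -> float:
    # /proc/uptime holds two numbers; the first is seconds since boot.
    return float(proc_uptime_text.split()[0])


def boot_time(now: datetime, uptime_seconds: float) -> datetime:
    # Subtracting the uptime from "now" gives the approximate boot moment.
    return now - timedelta(seconds=uptime_seconds)


def read_boot_time(path: str = "/proc/uptime") -> datetime:
    # Reads the live value; only works on a Linux system.
    with open(path) as f:
        return boot_time(datetime.now(), parse_uptime_seconds(f.read()))
```

Calling `read_boot_time()` on the server after an unexpected restart gives a time to line up against cron schedules or the last syslog entry before the gap.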

Link to comment
6 minutes ago, Yizura said:

The system reboots itself (so I only realise when I check the server and see it's doing a parity check, with a small uptime). Checking the logs didn't reveal anything I could understand.

 

 

The system rebooting itself strongly indicates a hardware error. The most likely suspects would be either thermal (overheating) issues with the CPU or PSU-related problems.

Do you have the BIOS set to automatically boot if power is lost and then restored? If so, a UPS might help if you are getting intermittent power cuts.

 

Link to comment
1 minute ago, itimpi said:

 

The system rebooting itself strongly indicates a hardware error. The most likely suspects would be either thermal (overheating) issues with the CPU or PSU-related problems.

 

Thanks, I'll double-check those two components. The system previously worked for about a year with no issues.

Link to comment

This is a pain-in-the-ass issue to troubleshoot, mate.
Check your BIOS Temps/Thermal Protection thresholds, if available.

Check your CPU/PSU cooler fans & fins/vents - are they caked in dust?
Can you boot to a different OS and hammer the system? Hopefully you'll see the same behaviour and at least confirm you have a serious hardware issue.

 

 

Link to comment

Feeding back for completeness (and for anyone searching/googling): it takes a long time to revalidate, but it's almost certainly the spin-up of the HDDs crashing the server. I have four 7.2k RPM HDDs on the main array, and nightly scheduled jobs (like Docker updates) cause all of them to spin up at once, which restarts the server. To Kev's and itimpi's points, it's likely the PSU ("be quiet! Pure Power 11 (400 W)"), although that should be specced to manage this on a Ryzen server with no GPU. I will play around with it to see if I can get spin-down stable, and will update if I do. Otherwise, turning off spin-down seems to fix it for the moment.
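As a rough sanity check on the spin-up theory, the peak 12 V load from several drives starting together can be estimated with simple arithmetic. This is only a sketch: the per-drive spin-up current, the 12 V rail capacity, and the baseline load used in the usage note are illustrative assumptions, not measured values or figures from the PSU or drive datasheets, and the function names are hypothetical.

```python
def spinup_peak_watts(drive_count: int, amps_12v_per_drive: float,
                      volts: float = 12.0) -> float:
    # Worst case: every drive draws its spin-up current at the same moment.
    return drive_count * amps_12v_per_drive * volts


def headroom_watts(rail_capacity_watts: float, baseline_watts: float,
                   spinup_watts: float) -> float:
    # What is left on the 12 V rail after the rest of the system and the
    # simultaneous spin-up are accounted for. Negative means overloaded.
    return rail_capacity_watts - baseline_watts - spinup_watts
```

With assumed values of 4 drives at 2.0 A each, `spinup_peak_watts(4, 2.0)` gives 96 W on the 12 V rail for the few seconds of spin-up. Whether that matters depends on the real rail rating and on how the PSU handles short load spikes, so take the actual numbers from the drive datasheet and the PSU label.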

Link to comment
