
Spontaneous reboots


Farai


Hi all,

 

For some time now my server has been spontaneously rebooting. I've mirrored my syslog to my flash drive, but I don't see anything relevant logged. For example, these are the last lines before a reboot at around 20:30.

Aug 16 20:29:40 Tower autofan: Highest disk temp is 50C, adjusting fan speed from: 236 (92% @ 1732rpm) to: FULL (100% @ 1821rpm)
Aug 16 20:29:46 Tower autofan: Highest disk temp is 49C, adjusting fan speed from: 208 (81% @ 1075rpm) to: 236 (92% @ 1075rpm)
Aug 16 20:35:35 Tower kernel: microcode: microcode updated early to revision 0x21, date = 2019-02-13
Aug 16 20:35:35 Tower kernel: Linux version 5.13.8-Unraid (root@Develop) (gcc (GCC) 10.3.0, GNU ld version 2.36.1-slack15) #1 SMP Wed Aug 4 09:39:46 PDT 2021
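
For a long mirrored syslog, picking out the last line before each boot by hand gets tedious. Below is a minimal sketch of one way to automate it, assuming every line starts with a classic "Mon DD HH:MM:SS" timestamp and that a fresh boot is marked by a "kernel: Linux version" line, as in the excerpt above. The function names and the year argument are made up for illustration (syslog timestamps carry no year). A large gap with no shutdown messages before it suggests the machine lost power or reset rather than shutting down cleanly.

```python
from datetime import datetime

# Assumption: a fresh boot always logs a line containing this text,
# as seen in the excerpt above.
BOOT_MARKER = "kernel: Linux version "


def parse_ts(line, year):
    """Parse the leading 'Aug 16 20:29:46' style timestamp (no year in syslog)."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def find_boots(lines, year):
    """Return (last_line_before_boot, boot_line, gap_seconds) for each boot marker.

    Early kernel lines that share the boot marker's timestamp (such as the
    microcode message) are treated as part of the new boot and skipped.
    """
    entries = [line.rstrip("\n") for line in lines if line.strip()]
    boots = []
    for i, line in enumerate(entries):
        if BOOT_MARKER not in line:
            continue
        boot_ts = parse_ts(line, year)
        j = i - 1
        while (j >= 0 and " kernel: " in entries[j]
               and parse_ts(entries[j], year) == boot_ts):
            j -= 1
        if j >= 0:
            gap = (boot_ts - parse_ts(entries[j], year)).total_seconds()
            boots.append((entries[j], line, gap))
    return boots


# Sample input: the four lines quoted in this post, shortened for readability.
SAMPLE = [
    "Aug 16 20:29:40 Tower autofan: Highest disk temp is 50C",
    "Aug 16 20:29:46 Tower autofan: Highest disk temp is 49C",
    "Aug 16 20:35:35 Tower kernel: microcode: microcode updated early to revision 0x21",
    "Aug 16 20:35:35 Tower kernel: Linux version 5.13.8-Unraid",
]

if __name__ == "__main__":
    for last_line, boot_line, gap in find_boots(SAMPLE, year=2021):
        print(f"Boot after {gap:.0f}s of silence; last line before it:")
        print(f"  {last_line}")
```

To run it against a real log, pass an open file object instead of SAMPLE. The location of the mirrored syslog on the flash drive depends on the setup, so no path is assumed here.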

 

I've attached my diagnostics. I've had this problem on both 6.9.2 and 6.10.0-rc1. I upgraded to the latter because I read on the forums here that it could resolve some Docker network issues.

 

Any help on this would be great, as this is getting really frustrating. Before installing Unraid, this server used to run FreeNAS without any issues.

 

Edited by Farai
Removed diagnostics file now issue has been identified.
12 minutes ago, Squid said:

First thing to do would be to run memtest from the boot menu for a minimum of a pass or 2

 

Forgot to mention that, did that already. Also really want to stress that this server ran without any problems for years, so I doubt it's a sudden hardware problem.

 

Edit:

Btw, I'm running a Grafana stack to keep track of the server. I don't see any spikes in CPU, RAM, temperature, network or anything at the moment of reboot.

Edited by Farai
Added info
2 minutes ago, Frank1940 said:

Is there any possibility that a child or pet is pushing the 'Reset' button on the case?  They are often attracted by the LED.

 

Another thing to try is to replace the PS. They have been known to cause this problem...

 

Ha, that's what I actually thought it might be for the longest time, but it just rebooted twice while I was sitting next to it doing something else.

 

PS? As in PSU? Do you mean my specific model or in general?

5 hours ago, Farai said:

PS? As in PSU? Do you mean my specific model or in general?


If you have made no configuration changes and spontaneously start getting reboots, this strongly suggests something at the hardware level is starting to fail.
 

The PSU in general :) The PSU is something that can degrade over time, so when you suddenly start getting reboots it is an obvious suspect, and hopefully one that can be checked relatively easily.

 

If it is some other component (e.g. motherboard, RAM) that is starting to fail, this is much more difficult to diagnose.

8 hours ago, Farai said:

PS? As in PSU? Do you mean my specific model or in general?

PS meaning Power Supply.

 

I actually had one that was causing reboots when it was less than a month old. (In my case, it was easy to figure out what the problem was. It was the only thing that I had changed!) A PS contains circuitry that interacts with the MB. Often one has a spare PS that can be swapped in, or one can be borrowed from a friend. (Some vendors, through their return policy, will also allow you to 'borrow' one.)


Ok, so I decided to do another overnight memtest just to be sure, and it rebooted while running that.

 

So my bad, it's definitely hardware related. Also just realised my Grafana stack isn't tracking CPU temperature, so will start tracking that to see if the CPU is overheating. It's been quite warm here lately so that might be a factor.
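
For anyone wanting to do the same, here is a rough sketch of how the CPU temperature could be sampled for a monitoring stack. It assumes the kernel exposes sensors under /sys/class/thermal, which is a standard Linux sysfs interface reporting millidegrees Celsius, though whether a CPU sensor shows up there depends on the board and loaded drivers. The function names and the 80C warning limit are placeholders of mine, not values from any vendor.

```python
from pathlib import Path


def read_thermal_zones(base="/sys/class/thermal"):
    """Return {"thermal_zoneN:type": degrees_celsius} for every readable zone.

    The kernel reports temperatures in millidegrees Celsius, so 49000 means 49.0C.
    Zones that cannot be read or parsed are skipped.
    """
    readings = {}
    for zone in sorted(Path(base).glob("thermal_zone*")):
        try:
            millideg = int((zone / "temp").read_text().strip())
            kind = (zone / "type").read_text().strip()
        except (OSError, ValueError):
            continue
        readings[f"{zone.name}:{kind}"] = millideg / 1000.0
    return readings


def over_limit(readings, limit_c=80.0):
    """Return the zones at or above limit_c (80C is an arbitrary example limit)."""
    return {name: temp for name, temp in readings.items() if temp >= limit_c}


if __name__ == "__main__":
    zones = read_thermal_zones()
    if not zones:
        print("No thermal zones found; this system may expose sensors elsewhere.")
    for name, temp in zones.items():
        print(f"{name}: {temp:.1f}C")
    for name, temp in over_limit(zones).items():
        print(f"WARNING: {name} is at {temp:.1f}C")
```

Running this once a minute and writing the output somewhere that survives a reboot should show whether the temperature climbs just before the server goes down.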

 

If not, you all are probably right and the PSU is having trouble keeping up. It's a couple of years old by now, so replacing it with something more efficient wouldn't be a bad idea anyways.

 

Thanks for the help @Squid, @Frank1940 and @itimpi!

 

Topic can be closed now as it isn't Unraid related after all, my apologies.


Check that the case and heat sinks are clean.  Check that all of your fans in the case are running.  With servers, it is usually best that air flow comes in the front of the case and exits out the back. 

 

No need to close the thread.  You will not be the first person helped with hardware problems.  If you have more questions, don't hesitate to post about them.

Edited by Frank1940