[SOLVED] Server keeps booting after parity check is finished or cancelled [6.8.3]


I have been experiencing a weird issue for the past couple of months.

Shortly after a parity check finishes, or after I cancel the check, the server reboots.

This of course triggers a new parity check, and so the cycle repeats.

I have syslog set up to mirror to flash. When I view the file with cat, it scrolls through a lot of log output, but once it finishes I can only see the entries from after the crash.

Does anyone have any input on why the crashes might be happening, or know a way to see the logs from before the crash? That might bring some answers to light.

Edited by Soxekaj
Link to comment

If you set up syslog to mirror to flash then the file on the flash drive should contain both pre and post crash. Just attach the file.

Crashes like that are hard to diagnose, though. We'll see if there's anything useful in the log.

 

While you are at it, attach the diagnostic zip (Tools -> Diagnostics -> attach the whole zip file).

 

My first hunch is PSU. Second hunch is RAM.

Link to comment

Have you set up the Syslog Server per these instructions?

 

     https://forums.unraid.net/topic/46802-faq-for-unraid-v6/page/2/?tab=comments#comment-781601

 

Two things: 

 

First, attach the syslog that you get. (I see that @testdasi has already requested that.) If it is a large syslog, give us a date and approximate time to look at.

 

Second, hook up a monitor and trigger the crash by stopping the parity check. Perhaps you could use a camera to get a photo of the problem. (Prepare to be quick, as it may last only a few seconds.) If nothing else, describe what you see. Does it look like everything was fine and then the system restarted, or does the system vomit a whole bunch of stuff (like a core dump) that does not look like normal syslog entries?

 

Edited by Frank1940
Link to comment

One more thing. Does it run normally for a while after the parity check, or is the reboot virtually instantaneous with the finish?

 

Is there any possibility that a child or pet (cats sometimes are a problem here) may be pushing the Reset button? If you even suspect this could be a problem, disconnect the switch leads at the MB end.

Link to comment

What's your RAM speed? Are you using XMP?

 

Also, can you let us know the last time it crashed / unexpectedly rebooted? Was it this morning?

Your 23MB syslog spans almost the whole of April, so it contains too much information (e.g. your intentional reboots are mixed up with the crashes). We need to narrow it down.
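
For anyone wanting to narrow a large syslog down to a time window before attaching it, here is a minimal Python sketch. It assumes the traditional syslog timestamp at the start of each line (e.g. `Apr  5 03:14:15`), which carries no year, so the year has to be supplied. The function names, the sample lines, the `Tower` hostname and the year 2020 are all made up for illustration and are not taken from the attached log.

```python
from datetime import datetime

def parse_stamp(line, year):
    """Return the datetime at the start of a syslog line, or None if absent."""
    try:
        # Traditional syslog stamps are 15 characters wide and have no year.
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None

def lines_in_window(lines, start, end, year):
    """Keep only the lines whose timestamp falls within [start, end]."""
    kept = []
    for line in lines:
        stamp = parse_stamp(line, year)
        if stamp is not None and start <= stamp <= end:
            kept.append(line)
    return kept

# Hypothetical sample lines, not real log output.
sample = [
    "Apr  5 03:14:15 Tower kernel: example line A",
    "Apr 20 08:00:00 Tower kernel: example line B",
    "Apr 20 08:20:00 Tower kernel: example line C",
    "not a timestamped line",
]
window = lines_in_window(
    sample, datetime(2020, 4, 20, 7, 0), datetime(2020, 4, 20, 8, 10), 2020
)
for line in window:
    print(line)
```

On a real file you would pass the result of `open(path).readlines()` instead of `sample`. Lines without a parseable timestamp are dropped here, which may or may not be what you want for multi-line kernel dumps.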

Link to comment

RAM is running at base clock speed (2666). I don't think I have XMP active, but I'm not sure. It's been a while since I was in the BIOS, and I don't have a graphics card in the machine (model F i3, so no on-chip graphics).

I have not done any intentional reboots for a while, so any reboots you see in April are probably all crashes.

Link to comment

Doing a search of your syslog using   Linux version 4  as the search term, I noticed that the reboot seems to occur within fifteen minutes of when the parity check finishes or is stopped.  There is no clue as to anything happening beyond what would be expected.  It appears that the server is in an idle state.
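
The same search can be scripted so that it also shows what was logged just before each boot. This is only a rough sketch: it treats any line containing `Linux version` as the start of a boot and collects the few lines that precede it. The function name and the sample lines (including the parity-check wording and the `Tower` hostname) are invented for illustration and are not quotes from the actual syslog.

```python
def context_before_boots(lines, marker="Linux version", before=3):
    """For each line containing `marker`, return up to `before` preceding lines."""
    contexts = []
    for i, line in enumerate(lines):
        if marker in line:
            # Slice stops at i, so the boot banner itself is not included.
            contexts.append(lines[max(0, i - before):i])
    return contexts

# Hypothetical sample lines, not real log output.
sample = [
    "Apr 20 08:00:00 Tower kernel: hypothetical parity-check-finished line",
    "Apr 20 08:05:00 Tower kernel: hypothetical idle line",
    "Apr 20 08:14:30 Tower kernel: Linux version 4.x (hypothetical boot banner)",
    "Apr 20 08:14:31 Tower kernel: hypothetical post-boot line",
]
contexts = context_before_boots(sample)
for ctx in contexts:
    print("--- lines before a boot ---")
    for line in ctx:
        print(line)
```

If the last lines before every boot banner look like ordinary idle activity with no shutdown messages, that is consistent with an abrupt reset rather than a clean restart.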

 

I would check the BIOS and make sure that it is not set to some power saving mode.  If it is, change it to high performance mode as a test.

 

You could also run memtest (a boot option) for 24 hours and see if that detects anything.

 

You might also consider changing the PSU.  They have been found to be the culprit in other similar cases.  You might have one in your junk box, or you could borrow one from a friend or get one from a vendor with a liberal return policy...

 

EDIT:  Do one thing at a time, so if something helps, you know what it is!

Edited by Frank1940
Link to comment
6 minutes ago, itimpi said:

No.   Many people do not realize that XMP is an overclocked setting and thus the question.

That is what I was assuming, but given my predicament, I just want to get everything right.

I do have a hard time seeing why XMP or any other hardware component would cause this, since it happens pretty consistently around 15 min after the parity check finishes. If it were hardware related, wouldn't it also happen during the check, or is the hardware being stressed differently after the check?

Link to comment
Hardware-related instability can be hard to predict / understand.

For example, with Precision Boost on (i.e. AMD-certified automatic overclock), my system is rock solid stable. Browsing, gaming, transcoding etc, all fine. The only exception is a very specific Lightroom job that does not even load up on the CPU (not even ONE core to 100%!) and it reliably crashes my whole server every single time despite being run in a VM.

 

The point here is you can't really predict when an innate instability will rear its ugly head.

Link to comment