ASRockRack E3C242D4U with Version: 6.9.2 Crashing Randomly



Hi Everyone,

 

I'm looking for some help troubleshooting my new server install, which randomly crashes 5-10 minutes after starting. I have removed all cards and drives except one so I can continue testing.

 

I can't migrate my old server until I get this resolved, so any help is greatly appreciated.

 

Attached are the diagnostics and syslog files from after the last crash.

tower-diagnostics-20210831-2006.zip syslog

Link to comment

Hi trurl

 

Thank you for commenting, I appreciate your suggestion. I actually swapped out the two new 8GB modules for another new module that was originally in my HP Gen 10 server.

 

I don't believe this issue is hardware related. Windows 10 Pro, which I installed as a test, ran fine, and the SiSoftware Sandra benchmark ran fine. I also tried the latest Slackware, first from the live CD and then as a full install, and it all ran fine. Only Unraid crashes after 5-10 minutes.

 

However, I will run memtest and report back.

Edited by Gibbo592
Link to comment

Hi there,

 

What we really need is for you to get the system running with a monitor and keyboard attached.  Login to the console and then type the following command:

 

tail /var/log/syslog -f

 

This will begin printing your log directly to the monitor and when the system crashes, you can capture the final events in the log prior to the crash, which should provide some indication of what is causing it.
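
If no monitor is watching at the moment of the crash, the same idea can be extended to keep a copy of each log line on storage that survives a reboot. The following is only a rough sketch: the helper name mirror_lines and the destination file name are made up for illustration, and it assumes the flash drive is mounted writable at /boot.

```shell
# Hypothetical helper: read lines from stdin, print them to the screen,
# and append each one to a second file, flushing after every line.
mirror_lines() {
    dest="$1"
    while IFS= read -r line; do
        printf '%s\n' "$line"              # still shown on the monitor
        printf '%s\n' "$line" >> "$dest"   # appended to the saved copy
        sync                               # ask the kernel to flush to disk
    done
}

# Example (runs until interrupted with Ctrl-C).
# ASSUMPTION: /boot is writable; the file name is arbitrary.
# tail -f /var/log/syslog | mirror_lines /boot/syslog-capture.txt
```

Calling sync after every line is slow and adds wear to a flash drive, so this is only meant for a short debugging session. Even then, the very last lines before a hard lockup may not make it to disk.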

 

All the best,

 

Jon

Link to comment

Something weird is going on with the time. I tried changing the NTP servers:

 

root@Tower:~# hwclock --show
2021-09-07 20:47:19.137429+10:00
root@Tower:~# ntpq -p
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 LOCAL(0)        .LOCL.          10 l  438   64  100    0.000   +0.000   0.000
+Server.local    216.239.35.0     2 u   46   64  177    0.533   -3.040   1.881
-resolv.internod 203.35.83.242    2 u   27   64  177   13.468   +5.395   2.128
+time.cloudflare 10.85.8.92       3 u   35   64  177   11.104   -0.635   2.659
*time4.google.co .GOOG.           1 u   34   64  177  158.321   +1.003   2.384

 

Screen Shot 2021-09-07 at 10.52.56 am.png

Link to comment

To be perfectly honest, I have absolutely no idea what is wrong with your system.  The logs aren't printing any significant errors and I doubt the time sync issue is causing the crash.  One thing to try would be to leave the array in a stopped state but with the server on, and see if you can recreate the crash while it is just idling like that.  If so, then we know it isn't storage related.  If not, then maybe there is something amiss with the storage or storage controller(s).

Link to comment

I have found a workaround for the random crashing which may help someone else: I added the acpi=off kernel option after initrd=/bzroot in the syslinux configuration, so that it reads initrd=/bzroot acpi=off. This worked for both 6.9.2 and 6.10.0rc.

 

You can try this temporarily by pressing Tab at the boot menu and manually adding acpi=off to the line. To make it permanent, add the acpi=off option to the syslinux config.
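
After editing the config, a quick read-only check can confirm the option actually ended up on the boot line. This is just a sketch: the function name check_kernel_opt is made up, and the path /boot/syslinux/syslinux.cfg and the "append" line layout are assumptions about a typical Unraid flash drive, so adjust them to match your setup.

```shell
# Hypothetical helper: list every "append" line in a syslinux config and
# report whether it carries the given kernel option. It never edits the file.
check_kernel_opt() {
    cfg="$1"
    opt="$2"
    grep -E '^[[:space:]]*append[[:space:]]' "$cfg" | tr -d '\r' |
    while IFS= read -r line; do
        case " $line " in
            *" $opt "*) printf 'present: %s\n' "$line" ;;
            *)          printf 'missing: %s\n' "$line" ;;
        esac
    done
}

# Example.
# ASSUMPTION: this is where the config lives on the Unraid flash drive.
# check_kernel_opt /boot/syslinux/syslinux.cfg acpi=off
```

Other boot entries in the same file will show up as "missing", which is expected if you only changed the default entry.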

 

I also found issues with the NTP service: syslog was being flooded with "nchan: a message from the past has just been published. unless the system time has been adjusted, this should never happen."

 

I tried several NTP servers close to home, but the error persisted. I have now stopped the NTP service and the issue is no longer apparent.

Edited by Gibbo592
Link to comment
