
Crash/Freeze of new Server-Setup


Posted

Hey all,

 

After a long time of unsuccessful troubleshooting on my own, I have to ask you for help.

My situation:

- My server crashes every time, somewhere between 2 and 24 hours after a reboot (this has been the case from the very beginning)

- Usually there is no access via WebUI, SMB, or the local command line after a crash (rarely, the WebUI is still reachable)

- The crash can be triggered by copying or moving files, but it mostly happens at idle (mostly during a parity check, if one is running)

- A crash during copying, e.g. from Windows, results in Windows error messages such as "0x8007003B", "0x8007003A" or "path is not available"

- On one occasion it also caused the switch it is connected to to crash/block the switch's other clients

- Once I saw one CPU core stuck at 100% (during a crash in which only the WebUI was still working)

- The parity check usually becomes extremely slow after 40% progress (recently it started to find parity errors, but the crashes existed long before that)

- The command line shows log output (see CommandlineOutputScreenshot.rar)

 

My lineup:

- Supermicro X10SBA (SoC: J1900)

- G.Skill SO-DIMM 8 GB DDR3L-1333 Kit

- PSU: Be-Quiet Pure Power 11 BN290

- 2x WD Red 4 TB WD40EFAX (unfortunately SMR) (Array)

- 2x SanDisk SSD PLUS 480GB (Cache Pool Raid 1)

 

I tried the following troubleshooting steps without any success:

- RAM MemTest (passed)

- CPU stress test (with Windows10): (passed)

- HDD SMART test: (passed)

- USB Boot stick replaced, USB port changed

- SATA cables replaced

- Ubuntu 20.04 with the same hardware setup: stopped manually after 5 days of uptime without any errors

- Windows 10 with the same hardware setup: stopped manually after days of uptime, also passed MemTest and CPU stress test

- Updated to Unraid OS 6.9.0-Beta25: still crashes

- Applied this patch (C-States): https://howto.lintel.in/freezing-intels-bay-trail-socs-cushioned-patch/

- Removed the cache drives (because of the BTRFS and non-ECC RAM combination)

- Booted in safe mode; booted in legacy and UEFI mode

- Syslog and enhanced syslog do not help because of the unexpected hard shutdowns
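
Since syslog is lost on a hard shutdown, one option would be a small heartbeat file that is forced to disk on every write, so the last line at least shows roughly when the box froze. This is only a minimal sketch, assuming Python 3 is available on the server (it is not part of a stock Unraid install); the function names and any log path used with it are hypothetical.

```python
import os
import time
from datetime import datetime, timezone


def write_heartbeat(path, stamp=None):
    """Append one timestamp line and fsync it before returning."""
    if stamp is None:
        stamp = datetime.now(timezone.utc).isoformat()
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(stamp + "\n")
        fh.flush()
        # Ask the OS to push the data to the device, so the line
        # is not left only in the page cache when the machine freezes.
        os.fsync(fh.fileno())
    return stamp


def last_heartbeat(path):
    """Return the last recorded timestamp, or None if there is none."""
    try:
        with open(path, encoding="utf-8") as fh:
            lines = [ln.strip() for ln in fh if ln.strip()]
    except FileNotFoundError:
        return None
    return lines[-1] if lines else None


def run(path, interval_s=10.0, beats=None):
    """Write a heartbeat every interval_s seconds; beats=None loops forever."""
    written = 0
    while beats is None or written < beats:
        write_heartbeat(path)
        written += 1
        if beats is None or written < beats:
            time.sleep(interval_s)
```

Calling `run()` with a path on persistent storage (for example a file on the boot stick, which is an assumption on my side) and reading `last_heartbeat()` after the next reboot would give the approximate freeze time, which could then be compared with what was running at that moment.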

 

What else can I try to find the problem? Does anyone know of or see a problem? Could it be a hardware/software incompatibility?

 

My next step would be to remove the hard drives, or at least to try a fresh "greenfield" rebuild of the server, which I actually wanted to avoid.

 

Thank you!

 

 

 

 

powernas-diagnostics-20200726-2053.zip powernas-smart-20200726-2053.zip powernas-smart-20200726-2052.zip

Posted (edited)

Strange.

 

1 hour ago, konvicted said:

My Server crashes every time in a timespan between of 2-24h

Could you try completely disabling the onboard NIC in the BIOS, then check the server via local KVM, just to see if anything is different without the network.

Edited by Benson
Posted
21 hours ago, Benson said:

Strange.

 

Could you try completely disabling the onboard NIC in the BIOS, then check the server via local KVM, just to see if anything is different without the network.

I tried it, no difference.
