server keeps crashing


shorshi

Hey guys, 

 

I am very new to Unraid and have only been at it for a couple of days, so please go easy on me, haha.

 

I built a server, mainly for media usage, out of spare parts I had, and I'm planning on moving over from my Synology NAS. The hardware I'm using is:

 

1) MSI H110M ECO with the newest BIOS from 2018

2) i7 7700

3) 32 GB DDR4-2133

4) no GPU

5) currently a 16 TB Seagate IronWolf, 2x 10 TB WD Reds and 2x 4 TB WD Reds, since the bulk of my 16 TB IronWolfs are still in my Synology, waiting for the move.

6) 2x Crucial MX500 1TB SSD as cache

 

My server keeps crashing randomly. So far I CANNOT reproduce the issue at will; it seems more likely to happen when I'm transferring files, but I am NOT sure. Things I have done so far:

 

a) ran a CPU stress test at 100% usage for more than 1 hour straight; it never went above 75 °C and did NOT crash

b) memtest86 for multiple passes

c) used an ethernet dongle to avoid the integrated chip

d) used no ethernet at all 

e) have disabled ALL docker and VM stuff

f) bought a brand new Sandisk Cruzer USB stick for boot

g) I ran xfs_repair roughly 25 minutes ago; it did not report any errors and took only 3 seconds to finish

 

Since g) it has been running without a crash, but I am hesitant.

 

Is it possible that the Dell PERC H310 RAID controller I bought used off eBay is faulty? All my HDDs show up correctly and are able to hold data, though. I have transferred a couple of TB onto the server by now, in between crashes, once even more than 4 TB in one go before a crash, so I'm not sure how that could be a faulty RAID controller.

 

Thanks for your help!

server-diagnostics-20211216-1638.zip

16 minutes ago, JorgeB said:

Enable the syslog server and post that after a crash, hopefully it catches something.

I actually already did that a couple of hours ago; here is the file. Over the 4 hours this file spans, the server crashed probably 6-7 times.

 

The newest entries, "error: /plugins/unassigned.devices/UnassignedDevices.php: wrong csrf_token", are weird, but the server crashed many, many times BEFORE those showed up. Can this csrf_token thing have something to do with the xfs_repair I performed?

 

Oh, by the way, all disks pass SMART tests with 0 errors.

WDC_WD100EFAX-68LHPN0_JEJ3417N-20211216-1509.txt

2 minutes ago, trurl said:

Nothing obvious to even suggest a crash happened. Do you have a timestamp we can focus on?

Not really, to be honest. The file contains multiple crashes, and you can see every time I (manually) booted the machine again, when the "caching directories" thingy shows up, but I assume you know what a normal boot sequence looks like in these files...
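
In case it helps narrow down timestamps, here is a rough Python sketch of how the last line logged before each boot could be pulled out of the syslog file. It is only an illustration under assumptions: the marker text "caching directories" is taken from the description above and its exact wording in the real log may differ, the function name `last_lines_before_boots` is made up, and the result is only meaningful if the chosen marker is among the first lines logged on each boot (otherwise the reported line is just another boot message).

```python
import sys

# Assumption: a line containing this text is logged once per boot.
# The exact wording in a real syslog may differ; adjust as needed.
BOOT_MARKER = "caching directories"


def last_lines_before_boots(lines, marker=BOOT_MARKER):
    """Return the last non-empty line seen before each boot marker.

    A marker with nothing before it (file starts with a boot) is skipped,
    because there is no earlier line to report.
    """
    marker = marker.lower()
    results = []
    previous = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        if marker in line.lower() and previous is not None:
            results.append(previous)
        previous = line
    return results


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python find_crashes.py <syslog file>
    with open(sys.argv[1], errors="replace") as fh:
        for line in last_lines_before_boots(fh):
            print(line)
```

Each printed line would be the last thing logged before a boot, so its timestamp is roughly when the machine went down. This cannot tell a crash from a clean reboot.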

 

Can I do anything else along the lines of xfs_repair? Should I just start from scratch? Maybe with 6.10.0?

