Setup new hardware with my server, now freezing up [Unraid 6.5.0 and 6.5.3]



So I just upgraded my server from an i7 build to a SuperMicro build in a rack mount, but since then Unraid has been hard-locking, seemingly at random. It lasted 3 days once, then 8 hours, then only a few hours (it happened while watching Plex), then almost another day before it locked up in the middle of the night. The only intensive thing that was running, or should have been running, at the time was a pre-clear on two disks I'm trying to add to the array. tower-diagnostics-20180723-0213.zip These are the last diagnostics I have from before the latest crash; they were the last thing in my syslog file.

 

So I tried to do some digging, as I'm assuming it's a RAM issue. I couldn't get memtest to work on the server (at boot, I'd scroll down to memtest and it would just keep trying to start and then return me to the menu), and then I found out ECC RAM doesn't actually work well with MemTest, so I went into the SEL log. It was throwing single-bit DIMM errors on one slot every minute. I pulled that stick out, reconfigured, and am giving it time to test. I did an extended run of Fix Common Problems and it threw an MCE error and told me to run for the hills and hope someone smarter than I am can help. I've done so, and attached the diagnostics here: tower-diagnostics-20180723-0612.zip . And when I check my log, it's still throwing memory errors.

 

Mobo: Supermicro - X9DRi-LN4+

CPU: 2x Xeon® CPU E5-2660 v2 @ 2.20GHz

RAM: Currently 24GB multi-bit ECC

Cache: 1TB ADATA SSD

 

The new additions are the mobo, CPUs, RAM, LSI-9811, and a SAS expander. For drives, I upgraded the SSD and added a larger parity drive. The parity built fine; I just can't clear the other two drives to get them added, or the clear doesn't finish before the server locks up. I was still trying to run preclear on one of the two HDDs, but it couldn't get past the starting phase, so I've cancelled that. I'm not really sure what else the issue might be, and beyond leaving the server up and testing the RAM sticks one by one, it all seems a bit above my google-fu.

 

Any help is mightily appreciated.


So I took it down to just 2 sticks of RAM for the time being (I didn't know whether it would be able to function with 2 CPUs on fewer than two sticks). I didn't see any memory errors in the log like before, but my server once again froze up and was unresponsive even at the physical console, so I had to hard power it off and reboot.

 

So any ideas? I pulled the diagnostics from it again and can't see what the issue might be: tower-diagnostics-20180724-0340.zip At this point I'm debating installing Unraid on a new USB stick and seeing whether it still locks up over time. At least then I'd know it was definitely hardware, but because the server locks up completely, I'm not sure where to look to find the issue. I can't just sit and stare at the log all day hoping something pops up. I've put Fix Common Problems back in Troubleshooting Mode, so I hope it catches something this time.
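Rather than staring at the log, something like the rough sketch below could filter a copy of the syslog for hardware-looking lines. The keyword list (`mce`, `machine check`, `edac`, `hardware error`) and the names `scan_log` and `scan_file` are my own guesses for illustration, not anything taken from the attached diagnostics, so the pattern would need adjusting to whatever the log actually prints.

```python
import re
from collections import Counter

# Assumed keywords: these are guesses at what memory/MCE lines look like,
# not strings confirmed from the diagnostics attached above.
PATTERN = re.compile(r"(mce|machine check|edac|hardware error)", re.IGNORECASE)


def scan_log(lines):
    """Return the matching lines and a count of which keyword matched first."""
    hits = []
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            hits.append(line.rstrip("\n"))
            counts[match.group(1).lower()] += 1
    return hits, counts


def scan_file(path):
    """Scan a log file on disk; undecodable bytes are replaced, not fatal."""
    with open(path, errors="replace") as handle:
        return scan_log(handle)
```

Calling `scan_file()` on the syslog extracted from a diagnostics zip would return the suspicious lines plus a tally per keyword. It only helps if the error actually gets written before the lockup, which may not be the case here.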

Edited by Lawlanator
