December 23, 2019
Hello,
My server has been crashing and I would appreciate help diagnosing it. I've noticed it crashes more often while copying files or working with Docker containers, so I replaced the SATA cables with brand-new ones, and it then let me copy TBs of files with no issues. The next day it crashed again. Sometimes it just hard reboots at random: no stack trace, no error, nothing. Most of the time when I go to check on it there is a stack trace on the screen, but there are never any errors in the log at the time of the crash.
Things I have done so far:
- Mirrored the syslog to flash. Even through several crashes, nothing is captured in the log at the time of the crash.
- Ran the built-in memtest (the one that comes with Unraid) for 24 hours, with no reported issues.
- Ran a separate memtest (newest version) from a separate USB stick, one RAM stick at a time until completion (a little over 4 hours each).
- Tried different network cables, different NICs on the server, and connecting to a different network switch.
- Today, got a new USB stick and wrote a fresh copy of Unraid (6.8.0) to it with the Unraid tool, from a different PC. Same problems.
Please let me know what I should try.
Specs:
CPU - Intel Xeon E3-1270 V2
Mainboard - Supermicro X9SCM-F
RAM - Kingston 32GB (4x8GB) 240-Pin DDR3 1600 ECC Unbuffered Server Memory
PSU - Corsair CX 600
Drives - Samsung SSDs and WD drives (all new and precleared)
Edited December 23, 2019 by ToddCat
December 23, 2019, Community Expert
See if it will work in SAFE mode and with no dockers or VMs. Also post Diagnostics. It can tell us more about your configuration and hardware than just syslog.
December 23, 2019, Author
The server is currently in safe mode. Here are the diagnostics from the Tools tab: eldorado-diagnostics-20191222-1706.zip
December 23, 2019, Community Expert
You don't have a parity disk, is that correct? Start the array and post new diagnostics. A started array has a little more to tell us.
December 23, 2019, Author
Not yet. New diagnostics are attached, taken in safe mode (in which I guess Docker still runs) with the array started: eldorado-diagnostics-20191222-1744.zip
December 23, 2019, Community Expert
This likely won't have anything to do with your crash, but you should set the domains and system shares to cache-prefer, disable the Docker and VM services, and then either delete the docker and libvirt images so they can be recreated on cache, or run mover to get those shares moved to cache where they belong.
December 23, 2019, Author
Replying to trurl: OK, all the steps listed are done. I am guessing it will continue to run fine in safe mode. It usually runs fine in normal mode if I just leave it alone and don't log in or run any dockers. It's when I start using the server that it crashes.
Edited December 23, 2019 by ToddCat
December 31, 2019, Author
Alright, I set up a syslog server on another PC. The server then did a hard reset on its own: it just crashed and rebooted, with no stack trace and no errors. There is nothing in the syslog near that time. The reset happened at 11:24 and the last entry is from 11:07, and it's only an info message, nothing helpful. Any advice on how to track down what's causing this?
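For anyone comparing a remote syslog against a reset time like this, the gap can be measured with a few lines of Python. This is only a minimal sketch: it assumes traditional syslog lines that begin with a timestamp such as "Dec 31 11:07:00" (no year), the function name `last_entry_before` is made up for illustration, and the sample log lines are invented placeholders, not entries from the actual log.

```python
from datetime import datetime


def last_entry_before(lines, reset_time, year):
    """Return (timestamp, line) for the newest entry at or before reset_time, or None."""
    best = None
    for line in lines:
        try:
            # Assumption: the line starts with a 15-character "Mon DD HH:MM:SS" stamp.
            # Traditional syslog omits the year, so the caller supplies it.
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines that do not start with a parseable timestamp
        if stamp <= reset_time and (best is None or stamp >= best[0]):
            best = (stamp, line.rstrip("\n"))
    return best


# Made-up sample lines for illustration only; they are not from the real syslog.
sample = [
    "Dec 31 11:02:00 server example: hypothetical info message",
    "Dec 31 11:07:00 server example: hypothetical info message",
    "Dec 31 11:25:30 server example: hypothetical first message after reboot",
]

reset = datetime(2019, 12, 31, 11, 24)  # reset time reported in the post
found = last_entry_before(sample, reset, year=2019)
if found:
    stamp, text = found
    print("last entry before reset:", text)
    print("silent gap:", reset - stamp)
```

A long silent gap that contains only info-level entries is consistent with an abrupt reset rather than a fault the OS had time to log, although it does not by itself identify the cause.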