ToddCat Posted December 23, 2019 (edited)

Hello,

My server has been crashing and I would appreciate help diagnosing it. I've noticed it crashes more often while copying files or working with Docker containers, so I replaced the SATA cables with brand-new ones, and it let me copy TBs of files with no issues. The next day it crashed again.

Sometimes it just hard reboots at random: no stack trace, no error, nothing. Most of the time when I go to check on it, there is a stack trace on the screen, but there are never any errors in the log at the time of the crash.

Things I have done so far:

- Mirrored the syslog to flash. Even through several crashes, nothing is captured in the log at the time of the crash.
- Ran the built-in memtest (the one that comes with Unraid) for 24 hours, with no reported issues.
- Ran a separate memtest (newest version) from a separate USB stick, one RAM stick at a time until completion (roughly 4+ hours each).
- Tried different network cables, different NICs on the server, and connecting to a different network switch.
- Today, I got a new USB stick and put a fresh copy of Unraid (6.8.0) on it using the Unraid tool, from a different PC. Same problems.

Please let me know what I should try.

Specs:

- CPU - Intel Xeon E3-1270 V2
- Mainboard - Supermicro X9SCM-F
- RAM - Kingston 32GB (4x8GB) 240-Pin DDR3 1600 ECC Unbuffered Server Memory
- PSU - Corsair CX 600
- Drives - Samsung SSDs and WD drives (all new and precleared)

Edited December 23, 2019 by ToddCat
trurl Posted December 23, 2019

See if it will work in SAFE mode and with no dockers or VMs.

Also post Diagnostics. It can tell us more about your configuration and hardware than just syslog.
ToddCat Posted December 23, 2019

Currently in safe mode, and here are the diagnostics from the Tools tab.

eldorado-diagnostics-20191222-1706.zip
trurl Posted December 23, 2019

You don't have a parity disk, is that correct?

Start the array and post new diagnostics. A started array has a little more to tell us.
ToddCat Posted December 23, 2019

I do not yet. Diagnostics attached, taken in safe mode (in which I guess Docker still runs) with the array started.

eldorado-diagnostics-20191222-1744.zip
trurl Posted December 23, 2019

This likely won't have anything to do with your crash, but you should set the domains and system shares to cache-prefer, disable the Docker and VM services, and then either delete the docker and libvirt images so they get recreated on cache, or run mover to get those shares moved to cache where they belong.
ToddCat Posted December 23, 2019 (edited)

OK, all the steps listed are done. I am guessing it will continue to run fine in safe mode. It usually runs fine in normal mode too if I just leave it and don't log in or run any dockers. It's when I start using the server that it crashes.

Edited December 23, 2019 by ToddCat
ToddCat Posted December 31, 2019

Alright, I set up a syslog server on another PC. The server did a hard reset on its own: it just crashed and rebooted, with no stack trace and no errors. There is nothing in the syslog near the time. The reset was at 11:24, the last entry is from 11:07, and it's just an info message, nothing helpful.

Any advice on how to track down what's causing this?
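For anyone comparing a reset time against a captured syslog as described above, here is a minimal Python sketch that finds the last entry logged at or before a given moment, so the silent gap (11:07 to 11:24 in the post above) can be measured on each crash. It assumes the remote log uses the classic `Mon DD HH:MM:SS` line prefix, which carries no year, so the year has to be supplied. The function names `parse_stamp` and `last_entry_before` are made up for illustration and are not part of Unraid or any syslog tool.

```python
from datetime import datetime


def parse_stamp(line, year):
    """Return the datetime from a leading 'Mon DD HH:MM:SS' stamp, or None.

    Assumption: classic syslog prefix (first 15 characters), no year in the line.
    """
    try:
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None  # not a timestamped line (continuation, blank, other format)


def last_entry_before(lines, reset_time, year):
    """Return (timestamp, line) for the newest entry at or before reset_time.

    Returns None when no timestamped entry precedes reset_time.
    """
    best = None
    for line in lines:
        stamp = parse_stamp(line, year)
        if stamp is not None and stamp <= reset_time:
            if best is None or stamp >= best[0]:
                best = (stamp, line.rstrip("\n"))
    return best
```

Calling `last_entry_before(open(path), datetime(2019, 12, 31, 11, 24), 2019)` on the captured file and subtracting the returned timestamp from the reset time gives the length of the gap. A long gap with only routine info messages before it, as in this case, suggests the machine went down without the kernel getting a chance to log anything.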