bling Posted February 29, 2020
Logged in to my server this morning and noticed that a parity check was running, which I thought was odd. I was like hmmmm...I didn't start that. Maybe I had it scheduled for the end of the month? Checked that, nope, it's off. Then I realized that the uptime was an hour. My machine runs 24/7 and has been rock solid for 2-3 weeks.
After that, I was experimenting with a new docker app, and then bam!! Random reboot. At that point I turned off docker/VMs and let the parity check run to completion. Thankfully no errors.
Just now, again while playing around with a docker app, it rebooted again. This time I was greeted with an unmountable btrfs cache disk (can't read superblock). I was able to mount the cache disk in read-only mode with nologreplay and copy everything to the array.
I've heard horror stories from others about btrfs cache disks getting corrupted, and once I've copied everything over I'm reformatting my cache disk to XFS. Ironically, that kind of corruption is usually due to sudden power loss, and even though I do have a UPS hooked up, it doesn't protect against the computer rebooting itself.
Could a bad hard drive cause random reboots? I strongly suspect it's either the drive or btrfs, given that's the only thing common to all 3 reboots. All docker containers are using the cache disk.
Thanks in advance.
bling Posted February 29, 2020 (edited March 1, 2020)
Sigh....while I was rsyncing from the array back to a freshly formatted XFS cache disk, the server hard rebooted again. So I guess that rules out the file system. I had putty tailing the syslog at the time and nothing was logged at the moment of the reboot. I'm also tailing dmesg now...
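When a box hard-resets, the kernel usually gets no chance to log anything, so an empty syslog is itself a clue. One way to at least pin down when each reset happened is a heartbeat file on persistent storage. The following is only a rough sketch of that idea, not something used in this thread: the function names, the 10-second interval, and the 3x tolerance are all hypothetical choices.

```python
import os
import time


def write_heartbeat(path: str) -> None:
    """Append the current Unix time to `path` and force it onto the disk."""
    with open(path, "a", encoding="ascii") as fh:
        fh.write(f"{time.time():.0f}\n")
        fh.flush()
        # fsync pushes the data past the page cache, so a sudden reset
        # is less likely to lose the most recent entry.
        os.fsync(fh.fileno())


def find_gaps(timestamps, interval=10.0, tolerance=3.0):
    """Return (last_seen, next_seen) pairs where the heartbeat went silent.

    A gap is any pair of consecutive timestamps further apart than
    interval * tolerance seconds (both values are assumptions).
    """
    limit = interval * tolerance
    return [
        (before, after)
        for before, after in zip(timestamps, timestamps[1:])
        if after - before > limit
    ]
```

Calling `write_heartbeat` every `interval` seconds from a small loop and running `find_gaps` over the file afterwards would show the last moment the machine was known to be alive before each reset. The file has to live on a disk that survives the reboot, not on a RAM-backed filesystem.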
Decto Posted February 29, 2020
You haven't said anything about the spec, so it's difficult to comment. Have you run a memory test?
bling Posted February 29, 2020
It's my old rig that I repurposed as a NAS server: 4790k, asrock mobo, 16GB RAM. It's been rock solid since day 1, when it was running Windows. When I recently rebuilt it for unraid, all the hardware stayed the same except for new hard drives, a PSU recently replaced under warranty, and a new UPS.
I'm running memtest right now, with the machine plugged directly into the wall.
bling Posted February 29, 2020
Another bit of useful information: I caught it randomly rebooting in the middle of a reboot! I SSHed into the box and was tailing the log, and before emhttp started up I lost the connection.
bling Posted March 1, 2020
memtest passed overnight. rebooted unraid in safe mode, and within moments of putting a medium workload on a docker container, reboot!
i checked /proc/sys/kernel/panic, and it's set to 0, which is the default, meaning the kernel will not auto-reboot after a panic.
just swapped out the PSU with a spare...wish me luck!
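For reference, the check on /proc/sys/kernel/panic can be scripted. This is a minimal sketch, with hypothetical function names, of how the value is read on Linux: 0 means the kernel does not reboot on a panic, a positive number means it reboots after that many seconds, and a negative number means it reboots immediately.

```python
from pathlib import Path

# Standard location of the sysctl on Linux.
PANIC_PATH = Path("/proc/sys/kernel/panic")


def read_panic_setting(path: Path = PANIC_PATH) -> int:
    """Read the kernel.panic sysctl value as an integer."""
    return int(path.read_text().strip())


def describe_panic_setting(value: int) -> str:
    """Explain what a given kernel.panic value means."""
    if value == 0:
        return "no automatic reboot after a kernel panic"
    if value > 0:
        return f"reboot {value} s after a kernel panic"
    return "reboot immediately after a kernel panic"
```

On the server, `describe_panic_setting(read_panic_setting())` would report the current policy. With the value at 0, a kernel panic should leave the machine hung rather than restarting, so reboots that still happen point away from a software panic and toward hardware.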
bling Posted March 2, 2020
it's the PSU. since i swapped with a spare it hasn't crashed regardless of what workloads i threw at it. did a full parity check and some bits needed correcting.
Hoopster Posted March 2, 2020
4 minutes ago, bling said: it's the PSU. since i swapped with a spare it hasn't crashed regardless of what workloads i threw at it. did a full parity check and some bits needed correcting.
Every time I have seen the behavior you described (random reboots, especially under more than an idle load), the problem has been the PSU. That is especially true if it reboots while the system is booting up. The boot process puts brief moderate-to-heavy loads on the PSU, and if it is failing, you get another reboot.
The PSU and/or RAM are the usual suspects, but I always check the PSU before the RAM (although RAM tests are relatively easy to do). That's why I keep a spare around.
Glad you appear to have it sorted.