June 1, 2025
I kept putting this off, but for a while now I have periodically found that my server has shut down, and on boot it reports an unclean shutdown. It used to happen only once or twice a year, but now it's happening almost once a month. I had a crash a couple of days ago and ran the parity check; today, since it's the 1st of the month, my scheduled monthly parity check started, and the server crashed again about an hour ago. There doesn't seem to be a pattern, but it does happen more often toward the end of the month, which forces me to run two parity checks in short succession.

I use an APC 900W UPS. I've raised the minimum power requirements for when it safely shuts down, and I've raised the safe-shutdown time limits for stopping Docker, VMs, and drives, but neither has made a difference.

I strongly suspect it is memory related. A couple of years ago I got an mce log with a memory error in it indicating a bad slot, so I was unable to fully fill the 4 slots. I have a Supermicro dual-CPU motherboard, and I've tried moving the RAM sticks to the other CPU's slots and splitting the RAM between the two, but that hasn't helped.

I don't have the money for a new server build at the moment. While I suspect a hardware issue, I still have a lot to learn about unRAID, so in the meantime I figured I'd ask here in case there's a software configuration or error message I'm missing. I have my syslog server set up, but I haven't noticed any errors around the times the shutdowns happen. I generated a diagnostics zip after I booted back up a couple of days ago, so that's my latest one, but I can generate a newer one if that would help.

I'm also posting my saved syslogs in case something was logged there that I missed or didn't recognize as significant. I think I've included all the information I can, but please let me know if I'm missing anything. I appreciate any help I can get!

Attachments: carlunraid-diagnostics-20250529-1440.zip, syslog-192.168.1.58.log, syslog-192.168.1.58.log.1, syslog-192.168.1.58.log.2
June 2, 2025, Community Expert
There's nothing relevant logged that I can see. If you have multiple sticks, try running the server with just one stick or one pair. If the crashes continue, try a different stick or pair; that will basically rule out bad RAM.
June 16, 2025, Author, Solution
Just an update for anyone else who sees this: I ended up solving it, and it appears to have been a bad motherboard. I swapped out or replaced the boot flash, RAM, PSU, and processors, and nothing worked until I replaced the motherboard. Luckily it's a pretty old system, so everything was relatively cheap.

Worth noting: before I swapped the motherboard, I turned off all Docker containers except 4 (just Jellyfin, NGINX Proxy Manager, a dynamic DNS container, and a watchstate tracker for Jellyfin), and I was able to complete a full parity check. The new motherboard arrived that night, so I shut down the server and swapped the boards in the morning, which means I don't know how long it would have stayed up. I'm guessing that all the containers I was running combined (or maybe even a specific one) were taxing the system in a way that pushed the defective motherboard over the edge. I'm not sure what part of it was broken, or even what the issue was (probably either overheating or something power related; my money's on the latter). Either way, it's all fixed now. Thanks for the peace of mind from double-checking the logs for me.

Edited June 16, 2025 by flickdaddy
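For anyone who wants to double-check their own saved syslogs the way it was done in this thread, here is a minimal Python sketch, not something either poster used. It assumes each boot shows up in the log as the kernel banner containing `Linux version`, and that hardware faults, if logged at all, contain strings such as `mce:`, `machine check`, or `hardware error`. The function names and the keyword list are illustrative assumptions. As this thread shows, a failing board can crash without logging anything, so an empty result does not rule out hardware.

```python
import sys
from collections import deque

# Assumption: the kernel prints a banner containing this text at every boot.
BOOT_MARKER = "Linux version"
# Assumption: substrings (lowercase) that commonly appear in hardware-error lines.
HINTS = ("mce:", "machine check", "hardware error")


def lines_before_boots(lines, context=5):
    """For each boot banner found, return the `context` lines logged just before it."""
    recent = deque(maxlen=context)  # rolling window of the most recent lines
    found = []
    for line in lines:
        text = line.rstrip("\n")
        if BOOT_MARKER in text:
            found.append(list(recent))  # what the log said right before this boot
            recent.clear()
        else:
            recent.append(text)
    return found


def hardware_hints(lines):
    """Return every line that contains one of the hardware-error substrings."""
    return [l.rstrip("\n") for l in lines if any(h in l.lower() for h in HINTS)]


if __name__ == "__main__":
    # Usage: python scan_syslog.py syslog-192.168.1.58.log [more files...]
    for path in sys.argv[1:]:
        with open(path, errors="replace") as f:
            lines = f.readlines()
        print(f"== {path} ==")
        for i, before in enumerate(lines_before_boots(lines), start=1):
            print(f"-- last lines before boot #{i} --")
            print("\n".join(before))
        for hit in hardware_hints(lines):
            print("HINT:", hit)
```

If the last lines before a boot banner look like ordinary activity with no shutdown messages, that is consistent with an abrupt power-off or hard crash rather than a clean shutdown.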