August 29, 2023

Hello,

The server is randomly crashing and I can't find the reason (I might just be blind). I've been reading the logs but I cannot see any errors. I did have some OOM errors a few days or weeks ago, caused by two misconfigured containers, but since I reconfigured them I've had no more issues, and they never actually caused problems apart from the errors in the log.

In February I changed the whole system, going from AMD to Intel. The only components not changed are the DDR4 RAM and the drives. The RAM is not overclocked (I'd rather have a slower but more stable server). I also changed the flash drive and the power supply, and the server has its own UPS as well. I'm mentioning this so as not to rule out a hardware issue; I'll swap the RAM too if that turns out to be the problem.

I'm attaching the diagnostics and the logs. I hope someone can help me identify what's going on. I've named each log with the time the crash happened; 0430 is the most recent one.

tower-diagnostics-20230829-0939.zip crash_0430.log crash_0102.log

Edited August 29, 2023 by bexem
August 29, 2023 Community Expert

Other than the OOM issues, I don't see anything relevant logged. Although the crashes are not likely caused by OOM, it would be good to fix those. Another thing you can try is to boot the server in safe mode with all Docker containers/VMs disabled and let it run as a basic NAS for a few days. If it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.
August 29, 2023 Author

1 hour ago, JorgeB said:
Other than the OOM issues don't see anything relevant logged, and although the crashes are not likely caused by OOM it would be good to fix those, another thing you can try is to boot the server in safe mode with all docker/VMs disabled, let it run as a basic NAS for a few days, if it still crashes it's likely a hardware problem, if it doesn't start turning on the other services one by one.

Thank you for having a look! I did fix the OOM errors and I haven't seen any others since. I guess I'll need to try what you are suggesting... I just wish Unraid gave me an error or a reason for the crashes!
September 1, 2023 Author Solution

Sorry for the double post, but I might have found the reason why Unraid was randomly crashing: the cache drive would randomly disconnect itself, leaving Unraid unable to write the log file (the syslog server was set to a share which prefers the cache), which is why no errors were recorded.

I can't say whether it's the drive itself or the SATA cable. In the meantime I've removed the cache altogether, and I've already observed the same disconnecting behaviour with the drive mounted as an external drive (I can't remember the name of the plug-in). Without a cache the system is understandably slower, so I will replace the drive if needed, but I'm wondering if there is a way to check, or be notified, when the drive disconnects (it does "reconnect" automatically by itself)?

Basically, I'm planning to replace the SATA cable and keep the drive connected, doing nothing. If the cable was the issue, great; otherwise I'll have to send the disk back.

Again, sorry for the double post, but I wanted to share my experience in case other users encounter the same issue.
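
On the question of checking whether the drive has disconnected: one possible approach is to poll for the drive's device node and record every change of state. The following is only a minimal sketch, not an Unraid feature. `DEVICE_PATH`, `LOG_PATH`, `POLL_SECONDS` and the `DriveWatcher` class are hypothetical names made up for illustration, and it assumes the drive shows up under `/dev/disk/by-id/` and that the log is written to storage that does not depend on the drive being watched.

```python
import os
import time
from datetime import datetime

# Hypothetical path: replace with the drive's real entry under /dev/disk/by-id/.
DEVICE_PATH = "/dev/disk/by-id/ata-EXAMPLE_DRIVE_SERIAL"
# Hypothetical log file; assumed to sit on storage independent of the watched drive.
LOG_PATH = "/boot/logs/drive_watch.log"
POLL_SECONDS = 5


class DriveWatcher:
    """Remembers whether a device node is present and reports state changes."""

    def __init__(self, path, exists=os.path.exists):
        self.path = path
        self._exists = exists          # injectable so the logic can be tested
        self.present = exists(path)    # state at start-up

    def poll(self):
        """Return 'disconnected', 'reconnected', or None if nothing changed."""
        now_present = self._exists(self.path)
        if now_present == self.present:
            return None
        self.present = now_present
        return "reconnected" if now_present else "disconnected"


def main():
    """Poll forever, appending a timestamped line for each state change."""
    watcher = DriveWatcher(DEVICE_PATH)
    while True:
        event = watcher.poll()
        if event:
            line = f"{datetime.now().isoformat()} {DEVICE_PATH} {event}\n"
            with open(LOG_PATH, "a") as log:
                log.write(line)
        time.sleep(POLL_SECONDS)
```

Calling `main()` starts the loop. A drop that lasts less than one polling interval would be missed, so this is a rough check to run while testing the new SATA cable, not a replacement for reading the kernel log.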