June 5, 2024
Hey all,

For a very long while, around three years or so, I've been dealing with Unraid hanging on me every time the parity check starts. After four or five minutes, the system would just hang, including all services. Thinking this was a hardware issue, I've swapped out pretty much every component, with no improvement.

More recently, the box started hanging randomly, not just after the monthly parity check started. There are no traces in the syslog that could point me to what's going on. During these last two events, I noticed that the Docker service didn't start when the system booted. A few minutes after I start the service manually, the box hangs.

Today, just a few minutes ago, I started the machine to get the diagnostics file, and again the Docker service didn't start automatically as it was supposed to. This time around I left it off, and the system is still running the parity check without problems.

Even though I still have no clue what's happening, this points to the Docker service or a particular container rather than a hardware issue. Would you be able to help me troubleshoot this further? I'm attaching the diagnostics file from this last boot-up.

Thank you in advance!

unraid-diagnostics-20240605-1800.zip
June 6, 2024 (Community Expert)
Seems unlikely that Docker would make the server crash during a parity check, but it's an easy test: disable the Docker service, reboot, and run a parity check.
June 6, 2024 (Author)
Yeah, that's what I did last night, only to find the server unresponsive again this morning. I'm back to square one.
June 6, 2024 (Community Expert)
Sounds more like a hardware issue, but enable the syslog server and post that after a crash, in case it catches something.
June 6, 2024 (Author)
There is no error in the syslog. I think the diagnostics download includes it. I can only find multiple instances of these lines:

Jun 5 17:51:20 MESIASUNRAID rsyslogd: action 'action-3-builtin:omfwd' resumed (module 'builtin:omfwd') [v8.2102.0 try https://www.rsyslog.com/e/2359 ]
Jun 5 17:51:20 MESIASUNRAID rsyslogd: omfwd/udp: socket 2: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Jun 5 17:51:20 MESIASUNRAID rsyslogd: omfwd: socket 2: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Jun 5 17:51:20 MESIASUNRAID rsyslogd: action 'action-3-builtin:omfwd' suspended (module 'builtin:omfwd'), retry 0. There should be messages before this one giving the reason for suspension. [v8.2102.0 try https://www.rsyslog.com/e/2007 ]
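Since the repeated rsyslogd forwarding messages can bury whatever was logged just before a hang, one way to inspect the log is to drop those lines and look at what remains. The following is a minimal sketch, not an Unraid tool: the function names are hypothetical, and it assumes the classic "Mon D HH:MM:SS host program: message" line layout seen in the excerpt above.

```python
import re
from typing import Iterable, List

# Assumption: lines follow the classic syslog layout
# "Mon D HH:MM:SS host program: message", as in the excerpt above.
_LINE = re.compile(r"^\w{3}\s+\d+\s+[\d:]{8}\s+\S+\s+([^:\[\s]+)")


def drop_forwarding_noise(lines: Iterable[str]) -> List[str]:
    """Return the lines that are not rsyslogd omfwd (remote forwarding) messages."""
    kept = []
    for line in lines:
        match = _LINE.match(line)
        program = match.group(1) if match else ""
        if program == "rsyslogd" and "omfwd" in line:
            continue  # forwarding chatter, not a crash clue
        kept.append(line.rstrip("\n"))
    return kept


def tail_before_crash(lines: Iterable[str], count: int = 50) -> List[str]:
    """Return the last `count` lines that survive the filter."""
    return drop_forwarding_noise(lines)[-count:]
```

To use it, open the saved syslog file, pass the file object to `tail_before_crash`, and print the result. The lines closest to the end are the last ones written before the system stopped logging.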
June 6, 2024 (Author)
I rebooted it again but cancelled the parity check. It's been 9 hours... Let's see tomorrow.
June 7, 2024 (Community Expert)
9 hours ago, Mesias said: "I think the diagnostics download includes it"

The syslog in the diagnostics is the RAM version, which starts afresh every time the system is booted. The diagnostics also include the one generated by the syslog server only if you were using the mirror to flash option; if instead you were using the Remote Syslog Server field, it needs to be supplied manually.
June 7, 2024 (Author)
Well, after canceling the parity check, the system hasn't crashed. I notice now that the shares are not visible on the network. I can still access them through mapped drives in Windows, but I can't see them in Explorer. Very odd...

I've attached the syslog. This log should have everything since 2022.

syslog-MESIASUNRAID.zip
June 9, 2024 (Author)
After a couple of days of uptime, I turned on the Docker service and, after a few minutes, it crashed. Here is the syslog from the bootup on 06/06 up to this last crash. I appreciate your help.

syslog-MESIASUNRAID_20240606-0609.zip
June 9, 2024 (Community Expert)
Could be a container causing issues. Try starting just the Docker service; then, if that's OK, start the containers one by one and retest.
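The one-by-one test suggested above could be scripted roughly as follows. This is only a sketch under stated assumptions: the function names, the log path, and the ten-minute wait are all hypothetical choices, and the only Docker commands used are the standard `docker ps -a --format "{{.Names}}"` and `docker start NAME`. Each step is written and synced to a log file before the container starts, so that after a hard hang the last line shows which container was started most recently.

```python
import os
import subprocess
import time
from typing import Callable, Iterable, List


def list_containers() -> List[str]:
    """Return the names of all containers, running or stopped."""
    out = subprocess.run(
        ["docker", "ps", "-a", "--format", "{{.Names}}"],
        check=True, capture_output=True, text=True,
    ).stdout
    return [name for name in out.splitlines() if name]


def docker_start(name: str) -> None:
    """Start a single container by name."""
    subprocess.run(["docker", "start", name], check=True)


def start_one_by_one(
    names: Iterable[str],
    log_path: str,
    soak_seconds: float = 600,  # assumed wait; tune to how fast the hang appears
    start: Callable[[str], None] = docker_start,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Start containers in order, recording each step before it happens."""
    started = []
    for name in names:
        with open(log_path, "a") as log:
            log.write(f"{time.strftime('%Y-%m-%d %H:%M:%S')} starting {name}\n")
            log.flush()
            os.fsync(log.fileno())  # push the record to disk before the risky step
        start(name)
        started.append(name)
        sleep(soak_seconds)  # give a crash time to show up before moving on
    return started
```

A call such as `start_one_by_one(list_containers(), log_path)` would walk through every container; `log_path` should point at storage that survives a reboot, since a file kept only in RAM would be lost with the hang.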
June 11, 2024 (Author)
I was able to narrow it down to the SABnzbd container. It was in the process of repairing a large movie file, and that seems to be causing the crash. There is no information in the SABnzbd logs or in the syslog... I can see the intense CPU usage but nothing in the logs.

I was reading around and found a couple of interesting options to solve this:

1. ZFS cache drive vs BTRFS. My cache drive is currently on BTRFS, but all other drives in the array are on XFS. Should I change the cache drive to match the rest of the drives?
2. SABnzbd hogging resources when the downloads are on the same drive as the appdata. Some people recommend putting the downloads folders for the *arrs on a separate unassigned drive.

Are these potentially viable solutions? Are you guys aware of any other option?

Edited June 11, 2024 by Mesias (reason: typo)
June 11, 2024 (Community Expert)
You can try changing the filesystem, or use disk paths for the container, or an exclusive share; that may also help.
June 11, 2024 (Author)
I added an unassigned SSD, and the pre-clear process crashed it again... all this while the SABnzbd container is stopped. I don't know what to think anymore... I'd hate to spend on new hardware if this continues happening. Are there any other logs I can enable?
June 12, 2024 (Community Expert)
10 hours ago, Mesias said: "Are there any other logs I can enable?"

Not that I know of.
June 14, 2024 (Author)
Any heavy disk I/O is causing the crash... Based on the symptoms, what are the chances of this being a software issue and not hardware-related?
June 14, 2024 (Community Expert)
On 6/11/2024 at 10:39 PM, Mesias said: "I added an unassigned SSD and the pre-clear process crashed it again"

What do you mean by this? Did just preclearing a device crash the server, or did the preclear itself crash?
June 15, 2024 (Author)
Yes, just pre-clearing the device crashed the server. It left the system unresponsive, with the same symptoms I've described before. No web access, no network access, nothing works.
June 16, 2024 (Community Expert)
13 hours ago, Mesias said: "Yes, just pre-clearing the device crashed the server."

That's very different than what you described before, and sounds more like a hardware issue.