This is the third time I've reached out to support about this issue. The earlier attempts brought no resolution, and the last one didn't even get a response.
Quite a few of my docker containers (which are about the only thing I use Unraid for now) end up in a hung state, typically with s6-sync in a zombie state and maxing out one of my CPU cores. Sometimes I can get a day out of them, sometimes less, sometimes weeks. The instability is making me lose my mind, as I have investigated every single aspect outside Unraid (since I'm not good at troubleshooting this OS). Currently, the main container doing this is nginx-proxy-manager. I used to have issues with Sonarr/Radarr and anything that touched the SMB shares mounted using the Unassigned Devices plugin. I believe I found the source of at least those issues: the shares those dockers used were hosted on a TrueNAS server over SMB, with asynchronous writes disabled. Once I re-enabled that, the issues, at least on those dockers, seem to have DRASTICALLY decreased.
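For anyone trying to confirm the same symptom, here is a minimal sketch of one way to spot zombie or stuck processes such as s6-sync from the host. It assumes Python 3 and a procps-style `ps` are available; the function names and the 90% CPU threshold are hypothetical choices for illustration, not anything taken from the attached diagnostics.

```python
import subprocess

PS_FIELDS = "pid,ppid,stat,pcpu,comm"


def parse_ps(text):
    """Parse `ps -eo pid,ppid,stat,pcpu,comm` output into a list of dicts."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 4)  # comm may contain spaces ("x <defunct>")
        if len(parts) < 5:
            continue
        pid, ppid, stat, pcpu, comm = parts
        rows.append({
            "pid": int(pid),
            "ppid": int(ppid),
            "stat": stat,
            "pcpu": float(pcpu),
            "comm": comm,
        })
    return rows


def suspicious(rows, cpu_threshold=90.0):
    """Return processes that are zombies (Z), in uninterruptible sleep (D),
    or at/above the (assumed) CPU threshold."""
    return [
        r for r in rows
        if r["stat"][0] in ("Z", "D") or r["pcpu"] >= cpu_threshold
    ]


def scan():
    """Run ps on the host and return the suspicious processes."""
    out = subprocess.run(
        ["ps", "-eo", PS_FIELDS],
        capture_output=True, text=True, check=True,
    ).stdout
    return suspicious(parse_ps(out))
```

Calling `scan()` while a container is hung should list the offending process along with its parent PID, which can help tie it back to a specific container. Note that `ps` reports `pcpu` as an average over the process lifetime, so a tool like `top` may be a better check for a process that only recently started spinning.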
Now, the docker I have the most issues with is nginx-proxy-manager. I cannot host my websites if it continually goes down and requires a reboot of Unraid to recover. Unless I'm mistaken, this docker should be nowhere near those SMB shares, which leads me to believe they were never the problem in the first place. If this were happening only with nginx-proxy-manager, I would reach out to that dev specifically, but this has been a problem for me for months with no resolution.
I initially had my array set up with 12 SAS drives and thought they might be the cause. I have since replaced them with 4 SATA drives and still have the same issue. I run a cache drive to stage everything at a faster read/write speed and to store my dockers on. It reports no issues, and I can't find anything in the logs to suggest a problem with this device. I have shut down nearly all non-essential dockers at this point, and the hangs still seem to occur. Short of migrating to a new set of server hardware and rebuilding the USB stick from scratch, I can't find any leads on what may be causing this.
mut7app201-diagnostics-20201129-0129.zip