November 30, 2017

Running unRAID 6.3.5 on an N54L micro-server. I tried to stop a docker container two days ago, and that hung the web interface completely.

The emhttp process is still running but doesn't process any requests: its listen socket has a Recv-Q of 129. One s6-sync process is consuming 100% of one processor core, all of it system time, and it has accumulated 44 hours of CPU time so far. I don't see it perform any disk activity; the counters in /proc/diskstats don't change. The CPU time consumed and the start time of the s6-sync process match quite well with the time when I tried to stop the docker container.

The system mailed an OK health report tonight and the previous night. The last line in dmesg is a disk spindown a couple of days ago. There is nothing in /var/log/ except records of mails having been sent out, my ssh connects, a report that bunker has processed 240 files from a drive, and an attempt I made to do a shutdown. There is nothing recent on the system console either; the last line emitted there is a note from when dynamix.file.integrity.plg was installed.

The reboot attempt on the command line just resulted in the prompt hanging after it had written the reboot message, but the system isn't rebooting, so it's still possible to make a new ssh connection. The shutdown command hangs waiting on uninterruptible I/O. The only results of the shutdown command I can see are that it managed to stop all docker containers and that /var/log/docker.log contains a note about the shutdown of containerd.

The short SMART test is OK for all drives, but one drive can't be accessed: any process touching it hangs in uninterruptible wait.

Current uptime after upgrading to 6.3.5 is 17 days. With a version 6 beta the system got restarted about 1-2 times/year, for dusting or because of loss of power.

It looks like the s6-sync process is busy-looping within the kernel while holding some critical resource lock for one of the file systems.
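For anyone wanting to repeat these checks, here is a rough Python sketch of two of them: listing processes stuck in uninterruptible wait (state D) together with their user and system CPU time, and comparing two snapshots of /proc/diskstats to see whether any counters moved. It is not something that was run on this system. The function names (parse_stat, parse_diskstats, changed_devices, report) are made up for illustration, and it assumes the standard Linux /proc layout and that Python is available on the box.

```python
import os


def parse_stat(stat_text):
    """Parse the text of /proc/<pid>/stat into (pid, comm, state, utime, stime)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the LAST ')' instead of splitting the whole line on spaces.
    left, _, right = stat_text.rpartition(")")
    pid_str, _, comm = left.partition("(")
    fields = right.split()
    # fields[0] is the state letter. utime and stime are fields 14 and 15 of
    # the full line, i.e. indices 11 and 12 once pid and comm are removed.
    return int(pid_str), comm, fields[0], int(fields[11]), int(fields[12])


def parse_diskstats(text):
    """Map device name -> tuple of counters from the text of /proc/diskstats."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 4:  # major, minor, name, then the counters
            stats[parts[2]] = tuple(int(x) for x in parts[3:])
    return stats


def changed_devices(before, after):
    """Names of devices whose counters differ between two snapshots."""
    return sorted(name for name in after if before.get(name) != after[name])


def report(proc_root="/proc"):
    """Print every process in state D with its CPU time in seconds."""
    ticks = os.sysconf("SC_CLK_TCK")  # utime/stime are in clock ticks
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state, utime, stime = parse_stat(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or the entry is unreadable
        if state == "D":
            print(pid, comm, "user", utime / ticks, "s", "system", stime / ticks, "s")
```

Calling report() prints the D-state processes. For the disk check, read /proc/diskstats twice a few seconds apart, pass each text through parse_diskstats, and hand the two results to changed_devices; an empty list means no counters moved in between.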