boyd91 Posted December 30, 2018
Sometimes all of my services running in Docker containers become totally unresponsive and give timeouts on every HTTP request. Only the Unraid GUI still works as fast as always, except for the Docker page, which takes very long to load. It does load eventually, though, and no single container is using much CPU or memory. I've experienced this 4-6 times already and only a reboot fixes it.
When this happens I can still log in to the Unraid web GUI without any trouble. The dashboard bars show 100% memory usage and 100% CPU usage on all my cores. But when I run `top` in the terminal, I see very different (and normal) CPU and memory usage.
I was hoping to be able to provide some diagnostics, but it happens only every few months or so (I will do it the next time it happens). I did look at the syslog last time and there was nothing unusual to see, only the timeouts caused by the unresponsiveness of the containers.
Does anyone recognize these symptoms? Or can anyone give me advice on how to pinpoint the issue? This is seriously hurting the wife acceptance factor, considering Home Assistant is running on it.
MulletBoy Posted January 6, 2019
I'm seeing similar symptoms: extremely high CPU usage in the Unraid dashboard, but very normal CPU usage when I run 'top' or 'htop' in the terminal. Restarts didn't fix it for me, and upgrading to the latest Unraid did not fix it either (I went from 6.6.3 to 6.6.6).
I am running standard stuff: plex, sonarr, couch potato, sabnzbd, letsencrypt (with some wordpress sites), mysql, mariadb, nextcloud, rutorrent, unifi, muximux.
I have found the culprit in my case to be couchpotato. It is erroneously running the mover on my completed torrents folder on repeat. That is, it copies completed torrent xyz to the library location, leaving a copy in the completed folder (as it should, for seeding), but then does not remember that it has already processed that file and just does it again and again, cycling through all the files in the folder.
I haven't figured out what is wrong with couchpotato or how to fix it yet, but at least I know what the issue is, and disabling the couchpotato docker for now has stopped the super high CPU usage.
Johan76 Posted January 14, 2019 (edited: New test - killing off containers)
Have you found out anything else about this? I have the EXACT same problem as you describe. Sometimes, however, it resolves itself. I don't need to reboot; I can actually go into Settings and disable Docker. This forces everything to stop, and then I can restart Docker again and it works normally.
This is really annoying and I am not sure how to check what is causing it. I am not running Couchpotato but Radarr (and Sonarr), and I have not made this connection. I do sometimes have files moving around for what seems like forever, but they don't use all this CPU when that has been occurring.
The most annoying thing is that Plex is unresponsive, so the movies for my kids do not work (if I am not home to "fix" it).
-------------------------
Ok. Thanks for the tip. I logged into the console and killed off the Sonarr docker with the docker command. CPU dropped down to normal use in a few minutes (I guess once all processes belonging to Sonarr were killed). I will try to disable Sonarr for the moment.
Zonediver Posted January 14, 2019 (edited)
Diagnostics??? Logs??? Something else???
Johan76 Posted January 14, 2019
2 hours ago, Zonediver said: "Diagnostics??? Logs??? Something else???"
Good point! It has been at 100% CPU all day. Around 17:00 today I killed the Sonarr docker and everything went back to normal. I think the logs go back a while, so they should cover the period of the problems. Let me know if you need something else.
nas2-diagnostics-20190114-1843.zip
Zonediver Posted January 14, 2019 (edited)
4 hours ago, Johan76 said: "Good point!"
The diagnostics file is "always" important so the specialists can analyse it and see what is happening 😉
zyrmpg Posted June 3, 2019
Did you guys happen to figure this out? I've been seeing this exact same problem for a week now. It's been so frustrating!
Timbiotic Posted November 12, 2019 (edited)
Same thing today on the latest Unraid. Any answers? Attaching diags before rebooting.
lillis.69.mu-diagnostics-20191112-1454.zip
Timbiotic Posted November 12, 2019
And here are the top screenshot and the GUI.
glennv Posted November 12, 2019
Although you may mistakenly think that top shows no activity, check the CPU wait. It's at around 50%, indicating that the CPU is waiting on something (typically I/O). So it's in line with what the GUI is showing, namely about 50% load (2 of the 4 cores are in a wait state).
Timbiotic Posted November 12, 2019
I'm seeing that, as I cannot reboot: Plex and Duplicati won't shut down. I will probably have to fat finger it after work.
Timbiotic Posted November 12, 2019
Is the wait the "wa"? Sorry, I don't use Linux that often.
glennv Posted November 12, 2019 (edited)
3 minutes ago, Timbiotic said: "is the wait the "wa" sorry dont use linux that often"
Yes. CPU usage is always split into user/system/wait/idle. Together they add up to ~100% (ignoring the other, smaller indicators). This means that x% of the time the CPU is either serving the user, busy with internal system activities, waiting for something, or idle doing nothing.
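For anyone who wants to see the same user/system/wait/idle split outside of `top`, here is a minimal Python sketch. It assumes a Linux host where the first line of `/proc/stat` holds cumulative tick counters in the order user, nice, system, idle, iowait, irq, softirq, steal. The function names (`parse_cpu_line`, `cpu_split`, `main`) are made up for this illustration and are not part of Unraid or any tool mentioned in this thread.

```python
import time

# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn a '/proc/stat' cpu line into a dict of cumulative tick counters."""
    parts = line.split()
    if not parts or not parts[0].startswith("cpu"):
        raise ValueError("not a cpu line: %r" % line)
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    # Very old kernels expose fewer columns; treat the missing ones as zero.
    values += [0] * (len(FIELDS) - len(values))
    return dict(zip(FIELDS, values))


def cpu_split(before, after):
    """Percentage of time spent in each state between two samples."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        return {name: 0.0 for name in FIELDS}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def read_cpu_sample(path="/proc/stat"):
    """Read the aggregate cpu line (always the first line of /proc/stat)."""
    with open(path) as handle:
        return parse_cpu_line(handle.readline())


def main(interval=1.0):
    """Sample twice and print the split; 'iowait' is what top shows as 'wa'."""
    first = read_cpu_sample()
    time.sleep(interval)
    second = read_cpu_sample()
    for name, pct in cpu_split(first, second).items():
        print("%-8s %5.1f%%" % (name, pct))
```

Calling `main()` on the server prints one line per state. A large `iowait` value next to small `user` and `system` values would match the situation described above, where the dashboard looks fully loaded but no process appears busy.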
Timbiotic Posted November 12, 2019
Can I find out what it's waiting on and kill it? I hate having to fat finger it. All dockers are stopped except Plex and Duplicati, and I can't stop them from the command line.
glennv Posted November 12, 2019
Did you try killing them like normal Linux processes, with `kill <pid>` or `kill -9 <pid>`? If they won't die with this, it's difficult, as they are likely completely hung. You can also see this effect when a process is hanging on a hard NFS mount that is not there anymore. Typically only a reboot can kill these sessions. You don't happen to have any mounted external shares that they could be hanging on?
Timbiotic Posted November 12, 2019
I killed dockerd. How can I see what specifically it is waiting on? I also unmounted some external unassigned disks.
Dissones4U Posted November 12, 2019 (edited: corrected)
1 hour ago, Timbiotic said: "how can i see what specific is it waiting on"
Have you tried `ps l`? (Try `ps -x` instead; this gives a better list with the current state of the process and the PID.) Anything waiting should be in the (uninterruptible) D state, I think... All of my processes are in the (interruptible) S sleep state.
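As a rough illustration of the D-state check suggested above, the following Python sketch scans `/proc` and lists every process whose state letter is `D` (uninterruptible sleep). It assumes a Linux host and relies on the documented layout of `/proc/<pid>/stat`, where the state is the first field after the parenthesised command name. The helper names are hypothetical and chosen only for this example.

```python
import os


def parse_pid_stat(text):
    """Split the contents of /proc/<pid>/stat into (pid, command, state).

    The command name is wrapped in parentheses and may itself contain
    spaces or ')' characters, so split on the LAST closing parenthesis.
    """
    open_idx = text.index("(")
    close_idx = text.rindex(")")
    pid = int(text[:open_idx])
    command = text[open_idx + 1:close_idx]
    state = text[close_idx + 1:].split()[0]
    return pid, command, state


def uninterruptible_processes(proc_root="/proc"):
    """Return a sorted list of (pid, command) for processes in state 'D'."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'self' or 'meminfo'
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, command, state = parse_pid_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited (or was unreadable) mid-scan
        if state == "D":
            found.append((pid, command))
    return sorted(found)
```

Running `print(uninterruptible_processes())` a few times in a row is more useful than a single run: a process that shows up in state D on every pass is a likelier candidate for whatever is stuck on I/O than one that appears only once.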
glennv Posted November 12, 2019 (edited)
If killing the specific docker still did not clear the iowait situation, then it's not as simple to find the culprit. Check this for some ideas on how to approach it: https://bencane.com/2012/08/06/troubleshooting-high-io-wait-in-linux/
The required commands/tools for deeper troubleshooting can be installed using the nerdtools plugin. iotop, for example, is in there and may be useful. It's not a simple problem that can easily be identified remotely.
edit: iostat is part of the sysstat package of nerdtools
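If installing iotop is not an option, a very rough stand-in can be sketched in Python by comparing the per-process counters in `/proc/<pid>/io` between two snapshots. This is only a sketch under some assumptions: the kernel must expose per-task I/O accounting, reading other users' `/proc/<pid>/io` files generally needs root, and all function names here are invented for the example. Note also that a process hung on a dead mount may move no bytes at all, so this finds heavy I/O rather than stuck I/O.

```python
import os
import time


def parse_proc_io(text):
    """Parse the 'name: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            counters[key.strip()] = int(value)
    return counters


def io_bytes(counters):
    """Bytes actually read from plus written to the storage layer."""
    return counters.get("read_bytes", 0) + counters.get("write_bytes", 0)


def snapshot(proc_root="/proc"):
    """Return {pid: counters} for every process whose io file is readable."""
    result = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "io")) as handle:
                result[int(entry)] = parse_proc_io(handle.read())
        except (OSError, ValueError):
            continue  # no permission, or the process has already exited
    return result


def busiest(before, after, limit=5):
    """Rank pids by bytes moved between two snapshots, largest first."""
    deltas = []
    for pid, now in after.items():
        if pid in before:
            deltas.append((io_bytes(now) - io_bytes(before[pid]), pid))
    deltas.sort(reverse=True)
    return [(pid, delta) for delta, pid in deltas[:limit] if delta > 0]


def main(interval=5.0):
    """Take two snapshots 'interval' seconds apart and print the top movers."""
    first = snapshot()
    time.sleep(interval)
    for pid, delta in busiest(first, snapshot()):
        print("pid %d moved %d bytes" % (pid, delta))
```

Calling `main()` as root prints the handful of PIDs that moved the most data during the interval, which can then be matched to a container or VM.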
Osiris Posted February 1, 2021 (edited)
I experience the EXACT same behaviour. All Docker containers are unresponsive, but still running.
seth-diagnostics-20210201-0154.zip
I solved it by killing 2 factorio-docker containers (kill -9 on the processes involved) and 2 VMs (a Windows console and an idle Ubuntu test system). The other containers became available again.
DrSpaldo Posted July 9
This is an old post and this may be unrelated, but in case people come across it through a Google search like I did: if you are using rsync for transfers, make sure that you do not use the compress flag (-z), as it will cause this exact behaviour. I was getting unresponsive dockers and total CPU usage reported within the Unraid GUI, but little usage in htop. It turned out that after I stopped the rsync transfer and resumed it without the compression flag, the issue was fixed.