AquaWolf Posted February 5, 2022

Hey there,

My Unraid server hangs randomly after about 2 or 3 days. I have now started saving old syslog files to troubleshoot, but I'm a bit lost. I see many lines like these in the logs:

Feb 4 04:34:14 Tower rsyslogd: file '/var/log/syslog'[9] write error - see https://www.rsyslog.com/solving-rsyslog-write-errors/ for help OS error: No space left on device [v8.2002.0 try https://www.rsyslog.com/e/2027 ]
Feb 4 04:34:14 Tower rsyslogd: action 'action-0-builtin:omfile' (module 'builtin:omfile') message lost, could not be processed. Check for additional error messages before this one. [v8.2002.0 try https://www.rsyslog.com/e/2027 ]

I'm not sure where this problem comes from, because Unraid is completely unusable in this state (a reboot does not work either; only a hardware reset does). Normally there should be enough space.

I attached the syslog and diagnostics. The syslog is a bit too big for an attachment: https://drive.google.com/file/d/14N8SRlCZ6ZLJpD9ExvXU_mThHAaw_Pmc/view?usp=sharing
The diagnostics are from the current session.

I would be really thankful for some help troubleshooting this problem.

Kind regards

tower-diagnostics-20220205-1422.zip
Squid Posted February 5, 2022

It's the Docker log that's going insane. Unfortunately, it didn't get attached to the diagnostics for some reason. The only real recourse is to reboot, then wait about half an hour and post a new set of diagnostics.
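A quick way to confirm which file is filling the log filesystem is to list the largest files under it. The following is a minimal sketch, not something posted in the thread: the function name `largest_logs` is hypothetical, it assumes GNU `find` (for `-printf`), and it assumes the logs live under `/var/log`, as the rsyslog error above suggests.

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): list the largest files in a log directory.
# Usage: largest_logs [dir] [count]
largest_logs() {
    local dir="${1:-/var/log}"   # assumption: logs live under /var/log
    local count="${2:-5}"        # how many files to show
    # GNU find prints "<size in bytes> <path>" for every regular file;
    # sort numerically, largest first, and keep the top entries.
    find "$dir" -type f -printf '%s %p\n' 2>/dev/null | sort -rn | head -n "$count"
}
```

Running `df -h /var/log` first shows whether the filesystem is actually full; `largest_logs /var/log` then shows which file is responsible.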
AquaWolf Posted February 5, 2022

OK, I'll send them when I'm back home.
AquaWolf Posted February 5, 2022

OK, I think the Docker log is missing again. I extracted the archive here on my phone, but I can't see a Docker log.

tower-diagnostics-20220205-1538.zip

I will reboot when I'm back home.

Edited February 5, 2022 by AquaWolf
AquaWolf Posted February 5, 2022

Here is a diagnostics file taken after the reboot:

tower-diagnostics-20220205-1625.zip
AquaWolf Posted February 5, 2022

Here are the Docker logs after some time. The container 98f9df2c32a12084a9cd3a5c550446335282d4448621ca2aa4324dea5ff179ac is authelia.

docker.txt

Edited February 5, 2022 by AquaWolf
Squid Posted February 5, 2022

Do any of the containers (especially in Advanced view) show Unhealthy? Try this (it won't hurt on healthy containers either)
AquaWolf Posted February 5, 2022

Thanks a lot, it seems that the Docker log is being spammed less. I'll look into it tomorrow and see whether anything is still flooding the log.

Edit: Even after disabling the healthcheck for those containers, the logs are still being spammed with:

time="2022-02-05T22:02:30.828492862+01:00" level=error msg="collecting stats for 12a3c7b35713a9562b74d2ed3b7cbadfee5143f299a082759f5a9d50edda5521: no metrics received"
time="2022-02-05T22:02:30.834479553+01:00" level=error msg="collecting stats for 0de4d7d5a567b9bc4de167f048bba62e2d79c55856209db598a74db0e3638e84: no metrics received"
time="2022-02-05T22:02:30.835612282+01:00" level=error msg="collecting stats for 62cd22a03f1764c2d67e9d5c0179f462ed942d388ba2c21598fde8ad32154a2c: no metrics received"

After a reboot, only the authelia container is spamming the logs.

Edited February 6, 2022 by AquaWolf
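To see which containers produce most of these errors, the messages can be counted per container ID. This is a minimal sketch under assumptions: the function name `count_stats_errors` is hypothetical, and the log path passed to it is whatever file holds the Docker daemon log on the system (the thread only refers to it as "the docker log").

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): count "no metrics received"
# errors per container ID in a Docker daemon log.
# Usage: count_stats_errors <path-to-docker-log>
count_stats_errors() {
    # Extract only the matching message fragment, e.g.
    #   collecting stats for <64-hex-id>: no metrics received
    grep -o 'collecting stats for [0-9a-f]\{64\}: no metrics received' "$1" |
        awk '{ print $4 }' |   # fourth word is "<id>:"
        tr -d ':' |            # drop the trailing colon
        sort | uniq -c |       # count occurrences per ID
        sort -rn               # most frequent first
}
```

Each output line is a count followed by a full container ID. The IDs can be matched to container names with `docker ps --no-trunc --format '{{.ID}} {{.Names}}'`.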
Solution
AquaWolf Posted February 6, 2022

OK, I found the problem. This could also be helpful for others who are not receiving stats from containers. Here is the GitHub issue: https://github.com/docker/for-linux/issues/219

I put these two lines in the go file:

sudo mkdir /sys/fs/cgroup/systemd
sudo mount -t cgroup -o none,name=systemd cgroup /sys/fs/cgroup/systemd

I found this issue through the following line, which is logged right after starting the authelia container:

level=error msg="loading cgroup for 55949" error="cgroups: cannot find cgroup mount destination"

After applying this fix manually on the console, I was able to get stats for the authelia container, so I added those two commands to the go file. Let's see whether this was the cause of my freezing server, but I think this is only a temporary fix.

Thanks @Squid for giving me some troubleshooting hints 😃
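The two commands above fail if they are run a second time, because the directory already exists and the hierarchy is already mounted. A guarded variant could look like the sketch below. It is not the poster's exact fix: the function names and the `CGROUP_DIR` and `MOUNTS_FILE` variables are hypothetical, and `sudo` is dropped on the assumption that the go file already runs as root.

```shell
#!/bin/bash
# Hypothetical, idempotent variant of the fix from the post above.
# CGROUP_DIR and MOUNTS_FILE can be overridden, e.g. for a dry run.
CGROUP_DIR="${CGROUP_DIR:-/sys/fs/cgroup/systemd}"
MOUNTS_FILE="${MOUNTS_FILE:-/proc/mounts}"

# Succeeds if a cgroup filesystem is already mounted at CGROUP_DIR.
# /proc/mounts fields: device, mount point, filesystem type, options, ...
systemd_cgroup_mounted() {
    awk -v dir="$CGROUP_DIR" \
        '$2 == dir && $3 == "cgroup" { found = 1 } END { exit !found }' \
        "$MOUNTS_FILE"
}

# Creates the mount point and mounts the named systemd hierarchy if needed.
ensure_systemd_cgroup() {
    if systemd_cgroup_mounted; then
        echo "already mounted: $CGROUP_DIR"
        return 0
    fi
    # Same mount command as in the post, without sudo (assumes root).
    mkdir -p "$CGROUP_DIR" &&
        mount -t cgroup -o none,name=systemd cgroup "$CGROUP_DIR"
}
```

With these definitions in the go file, a single `ensure_systemd_cgroup` call after them applies the fix at boot, and running the same call again from the console does nothing if the mount is already in place.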