Januszmirek Posted March 21

I recently started seeing a lot of BTRFS and syslog errors. Normally I wouldn't be bothered, but every few days I now wake up to find that my docker containers are basically not working. Restarting the array doesn't help; only restarting Unraid does, and the issue comes back a few days later. This is what I usually see in the log before a restart:

Mar 18 04:55:22 Tower rsyslogd: action 'action-3-builtin:omfile' (module 'builtin:omfile') message lost, could not be processed. Check for additional error messages before this one. [v8.2102.0 try https://www.rsyslog.com/e/2027 ]
Mar 18 04:55:22 Tower rsyslogd: file '/mnt/user/appdata/syslog-192.168.50.5.log'[2] write error - see https://www.rsyslog.com/solving-rsyslog-write-errors/ for help OS error: No space left on device [v8.2102.0 try https://www.rsyslog.com/e/2027 ]
Mar 21 08:24:10 Tower rsyslogd: omfwd/udp: socket 5: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 21 08:24:10 Tower rsyslogd: omfwd: socket 5: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 15 15:54:31 Tower kernel: BTRFS error (device loop2: state EA): bad tree block start, mirror 1 want 11436949504 have 0

My array disks, cache disk and flash drive show 0 errors. I have included the old log in the zip file (syslog-old.txt) in case it helps someone work out what's wrong with my server. I googled both issues but found nothing helpful. Thanks in advance to anyone who has an idea of what's going on.

tower-diagnostics-20240321-1753.zip
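For anyone triaging a similar log, it can help to count the distinct kinds of failure before chasing any single one. The following is only a rough sketch, not an Unraid or rsyslog tool: the category names are made up for illustration, and the patterns are taken from the messages quoted in the post above.

```python
import re
from collections import Counter

# Category names are illustrative assumptions, not rsyslog or Unraid terminology.
# The patterns come from the log messages quoted in the post above.
PATTERNS = {
    "disk_full": re.compile(r"No space left on device"),
    "network_unreachable": re.compile(r"Network is unreachable"),
    "btrfs_error": re.compile(r"BTRFS error \(device \w+"),
}


def classify(line):
    """Return the first matching category name, or None for unrecognised lines."""
    for name, pattern in PATTERNS.items():
        if pattern.search(line):
            return name
    return None


def summarize(lines):
    """Count how many lines fall into each category."""
    return Counter(c for c in map(classify, lines) if c is not None)


# Log lines copied from the post above.
SAMPLE = [
    "Mar 18 04:55:22 Tower rsyslogd: file '/mnt/user/appdata/syslog-192.168.50.5.log'[2] "
    "write error - see https://www.rsyslog.com/solving-rsyslog-write-errors/ for help "
    "OS error: No space left on device [v8.2102.0 try https://www.rsyslog.com/e/2027 ]",
    "Mar 21 08:24:10 Tower rsyslogd: omfwd/udp: socket 5: sendto() error: "
    "Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]",
    "Mar 21 08:24:10 Tower rsyslogd: omfwd: socket 5: error 101 sending via udp: "
    "Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]",
    "Mar 15 15:54:31 Tower kernel: BTRFS error (device loop2: state EA): "
    "bad tree block start, mirror 1 want 11436949504 have 0",
]

print(summarize(SAMPLE))
```

Running the same `summarize` over a whole syslog file (for example `summarize(open("syslog-old.txt"))`) would show whether "No space left on device" appears alongside the BTRFS errors, which is worth checking before blaming a drive.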
JorgeB Posted March 22

I'm not seeing any btrfs errors in the diags. Did you reboot?
Januszmirek (Author) Posted March 22

7 hours ago, JorgeB said:
I'm not seeing any btrfs errors in the diags, did you reboot?

I did. See syslog-old.txt for the btrfs errors.
JorgeB Posted March 22

I had only checked the other log. The old one is missing the start of the problem, but in any case those errors come from the docker image, so start by recreating it: https://docs.unraid.net/unraid-os/manual/docker-management/#re-create-the-docker-image-file

Also see below if you have any custom docker networks: https://docs.unraid.net/unraid-os/manual/docker-management/#docker-custom-networks
JorgeB Posted March 22

And if there are more errors after that, save the diags before rebooting.
Januszmirek (Author) Posted March 22

3 hours ago, JorgeB said:
Only checked the other one, but the old one is missing the start of the problem, in any case, those errors come from the docker image, so start by recreating it: https://docs.unraid.net/unraid-os/manual/docker-management/#re-create-the-docker-image-file Also see below if you have any custom docker networks: https://docs.unraid.net/unraid-os/manual/docker-management/#docker-custom-networks

Thanks! I'll do that and report back on how it went. I don't think I have set up any custom docker networks; every container's network type is 'host' or 'bridge'.
Januszmirek (Author) Posted March 23

I have rebuilt the docker image and restarted the machine. The syslog errors came back right away:

Mar 23 14:10:10 Tower rsyslogd: omfwd: socket 1: error 101 sending via udp: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]
Mar 23 14:10:10 Tower rsyslogd: omfwd/udp: socket 1: sendto() error: Network is unreachable [v8.2102.0 try https://www.rsyslog.com/e/2354 ]

I wonder if this has anything to do with the network type setting in docker? Currently it is set up as follows:

Docker custom network type: macvlan

Could changing this have an impact on these errors? For the btrfs errors I will probably need to wait a day or two, as these usually happen during the night. Hopefully the docker rebuild solved them.
itimpi Posted March 23

4 hours ago, Januszmirek said:
Docker custom network type: macvlan

If you want to use macvlan then you should have bridging disabled on eth0, or you are likely to have system instability.
Januszmirek (Author) Posted March 24

I don't need macvlan. I'm not sure why it was set up like this in the first place. Anyway, the docker rebuild didn't help. I woke up this morning to find that not only the containers but also the Unraid web interface was unavailable. After a hard reset I rebuilt the docker image again, this time with ipvlan, and it seems to work so far. At least there are no syslog errors in the log. I will monitor those btrfs errors now. Thanks for the hint about macvlan ;)

EDIT: Happiness didn't last long. An hour after the docker rebuild there was a full system crash, and only a reboot helped. New btrfs errors from the log:

Mar 24 20:19:25 Tower kernel: BTRFS info (device loop4): using crc32c (crc32c-intel) checksum algorithm

Is my cache drive dying? It still shows 0 errors.

Edited March 24 by Januszmirek
JorgeB Posted March 24

2 hours ago, Januszmirek said:
Mar 24 20:19:25 Tower kernel: BTRFS info (device loop4): using crc32c (crc32c-intel) checksum algorithm

This is not an error.
Januszmirek (Author) Posted March 26

How about this one:

Mar 26 08:58:40 Tower kernel: BTRFS error (device loop2: state EA): bad tree block start, mirror 1 want 20224475136 have 0
Mar 26 08:58:40 Tower kernel: BTRFS error (device loop2: state EA): bad tree block start, mirror 1 want 20224524288 have 0
Mar 26 08:58:45 Tower kernel: BTRFS error (device loop2: state EA): bad tree block start, mirror 1 want 20224507904 have 0
Mar 26 08:58:45 Tower kernel: BTRFS error (device loop2: state EA): bad tree block start, mirror 1 want 20224475136 have 0
Mar 26 08:58:45 Tower kernel: BTRFS error (device loop2: state EA): bad tree block start, mirror 1 want 20224524288 have 0
Mar 26 08:58:45 Tower kernel: BTRFS error (device loop2: state EA): bad tree block start, mirror 2 want 20224524288 have 0
Mar 26 08:58:45 Tower kernel: BTRFS error (device loop2: state EA): bad tree block start, mirror 1 want 20224507904 have 0
Mar 26 08:58:45 Tower kernel: BTRFS error (device loop2: state EA): bad tree block start, mirror 1 want 20224475136 have 0
Mar 26 08:58:45 Tower kernel: BTRFS error (device loop2: state EA): bad tree block start, mirror 1 want 20224524288 have 0
Mar 26 08:58:45 Tower kernel: BTRFS error (device loop2: state EA): bad tree block start, mirror 2 want 20224524288 have 0
Mar 26 08:58:45 Tower kernel: BTRFS error (device loop2: state EA): bad tree block start, mirror 1 want 20224507904 have 0
Mar 26 08:58:45 Tower kernel: BTRFS error (device loop2: state EA): bad tree block start, mirror 1 want 20224475136 have 0
Mar 26 09:12:57 Tower unraid-api[8774]: ⚠️ Caught exception: EIO: i/o error, scandir '/var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/4c765a904e781a4ced957e91ad602ca741043834bba888d3de0d59fca040f5b0/work'
Mar 26 09:13:04 Tower unraid-api[11408]: ⚠️ Caught exception: EIO: i/o error, scandir '/var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/4c765a904e781a4ced957e91ad602ca741043834bba888d3de0d59fca040f5b0/work'
Mar 26 09:13:11 Tower unraid-api[14508]: ⚠️ Caught exception: EIO: i/o error, scandir '/var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/4c765a904e781a4ced957e91ad602ca741043834bba888d3de0d59fca040f5b0/work'
Mar 26 09:13:18 Tower unraid-api[17221]: ⚠️ Caught exception: EIO: i/o error, scandir '/var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/4c765a904e781a4ced957e91ad602ca741043834bba888d3de0d59fca040f5b0/work'
Mar 26 09:13:25 Tower unraid-api[20009]: ⚠️ Caught exception: EIO: i/o error, scandir '/var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/4c765a904e781a4ced957e91ad602ca741043834bba888d3de0d59fca040f5b0/work'
Mar 26 09:13:32 Tower unraid-api[22508]: ⚠️ Caught exception: EIO: i/o error, scandir '/var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/4c765a904e781a4ced957e91ad602ca741043834bba888d3de0d59fca040f5b0/work'
Mar 26 09:13:39 Tower unraid-api[25349]: ⚠️ Caught exception: EIO: i/o error, scandir '/var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/4c765a904e781a4ced957e91ad602ca741043834bba888d3de0d59fca040f5b0/work'
Mar 26 09:13:46 Tower unraid-api[27752]: ⚠️ Caught exception: EIO: i/o error, scandir '/var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/4c765a904e781a4ced957e91ad602ca741043834bba888d3de0d59fca040f5b0/work'
Mar 26 09:13:53 Tower unraid-api[30200]: ⚠️ Caught exception: EIO: i/o error, scandir '/var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/4c765a904e781a4ced957e91ad602ca741043834bba888d3de0d59fca040f5b0/work'
Mar 26 09:14:00 Tower unraid-api[32595]: ⚠️ Caught exception: EIO: i/o error, scandir '/var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/4c765a904e781a4ced957e91ad602ca741043834bba888d3de0d59fca040f5b0/work'
Mar 26 09:14:07 Tower unraid-api[2590]: ⚠️ Caught exception: EIO: i/o error, scandir '/var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/4c765a904e781a4ced957e91ad602ca741043834bba888d3de0d59fca040f5b0/work'
Mar 26 09:14:14 Tower unraid-api[5324]: ⚠️ Caught exception: EIO: i/o error, scandir '/var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/4c765a904e781a4ced957e91ad602ca741043834bba888d3de0d59fca040f5b0/work'
Mar 26 09:14:21 Tower unraid-api[7854]: ⚠️ Caught exception: EIO: i/o error, scandir '/var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/4c765a904e781a4ced957e91ad602ca741043834bba888d3de0d59fca040f5b0/work'
Mar 26 09:14:28 Tower unraid-api[9611]: ⚠️ Caught exception: EIO: i/o error, scandir '/var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/4c765a904e781a4ced957e91ad602ca741043834bba888d3de0d59fca040f5b0/work'
Mar 26 09:14:35 Tower unraid-api[11448]: ⚠️ Caught exception: EIO: i/o error, scandir '/var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/4c765a904e781a4ced957e91ad602ca741043834bba888d3de0d59fca040f5b0/work'
Mar 26 09:14:42 Tower unraid-api[13675]: ⚠️ Caught exception: EIO: i/o error, scandir '/var/run/docker/containerd/daemon/io.containerd.runtime.v2.task/moby/4c765a904e781a4ced957e91ad602ca741043834bba888d3de0d59fca040f5b0/work'

It's getting ridiculous now. Some containers have started to behave really weirdly. Nginx won't generate new SSL certs. Plex web does not open. I tried to remove the container, but an 'Execution error Server error' pop-up appears and I am unable to remove it. I will try restarting Unraid, but this is becoming a chore and a far cry from the rock-solid experience I had with the machine for the last few years.
JorgeB Posted March 26

2 hours ago, Januszmirek said:
Mar 26 08:58:40 Tower kernel: BTRFS error (device loop2: state EA): bad tree block start, mirror 1 want 20224475136 have 0
Mar 26 08:58:40 Tower kernel: BTRFS error (device loop2: state EA): bad tree block start, mirror 1 want 20224524288 have 0

These indicate a corrupt docker image. Recreate it: https://docs.unraid.net/unraid-os/manual/docker-management/#re-create-the-docker-image-file

Also see below if you have any custom docker networks: https://docs.unraid.net/unraid-os/manual/docker-management/#docker-custom-networks
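When a log contains many "bad tree block start" lines like the ones quoted above, it may be easier to see which loop device and which block addresses are affected by grouping them. This is a minimal sketch that assumes the message layout seen in this thread; `BAD_BLOCK` and `bad_blocks` are hypothetical names, and kernel message formats can differ between versions.

```python
import re
from collections import Counter

# Matches the message layout quoted in this thread, e.g.
# "BTRFS error (device loop2: state EA): bad tree block start, mirror 1 want 20224475136 have 0"
# This layout is an assumption based on those lines only.
BAD_BLOCK = re.compile(
    r"BTRFS error \(device (?P<device>[^:)]+)[^)]*\): bad tree block start, "
    r"mirror (?P<mirror>\d+) want (?P<want>\d+) have (?P<have>\d+)"
)


def bad_blocks(lines):
    """Count occurrences per (device, wanted block address)."""
    counts = Counter()
    for line in lines:
        match = BAD_BLOCK.search(line)
        if match:
            counts[(match["device"], int(match["want"]))] += 1
    return counts


# The two lines quoted in the post above.
QUOTED = [
    "Mar 26 08:58:40 Tower kernel: BTRFS error (device loop2: state EA): "
    "bad tree block start, mirror 1 want 20224475136 have 0",
    "Mar 26 08:58:40 Tower kernel: BTRFS error (device loop2: state EA): "
    "bad tree block start, mirror 1 want 20224524288 have 0",
]

for (device, address), count in sorted(bad_blocks(QUOTED).items()):
    print(device, address, count)
```

If every hit names the same loop device, as here, the errors point at whatever image is mounted on that device rather than at the physical disks directly.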
Januszmirek (Author) Posted April 6 (Solution)

So I finally solved the issue. It turns out it wasn't a corrupt docker image, a problem with docker networks, or anything else that I initially suspected.

One other issue I had been encountering for months (but somehow did not connect with this one) is that every night I got notifications that the cache disk was filling up (to 100%). I had no idea what this was about, because in the morning everything was fine and the cache disk was maybe 60% full.

I had forgotten that I created a VM of about 600 GB on my array; it needed more space than my cache has. What I forgot to change after creating the VM was the setting, so that Unraid would not attempt to move the VM to the cache. Basically as shown below: (screenshot not preserved)

I really did not need this VM anymore, so I deleted it and boom! All the problems disappeared together. No more btrfs or syslog errors. I doubt this will help anyone, but I wanted to let you know the issue is resolved.
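The root cause described above (a file larger than the cache can hold, combined with a cache that fills to 100%) is something a short script could flag. The sketch below is only an illustration and not an Unraid tool: `usage_percent` and `oversized_files` are hypothetical helpers, and paths such as `/mnt/cache` or `/mnt/user/domains` are assumptions that may not match a given server.

```python
import os
import shutil


def usage_percent(path):
    """Return how full the filesystem containing `path` is, as a percentage."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def oversized_files(root, capacity_bytes):
    """List (path, size) for files under `root` larger than `capacity_bytes`.

    Uses the apparent file size, so sparse disk images are reported at
    their full nominal size.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(full_path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if size > capacity_bytes:
                found.append((full_path, size))
    # Largest first, so the worst offender is at the top.
    return sorted(found, key=lambda item: item[1], reverse=True)
```

A possible use, with assumed paths: `oversized_files("/mnt/user/domains", shutil.disk_usage("/mnt/cache").total)` would list any VM image that can never fit on the cache, and `usage_percent("/mnt/cache")` could be logged nightly to catch the fill-up pattern described in the post.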