[Solved] /var/log/docker.log.1 filling up /var/log


MrLinux

Recommended Posts

Hi, for some reason my /var/log/docker.log.1 keeps getting filled up.

 

root@unraid:/var/log# tail docker.log.1
time="2020-11-24T07:52:02.278261235-08:00" level=error msg="Failed to log msg \"\" for logger json-file: write /var/lib/docker/containers/7cc59575b280c9b47d562787e36b753ffdbfeba1f3c686e75886dd18280ae207/7cc59575b280c9b47d562787e36b753ffdbfeba1f3c686e75886dd18280ae207-json.log: read-only file system"
time="2020-11-24T07:52:02.278277104-08:00" level=error msg="Failed to log msg \"\" for logger json-file: write /var/lib/docker/containers/7cc59575b280c9b47d562787e36b753ffdbfeba1f3c686e75886dd18280ae207/7cc59575b280c9b47d562787e36b753ffdbfeba1f3c686e75886dd18280ae207-json.log: read-only file system"
time="2020-11-24T07:52:02.278290524-08:00" level=error msg="Failed to log msg \"\" for logger json-file: write /var/lib/docker/containers/7cc59575b280c9b47d562787e36b753ffdbfeba1f3c686e75886dd18280ae207/7cc59575b280c9b47d562787e36b753ffdbfeba1f3c686e75886dd18280ae207-json.log: read-only file system"

Things I've tried to do:

  1. A few restarts
  2. Stop Docker service > Delete docker image file > Start docker service > reinstall dockers
  3. Increased /var/log in my go file from 128MB to 384MB

 

I can't really think of anything that would have caused this. At first, I thought it was because my cache pool was getting full but I've already addressed the docker that was causing it to fill up. This last time my cache pool has plenty of space (>500GB).

 

root@unraid:/mnt/cache# df -hP
Filesystem             Size  Used Avail Use% Mounted on
rootfs                  16G 1022M   15G   7% /
devtmpfs                16G     0   16G   0% /dev
tmpfs                   16G     0   16G   0% /dev/shm
cgroup_root            8.0M     0  8.0M   0% /sys/fs/cgroup
tmpfs                  384M  384M     0 100% /var/log
/dev/sda1               30G  294M   30G   1% /boot
/dev/loop0             9.4M  9.4M     0 100% /lib/modules
/dev/loop1             8.5M  8.5M     0 100% /lib/firmware
tmpfs                  1.0M     0  1.0M   0% /mnt/disks
/dev/mapper/md1        3.7T  931G  2.8T  25% /mnt/disk1
/dev/mapper/md2         11T  6.5T  4.5T  60% /mnt/disk2
/dev/mapper/md3        9.1T  8.3T  846G  91% /mnt/disk3
/dev/mapper/md4        3.7T   26G  3.7T   1% /mnt/disk4
/dev/mapper/md5        3.7T   26G  3.7T   1% /mnt/disk5
/dev/mapper/nvme0n1p1  932G  387G  544G  42% /mnt/cache
shfs                    31T   16T   16T  51% /mnt/user0
shfs                    31T   16T   16T  51% /mnt/user
/dev/loop3             1.0G  4.0M  905M   1% /etc/libvirt
/dev/loop2              50G   20G   31G  40% /var/lib/docker

 

root@unraid:/var/log# ls -lSshr | tail -5
4.0K -rw-rw-rw- 1 root   root 1.3K Nov 23 00:42 vfio-pci
8.0K -rw-rw-r-- 1 root   utmp 7.9K Nov 25 10:58 wtmp
 64K -rw-rw-rw- 1 root   root  64K Nov 23 00:42 dmesg
720K -rw-r--r-- 1 root   root 720K Nov 25 10:58 syslog
384M -rw-rw-rw- 1 root   root 384M Nov 24 13:14 docker.log.1

 

Thanks for the help!

 

Solution / TL;DR

This could have been a combination of a few issues (in my use case, though many of these shouldn't be related to each other)

  1. docker.img corrupted (most probable reason for the log file getting filled up with read-only errors)
    1. Stop the docker service
    2. delete /mnt/user/system/docker/docker.img
    3. Start the docker service
    4. re-install docker containers
  2. IPMI Plugin issues
    1. Stopped this for a while to see the ipmi errors in syslog go away
    2. I've since re-enabled this so I can get networks stats back (I'll move to influx/grafana at some point)
  3. Github connectivity issues
    1. I've read that loading the plugins page can be affected if github is inaccessible. 
    2. To work around this, disable checking for plugins in Settings > Notification Settings > Plugins update (Disabled)
    3. There may have been another location but don't remember.
  4. I've also updated from v6.9.0b25 to v6.9.0b35

unraid-diagnostics-20201125-1103.zip

Edited by MrLinux
Link to comment

The reason that log is filling is because docker.img is corrupt.

 

Why have you given 50G to docker.img? 20G should be more than enough, but I see you have already used 20G of that 50, so I assume you have you had problems filling it.

 

You will have to delete and recreate it, 20G should be plenty. Then you need to figure out why you are filling it. You can reinstall your docker apps just as they were using the Previous Apps feature on the Apps page, but just as they were is probably not good enough.

 

The usual reason for filling docker.img is an application writing to a path that isn't mapped. Common mistake is specifying a path within the application that doesn't exactly match the container path in the mappings in upper/lower case.

 

Not related but your domains share has files on the array. Is that intentional? Do you have any VMs?

Link to comment

Thanks for the suggestion, but I've tried deleting the docker image file and re-installing (see #2 of things I've tried already).

That being said, 20G does seem a bit much. Where can I go find which Docker is eating up all the space?

 

Going into /var/lib/docker didn't seem to help much as it's showing not much space is being taken up at all

root@unraid:/var/lib/docker# du -shx *
0	btrfs
16K	builder
56K	buildkit
1.1M	containerd
131M	containers
45M	image
148K	network
0	plugins
0	runtimes
0	swarm
0	tmp
0	trust
964K	unraid
4.0K	unraid-autostart
8.0K	unraid-update-status.json
14M	volumes

While `df` is showing /var/lib/docker/ is taking up 20GB

root@unraid:/var/lib/docker# df -hP
Filesystem             Size  Used Avail Use% Mounted on
rootfs                  16G  988M   15G   7% /
devtmpfs                16G     0   16G   0% /dev
tmpfs                   16G     0   16G   0% /dev/shm
cgroup_root            8.0M     0  8.0M   0% /sys/fs/cgroup
tmpfs                  384M  2.1M  382M   1% /var/log
/dev/sda1               30G  294M   30G   1% /boot
/dev/loop0             9.4M  9.4M     0 100% /lib/modules
/dev/loop1             8.5M  8.5M     0 100% /lib/firmware
tmpfs                  1.0M     0  1.0M   0% /mnt/disks
/dev/mapper/md1        3.7T  931G  2.8T  25% /mnt/disk1
/dev/mapper/md2         11T  6.5T  4.5T  60% /mnt/disk2
/dev/mapper/md3        9.1T  8.3T  846G  91% /mnt/disk3
/dev/mapper/md4        3.7T   26G  3.7T   1% /mnt/disk4
/dev/mapper/md5        3.7T   26G  3.7T   1% /mnt/disk5
/dev/mapper/nvme0n1p1  932G  422G  509G  46% /mnt/cache
shfs                    31T   16T   16T  51% /mnt/user0
shfs                    31T   16T   16T  51% /mnt/user
/dev/loop2              50G   20G   31G  39% /var/lib/docker
/dev/loop3             1.0G  4.0M  905M   1% /etc/libvirt

 

Link to comment
5 minutes ago, MrLinux said:

In another thread I see you were using 8GB with a dozen Dockers.

That was a very old thread. In several more recent threads the way I put it is I have 17 dockers and they are using less than half of 20G.

 

14 minutes ago, MrLinux said:

Where can I go find which Docker is eating up all the space?

Docker page Container Size button at bottom.

  • Like 1
Link to comment

In the Docker, /config doesn't appear to be a separate volume, but is pointing to appdata in /mnt/user/appdata/. Is this expected? 

 

sh-5.0# root@unraid:/mnt/user/appdata# docker_it binhex-jellyfin
sh-5.0# df -hP
Filesystem      Size  Used Avail Use% Mounted on
/dev/loop2       50G   21G   30G  41% /
tmpfs            64M     0   64M   0% /dev
tmpfs            16G     0   16G   0% /sys/fs/cgroup
shm              64M     0   64M   0% /dev/shm
shfs             31T   16T   16T  51% /media
/dev/loop2       50G   21G   30G  41% /etc/hosts
tmpfs            16G     0   16G   0% /proc/acpi
tmpfs            16G     0   16G   0% /sys/firmware

sh-5.0# cd /
sh-5.0# du -shxc ./*
4.0K	./bin
0	./boot
4.7G	./config
0	./dev
8.1M	./etc
2.2G	./home
4.0K	./lib
4.0K	./lib64
10T	./media
0	./mnt
0	./opt
du: cannot read directory './proc/68/map_files': Permission denied
du: cannot read directory './proc/70/map_files': Permission denied
du: cannot access './proc/4247/task/4247/fd/4': No such file or directory
du: cannot access './proc/4247/task/4247/fdinfo/4': No such file or directory
du: cannot access './proc/4247/fd/4': No such file or directory
du: cannot access './proc/4247/fdinfo/4': No such file or directory
0	./proc
12K	./root
0	./run
4.0K	./sbin
0	./srv
4.0K	./supervisord.pid
0	./sys
888K	./tmp
1.5G	./usr
14M	./var
10T	total
sh-5.0#

Note: docker_it is just my alias to quickly login the docker

 

Does this seem reasonable?

2020-11-28_02-46-17.thumb.png.98b766d6d9f0452e7f47250db2aa7ac6.png

Link to comment
4 hours ago, MrLinux said:

/config doesn't appear to be a separate volume, but is pointing to appdata in /mnt/user/appdata/. Is this expected? 

It is typical for /config to be mapped to a subfolder of the appdata user share (/mnt/user/appdata). Makes me wonder why you even asked that question.

 

4 hours ago, MrLinux said:

Note: docker_it is just my alias to quickly login the docker

You can get to the command line inside any of your docker containers by simply clicking on its icon and selecting Console.

Link to comment
You can get to the command line inside any of your docker containers by simply clicking on its icon and selecting Console.

I like ssh’ing to the box directly. I use iTerm to have split windows.

 

Regarding /config, I was just pointing out that it doesn’t appear to be on its own file system like /media.

 

 

Sent from my iPhone using Tapatalk

Link to comment
Command:
root@localhost:# /usr/local/emhttp/plugins/dynamix.docker.manager/scripts/docker run -d --name='binhex-jellyfin' --net='br0.255' --ip='10.1.255.2' -e TZ="America/Los_Angeles" -e HOST_OS="Unraid" -e 'TCP_PORT_8096'='8096' -e 'UMASK'='000' -e 'PUID'='99' -e 'PGID'='100' -v '/mnt/user/Media/':'/media':'rw' -v '/mnt/user/appdata/binhex-jellyfin/':'/config':'rw' 'binhex/arch-jellyfin'
52e486470c7c51a7f66090a6170f549a9e6b885ce943f28f32fe23cc43cd5358

The command finished successfully!

image.thumb.png.318eff3f57634bdffca49e57d4d49ad1.png

 

 

I tested if the container's /config is really mapped to /mnt/user/appdata/binhex-jellyfin. The test below shows that it is mapped, even though /config is not seen as a mountpoint in the container (df -hP). 

image.thumb.png.3b2dc66182d8c5409f865745e82a5ce8.png

Thanks for the help.

Edited by MrLinux
Link to comment

TL:DR I'm finding a hard time finding my config/mapping issues to show that a 20GB of used docker.img is unreasonable with 29 containers. BTW, it seems many of my syslog errors went away after disabling daily plugin checks and disable/re-enable ipmi. I've since re-enabled daily plugin checks and the errors have not come back. I read it could have been related to being unable to connect to github if it had an outage or some other routing issues.

 

Checked what is using up the 5.88GB in the container, but see it's only using up 3.6GB instead. Where is the other ~2.2GB coming from? I see "Writable" is roughly this size of ~2.2GB.

 

2020-11-29_04-51-22.thumb.png.7af7528c3679a731b87a644624c1a519.png

 

Where is this 2.2GB coming from? The only location that's big enough to be considered in the local filesystem is in /home, as there are no other directories large enough to account for the 2.2GB.

sh-5.0# find / -maxdepth 1 -mindepth 1 -type d  \! -exec mountpoint -q {} \; -exec du -xshc {} + | sort -h
0	/boot
0	/mnt
0	/opt
0	/run
0	/srv
4.0K	/.gnupg
12K	/root
24K	/tmp
8.1M	/etc
14M	/var
1.5G	/usr
2.2G	/home # <----------------------
3.6G	total

 

Inside home is just a user called nobody. The files in here do not appear to be user-created files from the application. 

sh-5.0# cd /home
sh-5.0# ls -la
total 0
drwxr-xr-x 1 root   root   12 Sep  2 01:12 .
drwxr-xr-x 1 root   root  206 Nov 29 04:06 ..
drwxrwxr-x 1 nobody users 130 Nov 29 04:08 nobody
sh-5.0# cd nobody/
sh-5.0# ls -la
total 8
drwxrwxr-x 1 nobody users 130 Nov 29 04:08 .
drwxr-xr-x 1 root   root   12 Sep  2 01:12 ..
drwxrwxrwx 1 nobody users  38 Nov 29 04:06 .aspnet
-rw------- 1 root   root  581 Nov 29 04:23 .bash_history
drwxrwxr-x 1 nobody users  34 Sep  5 11:32 .cache
drwxrwxr-x 1 nobody users   6 Sep  5 11:29 .config
drwxrwxr-x 1 nobody users 220 Sep  5 11:35 .dotnet
drwxrwxr-x 1 nobody users  10 Sep  5 11:36 .local
drwxrwxr-x 1 nobody users  26 Sep  5 11:36 .nuget
-rwxrwxr-x 1 nobody users 467 Sep  5 11:28 start.sh
drwxrwxr-x 1 nobody users   6 Sep  5 11:32 .yarn
sh-5.0# find . -maxdepth 1 -mindepth 1 -type d  \! -exec mountpoint -q {} \; -exec du -xshc {} + | sort -h
0	./.yarn
4.0K	./.aspnet
4.0K	./.config
12K	./.dotnet
254M	./.local
639M	./.cache
1.3G	./.nuget
2.2G	total

 

Using mountpoint, I found that /config is actually a mountpoint, even though it doesn't appear as one using df.

sh-5.0# mountpoint /config
/config is a mountpoint

sh-5.0# mountpoint /media
/media is a mountpoint

sh-5.0# mountpoint /home
/home is not a mountpoint

sh-5.0# mountpoint /usr
/usr is not a mountpoint

 

Edited by MrLinux
Link to comment
  • 1 year later...

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.