2 issues, full log file and missing dockers


IpDo

Hi,

I've had 2 issues for a long while now that I keep ignoring, but I would like to get them fixed before upgrading to 5.10.

 

1. The log file is filling up fast.

My current uptime is 40 days, but according to the "Fix Common Problems" plugin the log fills up within a few days at most.

diagnostics attached.

 

2. My second issue is more annoying: when SOME containers stop, they disappear from the "docker" tab. I can get them running again by adding a new container with the same config file, and everything works fine as long as they are running.

The issue mostly bothers me after a server restart (the dockers are missing and will not start automatically) or when something goes wrong after updating a docker container (more annoying, as the logs will be missing as well).

 

I can see the issue easily with 2 of my dockers:

a. "frigate" - using "my-frigate.xml"

b. "deepstack_gpux" using "my-deepstack_gpux.xml"

 

I might have created those two configs manually or changed them from the base version, as I don't think they were available via the community addon when I started using them. I'm not sure, as too much time has passed.

I've attached the docker files for those two as well.

 

Any help will be appreciated :)

tower-diagnostics-20211026-1921.zip my-frigate.xml my-deepstack_gpux.xml

Link to comment

Why do you have 100G docker.img? Have you had problems filling it? 20G is often more than enough, and making it larger won't fix filling it, it will only make it take longer to fill. The usual cause of filling docker.img is an application writing to a path that isn't mapped. Linux is case-sensitive, so a path that differs from the mapped one only in upper/lower case is a different, unmapped path.
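
To make the "path that isn't mapped" point concrete, here is a minimal sketch of the check one might do by hand: compare the path an application is configured to write to against the container paths in the template. The function name `is_mapped` and the example paths are hypothetical and are not taken from the attached XML files.

```python
from pathlib import PurePosixPath


def is_mapped(app_path, container_paths):
    """Return True if app_path equals, or lies inside, a mapped container path.

    PurePosixPath compares case-sensitively, matching how Linux treats paths.
    """
    target = PurePosixPath(app_path)
    for mapped in container_paths:
        base = PurePosixPath(mapped)
        if target == base or base in target.parents:
            return True
    return False


# Hypothetical container-side mappings, for illustration only.
mappings = ["/config", "/media/frigate"]

for path in ["/media/frigate/clips", "/Media/frigate/clips", "/tmp/cache"]:
    status = "mapped" if is_mapped(path, mappings) else "NOT mapped (lands in docker.img)"
    print(f"{path}: {status}")
```

Anything reported as not mapped would be written inside the container's own filesystem, which lives in docker.img.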

 

Your syslog is being flooded with these, and in fact, you ran out of log space nearly a month ago.

Sep 28 04:40:24 Tower kernel: usb 1-1: USB disconnect, device number 41
Sep 28 04:40:24 Tower kernel: usb 1-1: new full-speed USB device number 42 using xhci_hcd
Sep 28 04:40:24 Tower kernel: cdc_acm 1-1:1.0: ttyACM1: USB ACM device
Sep 28 04:40:28 Tower kernel: usb 1-1: USB disconnect, device number 42
Sep 28 04:40:28 Tower kernel: usb 1-1: new full-speed USB device number 43 using xhci_hcd
Sep 28 04:40:28 Tower kernel: cdc_acm 1-1:1.0: ttyACM1: USB ACM device
Sep 28 04:41:23 Tower kernel: usb 1-1: USB disconnect, device number 43
Sep 28 04:41:23 Tower kernel: usb 1-1: new full-speed USB device number 44 using xhci_hcd
Sep 28 04:41:24 Tower kernel: cdc_acm 1-1:1.0: ttyACM1: USB ACM device
Sep 28 04:41:27 Tower kernel: usb 1-1: USB disconnect, device number 44
Sep 28 04:41:27 Tower kernel: usb 1-1: new full-speed USB device number 45 using xhci_hcd
Sep 28 04:41:28 Tower kernel: cdc_acm 1-1:1.0: ttyACM1: USB ACM device
Sep 28 04:42:10 Tower kernel: usb 1-1: USB disconnect, device number 45
Sep 28 04:42:10 Tower kernel: usb 1-1: new full-speed USB device number 46 using xhci_hcd
Sep 28 04:42:10 Tower kernel: cdc_acm 1-1:1.0: ttyACM1: USB ACM device
Sep 28 04:42:13 Tower kernel: usb 1-1: USB disconnect, device number 46
Sep 28 04:42:14 Tower kernel: usb 1-1: new full-speed USB device number 47 using xhci_hcd
Sep 28 04:42:14 Tower kernel: cdc_acm 1-1:1.0: ttyACM1: USB ACM device
Sep 28 04:43:24 Tower kernel: usb 1-1: USB disconnect, device number 47
Sep 28 04:43:24 Tower kernel: usb 1-1: new full-speed USB device number 48 using xhci_hcd
Sep 28 04:43:24 Tower kernel: cdc_acm 1-1:1.0: ttyACM1: USB ACM device
Sep 28 04:43:28 Tower kernel: usb 1-1: USB disconnect, device number 48
Sep 28 04:43:28 Tower kernel: usb 1-1: new full-speed USB device number 49 using xhci_hcd
Sep 28 04:43:28 Tower kernel: cdc_acm 1-1:1.0: ttyACM1: USB ACM device
Sep 28 04:44:23 Tower kernel: usb 1-1: USB disconnect, device number 49
Sep 28 04:44:23 Tower kernel: usb 1-1: new full-speed USB device number 50 using xhci_hcd
Sep 28 04:44:24 Tower kernel: cdc_acm 1-1:1.0: ttyACM1: USB ACM device
Sep 28 04:44:27 Tower kernel: usb 1-1: USB disconnect, device number 50

Any idea what that is about? You need to fix whatever is causing it. You will have to reboot to get syslog cleared.
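
As a rough way to see how often this happens and on which port, the disconnect messages can be counted per USB port. This is only a sketch: the regular expression is written against the lines quoted above, and the helper name `count_disconnects` is made up for the example.

```python
import re
from collections import Counter

# Matches lines such as:
#   "... kernel: usb 1-1: USB disconnect, device number 41"
DISCONNECT = re.compile(r"kernel: usb ([\d.-]+): USB disconnect, device number \d+")


def count_disconnects(lines):
    """Count 'USB disconnect' kernel messages per USB port (e.g. '1-1')."""
    counts = Counter()
    for line in lines:
        match = DISCONNECT.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = [
    "Sep 28 04:40:24 Tower kernel: usb 1-1: USB disconnect, device number 41",
    "Sep 28 04:40:24 Tower kernel: usb 1-1: new full-speed USB device number 42 using xhci_hcd",
    "Sep 28 04:40:24 Tower kernel: cdc_acm 1-1:1.0: ttyACM1: USB ACM device",
    "Sep 28 04:40:28 Tower kernel: usb 1-1: USB disconnect, device number 42",
]

for port, count in count_disconnects(sample).most_common():
    print(f"usb {port}: {count} disconnects")
```

To run it against the real log, pass an open file instead of the sample list, for example `count_disconnects(open("/var/log/syslog"))`.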

Link to comment

I have / had a few dockers for image processing with large images (frigate, for example, is about 1.5 GB), so I just changed it to 100 GB to be on the safe side. It's not filling up as far as I can tell.

 

ttyACM1 should be one of the zigbee sticks. I'll check it out. Thanks!

Is there any way to clear the logs without rebooting, to see if the problem keeps happening?

Link to comment

Before removing the log files I wanted to check their sizes.

it led me to this:

root@Tower:/var/log# ls -l
total 3872
-rw------- 1 root   root       0 Apr  7  2021 btmp
-rw-r--r-- 1 root   root       0 Apr 10  2020 cron
-rw-r--r-- 1 root   root       0 Apr 10  2020 debug
-rw-rw-rw- 1 root   root     518 Sep 16 15:42 diskinfo.log
-rw-rw-rw- 1 root   root   66415 Sep 16 15:40 dmesg
-rw-rw-rw- 1 root   root   12288 Oct 26 19:05 docker.log
-rw-r--r-- 1 root   root       0 Jun 16  2020 faillog
-rw-r--r-- 1 root   root       0 Apr  8  2000 lastlog
drwxr-xr-x 3 root   root     140 Oct 24 04:40 libvirt/
-rw-r--r-- 1 root   root       0 Apr 10  2020 maillog
-rw-r--r-- 1 root   root       0 Apr 10  2020 messages
drwxr-xr-x 2 root   root      40 May 16  2001 nfsd/
drwxr-x--- 2 nobody root      60 Sep 16 15:42 nginx/
lrwxrwxrwx 1 root   root      24 Apr  7  2021 packages -> ../lib/pkgtools/packages/
drwxr-xr-x 5 root   root     100 Sep 16 15:41 pkgtools/
drwxr-xr-x 2 root   root     300 Oct 26 19:47 plugins/
-rw-rw-rw- 1 root   root       0 Sep 16 15:41 preclear.disk.log
drwxr-xr-x 2 root   root      40 Sep 16 15:44 pwfail/
lrwxrwxrwx 1 root   root      25 Apr  7  2021 removed_packages -> pkgtools/removed_packages/
lrwxrwxrwx 1 root   root      24 Apr  7  2021 removed_scripts -> pkgtools/removed_scripts/
lrwxrwxrwx 1 root   root      34 Sep 16 15:41 removed_uninstall_scripts -> pkgtools/removed_uninstall_scripts/
drwxr-xr-x 3 root   root     180 Sep 16 15:51 samba/
lrwxrwxrwx 1 root   root      23 Apr  7  2021 scripts -> ../lib/pkgtools/scripts/
-rw-r--r-- 1 root   root       0 Apr 10  2020 secure
lrwxrwxrwx 1 root   root      21 Apr  7  2021 setup -> ../lib/pkgtools/setup/
-rw-r--r-- 1 root   root       0 Apr 10  2020 spooler
drwxr-xr-x 3 root   root      60 Jul 16  2020 swtpm/
-rw-r--r-- 1 root   root  917504 Oct 26 21:28 syslog
-rw-r--r-- 1 root   root 1491281 Sep 28 04:39 syslog.1
-rw-r--r-- 1 root   root 1457028 Sep 26 04:39 syslog.2
-rw-rw-rw- 1 root   root       0 Sep 16 15:40 vfio-pci
-rw-rw-r-- 1 root   utmp    6912 Sep 16 15:41 wtmp

root@Tower:/var/log/libvirt# ls -l
total 127192
-rw------- 1 root root 129273720 Oct 26 21:34 libvirtd.log
-rw------- 1 root root    967863 Sep 19 04:40 libvirtd.log.5.gz
drwxr-xr-x 2 root root        60 Sep 16 15:42 qemu/
-rw------- 1 root root         0 Sep 16 15:42 virtlockd.log
-rw------- 1 root root         0 Sep 16 15:42 virtlogd.log

 

The "libvirtd.log" seems to be the one that clogs the logs, if I'm not mistaken.

I've attached it compressed, as it is 123 MB raw.
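
For anyone who wants to find the culprit without going through each directory by hand, here is a minimal sketch that walks a directory tree and lists the largest files. The function name `largest_files` is hypothetical; only `/var/log` comes from the listing above.

```python
import os


def largest_files(root, limit=5):
    """Return up to `limit` (size_in_bytes, path) pairs under root, largest first."""
    sizes = []
    # os.walk does not descend into symlinked directories by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinked files so nothing is counted twice
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                continue  # file vanished or is unreadable; ignore it
    return sorted(sizes, reverse=True)[:limit]


if __name__ == "__main__":
    for size, path in largest_files("/var/log"):
        print(f"{size / (1024 * 1024):8.1f} MiB  {path}")
```

On a listing like the one above, a 129273720-byte libvirtd.log would show up at the top as roughly 123 MiB.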

 

the log seems to be this (below) on repeat:

2021-09-19 01:40:03.722+0000: 11980: error : virFileIsSharedFixFUSE:3384 : unable to canonicalize /mnt/user/domains/pfSenseVM/vdisk1.img: No such file or directory
2021-09-19 01:40:03.722+0000: 11980: error : qemuOpenFileAs:3175 : Failed to open file '/mnt/user/domains/pfSenseVM/vdisk1.img': No such file or directory
2021-09-19 01:40:03.750+0000: 11982: error : virFileIsSharedFixFUSE:3384 : unable to canonicalize /mnt/user/domains/pfSenseVM/vdisk1.img: No such file or directory
2021-09-19 01:40:03.750+0000: 11982: error : qemuOpenFileAs:3175 : Failed to open file '/mnt/user/domains/pfSenseVM/vdisk1.img': No such file or directory
2021-09-19 01:40:03.753+0000: 11980: error : virFileIsSharedFixFUSE:3384 : unable to canonicalize /mnt/user/domains/pfSenseVM/vdisk1.img: No such file or directory
2021-09-19 01:40:03.753+0000: 11980: error : qemuOpenFileAs:3175 : Failed to open file '/mnt/user/domains/pfSenseVM/vdisk1.img': No such file or directory
2021-09-19 01:40:03.758+0000: 11981: error : virFileIsSharedFixFUSE:3384 : unable to canonicalize /mnt/user/domains/pfSenseVM/vdisk1.img: No such file or directory
2021-09-19 01:40:03.758+0000: 11981: error : qemuOpenFileAs:3175 : Failed to open file '/mnt/user/domains/pfSenseVM/vdisk1.img': No such file or directory
2021-09-19 01:40:03.764+0000: 11983: error : virFileIsSharedFixFUSE:3384 : unable to canonicalize /mnt/user/domains/pfSenseVM/vdisk1.img: No such file or directory
2021-09-19 01:40:03.764+0000: 11983: error : qemuOpenFileAs:3175 : Failed to open file '/mnt/user/domains/pfSenseVM/vdisk1.img': No such file or directory
2021-09-19 01:40:03.861+0000: 11983: error : qemuOpenFileAs:3175 : Failed to open file '/dev/disk/by-id/ata-SAMSUNG_HD103SJ_S246J9BB442595': No such file or directory
2021-09-19 01:40:03.882+0000: 11984: error : qemuOpenFileAs:3175 : Failed to open file '/dev/disk/by-id/ata-SAMSUNG_HD103SJ_S246J9BB442595': No such file or directory
2021-09-19 01:40:03.886+0000: 11983: error : qemuOpenFileAs:3175 : Failed to open file '/dev/disk/by-id/ata-SAMSUNG_HD103SJ_S246J9BB442595': No such file or directory
2021-09-19 01:40:03.889+0000: 11982: error : qemuOpenFileAs:3175 : Failed to open file '/dev/disk/by-id/ata-SAMSUNG_HD103SJ_S246J9BB442595': No such file or directory
2021-09-19 01:40:03.894+0000: 11983: error : qemuOpenFileAs:3175 : Failed to open file '/dev/disk/by-id/ata-SAMSUNG_HD103SJ_S246J9BB442595': No such file or directory
2021-09-19 01:40:04.814+0000: 11980: error : virFileIsSharedFixFUSE:3384 : unable to canonicalize /mnt/user/domains/pfSenseVM/vdisk1.img: No such file or directory
2021-09-19 01:40:04.814+0000: 11980: error : qemuOpenFileAs:3175 : Failed to open file '/mnt/user/domains/pfSenseVM/vdisk1.img': No such file or directory
2021-09-19 01:40:05.605+0000: 11982: error : qemuOpenFileAs:3175 : Failed to open file '/dev/disk/by-id/ata-SAMSUNG_HD103SJ_S246J9BB442595': No such file or directory
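
To confirm that the file really is the same few messages on repeat, the timestamp and thread id can be stripped from each line and the remainder counted. This is a sketch that assumes every line has the prefix format shown in the excerpt above; the name `summarize` is made up for the example.

```python
import re
from collections import Counter

# Leading "date time+zone: thread-id: level : " prefix, as seen in the excerpt.
PREFIX = re.compile(r"^\S+ \S+: \d+: \w+ : ")


def summarize(lines):
    """Count log messages after removing the timestamp/thread-id prefix."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if line:
            counts[PREFIX.sub("", line)] += 1
    return counts


sample = [
    "2021-09-19 01:40:03.722+0000: 11980: error : qemuOpenFileAs:3175 : Failed to open file '/mnt/user/domains/pfSenseVM/vdisk1.img': No such file or directory",
    "2021-09-19 01:40:03.750+0000: 11982: error : qemuOpenFileAs:3175 : Failed to open file '/mnt/user/domains/pfSenseVM/vdisk1.img': No such file or directory",
    "2021-09-19 01:40:03.861+0000: 11983: error : qemuOpenFileAs:3175 : Failed to open file '/dev/disk/by-id/ata-SAMSUNG_HD103SJ_S246J9BB442595': No such file or directory",
]

for message, count in summarize(sample).most_common():
    print(f"{count:6d}  {message}")
```

A short list of distinct messages with very high counts points at the VM definitions that reference the missing disk paths.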

 

I only have one VM that I use. The rest are relics that no longer work, and I only keep them there in case I need to get something from a VM image.

 

vms.thumb.png.2db9cbad316ad1df1c77495156a69fca.png

 

Any ideas what's going on?

Should I just remove the old VMs from the manager?

 

libvirtd.7z

Link to comment

Ok,

I've just removed the 3 unused VMs, and removed the bad zigbee stick as well.

 

restarted the logs - seems to be fine now.

 

Has anyone got an idea about the 2nd issue (missing dockers in the manager)?

 

after stopping the docker, this is what I see in the log:

...
e":"can not get logs from container which is dead or marked for removal"}
e":"No such container: eb3bc92ef7d0"}
e":"No such container: eb3bc92ef7d0"}

 
