Log Filling Up Regularly


zbron


Currently running 6.8.3 and having a new and strange issue I can't figure out. The server had been running for multiple months with little to no issue, but recently, every few days, my log fills up overnight (usually between 4:30 AM and 5:00 AM), which effectively brings the server to a halt: all Dockers shut down except Plex, which stops working. The only fix is a reboot. While the reboot resolves things temporarily, the problem keeps recurring and I'd like to fix it at the source.

 

All my dockers have the extra parameter: --log-opt max-size=50m --log-opt max-file=1

Docker log rotation is currently enabled

I am not running any VMs
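In case it's useful, the applied log options can be confirmed per container from the command line; a quick sketch (<container> is a placeholder for any of my container names):

# show the logging driver and options Docker actually applied to a container
docker inspect --format '{{ json .HostConfig.LogConfig }}' <container>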

 

Below I've attached a couple of items I thought might help:

  • du -sm /var/log/* output (the commands are reproduced after this list)
  • All diagnostics files (strangely, my syslog doesn't have any entries after ~2:00 AM)
  • Mover Settings
  • Mover Tuning Settings
  • The notification I received from Fix Common Problems telling me the log was full
  • SSD Trim Settings
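For reference, the usage check is just du -sm against /var/log; a couple of related commands for anyone checking the same thing (assuming the stock Unraid tmpfs mounted at /var/log):

df -h /var/log                  # how full the log tmpfs is
du -sm /var/log/* | sort -n     # which files are taking the space, in MB
tail -n 50 /var/log/syslog      # last entries before logging stopped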

 

Thanks in advance for the help Unraid community!

-sm :var:log:* Output.png

Mover Settings.png

Mover Tuning Settings.png

Notification.png

SSD Trim Settings.png

Diagnostics.zip


Looks like you have problems on your NVMe drive:

Jan  7 02:02:57 Ansible kernel: BTRFS info (device nvme0n1p1): leaf 2371893067776 gen 22649348 total ptrs 127 free space 6377 owner 2
Jan  7 02:02:57 Ansible kernel: 	item 0 key (998602620928 168 32768) itemoff 16230 itemsize 53
Jan  7 02:02:57 Ansible kernel: 		extent refs 1 gen 8074017 flags 1
Jan  7 02:02:57 Ansible kernel: 		ref#0: extent data backref root 5 objectid 40579188 offset 0 count 1
Jan  7 02:02:57 Ansible kernel: 	item 1 key (998602653696 168 24576) itemoff 16177 itemsize 53
Jan  7 02:02:57 Ansible kernel: 		extent refs 1 gen 20973330 flags 1
Jan  7 02:02:57 Ansible kernel: 		ref#0: extent data backref root 5 objectid 81908089 offset 0 count 1
Jan  7 02:02:57 Ansible kernel: 	item 2 key (998602678272 168 16384) itemoff 16124 itemsize 53
Jan  7 02:02:57 Ansible kernel: 		extent refs 1 gen 20807749 flags 1
Jan  7 02:02:57 Ansible kernel: 		ref#0: extent data backref root 5 objectid 81073547 offset 0 count 1

This causes errors on loop2 (docker.img, I think).

Jan  7 02:02:57 Ansible kernel: BTRFS: error (device nvme0n1p1) in __btrfs_free_extent:6802: errno=-2 No such entry
Jan  7 02:02:57 Ansible kernel: BTRFS info (device nvme0n1p1): forced readonly
Jan  7 02:02:57 Ansible kernel: BTRFS: error (device nvme0n1p1) in btrfs_run_delayed_refs:2935: errno=-2 No such entry
Jan  7 02:02:57 Ansible kernel: BTRFS error (device nvme0n1p1): pending csums is 7475200
Jan  7 02:03:01 Ansible kernel: loop: Write error at byte offset 10158755840, length 4096.
Jan  7 02:03:01 Ansible kernel: print_req_error: I/O error, dev loop2, sector 19841320
Jan  7 02:03:01 Ansible kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 6, rd 0, flush 0, corrupt 0, gen 0
Jan  7 02:03:11 Ansible kernel: loop: Write error at byte offset 4098166784, length 4096.
Jan  7 02:03:11 Ansible kernel: print_req_error: I/O error, dev loop2, sector 8004232
Jan  7 02:03:11 Ansible kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 7, rd 0, flush 0, corrupt 0, gen 0
Jan  7 02:03:11 Ansible kernel: loop: Write error at byte offset 2370904064, length 4096.
Jan  7 02:03:11 Ansible kernel: print_req_error: I/O error, dev loop2, sector 4630672
Jan  7 02:03:11 Ansible kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 8, rd 0, flush 0, corrupt 0, gen 0
Jan  7 02:03:11 Ansible kernel: loop: Write error at byte offset 3451928576, length 4096.
Jan  7 02:03:11 Ansible kernel: print_req_error: I/O error, dev loop2, sector 6742048
Jan  7 02:03:11 Ansible kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 9, rd 0, flush 0, corrupt 0, gen 0

This would explain your issues with your dockers and the log filling with errors.
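If you want to confirm what loop2 maps to (it should be the docker.img on a standard setup) and whether the cache was remounted read-only, something like this from the console would show it:

losetup -l                                 # lists loop devices and their backing files
grep -E 'loop2|nvme0n1p1' /proc/mounts     # mount options; "ro" means it was forced read-only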


ChatNoir - thanks for spotting this! Any thoughts on how to fix it? Sounds like I have a bunch of research to do. Hopefully the drive isn't dead... it's only been in my cache pool for about 1.5 years, but I guess the excessive-writes issue could have burnt it out.

 

Additionally, I'd appreciate any insight you can offer as to whether this is a hardware issue (replace the NVMe drive) or a software issue (nuke the cache, reformat, and restore from backup).
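I'm happy to pull SMART data from the drive if that helps narrow it down; something along these lines (assuming the device shows up as /dev/nvme0):

# overall NVMe health; Percentage Used, Media and Data Integrity Errors, and
# Critical Warning are the fields that point to actual hardware wear
smartctl -a /dev/nvme0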

 

I performed a BTRFS scrub (no errors) and also a filesystem check (output attached).
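For reference, roughly what was run (the /mnt/cache mount point is the standard Unraid location, the device name comes from the syslog above, and the check itself has to run with the pool unmounted, i.e. from Maintenance mode):

btrfs scrub start -B /mnt/cache         # -B runs in the foreground and prints a summary when done
btrfs scrub status /mnt/cache           # error counters from the last scrub
btrfs dev stats /mnt/cache              # per-device write/read/flush/corruption/generation error counters
btrfs check --readonly /dev/nvme0n1p1   # read-only filesystem check (pool must be unmounted)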

Screen Shot 2021-01-07 at 2.11.50 PM.png

