Log Filling Up Regularly


zbron


Currently running 6.8.3 and having a new and strange issue I can't figure out. The server had been running for multiple months with little to no issue, but recently, every few days, my log fills up overnight (usually between 4:30 AM and 5:00 AM), which effectively brings the server to a halt: all Dockers shut down except Plex, which stops working. The only fix is a reboot. While the reboot resolves things temporarily, the problem keeps recurring and I'd like to fix it at the source.

 

All my dockers have the extra parameter: --log-opt max-size=50m --log-opt max-file=1

Docker log rotation is currently enabled

I am not running any VMs
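In case it's useful, the applied log options can be confirmed per container from the command line; a quick sketch (<container> is a placeholder for any of my container names):

# show the logging driver and options Docker actually applied to a container
docker inspect --format '{{ json .HostConfig.LogConfig }}' <container>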

 

Below I've attached a couple of items I thought might help:

  • du -sm /var/log/* output (the commands are reproduced after this list)
  • All diagnostics files (strangely, my syslog doesn't have any entries after ~2:00 AM)
  • Mover Settings
  • Mover Tuning Settings
  • The notification I received from Fix Common Problems telling me the log was full
  • SSD Trim Settings
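For reference, the usage check is just du -sm against /var/log; a couple of related commands for anyone checking the same thing (assuming the stock Unraid tmpfs mounted at /var/log):

df -h /var/log                  # how full the log tmpfs is
du -sm /var/log/* | sort -n     # which files are taking the space, in MB
tail -n 50 /var/log/syslog      # last entries before logging stopped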

 

Thanks in advance for the help Unraid community!

-sm :var:log:* Output.png

Mover Settings.png

Mover Tuning Settings.png

Notification.png

SSD Trim Settings.png

Diagnostics.zip


Looks like you have problems on your NVMe drive:

Jan  7 02:02:57 Ansible kernel: BTRFS info (device nvme0n1p1): leaf 2371893067776 gen 22649348 total ptrs 127 free space 6377 owner 2
Jan  7 02:02:57 Ansible kernel: 	item 0 key (998602620928 168 32768) itemoff 16230 itemsize 53
Jan  7 02:02:57 Ansible kernel: 		extent refs 1 gen 8074017 flags 1
Jan  7 02:02:57 Ansible kernel: 		ref#0: extent data backref root 5 objectid 40579188 offset 0 count 1
Jan  7 02:02:57 Ansible kernel: 	item 1 key (998602653696 168 24576) itemoff 16177 itemsize 53
Jan  7 02:02:57 Ansible kernel: 		extent refs 1 gen 20973330 flags 1
Jan  7 02:02:57 Ansible kernel: 		ref#0: extent data backref root 5 objectid 81908089 offset 0 count 1
Jan  7 02:02:57 Ansible kernel: 	item 2 key (998602678272 168 16384) itemoff 16124 itemsize 53
Jan  7 02:02:57 Ansible kernel: 		extent refs 1 gen 20807749 flags 1
Jan  7 02:02:57 Ansible kernel: 		ref#0: extent data backref root 5 objectid 81073547 offset 0 count 1

This causes errors on loop2 (docker.img, I think).

Jan  7 02:02:57 Ansible kernel: BTRFS: error (device nvme0n1p1) in __btrfs_free_extent:6802: errno=-2 No such entry
Jan  7 02:02:57 Ansible kernel: BTRFS info (device nvme0n1p1): forced readonly
Jan  7 02:02:57 Ansible kernel: BTRFS: error (device nvme0n1p1) in btrfs_run_delayed_refs:2935: errno=-2 No such entry
Jan  7 02:02:57 Ansible kernel: BTRFS error (device nvme0n1p1): pending csums is 7475200
Jan  7 02:03:01 Ansible kernel: loop: Write error at byte offset 10158755840, length 4096.
Jan  7 02:03:01 Ansible kernel: print_req_error: I/O error, dev loop2, sector 19841320
Jan  7 02:03:01 Ansible kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 6, rd 0, flush 0, corrupt 0, gen 0
Jan  7 02:03:11 Ansible kernel: loop: Write error at byte offset 4098166784, length 4096.
Jan  7 02:03:11 Ansible kernel: print_req_error: I/O error, dev loop2, sector 8004232
Jan  7 02:03:11 Ansible kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 7, rd 0, flush 0, corrupt 0, gen 0
Jan  7 02:03:11 Ansible kernel: loop: Write error at byte offset 2370904064, length 4096.
Jan  7 02:03:11 Ansible kernel: print_req_error: I/O error, dev loop2, sector 4630672
Jan  7 02:03:11 Ansible kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 8, rd 0, flush 0, corrupt 0, gen 0
Jan  7 02:03:11 Ansible kernel: loop: Write error at byte offset 3451928576, length 4096.
Jan  7 02:03:11 Ansible kernel: print_req_error: I/O error, dev loop2, sector 6742048
Jan  7 02:03:11 Ansible kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 9, rd 0, flush 0, corrupt 0, gen 0

This would explain your issues with your dockers and the log filling with errors.
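If you want to confirm what loop2 maps to (it should be the docker.img on a standard setup) and whether the cache was remounted read-only, something like this from the console would show it:

losetup -l                                 # lists loop devices and their backing files
grep -E 'loop2|nvme0n1p1' /proc/mounts     # mount options; "ro" means it was forced read-only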


ChatNoir - thanks for spotting this! Any thoughts on how to fix it? Sounds like I have a bunch of research to do. Hopefully the drive isn't dead... it's only been in my cache pool for about 1.5 years, but I guess the excessive-writes issue could have burnt it out.

 

Additionally, I'd appreciate any insight you can offer as to whether this is a hardware issue (replace the NVMe drive) or a software issue (nuke the cache, reformat, and restore from backup).
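I'm happy to pull SMART data from the drive if that helps narrow it down; something along these lines (assuming the device shows up as /dev/nvme0):

# overall NVMe health; Percentage Used, Media and Data Integrity Errors, and
# Critical Warning are the fields that point to actual hardware wear
smartctl -a /dev/nvme0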

 

I performed a BTRFS scrub (no errors) and also a filesystem check (output attached).
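For reference, roughly what was run (the /mnt/cache mount point is the standard Unraid location, the device name comes from the syslog above, and the check itself has to run with the pool unmounted, i.e. from Maintenance mode):

btrfs scrub start -B /mnt/cache         # -B runs in the foreground and prints a summary when done
btrfs scrub status /mnt/cache           # error counters from the last scrub
btrfs dev stats /mnt/cache              # per-device write/read/flush/corruption/generation error counters
btrfs check --readonly /dev/nvme0n1p1   # read-only filesystem check (pool must be unmounted)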

Screen Shot 2021-01-07 at 2.11.50 PM.png

