zbron Posted January 7, 2021

Currently running 6.8.3 and having a new and strange issue I can't figure out. The server had been running for multiple months with little to no issue, but recently, every few days, my log fills up overnight (usually between 4:30AM and 5:00AM). This basically causes the server to cease to function (all dockers shut down except Plex, which stops working) and can only be fixed with a reboot. While the reboot fixes the issue, it keeps occurring and I'd like to fix it at the source.

All my dockers have the extra parameter: --log-opt max-size=50m --log-opt max-file=1
Currently my docker log rotation is on
I am running no VMs

Below I've attached a couple of items I thought might help:

-sm /var/log/* output
All Diagnostics files (strangely my syslog doesn't have any entries after ~2:00AM)
Mover Settings
Mover Tuning Settings
The notification I received from Fix Common Problems telling me the log was full
SSD Trim Settings

Thanks in advance for the help Unraid community!

Diagnostics.zip

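When the log fills up like this, it can help to see which entries under /var/log are actually taking the space. Below is a minimal Python sketch of that kind of per-entry size listing. The function name `log_sizes_mb` is hypothetical, and it reports apparent file sizes in MiB, which may differ slightly from what a disk-usage tool reports.

```python
import os


def log_sizes_mb(log_dir):
    """Return (name, size_in_MiB) for each top-level entry, largest first.

    Hypothetical helper: sums apparent file sizes, not allocated blocks.
    """
    sizes = {}
    for entry in os.scandir(log_dir):
        if entry.is_file(follow_symlinks=False):
            total = entry.stat(follow_symlinks=False).st_size
        elif entry.is_dir(follow_symlinks=False):
            # Add up every regular file below this subdirectory.
            total = 0
            for root, _dirs, files in os.walk(entry.path):
                for name in files:
                    path = os.path.join(root, name)
                    if os.path.islink(path):
                        continue
                    try:
                        total += os.path.getsize(path)
                    except OSError:
                        # File rotated away or unreadable; skip it.
                        continue
        else:
            # Symlinks, sockets and similar are ignored.
            continue
        sizes[entry.name] = total / (1024 * 1024)
    return sorted(sizes.items(), key=lambda item: item[1], reverse=True)
```

Calling `log_sizes_mb("/var/log")` and printing the first few pairs should show whether syslog itself or something else (a docker or plugin log, for example) is the one growing.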
ChatNoir Posted January 7, 2021

Looks like you have problems on your NVMe drive:

Jan 7 02:02:57 Ansible kernel: BTRFS info (device nvme0n1p1): leaf 2371893067776 gen 22649348 total ptrs 127 free space 6377 owner 2
Jan 7 02:02:57 Ansible kernel: item 0 key (998602620928 168 32768) itemoff 16230 itemsize 53
Jan 7 02:02:57 Ansible kernel: extent refs 1 gen 8074017 flags 1
Jan 7 02:02:57 Ansible kernel: ref#0: extent data backref root 5 objectid 40579188 offset 0 count 1
Jan 7 02:02:57 Ansible kernel: item 1 key (998602653696 168 24576) itemoff 16177 itemsize 53
Jan 7 02:02:57 Ansible kernel: extent refs 1 gen 20973330 flags 1
Jan 7 02:02:57 Ansible kernel: ref#0: extent data backref root 5 objectid 81908089 offset 0 count 1
Jan 7 02:02:57 Ansible kernel: item 2 key (998602678272 168 16384) itemoff 16124 itemsize 53
Jan 7 02:02:57 Ansible kernel: extent refs 1 gen 20807749 flags 1
Jan 7 02:02:57 Ansible kernel: ref#0: extent data backref root 5 objectid 81073547 offset 0 count 1

This causes errors on loop2 (docker.img, I think):

Jan 7 02:02:57 Ansible kernel: BTRFS: error (device nvme0n1p1) in __btrfs_free_extent:6802: errno=-2 No such entry
Jan 7 02:02:57 Ansible kernel: BTRFS info (device nvme0n1p1): forced readonly
Jan 7 02:02:57 Ansible kernel: BTRFS: error (device nvme0n1p1) in btrfs_run_delayed_refs:2935: errno=-2 No such entry
Jan 7 02:02:57 Ansible kernel: BTRFS error (device nvme0n1p1): pending csums is 7475200
Jan 7 02:03:01 Ansible kernel: loop: Write error at byte offset 10158755840, length 4096.
Jan 7 02:03:01 Ansible kernel: print_req_error: I/O error, dev loop2, sector 19841320
Jan 7 02:03:01 Ansible kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 6, rd 0, flush 0, corrupt 0, gen 0
Jan 7 02:03:11 Ansible kernel: loop: Write error at byte offset 4098166784, length 4096.
Jan 7 02:03:11 Ansible kernel: print_req_error: I/O error, dev loop2, sector 8004232
Jan 7 02:03:11 Ansible kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 7, rd 0, flush 0, corrupt 0, gen 0
Jan 7 02:03:11 Ansible kernel: loop: Write error at byte offset 2370904064, length 4096.
Jan 7 02:03:11 Ansible kernel: print_req_error: I/O error, dev loop2, sector 4630672
Jan 7 02:03:11 Ansible kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 8, rd 0, flush 0, corrupt 0, gen 0
Jan 7 02:03:11 Ansible kernel: loop: Write error at byte offset 3451928576, length 4096.
Jan 7 02:03:11 Ansible kernel: print_req_error: I/O error, dev loop2, sector 6742048
Jan 7 02:03:11 Ansible kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 9, rd 0, flush 0, corrupt 0, gen 0

This would explain the issues with your dockers and the log filling with errors.

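For anyone who wants to triage a syslog like this without reading it line by line, here is a minimal Python sketch that counts kernel BTRFS error lines per device, notes any device that was forced read-only, and keeps the latest `errs:` counters. The function name `summarize_btrfs_errors` and the regular expressions are assumptions based only on the line shapes quoted above; this is not an Unraid or btrfs-progs tool, and other kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Patterns are assumptions derived from the syslog lines quoted in this thread.
ERROR_RE = re.compile(r"BTRFS:? error \(device (?P<dev>[^)]+)\)")
READONLY_RE = re.compile(r"BTRFS info \(device (?P<dev>[^)]+)\): forced readonly")
STATS_RE = re.compile(
    r"bdev (?P<bdev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)
STAT_KEYS = ("wr", "rd", "flush", "corrupt", "gen")


def summarize_btrfs_errors(lines):
    """Return (error_counts, readonly_devices, last_stats) for syslog lines."""
    error_counts = Counter()   # device -> number of BTRFS error lines
    readonly_devices = set()   # devices the kernel forced read-only
    last_stats = {}            # block device -> most recent error counters
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            error_counts[match.group("dev")] += 1
        match = READONLY_RE.search(line)
        if match:
            readonly_devices.add(match.group("dev"))
        match = STATS_RE.search(line)
        if match:
            last_stats[match.group("bdev")] = {
                key: int(match.group(key)) for key in STAT_KEYS
            }
    return error_counts, readonly_devices, last_stats


# A few lines copied from the excerpt above.
SAMPLE = [
    "Jan 7 02:02:57 Ansible kernel: BTRFS: error (device nvme0n1p1) in __btrfs_free_extent:6802: errno=-2 No such entry",
    "Jan 7 02:02:57 Ansible kernel: BTRFS info (device nvme0n1p1): forced readonly",
    "Jan 7 02:03:01 Ansible kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 6, rd 0, flush 0, corrupt 0, gen 0",
    "Jan 7 02:03:11 Ansible kernel: BTRFS error (device loop2): bdev /dev/loop2 errs: wr 9, rd 0, flush 0, corrupt 0, gen 0",
]

if __name__ == "__main__":
    counts, readonly, stats = summarize_btrfs_errors(SAMPLE)
    print("error lines per device:", dict(counts))
    print("forced read-only:", sorted(readonly))
    print("latest counters:", stats)
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `summarize_btrfs_errors(open("/var/log/syslog"))` (the path is an assumption about where the syslog lives). A device that shows up as forced read-only before the loop2 write errors start is consistent with the diagnosis above.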
zbron Posted January 7, 2021 (edited)

ChatNoir - thanks for spotting this! Any thoughts on how to fix it? Sounds like I have a bunch of research to do. Hopefully the drive isn't dead... it's only been in my cache pool for 1.5 years, but I guess the excessive write issue could have burnt it out.

Additionally, I'd appreciate any insight you can offer as to whether this is a hardware issue (replace the NVMe drive) or a software issue (nuke the cache, reformat, restore from backup).

I performed a BTRFS scrub (no errors) and also a filesystem check (output attached).

Edited January 7, 2021 by zbron (more info)

ChatNoir Posted January 7, 2021

I am not the resident expert on BTRFS, so I'll call @JorgeB.

JorgeB Posted January 8, 2021

You can try check --repair, but only after making sure everything important is backed up. If it doesn't work, or if the problem recurs, your best bet is to reformat and restore the data.

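The order of operations matters here, so below is a small dry-run sketch in Python that only builds and prints the commands in a backup-first sequence; it does not execute anything. The function name `plan_repair`, the backup destination, and the use of rsync for the copy are assumptions for illustration. The device name is taken from the syslog excerpt earlier in the thread, and the filesystem must not be mounted while `btrfs check` runs.

```python
import shlex


def plan_repair(device, source, backup_dest):
    """Return the commands for a backup-first repair attempt, in order.

    Hypothetical helper: it builds command lists only and runs nothing.
    """
    src = source.rstrip("/") + "/"
    dest = backup_dest.rstrip("/") + "/"
    return [
        # 1. Copy everything important off the pool first (tool choice is an assumption).
        ["rsync", "-a", src, dest],
        # 2. Read-only check to see what is reported (filesystem unmounted).
        ["btrfs", "check", "--readonly", device],
        # 3. Only then attempt the repair suggested above.
        ["btrfs", "check", "--repair", device],
    ]


if __name__ == "__main__":
    # Paths below are placeholders, not taken from the poster's system.
    for command in plan_repair("/dev/nvme0n1p1", "/mnt/cache", "/mnt/disk1/cache_backup"):
        print(shlex.join(command))
```

If the repair fails or the errors come back, the fallback described above still applies: reformat the pool and restore from the backup made in the first step.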
zbron Posted January 8, 2021

Thanks @JorgeB, will try that now and hope it solves the issue.