gadget069 Posted July 2, 2023
I've started getting this error. This is my main cache drive; the CRC error count hasn't increased since I built the server months ago. I'm still on 6.11.5.
Jul 2 10:06:26 Tortuga kernel: BTRFS error (device sdc1): bdev /dev/sdh1 errs: wr 719861, rd 56099, flush 29483, corrupt 72317, gen 0
Jul 2 10:06:26 Tortuga kernel: BTRFS warning (device sdc1): csum failed root 5 ino 192195 off 8282112 csum 0x8941f998 expected csum 0x95788b85 mirror 1
Jul 2 10:06:26 Tortuga kernel: BTRFS error (device sdc1): bdev /dev/sdh1 errs: wr 719861, rd 56099, flush 29483, corrupt 72318, gen 0
Jul 2 10:06:26 Tortuga kernel: BTRFS warning (device sdc1): csum failed root 5 ino 192195 off 8286208 csum 0x8941f998 expected csum 0x37e66784 mirror 1
Jul 2 10:06:26 Tortuga kernel: BTRFS error (device sdc1): bdev /dev/sdh1 errs: wr 719861, rd 56099, flush 29483, corrupt 72319, gen 0
Jul 2 10:06:26 Tortuga kernel: BTRFS info (device sdc1): read error corrected: ino 192195 off 8273920 (dev /dev/sdh1 sector 730716592)
Jul 2 10:06:26 Tortuga kernel: BTRFS info (device sdc1): read error corrected: ino 192195 off 8269824 (dev /dev/sdh1 sector 730716584)
Jul 2 10:06:26 Tortuga kernel: BTRFS info (device sdc1): read error corrected: ino 192195 off 8278016 (dev /dev/sdh1 sector 730716600)
Jul 2 10:06:26 Tortuga kernel: BTRFS info (device sdc1): read error corrected: ino 192195 off 8282112 (dev /dev/sdh1 sector 730716608)
Jul 2 10:06:26 Tortuga kernel: BTRFS info (device sdc1): read error corrected: ino 192195 off 8286208 (dev /dev/sdh1 sector 730716616)
Jul 2 10:06:26 Tortuga kernel: BTRFS info (device sdc1): read error corrected: ino 192195 off 8265728 (dev /dev/sdh1 sector 730716576)
tortuga-diagnostics-20230702-1010.zip
gadget069 Posted July 2, 2023
Is my cache pool corrupted? The Crucial drive dropped out yesterday because of a bad cable from Amazon.
gadget069 Posted July 3, 2023
More info: sdh is the Crucial "parity" cache drive. It appears I need to take it offline or replace it?
root@Tortuga:~# btrfs dev stats /mnt/cache
[/dev/sdc1].write_io_errs 0
[/dev/sdc1].read_io_errs 0
[/dev/sdc1].flush_io_errs 0
[/dev/sdc1].corruption_errs 0
[/dev/sdc1].generation_errs 0
[/dev/sdh1].write_io_errs 719861
[/dev/sdh1].read_io_errs 56099
[/dev/sdh1].flush_io_errs 29483
[/dev/sdh1].corruption_errs 74731
[/dev/sdh1].generation_errs 0
root@Tortuga:~#
Edited July 3, 2023 by gadget069
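For readers following along, here is a minimal sketch of how the `btrfs dev stats` output above can be filtered down to the counters that are not zero. The function name `nonzero_stats` is made up for illustration and is not part of btrfs-progs; the sketch only assumes the one-counter-per-line format shown in the post.

```shell
#!/bin/sh
# Sketch: list btrfs device-stats counters that are non-zero.
# nonzero_stats is a hypothetical helper name, not a btrfs command.
# It reads `btrfs dev stats` output on stdin and prints
# "device counter value" for every counter greater than zero.
nonzero_stats() {
    awk '
        # Each input line looks like: [/dev/sdh1].corruption_errs 74731
        $2 ~ /^[0-9]+$/ && $2 > 0 {
            i = index($1, "].")            # position of the "]." separator
            if (i == 0) next               # skip lines in another format
            dev = substr($1, 2, i - 2)     # text between "[" and "]"
            counter = substr($1, i + 2)    # text after "]."
            print dev, counter, $2
        }
    '
}

# Typical use on a live system (pool path taken from the thread):
#   btrfs dev stats /mnt/cache | nonzero_stats
```

Fed the output from the post above, this would print only the four /dev/sdh1 counters that are above zero, which makes it easier to see that /dev/sdc1 is clean.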
JorgeB Posted July 3, 2023 (Solution)
Run a correcting scrub and post the results.
gadget069 Posted July 3, 2023
1 hour ago, JorgeB said: Run a correcting scrub and post the results.
UUID: 23927474-4d91-49d0-9d5d-f8afb9afe97e
Scrub started: Mon Jul 3 05:02:07 2023
Status: finished
Duration: 0:02:59
Total to scrub: 125.55GiB
Rate: 718.24MiB/s
Error summary: verify=2705 csum=41755
Corrected: 44460
Uncorrectable: 0
Unverified: 0
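A quick way to read the scrub result above: in this run the errors found (verify=2705 plus csum=41755) add up to 44460, which matches the Corrected count, and Uncorrectable is 0. The sketch below does that arithmetic from scrub status text on stdin. `scrub_summary` is a hypothetical helper name, and it assumes the field labels shown in the post.

```shell
#!/bin/sh
# Sketch: total the errors in a scrub report and flag uncorrectable ones.
# scrub_summary is a hypothetical helper name, not a btrfs command.
# It reads `btrfs scrub status` text on stdin, prints one summary line,
# and returns non-zero only when the Uncorrectable count is above zero.
scrub_summary() {
    awk '
        /Error summary:/ {
            # Add up every key=value pair, e.g. verify=2705 csum=41755
            for (i = 1; i <= NF; i++)
                if (split($i, kv, "=") == 2) found += kv[2]
        }
        $1 == "Corrected:"     { corrected = $2 }
        $1 == "Uncorrectable:" { uncorrectable = $2 }
        END {
            printf "found=%d corrected=%d uncorrectable=%d\n", found, corrected, uncorrectable
            exit (uncorrectable > 0)
        }
    '
}

# Typical use on a live system (pool path taken from the thread):
#   btrfs scrub status /mnt/cache | scrub_summary
```

A non-zero Uncorrectable count would mean the pool had no good copy to repair from for some blocks, which is the case to worry about; that did not happen here.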
JorgeB Posted July 3, 2023
All should be good now. The problems were caused by one of the devices dropping offline. I recommend taking a look here for better pool monitoring to catch future issues.
gadget069 Posted July 3, 2023
JorgeB said: All should be good now, problems were caused by one of the devices dropping offline, recommend taking a look here for better pool monitoring for future issues.
Thanks, I will today. I plan on a slight motherboard and CPU upgrade and ditching the Amazon cables. Nothing but issues with them. I think 3 of the 5 were bad.
gadget069 Posted July 3, 2023
2 hours ago, JorgeB said: All should be good now, problems were caused by one of the devices dropping offline, recommend taking a look here for better pool monitoring for future issues.
It looks like the 2nd cache drive still has issues. Maybe I have the pool configured wrong?
root@Tortuga:~# btrfs dev stats /mnt/cache
[/dev/sdc1].write_io_errs 0
[/dev/sdc1].read_io_errs 0
[/dev/sdc1].flush_io_errs 0
[/dev/sdc1].corruption_errs 0
[/dev/sdc1].generation_errs 0
[/dev/sdh1].write_io_errs 0
[/dev/sdh1].read_io_errs 0
[/dev/sdh1].flush_io_errs 0
[/dev/sdh1].corruption_errs 86297
[/dev/sdh1].generation_errs 5446
root@Tortuga:~#
JorgeB Posted July 3, 2023
If you reset the errors, it means there are new ones. Post new diags.
gadget069 Posted July 3, 2023
12 minutes ago, JorgeB said: If you reset the errors it means there are new ones, post new diags.
Thanks
tortuga-diagnostics-20230703-0754.zip
JorgeB Posted July 3, 2023
I'm not seeing any errors after the scrub. Did you reset the stats after that?
gadget069 Posted July 3, 2023
15 minutes ago, JorgeB said: I'm not seeing any errors after the scrub, did you reset the stats after that?
I did not, but I am seeing this in the log just now:
Jul 3 09:00:46 Tortuga kernel: BTRFS warning (device sdc1): checksum verify failed on 524376424448 wanted 0xb6800818 found 0xf660d968 level 1
Jul 3 09:00:46 Tortuga kernel: repair_io_failure: 166 callbacks suppressed
Jul 3 09:00:46 Tortuga kernel: BTRFS info (device sdc1): read error corrected: ino 0 off 524376424448 (dev /dev/sdc1 sector 29999776)
Jul 3 09:00:46 Tortuga kernel: BTRFS info (device sdc1): read error corrected: ino 0 off 524376428544 (dev /dev/sdc1 sector 29999784)
Jul 3 09:00:46 Tortuga kernel: BTRFS info (device sdc1): read error corrected: ino 0 off 524376432640 (dev /dev/sdc1 sector 29999792)
Jul 3 09:00:46 Tortuga kernel: BTRFS info (device sdc1): read error corrected: ino 0 off 524376436736 (dev /dev/sdc1 sector 29999800)
Cache pool settings
JorgeB Posted July 3, 2023
Reboot to clear the logs, then run a new scrub and post new diags.
gadget069 Posted July 3, 2023
1 hour ago, JorgeB said: Reboot to clear the logs then run a new scrub and post new diags.
Log attached. I'm still seeing this on the second drive:
root@Tortuga:~# btrfs dev stats /mnt/cache
[/dev/sdc1].write_io_errs 0
[/dev/sdc1].read_io_errs 0
[/dev/sdc1].flush_io_errs 0
[/dev/sdc1].corruption_errs 0
[/dev/sdc1].generation_errs 0
[/dev/sdh1].write_io_errs 0
[/dev/sdh1].read_io_errs 0
[/dev/sdh1].flush_io_errs 0
[/dev/sdh1].corruption_errs 86297
[/dev/sdh1].generation_errs 5446
root@Tortuga:~#
tortuga-diagnostics-20230703-1148.zip
Edited July 3, 2023 by gadget069
JorgeB Posted July 3, 2023
The scrub finished without any errors. Type btrfs dev stats -z /mnt/cache to reset the stats, then keep monitoring.
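As a rough illustration of the "keep monitoring" step, the sketch below wraps `btrfs dev stats -c`, whose `-c`/`--check` option makes the command return a non-zero status when any counter is non-zero. This is not the monitoring method linked earlier in the thread. The function name `check_pool` and the `BTRFS_BIN` override variable are assumptions added so the check can be dry-run without a real pool.

```shell
#!/bin/sh
# Sketch: report whether a btrfs pool has any non-zero error counters.
# check_pool and BTRFS_BIN are hypothetical names for illustration only.
# BTRFS_BIN lets a stand-in command replace btrfs for a dry run.
check_pool() {
    pool=$1
    # -c (--check) makes the exit status non-zero if any counter is non-zero.
    if "${BTRFS_BIN:-btrfs}" dev stats -c "$pool" >/dev/null; then
        echo "OK: no btrfs errors recorded for $pool"
    else
        echo "WARNING: btrfs dev stats reported errors (or failed) for $pool"
        return 1
    fi
}

# Typical use after resetting the counters with `btrfs dev stats -z /mnt/cache`:
#   check_pool /mnt/cache
```

Because the counters persist until they are reset, a check like this is only meaningful after the `-z` reset; run on a schedule, any later warning would then point to new errors rather than the old ones from the dropped drive.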
gadget069 Posted July 3, 2023
4 minutes ago, JorgeB said: Scrub finished without any errors, type btrfs dev stats -z /mnt/cache to reset the stats and keep monitoring.
Will do, thanks