December 20, 2023

Hello, I've attached the diagnostics. I noticed that some of my containers weren't working and then saw in the logs that the FS went RO. After further inspection, I saw a lot of BTRFS errors. If anyone could point me in the right direction I would be very grateful.

root@Zeus:~# btrfs dev stat -z /mnt/nvme
[/dev/nvme0n1p1].write_io_errs 0
[/dev/nvme0n1p1].read_io_errs 0
[/dev/nvme0n1p1].flush_io_errs 0
[/dev/nvme0n1p1].corruption_errs 0
[/dev/nvme0n1p1].generation_errs 0
[/dev/nvme1n1p1].write_io_errs 0
[/dev/nvme1n1p1].read_io_errs 0
[/dev/nvme1n1p1].flush_io_errs 0
[/dev/nvme1n1p1].corruption_errs 1172369
[/dev/nvme1n1p1].generation_errs 3659
[/dev/nvme2n1p1].write_io_errs 0
[/dev/nvme2n1p1].read_io_errs 0
[/dev/nvme2n1p1].flush_io_errs 0
[/dev/nvme2n1p1].corruption_errs 0
[/dev/nvme2n1p1].generation_errs 0

zeus-diagnostics-20231219-2129.zip
December 20, 2023 Author

5 hours ago, JorgeB said:
    Post the results of a correcting scrub.

root@Zeus:~# btrfs scrub start -rdB /mnt/nvme

Scrub device /dev/nvme0n1p1 (id 1) done
Scrub started: Wed Dec 20 08:00:50 2023
Status: finished
Duration: 0:03:25
Total to scrub: 362.03GiB
Rate: 1.77GiB/s
Error summary: no errors found

Scrub device /dev/nvme1n1p1 (id 2) done
Scrub started: Wed Dec 20 08:00:50 2023
Status: finished
Duration: 0:02:09
Total to scrub: 180.32GiB
Rate: 1.40GiB/s
Error summary: verify=2913 csum=1170601
Corrected: 0
Uncorrectable: 0
Unverified: 0

Scrub device /dev/nvme2n1p1 (id 3) done
Scrub started: Wed Dec 20 08:00:50 2023
Status: finished
Duration: 0:02:11
Total to scrub: 181.71GiB
Rate: 1.39GiB/s
Error summary: no errors found
root@Zeus:~#
December 20, 2023

-r is read-only, so that scrub didn't correct anything. You can use the GUI to scrub the pool; there's no need to scrub each device.
December 20, 2023 Author Solution

14 minutes ago, JorgeB said:
    -r is read only, you can use the GUI to scrub the pool, no need to scrub each device.

Sorry about that:

UUID: f00cb2bb-b4d2-4e4d-8f8b-167ee6f6be26
Scrub started: Wed Dec 20 08:36:25 2023
Status: finished
Duration: 0:02:47
Total to scrub: 724.04GiB
Rate: 4.33GiB/s
Error summary: verify=2389 csum=1170601
Corrected: 1172990
Uncorrectable: 0
Unverified: 0
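For anyone reading later, the numbers in that summary can be checked by hand: verify (2389) plus csum (1170601) adds up to the Corrected count (1172990), with nothing left uncorrectable. A minimal Python sketch of that check follows. The function names are hypothetical, and the rule "corrected equals the sum of detected errors" is a heuristic that happens to hold for this output, not something btrfs guarantees.

```python
import re

# Summary lines copied from the correcting scrub in this thread.
SAMPLE = """\
Error summary: verify=2389 csum=1170601
Corrected: 1172990
Uncorrectable: 0
Unverified: 0
"""

def parse_scrub_summary(text):
    """Return (detected, results) parsed from a scrub summary.

    detected: error counts from the "Error summary" line, e.g. {"csum": 5}
    results:  the Corrected / Uncorrectable / Unverified totals (0 if absent)
    """
    detected = {}
    results = {"corrected": 0, "uncorrectable": 0, "unverified": 0}
    for line in text.splitlines():
        key, _, rest = line.strip().partition(":")
        key = key.lower()
        if key == "error summary":
            # "no errors found" contains no name=value pairs, so this stays empty.
            detected = {n: int(v) for n, v in re.findall(r"(\w+)=(\d+)", rest)}
        elif key in results and rest.strip().isdigit():
            results[key] = int(rest)
    return detected, results

def all_errors_corrected(text):
    """Heuristic (an assumption): every detected error was corrected."""
    detected, results = parse_scrub_summary(text)
    return (results["uncorrectable"] == 0
            and results["corrected"] == sum(detected.values()))

if __name__ == "__main__":
    print(all_errors_corrected(SAMPLE))
```

The same check returns False for the earlier read-only (`-r`) run, where errors were detected but Corrected stayed at 0.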
December 20, 2023

All errors were corrected. See here for how to reset the stats and set up better pool monitoring.
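The linked monitoring script isn't reproduced in this thread. As a rough illustration of the idea only, the sketch below parses text in the `btrfs dev stat` format shown in the first post and reports which devices have non-zero counters. The function name and regular expression are assumptions made for this example, not the linked script.

```python
import re

# A few lines in the format shown in the first post of this thread.
SAMPLE = """\
[/dev/nvme0n1p1].corruption_errs 0
[/dev/nvme0n1p1].generation_errs 0
[/dev/nvme1n1p1].write_io_errs 0
[/dev/nvme1n1p1].corruption_errs 1172369
[/dev/nvme1n1p1].generation_errs 3659
[/dev/nvme2n1p1].corruption_errs 0
"""

# Matches lines such as "[/dev/nvme1n1p1].corruption_errs 1172369".
STAT_LINE = re.compile(r"^\[(?P<device>[^\]]+)\]\.(?P<counter>\w+)\s+(?P<value>\d+)$")

def nonzero_counters(stat_output):
    """Map each device to its non-zero error counters; healthy devices are omitted."""
    problems = {}
    for line in stat_output.splitlines():
        match = STAT_LINE.match(line.strip())
        if match and int(match["value"]) > 0:
            device = problems.setdefault(match["device"], {})
            device[match["counter"]] = int(match["value"])
    return problems

if __name__ == "__main__":
    for device, counters in nonzero_counters(SAMPLE).items():
        print(device, counters)
```

In practice the text would come from running `btrfs dev stat` on the pool's mount point. Note that the `-z` flag used in the first post resets the counters after printing them, so a monitoring job would normally leave it off.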
December 20, 2023 Author

47 minutes ago, JorgeB said:
    All errors were corrected, see here for how to reset the stats and better pool monitoring.

Thanks, I appreciate it. I'll work on setting up the reporting after I reboot my system in about 8 hours, once the parity check finishes. Last night Docker hung up the array and I wasn't able to get a clean shutdown.

On a side note, I have all of my Docker instances running off the NVMe drives for speed. Should I just be using these as cache drives to help reduce the chance of data loss? I'm currently using SSDs for that purpose, and the NVMe drives are just for Docker.