August 29, 2021
This morning the Docker service crashed with one container running; it had filled the log, so I tried to restart. Now it seems that my cache drive won't mount: the btrfs seems to be unmountable. The cache drive log says this:
Quote
Aug 29 14:24:02 ChiaTower kernel: ata8.00: configured for UDMA/133
Aug 29 14:24:02 ChiaTower kernel: ata8.00: Enabling discard_zeroes_data
Aug 29 14:24:02 ChiaTower kernel: sd 10:0:0:0: [sdm] 937703088 512-byte logical blocks: (480 GB/447 GiB)
Aug 29 14:24:02 ChiaTower kernel: sd 10:0:0:0: [sdm] 4096-byte physical blocks
Aug 29 14:24:02 ChiaTower kernel: sd 10:0:0:0: [sdm] Write Protect is off
Aug 29 14:24:02 ChiaTower kernel: sd 10:0:0:0: [sdm] Mode Sense: 00 3a 00 00
Aug 29 14:24:02 ChiaTower kernel: sd 10:0:0:0: [sdm] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Aug 29 14:24:02 ChiaTower kernel: sdm: sdm1
Aug 29 14:24:02 ChiaTower kernel: ata8.00: Enabling discard_zeroes_data
Aug 29 14:24:02 ChiaTower kernel: sd 10:0:0:0: [sdm] Attached SCSI disk
Aug 29 14:24:02 ChiaTower kernel: BTRFS: device fsid 70a02f4b-af7b-4685-863f-3a5c13160d86 devid 1 transid 5841 /dev/sdm1 scanned by udevd (2214)
Aug 29 14:25:31 ChiaTower emhttpd: INTEL_SSDSC2BB480G4R_PHWL501600ME480QGN (sdm) 512 937703088
Aug 29 14:25:31 ChiaTower emhttpd: import 30 cache device: (sdm) INTEL_SSDSC2BB480G4R_PHWL501600ME480QGN
Aug 29 14:25:31 ChiaTower emhttpd: read SMART /dev/sdm
Aug 29 14:40:22 ChiaTower emhttpd: shcmd (969): mount -t btrfs -o noatime,space_cache=v2 /dev/sdm1 /mnt/cache
Aug 29 14:40:22 ChiaTower kernel: BTRFS info (device sdm1): using free space tree
Aug 29 14:40:22 ChiaTower kernel: BTRFS info (device sdm1): has skinny extents
Aug 29 14:40:22 ChiaTower kernel: BTRFS info (device sdm1): enabling ssd optimizations
Aug 29 14:40:22 ChiaTower kernel: BTRFS info (device sdm1): start tree-log replay
Aug 29 14:40:25 ChiaTower kernel: BTRFS info (device sdm1): leaf 129368064 gen 5842 total ptrs 207 free space 137 owner 2
Aug 29 14:40:25 ChiaTower kernel: BTRFS error (device sdm1): unable to find ref byte nr 51010932736 parent 0 root 5 owner 29588 offset 12409917440
Aug 29 14:40:25 ChiaTower kernel: BTRFS: error (device sdm1) in __btrfs_free_extent:3092: errno=-2 No such entry
Aug 29 14:40:25 ChiaTower kernel: BTRFS: error (device sdm1) in btrfs_run_delayed_refs:2144: errno=-2 No such entry
Aug 29 14:40:25 ChiaTower kernel: BTRFS: error (device sdm1) in btrfs_replay_log:2279: errno=-2 No such entry (Failed to recover log tree)
Aug 29 14:40:25 ChiaTower kernel: BTRFS error (device sdm1): open_ctree failed
How do I fix this problem? The cache drive was added less than one week ago.
Edited August 29, 2021 by Struck
August 29, 2021
Author
Diagnostics attached. The restart also triggered a parity sync. I don’t know why this is, since the array seems to be unaffected by this problem.
chiatower-diagnostics-20210829-1504.zip
August 29, 2021
Community Expert
4 hours ago, Struck said: the restart also triggered a parity sync. I don’t know why this is
Aug 29 14:25:28 ChiaTower emhttpd: unclean shutdown detected
You will always get a parity check after an unclean shutdown.
4 hours ago, Struck said: btrfs seems to be unmountable
August 31, 2021
Author
I used the instructions to restore the data, formatted the drive, and copied the data back afterwards. It worked for less than three days. Now the issue is the same. The log is filled with entries like this:
Aug 31 04:29:37 ChiaTower kernel: BTRFS error (device sdm1): bdev /dev/sdm1 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
Aug 31 04:29:37 ChiaTower kernel: BTRFS warning (device sdm1): csum failed root 5 ino 12305 off 15759626240 csum 0x21417709 expected csum 0x00000000 mirror 1
Aug 31 04:29:37 ChiaTower kernel: BTRFS error (device sdm1): bdev /dev/sdm1 errs: wr 0, rd 0, flush 0, corrupt 5, gen 0
Aug 31 04:29:37 ChiaTower kernel: BTRFS error (device sdm1): parent transid verify failed on 8343076864 wanted 3304 found 3233
Aug 31 04:29:37 ChiaTower kernel: BTRFS info (device sdm1): no csum found for inode 12305 start 15759699968
Aug 31 04:29:37 ChiaTower kernel: BTRFS warning (device sdm1): csum failed root 5 ino 12305 off 15759699968 csum 0x108cc45f expected csum 0x00000000 mirror 1
Aug 31 04:29:37 ChiaTower kernel: BTRFS error (device sdm1): bdev /dev/sdm1 errs: wr 0, rd 0, flush 0, corrupt 6, gen 0
Aug 31 04:29:37 ChiaTower kernel: BTRFS error (device sdm1): parent transid verify failed on 8343076864 wanted 3304 found 3233
Aug 31 04:29:37 ChiaTower kernel: BTRFS info (device sdm1): no csum found for inode 12305 start 15759708160
Aug 31 04:29:37 ChiaTower kernel: BTRFS warning (device sdm1): csum failed root 5 ino 12305 off 15759708160 csum 0x7d0b155f expected csum 0x00000000 mirror 1
Aug 31 04:29:37 ChiaTower kernel: BTRFS error (device sdm1): bdev /dev/sdm1 errs: wr 0, rd 0, flush 0, corrupt 7, gen 0
Aug 31 04:29:37 ChiaTower kernel: BTRFS error (device sdm1): parent transid verify failed on 8343076864 wanted 3304 found 3233
Aug 31 04:29:37 ChiaTower kernel: BTRFS info (device sdm1): no csum found for inode 12305 start 15759736832
Aug 31 04:29:37 ChiaTower kernel: BTRFS warning (device sdm1): csum failed root 5 ino 12305 off 15759736832 csum 0xabb5631a expected csum 0x00000000 mirror 1
Aug 31 04:29:37 ChiaTower kernel: BTRFS error (device sdm1): bdev /dev/sdm1 errs: wr 0, rd 0, flush 0, corrupt 8, gen 0
Aug 31 04:29:37 ChiaTower kernel: BTRFS error (device sdm1): parent transid verify failed on 8343076864 wanted 3304 found 3233
Aug 31 04:29:37 ChiaTower kernel: BTRFS info (device sdm1): no csum found for inode 12305 start 15760031744
Aug 31 04:29:37 ChiaTower kernel: BTRFS warning (device sdm1): csum failed root 5 ino 12305 off 15760031744 csum 0xb842b40e expected csum 0x00000000 mirror 1
Aug 31 04:29:37 ChiaTower kernel: BTRFS error (device sdm1): bdev /dev/sdm1 errs: wr 0, rd 0, flush 0, corrupt 9, gen 0
Aug 31 04:29:37 ChiaTower kernel: BTRFS error (device sdm1): parent transid verify failed on 8343076864 wanted 3304 found 3233
Aug 31 04:29:37 ChiaTower kernel: BTRFS info (device sdm1): no csum found for inode 12305 start 15759298560
Aug 31 04:29:37 ChiaTower kernel: BTRFS warning (device sdm1): csum failed root 5 ino 12305 off 15759298560 csum 0xff2de314 expected csum 0x00000000 mirror 1
Aug 31 04:29:37 ChiaTower kernel: BTRFS error (device sdm1): bdev /dev/sdm1 errs: wr 0, rd 0, flush 0, corrupt 10, gen 0
Aug 31 04:30:13 ChiaTower kernel: verify_parent_transid: 10 callbacks suppressed
Aug 31 04:30:13 ChiaTower kernel: BTRFS error (device loop2): parent transid verify failed on 4708515840 wanted 203356 found 202764
Aug 31 04:30:13 ChiaTower kernel: BTRFS error (device loop2): parent transid verify failed on 4708515840 wanted 203356 found 202764
Aug 31 04:30:44 ChiaTower kernel: BTRFS error (device loop2): parent transid verify failed on 4708515840 wanted 203356 found 202764
Aug 31 04:30:44 ChiaTower kernel: BTRFS error (device loop2): parent transid verify failed on 4708515840 wanted 203356 found 202764
Aug 31 04:31:15 ChiaTower kernel: BTRFS error (device loop2): parent transid verify failed on 4708515840 wanted 203356 found 202764
Aug 31 04:31:15 ChiaTower kernel: BTRFS error (device loop2): parent transid verify failed on 4708515840 wanted 203356 found 202764
Aug 31 04:31:46 ChiaTower kernel: BTRFS error (device loop2): parent transid verify failed on 4708515840 wanted 203356 found 202764
Aug 31 04:31:46 ChiaTower kernel: BTRFS error (device loop2): parent transid verify failed on 4708515840 wanted 203356 found 202764
Aug 31 04:32:18 ChiaTower kernel: BTRFS error (device loop2): parent transid verify failed on 4708515840 wanted 203356 found 202764
Aug 31 04:32:18 ChiaTower kernel: BTRFS error (device loop2): parent transid verify failed on 4708515840 wanted 203356 found 202764
Aug 31 04:32:49 ChiaTower kernel: BTRFS error (device loop2): parent transid verify failed on 4708515840 wanted 203356 found 202764
Aug 31 04:32:49 ChiaTower kernel: BTRFS error (device loop2): parent transid verify failed on 4708515840 wanted 203356 found 202764
Aug 31 04:33:20 ChiaTower kernel: BTRFS error (device loop2): parent transid verify failed on 4708515840 wanted 203356 found 202764
Aug 31 04:33:20 ChiaTower kernel: BTRFS error (device loop2): parent transid verify failed on 4708515840 wanted 203356 found 202764
Aug 31 04:33:51 ChiaTower kernel: BTRFS error (device loop2): parent transid verify failed on 4708515840 wanted 203356 found 202764
My guess is that if I reboot the machine, the cache drive partition will fail to mount again, even though I can access the cache drive fine right now. Is the drive bad? I have several of these drives, so I can try replacing it. Would I have fewer issues if I ran several of them in the cache pool?
chiatower-diagnostics-20210831-1815.zip
Edited August 31, 2021 by Struck (diagnostics added)
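For anyone wading through a similar log, here is a rough Python sketch that condenses the two recurring message types above (the per-device `errs:` counters and the `parent transid verify failed` lines) into a short summary. The regular expressions are written only against the lines quoted in this thread, and `summarize` is a hypothetical helper name, not part of any btrfs tool; other kernel versions may word these messages differently.

```python
import re
from collections import defaultdict

# A few lines copied verbatim from the log quoted above.
SAMPLE = """\
Aug 31 04:29:37 ChiaTower kernel: BTRFS error (device sdm1): bdev /dev/sdm1 errs: wr 0, rd 0, flush 0, corrupt 9, gen 0
Aug 31 04:29:37 ChiaTower kernel: BTRFS error (device sdm1): parent transid verify failed on 8343076864 wanted 3304 found 3233
Aug 31 04:29:37 ChiaTower kernel: BTRFS error (device sdm1): bdev /dev/sdm1 errs: wr 0, rd 0, flush 0, corrupt 10, gen 0
Aug 31 04:30:13 ChiaTower kernel: BTRFS error (device loop2): parent transid verify failed on 4708515840 wanted 203356 found 202764
"""

# Assumption: message layout exactly as seen in this thread.
ERRS_RE = re.compile(
    r"BTRFS error \(device (?P<dev>[^)]+)\): bdev \S+ errs: "
    r"wr (?P<wr>\d+), rd (?P<rd>\d+), flush (?P<flush>\d+), "
    r"corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)
TRANSID_RE = re.compile(
    r"BTRFS error \(device (?P<dev>[^)]+)\): parent transid verify failed "
    r"on (?P<block>\d+) wanted (?P<wanted>\d+) found (?P<found>\d+)"
)


def summarize(log_text):
    """Return (latest error counters per device, distinct transid failures per device)."""
    counters = {}
    transid = defaultdict(set)
    for line in log_text.splitlines():
        m = ERRS_RE.search(line)
        if m:
            # The counters are cumulative, so the last line seen wins.
            counters[m["dev"]] = {
                key: int(m[key]) for key in ("wr", "rd", "flush", "corrupt", "gen")
            }
            continue
        m = TRANSID_RE.search(line)
        if m:
            # Store (block, wanted, found) once, however often it repeats.
            transid[m["dev"]].add(
                (int(m["block"]), int(m["wanted"]), int(m["found"]))
            )
    return counters, dict(transid)


if __name__ == "__main__":
    counters, transid = summarize(SAMPLE)
    for dev, c in counters.items():
        print(dev, "corrupt =", c["corrupt"])
    for dev, failures in transid.items():
        for block, wanted, found in sorted(failures):
            print(dev, "block", block, "is", wanted - found, "generations behind")
```

Collapsing the repeats makes it easier to see that the log reports one stale block on `sdm1` and one on `loop2`, rather than dozens of separate problems.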
August 31, 2021
Community Expert
You could run an extended SMART test on that SSD. My guess is some other problem is causing corruption. Have you done memtest?
August 31, 2021
Author
18 minutes ago, trurl said: You could run an extended SMART test on that SSD. My guess is some other problem is causing corruption. Have you done memtest?
I will run an extended SMART test after the reboot. I have now inserted a new SSD that is meant to replace the one I currently use. I will try memtest later, but I didn't have any problems before I installed the SSD. The array is unaffected by this problem, it seems.
Edited August 31, 2021 by Struck
August 31, 2021
Community Expert
3 minutes ago, Struck said: The array is unaffected by this problem, it seems
Array is XFS, btrfs is much more sensitive to bad RAM, though if that is the problem you'll also get data corruption on the array, just undetected.
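To illustrate the point about detected versus undetected corruption, here is a minimal Python sketch of the general idea behind data checksumming: a checksum recorded at write time exposes a later bit flip, whereas a store without data checksums would hand back the damaged bytes silently. This is only an analogy. `zlib.crc32` stands in for the filesystem's checksum (btrfs defaults to crc32c, a different polynomial), and `store`, `verify`, and `flip_bit` are hypothetical helper names.

```python
import zlib


def store(data: bytes):
    """Return the block together with a checksum recorded at write time."""
    return data, zlib.crc32(data)


def verify(data: bytes, expected: int) -> bool:
    """Recompute the checksum on read and compare it with the stored one."""
    return zlib.crc32(data) == expected


def flip_bit(data: bytes, byte_index: int, bit: int) -> bytes:
    """Simulate a single-bit error, such as one caused by faulty RAM."""
    corrupted = bytearray(data)
    corrupted[byte_index] ^= 1 << bit
    return bytes(corrupted)


if __name__ == "__main__":
    block = b"example payload " * 256
    stored, csum = store(block)
    damaged = flip_bit(stored, 100, 3)

    print(verify(stored, csum))   # True: intact data passes
    print(verify(damaged, csum))  # False: the flipped bit is caught
    # Without a stored checksum there is nothing to compare against,
    # so the damaged block would be returned as if it were valid.
    print(damaged == stored)      # False, yet nothing would flag it
```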
August 31, 2021
Author
4 minutes ago, JorgeB said: Array is XFS, btrfs is much more sensitive to bad RAM, though if that is the problem you'll also get data corruption on the array, just undetected.
Okay, thanks. I will try it after the extended SMART test is done. As a side note, the cache disk mounted normally after a reboot.
September 2, 2021
Author
Memtest didn't find anything. The extended SMART test did not find any issues either. I have not tried replacing the drive yet. I will do that after the weekend, I guess.
September 2, 2021
Community Expert
2 hours ago, Struck said: Memtest didn't find anything.
How long did you let it run?
September 2, 2021
Author
6 hours ago, trurl said: How long did you let it run?
Not long enough. 2 passes, about 4 hours.