January 29, 2023

I have two NVMe SSD cache disks. One is in an M.2 slot on the motherboard; the other is in a PCIe adapter. They are in RAID 1. After I installed some fans, the system started reporting issues with one of the disks, and I cannot start one particular VM; another VM appears to work. The Docker containers seem to run fine, though. Some of the messages I am getting are:

Jan 29 11:34:53 Sherlock kernel: BTRFS error (device nvme1n1p1): parent transid verify failed on 1138769920 wanted 196530 found 179974
Jan 29 11:34:57 Sherlock kernel: btrfs_print_data_csum_error: 997 callbacks suppressed
Jan 29 11:34:57 Sherlock kernel: BTRFS warning (device nvme1n1p1): csum failed root 5 ino 305583 off 0 csum 0x710bae7d expected csum 0xc75ff09f mirror 2
Jan 29 11:34:57 Sherlock kernel: btrfs_dev_stat_print_on_error: 997 callbacks suppressed
Jan 29 11:34:57 Sherlock kernel: BTRFS error (device nvme1n1p1): bdev /dev/nvme1n1p1 errs: wr 206546744, rd 3634341, flush 2878652, corrupt 12296835, gen 0

and similar. I have attached the diagnostics; any help would be greatly appreciated.

sherlock-diagnostics-20230129-1119.zip

Edited January 29, 2023 by tdatta (update info)
January 29, 2023 Author

Update: I ran a scrub. No file system issues are reported any longer, but that particular VM won't boot. The Windows recovery options do not work. The cache drives were in RAID 1. What is the point of that if they have issues like this?

Edited January 30, 2023 by tdatta (update)
January 30, 2023 Community Expert

One of your devices dropped offline in the past. Note that if you have NOCOW shares and bring the dropped device back online without wiping it first, it can corrupt them. More info here about that and how to better monitor pools.

Jan 29 11:01:12 Sherlock kernel: BTRFS info (device nvme1n1p1): bdev /dev/nvme1n1p1 errs: wr 206546744, rd 3634341, flush 2878652, corrupt 2376407, gen 0
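For anyone who wants to check whether those error counters are still climbing, here is a minimal Python sketch. It only assumes the `bdev ... errs:` line format seen in the two log lines quoted in this thread; the function names `parse_dev_stats` and `counter_growth` are made up for illustration and are not part of btrfs or Unraid.

```python
import re

# Pattern for the per-device counter summary, based only on the
# kernel log lines quoted in this thread (an assumption, not a spec).
STATS_RE = re.compile(
    r"bdev (?P<dev>\S+) errs: wr (?P<wr>\d+), rd (?P<rd>\d+), "
    r"flush (?P<flush>\d+), corrupt (?P<corrupt>\d+), gen (?P<gen>\d+)"
)


def parse_dev_stats(line):
    """Return (device, counters) for a btrfs 'errs:' log line, or None."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    counters = {
        name: int(value)
        for name, value in match.groupdict().items()
        if name != "dev"
    }
    return match.group("dev"), counters


def counter_growth(earlier, later):
    """Return only the counters that changed, with the size of the change."""
    return {
        name: later[name] - earlier[name]
        for name in later
        if later[name] != earlier[name]
    }


# The two summaries from this thread, about 34 minutes apart.
earlier_line = (
    "Jan 29 11:01:12 Sherlock kernel: BTRFS info (device nvme1n1p1): "
    "bdev /dev/nvme1n1p1 errs: wr 206546744, rd 3634341, "
    "flush 2878652, corrupt 2376407, gen 0"
)
later_line = (
    "Jan 29 11:34:57 Sherlock kernel: BTRFS error (device nvme1n1p1): "
    "bdev /dev/nvme1n1p1 errs: wr 206546744, rd 3634341, "
    "flush 2878652, corrupt 12296835, gen 0"
)

dev, earlier = parse_dev_stats(earlier_line)
_, later = parse_dev_stats(later_line)
growth = counter_growth(earlier, later)
print(dev, growth)
```

On these two lines the write, read and flush counters are unchanged and only `corrupt` grew. These counters are cumulative, so a large value on its own can be left over from an old dropout; growth between two readings is what shows a problem is still ongoing.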
January 30, 2023 Author

5 hours ago, JorgeB said:
One of your devices dropped offline in the past, note that if you have NOCOW shares and bring the dropped device online without wiping it first it can corrupt them, more info here about that and how to better monitor pools.
Jan 29 11:01:12 Sherlock kernel: BTRFS info (device nvme1n1p1): bdev /dev/nvme1n1p1 errs: wr 206546744, rd 3634341, flush 2878652, corrupt 2376407, gen 0

Thank you, I will read through what you linked.