CorneliusCornbread Posted December 9, 2023

My cache pool just died on me: it's showing as unmountable in Unraid, and attempting to read the drive from another computer yields the exact same errors. Is there something I can do to try and recover my files, or are they gone forever?

[  665.270184] BTRFS info (device sdc1): using crc32c (crc32c-intel) checksum algorithm
[  665.270196] BTRFS info (device sdc1): using free space tree
[  665.271019] BTRFS error (device sdc1): devid 1 uuid 1534a103-adb6-4af8-97dd-604441e7394f is missing
[  665.271031] BTRFS error (device sdc1): failed to read the system array: -2
[  665.271450] BTRFS error (device sdc1): open_ctree failed
[  677.052903] BTRFS: device fsid a1f55575-2547-45b5-b8a3-86cc81362f17 devid 2 transid 698149 /dev/sdc1 scanned by mount (11480)
[  677.053635] BTRFS info (device sdc1): using crc32c (crc32c-intel) checksum algorithm
[  677.053644] BTRFS info (device sdc1): using free space tree
[  677.054517] BTRFS error (device sdc1): devid 1 uuid 1534a103-adb6-4af8-97dd-604441e7394f is missing
[  677.054525] BTRFS error (device sdc1): failed to read the system array: -2
[  677.054827] BTRFS error (device sdc1): open_ctree failed
[  679.731194] BTRFS: device fsid a1f55575-2547-45b5-b8a3-86cc81362f17 devid 2 transid 698149 /dev/sdc1 scanned by mount (11506)
[  679.732081] BTRFS info (device sdc1): using crc32c (crc32c-intel) checksum algorithm
[  679.732094] BTRFS info (device sdc1): using free space tree
[  679.733048] BTRFS error (device sdc1): devid 1 uuid 1534a103-adb6-4af8-97dd-604441e7394f is missing
[  679.733059] BTRFS error (device sdc1): failed to read the system array: -2
[  679.733930] BTRFS error (device sdc1): open_ctree failed

Also, for what it's worth, it seems to have been related to a cache move (moving a movie from the cache to the array), as that's about the time our drives went corrupt and our server went down.
CorneliusCornbread Posted December 9, 2023

Here's another log from trying to start the array on my server with the cache drives connected: rose-plex-log-section.txt
JorgeB Posted December 9, 2023

Please post the diagnostics, but according to that snippet a device is missing; any idea where it is?
CorneliusCornbread Posted December 9, 2023

11 hours ago, JorgeB said:
Please post the diagnostics, but according to that snippet a device is missing; any idea where it is?

The first set of logs is from plugging the mirrored cache drive into my other PC using a USB-to-SATA adapter; the second set is from the Unraid server itself with both drives connected. With both drives present it seems to stop complaining about the missing device, which is why I included the second log. Also, while my server's been down, I ran MemTest86 all of last night and through some of the afternoon: 16 hours of testing yielded no bad memory, so I've ruled that out. Here's the diagnostics: rose-plex-diagnostics-20231209-1637.zip
JorgeB Posted December 10, 2023 (Solution)

If the log tree is the only problem this may help:

btrfs rescue zero-log /dev/nvme0n1p1

Then restart the array.
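For anyone landing here later, a sketch of how this advice fits together. The device path is the one from this thread and the mount point is an example; confirm both against your own pool before running anything:

```shell
# List btrfs pools and their member devices first, and confirm which
# device node actually belongs to the broken pool -- /dev/nvme0n1p1
# below is from this thread, not necessarily yours.
btrfs filesystem show

# zero-log discards the filesystem's write-ahead log tree, which is
# what open_ctree replays at mount time. At most the last few seconds
# of fsync'd writes are lost; the rest of the filesystem is untouched.
btrfs rescue zero-log /dev/nvme0n1p1

# Then try mounting again (starting the array does this in Unraid).
mount /dev/nvme0n1p1 /mnt/cache
```

This only helps when the log tree itself is the corrupt part, which is why JorgeB hedged with "if the log tree is the only problem".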
CorneliusCornbread Posted December 10, 2023

8 hours ago, JorgeB said:
If the log tree is the only problem this may help:
btrfs rescue zero-log /dev/nvme0n1p1
Then restart the array

That seems to have worked! Thank you so much! I'm backing up my appdata directory.
CorneliusCornbread Posted December 17, 2023

On 12/10/2023 at 6:16 AM, JorgeB said:
If the log tree is the only problem this may help:
btrfs rescue zero-log /dev/nvme0n1p1
Then restart the array

So I'm only sort of able to read from the cache now (sometimes I can write, sometimes I can't), and I've backed up everything I need (sorry for the late response, I just got finished with finals). At this point I'm trying to figure out whether I need to blow away my cache and start from scratch, or whether I can get the file system sorted. We replaced the drive we think was causing the issue: we suspect the NVMe drive was going bad, since for some reason we couldn't get SMART reports from it at all. After replacing the drive I let the cache pool rebuild itself overnight.

Running a btrfs check yields this:

Opening filesystem to check...
warning, device 1 is missing
Checking filesystem on /dev/sdd1
UUID: a1f55575-2547-45b5-b8a3-86cc81362f17
[1/7] checking root items
[2/7] checking extents
data extent[5392462663680, 16384] referencer count mismatch (root 5 owner 12824979 offset 6780850176) wanted 0 have 1
data extent[5392462663680, 16384] bytenr mismatch, extent item bytenr 5392462663680 file item bytenr 0
data extent[5392462663680, 16384] referencer count mismatch (root 5583673057798520837 owner 4294936705 offset 6780850176) wanted 1 have 0
backpointer mismatch on [5392462663680 16384]
ERROR: errors found in extent allocation tree or chunk allocation
[3/7] checking free space tree
[4/7] checking fs roots
[5/7] checking only csums items (without verifying data)
[6/7] checking root refs
[7/7] checking quota groups skipped (not enabled on this FS)
found 283721043968 bytes used, error(s) found
total csum bytes: 163032844
total tree bytes: 798720000
total fs tree bytes: 477118464
total extent tree bytes: 121241600
btree space waste bytes: 154372644
file data blocks allocated: 1237013446656
 referenced 270644371456

And attempting a repair via the check just aborts:

Opening filesystem to check...
warning, device 1 is missing
Checking filesystem on /dev/sdd1
UUID: a1f55575-2547-45b5-b8a3-86cc81362f17
enabling repair mode
WARNING:

	Do not use --repair unless you are advised to do so by a developer
	or an experienced user, and then only after having accepted that no
	fsck can successfully repair all types of filesystem corruption. E.g.
	some software or hardware bugs can fatally damage a volume.
	The operation will start in 10 seconds.
	Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1
Starting repair.
[1/7] checking root items
Fixed 0 roots.
[2/7] checking extents
data extent[5392462663680, 16384] referencer count mismatch (root 5 owner 12824979 offset 6780850176) wanted 0 have 1
data extent[5392462663680, 16384] bytenr mismatch, extent item bytenr 5392462663680 file item bytenr 0
data extent[5392462663680, 16384] referencer count mismatch (root 5583673057798520837 owner 4294936705 offset 6780850176) wanted 1 have 0
backpointer mismatch on [5392462663680 16384]
Unable to find block group for 0
Unable to find block group for 0
Unable to find block group for 0
failed to repair damaged filesystem, aborting

Is the file system beyond repair? If so, what's the easiest way to blow it away and start from scratch? I'll need to recreate my system and appdata directories for sure, as those were cache-only.
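Since the check still reports "device 1 is missing", one way to pull any remaining data off before wiping looks roughly like this. The source device is the one from the check output above; the mount point and destination share are example names, not taken from this thread:

```shell
# Mount read-only so nothing can make the corruption worse.
# "degraded" lets btrfs mount with a pool member missing, and
# "usebackuproot" falls back to an older copy of the tree root
# if the current one is damaged.
mkdir -p /mnt/recovery
mount -o ro,degraded,usebackuproot /dev/sdd1 /mnt/recovery

# If even a degraded read-only mount fails, btrfs restore copies
# files out without mounting the filesystem at all (destination
# here is an example array share).
btrfs restore -v /dev/sdd1 /mnt/user0/cache-rescue/
```

Both are last-resort reads; neither modifies the pool, so they're safe to try before deciding to erase it.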
JorgeB Posted December 18, 2023

Recommend you back up and recreate the pool; to wipe the pool you can click on "erase".
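A sketch of the back-up-and-restore step around that erase, assuming the typical Unraid mount points (/mnt/cache for the pool, /mnt/user0 for array-only shares); adjust paths to your own setup:

```shell
# Stop anything holding files open on the pool first (Docker, VMs),
# then copy the pool's contents to an array share.
rsync -avh --progress /mnt/cache/ /mnt/user0/cache-backup/

# After erasing and recreating the pool from the Unraid UI,
# copy everything back.
rsync -avh --progress /mnt/user0/cache-backup/ /mnt/cache/
```

Since some of the files read back inconsistently in this thread, verifying the backup (e.g. rerunning rsync with --checksum, or spot-checking the appdata and system shares) before clicking "erase" would be prudent.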