August 21, 2023

Hi folks, I was wondering if I could get some help recovering my "appdata" folder from my cache pool. I had 2 SSDs (one NVMe and one 2.5") in a RAID1. Everything was working fine, but then my dockers went offline, so I rebooted and my cache pool was showing as unmountable. The file system is BTRFS. I unassigned one of the drives (not sure why, in retrospect) and restarted the array, and now I get this:

I'm attaching my diagnostics. I tried a bunch of commands from various topics, but I'm not really sure what I'm doing. Here are some of the errors I get back.

-----------
root@Delta:~# mount -o rescue=all,ro /dev/nvme0n1 /temp
mount: /temp: wrong fs type, bad option, bad superblock on /dev/nvme0n1, missing codepage or helper program, or other error.
       dmesg(1) may have more information after failed mount system call.
-----------
root@Delta:~# btrfs rescue zero-log /dev/nvme0n1
No valid Btrfs found on /dev/nvme0n1
ERROR: could not open ctree
-----------
root@Delta:~# btrfs restore -v /dev/nvme0n1 /mnt/disks/WD-WCAZA9724880/restore
No valid Btrfs found on /dev/nvme0n1
Could not open root, trying backup super
No valid Btrfs found on /dev/nvme0n1
Could not open root, trying backup super
No valid Btrfs found on /dev/nvme0n1
Could not open root, trying backup super
-----------

Any chance I might be able to save the folder data so I don't have to reconfigure it all? I did not keep a backup, a very poor decision which has led me to this particular place. I appreciate your help!

delta-diagnostics-20230820-2319.zip
August 21, 2023 (Author)

7 hours ago, JorgeB said: Post output of: btrfs fi show

Label: none uuid: f2eaaf1e-021f-4d72-9fb1-39333caf8e61
    Total devices 1 FS bytes used 12.71TiB
    devid 1 size 12.73TiB used 12.73TiB path /dev/md1p1

Label: none uuid: 6c8567c5-6e83-4044-86fc-b0f9aa64bc33
    Total devices 1 FS bytes used 9.08TiB
    devid 1 size 9.09TiB used 9.09TiB path /dev/md2p1

Label: none uuid: 74f12617-8f88-45ea-8e43-1977b282e953
    Total devices 1 FS bytes used 9.06TiB
    devid 1 size 9.09TiB used 9.09TiB path /dev/md3p1

Label: none uuid: 406ed446-dc92-4112-b9d9-a7bad4b7cc10
    Total devices 1 FS bytes used 99.79GiB
    devid 1 size 3.64TiB used 106.02GiB path /dev/sdf1

Label: none uuid: ea9eeecc-e314-48dc-970e-90bd2449db01
    Total devices 1 FS bytes used 12.71TiB
    devid 1 size 12.73TiB used 12.73TiB path /dev/sdd1

Label: none uuid: 829aff6c-8b5e-4387-a2b1-c0c3d81f185c
    Total devices 1 FS bytes used 532.47GiB
    devid 1 size 1.36TiB used 593.24GiB path /dev/sdc1
August 21, 2023 (Community Expert)

There's no btrfs filesystem on the NVMe device or on the other unassigned SSD, which suggests the devices were wiped. I cannot see what happened since you've rebooted. Post the output of:

fdisk -l /dev/nvme0n1

Also, which unassigned SSD was part of the pool?
August 21, 2023 (Author)

I didn't actively trigger the wipe on the drive, but maybe that's a moot point now. Here is the output:

root@Delta:~# fdisk -l /dev/nvme0n1
Disk /dev/nvme0n1: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: Samsung SSD 970 EVO Plus 2TB
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

The unassigned drive that was part of the pool was "sdb" (Samsung SSD 870 2TB). I tried to add it back to the pool, but of course I was warned that it would be wiped if I tried.
August 21, 2023 (Community Expert)

There are no partitions on either device. There's one thing we can try that usually works, as long as the devices were not fully trimmed. Type:

sfdisk /dev/sdb

then type 2048 and hit enter, and post a screenshot of the results.

P.S. It may take me a few hours to reply since I'm about to go offline for the day.
August 21, 2023 (Author)

Here are the results; I said no for now:

root@Delta:~# sfdisk /dev/sdb

Welcome to sfdisk (util-linux 2.38.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Checking that no-one is using this disk right now ... OK

Disk /dev/sdb: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: Samsung SSD 870
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

sfdisk is going to create a new 'dos' disk label.
Use 'label: <name>' before you define a first partition
to override the default.

Type 'help' to get more information.

>>> 2048
Created a new DOS disklabel with disk identifier 0xdd4dbb34.
Created a new partition 1 of type 'Linux' and of size 1.8 TiB.
Partition #1 contains a btrfs signature.

Do you want to remove the signature? [Y]es/[N]o:
August 22, 2023 (Community Expert)

13 hours ago, josephsiu said: Do you want to remove the signature? [Y]es/[N]o:

Type N here, then type write and hit enter, and post the output of:

btrfs fi show
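For readers wondering what entering only a start sector of 2048 produces: sfdisk proposes a single partition that runs from that sector to the last sector of the disk. A minimal sketch of the arithmetic, using the sector count reported for these 2TB drives; the function name partition_geometry is hypothetical and nothing here touches a device.

```shell
# Sketch only: computes "start end count" for one partition that begins at
# a given sector and extends to the end of the disk. The function name is
# an assumption made for illustration; it performs arithmetic only.
partition_geometry() {
  total_sectors=$1          # total sectors reported by fdisk/sfdisk
  start=${2:-2048}          # start sector, defaulting to 2048 as in the thread
  end=$((total_sectors - 1))        # sectors are numbered from 0
  count=$((end - start + 1))        # number of sectors in the partition
  printf '%s %s %s\n' "$start" "$end" "$count"
}

# Usage: partition_geometry 3907029168
```

Because the partition is recreated at the same offset and the btrfs signature is kept (answering N), the filesystem data inside it is left where it was; only the partition table is rewritten.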
August 23, 2023 (Author)

Here are the results:

--------------
root@Delta:~# sfdisk /dev/sdb

Welcome to sfdisk (util-linux 2.38.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Checking that no-one is using this disk right now ... OK

Disk /dev/sdb: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: Samsung SSD 870
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

sfdisk is going to create a new 'dos' disk label.
Use 'label: <name>' before you define a first partition
to override the default.

Type 'help' to get more information.

>>> 2048
Created a new DOS disklabel with disk identifier 0x6ea176be.
Created a new partition 1 of type 'Linux' and of size 1.8 TiB.
Partition #1 contains a btrfs signature.

Do you want to remove the signature? [Y]es/[N]o: N

/dev/sdb1 : 2048 3907029167 (1.8T) Linux
/dev/sdb2: write

New situation:
Disklabel type: dos
Disk identifier: 0x6ea176be

Device     Boot Start        End    Sectors  Size Id Type
/dev/sdb1        2048 3907029167 3907027120  1.8T 83 Linux

The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.

root@Delta:~# btrfs fi show
Label: none uuid: f2eaaf1e-021f-4d72-9fb1-39333caf8e61
    Total devices 1 FS bytes used 12.71TiB
    devid 1 size 12.73TiB used 12.73TiB path /dev/md1p1

Label: none uuid: 6c8567c5-6e83-4044-86fc-b0f9aa64bc33
    Total devices 1 FS bytes used 9.08TiB
    devid 1 size 9.09TiB used 9.09TiB path /dev/md2p1

Label: none uuid: 74f12617-8f88-45ea-8e43-1977b282e953
    Total devices 1 FS bytes used 9.06TiB
    devid 1 size 9.09TiB used 9.09TiB path /dev/md3p1

warning, device 4 is missing
Label: none uuid: 406ed446-dc92-4112-b9d9-a7bad4b7cc10
    Total devices 1 FS bytes used 99.79GiB
    devid 1 size 3.64TiB used 106.02GiB path /dev/sdf1

Label: none uuid: ea9eeecc-e314-48dc-970e-90bd2449db01
    Total devices 1 FS bytes used 12.71TiB
    devid 1 size 12.73TiB used 12.73TiB path /dev/sdd1

Label: none uuid: 829aff6c-8b5e-4387-a2b1-c0c3d81f185c
    Total devices 1 FS bytes used 532.47GiB
    devid 1 size 1.36TiB used 593.24GiB path /dev/sdc1

Label: none uuid: 167d508b-3822-4e67-8cf8-3e2ea48529e5
    Total devices 2 FS bytes used 768.59GiB
    devid 2 size 1.82TiB used 774.03GiB path /dev/sdb1
    *** Some devices missing
--------------

I see sdb now in the output of the last command! Is there hope? In Unassigned Devices it looks like it is mountable, but I'll wait for guidance. Am I correct to assume that I can mount it and copy the data out?
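The listing above shows the pool's filesystem again, but flagged "Some devices missing" because only one of its two members has been recovered so far. A minimal sketch, assuming the text of btrfs fi show is piped in on standard input: the helper below (the name missing_device_uuids is hypothetical) prints the UUID of every filesystem carrying that flag.

```shell
# Sketch only: reads `btrfs fi show` text on stdin and prints the UUID of
# each filesystem that btrfs marks with "Some devices missing".
# The function name is an assumption made for illustration.
missing_device_uuids() {
  awk '
    # Remember the UUID from the most recent "Label: ... uuid: ..." line.
    /uuid:/ { for (i = 1; i < NF; i++) if ($i == "uuid:") uuid = $(i + 1) }
    # The flag line follows the devid lines of the affected filesystem.
    /Some devices missing/ { print uuid }
  '
}

# Usage: btrfs fi show 2>/dev/null | missing_device_uuids
```

An empty result means no listed filesystem is reporting a missing member, which is the state to aim for before re-importing the pool.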
August 23, 2023 (Community Expert, Solution)

Let's see if we can recover the other device as well. Type:

sfdisk /dev/nvme0n1

then 2048 and enter. If there's a btrfs signature, type N to keep it, and then write. Finally, run btrfs fi show once more.
August 23, 2023 (Author)

Here it is!

----------------
root@Delta:~# sfdisk /dev/nvme0n1

Welcome to sfdisk (util-linux 2.38.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Checking that no-one is using this disk right now ... OK

Disk /dev/nvme0n1: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: Samsung SSD 970 EVO Plus 2TB
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

sfdisk is going to create a new 'dos' disk label.
Use 'label: <name>' before you define a first partition
to override the default.

Type 'help' to get more information.

>>> 2048
Created a new DOS disklabel with disk identifier 0x2340b07d.
Created a new partition 1 of type 'Linux' and of size 1.8 TiB.
Partition #1 contains a btrfs signature.

Do you want to remove the signature? [Y]es/[N]o: N

/dev/nvme0n1p1 : 2048 3907029167 (1.8T) Linux
/dev/nvme0n1p2: write

New situation:
Disklabel type: dos
Disk identifier: 0x2340b07d

Device         Boot Start        End    Sectors  Size Id Type
/dev/nvme0n1p1       2048 3907029167 3907027120  1.8T 83 Linux

The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.

root@Delta:~# btrfs fi show
Label: none uuid: f2eaaf1e-021f-4d72-9fb1-39333caf8e61
    Total devices 1 FS bytes used 12.71TiB
    devid 1 size 12.73TiB used 12.73TiB path /dev/md1p1

Label: none uuid: 6c8567c5-6e83-4044-86fc-b0f9aa64bc33
    Total devices 1 FS bytes used 9.08TiB
    devid 1 size 9.09TiB used 9.09TiB path /dev/md2p1

Label: none uuid: 74f12617-8f88-45ea-8e43-1977b282e953
    Total devices 1 FS bytes used 9.06TiB
    devid 1 size 9.09TiB used 9.09TiB path /dev/md3p1

Label: none uuid: 406ed446-dc92-4112-b9d9-a7bad4b7cc10
    Total devices 1 FS bytes used 99.79GiB
    devid 1 size 3.64TiB used 106.02GiB path /dev/sdf1

Label: none uuid: ea9eeecc-e314-48dc-970e-90bd2449db01
    Total devices 1 FS bytes used 12.71TiB
    devid 1 size 12.73TiB used 12.73TiB path /dev/sdd1

Label: none uuid: 829aff6c-8b5e-4387-a2b1-c0c3d81f185c
    Total devices 1 FS bytes used 532.47GiB
    devid 1 size 1.36TiB used 593.24GiB path /dev/sdc1

Label: none uuid: 167d508b-3822-4e67-8cf8-3e2ea48529e5
    Total devices 2 FS bytes used 768.59GiB
    devid 2 size 1.82TiB used 774.03GiB path /dev/sdb1
    devid 4 size 1.82TiB used 774.03GiB path /dev/nvme0n1p1
----------------
August 23, 2023 (Community Expert)

Now stop the array, unassign the assigned pool device, and start the array without any device assigned to the pool so that the pool config is reset. Then stop the array, re-assign both pool members, and start the array to import the pool. Post new diags if it doesn't mount.
August 23, 2023 (Author)

Didn't work. The file system type says Auto, which I thought might be the problem, but it is grayed out so I cannot change it.
August 23, 2023 (Community Expert)

There's a problem with the log tree. If that's the only issue, typing this may help:

btrfs rescue zero-log /dev/sdb1

Then re-start the array.
August 23, 2023 (Author)

root@Delta:~# btrfs rescue zero-log /dev/sdb1
Clearing log on /dev/sdb1, previous log_root 2961845878784, level 0

It is back! I'm going to copy the data out now!

Should I reset and format the pool and basically start over?

What is the usual cause of log tree issues? I'm going to run a SMART test on these SSDs as well.

For my learning, in which file did you see the log tree error in the diagnostics?
August 23, 2023 (Community Expert)

1 minute ago, josephsiu said: Should I reset and format the pool and basically start over?

Make sure backups are up to date; you can then see if the pool holds up. If there are issues again soon, with the log tree or otherwise, re-format.

2 minutes ago, josephsiu said: What is the usual cause of log tree issues?

Most often an unclean unmount/shutdown.

2 minutes ago, josephsiu said: For my learning, in which file did you see the log tree error in the diagnostic?

Aug 23 10:26:12 Delta kernel: BTRFS: error (device sdb1: state EA) in btrfs_replay_log:2414: errno=-5 IO failure (Failed to recover log tree)
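The kernel message quoted above is the kind of line to look for when a btrfs pool refuses to mount after an unclean shutdown. A minimal sketch, assuming syslog text is supplied on standard input: the helper below (the name log_replay_failures is hypothetical) prints the device named in each "Failed to recover log tree" error.

```shell
# Sketch only: reads syslog text on stdin and prints each device for which
# the kernel reported a failed btrfs log tree replay. The function name is
# an assumption; the pattern is based on the single message quoted in the
# thread and may need adjusting for other kernel versions.
log_replay_failures() {
  sed -n 's/.*BTRFS: error (device \([^:)]*\).*Failed to recover log tree.*/\1/p' |
    sort -u
}

# Usage (file name is a placeholder): log_replay_failures < syslog.txt
```

A device listed here is a candidate for btrfs rescue zero-log, which discards the log tree rather than replaying it, so the last few seconds of writes before the crash may be lost.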
November 2, 2023

On 8/23/2023 at 10:00 AM, JorgeB said: Lets see if we can recover the other device as well, type: sfdisk /dev/nvme0n1 then 2048 enter, if there's a btrfs signature type N to keep it and then write finally btrfs fi show once more

Just wanted to post to say that this saved me countless hours of backup recovery. Thank you so much @JorgeB! I was about to wipe and start fresh.
January 2, 2024

Hello there, and a happy new year. I have the exact same problem and could follow all the steps, but when I try this:

btrfs rescue zero-log /dev/nvme1n1

this is the output:

No valid Btrfs found on /dev/nvme1n1
ERROR: could not open ctree

Any hint or help would be great! Cheers
January 2, 2024 (Community Expert)

1 hour ago, rakoth said: i have the exact same problem and could follow all steps but when i try this:

First, please post the diags to see if it's really the same thing.
January 2, 2024 (Community Expert)

First thing, you should run memtest, since there's a lot of data corruption detected on both pool devices. Then type:

btrfs rescue zero-log /dev/nvme1n1p1

And re-start the array.
January 2, 2024

Thank you! That solved it! I will now check my memory and might replace it, which is cheaper than the drives. Thanks again for your help and quick responses!
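The difference between the failing and the working command in this last exchange is the target: /dev/nvme1n1 is the whole disk, while the btrfs filesystem lives in its first partition, /dev/nvme1n1p1. A minimal sketch of the usual Linux naming convention; the function name first_partition is hypothetical, and it only builds a string without checking that the device exists.

```shell
# Sketch only: derives the first-partition device path from a whole-disk
# path. Disks whose names end in a digit (nvme0n1, mmcblk0) take a "p"
# before the partition number; others (sdb) do not.
# The function name is an assumption made for illustration.
first_partition() {
  case "$1" in
    *[0-9]) printf '%sp1\n' "$1" ;;   # /dev/nvme1n1 becomes /dev/nvme1n1p1
    *)      printf '%s1\n' "$1" ;;    # /dev/sdb becomes /dev/sdb1
  esac
}

# Usage: first_partition /dev/nvme1n1
```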