nextgenpotato Posted May 19, 2021

Hi all, I have two SATA SSDs that I had set up as a RAID0 btrfs cache pool. After one of my VMs (whose disk lives on the cache) froze, I had to restart the array, and now the cache drives won't mount even though the array starts, so Docker and VMs are disabled. It looks like one of the disks may be starting to fail; Unraid reported a CRC error on it. Is there any way to mount this pool long enough to get the data off? It would save me a lot of headache.

What I've tried so far:

1- Unassign the disks, start the array, reassign them, and start the array again.

2- btrfs check --readonly (it shows the same result for both drives):

Opening filesystem to check...
Checking filesystem on /dev/sdi1
UUID: f4c7a0bf-b527-47f1-ada2-5da96e590ef7
[1/7] checking root items
[2/7] checking extents
data backref 2013043068928 root 5 owner 623 offset 100331225088 num_refs 0 not found in extent tree
incorrect local backref count on 2013043068928 root 5 owner 623 offset 100331225088 found 1 wanted 0 back 0x62aa9a0
incorrect local backref count on 2013043068928 root 13 owner 623 offset 100331225088 found 0 wanted 1 back 0x62aaad0
backref disk bytenr does not match extent record, bytenr=2013043068928, ref bytenr=0
backpointer mismatch on [2013043068928 65536]
ERROR: errors found in extent allocation tree or chunk allocation
[3/7] checking free space tree
cache and super generation don't match, space cache will be invalidated
[4/7] checking fs roots
[5/7] checking only csums items (without verifying data)
[6/7] checking root refs
[7/7] checking quota groups skipped (not enabled on this FS)
found 538120765440 bytes used, error(s) found
total csum bytes: 86454728
total tree bytes: 452263936
total fs tree bytes: 287948800
total extent tree bytes: 64061440
btree space waste bytes: 81374861
file data blocks allocated: 961713012736 referenced 534715310080

3- blkid output:

/dev/sdh1: UUID="f4c7a0bf-b527-47f1-ada2-5da96e590ef7" UUID_SUB="166aa0d2-ada9-45f3-b388-82897c1caf40" BLOCK_SIZE="4096" TYPE="btrfs"
/dev/sdi1: UUID="f4c7a0bf-b527-47f1-ada2-5da96e590ef7" UUID_SUB="62953cd2-a423-47a2-9a2c-b9f6a3f1af4b" BLOCK_SIZE="4096" TYPE="btrfs"

4- btrfs fi show f4c7a0bf-b527-47f1-ada2-5da96e590ef7:

Label: none  uuid: f4c7a0bf-b527-47f1-ada2-5da96e590ef7
    Total devices 2  FS bytes used 501.16GiB
    devid 1 size 465.76GiB used 277.03GiB path /dev/sdi1
    devid 2 size 465.76GiB used 277.03GiB path /dev/sdh1
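If nobody warns me off, the next thing I'm thinking of trying from the console is a read-only rescue mount, just so I can copy the VM images somewhere safe before attempting any actual repair. A rough sketch of what I mean (the mount point and backup destination are just examples from my setup, and on newer kernels the fallback option is spelled rescue=usebackuproot instead of usebackuproot):

mkdir -p /mnt/recovery                               # temporary mount point, any empty directory works
mount -o ro,usebackuproot /dev/sdi1 /mnt/recovery    # read-only mount, falling back to an older tree root if needed
rsync -a /mnt/recovery/domains/ /mnt/disk1/vm-backup/   # copy the VM images off before running any repair tool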
nextgenpotato Posted May 19, 2021 Author

Well, I kept telling myself not to rush and try random things that might break it further, but I have no patience. I was able to fix the mounting issue after some googling, and I'm backing up my data (my VM disk images) as I type this. The solution was posted on reddit's r/btrfs. All I did was run the command below, which as I understand it clears the btrfs log that was blocking the mount, so the pool becomes mountable again, at your own risk, which is exactly what I wanted:

btrfs rescue zero-log /dev/sdi1

I don't think the root cause was the drive failing (it only has a CRC count of 1). I was running qcow2 VM disks on this btrfs RAID0 pool, and a few times the Windows 10 VM reported its disk as full when it was actually far from full; it would go back to normal after a restart. When I checked the disk image on the host, it had grown to its maximum size as well, even though, being qcow2, it should stay closer to the amount of data actually used rather than the size allocated. So I think there is a bug somewhere that confuses the filesystem in this particular scenario.

I've ordered a new NVMe drive to replace this configuration, and I'm going back to boring old XFS without any RAID config.
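Before I migrate, I'll probably also compact the bloated image so I'm not copying a few hundred GB of mostly empty clusters to the new drive. Roughly what I have in mind, with the VM shut down first (the image path below is just where mine happens to live, adjust for your own layout):

# Compare the space the image actually takes on disk with its virtual size
qemu-img info /mnt/cache/domains/Windows10/vdisk1.qcow2
du -h /mnt/cache/domains/Windows10/vdisk1.qcow2

# Rewrite the image to drop unused clusters, then swap the compacted copy in
qemu-img convert -O qcow2 /mnt/cache/domains/Windows10/vdisk1.qcow2 /mnt/cache/domains/Windows10/vdisk1-compact.qcow2
mv /mnt/cache/domains/Windows10/vdisk1-compact.qcow2 /mnt/cache/domains/Windows10/vdisk1.qcow2

# Optional: mark the domains directory NOCOW so new images skip btrfs copy-on-write
# (only affects files created after the flag is set)
chattr +C /mnt/cache/domains/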