
Unraid

Btrfs cache lost due to stupid error on my part


Generally I have a rule that I don't work on things when I am really tired so that I don't make stupid mistakes. 

Last night I was really tired and ignored that rule and now I am paying the price.

 

I just swapped motherboard/processor/ram for my Unraid server yesterday and that went well. I also added 2 nvme drives to use for my cache pool to replace the 1 nvme drive I had been using so that I would have a little redundancy there. (I know it's not a proper backup, I am actually in the process of setting up a more proper backup system.) 

 

I added 1 of the new nvme drives to the cache pool and that seemed fine and it balanced it. This is where I should have stopped and waited until after I had slept. I was impatient and I thought it was a good idea to just swap the old nvme drive for the new one in the gui directly. Bad idea. Btrfs pool now gone. Turned off the server, went to bed, and now here I am the next day.

 

I've tried the restore options in the FAQ up until the last option. Currently I have the original nvme drive and the one I added 1st yesterday in the cache pool like it was when it was working before I made the mistake.

 

I have a feeling that I am screwed, but is there any way to restore from here?

 

 

Solved by JorgeB

  • Community Expert

Please post the diagnostics after array start and the output of:

btrfs fi show

 

  • Author
Label: none  uuid: 849f7ded-6fbb-4f5f-9627-fa781a175567
        Total devices 2 FS bytes used 714.96GiB
        devid    1 size 931.51GiB used 826.48GiB path /dev/sdd1
        devid    2 size 931.51GiB used 826.48GiB path /dev/sdb1

Label: none  uuid: db14842a-e91c-4154-bdd8-fbe89dfc7ce3
        Total devices 1 FS bytes used 340.00KiB
        devid    1 size 20.00GiB used 536.00MiB path /dev/loop2

Label: none  uuid: 6331b51d-add2-421b-a015-22c674718eb5
        Total devices 1 FS bytes used 412.00KiB
        devid    1 size 1.00GiB used 126.38MiB path /dev/loop3

 

sdd and sdb are part of another btrfs pool not having to do with cache.

 

turtle-diagnostics-20231212-1040.zip

  • Community Expert

There is no valid btrfs filesystem on the NVMe devices, which suggests they were wiped or fully trimmed. Post the output of:


 

fdisk -l nvme2n1
fdisk -l nvme0n1

 

  • Author
root@Turtle:~# fdisk -l nvme2n1
fdisk: cannot open nvme2n1: No such file or directory
root@Turtle:~# fdisk -l nvme0n1
fdisk: cannot open nvme0n1: No such file or directory

 

  • Community Expert

Sorry, should be:

fdisk -l /dev/nvme2n1
fdisk -l /dev/nvme0n1
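
The first attempt failed because fdisk was given a bare kernel name, which it resolves relative to the current directory rather than /dev. A minimal sketch of a guard against that slip follows; the helper name dev_path is hypothetical and not part of fdisk or Unraid.

```shell
# dev_path: hypothetical helper, not part of fdisk or Unraid.
# Prints its argument as an absolute device path, prefixing /dev/ when
# only a bare kernel name such as "nvme2n1" was given.
dev_path() {
    case "$1" in
        /dev/*) printf '%s\n' "$1" ;;
        *)      printf '/dev/%s\n' "$1" ;;
    esac
}

# Example (read-only listing, needs root on a real system):
#   fdisk -l "$(dev_path nvme2n1)"
```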

 

  • Author
root@Turtle:~# fdisk -l /dev/nvme2n1
Disk /dev/nvme2n1: 931.51 GiB, 1000204886016 bytes, 1953525168 sectors
Disk model: CT1000P1SSD8                            
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
root@Turtle:~# fdisk -l /dev/nvme0n1
Disk /dev/nvme0n1: 1.82 TiB, 2000398934016 bytes, 3907029168 sectors
Disk model: WD_BLACK SN850X 2000GB                  
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
  • Community Expert

That confirms no partition exists. If the devices were wiped with wipefs, the pool may still be recoverable; if a full device trim was done with blkdiscard, it won't be. You can try this. Type:

 

sfdisk /dev/nvme2n1

 

then type 2048 and hit return and post a screenshot of the results.
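
The number typed at the sfdisk prompt is the candidate start sector of the lost partition. As a read-only way to check a candidate, one can look for the btrfs superblock magic directly. This is only a sketch under stated assumptions: 512-byte logical sectors (as fdisk reports for these drives), the primary btrfs superblock at 64 KiB into the partition, and the 8-byte magic _BHRfS_M at byte 64 of that superblock. The function name btrfs_magic_at is hypothetical.

```shell
# btrfs_magic_at DEVICE_OR_IMAGE START_SECTOR
# Hypothetical helper: succeeds if a btrfs superblock magic is found where
# a partition starting at START_SECTOR (512-byte sectors) would place it.
# It only reads from the device; nothing is written.
btrfs_magic_at() {
    # partition start in bytes + 64 KiB superblock offset + 64-byte magic offset
    magic_offset=$(( $2 * 512 + 65536 + 64 ))
    # read 8 bytes at that offset; drop NUL bytes so the shell can hold the value
    magic=$(dd if="$1" bs=1 skip="$magic_offset" count=8 2>/dev/null | tr -d '\000')
    [ "$magic" = "_BHRfS_M" ]
}

# Example (needs root on a real system):
#   for start in 64 2048; do
#       btrfs_magic_at /dev/nvme2n1 "$start" && echo "btrfs magic at start sector $start"
#   done
```

A match only says that a btrfs superblock sits at the expected offset for that start sector; it does not say the filesystem is intact.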

  • Author

root@Turtle:~# sfdisk /dev/nvme2n1

Welcome to sfdisk (util-linux 2.38.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Checking that no-one is using this disk right now ... OK

Disk /dev/nvme2n1: 931.51 GiB, 1000204886016 bytes, 1953525168 sectors
Disk model: CT1000P1SSD8                            
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

sfdisk is going to create a new 'dos' disk label.
Use 'label: <name>' before you define a first partition
to override the default.

Type 'help' to get more information.

>>> 2048
Created a new DOS disklabel with disk identifier 0xfc45e933.
Created a new partition 1 of type 'Linux' and of size 931.5 GiB.
/dev/nvme2n1p1 :         2048   1953525167 (931.5G) Linux
/dev/nvme2n1p2: 

  • Community Expert

Hit Ctrl+C to abort, then repeat, but this time with 64:

 

sfdisk /dev/nvme2n1

 

then type 64 and hit return and post a screenshot of the results.

  • Author
root@Turtle:~# sfdisk /dev/nvme2n1

Welcome to sfdisk (util-linux 2.38.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Checking that no-one is using this disk right now ... OK

Disk /dev/nvme2n1: 931.51 GiB, 1000204886016 bytes, 1953525168 sectors
Disk model: CT1000P1SSD8                            
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

sfdisk is going to create a new 'dos' disk label.
Use 'label: <name>' before you define a first partition
to override the default.

Type 'help' to get more information.

>>> 64
Created a new DOS disklabel with disk identifier 0xe90b059d.
Created a new partition 1 of type 'Linux' and of size 931.5 GiB.
Partition #1 contains a btrfs signature.

Do you want to remove the signature? [Y]es/[N]o: 

 

  • Community Expert

Type N plus return to keep the signature, then type w and hit return to save, then post the output of

btrfs fi show

 

  • Author

It's not letting me do the save part.

 

Do you want to remove the signature? [Y]es/[N]o: N
/dev/nvme2n1p1 :           64   1953525167 (931.5G) Linux
/dev/nvme2n1p2: w
unsupported command
/dev/nvme2n1p2: W
unsupported command
/dev/nvme2n1p2: 

 

  • Author

That worked. 

 

/dev/nvme2n1p2: write

New situation:
Disklabel type: dos
Disk identifier: 0xe90b059d

Device         Boot Start        End    Sectors   Size Id Type
/dev/nvme2n1p1         64 1953525167 1953525104 931.5G 83 Linux

The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
root@Turtle:~# btrfs fi show
Label: none  uuid: 849f7ded-6fbb-4f5f-9627-fa781a175567
        Total devices 2 FS bytes used 714.96GiB
        devid    1 size 931.51GiB used 826.48GiB path /dev/sdd1
        devid    2 size 931.51GiB used 826.48GiB path /dev/sdb1

Label: none  uuid: db14842a-e91c-4154-bdd8-fbe89dfc7ce3
        Total devices 1 FS bytes used 340.00KiB
        devid    1 size 20.00GiB used 536.00MiB path /dev/loop2

Label: none  uuid: 6331b51d-add2-421b-a015-22c674718eb5
        Total devices 1 FS bytes used 412.00KiB
        devid    1 size 1.00GiB used 126.38MiB path /dev/loop3

warning, device 2 is missing
Label: none  uuid: 9bceeb35-e26e-47d8-9ef3-3534abbaa204
        Total devices 2 FS bytes used 383.72GiB
        devid    1 size 931.51GiB used 884.05GiB path /dev/nvme2n1p1
        *** Some devices missing

*edited to show the part after 'write' as well
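
The "warning, device 2 is missing" and "*** Some devices missing" lines are what mark the recovered pool as incomplete. A small sketch for picking such filesystems out of longer btrfs fi show output follows; the function name missing_uuids is hypothetical, and the parsing assumes the output layout shown above.

```shell
# missing_uuids: hypothetical helper. Reads "btrfs fi show" output on stdin
# and prints the UUID of every filesystem flagged "Some devices missing".
missing_uuids() {
    awk '
        /^Label:/              { uuid = $NF }   # UUID is the last field of the header line
        /Some devices missing/ { print uuid }
    '
}

# Example (needs root on a real system):
#   btrfs fi show 2>&1 | missing_uuids
```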

Edited by Endy

  • Community Expert
  • Solution

I assume nvme2n1, the Crucial device, was the original cache? If so, we can try to mount it alone; if needed, you can still try to recover the other device afterwards.

 

To mount that device alone: unassign both pool members from the pool, start the array, stop the array, re-assign only the Crucial device, start the array, and post new diags.

  • Author

It's alive! As far as I can tell, everything seems to be there. Docker started up and seems to be working.

 

Am I out of the woods now?

If so, because I plan on just using the 2 new drives for cache and removing the Crucial drive, next step would be to temporarily move all data off of the cache pool? Then delete and recreate the cache pool using just the 2 new drives and then move the data back?

turtle-diagnostics-20231213-0735.zip

  • Community Expert

Everything looks good for now, but the pool is still balancing. When the pool activity stops, post new diags to confirm all is OK; if it is, you can then assign one of the new devices to the pool.

  • Author

Ok, it finished.

 

If I am planning on removing the Crucial drive, do I want/need to add one of the new drives? Just want to make sure I don't mess up again.

 

Also, thank you so much for your help JorgeB. While I wouldn't have lost any irreplaceable data, you have saved me from countless hours recreating and setting everything up. I truly appreciate it.

turtle-diagnostics-20231213-0936.zip

  • Community Expert
Dec 13 09:35:19 Turtle kernel: BTRFS error (device nvme2n1p1): bdev /dev/nvme2n1p1 errs: wr 0, rd 0, flush 0, corrupt 6, gen 0
Dec 13 09:35:33 Turtle kernel: BTRFS info (device nvme2n1p1): balance: ended with status: -5

 

Balance didn't finish because some data corruption was detected, which is possibly also why you had issues before adding the device. In this case you should not add the new device. Instead, create a new pool with the new device(s), copy everything you can, then remove that device.
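
The first log line carries per-device error counters (wr, rd, flush, corrupt, gen); here only corrupt is non-zero, at 6. The balance status -5 corresponds to errno 5, EIO (I/O error). A sketch for pulling the non-zero counters out of such log lines follows; the function name nonzero_btrfs_errs is hypothetical, and the parsing assumes the line format shown above.

```shell
# nonzero_btrfs_errs: hypothetical helper. Reads kernel log lines on stdin
# and prints "counter=value" for every BTRFS error counter above zero.
nonzero_btrfs_errs() {
    awk '
        /BTRFS/ && / errs: / {
            sub(/.* errs: /, "")              # keep only "wr 0, rd 0, ..."
            n = split($0, pair, ", ")         # one "name value" pair per counter
            for (i = 1; i <= n; i++) {
                split(pair[i], kv, " ")
                if (kv[2] + 0 > 0) print kv[1] "=" kv[2]
            }
        }
    '
}

# Example:
#   dmesg | nonzero_btrfs_errs
```

On a live system the same counters are also reported by btrfs device stats for a mounted pool.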

 

23 minutes ago, Endy said:

Also, thank you so much for your help JorgeB.

You're welcome.

  • Community Expert

Forgot to mention, it would also be a good idea to run memtest, just to make sure there are no obvious RAM issues.

  • Author

Hopefully small hiccup... I stopped the array and created a new pool for the 2 new drives and when I tried to start the array I got a message saying

 

"Wrong Pool State

cache - too many missing/wrong devices"

 

I didn't touch the original cache pool. I tried deleting the new pool and same message.

  • Community Expert

Post new diags, devices may need to be wiped first.

  • Community Expert
1 hour ago, Endy said:

I tried deleting the new pool and same message.

Missed this part, so this is about the old pool, possibly because the missing device failed to be removed. Try re-importing the pool again: unassign the old cache device, assign the new ones to a new pool now (because you may otherwise have the same issue later after an array stop), start the array, stop the array, re-assign the old cache, start the array.
