Replace drive in cache pool - "Never Started"? - General Support

October 2, 2017

Hi, I'm following the guide to replace one of the two SSDs in my cache pool, which has stopped working. I took out the failed drive, put the new one in, and started the array so I could back up the working cache drive. I did not change any assignments. The new blank drive is /dev/sdj.

When I run the status command, it says "Never started".

Diagnostics attached.

I can see these entries in the logs.

What should I do now?

Thanks!

Oct 3 00:54:40 Tower kernel: BTRFS info (device sde1): found 4352 extents
Oct 3 00:54:40 Tower kernel: BTRFS info (device sde1): relocating block group 2682845265920 flags 17
Oct 3 00:54:43 Tower kernel: BTRFS info (device sde1): found 2792 extents
Oct 3 00:55:09 Tower kernel: BTRFS info (device sde1): found 2792 extents
Oct 3 00:55:09 Tower kernel: BTRFS info (device sde1): relocating block group 2681771524096 flags 17
Oct 3 00:55:11 Tower kernel: BTRFS info (device sde1): found 4112 extents
Oct 3 00:55:39 Tower kernel: BTRFS info (device sde1): found 4112 extents
Oct 3 00:55:39 Tower kernel: BTRFS info (device sde1): relocating block group 2680697782272 flags 17
Oct 3 00:55:42 Tower kernel: BTRFS info (device sde1): found 2827 extents

root@Tower:/mnt/cache# sfdisk /dev/sdj

Welcome to sfdisk (util-linux 2.28.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

Checking that no-one is using this disk right now ... OK

Disk /dev/sdj: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

sfdisk is going to create a new 'dos' disk label.
Use 'label: <name>' before you define a first partition
to override the default.

Type 'help' to get more information.

>>> 64
Created a new DOS disklabel with disk identifier 0x20e308b5.
Created a new partition 1 of type 'Linux' and of size 931.5 GiB.
   /dev/sdj1 :           64   1953525167 (931.5G) Linux
/dev/sdj2: write

New situation:

Device     Boot Start        End    Sectors   Size Id Type
/dev/sdj1          64 1953525167 1953525104 931.5G 83 Linux

The partition table has been altered.
Calling ioctl() to re-read partition table.
Syncing disks.
root@Tower:/mnt/cache# btrfs fi show /mnt/cache
Label: none  uuid: 6110ce92-70ca-4bac-be5d-15972207af53
        Total devices 2 FS bytes used 603.77GiB
        devid    1 size 931.51GiB used 930.51GiB path /dev/sde1
        *** Some devices missing

root@Tower:/mnt/cache# btrfs device usage /mnt/cache
/dev/sde1, ID: 1
   Device size:           931.51GiB
   Device slack:              0.00B
   Data,single:           455.47GiB
   Data,RAID1:            471.00GiB
   Metadata,single:         2.00GiB
   Metadata,RAID1:          2.00GiB
   System,single:          11.00MiB
   System,RAID1:           32.00MiB
   Unallocated:             1.00GiB

missing, ID: 2
   Device size:               0.00B
   Device slack:           16.00EiB
   Data,RAID1:            471.00GiB
   Metadata,RAID1:          2.00GiB
   System,RAID1:           32.00MiB
   Unallocated:           458.48GiB

root@Tower:/mnt/cache# btrfs replace start 2 /dev/sdj1 /mnt/cache
root@Tower:/mnt/cache# btrfs replace status /mnt/cache
Never started
root@Tower:/mnt/cache# btrfs replace status /mnt/cache

tower-diagnostics-20171003-0053.zip

October 3, 2017

Community Expert

You need to wait for the current balance (going from 2 to 1 devices) to finish before you can add the new device.

Then you'll need to add the new device instead of replacing.
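A rough sketch of that sequence from the command line, assuming the pool is mounted at /mnt/cache and the new partition is /dev/sdj1 as in the first post. The function names `balance_state` and `add_after_balance` are made up for illustration. The matched strings are the usual `btrfs balance status` messages, so check them against your own output. The final convert-to-RAID1 balance is an assumption about how to restore redundancy by hand; assigning the device in the Unraid GUI may take care of that step for you.

```shell
# Hypothetical helper: classify the text printed by `btrfs balance status`.
# Reads the status text on stdin and prints idle, running, or unknown.
balance_state() {
    text=$(cat)
    case $text in
        *"No balance found"*) echo idle ;;
        *"is running"*)       echo running ;;
        *)                    echo unknown ;;
    esac
}

# Hypothetical wrapper: wait for the automatic balance to finish, then add
# the new device and convert data and metadata back to RAID1 (assumed step).
add_after_balance() {
    mnt=$1
    dev=$2
    while :; do
        state=$(btrfs balance status "$mnt" 2>&1 | balance_state)
        case $state in
            idle)    break ;;
            running) sleep 60 ;;
            *)       echo "unexpected balance status, stopping" >&2
                     return 1 ;;
        esac
    done
    btrfs device add "$dev" "$mnt" &&
        btrfs balance start -dconvert=raid1 -mconvert=raid1 "$mnt"
}
```

Nothing runs until the wrapper is called, for example with `add_after_balance /mnt/cache /dev/sdj1`. Anything other than a clear "running" or "no balance" answer stops the wrapper instead of guessing.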

October 3, 2017

Author

Thanks, the new one is being added now.

Why did it start a balance on its own? Should I have done anything differently?

Cheers!

October 3, 2017

Community Expert

9 minutes ago, al_uk said:

Why did it start a balance on its own?

When you start the array with a pool device missing, the missing device is removed from the pool and the pool is rebalanced onto the single remaining device.
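One way to see whether a pool has a missing member is to look for the missing-device marker in `btrfs filesystem show`. This is only a sketch: the name `pool_membership` is hypothetical, and it relies on the `*** Some devices missing` line that appears in the output in the first post.

```shell
# Hypothetical helper: reads `btrfs filesystem show` text on stdin and
# prints "missing" if the pool reports a missing device, else "complete".
pool_membership() {
    case $(cat) in
        *"Some devices missing"*) echo missing ;;
        *)                        echo complete ;;
    esac
}

# Example use on a live system (not executed here):
#   btrfs filesystem show /mnt/cache | pool_membership
```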

10 minutes ago, al_uk said:

Should I have done anything differently?

If you have enough ports you could've done a direct replacement without removing the old device, but the end result will be the same.

October 3, 2017

Author

OK, that clears that up, thanks!

I have enough ports, however the SSD failed, and was not being detected.

Even during the balance that is running now, there are lots of writes to the remaining working SSD.

My worry with all this balancing is that the extra writes could cause the remaining SSD to fail too.

Should I be concerned?

sde is the remaining working SSD

sdf is the new blank SSD that is being balanced at the moment.

Edited October 3, 2017 by al_uk

October 3, 2017

Community Expert

4 minutes ago, al_uk said:

Should I be concerned?

No, that's normal.

October 5, 2017

Author

Just to close this off, all completed successfully. Thanks for the replies.

So that is a full recovery with no data loss from a catastrophically failed SSD. It failed after a reboot, when it was no longer detected by the controller. The SSD is a 1TB Samsung 850 EVO that is just under 2 years old, and for the first year I was writing CCTV footage to it at 5MB per second. I think I've exceeded the write endurance covered by the warranty.
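As a rough check on that CCTV figure, the total written in the first year can be estimated with shell arithmetic. This assumes a constant 5 MB/s for a full 365 days and decimal units (1 TB = 1,000,000 MB); the function name `tb_written` is just for illustration. No rated endurance figure is quoted here, so compare the result against the TBW value on the drive's datasheet.

```shell
# Estimate whole terabytes written at a constant rate.
# $1 = write rate in MB/s, $2 = number of days.
tb_written() {
    rate_mb_s=$1
    days=$2
    total_mb=$(( rate_mb_s * 60 * 60 * 24 * days ))
    echo $(( total_mb / 1000000 ))   # decimal TB, rounded down
}

tb_written 5 365   # prints 157
```

So the CCTV writes alone come to roughly 157 TB in the first year, before counting any other cache activity or the rebalancing.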

Does the FAQ need updating to show that the balance will start automatically?

Label: none  uuid: 6110ce92-70ca-4bac-be5d-15972207af53
	Total devices 2 FS bytes used 602.05GiB
	devid    1 size 931.51GiB used 800.03GiB path /dev/sde1
	devid    2 size 931.51GiB used 800.03GiB path /dev/sdf1

Data, RAID1: total=796.00GiB, used=600.93GiB
System, RAID1: total=32.00MiB, used=144.00KiB
Metadata, RAID1: total=4.00GiB, used=1.12GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
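The output above shows every chunk type back on RAID1. A small sketch for checking that automatically, assuming the `btrfs filesystem df` line format shown above; the helper name `not_raid1` is hypothetical. It prints any Data, Metadata, or System line that is not RAID1 and ignores GlobalReserve, which is reported as single in the output above even though the pool is fully redundant.

```shell
# Hypothetical helper: reads `btrfs filesystem df` text on stdin and prints
# the Data/Metadata/System lines whose profile is something other than RAID1.
# Empty output means every chunk type is RAID1.
not_raid1() {
    awk '/^(Data|Metadata|System), / && !/, RAID1:/'
}

# Example use on a live system (not executed here):
#   btrfs filesystem df /mnt/cache | not_raid1
```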

October 6, 2017

Community Expert

8 hours ago, al_uk said:

Does the FAQ need updating to show that the balance will start automatically?

It does need some updating, especially because v6.4rc8 includes several cache enhancements. I'm just waiting for v6.4 stable to update it.
