Hangs on adding cache drive



I decided to add an additional, identical cache drive as RAID protection for cache 1.  After adding the drive and rebooting, I assigned the new drive to 'cache 2'.  Once I clicked 'start', it just hung on 'starting...'.

 

After 10 hours I decided to reboot.  The array still wasn't started, but now 'cache 2' also showed a green ball.  I clicked start, and again it just hangs on 'starting...'.  I can access my unraid data across the network, including the cache drive, but the array hasn't started, so no Docker or VMs.  Shares are working though.

 

Did I do something wrong?

 

I should also note that I muddied things by upgrading the CPU at the same time, and thus I reset my BIOS to factory defaults.  I made sure to turn virtualization back on.  I don't think there are any other BIOS settings that could have led to this problem, but figured I should mention it.

 

Still waiting after a few minutes for diagnostics to download...

Edited by curtis-r

Here you go:

 

root@Tower:~# btrfs fi usage -T /mnt/cache
Overall:
    Device size:                 223.58GiB
    Device allocated:            111.79GiB
    Device unallocated:          111.79GiB
    Device missing:                  0.00B
    Used:                         83.79GiB
    Free (estimated):            138.19GiB      (min: 138.19GiB)
    Free (statfs, df):           111.79GiB
    Data ratio:                       1.00
    Metadata ratio:                   1.00
    Global reserve:              226.67MiB      (used: 0.00B)
    Multiple profiles:                  no

             Data      Metadata  System              
Id Path      single    single    single   Unallocated
-- --------- --------- --------- -------- -----------
 1 /dev/sdi1 109.78GiB   2.01GiB  4.00MiB     1.02MiB
 2 /dev/sdb1         -         -        -   111.79GiB
-- --------- --------- --------- -------- -----------
   Total     109.78GiB   2.01GiB  4.00MiB   111.79GiB
   Used       83.38GiB 422.45MiB 16.00KiB   
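
For anyone checking their own pool: the figure that turns out to matter in a table like this is the per-device Unallocated column, not the overall free space. Below is a rough sketch of a filter that flags devices with very little unallocated space. The function name `low_unallocated` and the 1024 MiB default threshold are my own assumptions, not anything defined by btrfs or Unraid, and it only understands the B/KiB/MiB/GiB/TiB suffixes shown in this output.

```shell
# Hypothetical helper: reads `btrfs fi usage -T <mount>` output on stdin and
# prints each device row whose last column (Unallocated) is below a
# threshold given in MiB (default 1024, an arbitrary choice).
low_unallocated() {
  awk -v min_mib="${1:-1024}" '
    # Convert a value such as "1.02MiB" to MiB; returns -1 if the unit is unknown.
    function to_mib(v,    n, u) {
      n = v + 0                      # numeric prefix, e.g. 1.02
      u = v; gsub(/[0-9.]/, "", u)   # remaining unit suffix, e.g. MiB
      if (u == "B")   return n / 1048576
      if (u == "KiB") return n / 1024
      if (u == "MiB") return n
      if (u == "GiB") return n * 1024
      if (u == "TiB") return n * 1048576
      return -1
    }
    # Device rows look like: " 1 /dev/sdi1 109.78GiB 2.01GiB 4.00MiB 1.02MiB"
    $1 ~ /^[0-9]+$/ && $2 ~ /^\/dev\// {
      m = to_mib($NF)
      if (m >= 0 && m < min_mib) printf "%s unallocated=%s\n", $2, $NF
    }
  '
}

# Usage (on the server): btrfs fi usage -T /mnt/cache | low_unallocated 1024
```

Fed the table above, this should print only the /dev/sdi1 row, since /dev/sdb1 still has 111.79GiB unallocated.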

Dec 18 06:06:43 Tower kernel: BTRFS: error (device sdi1) in cleanup_transaction:1942: errno=-28 No space left

 

Thanks, I wanted to see that because of this "no space left" error: the filesystem still has free space, but the original device was fully allocated:

 

Unallocated
 1.02MiB

 

This is likely the cause of the issue, but to recover now it would probably be easier to reboot, then mount the pool read-only, back up and re-format.
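
A rough sketch of the "mount read-only and back up" part, in case it helps. Everything passed to the function is a placeholder: the device name, the mount point and the destination (here a folder on an array disk) are assumptions to adjust for your own system, and `rescue_copy` and `DRY_RUN` are names made up for this example. It defaults to a dry run that only prints the commands so they can be checked first.

```shell
# Print the command when DRY_RUN=1 (the default here), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical helper: rescue_copy <pool-device> <mount-point> <backup-dir>
rescue_copy() {
  dev=$1; mnt=$2; dest=$3
  run mkdir -p "$mnt" "$dest" &&
  run mount -o ro "$dev" "$mnt" &&   # read-only, so nothing is written to the full filesystem
  run rsync -a "$mnt"/ "$dest"/ &&   # copy the pool contents to the backup folder
  run umount "$mnt"
}

# Dry run, prints the four commands:
#   rescue_copy /dev/sdi1 /temp /mnt/disk1/cache_backup
# Real run, once the printed commands look right:
#   DRY_RUN=0 rescue_copy /dev/sdi1 /temp /mnt/disk1/cache_backup
```

If a plain `ro` mount fails, extra options such as the `degraded,usebackuproot,ro` and `rescue=all,ro` combinations seen elsewhere in this thread can be substituted in the mount line.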

 


mkdir /temp worked but

mount -o rescue=all,ro /dev/sdX1 /temp

returned

mount: /temp: special device /dev/sdX1 does not exist.


Also

mount -o rescue=all,ro /dev/mapper/sdX1 /temp 

returned

mount: /temp: special device /dev/mapper/sdX1 does not exist.

 

Do I need the array in a stopped state to do this?
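
For later readers: the "does not exist" errors above are what mount prints when the literal placeholder `sdX1` is passed instead of a real device name; the `X` has to be replaced with the letter of the actual pool device. A small sketch of a check that catches this before mounting is below. The function name `require_block_device` is made up for this example.

```shell
# Refuse to continue unless the argument is a real block device.
# /dev/sdX1 is only a placeholder; substitute the name that lsblk
# reports for the pool member.
require_block_device() {
  if [ -b "$1" ]; then
    return 0
  fi
  echo "not a block device: $1 (placeholder left in?)" >&2
  return 1
}

# Usage: list the candidates first, then mount only if the name is real.
#   lsblk -o NAME,SIZE,FSTYPE,MOUNTPOINT
#   require_block_device /dev/sdi1 && mount -o rescue=all,ro /dev/sdi1 /temp
```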


Ok, did a hard reboot.  Forgot to mention that it won't boot unless in safe mode.  It hangs on Starting Samba (see attached photo).

 

Again, tried to Start, but gives the BTRFS error of "no space left".

 

Tried your instructions, but there is no indication anything worked.

mkdir /temp
mount -o degraded,usebackuproot,ro /dev/sdi1 /temp

 

I decided to go the copy Cache, format, restore route.  Tried copying with MC but would always freeze at 2%.  Not sure if there is another way to copy the cache because it luckily seems like it's still all there.  Can you copy without mounting?

 

EDIT: When I removed the new/secondary cache drive, changed the pool to 1, and started the array, it starts. It says the cache is "Unmountable: not mounted".  So I then ran the above commands and they went through, but still says unmountable.

 

 

samba_error.jpg


Wow, what an ordeal.  But I'm so close I can taste it.  EDIT: SOLVED!

 

With the array stopped, I used this to wipe the original cache drive (I used 'i' in place of 'X'; anyone else following this, make sure you have the correct drive letter):

blkdiscard /dev/sdX -f

Then I started the array and clicked to format the cache pool, used MC to restore the backed-up cache data, and rebooted (default mode).
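
Since blkdiscard is destructive, here is a sketch of a sanity check that could be run before it. It only looks for the device in the mounts table; it does not prove the drive letter is the right one. The function name `is_mounted` is made up for this example, and the optional second argument exists only so the check can be tried against a sample file.

```shell
# Returns 0 if any entry in the mounts table starts with the given device
# path (so /dev/sdi also matches /dev/sdi1). This is a plain prefix match,
# so it may also match a longer sibling name; that only makes it more cautious.
is_mounted() {
  awk -v dev="$1" '
    index($1, dev) == 1 { found = 1 }
    END { exit !found }
  ' "${2:-/proc/mounts}"
}

# Usage, with the array stopped (sdX is a placeholder for the verified drive):
#   dev=/dev/sdX
#   if is_mounted "$dev"; then
#     echo "$dev is still mounted, not wiping"
#   else
#     blkdiscard -f "$dev"
#   fi
```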


Cache shows green balls & proper size, but VM says "Libvirt Service failed to start." and Docker "Docker Service failed to start."  Diagnostics attached.

 

EDIT: Turned off Settings/VM, then through MC deleted libvirt.img and copied it over again.  Did the same for docker.img.  They then started, and everything is back up and running!!!  Thank you for your help.  I still wonder what went wrong when I added the cache 2 drive, but I just want to move on.

tower-diagnostics-20231219-0853.zip

