Hangs on adding cache drive

curtis-r · December 18, 2023

I decided to add an additional, identical cache drive as RAID protection for cache 1. After adding the drive and rebooting, I assigned the new drive to 'cache 2'. Once I clicked 'start', it just hung on 'starting...'.

After 10 hours I decided to reboot. The array still wasn't started, but now 'cache 2' also showed a green ball. I clicked start, and again it just hangs on 'starting...'. I can access my Unraid data across the network, including the cache drive, but the array hasn't started, so no Docker or VMs. Shares are working though.

Did I do something wrong?

I should also note that I muddied things by upgrading the CPU at the same time, and thus I reset my BIOS to factory defaults. I made sure to turn virtualization back on. I don't think there are any other BIOS settings that could have led to this problem, but figured I should mention it.

Still waiting after a few minutes for diagnostics to download...

Edited December 18, 2023 by curtis-r

JorgeB · December 18, 2023

See if you can get the syslog:

cp /var/log/syslog /boot/syslog.txt
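Once the file is on the flash drive, the filesystem errors can be picked out of it with grep. This is only a minimal sketch: the function name `btrfs_errors` is made up for illustration, and the pattern assumes kernel messages that read `BTRFS: error ...` or `BTRFS error ...` (likewise for `critical` and `warning`).

```shell
# Hypothetical helper: print BTRFS error/critical/warning lines from a syslog file.
# Assumption: messages look like "BTRFS: error ..." or "BTRFS error ...".
btrfs_errors() {
    grep -E 'BTRFS:? (error|critical|warning)' "$1"
}

# Usage (path taken from the cp command above):
#   btrfs_errors /boot/syslog.txt
```

Informational lines such as `BTRFS info ...` are skipped on purpose, so the output stays short enough to paste into a post.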

curtis-r · December 18, 2023

That worked. Thanks. Hopefully you can make sense of it.

BTW, here is how it still looks.

syslog.txt

Edited December 18, 2023 by curtis-r

JorgeB · December 18, 2023

Pool is crashing on mount, see if you can get the output of:

btrfs fi usage -T /mnt/cache

curtis-r · December 18, 2023

Here you go:

root@Tower:~# btrfs fi usage -T /mnt/cache
Overall:
Device size: 223.58GiB
Device allocated: 111.79GiB
Device unallocated: 111.79GiB
Device missing: 0.00B
Used: 83.79GiB
Free (estimated): 138.19GiB (min: 138.19GiB)
Free (statfs, df): 111.79GiB
Data ratio: 1.00
Metadata ratio: 1.00
Global reserve: 226.67MiB (used: 0.00B)
Multiple profiles: no

             Data      Metadata  System
Id Path      single    single    single   Unallocated
-- --------- --------- --------- -------- -----------
 1 /dev/sdi1 109.78GiB   2.01GiB  4.00MiB     1.02MiB
 2 /dev/sdb1         -         -        -   111.79GiB
-- --------- --------- --------- -------- -----------
   Total     109.78GiB   2.01GiB  4.00MiB   111.79GiB
   Used       83.38GiB 422.45MiB 16.00KiB

JorgeB · December 18, 2023

Dec 18 06:06:43 Tower kernel: BTRFS: error (device sdi1) in cleanup_transaction:1942: errno=-28 No space left

Thanks, I wanted to see that because of this "no space left" error: the original device has free space, but the filesystem was fully allocated:

Unallocated
 1.02MiB

This is likely the cause of the issue, but to recover now it would probably be easier to reboot, then mount the pool read-only, back up, and reformat.
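For anyone who wants to check for this condition before adding a device to a pool, here is a rough sketch. It assumes the command is run with `-b` so sizes print as raw bytes, and that device rows have the six-field layout shown in this thread (Id, Path, Data, Metadata, System, Unallocated); other btrfs-progs versions may print extra columns, in which case the field number needs adjusting. The name `low_unallocated` and the 1 GiB default threshold are arbitrary choices, not anything Unraid or btrfs defines.

```shell
# Hypothetical helper: reads `btrfs filesystem usage -T -b <mountpoint>` on stdin
# and prints "<device> <unallocated bytes>" for every device below the threshold.
# Assumption: device rows are "Id Path Data Metadata System Unallocated".
low_unallocated() {
    awk -v min="${1:-1073741824}" '
        # Only device rows: numeric Id, /dev/ path, numeric Unallocated field.
        $1 ~ /^[0-9]+$/ && $2 ~ /^\/dev\// && $6 ~ /^[0-9]+$/ {
            if ($6 + 0 < min + 0) print $2, $6
        }'
}

# Usage (on the server):
#   btrfs filesystem usage -T -b /mnt/cache | low_unallocated
```

With the numbers from this pool, /dev/sdi1 would be listed because it has only about 1 MiB unallocated, while the empty /dev/sdb1 would not.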

curtis-r · December 18, 2023

Ok, will try mounting, backing up, and reformatting when I get home, and will report back. Thanks for your help!

curtis-r · December 18, 2023

mkdir /temp worked but

mount -o rescue=all,ro /dev/sdX1 /temp

returned

mount: /temp: special device /dev/sdX1 does not exist.

Also

mount -o rescue=all,ro /dev/mapper/sdX1 /temp

returned

mount: /temp: special device /dev/mapper/sdX1 does not exist.

Do I need the array in a stopped state to do this?

JorgeB · December 18, 2023

You need to replace X with the correct identifier, in this case i, so /dev/sdi1
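A small guard can catch a leftover placeholder like this before mount is even called. The sketch below is illustrative only; `require_block_device` is a made-up name, and it does nothing more than test that the path exists and is a block device.

```shell
# Hypothetical helper: succeed only if the argument is an existing block device.
require_block_device() {
    if [ -b "$1" ]; then
        return 0
    fi
    echo "not a block device: $1" >&2
    return 1
}

# Usage, with the device and mount point from this thread:
#   require_block_device /dev/sdi1 && mount -o rescue=all,ro /dev/sdi1 /temp
```

Running `lsblk` first should show which sdX names are actually present on the system.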

curtis-r · December 18, 2023

Ugh, sorry. The instructions were certainly clear. So now it appears to be working since nothing has returned yet. Should it take a while?

JorgeB · December 18, 2023

Do you mean the cursor is not back? There won't be any output if it mounts.

curtis-r · December 18, 2023

Correct, the cursor is not back. Nothing changed on the Main tab, so I tried to reboot, but neither the reboot button nor the "reboot" command works. I'm remoting in from work, so I'll try a hard reboot when I get home.

curtis-r · December 19, 2023

Ok, did a hard reboot. Forgot to mention that it won't boot unless in safe mode. It hangs on Starting Samba (see attached photo).

Again, tried to Start, but gives the BTRFS error of "no space left".

Tried your instructions, but there is no indication anything worked.

mkdir /temp
mount -o degraded,usebackuproot,ro /dev/sdi1 /temp

I decided to go the copy Cache, format, restore route. Tried copying with MC but would always freeze at 2%. Not sure if there is another way to copy the cache because it luckily seems like it's still all there. Can you copy without mounting?

EDIT: When I removed the new/secondary cache drive, changed the pool to 1 slot, and started the array, it started. It says the cache is "Unmountable: not mounted". So I then ran the above commands and they went through, but it still says unmountable.

Edited December 19, 2023 by curtis-r

curtis-r · December 19, 2023

Ok, some good news. After much trial and error, I was able to get MC to copy all the cache to an array disk. I guess I should format the cache next? And do I just use MC to copy back the cache data?

Edited December 19, 2023 by curtis-r

JorgeB · December 19, 2023

5 hours ago, curtis-r said:

I guess I should format the cache next? And do I just use MC to copy back the cache data?

Yes, and you can use mc.

curtis-r · December 19, 2023

Wow, what an ordeal. But I'm so close I can taste it. EDIT: SOLVED!

While array stopped, I used this to format the original cache drive (using 'i' in my case for 'X', but to others, make sure you have the correct drive letter)

blkdiscard /dev/sdX -f

Then started array & clicked to format the cache drive pool. Then used MC to restore the backed-up cache data. Rebooted (default mode)

Cache shows green balls & proper size, but VM says "Libvirt Service failed to start." and Docker "Docker Service failed to start." Diagnostics attached.

EDIT: Turned off Settings/VM, then through MC deleted libvirt.img and copied it again. Same goes for docker.img. Then they started and everything is back up and running!!! Thank you for your help. I still wonder what went wrong when I added the cache 2 drive, but I just want to move on.

tower-diagnostics-20231219-0853.zip

Edited December 19, 2023 by curtis-r
SOLVED!
