
Cache Issues (Solved)



Hello,


I was adding some additional power connections in my Unraid server (while the server was powered off). I didn't realize it at the time, but the power connection to one of my cache drives came loose. I powered on the server and, of course, received an error that a drive was missing.

 

I powered off the server, reconnected the power to the cache drive and booted up again.

 

When the server came back up, it saw both drives but only one was in the cache pool. I stopped the array and moved the other drive back into the cache. I believe the drives were in the wrong order at this point, because when I started the array it said the drive was now a new drive. I stopped the array again and reversed the order of the two drives. It no longer indicates that the drive is new; however, my cache is not working.

 

It states that one drive is unmountable with no file system.  The second device states it is part of the cache pool.

 

Both drives are 500GB Crucial SSDs and are supposed to be running in mirrored mode so that if one drive fails, things keep working. Suffice it to say I'm a little concerned at this point, as I am not seeing any of my VMs or Docker containers.

 

This is what I see for my cache pool now:

 

[screenshot of the cache pool as shown in the webGUI]

 

I've tried starting the array in maintenance mode to see if I can scrub the drives or anything of that nature, but the options aren't available. The check option was available for one of the disks, but it failed, stating there was no file system.
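From what I can tell, the GUI check corresponds to a read-only btrfs check run against the unmounted device; the command-line form (with /dev/sdb1 here purely as an example path) would be something like:

btrfs check --readonly /dev/sdb1    # read-only check, does not modify the device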

 

I've attached an anonymized diagnostics file as well in case that can be of assistance.

 

Any help anyone can offer is greatly appreciated!

hal9000-diagnostics-20210213-1718.zip


So I've been going through the forums, and there seems to be a known issue where a cache pool gets created with RAID1 for the data but single (non-redundant) mode for the metadata.

 

This thread describes a very similar issue to what I am having now.

 

What doesn't make sense is that the thread states this was a known issue in 6.7+ but was resolved in 6.8.1-rc. I created my cache pool using 6.8.3, so the fix should have been in place, but perhaps it has regressed?
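From what I've read, the profiles can be verified on a pool that still mounts, and the metadata converted to RAID1 with a balance; assuming Unraid's default mount point of /mnt/cache, something like:

btrfs filesystem df /mnt/cache                    # look for "Data, RAID1" vs "Metadata, single"
btrfs balance start -mconvert=raid1 /mnt/cache    # convert the metadata profile to RAID1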


It happened because the array was started with just a single device and there were already a lot of errors, likely from a pool device dropping offline previously. The balance didn't finish because of the errors, or it was canceled, and when you added the second device back it was wiped; there would have been a warning in red, "all data on this device will be deleted at array start".
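Those device error counters can be checked on a pool that still mounts with something like (mount point as an example):

btrfs dev stats /mnt/cache    # per-device write/read/flush/corruption/generation error counters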

 

You can try this on the wiped device (sdb in the log):

 

btrfs-select-super -s 1 /dev/sdX1

 

If the command is successful, make Unraid "forget" the pool by starting the array without any cache devices assigned. Then stop the array, re-assign only sdb, and see if it mounts. If it does, wipe sdc with blkdiscard and add it back to the pool; if it doesn't, there are some recovery options here.
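A rough command-line sketch of those steps, using the same device letters as the log (treat the paths as examples and double-check them against your own diagnostics; the pool assignments themselves are done in the Unraid webGUI):

btrfs inspect-internal dump-super /dev/sdb1    # confirm the restored superblock looks sane
blkdiscard /dev/sdc                            # wipe the device to be re-added (destructive!)
btrfs balance status /mnt/cache                # after re-adding sdc and starting the array, watch the balance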

  • -=Striker=- changed the title to Cache Issues (Solved)
