Blew up my cache pool, need help with rescue


ainuke


So, after upgrading my old gaming mobo/CPU to server equipment, I broke the first rule of unRAID: don't open the box with the array active. Everything was working fine with the new hardware, but I wanted to take some measurements to print fan shrouds for the CPU heatsinks. I opened the case and, to my horror, dislodged one of the power cables to the drive cage holding the cache drives, along with a data cable to one of the array drives. The cables must not have been seated properly. I was able to do a parity rebuild on the array drive (it showed a red X and demanded I fix it).

 

But the cache pool was hosed. Cache2 said "unmountable: no file system"; cache1 said "part of pool" or some such.

 

Any ideas on how I can recover from this?

If I try to use cache1 as the sole cache drive, unRAID wants to format it as a fresh drive.

If I try to use cache2 as the sole cache drive, unRAID says it's unmountable. If I try to mount it through the console using

mount -o usebackuproot,ro /dev/sdb1 /x

I get 

mount: /x: wrong fs type, bad option, bad superblock on /dev/sdb1, missing codepage or helper program, or other error.
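(A few things I haven't tried yet, but which I gather are the usual next checks: see what the kernel actually logged for that mount attempt, and whether the superblock copies on the disk still look intact.)

# See the real btrfs error behind the generic "wrong fs type" message
dmesg | tail -n 30

# Dump every superblock copy on the device; -f = full dump, -a = all copies
btrfs inspect-internal dump-super -f -a /dev/sdb1

# If the primary superblock is bad but a backup copy is good, this can repair it
btrfs rescue super-recover -v /dev/sdb1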

 

I have also tried 

btrfs restore -v /dev/sdb1 /mnt/disk2/restore

and get

bad tree block 2414535622656, bytenr mismatch, want=2414535622656, have=0
Couldn't read tree root
Could not open root, trying backup super
bad tree block 2414535622656, bytenr mismatch, want=2414535622656, have=0
Couldn't read tree root
Could not open root, trying backup super
bad tree block 2414535622656, bytenr mismatch, want=2414535622656, have=0
Couldn't read tree root
Could not open root, trying backup super
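(Also untested on my end, but I gather that when restore can't read the tree root, btrfs-find-root can sometimes locate an older root that restore can then be pointed at with -t; the bytenr below is just a placeholder for whatever find-root reports.)

# Scan the device for older/alternate tree roots (can take a while)
btrfs-find-root /dev/sdb1

# Point restore at one of the "Well block" bytenrs reported above
btrfs restore -t <bytenr-from-find-root> -v -i /dev/sdb1 /mnt/disk2/restore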

 

 

I currently have a USB drive set up as the cache so I can do whatever is needed with the cache drives as unassigned devices.

I have all my appdata on the cache pool, and a backup from August. For some reason I thought I was backing up automagically, but no.

 

So what's my next move? 

Erik

Link to comment
38 minutes ago, trurl said:

If possible before rebooting, and preferably with the array started:
Go to Tools - Diagnostics and attach the complete Diagnostics ZIP file to your NEXT post in this thread.

Thank you for the response. Attached are the diagnostics.

I had a hard time getting the array to shut down so I could unassign the borked disk before the parity rebuild. I think whatever was going on with the cache drives was causing the stop-array operation to hang while unmounting the disks.

orthanc-diagnostics-20201219-2150.zip

Link to comment
8 hours ago, JorgeB said:

Looking at the syslog it appears that the pool was not redundant (possibly due to a bug if it was created on v6.7) and one of the devices is missing/wiped, so it's difficult to recover any data.

That sucks so bad. The cache pool was created on 6.8.3; I've only been using unRAID since about July '20, and the cache pool showed the capacity one would expect from RAID 1 (~480GB available from two 480GB drives). Pretty frustrated, given that this was the exact reason I had two cache drives.

Link to comment
21 minutes ago, ainuke said:

Pretty frustrated, given that this was the exact reason I had two cache drives. 

Any indication on where I went wrong setting up the cache pool?

As I recall, I had a working single-disk cache, then added a second drive, and there was a lot of copying until the pool showed half the combined physical capacity of the drives. Does this sound like what I should've expected? I ask because I'm about to rebuild, from scratch, all the data and dockers I started with; it turns out that the "appdata" and "flash" directories in my backup folder are empty. I was absolutely sure I had backups...
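(For anyone else wondering the same thing: as I understand it, you can check from the console whether the pool is actually raid1, and convert it if it isn't. This assumes the pool is mounted at /mnt/cache, and it's something I only learned after the fact.)

# Shows the data/metadata profiles; a redundant pool reports RAID1 rather than single
btrfs filesystem df /mnt/cache

# If it reports single, convert both data and metadata to raid1 while the pool is mounted
btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/cache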

 

I do still have the docker templates, but who knows what state those are going to end up being in? 2020 isn't done with us yet...

Link to comment
1 hour ago, trurl said:

Docker templates are on flash and they can be used to reinstall your dockers using the Previous Apps feature on the Apps page, but of course without appdata the applications themselves will be starting over.

Thanks. I've been reinstalling them from the docker templates, which thankfully have all the port and path mappings intact. But yes, I need to go through and re-configure everything inside the containers. The Plex database isn't that big a deal to re-scan, minus watched status etc., and thankfully I didn't have much set up in mariadb, but I'm going to catch it now for my kids' Minecraft worlds, which are mostly gone... I'll likely have more questions about best practices for the Minecraft data, both for protection and quick access. I've had it all on the cache pool thus far, but maybe I should put it on a share that utilizes array/cache/mover? I have the appdata plugin properly configured now, but I'm not sure if keeping all the Minecraft stuff on the cache and backing it up regularly is going to devour my available disk space. Seems like a post in and of itself...
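(One idea I'm toying with, untested and with placeholder paths: a scheduled rsync of just the Minecraft data from the cache to an array share, so it gets copied between full appdata backups. Since rsync only transfers what changed and mirrors the source, repeated runs shouldn't eat much extra space.)

# Mirror the Minecraft data from the cache to an array share (paths are examples only)
rsync -a --delete /mnt/cache/appdata/minecraft/ /mnt/user/backups/minecraft/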

 

Anyway, thanks for your help.

Link to comment

Just lost my cache disk last night; an NVMe stick just up and died on me. Furious that I lost data, like yourself, and I swore to get dual SSDs to do a cache pool. Yet now, researching the subject and reading about your experience as well, it seems a waste of a second drive. I'm currently looking for a better way of handling the data on the cache so that if there is a drive failure I don't lose my data. I use the CA Backup appdata plugin, but since it has to shut down the dockers it interrupts some of my programs even when it runs overnight. I also have a large Plex setup that takes forever to back up. That's why something like the cache pool would be ideal. Unfortunately it doesn't perform as advertised, as in actually providing reliable redundancy for the cache drive.

Link to comment
19 hours ago, ainuke said:

I have the appdata plugin properly configured now, but am not sure if keeping all the Minecraft stuff on the cache and backing it up regularly is going to devour my available disk space

I keep a single, weekly appdata backup on the array (in a non-cached share). That shouldn't consume much space.

 

It protects against cache drive failure but not complete system failure. For that (and versioning) I use Duplicacy to back up the backup share to an external drive.
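(If it helps, the basic Duplicacy CLI flow looks roughly like this; the paths below are placeholders rather than my actual layout.)

# Run once from inside the share you want protected, pointing at storage on the external drive
cd /mnt/user/backups
duplicacy init backups /mnt/disks/external/duplicacy

# Then run (or schedule) the actual versioned backup
duplicacy backup -stats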

Link to comment
