Need some help restoring cache pool


DBJordan


Running 6.8.0. I have two NVMe devices that I use as a cache pool. One of them seems to be flaky, and the system got into a pretty unresponsive state. I was lucky to be able to attempt a graceful reboot (I did get a diagnostic dump from this; see truesource-diagnostics-20191215-2222.zip in the attachments).

 

When the box came up, the flaky drive had simply vanished from the configuration and the other drive in the pool was holding down the fort. I could still get to the files in the cache pool, so I started a rescue attempt: I kicked off the Mover, planning to follow it with VM and Docker backups and then another Mover run to make sure everything was on spinning disks. Sometime during the Mover run, the Unraid system became unresponsive: it wouldn't let me SSH in and wouldn't even return pings. Even the physical keyboard and display were unresponsive, so I powered it off ungracefully. Since I couldn't do anything with it in that state, I was unable to retrieve diagnostics.

 

Upon reboot, it now shows the drive that was part of the cache pool still in its place, but the pool itself doesn't seem to exist any more. Also, Unassigned Devices now lists the drive that dropped off earlier. Here's a picture:

image.thumb.png.d329ce03db9d78c47438fad01368e987.png

 

Also, it's picked up Mr Flaky Drive down in unassigned devices.

image.png.f29b157b59880187e29091d08fe0e0bd.png

 

At this point, I took another diagnostics snapshot.

truesource-diagnostics-20191216-0106.zip

 

I'm not going to mess with Mr Flaky Drive anymore. He's dead to me and I'm ordering a replacement. But can someone help me find a way to get Unraid to recognize the cache that's surely still sitting on my Cache 2 btrfs drive? I have done no formatting or experimentation. I came here first.

 

Thanks for any help. Please let me know if you have any questions or recommended reading.

truesource-diagnostics-20191215-2222.zip

Link to comment

So... back when Mr Flaky wasn't showing up in Unassigned Devices, my cache pool was showing up. So I uninstalled the Unassigned Devices plugin, and now it looks like this:

 

image.thumb.png.71cd893cd9414b3148f694b3a618c0a2.png

 

The cache pool is back as a pool of one device. Not sure why uninstalling Unassigned Devices left the flaky guy at the bottom, but I'm at least able to move my stuff onto the spinning disks and make backups.

 

Ironic. The timed VM backups are supposed to occur about now, and the timed docker and other app data would have occurred 24 hours from now. Just. Too. Late.

 

Would still like any theories on what's going on or how to fix it without stumbling around so much if anyone has thoughts. Latest diags attached.

truesource-diagnostics-20191216-0123.zip

Link to comment

 

5 hours ago, Benson said:

According to the last picture, the NVMe mounted successfully with 198GB of data. You could ignore what the dashboard is showing.

 

Are the files there?

 

Oh sorry, I wasn't clear. Yes, I can get to the pool drive now (contrary to the earlier pic) and have happily backed up and moved everything on it to something more stable. I'll check that other thread.

 

I'm not sure HOW I got one of the two drives to show up and offer up its data, but I'm glad it's limping along enough for me to not lose the last few days of work.

Link to comment
