Missing VMs after cache drive upgrades - I messed this up somehow


Solved by JorgeB

OK, here's what happened: I purchased two used Intel PCIe SSDs and replaced my existing SSDs as cache drives.  I then upgraded to 6.10.  I am pretty sure this has nothing to do with the upgrade and everything to do with the cache disk replacement.  Somehow, I have lost two of my three VMs; the third is stored on a ZFS array.  The two affected VMs were stored on the cache drives.  How can I get the disk images off the old cache drives?  They were a btrfs mirror, and they are still installed but will not mount.  Diags attached.  The old cache drives are sdk and sdl.

 

Adam

beast-diagnostics-20220521-2201.zip

  • Solution

The server was rebooted in the middle, so I can't see everything that happened.  The current cache is mounting, but it's empty, so possibly something went wrong during the replacement.

 

Also note that you have device 81:00 bound to vfio-pci.  81 is now one of the NVMe devices, and although it's a different device and wasn't actually bound, the bind still brought the device offline.  Delete that bind and run a correcting scrub on the pool.
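To check what is actually sitting at 81:00 now, and to run the scrub from the console rather than from the pool's page in the GUI, something along these lines should work (/mnt/cache is the usual mount point, adjust it if your pool is named differently):

lspci -s 81:00.0 -k              # show the device at 81:00 and which kernel driver has claimed it
btrfs scrub start /mnt/cache     # correcting scrub: repairs bad copies from the good mirror member
btrfs scrub status /mnt/cache    # check progress and whether any errors were corrected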

 

As for the old data, I assume this device was part of the original pool?

 

C300-CTFDDAC128MAG_00000000102202FBC295 (sdl)

 

If yes, this also suggests the replacement wasn't done correctly, since there's still a btrfs filesystem there; it should have been wiped during the replacement.  If this was part of the old pool and the pool was redundant, you should be able to mount it by assigning it alone to a new pool (not with UD).  If it doesn't mount in a new pool, post new diags, as it may still be possible to mount it manually along the lines of the commands below.
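If it does come to a manual mount, something like this is the usual approach for a member of a redundant btrfs pool.  The partition number and mount point here are only examples, so adjust them to what the diags show:

btrfs filesystem show                 # confirm the old pool's UUID and which device(s) carry it
mkdir -p /temp
mount -o ro,degraded /dev/sdl1 /temp  # read-only, tolerating the missing mirror member
ls /temp                              # copy the vdisk images off, then umount /temp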


JorgeB, you are a gentleman and a scholar.  Sure enough, after installing the PCIe NVMe SSDs, the PCI address of that resource had changed, and one of the NVMe SSDs was being assigned to my Windows 10 VM instead of the USB 3.1 card I intended.  Once I swapped that around, it fixed the NVMe disappearing when I started the VM (which I hadn't correlated until you mentioned it).  I was then able to have both NVMe SSDs in the cache pool.  Lastly, I was able to create a temp pool with one of the old cache drives, mount it, and copy off the data I needed.  However, I do have a question: I thought data on the cache was supposed to be flushed nightly to the array by the mover, is that not correct?  The data wasn't on my array, only on the old cache drive.

1 minute ago, cadamwil said:

I thought data on the cache was supposed to be flushed nightly to the array by the mover, is that not correct?

Data will only be moved to the array if Use cache is set to Yes.  If it's set to Prefer, data stays on the cache drive, which is what you normally want for appdata for Docker and the image files for VMs.
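Roughly how the per-share Use cache setting interacts with the mover (worth double-checking against the built-in help on the share settings page):

Yes - new writes land on the cache, and the mover flushes them to the array on its schedule.
Prefer - files are kept on the cache; the mover moves them from the array back to the cache if they end up there.
No - writes go straight to the array; the mover ignores the share.
Only - cache only; the mover ignores the share.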

28 minutes ago, SimonF said:

Data will only be moved to the array if Use cache is set to Yes.  If it's set to Prefer, data stays on the cache drive.

What would a typical setup be for appdata, domains, and system?  In my setup, all are set to Prefer.  I do have a btrfs mirrored cache pool, but I also have space on the array, as my real main storage is my ZFS array.  Any downside to changing those from Prefer to Yes?

6 minutes ago, cadamwil said:

Any downside to changing those from Prefer to Yes?

Most people run them on a pool, i.e. the cache, and it's best to leave them as Prefer, since cache pools are normally built from better-performing devices (NVMe, SSD).  Keeping those shares off the array also avoids stopping the array drives from spinning down.  Note that the Docker and VM manager services need to be disabled for the mover to move those shares, as the files will otherwise be open; a rough outline of that is below.
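If you do decide to flush them to the array, the sequence is roughly this (from memory, so double-check each step in the GUI):

1. Disable the Docker service (Settings > Docker) and the VM manager (Settings > VM Manager) so nothing holds the files open.
2. Change the shares from Prefer to Yes.
3. Run the mover from the Main page, or run mover from the console.
4. Once it finishes, re-enable the services (and set the shares back if you change your mind).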

 

I am not really a ZFS user, so I can't advise on that side, but I have seen issues reported with running Docker on ZFS.

