Tried to back up my VM image, now it won't boot (UEFI shell)


judyio


I was running Home Assistant in a VM successfully for months. Realized I needed a better backup solution for rollbacks, and found the vmbackup plugin. Perfect. Installed that, started to point it to my Home Assistant disk. Read through the help text on a bunch of options, and saw this about snapshots:

 

Quote

Snapshots will be used when backing up VMs to prevent them from needing to be shutdown.

WARNING: This will fail if the config path for the virtual disk is /mnt/user/. you must use /mnt/cache/ or /mnt/diskX/ for snapshots to work.

 

Oh crap. I didn't know this was going to shut down Home Assistant every time it ran. And I change things in HA regularly, so I was hoping to back it up every night. I better use snapshots. But my hassos.qcow2 file was on the disk array rather than cache, because I only had one cache drive. But no fear, I'm about to have vmbackups!

 

  1. I shut down the Home Assistant VM. No reason to believe it didn't shut down correctly. (I had turned it off and back on a week earlier to pass-through a USB Zigbee stick, and it came back online that time.)
  2. I changed the preferences of my domains share to "Prefer" on being stored on the cache.
  3. Rather than wait for the mover or trigger it manually, I SSHed in to Unraid and copied the file from /mnt/user/domains/HassOS/hassos.qcow2 to /mnt/cache/domains/HassOS/hassos.qcow2. I believe I double-checked the permissions and ownership to make sure they were the same.
  4. I changed the VM settings to point to the new location at /mnt/cache/domains/HassOS/hassos.qcow2.
  5. Turned on the VM.
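
For anyone repeating step 3, here is a minimal Python sketch of how the copy could be checked before the VM is repointed. It assumes the VM is shut down so the image is not changing. The helper names `sha256_of` and `compare_copy` are made up for this illustration and are not part of Unraid or any plugin. It only confirms that the copy matches the source; it does not diagnose why a VM fails to boot.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so a large disk image never has to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_copy(source, copy):
    """Return a list of differences between a file and its copy (empty if none)."""
    problems = []
    src_stat, dst_stat = os.stat(source), os.stat(copy)
    if src_stat.st_size != dst_stat.st_size:
        problems.append("size differs")
    if (src_stat.st_uid, src_stat.st_gid) != (dst_stat.st_uid, dst_stat.st_gid):
        problems.append("owner/group differs")
    # Compare only the permission bits, not the file-type bits.
    if (src_stat.st_mode & 0o7777) != (dst_stat.st_mode & 0o7777):
        problems.append("permissions differ")
    if sha256_of(source) != sha256_of(copy):
        problems.append("content differs")
    return problems
```

With the paths from the steps above, the call would be `compare_copy("/mnt/user/domains/HassOS/hassos.qcow2", "/mnt/cache/domains/HassOS/hassos.qcow2")`; an empty list means size, ownership, permissions, and content all match.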

 

The green light turned on, but I couldn't access Home Assistant. Used VNC and just saw this:

 

[Screenshot attachment: VNC console showing the UEFI shell]

 

That's not good. I think I might have screwed up when copying from /mnt/user instead of finding the actual disk and copying from there. I also remember when I first set up the VM following these instructions that I ran qemu-img to resize the disk. It worked great, though I don't know much about what it did. Could that have screwed it up? I tried to undo my mistake:

 

  • Stopped the VM, pointed it back at /mnt/user/domains/HassOS/hassos.qcow2, and restarted. Didn't work, same screen as above.
  • I found the original image on disk3. Stopped the VM, pointed it at /mnt/disk3/domains/HassOS/hassos.qcow2, and restarted. Didn't work either, same screen as above.
  • Stopped the VM, created a brand new VM with the same settings, pointed it at the above location. No dice. WTF??
  • Googled for a couple hours, all I could find are people that have similar issues but can still see "FS0:" in the Mapping Table and are able to repair things from there.

 

I'm at a loss. Wondering if I need to recreate my entire HA setup from scratch now... 😰 Or at least if there's some way to mount the qcow2 image so I can retrieve my HA configuration when recreating the VM. But I thought I'd ask here first and see if anyone has an idea of how to save my original image. Please help, you're my only hope!

 

 

Link to comment
On 2/6/2021 at 3:39 AM, judyio said:

Wondering if I need to recreate my entire HA setup from scratch now... 😰

I had the same issue: backing up libvirt.img and the qcow2 image wasn't enough to back up and restore the HA VM. By the way, HA's built-in backup function solves the problem easily. I imported a backup from a bare-metal Pi into the HA VM; it took just a few clicks, with no need to start from scratch.

 

Always make the backup from within HA itself; this always works and saves a lot of time.

Edited by Vr2Io
Link to comment

Well, I feel a bit sheepish. On my setup, it looks like libvirt can't mount images from anywhere other than /mnt/user. I copied the same image to shares on /mnt/cache and /mnt/disk*, and none of those work. I just get the UEFI Interactive Shell and BLK0: in the Mapping Table. Either that, or it doesn't like you changing the location of an image once the VM has been created. Not sure.

 

I created a new VM, used the original image in /mnt/user... and it booted right up. So much time spent on figuring that out. Ugh.

 

11 hours ago, Vr2Io said:

Always make the backup from within HA itself; this always works and saves a lot of time.

 

Yeah, I think this is the most reasonable option at this point. I originally thought taking snapshots of the entire VM once a night would be simplest, and it turned out not to be. I'm going to explore automated backups from within HA. Thanks!

Link to comment
  • 6 months later...
  • 3 months later...

Just got this error when I tried to move the qcow2 image to a different folder. The reason for this error is that Unraid, for some weird reason, changes the type from "qcow2" to "raw" (in the XML). Change this back to "qcow2" and it should boot as normal (at least it did for me).

 

Quote

    <emulator>/usr/local/sbin/qemu</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source file='/mnt/user/domains/ha/ha.qcow2' index='1'/>

change to this:

    <emulator>/usr/local/sbin/qemu</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='writeback'/>
      <source file='/mnt/user/domains/ha/ha.qcow2' index='1'/>
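
The same change can be sketched in Python for anyone who wants to script it against an exported copy of the XML. This is only an illustration under assumptions: the function name `set_disk_driver_type` and the trimmed `SAMPLE_XML` are made up here, and a real domain definition has many more elements. The result would still have to be applied through Unraid's XML view or `virsh edit`. Note that `ElementTree` drops XML comments and may change the quoting style when it writes the document back out.

```python
import xml.etree.ElementTree as ET


def set_disk_driver_type(domain_xml, image_path, new_type="qcow2"):
    """Return (updated XML, number of disks changed) for the disk backed by image_path."""
    root = ET.fromstring(domain_xml)
    changed = 0
    for disk in root.iter("disk"):
        source = disk.find("source")
        driver = disk.find("driver")
        if source is None or driver is None:
            continue
        # Only touch the disk whose source file matches, and only if it is wrong.
        if source.get("file") == image_path and driver.get("type") != new_type:
            driver.set("type", new_type)
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


# Hypothetical, heavily trimmed domain definition for illustration only.
SAMPLE_XML = """
<domain type='kvm'>
  <devices>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source file='/mnt/user/domains/ha/ha.qcow2'/>
    </disk>
  </devices>
</domain>
"""

fixed_xml, changed = set_disk_driver_type(SAMPLE_XML, "/mnt/user/domains/ha/ha.qcow2")
print(changed, "disk(s) updated")
```

Matching on the `source` path rather than changing every disk keeps any genuinely raw disks in the same VM untouched.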

 

Edited by alpha
Link to comment
  • 1 month later...
  • 1 month later...

Bitten by the same issue of non-booting (though not trying to make a backup). Problem/fix compounded by not being able to shut down the VM instance while it was stuck in the UEFI Shell - so no ability to edit the XML.

 

Waiting on the whole system to come back after trying to reboot the server via Unraid menu.

 

EDIT: Serious scare when the system came back up. The NVMe cache drive holding all VMs didn't show up, which made all VMs completely inaccessible. A shutdown and reboot brought it back and everything was as expected, including the problem VM, which was currently set to qcow2 instead of the raw disk type. I didn't manually change anything; it just showed up again like it had always been. The entire fiasco is strange.

 

 

Edited by Espressomatic
Link to comment
  • 2 months later...

I was having this exact same issue but on a brand new VM with a fresh HASSOS qcow2 file (straight from Home Assistant's website) and couldn't for the life of me figure out why it wouldn't boot (since past UEFI boot issues have been due to Home Assistant OS updates breaking its own boot files but this was a fresh image file).  I checked the XML view and sure enough it was set to raw instead of qcow2 for some reason.  I changed that and it boots up just fine.  I wonder what is causing unRAID to pick the wrong disk type.
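
One way to confirm what format an image really is, independent of its file name or of what the XML claims, is to look at the file header. QCOW images start with the four magic bytes `QFI\xfb`, followed by a big-endian 32-bit version number (2 or 3 for qcow2). The sketch below is only an illustration, and `is_qcow2` is a name made up for it. A raw image has no header at all, so the check can only say "qcow2" or "not qcow2". Running `qemu-img info` on the file should report the same thing.

```python
import struct

QCOW_MAGIC = b"QFI\xfb"  # first four bytes of a QCOW image header


def is_qcow2(path):
    """Return True if the file header identifies it as a qcow2 (version 2 or 3) image."""
    with open(path, "rb") as handle:
        header = handle.read(8)
    if len(header) < 8 or header[:4] != QCOW_MAGIC:
        return False
    # Bytes 4..7 hold the format version as a big-endian unsigned 32-bit integer.
    (version,) = struct.unpack(">I", header[4:8])
    return version in (2, 3)
```

If this returns True for an image whose disk entry in the XML says `type='raw'`, the mismatch described above is the likely reason the VM drops to the UEFI shell.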

Link to comment
  • 1 month later...
On 5/28/2022 at 10:42 AM, Warp3 said:

I was having this exact same issue but on a brand new VM with a fresh HASSOS qcow2 file (straight from Home Assistant's website) and couldn't for the life of me figure out why it wouldn't boot (since past UEFI boot issues have been due to Home Assistant OS updates breaking its own boot files but this was a fresh image file).  I checked the XML view and sure enough it was set to raw instead of qcow2 for some reason.  I changed that and it boots up just fine.  I wonder what is causing unRAID to pick the wrong disk type.

Likely the file extension, although I think manual selection just defaults to `raw`. I wish Unraid would fix this bug or add a drop-down list to the UI.

Link to comment

So I've been trying to get a Home Assistant VM going, and even tried a Windows VM, and was having the same issue as above. I go to the XML, and it shows raw, but I change it as instructed and update, and it still doesn't load. So I go to check it again, and it shows raw again; for some reason it's not staying updated.

 

Any recommendations? 

Link to comment
  • 1 year later...
On 12/30/2021 at 10:49 PM, alpha said:

Just got this error when I tried to move the qcow2 image to a different folder. The reason for this error is that Unraid, for some weird reason, changes the type from "qcow2" to "raw" (in the XML). Change this back to "qcow2" and it should boot as normal (at least it did for me).

 

I'm actually shocked that it's nearly 2024 as I write this - and I just spent the better part of an hour figuring out what on earth was going on. And this was it. Sorry, but..... this is pretty unacceptable to remain unfixed in a commercial product for so long :x

Edited by amapo
Link to comment
