Is there a way to just start over without losing my paid OS Plus license?



Hi. I'm a noob who had an Unraid setup working on an Aorus Master B550 motherboard but wanted to upgrade to an Asus X570 Strix-E that has both 2.5 Gb and 1 Gb LAN as well as 8 SATA ports. However, I wasn't able to get Unraid to boot in legacy mode on the new board as I had been doing with the same USB drive on the previous B550 board.

 

I tried deleting the secure boot keys and setting the OS type to 'Other OS' in the boot options, but that didn't help; I only got a black screen when I could get past the BIOS.

The only thing that worked was deleting the hyphen from the EFI- folder so it reads just EFI on the boot flash drive. That gets Unraid to start loading, but then the screen goes black with only a blinking _ in the top-left corner, and I have to pull up Unraid via the web GUI from another system, which then works as normal.

 

I had both a Windows 11 and a Garuda Linux VM working on the previous build, and they also worked on this new setup after reassigning the GPU for passthrough along with the USB controller, keyboard, and mouse per the video I watched, with one or two exceptions where I had to reboot to get the VMs to work.

 

Now I can't get either VM to boot, even though neither previously had a problem with PCI passthrough of the graphics card, a 1080 Ti passed through for both video and audio. Neither shows anything on screen, and I haven't made any other changes to the VMs.

 

And as I was typing this topic I got the error message that /var/log is getting full, and I haven't even quite gotten past the 30-day trial period. I'm assuming these things are related. Here is a copy of the diagnostics in case it's helpful.

 

It's possible that part of the problem is that I also added a second cache disk/pool to store the VMs and related data, per a video I watched PRIOR to moving the server to the new motherboard. I copied the data to the new disk with Binhex Krusader and was able to get the VMs to boot late last night. This morning neither VM will boot; they show as started but will only shut down when forced.

 

I want to troubleshoot more but have to go to work (13 or more hours today) and will check back later. I'm hoping I can keep and make use of the X570 motherboard, since it has 8 SATA ports, but I'll be ready to do my first Amazon return tomorrow if it doesn't pan out. Maybe I'll start from scratch if the problem is more extensive than I think, or I may consider making a dedicated Unraid server and just dual-booting Linux and Windows on another PC. Thanks so much for any reply and advice.

rushtallica-diagnostics-20211118-0503.zip


Hi, you should bind the following IOMMU groups to vfio: 35, 36, 42.

Once they are bound to vfio, reboot Unraid.
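
In current Unraid releases that's usually done from Tools → System Devices by ticking the devices in those groups, which writes them into the vfio-pci config file on the flash drive. As a rough sketch only (the PCI addresses and vendor:device IDs below are placeholders, not values taken from your diagnostics), the result looks something like this:

    # /boot/config/vfio-pci.cfg, written by the System Devices page
    # one BIND= entry listing every device to isolate (example addresses/IDs only)
    BIND=0000:0c:00.0|10de:1b06 0000:0c:00.1|10de:10ef

    # after rebooting, confirm the GPU and its audio function are bound to vfio-pci
    lspci -nnk -s 0c:00.0
    lspci -nnk -s 0c:00.1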

 

For "Garuda.1" vm:

make the gpu device multifunction:

change from this:

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x0c' slot='0x00' function='0x0'/>
      </source>
      <rom file='/mnt/user/isos/vbios/1080tiHybrid.dump'/>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x0c' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x07' slot='0x00' function='0x0'/>
    </hostdev>

 

to this:

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x0c' slot='0x00' function='0x0'/>
      </source>
      <rom file='/mnt/user/isos/vbios/1080tiHybrid.dump'/>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0' multifunction='on'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x0c' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x1'/>
    </hostdev>

 

 

For "Windows 10.1" vm:

These devices do not exist:

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x09' slot='0x00' function='0x4'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </hostdev>

 

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x04' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
    </hostdev>

 

For "Windows 11.1" vm: same as for windows 10, make the gpu device multifunction:

change from this:

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x0c' slot='0x00' function='0x0'/>
      </source>
      <rom file='/mnt/user/isos/vbios/1080tiHybrid.dump'/>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x0c' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
    </hostdev>

 

to this:

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x0c' slot='0x00' function='0x0'/>
      </source>
      <rom file='/mnt/user/isos/vbios/1080tiHybrid.dump'/>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0' multifunction='on'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x0c' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x1'/>
    </hostdev>

 

 

Since the VMs were working before, I assume there's no issue with the vBIOS file you are pointing to.


I tried replacing those sections but couldn't get it to work. However, I ended up getting both Garuda and Windows 11 to work late last night by only adding the multifunction part. Then this morning they won't display again, and the keyboard's Num Lock key won't respond, even though the configuration is the same as last night, with the multifunction parts still in the XML.

 

I then created a new Pop!_OS install; it loaded up and let me install updates, etc., but when I powered the VM down and tried to restart it, I got the same problem with just a black screen. I scrapped all the VMs I created before adding the second cache drive, but the Pop!_OS one is new today and is also having the problem. I'll try one more Windows VM with VNC first and then try passing through the GPU. Thanks. :)


I wish I had never added the second cache drive. I'm confident I messed something up when trying to split things so the second cache drive would hold the VMs. Does anyone know if there's a way to just start everything over again, reverting back to just one cache drive?


Hi. Is there a nuke option / a way to just erase everything and get back to the original configuration? For example, would restoring an older flash drive backup put back the original configuration I had, if I also remove the current second cache drive so it's back to just one? I'd really like to get the VM side of things working with this new mobo if possible. I just want to make sure I don't make things worse by suddenly unmounting the second cache drive to try this. Thanks for any reply.

 

 

 

 

* I had things going pretty well with the previous mobo, a B550. I 'upgraded' to an X570 because it has 8 SATA ports, but I had no idea Asus motherboards won't easily boot in legacy mode. I've tried deleting all secure boot keys, setting 'Other OS' in the BIOS, etc., and could only get it to boot by renaming the flash drive's EFI- folder to EFI without the dash, but then it boots in UEFI mode instead of legacy. I'm not sure whether that is part of the problem I'm having now with VMs not booting with GPU passthrough; sometimes a VM works only right after creating it and then won't boot again after a restart.

 

This is the error I get when restarting after a failed VM boot: "Unable to write to file /var/log/libvirt/qemu/Windows 11.1.log: No space left on device"
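
From what I've read, /var/log on Unraid lives in a small RAM-backed filesystem, so a single runaway log can fill it quickly. I'm assuming something along these lines would show what's eating the space and let me empty the offending log; these are generic commands I haven't verified on my setup yet:

    # how full is the log filesystem?
    df -h /var/log

    # what's taking the space?
    du -sh /var/log/* | sort -h

    # empty the offending qemu log in place (path taken from the error above)
    truncate -s 0 "/var/log/libvirt/qemu/Windows 11.1.log"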

 

Additionally, I believe I really messed up when adding a second cache drive primarily for VMs while I was swapping motherboards; I used Krusader to move things around to accommodate the new cache drive, haven't been able to figure out how to fix it, and should have spent more time researching first. Sometimes I learn the hard way. :facepalm:


I want to do something similar because I both upgraded to a new motherboard and added a second cache drive, and now I can't get VMs to boot with passthrough anymore and have no idea whether it's even fixable short of a fresh start. Can I just reformat all the disks in Windows and do something with the USB drive? Thanks for any reply.


Hi. Is there a simple way to just erase all drives and start from scratch without losing my paid registration? I do have a flash backup, but it contains the configuration from my old motherboard/setup, though that was working.

 

I'm now unable to boot VMs with passthrough, no troubleshooting has helped, and I also get the error "/var/log is getting full (currently 100 % used)", which I tend to notice AFTER an unsuccessful VM boot when I then try to restart it.

 

I followed advice from a post elsewhere but still have no fix. I believe I either messed up badly with my choice of motherboard 'upgrade' (Asus X570 Strix-E, wanting the 8 SATA ports) and/or royally messed things up with my limited understanding of how to add a second cache drive/pool and move things around to make the second one work for VMs. I'm having no luck with my searches and troubleshooting so far.

 

If I thought it would work again, I'd just reinstall my old motherboard, but I've made a lot of changes since then in adding the second cache drive. Thanks for any possible help.

 

 

rushtallica-diagnostics-20211119-1008.zip


Thanks for your reply. Here's what the command returned:

 

0       /var/log/pwfail
4.0K    /var/log/swtpm
20K     /var/log/samba
0       /var/log/plugins
0       /var/log/pkgtools
0       /var/log/nginx
0       /var/log/nfsd
12K     /var/log/libvirt
328K    /var/log


I had deleted the VMs and have since created another, but I hadn't tried passthrough on it yet. I'm wondering if it's something I did when moving files after I added the other cache drive, or if it's how the new motherboard works with Unraid. I also wonder whether a misplaced file could be filling up the log when I try to pass through the GPU.


When I got the new motherboard (the Asus X570), it wouldn't boot into Unraid. I had deleted the secure boot keys and set the boot option to 'Other OS'. Then I found a post that mentioned removing the hyphen from the EFI- folder, making it EFI instead; that let me boot in UEFI mode. But I wonder if not booting in legacy mode is part of the problem. I just noticed, when clicking on the flash drive in Unraid, that there's an option that was ticked for booting into UEFI. I unticked it, put the USB drive in my Windows PC, renamed the folder back to EFI-, and so far it has let me boot; I then passed through my GPU and installed the NVIDIA drivers in a Windows 11 VM. I've gotten this far before only for it to fail later, but it's the first time I've been able to boot Unraid in legacy mode. I'm hopeful for the moment, anyway. :D Will report back in a bit. Thanks again.
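
(As I understand it, that UEFI checkbox on the flash device page and the folder rename are the same switch; ticking or unticking it just renames the folder on the stick, roughly like this:)

    # On the Unraid USB flash drive (viewable from any OS):
    #   /EFI   -> UEFI boot permitted
    #   /EFI-  -> UEFI boot not permitted (legacy/CSM boot only)
    # The checkbox on the flash device page toggles between these two folder names.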


I guess I should still ask: is there a way to just restart everything from scratch while keeping my paid key? Thanks for any reply. I guess my next step is to reinstall my old motherboard.

 

EDIT: One thing I'm wondering about: I formatted the second drive as XFS instead of Btrfs (which the first/original cache SSD uses) when I installed it, though I don't know whether that could cause these kinds of problems. And it's at 40 degrees C immediately after rebooting with no VM running.

 

And trying to boot the Windows VM hits the same problem of not starting up; the drive is now showing 32,000 reads and 4,500 writes, and I'm getting the error again: /var/log is getting full (currently 100 % used).


And running the command you requested earlier now shows this (after shutting down the VM, but before rebooting):

0       /var/log/pwfail
0       /var/log/swtpm
20K     /var/log/samba
0       /var/log/plugins
0       /var/log/pkgtools
0       /var/log/nginx
0       /var/log/nfsd
7.6M    /var/log/libvirt
128M    /var/log
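
Since the subdirectories only add up to a few megabytes, most of that 128M must be in files sitting directly in /var/log (the syslog, presumably). Something generic like this should point at the culprit; I haven't run these exact commands yet:

    # biggest items anywhere under /var/log, largest last
    du -ah /var/log | sort -h | tail -n 10

    # biggest files sitting directly in /var/log
    ls -lhS /var/log | head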

14 minutes ago, Rushtallica said:

I'm showing my VM's NVME disk temperature is 84 C!!!

This is a problem with 6.10.0 RC2.  Other users have reported that with this release, NVMe temps are showing 84C (same temp for everyone).  Rolling back to prior version makes that go away.  It is not clear if that is an erroneously reported temperature or if something in RC2 is actually causing the NVMe to get that hot.

 

 

7 minutes ago, Hoopster said:

This is a problem with 6.10.0 RC2.  Other users have reported that with this release, NVMe temps are showing 84C (same temp for everyone).  Rolling back to prior version makes that go away.  It is not clear if that is an erroneously reported temperature or if something in RC2 is actually causing the NVMe to get that hot.

 

 

Thanks so much for that info! I'm having other issues too, and I'm seeing very heavy reads and writes, so I think it could be the real temperature on mine. But I wonder if 6.10.0 RC2 could possibly be related to the temp and other problems I've been seeing since installing this particular motherboard. Thanks so much again for your reply! :)

49 minutes ago, Rushtallica said:

I wonder if 6.10.0 RC2 could possibly be related to the temp and other problems I've been seeing since installing this particular motherboard

I am running 6.10.0 RC1 on my main system and have not seen any "anomalies" with this version.  My NVMe temps are normal in the mid to high 30s (at idle) and mid to high 50s (under heavy activity load).

 

I have been trying to diagnose a random lockup problem and have changed PSUs, run RAM tests, redone cabling, etc., and none of that has turned up any problems.

 

Perhaps installing 6.10.0 RC1 or 6.9.2 could help you eliminate the unRAID version as a potential cause of your issues.

