• RC3 VM Alias Handler Change?


    Arbadacarba
    • Minor

    I'm running a pfsense VM with a 2port NIC and an Atheros Wifi card passed through.

     

    (I have a slightly unreliable point-to-point LTE connection and occasional mission-critical internet requirements, so the Wifi card is set to connect automatically to my cell phone hotspot.)

     

    I had trouble getting this Wifi card to work: the VM failed with an error message referencing hostdev1 and something not fitting in the BARs.

     

    [Screenshot: the error message referencing hostdev1 and BARs]

     

    I searched, I posted, I got a great answer from Ghost82 who translated the info I had found into the xml:

        <qemu:commandline>
          <qemu:arg value='-set'/>
          <qemu:arg value='device.hostdev1.x-msix-relocation=bar2'/>
        </qemu:commandline>
      </domain>

     

    This, with minimal fiddling (hostdev0, not hostdev1), worked perfectly.
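
    For reference, the adjusted block would then read as follows (reconstructed from the snippet above with hostdev0 substituted for hostdev1; the attached XML is the authoritative version):

        <qemu:commandline>
          <qemu:arg value='-set'/>
          <qemu:arg value='device.hostdev0.x-msix-relocation=bar2'/>
        </qemu:commandline>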

    20220220_2240_pfSense - Cerberus.xml

     

    Right up until I updated to RC3.

     

    First, the VFIO passthrough setting changed in System Devices; after I re-enabled it, the VM still would not boot.

    But now the error doesn't refer to any hostdev id... it refers to the device as shown below:

      

    [Screenshot: the new error message]

     

    If I remove the Wifi card, the 2 NIC ports get alias lines, but if I add the Wifi card back in, no device gets one.

    Working - No WIFI.txt

    I then edited the XML to give the Wifi card an alias and put the commandline part back in:

    WIFI Added with Command.txt

     

    I get an error claiming there is no hostdev0

     

    [Screenshot: the error claiming there is no hostdev0]

     

    and when I look back in the XML, the aliases are all gone.

    Alias missing.txt

     

    Ghost82 suggests editing the XML in the cli but I haven't tried that yet. Would it make a difference? Should it?

     

    Has there been some change to the way aliases are handled? I don't understand why they disappear.

     

    Thanks for any help

     

    Arbadacarba

     

    jupiter-diagnostics-20220315-0016.zip




    User Feedback

    Recommended Comments



    1 minute ago, Jryski said:

    So this goes back to the original issue, the alias command is removed when you attempt to run this

    Can you attach the diagnostics for this?

    I thought the alias was not removed and that the issue was only related to -set and the JSON format, but it seems I was wrong.. :(

     

     

    Link to comment

    You could also try this:

    1. Replace this:

      <qemu:commandline>
        <qemu:arg value='-device'/>
        <qemu:arg value='vfio-pci,host=0000:03:00.0,id=ua-sm2262,bus=pci.3,addr=0x0,x-msix-relocation=bar2'/>
      </qemu:commandline>

     

    with this:

      <qemu:commandline>
        <qemu:arg value='-device'/>
        <qemu:arg value='pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2'/>
        <qemu:arg value='-device'/>
        <qemu:arg value='vfio-pci,host=0000:03:00.0,id=ua-sm2262,bus=pci.3,addr=0x0,x-msix-relocation=bar2'/>
      </qemu:commandline>

     

    2. In addition to deleting the hostdev block for the NVMe, also delete this:

        <controller type='pci' index='3' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='3' port='0x12'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
        </controller>

     

    Link to comment

    Diagnostic attached. Pulled it before I made the changes you suggested.

     

    This is after making the suggested edits.

     

    internal error: qemu unexpectedly closed the monitor: 2022-03-22T15:01:51.766791Z qemu-system-x86_64: -device vfio-pci,host=0000:03:00.0,id=ua-sm2262,bus=pci.3,addr=0x0,x-msix-relocation=bar2: vfio 0000:03:00.0: failed to open /dev/vfio/21: Operation not permitted

    unimatrixzero-diagnostics-20220322-1100.zip

    Link to comment
    1 minute ago, Jryski said:

    This is after making the suggested edits.

     

    internal error: qemu unexpectedly closed the monitor: 2022-03-22T15:01:51.766791Z qemu-system-x86_64: -device vfio-pci,host=0000:03:00.0,id=ua-sm2262,bus=pci.3,addr=0x0,x-msix-relocation=bar2: vfio 0000:03:00.0: failed to open /dev/vfio/21: Operation not permitted

    OK, that seems better now. Can you attach diagnostics with the latest edits applied, please?

    Link to comment

    Damn...I suspect that the pcie-root-port with index 3 is automatically added again as a controller block...

        <controller type='pci' index='3' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='3' port='0x12'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
        </controller>

     

    Did you delete it?

    Link to comment

    Yes, I deleted it, and I just tested again: if I remove it, update, and try to start the VM, it gets added back.

    Link to comment

    Thanks for all the assistance, I had fun and learned quite a bit about what's possible. Next scheduled restart I'll unstub the drive and try your other idea of passing it through on a virtual controller.

    Link to comment

    Another idea...use qemu:capabilities in the xml to remove json for devices...

     

    So, remove the hostdev block, use qemu:arg for vfio injection and qemu:capabilities to remove json

     

    Try to replace this:

      <qemu:commandline>
        <qemu:arg value='-device'/>
        <qemu:arg value='vfio-pci,host=0000:03:00.0,id=ua-sm2262,bus=pci.3,addr=0x0,x-msix-relocation=bar2'/>
      </qemu:commandline>

     

    With this:

      <qemu:commandline>
        <qemu:arg value='-device'/>
        <qemu:arg value='vfio-pci,host=0000:03:00.0,id=ua-sm2262,bus=pci.3,addr=0x0,x-msix-relocation=bar2'/>
      </qemu:commandline>
      <qemu:capabilities>
        <qemu:del capability='device.json'/>
      </qemu:capabilities>

     

    Or, alternatively, with this (depending on the libvirt/qemu versions):

      <qemu:commandline>
        <qemu:arg value='-device'/>
        <qemu:arg value='vfio-pci,host=0000:03:00.0,id=ua-sm2262,bus=pci.3,addr=0x0,x-msix-relocation=bar2'/>
      </qemu:commandline>
      <qemu:capabilities>
        <qemu:del capability='device.json+hotplug'/>
      </qemu:capabilities>

     

    With some luck json should be disabled (???)

    Edited by ghost82
    Link to comment

    First method results:

    internal error: qemu unexpectedly closed the monitor: qxl_send_events: spice-server bug: guest stopped, ignoring 2022-03-22T16:46:48.094874Z qemu-system-x86_64: -device vfio-pci,host=0000:03:00.0,id=ua-sm2262,bus=pci.3,addr=0x0,x-msix-relocation=bar2: vfio 0000:03:00.0: failed to open /dev/vfio/21: Operation not permitted

     

    2nd:

    internal error: invalid qemu namespace capability 'device.json+hotplug'

    Link to comment

    ok, last test for this, back to the beginning:

    1. Restore the hostdev block:

        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
          </source>
          <alias name='ua-sm2262'/>
          <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
        </hostdev>

     

    2. Restore the -set qemu:arg and disable json:

      <qemu:commandline>
        <qemu:arg value='-set'/>
        <qemu:arg value='device.ua-sm2262.x-msix-relocation=bar2'/>
      </qemu:commandline>
      <qemu:capabilities>
        <qemu:del capability='device.json'/>
      </qemu:capabilities>

     

    Can you attach diagnostics after this? Just to check whether json is disabled or not.
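
    A side note on the alias used above: as far as I know, libvirt only keeps an <alias> written into the XML by hand if its name starts with the ua- prefix (user alias). Names such as hostdev0 are generated by libvirt itself when the VM starts and are dropped from a saved definition. A minimal sketch, where the alias name ua-mydevice is a placeholder and the PCI address is simply the one from this thread's example:

        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <!-- host PCI address of the passed-through device (example value) -->
            <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
          </source>
          <!-- user-defined aliases need the ua- prefix to be preserved -->
          <alias name='ua-mydevice'/>
        </hostdev>

    Whatever name is chosen here is the one to use in the matching -set argument (device.ua-mydevice.x-msix-relocation=bar2).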

    Link to comment
    30 minutes ago, Jryski said:

    Certainly

    Thanks, but after my edits of the last post :D

     

    Quote

    failed to open /dev/vfio/21: Operation not permitted

    This is the same error as before. It seems related to permissions (hopefully caused by the way you are passing through the NVMe with qemu:arg instead of the hostdev block), so if it worked before with the hostdev block, I cannot see any reason it shouldn't work now (I'm referring to the permission error with IOMMU group 21, the NVMe controller). And with json disabled for devices, we should be back to the "old" working qemu/libvirt behavior (i.e. with aliases correctly seen by qemu:arg).

     

    Moreover, from your latest diagnostics it seems that qemu:capabilities --> del did the trick: devices are now passed from libvirt to qemu in non-json format and the aliases are all there. Hopefully it will work.

     

    Another note: aliases are correctly passed from libvirt to qemu. The fact that they cannot be seen in the GUI is a separate matter, and I don't know what it depends on, but it is not relevant here: as I said, the logs show that the aliases are passed correctly.

    Edited by ghost82
    Link to comment

    May have put my party hat on too soon. I can see the drive now in the VM, but so far I can't make any changes to it, so no creating partitions or formatting, or installing windows.

    Link to comment

    Windows throws an error "We couldn't create a new partition error: 0xfdd10070"
     

    This is the first time I've got this particular passthrough working. This NVMe controller (SM2262) has been an issue all along; the Samsung NVMe I had in it before worked fine.

     

    Link to comment

    I don't think that this is related to the fix.

    However, are you able to see the "Repair your computer" button after you get the error, and do you have access to a command prompt from the installation disk? If so, you could try the diskpart utility and format the disk there.

    Otherwise, you can create a new VM with the NVMe passed through and a GParted live disk, and try to format the NVMe there.

    Edited by ghost82
    Link to comment

    About masking device.json:

     

    PLEASE NOTE THAT THIS IS A TEMPORARY WORKAROUND AND THAT FOR FUTURE LIBVIRT/QEMU VERSIONS MASKING DEVICE.JSON MAY BREAK OTHER FEATURES.

    PATCHES FOR A PROPER OVERRIDE ARE ON THEIR WAY, SO HOPEFULLY THE NEXT LIBVIRT VERSION (8.2.0?) WILL ALLOW THESE OVERRIDES WITHOUT MASKING JSON AND UNRAID COULD INCLUDE THE NEW LIBVIRT VERSION.

     

    https://listman.redhat.com/archives/libvir-list/2022-March/229463.html

     

    Pinging also @limetech for this.

     

    ---

    If and when the patches are merged, I think the fix will be:

        <qemu:override>
          <qemu:device alias='YOURALIASHERE'>
            <qemu:frontend>
              <qemu:property name='x-msix-relocation' type='string' value='bar2'/>
            </qemu:frontend>
          </qemu:device>
        </qemu:override>

     

    Replacing this code block:

      <qemu:commandline>
        <qemu:arg value='-set'/>
        <qemu:arg value='device.YOURALIASHERE.x-msix-relocation=bar2'/>
      </qemu:commandline>
      <qemu:capabilities>
        <qemu:del capability='device.json'/>
      </qemu:capabilities>

     

    Edited by ghost82
    Link to comment

    I will update if I can get this working by whatever means. Thanks again for all the information; I'll update this post after the next libvirt/qemu update with how I changed the XML. I tell you what though, I REALLY wish they'd add a toggle to make the XML sticky when a GUI change is made... Thanks again for all the work.

     

    Link to comment

    Same issue in DISKPART: it fails due to "i/o errors". I'm wondering if it has something to do with the fact that the drive is being passed through as an entire controller, while the XML refers to it as a generic PCI device, not a SCSI controller? Grasping at straws. At this point, I'm ready to call this non-working until the updates are here. I can see the drive, but no luck making changes.

    The VM also takes 10 minutes to get to the Windows installer screen, and hangs at each step. I've tried limiting the cores to 1, which was a workaround in the past, as well as reducing the RAM to the minimum required for Windows 11 to allow the install. Despite all our effort, I believe it's time to wait for official support.

    Link to comment

    Libvirt 8.2.0 was officially released yesterday with the override fix.

     

    Quote

    qemu: Allow overrides of device properties via the qemu namespace

    Users wishing to override or modify properties of devices configured by libvirt can use the <qemu:deviceOverride> QEMU namespace element to specify the overrides instead of relying on the argv passthrough of the -set qemu commandline option which no longer works with new qemu.

     

    Edited by ghost82
    Link to comment

    Just for info: Unraid 6.10 RC7 now includes libvirt 8.2.0, so the following applies from that version onward (the example refers to x-msix-relocation):

        <qemu:override>
          <qemu:device alias='YOURALIASHERE'>
            <qemu:frontend>
              <qemu:property name='x-msix-relocation' type='string' value='bar2'/>
            </qemu:frontend>
          </qemu:device>
        </qemu:override>
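
    For anyone starting from scratch, here is a rough sketch of how the pieces fit together. It assumes the usual qemu namespace declaration on the <domain> element (as far as I know the qemu: elements are not accepted without it). The alias ua-mydevice is a placeholder, the host PCI address is just the one from the earlier example, and the rest of the domain definition is omitted:

        <domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
          <!-- name, memory, os and the other usual elements go here -->
          <devices>
            <hostdev mode='subsystem' type='pci' managed='yes'>
              <driver name='vfio'/>
              <source>
                <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
              </source>
              <!-- user alias; must match the alias in qemu:device below -->
              <alias name='ua-mydevice'/>
            </hostdev>
          </devices>
          <qemu:override>
            <qemu:device alias='ua-mydevice'>
              <qemu:frontend>
                <qemu:property name='x-msix-relocation' type='string' value='bar2'/>
              </qemu:frontend>
            </qemu:device>
          </qemu:override>
        </domain>

    The alias in <qemu:device> has to match the <alias> of the hostdev it should apply to.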

     

    Link to comment




