Posts posted by Acidcliff

  1. 4 hours ago, giganode said:

     

    @zhtengw did you recognize my post?

     

    [ 43.683341] i915 0000:00:02.2: Device initialization failed (-71)
    [ 43.683344] i915: probe of 0000:00:02.2 failed with error -71

    => I'm far from an expert, but this seems to be the culprit - it looks as if Unraid isn't able to initialize the VFs, so Win11 is probably not able to use them.

     

    What CPU model are you using?

    Have you enabled SR-IOV in BIOS?

    Have you done a reboot after installing the plugin?

    What's your "lspci -v" output?
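    If it helps, here's roughly how I'd collect that info in one go (just a sketch of standard commands, nothing plugin-specific; paths assume the iGPU sits at 00:02.0):

    # kernel messages from the i915 driver, including any VF probe failures
    dmesg | grep -i i915
    # the iGPU and any VFs it has spawned, with the driver in use for each
    lspci -nnk -s 00:02
    # how many VFs the PF is currently set to expose
    cat /sys/bus/pci/devices/0000:00:02.0/sriov_numvfs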

  2. 10 hours ago, zhtengw said:

    Sorry, it's the first version with a web settings page. It may cause some file conflicts. Could you please try removing the old plugin and then installing the new one?

    Tried that (via the uninstall function on the plugins page) - same error.
    And now I have the problem that the plugin doesn't want to install anymore.

     

     

    Edit: I also tried replacing the libvirt.php (that was patched by the previous version of the plugin) with the libvirt.php.orig backup, but that didn't work either.

  3. plugin: installing: i915-sriov.plg
    Executing hook script: pre_plugin_checks
    plugin: downloading: i915-sriov.plg ... done
    
    Executing hook script: pre_plugin_checks
    
    +==============================================================================
    | Skipping package unraid-i915-sriov-2023.03.30 (already installed)
    +==============================================================================
    
    patching file usr/local/emhttp/plugins/dynamix.vm.manager/include/libvirt.php
    Hunk #1 FAILED at 780.
    1 out of 1 hunk FAILED -- saving rejects to file usr/local/emhttp/plugins/dynamix.vm.manager/include/libvirt.php.rej
    plugin: run failed: /bin/bash
    Executing hook script: post_plugin_checks

     

    I'm getting this error when trying to update the plugin to the latest version.
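    In case it helps: since the old plugin's patch is apparently still applied to libvirt.php, the new patch can't apply cleanly. Restoring the plugin's backup before reinstalling would look roughly like this (a sketch based on the paths in the log above - the plugin may handle this differently, so use at your own risk):

    # put back the unpatched file the previous plugin version backed up
    cp /usr/local/emhttp/plugins/dynamix.vm.manager/include/libvirt.php.orig \
       /usr/local/emhttp/plugins/dynamix.vm.manager/include/libvirt.php
    # remove the reject file left over from the failed hunk
    rm -f /usr/local/emhttp/plugins/dynamix.vm.manager/include/libvirt.php.rej
    # then retry the install from the Plugins page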

  4. 22 hours ago, Acidcliff said:

    Hey, it's amazing how far you've brought this!

     

    I have the following problem: when I bind one of the VFs (02.1) to the VM (Win 11), the iGPU does not seem to work (Code 43 in Device Manager).

     

    I'm running a 12600k on 6.11.5.

    VFs are created and visible in device overview. 

    02.1 is bound to vfio at boot.

     

    Any idea why it's not working?

     

        <graphics type='vnc' port='-1' autoport='yes' websocket='-1' listen='0.0.0.0' keymap='de'>
          <listen type='address' address='0.0.0.0'/>
        </graphics>
        <audio id='1' type='none'/>
        <video>
          <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
        </video>
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
          </source>
          <address type='pci' domain='0x0000' bus='0x06' slot='0x10' function='0x0'/>
        </hostdev>
    Loading config from /boot/config/vfio-pci.cfg
    BIND=0000:00:02.1|8086:4680
    ---
    Processing 0000:00:02.1 8086:4680
    Error: Device 0000:00:02.1 does not exist, unable to bind device
    ---
    vfio-pci binding complete
    
    Devices listed in /sys/bus/pci/drivers/vfio-pci:
    
    Loading config from /boot/config/vfio-pci.cfg
    BIND=0000:00:02.1|8086:4680
    ---
    Processing 0000:00:02.1 8086:4680
    Vendor:Device 8086:4680 found at 0000:00:02.1
    
    IOMMU group members (sans bridges):
    /sys/bus/pci/devices/0000:00:02.1/iommu_group/devices/0000:00:02.1
    
    Binding...
    0000:00:02.1 already bound to vfio-pci
    Successfully bound the device 8086:4680 at 0000:00:02.1 to vfio-pci
    ---
    vfio-pci binding complete
    
    Devices listed in /sys/bus/pci/drivers/vfio-pci:
    lrwxrwxrwx 1 root root    0 Mar 28 16:01 0000:00:02.1 -> ../../../../devices/pci0000:00/0000:00:02.1

     

     

    Yay I finally got it working!!!

     

    I tried everything: reinstalling the plugin, intel_gpu_top, using the modified kernel instead of the add-on...

    But the error 43 always remained.

     

    The solution was the following:

    Although I already had the Intel GPU driver installed, I had to reinstall it with the "clean install" option checked (otherwise it didn't work). After the reinstall, everything worked like a charm.

     

    I'm finally able to run my Win11 VM on VF 02.1 with Parsec (using the Virtual Display Driver - and don't forget to remove the Red Hat display driver).

  5. Hey, it's amazing how far you've brought this!

     

    I have the following problem: when I bind one of the VFs (02.1) to the VM (Win 11), the iGPU does not seem to work (Code 43 in Device Manager).

     

    I'm running a 12600k on 6.11.5.

    VFs are created and visible in device overview. 

    02.1 is bound to vfio at boot.

     

    Any idea why it's not working?

     

        <graphics type='vnc' port='-1' autoport='yes' websocket='-1' listen='0.0.0.0' keymap='de'>
          <listen type='address' address='0.0.0.0'/>
        </graphics>
        <audio id='1' type='none'/>
        <video>
          <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
        </video>
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
          </source>
          <address type='pci' domain='0x0000' bus='0x06' slot='0x10' function='0x0'/>
        </hostdev>
    Loading config from /boot/config/vfio-pci.cfg
    BIND=0000:00:02.1|8086:4680
    ---
    Processing 0000:00:02.1 8086:4680
    Error: Device 0000:00:02.1 does not exist, unable to bind device
    ---
    vfio-pci binding complete
    
    Devices listed in /sys/bus/pci/drivers/vfio-pci:
    
    Loading config from /boot/config/vfio-pci.cfg
    BIND=0000:00:02.1|8086:4680
    ---
    Processing 0000:00:02.1 8086:4680
    Vendor:Device 8086:4680 found at 0000:00:02.1
    
    IOMMU group members (sans bridges):
    /sys/bus/pci/devices/0000:00:02.1/iommu_group/devices/0000:00:02.1
    
    Binding...
    0000:00:02.1 already bound to vfio-pci
    Successfully bound the device 8086:4680 at 0000:00:02.1 to vfio-pci
    ---
    vfio-pci binding complete
    
    Devices listed in /sys/bus/pci/drivers/vfio-pci:
    lrwxrwxrwx 1 root root    0 Mar 28 16:01 0000:00:02.1 -> ../../../../devices/pci0000:00/0000:00:02.1

     

  6. Hi everyone,

     

    Great guide, great thread, and above all it's really amazing how constructively and kindly everyone here helps each other!

     

    My setup:

    Unraid 6.11.3
    MB: MSI PRO Z690-A DDR4

    CPU: 12600K

    HDDs: 5xSATA

    SSD: 1xM.2

    Powertop 2.15

    ASPM appears, as far as I can tell, to be enabled everywhere

    USB: Conbee II and JetFlash Unraid USB stick

     

    With my setup I currently can't get below C6 overall, even with the array switched off. However, when I use the Conbee II via zigbee2mqtt (Docker), I no longer get below C2. Autosuspend for the Conbee also shows as "Bad" in powertop, and even after tuning it seems to keep resetting itself. If I stop the zigbee2mqtt container, it goes back to C6.

     

    I can already guess the answer... but do you think it would be possible, in principle, to reach a better C-state level with the Conbee in use? (The autosuspend tuning I refer to is sketched below.)
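    By "tuning" I mean forcing USB autosuspend via sysfs, roughly like this (a sketch only - the Conbee II USB IDs here are what I believe them to be, so please verify with lsusb, and the setting may well not stick while zigbee2mqtt holds the serial port open):

    # force autosuspend for the Conbee II (idVendor:idProduct assumed to be 1cf1:0030 - check with `lsusb`)
    for dev in /sys/bus/usb/devices/*; do
        if [ -f "$dev/idVendor" ] && \
           [ "$(cat "$dev/idVendor" 2>/dev/null)" = "1cf1" ] && \
           [ "$(cat "$dev/idProduct" 2>/dev/null)" = "0030" ]; then
            echo auto > "$dev/power/control"
            echo 2000 > "$dev/power/autosuspend_delay_ms"
        fi
    done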

     

     

     

  7. On 7/2/2022 at 7:06 AM, jinlife said:

    Hello Acidcliff, I built a kernel with the IOV code in intel LTS Kernel 5.15. It looks fine to run but I have no 12th CPU to test the i915 SR-IOV ability.

    Would you please try it? It was built for unraid 6.10.3, please just overwrite these files to the flash drive.

    Please backup original 4 files, copying them back will revert the change.

    https://drive.google.com/drive/folders/1bAnedRHWaz7QGQAkO7xWohhnjDxSfbqS?usp=sharing

    It's just an experiment and may not work. Maybe the code is not the only condition.

    I think I will be able to test it on the weekend!

  8. Looking at the code and commits, it seems to me that Intel's LTS kernel 5.10 has added SR-IOV support to the i915 driver (although nothing has happened to it since the commit on 24 February). Intel's 5.15 LTS kernel does not have it, though, and it's also not in the mainline 5.18 kernel.

     

    I'm not sure what state the updated SR-IOV driver in 5.10 is in.

     

    I'm no Linux pro, so I'm also not sure if and how it's possible to compile the adapted driver and add it to the Unraid kernel. But if it were possible, it would definitely be cool :D

    • Upvote 2
  9. I'm not that familiar with AMD CPUs, but have you tried the following:

    • Setting the CPU scaling governor to a more power-saving strategy (e.g. powersave)? (See the sketch after this list.)
    • Pinning CPU cores so fewer of them get used (possibly less area to cool - but I could be wrong)
    • Undervolting the CPU
    • Making sure that the iGPU is in a power-saving state (for Intel & Nvidia this means at least installing the GPU drivers and enabling it - e.g. through "nvidia-smi --persistence-mode=1")
    • Reducing other heat-producing components and the case temperature in your build (e.g. spinning down HDDs, using Gigabit LAN rather than multi-gig if applicable, increasing case fan speed)
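    Here's a rough sketch of what I mean by the governor and GPU points above (assumes an Intel CPU and an Nvidia card - adjust to your hardware):

    # switch all cores to the powersave scaling governor
    for gov in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
        echo powersave > "$gov"
    done
    # keep the Nvidia driver loaded so the card can drop into a low-power state while idle
    nvidia-smi --persistence-mode=1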

    At what temperature is your CPU currently running? If you've got enough headroom, you might get away with just unplugging the fan :D (at your own risk, of course...)

     

    Alternatively (I don't know how much room you have), you could also think about using a small AIO and routing the cooling outside the case. That could be more efficient than having the cooler near the HDDs and other components...

  10. Yep - I applied everything from powertop, but to be honest that didn't make much of a difference (maybe because my HDDs are mostly spun down anyway).

     

    But applying "powersave" as the scaling governor made a huge difference, bringing the system down by about 5-10 W to 47-50 W*.

     

    Maybe I'll look into undervolting - but overall I think I can be happy with the 50W for the setup.

    *Current Setup:

    CPU: Intel 12600k, GPU: MSI GTX 1050 Aero, 5x HDDs (~32 TB), 32 GB RAM 3600 MHz, MSI Z690 PRO A, PNY XLR8 CS3030 1TB M.2, Arctic Liquid Freezer 420, 3x 140mm fans (all stopped for the test), no OC, XMP on, 1 VM running (Debian Linux), 12 Docker containers running (mainly home automation stuff)

  11. On 1/12/2022 at 10:35 PM, sylus said:

    I think you have to pass through the GPU to the VM, otherwise the GPU will run at full power.

     

    Have you enabled any energy-saving options in the BIOS yet? The power consumption seems quite high.

     

     

     

    I had a deeper look into that - thank you again for pointing it out. In fact, while not under load, the GTX 1050 was running in power state P0.

    Instead of running a VM to get it down to P8, I used the following, which seems to work (I haven't yet had the opportunity to measure the impact):

    nvidia-smi --persistence-mode=1

     

  12. I think it would be possible to go even below the 36 W, since I use a 420 mm AIO, which is way overkill (but a silent PC was important to me).

     

    The Alder Lake iGPU has Quick Sync - but you'll run into the same problem that I'm currently trying to solve: beginning with gen 11, Intel has dropped support for GVT-g/GVT-d (the older iGPU virtualization methods) and switched to SR-IOV. So at least with the current Unraid plugins (like Intel GPU TOP) you won't be able to virtualize the iGPU to use it for your Docker containers and VMs in parallel. Unfortunately there is close to no material or tutorials to be found on the topic of "SR-IOVing" an iGPU. As long as this isn't solved, you'll probably be stuck with using a dedicated GPU (and I guess the 1050 is one of the least power-hungry cards that also has proper HW-encoding features - features that are lacking, for example, in a 1030).

     

    Details on the/my problems with the iGPU SR-IOV here:

     

  13. First of all, a huge thank you to you, @BVD! IMO it's the most complete and user-friendly guide to SR-IOV that I could find anywhere. Really cool!


    I know you focused this thread on NICs, but I tried to apply the guide to Intel's iGPUs, which (supposedly) support SR-IOV on gen 11+.

    I don't want to hijack this thread - details are in a separate thread:

     

    When I try to use the VFs from the iGPU, I get an error stating that the VF needs a VF token to be used (from my research, a shared-secret UUID between the PF and the VF). Most sources about VF tokens that I could find come from discussions around DPDK, so I guess it's also a topic one could stumble upon when using VFs on NICs. I was wondering if someone here has ever had to deal with something similar and knows a way of setting the token within the workflow described here.
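    For context, the only concrete usage I've found so far is DPDK's, where every process touching the PF or its VFs is started with the same UUID - just an illustration of the concept, and I don't yet know whether or how this maps onto QEMU/libvirt:

    # DPDK example: PF and VF applications pass the same shared-secret UUID (value is a placeholder)
    dpdk-testpmd --vfio-vf-token=14d63f20-8445-11ea-8900-1f9ce7d5650d -a 0000:00:02.1 -- -i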

  14. Hi all,

     

    I'm trying to use the SR-IOV feature of the iGPU of the intel 12600k (VGA function is device 0000:00:02.0).

     

    I'm on 6.10-RC2, using an Intel 12600K on an MSI Z690 PRO A.

     

    This is my /boot/config/go

    #!/bin/bash
    # Start the Management Utility
    /usr/local/sbin/emhttp &
    echo 3 > /sys/bus/pci/devices/0000:00:02.0/sriov_numvfs
    # Relaunch vfio-pci script to bind virtual function adapters that didn't exist at boot time
    /usr/local/sbin/vfio-pci >>/var/log/vfio-pci
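    A quick way to confirm the VFs were actually created after this runs (read-only commands; paths assume the iGPU at 00:02.0 as above):

    # should read back the number of VFs written above
    cat /sys/bus/pci/devices/0000:00:02.0/sriov_numvfs
    # the VFs should appear as additional functions 00:02.1 to 00:02.3
    lspci -s 00:02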

     

    This is my /boot/syslinux/syslinux.cfg

    default menu.c32
    menu title Lime Technology, Inc.
    prompt 0
    timeout 50
    label Unraid OS
      menu default
      kernel /bzimage i915.force_probe=4680 vfio-pci.enable_sriov=1
      append isolcpus=0-11 video=efifb:off initrd=/bzroot intel_iommu=on
    label Unraid OS GUI Mode
      kernel /bzimage
      append isolcpus=0-11 initrd=/bzroot,/bzroot-gui
    label Unraid OS Safe Mode (no plugins, no GUI)
      kernel /bzimage
      append initrd=/bzroot unraidsafemode
    label Unraid OS GUI Safe Mode (no plugins)
      kernel /bzimage
      append initrd=/bzroot,/bzroot-gui unraidsafemode
    label Memtest86+
      kernel /memtest

     

     

    The log from the vfio-pci binding 

    Loading config from /boot/config/vfio-pci.cfg
    BIND=0000:00:02.0|8086:4680 0000:00:02.1|8086:4680 0000:00:02.2|8086:4680 0000:00:02.3|8086:4680
    ---
    Processing 0000:00:02.0 8086:4680
    Vendor:Device 8086:4680 found at 0000:00:02.0
    
    IOMMU group members (sans bridges):
    /sys/bus/pci/devices/0000:00:02.0/iommu_group/devices/0000:00:02.0
    
    Binding...
    Successfully bound the device 8086:4680 at 0000:00:02.0 to vfio-pci
    ---
    Processing 0000:00:02.1 8086:4680
    Error: Device 0000:00:02.1 does not exist, unable to bind device
    ---
    Processing 0000:00:02.2 8086:4680
    Error: Device 0000:00:02.2 does not exist, unable to bind device
    ---
    Processing 0000:00:02.3 8086:4680
    Error: Device 0000:00:02.3 does not exist, unable to bind device
    ---
    vfio-pci binding complete
    
    Devices listed in /sys/bus/pci/drivers/vfio-pci:
    lrwxrwxrwx 1 root root 0 Jan 12 21:51 0000:00:02.0 -> ../../../../devices/pci0000:00/0000:00:02.0
    
    Loading config from /boot/config/vfio-pci.cfg
    BIND=0000:00:02.0|8086:4680 0000:00:02.1|8086:4680 0000:00:02.2|8086:4680 0000:00:02.3|8086:4680
    ---
    Processing 0000:00:02.0 8086:4680
    Vendor:Device 8086:4680 found at 0000:00:02.0
    
    IOMMU group members (sans bridges):
    /sys/bus/pci/devices/0000:00:02.0/iommu_group/devices/0000:00:02.0
    
    Binding...
    0000:00:02.0 already bound to vfio-pci
    Successfully bound the device 8086:4680 at 0000:00:02.0 to vfio-pci
    ---
    Processing 0000:00:02.1 8086:4680
    Vendor:Device 8086:4680 found at 0000:00:02.1
    
    IOMMU group members (sans bridges):
    /sys/bus/pci/devices/0000:00:02.1/iommu_group/devices/0000:00:02.1
    
    Binding...
    0000:00:02.1 already bound to vfio-pci
    Successfully bound the device 8086:4680 at 0000:00:02.1 to vfio-pci
    ---
    Processing 0000:00:02.2 8086:4680
    Vendor:Device 8086:4680 found at 0000:00:02.2
    
    IOMMU group members (sans bridges):
    /sys/bus/pci/devices/0000:00:02.2/iommu_group/devices/0000:00:02.2
    
    Binding...
    0000:00:02.2 already bound to vfio-pci
    Successfully bound the device 8086:4680 at 0000:00:02.2 to vfio-pci
    ---
    Processing 0000:00:02.3 8086:4680
    Vendor:Device 8086:4680 found at 0000:00:02.3
    
    IOMMU group members (sans bridges):
    /sys/bus/pci/devices/0000:00:02.3/iommu_group/devices/0000:00:02.3
    
    Binding...
    0000:00:02.3 already bound to vfio-pci
    Successfully bound the device 8086:4680 at 0000:00:02.3 to vfio-pci
    ---
    vfio-pci binding complete
    
    Devices listed in /sys/bus/pci/drivers/vfio-pci:
    lrwxrwxrwx 1 root root 0 Jan 12 13:51 0000:00:02.0 -> ../../../../devices/pci0000:00/0000:00:02.0
    lrwxrwxrwx 1 root root 0 Jan 12 13:52 0000:00:02.1 -> ../../../../devices/pci0000:00/0000:00:02.1
    lrwxrwxrwx 1 root root 0 Jan 12 13:52 0000:00:02.2 -> ../../../../devices/pci0000:00/0000:00:02.2
    lrwxrwxrwx 1 root root 0 Jan 12 13:52 0000:00:02.3 -> ../../../../devices/pci0000:00/0000:00:02.3

     

    So the virtual functions seem to be established and bound to vfio-pci (kernel driver in use is vfio-pci):

    [Screenshot: IOMMU groups showing the VFs bound to vfio-pci]

     

    The devices are also showing up in the GPU dropdown in the VM config:

    [Screenshot: VM config GPU dropdown listing the VFs]

     

    When I try to start the VM with any VF other than 00:02.0, I get the following error message:

    internal error: qemu unexpectedly closed the monitor: 2022-01-12T22:18:26.174166Z qemu-system-x86_64: -device vfio-pci,host=0000:00:02.1,id=hostdev0,bus=pci.0,addr=0x2.0x1: vfio 0000:00:02.1: error getting device from group 14: Permission denied Verify all devices in group 14 are bound to vfio- or pci-stub and not already in use

     

    But I guess this error message is generic - in dmesg I get the following error after starting the VM, which I think is the real one:

    [  640.917885] vfio-pci 0000:00:02.3: VF token required to access device

     

    By god I tried my best to find anything about that "VF token" thing, but I wasn't able to find much.

    Does anyone have a clue how to get around that error?

    • Thanks 1
  15. Quote

    Is that measured at the wall using a power meter?  So total power consumption of the system?

    Yes - Measured with an "Innr SP120" Zigbee Wall Plug

     

    Quote

    What power supply?

    Fractal Design Ion+ 660p

     

    Quote

    Can you confirm that is with no gpu or any pci-e cards? I see you say it was barebones with just the M2 1TB in. 

    Yes indeed - no GPU or PCI-E Cards. Just the M2 1TB mentioned


     

    Quote

     

    Is that with unraid running and assigned to the e cores? How many?

     

    This was measured on a fresh "native" (no VM) Win11 install (no Unraid) - so the core assignment was handled by the Windows 11 Thread Director.

     

    Quote

    I'd be interested to see if you are still able to get those numbers when running a windows Vm assigned also to the e cores, and then again with one assigned to power cores to see how it varies. 

    I have done some new measurements using a pretty barebones Unraid setup (no Docker containers or VMs running; system mentioned above, but now with 2x 12 TB Western Digital HDDs):
    - ~55 W with HDDs spinning idle / 45 W with HDDs spun down (no CPU pinning - I think Unraid defaults to the first cores, which are P-cores)

    - ~45 W with HDDs spinning idle / 36 W with HDDs spun down (exclusive use of E-cores through CPU isolation)

    Adding an MSI Aero GTX 1050 2GB results in ~15-20 W of additional idle power draw.
    All measurements are of the complete system, taken at the wall outlet.

     

    Quote

    Where did you read of the 12600 series going "well below" 10 watts idle? That can't be accurate? Do you mean just the CPU rather than total system watts at the wall? Even if just the CPU, that still seems very low.

    The ~10 W was for the idling CPU only - not the whole system. I don't remember anymore which source I got that information from.

    • Like 1
  16. In the end I chose the 12600k with a z690 (MSI PRO A).

     

    In case it's of interest to somebody: I measured the idle power draw of the pretty barebones system* under a freshly installed Win11. It was about 32-38 W (+/-5).

     

    * 12600k, MSI Z690 PRO A, PNY XLR8 CS3030 1TB M.2, Arctic Liquid Freezer 420, 3x 140mm fans (all stopped for the test), no OC, XMP on

     

    I have yet to test it with Unraid installed, but I have to overcome some hurdles that came with the platform (e.g. onboard Ethernet with the Intel I225-V not working, and the need for a dedicated GPU since the UHD 770 has no supported method for sharing between VMs...). Pains of early adopting (or my personal incompetence/ignorance, since this is my first build in ~7 years :D)

    • Like 1
  17. Hey all,

     

    I'm currently researching options (12600K vs 5800X) for my server build to prepare for my switch from my old QNAP NAS to Unraid.

     

    Context:

    The system will host ~20 TB of data, ~20 Docker containers and 2 VMs. These are all for my home automation environment, so not CPU-heavy, but they have to run 24/7 (with the exception of some lighter 1080p video transcoding for surveillance streams).

    My hypothesis is that the system will run at close to idle CPU usage most of the time. Therefore I think it makes sense to optimize the build for this mode of operation.

     

    Now to the question that I couldn't really answer through my research:

    There seem to be multiple sources indicating that the 12600K is able to stay well under 10 W in idle/low-usage scenarios thanks to its hybrid architecture. Even though Linux does not yet fully support the hybrid architecture, I think I could pin the containers and VMs to the E-cores to force the low power consumption (a rough sketch of what I mean is below). I was not able to find information on the idle power consumption of the 5800X (just the CPU - not the whole system). Some meta-information suggests that the 5800X is more efficient than the 12600K under high load, but much less efficient in close-to-idle scenarios. Is that the case?
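    What I have in mind by "pinning", just as a sketch - the container and VM names are placeholders, and the CPU numbers assume the E-cores show up as logical CPUs 12-15 on a 12600K (check with `lscpu --extended` on the actual system):

    # restrict a Docker container to the assumed E-cores
    docker update --cpuset-cpus="12-15" homeassistant
    # pin a libvirt VM's vCPUs to the same cores
    virsh vcpupin debian-vm 0 12
    virsh vcpupin debian-vm 1 13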

     

    Does anyone here have more information, experience, or advice on which build I should prefer in my specific context?

     

     

    • Like 1