happythatsme

Posts posted by happythatsme

  1. Hi all, 

     

    Has anyone tried to add the Loki Docker plugin?  https://grafana.com/docs/loki/latest/clients/docker-driver/configuration/

     

    I have installed it using

     

    Quote

    docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions

     

    I can see the plugin listed:

     

    Quote

    root@Tower:/mnt/user/appdata/loki# docker plugin ls
    ID             NAME          DESCRIPTION           ENABLED
    d18f5eaea7c6   loki:latest   Loki Logging Driver   true

     

     

    Per the instructions, I need to edit the file /etc/docker/daemon.json:

     

    nano /etc/docker/daemon.json

    My file currently contains the following: 

    Quote

    {
        "runtimes": {
            "nvidia": {
                "path": "/usr/bin/nvidia-container-runtime",
                "runtimeArgs": []
            }
        }
    }


     

    I tried to merge the config and then disable and re-enable Docker, but I just get errors.

     

    Merged: 

    Quote

    {
        "runtimes": {
            "nvidia": {
                "path": "/usr/bin/nvidia-container-runtime",
                "runtimeArgs": []
            }
        },
        "debug" : true,
        "log-driver": "loki",
        "log-opts": {
            "loki-url": "http://host.docker.internal:3100/loki/api/v1/push"
        }
    }

     

     

    Error:

     

    Quote

    Warning: stream_socket_client(): unable to connect to unix:///var/run/docker.sock (No such file or directory) in /usr/local/emhttp/plugins/dynamix.docker.manager/include/DockerClient.php on line 693
    Couldn't create socket: [2] No such file or directory
    Warning: Invalid argument supplied for foreach() in /usr/local/emhttp/plugins/dynamix.docker.manager/include/DockerClient.php on line 877

     

     

    I'm guessing my JSON is badly formed and I've merged the changes incorrectly.
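One way to test that guess is to let a JSON parser do the merge. The sketch below is a minimal Python example that uses only the two fragments from the post; the variable names are illustrative and it does not touch /etc/docker/daemon.json. If the real file also parses cleanly, syntax is probably not the cause, and the Docker daemon's own log would be the next place to look.

```python
import json

# Existing daemon.json content, copied from the post.
existing = json.loads("""
{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
""")

# Loki logging options, copied from the merged file in the post.
loki = {
    "debug": True,
    "log-driver": "loki",
    "log-opts": {
        "loki-url": "http://host.docker.internal:3100/loki/api/v1/push"
    },
}

# The top-level keys do not collide, so a shallow merge is enough here.
merged = {**existing, **loki}

# json.dumps always emits syntactically valid JSON.
text = json.dumps(merged, indent=4)
print(text)
```

To check a file on disk instead, `json.load(open(path))` raises `json.JSONDecodeError` with a line and column number when the syntax is wrong.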

     

    Cheers!


  2. 46 minutes ago, JorgeB said:

    This SATA controller is being passed through to the Win10 VM:

     

    
    49:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
        DeviceName: X570/590 SATA1
        Subsystem: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901]
        Kernel driver in use: vfio-pci
        Kernel modules: ahci

     

    So when the VM starts Unraid loses access to it and all the connected disks.

     

    Hi JorgeB,

     

    Thank you so much for taking the time to review my logs, I really appreciate it.

     

    This is 100% a classic case of dumb user error! 

     

    I can confirm it's all working well now that I've given the SATA controller and the disks back to Unraid.

     

    Thank you again! 
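For anyone checking their own system for the same mistake, here is a small Python sketch that scans lspci-style text (in the format JorgeB quoted) for devices whose driver in use is vfio-pci. The sample is trimmed from the quoted block; the function name is made up for illustration, and which lspci flags produce exactly this layout is an assumption.

```python
# Sample text trimmed from the quoted post.
LSPCI = """\
49:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev 51)
    DeviceName: X570/590 SATA1
    Kernel driver in use: vfio-pci
    Kernel modules: ahci
"""


def vfio_bound(text):
    """Return (address, description) for each device bound to vfio-pci."""
    found = []
    current = None
    for line in text.splitlines():
        if line and not line[0].isspace():
            # Unindented line: "<pci address> <description>" starts a new device.
            address, _, description = line.partition(" ")
            current = (address, description)
        elif "Kernel driver in use: vfio-pci" in line and current:
            found.append(current)
    return found


bound = vfio_bound(LSPCI)
for address, description in bound:
    print(address, description)
```

Any storage controller that shows up in this list is reserved for passthrough, so the host loses its disks when the VM starts.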

  3. Hi All, 


    I shut down my server to add a network card. When I booted it back up, one of the disks was missing; I suspect I dislodged the SATA cable by mistake. 


    The disk then became disabled in the array. I shut down and reseated all the cables, and the disk came back up in disabled mode. 


    I restarted in maintenance mode to run SMART tests, and everything seemed fine. I removed the disk from the array, added it back in, and rebuilt the array; no errors were reported. 


    However, I now seem to have XFS errors across all disks: 

    Jul 15 10:06:12 Tower kernel: XFS (md1): metadata I/O error in "xfs_imap_to_bp+0x5c/0xa2 [xfs]" at daddr 0x11a0df168 len 32 error 5
    Jul 15 10:06:12 Tower kernel: XFS (md1): metadata I/O error in "xfs_imap_to_bp+0x5c/0xa2 [xfs]" at daddr 0x11a0df168 len 32 error 5
    Jul 15 10:06:12 Tower kernel: XFS (md1): metadata I/O error in "xfs_imap_to_bp+0x5c/0xa2 [xfs]" at daddr 0x11a0df168 len 32 error 5
    Jul 15 10:06:12 Tower kernel: XFS (md1): metadata I/O error in "xfs_imap_to_bp+0x5c/0xa2 [xfs]" at daddr 0x11a0df168 len 32 error 5
    Jul 15 10:06:12 Tower kernel: XFS (md1): metadata I/O error in "xfs_imap_to_bp+0x5c/0xa2 [xfs]" at daddr 0x11a0df168 len 32 error 5
    Jul 15 10:06:12 Tower kernel: XFS (md1): metadata I/O error in "xfs_imap_to_bp+0x5c/0xa2 [xfs]" at daddr 0x11a0df168 len 32 error 5
    Jul 15 10:06:12 Tower kernel: XFS (md1): metadata I/O error in "xfs_imap_to_bp+0x5c/0xa2 [xfs]" at daddr 0x11a0df168 len 32 error 5
    Jul 15 10:06:12 Tower kernel: XFS (md1): metadata I/O error in "xfs_imap_to_bp+0x5c/0xa2 [xfs]" at daddr 0x11a0df168 len 32 error 5
    Jul 15 10:06:12 Tower kernel: XFS (md1): metadata I/O error in "xfs_imap_to_bp+0x5c/0xa2 [xfs]" at daddr 0x11a0df168 len 32 error 5
    Jul 15 10:06:12 Tower kernel: XFS (md1): metadata I/O error in "xfs_imap_to_bp+0x5c/0xa2 [xfs]" at daddr 0x11a0df168 len 32 error 5
    Jul 15 10:06:12 Tower kernel: XFS (md2): metadata I/O error in "xfs_imap_to_bp+0x5c/0xa2 [xfs]" at daddr 0x697e9f18 len 32 error 5
    Jul 15 10:06:12 Tower kernel: XFS (md2): metadata I/O error in "xfs_imap_to_bp+0x5c/0xa2 [xfs]" at daddr 0x697e9f18 len 32 error 5
    Jul 15 10:06:12 Tower kernel: XFS (md2): metadata I/O error in "xfs_imap_to_bp+0x5c/0xa2 [xfs]" at daddr 0x697e9f18 len 32 error 5
    Jul 15 10:06:12 Tower kernel: XFS (md2): metadata I/O error in "xfs_imap_to_bp+0x5c/0xa2 [xfs]" at daddr 0x653bd0d8 len 32 error 5
    Jul 15 10:06:12 Tower kernel: XFS (md2): metadata I/O error in "xfs_imap_to_bp+0x5c/0xa2 [xfs]" at daddr 0x653bd0d8 len 32 error 5
    Jul 15 10:06:12 Tower kernel: XFS (md2): metadata I/O error in "xfs_imap_to_bp+0x5c/0xa2 [xfs]" at daddr 0x653bd0d8 len 32 error 5
    Jul 15 10:06:12 Tower kernel: XFS (md2): metadata I/O error in "xfs_imap_to_bp+0x5c/0xa2 [xfs]" at daddr 0x653bd0d8 len 32 error 5
    Jul 15 10:06:13 Tower dhcpcd[3442]: br1: using IPv4LL address 169.254.222.50
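Because the same line repeats many times, it can help to tally the errors per device and per disk address. This is a rough Python sketch run against a few lines copied from the log above; the regular expression is my own and only assumes the message format shown there.

```python
import re
from collections import Counter

# A few representative lines copied from the syslog excerpt.
LOG = """\
Jul 15 10:06:12 Tower kernel: XFS (md1): metadata I/O error in "xfs_imap_to_bp+0x5c/0xa2 [xfs]" at daddr 0x11a0df168 len 32 error 5
Jul 15 10:06:12 Tower kernel: XFS (md2): metadata I/O error in "xfs_imap_to_bp+0x5c/0xa2 [xfs]" at daddr 0x697e9f18 len 32 error 5
Jul 15 10:06:12 Tower kernel: XFS (md2): metadata I/O error in "xfs_imap_to_bp+0x5c/0xa2 [xfs]" at daddr 0x653bd0d8 len 32 error 5
Jul 15 10:06:13 Tower dhcpcd[3442]: br1: using IPv4LL address 169.254.222.50
"""

PATTERN = re.compile(
    r"XFS \((?P<dev>\w+)\): metadata I/O error .* at daddr (?P<daddr>0x[0-9a-f]+)"
)


def summarize(log):
    """Count XFS metadata I/O errors per device and collect the distinct daddr values."""
    counts = Counter()
    addresses = {}
    for line in log.splitlines():
        match = PATTERN.search(line)
        if match:
            counts[match["dev"]] += 1
            addresses.setdefault(match["dev"], set()).add(match["daddr"])
    return counts, addresses


counts, addresses = summarize(LOG)
for dev in sorted(counts):
    print(dev, counts[dev], sorted(addresses[dev]))
```

Fed the full syslog, this shows at a glance which md devices are affected and whether the errors cluster on a handful of addresses.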

     

     

    All drives appear to have errors now. I've attached the diagnostic files from before and after:

    • Jul 14 - While disk was disabled. 
    • Jul 15 - After array was rebuilt 

     

    Any advice on what I should do here? 

     

    Thanks!

    Drives.jpg

    tower-diagnostics-20210715-1011.zip tower-diagnostics-20210714-1730.zip

  4. Hi All, 

     

    I'm using a Logitech Unifying receiver passed through to a Windows 10 VM. It randomly disconnects in Windows, and the only way I can get it back is to reboot the entire Unraid system. The Windows VM otherwise appears to work fine and Unraid keeps running, so I can plug in another USB mouse and then reboot the system. 


    I realized it only happens when I am on a conference call. I have a pair of Plantronics Blackwire 5220 USB-C headphones. After the Unifying receiver disconnects, the headphones remain fully functional, so I can finish my call and reboot, but it's very frustrating. 

     

    USB Devices: 

    Bus 001 Device 001:	ID 1d6b:0002 Linux Foundation 2.0 root hub
    Bus 002 Device 001:	ID 1d6b:0003 Linux Foundation 3.0 root hub
    Bus 003 Device 001:	ID 1d6b:0002 Linux Foundation 2.0 root hub
    Bus 004 Device 001:	ID 1d6b:0003 Linux Foundation 3.0 root hub
    Bus 005 Device 001:	ID 1d6b:0002 Linux Foundation 2.0 root hub
    Bus 005 Device 002:	ID 0d8c:0134 C-Media Electronics, Inc. USB PnP Audio Device
    Bus 005 Device 003:	ID 05e3:0608 Genesys Logic, Inc. Hub
    Bus 005 Device 004:	ID 0414:a002 Giga-Byte Technology Co., Ltd USB Audio
    Bus 005 Device 005:	ID 046d:c52b Logitech, Inc. Unifying Receiver
    Bus 005 Device 006:	ID 047f:c053 Plantronics, Inc. USB2.0 Hub
    Bus 005 Device 007:	ID 048d:8297 Integrated Technology Express, Inc. ITE Device(8595)
    Bus 006 Device 001:	ID 1d6b:0003 Linux Foundation 3.0 root hub
    Bus 007 Device 001:	ID 1d6b:0002 Linux Foundation 2.0 root hub
    Bus 007 Device 002:	ID 0461:4d17 Primax Electronics, Ltd Optical Mouse
    Bus 007 Device 003:	ID 8087:0029 Intel Corp.
    Bus 007 Device 004:	ID 05e3:0608 Genesys Logic, Inc. Hub
    Bus 008 Device 001:	ID 1d6b:0003 Linux Foundation 3.0 root hub
    Bus 008 Device 002:	ID 0781:55a3 SanDisk Corp. Ultra Luxe
    Bus 009 Device 001:	ID 1d6b:0002 Linux Foundation 2.0 root hub
    Bus 009 Device 002:	ID 2109:2812 VIA Labs, Inc. VL812 Hub
    Bus 009 Device 003:	ID 046d:082c Logitech, Inc. HD Webcam C615
    Bus 010 Device 001:	ID 1d6b:0003 Linux Foundation 3.0 root hub
    Bus 010 Device 002:	ID 2109:0812 VIA Labs, Inc. VL812 Hub

     

    I've attached 3 recent crash logs below:

    tower-diagnostics-20201210-1831.zip

    8th Dec:

    From the syslog I can see the following:

    Dec  8 19:04:37 Tower kernel: usb 5-4.2: reset full-speed USB device number 5 using xhci_hcd
    ### [PREVIOUS LINE REPEATED 1 TIMES] ###
    Dec  8 19:04:38 Tower kernel: logitech-djreceiver 0003:046D:C52B.000F: hiddev97,hidraw0: USB HID v1.11 Device [Logitech USB Receiver] on usb-0000:46:00.1-4.2/input2
    Dec  8 19:04:38 Tower kernel: input: Logitech MX Master 2S as /devices/pci0000:40/0000:40:01.1/0000:41:00.0/0000:42:08.0/0000:46:00.1/usb5/5-4/5-4.2/5-4.2:1.2/0003:046D:C52B.000F/0003:046D:4069.0010/input/input18
    Dec  8 19:04:38 Tower kernel: logitech-hidpp-device 0003:046D:4069.0010: input,hidraw3: USB HID v1.11 Keyboard [Logitech MX Master 2S] on usb-0000:46:00.1-4.2:1
    Dec  8 19:04:38 Tower kernel: input: Logitech Craft as /devices/pci0000:40/0000:40:01.1/0000:41:00.0/0000:42:08.0/0000:46:00.1/usb5/5-4/5-4.2/5-4.2:1.2/0003:046D:C52B.000F/0003:046D:4066.0011/input/input19
    Dec  8 19:04:38 Tower kernel: logitech-hidpp-device 0003:046D:4066.0011: input,hidraw6: USB HID v1.11 Keyboard [Logitech Craft] on usb-0000:46:00.1-4.2:2
    Dec  8 19:04:38 Tower kernel: input: Logitech MX Master 3 as /devices/pci0000:40/0000:40:01.1/0000:41:00.0/0000:42:08.0/0000:46:00.1/usb5/5-4/5-4.2/5-4.2:1.2/0003:046D:C52B.000F/0003:046D:4082.0012/input/input20
    Dec  8 19:04:38 Tower kernel: logitech-hidpp-device 0003:046D:4082.0012: input,hidraw7: USB HID v1.11 Keyboard [Logitech MX Master 3] on usb-0000:46:00.1-4.2:3

    10th Dec:

    tower-diagnostics-20201210-1831.zip

    Dec 10 18:04:09 Tower kernel: usb 7-3: USB disconnect, device number 3
    Dec 10 18:04:09 Tower kernel: usblp0: removed
    Dec 10 18:04:13 Tower kernel: usb 7-3: new high-speed USB device number 7 using xhci_hcd
    Dec 10 18:04:13 Tower kernel: usblp 7-3:1.0: usblp0: USB Bidirectional printer dev 7 if 0 alt 0 proto 2 vid 0x04F9 pid 0x0424
    Dec 10 18:27:18 Tower kernel: usb 5-4.2: reset full-speed USB device number 5 using xhci_hcd
    ### [PREVIOUS LINE REPEATED 1 TIMES] ###
    Dec 10 18:27:19 Tower kernel: logitech-djreceiver 0003:046D:C52B.000F: hiddev97,hidraw0: USB HID v1.11 Device [Logitech USB Receiver] on usb-0000:46:00.1-4.2/input2
    Dec 10 18:27:19 Tower kernel: input: Logitech MX Master 3 as /devices/pci0000:40/0000:40:01.1/0000:41:00.0/0000:42:08.0/0000:46:00.1/usb5/5-4/5-4.2/5-4.2:1.2/0003:046D:C52B.000F/0003:046D:4082.0010/input/input30
    Dec 10 18:27:19 Tower kernel: logitech-hidpp-device 0003:046D:4082.0010: input,hidraw3: USB HID v1.11 Keyboard [Logitech MX Master 3] on usb-0000:46:00.1-4.2:3
    Dec 10 18:27:19 Tower kernel: input: Logitech MX Master 2S as /devices/pci0000:40/0000:40:01.1/0000:41:00.0/0000:42:08.0/0000:46:00.1/usb5/5-4/5-4.2/5-4.2:1.2/0003:046D:C52B.000F/0003:046D:4069.0011/input/input31
    Dec 10 18:27:19 Tower kernel: logitech-hidpp-device 0003:046D:4069.0011: input,hidraw6: USB HID v1.11 Keyboard [Logitech MX Master 2S] on usb-0000:46:00.1-4.2:1
    Dec 10 18:27:19 Tower kernel: input: Logitech Craft as /devices/pci0000:40/0000:40:01.1/0000:41:00.0/0000:42:08.0/0000:46:00.1/usb5/5-4/5-4.2/5-4.2:1.2/0003:046D:C52B.000F/0003:046D:4066.0012/input/input32
    Dec 10 18:27:19 Tower kernel: logitech-hidpp-device 0003:046D:4066.0012: input,hidraw7: USB HID v1.11 Keyboard [Logitech Craft] on usb-0000:46:00.1-4.2:2
    Dec 10 18:27:21 Tower kernel: logitech-hidpp-device 0003:046D:4082.0010: HID++ 4.5 device connected.
    Dec 10 18:27:23 Tower kernel: logitech-hidpp-device 0003:046D:4066.0012: HID++ 4.5 device connected.

     

    This morning the entire VM crashed when I picked up a call. It seems the GPU has some issues too. 

    tower-diagnostics-20201211-1032.zip

     

    Dec 11 10:24:52 Tower kernel: usb 7-3: USB disconnect, device number 3
    Dec 11 10:24:52 Tower kernel: usblp0: removed
    Dec 11 10:31:31 Tower kernel: usb 5-4.2: reset full-speed USB device number 5 using xhci_hcd
    Dec 11 10:31:31 Tower kernel: usb 7-5: reset full-speed USB device number 4 using xhci_hcd
    Dec 11 10:31:32 Tower kernel: usb 5-1: reset full-speed USB device number 2 using xhci_hcd
    Dec 11 10:31:32 Tower kernel: vfio_bar_restore: 0000:05:00.1 reset recovery - restoring bars
    Dec 11 10:31:32 Tower kernel: vfio_bar_restore: 0000:05:00.0 reset recovery - restoring bars
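To see which devices were reset, the port in each reset line can be matched against the USB device list from earlier in this post. The Python sketch below does that with lines copied from above. It assumes the number before the dash in the port (for example 5 in 5-4.2) is the bus number and that device numbers have not changed since the list was taken, so treat the names as indicative only.

```python
import re

# Relevant entries copied from the USB device list in the post.
LSUSB = """\
Bus 005 Device 002: ID 0d8c:0134 C-Media Electronics, Inc. USB PnP Audio Device
Bus 005 Device 005: ID 046d:c52b Logitech, Inc. Unifying Receiver
Bus 007 Device 004: ID 05e3:0608 Genesys Logic, Inc. Hub
"""

# Lines copied from the 11th Dec syslog excerpt.
SYSLOG = """\
Dec 11 10:24:52 Tower kernel: usb 7-3: USB disconnect, device number 3
Dec 11 10:31:31 Tower kernel: usb 5-4.2: reset full-speed USB device number 5 using xhci_hcd
Dec 11 10:31:31 Tower kernel: usb 7-5: reset full-speed USB device number 4 using xhci_hcd
Dec 11 10:31:32 Tower kernel: usb 5-1: reset full-speed USB device number 2 using xhci_hcd
"""

DEVICE = re.compile(r"Bus (\d+) Device (\d+):\s+ID \S+ (.+)")
RESET = re.compile(
    r"^(\w{3}\s+\d+ [\d:]+) \S+ kernel: usb ([\d.-]+): "
    r"reset \S+ USB device number (\d+)"
)


def device_names(lsusb):
    """Map (bus, device number) to the device name from lsusb-style text."""
    names = {}
    for line in lsusb.splitlines():
        match = DEVICE.match(line)
        if match:
            names[(int(match[1]), int(match[2]))] = match[3].strip()
    return names


def reset_events(syslog, names):
    """Return (timestamp, port, device name) for every USB reset line."""
    events = []
    for line in syslog.splitlines():
        match = RESET.match(line)
        if match:
            bus = int(match[2].split("-")[0])
            name = names.get((bus, int(match[3])), "unknown")
            events.append((match[1], match[2], name))
    return events


events = reset_events(SYSLOG, device_names(LSUSB))
for when, port, name in events:
    print(when, port, name)
```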

     

    I'm not sure what's causing this. Could the USB hub be getting overloaded, or are my headphones drawing too much power? Any help would be very much appreciated. 

     

    Thanks! 

  5. Hi grizzle,

     

    Do you have a single GPU in your system? I had some issues getting Unraid to release the card to my VMs, so I bought a very cheap card to get around this. 

    Also, the AMD reset bug caused me major issues: once I rebooted the VM, I had to restart the entire system to release the resource and allow the VM to boot again. 

    Do your logs show any error when you start the VM? 

  6. On 6/13/2020 at 6:39 PM, grizzle said:

    hi there i have a similar problem passing through a 5700xt the VM boots but in the logs i get the following errors below

     

    -msg timestamp=on
    2020-06-13 10:30:10.873+0000: Domain id=4 is tainted: high-privileges
    2020-06-13 10:30:10.873+0000: Domain id=4 is tainted: custom-argv
    2020-06-13 10:30:10.873+0000: Domain id=4 is tainted: host-cpu
    char device redirected to /dev/pts/0 (label charserial0)
    2020-06-13T10:30:16.387348Z qemu-system-x86_64: -device vfio-pci,host=0000:45:00.0,id=hostdev0,bus=pci.3,multifunction=on,addr=0x0: Failed to mmap 0000:45:00.0 BAR 0. Performance may be slow

     

    I have been trying for days now and i am sure i have just over looked something in the XML 

     

     

     

    Hi Grizzle, 

     

    Did you get your RX5700 working? 

     

    I managed to get it working for me; however, I had to pass the ROM file and enable multifunction, which I didn't have to do in Windows. 

     

    I also passed through an NVMe drive for video editing. 

     

        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
          </source>
          <alias name='hostdev0'/>
          <rom file='/mnt/user/isos/Powercolor.RX5700XT.8176.190808.rom'/>
          <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0' multifunction='on'/>
        </hostdev>
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x05' slot='0x00' function='0x1'/>
          </source>
          <alias name='hostdev1'/>
          <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x1'/>
        </hostdev>
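A quick way to double-check a setup like this is to parse the XML and list each passed-through PCI function with its guest address, ROM file and multifunction flag. The following Python sketch does that with the standard library; the sample is the snippet above wrapped in a `<devices>` element with the alias lines dropped, and the helper names are illustrative.

```python
import xml.etree.ElementTree as ET

# The two hostdev entries from the post, wrapped so they parse as one document.
DEVICES = """
<devices>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <driver name='vfio'/>
    <source>
      <address domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
    </source>
    <rom file='/mnt/user/isos/Powercolor.RX5700XT.8176.190808.rom'/>
    <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0' multifunction='on'/>
  </hostdev>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <driver name='vfio'/>
    <source>
      <address domain='0x0000' bus='0x05' slot='0x00' function='0x1'/>
    </source>
    <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x1'/>
  </hostdev>
</devices>
"""


def pci_address(elem):
    """Format bus/slot/function attributes as bb:ss.f."""
    bus = int(elem.get("bus"), 16)
    slot = int(elem.get("slot"), 16)
    func = int(elem.get("function"), 16)
    return f"{bus:02x}:{slot:02x}.{func:x}"


def pci_hostdevs(xml_text):
    """Summarise every PCI hostdev: host address, guest address, ROM, multifunction."""
    rows = []
    for dev in ET.fromstring(xml_text).iter("hostdev"):
        if dev.get("type") != "pci":
            continue
        guest = dev.find("address")        # direct child: address inside the guest
        host = dev.find("source/address")  # nested: address on the host
        rom = dev.find("rom")
        rows.append({
            "host": pci_address(host),
            "guest": pci_address(guest),
            "multifunction": guest.get("multifunction") == "on",
            "rom": rom.get("file") if rom is not None else None,
        })
    return rows


rows = pci_hostdevs(DEVICES)
for row in rows:
    print(row)
```

In this config both functions of the card (05:00.0 and 05:00.1 on the host) end up on the same guest bus and slot, with multifunction set on function 0.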

     

    Full XML: 

    <?xml version='1.0' encoding='UTF-8'?>
    <domain type='kvm' id='1' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
      <name>MacinaboxCatalina</name>
      <uuid>ae14c64c-03cc-4b73-a006-85958f10c075</uuid>
      <description>MacOS Catalina</description>
      <metadata>
        <vmtemplate xmlns="unraid" name="MacOS" icon="/mnt/user/domains/MacinaboxCatalina/icon/catalina.png" os="Catalina"/>
      </metadata>
      <memory unit='KiB'>8388608</memory>
      <currentMemory unit='KiB'>8388608</currentMemory>
      <memoryBacking>
        <nosharepages/>
      </memoryBacking>
      <vcpu placement='static'>40</vcpu>
      <cputune>
        <vcpupin vcpu='0' cpuset='12'/>
        <vcpupin vcpu='1' cpuset='44'/>
        <vcpupin vcpu='2' cpuset='13'/>
        <vcpupin vcpu='3' cpuset='45'/>
        <vcpupin vcpu='4' cpuset='14'/>
        <vcpupin vcpu='5' cpuset='46'/>
        <vcpupin vcpu='6' cpuset='15'/>
        <vcpupin vcpu='7' cpuset='47'/>
        <vcpupin vcpu='8' cpuset='16'/>
        <vcpupin vcpu='9' cpuset='48'/>
        <vcpupin vcpu='10' cpuset='17'/>
        <vcpupin vcpu='11' cpuset='49'/>
        <vcpupin vcpu='12' cpuset='18'/>
        <vcpupin vcpu='13' cpuset='50'/>
        <vcpupin vcpu='14' cpuset='19'/>
        <vcpupin vcpu='15' cpuset='51'/>
        <vcpupin vcpu='16' cpuset='20'/>
        <vcpupin vcpu='17' cpuset='52'/>
        <vcpupin vcpu='18' cpuset='21'/>
        <vcpupin vcpu='19' cpuset='53'/>
        <vcpupin vcpu='20' cpuset='22'/>
        <vcpupin vcpu='21' cpuset='54'/>
        <vcpupin vcpu='22' cpuset='23'/>
        <vcpupin vcpu='23' cpuset='55'/>
        <vcpupin vcpu='24' cpuset='24'/>
        <vcpupin vcpu='25' cpuset='56'/>
        <vcpupin vcpu='26' cpuset='25'/>
        <vcpupin vcpu='27' cpuset='57'/>
        <vcpupin vcpu='28' cpuset='26'/>
        <vcpupin vcpu='29' cpuset='58'/>
        <vcpupin vcpu='30' cpuset='27'/>
        <vcpupin vcpu='31' cpuset='59'/>
        <vcpupin vcpu='32' cpuset='28'/>
        <vcpupin vcpu='33' cpuset='60'/>
        <vcpupin vcpu='34' cpuset='29'/>
        <vcpupin vcpu='35' cpuset='61'/>
        <vcpupin vcpu='36' cpuset='30'/>
        <vcpupin vcpu='37' cpuset='62'/>
        <vcpupin vcpu='38' cpuset='31'/>
        <vcpupin vcpu='39' cpuset='63'/>
      </cputune>
      <resource>
        <partition>/machine</partition>
      </resource>
      <os>
        <type arch='x86_64' machine='pc-q35-3.1'>hvm</type>
        <loader readonly='yes' type='pflash'>/mnt/user/domains/MacinaboxCatalina/ovmf/OVMF_CODE.fd</loader>
        <nvram>/mnt/user/domains/MacinaboxCatalina/ovmf/OVMF_VARS.fd</nvram>
      </os>
      <features>
        <acpi/>
        <apic/>
      </features>
      <cpu mode='host-passthrough' check='none'>
        <cache mode='passthrough'/>
        <feature policy='require' name='topoext'/>
      </cpu>
      <clock offset='utc'>
        <timer name='rtc' tickpolicy='catchup'/>
        <timer name='pit' tickpolicy='delay'/>
        <timer name='hpet' present='no'/>
      </clock>
      <on_poweroff>destroy</on_poweroff>
      <on_reboot>restart</on_reboot>
      <on_crash>restart</on_crash>
      <devices>
        <emulator>/usr/local/sbin/qemu</emulator>
        <disk type='file' device='disk'>
          <driver name='qemu' type='qcow2' cache='writeback'/>
          <source file='/mnt/user/domains/MacinaboxCatalina/Clover.qcow2' index='3'/>
          <backingStore/>
          <target dev='hdc' bus='sata'/>
          <boot order='1'/>
          <alias name='sata0-0-2'/>
          <address type='drive' controller='0' bus='0' target='0' unit='2'/>
        </disk>
        <disk type='file' device='disk'>
          <driver name='qemu' type='raw' cache='writeback'/>
          <source file='/mnt/user/domains/MacinaboxCatalina/Catalina-install.img' index='2'/>
          <backingStore/>
          <target dev='hdd' bus='sata'/>
          <alias name='sata0-0-3'/>
          <address type='drive' controller='0' bus='0' target='0' unit='3'/>
        </disk>
        <disk type='file' device='disk'>
          <driver name='qemu' type='raw' cache='writeback'/>
          <source file='/mnt/user/domains/MacinaboxCatalina/macos_disk.img' index='1'/>
          <backingStore/>
          <target dev='hde' bus='sata'/>
          <alias name='sata0-0-4'/>
          <address type='drive' controller='0' bus='0' target='0' unit='4'/>
        </disk>
        <controller type='sata' index='0'>
          <alias name='ide'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
        </controller>
        <controller type='pci' index='0' model='pcie-root'>
          <alias name='pcie.0'/>
        </controller>
        <controller type='pci' index='1' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='1' port='0x10'/>
          <alias name='pci.1'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction='on'/>
        </controller>
        <controller type='pci' index='2' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='2' port='0x11'/>
          <alias name='pci.2'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
        </controller>
        <controller type='pci' index='3' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='3' port='0x12'/>
          <alias name='pci.3'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
        </controller>
        <controller type='pci' index='4' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='4' port='0x13'/>
          <alias name='pci.4'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/>
        </controller>
        <controller type='pci' index='5' model='pcie-root-port'>
          <model name='pcie-root-port'/>
          <target chassis='5' port='0x8'/>
          <alias name='pci.5'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0'/>
        </controller>
        <controller type='virtio-serial' index='0'>
          <alias name='virtio-serial0'/>
          <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
        </controller>
        <controller type='usb' index='0' model='ich9-ehci1'>
          <alias name='usb'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x7'/>
        </controller>
        <controller type='usb' index='0' model='ich9-uhci1'>
          <alias name='usb'/>
          <master startport='0'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0' multifunction='on'/>
        </controller>
        <controller type='usb' index='0' model='ich9-uhci2'>
          <alias name='usb'/>
          <master startport='2'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x1'/>
        </controller>
        <controller type='usb' index='0' model='ich9-uhci3'>
          <alias name='usb'/>
          <master startport='4'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x2'/>
        </controller>
        <filesystem type='mount' accessmode='passthrough'>
          <source dir='/mnt/user/Media/Wedding/'/>
          <target dir='Wedding'/>
          <alias name='fs0'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
        </filesystem>
        <interface type='bridge'>
          <mac address='52:54:00:74:a0:97'/>
          <source bridge='br0'/>
          <target dev='vnet0'/>
          <model type='e1000-82545em'/>
          <alias name='net0'/>
          <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
        </interface>
        <serial type='pty'>
          <source path='/dev/pts/0'/>
          <target type='isa-serial' port='0'>
            <model name='isa-serial'/>
          </target>
          <alias name='serial0'/>
        </serial>
        <console type='pty' tty='/dev/pts/0'>
          <source path='/dev/pts/0'/>
          <target type='serial' port='0'/>
          <alias name='serial0'/>
        </console>
        <channel type='unix'>
          <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/domain-1-MacinaboxCatalina/org.qemu.guest_agent.0'/>
          <target type='virtio' name='org.qemu.guest_agent.0' state='connected'/>
          <alias name='channel0'/>
          <address type='virtio-serial' controller='0' bus='0' port='1'/>
        </channel>
        <input type='tablet' bus='usb'>
          <alias name='input0'/>
          <address type='usb' bus='0' port='1'/>
        </input>
        <input type='mouse' bus='ps2'>
          <alias name='input1'/>
        </input>
        <input type='keyboard' bus='ps2'>
          <alias name='input2'/>
        </input>
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
          </source>
          <alias name='hostdev0'/>
          <rom file='/mnt/user/isos/Powercolor.RX5700XT.8176.190808.rom'/>
          <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x0' multifunction='on'/>
        </hostdev>
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x05' slot='0x00' function='0x1'/>
          </source>
          <alias name='hostdev1'/>
          <address type='pci' domain='0x0000' bus='0x03' slot='0x00' function='0x1'/>
        </hostdev>
        <hostdev mode='subsystem' type='usb' managed='no'>
          <source>
            <vendor id='0x046d'/>
            <product id='0xc52b'/>
            <address bus='5' device='2'/>
          </source>
          <alias name='hostdev2'/>
          <address type='usb' bus='0' port='4'/>
        </hostdev>
        <hostdev mode='subsystem' type='usb' managed='no'>
          <source>
            <vendor id='0x047f'/>
            <product id='0xc053'/>
            <address bus='5' device='6'/>
          </source>
          <alias name='hostdev3'/>
          <address type='usb' bus='0' port='5'/>
        </hostdev>
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
          </source>
          <alias name='hostdev4'/>
          <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/>
        </hostdev>
        <memballoon model='virtio'>
          <alias name='balloon0'/>
          <address type='pci' domain='0x0000' bus='0x05' slot='0x00' function='0x0'/>
        </memballoon>
      </devices>
      <seclabel type='dynamic' model='dac' relabel='yes'>
        <label>+0:+100</label>
        <imagelabel>+0:+100</imagelabel>
      </seclabel>
      <qemu:commandline>
        <qemu:arg value='-usb'/>
        <qemu:arg value='-device'/>
        <qemu:arg value='usb-kbd,bus=usb-bus.0'/>
        <qemu:arg value='-device'/>
        <qemu:arg value='************************'/>
        <qemu:arg value='-smbios'/>
        <qemu:arg value='type=2'/>
        <qemu:arg value='-cpu'/>
        <qemu:arg value='Penryn,kvm=on,vendor=GenuineIntel,+invtsc,vmware-cpuid-freq=on,+pcid,+ssse3,+sse4.2,+popcnt,+avx,+aes,+xsave,+xsaveopt,check'/>
      </qemu:commandline>
    </domain>

     

    I still suffer from the AMD reset bug: I need to restart if I want to switch from macOS to Windows. Not ideal, but it's working. Spaceinvader mentions it in this video; the script stopped working for me as well. 

  7. Hi Everyone, 

     

    Thanks for your help so far. I wanted to document what I’ve learned over the weekend, in case anyone else comes across this thread and so that I can remember what I did. 


    I tried the following:

    1. Removed all drives and 3rd Party NIC - fans continue to ramp up
    2. Changed many settings in bios
      1. Cooling
      2. Performance
      3. etc...
      4. Reset to defaults
      5. I copied the settings exactly from the 12Core to the 20Core
    3. I created a fresh install of Unraid just in case my version had messed up. Made no difference. 
    4. Stopped Docker and VM manager. Made no difference. 
    5. Stopped Array. Made no difference
    6. 20Core has a 750 W PSU - 12Core has a 460 W PSU
      1. Removed the spare PSU from the 12Core and tried it in the 20Core. Made no difference.
      2. Plugged in both 460 W PSUs with power to the 20Core. Made no difference.
    7. Updated iLO - trial license, free until the end of 2020
      1. Free iLO Advanced 2020
    8. I have not attempted to update any other firmware yet.
      1. Note 20Core and 12Core seem to have the same setup.

    Viewing the fan status via iLO became tiresome, and I had no way to view historical performance. I decided to pull stats from the server (CPU, fan speed, temps, etc.) and plot them from a time-series DB, so that I could compare temps vs. fan speed or CPU usage vs. fan speed. 

     

    I came across this post:  https://www.homelabrat.com/ipmi-dashboard/ 

     

    Enable IPMI over LAN Access

    Log in to iLO:

    1. Administration
    2. Access Settings
    3. Enable IPMI/DCMI over LAN Access

     

    I installed Community Applications

     

    Then installed the following Docker containers:

    • influxdb 
    • grafana
    • telegraf 

    Influxdb:

    default - no changes

     

    Telegraf:

    Ensure you create a file at the following location: /mnt/user/appdata/telegraf/telegraf.conf. Note that the Docker install created a folder named telegraf.conf; I deleted it and replaced it with a file containing the contents of this file:

    Open the console of the telegraf Docker container:

    image.png.e77a9cf17e4b3cd0fe9949281feefc0c.png

     

    Install ipmitool by running the following commands:

     

    apk update
    apk add ipmitool

    Now you can use ipmitool to pull stats from iLO. Verify that the following command works, substituting your own IP/USER/PASS:

    • Note: -I can be either lan or lanplus; lanplus worked for me
    ipmitool -H 192.168.1.142 -U admin -P password -I lanplus sdr

     

    You should see something like so:

    ipmitool -H 192.168.1.142 -U admin -P password -I lanplus sdr
    UID Light        | 0x00              | ok
    Sys. Health LED  | no reading        | ns
    01-Inlet Ambient | 32 degrees C      | ok
    02-CPU 1         | 40 degrees C      | ok
    03-CPU 2         | 40 degrees C      | ok
    04-P1 DIMM 1-6   | disabled          | ns
    05-P1 DIMM 7-12  | 39 degrees C      | ok
    06-P2 DIMM 1-6   | disabled          | ns
    07-P2 DIMM 7-12  | 37 degrees C      | ok
    08-P1 Mem Zone   | 39 degrees C      | ok

     

    Now edit the following file:

    /mnt/user/appdata/telegraf/telegraf.conf

     

    Search the file for IPMI, then remove the # from the following lines, or paste the text below (replacing your IP/USER/PASS):

    • I'm pulling stats from iLO every 10 seconds
    [[inputs.ipmi_sensor]]
    servers = ["admin:password@lanplus(192.168.1.142)","admin:password@lanplus(192.168.1.110)"]
    interval = "10s"
    timeout = "20s"
    path = "/usr/sbin/ipmitool"
    metric_version = 2
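With more than a couple of servers, the servers line gets fiddly to type by hand. As a rough sketch, the Python below builds it from a list of hosts using the same user:password@lanplus(host) form as the config above. The helper names are made up, and it does no escaping, so it assumes the credentials contain no quotes or backslashes.

```python
def ipmi_servers(user, password, hosts, interface="lanplus"):
    """Build one connection string per host in the form used in the post."""
    return [f"{user}:{password}@{interface}({host})" for host in hosts]


def servers_line(entries):
    """Render the entries as the servers = [...] line for telegraf.conf."""
    quoted = ",".join(f'"{entry}"' for entry in entries)
    return f"servers = [{quoted}]"


# Placeholder credentials and IPs, the same ones used in the example config.
entries = ipmi_servers("admin", "password", ["192.168.1.142", "192.168.1.110"])
line = servers_line(entries)
print(line)
```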

    Save and close the file then restart telegraf

     

    image.png.76ff6746da7bf6507caec6f9770852ec.png

     

    Check the logs to ensure there are no errors by clicking the log icon on the right of the telegraf Docker container:

    image.thumb.png.ff56c7cddfbb7e260195a70558af07ca.png

     

    Telegraf should now be pulling information from both of my servers and sending it to the telegraf database in InfluxDB every 10 seconds.

     

    Grafana

    Now we need to visualize the data. Note that I am not an expert in Grafana; I followed a few tutorials online. 

     

    Log in to Grafana by browsing to http://UNRAIDIP:3000

    The default username/password is admin/admin

     

    We need to connect Grafana to InfluxDB. On the left, select the settings icon and select data sources:

    image.png.691b5265ddc3597648ed3171d050bd6a.png

     

    Change the following:

    name: influxdb

    URL: http://UNRAIDIP:8086

    Database: telegraf

     

    image.thumb.png.38ac8c80ee89e6888f12c8a4ce709b82.png

     

    Hit save and test. If everything worked you should see the following: 

     

    image.png.0960047cbf880ebc762b9f08b843c512.png

     

    We could create a dashboard from scratch, but that would take too long, so let's import one.

    Click the plus icon and select import:

     

    image.png.af704b8878b82b89fb09739d4d813e4b.png

     

    Pre-made dashboards are available here. I tried both; neither worked straight away.

    https://grafana.com/grafana/dashboards/10192 (DL380 Gen8 )

    https://grafana.com/grafana/dashboards/10191 (DL360 Gen7) 

     

     

    Paste the ID (either 10191 or 10192) and hit the Tab key.

     

    image.png.f6ae0617b371c536fa10b927aaecbc74.png

     

    You should then see the following: 

    • Name: whatever you want
    • Folder: general will make it easier to find in the future
    • uid: hit change if you see an error

    image.png.b97a9569803abe20c132aed9bf2520dd.png

     

    You should now see a screen like this. Note that the server address is wrong.

     

    image.thumb.png.75dca0c750ca3d5f586614b90a537e70.png

     

    Click the settings icon on the top right and select Variables.

     

    image.thumb.png.8ff458a3e6dbb1cf6f57fd7b51efbd35.png

     

    Double-click the server variable, change the values to match your server, hit the Update button, then go back to the dashboard.

     

    image.png.179bbae50d878ebd45a9a1d12e015f2d.png

     

    Your IP now appears in the drop-down at the top left. Select it and hit Save.

     

    image.png.a140f30836bce0343f289601fee65a2d.png

     

    You should then see something like so:

    image.thumb.png.88b30ac64ebda6e114ce7a0111733b13.png

     

    Note that the fan panels didn't work by default for me, but play around with them until you get it working the way you like.

     

    You can also import my dashboard from this JSON file: HP Serv-1588565699801.json

     

    I can now see the temperature over time and the fan speed:

    • Notice the three red circles where the fans dropped down to around 30%.
    • I realized I had reset the iLO at those points, which had also reset the fans.

    image.thumb.png.613885c62a1e654d0b30594999ebf729.png

     

     

    By running the following command, I could reset the iLO and thus reset the fans.

    ipmitool -H 192.168.1.110 -U admin -P password -I lanplus mc reset warm

    As the fans were still ramping up to 95% after 17 minutes or so, I decided to reset the iLO every 5 minutes.

     

    image.thumb.png.b4f4bc0a3fc8912af604b26cf4d726c0.png

     

    I installed the plugin CA User Scripts, then created a new script called resetIlo

     

    Script:

    #!/bin/bash
    ipmitool -H 192.168.1.110 -U admin -P password -I lanplus mc reset warm

     

    Set a custom schedule to run every 5 minutes:

    */5 * * * *
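
    A gentler variant of the script above would be to reset the iLO only when a fan is actually running high. This is only a sketch under assumptions: it assumes the fan rows in `sdr list` look like "Fan 1 | 94.08 percent | ok" (check your own output first), and the function names and the threshold of 60 are made up for illustration.

```shell
#!/bin/bash
# Hypothetical variant of resetIlo: reset only when a fan is above a threshold.
# Assumed sensor rows: "Fan 1 | 94.08 percent | ok".

ILO_HOST="192.168.1.110"   # same example address as in the script above
ILO_USER="admin"
ILO_PASS="password"
THRESHOLD=60               # percent; an arbitrary example value

# Read sensor rows on stdin and print the highest fan percentage as an integer.
max_fan_percent() {
    awk -F'|' '
        $2 ~ /percent/ {
            v = $2 + 0                 # numeric prefix of " 94.08 percent "
            if (v > max) max = v
        }
        END { printf "%d\n", max }'
}

# Query the iLO and issue the warm reset only if a fan exceeds the threshold.
reset_if_fans_high() {
    local pct
    pct=$(ipmitool -H "$ILO_HOST" -U "$ILO_USER" -P "$ILO_PASS" -I lanplus sdr list | max_fan_percent)
    if [ "$pct" -gt "$THRESHOLD" ]; then
        ipmitool -H "$ILO_HOST" -U "$ILO_USER" -P "$ILO_PASS" -I lanplus mc reset warm
    fi
}

# Add a call to reset_if_fans_high as the last line when using this for real.
```

    With the same */5 schedule, the iLO would then be left alone as long as the fans stay below the threshold.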

     

    image.thumb.png.adf9e2d3e370d667ed642b3b54e74cc5.png

     

    The average over a 5-minute period is now 45%, much better than 95%.

    image.thumb.png.b699b824bc9e10c893eab6f66d588c53.png

     

    While I clearly have not found the cause of the issue, I've at least found a "workaround" for now. A 45% average is much better than 95%.

     

    Last 6 hours:

    image.thumb.png.5a0d426fb4cc5e9f59c6032133eb7f14.png

     

    Thanks

     

     

     

  8. On 4/30/2020 at 7:30 PM, Flubster said:

    What brand are the HDD's? Are they HP SAS drives? My DL380p doesn't like non HP approved drives at all and spins up the fans If i install one. Also try removing the 3rd party card and seeing if that helps. They can be very stubborn servers!

    Thanks, I checked and they are all HP.

  9. Both servers are HP DL360 Gen8.

     

    I copied the temps into a table to compare. There is not much difference: they are mostly the same on each server, and if anything the 20-core seems to be cooler than the 12-core.

     

    image.thumb.png.45a7939f54eb08e67eece1ba89520a03.png

     

  10. Hi All,

     

    I'm wondering if anyone can help or at least point me in the right direction. I understand this is not an HP support forum, but I know many of you have experience.

     

    I bought 2x used HP DL360 (no warranty or support from HP):

    1. 2x 10Core Intel(R) Xeon(R) CPU E5-2660 v2 @ 2.20GHz
    2. 2x 6Core  Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz

     

    Both of them are running fine; however, the fans on the 2x 10-core are running at 94% most of the time, even with the VM and dockers stopped.

    The 2x 6-core is fine, running at 26%.

     

    2x10Core:

    image.thumb.png.0ac034859fa011d526e6a2141457dca5.png

     

    image.thumb.png.2e0cc440b927f41b3421970091938fef.png

     

    2x 6Core

    image.thumb.png.e76f856237d314d95233f757863ed48c.png

     

    image.thumb.png.26d6707fbf5309e60829f5f5cfdf518e.png

     

    I can't really see any major difference between them.

    I have installed a Solarflare card in each of them. 

     

    dl360-20core-diagnostics-20200430-1356.zip ( Licensed ) 

    tower-diagnostics-20200430-1357.zip ( trial ) 

     

    Is there anything else I should be looking for?

     

    Thanks

  11. 2 hours ago, 1812 said:

    noooo, don't format the drive.

     

    install this: 

     

     

    then once installed (and possibly rebooting) go to settings>Unraid Nvidia. once there, look for the stock Unraid builds, pick one, let it install, reboot. then you'll need to go back to the github site for the right patch that aligns with the Unraid version you picked, copy that bzimage to the flash drive, reboot, and then report.

     

     

     

     

     

    Thank You! - really useful tool. 

     

    I installed the plugin from https://raw.githubusercontent.com/linuxserver/Unraid-Nvidia-Plugin/master/plugins/Unraid-Nvidia.plg, then rolled back to 6.8.0.

     

    image.png.86883c299f866bb8f654f89a40c65bc2.png

     

    Rebooted the server, everything seems to be working.

     

    I had passed through the device.

    image.png.d271f60875a014c03a0fedc44442d8c5.png

     

    dl360-20core-diagnostics-20200428-1014.zip

     

    Strangely, I have a different error now:

     

    image.png.b9627c4f107aadb6d3ca7c4670e6202f.png

     

    I moved on and copied https://github.com/AnnabellaRenee87/Unraid-HP-Proliant-Edition/blob/master/bzimage releases/Stable/6-8-0/bzimage over to the flash drive, then rebooted, and everything started fine.

     

    Got the same error as above. 

    dl360-20core-diagnostics-20200428-1026.zip

     

    The logs do not mention anything relating to ineligible devices - perhaps the check did not happen in 6.8.0

    vfio-pci 0000:04:00.0: Device is ineligible for IOMMU domain attach due to platform RMRR requirement.

    I will research the error above

     

    Thanks

     

     

  12. 9 hours ago, 1812 said:

    how interesting.

     

    you typically don't need to update bios for the patch to work but in this case I would consider it. But before that,  I would also consider dropping back an Unraid version or two, maybe 6.8.2 or 6.8.1 and try with those patches. I don't have a need for the patch anymore so I have no way to test if it is the patch or not. But that will eliminate a variable.

    Apologies, I’m rather new to this. 

     

    What is the "easiest" way to drop back to a previous version? 

     

    Right now, I'd reformat the drive and copy over the files. Is there a way to retain my existing setup? I have a lot of dockers etc. installed.

  13. 3 hours ago, 1812 said:

     

     

    your log shows

     

    
    vfio-pci 0000:04:00.0: Device is ineligible for IOMMU domain attach due to platform RMRR requirement.

     

    you need

     

     

     

     

    Hi, 

     

    Thanks very much for the tip! I somehow missed this thread.

     

    I tried to follow the instructions; however, I still seem to have the same issue in my syslog.

     

    dl360-20core-diagnostics-20200426-2354.zip

     

    I'm running the latest version, 6.8.3.

     

    I copied the bzimage file from here: https://github.com/AnnabellaRenee87/Unraid-HP-Proliant-Edition/tree/master/bzimage releases/Stable/6-8-3 onto my flash disk and rebooted.

     

    image.png.d95d458813e3dcea85bfc8ce055795e2.png

     

    After rebooting, I can see the new file is there with today's date; however, I still get "Device is ineligible".

    Apr 26 08:52:42 DL360-20Core kernel: vfio-pci 0000:04:00.0: Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.
    Apr 26 08:52:43 DL360-20Core kernel: vfio_ecap_init: 0000:04:00.0 hiding ecap 0x19@0x168
    Apr 26 08:52:43 DL360-20Core kernel: vfio_ecap_init: 0000:04:00.0 hiding ecap 0x1e@0x26c
    Apr 26 08:52:43 DL360-20Core kernel: vfio-pci 0000:04:00.1: Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.
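
    To check quickly after each reboot whether the patched bzimage made a difference, the log can be filtered for just the affected devices. A minimal sketch, assuming the message wording shown above and that the current log lives in /var/log/syslog; the function name is made up for illustration.

```shell
#!/bin/bash
# Print the unique PCI addresses that the kernel reports as blocked by RMRR.
# Reads log lines on stdin; prints nothing if no device is affected.
rmrr_blocked_devices() {
    grep 'ineligible for IOMMU domain attach' \
        | grep -oE '[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]' \
        | sort -u
}

# Usage on the server:
#   rmrr_blocked_devices < /var/log/syslog
```

    An empty result after booting the patched kernel would suggest the RMRR check is no longer blocking the card.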

    I also compiled it on my server and then used that version; however, I got the same result.

    mkdir -p /mnt/cache/.rmrr
    cd /mnt/cache/.rmrr
    wget https://raw.githubusercontent.com/AnnabellaRenee87/Unraid-HP-Proliant-Edition/master/build_script/kernel_compile.sh
    chmod +x kernel_compile.sh
    ./kernel_compile.sh

     

    Is there anything else I could be missing? Do I need to update the BIOS, or are there any other ideas?

     

    Thanks!

     

     

     

     

     

     

     

  14. Hi all, 

     

    I've been trying to pass through a Solarflare dual 10Gb network card by following Spaceinvader's video here: https://www.youtube.com/watch?v=n2OPfALLqRA

     

    The card shows up as two devices. Both have the same ID, yet they are listed in two different IOMMU groups.

    IOMMU group 23:[1924:0a03] 04:00.0 Ethernet controller: Solarflare Communications SFC9220 10/40G Ethernet Controller (rev 02)
    
    IOMMU group 24:[1924:0a03] 04:00.1 Ethernet controller: Solarflare Communications SFC9220 10/40G Ethernet Controller (rev 02)

    I edited my Syslinux Configuration file adding:

    append vfio-pci.ids=1924:0a03 initrd=/bzroot
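
    The vfio-pci.ids value can also be derived from the IOMMU listing rather than typed by hand. A minimal sketch, assuming lines in the "IOMMU group N:[vendor:device] ..." format shown above; the function name and the file name iommu.txt are made up for illustration.

```shell
#!/bin/bash
# Read IOMMU listing lines on stdin and print one "vfio-pci.ids=..." parameter.
# Assumed line format: "IOMMU group 23:[1924:0a03] 04:00.0 Ethernet controller: ..."
vfio_ids_param() {
    grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' \
        | tr -d '[]' \
        | sort -u \
        | paste -sd, - \
        | sed 's/^/vfio-pci.ids=/'
}

# Usage, with the relevant listing lines saved to a file:
#   vfio_ids_param < iommu.txt
```

    Because both ports of the card report the same ID, the two lines above collapse into a single entry, which matches the append line.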

    I hit Apply, then Done, restarted, and then went to edit my VM.

     

    When I edit the VM settings (Ubuntu), I can see two devices listed:

    image.png.22303bef57f0c4a128bca50d34e1aa83.png

     

    However, when I assign one or both of them to the VM, I get the following error on startup:

     

    image.png.c15b65a6e1c36454f2ff74149be6ca3b.png

     

     

    I've included the diagnostics file below:

    dl360-20core-diagnostics-20200420-2128.zip

     

    I have a feeling this card is doing something weird and may not be supported, but I wanted to check whether any experts could help before I move it into a dedicated machine.

     

    Thanks

    Happy