
dnoyeb



  1. Retested on beta 29; the issue is still there. Based on further testing and Radek's comment, it occurs in 6.8.3 as well. I tried both cards that were in the box and got the same issue on each address. Here's the current error that pops up (the diag is attached):
     Execution error
     internal error: qemu unexpectedly closed the monitor: 2020-09-29T17:39:26.418978Z qemu-system-x86_64: -device vfio-pci,host=0000:03:00.0,id=hostdev0,bus=pci.4,addr=0x0: vfio 0000:03:00.0: Failed to set up TRIGGER eventfd signaling for interrupt INTX-0: VFIO_DEVICE_SET_IRQS failure: Device or resource busy
     One side note I wanted to bring up: based on Mellanox's website, they're up to driver version 5.x, whereas the diagnostic files I'm looking through seem to show this build is using the 4.0 driver. Any reason to think that could contribute? Or when these devices are passed through, are they completely transparent to Unraid?
     tower-diagnostics-20200929-1351.zip
  2. Oh, I see one I'm going to test tomorrow: "webgui: better handling of multiple nics with vfio-pci". Do you guys prefer that we update the testing in our bug report thread for record-keeping purposes?
  3. Anonymized version attached; if you need the other version, let me know and I'll DM it. Other testing done last night: I tried disabling the USB 2.0 ports and had the same issue. I also tried disabling the USB 3.0 ports and moved the key over to a 2.0 port, just to make sure it wasn't something funny like that. Neither helped. Thanks for any guidance.
     tower-diagnostics-20200804-0854.zip
  4. A bit more info. I see these lines in the logs:
     Aug 3 19:04:09 Tower kernel: vfio-pci 0000:03:00.0: enabling device (0100 -> 0102)
     Aug 3 19:04:10 Tower kernel: vfio-pci 0000:03:00.0: vfio_ecap_init: hiding ecap 0x19@0x18c
     Aug 3 19:04:10 Tower kernel: genirq: Flags mismatch irq 16. 00000000 (vfio-intx(0000:03:00.0)) vs. 00000080 (ehci_hcd:usb1)
     Hopefully this helps in the troubleshooting.
  5. Trying to get a Mellanox ConnectX-3 card passed through to a VM and having some trouble. To set the stage: I have two of these cards in my Unraid server, and I am using one of them for the OS. I used the Tools / System Devices / Bind Selected to VFIO at boot method and have verified that the card is added:
     cat vfio-pci.cfg
     BIND=0000:03:00.0|15b3:1003
     Here is the log showing it was successfully bound at boot of Unraid:
     Loading config from /boot/config/vfio-pci.cfg
     BIND=0000:03:00.0|15b3:1003
     ---
     Processing 0000:03:00.0 15b3:1003
     Vendor:Device 15b3:1003 found at 0000:03:00.0
     IOMMU group members (sans bridges):
     /sys/bus/pci/devices/0000:03:00.0/iommu_group/devices/0000:03:00.0
     Binding...
     Successfully bound the device 15b3:1003 at 0000:03:00.0 to vfio-pci
     ---
     vfio-pci binding complete
     Devices listed in /sys/bus/pci/drivers/vfio-pci:
     lrwxrwxrwx 1 root root 0 Aug 3 18:56 0000:03:00.0 -> ../../../../devices/pci0000:00/0000:00:1c.4/0000:03:00.0
     ls -l /dev/vfio/
     total 0
     crw------- 1 root root 249, 0 Aug 3 18:56 12
     crw-rw-rw- 1 root root 10, 196 Aug 3 18:56 vfio
     The card shows up when setting up the VM:
     Other PCI Devices: Mellanox Technologies MT27500 Family [ConnectX-3] | Ethernet controller (03:00.0)
     When that box is checked and the VM is started, this error shows up in the log:
     2020-08-04T00:24:59.369033Z qemu-system-x86_64: -device vfio-pci,host=0000:03:00.0,id=hostdev0,bus=pci.0,addr=0x8: vfio 0000:03:00.0: Failed to set up TRIGGER eventfd signaling for interrupt INTX-0: VFIO_DEVICE_SET_IRQS failure: Device or resource busy
     2020-08-04 00:25:00.476+0000: shutting down, reason=failed
     I'm confused as to how / why it is in use, since it is allocated to VFIO at boot. I have tried enabling "PCIe ACS override" and, for the fun of it, "VFIO allow unsafe interrupts". Neither helped (didn't think they would, but tried anyway). Does anyone have any thoughts on other things to try?
     I am working to set up a RockNSM VM and need to pass through this NIC so the VM can capture the 10G mirror of my uplink to my router. Thanks in advance for any assistance.
  6. Wow, I'm literally in the exact same boat... I have been debating between upgrading to some 2696 v2's in the same motherboard and moving to AMD, since those seem to be extremely strong. What I'm concerned about is the PCIe x16 slots: I have 3 M1015 cards now (I have about 20 drives) and still need space for a 10Gb NIC and onboard VGA, plus at least one more slot for my Nvidia 1660 for the gaming VM / Plex transcoding. Have you made any headway on the motherboard you'd pick?
  7. Holy smokes, the new alpha build on transcoding via the Nvidia card is freaking unreal... my CPU levels went to practically zero. Huge improvement having decode and encode working by default.
  8. OK, big thanks to JasonM! Got the VM side working while using the Nvidia build without pinning anything. A few things were needed:
     Step 1 (initial instructions JasonM shared, which probably would have been enough if I was already running OVMF):
     However, this didn't get it working; it turned out that my Windows 10 VM was running on SeaBIOS. I used the directions from alturismo to prepare my SeaBIOS-backed Windows 10 install for OVMF.
     Step 2 (prepare vdisk for OVMF):
     Step 3: Edit the newly created VM template, pin the CPUs, manually add the vdisk, check the boxes for the keyboard / mouse I was attaching, and save without starting.
     Step 4: Edit the VM again, this time in XML mode. Take the hostdev code you built in Step 1 above and paste it under the last entry for hostdev.
     Step 5: Save and edit one last time in GUI mode. Add the Nvidia GPU and sound. Save.
     Step 6: Verify you don't have any transcodes going on, boot up the sucker and go play some Doom.
     Just figured I'd share in case anyone came across my issue and wondered how it got solved.
  9. Replying to my own issue... I can get the plugin to see the card if I remove the vfio-pci.ids= entry (the IDs of the Nvidia GPU, Nvidia sound, Nvidia USB, Nvidia serial), but this breaks my ability to connect to a VM. Tried pcie_acs_override=downstream, but still no go. Added the multithread option as well; still a no go. So does anyone with these newer cards (1660 Ti and above) have the ability to use the card with both the Nvidia plugin and KVM? KVM doesn't seem to like my IOMMU group due to those dang USB/serial controllers on the Nvidia card, hence I had to stub them to allow me to launch the VM. Really hope I can do both; otherwise I may just have to return the card for another model (1060 or the like). Side note, does anyone else have a 1660 Ti? Do they all have this stupid Nvidia USB/serial on them? It seems to be what's screwing with me.
  10. Question... I installed a new 1660 Ti for playing games in a VM (I know it will cause issues if I launch while transcodes are going). However, to get the VMs to boot I had to use vfio-pci.ids= in my syslinux config, as the card apparently has a USB / serial controller built in, and the VMs wouldn't launch since the IOMMU group had the Nvidia GPU, the Nvidia sound, the Nvidia USB and the Nvidia serial in it. Anyway, I used vfio-pci.ids= to resolve that, but based on my syslog it seems to be keeping the kernel module from this plugin from attaching properly to the card:
      Sep 3 18:54:59 unRAID kernel: nvidia: loading out-of-tree module taints kernel.
      Sep 3 18:54:59 unRAID kernel: nvidia: module license 'NVIDIA' taints kernel.
      Sep 3 18:54:59 unRAID kernel: Disabling lock debugging due to kernel taint
      Sep 3 18:54:59 unRAID kernel: sd 10:0:2:0: [sdn] Attached SCSI disk
      Sep 3 18:54:59 unRAID kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 247
      Sep 3 18:54:59 unRAID kernel: NVRM: The NVIDIA probe routine was not called for 1 device(s).
      Sep 3 18:54:59 unRAID kernel: NVRM: This can occur when a driver such as:
      Sep 3 18:54:59 unRAID kernel: NVRM: nouveau, rivafb, nvidiafb or rivatv
      Sep 3 18:54:59 unRAID kernel: NVRM: was loaded and obtained ownership of the NVIDIA device(s).
      Sep 3 18:54:59 unRAID kernel: NVRM: Try unloading the conflicting kernel module (and/or
      Sep 3 18:54:59 unRAID kernel: NVRM: reconfigure your kernel without the conflicting
      Sep 3 18:54:59 unRAID kernel: NVRM: driver(s)), then try loading the NVIDIA kernel module
      Sep 3 18:54:59 unRAID kernel: NVRM: again.
      Sep 3 18:54:59 unRAID kernel: NVRM: No NVIDIA devices probed.
      Sep 3 18:54:59 unRAID kernel: nvidia-nvlink: Unregistered the Nvlink Core, major device number 247
      Sep 3 18:54:59 unRAID kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 247
      Sep 3 18:54:59 unRAID kernel: NVRM: The NVIDIA probe routine was not called for 1 device(s).
      Sep 3 18:54:59 unRAID kernel: NVRM: This can occur when a driver such as:
      Sep 3 18:54:59 unRAID kernel: NVRM: nouveau, rivafb, nvidiafb or rivatv
      Sep 3 18:54:59 unRAID kernel: NVRM: was loaded and obtained ownership of the NVIDIA device(s).
      Sep 3 18:54:59 unRAID kernel: NVRM: Try unloading the conflicting kernel module (and/or
      Sep 3 18:54:59 unRAID kernel: NVRM: reconfigure your kernel without the conflicting
      Sep 3 18:54:59 unRAID kernel: NVRM: driver(s)), then try loading the NVIDIA kernel module
      Sep 3 18:54:59 unRAID kernel: NVRM: again.
      Sep 3 18:54:59 unRAID kernel: NVRM: No NVIDIA devices probed.
      Sep 3 18:54:59 unRAID kernel: nvidia-nvlink: Unregistered the Nvlink Core, major device number 247
      Has anyone had this issue and worked around it?
  11. Looking over on the main Plex page, I see folks running it after doing a manual upgrade. Go into the console for that docker and do the following:
      wget <paste link to Ubuntu version of 1597 Plex>
      Wait for it to download, then
      dpkg -i
      Restart your Plex docker and you’re done. I haven't had a chance to try it just yet.
  12. Sweet, thanks. Has anyone tried out the new transcoder for hardware encoding / decoding yet?
  13. Quick question: I am guessing that the "latest" version is only pulling from beta. Any way to get the docker to update to 1.16.7.1597 instead? Curious to try out the new transcoder.
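
On the genirq line in post 4: the kernel prints that message with the newly requested flags and name first, then the flags and name of the handler already on the IRQ, and 0x80 is IRQF_SHARED in the kernel headers. Read that way, vfio-intx asked for IRQ 16 without sharing while ehci_hcd:usb1 already held it as shared, which would fit the "Device or resource busy" error. Below is a minimal Python sketch that decodes such a line. The function name and dictionary keys are made up for illustration, and this is not an Unraid tool.

```python
import re

# IRQF_SHARED as defined in the Linux kernel's include/linux/interrupt.h.
IRQF_SHARED = 0x80

# Matches: "genirq: Flags mismatch irq N. FLAGS (NAME) vs. FLAGS (NAME)"
# The first pair is the new request, the second the existing handler.
PATTERN = re.compile(
    r"genirq: Flags mismatch irq (\d+)\. "
    r"([0-9a-f]{8}) \((.+)\) vs\. ([0-9a-f]{8}) \((.+)\)"
)


def parse_flags_mismatch(line):
    """Return a dict describing a genirq flags mismatch, or None if no match.

    Hypothetical helper: the key names are illustrative only.
    """
    match = PATTERN.search(line)
    if match is None:
        return None
    irq, new_flags, new_name, old_flags, old_name = match.groups()
    return {
        "irq": int(irq),
        "requester": new_name,
        "requester_shared": bool(int(new_flags, 16) & IRQF_SHARED),
        "holder": old_name,
        "holder_shared": bool(int(old_flags, 16) & IRQF_SHARED),
    }


# The line quoted in post 4.
SAMPLE = (
    "Aug 3 19:04:10 Tower kernel: genirq: Flags mismatch irq 16. "
    "00000000 (vfio-intx(0000:03:00.0)) vs. 00000080 (ehci_hcd:usb1)"
)

if __name__ == "__main__":
    print(parse_flags_mismatch(SAMPLE))
```

If that reading is right, the conflict is with the USB controller that sits on the same interrupt line, not with the VFIO binding itself.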
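
For the binding check in post 5, here is a rough Python sketch that reads a BIND= line like the one in vfio-pci.cfg and lists the other members of each device's IOMMU group from sysfs. It assumes multiple entries, if any, are separated by spaces on the BIND line, which I have not confirmed against Unraid's own script. The function names are hypothetical.

```python
from pathlib import Path


def parse_bind_line(line):
    """Split 'BIND=<address>|<vendor>:<device> ...' into a list of dicts.

    Assumption: entries are whitespace-separated. Returns [] for other lines.
    """
    key, _, value = line.strip().partition("=")
    if key != "BIND":
        return []
    entries = []
    for token in value.split():
        address, _, ids = token.partition("|")
        vendor, _, device = ids.partition(":")
        entries.append({"address": address, "vendor": vendor, "device": device})
    return entries


def iommu_group_members(address, sysfs_root="/sys/bus/pci/devices"):
    """List PCI addresses sharing an IOMMU group with `address`.

    Returns [] when the sysfs path does not exist (for example when IOMMU
    is disabled or the script is not running on the host).
    """
    group_dir = Path(sysfs_root) / address / "iommu_group" / "devices"
    if not group_dir.is_dir():
        return []
    return sorted(entry.name for entry in group_dir.iterdir())


if __name__ == "__main__":
    for entry in parse_bind_line("BIND=0000:03:00.0|15b3:1003"):
        print(entry, iommu_group_members(entry["address"]))
```

In the log from post 5 the group contains only 0000:03:00.0, so on that box the second function would be expected to return just that one address.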
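
The hostdev code from Step 1 in post 8 did not survive in this excerpt, so the following is not JasonM's snippet. It is only a generic sketch of what a libvirt PCI hostdev entry looks like, built in Python from a PCI address. The address used is a made-up placeholder and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def hostdev_xml(pci_address):
    """Build a libvirt <hostdev> element for a host PCI device.

    `pci_address` is in the usual 'dddd:bb:ss.f' form, e.g. '0000:01:00.2'.
    """
    domain, bus, rest = pci_address.split(":")
    slot, function = rest.split(".")
    hostdev = ET.Element("hostdev", mode="subsystem", type="pci", managed="yes")
    source = ET.SubElement(hostdev, "source")
    ET.SubElement(
        source,
        "address",
        domain=f"0x{domain}",
        bus=f"0x{bus}",
        slot=f"0x{slot}",
        function=f"0x{function}",
    )
    return ET.tostring(hostdev, encoding="unicode")


if __name__ == "__main__":
    # Placeholder address, not taken from the original posts.
    print(hostdev_xml("0000:01:00.2"))
```

The resulting element would go after the last existing hostdev entry in the VM's XML, as Step 4 describes.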
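
On posts 9 and 10: as far as I understand it, vfio-pci.ids= makes vfio-pci claim every device with a listed vendor:device ID at boot, GPU function included, which would explain why the NVIDIA module reports that its probe routine was not called. Here is a small Python sketch for checking which IDs are stubbed on the kernel command line and which driver currently owns a device. The IDs and the PCI address in the example are placeholders, not the real 1660 Ti values, and the function names are hypothetical.

```python
from pathlib import Path


def stubbed_ids(cmdline):
    """Return the vendor:device IDs listed in vfio-pci.ids= on a kernel cmdline."""
    ids = []
    for token in cmdline.split():
        if token.startswith("vfio-pci.ids="):
            value = token.split("=", 1)[1]
            ids.extend(item.lower() for item in value.split(",") if item)
    return ids


def bound_driver(address, sysfs_root="/sys/bus/pci/devices"):
    """Return the name of the driver bound to a PCI device, or None.

    Reads the 'driver' symlink in sysfs; None means no driver is bound or
    the path does not exist on this machine.
    """
    link = Path(sysfs_root) / address / "driver"
    if not link.exists():
        return None
    return link.resolve().name


if __name__ == "__main__":
    # On a real host this would come from /proc/cmdline; placeholder IDs here.
    example = "BOOT_IMAGE=/bzimage vfio-pci.ids=10de:0001,10de:0002 initrd=/bzroot"
    print(stubbed_ids(example))
    print(bound_driver("0000:01:00.0"))  # placeholder address
```

If the GPU's own ID shows up in the first list and its driver reads vfio-pci, the plugin's module has nothing left to attach to, which matches what removing vfio-pci.ids= did in post 9.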