deivis163 Posted June 2, 2016

Hello,

I have an MSI Z170A PC MATE motherboard, an Intel i7 6700 CPU, one NVIDIA GTX 960 and one NVIDIA GTX 650. My motherboard has only two PCI Express slots: one runs at x16, the other at x4. The problem is with the x4 slot: I can't start a VM with the GPU that is installed in it. I can see that this GPU is in the same IOMMU group as other devices. How can I move that GPU into an isolated group of its own?

PCI devices

00:00.0 Host bridge: Intel Corporation Sky Lake Host Bridge/DRAM Registers (rev 07)
00:01.0 PCI bridge: Intel Corporation Sky Lake PCIe Controller (x16) (rev 07)
00:02.0 VGA compatible controller: Intel Corporation Sky Lake Integrated Graphics (rev 06)
00:08.0 System peripheral: Intel Corporation Sky Lake Gaussian Mixture Model
00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller (rev 31)
00:14.2 Signal processing controller: Intel Corporation Sunrise Point-H Thermal subsystem (rev 31)
00:15.0 Signal processing controller: Intel Corporation Sunrise Point-H LPSS I2C Controller #0 (rev 31)
00:15.1 Signal processing controller: Intel Corporation Sunrise Point-H LPSS I2C Controller #1 (rev 31)
00:16.0 Communication controller: Intel Corporation Sunrise Point-H CSME HECI #1 (rev 31)
00:17.0 SATA controller: Intel Corporation Device a102 (rev 31)
00:1c.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #1 (rev f1)
00:1c.2 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #3 (rev f1)
00:1c.4 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #5 (rev f1)
00:1d.0 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #9 (rev f1)
00:1d.2 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #11 (rev f1)
00:1d.3 PCI bridge: Intel Corporation Sunrise Point-H PCI Express Root Port #12 (rev f1)
00:1e.0 Signal processing controller: Intel Corporation Sunrise Point-H LPSS UART #0 (rev 31)
00:1f.0 ISA bridge: Intel Corporation Sunrise Point-H LPC Controller (rev 31)
00:1f.2 Memory controller: Intel Corporation Sunrise Point-H PMC (rev 31)
00:1f.3 Audio device: Intel Corporation Sunrise Point-H HD Audio (rev 31)
00:1f.4 SMBus: Intel Corporation Sunrise Point-H SMBus (rev 31)
01:00.0 VGA compatible controller: NVIDIA Corporation GM206 [GeForce GTX 960] (rev a1)
01:00.1 Audio device: NVIDIA Corporation Device 0fba (rev a1)
03:00.0 USB controller: ASMedia Technology Inc. Device 1242
04:00.0 VGA compatible controller: NVIDIA Corporation GK107 [GeForce GTX 650] (rev a1)
04:00.1 Audio device: NVIDIA Corporation GK107 HDMI Audio Controller (rev a1)
06:00.0 PCI bridge: ASMedia Technology Inc. ASM1083/1085 PCIe to PCI Bridge (rev 03)
08:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)

IOMMU Groups

/sys/kernel/iommu_groups/0/devices/0000:00:00.0
/sys/kernel/iommu_groups/1/devices/0000:00:01.0
/sys/kernel/iommu_groups/2/devices/0000:00:02.0
/sys/kernel/iommu_groups/3/devices/0000:00:08.0
/sys/kernel/iommu_groups/4/devices/0000:00:14.0
/sys/kernel/iommu_groups/4/devices/0000:00:14.2
/sys/kernel/iommu_groups/5/devices/0000:00:15.0
/sys/kernel/iommu_groups/5/devices/0000:00:15.1
/sys/kernel/iommu_groups/6/devices/0000:00:16.0
/sys/kernel/iommu_groups/7/devices/0000:00:17.0
/sys/kernel/iommu_groups/8/devices/0000:00:1c.0
/sys/kernel/iommu_groups/9/devices/0000:00:1c.2
/sys/kernel/iommu_groups/9/devices/0000:00:1c.4
/sys/kernel/iommu_groups/9/devices/0000:03:00.0
/sys/kernel/iommu_groups/9/devices/0000:04:00.0
/sys/kernel/iommu_groups/9/devices/0000:04:00.1
/sys/kernel/iommu_groups/10/devices/0000:00:1d.0
/sys/kernel/iommu_groups/11/devices/0000:00:1d.2
/sys/kernel/iommu_groups/11/devices/0000:00:1d.3
/sys/kernel/iommu_groups/11/devices/0000:06:00.0
/sys/kernel/iommu_groups/11/devices/0000:08:00.0
/sys/kernel/iommu_groups/12/devices/0000:00:1e.0
/sys/kernel/iommu_groups/13/devices/0000:00:1f.0
/sys/kernel/iommu_groups/13/devices/0000:00:1f.2
/sys/kernel/iommu_groups/13/devices/0000:00:1f.3
/sys/kernel/iommu_groups/13/devices/0000:00:1f.4
/sys/kernel/iommu_groups/14/devices/0000:01:00.0
/sys/kernel/iommu_groups/14/devices/0000:01:00.1

Error when starting the machine:

internal error: early end of file from monitor: possible problem:
2016-06-02T14:56:00.315917Z qemu-system-x86_64: -device vfio-pci,host=04:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on: vfio: error, group 9 is not viable, please ensure all devices within the iommu_group are bound to their vfio bus driver.
2016-06-02T14:56:00.315930Z qemu-system-x86_64: -device vfio-pci,host=04:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on: vfio: failed to get group 9
2016-06-02T14:56:00.315937Z qemu-system-x86_64: -device vfio-pci,host=04:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on: Device initialization failed
2016-06-02T14:56:00.315943Z qemu-system-x86_64: -device vfio-pci,host=04:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on: Device 'vfio-pci' could not be initialized

Could someone advise me on how to solve this issue?
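For anyone reading along, here is a minimal sketch of how the IOMMU group listing above can be grouped by number to see what shares a group with the GPU at 04:00.0. The function names and `SAMPLE_PATHS` are made up for illustration; the sample is just the group 9 and group 14 entries copied from the post.

```python
from collections import defaultdict

# Paths copied from the listing in the post (groups 9 and 14 only).
SAMPLE_PATHS = [
    "/sys/kernel/iommu_groups/9/devices/0000:00:1c.2",
    "/sys/kernel/iommu_groups/9/devices/0000:00:1c.4",
    "/sys/kernel/iommu_groups/9/devices/0000:03:00.0",
    "/sys/kernel/iommu_groups/9/devices/0000:04:00.0",
    "/sys/kernel/iommu_groups/9/devices/0000:04:00.1",
    "/sys/kernel/iommu_groups/14/devices/0000:01:00.0",
    "/sys/kernel/iommu_groups/14/devices/0000:01:00.1",
]


def group_devices(paths):
    """Map IOMMU group number -> list of PCI addresses found in that group."""
    groups = defaultdict(list)
    for path in paths:
        parts = path.strip().split("/")
        # Expected layout: /sys/kernel/iommu_groups/<group>/devices/<address>
        if len(parts) != 7 or parts[3] != "iommu_groups":
            continue  # skip anything that is not a group entry
        groups[int(parts[4])].append(parts[6])
    return dict(groups)


def group_of(groups, address):
    """Return the group number that contains the given PCI address, or None."""
    for number, devices in groups.items():
        if address in devices:
            return number
    return None


if __name__ == "__main__":
    groups = group_devices(SAMPLE_PATHS)
    number = group_of(groups, "0000:04:00.0")
    print(f"group {number}: {groups[number]}")
```

On a live system the same list could be produced with `glob.glob("/sys/kernel/iommu_groups/*/devices/*")` instead of the hard-coded sample. With the sample above, the GTX 650 (04:00.0 and 04:00.1) ends up in group 9 together with two root ports and the ASMedia USB controller at 03:00.0, which matches the "group 9 is not viable" error.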
chvb Posted June 3, 2016

Your second card is in the same IOMMU group as other devices. Try enabling the PCIe ACS Override function. You can find it here:

Settings -> VM Manager -> Enable PCIe ACS Override

Set it to enabled, then reboot your UNRAID system.
deivis163 (Author) Posted June 3, 2016

Thanks for the reply. I enabled this option, but nothing changed after the reboot. I think this option would help if both GPUs were in the same group. I think I have to manually move my NVIDIA GTX 650 to a different IOMMU group, but I don't know how to do it. I hope there is a solution and that someone can suggest what to try. I saw in other posts that this is not so easy on the Skylake platform, but I hope it is possible.
chvb Posted June 3, 2016

Maybe you should try deactivating your USB controller in the BIOS and give it another try. As you can see, the USB controller is in the same group as your second card.
deivis163 (Author) Posted June 3, 2016

But after that change I can't boot UNRAID from my USB flash drive.
chvb Posted June 3, 2016

Your USB 3 controller should stay enabled:

00:14.0 USB controller: Intel Corporation Sunrise Point-H USB 3.0 xHCI Controller (rev 31)

The one you have to deactivate is this one:

03:00.0 USB controller: ASMedia Technology Inc. Device 1242

Plug your USB flash drive into a port on the Intel USB 3 controller. Maybe this could be your solution.
deivis163 (Author) Posted June 3, 2016

I checked the BIOS menu. I can disable USB there, but that disables all USB controllers; there is no option to disable only one of them. Is it possible to move it to another group somehow, or to disable that controller from the UNRAID command line?
chvb Posted June 3, 2016

OK. Check the ID of the USB controller with the command:

lspci -n

and add this to syslinux.cfg:

vfio-pci.ids=10de:1381

Replace 10de:1381 with your device ID and reboot your system.
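As a rough illustration of the step above, the vendor:device ID can be pulled out of `lspci -n` output and turned into the `vfio-pci.ids=` parameter. This is only a sketch: the helper names are invented here, and the two lines in `SAMPLE_LSPCI_N` are assumed example output (the ID `1b21:1242` is a guess at how the ASMedia controller would be reported, not something the poster showed). Always check the real output on your own machine.

```python
import re

# Assumed example of `lspci -n` output; NOT taken from the poster's system.
SAMPLE_LSPCI_N = """\
00:14.0 0c03: 8086:a12f (rev 31)
03:00.0 0c03: 1b21:1242
"""

# `lspci -n` prints: <slot> <class>: <vendor>:<device> [(rev xx)]
LSPCI_N_LINE = re.compile(
    r"^(?P<slot>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+"
    r"(?P<cls>[0-9a-f]{4}):\s+"
    r"(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})"
)


def ids_for_slots(lspci_n_output, slots):
    """Return {slot: "vendor:device"} for the requested PCI slots."""
    found = {}
    for line in lspci_n_output.splitlines():
        match = LSPCI_N_LINE.match(line.strip())
        if match and match.group("slot") in slots:
            found[match.group("slot")] = (
                f"{match.group('vendor')}:{match.group('device')}"
            )
    return found


def vfio_ids_parameter(ids):
    """Build the kernel parameter; several IDs are comma-separated."""
    return "vfio-pci.ids=" + ",".join(ids)


if __name__ == "__main__":
    ids = ids_for_slots(SAMPLE_LSPCI_N, {"03:00.0"})
    print(vfio_ids_parameter(ids.values()))
```

The printed string is what goes on the kernel command line in syslinux.cfg. Binding the controller to vfio-pci this way takes it away from the host, so nothing the host needs (such as the boot flash drive) should be plugged into its ports.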
deivis163 (Author) Posted June 4, 2016

OMG, it helped! Thank you so much. But now I have one more problem: both of my VMs are restarting automatically due to interrupts. I see these messages in the SSH console:

Message from syslogd@Tower at Jun 4 03:09:18 ... kernel:Disabling IRQ #16
Message from syslogd@Tower at Jun 4 03:10:36 ... kernel:Disabling IRQ #16
Message from syslogd@Tower at Jun 4 03:10:37 ... kernel:Disabling IRQ #16
Message from syslogd@Tower at Jun 4 03:11:08 ... kernel:Disabling IRQ #16
Message from syslogd@Tower at Jun 4 03:11:10 ... kernel:Disabling IRQ #16
Message from syslogd@Tower at Jun 4 03:12:26 ... kernel:Disabling IRQ #16
Message from syslogd@Tower at Jun 4 03:12:28 ... kernel:Disabling IRQ #16

And here is my /proc/interrupts:

          CPU0     CPU1     CPU2     CPU3     CPU4     CPU5     CPU6     CPU7
   0:       54        0        0        0        0        0        0        0  IR-IO-APIC-edge timer
   1:        3        0        0        0        0        0        0        0  IR-IO-APIC-edge i8042
   5:        0        0        0        0        0        0        0        0  IR-IO-APIC-edge parport0
   7:       40        0        0        0        0        0        0        0  IR-IO-APIC-edge
   8:       24        0        0        0        0        0        0        0  IR-IO-APIC-edge rtc0
   9:        0        0        0        0        0        0        0        0  IR-IO-APIC-fasteoi acpi
  12:        3        0        0        0        0        0        0        0  IR-IO-APIC-edge i8042
  16:  6309442        0        0        0        0        0        0        0  IR-IO-APIC 16-fasteoi
 120:        0        0        0        0        0        0        0        0  DMAR_MSI-edge dmar0
 121:        0        0        0        0        0        0        0        0  DMAR_MSI-edge dmar1
 123:    76849        0        0        0        0        0        0        0  IR-PCI-MSI-edge xhci_hcd
 124:  1045372        0        0        0        0        0        0        0  IR-PCI-MSI-edge 0000:00:17.0
 125:   370711        0        0        0        0        0        0        0  IR-PCI-MSI-edge eth0
 NMI:        0        0        0        0        0        0        0        0  Non-maskable interrupts
 LOC:  1377038  1151678  1022694  1006239   724426   840385   964953  1038576  Local timer interrupts
 SPU:        0        0        0        0        0        0        0        0  Spurious interrupts
 PMI:        0        0        0        0        0        0        0        0  Performance monitoring interrupts
 IWI:       46        0        0        0        0        0        0        0  IRQ work interrupts
 RTR:        0        0        0        0        0        0        0        0  APIC ICR read retries
 RES:   756658  1516290   695308  1107540   888187   985263   773691  1076654  Rescheduling interrupts
 CAL:     2345     2935     2045     2924     2948     2243     1948     1716  Function call interrupts
 TLB:    15873    19224    19742    17211    10748    10689     8770     5506  TLB shootdowns
 TRM:        0        0        0        0        0        0        0        0  Thermal event interrupts
 THR:        0        0        0        0        0        0        0        0  Threshold APIC interrupts
 MCE:        0        0        0        0        0        0        0        0  Machine check exceptions
 MCP:       11       11       11       11       11       11       11       11  Machine check polls
 HYP:        0        0        0        0        0        0        0        0  Hypervisor callback interrupts
 ERR:       40
 MIS:        0

Maybe this problem is already known to you?
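A quick way to spot the noisy line in a dump like the one above is to sum the per-CPU counters for each IRQ. The sketch below does that for text in the /proc/interrupts layout; the function names are illustrative, and `SAMPLE_INTERRUPTS` holds only a few rows copied from the post.

```python
# A few rows copied from the /proc/interrupts dump in the post.
SAMPLE_INTERRUPTS = """\
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
16: 6309442 0 0 0 0 0 0 0 IR-IO-APIC 16-fasteoi
123: 76849 0 0 0 0 0 0 0 IR-PCI-MSI-edge xhci_hcd
124: 1045372 0 0 0 0 0 0 0 IR-PCI-MSI-edge 0000:00:17.0
125: 370711 0 0 0 0 0 0 0 IR-PCI-MSI-edge eth0
ERR: 40
"""


def irq_totals(text):
    """Return {label: total count across all CPUs} for each row."""
    lines = text.strip().splitlines()
    cpu_count = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        total = 0
        # Only the first cpu_count fields can be counters; stop at the
        # first non-numeric field (rows such as ERR have fewer columns).
        for field in rest.split()[:cpu_count]:
            if not field.isdigit():
                break
            total += int(field)
        totals[label.strip()] = total
    return totals


def busiest_irq(totals):
    """Return the numbered IRQ with the highest total (skips NMI, LOC, ...)."""
    numbered = {label: count for label, count in totals.items() if label.isdigit()}
    return max(numbered, key=numbered.get)


if __name__ == "__main__":
    totals = irq_totals(SAMPLE_INTERRUPTS)
    irq = busiest_irq(totals)
    print(f"IRQ {irq}: {totals[irq]} interrupts")
```

On a live system the input would come from reading `/proc/interrupts` directly. In the sample, IRQ 16 has by far the highest count and no driver name after `16-fasteoi`, which fits the "Disabling IRQ #16" messages: the kernel typically prints that when a line keeps firing and no handler claims it.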
deivis163 (Author) Posted June 4, 2016

I plugged my SATA cables into the first and second SATA ports, and now everything is working without the interrupt problem. A big thank you to user chvb: I now have two working virtual machines with GPUs. Problem solved.