TheSkaz Posted September 17, 2020
I can't seem to find anything regarding NVLink and passthrough to a VM. Is it possible, and is there a tutorial for it?
TheSkaz Posted September 17, 2020 (Author, edited)
Here is what I am trying now: (causes a kernel panic when trying to start the VM)
TheSkaz Posted September 17, 2020 (Author, edited)
Does this mean anything useful in regards to my issue?
Sep 17 08:44:57 Tower kernel: vfio-pci 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
Sep 17 08:44:57 Tower kernel: Linux agpgart interface v0.103
Sep 17 08:44:57 Tower kernel: xhci_hcd 0000:01:00.2: remove, state 4
Sep 17 08:44:57 Tower kernel: usb usb2: USB disconnect, device number 1
Sep 17 08:44:57 Tower kernel: xhci_hcd 0000:01:00.2: USB bus 2 deregistered
Sep 17 08:44:57 Tower kernel: xhci_hcd 0000:01:00.2: remove, state 4
Sep 17 08:44:57 Tower kernel: usb usb1: USB disconnect, device number 1
Sep 17 08:44:57 Tower kernel: xhci_hcd 0000:01:00.2: USB bus 1 deregistered
Sep 17 08:44:57 Tower kernel: vfio-pci 0000:50:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
Sep 17 08:44:57 Tower kernel: xhci_hcd 0000:50:00.2: remove, state 4
Sep 17 08:44:57 Tower kernel: usb usb16: USB disconnect, device number 1
Sep 17 08:44:57 Tower kernel: xhci_hcd 0000:50:00.2: USB bus 16 deregistered
Sep 17 08:44:57 Tower kernel: xhci_hcd 0000:50:00.2: remove, state 4
Sep 17 08:44:57 Tower kernel: usb usb15: USB disconnect, device number 1
Sep 17 08:44:57 Tower kernel: xhci_hcd 0000:50:00.2: USB bus 15 deregistered
Sep 17 08:44:57 Tower kernel: nvidia: loading out-of-tree module taints kernel.
Sep 17 08:44:57 Tower kernel: nvidia: loading out-of-tree module taints kernel.
Sep 17 08:44:57 Tower kernel: nvidia: module license 'NVIDIA' taints kernel.
Sep 17 08:44:57 Tower kernel: nvidia: module license 'NVIDIA' taints kernel.
Sep 17 08:44:57 Tower kernel: Disabling lock debugging due to kernel taint
Sep 17 08:44:57 Tower kernel: nvidia-nvlink: Nvlink Core is being initialized, major device number 247
Sep 17 08:44:57 Tower kernel: vfio-pci 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
Sep 17 08:44:57 Tower kernel: vfio-pci 0000:50:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
Sep 17 08:44:57 Tower kernel: nvidia 0000:4e:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
Sep 17 08:44:57 Tower kernel: NVRM: The NVIDIA probe routine was not called for 2 device(s).
Sep 17 08:44:57 Tower kernel: NVRM: This can occur when a driver such as:
Sep 17 08:44:57 Tower kernel: NVRM: nouveau, rivafb, nvidiafb or rivatv
Sep 17 08:44:57 Tower kernel: NVRM: was loaded and obtained ownership of the NVIDIA device(s).
Sep 17 08:44:57 Tower kernel: NVRM: Try unloading the conflicting kernel module (and/or
Sep 17 08:44:57 Tower kernel: NVRM: reconfigure your kernel without the conflicting
Sep 17 08:44:57 Tower kernel: NVRM: driver(s)), then try loading the NVIDIA kernel module
Sep 17 08:44:57 Tower kernel: NVRM: again.
Sep 17 08:44:57 Tower kernel: NVRM: loading NVIDIA UNIX x86_64 Kernel Module 440.100 Fri May 29 08:45:51 UTC 2020
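For context on the log above: the vgaarb lines show 0000:01:00.0 and 0000:50:00.0 already held by vfio-pci, while the nvidia module only attaches to 0000:4e:00.0. That is probably why NVRM reports that its probe routine was not called for 2 devices, though nothing in this thread confirms it. Below is a minimal sketch for checking which driver holds each card, assuming the standard sysfs PCI layout. The function name `pci_driver` and its optional second argument are hypothetical, introduced here only for illustration.

```shell
#!/bin/bash
# Print the kernel driver currently bound to a PCI device, or "none".
# The second argument (sysfs root) is an illustrative assumption: it lets the
# function be pointed at a scratch directory and defaults to the real /sys.
pci_driver() {
    local addr="$1" sysfs="${2:-/sys}"
    local link="$sysfs/bus/pci/devices/$addr/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver name, e.g. .../drivers/vfio-pci
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# PCI addresses taken from the log above.
for addr in 0000:01:00.0 0000:50:00.0 0000:4e:00.0; do
    printf '%s -> %s\n' "$addr" "$(pci_driver "$addr")"
done
```

If the two passthrough cards report vfio-pci here, the NVRM message on the host would be expected rather than a sign of a conflicting graphics driver.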
TheSkaz Posted September 22, 2020 (Author)
I have the VM up and able to boot with both GPUs showing. In the VM logs for the machine, I am getting hundreds of these:
2020-09-22T06:21:28.221139Z qemu-system-x86_64: vfio_region_write(0000:01:00.0:region1+0x801b8, 0x0,8) failed: Device or resource busy
That device is my primary video card for the system and one of the two GPUs for the VM. Anything that attempts to use the GPUs freezes.
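A "Device or resource busy" failure when writing to region1 of the host's primary card can mean that something on the host still holds part of that memory region, commonly the boot framebuffer. As a rough check, and assuming the host booted via UEFI with the efifb driver, one can look for an `efifb` entry in /proc/iomem. The helper name `fb_claims` is made up for this sketch, and other framebuffer drivers would appear under different names.

```shell
#!/bin/bash
# List memory ranges that the EFI framebuffer driver has reserved.
# The file argument defaults to /proc/iomem; it is a parameter only so the
# function can also be run against a saved copy (an illustrative assumption).
fb_claims() {
    local iomem="${1:-/proc/iomem}"
    grep -i 'efifb' "$iomem"
}

# Prints nothing when no efifb range is present.
fb_claims || true
```

Run it as root to see real addresses; unprivileged reads of /proc/iomem show the ranges zeroed out.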
TheSkaz Posted September 22, 2020 (Author)
I googled the error and found that running the following works:
echo 0 > /sys/class/vtconsole/vtcon0/bind
echo 0 > /sys/class/vtconsole/vtcon1/bind
echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/unbind
NVLink seems to work too.
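The three commands above detach the host's virtual consoles and unbind the EFI framebuffer so the primary GPU is free for the VM. Here is a sketch of the same steps wrapped in a function with existence checks, so that a missing vtcon1 or an already unbound framebuffer does not produce an error. The function name `release_host_framebuffer` and its optional sysfs-root argument are assumptions added for illustration, not something from the thread.

```shell
#!/bin/bash
# Release the host console and EFI framebuffer before starting the VM.
# The optional argument is the sysfs root; it defaults to /sys and exists
# only so the function can be dry-run against a scratch directory.
release_host_framebuffer() {
    local sysfs="${1:-/sys}"
    local vt
    # Detach the same two virtual consoles named in the post, if present.
    for vt in vtcon0 vtcon1; do
        if [ -w "$sysfs/class/vtconsole/$vt/bind" ]; then
            echo 0 > "$sysfs/class/vtconsole/$vt/bind"
        fi
    done
    # Unbind the EFI framebuffer only if it is currently bound.
    local drv="$sysfs/bus/platform/drivers/efi-framebuffer"
    if [ -e "$drv/efi-framebuffer.0" ]; then
        echo efi-framebuffer.0 > "$drv/unbind"
    fi
    return 0
}
```

Calling `release_host_framebuffer` as root with no argument performs the same writes as the post. The host loses its local console output afterwards, so it is best run over SSH or from a script that starts the VM right after.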
PeteyBoPetey Posted May 24, 2022
I've got the same problem. The two GPUs show up in Device Manager, but I can't see an option to enable NVLink in NVIDIA's control panel. How did you fix this?