Creslinunx

Hi, the info in this thread, together with a couple of other sources, helped me enormously. I managed to get my GTX 1080 working in KVM on Ubuntu 18.04 with the steps and configs below.
This was not an unRAID build but standard Ubuntu 18.04. The overlap is huge though, so hopefully this is of help to others.
It took me a couple of days of playing with lots of options before I found a setup where the KVM would come up clean, the card would not fall off the bus, etc.

PCIe Passthrough on DL360/380 Gen 7 for KVM for GTX 1080
Linux Kernel 4.15.x
Ubuntu 18.04 LTS

A) Host BIOS
============
1) In the host BIOS check that both VT-x and VT-d are enabled. Press "F9" to enter the BIOS at boot.

B) Host Operating system
========================
0) Install KVM and vfio as standard (outside the scope of these bullets)

1) Take a copy of the GTX vbios, this is needed later when loading the KVM
- download nvflash_linux from https://www.techpowerup.com/download/nvidia-nvflash/
- save the vbios with `./nvflash_linux --save mygtx.rom`

2) Blacklist any drivers that may attach to the card
- edit `/etc/modprobe.d/blacklist.conf`
- add:
```
blacklist nouveau
blacklist nvidiafb
blacklist nvidia
blacklist nvidia_drm
blacklist snd-pcsp
```
- run `update-initramfs -u`

3) Take note of the PCIe IDs for the GTX, in this example they are "10de:1b80" and "10de:10f0"
- `lspci -nnk`
** VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1080] [10de:1b80]
** Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0]

4) Assign these to VFIO
- edit `/etc/initramfs-tools/modules` (a plain module list like the one below is not valid syntax in a `modprobe.d` file)
- add (replace the IDs with your own):
```
vfio
vfio_iommu_type1
vfio_pci ids=10de:1b80,10de:10f0
vfio_virqfd
vhost-net
```
- run `update-initramfs -u` again so the list is picked up

5) Create the modprobe.d vfio.conf to ensure vfio takes the IDs on boot
- edit `/etc/modprobe.d/vfio.conf`
- add:
```
options vfio-pci ids=10de:1b80,10de:10f0
softdep nouveau pre: vfio-pci
```
- and set it to load on boot: `echo 'vfio-pci' > /etc/modules-load.d/vfio-pci.conf`

6) Update GRUB to use
IOMMU and allow unsafe interrupts:
- edit `/etc/default/grub`
- update/append the entry "GRUB_CMDLINE_LINUX_DEFAULT" with the additional arguments:
GRUB_CMDLINE_LINUX_DEFAULT="intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1"
- update grub: `update-grub`

7) Recompile Linux to disable the RMRR check
- `apt-get install fakeroot kernel-package linux-source`
- download and untar the Linux source under /usr/src, for example `tar xf /usr/src/linux-source-4.15.tar.xz`
- `cd /usr/src/linux-4.15.0/` (or your source directory)
- edit `/usr/src/linux-4.15.0/drivers/iommu/intel-iommu.c` to remove the `-EPERM` return
- Replace:
```
if (device_is_rmrr_locked(dev)) {
	dev_warn(dev, "Device is ineligible for IOMMU domain attach due to platform RMRR requirement. Contact your platform vendor.\n");
	return -EPERM;
}
```
With:
```
if (device_is_rmrr_locked(dev)) {
	dev_warn(dev, "HP should release a BIOS update for G7 hardware. Ignoring RMRR.\n");
}
```
- Compile the image ("this will take a while..."): `fakeroot make-kpkg --initrd --revision=1.0.custom kernel_image`
- Install the image: `dpkg -i ../linux-image-4.15-.....<< your sub arch number >>_1.0.custom_amd64.deb`

8) Create your KVM in virt manager
- The KVM should use UEFI firmware with the Q35 chipset, not BIOS
- Add hardware, find the GTX PCIe device and add it
  If unsure, you can find your PCI address with: `lspci -nn`
  Here I am interested in `09:00.0`:
  09:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1080] [10de:1b80] (rev a1)
  09:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)
- set "ROM BAR" to on
- If running headless for CUDA work, you may want to add a QXL video card and a SPICE display to access the console in virt manager
- apply / save

9) Edit the KVM XML directly:
- `virsh edit <your kvm name>`
- Update the PCIe device added from the host for your card to include the path to the card's ROM dumped earlier.
- From:
```
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x09' slot='0x00' function='0x0'/>
  </source>
  <rom bar='on'/>
  <address type='pci' domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
</hostdev>
```
- To:
```
<hostdev mode='subsystem' type='pci' managed='yes'>
  <source>
    <address domain='0x0000' bus='0x09' slot='0x00' function='0x0'/>
  </source>
  <rom bar='on' file='/home/creslin/gtx/mygtx.rom'/>
  <address type='pci' domain='0x0000' bus='0x08' slot='0x00' function='0x0'/>
</hostdev>
```

10) Reboot!

11) Check the driver in use for the GTX card is `vfio-pci`
- `lspci -nnk`
09:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1080] [10de:1b80] (rev a1)
	Subsystem: Gigabyte Technology Co., Ltd GP104 [GeForce GTX 1080] [1458:3730]
	Kernel driver in use: vfio-pci
	Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
09:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)
	Subsystem: Gigabyte Technology Co., Ltd GP104 High Definition Audio Controller [1458:3730]
	Kernel driver in use: vfio-pci
	Kernel modules: snd_hda_intel

12) Check the Intel IOMMU is loaded and VT-d is working:
- `dmesg | grep -E "DMAR|IOMMU"`
[ 0.000000] ACPI: DMAR 0x00000000C762FE80 000150 (v01 HP ProLiant 00000001 \xd2? 0000162E)
[ 0.000000] DMAR: IOMMU enabled
[ 0.000000] DMAR-IR: This system BIOS has enabled interrupt remapping
[ 1.319488] DMAR: Host address width 39
[ 1.319492] DMAR: DRHD base: 0x000000cfffe000 flags: 0x1
[ 1.319513] DMAR: dmar0: reg_base_addr cfffe000 ver 1:0 cap c90780106f0462 ecap f0207e
[ 1.319517] DMAR: RMRR base: 0x000000c77fc000 end: 0x000000c77fdfff
[ 1.319520] DMAR: RMRR base: 0x000000c77f5000 end: 0x000000c77fafff
[ 1.319523] DMAR: RMRR base: 0x000000c763e000 end: 0x000000c763ffff
[ 1.319526] DMAR: ATSR flags: 0x0
[ 1.319841] DMAR: dmar0: Using Queued invalidation
[ 1.319858] DMAR: Setting RMRR:
[ 1.320205] DMAR: Setting identity map for device 0000:02:00.0 [0xc763e000 - 0xc763ffff]
[ 1.320562] DMAR: Setting identity map for device 0000:02:00.2 [0xc763e000 - 0xc763ffff]
[ 1.320844] DMAR: Setting identity map for device 0000:05:00.0 [0xc763e000 - 0xc763ffff]
[ 1.321190] DMAR: Setting identity map for device 0000:09:00.0 [0xc763e000 - 0xc763ffff]
[ 1.321564] DMAR: Setting identity map for device 0000:09:00.1 [0xc763e000 - 0xc763ffff]
[ 1.321893] DMAR: Setting identity map for device 0000:00:1d.0 [0xc77f5000 - 0xc77fafff]
[ 1.322178] DMAR: Setting identity map for device 0000:00:1d.1 [0xc77f5000 - 0xc77fafff]
[ 1.322513] DMAR: Setting identity map for device 0000:00:1d.2 [0xc77f5000 - 0xc77fafff]
[ 1.322870] DMAR: Setting identity map for device 0000:00:1d.3 [0xc77f5000 - 0xc77fafff]
[ 1.322891] DMAR: Setting identity map for device 0000:02:00.0 [0xc77f5000 - 0xc77fafff]
[ 1.322896] DMAR: Setting identity map for device 0000:02:00.2 [0xc77f5000 - 0xc77fafff]
[ 1.323218] DMAR: Setting identity map for device 0000:02:00.4 [0xc77f5000 - 0xc77fafff]
[ 1.323570] DMAR: Setting identity map for device 0000:00:1d.7 [0xc77fc000 - 0xc77fdfff]
[ 1.323596] DMAR: Prepare 0-16MiB unity mapping for LPC
[ 1.323918] DMAR: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff]
[ 1.324132] DMAR: Intel(R) Virtualization Technology for Directed I/O

13) Check the GTX
and its sound function (if the card is multifunction) are in their own IOMMU group
- run:
`for d in /sys/kernel/iommu_groups/*/devices/*; do n=${d#*/iommu_groups/*}; n=${n%%/*}; printf 'IOMMU Group %s ' "$n"; lspci -nns "${d##*/}"; done;`
As an example, mine are in group 21 without other devices:
IOMMU Group 21 09:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP104 [GeForce GTX 1080] [10de:1b80] (rev a1)
IOMMU Group 21 09:00.1 Audio device [0403]: NVIDIA Corporation GP104 High Definition Audio Controller [10de:10f0] (rev a1)

14) Start and install your KVM!

C) On KVM
========================
1) Check the guest sees the GTX card:
- `lspci -nnk`

2) Install the Nvidia GTX drivers and CUDA drivers (outside the scope of these bullets)

3) Benchmark. I used gpu-burn and got 100% of the CUDA acceleration in the guest that I did on the host.
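
The two host-side checks above (driver in use is `vfio-pci`, and the card sits alone in its IOMMU group) can be scripted rather than read by eye. The following is only a rough sketch: `driver_for_id` and `foreign_group_members` are hypothetical helper names, not part of `lspci` or any package. It assumes the `lspci -nnk` output and the "IOMMU Group N ..." lines look like the samples in this post.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: they only parse text on stdin, they change nothing.

# Print the "Kernel driver in use" for the device whose [vendor:device] ID
# is $1 (e.g. 10de:1b80). Expects `lspci -nnk` output on stdin.
driver_for_id() {
    awk -v id="[$1]" '
        # Device header lines start with the bus address; detail lines are indented.
        /^[0-9a-f]/ { in_dev = (index($0, id) > 0) }
        in_dev && /Kernel driver in use:/ { print $NF; exit }
    '
}

# Print every device that shares an IOMMU group with slot $1 (e.g. 09:00)
# but is not a function of that slot. Expects the "IOMMU Group N <addr> ..."
# lines produced by the loop in step 13 on stdin. No output means the card
# and its audio function are isolated.
foreign_group_members() {
    awk -v slot="$1" '
        $1 == "IOMMU" && $2 == "Group" {
            group[NR] = $3; addr[NR] = $4; line[NR] = $0
            if (index($4, slot ".") == 1) wanted[$3] = 1
        }
        END {
            for (i = 1; i <= NR; i++)
                if ((i in group) && (group[i] in wanted) && index(addr[i], slot ".") != 1)
                    print line[i]
        }
    '
}
```

For example, `lspci -nnk | driver_for_id 10de:1b80` should print `vfio-pci` before the guest is started, and piping the step 13 loop into `foreign_group_members 09:00` should print nothing. Replace the ID and slot with your own.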