October 17, 2019

Good day,

I've been working on this issue for over three days now and believe I've explored everything I can. Any advice would be gratefully received.

Hardware: HP ML350p Gen 8, latest BIOS. I'm not hitting any of the RR issues and can pass through a GPU (GTX 1060) without problems. The problem is performance (FPS etc.). Even the Superposition benchmark at 720p (low) only pulls around 40 FPS. Using the concBandWidthTest tool, I'm seeing an average bi-directional bandwidth of around 7,900 to 9,000 MB/s. I've upgraded to the latest 6.8.0 rc1 and tried both the Q35 and i440fx VM types, with the same result.

I then read up on the root PCIe bandwidth fix and noted that the issue could have been my GPU not connecting to the VM at PCIe 3.0 x16. With the latest Q35 4.1, both the Nvidia Control Panel and GPU-Z in the VM show x16 3.0 / Gen 3 under load.

That made me wonder whether this is a hardware issue with the server rather than a VM / Unraid issue, so I ran lspci -vvvn -s 0a:00.0 | grep LnkSta to confirm the current link state of the GPU. When not under load, the link speed is 2.5GT/s at x16 width. Under load (albeit at 40 FPS), it shows 8GT/s at x16 width, which I believe is full PCIe 3.0 speed.

One thing to note, which may or may not be relevant: I've allowed unsafe interrupts. My boot parameters are: vfio_iommu_type1.allow_unsafe_interrupts=1 pcie_acs_override=downstream,multifunction initrd=/bzroot

The GPU and its HDMI audio device are each in their own IOMMU group, but I cannot boot the VM with the audio device attached; it fails with a "Failed to set iommu for container: operation not permitted" error. Oddly, I also cannot reset the GPU without attempting to boot with the audio device, which then releases the card.

I'm thinking of a BIOS downgrade on the server, as this seems more hardware related than config related. I'm at a loss to understand why I'm seeing such poor performance.
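For reference on the LnkSta readings above, here is a minimal Python sketch that turns a link speed and width into a theoretical per-direction bandwidth ceiling. The function names and the sample LnkSta lines are illustrative assumptions, not output captured from this server; the encoding overheads (8b/10b at 2.5 and 5 GT/s, 128b/130b at 8 GT/s and above) are the standard PCIe figures.

```python
import re

# Fraction of raw bits that carry data at each signalling rate (GT/s):
# 2.5 and 5 GT/s use 8b/10b encoding, 8 GT/s and above use 128b/130b.
ENCODING = {2.5: 8 / 10, 5.0: 8 / 10, 8.0: 128 / 130, 16.0: 128 / 130}

# Matches both older lspci output ("Speed 8GT/s, Width x16") and newer
# output that adds annotations such as "(ok)" or "(downgraded)".
LNKSTA_RE = re.compile(r"LnkSta:\s*Speed\s+([\d.]+)GT/s.*?Width\s+x(\d+)")


def parse_lnksta(line):
    """Return (speed in GT/s, lane count) from an lspci LnkSta line."""
    match = LNKSTA_RE.search(line)
    if match is None:
        raise ValueError(f"not a LnkSta line: {line!r}")
    return float(match.group(1)), int(match.group(2))


def theoretical_mb_per_s(speed_gt, width):
    """Theoretical one-direction bandwidth in MB/s, before protocol overhead."""
    data_bits_per_s = speed_gt * 1e9 * ENCODING[speed_gt]
    return data_bits_per_s / 8 / 1e6 * width


if __name__ == "__main__":
    # Illustrative lines in lspci's format, matching the idle and loaded
    # readings described in the post.
    samples = [
        "LnkSta: Speed 2.5GT/s, Width x16",
        "LnkSta: Speed 8GT/s, Width x16",
    ]
    for line in samples:
        speed, width = parse_lnksta(line)
        ceiling = theoretical_mb_per_s(speed, width)
        print(f"{speed} GT/s x{width}: ~{ceiling:.0f} MB/s per direction")
```

By this arithmetic an x16 link works out to 4,000 MB/s per direction at 2.5GT/s and roughly 15,750 MB/s per direction at 8GT/s. These are ceilings, and real transfers come in lower, so the comparison with the 7,900 to 9,000 MB/s bi-directional figure is only a rough guide to how far below a Gen 3 x16 link the measurement sits.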
I've checked that the GPU is in an x16 slot and, whilst I initially had some concerns about the C600 chipset used in the ML350p Gen 8, HP documentation confirms it is PCIe 3.0. I've also looked into CPU bottlenecks and confirmed that the GPU and the assigned CPU cores are on the same NUMA node.

Any ideas? I'm happy to post logs if useful, but didn't want to dump them here if not required. Are there any tests I can run on the host to check that the GPU is performing correctly before I try to debug the VM?

Thanks,
Mo
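On the host-side checks asked about above, one that can be scripted is the NUMA placement mentioned in the post. The sketch below assumes Python 3 is available on the host, that the GPU sits at 0000:0a:00.0 (the 0a:00.0 address from the lspci command with the default domain prefixed), and a hypothetical set of pinned cores; it reads the standard Linux sysfs files numa_node and cpulist. It is a rough illustration, not a complete diagnostic.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a Linux cpulist string such as '0-3,8-11' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def gpu_numa_report(pci_addr, vm_cpus, sysfs=Path("/sys")):
    """Return (gpu_node, cpus_off_node) for a PCI device and a set of cores.

    gpu_node is -1 when the kernel reports no NUMA affinity for the device;
    in that case no comparison is made and an empty set is returned.
    """
    node_file = sysfs / "bus/pci/devices" / pci_addr / "numa_node"
    node = int(node_file.read_text().strip())
    if node < 0:
        return node, set()
    cpulist_file = sysfs / "devices/system/node" / f"node{node}" / "cpulist"
    node_cpus = parse_cpulist(cpulist_file.read_text())
    return node, set(vm_cpus) - node_cpus


if __name__ == "__main__":
    addr = "0000:0a:00.0"   # assumed full address of the GPU
    pinned = {2, 3, 4, 5}   # hypothetical cores pinned to the VM
    if (Path("/sys/bus/pci/devices") / addr / "numa_node").exists():
        node, stray = gpu_numa_report(addr, pinned)
        print(f"GPU NUMA node: {node}; pinned cores off that node: {sorted(stray)}")
    else:
        print(f"{addr} not found under /sys/bus/pci/devices")
```

An empty list of off-node cores would agree with the same-node observation in the post; it says nothing by itself about where the remaining performance is going.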