Posts posted by bastl
-
@xlucero1 You can't pass through the audio device from group 15 as long as it isn't separated into its own group. The ACS override option should already have added an entry to your syslinux config. To check it, go to Main and click your flash device. Change it to the following, restart your server, and check your system devices again.
pcie_acs_override=downstream,multifunction
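For reference, the override ends up on the append line of the boot entry in the syslinux config. A minimal sketch of how that entry might look (the label name and other append parameters depend on your setup):

```
label Unraid OS
  kernel /bzimage
  append pcie_acs_override=downstream,multifunction initrd=/bzroot
```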
-
The changes aren't that big in synthetic benchmarks like Time Spy or Heaven. In FarCry 5 I saw more improvement. I guess the reason is that a real game constantly streams textures and other assets, while a synthetic benchmark loads everything into memory right at the beginning. Games like Doom and FarCry at least feel smoother now. Below is an overview of what I've tested.
Test 1 was my original i440fx VM with some manual tweaks like numatune, emulatorpin and an iothread set. For test 2 I created a fresh Q35 VM with the same core count, RAM, NVMe, SSD and GPU as in test 1 and applied all the tweaks from the i440fx VM plus the QEMU arguments at the end of the XML:
<qemu:commandline>
  <qemu:arg value='-global'/>
  <qemu:arg value='pcie-root-port.speed=8'/>
  <qemu:arg value='-global'/>
  <qemu:arg value='pcie-root-port.width=16'/>
</qemu:commandline>
Test 3 is a fresh Q35 VM with no manual tweaks: only GPU and SSD/NVMe passthrough, the same cores and RAM as before, and without the QEMU arguments at the end. Test 4 is the same as test 3 except that I added the QEMU part. And finally, test 5 is basically test 2 with a couple of extra tweaks.
In test 5 I changed the memory mode from 'preferred' to 'strict'
<memory mode='strict' nodeset='1'/>
made some changes to the hyperv section
<hyperv>
  ...
  <vpindex state='on'/>
  <synic state='on'/>
  <stimer state='on'/>
  <reset state='on'/>
</hyperv>
and I changed some parts of the EPYC fix
old:
<cpu mode='custom' match='exact' check='partial'>
  <model fallback='forbid'>EPYC-IBPB</model>
  <topology sockets='1' cores='7' threads='2'/>
  <feature policy='require' name='topoext'/>
  <feature policy='disable' name='monitor'/>
  <feature policy='require' name='x2apic'/>
  <feature policy='require' name='hypervisor'/>
  <feature policy='disable' name='svm'/>
</cpu>
new:
<cpu mode='custom' match='exact' check='full'>
  <model fallback='forbid'>EPYC</model>
  <topology sockets='1' cores='7' threads='2'/>
  <cache level='3' mode='emulate'/>
  <feature policy='require' name='topoext'/>
  <feature policy='disable' name='monitor'/>
  <feature policy='require' name='hypervisor'/>
  <feature policy='disable' name='svm'/>
  <feature policy='disable' name='x2apic'/>
</cpu>
CPU-Z scores also look pretty good now.
-
In all the tests I did so far, synthetic benchmarks like Cinebench, Heaven and Superposition don't show much of a difference. I guess the reason is that a benchmark loads all shaders and textures at the beginning. Testing FarCry 5, the performance gain I see is bigger. I guess games that constantly stream assets will benefit more from that patch. This needs a couple more tests.
Edit:
I posted some tests in another forum related to this.
-
@Nooke I'm currently testing a lot of different settings. My main VM still runs on i440fx, and I have a second template configured as Q35 with the same devices passed through as in my main VM. So far it looks promising. I'm still trying to find the best NUMA settings, pinning etc. Will report back later.
-
Same for me. A test VM with a 1050 Ti and the Nvidia driver shows the correct PCIe bus speeds.
-
Thanks @Jerky_san. You basically added the QEMU lines at the end. For me, in a test VM the Nvidia driver now reports the 1050 Ti as x16 Gen3; before it was only x1.
Another thing I noticed: this is the first VM that uses only the memory node I set up. Usually with 'strict', no matter what, it always used a couple of MB from the other node. Coincidence? I never saw that before.
<memory mode='strict' nodeset='0'/>
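If you want to verify which node a VM's memory actually comes from, one way (a sketch, assuming a standard Linux /proc layout; the VM name `Windows10` in the comment and the sample numa_maps lines below are made up for illustration) is to sum the per-node page counts from the QEMU process's numa_maps:

```shell
# Sketch: sum anonymous pages per NUMA node from numa_maps output.
# In practice you would feed it the real file, e.g.:
#   sum_numa < /proc/$(pgrep -f 'qemu.*Windows10')/numa_maps
sum_numa() {
  grep -o 'N[0-9]*=[0-9]*' |
    awk -F'[N=]' '{pages[$2] += $3}
      END {for (n in pages) printf "node%s: %d MiB\n", n, pages[n] * 4 / 1024}'
}

# Made-up sample input: 8 GiB bound to node1, a few stray KiB on node0.
sum_numa <<'EOF'
7f0000000000 bind:1 anon=2097152 dirty=2097152 N1=2097152 kernelpagesize_kB=4
7f8000000000 default anon=512 dirty=512 N0=512 kernelpagesize_kB=4
EOF
```

With 4 KiB pages, the Nx=count fields multiplied by 4/1024 give MiB per node, so stray allocations on the "wrong" node show up immediately.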
3 hours ago, limetech said:
Right, our testing didn't show much speed difference but maybe not configuring properly...
Looks like a slight improvement to me. 😂
A couple more tests will follow tomorrow. Thanks for adding that fix 👍
-
@Jerky_san can you post your xml for reference?
-
@blaine07 There are still no Mojave web drivers available for Nvidia 10-series cards. I have the EVGA 1050 Ti and passthrough works, but without the drivers there is no acceleration with this card.
-
2 minutes ago, Jerky_san said:
build just for threadripper
Yeah, if I remember correctly they pushed the "ugly patch" into Unraid before it was built into the kernel. Let's hope the devs still love their Threadripper systems and keep playing around with them.
-
I tried a lot of things to improve the performance of my VMs over the last couple of days and stumbled across that Level1Techs forum, as I guess everybody here did. Great in-depth information, and I hope Limetech is able to push that fix to us Unraid users as soon as possible 😉
GIVE US THE FIX NOOOOOOW
Just kidding. Don't push features if they aren't tested in your product. Since I've been using Unraid, all the RC builds I tested (every public RC since early 2018) have been stable for my needs. Sure, there are always performance improvements possible, often on the edge of stability. Always using bleeding-edge technology is fun, and nice for a techie to play with, but for the general user it's often hard to handle. It's hard for @limetech and any other tech company to find a good middle way. I believe in you guys 👍
-
I don't think so. I don't directly pass through a NIC. It's a virtual NIC emulated by Unraid. I guess it's the same for you.
-
If storport.sys handles all the disk IO, then maybe changing/tweaking the iothreadpin can bring improvements.
-
The <emulatorpin> tag specifies which host physical CPUs the emulator (a subset of a domain, not including vCPUs) will be pinned to. The <emulatorpin> tag provides a method of setting a precise affinity to emulator thread processes. As a result, vhost threads run on the same subset of physical CPUs and memory, and therefore benefit from cache locality.
@Tritech Does that mean emulatorpin outside the range of the already used vCPUs? I already have it set up for my main VM so that the emulatorpin cores are separated from the cores the VM uses, on the same die. Difficult if it's not your main language ^^
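For context, a sketch of how separating the emulator and IO threads from the vCPU cores might look in the libvirt XML (the core numbers are illustrative, not from my actual setup):

```
<cputune>
  <!-- vCPUs pinned to cores 8-11 (illustrative numbers) -->
  <vcpupin vcpu='0' cpuset='8'/>
  <vcpupin vcpu='1' cpuset='9'/>
  <vcpupin vcpu='2' cpuset='10'/>
  <vcpupin vcpu='3' cpuset='11'/>
  <!-- emulator and IO threads kept off the vCPU cores, same die -->
  <emulatorpin cpuset='6-7'/>
  <iothreadpin iothread='1' cpuset='6-7'/>
</cputune>
```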
-
@Tritech OK, the "<numa>" tag is only needed if you have more vCPUs than one NUMA node has and you want a NUMA topology inside the VM. Say you take 2 cores from each node; then you can tell the VM how much RAM each "virtual" node uses. This shouldn't affect us.
Btw. a really useful guide.
4 available nodes (0-3)
Node 0: CPUs 0 4, size 4000 MiB
Node 1: CPUs 1 5, size 3999 MiB
Node 2: CPUs 2 6, size 4001 MiB
Node 3: CPUs 3 7, size 4005 MiB

In this scenario, use the following Domain XML setting:

<cputune>
  <vcpupin vcpu="0" cpuset="1"/>
  <vcpupin vcpu="1" cpuset="5"/>
  <vcpupin vcpu="2" cpuset="2"/>
  <vcpupin vcpu="3" cpuset="6"/>
</cputune>
<numatune>
  <memory mode="strict" nodeset="1-2"/>
</numatune>
<cpu>
  <numa>
    <cell id="0" cpus="0-1" memory="3" unit="GiB"/>
    <cell id="1" cpus="2-3" memory="3" unit="GiB"/>
  </numa>
</cpu>
-
<memory mode='strict' nodeset='1'/>
Btw, numatune doesn't really work as it's supposed to. It always grabs RAM from the other node too: first VM 'strict' with 8GB from node0, second VM 'preferred' with 16GB from node1. If I remember right, 'strict' can cause issues when not enough RAM is available on the specified node and should throw an error, but it doesn't for me. No clue how to fix this yet.
Edit:
Another thing I noticed in your XML:
<numa> <cell id='0' cpus='0-15' memory='16777216' unit='KiB'/> </numa>
Isn't that line telling the VM to use 16GB from node0 for cores 0-15, while you're actually using cores 8-15 and 24-32?
-
@Tritech That Level1 forum is the one we talked about earlier, btw 😂
I haven't had any time to test yet, but from what I read, some of these fixes are available with QEMU 3.2 and will be defaults in 4.0. Not sure when we'll see this in Unraid.
-
@Tritech With one of the earlier AGESA updates, I think late 2017 or early 2018, AMD changed something, that's right. The first BIOS version on my board (Dec 2017) reported the core pairings differently than an update a couple of months later did. Depending on your BIOS version (I guess you will already have the changed one), lstopo always showed it correctly, and so does Unraid. In the early days people got confused because everyone had different pairings.
-
These PCIe fixes gnif talked about aren't in yet. He suggested they will be implemented as default in 4.x and first shipped in 3.2, and we are currently on 3.1 in the RC build.
1 Dec '18:
Hi, is there any chance these patches reach the qemu 3.1 release or other systems involved?
gnif: I believe they are trying to get them queued up for 3.1, and will default to full speed in Qemu 4.0. These patches will only apply to platforms that actually have PCIe such as Q35; i440fx is out of the question.
lessaj: Went to do a new build and the patch set failed to apply; it seems as of Dec 19 this patch set was committed to the qemu master branch. Awesome!
gnif: 4.0 is when it will default to using the higher link speeds. Last I read, however, the 3.2 and later builds have these patches, but you must specify the link speed. I have not checked, as I have been on break, and could have the versioning wrong 🙂
-
@Nooke Can you please link the Level1Techs forum entry? I can't find it.
-
Another idea: what version is your installed audio driver? Just a thought; I remember some AMD GPU users had issues passing through their GPUs with the latest drivers. I have a fairly old driver installed which has been working from day one until today.
-
@Tritech I will check in the next few days if it's the newest BIOS causing it.
-
I was already waiting for that question 😂 I have had that same error from the beginning.
Cannot reset device 0000:0a:00.3, depends on group 18 which is not owned.
Group 18 for me is the same as your group 17:
[1022:1455] 0a:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Zeppelin/Renoir PCIe Dummy Function
I never tried to pass that through because I never had audio issues with the onboard device inside the VM. The VM has been running for almost 10 hours now with an online radio playing in the background, and not one single audio drop or lag.
QEMU PCIe Root Port Patch
in Feature Requests
Posted
@rix Try the following: get some load on the GPU, for example with the render test in GPU-Z, and run the following command in Unraid.
lspci -s 43:00.0 -vv | grep LnkSta:
Adjust it so it matches your GPU; 43:00.0 is my passed-through 1080 Ti. 8GT/s is what you want to see for x16 Gen3 speed.
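For reference, a quick sketch of how to pull just the speed/width fields out of that grep (the sample LnkSta line below is illustrative, not copied from my system; in practice you would pipe `lspci -s 43:00.0 -vv | grep LnkSta:` into it):

```shell
# Sketch: extract link speed and width from an lspci LnkSta line.
link_state() {
  grep -o 'Speed [^,]*, Width [^,]*'
}

# Illustrative sample of a healthy x16 Gen3 link:
link_state <<'EOF'
LnkSta: Speed 8GT/s (ok), Width x16 (ok), TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
EOF
```

Anything reporting 2.5GT/s or a narrow width like x1 under load means the root port is not running at full speed.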
A mistake everyone makes is to trust the link speeds GPU-Z reports. Even if it reports x16, the Nvidia system info panel is the place that shows it correctly.
Another tool for testing is concBandwidthTest:
https://forums.evga.com/PCIE-bandwidth-test-cuda-m1972266.aspx
Run it from the command line inside your VM and report back the values you get.