iphillips77 Posted February 16, 2016

Hey everyone, I've posted a few times here getting my system up and everyone's been a great help, thanks.

I'm putting aside part of my unRAID server to use as a gaming and HTPC rig. Core i7 5820K overclocked and stable at 4.5GHz. Everything's running, but unfortunately not smoothly enough for me to actually use.

Here's what I've done:
- Isolated cores 0-7 (out of 12 total) from host operations using isolcpus=0-7 in syslinux.
- Passed through cores 0-3 and 8GB RAM (out of 32 total) to the Windows machine.
- Windows 10 VM image located on an unshared NVMe SSD (fast fast fast).
- Disabled xHCI in the BIOS to split apart the USB controllers, one being passed through to the Windows machine using <hostdev>.
- Nvidia GTX 760 + audio passed through to the Windows machine using <qemu:commandline>.
- MSI stuff done (GTX 760 and audio controller show negative IRQ in Device Manager, lspci -v -s shows MSI: Enable+).

DPC latency tests are generally good, under 1000us for the most part with the occasional spike. It was much, much worse, but enabling MSI on the GTX 760 largely fixed that. System Interrupts in Resource Monitor seems a little high. It's averaging about 4% CPU right now, but last night during tinkering it was up around 10% at times.

I'm gaming with Dolphin, which is an emulator and generally CPU-bound. Running at 100%, 60FPS, my CPU usage hovers around 35-40%, so I've got plenty of overhead there. But I'm getting dips in framerate that I'm thinking are GPU related... because what else could it be?

Oh, another weird thing, who knows, maybe related. I get weird mouse stuttering sometimes. Like the pointer gets stuck for a second, then boing, it's off on the other side of the screen, overshooting whatever I'm trying to click on. That's pretty frustrating too.

Here's my xml. Nothing weird.
(Yeah I haven't deleted the virtio drivers iso part yet; though I'd thought of that in a previous vm running Win7 with the same stuttering problems, didn't help)

<domain type='kvm' id='16' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  <name>Windows 10</name>
  <uuid>8cca1c77-5110-27f1-aa77-5386c6405f85</uuid>
  <metadata>
    <vmtemplate name="Custom" icon="windows7.png" os="windows7"/>
  </metadata>
  <memory unit='KiB'>8388608</memory>
  <currentMemory unit='KiB'>8388608</currentMemory>
  <memoryBacking>
    <nosharepages/>
    <locked/>
  </memoryBacking>
  <vcpu placement='static'>4</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0'/>
    <vcpupin vcpu='1' cpuset='1'/>
    <vcpupin vcpu='2' cpuset='2'/>
    <vcpupin vcpu='3' cpuset='3'/>
  </cputune>
  <resource>
    <partition>/machine</partition>
  </resource>
  <os>
    <type arch='x86_64' machine='pc-i440fx-2.3'>hvm</type>
  </os>
  <features>
    <acpi/>
    <apic/>
  </features>
  <cpu mode='host-passthrough'>
    <topology sockets='1' cores='4' threads='2'/>
  </cpu>
  <clock offset='localtime'>
    <timer name='rtc' tickpolicy='catchup'/>
    <timer name='pit' tickpolicy='delay'/>
    <timer name='hpet' present='no'/>
  </clock>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/bin/qemu-system-x86_64</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='writeback'/>
      <source file='/mnt/nvme/vm_images/vdisk1.img'/>
      <backingStore/>
      <target dev='hdc' bus='virtio'/>
      <boot order='1'/>
      <alias name='virtio-disk2'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
    </disk>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/mnt/user/Misc/kvm/virtio-win.iso'/>
      <backingStore/>
      <target dev='hdb' bus='ide'/>
      <readonly/>
      <alias name='ide0-0-1'/>
      <address type='drive' controller='0' bus='0' target='0' unit='1'/>
    </disk>
    <controller type='usb' index='0'>
      <alias name='usb'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
    </controller>
    <controller type='pci' index='0' model='pci-root'>
      <alias name='pci.0'/>
    </controller>
    <controller type='ide' index='0'>
      <alias name='ide'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
    </controller>
    <controller type='virtio-serial' index='0'>
      <alias name='virtio-serial0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
    </controller>
    <interface type='bridge'>
      <mac address='52:54:00:c0:89:32'/>
      <source bridge='virbr0'/>
      <target dev='vnet0'/>
      <model type='virtio'/>
      <alias name='net0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
    </interface>
    <serial type='pty'>
      <source path='/dev/pts/0'/>
      <target port='0'/>
      <alias name='serial0'/>
    </serial>
    <console type='pty' tty='/dev/pts/0'>
      <source path='/dev/pts/0'/>
      <target type='serial' port='0'/>
      <alias name='serial0'/>
    </console>
    <channel type='unix'>
      <source mode='bind' path='/var/lib/libvirt/qemu/channel/target/Games Machine.org.qemu.guest_agent.0'/>
      <target type='virtio' name='org.qemu.guest_agent.0' state='disconnected'/>
      <alias name='channel0'/>
      <address type='virtio-serial' controller='0' bus='0' port='1'/>
    </channel>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x00' slot='0x1a' function='0x0'/>
      </source>
      <alias name='hostdev0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </hostdev>
    <memballoon model='virtio'>
      <alias name='balloon0'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </memballoon>
  </devices>
  <qemu:commandline>
    <qemu:arg value='-device'/>
    <qemu:arg value='ioh3420,bus=pci.0,addr=1c.0,multifunction=on,port=2,chassis=1,id=root.1'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='vfio-pci,host=07:00.0,bus=root.1,addr=00.0,multifunction=on,x-vga=on'/>
    <qemu:arg value='-device'/>
    <qemu:arg value='vfio-pci,host=07:00.1,bus=root.1,addr=00.1'/>
  </qemu:commandline>
</domain>

I'm out of ideas. Anyone know what I might be missing?
iphillips77 Posted February 17, 2016 (Author)

Crappy 3DMark scores, too. I didn't make a note of the score before closing the window, but was getting 30fps-ish at 720p.

Hmmmm, maybe a PCIe bus width issue? Loading up the Nvidia control panel and selecting system information shows "Bus: PCI Express x 1". CPU-Z doesn't show anything in the Bus Width section, though...

lspci -vv results:

    Subsystem: ZOTAC International (MCO) Ltd. Device 3265
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 32 bytes
    Interrupt: pin A routed to IRQ 47
    Region 0: Memory at fa000000 (32-bit, non-prefetchable) [size=16M]
    Region 1: Memory at f0000000 (64-bit, prefetchable) [size=128M]
    Region 3: Memory at f8000000 (64-bit, prefetchable) [size=32M]
    Region 5: I/O ports at c000 [size=128]
    Expansion ROM at fb000000 [disabled] [size=512K]
    Capabilities: [60] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Address: 00000000fee00498 Data: 0000
    Capabilities: [78] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 256 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
        LnkCap: Port #2, Speed 8GT/s, Width x16, ASPM L0s L1, Latency L0 <1us, L1 <4us
            ClockPM+ Surprise- LLActRep- BwNot-
        LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range AB, TimeoutDis+, LTR-, OBFF Not Supported
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
        LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
            Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
            EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest+
    Capabilities: [b4] Vendor Specific Information: Len=14 <?>
    Capabilities: [100 v1] Virtual Channel
        Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
        Arb: Fixed- WRR32- WRR64- WRR128-
        Ctrl: ArbSelect=Fixed
        Status: InProgress-
        VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
            Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
            Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
            Status: NegoPending- InProgress-
    Capabilities: [128 v1] Power Budgeting <?>
    Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
    Capabilities: [900 v1] #19
    Kernel driver in use: vfio-pci

Hmmm.. LnkCap shows x16, LnkSta shows x8. (I've got two GPUs in here, and the 5820K is short on lanes, so I'm not surprised about the x8. So as far as unRAID's end is concerned it's connected at x8.)

And looking here, http://www.linux-kvm.org/page/PCITodo, "Support for different PCI express link width/speed settings" is on their to-do list. Specifically....

"Issue: QEMU currently emulates all links at minimal width and speed. This means we don't need to emulate link negotiation, but might in theory confuse guests for assigned devices."

Although this page is undated, so I don't know if this is still the case....
iphillips77 Posted February 17, 2016 (Author)

Ahhh, scratch that idea. Must be a bug in Nvidia's control panel -- both GPU-Z and lspci run under Windows show the same 16x/8x that lspci does from unRAID.
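For anyone repeating this comparison, a small filter can pull just the advertised and negotiated link figures out of lspci -vv instead of reading the whole dump. This is only a minimal sketch: link_summary is a made-up helper name, and the pattern assumes the LnkCap/LnkSta line layout seen in the output posted earlier in the thread.

```shell
#!/bin/sh
# Read `lspci -vv` output on stdin and print one line each for the
# advertised link (LnkCap) and the negotiated link (LnkSta).
# link_summary is a hypothetical helper name, not part of pciutils.
link_summary() {
    # LnkCtl2 and LnkSta2 lines are skipped because the pattern requires
    # the colon to follow "LnkCap" or "LnkSta" directly.
    sed -n -E 's/.*(LnkCap|LnkSta):.*Speed ([^,]+), Width (x[0-9]+).*/\1 \2 \3/p'
}
```

On the host this could be run as `lspci -vv -s 07:00.0 | link_summary` (07:00.0 being the GPU address from the qemu:commandline section of the XML). A LnkSta width below the LnkCap width means the slot negotiated fewer lanes than the card supports.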
unevent Posted February 17, 2016

Only six real cores/12 threads. Just a guess, but since you're running CPU-bound apps, it might come down to core assignment.
iphillips77 Posted February 17, 2016 (Author)

I really don't think it's a CPU problem. Like I said, CPU usage is in the 40% range (as indicated in Windows) and things are still stuttering. I've run Prime95 to rule out CPU usage being misreported... when stress testing, usage is pegged at 100%. I have tried giving it more cores, all cores, even tried fewer on a whim. No changes.

Slowdowns are repeatable. They'll occur at the same point in a game map, for example... I'll load up Super Mario Galaxy, and if I walk to a certain place, and the camera is pointed in a certain direction, my FPS drops from 60 to 50. And stays at 50 if I don't move again. All the while I'm looking at my CPU meter never going above 40%. Nothing running in the background.

I've ruled out Dolphin as a culprit. I've tried both DX and OpenGL backends, tweaked every setting, and this is the best I've been able to get it. I'm getting bad GPU benchmark scores in 3DMark and Cinebench. Might not be a GPU issue but it sure seems like it.

Just don't know what to try next.
Scrapz Posted February 17, 2016

  <cpu mode='host-passthrough'>
    <topology sockets='1' cores='4' threads='2'/>
  </cpu>

I forget where I saw it, but I recall seeing a post where someone mentioned that changing this from threads='2' to threads='1' addressed some performance issues they were having. Give that a go?
iphillips77 Posted February 17, 2016 (Author)

Thanks Scrapz, gave it a try but no dice. Playing around with PCIe settings in the BIOS now. I'm just about out of ideas.
bungee91 Posted February 17, 2016

While this may not be directly your problem, something to keep in mind is that the cores are not necessarily grouped (physical core and its HT sibling) as 0-1, 2-3, 4-5, etc. This varies by chip/manufacturer, and some testing is needed to figure it out (there is a script that tests latency for this, but it doesn't run natively on unRAID). You take a hit in performance when the two threads that share a physical core's registers/cache are doing completely different workloads.

Anyhow, with a 6-core CPU (threads numbered 0-11) it is likely that the companions are 0-6, 1-7, 2-8, 3-9, 4-10, 5-11; however, again, this is not a universal thing. I have not done this testing on my CPU, but we likely have the same configuration, as the 5820K/5930K are very similar. I do notice some stuttering in my main Windows 10 VM with 4 cores assigned (8-11), however I don't game on it, so it hasn't bothered me enough to investigate.

Some info: JonP mentioned in another thread that there is a script to check latencies between cores to help distinguish which are in the same physical core.
https://github.com/awilliam/cpu-latencies
This does not run natively in SSH for unRAID; I assume netperf needs to be installed or something of that nature.

Talk of it:
https://www.redhat.com/archives/vfio-users/2015-September/msg00041.html
https://www.redhat.com/archives/vfio-users/2015-September/msg00175.html
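The kernel also reports hyper-threading siblings directly in sysfs, which may save running the latency script at all. A minimal sketch, assuming the host exposes the standard topology/thread_siblings_list files; list_thread_siblings is a made-up helper name, and the directory argument exists only so the function can be pointed somewhere other than the real sysfs tree.

```shell
#!/bin/sh
# Print each logical CPU together with the logical CPUs that share its
# physical core, as reported by the kernel.
# list_thread_siblings is a hypothetical helper name.
list_thread_siblings() {
    cpu_root="${1:-/sys/devices/system/cpu}"
    for dir in "$cpu_root"/cpu[0-9]*; do
        f="$dir/topology/thread_siblings_list"
        # Skip entries without topology data (or an unmatched glob).
        [ -r "$f" ] || continue
        printf '%s: %s\n' "${dir##*/}" "$(cat "$f")"
    done
}
```

Calling `list_thread_siblings` with no argument reads the live system. A line such as `cpu0: 0,6` would mean logical CPUs 0 and 6 are two threads of one physical core, so both belong to the same VM (or both to the host). Entries are listed in glob order, so cpu10 sorts before cpu2.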
billington.mark Posted February 18, 2016

Thank you for this. I always just assumed that each core was grouped with its thread. However, surely this needs to be addressed at the unRAID OS level so hyperthreaded cores can be distinguished in the 'Create VM' GUI?
jude Posted February 18, 2016

Does anyone know how the pairings are arranged for AMD FX chips?
RobJ Posted February 18, 2016

The mouse stuttering and other things seem to me to be indicative of moments of 100% CPU usage. You think you are seeing 40% CPU, but remember that's usually an average over a period of a second or two, which is long in CPU time. It could very well be bouncing constantly between periods of 10% and 100%. However, I have no idea why your DPC latency numbers are not showing problems. They should be if the mouse is freezing. But I don't have a lot of experience here; perhaps there are other explanations.
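One way to test the averaging theory is to sample a core much more often than a once-per-second meter does. The sketch below is an illustration under stated assumptions, not something from the thread: busy_pct and sample_busy are made-up helper names, the samples come from the host's /proc/stat (where time spent running the guest is counted as user time on the pinned core), and sleep is assumed to accept fractional seconds, as GNU coreutils does.

```shell
#!/bin/sh
# busy_pct: given two "cpuN ..." lines from /proc/stat (earlier, later),
# print the busy percentage for the interval between them.
# Both function names here are hypothetical helpers.
busy_pct() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            # Fields 2-9: user nice system idle iowait irq softirq steal.
            # Guest fields are left out because user already includes them.
            for (i = 2; i <= 9; i++) total += $i
            t[NR] = total
            idle[NR] = $5 + $6      # idle + iowait
        }
        END {
            dt = t[2] - t[1]
            if (dt <= 0) { print 0; exit }
            printf "%d\n", 100 * (dt - (idle[2] - idle[1])) / dt
        }'
}

# sample_busy: print COUNT busy percentages for one CPU, INTERVAL seconds apart.
sample_busy() {
    cpu="$1"; interval="${2:-0.1}"; count="${3:-50}"
    prev=$(grep "^$cpu " /proc/stat)
    n=0
    while [ "$n" -lt "$count" ]; do
        sleep "$interval"
        cur=$(grep "^$cpu " /proc/stat)
        busy_pct "$prev" "$cur"
        prev=$cur
        n=$((n + 1))
    done
}
```

Running `sample_busy cpu0 0.1 50` on the host prints fifty readings about a tenth of a second apart. If the output swings between low values and 100, a 40% average is hiding saturation. The counters typically tick in hundredths of a second, so much shorter intervals give very coarse numbers.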
wedge22 Posted February 18, 2016

I was having issues with stuttering in YouTube. I did some testing and noticed that the issues got worse if Plex was in use on another device while I was running a Unigine benchmark on my Windows 10 VM. To resolve the issue I have pinned CPU cores to certain Docker containers, and I have also pinned cores 6-11 for the Win 10 VM.

CPU pinning:
- Windows 10 VM: 6-11
- Plex: 4-5
- Sonarr: 3
- SABnzbd: 2

I have left cores 0-1 unpinned, as I believe this is a good idea for unRAID to function correctly.
iphillips77 Posted February 22, 2016 (Author)

After a long week of banging my head against walls at work, I've got a couple days off to bang my head against walls with this instead.

Thanks for the suggestion, bungee91. I downloaded the script you linked to, managed to install netperf but couldn't find a build of netserver that would work. Instead, I just tried some trial and error, but didn't manage to see any improvement.

I'm going to ask around on the Dolphin forums as well to see if someone over there might know some way to improve things. It's very puzzling. I'm starting to think that unRAID just isn't going to be able to do this. Holding out hope that 6.2 will help -- OVMF instead of SeaBIOS improved things a little for me -- but something here just doesn't add up.
billington.mark Posted February 22, 2016

The only thing I can think to try next, if we can't ensure we are passing through the correct pairs of hyper-threaded cores, is to disable hyper-threading so each core you pass through is actually a true core... But then I feel like I'm giving up performance as a whole to solve the problem.

@jonp, are you able to shed any light on how we would make sure we are passing through the hyperthreaded pairs? Surely this is going to have an impact on other stuff outside of the VM if we are pinning CPUs to Docker containers as well as VMs...
methanoid Posted February 22, 2016

Are you running 6.1.8? I am... and have similar issues. It worked fine under 6.1.7 with a different CPU/mobo, and seeing the recent issues of a few people makes me wonder if it is 6.1.8 related?
billington.mark Posted February 22, 2016

Yeah, I'm on 6.1.8 (should really update my sig!). As far as I'm aware, there were no KVM/QEMU/libvirt-specific updates in 6.1.8, but there are a few changes scheduled for 6.2. I've had the same stuttering and poor overall performance since the original implementation of the hypervisor stuff, so I don't think it's 6.1.8 specific (for me anyway).
Scrapz Posted February 23, 2016

"I downloaded the script you linked to, managed to install netperf but couldn't find a build of netserver that would work."

I managed to get this to work last night, but I don't really know what to do with the results. unRAID is based on Slackware, so you can use the Slackware package manager to install what you need.

First, you'll need the netperf package:
http://pkgs.org/slackware-14.1/slackonly-x86_64/netperf-2.6.0-x86_64-1_slack.txz.html

Then, you'll need the bc package:
http://pkgs.org/slackware-14.1/slackware-x86_64/bc-1.06.95-x86_64-2.txz.html

Install each package with:

    upgradepkg --install-new {packagename}

And then you'll need to modify the script so it points to the binaries in "/usr/bin" as opposed to "/usr/local/bin" (for both the NETPERF and NETSERV variables).

Let me know what you make of the results. I'd be interested to know how I'm supposed to read them.
iphillips77 Posted February 23, 2016 (Author)

Holy crap I think I may have figured it out. Seems to possibly be a problem with the host not scaling the CPU frequency as efficiently/intelligently as the guest would like. Check out this shizz.

http://unix.stackexchange.com/questions/64297/host-cpu-does-not-scale-frequency-when-kvm-guest-needs-it

So here's what I did to test, and got immediate results.

    cd /sys/devices/system/cpu/cpu0/cpufreq

That's config info for cpu0. You can monkey with your CPU in here. "cat scaling_max_freq" resulted in 4300000. So I thought I'd give this a try.

    echo 4300000 > scaling_min_freq

Basically what this does is set the minimum frequency to be the same as the maximum, so it'll run full tilt constantly. Did it for the other CPUs I was passing to the Windows VM as well.

Went back to the VM, and noticed in CPU-Z that my CPU was now running at max frequency. At first glance, all my stuttering and slowdown problems were gone as well. I still have to do more testing, but in-emulator benchmarks have immediately improved 33%. This is exciting.

Now, I probably shouldn't leave things like this. But what I surmise was happening to me is that the hypervisor wasn't triggering a jump to the highest multiplier. It could if it wanted to, though, because Prime95 did it.

So. What do we do with this information? It should be possible to change the frequency scaling rules, shouldn't it?
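Repeating the echo for each passed-through CPU can be wrapped in a loop. A minimal sketch, assuming the same sysfs layout as in the post; pin_min_to_max is a made-up helper name, and the first argument is the CPU directory root so the function can be tried against a scratch directory before touching the real one.

```shell
#!/bin/sh
# Raise scaling_min_freq to the value of scaling_max_freq for each listed CPU.
# Usage: pin_min_to_max CPU_ROOT cpu-number...
# pin_min_to_max is a hypothetical helper name.
pin_min_to_max() {
    cpu_root="$1"; shift
    for n in "$@"; do
        d="$cpu_root/cpu$n/cpufreq"
        if [ -r "$d/scaling_max_freq" ] && [ -w "$d/scaling_min_freq" ]; then
            # Copy the driver's own maximum rather than hard-coding a number.
            cat "$d/scaling_max_freq" > "$d/scaling_min_freq"
        else
            echo "cpu$n: cpufreq files not accessible, skipped" >&2
        fi
    done
}
```

Run as root on the host, `pin_min_to_max /sys/devices/system/cpu 0 1 2 3` would cover the four cores pinned in the XML. As noted elsewhere in the thread, sysfs changes like this are lost on reboot, so it is a low-risk experiment.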
Scrapz Posted February 23, 2016

Good info. Reading that link, the top comment mentions that because you're distributing the load across multiple cores, no one core goes above 95%, so it stays throttled. Or at least the scaling kicks in later than it should for persistent loads. Bringing that threshold down to about 50% will make it kick in sooner.

    echo 50 > /sys/devices/system/cpu/cpufreq/ondemand/up_threshold

Changes are lost after a reboot, so there's no harm in trying it to see if there's a performance boost.
iphillips77 Posted February 23, 2016 (Author)

That was the first thing I tried, actually, but it seems that there are some differences between unRAID and Ubuntu when it comes to how CPU multipliers are handled. That file doesn't exist, so we can't change things that way.

    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors

returns "performance" and "powersave".

    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

is set to "powersave" by default. I'm giving this a try now...

    echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

...for all cores: cpu0/cpufreq, cpu1/cpufreq, etc.
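Applying the governor to every core can also be done in one pass. A minimal sketch, assuming the scaling_governor and scaling_available_governors files shown above exist for each CPU; set_governor is a made-up helper name, and the optional second argument only exists so the function can be pointed at a scratch directory.

```shell
#!/bin/sh
# Write the requested governor to every CPU that lists it as available.
# Usage: set_governor GOVERNOR [CPU_ROOT]
# set_governor is a hypothetical helper name.
set_governor() {
    gov="$1"
    cpu_root="${2:-/sys/devices/system/cpu}"
    for d in "$cpu_root"/cpu[0-9]*/cpufreq; do
        [ -w "$d/scaling_governor" ] || continue
        # Only write a governor the driver says it supports.
        if grep -qw -- "$gov" "$d/scaling_available_governors" 2>/dev/null; then
            echo "$gov" > "$d/scaling_governor"
        else
            echo "${d%/cpufreq}: governor '$gov' not available" >&2
        fi
    done
}
```

Run as root, `set_governor performance` switches every core, and `set_governor powersave` puts things back. Checking the available list first avoids a write error on systems where, as here, only "performance" and "powersave" are offered.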
Scrapz Posted February 23, 2016

Weird, the file existed for me. And my scaling_governor is set to "ondemand" by default. Different configs for different CPUs? I'll give the "up_threshold" a good run as a test, and see how I go.
bungee91 Posted February 23, 2016

Thanks for the info on getting netperf and bc installed, Scrapz! Some notes that may be beneficial for those who are looking into this further:
https://www.redhat.com/archives/vfio-users/2015-September/msg00041.html
methanoid Posted February 23, 2016

That link is the answer IMHO. Can someone translate it into English? ;-) I think it's telling us how to assign cores!!
iphillips77 Posted February 23, 2016 (Author)

Scrapz.. Yep, it appears the 'ondemand' and 'conservative' governors aren't available for my CPU. All I have are 'performance' and 'powersave'.

Also, I found some tools already installed in unRAID to manage CPU frequency:
- /usr/bin/cpufreq-set allows you to set minimum and maximum frequencies for all cores or individually, as well as change governors.
- /usr/bin/cpufreq-info gives the current settings.
- /usr/bin/cpufreq-aperf seems to be a performance monitoring tool.

Much easier than catting and echoing!
Gyph Posted February 23, 2016

I had some issues with this on my FX-9590. I kept thinking it HAD to be something to do with the CPU, reassigned different cores, etc. Audio popping only seemed to occur when I had at least one VM using 4 cores (so it HAD to be CPU, right?). Turns out that I updated the sound card drivers for my Creative Sound Blaster X-Fi Titanium Fatal1ty and now the popping is non-existent!

Something to keep in mind if someone else is pulling their hair out: try updating drivers!