Everything posted by harperhendee

  1. When I refer to VM HDD, I mean the individual VM disk allocation size. My theory was that an intermittent cache SSD caused occasional system hangs, and system hangs caused data corruption that wasn't caught by parity, since the hang occurred before parity was updated. A bad SSD was my last hope of a "fixable" problem with my system to run 2x VR sessions on VMs. Intermittent hangs that happen at different times, regardless of whether PCI is remapped or not... If the disks aren't bad, the only culprits left are Unraid, the KVM SW, and the actual virtualization tech in the chip. In any case, the system is running solidly on the former cache disk as a standalone Win10 gaming build. I may give this one more try using my 256GB SATA SSD as the new cache. But I'm pretty tired of debugging this issue--it's simply not converging. Much as I like virtualization tech, I like platforms that don't hang even better. --Brad
  2. I've been chasing stability problems in my unraid system for almost a year now. I have replaced almost every component of the system, but still get hangs on VMs, especially at load time and when detecting new devices. I have one final theory: the cache drive is unreliable. My array is as follows: Cache: Intel 1 TB M.2 SSD; Disk 1: 2 TB Western Digital Red; Disk 2: 1 TB Western Digital Black (high performance); Parity: 2 TB Western Digital Red. I noticed a while back that almost no data was actually consumed on either of the WD disks. My VM HDD allocation was about 1.5 TB before I recently deleted and rebuilt my VMs. I noticed that the cache was shown as fully used, but the array disks showed very low utilization. Yesterday, I decided to do a Windows 10 installation on the cache drive. I deleted all the partitions from the Windows installation SW and installed it in the resulting unallocated space. Things worked fine for the installation and upgrade to the Win10 Fall update. Then I decided to plug in my other 256GB SSD with Win10 on it. I figured I could mount the disk and re-install SW and drivers from its "downloads" directory. The system hung at a black screen with spinning circles. I told it to boot from the 256GB SSD. Boots fine. So my thought is the SSD is actually bad, but I am not sure how to confirm this other than replacing the component. Replacing a 1 TB M.2 SSD is not cheap, so I was hoping some diagnostics might help me determine if I need to or not. --Harper
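     If it helps, one way to sanity-check the drive before spending money on a replacement is to read its own health data. This is a generic sketch, assuming smartctl (from smartmontools, which Unraid ships) sees the Intel drive as an NVMe device; adjust the device path if yours enumerates differently:
       smartctl -a /dev/nvme0n1        # overall health, media/integrity errors, available spare, temperature
       smartctl -l error /dev/nvme0n1  # the drive's own error log; repeated entries here point at the SSD
       smartctl -t long /dev/nvme0n1   # long self-test, if the drive and smartmontools version support it
     A drive that hangs the host intermittently won't always fail these checks, but media errors, a shrinking available spare, or a growing error log would be strong evidence it needs replacing.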
  3. I've been using unraid for a little over a year now. My goal has been to build a multi-headed VR gaming rig. I had many trials and tribulations getting the system to a stable point with all the PCIe passthrough and USB devices. It was pretty stable for about 6 months. Then I started getting intermittent hangs on my two main gaming VMs (one for Oculus, one for Vive). The hangs became more frequent until the system was unusable. I ended up booting back to a regular Windows image without unraid and everything worked fine.
     A few weeks back, I made major HW changes--maybe my problem was HW related. I swapped my Xeon 22-core processor for a new i9 7900X. I had to upgrade the motherboard from my Gigabyte X99 Aorus Designare to an MSI X299 XPower AC board. With these two major upgrades done, I brought up the whole system again with a new install of Windows on a clean SSD. Everything works fine in Windows. I booted back to unraid and had a look at my old VMs. They all had problems. The two main VMs would intermittently boot, then hang after 1-10 minutes of usage. I had a number of relatively "pristine" VMs on which I had only done a basic Windows install with variations of BIOS and processor model. These VMs would also hang, especially during loading. I never got any error messages that I could sniff out. The VM would just be shown as "paused" in the web interface. If I force-killed the VM, about 50% of the time it would no longer even launch, giving me some message about execution errors.
     So I decided that perhaps these VMs had become corrupted by having survived all the hangs in the past. I deleted all VMs and built up 3 new ones. I had no problems with these VMs during installation. Then I decided to run some stress tests. I ran combinations of 7zip, Cinebench, and PassMark. All three VMs performed flawlessly for a good 24 hours of grueling CPU loads. I brought the VMs down, increased disk space, and then launched them again to fix the disks in Windows. My first VM hung at BIOS. I killed it and relaunched. It hung during Windows init. I rebooted and tried the other VMs. I was able to get hangs during BIOS and the Windows logo as well. Also, I was able to fatally hang a VM by plugging/unplugging USB devices on a PCIe-mapped USB card.
     What do I mean by "hang"? The VM becomes unresponsive to input and doesn't update its output, which remains stuck. Sometimes htop will show CPU utilization approaching 0 on all cores. Sometimes it will show all cores pegged at 100% (especially for launch hangs). Sometimes a single thread continues even though the VM is unresponsive. Sometimes unraid will hang as well and requires a hard reset to recover. After a hang, sometimes I can relaunch the VM, and sometimes I get execution errors when I attempt it. A reboot resolves these issues.
     So my conclusions are: 1) CPU virtualization is fine. 2) Memory virtualization is fine (I suspected memory corruption as the issue early on). 3) VM launch has serious problems. 4) USB discovery via PCIe passthru has problems. And I have some big open questions: 1) Why do my VMs get worse over time? 2) What does a "VM hang" actually involve? 3) Is there an architectural problem with KVM, virtio, or Intel's virtualization tech?
     I'm going to give this a few more days to resolve, and then I'm wiping the whole thing and building up a couple of dual-boot Windows builds. I may not be able to run 2 parallel sessions, but at least it will be stable. --Brad
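     To get a bit more visibility into what a "hang" involves from the host side, the libvirt/QEMU layer can be inspected directly while the guest is stuck. This is just a generic sketch (the domain name is an example):
       virsh list --all                          # confirm the state libvirt reports (running, paused, ...)
       virsh domstate Muspelheim --reason        # why libvirt thinks it is in that state (I/O error, crashed, ...)
       cat /var/log/libvirt/qemu/Muspelheim.log  # QEMU's own log for the domain, separate from syslog
     If the domain shows "paused" with an I/O error reason, that points back at storage (e.g. the cache SSD) rather than at CPU or memory virtualization.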
  4. This should work. I have two Fresco FL1100 USB cards in my rig. I run two gaming VMs in parallel running Oculus and Vive. My setup isn't perfect. I get occasional hangs, especially when playing "switchboard operator" on the USB ports. I think this is due to my own problems with PCI initialization. My cards reside at 4e and 54. Here's where they are discovered in syslog.txt: Similar stuff on 54. Later, it is revisited at 4c (which is the PCI bridge chip on the MOBO). And then I get a non-fatal error: Then another non-fatal error: And then this message: And finally this message: USB devices seem to work fine in Unraid after this point. I can even boot from the card (better than my mobo USB, actually). However, when I launch my VM, I get one more non-fatal error: There's a warning in the VM log file as well: After this point, USB mostly works. But I get flaky behavior that I think is related to the non-fatal errors. Some of the flaky behavior results in hard hangs, which is not nice at all! I've been doing the research on how PCI enumeration works so that I can steer things into a more predictable and stable configuration. This thread is pure gold! --Harper
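     For anyone following along, the host-side view of where the two cards land (and whether each sits alone in its IOMMU group) can be dumped from sysfs; a quick sketch, nothing Unraid-specific:
       # list every device with its IOMMU group and PCI address
       for d in /sys/kernel/iommu_groups/*/devices/*; do
         g=$(echo "$d" | cut -d/ -f5)
         echo "group $g: $(lspci -nns ${d##*/})"
       done | sort -V
     The two FL1100 controllers should show up at 4e:00.0 and 54:00.0, each ideally in a group of their own.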
  5. I get similar errors on my two Inateck USB3 cards, but they seem to work in spite of the warnings. I'm wondering if it contributes to some of the system stability issues I see around USB. Here's what my errors are on bootup:
     Mar 15 19:09:05 Yggrasil kernel: Unpacking initramfs...
     Mar 15 19:09:05 Yggrasil kernel: Freeing initrd memory: 139516K (ffff88005bccc000 - ffff88006450b000)
     Mar 15 19:09:05 Yggrasil kernel: DMAR: [Firmware Bug]: RMRR entry for device 4e:00.0 is broken - applying workaround
     Mar 15 19:09:05 Yggrasil kernel: DMAR: [Firmware Bug]: RMRR entry for device 54:00.0 is broken - applying workaround
     Mar 15 19:09:05 Yggrasil kernel: DMAR: dmar0: Using Queued invalidation
     Mar 15 19:09:05 Yggrasil kernel: DMAR: dmar1: Using Queued invalidation
     When I launch a VM, I get another virtlogd error about not being able to map the BAR on the USB device. I don't have it handy, I'm afraid.
  6. I have been dealing with some system stability issues since my first installation 6 months ago. They fall into a couple of different buckets:
     Bucket 1 - Windows 10 VM fails to launch: 1a) Hangs with no display from the VM. 1b) Hangs with a static display of the Windows logo, but no spinning dots. 1c) Goes to Windows recovery. 1d) Kill and relaunch fixes it about 50% of the time.
     Bucket 2 - Windows 10 VM hangs during usage: 2a) VM becomes unresponsive when USB devices are plugged/unplugged. 2b) Other VMs continue to function.
     Bucket 3 - Unraid hard hang: 3a) Sometimes bucket 2 problems also hang the Unraid server. 3b) Screen is frozen for Unraid and the VMs. 3c) Server will not respond to SSH, ping, or a short HW power button press.
     Debugging these things is difficult:
     Bucket 1 - There isn't any debug trail that I can find. I don't see any errors associated with the failure in syslog or virtlogd. I can usually spot that the issue has occurred based on CPU usage. Normal behavior is all CPUs at 100% for a time, then one CPU at 80-100% while the others idle at 5-50%. Failure modes are all CPUs at 100%, or 1 CPU at 100% and all others at 0%.
     Bucket 2 - I occasionally get a "fatal error" message in the VM log. Sometimes nothing. When I try to restart the VM, I usually get an execution error pop-up. All CPUs are at 0%.
     Bucket 3 - I have no idea how to debug this. Once it hangs, I reboot the system and lose my logs from the previous run. I was thinking that I might use a second computer to SSH in and run "tail -f" on the syslog file. Are there other debug messages I can get to? I read about MCE logs as a possible debug path. I'm not sure if those already show up in syslog or my remote SSH console. What low-level information is exposed with unraid? Is there a HW observation point where I could get lower-level debug information over and above what unraid supplies? A HW diagram of the system is attached. --Brad
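     For bucket 3, streaming the log to a second machine before the hang is probably the only way to keep it, since /var/log lives in RAM on Unraid and is lost on reset. A minimal sketch, assuming SSH access to the server ("tower" is a placeholder hostname):
       # run on the second computer; the captured file survives the server's hard reset
       ssh root@tower tail -f /var/log/syslog | tee unraid-syslog-$(date +%F).txt
     Machine-check events are logged by the kernel, so if any MCEs fire before the hang they should appear in that same stream (look for lines containing "mce" or "Machine check"); after a soft hang you can also check directly with: grep -i 'machine check' /var/log/syslog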
  7. What is your use case here? Do you want a thumb drive available to two separate VMs?
  8. You could always try using an Ethernet-connected USB hub, like the SEH myUTN, to dynamically attach to the USB device. The myUTN is kind of expensive for what you get, but it is very useful when trying to share a USB device with multiple VMs. One of my use cases is to use a lab probe with multiple VMs. In this case, I install the myUTN SW on each VM, then activate the probe on the VM I want. There is some lag due to the network, but it works remarkably well. --Harper
  9. I don't have this handy, but I enabled it by going to Steam and selecting "enable beta". This is somewhere in the Steam app, not the SteamVR app. Go to the SteamVR page inside the Steam App, then hunt down this switch. --Harper
  10. My Vive setup has been pretty solid ever since I updated to the beta drivers. But I'm using a PCIe USB card to pass in the USB devices. There are a lot of USB endpoints in the Vive controller:
     Bus 009 Device 005: ID 0bb4:2744 HTC (High Tech Computer Corp.)
     Bus 009 Device 006: ID 0bb4:2134 HTC (High Tech Computer Corp.)
     Bus 009 Device 007: ID 0bb4:0306 HTC (High Tech Computer Corp.)
     Bus 009 Device 008: ID 0424:274d Standard Microsystems Corp.
     Bus 009 Device 009: ID 0d8c:0012 C-Media Electronics, Inc.
     Bus 009 Device 010: ID 0bb4:2c87 HTC (High Tech Computer Corp.)
     Bus 009 Device 011: ID 28de:2101
     Bus 009 Device 012: ID 28de:2101
     Bus 009 Device 013: ID 28de:2000
     Bus 009 Device 014: ID 0bb4:2c87 HTC (High Tech Computer Corp.)
     There are several problems: 1) The wireless controllers share the same vendor/device IDs. 2) The wireless controllers will disconnect when powered down, or sometimes during gameplay. The real issue is that the Vive is a mini subsystem held together by a USB hub.
  11. Last night, I was running my two-headed VR rig in some games. I had two players running through Rec Room, doing some of the latest content. Both VMs were working fine for about 20 minutes, when I got unrecoverable errors on both VMs simultaneously. I had to reboot the system. I wasn't able to observe the GPU temperatures directly, but I felt some real heat when I opened the box. It got me wondering how to manage the various temperature controls spread across the system. For the GPUs, I assume that the VM which controls a GPU will control the fans on that GPU only. For the CPU cooler, I'm not sure who the controller is. I have the NZXT Kraken X62 AIO cooler, which has its own controls for fans. It is attached as a USB device to the mobo. If I unplug the USB connector, the cooler defaults to "max" settings. How do I monitor the temperature of my system as a whole? How do I make sure the CPU cooler is reacting appropriately without running the NZXT SW? How can I diagnose whether an error is thermal related? --Brad
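     For the "system as a whole" part, the kernel already exposes most motherboard and CPU sensors through sysfs, so they can be watched from the Unraid console without any vendor SW; a rough sketch (sensor names and counts vary by board):
       # list every hwmon sensor the kernel knows about; values are in millidegrees C
       for h in /sys/class/hwmon/hwmon*; do
         echo "== $(cat $h/name)"
         grep . $h/temp*_input 2>/dev/null
       done
     If the lm-sensors package (or the Dynamix system temp plugin) is installed, "sensors" shows the same data with friendlier labels. GPU temperature for a passed-through card generally has to be read inside the VM that owns it (GPU-Z, MSI Afterburner, etc.), and as you observed, the Kraken falls back to its internal default (max) profile when its USB link is absent.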
  12. I just did a bunch of experiments with this last night. The difference between the passing case and the failing case is that in the failing case, I cannot boot to the USB device, regardless of what I do in BIOS. BIOS simply doesn't see it as an option, even over multiple power cycles. You'll notice that in both cases, the Kingston DataTraveler has the same device credentials. I purposely connected it to the MB ports that go directly to the chipset so that it would be enumerated as simply as possible. All the "basic" systems are this way--hard-wired mouse+keyboard, controls to the power supply and cooler, the unraid thumb drive. I have a few thoughts on what is going on...
     1) The enumeration sequence in unraid is different from the enumeration sequence in early boot, or in BIOS. In early boot, there is only one USB 2.0 port that is monitored so that FW patches can be applied. Early boot is controlled by the PCH (the motherboard chipset) and runs on an embedded CPU that cannot be controlled by the user. During BIOS execution, all the USB devices are discovered and enumerated. This is how BIOS sees the USB world. During OS execution, all the USB devices are rediscovered and enumerated. This is how Unraid sees the USB world.
     2) I only have visibility into the mapping during OS execution. Perhaps the BIOS execution phase experiences discovery failures that go unreported.
     3) BIOS seems to always prefer the "last" USB device. When I have three USB keyboards plugged in: a) Kensington direct to motherboard, b) Logitech A on the PCIe USB card at 4e, c) Logitech B on the PCIe USB card at 54, BIOS will only recognize Logitech B. I'm not sure what I can infer from that, other than that the USB stack BIOS uses is more primitive than what an OS provides.
     Remember, USB enumeration is an OS-level task. The code that discovers USB during BIOS is entirely different from the code that does it again using the OS runtime. It is entirely possible that some topologies will confound the BIOS code. BIOS code is written by a handful of SW programmers using "generic" lab builds that have few tested configurations. There isn't a lot of feedback from end users to those BIOS writers. So I can easily believe there are "fatal" USB topologies.
     In any case, I hope this info is useful to the folks at Limetech. I think there is a real danger in relying on USB devices for boot. The USB topology is fussy and dynamic. It is hard to guarantee a deterministic and reliable boot recipe that includes USB. I think the solution is to boot from a PCIe-mapped device and use the USB thumb drive as a license dongle. The USB stick as OS works brilliantly when it works, but it is a huge problem if something goes wrong here. The problems found in USB enumeration are hard to understand and harder to solve. This is not a good place for Limetech to provide technical support, but it prevents users from using Unraid, so it is a problem for the company.
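     As a reference point, the OS-side topology is easy to snapshot so it can be compared between boots and against what BIOS offers; a quick sketch (0930:6544 is the DataTraveler's vendor:product ID as it shows up in lsusb):
       lsusb -t      # tree view: which hub/controller each device actually hangs off, per port
       lsusb -v -d 0930:6544 2>/dev/null | grep -i 'bcdUSB\|idVendor\|iSerial'   # details for the boot stick
     Diffing two of those snapshots (thumb drive on a chipset port vs. behind the PCIe card) at least shows whether the OS sees the same topology that BIOS is refusing to boot from.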
  13. First things first--I have a lot of USB devices. Take a look at this giant list. In particular, have a look at what the Vive adds to the mix:
     USB Devices
     Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
     Bus 001 Device 002: ID 8087:800a Intel Corp.
     Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
     Bus 002 Device 002: ID 8087:8002 Intel Corp.
     Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
     Bus 003 Device 002: ID 047d:2043 Kensington
     Bus 003 Device 003: ID 8087:0a2b Intel Corp.
     Bus 003 Device 004: ID 1e71:170e NZXT
     Bus 003 Device 005: ID 045b:0209 Hitachi, Ltd
     Bus 003 Device 006: ID 045b:0209 Hitachi, Ltd
     Bus 003 Device 007: ID 1b1c:1c08 Corsair
     Bus 003 Device 008: ID 0930:6544 Toshiba Corp. TransMemory-Mini / Kingston DataTraveler 2.0 Stick (2GB)
     Bus 003 Device 010: ID 045e:00cb Microsoft Corp. Basic Optical Mouse v2.0
     Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
     Bus 004 Device 002: ID 045b:0210 Hitachi, Ltd
     Bus 004 Device 003: ID 045b:0210 Hitachi, Ltd
     Bus 005 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
     Bus 006 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
     Bus 006 Device 002: ID 2833:0211
     Bus 006 Device 003: ID 2833:0211
     Bus 007 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
     Bus 007 Device 002: ID 045e:02e6 Microsoft Corp.
     Bus 007 Device 003: ID 1a40:0101 Terminus Technology Inc. Hub
     Bus 007 Device 004: ID 2109:2812 VIA Labs, Inc. VL812 Hub
     Bus 007 Device 005: ID 2833:0211
     Bus 007 Device 006: ID 046d:c52b Logitech, Inc. Unifying Receiver
     Bus 007 Device 007: ID 2833:2031
     Bus 007 Device 008: ID 2833:0031
     Bus 008 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
     Bus 008 Device 002: ID 2109:0812 VIA Labs, Inc. VL812 Hub
     Bus 008 Device 003: ID 2833:3031
     Bus 009 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
     Bus 009 Device 002: ID 28de:1142
     Bus 009 Device 003: ID 2109:2812 VIA Labs, Inc. VL812 Hub
     Bus 009 Device 004: ID 046d:c52b Logitech, Inc. Unifying Receiver
     Bus 009 Device 005: ID 0bb4:2744 HTC (High Tech Computer Corp.)
     Bus 009 Device 006: ID 0bb4:2134 HTC (High Tech Computer Corp.)
     Bus 009 Device 007: ID 0bb4:0306 HTC (High Tech Computer Corp.)
     Bus 009 Device 008: ID 0424:274d Standard Microsystems Corp.
     Bus 009 Device 009: ID 0d8c:0012 C-Media Electronics, Inc.
     Bus 009 Device 010: ID 0bb4:2c87 HTC (High Tech Computer Corp.)
     Bus 009 Device 011: ID 28de:2101
     Bus 009 Device 012: ID 28de:2101
     Bus 009 Device 013: ID 28de:2000
     Bus 009 Device 014: ID 0bb4:2c87 HTC (High Tech Computer Corp.)
     Bus 010 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
     Bus 010 Device 002: ID 2109:0812 VIA Labs, Inc. VL812 Hub
     Notice that there are 10 USB buses, and 36 devices! If I plug the Vive into a motherboard USB slot, I get the following:
     Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
     Bus 001 Device 002: ID 8087:800a Intel Corp.
     Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
     Bus 002 Device 002: ID 8087:8002 Intel Corp.
     Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
     Bus 003 Device 002: ID 047d:2043 Kensington
     Bus 003 Device 003: ID 8087:0a2b Intel Corp.
     Bus 003 Device 004: ID 1e71:170e NZXT
     Bus 003 Device 005: ID 045b:0209 Hitachi, Ltd
     Bus 003 Device 006: ID 045b:0209 Hitachi, Ltd
     Bus 003 Device 007: ID 1b1c:1c08 Corsair
     Bus 003 Device 008: ID 0930:6544 Toshiba Corp. TransMemory-Mini / Kingston DataTraveler 2.0 Stick (2GB)
     Bus 003 Device 009: ID 0bb4:2744 HTC (High Tech Computer Corp.)
     Bus 003 Device 010: ID 045e:00cb Microsoft Corp. Basic Optical Mouse v2.0
     Bus 003 Device 011: ID 0bb4:2134 HTC (High Tech Computer Corp.)
     Bus 003 Device 012: ID 0bb4:0306 HTC (High Tech Computer Corp.)
     Bus 003 Device 013: ID 0424:274d Standard Microsystems Corp.
     Bus 003 Device 014: ID 28de:2000
     Bus 003 Device 015: ID 0bb4:2c87 HTC (High Tech Computer Corp.)
     Bus 003 Device 016: ID 0d8c:0012 C-Media Electronics, Inc.
     Bus 003 Device 017: ID 0bb4:2c87 HTC (High Tech Computer Corp.)
     Bus 003 Device 018: ID 28de:2101
     Bus 003 Device 019: ID 28de:2101
     Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
     Bus 004 Device 002: ID 045b:0210 Hitachi, Ltd
     Bus 004 Device 003: ID 045b:0210 Hitachi, Ltd
     Bus 005 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
     Bus 006 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
     Bus 006 Device 002: ID 2833:0211
     Bus 006 Device 003: ID 2833:0211
     Bus 007 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
     Bus 007 Device 002: ID 045e:02e6 Microsoft Corp.
     Bus 007 Device 003: ID 1a40:0101 Terminus Technology Inc. Hub
     Bus 007 Device 004: ID 2109:2812 VIA Labs, Inc. VL812 Hub
     Bus 007 Device 005: ID 2833:0211
     Bus 007 Device 006: ID 046d:c52b Logitech, Inc. Unifying Receiver
     Bus 007 Device 007: ID 2833:2031
     Bus 007 Device 008: ID 2833:0031
     Bus 008 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
     Bus 008 Device 002: ID 2109:0812 VIA Labs, Inc. VL812 Hub
     Bus 008 Device 003: ID 2833:3031
     Bus 009 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
     Plugging into the PCIe card causes the Vive USB devices to show up earlier in the enumeration sequence. I'm not sure why this affects things, but it appears to work. If I want to boot from a USB drive, I need to unplug the Vive from the PCIe card. This is kind of a pain in the ass. But I don't know that my BIOS can cope with this mysterious enumeration sequence.
  14. Can I create VMs that use the same image with different PCI devices mapped? This might be a way for me to isolate the installation vs. the HW. If the Vanaheim image is able to run fine with all the Muspelheim dedicated HW, then the problem is probably with the installation.
  15. I have two main gaming VMs that I have been using to run multi-player VR sessions: Vanaheim (Vive oriented) and Muspelheim (Oculus oriented). The two VMs are almost identical in composition and creation, except for which PCI devices are assigned to each VM. Vanaheim always boots up successfully. But Muspelheim will often (~50%) hang at either the BIOS screen or the Windows 10 loading screen, prior to the spinning circles. On a failing run, I don't see anything in the VM log files to indicate something bad has happened. The CPU utilization shows all 8 CPUs running near 100% during initialization, then all go to zero while one stays at near 100%. This CPU stays active forever, but the boot never advances. I've left it in this state for hours without resolution. What can I do to characterize and debug this issue?
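     One cheap way to rule out a configuration difference between the two domains is to diff their libvirt definitions; a sketch using the names above:
       virsh dumpxml Vanaheim > /tmp/vanaheim.xml
       virsh dumpxml Muspelheim > /tmp/muspelheim.xml
       diff -u /tmp/vanaheim.xml /tmp/muspelheim.xml      # expect only UUID, MAC, vdisk paths, and hostdev entries to differ
       tail -f /var/log/libvirt/qemu/Muspelheim.log       # watch QEMU's own log during a hung boot attempt
     If the only differences are the passed-through PCI devices, the next step would be swapping those assignments between the two VMs to see whether the hang follows the hardware or stays with the installation.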
  16. An update on the disappearing controller issue with the Vive on a VM: switching to the beta version back in March cleared the problem up completely.
  17. I've been running into VM problems around peripherals. Things like slow IRQ servicing, disappearing USB devices, or unexpected latency. I'd like to isolate these problems to see if it is due to HW, OS, SW, or VM issues. I purchased another SSD and installed windows on it as a dual boot option. Can I launch a VM version of this disk from within unraid?
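     In principle, yes: a VM's primary disk can point at the whole physical SSD instead of a vdisk image, so the same Windows install can boot bare-metal or under KVM. A rough sketch of the host-side part, with the caveat that this is a generic approach rather than step-by-step Unraid instructions:
       # find a stable name for the new SSD; /dev/sdX can change between boots, /dev/disk/by-id/ does not
       ls -l /dev/disk/by-id/ | grep -v part
       # in the VM settings, set the vdisk location manually to that /dev/disk/by-id/... path instead of a .img file
     Two caveats: Windows will see different "hardware" when booted under KVM (driver and activation churn), and the disk must not be mounted or assigned anywhere in Unraid while the VM runs. It does make a nice apples-to-apples comparison for the IRQ/latency questions, since the identical install can be exercised both ways.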
  18. I'll try some experiments this weekend. I haven't paid much attention to it, since right now two VR rigs is all I can support. That third GPU will be much more important as soon as I get my 3rd VR headset. --Harper
  19. The Oculus sensors are probably being re-enumerated after the VM launches. Lots of things can cause a USB port to re-enumerate (which just means the device gets assigned a new number). This is why the device number is worthless as an identifier--it is dynamically assigned. The vendor ID stays the same. PCI re-mapping is much more straightforward.
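     If you want to watch the re-enumeration happen as the VM grabs the controller, the host can monitor the USB subsystem live; a small sketch:
       udevadm monitor --udev --subsystem-match=usb    # streams add/remove events while the VM starts
       lsusb -d 2833:0211                              # find the sensors afterwards by their stable vendor:product ID
     (2833:0211 is the sensor ID as it appears in the lsusb listings elsewhere in this thread; the device numbers will be new after the sensors reconnect, but the IDs won't change.)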
  20. Not every USB port is created equal on your mobo. Some go through USB hubs and some go straight to the PCH (the Intel chipset chip on the mobo, not the processor). If you can find a USB port that goes straight to the PCH, you can put Unraid there and pass the rest of the controllers. On mine, I have a number of USB host controllers:
     IOMMU group 21 [8086:8d31] 00:14.0 USB controller: Intel Corporation C610/X99 series chipset USB xHCI Host Controller (rev 05)
     IOMMU group 24 [8086:8d2d] 00:1a.0 USB controller: Intel Corporation C610/X99 series chipset USB Enhanced Host Controller #2 (rev 05)
     IOMMU group 30 [8086:8d26] 00:1d.0 USB controller: Intel Corporation C610/X99 series chipset USB Enhanced Host Controller #1
     IOMMU group 35 [8086:1578] 03:02.0 PCI bridge: Intel Corporation DSL6540 Thunderbolt 3 Bridge [Alpine Ridge 4C 2015]
     [8086:15b6] 06:00.0 USB controller: Intel Corporation DSL6540 USB 3.1 Controller [Alpine Ridge]
     IOMMU group 42 [1b73:1100] 4e:00.0 USB controller: Fresco Logic FL1100 USB 3.0 Host Controller (rev 10)
     IOMMU group 47 [1b73:1100] 54:00.0 USB controller: Fresco Logic FL1100 USB 3.0 Host Controller (rev 10)
     The 00:14.0 controller is the PCH USB 3.0 host controller. I think 00:1a.0 and 00:1d.0 are the PCH USB 2.0 host controllers. The 06:00.0 address is the Alpine Ridge Thunderbolt controller, which also runs a single USB 3.1 port and a USB Type-C port. The 4e:00.0 and 54:00.0 are my PCI plug-in controllers.
     Some deep stuff around USB host controllers coming up... There are actually separate USB 3.0 and USB 2.0 controllers on the Intel PCH (chipset). If you pass through the USB 3.0 controller (usually called xHCI), it won't necessarily pass the USB 2.0 controllers. This puts things in a weird state where USB 3.0 devices will show up to the VM and USB 2.0 devices go to unraid. This should allow your unraid thumb drive, mouse, and keyboard to run at USB 2.0 to the unraid host and the 3.0 devices (Oculus stuff) to the VM. I'm really not sure if something like this would work for you. I've never actually sliced up the USB 2.0 and USB 3.0 controllers for passthru. I'm not sure you can do it for an 8086-mapped PCH device. But I have often disabled the USB 3.0 controller in some way where the USB 2.0 function continues to run just fine.
     USB 2.0 and USB 3.0 are only slightly related. If you look at a USB cable, you will find that it is two entirely independent cables that happen to share a sheath and plugs. There is little interaction between the USB 2.0 and USB 3.0 worlds. In other words, a USB 3.0 cable with the USB 2.0 D+/- lines removed will still work as a USB 3.0 cable, just like a USB 2.0 cable plugged into a USB 3.0 plug will continue to run as USB 2.0.
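     To figure out which physical port belongs to which controller, each USB bus can be traced back to its PCI address from sysfs; a quick sketch:
       # print each root hub and the PCI device that owns it
       for b in /sys/bus/usb/devices/usb[0-9]*; do
         echo "$(basename $b) -> $(basename $(dirname $(readlink -f $b)))"
       done
     Then plug a keyboard into each physical port and check lsusb -t to see which bus it lands on; any bus that maps back to 00:14.0 (or 00:1a.0 / 00:1d.0) belongs to the PCH, and the lsusb -t tree shows whether that particular port goes through an intermediate hub on the way.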
  21. When I followed the instructions for dumping my GFX card video ROM, I ran into some issues. I followed the instructions in the pinned thread to get the GPU ROM. I get something that is about 1/10 as large as the corresponding ROM I get from the internet. Is this an indicator that it was unsuccessful? Is there some way to verify whether a ROM is correct? Do I have to dump a separate ROM to pass through the audio portion? Can I collect the ROM image from one of my other identical cards? So far, I've had little luck wresting the primary MSI GTX 1070 Quicksilver from the OS. Every time I go to activate the VM using the ROM file method, the screen freezes and becomes unresponsive. Sometimes unraid will freeze and no longer respond to pings. Sometimes just the screen. This happens both with the official ROM and the one I manually extracted. I do notice that when I try the unbind step, it tells me it is already unbound... --Brad
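     For what it's worth, here is the generic sysfs way to pull a ROM and sanity-check the result. This is a sketch, not the exact recipe from the pinned thread, and the PCI address 0000:03:00.0 is only an example--substitute the card's real address, and do it while nothing on the host is actively driving that GPU:
       cd /sys/bus/pci/devices/0000:03:00.0
       echo 1 > rom                                # make the ROM readable
       cat rom > /boot/gtx1070.rom
       echo 0 > rom                                # put it back
       od -A x -t x1 /boot/gtx1070.rom | head -1   # a valid option ROM starts with the bytes 55 aa
       ls -l /boot/gtx1070.rom                     # a 1070 image is typically a few hundred KB, not tens of KB
     A dump that is ~1/10 the expected size and doesn't start with 55 aa often means the card was still in use by the host when it was read, which would fit the "already unbound" oddity. Dumping from one of the identical secondary cards (which the host isn't driving) is a common workaround, and the HDMI audio function generally doesn't need a ROM file of its own.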
  22. BTW, I was able to confirm a few things about CPU numbering and physical layout from Anandtech and a few other sources: 1) The CPUs are numbered in roughly consecutive order around the rings. The numbering is determined by a correlated variable, so they might not be exactly as you'd expect, but there's no functional difference between 0,1,2,3,4,5 and 1,0,3,2,5,4 ordering. 2) Base layouts give you basically 1, 1.5, or 2 iterations of the ring structure. If there are 12 cores in a ring, there will be 3 versions with 12/18/24 physical cores. 3) The CPUs are always fused off in equal numbers from each ring. This includes half-rings. So the 22 core Xeon has two rings of 11 cores each. There's never a 10 and 12 core ring. 4) There is a small latency price to pay when crossing rings. Try to minimize cross-ring traffic. From a topological point of view, if you cross a ring, you generate 2x the bandwidth.
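     If anyone wants to double-check the numbering on their own part before pinning, the kernel exposes the layout directly; a quick sketch:
       lscpu -e=CPU,CORE,SOCKET      # logical CPU -> physical core mapping
       grep . /sys/devices/system/cpu/cpu*/topology/thread_siblings_list   # hyperthread sibling pairs
     On a 22-core/44-thread part the siblings usually come out as pairs like "0,22", which is worth confirming so vCPU pairs end up pinned to the sibling threads you actually intend.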
  23. I took an educated guess on the CPU numbering, then mapped all 8 VMs + 1 core for unraid. It looks like the attached diagram. I mainly mapped things based on what was easy in Visio. The final mapping does the following:
     Unraid: gets the first 2 threads.
     Vanaheim and Muspelheim: get the middle of each ring. They consume some of the other VMs' mappings if I want to run just these two.
     Alfheim and Svartalfheim: my next two VMs, getting prepped for a new FOVE headset.
     Niflheim, Helheim, Asgard, Midgard: minimum-size VMs as placeholders for future expansion.
     Utgard: I'd like to have a dual-boot option into Windows (machine name Jotunheim). Utgard is a VM that can read and execute from Jotunheim's unmanaged disk.
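     For anyone reproducing this, the usual mechanics are CPU isolation on the kernel command line plus per-VM pinning in the VM settings. A minimal sketch, assuming the first two threads stay with unraid; the CPU range is just an example, and newer Unraid versions also expose isolation and pinning in the GUI:
       # /boot/syslinux/syslinux.cfg -- keep the host scheduler off the VM cores
       append isolcpus=2-43 initrd=/bzroot
       # verify after reboot
       cat /sys/devices/system/cpu/isolated
     The per-VM assignments themselves (which isolated threads Vanaheim, Muspelheim, etc. each get) are then set in the VM manager.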
  24. Gigabyte X99 Designare EX
     Usage: multi-user VR experiences
     Plus points: Plenty of data lanes for multiple GPUs and PCIe devices. Good power distribution. Mechanically sound solution for PCIe slots. Alpine Ridge with Thunderbolt and USB 3.1 performs well when passed through to a VM. Plenty of connectors and USB3 ports.
     Minus points: Renesas USB hubs don't do passthru very well. PCIe endpoint 02 is overloaded. Requires a monitor connected to boot. Doesn't like to let go of the primary GPU (still WIP for me).
     GPU: MSI 1070 Quicksilver
     Usage: VR gaming GPU
     Plus points: Easy installation. No SW problems. Can use the LED SW to indicate which VMs are located and active on each GPU. Great cooling and silent operation under load.
     Minus points: Only 1 HDMI out, no VGA (but who uses VGA nowadays?). No way to get at the covered PCIe slot (even with a thin x1 ribbon cable).
  25. I'm using a higher core count Xeon (22 CPUs/44 threads). I'm wondering if I should take into account the routing of these cores. In this generation of Xeon, there are multiple rings (see attached). It seems like you would want to keep VM cores near each other in the ring, so that they would use the fabric more efficiently. In the simplest case, I would create two VMs out of this topology using ring 1 and ring 2. When we go to 4 VMs, maybe align to the 4 columns. Of course, I can just as easily convince myself that the CPUs should be spread out so that each one has more distributed bandwidth. I think this works out if there are multiple VMs that are not necessarily running full tilt at the same time. Spreading out the CPUs ensures that each CPU lives in a "quiet neighborhood." Therefore, the isolated VM with spread out CPUs has more paths to memory and fewer rivals. But when all VMs are enabled and under load, they will end up stepping all over each other.