bungee91

Posts posted by bungee91

  1. Need to update this and ask for continued support. Yes, I know I'm the only one seeing this. Reminder: Memtest ran 24+ hours with 0 errors, and there are no actual instability issues. This also (somewhat) makes sense, as I only started seeing these when they replaced my GPU with a newer refurbished one (different model, R260X).

     

    This is what I know now: the issue is only evident (for me, with the components I have here) when I pass through either my replacement R260X or my GTX950, so basically "gaming capable" GPUs.

     

    From extensive Googling, this "alloc_bts_buffer" error points to memory fragmentation, and in the past people have asked for it to stop "spamming the logs".

     

    If I remove either of those cards, and assign a GT720 to this VM, the issue is completely gone, 8+ days on 6.2 final (there have been frequent updates, causing me to reboot).

     

    I removed the R260X and tried my new GTX950 (same issue). Once I assigned a GT720, the problem was completely nonexistent (it's almost too peaceful in my syslog!) ;D .

    I'm currently running the following:

    3 GT 720s (Win 8.1 MCE, Win 10 x2), 1 GT 710 (LibreELEC), and one headless, all working as expected.

     

     

    So, if I keep it just like this, this is likely solved...

    However if I add back in either GPU, I'm pretty confident within 24 hours these messages will appear again, always following the VM that the card is assigned to.

     

    I've attached a clean log from my 8+ days using the 720's and 710 prior to upgrading to 6.3 RC1.

    The other logs are above for comparison; I don't see anything obviously different between the two that would help diagnose this.

    server-diagnostics-20161009-1000_Clean_8_Days_GT720s.zip

  2. Edit: You beat me to it.

     

    I should have just stayed quiet, such an excellent and more detailed answer!  ;)

     

    :-*

     

     

    You're right, I missed the point regarding VM device assignment. I was thinking about disc-specific share assignments (or disc-based mount points) breaking if the disc slots changed. They shouldn't in this case, but when reassigning drives we tend to treat getting the data slots correct as unimportant, since the order makes zero difference to parity protection.

     

    As for the VMs, fortunately a quick edit to the VM templates to re-select the correct devices should fix you right up (if this is an issue for you). The editor is pretty smart: if you edit the VM and the hardware is no longer there, it automatically removes the assignment and doesn't complain about "missing device XXXX". Then just assign the device and all should be well.

  3. UnRAID is hardware agnostic, the process should be extremely painless.

    The only recommendation (which is somewhat deprecated now) is to record the slot locations for your discs (a simple screenshot is always helpful), in case you have to reassign them to the array. Really the only one that matters is your parity drive (or both if using dual parity).

    However as long as all discs are present on booting the new hardware, you likely won't have to do a thing.

    UnRAID stores the disc location by serial #, so even though the ports/locations may have changed, it should find them and assign them to the correct slots.

     

    Edit: You beat me to it.

  4. It is possible that the override is not working correctly, yet is still reporting the devices as being in separate groups, which causes this to fail.

    You're correct that ACS on root ports is only available (at this time) on the "E" edition i7s and the E5 Xeons.

    I'm going to assume that the OP requires this to be enabled, or the device he is trying to pass is grouped with other devices.

    It is always recommended to attempt to relocate the device, and hopefully it is moved into a group by itself, which would then have actual isolation and not require this patch.

     

    Since you mentioned the override failed to separate your 2nd video card from your first one, I'd say the patch wasn't working as it is intended to for your hardware.

    However in this case the output(s) look like it is, just unfortunately not working when starting the VM.

    Unfortunately none of this is helping the OP, so hopefully someone can chime in with some additional help.

    I suppose you could try to stub the device ID and see if it helps; however, since the output shows the device is either free or already bound to vfio-pci, I wouldn't expect this to solve your issue.

  5. They don't have ACS on root ports, yet they are using the ACS override, which is attempting to break up the video card from its root port(s). They need to kill that kernel option, reboot, and plan their IOMMU adventures properly for the hardware they have.

     

    This is incorrect, and I do fully understand the lack of isolation, the non upstream patch being used, and what is being reported here.

    Since the outputs show the groups as separated, this should be able to function, even though true isolation is not achieved.

    However this patch works for many (read: a lot of people here), for allowing for a device to be assigned to a VM.

    The outputs show that the device is assignable (i.e., not in a group with other devices, regardless of whether direct DMA could cause issues/corruption), so the lack of true isolation should not be the reason the VM will not start. Something else is amiss here, but it is not common.

    While I do not recommend the ACS override patch (nor do most in the community, and hence the warnings it prompts), it is effective for many in allowing for devices to be assigned assuming isolation when there is no hardware in place to actually provide it. Regardless, the device should be assignable and allow for this VM to start.

  6. I'm uncertain how to see the version #'s on the hub (sorry, not something I've done prior), but do see "Success" from an update to the beta tag 6 days prior.

    I see the same for all versions listed there, so maybe there just have been no updates to the Docker for the last 6 days? If so, then no complaints; it's just unusual compared to the regular updates (they normally follow pretty closely the dashboard messages listing any updates to the server).

     

     

     

  7. I seem to be stuck on version 3.1.170.0 beta, running the EmbyServerBeta release from CA.

    The Emby dashboard shows that Version 3.1.183 is now available for download.

     

    I have killed the docker image for Emby and reinstalled through CA, but always have 3.1.170 installed.

     

    This has been this way for more than a week, and speaking from past experiences this container normally requests an update daily, if not more.

    Is the XML in CA not pointing to a new configuration or location to pull updates for this Docker?

     

  8. My theory was incorrect (but was the most likely thing).

    I don't use the PCI ACS override, but it does appear to be working if/as needed, as your groups are separated.

     

    I'm kind of wondering if something is binding this card, preventing it from being assigned to vfio-pci.

    From the terminal, run "lspci -v" and check whether the device is bound to a driver.

     

    It should look like this example (this is a USB card of mine), but without the "Kernel driver in use" line at the bottom; the device should be free for the VM to grab/bind at start as needed:

    0e:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB 3.0 xHCI Host Controller (rev 02) (prog-if 30 [XHCI])
            Flags: bus master, fast devsel, latency 0, IRQ 18, NUMA node 0
            Memory at fb500000 (64-bit, non-prefetchable) [size=64K]
            Memory at fb510000 (64-bit, non-prefetchable) [size=8K]
            Capabilities: [40] Power Management version 3
            Capabilities: [48] MSI: Enable- Count=1/8 Maskable- 64bit+
            Capabilities: [70] Express Endpoint, MSI 00
            Capabilities: [c0] MSI-X: Enable+ Count=8 Masked-
            Capabilities: [100] Advanced Error Reporting
            Capabilities: [150] Device Serial Number 08-00-28-00-00-20-00-00
            Kernel driver in use: vfio-pci
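
    For anyone who would rather not scan the whole listing by eye, here is a minimal sketch that pulls the bound driver out of saved "lspci -v" text. The function name drivers_in_use is made up for illustration, and it assumes the usual layout where each device header starts at column 0 and the detail lines are indented.

```python
import re

# Device header lines look like "0e:00.0 USB controller: ..." (optionally
# prefixed with a 4-digit PCI domain such as "0000:").
HEADER = re.compile(r"^((?:[0-9a-f]{4}:)?[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s")
BOUND = re.compile(r"^\s+Kernel driver in use:\s*(\S+)")


def drivers_in_use(lspci_output: str) -> dict:
    """Map each PCI address in `lspci -v` text to its bound driver (None if unbound)."""
    drivers = {}
    current = None
    for line in lspci_output.splitlines():
        header = HEADER.match(line)
        if header:
            # New device section: assume unbound until a driver line shows up.
            current = header.group(1)
            drivers[current] = None
            continue
        bound = BOUND.match(line)
        if bound and current is not None:
            drivers[current] = bound.group(1)
    return drivers
```

    Feed it the text captured from "lspci -v" (for example via subprocess.run with capture_output=True and text=True). A device that maps to None has no driver attached; one that maps to "vfio-pci" is already held for passthrough.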
    

     

  9. You saved the vdisk but not the XML?

     

    If you recreate the XML using the same settings it should boot. Basically, you can't have switched from SeaBIOS to OVMF, although I think you can get away with a different machine type; I'm not certain of that.

     

    You will likely have an issue with activation because even if you create the exact same setting you won't have the same uuid number so you'll have to activate Windows again.

    This is all true; to elaborate:

     

    If you change the machine type or the UUID, you will be prompted to reactivate.

    Changing the machine type/chipset will not normally cause a boot related issue, you will see driver installation upon boot for the new devices.

     

    However, there is no straightforward way to boot an OVMF image with a SeaBIOS XML, or vice versa.

     

    Now why are you playing with the vdisk permissions?

    No comment on that, just don't know why/what you did to make changes to that.

     

    If this was OVMF, at first boot you sometimes have to select the boot device, it will remember this setting afterwards.

    If SeaBIOS, I would suspect the XML doesn't have the boot order/priority set correctly for the vdisk.

     

     

  10. Please provide the listing of your IOMMU groups, you'll find this listed in the system devices page.

     

    It looks as if for whatever reason the device 1:00:0 is now in a group with other devices, which would give this error.

    This is speculation until I see that output.

     

     

  11. Thanks bungee91 - I'm still learning.  I have 2 sections with 'subsystem' - I'm assuming one's the videocard and one's the audiocard so I should add the rom location to both? i.e.

     

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
      </source>
      <rom file='/mnt/isos/vbios.dump'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x06' slot='0x00' function='0x1'/>
      </source>
      <rom file='/mnt/isos/vbios.dump'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
    </hostdev>
    

     

    You only need the GPU rom for the GPU, not the audio chip.

    Only add it to the (typically) XX:00.0 entry (function 0x0), in your case 06:00.0.
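
    If you'd rather script the change than hand-edit, here is a rough sketch of the same idea: add the rom line only to the hostdev whose source address is the GPU function (bus 0x06, slot 0x00, function 0x0 in your XML) and leave the audio function alone. The function name add_gpu_rom and its parameters are my own invention for illustration, not anything unRAID or libvirt ships.

```python
import xml.etree.ElementTree as ET


def add_gpu_rom(domain_xml: str, rom_path: str,
                bus: str = "0x06", slot: str = "0x00", function: str = "0x0") -> str:
    """Return domain XML with <rom file=.../> added to the matching PCI hostdev only."""
    root = ET.fromstring(domain_xml)
    wanted = (int(bus, 16), int(slot, 16), int(function, 16))
    for hostdev in root.iter("hostdev"):
        if hostdev.get("type") != "pci":
            continue  # skip USB hostdevs
        source = hostdev.find("source")
        addr = source.find("address") if source is not None else None
        if addr is None:
            continue
        found = tuple(int(addr.get(key, "0x0"), 16) for key in ("bus", "slot", "function"))
        if found != wanted or hostdev.find("rom") is not None:
            continue  # not the GPU function, or a rom entry already exists
        rom = ET.Element("rom", {"file": rom_path})
        # Place the rom element right after </source>, as in the XML above.
        hostdev.insert(list(hostdev).index(source) + 1, rom)
    return ET.tostring(root, encoding="unicode")
```

    Called as add_gpu_rom(xml_text, "/mnt/isos/vbios.dump"), it should touch only the 06:00.0 entry. One caveat: ElementTree may re-serialize the namespaced vmtemplate metadata with a generated prefix, so treat the output as something to compare against, and paste just the rom line into the VM's XML view if in doubt.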

  12. Help please!

     

    I've dumped my bios to /mnt/isos/vbios.dump, but I can't work out where to add it to my xml:

     

    <domain type='kvm'>
      <name>Windows 10 - Nvidia</name>
      <uuid>072617e6-bd86-aa40-2274-cb6a2d0e2a2c</uuid>
      <metadata>
        <vmtemplate xmlns="unraid" name="Windows 10" icon="windows.png" os="windows10"/>
      </metadata>
      <memory unit='KiB'>8388608</memory>
      <currentMemory unit='KiB'>8388608</currentMemory>
      <memoryBacking>
        <nosharepages/>
        <locked/>
      </memoryBacking>
      <vcpu placement='static'>12</vcpu>
      <cputune>
        <vcpupin vcpu='0' cpuset='8'/>
        <vcpupin vcpu='1' cpuset='9'/>
        <vcpupin vcpu='2' cpuset='10'/>
        <vcpupin vcpu='3' cpuset='11'/>
        <vcpupin vcpu='4' cpuset='12'/>
        <vcpupin vcpu='5' cpuset='13'/>
        <vcpupin vcpu='6' cpuset='22'/>
        <vcpupin vcpu='7' cpuset='23'/>
        <vcpupin vcpu='8' cpuset='24'/>
        <vcpupin vcpu='9' cpuset='25'/>
        <vcpupin vcpu='10' cpuset='26'/>
        <vcpupin vcpu='11' cpuset='27'/>
        <emulatorpin cpuset='0,14'/>
      </cputune>
      <os>
        <type arch='x86_64' machine='pc-i440fx-2.5'>hvm</type>
        <loader readonly='yes' type='pflash'>/usr/share/qemu/ovmf-x64/OVMF_CODE-pure-efi.fd</loader>
        <nvram>/etc/libvirt/qemu/nvram/072617e6-bd86-aa40-2274-cb6a2d0e2a2c_VARS-pure-efi.fd</nvram>
      </os>
      <features>
        <acpi/>
        <apic/>
        <hyperv>
          <relaxed state='on'/>
          <vapic state='on'/>
          <spinlocks state='on' retries='8191'/>
          <vendor id='none'/>
        </hyperv>
      </features>
      <cpu mode='host-passthrough'>
        <topology sockets='1' cores='6' threads='2'/>
      </cpu>
      <clock offset='localtime'>
        <timer name='hypervclock' present='yes'/>
        <timer name='hpet' present='no'/>
      </clock>
      <on_poweroff>destroy</on_poweroff>
      <on_reboot>restart</on_reboot>
      <on_crash>restart</on_crash>
      <devices>
        <emulator>/usr/local/sbin/qemu</emulator>
        <disk type='file' device='disk'>
          <driver name='qemu' type='raw' cache='writeback'/>
          <source file='/mnt/user/domains/Windows 10 - Nvidia/vdisk1.img'/>
          <target dev='hdc' bus='virtio'/>
          <boot order='1'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
        </disk>
        <disk type='file' device='cdrom'>
          <driver name='qemu' type='raw'/>
          <source file='/mnt/user/isos/Operating Systems/Windows10.iso'/>
          <target dev='hda' bus='ide'/>
          <readonly/>
          <boot order='2'/>
          <address type='drive' controller='0' bus='0' target='0' unit='0'/>
        </disk>
        <disk type='file' device='cdrom'>
          <driver name='qemu' type='raw'/>
          <source file='/mnt/user/isos/virtio-win-0.1.118-2.iso'/>
          <target dev='hdb' bus='ide'/>
          <readonly/>
          <address type='drive' controller='0' bus='0' target='0' unit='1'/>
        </disk>
        <controller type='usb' index='0' model='nec-xhci'>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
        </controller>
        <controller type='pci' index='0' model='pci-root'/>
        <controller type='ide' index='0'>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
        </controller>
        <controller type='virtio-serial' index='0'>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
        </controller>
        <interface type='bridge'>
          <mac address='52:54:00:ce:97:84'/>
          <source bridge='br0'/>
          <model type='virtio'/>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
        </interface>
        <serial type='pty'>
          <target port='0'/>
        </serial>
        <console type='pty'>
          <target type='serial' port='0'/>
        </console>
        <channel type='unix'>
          <source mode='connect'/>
          <target type='virtio' name='org.qemu.guest_agent.0'/>
          <address type='virtio-serial' controller='0' bus='0' port='1'/>
        </channel>
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x06' slot='0x00' function='0x0'/>
          </source>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
        </hostdev>
        <hostdev mode='subsystem' type='pci' managed='yes'>
          <driver name='vfio'/>
          <source>
            <address domain='0x0000' bus='0x06' slot='0x00' function='0x1'/>
          </source>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
        </hostdev>
        <hostdev mode='subsystem' type='usb' managed='no'>
          <source>
            <vendor id='0x045e'/>
            <product id='0x0745'/>
          </source>
        </hostdev>
        <memballoon model='virtio'>
          <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
        </memballoon>
      </devices>
    </domain>
    

    Thanks in advance

     

    http://lime-technology.com/wiki/index.php/UnRAID_6/VM_Management#Edit_XML_for_VM_to_supply_GPU_ROM_manually

  13. It's happening because you're requesting 1:00:0 and 1:00:1 which are listed in IOMMU group 1.

    Everything within the same group must be passed to the VM, or stubbed (not used, bound to a placeholder).

    The message seems a bit odd, however 4:00:0 is in group 1, along with the device you're attempting to use in this VM.

     

    You have a lot of items in group 1. You can try relocating the device to another slot (which may move it to a different group), or use the ACS override setting.

    Either way, you have to get that device isolated (in a different group) to assign it to the VM.
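
    To see at a glance what shares a group with the device you want, here is a small sketch that reads the standard Linux sysfs layout (/sys/kernel/iommu_groups/<group>/devices/<address>). The helper names iommu_groups and group_mates are hypothetical, and addresses are in the full sysfs form such as 0000:01:00.0.

```python
from pathlib import Path


def iommu_groups(root: str = "/sys/kernel/iommu_groups") -> dict:
    """Return {group number: sorted list of PCI addresses} from the sysfs tree."""
    groups = {}
    base = Path(root)
    if not base.is_dir():
        return groups  # IOMMU disabled or path not present
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if devices_dir.is_dir():
            groups[group_dir.name] = sorted(p.name for p in devices_dir.iterdir())
    return groups


def group_mates(groups: dict, address: str) -> list:
    """List the other devices in the same IOMMU group as `address`."""
    for members in groups.values():
        if address in members:
            return [m for m in members if m != address]
    return []
```

    An empty list from group_mates means the device sits alone in its group. Anything it does list has to be passed to the same VM or stubbed along with it.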

     

  14. FYI slot 2 for the computer you specified "Lenovo TS140" is only a 4X PCIe Gen 2 slot, from the manufacturer page.

    1 x PCIe GEN3: HH/FL x16 mechanical, x16 electrical

    1 x PCIe GEN2: HH/HL x1 mechanical, x1 electrical

    1 x PCIe GEN2: HH/HL x16 mechanical, x4 electrical

    1 x PCI 32-bit/33 MHz: FH/HL

    http://shop.lenovo.com/us/en/systems/servers/towers/thinkserver/ts140/#tab-tech_specs

     

    So you may want to investigate further or you may have some bottlenecking related to this.

     

    Edit: Also in case you care (keep in mind this can be dependent on manufacturer GPU BIOS implementation) I recently upgraded to a GTX950, using it in OVMF, and it performs correctly. I'm certain many others use this card also, so you should be able to get OVMF to work as expected.

    On my previous card (R260X) I would see an "invalid rom contents" in the VM log file that would lead to instability in OVMF. Passing the rom to the card with the "romfile=" in the XML to the VM solved this issue, and the message and related instability (mainly) disappeared.

  15. The best way to troubleshoot this: when the device is in slot 1, do you see it in your BIOS?

    UnRAID should list what it sees, if it is not listed it is not UnRAID but something else going on.

    Unfortunately not all BIOSes do a good job of showing devices that are initialized, but some do (ASRock for instance has a display that shows what slots are used, and by what). I'd start there, however it also sounds like something is wrong with the device or MB, as you've alluded to issues now in slot 2.

  16. That will be a nightmare to maintain, log messages vary with each linux version and installed package, but if you like a challenge be my guest :)

     

    That stifled my determination pretty quickly.

     

    (Not looking at a log currently) I think the only ones that jump out at me (and likely others) are the ones with "Error" "Fault", or "Warning", does that make it a reasonable path?

    Honestly the important ones that I concern myself with are the ones that Dynamix applies the color to in the log, however I don't think that capability is default on a plain install (I think I added a plugin for that, can't recall).
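
    As a rough idea of what I mean, a keyword filter could be as simple as the sketch below. The name flagged_lines is made up, and the three keywords are just the ones mentioned above, not a vetted list.

```python
import re

# Case-insensitive match on the three keywords; substrings such as
# "errors" or "segfault" are flagged too.
KEYWORDS = re.compile(r"error|fault|warning", re.IGNORECASE)


def flagged_lines(lines):
    """Return the log lines that mention any of the keywords."""
    return [line.rstrip("\n") for line in lines if KEYWORDS.search(line)]
```

    Passing it an open syslog file object (or any list of lines) returns only the matching lines. It would still flag harmless messages that happen to contain one of those words, which is the maintenance problem raised above.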

  17. Everyone on v6 has the same message, ignore it.

     

    Reading syslogs is both about knowing what is important, and about knowing what to ignore, and there's quite a lot of things in a syslog that may look more important than they really are.  unRAID builds a new OS each time it boots, and the Linux kernel has to figure out both what hardware and software is there and what is not, and adapt itself accordingly.

     

    Considering this, I wonder if it'd be advantageous for us to create an entry for common/harmless syslog messages.

    Something that could be added to a wiki article for each major released version (if needed) if it will change based on build/packages/kernel/etc included within the release.

    That is unless we already have something like this, or similar, and I should have searched before opening my mouth.  :D

     

    I know for instance of another common VM question regarding "tainted" in the output of the VM log that gets people all flustered, and is effectively harmless.

  18. You may actually be okay.

    That is, if your MB does allow splitting the internal USB controllers into a total of 3 (that seems to be the norm if it allows it; my previous Gigabyte board refused to split them up). Two VMs would each have one of those three assigned, and the other would have its USB devices assigned individually in the VM (so not passing an entire controller to it), on the same USB controller as the UnRAID flash drive.

    You'll have to do some testing on your MB to find out how it assigns these. That and I hope you have a build that supports ACS on root ports considering what you're attempting to accomplish (for Intel, i7 "E" and Xeon E5 and up).

    Details to split up the controllers here http://lime-technology.com/forum/index.php?topic=36768.0

  19. That will likely not work, however there's a solution.

    Pass a USB card or device to each VM and you can plug in whatever you want to each, plus it'll then be hot pluggable.

    Most motherboards have multiple USB controllers, so you may have three already by default, but you will need one of them for UnRAID, so you may only have to add one to a PCIe x1 slot, and you'd be all set.

  20. I have been using the Docker without any problems and it has very little impact on my system. I did a spin-off of the docker so that I can keep it up to date, though at this point the docker is not missing anything big.

     

    If you would like to install my docker then PM me and I will send you a link to try.

     

    I may take you up on that, just concerned with ongoing support as this continues to be updated.

    The most recent update has the backend functions for pausing liveTv, but hasn't enabled it on any clients as of yet.

     

     

    Plex can now integrate with HDHR devices and PVR with guide. Plex pass needed.

     

    I'm not really a Plex guy (I kind of hate it actually), however thank you for letting me know.

    I was aware of this, however at this point I believe it only supports 1 tuner (or maybe just one stream) and doesn't do LiveTv so it is not usable for me.

    Emby also does DVR natively, and is pretty good at it, however they still need to make updates to the Theater app in order to have recording management, which makes it less than ideal at the moment also.

     

    I PM'd @CHBMB thinking maybe LSIO would put together a Docker, since they're kind of "the gold standard" around here for quality work and ongoing support. Being that this doesn't work AT ALL without a subscription (eventually an annual one at that), it may not be the most appealing project to work on/support. Can't hurt to ask though.

     

    The DVR backend really doesn't do much, no transcoding whatsoever, mainly a path to record to and scheduling.

     

    Is there any reason this should be a Docker vs a plugin? Other than the obvious reasoning why previous plugins became Dockers (isolation, dependencies, etc...).