• [6.6.6] Elgato HD60 Pro crashes Unraid due to pci bus resetting


    bamhm182
    • 6.7.0-rc1 Retest Minor

    I just bought an Elgato HD60 Pro and when I pass the card through to my Windows 10 VM, it crashes the entire server when I turn off/reboot the VM. It appears as though the issue is resolvable by inserting the following lines into a few kernel files.

     

    drivers/pci/quirks.c:

    DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_YYE, PCI_DEVICE_ID_YYE_MOZART_395S, quirk_no_bus_reset);

    include/linux/pci_ids.h

    #define PCI_VENDOR_ID_YYE		0x12ab
    #define PCI_DEVICE_ID_YYE_MOZART_395S	0x0380
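
Before rebuilding a kernel, it may be worth confirming that the card really reports the IDs used in the quirk above (12ab:0380). The following is a minimal Python sketch, not part of the original patch. It assumes the usual Linux sysfs layout under /sys/bus/pci/devices, and the function name find_pci_devices is made up for illustration.

```python
from pathlib import Path


def find_pci_devices(vendor, device, sysfs_root="/sys/bus/pci/devices"):
    """Return the PCI addresses whose vendor/device IDs match the given pair."""
    matches = []
    root = Path(sysfs_root)
    if not root.is_dir():
        return matches
    for dev in sorted(root.iterdir()):
        try:
            # sysfs exposes each ID as text such as "0x12ab\n"
            ven = int((dev / "vendor").read_text().strip(), 16)
            did = int((dev / "device").read_text().strip(), 16)
        except (OSError, ValueError):
            continue  # skip entries that cannot be read or parsed
        if ven == vendor and did == device:
            matches.append(dev.name)
    return matches


if __name__ == "__main__":
    # IDs taken from the pci_ids.h lines in the post
    print(find_pci_devices(0x12AB, 0x0380))
```

An empty list would suggest the card is not visible to the host at all, or reports different IDs than the ones the quirk is keyed on.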

     

    I have been working for a while to rebuild the kernel with this code and I think I've got it building now but all I really have to show for my work is a pissed off wife. I also know that all this work I'm doing will go away as soon as Unraid updates the kernel again, so I figured I would ask if there's any way the code could get added into the next release. Thanks and keep up the great work!

     

    EDIT: I don't know if this should be marked Minor or Urgent, sorry. It crashes the entire server forcing me to hold down the power to reboot it, which the side bar says is urgent, but it only affects people who are trying to use this capture card, so it doesn't seem like it's an urgent issue. Additionally, I'm trying to generate the diagnostics zip, but for some reason it is changing my page to the link it should be downloading from and not actually downloading anything. I confirmed that it is working on my other Unraid server (both on 6.6.6) and it generates the diagnostics zip without any problems.

    tardis-diagnostics-20190112-2028.zip




    User Feedback

    Recommended Comments

    Haha, that's true. That is the gist where I found the code, yes. I have been working on a script that checks the unRAID, Slackware and Kernel versions, then goes out and downloads the appropriate packages and files and follows similar steps to gfjardim's gist and the one on the wiki. Only problem is they're both pretty old and this is the first time I've ever tried to compile a kernel. I managed to get rid of most of the negative sounding things, except for what looks like a few warnings when running the make. I finally managed to get a bzimage and bzroot that look pretty similar to the bzroot and bzimage that come with unRAID, but it was late last night and I was unable to test it. With that, I haven't been able to verify that it fixes the problem just yet, but it worked for the Creator of that patch once he also made a few changes to syslinux.

    Link to comment
    5 hours ago, bamhm182 said:

    it worked for the Creator of that patch once he also made a few changes to syslinux

    In order to pass the device to a VM, correct?

    Link to comment

    That is correct. The person who originally made the patch was using Arch, but he had a virtual machine running on Arch and ran into the same issues as we are running into now. Making these changes and a few in syslinux solved the reboot VM crash for him.

     

    He modified his syslinux config to look similar to the one I have set up here:

    label unRAID OS (Elgato Fix)
      kernel /bzimage-new
      append pcie_acs_override=downstream pci-stub.ids=<removed>,12ab:0380 vfio-pci.ids=12ab:0380 disable_idle_d3=1 initrd=/bzroot-new
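
Since the device ID appears in more than one parameter on that append line, a quick way to see which drivers are told to claim it is to parse the line. This is only an illustrative Python sketch: the helper name parse_append_ids is hypothetical, and it simply treats any `<name>.ids=` token as a comma-separated ID list.

```python
def parse_append_ids(append_line):
    """Map each '<driver>.ids' kernel parameter to its list of vendor:device IDs."""
    ids = {}
    for token in append_line.split():
        key, sep, value = token.partition("=")
        if sep and key.endswith(".ids"):
            ids[key] = [v.lower() for v in value.split(",") if v]
    return ids


# The append line as posted in the thread ("<removed>" is kept as-is)
line = ("append pcie_acs_override=downstream pci-stub.ids=<removed>,12ab:0380 "
        "vfio-pci.ids=12ab:0380 disable_idle_d3=1 initrd=/bzroot-new")

claims = parse_append_ids(line)
for driver, id_list in claims.items():
    print(driver, "12ab:0380" in id_list)
```

Here both pci-stub.ids and vfio-pci.ids list the card. Separately, kernel module parameters are usually written with a module prefix (for example `vfio-pci.disable_idle_d3=1`); whether the bare `disable_idle_d3=1` in the posted line has any effect is not confirmed anywhere in this thread.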

    EDIT: Making progress! I just took a moment to step back and think about the steps, and realized that this whole time I've been trying to rebuild both bzroot and bzimage, even though I wasn't changing anything in bzroot, just extracting and repacking it (incorrectly, I might add). I was replacing bzroot for no reason, so I used the bzroot that came with unRAID together with the bzimage I have been compiling and, lo and behold, I CAN REBOOT! There are some issues with the card still, but I think they may not be related. It's just weird: it prompts me to reboot to finish installing the drivers, but when I reboot, that message doesn't go away and the card still doesn't work. I'm wondering if maybe Windows isn't shutting down cleanly...

    Edited by bamhm182
    Link to comment

    I added the patch to upcoming 6.7 release, not a big deal.

     

    1 hour ago, bamhm182 said:

    I'm wondering if maybe Windows isn't shutting down cleanly

    My guess is that the card is garbage, unfortunately.

    Link to comment

    Thanks for doing that! I'm beginning to think the card is garbage as well. I have a little mSATA drive I'm not using, I may install that and toss Windows on there just to give it one final go before I send it back to Amazon.

     

    I've just got one final question since I've been stuck on this for a few days. What's the proper way to extract and compile the bzroot file? All the information I can find suggests piping it from xzcat to cpio, but when I do that, xzcat complains about an invalid format. I was finally able to get it to do something with cpio -id < /boot/bzroot, but all that was extracted was a kernel folder. Is that all that's in there? As for compiling it, I never had any luck.
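
The symptoms described here (xzcat rejecting the file, and cpio extracting only a kernel folder) would be consistent with bzroot being an uncompressed cpio segment followed by an xz-compressed one. That layout is an assumption, not something confirmed in this thread. Under that assumption, a rough Python sketch for separating the two parts could look like this; split_bzroot is a hypothetical helper name, and the demo uses synthetic bytes rather than a real bzroot.

```python
import lzma

# Magic bytes that begin every .xz stream
XZ_MAGIC = b"\xfd7zXZ\x00"


def split_bzroot(data):
    """Split an image into (leading bytes, decompressed xz payload).

    Assumes the first occurrence of the xz magic marks the start of the
    compressed root filesystem archive.
    """
    offset = data.find(XZ_MAGIC)
    if offset < 0:
        raise ValueError("no xz stream found")
    return data[:offset], lzma.decompress(data[offset:])


# Synthetic stand-in: 512 bytes of "leading archive" plus an xz payload
fake_image = b"070701" + b"\x00" * 506 + lzma.compress(b"rootfs cpio bytes")
head, payload = split_bzroot(fake_image)
print(len(head), payload)
```

On a real file, the decompressed payload could be written to disk and then unpacked with the same `cpio -id` invocation mentioned above. Searching for the magic bytes is a shortcut; it could in principle match inside the leading segment, so treat the offset as a guess to verify.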

    Link to comment

    Been running my kernel for a few days with no issues. I'm going to close this.

     

    In other news, still can't get this card to work, I even installed Windows baremetal on a spare disk, got the card working there, then copied the disk to an img. I booted that and it still told me to reboot to finish the driver installation. I feel like the programmers put that in there instead of "it isn't working and we don't know why." I think I'm going to stop trying to get this to work with unRAID and just dual boot windows and unRAID. No fault to unRAID, just this crappy card. Keep up the great work!

    Link to comment

    Well the patch will be in the Unraid 6.7.0-rc1 release, maybe try it again after we publish.

    Edited by limetech
    fix typo
    Link to comment

    I didn't even catch that he probably just missed the 0. Haha. I thought he actually meant 6.7.9. Glad to hear it'll be released sooner than that. >_<

    Link to comment

    So I'm running into a new issue now (after getting 6.7.0 to cooperate). For some reason, if I boot while 12ab:0380 is in the ids, I can click all over the GUI, but as soon as I click on the VMs tab, unRAID hard crashes. I tried reverting to 6.6.6 and running the patched kernel that was working fine before I tried upgrading, and now that is doing the exact same thing. All I should need to do to downgrade is copy the files in the previous folder back to /boot, right?

    Link to comment
    35 minutes ago, bamhm182 said:

    All I should need to do to downgrade is copy the files in the previous folder back to /boot, right?

    Yes but on Update OS page there should be a button that does that.

    Link to comment

    Alright, so I managed to get back to where I had it before 6.7.0-rc1/rc2, and I think I figured out that it was crashing there because of how I had the IDs in my syslinux file. I currently have this:

     

    append pcie_acs_override=downstream pci-stub.ids=<others>,12ab:0380 vfio-pci.ids=12ab:0380 disable_idle_d3=1 initrd=/bzroot,/bzroot-gui

     

    Everything works fine until I go and attach the card to a VM and try to boot it. It says unknown pci header type 127 and hard crashes the computer after a few seconds. I tail'd syslog while this happens as requested in another post with this error and I'm thinking maybe the disable_idle_d3 part is what's getting me, but the guy who made the kernel patch said that was required. Either way, I'll try it without that part tomorrow.
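
For context on the "header type 127" message: the header type lives in one byte of PCI config space (offset 0x0e), and its low 7 bits give the type. If every config read comes back as 0xff, which typically happens when a device is not responding, that field reads as 0xff & 0x7f = 127. The Python sketch below illustrates the arithmetic and a simple check against sysfs; the function names are made up for illustration, and it does not diagnose why the card stops responding.

```python
def read_config(address, sysfs_root="/sys/bus/pci/devices"):
    """Read the first 64 bytes of a device's config space from sysfs."""
    with open(f"{sysfs_root}/{address}/config", "rb") as f:
        return f.read(64)


def header_type(config_bytes):
    """Return the PCI header type field (offset 0x0e, low 7 bits)."""
    return config_bytes[0x0E] & 0x7F


def looks_unresponsive(config_bytes):
    """True if the standard 64-byte header reads back as all 0xff."""
    head = config_bytes[:64]
    return len(head) == 64 and all(b == 0xFF for b in head)


# What an all-ones read looks like, without touching real hardware
dead = bytes([0xFF]) * 64
print(header_type(dead), looks_unresponsive(dead))
```

Running read_config on the card's address before and after a VM start (the address itself depends on the system and is not given in this thread) would show whether the header has gone to all ones at the point of the error.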

    IMG_20190125_195224.jpg

    Screenshot_20190125-194346.png

    Link to comment




  • Status Definitions

     

    Open = Under consideration.

     

    Solved = The issue has been resolved.

     

    Solved version = The issue has been resolved in the indicated release version.

     

    Closed = Feedback or opinion better posted on our forum for discussion. Also for reports we cannot reproduce or need more information. In this case just add a comment and we will review it again.

     

    Retest = Please retest in latest release.


    Priority Definitions

     

    Minor = Something not working correctly.

     

    Urgent = Server crash, data loss, or other showstopper.

     

    Annoyance = Doesn't affect functionality but should be fixed.

     

    Other = Announcement or other non-issue.