Passthrough USB Hardware goes completely unresponsive, have to reboot server



Recently updated my server to the latest versions of everything.  Quarantine Fun!

 

Is there a way to reset the hardware stack or drivers via command line?

I'm passing through a USB controller and a graphics card to some VMs.  USB devices on the VM are attached to the USB controller through a switch, so I can switch them between another computer and the VM.  Sometimes the controller freaks out... a server reboot takes care of the issue.

 

Linux/Windows VMs will report:

Quote

xHCI host controller not responding, assume dead


 

Thanks!!!
 

 

Edited by Mysticle31

TLDR Summary:

A USB switch switches devices between two laptops and an Unraid VM.  The VM doesn't always reconnect when I switch back to it.  Sometimes it picks the devices up right away, sometimes it takes a minute, but mostly it requires a server reboot.  Whether it is a Linux VM or a Windows VM doesn't seem to matter.

 

Would like to avoid the server reboot at the very least!

Is there a way to reset the hardware stack or drivers via command line?

How can I begin to diagnose and address this USB VM Passthru issue?

 

Details:

In a fit of irony, I've been watching lsmod and lspci to look for changes for the past few hours, but it's worked EVERY TIME.  I'll try and grab working and not working logs...

 

Interestingly, I can know when the system freaks out because I have this corded optical mouse and I can watch the light flash instead of being solid.  Just unplugging and re-plugging the mouse sometimes makes the mouse work, but when it doesn't...time to reboot the server.

 

Here is the process, starting with the server off:

  1. Start the server with the USB device connected.
  2. Start the VM with the USB device connected and passthru enabled.
  3. The VM recognizes the device, and it is usable in the VM.
  4. Switch from the VM to one of the laptops; the laptop recognizes the device instantly.
  5. Switch back to the VM, and one of several things happens:
    1. It recognizes the device instantly (fairly rare).
    2. It recognizes the device after a minute.
    3. It won't recognize the device at all, and I have to switch back and forth a few times to make it recognize it.
    4. I have to reboot the server and start this process over.  Rebooting the VM does not help.

I just saw it happen, and fortunately had the console open.  I don't always see Unraid reporting anything.

 

My Linux VM reported

Quote

 

xhci_hcd 0000:00:06.0: xHCI host controller not responding, assume dead

xhci_hcd 0000:00:06.0: HC died; cleaning up

usb usb1-port1: couldn't allocate usb_device

 

 

Unraid reported

Quote

kernel:Disabling IRQ #18

 

I ran lspci on it.

Sure enough, it is on IRQ 18:

Quote

 

ASMedia Technology Inc. ASM1042A USB 3.0 Host Controller (prog-if 30 [XHCI])

....

Kernel driver in use: vfio-pci. 
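
Since the host disabled IRQ #18 at the moment the controller died, it may be worth checking whether anything else shares that interrupt line. Below is a minimal sketch in Python, assuming the usual `/proc/interrupts` layout (IRQ number, per-CPU counters, interrupt controller, pin type, then the handler names). The `SAMPLE` text is made up for illustration and is not output from this server, and `handlers_on_irq` is just an illustrative helper name.

```python
def handlers_on_irq(interrupts_text, irq):
    """Return the handler names registered on one IRQ line.

    Expects text laid out like /proc/interrupts.
    """
    for line in interrupts_text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep or label.strip() != str(irq):
            continue
        fields = rest.split()
        # Skip the per-CPU counters (pure digits) at the start of the row.
        i = 0
        while i < len(fields) and fields[i].isdigit():
            i += 1
        # The next two fields are assumed to be the interrupt controller
        # and the pin/trigger type, e.g. "IO-APIC 18-fasteoi".
        names = " ".join(fields[i + 2:])
        return [n.strip() for n in names.split(",") if n.strip()]
    return []


# Hypothetical sample, NOT taken from the server in this thread.
SAMPLE = """\
           CPU0       CPU1
 16:          0          0   IO-APIC   16-fasteoi   i801_smbus
 18:     100000          0   IO-APIC   18-fasteoi   ehci_hcd:usb1, vfio-intx(0000:03:00.0)
"""

if __name__ == "__main__":
    print(handlers_on_irq(SAMPLE, 18))
```

On the real host, passing `open("/proc/interrupts").read()` instead of `SAMPLE` lists what is actually attached to IRQ 18. More than one name on that line would mean the passed-through controller shares its interrupt with another device.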

 

 

I tried to rmmod and modprobe vfio-pci, but that didn't work.
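
Reloading the vfio-pci module does not touch the device itself. The kernel also exposes a per-device `remove` file and a bus-wide `rescan` file under `/sys/bus/pci`, which may be worth trying before a full reboot. Below is a minimal sketch, assuming it runs as root on the Unraid host with the VM stopped. The address `0000:03:00.0` is a placeholder (the real one comes from lspci), the helper names are illustrative, and there is no guarantee that a controller hung this way will come back.

```python
from pathlib import Path

SYSFS_PCI = Path("/sys/bus/pci")


def pci_remove_rescan_plan(address, sysfs_pci=SYSFS_PCI):
    """List the (file, value) writes that drop one PCI device and rescan the bus."""
    return [
        (sysfs_pci / "devices" / address / "remove", "1"),
        (sysfs_pci / "rescan", "1"),
    ]


def apply_plan(plan, dry_run=True):
    """Print the equivalent shell commands, or perform the writes if dry_run is False."""
    for path, value in plan:
        if dry_run:
            print(f"echo {value} > {path}")
        else:
            path.write_text(value)


if __name__ == "__main__":
    # Placeholder address; replace with the controller's address from lspci.
    apply_plan(pci_remove_rescan_plan("0000:03:00.0"), dry_run=True)
```

The dry run only prints the two writes, so they can be reviewed before anything is changed. After a rescan the device may need to be bound to vfio-pci again before the VM can use it.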

 

  • 2 months later...

Hi, I have the same kind of issue. I am trying to configure some macOS VMs, which need a keyboard and mouse to work with the eGPU. While setting a VM up I sometimes need to force stop it. Sometimes that works, but most of the time the syslog shows this:

 

Jan 18 13:21:16 Unraid kernel: xhci_hcd 0000:02:00.0: WARNING: Host System Error
Jan 18 13:21:16 Unraid kernel: xhci_hcd 0000:02:00.0: Event logged [IO_PAGE_FAULT domain=0x0000 address=0x00000000fff91000 flags=0x0000]
Jan 18 13:21:16 Unraid kernel: xhci_hcd 0000:02:00.0: Host halt failed, -110
Jan 18 13:21:27 Unraid kernel: xhci_hcd 0000:02:00.0: xHCI host not responding to stop endpoint command.
Jan 18 13:21:27 Unraid kernel: xhci_hcd 0000:02:00.0: Host halt failed, -110
Jan 18 13:21:27 Unraid kernel: xhci_hcd 0000:02:00.0: xHCI host controller not responding, assume dead
Jan 18 13:21:27 Unraid kernel: xhci_hcd 0000:02:00.0: HC died; cleaning up

 

After that USB devices are gone from Unraid GUI and tools. It seems that it disables the 02:00.0 device, which in my case is a USB 3.0 controller.
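
To confirm which controller the kernel gave up on, the PCI address can be pulled out of the syslog. Below is a minimal sketch, assuming log lines in the format shown above; `dead_controllers` is just an illustrative helper name.

```python
import re

# Matches the kernel message and captures the PCI address before it.
DEAD = re.compile(
    r"xhci_hcd (\S+): xHCI host controller not responding, assume dead"
)


def dead_controllers(log_lines):
    """Return PCI addresses the kernel declared dead, in order, without duplicates."""
    seen = []
    for line in log_lines:
        match = DEAD.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


sample = [
    "Jan 18 13:21:27 Unraid kernel: xhci_hcd 0000:02:00.0: Host halt failed, -110",
    "Jan 18 13:21:27 Unraid kernel: xhci_hcd 0000:02:00.0: xHCI host controller not responding, assume dead",
    "Jan 18 13:21:27 Unraid kernel: xhci_hcd 0000:02:00.0: HC died; cleaning up",
]
print(dead_controllers(sample))  # ['0000:02:00.0']
```

Feeding it the lines of the host's syslog file (assumed here to be `/var/log/syslog`) gives the address to look up with lspci.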

Before the crash:

[screenshot: afbeelding.png]

After the crash:

[screenshot: afbeelding.png]

 

 

I am trying to find a command to reset the hardware.

It must be the controller: if I connect the hardware to another port on the server, the server does recognize it, but the same issue remains and the device still does not work. This kind of makes my server a gun with limited ammo: once it is all used up, you need to reload (restart the whole server), which in my case takes the whole network down because I run network VMs.

Does anyone have a command or a way to fix this?
Greetings,
Bjorn (Running unraid 6.8.3)
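
When the controller is still owned by the host's xhci_hcd driver, as in the syslog lines above, one thing that is sometimes tried instead of a reboot is unbinding and rebinding the driver through sysfs. Below is a minimal sketch, assuming root on the host and the address `0000:02:00.0` from that log; the helper names are illustrative. It may not help if the hardware itself has hung, and every device on that controller disconnects during the rebind.

```python
from pathlib import Path

DRIVER_DIR = Path("/sys/bus/pci/drivers/xhci_hcd")


def rebind_writes(address, driver_dir=DRIVER_DIR):
    """Return the (file, value) writes that unbind and then rebind one device."""
    return [
        (driver_dir / "unbind", address),
        (driver_dir / "bind", address),
    ]


def rebind(address, dry_run=True):
    """Print the equivalent shell commands, or perform the writes if dry_run is False."""
    for path, value in rebind_writes(address):
        if dry_run:
            print(f"echo {value} > {path}")
        else:
            path.write_text(value)


if __name__ == "__main__":
    rebind("0000:02:00.0", dry_run=True)
```

If the Unraid boot flash drive sits on the same controller, it is safer to leave this as a dry run and move the flash drive to another controller first.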

 

Edited by maxstevens2
  • 7 months later...

Ever get anywhere with this?  

 

I never got anywhere myself.

 

What I did notice was that my USB switch has an impact on the issue.  It has worked great for months now. It sometimes dies, but if I unplug the switch and plug a device directly into the USB port, it sometimes comes back.  Then sometimes I can put the switch back on and continue for a while, and sometimes the switch won't work and I have to reboot.  It's strange.

 

I've thought about getting an external USB card (I'm using MB only ports) and seeing if it behaves differently.  

  • 4 months later...

I have a similar problem with an ASMedia device.

    Ubuntu 20.04.3 LTS
    Linux idallen-oak 5.13.0-27-generic #29~20.04.1-Ubuntu SMP Fri Jan 14 00:32:30 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

I have a StarTech SDOCK4U33 4-slot USB 3 hard drive dock using ASMedia Technology:

    Bus 004 Device 007: ID 174c:55aa ASMedia Technology Inc. Name: ASM1051E SATA 6Gb/s bridge, ASM1053E SATA 6Gb/s bridge, ASM1153 SATA 3Gb/s bridge, ASM1153E SATA 6Gb/s bridge
    Bus 004 Device 008: ID 174c:55aa ASMedia Technology Inc. Name: ASM1051E SATA 6Gb/s bridge, ASM1053E SATA 6Gb/s bridge, ASM1153 SATA 3Gb/s bridge, ASM1153E SATA 6Gb/s bridge
    Bus 004 Device 009: ID 174c:55aa ASMedia Technology Inc. Name: ASM1051E SATA 6Gb/s bridge, ASM1053E SATA 6Gb/s bridge, ASM1153 SATA 3Gb/s bridge, ASM1153E SATA 6Gb/s bridge
    Bus 004 Device 010: ID 174c:55aa ASMedia Technology Inc. Name: ASM1051E SATA 6Gb/s bridge, ASM1053E SATA 6Gb/s bridge, ASM1153 SATA 3Gb/s bridge, ASM1153E SATA 6Gb/s bridge

If I reboot my Linux machine, the dock causes these USB errors during the boot:

    Jan 18 21:56:52 idallen-oak kernel: [   13.254357] usb 4-3: device descriptor read/8, error -110
    Jan 18 21:57:03 idallen-oak kernel: [   24.398290] xhci_hcd 0000:00:14.0: Abort failed to stop command ring: -110
    Jan 18 21:57:03 idallen-oak kernel: [   24.400554] xhci_hcd 0000:00:14.0: xHCI host controller not responding, assume dead
    Jan 18 21:57:03 idallen-oak kernel: [   24.402820] xhci_hcd 0000:00:14.0: HC died; cleaning up

Losing that controller at boot means many of my other USB devices are
not detected.

If I unplug the hard drive dock USB cable, the machine boots fine.  If I
then plug in the dock USB cable after the machine is up and running,
the above error happens (and I again lose many other USB devices).
Surely plugging in a USB device should not cause the whole controller
to fail and lose all the other working USB devices?

If I power off all of the drives in the hard drive dock and then
power them on, that seems to reset the dock and then everything works
fine again when I reboot.  The dock is correctly detected at boot and
everything works.  If I then reboot with the dock running, the problem
returns until I power cycle the dock and reboot again.

There is something about this dock that xhci_hcd doesn't like when the
dock is left powered on across a reboot.
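
The `-110` in those messages is a negated Linux errno value. 110 is ETIMEDOUT, so the controller timed out waiting for the device rather than receiving a bad answer from it. Below is a small sketch that annotates such codes in log lines, assuming the Linux errno numbering. The table is hard-coded and deliberately short, and `annotate` is an illustrative helper name.

```python
import re

# A few Linux errno values that show up in USB kernel messages.
LINUX_ERRNO = {19: "ENODEV", 32: "EPIPE", 71: "EPROTO", 110: "ETIMEDOUT"}

# A minus sign followed by digits, not glued to a preceding word or number
# (so "usb 4-3" and "5.13.0-27" are left alone).
CODE = re.compile(r"(?<![\w.])-(\d+)\b")


def annotate(line):
    """Append the errno name after each negative error code found in the line."""
    def repl(match):
        name = LINUX_ERRNO.get(int(match.group(1)))
        return f"{match.group(0)} ({name})" if name else match.group(0)
    return CODE.sub(repl, line)


print(annotate("usb 4-3: device descriptor read/8, error -110"))
print(annotate("xhci_hcd 0000:00:14.0: Abort failed to stop command ring: -110"))
```

A timeout is consistent with the dock being stuck in a state where it no longer answers until it is power cycled.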

More details (syslog, etc.) upon request.
 

  • 7 months later...

Hi, did you solve the issue?
I am having a similar problem with Unraid.

Aug 19 22:45:07 LakNAS root: plugin: running: anonymous
Aug 19 22:45:07 LakNAS root: plugin: running: anonymous
Aug 19 22:45:36 LakNAS nginx: 2022/08/19 22:45:36 [error] 5699#5699: *124209 open() "/usr/local/emhttp/plugins/dynamix.file.manager/javascript/ace/mode-log.js" failed (2: No such file or directory) while sending to client, client: 192.168.1.100, server: , request: "GET /plugins/dynamix.file.manager/javascript/ace/mode-log.js HTTP/2.0", host: "192.168.1.150", referrer: "https://192.168.1.150/Shares/Browse?dir=/mnt/user/logs"
Aug 20 06:50:41 LakNAS kernel: xhci_hcd 0000:02:00.0: xHCI host controller not responding, assume dead
Aug 20 06:50:41 LakNAS kernel: xhci_hcd 0000:02:00.0: HC died; cleaning up
Aug 20 06:50:41 LakNAS kernel: usb 1-7: USB disconnect, device number 2
Aug 20 06:50:41 LakNAS kernel: usb 1-9: USB disconnect, device number 3
Aug 20 06:50:41 LakNAS kernel: usb 1-10: USB disconnect, device number 4
Aug 20 06:50:41 LakNAS kernel: FAT-fs (sda1): Directory bread(block 30546) failed
Aug 20 06:50:41 LakNAS kernel: FAT-fs (sda1): Directory bread(block 30547) failed
Aug 20 06:50:41 LakNAS kernel: FAT-fs (sda1): Directory bread(block 30548) failed
Aug 20 06:50:41 LakNAS kernel: FAT-fs (sda1): Directory bread(block 30549) failed

 

After the server has been up for some time (several hours), I get this error, and after that the server becomes unresponsive.

 

I have to restart the server again and again... Do you know what could be happening?

  • 2 weeks later...

Try disconnecting all other USB devices.  It is pretty common for one bad USB device to bring down the whole USB bus, or for a mouse to have a short in its cable that only shows up with certain movements. Also, a lot of cheap wireless keyboard/mouse receivers hang after some time and disrupt the USB bus.

 
