makin Posted April 9, 2022 (edited)

Hi everyone,

I am currently facing a critical issue: my Unraid server's network hangs whenever my Debian VM with GPU passthrough is running. I had a similar issue to the one described in this thread here, but since I added another cache drive (M.2 SSD) and reconfigured my VM's XML (changing the addresses in the XML form so that GPU passthrough works), the Unraid server becomes inaccessible over the network after a few minutes. Unplugging the Ethernet cable and plugging it back in solves the issue for the next few minutes. With the VM shut down, the server remains accessible.

Here is how I set up the VM and the XML (sorry for only providing screenshots).

I am at my wits' end... I cannot see anything in the logs. Here is an extract covering the moment the network cable was pulled out and plugged in again:

Apr 9 13:35:55 GRAViTY ntpd[2076]: no peer for too long, server running free now
Apr 9 14:11:14 GRAViTY ntpd[2076]: no peer for too long, server running free now
Apr 9 15:35:08 GRAViTY kernel: r8169 0000:04:00.0 eth0: Link is Down
Apr 9 15:35:08 GRAViTY kernel: bond0: (slave eth0): link status definitely down, disabling slave
Apr 9 15:35:08 GRAViTY kernel: device eth0 left promiscuous mode
Apr 9 15:35:08 GRAViTY kernel: bond0: now running without any active interface!
Apr 9 15:35:08 GRAViTY kernel: br0: port 1(bond0) entered disabled state
Apr 9 15:35:14 GRAViTY kernel: r8169 0000:04:00.0 eth0: Link is Up - 1Gbps/Full - flow control off
Apr 9 15:35:14 GRAViTY kernel: bond0: (slave eth0): link status definitely up, 1000 Mbps full duplex
Apr 9 15:35:14 GRAViTY kernel: bond0: (slave eth0): making interface the new active one
Apr 9 15:35:14 GRAViTY kernel: device eth0 entered promiscuous mode
Apr 9 15:35:14 GRAViTY kernel: bond0: active interface up!
Apr 9 15:35:14 GRAViTY kernel: br0: port 1(bond0) entered blocking state
Apr 9 15:35:14 GRAViTY kernel: br0: port 1(bond0) entered forwarding state
Apr 9 15:35:58 GRAViTY kernel: br0: port 2(vnet0) entered disabled state
Apr 9 15:35:58 GRAViTY kernel: device vnet0 left promiscuous mode
Apr 9 15:35:58 GRAViTY kernel: br0: port 2(vnet0) entered disabled state
Apr 9 15:35:58 GRAViTY kernel: usb 1-2: reset full-speed USB device number 3 using xhci_hcd
Apr 9 15:35:58 GRAViTY kernel: input: Microsoft Microsoft® 2.4GHz Transceiver v7.0 as /devices/pci0000:00/0000:00:01.3/0000:02:00.0/usb1/1-2/1-2:1.0/0003:045E:07B2.0004/input/input9
Apr 9 15:35:58 GRAViTY kernel: hid-generic 0003:045E:07B2.0004: input,hidraw0: USB HID v1.11 Keyboard [Microsoft Microsoft® 2.4GHz Transceiver v7.0] on usb-0000:02:00.0-2/input0
Apr 9 15:35:58 GRAViTY kernel: input: Microsoft Microsoft® 2.4GHz Transceiver v7.0 Mouse as /devices/pci0000:00/0000:00:01.3/0000:02:00.0/usb1/1-2/1-2:1.1/0003:045E:07B2.0005/input/input10
Apr 9 15:35:58 GRAViTY kernel: input: Microsoft Microsoft® 2.4GHz Transceiver v7.0 Consumer Control as /devices/pci0000:00/0000:00:01.3/0000:02:00.0/usb1/1-2/1-2:1.1/0003:045E:07B2.0005/input/input11
Apr 9 15:35:58 GRAViTY kernel: hid-generic 0003:045E:07B2.0005: input,hidraw1: USB HID v1.11 Mouse [Microsoft Microsoft® 2.4GHz Transceiver v7.0] on usb-0000:02:00.0-2/input1
Apr 9 15:35:58 GRAViTY kernel: input: Microsoft Microsoft® 2.4GHz Transceiver v7.0 Consumer Control as /devices/pci0000:00/0000:00:01.3/0000:02:00.0/usb1/1-2/1-2:1.2/0003:045E:07B2.0006/input/input12
Apr 9 15:35:58 GRAViTY kernel: input: Microsoft Microsoft® 2.4GHz Transceiver v7.0 System Control as /devices/pci0000:00/0000:00:01.3/0000:02:00.0/usb1/1-2/1-2:1.2/0003:045E:07B2.0006/input/input14
Apr 9 15:35:58 GRAViTY kernel: hid-generic 0003:045E:07B2.0006: input,hiddev96,hidraw2: USB HID v1.11 Device [Microsoft Microsoft® 2.4GHz Transceiver v7.0] on usb-0000:02:00.0-2/input2
Apr 9 15:36:00 GRAViTY kernel: vfio-pci 0000:07:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem

I have attached the diagnostics file. I would appreciate any hint from you guys! The server being inaccessible is especially painful because my Home Assistant runs on it and I rely on that heavily.

Thanks in advance!

gravity-diagnostics-20220409-1549.zip

Edited April 9, 2022 by makin
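For anyone reading a syslog extract like the one above, the USB and HID lines tend to bury the entries that matter for a network hang. The following is a minimal sketch, not an Unraid tool, that keeps only the kernel lines about link and bridge-port state changes. The function name `link_events` and the marker list are assumptions made for illustration; the markers are simply the phrases that appear in this log.

```python
import re

# Syslog prefix as it appears in the extract, e.g.
# "Apr 9 15:35:08 GRAViTY kernel: <message>"
LINE_RE = re.compile(
    r"^(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"(?P<source>[^:]+):\s+(?P<message>.*)$"
)

# Phrases taken from the log in this post that signal a link or
# bridge-port state change (assumed list, extend as needed).
STATE_MARKERS = (
    "Link is Down",
    "Link is Up",
    "link status definitely",
    "without any active interface",
    "entered disabled state",
    "entered blocking state",
    "entered forwarding state",
)


def link_events(lines):
    """Return (time, message) pairs for kernel link-state lines only."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # not a syslog line in the expected format
        if not match["source"].startswith("kernel"):
            continue  # skip ntpd and other daemons
        message = match["message"]
        if any(marker in message for marker in STATE_MARKERS):
            events.append((match["time"], message))
    return events
```

Feeding it the saved syslog, for example `link_events(open("syslog.txt"))` with `syslog.txt` as a placeholder file name, yields a short timeline of when eth0, bond0 and the bridge ports changed state.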
makin Posted April 9, 2022 Author

I found out the following:

- While the VM is running, network access to the server times out; occasionally the server is reachable again for a few minutes.
- Pulling the Ethernet cable and plugging it back in makes the server available again immediately (for a few minutes).
- When the VM is shut down, the server becomes unreachable shortly afterwards and stays that way until I re-plug the Ethernet cable.

I attached some screenshots of ping attempts. And lastly, one with the VM shut down and the cable pulled out and plugged in again.
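Screenshots of ping attempts are hard to line up with the syslog, so here is a rough sketch of how the same observation could be recorded as timestamps instead. It assumes a Linux client with the iputils `ping` command (the `-c` and `-W` flags); the function names and the default intervals are made up for illustration.

```python
import subprocess
import time


def is_reachable(host, timeout_s=1):
    """Send a single echo request using the system ping (iputils flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def outage_windows(samples):
    """Collapse (timestamp, reachable) samples into (start, end) outage windows.

    An outage that is still open at the last sample gets end=None.
    """
    windows = []
    start = None
    for timestamp, reachable in samples:
        if not reachable and start is None:
            start = timestamp  # first failed probe opens a window
        elif reachable and start is not None:
            windows.append((start, timestamp))  # first success closes it
            start = None
    if start is not None:
        windows.append((start, None))
    return windows


def monitor(host, interval_s=5, duration_s=600):
    """Probe the host at a fixed interval and return the outage windows."""
    samples = []
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        samples.append((time.strftime("%H:%M:%S"), is_reachable(host)))
        time.sleep(interval_s)
    return outage_windows(samples)
```

Running `monitor("192.168.1.10")` from another machine on the LAN (the address is a placeholder for the server's IP) gives a list of start and end times that can be compared directly against the kernel log timestamps.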
makin Posted May 19, 2022 Author

Hi again,

I'm not sure whether it is right to post here again instead of creating a new thread, but I partly solved the issue with a workaround. I assume the VM is causing some kind of network storm, so to troubleshoot I created a new VLAN that contains only the VM (managed by my UniFi UDM Pro). The server stayed accessible from then until today. This time the cause is not my HTPC VM: the problem came back after I installed the ZWaveJS2MQTT Docker container.

I really have no idea how to troubleshoot this, but come on… I don't want to create artificial VLANs just to work around this problem. Do you guys have any idea?