hoff Posted June 9, 2022 (edited)
Hi All.

The problem is that I lose connectivity to anything outside my immediate L2 domain. I can ping the default gateway and manage the device; all is fine there. Turning Docker on causes the loss of external network traffic, and turning it off restores it.

Dell R430 Unraid server with a tg3 NIC. Running with Intel VT off or on makes no difference. Docker runs only on IPVLAN to stop the crashes from macvlan. Only Docker.

The server boots and works as expected; it can ping 1.1.1.1, for example.
Start RAID: still OK.
Start Docker: still OK.
1-5 minutes later: no external traffic.
Stop Docker: everything returns.

macvlan has caused server lockups in the past; ipvlan resolved the lockups.

Tested:
VT on and off
6.10.1 and 6.10.2, no change

Any thoughts?
ljm42 Posted June 9, 2022
Not sure if you've seen the 6.10.2 release notes, which mention the tg3 driver specifically? https://forums.unraid.net/topic/124108-unraid-os-version-6102-available/
If that doesn't resolve the issue, be sure to upload your diagnostics.zip file (from Tools -> Diagnostics).
hoff Posted June 9, 2022
Yeah, it happens on both 6.10.1 and 6.10.2. I'll upgrade again now and retest.
hoff Posted June 9, 2022
Upload the diags here?
So on 6.10.2 (VT-d off in the BIOS) the same thing happens, but if I disable Docker (ipvlan) the system behaves correctly. Once Docker (ipvlan) is running and the containers are booted, within 1-2 minutes I lose all traffic external to my L2 domain/subnet. If I stop all containers, the server remains "offline" outside my local layer 2 network until I turn off Docker, and then services return.
I just tested the same with Docker (macvlan) and I am online and services appear to be working with no issues. Switched back to Docker (ipvlan) and the problem has not recurred. *puzzled looks*
hoff Posted June 9, 2022
Apologies, it failed again once back on IPVLAN. So it appears to be an issue with IPVLAN + Docker in 6.10.1/6.10.2.
ljm42 Posted June 10, 2022
On 6/8/2022 at 8:56 PM, hoff said: "upload the diags here?"
Right, reproduce the problem and then upload the entire diagnostics.zip file (from Tools -> Diagnostics) to your next post in this thread. Hopefully someone from the community will spot the issue.
aarontry Posted June 16, 2022
I have the same issue. If you have host access enabled with ipvlan, then you lose external network access within a few minutes.
ljm42 Posted June 16, 2022
8 hours ago, aarontry said: "I have the same issue. If you have host access enabled with ipvlan then you lose external network in a few minutes."
Reproduce the problem and then upload the entire diagnostics.zip file (from Tools -> Diagnostics) to your next post in this thread. Hopefully someone from the community will spot the issue.
hoff Posted June 18, 2022
Confirmed also.
MACVLAN will cause an nf_nat fault and a kernel panic.
IPVLAN will cause loss of connectivity, and I will test with host access on, as @aarontry commented.
At this point my Unraid server is useless. I will upload diagnostics with IPVLAN and host access on shortly.
hoff Posted June 18, 2022 (edited)
@ljm42 diag - IPVLAN on. No network traffic after a few minutes. Stop Docker and external network access recovers.
tower-diagnostics-20220618-1816.zip
hoff Posted June 18, 2022
Hi All.

Looking at the output below, I am no expert on shim-br0, but that seems to be the issue. I am thinking it might be a stale ARP/MAC issue, as it takes a while to time out and then dies...

If you are running a ping when IPVLAN + host access stops working, there is no longer any transmit or receive for anything off-net. With "tcpdump -nn -vv host 1.1.1.1" you can't see the TX or RX of that packet; the box basically forgets how to get off its local subnet.

Docker + IPVLAN with Host Access
-----------------------------------------------------------
root@Tower:~# arp -an | grep "80.1)"
? (10.80.80.1) at 7c:5a:1c:a2:7a:03 [ether] on shim-br0
? (10.80.80.1) at <incomplete> on br0

16: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 14:18:77:4e:bf:d3 brd ff:ff:ff:ff:ff:ff
    inet 10.80.80.100/24 scope global br0
59: shim-br0@br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 14:18:77:4e:bf:d3 brd ff:ff:ff:ff:ff:ff
    inet 10.80.80.100/32 scope global shim-br0

root@Tower:~# ip route
default via 10.80.80.1 dev br0
10.80.80.0/25 dev shim-br0 scope link
10.80.80.0/24 dev br0 proto kernel scope link src 10.80.80.100
10.80.80.128/25 dev shim-br0 scope link
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1

Docker on without Host Access
-----------------------------------------------------
root@Tower:~# ip route
default via 10.80.80.1 dev br0
10.80.80.0/24 dev br0 proto kernel scope link src 10.80.80.100
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1

Docker Off
-------------------
root@Tower:~# arp -an | grep "80.1)"
? (10.80.80.1) at 7c:5a:1c:a2:7a:03 [ether] on br0

No shim-br0
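One way to read the routing table in the post above is to apply longest-prefix matching by hand. The sketch below is only an illustration in Python: the route list is copied from the posted "ip route" output with host access enabled, and the helper name pick_route is hypothetical. It does not prove the root cause; it only shows which device the table would select for a given destination.

```python
import ipaddress

# Routes copied from the `ip route` output in the post (host access enabled).
# Each entry is (prefix, device); the default route is written as 0.0.0.0/0.
ROUTES = [
    ("0.0.0.0/0", "br0"),            # default via 10.80.80.1 dev br0
    ("10.80.80.0/25", "shim-br0"),
    ("10.80.80.0/24", "br0"),
    ("10.80.80.128/25", "shim-br0"),
    ("172.17.0.0/16", "docker0"),
]


def pick_route(dest, routes=ROUTES):
    """Return (prefix, device) of the most specific route that contains dest.

    pick_route is a hypothetical helper for illustration only.
    """
    addr = ipaddress.ip_address(dest)
    matches = []
    for prefix, dev in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net:
            matches.append((net, dev))
    # Longest prefix wins, mirroring how the kernel ranks matching routes.
    net, dev = max(matches, key=lambda m: m[0].prefixlen)
    return str(net), dev


if __name__ == "__main__":
    for dest in ("10.80.80.1", "10.80.80.200", "1.1.1.1"):
        print(dest, "->", pick_route(dest))
```

Under these assumptions, traffic addressed to the gateway itself (10.80.80.1) matches the /25 on shim-br0, while off-net traffic such as 1.1.1.1 matches only the default route, which is bound to br0. That is consistent with the ARP output in the post, where the gateway resolves on shim-br0 but shows as incomplete on br0.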
hoff Posted June 18, 2022
Why do I have host access?
I run a couple of MySQL and other databases on lan_ip's on the bridge, and a number of services on host ports which need to talk to those other lan_ip's. A number of the out-of-the-box apps in the repo don't like the br IP configuration, which I would prefer for everything, and I've been too lazy to fix all the apps that have issues not being on 'host'.
It might actually be time for Unraid not to be my container node at this point.
JorgeB Posted June 18, 2022
See if this helps you: https://forums.unraid.net/bug-reports/stable-releases/6100-6102-network-lost-after-some-days-working-r1971/?do=findComment&comment=19909
hoff Posted June 19, 2022
I will read it again, but that resolution involves moving all my containers that are not on the host network to another subnet rather than my primary one, which is how I've operated for a long time. Changing the networking config of all my containers is going to be challenging with all my cross-container application-level integrations.
As this worked pre-6.10.*, there has to be a way? Bug?
hoff Posted June 20, 2022
After reading the link above again: this is a workaround for a problem that is occurring now and didn't before. It was more than possible (I was running it) to operate IPVLAN + host access with Docker containers having IPs on br0 in the same subnet as my host and primary network. Adding another Docker network/VLAN and reconfiguring everything is really not an option at this time.
What's broken compared with previous versions?
bonienl Posted June 22, 2022
When you enable "Host access to custom network", a routing trick with more specific routes for the same IP subnet (br0) is introduced to fool Docker and make access possible, something which Docker does not allow for security reasons.
The first question to ask: do I really need host access? In general this is NOT required, but there are some specific cases which need it. Have you tried running your containers without host access enabled?
Because br0 and shim-br0 are actually one and the same machine (your Unraid box) announced twice, certain network equipment, e.g. certain firewalls, may not like this situation and may block communication. In such a case you'll need to look into that device's configuration and see whether exception rules can be set.
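The "more specific routes" described above can be illustrated with a small Python sketch. It assumes the shim routes are simply the two halves of the host subnet, which matches the 10.80.80.0/25 and 10.80.80.128/25 entries posted earlier in the thread; how Unraid actually generates them is not shown in this thread, and the function name split_in_halves is hypothetical.

```python
import ipaddress


def split_in_halves(cidr):
    """Split a subnet into its two halves (one extra prefix bit each).

    Hypothetical helper: it only reproduces the shape of the shim-br0 routes
    seen in the thread, not Unraid's actual implementation.
    """
    net = ipaddress.ip_network(cidr)
    return [str(half) for half in net.subnets(prefixlen_diff=1)]


if __name__ == "__main__":
    # 10.80.80.0/24 is the br0 subnet from the posted `ip route` output.
    print(split_in_halves("10.80.80.0/24"))
```

Because each half has a longer prefix than the original /24, both halves outrank the br0 route for every address in the subnet, so local traffic goes out through shim-br0 while the /24 route on br0 stays in the table.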
ronmcmxci Posted June 8, 2023
I ran into this same issue. As @hoff says, it looks to be an ARP issue for me also. The ARP entry for my gateway on br0 will occasionally go <incomplete>. To work around it, I configured a static ARP entry for my gateway with "arp -s <gateway> <gateway mac> -i br0". I'll follow up if I see an issue.
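For anyone who wants to spot the <incomplete> state before applying the workaround, here is a rough Python sketch that scans "arp -an" style text for incomplete entries. The sample text is the output posted earlier in this thread; the regular expression only handles that exact line format, and the helper names (parse_arp, incomplete_entries, static_arp_command) are hypothetical. The command it builds is the same "arp -s ... -i br0" form quoted above; the sketch does not run it.

```python
import re

# Matches lines such as:
#   ? (10.80.80.1) at 7c:5a:1c:a2:7a:03 [ether] on shim-br0
#   ? (10.80.80.1) at <incomplete> on br0
# Assumption: only the line format shown in this thread is handled.
ARP_LINE = re.compile(
    r"^\S+ \((?P<ip>[\d.]+)\) at (?P<hw>\S+)(?: \[\w+\])? on (?P<dev>\S+)"
)

# Sample copied from the `arp -an` output posted in the thread.
SAMPLE = """\
? (10.80.80.1) at 7c:5a:1c:a2:7a:03 [ether] on shim-br0
? (10.80.80.1) at <incomplete> on br0
"""


def parse_arp(text):
    """Return a list of (ip, hw_address, device) tuples from arp -an text."""
    entries = []
    for line in text.splitlines():
        m = ARP_LINE.match(line.strip())
        if m:
            entries.append((m["ip"], m["hw"], m["dev"]))
    return entries


def incomplete_entries(text):
    """Return (ip, device) pairs whose hardware address is unresolved."""
    return [(ip, dev) for ip, hw, dev in parse_arp(text) if hw == "<incomplete>"]


def static_arp_command(ip, mac, dev):
    """Build the static-ARP command quoted in the post as an argument list."""
    return ["arp", "-s", ip, mac, "-i", dev]


if __name__ == "__main__":
    print(incomplete_entries(SAMPLE))
    print(" ".join(static_arp_command("10.80.80.1", "7c:5a:1c:a2:7a:03", "br0")))
```

The addresses passed to static_arp_command here are the ones from hoff's output and are used purely as an example; substitute your own gateway IP and MAC.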