May 27, 2024

I have a very weird networking issue that I have not been able to reproduce. It happens irregularly, anywhere from once a week to once in several months.

Out of the blue, my LAN devices can no longer access the WAN, and some time later I am unable to reach any other device in my LAN. (Other LAN VLANs and WLAN continue to work.)

At first I suspected some part of my network infrastructure, so I restarted and/or temporarily replaced the devices one by one, until I observed that when I unplug my Unraid instance from the network, all other devices start working again. Reconnecting the Unraid instance brings the problems back.

The Unraid instance runs headless, and for whatever (unrelated) reason my BMC/KVM view is not able to get any output (or maybe to wake it), so my only option is to trigger a server reset, after which I can safely reconnect the instance to my network.

I did not notice anything really obvious in the logs. I even ran tcpdump on the network traffic, but I only looked for obvious flooding.

I run several Docker instances and services, and I thought about turning some off for testing. However, since I could not find a reliable way to trigger this issue, I do not want to keep my services disabled for weeks.

I am well aware that this issue sounds very unbelievable, or at least like it "has to be something else". Given how difficult the problem is to analyze, I would not have posted here if I had not read posts about similar issues (without solutions) elsewhere: https://www.reddit.com/r/unRAID/comments/13becyn/unraid_takes_down_my_internet_at_the_wan_level/
May 28, 2024 Community Expert

This sounds more like an ISP issue affecting the router. When the network is in that state, open a terminal / cmd and run ping to see whether you can ping the LAN. Then ping the router, then 8.8.8.8, and then google.com. Or run a network monitoring tool such as netprobe.

Please post your Unraid diagnostics file. I'm not sure whether your Unraid system is freezing, but that shouldn't prevent other LAN access.
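The ping sequence suggested above could be scripted roughly as follows. This is only a sketch: `probe` and `walk_targets` are hypothetical helper names, and the `ping` flags assume Linux iputils (`-c` for the count, `-W` for the reply timeout in seconds); on Windows or macOS the flags differ.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the target answers at least one ping.
# Assumes Linux iputils ping (-c count, -W reply timeout in seconds).
probe() {
    ping -c 2 -W 2 "$1" >/dev/null 2>&1
}

# Walk the targets in the order given (nearest first) and stop at the
# first one that does not answer, so the failing hop is easy to spot.
walk_targets() {
    for target in "$@"; do
        if probe "$target"; then
            echo "ok   $target"
        else
            echo "FAIL $target"
            return 1
        fi
    done
    return 0
}
```

A call such as `walk_targets 192.168.0.200 192.168.0.1 8.8.8.8 google.com` would test a LAN host, the router, a public IP, and DNS resolution in that order. The two 192.168.0.x addresses are the server and gateway addresses that appear later in this thread; substitute your own.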
May 28, 2024

On 5/28/2024 at 5:22 AM, n00b42 said:
I run several Docker instances and services, and I thought about turning some off for testing. However, since I could not find a reliable way to trigger this issue, I do not want to keep my services disabled for weeks.

I believe some Docker container or add-on is taking your network down; other people have reported similar cases. I don't think you have many options: you need to stop something in order to troubleshoot.

For my part, I have never had this problem with Unraid (several Unraid builds over 5+ years, one of which ran 24/7 for 1.5 years). Maybe Plex? I don't run that.

Edited May 28, 2024 by Vr2Io
May 28, 2024 Author

38 minutes ago, bmartino1 said:
This sounds more like an ISP issue affecting the router. When the network is in that state, open a terminal / cmd and run ping to see whether you can ping the LAN. Then ping the router, then 8.8.8.8, and then google.com. Or run a network monitoring tool such as netprobe. Please post your Unraid diagnostics file. I'm not sure whether your Unraid system is freezing, but that shouldn't prevent other LAN access.

Setup: Both my Unraid server and my PC are in the same LAN (and even in the same VLAN).

Situation: When the problem occurs, I cannot reach any other device in my LAN from my PC. However, as soon as I either disconnect the server OR put it (or the PC) in another VLAN (even if routed in the same subnet), I am able to access all remaining LAN devices from my PC (as well as my router/WAN).

Diagnostic problem (unrelated to the actual issue): When the problem occurs, I am unable to access the headless server (and connecting a monitor and keyboard in that state has no effect), so I have never been able to access the device while the issue was happening. However, I am working on solving that (using a dummy VGA plug to keep the display alive so that my BMC can access it).

Thanks for the hint about NetProbe. I am not sure it will help here, but it is certainly an interesting tool that I will look into.

I attached a diagnostics file. However, since I needed to restart the server to access it, there is obviously no log of the incident (or of anything before it).

xxx-diagnostics-20240529-0107.zip
May 28, 2024 Author

37 minutes ago, Vr2Io said:
I believe some Docker container or add-on is taking your network down; other people have reported similar cases. I don't think you have many options: you need to stop something in order to troubleshoot. For my part, I have never had this problem with Unraid (several Unraid builds over 5+ years, one of which ran 24/7 for 1.5 years). Maybe Plex? I don't run that.

I agree that this is the most probable reason, and I would agree completely if it happened more often (within days). However, the time between the last two incidents was well over a month, which is not a reasonable time to wait without using the server productively.

I will check my Docker containers and think about suspending some of them for now, or at least changing some Docker settings of the remaining ones (bridged or privileged).
May 29, 2024 Community Expert

Reviewing the diagnostics to hopefully shed some light. I would also recommend running memtest when you have a chance.

Usually it's DNS, and in my experience with Unraid it's avahi/nmbd, during a system freeze on the server and then on the LAN. The LAN tries to complete the DNS request and fails. When this happens, it creates a multicast/UDP bomb on the LAN. Use Wireshark to capture packets when it happens again.

In the diagnostics: per the go file, you run the command:

setterm -blank 0

I'm not sure what you are setting here, or why.
https://www3.rocketsoftware.com/rocketd3/support/documentation/mvb/32/refman/fileacct/set-term_command.htm

You also have bond and bridge on (this tells me default Docker is ipvlan, as this has caused problems before). You also have multiple NICs, VLANs, and a custom Docker network. We may need to know more about what is connected to what in order to diagnose.

Syslog concerns:

mcelog: Kernel does not support page offline interface
May 28 12:30:06 XXX wsdd2[1644]: 'Terminated' signal received.
May 28 12:30:06 XXX nmbd[1634]: [2024/05/28 12:30:06.423816, 0] ../../source3/nmbd/nmbd.c:59(terminate)
May 28 12:30:06 XXX nmbd[1634]: Got SIGTERM: going down...
May 28 12:30:06 XXX winbindd[1647]: [2024/05/28 12:30:06.423889, 0] ../../source3/winbindd/winbindd_dual.c:1950(winbindd_sig_term_handler)
May 28 12:30:06 XXX winbindd[1647]: Got sig[15] terminate (is_parent=1)
May 28 12:30:06 XXX wsdd2[1644]: terminating.
May 28 12:30:09 XXX root: Starting Avahi mDNS/DNS-SD Daemon: /usr/sbin/avahi-daemon -D
May 28 12:30:09 XXX avahi-daemon[5717]: Found user 'avahi' (UID 61) and group 'avahi' (GID 214).
May 28 12:30:09 XXX avahi-daemon[5717]: Successfully dropped root privileges.
May 28 12:30:09 XXX avahi-daemon[5717]: avahi-daemon 0.8 starting up.
May 28 12:30:09 XXX avahi-daemon[5717]: Successfully called chroot().
May 28 12:30:09 XXX avahi-daemon[5717]: Successfully dropped remaining capabilities.
May 28 12:30:09 XXX avahi-daemon[5717]: Loading service file /services/sftp-ssh.service.
May 28 12:30:09 XXX avahi-daemon[5717]: Loading service file /services/smb.service.
May 28 12:30:09 XXX avahi-daemon[5717]: Loading service file /services/ssh.service.
May 28 12:30:09 XXX avahi-daemon[5717]: Joining mDNS multicast group on interface br0.IPv4 with address 192.168.0.200.
May 28 12:30:09 XXX avahi-daemon[5717]: New relevant interface br0.IPv4 for mDNS.
May 28 12:30:09 XXX avahi-daemon[5717]: Network interface enumeration completed.
May 28 12:30:09 XXX avahi-daemon[5717]: Registering new address record for 192.168.0.200 on br0.IPv4.

Samba starts:

May 28 12:30:17 XXX nmbd[5589]: Got SIGTERM: going down...
May 28 12:30:17 XXX winbindd[5659]: [2024/05/28 12:30:17.212377, 0] ../../source3/winbindd/winbindd_dual.c:1950(winbindd_sig_term_handler)
May 28 12:30:17 XXX winbindd[5659]: Got sig[15] terminate (is_parent=1)
May 28 12:30:17 XXX wsdd2[5640]: 'Terminated' signal received.
May 28 12:30:17 XXX wsdd2[5640]: terminating.
May 28 12:30:17 XXX winbindd[6346]: [2024/05/28 12:30:17.213741, 0] ../../source3/winbindd/winbindd_dual.c:1950(winbindd_sig_term_handler)
May 28 12:30:17 XXX winbindd[6346]: Got sig[15] terminate (is_parent=0)
May 28 12:30:17 XXX winbindd[5662]: [2024/05/28 12:30:17.213798, 0] ../../source3/winbindd/winbindd_dual.c:1950(winbindd_sig_term_handler)
May 28 12:30:17 XXX winbindd[5662]: Got sig[15] terminate (is_parent=0)
May 28 12:30:17 XXX network: reload service: nginx

^ Per the log there are definitely some concerns, but not errors, with your SMB/NetBIOS and mDNS on the network. What router do you use?

Also, per FCP:

May 28 12:40:13 XXX root: Fix Common Problems: Error: Macvlan and Bridging found

You are a victim of the macvlan trace bug...

Either turn off bridging or set the Docker settings to use ipvlan. If you want to run macvlan with bridging enabled, you will need to create the macvlan network manually and attach it to an eth interface, not br0.

Syslog, Docker containers start:

May 28 13:44:19 XXX kernel: docker0: port 2(veth4c24e61) entered blocking state
May 28 13:44:19 XXX kernel: docker0: port 2(veth4c24e61) entered disabled state
May 28 13:44:19 XXX kernel: device veth4c24e61 entered promiscuous mode
May 28 13:44:19 XXX kernel: docker0: port 2(veth4c24e61) entered blocking state
May 28 13:44:19 XXX kernel: docker0: port 2(veth4c24e61) entered forwarding state
May 28 13:44:19 XXX kernel: docker0: port 2(veth4c24e61) entered disabled state
May 28 13:44:20 XXX kernel: eth0: renamed from vethfadee0e
May 28 13:44:20 XXX kernel: IPv6: ADDRCONF(NETDEV_CHANGE): veth4c24e61: link becomes ready

Every veth is a Docker container on a custom network.
May 29, 2024 Community Expert

Review:
https://docs.unraid.net/unraid-os/release-notes/6.12.4/#fix-for-macvlan-call-traces
https://unraid.net/blog/6-12-10

Quote:
Call traces and crashes related to macvlan
If you are getting call traces related to macvlan (or any unexplained crashes, really), as a first step we recommend navigating to Settings > Docker, switching to advanced view, and changing the Docker custom network type from macvlan to ipvlan. This is the default configuration that Unraid has shipped with since version 6.11.5 and should work for most systems. Note that some users have reported issues with port forwarding from certain routers (Fritzbox) and reduced functionality with advanced network management tools (Ubiquity) when in ipvlan mode. If this affects you, see the alternate solution available since Unraid 6.12.4.

Per the diagnostics, you have a kernel call trace from the macvlan trace bug:

May 28 20:15:06 XXX kernel: Workqueue: events_unbound macvlan_process_broadcast [macvlan]
May 28 20:15:06 XXX kernel: RIP: 0010:__nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
May 28 20:15:06 XXX kernel: Code: 44 24 10 e8 e2 e1 ff ff 8b 7c 24 04 89 ea 89 c6 89 04 24 e8 7e e6 ff ff 84 c0 75 a2 48 89 df e8 9b e2 ff ff 85 c0 89 c5 74 18 <0f> 0b 8b 34 24 8b 7c 24 04 e8 18 dd ff ff e8 93 e3 ff ff e9 72 01
May 28 20:15:06 XXX kernel: RSP: 0018:ffffc90000148d98 EFLAGS: 00010202
May 28 20:15:06 XXX kernel: RAX: 0000000000000001 RBX: ffff8882e52c0d00 RCX: 18de1138e9c2945e
May 28 20:15:06 XXX kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff8882e52c0d00
May 28 20:15:06 XXX kernel: RBP: 0000000000000001 R08: 1d68469c69805a1a R09: 1aca27c1fce9b86d
May 28 20:15:06 XXX kernel: R10: 071ceac07eb4b712 R11: ffffc90000148d60 R12: ffffffff82a11d00
May 28 20:15:06 XXX kernel: R13: 000000000001cde8 R14: ffff8886a2ef4d00 R15: 0000000000000000
May 28 20:15:06 XXX kernel: FS: 0000000000000000(0000) GS:ffff88885fc80000(0000) knlGS:0000000000000000
May 28 20:15:06 XXX kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 28 20:15:06 XXX kernel: CR2: 0000154199e8311c CR3: 00000001f0a9c000 CR4: 00000000003506e0
May 28 20:15:06 XXX kernel: Call Trace:
May 28 20:15:06 XXX kernel: <IRQ>
May 28 20:15:06 XXX kernel: ? __warn+0xab/0x122
May 28 20:15:06 XXX kernel: ? report_bug+0x109/0x17e
May 28 20:15:06 XXX kernel: ? __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
May 28 20:15:06 XXX kernel: ? handle_bug+0x41/0x6f
May 28 20:15:06 XXX kernel: ? exc_invalid_op+0x13/0x60
May 28 20:15:06 XXX kernel: ? asm_exc_invalid_op+0x16/0x20
May 28 20:15:06 XXX kernel: ? __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
May 28 20:15:06 XXX kernel: ? __nf_conntrack_confirm+0x9e/0x2b0 [nf_conntrack]
May 28 20:15:06 XXX kernel: ? nf_nat_inet_fn+0x60/0x1a8 [nf_nat]
May 28 20:15:06 XXX kernel: nf_conntrack_confirm+0x25/0x54 [nf_conntrack]
May 28 20:15:06 XXX kernel: nf_hook_slow+0x3d/0x96
May 28 20:15:06 XXX kernel: ? ip_protocol_deliver_rcu+0x164/0x164
May 28 20:15:06 XXX kernel: NF_HOOK.constprop.0+0x79/0xd9
May 28 20:15:06 XXX kernel: ? ip_protocol_deliver_rcu+0x164/0x164
May 28 20:15:06 XXX kernel: __netif_receive_skb_one_core+0x77/0x9c
May 28 20:15:06 XXX kernel: process_backlog+0x8c/0x116
May 28 20:15:06 XXX kernel: __napi_poll.constprop.0+0x2b/0x124
May 28 20:15:06 XXX kernel: net_rx_action+0x159/0x24f
May 28 20:15:06 XXX kernel: __do_softirq+0x129/0x288
May 28 20:15:06 XXX kernel: do_softirq+0x7f/0xab
May 28 20:15:06 XXX kernel: </IRQ>
May 28 20:15:06 XXX kernel: <TASK>
May 28 20:15:06 XXX kernel: __local_bh_enable_ip+0x4c/0x6b
May 28 20:15:06 XXX kernel: netif_rx+0x52/0x5a
May 28 20:15:06 XXX kernel: macvlan_broadcast+0x10a/0x150 [macvlan]
May 28 20:15:06 XXX kernel: macvlan_process_broadcast+0xbc/0x12f [macvlan]
May 28 20:15:06 XXX kernel: process_one_work+0x1ab/0x295
May 28 20:15:06 XXX kernel: worker_thread+0x18b/0x244
May 28 20:15:06 XXX kernel: ? rescuer_thread+0x281/0x281
May 28 20:15:06 XXX kernel: kthread+0xe7/0xef
May 28 20:15:06 XXX kernel: ? kthread_complete_and_exit+0x1b/0x1b
May 28 20:15:06 XXX kernel: ret_from_fork+0x22/0x30
May 28 20:15:06 XXX kernel: </TASK>
May 28 20:15:06 XXX kernel: ---[ end trace 0000000000000000 ]---
May 28 20:45:32 XXX emhttpd: read SMART /dev/sde
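For anyone who wants to check their own syslog for the same symptoms, a filter along these lines may help. It is a rough sketch: `find_macvlan_hits` is a hypothetical helper name, and the two patterns are simply taken from the log lines quoted in this thread (the `macvlan_broadcast` / `macvlan_process_broadcast` frames and the Fix Common Problems message), so other kernel versions may word things differently.

```shell
#!/bin/sh
# Hypothetical helper: reads syslog text on stdin and prints only the lines
# that match the macvlan symptoms quoted in this thread.
find_macvlan_hits() {
    grep -E 'macvlan_(process_)?broadcast|Macvlan and Bridging found'
}
```

On a running server this could be used as `find_macvlan_hits < /var/log/syslog`, assuming the syslog lives at that path; with a diagnostics zip, point it at the extracted syslog file instead.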
May 29, 2024 Author

setterm -blank 0

This is an attempt to keep the screen from turning off (a blanked screen might be a problem when accessing the server via BMC/KVM). There are several posts about that command in the forum, e.g. https://forums.unraid.net/topic/104711-solved-looking-for-the-command-to-disable-screen-blanking/?do=findComment&comment=986789
May 29, 2024 Author

Quote:
You also have bond and bridge on (this tells me default Docker is ipvlan, as this has caused problems before). You also have multiple NICs, VLANs, and a custom Docker network. We may need to know more about what is connected to what in order to diagnose.

I see. The Docker containers have been around for a long time, and there might be some legacy settings (or "recommended" settings from old tutorials I followed back then). I will look into that. I am not sure about all the reasoning or which tutorials I used, but this one looks close to what I have/had (Emby + SWAG in a custom network): https://forums.unraid.net/topic/101969-guide-setup-emby-with-hw-transcoding-on-unraid-remote-access-through-reverse-proxy-using-swag/

The server only uses one of its NICs (the others are never connected), and there is no VLAN tagging on the server side (it is only done internally by the router on a per-port basis).

The router runs OpenWrt, which allowed me to easily move devices (per port) from one VLAN to another and to capture tcpdumps per VLAN.

I have several tcpdumps. I feel uncomfortable posting them publicly, but I am happy to provide them temporarily via PM, if that might help and is OK with you.
May 29, 2024 Author

> docker network ls
NETWORK ID     NAME       DRIVER    SCOPE
69d9ba12e3ba   br0        macvlan   local
b3c086cd02ba   bridge     bridge    local
ef0efa310ce6   cdocknet   bridge    local
9f5a5e61419c   host       host      local
d85a3c71fba0   none       null      local

> docker network inspect br0 bridge cdocknet host none
[
{ "Name": "br0", "Id": "xxx", "Created": "2024-05-28T12:30:30.178193348+02:00", "Scope": "local", "Driver": "macvlan", "EnableIPv6": false, "IPAM": { "Driver": "default", "Options": {}, "Config": [ { "Subnet": "192.168.0.0/24", "Gateway": "192.168.0.1", "AuxiliaryAddresses": { "server": "192.168.0.200" } } ] }, "Internal": false, "Attachable": false, "Ingress": false, "ConfigFrom": { "Network": "" }, "ConfigOnly": false, "Containers": { "xxx": { "Name": "xxx", "EndpointID": "xxx", "MacAddress": "02:42:c0:a8:00:02", "IPv4Address": "192.168.0.2/24", "IPv6Address": "" } }, "Options": { "parent": "br0" }, "Labels": {} },
{ "Name": "bridge", "Id": "xxx", "Created": "2024-05-28T12:30:27.200004702+02:00", "Scope": "local", "Driver": "bridge", "EnableIPv6": false, "IPAM": { "Driver": "default", "Options": null, "Config": [ { "Subnet": "172.17.0.0/16", "Gateway": "172.17.0.1" } ] }, "Internal": false, "Attachable": false, "Ingress": false, "ConfigFrom": { "Network": "" }, "ConfigOnly": false, "Containers": { "xxx": { "Name": "xxx", "EndpointID": "xxx", "MacAddress": "02:42:ac:11:00:03", "IPv4Address": "172.17.0.3/16", "IPv6Address": "" }, "xxx": { "Name": "xxx", "EndpointID": "xxx", "MacAddress": "02:42:ac:11:00:02", "IPv4Address": "172.17.0.2/16", "IPv6Address": "" } }, "Options": { "com.docker.network.bridge.default_bridge": "true", "com.docker.network.bridge.enable_icc": "true", "com.docker.network.bridge.enable_ip_masquerade": "true", "com.docker.network.bridge.host_binding_ipv4": "0.0.0.0", "com.docker.network.bridge.name": "docker0", "com.docker.network.driver.mtu": "1500" }, "Labels": {} },
{ "Name": "cdocknet", "Id": "xxx", "Created": "2021-03-03T15:58:08.470260897+01:00", "Scope": "local", "Driver": "bridge", "EnableIPv6": false, "IPAM": { "Driver": "default", "Options": {}, "Config": [ { "Subnet": "172.18.0.0/16", "Gateway": "172.18.0.1" } ] }, "Internal": false, "Attachable": false, "Ingress": false, "ConfigFrom": { "Network": "" }, "ConfigOnly": false, "Containers": { "xxx": { "Name": "xxx", "EndpointID": "xxx", "MacAddress": "02:42:ac:12:00:07", "IPv4Address": "172.18.0.7/16", "IPv6Address": "" }, "xxx": { "Name": "xxx", "EndpointID": "xxx", "MacAddress": "02:42:ac:12:00:02", "IPv4Address": "172.18.0.2/16", "IPv6Address": "" }, "xxx": { "Name": "xxx", "EndpointID": "xxx", "MacAddress": "02:42:ac:12:00:05", "IPv4Address": "172.18.0.5/16", "IPv6Address": "" }, "xxx": { "Name": "xxx", "EndpointID": "xxx", "MacAddress": "02:42:ac:12:00:03", "IPv4Address": "172.18.0.3/16", "IPv6Address": "" }, "xxx": { "Name": "xxx", "EndpointID": "xxx", "MacAddress": "02:42:ac:12:00:08", "IPv4Address": "172.18.0.8/16", "IPv6Address": "" }, "xxx": { "Name": "xxx", "EndpointID": "xxx", "MacAddress": "02:42:ac:12:00:06", "IPv4Address": "172.18.0.6/16", "IPv6Address": "" }, "xxx": { "Name": "xxx", "EndpointID": "xxx", "MacAddress": "02:42:ac:12:00:04", "IPv4Address": "172.18.0.4/16", "IPv6Address": "" } }, "Options": {}, "Labels": {} },
{ "Name": "host", "Id": "xxx", "Created": "2021-03-03T14:41:07.112591038+01:00", "Scope": "local", "Driver": "host", "EnableIPv6": false, "IPAM": { "Driver": "default", "Options": null, "Config": [] }, "Internal": false, "Attachable": false, "Ingress": false, "ConfigFrom": { "Network": "" }, "ConfigOnly": false, "Containers": {}, "Options": {}, "Labels": {} },
{ "Name": "none", "Id": "xxx", "Created": "2021-03-03T14:41:06.621475741+01:00", "Scope": "local", "Driver": "null", "EnableIPv6": false, "IPAM": { "Driver": "default", "Options": null, "Config": [] }, "Internal": false, "Attachable": false, "Ingress": false, "ConfigFrom": { "Network": "" }, "ConfigOnly": false, "Containers": {}, "Options": {}, "Labels": {} }
]

Is the problem that there are networks with different drivers (macvlan and bridge) at the same time? Or that there are any bridges at all while "Docker custom network type" is set to "macvlan"?

My most important (and reverse-proxied) containers are in cdocknet. I could probably move all other containers there and delete the other networks. Would that help? (I also read about bonding, and since I will not use it, I could deactivate it to reduce complexity.)

I actually only access my Docker containers via exposed ports on the server IP (or via the reverse proxy), i.e. as far as I understand it, I currently rely on neither macvlan nor ipvlan, right?
May 29, 2024 Community Expert

8 hours ago, n00b42 said:
setterm -blank 0 This is an attempt to keep the screen from turning off (a blanked screen might be a problem when accessing the server via BMC/KVM). There are several posts about that command in the forum, e.g. https://forums.unraid.net/topic/104711-solved-looking-for-the-command-to-disable-screen-blanking/?do=findComment&comment=986789

I'm not sure I can help with the screen turning off. We need to fix the other issue first.

Regarding the Docker networks: the wrong setup is implemented, or it has been misunderstood. Thank you for the additional information, such as the docker inspect output.

Quote:
"Driver": "macvlan",

Bridge uses the bridge driver. There is a difference. Your current networking setup is flawed and needs to be fixed.

I would recommend you go to Settings > Docker and change "Docker custom network type" to ipvlan. This will remove the trace bug and the system freeze.
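To see which Docker networks still use the macvlan driver, before or after changing the setting, the `docker network ls` table can be filtered. The following is a small sketch: `macvlan_networks` is a hypothetical helper name, and it assumes the default four-column output (NETWORK ID, NAME, DRIVER, SCOPE) shown earlier in the thread.

```shell
#!/bin/sh
# Hypothetical helper: reads `docker network ls` output on stdin and prints
# the names of all networks whose DRIVER column is "macvlan".
# Assumes the default column layout: NETWORK ID, NAME, DRIVER, SCOPE.
macvlan_networks() {
    # NR > 1 skips the header row; $2 is the name, $3 is the driver.
    awk 'NR > 1 && $3 == "macvlan" { print $2 }'
}
```

Usage would be `docker network ls | macvlan_networks`. With the output posted above it prints `br0`; once no network uses macvlan it prints nothing.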
June 4, 2024 Author

For now, I changed "Settings > Docker > Docker custom network type" to "ipvlan" as suggested. Furthermore, I removed the "br0" network and moved the corresponding container to the default bridge network for the time being.
July 4, 2024 Author

Just as a note: even though it does not have to mean anything, a month has passed without problems. We will see.