Astatine, posted March 13 (edited March 13):

**DIAGNOSTICS REPORT ATTACHED**

For the past few days, my unRAID server has been having severe network connection issues. I hadn't made any changes to the network settings. I went through multiple articles online about DNS issues with unRAID. I've tried using all of the following DNS addresses:

1.1.1.1
8.8.8.8
8.8.4.4
<my gateway/router ip>
208.67.222.222 & 208.67.220.220

But every time I finish changing the DNS, the server works fine for a few minutes, i.e., it is able to connect to the internet. However, after a few minutes, it loses connection. I am unable to load the CA tab, and I tried pinging both unraid.net & google.com and they were unreachable. This has gotten very frustrating.

The following part may be completely unrelated to the above, but I'll still leave it here just in case. In the days prior to having the DNS issue with the unRAID server itself, I had DNS issues with the OpenVPN containers that I have running on Docker. I would have to restart the server to get access to the outside world, and after running for a few minutes the containers would lose access to the internet. I am no expert whatsoever, but I speculate that somehow, maybe due to a corrupt file or something, the DNS issue from the OpenVPN container has traveled to the unRAID server itself (I know it sounds absurd).

Please help me out here. I was finally really happy with my home server setup and want to go back to that phase.

astatine-diagnostics-20230312-2035.zip

MAM59, posted March 13:

You do not have a DNS issue, you have a routing problem. The gateway at 192.168.1.1 is unable to forward your packets. Line gone? Wrong gateway? Or you have serious problems with wakeup after sleep mode of your server ("ntp:no peer for too long, server running free now"). Too much power saving, maybe.

jace5869, posted March 13:

I'm having almost the exact same issue. I keep thinking it's my Realtek NICs. This totally disrupts Docker containers and downloaders. I'm not sure how it could be routing when I'm using a simple fritzbox.

Astatine, posted March 13 (edited March 13):

3 hours ago, MAM59 said:
You do not have a DNS issue, you have a routing problem. The gateway at 192.168.1.1 is unable to forward your packets. Line gone? Wrong gateway? Or you have serious problems with wakeup after sleep mode of your server ("ntp:no peer for too long, server running free now"). Too much power saving, maybe.

I don't think it is a routing issue, because the rest of the devices in my home are working just fine. I made no changes to the router/modem. I have a separate machine running Home Assistant and it works great. I am able to control it remotely as expected. It's only this unRAID machine that has been facing connection issues. Besides, can you mention which specific setting in the router you want me to check, just so I can make sure it is indeed not a routing issue?

Also, here are the network settings from my unRAID instance.

Astatine, posted March 13:

I went through the logs and, I don't know why, but the following entries seem out of place to me. Maybe it's just that I don't understand what they are for.

Mar 12 18:22:08 astatine avahi-daemon[3663]: Joining mDNS multicast group on interface veth44611d6.IPv6 with address fe80::c026:5fff:feaf:d7c6.
Mar 12 18:22:08 astatine avahi-daemon[3663]: New relevant interface veth44611d6.IPv6 for mDNS.
Mar 12 18:22:08 astatine avahi-daemon[3663]: Registering new address record for fe80::c026:5fff:feaf:d7c6 on veth44611d6.*.
Mar 12 18:22:08 astatine kernel: docker0: port 7(vethb2fb7ee) entered blocking state
Mar 12 18:22:08 astatine kernel: docker0: port 7(vethb2fb7ee) entered disabled state
Mar 12 18:22:08 astatine kernel: device vethb2fb7ee entered promiscuous mode
Mar 12 18:22:08 astatine kernel: docker0: port 7(vethb2fb7ee) entered blocking state
Mar 12 18:22:08 astatine kernel: docker0: port 7(vethb2fb7ee) entered forwarding state
Mar 12 18:22:08 astatine kernel: eth0: renamed from veth19a3a37
Mar 12 18:22:08 astatine kernel: IPv6: ADDRCONF(NETDEV_CHANGE): vethb2fb7ee: link becomes ready
Mar 12 18:22:08 astatine rc.docker: linkding: started succesfully!
Mar 12 18:22:09 astatine avahi-daemon[3663]: Joining mDNS multicast group on interface vethbadb898.IPv6 with address fe80::4cd6:26ff:fe36:9eeb.
Mar 12 18:22:09 astatine avahi-daemon[3663]: New relevant interface vethbadb898.IPv6 for mDNS.
Mar 12 18:22:09 astatine avahi-daemon[3663]: Registering new address record for fe80::4cd6:26ff:fe36:9eeb on vethbadb898.*.
Mar 12 18:22:09 astatine avahi-daemon[3663]: Joining mDNS multicast group on interface vethb2fb7ee.IPv6 with address fe80::8835:e6ff:fe26:d1ac.
Mar 12 18:22:09 astatine avahi-daemon[3663]: New relevant interface vethb2fb7ee.IPv6 for mDNS.
Mar 12 18:22:09 astatine avahi-daemon[3663]: Registering new address record for fe80::8835:e6ff:fe26:d1ac on vethb2fb7ee.*.
Mar 12 18:22:10 astatine avahi-daemon[3663]: Joining mDNS multicast group on interface veth02790f8.IPv6 with address fe80::28b2:6ff:fe74:20c5.
Mar 12 18:22:10 astatine avahi-daemon[3663]: New relevant interface veth02790f8.IPv6 for mDNS.
Mar 12 18:22:10 astatine avahi-daemon[3663]: Registering new address record for fe80::28b2:6ff:fe74:20c5 on veth02790f8.*.

trott, posted March 13:

Try deleting network.cfg and setting up the network again. But why do you have bonding enabled if you only have one NIC?

Astatine, posted March 13:

I haven't touched the network settings. They are as they came with the install. I only changed the DNS while trying to fix my issue. Bonding is enabled out of the box, with 'active-backup' as the default bonding mode. Anyway, if I delete network.cfg, do I have to reboot the server to get an empty/default network.cfg? And will removing network.cfg mess with the container networks?

jace5869, posted March 13:

26 minutes ago, Astatine said:
I went through the logs and I don't know why but the following logs seem out of place to me. Maybe it's just that I don't understand what they are for. [avahi-daemon and docker0 log excerpt, Mar 12 18:22:08 to 18:22:10, as given in full in Astatine's post above]

I see pretty much the same logs. I think it's the Docker/network cleanup after powering containers on and off... not sure though.

Astatine, posted March 13:

So, I deleted network.cfg, rebooted, and set up the network. Everything was working just fine for about 10 minutes and then the unRAID server again lost its connection to the internet. However, this time a new error message showed up in the logs:

Mar 13 10:47:56 astatine ntpd[1172]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized

I then checked the time in Settings > Date and Time and it looks okay to me. It's the correct timezone and datetime. Is there a way to reset this?

MAM59, posted March 13:

No, you have got me wrong. I just listed POSSIBLE reasons for this kind of symptom. And your suspect, "DNS", is not among them. Since your logs do not show ANY kind of real error (the "unsync clock" is normal after a reboot; ntp needs a lot of samples before it trusts the incoming data), there is no clue where to search. So I end up with the basics: check your cables, switch and LAN card. And, as somebody above already noted: get rid of that bonding stuff!

Astatine, posted March 13:

@MAM59 Okay, I got rid of the bonding stuff. But the issue didn't go away. After changing the network settings, everything works fine for about 10-15 minutes and then it's back to square one. When I ping anything, I get 'Destination Host Unreachable'. I've attached the new diagnostics to this post, as the bonding settings were changed.

astatine-diagnostics-20230313-1204.zip

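A 'Destination Host Unreachable' reply generally points at packet delivery (the local link or the next hop) rather than at name resolution, since a DNS failure would normally stop ping before it sends anything. One way to narrow this down is to test the layers in order: the gateway, then a public IP address, then a hostname. The following is a minimal Python sketch of that idea, not a definitive tool. It assumes Python 3 and a Linux-style `ping` that accepts `-c` and `-W` (a stock Unraid install may not include Python), and all function names are hypothetical. The gateway 192.168.1.1, the address 1.1.1.1, and unraid.net are taken from the posts above.

```python
import socket
import subprocess


def ping_ok(host: str, timeout_s: int = 2) -> bool:
    """Send one ICMP echo and report whether a reply came back.

    Assumes a Linux-style ping: -c is the packet count, -W the reply timeout.
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def resolves(name: str) -> bool:
    """Report whether the system resolver can turn the name into an address."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except socket.gaierror:
        return False


def classify(gateway_ok: bool, public_ip_ok: bool, dns_ok: bool) -> str:
    """Map the three check results to the most likely problem layer."""
    if public_ip_ok:
        # Traffic leaves the LAN, so only name resolution can still be broken.
        return "ok" if dns_ok else "dns: IP traffic works but names do not resolve"
    if gateway_ok:
        return "routing: gateway answers but traffic beyond it fails"
    return "local: gateway unreachable (check cable, switch, LAN card)"


def run_checks(gateway: str, public_ip: str, name: str) -> str:
    """Run the three checks against the live network and classify the result."""
    return classify(ping_ok(gateway), ping_ok(public_ip), resolves(name))
```

Calling `run_checks("192.168.1.1", "1.1.1.1", "unraid.net")` while the outage is in progress would indicate which layer fails first. A result starting with "local" would be consistent with the advice to check cables, switch and LAN card.
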
Astatine, posted March 13:

@trurl @JorgeB Guys, a little help here, please!

MAM59, posted March 13:

Your logs are not very helpful. You should go to your server, reboot, log on to the console (not the GUI), start a "ping -t somebodyYouKnow" and wait for the unreachable. Then cancel the ping and immediately run "diagnostics" to capture this state. This should show only the relevant data and not contain tons and tons of wrong tries.

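One caveat on the command above: with the Linux `ping` found on an Unraid console, `-t` sets the packet TTL and expects a number, whereas the "run until cancelled" meaning of `-t` is the Windows behaviour. Linux `ping` already runs continuously by default, so plain `ping somebodyYouKnow` should do what is intended. The watching step can also be automated. Below is a minimal Python sketch, assuming Python 3 and a Linux-style `ping` are available (which may not be true on a stock Unraid install); the function names and the three-failures threshold are hypothetical choices, not part of the original advice.

```python
import subprocess
import time
from datetime import datetime
from typing import Callable, Optional


def ping_once(host: str) -> bool:
    """One ICMP echo with a 2 second reply timeout (Linux-style flags)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wait_for_outage(
    probe: Callable[[], bool],
    interval_s: float = 1.0,
    fails_needed: int = 3,
    max_probes: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[datetime]:
    """Poll until the probe fails several times in a row.

    Returns the time of the first failure in that run, or None if
    max_probes is reached without a sustained outage.
    """
    consecutive = 0
    first_fail: Optional[datetime] = None
    probes = 0
    while max_probes is None or probes < max_probes:
        probes += 1
        if probe():
            # A single success resets the run of failures.
            consecutive = 0
            first_fail = None
        else:
            if consecutive == 0:
                first_fail = datetime.now()
            consecutive += 1
            if consecutive >= fails_needed:
                return first_fail
        sleep(interval_s)
    return None
```

For example, `wait_for_outage(lambda: ping_once("192.168.1.1"))` blocks until the gateway stops answering and returns the moment the outage began. That is the point at which to run `diagnostics` on the console, and the returned time gives a timestamp to look for in the syslog.
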
Astatine, posted March 13:

17 minutes ago, MAM59 said:
Your logs are not very helpful. You should go to your server, reboot, log on to the console (not the GUI), start a "ping -t somebodyYouKnow" and wait for the unreachable. Then cancel the ping and immediately run "diagnostics" to capture this state. This should show only the relevant data and not contain tons and tons of wrong tries.

I will do this in the evening. However, I just now noticed a strange thing. As of now I am unable to see the CA tab, no Docker container is able to connect to the internet, and pings are failing. However, the network panel on the dashboard is showing significant network activity. I saw the outbound traffic go up to about 10 Mbps as well. What can explain this?

Astatine, posted March 13:

@JorgeB Here you go.

astatine-diagnostics-20230313-1304.zip

jace5869, posted March 13:

Closely following. I will try some of this. Got two 10G Marvell NICs coming to try.

JorgeB, posted March 13:

Everything looks normal in the diags; it looks more like a LAN problem. Try changing your DNS server to 208.67.222.222 and 208.67.220.220 instead of using your router.

jace5869, posted March 13:

It seems strange that we are both having the same issue and that it would be the LAN.

Astatine, posted March 13:

3 hours ago, JorgeB said:
Everything looks normal in the diags; it looks more like a LAN problem. Try changing your DNS server to 208.67.222.222 and 208.67.220.220 instead of using your router.

Already done that. No luck. It works fine for 10-15 minutes after saving the changes and then it's back to square one. No connection. Pings failing.

Mr_Jay84, posted March 13:

What version of Unraid are you guys running?

Astatine, posted March 13:

4 minutes ago, Mr_Jay84 said:
What version of Unraid are you guys running?

v6.11.5

itimpi, posted March 13:

11 hours ago, jace5869 said:
I'm not sure how it could be routing when I'm using a simple fritzbox

My upstairs neighbour had exactly this problem and it turned out to be his Fritz box (he is not an Unraid user)!

Astatine, posted March 14:

15 hours ago, itimpi said:
My upstairs neighbour had exactly this problem and it turned out to be his Fritz box (he is not an Unraid user)!

I am not using a custom router/modem. I have a Bell connection and use the router/modem provided by them. It's a Bell Home Hub 4000.

Astatine, posted March 14 (edited March 14):

@JorgeB @itimpi So, I gave up on the server last night and left it on its own. It turns out that a couple of hours after yesterday's network reset, the server's behavior changed. Now the server is connecting on and off, and I am getting new errors in the log(s). I have attached the latest diagnostics to this post. Here's a snippet of the new error(s) that I am seeing.

Mar 13 11:52:22 astatine avahi-daemon[23598]: Joining mDNS multicast group on interface vethf5a568d.IPv6 with address fe80::c86c:29ff:fead:e306.
Mar 13 11:52:22 astatine avahi-daemon[23598]: New relevant interface vethf5a568d.IPv6 for mDNS.
Mar 13 11:52:22 astatine avahi-daemon[23598]: Registering new address record for fe80::c86c:29ff:fead:e306 on vethf5a568d.*.
Mar 13 11:52:22 astatine avahi-daemon[23598]: Joining mDNS multicast group on interface veth25e0c9f.IPv6 with address fe80::44e7:81ff:fe1d:5a72.
Mar 13 11:52:22 astatine avahi-daemon[23598]: New relevant interface veth25e0c9f.IPv6 for mDNS.
Mar 13 11:52:22 astatine avahi-daemon[23598]: Registering new address record for fe80::44e7:81ff:fe1d:5a72 on veth25e0c9f.*.
Mar 13 11:52:23 astatine avahi-daemon[23598]: Joining mDNS multicast group on interface veth21f1006.IPv6 with address fe80::307a:a5ff:fe6a:307e.
Mar 13 11:52:23 astatine avahi-daemon[23598]: New relevant interface veth21f1006.IPv6 for mDNS.
Mar 13 11:52:23 astatine avahi-daemon[23598]: Registering new address record for fe80::307a:a5ff:fe6a:307e on veth21f1006.*.
Mar 13 11:52:23 astatine avahi-daemon[23598]: Joining mDNS multicast group on interface veth3e1bea2.IPv6 with address fe80::409b:19ff:febd:db3e.
Mar 13 11:52:23 astatine avahi-daemon[23598]: New relevant interface veth3e1bea2.IPv6 for mDNS.
Mar 13 11:52:23 astatine avahi-daemon[23598]: Registering new address record for fe80::409b:19ff:febd:db3e on veth3e1bea2.*.
Mar 13 17:25:47 astatine nginx: 2023/03/13 17:25:47 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 13 17:25:47 astatine nginx: 2023/03/13 17:25:47 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 13 17:25:47 astatine nginx: 2023/03/13 17:25:47 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 13 17:42:32 astatine ntpd[7970]: no peer for too long, server running free now
Mar 13 18:17:31 astatine ntpd[7970]: no peer for too long, server running free now
Mar 13 18:24:29 astatine webGUI: Successful login user root from <tailscale ip of my phone>

Check out the log at timestamp: Mar 13 20:07:29

Mar 13 18:27:07 astatine avahi-daemon[23598]: Joining mDNS multicast group on interface veth641d21e.IPv6 with address fe80::78a8:91ff:fee9:b4be.
Mar 13 18:27:07 astatine avahi-daemon[23598]: New relevant interface veth641d21e.IPv6 for mDNS.
Mar 13 18:27:07 astatine avahi-daemon[23598]: Registering new address record for fe80::78a8:91ff:fee9:b4be on veth641d21e.*.
Mar 13 20:07:29 astatine ntpd[7970]: receive: Unexpected origin timestamp 0xe7ba3941.9078c3b0 does not match aorg 0000000000.00000000 from [email protected].239.35.0 xmt 0xe7ba3940.e57b3745
Mar 14 00:59:48 astatine nginx: 2023/03/14 00:59:48 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 00:59:48 astatine nginx: 2023/03/14 00:59:48 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 00:59:48 astatine nginx: 2023/03/14 00:59:48 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 02:00:01 astatine root: mover: started
Mar 14 02:00:03 astatine root: mover: finished
Mar 14 02:22:58 astatine nginx: 2023/03/14 02:22:58 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 03:53:35 astatine nginx: 2023/03/14 03:53:35 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 03:53:35 astatine nginx: 2023/03/14 03:53:35 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 03:53:35 astatine nginx: 2023/03/14 03:53:35 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 04:23:08 astatine ntpd[7970]: no peer for too long, server running free now
Mar 14 05:29:33 astatine nginx: 2023/03/14 05:29:33 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 05:29:33 astatine nginx: 2023/03/14 05:29:33 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 05:29:33 astatine nginx: 2023/03/14 05:29:33 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 06:18:35 astatine nginx: 2023/03/14 06:18:35 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 07:59:39 astatine ntpd[7970]: no peer for too long, server running free now
Mar 14 09:02:32 astatine ntpd[7970]: no peer for too long, server running free now
Mar 14 09:31:20 astatine ntpd[7970]: no peer for too long, server running free now
Mar 14 10:23:03 astatine nginx: 2023/03/14 10:23:03 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 10:23:03 astatine nginx: 2023/03/14 10:23:03 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 10:23:03 astatine nginx: 2023/03/14 10:23:03 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 10:23:03 astatine nginx: 2023/03/14 10:23:03 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 10:23:03 astatine nginx: 2023/03/14 10:23:03 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 10:23:03 astatine nginx: 2023/03/14 10:23:03 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 10:23:04 astatine nginx: 2023/03/14 10:23:04 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 10:23:04 astatine nginx: 2023/03/14 10:23:04 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 10:23:04 astatine nginx: 2023/03/14 10:23:04 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 10:23:05 astatine nginx: 2023/03/14 10:23:05 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 10:23:06 astatine nginx: 2023/03/14 10:23:06 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 10:23:07 astatine nginx: 2023/03/14 10:23:07 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.

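In an excerpt like this, the repeated nchan errors tend to bury the ntpd "no peer for too long" entries, which mark the moments the server had been out of contact with its time servers for a while. A minimal Python sketch for pulling those timestamps out of a saved syslog follows. It assumes the "Mon DD HH:MM:SS host process[pid]: message" layout seen in the excerpt, with one entry per line, and the function names are hypothetical.

```python
import re
from collections import Counter
from typing import List

# Matches lines such as:
#   Mar 13 17:42:32 astatine ntpd[7970]: no peer for too long, ...
#   Mar 14 02:00:01 astatine root: mover: started
LINE_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<proc>[^\s:\[]+)(?:\[\d+\])?: "
    r"(?P<msg>.*)$"
)


def count_by_process(log_text: str) -> Counter:
    """Count log entries per process name, ignoring lines that do not parse."""
    counts: Counter = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group("proc")] += 1
    return counts


def ntp_peer_losses(log_text: str) -> List[str]:
    """Return the timestamps of ntpd 'no peer for too long' entries, in file order."""
    stamps = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if (
            match
            and match.group("proc") == "ntpd"
            and "no peer for too long" in match.group("msg")
        ):
            stamps.append(match.group("ts"))
    return stamps
```

Reading the syslog from the diagnostics zip into a string and passing it to `ntp_peer_losses` gives a list of times that can be compared against the moments when pings started failing.
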