unRAID server having severe DNS issues


Go to solution Solved by autumnwalker,


**DIAGNOSTICS REPORT ATTACHED**

For the past few days, my unRAID server has been having severe network connection issues. I haven't made any changes to the network settings. I went through multiple articles online about DNS issues with unRAID, and I've tried all of the following DNS addresses:

  • 1.1.1.1
  • 8.8.8.8
  • 8.8.4.4
  • <my gateway/router ip>
  • 208.67.222.222 & 208.67.220.220

But every time I change the DNS, the server works fine for a few minutes, i.e., it is able to connect to the internet. After a few minutes, however, it loses the connection: I am unable to load the CA tab, and when I tried pinging both unraid.net and google.com, they were unreachable.

[Four screenshots attached, 2023-03-12: the Apps tab, two terminal sessions on astatine, and one untitled capture]

This has gotten very frustrating.

 

The following part can be completely unrelated to the above but I'll still leave it here in case.

 

In the days before the unRAID server itself started having DNS issues, I had DNS issues with the OpenVPN containers that I run on Docker. I would have to restart the server to get access to the outside world, and after running for a few minutes the containers would lose access to the internet. I am no expert whatsoever, but I speculate that somehow, maybe due to a corrupt file or something, the DNS issue from the OpenVPN containers has traveled to the unRAID server itself (I know it sounds absurd).

 

Please help me out here. I was finally really happy with my homeserver setup and want to go back to that phase.

 

astatine-diagnostics-20230312-2035.zip

Edited by Astatine
Link to comment

You do not have a DNS issue; you have a routing problem. The gateway at 192.168.1.1 is unable to forward your packets.

Line gone? Wrong gateway?

Or your server has serious problems waking up from sleep mode ("ntp:no peer for too long, server running free now").

Too much power saving, maybe.
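One way to tell the two cases apart is to probe a literal IP address (no name lookup involved) and a hostname separately. The sketch below is only an illustration under assumptions: the probe target 1.1.1.1 on TCP port 53 and the function names are choices made for this example, not anything taken from the diagnostics.

```python
import socket


def can_reach_ip(ip="1.1.1.1", port=53, timeout=3.0):
    """Try a TCP connection to a literal IP address; no DNS lookup is involved."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def can_resolve(hostname="unraid.net"):
    """Ask the system resolver for the hostname; this needs working DNS."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def classify(ip_ok, dns_ok):
    """Turn the two probe results into a rough, non-authoritative diagnosis."""
    if ip_ok and dns_ok:
        return "connectivity and DNS both work"
    if ip_ok and not dns_ok:
        return "DNS problem: IPs are reachable but names do not resolve"
    if not ip_ok and dns_ok:
        return "names resolve (possibly from a cache) but packets are not routed"
    return "routing or link problem: not even a literal IP is reachable"
```

Running `print(classify(can_reach_ip(), can_resolve()))` on the server while the outage is happening gives a first hint; if not even the literal IP is reachable, changing DNS servers is unlikely to help.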

 

 

Link to comment
3 hours ago, MAM59 said:

You do not have a DNS issue; you have a routing problem. The gateway at 192.168.1.1 is unable to forward your packets.

Line gone? Wrong gateway?

Or your server has serious problems waking up from sleep mode ("ntp:no peer for too long, server running free now").

Too much power saving, maybe.

 

 

I don't think it is a routing issue because the rest of the devices in my home are working just fine. I made no changes to the router/modem. I have a separate machine running Home Assistant and it works great; I am able to control it remotely as expected. It's only this unRAID machine that has been facing connection issues. Besides, can you mention which specific setting in the router you want me to check, just so I can make sure it is indeed not a routing issue?

 

Also, here are the network settings from my unraid instance

image.thumb.png.0fd0a95571a50b61bec60934c725b88b.png

image.thumb.png.960cd49fed025b5e622fe276ad8b29dc.png

Edited by Astatine
Link to comment

I went through the logs and I don't know why but the following logs seem out of place to me. Maybe it's just that I don't understand what they are for.

Mar 12 18:22:08 astatine  avahi-daemon[3663]: Joining mDNS multicast group on interface veth44611d6.IPv6 with address fe80::c026:5fff:feaf:d7c6.
Mar 12 18:22:08 astatine  avahi-daemon[3663]: New relevant interface veth44611d6.IPv6 for mDNS.
Mar 12 18:22:08 astatine  avahi-daemon[3663]: Registering new address record for fe80::c026:5fff:feaf:d7c6 on veth44611d6.*.
Mar 12 18:22:08 astatine kernel: docker0: port 7(vethb2fb7ee) entered blocking state
Mar 12 18:22:08 astatine kernel: docker0: port 7(vethb2fb7ee) entered disabled state
Mar 12 18:22:08 astatine kernel: device vethb2fb7ee entered promiscuous mode
Mar 12 18:22:08 astatine kernel: docker0: port 7(vethb2fb7ee) entered blocking state
Mar 12 18:22:08 astatine kernel: docker0: port 7(vethb2fb7ee) entered forwarding state
Mar 12 18:22:08 astatine kernel: eth0: renamed from veth19a3a37
Mar 12 18:22:08 astatine kernel: IPv6: ADDRCONF(NETDEV_CHANGE): vethb2fb7ee: link becomes ready
Mar 12 18:22:08 astatine rc.docker: linkding: started succesfully!
Mar 12 18:22:09 astatine  avahi-daemon[3663]: Joining mDNS multicast group on interface vethbadb898.IPv6 with address fe80::4cd6:26ff:fe36:9eeb.
Mar 12 18:22:09 astatine  avahi-daemon[3663]: New relevant interface vethbadb898.IPv6 for mDNS.
Mar 12 18:22:09 astatine  avahi-daemon[3663]: Registering new address record for fe80::4cd6:26ff:fe36:9eeb on vethbadb898.*.
Mar 12 18:22:09 astatine  avahi-daemon[3663]: Joining mDNS multicast group on interface vethb2fb7ee.IPv6 with address fe80::8835:e6ff:fe26:d1ac.
Mar 12 18:22:09 astatine  avahi-daemon[3663]: New relevant interface vethb2fb7ee.IPv6 for mDNS.
Mar 12 18:22:09 astatine  avahi-daemon[3663]: Registering new address record for fe80::8835:e6ff:fe26:d1ac on vethb2fb7ee.*.
Mar 12 18:22:10 astatine  avahi-daemon[3663]: Joining mDNS multicast group on interface veth02790f8.IPv6 with address fe80::28b2:6ff:fe74:20c5.
Mar 12 18:22:10 astatine  avahi-daemon[3663]: New relevant interface veth02790f8.IPv6 for mDNS.
Mar 12 18:22:10 astatine  avahi-daemon[3663]: Registering new address record for fe80::28b2:6ff:fe74:20c5 on veth02790f8.*.

 

Link to comment

I haven't touched the network settings. They are as they came with the install. I only changed the DNS while trying to fix my issue. Bonding is enabled out of the box, with 'active-backup' as the default bonding mode.

 

Anyway, if I delete network.cfg, do I have to reboot the server to get an empty/default network.cfg? And will removing network.cfg mess up the container networks?

Link to comment
26 minutes ago, Astatine said:

I went through the logs and I don't know why but the following logs seem out of place to me. Maybe it's just that I don't understand what they are for.

 

I see pretty much the same logs. I think it's the Docker/network cleanup after powering containers on and off... not sure though.
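To make the remaining, less routine entries easier to spot, the container start/stop chatter can be filtered out of the syslog. This is a rough sketch: the patterns are derived only from the lines quoted in this thread, and treating them as harmless is an assumption based on the reply above, not a confirmed fact about unRAID.

```python
import re

# Messages that appear whenever a Docker container starts or stops.
# Treating them as routine is an assumption based on this thread.
ROUTINE_PATTERNS = [
    re.compile(r"avahi-daemon\[\d+\]: (Joining mDNS multicast group"
               r"|New relevant interface|Registering new address record)"),
    re.compile(r"kernel: docker0: port \d+\(veth\w+\) entered \w+ state"),
    re.compile(r"kernel: device veth\w+ entered promiscuous mode"),
    re.compile(r"kernel: eth0: renamed from veth\w+"),
    re.compile(r"kernel: IPv6: ADDRCONF\(NETDEV_CHANGE\): veth\w+: link becomes ready"),
    re.compile(r"rc\.docker: \S+: started"),
]


def is_routine(line):
    """True if the line matches one of the container start/stop patterns."""
    return any(pattern.search(line) for pattern in ROUTINE_PATTERNS)


def unusual_lines(lines):
    """Return the non-empty lines that do not match any routine pattern."""
    return [line for line in lines if line.strip() and not is_routine(line)]
```

Feeding it the syslog from a diagnostics archive, for example `unusual_lines(open("syslog.txt"))` (the file name here is hypothetical), leaves mostly entries such as the ntpd messages, which are the ones worth reading closely.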

Link to comment

So, I deleted network.cfg, rebooted, and set up the network again. Everything was working just fine for about 10 minutes, and then the unRAID server again lost its connection to the internet. However, this time a new error message showed up in the logs:

Mar 13 10:47:56 astatine  ntpd[1172]: kernel reports TIME_ERROR: 0x41: Clock Unsynchronized

I then checked the time in Settings > Date and Time and the time looks okay to me. It's the correct timezone and datetime. 

image.thumb.png.45c579a6ab09bc50b64b1c9be80aee30.png

 

Is there a way to reset this?

Link to comment

No, you have got me wrong. I just listed POSSIBLE reasons for this kind of symptom. And your suspect, "DNS", is not among them.

 

Since your logs do not show ANY kind of real error (the "unsync clock" is normal after a reboot; ntp needs a lot of samples before it trusts the incoming data), there is no clue where to search.

So I end up with the basics: check your cables, switch and LAN card. And, like somebody above already noted: get rid of that bonding stuff!

 

Link to comment

Your logs are not very helpful. You should go to your server, reboot, log on to the console (not the GUI), start a "ping somebodyYouKnow" (on the Linux console ping repeats until cancelled; "-t" is the Windows switch for that) and wait for the unreachable.

Then cancel the ping and immediately run "diagnostics" to capture this state.

This should show only the relevant data and not contain tons and tons of failed attempts.
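The wait-for-the-unreachable step can also be left unattended. The following is a minimal sketch, assuming the Linux `ping` binary with its `-c` (count) and `-W` (reply timeout in seconds) options is on the console's PATH; the function names, the probe interval and the three-failures threshold are arbitrary choices for illustration.

```python
import subprocess
import time
from datetime import datetime


def ping_once(host, timeout_s=2):
    """Send one ICMP echo with the Linux ping binary; True if a reply came back."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wait_for_outage(host, probe=ping_once, interval_s=5,
                    failures_needed=3, sleep=time.sleep):
    """Probe until several probes in a row fail; return when the first of them failed."""
    consecutive = 0
    first_failure = None
    while True:
        if probe(host):
            # A reply resets the streak, so a single lost packet is ignored.
            consecutive = 0
            first_failure = None
        else:
            if consecutive == 0:
                first_failure = datetime.now()
            consecutive += 1
            if consecutive >= failures_needed:
                return first_failure
        sleep(interval_s)
```

Calling `print(wait_for_outage("192.168.1.1"))` blocks until the gateway stops answering and then prints the time of the first missed reply, which is the moment to run "diagnostics". Running a second copy against an external address may show whether the local hop or the path beyond it fails first.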

 

Link to comment
17 minutes ago, MAM59 said:

Your logs are not very helpful. You should go to your server, reboot, log on to the console (not the GUI), start a "ping somebodyYouKnow" (on the Linux console ping repeats until cancelled; "-t" is the Windows switch for that) and wait for the unreachable.

Then cancel the ping and immediately run "diagnostics" to capture this state.

This should show only the relevant data and not contain tons and tons of failed attempts.

 

I will perform this in the evening. However, I just now noticed a strange thing. As of now I am unable to see the CA tab, no Docker container is able to connect to the internet, and pings are failing. Yet the network panel on the dashboard is showing significant network activity.

msedge_0hnDAGmfBN.gif.1e1feb1097d2ceea8265ccac25886b2e.gif

 

I saw the outbound traffic climb to about 10 Mbps as well.

 

What can explain this?
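One possible explanation is that the dashboard also counts local traffic (LAN clients, the web GUI, container-to-container traffic), which keeps flowing even when nothing reaches the internet. A rough way to see which interface carries the traffic is to sample the kernel's per-interface byte counters in `/proc/net/dev` twice and compare. The helper names below are made up for this sketch.

```python
def parse_proc_net_dev(text):
    """Return {interface: (rx_bytes, tx_bytes)} from the contents of /proc/net/dev."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        # Field 0 is received bytes, field 8 is transmitted bytes.
        counters[name.strip()] = (int(fields[0]), int(fields[8]))
    return counters


def rates(before, after, seconds):
    """Per-interface (rx, tx) throughput in bits per second between two snapshots."""
    return {
        name: (
            (after[name][0] - before[name][0]) * 8 / seconds,
            (after[name][1] - before[name][1]) * 8 / seconds,
        )
        for name in after
        if name in before
    }


def read_counters(path="/proc/net/dev"):
    """Take one snapshot of the live counters."""
    with open(path) as handle:
        return parse_proc_net_dev(handle.read())
```

Taking `read_counters()` twice, ten seconds apart, and passing both snapshots to `rates(first, second, 10)` shows whether the activity sits on the physical or bonded interface or only on `docker0` and the `veth` devices.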

Link to comment
3 hours ago, JorgeB said:

Everything looks normal in the diags; it looks more like a LAN problem. Try changing your DNS server to 208.67.222.222 and 208.67.220.220 instead of using your router.

 

 

Already done that. No luck. It works fine for 10-15 minutes after saving the changes and then back to square one. No connection. Pings failing.

Link to comment
15 hours ago, itimpi said:

My upstairs neighbour had exactly this problem and it turned out to be his Fritz box (he is not an Unraid user)!

I am not using a custom router/modem. I have a Bell connection and use the router/modem provided by them. It's a Bell Home Hub 4000.

Link to comment

 

@JorgeB @itimpi So, I gave up on the server last night and left it on its own. It turns out that a couple of hours after yesterday's network reset, the server's behavior changed. Now the server is connecting on and off, and I am getting new errors in the logs. I have attached the latest diagnostics to this post. Here's a snippet of the new errors that I am seeing.

Mar 13 11:52:22 astatine  avahi-daemon[23598]: Joining mDNS multicast group on interface vethf5a568d.IPv6 with address fe80::c86c:29ff:fead:e306.
Mar 13 11:52:22 astatine  avahi-daemon[23598]: New relevant interface vethf5a568d.IPv6 for mDNS.
Mar 13 11:52:22 astatine  avahi-daemon[23598]: Registering new address record for fe80::c86c:29ff:fead:e306 on vethf5a568d.*.
Mar 13 11:52:22 astatine  avahi-daemon[23598]: Joining mDNS multicast group on interface veth25e0c9f.IPv6 with address fe80::44e7:81ff:fe1d:5a72.
Mar 13 11:52:22 astatine  avahi-daemon[23598]: New relevant interface veth25e0c9f.IPv6 for mDNS.
Mar 13 11:52:22 astatine  avahi-daemon[23598]: Registering new address record for fe80::44e7:81ff:fe1d:5a72 on veth25e0c9f.*.
Mar 13 11:52:23 astatine  avahi-daemon[23598]: Joining mDNS multicast group on interface veth21f1006.IPv6 with address fe80::307a:a5ff:fe6a:307e.
Mar 13 11:52:23 astatine  avahi-daemon[23598]: New relevant interface veth21f1006.IPv6 for mDNS.
Mar 13 11:52:23 astatine  avahi-daemon[23598]: Registering new address record for fe80::307a:a5ff:fe6a:307e on veth21f1006.*.
Mar 13 11:52:23 astatine  avahi-daemon[23598]: Joining mDNS multicast group on interface veth3e1bea2.IPv6 with address fe80::409b:19ff:febd:db3e.
Mar 13 11:52:23 astatine  avahi-daemon[23598]: New relevant interface veth3e1bea2.IPv6 for mDNS.
Mar 13 11:52:23 astatine  avahi-daemon[23598]: Registering new address record for fe80::409b:19ff:febd:db3e on veth3e1bea2.*.
Mar 13 17:25:47 astatine nginx: 2023/03/13 17:25:47 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 13 17:25:47 astatine nginx: 2023/03/13 17:25:47 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 13 17:25:47 astatine nginx: 2023/03/13 17:25:47 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 13 17:42:32 astatine  ntpd[7970]: no peer for too long, server running free now
Mar 13 18:17:31 astatine  ntpd[7970]: no peer for too long, server running free now
Mar 13 18:24:29 astatine webGUI: Successful login user root from <tailscale ip of my phone>

Check out the log entry at timestamp Mar 13 20:07:29:

Mar 13 18:27:07 astatine  avahi-daemon[23598]: Joining mDNS multicast group on interface veth641d21e.IPv6 with address fe80::78a8:91ff:fee9:b4be.
Mar 13 18:27:07 astatine  avahi-daemon[23598]: New relevant interface veth641d21e.IPv6 for mDNS.
Mar 13 18:27:07 astatine  avahi-daemon[23598]: Registering new address record for fe80::78a8:91ff:fee9:b4be on veth641d21e.*.
Mar 13 20:07:29 astatine  ntpd[7970]: receive: Unexpected origin timestamp 0xe7ba3941.9078c3b0 does not match aorg 0000000000.00000000 from server@216.239.35.0 xmt 0xe7ba3940.e57b3745
Mar 14 00:59:48 astatine nginx: 2023/03/14 00:59:48 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 00:59:48 astatine nginx: 2023/03/14 00:59:48 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 00:59:48 astatine nginx: 2023/03/14 00:59:48 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 02:00:01 astatine root: mover: started
Mar 14 02:00:03 astatine root: mover: finished
Mar 14 02:22:58 astatine nginx: 2023/03/14 02:22:58 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 03:53:35 astatine nginx: 2023/03/14 03:53:35 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 03:53:35 astatine nginx: 2023/03/14 03:53:35 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 03:53:35 astatine nginx: 2023/03/14 03:53:35 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 04:23:08 astatine  ntpd[7970]: no peer for too long, server running free now
Mar 14 05:29:33 astatine nginx: 2023/03/14 05:29:33 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 05:29:33 astatine nginx: 2023/03/14 05:29:33 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 05:29:33 astatine nginx: 2023/03/14 05:29:33 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 06:18:35 astatine nginx: 2023/03/14 06:18:35 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 07:59:39 astatine  ntpd[7970]: no peer for too long, server running free now
Mar 14 09:02:32 astatine  ntpd[7970]: no peer for too long, server running free now
Mar 14 09:31:20 astatine  ntpd[7970]: no peer for too long, server running free now
Mar 14 10:23:03 astatine nginx: 2023/03/14 10:23:03 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 10:23:03 astatine nginx: 2023/03/14 10:23:03 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 10:23:03 astatine nginx: 2023/03/14 10:23:03 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 10:23:03 astatine nginx: 2023/03/14 10:23:03 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 10:23:03 astatine nginx: 2023/03/14 10:23:03 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 10:23:03 astatine nginx: 2023/03/14 10:23:03 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 10:23:04 astatine nginx: 2023/03/14 10:23:04 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 10:23:04 astatine nginx: 2023/03/14 10:23:04 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 10:23:04 astatine nginx: 2023/03/14 10:23:04 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 10:23:05 astatine nginx: 2023/03/14 10:23:05 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 10:23:06 astatine nginx: 2023/03/14 10:23:06 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
Mar 14 10:23:07 astatine nginx: 2023/03/14 10:23:07 [error] 3949#3949: nchan: A message from the past has just been published. Unless the system time has been adjusted, this should never happen.
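With this many repeated lines, a per-hour tally makes it easier to see when the ntpd and nchan messages cluster. The sketch below is an assumption-laden helper: the category labels and the line format it expects are inferred only from the excerpt above.

```python
import re
from collections import Counter

# Matches the syslog prefix seen above, e.g. "Mar 13 17:42:32 astatine  ntpd[7970]: ...".
LINE_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2}) (?P<hour>\d{2}):\d{2}:\d{2} \S+\s+(?P<rest>.*)$"
)

# Label -> substring to look for; both are choices made for this example.
CATEGORIES = {
    "ntpd lost all peers": "no peer for too long",
    "nchan message from the past": "A message from the past",
    "ntpd unexpected origin timestamp": "Unexpected origin timestamp",
}


def summarize(lines):
    """Count matching events per (label, 'Mon DD HH:00') bucket."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        for label, needle in CATEGORIES.items():
            if needle in match.group("rest"):
                bucket = f"{match['month']} {match['day']} {match['hour']}:00"
                counts[(label, bucket)] += 1
    return counts
```

Printing `sorted(summarize(lines).items())` for the full syslog gives a compact timeline; hours in which ntpd reports no peers are candidates for the periods when the server had no route to the internet.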

 

Edited by Astatine
Link to comment
