groot-stuff's post in DNS Resolution issues was marked as the answer
    @Squid I understand you aren't a networking guru, but thank you very much; I really appreciated the instant CA update you pushed to help me troubleshoot.
    After a nice hike yesterday and a good night's sleep, I banged my head on the keyboard some more this morning.
     
    I would hate to come to a thread with the same or a similar issue, read through it, and have it end without a resolution. I would equally hate a final post from the OP saying "Never mind, I fixed it," because that doesn't help the next person a Google search lands here.
     
    Request:
    I am not a networking expert and would LOVE for someone who is to take a stab at why this resolved the issue... despite no changes being made to the router or unRaid network config that would have caused it.
     
    So further down the troubleshooting rabbit hole:
    - I stopped the array: same symptoms
        - I tested this because I had read posts about unRaid GUI latency being caused by failing disks
     
    Next I dug through my mental cobwebs and found my 20-year-old, uncertified CCNA knowledge... along with many web articles, some linked below.
    - dig [anyDomainName.TLD] came back instantaneously... SO STRANGE, as here I am thinking unRaid is having DNS request timeouts (quick sketch of the commands below)
        - This command let me learn that my docker containers use 127.0.0.11 as their DNS server... hence why the docker containers had no symptoms
        - Resource: https://www.google.com/amp/s/www.hostinger.com/tutorials/how-to-use-the-dig-command-in-linux/amp/
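    For anyone who wants to reproduce this, roughly what I ran ([anyDomainName.TLD] and [containerName] are placeholders; swap in your own domain and one of your containers):
        # DNS lookup straight from the unRaid shell - this came back instantly
        dig [anyDomainName.TLD]
        # from inside a container, Docker's embedded resolver is what answers
        docker exec [containerName] cat /etc/resolv.conf
        # nameserver 127.0.0.11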
     
    So I ran tcpdump while issuing various commands (curl, ping, traceroute, and even clicking on the CA Apps tab) to observe what is sent/received (rough capture sketch below)
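    Rough sketch of how I captured (interface and file names are just examples; br0 happens to be my bridge, yours may be eth0):
        # start a capture on the main interface, re-run the slow command, then stop the capture
        tcpdump -i br0 -w curl-test.pcap &
        curl ifconfig.io
        kill %1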
     
     
    Digging into tcpdump of "curl ifconfig.io"
        - unRaid settings: external DNS servers statically assigned in Settings > Network (OpenDNS = 208.67.220.220 and 208.67.222.222)
        - The traffic (packets) going from point A to B was just fine, quick as it should be... but something was slowing the overall request down
        - 106 packets over 15 seconds
        - 33 of those were ARP requests to my local router asking "who has [myPublicIP]?" (counting sketch below)
            - This got me curious... why is unRaid continually trying to figure out that my router is the next layer 2 (MAC address) hop to my public IP?
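    The counting itself was nothing fancy, just reading back the capture file from the sketch above:
        # read the capture back and count only the ARP who-has requests
        tcpdump -r curl-test.pcap -nn arp | grep -c "who-has"
        # 33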
     
    Digging into ARP requests
    - Resource: https://osqa-ask.wireshark.org/questions/5412/what-does-arp-42-who-has-19216811-tell-192168133-mean
    - ARP (Address Resolution Protocol) requests are similar to DNS requests, but they are IP-to-MAC resolutions rather than domain-to-IP resolutions
    - running arp -a was very slow (this shows IP addresses and their corresponding MAC addresses)
    - running arp -an (no DNS lookups) was instantaneous (comparison below)
        - Surprisingly this is still the case after the fix, but it makes sense because the reverse DNS lookups for each local subnet IP and 172.18.X.X docker subnet IP fail
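    Quick way to see the difference yourself:
        # with reverse DNS lookups of each neighbor (this was the slow one)
        time arp -a
        # -n skips the lookups entirely - instantaneous
        time arp -an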
     
     
    Digging into tcpdump of "curl ifconfig.io" with local router as the first DNS server
        - unRaid settings: local router and external DNS servers statically assigned in Settings > Network
            - local router as #1, then OpenDNS = 208.67.220.220 and 208.67.222.222 as #2 and #3 (quick way to verify the order below)
        - 42 packets (64 fewer) over less than 1 second (roughly 14 seconds faster)
        - None of those (33 fewer) were ARP requests to my local router asking "who has [myPublicIP]?"
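    I believe unRaid writes the static DNS entries out to /etc/resolv.conf, so a quick way to confirm the new order (192.168.1.1 is just an example router address; use yours):
        # resolver order after the change
        cat /etc/resolv.conf
        # nameserver 192.168.1.1
        # nameserver 208.67.220.220
        # nameserver 208.67.222.222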
     
    Symptom Resolution:
        - CLI requests to WAN resources are instantaneous (as expected)
        - Plugins tab loads in ~5 seconds (background update checks are enabled)
        - CA tab loads in 2-3 seconds
        - Dockers update and retain icon image
        - Fix Common Problems plugin scans within 15 seconds now (took multiple minutes before)
     
    Unresolved symptoms:
        - Dockers still show version "not available"
            - Forcing an update resolves this, but I am going to wait overnight to see if the scheduled update check resolves it (will post again tomorrow)
            - UPDATE (same day): I triggered a manual check for docker updates, and about 15-20 seconds later all 17 of my dockers no longer showed "not available"
        - traceroutes to my default gateway still fail
     
     
    @Squid So, it was my network... I think? lol