  • 6.12.13 - massive network latency when docker is enabled - server breaking for me


    Schmackei
    • Urgent

    TL;DR: I have narrowed down the massive network latency I am having. It is local to my network (it is testable and not present between the router and ISP at all), and since the update it only occurs when I enable Docker. The server shows 0 network usage on either of my NICs, but the latency increase is unbearable and makes my entire network unstable and unusable.


    Woke this morning to an update to 6.12.13 from the 6.12.4 I was running. I ran the update and a few app updates in Docker while I was at it. When the server came back up, I found my internet to be crazy unreliable, lagging out to the point of disconnecting me from any and all services I was using from any device on my home network. I troubleshot with my ISP and they identified that the issue was local to my network and not external.

    I was able to confirm from my TP-Link AX11000 router that the external connection was not the issue. I disabled all Wi-Fi devices and updated the router firmware to see if there was an issue on the network side of things, then factory reset it and confirmed the same issue still existed. I replaced all Ethernet cables between my desktop and the router, and with everything else disconnected from the router, no issue existed. I started plugging devices back in and found that the issue only happened when the server was plugged into the switch. I tried different ports on the switch: no change, the latency (4000+ ms) still persisted. I swapped cables between the desktop and the server (going back and forth) and can say without a doubt that the server was and is solely the cause of the issue.

    I started to shut down all non-essential services and systems on the Unraid server, including the array, and the latency went away. I turned the array back on: no latency. Then Docker: latency immediately. I stopped and disabled all containers: still latency. I disabled Docker again: no latency. I deleted network.cfg and rebuilt it: no latency until Docker was started again. I ran an update through the console to update Docker: the latency remained.
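For anyone wanting to repeat this on/off comparison with numbers rather than by feel, here is a minimal sketch. It assumes the Linux `ping` output format (`time=… ms`). The gateway address `192.168.0.1` and the function names are hypothetical and not taken from this report.

```python
import re
import statistics
import subprocess

# Matches the round-trip time in Linux ping lines such as "time=0.412 ms".
TIME_RE = re.compile(r"time[=<]([\d.]+)\s*ms")


def parse_ping_times(output: str) -> list[float]:
    """Extract round-trip times in milliseconds from ping output."""
    return [float(value) for value in TIME_RE.findall(output)]


def summarize(times: list[float]) -> dict:
    """Return count, median and worst-case latency for one test run."""
    if not times:
        return {"count": 0, "median_ms": None, "max_ms": None}
    return {
        "count": len(times),
        "median_ms": statistics.median(times),
        "max_ms": max(times),
    }


def ping(host: str, count: int = 20) -> str:
    """Run the system ping command and return its raw text output."""
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True, text=True, check=False,
    )
    return result.stdout
```

Run `summarize(parse_ping_times(ping("192.168.0.1")))` from another machine on the LAN, once with the Docker service stopped and once with it started, and compare the two summaries. Replies that are lost entirely do not appear in the output, so a `count` lower than the number of pings sent is itself a sign of trouble.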

    I am out of ideas on my end, as I am not a programmer, and I found the advice to submit a bug report before rolling back the OS update. So here I am, submitting my diagnostics zip and hoping you can help identify why Unraid is not working on my network after all these years of no issues.

    --- EDIT ---
    I booted into safe mode again, went to Plugins, and deleted any new plugins that showed an error. Then I went to Tools > Update Assistant and ran that again. This time there was an update for something for CA. I rebooted into normal mode and deleted my Unraid Theme Pack plugin and the plugin for Drive Locations. So far it still feels sluggish, but the impact on the network has decreased to a usable level. I would like to continue to monitor it and will report back within 24 hours on how it is performing.

    --- EDIT ---
    Within 5 minutes of clicking Submit on the last update, it caused a major network drop again. The issue is still persisting.
     

    jarvis-diagnostics-20241020_0123.zip





    JorgeB

    Posted

    Does the issue go away if you revert back to the previous release you were running?

    Schmackei

    Posted

    2 hours ago, JorgeB said:

    Does the issue go away if you revert back to the previous release you were running?

    I wanted to try to troubleshoot a bit before I took that step, but logically I see that either that or deleting my existing Docker image is likely the next step. So I will work on that, I guess, if nothing in the diag files I supplied immediately jumps out at people.

     

    Schmackei

    Posted

    3 hours ago, JorgeB said:

    Does the issue go away if you revert back to the previous release you were running?

    I backed up my docker.img and made a new docker.img, left it empty but enabled it. The network latency immediately returned. I disabled Docker again and it went away.
    I rolled back to 6.12.10 and rebooted: no latency. I enabled the empty docker.img and the latency returned immediately again. It is lower than it has been, but it still shouldn't be happening, and I cannot see why it is happening at all. The server's network interfaces are showing next to no usage at idle, and there is no HDD activity while the network is being lagged out to the point of disconnections.

    JorgeB

    Posted

    And is 6.12.10 the release you upgraded from?

    Schmackei

    Posted

    1 minute ago, JorgeB said:

    And is 6.12.10 the release you upgraded from?

    No, I was on 6.12.2, I believe. I am not certain how to roll back that far.

     

    JorgeB

    Posted

    You could have downgraded from the GUI, but once you have gone to a different release that's no longer possible. Since AFAIK there aren't any similar reports, I would recommend recreating the flash drive with the minimum config and retesting, to make sure it's not a config issue.

    Schmackei

    Posted

    OK, do you have a link to a guide that I can follow for recreating the flash drive? And will I lose my Unraid license?

     

    JorgeB

    Posted

    Back up the current one first, then redo it and restore just the bare minimum: the key, super.dat, and the pools folder for the assignments. Also copy the Docker user templates folder. If all works, you can then reconfigure the server, or try restoring a few config files at a time from the backup to see if you can find the culprit.
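The minimal restore described above could be scripted roughly as follows. This is only a sketch under stated assumptions: it assumes the backup and the new flash drive each expose a `config` folder, that the license is a `*.key` file directly inside it, and that the Docker user templates live under `plugins/dockerMan/templates-user`. Those relative paths and the function name `restore_minimal` are assumptions for illustration, so check them against your own backup before relying on this.

```python
import shutil
from pathlib import Path

# Assumed locations relative to the flash drive's config folder.
MINIMAL_ITEMS = [
    "super.dat",                          # array assignments
    "pools",                              # pool assignments
    "plugins/dockerMan/templates-user",   # Docker user templates (assumed path)
]


def restore_minimal(backup_config: Path, new_config: Path) -> list[str]:
    """Copy only the bare-minimum items from a backup config folder.

    Returns the relative paths that were actually restored.
    """
    # License key files sit directly in the config folder (assumption).
    candidates = [p.relative_to(backup_config) for p in backup_config.glob("*.key")]
    candidates += [Path(item) for item in MINIMAL_ITEMS]

    restored = []
    for rel in candidates:
        src = backup_config / rel
        dst = new_config / rel
        if not src.exists():
            continue  # item not present in this backup, skip it
        dst.parent.mkdir(parents=True, exist_ok=True)
        if src.is_dir():
            shutil.copytree(src, dst, dirs_exist_ok=True)
        else:
            shutil.copy2(src, dst)
        restored.append(rel.as_posix())
    return sorted(restored)
```

Everything else in the backup, including network.cfg, is deliberately left behind, so a later file-by-file restore can show which config file brings the problem back.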

    Schmackei

    Posted (edited)

    I got a fresh new USB-C flash drive and ran the USB creation tool for 6.12.13 on it while I backed up my existing Unraid flash drive. I copied over the key, pools, and shares files from config and nothing else. I connected the flash drive, assigned my drives in the order they were in before, and it had to rebuild my parity, which took the last 11+ hours. While it was doing this, there were no issues with the network at all. As soon as the parity rebuild had finished, all I did was enable Docker, and the network flooded instantly and started disconnecting devices from my wired home network. I disabled Docker and the disconnections immediately stopped.

    I am out of ideas. And now, by the look of it, I have an entire Unraid server that I will have to set up again if this issue can be diagnosed and solved.

    Edited by Schmackei
    JorgeB

    Posted

    See if you can retest with a different NIC, or a different server.

    Schmackei

    Posted

    16 minutes ago, JorgeB said:

    See if you can retest with a different NIC, or a different server.

    I have tested both NICs on the system with the same result; that was part of the initial isolation test. I have tested with different, new network cables, with the same result. I have tested other devices on the network using the old network cables without any issue. I have tested with a different switch and a different router. The only time this issue happens is when Docker is enabled on Unraid. It didn't exist prior to the update, and it can be 100% proven not to originate from any other device on the network. I have tested with 100% new network components and cables, and the issue is 100% reproducible from the server with Docker enabled.

    I have also, as mentioned above, swapped the USB flash drive for a new one with a fresh install of the OS. At this point, it would mean spending a substantial amount of money to replace the server in the hope that it solves the problem, but that is not viable and is simply not going to happen for a problem that didn't exist prior to the OS update.

    So, explain to me clearly, like I am a 5-year-old, what you mean by testing with a different NIC or a different server, since I have already tested, retested, and retested again with different NICs with the same result, and I don't have thousands of dollars at my disposal to troubleshoot with a different server. Maybe you can loan me your server to test it with. *shakes head*

    Vr2Io

    Posted (edited)

    I suggest you try not bridging both NICs on br0, and disable IPv6.

     

    image.png.6964db36693062c6fd19695257afaab6.png

    Edited by Vr2Io
    Schmackei

    Posted (edited)

    1 hour ago, Vr2Io said:

    I suggest you try not bridging both NICs on br0, and disable IPv6.

     

    image.png.6964db36693062c6fd19695257afaab6.png

    There was a time when it was not bridged or bonded. Let me go and disable that now. Should I also disable bonding?
    I have never had IPv6 enabled on my server or my home network.

    Edited by Schmackei
    Vr2Io

    Posted

    4 minutes ago, Schmackei said:

    Should I also disable bonding?

    Yes, that too. Make the network as simple as possible.

     

    5 minutes ago, Schmackei said:

    I have never had IPv6 enabled on my server or my home network.

    I saw that DHCP6(0) was "yes" but overlooked that the protocol is actually "ipv4" only.

     

    When you enable bridging and bridge both NICs, it can easily cause a loop, which will bring the whole network down.

    image.png.54530c2b85052506aac812239529aafc.png
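To check for the two-NICs-in-one-bridge situation from the console, a minimal sketch could read the kernel's sysfs tree. It assumes the usual Linux layout in which each bridge has a `/sys/class/net/<bridge>/brif/` directory listing its member ports, and in which hardware-backed interfaces have a `device` entry. The function names are made up for this example.

```python
from pathlib import Path

SYSFS_NET = Path("/sys/class/net")


def bridge_members(sysfs_root: Path = SYSFS_NET) -> dict[str, list[str]]:
    """Map each bridge interface to the ports attached to it."""
    bridges = {}
    for iface in sorted(sysfs_root.iterdir()):
        brif = iface / "brif"  # only bridge interfaces have this directory
        if brif.is_dir():
            bridges[iface.name] = sorted(port.name for port in brif.iterdir())
    return bridges


def risky_bridges(sysfs_root: Path = SYSFS_NET) -> list[str]:
    """Return bridges that contain two or more hardware-backed ports."""
    risky = []
    for bridge, ports in bridge_members(sysfs_root).items():
        # Hardware-backed NICs expose a "device" entry; virtual ones do not.
        physical = [p for p in ports if (sysfs_root / p / "device").exists()]
        if len(physical) >= 2:
            risky.append(bridge)
    return risky
```

A bridge reported by `risky_bridges()` is only a candidate for a loop: it matters when both ports are cabled to the same switch and nothing such as spanning tree is blocking one path. A bonded pair appears in the bridge as a single virtual port (for example `bond0`) and is not flagged by this check.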

    Schmackei

    Posted

    12 minutes ago, Vr2Io said:

    When you enable bridging and bridge both NICs, it can easily cause a loop, which will bring the whole network down.

     

    I have disabled bridging and bonding and left it as one NIC, eth0. Rebooted: not as bad, but same issue. I then swapped it to eth1 and rebooted: same issue again. No resolution here. It still only happens as soon as Docker is enabled.

    Vr2Io

    Posted

    1 minute ago, Schmackei said:

    I have disabled bridging and bonding and left it as one NIC, eth0. Rebooted: not as bad, but same issue. I then swapped it to eth1 and rebooted: same issue again. No resolution here. It still only happens as soon as Docker is enabled.

     

    Then which containers are running when the Docker service starts?

    Schmackei

    Posted

     

    12 minutes ago, Vr2Io said:

     

    Then which containers are running when the Docker service starts?

    None, they were all turned off. And after building a new flash drive, there were no Docker containers installed, and I was getting the same issue.

     

    Vr2Io

    Posted (edited)

    19 minutes ago, Schmackei said:

     

    None, they were all turned off. And after building a new flash drive, there were no Docker containers installed, and I was getting the same issue.

     

    Note, I use the same mobo, but it is running Windows, not Unraid. You are still on the first-release BIOS, and the current version is 4.x.

    BTW, could you describe a bit more about how this mobo ran with Unraid in the past? With Docker enabled?

     

    There are some strange things in your ifconfig: there is an unknown "br-f67f951ba59f" network there (although down), and your eth1 has some IPv6 settings too. The right-hand side of the screenshot is an example from one of my Unraid servers, where you will find nothing like that at all.
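The same two checks can be run from text output instead of a screenshot. The sketch below assumes the one-line output format of iproute2 (`ip -o link show` and `ip -o -6 addr show`) and relies on the `br-` plus 12 hex characters naming pattern that Docker normally uses for user-defined bridge networks. The function names are invented for this example.

```python
import re

# Docker normally names user-defined bridge interfaces "br-" + 12 hex chars.
DOCKER_BRIDGE_RE = re.compile(r"^br-[0-9a-f]{12}$")


def docker_style_bridges(ip_link_output: str) -> list[str]:
    """Find interfaces in `ip -o link show` output that look Docker-made."""
    names = []
    for line in ip_link_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        # Second field is the name, e.g. "eth0:" or "veth1234@if5:".
        name = fields[1].rstrip(":").split("@")[0]
        if DOCKER_BRIDGE_RE.match(name):
            names.append(name)
    return names


def ipv6_addresses(ip_addr_output: str) -> dict[str, list[str]]:
    """Collect IPv6 addresses per interface from `ip -o -6 addr show` output."""
    found = {}
    for line in ip_addr_output.splitlines():
        fields = line.split()
        # Expected shape: "<index>: <iface> inet6 <address>/<prefix> ..."
        if len(fields) >= 4 and fields[2] == "inet6":
            found.setdefault(fields[1], []).append(fields[3])
    return found
```

A name returned by `docker_style_bridges` can be compared against the network IDs printed by `docker network ls` to see which network it belongs to. An address starting with `fe80:` in the second result is a link-local address, which the kernel can assign on its own even when the interface is set to IPv4 only.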

     

    I think the last thing I can do is try booting Unraid on the same mobo and enabling the Docker service, as a reference for you.

     

    image.png.c268e6cf3e411b4fe3caf985740fb461.png

    Edited by Vr2Io
    Vr2Io

    Posted

    Please note, I have never faced a similar problem on different builds or even different Unraid OS versions. Please also try connecting only one Ethernet cable and check for any difference.

    Schmackei

    Posted

    24 minutes ago, Vr2Io said:

    Note, I use the same mobo, but it is running Windows, not Unraid. You are still on the first-release BIOS, and the current version is 4.x.

    BTW, could you describe a bit more about how this mobo ran with Unraid in the past? With Docker enabled?

     

    There are some strange things in your ifconfig: there is an unknown "br-f67f951ba59f" network there (although down), and your eth1 has some IPv6 settings too. The right-hand side of the screenshot is an example from one of my Unraid servers, where you will find nothing like that at all.

     

    I think the last thing I can do is try booting Unraid on the same mobo and enabling the Docker service, as a reference for you.

     

     

    Here is a new diagnostics file for you. That way we aren't going back and looking at network settings that no longer exist and aren't relevant to the issue we are still facing.

    I am not certain what the unknown network was. The OS was installed years ago as a project for me to learn how to navigate around, and I'm sure at some point before 6.10 I manually installed things like WireGuard, tunnels, or Tailscale, so it might be residual from that. But as I said, since the first diag file was attached, the network was reset, the flash drive and OS were freshly reset, the Docker image was freshly reset, and there were no Docker containers set up. The issue persisted even after a fresh install of Unraid, and only when Docker was enabled in the settings. It is also strange that eth1 was showing IPv6 settings when it was very clearly set to IPv4 only; maybe that is another fun quirk of the OS that needs to be looked at.

    I have been re-adding some things, as the array holds some very important things I need to access for work through the week, and I would rather not have to go and retrieve them from my offsite backup, as it is hours away.

    My mobo revision might be older than yours, no idea why. I have flashed it with the latest BIOS that is listed. I am not certain what other information you would like from me regarding the mobo; be specific if you want specifics.

    jarvis-diagnostics-20241020-1433.zip

    Vr2Io

    Posted

    I have run a test on the same mobo with Unraid 6.12.13. I also set each of the two onboard NICs as eth0 in turn (so two tests) and started the Docker service, with a fresh USB drive as well. I can't reproduce your problem.

     

    Screenshot2024-10-20at16-02-35Tower_Docker.thumb.png.79a81457f8724ec9bf4322e799085bd8.png

     

    image.png.da346c7f639ef2c8f5d8efb5ecf8570f.png

     

    image.png.d4c6be13a52f1ca59f192e241c30eefa.png





