• [6.12] Networking not working correctly after upgrading to 6.12 and changing docker to ipvlan


    fireplex
    • Solved

    Hi,

     

    I just upgraded from 6.11.5 to 6.12 and noticed repeated kernel macvlan errors in the syslog, so I changed the Docker network type to ipvlan as recommended. Apart from these kernel errors, everything seemed OK, and networking worked on 6.12 before the Docker change.

     

    After changing Docker to ipvlan, unRAID can no longer check Docker containers or plugins for updates; the status column reports "not available". I also seem to lose remote access to my swag container, and the Plex container seems unable to resolve its IP.

     

    Diagnostics attached, any ideas please?

     

    Thanks!

    tower-diagnostics-20230618-2206.zip







    32 minutes ago, thecode said:

    What is the meaning of this parameter?

     

    It tells how many interfaces are present and need to be configured.

    When this line is not present, the network.cfg file is assumed to hold a legacy (pre-V6) configuration, which results in a basic network configuration without the extras that are supported in V6 and later.

     

    When modifying the network settings using the GUI, it will automatically add the SYSNICS setting.
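
    The check described above can be sketched as follows. This is a minimal illustration, not Unraid's actual startup logic: it assumes network.cfg uses simple KEY="value" lines, and the sample file contents and the helper name sysnics_value are hypothetical.

```python
import re

def sysnics_value(cfg_text):
    """Return the SYSNICS value as an int, or None if the line is absent."""
    # Assumption: the setting is written as SYSNICS="N" on its own line.
    match = re.search(r'^\s*SYSNICS\s*=\s*"?(\d+)"?\s*$', cfg_text, re.MULTILINE)
    return int(match.group(1)) if match else None

# Hypothetical file contents, for illustration only.
legacy_cfg = 'USE_DHCP="yes"\n'
current_cfg = 'USE_DHCP[0]="yes"\nSYSNICS="1"\n'

for name, text in (("legacy", legacy_cfg), ("current", current_cfg)):
    value = sysnics_value(text)
    if value is None:
        print(f"{name}: SYSNICS missing, would be treated as a legacy configuration")
    else:
        print(f"{name}: {value} interface(s) to configure")
```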

     

    Link to comment

    Since @bonienl mentioned that my config is missing "SYSNICS", I renamed the network.cfg file and rebooted the server. This doesn't cause any problem, since the server's IPv4 address is also reserved in the router's DHCP.

     

    After the reboot I noticed two changes:

    1. br0 interface metric changed from 1 to 1006

    2. broadcast address for br0 changed from 0.0.0.0 to 192.168.x.255

    image.thumb.png.5bac74c2795d6ac85dcf88c3dfc3d81c.png

    I already suspected that having both shim-br0 and br0 use the same metric can cause a routing problem (thus possibly creating the incomplete ARP entry after the server accesses an internet resource).

     

    On my other server (which uses macvlan for now), the metric for both br0 and shim-br0 is set to 1. I can't change settings on that server since it runs the house right now, but I assume that even with macvlan br0 should have a different metric.

     

    So far I have not been able to reproduce the incomplete ARP; before this change it happened within a few minutes. I still have the problematic config file, so I can switch back to it and look for the problematic setting if that would help others. I also wonder whether it is worth switching to macvlan to check if it makes any difference there now.

     

    Note: My system doesn't have a network.cfg file now, since I deleted it and have not made any changes from the GUI.

    Edited by thecode
    Link to comment
    Quote

    After the reboot I noticed two changes:

    1. br0 interface metric changed from 1 to 1006

    2. broadcast address for br0 changed from 0.0.0.0 to 192.168.x.255

     

    This information comes from your DHCP server

     

    The macvlan / ipvlan setting is for Docker; it doesn't change the network settings.

     

    Quote

    So far I have not been able to reproduce the incomplete ARP,

     

    An incomplete ARP occurs when the target host (your router in this case) is no longer responding to arp requests from the server.
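
    To spot such entries, the neighbour table can be scanned for hosts without a resolved MAC address. The sketch below assumes output in the style of `ip neigh show`, where an unresolved entry ends in INCOMPLETE or FAILED; the sample addresses and the helper name unresolved_neighbors are made up for illustration.

```python
def unresolved_neighbors(ip_neigh_output):
    """Return (ip, device) pairs whose neighbour entry has no resolved MAC."""
    unresolved = []
    for line in ip_neigh_output.splitlines():
        fields = line.split()
        if len(fields) < 4 or "dev" not in fields:
            continue  # skip blank or unexpected lines
        state = fields[-1]  # the entry state is the last column
        if state in ("INCOMPLETE", "FAILED"):
            device = fields[fields.index("dev") + 1]
            unresolved.append((fields[0], device))
    return unresolved

# Hypothetical sample output, for illustration only.
sample = """\
192.168.1.1 dev br0 INCOMPLETE
192.168.1.20 dev br0 lladdr 52:54:00:12:34:56 REACHABLE
192.168.1.1 dev shim-br0 lladdr 00:11:22:33:44:55 STALE
"""
print(unresolved_neighbors(sample))
```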

     

    Link to comment
    17 minutes ago, bonienl said:

     

    This information comes from your DHCP server

     

     

    To check this, I disabled DHCP from the GUI (set to static) and did not change anything else (the address offered by DHCP is already the correct static IP), then rebooted the server. Right after the reboot the server is missing the ARP entry for br0.

     

     

     

    ip.png

    Edited by thecode
    Link to comment
    2 minutes ago, thecode said:

    To check this, I disabled DHCP from the GUI (set to static) and did not change anything else (the address offered by DHCP is already the correct static IP), then rebooted the server. Right after the reboot the server is missing the ARP entry for br0.

     

    To me this confirms your router is the source of the problem.

     

    Link to comment

    This is my routing table

    I use static routing with explicit settings for metrics.

    Docker uses ipvlan and has "host access" enabled, hence the shim interface

     

    image.png

     

    Doesn't give any problems in my network set up.

     

    Link to comment
    2 minutes ago, bonienl said:

     

    To me this confirms your router is the source of the problem.

     


    This doesn't make sense to me. How would setting a static IP in Unraid (with the same IP) be related to the router? If the problem only occurred when DHCP is used, the router could explain it, but with a static IP the router should not have any effect.

     

    I have another Unraid server set to "macvlan" (with a static IP) that doesn't suffer from this, plus more than 100 network devices, and only this server suffers from the problem. I have also tried setting up a Linux machine with Docker and manually creating a shim interface (using https://blog.oddbit.com/post/2018-03-12-using-docker-macvlan-networks/), and it did not have any problem. Only this server has the incomplete ARP problem, and only with ipvlan.

    I also can't understand why a switch/router that has both servers connected to it would be missing the ARP entry for only one server. I can capture traffic directly on the router to check.

    I am not the only one having this; not sure if it helps, but there is a thread about it on Reddit:

     

    Link to comment
    13 minutes ago, bonienl said:

    This is my routing table

    I use static routing with explicit settings for metrics.

    Docker uses ipvlan and has "host access" enabled, hence the shim interface

     

    image.png

     

    Doesn't give any problems in my network set up.

     

    There is one difference between this and my config. Mine is missing the default route for shim-br0:

    image.thumb.png.604a8d6350451fa131f599cef54ea3dd.png

    Link to comment

    To test my assumption about the interface metric I have manually set the interface metric to 1.

    image.thumb.png.6580d0e480530835aab12bd8cbf9d31b.png

     

    With this set to 1 my routing table in the GUI now looks like this:

    image.thumb.png.855787f4e7cafc870f18927fbd65bacd.png

     

    output from route:

    image.png.91703e099b2ca5a7898e805607062c95.png

     

    There may be a small GUI bug, since the GUI shows the metric as "1" while route shows 0 or 1.

    With this setting I think the issue of the incomplete arp is fixed for me. I will let it run for 1-2 days to test.

    Link to comment

    I see what is happening

    When you create a static IP address + gateway without explicitly setting a metric value, a default route with metric 0 is created for br0. This is indeed wrongly displayed in the GUI as value 1 (a display error).

     

    When the shim-br0 interface is created, another default route with metric 0 is added, but this fails because two default routes with the same metric are not allowed. As a result, the default route for the shim-br0 interface is missing, which causes loss of communication with the outside world whenever the shim interface is used.

     

    The solution is indeed to set a metric value other than zero for the br0 gateway when a static IP assignment is used and "host access" is enabled. Btw, this is not Unraid 6.12 specific but is true for earlier versions too.

     

    Need to think of a way to make this clear to the user and avoid this situation.

     

    Thanks for all the testing.
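
    The missing default route described above can be detected from routing table text. The sketch below is only an illustration under stated assumptions: it expects lines in the style of `ip route show`, where a default route printed without a "metric" field has metric 0, and the sample tables, addresses and helper names are hypothetical rather than taken from the screenshots in this thread.

```python
def default_route_metrics(ip_route_output):
    """Map each device to the metrics of its default routes (0 when omitted)."""
    metrics = {}
    for line in ip_route_output.splitlines():
        fields = line.split()
        if not fields or fields[0] != "default" or "dev" not in fields:
            continue  # only default routes are of interest here
        device = fields[fields.index("dev") + 1]
        metric = int(fields[fields.index("metric") + 1]) if "metric" in fields else 0
        metrics.setdefault(device, []).append(metric)
    return metrics

def missing_default_route(ip_route_output, devices):
    """Return the devices from `devices` that have no default route."""
    found = default_route_metrics(ip_route_output)
    return [dev for dev in devices if dev not in found]

# Hypothetical tables: br0 holds the metric-0 default route, so the shim's
# metric-0 default route could not be added.
broken = """\
default via 192.168.1.1 dev br0
192.168.1.0/24 dev br0 proto kernel scope link src 192.168.1.10
"""
# With a non-zero metric on br0, both default routes can coexist.
fixed = """\
default via 192.168.1.1 dev shim-br0
default via 192.168.1.1 dev br0 metric 1
192.168.1.0/24 dev br0 proto kernel scope link src 192.168.1.10
"""

for name, table in (("broken", broken), ("fixed", fixed)):
    print(name, missing_default_route(table, ["br0", "shim-br0"]))
```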

     

    Link to comment
    4 minutes ago, bonienl said:

    I see what is happening

    When you create a static IP address + gateway without explicitly setting a metric value, a default route with metric 0 is created for br0. This is indeed wrongly displayed in the GUI as value 1 (a display error).

     

    When the shim-br0 interface is created, another default route with metric 0 is added, but this fails because two default routes with the same metric are not allowed. As a result, the default route for the shim-br0 interface is missing, which causes loss of communication with the outside world whenever the shim interface is used.

     

    The solution is indeed to set a metric value other than zero for the br0 gateway when a static IP assignment is used and "host access" is enabled.

     

    Need to think of a way to make this clear to the user and avoid this situation.

     

    Thanks for all the testing.

     


    Many thanks for helping me debug this issue. I think a first step (but not urgent) would be to fix the GUI to reduce confusion. As a fix, maybe the metric for br0 could be set to 1 automatically when the script creates "shim-br0", if it is not explicitly set.

     

    About making it clear to the user: since the 6.12 Known issues points to your excellent post, I would suggest editing the post to recommend first trying ipvlan with the metric set to 1 for br0.

     

     

    And thanks again 👍

    Link to comment

    I made an update which automatically resolves the issue when the shim interface is created.

    This should help anyone with static network settings and "host access" enabled.

     

    Link to comment
    5 hours ago, bonienl said:

    I made an update which automatically resolves the issue when the shim interface is created.

    This should help anyone with static network settings and "host access" enabled.

     

    This fix is included in 6.12.3. Be sure to read the announcement post when upgrading:

     

    Link to comment

    I can confirm the fix works in 6.12.3. I removed the static metric before upgrading, and the route is set with the correct metric afterwards.

     

    Thanks for the efforts resolving this issue.

    Link to comment


