• 6.12.0-rc4 "macvlan call traces found", but not on <=6.11.x


    sonic6
    • Solved Minor

    Hello,

     

    I got the following output in my syslog: https://pastebin.cloud-becker.de/5ede1251c0f8 (Diagnostic attached)

     

    I know the general fix for this is to use "ipvlan" instead of "macvlan".

    But in my case (and for other people), this isn't an option.

    The AVM Fritzbox (7595 in my case) isn't compatible with ipvlan.

     

    I came from the latest 6.11.x stable, which ran without any problems, and the same was true for 6.10.x.

     

    @alturismo has the same problem with 6.12.x, although he had no problems on 6.11.x. Maybe he can post some more details from his setup.

     

    So I hope it is fixable, especially since earlier versions ran without this problem.

    unraid-1-diagnostics-20230429-1014.zip




    User Feedback

    Recommended Comments



    Never had traces before. Came with 6.12.x. From rock solid system to very unstable. Now at ipvlan without problems but that's not a good solution. I hope this is looked at. 

    Link to comment

    Encountered the same issues here. Just updated to 6.12.3 the other day and within a few minutes it had 3 macvlan trace issue notifications. Decided to keep it running and the server crashed only a few hours later, and have since rolled back to 6.11.

    Link to comment

    We are absolutely looking for ways to get macvlan to work without throwing call traces. Currently our best advice is documented in the known issues for 6.12.0:

      https://docs.unraid.net/unraid-os/release-notes/6.12.0#known-issues

     

    If those solutions aren't ideal for you then you can try to ignore the call traces, but if your system ends up crashing that isn't good. At that point you can take another look at the solutions above or roll back to a previous version.  Be sure to read the 6.12.0 release notes if you do decide to roll back.

    Link to comment
    12 hours ago, sonic6 said:

    Why is this still recommended? If you take a look at this thread, there are more and more users who are using this "solution" but still get traces.

    This just confuses more and more users who are looking for help.

     

    OK so first let me acknowledge that these macvlan call traces are an issue, and we are looking for a fix.

     

    Having said that, I think the recommendations here are valid until we have something better:

    https://docs.unraid.net/unraid-os/release-notes/6.12.0/#known-issues

     

    1) IPVLAN may not work for everyone, but it is a simple thing to try. I think you don't like it because your router can't do port forwarding when IPVLAN is enabled. OK, that is a bummer. 

     

    Side note - one thing I keep meaning to ask folks is does your ISP *require* you to use the Fritzbox or are you allowed to replace it? In the US, ISPs give you a pretty bare bones router but you are usually allowed to replace it if you want more functionality. Not sure if it works that way everywhere or not.

     

    2) As far as I know the two nic solution should work. If there are issues with it in 6.12.3, it would be great to have that discussion (including diagnostics!) over in this thread:

    https://forums.unraid.net/topic/137048-guide-how-to-solve-macvlan-and-ipvlan-issues-with-containers-on-a-custom-network/

    So far I haven't seen any diagnostics from 6.12.3 that say we shouldn't recommend this as a way to avoid macvlan call traces. It may end up depending on the hardware involved. Anyway, probably best to keep that discussion in that thread.

     

    If those solutions aren't ideal for you then you can try to ignore the call traces, but if your system ends up crashing that isn't good. At that point you can take another look at the solutions above or roll back to a previous version.  Be sure to read the 6.12.0 release notes if you do decide to roll back.

    Link to comment
    13 minutes ago, ljm42 said:

    Having said that, I think the recommendations here are valid until we have something better:

    https://docs.unraid.net/unraid-os/release-notes/6.12.0/#known-issues

    Okay, I have to say sorry, because I just meant this "solution":

     

    And you are also right, there aren't diagnostics from 6.12.3 ... there aren't any diagnostics. But they are also hard to post when the server has crashed ;)

     

     

     

    12 minutes ago, ljm42 said:

    Side note - one thing I keep meaning to ask folks is does your ISP *require* you to use the Fritzbox or are you allowed to replace it? In the US, ISPs give you a pretty bare bones router but you are usually allowed to replace it if you want more functionality. Not sure if it works that way everywhere or not.

     

    In Germany we can normally choose our routers freely, but the Fritzbox is one of the best choices for normal consumers, which is about 99% of people. (A small number of ISPs give you a router for their fiber connections that you have to use. In most cases this is a Fritzbox.)

    Edited by sonic6
    Link to comment

    Same problem here with a Fritzbox router. I can't change the router, because the one from my provider is so much worse... Please fix this issue.

    Link to comment
    1 minute ago, eLpresidente said:

    Please Fix this issue 

     

    Hopefully it was clear from my earlier response, that is the goal : ) 

     

     

    This was just a side conversation... I see a lot of "I can't do XYZ because of Fritzbox" so I'm just wondering if people have the ability to replace it with a better router or if they are locked in to using it.

    Link to comment
    1 hour ago, sonic6 said:

    And you are also right, there aren't diagnostics from 6.12.3 ... there aren't any diagnostics. But they are also hard to post when the server has crashed ;)

     

    diagnostics showing a call trace in the two nic setup in the other thread would be helpful (this provides both a syslog and configuration information to help determine what the problem could be)

     

    Please also enable syslog mirroring to flash to capture logs if it does crash (this would be in the logs folder on the flash drive, ideally you also would have diagnostics from before the crash to show the configuration)
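Once a syslog has been captured, the call traces can be pulled out of it mechanically. The following is a minimal sketch, not an official Unraid tool: it assumes the kernel brackets each warning with "[ cut here ]" and "[ end trace ... ]" lines (as in the trace posted later in this thread) and that macvlan frames carry the "[macvlan]" module tag. The function name `find_call_traces` is made up for illustration.

```python
import re
from typing import Iterable, List

# Markers the kernel prints around a WARNING block.
START_MARKER = "[ cut here ]"
END_MARKER_RE = re.compile(r"\[ end trace [0-9a-f]+ \]")


def find_call_traces(lines: Iterable[str], tag: str = "[macvlan]") -> List[List[str]]:
    """Return every cut-here/end-trace block that contains `tag`.

    The default tag is the bracketed module name found on stack frames.
    Matching the bare word "macvlan" would also hit the "Modules linked in"
    line of unrelated warnings whenever the module is merely loaded.
    """
    traces: List[List[str]] = []
    current = None  # lines of the block being collected, None outside a block
    for raw in lines:
        line = raw.rstrip("\n")
        if START_MARKER in line:
            current = [line]
        elif current is not None:
            current.append(line)
            if END_MARKER_RE.search(line):
                if any(tag in entry for entry in current):
                    traces.append(current)
                current = None
    return traces
```

The function accepts any iterable of lines, so an open file handle for the mirrored syslog can be passed directly. The location of that file is not stated in this thread; use whatever path the logs folder on your flash drive has on your system.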

    Link to comment
    6 hours ago, ljm42 said:

    This was just a side conversation... I see a lot of "I can't do XYZ because of Fritzbox" so I'm just wondering if people have the ability to replace it with a better router or if they are locked in to using it.

    Generally speaking, sure, we are free to use our own hardware.

     

    And now for the one catch: Fritz is more or less the only one who makes decent cable modem/routers ... and since there are by now many cable users here, the only option would be a dual setup (2 routers ...).

     

    That really doesn't fit the times anymore considering energy consumption and so on ... though even I thought about it to finally get rid of this issue ... I spent 2 intensive months figuring it out ;)

     

    Also, Fritz has a wide range of DECT smart home accessories (in use here too ...), so basically, yes: either I'm locked to the Fritz "appliances", or I drop everything, keep a Fritz only as a modem, and get another router, hoping it can handle ipvlan properly ... I was also surprised that Unifi has firewall issues in some combinations ... and when I think about it, a lot of hardware somehow binds networking to MAC addresses and not only to IPs, so personally I wouldn't even know what to buy now ...

     

    The 2nd NIC solution has already been posted a few times, sadly with no diags afaik ... If it weren't such a mess to reconfigure everything I could also test it again, but for now, after relying on 6.11 for a long time, I have switched almost all dockers to bridge mode, and for the ones where that is impossible I set up separate LXC containers ...

     

    When I find some spare time I'll report the 2nd NIC results here. Sadly I don't have a 2 NIC setup in my small test server ... otherwise it would already be posted ;)

    Link to comment
    On 7/20/2023 at 11:17 PM, eLpresidente said:

    Same problem here with a Fritzbox router. I can't change the router, because the one from my provider is so much worse... Please fix this issue.

    I just changed it to IPVLAN, and it has now been working for 4 days without any issues. I am relatively new to Linux and Unraid, and what alturismo wrote is correct for Fritz users: I don't want another router that needs power. So I am asking, what changed from version 6.11 to 6.12 that this doesn't work anymore? I am not a dev and don't have the knowledge, but as a newbie I am wondering whether you could revert the changes for macvlan?

     

    As for the diagnostics files: I am still waiting for the server to run a week or more to sort it out. After that I can change it back to macvlan and post diagnostics.

     

    BTW, thanks for helping us out. Changing hardware for Unraid, knowing that it worked on 6.11, isn't the right solution in my opinion, and like alturismo said, there aren't really any better options for us in Germany.

    Link to comment

    The call trace issue, while I have it, doesn't seem to cause me issues.  Scanning the logs I get one about once a week.  It doesn't halt the system in any way and I've seen no issues.

     

    I reconfigured my network setup based on some guides earlier.  2 NICs in the server, one dedicated to the Docker network.  No bridge between NIC 1 and 2.  NIC 2 isn't assigned an IP address.  

     

    I run a Unifi setup and I'd much prefer my network map to be correct.  Being borderline OCD, I like to give all my dockers a static IP address.  So IPVLAN isn't for me.

     

    So in summary, I do get the odd call trace, but it doesn't cause my server any problems.  Not sure if that's consistent with others or whether I'm one of the lucky ones.

    Link to comment

    Going to throw my hat in the ring again, same issues. I switched to ipvlan which isn’t ideal as I need each docker to have a static ip that I hand out from my router. This works for now but eagerly waiting for a fix since this wasn’t an issue in previous releases. 

    Link to comment
    On 7/23/2023 at 5:01 AM, dalben said:

    I reconfigured my network setup based on some guides earlier.  2 NICs in the server, one dedicated to the Docker network.  No bridge between NIC 1 and 2.  NIC 2 isn't assigned an IP address.  

    Can you please confirm if you have bridging enabled for the docker dedicated NIC? And if not please post a couple of the call traces you are getting (or the syslog).

    Link to comment
    On 8/9/2023 at 3:01 PM, JorgeB said:

    Can you please confirm if you have bridging enabled for the docker dedicated NIC? And if not please post a couple of the call traces you are getting (or the syslog).

     

    I may need to backtrack on that statement.  I went trawling through my syslog and found weekly alerts from the Fix Common Problems plugin saying that I had macvlan call trace errors, but I couldn't find one in the actual syslog.  So ignore the above for now.

    Link to comment

    @JorgeB - Caught the macvlan call trace today.  Attached are diagnostics and the call trace snippet below.  I have Version: 6.12.4 running on the box.

     

    Sep 22 08:39:13 tdm kernel: ------------[ cut here ]------------
    Sep 22 08:39:13 tdm kernel: WARNING: CPU: 9 PID: 159 at net/netfilter/nf_conntrack_core.c:1210 __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
    Sep 22 08:39:13 tdm kernel: Modules linked in: tun tls xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype br_netfilter xfs md_mod tcp_diag inet_diag ip6table_filter ip6_tables iptable_filter ip_tables x_tables bridge stp llc macvtap macvlan tap e1000e r8169 realtek intel_rapl_msr zfs(PO) intel_rapl_common zunicode(PO) zzstd(O) i915 x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel zlua(O) zavl(PO) icp(PO) kvm iosf_mbi drm_buddy i2c_algo_bit ttm zcommon(PO) drm_display_helper drm_kms_helper znvpair(PO) spl(O) crct10dif_pclmul crc32_pclmul crc32c_intel drm ghash_clmulni_intel sha512_ssse3 aesni_intel crypto_simd cryptd mei_hdcp mei_pxp intel_gtt rapl intel_cstate gigabyte_wmi wmi_bmof mpt3sas i2c_i801 nvme agpgart ahci mei_me i2c_smbus syscopyarea raid_class i2c_core intel_uncore nvme_core mei libahci sysfillrect scsi_transport_sas sysimgblt fb_sys_fops thermal fan video wmi
    Sep 22 08:39:13 tdm kernel: backlight intel_pmc_core acpi_tad acpi_pad button unix [last unloaded: e1000e]
    Sep 22 08:39:13 tdm kernel: CPU: 9 PID: 159 Comm: kworker/u24:6 Tainted: P           O       6.1.49-Unraid #1
    Sep 22 08:39:13 tdm kernel: Hardware name: Gigabyte Technology Co., Ltd. B365 M AORUS ELITE/B365 M AORUS ELITE-CF, BIOS F3d 08/18/2020
    Sep 22 08:39:13 tdm kernel: Workqueue: events_unbound macvlan_process_broadcast [macvlan]
    Sep 22 08:39:13 tdm kernel: RIP: 0010:__nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
    Sep 22 08:39:13 tdm kernel: Code: 44 24 10 e8 e2 e1 ff ff 8b 7c 24 04 89 ea 89 c6 89 04 24 e8 7e e6 ff ff 84 c0 75 a2 48 89 df e8 9b e2 ff ff 85 c0 89 c5 74 18 <0f> 0b 8b 34 24 8b 7c 24 04 e8 18 dd ff ff e8 93 e3 ff ff e9 72 01
    Sep 22 08:39:13 tdm kernel: RSP: 0018:ffffc9000032cd98 EFLAGS: 00010202
    Sep 22 08:39:13 tdm kernel: RAX: 0000000000000001 RBX: ffff888194440700 RCX: 0d35b370dc56628d
    Sep 22 08:39:13 tdm kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff888194440700
    Sep 22 08:39:13 tdm kernel: RBP: 0000000000000001 R08: e03672da4c370c38 R09: 4eea21ed5130b6c5
    Sep 22 08:39:13 tdm kernel: R10: 9101b661ea51c81d R11: ffffc9000032cd60 R12: ffffffff82a11d00
    Sep 22 08:39:13 tdm kernel: R13: 000000000002fc07 R14: ffff8881af585900 R15: 0000000000000000
    Sep 22 08:39:13 tdm kernel: FS:  0000000000000000(0000) GS:ffff88880f440000(0000) knlGS:0000000000000000
    Sep 22 08:39:13 tdm kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    Sep 22 08:39:13 tdm kernel: CR2: 0000149eca3e2484 CR3: 000000000220a002 CR4: 00000000003706e0
    Sep 22 08:39:13 tdm kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    Sep 22 08:39:13 tdm kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    Sep 22 08:39:13 tdm kernel: Call Trace:
    Sep 22 08:39:13 tdm kernel: <IRQ>
    Sep 22 08:39:13 tdm kernel: ? __warn+0xab/0x122
    Sep 22 08:39:13 tdm kernel: ? report_bug+0x109/0x17e
    Sep 22 08:39:13 tdm kernel: ? __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
    Sep 22 08:39:13 tdm kernel: ? handle_bug+0x41/0x6f
    Sep 22 08:39:13 tdm kernel: ? exc_invalid_op+0x13/0x60
    Sep 22 08:39:13 tdm kernel: ? asm_exc_invalid_op+0x16/0x20
    Sep 22 08:39:13 tdm kernel: ? __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
    Sep 22 08:39:13 tdm kernel: ? __nf_conntrack_confirm+0x9e/0x2b0 [nf_conntrack]
    Sep 22 08:39:13 tdm kernel: ? nf_nat_inet_fn+0x60/0x1a8 [nf_nat]
    Sep 22 08:39:13 tdm kernel: nf_conntrack_confirm+0x25/0x54 [nf_conntrack]
    Sep 22 08:39:13 tdm kernel: nf_hook_slow+0x3a/0x96
    Sep 22 08:39:13 tdm kernel: ? ip_protocol_deliver_rcu+0x164/0x164
    Sep 22 08:39:13 tdm kernel: NF_HOOK.constprop.0+0x79/0xd9
    Sep 22 08:39:13 tdm kernel: ? ip_protocol_deliver_rcu+0x164/0x164
    Sep 22 08:39:13 tdm kernel: __netif_receive_skb_one_core+0x77/0x9c
    Sep 22 08:39:13 tdm kernel: process_backlog+0x8c/0x116
    Sep 22 08:39:13 tdm kernel: __napi_poll.constprop.0+0x28/0x124
    Sep 22 08:39:13 tdm kernel: net_rx_action+0x159/0x24f
    Sep 22 08:39:13 tdm kernel: __do_softirq+0x126/0x288
    Sep 22 08:39:13 tdm kernel: do_softirq+0x7f/0xab
    Sep 22 08:39:13 tdm kernel: </IRQ>
    Sep 22 08:39:13 tdm kernel: <TASK>
    Sep 22 08:39:13 tdm kernel: __local_bh_enable_ip+0x4c/0x6b
    Sep 22 08:39:13 tdm kernel: netif_rx+0x52/0x5a
    Sep 22 08:39:13 tdm kernel: macvlan_broadcast+0x10a/0x150 [macvlan]
    Sep 22 08:39:13 tdm kernel: macvlan_process_broadcast+0xbc/0x12f [macvlan]
    Sep 22 08:39:13 tdm kernel: process_one_work+0x1a8/0x295
    Sep 22 08:39:13 tdm kernel: worker_thread+0x18b/0x244
    Sep 22 08:39:13 tdm kernel: ? rescuer_thread+0x281/0x281
    Sep 22 08:39:13 tdm kernel: kthread+0xe4/0xef
    Sep 22 08:39:13 tdm kernel: ? kthread_complete_and_exit+0x1b/0x1b
    Sep 22 08:39:13 tdm kernel: ret_from_fork+0x1f/0x30
    Sep 22 08:39:13 tdm kernel: </TASK>
    Sep 22 08:39:13 tdm kernel: ---[ end trace 0000000000000000 ]---

     

    tdm-diagnostics-20230922-1630.zip
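When several traces need to be compared, the WARNING line is the quickest summary: it names the source file, line, function and module that raised the warning. Below is a small sketch that extracts those fields. The regular expression is written against the WARNING line in the trace above and is an assumption about the format, not a kernel-documented grammar; `parse_warning` is a hypothetical helper name.

```python
import re
from typing import Dict, Optional

# Pattern modelled on the WARNING line in the trace above, e.g.
# "WARNING: CPU: 9 PID: 159 at net/netfilter/nf_conntrack_core.c:1210
#  __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]"
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) "
    r"(?P<function>[^\s+]+)\+\S+"
    r"(?: \[(?P<module>[^\]]+)\])?"
)


def parse_warning(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the fields of a kernel WARNING line, or None if it is not one.

    The "module" entry is None when the line carries no [module] suffix.
    """
    match = WARNING_RE.search(line)
    return match.groupdict() if match else None
```

For the trace above this reports the function `__nf_conntrack_confirm` in module `nf_conntrack`, while the `Workqueue:` line of the same trace shows that the work item being processed was `macvlan_process_broadcast`.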

    Edited by dalben
    Link to comment
    1 hour ago, dalben said:

    Caught the macvlan call trace today.  Attached are diagnostics and the call trace snippet below.  I have Version: 6.12.4 running on the box.

    You have bridging enabled for the docker dedicated NIC, disable it, reboot and the call traces should go away.
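Whether a NIC is currently part of a bridge can also be checked from the running system. This is a rough sketch that relies on standard Linux sysfs behaviour: a bridge device has a `bridge` directory under `/sys/class/net/<name>/`, and an interface enslaved to a bridge has a `brport` directory. The function name `bridge_state` and the interface name `eth1` in the usage note are illustrative assumptions (the latter taken from the "Custom eth1" network mentioned in this thread).

```python
from pathlib import Path


def bridge_state(iface: str, sysfs_net: Path = Path("/sys/class/net")) -> str:
    """Classify `iface` as 'missing', 'bridge', 'bridge-port' or 'plain'."""
    dev = sysfs_net / iface
    if not dev.exists():
        return "missing"
    if (dev / "bridge").is_dir():
        return "bridge"       # the interface is itself a bridge device
    if (dev / "brport").exists():
        return "bridge-port"  # the interface is enslaved to a bridge
    return "plain"
```

With bridging disabled on the Docker-dedicated NIC, `bridge_state("eth1")` would be expected to return `"plain"`; a result of `"bridge-port"` suggests the setting has not taken effect yet. The `sysfs_net` parameter exists only so the function can be pointed at a test directory.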

    Link to comment

    Thanks.

     

    OK, did that, seems up and running.  Mild heart attack when the containers wouldn't start. It seems that even though they had Custom eth1 and the correct IP address in the settings, the container needed me to make a change to actually set that.  Just deleting the last number of the IP and re-entering it, then Apply, and all came good

     

    Some of my containers now seem rate limited.  What was at least a 50MB download speed before is now around 10MB.  But I also upgraded to the latest Unifi network client around the same time, so I'll need to roll that back to see what's causing this issue.

    Link to comment




