alturismo

Report Comments posted by alturismo

  1. 10 hours ago, bonienl said:

    There are no code changes for the dashboard that would explain a different behavior.

     

    22 Dockers here and, like @sonic6 mentioned, there is definitely a changed behaviour since 6.12.4 rc13

     

    no changes here besides the network setup due to the macvlan changes; same dockers, same VMs, same plugins, same hardware.

     

    turned off bridging

    changed dockers to eth0

    changed VMs to vhost

     

    nothing to worry about as it's only cosmetic, but it used to be more or less instant here and now we have the delay; just confirming @sonic6's post.

  2. 6 hours ago, ljm42 said:

    This was just a side conversation... I see a lot of "I can't do XYZ because of Fritzbox" so I'm just wondering if people have the ability to replace it with a better router or if they are locked in to using it.

    generally speaking, sure, we are free to use our own hardware.

     

    and now maybe the one catch: Fritz is more or less the only one who makes decent cable modem/routers ... and as there are by now many cable users here, the only option would be a dual setup (2 routers ...).

     

    which really doesn't fit the times anymore considering energy consumption and so on ... even i thought about it to finally get rid of this issue now ... i spent 2 intense months figuring it out ;)

     

    also, Fritz has a wide range of DECT smart home accessories (in use here too ...), so basically, yes: either i'm locked in to the Fritz "appliances", or i drop everything, keep a Fritz as modem only and get another router, hoping it can handle ipvlan properly ... while i was surprised that UniFi also has firewall issues in some combinations ... and when i think about it, a lot of hardware somehow ties networking to MAC addresses and not to IPs only, so personally i wouldn't even know what to buy now ...

     

    the 2nd NIC solution has been posted a few times already, sadly with no diags afaik ... if it weren't such a mess to reconfigure everything i could also test it again, but for now, after relying on 6.11 for a long time, i have switched almost all dockers to bridge mode, and for the ones where that is impossible i set up separate LXC containers ...

     

    when i find some spare time i'll report the 2nd NIC results here; sadly i don't have a 2 NIC setup in my small Test Server ... otherwise it would already be posted ;)

  3. maybe a small update: after 2 months of uptime on 6.11.5 on my main Server, i changed all dockers to bridge mode and ipvlan, and as expected all traces are gone on 6.12.2 as well, so the hardware is fine; it's definitely a software issue in unraid that came up around ~6.12beta7 ...

     

    changing to bridge btw is really a pain with some services like tvheadend (satip usage), homeassistant (discovery), ... and so on, so not really a final solution ... sure, one can also run a mixed mode (some in host, some in bridge), but that also only works up to a certain point ... as a simple example, dlna services all use port 1900 ...

     

    as a note: as more and more people are upgrading, running into issues and not even knowing why their servers are crashing (macvlan) or why their services have connectivity issues (ipvlan on some router hardware), it would be really nice to look further into it.

     

    maybe also as a note, we have the first issues coming up with UniFi routers and firewall rules, as there too the MAC address is what counts and ipvlan is not really a solution ...

     

    just as a friendly reminder that this is still an issue ... ;) sadly only a few people are reporting it; they return to 6.11 for now as they think it's just a bug which will be fixed soon.

  4. 1 hour ago, JorgeB said:

    @alturismo just to confirm, if bridge is enabled, you see the call traces even without any VM using the bridge at the same time correct?

    the small Test Server doesn't even have VMs enabled, so it's not VM related (which is what i first thought on my Main Server)

     

    and if we talk about bridge, macvlan br0 mode is the one which is causing the issues ...

     

    sample (now from the 6.11.5 machine which is working) from a docker setup

    [attached screenshot]

     

    the 6.12rc Test Server has now been running fine for 7+ days with the eth0 setup ...

    [attached screenshot]

     

    if i use br0 mode on 6.12rc (2 local machines, totally different hardware) the servers will always start up with the posted errors and will always crash after a few hours or a few days (max was 4 days or so, without traffic on the Server); it doesn't matter if VM is on, off, active, disabled, ...

     

    hope it's understandable ;)

  5. On 5/7/2023 at 8:36 PM, bonienl said:

    Thanks for testing.

    For the moment, it seems the only reliable macvlan usage is without bridging enabled.

    i think for now i can confirm this

     

    with more traffic on the machine (small test server with eth0 setup) i currently get 0 errors.

     

    root@AlsServerII:~# uptime 
     04:38:23 up 4 days, 15:18,  0 users,  load average: 0.03, 0.07, 0.02
    root@AlsServerII:~#

     

    nginx had some issues only once, but i assume that's more related to having only 8 GB RAM ;) not related to this topic.

     

    syslog attached just for comparison.

     

    as a note, here is the main Server, downgraded to 6.11.5, with the "normal" bridged macvlan setup.

    root@AlsServer:~# uptime 
     04:41:35 up 20 days, 20:42,  0 users,  load average: 0.45, 0.33, 0.29
    root@AlsServer:~# 

    also no more issues since the downgrade ... ;)

    alsserverii-syslog-20230513-0237.zip

  6. 9 hours ago, bonienl said:

    For the moment, it seems the only reliable macvlan usage is without bridging enabled.

     

    ok, thanks for taking the time.

     

    just as a small note, i rebooted yesterday as described and left it completely idle ... no errors in the log.

     

    now i just restarted a docker and put a little traffic on tvheadend and ... here is the snippet from starting dockerd yesterday until today ...

     

    May  7 20:40:36 AlsServerII root: starting dockerd ...
    May  7 20:40:38 AlsServerII rc.docker: created network br0 with subnets: 192.168.1.0/24; 2a02:810b:56bf:dc30::/64; 
    May  7 20:40:49 AlsServerII kernel: eth0: renamed from veth49a7b60
    May  7 20:40:49 AlsServerII kernel: device br0 entered promiscuous mode
    May  7 20:59:04 AlsServerII emhttpd: spinning down /dev/sdb
    May  7 21:19:48 AlsServerII emhttpd: read SMART /dev/sdb
    May  7 21:36:49 AlsServerII emhttpd: spinning down /dev/sdb
    May  8 00:10:04 AlsServerII emhttpd: read SMART /dev/sdb
    May  8 00:27:00 AlsServerII emhttpd: spinning down /dev/sdb
    May  8 01:00:01 AlsServerII Docker Auto Update: Community Applications Docker Autoupdate running
    May  8 01:00:01 AlsServerII Docker Auto Update: Checking for available updates
    May  8 01:00:04 AlsServerII emhttpd: read SMART /dev/sdb
    May  8 01:00:09 AlsServerII Docker Auto Update: No updates will be installed
    May  8 01:16:51 AlsServerII emhttpd: spinning down /dev/sdb
    May  8 01:20:01 AlsServerII Plugin Auto Update: Checking for available plugin updates
    May  8 01:20:08 AlsServerII Plugin Auto Update: Auto Updating community.applications.plg
    May  8 01:20:09 AlsServerII root: plugin: running: anonymous
    May  8 01:20:09 AlsServerII root: plugin: running: anonymous
    May  8 01:20:09 AlsServerII root: plugin: creating: /boot/config/plugins/community.applications/community.applications-2023.05.07a-x86_64-1.txz - downloading from URL https://raw.githubusercontent.com/Squidly271/community.applications/master/archive/community.applications-2023.05.07a-x86_64-1.txz
    May  8 01:20:10 AlsServerII root: plugin: checking: /boot/config/plugins/community.applications/community.applications-2023.05.07a-x86_64-1.txz - MD5
    May  8 01:20:10 AlsServerII root: plugin: running: upgradepkg --install-new --reinstall /boot/config/plugins/community.applications/community.applications-2023.05.07a-x86_64-1.txz
    May  8 01:20:10 AlsServerII root: plugin: running: anonymous
    May  8 01:20:10 AlsServerII root: plugin: community.applications.plg updated
    May  8 01:20:12 AlsServerII Plugin Auto Update: Auto Updating parity.check.tuning.plg
    May  8 01:20:12 AlsServerII root: plugin: running: anonymous
    May  8 01:20:12 AlsServerII root: plugin: creating: /boot/config/plugins/parity.check.tuning/parity.check.tuning-2023.05.07.txz - downloading from URL https://raw.githubusercontent.com/itimpi/parity.check.tuning/master/archives/parity.check.tuning-2023.05.07.txz
    May  8 01:20:12 AlsServerII root: plugin: running: upgradepkg --install-new /boot/config/plugins/parity.check.tuning/parity.check.tuning-2023.05.07.txz
    May  8 01:20:13 AlsServerII root: plugin: running: anonymous
    May  8 01:20:13 AlsServerII root: plugin: parity.check.tuning.plg updated
    May  8 01:20:14 AlsServerII Plugin Auto Update: Checking for language updates
    May  8 01:20:15 AlsServerII Plugin Auto Update: Community Applications Plugin Auto Update finished
    May  8 05:57:33 AlsServerII emhttpd: read SMART /dev/sdb
    May  8 06:06:58 AlsServerII kernel: device br0 left promiscuous mode
    May  8 06:06:58 AlsServerII kernel: veth49a7b60: renamed from eth0
    May  8 06:06:59 AlsServerII kernel: eth0: renamed from veth8d82cef
    May  8 06:06:59 AlsServerII kernel: device br0 entered promiscuous mode
    May  8 06:26:33 AlsServerII kernel: ------------[ cut here ]------------
    May  8 06:26:33 AlsServerII kernel: WARNING: CPU: 1 PID: 15585 at net/netfilter/nf_conntrack_core.c:1211 __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
    May  8 06:26:33 AlsServerII kernel: Modules linked in: xt_nat xt_tcpudp macvlan bluetooth ecdh_generic ecc cmac cifs asn1_decoder cifs_arc4 cifs_md4 oid_registry dns_resolver xt_conntrack nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat xt_addrtype br_netfilter xfs xt_MASQUERADE ip6table_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 md_mod tcp_diag inet_diag nct6775 nct6775_core hwmon_vid efivarfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables 8021q garp mrp bridge stp llc x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel i915 kvm iosf_mbi drm_buddy i2c_algo_bit ttm drm_display_helper crct10dif_pclmul crc32_pclmul crc32c_intel drm_kms_helper ghash_clmulni_intel sha512_ssse3 mei_pxp mei_hdcp drm aesni_intel crypto_simd intel_gtt cryptd rapl i2c_i801 agpgart intel_cstate mei_me i2c_smbus r8169 i2c_core ahci realtek libahci mei syscopyarea sysfillrect sysimgblt fb_sys_fops thermal fan button video wmi backlight intel_pmc_core unix
    May  8 06:26:33 AlsServerII kernel: CPU: 1 PID: 15585 Comm: kworker/u8:2 Not tainted 6.1.27-Unraid #1
    May  8 06:26:33 AlsServerII kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J3355M, BIOS P1.90 11/27/2018
    May  8 06:26:33 AlsServerII kernel: Workqueue: events_unbound macvlan_process_broadcast [macvlan]
    May  8 06:26:33 AlsServerII kernel: RIP: 0010:__nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
    May  8 06:26:33 AlsServerII kernel: Code: 44 24 10 e8 f4 e1 ff ff 8b 7c 24 04 89 ea 89 c6 89 04 24 e8 76 e6 ff ff 84 c0 75 a2 48 89 df e8 ad e2 ff ff 85 c0 89 c5 74 18 <0f> 0b 8b 34 24 8b 7c 24 04 e8 2a dd ff ff e8 8b e3 ff ff e9 72 01
    May  8 06:26:33 AlsServerII kernel: RSP: 0018:ffffc900000fcd98 EFLAGS: 00010202
    May  8 06:26:33 AlsServerII kernel: RAX: 0000000000000001 RBX: ffff888147964300 RCX: df117f1db0e5f651
    May  8 06:26:33 AlsServerII kernel: RDX: 0000000000000000 RSI: 0000000000000112 RDI: ffff888147964300
    May  8 06:26:33 AlsServerII kernel: RBP: 0000000000000001 R08: 4bd3c8213a28cba8 R09: 24428f35b7be7bf4
    May  8 06:26:33 AlsServerII kernel: R10: 90e86a8e9e578041 R11: ffffc900000fcd60 R12: ffffffff82a0e440
    May  8 06:26:33 AlsServerII kernel: R13: 000000000000c6ab R14: ffff88815a192500 R15: 0000000000000000
    May  8 06:26:33 AlsServerII kernel: FS:  0000000000000000(0000) GS:ffff888277e80000(0000) knlGS:0000000000000000
    May  8 06:26:33 AlsServerII kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    May  8 06:26:33 AlsServerII kernel: CR2: 00001456add1e1e0 CR3: 000000000420a000 CR4: 00000000003506e0
    May  8 06:26:33 AlsServerII kernel: Call Trace:
    May  8 06:26:33 AlsServerII kernel: <IRQ>
    May  8 06:26:33 AlsServerII kernel: ? nf_nat_inet_fn+0x60/0x1a8 [nf_nat]
    May  8 06:26:33 AlsServerII kernel: nf_conntrack_confirm+0x25/0x54 [nf_conntrack]
    May  8 06:26:33 AlsServerII kernel: nf_hook_slow+0x3a/0x96
    May  8 06:26:33 AlsServerII kernel: ? ip_protocol_deliver_rcu+0x164/0x164
    May  8 06:26:33 AlsServerII kernel: NF_HOOK.constprop.0+0x79/0xd9
    May  8 06:26:33 AlsServerII kernel: ? ip_protocol_deliver_rcu+0x164/0x164
    May  8 06:26:33 AlsServerII kernel: __netif_receive_skb_one_core+0x77/0x9c
    May  8 06:26:33 AlsServerII kernel: process_backlog+0x8c/0x116
    May  8 06:26:33 AlsServerII kernel: __napi_poll.constprop.0+0x28/0x124
    May  8 06:26:33 AlsServerII kernel: net_rx_action+0x159/0x24f
    May  8 06:26:33 AlsServerII kernel: __do_softirq+0x126/0x288
    May  8 06:26:33 AlsServerII kernel: do_softirq+0x7f/0xab
    May  8 06:26:33 AlsServerII kernel: </IRQ>
    May  8 06:26:33 AlsServerII kernel: <TASK>
    May  8 06:26:33 AlsServerII kernel: __local_bh_enable_ip+0x4c/0x6b
    May  8 06:26:33 AlsServerII kernel: netif_rx+0x52/0x5a
    May  8 06:26:33 AlsServerII kernel: macvlan_broadcast+0x10a/0x150 [macvlan]
    May  8 06:26:33 AlsServerII kernel: macvlan_process_broadcast+0xbc/0x12f [macvlan]
    May  8 06:26:33 AlsServerII kernel: process_one_work+0x1a8/0x295
    May  8 06:26:33 AlsServerII kernel: worker_thread+0x18b/0x244
    May  8 06:26:33 AlsServerII kernel: ? rescuer_thread+0x281/0x281
    May  8 06:26:33 AlsServerII kernel: kthread+0xe4/0xef
    May  8 06:26:33 AlsServerII kernel: ? kthread_complete_and_exit+0x1b/0x1b
    May  8 06:26:33 AlsServerII kernel: ret_from_fork+0x1f/0x30
    May  8 06:26:33 AlsServerII kernel: </TASK>
    May  8 06:26:33 AlsServerII kernel: ---[ end trace 0000000000000000 ]---

     

    so in the end, traffic on this specific network is causing it (and probably also speeding it up), and then something crashes.

     

    i'll now try the eth0 method (no bridging) again and recheck this; last time the server completely crashed there too (but differently).
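To put a number on "restart a docker, add traffic, wait for the trace", here is a minimal Python sketch that measures the time between the last `entered promiscuous mode` line and the next kernel `cut here` marker. The function names are hypothetical, and the year 2023 is an assumption taken from the attachment names, since the syslog lines themselves carry no year.

```python
from datetime import datetime


def parse_stamp(line, year=2023):
    """Parse the 'Mon D HH:MM:SS' prefix of a syslog line.

    The year is an assumption (syslog lines omit it).
    """
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def delay_to_first_warning(lines, year=2023):
    """Time from the last 'entered promiscuous mode' line to the first
    kernel 'cut here' marker that follows it, or None if there is none."""
    started = None
    for line in lines:
        if "entered promiscuous mode" in line:
            started = parse_stamp(line, year)
        elif "[ cut here ]" in line and started is not None:
            return parse_stamp(line, year) - started
    return None


# Two lines copied from the snippet above
sample = [
    "May  8 06:06:59 AlsServerII kernel: device br0 entered promiscuous mode",
    "May  8 06:26:33 AlsServerII kernel: ------------[ cut here ]------------",
]
print(delay_to_first_warning(sample))  # 0:19:34
```

For the snippet above this gives a bit under 20 minutes between the docker restart and the first trace.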

  7. ok, Server hard crashed again

     

    i put some traffic on it yesterday, some today ... now i wanted to check the logs again and it's down.

     

    @bonienl sadly the "change" didn't work out, i'll give it another try with host access off (i turned it on with the last change)

     

    only 1 macvlan error was visible, on 20230506 at ~2:10 am (idle), nothing more to see here ...

     

    pretty sure it went offline here with 0 traffic (just a HA meter, but ...)

     

    [attached screenshot]

     

    no web, smb, ssh, docker(s), ... in short, dead ;)

  8. 11 hours ago, bonienl said:

    This is indeed a different error, caused by a memory allocation error when loading a module

     

    definitely, the crash described above "borked" the tvh docker (lsio, alpine based), which pretty sure triggered this error; i redid the tvh docker to make sure, so i think we can drop this for now.

     

    about the "regular macvlan errors", one more came up while idle

     

    May  4 15:52:43 AlsServerII kernel: ret_from_fork+0x1f/0x30
    May  4 15:52:43 AlsServerII kernel: </TASK>
    May  4 15:52:43 AlsServerII kernel: ---[ end trace 0000000000000000 ]---
    May  4 16:35:57 AlsServerII emhttpd: read SMART /dev/sdb
    May  4 16:52:51 AlsServerII emhttpd: spinning down /dev/sdb
    May  4 17:37:44 AlsServerII webGUI: Successful login user root from 192.168.1.83
    May  4 17:49:43 AlsServerII kernel: ------------[ cut here ]------------
    May  4 17:49:43 AlsServerII kernel: WARNING: CPU: 1 PID: 22408 at net/netfilter/nf_nat_core.c:594 nf_nat_setup_info+0x8c/0x7d1 [nf_nat]
    May  4 17:49:43 AlsServerII kernel: Modules linked in: wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha xt_nat xt_tcpudp macvlan md_mod bluetooth ecdh_generic ecc cmac cifs asn1_decoder cifs_arc4 cifs_md4 oid_registry dns_resolver xt_conntrack nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat xt_addrtype br_netfilter bridge xfs xt_MASQUERADE ip6table_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tcp_diag inet_diag nct6775 nct6775_core hwmon_vid efivarfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables 8021q garp mrp stp llc x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel i915 kvm iosf_mbi drm_buddy i2c_algo_bit ttm drm_display_helper crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 drm_kms_helper drm aesni_intel mei_pxp mei_hdcp crypto_simd intel_gtt i2c_i801 agpgart cryptd rapl i2c_smbus intel_cstate mei_me mei i2c_core ahci r8169 libahci realtek
    May  4 17:49:43 AlsServerII kernel: syscopyarea sysfillrect sysimgblt fb_sys_fops thermal fan button video wmi backlight intel_pmc_core unix [last unloaded: md_mod]
    May  4 17:49:43 AlsServerII kernel: CPU: 1 PID: 22408 Comm: kworker/u8:1 Tainted: G        W          6.1.27-Unraid #1
    May  4 17:49:43 AlsServerII kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J3355M, BIOS P1.90 11/27/2018
    May  4 17:49:43 AlsServerII kernel: Workqueue: events_unbound macvlan_process_broadcast [macvlan]
    May  4 17:49:43 AlsServerII kernel: RIP: 0010:nf_nat_setup_info+0x8c/0x7d1 [nf_nat]
    May  4 17:49:43 AlsServerII kernel: Code: a8 80 75 26 48 8d 73 58 48 8d 7c 24 20 e8 18 db fc ff 48 8d 43 0c 4c 8b bb 88 00 00 00 48 89 44 24 18 eb 54 0f ba e0 08 73 07 <0f> 0b e9 75 06 00 00 48 8d 73 58 48 8d 7c 24 20 e8 eb da fc ff 48
    May  4 17:49:43 AlsServerII kernel: RSP: 0000:ffffc900000fcc78 EFLAGS: 00010282
    May  4 17:49:43 AlsServerII kernel: RAX: 0000000000000180 RBX: ffff888162b3b000 RCX: ffff888106097900
    May  4 17:49:43 AlsServerII kernel: RDX: 0000000000000000 RSI: ffffc900000fcd5c RDI: ffff888162b3b000
    May  4 17:49:43 AlsServerII kernel: RBP: ffffc900000fcd40 R08: 000000000d01a8c0 R09: 0000000000000000
    May  4 17:49:43 AlsServerII kernel: R10: 0000000000000098 R11: 0000000000000000 R12: ffffc900000fcd5c

     

    tomorrow (maybe today) i'll have more time to look further into it, put traffic on it etc.
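Since the traces in this thread do not all fire at the same place (`nf_conntrack_core.c:1211` vs `nf_nat_core.c:594`), a small tally helps to see which warning site dominates in a log. This is only a sketch: the regex is my assumption based on the `WARNING:` lines posted here, and `warning_sites` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches kernel lines like:
#   WARNING: CPU: 1 PID: 31711 at <file>:<line> <function>+0x...
WARNING_RE = re.compile(r"WARNING: CPU: \d+ PID: \d+ at (\S+):(\d+) (\w+)\+")


def warning_sites(lines):
    """Count kernel WARNING lines per (source file, line number, function)."""
    counts = Counter()
    for line in lines:
        match = WARNING_RE.search(line)
        if match:
            site = (match.group(1), int(match.group(2)), match.group(3))
            counts[site] += 1
    return counts


# The two WARNING lines from the May 4 log posted in this thread
sample = [
    "May  4 15:52:43 AlsServerII kernel: WARNING: CPU: 1 PID: 31711 at net/netfilter/nf_conntrack_core.c:1211 __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]",
    "May  4 17:49:43 AlsServerII kernel: WARNING: CPU: 1 PID: 22408 at net/netfilter/nf_nat_core.c:594 nf_nat_setup_info+0x8c/0x7d1 [nf_nat]",
]
sites = warning_sites(sample)
for (source, lineno, func), count in sites.items():
    print(f"{count}x {source}:{lineno} in {func}")
```

Feeding it a whole syslog (for example `warning_sites(open(path))`) should give one row per distinct warning site.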

  9. 7 hours ago, bonienl said:

    Thanks for testing, we will further investigate what happened, based on your syslog

     

    ok, it took some hours but the 1st error came in; it looks a little "different" overall, though.

     

    can't tell for sure yet as there was 0 traffic on the machine; will report back on what is happening.

     

    May  4 06:08:26 AlsServerII kernel: eth0: renamed from veth45e1ae9
    May  4 06:14:42 AlsServerII kernel: eth0: renamed from veth2247758
    May  4 06:14:42 AlsServerII kernel: wireguard: WireGuard 1.0.0 loaded. See www.wireguard.com for information.
    May  4 06:14:42 AlsServerII kernel: wireguard: Copyright (C) 2015-2019 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
    May  4 06:25:56 AlsServerII kernel: veth79d5f07: renamed from eth0
    May  4 06:26:19 AlsServerII kernel: eth0: renamed from vethabedd9a
    May  4 06:31:38 AlsServerII kernel: traps: tvh:tcp-start[13438] general protection fault ip:15131a06d5be sp:15130a8da8b8 error:0 in ld-musl-x86_64.so.1[15131a05c000+4b000]
    May  4 06:31:53 AlsServerII kernel: tvh:tcp-start[20663]: segfault at 10 ip 000014bcf57765be sp 000014bce69ee8b8 error 4 in ld-musl-x86_64.so.1[14bcf5765000+4b000] likely on CPU 0 (core 0, socket 0)
    May  4 06:31:53 AlsServerII kernel: Code: 80 7f fc 00 74 12 85 d2 74 01 f4 48 63 57 f8 81 fa ff ff 00 00 7f 01 f4 89 d0 c1 e0 04 48 98 48 29 c7 48 8b 47 f0 48 83 ef 10 <48> 39 78 10 74 01 f4 40 8a 78 20 83 e7 1f 39 f7 7d 01 f4 8b 78 18
    May  4 06:32:19 AlsServerII kernel: traps: tvh:tcp-start[21116] general protection fault ip:145de58625be sp:145ddaf448b8 error:0 in ld-musl-x86_64.so.1[145de5851000+4b000]
    May  4 06:38:32 AlsServerII kernel: vethabedd9a: renamed from eth0
    May  4 07:10:20 AlsServerII emhttpd: spinning down /dev/sdb
    May  4 15:52:43 AlsServerII kernel: ------------[ cut here ]------------
    May  4 15:52:43 AlsServerII kernel: WARNING: CPU: 1 PID: 31711 at net/netfilter/nf_conntrack_core.c:1211 __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
    May  4 15:52:43 AlsServerII kernel: Modules linked in: wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha xt_nat xt_tcpudp macvlan md_mod bluetooth ecdh_generic ecc cmac cifs asn1_decoder cifs_arc4 cifs_md4 oid_registry dns_resolver xt_conntrack nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat xt_addrtype br_netfilter bridge xfs xt_MASQUERADE ip6table_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tcp_diag inet_diag nct6775 nct6775_core hwmon_vid efivarfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables 8021q garp mrp stp llc x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel i915 kvm iosf_mbi drm_buddy i2c_algo_bit ttm drm_display_helper crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 drm_kms_helper drm aesni_intel mei_pxp mei_hdcp crypto_simd intel_gtt i2c_i801 agpgart cryptd rapl i2c_smbus intel_cstate mei_me mei i2c_core ahci r8169 libahci realtek
    May  4 15:52:43 AlsServerII kernel: syscopyarea sysfillrect sysimgblt fb_sys_fops thermal fan button video wmi backlight intel_pmc_core unix [last unloaded: md_mod]
    May  4 15:52:43 AlsServerII kernel: CPU: 1 PID: 31711 Comm: kworker/u8:0 Not tainted 6.1.27-Unraid #1
    May  4 15:52:43 AlsServerII kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J3355M, BIOS P1.90 11/27/2018
    May  4 15:52:43 AlsServerII kernel: Workqueue: events_unbound macvlan_process_broadcast [macvlan]
    May  4 15:52:43 AlsServerII kernel: RIP: 0010:__nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
    May  4 15:52:43 AlsServerII kernel: Code: 44 24 10 e8 f4 e1 ff ff 8b 7c 24 04 89 ea 89 c6 89 04 24 e8 76 e6 ff ff 84 c0 75 a2 48 89 df e8 ad e2 ff ff 85 c0 89 c5 74 18 <0f> 0b 8b 34 24 8b 7c 24 04 e8 2a dd ff ff e8 8b e3 ff ff e9 72 01
    May  4 15:52:43 AlsServerII kernel: RSP: 0018:ffffc900000fcd98 EFLAGS: 00010202
    May  4 15:52:43 AlsServerII kernel: RAX: 0000000000000001 RBX: ffff8881604b6900 RCX: 090a13e34320926a
    May  4 15:52:43 AlsServerII kernel: RDX: 0000000000000000 RSI: 00000000000003af RDI: ffff8881604b6900
    May  4 15:52:43 AlsServerII kernel: RBP: 0000000000000001 R08: c480d98a4a69a559 R09: 6faa12ec36c91b29
    May  4 15:52:43 AlsServerII kernel: R10: 4866b6af83806fd2 R11: ffffc900000fcd60 R12: ffffffff82a0e440
    May  4 15:52:43 AlsServerII kernel: R13: 0000000000035f7d R14: ffff8881714cf700 R15: 0000000000000000
    May  4 15:52:43 AlsServerII kernel: FS:  0000000000000000(0000) GS:ffff888277e80000(0000) knlGS:0000000000000000
    May  4 15:52:43 AlsServerII kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    May  4 15:52:43 AlsServerII kernel: CR2: 0000237e6db86000 CR3: 000000000420a000 CR4: 00000000003506e0
    May  4 15:52:43 AlsServerII kernel: Call Trace:
    May  4 15:52:43 AlsServerII kernel: <IRQ>
    May  4 15:52:43 AlsServerII kernel: ? nf_nat_inet_fn+0x60/0x1a8 [nf_nat]
    May  4 15:52:43 AlsServerII kernel: nf_conntrack_confirm+0x25/0x54 [nf_conntrack]
    May  4 15:52:43 AlsServerII kernel: nf_hook_slow+0x3a/0x96
    May  4 15:52:43 AlsServerII kernel: ? ip_protocol_deliver_rcu+0x164/0x164
    May  4 15:52:43 AlsServerII kernel: NF_HOOK.constprop.0+0x79/0xd9
    May  4 15:52:43 AlsServerII kernel: ? ip_protocol_deliver_rcu+0x164/0x164
    May  4 15:52:43 AlsServerII kernel: __netif_receive_skb_one_core+0x77/0x9c
    May  4 15:52:43 AlsServerII kernel: process_backlog+0x8c/0x116
    May  4 15:52:43 AlsServerII kernel: __napi_poll.constprop.0+0x28/0x124
    May  4 15:52:43 AlsServerII kernel: net_rx_action+0x159/0x24f
    May  4 15:52:43 AlsServerII kernel: ? swake_up_one+0x1a/0x27
    May  4 15:52:43 AlsServerII kernel: __do_softirq+0x126/0x288
    May  4 15:52:43 AlsServerII kernel: do_softirq+0x7f/0xab
    May  4 15:52:43 AlsServerII kernel: </IRQ>
    May  4 15:52:43 AlsServerII kernel: <TASK>
    May  4 15:52:43 AlsServerII kernel: __local_bh_enable_ip+0x4c/0x6b
    May  4 15:52:43 AlsServerII kernel: netif_rx+0x52/0x5a
    May  4 15:52:43 AlsServerII kernel: macvlan_broadcast+0x10a/0x150 [macvlan]
    May  4 15:52:43 AlsServerII kernel: macvlan_process_broadcast+0xbc/0x12f [macvlan]
    May  4 15:52:43 AlsServerII kernel: process_one_work+0x1a8/0x295
    May  4 15:52:43 AlsServerII kernel: worker_thread+0x18b/0x244
    May  4 15:52:43 AlsServerII kernel: ? rescuer_thread+0x281/0x281
    May  4 15:52:43 AlsServerII kernel: kthread+0xe4/0xef
    May  4 15:52:43 AlsServerII kernel: ? kthread_complete_and_exit+0x1b/0x1b
    May  4 15:52:43 AlsServerII kernel: ret_from_fork+0x1f/0x30
    May  4 15:52:43 AlsServerII kernel: </TASK>
    May  4 15:52:43 AlsServerII kernel: ---[ end trace 0000000000000000 ]---
    May  4 16:35:57 AlsServerII emhttpd: read SMART /dev/sdb
    May  4 16:52:51 AlsServerII emhttpd: spinning down /dev/sdb

     

  10. 7 hours ago, bonienl said:

    @alturismo would you mind to do another test?

     

    sure. 1st feedback: somehow the little Server crashed completely last night while i left it in its last state ;)

     

    really weird, the syslog was "cut" ... only the failure entries were left, and a reboot was required as the array didn't stop properly anymore (stuck in stopping state), never mind.

     

    rebooted now, applied the changes and it's looking ok for now; i'll let it run a little and give feedback, currently no macvlan errors in the logs.

    alsserverii-syslog-20230504-0330.zip

  11. 40 minutes ago, bonienl said:

    Letting macvlan operate directly on the interface should take away this conflict.

     

    ok, for now it's looking good after a few minutes; usually the 1st error comes after a few seconds when i put some load on the interface ... so for now, it looks like your suspicion could be correct ;)

     

    thanks for taking a look into it.

     

    may i ask what the downside of this setup is? (not using a bridge)

  12. as quick & dirty ... ;)

     

    i enabled 1 Docker on the small machine with br0 while host access is off

     

    [attached screenshot]

     

    [attached screenshot]

     

    now, to speed things up a little, i started a stream in tvheadend (put traffic on ...)

     

    [attached screenshot]

     

    few seconds later ...

     

    May  3 11:03:16 AlsServerII kernel: ------------[ cut here ]------------
    May  3 11:03:16 AlsServerII kernel: WARNING: CPU: 1 PID: 12856 at net/netfilter/nf_conntrack_core.c:1211 __nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
    May  3 11:03:16 AlsServerII kernel: Modules linked in: macvlan bluetooth ecdh_generic ecc cmac cifs asn1_decoder cifs_arc4 cifs_md4 oid_registry dns_resolver wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha xt_nat xt_tcpudp veth xt_conntrack nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat xt_addrtype br_netfilter xfs xt_MASQUERADE ip6table_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 md_mod tcp_diag inet_diag nct6775 nct6775_core hwmon_vid efivarfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables 8021q garp mrp bridge stp llc i915 x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul iosf_mbi drm_buddy i2c_algo_bit crc32_pclmul ttm crc32c_intel drm_display_helper ghash_clmulni_intel drm_kms_helper sha512_ssse3 mei_hdcp mei_pxp drm aesni_intel crypto_simd cryptd r8169 mei_me rapl intel_cstate i2c_i801 i2c_smbus realtek ahci intel_gtt libahci agpgart i2c_core mei
    May  3 11:03:16 AlsServerII kernel: syscopyarea sysfillrect sysimgblt fb_sys_fops thermal fan button video wmi backlight intel_pmc_core unix
    May  3 11:03:16 AlsServerII kernel: CPU: 1 PID: 12856 Comm: kworker/u8:0 Not tainted 6.1.26-Unraid #1
    May  3 11:03:16 AlsServerII kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./J3355M, BIOS P1.90 11/27/2018
    May  3 11:03:16 AlsServerII kernel: Workqueue: events_unbound macvlan_process_broadcast [macvlan]
    May  3 11:03:16 AlsServerII kernel: RIP: 0010:__nf_conntrack_confirm+0xa4/0x2b0 [nf_conntrack]
    May  3 11:03:16 AlsServerII kernel: Code: 44 24 10 e8 f4 e1 ff ff 8b 7c 24 04 89 ea 89 c6 89 04 24 e8 76 e6 ff ff 84 c0 75 a2 48 89 df e8 ad e2 ff ff 85 c0 89 c5 74 18 <0f> 0b 8b 34 24 8b 7c 24 04 e8 2a dd ff ff e8 8b e3 ff ff e9 72 01
    May  3 11:03:16 AlsServerII kernel: RSP: 0018:ffffc900000fcd98 EFLAGS: 00010202
    May  3 11:03:16 AlsServerII kernel: RAX: 0000000000000001 RBX: ffff88817479c300 RCX: 321b95d7a64d7c4c
    May  3 11:03:16 AlsServerII kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff88817479c300
    May  3 11:03:16 AlsServerII kernel: RBP: 0000000000000001 R08: edd6cfc8a928aa6e R09: c9caecd99da36003
    May  3 11:03:16 AlsServerII kernel: R10: 7fcf9720518d109e R11: ffffc900000fcd60 R12: ffffffff82a0e440
    May  3 11:03:16 AlsServerII kernel: R13: 000000000000f8e1 R14: ffff8881029f3400 R15: 0000000000000000
    May  3 11:03:16 AlsServerII kernel: FS:  0000000000000000(0000) GS:ffff888277e80000(0000) knlGS:0000000000000000
    May  3 11:03:16 AlsServerII kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    May  3 11:03:16 AlsServerII kernel: CR2: 00007fff8710b9b8 CR3: 0000000213a78000 CR4: 00000000003506e0
    May  3 11:03:16 AlsServerII kernel: Call Trace:
    May  3 11:03:16 AlsServerII kernel: <IRQ>
    May  3 11:03:16 AlsServerII kernel: ? nf_nat_inet_fn+0x123/0x1a8 [nf_nat]
    May  3 11:03:16 AlsServerII kernel: nf_conntrack_confirm+0x25/0x54 [nf_conntrack]
    May  3 11:03:16 AlsServerII kernel: nf_hook_slow+0x3a/0x96
    May  3 11:03:16 AlsServerII kernel: ? ip_protocol_deliver_rcu+0x164/0x164
    May  3 11:03:16 AlsServerII kernel: NF_HOOK.constprop.0+0x79/0xd9
    May  3 11:03:16 AlsServerII kernel: ? ip_protocol_deliver_rcu+0x164/0x164
    May  3 11:03:16 AlsServerII kernel: __netif_receive_skb_one_core+0x77/0x9c
    May  3 11:03:16 AlsServerII kernel: process_backlog+0x8c/0x116
    May  3 11:03:16 AlsServerII kernel: __napi_poll.constprop.0+0x28/0x124
    May  3 11:03:16 AlsServerII kernel: net_rx_action+0x159/0x24f
    May  3 11:03:16 AlsServerII kernel: __do_softirq+0x126/0x288
    May  3 11:03:16 AlsServerII kernel: do_softirq+0x7f/0xab
    May  3 11:03:16 AlsServerII kernel: </IRQ>
    May  3 11:03:16 AlsServerII kernel: <TASK>
    May  3 11:03:16 AlsServerII kernel: __local_bh_enable_ip+0x4c/0x6b
    May  3 11:03:16 AlsServerII kernel: netif_rx+0x52/0x5a
    May  3 11:03:16 AlsServerII kernel: macvlan_broadcast+0x10a/0x150 [macvlan]
    May  3 11:03:16 AlsServerII kernel: ? _raw_spin_unlock+0x14/0x29
    May  3 11:03:16 AlsServerII kernel: macvlan_process_broadcast+0xbc/0x12f [macvlan]
    May  3 11:03:16 AlsServerII kernel: process_one_work+0x1a8/0x295
    May  3 11:03:16 AlsServerII kernel: worker_thread+0x18b/0x244
    May  3 11:03:16 AlsServerII kernel: ? rescuer_thread+0x281/0x281
    May  3 11:03:16 AlsServerII kernel: kthread+0xe4/0xef
    May  3 11:03:16 AlsServerII kernel: ? kthread_complete_and_exit+0x1b/0x1b
    May  3 11:03:16 AlsServerII kernel: ret_from_fork+0x1f/0x30
    May  3 11:03:16 AlsServerII kernel: </TASK>
    May  3 11:03:16 AlsServerII kernel: ---[ end trace 0000000000000000 ]---
    May  3 11:05:11 AlsServerII sshd[32630]: Connection from 192.168.1.96 port 59534 on 192.168.1.4 port 22 rdomain ""
    May  3 11:05:11 AlsServerII sshd[32630]: Accepted password for root from 192.168.1.96 port 59534 ssh2
    May  3 11:05:11 AlsServerII sshd[32630]: pam_unix(sshd:session): session opened for user root(uid=0) by (uid=0)
    May  3 11:05:11 AlsServerII elogind-daemon[1051]: New session c2 of user root.
    May  3 11:05:11 AlsServerII sshd[32630]: Starting session: shell on pts/0 for root from 192.168.1.96 port 59534 id 0
    root@AlsServerII:~# 

     

    which is the beginning of the end ... ;)

  13. 3 minutes ago, JorgeB said:

    Looks like it doesn't help but just to confirm did you also test with "Host access to custom networks" disabled?

    actually i can't remember, as i tested all kinds of scenarios ... i still have the mini server up and can make a small mockup and see what happens, in case you're interested.

     

    In the end this wouldn't be an option, as host access is one of the major aspects here ... but i'll let you know.

  14. 11 hours ago, bonienl said:

    Do you have any problems?

     

    actually i can confirm the situation as @sonic6 described it. i made a long test run and the issue started around 6.12 beta 5 - 8 (Feb. 7th here). i have all logs since the beginning of 2021 and never had those failures before.

     

    i also tested all scenarios (ipv4 only etc.) and set up an "old" machine locally with unraid in parallel, and can confirm: as soon as i put some traffic on dockers in br0 mode, failures will come and the system will definitely crash, even with a small base setup with only a few dockers and 0 VM's.

     

    tested on 2 local systems here. currently, keeping br0 mode is only possible by staying on 6.11 ... i also have 3 more "externally managed" unraid servers with the same lineups ;) and i'm sure i won't test them, as this is by far too risky. after my experience now it looks like a dead end here; no idea if it's really a kernel issue or some background change in unraid. i'm considering switching to another linux distro which also uses the 6.x linux kernel to test the macvlan behaviour there, as soon as i find some spare time ...

     

    IPVLAN is not possible to use with a Fritz: as the Fritz relies on the mac address only, the br0 assigned dockers will "jump" and it takes longer to establish connections, or they fail, or sometimes it just works ... but that's not a solution. after contacting AVM, they confirmed this and are currently not changing their system ... which i actually understand, as this would need some major changes there ...
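
To illustrate why a MAC-only device table struggles with ipvlan, here is a toy model in Python. It is only a sketch and not AVM's actual implementation: the `learn` helper, the MAC addresses and the container IPs are made up for illustration. The one fact it relies on is that ipvlan containers share the MAC address of the parent interface, while macvlan containers each get their own.

```python
# Toy model only: this is NOT how a Fritz!Box is really implemented, just an
# illustration of a device table that is keyed by MAC address alone.

def learn(table, mac, ip):
    """Record the latest IP seen for a MAC, as a MAC-keyed table would."""
    table[mac] = ip
    return table

# macvlan: every container gets its own MAC address (made-up values).
macvlan_table = {}
learn(macvlan_table, "02:42:c0:a8:01:0a", "192.168.1.10")
learn(macvlan_table, "02:42:c0:a8:01:0b", "192.168.1.11")

# ipvlan: all containers share the MAC of the parent interface (made-up value).
HOST_MAC = "aa:bb:cc:dd:ee:ff"
ipvlan_table = {}
learn(ipvlan_table, HOST_MAC, "192.168.1.4")   # the host itself
learn(ipvlan_table, HOST_MAC, "192.168.1.10")  # container 1 overwrites the entry
learn(ipvlan_table, HOST_MAC, "192.168.1.11")  # container 2 overwrites it again

print(len(macvlan_table))  # two separate devices
print(ipvlan_table)        # one device whose IP keeps "jumping"
```

In the macvlan case the table holds one stable entry per container; in the ipvlan case a single entry is rewritten by whichever address talked last, which matches the "jumping" described above.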

     

    in case you need more logs or screenshots ... i have plenty of them ;) but in the end it's always the same outcome and it's probably also hard for limetech to tell ...

     

    here are 2 different samples ... the syslog from the small "old" machine and a crash screenshot from the monitor of my main server ...

     

    alsserverii-syslog-20230414-2015.zip

     

    image.thumb.png.3c151ff1075c4f90004488fc76fe7c6b.png

  15. 2 minutes ago, Tristankin said:

    Do you have VT-d enabled on your system?

    the 10850k and the 9900 one, and the small currently offline one, yes

    the other one, nope

     

    2 minutes ago, Tristankin said:

    You can understand that is pretty frustrating right?

    of course ...

     

    i just had an issue with a beta which worked, and while changing some BIOS settings it broke my VM's completely. changing the settings back didn't help either ... only returning to the last stable release did. after a week of experimenting, it took wiping the VM's (without the disks), wiping the libvirt image, updating, and adding the VM's again to get it running on the beta (which worked flawlessly before ...). so yes, i know it's frustrating, and trying for a year is way worse ...

     

    but as you see, some may also have the issue, but most don't ... so it's hard to debug and say why it's happening in your case. maybe worth a try: make a clean install while resetting the BIOS and unraid (of course keep a backup), test it "bare metal" in legacy mode and / or uefi mode (bios & unraid) with a basic setup (array, plex) and run it ...

     

    if the error reappears, restore the backup and you're back in the same state. if it's running, then build it up 1 by 1 (plugins, dockers, changes, ...) until it breaks, to narrow it down. it's definitely no general issue as you see, so either hardware or setup; hardware would be a shame if it's incompatible, setup could be something to solve.

     

    that would be my final approach.
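
The "build it up 1 by 1 until it breaks" approach can be sketched as a small loop. This is only an illustration: `first_breaking_component`, `fake_check` and the component names are hypothetical, and in practice the stability check is a manual step (run the server for a while and watch for the crash), not a function call.

```python
from typing import Callable, Iterable, List, Optional


def first_breaking_component(
    components: Iterable[str],
    is_stable: Callable[[List[str]], bool],
) -> Optional[str]:
    """Enable components one at a time and return the first one that breaks things."""
    enabled: List[str] = []
    for component in components:
        enabled.append(component)
        # Re-check the whole system after every single addition.
        if not is_stable(enabled):
            return component
    return None  # everything was added without a failure


# Hypothetical stand-in for "run it and see": pretend the crash only
# shows up once "plugin-b" is active.
def fake_check(enabled: List[str]) -> bool:
    return "plugin-b" not in enabled


culprit = first_breaking_component(
    ["array", "plex", "plugin-a", "plugin-b", "docker-x"], fake_check
)
print(culprit)
```

Adding one item per step is slow, but it means the last change is always the suspect, which is the whole point of starting from a clean base.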

  16. 29 minutes ago, ich777 said:

    None of them crashed so far after about a month of uptime transcoding with Unmanic on Unraid.

     

    you can add the following with no issues

     

    i9 10850k on asrock z590

    i9 9900 on msi z370

    i5 2405S on asus P8Z77-V LX

    and 1 more asus with a celeron which is currently offline ;) 

     

    all have no issues with the intel igpu on the latest unraid releases in plex; the 10850k also has no issues with ffmpeg encoding like unmanic (the others are not used for that, only plex).

  17. On 7/7/2022 at 12:13 PM, bonienl said:

    It works fine when using a static IP address assignment for Unraid.

     

    Using DHCP may cause a race condition, and the shim network is not created when your DHCP server is slow in responding.

    sorry @bonienl but i can't confirm this. as i didn't have any "unsafe" shutdowns lately, i just forced one and ...

     

     

    image.thumb.png.ec4da88cb8bae4489539d82fc0b4f5b6.png

     

    and i have used a static assignment since day 1 with unraid (all local devices here use static ip's)

     

    image.thumb.png.1689c33f04d26bbe110c5e4523164d77.png

     

    docker setting

     

    image.thumb.png.17a98e13558f2caccabaf420bc15cf23.png

     

    so for now i have to stop/start the docker service to get "host access" working again, or make a clean restart ...

     

    then its back again

     

    image.thumb.png.14861a867906f70bf91e7708ad7a8b8c.png

     

    as i stated, is there no chance to make a simple routine? unclean shutdown = docker service restart ;)
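
A rough sketch of what such a routine could look like, in Python. Two things here are assumptions that the posts above do not confirm: that the shim interface is named `shim-br0`, and that the docker service can be restarted with `/etc/rc.d/rc.docker restart`. The function names are made up, and the script would still have to be hooked into whatever runs after boot.

```python
import subprocess
from pathlib import Path

# Assumptions, not confirmed in this thread: the interface name and the
# restart command may differ on a given system.
SHIM_INTERFACE = "shim-br0"
RESTART_COMMAND = ["/etc/rc.d/rc.docker", "restart"]


def shim_missing(name=SHIM_INTERFACE, net_dir=Path("/sys/class/net")):
    """Return True when the kernel lists no network interface with this name."""
    return not (Path(net_dir) / name).exists()


def ensure_host_access(run=subprocess.run, name=SHIM_INTERFACE,
                       net_dir=Path("/sys/class/net")):
    """Restart the docker service once if the shim interface is absent.

    Returns True if a restart was triggered, False if nothing was needed.
    """
    if shim_missing(name=name, net_dir=net_dir):
        run(RESTART_COMMAND, check=True)
        return True
    return False
```

Calling `ensure_host_access()` does nothing while the shim interface exists, so it would be safe to run after every boot and not only after an unclean shutdown.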

    image.png