My server freezes every couple of days


Hello

I've been running Unraid for a year and it has been rock solid, with a few updates in between and no major issues. I've had some problems with containers, but I always got them working again from backups. Not this time: I can't figure out what's going on.

Every couple of days my server freezes and I have to hard reset it with the reset button.

At first I thought it was because I had added 3 new disks and an HBA (LSI 9201-16e), but I removed them and the symptoms persisted, so I added them again. I didn't find anything unusual in the logs until yesterday, when I enabled syslog mirroring to the flash drive. Today the problem appeared again, but the server hasn't hung yet.

 

Feb  9 05:46:07 caronte kernel: ------------[ cut here ]------------
Feb  9 05:46:07 caronte kernel: WARNING: CPU: 3 PID: 30466 at net/netfilter/nf_conntrack_core.c:1120 __nf_conntrack_confirm+0x99/0x1e1
Feb  9 05:46:07 caronte kernel: Modules linked in: macvlan xt_CHECKSUM ipt_REJECT ip6table_mangle ip6table_nat iptable_mangle ip6table_filter ip6_tables vhost_net tun vhost vhost_iotlb tap veth xt_nat iptable_filter xfs nfsd lockd grace sunrpc md_mod nct6775 hwmon_vid iptable_nat xt_MASQUERADE nf_nat ip_tables wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libblake2s blake2s_x86_64 libblake2s_generic libchacha bonding mlx4_en mlx4_core e1000e x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm btusb crct10dif_pclmul crc32_pclmul btrtl crc32c_intel btbcm ghash_clmulni_intel btintel aesni_intel bluetooth crypto_simd cryptd glue_helper mpt3sas rapl i2c_i801 video ecdh_generic ahci raid_class i2c_smbus backlight intel_cstate i2c_core scsi_transport_sas intel_uncore ecc libahci thermal button fan [last unloaded: mlx4_core]
Feb  9 05:46:07 caronte kernel: CPU: 3 PID: 30466 Comm: kworker/3:8 Not tainted 5.10.1-Unraid #1
Feb  9 05:46:07 caronte kernel: Hardware name: OEGStone DQ77MK/DQ77MK, BIOS MKQ7710H.86A.0074.2018.1025.1727 10/25/2018
Feb  9 05:46:07 caronte kernel: Workqueue: events macvlan_process_broadcast [macvlan]
Feb  9 05:46:07 caronte kernel: RIP: 0010:__nf_conntrack_confirm+0x99/0x1e1
Feb  9 05:46:07 caronte kernel: Code: e4 e3 ff ff 8b 54 24 14 89 c6 41 89 c4 48 c1 eb 20 89 df 41 89 de e8 54 e1 ff ff 84 c0 75 b8 48 8b 85 80 00 00 00 a8 08 74 18 <0f> 0b 89 df 44 89 e6 31 db e8 89 de ff ff e8 af e0 ff ff e9 1f 01
Feb  9 05:46:07 caronte kernel: RSP: 0018:ffffc90000138dd8 EFLAGS: 00010202
Feb  9 05:46:07 caronte kernel: RAX: 0000000000000188 RBX: 000000000000c741 RCX: 000000001c455466
Feb  9 05:46:07 caronte kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff8200a544
Feb  9 05:46:07 caronte kernel: RBP: ffff8882a9c30b40 R08: 0000000095c44e4d R09: ffff888071397780
Feb  9 05:46:07 caronte kernel: R10: 0000000000000000 R11: ffff88810d9e7c00 R12: 000000000000c4cd
Feb  9 05:46:07 caronte kernel: R13: ffffffff8210da40 R14: 000000000000c741 R15: ffff8882a9c30b4c
Feb  9 05:46:07 caronte kernel: FS:  0000000000000000(0000) GS:ffff88840dcc0000(0000) knlGS:0000000000000000
Feb  9 05:46:07 caronte kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb  9 05:46:07 caronte kernel: CR2: 00001527e5ee8020 CR3: 000000000500c001 CR4: 00000000000606e0
Feb  9 05:46:07 caronte kernel: Call Trace:
Feb  9 05:46:07 caronte kernel: <IRQ>
Feb  9 05:46:07 caronte kernel: nf_conntrack_confirm+0x2f/0x36
Feb  9 05:46:07 caronte kernel: nf_hook_slow+0x39/0x8e
Feb  9 05:46:07 caronte kernel: nf_hook.constprop.0+0xb1/0xd8
Feb  9 05:46:07 caronte kernel: ? ip_protocol_deliver_rcu+0xfe/0xfe
Feb  9 05:46:07 caronte kernel: ip_local_deliver+0x49/0x75
Feb  9 05:46:07 caronte kernel: __netif_receive_skb_one_core+0x74/0x95
Feb  9 05:46:07 caronte kernel: process_backlog+0xa3/0x13b
Feb  9 05:46:07 caronte kernel: net_rx_action+0xf4/0x29d
Feb  9 05:46:07 caronte kernel: __do_softirq+0xc4/0x1c2
Feb  9 05:46:07 caronte kernel: asm_call_irq_on_stack+0xf/0x20
Feb  9 05:46:07 caronte kernel: </IRQ>
Feb  9 05:46:07 caronte kernel: do_softirq_own_stack+0x2c/0x39
Feb  9 05:46:07 caronte kernel: do_softirq+0x3a/0x44
Feb  9 05:46:07 caronte kernel: netif_rx_ni+0x1c/0x22
Feb  9 05:46:07 caronte kernel: macvlan_broadcast+0x10e/0x13c [macvlan]
Feb  9 05:46:07 caronte kernel: macvlan_process_broadcast+0xf8/0x143 [macvlan]
Feb  9 05:46:07 caronte kernel: process_one_work+0x13c/0x1d5
Feb  9 05:46:07 caronte kernel: worker_thread+0x18b/0x22f
Feb  9 05:46:07 caronte kernel: ? process_scheduled_works+0x27/0x27
Feb  9 05:46:07 caronte kernel: kthread+0xe5/0xea
Feb  9 05:46:07 caronte kernel: ? kthread_unpark+0x52/0x52
Feb  9 05:46:07 caronte kernel: ret_from_fork+0x1f/0x30
Feb  9 05:46:07 caronte kernel: ---[ end trace e5dca96e95407d59 ]---

 

I've attached the diagnostics zip and a more_logs.txt file, where you can see errors similar to this trace repeating in a loop until I pressed the reset button.

Thank you in advance for your help.

 

EDIT: This time I also lost my Nextcloud install when this happened, and I can't get it running again from backups. At least I have a backup of the files. When I restore the database and Nextcloud to their state from a couple of weeks ago, I get a connection refused error when I try to access the frontend.

 

caronte-diagnostics-20210209-2003.zip more_logs.txt

Edited by magonzalez112
21 minutes ago, JorgeB said:

Macvlan call traces are usually the result of having Docker containers with a custom IP address; more info below.

 

https://forums.unraid.net/topic/70529-650-call-traces-when-assigning-ip-address-to-docker-containers/

I really appreciate your reply. Thank you.

I guessed it was something related to that, because the trace mentions macvlan and netfilter throughout.
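For anyone who wants to check their own syslog for the same pattern, here is a rough Python sketch that counts kernel trace blocks and how many of them mention both macvlan and nf_conntrack. It assumes each trace is delimited by the "[ cut here ]" and "---[ end trace" lines, as in the log pasted above; the function name `count_traces` and the keyword list are my own choices for illustration, not something from Unraid.

```python
# Markers the kernel prints around a warning block, as seen in the log above.
TRACE_START = "[ cut here ]"
TRACE_END = "---[ end trace"


def count_traces(lines, keywords=("macvlan", "nf_conntrack")):
    """Return (total_traces, traces_containing_all_keywords).

    `lines` is any iterable of syslog lines (a list or an open file).
    The keywords are an assumption based on the trace in this thread.
    """
    total = 0
    matching = 0
    in_trace = False
    block = []
    for line in lines:
        if TRACE_START in line:
            # A new trace begins; discard any unfinished block.
            in_trace = True
            block = []
        if in_trace:
            block.append(line)
            if TRACE_END in line:
                in_trace = False
                total += 1
                text = "\n".join(block)
                if all(keyword in text for keyword in keywords):
                    matching += 1
    return total, matching


# Example usage (the file name is the attachment from the first post):
# with open("more_logs.txt", errors="replace") as fh:
#     total, matching = count_traces(fh)
#     print(f"{matching} of {total} traces mention macvlan and nf_conntrack")
```

If most of the traces match, that points to the macvlan issue described in the linked thread rather than to the new disks or the HBA.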

It's time to deploy VLANs in my network. It's something I've always had pending.

I think my network hardware is compatible. I have an HP ProCurve J9028B as my homelab switch, a MikroTik router, and a TP-Link switch for the rooms of the house.

I'm not an expert on VLANs and I wouldn't like to lose connectivity, especially on the server and on my workstation, where I configure everything. Do you recommend any guide or book to start with?

Thank you very much, Jorge.

2 hours ago, magonzalez112 said:

Do you recommend any guide or book to start with?

The guide I linked in that macvlan call trace thread was enough to get me and others started with setting up VLANs. Some work must be done both in unRAID and on the router side. I am no networking expert myself.

 

Even though the guide uses hardware you do not have in the examples, the concepts are the same and you should be able to find the corresponding settings in your hardware.

 
