
6.5.3 - server is randomly freezing


moodinsk


unraid-diagnostics-20180614-2059.zip

 

The server hangs completely, so I am unsure whether the diagnostics zip contains a full syslog of the issue. This particular incident happened on 6/14, before I updated to 6.5.3, but it has happened again since the update.

 

The server was recently decommissioned as a Hyper-V server that ran a handful of images without issue. The only difference is the drives, which I refreshed, except for the cache drive. It is randomly getting CRC errors and I am planning on replacing it, but while I could be wrong, I don't believe that CRC errors on a cache drive would cause the server to lock up entirely?

 

Also attached is a screenshot of the console. The server was completely unresponsive and required a press of the reset button to bring it back.

 

The server will function for up to a week without issue, and then randomly crash. It happened as recently as 6/25, but I wasn't able to get a screenshot.

 

The server isn't doing much at all: it has a few file share volumes, runs Plex for the kids, and runs a UniFi Docker container for my wireless network. I would like to start using it for VMs, but until I can resolve this, I am hesitant.

 

Please let me know if there is more that I can gather to help you help me :)

IMG_0804.JPG

  • 2 weeks later...

Any suggestions at all?

 

I am seeing this in dmesg after I tried stopping the array.

[875151.955326] device br0 left promiscuous mode
[875151.967278] veth689ec15: renamed from eth0
[875191.929540] eth0: renamed from veth760b8c4
[875191.939331] device br0 entered promiscuous mode
[895736.136897] ------------[ cut here ]------------
[895736.136903] WARNING: CPU: 0 PID: 11494 at net/netfilter/nf_conntrack_core.c:769 __nf_conntrack_confirm+0x97/0x4d6
[895736.136903] Modules linked in: xt_CHECKSUM iptable_mangle ipt_REJECT nf_reject_ipv4 ebtable_filter ebtables ip6table_filter ip6_tables vhost_net tun vhost tap veth macvlan xt_nat ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables nf_nat xfs nfsd lockd grace sunrpc md_mod bonding e1000e ptp pps_core x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul ast crc32c_intel ghash_clmulni_intel pcbc ttm aesni_intel i2c_algo_bit drm_kms_helper aes_x86_64 crypto_simd glue_helper drm cryptd mpt3sas agpgart i2c_i801 i2c_core ahci raid_class libahci scsi_transport_sas intel_cstate intel_uncore video syscopyarea sysfillrect intel_rapl_perf ie31200_edac ipmi_si sysimgblt backlight fb_sys_fops thermal button
[895736.136933]  fan [last unloaded: pps_core]
[895736.136935] CPU: 0 PID: 11494 Comm: kworker/0:0 Tainted: G        W       4.14.49-unRAID #1
[895736.136936] Hardware name: ASUSTek Computer INC. RS300-E7-PS4/P8B-E Series, BIOS 6003 11/20/2012
[895736.136939] Workqueue: events macvlan_process_broadcast [macvlan]
[895736.136940] task: ffff8807e7225100 task.stack: ffffc900075d0000
[895736.136941] RIP: 0010:__nf_conntrack_confirm+0x97/0x4d6
[895736.136942] RSP: 0018:ffff88082fc03d50 EFLAGS: 00010202
[895736.136943] RAX: 0000000000000188 RBX: 00000000000009a4 RCX: 0000000000000001
[895736.136943] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffffffff81c09448
[895736.136944] RBP: ffff8805c2c72b00 R08: 0000000000000101 R09: ffff8805cbfc8e00
[895736.136945] R10: 0000000000000098 R11: 0000000000000000 R12: ffffffff81c88480
[895736.136945] R13: 0000000000000392 R14: ffff8807e347da40 R15: ffff8807e347da98
[895736.136946] FS:  0000000000000000(0000) GS:ffff88082fc00000(0000) knlGS:0000000000000000
[895736.136947] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[895736.136948] CR2: 0000145966d9bcc8 CR3: 0000000001c0a001 CR4: 00000000001606f0
[895736.136948] Call Trace:
[895736.136950]  <IRQ>
[895736.136953]  ipv4_confirm+0xac/0xb4 [nf_conntrack_ipv4]
[895736.136956]  nf_hook_slow+0x37/0x96
[895736.136958]  ip_local_deliver+0x97/0xb0
[895736.136960]  ? inet_del_offload+0x3e/0x3e
[895736.136962]  ip_rcv+0x2f5/0x32a
[895736.136964]  ? ip_local_deliver_finish+0x1aa/0x1aa
[895736.136966]  __netif_receive_skb_core+0x69f/0x718
[895736.136968]  process_backlog+0x7e/0x116
[895736.136971]  net_rx_action+0xfb/0x24f
[895736.136974]  __do_softirq+0xcd/0x1c2
[895736.136976]  do_softirq_own_stack+0x2a/0x40
[895736.136977]  </IRQ>
[895736.136980]  do_softirq+0x46/0x52
[895736.136981]  netif_rx_ni+0x1a/0x20
[895736.136983]  macvlan_broadcast+0x117/0x14f [macvlan]
[895736.136985]  macvlan_process_broadcast+0xc5/0x10c [macvlan]
[895736.136987]  process_one_work+0x155/0x237
[895736.136989]  ? rescuer_thread+0x275/0x275
[895736.136990]  worker_thread+0x1d5/0x2ad
[895736.136992]  kthread+0x111/0x119
[895736.136993]  ? kthread_create_on_node+0x3a/0x3a
[895736.136995]  ? SyS_exit_group+0xb/0xb
[895736.136996]  ret_from_fork+0x35/0x40
[895736.136997] Code: 48 c1 eb 20 89 1c 24 e8 31 f9 ff ff 8b 54 24 04 89 df 89 c6 41 89 c5 e8 a9 fa ff ff 84 c0 75 b9 49 8b 86 80 00 00 00 a8 08 74 02 <0f> 0b 4c 89 f7 e8 03 ff ff ff 49 8b 86 80 00 00 00 0f ba e0 09
[895736.137014] ---[ end trace 60f28eab616b4212 ]---
[897033.496770] mdcmd (41): nocheck
[897033.496772] md: nocheck_array: check not active
[897033.496864] mdcmd (42): spinup 0
[897033.496870] mdcmd (43): spinup 1
[897033.496876] mdcmd (44): spinup 2
[897034.874089] device virbr0-nic left promiscuous mode
[897034.874091] virbr0: port 1(virbr0-nic) entered disabled state
[897043.756401] device br0 left promiscuous mode
[897043.764966] veth760b8c4: renamed from eth0
[897046.171567] ------------[ cut here ]------------
[897046.171573] WARNING: CPU: 7 PID: 32490 at fs/btrfs/extent_io.c:534 clear_state_bit+0x3a/0x122
[897046.171574] Modules linked in: xt_CHECKSUM iptable_mangle ipt_REJECT nf_reject_ipv4 ebtable_filter ebtables ip6table_filter ip6_tables veth macvlan xt_nat ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 iptable_filter ip_tables nf_nat xfs nfsd lockd grace sunrpc md_mod bonding e1000e ptp pps_core x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul ast crc32c_intel ghash_clmulni_intel pcbc ttm aesni_intel i2c_algo_bit drm_kms_helper aes_x86_64 crypto_simd glue_helper drm cryptd mpt3sas agpgart i2c_i801 i2c_core ahci raid_class libahci scsi_transport_sas intel_cstate intel_uncore video syscopyarea sysfillrect intel_rapl_perf ie31200_edac ipmi_si sysimgblt backlight fb_sys_fops thermal button fan [last unloaded: tun]
[897046.171599] CPU: 7 PID: 32490 Comm: umount Tainted: G        W       4.14.49-unRAID #1
[897046.171599] Hardware name: ASUSTek Computer INC. RS300-E7-PS4/P8B-E Series, BIOS 6003 11/20/2012
[897046.171600] task: ffff88078b2e0000 task.stack: ffffc900033f4000
[897046.171601] RIP: 0010:clear_state_bit+0x3a/0x122
[897046.171602] RSP: 0018:ffffc900033f7d18 EFLAGS: 00010287
[897046.171603] RAX: 0000000000003000 RBX: ffff88046b9cb6e0 RCX: 0000000000000000
[897046.171604] RDX: ffffc900033f7d5c RSI: ffff88046b9cb6e0 RDI: ffff8805c93819c0
[897046.171604] RBP: ffff8805c93819c0 R08: 0000000000000000 R09: 0000000000000001
[897046.171605] R10: ffffea00170b2e40 R11: 0000000000000001 R12: 00000000fffde7ff
[897046.171606] R13: 0000000000000001 R14: ffff88046b9cb6e0 R15: 000000000000020a
[897046.171606] FS:  0000145c0e5c6780(0000) GS:ffff88082fdc0000(0000) knlGS:0000000000000000
[897046.171607] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[897046.171608] CR2: 000014f8380e2be0 CR3: 00000007d4636002 CR4: 00000000001606e0
[897046.171608] Call Trace:
[897046.171612]  __clear_extent_bit+0x229/0x2bf
[897046.171615]  ? kmem_cache_alloc+0xdc/0xe8
[897046.171616]  clear_extent_bit+0x10/0x15
[897046.171618]  btrfs_evict_inode+0x175/0x484
[897046.171620]  evict+0xb9/0x16d
[897046.171621]  dispose_list+0x30/0x39
[897046.171622]  evict_inodes+0x11e/0x12d
[897046.171624]  generic_shutdown_super+0x3e/0x10c
[897046.171625]  kill_anon_super+0x9/0xe
[897046.171627]  btrfs_kill_super+0xd/0x8e
[897046.171629]  deactivate_locked_super+0x2f/0x61
[897046.171630]  cleanup_mnt+0x40/0x5c
[897046.171633]  task_work_run+0x77/0x8b
[897046.171636]  exit_to_usermode_loop+0x46/0x75
[897046.171637]  do_syscall_64+0xf7/0xfe
[897046.171639]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[897046.171641] RIP: 0033:0x145c0d838ec7
[897046.171641] RSP: 002b:00007fffc3751c78 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
[897046.171643] RAX: 0000000000000000 RBX: 00000000006072b0 RCX: 0000145c0d838ec7
[897046.171643] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 000000000060b440
[897046.171644] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[897046.171644] R10: 0000000000000008 R11: 0000000000000246 R12: 000000000060b440
[897046.171645] R13: 0000145c0e3b1ed0 R14: 0000000000607490 R15: 0000000000000000
[897046.171646] Code: 41 51 8b 02 41 89 c4 41 81 e4 ff e7 fd ff a8 01 74 22 f6 46 44 01 74 1c 48 8b 46 08 48 8b 4f 10 48 ff c0 48 2b 06 48 39 c1 73 02 <0f> 0b 48 29 c1 48 89 4d 10 48 8b 45 20 48 85 c0 74 1d 48 8b 40
[897046.171663] ---[ end trace 60f28eab616b4213 ]---
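In case it helps anyone skimming a trace like this, here is a minimal Python sketch that pulls only the WARNING header lines out of a dmesg dump, so the source file and function of each warning are easy to see at a glance. The function name `summarize_warnings` and the regular expression are my own assumptions for illustration, not part of any Unraid tooling, and the sample text is just two header lines copied from the output above.

```python
import re

# Matches dmesg WARNING headers of the form seen above, e.g.
# [897046.171573] WARNING: CPU: 7 PID: 32490 at fs/btrfs/extent_io.c:534 clear_state_bit+0x3a/0x122
# (Pattern is an assumption based on the lines in this post, not a full dmesg grammar.)
WARNING_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) "
    r"at (?P<src>\S+):(?P<line>\d+) (?P<func>[^+\s]+)\+"
)


def summarize_warnings(dmesg_text):
    """Return one dict per kernel WARNING header found in dmesg_text."""
    found = []
    for raw in dmesg_text.splitlines():
        m = WARNING_RE.match(raw.strip())
        if m:
            found.append({
                "timestamp": float(m.group("ts")),
                "cpu": int(m.group("cpu")),
                "pid": int(m.group("pid")),
                "source": m.group("src"),
                "line": int(m.group("line")),
                "function": m.group("func"),
            })
    return found


# Two header lines copied from the dmesg output in this post.
SAMPLE = """\
[895736.136897] ------------[ cut here ]------------
[895736.136903] WARNING: CPU: 0 PID: 11494 at net/netfilter/nf_conntrack_core.c:769 __nf_conntrack_confirm+0x97/0x4d6
[897046.171573] WARNING: CPU: 7 PID: 32490 at fs/btrfs/extent_io.c:534 clear_state_bit+0x3a/0x122
"""

if __name__ == "__main__":
    for w in summarize_warnings(SAMPLE):
        print(f'{w["timestamp"]:.6f} {w["source"]}:{w["line"]} {w["function"]}')
```

Saving the output of `dmesg` to a file and passing its contents to `summarize_warnings` should list the two warnings here: one in the netfilter conntrack code reached through macvlan, and one in btrfs during umount.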

Thanks for your time.


That definitely could be the issue. I do have the UniFi Docker container set with an assigned IP different from my Unraid host.

 

I read through the post, and as I understand it we are waiting on a possible kernel fix for assigning IPs to Docker containers on br0? I don't currently use VLANs on my Unraid server. The 4 NICs I have are set up as an LACP bond, so I currently don't have the option of assigning Docker to specific eth interfaces.

 

 
