aleberro Posted January 9, 2023
Hello guys, I need help with a problem that happened to my server. Please note it had already happened once, maybe 1-2 months ago, and then never again until yesterday. Please find attached my diagnostics zip.
I don't understand how, but the machine enters an "unresponsive" state: while still powered on, it is not possible to access the web UI, the containers' UI, or the SMB shares; nothing works. If I try to ping it, it is reported as unreachable. To my ears, the HDDs seem to be spun down. The only solution (unfortunately) seems to be a hard shutdown followed by powering on again. Last time that worked: the server ran a parity check afterwards and there were no errors (this time I managed to reboot it again, and the parity check is in progress).
After the first time I enabled local rsyslog to an RPi, so I was able to collect the full log, which as far as I can understand doesn't tell much. I am sure that the machine was working fine until 19:15; unfortunately I was then away from home until this morning.
In kernel.log there is a long series of the two lines below; then (I suppose at the time of the crash) it stops, and the next line is from the power-on this morning after the hard shutdown.
2023-01-08T23:19:21+00:00 BerroServer kernel: pcieport 0000:00:1c.1: Enabling MPC IRBNCE
2023-01-08T23:19:21+00:00 BerroServer kernel: pcieport 0000:00:1c.1: Intel PCH root port ACS workaround enabled
2023-01-09T11:17:29+00:00 BerroServer kernel: md: unRAID driver 2.9.25 installed
It looks like 23:19 might have been the time of the crash (?)
Another file in the rsyslog folder, named ".log", maybe tells a bit more.
Here's the tail:
2023-01-08T16:26:04+00:00 BerroServer emhttpd: read SMART /dev/sdg
2023-01-08T16:26:11+00:00 BerroServer emhttpd: read SMART /dev/sde
2023-01-08T16:27:34+00:00 BerroServer shfs: share cache full
2023-01-08T16:27:34+00:00 BerroServer emhttpd: read SMART /dev/sdf
2023-01-08T16:27:41+00:00 BerroServer shfs: share cache full
2023-01-08T16:27:49+00:00 BerroServer message repeated 148 times: [ shfs: share cache full]
2023-01-08T16:32:48+00:00 BerroServer shfs: share cache full
2023-01-08T16:32:58+00:00 BerroServer message repeated 9 times: [ shfs: share cache full]
2023-01-08T16:57:45+00:00 BerroServer emhttpd: spinning down /dev/sdg
2023-01-08T17:05:15+00:00 BerroServer emhttpd: spinning down /dev/sde
2023-01-08T17:05:15+00:00 BerroServer emhttpd: spinning down /dev/sdf
2023-01-09T11:17:28+00:00 BerroServer sshd[7822]: Server listening on 0.0.0.0 port 22.
In this log the last entry before the reboot is dated 17:05, so well before the supposed crash time of 23:19. I have searched the forum for the message "shfs: share cache full" and it looks like it shouldn't be the cause of this problem. The only similar issue I found was the post unraid-became-mostly-unresponsive, which unfortunately led nowhere because there was no log.
If you need it, I can provide the full zip export of rsyslog. Could it be a hardware-related problem? I have 4x8GB ECC RAM; would it be useful to run a memtest (after the parity check completes, of course)?
Thanks in advance for the support.
berroserver-diagnostics-20230109-1238.zip
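The reasoning above (finding the last entry written before the reboot) can be sketched in a few lines of Python. This is only a rough illustration, assuming each log line starts with an ISO 8601 timestamp as in the excerpts quoted in this post and that the lines of one file are in chronological order; the function names `parse_timestamp` and `largest_gap` are made up for this example.

```python
from datetime import datetime, timedelta


def parse_timestamp(line):
    """Return the leading ISO 8601 timestamp of a log line, or None."""
    token = line.split(" ", 1)[0]
    try:
        return datetime.fromisoformat(token)
    except ValueError:
        return None


def largest_gap(lines):
    """Return (gap, line_before, line_after) for the longest silence in the log."""
    best = (timedelta(0), None, None)
    prev_ts = prev_line = None
    for raw in lines:
        line = raw.strip()
        ts = parse_timestamp(line)
        if ts is None:
            continue  # skip lines without a parsable timestamp
        if prev_ts is not None and ts - prev_ts > best[0]:
            best = (ts - prev_ts, prev_line, line)
        prev_ts, prev_line = ts, line
    return best


# Lines copied from the kernel.log excerpt quoted in this post.
SAMPLE = [
    "2023-01-08T23:19:21+00:00 BerroServer kernel: pcieport 0000:00:1c.1: Enabling MPC IRBNCE",
    "2023-01-08T23:19:21+00:00 BerroServer kernel: pcieport 0000:00:1c.1: Intel PCH root port ACS workaround enabled",
    "2023-01-09T11:17:29+00:00 BerroServer kernel: md: unRAID driver 2.9.25 installed",
]

gap, before, after = largest_gap(SAMPLE)
print("silent for", gap)
print("last line before the gap:", before)
print("first line after the gap:", after)
```

Running the same function over each file in the rsyslog folder would show where every file went quiet, which makes it easier to compare the 17:05 and 23:19 candidates. A long gap only shows when logging stopped, not necessarily when the machine hung.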
JorgeB Posted January 9, 2023
37 minutes ago, aleberro said: "shfs: share cache full" and it looks like it shouldn't be the cause of this problem.
It should not. Without anything logged there aren't many clues. Try disabling the PCIe ACS override. You can also try booting the server in safe mode with all Docker containers/VMs disabled and let it run as a basic NAS for a few days: if it still crashes, it's likely a hardware problem; if it doesn't, start turning the other services back on one by one.
aleberro Posted February 5, 2023 Author
First of all, thank you @JorgeB for the support. The system worked flawlessly until yesterday. Please note that in the meantime I replaced an HDD which had begun showing SMART errors. The new HDD has been working without any issues for two weeks, so I guess the old drive was not related to this "unresponsive" problem, which happened again. This time the kernel log has more info; please find attached the diagnostics .zip. Here are the last lines of the log, which, as far as I can understand, may be useful in investigating the problem:
2023-02-04T09:48:34+00:00 BerroServer kernel: pcieport 0000:00:1c.1: Enabling MPC IRBNCE
2023-02-04T09:48:34+00:00 BerroServer kernel: pcieport 0000:00:1c.1: Intel PCH root port ACS workaround enabled
2023-02-04T09:49:35+00:00 BerroServer kernel: pcieport 0000:00:1c.1: Enabling MPC IRBNCE
2023-02-04T09:49:35+00:00 BerroServer kernel: pcieport 0000:00:1c.1: Intel PCH root port ACS workaround enabled
2023-02-04T09:50:36+00:00 BerroServer kernel: pcieport 0000:00:1c.1: Enabling MPC IRBNCE
2023-02-04T09:50:36+00:00 BerroServer kernel: pcieport 0000:00:1c.1: Intel PCH root port ACS workaround enabled
2023-02-04T09:50:44+00:00 BerroServer kernel: general protection fault, probably for non-canonical address 0xa2948afb5bf76207: 0000 [#1] PREEMPT SMP PTI
2023-02-04T09:50:44+00:00 BerroServer kernel: CPU: 3 PID: 18557 Comm: app Tainted: G W 5.19.17-Unraid #2
2023-02-04T09:50:44+00:00 BerroServer kernel: Hardware name: ASUSTeK COMPUTER INC.
P9D-M Series/P9D-M Series, BIOS 2101 04/20/2018 2023-02-04T09:50:44+00:00 BerroServer kernel: RIP: 0010:nf_nat_setup_info+0x142/0x7b1 [nf_nat] 2023-02-04T09:50:44+00:00 BerroServer kernel: Code: 4c 89 f7 e8 2f f8 ff ff 48 8b 15 66 6a 00 00 89 c0 48 8d 04 c2 4c 8b 28 4d 85 ed 74 2a 49 81 ed 90 00 00 00 eb 21 8a 44 24 46 <41> 38 45 46 74 21 49 8b 95 90 00 00 00 48 85 d2 0f 84 53 ff ff ff 2023-02-04T09:50:44+00:00 BerroServer kernel: RSP: 0018:ffffc90000178730 EFLAGS: 00010282 2023-02-04T09:50:44+00:00 BerroServer kernel: RAX: ffff888103edf511 RBX: ffff8881a113b100 RCX: 469e93f3514ea88e 2023-02-04T09:50:44+00:00 BerroServer kernel: RDX: a2948afb5bf76251 RSI: d1e865f3880a7926 RDI: 1f64589d6c3d4144 2023-02-04T09:50:44+00:00 BerroServer kernel: RBP: ffffc900001787f8 R08: d776c335d6b7943a R09: 91b317315f4e5ab5 2023-02-04T09:50:44+00:00 BerroServer kernel: R10: 2d5ac8d98b98afa7 R11: ce13e7c889e48066 R12: ffffc9000017880c 2023-02-04T09:50:44+00:00 BerroServer kernel: R13: a2948afb5bf761c1 R14: ffffffff82909480 R15: 0000000000000000 2023-02-04T09:50:44+00:00 BerroServer kernel: FS: 000000c000380090(0000) GS:ffff88880fcc0000(0000) knlGS:0000000000000000 2023-02-04T09:50:44+00:00 BerroServer kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 2023-02-04T09:50:44+00:00 BerroServer kernel: ? __ip_finish_output+0x144/0x144 2023-02-04T09:50:44+00:00 BerroServer kernel: nf_hook+0xdf/0x110 2023-02-04T09:50:44+00:00 BerroServer kernel: CR2: 000000c00040e000 CR3: 00000001870d4002 CR4: 00000000001706e0 2023-02-04T09:50:44+00:00 BerroServer kernel: ? ethnl_parse_bit+0xce/0x202 2023-02-04T09:50:44+00:00 BerroServer kernel: ? __ip_finish_output+0x144/0x144 2023-02-04T09:50:44+00:00 BerroServer kernel: ip_output+0x78/0x88 2023-02-04T09:50:44+00:00 BerroServer kernel: Call Trace: 2023-02-04T09:50:44+00:00 BerroServer kernel: <IRQ> 2023-02-04T09:50:44+00:00 BerroServer kernel: ? 
krealloc+0x7f/0x90 2023-02-04T09:50:44+00:00 BerroServer kernel: nf_nat_masquerade_ipv4+0x114/0x13c [nf_nat] 2023-02-04T09:50:44+00:00 BerroServer kernel: masquerade_tg+0x48/0x66 [xt_MASQUERADE] 2023-02-04T09:50:44+00:00 BerroServer kernel: ipt_do_table+0x51e/0x5bf [ip_tables] 2023-02-04T09:50:44+00:00 BerroServer kernel: ? xt_write_recseq_end+0xf/0x1c [ip_tables] 2023-02-04T09:50:44+00:00 BerroServer kernel: ? __local_bh_enable_ip+0x56/0x6b 2023-02-04T09:50:44+00:00 BerroServer kernel: ? __ip_finish_output+0x144/0x144 2023-02-04T09:50:44+00:00 BerroServer kernel: ip_sabotage_in+0x4a/0x58 [br_netfilter] 2023-02-04T09:50:44+00:00 BerroServer kernel: nf_hook_slow+0x3d/0x96 2023-02-04T09:50:44+00:00 BerroServer kernel: ? ip_rcv_finish_core.constprop.0+0x3b7/0x3b7 2023-02-04T09:50:44+00:00 BerroServer kernel: NF_HOOK.constprop.0+0x79/0xd9 2023-02-04T09:50:44+00:00 BerroServer kernel: ? ip_rcv_finish_core.constprop.0+0x3b7/0x3b7 2023-02-04T09:50:44+00:00 BerroServer kernel: __netif_receive_skb_one_core+0x77/0x9c 2023-02-04T09:50:44+00:00 BerroServer kernel: ? ipt_do_table+0x57a/0x5bf [ip_tables] 2023-02-04T09:50:44+00:00 BerroServer kernel: netif_receive_skb+0xbf/0x127 2023-02-04T09:50:44+00:00 BerroServer kernel: br_handle_frame_finish+0x476/0x4b0 [bridge] 2023-02-04T09:50:44+00:00 BerroServer kernel: ? br_pass_frame_up+0xdd/0xdd [bridge] 2023-02-04T09:50:44+00:00 BerroServer kernel: br_nf_hook_thresh+0xe5/0x109 [br_netfilter] 2023-02-04T09:50:44+00:00 BerroServer kernel: ? br_pass_frame_up+0xdd/0xdd [bridge] 2023-02-04T09:50:44+00:00 BerroServer kernel: nf_nat_inet_fn+0x126/0x1a8 [nf_nat] 2023-02-04T09:50:44+00:00 BerroServer kernel: nf_nat_ipv4_out+0x15/0x91 [nf_nat] 2023-02-04T09:50:44+00:00 BerroServer kernel: nf_hook_slow+0x3d/0x96 2023-02-04T09:50:44+00:00 BerroServer kernel: br_nf_pre_routing_finish+0x2c1/0x2ec [br_netfilter] 2023-02-04T09:50:44+00:00 BerroServer kernel: ? br_pass_frame_up+0xdd/0xdd [bridge] 2023-02-04T09:50:44+00:00 BerroServer kernel: ? 
NF_HOOK.isra.0+0xe4/0x140 [br_netfilter] 2023-02-04T09:50:44+00:00 BerroServer kernel: ? br_nf_hook_thresh+0x109/0x109 [br_netfilter] 2023-02-04T09:50:44+00:00 BerroServer kernel: br_nf_pre_routing+0x226/0x23a [br_netfilter] 2023-02-04T09:50:44+00:00 BerroServer kernel: ? br_nf_hook_thresh+0x109/0x109 [br_netfilter] 2023-02-04T09:50:44+00:00 BerroServer kernel: br_handle_frame+0x27f/0x2e7 [bridge] 2023-02-04T09:50:44+00:00 BerroServer kernel: ? br_pass_frame_up+0xdd/0xdd [bridge] 2023-02-04T09:50:44+00:00 BerroServer kernel: __netif_receive_skb_core.constprop.0+0x4f9/0x6e3 2023-02-04T09:50:44+00:00 BerroServer kernel: ? dequeue_load_avg+0x30/0x6d 2023-02-04T09:50:44+00:00 BerroServer kernel: ? enqueue_entity+0x150/0x1ae 2023-02-04T09:50:44+00:00 BerroServer kernel: __netif_receive_skb_one_core+0x40/0x9c 2023-02-04T09:50:44+00:00 BerroServer kernel: process_backlog+0x8c/0x116 2023-02-04T09:50:44+00:00 BerroServer kernel: __napi_poll.constprop.0+0x2b/0x124 2023-02-04T09:50:44+00:00 BerroServer kernel: net_rx_action+0x159/0x24f 2023-02-04T09:50:44+00:00 BerroServer kernel: ? _raw_spin_lock_irq+0x19/0x22 2023-02-04T09:50:44+00:00 BerroServer kernel: __do_softirq+0x129/0x288 2023-02-04T09:50:44+00:00 BerroServer kernel: do_softirq+0x7f/0xab 2023-02-04T09:50:44+00:00 BerroServer kernel: </IRQ> 2023-02-04T09:50:44+00:00 BerroServer kernel: <TASK> 2023-02-04T09:50:44+00:00 BerroServer kernel: __local_bh_enable_ip+0x4c/0x6b 2023-02-04T09:50:44+00:00 BerroServer kernel: ip_finish_output2+0x37d/0x3b0 2023-02-04T09:50:44+00:00 BerroServer kernel: ip_send_skb+0x15/0x3b 2023-02-04T09:50:44+00:00 BerroServer kernel: udp_send_skb+0x278/0x2e6 2023-02-04T09:50:44+00:00 BerroServer kernel: udp_sendmsg+0x72c/0x991 2023-02-04T09:50:44+00:00 BerroServer kernel: ? ip_neigh_gw4+0x8b/0x8b 2023-02-04T09:50:44+00:00 BerroServer kernel: ? sched_clock_cpu+0x12/0xa1 2023-02-04T09:50:44+00:00 BerroServer kernel: ? __smp_call_single_queue+0x23/0x35 2023-02-04T09:50:44+00:00 BerroServer kernel: ? 
ttwu_queue_wakelist+0x9a/0xcf 2023-02-04T09:50:44+00:00 BerroServer kernel: ? _raw_spin_unlock_irqrestore+0x24/0x3a 2023-02-04T09:50:44+00:00 BerroServer kernel: ? try_to_wake_up+0x20e/0x248 2023-02-04T09:50:44+00:00 BerroServer kernel: ? sock_sendmsg_nosec+0x2b/0x40 2023-02-04T09:50:44+00:00 BerroServer kernel: sock_sendmsg_nosec+0x2b/0x40 2023-02-04T09:50:44+00:00 BerroServer kernel: sock_write_iter+0x89/0xb8 2023-02-04T09:50:44+00:00 BerroServer kernel: new_sync_write+0x7f/0xbb 2023-02-04T09:50:44+00:00 BerroServer kernel: vfs_write+0xda/0x129 2023-02-04T09:50:44+00:00 BerroServer kernel: ksys_write+0x76/0xc2 2023-02-04T09:50:44+00:00 BerroServer kernel: ? fpregs_assert_state_consistent+0x1d/0x41 2023-02-04T09:50:44+00:00 BerroServer kernel: do_syscall_64+0x6b/0x81 2023-02-04T09:50:44+00:00 BerroServer kernel: entry_SYSCALL_64_after_hwframe+0x63/0xcd 2023-02-04T09:50:44+00:00 BerroServer kernel: RIP: 0033:0x40394e 2023-02-04T09:50:44+00:00 BerroServer kernel: Code: 48 89 6c 24 38 48 8d 6c 24 38 e8 0d 00 00 00 48 8b 6c 24 38 48 83 c4 40 c3 cc cc cc 49 89 f2 48 89 fa 48 89 ce 48 89 df 0f 05 <48> 3d 01 f0 ff ff 76 15 48 f7 d8 48 89 c1 48 c7 c0 ff ff ff ff 48 2023-02-04T09:50:44+00:00 BerroServer kernel: RSP: 002b:000000c0001a6198 EFLAGS: 00000206 ORIG_RAX: 0000000000000001 2023-02-04T09:50:44+00:00 BerroServer kernel: RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 000000000040394e 2023-02-04T09:50:44+00:00 BerroServer kernel: RDX: 000000000000002f RSI: 000000c00020e002 RDI: 0000000000000009 2023-02-04T09:50:44+00:00 BerroServer kernel: RBP: 000000c0001a61d8 R08: 0000000000000000 R09: 0000000000000000 2023-02-04T09:50:44+00:00 BerroServer kernel: R10: 0000000000000000 R11: 0000000000000206 R12: 000000c0001a6318 2023-02-04T09:50:44+00:00 BerroServer kernel: R13: 0000000000000000 R14: 000000c000032340 R15: 000014c71145d038 2023-02-04T09:50:44+00:00 BerroServer kernel: </TASK> 2023-02-04T09:50:44+00:00 BerroServer kernel: Modules linked in: tcp_diag udp_diag 
inet_diag xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap xt_nat xt_tcpudp veth macvlan xt_conntrack nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype br_netfilter xfs nfsd auth_rpcgss oid_registry lockd grace sunrpc md_mod nct6775 nct6775_core hwmon_vid wmi jc42 iptable_nat xt_MASQUERADE nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables bridge stp llc bonding tls igb x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel ast kvm drm_vram_helper drm_ttm_helper ttm drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd drm rapl intel_cstate agpgart i2c_i801 syscopyarea i2c_algo_bit sysfillrect i2c_smbus ahci 2023-02-04T09:50:44+00:00 BerroServer kernel: sysimgblt intel_uncore ipmi_si i2c_core fb_sys_fops libahci thermal fan video backlight button unix [last unloaded: igb] 2023-02-04T09:50:44+00:00 BerroServer kernel: ---[ end trace 0000000000000000 ]--- 2023-02-04T09:50:44+00:00 BerroServer kernel: RIP: 0010:nf_nat_setup_info+0x142/0x7b1 [nf_nat] 2023-02-04T09:50:44+00:00 BerroServer kernel: Code: 4c 89 f7 e8 2f f8 ff ff 48 8b 15 66 6a 00 00 89 c0 48 8d 04 c2 4c 8b 28 4d 85 ed 74 2a 49 81 ed 90 00 00 00 eb 21 8a 44 24 46 <41> 38 45 46 74 21 49 8b 95 90 00 00 00 48 85 d2 0f 84 53 ff ff ff 2023-02-04T09:50:44+00:00 BerroServer kernel: RSP: 0018:ffffc90000178730 EFLAGS: 00010282 2023-02-04T09:50:44+00:00 BerroServer kernel: RAX: ffff888103edf511 RBX: ffff8881a113b100 RCX: 469e93f3514ea88e 2023-02-04T09:50:44+00:00 BerroServer kernel: RDX: a2948afb5bf76251 RSI: d1e865f3880a7926 RDI: 1f64589d6c3d4144 2023-02-04T09:50:44+00:00 BerroServer kernel: RBP: ffffc900001787f8 R08: d776c335d6b7943a R09: 91b317315f4e5ab5 2023-02-04T09:50:44+00:00 
BerroServer kernel: R10: 2d5ac8d98b98afa7 R11: ce13e7c889e48066 R12: ffffc9000017880c
2023-02-04T09:50:44+00:00 BerroServer kernel: R13: a2948afb5bf761c1 R14: ffffffff82909480 R15: 0000000000000000
2023-02-04T09:50:44+00:00 BerroServer kernel: FS: 000000c000380090(0000) GS:ffff88880fcc0000(0000) knlGS:0000000000000000
2023-02-04T09:50:44+00:00 BerroServer kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2023-02-04T09:50:44+00:00 BerroServer kernel: CR2: 000000c00040e000 CR3: 00000001870d4002 CR4: 00000000001706e0
Is there something useful in the attached log? Thanks in advance.
berroserver-diagnostics-20230205-1844.zip
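For anyone trying to read a trace like the one above, a small script can pull out the faulting function and tally which kernel modules appear in the call trace. The sketch below is illustrative only: the regular expressions are written against the line shapes visible in this log (kernel-mode `RIP: 0010:` lines and `func+0xOFF/0xLEN [module]` frames), and `summarise_oops` is a made-up helper name, not part of any tool.

```python
import re
from collections import Counter

# Kernel-mode instruction pointer, e.g. "RIP: 0010:nf_nat_setup_info+0x142/0x7b1 [nf_nat]".
RIP_RE = re.compile(
    r"RIP: 0010:(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<module>\w+)\])?"
)
# Call-trace frame; a leading "? " marks an address the unwinder is unsure about.
FRAME_RE = re.compile(
    r"kernel:\s+(?P<guess>\? )?(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?\s*$"
)


def summarise_oops(lines):
    """Return ((function, module) of the fault, Counter of modules in reliable frames)."""
    fault = None
    modules = Counter()
    for line in lines:
        rip = RIP_RE.search(line)
        if rip:
            if fault is None:  # the trace repeats the RIP line; keep the first
                fault = (rip.group("func"), rip.group("module"))
            continue
        frame = FRAME_RE.search(line)
        if frame and not frame.group("guess"):
            modules[frame.group("module") or "(built-in)"] += 1
    return fault, modules


# A few lines copied from the trace quoted in this post.
SAMPLE = [
    "2023-02-04T09:50:44+00:00 BerroServer kernel: RIP: 0010:nf_nat_setup_info+0x142/0x7b1 [nf_nat]",
    "2023-02-04T09:50:44+00:00 BerroServer kernel: nf_nat_masquerade_ipv4+0x114/0x13c [nf_nat]",
    "2023-02-04T09:50:44+00:00 BerroServer kernel: masquerade_tg+0x48/0x66 [xt_MASQUERADE]",
    "2023-02-04T09:50:44+00:00 BerroServer kernel: ? br_pass_frame_up+0xdd/0xdd [bridge]",
    "2023-02-04T09:50:44+00:00 BerroServer kernel: br_handle_frame+0x27f/0x2e7 [bridge]",
    "2023-02-04T09:50:44+00:00 BerroServer kernel: ip_output+0x78/0x88",
]

fault, modules = summarise_oops(SAMPLE)
print("fault in:", fault)
print("modules in trace:", dict(modules))
```

On the sample lines the fault lands in `nf_nat_setup_info` in the `nf_nat` module, with `bridge` and `xt_MASQUERADE` frames in the path, which suggests the crash happened in the network/NAT code rather than in storage. That is a reading of the trace, not a diagnosis.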
JorgeB Posted February 6, 2023
Try switching to ipvlan (Settings -> Docker Settings -> Docker custom network type -> ipvlan (advanced view must be enabled, top right)).
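Before changing the setting, it may help to see which Docker networks currently use the macvlan driver (the module list in the trace above includes `macvlan`). The following is a minimal sketch, assuming the `docker` CLI is on the PATH; `docker network ls --format` with the `.Name` and `.Driver` fields is a standard Docker CLI feature, while the helper names and the sample network names are hypothetical.

```python
import subprocess


def list_networks():
    """Return the raw 'name<TAB>driver' listing from the Docker CLI."""
    result = subprocess.run(
        ["docker", "network", "ls", "--format", "{{.Name}}\t{{.Driver}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def macvlan_networks(listing):
    """Return the names of networks whose driver is macvlan."""
    names = []
    for row in listing.splitlines():
        parts = row.split("\t")
        if len(parts) == 2 and parts[1].strip() == "macvlan":
            names.append(parts[0].strip())
    return names


# Hypothetical listing, used only to show the parsing; real names will differ.
SAMPLE_LISTING = "bridge\tbridge\nbr0\tmacvlan\nhost\thost\n"
print(macvlan_networks(SAMPLE_LISTING))
```

On the server itself, `macvlan_networks(list_networks())` would give the live list. After switching to ipvlan in the Docker settings, the same check should come back empty.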
aleberro Posted February 7, 2023 Author
OK, thanks, I'll post if the problem persists.
javinp Posted September 6, 2023
On 2/6/2023 at 2:29 PM, aleberro said: Ok, thanks, I'll post if problem persists.
How's it been going? Any issues? Any fixes you recommend? I've been having this issue lately...
VPiedade Posted September 17, 2023
Any updates on this issue? I'm still having to deal with the same issue on my HPE MicroServer Gen8.
JorgeB Posted September 18, 2023
Enable the syslog server and post that after a crash.
RoTalk Posted October 20, 2023
On 2/6/2023 at 4:43 AM, JorgeB said: [RESOLVED] Try switching to ipvlan (Settings -> Docker Settings -> Docker custom network type -> ipvlan (advanced view must be enabled, top right)).
Do we know why this would be an issue? Mine suddenly became unresponsive a few weeks back. I tried safe mode, and safe mode without plugins, and the only fix was a reboot, as I couldn't log in via SSH, the GUI, or console/video. I switched Docker to ipvlan and so far so good; additionally, I also disabled Docker/VMs. Hopefully this sticks and fixes my issue as well.
SpyKiIIer Posted January 1
I've been having this issue as well. I changed the motherboard, RAM, and CPU, and the same thing occurs. Sometimes it takes over a month, sometimes it happens multiple times in a day. I have made this change and will see if it makes a difference.
RoTalk Posted January 1
Mine was good for 4 days and then it started again. I also tested with all VMs and Docker containers off: same thing. I also spent 2 days on mem86 testing all RAM/CPU, and it came back clear. I'm leaning towards maybe a bad USB? I also upgraded the motherboard firmware, as it was 4 versions behind.
exwebjunkie Posted March 29
Did you ever get anywhere with fixing this? I'm having a similar issue.
vaahr Posted Tuesday at 04:13 PM
This is happening to me as well. Has anyone found the cause yet?
RoTalk Posted Tuesday at 08:10 PM
I started looking at the router that the server is connected to and noticed that the lights were off on that particular port. Unplugging the Ethernet cable at the router and putting it back in did not work, but disconnecting the Ethernet cable from the back of the server/PC and re-inserting it got the lights back on, and I could immediately connect again via RDP, SSH, and the GUI. Going through the logs, there is/was a Docker container that was messing with the same LAN adapter, or something to that extent.
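If the server is still reachable on a local console when this happens, the link state described above can also be checked from the Linux side. Here is a rough sketch that reads the standard sysfs files `operstate` and `carrier` under `/sys/class/net/<interface>`; the interface name `eth0` is an assumption and `link_state` is a made-up helper name.

```python
from pathlib import Path


def link_state(iface, root="/sys/class/net"):
    """Return the operstate and carrier values for an interface (None if unreadable)."""
    base = Path(root) / iface
    state = {}
    for name in ("operstate", "carrier"):
        try:
            state[name] = (base / name).read_text().strip()
        except OSError:
            # Missing interface, or carrier is unreadable while the interface is down.
            state[name] = None
    return state


# "eth0" is assumed here; substitute the adapter the server actually uses.
print(link_state("eth0"))
```

A result such as `operstate` reading `down` while the machine is otherwise running would match the dark port lights on the router. Logging this periodically to the flash drive or syslog could show whether the link drops before the server becomes unreachable.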