CorruptComputer Posted November 21, 2022
I tried to log into my server this morning to upgrade to the newly released 6.11.5 and noticed that it had hung, and was returning either a 500 server error or timing out when I tried to access the web interface. After rebooting through the iLO, though, the syslog is just empty, so I'm not sure what was happening. This never happened before this version, and nothing has changed on the hardware side, so I don't think it's hardware-related; the server has been running perfectly for years at this point. Does anyone know where the logs are stored, so I can get a record of what happened? I'm going to hold off on updating in case that wipes the old logs out, and see if it happens again.
JorgeB Posted November 21, 2022
Enable the syslog server and, if it happens again, post that log together with the complete diagnostics.
CorruptComputer (Author) Posted November 21, 2022
10 minutes ago, JorgeB said:
Enable the syslog server and post that if it happens again together with the complete diagnostics.
So no logs are stored if this wasn't enabled beforehand... Why is this not enabled by default? It seems rather useless to enable it AFTER a problem occurs, but I've enabled it now, so hopefully it will give some insight into what is going on if this happens again.
EDIT: Perhaps this info could be added to the setup guide? I feel it's very important to have system logging enabled for when problems occur. https://wiki.unraid.net/Articles/Getting_Started
JonathanM Posted November 21, 2022
5 minutes ago, CorruptComputer said:
Why is this not enabled by default?
Because there is no good universal location to log to; every situation is slightly different. The only location guaranteed to exist is the flash drive itself, and that is a very poor choice for logging, as the constant writes put a lot of wear on the licensed USB drive. Probably the best option is to send the logs to an SSD, but not everyone uses one in their server, and the path can be different for any given install. If you do choose to log to the boot USB, be sure to turn that off again as soon as you can.
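For reference, Unraid's built-in syslog server is (as far as I know) rsyslog under the hood, and pointing it at a disk-backed share amounts to a rule along these lines. This is only a sketch; the share path is an assumption for illustration, and on a real server the GUI setting writes the rule for you:

```conf
# Hypothetical rsyslog rule: write all facilities/priorities to a log file
# on an array share instead of the flash drive. The path is an assumption;
# any disk-backed share works, ideally one on an SSD-backed pool.
*.* /mnt/user/syslog/syslog-local.log
```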
CorruptComputer (Author) Posted November 21, 2022
11 minutes ago, JonathanM said:
Because there is no good universal location to log to, every situation is slightly different. The only universal location that is guaranteed to exist is the flash drive itself, and that is a very poor choice for a logging location as all the constant writes put much wear and tear on the licensed USB. Probably the best option is to send the logs to a SSD, but not everyone uses one in their server, and the path can be different for any given install. If you do choose to log to the boot USB, be sure to turn that off as soon as you can.
Yeah, for sure, I'm not logging to the USB. I added a new share and am saving the logs there. How large can I generally expect them to be? I set up a rotation of 4 files at 100 MB each; do you think that's sufficient if I notice issues within a day or two of them happening?
Not sure if my edit was in by the time you were replying, so I'll ask again: should the syslog configuration be added to the setup guide, so folks can get a log of issues when they happen?
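For scale, the rotation described above behaves like a logrotate rule of roughly this shape, i.e. at most about 500 MB on disk (the live file plus four archives), which is normally plenty unless something floods the log. This is a sketch only; the file name is an assumption, and Unraid's GUI handles the rotation itself:

```conf
# Hypothetical logrotate-style equivalent of "rotation of 4 at 100 MB":
# rotate whenever the live file reaches 100 MB, keeping 4 archives.
/mnt/user/syslog/syslog-local.log {
    size 100M
    rotate 4
    compress
    missingok
}
```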
CorruptComputer (Author) Posted April 4, 2023
Just had this happen to me again on 6.11.5, and got the following log:

Apr 1 01:02:51 neptune kernel: ------------[ cut here ]------------
Apr 1 01:02:51 neptune kernel: WARNING: CPU: 0 PID: 926 at net/netfilter/nf_conntrack_core.c:1208 __nf_conntrack_confirm+0xa5/0x2cb [nf_conntrack]
Apr 1 01:02:51 neptune kernel: Modules linked in: xt_mark xt_nat xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap macvlan xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs md_mod efivarfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables bridge stp llc bonding tls igb x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel mgag200 ipmi_ssif drm_shmem_helper ghash_clmulni_intel aesni_intel drm_kms_helper crypto_simd cryptd drm rapl nvme intel_cstate intel_uncore backlight syscopyarea i2c_algo_bit acpi_ipmi sysfillrect sysimgblt i2c_core fb_sys_fops ahci nvme_core intel_pch_thermal wmi ipmi_si libahci acpi_tad acpi_power_meter button unix [last unloaded: igb]
Apr 1 01:02:51 neptune kernel: CPU: 0 PID: 926 Comm: kworker/0:1 Not tainted 5.19.17-Unraid #2
Apr 1 01:02:51 neptune kernel: Hardware name: HPE ProLiant MicroServer Gen10 Plus/ProLiant MicroServer Gen10 Plus, BIOS U48 07/14/2022
Apr 1 01:02:51 neptune kernel: Workqueue: events macvlan_process_broadcast [macvlan]
Apr 1 01:02:51 neptune kernel: RIP: 0010:__nf_conntrack_confirm+0xa5/0x2cb [nf_conntrack]
Apr 1 01:02:51 neptune kernel: Code: c6 48 89 44 24 10 e8 dd e2 ff ff 8b 7c 24 04 89 da 89 c6 89 04 24 e8 56 e6 ff ff 84 c0 75 a2 48 8b 85 80 00 00 00 a8 08 74 18 <0f> 0b 8b 34 24 8b 7c 24 04 e8 16 de ff ff e8 2c e3 ff ff e9 7e 01
Apr 1 01:02:51 neptune kernel: RSP: 0018:ffffc90000003cf0 EFLAGS: 00010202
Apr 1 01:02:51 neptune kernel: RAX: 0000000000000188 RBX: 0000000000000000 RCX: ab746607d338df42
Apr 1 01:02:51 neptune kernel: RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffffa02fcccc
Apr 1 01:02:51 neptune kernel: RBP: ffff8881e9620900 R08: 9f6e9c8d27a2e914 R09: d373ae0a6dc13241
Apr 1 01:02:51 neptune kernel: R10: 9920e9b9d70536e0 R11: 0600604e2c14e251 R12: ffffffff82909480
Apr 1 01:02:51 neptune kernel: R13: 0000000000036d3d R14: ffff8881e9e71400 R15: 0000000000000000
Apr 1 01:02:51 neptune kernel: FS: 0000000000000000(0000) GS:ffff88885ec00000(0000) knlGS:0000000000000000
Apr 1 01:02:51 neptune kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr 1 01:02:51 neptune kernel: CR2: 00001464ee5d8000 CR3: 000000000420a003 CR4: 00000000003706f0
Apr 1 01:02:51 neptune kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Apr 1 01:02:51 neptune kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Apr 1 01:02:51 neptune kernel: Call Trace:
Apr 1 01:02:51 neptune kernel: <IRQ>
Apr 1 01:02:51 neptune kernel: nf_conntrack_confirm+0x25/0x54 [nf_conntrack]
Apr 1 01:02:51 neptune kernel: nf_hook_slow+0x3a/0x96
Apr 1 01:02:51 neptune kernel: ? ip_protocol_deliver_rcu+0x164/0x164
Apr 1 01:02:51 neptune kernel: NF_HOOK.constprop.0+0x79/0xd9
Apr 1 01:02:51 neptune kernel: ? ip_protocol_deliver_rcu+0x164/0x164
Apr 1 01:02:51 neptune kernel: ip_sabotage_in+0x47/0x58 [br_netfilter]
Apr 1 01:02:51 neptune kernel: nf_hook_slow+0x3a/0x96
Apr 1 01:02:51 neptune kernel: ? ip_rcv_finish_core.constprop.0+0x3b7/0x3b7
Apr 1 01:02:51 neptune kernel: NF_HOOK.constprop.0+0x79/0xd9
Apr 1 01:02:51 neptune kernel: ? ip_rcv_finish_core.constprop.0+0x3b7/0x3b7
Apr 1 01:02:51 neptune kernel: __netif_receive_skb_one_core+0x77/0x9c
Apr 1 01:02:51 neptune kernel: process_backlog+0x8c/0x116
Apr 1 01:02:51 neptune kernel: __napi_poll.constprop.0+0x28/0x124
Apr 1 01:02:51 neptune kernel: net_rx_action+0x159/0x24f
Apr 1 01:02:51 neptune kernel: __do_softirq+0x126/0x288
Apr 1 01:02:51 neptune kernel: do_softirq+0x7f/0xab
Apr 1 01:02:51 neptune kernel: </IRQ>
Apr 1 01:02:51 neptune kernel: <TASK>
Apr 1 01:02:51 neptune kernel: __local_bh_enable_ip+0x4c/0x6b
Apr 1 01:02:51 neptune kernel: netif_rx+0x52/0x5a
Apr 1 01:02:51 neptune kernel: macvlan_broadcast+0x10a/0x150 [macvlan]
Apr 1 01:02:51 neptune kernel: macvlan_process_broadcast+0xbc/0x12f [macvlan]
Apr 1 01:02:51 neptune kernel: process_one_work+0x1a8/0x295
Apr 1 01:02:51 neptune kernel: worker_thread+0x18b/0x244
Apr 1 01:02:51 neptune kernel: ? rescuer_thread+0x281/0x281
Apr 1 01:02:51 neptune kernel: kthread+0xe4/0xef
Apr 1 01:02:51 neptune kernel: ? kthread_complete_and_exit+0x1b/0x1b
Apr 1 01:02:51 neptune kernel: ret_from_fork+0x1f/0x30
Apr 1 01:02:51 neptune kernel: </TASK>
Apr 1 01:02:51 neptune kernel: ---[ end trace 0000000000000000 ]---
Apr 1 05:00:15 neptune crond[1103]: exit status 1 from user root /usr/local/sbin/mover &> /dev/null
Apr 2 01:53:58 neptune webGUI: Successful login user root from 10.0.0.88
Apr 2 01:54:01 neptune sSMTP[24591]: Creating SSL connection to host
Apr 2 01:54:01 neptune sSMTP[24591]: SSL connection using ECDHE-RSA-AES256-GCM-SHA384
Apr 2 01:54:03 neptune sSMTP[24591]: Sent mail for [email protected] (221 Bye) uid=0 username=root outbytes=821
Apr 2 01:54:39 neptune flash_backup: adding task: /usr/local/emhttp/plugins/dynamix.my.servers/scripts/UpdateFlashBackup update
Apr 2 01:55:01 neptune sSMTP[26871]: Creating SSL connection to host
Apr 2 01:55:02 neptune sSMTP[26871]: SSL connection using ECDHE-RSA-AES256-GCM-SHA384
Apr 2 01:55:03 neptune sSMTP[26871]: Sent mail for [email protected] (221 Bye) uid=0 username=root outbytes=848
Apr 2 04:40:01 neptune apcupsd[1629]: apcupsd exiting, signal 15
Apr 2 04:40:01 neptune apcupsd[1629]: apcupsd shutdown succeeded
Apr 2 04:40:03 neptune apcupsd[17815]: apcupsd 3.14.14 (31 May 2016) slackware startup succeeded
Apr 2 04:40:03 neptune apcupsd[17815]: NIS server startup succeeded
Apr 2 05:00:09 neptune crond[1103]: exit status 1 from user root /usr/local/sbin/mover &> /dev/null
Apr 3 04:00:07 neptune avahi-daemon[2511]: Registering new address record for fe80::90bc:40ff:fe4f:443d on shim-br0.*.
Apr 3 04:00:12 neptune kernel: igb 0000:02:00.0 eth0: igb: eth0 NIC Link is Down
Apr 3 04:00:12 neptune kernel: bond0: (slave eth0): link status definitely down, disabling slave
Apr 3 04:00:12 neptune kernel: device eth0 left promiscuous mode
Apr 3 04:00:12 neptune kernel: bond0: now running without any active interface!
Apr 3 04:00:12 neptune kernel: br0: port 1(bond0) entered disabled state
Apr 3 04:00:15 neptune ntpd[1082]: Deleting interface #1 br0, 10.0.1.1#123, interface stats: received=1215, sent=1215, dropped=0, active_time=238051 secs
Apr 3 04:00:15 neptune ntpd[1082]: 143.215.130.72 local addr 10.0.1.1 -> <null>
Apr 3 04:00:15 neptune ntpd[1082]: 80.241.0.72 local addr 10.0.1.1 -> <null>
Apr 3 04:00:15 neptune ntpd[1082]: 142.202.190.19 local addr 10.0.1.1 -> <null>
Apr 3 04:00:15 neptune ntpd[1082]: 69.164.213.136 local addr 10.0.1.1 -> <null>
Apr 3 04:01:27 neptune kernel: igb 0000:02:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Apr 3 04:01:27 neptune kernel: bond0: (slave eth0): link status definitely up, 1000 Mbps full duplex
Apr 3 04:01:27 neptune kernel: bond0: (slave eth0): making interface the new active one
Apr 3 04:01:27 neptune kernel: device eth0 entered promiscuous mode
Apr 3 04:01:27 neptune kernel: bond0: active interface up!
Apr 3 04:01:27 neptune kernel: br0: port 1(bond0) entered blocking state
Apr 3 04:01:27 neptune kernel: br0: port 1(bond0) entered forwarding state
Apr 3 04:01:29 neptune ntpd[1082]: Listen normally on 3 br0 10.0.1.1:123
Apr 3 04:01:29 neptune ntpd[1082]: new interface(s) found: waking up resolver
Apr 3 04:01:30 neptune avahi-daemon[2511]: Withdrawing address record for fe80::90bc:40ff:fe4f:443d on shim-br0.
Apr 3 04:01:30 neptune avahi-daemon[2511]: Registering new address record for fe80::90bc:40ff:fe4f:443d on shim-br0.*.
Apr 3 04:02:06 neptune avahi-daemon[2511]: Withdrawing address record for fe80::90bc:40ff:fe4f:443d on shim-br0.
Apr 3 05:00:09 neptune crond[1103]: exit status 1 from user root /usr/local/sbin/mover &> /dev/null
Apr 4 02:52:06 neptune kernel: BUG: Bad rss-counter state mm:000000005de9be02 type:MM_SHMEMPAGES val:1
Apr 4 05:00:16 neptune crond[1103]: exit status 1 from user root /usr/local/sbin/mover &> /dev/null
Apr 4 08:31:50 neptune emhttpd: Starting services...

If I had to guess from reading this, it looks like the mover crashed and took down the rest of the server with it. My ping check shows the server stopped responding around 4:40 am, and I restarted it around 8:00 am.
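When reading a long syslog like this, it helps to jump straight to the kernel warning markers rather than scanning the fallout afterwards. A minimal sketch (the helper function is hypothetical, and the share path in the example call is an assumption):

```shell
# Sketch: list the kernel call-trace markers in a saved syslog so the first
# fault, rather than the later fallout, is easy to spot.
find_traces() {
    # "cut here" opens a kernel WARNING, "Call Trace" heads the backtrace,
    # and "end trace" closes it; -n prints matching line numbers.
    grep -n 'cut here\|Call Trace\|end trace' "$1"
}

# Example usage (path is an assumption):
# find_traces /mnt/user/syslog/syslog-local.log
```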
JorgeB Posted April 4, 2023
5 minutes ago, CorruptComputer said:
Apr 1 01:02:51 neptune kernel: macvlan_broadcast+0x10a/0x150 [macvlan]
Apr 1 01:02:51 neptune kernel: macvlan_process_broadcast+0xbc/0x12f [macvlan]
Macvlan call traces are usually the result of running Docker containers with custom IP addresses, and they will eventually crash the server. Switching to ipvlan should fix it: Settings -> Docker Settings -> Docker custom network type -> ipvlan (advanced view must be enabled, top right).
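For anyone managing containers outside the Unraid GUI, the equivalent change in a compose file would look roughly like this. This is a sketch only: the network name, parent interface (br0), and subnet are assumptions for illustration, not values from this thread:

```yaml
# Hypothetical compose fragment: an ipvlan network in place of macvlan.
# With ipvlan, containers get their own IPs but share the parent
# interface's MAC address, avoiding the macvlan broadcast path that
# produced the nf_conntrack call trace above.
networks:
  lan:
    driver: ipvlan
    driver_opts:
      parent: br0        # assumed bridge/interface name
      ipvlan_mode: l2
    ipam:
      config:
        - subnet: 10.0.0.0/24   # assumed LAN subnet
          gateway: 10.0.0.1

services:
  app:
    image: nginx
    networks:
      lan:
        ipv4_address: 10.0.0.50  # assumed static container IP
```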
CorruptComputer (Author) Posted April 4, 2023
Ah I see, thanks for letting me know! I found the option you said to change, but for me it is not able to be changed:
JorgeB Posted April 4, 2023
10 minutes ago, CorruptComputer said:
but for me it is not able to be changed:
You need to stop the Docker service first.
CorruptComputer (Author) Posted April 4, 2023
Ah, I see. I am currently running a parity check so I can't stop the array. I'll make that change tomorrow once the check finishes. Thank you for your help!
JorgeB Posted April 4, 2023
2 minutes ago, CorruptComputer said:
Ah, I see. I am currently running a parity check so I can't stop the array.
OK, but you don't need to stop the array, just the Docker service; it's the first option on that page.