relink Posted April 21, 2021

Ever since upgrading to the 6.9.x versions of Unraid I have been getting kernel panics at least once a week, sometimes more. I have no idea what to check next to get this fixed. So far I have checked the RAM and CPU, C-states are disabled in the BIOS, temps are fine, and the PSU is fine. I had heard it may have been the GPU Stats plugin, so I added the script below to run on first array start:

#!/bin/bash
nvidia-persistenced
fuser -v /dev/nvidia*

I have also removed any plugins I don't actively use, and made sure the ones I do use stay updated. The only thing I haven't tried yet is doing a BIOS update, which I would rather not do unless I completely run out of options. Diagnostics are from after a reboot, and the logs below are from my syslog server.

Date Time Level Host Name Category Program Messages
4/21/21 12:18:43 Warning SERVERUS kern kernel CR2: 0000000000000034
4/21/21 12:18:43 Warning SERVERUS kern kernel Modules linked in: md4 sha512_ssse3 sha512_generic cmac cifs libarc4 nfsv3 nfs nfs_ssc nvidia_uvm(PO) xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle nf_tables vhost_net tun vhost vhost_iotlb tap macvlan veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter ext4 mbcache jbd2 xfs nfsd lockd grace sunrpc md_mod nvidia_drm(PO) nvidia_modeset(PO) drm_kms_helper drm backlight agpgart syscopyarea sysfillrect sysimgblt fb_sys_fops nvidia(PO) ip6table_filter ip6_tables iptable_filter ip_tables x_tables bonding e1000e igb i2c_algo_bit edac_mce_amd kvm_amd kvm crct10dif_pclmul mxm_wmi wmi_bmof crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd nvme glue_helper input_leds nvme_core rapl ahci aacraid ccp i2c_piix4 led_class k10temp wmi i2c_core libahci button acpi_cpufreq [last unloaded: e1000e]
4/21/21 12:18:43 Warning SERVERUS kern kernel R13: 000014dc4a9acfe0 R14: 0000000000000010 R15: 000014dc4a7814e8
4/21/21 12:18:43 Warning SERVERUS kern kernel R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000002a
4/21/21 12:18:43 Warning SERVERUS kern kernel RBP: 000014dc490e0340 R08: 0000000000000000 R09: 0000000000000000
4/21/21 12:18:43 Warning SERVERUS kern kernel RDX: 0000000000000010 RSI: 000014dc495705e0 RDI: 000000000000000e
4/21/21 12:18:43 Warning SERVERUS kern kernel RAX: ffffffffffffffda RBX: 000014dc49574b38 RCX: 000014dc4e5a4352
4/21/21 12:18:43 Warning SERVERUS kern kernel RSP: 002b:000014dc49570538 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
4/21/21 12:18:43 Warning SERVERUS kern kernel Code: c3 8b 07 85 c0 75 24 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 89 5c 24 08 0f 05 <c3> e9 8a d2 ff ff 41 54 b8 02 00 00 00 49 89 f4 be 00 88 08 00 55
4/21/21 12:18:43 Warning SERVERUS kern kernel RIP: 0033:0x14dc4e5a4352
4/21/21 12:18:43 Warning SERVERUS kern kernel entry_SYSCALL_64_after_hwframe+0x44/0xa9
4/21/21 12:18:43 Warning SERVERUS kern kernel do_syscall_64+0x5d/0x6a
4/21/21 12:18:43 Warning SERVERUS kern kernel __x64_sys_connect+0x11/0x14
4/21/21 12:18:43 Warning SERVERUS kern kernel __sys_connect+0x62/0x9d
4/21/21 12:18:43 Warning SERVERUS kern kernel inet_stream_connect+0x34/0x49
4/21/21 12:18:43 Warning SERVERUS kern kernel __inet_stream_connect+0xd3/0x2b6
4/21/21 12:18:43 Warning SERVERUS kern kernel tcp_v4_connect+0x3fc/0x455
4/21/21 12:18:43 Warning SERVERUS kern kernel tcp_connect+0x76d/0x7f4
4/21/21 12:18:43 Warning SERVERUS kern kernel __tcp_transmit_skb+0x845/0x8ba
4/21/21 12:18:43 Warning SERVERUS kern kernel __ip_queue_xmit+0x2a3/0x2df
4/21/21 12:18:43 Warning SERVERUS kern kernel ? ipv4_mtu+0x3d/0x64
4/21/21 12:18:43 Warning SERVERUS kern kernel ip_finish_output2+0x2ec/0x31f
4/21/21 12:18:43 Warning SERVERUS kern kernel __local_bh_enable_ip+0x3b/0x43
4/21/21 12:18:43 Warning SERVERUS kern kernel do_softirq+0x3a/0x44
4/21/21 12:18:43 Warning SERVERUS kern kernel do_softirq_own_stack+0x2c/0x39
4/21/21 12:18:43 Warning SERVERUS kern kernel </IRQ>
4/21/21 12:18:43 Warning SERVERUS kern kernel asm_call_irq_on_stack+0xf/0x20
4/21/21 12:18:43 Warning SERVERUS kern kernel __do_softirq+0xc4/0x1c2
4/21/21 12:18:43 Warning SERVERUS kern kernel net_rx_action+0xf4/0x29d
4/21/21 12:18:43 Warning SERVERUS kern kernel process_backlog+0xa3/0x13b
4/21/21 12:18:43 Warning SERVERUS kern kernel __netif_receive_skb_one_core+0x3d/0x95
4/21/21 12:18:43 Warning SERVERUS kern kernel __netif_receive_skb_core+0x335/0x4e7
4/21/21 12:18:43 Warning SERVERUS kern kernel ? br_pass_frame_up+0xda/0xda
4/21/21 12:18:43 Warning SERVERUS kern kernel br_handle_frame+0x25e/0x2a6
4/21/21 12:18:43 Warning SERVERUS kern kernel ? br_nf_forward_finish+0xd0/0xd0 [br_netfilter]
4/21/21 12:18:43 Warning SERVERUS kern kernel br_nf_pre_routing+0x229/0x239 [br_netfilter]
4/21/21 12:18:43 Warning SERVERUS kern kernel ? br_nf_forward_finish+0xd0/0xd0 [br_netfilter]
4/21/21 12:18:43 Warning SERVERUS kern kernel NF_HOOK+0xd7/0xf7 [br_netfilter]
4/21/21 12:18:43 Warning SERVERUS kern kernel ? br_handle_frame_finish+0x351/0x351
4/21/21 12:18:43 Warning SERVERUS kern kernel ? br_nf_forward_finish+0xd0/0xd0 [br_netfilter]
4/21/21 12:18:43 Warning SERVERUS kern kernel ? nf_nat_ipv4_pre_routing+0x1e/0x4a [nf_nat]
4/21/21 12:18:43 Warning SERVERUS kern kernel ? br_handle_frame_finish+0x351/0x351
4/21/21 12:18:43 Warning SERVERUS kern kernel ? br_pass_frame_up+0xda/0xda
4/21/21 12:18:43 Warning SERVERUS kern kernel br_nf_pre_routing_finish+0x23d/0x264 [br_netfilter]
4/21/21 12:18:43 Warning SERVERUS kern kernel ? br_pass_frame_up+0xda/0xda
4/21/21 12:18:43 Warning SERVERUS kern kernel br_nf_hook_thresh+0xa3/0xc3 [br_netfilter]
4/21/21 12:18:43 Warning SERVERUS kern kernel ? br_pass_frame_up+0xda/0xda
4/21/21 12:18:43 Warning SERVERUS kern kernel ? ipt_do_table+0x570/0x5c0 [ip_tables]
4/21/21 12:18:43 Warning SERVERUS kern kernel br_handle_frame_finish+0x30d/0x351
4/21/21 12:18:43 Warning SERVERUS kern kernel netif_receive_skb+0x79/0xa1
4/21/21 12:18:43 Warning SERVERUS kern kernel __netif_receive_skb_one_core+0x74/0x95
4/21/21 12:18:43 Warning SERVERUS kern kernel ip_rcv+0x41/0x61
4/21/21 12:18:43 Warning SERVERUS kern kernel ? l3mdev_l3_rcv.constprop.0+0x50/0x50
4/21/21 12:18:43 Warning SERVERUS kern kernel nf_hook.constprop.0+0xb1/0xd8
4/21/21 12:18:43 Warning SERVERUS kern kernel nf_hook_slow+0x39/0x8e
4/21/21 12:18:43 Warning SERVERUS kern kernel ip_sabotage_in+0x43/0x4d [br_netfilter]
4/21/21 12:18:43 Warning SERVERUS kern kernel ? ip_check_defrag+0x18f/0x18f
4/21/21 12:18:43 Warning SERVERUS kern kernel ip_forward+0x3f1/0x420
4/21/21 12:18:43 Warning SERVERUS kern kernel ? __ip_finish_output+0x146/0x146
4/21/21 12:18:43 Warning SERVERUS kern kernel ip_output+0x7d/0x8a
4/21/21 12:18:43 Warning SERVERUS kern kernel ? __ip_finish_output+0x146/0x146
4/21/21 12:18:43 Warning SERVERUS kern kernel nf_hook+0xab/0xd3
4/21/21 12:18:43 Warning SERVERUS kern kernel nf_hook_slow+0x39/0x8e
4/21/21 12:18:43 Warning SERVERUS kern kernel nf_nat_ipv4_out+0xf/0x88 [nf_nat]
4/21/21 12:18:43 Warning SERVERUS kern kernel nf_nat_inet_fn+0xe9/0x183 [nf_nat]
4/21/21 12:18:43 Warning SERVERUS kern kernel ? fib_validate_source+0xb0/0xda
4/21/21 12:18:43 Warning SERVERUS kern kernel ? ipt_do_table+0x570/0x5c0 [ip_tables]
4/21/21 12:18:43 Warning SERVERUS kern kernel ? __dev_queue_xmit+0x4d9/0x501
4/21/21 12:18:43 Warning SERVERUS kern kernel ipt_do_table+0x51a/0x5c0 [ip_tables]
4/21/21 12:18:43 Warning SERVERUS kern kernel ? netdev_start_xmit+0x1b/0x38
4/21/21 12:18:43 Warning SERVERUS kern kernel masquerade_tg+0x44/0x5e [xt_MASQUERADE]
4/21/21 12:18:43 Warning SERVERUS kern kernel nf_nat_masquerade_ipv4+0x10b/0x131 [nf_nat]
4/21/21 12:18:43 Warning SERVERUS kern kernel ? krealloc+0x26/0x7a
4/21/21 12:18:43 Warning SERVERUS kern kernel ? __ksize+0x15/0x64
4/21/21 12:18:43 Warning SERVERUS kern kernel ? sch_direct_xmit+0x16/0x1de
4/21/21 12:18:43 Warning SERVERUS kern kernel <IRQ>
4/21/21 12:18:43 Warning SERVERUS kern kernel Call Trace:
4/21/21 12:18:43 Warning SERVERUS kern kernel CR2: 0000000000000034 CR3: 0000000216e82000 CR4: 00000000003506f0
4/21/21 12:18:43 Warning SERVERUS kern kernel CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
4/21/21 12:18:43 Warning SERVERUS kern kernel FS: 000014dc49574b38(0000) GS:ffff8887fe800000(0000) knlGS:0000000000000000
4/21/21 12:18:43 Warning SERVERUS kern kernel R13: ffffc90000003720 R14: ffffc900000037dc R15: ffffffff8210b440
4/21/21 12:18:43 Warning SERVERUS kern kernel R10: ffff88821d56c388 R11: ffffffff815cbe4b R12: 0000000000000000
4/21/21 12:18:43 Warning SERVERUS kern kernel RBP: ffffc900000037c8 R08: 00000000d638ba09 R09: ffff88818fa62b80
4/21/21 12:18:43 Warning SERVERUS kern kernel RDX: ffffffffffffffee RSI: 000000007418f38d RDI: ffffc90000003720
4/21/21 12:18:43 Warning SERVERUS kern kernel RAX: ffff88819ec50206 RBX: ffff88816107f0c0 RCX: 0000000000000000
4/21/21 12:18:43 Warning SERVERUS kern kernel RSP: 0018:ffffc90000003700 EFLAGS: 00010286
4/21/21 12:18:43 Warning SERVERUS kern kernel Code: ff 48 8b 15 ef 6a 00 00 89 c0 48 8d 04 c2 48 8b 10 48 85 d2 74 80 48 81 ea 98 00 00 00 48 85 d2 0f 84 70 ff ff ff 8a 44 24 46 <38> 42 46 74 09 48 8b 92 98 00 00 00 eb d9 48 8b 4a 20 48 8b 42 28
4/21/21 12:18:43 Warning SERVERUS kern kernel RIP: 0010:nf_nat_setup_info+0x129/0x6aa [nf_nat]
4/21/21 12:18:43 Warning SERVERUS kern kernel Hardware name: System manufacturer System Product Name/ROG STRIX B450-F GAMING, BIOS 2301 04/19/2019
4/21/21 12:18:43 Warning SERVERUS kern kernel CPU: 0 PID: 13620 Comm: python3 Tainted: P W O 5.10.28-Unraid #1
4/21/21 12:18:43 Warning SERVERUS kern kernel Oops: 0000 [#1] SMP NOPTI
4/21/21 12:18:43 Information SERVERUS kern kernel PGD 0 P4D 0
4/21/21 12:18:43 Alert SERVERUS kern kernel #PF: error_code(0x0000) - not-present page
4/21/21 12:18:43 Alert SERVERUS kern kernel #PF: supervisor read access in kernel mode
4/21/21 12:18:43 Alert SERVERUS kern kernel BUG: kernel NULL pointer dereference, address: 0000000000000034
4/21/21 12:00:44 Information SERVERUS kern kernel eth0: renamed from vethbb3a427
4/21/21 12:00:44 Information SERVERUS kern kernel veth37fa1d8: renamed from eth0
4/21/21 12:00:44 Information SERVERUS kern kernel oom_reaper: reaped process 3059 (xteve), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
4/21/21 12:00:44 Error SERVERUS kern kernel Memory cgroup out of memory: Killed process 3059 (xteve) total-vm:2784340kB, anon-rss:2087212kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:4176kB oom_score_adj:0
4/21/21 12:00:44 Information SERVERUS kern kernel oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=5d46a8bd8153a70ccebf7b3f8e963ede192d277826d80f9653f49ee87ea73d89,mems_allowed=0,oom_memcg=/docker/5d46a8bd8153a70ccebf7b3f8e963ede192d277826d80f9653f49ee87ea73d89,task_memcg=/docker/5d46a8bd8153a70ccebf7b3f8e963ede192d277826d80f9653f49ee87ea73d89,task=xteve,pid=3059,uid=0
4/21/21 12:00:44 Information SERVERUS kern kernel [ 3059] 0 3059 696085 521803 4276224 0 0 xteve
4/21/21 12:00:44 Information SERVERUS kern kernel [ 3054] 0 3054 396 11 36864 0 0 crond
4/21/21 12:00:44 Information SERVERUS kern kernel [ 2968] 0 2968 554 53 40960 0 0 entrypoint.sh
4/21/21 12:00:44 Information SERVERUS kern kernel [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
4/21/21 12:00:44 Information SERVERUS kern kernel Tasks state (memory values in pages):
4/21/21 12:00:44 Information SERVERUS kern kernel thp_collapse_alloc 0
4/21/21 12:00:44 Information SERVERUS kern kernel thp_fault_alloc 66
4/21/21 12:00:44 Information SERVERUS kern kernel pglazyfreed 0
4/21/21 12:00:44 Information SERVERUS kern kernel pglazyfree 0
4/21/21 12:00:44 Information SERVERUS kern kernel pgdeactivate 104645
4/21/21 12:00:44 Information SERVERUS kern kernel pgactivate 732171
4/21/21 12:00:44 Information SERVERUS kern kernel pgsteal 272820
4/21/21 12:00:44 Information SERVERUS kern kernel pgscan 20914208
4/21/21 12:00:44 Information SERVERUS kern kernel pgrefill 107383
4/21/21 12:00:44 Information SERVERUS kern kernel pgmajfault 231
4/21/21 12:00:44 Information SERVERUS kern kernel pgfault 764247
4/21/21 12:00:44 Information SERVERUS kern kernel workingset_nodereclaim 0
4/21/21 12:00:44 Information SERVERUS kern kernel workingset_restore_file 2013
4/21/21 12:00:44 Information SERVERUS kern kernel workingset_restore_anon 0
4/21/21 12:00:44 Information SERVERUS kern kernel workingset_activate_file 3927
4/21/21 12:00:44 Information SERVERUS kern kernel workingset_activate_anon 0
4/21/21 12:00:44 Information SERVERUS kern kernel workingset_refault_file 212124
4/21/21 12:00:44 Information SERVERUS kern kernel workingset_refault_anon 0
4/21/21 12:00:44 Information SERVERUS kern kernel slab 297176
4/21/21 12:00:44 Information SERVERUS kern kernel slab_unreclaimable 1336
4/21/21 12:00:44 Information SERVERUS kern kernel slab_reclaimable 295840
4/21/21 12:00:44 Information SERVERUS kern kernel unevictable 0
4/21/21 12:00:44 Information SERVERUS kern kernel active_file 77824
4/21/21 12:00:44 Information SERVERUS kern kernel inactive_file 2076672
4/21/21 12:00:44 Information SERVERUS kern kernel active_anon 270336
4/21/21 12:00:44 Information SERVERUS kern kernel inactive_anon 2139475968
4/21/21 12:00:44 Information SERVERUS kern kernel anon_thp 69206016
4/21/21 12:00:44 Information SERVERUS kern kernel file_writeback 675840
4/21/21 12:00:44 Information SERVERUS kern kernel file_dirty 0
4/21/21 12:00:44 Information SERVERUS kern kernel file_mapped 270336
4/21/21 12:00:44 Information SERVERUS kern kernel shmem 0
4/21/21 12:00:44 Information SERVERUS kern kernel sock 0
4/21/21 12:00:44 Information SERVERUS kern kernel percpu 0
4/21/21 12:00:44 Information SERVERUS kern kernel kernel_stack 147456
4/21/21 12:00:44 Information SERVERUS kern kernel file 3108864
4/21/21 12:00:44 Information SERVERUS kern kernel anon 2138357760
4/21/21 12:00:44 Information SERVERUS kern kernel Memory cgroup stats for /docker/5d46a8bd8153a70ccebf7b3f8e963ede192d277826d80f9653f49ee87ea73d89:
4/21/21 12:00:44 Information SERVERUS kern kernel kmem: usage 5288kB, limit 9007199254740988kB, failcnt 0
4/21/21 12:00:44 Information SERVERUS kern kernel memory+swap: usage 2097156kB, limit 4194304kB, failcnt 0
4/21/21 12:00:44 Information SERVERUS kern kernel memory: usage 2097156kB, limit 2097152kB, failcnt 76745
4/21/21 12:00:44 Warning SERVERUS kern kernel R13: 0000000000bcd4e0 R14: 0000000000000000 R15: 000000000046c560
4/21/21 12:00:44 Warning SERVERUS kern kernel R10: 000014f1d1566eb0 R11: ed386059261e7389 R12: 0000000000000002
4/21/21 12:00:44 Warning SERVERUS kern kernel RBP: 000000c00079b888 R08: 0000000000000009 R09: ffffffffffffffff
4/21/21 12:00:44 Warning SERVERUS kern kernel RDX: 00000000009ea5f4 RSI: 000000000041a345 RDI: 000000c00079bac0
4/21/21 12:00:44 Warning SERVERUS kern kernel RAX: 0000000000bcd4e0 RBX: 0000000000000003 RCX: 000000c000288f00
4/21/21 12:00:44 Warning SERVERUS kern kernel RSP: 002b:000000c00079b878 EFLAGS: 00010216
4/21/21 12:00:44 Warning SERVERUS kern kernel Code: Unable to access opcode bytes at RIP 0x4548ff.
4/21/21 12:00:44 Warning SERVERUS kern kernel RIP: 0033:0x454929
4/21/21 12:00:44 Warning SERVERUS kern kernel asm_exc_page_fault+0x1e/0x30
4/21/21 12:00:44 Warning SERVERUS kern kernel ? asm_exc_page_fault+0x8/0x30
4/21/21 12:00:44 Warning SERVERUS kern kernel exc_page_fault+0x259/0x373
4/21/21 12:00:44 Warning SERVERUS kern kernel ? __raw_spin_unlock_irq+0x5/0xd
4/21/21 12:00:44 Warning SERVERUS kern kernel handle_mm_fault+0xb83/0xec3
4/21/21 12:00:44 Warning SERVERUS kern kernel __do_fault+0x49/0x64
4/21/21 12:00:44 Warning SERVERUS kern kernel ? filemap_map_pages+0x1ec/0x217
4/21/21 12:00:44 Warning SERVERUS kern kernel ? xas_find+0xa9/0x121
4/21/21 12:00:44 Warning SERVERUS kern kernel filemap_fault+0x2a4/0x478
4/21/21 12:00:44 Warning SERVERUS kern kernel pagecache_get_page+0x108/0x13e
4/21/21 12:00:44 Warning SERVERUS kern kernel add_to_page_cache_lru+0x42/0xa7
4/21/21 12:00:44 Warning SERVERUS kern kernel ? lruvec_page_state+0x2f/0x2f
4/21/21 12:00:44 Warning SERVERUS kern kernel __add_to_page_cache_locked+0xab/0x274
4/21/21 12:00:44 Warning SERVERUS kern kernel mem_cgroup_charge+0xfe/0x17e
4/21/21 12:00:44 Warning SERVERUS kern kernel ? mem_cgroup_charge+0x170/0x17e
4/21/21 12:00:44 Warning SERVERUS kern kernel try_charge+0x3ec/0x501
4/21/21 12:00:44 Warning SERVERUS kern kernel mem_cgroup_out_of_memory+0x79/0xae
4/21/21 12:00:44 Warning SERVERUS kern kernel out_of_memory+0x3dd/0x410
4/21/21 12:00:44 Warning SERVERUS kern kernel oom_kill_process+0x7b/0xf6
4/21/21 12:00:44 Warning SERVERUS kern kernel dump_header+0x45/0x1e8
4/21/21 12:00:44 Warning SERVERUS kern kernel dump_stack+0x6b/0x83
4/21/21 12:00:44 Warning SERVERUS kern kernel Call Trace:
4/21/21 12:00:44 Warning SERVERUS kern kernel Hardware name: System manufacturer System Product Name/ROG STRIX B450-F GAMING, BIOS 2301 04/19/2019
4/21/21 12:00:44 Warning SERVERUS kern kernel CPU: 4 PID: 3653 Comm: xteve Tainted: P W O 5.10.28-Unraid #1
4/21/21 12:00:44 Warning SERVERUS kern kernel xteve invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
4/21/21 9:44:28 Notice SERVERUS user flash_backup adding task: php /usr/local/emhttp/plugins/dynamix.unraid.net/include/UpdateFlashBackup.php update
4/21/21 9:43:20 Alert SERVERUS local7 nginx 2021/04/21 09:43:20 [alert] 29892#29892: worker process 7356 exited on signal 6

serverus-diagnostics-20210421-1308.zip
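The export above is listed newest first and mixes two separate events (a container OOM kill at 12:00:44 and the kernel oops at 12:18:43), so the lines that matter are easy to miss. As a rough sketch, assuming the export has been saved to a text file, something like the following pulls out only the headline lines. The function name `crash_headlines` is made up for illustration, and the patterns are simply taken from the messages in this log.

```shell
#!/bin/bash
# Print only the "headline" kernel messages from a saved syslog export.
# crash_headlines is a hypothetical helper name; the patterns come from the log above.
# "RIP: 0010:" keeps the kernel-mode instruction pointer and skips the
# user-mode one (RIP: 0033:...), which says little about a kernel crash.
crash_headlines() {
    local logfile="$1"
    grep -E 'BUG:|Oops:|RIP: 0010:|invoked oom-killer|Memory cgroup out of memory' "$logfile"
}
```

Run as `crash_headlines syslog-export.txt` (the file name is a placeholder). On the log in this post, the lines it keeps should include the `nf_nat_setup_info` RIP line and the NULL pointer dereference, alongside the xteve OOM messages.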
JorgeB Posted April 21, 2021

30 minutes ago, relink said: "The only thing I haven't tried yet is doing a BIOS update"

I would try that, but a number of Ryzen users are getting call traces with v6.9.x, so it is possibly a kernel issue.
relink Posted April 21, 2021

Well, I can certainly try updating the BIOS so I can at least rule it out. Whatever this is, I hope we can get it sorted soon, because I cringe every time I have to hard shut down my server.
mladams Posted April 22, 2021

I'm in the same boat, but I have a PowerEdge R720. I've been getting a kernel panic at least once a day for the past 2 or 3 weeks. Memtest is clean and the syslog shows nothing. I've tried every trick I've found in this forum and others... I have no idea what to do next except downgrade. It's getting terribly frustrating.
JorgeB Posted April 23, 2021

9 hours ago, mladams said: "I'm in the same boat but I have a PowerEdge R720."

This seems to be the number one reason for crashing (on non-Ryzen hardware); see if it applies to you: https://forums.unraid.net/bug-reports/stable-releases/690691-kernel-panic-due-to-netfilter-nf_nat_setup_info-docker-static-ip-macvlan-r1356/?do=getNewComment&d=2&id=1356
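One way to "see if it applies" is to look for the signature named in that report's title: a kernel-mode RIP inside `nf_nat_setup_info`, on a system with Docker networks on the macvlan driver. The sketch below assumes a saved syslog file and a working `docker` CLI; both function names are hypothetical.

```shell
#!/bin/bash
# Return 0 if the saved log contains the crash signature from the linked report.
# has_nf_nat_panic is a made-up helper name.
has_nf_nat_panic() {
    grep -q 'RIP: 0010:nf_nat_setup_info' "$1"
}

# List Docker networks that use the macvlan driver.
# Needs the docker CLI and a running Docker daemon.
list_macvlan_networks() {
    docker network ls --filter driver=macvlan --format '{{.Name}}'
}
```

For example, `has_nf_nat_panic syslog-export.txt && list_macvlan_networks` would print the macvlan networks only when the signature is present. A match is a hint that the report is relevant, not proof of the cause.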
relink Posted April 25, 2021 (edited)

OK, so I finally got a chance to rerun memtest, and I did get a few errors on the first run. I shut down and reseated the DIMMs, and then it passed just fine. Maybe something got knocked loose at some point. I also updated my motherboard to the latest BIOS.

Unfortunately, ever since updating the BIOS, Unraid doesn’t see any of the disks attached to my Adaptec HBA, and I am not seeing its BIOS on boot either. In my motherboard’s BIOS I made sure quick boot is disabled, show logo is disabled, show OPROM is enabled, and allow interrupt 19 is enabled... now the panic starts to set in...

EDIT: That was a great use of almost 4 hours. The RAID card not showing up was due to a new BIOS option in the update that ASUS didn’t bother to document...

Back to the original issue. The RAM is good, the BIOS is updated, and I’m going to continue monitoring to see what happens this week. I really hope I can get some stability back; things have been rough since 6.9.

Edited April 26, 2021 by relink
relink Posted April 27, 2021

OK, I think things are now getting worse. This is the 4th time that everything looks like it’s still running: all my containers look fine and the Unraid UI is responsive, but in reality everything is more or less unresponsive. I can’t seem to pull diagnostics either; every time I try, it gets stuck at this exact spot:

Downloading... /boot/logs/serverus-diagnostics-20210427-0632.zip
smartctl -x '/dev/sdg' 2>/dev/null|todos >'/serverus-diagnostics-20210427-0632/smart/ST1000VN002-2EY102_Z9C5A3AJ-20210427-0632 disk2 (sdg).txt'
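Since the diagnostics collection stalls on the `smartctl -x '/dev/sdg'` step, one cautious way to test whether that disk answers at all is to run the same query under a deadline. This is only a sketch: `run_with_deadline` is a made-up wrapper around GNU coreutils `timeout`, which exits with status 124 when the deadline expires.

```shell
#!/bin/bash
# Run a command with a deadline and say so when it had to be cut off.
# run_with_deadline is a hypothetical helper; `timeout` is GNU coreutils.
run_with_deadline() {
    local secs="$1"
    shift
    local rc=0
    timeout "$secs" "$@" || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "TIMED OUT after ${secs}s: $*" >&2
    fi
    return "$rc"
}
```

A possible use is `run_with_deadline 30 smartctl -x /dev/sdg >/dev/null`. One caveat: if the process is stuck in uninterruptible I/O inside the kernel, no signal will end it until the I/O returns, so even this wrapper can hang. That outcome would itself point at the disk or controller rather than at the diagnostics script.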
relink Posted April 27, 2021

A little bit of an update. After digging in further I realized that most of my containers other than Plex are actually not frozen. However, I still cannot download diagnostics, I cannot restart or stop any containers, I cannot stop the Docker service, and I cannot stop the array...
relink Posted April 27, 2021 (edited)

I dug through and pulled all the kernel errors from my syslog server. They seem related to my RAID controller, but I have no idea why; I am still looking into it.

Date Time Level Host Name Category Program Messages
2021-04-27 07:27:13 Error SERVERUS kern kernel
2021-04-27 04:01:56 Error SERVERUS kern kernel aacraid 0000:0a:00.0: Controller reset type is 3
2021-04-27 04:01:56 Error SERVERUS kern kernel aacraid: Host bus reset request. SCSI hang ?
2021-04-27 04:01:50 Error SERVERUS kern kernel aacraid: Outstanding commands on (2,1,6,0):
2021-04-27 04:01:50 Error SERVERUS kern kernel aacraid: Host adapter abort request.
2021-04-27 00:00:07 Error SERVERUS kern kernel blk_update_request: critical target error, dev sda, sector 76048 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
2021-04-26 21:38:48 Error SERVERUS kern kernel Memory cgroup out of memory: Killed process 32031 (xteve) total-vm:2773040kB, anon-rss:2055388kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:4164kB oom_score_adj:0
2021-04-26 10:15:53 Error SERVERUS kern kernel
2021-04-26 06:41:16 Error SERVERUS kern kernel aacraid 0000:0a:00.0: Controller reset type is 3
2021-04-26 06:41:16 Error SERVERUS kern kernel aacraid: Host bus reset request. SCSI hang ?
2021-04-26 06:41:10 Error SERVERUS kern kernel aacraid: Outstanding commands on (11,1,6,0):
2021-04-26 06:41:10 Error SERVERUS kern kernel aacraid: Host adapter abort request.
2021-04-26 06:09:30 Error SERVERUS kern kernel sd 11:1:5:0: [sdd] tag#166 timing out command, waited 7s
2021-04-26 06:08:33 Error SERVERUS kern kernel aacraid 0000:0a:00.0: Controller reset type is 3
2021-04-26 06:08:33 Error SERVERUS kern kernel aacraid: Host bus reset request. SCSI hang ?
2021-04-26 06:08:32 Error SERVERUS kern kernel aacraid: Outstanding commands on (11,1,5,0):
2021-04-26 06:08:32 Error SERVERUS kern kernel aacraid: Host adapter abort request.
2021-04-26 04:09:16 Error SERVERUS kern kernel sd 11:1:5:0: [sdd] tag#221 timing out command, waited 7s
2021-04-26 04:08:19 Error SERVERUS kern kernel aacraid 0000:0a:00.0: Controller reset type is 3
2021-04-26 04:08:19 Error SERVERUS kern kernel aacraid: Host bus reset request. SCSI hang ?
2021-04-26 04:08:17 Error SERVERUS kern kernel aacraid: Outstanding commands on (11,1,5,0):
2021-04-26 04:08:17 Error SERVERUS kern kernel aacraid: Host adapter abort request.
2021-04-26 04:06:11 Error SERVERUS kern kernel aacraid 0000:0a:00.0: Controller reset type is 3
2021-04-26 04:06:11 Error SERVERUS kern kernel aacraid: Host bus reset request. SCSI hang ?
2021-04-26 04:06:06 Error SERVERUS kern kernel aacraid: Outstanding commands on (11,1,6,0):
2021-04-26 04:06:06 Error SERVERUS kern kernel aacraid: Host adapter abort request.
2021-04-26 00:00:03 Error SERVERUS kern kernel blk_update_request: critical target error, dev sda, sector 76048 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
2021-04-25 23:01:41 Error SERVERUS kern kernel
2021-04-25 22:51:30 Error SERVERUS kern kernel
2021-04-25 22:39:03 Error SERVERUS kern kernel
2021-04-25 22:28:09 Error SERVERUS kern kernel
2021-04-25 20:43:54 Error SERVERUS kern kernel
2021-04-25 20:18:51 Error SERVERUS kern kernel
2021-04-25 19:40:40 Error SERVERUS kern kernel

Edited April 27, 2021 by relink
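To get a feel for how often the controller is being reset, the abort requests in an export like this can be tallied per day. The sketch below assumes the export is saved with one log entry per line and that each entry starts with an ISO date (YYYY-MM-DD), as in the rows above; `aacraid_aborts_per_day` is a made-up name.

```shell
#!/bin/bash
# Count "Host adapter abort request" events per calendar day.
# aacraid_aborts_per_day is a hypothetical helper; it assumes the first
# whitespace-separated field of every entry is the date.
aacraid_aborts_per_day() {
    grep 'aacraid: Host adapter abort request' "$1" \
        | awk '{ count[$1]++ } END { for (d in count) print d, count[d] }' \
        | sort
}
```

Each abort request in the export is followed by a "Host bus reset request" a few seconds later, so this count is a reasonable stand-in for the number of controller resets per day.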
relink Posted April 27, 2021

After searching for some of the errors I came across this post here. The first part talks about changing the device timeout to 45 seconds. The problem is I don't know how to do that in Unraid. Running "ls /sys/block/" shows disks listed as both "mdx" and "sdx", and I'm not even sure whether a change like that would break Unraid, persist across a reboot, etc.

The second part of the post I can definitely check, and will when I get home. That is a BIOS update for the controller. Firmware Version 7.3.0 Build 30612 lists "Resolved an issue where I/O would slow and eventually result in a controller reset" as one of the fixes. My only concern is that this BIOS was released in 2013; I find it hard to believe my card would have a BIOS older than that, but I will check.
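On Linux the per-disk SCSI command timeout is normally exposed as `/sys/block/sdX/device/timeout` (in seconds), so the change described in that post can be sketched as below. Treat this as an untested assumption for Unraid specifically: the function name and the `SYSFS_ROOT` override are made up for illustration, and only `sd*` entries are touched because the `md*` entries have no `device/timeout` attribute to write to.

```shell
#!/bin/bash
# Write a SCSI command timeout (in seconds) for every sd* disk.
# set_sd_timeouts and SYSFS_ROOT are hypothetical; SYSFS_ROOT only exists so the
# function can be tried against a scratch directory instead of the real /sys/block.
set_sd_timeouts() {
    local secs="$1"
    local root="${SYSFS_ROOT:-/sys/block}"
    local f
    for f in "$root"/sd*/device/timeout; do
        [ -e "$f" ] || continue          # glob matched nothing, or attribute missing
        echo "$secs" > "$f"
        echo "${f}: $(cat "$f")"         # read back what the kernel now reports
    done
}
```

Calling `set_sd_timeouts 45` as root would apply the 45-second value. Values written to sysfs do not survive a reboot, so to keep the setting the call would have to be repeated at startup; on Unraid the usual place for that kind of command is, as far as I know, the `/boot/config/go` script.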
relink Posted April 27, 2021

The BIOS on my controller was from 2015. I just finished updating to the latest version, which is from 2018. So fingers crossed that I don’t have any more issues. If I do, I’ll probably just open a new thread, as this has gotten off topic from the original problem.