
I'm still getting kernel panics, and I don't know what to check next.



Ever since upgrading to the 6.9.x versions of Unraid I have been getting kernel panics at least once a week, sometimes more. I have no idea what to check next to get this fixed.

 

So far I have checked the RAM and CPU; C-states are disabled in the BIOS, temps are fine, and the PSU is fine. I had heard it may have been the GPU Stats plugin, so I added the script below to run on first array start:

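#!/bin/bash
# Start the NVIDIA persistence daemon so the driver stays initialized
nvidia-persistenced
# List the processes that currently have the NVIDIA device nodes open
fuser -v /dev/nvidia*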

I have also removed any plugins I don't actively use, and made sure the ones I do use stay updated. 

 

The only thing I haven't tried yet is a BIOS update, which I would rather avoid unless I completely run out of options.

 

The diagnostics are from after a reboot, and the logs below are from my syslog server (newest entries first).

 

Date	Time	Level	Host Name	Category	Program	Messages
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	CR2: 0000000000000034
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	Modules linked in: md4 sha512_ssse3 sha512_generic cmac cifs libarc4 nfsv3 nfs nfs_ssc nvidia_uvm(PO) xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle ip6table_nat iptable_mangle nf_tables vhost_net tun vhost vhost_iotlb tap macvlan veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter ext4 mbcache jbd2 xfs nfsd lockd grace sunrpc md_mod nvidia_drm(PO) nvidia_modeset(PO) drm_kms_helper drm backlight agpgart syscopyarea sysfillrect sysimgblt fb_sys_fops nvidia(PO) ip6table_filter ip6_tables iptable_filter ip_tables x_tables bonding e1000e igb i2c_algo_bit edac_mce_amd kvm_amd kvm crct10dif_pclmul mxm_wmi wmi_bmof crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd nvme glue_helper input_leds nvme_core rapl ahci aacraid ccp i2c_piix4 led_class k10temp wmi i2c_core libahci button acpi_cpufreq [last unloaded: e1000e]
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	R13: 000014dc4a9acfe0 R14: 0000000000000010 R15: 000014dc4a7814e8
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000002a
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	RBP: 000014dc490e0340 R08: 0000000000000000 R09: 0000000000000000
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	RDX: 0000000000000010 RSI: 000014dc495705e0 RDI: 000000000000000e
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	RAX: ffffffffffffffda RBX: 000014dc49574b38 RCX: 000014dc4e5a4352
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	RSP: 002b:000014dc49570538 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	Code: c3 8b 07 85 c0 75 24 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 89 5c 24 08 0f 05 <c3> e9 8a d2 ff ff 41 54 b8 02 00 00 00 49 89 f4 be 00 88 08 00 55
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	RIP: 0033:0x14dc4e5a4352
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	entry_SYSCALL_64_after_hwframe+0x44/0xa9
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	do_syscall_64+0x5d/0x6a
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	__x64_sys_connect+0x11/0x14
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	__sys_connect+0x62/0x9d
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	inet_stream_connect+0x34/0x49
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	__inet_stream_connect+0xd3/0x2b6
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	tcp_v4_connect+0x3fc/0x455
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	tcp_connect+0x76d/0x7f4
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	__tcp_transmit_skb+0x845/0x8ba
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	__ip_queue_xmit+0x2a3/0x2df
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? ipv4_mtu+0x3d/0x64
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	ip_finish_output2+0x2ec/0x31f
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	__local_bh_enable_ip+0x3b/0x43
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	do_softirq+0x3a/0x44
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	do_softirq_own_stack+0x2c/0x39
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	</IRQ>
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	asm_call_irq_on_stack+0xf/0x20
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	__do_softirq+0xc4/0x1c2
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	net_rx_action+0xf4/0x29d
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	process_backlog+0xa3/0x13b
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	__netif_receive_skb_one_core+0x3d/0x95
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	__netif_receive_skb_core+0x335/0x4e7
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? br_pass_frame_up+0xda/0xda
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	br_handle_frame+0x25e/0x2a6
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? br_nf_forward_finish+0xd0/0xd0 [br_netfilter]
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	br_nf_pre_routing+0x229/0x239 [br_netfilter]
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? br_nf_forward_finish+0xd0/0xd0 [br_netfilter]
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	NF_HOOK+0xd7/0xf7 [br_netfilter]
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? br_handle_frame_finish+0x351/0x351
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? br_nf_forward_finish+0xd0/0xd0 [br_netfilter]
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? nf_nat_ipv4_pre_routing+0x1e/0x4a [nf_nat]
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? br_handle_frame_finish+0x351/0x351
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? br_pass_frame_up+0xda/0xda
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	br_nf_pre_routing_finish+0x23d/0x264 [br_netfilter]
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? br_pass_frame_up+0xda/0xda
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	br_nf_hook_thresh+0xa3/0xc3 [br_netfilter]
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? br_pass_frame_up+0xda/0xda
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? ipt_do_table+0x570/0x5c0 [ip_tables]
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	br_handle_frame_finish+0x30d/0x351
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	netif_receive_skb+0x79/0xa1
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	__netif_receive_skb_one_core+0x74/0x95
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	ip_rcv+0x41/0x61
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? l3mdev_l3_rcv.constprop.0+0x50/0x50
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	nf_hook.constprop.0+0xb1/0xd8
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	nf_hook_slow+0x39/0x8e
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	ip_sabotage_in+0x43/0x4d [br_netfilter]
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? ip_check_defrag+0x18f/0x18f
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	ip_forward+0x3f1/0x420
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? __ip_finish_output+0x146/0x146
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	ip_output+0x7d/0x8a
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? __ip_finish_output+0x146/0x146
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	nf_hook+0xab/0xd3
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	nf_hook_slow+0x39/0x8e
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	nf_nat_ipv4_out+0xf/0x88 [nf_nat]
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	nf_nat_inet_fn+0xe9/0x183 [nf_nat]
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? fib_validate_source+0xb0/0xda
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? ipt_do_table+0x570/0x5c0 [ip_tables]
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? __dev_queue_xmit+0x4d9/0x501
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	ipt_do_table+0x51a/0x5c0 [ip_tables]
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? netdev_start_xmit+0x1b/0x38
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	masquerade_tg+0x44/0x5e [xt_MASQUERADE]
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	nf_nat_masquerade_ipv4+0x10b/0x131 [nf_nat]
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? krealloc+0x26/0x7a
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? __ksize+0x15/0x64
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	? sch_direct_xmit+0x16/0x1de
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	<IRQ>
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	Call Trace:
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	CR2: 0000000000000034 CR3: 0000000216e82000 CR4: 00000000003506f0
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	FS:  000014dc49574b38(0000) GS:ffff8887fe800000(0000) knlGS:0000000000000000
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	R13: ffffc90000003720 R14: ffffc900000037dc R15: ffffffff8210b440
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	R10: ffff88821d56c388 R11: ffffffff815cbe4b R12: 0000000000000000
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	RBP: ffffc900000037c8 R08: 00000000d638ba09 R09: ffff88818fa62b80
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	RDX: ffffffffffffffee RSI: 000000007418f38d RDI: ffffc90000003720
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	RAX: ffff88819ec50206 RBX: ffff88816107f0c0 RCX: 0000000000000000
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	RSP: 0018:ffffc90000003700 EFLAGS: 00010286
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	Code: ff 48 8b 15 ef 6a 00 00 89 c0 48 8d 04 c2 48 8b 10 48 85 d2 74 80 48 81 ea 98 00 00 00 48 85 d2 0f 84 70 ff ff ff 8a 44 24 46 <38> 42 46 74 09 48 8b 92 98 00 00 00 eb d9 48 8b 4a 20 48 8b 42 28
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	RIP: 0010:nf_nat_setup_info+0x129/0x6aa [nf_nat]
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	Hardware name: System manufacturer System Product Name/ROG STRIX B450-F GAMING, BIOS 2301 04/19/2019
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	CPU: 0 PID: 13620 Comm: python3 Tainted: P        W  O      5.10.28-Unraid #1
4/21/21	12:18:43	Warning	SERVERUS	kern	kernel	Oops: 0000 [#1] SMP NOPTI
4/21/21	12:18:43	Information	SERVERUS	kern	kernel	PGD 0 P4D 0 
4/21/21	12:18:43	Alert	SERVERUS	kern	kernel	#PF: error_code(0x0000) - not-present page
4/21/21	12:18:43	Alert	SERVERUS	kern	kernel	#PF: supervisor read access in kernel mode
4/21/21	12:18:43	Alert	SERVERUS	kern	kernel	BUG: kernel NULL pointer dereference, address: 0000000000000034
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	eth0: renamed from vethbb3a427
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	veth37fa1d8: renamed from eth0
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	oom_reaper: reaped process 3059 (xteve), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
4/21/21	12:00:44	Error	SERVERUS	kern	kernel	Memory cgroup out of memory: Killed process 3059 (xteve) total-vm:2784340kB, anon-rss:2087212kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:4176kB oom_score_adj:0
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=5d46a8bd8153a70ccebf7b3f8e963ede192d277826d80f9653f49ee87ea73d89,mems_allowed=0,oom_memcg=/docker/5d46a8bd8153a70ccebf7b3f8e963ede192d277826d80f9653f49ee87ea73d89,task_memcg=/docker/5d46a8bd8153a70ccebf7b3f8e963ede192d277826d80f9653f49ee87ea73d89,task=xteve,pid=3059,uid=0
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	[   3059]     0  3059   696085   521803  4276224        0             0 xteve
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	[   3054]     0  3054      396       11    36864        0             0 crond
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	[   2968]     0  2968      554       53    40960        0             0 entrypoint.sh
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	[  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	Tasks state (memory values in pages):
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	thp_collapse_alloc 0
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	thp_fault_alloc 66
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	pglazyfreed 0
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	pglazyfree 0
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	pgdeactivate 104645
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	pgactivate 732171
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	pgsteal 272820
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	pgscan 20914208
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	pgrefill 107383
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	pgmajfault 231
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	pgfault 764247
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	workingset_nodereclaim 0
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	workingset_restore_file 2013
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	workingset_restore_anon 0
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	workingset_activate_file 3927
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	workingset_activate_anon 0
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	workingset_refault_file 212124
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	workingset_refault_anon 0
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	slab 297176
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	slab_unreclaimable 1336
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	slab_reclaimable 295840
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	unevictable 0
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	active_file 77824
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	inactive_file 2076672
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	active_anon 270336
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	inactive_anon 2139475968
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	anon_thp 69206016
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	file_writeback 675840
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	file_dirty 0
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	file_mapped 270336
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	shmem 0
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	sock 0
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	percpu 0
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	kernel_stack 147456
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	file 3108864
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	anon 2138357760
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	Memory cgroup stats for /docker/5d46a8bd8153a70ccebf7b3f8e963ede192d277826d80f9653f49ee87ea73d89:
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	kmem: usage 5288kB, limit 9007199254740988kB, failcnt 0
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	memory+swap: usage 2097156kB, limit 4194304kB, failcnt 0
4/21/21	12:00:44	Information	SERVERUS	kern	kernel	memory: usage 2097156kB, limit 2097152kB, failcnt 76745
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	R13: 0000000000bcd4e0 R14: 0000000000000000 R15: 000000000046c560
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	R10: 000014f1d1566eb0 R11: ed386059261e7389 R12: 0000000000000002
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	RBP: 000000c00079b888 R08: 0000000000000009 R09: ffffffffffffffff
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	RDX: 00000000009ea5f4 RSI: 000000000041a345 RDI: 000000c00079bac0
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	RAX: 0000000000bcd4e0 RBX: 0000000000000003 RCX: 000000c000288f00
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	RSP: 002b:000000c00079b878 EFLAGS: 00010216
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	Code: Unable to access opcode bytes at RIP 0x4548ff.
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	RIP: 0033:0x454929
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	asm_exc_page_fault+0x1e/0x30
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	? asm_exc_page_fault+0x8/0x30
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	exc_page_fault+0x259/0x373
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	? __raw_spin_unlock_irq+0x5/0xd
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	handle_mm_fault+0xb83/0xec3
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	__do_fault+0x49/0x64
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	? filemap_map_pages+0x1ec/0x217
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	? xas_find+0xa9/0x121
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	filemap_fault+0x2a4/0x478
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	pagecache_get_page+0x108/0x13e
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	add_to_page_cache_lru+0x42/0xa7
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	? lruvec_page_state+0x2f/0x2f
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	__add_to_page_cache_locked+0xab/0x274
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	mem_cgroup_charge+0xfe/0x17e
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	? mem_cgroup_charge+0x170/0x17e
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	try_charge+0x3ec/0x501
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	mem_cgroup_out_of_memory+0x79/0xae
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	out_of_memory+0x3dd/0x410
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	oom_kill_process+0x7b/0xf6
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	dump_header+0x45/0x1e8
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	dump_stack+0x6b/0x83
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	Call Trace:
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	Hardware name: System manufacturer System Product Name/ROG STRIX B450-F GAMING, BIOS 2301 04/19/2019
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	CPU: 4 PID: 3653 Comm: xteve Tainted: P        W  O      5.10.28-Unraid #1
4/21/21	12:00:44	Warning	SERVERUS	kern	kernel	xteve invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
4/21/21	9:44:28	Notice	SERVERUS	user	flash_backup	adding task: php /usr/local/emhttp/plugins/dynamix.unraid.net/include/UpdateFlashBackup.php update
4/21/21	9:43:20	Alert	SERVERUS	local7	nginx	2021/04/21 09:43:20 [alert] 29892#29892: worker process 7356 exited on signal 6
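
Because the export above is newest-first, the line that starts each trace ends up at the bottom of it, which makes it easy to miss that there are two separate events here (an OOM kill inside a container at 12:00:44 and the NULL pointer oops at 12:18:43). A minimal sketch, assuming the export is saved as a tab-separated file with the message in column 7 (the file name `syslog_export.tsv` and the function name are hypothetical), that keeps only the headline lines and prints them oldest-first:

```shell
#!/bin/bash
# Print only the "headline" kernel lines from a tab-separated syslog export,
# oldest first. Assumes the export is newest-first and laid out as:
# Date, Time, Level, Host Name, Category, Program, Messages
kernel_headlines() {
    awk -F'\t' '
        # "RIP: 0010:" is the kernel-mode RIP; user-space RIP lines (0033) are skipped
        $7 ~ /BUG:|Oops:|RIP: 0010:|invoked oom-killer|out of memory: Killed/ {
            hits[++n] = $2 "  " $7          # keep time + message
        }
        END { for (i = n; i >= 1; i--) print hits[i] }   # reverse the order
    ' "$1"
}

# Example: kernel_headlines syslog_export.tsv   (file name is hypothetical)
```

The pattern list is only a starting point; it covers the two events in this log and would need extending for other kinds of kernel errors.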

 

serverus-diagnostics-20210421-1308.zip


I'm in the same boat, but I have a PowerEdge R720. I've been getting a kernel panic at least once a day for the past 2 or 3 weeks. Memtest is clean and the syslog shows nothing. I've tried every trick I've found in this forum and others... I have no idea what to do next except downgrade. It's getting terribly frustrating.

Screenshot_20210422-111646_Remote Desktop.jpg


Ok, so I finally got a chance to rerun the memtest, and I did get a few errors on the first run. I shut down and reseated the DIMMs, and it passed just fine. Maybe something got knocked loose at some point.
 

I also updated my motherboard to the latest BIOS. Unfortunately, ever since updating the BIOS, Unraid doesn’t see any of the disks attached to my Adaptec HBA, and I am not seeing its BIOS on boot either.
 

In my motherboard’s BIOS I made sure quick boot is disabled, show logo is disabled, show OPROM is enabled, and allow interrupt 19 is enabled... now the panic starts to set in...

 

EDIT: That was a great use of almost 4 hours. The RAID card not showing up was due to a new BIOS option in the update that ASUS didn’t bother to document...

 

Back to the original issue. The RAM is good and the BIOS is updated, so I’m going to continue monitoring and see what happens this week. I really hope I can get some stability back; things have been rough since 6.9.

Edited by relink

Ok, I think things are now getting worse. This is the 4th time that everything looks like it’s still running: all my containers look fine and the Unraid UI is responsive, but in reality everything is more or less unresponsive.
 

I can’t seem to pull diagnostics either; every time I try, it gets stuck at this exact spot:


Downloading...

/boot/logs/serverus-diagnostics-20210427-0632.zip

smartctl -x '/dev/sdg' 2>/dev/null|todos >'/serverus-diagnostics-20210427-0632/smart/ST1000VN002-2EY102_Z9C5A3AJ-20210427-0632 disk2 (sdg).txt'

 


A little bit of an update. After digging in further, I realized most of my containers other than Plex are actually not frozen. However, I still cannot download diagnostics, I cannot restart or stop any containers, I cannot stop the Docker service, and I cannot stop the array...


I dug through and pulled all the kernel errors from my syslog server. They seem related to my RAID controller, but I have no idea why. I'm still looking into it.

 

Date	Time	Level	Host Name	Category	Program	Messages
2021-04-27	07:27:13	Error	SERVERUS	kern	kernel	
2021-04-27	04:01:56	Error	SERVERUS	kern	kernel	aacraid 0000:0a:00.0: Controller reset type is 3
2021-04-27	04:01:56	Error	SERVERUS	kern	kernel	aacraid: Host bus reset request. SCSI hang ?
2021-04-27	04:01:50	Error	SERVERUS	kern	kernel	aacraid: Outstanding commands on (2,1,6,0):
2021-04-27	04:01:50	Error	SERVERUS	kern	kernel	aacraid: Host adapter abort request.
2021-04-27	00:00:07	Error	SERVERUS	kern	kernel	blk_update_request: critical target error, dev sda, sector 76048 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
2021-04-26	21:38:48	Error	SERVERUS	kern	kernel	Memory cgroup out of memory: Killed process 32031 (xteve) total-vm:2773040kB, anon-rss:2055388kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:4164kB oom_score_adj:0
2021-04-26	10:15:53	Error	SERVERUS	kern	kernel	
2021-04-26	06:41:16	Error	SERVERUS	kern	kernel	aacraid 0000:0a:00.0: Controller reset type is 3
2021-04-26	06:41:16	Error	SERVERUS	kern	kernel	aacraid: Host bus reset request. SCSI hang ?
2021-04-26	06:41:10	Error	SERVERUS	kern	kernel	aacraid: Outstanding commands on (11,1,6,0):
2021-04-26	06:41:10	Error	SERVERUS	kern	kernel	aacraid: Host adapter abort request.
2021-04-26	06:09:30	Error	SERVERUS	kern	kernel	sd 11:1:5:0: [sdd] tag#166 timing out command, waited 7s
2021-04-26	06:08:33	Error	SERVERUS	kern	kernel	aacraid 0000:0a:00.0: Controller reset type is 3
2021-04-26	06:08:33	Error	SERVERUS	kern	kernel	aacraid: Host bus reset request. SCSI hang ?
2021-04-26	06:08:32	Error	SERVERUS	kern	kernel	aacraid: Outstanding commands on (11,1,5,0):
2021-04-26	06:08:32	Error	SERVERUS	kern	kernel	aacraid: Host adapter abort request.
2021-04-26	04:09:16	Error	SERVERUS	kern	kernel	sd 11:1:5:0: [sdd] tag#221 timing out command, waited 7s
2021-04-26	04:08:19	Error	SERVERUS	kern	kernel	aacraid 0000:0a:00.0: Controller reset type is 3
2021-04-26	04:08:19	Error	SERVERUS	kern	kernel	aacraid: Host bus reset request. SCSI hang ?
2021-04-26	04:08:17	Error	SERVERUS	kern	kernel	aacraid: Outstanding commands on (11,1,5,0):
2021-04-26	04:08:17	Error	SERVERUS	kern	kernel	aacraid: Host adapter abort request.
2021-04-26	04:06:11	Error	SERVERUS	kern	kernel	aacraid 0000:0a:00.0: Controller reset type is 3
2021-04-26	04:06:11	Error	SERVERUS	kern	kernel	aacraid: Host bus reset request. SCSI hang ?
2021-04-26	04:06:06	Error	SERVERUS	kern	kernel	aacraid: Outstanding commands on (11,1,6,0):
2021-04-26	04:06:06	Error	SERVERUS	kern	kernel	aacraid: Host adapter abort request.
2021-04-26	00:00:03	Error	SERVERUS	kern	kernel	blk_update_request: critical target error, dev sda, sector 76048 op 0x3:(DISCARD) flags 0x800 phys_seg 1 prio class 0
2021-04-25	23:01:41	Error	SERVERUS	kern	kernel	
2021-04-25	22:51:30	Error	SERVERUS	kern	kernel	
2021-04-25	22:39:03	Error	SERVERUS	kern	kernel	
2021-04-25	22:28:09	Error	SERVERUS	kern	kernel	
2021-04-25	20:43:54	Error	SERVERUS	kern	kernel	
2021-04-25	20:18:51	Error	SERVERUS	kern	kernel	
2021-04-25	19:40:40	Error	SERVERUS	kern	kernel	
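
To get a feel for how often the controller is resetting, the export can be tallied per day. A rough sketch, assuming the same tab-separated layout as the table above (date in column 1, message in column 7) saved to a file; the file name `kern_errors.tsv` and the function name are hypothetical:

```shell
#!/bin/bash
# Count aacraid controller resets per day in a tab-separated syslog export.
# Assumes column 1 is the date and column 7 is the message text.
resets_per_day() {
    awk -F'\t' '
        $7 ~ /aacraid.*Controller reset/ { count[$1]++ }
        END { for (d in count) print d, count[d] }
    ' "$1" | sort
}

# Example: resets_per_day kern_errors.tsv   (file name is hypothetical)
```

On the excerpt above this would report four resets on 2021-04-26 and one on 2021-04-27.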

 

Edited by relink

After searching for some of the errors I came across this post here. The first part talks about changing the device timeout to 45 seconds. The problem is I don't know how to do that in unRAID. Running "ls /sys/block/" shows disks listed as both "mdx" and "sdx", and I'm not even sure whether a change like that would break unRAID or persist over a reboot.
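
For reference, on a stock Linux kernel the per-disk SCSI command timeout is exposed in seconds at `/sys/block/sdX/device/timeout`, while md devices have no such file. A minimal sketch, assuming unRAID exposes the same sysfs layout (not verified here); a value written this way is not saved anywhere, so it would have to be re-applied after every boot, for example from a script run at array start. The function name is hypothetical:

```shell
#!/bin/bash
# Set the SCSI command timeout (in seconds) for every sdX disk under a sysfs root.
# Entries without a device/timeout file (e.g. mdX) are never matched by the glob.
# Usage: set_scsi_timeout SECONDS [SYSFS_BLOCK_DIR]
set_scsi_timeout() {
    local secs=$1 root=${2:-/sys/block} f
    for f in "$root"/sd*/device/timeout; do
        [ -w "$f" ] || continue      # glob matched nothing, or file not writable
        echo "$secs" > "$f"
        echo "$f -> $(cat "$f")"     # read back to confirm the kernel accepted it
    done
}

# Example (as root): set_scsi_timeout 45
```

The optional second argument exists only so the function can be tried against a scratch directory before pointing it at the real `/sys/block`.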

 

Now the second part of the post I can definitely check, and will when I get home. That is a BIOS update for the controller. Firmware Version 7.3.0 Build 30612 lists "Resolved an issue where I/O would slow and eventually result in a controller reset" as one of the fixes. My only concern is that this BIOS was released in 2013; I find it hard to believe my card would have a BIOS older than that, but I will check.

 

 


The BIOS on my controller was from 2015. I just finished updating to the latest version, which is from 2018. So fingers crossed that I don’t have any more issues. If I do, I’ll probably just open a new thread, as this has gotten off topic from the original problem.
