Kernel panic/call trace/system lockups


I've been running UnRAID for about 90 days and I'm loving it, but I've got to get it to a more stable place. I have been having stability issues for some time now. Usually I can google and fix most anything, and some tweaking has made the system more stable than it was, but I'm still hitting these kernel panic/call trace situations every day or two. One time I made it 10 days before a crash and thought I had it licked, but apparently not.

 

This evening I was on my way home when I got an alert on my phone that my Unifi controller was offline, which usually means my entire system is offline. Sure enough, when I got home and started digging, I found the system was unresponsive to ping, SSH, the web-based GUI, etc. I turned on the monitor and was greeted with the scrolling text shown in the screenshot. I was able to log in from the console, generate the diags, and reboot the system, which hasn't always been possible in the past. This time it rebooted smoothly and came right back up.

 

Here is the info from the syslog when it started freaking out, which matches what I've seen in the past (this goes on for thousands of lines; the diagnostics are attached if you'd like to see for yourself):

 

Quote

Nov 24 17:46:36 Tower kernel: DMAR: ERROR: DMA PTE for vPFN 0xffcca already set (to 2000 not 13e5b4002)
Nov 24 17:46:36 Tower kernel: ------------[ cut here ]------------
Nov 24 17:46:36 Tower kernel: WARNING: CPU: 3 PID: 0 at drivers/iommu/intel-iommu.c:2300 __domain_mapping+0x205/0x2dd
Nov 24 17:46:36 Tower kernel: Modules linked in: wireguard ip6_udp_tunnel udp_tunnel xt_CHECKSUM ipt_REJECT ip6table_mangle ip6table_nat nf_nat_ipv6 iptable_mangle ip6table_filter ip6_tables vhost_net tun vhost tap veth macvlan xt_nat ipt_MASQUERADE iptable_filter iptable_nat nf_nat_ipv4 nf_nat ip_tables xfs md_mod nct6775 hwmon_vid x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd alx glue_helper i2c_i801 i2c_core mxm_wmi intel_cstate intel_uncore ahci intel_rapl_perf libahci mdio button video thermal fan wmi pcc_cpufreq backlight
Nov 24 17:46:36 Tower kernel: CPU: 3 PID: 0 Comm: swapper/3 Tainted: G    B             4.19.107-Unraid #1
Nov 24 17:46:36 Tower kernel: Hardware name: MSI MS-7821/Z87-G45 GAMING (MS-7821), BIOS V1.9 07/21/2014
Nov 24 17:46:36 Tower kernel: RIP: 0010:__domain_mapping+0x205/0x2dd
Nov 24 17:46:36 Tower kernel: Code: 48 c7 c7 b7 b6 d7 81 e8 1f a2 c7 ff 8b 05 8b 5c a5 00 85 c0 74 08 ff c8 89 05 7f 5c a5 00 48 c7 c7 79 fc d2 81 e8 01 a2 c7 ff <0f> 0b 8b 54 24 24 b8 34 00 00 00 8d 0c d2 83 e9 09 83 f9 34 0f 4f
Nov 24 17:46:36 Tower kernel: RSP: 0018:ffff8887ff983d68 EFLAGS: 00010246
Nov 24 17:46:36 Tower kernel: RAX: 0000000000000024 RBX: 000000013e5b4002 RCX: 0000000000000470
Nov 24 17:46:36 Tower kernel: RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000247
Nov 24 17:46:36 Tower kernel: RBP: 0000000000000001 R08: 0000000000000003 R09: 0000000000014800
Nov 24 17:46:36 Tower kernel: R10: 0000000000000000 R11: 0000000000000044 R12: 00000000000ffcca
Nov 24 17:46:36 Tower kernel: R13: 0000000000000000 R14: ffff8887f8750b00 R15: ffff8887f8b00650
Nov 24 17:46:36 Tower kernel: FS:  0000000000000000(0000) GS:ffff8887ff980000(0000) knlGS:0000000000000000
Nov 24 17:46:36 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 24 17:46:36 Tower kernel: CR2: 000000a7288ea25a CR3: 0000000001e0a001 CR4: 00000000001626e0
Nov 24 17:46:36 Tower kernel: Call Trace:
Nov 24 17:46:36 Tower kernel: <IRQ>
Nov 24 17:46:36 Tower kernel: domain_mapping+0x16/0xa7
Nov 24 17:46:36 Tower kernel: __intel_map_single+0xc8/0x122
Nov 24 17:46:36 Tower kernel: alx_refill_rx_ring+0x145/0x21d [alx]
Nov 24 17:46:36 Tower kernel: alx_poll+0x364/0x3f1 [alx]
Nov 24 17:46:36 Tower kernel: net_rx_action+0x107/0x26c
Nov 24 17:46:36 Tower kernel: __do_softirq+0xc9/0x1d7
Nov 24 17:46:36 Tower kernel: irq_exit+0x5e/0x9d
Nov 24 17:46:36 Tower kernel: do_IRQ+0xb2/0xd0
Nov 24 17:46:36 Tower kernel: common_interrupt+0xf/0xf
Nov 24 17:46:36 Tower kernel: </IRQ>
Nov 24 17:46:36 Tower kernel: RIP: 0010:cpuidle_enter_state+0xe8/0x141
Nov 24 17:46:36 Tower kernel: Code: ff 45 84 f6 74 1d 9c 58 0f 1f 44 00 00 0f ba e0 09 73 09 0f 0b fa 66 0f 1f 44 00 00 31 ff e8 7a 8d bb ff fb 66 0f 1f 44 00 00 <48> 2b 2c 24 b8 ff ff ff 7f 48 b9 ff ff ff ff f3 01 00 00 48 39 cd
Nov 24 17:46:36 Tower kernel: RSP: 0018:ffffc900031bfe98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdc
Nov 24 17:46:36 Tower kernel: RAX: ffff8887ff99fac0 RBX: ffff8887ff9aa300 RCX: 000000000000001f
Nov 24 17:46:36 Tower kernel: RDX: 0000000000000000 RSI: 0000000028000347 RDI: 0000000000000000
Nov 24 17:46:36 Tower kernel: RBP: 0000b0257c10b1e2 R08: 0000b0257c10b1e2 R09: 0000b02569a50110
Nov 24 17:46:36 Tower kernel: R10: 0000000000005a6c R11: 071c71c71c71c71c R12: 0000000000000005
Nov 24 17:46:36 Tower kernel: R13: ffffffff81e5b120 R14: 0000000000000000 R15: ffffffff81e5b318
Nov 24 17:46:36 Tower kernel: ? cpuidle_enter_state+0xbf/0x141
Nov 24 17:46:36 Tower kernel: do_idle+0x17e/0x1fc
Nov 24 17:46:36 Tower kernel: cpu_startup_entry+0x6a/0x6c
Nov 24 17:46:36 Tower kernel: start_secondary+0x197/0x1b2
Nov 24 17:46:36 Tower kernel: secondary_startup_64+0xa4/0xb0
Nov 24 17:46:36 Tower kernel: ---[ end trace cf87be4a2df4b994 ]---
Nov 24 17:46:36 Tower kernel: DMAR: DRHD: handling fault status reg 3
Nov 24 17:46:36 Tower kernel: DMAR: [DMA Write] Request device [02:00.0] fault addr ffcca000 [fault reason 05] PTE Write access is not set
Nov 24 17:46:36 Tower kernel: DMAR: ERROR: DMA PTE for vPFN 0xffcca already set (to 2000 not 13920c002)
Nov 24 17:46:36 Tower kernel: ------------[ cut here ]------------
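
Because the trace repeats for thousands of lines, it can help to condense the DMAR fault lines into counts per device before reading further. The following is a minimal Python sketch, not something from the original post: the function name `summarize_dmar_faults`, the regular expression, and the `SAMPLE` list (copied from the quoted lines above) are assumptions for illustration, and the pattern only covers the fault-line layout seen in this excerpt.

```python
import re
from collections import Counter

# Assumed pattern, based only on the fault line quoted above, e.g.:
# "DMAR: [DMA Write] Request device [02:00.0] fault addr ffcca000
#  [fault reason 05] PTE Write access is not set"
FAULT_RE = re.compile(
    r"DMAR: \[DMA (?P<op>Read|Write)\] Request device \[(?P<dev>[0-9a-f:.]+)\] "
    r"fault addr (?P<addr>[0-9a-f]+) \[fault reason (?P<reason>\d+)\] (?P<text>.*)"
)


def summarize_dmar_faults(lines):
    """Count DMAR fault lines by (device, operation, reason code, reason text)."""
    counts = Counter()
    for line in lines:
        match = FAULT_RE.search(line)
        if match:
            key = (
                match["dev"],
                match["op"],
                match["reason"],
                match["text"].strip(),
            )
            counts[key] += 1
    return counts


# Three lines copied from the quoted syslog; only the middle one is a fault line.
SAMPLE = [
    "Nov 24 17:46:36 Tower kernel: DMAR: DRHD: handling fault status reg 3",
    "Nov 24 17:46:36 Tower kernel: DMAR: [DMA Write] Request device [02:00.0] "
    "fault addr ffcca000 [fault reason 05] PTE Write access is not set",
    "Nov 24 17:46:36 Tower kernel: DMAR: ERROR: DMA PTE for vPFN 0xffcca "
    "already set (to 2000 not 13920c002)",
]

for (dev, op, reason, text), n in summarize_dmar_faults(SAMPLE).most_common():
    print(f"{n:6d}  device {dev}  DMA {op}  reason {reason}: {text}")
```

To run it against the full log, pass an open file object for the syslog from the diagnostics zip instead of `SAMPLE`. The PCI address it reports (02:00.0 in the excerpt) can then be matched to a device with `lspci -s 02:00.0`.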

 

I'm really hopeful that somebody has some guidance for me here, as I'd love to get this system rock-solid stable.

IMG_0136.jpg

tower-diagnostics-20201124-1828.zip

Link to comment

Maybe unrelated, but why do you have a 200G docker.img? 20G should be more than enough. Have you had problems filling it?

 

I see you have also put it in the appdata share, possibly so you can outsmart the CA Backup plugin and make it back up docker.img. There is no good reason to back up docker.img.

 

 

Link to comment
