chrischerman Posted November 25, 2020

New in the last 90 days to UnRAID and loving it, but I've got to get it to a more stable place. I have been having stability issues for some time now. Usually I can Google and fix most anything, and some tweaking has made the system more stable than it was, but I'm still experiencing these kernel panic/call trace situations every day or two. One time I made it 10 days before a crash and thought I had it licked, but apparently not.

This evening I was on my way home when I got an alert on my phone that my Unifi controller was offline, which is indicative of my entire system being offline. Sure enough, when I got home and started digging, I found my system was unresponsive to ping, ssh, the web-based GUI, etc. I turned on the monitor and was greeted with the scrolling text shown in the screenshot. I was able to log into the system from the console, generate the diags, and reboot the system, which hasn't always been possible in the past. This time it rebooted smoothly and came right back up.
Here is the info from the syslog when it started freaking out which matches what I've seen in the past (this goes on for thousands of lines, but the diagnostics are attached if you'd like to see for yourself):

Nov 24 17:46:36 Tower kernel: DMAR: ERROR: DMA PTE for vPFN 0xffcca already set (to 2000 not 13e5b4002)
Nov 24 17:46:36 Tower kernel: ------------[ cut here ]------------
Nov 24 17:46:36 Tower kernel: WARNING: CPU: 3 PID: 0 at drivers/iommu/intel-iommu.c:2300 __domain_mapping+0x205/0x2dd
Nov 24 17:46:36 Tower kernel: Modules linked in: wireguard ip6_udp_tunnel udp_tunnel xt_CHECKSUM ipt_REJECT ip6table_mangle ip6table_nat nf_nat_ipv6 iptable_mangle ip6table_filter ip6_tables vhost_net tun vhost tap veth macvlan xt_nat ipt_MASQUERADE iptable_filter iptable_nat nf_nat_ipv4 nf_nat ip_tables xfs md_mod nct6775 hwmon_vid x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc aesni_intel aes_x86_64 crypto_simd cryptd alx glue_helper i2c_i801 i2c_core mxm_wmi intel_cstate intel_uncore ahci intel_rapl_perf libahci mdio button video thermal fan wmi pcc_cpufreq backlight
Nov 24 17:46:36 Tower kernel: CPU: 3 PID: 0 Comm: swapper/3 Tainted: G B 4.19.107-Unraid #1
Nov 24 17:46:36 Tower kernel: Hardware name: MSI MS-7821/Z87-G45 GAMING (MS-7821), BIOS V1.9 07/21/2014
Nov 24 17:46:36 Tower kernel: RIP: 0010:__domain_mapping+0x205/0x2dd
Nov 24 17:46:36 Tower kernel: Code: 48 c7 c7 b7 b6 d7 81 e8 1f a2 c7 ff 8b 05 8b 5c a5 00 85 c0 74 08 ff c8 89 05 7f 5c a5 00 48 c7 c7 79 fc d2 81 e8 01 a2 c7 ff <0f> 0b 8b 54 24 24 b8 34 00 00 00 8d 0c d2 83 e9 09 83 f9 34 0f 4f
Nov 24 17:46:36 Tower kernel: RSP: 0018:ffff8887ff983d68 EFLAGS: 00010246
Nov 24 17:46:36 Tower kernel: RAX: 0000000000000024 RBX: 000000013e5b4002 RCX: 0000000000000470
Nov 24 17:46:36 Tower kernel: RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000247
Nov 24 17:46:36 Tower kernel: RBP: 0000000000000001 R08: 0000000000000003 R09: 0000000000014800
Nov 24 17:46:36 Tower kernel: R10: 0000000000000000 R11: 0000000000000044 R12: 00000000000ffcca
Nov 24 17:46:36 Tower kernel: R13: 0000000000000000 R14: ffff8887f8750b00 R15: ffff8887f8b00650
Nov 24 17:46:36 Tower kernel: FS: 0000000000000000(0000) GS:ffff8887ff980000(0000) knlGS:0000000000000000
Nov 24 17:46:36 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 24 17:46:36 Tower kernel: CR2: 000000a7288ea25a CR3: 0000000001e0a001 CR4: 00000000001626e0
Nov 24 17:46:36 Tower kernel: Call Trace:
Nov 24 17:46:36 Tower kernel: <IRQ>
Nov 24 17:46:36 Tower kernel: domain_mapping+0x16/0xa7
Nov 24 17:46:36 Tower kernel: __intel_map_single+0xc8/0x122
Nov 24 17:46:36 Tower kernel: alx_refill_rx_ring+0x145/0x21d [alx]
Nov 24 17:46:36 Tower kernel: alx_poll+0x364/0x3f1 [alx]
Nov 24 17:46:36 Tower kernel: net_rx_action+0x107/0x26c
Nov 24 17:46:36 Tower kernel: __do_softirq+0xc9/0x1d7
Nov 24 17:46:36 Tower kernel: irq_exit+0x5e/0x9d
Nov 24 17:46:36 Tower kernel: do_IRQ+0xb2/0xd0
Nov 24 17:46:36 Tower kernel: common_interrupt+0xf/0xf
Nov 24 17:46:36 Tower kernel: </IRQ>
Nov 24 17:46:36 Tower kernel: RIP: 0010:cpuidle_enter_state+0xe8/0x141
Nov 24 17:46:36 Tower kernel: Code: ff 45 84 f6 74 1d 9c 58 0f 1f 44 00 00 0f ba e0 09 73 09 0f 0b fa 66 0f 1f 44 00 00 31 ff e8 7a 8d bb ff fb 66 0f 1f 44 00 00 <48> 2b 2c 24 b8 ff ff ff 7f 48 b9 ff ff ff ff f3 01 00 00 48 39 cd
Nov 24 17:46:36 Tower kernel: RSP: 0018:ffffc900031bfe98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdc
Nov 24 17:46:36 Tower kernel: RAX: ffff8887ff99fac0 RBX: ffff8887ff9aa300 RCX: 000000000000001f
Nov 24 17:46:36 Tower kernel: RDX: 0000000000000000 RSI: 0000000028000347 RDI: 0000000000000000
Nov 24 17:46:36 Tower kernel: RBP: 0000b0257c10b1e2 R08: 0000b0257c10b1e2 R09: 0000b02569a50110
Nov 24 17:46:36 Tower kernel: R10: 0000000000005a6c R11: 071c71c71c71c71c R12: 0000000000000005
Nov 24 17:46:36 Tower kernel: R13: ffffffff81e5b120 R14: 0000000000000000 R15: ffffffff81e5b318
Nov 24 17:46:36 Tower kernel: ? cpuidle_enter_state+0xbf/0x141
Nov 24 17:46:36 Tower kernel: do_idle+0x17e/0x1fc
Nov 24 17:46:36 Tower kernel: cpu_startup_entry+0x6a/0x6c
Nov 24 17:46:36 Tower kernel: start_secondary+0x197/0x1b2
Nov 24 17:46:36 Tower kernel: secondary_startup_64+0xa4/0xb0
Nov 24 17:46:36 Tower kernel: ---[ end trace cf87be4a2df4b994 ]---
Nov 24 17:46:36 Tower kernel: DMAR: DRHD: handling fault status reg 3
Nov 24 17:46:36 Tower kernel: DMAR: [DMA Write] Request device [02:00.0] fault addr ffcca000 [fault reason 05] PTE Write access is not set
Nov 24 17:46:36 Tower kernel: DMAR: ERROR: DMA PTE for vPFN 0xffcca already set (to 2000 not 13920c002)
Nov 24 17:46:36 Tower kernel: ------------[ cut here ]------------

I'm really hopeful that somebody has some guidance for me here as I'd love to get this system rock solid stable.

tower-diagnostics-20201124-1828.zip
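Since the trace repeats for thousands of lines, a quick tally can make it easier to see which PCI device and which driver module keep showing up. The following is only a rough sketch, not an official Unraid or kernel tool: the function name `summarize_dmar`, the default file name `syslog.txt`, and the regular expressions are all assumptions, with the patterns modeled on the line format of this particular 4.19 kernel log (other kernel versions may print DMAR faults differently).

```python
import re
import sys
from collections import Counter

# Assumed pattern, modeled on lines such as:
#   DMAR: [DMA Write] Request device [02:00.0] fault addr ffcca000 [fault reason 05]
FAULT_RE = re.compile(
    r"DMAR: \[DMA (Read|Write)\] Request device \[([0-9a-f:.]+)\] "
    r"fault addr ([0-9a-f]+) \[fault reason (\d+)\]"
)

# Assumed pattern for call-trace frames that belong to a loadable module, e.g.:
#   alx_poll+0x364/0x3f1 [alx]
MODULE_FRAME_RE = re.compile(r"\b\w+\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]")


def summarize_dmar(text):
    """Count DMAR faults per (device, operation, reason) and module frames per module."""
    faults = Counter()
    for operation, device, _addr, reason in FAULT_RE.findall(text):
        faults[(device, operation, reason)] += 1
    modules = Counter(MODULE_FRAME_RE.findall(text))
    return faults, modules


if __name__ == "__main__":
    # "syslog.txt" is a hypothetical default; pass the real path as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "syslog.txt"
    with open(path, errors="replace") as handle:
        faults, modules = summarize_dmar(handle.read())
    for (device, operation, reason), count in faults.most_common():
        print(f"{count:6d}  device {device}  DMA {operation}  fault reason {reason}")
    for module, count in modules.most_common():
        print(f"{count:6d}  call-trace frames in module [{module}]")
```

On the excerpt above, this would count the fault against device 02:00.0 and the two call-trace frames in the alx module. Whether 02:00.0 is in fact the alx network adapter is not stated in the log itself; `lspci -s 02:00.0` on the server would confirm it.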
trurl Posted November 25, 2020

Maybe unrelated, but why do you have a 200G docker.img? 20G should be much more than enough. Have you had problems filling it? I see you have also put it in the appdata share, possibly so you can outsmart the CA Backup plugin and make it back up docker.img. There is no good reason to back up docker.img.
chrischerman Posted November 25, 2020

When I first set this up I wrongly assumed bigger is better. When my wife gets done with Plex here in a little bit, I'm planning to delete my docker.img file and recreate it to eliminate any corruption there. Is that the correct time to resize it back down to 20GB? What's the correct location for it?
chrischerman Posted November 30, 2020

Does anybody have any thoughts on this? Same thing just happened again after 4 days of uptime.
trurl Posted November 30, 2020

On 11/24/2020 at 10:29 PM, chrischerman said: "What's the correct location for it?"

The usual place for docker.img is in the system share.