March 9, 2023

Hi all,

I've got a relatively new build, perhaps two months old now, that seems stable with Unraid until I start Docker containers. I'm in the process of starting containers one by one to see which one is causing the issues. But is there a better way? I'm also trying to set up a remote syslog service to see if I can capture anything meaningful.

Here is a picture of a small HDMI monitor I have attached, taken while I had this command running. My hope was to catch an error in the act and get something useful.

tail -f /var/log/syslog

This means little to me.

Since I've seen it done here, I'm attaching two different diagnostics files. The 20230218 file is from right before or around the time of a kernel panic (not the one in the picture above). The second is just a recent one, though I've made some drive changes.

After each panic, which results in an unclean shutdown, I let the parity check run, which with my array takes around 24 hours. So checking the Docker containers is slow going.

My suspicion is that using the iGPU in multiple containers may have something to do with it. Or is a specific version of Plex causing issues? Some of the *arrs have also been heavy on my system in the past.

Is there other info I can provide that may narrow the search down?

Other things I've tried:
I've run a memtest, which came back fine after two passes.
I've moved all SATA cables other than three to the motherboard's onboard ports. I am using one of these PCIe SATA cards for the two cache SSDs and one drive. I know that's less ideal than an HBA.
I recently bought this LSI 9220-8i, which was supposed to be flashed to IT mode (I'm not sure it was), but I couldn't get Unraid to recognize the drives.
I've pulled every other card out, leaving only that PCIe SATA card, the drives, the flash drive, and my mini HDMI screen (powered by the rear USB ports).

Other basic specs:
i9-13900K
ASUSTeK COMPUTER INC. ProArt Z690-CREATOR WIFI, Version Rev 1.xx
American Megatrends Inc., Version 1720
Corsair DOMINATOR PLATINUM RGB DDR5 128GB (4x32GB) 5200MHz
All array drives are WD181KFGX 18TB drives.
Cache SSDs are a Samsung SSD 870 EVO 1TB and a Samsung SSD 850 EVO 2TB.
My most used cache drive is an NVMe: Samsung SSD 980 1TB.

tower-diagnostics-20230218-1145.zip
tower-diagnostics-20230308-2143.zip
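As a complement to watching `tail -f /var/log/syslog` on a monitor, a saved copy of the log can be filtered for the lines that usually matter after a crash. The following is only a rough sketch: the list of markers and the names `ERROR_PATTERN`, `find_kernel_errors`, and `scan_file` are assumptions made for illustration, not anything Unraid ships.

```python
import re
from typing import Iterable, Iterator, List

# Markers that commonly accompany user-space crashes and kernel oopses.
# This list is an assumption for illustration and is not exhaustive.
ERROR_PATTERN = re.compile(
    r"segfault at|general protection fault|BUG:|Oops:|Call Trace:|kernel panic",
    re.IGNORECASE,
)


def find_kernel_errors(lines: Iterable[str]) -> Iterator[str]:
    """Yield kernel log lines that contain one of the error markers."""
    for line in lines:
        if "kernel:" in line and ERROR_PATTERN.search(line):
            yield line.rstrip("\n")


def scan_file(path: str) -> List[str]:
    """Return the matching lines from a saved syslog file."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(path, errors="replace") as handle:
        return list(find_kernel_errors(handle))
```

Calling `scan_file` on a syslog copy taken from a diagnostics zip would return only the lines worth quoting in a post, which is easier to read than a photo of a scrolling screen.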
March 9, 2023

One of the diags is showing various apps segfaulting. Run memtest; if no errors are found, enable the syslog server and post the resulting log after a crash.
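If the log is to be captured on a second machine, so that it survives a crash of the server itself, a very small receiver is enough for a quick test. This is a minimal sketch that assumes the sender forwards plain syslog messages over UDP; the port `5514`, the file name `remote-syslog.log`, and the names `format_record`, `SyslogHandler`, and `serve` are hypothetical choices for this example.

```python
import socketserver
from datetime import datetime

LOG_PATH = "remote-syslog.log"   # hypothetical output file
LISTEN_ADDR = ("0.0.0.0", 5514)  # hypothetical port; 514 is the usual one but needs root


def format_record(payload: bytes, sender_ip: str, now: datetime) -> str:
    """Turn one received datagram into a single timestamped log line."""
    text = payload.decode("utf-8", errors="replace").strip()
    return f"{now.isoformat(timespec='seconds')} {sender_ip} {text}\n"


class SyslogHandler(socketserver.BaseRequestHandler):
    def handle(self) -> None:
        # For UDP servers, self.request is a (data, socket) pair.
        payload = self.request[0]
        record = format_record(payload, self.client_address[0], datetime.now())
        # Reopen and close the file per message so each record is handed
        # to the operating system right away.
        with open(LOG_PATH, "a", encoding="utf-8") as log:
            log.write(record)


def serve() -> None:
    """Listen for syslog datagrams until interrupted."""
    with socketserver.UDPServer(LISTEN_ADDR, SyslogHandler) as server:
        server.serve_forever()
```

Running `serve()` on the receiving machine and pointing the server's remote syslog setting at that address and port should produce one line per message in `remote-syslog.log`.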
March 10, 2023 (Author)

The memory test passed fine - see results.

I tried starting and then stopping each container individually while watching the logs.

Starting the Plex container caused this error:

Mar 10 13:22:32 Tower kernel: Plex Script Hos[24901]: segfault at 0 ip 00001490f655efdb sp 00007ffd79bded50 error 6 in libpython27.so[1490f6436000+179000]
Mar 10 13:22:32 Tower kernel: Code: 48 83 c4 20 49 89 c5 48 8b 7d d0 48 ff 0f 75 07 48 8b 47 08 ff 50 30 48 8b 03 4c 39 f8 76 22 48 8d 48 f8 48 89 0b 48 8b 78 f8 <48> ff 0f 75 0a 48 8b 47 08 ff 50 30 48 8b 0b 48 89 c8 4c 39 f9 77
Mar 10 13:22:44 Tower kernel: Plex Tuner Serv[24915]: segfault at 58 ip 000014dc1dbefd0f sp 000014dc1bfd86e8 error 4 in ld-musl-x86_64.so.1[14dc1dba8000+53000]
Mar 10 13:22:44 Tower kernel: Code: 00 00 00 48 8b 44 24 f8 48 89 47 20 0f 28 44 24 e8 0f 11 47 10 0f 28 44 24 d8 0f 11 07 48 85 f6 74 05 8b 06 89 47 10 31 c0 c3 <f6> 47 10 0f 75 10 b9 10 00 00 00 31 c0 f0 0f b1 0f 9b 85 c0 74 07

And starting Sonarr:

Mar 10 13:24:12 Tower kernel: BUG: kernel NULL pointer dereference, address: 0000000000000030
Mar 10 13:24:12 Tower kernel: #PF: supervisor read access in kernel mode
Mar 10 13:24:12 Tower kernel: #PF: error_code(0x0000) - not-present page
Mar 10 13:24:12 Tower kernel: PGD 58db05067 P4D 58db05067 PUD 211f60067 PMD 0
Mar 10 13:24:12 Tower kernel: Oops: 0000 [#4] PREEMPT SMP NOPTI
Mar 10 13:24:12 Tower kernel: CPU: 8 PID: 29534 Comm: pgrep Tainted: G D 5.19.17-Unraid #2
Mar 10 13:24:12 Tower kernel: Hardware name: ASUS System Product Name/ProArt Z690-CREATOR WIFI, BIOS 1720 08/12/2022
Mar 10 13:24:12 Tower kernel: RIP: 0010:do_dentry_open+0x212/0x2bf
Mar 10 13:24:12 Tower kernel: Code: 00 48 85 d2 74 0e 48 83 7a 58 00 74 07 81 4b 44 00 00 40 00 81 63 40 3f fc ff ff bd ea ff ff ff 48 8b 00 48 8d bb 98 00 00 00 <48> 8b 70 30 e8 cc fc f7 ff 48 b8 00 40 00 00 00 00 40 00 48 23 43
Mar 10 13:24:12 Tower kernel: RSP: 0018:ffffc90001997cc8 EFLAGS: 00010206
Mar 10 13:24:12 Tower kernel: RAX: 0000000000000000 RBX: ffff888186b22c00 RCX: 0000000000000000
Mar 10 13:24:12 Tower kernel: RDX: 0000000000000000 RSI: ffffffff820e517e RDI: ffff888186b22c98
Mar 10 13:24:12 Tower kernel: RBP: 00000000ffffffea R08: 0000000000000dc0 R09: 0000000000000001
Mar 10 13:24:12 Tower kernel: R10: 0000000000000000 R11: 0000000000000fe0 R12: ffff888100677938
Mar 10 13:24:12 Tower kernel: R13: ffff888186b22c10 R14: ffffffff812866e8 R15: 0000000000000000
Mar 10 13:24:12 Tower kernel: FS: 00001457cb2827c0(0000) GS:ffff88a02d200000(0000) knlGS:0000000000000000
Mar 10 13:24:12 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 10 13:24:12 Tower kernel: CR2: 0000000000000030 CR3: 00000004e736a005 CR4: 0000000000770ee0
Mar 10 13:24:12 Tower kernel: PKRU: 55555554
Mar 10 13:24:12 Tower kernel: Call Trace:
Mar 10 13:24:12 Tower kernel: <TASK>
Mar 10 13:24:12 Tower kernel: path_openat+0x950/0xaa9
Mar 10 13:24:12 Tower kernel: do_filp_open+0x55/0xb8
Mar 10 13:24:12 Tower kernel: ? getname_flags+0x29/0x152
Mar 10 13:24:12 Tower kernel: ? kmem_cache_alloc+0x11a/0x143
Mar 10 13:24:12 Tower kernel: do_sys_openat2+0x6c/0xd9
Mar 10 13:24:12 Tower kernel: do_sys_open+0x3a/0x5a
Mar 10 13:24:12 Tower kernel: do_syscall_64+0x68/0x81
Mar 10 13:24:12 Tower kernel: entry_SYSCALL_64_after_hwframe+0x63/0xcd
Mar 10 13:24:12 Tower kernel: RIP: 0033:0x1457cb4470f1
Mar 10 13:24:12 Tower kernel: Code: 75 37 89 f0 25 00 00 41 00 3d 00 00 41 00 74 29 80 3d fa b4 0e 00 00 74 4d 89 da 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 77 48 83 c4 68 5b 5d c3 48 8d 84 24 80 00 00
Mar 10 13:24:12 Tower kernel: RSP: 002b:00007ffce139ae90 EFLAGS: 00000202 ORIG_RAX: 0000000000000101
Mar 10 13:24:12 Tower kernel: RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00001457cb4470f1
Mar 10 13:24:12 Tower kernel: RDX: 0000000000000000 RSI: 00007ffce139af20 RDI: 00000000ffffff9c
Mar 10 13:24:12 Tower kernel: RBP: 00007ffce139af20 R08: 0000000000000038 R09: 0000000000000073
Mar 10 13:24:12 Tower kernel: R10: 0000000000000000 R11: 0000000000000202 R12: 00007ffce139b150
Mar 10 13:24:12 Tower kernel: R13: 0000000000409530 R14: 0000000000000020 R15: 00007ffce139b150
Mar 10 13:24:12 Tower kernel: </TASK>
Mar 10 13:24:12 Tower kernel: Modules linked in: xt_mark xt_nat veth xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge xfs nfsd auth_rpcgss oid_registry lockd grace sunrpc md_mod efivarfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet 8021q garp mrp stp llc bonding tls igc atlantic i915 iosf_mbi drm_buddy i2c_algo_bit ttm drm_display_helper drm_kms_helper drm btusb btrtl btbcm btintel bluetooth intel_gtt nvme agpgart i2c_i801 input_leds wmi_bmof x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd intel_cstate intel_uncore i2c_smbus thunderbolt ecdh_generic led_class ecc i2c_core nvme_core joydev ahci libahci syscopyarea sysfillrect
Mar 10 13:24:12 Tower kernel: sysimgblt fb_sys_fops tpm_crb tpm_tis thermal fan wmi tpm_tis_core video tpm backlight acpi_tad acpi_pad button unix [last unloaded: igc]
Mar 10 13:24:12 Tower kernel: CR2: 0000000000000030
Mar 10 13:24:12 Tower kernel: ---[ end trace 0000000000000000 ]---
Mar 10 13:24:12 Tower kernel: RIP: 0010:do_dentry_open+0x1ee/0x2bf
Mar 10 13:24:12 Tower kernel: Code: 28 48 83 7a 18 00 75 07 48 83 7a 28 00 74 08 0d 00 00 04 00 89 43 44 48 8b 83 d0 00 00 00 48 8b 90 90 00 00 00 48 85 d2 74 0e <48> 83 7a 58 00 74 07 81 4b 44 00 00 40 00 81 63 40 3f fc ff ff bd
Mar 10 13:24:12 Tower kernel: RSP: 0018:ffffc9000b81fcc8 EFLAGS: 00010206
Mar 10 13:24:12 Tower kernel: RAX: ffff888125fa3a3a RBX: ffff8881a333be00 RCX: 0000000000000000
Mar 10 13:24:12 Tower kernel: RDX: 3ad0000000000000 RSI: ffffffff820e517e RDI: ffff8881097ac0b0
Mar 10 13:24:12 Tower kernel: RBP: 0000000000000000 R08: 0000000000000dc0 R09: 0000000000000001
Mar 10 13:24:12 Tower kernel: R10: 0000000000000000 R11: 0000000000000fe0 R12: ffff888125fa38b8
Mar 10 13:24:12 Tower kernel: R13: ffff8881a333be10 R14: ffffffff812866e8 R15: 0000000000000000
Mar 10 13:24:12 Tower kernel: FS: 00001457cb2827c0(0000) GS:ffff88a02d200000(0000) knlGS:0000000000000000
Mar 10 13:24:12 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 10 13:24:12 Tower kernel: CR2: 0000000000000030 CR3: 00000004e736a005 CR4: 0000000000770ee0
Mar 10 13:24:12 Tower kernel: PKRU: 55555554

I tried again with just Plex:

Mar 10 13:26:36 Tower kernel: traps: Plex Script Hos[1798] general protection fault ip:1532784d97d0 sp:7fff7c0816e8 error:0 in ld-musl-x86_64.so.1[1532784c9000+53000]

And then again with just Sonarr: no errors.

A third time with just Plex:

Mar 10 13:29:15 Tower kernel: Plex Script Hos[7454]: segfault at 0 ip 000014d27435efdb sp 00007fff3e6131d0 error 6 in libpython27.so[14d274236000+179000]
Mar 10 13:29:15 Tower kernel: Code: 48 83 c4 20 49 89 c5 48 8b 7d d0 48 ff 0f 75 07 48 8b 47 08 ff 50 30 48 8b 03 4c 39 f8 76 22 48 8d 48 f8 48 89 0b 48 8b 78 f8 <48> ff 0f 75 0a 48 8b 47 08 ff 50 30 48 8b 0b 48 89 c8 4c 39 f9 77
Mar 10 13:29:15 Tower kernel: Plex Script Hos[7529]: segfault at 149b16b8ad50 ip 0000149b16b8ad50 sp 00007fff6ac706b8 error 15
Mar 10 13:29:15 Tower kernel: Code: 00 00 48 c8 02 18 9b 14 00 00 07 00 00 00 00 00 00 00 f3 f3 e7 29 50 4a 87 ee 00 00 00 00 69 6d 70 6f 72 74 09 00 00 00 00 00 <41> 00 00 00 00 00 00 00 48 c8 02 18 9b 14 00 00 06 00 00 00 00 00

And then with most other containers running, except Plex and Sonarr: no errors so far, but I'm going to leave them on.

I'm attaching my Docker configurations for both Plex and Sonarr. I just saved them as PDFs; I'm not sure if there is a better way to share them. I'm also attaching an updated diagnostics zip file, plus the syslog file on its own for convenience.

Are the Linux Server Plex or Hotio Sonarr containers known to be problematic?

tower-diagnostics-20230310-1433.zip
Tower_Plex.pdf
Tower_Sonarr.pdf
syslog-192.168.1.201.log
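To see at a glance which processes and libraries the user-space crashes above land in, the `segfault at` lines can be parsed and counted. The sketch below is illustrative only: the names `SEGFAULT_RE`, `decode_error`, `parse_segfault`, and `tally` are hypothetical, and it assumes the x86 page-fault error code layout (bit 0 protection violation, bit 1 write, bit 2 user mode, bit 4 instruction fetch) with the code printed in hexadecimal, which is my understanding of how the kernel formats these lines. The differently shaped `traps: ... general protection fault` line is not handled.

```python
import re
from collections import Counter
from typing import Dict, Iterable, Optional

# Matches lines such as:
#   "... kernel: NAME[PID]: segfault at ADDR ip IP sp SP error CODE in OBJECT[...]"
# The trailing " in OBJECT[" part is optional because some lines omit it.
SEGFAULT_RE = re.compile(
    r"kernel: (?P<comm>.+?)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<error>[0-9a-f]+)"
    r"(?: in (?P<object>[^\[\s]+)\[)?"
)


def decode_error(code: int) -> Dict[str, str]:
    """Decode the page-fault error bits (assumed x86 layout, see lead-in)."""
    return {
        "cause": "protection violation" if code & 0x01 else "page not present",
        "access": "write" if code & 0x02 else "read",
        "mode": "user" if code & 0x04 else "kernel",
        "fetch": "instruction fetch" if code & 0x10 else "data access",
    }


def parse_segfault(line: str) -> Optional[Dict[str, object]]:
    """Return the parsed fields of one segfault line, or None if it is not one."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "process": fields["comm"],
        "address": int(fields["addr"], 16),
        "object": fields["object"],  # None when the line names no mapping
        "error": decode_error(int(fields["error"], 16)),  # parsed as hex
    }


def tally(lines: Iterable[str]) -> Counter:
    """Count segfaults per (process, object) pair."""
    counts: Counter = Counter()
    for line in lines:
        parsed = parse_segfault(line)
        if parsed is not None:
            counts[(parsed["process"], parsed["object"])] += 1
    return counts
```

Crashes spread across unrelated binaries and libraries, rather than repeating in one place, tend to point away from a single misbehaving container, so a tally like this can help decide where to look next.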