bdarnell

Members
  • Posts

    32

Everything posted by bdarnell

  1. I have a known-good motherboard/CPU/RAM set that I can swap in. It is currently in use for Unraid with a similar setup, but swapping it in should help rule out those components.
  2. I downgraded to 6.11.5 and started a parity check, which got stuck again. Since I have swapped everything except the motherboard and CPU, I assume it has to be one of those. How do I know which? piggy-diagnostics-20240102-1314.zip
  3. Would it make sense that both sticks are faulty? In all my years of building computers, I've never run into a faulty RAM stick. What are the chances of both of them being bad?
  4. I haven't had time to do an actual mem test, but this is what I've done so far. I hope it provides some useful feedback. In the BIOS I disabled all automatic RAM options and set them all to default or stable. The system has 2x 32gb RAM sticks. I ran the system on a single stick with the other stick removed, then vice versa. Both scenarios had the same result: the parity check got stuck at random percentages. I chose this approach first because it only costs a few minutes of downtime, versus a mem test that could mean 24-48hrs of downtime. I'm willing to purchase additional RAM rather than take the system offline. Let me know what you think. piggy-diagnostics-20231224-2135.zip
  5. This has been an ongoing issue for months. I've swapped basically everything, and I'm not sure what else to do. I've attached a diagnostic log to see if anything there helps. This time the parity check is stuck at 10.3% and the estimated finish has now extended to 600+ days. OK, give me a minute; the Unraid Firefox browser won't let me attach the file right now. I'm going to reboot and attach it after this post.
  6. I think I have found my issue, after swapping everything (CPU, motherboard, RAM, every single data, signal, and power cable, SAS expander, HBA card, PSU), wasting 5 months of my life, and letting return windows lapse, so I'm stuck with all this extra hardware. The WD 20tb drives use more power than the WD 10tb drives. The PSU that runs the case containing only hard drives is a Corsair RM750x. On the box it states that the 5v rail max is 20a and the 12v rail max is 62.5a. The WD 20tb drives state on the label that each drive needs [email protected] and [email protected] so when you multiply that by 15 drives you get [email protected] and [email protected]. Both of which are well below the max of the PSU. But as I pulled drives out one by one and put them on another PSU, the errors slowly went away. My current data rebuild is finally progressing, on #10 of 15. On the RM750x PSU I have 7x 20tb drives and 5x 10tb drives, with the remaining 3x 20tb on a separate PSU. If I add any more drives to the RM750x PSU, that is when I start getting errors, and swapping data or power cables makes no difference. The errors went away only when I lowered the power draw on that PSU. I'll be looking for a new power supply that can hopefully run all 15x WD 20tb drives without giving me errors. Do they make a device that can monitor the actual amperage draw on each PSU voltage rail?
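The rail arithmetic described in the post above can be sketched as below. This is only an illustration: the per-drive currents are not legible in the post, so the values in the usage lines are hypothetical placeholders, and only the rail limits (20 A on 5 V, 62.5 A on 12 V) and the drive count of 15 come from the post. The `surge_factor` parameter is also an assumption, included because drives may draw more than their label current while spinning up.

```python
def rail_load(drive_count, amps_per_drive, rail_max_amps, surge_factor=1.0):
    """Return (total_amps, fraction_of_rail_max) for one voltage rail.

    surge_factor is a hypothetical multiplier for spin-up or peak draw;
    1.0 takes the label current at face value.
    """
    total = drive_count * amps_per_drive * surge_factor
    return total, total / rail_max_amps


# Rail limits quoted from the PSU box in the post.
RAIL_5V_MAX = 20.0
RAIL_12V_MAX = 62.5

# Hypothetical per-drive currents; substitute the values printed on the drive label.
total_5v, frac_5v = rail_load(15, 0.5, RAIL_5V_MAX)
total_12v, frac_12v = rail_load(15, 1.0, RAIL_12V_MAX, surge_factor=2.0)
print(f"5 V rail: {total_5v:.1f} A ({frac_5v:.0%} of max)")
print(f"12 V rail: {total_12v:.1f} A ({frac_12v:.0%} of max)")
```

A calculation like this only compares label figures against rail ratings; it cannot show what the PSU actually delivers under load, which is why the measured behavior in the post can differ from the arithmetic.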
  7. I swapped the SATA power cable with a known-good cable and had exactly the same result as before. The things not changed yet are the 2x LSI 9217-8i controllers and the power supply (Corsair HX750). What else could it be? Or what parts would you recommend I use instead? I just want my system to be stable again. All these parts and wires work fine on my other system, which uses the exact same parts.
  8. Almost forgot the diagnostics... attached here: piggy-diagnostics-20230509-2219.zip
  9. I swapped out the motherboard, CPU, and RAM with those from a known-good system and started the parity rebuild again (I lost count of how many times I started it on the previous setup). Parity made it to 13%, then disk 3 produced 1 million errors. Disk 3 is new, has had 2x preclear cycles, and has been installed for months with no issues. The syslog went from 200k to 4gb. Is it my 2x LSI 9217-8i controllers?
  10. Yes is the same scenario when using v6.11.5 Here is an error from that version May 4 20:50:24 piggy kernel: md: recovery thread: multiple disk errors, sector=7498539328 May 4 20:50:24 piggy kernel: ------------[ cut here ]------------ May 4 20:50:24 piggy kernel: kernel BUG at drivers/md/unraid.c:1617! May 4 20:50:24 piggy kernel: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI May 4 20:50:24 piggy kernel: CPU: 0 PID: 10725 Comm: unraidd0 Not tainted 5.19.17-Unraid #2 May 4 20:50:24 piggy kernel: Hardware name: Gigabyte Technology Co., Ltd. Z690 AERO D/Z690 AERO D, BIOS F23a 01/04/2023 May 4 20:50:24 piggy kernel: RIP: 0010:unraidd+0x1051/0x1140 [md_mod] May 4 20:50:24 piggy kernel: Code: 00 83 3d 99 50 00 00 03 7e 16 41 8b 56 98 89 e9 48 c7 c7 19 c3 10 a0 48 8b 73 20 e8 06 1e 71 e1 41 f6 86 69 ff ff ff 02 75 02 <0f> 0b 48 8b 43 20 49 03 47 10 41 c7 46 b0 00 10 00 00 49 8b 56 10 May 4 20:50:24 piggy kernel: RSP: 0018:ffffc90003dc3df0 EFLAGS: 00010246 May 4 20:50:24 piggy kernel: RAX: 0000000000000000 RBX: ffff8881986cee08 RCX: 0000000000000000 May 4 20:50:24 piggy kernel: RDX: 0000000000000000 RSI: ffffffff828e59e0 RDI: ffff888106aa2c38 May 4 20:50:24 piggy kernel: RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000 May 4 20:50:24 piggy kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff888164997110 May 4 20:50:24 piggy kernel: R13: ffff8881986cf000 R14: ffff8881986cf078 R15: ffff8881673452d8 May 4 20:50:24 piggy kernel: FS: 0000000000000000(0000) GS:ffff88907f400000(0000) knlGS:0000000000000000 May 4 20:50:24 piggy kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 4 20:50:24 piggy kernel: CR2: 0000155417e48000 CR3: 00000001eaad2003 CR4: 0000000000770ef0 May 4 20:50:24 piggy kernel: PKRU: 55555554 May 4 20:50:24 piggy kernel: Call Trace: May 4 20:50:24 piggy kernel: <TASK> May 4 20:50:24 piggy kernel: md_thread+0x100/0x12e [md_mod] May 4 20:50:24 piggy kernel: ? _raw_spin_rq_lock_irqsave+0x20/0x20 May 4 20:50:24 piggy kernel: ? 
md_seq_show+0x720/0x720 [md_mod] May 4 20:50:24 piggy kernel: kthread+0xe4/0xef May 4 20:50:24 piggy kernel: ? kthread_complete_and_exit+0x1b/0x1b May 4 20:50:24 piggy kernel: ret_from_fork+0x1f/0x30 May 4 20:50:24 piggy kernel: </TASK> May 4 20:50:24 piggy kernel: Modules linked in: tun veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs md_mod tcp_diag inet_diag efivarfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables bridge stp llc bonding tls igc atlantic gigabyte_wmi wmi_bmof x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd intel_cstate intel_uncore i2c_i801 i2c_smbus thunderbolt i915 iosf_mbi drm_buddy i2c_algo_bit ttm drm_display_helper ahci libahci drm_kms_helper joydev input_leds led_class btusb drm btrtl btbcm btintel bluetooth mpt3sas intel_gtt nvme agpgart ecdh_generic i2c_core nvme_core ecc raid_class syscopyarea scsi_transport_sas sysfillrect sysimgblt fb_sys_fops thermal wmi fan tpm_crb tpm_tis tpm_tis_core video tpm backlight acpi_pad acpi_tad button unix May 4 20:50:24 piggy kernel: [last unloaded: igc] May 4 20:50:24 piggy kernel: ---[ end trace 0000000000000000 ]--- May 4 20:50:24 piggy kernel: RIP: 0010:unraidd+0x1051/0x1140 [md_mod] May 4 20:50:24 piggy kernel: Code: 00 83 3d 99 50 00 00 03 7e 16 41 8b 56 98 89 e9 48 c7 c7 19 c3 10 a0 48 8b 73 20 e8 06 1e 71 e1 41 f6 86 69 ff ff ff 02 75 02 <0f> 0b 48 8b 43 20 49 03 47 10 41 c7 46 b0 00 10 00 00 49 8b 56 10 May 4 20:50:24 piggy kernel: RSP: 0018:ffffc90003dc3df0 EFLAGS: 00010246 May 4 20:50:24 piggy kernel: RAX: 0000000000000000 RBX: ffff8881986cee08 RCX: 0000000000000000 May 4 20:50:24 piggy kernel: RDX: 0000000000000000 RSI: ffffffff828e59e0 RDI: ffff888106aa2c38 May 4 20:50:24 piggy kernel: RBP: 0000000000000001 R08: 0000000000000000 R09: 
0000000000000000 May 4 20:50:24 piggy kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff888164997110 May 4 20:50:24 piggy kernel: R13: ffff8881986cf000 R14: ffff8881986cf078 R15: ffff8881673452d8 May 4 20:50:24 piggy kernel: FS: 0000000000000000(0000) GS:ffff88907f400000(0000) knlGS:0000000000000000 May 4 20:50:24 piggy kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 4 20:50:24 piggy kernel: CR2: 0000155417e48000 CR3: 00000001eaad2004 CR4: 0000000000770ef0 May 4 20:50:24 piggy kernel: PKRU: 55555554 May 4 20:50:24 piggy kernel: ------------[ cut here ]------------ May 4 20:50:24 piggy kernel: WARNING: CPU: 0 PID: 10725 at kernel/exit.c:741 do_exit+0x39/0x8e5 May 4 20:50:24 piggy kernel: Modules linked in: tun veth xt_nat xt_tcpudp xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs md_mod tcp_diag inet_diag efivarfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables bridge stp llc bonding tls igc atlantic gigabyte_wmi wmi_bmof x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd intel_cstate intel_uncore i2c_i801 i2c_smbus thunderbolt i915 iosf_mbi drm_buddy i2c_algo_bit ttm drm_display_helper ahci libahci drm_kms_helper joydev input_leds led_class btusb drm btrtl btbcm btintel bluetooth mpt3sas intel_gtt nvme agpgart ecdh_generic i2c_core nvme_core ecc raid_class syscopyarea scsi_transport_sas sysfillrect sysimgblt fb_sys_fops thermal wmi fan tpm_crb tpm_tis tpm_tis_core video tpm backlight acpi_pad acpi_tad button unix May 4 20:50:24 piggy kernel: [last unloaded: igc] May 4 20:50:24 piggy kernel: CPU: 0 PID: 10725 Comm: unraidd0 Tainted: G D 5.19.17-Unraid #2 May 4 20:50:24 piggy kernel: Hardware name: Gigabyte Technology Co., Ltd. 
Z690 AERO D/Z690 AERO D, BIOS F23a 01/04/2023 May 4 20:50:24 piggy kernel: RIP: 0010:do_exit+0x39/0x8e5 May 4 20:50:24 piggy kernel: Code: 89 fd 53 48 83 ec 28 65 48 8b 04 25 28 00 00 00 48 89 44 24 20 31 c0 65 48 8b 1c 25 c0 bb 01 00 48 83 bb a0 07 00 00 00 74 02 <0f> 0b 48 8b bb c8 06 00 00 e8 b7 c0 7c 00 48 8b 83 c0 06 00 00 83 May 4 20:50:24 piggy kernel: RSP: 0018:ffffc90003dc3ee0 EFLAGS: 00010286 May 4 20:50:24 piggy kernel: RAX: 0000000000000000 RBX: ffff888107eec000 RCX: 0000000000000000 May 4 20:50:24 piggy kernel: RDX: 0000000000000000 RSI: 0000000000000003 RDI: 000000000000000b May 4 20:50:24 piggy kernel: RBP: 000000000000000b R08: 0000000000000000 R09: ffffffff828653f0 May 4 20:50:24 piggy kernel: R10: 00003fffffffffff R11: ffff8890bfbc5fde R12: ffffc90003dc3d48 May 4 20:50:24 piggy kernel: R13: ffff888107eec000 R14: 0000000000000002 R15: ffffffff820b236d May 4 20:50:24 piggy kernel: FS: 0000000000000000(0000) GS:ffff88907f400000(0000) knlGS:0000000000000000 May 4 20:50:24 piggy kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 4 20:50:24 piggy kernel: CR2: 0000155417e48000 CR3: 00000001eaad2004 CR4: 0000000000770ef0 May 4 20:50:24 piggy kernel: PKRU: 55555554 May 4 20:50:24 piggy kernel: Call Trace: May 4 20:50:24 piggy kernel: <TASK> May 4 20:50:24 piggy kernel: make_task_dead+0xba/0xba May 4 20:50:24 piggy kernel: rewind_stack_and_make_dead+0x17/0x17 May 4 20:50:24 piggy kernel: RIP: 0000:0x0 May 4 20:50:24 piggy kernel: Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. 
May 4 20:50:24 piggy kernel: RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000 May 4 20:50:24 piggy kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 May 4 20:50:24 piggy kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 May 4 20:50:24 piggy kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 May 4 20:50:24 piggy kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 May 4 20:50:24 piggy kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 May 4 20:50:24 piggy kernel: </TASK> May 4 20:50:24 piggy kernel: ---[ end trace 0000000000000000 ]---
  11. That error message came from Unraid version v6.12-rc5. If I don't run parity syncs or drive rebuilds, the system is stable. If it's a hardware issue, are there recommendations for disabling things in the BIOS (Gigabyte Aero D with i9-13900K)? All the parts in this build are brand new, and the cables have all been replaced with brand-new ones. What items should I look at? Should I completely unplug everything from the motherboard, power supply, and hard drives and reseat everything? Should I stress test the CPU and memory modules? Would the NVMe drives cause this issue? Does anyone else have issues with this combination of CPU and motherboard?
  12. I'm still getting errors during parity sync, which is locking up my Docker containers and bringing down Plex. I've tried updating my BIOS to the latest version and updating to the latest rc version of Unraid. What does this error mean?
      May 5 00:56:42 piggy kernel: md: recovery thread: multiple disk errors, sector=992025512
      May 5 00:56:42 piggy kernel: md: recovery thread: multiple disk errors, sector=992025512
      May 5 00:57:58 piggy kernel: md: recovery thread: multiple disk errors, sector=1019867216
      May 5 00:57:58 piggy kernel: md: recovery thread: multiple disk errors, sector=1019867216
      May 5 00:58:15 piggy kernel: md: recovery thread: multiple disk errors, sector=1025975768
      May 5 00:58:15 piggy kernel: ------------[ cut here ]------------
      May 5 00:58:15 piggy kernel: kernel BUG at drivers/md/unraid.c:1617!
      May 5 00:58:15 piggy kernel: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
      May 5 00:58:15 piggy kernel: CPU: 16 PID: 6813 Comm: unraidd0 Not tainted 6.1.27-Unraid #1
      May 5 00:58:15 piggy kernel: Hardware name: Gigabyte Technology Co., Ltd. 
Z690 AERO D/Z690 AERO D, BIOS F24a 04/25/2023 May 5 00:58:15 piggy kernel: RIP: 0010:unraidd+0x1051/0x1140 [md_mod] May 5 00:58:15 piggy kernel: Code: 00 83 3d 83 50 00 00 03 7e 16 41 8b 56 98 89 e9 48 c7 c7 21 c3 3e a0 48 8b 73 20 e8 ce b4 46 e1 41 f6 86 69 ff ff ff 02 75 02 <0f> 0b 48 8b 43 20 49 03 47 10 41 c7 46 b0 00 10 00 00 49 8b 56 10 May 5 00:58:15 piggy kernel: RSP: 0018:ffffc90000f3fdf0 EFLAGS: 00010246 May 5 00:58:15 piggy kernel: RAX: 0000000000000000 RBX: ffff88813e2dee08 RCX: 0000000000000000 May 5 00:58:15 piggy kernel: RDX: 0000000000000000 RSI: ffffffff829e4f00 RDI: ffff8881012ce038 May 5 00:58:15 piggy kernel: RBP: 0000000000000005 R08: 0000000000000000 R09: 0000000000000000 May 5 00:58:15 piggy kernel: R10: 0000000000000000 R11: 0000000000000000 R12: ffff88810bccb930 May 5 00:58:15 piggy kernel: R13: ffff88813e2df2c0 R14: ffff88813e2df338 R15: ffff88813e1195d8 May 5 00:58:15 piggy kernel: FS: 0000000000000000(0000) GS:ffff88907f800000(0000) knlGS:0000000000000000 May 5 00:58:15 piggy kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 5 00:58:15 piggy kernel: CR2: 000014663af54000 CR3: 0000000733336005 CR4: 0000000000770ee0 May 5 00:58:15 piggy kernel: PKRU: 55555554 May 5 00:58:15 piggy kernel: Call Trace: May 5 00:58:15 piggy kernel: <TASK> May 5 00:58:15 piggy kernel: md_thread+0xf4/0x122 [md_mod] May 5 00:58:15 piggy kernel: ? _raw_spin_rq_lock_irqsave+0x20/0x20 May 5 00:58:15 piggy kernel: ? signal_pending+0x1d/0x1d [md_mod] May 5 00:58:15 piggy kernel: kthread+0xe4/0xef May 5 00:58:15 piggy kernel: ? 
kthread_complete_and_exit+0x1b/0x1b May 5 00:58:15 piggy kernel: ret_from_fork+0x1f/0x30 May 5 00:58:15 piggy kernel: </TASK> May 5 00:58:15 piggy kernel: Modules linked in: tun xt_nat xt_tcpudp veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype br_netfilter xfs md_mod tcp_diag inet_diag efivarfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables bridge stp llc bonding tls igc atlantic i915 x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm iosf_mbi drm_buddy i2c_algo_bit ttm drm_display_helper drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 drm aesni_intel btusb crypto_simd btrtl cryptd btbcm mei_hdcp mei_pxp btintel i2c_i801 intel_gtt rapl intel_cstate gigabyte_wmi wmi_bmof bluetooth intel_uncore mpt3sas thunderbolt agpgart i2c_smbus nvme mei_me ahci input_leds i2c_core ecdh_generic joydev led_class mei nvme_core libahci raid_class ecc syscopyarea sysfillrect scsi_transport_sas sysimgblt thermal fb_sys_fops fan video tpm_crb tpm_tis tpm_tis_core wmi tpm May 5 00:58:15 piggy kernel: backlight intel_pmc_core acpi_pad acpi_tad button unix [last unloaded: igc] May 5 00:58:15 piggy kernel: ---[ end trace 0000000000000000 ]--- May 5 00:58:15 piggy kernel: RIP: 0010:unraidd+0x1051/0x1140 [md_mod] May 5 00:58:15 piggy kernel: Code: 00 83 3d 83 50 00 00 03 7e 16 41 8b 56 98 89 e9 48 c7 c7 21 c3 3e a0 48 8b 73 20 e8 ce b4 46 e1 41 f6 86 69 ff ff ff 02 75 02 <0f> 0b 48 8b 43 20 49 03 47 10 41 c7 46 b0 00 10 00 00 49 8b 56 10 May 5 00:58:15 piggy kernel: RSP: 0018:ffffc90000f3fdf0 EFLAGS: 00010246 May 5 00:58:15 piggy kernel: RAX: 0000000000000000 RBX: ffff88813e2dee08 RCX: 0000000000000000 May 5 00:58:15 piggy kernel: RDX: 0000000000000000 RSI: ffffffff829e4f00 RDI: ffff8881012ce038 May 5 00:58:15 piggy kernel: RBP: 0000000000000005 R08: 0000000000000000 R09: 0000000000000000 May 5 00:58:15 piggy kernel: 
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88810bccb930 May 5 00:58:15 piggy kernel: R13: ffff88813e2df2c0 R14: ffff88813e2df338 R15: ffff88813e1195d8 May 5 00:58:15 piggy kernel: FS: 0000000000000000(0000) GS:ffff88907f800000(0000) knlGS:0000000000000000 May 5 00:58:15 piggy kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 5 00:58:15 piggy kernel: CR2: 000014663af54000 CR3: 0000000733336006 CR4: 0000000000770ee0 May 5 00:58:15 piggy kernel: PKRU: 55555554 May 5 00:58:15 piggy kernel: ------------[ cut here ]------------ May 5 00:58:15 piggy kernel: WARNING: CPU: 16 PID: 6813 at kernel/exit.c:814 do_exit+0x87/0x923 May 5 00:58:15 piggy kernel: Modules linked in: tun xt_nat xt_tcpudp veth xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype br_netfilter xfs md_mod tcp_diag inet_diag efivarfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables bridge stp llc bonding tls igc atlantic i915 x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm iosf_mbi drm_buddy i2c_algo_bit ttm drm_display_helper drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 drm aesni_intel btusb crypto_simd btrtl cryptd btbcm mei_hdcp mei_pxp btintel i2c_i801 intel_gtt rapl intel_cstate gigabyte_wmi wmi_bmof bluetooth intel_uncore mpt3sas thunderbolt agpgart i2c_smbus nvme mei_me ahci input_leds i2c_core ecdh_generic joydev led_class mei nvme_core libahci raid_class ecc syscopyarea sysfillrect scsi_transport_sas sysimgblt thermal fb_sys_fops fan video tpm_crb tpm_tis tpm_tis_core wmi tpm May 5 00:58:15 piggy kernel: backlight intel_pmc_core acpi_pad acpi_tad button unix [last unloaded: igc] May 5 00:58:15 piggy kernel: CPU: 16 PID: 6813 Comm: unraidd0 Tainted: G D 6.1.27-Unraid #1 May 5 00:58:15 piggy kernel: Hardware name: Gigabyte Technology Co., Ltd. 
Z690 AERO D/Z690 AERO D, BIOS F24a 04/25/2023 May 5 00:58:15 piggy kernel: RIP: 0010:do_exit+0x87/0x923 May 5 00:58:15 piggy kernel: Code: 24 74 04 75 13 b8 01 00 00 00 41 89 6c 24 60 48 c1 e0 22 49 89 44 24 70 4c 89 ef e8 51 40 80 00 48 83 bb 90 07 00 00 00 74 02 <0f> 0b 48 8b bb b8 06 00 00 e8 53 3f 80 00 48 8b 83 b0 06 00 00 83 May 5 00:58:15 piggy kernel: RSP: 0018:ffffc90000f3fee0 EFLAGS: 00010286 May 5 00:58:15 piggy kernel: RAX: 0000000080000000 RBX: ffff88810bd72f40 RCX: 0000000000000000 May 5 00:58:15 piggy kernel: RDX: 0000000000000001 RSI: 0000000000002710 RDI: 00000000ffffffff May 5 00:58:15 piggy kernel: RBP: 000000000000000b R08: 0000000000000000 R09: ffffffff8294b3f0 May 5 00:58:15 piggy kernel: R10: 00003fffffffffff R11: ffff8890bfbc3f6e R12: ffff8881043ee000 May 5 00:58:15 piggy kernel: R13: ffff88813f1e3180 R14: 0000000000000002 R15: ffffffff82069847 May 5 00:58:15 piggy kernel: FS: 0000000000000000(0000) GS:ffff88907f800000(0000) knlGS:0000000000000000 May 5 00:58:15 piggy kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 May 5 00:58:15 piggy kernel: CR2: 000014663af54000 CR3: 0000000733336006 CR4: 0000000000770ee0 May 5 00:58:15 piggy kernel: PKRU: 55555554 May 5 00:58:15 piggy kernel: Call Trace: May 5 00:58:15 piggy kernel: <TASK> May 5 00:58:15 piggy kernel: make_task_dead+0x11c/0x11c May 5 00:58:15 piggy kernel: rewind_stack_and_make_dead+0x17/0x17 May 5 00:58:15 piggy kernel: RIP: 0000:0x0 May 5 00:58:15 piggy kernel: Code: Unable to access opcode bytes at 0xffffffffffffffd6. 
May 5 00:58:15 piggy kernel: RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000 May 5 00:58:15 piggy kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 May 5 00:58:15 piggy kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 May 5 00:58:15 piggy kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 May 5 00:58:15 piggy kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 May 5 00:58:15 piggy kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 May 5 00:58:15 piggy kernel: </TASK> May 5 00:58:15 piggy kernel: ---[ end trace 0000000000000000 ]---
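When a syslog grows large, it can help to tally which sectors the "multiple disk errors" messages report. The following is a minimal sketch that assumes the message format shown in the log above; the function name `error_sectors` is made up for illustration, and the sample text is copied from the first lines of that log.

```python
import re
from collections import Counter

# Message format taken from the syslog excerpt in the post.
PATTERN = re.compile(r"md: recovery thread: multiple disk errors, sector=(\d+)")


def error_sectors(log_text):
    """Count how often each sector appears in 'multiple disk errors' messages."""
    return Counter(int(match.group(1)) for match in PATTERN.finditer(log_text))


sample = (
    "May 5 00:56:42 piggy kernel: md: recovery thread: multiple disk errors, sector=992025512\n"
    "May 5 00:56:42 piggy kernel: md: recovery thread: multiple disk errors, sector=992025512\n"
    "May 5 00:57:58 piggy kernel: md: recovery thread: multiple disk errors, sector=1019867216\n"
    "May 5 00:57:58 piggy kernel: md: recovery thread: multiple disk errors, sector=1019867216\n"
    "May 5 00:58:15 piggy kernel: md: recovery thread: multiple disk errors, sector=1025975768\n"
)

for sector, count in sorted(error_sectors(sample).items()):
    print(sector, count)
```

Because the pattern is searched across the whole text rather than line by line, it also works on logs that were pasted as one long line. The tally shows where errors were reported, not why they occurred.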
  13. I got this error message today. What does it mean?
      Apr 30 02:12:49 piggy kernel: python3[24762]: segfault at 14bf167330 ip 000014bf1642763c sp 00007ffd3b138580 error 4 in libpython3.11.so.1.0[14bf162ee000+223000]
      Apr 30 02:12:49 piggy kernel: Code: a8 00 00 00 01 e9 2b d5 ee ff 0f 1f 40 00 41 54 55 53 48 89 fb 48 83 ec 10 48 8b 57 f0 48 85 d2 0f 84 08 02 00 00 48 8b 7f f8 <4c> 8b 42 08 4c 8d 15 d9 ff ff ff 4c 8b 4b 08 48 83 e7 fc 41 83 e0
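As a partial answer to what the segfault line encodes: on x86, the `error` number is the page-fault error code, a bit field. The sketch below decodes it and computes the offset of the faulting instruction inside the mapped library from the `ip` and base address in the log line. The function names are made up for illustration, and the decoding says what kind of access faulted, not whether the cause was software or hardware.

```python
def decode_page_fault_error(code):
    """Decode the x86 page-fault error code printed in a kernel segfault line."""
    return {
        "protection_violation": bool(code & 0x1),  # False means the page was not present
        "write": bool(code & 0x2),                 # False means the access was a read
        "user_mode": bool(code & 0x4),
        "reserved_bit": bool(code & 0x8),
        "instruction_fetch": bool(code & 0x10),
    }


def offset_in_mapping(ip, base):
    """Offset of the faulting instruction relative to the start of the mapping."""
    return ip - base


# Values copied from the log line: 'error 4', ip, and the libpython mapping base.
print(decode_page_fault_error(4))
print(hex(offset_in_mapping(0x14BF1642763C, 0x14BF162EE000)))
```

For `error 4`, only the user-mode bit is set, so the fault was a user-space read of an address with no page mapped.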
  14. Docker containers randomly soft-crash during a data rebuild. The Unraid GUI works fine, but the CPUs get pegged when the Docker applications soft-crash. It really feels like something else is causing this, but what else can I test? I've reduced the number of wires by moving the case close to the drives. Now, from the motherboard, I have 2x LSI 9217-8i connected directly to the hard drives with 4x SFF-8087 to 4xSATA Forward Breakout 1.6ft cables. I have 9 drives swapped and am working on number 10, but this one keeps stopping around 30-35%, locking up the Docker containers and preventing me from stopping or restarting Docker applications. piggy-diagnostics-20230427-2326.zip
  15. I've reduced from 15x10tb and 15x20tb drives to only 15 total drives, and I am physically swapping 1 drive at a time. This has been working, and when it's complete I will put everything into one case. This will reduce the number of potential failure points by removing extra cards, wires, and power supplies.
  16. The array just stopped again, not sure why. I attached diagnostics to see if you can help me. piggy-diagnostics-20230406-2201.zip
  17. I don't want to start the data rebuild until I have something I can change and test, because it does not complete and the array stops somewhere between 1hr and 20hrs into the rebuild. The Dashboard section shows errors during a rebuild, and the Main section shows errors on multiple drives. After a reboot those numbers are reset, and I don't see errors in the logs for those disks. (I attached images of these sections; they show 0 because it's after a restart and I paused the rebuild immediately upon reboot.) What do you suggest I try before I attempt another rebuild?
  18. GOAL: move the Unraid USB, disk drives, and cache drive to a completely new setup. Then upgrade all the disk drives to 20tb shucked WD drives. Sell the 10tb drives online to recoup some costs.

      Moving to the new setup went fine. No issues: parity was good on the old setup and on the new setup, and everything started up as before. I had already precleared all 14x 20tb drives before bringing them into this setup; again, no issues, all passed. Upgrading the disk drives is where I'm getting issues.

      I have 14x 10tb drives in the primary DAS, connected as follows:
        • 4x SFF-8087 to 4xSATA Forward Breakout 1.6ft
        • to a LSI IBM 03X3834 16 Port PCI-e SAS Expander
        • to 2xMini SAS SFF-8087 to SFF-8087 Cable, 100-Ohms, 1.6ft
        • to Dual Ports Mini SAS SFF-8088 to SAS 36Pin SFF-8087 PCBA Female Adapter with PCI Bracket
        • to 2xMini-SAS SFF-8088 to SFF-8088 Molex 2 Meter Cable
        • to a LSI SAS9200-16e 16-Port External HBA Full-Height PCIe P20 IT Mode
        • connected to the first PCIe slot on my motherboard

      I have 14x 20tb drives in the secondary DAS, connected as follows:
        • 4x SFF-8087 to 4xSATA Forward Breakout 1.6ft
        • to a LSI IBM 03X3834 16 Port PCI-e SAS Expander
        • to 2xMini SAS SFF-8087 to SFF-8087 Cable, 100-Ohms, 1.6ft
        • to Dual Ports Mini SAS SFF-8088 to SAS 36Pin SFF-8087 PCBA Female Adapter with PCI Bracket
        • to 2xMini-SAS SFF-8088 to SFF-8088 Molex 2 Meter Cable
        • to a LSI SAS9200-16e 16-Port External HBA Full-Height PCIe P20 IT Mode
        • connected to the second PCIe slot on my motherboard

      Procedure to swap a drive:
        • Prevent any unnecessary disk-usage programs from running (Mover, parity check, Unmanic, binhex-backup)
        • Stop the array
        • Swap out one of the 10tb drives for a 20tb drive
        • Start the array
        • Rebuild starts automatically
        • Once that drive completes, repeat until all the drives are upgraded

      Issues encountered:
      All the drives in the current array are spun up, and the transfer speeds are initially low (5-20MB/s per disk). Later they climb to 205MB/s per disk. Then the disk speeds drop back into the 5-20MB/s range, and sometimes the CPU gets pegged at 100% for a few minutes, then returns to normal usage (<10%), and the disk speeds increase again. After some amount of time I come back to check on the progress, and the array has stopped, multiple drives are showing errors on the Main screen, multiple drives are showing elevated 199 UDMA CRC error counts, the uptime is less than the time since I started the drive swap, and of course there are many texts from my friends and family saying that they cannot access Plex.

      What I've tried already:
        • Reseated all cables
        • Replaced all cables with new ones
        • Verified that the FW versions are correct for the HBAs

      I've attached diagnostics, as I cannot figure out what's going on. piggy-diagnostics-20230316-1958.zip
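Since the post mentions elevated 199 UDMA CRC error counts, one way to track them is to record the raw value of SMART attribute 199 for each disk before and after a rebuild and compare. The sketch below assumes the usual column layout of `smartctl -A` attribute tables (ID in the first column, raw value in the tenth); the sample row and the disk names are invented for illustration and are not taken from the attached diagnostics.

```python
def udma_crc_count(smartctl_output):
    """Return the raw value of SMART attribute 199, or None if it is absent."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with the numeric ID; the raw value is column 10.
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    return None


def crc_increase(before, after):
    """Per-disk increase in CRC error count between two snapshots."""
    return {disk: after[disk] - before.get(disk, 0) for disk in after}


# Invented sample row in the assumed `smartctl -A` layout.
sample = (
    "ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE\n"
    "199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       37\n"
)
print(udma_crc_count(sample))
print(crc_increase({"disk1": 30, "disk2": 0}, {"disk1": 37, "disk2": 0}))
```

A count that rises during a rebuild points at the link between drive and controller (cabling, connectors, or signal quality) rather than at the platters, though it does not say which part of that link is at fault.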
  19. I don't know what to make of it either, and agree some other issue (gremlin) lurks, but I am glad it's finally working.
  20. I tried this method to enable QuickSync hardware transcoding, but mapping the device /dev/dri to the Plex container didn't work. I installed the GPUStatistics plugin to see what it might show, and instead of showing the iGPU, it showed the P2000, which I know it didn't before; it was blank the last time I tried this plugin. I then tried the 'nvidia-smi' command and actually got a response. I then set up the Plex container again with all the NVIDIA Extra Parameters and Variables and applied them to the container. I tried to force transcoding of a 4k file to 1080p and it actually worked. I can transcode now with the P2000. 🗽

      Here is a list of things I did, in case one of them makes a difference:
        • Removed all NVIDIA parameters from the Plex container, Apply changes
        • Edited /flash/config/go to include modprobe i915 && chmod -R 777 /dev/dri
        • Rebooted Unraid to apply those changes
        • Updated the Plex container to use a new Device /dev/dri:/dev/dri, Apply changes
        • Tried to force a transcode with UHD 630 QuickSync... didn't work.
        • Installed the CA App for Intel-GPU-TOP, opened the WebUI to see intel-gpu-top running
        • Restarted the Plex container
        • Tried to force a transcode with UHD 630 QuickSync... didn't work.
        • Installed the CA App for GPUStatistics
        • Went to the Dashboard to view the GPU stats, and it showed the P2000 stats
        • Edited the Plex container, removed Device /dev/dri, added Extra Parameters, added Variables NVIDIA_DRIVER_CAPABILITIES and NVIDIA_VISIBLE_DEVICES, Apply changes
        • Tried to force a transcode with the P2000... now it works.

      Still puzzled.
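Before mapping /dev/dri into a container, it can be worth confirming on the host that the driver actually created device nodes there. The following is a small sketch of such a check; the function name `dri_nodes` is made up for illustration, and it relies only on the standard Linux naming of DRM nodes (`card*` and `renderD*`).

```python
import os


def dri_nodes(dri_dir="/dev/dri"):
    """List DRM device nodes (card*, renderD*) in dri_dir, or [] if it is missing."""
    if not os.path.isdir(dri_dir):
        return []
    return sorted(
        name for name in os.listdir(dri_dir)
        if name.startswith(("card", "renderD"))
    )


nodes = dri_nodes()
if nodes:
    print("DRM nodes found:", ", ".join(nodes))
else:
    print("No DRM nodes found; the GPU driver may not be loaded.")
```

Finding a node only shows that a GPU driver is loaded; it does not show which GPU the node belongs to or that Plex can use it for transcoding.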
  21. Yes, I can pull it from that system to test in another Unraid setup. It will take a few days since the 2 servers are in 2 locations. During that time I'll also try to get QuickSync to transcode, because it's been hammering the CPU to do both video and audio transcodes. Would be nice to offload some of that. Even if I can verify that the P2000 works in the other system, it's still giving me a headache in the current setup. Might just sell it, as long as QuickSync UHD 630 can handle the load.