September 28, 2024 Hello, I've come to ask for help because I'm having trouble rebuilding parity: the process systematically stops partway through. I was trying to reduce the size of the array and followed this video, as I had done before: https://www.youtube.com/watch?v=nV5snitWrBk But when I was about to remove the disk, the "remove disk" script did not appear to have finished, yet there was no more activity on the disks. So I decided to go ahead, remove the disk and run a parity check. At some point the writes seemed to stop (as with the erasing), the estimated rebuild/check time kept increasing, and nothing more happened. I've tried several times to restart the process by recreating the array with Tools/New Config, but the result is the same: it doesn't work. I even lost web access to Unraid, which meant I couldn't see the status of the reconstruction. I don't understand why the rebuild stops. I don't have any errors on the disks, and after rebooting they're all there. The Adaptec 72405 card doesn't seem to have any problems. How should I proceed?
This morning I noticed some entries in the logs that look suspicious, but I don't understand them:

Sep 28 01:08:58 Tower kernel: R13: ffff88815c2c0fb8 R14: 0000000000000007 R15: 0000000000000009
Sep 28 01:08:58 Tower kernel: FS: 0000000000000000(0000) GS:ffff888ffeb00000(0000) knlGS:0000000000000000
Sep 28 01:08:58 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 28 01:08:58 Tower kernel: CR2: 00000000005b8110 CR3: 000000000220a000 CR4: 00000000003506e0
Sep 28 01:08:58 Tower kernel: ------------[ cut here ]------------
Sep 28 01:08:58 Tower kernel: WARNING: CPU: 12 PID: 22540 at kernel/exit.c:814 do_exit+0x87/0x923
Sep 28 01:08:58 Tower kernel: Modules linked in: md_mod nfsd auth_rpcgss oid_registry lockd grace sunrpc zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) iptable_nat xt_MASQUERADE nf_nat xt_connmark nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_mark iptable_mangle xt_comment xt_addrtype iptable_raw wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha ip6table_filter ip6_tables iptable_filter ip_tables x_tables bridge stp llc bonding tls r8169 realtek edac_mce_amd intel_rapl_msr edac_core intel_rapl_common iosf_mbi kvm_amd video drm_kms_helper kvm drm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 sha256_ssse3 wmi_bmof sha1_ssse3 aesni_intel nvme backlight crypto_simd i2c_piix4 cryptd ahci syscopyarea input_leds sysfillrect sysimgblt aacraid rapl i2c_core k10temp ccp nvme_core fb_sys_fops joydev led_class libahci wmi tpm_crb tpm_tis tpm_tis_core tpm button
Sep 28 01:08:58 Tower kernel: acpi_cpufreq unix [last unloaded: md_mod]
Sep 28 01:08:58 Tower kernel: CPU: 12 PID: 22540 Comm: unraidd0 Tainted: P S B D O 6.1.79-Unraid #1
Sep 28 01:08:58 Tower kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450M Pro4 R2.0, BIOS P5.70 10/20/2022
Sep 28 01:08:58 Tower kernel: RIP: 0010:do_exit+0x87/0x923
Sep 28 01:08:58 Tower kernel: Code: 24 74 04 75 13 b8 01 00 00 00 41 89 6c 24 60 48 c1 e0 22 49 89 44 24 70 4c 89 ef e8 c9 fb 80 00 48 83 bb b0 07 00 00 00 74 02 <0f> 0b 48 8b bb d8 06 00 00 e8 cb fa 80 00 48 8b 83 d0 06 00 00 83
Sep 28 01:08:58 Tower kernel: RSP: 0018:ffffc90015917ee0 EFLAGS: 00010286
Sep 28 01:08:58 Tower kernel: RAX: 0000000080000000 RBX: ffff88815bfdcec0 RCX: 0000000000000000
Sep 28 01:08:58 Tower kernel: RDX: 0000000000000001 RSI: 0000000000002710 RDI: 00000000ffffffff
Sep 28 01:08:58 Tower kernel: RBP: 000000000000000b R08: 0000000000000000 R09: 3030303030303020
Sep 28 01:08:58 Tower kernel: R10: 3a34524320303030 R11: 6130323230303030 R12: ffff8881058f7000
Sep 28 01:08:58 Tower kernel: R13: ffff8881027a18c0 R14: 0000000000000000 R15: 0000000000000000
Sep 28 01:08:58 Tower kernel: FS: 0000000000000000(0000) GS:ffff888ffeb00000(0000) knlGS:0000000000000000
Sep 28 01:08:58 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 28 01:08:58 Tower kernel: CR2: 00000000005b8110 CR3: 000000000220a000 CR4: 00000000003506e0
Sep 28 01:08:58 Tower kernel: Call Trace:
Sep 28 01:08:58 Tower kernel: <TASK>
Sep 28 01:08:58 Tower kernel: ? __warn+0xab/0x122
Sep 28 01:08:58 Tower kernel: ? report_bug+0x109/0x17e
Sep 28 01:08:58 Tower kernel: ? do_exit+0x87/0x923
Sep 28 01:08:58 Tower kernel: ? handle_bug+0x41/0x6f
Sep 28 01:08:58 Tower kernel: ? exc_invalid_op+0x13/0x60
Sep 28 01:08:58 Tower kernel: ? asm_exc_invalid_op+0x16/0x20
Sep 28 01:08:58 Tower kernel: ? do_exit+0x87/0x923
Sep 28 01:08:58 Tower kernel: make_task_dead+0x11c/0x11c
Sep 28 01:08:58 Tower kernel: rewind_stack_and_make_dead+0x17/0x17
Sep 28 01:08:58 Tower kernel: RIP: 0000:0x0
Sep 28 01:08:58 Tower kernel: Code: Unable to access opcode bytes at 0xffffffffffffffd6.
Sep 28 01:08:58 Tower kernel: RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000
Sep 28 01:08:58 Tower kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Sep 28 01:08:58 Tower kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Sep 28 01:08:58 Tower kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Sep 28 01:08:58 Tower kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Sep 28 01:08:58 Tower kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Sep 28 01:08:58 Tower kernel: </TASK>
Sep 28 01:08:58 Tower kernel: ---[ end trace 0000000000000000 ]---

Just to clarify, the server's motherboard was changed last week because the old one seemed to be dead. The new board ran the server correctly for a week. I'm able to set the server up in another installation without using the Adaptec 72405 card. Is this worth testing? Thank you for your help. Edited September 28, 2024 by Xili
September 28, 2024 Author 2 hours ago, JorgeB said: Please post the diagnostics. tower-diagnostics-20240928-1214.zip
September 28, 2024 Community Expert The Unraid driver is crashing. This is almost always a hardware issue, and Ryzen with overclocked RAM, like you have, is known to have issues. Start by correcting that, see here.
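For anyone who wants to check their own syslog for the same symptom, here is a minimal sketch that pulls out kernel WARNING entries and the task they belong to. It assumes the line layout seen in the trace quoted in the first post (`WARNING: CPU: … PID: … at …` followed later by `CPU: … PID: … Comm: …`); the function name `find_kernel_warnings` is made up for illustration and is not part of Unraid or any library.

```python
import re

# Layout assumed from the trace quoted in the first post, e.g.
# "... kernel: WARNING: CPU: 12 PID: 22540 at kernel/exit.c:814 do_exit+0x87/0x923"
WARNING_RE = re.compile(r"kernel: WARNING: CPU: (\d+) PID: (\d+) at (\S+) (\S+)")
# e.g. "... kernel: CPU: 12 PID: 22540 Comm: unraidd0 Tainted: P S B D O 6.1.79-Unraid #1"
COMM_RE = re.compile(r"kernel: CPU: \d+ PID: (\d+) Comm: (\S+)")


def find_kernel_warnings(lines):
    """Return one dict per kernel WARNING line.

    The 'comm' field is filled in from the first later 'Comm:' line that
    carries the same PID; it stays None if no such line is found.
    """
    warnings = []
    for line in lines:
        m = WARNING_RE.search(line)
        if m:
            warnings.append({
                "cpu": int(m.group(1)),
                "pid": int(m.group(2)),
                "location": m.group(3),
                "function": m.group(4),
                "comm": None,
            })
            continue
        m = COMM_RE.search(line)
        if m:
            pid = int(m.group(1))
            # Attach the task name to the most recent warning with this PID.
            for w in reversed(warnings):
                if w["pid"] == pid and w["comm"] is None:
                    w["comm"] = m.group(2)
                    break
    return warnings


# Three lines copied from the trace in the first post.
sample = [
    "Sep 28 01:08:58 Tower kernel: WARNING: CPU: 12 PID: 22540 at kernel/exit.c:814 do_exit+0x87/0x923",
    "Sep 28 01:08:58 Tower kernel: acpi_cpufreq unix [last unloaded: md_mod]",
    "Sep 28 01:08:58 Tower kernel: CPU: 12 PID: 22540 Comm: unraidd0 Tainted: P S B D O 6.1.79-Unraid #1",
]
for w in find_kernel_warnings(sample):
    print(w)
```

To run it against a real log, pass an open file object instead of `sample`. A warning whose task name starts with `unraidd` would match the crash described above, but this only locates the entries; it does not say anything about the underlying cause.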
September 28, 2024 Author 45 minutes ago, JorgeB said: Unraid driver is crashing, this is almost always a hardware issue, and Ryzen with overclocked RAM like you have is known to have issues, start by correcting that, see here. Thank you, this problem sounds familiar; I may have run into it before. Before restarting the parity sync, could you confirm that I can change the order of the HDDs? Since parity is no longer valid, that shouldn't be a problem (it's for organizational reasons on my side). Edit: new diagnostics after the RAM frequency change: tower-diagnostics-20240928-1417.zip Edited September 28, 2024 by Xili files
September 28, 2024 Community Expert Since parity is not yet synced, you can change the disks; just do a new config.
October 1, 2024 Author Hello, thanks for the help. The problem must have come from there: the reconstruction was successful.