JorgeB

Everything posted by JorgeB

  1. Enable the syslog server and post that after a crash to see if there's something visible.
  2. Post the screenshot from the test.
  3. What do you mean by this? That is, what do you observe? Also how are you accessing the server, using the 10GbE IP or did you add the server name to the hosts file?
  4. Run the diskspeed docker; it tests each disk individually and also the controller bandwidth with all disks together.
  5. There have been other reports of spin down issues with some Seagate drives, possibly due to some kernel change.
  6. Problems are likely due to bad SATA cables, bad power cables/splitter or bad PSU.
  7. Basically yes: copy the data (services must be stopped for the copy), then assign it as cache and leave the other unassigned.
  8. This suggests that there's still some hardware issue.
  9. Parity cannot be fixed while disk10 keeps giving read errors. What do you mean by this? Parity cannot be synced to save a failing disk. Also, disk9 looks healthy; its errors might have the same root cause as disk10's.
  10. Based on the log snippet you've posted, it looks like the device dropped offline, but without the rest of the log I cannot say for sure. If it did drop, the following can sometimes help. Some NVMe devices have issues with power states on Linux. On the main GUI page click on the flash drive, scroll down to "Syslinux Configuration", make sure it's set to "menu view" (top right), and add this to your default boot option, after "append initrd=/bzroot":
        nvme_core.default_ps_max_latency_us=0 pcie_aspm=off
      e.g.:
        append initrd=/bzroot nvme_core.default_ps_max_latency_us=0 pcie_aspm=off
      Reboot and see if it makes a difference.
  11. Nope, you just replace disk1 with a 6TB disk. You could cancel the first rebuild to the old parity, replace it with a 6TB disk, start the rebuild and cancel after a minute; that would be enough to see if xfs_repair now works, and if it does you can then complete the rebuild.
  12. If it happens again grab at least the syslog, to see the beginning of the error.
  13. If the CRC errors keep increasing it would be better to replace it, more as a performance concern than reliability for now.
  14. Because of this: XFS thinks the partition is larger and that's why xfs_repair doesn't work, but as mentioned it *might* work with a larger disk; it's not something I can test.
  15. It would be an improvement, I've made a request to see if it can be changed.
  16. It's not a problem mounting an array disk in UD. Of course, if the disk really is failing and the data fails to copy due to read errors, there won't be much you can do.
  17. It's logged as a disk problem, connect that disk to the onboard SATA using different cables and try again.
  18. It works but it will have the same issue as the current disk, since it's the same size.
  19. Problem appears to start with a Nvidia driver crash:
      Oct 20 05:00:18 Monolith kernel: general protection fault, probably for non-canonical address 0x841f0f2e66c3f3: 0000 [#1] PREEMPT SMP NOPTI
      Oct 20 05:00:18 Monolith kernel: CPU: 2 PID: 23970 Comm: PMS Timer Tainted: P O 5.19.14-Unraid #1
      Oct 20 05:00:18 Monolith kernel: Hardware name: Gigabyte Technology Co., Ltd. Z390 AORUS PRO WIFI/Z390 AORUS PRO WIFI-CF, BIOS F12 11/05/2021
      Oct 20 05:00:18 Monolith kernel: RIP: 0010:_nv027480rm+0x12/0x20 [nvidia]
      Oct 20 05:00:18 Monolith kernel: Code: f6 b8 01 00 00 00 89 d1 d3 e0 09 04 b7 c3 66 2e 0f 1f 84 00 00 00 00 00 48 8b bf 80 01 00 00 89 f6 b8 fe ff ff ff 89 d1 d3 c0 <21> 04 b7 c3 66 2e 0f 1f 84 00 00 00 00 00 48 8b bf 80 01 00 00 89
      Oct 20 05:00:18 Monolith kernel: RSP: 0018:ffffc900037df8d0 EFLAGS: 00010246
      Oct 20 05:00:18 Monolith kernel: RAX: 00000000fffffffe RBX: ffff8881ab1b0008 RCX: 0000000000000000
      Oct 20 05:00:18 Monolith kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00841f0f2e66c3f3
      Oct 20 05:00:18 Monolith kernel: RBP: ffff88814fd45b90 R08: 0000000000000020 R09: ffff88814fd45b78
      Oct 20 05:00:18 Monolith kernel: R10: ffff8881ab1b0008 R11: 00000001010f0000 R12: ffff8881ab1b0008
      Oct 20 05:00:18 Monolith kernel: R13: ffff8881ab1b0008 R14: 0000000000000000 R15: 0000000000000001
      Oct 20 05:00:18 Monolith kernel: FS: 0000000000000000(0000) GS:ffff88889dc80000(0000) knlGS:0000000000000000
      Oct 20 05:00:18 Monolith kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      Oct 20 05:00:18 Monolith kernel: CR2: 000014e33580ff1c CR3: 000000000200a006 CR4: 00000000003706e0
      Oct 20 05:00:18 Monolith kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      Oct 20 05:00:18 Monolith kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      Oct 20 05:00:18 Monolith kernel: Call Trace:
      Oct 20 05:00:18 Monolith kernel: <TASK>
      Oct 20 05:00:18 Monolith kernel: ? _nv019327rm+0x8b/0x100 [nvidia]
      Oct 20 05:00:18 Monolith kernel: ? _nv019776rm+0x340/0x7f0 [nvidia]
      Oct 20 05:00:18 Monolith kernel: ? _nv019777rm+0xda/0x1e0 [nvidia]
      Oct 20 05:00:18 Monolith kernel: ? _nv019777rm+0xb8/0x1e0 [nvidia]
      Oct 20 05:00:18 Monolith kernel: ? _nv013890rm+0x438/0x920 [nvidia]
      Oct 20 05:00:18 Monolith kernel: ? _nv013890rm+0x418/0x920 [nvidia]
      Oct 20 05:00:18 Monolith kernel: ? _nv013878rm+0x78/0xf0 [nvidia]
      Oct 20 05:00:18 Monolith kernel: ? _nv014058rm+0x196/0x7f0 [nvidia]
      Oct 20 05:00:18 Monolith kernel: ? _nv035922rm+0xac/0xe0 [nvidia]
      Oct 20 05:00:18 Monolith kernel: ? _nv037394rm+0xac/0x140 [nvidia]
      Oct 20 05:00:18 Monolith kernel: ? _nv037393rm+0x2f7/0x4d0 [nvidia]
      Oct 20 05:00:18 Monolith kernel: ? _nv035835rm+0xbe/0x140 [nvidia]
      Oct 20 05:00:18 Monolith kernel: ? _nv035836rm+0x42/0x70 [nvidia]
      Oct 20 05:00:18 Monolith kernel: ? _nv031339rm+0x113/0x230 [nvidia]
      Oct 20 05:00:18 Monolith kernel: ? rm_gpu_ops_channel_destroy+0x1c/0x60 [nvidia]
      Oct 20 05:00:18 Monolith kernel: ? nvUvmGetSafeStack+0x42/0x92 [nvidia]
      Oct 20 05:00:18 Monolith kernel: ? nvUvmInterfaceChannelDestroy+0x1e/0x29 [nvidia]
      Oct 20 05:00:18 Monolith kernel: ? channel_destroy+0x89/0xc0 [nvidia_uvm]
      Oct 20 05:00:18 Monolith kernel: ? uvm_channel_manager_destroy.part.0+0x4c/0xa0 [nvidia_uvm]
      Oct 20 05:00:18 Monolith kernel: ? remove_gpu+0x183/0x3a0 [nvidia_uvm]
      Oct 20 05:00:18 Monolith kernel: ? uvm_gpu_release_locked+0x1f/0x40 [nvidia_uvm]
      Oct 20 05:00:18 Monolith kernel: ? uvm_va_space_destroy+0x45f/0x4c0 [nvidia_uvm]
      Oct 20 05:00:18 Monolith kernel: ? uvm_release.constprop.0+0x3d/0xa0 [nvidia_uvm]
      Oct 20 05:00:18 Monolith kernel: ? uvm_release_entry.part.0.isra.0+0x7a/0xb0 [nvidia_uvm]
      Oct 20 05:00:18 Monolith kernel: ? __fput+0x101/0x1c8
      Oct 20 05:00:18 Monolith kernel: ? task_work_run+0x66/0x7e
      Oct 20 05:00:18 Monolith kernel: ? do_exit+0x39f/0x8e5
      Oct 20 05:00:18 Monolith kernel: ? _raw_spin_unlock_irqrestore+0x24/0x3a
      Oct 20 05:00:18 Monolith kernel: ? do_group_exit+0x8f/0x8f
      Oct 20 05:00:18 Monolith kernel: ? get_signal+0x606/0x63e
      Oct 20 05:00:18 Monolith kernel: ? hrtimer_init_sleeper+0x41/0x41
      Oct 20 05:00:18 Monolith kernel: ? arch_do_signal_or_restart+0x36/0x607
      Oct 20 05:00:18 Monolith kernel: ? do_futex+0xcd/0x143
      Oct 20 05:00:18 Monolith kernel: ? exit_to_user_mode_prepare+0x58/0x10d
      Oct 20 05:00:18 Monolith kernel: ? syscall_exit_to_user_mode+0x18/0x27
      Oct 20 05:00:18 Monolith kernel: ? do_syscall_64+0x77/0x81
      Oct 20 05:00:18 Monolith kernel: ? entry_SYSCALL_64_after_hwframe+0x63/0xcd
      Oct 20 05:00:18 Monolith kernel: </TASK>
      Oct 20 05:00:18 Monolith kernel: Modules linked in: xt_mark xt_nat veth tcp_diag udp_diag inet_diag nvidia_uvm(PO) xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs input_leds led_class joydev r8152 mii md_mod ip6table_filter ip6_tables iptable_filter ip_tables x_tables bridge stp llc ipv6 nvidia_drm(PO) nvidia_modeset(PO) x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm nvidia(PO) crct10dif_pclmul crc32_pclmul gigabyte_wmi wmi_bmof intel_wmi_thunderbolt mxm_wmi crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd cryptd rapl drm_kms_helper intel_cstate intel_uncore nvme i2c_i801 e1000e drm thunderbolt nvme_core i2c_smbus ahci i2c_core libahci syscopyarea sysfillrect sysimgblt fb_sys_fops intel_pch_thermal thermal fan tpm_crb tpm_tis video
      Oct 20 05:00:18 Monolith kernel: tpm_tis_core wmi backlight tpm acpi_pad button unix
      Oct 20 05:00:18 Monolith kernel: ---[ end trace 0000000000000000 ]---
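For anyone who wants to pull the start of a crash out of a long syslog like the one in item 19, here is a minimal Python sketch. It assumes the fault line is followed, a few lines later, by a RIP line in the format seen in that log. The names find_crashes, RIP_RE, FAULT_MARKERS and SAMPLE_LOG are made up for illustration, and the marker list is an assumption, not a complete set of kernel fault messages.

```python
import re

# Matches e.g. "RIP: 0010:_nv027480rm+0x12/0x20 [nvidia]".
# The trailing "[module]" tag is optional (absent for built-in kernel code).
RIP_RE = re.compile(
    r"RIP: [0-9a-f]{4}:(?P<symbol>\S+?)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?"
)

# Assumed, non-exhaustive list of strings that open a kernel crash report.
FAULT_MARKERS = ("general protection fault", "Oops:", "BUG:", "Kernel panic")


def find_crashes(lines):
    """Yield (fault_line, symbol, module) for each fault followed by a RIP line."""
    pending = None
    for line in lines:
        if any(marker in line for marker in FAULT_MARKERS):
            pending = line.strip()
            continue
        if pending is not None:
            match = RIP_RE.search(line)
            if match:
                # No module tag means the faulting code is in the kernel itself.
                yield pending, match.group("symbol"), match.group("module") or "kernel"
                pending = None


# A few lines copied from the log in item 19.
SAMPLE_LOG = [
    "Oct 20 05:00:18 Monolith kernel: general protection fault, probably for non-canonical address 0x841f0f2e66c3f3: 0000 [#1] PREEMPT SMP NOPTI",
    "Oct 20 05:00:18 Monolith kernel: CPU: 2 PID: 23970 Comm: PMS Timer Tainted: P O 5.19.14-Unraid #1",
    "Oct 20 05:00:18 Monolith kernel: RIP: 0010:_nv027480rm+0x12/0x20 [nvidia]",
    "Oct 20 05:00:18 Monolith kernel: ? _nv019327rm+0x8b/0x100 [nvidia]",
]

if __name__ == "__main__":
    for fault, symbol, module in find_crashes(SAMPLE_LOG):
        print(f"{module}: {symbol}")
```

On the sample lines this reports the nvidia module, which matches the reading in item 19. To use it on a real log, pass an open file object instead of SAMPLE_LOG.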
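The boot option change from item 10 can also be sketched as a small text transformation, in case someone prefers to script it rather than use the GUI. This is only a sketch under assumptions: it treats the default boot option as the label block containing "menu default", expects that line to come before the block's "append" line, and works on a string rather than touching any file. SAMPLE_CFG is a simplified, hypothetical config, and add_boot_params is a made-up helper name; only the two kernel parameters come from the post.

```python
PARAMS = ["nvme_core.default_ps_max_latency_us=0", "pcie_aspm=off"]

# Simplified, hypothetical Syslinux configuration for illustration only.
SAMPLE_CFG = "\n".join([
    "default menu.c32",
    "label Unraid OS",
    "  menu default",
    "  kernel /bzimage",
    "  append initrd=/bzroot",
    "label Unraid OS GUI Mode",
    "  kernel /bzimage",
    "  append initrd=/bzroot,/bzroot-gui",
])


def add_boot_params(config_text, params):
    """Return config_text with params added to the default label's append line.

    Parameters already present on that line are not added again, so the
    function can be run repeatedly without duplicating anything.
    """
    lines = config_text.splitlines()
    in_default = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        lowered = stripped.lower()
        if lowered.startswith("label "):
            in_default = False  # a new label block starts
        elif lowered == "menu default":
            in_default = True  # this block is the default boot option
        elif in_default and lowered.startswith("append "):
            present = stripped.split()
            missing = [p for p in params if p not in present]
            if missing:
                lines[i] = line.rstrip() + " " + " ".join(missing)
            break
    return "\n".join(lines)


if __name__ == "__main__":
    print(add_boot_params(SAMPLE_CFG, PARAMS))
```

Only the default entry is changed; the GUI Mode entry in the sample is left alone. A reboot is still needed for the parameters to take effect, as in the manual steps.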