Some background info:
N5105 board with early microcode in the BIOS. It worked fine with TrueNAS SCALE, except that VMs on TrueNAS crashed randomly, which is a known issue with the early microcode; see https://forums.servethehome.com/index.php?threads/jasper-lake-proxmox-kvm-qemu-vm-guest-stability.38824/. So I stopped using VMs on TrueNAS. I ran a memory test last night and everything passed, so I assume the hardware is OK.
Two days ago I tried to set up Unraid on this machine. Everything went fine except that the ZFS volumes were not recognized, so I had to reformat them, but I have a backup. When I mounted the USB hard drive that contains my backup and used the native copy function in Unraid, an error occurred. I did not download the log at that time, but I clearly remember it was something like *process*tainted*call*trace ...., and both "rsync" and "z_wr_iss" showed such a "tainted" error.
Today I used another PC and connected to the SMB share of the ZFS pool. After a few minutes of copying, similar errors emerged:
May 17 11:05:18 Tower kernel: general protection fault, probably for non-canonical address 0xfffb8883e4dbc650: 0000 [#1] PREEMPT SMP NOPTI
May 17 11:05:18 Tower kernel: CPU: 1 PID: 2113 Comm: z_wr_iss Tainted: P O 6.1.27-Unraid #1
May 17 11:05:18 Tower kernel: Hardware name: UGREEN DX4600/To be filled by O.E.M, BIOS 5.19 06/16/2022
May 17 11:05:18 Tower kernel: RIP: 0010:metaslab_alloc_dva+0xdf9/0xfce [zfs]
May 17 11:05:18 Tower kernel: Code: 03 4d 58 48 01 d8 48 39 c8 73 bc 48 8b 44 24 58 bf ff ff ff 00 49 c1 ec 09 48 8b 74 24 50 48 c1 e7 20 48 8b 5c 24 18 48 01 c6 <48> 8b 0e 48 89 c8 48 c1 e8 20 48 33 03 48 c1 e0 20 48 21 f8 8b bc
May 17 11:05:18 Tower kernel: RSP: 0018:ffffc90001b47ba8 EFLAGS: 00010286
May 17 11:05:18 Tower kernel: RAX: 0000000000000000 RBX: ffff8881652d4000 RCX: 0000000000300000
May 17 11:05:18 Tower kernel: RDX: 0000000000000000 RSI: fffb8883e4dbc650 RDI: 00ffffff00000000
May 17 11:05:18 Tower kernel: RBP: ffff888102badc00 R08: 0000000000000000 R09: ffffffffa0daebbe
May 17 11:05:18 Tower kernel: R10: ffff8884cc768000 R11: 0000000000000002 R12: 00000001ae20b7d0
May 17 11:05:18 Tower kernel: R13: ffff888101401400 R14: ffff888102badc00 R15: 0000000000000001
May 17 11:05:18 Tower kernel: FS: 0000000000000000(0000) GS:ffff888c4fe80000(0000) knlGS:0000000000000000
May 17 11:05:18 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 17 11:05:18 Tower kernel: CR2: 000000c00027c000 CR3: 0000000285fc2000 CR4: 0000000000350ee0
May 17 11:05:18 Tower kernel: Call Trace:
May 17 11:05:18 Tower kernel: <TASK>
May 17 11:05:18 Tower kernel: ? preempt_latency_start+0x2b/0x46
May 17 11:05:18 Tower kernel: metaslab_alloc+0x107/0x1fd [zfs]
May 17 11:05:18 Tower kernel: zio_dva_allocate+0xee/0x73f [zfs]
May 17 11:05:18 Tower kernel: ? preempt_latency_start+0x2b/0x46
May 17 11:05:18 Tower kernel: ? _raw_spin_lock+0x13/0x1c
May 17 11:05:18 Tower kernel: ? _raw_spin_unlock+0x14/0x29
May 17 11:05:18 Tower kernel: ? zio_wait_for_children+0xa9/0xb7 [zfs]
May 17 11:05:18 Tower kernel: ? preempt_latency_start+0x2b/0x46
May 17 11:05:18 Tower kernel: ? _raw_spin_lock+0x13/0x1c
May 17 11:05:18 Tower kernel: ? _raw_spin_unlock+0x14/0x29
May 17 11:05:18 Tower kernel: ? tsd_hash_search+0x70/0x7d [spl]
May 17 11:05:18 Tower kernel: zio_execute+0xb1/0xdf [zfs]
May 17 11:05:18 Tower kernel: taskq_thread+0x266/0x38a [spl]
May 17 11:05:18 Tower kernel: ? wake_up_q+0x44/0x44
May 17 11:05:18 Tower kernel: ? zio_subblock+0x22/0x22 [zfs]
May 17 11:05:18 Tower kernel: ? taskq_dispatch_delay+0x106/0x106 [spl]
May 17 11:05:18 Tower kernel: kthread+0xe4/0xef
May 17 11:05:18 Tower kernel: ? kthread_complete_and_exit+0x1b/0x1b
May 17 11:05:18 Tower kernel: ret_from_fork+0x1f/0x30
May 17 11:05:18 Tower kernel: </TASK>
May 17 11:05:18 Tower kernel: Modules linked in: xt_CHECKSUM ipt_REJECT nf_reject_ipv4 ip6table_mangle iptable_mangle vhost_net vhost vhost_iotlb tap ipvlan xt_nat xt_tcpudp veth xt_conntrack nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype br_netfilter xfs xt_MASQUERADE xt_mark iptable_nat nfsd auth_rpcgss oid_registry lockd grace sunrpc ip6table_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tun md_mod tcp_diag inet_diag nct6775_core hwmon_vid efivarfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet 8021q garp mrp bridge stp llc bonding tls igc zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) i915 mei_pxp mei_hdcp wmi_bmof x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel drm_buddy i2c_algo_bit ttm drm_display_helper kvm drm_kms_helper drm crct10dif_pclmul intel_gtt crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 aesni_intel agpgart mei_me i2c_i801 crypto_simd cryptd intel_cstate tpm_crb nvme i2c_smbus
May 17 11:05:18 Tower kernel: nvme_core mei processor_thermal_device_pci_legacy i2c_core processor_thermal_device processor_thermal_rfim processor_thermal_mbox int340x_thermal_zone intel_soc_dts_iosf tpm_tis syscopyarea ahci sysfillrect sysimgblt libahci iosf_mbi fb_sys_fops thermal tpm_tis_core video fan wmi backlight tpm acpi_pad acpi_tad intel_pmc_core button unix [last unloaded: igc]
May 17 11:05:18 Tower kernel: ---[ end trace 0000000000000000 ]---
May 17 11:05:18 Tower kernel: RIP: 0010:metaslab_alloc_dva+0xdf9/0xfce [zfs]
May 17 11:05:18 Tower kernel: Code: 03 4d 58 48 01 d8 48 39 c8 73 bc 48 8b 44 24 58 bf ff ff ff 00 49 c1 ec 09 48 8b 74 24 50 48 c1 e7 20 48 8b 5c 24 18 48 01 c6 <48> 8b 0e 48 89 c8 48 c1 e8 20 48 33 03 48 c1 e0 20 48 21 f8 8b bc
May 17 11:05:18 Tower kernel: RSP: 0018:ffffc90001b47ba8 EFLAGS: 00010286
May 17 11:05:18 Tower kernel: RAX: 0000000000000000 RBX: ffff8881652d4000 RCX: 0000000000300000
May 17 11:05:18 Tower kernel: RDX: 0000000000000000 RSI: fffb8883e4dbc650 RDI: 00ffffff00000000
May 17 11:05:18 Tower kernel: RBP: ffff888102badc00 R08: 0000000000000000 R09: ffffffffa0daebbe
May 17 11:05:18 Tower kernel: R10: ffff8884cc768000 R11: 0000000000000002 R12: 00000001ae20b7d0
May 17 11:05:18 Tower kernel: R13: ffff888101401400 R14: ffff888102badc00 R15: 0000000000000001
May 17 11:05:18 Tower kernel: FS: 0000000000000000(0000) GS:ffff888c4fe80000(0000) knlGS:0000000000000000
May 17 11:05:18 Tower kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 17 11:05:18 Tower kernel: CR2: 000000c00027c000 CR3: 0000000285fc2000 CR4: 0000000000350ee0
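For anyone reading the trace, here is a minimal sketch of what "non-canonical address" means for the value in the fault line (the same value sits in RSI). It assumes 48-bit virtual addressing (4-level paging), which is my assumption and not something stated in the log text; the helper names `is_canonical` and `single_bit_neighbors` are made up for illustration. It only does arithmetic on the number and is not a diagnosis of the cause.

```python
# Assumption: x86-64 with 4-level paging, i.e. 48-bit virtual addresses.
VA_BITS = 48


def is_canonical(addr: int, va_bits: int = VA_BITS) -> bool:
    """Return True if bits 63..(va_bits-1) are all 0 or all 1."""
    top = addr >> (va_bits - 1)               # bits 63..47 for va_bits=48
    all_ones = (1 << (64 - va_bits + 1)) - 1  # 17 one-bits for va_bits=48
    return top == 0 or top == all_ones


def single_bit_neighbors(addr: int, va_bits: int = VA_BITS) -> list[int]:
    """Canonical addresses that differ from addr in exactly one bit."""
    candidates = (addr ^ (1 << bit) for bit in range(64))
    return [c for c in candidates if is_canonical(c, va_bits)]


# Address reported in the general protection fault line of the log.
fault_addr = 0xFFFB8883E4DBC650

print(is_canonical(fault_addr))
for neighbor in single_bit_neighbors(fault_addr):
    print(hex(neighbor))
```

Under that assumption the script prints `False` and then `0xffff8883e4dbc650`: the faulting value is exactly one bit (bit 50) away from an address in the same `ffff888...` range as the other pointers in the register dump. That pattern can be consistent with a flipped bit, but it could equally come from a software bug, so it does not settle the hardware question by itself.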
I just made a flash drive with memtest86+; the full test has passed 2 times so far and is still running, which makes me confident the hardware is OK. Is this related to how Unraid handles ZFS memory? Please help investigate using the logs I attached.
tower-diagnostics-20230517-1123.zip tower-syslog-20230517-0309.zip