
Server Crashing--New hardware



I installed new hardware on the server yesterday.
I was running an AMD 3900X CPU with a 2060. The new hardware is a 12700K with no discrete GPU, using the iGPU instead.

I am using the same Corsair 3600 RAM modules that I had in the old motherboard.

The server is kernel panicking randomly and I am unsure what to do about it at the moment. I am not sure if it is something to do with the hybrid nature of the 12700K or if I set something up incorrectly. Any assistance would be helpful.

 

The latest crash was at 1:01am EST. Syslog attached. 

As a note, I am also seeing segfault messages periodically in the log. No CPU overclocking is enabled. I did have XMP enabled for the RAM, but I have disabled it and am still seeing issues. I have also attached a full list of the plugins I have enabled in case one of those is the issue.

 

Unraid Version: 6.12.0-rc5 

Screenshot 2023-05-10 062143.png

syslog

Edited by soultaco83

Memtest has started. I realize Unraid's version does not work on UEFI, so I had to download it from the site, but no problem there. I went ahead and unplugged the 4 sticks of RAM and reseated them. So let's see what this says. I decided to enable XMP again as well, as I would rather be running at 3600 if I can.


May 11 16:24:44 Weeb-Central kernel: CPU: 10 PID: 27684 Comm: chown Tainted: G      D            6.1.27-Unraid #1
May 11 16:24:44 Weeb-Central kernel: Hardware name: Gigabyte Technology Co., Ltd. Z690 AORUS ELITE AX DDR4/Z690 AORUS ELITE AX DDR4, BIOS F23 03/09/2023
May 11 16:24:44 Weeb-Central kernel: RIP: 0010:fuse_dentry_revalidate+0x8f/0x2b1
May 11 16:24:44 Weeb-Central kernel: Code: bc 24 90 00 00 00 31 c0 b9 1c 00 00 00 f3 ab 4d 85 ff 0f 84 b8 01 00 00 83 e5 40 b8 f6 ff ff ff 0f 85 f8 01 00 00 49 8b 47 28 <4c> 8b b0 68 03 00 00 e8 0b 8e 00 00 48 85 c0 49 89 c4 75 0a b8 f4
May 11 16:24:44 Weeb-Central kernel: RSP: 0018:ffffc900222fbb18 EFLAGS: 00010246
May 11 16:24:44 Weeb-Central kernel: RAX: 0000000000000000 RBX: ffff8883d909ed80 RCX: 0000000000000000
May 11 16:24:44 Weeb-Central kernel: RDX: 01b92d7cc57a0000 RSI: 0000000000000000 RDI: ffffc900222fbc18
May 11 16:24:44 Weeb-Central kernel: RBP: 0000000000000000 R08: c4b92d7cc57a0000 R09: 0000000000000041
May 11 16:24:44 Weeb-Central kernel: R10: 517b4a0273df7382 R11: 0000000000000fe0 R12: ffffc900222fbcf0
May 11 16:24:44 Weeb-Central kernel: R13: ffff888939d60020 R14: 0000000000000040 R15: ffff8883de460000
May 11 16:24:44 Weeb-Central kernel: FS:  000014693c4825c0(0000) GS:ffff88907f880000(0000) knlGS:0000000000000000
May 11 16:24:44 Weeb-Central kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 11 16:24:44 Weeb-Central kernel: CR2: 0000000000000368 CR3: 0000000799c4e006 CR4: 0000000000772ee0
May 11 16:24:44 Weeb-Central kernel: PKRU: 55555554
May 11 16:24:44 Weeb-Central kernel: Call Trace:
May 11 16:24:44 Weeb-Central kernel: <TASK>
May 11 16:24:44 Weeb-Central kernel: ? preempt_latency_start+0x2b/0x46
May 11 16:24:44 Weeb-Central kernel: ? memcg_slab_free_hook+0x20/0xcf
May 11 16:24:44 Weeb-Central kernel: ? _raw_spin_lock_irqsave+0x2c/0x37
May 11 16:24:44 Weeb-Central kernel: ? slab_free_freelist_hook.constprop.0+0x3b/0xaf
May 11 16:24:44 Weeb-Central kernel: ? kmem_cache_free+0xc9/0x154
May 11 16:24:44 Weeb-Central kernel: ? fuse_simple_request+0x1df/0x1ef
May 11 16:24:44 Weeb-Central kernel: ? get_cached_acl_rcu+0x17/0x51
May 11 16:24:44 Weeb-Central kernel: lookup_fast+0x97/0xc0
May 11 16:24:44 Weeb-Central kernel: walk_component+0x42/0xca
May 11 16:24:44 Weeb-Central kernel: path_lookupat+0x78/0xfe
May 11 16:24:44 Weeb-Central kernel: filename_lookup+0x5f/0xbc
May 11 16:24:44 Weeb-Central kernel: ? slab_post_alloc_hook+0x4d/0x15e
May 11 16:24:44 Weeb-Central kernel: vfs_statx+0x62/0x126
May 11 16:24:44 Weeb-Central kernel: vfs_fstatat+0x46/0x62
May 11 16:24:44 Weeb-Central kernel: __do_sys_newfstatat+0x26/0x5c
May 11 16:24:44 Weeb-Central kernel: ? syscall_trace_enter.constprop.0+0x5e/0xc8
May 11 16:24:44 Weeb-Central kernel: do_syscall_64+0x68/0x81
May 11 16:24:44 Weeb-Central kernel: entry_SYSCALL_64_after_hwframe+0x63/0xcd
May 11 16:24:44 Weeb-Central kernel: RIP: 0033:0x14693c39c9bf
May 11 16:24:44 Weeb-Central kernel: Code: 00 b8 ff ff ff ff c3 0f 1f 40 00 f3 0f 1e fa 41 89 f9 45 89 c2 89 f7 48 89 d6 48 89 ca 41 83 f9 01 77 30 b8 06 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 09 c3 0f 1f 84 00 00 00 00 00 48 8b 15 99 e4
May 11 16:24:44 Weeb-Central kernel: RSP: 002b:00007ffcd951e168 EFLAGS: 00000246 ORIG_RAX: 0000000000000106
May 11 16:24:44 Weeb-Central kernel: RAX: ffffffffffffffda RBX: 00005615c256df40 RCX: 000014693c39c9bf
May 11 16:24:44 Weeb-Central kernel: RDX: 00005615c256dfb8 RSI: 00005615c256e048 RDI: 0000000000000008
May 11 16:24:44 Weeb-Central kernel: RBP: 00005615c256dfb8 R08: 0000000000000100 R09: 0000000000000001
May 11 16:24:44 Weeb-Central kernel: R10: 0000000000000100 R11: 0000000000000246 R12: 00005615c0f64c20
May 11 16:24:44 Weeb-Central kernel: R13: 00005615c256de20 R14: 0000000000000064 R15: 00005615c0f64c20
May 11 16:24:44 Weeb-Central kernel: </TASK>
May 11 16:24:44 Weeb-Central kernel: Modules linked in: veth tls xt_connmark xt_comment iptable_raw wireguard curve25519_x86_64 libcurve25519_generic libchacha20poly1305 chacha_x86_64 poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha xt_mark xt_nat xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle vhost_net tun vhost vhost_iotlb tap ipvlan xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_addrtype br_netfilter xfs md_mod tcp_diag inet_diag k10temp hwmon_vid efivarfs ip6table_filter ip6_tables iptable_filter ip_tables x_tables bridge 8021q garp mrp stp llc i915 x86_pkg_temp_thermal intel_powerclamp coretemp iosf_mbi drm_buddy kvm_intel i2c_algo_bit ttm drm_display_helper kvm drm_kms_helper crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel sha512_ssse3 drm aesni_intel crypto_simd cryptd intel_gtt btusb r8169 agpgart btrtl mei_hdcp mei_pxp btbcm rapl syscopyarea
May 11 16:24:44 Weeb-Central kernel: btintel intel_cstate gigabyte_wmi wmi_bmof bluetooth intel_uncore mpt3sas i2c_i801 mei_me i2c_smbus sysfillrect nvme ahci input_leds raid_class ecdh_generic sysimgblt tpm_crb i2c_core mei nvme_core libahci realtek joydev led_class ecc scsi_transport_sas fb_sys_fops video thermal fan tpm_tis tpm_tis_core wmi tpm backlight intel_pmc_core acpi_tad acpi_pad button unix
May 11 16:24:44 Weeb-Central kernel: CR2: 0000000000000368
May 11 16:24:44 Weeb-Central kernel: ---[ end trace 0000000000000000 ]---
May 11 16:24:44 Weeb-Central kernel: RIP: 0010:fuse_dentry_revalidate+0x8f/0x2b1
May 11 16:24:44 Weeb-Central kernel: Code: bc 24 90 00 00 00 31 c0 b9 1c 00 00 00 f3 ab 4d 85 ff 0f 84 b8 01 00 00 83 e5 40 b8 f6 ff ff ff 0f 85 f8 01 00 00 49 8b 47 28 <4c> 8b b0 68 03 00 00 e8 0b 8e 00 00 48 85 c0 49 89 c4 75 0a b8 f4
May 11 16:24:44 Weeb-Central kernel: RSP: 0018:ffffc9000922bb18 EFLAGS: 00010246
May 11 16:24:44 Weeb-Central kernel: RAX: 0000000000000000 RBX: ffff8883d909ed80 RCX: 0000000000000000
May 11 16:24:44 Weeb-Central kernel: RDX: 01b92d7cc57a0000 RSI: 0000000000000005 RDI: ffffc9000922bc18
May 11 16:24:44 Weeb-Central kernel: RBP: 0000000000000000 R08: b4b92d7cc57a0000 R09: 7000000000000041
May 11 16:24:44 Weeb-Central kernel: R10: 517b4a0273df7382 R11: 0000000000000fe0 R12: ffffc9000922bcf0
May 11 16:24:44 Weeb-Central kernel: R13: ffff888942aad020 R14: 0000000000000045 R15: ffff8883de460000
May 11 16:24:44 Weeb-Central kernel: FS:  000014693c4825c0(0000) GS:ffff88907f880000(0000) knlGS:0000000000000000
May 11 16:24:44 Weeb-Central kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 11 16:24:44 Weeb-Central kernel: CR2: 0000000000000368 CR3: 0000000799c4e006 CR4: 0000000000772ee0
May 11 16:24:44 Weeb-Central kernel: PKRU: 55555554
May 11 16:24:44 Weeb-Central kernel: note: chown[27684] exited with irqs disabled
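For anyone reading the trace above, here is a minimal sketch of how the register dump can be cross-checked against the faulting instruction. It only uses three lines copied from the oops (syslog prefix trimmed); the helper names `parse_registers` and `faulting_bytes` are made up for illustration. The bytes at the `<4c>` marker decode to `mov r14, [rax + 0x368]`, and since RAX is 0 the faulting address in CR2 comes out as 0x368, which looks like a NULL pointer dereference inside `fuse_dentry_revalidate`. That tells you what the kernel tripped over, not whether the root cause is hardware or software.

```python
import re

# Three lines copied from the oops above, with the syslog prefix trimmed.
OOPS = """\
Code: bc 24 90 00 00 00 31 c0 b9 1c 00 00 00 f3 ab 4d 85 ff 0f 84 b8 01 00 00 83 e5 40 b8 f6 ff ff ff 0f 85 f8 01 00 00 49 8b 47 28 <4c> 8b b0 68 03 00 00 e8 0b 8e 00 00 48 85 c0 49 89 c4 75 0a b8 f4
RAX: 0000000000000000 RBX: ffff8883d909ed80 RCX: 0000000000000000
CR2: 0000000000000368 CR3: 0000000799c4e006 CR4: 0000000000772ee0
"""


def parse_registers(text):
    """Collect every 'NAME: <16 hex digits>' pair into a {name: int} dict."""
    pairs = re.findall(r"\b([A-Z][A-Z0-9]{2}): ([0-9a-f]{16})\b", text)
    return {name: int(value, 16) for name, value in pairs}


def faulting_bytes(text, count=7):
    """Return `count` opcode bytes starting at the <xx> marker (where RIP pointed)."""
    tokens = re.search(r"Code: (.*)", text).group(1).split()
    start = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    return bytes(int(tok.strip("<>"), 16) for tok in tokens[start:start + count])


regs = parse_registers(OOPS)
insn = faulting_bytes(OOPS)

# 4c 8b b0 <disp32> encodes `mov r14, [rax + disp32]`:
# REX.W+R prefix, opcode 8b (MOV r64, r/m64), ModRM b0 = [rax + disp32] -> r14.
displacement = int.from_bytes(insn[3:7], "little")

print(f"instruction bytes: {insn.hex(' ')}")
print(f"rax + disp = {regs['RAX'] + displacement:#x}, CR2 = {regs['CR2']:#x}")
```

Both values match, so the register dump and the faulting address are at least consistent with each other in this trace.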

I guess I will try to get a new CPU/motherboard this weekend? It was part of a bundle that I got on Monday from Micro Center.


I want to update things as they sit right now. I took the motherboard out and removed all cables, the RAM, the CPU, and the PCIe cards.

Reseated CPU

Reseated RAM

Reseated power cables

Reseated HBA cards

Reset BIOS again (even though there were no changes, everything was at stock)


At this moment the server has booted and, for the first time, I had 0 errors. I am letting everything run and will keep monitoring. Fingers crossed.

Just had a few segfaults:
May 12 17:37:06 Weeb-Central kernel: smartctl[26804]: segfault at 0 ip 0000000000000000 sp 00007ffe308ded48 error 14 in smartctl[400000+4000] likely on CPU 0 (core 0, socket 0)
May 12 17:37:06 Weeb-Central kernel: Code: Unable to access opcode bytes at 0xffffffffffffffd6.
May 12 17:56:54 Weeb-Central kernel: smartctl[9730]: segfault at 0 ip 0000000000000000 sp 00007ffcbc08bdf8 error 14 in smartctl[400000+4000] likely on CPU 16 (core 36, socket 0)
May 12 17:56:54 Weeb-Central kernel: Code: Unable to access opcode bytes at 0xffffffffffffffd6.
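As a rough aid for reading these lines: the number after `error` is the x86 page-fault error code, printed in hex. Below is a minimal sketch that decodes it, using one of the smartctl lines above as input; the names `decode_pf_error` and `describe` are hypothetical helpers, not part of any tool. With `error 14` and `ip` at 0, the fault is a user-mode instruction fetch from address 0, i.e. the process jumped through a null pointer.

```python
import re

# x86 page-fault error-code bits: (mask, meaning if set, meaning if clear).
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write", "read"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set", None),
    (0x10, "instruction fetch", None),
]

SEGFAULT_RE = re.compile(
    r"kernel: (?P<comm>.+?)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
)


def decode_pf_error(err_hex):
    """Turn the hex error code from a segfault line into readable flags."""
    code = int(err_hex, 16)
    flags = []
    for mask, if_set, if_clear in PF_BITS:
        label = if_set if code & mask else if_clear
        if label:
            flags.append(label)
    return flags


def describe(line):
    """Parse one 'segfault at ...' syslog line; return None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    return {
        "comm": m["comm"],
        "addr": int(m["addr"], 16),
        "ip": int(m["ip"], 16),
        "flags": decode_pf_error(m["err"]),
    }


LINE = ("May 12 17:37:06 Weeb-Central kernel: smartctl[26804]: segfault at 0 "
        "ip 0000000000000000 sp 00007ffe308ded48 error 14 in smartctl[400000+4000] "
        "likely on CPU 0 (core 0, socket 0)")
info = describe(LINE)
print(info)
```

The same pattern (address 0, ip 0, error 14) shows up for different programs, which is worth mentioning when asking for help, since unrelated processes failing the same way is less likely to be a bug in any one of them.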
 

Edited by soultaco83

Hmm, I thought these were resolved: I set the SMART controller type to ATA and the poll time to 3600 seconds. More faults though, and different ones this time.



May 18 08:00:28 Weeb-Central kernel: eth0: renamed from veth7cfdc86
May 18 08:00:32 Weeb-Central Docker Auto Update: Stopping binhex-urbackup
May 18 08:00:35 Weeb-Central kernel: main checkpoint[18226]: segfault at 0 ip 0000000000000000 sp 00001468e3bfdcb8 error 14
May 18 08:00:35 Weeb-Central kernel: lnk checkpoint[18229]: segfault at 0 ip 0000000000000000 sp 00001468e35facb8 error 14 likely on CPU 4 (core 8, socket 0)
May 18 08:00:35 Weeb-Central kernel: likely on CPU 12 (core 24, socket 0)
May 18 08:00:35 Weeb-Central kernel: 
May 18 08:00:35 Weeb-Central kernel: 
May 18 08:00:35 Weeb-Central kernel: Code: Unable to access opcode bytes at 0xffffffffffffffd6.
May 18 08:00:35 Weeb-Central kernel: settings checkp[18227]: segfault at 0 ip 0000000000000000 sp 00001468e39fccb8 error 14
May 18 08:00:35 Weeb-Central kernel: Code: Unable to access opcode bytes at 0xffffffffffffffd6.
May 18 08:00:35 Weeb-Central kernel: likely on CPU 8 (core 16, socket 0)
May 18 08:00:35 Weeb-Central kernel: Code: Unable to access opcode bytes at 0xffffffffffffffd6.
May 18 08:00:35 Weeb-Central kernel: lnk jour checkp[18228]: segfault at 0 ip 0000000000000000 sp 00001468e37fbcb8 error 14 likely on CPU 6 (core 12, socket 0)
May 18 08:00:35 Weeb-Central kernel: Code: Unable to access opcode bytes at 0xffffffffffffffd6.
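To test the hybrid-core suspicion from the first post, one option is to tally the `likely on CPU N` hints across the log. This is only a sketch: the split at logical CPU 16 (0-15 as P-core threads, 16-19 as E-cores) is an assumption about how this 12700K is enumerated and should be confirmed with `lscpu --extended`, and `tally_cpus` is a made-up helper name. The sample input is just the six hints quoted in this thread, which is far too few to conclude anything.

```python
import re
from collections import Counter

CPU_RE = re.compile(r"likely on CPU (\d+) \(core (\d+), socket (\d+)\)")

# The six "likely on CPU" hints quoted in this thread.
SAMPLE = [
    "smartctl[26804]: segfault ... likely on CPU 0 (core 0, socket 0)",
    "smartctl[9730]: segfault ... likely on CPU 16 (core 36, socket 0)",
    "lnk checkpoint[18229]: segfault ... likely on CPU 4 (core 8, socket 0)",
    "kernel: likely on CPU 12 (core 24, socket 0)",
    "kernel: likely on CPU 8 (core 16, socket 0)",
    "lnk jour checkp[18228]: segfault ... likely on CPU 6 (core 12, socket 0)",
]


def tally_cpus(lines, first_e_cpu=16):
    """Count fault hints per logical CPU and split them at `first_e_cpu`.

    `first_e_cpu` is an assumption: the first logical CPU that is an E-core.
    """
    per_cpu = Counter()
    for line in lines:
        for cpu, _core, _socket in CPU_RE.findall(line):
            per_cpu[int(cpu)] += 1
    p_faults = sum(n for cpu, n in per_cpu.items() if cpu < first_e_cpu)
    e_faults = sum(n for cpu, n in per_cpu.items() if cpu >= first_e_cpu)
    return per_cpu, p_faults, e_faults


per_cpu, p_faults, e_faults = tally_cpus(SAMPLE)
print(sorted(per_cpu.items()))
print(f"P-core threads: {p_faults}, E-cores: {e_faults}")
```

To run it over a whole log, pass an open file instead of `SAMPLE`, for example `tally_cpus(open("syslog"))`, where the path is wherever the attached syslog was saved.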
