
Upgraded CPU to Ryzen 7 1700 - Now Unraid unstable.



I recently upgraded my Unraid server to a spare Ryzen 7 1700 CPU on an MSI B550 Tomahawk motherboard.

 

Since then I have been seeing frequent crashes and lockups for a variety of reasons.

 

My most recent crash is:

 

Apr  1 20:51:10 Tower kernel: ------------[ cut here ]------------
Apr  1 20:51:10 Tower kernel: WARNING: CPU: 7 PID: 7435 at kernel/exit.c:725 do_exit+0x4b/0x8eb
Apr  1 20:51:10 Tower kernel: Modules linked in: xt_nat veth xt_CHECKSUM ipt_REJECT nf_reject_ipv4 xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle nf_tables vhost_net tun vhost vhost_iotlb tap xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xt_addrtype iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter xfs nfsd lockd grace sunrpc md_mod ip6table_filter ip6_tables iptable_filter ip_tables x_tables r8169 realtek sr_mod cdrom edac_mce_amd kvm_amd kvm crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel crypto_simd r8125(O) cryptd glue_helper wmi_bmof i2c_piix4 rapl input_leds ccp ahci i2c_core wmi k10temp led_class cdc_acm libahci acpi_cpufreq button [last unloaded: realtek]
Apr  1 20:51:10 Tower kernel: CPU: 7 PID: 7435 Comm: unraidd10 Tainted: G S    D    O      5.10.28-Unraid #1
Apr  1 20:51:10 Tower kernel: Hardware name: Micro-Star International Co., Ltd. MS-7C91/MAG B550 TOMAHAWK (MS-7C91), BIOS A.80 12/16/2021
Apr  1 20:51:10 Tower kernel: RIP: 0010:do_exit+0x4b/0x8eb
Apr  1 20:51:10 Tower kernel: Code: 65 48 8b 1c 25 c0 7b 01 00 48 8b 83 e8 06 00 00 48 85 c0 74 17 48 8b 10 48 39 d0 75 0d 48 8b 50 10 48 83 c0 10 48 39 c2 74 02 <0f> 0b 65 8b 0d ec 40 fc 7e 89 c8 48 c7 c7 2e 61 d7 81 25 00 ff ff
Apr  1 20:51:10 Tower kernel: RSP: 0018:ffffc90000a7fee8 EFLAGS: 00010012
Apr  1 20:51:10 Tower kernel: RAX: ffffc90000a7fe40 RBX: ffff8881055a3800 RCX: 0000000000000027
Apr  1 20:51:10 Tower kernel: RDX: ffff88813be8c348 RSI: 0000000000000001 RDI: 0000000000000009
Apr  1 20:51:10 Tower kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 00000000ffffdfff
Apr  1 20:51:10 Tower kernel: R10: ffffc90000a7f958 R11: ffffc90000a7f950 R12: 0000000000000009
Apr  1 20:51:10 Tower kernel: R13: 0000000000000009 R14: 0000000000000046 R15: 0000000000000000
Apr  1 20:51:10 Tower kernel: FS:  0000000000000000(0000) GS:ffff888fee9c0000(0000) knlGS:0000000000000000
Apr  1 20:51:10 Tower kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Apr  1 20:51:10 Tower kernel: CR2: 0000000000000000 CR3: 0000000383a90000 CR4: 00000000003506e0
Apr  1 20:51:10 Tower kernel: Call Trace:
Apr  1 20:51:10 Tower kernel: ? md_seq_show+0x69e/0x69e [md_mod]
Apr  1 20:51:10 Tower kernel: ? kthread+0xe5/0xea
Apr  1 20:51:10 Tower kernel: rewind_stack_do_exit+0x17/0x17
Apr  1 20:51:10 Tower kernel: RIP: 0000:0x0
Apr  1 20:51:10 Tower kernel: Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
Apr  1 20:51:10 Tower kernel: RSP: 0000:0000000000000000 EFLAGS: 00000000 ORIG_RAX: 0000000000000000
Apr  1 20:51:10 Tower kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Apr  1 20:51:10 Tower kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Apr  1 20:51:10 Tower kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
Apr  1 20:51:10 Tower kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Apr  1 20:51:10 Tower kernel: R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Apr  1 20:51:10 Tower kernel: ---[ end trace 558ddcf995bf62ba ]---
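For anyone who wants to pull every trace like the one above out of a full syslog rather than scrolling for them, here is a minimal sketch. It assumes the kernel brackets each report with the same `cut here` and `end trace` marker lines seen in this log; the function names `extract_kernel_traces` and `summarize` are hypothetical helpers, not part of Unraid or any tool.

```python
import re

# Marker lines the kernel prints around a WARNING/oops report,
# as they appear in the log above.
CUT_HERE = "------------[ cut here ]------------"
END_TRACE = re.compile(r"---\[ end trace [0-9a-f]+ \]---")


def extract_kernel_traces(lines):
    """Return a list of trace blocks; each block is a list of log lines."""
    traces = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        if CUT_HERE in line:
            # Start a new block (discarding any unfinished one).
            current = [line]
        elif current is not None:
            current.append(line)
            if END_TRACE.search(line):
                traces.append(current)
                current = None
    return traces


def summarize(trace):
    """Pick out the WARNING/BUG headline and the 'Tainted:' line, if present."""
    headline = next((l for l in trace if "WARNING:" in l or "BUG:" in l), None)
    tainted = next((l for l in trace if "Tainted:" in l), None)
    return headline, tainted
```

To use it, open the syslog text file from the diagnostics zip (the exact path inside the archive is left to you, since it is not shown here), pass the file object to `extract_kernel_traces`, and print `summarize(t)` for each block. Comparing the headline lines across crashes is a quick way to see whether the failures are the same each time or truly varied.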

 

I'm getting frustrated at being unable to find the root cause. I know I have a bad cache device, and that was causing the mover to lock up or crash. Can anyone assist or point me in the right direction?

 

My diagnostics are attached:

tower-diagnostics-20220401-2103.zip


Went through this with my 1500X build. Since setting the PSU idle power option in the BIOS and setting the RAM to the Ryzen default speeds appropriate for my memory (pay attention to whether your RAM is single or dual rank), my system has been rock solid. I never touched my C-states.

 

YMMV

Edited by ConnerVT
wrong word
