October 18, 2019

Hello all you fine fellas. I've recently upgraded my Unraid machine from an old, chunky Q6600 machine to a mATX build. Some specs:

CPU: Ryzen R5 1400 @ stock
RAM: 16GB DDR4 @ 3200MHz
MOBO: Gigabyte B450 Aorus M
PSU: Seasonic 550W (Good Tier)

The server is used as a NAS and for running servers on a Linux VM. The first night the server was fine, but the other night it froze and was unresponsive; it didn't even let me type in the terminal. The same thing happened the next night. I installed the Fix Common Problems plugin and it reported an MCE hardware error. I then looked into the log and found this:

mce: [Hardware Error]: Machine check events logged
mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 5: bea0000000000108
mce: [Hardware Error]: TSC 0 ADDR 1ffff816560ea MISC d012000100000000 SYND 4d000000 IPID 500b000000000
mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1571423384 SOCKET 0 APIC 0 microcode 8001138

It seems to be either a CPU or a RAM issue, but the same components were in my main system for about a year before I upgraded them, and I had no issues with them during that time. Seems odd. I thank anyone who can help me with this.
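For anyone reading the log above, here is a minimal Python sketch of how the status word after "Bank 5:" and the TIME field can be unpacked. It only decodes the architectural x86 flag bits (63 down to 57) and the low 16-bit error code; the vendor-specific fields are left alone, and the function names `decode_status` and `decode_time` are made up for this illustration. It does not tell you whether the CPU or the RAM is at fault.

```python
from datetime import datetime, timezone

# Architectural flag bits in the upper part of an x86 MCi_STATUS register.
STATUS_FLAGS = {
    "VAL": 63,    # register holds a valid error
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was not corrected by hardware
    "EN": 60,     # reporting was enabled for this error
    "MISCV": 59,  # MISC register holds valid data
    "ADDRV": 58,  # ADDR register holds a valid address
    "PCC": 57,    # processor context may be corrupt
}


def decode_status(status_hex):
    """Split a hex MCi_STATUS value into flag booleans and the error code."""
    status = int(status_hex, 16)
    flags = {name: bool((status >> bit) & 1) for name, bit in STATUS_FLAGS.items()}
    return {"flags": flags, "error_code": status & 0xFFFF}


def decode_time(epoch_seconds):
    """Interpret the TIME field as seconds since the Unix epoch (UTC)."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


if __name__ == "__main__":
    decoded = decode_status("bea0000000000108")  # value from the log above
    for name, is_set in decoded["flags"].items():
        print(f"{name:6s} {'set' if is_set else 'clear'}")
    print(f"error code: {decoded['error_code']:#06x}")
    print("logged at:", decode_time(1571423384).isoformat())
```

For the value in this thread, UC and PCC both come out set, meaning the hardware reported an uncorrected error with possibly corrupt processor context, which fits a hard freeze.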
October 18, 2019 Community Expert

Have you done memtest? Set up a syslog server: https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?do=findComment&comment=781601 and get us the diagnostics plus whatever the syslog captures before the crash. Go to Tools - Diagnostics and attach the complete diagnostics zip file to your next post.
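Once a syslog has been captured, the machine check lines can be pulled out of it with a few lines of Python. This is only a rough sketch: it assumes the log is plain text and that MCE entries carry the same "mce: [Hardware Error]" marker seen in the first post. The function name `find_mce_lines` and the sample text are invented for this illustration.

```python
# Marker string as it appears in the kernel log lines quoted in this thread.
MCE_MARKER = "mce: [Hardware Error]"


def find_mce_lines(log_text):
    """Return every line of log_text that contains the MCE marker."""
    return [line.strip() for line in log_text.splitlines() if MCE_MARKER in line]


if __name__ == "__main__":
    # Hypothetical excerpt; in practice read the captured syslog file instead.
    sample = (
        "kernel: mce: [Hardware Error]: Machine check events logged\n"
        "kernel: some unrelated message\n"
        "kernel: mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 5: bea0000000000108\n"
    )
    for line in find_mce_lines(sample):
        print(line)
```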
October 18, 2019

3 hours ago, Arizuia said: "it froze and was unresponsive"
3 hours ago, Arizuia said: "Ryzen R5 1400"

Make sure that C-States are disabled in the BIOS.
October 18, 2019

4 minutes ago, Squid said: "Make sure that C-States are disabled in the BIOS"

Actually, the better solution is to look for a "Power supply idle mode" setting in the BIOS* and set it to "Typical current idle" rather than the default "Auto". That still allows the CPU to enter C-states but doesn't allow the power to drop so low that it can't wake up again. The issue only affects 1000-series processors.

*Typically under Advanced -> AMD CBS -> Zen Common Options
October 19, 2019 Author

3 hours ago, trurl said: "Have you done memtest? Setup Syslog Server: https://forums.unraid.net/topic/46802-faq-for-unraid-v6/?do=findComment&comment=781601 and get us the diagnostics plus whatever syslog captures before the crash. Tools - Diagnostics, attach complete diagnostics zip file to your next post."

I've left memtest running for a while. With the 3200MHz OC there were quite a lot of errors, over 700. Removing the OC reduced that to 2 so far. Attaching the diagnostics file as you asked.

kotiservu-diagnostics-20191018-1834.zip
October 19, 2019

Don't run the CPU's memory controller beyond its spec. You have to set up a server differently from a gaming machine. Two memory errors is a fail.
October 20, 2019 Author

On 10/19/2019 at 5:08 AM, John_M said: "Don't run the CPU's memory controller beyond its spec. You have to set up a server differently from a gaming machine. Two memory errors is a fail."

After reseating the RAM and keeping it at stock, everything seems to be fine now. There are no memtest errors and it hasn't crashed or frozen during the last night.

Very late update: I bought a used Ryzen 3200G and it solved everything. It seems the Ryzen 1400 has some early Zen bugs regarding Linux and stability.

Edited April 7, 2021 by Arizuia: late update