machine check error



Hello all. My Unraid server rebooted on its own today. I have a script that copies the log to another location every 5 minutes, but I didn't see anything in the copies from before the reboot. When the server came back online, Fix Common Problems reported an issue:

Your server has detected hardware errors. You should install mcelog via the NerdPack plugin, post your diagnostics and ask for assistance on the unRaid forums. The output of mcelog (if installed) has been logged.

I didn't have mcelog installed, so I installed it, but I assume it only logs events going forward? In any case, my logs from after the reboot (before mcelog was installed) show the following:

May 28 00:16:00 Tower kernel: smpboot: CPU0: AMD Ryzen 7 1700 Eight-Core Processor (family: 0x17, model: 0x1, stepping: 0x1)
May 28 00:16:00 Tower kernel: Performance Events: Fam17h core perfctr, AMD PMU driver.
May 28 00:16:00 Tower kernel: ... version:                0
May 28 00:16:00 Tower kernel: ... bit width:              48
May 28 00:16:00 Tower kernel: ... generic registers:      6
May 28 00:16:00 Tower kernel: ... value mask:             0000ffffffffffff
May 28 00:16:00 Tower kernel: ... max period:             00007fffffffffff
May 28 00:16:00 Tower kernel: ... fixed-purpose events:   0
May 28 00:16:00 Tower kernel: ... event mask:             000000000000003f
May 28 00:16:00 Tower kernel: rcu: Hierarchical SRCU implementation.
May 28 00:16:00 Tower kernel: smp: Bringing up secondary CPUs ...
May 28 00:16:00 Tower kernel: x86: Booting SMP configuration:
May 28 00:16:00 Tower kernel: .... node  #0, CPUs:        #1  #2  #3  #4  #5  #6  #7  #8  #9 #10 #11 #12 #13
May 28 00:16:00 Tower kernel: mce: [Hardware Error]: Machine check events logged
May 28 00:16:00 Tower kernel: mce: [Hardware Error]: CPU 13: Machine Check: 0 Bank 5: bea0000000000108
May 28 00:16:00 Tower kernel: mce: [Hardware Error]: TSC 0 ADDR 1ffff81654a1a MISC d012000101000000 SYND 4d000000 IPID 500b000000000 
May 28 00:16:00 Tower kernel: mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1559016932 SOCKET 0 APIC d microcode 8001126
May 28 00:16:00 Tower kernel: #14 #15
May 28 00:16:00 Tower kernel: smp: Brought up 1 node, 16 CPUs
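
For anyone reading along, here is a rough Python sketch of how the raw values in the mce lines above can be picked apart. It assumes the standard x86 MCi_STATUS layout (flag bits 63 to 57, error code in the low 16 bits) and that TIME is a Unix timestamp. The names `decode_status` and `parse_mce` are made up for illustration, and the AMD-specific fields (SYND, IPID, what bank 5 maps to) are not decoded.

```python
import re
from datetime import datetime, timezone

# Architectural x86 MCi_STATUS flag bits (bit position -> short name).
# Assumption: vendor-specific bits below 57 are deliberately ignored here.
STATUS_FLAGS = {
    63: "VAL",    # register contains a valid error
    62: "OVER",   # an earlier error was overwritten
    61: "UC",     # error was not corrected by hardware
    60: "EN",     # reporting was enabled for this error
    59: "MISCV",  # MISC register is valid
    58: "ADDRV",  # ADDR register is valid
    57: "PCC",    # processor context may be corrupt
}

BANK_RE = re.compile(r"CPU (\d+): Machine Check: \d+ Bank (\d+): ([0-9a-fA-F]+)")
TIME_RE = re.compile(r"\bTIME (\d+)\b")


def decode_status(status_hex):
    """Split a hex MCi_STATUS value into set flag names and the low 16-bit error code."""
    status = int(status_hex, 16)
    flags = [name for bit, name in STATUS_FLAGS.items() if (status >> bit) & 1]
    return {"flags": flags, "error_code": status & 0xFFFF}


def parse_mce(lines):
    """Collect CPU, bank, decoded status and UTC time from kernel mce log lines."""
    result = {}
    for line in lines:
        bank = BANK_RE.search(line)
        if bank:
            result["cpu"] = int(bank.group(1))
            result["bank"] = int(bank.group(2))
            result.update(decode_status(bank.group(3)))
        stamp = TIME_RE.search(line)
        if stamp:
            result["time_utc"] = datetime.fromtimestamp(int(stamp.group(1)), tz=timezone.utc)
    return result


if __name__ == "__main__":
    sample = [
        "mce: [Hardware Error]: CPU 13: Machine Check: 0 Bank 5: bea0000000000108",
        "mce: [Hardware Error]: PROCESSOR 2:800f11 TIME 1559016932 SOCKET 0 APIC d microcode 8001126",
    ]
    print(parse_mce(sample))
```

On the values above this gives VAL, UC, EN, MISCV, ADDRV and PCC set, error code 0x108, and a TIME of 2019-05-28 04:15:32 UTC. UC and PCC mean the error was uncorrected and the processor context may be corrupt, which would fit an unexpected reset, though it does not by itself identify the cause.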

Is the zenstates fix still required on 6.7? I already have C-states disabled in the BIOS.

tower-diagnostics-20190528-0431.zip
