Machine Check Error - CPU


I am running 6.5.1rc5 and have received a few errors similar to this:

Apr 17 22:10:21 Tower kernel: mce: [Hardware Error]: Machine check events logged
Apr 17 22:10:21 Tower kernel: mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 0: 9000004000010005
Apr 17 22:10:21 Tower kernel: mce: [Hardware Error]: TSC 21546d59d59c6 
Apr 17 22:10:21 Tower kernel: mce: [Hardware Error]: PROCESSOR 0:306a9 TIME 1524017421 SOCKET 0 APIC 4 microcode 1f
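
For anyone trying to read the Bank 0 value in the log above, here is a minimal sketch of a decoder for the 64-bit status word. It assumes the architectural IA32_MCi_STATUS bit layout described in the Intel SDM. The function name `decode_mci_status` and the dictionary keys are made up for illustration, the table of simple error codes is deliberately partial, and the corrected-error count (bits 52:38) is only meaningful on CPUs that report support for it.

```python
# Partial table of "simple" MCA error codes (low 16 bits of the status word).
# Assumption: values taken from the architectural list in the Intel SDM.
SIMPLE_MCA_CODES = {
    0x0000: "No error",
    0x0001: "Unclassified",
    0x0002: "Microcode ROM parity error",
    0x0003: "External error",
    0x0004: "FRC error",
    0x0005: "Internal parity error",
}


def decode_mci_status(status: int) -> dict:
    """Split a 64-bit machine check status word into its main fields."""
    mca_code = status & 0xFFFF
    return {
        "valid": bool((status >> 63) & 1),            # VAL
        "overflow": bool((status >> 62) & 1),         # OVER
        "uncorrected": bool((status >> 61) & 1),      # UC
        "enabled": bool((status >> 60) & 1),          # EN
        "misc_valid": bool((status >> 59) & 1),       # MISCV
        "addr_valid": bool((status >> 58) & 1),       # ADDRV
        "context_corrupt": bool((status >> 57) & 1),  # PCC
        # Bits 52:38; only valid if the CPU supports corrected error counts.
        "corrected_count": (status >> 38) & 0x7FFF,
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": mca_code,
        "mca_error_name": SIMPLE_MCA_CODES.get(mca_code, "not in this partial table"),
    }


info = decode_mci_status(0x9000004000010005)
for key, value in info.items():
    print(f"{key}: {value}")
```

For 9000004000010005 this reports a valid, enabled event with the uncorrected flag clear (so the hardware corrected it) and MCA error code 0x0005, which the SDM lists as an internal parity error.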

They are not very frequent, but frequent enough that I am concerned, and I am not sure how to test for or remedy the cause. My diagnostics from today are attached; I would appreciate any guidance.

tower-diagnostics-20180419-1010.zip

Memtest didn't show any issues. Ideally it should run for a few days, but I can't take the server offline for that long; the overnight test showed no errors.

 

 I have actually now added another 16GB of RAM as well (4x8GB) ...

 

Another error was logged on the 24th, but nothing since then:

Apr 24 10:53:06 Tower kernel: mce: [Hardware Error]: Machine check events logged
Apr 24 10:53:06 Tower kernel: mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 0: 9000004000010005
Apr 24 10:53:06 Tower kernel: mce: [Hardware Error]: TSC cabbb7762305 
Apr 24 10:53:06 Tower kernel: mce: [Hardware Error]: PROCESSOR 0:306a9 TIME 1524581586 SOCKET 0 APIC 4 microcode 1f

Same - only different!?
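
The two reports can be compared field by field. This is a rough sketch that assumes the line format shown in the logs above; the regular expressions and the names `FIELDS`, `parse_mce` and `epoch_to_utc` are illustrative and not part of any existing tool.

```python
import re
from datetime import datetime, timezone

# One regular expression per field of interest (hypothetical names).
FIELDS = {
    "cpu": r"CPU (\d+):",
    "bank": r"Bank (\d+):",
    "status": r"Bank \d+: ([0-9a-f]+)",
    "tsc": r"TSC ([0-9a-f]+)",
    "processor": r"PROCESSOR (\S+)",
    "time": r"TIME (\d+)",
    "apic": r"APIC (\S+)",
    "microcode": r"microcode (\S+)",
}


def parse_mce(text):
    """Pull the known fields out of the text of one MCE report."""
    record = {}
    for name, pattern in FIELDS.items():
        match = re.search(pattern, text)
        if match:
            record[name] = match.group(1)
    return record


def epoch_to_utc(seconds):
    """Convert the TIME field (Unix epoch seconds) to an ISO 8601 UTC string."""
    return datetime.fromtimestamp(int(seconds), tz=timezone.utc).isoformat()


first = parse_mce(
    "CPU 2: Machine Check: 0 Bank 0: 9000004000010005 "
    "TSC 21546d59d59c6 "
    "PROCESSOR 0:306a9 TIME 1524017421 SOCKET 0 APIC 4 microcode 1f"
)
second = parse_mce(
    "CPU 2: Machine Check: 0 Bank 0: 9000004000010005 "
    "TSC cabbb7762305 "
    "PROCESSOR 0:306a9 TIME 1524581586 SOCKET 0 APIC 4 microcode 1f"
)

# Names of the fields whose values differ between the two reports.
differing = sorted(k for k in first if first[k] != second.get(k))
print(differing)
print(epoch_to_utc(first["time"]), epoch_to_utc(second["time"]))
```

Only the TSC and TIME fields differ; the CPU, bank, status word and microcode revision are identical in both reports. The TIME values are Unix timestamps, and read as UTC they are four hours ahead of the local times in the syslog prefix.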

 

I am going to replace my stock Intel CPU heatsink with a Noctua very shortly - waiting for parts - perhaps that will help ...

The server DOES seem functional after this happens - it ran along just fine for a while and I never noticed anything wrong at all.

But now that you mention it, my local GUI was 'frozen' at the time of this last error - I hardly ever boot to a local GUI, but I had on this occasion.

I remember wanting to reboot from the server but had to do it from a remote connection - the remote GUI was fine and let me restart the server.

I am questioning my stock Intel cooler though and not really 'trusting' it to be efficient.

I updated from rc5 to 6.5.1 last night as well with no issues so fingers crossed ...

  • 2 years later...

Since this is the first thread that pops up when Googling this specific error (Machine check events logged Bank 0: 9000004000010005 unraid), I figured I'd leave my two cents. I got exactly the same error as the OP. It was filling my Unraid syslogs, and the server was unstable, randomly crashing and rebooting. I managed to solve this by changing my motherboard's BIOS settings.

 

I set my CPU CORE RATIO from AUTO to SYNC ALL CORES.

Everything else related to CPU is set to AUTO. 

 

Motherboard: ASUS Z170-A

CPU: Intel i7-6700K

 
