Machine Check Error



The "Fix Common Problems" plugin detected an error:

 

Quote

Your server has detected hardware errors. You should install mcelog via the NerdPack plugin, post your diagnostics and ask for assistance on the unRaid forums. The output of mcelog (if installed) has been logged

 

I'm posting my diagnostics here as the plugin suggests, hoping someone can take a quick look.

I am not having any obvious issues at the moment. Everything seems to be working fine.

 

Thanks

tower-diagnostics-20200503-1711.zip

Apr 24 20:45:31 Tower kernel: EDAC sbridge MC1: HANDLING MCE MEMORY ERROR
Apr 24 20:45:31 Tower kernel: EDAC MC1: 1 CE memory scrubbing error on CPU_SrcID#0_Ha#0_Chan#2_DIMM#0 (channel:2 slot:0 page:0x44721c offset:0x0 grain:32 syndrome:0x0 -  area:DRAM err_code:0008:00c2 socket:0 ha:0 channel_mask:4 rank:1)

Memory error.  Check your system event log for more info.

  • 1 month later...
On 5/4/2020 at 2:50 AM, Squid said:

Apr 24 20:45:31 Tower kernel: EDAC sbridge MC1: HANDLING MCE MEMORY ERROR
Apr 24 20:45:31 Tower kernel: EDAC MC1: 1 CE memory scrubbing error on CPU_SrcID#0_Ha#0_Chan#2_DIMM#0 (channel:2 slot:0 page:0x44721c offset:0x0 grain:32 syndrome:0x0 -  area:DRAM err_code:0008:00c2 socket:0 ha:0 channel_mask:4 rank:1)

Memory error.  Check your system event log for more info.

Thank you. Are there any hints as to which of my 2 memory sticks may be failing? It's been a long time since I ran memtest86, but I remember it taking a long time to run.
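For what it's worth, the EDAC line quoted above already carries a location hint in its `channel:` and `slot:` fields. A minimal Python sketch for pulling those fields out of such a line is below. `parse_edac_line` is a hypothetical helper written for this post, not part of mcelog or any EDAC tool, and the regular expression is only assumed to fit lines shaped like the one in this thread.

```python
import re

# Log line copied from the post quoted above.
LINE = (
    "Apr 24 20:45:31 Tower kernel: EDAC MC1: 1 CE memory scrubbing error on "
    "CPU_SrcID#0_Ha#0_Chan#2_DIMM#0 (channel:2 slot:0 page:0x44721c offset:0x0 "
    "grain:32 syndrome:0x0 -  area:DRAM err_code:0008:00c2 socket:0 ha:0 "
    "channel_mask:4 rank:1)"
)


def parse_edac_line(line):
    """Hypothetical helper: extract location fields from one EDAC message.

    Returns None when the line does not look like the sample above.
    """
    head = re.search(r"EDAC MC(\d+): (\d+) (CE|UE) (.+?) on (\S+) \((.*)\)", line)
    if head is None:
        return None
    mc, count, kind, what, label, detail = head.groups()
    # The parenthesised part is a list of key:value pairs separated by spaces.
    fields = dict(re.findall(r"(\w+):(\S+)", detail))
    return {
        "controller": int(mc),
        "count": int(count),
        "kind": kind,  # CE or UE, as printed by the kernel
        "error": what,
        "label": label,
        "socket": int(fields["socket"]),
        "channel": int(fields["channel"]),
        "slot": int(fields["slot"]),
        "rank": int(fields["rank"]),
    }


info = parse_edac_line(LINE)
print(info["label"], "controller", info["controller"],
      "channel", info["channel"], "slot", info["slot"])
```

For the line in this thread it reports memory controller 1, channel 2, slot 0 on socket 0, and `CE` marks a corrected (correctable) error. How that channel and slot map to a physical DIMM socket depends on the motherboard, so the board manual is the place to check.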

 


I'm not 100% sure, but memtest86 may not show that your RAM has an issue if it's ECC, since ECC corrects the errors (as long as it's still able to do so). I had a similar problem in the past: errors related to RAM, but memtest came back OK.

After I replaced 2 sticks of RAM, which seemed to solve the issue, I found out that the problem actually came from one of the 2 CPUs in my server.

So errors reported against RAM may come from other parts of the server.

My specific error was:
 

EDAC MC0: 1 CE memory read error on CPU_SrcID#0_Ha#0_Chan#3_DIMM#0 (channel:3 slot:0 page:0x1b2c39 offset:0x8c0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0093 socket:0 ha:0 channel_mask:8 rank:1)
EDAC sbridge MC0: CPU 0: Machine Check Event: 0 Bank 5: 8c00004000010093
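The 64-bit value after "Bank 5:" can be split into fields by hand. The sketch below assumes that value is the raw machine check status register for the bank and that it follows the architectural layout Intel documents for `IA32_MCi_STATUS` (valid bit 63, uncorrected bit 61, corrected error count in bits 52:38, MCA error code in bits 15:0, with memory controller errors encoded as `0000 0000 1MMM CCCC`). `decode_mci_status` is a hypothetical helper, and the corrected error count is only meaningful on CPUs that report it.

```python
# Status value copied from the "Bank 5" log line above.
STATUS = 0x8C00004000010093

# MMM sub-field of a memory controller error code (assumed Intel encoding).
MEM_OPS = {
    0: "generic undefined request",
    1: "memory read error",
    2: "memory write error",
    3: "address/command error",
    4: "memory scrubbing error",
}


def decode_mci_status(status):
    """Hypothetical helper: split a status word into a few architectural fields."""
    mca_code = status & 0xFFFF
    out = {
        "valid": bool((status >> 63) & 1),
        "overflow": bool((status >> 62) & 1),
        "uncorrected": bool((status >> 61) & 1),
        "corrected_count": (status >> 38) & 0x7FFF,  # bits 52:38
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_code": mca_code,
    }
    # Memory controller errors have the form 0000 0000 1MMM CCCC.
    if (mca_code & 0xFF80) == 0x0080:
        out["memory_op"] = MEM_OPS.get((mca_code >> 4) & 0x7, "reserved")
        channel = mca_code & 0xF
        out["channel"] = None if channel == 0xF else channel  # 0xF: not specified
    return out


decoded = decode_mci_status(STATUS)
print(decoded)
```

Under those assumptions the value decodes to a valid, corrected memory read error on channel 3, which lines up with the `err_code:0001:0093` and `channel:3` fields in the EDAC line above it. It describes where the error was observed, not which component caused it.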

 
