
Machine check events


MvL


Posted

Hi,

 

I have found machine check events on my server. Fix Common Problems has warned me three times now, so I guess it's time to find out what the problem is. Can someone help me solve the issue? I have my own thoughts about what the cause is. I have attached my diagnostics file to this post.

 

 

rackserver-diagnostics-20190518-0936.zip

Posted

I see this in the syslog:

 

Quote

May 17 22:35:46 RackServer kernel: mce: [Hardware Error]: Machine check events logged
May 17 22:35:46 RackServer kernel: EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
May 17 22:35:46 RackServer kernel: EDAC sbridge MC0: CPU 0: Machine Check Event: 0 Bank 13: 8c000049000800c0
May 17 22:35:46 RackServer kernel: EDAC sbridge MC0: TSC ab5c2e16c9629 
May 17 22:35:46 RackServer kernel: EDAC sbridge MC0: ADDR a58102000 
May 17 22:35:46 RackServer kernel: EDAC sbridge MC0: MISC 90000008000928c 
May 17 22:35:46 RackServer kernel: EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1558125346 SOCKET 0 APIC 0
May 17 22:35:46 RackServer kernel: EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#0_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xa58102 offset:0x0 grain:32 syndrome:0x0 -  area:DRAM err_code:0008:00c0 socket:0 ha:0 channel_mask:2 rank:0)

 

Are these memory errors? I've checked my BIOS, but I can't find anything related to this.
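
For anyone reading along, the status word in the "Bank 13" line can be decoded by hand. The following is only a rough sketch, assuming the standard IA32_MCi_STATUS bit layout described in the Intel SDM machine-check chapter; the function name `decode_mci_status` and the lookup tables are made up for this illustration, and the corrected-error count field is only meaningful on CPUs that report it.

```python
import re

# Flag bit positions in IA32_MCi_STATUS (assumed from the Intel SDM layout).
FLAG_BITS = {"VAL": 63, "OVER": 62, "UC": 61, "EN": 60,
             "MISCV": 59, "ADDRV": 58, "PCC": 57}

# MMM field of a memory-controller error code (pattern 0000 0000 1MMM CCCC).
MEM_OPS = {
    0: "generic undefined request",
    1: "memory read error",
    2: "memory write error",
    3: "address/command error",
    4: "memory scrubbing error",
}


def decode_mci_status(status: int) -> dict:
    """Split a 64-bit machine-check status word into its main fields."""
    flags = {name: bool((status >> bit) & 1) for name, bit in FLAG_BITS.items()}
    mca_code = status & 0xFFFF
    info = {
        "flags": flags,
        "mca_code": mca_code,                    # bits 15:0
        "model_code": (status >> 16) & 0xFFFF,   # bits 31:16
        "corrected_count": (status >> 38) & 0x7FFF,  # bits 52:38, if reported
        "corrected": not flags["UC"],            # UC clear means the error was corrected
    }
    # Memory-controller errors have bits 15:7 equal to 0b000000001.
    if (mca_code >> 7) == 1:
        info["memory_op"] = MEM_OPS.get((mca_code >> 4) & 0x7, "reserved")
    return info


# The line quoted from the syslog above.
LINE = ("May 17 22:35:46 RackServer kernel: EDAC sbridge MC0: CPU 0: "
        "Machine Check Event: 0 Bank 13: 8c000049000800c0")

match = re.search(r"Bank (\d+): ([0-9a-f]+)", LINE)
bank = int(match.group(1))
status = int(match.group(2), 16)
result = decode_mci_status(status)

print(f"bank {bank}: corrected={result['corrected']}, "
      f"op={result.get('memory_op')}, count={result['corrected_count']}")
```

Under those assumptions the word decodes as a corrected (not uncorrected) memory scrubbing error with a count of 1, which matches the "1 CE memory scrubbing error" line in the quote.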

Posted

Thank you for the reply.

 

The weirdest part is that I don't see any errors in my BIOS log files. Any idea how to diagnose this further?

Posted

Investigating!

 

I have found the memtest86 entry that you see in the boot options when unRAID starts. It's running at the moment. I'll keep you informed.

Posted

Appreciate your guidance.

 

I've checked the event log via IPMI and there are no events. The latest event was on 2019-5-2.

 

I have put PassMark MemTest86 on a USB stick and it is now running.

Posted

Okay. 

 

The unRAID log is reporting Chan#1_DIMM#0. I'm guessing this is DIMM A2? That is, the first DIMM position (A) on the motherboard, and then the second slot of A, thus DIMM A2? I'm guessing there is also a channel 0, so channel 0 --> slot 1 and channel 1 --> slot 2.
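
One way to cross-check which slot is accumulating errors is to read the per-DIMM counters that the kernel EDAC driver exposes in sysfs. This is a minimal sketch, assuming the EDAC driver is loaded and the usual `/sys/devices/system/edac/mc` layout with `dimm_label`, `dimm_location`, `dimm_ce_count` and `dimm_ue_count` files; the helper name `read_dimm_counters` is hypothetical. It only shows the kernel's own label for each DIMM. Whether Chan#1_DIMM#0 corresponds to the slot printed as A2 on the board is board-specific and still has to be checked against the motherboard manual.

```python
from pathlib import Path


def read_dimm_counters(edac_root="/sys/devices/system/edac/mc"):
    """Return one dict per DIMM directory found under the EDAC sysfs tree."""
    rows = []
    for dimm in sorted(Path(edac_root).glob("mc*/dimm*")):
        if not dimm.is_dir():
            continue

        def read(name, base=dimm):
            # Missing attribute files are treated as empty strings.
            f = base / name
            return f.read_text().strip() if f.is_file() else ""

        rows.append({
            "mc": dimm.parent.name,             # e.g. mc0
            "dimm": dimm.name,                  # e.g. dimm2
            "label": read("dimm_label"),        # kernel's label for the slot
            "location": read("dimm_location"),  # e.g. channel/slot indices
            "ce": int(read("dimm_ce_count") or 0),  # corrected errors
            "ue": int(read("dimm_ue_count") or 0),  # uncorrected errors
        })
    return rows


if __name__ == "__main__":
    for row in read_dimm_counters():
        print(row)
```

A DIMM whose `ce` value keeps climbing between checks is the one to reseat or swap first; moving it to another slot and seeing whether the errors follow the module or stay with the slot helps separate a bad stick from a bad slot.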

Archived

This topic is now archived and is closed to further replies.
