Dimm Error Identification


statecowboy

Hi guys. I was curious how to reconcile the memory errors unRAID reports with my physical DIMMs. In my case, I can log in to my BMC web console and see the errors there. The board also has an LED per slot that blinks when an error is registered. Right now the LED for DIMM H2 is lit, and the web console shows the following:

 

1679 02/17/2018 23:29:30 Mmry ECC Sensor Memory Correctable ECC. CPU: 2, DIMM: H2. - Asserted

 

That said, this is the error I get in the unRAID syslog:

 

Feb 17 17:27:57 someflix-unraid kernel: mce: [Hardware Error]: Machine check events logged
Feb 17 17:27:57 someflix-unraid kernel: EDAC sbridge MC1: HANDLING MCE MEMORY ERROR
Feb 17 17:27:57 someflix-unraid kernel: EDAC sbridge MC1: CPU 10: Machine Check Event: 0 Bank 12: 8c000043000800c3
Feb 17 17:27:57 someflix-unraid kernel: EDAC sbridge MC1: TSC 5365d5a58cd7c 
Feb 17 17:27:57 someflix-unraid kernel: EDAC sbridge MC1: ADDR 9dd90f000 
Feb 17 17:27:57 someflix-unraid kernel: EDAC sbridge MC1: MISC 122100008000868c 
Feb 17 17:27:57 someflix-unraid kernel: EDAC sbridge MC1: PROCESSOR 0:306e4 TIME 1518910077 SOCKET 1 APIC 20
Feb 17 17:27:57 someflix-unraid kernel: EDAC MC1: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#0_DIMM#1 (channel:0 slot:1 page:0x9dd90f offset:0x0 grain:32 syndrome:0x0 -  area:DRAM err_code:0008:00c3 socket:1 ha:0 channel_mask:1 rank:4)

 

Can someone explain how to tell from the unraid log which DIMM I am getting an error on?  Obviously I can check my web console, but I was curious what the methodology is.
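
As a starting point, the "CE" line is the one that carries the location: the label `CPU_SrcID#1_Ha#0_Chan#0_DIMM#1` names the socket, home agent, channel, and slot. Below is a minimal sketch that pulls those numbers out of such a line. The regular expression is only modeled on the lines pasted in this thread, and `parse_edac_line` is a hypothetical helper name, not part of any tool. Note that the log gives no silkscreen name such as H2; translating socket/channel/slot into a slot label depends on the motherboard and has to come from the board manual or the BMC.

```python
import re

# "CE" line copied from the syslog excerpt above.
LINE = (
    "Feb 17 17:27:57 someflix-unraid kernel: EDAC MC1: 1 CE memory scrubbing error on "
    "CPU_SrcID#1_Ha#0_Chan#0_DIMM#1 (channel:0 slot:1 page:0x9dd90f offset:0x0 grain:32 "
    "syndrome:0x0 -  area:DRAM err_code:0008:00c3 socket:1 ha:0 channel_mask:1 rank:4)"
)

# Assumption: the label layout matches the lines seen in this thread.
LABEL_RE = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<count>\d+) (?P<kind>CE|UE) .*? on "
    r"CPU_SrcID#(?P<socket>\d+)_Ha#(?P<ha>\d+)_Chan#(?P<channel>\d+)_DIMM#(?P<slot>\d+)"
)


def parse_edac_line(line):
    """Return the location fields of an EDAC error line, or None if it does not match."""
    match = LABEL_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    kind = fields.pop("kind")  # "CE" = corrected error, "UE" = uncorrected error
    location = {name: int(value) for name, value in fields.items()}
    location["kind"] = kind
    return location


print(parse_edac_line(LINE))
```

For the line above this yields socket 1, home agent 0, channel 0, slot 1, which is as far as the kernel log alone can narrow it down.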

 

Thanks

Hi guys. I've had another memory error, which I've added below to illustrate my question above (I've also attached my latest diagnostics).

From unRAID logs:

Feb 20 11:30:45 someflix-unraid kernel: EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
Feb 20 11:30:45 someflix-unraid kernel: EDAC sbridge MC0: CPU 0: Machine Check Event: 0 Bank 10: 8c000047000800c1
Feb 20 11:30:45 someflix-unraid kernel: EDAC sbridge MC0: TSC 160afab75fbec 
Feb 20 11:30:45 someflix-unraid kernel: EDAC sbridge MC0: ADDR 142592000 
Feb 20 11:30:45 someflix-unraid kernel: EDAC sbridge MC0: MISC 908400800080e8c 
Feb 20 11:30:45 someflix-unraid kernel: EDAC sbridge MC0: PROCESSOR 0:306e4 TIME 1519147845 SOCKET 0 APIC 0
Feb 20 11:30:45 someflix-unraid kernel: EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#0_Ha#0_Chan#0_DIMM#0 (channel:0 slot:0 page:0x142592 offset:0x0 grain:32 syndrome:0x0 -  area:DRAM err_code:0008:00c1 socket:0 ha:0 channel_mask:1 rank:1)

 

From BMC Web Console:

Event ID   Time Stamp   Sensor Name   Sensor Type   Description
22 02/20/2018 17:31:40 Mmry ECC Sensor Memory Correctable ECC. CPU: 1, DIMM: B1. - Asserted
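
One way to check that a BMC event and a kernel event describe the same error is to compare timestamps. The `TIME` field in the `PROCESSOR` line is a Unix epoch value, so it can be converted to UTC and set against the BMC time stamp. The sketch below assumes the BMC clock runs on UTC and uses the `MM/DD/YYYY HH:MM:SS` format seen in the console output; the function names are made up for illustration.

```python
from datetime import datetime, timezone


def mce_time_utc(epoch_seconds):
    """Convert the TIME value from the 'PROCESSOR ... TIME ...' line to UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


def bmc_time_utc(text):
    """Parse a BMC event time stamp. Assumption: the BMC clock is set to UTC."""
    return datetime.strptime(text, "%m/%d/%Y %H:%M:%S").replace(tzinfo=timezone.utc)


def seconds_apart(epoch_seconds, bmc_text):
    """Seconds by which the BMC entry trails the kernel entry."""
    return (bmc_time_utc(bmc_text) - mce_time_utc(epoch_seconds)).total_seconds()


# Values copied from the two events posted in this thread.
print(mce_time_utc(1518910077))                          # 2018-02-17 23:27:57+00:00
print(seconds_apart(1518910077, "02/17/2018 23:29:30"))  # 93.0
print(seconds_apart(1519147845, "02/20/2018 17:31:40"))  # 55.0
```

Both BMC entries land within about a minute and a half of the kernel entries, so under that clock assumption the pairs would be socket 1 / channel 0 / slot 1 with H2 and socket 0 / channel 0 / slot 0 with B1 on this board. Two samples are not a full slot map, and other boards label their slots differently.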

someflix-unraid-diagnostics-20180221-1837.zip
