In the last couple of months, I moved Unraid from a Dell R510 to a Supermicro build, and since then I have been seeing occasional warnings about machine check errors:
Dec 19 16:49:16 helium kernel: mce: [Hardware Error]: Machine check events logged
Dec 19 16:49:16 helium kernel: EDAC sbridge MC0: HANDLING MCE MEMORY ERROR
Dec 19 16:49:16 helium kernel: EDAC sbridge MC0: CPU 6: Machine Check Event: 0 Bank 10: 8c000046000800c1
Dec 19 16:49:16 helium kernel: EDAC sbridge MC0: TSC 51ce458bc87a8
Dec 19 16:49:16 helium kernel: EDAC sbridge MC0: ADDR c5c6ea000
Dec 19 16:49:16 helium kernel: EDAC sbridge MC0: MISC 900100010000c8c
Dec 19 16:49:16 helium kernel: EDAC sbridge MC0: PROCESSOR 0:306f2 TIME 1608418156 SOCKET 1 APIC 10
Dec 19 16:49:16 helium kernel: EDAC MC0: 1 CE memory scrubbing error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0xc5c6ea offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c1 socket:1 ha:0 channel_mask:2 rank:0)
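For anyone wanting to read the Bank 10 value by hand, here is a rough sketch of unpacking it. It assumes the architectural IA32_MCi_STATUS bit layout from the Intel SDM, and that the corrected-error count field (bits 52:38) is implemented on this CPU. The function names are made up for illustration and are not part of any tool.

```python
# Sketch only: field positions assume the architectural IA32_MCi_STATUS
# layout from the Intel SDM. Function names are hypothetical.

def decode_mci_status(status: int) -> dict:
    """Split a 64-bit machine check status value into its main fields."""
    return {
        "valid": bool((status >> 63) & 1),            # VAL
        "overflow": bool((status >> 62) & 1),         # OVER
        "uncorrected": bool((status >> 61) & 1),      # UC (0 means corrected)
        "misc_valid": bool((status >> 59) & 1),       # MISCV
        "addr_valid": bool((status >> 58) & 1),       # ADDRV
        "context_corrupt": bool((status >> 57) & 1),  # PCC
        # Bits 52:38, assumed to hold the corrected error count here.
        "corrected_count": (status >> 38) & 0x7FFF,
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_code": status & 0xFFFF,
    }


def decode_memory_controller_code(mca_code: int):
    """Decode the compound form 0000 0000 1MMM CCCC, or return None."""
    if mca_code & 0xFF80 != 0x0080:
        return None  # not a memory controller error code
    kinds = {
        0: "generic undefined request",
        1: "memory read error",
        2: "memory write error",
        3: "address/command error",
        4: "memory scrubbing error",
    }
    mmm = (mca_code >> 4) & 0x7
    channel = mca_code & 0xF  # 0xF means channel not specified
    return kinds.get(mmm, "reserved"), channel


if __name__ == "__main__":
    fields = decode_mci_status(0x8C000046000800C1)
    print(fields)
    print(decode_memory_controller_code(fields["mca_code"]))
```

The two low 16-bit fields line up with the `err_code:0008:00c1` that EDAC prints in the last log line, so the kernel has already done this decoding; the sketch just makes the individual flags visible.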
The current system is a Supermicro X10DRi with dual E5-2620 v3 processors and 64GB of RAM, configured as 2x16GB per socket.
This is obviously a memory issue, but what exactly causes it, and how do I go about fixing it? I don't know much about these sorts of logs. Presumably this isn't logging of things like ECC corrections, and it indicates a memory problem where I may have to replace the stick, correct?
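One way to narrow down which stick is involved is to tally the EDAC "CE" lines per DIMM label over time. A minimal sketch, assuming the syslog lines keep the format of the final `EDAC MC0:` line quoted in this post; the regex and the function name are illustrative, not taken from any existing tool.

```python
import re
from collections import Counter

# Pattern modeled on the "EDAC MC0: 1 CE ... on <label> (" line quoted in
# this post; other kernel versions may format the line differently.
CE_LINE = re.compile(r"EDAC MC\d+: (\d+) CE .*? on (\S+) \(")


def tally_corrected_errors(lines):
    """Sum the reported CE counts per DIMM label (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = CE_LINE.search(line)
        if match:
            counts[match.group(2)] += int(match.group(1))
    return counts


if __name__ == "__main__":
    sample = [
        "Dec 19 16:49:16 helium kernel: EDAC sbridge MC0: ADDR c5c6ea000",
        "Dec 19 16:49:16 helium kernel: EDAC MC0: 1 CE memory scrubbing "
        "error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 "
        "page:0xc5c6ea offset:0x0 grain:32 syndrome:0x0)",
    ]
    for label, count in tally_corrected_errors(sample).items():
        print(label, count)
```

If the same label keeps accumulating counts while the others stay at zero, that points at one DIMM or its slot rather than something system-wide.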