redpill85

Fix Common Problems - MCE Error


Hey folks,

 

I'm getting an error that I've seen posted here a few times: Fix Common Problems reports a hardware issue (MCE). The error seems to be memory related. I just wanted some more experienced eyes on it to confirm, as I am new to server builds and Unraid. I'm also running a memtest now.

 

Thanks in advance!

unraid-diagnostics-20201016-2056.zip

Oct 16 20:17:20 UnRaid kernel: EDAC MC1: 1 CE memory read error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0x87fa42 offset:0xcc0 grain:32 syndrome:0x0 -  area:DRAM err_code:0001:0091 socket:1 ha:0 channel_mask:2 rank:0)

Bad memory. Your BIOS event log will probably have more information on which module it is, beyond Channel 1, DIMM 0.
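For anyone who wants to pull the location fields out of a line like the one quoted above, here is a minimal Python sketch. The regular expression and the `parse_edac_line` helper are my own illustration, written against this one sample line only; other EDAC drivers may format their messages differently, so treat the pattern as an assumption rather than a general parser.

```python
import re

# Sample line copied verbatim from the syslog excerpt in this thread.
LINE = (
    "Oct 16 20:17:20 UnRaid kernel: EDAC MC1: 1 CE memory read error on "
    "CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0x87fa42 "
    "offset:0xcc0 grain:32 syndrome:0x0 -  area:DRAM err_code:0001:0091 "
    "socket:1 ha:0 channel_mask:2 rank:0)"
)

# Assumed layout: "EDAC MC<n>: <count> CE|UE <text> on <label> (<key:value ...>)"
EDAC_RE = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<count>\d+) (?P<kind>CE|UE) .*? on "
    r"(?P<label>\S+) \((?P<details>.*)\)"
)


def parse_edac_line(line):
    """Return a dict of fields from an EDAC log line, or None if it does not match."""
    match = EDAC_RE.search(line)
    if match is None:
        return None
    info = {
        "mc": int(match["mc"]),        # memory controller index
        "count": int(match["count"]),  # number of errors in this report
        "kind": match["kind"],         # CE = corrected, UE = uncorrected
        "label": match["label"],       # DIMM label as printed by the driver
    }
    # Collect the key:value pairs inside the parentheses as strings.
    for key, value in re.findall(r"(\w+):(\S+)", match["details"]):
        info[key] = value
    return info


if __name__ == "__main__":
    info = parse_edac_line(LINE)
    print(info["kind"], "socket", info["socket"],
          "channel", info["channel"], "slot", info["slot"])
```

For the line in this thread it reports a corrected error (CE) on socket 1, channel 1, slot 0, which is the same location to look for in the BIOS or IPMI event log.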


Thanks for the reply. FWIW, I ran a memtest overnight and all of today so far with no errors (3 passes thus far). How many times does memtest need to pass before I can reasonably conclude something? I've read different things. It also says it's running the test on only "socket 1". Cores active: 1, 1 total. I have dual CPUs. I guess there is an option somewhere to run it on the other socket?


Unless ECC can be disabled in the BIOS, Memtest won't find any errors if ECC is correcting them. Check the system event log in the BIOS/IPMI; there should be more info there.
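Since corrected errors stay invisible to Memtest, one way to watch them from the running server is the kernel's EDAC counters. The sketch below assumes the EDAC driver is loaded (the syslog line earlier in the thread suggests it is) and that the counters are exposed as `ce_count` and `ue_count` files under `/sys/devices/system/edac/mc/mc*/`. The `read_edac_counts` helper is a hypothetical name of mine, and the path is a parameter so it can be pointed elsewhere if your system differs.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce_count": n, "ue_count": n}} from EDAC sysfs files."""
    counts = {}
    # Each memory controller is expected to appear as mc0, mc1, ...
    for mc_dir in sorted(Path(root).glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            counter = mc_dir / name
            if counter.is_file():
                entry[name] = int(counter.read_text().strip())
        counts[mc_dir.name] = entry
    return counts


if __name__ == "__main__":
    for controller, entry in read_edac_counts().items():
        print(controller, entry)
```

A `ce_count` that keeps climbing between runs would point to a module that ECC is still correcting, even while Memtest passes cleanly.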

4 hours ago, JorgeB said:

Unless ECC can be disabled in the BIOS, Memtest won't find any errors if ECC is correcting them. Check the system event log in the BIOS/IPMI; there should be more info there.

That is good to know, thank you. I did find in the BIOS event logs which DIMM slot was causing the error.

 

However, I was getting a message about my CMOS battery dying, so I changed that out. I'm no longer getting any memory read errors in the BIOS event logs or Fix Common Problems. Has anyone seen a dying CMOS battery cause an issue like this before? Also, one of my system fans stopped working last night after I switched the fan speed in IPMI, and then I lost control over the fan speed setting altogether. That was fixed as well after I replaced the CMOS battery.


I ran a Fix Common Problems scan just a minute ago and got the same error. Looks like I'll just have to replace the bad stick.
