redpill85 Posted October 17, 2020
Hey folks, I'm getting an error that I've seen posted a few times on here: a hardware issue reported by Fix Common Problems. The error seems to be memory related. I just wanted some more experienced eyes on it to confirm, as I am new to server builds and UnRaid. I'm also running a memtest now. Thanks in advance! Attached: unraid-diagnostics-20201016-2056.zip
Squid Posted October 17, 2020
Oct 16 20:17:20 UnRaid kernel: EDAC MC1: 1 CE memory read error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0x87fa42 offset:0xcc0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0091 socket:1 ha:0 channel_mask:2 rank:0)
Bad memory. Your BIOS's event log will probably have more information on which stick it is, beyond Channel 1, DIMM 0.
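For anyone who wants to pull the location fields out of a line like the one Squid quoted, here is a minimal Python sketch. The regular expression and the `parse_edac` helper are illustrative assumptions built only from that single log line; other kernel versions or EDAC drivers may format the message differently, so treat it as a starting point rather than a general parser.

```python
import re

# The log line quoted above, copied verbatim from the syslog.
EDAC_LINE = (
    "Oct 16 20:17:20 UnRaid kernel: EDAC MC1: 1 CE memory read error on "
    "CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0x87fa42 "
    "offset:0xcc0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0091 "
    "socket:1 ha:0 channel_mask:2 rank:0)"
)

# Hypothetical pattern, derived from this one line only.
EDAC_RE = re.compile(
    r"EDAC (?P<mc>MC\d+): (?P<count>\d+) (?P<kind>CE|UE) (?P<msg>.+?) on "
    r"(?P<label>\S+) \((?P<details>[^)]*)\)"
)


def parse_edac(line):
    """Return the fields of an EDAC message as a dict, or None if no match."""
    match = EDAC_RE.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["count"] = int(info["count"])
    # The parenthesised part is a run of key:value tokens; split each token
    # on its first colon and skip tokens without one (such as the lone "-").
    details = {}
    for token in match.group("details").split():
        key, sep, value = token.partition(":")
        if sep:
            details[key] = value
    info["details"] = details
    return info


info = parse_edac(EDAC_LINE)
print(info["mc"], info["kind"], info["label"])
print("socket", info["details"]["socket"],
      "channel", info["details"]["channel"],
      "slot", info["details"]["slot"])
```

Here `CE` marks a corrected error (as opposed to `UE`, uncorrected), and the socket, channel and slot values are the ones to match against the DIMM labels in the BIOS or IPMI event log.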
redpill85 (Author) Posted October 17, 2020 (edited)
Thanks for the reply. FWIW, I ran a memtest overnight and all of today so far with no errors (3 passes thus far). How many times does memtest need to pass before I can reasonably conclude something? I've read different things. It also says it's running the test on only "socket 1": "Cores active: 1, 1 total". I have dual CPUs. I guess there is an option somewhere to run it on the other socket?
JorgeB Posted October 18, 2020
Unless ECC can be disabled in the BIOS, Memtest won't find any errors if ECC is correcting them. Check the system event log in the BIOS/IPMI; there should be more info there.
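Since corrected errors are invisible to Memtest, one way to keep an eye on them from the running server is the kernel's EDAC counters. The sketch below assumes the EDAC driver is loaded (the syslog line earlier in the thread suggests it is) and that the counters sit in the usual Linux sysfs location, `/sys/devices/system/edac/mc/mc*/ce_count` and `ue_count`. The `read_edac_counts` helper is a hypothetical name, not part of Unraid or any plugin.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Collect corrected (ce) and uncorrected (ue) error counts per controller.

    Returns a dict such as {"mc0": {"ce_count": 0, "ue_count": 0}}.
    If the directory does not exist, the result is simply empty.
    """
    counts = {}
    for mc_dir in sorted(Path(root).glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            counter = mc_dir / name
            if counter.is_file():
                entry[name] = int(counter.read_text().strip())
        counts[mc_dir.name] = entry
    return counts
```

Calling `read_edac_counts()` before and after a stretch of uptime and comparing the `ce_count` values shows whether ECC is still quietly correcting errors, even while Memtest reports clean passes.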
redpill85 (Author) Posted October 18, 2020
4 hours ago, JorgeB said: "Unless ECC can be disabled in the BIOS, Memtest won't find any errors if ECC is correcting them. Check the system event log in the BIOS/IPMI; there should be more info there."
That is good to know, thank you. I found in the BIOS event logs which DIMM slot was causing the error. However, I was also getting a message about my CMOS battery dying, so I changed that out. I'm no longer getting any memory read errors in the BIOS event logs or in Fix Common Problems. Has anyone seen a dying CMOS battery cause an issue like this before? Also, one of my system fans stopped working last night after I switched the fan speed in IPMI, and then I lost control over the fan speed setting altogether. That was fixed as well after I replaced the CMOS battery.
redpill85 (Author) Posted October 18, 2020
I ran a Fix Common Problems scan just a minute ago and got the same error. Looks like I'll just have to replace the bad stick.