October 17, 2020
Hey folks, I'm getting an error that I've seen posted a few times on here: a hardware issue reported by Fix Common Problems. The error seems to be memory related. I just wanted some more experienced eyes on it to confirm, as I am new to server builds and UnRaid. I'm also running a memtest now. Thanks in advance! unraid-diagnostics-20201016-2056.zip
October 17, 2020
Oct 16 20:17:20 UnRaid kernel: EDAC MC1: 1 CE memory read error on CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0x87fa42 offset:0xcc0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0091 socket:1 ha:0 channel_mask:2 rank:0)
Bad memory. Your BIOS event log will probably have more information on which stick it is, beyond Channel 1, DIMM 0.
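For anyone reading this later: the EDAC line above already encodes the memory controller, channel and slot. Here is a minimal Python sketch for pulling those fields out, assuming the log format is exactly the one posted in this thread. The regex and the `parse_edac_ce` name are illustrative only and are not part of any Unraid or kernel tool.

```python
import re

# Pattern for the EDAC corrected-error (CE) line format shown in this thread.
# Assumption: other kernels or EDAC drivers may word the message differently.
EDAC_CE = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<count>\d+) CE .*? on (?P<label>\S+) "
    r"\(channel:(?P<channel>\d+) slot:(?P<slot>\d+) page:(?P<page>0x[0-9a-fA-F]+)"
)


def parse_edac_ce(line):
    """Return a dict of fields from one EDAC CE log line, or None if no match."""
    match = EDAC_CE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the purely numeric fields; label and page stay as strings.
    for key in ("mc", "count", "channel", "slot"):
        fields[key] = int(fields[key])
    return fields


# The log line quoted in the post above, used as sample input.
sample = (
    "Oct 16 20:17:20 UnRaid kernel: EDAC MC1: 1 CE memory read error on "
    "CPU_SrcID#1_Ha#0_Chan#1_DIMM#0 (channel:1 slot:0 page:0x87fa42 "
    "offset:0xcc0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0091 "
    "socket:1 ha:0 channel_mask:2 rank:0)"
)

if __name__ == "__main__":
    print(parse_edac_ce(sample))
```

How the channel and slot numbers map to a physical DIMM socket depends on the motherboard, so the BIOS/IPMI event log remains the better source for identifying the actual stick.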
October 17, 2020 Author
Thanks for the reply. FWIW, I ran a memtest overnight and all of today so far with no errors (3 passes thus far). How many times does memtest need to pass before I can reasonably conclude something? I've read different things. It also says it's running the test on only "socket 1". Cores active: 1, 1 total. I have dual CPUs. I guess there is an option somewhere to run it on the other socket?
Edited October 17, 2020 by redpill85
October 18, 2020 Community Expert
Unless ECC can be disabled in the BIOS, Memtest won't find any errors if ECC is correcting them. Check the system event log in the BIOS/IPMI; there should be more info there.
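As a complement to the BIOS/IPMI event log, the Linux EDAC driver normally exposes corrected and uncorrected error counters under `/sys/devices/system/edac/mc`. Below is a rough Python sketch for reading them, assuming the EDAC driver is loaded and uses the usual `ce_count` / `ue_count` file names. `read_edac_counts` is a hypothetical helper written for illustration, not something shipped with Unraid.

```python
from pathlib import Path


def read_edac_counts(base="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce": int, "ue": int}} for each mc* directory.

    Assumption: `base` is the standard EDAC sysfs location; an empty dict
    is returned if it does not exist (for example, driver not loaded).
    """
    counts = {}
    root = Path(base)
    if not root.is_dir():
        return counts
    for mc_dir in sorted(root.glob("mc[0-9]*")):
        entry = {}
        # ce_count = corrected errors, ue_count = uncorrected errors.
        for key, filename in (("ce", "ce_count"), ("ue", "ue_count")):
            counter = mc_dir / filename
            if counter.is_file():
                entry[key] = int(counter.read_text().strip())
        counts[mc_dir.name] = entry
    return counts


if __name__ == "__main__":
    for name, entry in read_edac_counts().items():
        print(name, entry)
```

A non-zero `ce` value on a controller while Memtest keeps passing would be consistent with ECC silently correcting the errors, which is the situation described above.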
October 18, 2020 Author
4 hours ago, JorgeB said: Unless ECC can be disabled in the BIOS, Memtest won't find any errors if ECC is correcting them. Check the system event log in the BIOS/IPMI; there should be more info there.
That is good to know, thank you. I did find in the BIOS event logs which DIMM slot was causing the error. However, I was also getting a message about my CMOS battery dying, so I changed that out. I'm no longer getting any memory read errors in the BIOS event logs or Fix Common Problems. Has anyone seen a dying CMOS battery cause an issue like this before? Also, one of my system fans stopped working last night after I switched the fan speed in IPMI, and then I lost control over the fan speed setting altogether. That was fixed as well after I replaced the CMOS battery.
October 18, 2020 Author
I did a Fix Common Problems scan just a minute ago and I got the same error. Looks like I'll just have to replace the bad stick.