bluesky509 Posted May 3, 2020
The "Fix Common Problems" plugin detected an error:
Quote: Your server has detected hardware errors. You should install mcelog via the NerdPack plugin, post your diagnostics and ask for assistance on the unRaid forums. The output of mcelog (if installed) has been logged
Posting my diag. log here as the plugin suggests and hoping someone could take a quick look. I am not having any obvious issues at the moment. Everything seems to be working fine. Thanks.
tower-diagnostics-20200503-1711.zip
Squid Posted May 4, 2020
Apr 24 20:45:31 Tower kernel: EDAC sbridge MC1: HANDLING MCE MEMORY ERROR
Apr 24 20:45:31 Tower kernel: EDAC MC1: 1 CE memory scrubbing error on CPU_SrcID#0_Ha#0_Chan#2_DIMM#0 (channel:2 slot:0 page:0x44721c offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c2 socket:0 ha:0 channel_mask:4 rank:1)
Memory error. Check your system event log for more info.
bluesky509 Posted June 30, 2020
On 5/4/2020 at 2:50 AM, Squid said:
Apr 24 20:45:31 Tower kernel: EDAC sbridge MC1: HANDLING MCE MEMORY ERROR
Apr 24 20:45:31 Tower kernel: EDAC MC1: 1 CE memory scrubbing error on CPU_SrcID#0_Ha#0_Chan#2_DIMM#0 (channel:2 slot:0 page:0x44721c offset:0x0 grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c2 socket:0 ha:0 channel_mask:4 rank:1)
Memory error. Check your system event log for more info.
Thank you. Are there any hints as to which of my 2 memory sticks may be failing? It's been a long time since I ran memtest86, but I remember it taking a long time to run.
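On the "which stick" question: the EDAC line Squid quoted already names a location (socket:0, channel:2, slot:0). Here is a minimal Python sketch that pulls those fields out of a line in that format. The function name `parse_edac_line` and the regular expressions are my own assumptions based only on the two EDAC lines in this thread, not an official parser, and mapping a channel/slot pair to a physical DIMM slot still needs the motherboard manual.

```python
import re

# Sample line copied from the syslog excerpt quoted in this thread.
LINE = (
    "Apr 24 20:45:31 Tower kernel: EDAC MC1: 1 CE memory scrubbing error on "
    "CPU_SrcID#0_Ha#0_Chan#2_DIMM#0 (channel:2 slot:0 page:0x44721c offset:0x0 "
    "grain:32 syndrome:0x0 - area:DRAM err_code:0008:00c2 socket:0 ha:0 "
    "channel_mask:4 rank:1)"
)


def parse_edac_line(line):
    """Return the location fields of an EDAC memory-error line, or None.

    Hypothetical helper: the pattern is inferred from the log lines in this
    thread and may not match the output of other EDAC drivers.
    """
    head = re.search(r"EDAC (MC\d+): (\d+) (CE|UE) (.+?) on (\S+) \((.*)\)", line)
    if head is None:
        return None
    controller, count, kind, description, label, details = head.groups()
    # The parenthesised part is a list of key:value pairs separated by spaces.
    fields = dict(re.findall(r"(\w+):(\S+)", details))
    return {
        "controller": controller,
        "count": int(count),
        "kind": kind,  # CE = corrected error, UE = uncorrected error
        "description": description,
        "label": label,
        "socket": int(fields["socket"]),
        "channel": int(fields["channel"]),
        "slot": int(fields["slot"]),
        "rank": int(fields["rank"]),
    }


if __name__ == "__main__":
    info = parse_edac_line(LINE)
    print(info["label"], "socket", info["socket"],
          "channel", info["channel"], "slot", info["slot"])
```

Running this over the whole syslog and counting how often each channel/slot pair appears would show whether the errors keep landing on the same DIMM.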
ghost82 Posted July 1, 2020
I'm not 100% sure, but memtest86 may not show that your RAM has a problem if it's ECC, since ECC corrects the errors (as long as it is still able to do so). I had a similar problem in the past: the errors pointed to RAM, but memtest came back OK. After changing 2 sticks of RAM, which seemed to solve the issue, I found out that the problem came from one of the 2 CPUs in my server. So errors that look RAM-related may come from other parts of the server.
My specific error was:
EDAC MC0: 1 CE memory read error on CPU_SrcID#0_Ha#0_Chan#3_DIMM#0 (channel:3 slot:0 page:0x1b2c39 offset:0x8c0 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0093 socket:0 ha:0 channel_mask:8 rank:1)
EDAC sbridge MC0: CPU 0: Machine Check Event: 0 Bank 5: 8c00004000010093
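Both EDAC messages in this thread also carry a channel_mask field. The sketch below assumes (the thread does not confirm it) that the field is a bitmask with one bit per memory channel. That reading is at least consistent with both lines, where channel:2 pairs with channel_mask:4 and channel:3 pairs with channel_mask:8. The helper name `channels_from_mask` is hypothetical.

```python
def channels_from_mask(mask):
    """List the channel numbers whose bits are set in mask.

    Assumption: bit N of channel_mask stands for memory channel N.
    """
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    # Values taken from the two log lines posted in this thread.
    for mask in (4, 8):
        print("channel_mask", mask, "-> channels", channels_from_mask(mask))
```

If that assumption holds, a mask with more than one bit set would mean the error could not be pinned to a single channel.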
JorgeB Posted July 1, 2020
Memtest can still work if there's an option in your BIOS to disable ECC.