January 4, 2020 Hello, I am getting an error from the Fix Common Problems plugin, and it told me to post my mcelog for more help. Is anyone able to tell me what any of this means? I'm pretty new to Linux. (Attachment: syslog) Thanks!
January 4, 2020 Bad memory
Jan 3 18:57:32 Linus kernel: mce: [Hardware Error]: Machine check events logged
Jan 3 18:57:32 Linus kernel: EDAC MC0: 1 CE read ECC error on CPU#0Channel#0_DIMM#0 (channel:0 slot:0 page:0x21e664 offset:0x4c0 grain:8 syndrome:0x20 - read error)
Jan 3 18:58:20 Linus kernel: mce: [Hardware Error]: Machine check events logged
Jan 3 18:58:20 Linus kernel: EDAC MC0: 1 CE read ECC error on CPU#0Channel#0_DIMM#0 (channel:0 slot:0 page:0x21e664 offset:0x4c0 grain:8 syndrome:0x20 - read error)
Jan 3 18:58:20 Linus kernel: EDAC i7core: New Corrected error(s): dimm0: +1, dimm1: +0, dimm2 +0
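For anyone reading those lines for the first time: "CE" means a corrected error, and the channel/slot fields point at the DIMM that reported it. Here is a rough Python sketch that pulls those fields out of one of the log lines above. The regex is only written against the line format shown in this thread, the function name `parse_edac_line` is made up for illustration, and the physical address calculation assumes 4 KiB pages.

```python
import re

# Matches the "EDAC MCn: <count> CE|UE ... (channel:.. slot:.. page:.. offset:..)"
# format seen in this thread. Other EDAC drivers may format lines differently.
EDAC_RE = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<count>\d+) (?P<kind>CE|UE) .*?"
    r"\(channel:(?P<channel>\d+) slot:(?P<slot>\d+) "
    r"page:(?P<page>0x[0-9a-fA-F]+) offset:(?P<offset>0x[0-9a-fA-F]+)"
)


def parse_edac_line(line):
    """Return the fields of one EDAC error line, or None if it does not match."""
    m = EDAC_RE.search(line)
    if m is None:
        return None
    page = int(m["page"], 16)
    offset = int(m["offset"], 16)
    return {
        "mc": int(m["mc"]),            # memory controller index
        "count": int(m["count"]),      # errors reported in this event
        "kind": m["kind"],             # "CE" = corrected, "UE" = uncorrected
        "channel": int(m["channel"]),
        "slot": int(m["slot"]),
        # Assumption: 4 KiB pages, so the page number is shifted by 12 bits.
        "phys_addr": (page << 12) | offset,
    }


SAMPLE = (
    "Jan 3 18:57:32 Linus kernel: EDAC MC0: 1 CE read ECC error on "
    "CPU#0Channel#0_DIMM#0 (channel:0 slot:0 page:0x21e664 offset:0x4c0 "
    "grain:8 syndrome:0x20 - read error)"
)

if __name__ == "__main__":
    info = parse_edac_line(SAMPLE)
    print(info["kind"], info["channel"], info["slot"], hex(info["phys_addr"]))
```

Both logged events hit the same page and offset on channel 0, slot 0, which is consistent with one weak spot on a single DIMM rather than random errors across all of the memory.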
January 4, 2020 Author 4 minutes ago, Squid said: Bad memory
Thank you! Is this something I should worry about? I'll be honest, the error has been there for probably around 6 months.
January 5, 2020 On 1/4/2020 at 11:02 AM, aco262 said: Is this something I should worry about?
Only if you care about the uptime on your server. If the memory errors are ever not correctable, it will lock solid or restart to attempt to avoid further corruption. It's also possible the memory is fine, just being pushed past its limits. I'd play with slowing the memory down or maybe increasing the voltage just a tiny bit and see if the errors change.
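If you do change the memory speed or voltage, one way to see whether the errors change is to compare the kernel's EDAC counters before and after. A minimal Python sketch, assuming the EDAC driver is loaded and exposes the usual `ce_count` (corrected) and `ue_count` (uncorrected) files under `/sys/devices/system/edac/mc`; the helper name `read_edac_counts` is hypothetical and the path is a parameter in case your system differs.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce_count": int|None, "ue_count": int|None}}.

    A value of None means the counter file was missing or unreadable.
    """
    counts = {}
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                entry[name] = None
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    for controller, entry in read_edac_counts().items():
        print(controller, "corrected:", entry["ce_count"],
              "uncorrected:", entry["ue_count"])
```

The counters reset on reboot, so note the uptime alongside them. A corrected count that keeps climbing after slowing the memory down points back at the DIMM itself.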
January 5, 2020 And without digging into the diagnostics and googling a ton of stuff, it should be noted that any and all overclocks introduce instability into any system and any OS. XMP / AMP memory profiles (which unfortunately tend to be enabled by default in the BIOS nowadays) are an overclock. Just because a DIMM could handle the XMP/AMP overclock at the time of manufacture does not mean that it still can a day, a week, or a year down the road. YMMV
January 5, 2020 On 1/4/2020 at 11:02 AM, aco262 said: Thank you! Is this something I should worry about? I'll be honest, the error has been there for probably around 6 months.
Isn't the whole point of ECC memory that you should worry about it once the error happens? If you aren't going to replace the memory once errors begin to happen, then why bother with the extra expense of ECC in the first place? Completely up to you though. The errors are being corrected for the time being.