Machine Check Events Error


Bad memory

Jan  3 18:57:32 Linus kernel: mce: [Hardware Error]: Machine check events logged
Jan  3 18:57:32 Linus kernel: EDAC MC0: 1 CE read ECC error on CPU#0Channel#0_DIMM#0 (channel:0 slot:0 page:0x21e664 offset:0x4c0 grain:8 syndrome:0x20 - read error)
Jan  3 18:58:20 Linus kernel: mce: [Hardware Error]: Machine check events logged
Jan  3 18:58:20 Linus kernel: EDAC MC0: 1 CE read ECC error on CPU#0Channel#0_DIMM#0 (channel:0 slot:0 page:0x21e664 offset:0x4c0 grain:8 syndrome:0x20 - read error)
Jan  3 18:58:20 Linus kernel: EDAC i7core: New Corrected error(s): dimm0: +1, dimm1: +0, dimm2 +0
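If you would rather tally these than eyeball the syslog, here is a rough Python sketch that counts corrected errors (CE) per DIMM label and per page/offset. The names `EDAC_CE` and `summarize_ce` are made up for illustration, and the regex is written only against the line format shown in the log above; other EDAC drivers may format their messages differently.

```python
import re
from collections import Counter

# Pattern written against the "EDAC MCn: N CE ... on <label> (channel:.. slot:.. page:.. offset:.."
# lines in the log above. This is an assumption about the format, not a general EDAC parser.
EDAC_CE = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<count>\d+) CE .*? on (?P<label>\S+) "
    r"\(channel:(?P<channel>\d+) slot:(?P<slot>\d+) "
    r"page:(?P<page>0x[0-9a-fA-F]+) offset:(?P<offset>0x[0-9a-fA-F]+)"
)


def summarize_ce(lines):
    """Return (errors per DIMM label, errors per (page, offset)) for corrected errors."""
    per_dimm = Counter()
    per_addr = Counter()
    for line in lines:
        m = EDAC_CE.search(line)
        if not m:
            continue  # skip lines that are not EDAC corrected-error reports
        n = int(m["count"])
        per_dimm[m["label"]] += n
        per_addr[(m["page"], m["offset"])] += n
    return per_dimm, per_addr
```

Feed it the lines of your syslog (for example `summarize_ce(open(path))`, with `path` pointing at wherever your log lives). If every hit lands on the same label and the same page/offset, as in the excerpt above, that points at one specific spot on one DIMM.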

 

4 minutes ago, Squid said:

Bad memory



 

Thank you! Is this something I should worry about? I'll be honest, the error has probably been there for around 6 months.

On 1/4/2020 at 11:02 AM, aco262 said:

Is this something I should worry about?

Only if you care about the uptime on your server. If a memory error ever turns out to be uncorrectable, the server will lock up solid or restart to try to avoid further corruption.

 

It's also possible the memory is fine and is just being pushed past its limits. I'd play with slowing the memory down or maybe increasing the voltage just a tiny bit and see if the errors change.
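To judge whether a change like that makes any difference, it helps to compare the error counters before and after rather than wait for new syslog lines. A minimal sketch, assuming your kernel's EDAC driver exposes the usual `ce_count` (corrected) and `ue_count` (uncorrected) files under `/sys/devices/system/edac/mc/`; the function name `read_edac_counts` is made up for illustration.

```python
from pathlib import Path


def read_edac_counts(base="/sys/devices/system/edac/mc"):
    """Return {"mc0": {"ce_count": N, "ue_count": M}, ...} from EDAC sysfs counters.

    Assumes the standard EDAC sysfs layout; returns {} if the directory is absent
    (for example when no EDAC driver is loaded).
    """
    root = Path(base)
    if not root.is_dir():
        return {}
    counts = {}
    for mc in sorted(root.glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            counter = mc / name
            if counter.is_file():
                entry[name] = int(counter.read_text().strip())
        counts[mc.name] = entry
    return counts
```

Note the values, make one change in the BIOS, let the server run for a while under its normal load, then read them again. These counters normally start from zero on each boot, so compare similar uptimes.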


And without digging into the diagnostics and googling a ton of stuff, it should be noted that any and all overclocks introduce instability into any system and any OS.

 

XMP / AMP memory profiles (which unfortunately tend to be enabled by default in the BIOS nowadays) are an overclock. Just because a DIMM could handle the XMP/AMP overclock at the time of manufacture does not mean that it still can a day, a week, or a year down the road. YMMV

On 1/4/2020 at 11:02 AM, aco262 said:

Thank you! Is this something I should worry about? I'll be honest, the error has probably been there for around 6 months.

Isn't the whole point of ECC memory that you should worry about it once the error happens?  If you aren't going to replace the memory once errors begin to happen, then why bother with the extra expense of ECC in the first place?  Completely up to you though.  The errors are being corrected for the time being.
