Machine Check Events and/or parity read errors


Solved by JorgeB

I'm not sure if these two items are correlated, but I just woke up to an MCE error in Fix Common Problems, and also a disabled parity drive due to read errors.

 

I was having read errors with multiple drives a couple of weeks back, so I did extensive RAM testing, but the problem ended up being resolved (I think?) by moving my HDD SATA power plugs to a different PSU rail. Now this single drive has come back with more errors (the system disabled it).

 

May 20 08:07:38 Osgiliath kernel: mce: [Hardware Error]: Machine check events logged
May 20 08:07:38 Osgiliath kernel: [Hardware Error]: Deferred error, no action required.
May 20 08:07:38 Osgiliath kernel: [Hardware Error]: CPU:1 (19:21:0) MC12_STATUS[Over|-|-|AddrV|PCC|SyndV|UECC|Deferred|Poison|Scrub]: 0xc765ffc883007f37
May 20 08:07:38 Osgiliath kernel: [Hardware Error]: Error Addr: 0x0000000000000000
May 20 08:07:38 Osgiliath kernel: [Hardware Error]: IPID: 0x0000000000000000, Syndrome: 0x0000000000000000
May 20 08:07:38 Osgiliath kernel: [Hardware Error]: Bank 12 is reserved.
May 20 08:07:38 Osgiliath kernel: [Hardware Error]: cache level: L3/GEN, tx: DATA
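For anyone trying to read the bracketed flags in the MC12_STATUS line above, here is a minimal Python sketch that decodes the raw status value into the same field list. The bit positions are an assumption based on the Linux kernel's MCI_STATUS_* definitions as I recall them, and the field order simply mirrors this particular log line (other kernel versions may print extra fields). The function name `decode_mca_status` is made up for illustration.

```python
# Bit positions are ASSUMPTIONS, recalled from the Linux kernel's
# MCI_STATUS_* definitions; check them against your kernel source.
OVER, UC, MISCV, ADDRV, PCC = 62, 61, 59, 58, 57
SYNDV, DEFERRED, POISON, SCRUB = 53, 44, 43, 40


def is_set(status: int, bit: int) -> bool:
    """Return True if the given bit is set in the 64-bit status value."""
    return bool((status >> bit) & 1)


def decode_mca_status(status: int) -> str:
    """Hypothetical helper: render the flags in the order the log shows them."""

    def flag(bit: int, name: str) -> str:
        return name if is_set(status, bit) else "-"

    fields = [flag(OVER, "Over")]

    # Second field: uncorrected (UE), deferred (shown as "-"), or corrected (CE).
    if is_set(status, UC):
        fields.append("UE")
    elif is_set(status, DEFERRED):
        fields.append("-")
    else:
        fields.append("CE")

    fields += [
        flag(MISCV, "MiscV"),
        flag(ADDRV, "AddrV"),
        flag(PCC, "PCC"),
        flag(SYNDV, "SyndV"),
    ]

    # Bits 46:45 are read together; the field is only shown when non-zero.
    ecc = (status >> 45) & 0x3
    if ecc:
        fields.append("CECC" if ecc == 2 else "UECC")

    fields += [
        flag(DEFERRED, "Deferred"),
        flag(POISON, "Poison"),
        flag(SCRUB, "Scrub"),
    ]
    return "|".join(fields)


if __name__ == "__main__":
    # Status value copied from the log line above.
    print(decode_mca_status(0xC765FFC883007F37))
```

Running it on the value from the log prints `Over|-|-|AddrV|PCC|SyndV|UECC|Deferred|Poison|Scrub`, which matches the kernel's own decoding. It only restates what the kernel already printed, so it won't tell you why the error happened.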

 

 

Any ideas?

 

syslog.txt

Edited by Kudagra
  • 2 months later...
On 6/1/2023 at 12:21 AM, JorgeB said:

Stop overclocking RAM, it's a known issue with Ryzen, if it doesn't help try with a new CPU if possible.

 

Disk errors look more like a power/connection problem, replace both cables for that disk.

 

I disabled the RAM overclocking, but unfortunately I'm still receiving the 'Machine Check Events' error (most recent diagnostics attached). I don't have a second CPU on hand to try, so do you have any suggestions on how I can test the health of this CPU?

 

As for the disk errors (for anyone else seeing this problem): I resolved them with a new PSU. Apparently one of the rails on my previous PSU was failing.

diagnostics-20230808-0941.zip

  • 3 weeks later...
On 8/11/2023 at 10:09 AM, JorgeB said:

Not really, best bet would be using a different one, also look for a BIOS update but that shouldn't do much for this.

Unfortunately the BIOS is already up to date, but I did have this error pop up in my log. Do you think it's related?

 

 

 

Edited by Kudagra
  • 5 months later...

Just to provide an update on this problem: it ended up being a faulty CPU, as JorgeB suggested.

 

I successfully RMA'd my 5950X a few weeks back (meaning it failed their testing), and I haven't seen the errors since installing the replacement CPU.
