jfoxwoosh Posted July 31, 2022 (edited)

I have been consistently getting this hardware error:

Jul 31 09:10:44 kernel: mce: [Hardware Error]: Machine check events logged
Jul 31 09:10:44 kernel: [Hardware Error]: Corrected error, no action required.
Jul 31 09:10:44 kernel: [Hardware Error]: CPU:1 (19:21:0) MC15_STATUS[-|CE|-|-|PCC|-|UECC|-|-|-]: 0x820b64a7c7c748ee
Jul 31 09:10:44 kernel: [Hardware Error]: IPID: 0x0000000000000000
Jul 31 09:10:44 kernel: [Hardware Error]: Microprocessor 5 Unit Ext. Error Code: 7, Instruction Cache Bank B ECC or parity error.
Jul 31 09:10:44 kernel: [Hardware Error]: cache level: L2, tx: RESV

Some googling suggests this points to a defect in early Ryzen CPUs. However, I haven't found any other report like this online for a Zen 3 Ryzen (5900X). Should this error be of concern, and might it require warranty service with AMD? My system diagnostics are attached.

diagnostics-20220731-1416.zip

Edited November 15, 2022 by jfoxwu
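For anyone who wants to cross-check the bracketed flags in the MC15_STATUS line against the raw value, here is a minimal Python sketch. It assumes the standard x86 machine-check status layout for the top bits (Val 63, Over 62, UC 61, En 60, MiscV 59, AddrV 58, PCC 57). The function name `decode_mca_status` is made up for illustration, and the vendor-specific lower bits are deliberately left undecoded.

```python
# Sketch only: decodes the architectural flag bits (63..57) of an x86
# machine-check status register. Lower, vendor-specific fields are ignored.
FLAG_BITS = {
    "Val": 63,    # register contains a valid error
    "Over": 62,   # an earlier error was overwritten
    "UC": 61,     # uncorrected error (clear means corrected, "CE" in the log)
    "En": 60,     # error reporting was enabled for this error
    "MiscV": 59,  # the MISC register holds valid data
    "AddrV": 58,  # the ADDR register holds a valid address
    "PCC": 57,    # processor context corrupt
}


def decode_mca_status(status: int) -> dict:
    """Return a mapping of flag name to True/False for the given raw status."""
    return {name: bool((status >> bit) & 1) for name, bit in FLAG_BITS.items()}


# Raw value taken from the log line in the post above.
flags = decode_mca_status(0x820B64A7C7C748EE)
set_flags = [name for name, is_set in flags.items() if is_set]
print(set_flags)  # ['Val', 'PCC']
```

With UC clear and PCC set, this lines up with the "CE" and "PCC" fields the kernel printed; it says nothing about the cause of the error.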
JorgeB Posted August 1, 2022

If the error has been consistent, I would try to get a replacement.
jfoxwoosh (Author) Posted November 15, 2022 (edited) Solution

I have identified the cause of the MCE error message. I needed to set Power Supply Idle Control to "Typical Current Idle" in the BIOS for the CPU, as described in this post by JorgeB. Thanks!

I remember having this set correctly when the server was first built. However, a later motherboard BIOS update reset all BIOS settings, and I forgot to change this one back.

It has been 30+ days now, and the server hasn't logged any error message like the one above.

Edited November 16, 2022 by jfoxwu (typo)