Hardware errors detected, likely a RAM stick, but which?


Earan


Hi everyone, for a while now my Unraid server has been throwing hardware errors every now and then, and they seem to be RAM related. I recently saw this on the screen attached to it:

[Attached image: photo of the error messages on the server's screen]

Here are the parts I'm using:

  • Supermicro MBD-H11DSi-NT-B
  • 2x AMD Epyc 7301
  • 8x 16 GB Kingston Server Premier KSM26RD8/16HAI DDR4-2666 registered ECC

 

One RAM stick seems to have issues: after some reboots the server reports 112 GB of memory instead of 128 GB, which is exactly one 16 GB module missing.

How do I find out which RAM stick it is, given that the errors only come up infrequently?

Are there other issues in the logs on the screen?
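One way to keep an eye on this between the infrequent errors might be to poll the kernel's EDAC counters. The sketch below assumes the EDAC driver is loaded and exposes the legacy per-csrow files under /sys/devices/system/edac/mc (that layout depends on the kernel configuration, so it may not exist on every system). The function name `read_ce_counts` is made up for illustration. The counters only live in kernel memory, so they start from zero again after a reboot.

```python
from pathlib import Path

# Assumed location of the EDAC memory-controller tree; may be absent
# if the EDAC driver or its legacy csrow sysfs interface is not enabled.
EDAC_ROOT = Path("/sys/devices/system/edac/mc")


def read_ce_counts(root=EDAC_ROOT):
    """Return {(mc_name, csrow_name): corrected_error_count}.

    Hypothetical helper: walks mc*/csrow*/ce_count under `root` and
    reads each counter as an integer.
    """
    counts = {}
    for ce_file in sorted(Path(root).glob("mc*/csrow*/ce_count")):
        csrow_dir = ce_file.parent
        mc_dir = csrow_dir.parent
        counts[(mc_dir.name, csrow_dir.name)] = int(ce_file.read_text().strip())
    return counts


def nonzero(counts):
    """Keep only the csrows that have logged at least one corrected error."""
    return {key: n for key, n in counts.items() if n > 0}
```

Calling `nonzero(read_ce_counts())` from time to time would show which controller and csrow keeps accumulating errors. Translating an mc/csrow pair into a physical DIMM slot is board specific, so that step still needs the motherboard manual or a swap test.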


Neither on the CLI with IPMITool from the Nerdpack, nor with the IPMI support plugin can I see any RAM-related issues. After downloading the full syslog, I see quite a few events like the one on the screen, but all of them say:

[Hardware Error]: Corrected error, no action required.

They also don't really state which RAM slot it is, or at least I cannot make it out.

 

This is one full event:

Dec 16 21:46:50 itXsvr kernel: mce: [Hardware Error]: Machine check events logged
Dec 16 21:46:50 itXsvr kernel: [Hardware Error]: Corrected error, no action required.
Dec 16 21:46:50 itXsvr kernel: [Hardware Error]: CPU:8 (17:1:2) MC15_STATUS[Over|CE|MiscV|AddrV|-|-|SyndV|CECC|-|-|-]: 0xdc2040000000011b
Dec 16 21:46:50 itXsvr kernel: [Hardware Error]: Error Addr: 0x0000000143092400
Dec 16 21:46:50 itXsvr kernel: [Hardware Error]: IPID: 0x0000009600050f00, Syndrome: 0x000067100a400401
Dec 16 21:46:50 itXsvr kernel: [Hardware Error]: Unified Memory Controller Ext. Error Code: 0, DRAM ECC error.
Dec 16 21:46:50 itXsvr kernel: EDAC MC2: 1 CE on mc#2csrow#1channel#0 (csrow:1 channel:0 page:0x973092 offset:0x400 grain:64 syndrome:0x6710)
Dec 16 21:46:50 itXsvr kernel: [Hardware Error]: cache level: L3/GEN, tx: GEN, mem-tx: RD
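The EDAC line in the event above does name a memory controller, csrow and channel (MC2, csrow 1, channel 0), so one way to narrow it down could be to tally those fields across the whole downloaded syslog. This is only a minimal sketch: the regular expression is written against the single line format shown above, and `tally_corrected_errors` is a made-up helper name, not part of any tool.

```python
import re
from collections import Counter

# Matches lines like:
#   EDAC MC2: 1 CE on mc#2csrow#1channel#0 (csrow:1 channel:0 page:... )
# Assumption: every corrected-error line in the syslog follows this format.
EDAC_CE = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<count>\d+) CE on .*?"
    r"\(csrow:(?P<csrow>\d+) channel:(?P<channel>\d+)"
)


def tally_corrected_errors(lines):
    """Sum corrected-error counts per (mc, csrow, channel) tuple."""
    totals = Counter()
    for line in lines:
        match = EDAC_CE.search(line)
        if match:
            key = (
                int(match["mc"]),
                int(match["csrow"]),
                int(match["channel"]),
            )
            totals[key] += int(match["count"])
    return totals


# Usage sketch (the file name is a placeholder for the downloaded syslog):
#   with open("syslog.txt", errors="replace") as fh:
#       for key, n in tally_corrected_errors(fh).most_common():
#           print(key, n)
```

If nearly all events land on the same (mc, csrow, channel) tuple, that points at a single DIMM. Which physical slot that tuple corresponds to is not stated in the log, so it would still have to be confirmed against the board manual or by swapping modules and watching whether the location moves.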

 
