How to find out which memory stick is bad?


After what I suspect was memory overheating (temperatures were at around 100°C), I'm getting MCE memory errors, but I don't know exactly which stick is causing them. The board is a Supermicro X9DRi-LN4F+.

Oct 8 02:05:19 Tower kernel: EDAC MC1: 1 CE memory read error on CPU_SrcID#1_Ha#0_Chan#0_DIMM#2 (channel:0 slot:2 page:0x561ee8 offset:0x100 grain:32 syndrome:0x0 - area:DRAM err_code:0001:0090 socket:1 ha:0 channel_mask:1 rank:8)

How can I find out which physical slot "channel:0 slot:2" on socket 1 corresponds to? The manual and everything else I can use to figure it out only label the slots as P1 DIMMA1, P1 DIMMA2, etc.

Since I have 24 sticks in there, I was wondering if there is a faster way to figure out which one is bad than taking out one stick at a time and waiting for the error to appear again.
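
To at least see whether the errors always come from the same EDAC location, the kernel lines can be tallied per label. This is only a rough sketch, assuming every EDAC line in the syslog has the same shape as the one quoted above; the function name `tally_edac_errors` is made up for illustration, and it does not translate the EDAC label into the board's P1 DIMMA1 style names.

```python
import re
from collections import Counter

# Assumption: EDAC lines look like the one quoted above, i.e.
# "EDAC MC<n>: <count> CE|UE ... on <label> (channel:<c> slot:<s> ...".
EDAC_RE = re.compile(
    r"EDAC MC(?P<mc>\d+): (?P<count>\d+) (?P<kind>CE|UE) .*? on "
    r"(?P<label>\S+) \(channel:(?P<channel>\d+) slot:(?P<slot>\d+)"
)


def tally_edac_errors(lines):
    """Sum reported error counts per (kind, mc, channel, slot, label)."""
    counts = Counter()
    for line in lines:
        match = EDAC_RE.search(line)
        if match is None:
            continue  # not an EDAC memory error line
        key = (
            match["kind"],
            int(match["mc"]),
            int(match["channel"]),
            int(match["slot"]),
            match["label"],
        )
        counts[key] += int(match["count"])
    return counts
```

Feeding it the lines of the syslog (any iterable of strings works, including an open file) gives one counter entry per affected location. If only one entry shows up, a single stick or slot is the likely suspect.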

I can't run memtest, as it always gives me the message "Booting kernel failed: Invalid argument" when I try to select memtest86.

tower-diagnostics-20191008-0025.zip
