September 27, 2023

Hi. I'm testing the memory of new server hardware and got these ECC errors this morning, after about 60 hours of total (interrupted) memtest runtime. The total error count is zero, so the hardware corrected these issues.

I didn't mean to run memtest this long, but I had three power failures at home and the server is not connected to a UPS, so memtest restarted each time and continued testing. One full test pass (~20 hours) completed without any errors or ECC errors. These errors began this morning, about 15 minutes after the third power failure. All logs (found on the memtest86 USB stick) besides the current one show no errors and no corrected ECC errors.

Any recommendations? Should I start looking for a malfunctioning DIMM? Maybe run several passes of only the failed tests, this time with a UPS? Each pass would take about 4 hours.

The hardware was bought used:
MB: Supermicro HL12SSL-I
CPU: EPYC 7302 + 4U fan
Mem: 8x64GB Samsung PC4-2666V Registered ECC DDR4
PSU: Corsair HX1200i

Edited September 27, 2023 by Gico
September 27, 2023

15 minutes ago, Gico said: Should I start looking for a malfunctioning DIMM?

Look at the system event log in the BIOS, or the IPMI log; it may show the affected DIMM.
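If the BMC is reachable from a running OS, the event log can also be filtered with a script. The following is only a minimal sketch: it assumes `ipmitool` is installed and that `ipmitool sel elist` prints pipe-delimited lines (ID, date, time, sensor, event, direction). The function names are hypothetical, and the exact sensor and event wording varies by board and firmware, so the keyword match may need adjusting.

```python
import subprocess


def find_ecc_events(sel_text):
    """Return SEL entries whose text mentions ECC.

    Assumes pipe-delimited lines: id | date | time | sensor | event | ...
    Lines with fewer than five fields are skipped.
    """
    events = []
    for line in sel_text.splitlines():
        fields = [f.strip() for f in line.split("|")]
        if len(fields) < 5:
            continue
        if "ecc" not in line.lower():
            continue
        events.append({
            "id": fields[0],
            "date": fields[1],
            "time": fields[2],
            "sensor": fields[3],
            "event": fields[4],
        })
    return events


def read_sel():
    """Run `ipmitool sel elist` and return its stdout (usually needs root)."""
    result = subprocess.run(
        ["ipmitool", "sel", "elist"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    for ev in find_ecc_events(read_sel()):
        print(ev["date"], ev["time"], ev["sensor"], ev["event"])
```

Whether the sensor field names a specific DIMM slot depends on the board, so an empty or vague result does not rule out a memory problem.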
September 27, 2023 Author

Didn't find anything relevant. The "Health Event Log" in the IPMI has similar errors from 2021. The BIOS only had the configuration for the system event log, not the event log entries. I found an "SMBIOS event log", which wasn't relevant.
October 6, 2023 Author

I tested the memory sticks.
8 passes of 4 sticks completed without any errors.
8 passes of the other 4 sticks (using the same slots as the previous 4) completed without any errors.
8 passes of all 8 sticks together completed with 1 corrected ECC error.

This might indicate that one of the slots used by sticks 5-8 has an issue, but I doubt it.

I don't know if this has anything to do with the ECC errors, but the CPU fan is adjacent to and actually touching one of the sticks. In my initial testing I installed the fan next to that stick, so the fan pushed the stick a little horizontally. For the later tests I installed the stick under the fan, so it might have pushed the fan up a little, as seen in the screenshot.