Can ECC Ram go bad?

The very nature of ECC is that it's error checking, so if a stick develops errors, is there any way to tell that corrections are taking place? How badly does an ECC stick need to fail before ECC is unable to do its job?

Yes, an ECC module can fail. It can correct a single-bit error and detect multi-bit errors, which an OS that supports ECC will report (as long as the errors don't cause a crash).

  • Author

Would Unraid be able to report those errors?

Supermicro boards report correctable ECC errors in the BIOS event log. I don't know about other brands, but I assume they are similar. If there's an uncorrectable error, I believe the server should halt.

As Johnnie noted, Supermicro boards will show any corrected errors in the event log. I'm not certain, but I don't think Unraid reports these anywhere, so you'd only know about them if you check the event log periodically.

If you add the IPMI plugin, you may be able to access an event log that shows these errors. I was able to do so with my Supermicro chassis. Although it has an Intel server motherboard, it recorded ECC failures on a specific piece of RAM and identified which slot it was in, so I could remove it.

If one of your RAM modules starts throwing more than one error once in a while, you probably have a defective module that should be replaced, and RMA'ed if it's within its warranty. I haven't seen any so far in my relatively new Supermicro X11 64GB ECC build, but I will let you know if/when I start noticing ECC errors in the event log.
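
For anyone who wants to check from the OS side rather than the BIOS or IPMI event log, here is a rough sketch in Python. It assumes the Linux EDAC driver for your memory controller is loaded and exposes its counters under /sys/devices/system/edac/mc, which may or may not be the case on a given Unraid box. The function name read_edac_counts is just something made up for this example.

```python
from pathlib import Path


def read_edac_counts(edac_root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce": corrected, "ue": uncorrected}} per memory controller.

    Assumes the kernel EDAC driver is loaded; returns an empty dict otherwise.
    """
    counts = {}
    root = Path(edac_root)
    if not root.is_dir():
        return counts  # no EDAC support visible on this system
    for mc in sorted(root.glob("mc[0-9]*")):
        try:
            ce = int((mc / "ce_count").read_text().strip())
            ue = int((mc / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue  # skip controllers whose counters cannot be read
        counts[mc.name] = {"ce": ce, "ue": ue}
    return counts


if __name__ == "__main__":
    for name, c in read_edac_counts().items():
        print(f"{name}: corrected={c['ce']} uncorrected={c['ue']}")
```

If the corrected count for a controller keeps climbing between checks, that points to the kind of failing module described above. An empty result only means the counters are not exposed, not that the RAM is healthy.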

  • Author

So I've installed the ECC RAM into my Microserver now. I see that single-bit error correction is now reported when I type 'dmidecode --type memory'.

I thought I should have multi-bit ECC as well?
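
As a side note on reading that dmidecode output: the value in question normally appears on an "Error Correction Type:" line in the Physical Memory Array section, and what the firmware puts there (for example "None", "Single-bit ECC" or "Multi-bit ECC") is up to the board vendor. Below is a small sketch that pulls that field out of the text. The SAMPLE output is illustrative only (the handle number and capacity are invented, not taken from the Microserver above), and error_correction_types is a hypothetical helper name.

```python
def error_correction_types(dmidecode_text):
    """Collect the value of every 'Error Correction Type:' line in dmidecode output."""
    types = []
    for line in dmidecode_text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key == "Error Correction Type":
            types.append(value.strip())
    return types


# Illustrative sample only; real output will differ per machine.
SAMPLE = """\
Handle 0x0026, DMI type 16, 23 bytes
Physical Memory Array
	Location: System Board Or Motherboard
	Use: System Memory
	Error Correction Type: Single-bit ECC
	Maximum Capacity: 16 GB
"""

if __name__ == "__main__":
    print(error_correction_types(SAMPLE))
```

To use it on a real system, capture the output of 'dmidecode --type memory' (it usually needs root) and pass the text to the function.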

ECC memory can only correct a single-bit error.

In normal PC applications, ECC DIMM modules are 72 bits wide, and 64 of those bits are data. The normal usage in a PC allows single-bit errors to be corrected at the memory controller (in the processor) and double-bit errors to be detected but not corrected. Errors of more than two bits may or may not be detected, but cannot be corrected. Error correction schemes can be implemented to offer much greater protection than this, but they would require more check bits to be stored with the data. The correction codes and numbers of check bits used with regular processors and memory systems are chosen based on cost, complexity and the relative likelihood of different types of errors.

I worked on error-correcting computer memories back in the late 1970s. The maths behind Hamming codes and similar methods was beyond me, but the hardware to make these things work is fascinating (in a geeky kind of way).
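
To make the "correct one bit, detect two" behaviour concrete, here is a toy sketch of an extended Hamming code in Python. It protects only 4 data bits with 4 check bits, so it is not the 72-bit-wide code a real memory controller uses, but the principle is the same. The function names encode and decode are made up for this example.

```python
def encode(data):
    """Encode 4 data bits (list of 0/1) into an 8-bit SECDED codeword.

    Index 0 holds the overall parity bit; indexes 1..7 are Hamming positions.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    word = [p1, p2, d1, p4, d2, d3, d4]
    overall = 0
    for bit in word:
        overall ^= bit
    return [overall] + word


def decode(word):
    """Return (status, data); status is 'ok', 'corrected' or 'uncorrectable'."""
    w = list(word)
    syndrome = 0
    for pos in range(1, 8):
        if w[pos]:
            syndrome ^= pos  # XOR of set positions is 0 for a valid codeword
    overall = 0
    for bit in w:
        overall ^= bit  # 0 when the total number of set bits is even
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd parity: treat as a single flipped bit at index `syndrome`
        # (syndrome 0 means the overall parity bit itself flipped).
        w[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even parity: two bits flipped.
        return "uncorrectable", None
    return status, [w[3], w[5], w[6], w[7]]


if __name__ == "__main__":
    cw = encode([1, 0, 1, 1])
    cw[5] ^= 1            # one flipped bit is repaired
    print(decode(cw))
    cw[2] ^= 1            # a second flipped bit is only detected
    print(decode(cw))
```

With three or more flipped bits this toy decoder can report "corrected" and hand back wrong data, which is the "may or may not be detected" case mentioned above.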

Archived

This topic is now archived and is closed to further replies.
