ECC Memory or not


tessierp


Hi,

 

I am contemplating building a NAS system and installing PLEX on it. I will primarily be storing pictures, videos and documents on it. I will be using a Ryzen 2700X or maybe a 3700X. I am not sure I will go with a server motherboard, since I do not really need IPMI and most of the boards I have seen are either too expensive or only Micro-ATX. However, that may change depending on the answer to my question, which is the following.

 

I am a bit confused about whether to use ECC RAM or not. From what I read, ECC is supposed to prevent data corruption, but this is something I have rarely seen happen. I have owned a QNAP TS-563 for a while now; it has no ECC RAM and I have never encountered any issue. So, is ECC really that necessary? Has anyone here ever run into a corruption problem? I intend to use 2 parity drives with BTRFS; isn't that sufficient to guard against data corruption?

 

I would appreciate your comments on this.

 

Thanks

Link to comment
53 minutes ago, tessierp said:

 

I am a bit confused about whether to use ECC RAM or not

ECC RAM is not a necessity. That said, of the three servers I currently have in use, two have ECC RAM and one does not.

 

ECC RAM is very expensive right now and many will tell you they have been running their unRAID servers for years without it and without any memory errors and resulting corruption.

 

I have ECC RAM on two servers because the motherboards and CPU support it. I thought, why not?  It's just an added measure of protection in a server that is running 24x7.

 

The thing about ECC RAM is that it corrects single-bit errors, so, it is unlikely you would ever know if it saved you from corruption. It just silently does its job unless the memory corruption is major.
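
To make "corrects single-bit errors" concrete, here is a toy sketch of the idea. It is not how any particular memory controller is implemented: real ECC DIMMs typically protect each 64-bit word with 8 check bits, while this example protects only 4 data bits with 4 check bits (a Hamming code plus one overall parity bit). The function names `encode` and `decode` are made up for illustration.

```python
def encode(nibble):
    """Encode 4 data bits (int 0..15) into an 8-bit codeword (list of bits)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8  # index 0: overall parity; indices 1..7: Hamming(7,4) layout
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]  # check bit covering positions 1,3,5,7
    code[2] = code[3] ^ code[6] ^ code[7]  # check bit covering positions 2,3,6,7
    code[4] = code[5] ^ code[6] ^ code[7]  # check bit covering positions 4,5,6,7
    code[0] = sum(code[1:]) % 2            # overall parity, used to spot double errors
    return code


def decode(code):
    """Return (nibble, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos          # XOR of the positions of all set bits
    parity_broken = sum(code) % 2    # 1 if an odd number of bits flipped
    if syndrome == 0 and not parity_broken:
        status = "ok"
    elif parity_broken:
        code[syndrome] ^= 1          # single flip: the syndrome is its position
        status = "corrected"
    else:
        return None, "uncorrectable" # two flips: detected, but cannot be fixed
    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status


if __name__ == "__main__":
    word = encode(0b1011)
    one_flip = list(word)
    one_flip[5] ^= 1
    two_flips = list(one_flip)
    two_flips[2] ^= 1
    print(decode(word), decode(one_flip), decode(two_flips))
```

A single flipped bit is repaired silently, which is why you rarely notice ECC working. Two flipped bits in the same word can only be detected, not repaired.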

 

Do you have to have it? No. Is it nice to have if it doesn't break the bank? Sure, if you value that extra protection over other things on which you could spend your money.

Edited by Hoopster
Link to comment
8 hours ago, Hoopster said:

The thing about ECC RAM is that it corrects single-bit errors, so, it is unlikely you would ever know if it saved you from corruption. It just silently does its job unless the memory corruption is major.

And, if the memory corruption is irreparable, it halts the computer immediately so you can fix the issue before it corrupts all the data written to the drives. Non-ECC can merrily go on its way, silently putting bad bits into all the data you are trying so hard to keep safe.

 

Granted, memory failure is rare, but it does happen. On this forum we sometimes see probably 1 or 2 instances a week where a server is acting strangely and a memtest reveals bad RAM. Many of those cases are found because of unexplained file system corruption.

 

ECC is an insurance policy. It's just an extra unnecessary cost, until it saves your bacon.

 

If unraid only ever holds true third tier backups, and those backups have a means to verify their validity through some checksum function, then no, ECC probably isn't a good investment. You can always recover corrupted data from your other backups in the unlikely event you have bad RAM.
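
As a rough illustration of "some checksum function" for backups, the sketch below records a SHA-256 hash per file and later reports any file whose contents no longer match. The helper names (`sha256_of`, `build_manifest`, `verify`) are hypothetical and this is not an Unraid feature; it only uses the Python standard library.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large videos do not need to fit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Return the relative paths that are missing or whose hash has changed."""
    root = Path(root)
    bad = []
    for rel, expected in manifest.items():
        p = root / rel
        if not p.is_file() or sha256_of(p) != expected:
            bad.append(rel)
    return bad
```

The manifest would be built once when the backup is known to be good and stored alongside it. Any path returned by `verify` later is a candidate for restoring from another copy.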

Link to comment

@Hoopster @jonathanm Thanks to both of you for your responses. ECC RAM is expensive and also doesn't come with heat sinks, which I always found odd. The timings are also much slower. Granted, this is not a build meant for gaming; however, some people have been building systems using UNRAID as a NAS as well as gaming and workstation machines using hardware passthrough, which I think is very interesting and useful.

 

But I do have a few more questions. ECC RAM is a hardware method to help prevent errors, I get that. I was under the impression that parity drives were a software method of not only preventing the loss of data in case of drive failures but also doing checksum calculations to prevent data corruption, or is that untrue?

 

Last but not least, corruption of data is most likely not something that happens very often when writing to a disk. I mean, I haven't seen that happen EVER on a normal machine, but I did get crashes or memory corruption errors, which I guess is where ECC RAM helps: to prevent system crashes, right?

 

Also, if I do want to use ECC RAM, I believe that not only the processor needs to support it (in this case Ryzen does) but the motherboard as well, right? So that means you also need, in many cases, a server motherboard, correct?

 

Thanks again.

Edited by tessierp
Link to comment

The point about ECC RAM being slower than non-ECC RAM is not relevant, since you shouldn't be overclocking RAM on Unraid. Those advertised high speeds and fast timings are certified overclocks, but they are overclocks nonetheless.

 

ECC won't prevent system crashes or data corruption in the sense that, if you use it, those things will never take place. It only protects you against crash / corruption in a very specific case, namely a single-bit error. Whether that matters or not (vs the cost of ECC RAM) is entirely personal preference.

 

I have run both ECC and non-ECC, and my personal anecdotal experience is that it makes no perceivable difference in terms of stability or data corruption.

 

 

 

Link to comment
@testdasi Thanks for sharing your experience and knowledge regarding this. I guess I have to read more on when and how single-bit errors happen and whether this is something I should protect against given what I want to use my server for. Basically, I want to replace my QNAP TS-563 with something I can expand more and where I can guarantee I will find parts to replace failing components. I want to use it for storage, PLEX and perhaps run some docker containers, probably 1 VM. That is about it. Not sure if I will leave the thing running 24/7.

 

I guess that, if I really want to secure my data, I shouldn't rely 100% on a NAS anyway and should make proper external backups of what really matters.

Link to comment
1 hour ago, tessierp said:

I guess that, if I really want to secure my data, I shouldn't rely 100% on a NAS anyway and should make proper external backups of what really matters.

One thing I forgot to caveat. I was talking in terms of Unraid core experience.

If you plan to ever use ZFS (e.g. ZFS plugin "app" that is not officially supported by Unraid i.e. outside of the Unraid core experience) then ECC is considered a must-have requirement.

 

I do agree with your point above. When picking between spending on a backup vs ECC RAM (for Unraid uses), I would pick a backup every time.

Link to comment
@testdasi I have been reading about ZFS. The guy from level1tech and others say it is the best file system around, and from what I read I would agree. However, I do not think I really need it that much and I find it very limiting: from what I read, you can't add one drive at a time with different sizes. BTRFS seems better suited for my needs and good enough. I do plan to use 2 parity drives and backups like I said. That should be better than fully relying on ECC. I'm not saying ECC is not useful, it is an added protection; however, at the price asked it is just too much, and I have some free DDR4 memory already available.

 

If my old QNAP TS-563 without ECC RAM was able to last me 4 years without any issues, I'm sure UNRAID with decent hardware can do the job.

 

Again I appreciate all the responses. Very helpful.

Link to comment
3 hours ago, tessierp said:

I was under the impression that parity drives were a software method of not only preventing the loss of data in case of drive failures but also doing checksum calculations to prevent data corruption, or is that untrue?

No. Parity as implemented by unraid only recreates a single (or double if using 2 parity disks) missing disk regardless of content. It has no concept of files or file corruption.
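
A minimal sketch of the single-parity case may help here, assuming parity is a plain byte-wise XOR across the data disks (the second parity disk uses a different calculation that is not shown). The "disks" are just short byte strings and the helper names `xor_blocks` and `rebuild` are invented for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_disks, parity):
    """Recreate the one missing disk from the survivors plus parity."""
    return xor_blocks(list(surviving_disks) + [parity])


if __name__ == "__main__":
    disks = [b"photo123", b"video456", b"docs7890"]
    parity = xor_blocks(disks)

    # Disk 1 dies: XOR of everything that is left gives its contents back.
    print(rebuild([disks[0], disks[2]], parity) == disks[1])

    # A bit that was already wrong when written (e.g. flipped in RAM) is
    # baked into parity, so a rebuild reproduces the bad byte faithfully.
    corrupted = [disks[0], b"video4\x005", disks[2]]
    parity_of_corrupted = xor_blocks(corrupted)
    print(rebuild([corrupted[0], corrupted[2]], parity_of_corrupted) == corrupted[1])
```

Both comparisons come out true: parity rebuilds whatever bytes were on the disk, good or bad, which is why it cannot stand in for a checksum.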

 

Checksum is a function of the file system, and since unraid uses single member file systems, corruption can only be detected, not corrected.

Link to comment
