
Thinking about unRAID but one burning question


justme777


Hi all, I'm thinking about installing unRAID on a DIY NAS, but I have one big question.

 

As we all know, the danger of RAID 5 is a URE (unrecoverable read error) during a rebuild.

In RAID 5 this means a total failure of the rebuild and bye-bye array... (and data).

That's a big reason why RAID 5 is no longer advised and things like RAID 6 or RAID-Z2 (in ZFS) are recommended instead (or RAID 1 / RAID 10).

 

I seriously cannot find a consistent and clear answer about how unRAID handles this situation.

In an old topic I see that limetech itself responds with the following:

The reconstruct will continue.

 

See: http://lime-technology.com/forum/index.php?topic=3222.msg27262#msg27262

 

However, when I read newer topics I see people explaining that they had a failed rebuild due to a URE?

 

For me, getting one corrupt file would be no problem as long as the rebuild continues; however, a completely failed rebuild does not sound appealing...

 

So, to recap, how does unRAID handle UREs during a rebuild?

 

Thanks!

Link to comment

I was PM'ing Tom a month or two ago and he confirmed that in the case of a read error during a rebuild, the system will carry on as if nothing had happened. You will be warned about it on the main screen as a read error. Depending upon how many files you have on the drive being rebuilt, there may be a corrupted file, or nothing noticeable at all. It's why I keep MD5 checksums of all my files, so that if something like this happens I can at least find out which file, if any, was affected.
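If anyone wants to roll their own checksum list, here is a minimal sketch of the idea in Python (this is not an unRAID feature, just a generic script; the share path and output file name below are placeholders you'd swap for your own):

```python
import hashlib
import os
import sys

def md5_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large media files don't need to fit in RAM."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def write_checksums(root, out_file):
    """Walk a share and write 'md5  relative/path' lines, like md5sum output."""
    with open(out_file, "w") as out:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in sorted(filenames):
                full = os.path.join(dirpath, name)
                rel = os.path.relpath(full, root)
                out.write(f"{md5_of_file(full)}  {rel}\n")

if __name__ == "__main__":
    # Example (paths are placeholders): keep the output list outside the share
    # python make_checksums.py /mnt/user/Movies /boot/checksums/movies.md5
    write_checksums(sys.argv[1], sys.argv[2])
```

Run something like that before a rebuild (or on a schedule) and you have a baseline to compare against afterwards.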

Link to comment

Thanks for the quick reply... :)

 

This is a huge, huge plus that, for some strange reason, is nowhere advertised as a USP (which it definitely is!).

 

I still wonder why I read topics about failed rebuilds due to a URE, for example:

 

The rebuild failed due to an unreadable sector; unRAID threw the drive out of the array and two drives were then offline:

the parity drive that I was trying to rebuild, and the data drive that had the unreadable/pending sector.

FWIW, I did do a parity check the night before.

 

 

As far as 'a rebuild doesn't fail from a pending sector' goes, that's only half true.

If the pending sector falls within allocated data that needs to be read, it is unreadable, and the drive returns that error back to the kernel, then the rebuild will fail.

 

 

If the pending sector is in an unused portion of the drive, and can somehow be reconstructed from CRC, it will be remapped when written.

 

 

 

see: http://lime-technology.com/forum/index.php?topic=30588.msg302479#msg302479

 

I really hope "Tom" is right in this case, because that would mean the big reason not to use RAID 5 is gone with unRAID.

 

(I was thinking about XPEnology, since it has an amazing number of plugins (full Synology features!), but there I would have to lose one extra disk, and I only want to use 4.)

 

Hope that someone can clear this up even more.... :)

 

Thanks

Link to comment

I really hope "Tom" is right in this case, because that would mean the big reason not to use RAID 5 is gone with unRAID.

Considering that Tom (@limetech) wrote the software, I would trust him.

 

The big thing is that during a rebuild you want to avoid writing to the array.  If the write happens to fail, then another drive will be taken out of service, and at that point the rebuild will fail since you've now exceeded the tolerance level. 

 

Worst-case scenario on a failed rebuild in that circumstance is that you've lost two drives' worth of information (not all of the data, as you would have on RAID 5), but if the parity drive was one of them then you've only lost one. It's super easy to avoid writes, however, if you use the cache drive - just disable the mover.

Link to comment

The other big thing that I do not think is made enough of is the fact that under unRAID each disk is a self-contained file system that can be read independently outside of the array.    This means that except in the case of a drive becoming completely dead (which is relatively rare) the vast majority of the data will still be retrievable.  This is a huge plus from a safety/recovery perspective as it means that even multiple drive failures do not necessarily lead to massive data loss.

 

I guess it is one of those features that is very hard to market, as one does not want to talk about how good you are at recovering from severe failures. After all, it is likely to convince naïve users that the product is unreliable :) However, if you have experienced what happens with other types of RAID in failure scenarios, it is an attractive feature.

Link to comment

That is indeed a great feature... and extremely important data should be backed up in more than one location anyway.

 

Isn't btrfs going to add more data security, since it is included in version 6?

 

(I don't know much, but if it is like ZFS there are self-healing features, checksums to guard against bit rot, etc.)

 

 

Link to comment

That is indeed a great feature... and extremely important data should be backed up in more than one location anyway.

 

Isn't btrfs going to add more data security, since it is included in version 6?

 

(I don't know much, but if it is like ZFS there are self-healing features, checksums to guard against bit rot, etc.)

I can't 100% answer that. My opinion is that, since btrfs is still an evolving filesystem, I would avoid it and go with XFS for the array drives (which is the current recommendation from limetech).
Link to comment

That is indeed a great feature... and extremely important data should be backed up in more than one location anyway.

 

Isn't btrfs going to add more data security, since it is included in version 6?

 

(I don't know much, but if it is like ZFS there are self-healing features, checksums to guard against bit rot, etc.)

On paper BTRFS provides data-level protection against things like bitrot, which does help detect/correct certain types of problems early on. However, I do not know how well BTRFS handles severe corruption. I do not think the recovery tools are (yet) as good as those for some of the older file systems, but I would expect that to be only a matter of time. I guess that with BTRFS support being a standard option in v6 we will start gathering some real-world experience.

Link to comment

I was PM'ing Tom a month or two ago and he confirmed that in the case of a read error during a rebuild, the system will carry on as if nothing had happened. You will be warned about it on the main screen as a read error. Depending upon how many files you have on the drive being rebuilt, there may be a corrupted file, or nothing noticeable at all.

 

That didn't happen to me; the drive was kicked and the array went bad with two drives failed.

Fortunately I was able to get past that one pending sector that caused the rebuild to fail.

I had to use ddrescue to copy onto another drive, in forward, then reverse mode. I did not lose the whole drive's data.
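For anyone who ends up in the same spot, the rough shape of that two-pass approach is sketched below (Python just to show the sequence; the device names and map file are placeholders, and you should verify the ddrescue options against its manual before touching real disks):

```python
import subprocess

# Placeholders: /dev/sdX is the failing source disk, /dev/sdY is the
# replacement, and rescue.map is ddrescue's map/log file. Reusing the same
# map file means the reverse pass only retries what the forward pass skipped.
SOURCE = "/dev/sdX"
TARGET = "/dev/sdY"
MAPFILE = "rescue.map"

# Pass 1: forward copy; -f is needed because the output is a block device.
subprocess.run(["ddrescue", "-f", SOURCE, TARGET, MAPFILE])

# Pass 2: reverse direction (-R) to approach the bad area from the other end.
subprocess.run(["ddrescue", "-f", "-R", SOURCE, TARGET, MAPFILE])
```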

 

It's why I keep MD5 checksums of all my files, so that if something like this happens I can at least find out which file, if any, was affected.

This is important to do on some regular basis so you know where a problem exists.

Even ReiserFS had its own silent corruption, and the only way to verify the data was with some external checksum mechanism.
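If anyone wants a starting point for that kind of regular check, here is a rough sketch (it assumes a checksum list with 'md5  relative/path' lines, like md5sum output and like the generation script earlier in the thread; the paths in the example are made up):

```python
import hashlib
import os
import sys

def md5_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files don't need to fit in RAM."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify(root, checksum_file):
    """Compare stored 'md5  relative/path' lines against the files on disk."""
    bad = 0
    with open(checksum_file) as f:
        for line in f:
            expected, rel = line.rstrip("\n").split("  ", 1)
            full = os.path.join(root, rel)
            if not os.path.exists(full):
                print(f"MISSING  {rel}")
                bad += 1
            elif md5_of_file(full) != expected:
                print(f"CHANGED  {rel}")
                bad += 1
    print(f"{bad} problem file(s) found")

if __name__ == "__main__":
    # Example (placeholder paths):
    # python verify_checksums.py /mnt/user/Movies /boot/checksums/movies.md5
    verify(sys.argv[1], sys.argv[2])
```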

Link to comment

Archived

This topic is now archived and is closed to further replies.
