When is moving to dual parity preferred?



At what number of hard drives, or at what total array size, does dual parity become the preferred way to go? I’m facing my next upgrade dilemma and wondering which option is the best way forward.

 

The array is currently a mixture of 4, 3 and 2TB disks totalling 16TB. Being the hoarder that I am, I’m running low on space again and faced with my next upgrade.

 

The current case is the Fractal Define Mini, which has 6 internal 3.5” bays and 2 external 5.25” bays. The two 5.25” bays are occupied by my SSD cache drive and one 3.5” drive.

 

http://www.fractal-design.com/home/product/cases/define-series/define-mini

 

Parity TOSHIBA_MD04ACA400_84E2K6HPFSAA - 4 TB

Disk 1 TOSHIBA_DT01ACA300_242S5X2GS - 3 TB

Disk 2 TOSHIBA_MD04ACA400_84E1K4RAFSAA - 4 TB

Disk 3 Hitachi_HDS5C3020ALA632_ML0220F30DYRWD - 2 TB

Disk 4 Hitachi_HDS5C3020ALA632_ML0220F30GGKXD - 2 TB

Disk 5 TOSHIBA_DT01ACA300_331K5XRGS - 3 TB

Disk 6 Hitachi_HDS5C3020ALA632_ML0220F30GGLTD - 2 TB

 

As it stands I’ve run out of places for new drives and have used all the SATA headers on my motherboard. I do have a spare case which is a fraction bigger than the Define, but it will not fit into my home entertainment unit so I can’t use it yet. The spare case has 5 internal 3.5” bays and 5 external 5.25” bays. We are hoping to buy and move house in the near future; when this happens I will have more space for networking and storage equipment and will probably move my current setup into this case at that point. To complicate matters further, I also have a spare 3TB drive left over from another project.

 

I’ve been running a few scenarios but can’t come to a decision yet and am looking for some advice. Any decision will also depend on the point at which it is advisable to move to dual parity.

 

The array is used for Movies, TV, Music, Photos and Computer backups. Music, Photos and Computer backups are regularly copied to 1 of 2 portable hard drives and rotated off-site to the stock room at my work. Anything else is technically replaceable, but I would like to minimise the need to replace data as much as possible.

 

Option 1: Replace a 2TB drive with a spare 3TB drive and net 1TB’s worth of space for $0 spent.

 

Option 2: Replace parity with 8TB Seagate and move parity drive to replace a 2TB drive, net 2TB’s storage for $300.

 

Option 3: Move the SSD somewhere else in the case and free up space for either the spare 3TB drive or the replaced parity drive from the above scenario. Nets either 3TB for $80 (SATA controller) or 4TB for $380 (controller + 8TB drive).

 

Option 4: Replace parity with an 8TB drive for now and replace a 2TB drive, as in Option 2. 2TB gained for $300. After moving house, add a second 8TB parity drive in the larger case and add the spare 3TB drive plus the replaced 2TB drive, netting another 5TB of space for $300 plus the cost of a 5-in-3 5.25” enclosure. This would bring me up to 9 drives in total. In theory I could run 10-12 drives in the spare case, so from this point I could add an 8TB drive or two as required, netting the full 8TB of extra space for $300 each time, and once this is exhausted look to upgrade or replace existing drives.
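To compare the options side by side, here is a rough sketch that works out the cost per TB gained. The capacity and price figures are the ones quoted in the options above; the labels and the cost_per_tb helper are made up for illustration, and the first stage of Option 4 is left out because it has the same figures as Option 2.

```python
# (label, net TB gained, cost in $) using the figures quoted in the post.
# Labels are illustrative shorthand, not anything unRAID-specific.
options = [
    ("Option 1: swap a 2TB disk for the spare 3TB", 1, 0),
    ("Option 2: 8TB parity, old parity replaces a 2TB disk", 2, 300),
    ("Option 3a: SATA controller + spare 3TB", 3, 80),
    ("Option 3b: SATA controller + 8TB parity, old parity added", 4, 380),
]


def cost_per_tb(net_tb, cost):
    """Dollars spent per TB of usable space gained."""
    return cost / net_tb


for label, net_tb, cost in options:
    print(f"{label}: +{net_tb}TB for ${cost} (${cost_per_tb(net_tb, cost):.2f}/TB)")
```

This only ranks the options by price; it says nothing about the rebuild risk discussed in the replies.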


I think once you have so many drives that the chance of two drives failing at the same time, or of a second drive failing during a parity rebuild, rises beyond what you are comfortable with, it's time for dual parity.

 

I have dual parity on my 'monster' unRAID server, but it has 27 drives and parity is 8TB, which takes just over 24hrs to rebuild. So for me, with all those drives and all that time to rebuild parity, my chances of a second drive failing during a parity rebuild are greater than most people's.


I would suggest you take a look at the table at the end of my post and decide for yourself how much risk you are willing to take.


Explanation of the math:

Have a look at your HDD specs; you will find something like "Non-recoverable read errors per bits read". For a WD Red you get "<1 in 10^14".

 

Now think of how many bits have to be read to recover a failed drive (i.e. all bits on all non-failed drives).

The closer the number of bits that have to be read gets to the error rate, the higher the risk of encountering a second "failure" while rebuilding the failed drive.

 

Let's say the "1 in 10^14" is the expectation of a Poisson process counting the errors you encounter while reading. This process therefore has the rate 10^-14 per bit. The probability of not encountering an error while reading 4TB (~ 3.5x10^13 bits) is e^(-10^-14 * 3.5x10^13) = e^(-0.35) = 0.7047 = 70.47%.
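The single-drive figure can be checked with a few lines of Python. This is only a sketch of the Poisson model described above; the names URE_RATE_PER_BIT, BITS_PER_TB and p_no_read_error are made up for illustration, the "<1 in 10^14" spec is treated as an exact rate, and "TB" is read as a binary terabyte (1024^4 bytes), which is an assumption.

```python
import math

# Assumption: the "<1 in 10^14" spec is taken as an exact rate per bit read.
URE_RATE_PER_BIT = 1e-14
# Assumption: "TB" means a binary terabyte (1024^4 bytes, 8 bits each).
BITS_PER_TB = 8 * 1024**4


def p_no_read_error(tb_read, rate_per_bit=URE_RATE_PER_BIT):
    """Probability of zero read errors under a Poisson model."""
    bits_read = tb_read * BITS_PER_TB
    return math.exp(-rate_per_bit * bits_read)


print(f"4TB read without error: {p_no_read_error(4):.2%}")
```

With the unrounded bit count the result comes out slightly below the 70.47% quoted above, because that figure rounds 4TB to 3.5x10^13 bits.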

 

Now if you have 7x 4TB drives in total (one of them parity) and one drive fails, you have to hope that the 6 remaining drives do not hit a read error while being read for the rebuild.

The probability of not encountering an error on all 6 drives is therefore 0.7047^6 = 0.122 = 12.2%.
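The same model extends to a whole rebuild by multiplying the per-drive probabilities, which assumes the drives fail independently. A minimal sketch, with p_clean_rebuild as a hypothetical helper name and the same assumptions as the post (rate of 10^-14 per bit, binary terabytes):

```python
import math


def p_clean_rebuild(surviving_tb, rate_per_bit=1e-14):
    """Probability that every surviving drive is read without an error.

    surviving_tb lists the size in TB of each drive that must be read.
    Drives are assumed independent, so the probabilities multiply.
    """
    bits_per_tb = 8 * 1024**4  # assumption: binary terabytes
    per_drive = [math.exp(-rate_per_bit * tb * bits_per_tb) for tb in surviving_tb]
    return math.prod(per_drive)


p = p_clean_rebuild([4] * 6)
print(f"clean rebuild: {p:.1%}, at least one read error: {1 - p:.1%}")
```

Passing a list rather than a drive count means a mixed array can be tried as well, for example p_clean_rebuild([4, 3, 2]).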

 

Are you willing to take an 87.8% chance of corrupted data / losing all your data while rebuilding?

Or would you rather pay the premium (buying + occupied SATA-Port + electricity) and have peace of mind?

 

Please remember: no raid is a substitute for a backup


 

For the people who do not want to do the maths:

The following table gives the chance of encountering at least one read error.

 

 

Hypothesis: Such errors can be modeled as a Poisson process with rate "1/expectation".

 

first row = expected error rate; please consider that "<1 in 10^14" could be any value, and enterprise disks are typically "<1 in 10^15", so your real value is probably somewhere within the range of this table

first column = size of the data to be read while rebuilding

 

1 error in:   1.00E+14  2.00E+14  3.00E+14  4.00E+14  5.00E+14  6.00E+14  7.00E+14  8.00E+14  9.00E+14  1.00E+15
2TB             16.13%     8.42%     5.70%     4.30%     3.46%     2.89%     2.48%     2.18%     1.94%     1.74%
4TB             29.66%    16.13%    11.07%     8.42%     6.79%     5.70%     4.90%     4.30%     3.83%     3.46%
6TB             41.01%    23.19%    16.13%    12.36%    10.02%     8.42%     7.26%     6.38%     5.70%     5.14%
8TB             50.52%    29.66%    20.91%    16.13%    13.13%    11.07%     9.56%     8.42%     7.52%     6.79%
10TB            58.51%    35.58%    25.41%    19.74%    16.13%    13.64%    11.81%    10.41%     9.31%     8.42%
12TB            65.20%    41.01%    29.66%    23.19%    19.03%    16.13%    14.00%    12.36%    11.07%    10.02%
14TB            70.81%    45.98%    33.67%    26.50%    21.83%    18.55%    16.13%    14.27%    12.79%    11.59%
16TB            75.52%    50.52%    37.45%    29.66%    24.53%    20.91%    18.21%    16.13%    14.48%    13.13%
18TB            79.47%    54.69%    41.01%    32.69%    27.14%    23.19%    20.24%    17.96%    16.13%    14.64%
20TB            82.78%    58.51%    44.37%    35.58%    29.66%    25.41%    22.22%    19.74%    17.76%    16.13%
22TB            85.56%    62.00%    47.54%    38.36%    32.09%    27.57%    24.15%    21.49%    19.35%    17.59%
24TB            87.89%    65.20%    50.52%    41.01%    34.44%    29.66%    26.04%    23.19%    20.91%    19.03%
26TB            89.84%    68.13%    53.34%    43.55%    36.71%    31.69%    27.87%    24.86%    22.44%    20.44%
28TB            91.48%    70.81%    56.00%    45.98%    38.90%    33.67%    29.66%    26.50%    23.94%    21.83%
30TB            92.86%    73.27%    58.51%    48.30%    41.01%    35.58%    31.41%    28.10%    25.41%    23.19%
32TB            94.01%    75.52%    60.87%    50.52%    43.05%    37.45%    33.11%    29.66%    26.86%    24.53%
34TB            94.97%    77.58%    63.10%    52.65%    45.02%    39.25%    34.77%    31.19%    28.27%    25.85%
36TB            95.79%    79.47%    65.20%    54.69%    46.92%    41.01%    36.39%    32.69%    29.66%    27.14%
38TB            96.47%    81.20%    67.18%    56.64%    48.75%    42.71%    37.97%    34.15%    31.02%    28.41%
40TB            97.04%    82.78%    69.05%    58.51%    50.52%    44.37%    39.51%    35.58%    32.36%    29.66%
50TB            98.77%    88.91%    76.92%    66.70%    58.51%    51.95%    46.65%    42.29%    38.66%    35.58%
60TB            99.49%    92.86%    82.78%    73.27%    65.20%    58.51%    52.95%    48.30%    44.37%    41.01%
70TB            99.79%    95.40%    87.16%    78.55%    70.81%    64.16%    58.51%    53.68%    49.55%    45.98%
80TB            99.91%    97.04%    90.42%    82.78%    75.52%    69.05%    63.41%    58.51%    54.25%    50.52%
90TB            99.96%    98.09%    92.86%    86.18%    79.47%    73.27%    67.73%    62.83%    58.51%    54.69%
100TB           99.98%    98.77%    94.67%    88.91%    82.78%    76.92%    71.54%    66.70%    62.37%    58.51%
150TB          >99.99%    99.86%    98.77%    96.31%    92.86%    88.91%    84.82%    80.78%    76.92%    73.27%
200TB          >99.99%    99.98%    99.72%    98.77%    97.04%    94.67%    91.90%    88.91%    85.84%    82.78%
300TB          >99.99%   >99.99%    99.98%    99.86%    99.49%    98.77%    97.69%    96.31%    94.67%    92.86%
400TB          >99.99%   >99.99%   >99.99%    99.98%    99.91%    99.72%    99.34%    98.77%    97.99%    97.04%
500TB          >99.99%   >99.99%   >99.99%   >99.99%    99.98%    99.93%    99.81%    99.59%    99.25%    98.77%

 

This table was created using google-spreadsheets and http://theenemy.dk/table/ to convert.
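For anyone who would rather regenerate the table than trust a spreadsheet export, here is a rough Python sketch of the same calculation. It assumes the Poisson hypothesis stated above and reads "TB" as a binary terabyte (1024^4 bytes); the function name p_at_least_one_error and the short size list are illustrative choices only.

```python
import math

BITS_PER_TB = 8 * 1024**4  # assumption: sizes are binary terabytes


def p_at_least_one_error(tb_read, bits_per_error):
    """Chance of one or more read errors while reading tb_read terabytes,
    given one expected error per bits_per_error bits (Poisson model)."""
    return 1.0 - math.exp(-tb_read * BITS_PER_TB / bits_per_error)


sizes_tb = [2, 4, 6, 8, 24]                # extend as needed
rates = [k * 1e14 for k in range(1, 11)]   # 1.00E+14 ... 1.00E+15

print("1 error in: " + " ".join(f"{r:8.2E}" for r in rates))
for size in sizes_tb:
    cells = " ".join(f"{p_at_least_one_error(size, r):8.2%}" for r in rates)
    print(f"{str(size) + 'TB':<12}{cells}")
```

Changing sizes_tb or rates is enough to cover an array size or drive spec that the table does not list.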

 


 

Case: Double parity

 

If you have double parity, the chance of any read error on your first n drives being exactly where the second parity (disk n+2) also encounters a read error is ... really small.

 

Let's just assume you got 4kB of "read errors", so your second parity drive has to cover those by reading 4kB. The chance of encountering an error while doing so is 1 - e^(-10^-14 * 4 * 8 * 1024) = 0.000000000328 = 0.0000000328%.
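The 4kB figure is easy to reproduce. The sketch below uses the same assumed rate of 10^-14 per bit and the same 4 * 1024 * 8 bit count as the formula above; it uses math.expm1 only because subtracting a number very close to 1 from 1 loses precision, which is a numerical nicety and not part of the original calculation.

```python
import math

RATE_PER_BIT = 1e-14            # assumed from the "<1 in 10^14" spec
bits_to_reread = 4 * 1024 * 8   # 4kB, as in the example above

# 1 - e^(-x) written as -expm1(-x) so tiny probabilities keep their precision.
p_second_error = -math.expm1(-RATE_PER_BIT * bits_to_reread)

print(f"{p_second_error:.3e}")
print(f"{p_second_error * 100:.10f}%")
```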

 

Best regards blue_Bandana


Thank you, this is exactly what I was looking for, great explanation too. Looking at the table for 6 drives @ 4TB is it actually saying 87.89% chance of encountering a URE? If so I think I know what I should be prioritizing. I realise unRAID is no substitute for backups, but my most critical data (photos, documents, music and computer backups) is backed up twice and rotated off site. This is really to minimise the chance of having to rebuild any or all of the movie or TV libraries.


Looking at the table for 6 drives @ 4TB is it actually saying 87.89% chance of encountering a URE?

This chance of encountering a URE is correct if my hypothesis is right, that UREs can be modelled as a Poisson process with rate 10^-14.

 

I think it is pretty safe to say that not all of those assumptions are true. If it were then there would be a METRIC CRAP TON of users complaining about parity errors during their monthly parity checks. Perhaps I am incorrect. It would be nice if one of the smarter people would weigh in. Specifically I am wondering what happens when you encounter a URE during a parity check. I am working from the premise that a parity check reads the data disks, computes the expected parity value and compares that to the value that is currently stored on the parity drive. So what happens if there is a URE and the current parity bit cannot be read? What happens if there is a URE on one of the data disks and parity cannot be computed? I am assuming that the system would log some kind of error. Perhaps I am incorrect.
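The premise described above (read the data disks, compute the expected parity, compare it with what is stored) can be sketched with plain XOR over byte blocks. This is an illustrative model of single parity only, not unRAID's actual implementation, and it does not answer what happens when a read itself fails; xor_parity and parity_mismatches are hypothetical names.

```python
from functools import reduce


def xor_parity(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_mismatches(data_blocks, stored_parity):
    """Return the byte offsets where computed and stored parity differ."""
    expected = xor_parity(data_blocks)
    return [i for i, (e, s) in enumerate(zip(expected, stored_parity)) if e != s]


# Three toy "data disks" of three bytes each.
data = [b"\x0f\xf0\xaa", b"\x01\x10\x55", b"\xff\x00\x0f"]
parity = xor_parity(data)

print(parity_mismatches(data, parity))      # prints []

# Flip one bit in the first stored parity byte to simulate a sync error.
corrupted = bytes([parity[0] ^ 0x01]) + parity[1:]
print(parity_mismatches(data, corrupted))   # prints [0]
```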
