Reduce parity calculation times by up to 20%



Parity is calculated by reading all of the data drives at once. At each position, a bit is read from each data drive, the bits are exclusive-ORed together, and the resulting value is written to the parity drive. The speed of this calculation is constrained by how fast the data can be read from the individual data drives, and that is determined either by the speed of the drives themselves or by the data bus to which they are attached.
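
In rough code form, the per-position loop looks like this (a minimal Python sketch of the idea; the real unRAID implementation works in the kernel, a block at a time rather than a bit at a time):

    def compute_parity(data_drives, size):
        """XOR all data drives together, position by position."""
        parity = bytearray(size)
        for pos in range(size):
            b = 0
            for drive in data_drives:
                b ^= drive[pos]      # read this position from every data drive
            parity[pos] = b          # the XOR result goes to the parity drive
        return parity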

 

Most unRAID systems being built today place the hard disks on a PCIe bus so that the data bus does not slow the parity calculation, which leaves the speed of the drives themselves as the limiting factor. Most systems use a mixture of disk sizes, and the smaller drives generally have lower areal density and thus lower read rates. During a parity calculation these small drives hold everything back until they have been read completely. Because the read rate is a function of areal density, the rate at the inner tracks of a hard disk is much lower than at the outer tracks; the degradation is about a factor of two on the disks we tend to use in unRAID.

 

We can reduce, and sometimes eliminate, this loss of parity calculation speed by shifting the smaller drives to the end of the parity calculation. This would of course require a change to the way the unRAID software works: smaller drives would have an offset added to their first address so that the last bit on all drives falls at the same position, instead of the first bit being aligned as it is now. In my calculations this would speed up my parity calculations by twenty percent. I have a mixture of 1TB and 2TB drives, some 7200 rpm and some 5400-5900 rpm. In no case would this ever decrease parity calculation performance, and if all drives are of equal size, parity would be exactly the same as it is now.
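
To make the proposed addressing concrete, here is a minimal sketch (my own illustration, assuming each drive's offset is simply the parity size minus the drive size):

    def compute_parity_end_aligned(data_drives, parity_size):
        """End-aligned parity: smaller drives are offset so all drives finish together."""
        parity = bytearray(parity_size)
        for pos in range(parity_size):
            b = 0
            for drive in data_drives:
                offset = parity_size - len(drive)   # zero for drives as large as parity
                if pos >= offset:                   # smaller drives join in only near the end
                    b ^= drive[pos - offset]
            parity[pos] = b
        return parity

Early positions then touch only the large drives, which are still on their fast outer tracks; the small drives are not read at all until the calculation is nearly done.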

 

This would require a software change on Tom's part. The major downside is that the parity disk would differ between two versions of the software: downgrading unRAID release versions would not be possible without a parity recalculation. A couple of things could be done to ease the transition. Write the parity-creation method onto the parity disk itself, and warn the user if this signature is not recognized. Allow users to opt in to the new parity-creation method, so they only invoke it once they know the risks. I for one would jump at the chance to decrease parity check times; I just spent several hundred dollars upgrading because my parity calculation times were getting out of hand, and now I want that last little bit.
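
The signature could be as simple as a small tagged block on the parity disk; something like the following (purely illustrative on my part, the real location and format would be up to Tom):

    import struct

    # Hypothetical on-disk marker: magic string, layout method, format version.
    # Method 0 = start-aligned (current), 1 = end-aligned (proposed).
    PARITY_SIG = struct.Struct("<8sBB")

    def write_signature(dev, method):
        dev.write(PARITY_SIG.pack(b"UNRPARTY", method, 1))

    def read_signature(dev):
        magic, method, version = PARITY_SIG.unpack(dev.read(PARITY_SIG.size))
        if magic != b"UNRPARTY":
            raise ValueError("unrecognized parity layout - warn the user")
        return method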

 

Scott


In my calculations this would speed up my parity calculations by twenty percent.

 

In that statement you are giving us a nice solid number, without any qualifiers like 'around' or 'about', which implies that your number is very accurate.

Knowing that 48.6 percent of all statistics are made up on the spot, I almost stopped reading right there. :)

 

Other than that, the idea of staggering all the smaller drives along the length of the big parity drive may turn out to be very interesting.

 


To calculate my parity speedup, I first logged a parity calculation by capturing the current time and the position of the calculation every 30 seconds. After staring at the resulting graph and seeing my parity speeds jump up once my 1TB WD Green drives were finished, it dawned on me that removing them from consideration for the first half of the parity calculation would really speed things up. The parity graphs (calculation speed vs. position) for the first and second halves of the calculation were about the same. I estimated my first-half speeds based on the slowest 2TB disk (a WD Green at 100 MB/s). The read rates for the first and second terabytes of my parity calculation were roughly equal, so I used my actual time for the second half. Overall I estimated a reduction of about 20%. I could get more accurate numbers by recalculating parity without my slow drives in the mix, but I don't have a spare 2TB drive. Thinking it over, I believe my situation is about the worst case, which is why I said 'up to 20%'.
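
For anyone who wants to repeat the measurement, the logging boils down to something like this (a sketch; get_parity_position() is a stand-in for however you read the current sync position on your system):

    import time

    def log_parity_progress(get_parity_position, interval=30, path="parity_log.csv"):
        """Sample the parity sync position every `interval` seconds."""
        start = time.time()
        with open(path, "w") as f:
            f.write("elapsed_seconds,position_bytes\n")
            while True:
                pos = get_parity_position()   # hypothetical helper
                if pos is None:               # sync finished
                    break
                f.write("%d,%d\n" % (time.time() - start, pos))
                f.flush()
                time.sleep(interval)

Speed at each sample is just the change in position divided by the 30-second interval; plotting that against position gives the graph described above.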

 

Replacing a smaller drive with a larger drive would be trivial. Instead of zeroing the end of the drive, we would zero the start.


I read an article that basically said that the first 20% of a drive is very fast.  The testing for that article showed that the first 20% of a big 7200rpm drive was as fast as a WD VelociRaptor, which spins at 10Krpm.  So if you have a small drive like a 320GB one, 20% of it will be fast, but the last 40% or so will impede the parity calc if the other drives are all big, at least until you pass the small drive's size.

 

2TB drive, 7200rpm

  20% = 400GB will be uber-fast.  Like 10K fast.

  60% = 1.2TB will be mostly average.

  20% = 400GB will be slow.

 

320GB drive, 7200rpm

  20% = 64GB will be uber-fast.  Probably not 10K fast, but faster than the 7200rpm average.

  60% = 192GB will be mostly average.

  20% = 64GB will be slow.

 

Now, that's only part of the story, because that uber-fast zone can have a different degree of uber-ness depending on the actual areal density of the particular platters.

But my general argument is that the slow-down effect is more about the smaller drive preventing you from seeing the big drives' uber-fastness than about dragging the parity check below the average speed of the slowest big drive.

Add one more twist: if the small drive is 7200rpm and the big drives are 5400/5900rpm, then the slow-down is reduced even further.

 

So I don't think the issue is as big as believed.  And I'd say for sure that as a feature enhancement the priority is way down on the list.

 

What might end up being a bigger issue:

 

2TB drive + 1TB drive + 750GB + 500GB + 320GB

 

In this case, the slow parts of each drive are offset from each other, so the slow-down area is enlarged.  But even this problem will work itself out as you replace older drives with larger, newer ones.
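
That widening is easy to see with a toy model where the check runs at the speed of the slowest drive still being read (a sketch using the rough three-zone profile from above; all the MB/s figures are made up for illustration):

    def zone_speed(pos, size, fast, avg, slow):
        """Crude three-zone model: first 20% fast, middle 60% average, last 20% slow."""
        frac = pos / size
        if frac < 0.2:
            return fast
        if frac < 0.8:
            return avg
        return slow

    def check_speed(pos, drives):
        """Parity check speed = slowest drive still being read at this position."""
        active = [zone_speed(pos, size, fast, avg, slow)
                  for size, fast, avg, slow in drives if pos < size]
        return min(active) if active else None

    # sizes in GB, illustrative speeds in MB/s
    drives = [(2000, 140, 110, 70), (1000, 130, 100, 65), (750, 120, 95, 60),
              (500, 110, 90, 55), (320, 100, 85, 50)]
    for pos in range(0, 2000, 200):
        print(pos, check_speed(pos, drives))

With staggered sizes, some drive is sitting in its slow last-20% zone for much of the run.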


Replacing a smaller drive with a larger drive would be trivial. Instead of zeroing the end of the drive, we would zero the start.

Not so easy.

 

Let's try an example.

 

Parity = 1TB

Data 1 = 1TB

Data 2 = 500Gig

 

To make this easy, all the data bits on the 1TB drive are "1"

On the 500Gig drive, only the first bit is a "1"; the remaining bits are "0"

 

OK, for this exercise, here are the bits.

Conventional calculation

                  data1       data2       parity
Address 0         11111111    10000000    01111111
Address 1         11111111    00000000    11111111
...
Address 500G      11111111    00000000    11111111
Address 500G+1    11111111    00000000    11111111

 

Your proposed method

                  data1       data2       parity
Address 0         11111111    00000000    11111111
Address 1         11111111    00000000    11111111
...
Address 250G      11111111    00000000    11111111
...
Address 500G      11111111    10000000    01111111
Address 500G+1    11111111    00000000    11111111
...
Address 750G      11111111    00000000    11111111

 

Now... we will replace the 500Gig drive with a 750Gig drive, because the 500Gig drive failed.

 

                  data1       data2       parity
Address 0         11111111    00000000    11111111
Address 1         11111111    00000000    11111111
...
Address 250G      11111111    XXXXXXXX    11111111   <- to calculate this byte, we need parity and data1
                                                        from the 500G addresses; but as soon as we put it
                                                        into place, this line's parity is wrong and will
                                                        also need changing. (The new data2 byte will be
                                                        10000000 and the new parity 01111111.)
...
Address 500G      11111111    XXXXXXXX    01111111
Address 500G+1    11111111    XXXXXXXX    11111111
...
Address 750G      11111111    00000000    11111111

 

When we try to shift downward, we not only need to calculate the new data byte, but also new parity across the entire set of data drives at that same byte position.  In other words, we need to read the entire set of bytes at the original offset AND the entire set of bytes at the new offset, and seek between them.  This will take over twice as long as sequential reads of the set of disks.  There goes your 20% savings.
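
In sketch form, the rebuild-with-shift looks something like this (one byte at a time for clarity; assumes the other data drives span the full address range):

    def rebuild_shifted(parity, others, new_drive, old_off, new_off, old_size):
        """Rebuild a failed end-aligned drive at a smaller offset (new_off < old_off).

        Each byte costs a full stripe read at the OLD offset (to reconstruct it)
        plus another at the NEW offset (to recompute parity there), with a seek
        in between - over twice the I/O of a plain in-place rebuild.
        """
        for i in range(old_size):
            d = parity[old_off + i]           # stripe read #1 at the old offset
            for drv in others:
                d ^= drv[old_off + i]         # reconstruct the lost byte
            p = d                             # stripe read #2 at the new offset
            for drv in others:
                p ^= drv[new_off + i]
            new_drive[new_off + i] = d        # place the byte at its new address
            parity[new_off + i] = p           # and fix parity there
            # Upward iteration is safe: each parity byte is read at its old-offset
            # position before a later iteration overwrites it as a new-offset byte.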

 


Thanks, Joe, for giving my idea some thought. How about this:

 

                  data1       data2       parity
Address 0         11111111    00000000    11111111
Address 1         11111111    00000000    11111111
...
Address 250G      11111111    00000000    11111111  <- we zero this
...
Address 500G-1    11111111    00000000    11111111  <- to here
Address 500G      11111111    10000000    01111111
Address 500G+1    11111111    00000000    11111111
...
Address 750G      11111111    00000000    11111111

 

Can't we store the "real data" at the end of the new drive? Or is the issue the filesystem's pointers to the data?

 


For me, the process would be 50% faster, as the WD Greens are half the speed of the Hitachi 7Ks (i.e., the 7K is pulled back to the speed of the GP for half of its data size).

 

What would be better is to 'right align' the data bytes rather than put in a wait timer so that all disks end at the same time, i.e. reading a 1TB disk starts as soon as there's 1TB of parity left to calculate.

 

This would of course mean a totally different way of writing parity.


For me, the process would be 50% faster, as the WD Greens are half the speed of the Hitachi 7Ks (i.e., the 7K is pulled back to the speed of the GP for half of its data size).

 

What would be better is to 'right align' the data bytes rather than put in a wait timer so that all disks end at the same time, i.e. reading a 1TB disk starts as soon as there's 1TB of parity left to calculate.

 

This would of course mean a totally different way of writing parity.

 

When writing the parity drive:

The most the slow-down can be is the difference between the write speed of the parity drive and the slowest read speed.

 

When checking parity:

The typical worst-case slow-down is the difference between the fastest read speed and the slowest read speed.

 

I'm interested in the read speeds of your drives because 50% seems too big.


My thought is you might lose some of the "outer track" speed benefit when writing to the smaller drives.

 

If I write to the outer track of a 320GB drive, it's going to be using the inner track of the parity drive.

 

I would think a threaded approach to handling parity checks would be better.

If drives on different controllers were handled via parallel threads, then it might alleviate some of the waits for a drive to finish.


Reading a 7K goes from 130 MB/s to 80 MB/s; reading a GP goes from around 80 MB/s to 50 MB/s. A 7K is essentially twice as fast as a GP, but as it's only slowed down for half of its data, the increase would be half of twice, hence 50% or so.

 

Final average parity check speed with just 7Ks was 95 MB/s (started at 120+); with the additional GP drive it's down to 60 MB/s-ish (started at 80). I didn't get an exact look when it was done.

 

Gets even worse if you have WD Blacks and Greens in one unRAID; these are even quicker than 7Ks.


Here's a chart to help visualize how the drive speeds vary over the cylinder locations.

 

The second chart shows how the drives would be aligned if the parity were calculated in the way the OP suggested.

While it's true that the parity drive would become the only really important speed limiter, it would also mean that the fastest part of the parity drive might go unused if the data drives were all smaller.  That alone seems a high price to pay, too high I think.

[Chart: SpeedByCylinderLocation.jpg]

[Chart: InvertedParityDiskAlignment.jpg]


terrastrife, I'm going to have to refute your claims. My array has a final parity check speed of 75 MB/s, and I am using 3 WD Green 2TB drives; they start off at over 100 MB/s. Going from 95 MB/s to 75 MB/s is a drop of about 21% {the math: (1 - (75/95)) * 100}, which is nowhere close to your claimed 50% performance loss.

 

Perhaps your drop in performance comes from adding another drive, regardless of its type.

 


To calculate my parity speedup, I first logged a parity calculation by capturing the current time and the position of the calculation every 30 seconds. After staring at the resulting graph and seeing my parity speeds jump up once my 1TB WD Green drives were finished, it dawned on me that removing them from consideration for the first half of the parity calculation would really speed things up.

But doesn't that also mean that those disks will slow you down for the second half of the parity calculation if you start them from there?

 


Really?

I started with 'for me'.

Not for you, not for everyone.

I have EACS GreenPowers, from before they were labelled as 'Green'.

 

Yes, really. Reread the following statement, which seems aimed at all GPs.

 

Reading a 7K goes from 130 MB/s to 80 MB/s; reading a GP goes from around 80 MB/s to 50 MB/s. A 7K is essentially twice as fast as a GP, but as it's only slowed down for half of its data, the increase would be half of twice, hence 50% or so.

Really?

I started with 'for me'.

Not for you, not for everyone.

I have EACS GreenPowers, from before they were labelled as 'Green'.

 

Yes, really. Reread the following statement, which seems aimed at all GPs.

 

Reading a 7K goes from 130 MB/s to 80 MB/s; reading a GP goes from around 80 MB/s to 50 MB/s. A 7K is essentially twice as fast as a GP, but as it's only slowed down for half of its data, the increase would be half of twice, hence 50% or so.

 

Remember, GPs are NOT Caviar Greens. WD used the GreenPower tag before they introduced the Black/Blue/Green lines.

 

Also, I did start with 'for me'. Not sure how you managed to quote from that same reply but not include the part where I started with FOR ME.


Alright, here's your full post:

 

Reading a 7K goes from 130 MB/s to 80 MB/s; reading a GP goes from around 80 MB/s to 50 MB/s. A 7K is essentially twice as fast as a GP, but as it's only slowed down for half of its data, the increase would be half of twice, hence 50% or so.

 

Final average parity check speed with just 7Ks was 95 MB/s (started at 120+); with the additional GP drive it's down to 60 MB/s-ish (started at 80). I didn't get an exact look when it was done.

 

Gets even worse if you have WD Blacks and Greens in one unRAID; these are even quicker than 7Ks.

 

I do not see "for me" in there at all. You also stated it gets even worse if you have WD Greens. Was that not a reference to the WEADS/WEARS?

12 years later...
On 3/31/2010 at 12:52 PM, ohlwiler said:

Parity is calculated by reading all of the data drives at once. At each position, a bit is read from each data drive, the bits are exclusive-ORed together, and the resulting value is written to the parity drive. [...]

 

Hi Scott, quick question for you... When is this parity calculation done? I thought it was always a scheduled post-process job, but now I'm seeing conflicting notes. It seems that it is now done in real time. Can you clarify for me? If it's now done in real time, when did this change happen (or am I completely mistaken about this being a post-process)? Thanks

-Clayton

6 minutes ago, [email protected] said:

It seems that it is now done in real time. Can you clarify for me? If it's now done in real time, when did this change happen (or am I completely mistaken about this being a post-process)?

It has always been real-time parity in all Unraid versions I have used (I started with v5).   I am reasonably sure earlier versions were the same.

56 minutes ago, [email protected] said:

 

Hi Scott, quick question for you...

Clayton,

 

Are you aware you're asking questions related to a forum thread from over 12 years ago...?

However, as a long-time user (since 2011, version 4.7) I can also confirm that parity calculation has always been done on the fly as new files are added to the array.

 

Are you perhaps confusing parity updates with array updates from a cache drive (if installed)?  Data written to a cache drive is not parity protected.  Data written directly to the array is protected with parity at that time.  If using a cache drive, data is transferred to the array at a later time, either on demand from the UI or according to a schedule, and would then become parity protected at that time. 

 

I also suggest NOT using an email address as your username on any forum - that is asking for a major increase in spamming to your email account.

