
Let's talk about Raid-6


limetech

There are 3 ways to implement raid-6:

1. Using “Diagonal Parity” – this is patented, patent held by NetApp.

2. Using “Even-Odd”, similar to “Diagonal Parity” – this is patented, patent held by IBM.

3a. Using Reed-Solomon to generate Q.  This is not patented and is the algorithm used by the stock Linux md layer (and probably btrfs, though I haven’t looked at their source).

3b. Using "Liberation Codes" to generate Q.  Not sure if anyone's implemented this.

https://www.usenix.org/legacy/events/fast08/tech/full_papers/plank/plank.pdf
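
For anyone wondering what "using Reed-Solomon to generate Q" (option 3a) looks like in practice, here is a minimal sketch, not anyone's actual driver code. It assumes the GF(2^8) field polynomial 0x11d and generator 2, which are the constants the Linux md RAID-6 code uses; the function names (`gf_mul`, `compute_pq`, `recover_two`) are made up for illustration. P is plain XOR, Q is a weighted XOR, and together they allow any two data disks to be rebuilt.

```python
# Sketch: RAID-6 P and Q syndromes over GF(2^8), one byte at a time.
# Assumptions: field polynomial 0x11d and generator 2; helper names
# are illustrative and not taken from any real driver.

def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(base: int, exp: int) -> int:
    """Raise a field element to a non-negative integer power."""
    result = 1
    for _ in range(exp):
        result = gf_mul(result, base)
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse: a^254 equals a^-1 for any nonzero byte."""
    return gf_pow(a, 254)


def compute_pq(data_disks):
    """Return (P, Q) for a list of equal-length data blocks."""
    size = len(data_disks[0])
    p, q = bytearray(size), bytearray(size)
    for index, block in enumerate(data_disks):
        coeff = gf_pow(2, index)  # disk i is weighted by g^i
        for offset, byte in enumerate(block):
            p[offset] ^= byte
            q[offset] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def recover_two(data_disks, x, y, p, q):
    """Rebuild data disks x and y from the surviving disks plus P and Q."""
    zero = bytes(len(p))
    survivors = [zero if i in (x, y) else b for i, b in enumerate(data_disks)]
    p_part, q_part = compute_pq(survivors)
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    denom_inv = gf_inv(gx ^ gy)
    dx, dy = bytearray(len(p)), bytearray(len(p))
    for o in range(len(p)):
        pxy = p[o] ^ p_part[o]  # equals Dx ^ Dy
        qxy = q[o] ^ q_part[o]  # equals g^x*Dx ^ g^y*Dy
        dx[o] = gf_mul(gf_mul(gy, pxy) ^ qxy, denom_inv)
        dy[o] = pxy ^ dx[o]
    return bytes(dx), bytes(dy)
```

Usage would be something like `p, q = compute_pq(blocks)` on a full stripe, then `recover_two(blocks_with_gaps, 1, 3, p, q)` after losing disks 1 and 3. A real implementation would use lookup tables or SIMD rather than bit-by-bit multiplication.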

 

 

Of the three, #1 and #2 only require the target disk as well as both redundancy disks (P and Q) to be spun up during writes.  #1 is slightly more CPU efficient.

 

#3 requires all drives to be spun up for writes and is also most CPU intensive (though in practice we’re talking probably negligible difference).

 

Companies pay NetApp and IBM royalties to use those algorithms.  I have no idea how much.  I'm not interested in implementing one and then dealing with letters from lawyers  >:(

 

So.. knowing our implementation would require all disks to be spun-up for writes, is there as much enthusiasm for this feature?

 

Link to comment

Well, it depends on how much the license costs. unRAID 7 could be a paid upgrade. If it amounts to a small fee, I don't see why people would mind paying a few bucks for extra protection.

 

If the license cost is absurd, I think the 3rd would be preferred, but it would kill a major advantage of unRAID over normal MD and ZFS, for example.

 

EDIT: didn't read the final line  ::) I think if people can choose between single/dual parity, I see no problem with option #3.

Link to comment

At the risk of going slightly off topic, I believe a big reason to implement raid-6 is the extra redundancy needed for large, wide arrays.

Is there a possible choice of multiple smaller array pools that are single drive parity protected?

 

Regarding Option 3 with all drives spinning.

Although we may lose the advantage of spin down for unused file systems, I'm sure people with a huge number of drives may still find the extra redundancy useful, e.g. a business archiving implementation.

 

In my case, I would not use it.

I would rather have smaller manageable protected arrays that are consolidated in visibility.

Either on one server with smaller array pools or multiple servers.

Link to comment

I'd rather see multiple smaller arrays than dual-parity if all drives must spin for writes.

 

Dual parity protects any pair of drives.  Multiple arrays only protect a single drive on each array.

 

Yup, and that's exactly what I'd rather have ... Multiple Arrays instead of dual parity with all drives spinning. If I wanted all drives spinning I'd switch over to zfs.

Link to comment

You need real numbers, for all aspects.  You need an idea what the development effort is going to be for each option.  You need numbers from NetApp and IBM for per-seat cost of 1000, 10000, 100000 clients.  And you need the number of truly committed users who will purchase it.

 

I can speculate on a $5 per user fee to patent holder, and $25 upgrade fee for unRAID users.  Once you have numbers from the patent holders and an idea how much you need to do it, then a poll can be set up to find out how many truly committed users would go for it, and from the poll results you can modify the plan accordingly.

 

I wonder if a KickStarter campaign would work?  That's real money committed.

 

Option 3b sounds like a very interesting project, but being the guinea pigs for an untested system does not seem attractive for us.  It feels really wrong to try to add data stability by adding unstable software.  Commitments to buying it would be very slow, as users wait for others to thoroughly test it.  And you have no idea what surprises will turn up until someone else has done it.  Later, if it proves fast and safe (and well tested), then it could be substituted in.

Link to comment

I have always thought that multiple arrays are a great idea, because you can do special things with each one, like set up a 2 disk array in RAID 1 for mission critical stuff, and other special uses.  But it's a little more complicated, as once set up, users are going to want User Shares to span arrays, and I have no idea how complicated that will be to develop.

Link to comment

I have always thought that multiple arrays are a great idea, because you can do special things with each one, like set up a 2 disk array in RAID 1 for mission critical stuff, and other special uses.

 

The original MD driver did this. The RAID-5 driver did this as well.

 

  But it's a little more complicated, as once set up, users are going to want User Shares to span arrays, and I have no idea how complicated that will be to develop.

 

Not really, if disk 1,2,3,4 are protected by parity #1 and disk 5,6,7,8 are protected by parity #2, then mounting disk 1-n the same way as it is done today would provide the user share spanning.
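
To make that concrete, here is a rough sketch of the idea, assuming the layout described above (disks 1-4 under parity #1, disks 5-8 under parity #2). The names `PARITY_GROUPS`, `group_parity` and `user_share_view` are hypothetical and not part of unRAID. The point is only that parity is computed per group while the share view is merged across every data disk, whichever group it belongs to.

```python
from functools import reduce

# Hypothetical layout from the post: two parity groups in one box.
PARITY_GROUPS = {"parity1": [1, 2, 3, 4], "parity2": [5, 6, 7, 8]}


def group_parity(blocks_by_disk, group_disks):
    """XOR the same-offset block of every disk in one parity group."""
    blocks = [blocks_by_disk[d] for d in group_disks]
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def user_share_view(files_by_disk):
    """Merge per-disk listings into one share view, ignoring parity groups."""
    view = {}
    for disk in sorted(files_by_disk):
        for share, names in files_by_disk[disk].items():
            view.setdefault(share, []).extend(names)
    return view
```

With this split, a write to disk 6 would only touch parity #2, while a share with files on disks 1 and 6 would still show up as a single listing.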

 

FWIW, when doing high speed loads with turbo write, it's almost as fast as a single drive.

So for some usage cases, this could prove to be a big benefit for certain sub-arrays.

Link to comment

Dual parity protects any pair of drives.  Multiple arrays only protect a single drive on each array.

 

Yup, and that's exactly what I'd rather have ... Multiple Arrays instead of dual parity with all drives spinning. If I wanted all drives spinning I'd switch over to zfs.

 

What is the problem with all drives spinning up once a day while mover runs?

 

All writes go to the (protected?) cache and the cache is flushed once per day (or twice a day, or once a week, or ...).

 

The only problem I can see is where an existing file, already moved to an array drive, is updated - a frequently modified database, for example .... ummm, just like the kodi database which is modified every time you start/stop playing a movie .... okay, I see your point!

Link to comment

While I don't like the requirement for all drives to be spun up for a write, I think that's easily outweighed by having dual fault-tolerance.  As already noted, this is easily mitigated by having a cache pool ... so "all drives spun up" is only a once/day event when Mover runs.  For read access, the current behavior, where only the drive being accessed is spun up, would still hold.

 

I think this is preferable to the multi-array concept, as with multiple arrays you're committing at least as many (if only two arrays) or even MORE (if more than 2 arrays) drives to fault-tolerance, yet still only getting single failure protection.  Granted you're getting "maybe more than one" fault tolerance ... as long as the "more than one" is on a different array than the first failure => but I'd think actual dual fault-tolerance is FAR preferable.
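
A quick way to put numbers on that comparison is to count which two-drive failures each layout survives. The sketch below assumes 8 data drives laid out either as one dual-parity array (10 drives total) or as two single-parity arrays of 4 data + 1 parity each; the layout and the function name `survivable_pairs` are chosen purely for illustration.

```python
from collections import Counter
from itertools import combinations


def survivable_pairs(array_sizes, failures_tolerated):
    """Count two-drive failures that lose no data.

    array_sizes: drives per array, parity drives included.
    failures_tolerated: failed drives each array can absorb.
    Returns (survivable, total) over all possible pairs of drives.
    """
    drives = [(a, d) for a, size in enumerate(array_sizes) for d in range(size)]
    survivable = total = 0
    for pair in combinations(drives, 2):
        total += 1
        per_array = Counter(array for array, _ in pair)
        if all(count <= failures_tolerated for count in per_array.values()):
            survivable += 1
    return survivable, total


print(survivable_pairs([10], 2))    # one array, dual parity: (45, 45)
print(survivable_pairs([5, 5], 1))  # two arrays, single parity: (25, 45)
```

Both layouts spend two drives on parity, but under these assumptions the split layout only survives the 25 pairs where the two failures land on different arrays.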

 

Reed-Solomon is a very tried-and-true technique [I wrote a Reed-Solomon package over 25 years ago] that works very well and is used by a lot of RAID-6 controllers.  But if NetApp's license fee isn't too onerous I don't think most folks would balk at paying an upgrade fee for a v7 with dual fault tolerance.  Actually I think that's true regardless of which technique you use to implement it ... but since the NetApp diagonal parity technique doesn't require all disks spinning, that's a big advantage.

 

I think you need to get a quote on the royalty fees before you can make a decision on this.

 

 

 

 

 

Link to comment

I’d gladly pay for one disk of extra redundancy.

 

Please don’t dismiss #1/#2 until you at least get a quote for the royalties requested. It would be awesome not to have the entire array spun up. Not a deal breaker for me, though, but it would be really nice.

 

Given my writing pattern I’d rather have dual fault-tolerance and all drives spinning while writing than single fault-tolerance and mostly spun down drives. There's always the cache pool if it bothers me...

Link to comment

Depending on the licensing costs, and how granular they are (i.e., will you have to absorb upfront costs or can you 'pay as you go' per implementation), you may have a path to another tier of unraid licensing.

 

Pay more for the 'Unraid 6 Double Protection' license to unlock the feature - and that uplift cost covers your backend licensing fees and a little on top for your trouble. It may be low volume in terms of sales but that might not matter. Or, if it doesn't need all disks to spin up, it may be a very popular license option for customers.

 

Or the backend licensing fees could be so low that it can just be rolled into the unraid base without any fuss and the general unraid license cost increased by a small amount across the board.

 

Charging a fee for a new unraid license or unraid upgrade come version 7 for existing users (presuming this feature would be included) might not cause any problems. If you'd charged again for version 6 I would have happily paid given the improvement in feature set. Something like this would bring enough additional value to the product that I would find it reasonable to pay for v7 if necessary.

Link to comment

If a license fee needs to be paid, then I assume it could be handled rather like the trial license is at the moment.  In other words users would have a way via the unRAID GUI to get an extra chargeable key from LimeTech web site before the Dual Parity option started working.  Done that way both new and existing users would be treated the same, and they then have the option of whether to pay the cost of unlocking the Dual Parity feature.

Link to comment

Ditto on paying for dual, if I even needed it (I don't at the moment).

 

[pointy stick time] Honestly this entire thread is just premature without that cost data.[/pointy stick time]

 

The thing to also consider is how hard it might be from a maintenance POV to offer the ability to still run in single parity mode.

Link to comment

The thing to also consider is how hard it might be from a maintenance POV to offer the ability to still run in single parity mode.

I think the ability to do this will be essential as those with very small arrays will not want to dedicate two drives to parity.
Link to comment

I would gladly pay for a dual parity key/server (one disk of extra redundancy)

I would gladly pay a PREMIUM for this key -> as much as the pro key itself and I would not hesitate.

It would be nice to not have all the disks spin up, but given the use of a cache pool not required.

Link to comment

I agree with Gary (what am I saying  :-X)

 

If you don't need to spin up the entire array, that is a plus, but even if you do, the feature would still be very valuable.

 

A few baby steps toward approaching NetApp and IBM might be in order before closing the door on those options.

 

On the idea of separately protected smaller arrays within a single box, I am lukewarm on that. There were three primary use cases for so called dual parity:

1 - a two drive failure;

2 - a second drive fails while rebuilding a failed disk; and

3 - an unexplained parity error and we want to triangulate the responsible disk.
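
For use case 3, here is a hedged sketch of how P and Q together could point at the responsible disk. It assumes a Reed-Solomon style Q over GF(2^8) (polynomial 0x11d, generator 2, as in Linux md) and that at most one disk holds wrong data; `locate_bad_disk` and the other names are made up for illustration. If data disk z is off by some error e, the P mismatch equals e and the Q mismatch equals g^z * e, so comparing the two mismatches reveals z.

```python
# Sketch: find the single disk responsible for a parity mismatch.
# Assumptions: GF(2^8) with polynomial 0x11d, generator 2, and at most
# one disk (data, P, or Q) holding wrong data. Names are illustrative.

def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x^2 + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result


def gf_pow(base: int, exp: int) -> int:
    """Raise a field element to a non-negative integer power."""
    result = 1
    for _ in range(exp):
        result = gf_mul(result, base)
    return result


def compute_pq(data_disks):
    """Return (P, Q) for a list of equal-length data blocks."""
    size = len(data_disks[0])
    p, q = bytearray(size), bytearray(size)
    for index, block in enumerate(data_disks):
        coeff = gf_pow(2, index)
        for offset, byte in enumerate(block):
            p[offset] ^= byte
            q[offset] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def locate_bad_disk(data_disks, stored_p, stored_q):
    """Return 'P', 'Q', a data-disk index, or None if all is consistent."""
    p_now, q_now = compute_pq(data_disks)
    for offset in range(len(stored_p)):
        sp = stored_p[offset] ^ p_now[offset]  # P mismatch at this byte
        sq = stored_q[offset] ^ q_now[offset]  # Q mismatch at this byte
        if sp == 0 and sq == 0:
            continue
        if sp == 0:
            return "Q"  # only Q disagrees, so Q itself is wrong
        if sq == 0:
            return "P"  # only P disagrees, so P itself is wrong
        for z in range(len(data_disks)):
            if gf_mul(gf_pow(2, z), sp) == sq:
                return z
        raise ValueError("mismatch does not fit a single-disk error")
    return None
```

With single parity only the P mismatch is available, which says that something is wrong but not where; the second syndrome is what makes the triangulation possible.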

 

Smaller arrays do not address any of these, although they would reduce the likelihood of the 1st and 2nd, I suppose.  The advantage is you could use two different sized parity drives - a small one protecting smaller disks, and a larger one protecting larger disks. But then I think about upsizing a disk - now you need to move it to the other array - it just becomes unnecessarily complicated IMHO.

Link to comment

I agree with Gary as well.

For arrays with a protected cache drive, it's a small price to pay for spinning up all drives.

 

I presented the idea for smaller arrays as food for thought and to garner opinions on the pros and cons of the options.

 

I am just as interested as other parties as to the royalties for the other options.

If the royalty/upgrade fee isn't cost prohibitive, this sets unRAID apart from many other solutions.

Link to comment

Archived

This topic is now archived and is closed to further replies.

