Support for more drives?



24 minutes ago, itimpi said:

I am not an expert on the maths behind this, but I believe there is a need for different algorithms to handle multiple disk failures, so you can identify which content belongs on which drive.

Yeah, I'm no expert either, but I don't see why this wouldn't work, since it could behave just like one drive, only mirrored. Someone smarter should explain if they come across this post. I may try some experiments with a RAID card.
Edit: Yeah, practically that would only work if the parity drives failed, not the drives with data. That's a pity.

Edited by Risino15
45 minutes ago, Risino15 said:

Maybe a stupid question, but what if the third drive and up were just RAID 1? Is there a limitation why parity drives cannot just be mirrored?

 

EDIT: So theoretically it would be possible to add a HW RAID-1 array to the parity drive to "create" more parity drives?

 

You can create mirrored parity drives.

But it's better to use a different algorithm for each parity instance, since that makes the parity drives independent of each other.

And making each drive independent means that a system with n parity drives can handle any n-drive failure. With some disks mirrored, you get different recovery ability depending on which specific drives failed.

 

There already exist known algorithms (Reed-Solomon style erasure codes) that scale well past two parity drives, so there's no reason to start with mirrored disks.
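To make that concrete, here's a minimal sketch (my own Python illustration, not unRAID's or md's actual code) of the RAID-6-style idea: P is plain XOR parity, while Q weights disk i by 2^i in GF(2^8). Because the two parity equations are linearly independent, you can solve for any two lost data disks:

```python
# Sketch of P+Q parity (RAID-6 style) over GF(2^8). Illustrative only.
GF_POLY = 0x11D  # standard reducing polynomial for GF(2^8)

def gf_mul(a: int, b: int) -> int:
    """Carry-less 'Russian peasant' multiply in GF(2^8)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= GF_POLY
        b >>= 1
    return r

def gf_pow(a: int, n: int) -> int:
    r = 1
    for _ in range(n):
        r = gf_mul(r, a)
    return r

def gf_inv(a: int) -> int:
    return gf_pow(a, 254)  # a^-1 = a^(2^8 - 2) in GF(2^8)

data = [0x11, 0x22, 0x33, 0x44]  # one byte per data disk

P = 0                            # parity 1: plain XOR
Q = 0                            # parity 2: disk i weighted by 2^i
for i, d in enumerate(data):
    P ^= d
    Q ^= gf_mul(gf_pow(2, i), d)

# Simulate losing data disks 1 and 3 at the same time.
i, j = 1, 3
dP, dQ = P, Q
for k, d in enumerate(data):
    if k not in (i, j):
        dP ^= d                        # leaves data[i] ^ data[j]
        dQ ^= gf_mul(gf_pow(2, k), d)  # leaves 2^i*data[i] ^ 2^j*data[j]

# Two independent equations, two unknowns -> solvable:
gi, gj = gf_pow(2, i), gf_pow(2, j)
y = gf_mul(dQ ^ gf_mul(gi, dP), gf_inv(gi ^ gj))  # rebuilt data[j]
x = dP ^ y                                        # rebuilt data[i]
assert (x, y) == (data[i], data[j])
print("recovered both disks:", hex(x), hex(y))
```

With mirrored parity instead (two copies of P), losing two data disks gives you two copies of the same single equation, which can't be solved for two unknowns - exactly the asymmetry Risino15 noticed in the edit above.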

 

Anyway - many parity drives matter most for a traditional, striped RAID, where the loss of n+1 drives means total loss of all data in the RAID, and the system must stay offline until everything has been restored from backup. And it really isn't good to build a single array with a huge number of disks: the probability of at least one drive failing grows with the number of drives, as does the probability of further disks failing during a rebuild. A larger single data pool also means more work to keep backed up - since parity isn't a replacement for backup.
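A back-of-envelope illustration (my own assumed numbers, purely for scale): if each drive independently has a 1% chance of failing during a rebuild window, the chance that at least one of n drives fails is 1 - 0.99^n, which climbs quickly:

```python
# Assumed 1% per-drive failure probability during one rebuild window.
p = 0.01
for n in (4, 12, 24, 60):
    print(f"{n:2d} drives: {1 - (1 - p) ** n:5.1%} chance of >=1 failure")
```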

 

Next thing - unRAID can only perform one write at a time. If you write to multiple data disks, unRAID has to multiplex the writes, since every write must also update every parity disk. So unRAID scales badly when it comes to filling huge arrays.
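A tiny model of why that is (my own simplification, not the real md driver code): every write is a read-modify-write cycle that must touch the shared parity disk, so writes to different data disks still queue up on parity:

```python
class Disk:
    """Toy block device: just an addressable byte array."""
    def __init__(self, size: int):
        self.blocks = bytearray(size)
    def read(self, i: int) -> int:
        return self.blocks[i]
    def write(self, i: int, v: int) -> None:
        self.blocks[i] = v

parity = Disk(8)
data_disks = [Disk(8) for _ in range(3)]

def write_block(disk: Disk, i: int, new: int) -> None:
    old = disk.read(i)                  # 1. read old data
    old_p = parity.read(i)              # 2. read old parity  <- shared disk
    parity.write(i, old_p ^ old ^ new)  # 3. update parity    <- shared disk
    disk.write(i, new)                  # 4. write new data

# "Parallel" writes to two different data disks still both funnel
# through steps 2-3 on the single parity drive, so they serialize.
write_block(data_disks[0], 0, 0xAA)
write_block(data_disks[1], 0, 0xBB)
assert parity.read(0) == 0xAA ^ 0xBB
```

With two parity drives, steps 2-3 repeat for each of them, which is the "every write must also update every parity disk" cost.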

 

The way to scale a system to more disks is to have more arrays. When an enterprise needs a single file system to be huge, they either use arrays of arrays - an array where every "disk" is itself a fault-tolerant array - or they go the other route and use a clustered file system, where each disk is totally stand-alone but a database layer breaks the file content into blocks and writes each block to n different disks. If one disk fails, many of the other disks hold small parts of its information, so each surviving drive takes on a slightly higher load for reads/writes until the system has made one more copy of all the affected blocks and returned to n-way redundancy for each block.
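A rough sketch of that block-placement idea (my own toy version, loosely modeled on how clustered stores like Ceph spread replicas; all names are invented):

```python
import hashlib

DISKS = [f"disk{n}" for n in range(8)]   # stand-alone disks in the cluster
REPLICAS = 3                             # n-way redundancy per block
BLOCK = 4                                # absurdly small block, demo only

def place(block_id: bytes) -> list[str]:
    """Pick REPLICAS distinct disks for one block, pseudo-randomly."""
    h = int.from_bytes(hashlib.sha256(block_id).digest()[:8], "big")
    start = h % len(DISKS)
    return [DISKS[(start + k) % len(DISKS)] for k in range(REPLICAS)]

data = b"movie-file-contents!"
for off in range(0, len(data), BLOCK):
    print(f"block @{off:2d} -> {place(b'file1:%d' % off)}")
```

Because placement is spread pseudo-randomly, losing one disk leaves the surviving copies of its blocks scattered over many disks, so the re-replication load is shared - the effect described above.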

 

Anyone who wants to run a clustered file system can - there are such file systems available and supported by Linux. But they aren't optimal for home users.

 

The best and most natural route for unRAID to support larger storage pools, while keeping the current concept and staying compatible with existing installations, is to add support for multiple arrays of arbitrary type.

 

So if you want a two-disk mirror for your source code and document store, then you add a two-disk SSD RAID-1 for that task. Basically similar to a current cache pool.

If you want a temporary area with a high transfer rate for streamed data, then potentially a four-disk striped RAID-5 of HDDs.

If someone wants 1000 TB of movies, then add 10 classical unRAID arrays of 10+2 disks of 10 TB each. Single-disk reads are still possible since there is no striping, but you get 10 different arrays that can each be written to at full disk speed, assuming the machine has enough network capacity. And you keep the advantage that a single movie can be served with a single data disk spinning.
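Purely hypothetical sketch of what such a multi-array configuration could look like (unRAID has no such feature today; every name and field below is invented), including a sanity check of the 1000 TB figure:

```python
pools = [
    {"name": "docs",    "type": "raid1", "disks": 2, "media": "ssd"},
    {"name": "scratch", "type": "raid5", "disks": 4, "media": "hdd"},
] + [
    {"name": f"movies{n}", "type": "unraid", "disks": 12, "parity": 2,
     "disk_tb": 10, "media": "hdd"}
    for n in range(10)
]

# 10 unRAID arrays x (12 - 2) data disks x 10 TB = 1000 TB usable.
usable = sum((p["disks"] - p["parity"]) * p["disk_tb"]
             for p in pools if p["type"] == "unraid")
print(usable, "TB")  # -> 1000 TB
```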

 

There is a step to take to go from one array to two. But a system that can handle two arrays can just as well handle three or four. And it's just a question of configuration whether the files of two or three arrays should be unified and presented as a single user share, or whether a user chooses to dedicate one array to "movies" and another to "tv-series".
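For the unified case, the mechanics could be as simple as a union over per-array mounts. A hedged sketch (my own, with invented paths; unRAID's real user-share layer works differently in detail):

```python
from pathlib import Path

ARRAY_ROOTS = [Path("/mnt/array1"), Path("/mnt/array2")]  # hypothetical mounts

def list_share(share: str) -> list[Path]:
    """Present one share as the union of that folder across all arrays."""
    merged: dict[str, Path] = {}
    for root in ARRAY_ROOTS:
        d = root / share
        if d.is_dir():
            for f in sorted(d.iterdir()):
                merged.setdefault(f.name, f)  # first array wins on name clash
    return list(merged.values())

# A user who instead dedicates arrays just maps "movies" to one array only.
print(list_share("movies"))
```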

 

The only concept that is a bit problematic to handle is /mnt/user0/ and /mnt/user/, because with multiple arrays it isn't obvious any more what should be seen as cache. Maybe the system has two system-wide cache volumes, maybe zero, or maybe each unRAID array has its own cache volume.

In the end, it should be quite easy to extend unRAID with multiple arrays.

While it isn't obvious how to extend unRAID with multiple cache volumes.

1 hour ago, pwm said:

it's just a question of configuration whether the files of two or three arrays should be unified and presented as a single user share.

I think they should be unified, and the user can create shares like "movies" etc. Separate arrays would maybe make sense if the user wants Purple drives for an NVR and different drives for everything else (though I don't know if it's a great idea to run an NVR on unRAID, because of the speed limits).

 

1 hour ago, pwm said:

 

The only concept that is a bit problematic to handle is /mnt/user0/ and /mnt/user/, because with multiple arrays it isn't obvious any more what should be seen as cache. Maybe the system has two system-wide cache volumes, maybe zero, or maybe each unRAID array has its own cache volume.

There can be just one cache array for everything, and the user can choose which shares use the cache, like they can now.

 

1 hour ago, pwm said:

In the end, it should be quite easy to extend unRAID with multiple arrays.

While it isn't obvious how to extend unRAID with multiple cache volumes.

I hope they will implement this in the future; the cache can also be solved pretty easily, as I wrote above.

Thanks for your very informative response, it helped me understand a lot!

2 hours ago, Risino15 said:

I think they should be unified, and the user can create shares like "movies" etc.


You don't want obligatory unification - you want the user to be able to decide, since different users have different needs. That's why I wrote:

 

4 hours ago, pwm said:

it's just a question of configuration whether the files of two or three arrays should be unified and presented as a single user share

 

I think the biggest problem with supporting more arrays is licensing and Linux compatibility.

 

On an unRAID system, /proc/mdstat emits state info for one unRAID array.

On a traditional system /proc/mdstat emits state info for an arbitrary number of standard Linux arrays.

 

The traditional software RAID support in Linux supports [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10].
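For reference, on a stock Linux box /proc/mdstat lists every md array, looking roughly like this (illustrative output, not captured from a real machine):

```
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks super 1.2 [2/2] [UU]

md1 : active raid5 sdd1[2] sdc1[1] sdb2[0]
      1953260544 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/3] [UUU]

unused devices: <none>
```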

 

It isn't obvious that it's possible to run both unRAID and classical arrays on the same machine - and since LimeTech's implementation is closed source, it isn't possible to just merge the functionality. mdadm and other standard software RAID functionality is GPL-licensed. So with the exception of the RAID functionality implemented in the BTRFS code, LimeTech may have to write its own code to support RAID-1 for other file systems.

