Newbie Questions about parity



I am new to this, so I apologize if I missed an existing post about this.
I have a few questions that I have not found a clear, clean-cut answer to.
Rather than posting many separate questions, I thought I would post once to get a few answers.

I am on version 6.3.3. The plan for my box is file storage, letting my ESXi server copy VM snapshots to this backup storage server.
In other words, an off-site backup server.


Questions

1. I think I know what the parity disk is. This disk has the mathematical formulas used to work out what data is missing when a data drive is lost or damaged.
Swap in a new drive and it rebuilds all the data that was lost. Is this correct?
 

2. Does the parity drive need to be as big as the full data array?
I have seen some posts say it should be as big as or bigger than your largest disk, but others say it must be the size of the data array or bigger.
If the array is 6TB, do I need a 6TB parity drive?

3. In 6.3.3 the system lets you add more than one parity drive. What is the point?
A) Is the second parity drive mirroring the first, letting you lose one drive?
B) Do two drives offer more parity disk space?


4. How do I know if the parity disk is getting full?

5. I have seen people say not to use an SSD as a parity disk.
Other than the cost, is there a technical reason? What about SATA drives with large caches, like 64MB, or a hybrid drive? Example: 2TB 7500RPM with 8GB internal SSD cache.

6. If you can use a hybrid drive with an 8GB cache, do I need a cache disk?


7. Does unRAID have a way to sync data between two servers? Example: two servers on the same LAN that work to stay in sync.

8. Is there a way to have a hot spare disk that can jump in as a data disk or a parity disk?

 


1. Correct

2. The parity drive has to be as large as or larger than the largest data drive in the array. So if your largest drive is 6TB, the parity drive has to be at least 6TB.

3a. The point of more than one parity drive is to protect you from a second drive failure when you have already experienced one drive failure. During the rebuild of your first failed drive, a second drive could fail; with a second parity drive you are protected.

3b. Not more parity space, but more parity protection. I think that is what you mean.

4. The parity disk is not like a data disk in terms of space. It does not fill up, so you don't need to worry about this.

5. I don't know the reason for not using an SSD for parity, other than that they get very expensive as they get larger, and I think right now they top out at 4TB.

6. A cache disk is very useful if you want to use Dockers, and it is also a temporary storage location for data being copied to the array.

7. Not natively, but there are programs you can use for this, like rsync.

8. Not sure what you mean by this. I associate the concept of a 'hot spare' with hardware RAID; as far as I know, it does not apply to unRAID's software RAID.

 

 



#2 & 4: I don't understand why the parity disk must match the largest data disk, but I have a feeling this subject may get really deep. I am better off just saying OK and moving on. :)


#8: I was thinking of a hot spare where you have a disk that is not added to the data array or to parity. For example, let's say I am out of town for the weekend and a disk goes bad. unRAID would see the disk was lost, take the hot spare, add it to the array, and then start rebuilding the data. When I come back, I pull the bad drive out and put a new drive in, and it is now the spare.

7 hours ago, zdude said:


#2 & 4: I don't understand why the parity disk must match the largest data disk, but I have a feeling this subject may get really deep. I am better off just saying OK and moving on. :)


#8: I was thinking of a hot spare where you have a disk that is not added to the data array or to parity. For example, let's say I am out of town for the weekend and a disk goes bad. unRAID would see the disk was lost, take the hot spare, add it to the array, and then start rebuilding the data. When I come back, I pull the bad drive out and put a new drive in, and it is now the spare.

 

Actually #2 & 4 turn out to be simple when you realise that the parity disk has no concept of data. All it understands is physical sectors. Sector 'n' on the parity disk therefore holds the result of the parity calculation over sector 'n' of all the data disks put together. If a data disk is smaller than the parity disk, then sectors beyond its end are treated as logical sectors containing all zeroes. That should help make it clear why the parity disk can never be smaller than the largest data disk.

 

This has a few consequences:

- Parity is completely file system independent as sectors exist at a lower level than the file system.

- When a disk fails, the rebuild process uses parity plus the remaining data disks to reconstruct each sector on the replacement disk to the state that parity thinks should be there. It is NOT restoring data, just sectors.

- If a write to a file system corrupts a sector and the write gets reflected in parity, then the parity system is unaware of the corruption. In such a case a rebuild to another disk would merely recreate the corrupted file system. Recovering from corrupt file systems is the purpose of the various file system check utilities, which know how to traverse the file systems to put them back into a good state.

- If you 'format' a disk then parity is not aware of this, as it only works at the physical sector level. It merely sees a format as a sequence of write operations and updates parity accordingly.
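
To make the sector-level view above concrete, here is a minimal Python sketch. It assumes single parity is a plain XOR across the disks, and it uses one byte to stand in for one sector. The function name and the tiny "disks" are made up for illustration; this is not unRAID's actual code.

```python
from functools import reduce

def compute_parity(disks):
    """XOR all data disks position by position.

    Disks shorter than the largest one are padded with zeroes, which is
    the 'logical sector containing all zeroes' idea described above.
    """
    size = max(len(d) for d in disks)
    padded = [d.ljust(size, b"\x00") for d in disks]
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*padded))

# Three hypothetical data disks of different sizes (4, 2 and 3 "sectors").
disks = [
    bytes([0x0F, 0xF0, 0xAA, 0x55]),
    bytes([0xFF, 0x00]),
    bytes([0x01, 0x02, 0x03]),
]
parity = compute_parity(disks)

# Parity is exactly as long as the largest data disk, never the sum of all disks.
print(len(parity), parity.hex())
```

Note that nothing in the calculation knows about files or file systems. It only sees positions on the disks, which is why a format or a corrupted write is just another set of sector changes as far as parity is concerned.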

 

Regarding 8): unRAID cannot automatically swap in a replacement disk if one fails. You CAN have one plugged into your system and, if a disk fails, stop the array, assign the replacement disk, and restart the array to rebuild onto it. Since unRAID 'emulates' a failed disk or a disk being rebuilt, your data remains available during this process (except for the short period where you stopped and restarted the array), so you can continue to operate as normal. Many people like to avoid updating the array while a rebuild is taking place, although in fact unRAID can handle this quite happily (writing new data may cause a slight performance hit on the rebuild due to drive contention effects).

9 hours ago, zdude said:

1. I think I know what the parity disk is. This disk has the mathematical formulas used to work out what data is missing when a data drive is lost or damaged.
Swap in a new drive and it rebuilds all the data that was lost. Is this correct?

While it is true that parity allows you to rebuild a data drive, the explanation of parity given in your question is vague and is probably the source of some of the misconceptions that lead to some of your other questions.

 

The parity disk doesn't contain any mathematical formulas. The parity calculation is part of the unRAID software. Parity contains parity bits that the software uses with the bits on all the other drives to allow it to calculate the data of a missing disk. Without the other disks, parity can't calculate anything.
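
As a rough illustration of that last point, the sketch below rebuilds a missing disk from parity plus the surviving disks. It assumes plain XOR parity and uses short byte strings as stand-in disks; the helper names are hypothetical and this is not how the unRAID driver is actually written.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR byte strings together, zero-padding the shorter ones."""
    size = max(len(b) for b in blocks)
    padded = [b.ljust(size, b"\x00") for b in blocks]
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*padded))

def rebuild(parity, surviving_disks, missing_size):
    """Recompute a missing disk from parity and ALL the surviving disks."""
    return xor_blocks([parity] + surviving_disks)[:missing_size]

disk1 = b"unRAID"
disk2 = b"parity demo"
disk3 = b"xyz"
parity = xor_blocks([disk1, disk2, disk3])

# disk2 "fails": parity plus disk1 and disk3 give its contents back.
rebuilt = rebuild(parity, [disk1, disk3], len(disk2))
print(rebuilt)

# Parity on its own is just bits; without the other disks it is not disk2.
print(parity == disk2)
```

The parity bytes only become meaningful when combined with every other disk, which is why losing more disks than you have parity drives means the missing ones cannot be calculated.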

 

9 hours ago, zdude said:

2. Does the parity drive need to be as big as the full data array?
I have seen some posts say it should be as big as or bigger than your largest disk, but others say it must be the size of the data array or bigger.
If the array is 6TB, do I need a 6TB parity drive?

Since the parity calculation uses all of the other disks plus parity, it doesn't need to be as large as the array, just as large as the largest data disk.

 

9 hours ago, zdude said:

3. In 6.3.3 the system lets you add more than one parity drive. What is the point?
A) Is the second parity drive mirroring the first, letting you lose one drive?
B) Do two drives offer more parity disk space?

Each additional parity disk allows the parity calculation to succeed with one fewer remaining data disk. So unRAID v6 with 2 parity disks allows the data from 2 missing disks to be calculated from parity plus all the remaining disks.
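
To show why a second parity disk is not just a mirror of the first, here is a small sketch of a two-parity scheme. Big assumption: this thread does not say how unRAID computes its second parity, so the code uses the generic RAID-6 style P+Q construction (P is a plain XOR, Q is a weighted sum in the finite field GF(2^8)) purely as an illustration. All names are made up, and disks are assumed to be padded to equal length.

```python
def gf_mul(a, b):
    """Multiply two bytes in GF(2^8), reduction polynomial 0x11D."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return result

def gf_pow(a, n):
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result

def gf_inv(a):
    # Every nonzero element satisfies a^255 == 1, so a^254 is its inverse.
    return gf_pow(a, 254)

def compute_pq(disks):
    """Return (P, Q) for equal-length data disks."""
    p, q = bytearray(len(disks[0])), bytearray(len(disks[0]))
    for i, disk in enumerate(disks):
        weight = gf_pow(2, i)          # a different weight per disk slot
        for pos, byte in enumerate(disk):
            p[pos] ^= byte
            q[pos] ^= gf_mul(weight, byte)
    return bytes(p), bytes(q)

def recover_two(disks, x, y, p, q):
    """Recover data disks x and y (entries may be None) from P, Q and the rest."""
    pxy, qxy = bytearray(p), bytearray(q)
    for i, disk in enumerate(disks):
        if i in (x, y):
            continue
        weight = gf_pow(2, i)
        for pos, byte in enumerate(disk):
            pxy[pos] ^= byte                    # leaves Dx ^ Dy
            qxy[pos] ^= gf_mul(weight, byte)    # leaves gx*Dx ^ gy*Dy
    gx, gy = gf_pow(2, x), gf_pow(2, y)
    scale = gf_inv(gx ^ gy)
    dx = bytes(gf_mul(scale, qv ^ gf_mul(gy, pv)) for pv, qv in zip(pxy, qxy))
    dy = bytes(pv ^ dv for pv, dv in zip(pxy, dx))
    return dx, dy

data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
p, q = compute_pq(data)
damaged = [data[0], None, data[2], None]        # disks 1 and 3 are gone
d1, d3 = recover_two(damaged, 1, 3, p, q)
print(d1, d3)
```

Because P and Q are two different equations over the same data, two unknown disks can be solved for. A mirrored copy of P would only ever give you one equation.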

 

9 hours ago, zdude said:

4. How do I know if the parity disk is getting full?

One way to think about this is that parity is always completely full of parity bits, never gets any fuller, and doesn't need to, since it is large enough to allow the parity calculation of the largest data disk.

 

Here is a good illustration of how parity works:

https://lime-technology.com/wiki/index.php/UnRAID_6/Overview#Parity-Protected_Array

 

9 hours ago, zdude said:

6. If you can use a hybrid drive with an 8GB cache, do I need a cache disk?

The cache disk is separate from all other disks, and has multiple, specific uses.

 

9 hours ago, zdude said:

7. Does unRAID have a way to sync data between two servers? Example: two servers on the same LAN that work to stay in sync.

Nothing built in, but there are many ways to accomplish this with add-ons.
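
As one possible approach, rsync (mentioned earlier in the thread) can push a share from one server to the other on a schedule. The sketch below only wraps the rsync command line from Python. The host name `backup-server` and the `/mnt/user/backups` path are hypothetical placeholders, and it assumes rsync is installed on both machines and that the first server can reach the second over SSH.

```python
import subprocess

def build_rsync_command(source_dir, dest_host, dest_dir, delete=False, dry_run=False):
    """Build an rsync command that copies source_dir to dest_host:dest_dir."""
    cmd = ["rsync", "--archive", "--verbose"]
    if delete:
        cmd.append("--delete")    # also remove files on the target that no longer exist on the source
    if dry_run:
        cmd.append("--dry-run")   # report what would change without copying anything
    # A trailing slash on the source means "the contents of this directory".
    cmd += [source_dir.rstrip("/") + "/", f"{dest_host}:{dest_dir}"]
    return cmd

def sync(source_dir, dest_host, dest_dir, **options):
    """Run the sync and raise an error if rsync reports a failure."""
    return subprocess.run(build_rsync_command(source_dir, dest_host, dest_dir, **options), check=True)

# Hypothetical example: preview a one-way sync of a backup share.
print(" ".join(build_rsync_command("/mnt/user/backups", "backup-server",
                                   "/mnt/user/backups", delete=True, dry_run=True)))
```

This is a one-way copy, not a two-way live sync. Running it with `dry_run=True` first is a sensible habit before letting `delete=True` remove anything on the target.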

 

9 hours ago, zdude said:

8. Is there a way to have a hot spare disk that can jump in as a data disk or a parity disk?

unRAID will never automatically replace a disk, but that is actually a good thing, since it allows you to decide how best to take care of the problem.

 

One thing to remember about unRAID, unlike traditional RAID, each data disk is independent. As long as you don't have more missing disks than parity allows, all data is readable and writable, including a missing disk (emulated with parity calculation). And even if you get more missing disks than parity allows, the other disks aren't affected.
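
Here is a toy model of how a missing disk can stay readable and writable. It assumes single XOR parity and equal-sized disks, and the class and method names are invented for the example. A read of the missing disk is calculated from parity plus the survivors, and a write to it is stored by updating parity so that the next calculation returns the new contents.

```python
from functools import reduce

def xor_bytes(*blocks):
    """XOR equal-length byte strings together."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

class DegradedArray:
    """Toy single-parity array; a disk set to None is missing and emulated."""

    def __init__(self, disks):
        self.disks = list(disks)
        self.parity = xor_bytes(*disks)

    def fail(self, index):
        self.disks[index] = None

    def _survivors(self):
        return [d for d in self.disks if d is not None]

    def read(self, index):
        if self.disks[index] is not None:
            return self.disks[index]
        # Emulated read: calculate the missing disk from parity + survivors.
        return xor_bytes(self.parity, *self._survivors())

    def write(self, index, data):
        if self.disks[index] is None:
            # Emulated write: only parity changes, chosen so that a later
            # read (or rebuild) of this disk yields the new data.
            self.parity = xor_bytes(data, *self._survivors())
        else:
            self.parity = xor_bytes(self.parity, self.disks[index], data)
            self.disks[index] = data

array = DegradedArray([b"AAAA", b"BBBB", b"CCCC"])
array.fail(1)
before = array.read(1)       # still readable while missing
array.write(1, b"ZZZZ")      # and writable
after = array.read(1)
print(before, after)
```

A later rebuild onto a replacement disk would simply write out what `read(1)` returns, so changes made while the disk was missing are not lost.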

9 hours ago, zdude said:

#2 & 4: I don't understand why the parity disk must match the largest data disk, but I have a feeling this subject may get really deep. I am better off just saying OK and moving on. :)

Just thought I would add a comment about this.

 

Parity isn't particularly deep. It is much, much simpler than file compression and codecs, for example. From reading some comments on this forum, it seems some have the idea that the parity disk magically has a copy of all the other disks, like it has compressed them all down (even though many files on the disks are already compressed) to fit on a disk not nearly large enough.

 

If you take a look at that wiki I linked, and take a little trouble to understand, everything about how unRAID works and operates makes a lot more sense.

 

When you have a problem or just need to change something in the array, there are instructions for a lot of situations already in the wiki. You can get more instructions by asking on the forum, and I encourage everyone to do so.

 

But if you understand parity, you will understand why those instructions work. So, you will better understand (and not misunderstand) the instructions and be less likely to make mistakes that make things worse. The reason some people can even write those instructions in the first place is because they understand parity.

  • 8 months later...

#5: The reason not to use an SSD for parity is technical. SSDs have a life cycle based on the number of write/erase cycles for each data cell. Most SSDs use multi-level cells, and those SSDs have a maximum write/erase count of about 10,000 for each cell. This is approximate; it could be more or it could be less. Each time something is written to any data drive, the parity is recalculated and then written back to the parity drive. The parity drive therefore sees a very large number of write/erase cycles, and this would shorten the life of your SSD a lot if you used it as a parity drive.
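
A small simulation of that write pattern, assuming simple XOR parity with a read-modify-write update (new parity = old parity XOR old data XOR new data). The class, the disk count and the write counts are invented for the example; the point is only that the parity drive receives a write for every write to any data drive.

```python
class ParityArray:
    """Toy array that counts how many writes land on each drive."""

    def __init__(self, num_disks, sectors):
        self.data = [[0] * sectors for _ in range(num_disks)]
        self.parity = [0] * sectors
        self.data_writes = [0] * num_disks
        self.parity_writes = 0

    def write(self, disk, sector, value):
        old = self.data[disk][sector]
        # Read-modify-write: cancel the old value out of parity, fold the new one in.
        self.parity[sector] ^= old ^ value
        self.data[disk][sector] = value
        self.data_writes[disk] += 1
        self.parity_writes += 1      # parity is touched on every data write

array = ParityArray(num_disks=3, sectors=8)
for s in range(4):
    array.write(0, s, s + 1)
for s in range(3):
    array.write(1, s, 0x10 + s)
for s in range(5):
    array.write(2, s, 0x20 + s)

print(array.data_writes, array.parity_writes)
```

In this example the three data drives take 4, 3 and 5 writes, while the parity drive takes all 12.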

2 hours ago, sebcote80 said:

#5: The reason not to use an SSD for parity is technical. SSDs have a life cycle based on the number of write/erase cycles for each data cell. Most SSDs use multi-level cells, and those SSDs have a maximum write/erase count of about 10,000 for each cell. This is approximate; it could be more or it could be less. Each time something is written to any data drive, the parity is recalculated and then written back to the parity drive. The parity drive therefore sees a very large number of write/erase cycles, and this would shorten the life of your SSD a lot if you used it as a parity drive.

SSDs storing two bits per cell often manage about 10k erase cycles per block.

 

Newer SSDs storing three bits per cell often manage only 300-400 erase cycles per block. So a 1TB SSD might support 300-400 TB of total writes. Actual figures can be even lower, depending on write amplification.
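
The arithmetic behind those figures, as a quick sketch. The function names, the write amplification factor of 2 and the 0.5 TB/day workload are assumptions picked for illustration, not measurements.

```python
def total_write_endurance_tb(capacity_tb, erase_cycles, write_amplification=1.0):
    """Rough total host writes (in TB) before the cells reach their rated erase count."""
    return capacity_tb * erase_cycles / write_amplification

def years_of_life(endurance_tb, writes_tb_per_day):
    """How long that endurance lasts at a steady daily write volume."""
    return endurance_tb / writes_tb_per_day / 365

low = total_write_endurance_tb(1, 300)        # 1 TB drive, 300 cycles
high = total_write_endurance_tb(1, 400)       # 1 TB drive, 400 cycles
amplified = total_write_endurance_tb(1, 300, write_amplification=2.0)

print(low, high, amplified)
# Hypothetical workload: 0.5 TB written to the array (and so to parity) per day.
print(round(years_of_life(low, 0.5), 2))
```

With a write amplification of 2, the same drive only supports half as many TB of host writes, which is why the real-world number can come in below the simple capacity-times-cycles estimate.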
