Single-disk corruption detection on dual parity arrays



I would like to request a feature: the ability for the parity check to (optionally) perform single-disk corruption detection when working on a dual-parity array.  That kind of parity check would likely come at the cost of a serious performance hit (hence optional), but it would be priceless in certain situations.


A discussion on the matter occurred in this thread:


https://forums.unraid.net/topic/2557-article-why-raid-5-stops-working-in-2009/?do=findComment&comment=882947


This feature could even be implemented (for starters) as a report-only mode -- i.e., the driver would not have to make any decisions about what to do with the wrong byte/sector it found -- just report it to the syslog and move on.


Thank you.
 


For those of you who are not sure what exactly I am suggesting, let me explain a bit.  When a parity check runs into a mismatched byte at a certain byte position, in 99% of cases the reasonable thing to do is to just sync the parity disks.  But there are some situations, albeit rare, where only a single disk is mismatched at that byte position.  Dual parity can be used to detect such cases, and it can detect exactly which disk is carrying the mismatched byte.  In those cases, we would be given a golden opportunity to recover the mismatched byte from parity, instead of propagating the mismatch onto the parity disks.


In the thread that I quoted above, I talked about how my server (which I was carrying in the trunk of my car) got out of my sight for some period of time during a recent border crossing.  I can't help but be concerned that somebody may have taken some disks out of my server and tried to mount them somewhere in order to check for whatever it is they were checking for.  In doing so, they may have inadvertently (or otherwise) modified a few bytes here and there.  Now, if Unraid's parity check had the option to detect single-disk corruptions, then I would have a very good chance of restoring everything on all the disks just as it was before the incident.


One can imagine various scenarios that can lead to single-disk corruption.  It can be a malicious script, or it can be simply human error.  For another illustration, consider this example:


I set out to write a little script that will spin up all my disks.  (The story is made up, but it could easily be true.)  For that, I will just read a few random bytes from each disk.  That will wake them up, right?  So I write the script (simplified) like this:

disks="sdb sdd sdf sdm"
for i in $disks ;do
   dd of=/dev/$i bs=1024 seek=$(($RANDOM*10240)) if=/dev/urandom count=1
done

Pleased with my work, I run the script.  Twenty seconds later, a terrifying realization strikes me: what I have just done is not reading, but actually writing random bytes to all the disks!  Horrified, I jump out of my chair and yank the power cord out of the wall.  Too late.  I have just corrupted not one, not two, but ALL my data disks.  What was I thinking?!  And what can be done?  Well, if the feature that I'm suggesting here had been implemented, then in this situation I would be able to perfectly recover ALL my disks in just one pass.  Now wouldn't that be something?
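For the record, here is what the harmless, read-only version of that script (simplified the same way) would have looked like -- with if= and of= swapped back, and skip= in place of seek=, since seek applies to the output file and skip to the input:

disks="sdb sdd sdf sdm"
for i in $disks; do
   # Read one random 1 KiB block from each disk and throw it away;
   # writing to /dev/null is harmless, and the read spins the disk up.
   dd if=/dev/$i of=/dev/null bs=1024 skip=$(($RANDOM*10240)) count=1
done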


Thank you for considering my suggestion.
 


Also, let me give a rough description of how the driver could do what I am suggesting:


It starts a parity check on a dual-parity array, and at a certain byte position it arrives at a parity error.  At this point it could do some extra work, trying to determine whether this is a case of single-disk corruption.  One way to do that -- a slow and ugly way, but it will do the job -- is this: it runs another parity calculation for the same byte position, but this time disregarding one disk, as though that disk were missing.  Then another calculation, but this time disregarding another disk, and so on, for each disk in the array, one by one.  If we are fortunate enough to find an outcome in which the parity calculation comes out correct, then that tells us right there that we have a case of single-disk corruption, and we know exactly which disk is carrying the wrong byte.
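To make that concrete, here is a minimal sketch of the per-byte check just described, in plain C against the standard RAID-6 P+Q algebra (XOR parity for P; GF(2^8) with generator 2 and polynomial 0x11D for Q).  All names here are hypothetical -- this is a sketch of the idea, not the md/unraid driver's actual code:

#include <stdint.h>

/* Multiply in GF(2^8) with the RAID-6 polynomial 0x11D (hypothetical
 * helper -- real drivers use lookup tables instead of bit loops). */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
    uint8_t r = 0;
    while (b) {
        if (b & 1)
            r ^= a;
        a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1D : 0));
        b >>= 1;
    }
    return r;
}

/* g^i, the Q coefficient of data disk i (the RAID-6 generator g is 2). */
static uint8_t gf_pow2(int i)
{
    uint8_t r = 1;
    while (i--)
        r = gf_mul(r, 2);
    return r;
}

/* Try to explain a parity mismatch at one byte position as a
 * single-disk corruption, by the exclude-one-disk-at-a-time method
 * described above.  d[] holds the n data bytes at this position; p and
 * q are the bytes read from the two parity disks.  Returns the index
 * of the single corrupt data disk, n if P itself is the corrupt one,
 * n+1 if Q is, or -1 if there is no error or no single-disk answer. */
int find_corrupt_disk(const uint8_t *d, int n, uint8_t p, uint8_t q)
{
    uint8_t calc_p = 0, calc_q = 0;

    for (int i = 0; i < n; i++) {
        calc_p ^= d[i];
        calc_q ^= gf_mul(gf_pow2(i), d[i]);
    }
    if (calc_p == p && calc_q == q)
        return -1;                  /* no mismatch at this position */
    if (calc_p == p)
        return n + 1;               /* data agrees with P: Q is corrupt */
    if (calc_q == q)
        return n;                   /* data agrees with Q: P is corrupt */

    /* Both parities disagree: exclude each data disk in turn, rebuild
     * its byte from P, and see whether Q then becomes consistent. */
    for (int z = 0; z < n; z++) {
        uint8_t rebuilt = d[z] ^ calc_p ^ p;
        uint8_t q_try = calc_q ^ gf_mul(gf_pow2(z), (uint8_t)(d[z] ^ rebuilt));
        if (q_try == q)
            return z;               /* single corruption on data disk z */
    }
    return -1;                      /* more than one disk must be wrong */
}

One caveat worth flagging right away: at an isolated byte position, a multi-disk corruption can occasionally masquerade as a single-disk one when the byte values happen to line up -- which is a good argument for starting with the report-only mode mentioned above.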


At this point we face the question of what to do with this new knowledge.  The choices for action are not that many, really.


You could either:
A) do what we've always done before -- sync the parity disks; or
B) do nothing, just report it in the syslog (a read-only parity check); or
C) do the new thing -- recover the corrupted byte from parity.


That's all.


Now that we have one way of doing it -- feasible, not necessarily optimal -- we can optimize it from there.
 


I'll revisit this topic for 6.10 release - why?  Because we need to make some other non-trivial changes in md/unraid driver to support 5.8 kernel.  Here's my pattern in dealing with driver changes:  Realize any non-trivial driver coding is intricate and perilous.  That is, one tiny bug can cause quite a bit of damage, as in losing data kind of damage.  Therefore whenever I embark on driver changes, I have to "clear the deck" of all other coding distractions and concentrate just on this, and then test, and then test some more.  First, however, is making a business case for the changes in the first place.  For example, is coding time better spent dealing with identifying a possible specific failed device in a P+Q array, and then what to do about it, or say, adding ZFS support (or pick any other feature)?

 

In your particular example, it would help to demonstrate a realistic use case for this feature that could benefit a lot of users, considering also that not everyone is even running a P+Q array.


I would also be a bit concerned about the chances of a 'false positive' if in fact the corruption occurred on a parity drive and not on a data drive.  Could you be sure that whatever algorithm is used can detect this scenario and always tell it apart from the case where the corrupt bits were on a data drive?  Taking action on a 'false positive' could result in perfectly good data getting corrupted as a result.

6 hours ago, itimpi said:

I would also be a bit concerned about the chances of a 'false positive' if in fact the corruption occurred on a parity drive and not on a data drive.  Could you be sure that whatever algorithm is used can detect this scenario and always tell it apart from the case where the corrupt bits were on a data drive?  Taking action on a 'false positive' could result in perfectly good data getting corrupted as a result.

Actually, it doesn't matter which one the corrupted drive is -- data or parity -- it's all the same.  If it happens that the corruption is on a parity drive, it will just end up the same as syncing the parity drive.

6 hours ago, limetech said:

...or say, adding ZFS support (or pick any other feature)?

ZFS, or pick any other feature -- those I can get on any other distro.  The only reason I am here is the md/unraid driver -- that's the one thing the others don't have.

 

Of course, I realize very well that my suggestion is not trivial.  Some ideas need to cook in the back of your head for quite some time before they pop out ready to code.  As long as we feel that this is something important that should eventually be implemented, that is all I am hoping for.

 

6 hours ago, limetech said:

I'll revisit this topic for 6.10 release

Thank you!

6 hours ago, Pourko said:

Actually, it doesn't matter which one the corrupted drive is -- data or parity -- it's all the same.  If it happens that the corruption is on a parity drive, it will just end up the same as syncing the parity drive.

I hope it ends up really being that simple, but I am still not convinced :)  The commonest case of parity needing to be corrected is after unclean shutdowns, and in such a case there must be a significant chance of more than one drive being wrong.  I would suggest that if anything along these lines is done, the first step would be to simply report in the syslog which drive seems to be the culprit but keep current behaviour otherwise.  That would allow for collecting some real-world evidence on how reliable any sort of auto-correct based on the detection algorithm might be.


In your scenario it seems you're only thinking about your data disks.  What if, while your server was out of your control, your parity drive(s) had data written to them?  What if both your data and parity drives were written to?  You said you didn't want invalid data written to the parity drive, but if the parity drive were actually invalid and its contents were written to the array, it would corrupt your files, no?

 

I'm not smart enough to follow along with this single-disk parity checking stuff, but it sounds like it's more work than it's worth, and more complicated to implement than any potential benefit would justify.

 

It starts a parity check on a dual-parity array, and at a certain byte position it arrives at a parity error.  At this point it could do some extra work, trying to determine whether this is a case of single-disk corruption.  One way to do that -- a slow and ugly way, but it will do the job -- is this: it runs another parity calculation for the same byte position, but this time disregarding one disk, as though that disk were missing.  Then another calculation, but this time disregarding another disk, and so on, for each disk in the array, one by one.  If we are fortunate enough to find an outcome in which the parity calculation comes out correct, then that tells us right there that we have a case of single-disk corruption, and we know exactly which disk is carrying the wrong byte.

 

So if you have a system of 10 disks, you're suggesting to run something like 100 different parity checks for each byte because you keep excluding one disk at a time?  For what purpose, versus how it's done now?  I don't understand.  Your final assumption is that you have a case of "single disk corruption"... but how do you know unless you finish the parity check fully, using both parity disks?  You could have other errors on the other disk too... and in any event, which byte would be taken as the correct byte?  What if that byte was incorrect?  There goes your data... again.

 

I'm missing the point here, I think... but if it came down to recovering from data modified externally outside of a mounted array, I think any implementation would be purely 'guessing' at which data was correct.  And you'd want to do a parity check on all your drives, and verify the parity on both your parity disks as well, not excluding one because you 'think' you have a single-disk corruption.

12 hours ago, Energen said:

So if you have a system of 10 disks, you're suggesting to run something like 100 different parity checks for each byte because you keep excluding one disk at a time?

 

No, I am not suggesting that.  I did it like that because it was the only way to do it manually.  When done programmatically, there are better ways to identify the corrupted disk at a given byte position.
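For example, the standard RAID-6 algebra (see H. Peter Anvin's paper "The mathematics of RAID-6") gives the answer in one pass: if exactly one data disk z is wrong by some error byte e, then the P syndrome equals e and the Q syndrome equals g^z * e, so z falls straight out of their ratio.  A rough sketch only -- reusing the hypothetical gf_mul()/gf_pow2() helpers from my earlier post, not actual driver code:

/* Discrete log base g=2 in GF(2^8); a linear scan keeps the sketch
 * short (real code would use a 256-entry lookup table). */
static int gf_log2(uint8_t x)
{
    uint8_t v = 1;
    for (int i = 0; i < 255; i++) {
        if (v == x)
            return i;
        v = gf_mul(v, 2);
    }
    return -1;                      /* x == 0 has no logarithm */
}

/* Same return convention as find_corrupt_disk(), but without trying
 * disks one by one: compute the two syndromes once and read the
 * culprit's index out of their ratio. */
int find_corrupt_disk_fast(const uint8_t *d, int n, uint8_t p, uint8_t q)
{
    uint8_t calc_p = 0, calc_q = 0;

    for (int i = 0; i < n; i++) {
        calc_p ^= d[i];
        calc_q ^= gf_mul(gf_pow2(i), d[i]);
    }
    uint8_t ps = calc_p ^ p;        /* P syndrome */
    uint8_t qs = calc_q ^ q;        /* Q syndrome */

    if (!ps && !qs) return -1;      /* no error at this position */
    if (ps && !qs)  return n;       /* only P disagrees: P is corrupt */
    if (!ps && qs)  return n + 1;   /* only Q disagrees: Q is corrupt */

    /* Both nonzero: a single corrupt data disk z satisfies qs = g^z * ps. */
    int z = (gf_log2(qs) - gf_log2(ps) + 255) % 255;
    return (z < n) ? z : -1;        /* out of range: not a single-disk case */
}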

 

I see that there is some confusion here... When I say "single-disk corruption", I am not referring to a whole corrupted disk; I am only talking about the particular byte position at which we have arrived and found a parity mismatch.  At the position of the next parity error, the corruption may or may not be on the same disk as before.  It doesn't matter; we are dealing with one byte position at a time.  Therefore, all disks may have corruptions in various places, and as long as the corruptions are not at the same byte position, you could recover all disks in one pass.  Also, note that we are not distinguishing between data and parity disks; it works the same.

22 hours ago, itimpi said:

The commonest case of parity needing to be corrected is after unclean shutdowns

Yes, in 99% of cases it will indeed turn out that a parity disk is the one carrying the wrong byte.  But you will know that, without any need for guessing.

 

You are a moderator, so you read a lot.  You must have seen hundreds of posts over the years where parity errors appear for no obvious reason, without any unclean shutdown.  When people start pulling their hair out, trying to guess which disk those errors may be coming from, they would all wish we had the feature I am suggesting. :-)

 

Personally, this thing has been bugging me for over 10 years now.  I raised these questions back in 2013.  But back then we had only single parity, and with that you can only detect the fact that there's a parity error; you can't determine the position of the error.  Now that we have dual parity, we could take full advantage of it and be able to know the position of the error.
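To make that tangible, here is a toy demonstration with made-up byte values, reusing the sketches from my earlier posts: corrupt one byte on data disk 1, and the syndromes name the culprit.

#include <stdio.h>

int main(void)
{
    uint8_t d[3] = { 0x11, 0x22, 0x33 };
    uint8_t p = d[0] ^ d[1] ^ d[2];                        /* P parity */
    uint8_t q = d[0] ^ gf_mul(2, d[1]) ^ gf_mul(4, d[2]);  /* Q parity */

    d[1] ^= 0x5A;                   /* simulate a corrupted byte on disk 1 */

    printf("culprit: %d\n", find_corrupt_disk_fast(d, 3, p, q));  /* prints 1 */
    return 0;
}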


I am following this thread and have some problems understanding it.

 

During a parity check, a bit on one disk (parity or data) does not carry the expected value.  IMHO the only information that helps here would be the block that bit belongs to on the corresponding data (!!!) drive.  This block might belong to a file -- but not necessarily.  But if it's file content, I would like to know it.  I could then restore that file from backup.

 

So IMHO a parity mismatch should trigger a check of whether it's file content, and a report about it.

 

So to follow the OP: somebody takes a drive from the array, manipulates data, and puts that disk back into the array.  During a parity check, a lot of filenames will pop up, and those files are all stored on one specific disk.  In that case I would pull that disk and let it be re-created.

 

I hope it's not total nonsense...

 

 

49 minutes ago, hawihoney said:

During a parity check, a bit on one disk (parity or data) does not carry the expected value.  IMHO the only information that helps here would be the block that bit belongs to on the corresponding data (!!!) drive.  This block might belong to a file -- but not necessarily.  But if it's file content, I would like to know it.

Parity protection knows nothing about file systems.  (At least it shouldn't.)  It just adds up the corresponding bits from all disks to see whether the result matches the bit on the parity disk, that's all.

 

It shouldn't even know about partitions, but that's a different story. :-)


I know, I know, but the file system on the data disk has it all.  Simply put:

 

Say the expected bits from the six data disks at a certain position are 1, 0, 1, 0, 1, 0, but the disks report 1, 0, 0, 0, 1, 0.  So there's a mismatch on data disk 3 at that position.  Now check whether that bit is part of unallocated space, metadata, ..., or a file on that _data disk_.  If the bit belongs to a file --> report it, and then sync it.

 

I don't know if it's possible to find the file that a given bit or block belongs to, but IMHO this is what would help in many situations.

 

So the question is: I have a bit/block at a given position on a known data disk.  Is it possible to identify what that bit/block belongs to on that data disk?

 

If it's possible, it would add a small performance penalty whenever a mismatch occurs, but it would allow restoring corrupt files from backup.  The restore would then "repair" parity.  If the bit/block does not belong to a file, the sync would happen as usual.

 

As I said: I have no idea if this sounds reasonable, but this one has puzzled me for over 10 years, ever since my Unraid server synced over 1400 errors to parity because of a bad cable and I didn't notice.

 

44 minutes ago, hawihoney said:

...Now check whether that bit is part of unallocated space, metadata, ..., or a file on that _data disk_.  If the bit belongs to a file -->...

Yes, I understand, but that is beyond the scope of the kernel driver. Tools in userspace could be developed to that effect.
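In fact, filesystem debuggers can already do this kind of block-to-file mapping in userspace.  Purely as an illustration, on an ext-family filesystem (device name, block and inode numbers are made up here; Unraid data disks are more typically XFS, where xfs_db offers similar, if more involved, facilities):

# Map a block number to the inode that owns it, then that inode to a path.
debugfs -R "icheck 123456" /dev/sdX1
debugfs -R "ncheck 98765" /dev/sdX1   # 98765 = the inode reported by icheck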

 

44 minutes ago, hawihoney said:

this one has puzzled me for over 10 years, ever since my Unraid server synced over 1400 errors to parity because of a bad cable and I didn't notice.

That is a perfect example to prove my point.  Thank you.


See, guys, this whole thing boils down to knowledge.  Knowledge of exactly which disk is carrying the mismatched byte at a certain position.  Knowledge that can't be had with single parity, but can be had with dual parity.  It amazes me that some people would not want that knowledge.  Like, life is easier without it -- just assume that the parity is wrong and sync it.  Like, if they were suddenly given the knowledge that, say, disk #4 is carrying the corrupted byte at that point, then they would be stumped about what to do with it.  Let me ease the anxiety: you could always choose to do what you have been doing all along -- sync the corrupted byte onto the parity disks.  Personally, I would take the other option -- recover the corrupted byte from parity.

