Article: Why RAID 5 stops working in 2009


bubbaQ


On 8/9/2020 at 9:23 AM, BRiT said:

Others have already brought up this point (here or elsewhere), so why are you ignoring it for your cosmic example? The drive itself will know what data is wrong since the hardware supports checksums of sectors, so if a data bit is flipped and it doesn't change the checksum information, the drive will report it.

I used a cosmic hypothetical only to save us time, as that's not the main point.  If you insist, I can give you some real-life examples of how this can happen.

 

On 8/9/2020 at 9:23 AM, BRiT said:

Mathematically it is impossible to know which disk is wrong with the algorithms used by single and dual parity protections.

Mathematically it is impossible to know which disk is wrong with the algorithms used by single parity protection.  That is not true for dual parity though.  Which gets us straight to the point.
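
The single-parity half of this claim is easy to demonstrate: with plain XOR parity, a mismatch tells you that a stripe is inconsistent, but a one-byte flip on any of the disks produces exactly the same symptom. A toy sketch in plain Python (not Unraid's actual md driver):

```python
from functools import reduce

def xor_parity(disks):
    """Single parity: byte-wise XOR of all data disks."""
    return reduce(lambda a, b: a ^ b, disks)

disks = [0x11, 0x22, 0x33, 0x44]   # one byte per data disk
P = xor_parity(disks)

# Flip a byte on each disk in turn: the resulting parity mismatch
# (the syndrome) depends only on the error value, never on WHICH
# disk was hit -- so the culprit cannot be identified.
syndromes = []
for k in range(len(disks)):
    corrupted = list(disks)
    corrupted[k] ^= 0x5A           # same one-byte corruption, on disk k
    syndromes.append(xor_parity(corrupted) ^ P)

print(syndromes)   # the same value, four times
```

Every disk yields an identical syndrome, so single parity alone can never point at the wrong disk.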

Link to comment
7 hours ago, Pourko said:

Mathematically it is impossible to know which disk is wrong with the algorithms used by single parity protection.  That is not true for dual parity though.  Which gets us straight to the point.

Not with the algorithm unraid uses for dual parity.

Link to comment
7 hours ago, Pourko said:

Mathematically it is impossible to know which disk is wrong with the algorithms used by single parity protection.  That is not true for dual parity though. 

I don't see how dual parity on its own can identify which disk is wrong, so please provide support for your claim. Preferably a mathematical proof, since you claimed "mathematically". I'm keen on reading up on that.

Link to comment
5 hours ago, BRiT said:

And I will point you to the details of how hard drives work in reality, where the drives themselves will report CRC errors, and thus you know which drive is damaged.

Brit, I am not talking about "CRC errors". I am talking about a single byte changed on one data disk, which to that disk looked like a legitimate change, so nothing to do with disk errors. 

 

Here's an example that may help you see what I am talking about... During a recent border crossing, my server was out of my hands for a short period of time.  I am allowing for the possibility that one of the disks may have been mounted somewhere. (read-only, hopefully!:)  If they mistakenly changed even a single byte on that disk, to the disk that would look like a "legitimate" write, so it has nothing to do with CRC errors. Now, in that scenario, if I were to start the array and do a parity check, the discrepancy in that byte would be found, and it would be automatically "validated" onto the parity disk, under the assumption that it is the parity disk that's wrong. Am I explaining this a little better now?

 

Although this example is a little extreme, you can think of various scenarios in which a byte on a data disk can be changed. With a single-parity setup, you really have no way of knowing exactly which physical disk is the one with the wrong byte. So, back to the point:

5 hours ago, BRiT said:

Not with the algorithm unraid uses for dual parity.

Are you really familiar with the algorithm unraid uses for dual parity, or are you making it up?  Has there been any discussion on the matter that you can point me to?

Link to comment

Phew... my comp-sci class is still right then 😅.

I was taught that mathematically, parity (from single to n-parity) can only correct self-identifying failures, i.e. the failures have to be identified before correction can be applied. That fact hasn't changed.

 

Yes, P+Q can identify and correct a single-disk failure but one has to first identify that there was indeed a single failure (and not double).

So blindly using dual parity to correct an assumed single-disk corruption is misguided IMO. Like bashing a screw with a hammer.
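
For reference, the standard RAID-6 P+Q construction (from H. Peter Anvin's "The mathematics of RAID-6") can locate a single bad disk: P is the XOR of the data bytes and Q is the sum of g^i * d_i over GF(2^8), so a lone error e on disk k leaves syndromes S_P = e and S_Q = g^k * e, and k falls out as the discrete log of S_Q / S_P. Whether Unraid's dual parity is exactly this construction is the open question in this thread; the sketch below only demonstrates the textbook math:

```python
# GF(2^8) tables for the RAID-6 polynomial x^8+x^4+x^3+x^2+1 (0x11D),
# generator g = 2.
EXP, LOG = [0] * 512, [0] * 256
v = 1
for i in range(255):
    EXP[i], LOG[v] = v, i
    v <<= 1
    if v & 0x100:
        v ^= 0x11D
for i in range(255, 512):
    EXP[i] = EXP[i - 255]          # wrap so gmul never needs a modulo

def gmul(a, b):
    """Multiply in GF(2^8)."""
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def pq(disks):
    """P (XOR) and Q (Reed-Solomon) parity bytes for one stripe."""
    P = Q = 0
    for i, d in enumerate(disks):
        P ^= d
        Q ^= gmul(EXP[i], d)       # EXP[i] == g**i
    return P, Q

def locate(disks, P, Q):
    """Identify the single wrong element of a stripe, ASSUMING at most
    one is wrong -- the very caveat raised above.  Returns a data-disk
    index, 'P', 'Q', or None if the stripe is consistent."""
    sp, sq = pq(disks)
    sp ^= P
    sq ^= Q
    if sp == 0 and sq == 0:
        return None
    if sp == 0 or sq == 0:
        return 'Q' if sp == 0 else 'P'   # a parity byte itself is off
    # single data-disk error: sp = e, sq = g^k * e  =>  k = log(sq / sp)
    return (LOG[sq] - LOG[sp]) % 255
```

For example, with `disks = [0x11, 0x22, 0x33, 0x44]` and matching P and Q, flipping any bits of `disks[2]` makes `locate` return 2. The caveat still applies: with two or more simultaneous errors the single-error assumption is violated and the returned index is garbage.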

Link to comment

parity, as implemented in most raid systems, is only useful to tell you something in the collection is wrong, or to recover a known missing piece in the collection. if two items are wrong (yes, this is a case where two wrongs CAN and DO make a right :D ) or you don't know from other means some specific piece is wrong, then parity is useless for recovery.  that's just the nature of the beast.
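
That "two wrongs make a right" case is worth seeing concretely: with XOR parity, the same bits flipped on two different disks cancel out, and the stripe still verifies as clean. A quick sketch:

```python
from functools import reduce

disks = [0b1010, 0b1100, 0b0110]
parity = reduce(lambda a, b: a ^ b, disks)

# Flip the SAME bit on two different disks...
disks[0] ^= 0b0100
disks[2] ^= 0b0100

# ...and the stripe still XORs to the stored parity: the two
# errors cancel, so a parity check reports nothing at all.
still_consistent = reduce(lambda a, b: a ^ b, disks) == parity
print(still_consistent)   # True
```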

even ECC memory can only correct 1 bit in 64; it can, however, detect multi-bit errors, but the data is considered corrupted at that point.

 

you can do otherwise, but it gets expensive.

Link to comment
1 hour ago, johnnie.black said:

If you continue to read the quoted post you see why Unraid doesn't support that.

Yes, I read that post a few times. Limetech confirms that it could be done. But the reason he gives for not doing it is kind of flaky: he gives an extremely improbable scenario in which two disks are corrupt in the same byte positions. I don't find that to be a valid argument, because you could apply the same logic to argue against having any parity protection at all -- even with single parity there is the hypothetical possibility that two disks are corrupt in the same byte position in such a way that the existing parity checksum "looks" good. So how is that valid reasoning?

 

The thing is, identifying something and deciding what to do about it, if anything, are two completely different things. That post over there boils down to: "we're not going to try to identify it, because if we do, we won't know how to handle one hypothetical extreme scenario."  In my scenario, however, if you can identify for me (with the help of dual parity) exactly which physical disk is the one that's been illegally modified, then I could just throw that disk in the trash and restore my good data onto a new disk.

Edited by Pourko
typo
Link to comment
14 hours ago, Pourko said:

Brit, I am not talking about "CRC errors". I am talking about a single byte changed on one data disk, which to that disk looked like a legitimate change, so nothing to do with disk errors. 

 

So you're backing away from your Cosmic Data Change example that simply was never possible. So that's a good sign.

Link to comment
2 hours ago, BRiT said:

 

So you're backing away from your Cosmic Data Change example that simply was never possible. So that's a good sign.

Yes, in my initial post I used the word "magically" for the sake of saving us time and getting straight to the point.  But now I see how I got you distracted by that. :-)  Please don't fixate on that unfortunate wording.  I am counting on your long-time experience to find a real solution to a real problem. (Even if that means me putting that server in storage for 12 months while I'm trying to convince Limetech that what I am talking about actually makes some sense.:)

Link to comment

It comes down to being mathematically uncertain that you only have 1 unknown in an algorithm set of N variables.

 

I do think it would be nice to have a better presentation of where or what Data Integrity issue (corruptions) your array has. Ideal world would have something like the following (massive amount of functionality required):

  • UI display the list of corruptions detected 
  • For each corruption, a way to view the data stored there for each drive
  • For each corruption, a way to possibly view the filename located there for each drive
  • For each corruption, an indicator of likely which drive has the corruption if only one drive is corrupt (with huge warnings and caveats about dragons)
  • For each corruption, a means of backing up current data values for each corruption for a designated drive
  • For each corruption, a means of restoring the data from the previous backup for a designated drive
  • For each corruption, a means of selecting a drive to attempt a rebuild of the data which then marks this corruption as possibly fixed and has yet to be verified by the user

 

This doesn't do anything automatic, except generating the list of corruptions during the scan.

Link to comment
9 minutes ago, BRiT said:

massive amount of functionality required

You aren't kidding. The last time I looked into trying to tease out which file is located at a specific address in a volume, it seemed to be a near-impossible task. Now, amplify that by 3 (or 4) different filesystems, each with the possibility of an encryption layer, and the task seems insurmountable.

 

Something that could actually be implemented would be a read-only trial run of each recovery scenario (trust and fail combinations), presenting the emulated drive(s) that would result. That way you could do a hash check and see which files were affected in each configuration. Still a massive amount of time involved in checking and rechecking, with the likely outcome that the flipped bit(s) actually reside in slack space or an unallocated section, resulting in zero outcome for maximum effort.

 

All this, and if you have a failure outside of the recoverable scenario you still need backups.

 

The vast majority of data loss is caused by user error or equipment failure outside the scope of what parity can recover from.

 

Bottom line, nothing replaces a backup copy of irreplaceable data. I would much rather see @limetech spend time on a built in back up solution, preferably one that could traverse WAN connections to a second Unraid box.

Link to comment
46 minutes ago, BRiT said:

It comes down to being mathematically uncertain that you only have 1 unknown in an algorithm set of N variables.

 

I do think it would be nice to have a better presentation of where or what Data Integrity issue (corruptions) your array has. Ideal world would have something like the following (massive amount of functionality required):

  • UI display the list of corruptions detected 
  • For each corruption, a way to view the data stored there for each drive
  • For each corruption, a way to possibly view the filename located there for each drive
  • For each corruption, an indicator of likely which drive has the corruption if only one drive is corrupt (with huge warnings and caveats about dragons)
  • For each corruption, a means of backing up current data values for each corruption for a designated drive
  • For each corruption, a means of restoring the data from the previous backup for a designated drive
  • For each corruption, a means of selecting a drive to attempt a rebuild of the data which then marks this corruption as possibly fixed and has yet to be verified by the user

 

This doesn't do anything automatic, except generating the list of corruptions during the scan.

Throwing in the UI and all the other things you listed doesn't help us cut through the fog. File systems have nothing to do with this conversation. Let's forget about the UI for now, and let's not talk about massive multi-drive failures.  To keep things as simple as possible, imagine the problem like this:


We had a good parity protected array, for which we had run parity checks, and everything was OK.


Now we will start a parity check, which will only report any errors it finds.


A single-parity array can only report something like this:
"A mismatched byte was found at byte position NNNNNNNN"


A dual-parity array could report:
"A mismatched byte was found on disk#5 at byte position NNNNNNNN"


Are you not seeing the possibilities in this?
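
The two report formats above can be prototyped end to end. The sketch below simulates a tiny 4-disk stripe set, corrupts one byte on one disk, and scans it; the message strings echo the hypothetical wording above, and the locator step assumes a single data-disk error (the contested assumption in this thread):

```python
def gmul(a, b):
    """GF(2^8) multiply, RAID-6 polynomial 0x11D (carry-less shift-and-add)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11D
        b >>= 1
    return r

NDISKS, STRIPE = 4, 16
disks = [bytearray((7 * d + i) & 0xFF for i in range(STRIPE)) for d in range(NDISKS)]

# P and Q as computed while the array was known good.
P, Q = bytearray(STRIPE), bytearray(STRIPE)
for pos in range(STRIPE):
    g = 1
    for d in range(NDISKS):
        P[pos] ^= disks[d][pos]
        Q[pos] ^= gmul(g, disks[d][pos])
        g = gmul(g, 2)

disks[1][7] ^= 0x80          # one "legitimate-looking" byte change on disk 1

# Report-only check: recompute the syndromes at every byte position.
reports = []
for pos in range(STRIPE):
    sp, sq, g = P[pos], Q[pos], 1
    for d in range(NDISKS):
        sp ^= disks[d][pos]
        sq ^= gmul(g, disks[d][pos])
        g = gmul(g, 2)
    if sp or sq:
        # single parity alone can only say this much:
        reports.append(f"A mismatched byte was found at byte position {pos}")
        # with Q too, brute-force the locator k where g^k * sp == sq:
        k, t = 0, sp
        while t != sq and k < 255:
            t, k = gmul(t, 2), k + 1
        reports.append(f"A mismatched byte was found on disk#{k} at byte position {pos}")

print("\n".join(reports))
```

Running it flags position 7 and points at disk#1, which is exactly the extra information a report-only dual-parity check could surface.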
 

Link to comment

I'm unaware of ever seeing a single parity error; they have always presented in multiples. Since Unraid is the only program in charge of writing to the parity disk(s), and then only after writing to the data disks, the normal scenario for a small number of parity errors is that a crash or unclean shutdown of some sort interrupted the parity disk write. That may or may not include unfinished writes on the data disks, but like you said, file system errors are not in the realm of parity correction. Since the corrupting event is usually a single moment in time, you can't predict whether or not multiple drives were open for writing and had incomplete parity calculations. That means you can't say with any certainty that only a single disk is the culprit; scenarios where multiple bits at that address are wrong are equally likely.

 

When dealing with multiple possibly conflicting parity corrections, how do you propose to handle them?

 

99% of the time, a disk failure presents with read errors, which unraid already logs, followed by write failures, which kick the disk out of the array. That is a clear cut reason to investigate the health of a specific disk, but the write errors can also be caused by controller, cable, PSU or RAM issues. Discarding a disk because a parity bit was wrong is not productive.

Link to comment
2 hours ago, jonathanm said:

99% of the time, a disk failure presents with read errors, which unraid already logs, followed by write failures, which kick the disk out of the array. That is a clear cut reason to investigate the health of a specific disk, but the write errors can also be caused by controller, cable, PSU or RAM issues. Discarding a disk because a parity bit was wrong is not productive.

Jonathan, you completely misunderstood my question. I am not talking about disk failure or read/write errors, or anything of the kind. The disks are all in good health, but one disk (and I don't know exactly which one) may have been inadvertently modified while outside of the array. So no, I do not want to discard parity -- exactly the opposite: I want to trust the known-good parity to restore the correct data on that disk, in case it had indeed been modified. All I need for that is a report-only parity check that reports the actual disk on which it finds the mismatched byte(s) -- and that is something only dual parity can do.

Link to comment
4 hours ago, sota said:

if you pull a disk out of the array and mount it anyplace else, you should automatically assume it's not integral with respect to parity.

It is not quite true that you should automatically assume that. You can mount it externally as read-only and not disturb a single bit on the disk.  And if you have any doubts, that's what a parity check is for.

Link to comment

So, I put that server in "storage" and set up a test server, to play with things and see how I can get myself out of this mess.

 

I think I ran into a bug with the UI. (Maybe I should post a bug report somewhere?)

 

So here's the bug, and how to reproduce:

You start the array from the UI, and the first thing you want is a read-only parity check. So you dutifully uncheck the "Write corrections to parity" checkbox and click the "Check" button. To your great surprise, in the syslog you see:

 

Aug 13 19:36:15 ToyVB kernel: mdcmd (142): check
Aug 13 19:36:15 ToyVB kernel: md: recovery thread: check P Q ...
Aug 13 19:36:15 ToyVB kernel: md: recovery thread: PQ corrected, sector=19464
Aug 13 19:36:16 ToyVB kernel: md: sync done. time=1sec
Aug 13 19:36:16 ToyVB kernel: md: recovery thread: exit status: 0
 

On subsequent tries, the UI does honor the unchecked checkbox, and you see:

 

Aug 13 19:36:24 ToyVB kernel: mdcmd (143): check nocorrect
Aug 13 19:36:24 ToyVB kernel: md: recovery thread: check P Q ...
Aug 13 19:36:25 ToyVB kernel: md: sync done. time=1sec
Aug 13 19:36:25 ToyVB kernel: md: recovery thread: exit status: 0
 

To see the bug again, just stop the array, start it again, and try another "read-only" parity check.

 

Unraid version 6.8.3 by the way.

Edited by Pourko
typo
Link to comment
6 hours ago, Pourko said:

(Maybe I should post a bug report somewhere?)

You should; there is indeed a bug. It works correctly at first array start (whether auto-start is enabled or disabled), but not the first time you run a check after stopping and re-starting the array.

 

Since it still happens on the latest beta, and that's the one being developed, please report it here:

https://forums.unraid.net/bug-reports/prereleases/

 

Link to comment
