Parity Build/Check/Sync/Verify


So...  I recently upgraded to v5b6.  At this moment, my system is actively performing a Parity Sync.  I have a couple of questions...

 

Browsing the forums, I see references to Parity Build, Parity Check, Parity Sync and Parity Verify.  I understand the difference between these states as follows:

 

Parity Build - No parity exists, build new

Parity Check - Parity exists, Verify & Report, make no change

Parity Sync - Parity exists, Verify and Update as needed

Parity Verify - Not an actual status.  Incorrect reference to Parity Check

 

Is my understanding of these labels correct?

 

Also...  I seem to remember that when I was running v4.7, the number of corrections was displayed in the web GUI whenever a Parity Check or Sync was running.  I find comfort in checking every now and then to verify that this number is zero.  Was this removed from version 5?  I don't see it.

 

 

Link to comment

I only refer to these terms:

 

parity sync - reads all the data disks and writes computed parity to parity disk

 

parity check - reads all the data disks and the parity disk, comparing computed parity with stored parity.  This operation has a flag:

  CORRECT - if a parity mismatch occurs, write the computed parity to the parity disk and report it in the syslog

  NOCORRECT - just report the mismatch in the syslog

 

Only the first 100 parity check errors are reported in the syslog.
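
As a rough illustration of the two operations described above, here is a minimal Python sketch. It assumes simple XOR parity across equal-sized disks modeled as lists of integers. The function names (`parity_sync`, `parity_check`, `computed_parity`) and the `correct` parameter are made up for this example and are not unRAID's actual code.

```python
from functools import reduce
from operator import xor


def computed_parity(data_disks, addr):
    """XOR the value at one address across all data disks (assumed XOR parity)."""
    return reduce(xor, (disk[addr] for disk in data_disks), 0)


def parity_sync(data_disks, parity_disk):
    """Read every data disk and overwrite parity with the computed value."""
    for addr in range(len(parity_disk)):
        parity_disk[addr] = computed_parity(data_disks, addr)


def parity_check(data_disks, parity_disk, correct=False):
    """Compare computed parity with stored parity; return mismatching addresses.

    correct=True mimics the CORRECT flag (rewrite parity on a mismatch);
    correct=False mimics NOCORRECT (report only, change nothing).
    """
    mismatches = []
    for addr in range(len(parity_disk)):
        expected = computed_parity(data_disks, addr)
        if parity_disk[addr] != expected:
            mismatches.append(addr)
            if correct:
                parity_disk[addr] = expected
    return mismatches
```

In this sketch, calling `parity_check(data, parity)` leaves the parity list untouched, while `parity_check(data, parity, correct=True)` repairs it in place, so a following check returns an empty list.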

Link to comment

At the bottom of the Parity Sync / Parity Check statistics there should be a line indicating how many sync errors were found. Also, under Array Status on the "Main" page, it shows the result of the last parity check, for instance: "Last checked on Tue Mar 1 23:57:32 2011 EST, finding 0 errors."

 

At least, that's what I remember seeing.

Link to comment

That's right.

 

A message is written to the system log for each sector address where a parity mismatch occurs.  If you ran a parity check on an array that doesn't have valid parity, the system log would quickly become massive, so it's limited to 100 messages.

 

These parity mismatches are called "sync errors" or "parity sync errors": the figure is a count of how many sector addresses were found where computed parity did not "synchronize" with (i.e., match) stored parity.
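
To make the difference between the capped log messages and the sync error count concrete, here is a small Python sketch. The 100-message limit comes from the posts above. The function name, the logger name, the message wording, and the assumption that the count keeps increasing after the log limit is reached are illustrative guesses, not taken from unRAID's source.

```python
import logging

MAX_LOGGED = 100  # limit on log messages mentioned in this thread

logger = logging.getLogger("parity_sketch")  # hypothetical logger name


def report_sync_errors(mismatch_addresses, max_logged=MAX_LOGGED):
    """Log at most max_logged mismatches; return (sync error count, messages logged)."""
    sync_errors = 0
    logged = 0
    for addr in mismatch_addresses:
        sync_errors += 1  # assumed: every mismatch is counted
        if logged < max_logged:
            logger.warning("parity mismatch at sector address %d", addr)
            logged += 1
    return sync_errors, logged
```

Under these assumptions, an array with no valid parity would produce a very large sync error count but only 100 log lines.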

Link to comment

We are seeing users report "transient" parity errors: errors that are probably due to a disk returning incorrect data, or memory flipping a bit.  These are very difficult to troubleshoot, since even with the "sector" address it is nearly impossible to know the values of the bits that were compared.  What some users resort to is running repeated CRC or MD5 checks on files on the various drives.  If a given file shows different checksums from one run to the next, the bad drive is (probably) isolated.
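
The repeated-checksum approach can be sketched in a few lines of Python. This is only an illustration: the function names and the default number of passes are arbitrary, and repeated reads may be served from the operating system's cache rather than from the disk, so a real test would have to account for that.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksums_vary(path, passes=3):
    """Hash the same file several times; True if any two digests differ."""
    digests = {file_md5(path) for _ in range(passes)}
    return len(digests) > 1
```

A file for which `checksums_vary` returns `True` is being read back inconsistently, which points at the drive (or the path to it) holding that file.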

 

Might an improvement in the parity check process be, when an error is first detected, to re-read the sector, possibly several times, and compare the results?  That could help determine whether it was a transient byte from a given drive or, if the bytes read from the drives do not vary, an actual parity error on a disk.
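
One way to picture this suggestion, purely as a sketch and not as anything unRAID does: re-read the same address from every disk a few times and see whether any disk returns differing values. The `read_sector` callable, the disk names, and the retry count are all hypothetical.

```python
def classify_mismatch(read_sector, disks, addr, retries=3):
    """Re-read one address from every disk and look for unstable reads.

    read_sector(disk, addr) is a hypothetical callable returning the bytes
    stored at that address. Returns the list of disks whose reads varied;
    an empty list suggests a stable (actual) parity mismatch.
    """
    unstable = []
    for disk in disks:
        values = {read_sector(disk, addr) for _ in range(retries)}
        if len(values) > 1:
            unstable.append(disk)
    return unstable
```

A non-empty result would point at a transient read problem on the listed disks, while an empty result would mean every disk returned the same bytes each time.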

Link to comment

Doing repeated retries would be "hard" to implement in the code... OK, maybe not; I need to look at this some more.  You can do the same thing, though more slowly, by repeating the parity check (NOCORRECT) yourself and seeing whether the same error happens.
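
Comparing repeated NOCORRECT runs could look something like the sketch below. It assumes you have somehow collected the mismatching sector addresses from each run (for instance from the syslog, keeping in mind that only the first 100 are logged). The function name and the input format are invented for illustration.

```python
def compare_runs(runs):
    """Split mismatch addresses into repeatable and non-repeatable sets.

    runs is a list of iterables, one per parity-check pass, each holding the
    sector addresses reported as mismatches in that pass.
    """
    address_sets = [set(run) for run in runs]
    if not address_sets:
        return set(), set()
    persistent = set.intersection(*address_sets)  # seen in every pass
    transient = set.union(*address_sets) - persistent  # seen in only some passes
    return persistent, transient
```

Addresses that show up in every pass look like real parity errors, while addresses that come and go between passes look like the transient kind.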

Link to comment

I don't remember seeing Verify, so I think you are right that it was an incorrect reference, and Tom has provided the correct terminology for Check.  Build and Sync are used synonymously, and I am probably as guilty as anyone of using both at different times.  There has never been a distinction based on whether parity already exists.

Link to comment

Archived

This topic is now archived and is closed to further replies.
