
Parity check - 900+ million errors


FreeMan


My machine kicked off its monthly parity check on the 1st. When I went to bed last night, it had just passed the 4TB mark and had recorded 0 errors. By the time I got to the office, it had finished the check and reported 976 million errors!

 

Funny thing. I've got an 8TB parity drive and none of my data drives are over 4TB in size. All the errors happened when it was comparing parity against nothing. The parity drive is a brand new WD just shucked from a MyBook about 3 weeks ago. I ran an extended SMART on it and completely zeroed it while it was still in its shell and saw no signs of any issues. Obviously, that's no guarantee that it hasn't gone bad, but one would hope not...

 

Diagnostics attached, and another extended SMART test kicking off right now. Can anyone read the tea leaves and give me a hint about what may have gone wrong?

nas-diagnostics-20180802-1809.zip


Your errors started at sector 7814037064.

With 512-byte sectors, that's byte 4 000 786 976 768.

 

 

And your 976 million errors cover about 4 TB.

 

So somewhere down the line, only the content copied from a previous 4TB parity drive was correct while the rest of the drive wasn't.

 

No indication that you have any issue with the disk itself.
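
The arithmetic in the post above can be checked with a short Python sketch. It assumes 512-byte sectors (which reproduces the quoted byte offset exactly) and that each reported error corresponds to one 4 KiB block. The 4 KiB granularity is an assumption that is not stated in the thread; it is used here only because it makes 976 million errors come out at roughly 4 TB.

```python
SECTOR_SIZE = 512   # bytes per sector; reproduces the byte offset quoted above
BLOCK_SIZE = 4096   # ASSUMPTION: one reported parity error per 4 KiB block

first_bad_sector = 7_814_037_064  # sector where the errors started
reported_errors = 976_000_000     # "976 million", rounded as in the thread


def sector_to_byte(sector: int, sector_size: int = SECTOR_SIZE) -> int:
    """Return the byte offset at which the given sector starts."""
    return sector * sector_size


def errors_to_bytes(errors: int, block_size: int = BLOCK_SIZE) -> int:
    """Return the span covered by the errors under the block-size assumption."""
    return errors * block_size


offset = sector_to_byte(first_bad_sector)
span = errors_to_bytes(reported_errors)

print(f"errors start at byte {offset:,}")
print(f"errors cover roughly {span / 1e12:.2f} TB")
```

Under those assumptions, the errors begin right at the 4 TB boundary and span about another 4 TB, which is consistent with only the upper half of the 8TB parity drive being wrong.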


Thanks, that's comforting, since the previous parity drive was a 4TB drive that's now in the array as a data drive.

Should I run a correcting parity check to "fix" it? I'm not sure how long it will be before I put another 8TB unit in the server, and having the errors staring at me until then seems... annoying...

Yes, you need to let unRAID correct the remaining 4TB. You don't want to see all these errors every time a parity check is run; with constant spurious noise, you would not notice any real parity errors.

That's what I figured, just wanted to be sure.

It seems odd that upgrading the size of the parity drive doesn't automatically zero out the "unused" portion of the parity drive. I don't remember ever having an issue like this in the past. Is this something new?

18 hours ago, FreeMan said:

It seems odd that upgrading the size of the parity drive doesn't automatically zero out the "unused" portion of the parity drive.

It does; something went wrong during the upgrade.

Also, a new parity sync should be much faster than correcting all those errors.

Hmm... Would I have seen what went wrong had I checked logs after the upgrade?

And now you tell me... :( I'm at 56% and it's corrected 300-some million errors. I'll just let it finish from here.


A new parity sync is a single pass: read the data disks and write to the parity disk(s).

Correcting means reading the old data and parity, comparing them, and then writing corrected parity blocks. So for the 4TB that are wrong, it becomes a read/modify/write instead of just a write.
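
The difference between the two operations can be sketched with a toy model in Python. This is only an illustration, assuming simple single-disk XOR parity and small in-memory lists standing in for disks; the names `expected_parity`, `parity_sync`, and `correcting_check` are hypothetical and are not unRAID functions.

```python
from functools import reduce
from typing import List

Disk = List[int]  # toy disk: one integer per block (hypothetical model)


def expected_parity(data_disks: List[Disk], block: int) -> int:
    """XOR all data disks at this block; past the end of a disk counts as 0."""
    return reduce(
        lambda acc, disk: acc ^ (disk[block] if block < len(disk) else 0),
        data_disks,
        0,
    )


def parity_sync(data_disks: List[Disk], parity: Disk) -> int:
    """One pass: read the data and write parity unconditionally."""
    for block in range(len(parity)):
        parity[block] = expected_parity(data_disks, block)
    return len(parity)  # number of parity blocks written


def correcting_check(data_disks: List[Disk], parity: Disk) -> int:
    """Read data and parity, compare, and rewrite only mismatching blocks."""
    errors = 0
    for block in range(len(parity)):
        want = expected_parity(data_disks, block)
        if parity[block] != want:  # parity was read and compared first
            parity[block] = want   # then the corrected block is written
            errors += 1
    return errors  # number of errors corrected


# Two 4-block data disks and an 8-block parity disk whose upper half is junk.
data = [[1, 2, 3, 4], [5, 6, 7, 8]]
parity = [1 ^ 5, 2 ^ 6, 3 ^ 7, 4 ^ 8, 9, 9, 9, 9]

corrected = correcting_check(data, parity)
print(corrected)   # 4
print(parity[4:])  # [0, 0, 0, 0]
```

In this model, every block beyond the end of the data disks should hold zero parity, so junk in the upper half of a larger parity disk shows up as one error per block, and each of those blocks costs a read and compare before its write.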

If only I'd thought of that.

Oh well. Issue will be fixed soon enough.

Thank you both for your help.

Yours is a special case, since you have such a large amount of incorrect parity: a huge number of blocks to first check and then correct. With only 100 incorrect blocks, the time to write the 100 corrections would be negligible compared to the total read time.

