Parity sync errors.



So my UPS battery died yesterday and forced an unclean shutdown. When I started the array again, a parity check was performed automatically. It came back with ~1500 sync errors. The other interesting thing is that the last parity check (~2 months ago) found exactly the same number of sync errors. I have included the diagnostics. I was under the impression that, unless you specifically uncheck the option, that parity checks were correcting. This does not seem to be the case, as I see NOCORRECT in the syslog. In the main UnRAID window the correcting box was checked when I went to look after the check completed. My questions are:

 

1. Why wasn't a correcting check done, as I thought that was the default?

2. Does anyone see anything helpful in the syslog?

3. Should I go ahead and do a correcting check, or is something else warranted?

4. Can anyone tell if a particular disk may be the culprit?

 

Thanks for any wisdom people are willing to impart.

 

 

tower-diagnostics-20180619-2146.zip

Parity_check_history.jpg

Edited by ratmice
Link to comment
4 hours ago, ratmice said:

I was under the impression that, unless you specifically uncheck the option, that parity checks were correcting.

Automatic parity checks, such as the one started after an unclean shutdown, are always non-correcting. If errors are found, you need to run a correcting check afterwards. Alternatively, since errors are expected after an unclean shutdown, you can cancel the automatic check and start a correcting check right away.

Link to comment

The danger with automatic correction is that the system can overwrite a 99.99% valid parity with garbage because one or more of the data disks have failed and started to produce garbage. This is a particular risk because the unclean shutdown could itself have been caused by unclean power, possibly a nearby lightning strike.
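
To make that risk concrete, here is a toy sketch in Python. It assumes plain XOR parity over a single block position, with one made-up byte per data disk; the function names and values are purely illustrative and are not anything from UnRAID itself.

```python
from functools import reduce
from operator import xor


def compute_parity(data_blocks):
    """XOR of the same block position across all data disks."""
    return reduce(xor, data_blocks)


def rebuild(parity, surviving_blocks):
    """Reconstruct one missing block from parity and the other disks."""
    return reduce(xor, surviving_blocks, parity)


# One byte-sized "block" per data disk (toy values, not real data).
good = [0x5A, 0x3C, 0xF0]
parity = compute_parity(good)  # valid parity written while all disks were healthy

# Assumption for the example: disk 1 now returns garbage (0x99) on reads.
read_back = [0x5A, 0x99, 0xF0]

# A non-correcting check only reports the mismatch and leaves parity alone.
sync_error = compute_parity(read_back) != parity

# With the old parity intact, disk 1's true content can still be rebuilt.
recovered_before = rebuild(parity, [read_back[0], read_back[2]])

# A correcting check rewrites parity to match whatever the disks returned.
corrected_parity = compute_parity(read_back)

# After that, a rebuild of disk 1 reproduces the garbage, not the real data.
recovered_after = rebuild(corrected_parity, [read_back[0], read_back[2]])
```

In this toy case the rebuild gives back the original 0x3C as long as the old parity is kept, and gives back the garbage value once parity has been "corrected" against the bad read.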

 

So it's always important to do a non-correcting check first and see that the majority of the data has valid parity. Obviously, this step doesn't need to run to the full 100%. Even after 5%, it's enough to see that the parity is mostly correct, which means all array disks are up and running and producing correct data. Then it's safe to do a correcting check and fix the individual blocks that are incorrect because the volumes were mounted and disk writes were in progress during the shutdown/hang.
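
As a rough way to picture the "mostly correct" test, the sketch below compares the sync errors found so far with the number of blocks checked so far. The function name, the block counts, and the 0.01% cutoff (borrowed from the "99.99% valid" figure above) are assumptions for illustration only; UnRAID does not expose a check like this, and the right cutoff is a judgment call.

```python
def parity_mostly_valid(sync_errors, blocks_checked, max_error_ratio=0.0001):
    """Return True if the error ratio so far is at or below the cutoff.

    max_error_ratio is a hypothetical threshold, not an UnRAID setting.
    """
    if blocks_checked <= 0:
        raise ValueError("blocks_checked must be positive")
    return sync_errors / blocks_checked <= max_error_ratio


# Hypothetical numbers: 1500 errors after 50 million blocks looks like the
# expected damage from writes that were in flight during the shutdown.
few_errors_ok = parity_mostly_valid(1500, 50_000_000)

# Hypothetical numbers: errors on half of all blocks suggests a disk is
# returning garbage, so parity should not be overwritten.
many_errors_ok = parity_mostly_valid(25_000_000, 50_000_000)
```

With these made-up inputs the first case passes and the second does not, which matches the advice above: a small, scattered error count is safe to correct, while a large one points at a disk problem to investigate first.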

