Parity check with 488378638 errors, where do I start?



I'm running unRAID 5.0-rc15.

 

Lots of parity updates and no logs.

 

9w3v.png

 

 

It would appear that all the errors came from disks 4 and 5. On this screen all disks appear to be sleeping (blinking) except for these two. There is no issue browsing them. Can anyone help explain what happened and, more importantly, what I need to do to fix this?

 

9nqt.png

 

Thank you


The good news is you haven't had any failed writes to those two drives -- otherwise they'd be red-balled and you'd potentially lose all of the data on both.
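To illustrate why losing two drives at once is the dangerous case, here is a minimal sketch of XOR-style single parity. It assumes one parity disk, and the one-byte "disks" and the `xor_all` helper are hypothetical names for illustration only, not unRAID's actual code.

```python
from functools import reduce
from operator import xor


def xor_all(blocks):
    """XOR together the same-offset block from every disk passed in."""
    return reduce(xor, blocks, 0)


# Hypothetical one-byte "disks"; real arrays do this for every sector.
disks = {"disk1": 0x3C, "disk4": 0xA5, "disk5": 0x0F}
parity = xor_all(disks.values())

# One disk missing: XOR of the survivors and parity gives back its contents.
survivors = [value for name, value in disks.items() if name != "disk4"]
rebuilt_disk4 = xor_all(survivors + [parity])

# Two disks missing: the same calculation only yields disk4 XOR disk5,
# which is not enough to recover either disk on its own.
combined = xor_all([disks["disk1"], parity])
```

With a single parity disk, one missing drive can be rebuilt from the others, but two missing drives leave only their combined XOR.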

 

This MAY simply be a loose cable and/or drives that have become insecurely fastened in a hot-swap cage.

Shut down;  re-seat (remove and push back in) the drives if in a hot-swap cage;  or unplug both the power and data cables from the drives and reseat them -- being certain they're securely fastened.

 

Then turn the system back on and see if that resolves the issue.  Reset your stats before you do anything else, so they start out at all zeroes.

 


I shut down, reseated all drives, and restarted. All looked normal upon restart. I kicked off a non-correcting parity check (I believe the previous one that found all the errors was correcting). The check is not complete yet, but it has gotten past the 2 TB mark, and the drives that appeared problematic are only 2 TB. This check shows no errors yet.

 

No errors is great but I'm rather confused. 

Why were all those errors logged previously?

If the last parity check was correcting then shouldn't I see lots of parity errors now that these drives are "better"?

 


In the absence of the actual details, it's only a guess ... but I suspect the following is what happened:

 

On the last parity check, the number of corrections was 317 fewer than the number of errors on the troublesome drives.

 

I suspect that means that almost all of the read errors were correctable on a retry ... so UnRAID updated the parity to match ... most likely the correct thing to do (as I've noted before, if you don't have backups to compare to, there's no way to know for sure). In the other 317 cases, UnRAID would have re-written the data to the disks to correct it ... and those writes were apparently all good (otherwise the disk would have been red-balled).
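As a rough way to reason about why the follow-up check can come back clean, here is a simplified model of a parity check using plain XOR. It is only a sketch: `compute_parity` and `parity_check` are hypothetical helpers, the 4-bit values are made up, and unRAID's real driver is more involved.

```python
from functools import reduce
from operator import xor


def compute_parity(data_blocks):
    """XOR of the same-offset block from every data disk."""
    return reduce(xor, data_blocks, 0)


def parity_check(data_blocks, parity, correcting):
    """Return (mismatch_found, parity_after_check)."""
    expected = compute_parity(data_blocks)
    if expected == parity:
        return False, parity
    # A correcting check rewrites parity; a non-correcting one only reports.
    return True, (expected if correcting else parity)


good = [0b1010, 0b0110, 0b1111]   # what the disks really hold
parity = compute_parity(good)

# Case A: a bad read reaches the check, and parity is rewritten to match it.
bad_read = [0b1010, 0b0111, 0b1111]
mismatch_a, parity_a = parity_check(bad_read, parity, correcting=True)
recheck_a, _ = parity_check(good, parity_a, correcting=False)

# Case B: the read succeeds on retry, so the check sees the real data.
mismatch_b, parity_b = parity_check(good, parity, correcting=True)
recheck_b, _ = parity_check(good, parity_b, correcting=False)
```

In this model, case A would leave mismatches for the next check to find (`recheck_a` is true), while case B leaves parity untouched and the next check is clean. A clean non-correcting check therefore suggests that parity currently agrees with what the disks return.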

 

In any event, all looks good now ... so it seems that the issue was indeed an insecure connection.

 
