(SOLVED) Swapped Parity Drive, Parity Check Yields 976214698 errors


Ruthalas


I'd appreciate some help; I am pretty worried about this. Thank you!

Summary:

  1. Swapped in a new (larger) parity drive (sdd)
  2. Used old parity drive as data disk and removed old data disk
  3. Parity check ran and returned many errors
  4. Is the correct course of action to replace the new drive?

 

Longer story:

  1. I recently purchased a 12TB drive to increase the size of my parity drive.
  2. After purchase I ran the WD Data Lifeguard Diagnostic on it.
  3. It failed the first time, and I realized the drive was situated in the hot exhaust of my case and was hot enough to burn my fingers. :/
  4. After moving it, I ran the diagnostic again and it passed, both quick and extended tests
  5. I then zeroed the drive with the same tool with no issues.
  6. I then performed the parity/data shuffle as per the wiki, and everything seemed to go smoothly (it took three days or so).
  7. A scheduled parity check was initiated about a day after the rebuild, and ran for two days; when finished it returned 976214698 errors.

 

  • SMART stats seem to indicate the new drive is indeed new, and has no issues.
  • The array was in use during the parity check (I think that's fine?).
  • There was a temperature warning for a different disk in the array once during the rebuild, but it is physically distant from the new disk in question.
  • Diagnostic files are attached.


My immediate thought is that the drive is bad, and I should have heeded the issues it had when I ran tests when it was warm.
That being the case, I believe simply returning this drive and replacing it immediately with another would be the correct action?

I wanted to post to make sure I wasn't missing anything, and to check that replacing the parity disk is the correct course of action.

 

If there's any other info I can provide, let me know!

Thank you for your time,

Ruthalas

markus-pc-diagnostics-20201229-1830.zip

Edited by Ruthalas
(SOLVED)
Quote

...you need to correct those errors...

My parity checks are currently set to write corrections to the disk, so I think I am good on that front.

 

Quote

There's a known bug...

In that case, should I re-run the check and see if it turns out correctly this time?
(Edit: Yes, the new drive is ~4TB larger than the old one.)

Edited by Ruthalas
  • 3 months later...
On 12/29/2020 at 8:32 PM, JorgeB said:

If so first sync error is on sector=15628053064, which is exactly at the 8TB mark, so it was the bug, but like mentioned no harm in running another check to confirm all is well.

That was very helpful. I was getting worried because I had this same problem, with millions of sync errors.
I don't know if I just didn't see it in the docs, but if this known bug isn't in the documentation, I think it would be very important to add it.

It took me multiple search terms and 20 forum/reddit posts to get here.

 

For others with this problem:
You can confirm that your problem is caused by this known bug by finding the first repaired sector in the syslog (only the first 100 errors are printed to the log, so it should not be hard to find) and calculating the following:

[first_repaired_sector_number] * (512 / 1000000000) = [position_of_sector_in_gigabytes]

If the result is close to the size of the old parity drive you replaced (for example, about 8000 GB for an 8TB drive), you are affected by this bug.
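If you'd rather not do the math by hand, here is a minimal Python sketch of the same calculation. It assumes the sector numbers in the syslog are 512-byte sectors (as in the formula above), and the function names and the 10 GB tolerance are just my own picks for illustration, not anything from Unraid itself.

```python
SECTOR_BYTES = 512  # assumption: syslog sector numbers count 512-byte sectors


def sector_to_gb(sector, sector_bytes=SECTOR_BYTES):
    """Convert a sector number to its position on the disk in decimal GB."""
    return sector * sector_bytes / 1_000_000_000


def near_old_parity_size(sector, old_parity_tb, tolerance_gb=10.0):
    """True if the sector sits within tolerance_gb of the old parity drive's size.

    tolerance_gb is an arbitrary illustrative margin, not an official value.
    """
    old_size_gb = old_parity_tb * 1000  # drive sizes are sold in decimal TB
    return abs(sector_to_gb(sector) - old_size_gb) <= tolerance_gb


if __name__ == "__main__":
    first_sector = 15628053064  # first sync error quoted earlier in this thread
    print(round(sector_to_gb(first_sector), 2))    # position in GB
    print(near_old_parity_size(first_sector, 8))   # old parity drive was 8TB
```

With the sector from JorgeB's post this lands right at the 8TB mark, which matches the size of the old parity drive in that case.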
 
