Ruthalas Posted December 29, 2020 (edited)

I'd appreciate some help, as I am pretty worried about this. Thank you!

Summary:
Swapped in a new (larger) parity drive (sdd).
Used the old parity drive as a data disk and removed the old data disk.
Parity check ran and returned many errors.
Is the correct course of action to replace the new drive?

Longer story:
I recently purchased a 12TB drive to increase the size of my parity drive. After purchase I ran the WD Data Lifeguard Diagnostic on it. It failed the first time, and I realized the drive was situated in the hot exhaust of my case and was hot enough to burn my fingers. After moving it, I ran the diagnostic again and it passed both the quick and extended tests. I then zeroed the drive with the same tool with no issues.

I then performed the parity/data shuffle as per the wiki, and everything seemed to go smoothly (it took three days or so). A scheduled parity check was initiated about a day after the rebuild and ran for two days; when finished it returned 976214698 errors.

SMART stats seem to indicate the new drive is indeed new and has no issues. The array was in use during the parity check (I think that's fine?). There was a temperature warning for a different disk in the array once during the rebuild, but it is physically distant from the new disk in question. Diagnostic files are attached.

My immediate thought is that the drive is bad, and that I should have heeded the issues it had when I ran tests while it was warm. That being the case, I believe simply returning this drive and replacing it immediately with another would be the correct action?

I wanted to post to make sure I wasn't missing anything, and to check that replacing the parity disk is the correct course of action. If there's any other info I can provide, let me know!

Thank you for your time,
Ruthalas

markus-pc-diagnostics-20201229-1830.zip

Edited December 30, 2020 by Ruthalas (SOLVED)
JorgeB Posted December 29, 2020

There's a known bug with the parity swap procedure where it sometimes doesn't correctly zero the remainder of the new parity disk, so the check will find sync errors once it passes the old parity size. If that was the cause, your data is fine.
JorgeB Posted December 29, 2020

Also, you need to correct those errors if it was a non-correcting check.
Ruthalas Posted December 29, 2020 (edited)

Quote: "...you need to correct those errors..."
My parity checks are currently set to write corrections to the disk, so I think I am good on that front.

Quote: "There's a known bug..."
In that case, should I re-run the check and see if it turns out correctly this time?
(Edit: Yes, the new drive is ~4TB larger than the old one.)

Edited December 29, 2020 by Ruthalas
JorgeB Posted December 29, 2020

If you noticed that the sync errors only started after the 4TB mark, it's probably not needed, but it can't hurt to run a non-correcting check. Also, correcting checks should only be run if errors are expected.
Ruthalas Posted December 29, 2020

I am not sure how I'd determine when the sync errors started. Is that something I can find in logs?

Gotcha. I believe I neglected to turn the correcting option off after testing it some time ago. Thanks for letting me know.
JorgeB Posted December 29, 2020

3 minutes ago, Ruthalas said: "Is that something I can find in logs?"

Yes. Old parity was 8TB, correct? If so, the first sync error is on sector=15628053064, which is exactly at the 8TB mark, so it was the bug. But as mentioned, there's no harm in running another check to confirm all is well.
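The arithmetic behind "exactly at the 8TB mark" can be sketched as follows. This is a minimal illustration that assumes the sector number in the log counts 512-byte sectors and that drive sizes are quoted in decimal terabytes; the function name is made up for this example.

```python
SECTOR_BYTES = 512  # assumption: the log counts 512-byte sectors


def sector_to_tb(sector: int) -> float:
    """Convert a sector number to a byte offset in decimal terabytes."""
    return sector * SECTOR_BYTES / 1e12


# Sector number quoted in the post above.
first_error_tb = sector_to_tb(15628053064)
print(f"{first_error_tb:.3f} TB")
```

Under those assumptions the first sync error sits roughly 8.0 TB into the disk, which lines up with the size of the old parity drive.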
Ruthalas Posted December 29, 2020

That's correct. I'll run another check to confirm. Thank you very much, I appreciate the assistance.
TessyPowder Posted April 25, 2021

On 12/29/2020 at 8:32 PM, JorgeB said: "If so first sync error is on sector=15628053064, which is exactly at the 8TB mark, so it was the bug, but like mentioned no harm in running another check to confirm all is well."

That was very helpful. I was getting worried because I had this same problem with millions of sync errors. I don't know if I just didn't see it in the docs, but if this known bug isn't in the documentation, I think it would be very important to add it. It took me multiple search terms and 20 forum/reddit posts to get here.

For others with this problem: you can confirm that your errors are related to this known bug by looking in the syslog, finding the first sector that got repaired (it will only print the first 100 errors in the log, so it should not be hard to find), and calculating the following:

[first_repaired_sector_number] * (512 / 1000000000) = [position_of_sector_in_gigabytes]

If the result is close to the size of the previously replaced parity drive, you are affected by this bug.
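The check described above could be scripted roughly like this. It is only a sketch under stated assumptions: the helper names, the tolerance, and the sample log lines are hypothetical, and the only thing taken from the thread is the `sector=<number>` token and the 512-byte sector size. Other syslog entries may also contain `sector=`, so in practice you would first narrow the input to the parity check lines.

```python
import re

SECTOR_BYTES = 512  # assumption: sectors in the log are 512 bytes


def first_error_sector(log_lines):
    """Return the first 'sector=<n>' value found in the lines, or None."""
    for line in log_lines:
        match = re.search(r"sector=(\d+)", line)
        if match:
            return int(match.group(1))
    return None


def matches_old_parity(sector, old_parity_tb, tolerance_tb=0.05):
    """True if the sector's offset is within tolerance of the old parity size."""
    offset_tb = sector * SECTOR_BYTES / 1e12
    return abs(offset_tb - old_parity_tb) <= tolerance_tb


# Hypothetical log lines; real syslog entries will look different.
sample = ["some unrelated line", "example entry sector=15628053064"]
sector = first_error_sector(sample)
print(sector, matches_old_parity(sector, old_parity_tb=8))
```

The tolerance is there because drive sizes are rounded when quoted (an "8TB" drive is slightly over 8.0 TB in bytes), so an exact equality test would be too strict.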