FreeMan Posted August 2, 2018
My machine kicked off its monthly parity check on the 1st. When I went to bed last night it had just passed the 4TB mark and had recorded 0 errors. By the time I got to the office, it had finished the check and reported 976 million errors! Funny thing: I've got an 8TB parity drive and none of my data drives are over 4TB in size. All the errors happened when it was comparing parity against nothing. The parity drive is a brand new WD, shucked from a MyBook about 3 weeks ago. I ran an extended SMART test on it and completely zeroed it while it was still in its shell, and saw no signs of any issues. Obviously, that's no guarantee that it hasn't gone bad, but one would hope not... Diagnostics attached, and another extended SMART test is kicking off right now. Can anyone read the tea leaves and give me a hint about what may have gone wrong? nas-diagnostics-20180802-1809.zip
pwm Posted August 3, 2018
Your errors started at sector 7814037064. That's byte 4 000 786 976 768 (at 512 bytes per sector). And your 976 million errors add up to about 4 TB. So somewhere down the line, only the content copied from a previous 4TB parity drive was correct, while the rest of the drive wasn't. No indication that you have any issue with the disk itself.
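The arithmetic in the post above can be checked with a few lines of Python. This is only a sketch: the 512-byte sector size follows from the sector and byte figures quoted in the post, but the 4096-byte size per reported error is an assumption (chosen because it makes 976 million errors come out near 4 TB), and the error count is the rounded "976 million" figure.

```python
# Figures taken from the thread
first_bad_sector = 7814037064     # first sector reported as a parity error
reported_errors = 976_000_000     # "976 million", rounded

SECTOR_BYTES = 512                # implied by the sector/byte figures in the post
ASSUMED_BLOCK_BYTES = 4096        # assumption: one reported error per 4 KiB block

# Byte offset on the parity drive where the errors begin
first_bad_byte = first_bad_sector * SECTOR_BYTES

# Approximate size of the region covered by the reported errors
bad_span_bytes = reported_errors * ASSUMED_BLOCK_BYTES

print(f"first bad byte : {first_bad_byte:,}")
print(f"offset in TB   : {first_bad_byte / 1e12:.3f}")
print(f"bad span in TB : {bad_span_bytes / 1e12:.3f}")
```

Both numbers land at roughly 4 TB, which matches the picture of a correct first half (copied from the old 4TB parity drive) followed by an incorrect second half.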
FreeMan Posted August 3, 2018
Thanks, that's comforting, since the previous parity drive was a 4TB drive that's now in the array as a data drive. Should I run a correcting parity check to "fix" it? I'm not sure how long it will be before I put another 8TB unit in the server, and having the errors staring at me until then seems... annoying...
pwm Posted August 3, 2018
Yes, you need to let unRAID correct the remaining 4TB. You don't want to see all these errors every time a parity check is run - with constant spurious noise you will not see any real parity errors happening.
FreeMan Posted August 3, 2018
pwm said: Yes, you need to let unRAID correct the remaining 4TB. You don't want to see all these errors every time a parity check is run - with constant spurious noise you will not see any real parity errors happening.
That's what I figured, just wanted to be sure. It seems odd that upgrading the size of the parity drive doesn't automatically zero out the "unused" portion of the parity drive. I don't ever remember having an issue like this in the past; is this something new?
JorgeB Posted August 4, 2018
18 hours ago, FreeMan said: It seems odd that upgrading the size of the parity drive doesn't automatically zero out the "unused" portion of the parity drive.
It does; something went wrong during the upgrade. Also, a new parity sync should be much faster than correcting all those errors.
FreeMan Posted August 4, 2018
JorgeB said: It does; something went wrong during the upgrade. Also, a new parity sync should be much faster than correcting all those errors.
Hmm... Would I have seen what went wrong had I checked the logs after the upgrade? And now you tell me... I'm at 56% and it's corrected 300-some million errors. I'll just let it finish from here.
pwm Posted August 4, 2018
A new parity sync is one pass: read the data disks and write to the parity disk(s). Correcting means reading the old data and parity, comparing them, and then writing the corrected parity blocks. So for the 4TB that are wrong, it becomes a read/modify/write instead of just a write.
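The difference between the two operations described above can be illustrated with a toy model. This is a minimal sketch, not unRAID's actual implementation: it assumes simple XOR parity, treats blocks past the end of a smaller data disk as zero, and all function names (`expected_parity`, `parity_sync`, `correcting_check`) are made up for illustration. It only counts block operations; it says nothing about real-world timings.

```python
from functools import reduce
from operator import xor


def expected_parity(data_disks, i):
    """XOR of block i across all data disks; missing blocks count as zero."""
    return reduce(xor, (d[i] for d in data_disks if i < len(d)), 0)


def parity_sync(data_disks, parity):
    """Rebuild parity from scratch: read data, write every parity block."""
    ops = {"data_reads": 0, "parity_reads": 0, "parity_writes": 0}
    for i in range(len(parity)):
        ops["data_reads"] += sum(1 for d in data_disks if i < len(d))
        parity[i] = expected_parity(data_disks, i)
        ops["parity_writes"] += 1
    return ops


def correcting_check(data_disks, parity):
    """Read data and parity, compare, and rewrite only the wrong blocks."""
    ops = {"data_reads": 0, "parity_reads": 0, "parity_writes": 0, "errors": 0}
    for i in range(len(parity)):
        ops["data_reads"] += sum(1 for d in data_disks if i < len(d))
        ops["parity_reads"] += 1
        want = expected_parity(data_disks, i)
        if parity[i] != want:
            ops["errors"] += 1
            parity[i] = want          # the extra write on top of the read
            ops["parity_writes"] += 1
    return ops


# Toy array: two 4-block data disks and an 8-block parity disk whose first
# half is correct and whose second half holds garbage (hypothetical values).
data = [[1, 2, 3, 4], [5, 6, 7, 8]]
stale = [expected_parity(data, i) for i in range(4)] + [9, 9, 9, 9]

sync_parity, check_parity = list(stale), list(stale)
sync_ops = parity_sync(data, sync_parity)
check_ops = correcting_check(data, check_parity)

print("sync :", sync_ops)
print("check:", check_ops)
```

Both runs end with identical parity contents. In this model the sync never reads the parity disk, while the correcting check reads every parity block and additionally writes each wrong one, which is the read/modify/write pattern mentioned in the post.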
FreeMan Posted August 4, 2018
pwm said: A new parity sync is one pass: read the data disks and write to the parity disk(s). Correcting means reading the old data and parity, comparing them, and then writing the corrected parity blocks. So for the 4TB that are wrong, it becomes a read/modify/write instead of just a write.
If only I'd thought of that. Oh well. Issue will be fixed soon enough. Thank you both for your help.
pwm Posted August 4, 2018
Just now, FreeMan said: If only I'd thought of that. Oh well. Issue will be fixed soon enough. Thank you both for your help.
Yours is a special case, since you have such a large amount of incorrect parity, so there is a huge number of blocks to first check and then correct. With only 100 incorrect blocks, the time to write the 100 corrections would be negligible compared with the total read time.
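To put the "special case" remark in numbers, here is a rough sketch comparing how many parity blocks need an extra write in each scenario. The 4096-byte block size and the nominal 8 × 10^12-byte parity capacity are assumptions, not figures from the diagnostics, and `extra_write_fraction` is a hypothetical helper.

```python
ASSUMED_BLOCK_BYTES = 4096            # assumption: one checked block is 4 KiB
ASSUMED_PARITY_BYTES = 8 * 10**12     # assumption: nominal 8 TB parity drive

total_blocks = ASSUMED_PARITY_BYTES // ASSUMED_BLOCK_BYTES


def extra_write_fraction(bad_blocks, total):
    """Share of parity blocks that must be rewritten during a correcting check."""
    return bad_blocks / total


# 100 bad blocks (the typical case) versus the rounded 976 million from this thread
for bad in (100, 976_000_000):
    frac = extra_write_fraction(bad, total_blocks)
    print(f"{bad:>11,} bad blocks -> {frac:.2e} of all parity blocks rewritten")
```

Under these assumptions, 100 bad blocks amount to a vanishing share of the drive, whereas 976 million bad blocks mean roughly half of the parity drive gets the extra write.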