live4soccer7 Posted October 18, 2019

I have a disk that shows 39 errors, and the parity error count shows 39 as well. I ran a parity check with corrections enabled, but it did not fix them. I have a replacement for the disk with errors, but I am unsure which approach is best:

1. Pull the failing disk, put the new one in, and rebuild it from parity.
2. Install the new disk, transfer the data from the failing disk to it, and then remove the failing disk from the array.
John_M Posted October 18, 2019

3 hours ago, live4soccer7 said: "I tried to run a parity check and fix the errors, however it did not fix them."

That isn't how parity works, I'm afraid. Go to Tools -> Diagnostics and post the resulting zip file for advice on how best to proceed.
live4soccer7 Posted October 18, 2019

tower-diagnostics-20191017-2121.zip
JorgeB Posted October 18, 2019

Parity was replaced, and during the sync there were read errors on disk5, so parity wasn't 100% correct. You then ran a correcting check and, luckily for you, there were no read errors on disk5 this time, so the earlier sync errors were corrected and parity is now in sync. Still, disk5 is past its best days and IMHO should be replaced now; just do a standard rebuild.
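To illustrate why a read error during the sync leaves parity wrong, and why a rebuild depends on parity being in sync, here is a minimal Python sketch. It assumes plain single parity computed as a byte-wise XOR across the data disks; the disk contents and the `xor_parity` helper are made up for the example and are not Unraid's actual code.

```python
from functools import reduce

def xor_parity(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data disks, each holding one 4-byte block.
disks = [b"\x10\x20\x30\x40", b"\x01\x02\x03\x04", b"\xaa\xbb\xcc\xdd"]
parity = xor_parity(disks)

# Rebuild: XOR parity with every surviving disk to recover the missing one.
lost = 1
survivors = [d for i, d in enumerate(disks) if i != lost]
rebuilt = xor_parity(survivors + [parity])

# Hypothetical bad read during the sync: the last byte of disk 2 came back wrong,
# so the parity computed from it does not match the real data.
bad_read = [disks[0], disks[1], b"\xaa\xbb\xcc\x00"]
stale_parity = xor_parity(bad_read)

# Rebuilding disk 1 from that stale parity produces wrong data.
wrong = xor_parity([disks[0], disks[2], stale_parity])

print(rebuilt == disks[lost])  # rebuild from good parity matches the original
print(wrong == disks[lost])    # rebuild from stale parity does not
```

A correcting check that reads every disk cleanly recomputes the XOR and rewrites the parity blocks that disagree, which is the situation described above.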
live4soccer7 Posted October 18, 2019

Thank you very much for that information. I'll look up the procedure for replacing and rebuilding a disk to make sure I follow it correctly. Is there a thread or FAQ on determining when a disk should be replaced? I realize that a lot of this is up to the admin's discretion based on the SMART test results, but without knowing much about what the results mean, that's hard to judge. It's just a lack of experience on my part, and I'm looking to learn a little.
JorgeB Posted October 18, 2019

The most common SMART attributes that point to a problem are pending and reallocated sectors. Then there are other clues that don't always apply to all manufacturers. With WDs it's good to monitor these attributes:

Vendor Specific SMART Attributes with Thresholds:
ID#  ATTRIBUTE_NAME         FLAGS   VALUE  WORST  THRESH  FAIL  RAW_VALUE
  1  Raw_Read_Error_Rate    POSR-K  200    198    051     -     320
200  Multi_Zone_Error_Rate  ---R--  199    001    000     -     370

Ideally the raw values should be 0. Very small values can be OK, but large values are a bad sign, especially together with entries like this in the error log:

Error 17560 [15] occurred at disk power-on lifetime: 59470 hours (2477 days + 22 hours)
When the command that caused the error occurred, the device was active or idle.
After command completion occurred, registers were:
ER -- ST COUNT  LBA_48  LH LM LL DV DC
-- -- -- == -- == == == -- -- -- -- --
40 -- 51 04 00 00 3e 00 a8 73 a8 40 00
Error: UNC at LBA = 0x3e00a873a8 = 266299012008

UNC @ LBA errors are media errors, so this was a disk problem in the past, and the disk will likely fail again soon.
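For anyone who wants to check these values from a saved SMART report, here is a rough Python sketch. It assumes the attribute lines have the same eight-column layout as the excerpt above (ID, name, flags, value, worst, thresh, fail, raw) and that the raw value is a plain integer, which holds for these two attributes but not for every attribute on every drive. The `SAMPLE` text and the helper names `nonzero_watched` and `unc_lba` are made up for the example.

```python
import re

# Attribute lines copied from the excerpt above.
SAMPLE = """\
  1 Raw_Read_Error_Rate     POSR-K   200   198   051    -    320
200 Multi_Zone_Error_Rate   ---R--   199   001   000    -    370
"""

# The two attributes suggested above for WD drives.
WATCHED = {"Raw_Read_Error_Rate", "Multi_Zone_Error_Rate"}

def nonzero_watched(text):
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    found = {}
    for line in text.splitlines():
        fields = line.split()
        # Expect: ID, NAME, FLAGS, VALUE, WORST, THRESH, FAIL, RAW_VALUE
        if len(fields) >= 8 and fields[0].isdigit() and fields[1] in WATCHED:
            raw = int(fields[7])
            if raw > 0:
                found[fields[1]] = raw
    return found

def unc_lba(line):
    """Extract the LBA of an 'UNC at LBA' error log line as an integer, or None."""
    match = re.search(r"UNC at LBA = (0x[0-9a-fA-F]+)", line)
    return int(match.group(1), 16) if match else None

print(nonzero_watched(SAMPLE))
print(unc_lba("Error: UNC at LBA = 0x3e00a873a8 = 266299012008"))
```

Deciding what counts as a "large" value is still a judgment call, so the sketch only reports non-zero raw values rather than applying a threshold.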
live4soccer7 Posted October 18, 2019

Thank you very much, again. I will definitely be familiarizing myself with these, as the array is fairly old now and I expect to see more failures.