Disk with errors that weren't corrected by parity

October 18, 2019

I have a disk that shows 39 errors, and the error count for parity shows 39 as well. I tried to run a parity check and fix the errors, however it did not fix them. I have a disk to replace the one with errors, but I am unsure which approach would be best:

1. Should I pull the disk, put a new one in, and then rebuild it from parity?

2. Should I install the new disk, transfer the data from the failing disk to it, and then remove the failing disk from the array?

October 18, 2019

3 hours ago, live4soccer7 said:

I tried to run a parity check and fix the errors, however it did not fix them.

That isn't how parity works, I'm afraid. Go to Tools -> Diagnostics and post the resulting zip file for advice on how best to proceed.

October 18, 2019

Author

tower-diagnostics-20191017-2121.zip

October 18, 2019

Community Expert

Parity was replaced and during the sync there were read errors on disk5, so parity wasn't 100% correct.

You then ran a correcting check and, luckily for you, there were no read errors on disk5 this time, so the earlier sync errors were corrected and parity is now in sync. Still, disk5 is past its best days and IMHO should be replaced now; just do a standard rebuild.
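To illustrate why a read error during the sync leaves parity wrong, and why a later correcting check with clean reads fixes it, here is a toy sketch of single XOR parity. It is not Unraid's implementation: the disk contents are made up, and treating the unreadable block as zeros is an assumption made only for the demonstration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Toy array: three data disks holding one 4-byte block each (hypothetical data).
disks = {
    "disk1": bytes([0x10, 0x20, 0x30, 0x40]),
    "disk2": bytes([0x01, 0x02, 0x03, 0x04]),
    "disk5": bytes([0xAA, 0xBB, 0xCC, 0xDD]),
}

# Parity sync during which disk5 could not be read. Assumption: the unreadable
# block is treated as zeros here; the real behavior may differ.
bad_read = dict(disks, disk5=bytes(4))
parity = xor_blocks(bad_read.values())

# Compare against what parity should have been: these are the "sync errors".
true_parity = xor_blocks(disks.values())
sync_errors = sum(p != t for p, t in zip(parity, true_parity))

# Correcting check with clean reads: parity is recomputed and overwritten.
parity = xor_blocks(disks.values())

# Standard rebuild: the replaced disk is the XOR of parity and the other disks.
rebuilt = xor_blocks([parity, disks["disk1"], disks["disk2"]])

print("sync errors before correction:", sync_errors)
print("rebuilt disk5 matches original:", rebuilt == disks["disk5"])
```

The point of the sketch is that a correcting check updates parity to match the data disks; it does not repair the data disk, which is why the rebuild onto a new disk is only safe once parity is back in sync.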

October 18, 2019

Author

Thank you very much for that information. I'll look up the procedure for replacing and rebuilding a disk to make sure I follow it correctly.

Is there a thread or FAQ on determining when a disk should be replaced? I realize that a lot of this will be up to the admin's discretion based on the SMART test results, but without knowing much about what the results mean, that's hard to judge. It's just a lack of experience on my part. I'm looking to learn a little.

October 18, 2019

Community Expert

The most common SMART attributes that point to a problem are pending and reallocated sectors. There are also other clues that don't always apply to every manufacturer; with WDs it's good to monitor these attributes:

Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   198   051    -    320
200 Multi_Zone_Error_Rate   ---R--   199   001   000    -    370

Ideally they should be 0. Very small values can be OK, but large values are a bad sign, especially together with errors like these:

Error 17560 [15] occurred at disk power-on lifetime: 59470 hours (2477 days + 22 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER -- ST COUNT  LBA_48  LH LM LL DV DC
  -- -- -- == -- == == == -- -- -- -- --
  40 -- 51 04 00 00 3e 00 a8 73 a8 40 00  Error: UNC at LBA = 0x3e00a873a8 = 266299012008

UNC at LBA errors are media errors, so the disk itself had a problem in the past, and it will likely fail again soon.
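As a rough way to watch for these signs, the sketch below scans the text of a SMART report in the layout shown above and lists the watched attributes whose raw value is above 0, plus any UNC LBAs from the error log. The function names and the `WATCHED_IDS` set are hypothetical; IDs 5 and 197 are the usual reallocated and pending sector attributes, and how large a value is acceptable remains a judgment call.

```python
import re

# Attribute IDs discussed above: 5 and 197 are reallocated and pending sectors,
# 1 and 200 are the ones suggested for WD disks. The set itself is an assumption.
WATCHED_IDS = {1, 5, 197, 200}

# One attribute row: ID, name, six-character flags, VALUE, WORST, THRESH, FAIL, RAW.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+([A-Z-]{6})\s+(\d+)\s+(\d+)\s+(\d+)\s+(\S+)\s+(\d+)"
)


def nonzero_watched(report: str) -> dict:
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    flagged = {}
    for line in report.splitlines():
        m = ROW.match(line)
        if m and int(m.group(1)) in WATCHED_IDS and int(m.group(8)) > 0:
            flagged[m.group(2)] = int(m.group(8))
    return flagged


def unc_lbas(report: str) -> list:
    """Collect the LBAs reported in 'Error: UNC at LBA = 0x...' lines."""
    found = re.findall(r"Error: UNC at LBA = 0x([0-9a-fA-F]+)", report)
    return [int(h, 16) for h in found]


# Sample lines copied from the report quoted in this thread.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     POSR-K   200   198   051    -    320
200 Multi_Zone_Error_Rate   ---R--   199   001   000    -    370
  40 -- 51 04 00 00 3e 00 a8 73 a8 40 00  Error: UNC at LBA = 0x3e00a873a8 = 266299012008
"""

print(nonzero_watched(SAMPLE))
print(unc_lbas(SAMPLE))
```

On the sample, both WD attributes are flagged and the decoded LBA agrees with the decimal value printed in the error log.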

October 18, 2019

Author

Thank you very much, again. I will definitely be familiarizing myself with these, as the array is fairly old now and I expect to see more failures.
