April 13, 2012

I just precleared a used WD20EARS. I noticed no errors in the syslog during the pre-read and zeroing phases. However, during the post-read step I started seeing these:

Apr 11 14:24:06 Tower kernel: ata7.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x0 (Errors)
Apr 11 14:24:06 Tower kernel: res 41/40:00:20:80:8d/db:00:04:00:00/40 Emask 0x409 (media error) <F> (Errors)
Apr 11 14:24:06 Tower kernel: ata7.00: error: { UNC } (Errors)
Apr 11 15:06:38 Tower kernel: ata7.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x0 (Errors)
Apr 11 15:06:38 Tower kernel: res 41/40:00:70:06:02/db:00:09:00:00/40 Emask 0x409 (media error) <F> (Errors)
Apr 11 15:06:38 Tower kernel: ata7.00: error: { UNC } (Errors)
Apr 11 15:06:41 Tower kernel: ata7.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x0 (Errors)
Apr 11 15:06:41 Tower kernel: res 41/40:00:70:06:02/db:00:09:00:00/40 Emask 0x409 (media error) <F> (Errors)
Apr 11 15:06:41 Tower kernel: ata7.00: error: { UNC } (Errors)
Apr 11 15:06:44 Tower kernel: ata7.00: exception Emask 0x0 SAct 0x3 SErr 0x0 action 0x0 (Errors)
Apr 11 15:06:44 Tower kernel: res 41/40:00:68:06:02/db:00:09:00:00/40 Emask 0x409 (media error) <F> (Errors)

This is a used WD20EARS with 13,000 power-on hours. It was used in a QNAP 419P in RAID1 without issues. The SMART report shows 0 sectors pending re-allocation before clearing and 15 after.

Also, the post-read took 3x as long as the pre-read or zeroing. Pre-read and zeroing averaged 55 MB/sec; the post-read averaged 19 MB/sec.

Thanks.
April 13, 2012

Basically, media errors are unreadable sectors. You found 15. The preclear certainly did not make them unreadable; it just tried to read every sector. Defects in the disk platters cause them.

Post-read normally takes longer. It is not just reading, but also verifying that everything read back is zeros. Pre-read does no validation, so it can go faster.
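For anyone who wants to see which sector a given "res" line points at, here is a minimal Python sketch. It assumes the usual libata layout of that line (status/error:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device) and that the drive reports a 48-bit LBA there. The function name `decode_res` is made up for this illustration, so treat it as a rough aid rather than an authoritative decoder.

```python
import re

# Twelve hex bytes separated by "/" or ":", as printed after "res" by libata
# (assumed layout, see the lead-in above).
RES_RE = re.compile(r"res ((?:[0-9a-f]{2}[/:]){11}[0-9a-f]{2})")


def decode_res(line):
    """Return status, error, UNC flag and 48-bit LBA from a 'res' syslog line."""
    match = RES_RE.search(line)
    if match is None:
        raise ValueError("no 'res' taskfile found in line")
    values = [int(b, 16) for b in re.split(r"[/:]", match.group(1))]
    (status, error, _nsect, lbal, lbam, lbah,
     _hob_feature, _hob_nsect, hob_lbal, hob_lbam, hob_lbah, _device) = values
    # Low three LBA bytes come from the first group, high three from the "hob" group.
    lba = (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24) \
        | (lbah << 16) | (lbam << 8) | lbal
    return {
        "status": status,
        "error": error,
        "unc": bool(error & 0x40),  # bit 6 of the error register = uncorrectable
        "lba": lba,
    }


if __name__ == "__main__":
    sample = ("Apr 11 14:24:06 Tower kernel: res 41/40:00:20:80:8d/"
              "db:00:04:00:00/40 Emask 0x409 (media error) <F>")
    print(decode_res(sample))
```

Under those assumptions, the first error in the log above decodes to LBA 76382240 and the 15:06:38 error to LBA 151127664, both with the UNC bit set.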
April 13, 2012

"So this drive is ok for use or should I run it through again? It is pre-cleared..."

There is no guarantee it will not continue to develop additional unreadable sectors. I'd run it through another preclear cycle. If no more unreadable sectors appear, use the drive. If more appear, rinse and repeat.
April 13, 2012 (Author)

So the theory here is that if the drive ends up with 15 sectors pending re-allocation or fewer, it is "stable", and if it continues to get worse, it should be RMA'd. Is this correct?
April 13, 2012 (Author)

Update: round 2 of preclearing is underway. I started the pre-read at 10pm last night, and at 4am this morning it had reached 25%. That is MUCH slower than the first round: at this rate the pre-read will take ~24 hours, when it took 9 hours on the first pass.

Here is the email report for the first pass:

== Last Cycle's Pre Read Time : 9:47:40 (56 MB/s)
== Last Cycle's Zeroing time : 9:58:34 (55 MB/s)
== Last Cycle's Post Read Time : 30:21:16 (18 MB/s)
== Last Cycle's Total Time : 50:08:40

The last WD20EARS drive I precleared reported this:

== Last Cycle's Pre Read Time : 6:41:21 (83 MB/s)
== Last Cycle's Zeroing time : 6:24:36 (86 MB/s)
== Last Cycle's Post Read Time : 13:11:23 (42 MB/s)
== Last Cycle's Total Time : 26:18:28

What changed? The most recent drive is on the secondary SATA controller on the motherboard, a JMicron IDE/SATA combo unit.
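The MB/s figures in those reports can be sanity-checked from the elapsed times. The sketch below assumes a capacity of 2,000,398,934,016 bytes (the figure smartctl reports for this WD20EARS model later in the thread) and 1 MB = 1,000,000 bytes; the helper names are invented for this example, and how the preclear script itself rounds is a guess.

```python
CAPACITY_BYTES = 2_000_398_934_016  # assumed: WD20EARS capacity as reported by smartctl


def to_seconds(hms):
    """Convert an 'H:MM:SS' duration string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def mb_per_sec(capacity_bytes, hms):
    """Average throughput in MB/s (decimal megabytes) for one full pass."""
    return capacity_bytes / to_seconds(hms) / 1_000_000


if __name__ == "__main__":
    first_pass = [
        ("pre-read", "9:47:40"),
        ("zeroing", "9:58:34"),
        ("post-read", "30:21:16"),
    ]
    for phase, elapsed in first_pass:
        print(f"{phase:10s} {elapsed:>9s}  {mb_per_sec(CAPACITY_BYTES, elapsed):6.2f} MB/s")
```

Truncating the results to whole numbers gives 56, 55 and 18 MB/s, which matches the first report, so the slow post-read is a real drop in throughput and not a reporting quirk.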
April 13, 2012

I just added one of these drives (new), and mine took over 24 hours for the preclear, so that seems like a more normal amount of time. That drive then started failing about 3 weeks after I added it to my array, so good luck with yours.
April 13, 2012

"so the theory is here that if it results in 15 sectors pending re-allocation or less then the drive is "stable". If it continues to get worse then the drive should be RMA'd. is this correct?"

Actually, there should NEVER be any sectors pending re-allocation. I had misread your original post; I thought you had said the sectors were already re-allocated.

All sectors that are pending re-allocation should be re-allocated when the zeros are written to them. There should be no further unreadable sectors unless the zeros written during the clearing were unreadable in the post-read phase.

If the sectors were indeed already re-allocated and you run another preclear, then no additional sectors should show as re-allocated or pending re-allocation.

Joe L.
April 13, 2012 (Author)

OK, I'll report the findings after the second pass is complete. The pre-read is at 11 hours and 50%. During the first pass of the preclear, the whole pre-read took only 9 hours.
April 13, 2012 (Author)

The pre-read for pass 2 is now at 90% after 18 hours! The current pending sector count is now 34.
April 13, 2012

I have seen this happen 3 times with WD20EARS drives in the last year, and I RMA'd them all. My experience with these drives is that once they start developing pending sectors, the numbers keep growing. So better to RMA it now.
April 13, 2012 (Author)

Thanks. I'll RMA this one after I clear the next one, although that one already had multi-zone error rate numbers too.
April 13, 2012 (Author)

Update: the pre-read is complete after almost 20 hours. The drive is currently zeroing at 105 MB/sec, and the current pending sector count is down to 9.

Is this just the drive firmware remapping itself after discovering crappy sectors?
April 14, 2012

Just shows how unreliable that disk is. I wouldn't trust any data to it. Look again when preclear has finished.
April 14, 2012

"current pending sector down to 9 is this just the drive firmware remapping itself after discovering crappy sectors?"

Yes, as the zeros are written to the sectors marked for re-allocation, they will be re-allocated. Typically, they will then show in the reallocated sector count. Occasionally they will be re-written in place and a re-allocation will not be necessary.
April 18, 2012 (Author)

OK, I ran the second drive through. It is the same size/model/age, and it only took 30ish hours for a single cycle. I think I will RMA the first drive and run another cycle on this drive. Results from the second drive:

Disk: /dev/sdh
smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build)
Copyright © 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD20EARS-00MVWB0
Serial Number:    WD-WCAZA0102247
Firmware Version: 50.0AB50
User Capacity:    2,000,398,934,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Tue Apr 17 00:24:52 2012 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (38400) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 255) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x3035) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   199   199   051    Pre-fail  Always       -       166
  3 Spin_Up_Time            0x0027   166   164   021    Pre-fail  Always       -       6700
  4 Start_Stop_Count        0x0032   092   092   000    Old_age   Always       -       8213
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   082   082   000    Old_age   Always       -       13426
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       14
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       13
193 Load_Cycle_Count        0x0032   144   144   000    Old_age   Always       -       169671
194 Temperature_Celsius     0x0022   120   111   000    Old_age   Always       -       30
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       13
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       4
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       5

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       70%     13391         3906014840
# 2  Short offline       Completed: read failure       60%     13390         3906014848
# 3  Short offline       Completed without error       00%     13330         -
# 4  Short offline       Completed without error       00%     13162         -
# 5  Short offline       Completed without error       00%     12994         -
# 6  Short offline       Completed without error       00%     12827         -
# 7  Short offline       Completed without error       00%     12660         -
# 8  Short offline       Completed without error       00%     12492         -
# 9  Short offline       Completed without error       00%     12324         -
#10  Short offline       Completed without error       00%     12157         -
#11  Short offline       Completed without error       00%     11989         -
#12  Short offline       Completed without error       00%     11821         -
#13  Short offline       Completed without error       00%     11653         -
#14  Short offline       Completed without error       00%     11485         -
#15  Short offline       Completed without error       00%     11318         -
#16  Short offline       Completed without error       00%     11150         -
#17  Short offline       Completed without error       00%     10982         -
#18  Short offline       Completed without error       00%     10814         -
#19  Short offline       Completed without error       00%     10647         -
#20  Short offline       Completed without error       00%     10479         -
#21  Short offline       Completed without error       00%     10311         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
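Since the discussion above keeps coming back to the pending and reallocated counts, here is a small Python sketch that pulls those raw values out of a saved smartctl attribute table. It assumes the ten-column layout shown in the report above and only looks at attributes 5, 197 and 198. The function names are hypothetical, and nothing here is part of smartctl or the preclear script.

```python
# Attribute IDs discussed in this thread: reallocated, pending, offline uncorrectable.
WATCHED_IDS = (5, 197, 198)

SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       13
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       4
"""


def parse_attributes(text):
    """Map attribute ID -> (name, raw value) for rows in the assumed 10-column layout."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and end with a numeric raw value.
        if len(parts) >= 10 and parts[0].isdigit() and parts[9].isdigit():
            attrs[int(parts[0])] = (parts[1], int(parts[9]))
    return attrs


def nonzero_watched(attrs):
    """Names of watched attributes whose raw value is above zero."""
    return [attrs[i][0] for i in WATCHED_IDS if i in attrs and attrs[i][1] > 0]


if __name__ == "__main__":
    print(nonzero_watched(parse_attributes(SAMPLE)))
```

Run against the three rows copied from the report above, this flags Current_Pending_Sector (13) and Offline_Uncorrectable (4), which is worth noting given the earlier advice that there should never be any sectors pending re-allocation.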