Joe L. Posted April 20, 2012
Currently running a second pass now. I skipped the pre-read and went straight to writing zeros first. Since it just finished a post-read, that should do. I noticed it originally had 5 sectors pending re-allocation, and they were all re-allocated when writing zeros, but then in the post-read, 13 more were identified as unreadable. We'll see what happens this time.
mr-hexen Posted April 20, 2012
If it keeps getting worse I'll RMA it. I have another one that I think needs to be RMA'd as well. I'll post the results of that test shortly.
mr-hexen Posted April 20, 2012
Results of the other WD20EARS, cycle 1/2:
========================================================================1.13
== invoked as: ./preclear_disk.sh -M 4 /dev/sdh
== WDC WD20EARS-00MVWB0 WD-WCAZA0101187
== Disk /dev/sdh has been successfully precleared
== with a starting sector of 64
== Ran 1 cycle
==
== Using :Read block size = 8225280 Bytes
== Last Cycle's Pre Read Time : 9:47:40 (56 MB/s)
== Last Cycle's Zeroing time : 9:58:34 (55 MB/s)
== Last Cycle's Post Read Time : 30:21:16 (18 MB/s)
== Last Cycle's Total Time : 50:08:40
==
== Total Elapsed Time 50:08:40
==
== Disk Start Temperature: 26C
==
== Current Disk Temperature: 29C,
==
============================================================================
** Changed attributes in files: /tmp/smart_start_sdh /tmp/smart_finish_sdh
ATTRIBUTE NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS RAW_VALUE
Temperature_Celsius = 121 124 0 ok 29
No SMART attributes are FAILING_NOW

0 sectors were pending re-allocation before the start of the preclear.
0 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
15 sectors are pending re-allocation at the end of the preclear,
a change of 15 in the number of sectors pending re-allocation.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear,
the number of sectors re-allocated did not change.
============================================================================ ============================================================================ == == S.M.A.R.T Initial Report for /dev/sdh == Disk: /dev/sdh smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build) Copyright © 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net === START OF INFORMATION SECTION === Device Model: WDC WD20EARS-00MVWB0 Serial Number: WD-WCAZA0101187 Firmware Version: 50.0AB50 User Capacity: 2,000,398,934,016 bytes Device is: Not in smartctl database [for details use: -P showall] ATA Version is: 8 ATA Standard is: Exact ATA specification draft version not indicated Local Time is: Tue Apr 10 17:53:03 2012 EDT SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x82) Offline data collection activity was completed without error. Auto Offline Data Collection: Enabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection: (36000) seconds. Offline data collection capabilities: (0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability: (0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time: ( 2) minutes. Extended self-test routine recommended polling time: ( 255) minutes. Conveyance self-test routine recommended polling time: ( 5) minutes. 
SCT capabilities: (0x3035) SCT Status supported. SCT Feature Control supported. SCT Data Table supported. SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0 3 Spin_Up_Time 0x0027 166 165 021 Pre-fail Always - 6658 4 Start_Stop_Count 0x0032 092 092 000 Old_age Always - 8257 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0 7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0 9 Power_On_Hours 0x0032 082 082 000 Old_age Always - 13388 10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 16 192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 14 193 Load_Cycle_Count 0x0032 189 189 000 Old_age Always - 33160 194 Temperature_Celsius 0x0022 124 109 000 Old_age Always - 26 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0 200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0 SMART Error Log Version: 1 No Errors Logged SMART Self-test log structure revision number 1 Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error # 1 Short offline Completed without error 00% 13387 - # 2 Short offline Completed without error 00% 13326 - # 3 Short offline Completed without error 00% 13159 - # 4 Short offline Completed without error 00% 12991 - # 5 Short offline Completed without error 00% 12823 - # 6 Short offline Completed without error 00% 12656 - # 7 Short offline Completed without error 00% 12489 - # 8 Short offline Completed without error 00% 12321 - # 9 Short offline Completed without error 00% 12153 - #10 
Short offline Completed without error 00% 11985 - #11 Short offline Completed without error 00% 11818 - #12 Short offline Completed without error 00% 11650 - #13 Short offline Completed without error 00% 11482 - #14 Short offline Completed without error 00% 11315 - #15 Short offline Completed without error 00% 11147 - #16 Short offline Completed without error 00% 10979 - #17 Short offline Completed without error 00% 10811 - #18 Short offline Completed without error 00% 10644 - #19 Short offline Completed without error 00% 10476 - #20 Short offline Completed without error 00% 10308 - #21 Short offline Completed without error 00% 10140 - SMART Selective self-test log data structure revision number 1 SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS 1 0 0 Not_testing 2 0 0 Not_testing 3 0 0 Not_testing 4 0 0 Not_testing 5 0 0 Not_testing Selective self-test flags (0x0): After scanning selected spans, do NOT read-scan remainder of disk. If Selective self-test is pending on power-up, resume after 0 minute delay. == ============================================================================ ============================================================================ == == S.M.A.R.T Final Report for /dev/sdh == Disk: /dev/sdh smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build) Copyright © 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net === START OF INFORMATION SECTION === Device Model: WDC WD20EARS-00MVWB0 Serial Number: WD-WCAZA0101187 Firmware Version: 50.0AB50 User Capacity: 2,000,398,934,016 bytes Device is: Not in smartctl database [for details use: -P showall] ATA Version is: 8 ATA Standard is: Exact ATA specification draft version not indicated Local Time is: Thu Apr 12 20:01:42 2012 EDT SMART support is: Available - device has SMART capability. 
SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x84) Offline data collection activity was suspended by an interrupting command from host. Auto Offline Data Collection: Enabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection: (36000) seconds. Offline data collection capabilities: (0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability: (0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time: ( 2) minutes. Extended self-test routine recommended polling time: ( 255) minutes. Conveyance self-test routine recommended polling time: ( 5) minutes. SCT capabilities: (0x3035) SCT Status supported. SCT Feature Control supported. SCT Data Table supported. 
SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0 3 Spin_Up_Time 0x0027 166 165 021 Pre-fail Always - 6658 4 Start_Stop_Count 0x0032 092 092 000 Old_age Always - 8257 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0 7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0 9 Power_On_Hours 0x0032 082 082 000 Old_age Always - 13437 10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 16 192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 14 193 Load_Cycle_Count 0x0032 189 189 000 Old_age Always - 33161 194 Temperature_Celsius 0x0022 121 109 000 Old_age Always - 29 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 15 198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0 200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0 SMART Error Log Version: 1 No Errors Logged SMART Self-test log structure revision number 1 Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error # 1 Short offline Completed without error 00% 13387 - # 2 Short offline Completed without error 00% 13326 - # 3 Short offline Completed without error 00% 13159 - # 4 Short offline Completed without error 00% 12991 - # 5 Short offline Completed without error 00% 12823 - # 6 Short offline Completed without error 00% 12656 - # 7 Short offline Completed without error 00% 12489 - # 8 Short offline Completed without error 00% 12321 - # 9 Short offline Completed without error 00% 12153 - #10 Short offline Completed without error 00% 11985 - #11 Short offline Completed without error 00% 11818 - #12 
Short offline Completed without error 00% 11650 -
#13 Short offline Completed without error 00% 11482 -
#14 Short offline Completed without error 00% 11315 -
#15 Short offline Completed without error 00% 11147 -
#16 Short offline Completed without error 00% 10979 -
#17 Short offline Completed without error 00% 10811 -
#18 Short offline Completed without error 00% 10644 -
#19 Short offline Completed without error 00% 10476 -
#20 Short offline Completed without error 00% 10308 -
#21 Short offline Completed without error 00% 10140 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
==
============================================================================
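For anyone who would rather not compare the initial and final S.M.A.R.T reports by eye, here is a rough Python sketch that diffs the raw values of two attribute tables. It is not part of preclear_disk.sh; the names `parse_attributes` and `changed_raw_values` are made up for illustration, and it assumes the attribute rows are one per line, the way smartctl prints them. The sample rows are copied from the two reports above.

```python
import re

# One row of the "Vendor Specific SMART Attributes with Thresholds" table, e.g.
# "197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 15"
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+\S+\s+\S+\s+\S+\s+(\d+)"
)


def parse_attributes(report: str) -> dict:
    """Map attribute name -> raw value for every table row found in the text."""
    attrs = {}
    for line in report.splitlines():
        m = ROW.match(line)
        if m:
            attrs[m.group(2)] = int(m.group(6))
    return attrs


def changed_raw_values(before: str, after: str) -> dict:
    """Return {name: (old_raw, new_raw)} for attributes whose raw value changed."""
    old, new = parse_attributes(before), parse_attributes(after)
    return {
        name: (old[name], new[name])
        for name in new
        if name in old and old[name] != new[name]
    }


# Sample rows taken from the initial and final reports of cycle 1.
INITIAL = """\
  5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
  9 Power_On_Hours 0x0032 082 082 000 Old_age Always - 13388
194 Temperature_Celsius 0x0022 124 109 000 Old_age Always - 26
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 0
"""
FINAL = """\
  5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0
  9 Power_On_Hours 0x0032 082 082 000 Old_age Always - 13437
194 Temperature_Celsius 0x0022 121 109 000 Old_age Always - 29
197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 15
"""

if __name__ == "__main__":
    for name, (old, new) in changed_raw_values(INITIAL, FINAL).items():
        print(f"{name}: {old} -> {new}")
```

On these rows it lists Power_On_Hours, Temperature_Celsius and Current_Pending_Sector as changed, and leaves Reallocated_Sector_Ct out because its raw value stayed at 0.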
mr-hexen Posted April 20, 2012
Pass 2/2:
========================================================================1.13
== invoked as: ./preclear_disk.sh -M 4 /dev/sdh
== WDC WD20EARS-00MVWB0 WD-WCAZA0101187
== Disk /dev/sdh has been successfully precleared
== with a starting sector of 64
== Ran 1 cycle
==
== Using :Read block size = 8225280 Bytes
== Last Cycle's Pre Read Time : 20:22:52 (27 MB/s)
== Last Cycle's Zeroing time : 8:17:05 (67 MB/s)
== Last Cycle's Post Read Time : 23:25:29 (23 MB/s)
== Last Cycle's Total Time : 52:06:34
==
== Total Elapsed Time 52:06:34
==
== Disk Start Temperature: 29C
==
== Current Disk Temperature: 30C,
==
============================================================================
** Changed attributes in files: /tmp/smart_start_sdh /tmp/smart_finish_sdh
ATTRIBUTE NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS RAW_VALUE
Raw_Read_Error_Rate = 199 200 51 ok 398
Temperature_Celsius = 120 121 0 ok 30
No SMART attributes are FAILING_NOW

28 sectors were pending re-allocation before the start of the preclear.
34 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
5 sectors are pending re-allocation at the end of the preclear,
a change of -23 in the number of sectors pending re-allocation.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear,
the number of sectors re-allocated did not change.
============================================================================ ============================================================================ == == S.M.A.R.T Initial Report for /dev/sdh == Disk: /dev/sdh smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build) Copyright © 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net === START OF INFORMATION SECTION === Device Model: WDC WD20EARS-00MVWB0 Serial Number: WD-WCAZA0101187 Firmware Version: 50.0AB50 User Capacity: 2,000,398,934,016 bytes Device is: Not in smartctl database [for details use: -P showall] ATA Version is: 8 ATA Standard is: Exact ATA specification draft version not indicated Local Time is: Thu Apr 12 21:57:49 2012 EDT SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x84) Offline data collection activity was suspended by an interrupting command from host. Auto Offline Data Collection: Enabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection: (36000) seconds. Offline data collection capabilities: (0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability: (0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time: ( 2) minutes. Extended self-test routine recommended polling time: ( 255) minutes. Conveyance self-test routine recommended polling time: ( 5) minutes. 
SCT capabilities: (0x3035) SCT Status supported. SCT Feature Control supported. SCT Data Table supported. SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 0 3 Spin_Up_Time 0x0027 166 165 021 Pre-fail Always - 6658 4 Start_Stop_Count 0x0032 092 092 000 Old_age Always - 8257 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0 7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0 9 Power_On_Hours 0x0032 082 082 000 Old_age Always - 13439 10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 16 192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 14 193 Load_Cycle_Count 0x0032 189 189 000 Old_age Always - 33167 194 Temperature_Celsius 0x0022 121 109 000 Old_age Always - 29 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 28 198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0 200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0 SMART Error Log Version: 1 No Errors Logged SMART Self-test log structure revision number 1 Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error # 1 Short offline Completed without error 00% 13387 - # 2 Short offline Completed without error 00% 13326 - # 3 Short offline Completed without error 00% 13159 - # 4 Short offline Completed without error 00% 12991 - # 5 Short offline Completed without error 00% 12823 - # 6 Short offline Completed without error 00% 12656 - # 7 Short offline Completed without error 00% 12489 - # 8 Short offline Completed without error 00% 12321 - # 9 Short offline Completed without error 00% 12153 - #10 
Short offline Completed without error 00% 11985 - #11 Short offline Completed without error 00% 11818 - #12 Short offline Completed without error 00% 11650 - #13 Short offline Completed without error 00% 11482 - #14 Short offline Completed without error 00% 11315 - #15 Short offline Completed without error 00% 11147 - #16 Short offline Completed without error 00% 10979 - #17 Short offline Completed without error 00% 10811 - #18 Short offline Completed without error 00% 10644 - #19 Short offline Completed without error 00% 10476 - #20 Short offline Completed without error 00% 10308 - #21 Short offline Completed without error 00% 10140 - SMART Selective self-test log data structure revision number 1 SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS 1 0 0 Not_testing 2 0 0 Not_testing 3 0 0 Not_testing 4 0 0 Not_testing 5 0 0 Not_testing Selective self-test flags (0x0): After scanning selected spans, do NOT read-scan remainder of disk. If Selective self-test is pending on power-up, resume after 0 minute delay. == ============================================================================ ============================================================================ == == S.M.A.R.T Final Report for /dev/sdh == Disk: /dev/sdh smartctl 5.39.1 2010-01-28 r3054 [i486-slackware-linux-gnu] (local build) Copyright © 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net === START OF INFORMATION SECTION === Device Model: WDC WD20EARS-00MVWB0 Serial Number: WD-WCAZA0101187 Firmware Version: 50.0AB50 User Capacity: 2,000,398,934,016 bytes Device is: Not in smartctl database [for details use: -P showall] ATA Version is: 8 ATA Standard is: Exact ATA specification draft version not indicated Local Time is: Sun Apr 15 02:04:23 2012 EDT SMART support is: Available - device has SMART capability. 
SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x84) Offline data collection activity was suspended by an interrupting command from host. Auto Offline Data Collection: Enabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection: (36000) seconds. Offline data collection capabilities: (0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability: (0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time: ( 2) minutes. Extended self-test routine recommended polling time: ( 255) minutes. Conveyance self-test routine recommended polling time: ( 5) minutes. SCT capabilities: (0x3035) SCT Status supported. SCT Feature Control supported. SCT Data Table supported. 
SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x002f 199 199 051 Pre-fail Always - 398 3 Spin_Up_Time 0x0027 166 165 021 Pre-fail Always - 6658 4 Start_Stop_Count 0x0032 092 092 000 Old_age Always - 8257 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0 7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0 9 Power_On_Hours 0x0032 082 082 000 Old_age Always - 13491 10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 16 192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 14 193 Load_Cycle_Count 0x0032 189 189 000 Old_age Always - 33168 194 Temperature_Celsius 0x0022 120 109 000 Old_age Always - 30 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 200 200 000 Old_age Always - 5 198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 0 200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 0 SMART Error Log Version: 1 No Errors Logged SMART Self-test log structure revision number 1 Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error # 1 Short offline Completed without error 00% 13387 - # 2 Short offline Completed without error 00% 13326 - # 3 Short offline Completed without error 00% 13159 - # 4 Short offline Completed without error 00% 12991 - # 5 Short offline Completed without error 00% 12823 - # 6 Short offline Completed without error 00% 12656 - # 7 Short offline Completed without error 00% 12489 - # 8 Short offline Completed without error 00% 12321 - # 9 Short offline Completed without error 00% 12153 - #10 Short offline Completed without error 00% 11985 - #11 Short offline Completed without error 00% 11818 - 
#12 Short offline Completed without error 00% 11650 -
#13 Short offline Completed without error 00% 11482 -
#14 Short offline Completed without error 00% 11315 -
#15 Short offline Completed without error 00% 11147 -
#16 Short offline Completed without error 00% 10979 -
#17 Short offline Completed without error 00% 10811 -
#18 Short offline Completed without error 00% 10644 -
#19 Short offline Completed without error 00% 10476 -
#20 Short offline Completed without error 00% 10308 -
#21 Short offline Completed without error 00% 10140 -

SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
==
============================================================================
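The pending-sector lines in the preclear summary are the quickest way to see what each phase did to the drive. As a rough sketch (my own helper, not something preclear_disk.sh provides; `pending_by_stage` is a made-up name), the counts can be pulled out of the summary text in Python like this, assuming the wording matches the v1.13 output pasted above:

```python
import re

# Matches the four "N sectors were/are pending re-allocation <stage>" sentences
# in the preclear summary. The "a change of ..." clause has no leading count in
# this form, so it is not picked up.
PENDING = re.compile(
    r"(\d+) sectors (?:were|are) pending re-allocation "
    r"(before the start of the preclear|after pre-read|"
    r"after zero of disk|at the end of the preclear)"
)


def pending_by_stage(summary: str) -> list:
    """Return [(stage, count), ...] in the order the stages are reported."""
    return [(m.group(2), int(m.group(1))) for m in PENDING.finditer(summary)]


# Summary text copied from the pass 2/2 report.
SUMMARY = (
    "28 sectors were pending re-allocation before the start of the preclear. "
    "34 sectors were pending re-allocation after pre-read in cycle 1 of 1. "
    "0 sectors were pending re-allocation after zero of disk in cycle 1 of 1. "
    "5 sectors are pending re-allocation at the end of the preclear, "
    "a change of -23 in the number of sectors pending re-allocation. "
    "0 sectors had been re-allocated before the start of the preclear."
)

if __name__ == "__main__":
    for stage, count in pending_by_stage(SUMMARY):
        print(f"{count:3d}  {stage}")
```

For pass 2 this gives 28, 34, 0 and 5. Writing zeros cleared every pending sector while Reallocated_Sector_Ct stayed at 0, and the post-read then turned up 5 new unreadable sectors.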
mr-hexen Posted April 20, 2012
Of note (I believe, anyway) is that the first drive, #1187, took a little over 50 hours for each pass, while drive #2247 took just over 30 hours. Same cables, ports, etc.
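The MB/s figures in the preclear reports can be reproduced from the drive capacity and the elapsed time of each phase, which makes it easy to compare passes or drives. A small Python sketch, assuming decimal megabytes (1 MB = 1,000,000 bytes), the 2,000,398,934,016-byte capacity that smartctl reports for this drive, and that the script truncates rather than rounds; `to_seconds` and `mb_per_s` are names I made up:

```python
CAPACITY_BYTES = 2_000_398_934_016  # "User Capacity" from the smartctl output


def to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' elapsed time, as printed by preclear, to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def mb_per_s(capacity_bytes: int, hms: str) -> float:
    """Average throughput in decimal MB/s over one full pass of the disk."""
    return capacity_bytes / to_seconds(hms) / 1_000_000


# Phase times copied from the two reports for drive WD-WCAZA0101187.
PHASES = {
    "pass 1 pre-read": "9:47:40",
    "pass 1 zeroing": "9:58:34",
    "pass 1 post-read": "30:21:16",
    "pass 2 pre-read": "20:22:52",
    "pass 2 zeroing": "8:17:05",
    "pass 2 post-read": "23:25:29",
}

if __name__ == "__main__":
    for name, elapsed in PHASES.items():
        print(f"{name:18s} {elapsed:>9s}  {mb_per_s(CAPACITY_BYTES, elapsed):5.1f} MB/s")
```

The slow phases on this drive are the reads (18 to 27 MB/s against 55 to 67 MB/s for zeroing), which is where most of the extra hours went.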
FrackinFrog Posted April 20, 2012
I'm preclearing my first 3.0TB drive using preclear v1.13, and prior to starting the preclear it says that the cleared drive size will be 2.2TB. Is that actually going to be the final size after preclear, or is that just the way preclear needs to handle larger drives and reports it smaller than it really is? This is on a system running 5.0-beta12a with 2x BR10i controllers (with the latest firmware flash from this thread - http://lime-technology.com/forum/index.php?topic=12767.0).
Joe L. Posted April 20, 2012
FrackinFrog wrote: "I'm preclearing my first 3.0TB drive using preclear v1.13, and prior to starting the preclear it says that the cleared drive size will be 2.2TB. Is that actually going to be the final size after preclear, or is that just the way preclear needs to handle larger drives and reports it smaller than it really is? This is on a system running 5.0-beta12a with 2x BR10i controllers (with the latest firmware flash from this thread - http://lime-technology.com/forum/index.php?topic=12767.0)."
It will report the partition as 2.2TB to older utilities. I've never seen the message you described, but then I've never seen a 3TB drive. It is probably just how "fdisk" is describing it (and fdisk is an older utility). Don't worry though, the preclear script does not use "fdisk" to actually create a partition.
Joe L.
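For what it's worth, the 2.2TB figure most likely comes from the classic MBR partition table, which stores sector numbers and counts in 32-bit fields. Assuming 512-byte logical sectors (an assumption here, the thread does not say what the 3TB drive reports), the arithmetic works out like this in Python; `max_addressable_bytes` is just an illustrative name:

```python
def max_addressable_bytes(sector_bytes: int = 512, lba_bits: int = 32) -> int:
    """Largest size describable with `lba_bits`-bit sector counts."""
    return (2 ** lba_bits) * sector_bytes


if __name__ == "__main__":
    limit = max_addressable_bytes()
    # 2**32 sectors * 512 bytes = 2,199,023,255,552 bytes, i.e. about 2.2 TB
    print(f"{limit:,} bytes = {limit / 1e12:.2f} TB")
```

Any tool that only understands that table format will show roughly 2.2TB for a larger drive, regardless of how the partition was actually created.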
mr-hexen Posted April 20, 2012
So pass #2 failed after the zeroing of the drive. Got this email this morning: "Preclear Disk /dev/sdh FAILED!!!!" No idea why it failed though; I will find out when I get home. Guess I have TWO WD20EARS drives to RMA...
mr-hexen Posted April 20, 2012
This is all over the syslog at the time the preclear failed:
Apr 20 03:15:57 Tower kernel: ata7.00: exception Emask 0x0 SAct 0x7fffffff SErr 0x0 action 0x6 frozen (Errors)
Apr 20 03:15:57 Tower kernel: ata7.00: failed command: WRITE FPDMA QUEUED (Minor Issues)
Apr 20 03:15:57 Tower kernel: ata7.00: cmd 61/00:00:20:eb:8e/04:00:e1:00:00/40 tag 0 ncq 524288 out (Drive related)
Apr 20 03:15:57 Tower kernel: res 40/00:54:f8:b6:85/67:00:e1:00:00/40 Emask 0x4 (timeout) (Errors)
Apr 20 03:15:57 Tower kernel: ata7.00: status: { DRDY } (Drive related)
Apr 20 03:15:57 Tower kernel: ata7.00: failed command: WRITE FPDMA QUEUED (Minor Issues)
Apr 20 03:15:57 Tower kernel: ata7.00: cmd 61/00:08:20:ef:8e/04:00:e1:00:00/40 tag 1 ncq 524288 out (Drive related)
Apr 20 03:15:57 Tower kernel: res 40/00:00:90:4e:fa/67:00:e1:00:00/40 Emask 0x4 (timeout) (Errors)
Apr 20 03:15:57 Tower kernel: ata7.00: status: { DRDY } (Drive related)
Apr 20 03:15:57 Tower kernel: ata7.00: failed command: WRITE FPDMA QUEUED (Minor Issues)
Apr 20 03:15:57 Tower kernel: ata7.00: cmd 61/00:10:20:77:8e/04:00:e1:00:00/40 tag 2 ncq 524288 out (Drive related)
Apr 20 03:15:57 Tower kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) (Errors)
Apr 20 03:15:57 Tower kernel: ata7.00: status: { DRDY } (Drive related)
Apr 20 03:15:57 Tower kernel: ata7.00: failed command: WRITE FPDMA QUEUED (Minor Issues)
Apr 20 03:15:57 Tower kernel: ata7.00: cmd 61/00:18:20:7b:8e/04:00:e1:00:00/40 tag 3 ncq 524288 out (Drive related)
Apr 20 03:15:57 Tower kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) (Errors)
Apr 20 03:15:57 Tower kernel: ata7.00: status: { DRDY } (Drive related)
Apr 20 03:15:57 Tower kernel: ata7.00: failed command: WRITE FPDMA QUEUED (Minor Issues)
Apr 20 03:15:57 Tower kernel: ata7.00: cmd 61/00:20:20:7f:8e/04:00:e1:00:00/40 tag 4 ncq 524288 out (Drive related)
Apr 20 03:15:57 Tower kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) (Errors)

The drive later got re-assigned from sdh to sdi.
Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] ASC=0x0 ASCQ=0x0 (Drive related)
Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] CDB: cdb[0]=0x2a: 2a 00 e1 8e a7 20 00 04 00 00 (Drive related)
Apr 20 03:17:11 Tower kernel: end_request: I/O error, dev sdh, sector 3784222496 (Errors)
Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Result: hostbyte=0x00 driverbyte=0x08 (System)
Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Sense Key : 0xb [current] [descriptor] (Drive related)
Apr 20 03:17:11 Tower kernel: Descriptor sense data with sense descriptors (in hex):
Apr 20 03:17:11 Tower kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
Apr 20 03:17:11 Tower kernel: 00 00 00 00
Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] ASC=0x0 ASCQ=0x0 (Drive related)
Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] CDB: cdb[0]=0x2a: 2a 00 e1 8e ab 20 00 04 00 00 (Drive related)
Apr 20 03:17:11 Tower kernel: end_request: I/O error, dev sdh, sector 3784223520 (Errors)
Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Result: hostbyte=0x00 driverbyte=0x08 (System)
Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Sense Key : 0xb [current] [descriptor] (Drive related)
Apr 20 03:17:11 Tower kernel: Descriptor sense data with sense descriptors (in hex):
Apr 20 03:17:11 Tower kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
Apr 20 03:17:11 Tower kernel: 00 00 00 00
Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] ASC=0x0 ASCQ=0x0 (Drive related)
Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] CDB: cdb[0]=0x2a: 2a 00 e1 8e af 20 00 04 00 00 (Drive related)
Apr 20 03:17:11 Tower kernel: end_request: I/O error, dev sdh, sector 3784224544 (Errors)
Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Result: hostbyte=0x00 driverbyte=0x08 (System)
Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Sense Key : 0xb [current] [descriptor] (Drive related)
Apr 20 03:17:11 Tower kernel:
Descriptor sense data with sense descriptors (in hex): Apr 20 03:17:11 Tower kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 Apr 20 03:17:11 Tower kernel: 00 00 00 00 Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] ASC=0x0 ASCQ=0x0 (Drive related) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] CDB: cdb[0]=0x2a: 2a 00 e1 8e b3 20 00 04 00 00 (Drive related) Apr 20 03:17:11 Tower kernel: end_request: I/O error, dev sdh, sector 3784225568 (Errors) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Result: hostbyte=0x00 driverbyte=0x08 (System) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Sense Key : 0xb [current] [descriptor] (Drive related) Apr 20 03:17:11 Tower kernel: Descriptor sense data with sense descriptors (in hex): Apr 20 03:17:11 Tower kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 Apr 20 03:17:11 Tower kernel: 00 00 00 00 Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] ASC=0x0 ASCQ=0x0 (Drive related) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] CDB: cdb[0]=0x2a: 2a 00 e1 8e b7 20 00 04 00 00 (Drive related) Apr 20 03:17:11 Tower kernel: end_request: I/O error, dev sdh, sector 3784226592 (Errors) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Result: hostbyte=0x00 driverbyte=0x08 (System) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Sense Key : 0xb [current] [descriptor] (Drive related) Apr 20 03:17:11 Tower kernel: Descriptor sense data with sense descriptors (in hex): Apr 20 03:17:11 Tower kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 Apr 20 03:17:11 Tower kernel: 00 00 00 00 Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] ASC=0x0 ASCQ=0x0 (Drive related) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] CDB: cdb[0]=0x2a: 2a 00 e1 8e bb 20 00 04 00 00 (Drive related) Apr 20 03:17:11 Tower kernel: end_request: I/O error, dev sdh, sector 3784227616 (Errors) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Result: hostbyte=0x00 driverbyte=0x08 (System) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Sense Key : 0xb [current] 
[descriptor] (Drive related) Apr 20 03:17:11 Tower kernel: Descriptor sense data with sense descriptors (in hex): Apr 20 03:17:11 Tower kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 Apr 20 03:17:11 Tower kernel: 00 00 00 00 Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] ASC=0x0 ASCQ=0x0 (Drive related) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] CDB: cdb[0]=0x2a: 2a 00 e1 8e bf 20 00 04 00 00 (Drive related) Apr 20 03:17:11 Tower kernel: end_request: I/O error, dev sdh, sector 3784228640 (Errors) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Result: hostbyte=0x00 driverbyte=0x08 (System) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Sense Key : 0xb [current] [descriptor] (Drive related) Apr 20 03:17:11 Tower kernel: Descriptor sense data with sense descriptors (in hex): Apr 20 03:17:11 Tower kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 Apr 20 03:17:11 Tower kernel: 00 00 00 00 Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] ASC=0x0 ASCQ=0x0 (Drive related) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] CDB: cdb[0]=0x2a: 2a 00 e1 8e c3 20 00 04 00 00 (Drive related) Apr 20 03:17:11 Tower kernel: end_request: I/O error, dev sdh, sector 3784229664 (Errors) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Result: hostbyte=0x00 driverbyte=0x08 (System) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Sense Key : 0xb [current] [descriptor] (Drive related) Apr 20 03:17:11 Tower kernel: Descriptor sense data with sense descriptors (in hex): Apr 20 03:17:11 Tower kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 Apr 20 03:17:11 Tower kernel: 00 00 00 00 Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] ASC=0x0 ASCQ=0x0 (Drive related) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] CDB: cdb[0]=0x2a: 2a 00 e1 8e c7 20 00 04 00 00 (Drive related) Apr 20 03:17:11 Tower kernel: end_request: I/O error, dev sdh, sector 3784230688 (Errors) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Result: hostbyte=0x00 driverbyte=0x08 (System) Apr 20 03:17:11 
Tower kernel: sd 6:0:0:0: [sdh] Sense Key : 0xb [current] [descriptor] (Drive related) Apr 20 03:17:11 Tower kernel: Descriptor sense data with sense descriptors (in hex): Apr 20 03:17:11 Tower kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 Apr 20 03:17:11 Tower kernel: 00 00 00 00 Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] ASC=0x0 ASCQ=0x0 (Drive related) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] CDB: cdb[0]=0x2a: 2a 00 e1 8e cb 20 00 04 00 00 (Drive related) Apr 20 03:17:11 Tower kernel: end_request: I/O error, dev sdh, sector 3784231712 (Errors) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Result: hostbyte=0x00 driverbyte=0x08 (System) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Sense Key : 0xb [current] [descriptor] (Drive related) Apr 20 03:17:11 Tower kernel: Descriptor sense data with sense descriptors (in hex): Apr 20 03:17:11 Tower kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 Apr 20 03:17:11 Tower kernel: 00 00 00 00 Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] ASC=0x0 ASCQ=0x0 (Drive related) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] CDB: cdb[0]=0x2a: 2a 00 e1 8e cf 20 00 04 00 00 (Drive related) Apr 20 03:17:11 Tower kernel: end_request: I/O error, dev sdh, sector 3784232736 (Errors) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Result: hostbyte=0x00 driverbyte=0x08 (System) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Sense Key : 0xb [current] [descriptor] (Drive related) Apr 20 03:17:11 Tower kernel: Descriptor sense data with sense descriptors (in hex): Apr 20 03:17:11 Tower kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 Apr 20 03:17:11 Tower kernel: 00 00 00 00 Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] ASC=0x0 ASCQ=0x0 (Drive related) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] CDB: cdb[0]=0x2a: 2a 00 e1 8e d3 20 00 04 00 00 (Drive related) Apr 20 03:17:11 Tower kernel: end_request: I/O error, dev sdh, sector 3784233760 (Errors) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] 
Result: hostbyte=0x00 driverbyte=0x08 (System) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Sense Key : 0xb [current] [descriptor] (Drive related) Apr 20 03:17:11 Tower kernel: Descriptor sense data with sense descriptors (in hex): Apr 20 03:17:11 Tower kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 Apr 20 03:17:11 Tower kernel: 00 00 00 00 Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] ASC=0x0 ASCQ=0x0 (Drive related) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] CDB: cdb[0]=0x2a: 2a 00 e1 8e d7 20 00 04 00 00 (Drive related) Apr 20 03:17:11 Tower kernel: end_request: I/O error, dev sdh, sector 3784234784 (Errors) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Result: hostbyte=0x00 driverbyte=0x08 (System) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Sense Key : 0xb [current] [descriptor] (Drive related) Apr 20 03:17:11 Tower kernel: Descriptor sense data with sense descriptors (in hex): Apr 20 03:17:11 Tower kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 Apr 20 03:17:11 Tower kernel: 00 00 00 00 Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] ASC=0x0 ASCQ=0x0 (Drive related) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] CDB: cdb[0]=0x2a: 2a 00 e1 8e db 20 00 04 00 00 (Drive related) Apr 20 03:17:11 Tower kernel: end_request: I/O error, dev sdh, sector 3784235808 (Errors) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Result: hostbyte=0x00 driverbyte=0x08 (System) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Sense Key : 0xb [current] [descriptor] (Drive related) Apr 20 03:17:11 Tower kernel: Descriptor sense data with sense descriptors (in hex): Apr 20 03:17:11 Tower kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 Apr 20 03:17:11 Tower kernel: 00 00 00 00 Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] ASC=0x0 ASCQ=0x0 (Drive related) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] CDB: cdb[0]=0x2a: 2a 00 e1 8e df 20 00 04 00 00 (Drive related) Apr 20 03:17:11 Tower kernel: end_request: I/O error, dev sdh, sector 
3784236832 (Errors) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Result: hostbyte=0x00 driverbyte=0x08 (System) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Sense Key : 0xb [current] [descriptor] (Drive related) Apr 20 03:17:11 Tower kernel: Descriptor sense data with sense descriptors (in hex): Apr 20 03:17:11 Tower kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 Apr 20 03:17:11 Tower kernel: 00 00 00 00 Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] ASC=0x0 ASCQ=0x0 (Drive related) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] CDB: cdb[0]=0x2a: 2a 00 e1 8e e3 20 00 04 00 00 (Drive related) Apr 20 03:17:11 Tower kernel: end_request: I/O error, dev sdh, sector 3784237856 (Errors) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Result: hostbyte=0x00 driverbyte=0x08 (System) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Sense Key : 0xb [current] [descriptor] (Drive related) Apr 20 03:17:11 Tower kernel: Descriptor sense data with sense descriptors (in hex): Apr 20 03:17:11 Tower kernel: 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00 Apr 20 03:17:11 Tower kernel: 00 00 00 00 Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] ASC=0x0 ASCQ=0x0 (Drive related) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] CDB: cdb[0]=0x2a: 2a 00 e1 8e e7 20 00 04 00 00 (Drive related) Apr 20 03:17:11 Tower kernel: end_request: I/O error, dev sdh, sector 3784238880 (Errors) Apr 20 03:17:11 Tower kernel: ata7: EH complete (Drive related) Apr 20 03:17:11 Tower kernel: ata7.00: detaching (SCSI 6:0:0:0) (Drive related) Apr 20 03:17:11 Tower kernel: ------------[ cut here ]------------ Apr 20 03:17:11 Tower kernel: WARNING: at fs/fs-writeback.c:588 writeback_inodes_wb+0x25f/0x343() (Minor Issues) Apr 20 03:17:11 Tower kernel: Hardware name: P35-DS3R Apr 20 03:17:11 Tower kernel: Modules linked in: md_mod xor i2c_i801 i2c_core r8169 pata_jmicron jmicron ahci (Drive related) Apr 20 03:17:11 Tower kernel: Pid: 11292, comm: flush-8:112 Not tainted 2.6.32.9-unRAID #8 (Errors) Apr 
20 03:17:11 Tower kernel: Call Trace: (Errors) Apr 20 03:17:11 Tower kernel: [<c102449e>] warn_slowpath_common+0x60/0x77 (Errors) Apr 20 03:17:11 Tower kernel: [<c10244c2>] warn_slowpath_null+0xd/0x10 (Errors) Apr 20 03:17:11 Tower kernel: [<c10822d2>] writeback_inodes_wb+0x25f/0x343 (Errors) Apr 20 03:17:11 Tower kernel: [<c10824a6>] wb_writeback+0xf0/0x14d (Errors) Apr 20 03:17:11 Tower kernel: [<c10825e2>] wb_do_writeback+0x62/0x11b (Errors) Apr 20 03:17:11 Tower kernel: [<c10562b9>] bdi_start_fn+0x93/0xa1 (Errors) Apr 20 03:17:11 Tower kernel: [<c1056226>] ? bdi_start_fn+0x0/0xa1 (Errors) Apr 20 03:17:11 Tower kernel: [<c1033869>] kthread+0x61/0x68 (Errors) Apr 20 03:17:11 Tower kernel: [<c1033808>] ? kthread+0x0/0x68 (Errors) Apr 20 03:17:11 Tower kernel: [<c100339f>] kernel_thread_helper+0x7/0x1a (Errors) Apr 20 03:17:11 Tower kernel: ---[ end trace 51dac434e29763bb ]--- Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Synchronizing SCSI cache (Drive related) Apr 20 03:17:11 Tower kernel: sd 6:0:0:0: [sdh] Stopping disk (Drive related) Apr 20 03:17:12 Tower kernel: scsi 6:0:0:0: Direct-Access ATA WDC WD20EARS-00M 50.0 PQ: 0 ANSI: 5 (Drive related) Apr 20 03:17:12 Tower kernel: sd 6:0:0:0: [sdi] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB) (Drive related) Apr 20 03:17:12 Tower kernel: sd 6:0:0:0: [sdi] Write Protect is off (Drive related) Apr 20 03:17:12 Tower kernel: sd 6:0:0:0: [sdi] Mode Sense: 00 3a 00 00 (Drive related) Apr 20 03:17:12 Tower kernel: sd 6:0:0:0: [sdi] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA (Drive related) Apr 20 03:17:12 Tower kernel: sdi: sdi1 (Drive related) Apr 20 03:17:12 Tower kernel: sd 6:0:0:0: [sdi] Attached SCSI disk (Drive related) i think this one's going to be RMA'd. Quote Link to comment
mr-hexen Posted April 20, 2012 Share Posted April 20, 2012 syslog attached. syslog-2012-04-20.zip Quote Link to comment
RobJ Posted April 22, 2012 Share Posted April 22, 2012 this is all over the syslog at the time the preclear failed. snipped ... the drive later got re-assigned sdi from sdh. snipped ... i think this one's going to be RMA'd. At 'Apr 20 03:15:57', the drive stopped responding to commands. A hard reset was sent, and the SATA link reported no problems and was at full speed (3.0 Gbps), but there was still no response to higher level queries over that link, not even identity info. After more hard resets with similar results, the SATA link was slowed to 1.5 Gbps to see if that might make a difference, but that was unsuccessful too. So at 'Apr 20 03:16:52', Linux marked the drive device sdh as disabled. Once a device has been marked as disabled, I have never yet seen a recovery until the next boot. And you can completely ignore all of the subsequent error messages and actions related to that drive (including that second section you quoted). It does look like the kernel successfully recovered the drive later, and even read the partition table, but it was too late. It came back as a new device, sdi, which unRAID and Preclear knew nothing about. After a reboot, everything should have appeared to be fine, except that Preclear had to abort. In my experience, a drive disabled this way is usually not at fault for the trouble. It is generally an issue related to the driver or cable or controller card, or possibly a power issue. So no, this particular problem would not be a reason to RMA the drive. However, the syslog, SMART and Preclear reports you have provided do show an ongoing series of problematic (possibly bad) sectors, i.e. media errors. Another Preclear is very necessary, and if it is perfect (no further pending or remapped sectors), then I would recommend an additional Preclear, just to see if it too is clean. I think I would only trust this drive if it can perfectly pass 2 consecutive Preclears. 
A drive that keeps finding new bad sectors on each pass, that are never the same, seems very suspicious to me. Quote Link to comment
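As a cross-check on the kernel messages quoted earlier in the thread, the `CDB:` bytes and the `end_request` sector number describe the same request. The following is only a minimal sketch in Python, not anything Preclear or the kernel uses; the helper name `decode_rw10` is made up for illustration. It decodes a 10-byte SCSI READ(10)/WRITE(10) command block (opcode in byte 0, big-endian LBA in bytes 2-5, transfer length in bytes 7-8):

```python
def decode_rw10(cdb_hex):
    """Decode a 10-byte SCSI READ(10)/WRITE(10) CDB given as hex text.

    Returns (command name, starting LBA, transfer length in blocks).
    """
    cdb = bytes.fromhex(cdb_hex.replace(" ", ""))
    if len(cdb) != 10:
        raise ValueError("expected a 10-byte CDB")
    opcodes = {0x28: "READ(10)", 0x2A: "WRITE(10)"}
    if cdb[0] not in opcodes:
        raise ValueError("not a READ(10)/WRITE(10) command")
    lba = int.from_bytes(cdb[2:6], "big")      # bytes 2..5: logical block address
    blocks = int.from_bytes(cdb[7:9], "big")   # bytes 7..8: number of blocks
    return opcodes[cdb[0]], lba, blocks


# First CDB from the quoted syslog section
print(decode_rw10("2a 00 e1 8e bf 20 00 04 00 00"))
# -> ('WRITE(10)', 3784228640, 1024)
```

The decoded LBA matches the "I/O error, dev sdh, sector 3784228640" line, and each following request in the log starts 1024 blocks later. This only shows how to read the log lines; as noted above, errors logged after the device was disabled say little about the drive itself.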
RobJ Posted April 22, 2012 Share Posted April 22, 2012 I ran preclear twice on the same WD20EARS disk. The first time I got this in the summary: Changed attributes in files: /tmp/smart_start_hda /tmp/smart_finish_hda ATTRIBUTE NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS RAW_VALUE Temperature_Celsius = 115 114 0 ok 37 Reallocated_Event_Count = 198 199 0 ok 2 No SMART attributes are FAILING_NOW 1 sector was pending re-allocation before the start of the preclear. 1 sector was pending re-allocation after pre-read in cycle 1 of 1. 1 sector was pending re-allocation after zero of disk in cycle 1 of 1. 1 sector is pending re-allocation at the end of the preclear, the number of sectors pending re-allocation did not change. 4 sectors had been re-allocated before the start of the preclear. 4 sectors are re-allocated at the end of the preclear, the number of sectors re-allocated did not change. Can I trust this disk in the array? Thank you I am tempted to say no... I have spent some time browsing the forums on SMART results, and what I have come to understand is that what your drive is doing could possibly be ok, but parts of the drive ARE failing, so it comes down to whether you want to take the risk... Looking at the prices of disks I would advise taking out this disk, using it for some other kind of storage (it's not broken yet) and replacing your unRAID drive with a fresh one that comes out of multiple preclear cycles without any issues. I would try another pre-clear cycle. Each cycle so far has uncovered additional un-readable sectors. If that continues, the drive is not one I'd want in my array. On the other hand, I've got one drive with 100 re-allocated sectors that has never changed from the first pre-clear I ran it through. Since it is stable, I trust it. Joe L. This keeps coming up, and I have been thinking about it, and wrote up some thoughts, perhaps for a future SMART wiki page. I don't know if this is an appropriate place for them, so I apologize in advance to Joe L. 
My thoughts below: (apologies also for the lack of clarity in my writing) I've noticed that users generally feel that a newly acquired drive that does not maintain a perfect SMART report in its first week is worth much less than a drive that remains perfect, SMART-wise. In addition, users often consider any drive with a significant number of remapped sectors to be suspect. To a large extent, that makes sense, and yet I have had doubts as to the significance of a perfect SMART report. [speculative] Sometimes I wonder if the "0 sectors reallocated" is a real zero, or a normalized base line of zero. In other words, are these drives really that perfect, with all tracks and sectors completely contiguous? Or is it possible that they have initialized the starting drive maps so that, after initial testing, they only include sectors that have passed testing? If so, then an initial value of zero bad sectors is not very meaningful, and furthermore, we should probably not be that concerned with a few bad sectors that appear in the first month of a drive's life. It is easy to speculate that what actually happens is more like this: engineering tests determine that a new drive can potentially carry X number of tracks (for illustration we will use the number 5300). Management says they want a drive of size Y, which will require 5000 tracks, plus an extra 100 tracks reserved for spare sectors. So an initialization program is set up to install the firmware, do the basic tests of the drive, and then initialize the starting drive maps by locating the best 5100 tracks of those 5300, and then reserve a hundred for spare sectors. In other words, there could well be a number of tracks skipped, perhaps only because they had a single sector that seemed questionable. How would we know the difference, between this and a pristine totally contiguous drive? We wouldn't! The initial SMART values would be cleared, and appear perfect. 
From your experience with manufacturers in general, would you not say that monetary and marketing factors are more likely to drive product design than other factors? It does make sense from a practical point of view, so I personally think this is a likely scenario. All of which makes the current value of remapped sectors somewhat meaningless, so long as the number is not changing. [Disclaimer: this is all speculation, I have no inside knowledge, and it is possible that the platters are actually made so well, so consistent (over-engineered?) that they usually ARE near perfect.] And I don't know that this really matters either. I like the current setup, where you acquire an apparently pristine drive, that includes a relatively large number of spares to replace any that suffer 'infant mortality' plus more for the long term loss due to wear and tear or whatever. [/speculative] The important thing therefore may not be to focus too much on the current number of reallocated sectors, zero or otherwise, but on whether the number is growing. A growing number is always concerning, but once our testing appears to show that the number has plateaued (no change at all), then the drive should again regain our confidence. I think an enormous number of drives have been returned, that may have had many reliable years left in them. Quote Link to comment
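The "is the number still growing?" test described above can be written down as a small check. This is a rough sketch in Python under stated assumptions: `has_plateaued` is a hypothetical helper, the readings are the raw Reallocated_Sector_Ct values noted after each Preclear pass, and the sample numbers are invented for illustration rather than taken from any drive in this thread.

```python
def has_plateaued(counts, clean_passes=2):
    """Return True if the last `clean_passes` readings show no change.

    `counts` is a list of reallocated-sector counts, oldest first, one per
    test pass. One extra reading is needed as the baseline to compare with.
    """
    if clean_passes < 1:
        raise ValueError("clean_passes must be at least 1")
    if len(counts) < clean_passes + 1:
        return False  # not enough history to judge either way
    recent = counts[-(clean_passes + 1):]
    return all(c == recent[0] for c in recent)


# Illustrative (made-up) histories:
print(has_plateaued([0, 5, 18, 18, 18]))  # stopped growing for 2 passes
print(has_plateaued([0, 5, 18]))          # still growing
print(has_plateaued([100, 100, 100]))     # large but stable count
```

The default of two unchanged passes is a judgment call, not a rule from the drive vendors; a stricter reader could raise `clean_passes`.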
Joe L. Posted April 23, 2012 Share Posted April 23, 2012 a very nice summary. Nobody knows if a drive with 0 re-allocated sectors is actually defect free, or just set to zero by the manufacturer after mapping an initial set of defects. I can not know for sure, but the facts I know are that it takes a finite amount of time to read all the sectors of a disk. I know we can read at roughly 100MB/s. I have no idea what the speed of an internal only test might be. Whatever it is, it must be very similar to a "long" test as issued by the smartctl utility. I'm not sure that each disk is actually tested to see that every bit is readable. With that in mind, does each drive sit on the manufacturing line for the 4 to 6 hours needed for a full test? Nobody knows except the manufacturer. I doubt the drives are tested for that length of time... it would limit the number of drives you could manufacture, as the tests would take FAR longer than the actual assembly time. Joe L. Quote Link to comment
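The 4 to 6 hour figure can be sanity-checked with simple arithmetic. A minimal sketch in Python, assuming a 2TB drive with the 3907029168 512-byte blocks reported in the syslog earlier in the thread and a sustained 100 MB/s; the function name `full_read_hours` is just for illustration, and real drives slow down toward the inner tracks, so this is a lower bound at best:

```python
def full_read_hours(capacity_bytes, bytes_per_second):
    """Hours needed to read the whole disk once at a constant rate."""
    return capacity_bytes / bytes_per_second / 3600


capacity = 3907029168 * 512            # bytes, from the kernel's block count
print(round(full_read_hours(capacity, 100e6), 2))   # 100 MB/s -> 5.56
print(round(full_read_hours(capacity, 56e6), 2))    # 56 MB/s  -> 9.92
```

The second line uses the 56 MB/s pre-read rate from the Preclear report posted earlier, which took 9:47:40, so the estimate is in the right range.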
RobJ Posted April 24, 2012 Share Posted April 24, 2012 Yes, 4 to 6 hours does seem like a significant limiting factor in manufacturing them, but don't they already have to low-level format them? I don't know for sure, but it seems reasonable to me that their formatting machines are testing while writing. I suspect that they are using proprietary machines much faster than our consumer drives, perhaps multi-headed. If I were there, I would want the engineers to design a drive without the common single-headed swing arm, and use something like a sliding bar perpendicular to the tracks, crammed with all the heads that can fit on it, each wired for simultaneous read/write, plus multi-threaded software to simultaneously operate each head, plus a high speed bus to handle all that I/O. If you could cram heads 3mm apart, then the bar only has to slide 3mm in a radial direction for heads to reach every track. That would be about 8 heads per inch, giving you 10 to 15 heads to speed up the formatting, and 6 hours goes down to about half an hour. And if they can double or triple the rotational speed ... Speculating again... Testing could be as simple as reading the track immediately after writing the formatting, before moving to another track. With our consumer drives, that would certainly slow the process down, but if we add a second head bar to our hypothetical machine, so that it reads after the first set of heads write, then there is almost no extra time required. I'm sure they don't need my ideas(!), but I do think they must have much faster machines than we do. Quote Link to comment
sharpeshuffle Posted April 26, 2012 Share Posted April 26, 2012 Hi, I'm new to unRaid and have been reading through the various FAQs and Configuration Tutorials, and was running the preclear on 3 WD20EARX drives (2TB). I had it running from around 10pm on Tuesday night (4/24/12). The three drives were set up to run preclear at the console (not telnet). The various times I checked on the status they were running; however, when I came home from work today at around 7.30pm (4/25/12) I saw that the VTerm seemed to be showing error messages with a bunch of text. I couldn't switch to the different VTerminals using Alt+F1/F2/F3. Basically the system seemed frozen with a blinking cursor, the keyboard would not respond and I couldn't connect via a web browser from my laptop. I was only able to take a picture since the keyboard wouldn't respond, so no logs. Can anyone decipher what the screenshot is indicating? I restarted the preclearing on all 3 drives again hoping it was just a fluke or something. I have a HP Proliant NL40 with 8GB RAM and 3 WD20EARX (2TB drives), and a Lenovo USB flash drive (1GB) with the latest beta version of unRaid 5.0.14 Beta. I used the -A option for preclearing also. The picture was too big to attach so hopefully the links below work. TIA *EDIT* checked status again and it's happened again, same or similar screen... but this time within the last 2hrs that preclear was running. Should I try running preclear on one drive at a time? Quote Link to comment
Joe L. Posted April 26, 2012 Share Posted April 26, 2012 basically, you crashed. (probably ran out of free memory) The dump seems to indicate the kernel was attempting to find the lru (least recently used) page of memory so it could re-use it. Joe L. Quote Link to comment
BetaQuasi Posted April 26, 2012 Share Posted April 26, 2012 Hey guys, I think I've got a drive on its way to being a dud by the looks of this - just took 4 of these drives out of a Readynas NV+ and dropped them into unRAID and this one turned up suspect. They'd been running fine for the last 2 years or so. Can someone with better knowledge than I make comment on the results? I'm running another preclear on it now in any case. preclear_finish__5XW1PQ16_2012-04-26.txt Quote Link to comment
sharpeshuffle Posted April 26, 2012 Share Posted April 26, 2012 basically, you crashed. (probably ran out of free memory) The dump seems to indicate the kernel was attempting to find the lru (least recently used) page of memory so it could re-use it. Joe L. Thanks for the reply Joe. Is that normal to run out of memory for 3 preclears running at the same time? I have 8GB RAM in the server and only did it because the FAQ mentioned 4 or more preclears shouldn't be run at the same time. After the second crash, I formatted the usb drive and reloaded again from scratch. I just started the preclear on one drive only. As of 7am this morning it seems to be still going so hopefully it'll go ok all the way till the end and I can start the 2nd one tonight. Quote Link to comment
Joe L. Posted April 26, 2012 Share Posted April 26, 2012 basically, you crashed. (probably ran out of free memory) The dump seems to indicate the kernel was attempting to find the lru (least recently used) page of memory so it could re-use it. Joe L. Thanks for the reply Joe. Is that normal to run out of memory for 3 preclears running at the same time? I have 8GB RAM in the server and only did it because the FAQ mentioned 4 or more preclears shouldn't be run at the same time. After the second crash, I formatted the usb drive and reloaded again from scratch. I just started the preclear on one drive only. As of 7am this morning it seems to be still going so hopefully it'll go ok all the way till the end and I can start the 2nd one tonight. I've got absolutely no idea. It would depend on lots of factors. I would not have thought you would have issues... but I was not the one who wrote that part of the wiki. If your syslog filled with errors, that could use up all your memory. If you were performing a parity check/sync, all available memory would be used as cache. (unless you have more RAM than disk space) I would run a tail -f /var/log/syslog and see if anything is filling it. I would also perform a memory test, if you have not done one... who knows, you might have less RAM than you think. Quote Link to comment
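Alongside `tail -f /var/log/syslog`, it can help to look at free memory and the size of the syslog side by side, since a syslog that fills with errors eats into RAM. The following is a rough sketch only, assuming Python happens to be available on the box (a stock unRAID install may not include it); `parse_meminfo` and `memory_snapshot` are hypothetical helper names, and the default paths are the usual Linux locations:

```python
import os


def parse_meminfo(text):
    """Turn /proc/meminfo-style text into a {field: value} dict (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def memory_snapshot(meminfo_path="/proc/meminfo", syslog_path="/var/log/syslog"):
    """Report total/free memory and the current syslog size, all in kB."""
    with open(meminfo_path) as f:
        info = parse_meminfo(f.read())
    syslog_kb = None
    if os.path.exists(syslog_path):
        syslog_kb = os.path.getsize(syslog_path) // 1024
    return {
        "MemTotal_kB": info.get("MemTotal"),
        "MemFree_kB": info.get("MemFree"),
        "syslog_kB": syslog_kb,
    }


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    print(memory_snapshot())
```

Running it a few times during a preclear would show whether the syslog is growing quickly. Note that a low MemFree on its own is not alarming on Linux, since spare memory is used as cache.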
RobJ Posted April 28, 2012 Share Posted April 28, 2012 Hey guys, I think I've got a drive on its way to being a dud by the looks of this - just took 4 of these drives out of a Readynas NV+ and dropped them into unRAID and this one turned up suspect. They'd been running fine for the last 2 years or so. Can someone with better knowledge than I make comment on the results? I'm running another preclear on it now in any case. Unless something turns up on the next Preclear report for this drive, the drive should be fine. If you examine the SMART report you attached, you can see that the drive has almost 12000 hours of usage (Power_On_Hours = 11826), and the last UNC error (UNCorrectable media error - data in a sector too scrambled for ECC info to correct) was at 1428 hours, perhaps 2 to 6 months after you acquired the drive and over 10000 usage hours ago. The drive does report 13 UNC errors, which corresponds to the 13 errors in the SMART log. The log only shows the last 5, but you can see that at 1428 hours, 3 UNC errors occurred, 2 of which are repeats of the 2 errors preceding (error #9 and #10). We cannot see errors 1 through 8, but we know that they were UNC errors like the 5 we CAN see, and we know that they occurred no later than hour 53, which is within the first 3 days of drive usage. So you have been using the drive for over 10000 hours without any issues, and that is why I think the drive is fine pending any adverse results from the Preclear of it. The writing of zeroes to the entire drive should make any issues visible. The fact that 2 (at least) sectors have appeared twice (at differing times) to be UNC does seem a little suspicious, so the media surface under them may be a little weak. Ideally (I think), Preclear will force the drive to discover and remap them. 
That would result in a few Reallocated sectors, but if a subsequent Preclear is clear with no further changes, then the drive should be very good, and you would no longer even have to worry about weak sectors. Quote Link to comment
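The reasoning above comes down to comparing the drive's current Power_On_Hours with the power-on hour stamped on each logged error. A small worked sketch in Python, using only the numbers quoted in the post; the helper names are made up for illustration:

```python
def hours_since_event(power_on_hours, event_hour):
    """Power-on hours elapsed since a logged SMART error."""
    if event_hour > power_on_hours:
        raise ValueError("event cannot be later than the current power-on hours")
    return power_on_hours - event_hour


def hours_to_days(hours):
    """Convert power-on hours to days of continuous running."""
    return hours / 24


print(hours_since_event(11826, 1428))   # -> 10398 hours since the last UNC error
print(hours_to_days(1428))              # -> 59.5 days of running before that error
print(round(hours_to_days(53), 1))      # -> 2.2 days, i.e. within the first 3 days
```

Power-on hours only count time the drive was powered, so the calendar time is at least this long; 59.5 days of continuous running sits at the low end of the "2 to 6 months" estimate.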
BetaQuasi Posted April 29, 2012 Share Posted April 29, 2012 Thanks very much RobJ, exactly what I needed. In fact that was so detailed that I should be able to interpret my own reports from now on! The 2nd preclear showed no issues other than those original errors, so I've gone ahead and added the drive to the array. Thanks again! Quote Link to comment
Helmonder Posted April 30, 2012 Share Posted April 30, 2012 A week or so ago I noticed that one of my drives (disk1) was showing sectors pending reallocation. The disk was also one of my oldest and has logged over 20,000 hours. I figured a replacement was in order. I have been running flawlessly with v5b14 and want to be ready for the future, so I went with a 3TB replacement. I started a replacement carousel:
- Bought new 3TB Hitachi
- Precleared 3TB with 3 cycles, flawless!
- Replaced 2TB parity (that was flawless) with 3TB
- Precleared old parity 2TB 3 cycles, flawless!
- Replaced my "failing" disk1 with the old parity disk
Everything worked as expected, no stress! Then I started a 3 cycle preclear on the old disk1, fully expecting to have it "fail" on me. Results were as follows:
============================================================================
** Changed attributes in files: /tmp/smart_start_sdd /tmp/smart_finish_sdd
ATTRIBUTE NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS RAW_VALUE
Temperature_Celsius = 117 125 0 ok 35
No SMART attributes are FAILING_NOW
4 sectors were pending re-allocation before the start of the preclear.
4 sectors were pending re-allocation after pre-read in cycle 1 of 3.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 3.
0 sectors were pending re-allocation after post-read in cycle 1 of 3.
0 sectors were pending re-allocation after zero of disk in cycle 2 of 3.
1 sector was pending re-allocation after post-read in cycle 2 of 3.
0 sectors were pending re-allocation after zero of disk in cycle 3 of 3.
0 sectors are pending re-allocation at the end of the preclear, a change of -4 in the number of sectors pending re-allocation.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear, the number of sectors re-allocated did not change.
My question: Should I continue to use the old disk1 or should I retire it?
More than happy to add it to the array for extra space (though I do not really need it at the moment). What confuses me is that there are changes in re-allocation during the cycle but preclear states it has "passed". I thought that such "changing" behaviour most likely pointed towards a disk going to fail? Current SMART report:
ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f 200   200   051    Pre-fail Always  -           75
  3 Spin_Up_Time            0x0027 142   142   021    Pre-fail Always  -           9875
  4 Start_Stop_Count        0x0032 100   100   000    Old_age  Always  -           443
  5 Reallocated_Sector_Ct   0x0033 200   200   140    Pre-fail Always  -           0
  7 Seek_Error_Rate         0x002e 200   200   000    Old_age  Always  -           0
  9 Power_On_Hours          0x0032 072   072   000    Old_age  Always  -           21114
 10 Spin_Retry_Count        0x0032 100   100   000    Old_age  Always  -           0
 11 Calibration_Retry_Count 0x0032 100   253   000    Old_age  Always  -           0
 12 Power_Cycle_Count       0x0032 100   100   000    Old_age  Always  -           86
192 Power-Off_Retract_Count 0x0032 200   200   000    Old_age  Always  -           40
193 Load_Cycle_Count        0x0032 107   107   000    Old_age  Always  -           281502
194 Temperature_Celsius     0x0022 116   090   000    Old_age  Always  -           36
196 Reallocated_Event_Count 0x0032 200   200   000    Old_age  Always  -           0
197 Current_Pending_Sector  0x0032 200   200   000    Old_age  Always  -           0
198 Offline_Uncorrectable   0x0030 200   200   000    Old_age  Offline -           0
199 UDMA_CRC_Error_Count    0x0032 200   200   000    Old_age  Always  -           0
200 Multi_Zone_Error_Rate   0x0008 200   175   000    Old_age  Offline -           0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description  Status                   Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline     Completed: read failure  90%       20488           1470867921
# 2 Extended offline  Completed: read failure  90%       20196           1470867921
# 3 Short offline     Completed without error  00%       20124           -
# 4 Short offline     Completed without error  00%       20030           -
Quote Link to comment
duffbeer2000 Posted April 30, 2012 Share Posted April 30, 2012 Hi, I had to format one of my 2TB HDDs with NTFS. Now I want to preclear this disk again before I assign it to the array. Is it possible to preclear the disk without the pre-read (-W option)? I know the disk is OK and I want to save time. Quote Link to comment
mr-hexen Posted May 1, 2012 Share Posted May 1, 2012 if the option is there then it's certainly possible to do Quote Link to comment
Joe L. Posted May 1, 2012 Share Posted May 1, 2012 A week or so ago I noticed that one of my drives (disk1) was showing sectors pending reallocation ... snipped ... Then I started a 3 cycle preclear on the old disk1 fully expecting to have it "fail" on me. ... snipped ... My question: Should I continue to use the old disk1 or should I retire it ? ... snipped ... Well... 
the original 4 sectors were apparently re-written in place, and not re-allocated, but then on the second cycle one additional sector was not readable initially, but re-written in place on the zeroing phase. I'd run it through another few cycles. If it still has an occasional sector show as un-readable, I'd not trust it for anything critical. If it is all OK, then go ahead and use it if you like... (Or RMA it, if in warranty) Of course, you can always put it in a windows OS, and blame any future data corruption on Microsoft. Quote Link to comment
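The stage-by-stage reading above can be reproduced by pulling the pending-sector counts out of the Preclear summary text. A minimal sketch in Python, assuming the summary wording shown in the report being discussed; the regular expression and the helper names `pending_series` and `increases` are illustrative only and are not part of preclear_disk.sh:

```python
import re

# Pending-sector lines copied from the Preclear summary under discussion.
REPORT = """
4 sectors were pending re-allocation before the start of the preclear.
4 sectors were pending re-allocation after pre-read in cycle 1 of 3.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 3.
0 sectors were pending re-allocation after post-read in cycle 1 of 3.
0 sectors were pending re-allocation after zero of disk in cycle 2 of 3.
1 sector was pending re-allocation after post-read in cycle 2 of 3.
0 sectors were pending re-allocation after zero of disk in cycle 3 of 3.
0 sectors are pending re-allocation at the end of the preclear,
 a change of -4 in the number of sectors pending re-allocation.
"""

PENDING = re.compile(
    r"(\d+) sectors? (?:was|were|is|are) pending re-allocation "
    r"((?:before|after|at) [^.,]*)"
)


def pending_series(report):
    """Return [(stage, pending_count), ...] in the order the report lists them."""
    return [(stage.strip(), int(count)) for count, stage in PENDING.findall(report)]


def increases(series):
    """Stages at which the pending count went up compared with the previous stage."""
    return [stage for (_, prev), (stage, cur) in zip(series, series[1:]) if cur > prev]


series = pending_series(REPORT)
print([count for _, count in series])   # -> [4, 4, 0, 0, 0, 1, 0, 0]
print(increases(series))                # -> ['after post-read in cycle 2 of 3']
```

The counts drop to zero after each zeroing phase while Reallocated_Sector_Ct stays at 0, which is what "re-written in place" means here; the single increase is the newly unreadable sector found in the cycle 2 post-read.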