Joe L. Posted February 23, 2013

But I've used this drive for > 3 years and it hasn't locked up yet. If it does, I guess I'll just have to replace it.

This is not good:

Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 2

Twice (so far) the disk was not able to spin up to speed. Typically, that indicates significant mechanical wear.

There is also a lot of hardware error correction taking place. Notice how the normalized value (46) is getting closer to the failure threshold (0), and the "worst" value (14) is very close.

195 Hardware_ECC_Recovered 0x001a 046 014 000 Old_age Always - 146458279

If the spin-retry count gets higher I'll start taking it as a serious number, but I've been monitoring these drives for the last 3 months (on my desktop) and it hasn't increased recently. For all I know, it may have had a dodgy PSU connection or similar right as I was installing it. I didn't do any real tests on this drive at all until about 3 months ago.

As for the Hardware_ECC, can you explain what that does? Wikipedia just says: "(Vendor specific raw value.) The raw value has different structure for different vendors and is often not meaningful as a decimal number."

The only reason I'm not so eager to take it out of my array is that I obviously no longer have any warranty on it (I've had it for 3+ years), so it's either in my array or being used for nothing.

All drives perform hardware error correction. (They have their own internal way of correcting data, typically by re-reading and comparing internal checksums.) Just keep an eye on the drive. If the normalized numbers start getting closer to their failure thresholds, consider moving any frequently accessed data off it.

I realize it is an older drive. Since it is working, odds are it will work a lot longer. I have two disks in my first unRAID server that are quite a bit older (over 7 years). I just keep an eye on them too, and so far they are working perfectly.

9 Power_On_Hours 0x0012 092 092 000 Old_age Always - 62767
and
9 Power_On_Hours 0x0012 092 092 000 Old_age Always - 62488

Interestingly, neither has ever had a sector pending reallocation or re-allocated. (They are IDE and 500 GB. They were the biggest available at the time the server was originally built, and over $300 each... ouch. Times have sure changed.)

Joe L.
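For anyone following along, the comparison described above (normalized value versus failure threshold) can be sketched in Python. This is only an illustration, not part of smartctl or the preclear script: the names `SmartAttribute`, `parse_attribute`, and `margin` are made up for this example, and it assumes the ten-column attribute row layout (ID, name, flag, value, worst, threshold, type, updated, when-failed, raw) seen in the rows that start with an ID number.

```python
from typing import NamedTuple


class SmartAttribute(NamedTuple):
    attr_id: int
    name: str
    value: int       # current normalized value
    worst: int       # lowest normalized value recorded so far
    threshold: int   # failure threshold
    raw: str         # vendor-specific raw value, kept as text


def parse_attribute(line: str) -> SmartAttribute:
    """Parse one attribute row in the assumed ten-column layout."""
    fields = line.split(None, 9)
    if len(fields) != 10:
        raise ValueError(f"expected 10 columns, got {len(fields)}: {line!r}")
    return SmartAttribute(
        attr_id=int(fields[0]),
        name=fields[1],
        value=int(fields[3]),
        worst=int(fields[4]),
        threshold=int(fields[5]),
        raw=fields[9].strip(),
    )


def margin(attr: SmartAttribute) -> int:
    """How far the normalized value still sits above its failure threshold."""
    return attr.value - attr.threshold


if __name__ == "__main__":
    rows = [
        "195 Hardware_ECC_Recovered 0x001a 046 014 000 Old_age Always - 146458279",
        "9 Power_On_Hours 0x0012 092 092 000 Old_age Always - 62767",
    ]
    for row in rows:
        attr = parse_attribute(row)
        print(f"{attr.name}: value={attr.value} worst={attr.worst} "
              f"threshold={attr.threshold} margin={margin(attr)}")
```

The raw value is deliberately kept as text, since it is vendor specific and often not meaningful as a decimal number. A shrinking margin over time is the thing to watch, rather than any single reading.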
Automatic Posted February 23, 2013

Well, I'll be sure to keep my eye on the S.M.A.R.T. status & parity reports. Thanks for your input.
Harpz Posted February 24, 2013

Hi, I precleared a couple of old drives I had lying around and was thinking of using one of them as a cache drive. The older one is a 250 GB Maxtor and the other is a 500 GB Samsung; ideally I'd use the Samsung, as it is a little newer and bigger.

The problem is that the initial SMART report on the Samsung showed 6 pending sectors. I precleared it three times in the hope that this would clear the errors, and it did. The Maxtor's results look like mainly mechanical wear and tear, but it appears to function as it should even after three preclear passes.

Looking at the attached results, am I right in saying the Samsung is OK to use now, and am I right in thinking the Maxtor is nearly worn out? Are there any other issues or learning points I should be aware of or concerned about on either drive? I'm still learning how to interpret these SMART reports, but getting there.

preclear_results__9QE0XT5H_2013-02-24.txt
preclear_results_S0MUJ1GP711754_2013-02-24.txt
maddog808 Posted February 24, 2013

The preclear just finished on a warranty replacement drive. It looks good to me (number of sectors pending re-allocation & sectors re-allocated did not change). I was hoping someone here could just give it a once over, to be sure. Thanks in advance if so.

preclear_results_2-24-13.txt
RobJ Posted February 24, 2013

The Maxtor looks fine, no issues at all. It's not that old either, with less than 20000 hours on it. At 250GB, it's a bit small, but that depends on what you want to use it for. The Samsung is a little older, with 28000 hours on it, but it too looks fine.
RobJ Posted February 25, 2013

Both drives listed in your attached file look fine, no issues, brand new.
JohnWys Posted February 25, 2013

My array has generally been well behaved (running version 4.7). Recently I added two new drives and precleared them, and while I believe both precleared successfully, just before they finished I logged a lot of errors, seemingly related to one of the drives. Any thoughts? Thanks.

color_coded_syslog.rtf.zip
syslog.txt
Joe L. Posted February 25, 2013

Your version of the preclear script is several versions old. You should always use the newest version; as of now it is 1.13.

Those are media errors (unreadable-sector errors):

Feb 25 06:25:29 10 preclear_disk-diff[30056]: 0 sectors were pending re-allocation after zero of disk in cycle 1 of 1. (Misc)
Feb 25 06:25:29 10 preclear_disk-diff[30056]: 1 sector is pending re-allocation at the end of the preclear, (Misc)
Feb 25 06:25:29 10 preclear_disk-diff[30056]: a change of 1 in the number of sectors pending re-allocation. (Misc)

Since it occurred in the post-read, it means you were unable to read one of the sectors written. That is not a good sign. It indicates a bad sector on the platter, marginal electronics, or a marginal power supply with poor voltage regulation. I'd try another preclear.

Joe L.
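The reasoning above (work out which phase of the preclear the pending-sector count changed in) can be sketched as a small Python helper. This is a hypothetical illustration, not code from the preclear script: the function name `locate_pending_change` is made up, and the four arguments stand for the counts the script reports before the start, after the pre-read, after zeroing, and at the end. The log excerpt above only shows the last two figures, so the first two are assumed to be 0 in the demo.

```python
from typing import List


def locate_pending_change(before: int, after_preread: int,
                          after_zero: int, at_end: int) -> List[str]:
    """Describe each preclear phase in which the pending-sector count changed."""
    stages = [
        ("pre-read", before, after_preread),
        ("zeroing", after_preread, after_zero),
        ("post-read", after_zero, at_end),
    ]
    return [
        f"{name}: {start} -> {end} ({end - start:+d})"
        for name, start, end in stages
        if end != start
    ]


if __name__ == "__main__":
    # Assumed: 0 before the start and 0 after the pre-read (not shown in the log).
    # From the log: 0 after zeroing, 1 at the end of the preclear.
    for change in locate_pending_change(0, 0, 0, 1):
        print(change)
```

A change that shows up only in the post-read stage corresponds to the case described above: a sector that was just written could not be read back.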
RonP Posted February 26, 2013

Need help interpreting my preclear.sh results. I chose to have the results emailed to me per the Configuration Tutorial (./preclear_disk.sh -m [email protected] /dev/sdX) and received the email below.

Apart from UDMA_CRC_Error_Count, the only other issue is "Postread detected un-expected non-zero bytes on disk". (On the CRC count: I put in two drives, and one of the two was an open-box unit, so it wasn't perfectly new. It is possible the errors occurred on other hardware, and the value didn't change from pre to post, so I'm pretty sure there is no issue with the CRC error count.) I've read the forums and it is clear that some bytes were non-zero even after being zeroed out. I have a few questions:

1) The subject line of the email was "Preclear: PASS! Preclearing Disk sdi Finished!!! Cycle 1 of 1". I'm thrown off by the "PASS!", as that contradicts "Disk /dev/sdi has NOT been successfully precleared". Am I missing something?

2) The forums suggest that a "postread" issue could be attributable to a lot of things, and to start with a memory check and then a dd/checksum test. I think I've done enough reading to know what I need to do, so I'm primarily asking for validation:

a) I had previously precleared a smaller drive (500 MB) without error. Does that help isolate the problem with this result?

b) For the memory test, I read a post Joe L. made (http://lime-technology.com/forum/index.php?topic=23753.0) and will do a memory check tonight. I have 2 GB of RAM in this machine. While I was running two preclears on 3TB drives (I didn't have any array running), I installed Plex Media Server (although I didn't enable it), including doing a wget, and used the unMenu package manager to install infozip. I wouldn't think any of those things would be memory intensive or contribute to issues, but thought I would mention them. I'll know more once I run a memory check, but thought I'd ask whether anything I did might have contributed to the post-read issues.

c) For the dd/checksum test, I've found two sources that I think confirm what I need to do. The first is the unRAID section on How To Troubleshoot Recurring Parity Errors, and the other is Joe L.'s post at the bottom of this forum page (http://lime-technology.com/forum/index.php?topic=19975.0). From what I gather, I need to look in /tmp/postread_errorssda to find which blocks to check. I can then repeat some of the specific post-read checks as Joe outlined in his post. I can also run a checksum per the FAQ multiple times over the same blocks to see if there is a consistency issue. Am I on the right track?

3) My final question: do I just re-run the preclear, and what do I do if I find issues with the above?

Forgot to mention (in case it is useful information) that the disks are connected directly to the motherboard (ASUS M2N32-SLI Deluxe) with an AMD Athlon X2 BE-2350 Brisbane 2.1GHz. I did not disable unnecessary ports in the BIOS, as during the transition I'm dual-booting this box into a Windows install that has Plex and the old drive I'll be transferring.

Email follows....
========================================================================1.13
== invoked as: ./preclear_disk.sh -m [email protected] /dev/sdi
==
== Disk /dev/sdi has NOT been successfully precleared
== Postread detected un-expected non-zero bytes on disk==
== Ran 1 cycle
==
== Using :Read block size = 8225280 Bytes
== Last Cycle's Pre Read Time  : 8:56:52 (93 MB/s)
== Last Cycle's Zeroing time   : 7:49:47 (106 MB/s)
== Last Cycle's Post Read Time : 19:23:14 (42 MB/s)
== Last Cycle's Total Time     : 36:10:53
==
== Total Elapsed Time 36:10:53
==
== Disk Start Temperature: 36C
==
== Current Disk Temperature: 28C,
==
============================================================================
** Changed attributes in files: /tmp/smart_start_sdi /tmp/smart_finish_sdi
ATTRIBUTE             NEW_VAL  OLD_VAL  FAILURE_THRESHOLD  STATUS  RAW_VALUE
Seek_Error_Rate     =   100      200            0          ok      0
Temperature_Celsius =   122      114            0          ok      28
No SMART attributes are FAILING_NOW

0 sectors were pending re-allocation before the start of the preclear.
0 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
0 sectors are pending re-allocation at the end of the preclear,
    the number of sectors pending re-allocation did not change.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear,
    the number of sectors re-allocated did not change.
============================================================================
============================================================================
==
== S.M.A.R.T Initial Report for /dev/sdi
==
Disk: /dev/sdi
smartctl 5.40 2010-10-16 r3189 [i486-slackware-linux-gnu] (local build)
Copyright © 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD30EFRX-68AX9N0
Serial Number:    WD-WMC1T1001327
Firmware Version: 80.00A80
User Capacity:    3,000,592,982,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   9
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Mon Feb 25 00:28:18 2013 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00)
    Offline data collection activity was never started.
    Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0)
    The previous self-test routine completed without error or no self-test has ever been run.
Total time to complete Offline data collection: (42600) seconds.
Offline data collection capabilities: (0x7b)
    SMART execute Offline immediate.
    Auto Offline data collection on/off support.
    Suspend Offline collection upon new command.
    Offline surface scan supported.
    Self-test supported.
    Conveyance Self-test supported.
    Selective Self-test supported.
SMART capabilities: (0x0003)
    Saves SMART data before entering power-saving mode.
    Supports SMART auto save timer.
Error logging capability: (0x01)
    Error logging supported.
    General Purpose Logging supported.
Short self-test routine recommended polling time: ( 2) minutes.
Extended self-test routine recommended polling time: ( 255) minutes.
Conveyance self-test routine recommended polling time: ( 5) minutes.
SCT capabilities: (0x70bd)
    SCT Status supported.
    SCT Error Recovery Control supported.
    SCT Feature Control supported.
    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   185   174   021    Pre-fail  Always       -       5708
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       17
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       51
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       14
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       9
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       7
194 Temperature_Celsius     0x0022   114   109   000    Old_age   Always       -       36
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       17
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
==
============================================================================
============================================================================
==
== S.M.A.R.T Final Report for /dev/sdi
==
Disk: /dev/sdi
smartctl 5.40 2010-10-16 r3189 [i486-slackware-linux-gnu] (local build)
Copyright © 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD30EFRX-68AX9N0
Serial Number:    WD-WMC1T1001327
Firmware Version: 80.00A80
User Capacity:    3,000,592,982,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   9
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Tue Feb 26 12:39:11 2013 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00)
    Offline data collection activity was never started.
    Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0)
    The previous self-test routine completed without error or no self-test has ever been run.
Total time to complete Offline data collection: (42600) seconds.
Offline data collection capabilities: (0x7b)
    SMART execute Offline immediate.
    Auto Offline data collection on/off support.
    Suspend Offline collection upon new command.
    Offline surface scan supported.
    Self-test supported.
    Conveyance Self-test supported.
    Selective Self-test supported.
SMART capabilities: (0x0003)
    Saves SMART data before entering power-saving mode.
    Supports SMART auto save timer.
Error logging capability: (0x01)
    Error logging supported.
    General Purpose Logging supported.
Short self-test routine recommended polling time: ( 2) minutes.
Extended self-test routine recommended polling time: ( 255) minutes.
Conveyance self-test routine recommended polling time: ( 5) minutes.
SCT capabilities: (0x70bd)
    SCT Status supported.
    SCT Error Recovery Control supported.
    SCT Feature Control supported.
    SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   185   174   021    Pre-fail  Always       -       5708
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       17
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       87
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       14
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       9
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       7
194 Temperature_Celsius     0x0022   122   109   000    Old_age   Always       -       28
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       17
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
==
============================================================================
ogi Posted February 27, 2013

I could also use some help interpreting some preclear results.

========================================================================1.13
== invoked as: ./preclear_disk.sh -A /dev/sdr
== WDC WD20EARS-00MVWB0 WD-WCAZA0358422
== Disk /dev/sdr has been successfully precleared
== with a starting sector of 64
== Ran 1 cycle
==
== Using :Read block size = 8225280 Bytes
== Last Cycle's Pre Read Time  : 7:06:00 (78 MB/s)
== Last Cycle's Zeroing time   : 5:57:45 (93 MB/s)
== Last Cycle's Post Read Time : 13:49:47 (40 MB/s)
== Last Cycle's Total Time     : 26:54:32
==
== Total Elapsed Time 26:54:32
==
== Disk Start Temperature: 22C
==
== Current Disk Temperature: 33C,
==
============================================================================
** Changed attributes in files: /tmp/smart_start_sdr /tmp/smart_finish_sdr
ATTRIBUTE             NEW_VAL  OLD_VAL  FAILURE_THRESHOLD  STATUS  RAW_VALUE
Temperature_Celsius =   117      128            0          ok      33
No SMART attributes are FAILING_NOW

1 sector was pending re-allocation before the start of the preclear.
1 sector was pending re-allocation after pre-read in cycle 1 of 1.
65535 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
48 sectors are pending re-allocation at the end of the preclear,
    a change of 47 in the number of sectors pending re-allocation.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear,
    the number of sectors re-allocated did not change.
============================================================================

I've attached the other relevant preclear files to this post.

preclear_finish__WD-WCAZA0358422_2013-02-26.txt
preclear_rpt__WD-WCAZA0358422_2013-02-26.txt
preclear_start__WD-WCAZA0358422_2013-02-26.txt
Joe L. Posted February 27, 2013

Ignore the mail subject.
Apparently it is not looking at the actual results. If the preclear said "Disk /dev/sdi has NOT been successfully precleared == Postread detected un-expected non-zero bytes on disk ==" then it is NOT cleared. I'd run another pass and capture the actual screen output, not the mail, although it really does not matter where the errors were: if it was unable to read back all zeros, the disk is really messed up. I'd not trust it in my array. On the other hand, it could just be that your power supply is marginal, or that the memory in your server is marginal or not configured properly for voltage, clock speed, or timing. You are on the right track as far as troubleshooting the issue. Eliminate the easier items first (memory test). Joe L.
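To make the "unable to read back all zeros" check concrete, here is a minimal sketch of the idea in Python. It is not the preclear script's actual implementation; the function name `first_nonzero_offset` and the block size are hypothetical, and an in-memory buffer stands in for the disk.

```python
import io


def first_nonzero_offset(stream, block_size=1024 * 1024):
    """Return the offset of the first non-zero byte, or None if all bytes are zero.

    Hypothetical helper for illustration only; `stream` is any binary
    file-like object opened for reading.
    """
    zero_block = bytes(block_size)
    offset = 0
    while True:
        chunk = stream.read(block_size)
        if not chunk:
            return None  # reached the end without finding a non-zero byte
        if chunk != zero_block[:len(chunk)]:
            # Something in this block is non-zero; find the exact byte.
            for index, value in enumerate(chunk):
                if value:
                    return offset + index
        offset += len(chunk)


if __name__ == "__main__":
    clean = io.BytesIO(bytes(5000))
    dirty_bytes = bytearray(5000)
    dirty_bytes[3000] = 0x01  # simulate one byte that did not read back as zero
    dirty = io.BytesIO(bytes(dirty_bytes))
    print(first_nonzero_offset(clean, block_size=1024))
    print(first_nonzero_offset(dirty, block_size=1024))
```

A single non-zero byte is enough to fail the check, which is why the location of the error matters less than the fact that one occurred.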
:) Posted February 27, 2013 Can I preclear an external drive that's connected via USB?
Joe L. Posted February 27, 2013 Yes, but it will be slower, since the USB transfer rate is lower than that of a directly connected SATA cable.
shooga Posted February 27, 2013 I have a WD15EADS that a friend gave me from his old stash. It's preclearing MUCH slower than other drives I've been preclearing and adding to my new server. After 9 hours it's only 40% through pre-reading the first of two cycles. unMenu is showing 18MB/s, but it started out at 3MB/s. Besides this one, the slowest drives I've precleared have gone at more like 40MB/s, and the first drives were even 2-3x that. Is this just a sign that it's a very old, slow drive, or does it mean that something is wrong? I can post the preclear results when the drive is finished, but at this rate it's going to take another 5 days to finish the 2 cycles I'm running.
Automatic Posted February 27, 2013 Share Posted February 27, 2013 Two questions:- A. I have an "End_to_end" error on a drive I'm currently preclearing (Warned me at the start, just started it. Can't do any harm to preclear it). Here's the smart report off unmenu:- martctl 5.40 2010-10-16 r3189 [i486-slackware-linux-gnu] (local build) Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net === START OF INFORMATION SECTION === Device Model: ST3000DM001-1CH166 Serial Number: Z1F0Z1XZ Firmware Version: CC43 User Capacity: 3,000,592,982,016 bytes Device is: Not in smartctl database [for details use: -P showall] ATA Version is: 8 ATA Standard is: ATA-8-ACS revision 4 Local Time is: Wed Feb 27 17:44:45 2013 GMT SMART support is: Available - device has SMART capability. SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED See vendor-specific Attribute list for marginal Attributes. General SMART Values: Offline data collection status: (0x00) Offline data collection activity was never started. Auto Offline Data Collection: Disabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection: ( 600) seconds. Offline data collection capabilities: (0x73) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. No Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability: (0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time: ( 1) minutes. Extended self-test routine recommended polling time: ( 255) minutes. 
Conveyance self-test routine recommended polling time: ( 2) minutes. SCT capabilities: (0x3085) SCT Status supported. SMART Attributes Data Structure revision number: 10 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x000f 113 099 006 Pre-fail Always - 53356440 3 Spin_Up_Time 0x0003 091 090 000 Pre-fail Always - 0 4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 85 5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0 7 Seek_Error_Rate 0x000f 061 055 030 Pre-fail Always - 12890110958 9 Power_On_Hours 0x0032 097 097 000 Old_age Always - 3350 10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0 12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 46 183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0 184 End-to-End_Error 0x0032 099 099 099 Old_age Always FAILING_NOW 1 187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0 188 Command_Timeout 0x0032 100 100 000 Old_age Always - 0 189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0 190 Airflow_Temperature_Cel 0x0022 077 057 045 Old_age Always - 23 (Min/Max 22/23) 191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 0 192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 32 193 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 905 194 Temperature_Celsius 0x0022 023 043 000 Old_age Always - 23 (0 16 0 0) 197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0 198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0 199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 768 240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 124730145246361 241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 22497570897 242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 44084490249 SMART Error Log Version: 1 No Errors Logged SMART Self-test log structure revision number 1 Num Test_Description Status Remaining LifeTime(hours) 
LBA_of_first_error # 1 Conveyance offline Completed without error 00% 2877 - SMART Selective self-test log data structure revision number 1 SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS 1 0 0 Not_testing 2 0 0 Not_testing 3 0 0 Not_testing 4 0 0 Not_testing 5 0 0 Not_testing Selective self-test flags (0x0): After scanning selected spans, do NOT read-scan remainder of disk. If Selective self-test is pending on power-up, resume after 0 minute delay. It's still in warranty (and will be for quite a lot longer, ~2 years). Any advice? Like I said, it's currently preclearing anyway; if it gets any worse I guess that's a good thing (an easier RMA, and it saved me from data troubles). The second thing is this message: WARNING: GPT (GUID Partition Table) detected on '/dev/sdj'! The util fdisk doesn't support GPT. Use GNU Parted. I get it on both of my 3TB HDDs that were previously in my Windows build. Is there anything I should do first, or do I just preclear them and drop them into the array?
Joe L. Posted February 27, 2013 It is interesting that the End-to-End_Error attribute says FAILING_NOW, but the overall SMART status is PASSED. I would RMA the drive on just that. Buggy firmware. You can ignore any warnings from the preclear script about fdisk and GPT partitions on drives over 2.2TB. The GPT partitions are expected, and fdisk is only being used to read the MBR, not to partition the disk, so it is safe. Joe L.
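For anyone who wants to scan the attribute table themselves rather than rely on the overall PASSED line, here is a rough Python sketch. It assumes the ten-column layout shown in the smartctl 5.40 reports in this thread and the usual convention that an attribute counts as failing when its normalized VALUE is at or below a non-zero THRESH. The helper names are hypothetical, and the sample rows are copied from the ST3000DM001 report above.

```python
def parse_attribute_row(line):
    """Split one row of the smartctl attribute table into named fields."""
    # RAW_VALUE may contain spaces, e.g. "23 (0 16 0 0)", so split at most 9 times.
    parts = line.split(None, 9)
    return {
        "id": int(parts[0]),
        "name": parts[1],
        "value": int(parts[3]),
        "worst": int(parts[4]),
        "thresh": int(parts[5]),
        "type": parts[6],
        "when_failed": parts[8],
        "raw": parts[9],
    }


def at_or_below_threshold(attr):
    """Assumed rule: failing if VALUE <= THRESH, ignoring attributes with THRESH 0."""
    return attr["thresh"] > 0 and attr["value"] <= attr["thresh"]


def failing_attributes(rows):
    """Return the names of attributes that meet the assumed failing rule."""
    parsed = (parse_attribute_row(row) for row in rows)
    return [attr["name"] for attr in parsed if at_or_below_threshold(attr)]


ROWS = [
    "10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0",
    "184 End-to-End_Error 0x0032 099 099 099 Old_age Always FAILING_NOW 1",
    "197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0",
]

if __name__ == "__main__":
    print(failing_attributes(ROWS))
```

Note that the report itself prints "See vendor-specific Attribute list for marginal Attributes" right after PASSED, which is the hint that the per-attribute table is worth reading even when the overall result looks fine.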
ogi Posted February 27, 2013 I could also use some help with some preclear result interpretations
========================================================================1.13
== invoked as: ./preclear_disk.sh -A /dev/sdr
== WDC WD20EARS-00MVWB0 WD-WCAZA0358422
== Disk /dev/sdr has been successfully precleared
== with a starting sector of 64
== Ran 1 cycle
==
== Using :Read block size = 8225280 Bytes
== Last Cycle's Pre Read Time : 7:06:00 (78 MB/s)
== Last Cycle's Zeroing time : 5:57:45 (93 MB/s)
== Last Cycle's Post Read Time : 13:49:47 (40 MB/s)
== Last Cycle's Total Time : 26:54:32
==
== Total Elapsed Time 26:54:32
==
== Disk Start Temperature: 22C
==
== Current Disk Temperature: 33C,
==
============================================================================
** Changed attributes in files: /tmp/smart_start_sdr /tmp/smart_finish_sdr
ATTRIBUTE NEW_VAL OLD_VAL FAILURE_THRESHOLD STATUS RAW_VALUE
Temperature_Celsius = 117 128 0 ok 33
No SMART attributes are FAILING_NOW
1 sector was pending re-allocation before the start of the preclear.
1 sector was pending re-allocation after pre-read in cycle 1 of 1.
65535 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
48 sectors are pending re-allocation at the end of the preclear, a change of 47 in the number of sectors pending re-allocation.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear, the number of sectors re-allocated did not change.
============================================================================
I've attached the other relevant preclear files to this post. Hey Joe, not sure if you missed my post, but given the weird results of this preclear, I could use your input. Thanks!
Joe L. Posted February 27, 2013 Sorry, I did miss it. This line is weird... 65535 sectors were pending re-allocation after zero of disk in cycle 1 of 1. I would not expect sectors to be pending re-allocation after being written (and it is the highest count I've ever seen). Makes me want to RMA the disk for buggy firmware. (Did you perhaps run two preclears on it at the same time, with one writing while another was reading?) In any case, there were still 47 sectors that it could not re-allocate, and on that I'd RMA the drive. If you decide to keep the drive, you should put it through at least one more preclear cycle, and if it cannot come through with all the sectors re-allocated and no new ones detected as unreadable, RMA it.
Automatic Posted February 27, 2013 Well, that number instantly jumps out at me: 65535 is the maximum value an unsigned two-byte integer (unsigned short) can hold. That alone (being exactly 2^16-1) sounds dodgy.
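The arithmetic behind that observation is easy to check. The Python sketch below only tests whether a reported raw value equals the largest value an unsigned counter of a given width can hold; whether the drive firmware really uses a 16-bit field for this attribute is an assumption nobody in this thread has confirmed, and the helper name is hypothetical.

```python
UINT16_MAX = 2**16 - 1  # largest value of an unsigned 16-bit integer


def looks_saturated(raw_value, width_bits=16):
    """True if raw_value equals the all-ones value for an unsigned field of width_bits."""
    return raw_value == (1 << width_bits) - 1


if __name__ == "__main__":
    # Pending-sector counts quoted in the preclear report above.
    for count in (1, 65535, 48):
        print(count, looks_saturated(count))
```

A value sitting exactly at the all-ones limit is more likely a placeholder or an overflow than a real count, which fits the suspicion of a firmware quirk rather than 65535 genuinely unreadable sectors.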
ogi Posted February 27, 2013 Didn't run 2 preclears at the same time... I decided to RMA the drive, as I already had a different drive ready to RMA and just needed to drop it off. I like a column of 0's on the preclear report, and when I saw large numbers like this, I figured RMA was the likely suggestion. Thanks again!
ogi Posted February 27, 2013 Yeah, I saw this comment about 65535 somewhere else on this forum. But even if that number is not a coincidence, is there anything that can be done to identify the cause, or is it potentially an issue with the preclear script?
Joe L. Posted February 27, 2013 The preclear script has nothing that is limited to 65535... It simply reports what it gets from the smartctl reports it invokes while it is processing. I'd suspect the drive, not the script. If anything else might be involved it is your system RAM, or power supply. Joe L.
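As an illustration of "it simply reports what it gets", here is a small Python sketch of comparing a start snapshot with a finish snapshot. This is not how preclear_disk.sh is written; the function name and the dictionary layout are hypothetical. The numbers are the raw values quoted in the preclear report earlier in the thread (temperature 22C to 33C, pending sectors 1 to 48, re-allocated sectors 0 to 0).

```python
def changed_attributes(start, finish):
    """Return {name: (old, new)} for attributes present in both snapshots that differ."""
    return {
        name: (start[name], finish[name])
        for name in start
        if name in finish and start[name] != finish[name]
    }


# Raw values taken from the preclear report quoted earlier in this thread.
START = {"Temperature_Celsius": 22, "Current_Pending_Sector": 1, "Reallocated_Sector_Ct": 0}
FINISH = {"Temperature_Celsius": 33, "Current_Pending_Sector": 48, "Reallocated_Sector_Ct": 0}

if __name__ == "__main__":
    for name, (old, new) in changed_attributes(START, FINISH).items():
        print(f"{name}: {old} -> {new} (change of {new - old})")
```

A comparison like this has no upper limit of its own, so a value such as 65535 can only have come from the drive's own report.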
shooga Posted February 27, 2013 I posted this previously, but have a feeling it was either not specific enough or got lost because the thread went to the next page. Trying one more time just in case: I have a WD15EADS that a friend gave me from his old stash. It's preclearing MUCH slower than other drives I've been preclearing and adding to my new server. After 9 hours it's only 40% through pre-reading the first of two cycles. unMenu is showing 18MB/s (now 12MB/s), but it started out at 3MB/s. Besides this one, the slowest drives I've pre-read have gone at more like 40MB/s, and the first drives were even 2-3x that. Is this just a sign that it's a very old, slow drive, or does it mean that something is wrong? I can post the preclear results when the drive is finished, but at this rate it's going to take another 5 days to finish the 2 cycles I'm running.
Joe L. Posted February 27, 2013 Look in the system log. If there are no errors, it is just a very slow drive (older drives were much slower). Joe L.
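For a rough sense of how long a slow drive will take, the figures in the question can be extrapolated. The Python sketch below assumes progress stays linear (it may not, since the reported speed kept changing), that the WD15EADS has a nominal capacity of 1.5 TB, and that MB/s means 10^6 bytes per second. Both function names are hypothetical.

```python
def remaining_hours(elapsed_hours, fraction_done):
    """Linear extrapolation of the time left in the current phase."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours * (1 - fraction_done) / fraction_done


def hours_for_full_pass(capacity_bytes, rate_mb_per_s):
    """Hours needed to read or write the whole disk once at a constant rate."""
    return capacity_bytes / (rate_mb_per_s * 1_000_000) / 3600


if __name__ == "__main__":
    # 9 hours elapsed at 40% of the pre-read, as reported in the question.
    print(f"pre-read time left: about {remaining_hours(9, 0.40):.1f} h")
    # One full pass over an assumed 1.5 TB at the two speeds unMenu reported.
    for rate in (18, 12):
        hours = hours_for_full_pass(1.5e12, rate)
        print(f"full pass at {rate} MB/s: about {hours:.1f} h")
```

Each preclear cycle makes three passes (pre-read, zeroing, post-read), and in the report earlier in the thread the post-read ran at roughly half the pre-read speed, so a multi-day estimate for two cycles is plausible at these rates.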
BobPhoenix Posted February 28, 2013 I've got a 3TB WD Green drive that flip-flops its pending sectors. One pass it's 65535, the next it's zero, then 65535, then zero, and so on, for at least 6 passes. I'm going to RMA it since the last pass left it at 65535. That was the highest I've ever seen too. It was a drive WD upgraded me to when I RMA'd another defective 2TB drive to them. It was the "last straw" for me on WD Green drives; I've just had too many go bad on me lately. Only 1 out of 10 WD Reds has been bad so far, which is much better than the 30% failure rate I've had with Seagate drives and 40% with WD Greens. Sure wish Hitachi and Samsung still made drives. Guess I'll have to try some Toshiba drives besides the WD Reds. They were all purchased in 1's and 2's, mostly (but not all) from Newegg and, as I said, some even straight from the WD RMA process. And I know it isn't a power supply, as it was the only HDD in a preclear station that has cleared 90% of all the drives I've got, some before and some after this one. It's just a bad drive. smartctl 5.40 2010-10-16 r3189 [i486-slackware-linux-gnu] (local build) Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net === START OF INFORMATION SECTION === Device Model: WDC WD30EZRS-11J99B1 Serial Number: WD-WCAWZ0773504 Firmware Version: 80.00A80 User Capacity: 3,000,592,982,016 bytes Device is: Not in smartctl database [for details use: -P showall] ATA Version is: 8 ATA Standard is: Exact ATA specification draft version not indicated Local Time is: Wed Feb 27 19:11:21 2013 CST SMART support is: Available - device has SMART capability.
SMART support is: Enabled === START OF READ SMART DATA SECTION === SMART overall-health self-assessment test result: PASSED General SMART Values: Offline data collection status: (0x84) Offline data collection activity was suspended by an interrupting command from host. Auto Offline Data Collection: Enabled. Self-test execution status: ( 0) The previous self-test routine completed without error or no self-test has ever been run. Total time to complete Offline data collection: (52980) seconds. Offline data collection capabilities: (0x7b) SMART execute Offline immediate. Auto Offline data collection on/off support. Suspend Offline collection upon new command. Offline surface scan supported. Self-test supported. Conveyance Self-test supported. Selective Self-test supported. SMART capabilities: (0x0003) Saves SMART data before entering power-saving mode. Supports SMART auto save timer. Error logging capability: (0x01) Error logging supported. General Purpose Logging supported. Short self-test routine recommended polling time: ( 2) minutes. Extended self-test routine recommended polling time: ( 255) minutes. Conveyance self-test routine recommended polling time: ( 5) minutes. SCT capabilities: (0x3035) SCT Status supported. SCT Feature Control supported. SCT Data Table supported. 
SMART Attributes Data Structure revision number: 16 Vendor Specific SMART Attributes with Thresholds: ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE 1 Raw_Read_Error_Rate 0x002f 200 200 051 Pre-fail Always - 21 3 Spin_Up_Time 0x0027 161 147 021 Pre-fail Always - 8950 4 Start_Stop_Count 0x0032 100 100 000 Old_age Always - 52 5 Reallocated_Sector_Ct 0x0033 200 200 140 Pre-fail Always - 0 7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0 9 Power_On_Hours 0x0032 095 095 000 Old_age Always - 4028 10 Spin_Retry_Count 0x0032 100 253 000 Old_age Always - 0 11 Calibration_Retry_Count 0x0032 100 253 000 Old_age Always - 0 12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 46 192 Power-Off_Retract_Count 0x0032 200 200 000 Old_age Always - 41 193 Load_Cycle_Count 0x0032 199 199 000 Old_age Always - 5400 194 Temperature_Celsius 0x0022 130 104 000 Old_age Always - 22 196 Reallocated_Event_Count 0x0032 200 200 000 Old_age Always - 0 197 Current_Pending_Sector 0x0032 066 066 000 Old_age Always - 65535 198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 4 199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 405 200 Multi_Zone_Error_Rate 0x0008 200 200 000 Old_age Offline - 3 SMART Error Log Version: 1 No Errors Logged SMART Self-test log structure revision number 1 No self-tests have been logged. [To run self-tests, use: smartctl -t] SMART Selective self-test log data structure revision number 1 SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS 1 0 0 Not_testing 2 0 0 Not_testing 3 0 0 Not_testing 4 0 0 Not_testing 5 0 0 Not_testing Selective self-test flags (0x0): After scanning selected spans, do NOT read-scan remainder of disk. If Selective self-test is pending on power-up, resume after 0 minute delay. Quote Link to comment