JustinChase Posted October 17, 2014

I just finished a preclear on a new drive. There are a couple of items showing "near threshold". Should I be concerned?

** Changed attributes in files: /tmp/smart_start_sdi /tmp/smart_finish_sdi
ATTRIBUTE                 NEW_VAL  OLD_VAL  FAILURE_THRESHOLD  STATUS       RAW_VALUE
Raw_Read_Error_Rate     = 118      100      6                  ok           179537912
Spin_Up_Time            = 98       92       0                  ok           0
Spin_Retry_Count        = 100      100      97                 near_thresh  0
End-to-End_Error        = 100      100      99                 near_thresh  0
High_Fly_Writes         = 99       100      0                  ok           1
Airflow_Temperature_Cel = 75       74       45                 ok           25
Temperature_Celsius     = 25       26       0                  near_thresh  25
No SMART attributes are FAILING_NOW

0 sectors were pending re-allocation before the start of the preclear.
0 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
0 sectors are pending re-allocation at the end of the preclear,
the number of sectors pending re-allocation did not change.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear,
the number of sectors re-allocated did not change.
RobJ Posted October 17, 2014

Nothing to worry about, at all. I've made some suggestions to Joe L. about removing these near_thresh flags from the reporting, but I suspect he's been too busy lately.
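For readers wondering why a healthy attribute gets labelled near_thresh: the label seems to track how close the normalized value sits to the manufacturer's failure threshold, not any detected fault. The sketch below is an assumption inferred from the rows posted in this thread (gaps of 25 or less show near_thresh, gaps of 30 or more show ok). The function name `classify` and the margin of 25 are hypothetical and are not taken from preclear_disk.sh.

```python
# Hypothetical margin, inferred only from the rows posted in this thread.
NEAR_MARGIN = 25


def classify(value: int, threshold: int, margin: int = NEAR_MARGIN) -> str:
    """Label a normalized SMART value relative to its failure threshold."""
    if value <= threshold:
        # SMART treats an attribute as failing once it drops to the threshold.
        return "FAILING_NOW"
    if value - threshold <= margin:
        return "near_thresh"
    return "ok"


# (attribute, NEW_VAL, FAILURE_THRESHOLD) copied from the first report above.
rows = [
    ("Raw_Read_Error_Rate", 118, 6),
    ("Spin_Retry_Count", 100, 97),
    ("End-to-End_Error", 100, 99),
    ("Airflow_Temperature_Cel", 75, 45),
    ("Temperature_Celsius", 25, 0),
]

for name, value, threshold in rows:
    print(f"{name:<24} {classify(value, threshold)}")
```

Read this way, Spin_Retry_Count at 100 against a threshold of 97 is flagged only because the threshold was set close to the best possible value, which matches the advice that it is nothing to worry about.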
JustinChase Posted October 17, 2014

I figured as much, but it's good to get confirmation; thanks.
JustinChase Posted October 18, 2014

So, this is my old parity drive. It started acting up and I've replaced it with a new drive. I was hoping a preclear would 'fix' this drive, but I'm not sure how to interpret these results. Is this drive okay to put back into service in the array as a data drive?

** Changed attributes in files: /tmp/smart_start_sdf /tmp/smart_finish_sdf
ATTRIBUTE                 NEW_VAL  OLD_VAL  FAILURE_THRESHOLD  STATUS       RAW_VALUE
Raw_Read_Error_Rate     = 115      117      6                  ok           188984744
Power_On_Hours          = 91       92       0                  ok           7887
Spin_Retry_Count        = 100      100      97                 near_thresh  0
End-to-End_Error        = 100      100      99                 near_thresh  0
Reported_Uncorrect      = 89       91       0                  ok           11
High_Fly_Writes         = 88       92       0                  ok           12
Airflow_Temperature_Cel = 69       72       45                 near_thresh  31
Temperature_Celsius     = 31       28       0                  ok           31
No SMART attributes are FAILING_NOW

8 sectors were pending re-allocation before the start of the preclear.
8 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
0 sectors are pending re-allocation at the end of the preclear,
a change of -8 in the number of sectors pending re-allocation.
40 sectors had been re-allocated before the start of the preclear.
40 sectors are re-allocated at the end of the preclear,
the number of sectors re-allocated did not change.
RobJ Posted October 18, 2014

It means that when zeroes were written to each of the 8 pending sectors, the drive tested them, decided they were fine, and put them back in service. It doesn't mean those 8 are perfect, although they could be, but for now the drive believes each of them can be trusted.

The drive appears to be fine now, but not perfect. The best advice we give users when this happens is to preclear it one or two more times and see if anything changes. If there are no further changes, then the drive should be good. All other numbers look OK.
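The reasoning above (pending count drops while the re-allocated count stays flat) can be sketched as a small check on the summary lines. This is only an illustration: the helper names `sector_counts` and `interpret` are hypothetical, and the regular expressions assume the exact wording shown in the reports in this thread.

```python
import re


def sector_counts(report: str) -> dict:
    """Extract (start, end) counts for pending and re-allocated sectors."""
    patterns = {
        "pending": (
            r"(\d+) sectors were pending re-allocation before the start",
            r"(\d+) sectors are pending re-allocation at the end",
        ),
        "reallocated": (
            r"(\d+) sectors had been re-allocated before the start",
            r"(\d+) sectors are re-allocated at the end",
        ),
    }
    counts = {}
    for key, (start_re, end_re) in patterns.items():
        start = int(re.search(start_re, report).group(1))
        end = int(re.search(end_re, report).group(1))
        counts[key] = (start, end)
    return counts


def interpret(counts: dict) -> str:
    """Rough reading of the counts, following the advice in this thread."""
    p_start, p_end = counts["pending"]
    r_start, r_end = counts["reallocated"]
    if p_end > 0:
        return "sectors still pending"
    if r_end > r_start:
        return "sectors were re-mapped to spares"
    if p_end < p_start:
        return "pending sectors rewritten in place and accepted by the drive"
    return "no change"


report = """
8 sectors were pending re-allocation before the start of the preclear.
0 sectors are pending re-allocation at the end of the preclear,
40 sectors had been re-allocated before the start of the preclear.
40 sectors are re-allocated at the end of the preclear,
"""

counts = sector_counts(report)
print(counts, "->", interpret(counts))
```

Running further preclear cycles and comparing these counts between runs is the "see if anything changes" step: a stable result is the reassuring outcome.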
JustinChase Posted October 18, 2014

Okay, that makes sense. I'll preclear it 2 more times, then see what I come up with.

Should I do the full preclear, or can/should I use the -W option to save a bit of time? That preclear took about 50 hours. I can live with it taking 100 hours, but if I can cut that down a bit, that might be better.
jowi Posted October 20, 2014

I remember I noticed this before, but I'm not sure what the reason was. I am running preclear on a new WD Red, but it keeps all my drives spun up. If I spin them down, they start right back up. Is this a known 'feature'?
jowi Posted October 20, 2014

I've disabled all plugins; the only thing that is running is preclear. Also, there is constant reading from disk1. Is preclear causing this, and why?
garycase Posted October 20, 2014

Constant reads? ... or is it just spinning? On some controllers, if one drive is active, other drives on the controller will also be active, so if the drive you're pre-clearing and disk1 are both on a controller with these properties, that would explain it. But it shouldn't be causing actual read activity from the drive. Do you have any plugins that may be using the disk?
jowi Posted October 20, 2014

No, all plugins are stopped. I can see the read counter for disk1 increasing in the webgui, and the disk light on the case is blinking... so there is definitely something going on...

*edit* I missed one plugin, cache_dirs... I stopped it, and it looks like all disks are spun down... still weird, normally cache_dirs does not have that effect...
garycase Posted October 20, 2014

Cache_Dirs will indeed do a LOT of reading when you first boot the server, but will normally finish after all the directories are cached. If you have a LOT of files on disk1, and if the pre-clear activity is using enough of your memory that all of the directories can't be cached, then Cache_Dirs could be constantly attempting to update the cached information and never actually finish, since everything isn't fitting in memory.
jowi Posted October 20, 2014

I guess that is exactly what happened. Indeed, my disk1 is a collection of a lot of smaller files. Maybe preclear could disable cache_dirs when it starts, or inform the user that it is better to stop cache_dirs while preclearing?
RobJ Posted October 20, 2014

JustinChase wrote: "Should I do the full preclear, or can/should I use the -W option to save a bit of time?"

You are trying to thoroughly test the drive, so cutting back on the testing seems a little counter-productive, but yes, the preread is the least useful part of the testing, so skipping it shouldn't affect the result and would save time.
Joe L. Posted October 21, 2014

RobJ wrote: "the preread is the least useful part of the testing, so skipping it shouldn't affect the result and would save time."

Actually, it is only when "reading" that un-readable sectors can be identified. If you skip the pre-read, you'll only identify marginal sectors on the post-read, and not have any subsequent "write" to attempt to re-allocate those marginal sectors.

When requesting multiple cycles, the post-read of the first/prior cycle is used as the pre-read of the next, so some time is saved and you still get the benefit of the full test.

Joe L.
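The phase ordering described above can be sketched as follows. This is an illustration only, not code from preclear_disk.sh: the function `phases` and its `skip_preread` parameter are hypothetical, and the sketch assumes that skipping the pre-read removes only the initial read pass.

```python
from typing import List


def phases(cycles: int, skip_preread: bool = False) -> List[str]:
    """List the read/write passes for a multi-cycle preclear."""
    steps = []
    if not skip_preread:
        # Only the first cycle needs its own pre-read.
        steps.append("pre-read (cycle 1)")
    for n in range(1, cycles + 1):
        steps.append(f"zero (cycle {n})")
        # This post-read also serves as the pre-read of cycle n + 1.
        steps.append(f"post-read (cycle {n})")
    return steps


for step in phases(2):
    print(step)
```

For two cycles this gives five passes instead of six. With the pre-read skipped, every zeroing pass after the first is still preceded by a full read (the previous cycle's post-read); only the final post-read has no write after it.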
JustinChase Posted October 21, 2014

Hmmm... okay, too bad. I'm 48 hours into the 2 cycles without the pre-read, so that's 2 more wasted days. So much for saving time.

I guess I'll cancel and start over again. Thanks for the info.
Joe L. Posted October 21, 2014

You'll be fine, as the intermediate post-read serves as the pre-read for the subsequent cycle. Let it continue to completion.
grither Posted October 24, 2014

Hi all, wanted to run my preclear results by you. I've highlighted the part in bold that looks weird to me. For some background, this is a drive that flaked out on me one day and I lost ALL the data on it. I don't intend to reuse it; however, I ran preclear to see what it would say. It says it has NOT been successfully precleared?

=======================================================================1.13
== invoked as: ./preclear_disk.sh -M 4 /dev/sdp
==
== Disk /dev/sdp has NOT been successfully precleared
== Postread detected un-expected non-zero bytes on disk==
== Ran 1 cycle
==
== Using :Read block size = 8225280 Bytes
== Last Cycle's Pre Read Time  : 8:07:02 (68 MB/s)
== Last Cycle's Zeroing time   : 7:09:46 (77 MB/s)
== Last Cycle's Post Read Time : 15:36:35 (35 MB/s)
== Last Cycle's Total Time     : 30:54:24
==
== Total Elapsed Time 30:54:24
==
== Disk Start Temperature: 34C
==
== Current Disk Temperature: 34C,
==
============================================================================
No SMART attributes are FAILING_NOW

0 sectors were pending re-allocation before the start of the preclear.
0 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
0 sectors are pending re-allocation at the end of the preclear,
the number of sectors pending re-allocation did not change.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear,
the number of sectors re-allocated did not change.
============================================================================
============================================================================
==
== S.M.A.R.T Initial Report for /dev/sdp
== Disk: /dev/sdp
smartctl 5.40 2010-10-16 r3189 [i486-slackware-linux-gnu] (local build)
Copyright © 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Green family
Device Model:     WDC WD20EADS-00R6B0
Serial Number:    WD-WCAVY0403098
Firmware Version: 01.00A01
User Capacity:    2,000,398,934,016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Tue Oct 21 20:21:21 2014 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (43200) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 255) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x303f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       48
  3 Spin_Up_Time            0x0027   158   149   021    Pre-fail  Always       -       9066
  4 Start_Stop_Count        0x0032   087   087   000    Old_age   Always       -       13980
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   040   040   000    Old_age   Always       -       43972
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       118
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       8
193 Load_Cycle_Count        0x0032   196   196   000    Old_age   Always       -       13260
194 Temperature_Celsius     0x0022   118   094   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
==
============================================================================
============================================================================
==
== S.M.A.R.T Final Report for /dev/sdp
== Disk: /dev/sdp
smartctl 5.40 2010-10-16 r3189 [i486-slackware-linux-gnu] (local build)
Copyright © 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Green family
Device Model:     WDC WD20EADS-00R6B0
Serial Number:    WD-WCAVY0403098
Firmware Version: 01.00A01
User Capacity:    2,000,398,934,016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Thu Oct 23 03:15:44 2014 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                 (43200) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 255) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x303f) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       48
  3 Spin_Up_Time            0x0027   158   149   021    Pre-fail  Always       -       9066
  4 Start_Stop_Count        0x0032   087   087   000    Old_age   Always       -       13980
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   040   040   000    Old_age   Always       -       44002
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       118
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       8
193 Load_Cycle_Count        0x0032   196   196   000    Old_age   Always       -       13260
194 Temperature_Celsius     0x0022   118   094   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
==
============================================================================
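As a sanity check on the timings in this report, the MB/s figures can be reproduced from the drive's capacity and the elapsed times. This is a sketch under assumptions: it uses decimal megabytes and truncates to a whole number, which happens to match the posted values; whether preclear_disk.sh computes the rate this way is not confirmed. The helper names are hypothetical.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an H:MM:SS string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def mb_per_second(total_bytes: int, hms: str) -> float:
    """Average throughput in decimal megabytes per second."""
    return total_bytes / hms_to_seconds(hms) / 1_000_000


CAPACITY = 2_000_398_934_016  # "User Capacity" from the SMART report above

for label, elapsed in [
    ("pre-read", "8:07:02"),
    ("zeroing", "7:09:46"),
    ("post-read", "15:36:35"),
]:
    print(f"{label:<10} {int(mb_per_second(CAPACITY, elapsed))} MB/s")
```

The post-read pass on this drive took roughly twice as long as the pre-read, which is why its rate comes out at about half.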
Joe L. Posted October 24, 2014

You did not specify which version of unRAID you are running, but... according to the preclear_disk thread:

"The current version is 1.15. If you have an older version, please download the newest one. Older versions prior to 1.14 did not have the ability to properly handle larger disks (larger than 2.2TB). Versions prior to 1.15 did not work properly on 64 Bit unRAID."

Assuming you are not on 64 bit unRAID, and not clearing a disk greater than 2.2TB, your disk did not contain all zeros after being written with all zeros. (That is not a good thing.)

In any case, you might want to download the current version, as you are two versions behind.

Joe L.
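The version caveats above amount to a two-condition check, sketched below. The function name `script_version_ok` is hypothetical, and treating "2.2TB" as 2.2 x 10^12 bytes is an assumption; the version cut-offs (1.14 for large disks, 1.15 for 64 bit unRAID) come from the quoted text.

```python
LARGE_DISK_BYTES = 2_200_000_000_000  # assumed reading of "2.2TB"


def script_version_ok(version: str, disk_bytes: int, is_64bit: bool) -> bool:
    """Return True if this preclear script version suits the disk and OS."""
    parsed = tuple(int(part) for part in version.split("."))
    if disk_bytes > LARGE_DISK_BYTES and parsed < (1, 14):
        return False  # large disks need 1.14 or later
    if is_64bit and parsed < (1, 15):
        return False  # 64 bit unRAID needs 1.15 or later
    return True


# The 2 TB drive from the report, run with version 1.13 on 32 bit unRAID.
print(script_version_ok("1.13", 2_000_398_934_016, is_64bit=False))
```

If the check passes, as it does for a 2 TB drive on 32 bit unRAID, an outdated script does not explain the failure and the non-zero bytes point at the hardware.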
grither Posted October 24, 2014

Oops, didn't notice I was using an old version. Just updated and restarted the preclear. I'm fairly certain I'm just going to dispose of this drive once preclear is done. After losing a full drive of data, I would be insane to put this back in my server even if the results come back clear, right?
garycase Posted October 25, 2014

If the drive tests okay, and reformats okay, then it's a good drive to use for some of your backups. Generally you only write data to a backup disk once, then "tuck it away" for use in an emergency, so it doesn't get nearly the use of an operational drive in the system. Several of my backup drives are either RMA replacements or drives that I just elected to replace because I didn't like the reallocated sector counts.
grither Posted October 26, 2014

Okay, reran with the updated script. Still got the weird error from before. I will remove the drive from the system; however, I'm wondering what it actually means, and what caused it?

========================================================================1.15
== invoked as: ./preclear_disk.sh -M 4 /dev/sdp
==
== Disk /dev/sdp has NOT been successfully precleared
== Postread detected un-expected non-zero bytes on disk==
== Ran 1 cycle
==
== Using :Read block size = 8388608 Bytes
== Last Cycle's Pre Read Time  : 8:06:47 (68 MB/s)
== Last Cycle's Zeroing time   : 7:09:45 (77 MB/s)
== Last Cycle's Post Read Time : 15:38:23 (35 MB/s)
== Last Cycle's Total Time     : 30:55:57
==
== Total Elapsed Time 30:55:57
==
== Disk Start Temperature: 33C
==
== Current Disk Temperature: 32C,
==
============================================================================
** Changed attributes in files: /tmp/smart_start_sdp /tmp/smart_finish_sdp
ATTRIBUTE             NEW_VAL  OLD_VAL  FAILURE_THRESHOLD  STATUS  RAW_VALUE
Temperature_Celsius = 120      119      0                  ok      32
No SMART attributes are FAILING_NOW

0 sectors were pending re-allocation before the start of the preclear.
0 sectors were pending re-allocation after pre-read in cycle 1 of 1.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 1.
0 sectors are pending re-allocation at the end of the preclear,
the number of sectors pending re-allocation did not change.
0 sectors had been re-allocated before the start of the preclear.
0 sectors are re-allocated at the end of the preclear,
the number of sectors re-allocated did not change.
============================================================================
itimpi Posted October 26, 2014

The error message is saying that what was written to the disk is not what was found when the disk was read. As such, the disk is unreliable and should not be used in unRAID. A bit strange, though, that nothing appears to be showing up in the SMART report.
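To make the "what was written is not what was read" check concrete, here is a minimal sketch of a post-read style verification: scan the data block by block and report the first byte that is not zero. It is not the code preclear_disk.sh uses; the function name `first_nonzero_offset` is hypothetical, and an in-memory stream stands in for the disk.

```python
import io
from typing import BinaryIO, Optional


def first_nonzero_offset(stream: BinaryIO, block_size: int = 1024 * 1024) -> Optional[int]:
    """Return the offset of the first non-zero byte, or None if all zeros."""
    offset = 0
    while True:
        block = stream.read(block_size)
        if not block:
            return None  # reached the end without finding a non-zero byte
        remainder = block.lstrip(b"\x00")
        if remainder:
            # Leading zeros stripped; what remains starts at the bad byte.
            return offset + len(block) - len(remainder)
        offset += len(block)


clean = io.BytesIO(bytes(4096))
corrupt = io.BytesIO(bytes(5000) + b"\x01" + bytes(10))

print(first_nonzero_offset(clean))                     # None
print(first_nonzero_offset(corrupt, block_size=4096))  # 5000
```

A check like this only sees the data after it has passed through the drive, cable, controller and system memory, so a mismatch does not say which of those altered it, and the fault can exist without any SMART attribute changing.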
SSD Posted October 26, 2014

This is an exceedingly rare scenario. It may mean bad memory in the computer, bad (cache) memory on the disk, or even bad memory on a controller. Remotely possible is a bad connection.

Faced with this I'd likely run a 24 hour memory test on my server.
JustinChase Posted October 26, 2014

RobJ wrote: "The best advice we give users when this happens is to Preclear it one or two more times, and see if anything changes. If no further changes, then the drive should be good."

After a couple of false starts, here are the results.
They look good to me, but I would just like confirmation before putting this drive back into the array as a data drive...

========================================================================1.15
== invoked as: ./preclear_disk.sh -c 2 /dev/sdm
== ST4000DM000-1F2168 W300LGJ3
== Disk /dev/sdm has been successfully precleared
== with a starting sector of 1
== Ran 2 cycles
==
== Using :Read block size = 1000448 Bytes
== Last Cycle's Pre Read Time  : 11:35:45 (95 MB/s)
== Last Cycle's Zeroing time   : 8:13:31 (135 MB/s)
== Last Cycle's Post Read Time : 20:41:48 (53 MB/s)
== Last Cycle's Total Time     : 28:56:19
==
== Total Elapsed Time 69:44:26
==
== Disk Start Temperature: 22C
==
== Current Disk Temperature: 30C,
==
============================================================================
** Changed attributes in files: /tmp/smart_start_sdm /tmp/smart_finish_sdm
ATTRIBUTE                 NEW_VAL  OLD_VAL  FAILURE_THRESHOLD  STATUS       RAW_VALUE
Spin_Retry_Count        = 100      100      97                 near_thresh  0
End-to-End_Error        = 100      100      99                 near_thresh  0
High_Fly_Writes         = 78       82       0                  ok           22
Airflow_Temperature_Cel = 70       78       45                 near_thresh  30
Temperature_Celsius     = 30       22       0                  ok           30
No SMART attributes are FAILING_NOW

0 sectors were pending re-allocation before the start of the preclear.
0 sectors were pending re-allocation after pre-read in cycle 1 of 2.
0 sectors were pending re-allocation after zero of disk in cycle 1 of 2.
0 sectors were pending re-allocation after post-read in cycle 1 of 2.
0 sectors were pending re-allocation after zero of disk in cycle 2 of 2.
0 sectors are pending re-allocation at the end of the preclear,
the number of sectors pending re-allocation did not change.
40 sectors had been re-allocated before the start of the preclear.
40 sectors are re-allocated at the end of the preclear,
the number of sectors re-allocated did not change.
============================================================================
grither Posted October 26, 2014

Thanks for all the feedback. Can anyone point me in the right direction in terms of how to run a memory test on the server?