Can someone have a look at my SMART test?



Hi,

I have a problematic drive (WD 2TB) and have run a long SMART test on it. Could someone take a look and tell me whether I need to RMA it? I don't really have much of a clue as to what I'm looking at and would be most grateful for any help.

 

Here's the SMART test:

 

smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Green (AF, SATA 6Gb/s)
Device Model:     WDC WD20EZRX-00D8PB0
Serial Number:    WD-WCC4M9YLCTRE
LU WWN Device Id: 5 0014ee 20a8f8833
Firmware Version: 80.00A80
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Nov 11 05:50:41 2014 GMT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80)	Offline data collection activity
				was never started.
				Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 117)	The previous self-test completed having
				the read element of the test failed.
Total time to complete Offline 
data collection: 		(26160) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
				Auto Offline data collection on/off support.
				Suspend Offline collection upon new
				command.
				Offline surface scan supported.
				Self-test supported.
				Conveyance Self-test supported.
				Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
				power-saving mode.
				Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
				General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 265) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x7035)	SCT Status supported.
				SCT Feature Control supported.
				SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   178   178   051    Pre-fail  Always       -       21327
  3 Spin_Up_Time            0x0027   167   165   021    Pre-fail  Always       -       4650
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       62
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       16
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       225
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       60
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       33
193 Load_Cycle_Count        0x0032   199   199   000    Old_age   Always       -       3614
194 Temperature_Celsius     0x0022   129   120   000    Old_age   Always       -       18
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       22
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   197   197   000    Old_age   Offline      -       1369

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       50%       215         1501859712
# 2  Extended offline    Completed: read failure       70%       187         946507768
# 3  Short offline       Completed: read failure       10%       186         946507768
# 4  Extended offline    Completed: read failure       70%       186         946507768
# 5  Extended offline    Completed: read failure       70%       185         946507768
# 6  Short offline       Completed without error       00%       160         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

 


The things I noticed were:

  • There are a number of reallocated sectors. This is not an issue in itself, as modern disks are designed to do that when needed, but it would be an issue if the number keeps increasing.
  • There are a number of 'pending sectors'. These are sectors that could not be read reliably. You do not want any disk in unRAID to have a non-zero pending sector count, because if a different disk subsequently failed, these sectors might stop the rebuild of the failed disk from being 100% successful.

Pending sectors are only ever cleared by a write operation. If the write succeeds, the pending status is cleared; if it fails, the sector should be reallocated.
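
For anyone wanting to keep an eye on these counters, here is a minimal sketch that pulls the raw values of attributes 5, 197 and 198 out of `smartctl -A` style output. The function name `smart_sector_counts` is made up for illustration, and it assumes the ten-column attribute table layout shown in the report in the first post.

```shell
# Hypothetical helper: reads `smartctl -A` style text on stdin and prints
# "ID NAME RAW_VALUE" for the sector-health attributes discussed above.
# The NF >= 10 check skips other lines that merely start with a 5
# (for example the selective self-test span table).
smart_sector_counts() {
    awk 'NF >= 10 && ($1 == 5 || $1 == 197 || $1 == 198) { print $1, $2, $10 }'
}

# Example using three rows copied from the report in this thread.
smart_sector_counts <<'EOF'
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       16
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       22
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
EOF
```

On a live system the same function could be fed directly, e.g. `smartctl -A /dev/sdX | smart_sector_counts`, and re-run after a preclear to see whether the pending count has dropped or the reallocated count has grown.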

 

If the disk in question is not part of the array then the easy solution is to run the pre-clear script against it.

 

If it is part of the array then one way to try to clear these would be to force a rebuild of the disk. However, to minimise any chance of data loss in such a case, it is better to do the rebuild onto a spare disk (if you have one). Once that has succeeded, the problem disk can be tested with the pre-clear script without any risk of data loss.

If the disk in question is not part of the array then the easy solution is to run the pre-clear script against it.

Hi and thanks for your input.

This SMART test is the first thing I've done since a preclear on the drive.

The preclear completed all 10 steps, but during the final post-read I lost the network connection to the PuTTY session. I noticed that on the final read the speed had slowed down to several kB/s.

When I tested to see if the drive had been precleared using the "-t" option, it said it had.

Because I lost the PuTTY session I didn't get the final SMART test results, hence this SMART test.

 

Bearing this in mind do you think the drive is worth persevering with?


Use screen when you pre-clear: http://lime-technology.com/wiki/index.php/Configuration_Tutorial#Preclearing_With_Screen

 

If you get disconnected you can reattach to the session.

 

Personally, if it's "problematic", put a new one in and bin it or RMA it; they are only a hundred bucks these days. I don't know what problems you had, so maybe you can explain more.

 

I recently had a drive like this on my Windows machine. It had 55 bad sectors, but it was unrecoverable due to I/O problems and really slow reads, both as a SATA drive and in a USB3 enclosure. Luckily I had backups in place, so I just binned it (or, in my case, I use them as paperweights!).


Thanks for the advice.

I bought 3 WD 2TB drives from you to go in an unRAID box. Two are performing adequately, but one is severely underperforming in speed tests. I've run a 400 MB read test on each drive using:

dd of=/dev/null bs=4096 count=102400 if=/dev/sdd

 

Two of the drives report read speeds of around 150 MB/s, whereas the other one is at 3.9 MB/s.

After preclearing it got better but was erratic.
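
For reference, the arithmetic behind those figures can be sketched as below: `bs=4096` times `count=102400` comes to 419,430,400 bytes (400 MiB), and the helper shows how long a read of that size takes at a given rate. The function name `read_seconds` is invented for this note, and it assumes 1 MB means 1,000,000 bytes.

```shell
# Size of the read test: block size times block count, as in the dd command.
bs=4096
count=102400
total_bytes=$((bs * count))
echo "$total_bytes bytes"

# Hypothetical helper: seconds needed to read $1 bytes at $2 MB/s
# (assumption: 1 MB = 1,000,000 bytes).
read_seconds() {
    awk -v bytes="$1" -v rate="$2" \
        'BEGIN { printf "%.1f\n", bytes / (rate * 1000000) }'
}

read_seconds "$total_bytes" 150   # rate reported by the two good drives
read_seconds "$total_bytes" 3.9   # rate reported by the slow drive
```

In other words, the same test that takes about 3 seconds on the good drives takes close to 2 minutes on the slow one.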

 

Anyway, I've had enough of messing about with it. I've RMA-ed it.


RMA it.

This is enough to RMA it.

 

At 225 hours there are 22 pending sectors and the drive cannot pass a read test.

 

Don't bother with pre-clear at this point.

 

..
9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       225
...
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       22
..
# 1  Extended offline    Completed: read failure       50%       215         1501859712
# 2  Extended offline    Completed: read failure       70%       187         946507768
# 3  Short offline       Completed: read failure       10%       186         946507768
# 4  Extended offline    Completed: read failure       70%       186         946507768
# 5  Extended offline    Completed: read failure       70%       185         946507768
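
To make the pattern in those log lines easier to see, here is a rough sketch that counts how often each failing LBA appears in a self-test log. The function name `failing_lbas` is made up, and it assumes the self-test log layout quoted above, with the LBA as the last field on each line.

```shell
# Hypothetical helper: reads self-test log lines on stdin and prints
# "COUNT LBA" for every test that ended in a read failure, most frequent first.
failing_lbas() {
    awk '/^#/ && /read failure/ { seen[$NF]++ }
         END { for (lba in seen) print seen[lba], lba }' | sort -rn
}

# Example using the five failed tests quoted above.
failing_lbas <<'EOF'
# 1  Extended offline    Completed: read failure       50%       215         1501859712
# 2  Extended offline    Completed: read failure       70%       187         946507768
# 3  Short offline       Completed: read failure       10%       186         946507768
# 4  Extended offline    Completed: read failure       70%       186         946507768
# 5  Extended offline    Completed: read failure       70%       185         946507768
EOF
```

Four of the five failures stop at the same LBA and the most recent one stops at a different LBA, which suggests more than one unreadable area on the disk. On a live system the input could come from `smartctl -l selftest /dev/sdX`.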

 


OK.

So I RMA-ed the drive and got a new one yesterday.

I've got 3 drives in the machine, all the same WD 2TB Green drives.

I decided to preclear them all again.

I started them all off at the same time.

The 2 old drives finished preclearing in about 16 hours, but the new drive is still preclearing after 25 hours and shows no sign of finishing anytime soon.

Here is the current status (note the read speed is 468 kB/s):

unRAID server Pre-Clear disk /dev/sdd
=               cycle 1 of 1, partition start on sector 64
= Disk Pre-Clear-Read completed                                 DONE
= Step 1 of 10 - Copying zeros to first 2048k bytes             DONE
= Step 2 of 10 - Copying zeros to remainder of disk to clear it DONE
= Step 3 of 10 - Disk is now cleared from MBR onward.           DONE
= Step 4 of 10 - Clearing MBR bytes for partition 2,3 & 4       DONE
= Step 5 of 10 - Clearing MBR code area                         DONE
= Step 6 of 10 - Setting MBR signature bytes                    DONE
= Step 7 of 10 - Setting partition 1 to precleared state        DONE
= Step 8 of 10 - Notifying kernel we changed the partitioning   DONE
= Step 9 of 10 - Creating the /dev/disk/by* entries             DONE
= Step 10 of 10 - Verifying if the MBR is cleared.              DONE
= Post-Read in progress: 2% complete.
(  40,265,318,400  of  2,000,398,934,016  bytes read ) 468 kB/s
Disk Temperature: 23C, Elapsed Time:  25:57:43
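
To put that rate in perspective, here is a back-of-the-envelope estimate of how long the rest of the post-read would take if it stayed at 468 kB/s. This is only a sketch: the function name `eta_days` is invented here, and it assumes 1 kB means 1,000 bytes.

```shell
# Hypothetical helper: days left for a read pass.
# $1 = bytes read so far, $2 = total bytes, $3 = current rate in kB/s
# (assumption: 1 kB = 1,000 bytes).
eta_days() {
    awk -v read_bytes="$1" -v total="$2" -v rate="$3" \
        'BEGIN { printf "%.1f\n", (total - read_bytes) / (rate * 1000) / 86400 }'
}

# Figures taken from the status screen above.
eta_days 40265318400 2000398934016 468
```

At that pace the remaining 98% would need about 48 days, compared with roughly 16 hours for the whole preclear on the other two drives.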

Surely this drive can't be faulty as well?

I tried changing the cables and the SATA port on the old drive and it made no difference.

Any ideas what might be going on here?


I would abort the preclear.

Swap this drive's position on the motherboard with another drive and test it out.

I would probably look at the syslog first to see what messages are coming out.

 

However, you should still do the basic speed test.

 

dd of=/dev/null bs=4096 count=102400 if=/dev/sd?

where ? = device to test.

Do hdparm speed tests also.

 

Did you review the SMART log of the new drive? (Post it.)

 

Perhaps post your syslog to see if something else is going on with the motherboard.

 

 


Thanks.

The other reports are there, but not the one for the drive that is still preclearing.

If I stop the preclear, will it appear?

 

There should definitely be a starting preclear SMART report for the drive in process.

There is a start, an rpt and a finish file.

 

At least from what I see with this example:

 

root@unRAIDb:/boot/preclear_reports# ls -l *W1F1H834*
-rwxrwxrwx 1 root root 5054 2013-01-14 18:49 preclear_finish_\ W1F1H834_2013-01-14*
-rwxrwxrwx 1 root root 1889 2013-01-14 18:49 preclear_rpt_\ W1F1H834_2013-01-14*
-rwxrwxrwx 1 root root 5032 2013-01-14 18:49 preclear_start_\ W1F1H834_2013-01-14*
root@unRAIDb:/boot/preclear_reports#

 

If not, capture one with smartctl -a /dev/sd?
