SMART report fails for Seagate ST3000DM001 on v5


I have a 3 TB Seagate showing 128 errors in the error column of the GUI. It passes parity checks without problems, but I expect it is going bad. I tried to run a SMART report, but smartctl fails as shown below. The drive is attached to an M1015 controller in a Norco 4224 chassis.

 

ST3000DM001-9YN166_W1F0MED2 (sdl) 2930266532

 

root@Server1:~# smartctl -a -d ata /dev/sdl
smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

Read Device Identity failed: Invalid argument

A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
==========
root@Server1:~# smartctl -a -d ata -T permissive  /dev/sdl
smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

Read Device Identity failed: Invalid argument

=== START OF INFORMATION SECTION ===
Device Model:     [No Information Found]
Serial Number:    [No Information Found]
Firmware Version: [No Information Found]
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   [No Information Found]
Local Time is:    Thu Aug  6 09:29:56 2015 CDT
SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 82-83 don't show if SMART supported.
SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 85-87 don't show if SMART is enabled.
A mandatory SMART command failed: exiting. To continue, add one or more '-T permissive' options.
==========
root@Server1:~# smartctl -a -d ata -T verypermissive  /dev/sdl
smartctl 6.2 2013-07-26 r3841 [i686-linux-3.9.11p-unRAID] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

Read Device Identity failed: Invalid argument

=== START OF INFORMATION SECTION ===
Device Model:     [No Information Found]
Serial Number:    [No Information Found]
Firmware Version: [No Information Found]
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   [No Information Found]
Local Time is:    Thu Aug  6 09:30:05 2015 CDT
SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 82-83 don't show if SMART supported.
SMART support is: Ambiguous - ATA IDENTIFY DEVICE words 85-87 don't show if SMART is enabled.
                  Checking to be sure by trying SMART RETURN STATUS command.
SMART support is: Unknown - Try option -s with argument 'on' to enable it.
Read SMART Data failed: Invalid argument

=== START OF READ SMART DATA SECTION ===
Error SMART Status command failed: Invalid argument
SMART overall-health self-assessment test result: UNKNOWN!
SMART Status, Attributes and Thresholds cannot be read.

Read SMART Error Log failed: Invalid argument

Read SMART Self-test Log failed: Invalid argument

Selective Self-tests/Logging not supported

root@Server1:~#

Link to comment

Get rid of the "-d ata"

 

$ smartctl -a -T verypermissive  /dev/sdl
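
One possible reason `-d ata` fails here, offered as an assumption rather than something confirmed in this thread: the drive sits behind the M1015 SAS HBA, where ATA commands typically have to go through SCSI-to-ATA translation (`-d sat`), and smartctl's default auto-detection can usually work that out on its own. The sketch below tries `-d auto` first and `-d sat` second. The function name `smart_report` and the `SMARTCTL` override variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of unRAID or unMenu): print a full SMART
# report using the first device type that smartctl accepts for the drive.
# SMARTCTL may point at an alternate smartctl binary; the name is made up here.
smart_report() {
    dev=$1
    for dtype in auto sat; do
        out=$("${SMARTCTL:-smartctl}" -a -d "$dtype" "$dev")
        rc=$?
        # smartctl's exit status is a bit mask. Bits 0-2 mean: bad command
        # line, device open failed, or a SMART/ATA command failed.
        if [ $(( rc & 7 )) -eq 0 ]; then
            printf '%s\n' "$out"
            return 0
        fi
    done
    echo "no working device type found for $dev" >&2
    return 1
}

# Usage: smart_report /dev/sdl
```

Only the low three bits of the exit status are checked because the higher bits report drive health (for example, errors in the error log), not a failure to talk to the drive.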

 

That did it. I think I will pull it and rebuild the array, then put it into another machine and pre-clear the daylights out of it.

(I wonder why myMain from unMenu runs it with the -d ata parameter.)

 

Serial Number:    W1F0MED2
LU WWN Device Id: 5 000c50 0524a68f1
Firmware Version: CC9E
User Capacity:    3,000,592,982,016 bytes [3.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Fri Aug  7 09:34:58 2015 CDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
See vendor-specific Attribute list for marginal Attributes.

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  575) seconds.
Offline data collection
capabilities:                    (0x73) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 333) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x3081) SCT Status supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always       -       129471864
  3 Spin_Up_Time            0x0003   092   092   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   095   095   020    Old_age   Always       -       6141
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       200
  7 Seek_Error_Rate         0x000f   085   060   030    Pre-fail  Always       -       396581271
  9 Power_On_Hours          0x0032   078   078   000    Old_age   Always       -       19956
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       96
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   087   087   000    Old_age   Always       -       13
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   063   045   045    Old_age   Always   In_the_past 37 (Min/Max 24/45)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       85
193 Load_Cycle_Count        0x0032   063   063   000    Old_age   Always       -       74034
194 Temperature_Celsius     0x0022   037   055   000    Old_age   Always       -       37 (0 10 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       2548h+18m+43.797s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       23074907530402
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       3995737547186

SMART Error Log Version: 1
ATA Error Count: 13 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 13 occurred at disk power-on lifetime: 19773 hours (823 days + 21 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 00 ff ff ff 4f 00  44d+07:28:25.683  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00  44d+07:28:25.656  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00  44d+07:28:25.646  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00  44d+07:28:25.620  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00  44d+07:28:25.593  READ FPDMA QUEUED

Error 12 occurred at disk power-on lifetime: 19756 hours (823 days + 4 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 10 ff ff ff 4f 00  43d+14:23:04.071  READ FPDMA QUEUED
  60 00 20 ff ff ff 4f 00  43d+14:23:04.065  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00  43d+14:22:57.187  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00  43d+14:22:57.184  READ FPDMA QUEUED
  60 00 18 ff ff ff 4f 00  43d+14:22:50.309  READ FPDMA QUEUED

Error 11 occurred at disk power-on lifetime: 19755 hours (823 days + 3 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 00 ff ff ff 4f 00  43d+13:52:23.939  READ FPDMA QUEUED
  60 00 38 ff ff ff 4f 00  43d+13:52:19.501  READ FPDMA QUEUED
  60 00 40 ff ff ff 4f 00  43d+13:52:19.471  READ FPDMA QUEUED
  60 00 90 ff ff ff 4f 00  43d+13:52:19.469  READ FPDMA QUEUED
  60 00 70 ff ff ff 4f 00  43d+13:52:19.468  READ FPDMA QUEUED

Error 10 occurred at disk power-on lifetime: 19742 hours (822 days + 14 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 00 ff ff ff 4f 00  43d+00:45:47.175  READ FPDMA QUEUED
  60 00 00 ff ff ff 4f 00  43d+00:45:47.098  READ FPDMA QUEUED
  60 00 10 ff ff ff 4f 00  43d+00:45:41.793  READ FPDMA QUEUED
  60 00 38 ff ff ff 4f 00  43d+00:45:39.292  READ FPDMA QUEUED
  60 00 38 ff ff ff 4f 00  43d+00:45:39.286  READ FPDMA QUEUED

Error 9 occurred at disk power-on lifetime: 19742 hours (822 days + 14 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 ff ff ff 0f  Error: UNC at LBA = 0x0fffffff = 268435455

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  60 00 08 ff ff ff 4f 00  42d+23:59:45.230  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00  42d+23:59:45.226  READ FPDMA QUEUED
  60 00 08 ff ff ff 4f 00  42d+23:59:39.302  READ FPDMA QUEUED
  60 00 10 ff ff ff 4f 00  42d+23:59:39.295  READ FPDMA QUEUED
  60 00 10 ff ff ff 4f 00  42d+23:59:33.293  READ FPDMA QUEUED

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]


SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

root@Server1:~#
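
A rough way to scan an attribute table like the one above is to print only the attributes most often treated as failure indicators when their raw value is nonzero. The watch list used here (IDs 5, 187, 197, 198) is a common rule of thumb and an assumption on my part, not something stated in this thread; `flag_attrs` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical filter: read `smartctl -A` (or -a) output on stdin and print
# watched attributes whose RAW_VALUE is greater than zero.
flag_attrs() {
    awk '
        # Attribute rows: numeric ID in field 1, hex FLAG in field 3,
        # RAW_VALUE starting at field 10.
        $1 ~ /^[0-9]+$/ && $3 ~ /^0x/ && NF >= 10 &&
        ($1 == 5 || $1 == 187 || $1 == 197 || $1 == 198) &&
        $10 + 0 > 0 {
            printf "%s %s raw=%s\n", $1, $2, $10
        }
    '
}

# Usage: smartctl -A /dev/sdl | flag_attrs
```

Fed the table in this post, it would print the Reallocated_Sector_Ct row (raw 200) and the Reported_Uncorrect row (raw 13), and skip Current_Pending_Sector and Offline_Uncorrectable, which are both 0.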

Link to comment

I believe the smartctl parameters can be configured, so the -d ata was added at some point and can be removed.

 

The drive reported errors recently (within weeks) after 2+ years of service. Return it under warranty or trash it, unless you have the time and patience to test it to death.
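
The timing can be checked from the report itself by comparing Power_On_Hours with the lifetime hours recorded in the error log. This is a minimal sketch using the numbers from the output above; `hours_to_days` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: convert a count of power-on hours into "Nd+Nh".
hours_to_days() {
    echo "$(( $1 / 24 ))d+$(( $1 % 24 ))h"
}

power_on=19956   # Power_On_Hours raw value from the report
newest=19773     # lifetime hours at Error 13
oldest=19742     # lifetime hours at Error 9, the oldest entry still logged

echo "total service:       $(hours_to_days "$power_on")"
echo "since newest error:  $(hours_to_days $(( power_on - newest )))"
echo "since oldest logged: $(hours_to_days $(( power_on - oldest )))"
```

That works out to 831d+12h of power-on time (roughly 2.3 years), with the five logged errors all falling within about the last nine days of power-on time. Errors 1 to 8 are no longer in the log, so this report cannot show when the first one occurred.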

Link to comment

In myMain, if you click on the hyperlinked last 4 digits of the serial number for the drive in question, a drive settings page will be displayed. Set the smartopt to -A as shown in the screenshot below. That will cause myMain to stop using the "-d ata". The -T option you are using is not required.

[Screenshot: Setsmartopt.JPG]

Link to comment

Awesome, I'll change that. But first I need to get disk 23 to show up in the myMain Smart view, which is also very strange: disk 23 shows up in the default myMain view but not in the Smart view. All disks up to 22 show, but not 23.

Link to comment
