Should I replace the HDD?



Dear Experts,

 

Please see below the error log from my 2TB Samsung drive.  It failed in Unraid in the past; I removed and rebuilt it, and it seems to be working again.

 

Based on the error log below, should I be proactive and send the drive back for a replacement, or just stick with it and hope it doesn't fail again?

 

Device Model: SAMSUNG HD203WI
Serial Number: S1UYJ1BZ107154
Firmware Version: 1AN10003
User Capacity: 2,000,398,934,016 bytes
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: ATA-8-ACS revision 6
Local Time is: Tue Jun 4 14:31:34 2013 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
SMART overall-health : PASSED

 

ATA Error Count: 69 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.
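For anyone who wants to compare the timestamps in the entries that follow, here is a minimal Python sketch of the Powered_Up_Time format described above. The function name `parse_powered_up_time` is made up for illustration, and the idea that the wrap comes from a 32-bit millisecond counter is an assumption; it simply happens to match the stated 49.710 days.

```python
import re
from datetime import timedelta

# Assumption: the counter is a 32-bit count of milliseconds, hence the wrap.
WRAP_MS = 2 ** 32
wrap_days = WRAP_MS / 1000 / 86400  # roughly 49.710 days

# Matches "hh:mm:SS.sss" with an optional "DDd+" prefix, as in the legend.
_STAMP = re.compile(r"^(?:(\d+)d\+)?(\d{2}):(\d{2}):(\d{2})\.(\d{3})$")


def parse_powered_up_time(stamp: str) -> timedelta:
    """Convert a Powered_Up_Time string into a timedelta since power-on."""
    match = _STAMP.match(stamp.strip())
    if match is None:
        raise ValueError(f"unrecognised Powered_Up_Time: {stamp!r}")
    # The day group is None when the "DDd+" prefix is absent.
    days, hours, minutes, seconds, millis = (
        int(group) if group else 0 for group in match.groups()
    )
    return timedelta(
        days=days, hours=hours, minutes=minutes,
        seconds=seconds, milliseconds=millis,
    )


if __name__ == "__main__":
    # Timestamps taken from errors 69 and 66 in this log.
    print(parse_powered_up_time("00:01:07.856").total_seconds())
    print(parse_powered_up_time("00:00:18.771").total_seconds())
    print(round(wrap_days, 3))
```

Read this way, errors 67 to 69 were logged about 68 seconds after power-on and errors 65 and 66 about 19 seconds after power-on, i.e. all shortly after the drive was powered up.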

 

Error 69 occurred at disk power-on lifetime: 5125 hours (213 days + 13 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 00 00 00 a0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ec 00 00 00 00 00 a0 08      00:01:07.856  IDENTIFY DEVICE
  ec 00 00 00 00 00 a0 08      00:01:07.856  IDENTIFY DEVICE
  00 00 01 01 00 00 00 08      00:01:07.856  NOP [Abort queued commands]
  00 00 01 01 00 00 00 00      00:01:07.856  NOP [Abort queued commands]
  00 00 01 01 00 00 00 00      00:01:07.856  NOP [Abort queued commands]

 

Error 68 occurred at disk power-on lifetime: 5125 hours (213 days + 13 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 00 00 00 a0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ec 00 00 00 00 00 a0 08      00:01:07.851  IDENTIFY DEVICE
  ec 00 00 00 00 00 a0 08      00:01:07.851  IDENTIFY DEVICE
  00 00 01 01 00 00 00 08      00:01:07.851  NOP [Abort queued commands]
  a1 00 00 00 00 00 a0 08      00:01:07.840  IDENTIFY PACKET DEVICE
  ec 00 00 00 00 00 a0 08      00:01:07.840  IDENTIFY DEVICE

 

Error 67 occurred at disk power-on lifetime: 5125 hours (213 days + 13 hours)
  When the command that caused the error occurred, the device was in a reserved state.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 00 00 00 a0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ec 00 00 00 00 00 a0 08      00:01:07.840  IDENTIFY DEVICE
  ec 00 00 00 00 00 a0 08      00:01:07.840  IDENTIFY DEVICE
  00 00 01 01 00 00 00 08      00:01:07.839  NOP [Abort queued commands]
  00 00 01 01 00 00 00 00      00:01:07.839  NOP [Abort queued commands]
  00 00 01 01 00 00 00 00      00:01:07.839  NOP [Abort queued commands]

 

Error 66 occurred at disk power-on lifetime: 5099 hours (212 days + 11 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 00 00 00 a0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ec 00 00 00 00 00 a0 08      00:00:18.771  IDENTIFY DEVICE
  ef 03 42 00 00 00 a0 08      00:00:18.771  SET FEATURES [set transfer mode]
  ec 00 00 00 00 00 a0 08      00:00:18.771  IDENTIFY DEVICE
  00 00 01 01 00 00 00 08      00:00:18.771  NOP [Abort queued commands]
  00 00 01 01 00 00 00 00      00:00:18.771  NOP [Abort queued commands]

 

Error 65 occurred at disk power-on lifetime: 5099 hours (212 days + 11 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  84 51 00 00 00 00 a0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ec 00 00 00 00 00 a0 08      00:00:18.766  IDENTIFY DEVICE
  00 00 01 01 00 00 00 08      00:00:18.766  NOP [Abort queued commands]
  00 00 01 01 00 00 00 00      00:00:18.765  NOP [Abort queued commands]
  00 00 01 01 00 00 00 00      00:00:18.765  NOP [Abort queued commands]
  ec 00 00 00 00 00 a0 08      00:00:18.760  IDENTIFY DEVICE
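All five entries above end with the same register values, ER = 84 and ST = 51. As a rough aid to reading them, the sketch below splits each byte into its set bits. The bit names are the conventional ATA mnemonics as I understand them, not something taken from this log, and several of these bits are obsolete or command-dependent in newer ATA revisions, so treat the tables as an assumption. The helper name `decode_bits` is made up for illustration.

```python
# Assumed bit names for the ATA Error register (bit 7 down to bit 0).
ERROR_BITS = {
    7: "ICRC",  # interface CRC error
    6: "UNC",   # uncorrectable data
    5: "MC",    # media changed
    4: "IDNF",  # ID not found
    3: "MCR",   # media change request
    2: "ABRT",  # command aborted
    1: "NM",    # no media
    0: "AMNF",  # address mark not found
}

# Assumed bit names for the ATA Status register (bit 7 down to bit 0).
STATUS_BITS = {
    7: "BSY",   # busy
    6: "DRDY",  # device ready
    5: "DF",    # device fault
    4: "DSC",   # seek complete
    3: "DRQ",   # data request
    2: "CORR",  # corrected data
    1: "IDX",   # index
    0: "ERR",   # error
}


def decode_bits(value: int, names: dict) -> list:
    """Return the names of the bits set in an 8-bit register value."""
    return [names[bit] for bit in range(7, -1, -1) if value & (1 << bit)]


if __name__ == "__main__":
    print(decode_bits(0x84, ERROR_BITS))   # ['ICRC', 'ABRT']
    print(decode_bits(0x51, STATUS_BITS))  # ['DRDY', 'DSC', 'ERR']
```

Under those assumptions, every logged error is an aborted command with the interface CRC bit set. That combination is often read as pointing at the data link (cable, connector, or controller port) rather than at the platters, but the log alone does not prove it either way.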

Link to comment

It's difficult to predict the reliability of a drive -- in some ways drives are like lightbulbs ... they may last a LONG time, or they may "pop" at any instant.

 

I tend to be VERY conservative ... if a drive shows ANY errors, or even has ANY reallocated sectors, I relegate it to other uses and replace it in my array.  But that's clearly a "paranoid" approach -- and it doesn't guarantee a good array any more than just watching the SMART parameters to make sure a drive isn't getting worse.
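For the "watch the parameters" route, one way to do it is to save the raw SMART values periodically and compare them. The sketch below is only an illustration: the function name `worsening` is made up, the attribute names are ones smartctl commonly reports (not every drive exposes all of them), and the two snapshots are invented numbers, not readings from the drive in this thread.

```python
# Counters that many people watch; which ones a given drive reports is an assumption.
WATCHED = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
)


def worsening(previous: dict, current: dict, watched=WATCHED) -> dict:
    """Return {attribute: (old, new)} for watched counters whose raw value rose."""
    changes = {}
    for name in watched:
        old = previous.get(name, 0)
        new = current.get(name, 0)
        if new > old:
            changes[name] = (old, new)
    return changes


if __name__ == "__main__":
    # Hypothetical snapshots, e.g. copied by hand from two smartctl reports.
    last_month = {"Reallocated_Sector_Ct": 0, "UDMA_CRC_Error_Count": 12}
    today = {"Reallocated_Sector_Ct": 0, "UDMA_CRC_Error_Count": 15}
    print(worsening(last_month, today))
```

An empty result means none of the watched counters increased between the two snapshots; it says nothing about whether the drive will keep working.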

 

Bottom line:  Do what YOU feel comfortable with.

 

Link to comment

Good advice.  Would I be within my rights to send it back to Seagate for a replacement?

 

I've had Seagate grant me an RMA in the past for SMART errors even though the drive otherwise passed the overall-health test. It was for an actual Seagate drive, but I don't imagine there would be any difference now that Samsung's drive business has been completely absorbed.

Link to comment

It's clearly up to the manufacturer, but I've never had one refuse a drive I returned; they have always simply sent me another one ... WD, Seagate, Samsung, and Maxtor are some of the ones I've dealt with over the years.  So as I noted above, if it were me, I'd send it back for a replacement while it's under warranty.

 

Link to comment
