Errors with hard drive. What is the next approach to take?



I just bought this drive and precleared it a few days ago. I only ran the preclear once, and I don't recall seeing any errors. I am seeing errors after syncing it to the array as the new parity drive; I had a 2TB drive as parity before. This is the first time I have seen errors in unRAID, which I have been using for about a year on this setup.

 

 

Also, is this something I can just take back to the store? Best Buy had them for $150 so I snagged one as I am getting low on space.

 

 

 

 

 

 

[Screenshot: Screen Shot 2012-07-20 at 12.21.23 AM.png]

 

 

 

 

 

 

I also have this:

 

 

 

 

 

 

smartctl -a -d ata /dev/sdj

smartctl 5.40 2010-10-16 r3189 [i486-slackware-linux-gnu] (local build)

Copyright © 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

 

 

=== START OF INFORMATION SECTION ===

Device Model:    WDC WD30EZRX-00MMMB0

Serial Number:    WD-WCAWZ2240746

Firmware Version: 80.00A80

User Capacity:    3,000,592,982,016 bytes

Device is:        Not in smartctl database [for details use: -P showall]

ATA Version is:  8

ATA Standard is:  Exact ATA specification draft version not indicated

Local Time is:    Fri Jul 20 00:28:09 2012 CDT

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

 

 

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

 

 

General SMART Values:

Offline data collection status:  (0x80)  Offline data collection activity

              was never started.

              Auto Offline Data Collection: Enabled.

Self-test execution status:      ( 119)  The previous self-test completed having

              the read element of the test failed.

Total time to complete Offline

data collection:        (49980) seconds.

Offline data collection

capabilities:          (0x7b) SMART execute Offline immediate.

              Auto Offline data collection on/off support.

              Suspend Offline collection upon new

              command.

              Offline surface scan supported.

              Self-test supported.

              Conveyance Self-test supported.

              Selective Self-test supported.

SMART capabilities:            (0x0003)  Saves SMART data before entering

              power-saving mode.

              Supports SMART auto save timer.

Error logging capability:        (0x01)  Error logging supported.

              General Purpose Logging supported.

Short self-test routine

recommended polling time:    (  2) minutes.

Extended self-test routine

recommended polling time:    ( 255) minutes.

Conveyance self-test routine

recommended polling time:    (  5) minutes.

SCT capabilities:          (0x3035)  SCT Status supported.

              SCT Feature Control supported.

              SCT Data Table supported.

 

 

SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE

  1 Raw_Read_Error_Rate    0x002f  199  199  051    Pre-fail  Always      -      1731

  3 Spin_Up_Time            0x0027  153  152  021    Pre-fail  Always      -      9333

  4 Start_Stop_Count        0x0032  100  100  000    Old_age  Always      -      26

  5 Reallocated_Sector_Ct  0x0033  200  200  140    Pre-fail  Always      -      0

  7 Seek_Error_Rate        0x002e  200  200  000    Old_age  Always      -      0

  9 Power_On_Hours          0x0032  100  100  000    Old_age  Always      -      92

10 Spin_Retry_Count        0x0032  100  253  000    Old_age  Always      -      0

11 Calibration_Retry_Count 0x0032  100  253  000    Old_age  Always      -      0

12 Power_Cycle_Count      0x0032  100  100  000    Old_age  Always      -      14

192 Power-Off_Retract_Count 0x0032  200  200  000    Old_age  Always      -      12

193 Load_Cycle_Count        0x0032  200  200  000    Old_age  Always      -      69

194 Temperature_Celsius    0x0022  114  102  000    Old_age  Always      -      38

196 Reallocated_Event_Count 0x0032  200  200  000    Old_age  Always      -      0

197 Current_Pending_Sector  0x0032  200  200  000    Old_age  Always      -      0

198 Offline_Uncorrectable  0x0030  100  253  000    Old_age  Offline      -      0

199 UDMA_CRC_Error_Count    0x0032  200  200  000    Old_age  Always      -      0

200 Multi_Zone_Error_Rate  0x0008  100  253  000    Old_age  Offline      -      0

 

 

SMART Error Log Version: 1

ATA Error Count: 1257 (device log contains only the most recent five errors)

  CR = Command Register [HEX]

  FR = Features Register [HEX]

  SC = Sector Count Register [HEX]

  SN = Sector Number Register [HEX]

  CL = Cylinder Low Register [HEX]

  CH = Cylinder High Register [HEX]

  DH = Device/Head Register [HEX]

  DC = Device Command Register [HEX]

  ER = Error register [HEX]

  ST = Status register [HEX]

Powered_Up_Time is measured from power on, and printed as

DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,

SS=sec, and sss=millisec. It "wraps" after 49.710 days.

 

 

Error 1257 occurred at disk power-on lifetime: 92 hours (3 days + 20 hours)

  When the command that caused the error occurred, the device was active or idle.

 

 

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  40 51 08 c0 00 00 e0  Error: UNC 8 sectors at LBA = 0x000000c0 = 192

 

 

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name

  -- -- -- -- -- -- -- --  ----------------  --------------------

  c8 00 08 c0 00 00 e0 08  3d+09:01:01.179  READ DMA

  ec 00 00 00 00 00 a0 08  3d+09:01:01.139  IDENTIFY DEVICE

  ef 03 42 00 00 00 a0 08  3d+09:01:01.139  SET FEATURES [set transfer mode]

 

 

Error 1256 occurred at disk power-on lifetime: 92 hours (3 days + 20 hours)

  When the command that caused the error occurred, the device was active or idle.

 

 

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  40 51 08 c0 00 00 e0  Error: UNC 8 sectors at LBA = 0x000000c0 = 192

 

 

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name

  -- -- -- -- -- -- -- --  ----------------  --------------------

  c8 00 08 c0 00 00 e0 08  3d+09:00:57.842  READ DMA

  ec 00 00 00 00 00 a0 08  3d+09:00:57.802  IDENTIFY DEVICE

  ef 03 42 00 00 00 a0 08  3d+09:00:57.802  SET FEATURES [set transfer mode]

 

 

Error 1255 occurred at disk power-on lifetime: 92 hours (3 days + 20 hours)

  When the command that caused the error occurred, the device was active or idle.

 

 

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  40 51 08 c0 00 00 e0  Error: UNC 8 sectors at LBA = 0x000000c0 = 192

 

 

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name

  -- -- -- -- -- -- -- --  ----------------  --------------------

  c8 00 08 c0 00 00 e0 08  3d+09:00:54.515  READ DMA

  ec 00 00 00 00 00 a0 08  3d+09:00:54.475  IDENTIFY DEVICE

  ef 03 42 00 00 00 a0 08  3d+09:00:54.456  SET FEATURES [set transfer mode]

 

 

Error 1254 occurred at disk power-on lifetime: 92 hours (3 days + 20 hours)

  When the command that caused the error occurred, the device was active or idle.

 

 

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  40 51 08 c0 00 00 e0  Error: UNC 8 sectors at LBA = 0x000000c0 = 192

 

 

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name

  -- -- -- -- -- -- -- --  ----------------  --------------------

  c8 00 08 c0 00 00 e0 08  3d+09:00:51.159  READ DMA

  ec 00 00 00 00 00 a0 08  3d+09:00:51.119  IDENTIFY DEVICE

  ef 03 42 00 00 00 a0 08  3d+09:00:51.119  SET FEATURES [set transfer mode]

 

 

Error 1253 occurred at disk power-on lifetime: 92 hours (3 days + 20 hours)

  When the command that caused the error occurred, the device was active or idle.

 

 

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  40 51 08 c0 00 00 e0  Error: UNC 8 sectors at LBA = 0x000000c0 = 192

 

 

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name

  -- -- -- -- -- -- -- --  ----------------  --------------------

  c8 00 08 c0 00 00 e0 08  3d+09:00:47.822  READ DMA

  ec 00 00 00 00 00 a0 08  3d+09:00:47.782  IDENTIFY DEVICE

  ef 03 42 00 00 00 a0 08  3d+09:00:47.782  SET FEATURES [set transfer mode]

 

 

SMART Self-test log structure revision number 1

Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error

# 1  Short offline      Completed: read failure      70%        92        -

 

 

SMART Selective self-test log data structure revision number 1

SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Not_testing

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.
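For anyone reading the error log above, here is a minimal Python sketch of how the register line `40 51 08 c0 00 00 e0` maps to "UNC 8 sectors at LBA = 192". It assumes 28-bit LBA addressing and the legacy ATA bit names; the function name `decode_registers` and the bit tables are illustrative, not part of smartctl.

```python
# Register bytes copied from the "After command completion" line in the log.
# Column order in the log: ER ST SC SN CL CH DH
REGS = "40 51 08 c0 00 00 e0"

# Subset of legacy ATA bit names for the error and status registers
# (assumed naming; newer spec revisions rename some of these bits).
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}
STATUS_BITS = {0x80: "BSY", 0x40: "DRDY", 0x20: "DF",
               0x10: "DSC", 0x08: "DRQ", 0x01: "ERR"}


def decode_registers(line):
    """Decode one 7-byte register line from the SMART error log."""
    er, st, sc, sn, cl, ch, dh = (int(b, 16) for b in line.split())
    # 28-bit LBA: the low nibble of DH holds bits 27..24, then CH, CL, SN.
    lba = ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn
    return {
        "error_flags": [name for bit, name in ERROR_BITS.items() if er & bit],
        "status_flags": [name for bit, name in STATUS_BITS.items() if st & bit],
        "sector_count": sc,
        "lba": lba,
    }


if __name__ == "__main__":
    print(decode_registers(REGS))
```

All five logged errors carry the same register bytes, so each one is the same 8-sector read at LBA 192 being reported as uncorrectable.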


Those are NOT "smart" errors, but "read" errors.  unRAID does not look at SMART data.... ever.

 

I don't see any sectors pending re-allocation, or already re-allocated, so it might not be the drive.  (could be ANYTHING, but since you did not attach a syslog for analysis, we have no way to know what the errors represent.)
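As a rough illustration of the check described above, this Python sketch pulls the raw values of the reallocation and pending-sector attributes out of smartctl's attribute table. The sample rows are copied from the report earlier in this thread; the helper name `raw_values` and the choice of watched IDs (5, 196, 197, 198) are assumptions made for the example, and the parser only handles the plain ten-column layout shown here.

```python
# Attribute rows copied from the smartctl report posted above.
SAMPLE = """\
  5 Reallocated_Sector_Ct  0x0033  200  200  140    Pre-fail  Always      -      0
196 Reallocated_Event_Count 0x0032  200  200  000    Old_age  Always      -      0
197 Current_Pending_Sector  0x0032  200  200  000    Old_age  Always      -      0
198 Offline_Uncorrectable  0x0030  100  253  000    Old_age  Offline      -      0
"""

# Attribute IDs that track reallocated or pending sectors (illustrative choice).
WATCHED = {5, 196, 197, 198}


def raw_values(text):
    """Return {attribute_name: raw_value} for the watched attribute IDs."""
    found = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows have ten columns and start with a numeric ID.
        if len(fields) >= 10 and fields[0].isdigit():
            if int(fields[0]) in WATCHED:
                found[fields[1]] = int(fields[9])
    return found


if __name__ == "__main__":
    for name, raw in raw_values(SAMPLE).items():
        print(f"{name}: {raw}")
```

In this report every watched value is 0, which matches the observation that no sectors are pending or reallocated. It does not by itself explain the 1257 entries in the ATA error log, which is why the syslog is still needed.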


Here is a link to the log: http://pastebin.com/CwXwPfPE

 

 

Since my original post I reformatted the USB stick and started over, as I had many things installed and wanted a fresh copy running. I have a backup.

The log shows a problem with the only 3TB drive, which I don't understand. I just set up the machine it is in, and there were no problems until I swapped the 2TB parity drive for a 3TB drive.

I am running the latest version named 5.0-rc6-r8168-test.

 

 

You will notice in the log that I restarted the machine after the initial start. I pulled the parity drive from one slot and put it into another.

 

 

This is what I get from unRAID when setting up the drives. The 3TB drive is missing, and the drive light continues to blink.

 

 

 

 

[Screenshot: Screen Shot 2012-07-21 at 2.26.57 AM.png]

 

 

I am using an X7DBE-X with three AOC-SAT2-MV8 cards and Super Talent 2GB PC2-5300 DDR2-667MHz ECC Registered CL5 240-Pin DIMM memory, which I ran memtest on.

If anyone is interested, I got this from here: http://goo.gl/0UvPQ, which was posted on avsforum. More are on their way to eBay.

 

 

If there is anything else I can add that would help, I would be glad to know what it is. Thanks.

 

 

