Drive red-balled. Need help



Running unRAID 5b14.

 

Recently got a lot of errors during a parity check.

I believe I shut down and reseated the power and SATA cables.

Started back up and 2 drives red-balled: one showing as unformatted and one showing up as the wrong drive.

I shut down again, swapped out the SATA cables, and unplugged the cache drive.

Started up again and one disk is still disabled.

Ran short and long smartctl tests on the drive.

 

Can anyone help interpret the results? I have read up on it but don't have a background in IT, so....

Where do I go from here? The "Trust My Array" procedure?

 

 

Long smartctl test results:

 

Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x3031) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

 

SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       1009
  3 Spin_Up_Time            0x0027   195   171   021    Pre-fail  Always       -       5216
  4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       1164
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   086   086   000    Old_age   Always       -       10392
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       303
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       92
193 Load_Cycle_Count        0x0032   179   179   000    Old_age   Always       -       64569
194 Temperature_Celsius     0x0022   119   113   000    Old_age   Always       -       31
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   198   000    Old_age   Always       -       51
198 Offline_Uncorrectable   0x0030   200   198   000    Old_age   Offline      -       51
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   198   000    Old_age   Offline      -       51

 

SMART Error Log Version: 1

No Errors Logged

 

SMART Self-test log structure revision number 1

Num  Test_Description    Status                   Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure        40%            10382  2227052200
# 2  Short offline       Completed without error        00%            10378  -
# 3  Short offline       Completed without error        00%            10378  -
# 4  Short offline       Completed without error        00%             9384  -
# 5  Short offline       Completed without error        00%             9384  -
# 6  Short offline       Completed without error        00%             9176  -
# 7  Short offline       Completed without error        00%             9175  -

 

SMART Selective self-test log data structure revision number 1

 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.

syslog-20120728-112815.txt


What is the current state of your array? Can you still get to your data?

 

With regard to the SMART results:

 

The SMART results show definite drive errors. I would get a new drive, preclear it and replace this one, then put this one through a rigorous preclear cycle to see whether the errors keep increasing.

 

The power-on hours suggest that your drive might still be under warranty, so you can possibly RMA it.
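As a rough illustration of that reasoning, the Power_On_Hours raw value from the report above can be converted into days and years of running time. This is only a sketch: power-on time is a lower bound on the drive's age, and it assumes nothing about the purchase date, which is what a warranty normally depends on.

```python
# Power_On_Hours raw value taken from the SMART report in this thread.
power_on_hours = 10392

# Convert running time to days and (approximate) years.
power_on_days = power_on_hours / 24
power_on_years = power_on_days / 365

print(f"Powered on for about {power_on_days:.0f} days "
      f"({power_on_years:.2f} years of running time)")
```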


The value of concern is Current_Pending_Sector. Those sectors caused the read errors. Do you have a pre-cleared spare drive? Otherwise, unassign and pre-clear the drive, then post a new SMART report. If Current_Pending_Sector goes to zero, you can assign and rebuild the drive.
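For anyone who wants to check that value without reading the whole report by eye, here is a minimal sketch that pulls the raw values out of a SMART attribute table. It assumes the text comes from a smartctl attribute listing like the one posted above; the function name `parse_smart_attributes` and the `SAMPLE` text are just for illustration.

```python
def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} from a smartctl attribute table."""
    attrs = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the 10th column is RAW_VALUE.
        if len(fields) >= 10 and fields[0].isdigit():
            try:
                attrs[fields[1]] = int(fields[9])
            except ValueError:
                continue  # skip rows whose raw value is not a plain integer
    return attrs


# Three rows copied from the report in this thread.
SAMPLE = """\
  5 Reallocated_Sector_Ct  0x0033  200  200  140    Pre-fail  Always      -      0
197 Current_Pending_Sector  0x0032  200  198  000    Old_age  Always      -      51
198 Offline_Uncorrectable  0x0030  200  198  000    Old_age  Offline      -      51
"""

attrs = parse_smart_attributes(SAMPLE)
if attrs.get("Current_Pending_Sector", 0) > 0:
    print("Pending sectors:", attrs["Current_Pending_Sector"])
```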


Thank you very much for your prompt assistance.

 

I don't have a pre-cleared spare lying around. That would definitely have been optimal.  :'(

 

Hmm, I did start the array briefly with the drive disabled and was able to access the data. Presently I have the server shut down, just in case.

 

Sorry, the results pasted were not complete.

I was just looking at the results from when I pre-cleared it a month ago, and it definitely looks troubling:

Raw_Read_Error_Rate: 514 -> 1009

Current_Pending_Sector: 4 -> 51

Offline_Uncorrectable: 0 -> 51

Multi_Zone_Error_Rate: 0 -> 51
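The same before/after comparison can be sketched in a few lines, using only the four raw values quoted above. The helper name `smart_deltas` is made up for this example.

```python
# Raw values from the preclear report a month ago and from the current report.
before = {
    "Raw_Read_Error_Rate": 514,
    "Current_Pending_Sector": 4,
    "Offline_Uncorrectable": 0,
    "Multi_Zone_Error_Rate": 0,
}
after = {
    "Raw_Read_Error_Rate": 1009,
    "Current_Pending_Sector": 51,
    "Offline_Uncorrectable": 51,
    "Multi_Zone_Error_Rate": 51,
}


def smart_deltas(before, after):
    """Return the change in raw value for every attribute present in both reports."""
    return {name: after[name] - before[name] for name in before if name in after}


deltas = smart_deltas(before, after)
for name, change in deltas.items():
    print(f"{name}: {before[name]} -> {after[name]} ({change:+d})")
```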

 

Is this worth further testing, or should I just RMA it now? I have until 03/25/2013.

Thank you again for your assistance.

 

smart.txt

preclear_finish__WD-WMAVU2279913_2012-06-27.txt

preclear_rpt__WD-WMAVU2279913_2012-06-27.txt


I would not use any drive that has pending sectors. Repeat pre-clear until pending sectors go to zero, or RMA.

IF A PRECLEAR DISK PROCESS ENDS WITH SECTORS PENDING REALLOCATION, repeat pre-clear until pending sectors go to zero and STAY AT ZERO FOR AT LEAST ONE ADDITIONAL CYCLE, or RMA.
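That rule of thumb can be written down as a small check. This is only a sketch: it assumes you note the Current_Pending_Sector raw value after each preclear cycle, and the function name `pending_cleared` is hypothetical.

```python
def pending_cleared(pending_after_each_cycle):
    """True only if pending sectors were zero after the last two preclear cycles.

    The list holds the Current_Pending_Sector raw value recorded after each
    cycle, oldest first. One zero is not enough: the count must stay at zero
    for at least one additional cycle.
    """
    last_two = pending_after_each_cycle[-2:]
    return len(last_two) == 2 and all(count == 0 for count in last_two)
```

For example, a drive that went from 51 pending sectors to 0 after one cycle would still need another clean cycle: `pending_cleared([51, 0])` is `False`, while `pending_cleared([51, 0, 0])` is `True`. The zeros in these calls are illustrative inputs, not results from this drive.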