[Solved] - unraid 5.0.6 - Data is invalid, looking for opinions on next step



Hello,

 

I have attached my syslog and smartctl reports for the drive in question, along with a screen capture of my array configuration.

 

I think the relevant syslog entry starts at Apr 24 20:45:42

 

This drive also had a similar issue in January, referenced here: http://lime-technology.com/forum/index.php?topic=45375.msg433138

 

I rebooted the system and ran smartctl (long report attached). The report shows the second instance of the error, yet the overall health result is PASSED.

 

I have not restarted the array.

 

I believe there is enough free space on disks 8/9 to hold the data from disk 7, if moving it there is a smart thing to do. Alternatively, I have a precleared drive at /dev/sdl, if replacing sdh (the bad one) in the array is the best course of action.

 

Any opinions on what the professionals would do are greatly appreciated.

 

-Chris

smart.txt

syslog.txt


SMART looks OK overall, but there are some recent sector errors that would make me replace this disk:

 

  9 Power_On_Hours          0x0012  095  095  000    Old_age  Always      -      39564

 

Error 2 occurred at disk power-on lifetime: 39264 hours (1636 days + 0 hours)

  When the command that caused the error occurred, the device was active or idle.

 

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  40 51 08 90 4e 1f 02  Error: UNC 8 sectors at LBA = 0x021f4e90 = 35606160

 

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name

  -- -- -- -- -- -- -- --  ----------------  --------------------

  25 00 08 90 4e 1f e0 00  2d+01:50:22.486  READ DMA EXT

  ca 00 08 58 1d 00 e0 00  2d+01:50:19.012  WRITE DMA

  c8 00 08 58 1d 00 e0 00  2d+01:50:19.012  READ DMA

  ea 00 00 00 00 00 a0 00  2d+01:50:19.002  FLUSH CACHE EXT

  ca 00 98 c0 1c 00 e0 00  2d+01:50:19.002  WRITE DMA

 

Error 1 occurred at disk power-on lifetime: 36555 hours (1523 days + 3 hours)

  When the command that caused the error occurred, the device was active or idle.

 

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  40 51 09 67 e6 40 01  Error: UNC 9 sectors at LBA = 0x0140e667 = 21030503

 

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name

  -- -- -- -- -- -- -- --  ----------------  --------------------

  c8 00 18 58 e6 40 e1 00      05:41:15.646  READ DMA

  c8 00 08 50 e6 40 e1 00      05:41:15.646  READ DMA

  c8 00 20 30 e6 40 e1 00      05:41:15.646  READ DMA

  c8 00 08 28 e6 40 e1 00      05:41:15.645  READ DMA

  c8 00 08 20 e6 40 e1 00      05:41:15.645  READ DMA
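For anyone wondering how the numbers in that error log fit together, here is a minimal sketch (my own helper names, nothing from smartctl itself). It assumes the SN/CL/CH/DH columns are the classic 28-bit LBA registers, with only the low nibble of DH belonging to the address, and it recomputes the LBA values and the "days + hours" figures from the report, plus how many power-on hours have passed since the latest error.

```python
def lba_from_registers(sn, cl, ch, dh):
    """Combine 28-bit LBA registers (SN, CL, CH, low nibble of DH) into a sector number."""
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


def hours_to_days(hours):
    """Split a power-on hour count into (days, remaining hours)."""
    return divmod(hours, 24)


# Register values copied from the "After command completion" rows of the log.
error2_lba = lba_from_registers(0x90, 0x4E, 0x1F, 0x02)
error1_lba = lba_from_registers(0x67, 0xE6, 0x40, 0x01)

# Power_On_Hours (39564) minus the lifetime hours at Error 2 (39264).
hours_since_error2 = 39564 - 39264

print(hex(error2_lba), error2_lba)
print(hex(error1_lba), error1_lba)
print(hours_to_days(39264), hours_to_days(36555))
print(hours_since_error2, "power-on hours since the latest error")
```

The gap between the current Power_On_Hours and the hour count of Error 2 is what makes that error "recent".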



Thank you for the assessment and advice.  :)

 

The unused drive, /dev/sdl, is precleared and ready to use.

 

Is this the proper protocol to follow?

1. Stop the array

2. Unassign the old drive from disk 7 (/dev/sdh).

3. Assign the new drive in the slot of the old drive (it is already installed and precleared)

4. Go to the Main -> Array Operation section

5. Put a check in the Yes, I'm sure checkbox (next to the information indicating the drive will be rebuilt), and click the Start button

 

The rebuild will begin, with hefty disk activity on all drives: lots of writes on the new drive and lots of reads on all other drives.

All of the contents of the old drive will be reconstructed onto the new drive from parity and the other data drives, making it an exact replacement, except possibly with more capacity than the old drive.
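To illustrate why the rebuild reads every other drive, here is a toy sketch of single-parity (XOR) reconstruction. It is only a model with made-up three-byte "disks" and hypothetical function names, not unRAID's actual code: the parity block is the XOR of all data blocks, so any one missing block equals the XOR of parity with the surviving blocks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity_block):
    """Recover the one missing data block from parity and the surviving data blocks."""
    return xor_blocks(list(surviving_blocks) + [parity_block])


# Made-up contents for three data disks; disk_c stands in for the disk being replaced.
disk_a = bytes([0x12, 0x34, 0x56])
disk_b = bytes([0xAB, 0xCD, 0xEF])
disk_c = bytes([0x0F, 0xF0, 0x55])

parity = xor_blocks([disk_a, disk_b, disk_c])

# Rebuild disk_c without reading it: only the other disks and parity are used.
rebuilt = rebuild_missing([disk_a, disk_b], parity)
print(rebuilt == disk_c)
```

This is also why an unreadable sector on any other disk during the rebuild is a problem: every surviving block is needed to recover the missing one.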


Looks OK. Before starting the rebuild, check SMART for the other disks to make sure there are no pending sectors, and keep the old disk intact until the rebuild is finished.

 

Thank you. Is the pending sector check a long test or short test?


Just check the SMART attributes:

 

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   134   134   054    Pre-fail  Offline      -       111
  3 Spin_Up_Time            0x0007   125   125   024    Pre-fail  Always       -       559 (Average 554)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       3085
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   132   132   020    Pre-fail  Offline      -       32
  9 Power_On_Hours          0x0012   095   095   000    Old_age   Always       -       39564
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       39
192 Power-Off_Retract_Count 0x0032   098   098   000    Old_age   Always       -       3144
193 Load_Cycle_Count        0x0012   098   098   000    Old_age   Always       -       3144
194 Temperature_Celsius     0x0002   253   253   000    Old_age   Always       -       23 (Min/Max 11/33)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0
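If you would rather not eyeball the table for every disk, something like the following could pull the raw value of attribute 197 out of the text. It is a rough sketch that assumes the attribute table layout shown above, with the raw value in the tenth whitespace-separated column. The function names are my own, and the optional helper assumes smartctl is on the PATH and that you run it as root.

```python
import subprocess


def raw_value(smart_text, attribute_id):
    """Return the RAW_VALUE of one attribute from 'smartctl -A' text, or None if absent."""
    for line in smart_text.splitlines():
        fields = line.split()
        # Attribute rows start with the numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0] == str(attribute_id):
            return int(fields[9])
    return None


def read_attributes(device):
    """Run 'smartctl -A' on a device (assumes smartctl is installed; needs root)."""
    result = subprocess.run(["smartctl", "-A", device], capture_output=True, text=True)
    return result.stdout


# Two rows in the same layout as the table above.
sample = """\
  9 Power_On_Hours          0x0012   095   095   000    Old_age   Always       -       39564
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
"""

pending = raw_value(sample, 197)
print("pending sectors:", pending)
```

On a live system you would call `raw_value(read_attributes("/dev/sdX"), 197)` once per array disk, and anything other than 0 deserves a closer look before starting the rebuild.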


Alright. I checked the smartctl attributes for every disk, and all of them report a raw value of 0 for attribute 197, Current_Pending_Sector.

 

I will move forward with the replacement and rebuild.

 

On another note, is there a "power on hours" threshold you typically use to decide when to replace a disk?
