[Solved] - unraid 5.0.6 - Data is invalid, looking for opinions on next step - General Support (V5 and Older)

May 8, 2016

Hello,

I have attached my syslog and smartctl reports for the drive in question, along with a screen capture of my array configuration.

I think the relevant syslog entry starts at Apr 24 20:45:42

This drive also had a similar issue in January, referenced here: http://lime-technology.com/forum/index.php?topic=45375.msg433138

I rebooted the system and ran a smartctl report (long report attached). It shows the second instance of the error but still reports an overall result of PASSED.

I have not restarted the array.

I believe there is enough space on disks 8 and 9 to hold the data from disk 7, if that is a smart thing to do. I also have a precleared drive at /dev/sdl, if replacing sdh (the bad one) in the array is the best course of action.

Any opinions on what the professionals would do are greatly appreciated.

-Chris

smart.txt

syslog.txt


May 8, 2016

SMART looks OK, but there are some recent sector errors that would make me replace this disk:

9 Power_On_Hours 0x0012 095 095 000 Old_age Always - 39564

Error 2 occurred at disk power-on lifetime: 39264 hours (1636 days + 0 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:

ER ST SC SN CL CH DH

-- -- -- -- -- -- --

40 51 08 90 4e 1f 02 Error: UNC 8 sectors at LBA = 0x021f4e90 = 35606160

Commands leading to the command that caused the error were:

CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

-- -- -- -- -- -- -- -- ---------------- --------------------

25 00 08 90 4e 1f e0 00 2d+01:50:22.486 READ DMA EXT

ca 00 08 58 1d 00 e0 00 2d+01:50:19.012 WRITE DMA

c8 00 08 58 1d 00 e0 00 2d+01:50:19.012 READ DMA

ea 00 00 00 00 00 a0 00 2d+01:50:19.002 FLUSH CACHE EXT

ca 00 98 c0 1c 00 e0 00 2d+01:50:19.002 WRITE DMA

Error 1 occurred at disk power-on lifetime: 36555 hours (1523 days + 3 hours)

When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:

ER ST SC SN CL CH DH

-- -- -- -- -- -- --

40 51 09 67 e6 40 01 Error: UNC 9 sectors at LBA = 0x0140e667 = 21030503

Commands leading to the command that caused the error were:

CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

-- -- -- -- -- -- -- -- ---------------- --------------------

c8 00 18 58 e6 40 e1 00 05:41:15.646 READ DMA

c8 00 08 50 e6 40 e1 00 05:41:15.646 READ DMA

c8 00 20 30 e6 40 e1 00 05:41:15.646 READ DMA

c8 00 08 28 e6 40 e1 00 05:41:15.645 READ DMA

c8 00 08 20 e6 40 e1 00 05:41:15.645 READ DMA
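For anyone wondering how the LBA in each error entry relates to the register dump, and how "recent" the errors are, here is a minimal Python sketch. It assumes the 28-bit register layout that the two entries above appear to follow (SN, CL, CH, then the low nibble of DH). The function and variable names are made up for illustration and are not part of smartctl.

```python
def lba28_from_registers(sn: int, cl: int, ch: int, dh: int) -> int:
    """Combine ATA registers into a 28-bit LBA.

    Assumed layout: SN holds bits 0-7, CL bits 8-15, CH bits 16-23,
    and the low nibble of DH holds bits 24-27.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Register values copied from the two error entries (columns SN CL CH DH).
error_2_lba = lba28_from_registers(sn=0x90, cl=0x4E, ch=0x1F, dh=0x02)
error_1_lba = lba28_from_registers(sn=0x67, cl=0xE6, ch=0x40, dh=0x01)

# Attribute 9 (Power_On_Hours) raw value from the same report.
POWER_ON_HOURS = 39564


def hours_since(error_lifetime_hours: int) -> int:
    """Power-on hours elapsed between a logged error and the report."""
    return POWER_ON_HOURS - error_lifetime_hours


print(error_2_lba, hours_since(39264))
print(error_1_lba, hours_since(36555))
```

The decoded values match the LBAs smartctl printed (35606160 and 21030503). The latest error was logged 300 power-on hours before the report, roughly 12.5 days of run time, which is why it counts as recent.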


May 8, 2016

Author


Thank you for the assessment and advice.

The unused drive, /dev/sdl, is precleared and ready to use.

Is this the proper protocol to follow?

1. Stop the array

2. Unassign the old drive from disk 7 (/dev/sdh).

3. Assign the new drive to the slot of the old drive (it is already installed and precleared)

4. Go to the Main -> Array Operation section

5. Put a check in the Yes, I'm sure checkbox (next to the information indicating the drive will be rebuilt), and click the Start button

The rebuild will begin, with hefty disk activity on all drives, lots of writes on the new drive and lots of reads on all other drives

All of the contents of the old drive will be copied onto the new drive, making it an exact replacement, except possibly with more capacity than the old drive.


May 8, 2016

Looks OK. Before starting the rebuild, check SMART for the other disks to make sure there are no pending sectors, and keep the old disk intact until the rebuild is finished.


May 8, 2016

Author


Thank you. Is the pending sector check a long test or short test?


May 8, 2016

Just check the SMART attributes:

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE

1 Raw_Read_Error_Rate 0x000b 100 100 016 Pre-fail Always - 0

2 Throughput_Performance 0x0005 134 134 054 Pre-fail Offline - 111

3 Spin_Up_Time 0x0007 125 125 024 Pre-fail Always - 559 (Average 554)

4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 3085

5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0

7 Seek_Error_Rate 0x000b 100 100 067 Pre-fail Always - 0

8 Seek_Time_Performance 0x0005 132 132 020 Pre-fail Offline - 32

9 Power_On_Hours 0x0012 095 095 000 Old_age Always - 39564

10 Spin_Retry_Count 0x0013 100 100 060 Pre-fail Always - 0

12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 39

192 Power-Off_Retract_Count 0x0032 098 098 000 Old_age Always - 3144

193 Load_Cycle_Count 0x0012 098 098 000 Old_age Always - 3144

194 Temperature_Celsius 0x0002 253 253 000 Old_age Always - 23 (Min/Max 11/33)

196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0

197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0

198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0

199 UDMA_CRC_Error_Count 0x000a 200 200 000 Old_age Always - 0
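If there are many disks to check, reading attribute 197 out of each report can be scripted. The following is only a rough Python sketch: it assumes the report text has the ten-column attribute layout shown above, and `pending_sectors` and `SAMPLE` are hypothetical names used for illustration. It parses saved report text rather than calling smartctl itself.

```python
from typing import Optional

# A few rows copied from the attribute table above.
SAMPLE = """\
  5 Reallocated_Sector_Ct 0x0033 100 100 005 Pre-fail Always - 0
194 Temperature_Celsius 0x0002 253 253 000 Old_age Always - 23 (Min/Max 11/33)
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
"""


def pending_sectors(report: str) -> Optional[int]:
    """Return the raw value of Current_Pending_Sector, or None if absent.

    Columns are assumed to be: ID, NAME, FLAG, VALUE, WORST, THRESH,
    TYPE, UPDATED, WHEN_FAILED, RAW_VALUE (so the raw value is field 9).
    """
    for line in report.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[1] == "Current_Pending_Sector":
            return int(fields[9])
    return None


count = pending_sectors(SAMPLE)
if count is None:
    print("attribute 197 not found in report")
elif count == 0:
    print("no pending sectors")
else:
    print(f"{count} pending sectors, look into this disk before rebuilding")
```

Feeding it the saved report for each disk in turn gives a quick pass/fail on the pending sector count without reading every table by eye.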


May 8, 2016

Author

Alright. I checked the smartctl attributes for every disk, and each one reports: 197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0.

I will move forward with the replacement and rebuild.

On another note, is there a threshold for the number of power-on hours you typically use before you replace a disk?
