Parity Check Errors

Hello. I've been running a home-built unRAID (v5.0.6) server for a little over a month now. I have a 6TB parity drive and roughly 2.5TB of data spread unevenly over 5 differently sized disks (largest disk being 2TB).

 

Everything has been going great. But when I ran my first parity check, there were roughly 2,000 errors. Somewhere in all my reading before getting the server up and running, I came across someone's opinion that you should not run "Correct any errors" when doing parity checks, but rather look at the log files and figure out what the issue is so that you can correct it appropriately. So that is what I've done.

 

The first time I ran a parity check was about a week ago. Per the log file, it appeared that all of the errors were on disk1. I didn't have time to address the issue when I first saw the log, so I walked away. The next day, when I was prepared to tackle it, the log file had completely changed. I'm not sure if this is normal, but instead of the hundreds of log lines with seemingly useful info that I saw before, there were now only 4 lines about spindown times for various disks.

 

Rather than go searching for it, I ran another parity check (again without correcting errors). This time I got fewer errors...

 

Last checked on Thu Feb 12 03:40:56 2015 PST, finding 1477 errors.

 

But again, the log file had only a few lines with the spindown info. I expected to get a log file containing info on the errors.

 

So, I guess my questions are:

 

1. Am I doing something wrong with trying to get the log file with the pertinent info?

2. Is there a simple way to retrieve old log files from the browser interface?

3. Should I be worried about that many errors during a parity check? Should I order a new disk right now and swap it in immediately?

4. Am I right to avoid "Correct errors" when doing the parity check? Or should I feel good about using it because it is there for a reason?

 

Any other bits of wisdom would be SUPER helpful.

 

Thank you so much, community, for your time!

 

-Vic
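
For readers weighing question 4, it may help to see what a parity "error" actually is. The sketch below is only an illustration of single XOR parity, which is the general scheme unRAID 5 uses; the function names are made up for this example and it works on small byte strings rather than real disks or sectors. A non-correcting check only counts mismatches, while a correcting check trusts the data disks and rewrites parity to match them.

```python
from functools import reduce
from operator import xor


def compute_parity(data_disks):
    """Return the XOR parity block for equally sized data blocks."""
    return bytes(reduce(xor, column) for column in zip(*data_disks))


def check_parity(data_disks, parity, correct=False):
    """Count byte positions where stored parity disagrees with the data.

    With correct=True the parity buffer is overwritten to match the data.
    The data disks are trusted and never modified in either mode.
    """
    expected = compute_parity(data_disks)
    mismatches = [i for i, (p, e) in enumerate(zip(parity, expected)) if p != e]
    if correct:
        for i in mismatches:
            parity[i] = expected[i]
    return len(mismatches)


if __name__ == "__main__":
    disks = [bytes([1, 2, 3, 4]), bytes([5, 6, 7, 8]), bytes([9, 10, 11, 12])]
    parity = bytearray(compute_parity(disks))
    parity[2] ^= 0xFF  # simulate one stale parity byte
    print(check_parity(disks, parity))                # 1 (reported only)
    print(check_parity(disks, parity, correct=True))  # 1 (reported and fixed)
    print(check_parity(disks, parity))                # 0 (parity now consistent)
```

The point of the design: a correcting pass makes parity agree with whatever the data disks currently return, so if a data disk is returning bad data, correcting bakes that bad data into parity. That is the usual argument for checking the disks first.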

I assume that except for parity these were just disks you already had. Did you preclear all the disks?

  • Author

Your assumption is spot on. Embarrassingly or not, I did NOT pre-clear any of the disks. Truthfully, that process seemed too foreign to me with command line stuff AND I was in a hurry. I forged ahead without, utilizing disks on hand of all ages and sizes, thinking they've "worked fine so far" as is and I can just replace if one gives me problems. Please don't judge too harshly :)

Probably worthwhile to get SMART reports of all drives.

  • Author

I actually did that for each drive using GSmartControl. With that particular utility, it wasn't really clear which of the reported numbers were actionable. Should it be pretty obvious?

 

And any thoughts on any of the posed questions?

 

1. Am I doing something wrong with trying to get the log file with the pertinent info?

2. Is there a simple way to retrieve old log files from the browser interface?

3. Should I be worried about that many errors during a parity check? Should I order a new disk right now and swap it in immediately?

4. Am I right to avoid "Correct errors" when doing the parity check? Or should I feel good about using it because it is there for a reason?

 

Thanks again.
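
On questions 1 and 2, one approach is to save a copy of the syslog right after a check finishes and search it offline. The sketch below assumes a saved copy of the log exists and that each parity mismatch appears as a line containing the words "parity incorrect" followed by a sector number. That wording is an assumption, not something shown in this thread, so adjust `PATTERN` to match what your own log contains. The function names and the file path in the usage note are hypothetical.

```python
import re
from collections import Counter

# ASSUMPTION: mismatches are logged as "... parity incorrect: <sector>".
# Change this pattern to match the wording in your own syslog.
PATTERN = re.compile(r"parity incorrect\D*(\d+)")


def find_parity_errors(lines):
    """Return the sector number from every line matching PATTERN."""
    sectors = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            sectors.append(int(match.group(1)))
    return sectors


def summarize(sectors, bucket=1_000_000):
    """Group sectors into coarse buckets to show whether errors cluster."""
    return Counter(s // bucket * bucket for s in sectors)


def report(path):
    """Print a short summary for a saved copy of the log at `path`."""
    with open(path, errors="replace") as handle:
        found = find_parity_errors(handle)
    print(f"{len(found)} matching lines")
    for start, count in sorted(summarize(found).items()):
        print(f"sectors from {start}: {count}")
```

Usage would be something like `report("syslog-copy.txt")`, where the file name is whatever you saved the log as. If the errors cluster in one sector range, that is worth mentioning when you post.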

Post SMART reports of all drives.

  • Author

OK, may I ask the best way to accomplish this with the drives in the server? Is there a particular tool you'd recommend?

 

And to get my other questions answered, would you suggest I post those individually?

 

Thanks.

There is an alternative GUI you can install that has a SMART plugin which operates from the GUI. It is called Dynamix, and you can find details on it in the first post in this thread:

 

      http://lime-technology.com/forum/index.php?topic=30939.0

 

The install instructions are on this page (some people have had trouble finding this link among all the other details):

 

    https://github.com/bergware/dynamix
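
If you would rather stay on the command line, the reports can also be collected in one pass. The following is a minimal sketch, assuming Python 3.7+ and `smartctl` are both available on the machine the drives are attached to (a stock unRAID 5 install does not include Python, so this is an assumption). The device glob patterns and function names are illustrative only. Note that `smartctl -a` already includes the attribute table, so a separate `-A` is not needed.

```python
import glob
import subprocess


def list_devices(patterns=("/dev/sd?", "/dev/hd?")):
    """Return whole-disk device nodes matching the given glob patterns."""
    found = []
    for pattern in patterns:
        found.extend(glob.glob(pattern))
    return sorted(found)


def smart_command(device):
    """Build the smartctl invocation (-a prints all SMART information)."""
    return ["smartctl", "-a", device]


def collect_reports(devices, run=subprocess.run):
    """Run smartctl for each device and return {device: report text}.

    The exit status is deliberately not checked, because smartctl can
    return a non-zero status even when it printed a usable report.
    """
    reports = {}
    for device in devices:
        result = run(smart_command(device), capture_output=True, text=True)
        reports[device] = result.stdout
    return reports
```

Calling `collect_reports(list_devices())` returns one report per drive, which can then be written to files or pasted into a post. The `run` parameter exists only so the function can be exercised without real hardware.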

 

 

 

  • Author

Thank you very much for the options to get the requested info. I ended up PuTTYing in. Here is the first of my drives...

 

PARITY

smartctl -a -A /dev/sdf

 

=== START OF INFORMATION SECTION ===

Device Model:    WDC WD60EFRX-68MYMN1

Serial Number:    WD-WX11D74RHY39

LU WWN Device Id: 5 0014ee 20ad20707

Firmware Version: 82.00A82

User Capacity:    6,001,175,126,016 bytes [6.00 TB]

Sector Sizes:    512 bytes logical, 4096 bytes physical

Rotation Rate:    5700 rpm

Device is:        Not in smartctl database [for details use: -P showall]

ATA Version is:  ACS-2, ACS-3 T13/2161-D revision 3b

SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)

Local Time is:    Thu Feb 19 12:29:40 2015 PST

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

 

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

 

General SMART Values:

Offline data collection status:  (0x00) Offline data collection activity

                                        was never started.

                                        Auto Offline Data Collection: Disabled.

Self-test execution status:      (  0) The previous self-test routine completed

                                        without error or no self-test has ever

                                        been run.

Total time to complete Offline

data collection:                ( 6824) seconds.

Offline data collection

capabilities:                    (0x7b) SMART execute Offline immediate.

                                        Auto Offline data collection on/off support.

                                        Suspend Offline collection upon new

                                        command.

                                        Offline surface scan supported.

                                        Self-test supported.

                                        Conveyance Self-test supported.

                                        Selective Self-test supported.

SMART capabilities:            (0x0003) Saves SMART data before entering

                                        power-saving mode.

                                        Supports SMART auto save timer.

Error logging capability:        (0x01) Error logging supported.

                                        General Purpose Logging supported.

Short self-test routine

recommended polling time:        (  2) minutes.

Extended self-test routine

recommended polling time:        ( 722) minutes.

Conveyance self-test routine

recommended polling time:        (  5) minutes.

SCT capabilities:              (0x303d) SCT Status supported.

                                        SCT Error Recovery Control supported.

                                        SCT Feature Control supported.

                                        SCT Data Table supported.

 

SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   208   205   021    Pre-fail  Always       -       8583
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       136
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       1314
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       31
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       21
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       938
194 Temperature_Celsius     0x0022   116   107   000    Old_age   Always       -       36
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

 

SMART Error Log Version: 1

No Errors Logged

 

SMART Self-test log structure revision number 1

No self-tests have been logged.  [To run self-tests, use: smartctl -t]

 

 

SMART Selective self-test log data structure revision number 1

SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Not_testing

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.

  • Author

DISK1

smartctl -a -A /dev/sdd

 

=== START OF INFORMATION SECTION ===

Model Family:    Western Digital Caviar Green (AF)

Device Model:    WDC WD20EARS-00MVWB0

Serial Number:    WD-WMAZA0993182

LU WWN Device Id: 5 0014ee 6ab1dc8a2

Firmware Version: 51.0AB51

User Capacity:    2,000,398,934,016 bytes [2.00 TB]

Sector Size:      512 bytes logical/physical

Device is:        In smartctl database [for details use: -P show]

ATA Version is:  ATA8-ACS (minor revision not indicated)

SATA Version is:  SATA 2.6, 3.0 Gb/s

Local Time is:    Thu Feb 19 12:23:23 2015 PST

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

 

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

 

General SMART Values:

Offline data collection status:  (0x82) Offline data collection activity

                                        was completed without error.

                                        Auto Offline Data Collection: Enabled.

Self-test execution status:      (  0) The previous self-test routine completed

                                        without error or no self-test has ever

                                        been run.

Total time to complete Offline

data collection:                (36600) seconds.

Offline data collection

capabilities:                    (0x7b) SMART execute Offline immediate.

                                        Auto Offline data collection on/off support.

                                        Suspend Offline collection upon new

                                        command.

                                        Offline surface scan supported.

                                        Self-test supported.

                                        Conveyance Self-test supported.

                                        Selective Self-test supported.

SMART capabilities:            (0x0003) Saves SMART data before entering

                                        power-saving mode.

                                        Supports SMART auto save timer.

Error logging capability:        (0x01) Error logging supported.

                                        General Purpose Logging supported.

Short self-test routine

recommended polling time:        (  2) minutes.

Extended self-test routine

recommended polling time:        ( 353) minutes.

Conveyance self-test routine

recommended polling time:        (  5) minutes.

SCT capabilities:              (0x3035) SCT Status supported.

                                        SCT Feature Control supported.

                                        SCT Data Table supported.

 

SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   175   175   051    Pre-fail  Always       -       163273
  3 Spin_Up_Time            0x0027   168   165   021    Pre-fail  Always       -       6600
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       225
  5 Reallocated_Sector_Ct   0x0033   183   183   140    Pre-fail  Always       -       337
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   056   056   000    Old_age   Always       -       32705
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       103
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       78
193 Load_Cycle_Count        0x0032   066   066   000    Old_age   Always       -       402565
194 Temperature_Celsius     0x0022   120   105   000    Old_age   Always       -       30
196 Reallocated_Event_Count 0x0032   001   001   000    Old_age   Always       -       209
197 Current_Pending_Sector  0x0032   198   196   000    Old_age   Always       -       916
198 Offline_Uncorrectable   0x0030   200   199   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   001   000    Old_age   Always       -       2436312
200 Multi_Zone_Error_Rate   0x0008   153   001   000    Old_age   Offline      -       12772

 

SMART Error Log Version: 1

No Errors Logged

 

SMART Self-test log structure revision number 1

No self-tests have been logged.  [To run self-tests, use: smartctl -t]

 

 

SMART Selective self-test log data structure revision number 1

SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Not_testing

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.
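
On the earlier question of which numbers are actionable: a common rule of thumb is to look at the RAW_VALUE column for reallocated sectors (5, 196), pending sectors (197), offline uncorrectable sectors (198) and CRC errors (199), and to treat anything non-zero there as worth investigating. The sketch below applies that rule to a pasted attribute table. The watch list is a rule of thumb rather than a vendor specification, and the function names are made up for this example. The sample rows are copied from the DISK1 report above.

```python
import re

# Rule-of-thumb watch list (an assumption, not a vendor specification):
# a non-zero RAW_VALUE on any of these is usually worth investigating.
WATCH = (5, 196, 197, 198, 199)

# Matches one row of the "Vendor Specific SMART Attributes" table.
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"\S+\s+\S+\s+\S+\s+(\d+)"
)


def parse_attributes(report):
    """Return {id: {name, value, worst, thresh, raw}} from smartctl text."""
    rows = {}
    for line in report.splitlines():
        m = ROW.match(line)
        if m:
            rows[int(m.group(1))] = {
                "name": m.group(2),
                "value": int(m.group(3)),
                "worst": int(m.group(4)),
                "thresh": int(m.group(5)),
                "raw": int(m.group(6)),
            }
    return rows


def flag(rows):
    """Return {attribute name: raw value} for watched attributes above zero."""
    return {
        rows[i]["name"]: rows[i]["raw"]
        for i in WATCH
        if i in rows and rows[i]["raw"] > 0
    }


if __name__ == "__main__":
    disk1 = """
      5 Reallocated_Sector_Ct   0x0033  183  183  140  Pre-fail  Always   -  337
    196 Reallocated_Event_Count 0x0032  001  001  000  Old_age   Always   -  209
    197 Current_Pending_Sector  0x0032  198  196  000  Old_age   Always   -  916
    198 Offline_Uncorrectable   0x0030  200  199  000  Old_age   Offline  -  0
    199 UDMA_CRC_Error_Count    0x0032  200  001  000  Old_age   Always   -  2436312
    """
    for name, raw in flag(parse_attributes(disk1)).items():
        print(f"{name}: {raw}")
```

For the DISK1 rows this prints the reallocated, reallocation-event, pending and CRC counts, and skips Offline_Uncorrectable because its raw value is 0. CRC errors usually point at cabling or connections rather than the platters, so they are read differently from the sector counts.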

 

  • Author

DISK2

smartctl -a -A /dev/hdd

 

=== START OF INFORMATION SECTION ===

Model Family:    Toshiba 3.5" HDD DT01ACA...

Device Model:    TOSHIBA DT01ACA300

Serial Number:    44LX8U5KS

LU WWN Device Id: 5 000039 ff4daf2f9

Firmware Version: MX6OABB0

User Capacity:    3,000,592,982,016 bytes [3.00 TB]

Sector Sizes:    512 bytes logical, 4096 bytes physical

Rotation Rate:    7200 rpm

Device is:        In smartctl database [for details use: -P show]

ATA Version is:  ATA8-ACS T13/1699-D revision 4

SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)

Local Time is:    Thu Feb 19 12:32:11 2015 PST

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

 

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

 

General SMART Values:

Offline data collection status:  (0x84) Offline data collection activity

                                        was suspended by an interrupting command from host.

                                        Auto Offline Data Collection: Enabled.

Self-test execution status:      (  0) The previous self-test routine completed

                                        without error or no self-test has ever

                                        been run.

Total time to complete Offline

data collection:                (21648) seconds.

Offline data collection

capabilities:                    (0x5b) SMART execute Offline immediate.

                                        Auto Offline data collection on/off support.

                                        Suspend Offline collection upon new

                                        command.

                                        Offline surface scan supported.

                                        Self-test supported.

                                        No Conveyance Self-test supported.

                                        Selective Self-test supported.

SMART capabilities:            (0x0003) Saves SMART data before entering

                                        power-saving mode.

                                        Supports SMART auto save timer.

Error logging capability:        (0x01) Error logging supported.

                                        General Purpose Logging supported.

Short self-test routine

recommended polling time:        (  1) minutes.

Extended self-test routine

recommended polling time:        ( 361) minutes.

SCT capabilities:              (0x003d) SCT Status supported.

                                        SCT Error Recovery Control supported.

                                        SCT Feature Control supported.

                                        SCT Data Table supported.

 

SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   016    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   140   140   054    Pre-fail  Offline      -       67
  3 Spin_Up_Time            0x0007   135   135   024    Pre-fail  Always       -       425 (Average 425)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       64
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   100   100   067    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0005   124   124   020    Pre-fail  Offline      -       33
  9 Power_On_Hours          0x0012   100   100   000    Old_age   Always       -       1316
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       31
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       64
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       64
194 Temperature_Celsius     0x0002   200   200   000    Old_age   Always       -       30 (Min/Max 20/47)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0

 

SMART Error Log Version: 1

No Errors Logged

 

SMART Self-test log structure revision number 1

No self-tests have been logged.  [To run self-tests, use: smartctl -t]

 

 

SMART Selective self-test log data structure revision number 1

SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Not_testing

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.

  • Author

DISK3

smartctl -a -A /dev/hdc

 

=== START OF INFORMATION SECTION ===

Device Model:    ST2000DL004 HD204UI

Serial Number:    S2H7J90C619456

LU WWN Device Id: 5 0004cf 207ba9ce5

Firmware Version: 1AQ10001

User Capacity:    2,000,398,934,016 bytes [2.00 TB]

Sector Size:      512 bytes logical/physical

Rotation Rate:    5400 rpm

Device is:        Not in smartctl database [for details use: -P showall]

ATA Version is:  ATA8-ACS T13/1699-D revision 6

SATA Version is:  SATA 2.6, 3.0 Gb/s

Local Time is:    Thu Feb 19 12:33:37 2015 PST

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

 

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

 

General SMART Values:

Offline data collection status:  (0x00) Offline data collection activity

                                        was never started.

                                        Auto Offline Data Collection: Disabled.

Self-test execution status:      (  0) The previous self-test routine completed

                                        without error or no self-test has ever

                                        been run.

Total time to complete Offline

data collection:                (20100) seconds.

Offline data collection

capabilities:                    (0x5b) SMART execute Offline immediate.

                                        Auto Offline data collection on/off support.

                                        Suspend Offline collection upon new

                                        command.

                                        Offline surface scan supported.

                                        Self-test supported.

                                        No Conveyance Self-test supported.

                                        Selective Self-test supported.

SMART capabilities:            (0x0003) Saves SMART data before entering

                                        power-saving mode.

                                        Supports SMART auto save timer.

Error logging capability:        (0x01) Error logging supported.

                                        General Purpose Logging supported.

Short self-test routine

recommended polling time:        (  2) minutes.

Extended self-test routine

recommended polling time:        ( 335) minutes.

SCT capabilities:              (0x003f) SCT Status supported.

                                        SCT Error Recovery Control supported.

                                        SCT Feature Control supported.

                                        SCT Data Table supported.

 

SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       400
  2 Throughput_Performance  0x0026   252   252   000    Old_age   Always       -       0
  3 Spin_Up_Time            0x0023   076   066   025    Pre-fail  Always       -       7351
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       109
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   252   252   051    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0024   252   252   015    Old_age   Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       15585
 10 Spin_Retry_Count        0x0032   252   252   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   252   252   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       86
181 Program_Fail_Cnt_Total  0x0022   090   090   000    Old_age   Always       -       236498312
191 G-Sense_Error_Rate      0x0022   252   252   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0022   252   252   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0002   064   056   000    Old_age   Always       -       36 (Min/Max 17/45)
195 Hardware_ECC_Recovered  0x003a   100   100   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   252   252   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   252   252   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0036   064   001   000    Old_age   Always       -       19229
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       75
223 Load_Retry_Count        0x0032   252   252   000    Old_age   Always       -       0
225 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       116

 

SMART Error Log Version: 1

No Errors Logged

 

SMART Self-test log structure revision number 1

No self-tests have been logged.  [To run self-tests, use: smartctl -t]

 

 

SMART Selective self-test log data structure revision number 0

Note: revision number not 1 implies that no selective self-test has ever been run

SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Completed [00% left] (0-65535)

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.

 

  • Author

DISK4

smartctl -a -A /dev/sdb

 

=== START OF INFORMATION SECTION ===

Model Family:    Western Digital Caviar Green (AF)

Device Model:    WDC WD20EARS-00MVWB0

Serial Number:    WD-WMAZ20072749

LU WWN Device Id: 5 0014ee 6557bef2b

Firmware Version: 50.0AB50

User Capacity:    2,000,397,852,160 bytes [2.00 TB]

Sector Size:      512 bytes logical/physical

Device is:        In smartctl database [for details use: -P show]

ATA Version is:  ATA8-ACS (minor revision not indicated)

SATA Version is:  SATA 2.6, 3.0 Gb/s

Local Time is:    Thu Feb 19 12:19:17 2015 PST

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

 

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

 

General SMART Values:

Offline data collection status:  (0x84) Offline data collection activity

                                        was suspended by an interrupting command from host.

                                        Auto Offline Data Collection: Enabled.

Self-test execution status:      (  0) The previous self-test routine completed

                                        without error or no self-test has ever

                                        been run.

Total time to complete Offline

data collection:                (35400) seconds.

Offline data collection

capabilities:                    (0x7b) SMART execute Offline immediate.

                                        Auto Offline data collection on/off support.

                                        Suspend Offline collection upon new

                                        command.

                                        Offline surface scan supported.

                                        Self-test supported.

                                        Conveyance Self-test supported.

                                        Selective Self-test supported.

SMART capabilities:            (0x0003) Saves SMART data before entering

                                        power-saving mode.

                                        Supports SMART auto save timer.

Error logging capability:        (0x01) Error logging supported.

                                        General Purpose Logging supported.

Short self-test routine

recommended polling time:        (  2) minutes.

Extended self-test routine

recommended polling time:        ( 403) minutes.

Conveyance self-test routine

recommended polling time:        (  5) minutes.

SCT capabilities:              (0x3035) SCT Status supported.

                                        SCT Feature Control supported.

                                        SCT Data Table supported.

 

SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   169   165   021    Pre-fail  Always       -       6550
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       156
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   056   056   000    Old_age   Always       -       32796
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       94
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       65
193 Load_Cycle_Count        0x0032   160   160   000    Old_age   Always       -       120922
194 Temperature_Celsius     0x0022   116   107   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       1
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

 

SMART Error Log Version: 1

No Errors Logged

 

SMART Self-test log structure revision number 1

No self-tests have been logged.  [To run self-tests, use: smartctl -t]

 

 

SMART Selective self-test log data structure revision number 1

SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Not_testing

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.

 

  • Author

DISK5

smartctl -a -A /dev/sdc

 

=== START OF INFORMATION SECTION ===

Model Family:    Seagate Barracuda 7200.11

Device Model:    ST31500341AS

Serial Number:    9VS2MF3Z

LU WWN Device Id: 5 000c50 00fdf5db3

Firmware Version: CC1H

User Capacity:    1,500,301,910,016 bytes [1.50 TB]

Sector Size:      512 bytes logical/physical

Rotation Rate:    7200 rpm

Device is:        In smartctl database [for details use: -P show]

ATA Version is:  ATA8-ACS T13/1699-D revision 4

SATA Version is:  SATA 2.6, 3.0 Gb/s

Local Time is:    Thu Feb 19 12:22:15 2015 PST

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

 

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

 

General SMART Values:

Offline data collection status:  (0x82) Offline data collection activity

                                        was completed without error.

                                        Auto Offline Data Collection: Enabled.

Self-test execution status:      (  0) The previous self-test routine completed

                                        without error or no self-test has ever

                                        been run.

Total time to complete Offline

data collection:                (  625) seconds.

Offline data collection

capabilities:                    (0x7b) SMART execute Offline immediate.

                                        Auto Offline data collection on/off support.

                                        Suspend Offline collection upon new

                                        command.

                                        Offline surface scan supported.

                                        Self-test supported.

                                        Conveyance Self-test supported.

                                        Selective Self-test supported.

SMART capabilities:            (0x0003) Saves SMART data before entering

                                        power-saving mode.

                                        Supports SMART auto save timer.

Error logging capability:        (0x01) Error logging supported.

                                        General Purpose Logging supported.

Short self-test routine

recommended polling time:        (  1) minutes.

Extended self-test routine

recommended polling time:        ( 292) minutes.

Conveyance self-test routine

recommended polling time:        (  2) minutes.

SCT capabilities:              (0x103f) SCT Status supported.

                                        SCT Error Recovery Control supported.

                                        SCT Feature Control supported.

                                        SCT Data Table supported.

 

SMART Attributes Data Structure revision number: 10

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE

  1 Raw_Read_Error_Rate    0x000f  117  100  006    Pre-fail  Always      -      144603021

  3 Spin_Up_Time            0x0003  092  092  000    Pre-fail  Always      -      0

  4 Start_Stop_Count        0x0032  100  100  020    Old_age  Always      -      19

  5 Reallocated_Sector_Ct  0x0033  100  100  036    Pre-fail  Always      -      1

  7 Seek_Error_Rate        0x000f  100  253  030    Pre-fail  Always      -      523678

  9 Power_On_Hours          0x0032  099  099  000    Old_age  Always      -      1172

10 Spin_Retry_Count        0x0013  100  100  097    Pre-fail  Always      -      0

12 Power_Cycle_Count      0x0032  100  100  020    Old_age  Always      -      5

184 End-to-End_Error        0x0032  100  100  099    Old_age  Always      -      0

187 Reported_Uncorrect      0x0032  100  100  000    Old_age  Always      -      0

188 Command_Timeout        0x0032  100  100  000    Old_age  Always      -      0

189 High_Fly_Writes        0x003a  094  094  000    Old_age  Always      -      6

190 Airflow_Temperature_Cel 0x0022  071  054  045    Old_age  Always      -      29 (Min/Max 23/46)

194 Temperature_Celsius    0x0022  029  046  000    Old_age  Always      -      29 (0 20 0 0 0)

195 Hardware_ECC_Recovered  0x001a  051  040  000    Old_age  Always      -      144603021

197 Current_Pending_Sector  0x0012  100  100  000    Old_age  Always      -      0

198 Offline_Uncorrectable  0x0010  100  100  000    Old_age  Offline      -      0

199 UDMA_CRC_Error_Count    0x003e  200  200  000    Old_age  Always      -      0

240 Head_Flying_Hours      0x0000  100  253  000    Old_age  Offline      -      36747740184703

241 Total_LBAs_Written      0x0000  100  253  000    Old_age  Offline      -      848406174

242 Total_LBAs_Read        0x0000  100  253  000    Old_age  Offline      -      3818515775

 

SMART Error Log Version: 1

No Errors Logged

 

SMART Self-test log structure revision number 1

No self-tests have been logged.  [To run self-tests, use: smartctl -t]

 

 

SMART Selective self-test log data structure revision number 1

SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Not_testing

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.

  • Author

POTENTIAL CACHE DRIVE (NOT CURRENTLY SELECTED)

smartctl -a -A /dev/sda

 

=== START OF INFORMATION SECTION ===

Model Family:    Toshiba 3.5" HDD DT01ACA...

Device Model:    TOSHIBA DT01ACA300

Serial Number:    239DZMBGS

LU WWN Device Id: 5 000039 ff4c5e76a

Firmware Version: MX6OABB0

User Capacity:    3,000,592,982,016 bytes [3.00 TB]

Sector Sizes:    512 bytes logical, 4096 bytes physical

Rotation Rate:    7200 rpm

Device is:        In smartctl database [for details use: -P show]

ATA Version is:  ATA8-ACS T13/1699-D revision 4

SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)

Local Time is:    Thu Feb 19 12:13:24 2015 PST

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

 

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

 

General SMART Values:

Offline data collection status:  (0x82) Offline data collection activity

                                        was completed without error.

                                        Auto Offline Data Collection: Enabled.

Self-test execution status:      (  0) The previous self-test routine completed

                                        without error or no self-test has ever

                                        been run.

Total time to complete Offline

data collection:                (22652) seconds.

Offline data collection

capabilities:                    (0x5b) SMART execute Offline immediate.

                                        Auto Offline data collection on/off support.

                                        Suspend Offline collection upon new

                                        command.

                                        Offline surface scan supported.

                                        Self-test supported.

                                        No Conveyance Self-test supported.

                                        Selective Self-test supported.

SMART capabilities:            (0x0003) Saves SMART data before entering

                                        power-saving mode.

                                        Supports SMART auto save timer.

Error logging capability:        (0x01) Error logging supported.

                                        General Purpose Logging supported.

Short self-test routine

recommended polling time:        (  1) minutes.

Extended self-test routine

recommended polling time:        ( 378) minutes.

SCT capabilities:              (0x003d) SCT Status supported.

                                        SCT Error Recovery Control supported.

                                        SCT Feature Control supported.

                                        SCT Data Table supported.

 

SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG    VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE

  1 Raw_Read_Error_Rate    0x000b  100  100  016    Pre-fail  Always      -      0

  2 Throughput_Performance  0x0005  139  139  054    Pre-fail  Offline      -      71

  3 Spin_Up_Time            0x0007  142  142  024    Pre-fail  Always      -      390 (Average 419)

  4 Start_Stop_Count        0x0012  100  100  000    Old_age  Always      -      365

  5 Reallocated_Sector_Ct  0x0033  100  100  005    Pre-fail  Always      -      0

  7 Seek_Error_Rate        0x000b  100  100  067    Pre-fail  Always      -      0

  8 Seek_Time_Performance  0x0005  124  124  020    Pre-fail  Offline      -      33

  9 Power_On_Hours          0x0012  100  100  000    Old_age  Always      -      2514

10 Spin_Retry_Count        0x0013  100  100  060    Pre-fail  Always      -      0

12 Power_Cycle_Count      0x0032  100  100  000    Old_age  Always      -      30

192 Power-Off_Retract_Count 0x0032  100  100  000    Old_age  Always      -      404

193 Load_Cycle_Count        0x0012  100  100  000    Old_age  Always      -      404

194 Temperature_Celsius    0x0002  176  176  000    Old_age  Always      -      34 (Min/Max 17/41)

196 Reallocated_Event_Count 0x0032  100  100  000    Old_age  Always      -      0

197 Current_Pending_Sector  0x0022  100  100  000    Old_age  Always      -      0

198 Offline_Uncorrectable  0x0008  100  100  000    Old_age  Offline      -      0

199 UDMA_CRC_Error_Count    0x000a  200  200  000    Old_age  Always      -      0

 

SMART Error Log Version: 1

No Errors Logged

 

SMART Self-test log structure revision number 1

Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error

# 1  Short offline      Completed without error      00%        0        -

 

SMART Selective self-test log data structure revision number 1

SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Not_testing

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.

  • Author

That was the SMART data for each of the 7 drives connected to my unRAID server. Any information you guys could provide would be appreciated. The drive that seemed to show the errors when I was able to see the log file was DISK1. But based on what I am seeing, maybe others are going to need attention as well?

 

Hopefully I can also get info on the following now too...

 

1. Am I doing something wrong with trying to get the log file with the pertinent info?

2. Is there a simple way to retrieve old log files from the browser interface?

3. Should I be worried about that many errors during a parity check? Should I be ordering a new disk right now and swapping it in immediately?

4. Am I right to avoid the "Correct errors" option when doing the parity check? Or should I feel good about using it because it is there for a reason?

 

Obviously, I'm new. Thanks again for the help!

The SMART report for DISK1 does not look good. There are lots of reallocated sectors, and also a lot of pending sectors (indicating read failures). I would consider replacing that drive ASAP, as those values indicate a disk that could die at any time.

 

I strongly suspect it is the read failures on disk1 that are causing the parity check failures, but it is hard to be sure.

I would use that 'potential' cache drive to replace Disk 1 ASAP.  Have you been preclearing your drives?  Most experienced users recommend doing at least one preclear cycle on every drive before adding it to the array.

 

For more information on the preclear shell program, see here:

 

    http://lime-technology.com/forum/index.php?topic=2817.0

 

It will probably take a couple of days to do a single cycle on your 3TB drive.

 

 

  • Author

Apparently foolishly, I didn't preclear any of the drives. I'll definitely look into it from this point forward.

 

OK, so the next big question is, do I do one last parity check and allow unRAID to try to correct errors? Or does that potentially cause more issues, and I should just yank the failing drive and attempt the rebuild?

 

Thanks

OK, so the next big question is, do I do one last parity check and allow unRAID to try to correct errors? Or does that potentially cause more issues, and I should just yank the failing drive and attempt the rebuild?

Difficult to be sure. It is not clear whether the parity is actually correct and the errors are simply being caused by read errors on the problem drive.

 

My gut feeling is that, with read errors being reported on the disk, running a correcting parity check may just aggravate any potential corruption. It is probably best to simply rebuild onto a new disk and see what happens. Keep the yanked disk somewhere safe, and if the rebuild appears to complete OK, then you could mount the old disk outside the array and do a comparison between that disk and the rebuilt version.
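The comparison between the pulled disk and the rebuilt one could be sketched roughly as below. This is only an illustration: it assumes Python is available on whatever machine has both file trees mounted (it is not part of a stock unRAID install), and the two root paths passed in are hypothetical. Files that cannot be read from the old disk are reported separately, since read errors on that disk are the whole reason for the exercise.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def compare_trees(old_root, new_root):
    """Compare every file under old_root with the same relative path under new_root.

    Returns three lists of relative paths: files missing from new_root,
    files whose contents differ, and files that raised a read error.
    """
    old_root, new_root = Path(old_root), Path(new_root)
    missing, different, unreadable = [], [], []
    for old_file in sorted(p for p in old_root.rglob("*") if p.is_file()):
        rel = old_file.relative_to(old_root)
        new_file = new_root / rel
        if not new_file.is_file():
            missing.append(rel.as_posix())
            continue
        try:
            same = file_digest(old_file) == file_digest(new_file)
        except OSError:
            # A failing disk may return I/O errors for some files.
            unreadable.append(rel.as_posix())
            continue
        if not same:
            different.append(rel.as_posix())
    return missing, different, unreadable
```

Calling `compare_trees("/mnt/old_disk1", "/mnt/disk1")` (both paths are placeholders) would return the three lists. A file listed as different does not say which copy is the good one, so those would still need checking by hand.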

  • Author
I strongly suspect it is the read failures on disk1 that are causing the parity check failures, but it is hard to be sure.

 

I'm pretty sure when I was able to find a log file that it did indeed indicate the errors were on DISK1. There were no other errors on other disks that I could see. However, for the life of me, I cannot find where past log files live. It appears that the current log file is nuked upon restart of the server?

 

Here is my current log file...

 

Feb 12 04:40:02 server syslogd 1.4.1: restart.

Feb 12 07:32:39 server kernel: mdcmd (276): spindown 0

Feb 12 07:32:39 server kernel: mdcmd (277): spindown 1

Feb 12 10:02:01 server kernel: mdcmd (278): spindown 0

Feb 12 10:02:01 server kernel: mdcmd (279): spindown 1

Feb 12 11:04:02 server kernel: mdcmd (280): spindown 1

Feb 12 13:07:34 server kernel: mdcmd (281): spindown 4

Feb 13 01:20:00 server kernel: mdcmd (282): spindown 4

Feb 13 01:39:01 server kernel: mdcmd (283): spindown 0

Feb 13 01:39:01 server kernel: mdcmd (284): spindown 1

Feb 13 05:25:43 server kernel: mdcmd (285): spindown 0

Feb 13 05:25:44 server kernel: mdcmd (286): spindown 1

Feb 13 07:30:45 server kernel: mdcmd (287): spindown 0

Feb 13 07:43:06 server kernel: mdcmd (288): spindown 1

Feb 14 06:32:57 server kernel: mdcmd (289): spindown 1

Feb 14 12:32:02 server kernel: r8168: eth0: link down

Feb 14 12:32:26 server kernel: r8168: eth0: link up

Feb 15 03:34:27 server kernel: mdcmd (290): spindown 0

Feb 15 03:34:28 server kernel: mdcmd (291): spindown 1

Feb 16 02:31:29 server kernel: mdcmd (292): spindown 0

Feb 16 02:31:30 server kernel: mdcmd (293): spindown 1

Feb 16 03:35:58 server emhttp: shcmd (193): :>/etc/samba/smb-shares.conf

Feb 16 03:35:58 server avahi-daemon[1586]: Files changed, reloading.

Feb 16 03:36:13 server emhttp: Restart SMB...

Feb 16 03:36:13 server emhttp: shcmd (194): killall -HUP smbd

Feb 16 03:36:13 server emhttp: shcmd (195): cp /etc/avahi/services/smb.service- /etc/avahi/services/smb.service

Feb 16 03:36:13 server avahi-daemon[1586]: Files changed, reloading.

Feb 16 03:36:13 server avahi-daemon[1586]: Service group file /services/smb.service changed, reloading.

Feb 16 03:36:13 server emhttp: shcmd (196): ps axc | grep -q rpc.mountd

Feb 16 03:36:13 server emhttp: _shcmd: shcmd (196): exit status: 1

Feb 16 03:36:13 server emhttp: shcmd (197): /usr/local/sbin/emhttp_event svcs_restarted

Feb 16 03:36:13 server emhttp_event: svcs_restarted

Feb 16 03:36:14 server avahi-daemon[1586]: Service "server" (/services/smb.service) successfully established.

Feb 16 04:36:04 server kernel: mdcmd (294): spindown 4

Feb 16 04:36:14 server kernel: mdcmd (295): spindown 5

Feb 16 06:50:44 server kernel: ata6.00: exception Emask 0x10 SAct 0x0 SErr 0x400000 action 0x6 frozen

Feb 16 06:50:44 server kernel: ata6.00: irq_stat 0x08000000, interface fatal error

Feb 16 06:50:44 server kernel: ata6: SError: { Handshk }

Feb 16 06:50:44 server kernel: ata6.00: failed command: WRITE DMA EXT

Feb 16 06:50:44 server kernel: ata6.00: cmd 35/00:00:a0:5a:e1/00:04:01:00:00/e0 tag 0 dma 524288 out

Feb 16 06:50:44 server kernel:          res 50/00:00:a0:5a:e1/00:00:01:00:00/e0 Emask 0x10 (ATA bus error)

Feb 16 06:50:44 server kernel: ata6.00: status: { DRDY }

Feb 16 06:50:44 server kernel: ata6: hard resetting link

Feb 16 06:50:45 server kernel: ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)

Feb 16 06:50:45 server kernel: ata6.00: configured for UDMA/133

Feb 16 06:50:45 server kernel: ata6: EH complete

Feb 16 07:59:30 server kernel: ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0

Feb 16 07:59:30 server kernel: ata5.00: irq_stat 0x40000001

Feb 16 07:59:30 server kernel: ata5.00: failed command: READ DMA EXT

Feb 16 07:59:30 server kernel: ata5.00: cmd 25/00:08:50:00:20/00:00:8a:00:00/e0 tag 0 dma 4096 in

Feb 16 07:59:30 server kernel:          res 51/40:08:50:00:20/00:00:8a:00:00/e0 Emask 0x9 (media error)

Feb 16 07:59:30 server kernel: ata5.00: status: { DRDY ERR }

Feb 16 07:59:30 server kernel: ata5.00: error: { UNC }

Feb 16 07:59:30 server kernel: ata5.00: configured for UDMA/133

Feb 16 07:59:30 server kernel: sd 5:0:0:0: [sdd] Unhandled sense code

Feb 16 07:59:30 server kernel: sd 5:0:0:0: [sdd] 

Feb 16 07:59:30 server kernel: Result: hostbyte=0x00 driverbyte=0x08

Feb 16 07:59:30 server kernel: sd 5:0:0:0: [sdd] 

Feb 16 07:59:30 server kernel: Sense Key : 0x3 [current] [descriptor]

Feb 16 07:59:30 server kernel: Descriptor sense data with sense descriptors (in hex):

Feb 16 07:59:30 server kernel:        72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00

Feb 16 07:59:30 server kernel:        8a 20 00 50

Feb 16 07:59:30 server kernel: sd 5:0:0:0: [sdd] 

Feb 16 07:59:30 server kernel: ASC=0x11 ASCQ=0x4

Feb 16 07:59:30 server kernel: sd 5:0:0:0: [sdd] CDB:

Feb 16 07:59:30 server kernel: cdb[0]=0x28: 28 00 8a 20 00 50 00 00 08 00

Feb 16 07:59:30 server kernel: end_request: I/O error, dev sdd, sector 2317353040

Feb 16 07:59:30 server kernel: ata5: EH complete

Feb 16 07:59:30 server kernel: md: disk1 read error, sector=2317352976

Feb 16 07:59:30 server kernel: sd 5:0:0:0: [sdd] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA

Feb 16 08:59:37 server kernel: mdcmd (296): spindown 2

Feb 16 08:59:37 server kernel: mdcmd (297): spindown 3

Feb 16 08:59:40 server kernel: mdcmd (298): spindown 4

Feb 16 08:59:40 server kernel: mdcmd (299): spindown 5

Feb 16 09:50:11 server kernel: mdcmd (300): spindown 0

Feb 16 10:59:22 server kernel: mdcmd (301): spindown 1

Feb 16 13:05:03 server kernel: mdcmd (302): spindown 1

Feb 17 09:33:43 server kernel: mdcmd (303): spindown 0

Feb 17 09:33:44 server kernel: mdcmd (304): spindown 1

Feb 18 00:49:30 server kernel: ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0

Feb 18 00:49:30 server kernel: ata5.00: irq_stat 0x40000001

Feb 18 00:49:30 server kernel: ata5.00: failed command: READ DMA EXT

Feb 18 00:49:30 server kernel: ata5.00: cmd 25/00:c8:48:00:e8/00:00:5f:00:00/e0 tag 0 dma 102400 in

Feb 18 00:49:30 server kernel:          res 51/40:c8:48:00:e8/00:00:5f:00:00/e0 Emask 0x9 (media error)

Feb 18 00:49:30 server kernel: ata5.00: status: { DRDY ERR }

Feb 18 00:49:30 server kernel: ata5.00: error: { UNC }

Feb 18 00:49:30 server kernel: ata5.00: configured for UDMA/133

Feb 18 00:49:30 server kernel: sd 5:0:0:0: [sdd] Unhandled sense code

Feb 18 00:49:30 server kernel: sd 5:0:0:0: [sdd] 

Feb 18 00:49:30 server kernel: Result: hostbyte=0x00 driverbyte=0x08

Feb 18 00:49:30 server kernel: sd 5:0:0:0: [sdd] 

Feb 18 00:49:30 server kernel: Sense Key : 0x3 [current] [descriptor]

Feb 18 00:49:30 server kernel: Descriptor sense data with sense descriptors (in hex):

Feb 18 00:49:30 server kernel:        72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00

Feb 18 00:49:30 server kernel:        5f e8 00 48

Feb 18 00:49:30 server kernel: sd 5:0:0:0: [sdd] 

Feb 18 00:49:30 server kernel: ASC=0x11 ASCQ=0x4

Feb 18 00:49:30 server kernel: sd 5:0:0:0: [sdd] CDB:

Feb 18 00:49:30 server kernel: cdb[0]=0x28: 28 00 5f e8 00 48 00 00 c8 00

Feb 18 00:49:30 server kernel: end_request: I/O error, dev sdd, sector 1609039944

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609039880

Feb 18 00:49:30 server kernel: ata5: EH complete

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609039888

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609039896

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609039904

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609039912

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609039920

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609039928

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609039936

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609039944

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609039952

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609039960

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609039968

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609039976

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609039984

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609039992

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609040000

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609040008

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609040016

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609040024

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609040032

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609040040

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609040048

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609040056

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609040064

Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609040072

Feb 18 01:49:32 server kernel: mdcmd (305): spindown 2

Feb 18 01:49:32 server kernel: mdcmd (306): spindown 3

Feb 18 01:49:35 server kernel: mdcmd (307): spindown 4

Feb 18 01:49:35 server kernel: mdcmd (308): spindown 5

Feb 18 02:46:56 server kernel: mdcmd (309): spindown 0

Feb 18 02:46:56 server kernel: mdcmd (310): spindown 1

Feb 18 08:36:40 server kernel: mdcmd (311): spindown 0

Feb 18 08:36:40 server kernel: mdcmd (312): spindown 1

Feb 19 06:12:51 server kernel: mdcmd (313): spindown 0

Feb 19 10:19:33 server kernel: mdcmd (314): spindown 0

Feb 19 10:19:34 server kernel: mdcmd (315): spindown 1

Feb 19 12:12:23 server in.telnetd[16918]: connect from 192.168.0.100 (192.168.0.100)

Feb 19 12:12:42 server login[16919]: ROOT LOGIN  on '/dev/pts/0' from '192.168.0.100'

Feb 19 13:01:36 server kernel: mdcmd (316): spindown 1

Feb 19 13:02:26 server kernel: mdcmd (317): spindown 0

Feb 19 13:10:07 server kernel: mdcmd (318): spindown 4
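For anyone who wants to tally a log like the one above rather than read it by eye, here is a minimal sketch in Python. It only assumes the two kernel message formats that actually appear in this log (`md: diskN read error, sector=...` and `end_request: I/O error, dev ..., sector ...`); the function name and the sample lines (copied from the log) are just for illustration.

```python
import re
from collections import Counter

# Patterns for the two kinds of kernel lines seen in the log above.
MD_ERR = re.compile(r"md: (disk\d+) read error, sector=(\d+)")
DEV_ERR = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")


def summarize(lines):
    """Count md read errors per array disk and list device-level I/O errors.

    Also records, for each device error, the difference between its sector
    number and the sector of the first md error logged after it.
    """
    md_counts = Counter()
    dev_errors = []
    offsets = []
    pending = None  # sector of the most recent device error not yet paired
    for line in lines:
        m = DEV_ERR.search(line)
        if m:
            pending = int(m.group(2))
            dev_errors.append((m.group(1), pending))
            continue
        m = MD_ERR.search(line)
        if m:
            md_counts[m.group(1)] += 1
            if pending is not None:
                offsets.append(pending - int(m.group(2)))
                pending = None
    return md_counts, dev_errors, offsets


sample = [
    "Feb 16 07:59:30 server kernel: end_request: I/O error, dev sdd, sector 2317353040",
    "Feb 16 07:59:30 server kernel: ata5: EH complete",
    "Feb 16 07:59:30 server kernel: md: disk1 read error, sector=2317352976",
    "Feb 18 00:49:30 server kernel: end_request: I/O error, dev sdd, sector 1609039944",
    "Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609039880",
    "Feb 18 00:49:30 server kernel: md: disk1 read error, sector=1609039888",
]

md_counts, dev_errors, offsets = summarize(sample)
print(dict(md_counts))
print(dev_errors)
print(offsets)
```

On these sample lines every md error is on disk1 and every device error is on sdd, which matches what I thought I saw the first time. The difference between the device sector and the md sector is 64 in both cases; my guess is that this is just the partition starting 64 sectors into the drive, but that is an interpretation and not something the log states.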

Yes - the logs are nuked on each reboot as they are held in RAM unless you take action to save them to persistent media before rebooting. 
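As a rough sketch of what "save them to persistent media" could look like, the snippet below copies the live log to a timestamped file. The defaults are assumptions on my part: `/var/log/syslog` as the location of the in-RAM log and `/boot/logs` as a folder on the flash drive. Python is also not part of a stock unRAID install, so treat this as an illustration of the idea (a plain copy of the file does the same job).

```python
import shutil
import time
from pathlib import Path


def save_syslog(src="/var/log/syslog", dest_dir="/boot/logs"):
    """Copy the current syslog to dest_dir under a timestamped file name.

    Both default paths are assumptions; adjust them to the actual system.
    Returns the path of the saved copy.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    target = dest / f"syslog-{time.strftime('%Y%m%d-%H%M%S')}.txt"
    shutil.copy(src, target)
    return target
```

Running `save_syslog()` before each reboot (or after a parity check finishes) would keep a copy that survives the restart.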

 

The SMART report shows lots of pending sectors on disk1, and since these typically indicate a read failure, I would think that trying to correct parity while these are present will just end up creating invalid parity.

Personally, I would preclear that spare disk and use it to replace disk 1.  If the array rebuilds correctly, you can begin to play around with the old Disk 1. 

 

If you have a problem with the rebuild, you can try to recover the data on that old disk 1.  You may or may not be successful!

 

If the rebuild goes without a problem (and this is the most likely scenario), you can try to preclear that old disk 1. If it passes the first pass, I personally would do another two passes before I used it again in my server. (BUT I would bet it will fail! That disk is old and there are a massive number of failures on three of the most watched parameters for failures on an unRAID server, ID# 5, 196 and 197. If these counts for 196 and 197 were under 10, the chances would be better.)
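To check those three attributes across several drives without reading each report by hand, something like the following Python sketch could be used on saved `smartctl -A` output. The row pattern is based on the attribute table layout in the reports posted in this thread. The limit of 10 is taken loosely from the remark above about 196 and 197; applying the same limit to ID# 5 is a simplification of mine, not a rule. The sample rows are copied from the DISK5 report.

```python
import re

# The three attributes singled out above.
WATCHED = {
    5: "Reallocated_Sector_Ct",
    196: "Reallocated_Event_Count",
    197: "Current_Pending_Sector",
}

# ID, name, flag, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, leading raw integer
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]{4}\s+\d+\s+\d+\s+\d+\s+\S+\s+\S+\s+\S+\s+(\d+)"
)


def watched_raw_values(report_lines):
    """Return {attribute_id: raw_value} for the watched attributes that appear."""
    found = {}
    for line in report_lines:
        m = ROW.match(line)
        if m and int(m.group(1)) in WATCHED:
            found[int(m.group(1))] = int(m.group(3))
    return found


def needs_attention(raw_values, limit=10):
    """Return the watched attributes whose raw count is at or above limit."""
    return {attr_id: raw for attr_id, raw in raw_values.items() if raw >= limit}


disk5_rows = [
    "  5 Reallocated_Sector_Ct  0x0033  100  100  036    Pre-fail  Always      -      1",
    "  9 Power_On_Hours          0x0032  099  099  000    Old_age  Always      -      1172",
    "197 Current_Pending_Sector  0x0012  100  100  000    Old_age  Always      -      0",
]

values = watched_raw_values(disk5_rows)
print(values)
print(needs_attention(values))
```

Not every drive reports all three attributes (the DISK5 report has no 196 row), so a missing entry just means the drive does not expose that attribute.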

Archived

This topic is now archived and is closed to further replies.
