failing parity disk, interpreting SMART reports


Just to preface: after reading many forum threads, I am aware that I did not handle this troubleshooting process in the ideal way initially. I am hoping for some guidance at my current point.

This started when I noticed that my parity disk had thrown thousands of errors per the Unraid main page. I had not run a parity check in a few weeks, and all previous parity checks had been error free. None of my 3 data drives were showing any errors on the main page at this point, and everything was still green balled. I unfortunately decided to reboot without grabbing a syslog, then ran a parity check, which progressed at an incredibly slow pace. I cancelled the parity check and bought a new HDD to replace my failing parity drive.

After performing the swap, I started the parity-sync, which has now finished and found 7 errors, all of them on disk3 (a 2TB drive that is about half full of data). There were no errors on any of the other drives. At this point, I assume there is no real way to determine whether I suffered any data loss.

Posted below are the syslog showing the errors during the parity-sync and the SMART report for disk3. My best interpretation is that I probably did not suffer any data loss, but that I should replace disk3 with a new drive and let my new parity disk rebuild it. Please advise, thanks!

 

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD20EARX-00PASB0
Serial Number:    WD-WCAZAH474551
Firmware Version: 51.0AB51
User Capacity:    2,000,398,934,016 bytes
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   8
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Mon Nov  3 20:49:25 2014 EST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

 

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 113) The previous self-test completed having
                                        the read element of the test failed.
Total time to complete Offline
data collection:                (38880) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (  2) minutes.
Extended self-test routine
recommended polling time:        ( 255) minutes.
Conveyance self-test routine
recommended polling time:        (  5) minutes.
SCT capabilities:              (0x3035) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

 

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   145   145   051    Pre-fail  Always       -       213849
  3 Spin_Up_Time            0x0027   213   174   021    Pre-fail  Always       -       4316
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       810
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       37
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   080   080   000    Old_age   Always       -       15027
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       14
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       7
193 Load_Cycle_Count        0x0032   198   198   000    Old_age   Always       -       6107
194 Temperature_Celsius     0x0022   128   116   000    Old_age   Always       -       22
196 Reallocated_Event_Count 0x0032   192   192   000    Old_age   Always       -       8
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       1
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   128   113   000    Old_age   Offline      -       19442

 

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       10%     15024         1203316928

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

syslog.txt

There's a bad spot on the disk at the LBA reported in the self-test log. I would try to get my data off that disk any way possible, then run a preclear.

 

 

You can exercise the full drive with badblocks in read-only mode.

You can also try a SMART long test.
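
For reference, here is a rough sketch of what those two suggestions look like as commands, written as small Python helpers that only build the argument lists. The device path /dev/sdX is a placeholder, not the poster's actual device, and the helper names are made up for illustration. badblocks does a read-only scan by default (no -w or -n), and smartctl -t long starts the extended self-test in the background.

```python
import shlex


def readonly_scan_cmd(device):
    """badblocks in its default read-only mode; -s shows progress, -v is verbose."""
    return ["badblocks", "-sv", device]


def smart_long_test_cmd(device):
    """Start an extended (long) SMART self-test; it runs on the drive itself."""
    return ["smartctl", "-t", "long", device]


def smart_selftest_log_cmd(device):
    """Read back the self-test log once the long test has finished."""
    return ["smartctl", "-l", "selftest", device]


if __name__ == "__main__":
    device = "/dev/sdX"  # placeholder: substitute the real device for disk3
    for cmd in (
        readonly_scan_cmd(device),
        smart_long_test_cmd(device),
        smart_selftest_log_cmd(device),
    ):
        # Print the commands instead of running them, so nothing touches a disk.
        print(shlex.join(cmd))
```

The report above lists the extended self-test polling time as 255 minutes, so expect the long test to take several hours before the self-test log shows a result.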

 

 

What has me curious is that there is one offline uncorrectable sector but no pending sectors.

I'm not sure whether the offline uncorrectable sector is where the LBA error is, or whether there's a new spot that's causing the issue.
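
To put that LBA in context, here is a quick bit of arithmetic using the numbers from the posted report (LBA_of_first_error 1203316928 and User Capacity 2,000,398,934,016 bytes). It assumes smartctl is reporting LBAs in 512-byte logical sectors, which I believe is the case for this drive but is not stated in the report. The function names are just for illustration.

```python
SECTOR_BYTES = 512  # assumption: LBAs are counted in 512-byte logical sectors


def lba_to_byte_offset(lba, sector_bytes=SECTOR_BYTES):
    """Byte offset from the start of the disk for a given LBA."""
    return lba * sector_bytes


def lba_position_fraction(lba, capacity_bytes, sector_bytes=SECTOR_BYTES):
    """How far into the disk the LBA sits, as a fraction of total capacity."""
    return lba_to_byte_offset(lba, sector_bytes) / capacity_bytes


FIRST_ERROR_LBA = 1203316928           # from the self-test log above
USER_CAPACITY_BYTES = 2_000_398_934_016  # from the information section above

offset = lba_to_byte_offset(FIRST_ERROR_LBA)
fraction = lba_position_fraction(FIRST_ERROR_LBA, USER_CAPACITY_BYTES)

print(offset)                    # 616098267136
print(round(fraction * 100, 1))  # 30.8
```

So the failed read is roughly 30.8% of the way into the disk. That only locates the bad spot; it does not tell you whether it is the same sector counted under Offline_Uncorrectable.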

 

 

You can try the Western Digital utilities also.

 

 

Once you get your data safely off the drive, chances are a preclear will help clear or reallocate any bad spots.

Watch the reallocated count. If that starts climbing fast and/or high, the drive should be RMA'd.
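
If you want to track that without eyeballing reports, something like the following could work: a minimal sketch that pulls the raw values out of the attribute table of two saved smartctl reports and lists which of the sector-related counters went up. It assumes the usual ten-column attribute layout shown in the report above. The FIRST rows are copied from that report; the LATER rows are made-up numbers purely to show the comparison, not real readings.

```python
WATCHED_IDS = {5, 196, 197, 198}  # reallocated, realloc events, pending, offline uncorrectable


def parse_attributes(report_text):
    """Return {id: (name, raw_value)} for rows of a smartctl attribute table."""
    attrs = {}
    for line in report_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have a hex FLAG in column 3.
        if len(fields) >= 10 and fields[0].isdigit() and fields[2].startswith("0x"):
            try:
                raw = int(fields[9])
            except ValueError:
                continue  # skip rows whose raw value is not a plain integer
            attrs[int(fields[0])] = (fields[1], raw)
    return attrs


def climbing(before, after, ids=WATCHED_IDS):
    """Return {name: (old, new)} for watched attributes whose raw value rose."""
    rising = {}
    for attr_id in sorted(ids):
        if attr_id in before and attr_id in after:
            name, old = before[attr_id]
            _, new = after[attr_id]
            if new > old:
                rising[name] = (old, new)
    return rising


# Rows copied from the report posted above.
FIRST = """
  5 Reallocated_Sector_Ct  0x0033  200  200  140    Pre-fail  Always      -      37
196 Reallocated_Event_Count 0x0032  192  192  000    Old_age  Always      -      8
197 Current_Pending_Sector  0x0032  200  200  000    Old_age  Always      -      0
198 Offline_Uncorrectable  0x0030  200  200  000    Old_age  Offline      -      1
"""

# HYPOTHETICAL later snapshot, invented only to demonstrate the comparison.
LATER = FIRST.replace("-      37", "-      52").replace("-      8", "-      11")

print(climbing(parse_attributes(FIRST), parse_attributes(LATER)))
```

How fast is "too fast" is a judgment call; the sketch only reports what changed between two snapshots.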

 

 

Archived

This topic is now archived and is closed to further replies.
