[SOLVED] Possible Drive Failure

Hey,

 

I've started finding the server unresponsive more and more often, and I'm having to do a hard reset as I can't remote into it.

I also get lag when streaming to Kodi: playback freezes for about 10 seconds, then seems to fast-forward to the relevant point, playing everything at high speed. It's been happening more and more.

 

I've attached the diagnostics file. Usually I get nothing in the syslog, but this time it is showing disk 6 read errors. This is one of the older drives in the array (circa 5-6 years old), though there are older ones. If this is the one that's failing, would it affect everything, though?

Disk 6 is WDC_WD20EARX-00PASB0_WD-WCAZAH738536-20160929-1125.txt

 

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-4.1.7-unRAID] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Green (AF, SATA 6Gb/s)
Device Model:     WDC WD20EARX-00PASB0
Serial Number:    WD-WCAZAH738536
LU WWN Device Id: 5 0014ee 2b209029b
Firmware Version: 51.0AB51
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Thu Sep 29 11:30:52 2016 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
				was completed without error.
				Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
				without error or no self-test has ever 
				been run.
Total time to complete Offline 
data collection: 		(38580) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
				Auto Offline data collection on/off support.
				Suspend Offline collection upon new
				command.
				Offline surface scan supported.
				Self-test supported.
				Conveyance Self-test supported.
				Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
				power-saving mode.
				Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
				General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 372) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x3035)	SCT Status supported.
				SCT Feature Control supported.
				SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   199   199   051    Pre-fail  Always       -       30228
  3 Spin_Up_Time            0x0027   164   161   021    Pre-fail  Always       -       6800
  4 Start_Stop_Count        0x0032   096   096   000    Old_age   Always       -       4460
  5 Reallocated_Sector_Ct   0x0033   145   145   140    Pre-fail  Always       -       2327
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   059   059   000    Old_age   Always       -       29996
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       117
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       46
193 Load_Cycle_Count        0x0032   171   171   000    Old_age   Always       -       89615
194 Temperature_Celsius     0x0022   115   099   000    Old_age   Always       -       35
196 Reallocated_Event_Count 0x0032   001   001   000    Old_age   Always       -       434
197 Current_Pending_Sector  0x0032   200   198   000    Old_age   Always       -       174
198 Offline_Uncorrectable   0x0030   198   198   000    Old_age   Offline      -       754
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   162   162   000    Old_age   Offline      -       10145

SMART Error Log Version: 1
ATA Error Count: 7 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 7 occurred at disk power-on lifetime: 29909 hours (1246 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 90 0b 67 ea  Error: UNC 8 sectors at LBA = 0x0a670b90 = 174525328

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 90 0b 67 ea 08      00:50:57.339  READ DMA
  c8 00 08 f8 2f 66 ea 08      00:50:55.490  READ DMA
  c8 00 08 48 3b 64 ea 08      00:50:54.319  READ DMA
  c8 00 08 b8 3f 64 ea 08      00:50:52.490  READ DMA
  c8 00 08 e8 cb 64 ea 08      00:50:51.089  READ DMA

Error 6 occurred at disk power-on lifetime: 29438 hours (1226 days + 14 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 a8 09 28 ef  Error: UNC 8 sectors at LBA = 0x0f2809a8 = 254282152

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 a8 09 28 ef 08      01:24:24.444  READ DMA
  ca 00 08 a0 09 28 ef 08      01:24:24.444  WRITE DMA
  ef 10 02 00 00 00 a0 08      01:24:24.444  SET FEATURES [Enable SATA feature]
  ec 00 00 00 00 00 a0 08      01:24:24.438  IDENTIFY DEVICE

Error 5 occurred at disk power-on lifetime: 29438 hours (1226 days + 14 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 a0 09 28 ef  Error: UNC 8 sectors at LBA = 0x0f2809a0 = 254282144

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 a0 09 28 ef 08      01:24:21.168  READ DMA
  c8 00 08 98 09 28 ef 08      01:24:21.168  READ DMA
  c8 00 08 90 09 28 ef 08      01:24:21.168  READ DMA
  c8 00 08 88 09 28 ef 08      01:24:21.168  READ DMA
  c8 00 08 80 09 28 ef 08      01:24:21.168  READ DMA

Error 4 occurred at disk power-on lifetime: 29437 hours (1226 days + 13 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 28 90 76 2f e2  Error: UNC 40 sectors at LBA = 0x022f7690 = 36664976

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 28 90 76 2f e2 08      00:11:46.743  READ DMA
  ca 00 08 88 6f 2f e2 08      00:11:46.742  WRITE DMA
  ca 00 08 90 6f 2f e2 08      00:11:46.742  WRITE DMA
  ca 00 08 98 6f 2f e2 08      00:11:46.742  WRITE DMA
  ca 00 08 a0 6f 2f e2 08      00:11:46.742  WRITE DMA

Error 3 occurred at disk power-on lifetime: 28284 hours (1178 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 10 88 0d 6f ef  Error: UNC 16 sectors at LBA = 0x0f6f0d88 = 258936200

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 10 88 0d 6f ef 00   9d+05:21:54.038  READ DMA
  c8 00 08 80 0d 6f ef 00   9d+05:21:54.038  READ DMA
  c8 00 08 78 0d 6f ef 00   9d+05:21:54.038  READ DMA
  c8 00 08 70 0d 6f ef 00   9d+05:21:54.038  READ DMA
  c8 00 08 68 0d 6f ef 00   9d+05:21:54.038  READ DMA

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     14007         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
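
For anyone decoding the error log entries above: the LBA that smartctl prints can be reproduced from the register row. Here is a minimal Python sketch, assuming the standard 28-bit ATA layout (low nibble of DH, then CH, CL, SN) and the usual error-register bits (0x40 = UNC, 0x10 = IDNF). The function names are made up for illustration.

```python
def lba28_from_registers(sn: int, cl: int, ch: int, dh: int) -> int:
    """Combine ATA taskfile registers into a 28-bit LBA.

    The low nibble of DH carries LBA bits 27..24; CH, CL and SN carry
    bits 23..16, 15..8 and 7..0.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


def parse_register_row(row: str) -> dict:
    """Parse an 'ER ST SC SN CL CH DH' row such as '40 51 08 90 0b 67 ea'."""
    er, st, sc, sn, cl, ch, dh = (int(tok, 16) for tok in row.split()[:7])
    return {
        "uncorrectable": bool(er & 0x40),  # UNC bit in the error register
        "id_not_found": bool(er & 0x10),   # IDNF bit in the error register
        "sectors": sc,                     # sector count register
        "lba": lba28_from_registers(sn, cl, ch, dh),
    }


# Register row from "Error 7" in the Disk 6 log above.
print(parse_register_row("40 51 08 90 0b 67 ea"))
```

Applied to the Error 7 row, this gives LBA 174525328 with a sector count of 8, which matches the text smartctl printed next to it.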

 

Disk 3 (WDC_WD20EARS-00MVWB0_WD-WMAZA3020880-20160929-1130) is showing some errors, but again not in the syslog!

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-4.1.7-unRAID] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Green (AF)
Device Model:     WDC WD20EARS-00MVWB0
Serial Number:    WD-WMAZA3020880
LU WWN Device Id: 5 0014ee 600b6eb71
Firmware Version: 51.0AB51
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Thu Sep 29 11:30:52 2016 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84)	Offline data collection activity
				was suspended by an interrupting command from host.
				Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
				without error or no self-test has ever 
				been run.
Total time to complete Offline 
data collection: 		(35760) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
				Auto Offline data collection on/off support.
				Suspend Offline collection upon new
				command.
				Offline surface scan supported.
				Self-test supported.
				Conveyance Self-test supported.
				Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
				power-saving mode.
				Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
				General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 345) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x3035)	SCT Status supported.
				SCT Feature Control supported.
				SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       3101
  3 Spin_Up_Time            0x0027   186   171   021    Pre-fail  Always       -       5683
  4 Start_Stop_Count        0x0032   095   095   000    Old_age   Always       -       5037
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   037   037   000    Old_age   Always       -       46487
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       249
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       82
193 Load_Cycle_Count        0x0032   143   143   000    Old_age   Always       -       172365
194 Temperature_Celsius     0x0022   114   102   000    Old_age   Always       -       36
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       2
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       46

SMART Error Log Version: 1
ATA Error Count: 1
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 1 occurred at disk power-on lifetime: 45415 hours (1892 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 68 30 da ee  Error: UNC 8 sectors at LBA = 0x0eda3068 = 249180264

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 68 30 da ee 08   3d+16:45:45.619  READ DMA
  c8 00 08 60 30 da ee 08   3d+16:45:45.619  READ DMA
  c8 00 08 58 30 da ee 08   3d+16:45:45.619  READ DMA
  c8 00 08 50 30 da ee 08   3d+16:45:45.619  READ DMA
  c8 00 08 48 30 da ee 08   3d+16:45:45.619  READ DMA

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     30488         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

 

 

Can anyone please help me decode these?

I'm assuming it means the 2 older drives are on the way out, Disk 6 being more terminal.

Disk 3 and the Parity though stump me a bit since I'm not seeing errors in the syslog?
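
One way to sanity-check this is to compare each drive's current Power_On_Hours with the lifetime hour of its newest logged error. Below is a small Python sketch using the figures from the three SMART reports in this thread. The reading that older errors simply predate what the current syslog covers is an assumption, not something confirmed here.

```python
# (current Power_On_Hours, lifetime hour of the newest logged ATA error),
# copied from the SMART reports in this thread.
DRIVES = {
    "disk6": (29996, 29909),
    "disk3": (46487, 45415),
    "parity": (17156, 13493),
}


def hours_since_last_error(power_on_hours: int, last_error_hours: int) -> int:
    """Powered-on hours elapsed since the newest entry in the SMART error log."""
    return power_on_hours - last_error_hours


for name, (now, last_error) in DRIVES.items():
    gap = hours_since_last_error(now, last_error)
    print(f"{name}: newest logged error {gap} power-on hours ago (~{gap / 24:.0f} days)")
```

That works out to 87 hours for disk 6, 1072 for disk 3 and 3663 for parity, so only disk 6's logged errors are recent.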

 

Apologies for embedding it all; I couldn't attach the files due to size!

 

Thanks in advance!

  • Author

Syslog1

Sep 26 04:40:01 Wintermute rsyslogd: [origin software="rsyslogd" swVersion="8.6.0" x-pid="1144" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Sep 26 04:40:01 Wintermute logger: Community Applications Auto Update Running
Sep 26 04:44:20 Wintermute kernel: ata8.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
Sep 26 04:44:20 Wintermute kernel: ata8.00: irq_stat 0x40000001
Sep 26 04:44:20 Wintermute kernel: ata8.00: failed command: READ DMA EXT
Sep 26 04:44:20 Wintermute kernel: ata8.00: cmd 25/00:40:30:f6:2e/00:05:3a:00:00/e0 tag 18 dma 688128 in
Sep 26 04:44:20 Wintermute kernel:         res 51/40:3f:30:f7:2e/00:04:3a:00:00/e0 Emask 0x9 (media error)
Sep 26 04:44:20 Wintermute kernel: ata8.00: status: { DRDY ERR }
Sep 26 04:44:20 Wintermute kernel: ata8.00: error: { UNC }
Sep 26 04:44:20 Wintermute kernel: ata8.00: configured for UDMA/133
Sep 26 04:44:20 Wintermute kernel: sd 8:0:0:0: [sdi] tag#18 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Sep 26 04:44:20 Wintermute kernel: sd 8:0:0:0: [sdi] tag#18 Sense Key : 0x3 [current] [descriptor] 
Sep 26 04:44:20 Wintermute kernel: sd 8:0:0:0: [sdi] tag#18 ASC=0x11 ASCQ=0x4 
Sep 26 04:44:20 Wintermute kernel: sd 8:0:0:0: [sdi] tag#18 CDB: opcode=0x28 28 00 3a 2e f6 30 00 05 40 00
Sep 26 04:44:20 Wintermute kernel: blk_update_request: I/O error, dev sdi, sector 976156464
Sep 26 04:44:20 Wintermute kernel: ata8: EH complete
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156400
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156408
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156416
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156424
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156432
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156440
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156448
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156456
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156464
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156472
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156480
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156488
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156496
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156504
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156512
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156520
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156528
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156536
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156544
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156552
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156560
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156568
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156576
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156584
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156592
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156600
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156608
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156616
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156624
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156632
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156640
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156648
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156656
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156664
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156672
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156680
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156688
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156696
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156704
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156712
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156720
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156728
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156736
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156744
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156752
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156760
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156768
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156776
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156784
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156792
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156800
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156808
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156816
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156824
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156832
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156840
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156848
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156856
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156864
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156872
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156880
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156888
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156896
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156904
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156912
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156920
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156928
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156936
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156944
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156952
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156960
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156968
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156976
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156984
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976156992
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976157000
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976157008
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976157016
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976157024
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976157032
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976157040
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976157048
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976157056
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976157064
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976157072
Sep 26 04:44:20 Wintermute kernel: md: disk6 read error, sector=976157080

 

It goes on like that for a while, always disk 6.
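
The `cmd` and `res` lines in the kernel output above encode the failing request. Here is a hedged Python sketch for pulling the 48-bit LBA and sector count out of them, assuming the usual libata field order (command or status, then feature:nsect:lbal:lbam:lbah, then the same five "high order" bytes, then device). The regex and function name are illustrative only.

```python
import re

# Matches e.g. "cmd 25/00:40:30:f6:2e/00:05:3a:00:00/e0"
TASKFILE = re.compile(
    r"(?:cmd|res)\s+([0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/([0-9a-f]{2})",
    re.IGNORECASE,
)


def decode_taskfile(line: str) -> dict:
    """Return the 48-bit LBA and 16-bit sector count from a cmd/res line."""
    m = TASKFILE.search(line)
    if m is None:
        raise ValueError("no cmd/res taskfile found in line")
    _feat, nsect, lbal, lbam, lbah = (int(b, 16) for b in m.group(2).split(":"))
    _hfeat, hnsect, hlbal, hlbam, hlbah = (int(b, 16) for b in m.group(3).split(":"))
    lba = (hlbah << 40) | (hlbam << 32) | (hlbal << 24) | (lbah << 16) | (lbam << 8) | lbal
    # The count is only meaningful as a request size on the "cmd" line.
    return {"lba": lba, "sectors": (hnsect << 8) | nsect}


cmd = decode_taskfile("ata8.00: cmd 25/00:40:30:f6:2e/00:05:3a:00:00/e0 tag 18 dma 688128 in")
res = decode_taskfile("res 51/40:3f:30:f7:2e/00:04:3a:00:00/e0 Emask 0x9 (media error)")
print(cmd, res)
```

The `cmd` line decodes to a 1344-sector read starting at LBA 976156208 (1344 × 512 = 688128 bytes, matching `dma 688128`), and the `res` line to LBA 976156464, the sector reported by `blk_update_request`. The first `md` read error, 976156400, is 64 sectors lower. That would be consistent with the data partition starting at sector 64, though this is an assumption about the array and not something the log states.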

  • Community Expert

You didn't post the SMART report for disk3; better yet, post the diagnostics.

 

Disk6 is on its way out. Parity had some issues in the past, but it should be OK for now; I guess you'll know after trying to rebuild disk6.
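
The attributes behind a verdict like this can be pulled out of the SMART table mechanically. Here is a rough Python sketch, assuming the ten-column smartctl attribute layout shown in this thread. The choice of attributes to watch (IDs 5, 197 and 198) is a common rule of thumb and an assumption here, not something stated in the reply.

```python
# Assumption: these IDs are the ones worth flagging when their raw value is non-zero.
WATCHED_IDS = {5, 197, 198}


def parse_attribute_row(row: str):
    """Parse one smartctl attribute row; return None for headers or short lines."""
    fields = row.split()
    if len(fields) < 10 or not fields[0].isdigit():
        return None
    # Columns: ID, name, flag, value, worst, thresh, type, updated, when_failed, raw
    return {"id": int(fields[0]), "name": fields[1], "raw": int(fields[9])}


def nonzero_watched(rows) -> dict:
    """Map attribute name to raw value for watched attributes with raw > 0."""
    flagged = {}
    for row in rows:
        attr = parse_attribute_row(row)
        if attr and attr["id"] in WATCHED_IDS and attr["raw"] > 0:
            flagged[attr["name"]] = attr["raw"]
    return flagged


# Rows copied from the Disk 6 report earlier in the thread.
DISK6_ROWS = [
    "ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE",
    "  5 Reallocated_Sector_Ct   0x0033   145   145   140    Pre-fail  Always       -       2327",
    "197 Current_Pending_Sector  0x0032   200   198   000    Old_age   Always       -       174",
    "198 Offline_Uncorrectable   0x0030   198   198   000    Old_age   Offline      -       754",
]
print(nonzero_watched(DISK6_ROWS))
```

For disk 6 all three watched attributes come back non-zero (2327, 174 and 754), whereas the same three rows in the parity report all have a raw value of 0.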

  • Author

Hey yeah sorry, I overdid the max characters so ended up chopping the wrong bit out!

 

Disk 6 looks a goner! Thanks for confirming.

Everything else should be here now!

My Parity is also showing an error, though I'm stumped as to what it means! And that's a fairly new 6TB WD Red (2 years tops).

WDC_WD60EFRX-68MYMN1_WD-WX51D6427226-20160929-1130.txt

smartctl 6.2 2013-07-26 r3841 [x86_64-linux-4.1.7-unRAID] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD60EFRX-68MYMN1
Serial Number:    WD-WX51D6427226
LU WWN Device Id: 5 0014ee 260205560
Firmware Version: 82.00A82
User Capacity:    6,001,175,126,016 bytes [6.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5700 rpm
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Thu Sep 29 11:30:53 2016 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
				was never started.
				Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)	The previous self-test routine completed
				without error or no self-test has ever 
				been run.
Total time to complete Offline 
data collection: 		( 5204) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
				Auto Offline data collection on/off support.
				Suspend Offline collection upon new
				command.
				Offline surface scan supported.
				Self-test supported.
				Conveyance Self-test supported.
				Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
				power-saving mode.
				Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
				General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 705) minutes.
Conveyance self-test routine
recommended polling time: 	 (   5) minutes.
SCT capabilities: 	       (0x303d)	SCT Status supported.
				SCT Error Recovery Control supported.
				SCT Feature Control supported.
				SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   197   193   021    Pre-fail  Always       -       9141
  4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       1277
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   077   077   000    Old_age   Always       -       17156
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       69
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       29
193 Load_Cycle_Count        0x0032   199   199   000    Old_age   Always       -       4379
194 Temperature_Celsius     0x0022   118   108   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       1
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
ATA Error Count: 34 (device log contains only the most recent five errors)
CR = Command Register [HEX]
FR = Features Register [HEX]
SC = Sector Count Register [HEX]
SN = Sector Number Register [HEX]
CL = Cylinder Low Register [HEX]
CH = Cylinder High Register [HEX]
DH = Device/Head Register [HEX]
DC = Device Command Register [HEX]
ER = Error register [HEX]
ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 34 occurred at disk power-on lifetime: 13493 hours (562 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 90 d8 00 e0  Error: UNC 8 sectors at LBA = 0x0000d890 = 55440

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 90 d8 00 e0 08   2d+05:06:02.778  READ DMA
  ca 00 80 10 d8 00 e0 08   2d+05:06:02.777  WRITE DMA
  ca 00 08 08 d8 00 e0 08   2d+05:06:02.777  WRITE DMA
  ca 00 08 00 d8 00 e0 08   2d+05:06:02.628  WRITE DMA
  c8 00 80 10 d8 00 e0 08   2d+05:06:02.628  READ DMA

Error 33 occurred at disk power-on lifetime: 13493 hours (562 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 f8 d7 00 e0  Error: UNC 8 sectors at LBA = 0x0000d7f8 = 55288

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 f8 d7 00 e0 08   2d+05:05:04.113  READ DMA
  ca 00 40 b8 d7 00 e0 08   2d+05:05:04.112  WRITE DMA
  ca 00 08 b0 d7 00 e0 08   2d+05:05:04.112  WRITE DMA
  ca 00 08 a8 d7 00 e0 08   2d+05:05:03.963  WRITE DMA
  c8 00 40 b8 d7 00 e0 08   2d+05:05:03.963  READ DMA

Error 32 occurred at disk power-on lifetime: 13493 hours (562 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 28 d6 00 e0  Error: UNC 8 sectors at LBA = 0x0000d628 = 54824

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 28 d6 00 e0 08   2d+05:03:55.097  READ DMA
  ef 10 02 00 00 00 a0 08   2d+05:03:55.079  SET FEATURES [Enable SATA feature]
  ec 00 00 00 00 00 a0 08   2d+05:03:55.079  IDENTIFY DEVICE

Error 31 occurred at disk power-on lifetime: 13493 hours (562 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  10 51 08 20 d6 00 e0  Error: IDNF at LBA = 0x0000d620 = 54816

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  ca 00 08 20 d6 00 e0 08   2d+05:03:32.474  WRITE DMA
  c8 00 08 20 d6 00 e0 08   2d+05:03:32.040  READ DMA
  ca 00 70 b0 d5 00 e0 08   2d+05:03:32.039  WRITE DMA

Error 30 occurred at disk power-on lifetime: 13493 hours (562 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 08 50 d5 00 e0  Error: UNC 8 sectors at LBA = 0x0000d550 = 54608

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 08 50 d5 00 e0 08   2d+05:02:38.969  READ DMA
  ca 00 68 e8 d4 00 e0 08   2d+05:02:38.968  WRITE DMA
  ca 00 08 e0 d4 00 e0 08   2d+05:02:38.803  WRITE DMA
  c8 00 68 e8 d4 00 e0 08   2d+05:02:38.765  READ DMA
  ca 00 08 d8 d4 00 e0 08   2d+05:02:38.765  WRITE DMA

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      1258         -
# 2  Short offline       Completed without error       00%      1206         -
# 3  Short offline       Completed without error       00%      1206         -

SMART Selective self-test log data structure revision number 1
SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
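
For anyone reading the error log above and wondering where the LBA figures come from: each "registers were" row can be decoded by hand. Below is a minimal Python sketch, assuming the 28-bit LBA layout that these READ DMA / WRITE DMA entries use (SN, CL, CH are the low three bytes, and the low nibble of DH holds the top four bits). The function names `decode_registers` and `lifetime` are made up for illustration, and the error-bit table only lists a few common flags, not the full set.

```python
# A few ATA error-register bits; this table is deliberately incomplete.
ERROR_BITS = {0x80: "ICRC", 0x40: "UNC", 0x10: "IDNF", 0x04: "ABRT"}


def decode_registers(row):
    """Decode an 'ER ST SC SN CL CH DH' row from a smartctl error log.

    Assumes 28-bit LBA addressing (hypothetical helper, not part of smartctl).
    """
    er, st, sc, sn, cl, ch, dh = (int(b, 16) for b in row.split()[:7])
    # Low nibble of DH carries LBA bits 24-27; CH, CL, SN carry the rest.
    lba = ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn
    flags = [name for bit, name in ERROR_BITS.items() if er & bit]
    return {"flags": flags, "sectors": sc, "lba": lba}


def lifetime(hours):
    """Split power-on hours into (days, leftover hours)."""
    return divmod(hours, 24)


if __name__ == "__main__":
    print(decode_registers("40 51 08 90 d8 00 e0"))  # row from the first error shown
    print(decode_registers("10 51 08 20 d6 00 e0"))  # row from error 31
    print(lifetime(13493))
```

Run against the rows in the log, this gives the same LBAs smartctl prints (55440 for the UNC entry, 54816 for the IDNF entry) and the same "562 days + 5 hours" split.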

  • Community Expert

Disk3 doesn't look very good, but it can be a false positive and disk6 looks worse, so I would replace that one first.

 

After the rebuild, look at the error counters. If there are any errors on disk3, parity or any other disk, there could be some corrupt files.

  • Author

OK, thanks for confirming it all. Disk 3 threw me as it wasn't in the syslog, but I was combing through them all trying to work it out. I really wish SMART reports gave more detail for dummies ha ha.

 

Well I've been looking for an excuse for another 6TB...oh well.

 

I'll redo the scans after fixing Disk 6. I do wonder if the parity errors were down to how often the thing had been crashing recently?

 

Thanks again!

  • Community Expert

Parity should be OK. There were some errors, but they were about 5 months ago, and there are no pending sectors.

 

Disk3 shows some pending sectors and some read errors from 45 days ago, so during the rebuild there could be some read errors that result in some corrupt files on the rebuilt disk.
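
For reference, "pending sectors" here means the raw value of SMART attribute 197 (Current_Pending_Sector) in the attribute table. Here is a rough Python sketch of how one might pick the sector-health counters out of `smartctl -A` text. The helper name `nonzero_counters` is hypothetical, and the sample rows are invented purely to show the table layout; they are not values from any disk in this thread.

```python
# Attribute names as they appear in the smartctl attribute table.
WATCHED = {"Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable"}

# Made-up sample in the usual `smartctl -A` layout (NOT data from this thread).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       12
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
"""


def nonzero_counters(text):
    """Return {attribute_name: raw_value} for watched attributes with raw > 0."""
    found = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in WATCHED and raw.isdigit() and int(raw) > 0:
            found[name] = int(raw)
    return found


if __name__ == "__main__":
    print(nonzero_counters(SAMPLE))
```

Anything this returns is worth a closer look before relying on that disk during a rebuild.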

  • Author

Cool thanks man!

  • Community Expert

Here are 2 things for future reference.

 

1) Always go to Tools - Diagnostics and post the complete diagnostics zip. That one file would have included everything needed, instead of embedding individual SMART and syslog excerpts across several posts.

 

2) Set up Notifications. It looks like disk6 had probably been an issue for a long time, but you weren't aware of it. Notifications would have told you.

  • Author

Yeah, I'm reading up on notifications at the minute. I got lazy after the initial setup; I should have covered my back earlier.

 

I tried the zip, too large apparently!

Archived

This topic is now archived and is closed to further replies.
