trouble identifying a dying (or dead) drive

Howdy folks,

    I was greeted by a server unavailable page when I tried to hit my "tower" server tonight. I also have an automated scheduled parity check for the start of the month (ie today) which is currently running :-(

 

I suspect one of my drives has failed, but I can't really tell which one it is from the log, and the main front page shows a status of AOK for every disk. Any help identifying it would be appreciated.

 

 

Status  Disk      Mounted     Device    Model/Serial                          Temp  Reads   Writes  Errors  Size    Used    %Used  Free
OK      parity                /dev/sdb  WDC_WD20EADS-00R6B0_WD-WCAVY0100163   35°C  104138  37677
OK      /dev/md1  /mnt/disk1  /dev/sda  ST31500341AS_9VS20M4T                 31°C  78728   1769            1.50T   1.46T   98%    40.08G
OK      /dev/md2  /mnt/disk2  /dev/sdd  ST31500341AS_9VS21EV7                 30°C  79641   1650            1.50T   1.48T   99%    16.75G
OK      /dev/md3  /mnt/disk3  /dev/sdc  ST31500341AS_9VS20RC4                 34°C  84598   4603            1.50T   1.34T   90%    158.85G
OK      /dev/md4  /mnt/disk4  /dev/hdd  ST31500341AS_9VS21B3M                 33°C  56075   1506            1.50T   1.36T   91%    140.59G
OK      /dev/md5  /mnt/disk5  /dev/hdc  WDC_WD15EADS-00R6B0_WD-WCAVY0489509   36°C  59012   5857            1.50T   1.31T   88%    193.54G
OK      /dev/md6  /mnt/disk6  /dev/sdf  SAMSUNG_HD154UI_S1XWJ1LSB07244        19°C  77889   5120            1.50T   1.39T   93%    111.81G
OK      /dev/md7  /mnt/disk7  /dev/sdi  SAMSUNG_HD154UI_S1XWJ1LSB07242        19°C  71585   5889            1.50T   1.23T   82%    274.91G
OK      /dev/md8  /mnt/disk8  /dev/sde  WDC_WD20EARS-00J2GB0_WD-WCAYY0017128  26°C  77154   5717            2.00T   1.73T   87%    271.50G
OK      /dev/md9  /mnt/disk9  /dev/sdh  WDC_WD20EARS-00J2GB0_WD-WCAYY0021248  26°C  73847   5791            2.00T   1.72T   87%    277.14G

Total: Size 14.50T, Used 13.02T, %Used 89%, Free 1.49T

 

It's been a while ... so I can't actually remember if I used to have a md10 (ie 10 data drives and 1 parity) :-) so it's possible one drive is missing.

 

 

And secondly ... has my automated parity check now just removed my ability to recover from this dying/dead disk (ie it's recomputing the parity now based on garbage input)?

 

I have attached my latest syslog (fragment) in the hope that someone could provide some helpful insight.

 

Thanks

    belorion

syslog-fragment.txt.zip

The parity check would not start if you had a failed drive, so no worries there.

 

According to the syslog, these are your disks:

 

Sep  1 21:14:58 Tower kernel: md: import disk0: [8,16] (sdb) WDC WD20EADS-00R6B0                          WD-WCAVY0100163 offset: 63 size: 1953514552
Sep  1 21:14:58 Tower kernel: md: import disk1: [8,0] (sda) ST31500341AS                                        9VS20M4T offset: 63 size: 1465137496
Sep  1 21:14:58 Tower kernel: md: import disk2: [8,48] (sdd) ST31500341AS                                        9VS21EV7 offset: 63 size: 1465138552
Sep  1 21:14:58 Tower kernel: md: import disk3: [8,32] (sdc) ST31500341AS                                        9VS20RC4 offset: 63 size: 1465138552
Sep  1 21:14:58 Tower kernel: md: import disk4: [22,64] (hdd) ST31500341AS 9VS21B3M offset: 63 size: 1465138552
Sep  1 21:14:58 Tower kernel: md: import disk5: [22,0] (hdc) WDC WD15EADS-00R6B0 WD-WCAVY0489509 offset: 63 size: 1465138552
Sep  1 21:14:58 Tower kernel: md: import disk6: [8,80] (sdf) SAMSUNG HD154UI                          S1XWJ1LSB07244      offset: 63 size: 1465138552
Sep  1 21:14:58 Tower kernel: md: import disk7: [8,128] (sdi) SAMSUNG HD154UI                          S1XWJ1LSB07242      offset: 63 size: 1465138552
Sep  1 21:14:58 Tower kernel: md: import disk8: [8,64] (sde) WDC WD20EARS-00J2GB0                          WD-WCAYY0017128 offset: 63 size: 1953514552
Sep  1 21:14:58 Tower kernel: md: import disk9: [8,112] (sdh) WDC WD20EARS-00J2GB0                          WD-WCAYY0021248 offset: 63 size: 1953514552

 

There is no indication of a disk10.
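For anyone who would rather not eyeball the log, here is a minimal Python sketch of the same check: pull every "md: import diskN" line out of the syslog and see which slots were imported. The regular expression is only inferred from the lines quoted in this thread (it is not a documented format), and imported_disks is a made-up helper name.

```python
import re

# A few lines copied from the syslog excerpt in this thread (runs of spaces condensed).
SYSLOG = """\
Sep  1 21:14:58 Tower kernel: md: import disk0: [8,16] (sdb) WDC WD20EADS-00R6B0 WD-WCAVY0100163 offset: 63 size: 1953514552
Sep  1 21:14:58 Tower kernel: md: import disk7: [8,128] (sdi) SAMSUNG HD154UI S1XWJ1LSB07242 offset: 63 size: 1465138552
Sep  1 21:14:58 Tower kernel: md: import disk9: [8,112] (sdh) WDC WD20EARS-00J2GB0 WD-WCAYY0021248 offset: 63 size: 1953514552
"""

# Assumption: the pattern is inferred from the log lines quoted here.
IMPORT_RE = re.compile(
    r"md: import disk(\d+): \[\d+,\d+\] \((\w+)\) (.+?)\s+offset: \d+ size: (\d+)"
)


def imported_disks(text):
    """Map array slot number -> (device, model and serial, size) for each import line."""
    disks = {}
    for match in IMPORT_RE.finditer(text):
        slot, device, ident, size = match.groups()
        # Collapse the padding between model and serial to a single space.
        disks[int(slot)] = (device, " ".join(ident.split()), int(size))
    return disks


disks = imported_disks(SYSLOG)
for slot in sorted(disks):
    device, ident, size = disks[slot]
    print(f"disk{slot}: /dev/{device}  {ident}  size={size}")
print("highest slot seen:", max(disks))
```

Run against the full syslog instead of the three sample lines, the highest slot seen would tell you whether a disk10 was ever imported.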

 

Your log is filled with errors like these (BadCRC = bad checksum on the data sent to the disk controller):

Sep  1 21:15:00 Tower kernel: ata11.00: exception Emask 0x0 SAct 0x0 SErr 0x280000 action 0x0
Sep  1 21:15:00 Tower kernel: ata11.00: BMDMA2 stat 0x80d0009
Sep  1 21:15:00 Tower kernel: ata11: SError: { 10B8B BadCRC }
Sep  1 21:15:00 Tower kernel: ata11.00: cmd c8/00:00:3f:15:00/00:00:00:00:00/e0 tag 0 dma 131072 in
Sep  1 21:15:00 Tower kernel:          res 51/04:5f:e0:15:00/00:03:00:00:00/f0 Emask 0x1 (device error)
Sep  1 21:15:00 Tower kernel: ata11.00: status: { DRDY ERR }
Sep  1 21:15:00 Tower kernel: ata11.00: error: { ABRT }
Sep  1 21:15:00 Tower kernel: ata11.00: failed to IDENTIFY (I/O error, err_mask=0x1)
Sep  1 21:15:00 Tower kernel: ata11.00: revalidation failed (errno=-5)
Sep  1 21:15:00 Tower kernel: ata11: hard resetting link
Sep  1 21:15:00 Tower kernel: ata11: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Sep  1 21:15:00 Tower kernel: ata11.00: configured for UDMA/100
Sep  1 21:15:00 Tower kernel: ata11: EH complete
Sep  1 21:15:01 Tower kernel: ata11.00: exception Emask 0x0 SAct 0x0 SErr 0x280000 action 0x0
Sep  1 21:15:01 Tower kernel: ata11.00: BMDMA2 stat 0x80c0009
Sep  1 21:15:01 Tower kernel: ata11: SError: { 10B8B BadCRC }
Sep  1 21:15:01 Tower kernel: ata11.00: cmd c8/00:08:7f:51:75/00:00:00:00:00/e6 tag 0 dma 4096 in
Sep  1 21:15:01 Tower kernel:          res 51/04:00:86:51:75/00:03:00:00:00/f6 Emask 0x1 (device error)
Sep  1 21:15:01 Tower kernel: ata11.00: status: { DRDY ERR }
Sep  1 21:15:01 Tower kernel: ata11.00: error: { ABRT }
Sep  1 21:15:01 Tower kernel: ata11.00: failed to IDENTIFY (I/O error, err_mask=0x1)
Sep  1 21:15:01 Tower kernel: ata11.00: revalidation failed (errno=-5)
Sep  1 21:15:01 Tower kernel: ata11: hard resetting link
Sep  1 21:15:01 Tower kernel: ata11: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Sep  1 21:15:01 Tower kernel: ata11.00: configured for UDMA/100
Sep  1 21:15:01 Tower kernel: ata11: EH complete
Sep  1 21:15:01 Tower kernel: ata11.00: exception Emask 0x0 SAct 0x0 SErr 0x280000 action 0x0
Sep  1 21:15:01 Tower kernel: ata11.00: BMDMA2 stat 0x80d0009
Sep  1 21:15:01 Tower kernel: ata11: SError: { 10B8B BadCRC }
Sep  1 21:15:01 Tower kernel: ata11.00: cmd 25/00:00:bf:1a:00/00:04:00:00:00/e0 tag 0 dma 524288 in
Sep  1 21:15:01 Tower kernel:          res 51/04:3f:80:1b:00/00:03:00:00:00/f0 Emask 0x1 (device error)
Sep  1 21:15:01 Tower kernel: ata11.00: status: { DRDY ERR }
Sep  1 21:15:01 Tower kernel: ata11.00: error: { ABRT }
Sep  1 21:15:01 Tower kernel: ata11.00: failed to IDENTIFY (I/O error, err_mask=0x1)
Sep  1 21:15:01 Tower kernel: ata11.00: revalidation failed (errno=-5)
Sep  1 21:15:01 Tower kernel: ata11: hard resetting link

 

ata11 is this disk (/dev/sdi, assigned as disk7 in the array):

Sep  1 21:14:58 Tower kernel: ata11: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Sep  1 21:14:58 Tower kernel: ata11.00: ATA-7: SAMSUNG HD154UI, 1AG01118, max UDMA7
Sep  1 21:14:58 Tower kernel: ata11.00: 2930277168 sectors, multi 16: LBA48 NCQ (depth 0/32)
Sep  1 21:14:58 Tower kernel: ata11.00: configured for UDMA/100
Sep  1 21:14:58 Tower kernel: scsi 11:0:0:0: Direct-Access    ATA      SAMSUNG HD154UI  1AG0 PQ: 0 ANSI: 5
Sep  1 21:14:58 Tower kernel: sd 11:0:0:0: [sdi] 2930277168 512-byte logical blocks: (1.50 TB/1.36 TiB)
Sep  1 21:14:58 Tower kernel: sd 11:0:0:0: [sdi] Write Protect is off
Sep  1 21:14:58 Tower kernel: sd 11:0:0:0: [sdi] Mode Sense: 00 3a 00 00
Sep  1 21:14:58 Tower kernel: sd 11:0:0:0: [sdi] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

 

Either the cable to the /dev/sdi drive is bad (or loose), or the SATA cable is picking up noise from another cable it is bundled close to, or the drive itself is bad, or the disk controller port is bad, or the power supply to the drive is marginal. Any of these can lead to a noisy response back to the controller and bad CRC checksums.
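The way the port was tied to a drive can be sketched in a few lines of Python: count the BadCRC lines per ata port, then look up which model and device the kernel reported for that port at boot. This is only a rough sketch. The patterns are inferred from the log lines quoted in this thread, the function names are made up, and it assumes the SCSI host number in "sd 11:0:0:0" matches the ata port number, which holds in this log but is not guaranteed on every kernel.

```python
import re
from collections import Counter

# Lines copied from the syslog excerpts in this thread.
SYSLOG = """\
Sep  1 21:14:58 Tower kernel: ata11.00: ATA-7: SAMSUNG HD154UI, 1AG01118, max UDMA7
Sep  1 21:14:58 Tower kernel: sd 11:0:0:0: [sdi] 2930277168 512-byte logical blocks: (1.50 TB/1.36 TiB)
Sep  1 21:15:00 Tower kernel: ata11: SError: { 10B8B BadCRC }
Sep  1 21:15:01 Tower kernel: ata11: SError: { 10B8B BadCRC }
Sep  1 21:15:01 Tower kernel: ata11: SError: { 10B8B BadCRC }
"""


def badcrc_by_port(text):
    """Count 'SError: { ... BadCRC ... }' lines per ata port."""
    return Counter(re.findall(r"\b(ata\d+): SError: \{[^}]*BadCRC[^}]*\}", text))


def port_models(text):
    """Map ata port -> model string from the 'ataN.00: ATA-x: MODEL, ...' probe line."""
    found = re.findall(r"\bata(\d+)\.\d+: ATA-\d+: ([^,]+),", text)
    return {f"ata{number}": model.strip() for number, model in found}


def host_devices(text):
    """Map SCSI host number -> device name from 'sd H:0:0:0: [sdX]' lines."""
    return dict(re.findall(r"\bsd (\d+):\d+:\d+:\d+: \[(\w+)\]", text))


counts = badcrc_by_port(SYSLOG)
models = port_models(SYSLOG)
devices = host_devices(SYSLOG)

for port, errors in counts.most_common():
    host = port[len("ata"):]  # assumption: ata port number == SCSI host number
    print(port, errors, models.get(port, "?"), devices.get(host, "?"))
```

A port that dominates the count is the one to inspect first; the count alone cannot say whether the cable, the drive, the controller port, or the power feed is at fault.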

 

I would try re-seating the cable first and replacing it second. A SMART report from the drive might show something if the drive is failing internally, so you could try that too:

smartctl -d ata -a /dev/sdi

 

Good luck.

 

Joe L.

  • Author

Thanks for the quick reply :- ) I will try that tomorrow as it's late in Oz.

  • Author

Hmmmm ... not quite going as I hoped.

 

The system IS doing a parity check ... not sure what that means in this scenario

 

STARTED; 10 disks in array.

 

Parity Check in progress

Total Size 1,953,514,552 KB

Current 248,857,416 (12.7%)

Speed 3,739 KB/sec

Finish 7596.5 minutes

Sync Errors 163 (corrected)

 

Array Disk Status

Status  Disk      Mounted     Device    Model/Serial                          Temp  Reads   Writes  Errors  Size    Used    %Used  Free
OK      parity                /dev/sdb  WDC_WD20EADS-00R6B0_WD-WCAVY0100163   36°C  879259  121013
OK      /dev/md1  /mnt/disk1  /dev/sda  ST31500341AS_9VS20M4T                 32°C  774834  1769            1.50T   1.46T   98%    40.08G
OK      /dev/md2  /mnt/disk2  /dev/sdd  ST31500341AS_9VS21EV7                 30°C  784757  1650            1.50T   1.48T   99%    16.75G
OK      /dev/md3  /mnt/disk3  /dev/sdc  ST31500341AS_9VS20RC4                 34°C  804806  10407           1.50T   1.34T   90%    158.82G
OK      /dev/md4  /mnt/disk4  /dev/hdd  ST31500341AS_9VS21B3M                 33°C  565514  1506            1.50T   1.36T   91%    140.59G
OK      /dev/md5  /mnt/disk5  /dev/hdc  WDC_WD15EADS-00R6B0_WD-WCAVY0489509   37°C  554358  5857            1.50T   1.31T   88%    193.54G
OK      /dev/md6  /mnt/disk6  /dev/sdf  SAMSUNG_HD154UI_S1XWJ1LSB07244        20°C  747497  8824            1.50T   1.39T   93%    111.81G
OK      /dev/md7  /mnt/disk7  /dev/sdi  SAMSUNG_HD154UI_S1XWJ1LSB07242        19°C  715718  43817           1.50T   1.22T   82%    278.97G
OK      /dev/md8  /mnt/disk8  /dev/sde  WDC_WD20EARS-00J2GB0_WD-WCAYY0017128  26°C  749815  25537           2.00T   1.73T   87%    271.49G
OK      /dev/md9  /mnt/disk9  /dev/sdh  WDC_WD20EARS-00J2GB0_WD-WCAYY0021248  27°C  705923  16901           2.00T   1.73T   87%    272.75G

Total: Size 14.50T, Used 13.02T, %Used 89%, Free 1.48T

 

 

That's going to take more than 5 days to complete!!!!
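As a rough cross-check of that estimate, the remaining time can be recomputed from the Total Size, Current and Speed figures in the status above. The sketch below assumes the speed stays constant at the reported 3,739 KB/sec, which it will not do exactly, so the result lands close to the reported 7596.5 minutes rather than right on it. The variable names are just for illustration.

```python
# Figures copied from the parity check status in this post.
total_kb = 1_953_514_552   # Total Size
done_kb = 248_857_416      # Current
speed_kb_s = 3_739         # Speed (assumed constant for this estimate)

remaining_seconds = (total_kb - done_kb) / speed_kb_s
remaining_minutes = remaining_seconds / 60
remaining_days = remaining_minutes / (60 * 24)
percent_done = 100 * done_kb / total_kb

print(f"done: {percent_done:.1f}%")
print(f"remaining: {remaining_minutes:.0f} minutes, about {remaining_days:.1f} days")
```

That works out to roughly 7,600 minutes, or a little over 5 days, at the current speed.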

 

 

I ran the S.M.A.R.T command as suggested and got this output

 

root@Tower:~# smartctl -d ata -a /dev/sdi -F samsung2

smartctl version 5.38 [i486-slackware-linux-gnu] Copyright © 2002-8 Bruce Allen

Home page is http://smartmontools.sourceforge.net/

 

=== START OF INFORMATION SECTION ===

Device Model:    SAMSUNG HD154UI

Serial Number:    S1XWJ1LSB07242

Firmware Version: 1AG01118

User Capacity:    1,500,301,910,016 bytes

Device is:        In smartctl database [for details use: -P show]

ATA Version is:  8

ATA Standard is:  ATA-8-ACS revision 3b

Local Time is:    Thu Sep  2 22:14:29 2010 GMT-11

 

==> WARNING: May need -F samsung or -F samsung2 enabled; see manual for details.

 

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

 

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

 

General SMART Values:

Offline data collection status:  (0x00) Offline data collection activity

was never started.

Auto Offline Data Collection: Disabled.

Self-test execution status:      (  0) The previous self-test routine completed

without error or no self-test has ever

been run.

Total time to complete Offline

data collection: (19958) seconds.

Offline data collection

capabilities: (0x7b) SMART execute Offline immediate.

Auto Offline data collection on/off support.

Suspend Offline collection upon new

command.

Offline surface scan supported.

Self-test supported.

Conveyance Self-test supported.

Selective Self-test supported.

SMART capabilities:            (0x0003) Saves SMART data before entering

power-saving mode.

Supports SMART auto save timer.

Error logging capability:        (0x01) Error logging supported.

General Purpose Logging supported.

Short self-test routine

recommended polling time: (  2) minutes.

Extended self-test routine

recommended polling time: ( 255) minutes.

Conveyance self-test routine

recommended polling time: (  35) minutes.

SCT capabilities:       (0x003f) SCT Status supported.

SCT Feature Control supported.

SCT Data Table supported.

 

SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0007   074   074   011    Pre-fail  Always       -       8610
  4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       889
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   253   253   051    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0025   100   100   015    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       1407
 10 Spin_Retry_Count        0x0033   100   100   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0012   100   100   000    Old_age   Always       -       1
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       309
 13 Read_Soft_Error_Rate    0x000e   100   100   000    Old_age   Always       -       0
183 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
184 Unknown_Attribute       0x0033   100   100   000    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   082   068   000    Old_age   Always       -       18 (Lifetime Min/Max 16/18)
194 Temperature_Celsius     0x0022   081   067   000    Old_age   Always       -       19 (Lifetime Min/Max 15/20)
195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       130294342
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   099   099   000    Old_age   Always       -       210476
200 Multi_Zone_Error_Rate   0x000a   100   100   000    Old_age   Always       -       0
201 Soft_Read_Error_Rate    0x000a   100   100   000    Old_age   Always       -       0

 

SMART Error Log Version: 1

ATA Error Count: 29709 (device log contains only the most recent five errors)

CR = Command Register [HEX]

FR = Features Register [HEX]

SC = Sector Count Register [HEX]

SN = Sector Number Register [HEX]

CL = Cylinder Low Register [HEX]

CH = Cylinder High Register [HEX]

DH = Device/Head Register [HEX]

DC = Device Command Register [HEX]

ER = Error register [HEX]

ST = Status register [HEX]

Powered_Up_Time is measured from power on, and printed as

DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,

SS=sec, and sss=millisec. It "wraps" after 49.710 days.

 

Error 29709 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)

  When the command that caused the error occurred, the device was active or idle.

 

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  00 d0 00 00 00 00 a0  at LBA = 0x00000000 = 0

 

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name

  -- -- -- -- -- -- -- --  ----------------  --------------------

  c8 00 98 f7 5d 00 e0 00  1d+00:19:26.070  READ DMA

  ec 00 00 00 00 00 a0 02  1d+00:19:26.000  IDENTIFY DEVICE

  ef 03 42 00 00 00 a0 02  1d+00:19:26.000  SET FEATURES [set transfer mode]

  ec 00 00 00 00 00 a0 02  1d+00:19:25.980  IDENTIFY DEVICE

  00 00 01 01 00 00 a0 ff  1d+00:19:25.650  NOP [Abort queued commands]

 

Error 29708 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)

  When the command that caused the error occurred, the device was active or idle.

 

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  00 d0 00 00 00 00 a0  at LBA = 0x00000000 = 0

 

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name

  -- -- -- -- -- -- -- --  ----------------  --------------------

  c8 00 58 a7 32 00 e0 00  1d+00:07:43.590  READ DMA

  ec 00 00 00 00 00 a0 02  1d+00:07:43.530  IDENTIFY DEVICE

  ef 03 42 00 00 00 a0 02  1d+00:07:43.530  SET FEATURES [set transfer mode]

  ec 00 00 00 00 00 a0 02  1d+00:07:43.510  IDENTIFY DEVICE

  00 00 01 01 00 00 a0 ff  1d+00:07:43.180  NOP [Abort queued commands]

 

Error 29707 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)

  When the command that caused the error occurred, the device was active or idle.

 

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  00 d0 00 00 00 00 a0  at LBA = 0x00000000 = 0

 

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name

  -- -- -- -- -- -- -- --  ----------------  --------------------

  c8 00 b8 2f 06 00 e0 00      23:52:03.030  READ DMA

  ec 00 00 00 00 00 a0 02      23:52:02.870  IDENTIFY DEVICE

  ef 03 42 00 00 00 a0 02      23:52:02.870  SET FEATURES [set transfer mode]

  ec 00 00 00 00 00 a0 02      23:52:02.850  IDENTIFY DEVICE

  00 00 01 01 00 00 a0 ff      23:52:02.530  NOP [Abort queued commands]

 

Error 29706 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)

  When the command that caused the error occurred, the device was active or idle.

 

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  00 d0 00 00 00 00 a0  at LBA = 0x00000000 = 0

 

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name

  -- -- -- -- -- -- -- --  ----------------  --------------------

  c8 00 48 5f 95 d5 e8 00      23:49:22.920  READ DMA

  ec 00 00 00 00 00 a0 02      23:49:22.850  IDENTIFY DEVICE

  ef 03 42 00 00 00 a0 02      23:49:22.850  SET FEATURES [set transfer mode]

  ec 00 00 00 00 00 a0 02      23:49:22.830  IDENTIFY DEVICE

  00 00 01 01 00 00 a0 ff      23:49:22.500  NOP [Abort queued commands]

 

Error 29705 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)

  When the command that caused the error occurred, the device was active or idle.

 

  After command completion occurred, registers were:

  ER ST SC SN CL CH DH

  -- -- -- -- -- -- --

  00 d0 00 00 00 00 a0  at LBA = 0x00000000 = 0

 

  Commands leading to the command that caused the error were:

  CR FR SC SN CL CH DH DC  Powered_Up_Time  Command/Feature_Name

  -- -- -- -- -- -- -- --  ----------------  --------------------

  c8 00 88 d7 32 ec e7 00      23:48:14.050  READ DMA

  c8 00 08 07 2e ec e7 00      23:48:14.050  READ DMA

  c8 00 08 af 28 ec e7 00      23:48:14.030  READ DMA

  ec 00 00 00 00 00 a0 02      23:48:14.000  IDENTIFY DEVICE

  ef 03 42 00 00 00 a0 02      23:48:14.000  SET FEATURES [set transfer mode]

 

SMART Self-test log structure revision number 1

No self-tests have been logged.  [To run self-tests, use: smartctl -t]

 

 

SMART Selective self-test log data structure revision number 1

SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS

    1        0        0  Not_testing

    2        0        0  Not_testing

    3        0        0  Not_testing

    4        0        0  Not_testing

    5        0        0  Not_testing

Selective self-test flags (0x0):

  After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.

 

 

 

I am uncertain what all this means - as I can't see any definite error threshold being reached (ie a S.M.A.R.T failure). The parity check is going to take a week.
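One way to read the attribute table is sketched below in Python, using a few rows copied from the report above. It assumes, as far as I understand smartctl's convention, that an attribute counts as failing when its normalized VALUE is at or below a non-zero THRESH, and it separately lists a handful of raw counters that are worth a look even when no threshold is crossed. The choice of attribute IDs to watch and the function names are mine, not anything smartctl defines.

```python
# Header plus a few rows copied from the smartctl attribute table in this post.
SMART_ROWS = """\
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033 100 100 010 Pre-fail Always - 0
197 Current_Pending_Sector  0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable   0x0030 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count    0x003e 099 099 000 Old_age Always - 210476
"""

# Assumption: these are the raw counters worth flagging when non-zero.
WATCHED_IDS = {5, 197, 198, 199}


def parse_rows(text):
    """Parse attribute rows into dicts; the header and malformed lines are skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        rows.append({
            "id": int(parts[0]),
            "name": parts[1],
            "value": int(parts[3]),
            "thresh": int(parts[5]),
            "raw": int(parts[9]),  # first token of RAW_VALUE only
        })
    return rows


def at_or_below_threshold(rows):
    """Attributes whose normalized value has reached a non-zero threshold."""
    return [r["name"] for r in rows if r["thresh"] > 0 and r["value"] <= r["thresh"]]


def nonzero_watched(rows):
    """Watched attributes with a non-zero raw counter."""
    return {r["name"]: r["raw"] for r in rows if r["id"] in WATCHED_IDS and r["raw"] > 0}


rows = parse_rows(SMART_ROWS)
print("at or below threshold:", at_or_below_threshold(rows))
print("non-zero watched counters:", nonzero_watched(rows))
```

On these rows nothing has reached its threshold, which matches the PASSED result, while UDMA_CRC_Error_Count stands out with a raw value of 210476. That counter tracks CRC errors on the link between drive and controller, so it fits the BadCRC messages in the syslog.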

 

Can I stop the parity check? Can I just switch off (halt) the PC and check the cabling as suggested, without losing the ability to recover the missing content should the drive truly be failing?

 

Any helpful suggestions would be appreciated :- )

 

Thanks

      belorion

 

It sounds like the monthly parity check you had scheduled just kicked off.

 

Yes, you can cancel the parity check, then "Stop" the array and power down and re-seat the cable.

 

Joe L.

  • Author

Thanks muchly joe :-)

 

I managed to stop the check (properly as you described). I will attempt to verify the connections are AOK as you suggested ... if that doesn't work I will get a new drive and attempt to rebuild as the bad drive is slowing the whole system down (that drive seems to have content on it from most of my shares - as I was kinda keen to get better utilisation of my drives - be careful what you wish for eh ;-)  ).

 

ciao

    belorion

 

        Model / Serial No.                        Temperature  Size           Free         Reads    Writes   Errors
parity  ata-WDC_WD20EADS-00R6B0_WD-WCAVY0100163   35°C         1,953,514,552  -            926,902  121,132  0
disk1   ata-ST31500341AS_9VS20M4T                 30°C         1,465,137,496  39,141,872   823,496  1,769    0
disk2   ata-ST31500341AS_9VS21EV7                 28°C         1,465,138,552  16,358,236   835,429  1,668    0
disk3   ata-ST31500341AS_9VS20RC4                 32°C         1,465,138,552  155,094,232  855,699  10,407   0
disk4   ata-ST31500341AS_9VS21B3M                 31°C         1,465,138,552  137,293,888  600,034  1,506    0
disk5   ata-WDC_WD15EADS-00R6B0_WD-WCAVY0489509   34°C         1,465,138,552  189,000,848  588,145  5,857    0
disk6   ata-SAMSUNG_HD154UI_S1XWJ1LSB07244        18°C         1,465,138,552  109,192,140  793,351  8,878    0
disk7   ata-SAMSUNG_HD154UI_S1XWJ1LSB07242        18°C         1,465,138,552  272,431,656  762,015  43,875   45,672
disk8   ata-WDC_WD20EARS-00J2GB0_WD-WCAYY0017128  25°C         1,953,514,552  265,131,504  794,480  25,537   0
disk9   ata-WDC_WD20EARS-00J2GB0_WD-WCAYY0021248  26°C         1,953,514,552  266,355,644  748,356  16,901   0

  • Author

Just to close this off.

 

I stopped the system and checked the drive cables: the drive in question was connected to a loose connector. I restarted and have just completed a parity check.

 

Everything is AOK :-)

 

Thanks muchly

    belorion

Archived

This topic is now archived and is closed to further replies.
