trouble identifying a dying (or dead) drive

September 1, 2010

Howdy folks,

I was greeted by a server unavailable page when I tried to hit my "tower" server tonight. I also have an automated parity check scheduled for the start of the month (i.e. today), which is currently running :-(

I suspect one of my drives has failed, but I can't really tell which one it is from the log, and the main front page shows every disk with a status of OK. Any help identifying it would be appreciated.

Status Disk Mounted Device Model/Serial Temp Reads Writes Errors Size Used %Used Free

OK parity /dev/sdb WDC_WD20EADS-00R6B0_WD-WCAVY0100163 35°C 104138 37677

OK /dev/md1 /mnt/disk1 /dev/sda ST31500341AS_9VS20M4T 31°C 78728 1769 1.50T 1.46T 98% 40.08G

OK /dev/md2 /mnt/disk2 /dev/sdd ST31500341AS_9VS21EV7 30°C 79641 1650 1.50T 1.48T 99% 16.75G

OK /dev/md3 /mnt/disk3 /dev/sdc ST31500341AS_9VS20RC4 34°C 84598 4603 1.50T 1.34T 90% 158.85G

OK /dev/md4 /mnt/disk4 /dev/hdd ST31500341AS_9VS21B3M 33°C 56075 1506 1.50T 1.36T 91% 140.59G

OK /dev/md5 /mnt/disk5 /dev/hdc WDC_WD15EADS-00R6B0_WD-WCAVY0489509 36°C 59012 5857 1.50T 1.31T 88% 193.54G

OK /dev/md6 /mnt/disk6 /dev/sdf SAMSUNG_HD154UI_S1XWJ1LSB07244 19°C 77889 5120 1.50T 1.39T 93% 111.81G

OK /dev/md7 /mnt/disk7 /dev/sdi SAMSUNG_HD154UI_S1XWJ1LSB07242 19°C 71585 5889 1.50T 1.23T 82% 274.91G

OK /dev/md8 /mnt/disk8 /dev/sde WDC_WD20EARS-00J2GB0_WD-WCAYY0017128 26°C 77154 5717 2.00T 1.73T 87% 271.50G

OK /dev/md9 /mnt/disk9 /dev/sdh WDC_WD20EARS-00J2GB0_WD-WCAYY0021248 26°C 73847 5791 2.00T 1.72T 87% 277.14G

Total: 14.50T 13.02T 89% 1.49T

It's been a while ... so I can't actually remember if I used to have an md10 (i.e. 10 data drives and 1 parity) :-) so it's possible one drive is missing.

And secondly ... has my automated parity check now just removed my ability to recover from this dying/dead disk (ie it's recomputing the parity now based on garbage input)?
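To make that worry concrete, here is a minimal Python sketch of single-parity protection. It assumes the parity disk holds a plain bytewise XOR of the data disks; the byte values and the helper name `xor_parity` are made up for illustration and are not taken from the array. It shows how a missing disk is rebuilt from parity, and why a correcting check that reads garbage from one disk would leave parity able to rebuild only the garbage.

```python
from functools import reduce

def xor_parity(blocks):
    """Return the bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data disks, four bytes each.
disks = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_parity(disks)

# Rebuild the third disk from parity plus the two surviving disks.
rebuilt = xor_parity([parity, disks[0], disks[1]])

# A correcting check that reads garbage from the third disk
# rewrites parity so that it matches the garbage.
garbage = b"\x00\x00\x00\x00"
bad_parity = xor_parity([disks[0], disks[1], garbage])
rebuilt_after_bad_check = xor_parity([bad_parity, disks[0], disks[1]])

print(rebuilt == disks[2])                  # good parity restores the real data
print(rebuilt_after_bad_check == disks[2])  # rewritten parity does not
```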

I have attached my latest syslog (fragment) in the hope that someone could provide some helpful insight.

Thanks

belorion

syslog-fragment.txt.zip

September 1, 2010

The parity check would not start if you had a failed drive, so no worries there.

According to the syslog, these are your disks:

Sep 1 21:14:58 Tower kernel: md: import disk0: [8,16] (sdb) WDC WD20EADS-00R6B0 WD-WCAVY0100163 offset: 63 size: 1953514552

Sep 1 21:14:58 Tower kernel: md: import disk1: [8,0] (sda) ST31500341AS 9VS20M4T offset: 63 size: 1465137496

Sep 1 21:14:58 Tower kernel: md: import disk2: [8,48] (sdd) ST31500341AS 9VS21EV7 offset: 63 size: 1465138552

Sep 1 21:14:58 Tower kernel: md: import disk3: [8,32] (sdc) ST31500341AS 9VS20RC4 offset: 63 size: 1465138552

Sep 1 21:14:58 Tower kernel: md: import disk4: [22,64] (hdd) ST31500341AS 9VS21B3M offset: 63 size: 1465138552

Sep 1 21:14:58 Tower kernel: md: import disk5: [22,0] (hdc) WDC WD15EADS-00R6B0 WD-WCAVY0489509 offset: 63 size: 1465138552

Sep 1 21:14:58 Tower kernel: md: import disk6: [8,80] (sdf) SAMSUNG HD154UI S1XWJ1LSB07244 offset: 63 size: 1465138552

Sep 1 21:14:58 Tower kernel: md: import disk7: [8,128] (sdi) SAMSUNG HD154UI S1XWJ1LSB07242 offset: 63 size: 1465138552

Sep 1 21:14:58 Tower kernel: md: import disk8: [8,64] (sde) WDC WD20EARS-00J2GB0 WD-WCAYY0017128 offset: 63 size: 1953514552

Sep 1 21:14:58 Tower kernel: md: import disk9: [8,112] (sdh) WDC WD20EARS-00J2GB0 WD-WCAYY0021248 offset: 63 size: 1953514552

There is no indication of a disk10.
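If you want to pull the same list out of a longer syslog yourself, a small Python sketch along these lines should do it. The regular expression is my own guess at the line layout, based only on the lines quoted above, and `parse_imports` is a hypothetical helper name. Slot 0 appears to be parity, since sdb is the parity drive in the status table.

```python
import re

# Two sample lines copied from the syslog excerpt above.
SYSLOG = """\
Sep 1 21:14:58 Tower kernel: md: import disk0: [8,16] (sdb) WDC WD20EADS-00R6B0 WD-WCAVY0100163 offset: 63 size: 1953514552
Sep 1 21:14:58 Tower kernel: md: import disk7: [8,128] (sdi) SAMSUNG HD154UI S1XWJ1LSB07242 offset: 63 size: 1465138552
"""

# Assumed layout: "md: import diskN: [major,minor] (dev) model serial offset: X size: Y"
IMPORT_RE = re.compile(
    r"md: import disk(?P<slot>\d+): \[\d+,\d+\] \((?P<dev>\w+)\) "
    r"(?P<ident>.+?) offset: \d+ size: (?P<size>\d+)"
)

def parse_imports(text):
    """Map array slot number -> (device name, model/serial string, size)."""
    slots = {}
    for m in IMPORT_RE.finditer(text):
        slots[int(m["slot"])] = (m["dev"], m["ident"], int(m["size"]))
    return slots

slots = parse_imports(SYSLOG)
for slot, (dev, ident, size) in sorted(slots.items()):
    print(f"disk{slot}: /dev/{dev}  {ident}  {size}")
```

Run against the full syslog, any slot number missing from the result would be a hint that a disk was not imported.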

Your log is filled with errors like these (BadCRC = bad checksum on the data transferred between the disk and the disk controller):

Sep 1 21:15:00 Tower kernel: ata11.00: exception Emask 0x0 SAct 0x0 SErr 0x280000 action 0x0

Sep 1 21:15:00 Tower kernel: ata11.00: BMDMA2 stat 0x80d0009

Sep 1 21:15:00 Tower kernel: ata11: SError: { 10B8B BadCRC }

Sep 1 21:15:00 Tower kernel: ata11.00: cmd c8/00:00:3f:15:00/00:00:00:00:00/e0 tag 0 dma 131072 in

Sep 1 21:15:00 Tower kernel: res 51/04:5f:e0:15:00/00:03:00:00:00/f0 Emask 0x1 (device error)

Sep 1 21:15:00 Tower kernel: ata11.00: status: { DRDY ERR }

Sep 1 21:15:00 Tower kernel: ata11.00: error: { ABRT }

Sep 1 21:15:00 Tower kernel: ata11.00: failed to IDENTIFY (I/O error, err_mask=0x1)

Sep 1 21:15:00 Tower kernel: ata11.00: revalidation failed (errno=-5)

Sep 1 21:15:00 Tower kernel: ata11: hard resetting link

Sep 1 21:15:00 Tower kernel: ata11: SATA link up 1.5 Gbps (SStatus 113 SControl 310)

Sep 1 21:15:00 Tower kernel: ata11.00: configured for UDMA/100

Sep 1 21:15:00 Tower kernel: ata11: EH complete

Sep 1 21:15:01 Tower kernel: ata11.00: exception Emask 0x0 SAct 0x0 SErr 0x280000 action 0x0

Sep 1 21:15:01 Tower kernel: ata11.00: BMDMA2 stat 0x80c0009

Sep 1 21:15:01 Tower kernel: ata11: SError: { 10B8B BadCRC }

Sep 1 21:15:01 Tower kernel: ata11.00: cmd c8/00:08:7f:51:75/00:00:00:00:00/e6 tag 0 dma 4096 in

Sep 1 21:15:01 Tower kernel: res 51/04:00:86:51:75/00:03:00:00:00/f6 Emask 0x1 (device error)

Sep 1 21:15:01 Tower kernel: ata11.00: status: { DRDY ERR }

Sep 1 21:15:01 Tower kernel: ata11.00: error: { ABRT }

Sep 1 21:15:01 Tower kernel: ata11.00: failed to IDENTIFY (I/O error, err_mask=0x1)

Sep 1 21:15:01 Tower kernel: ata11.00: revalidation failed (errno=-5)

Sep 1 21:15:01 Tower kernel: ata11: hard resetting link

Sep 1 21:15:01 Tower kernel: ata11: SATA link up 1.5 Gbps (SStatus 113 SControl 310)

Sep 1 21:15:01 Tower kernel: ata11.00: configured for UDMA/100

Sep 1 21:15:01 Tower kernel: ata11: EH complete

Sep 1 21:15:01 Tower kernel: ata11.00: exception Emask 0x0 SAct 0x0 SErr 0x280000 action 0x0

Sep 1 21:15:01 Tower kernel: ata11.00: BMDMA2 stat 0x80d0009

Sep 1 21:15:01 Tower kernel: ata11: SError: { 10B8B BadCRC }

Sep 1 21:15:01 Tower kernel: ata11.00: cmd 25/00:00:bf:1a:00/00:04:00:00:00/e0 tag 0 dma 524288 in

Sep 1 21:15:01 Tower kernel: res 51/04:3f:80:1b:00/00:03:00:00:00/f0 Emask 0x1 (device error)

Sep 1 21:15:01 Tower kernel: ata11.00: status: { DRDY ERR }

Sep 1 21:15:01 Tower kernel: ata11.00: error: { ABRT }

Sep 1 21:15:01 Tower kernel: ata11.00: failed to IDENTIFY (I/O error, err_mask=0x1)

Sep 1 21:15:01 Tower kernel: ata11.00: revalidation failed (errno=-5)

Sep 1 21:15:01 Tower kernel: ata11: hard resetting link
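To see at a glance which port is throwing these errors, the SError lines can be tallied per ATA port. This is a rough Python sketch; the pattern is an assumption based on the lines quoted above, and `count_bad_crc` is a hypothetical helper name.

```python
import re
from collections import Counter

# Sample lines copied from the excerpt above.
SYSLOG = """\
Sep 1 21:15:00 Tower kernel: ata11: SError: { 10B8B BadCRC }
Sep 1 21:15:00 Tower kernel: ata11: hard resetting link
Sep 1 21:15:01 Tower kernel: ata11: SError: { 10B8B BadCRC }
Sep 1 21:15:01 Tower kernel: ata11: SError: { 10B8B BadCRC }
"""

# Match only SError lines whose flag list includes BadCRC.
SERROR_RE = re.compile(r"kernel: (ata\d+): SError: \{[^}]*\bBadCRC\b[^}]*\}")

def count_bad_crc(text):
    """Count SError lines that report BadCRC, grouped by ATA port."""
    return Counter(m.group(1) for m in SERROR_RE.finditer(text))

counts = count_bad_crc(SYSLOG)
for port, n in counts.most_common():
    print(f"{port}: {n} BadCRC errors")
```

If every hit lands on a single port, as it does in this excerpt, that port's cable, drive, and controller connection are the first things to check.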

ata11 is this disk ( /dev/sdi assigned as disk7 in the array ):

Sep 1 21:14:58 Tower kernel: ata11: SATA link up 1.5 Gbps (SStatus 113 SControl 310)

Sep 1 21:14:58 Tower kernel: ata11.00: ATA-7: SAMSUNG HD154UI, 1AG01118, max UDMA7

Sep 1 21:14:58 Tower kernel: ata11.00: 2930277168 sectors, multi 16: LBA48 NCQ (depth 0/32)

Sep 1 21:14:58 Tower kernel: ata11.00: configured for UDMA/100

Sep 1 21:14:58 Tower kernel: scsi 11:0:0:0: Direct-Access ATA SAMSUNG HD154UI 1AG0 PQ: 0 ANSI: 5

Sep 1 21:14:58 Tower kernel: sd 11:0:0:0: [sdi] 2930277168 512-byte logical blocks: (1.50 TB/1.36 TiB)

Sep 1 21:14:58 Tower kernel: sd 11:0:0:0: [sdi] Write Protect is off

Sep 1 21:14:58 Tower kernel: sd 11:0:0:0: [sdi] Mode Sense: 00 3a 00 00

Sep 1 21:14:58 Tower kernel: sd 11:0:0:0: [sdi] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

Either the cable to the /dev/sdi drive is bad (or loose), or the SATA cable is picking up noise from another cable it is bundled close to, or the drive itself is bad, or the disk controller port is bad, or the power supply to the drive is marginal. Any of these can lead to a noisy response back to the controller and bad CRC checksums.

I would try re-seating the cable first, replacing it second. A SMART report from the drive might show something if the drive is failing internally, so you could try that too:

smartctl -d ata -a /dev/sdi

Good luck.

Joe L.

September 1, 2010

Author

Thanks for the quick reply :- ) I will try that tomorrow as it's late in Oz.

September 2, 2010

Author

Hmmmm ... not quite going as I hoped.

The system IS doing a parity check ... not sure what that means in this scenario

STARTED; 10 disks in array.

Parity Check in progress

Total Size 1,953,514,552 KB

Current 248,857,416 (12.7%)

Speed 3,739 KB/sec

Finish 7596.5 minutes

Sync Errors 163 (corrected)

Array Disk Status

Status Disk Mounted Device Model/Serial Temp Reads Writes Errors Size Used %Used Free

OK parity /dev/sdb WDC_WD20EADS-00R6B0_WD-WCAVY0100163 36°C 879259 121013

OK /dev/md1 /mnt/disk1 /dev/sda ST31500341AS_9VS20M4T 32°C 774834 1769 1.50T 1.46T 98% 40.08G

OK /dev/md2 /mnt/disk2 /dev/sdd ST31500341AS_9VS21EV7 30°C 784757 1650 1.50T 1.48T 99% 16.75G

OK /dev/md3 /mnt/disk3 /dev/sdc ST31500341AS_9VS20RC4 34°C 804806 10407 1.50T 1.34T 90% 158.82G

OK /dev/md4 /mnt/disk4 /dev/hdd ST31500341AS_9VS21B3M 33°C 565514 1506 1.50T 1.36T 91% 140.59G

OK /dev/md5 /mnt/disk5 /dev/hdc WDC_WD15EADS-00R6B0_WD-WCAVY0489509 37°C 554358 5857 1.50T 1.31T 88% 193.54G

OK /dev/md6 /mnt/disk6 /dev/sdf SAMSUNG_HD154UI_S1XWJ1LSB07244 20°C 747497 8824 1.50T 1.39T 93% 111.81G

OK /dev/md7 /mnt/disk7 /dev/sdi SAMSUNG_HD154UI_S1XWJ1LSB07242 19°C 715718 43817 1.50T 1.22T 82% 278.97G

OK /dev/md8 /mnt/disk8 /dev/sde WDC_WD20EARS-00J2GB0_WD-WCAYY0017128 26°C 749815 25537 2.00T 1.73T 87% 271.49G

OK /dev/md9 /mnt/disk9 /dev/sdh WDC_WD20EARS-00J2GB0_WD-WCAYY0021248 27°C 705923 16901 2.00T 1.73T 87% 272.75G

Total: 14.50T 13.02T 89% 1.48T

That's going to take 6.25 days to complete !!!!
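As a rough cross-check of that estimate, the figures on the status page can be plugged into a few lines of Python. The variable names are my own, and the speed shown is a rounded snapshot, so the result will not match the page exactly. By this arithmetic the remaining time comes to a little over five days, and a full pass at this speed to about six.

```python
# Figures copied from the status page above.
total_kb = 1_953_514_552     # Total Size
current_kb = 248_857_416     # Current position
speed_kb_per_s = 3_739       # Speed

remaining_s = (total_kb - current_kb) / speed_kb_per_s
remaining_min = remaining_s / 60
remaining_days = remaining_s / 86_400
full_pass_days = total_kb / speed_kb_per_s / 86_400
percent_done = 100 * current_kb / total_kb

print(f"{percent_done:.1f}% done, {remaining_min:.0f} minutes left "
      f"({remaining_days:.1f} days); a full pass at this speed takes "
      f"{full_pass_days:.1f} days")
```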

I ran the S.M.A.R.T command as suggested and got this output

root@Tower:~# smartctl -d ata -a /dev/sdi -F samsung2

Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===

Device Model: SAMSUNG HD154UI

Serial Number: S1XWJ1LSB07242

Firmware Version: 1AG01118

User Capacity: 1,500,301,910,016 bytes

Device is: In smartctl database [for details use: -P show]

ATA Version is: 8

ATA Standard is: ATA-8-ACS revision 3b

Local Time is: Thu Sep 2 22:14:29 2010 GMT-11

==> WARNING: May need -F samsung or -F samsung2 enabled; see manual for details.

SMART support is: Available - device has SMART capability.

SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===

SMART overall-health self-assessment test result: PASSED

General SMART Values:

Offline data collection status: (0x00) Offline data collection activity

was never started.

Auto Offline Data Collection: Disabled.

Self-test execution status: ( 0) The previous self-test routine completed

without error or no self-test has ever

been run.

Total time to complete Offline

data collection: (19958) seconds.

Offline data collection

capabilities: (0x7b) SMART execute Offline immediate.

Auto Offline data collection on/off support.

Suspend Offline collection upon new

command.

Offline surface scan supported.

Self-test supported.

Conveyance Self-test supported.

Selective Self-test supported.

SMART capabilities: (0x0003) Saves SMART data before entering

power-saving mode.

Supports SMART auto save timer.

Error logging capability: (0x01) Error logging supported.

General Purpose Logging supported.

Short self-test routine

recommended polling time: ( 2) minutes.

Extended self-test routine

recommended polling time: ( 255) minutes.

Conveyance self-test routine

recommended polling time: ( 35) minutes.

SCT capabilities: (0x003f) SCT Status supported.

SCT Feature Control supported.

SCT Data Table supported.

SMART Attributes Data Structure revision number: 16

Vendor Specific SMART Attributes with Thresholds:

ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE

1 Raw_Read_Error_Rate 0x000f 100 100 051 Pre-fail Always - 0

3 Spin_Up_Time 0x0007 074 074 011 Pre-fail Always - 8610

4 Start_Stop_Count 0x0032 099 099 000 Old_age Always - 889

5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0

7 Seek_Error_Rate 0x000f 253 253 051 Pre-fail Always - 0

8 Seek_Time_Performance 0x0025 100 100 015 Pre-fail Offline - 0

9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 1407

10 Spin_Retry_Count 0x0033 100 100 051 Pre-fail Always - 0

11 Calibration_Retry_Count 0x0012 100 100 000 Old_age Always - 1

12 Power_Cycle_Count 0x0032 100 100 000 Old_age Always - 309

13 Read_Soft_Error_Rate 0x000e 100 100 000 Old_age Always - 0

183 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0

184 Unknown_Attribute 0x0033 100 100 000 Pre-fail Always - 0

187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0

188 Unknown_Attribute 0x0032 100 100 000 Old_age Always - 0

190 Airflow_Temperature_Cel 0x0022 082 068 000 Old_age Always - 18 (Lifetime Min/Max 16/18)

194 Temperature_Celsius 0x0022 081 067 000 Old_age Always - 19 (Lifetime Min/Max 15/20)

195 Hardware_ECC_Recovered 0x001a 100 100 000 Old_age Always - 130294342

196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0

197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0

198 Offline_Uncorrectable 0x0030 100 100 000 Old_age Offline - 0

199 UDMA_CRC_Error_Count 0x003e 099 099 000 Old_age Always - 210476

200 Multi_Zone_Error_Rate 0x000a 100 100 000 Old_age Always - 0

201 Soft_Read_Error_Rate 0x000a 100 100 000 Old_age Always - 0

SMART Error Log Version: 1

ATA Error Count: 29709 (device log contains only the most recent five errors)

CR = Command Register [HEX]

FR = Features Register [HEX]

SC = Sector Count Register [HEX]

SN = Sector Number Register [HEX]

CL = Cylinder Low Register [HEX]

CH = Cylinder High Register [HEX]

DH = Device/Head Register [HEX]

DC = Device Command Register [HEX]

ER = Error register [HEX]

ST = Status register [HEX]

Powered_Up_Time is measured from power on, and printed as

DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,

SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 29709 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)

When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:

ER ST SC SN CL CH DH

-- -- -- -- -- -- --

00 d0 00 00 00 00 a0 at LBA = 0x00000000 = 0

Commands leading to the command that caused the error were:

CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

-- -- -- -- -- -- -- -- ---------------- --------------------

c8 00 98 f7 5d 00 e0 00 1d+00:19:26.070 READ DMA

ec 00 00 00 00 00 a0 02 1d+00:19:26.000 IDENTIFY DEVICE

ef 03 42 00 00 00 a0 02 1d+00:19:26.000 SET FEATURES [set transfer mode]

ec 00 00 00 00 00 a0 02 1d+00:19:25.980 IDENTIFY DEVICE

00 00 01 01 00 00 a0 ff 1d+00:19:25.650 NOP [Abort queued commands]

Error 29708 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)

When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:

ER ST SC SN CL CH DH

-- -- -- -- -- -- --

00 d0 00 00 00 00 a0 at LBA = 0x00000000 = 0

Commands leading to the command that caused the error were:

CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

-- -- -- -- -- -- -- -- ---------------- --------------------

c8 00 58 a7 32 00 e0 00 1d+00:07:43.590 READ DMA

ec 00 00 00 00 00 a0 02 1d+00:07:43.530 IDENTIFY DEVICE

ef 03 42 00 00 00 a0 02 1d+00:07:43.530 SET FEATURES [set transfer mode]

ec 00 00 00 00 00 a0 02 1d+00:07:43.510 IDENTIFY DEVICE

00 00 01 01 00 00 a0 ff 1d+00:07:43.180 NOP [Abort queued commands]

Error 29707 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)

When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:

ER ST SC SN CL CH DH

-- -- -- -- -- -- --

00 d0 00 00 00 00 a0 at LBA = 0x00000000 = 0

Commands leading to the command that caused the error were:

CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

-- -- -- -- -- -- -- -- ---------------- --------------------

c8 00 b8 2f 06 00 e0 00 23:52:03.030 READ DMA

ec 00 00 00 00 00 a0 02 23:52:02.870 IDENTIFY DEVICE

ef 03 42 00 00 00 a0 02 23:52:02.870 SET FEATURES [set transfer mode]

ec 00 00 00 00 00 a0 02 23:52:02.850 IDENTIFY DEVICE

00 00 01 01 00 00 a0 ff 23:52:02.530 NOP [Abort queued commands]

Error 29706 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)

When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:

ER ST SC SN CL CH DH

-- -- -- -- -- -- --

00 d0 00 00 00 00 a0 at LBA = 0x00000000 = 0

Commands leading to the command that caused the error were:

CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

-- -- -- -- -- -- -- -- ---------------- --------------------

c8 00 48 5f 95 d5 e8 00 23:49:22.920 READ DMA

ec 00 00 00 00 00 a0 02 23:49:22.850 IDENTIFY DEVICE

ef 03 42 00 00 00 a0 02 23:49:22.850 SET FEATURES [set transfer mode]

ec 00 00 00 00 00 a0 02 23:49:22.830 IDENTIFY DEVICE

00 00 01 01 00 00 a0 ff 23:49:22.500 NOP [Abort queued commands]

Error 29705 occurred at disk power-on lifetime: 0 hours (0 days + 0 hours)

When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:

ER ST SC SN CL CH DH

-- -- -- -- -- -- --

00 d0 00 00 00 00 a0 at LBA = 0x00000000 = 0

Commands leading to the command that caused the error were:

CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name

-- -- -- -- -- -- -- -- ---------------- --------------------

c8 00 88 d7 32 ec e7 00 23:48:14.050 READ DMA

c8 00 08 07 2e ec e7 00 23:48:14.050 READ DMA

c8 00 08 af 28 ec e7 00 23:48:14.030 READ DMA

ec 00 00 00 00 00 a0 02 23:48:14.000 IDENTIFY DEVICE

ef 03 42 00 00 00 a0 02 23:48:14.000 SET FEATURES [set transfer mode]

SMART Self-test log structure revision number 1

No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1

SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS

1 0 0 Not_testing

2 0 0 Not_testing

3 0 0 Not_testing

4 0 0 Not_testing

5 0 0 Not_testing

Selective self-test flags (0x0):

After scanning selected spans, do NOT read-scan remainder of disk.

If Selective self-test is pending on power-up, resume after 0 minute delay.

I am uncertain what all this means - as I can't see any definite error threshold being reached (ie a S.M.A.R.T failure). The parity check is going to take a week.
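One way to read the attribute table is to separate "has any attribute crossed its threshold" from "which raw counters are non-zero". The Python sketch below does that for three rows copied from the output above. The column positions are assumed from the header line, and `parse_attributes` is a hypothetical helper name. UDMA_CRC_Error_Count counts CRC errors on the link between drive and controller, so a large raw value there fits the BadCRC messages in the syslog and points at cabling rather than the platters.

```python
# Three attribute rows copied from the smartctl output above.
ATTRIBUTES = """\
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
199 UDMA_CRC_Error_Count 0x003e 099 099 000 Old_age Always - 210476
"""

def parse_attributes(text):
    """Return {name: (value, worst, thresh, raw)} for each attribute row."""
    rows = {}
    for line in text.splitlines():
        f = line.split()
        if len(f) < 10 or not f[0].isdigit():
            continue  # skip headers and blank lines
        rows[f[1]] = (int(f[3]), int(f[4]), int(f[5]), int(f[9]))
    return rows

attrs = parse_attributes(ATTRIBUTES)

# An attribute counts as failed when VALUE has dropped to THRESH or below
# (attributes with THRESH 0 never fail this way).
failed = [n for n, (value, _, thresh, _) in attrs.items()
          if 0 < thresh and value <= thresh]

# Raw counters worth a look even though no threshold has tripped.
nonzero_raw = {n: raw for n, (_, _, _, raw) in attrs.items() if raw > 0}

print("failed attributes:", failed)
print("non-zero raw counters:", nonzero_raw)
```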

Can I stop the parity check? Can I just switch off (halt) the PC and check the cabling as suggested, without losing the ability to recover the missing content should the drive truly be failing?

Any helpful suggestions would be appreciated :- )

Thanks

belorion

September 2, 2010

It sounds like the monthly parity check you had scheduled just kicked off.

Yes, you can cancel the parity check, then "Stop" the array and power down and re-seat the cable.

Joe L.

September 2, 2010

Author

Thanks muchly joe :-)

I managed to stop the check (properly as you described). I will attempt to verify the connections are AOK as you suggested ... if that doesn't work I will get a new drive and attempt to rebuild as the bad drive is slowing the whole system down (that drive seems to have content on it from most of my shares - as I was kinda keen to get better utilisation of my drives - be careful what you wish for eh ;-) ).

ciao

belorion

Disk Model / Serial No. Temperature Size Free Reads Writes Errors

parity ata-WDC_WD20EADS-00R6B0_WD-WCAVY0100163 35°C 1,953,514,552 - 926,902 121,132 0

disk1 ata-ST31500341AS_9VS20M4T 30°C 1,465,137,496 39,141,872 823,496 1,769 0

disk2 ata-ST31500341AS_9VS21EV7 28°C 1,465,138,552 16,358,236 835,429 1,668 0

disk3 ata-ST31500341AS_9VS20RC4 32°C 1,465,138,552 155,094,232 855,699 10,407 0

disk4 ata-ST31500341AS_9VS21B3M 31°C 1,465,138,552 137,293,888 600,034 1,506 0

disk5 ata-WDC_WD15EADS-00R6B0_WD-WCAVY0489509 34°C 1,465,138,552 189,000,848 588,145 5,857 0

disk6 ata-SAMSUNG_HD154UI_S1XWJ1LSB07244 18°C 1,465,138,552 109,192,140 793,351 8,878 0

disk7 ata-SAMSUNG_HD154UI_S1XWJ1LSB07242 18°C 1,465,138,552 272,431,656 762,015 43,875 45,672

disk8 ata-WDC_WD20EARS-00J2GB0_WD-WCAYY0017128 25°C 1,953,514,552 265,131,504 794,480 25,537 0

disk9 ata-WDC_WD20EARS-00J2GB0_WD-WCAYY0021248 26°C 1,953,514,552 266,355,644 748,356 16,901 0

September 5, 2010

Author

Just to close this off.

I stopped the system and checked the drive cables - the drive in question was connected to a loose connector. I restarted and have just completed a parity check.

Everything is AOK :-)

Thanks muchly

belorion
