HDDs Locked up



Yesterday I added 3 new HDDs (after preclearing them) to my array, for a total of 14 disks now, and I left two "move" processes running overnight through telnet sessions. Apparently there have been some issues.

- One of the processes is still running, but very slowly: it has already taken 10 hours or more to move just 1TB internally. I don't remember it being that slow.

- The other process just stopped moving overnight, with no warning whatsoever in the telnet window.

Here is what I found in the syslog. Can somebody please explain what it is?

Jan  4 02:57:02 Tower kernel: ata10: exception Emask 0x10 SAct 0x0 SErr 0x180000 action 0x6 frozen
Jan  4 02:57:02 Tower kernel: ata10: edma_err_cause=00000020 pp_flags=00000000, SError=00180000
Jan  4 02:57:02 Tower kernel: ata10: SError: { 10B8B Dispar }
Jan  4 02:57:02 Tower kernel: ata10: hard resetting link
Jan  4 02:57:02 Tower kernel: ata11: exception Emask 0x10 SAct 0x0 SErr 0x180000 action 0x6 frozen
Jan  4 02:57:02 Tower kernel: ata11: edma_err_cause=00000020 pp_flags=00000000, SError=00180000
Jan  4 02:57:02 Tower kernel: ata11: SError: { 10B8B Dispar }
Jan  4 02:57:02 Tower kernel: ata11: hard resetting link
Jan  4 02:57:02 Tower kernel: ata13: exception Emask 0x10 SAct 0x0 SErr 0x180000 action 0x6 frozen
Jan  4 02:57:02 Tower kernel: ata13: edma_err_cause=00000020 pp_flags=00000000, SError=00180000
Jan  4 02:57:02 Tower kernel: ata13: SError: { 10B8B Dispar }
Jan  4 02:57:02 Tower kernel: ata13: hard resetting link
Jan  4 02:57:02 Tower kernel: ata13: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan  4 02:57:02 Tower kernel: ata10: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan  4 02:57:02 Tower kernel: ata11: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan  4 02:57:02 Tower kernel: ata11.00: configured for UDMA/133
Jan  4 02:57:02 Tower kernel: ata11: EH complete
Jan  4 02:57:02 Tower kernel: ata10.00: configured for UDMA/133
Jan  4 02:57:02 Tower kernel: ata10: EH complete
Jan  4 02:57:02 Tower kernel: ata13.00: configured for UDMA/133
Jan  4 02:57:02 Tower kernel: ata13: EH complete

 

Thanks in advance!

Link to comment

Another thing I saw. What do these errors mean exactly?

 

Jan  4 13:00:16 Tower kernel: ata15: exception Emask 0x10 SAct 0x0 SErr 0x80000 action 0xe frozen (Errors)
Jan  4 13:00:16 Tower kernel: ata15: irq_stat 0x01100010, PHY RDY changed (Drive related)
Jan  4 13:00:16 Tower kernel: ata15: SError: { 10B8B } (Errors)
Jan  4 13:00:16 Tower kernel: ata15: hard resetting link (Minor Issues)
Jan  4 13:00:18 Tower kernel: ata15: SATA link down (SStatus 0 SControl 0) (Drive related)
Jan  4 13:00:23 Tower kernel: ata15: hard resetting link (Minor Issues)
Jan  4 13:00:25 Tower kernel: ata15: SATA link down (SStatus 0 SControl 0) (Drive related)
Jan  4 13:00:25 Tower kernel: ata15: limiting SATA link speed to 1.5 Gbps (Drive related)
Jan  4 13:00:30 Tower kernel: ata15: hard resetting link (Minor Issues)
Jan  4 13:00:32 Tower kernel: ata15: SATA link down (SStatus 0 SControl 10) (Drive related)
Jan  4 13:00:32 Tower kernel: ata15.00: disabled (Errors)
Jan  4 13:00:32 Tower kernel: ata15: EH complete (Drive related)
Jan  4 13:00:32 Tower kernel: ata15.00: detaching (SCSI 15:0:0:0) (Drive related)
Jan  4 13:00:32 Tower kernel: sd 15:0:0:0: [sdp] Synchronizing SCSI cache (Drive related)
Jan  4 13:00:32 Tower kernel: sd 15:0:0:0: [sdp] Result: hostbyte=0x04 driverbyte=0x00 (System)
Jan  4 13:00:32 Tower kernel: sd 15:0:0:0: [sdp] Stopping disk (Drive related)
Jan  4 13:00:32 Tower kernel: sd 15:0:0:0: [sdp] START_STOP FAILED (Drive related)
Jan  4 13:00:32 Tower kernel: sd 15:0:0:0: [sdp] Result: hostbyte=0x04 driverbyte=0x00 (System)

Link to comment

Disk /dev/sdp has stopped communicating with the disk controller, and the link is being reset in an attempt to restore communication.

It could be anything: a bad or loose cable to the drive, a bad drive, or a power supply unable to keep up with the additional disks just added.
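
If it helps to read the SErr values in those lines, here is a rough Python sketch that decodes them bit by bit. The table is partial, and the bit positions are an assumption recalled from the kernel's libata definitions rather than anything taken from this log, so check them against your kernel source before relying on it. The function name decode_serr is just made up for the illustration.

```python
# Partial table of SError diagnostic bits, using the short names the
# kernel prints in "SError: { ... }" lines. Assumption: bit positions
# recalled from the kernel's libata definitions; verify for your kernel.
SERR_BITS = {
    16: ("PHYRdyChg", "PHY ready state changed"),
    19: ("10B8B", "10-bit to 8-bit decode error on the link"),
    20: ("Dispar", "running disparity error on the link"),
    21: ("BadCRC", "CRC error on a received frame"),
    22: ("Handshk", "handshake error"),
}


def decode_serr(value):
    """Return the short names of the known bits set in an SErr value."""
    names = []
    for bit in sorted(SERR_BITS):
        if value & (1 << bit):
            names.append(SERR_BITS[bit][0])
    return names


print(decode_serr(0x180000))  # SErr value from the ata10/ata11/ata13 lines
print(decode_serr(0x80000))   # SErr value from the ata15 lines
```

The decoded names line up with what the kernel printed in braces in both logs (10B8B and Dispar overnight, 10B8B alone for ata15). Both are link-level signalling errors, which fits the cable or power explanation but does not rule out the drive itself.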

Link to comment

Thanks Joe L! I didn't notice that in the second log the device name was there. What about the first log, the overnight one... How can I tell which disk is which?

Usually ata15 ends up assigned as "sd 15:0:0:0", and further back in the log you can find where that "sd" entry is associated with a /dev/sdX device.

Here is an example from my syslog for ata2:

Dec  2 18:24:09 Tower2 kernel: ata2.00: ATA-8: Hitachi HDS722020ALA330, JKAOA3EA, max UDMA/133
Dec  2 18:24:09 Tower2 kernel: ata2.00: 3907029168 sectors, multi 16: LBA48 NCQ (depth 0/32)
Dec  2 18:24:09 Tower2 kernel: ata2.00: configured for UDMA/100
Dec  2 18:24:09 Tower2 kernel: scsi 2:0:0:0: Direct-Access    ATA      Hitachi HDS72202 JKAO PQ: 0 ANSI: 5
Dec  2 18:24:09 Tower2 kernel: sd 2:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
Dec  2 18:24:09 Tower2 kernel: sd 2:0:0:0: [sdb] Write Protect is off
Dec  2 18:24:09 Tower2 kernel: sd 2:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Dec  2 18:24:09 Tower2 kernel: sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec  2 18:24:09 Tower2 kernel:  sdb: sdb1
Dec  2 18:24:09 Tower2 kernel: sd 2:0:0:0: [sdb] Attached SCSI disk
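
If you would rather not scroll back through the log by hand, the same lookup can be sketched in a few lines of Python. This is only an illustration under the assumption described above, that SCSI host N belongs to port ataN, which is usual but not guaranteed. The function name map_ata_ports is made up for the example, and the two sample lines are copied from my syslog excerpt.

```python
import re

# "sd H:C:T:L: [sdX]" lines tie a SCSI host number to a block device.
SD_LINE = re.compile(r"sd (\d+):\d+:\d+:\d+: \[(sd[a-z]+)\]")
# "ataN.00: ATA-8: <model>, <firmware>, ..." lines give the drive model per port.
ATA_LINE = re.compile(r"(ata\d+)\.\d+: ATA-\d+: ([^,]+),")


def map_ata_ports(lines):
    """Return {ata port: {"model": ..., "device": ...}} built from syslog lines."""
    ports = {}
    for line in lines:
        m = ATA_LINE.search(line)
        if m:
            ports.setdefault(m.group(1), {})["model"] = m.group(2).strip()
            continue
        m = SD_LINE.search(line)
        if m:
            # Assumption: SCSI host N corresponds to port ataN.
            port = "ata" + m.group(1)
            ports.setdefault(port, {})["device"] = "/dev/" + m.group(2)
    return ports


sample = [
    "Dec  2 18:24:09 Tower2 kernel: ata2.00: ATA-8: Hitachi HDS722020ALA330, JKAOA3EA, max UDMA/133",
    "Dec  2 18:24:09 Tower2 kernel: sd 2:0:0:0: [sdb] Attached SCSI disk",
]
print(map_ata_ports(sample))
```

To use it on a real log, pass an open file instead of the sample list, then look up ata10, ata11 and ata13 in the result. Check the model string against the label on the physical drive before pulling any cables.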

 

Link to comment
