samukas Posted January 4, 2011

Yesterday I added 3 new HDDs (after preclearing them) to my array, for a total of 14 disks now, and I left two "move" processes running overnight through telnet sessions. Apparently there have been some issues:

- One of the processes is still running, but really slowly. It has already taken 10 hours or more to move just 1 TB internally; I don't remember it being that slow.
- The other process just stopped moving overnight, with no warning whatsoever in the telnet window.

Here is what I found in the syslog. Can somebody please explain what it means?

Jan 4 02:57:02 Tower kernel: ata10: exception Emask 0x10 SAct 0x0 SErr 0x180000 action 0x6 frozen
Jan 4 02:57:02 Tower kernel: ata10: edma_err_cause=00000020 pp_flags=00000000, SError=00180000
Jan 4 02:57:02 Tower kernel: ata10: SError: { 10B8B Dispar }
Jan 4 02:57:02 Tower kernel: ata10: hard resetting link
Jan 4 02:57:02 Tower kernel: ata11: exception Emask 0x10 SAct 0x0 SErr 0x180000 action 0x6 frozen
Jan 4 02:57:02 Tower kernel: ata11: edma_err_cause=00000020 pp_flags=00000000, SError=00180000
Jan 4 02:57:02 Tower kernel: ata11: SError: { 10B8B Dispar }
Jan 4 02:57:02 Tower kernel: ata11: hard resetting link
Jan 4 02:57:02 Tower kernel: ata13: exception Emask 0x10 SAct 0x0 SErr 0x180000 action 0x6 frozen
Jan 4 02:57:02 Tower kernel: ata13: edma_err_cause=00000020 pp_flags=00000000, SError=00180000
Jan 4 02:57:02 Tower kernel: ata13: SError: { 10B8B Dispar }
Jan 4 02:57:02 Tower kernel: ata13: hard resetting link
Jan 4 02:57:02 Tower kernel: ata13: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan 4 02:57:02 Tower kernel: ata10: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan 4 02:57:02 Tower kernel: ata11: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Jan 4 02:57:02 Tower kernel: ata11.00: configured for UDMA/133
Jan 4 02:57:02 Tower kernel: ata11: EH complete
Jan 4 02:57:02 Tower kernel: ata10.00: configured for UDMA/133
Jan 4 02:57:02 Tower kernel: ata10: EH complete
Jan 4 02:57:02 Tower kernel: ata13.00: configured for UDMA/133
Jan 4 02:57:02 Tower kernel: ata13: EH complete

Thanks in advance!
samukas Posted January 4, 2011

Another thing I saw. What do these errors mean exactly?

Jan 4 13:00:16 Tower kernel: ata15: exception Emask 0x10 SAct 0x0 SErr 0x80000 action 0xe frozen (Errors)
Jan 4 13:00:16 Tower kernel: ata15: irq_stat 0x01100010, PHY RDY changed (Drive related)
Jan 4 13:00:16 Tower kernel: ata15: SError: { 10B8B } (Errors)
Jan 4 13:00:16 Tower kernel: ata15: hard resetting link (Minor Issues)
Jan 4 13:00:18 Tower kernel: ata15: SATA link down (SStatus 0 SControl 0) (Drive related)
Jan 4 13:00:23 Tower kernel: ata15: hard resetting link (Minor Issues)
Jan 4 13:00:25 Tower kernel: ata15: SATA link down (SStatus 0 SControl 0) (Drive related)
Jan 4 13:00:25 Tower kernel: ata15: limiting SATA link speed to 1.5 Gbps (Drive related)
Jan 4 13:00:30 Tower kernel: ata15: hard resetting link (Minor Issues)
Jan 4 13:00:32 Tower kernel: ata15: SATA link down (SStatus 0 SControl 10) (Drive related)
Jan 4 13:00:32 Tower kernel: ata15.00: disabled (Errors)
Jan 4 13:00:32 Tower kernel: ata15: EH complete (Drive related)
Jan 4 13:00:32 Tower kernel: ata15.00: detaching (SCSI 15:0:0:0) (Drive related)
Jan 4 13:00:32 Tower kernel: sd 15:0:0:0: [sdp] Synchronizing SCSI cache (Drive related)
Jan 4 13:00:32 Tower kernel: sd 15:0:0:0: [sdp] Result: hostbyte=0x04 driverbyte=0x00 (System)
Jan 4 13:00:32 Tower kernel: sd 15:0:0:0: [sdp] Stopping disk (Drive related)
Jan 4 13:00:32 Tower kernel: sd 15:0:0:0: [sdp] START_STOP FAILED (Drive related)
Jan 4 13:00:32 Tower kernel: sd 15:0:0:0: [sdp] Result: hostbyte=0x04 driverbyte=0x00 (System)
Joe L. Posted January 4, 2011

Disk /dev/sdp has stopped communicating with the disk controller, and the link is being reset in an attempt to restore communications.

It could be anything: a bad or loose cable to the drive, a bad drive, or a power supply unable to keep up with the additional disks just added.
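For anyone reading along, the SErr values in the logs in this thread can be decoded by hand. The sketch below is only an illustration: the bit positions and short names are assumed from the Linux libata SERR_* definitions as I recall them, so check them against your kernel source before relying on them. It reproduces the flag names the kernel already printed (0x180000 gives 10B8B and Dispar, 0x80000 gives 10B8B). Both flags are link-level encoding errors, which would be consistent with a cable or connection problem, though they do not prove it.

```python
# Assumed mapping of SError diagnostic bits (16-26) to the short names
# libata prints in "SError: { ... }" lines. Verify against your kernel source.
SERROR_BITS = {
    16: "PHYRdyChg",   # PHY ready state changed
    17: "PHYInt",      # PHY internal error
    18: "CommWake",    # COMWAKE detected
    19: "10B8B",       # 10b-to-8b decode error
    20: "Dispar",      # disparity error
    21: "BadCRC",      # CRC error
    22: "Handshk",     # handshake error
    23: "LinkSeq",     # link sequence error
    24: "TrStaTrns",   # transport state transition error
    25: "UnrecFIS",    # unrecognized FIS
    26: "DevExch",     # device exchanged
}


def decode_serror(value):
    """Return the names of the diagnostic bits set in an SErr value."""
    return [name for bit, name in sorted(SERROR_BITS.items())
            if value & (1 << bit)]


if __name__ == "__main__":
    # Values taken from the two syslog excerpts in this thread.
    print(hex(0x180000), decode_serror(0x180000))
    print(hex(0x80000), decode_serror(0x80000))
```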
samukas Posted January 4, 2011

Thanks Joe L! I didn't notice that in the second log the device name was there. What about the first log, the overnight one... How can I tell which disk is which?
Joe L. Posted January 4, 2011

Usually ata15 ends up assigned as "sd 15:0:0:0", and further back in the log you can find where that "sd" entry is associated with a /dev/sdX device.

Here is an example from my syslog for ata2:

Dec 2 18:24:09 Tower2 kernel: ata2.00: ATA-8: Hitachi HDS722020ALA330, JKAOA3EA, max UDMA/133
Dec 2 18:24:09 Tower2 kernel: ata2.00: 3907029168 sectors, multi 16: LBA48 NCQ (depth 0/32)
Dec 2 18:24:09 Tower2 kernel: ata2.00: configured for UDMA/100
Dec 2 18:24:09 Tower2 kernel: scsi 2:0:0:0: Direct-Access ATA Hitachi HDS72202 JKAO PQ: 0 ANSI: 5
Dec 2 18:24:09 Tower2 kernel: sd 2:0:0:0: [sdb] 3907029168 512-byte logical blocks: (2.00 TB/1.81 TiB)
Dec 2 18:24:09 Tower2 kernel: sd 2:0:0:0: [sdb] Write Protect is off
Dec 2 18:24:09 Tower2 kernel: sd 2:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Dec 2 18:24:09 Tower2 kernel: sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Dec 2 18:24:09 Tower2 kernel: sdb: sdb1
Dec 2 18:24:09 Tower2 kernel: sd 2:0:0:0: [sdb] Attached SCSI disk
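The lookup described above can be automated with a few lines of Python. This is a rough sketch, not a definitive tool: the function name map_ata_to_dev is made up for illustration, and it relies on the rule of thumb from this thread that ataN shows up as "sd N:0:0:0", which is "usually" true and may not hold on every controller or kernel.

```python
import re

# Matches lines such as "... kernel: sd 2:0:0:0: [sdb] Attached SCSI disk"
# and captures the host number (2) and the device name (sdb).
SD_LINE = re.compile(r"\bsd (\d+):0:0:0: \[(sd[a-z]+)\]")


def map_ata_to_dev(log_lines):
    """Return {N: '/dev/sdX'} for every "sd N:0:0:0: [sdX]" line found.

    Assumes ataN corresponds to SCSI host N, which is only a rule of thumb.
    """
    mapping = {}
    for line in log_lines:
        match = SD_LINE.search(line)
        if match:
            mapping[int(match.group(1))] = "/dev/" + match.group(2)
    return mapping


if __name__ == "__main__":
    # Two lines copied from the syslog excerpts in this thread.
    sample = [
        "Dec 2 18:24:09 Tower2 kernel: sd 2:0:0:0: [sdb] Attached SCSI disk",
        "Jan 4 13:00:32 Tower kernel: sd 15:0:0:0: [sdp] Stopping disk",
    ]
    for port, dev in sorted(map_ata_to_dev(sample).items()):
        print("ata%d -> %s" % (port, dev))
```

To run it against a real log, pass the lines of your syslog file (for example, an open file object) instead of the sample list, then look up ata10, ata11 and ata13 in the result.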
samukas Posted January 7, 2011

I replaced some SATA cables for the array drives and it seems the issue is gone, at least for now. My external dock, which I use to copy files over, also seems to be having issues, so I suppose the eSATA cable needs replacing as well.

Thanks for the help, Joe L!