armbrust Posted December 15, 2011

unRAID 4.7

I have a Seagate Barracuda Green drive (model number ST2000DL003) that has started acting up. It is my parity drive. It simply disappears from the system, and parity is invalidated. If I restart the system (hard power-off), the drive returns. The SMART test shows everything as normal. I've also tried different power and SATA cables and different SATA ports. The last time this happened I restarted and did a parity rebuild. It finished fine, but shortly afterwards the drive dropped off again. Here is a section of the system log from after the parity rebuild finished. Any ideas?

Dec 15 15:08:19 Tower kernel: md: sync done. time=46700sec rate=41831K/sec
Dec 15 15:08:19 Tower kernel: md: recovery thread sync completion status: 0
Dec 15 15:23:34 Tower in.telnetd[14529]: connect from 192.168.1.2 (192.168.1.2)
Dec 15 15:23:40 Tower login[14530]: ROOT LOGIN on `pts/1' from `DD-WRT'
Dec 15 15:26:29 Tower kernel: ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 15 15:26:29 Tower kernel: ata6.00: failed command: SMART
Dec 15 15:26:29 Tower kernel: ata6.00: cmd b0/da:00:00:4f:c2/00:00:00:00:00/00 tag 0
Dec 15 15:26:29 Tower kernel: res 40/00:00:46:47:00/00:00:00:00:00/e0 Emask 0x4 (timeout)
Dec 15 15:26:29 Tower kernel: ata6.00: status: { DRDY }
Dec 15 15:26:29 Tower kernel: ata6: hard resetting link
Dec 15 15:26:35 Tower kernel: ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Dec 15 15:26:35 Tower kernel: ata6.00: link online but device misclassifed
Dec 15 15:26:40 Tower kernel: ata6.00: qc timeout (cmd 0xec)
Dec 15 15:26:40 Tower kernel: ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Dec 15 15:26:40 Tower kernel: ata6.00: revalidation failed (errno=-5)
Dec 15 15:26:40 Tower kernel: ata6: hard resetting link
Dec 15 15:26:45 Tower kernel: ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Dec 15 15:26:45 Tower kernel: ata6.00: link online but device misclassifed
Dec 15 15:26:47 Tower cache_dirs: ==============================================
Dec 15 15:26:47 Tower cache_dirs: command-args=
Dec 15 15:26:47 Tower cache_dirs: vfs_cache_pressure=10
Dec 15 15:26:47 Tower cache_dirs: max_seconds=10, min_seconds=1
Dec 15 15:26:47 Tower cache_dirs: max_depth=9999
Dec 15 15:26:47 Tower cache_dirs: command=find -noleaf
Dec 15 15:26:47 Tower cache_dirs: version=1.6.4
Dec 15 15:26:47 Tower cache_dirs: ---------- caching directories ---------------
Dec 15 15:26:47 Tower cache_dirs: Backup-ReadOnly
Dec 15 15:26:47 Tower cache_dirs: Backups
Dec 15 15:26:47 Tower cache_dirs: Downloads
Dec 15 15:26:47 Tower cache_dirs: Media
Dec 15 15:26:47 Tower cache_dirs: Others
Dec 15 15:26:47 Tower cache_dirs: Portable
Dec 15 15:26:47 Tower cache_dirs: Sage
Dec 15 15:26:47 Tower cache_dirs: ftp
Dec 15 15:26:47 Tower cache_dirs: mysql
Dec 15 15:26:47 Tower cache_dirs: sdf1
Dec 15 15:26:47 Tower cache_dirs: torrents
Dec 15 15:26:47 Tower cache_dirs: ----------------------------------------------
Dec 15 15:26:47 Tower cache_dirs: cache_dirs process ID 14961 started, To terminate it, type: cache_dirs -q
Dec 15 15:26:55 Tower kernel: ata6.00: qc timeout (cmd 0xec)
Dec 15 15:26:55 Tower kernel: ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Dec 15 15:26:55 Tower kernel: ata6.00: revalidation failed (errno=-5)
Dec 15 15:26:55 Tower kernel: ata6: limiting SATA link speed to 1.5 Gbps
Dec 15 15:26:55 Tower kernel: ata6: hard resetting link
Dec 15 15:27:01 Tower kernel: ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Dec 15 15:27:01 Tower kernel: ata6.00: link online but device misclassifed
Dec 15 15:27:31 Tower kernel: ata6.00: qc timeout (cmd 0xec)
Dec 15 15:27:31 Tower kernel: ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Dec 15 15:27:31 Tower kernel: ata6.00: revalidation failed (errno=-5)
Dec 15 15:27:31 Tower kernel: ata6.00: disabled
Dec 15 15:27:31 Tower kernel: ata6: exception Emask 0x40 SAct 0x0 SErr 0x800 action 0x6 frozen t4
Dec 15 15:27:31 Tower kernel: ata6: SError: { HostInt }
Dec 15 15:27:31 Tower kernel: ata6: hard resetting link
Dec 15 15:27:36 Tower kernel: ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Dec 15 15:27:36 Tower kernel: ata6.00: link online but device misclassifed
Dec 15 15:27:36 Tower kernel: ata6: EH complete
Dec 15 15:27:36 Tower kernel: sd 6:0:0:0: [sde] Unhandled error code
Dec 15 15:27:36 Tower kernel: sd 6:0:0:0: [sde] Result: hostbyte=0x04 driverbyte=0x00
Dec 15 15:27:36 Tower kernel: sd 6:0:0:0: [sde] CDB: cdb[0]=0x2a: 2a 00 59 e2 33 6f 00 00 08 00
Dec 15 15:27:36 Tower kernel: end_request: I/O error, dev sde, sector 1507996527
Dec 15 15:27:36 Tower kernel: md: disk0 write error
Dec 15 15:27:36 Tower kernel: handle_stripe write error: 1507996464/0, count: 1
Dec 15 15:27:36 Tower kernel: md: recovery thread woken up ...
Dec 15 15:27:36 Tower kernel: md: recovery thread has nothing to resync
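
For anyone trying to spot this pattern in a longer syslog, here is a rough Python sketch that counts the libata events seen in the log above (link resets, IDENTIFY failures, timeouts, speed downgrades, disables) per ATA port. The function name, the event labels, and the list of substrings are my own choices for illustration, picked from the messages in this excerpt; it is not an exhaustive list of kernel messages.

```python
import re
from collections import Counter

# Matches kernel libata lines such as:
#   "Dec 15 15:26:29 Tower kernel: ata6: hard resetting link"
#   "Dec 15 15:26:40 Tower kernel: ata6.00: qc timeout (cmd 0xec)"
# Group 1 is the port (e.g. "ata6"); the optional ".00" device suffix is dropped.
ATA_LINE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)$")

# Substring -> label. Hypothetical labels; substrings taken from the log excerpt.
EVENTS = {
    "hard resetting link": "hard_reset",
    "failed to IDENTIFY": "identify_failed",
    "qc timeout": "qc_timeout",
    "limiting SATA link speed": "speed_limited",
    "disabled": "disabled",
}


def summarize_ata_events(lines):
    """Return {port: Counter(label -> count)} for libata messages in syslog lines."""
    summary = {}
    for line in lines:
        match = ATA_LINE.search(line)
        if not match:
            continue  # not a libata kernel message (e.g. cache_dirs, telnetd)
        port, message = match.groups()
        for needle, label in EVENTS.items():
            if needle in message:
                summary.setdefault(port, Counter())[label] += 1
    return summary


if __name__ == "__main__":
    sample = [
        "Dec 15 15:26:29 Tower kernel: ata6: hard resetting link",
        "Dec 15 15:26:40 Tower kernel: ata6.00: qc timeout (cmd 0xec)",
        "Dec 15 15:26:40 Tower kernel: ata6.00: failed to IDENTIFY (I/O error, err_mask=0x4)",
        "Dec 15 15:27:31 Tower kernel: ata6.00: disabled",
    ]
    for port, counts in summarize_ata_events(sample).items():
        print(port, dict(counts))
```

To run it against a real log, pass an open file instead of the sample list, e.g. `summarize_ata_events(open("/var/log/syslog"))`; that path is an assumption and may differ on your system. A port that shows repeated resets followed by a "disabled" entry is the one to look at.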
RobJ Posted December 16, 2011

All I can do is confirm what you are seeing (the system suddenly loses contact with the drive completely and marks it disabled), and you have already tried what I would have recommended (try a different SATA port, replace the SATA and power cables, make sure the power supply is sufficient, check a SMART report). Sorry I can't be more helpful. This has got to be rather frustrating ...

The only other thing I can suggest (which you may have already tried) is to move the drive to a completely different controller, and to double-check its cable connections and their pins for any looseness or defects.
armbrust Posted December 19, 2011

Update: After the last rebuild, the drive *seems* to be stable - hasn't dropped out yet. These were the changes I made:
- Put the tunables back to default
- Added acpi=off libata.force=noncq to syslinux.cfg
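
For anyone wanting to try the same change, here is a sketch of how the boot entry might look afterwards. It assumes the stock unRAID syslinux.cfg layout (a `label unRAID OS` entry with `kernel bzimage` and `append initrd=bzroot`); that layout is an assumption, so check your own file. Only the two added parameters come from the post above.

```
label unRAID OS
  menu default
  kernel bzimage
  # assumed stock append line, with the two kernel parameters added at the end
  append initrd=bzroot acpi=off libata.force=noncq
```

`libata.force=noncq` turns off Native Command Queuing for libata devices, and `acpi=off` disables ACPI entirely. Since both were changed at once along with the tunables, this does not show which change (if any) made the difference.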