musicmann Posted January 2, 2011

It seems I lost power at some point while I was away for the Christmas holidays, and when I returned, my machine was running a parity sync. The speed was so slow that I decided to stop it, reboot, and start another parity check. It has been running for an hour, and the speed is about 142 KB/sec. Something is definitely wrong, and I see a lot of red lines in the unMenu-Syslog view. However, I'm lost reading it and can't tell whether this is a single-disk issue, a multiple-disk issue, a controller issue, etc. Any help would be greatly appreciated.

Attachment: syslog-2011-01-02.zip
Joe L. Posted January 2, 2011

Looks like one disk, /dev/sdi:

Jan 2 10:50:26 Tower kernel: md: import disk4: [8,128] (sdi) ST32000542AS 5XW1MCRN offset: 63 size: 1953514552

The Linux OS keeps trying to reset the communications to the disk. Could be the disk, the cable, the controller port, or may just need to be cleanly power cycled.

Joe L.

Jan 2 10:57:46 Tower kernel: sd 9:0:0:0: [sdi] 3907029168 512-byte hardware sectors (2000399 MB)
Jan 2 10:57:46 Tower kernel: sd 9:0:0:0: [sdi] Write Protect is off
Jan 2 10:57:46 Tower kernel: sd 9:0:0:0: [sdi] Mode Sense: 00 3a 00 00
Jan 2 10:57:46 Tower kernel: sd 9:0:0:0: [sdi] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jan 2 10:58:41 Tower kernel: ata9: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0xe frozen
Jan 2 10:58:41 Tower kernel: ata9: irq_stat 0x00400040, connection status changed
Jan 2 10:58:41 Tower kernel: ata9: SError: { PHYRdyChg DevExch }
Jan 2 10:58:41 Tower kernel: ata9: hard resetting link
Jan 2 10:58:48 Tower kernel: ata9: link is slow to respond, please be patient (ready=0)
Jan 2 10:58:51 Tower kernel: ata9: softreset failed (device not ready)
Jan 2 10:58:51 Tower kernel: ata9: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jan 2 10:58:56 Tower kernel: ata9.00: qc timeout (cmd 0xec)
Jan 2 10:58:56 Tower kernel: ata9.00: failed to IDENTIFY (I/O error, err_mask=0x4)
Jan 2 10:58:56 Tower kernel: ata9.00: revalidation failed (errno=-5)
Jan 2 10:58:56 Tower kernel: ata9: hard resetting link
Jan 2 10:58:56 Tower kernel: ata9: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jan 2 10:58:56 Tower kernel: ata9.00: configured for UDMA/133
Jan 2 10:58:56 Tower kernel: ata9: EH complete
Jan 2 10:58:56 Tower kernel: sd 9:0:0:0: [sdi] 3907029168 512-byte hardware sectors (2000399 MB)
Jan 2 10:58:56 Tower kernel: sd 9:0:0:0: [sdi] Write Protect is off
Jan 2 10:58:56 Tower kernel: sd 9:0:0:0: [sdi] Mode Sense: 00 3a 00 00
Jan 2 10:58:56 Tower kernel: sd 9:0:0:0: [sdi] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Jan 2 10:59:05 Tower kernel: ata9: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0xe frozen
Jan 2 10:59:05 Tower kernel: ata9: irq_stat 0x00400040, connection status changed
Jan 2 10:59:05 Tower kernel: ata9: SError: { PHYRdyChg DevExch }
Jan 2 10:59:05 Tower kernel: ata9: hard resetting link
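For anyone else lost in a syslog like this one, a rough way to see whether the resets cluster on a single port is to count the trouble messages per ataN port. The following is only a minimal sketch: the function name `count_ata_trouble` and the list of marker strings are assumptions drawn from the excerpt in this thread, not a complete catalogue of kernel ATA messages, and the sample lines are copied from the log above.

```python
import re
from collections import Counter

# Matches kernel lines such as "... kernel: ata9: hard resetting link"
# or "... kernel: ata9.00: failed to IDENTIFY (...)".
ATA_LINE = re.compile(r"kernel: (ata\d+)(?:\.\d+)?: (.*)")

# Substrings treated as trouble signs. This list is an assumption taken
# from the log excerpt in this thread, not an exhaustive set.
TROUBLE = (
    "exception Emask",
    "hard resetting link",
    "softreset failed",
    "failed to IDENTIFY",
    "revalidation failed",
    "qc timeout",
)


def count_ata_trouble(lines):
    """Return {port: Counter({marker: count})} for lines containing a marker."""
    ports = {}
    for line in lines:
        match = ATA_LINE.search(line)
        if not match:
            continue  # not an ataN kernel message
        port, message = match.groups()
        for marker in TROUBLE:
            if marker in message:
                ports.setdefault(port, Counter())[marker] += 1
    return ports


# Sample lines copied from the syslog excerpt above.
sample = [
    "Jan 2 10:58:41 Tower kernel: ata9: exception Emask 0x10 SAct 0x0 SErr 0x4010000 action 0xe frozen",
    "Jan 2 10:58:41 Tower kernel: ata9: hard resetting link",
    "Jan 2 10:58:51 Tower kernel: ata9: softreset failed (device not ready)",
    "Jan 2 10:58:56 Tower kernel: ata9.00: failed to IDENTIFY (I/O error, err_mask=0x4)",
    "Jan 2 10:58:56 Tower kernel: ata9: hard resetting link",
    "Jan 2 10:58:56 Tower kernel: sd 9:0:0:0: [sdi] Write Protect is off",
]

for port, counts in count_ata_trouble(sample).items():
    print(port, dict(counts))
```

If every hit lands on one port, as it does for ata9 here, that points at a single disk, cable, or controller port rather than a system-wide problem. It does not by itself say which of those three is at fault.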
musicmann Posted January 4, 2011 (Author)

Thanks for the advice, Joe. I let the parity check continue, and that disk eventually showed up as "disabled." This is the second time recently that I've had a problem with a disk on this port. In fact, this drive was added only 2 or 3 weeks ago. Given this history, I think it might be the controller port and not the disk. However, since this disk was new and not burned in, I'm not willing to trust a rebuild that simply moves this disk to a different port.

Here's what I'm thinking of doing. I'll pull Disk 4, put it in another system, and run a long SMART test on it. If it passes without issues, I'll reinstall it in my unRAID on a different controller port and rebuild the data as if it were a new disk. If it has any issues, I'll replace it with a new drive and rebuild the data (again on a different port). What do you think?

Additionally, I think I will test the old disk that was on port 4 to see if it shows any errors. In fact, I also had one "disappear" on port 5 previously, and maybe I should test it too. The replacement for the one on port 5 hasn't shown any issues, though.
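The long SMART test mentioned in the plan above is usually run with smartctl from the smartmontools package. The sketch below only builds the command lines and does not execute anything; the helper name `smart_commands` and the `/dev/sdX` placeholder are assumptions for illustration, and the real device name depends on where the disk shows up in the test system.

```python
import shlex


def smart_commands(device):
    """Build smartctl command lines for a long self-test and its follow-up.

    The commands are only constructed here, never executed.
    """
    return {
        # Start the extended (long) self-test; it runs inside the drive.
        "start_long_test": ["smartctl", "-t", "long", device],
        # Show the self-test log once the test has had time to finish.
        "selftest_log": ["smartctl", "-l", "selftest", device],
        # Print all SMART information, including the attribute table.
        "full_report": ["smartctl", "-a", device],
    }


# /dev/sdX is a placeholder, not the device name from this thread.
for name, argv in smart_commands("/dev/sdX").items():
    print(f"{name}: {shlex.join(argv)}")
```

The long test can take several hours on a 2 TB drive, so the self-test log is worth checking only after the estimated completion time that smartctl reports when the test starts.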