6.1.6 Very slow array performance


Jaco2k

Hello - I am currently seeing very slow performance on my array, but with 15 disks in it, it is a bit hard to say which one is the culprit.

 

The log I pulled out probably has the answer, but I have no idea what to make of it:

 

Dec 29 01:12:22 Tower kernel: ata13.00: device reported invalid CHS sector 0
Dec 29 01:12:22 Tower kernel: ata13: EH complete
Dec 29 01:13:00 Tower kernel: ata13.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 29 01:13:00 Tower kernel: ata13.00: failed command: WRITE DMA EXT
Dec 29 01:13:00 Tower kernel: ata13.00: cmd 35/00:40:90:f8:2c/00:05:77:00:00/e0 tag 24 dma 688128 out
Dec 29 01:13:00 Tower kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 29 01:13:00 Tower kernel: ata13.00: status: { DRDY }
Dec 29 01:13:00 Tower kernel: ata13: hard resetting link
Dec 29 01:13:01 Tower kernel: ata13: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Dec 29 01:13:01 Tower kernel: ata13.00: configured for UDMA/33
Dec 29 01:13:01 Tower kernel: ata13.00: device reported invalid CHS sector 0
Dec 29 01:13:01 Tower kernel: ata13: EH complete
Dec 29 01:13:36 Tower kernel: ata13.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 29 01:13:36 Tower kernel: ata13.00: failed command: READ DMA EXT
Dec 29 01:13:36 Tower kernel: ata13.00: cmd 25/00:40:a8:b9:32/00:05:77:00:00/e0 tag 7 dma 688128 in
Dec 29 01:13:36 Tower kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 29 01:13:36 Tower kernel: ata13.00: status: { DRDY }
Dec 29 01:13:36 Tower kernel: ata13: hard resetting link
Dec 29 01:13:37 Tower kernel: ata13: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Dec 29 01:13:37 Tower kernel: ata13.00: configured for UDMA/33
Dec 29 01:13:37 Tower kernel: ata13.00: device reported invalid CHS sector 0
Dec 29 01:13:37 Tower kernel: ata13: EH complete
Dec 29 01:14:14 Tower kernel: ata13.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 29 01:14:14 Tower kernel: ata13.00: failed command: WRITE DMA EXT
Dec 29 01:14:14 Tower kernel: ata13.00: cmd 35/00:40:c8:72:39/00:05:77:00:00/e0 tag 28 dma 688128 out
Dec 29 01:14:14 Tower kernel: res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 29 01:14:14 Tower kernel: ata13.00: status: { DRDY }
Dec 29 01:14:14 Tower kernel: ata13: hard resetting link
Dec 29 01:14:15 Tower kernel: ata13: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Dec 29 01:14:15 Tower kernel: ata13.00: configured for UDMA/33
Dec 29 01:14:15 Tower kernel: ata13: EH complete
Dec 29 01:14:59 Tower kernel: ata13.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 29 01:14:59 Tower kernel: ata13.00: failed command: READ DMA EXT
Dec 29 01:14:59 Tower kernel: ata13.00: cmd 25/00:00:80:99:0d/00:05:77:00:00/e0 tag 7 dma 655360 in
Dec 29 01:14:59 Tower kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 29 01:14:59 Tower kernel: ata13.00: status: { DRDY }
Dec 29 01:14:59 Tower kernel: ata13: hard resetting link
Dec 29 01:15:00 Tower kernel: ata13: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Dec 29 01:15:00 Tower kernel: ata13.00: configured for UDMA/33
Dec 29 01:15:00 Tower kernel: ata13.00: device reported invalid CHS sector 0
Dec 29 01:15:00 Tower kernel: ata13: EH complete

 

Any help?

 

Thank you all in advance.

 

Link to comment

Answering my own question - with the full logs I was able to determine exactly which disk it was. I switched it to another slot and it worked just fine, so the fault is one of:

- A SATA cable (most probably)

- The cage slot

- The controller

 

Since the disk and the controller are both new and the disk is working well now in another slot in the drive cage, I will need to find out which one is the faulty cable.

 

Oh, well...

 

Cheers and thanks anyway :)

Link to comment

Nope - problem is still there, and it is not affecting the same drive... *sigh*

 

Dec 29 02:14:36 Tower kernel: ata14: limiting SATA link speed to 1.5 Gbps
Dec 29 02:14:36 Tower kernel: ata14.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 29 02:14:36 Tower kernel: ata14.00: failed command: READ DMA EXT
Dec 29 02:14:36 Tower kernel: ata14.00: cmd 25/00:00:50:5b:4f/00:03:0f:00:00/e0 tag 4 dma 393216 in
Dec 29 02:14:36 Tower kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 29 02:14:36 Tower kernel: ata14.00: status: { DRDY }
Dec 29 02:14:36 Tower kernel: ata14: hard resetting link
Dec 29 02:14:37 Tower kernel: ata14: SATA link up 3.0 Gbps (SStatus 123 SControl 310)
Dec 29 02:14:37 Tower kernel: ata14.00: configured for UDMA/133
Dec 29 02:14:37 Tower kernel: ata14.00: device reported invalid CHS sector 0
Dec 29 02:14:37 Tower kernel: ata14: EH complete
Dec 29 02:15:08 Tower kernel: ata14.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 29 02:15:08 Tower kernel: ata14.00: failed command: READ DMA EXT
Dec 29 02:15:08 Tower kernel: ata14.00: cmd 25/00:00:50:8d:50/00:03:0f:00:00/e0 tag 24 dma 393216 in
Dec 29 02:15:08 Tower kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 29 02:15:08 Tower kernel: ata14.00: status: { DRDY }
Dec 29 02:15:08 Tower kernel: ata14: hard resetting link
Dec 29 02:15:09 Tower kernel: ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Dec 29 02:15:09 Tower kernel: ata14.00: configured for UDMA/133
Dec 29 02:15:09 Tower kernel: ata14.00: device reported invalid CHS sector 0
Dec 29 02:15:09 Tower kernel: ata14: EH complete
Dec 29 02:15:41 Tower kernel: ata14.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 29 02:15:41 Tower kernel: ata14.00: failed command: READ DMA EXT
Dec 29 02:15:41 Tower kernel: ata14.00: cmd 25/00:00:50:9a:52/00:03:0f:00:00/e0 tag 24 dma 393216 in
Dec 29 02:15:41 Tower kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 29 02:15:41 Tower kernel: ata14.00: status: { DRDY }
Dec 29 02:15:41 Tower kernel: ata14: hard resetting link
Dec 29 02:15:42 Tower kernel: ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Dec 29 02:15:42 Tower kernel: ata14.00: configured for UDMA/133
Dec 29 02:15:42 Tower kernel: ata14.00: device reported invalid CHS sector 0
Dec 29 02:15:42 Tower kernel: ata14: EH complete
Dec 29 02:16:13 Tower kernel: ata14.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 29 02:16:13 Tower kernel: ata14.00: failed command: READ DMA EXT
Dec 29 02:16:13 Tower kernel: ata14.00: cmd 25/00:00:90:26:53/00:03:0f:00:00/e0 tag 15 dma 393216 in
Dec 29 02:16:13 Tower kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 29 02:16:13 Tower kernel: ata14.00: status: { DRDY }
Dec 29 02:16:13 Tower kernel: ata14: hard resetting link
Dec 29 02:16:14 Tower kernel: ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Dec 29 02:16:14 Tower kernel: ata14.00: configured for UDMA/133
Dec 29 02:16:14 Tower kernel: ata14.00: device reported invalid CHS sector 0
Dec 29 02:16:14 Tower kernel: ata14: EH complete
Dec 29 02:16:24 Tower emhttp: cmd: /usr/local/emhttp/plugins/dynamix/scripts/tail_log syslog
Dec 29 02:16:45 Tower kernel: ata14.00: limiting speed to UDMA/100:PIO4
Dec 29 02:16:45 Tower kernel: ata14.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Dec 29 02:16:45 Tower kernel: ata14.00: failed command: READ DMA EXT
Dec 29 02:16:45 Tower kernel: ata14.00: cmd 25/00:00:90:86:53/00:03:0f:00:00/e0 tag 27 dma 393216 in
Dec 29 02:16:45 Tower kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Dec 29 02:16:45 Tower kernel: ata14.00: status: { DRDY }
Dec 29 02:16:45 Tower kernel: ata14: hard resetting link
Dec 29 02:16:46 Tower kernel: ata14: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Dec 29 02:16:46 Tower kernel: ata14.00: configured for UDMA/100
Dec 29 02:16:46 Tower kernel: ata14.00: device reported invalid CHS sector 0
Dec 29 02:16:46 Tower kernel: ata14: EH complete

Link to comment
  • 1 month later...

I'm getting a similar error.  I tried to complete a parity check and that's when I noticed the error.  Not to mention it said the check would take over 100 days.

 

I've tried several different cables, several different slots, and different cables with different slots.  I've also tried different power connections (850 watt PS feeding 11 drives) and a new hard drive.  It's attempting to rebuild a drive now and it only throws the error sometimes.  When it does, the speed drops to 1MB/s and then ramps back up to 90+MB/s.  I don't know what else to try right now.

 

Feb 5 16:55:44 XORUNRAID kernel: ata16.00: cmd 35/00:08:18:56:68/00:00:14:01:00/e0 tag 14 dma 4096 out
Feb 5 16:55:44 XORUNRAID kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb 5 16:55:44 XORUNRAID kernel: ata16.00: status: { DRDY }
Feb 5 16:55:44 XORUNRAID kernel: ata16: hard resetting link
Feb 5 16:55:44 XORUNRAID kernel: ata16: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Feb 5 16:55:44 XORUNRAID kernel: ata16.00: configured for UDMA/133
Feb 5 16:55:44 XORUNRAID kernel: ata16.00: device reported invalid CHS sector 0
Feb 5 16:55:44 XORUNRAID kernel: ata16: EH complete
Feb 5 16:56:19 XORUNRAID kernel: ata16: limiting SATA link speed to 3.0 Gbps
Feb 5 16:56:19 XORUNRAID kernel: ata16.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Feb 5 16:56:19 XORUNRAID kernel: ata16.00: failed command: WRITE DMA
Feb 5 16:56:19 XORUNRAID kernel: ata16.00: cmd ca/00:08:98:8b:12/00:00:00:00:00/e0 tag 6 dma 4096 out
Feb 5 16:56:19 XORUNRAID kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb 5 16:56:19 XORUNRAID kernel: ata16.00: status: { DRDY }
Feb 5 16:56:19 XORUNRAID kernel: ata16: hard resetting link
Feb 5 16:56:19 XORUNRAID kernel: ata16: SATA link up 6.0 Gbps (SStatus 133 SControl 320)
Feb 5 16:56:19 XORUNRAID kernel: ata16.00: configured for UDMA/133
Feb 5 16:56:19 XORUNRAID kernel: ata16.00: device reported invalid CHS sector 0
Feb 5 16:56:19 XORUNRAID kernel: ata16: EH complete
Feb 5 16:56:50 XORUNRAID kernel: ata16.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Feb 5 16:56:50 XORUNRAID kernel: ata16.00: failed command: WRITE DMA EXT
Feb 5 16:56:50 XORUNRAID kernel: ata16.00: cmd 35/00:80:f0:af:21/00:01:00:00:00/e0 tag 11 dma 196608 out
Feb 5 16:56:50 XORUNRAID kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb 5 16:56:50 XORUNRAID kernel: ata16.00: status: { DRDY }
Feb 5 16:56:50 XORUNRAID kernel: ata16: hard resetting link
Feb 5 16:56:50 XORUNRAID kernel: ata16: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Feb 5 16:56:50 XORUNRAID kernel: ata16.00: configured for UDMA/133
Feb 5 16:56:50 XORUNRAID kernel: ata16.00: device reported invalid CHS sector 0
Feb 5 16:56:50 XORUNRAID kernel: ata16: EH complete
Feb 5 16:57:26 XORUNRAID kernel: ata16.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Feb 5 16:57:26 XORUNRAID kernel: ata16.00: failed command: WRITE DMA EXT
Feb 5 16:57:26 XORUNRAID kernel: ata16.00: cmd 35/00:80:30:56:26/00:01:00:00:00/e0 tag 28 dma 196608 out
Feb 5 16:57:26 XORUNRAID kernel: res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Feb 5 16:57:26 XORUNRAID kernel: ata16.00: status: { DRDY }
Feb 5 16:57:26 XORUNRAID kernel: ata16: hard resetting link
Feb 5 16:57:26 XORUNRAID kernel: ata16: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Feb 5 16:57:26 XORUNRAID kernel: ata16.00: configured for UDMA/133
Feb 5 16:57:26 XORUNRAID kernel: ata16.00: device reported invalid CHS sector 0
Feb 5 16:57:26 XORUNRAID kernel: ata16: EH complete

Link to comment
  • 1 month later...

How did you figure out which disk was the culprit from the logs?  I have the same issue and I'm also getting "ata1: hard resetting" and would like to figure out how I can find out which disk is the problem so I can also try and swap out cables/ports.

 

 


Link to comment

Hi thegurujim,

 

Did you ever solve your problem?  I have exactly the same issue with a disk rebuild and was hoping I could troubleshoot it somehow and not have to replace everything.

 

Thanks

Johan

 

 


Link to comment

I have exactly the same issue with a disk rebuild and was hoping I could troubleshoot somehow and not have to replace everything.

While it may be true that you have the same issue, you didn't provide any info that can help us.  A series of exception handler lines, like the examples above, looks very similar from one user to the next, but that's because it's the same exception handler.  However, the errors it handles can be very different, and can involve very different physical components, such as the disk drive, the cables, their connectors, the power source, the SATA port, and the disk controller.

 

The samples from the 2 users above do happen to be very similar: the SATA communications channels are working well, but after a read or write request the drive does not respond, even after a reset.  Neither case shows any error flags at all (SErr is 0x0), just a lack of response from the drive to a read or write request (they 'timed out', Emask 0x4).  The drive was given more than enough time to respond but did not.  That's probably an issue with the drive, but it could sometimes be an issue with the controller (I really don't understand how, though), because moving to a different port sometimes does correct the issue!
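
To make the "no error flags, just a timeout" pattern easier to spot in a long syslog, here is a minimal Python sketch that tallies exceptions per ata port.  It is only an illustration, not part of unRAID or its diagnostics: the function name summarize_exceptions and both regular expressions are my own assumptions, based only on the line formats quoted in this thread, and other kernel versions may word these lines differently.

```python
import re

# Assumed patterns, modelled on the syslog lines quoted in this thread.
EXCEPTION_RE = re.compile(
    r"kernel: (ata\d+)(?:\.\d+)?: exception Emask 0x[0-9a-f]+ "
    r"SAct 0x[0-9a-f]+ SErr (0x[0-9a-f]+)"
)
RESULT_RE = re.compile(r"Emask 0x[0-9a-f]+ \(([^)]+)\)")

SAMPLE = [
    "Dec 29 01:13:00 Tower kernel: ata13.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen",
    "Dec 29 01:13:00 Tower kernel: ata13.00: failed command: WRITE DMA EXT",
    "Dec 29 01:13:00 Tower kernel: ata13.00: cmd 35/00:40:90:f8:2c/00:05:77:00:00/e0 tag 24 dma 688128 out",
    "Dec 29 01:13:00 Tower kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)",
]


def summarize_exceptions(lines):
    """Count exceptions keyed by (port, result text, True if SErr was zero)."""
    counts = {}
    port = None
    serr_is_zero = None
    for line in lines:
        exc = EXCEPTION_RE.search(line)
        if exc:
            # Remember which port raised the exception and whether SErr was 0x0.
            port = exc.group(1)
            serr_is_zero = int(exc.group(2), 16) == 0
            continue
        res = RESULT_RE.search(line)
        if res and port is not None:
            # The "res" line closes the exception and names the result.
            key = (port, res.group(1), serr_is_zero)
            counts[key] = counts.get(key, 0) + 1
            port = None
    return counts


for key, count in summarize_exceptions(SAMPLE).items():
    print(key, count)
```

A port that only ever shows up as (port, 'timeout', True) matches the two cases described above: the link is fine, but the drive is not answering.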

 

What is regrettable is that of the 3 users reporting above and wanting help, not one included their diagnostics.  The diagnostics files include the complete syslog, so we can see not only the error lines, but the setup too, where the scsi and ata assignments are made.  It also includes the SMART reports, so we can check for drive issues.  And it includes the lsscsi report, which often (but not always) includes info on the ata assignments.

 

How did you figure out which disk was the culprit from the logs?  I have the same issue and I'm also getting "ata1: hard resetting" and would like to figure out how I can find out which disk is the problem so I can also try and swap out cables/ports.

This can be hard!  The unRAID v6 diagnostics now include the lsscsi report, which includes device symbols.  You may be able to see the ata assignment within the full device path for each drive.  For example, in the entry below /dev/sdh is on ata11 -

[10:0:0:0]  disk    ATA      Hitachi HDS72101 A39C  /dev/sdh  /dev/sg7

  state=running queue_depth=1 scsi_level=6 type=0 device_blocked=0 timeout=30

  dir: /sys/bus/scsi/devices/10:0:0:0  [/sys/devices/pci0000:00/0000:00:05.2/ata11/host10/target10:0:0/10:0:0:0]

 

If it's not there, then it takes experience and time.  I've written a wiki article to help but it's very incomplete, and a little out-of-date.

  Drive Symbols
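
If the ata number is present in the lsscsi report, the lookup can be automated.  The following is a rough Python sketch under that assumption; the function name ata_ports_from_lsscsi is hypothetical, and the parsing is based only on the single lsscsi entry quoted above, so reports that lay out their lines differently will need adjusting.

```python
import re

# The lsscsi entry quoted above, used here as sample input.
SAMPLE = """\
[10:0:0:0]  disk    ATA      Hitachi HDS72101 A39C  /dev/sdh  /dev/sg7
  state=running queue_depth=1 scsi_level=6 type=0 device_blocked=0 timeout=30
  dir: /sys/bus/scsi/devices/10:0:0:0  [/sys/devices/pci0000:00/0000:00:05.2/ata11/host10/target10:0:0/10:0:0:0]
"""


def ata_ports_from_lsscsi(report):
    """Map ata port names (e.g. 'ata11') to block devices (e.g. '/dev/sdh')."""
    mapping = {}
    current_device = None
    for line in report.splitlines():
        # Entry header lines start with the [host:channel:target:lun] tuple.
        if line.lstrip().startswith("["):
            dev = re.search(r"/dev/sd[a-z]+", line)
            current_device = dev.group(0) if dev else None
        # The sysfs path, when it includes one, names the ata port.
        ata = re.search(r"/(ata\d+)/", line)
        if ata and current_device:
            mapping[ata.group(1)] = current_device
    return mapping


print(ata_ports_from_lsscsi(SAMPLE))
```

With the mapping in hand, an "ata11: hard resetting link" line in the syslog can be tied back to /dev/sdh, and from there to the disk's serial number in the SMART report.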

 

By the way, the syslog samples above do not indicate cable issues, as their communications channels (the SATA link) are fine.  Cable issues always seem to be indicated by the BadCRC error flag.  I personally believe that other communication-related error flags are more likely to indicate issues with power or connectors or the controller, but not the cable.  Above, there are NO error flags!

Link to comment

Hi Robj,

 

Sorry for not supplying more info.  I have another open post:

http://lime-technology.com/forum/index.php?topic=47389.msg453596#msg453596

 

I was also looking at, and asking on, posts that looked related.  You will find more info in my post linked above.

 

Thanks for the reference to the wiki and the info on finding the device symbols.  I'm looking into that and will try and understand how to go about doing that.

 

Regards

Johan

Link to comment

Without having read the whole thing, here's a typical exception from your syslog (the others I saw were the same, but very different from those of the other 2 users) -

Mar 10 21:09:52 Tower kernel: ata1.00: exception Emask 0x50 SAct 0x0 SErr 0x4890800 action 0xe frozen
Mar 10 21:09:52 Tower kernel: ata1.00: irq_stat 0x0c400040, interface fatal error, connection status changed
Mar 10 21:09:52 Tower kernel: ata1: SError: { HostInt PHYRdyChg 10B8B LinkSeq DevExch }
Mar 10 21:09:52 Tower kernel: ata1.00: failed command: READ DMA EXT
Mar 10 21:09:52 Tower kernel: ata1.00: cmd 25/00:18:c8:3a:01/00:05:00:00:00/e0 tag 29 dma 667648 in
Mar 10 21:09:52 Tower kernel:        res 50/00:00:8f:ec:44/00:00:00:00:00/e0 Emask 0x50 (ATA bus error)
Mar 10 21:09:52 Tower kernel: ata1.00: status: { DRDY }
Mar 10 21:09:52 Tower kernel: ata1: hard resetting link

 

This appears to be an interface connection issue between the drive and the controller, on Disk 4 as Johnnie said.  I understand PHYRdyChg to mean that a change in device readiness was detected, which probably caused the rest of the decoding errors.  This could be a power issue, but the easiest thing to check, and the more likely cause, is a loose connection somewhere, or corrosion on a connector causing an intermittent connection.  Check the SATA cable connectors, at both ends, for a good solid connection.  Check the power cables and especially any power cable splitters (cheap ones are notorious for issues) for solid connections.  Re-seat each one several times, in case there's a little corrosion on any leads.  If it's on a back frame, re-seat the drive several times, and make sure it's not vibrating loose.
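
The SErr value on the exception line and the flag names on the SError line are two views of the same register, so the flags can be recovered from the hex value alone.  Here is a small Python sketch of that decoding.  The bit table is my recollection of how the kernel's libata names the SATA SError bits; only the five names that appear in the log above are confirmed by this thread, so treat the rest as an assumption to check against the kernel source.  The name decode_serr is made up for this example.

```python
# (bit position, flag name) pairs; assumed from libata's naming, see note above.
SERR_FLAGS = [
    (0, "RecovData"), (1, "RecovComm"), (8, "UnrecovData"), (9, "Persist"),
    (10, "Proto"), (11, "HostInt"), (16, "PHYRdyChg"), (17, "PHYInt"),
    (18, "CommWake"), (19, "10B8B"), (20, "Dispar"), (21, "BadCRC"),
    (22, "Handshk"), (23, "LinkSeq"), (24, "TrStaTrns"), (25, "UnrecFIS"),
    (26, "DevExch"),
]


def decode_serr(value):
    """Return the names of the SError flags set in value, lowest bit first."""
    return [name for bit, name in SERR_FLAGS if value & (1 << bit)]


# SErr from the exception above, then the SErr 0x0 seen in the earlier samples.
print(decode_serr(0x4890800))
print(decode_serr(0x0))
```

Decoding 0x4890800 this way gives the same five flags as the SError line above, and 0x0 gives an empty list, which is the "no error flags" case from the first two users.  A quick check for "BadCRC" in the decoded list is one way to look for the cable-related flag.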

Link to comment
