One-off drive error: likely to be a power problem?


I occasionally see log entries like the ones below when I kick off a parity check. In this example, I kicked off a parity check, saw the error, left it running for 30 minutes to see how it went, cancelled the parity check, checked a few things and then kicked it off again.

 

The error occurred the first time only. Performance is completely normal throughout the check (peak parity speeds of ~130 MB/s seen at times), and read/write performance seems normal the rest of the time. Running a preclear also goes at what I would consider a normal speed (~24 hrs for a 3 TB drive).

 

The wiki tells me this is a drive interface issue, possibly caused by cabling or power supply problems. As far as I can see, it only ever occurs when a parity check kicks off, so I'm wondering whether this points more strongly at a weak power supply, since starting a parity check on an idle server presumably makes every drive spin up at once. Does this sound feasible? Is there anything else I should look into?

 

I have run short SMART reports on all drives and they appear to be fine.
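Since the log below reports BadCRC on the link, one number worth pulling out of a SMART report is attribute 199 (UDMA_CRC_Error_Count), which drives generally increment when they see CRC errors on the interface. Here is a rough Python sketch for reading it, assuming the output of `smartctl -A` has been saved as text. The function name is hypothetical and the sample rows are made up for illustration; they are not from this server.

```python
def crc_error_count(smartctl_output):
    """Return the raw value of SMART attribute 199, or None if it is absent."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with the numeric ID; the raw value is the 10th column.
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    return None


# Made-up sample rows in the layout printed by `smartctl -A` (not from this server).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       4123
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       7
"""

if __name__ == "__main__":
    print("UDMA_CRC_Error_Count raw value:", crc_error_count(SAMPLE))
```

Comparing the value before and after a parity check would show whether the drive itself is counting new interface errors.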

 

This is an unRAID 5.0.4 instance, by the way.

 

Oct 17 08:30:22 zalaga-unraid kernel: mdcmd (21): check CORRECT (unRAID engine)
Oct 17 08:30:22 zalaga-unraid kernel: md: recovery thread woken up ... (unRAID engine)
Oct 17 08:30:22 zalaga-unraid kernel: md: recovery thread checking parity... (unRAID engine)
Oct 17 08:30:22 zalaga-unraid kernel: md: using 1536k window, over a total of 2930266532 blocks. (unRAID engine)
Oct 17 08:30:23 zalaga-unraid kernel: ata6.00: exception Emask 0x10 SAct 0x0 SErr 0x280100 action 0x6 frozen (Errors)
Oct 17 08:30:23 zalaga-unraid kernel: ata6.00: irq_stat 0x08000000, interface fatal error (Errors)
Oct 17 08:30:23 zalaga-unraid kernel: ata6: SError: { UnrecovData 10B8B BadCRC } (Errors)
Oct 17 08:30:23 zalaga-unraid kernel: ata6.00: failed command: READ DMA EXT (Minor Issues)
Oct 17 08:30:23 zalaga-unraid kernel: ata6.00: cmd 25/00:00:c8:f8:03/00:04:00:00:00/e0 tag 0 dma 524288 in (Drive related)
Oct 17 08:30:23 zalaga-unraid kernel:          res 50/00:00:c7:f8:03/00:00:00:00:00/e0 Emask 0x10 (ATA bus error) (Errors)
Oct 17 08:30:23 zalaga-unraid kernel: ata6.00: status: { DRDY } (Drive related)
Oct 17 08:30:23 zalaga-unraid kernel: ata6: hard resetting link (Minor Issues)
Oct 17 08:30:23 zalaga-unraid kernel: ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300) (Drive related)
Oct 17 08:30:23 zalaga-unraid kernel: ata6.00: configured for UDMA/133 (Drive related)
Oct 17 08:30:23 zalaga-unraid kernel: ata6: EH complete (Drive related)
Oct 17 08:30:25 zalaga-unraid kernel: ata6.00: exception Emask 0x10 SAct 0x0 SErr 0x280100 action 0x6 frozen (Errors)
Oct 17 08:30:25 zalaga-unraid kernel: ata6.00: irq_stat 0x08000000, interface fatal error (Errors)
Oct 17 08:30:25 zalaga-unraid kernel: ata6: SError: { UnrecovData 10B8B BadCRC } (Errors)
Oct 17 08:30:25 zalaga-unraid kernel: ata6.00: failed command: READ DMA EXT (Minor Issues)
Oct 17 08:30:25 zalaga-unraid kernel: ata6.00: cmd 25/00:00:c8:40:09/00:04:00:00:00/e0 tag 0 dma 524288 in (Drive related)
Oct 17 08:30:25 zalaga-unraid kernel:          res 50/00:00:c7:40:09/00:00:00:00:00/e0 Emask 0x10 (ATA bus error) (Errors)
Oct 17 08:30:25 zalaga-unraid kernel: ata6.00: status: { DRDY } (Drive related)
Oct 17 08:30:25 zalaga-unraid kernel: ata6: hard resetting link (Minor Issues)
Oct 17 08:30:25 zalaga-unraid kernel: ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300) (Drive related)
Oct 17 08:30:25 zalaga-unraid kernel: ata6.00: configured for UDMA/133 (Drive related)
Oct 17 08:30:25 zalaga-unraid kernel: ata6: EH complete (Drive related)
Oct 17 08:30:25 zalaga-unraid kernel: ata6.00: exception Emask 0x10 SAct 0x0 SErr 0x280100 action 0x6 frozen (Errors)
Oct 17 08:30:25 zalaga-unraid kernel: ata6.00: irq_stat 0x08000000, interface fatal error (Errors)
Oct 17 08:30:25 zalaga-unraid kernel: ata6: SError: { UnrecovData 10B8B BadCRC } (Errors)
Oct 17 08:30:25 zalaga-unraid kernel: ata6.00: failed command: READ DMA EXT (Minor Issues)
Oct 17 08:30:25 zalaga-unraid kernel: ata6.00: cmd 25/00:00:c8:48:09/00:04:00:00:00/e0 tag 0 dma 524288 in (Drive related)
Oct 17 08:30:25 zalaga-unraid kernel:          res 50/00:00:c7:48:09/00:00:00:00:00/e0 Emask 0x10 (ATA bus error) (Errors)
Oct 17 08:30:25 zalaga-unraid kernel: ata6.00: status: { DRDY } (Drive related)
Oct 17 08:30:25 zalaga-unraid kernel: ata6: hard resetting link (Minor Issues)
Oct 17 08:30:25 zalaga-unraid kernel: ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300) (Drive related)
Oct 17 08:30:25 zalaga-unraid kernel: ata6.00: configured for UDMA/133 (Drive related)
Oct 17 08:30:25 zalaga-unraid kernel: ata6: EH complete (Drive related)
Oct 17 08:30:29 zalaga-unraid kernel: ata6: limiting SATA link speed to 3.0 Gbps (Drive related)
Oct 17 08:30:29 zalaga-unraid kernel: ata6.00: exception Emask 0x10 SAct 0x0 SErr 0x280100 action 0x6 frozen (Errors)
Oct 17 08:30:29 zalaga-unraid kernel: ata6.00: irq_stat 0x08000000, interface fatal error (Errors)
Oct 17 08:30:29 zalaga-unraid kernel: ata6: SError: { UnrecovData 10B8B BadCRC } (Errors)
Oct 17 08:30:29 zalaga-unraid kernel: ata6.00: failed command: READ DMA EXT (Minor Issues)
Oct 17 08:30:29 zalaga-unraid kernel: ata6.00: cmd 25/00:00:c8:9c:17/00:04:00:00:00/e0 tag 0 dma 524288 in (Drive related)
Oct 17 08:30:29 zalaga-unraid kernel:          res 50/00:00:c7:9c:17/00:00:00:00:00/e0 Emask 0x10 (ATA bus error) (Errors)
Oct 17 08:30:29 zalaga-unraid kernel: ata6.00: status: { DRDY } (Drive related)
Oct 17 08:30:29 zalaga-unraid kernel: ata6: hard resetting link (Minor Issues)
Oct 17 08:30:29 zalaga-unraid kernel: ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 320) (Drive related)
Oct 17 08:30:29 zalaga-unraid kernel: ata6.00: configured for UDMA/133 (Drive related)
Oct 17 08:30:29 zalaga-unraid kernel: ata6: EH complete (Drive related)
Oct 17 08:31:33 zalaga-unraid in.telnetd[2027]: connect from 192.168.1.6 (192.168.1.6) (Routine)
Oct 17 08:31:35 zalaga-unraid login[2028]: ROOT LOGIN  on '/dev/pts/0' from '192.168.1.6' (Logins)
Oct 17 09:05:07 zalaga-unraid kernel: mdcmd (22): nocheck  (unRAID engine)
Oct 17 09:05:07 zalaga-unraid kernel: md: md_do_sync: got signal, exit... (unRAID engine)
Oct 17 09:05:07 zalaga-unraid kernel: md: recovery thread sync completion status: -4 (unRAID engine)
Oct 17 09:14:37 zalaga-unraid kernel: mdcmd (23): check CORRECT (unRAID engine)
Oct 17 09:14:37 zalaga-unraid kernel: md: recovery thread woken up ... (unRAID engine)
Oct 17 09:14:37 zalaga-unraid kernel: md: recovery thread checking parity... (unRAID engine)
Oct 17 09:14:37 zalaga-unraid kernel: md: using 1536k window, over a total of 2930266532 blocks. (unRAID engine)

 

 

Link to comment

Also upgrade to v5.0.5 if you're running 5.0.4.

Why is that? I hadn't bothered as the release notes suggested there was nothing critical in there.

Simply because if you are not on the latest release, the first suggestion when you run into any issue will be to upgrade, so that you are on the same release as the majority of people. Also, it is always possible that something non-obvious was fixed.

 

Note that the 5.0.6 release is now out. The main change is a fix for the 'bash' security bug that recently got wide coverage in the news.

Link to comment

OK, fair enough. I'll try swapping cables again first and see what happens.

Link to comment

Thanks Itimpi, I didn't know 5.0.6 was out.

Link to comment

Archived

This topic is now archived and is closed to further replies.
