Lots of disk errors while running preclear

February 21, 2016

Hello all,

I have recently added a new SATA controller (SI-PEX40071) and two additional 2TB HDDs. All recognised and apparently OK until I started preclear.

When I run preclear, I get lots of issues reported on other drives. These other drives are all connected to a Marvell 88SE9235 4-port controller.

I am now rebuilding the array, and while this was running I tried to preclear again and saw the following in the logs:

Feb 21 10:49:40 server kernel: sdr: sdr1
Feb 21 10:49:59 server kernel: ata17.00: exception Emask 0x10 SAct 0x60000 SErr 0x200000 action 0x6 frozen
Feb 21 10:49:59 server kernel: ata17.00: irq_stat 0x08000008, interface fatal error
Feb 21 10:49:59 server kernel: ata17: SError: { BadCRC }
Feb 21 10:49:59 server kernel: ata17.00: failed command: READ FPDMA QUEUED
Feb 21 10:49:59 server kernel: ata17.00: cmd 60/00:88:60:a7:24/01:00:00:00:00/40 tag 17 ncq 131072 in
Feb 21 10:49:59 server kernel:         res 50/00:ff:00:00:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Feb 21 10:49:59 server kernel: ata17.00: status: { DRDY }
Feb 21 10:49:59 server kernel: ata17.00: failed command: READ FPDMA QUEUED
Feb 21 10:49:59 server kernel: ata17.00: cmd 60/00:90:60:a8:24/01:00:00:00:00/40 tag 18 ncq 131072 in
Feb 21 10:49:59 server kernel:         res 50/00:ff:00:00:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Feb 21 10:49:59 server kernel: ata17.00: status: { DRDY }
Feb 21 10:49:59 server kernel: ata17: hard resetting link
Feb 21 10:49:59 server kernel: ata17: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Feb 21 10:49:59 server kernel: ata17.00: configured for UDMA/133
Feb 21 10:49:59 server kernel: ata17: EH complete
Feb 21 10:50:01 server crond[1714]: exit status 127 from user root /usr/local/emhttp/plugins/dynamix.system.stats/scripts/sa1 1 1 &> /dev/null
Feb 21 10:50:43 server kernel: ata17.00: exception Emask 0x10 SAct 0x600000 SErr 0x200000 action 0x6 frozen
Feb 21 10:50:43 server kernel: ata17.00: irq_stat 0x08000008, interface fatal error
Feb 21 10:50:43 server kernel: ata17: SError: { BadCRC }
Feb 21 10:50:43 server kernel: ata17.00: failed command: READ FPDMA QUEUED
Feb 21 10:50:43 server kernel: ata17.00: cmd 60/00:a8:10:3a:9a/01:00:00:00:00/40 tag 21 ncq 131072 in
Feb 21 10:50:43 server kernel:         res 50/00:ff:00:00:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Feb 21 10:50:43 server kernel: ata17.00: status: { DRDY }
Feb 21 10:50:43 server kernel: ata17.00: failed command: READ FPDMA QUEUED
Feb 21 10:50:43 server kernel: ata17.00: cmd 60/00:b0:10:3b:9a/01:00:00:00:00/40 tag 22 ncq 131072 in
Feb 21 10:50:43 server kernel:         res 50/00:ff:00:00:00/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
Feb 21 10:50:43 server kernel: ata17.00: status: { DRDY }
Feb 21 10:50:43 server kernel: ata17: hard resetting link
Feb 21 10:50:43 server kernel: ata17: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Feb 21 10:50:43 server kernel: ata17.00: configured for UDMA/133
Feb 21 10:50:43 server kernel: ata17: EH complete

When I killed preclear the errors stopped again.
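
To see at a glance which port the errors belong to, the `SError` lines in a log like the one above can be tallied per port. This is a minimal sketch, assuming syslog lines in the format shown in this post; the function name `count_serrors` and the `sample` list are illustrative and not part of any Unraid tooling.

```python
import re
from collections import Counter

# Matches kernel lines such as "kernel: ata17: SError: { BadCRC }".
SERROR_RE = re.compile(r"kernel: (ata\d+): SError: \{ ([^}]*)\}")

def count_serrors(lines):
    """Return a Counter keyed by (port, flag), e.g. ("ata17", "BadCRC")."""
    counts = Counter()
    for line in lines:
        match = SERROR_RE.search(line)
        if match:
            port, flags = match.groups()
            # The braces can hold several space-separated flags.
            for flag in flags.split():
                counts[(port, flag)] += 1
    return counts

# Three lines copied from the log above.
sample = [
    "Feb 21 10:49:59 server kernel: ata17: SError: { BadCRC }",
    "Feb 21 10:49:59 server kernel: ata17: hard resetting link",
    "Feb 21 10:50:43 server kernel: ata17: SError: { BadCRC }",
]

for (port, flag), n in count_serrors(sample).items():
    print(port, flag, n)
```

If every hit lands on a single port, as it does here with ata17, that suggests a problem local to that link (cable, port, or drive) rather than something affecting the whole controller.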

server-diagnostics-20160221-1115.zip

February 21, 2016

Community Expert

Device Model:     ST2000DL003-9VT166
Serial Number:    5YD83980

199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       9

Replace the SATA cable; if the errors continue, the SATA port may be bad.
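
The raw value of attribute 199 is generally a running count, so the useful check after swapping a cable is whether it keeps rising. A minimal sketch of reading that value, assuming the usual ten-column layout of the `smartctl -A` attribute table (ID, name, flag, value, worst, thresh, type, updated, when-failed, raw value); the helper name `parse_smart_attribute` is illustrative.

```python
def parse_smart_attribute(line):
    """Split one row of the smartctl attribute table into named fields."""
    fields = line.split()
    return {
        "id": int(fields[0]),
        "name": fields[1],
        "flag": fields[2],
        "value": int(fields[3]),
        "worst": int(fields[4]),
        "thresh": int(fields[5]),
        "type": fields[6],
        "updated": fields[7],
        "when_failed": fields[8],
        # Some drives append extra text to the raw value, so keep it as a string.
        "raw_value": " ".join(fields[9:]),
    }

# The row quoted in the reply above.
row = "199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       9"
attr = parse_smart_attribute(row)
print(attr["name"], attr["raw_value"])
```

Recording the raw value before and after a cable or port change, then comparing the two readings after some heavy I/O such as a preclear pass, shows whether the change helped.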

February 21, 2016

Community Expert

It might also be worth stating exactly what power supply is being used in case the new hardware has taken the system over its capacity.

February 21, 2016

Author

Thanks both!

That drive is already on a new SATA cable; I will switch it to another port and try again.

PSU is a modular 550W; I don't have the specifics to hand.

February 21, 2016

Community Expert

Is it a single-rail power supply? If not, then it is probably not sufficient for the number of drives you have (I made it 17). Even if it is, it is probably getting near the limit under peak load, depending on what your motherboard, add-on cards and any graphics cards require.
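
The single-rail question comes down to simple arithmetic on the 12 V supply. The sketch below is only an illustration: the per-drive spin-up current and both rail limits are placeholder assumptions, not figures from this thread or from any datasheet, and should be replaced with the numbers printed on the drive and PSU labels.

```python
def spinup_load_amps(drive_count, amps_per_drive):
    """Total 12 V current if every drive spins up at the same moment."""
    return drive_count * amps_per_drive

def rail_has_headroom(load_amps, rail_limit_amps, other_load_amps=0.0):
    """True if the drives plus any other 12 V load fit within one rail's limit."""
    return load_amps + other_load_amps <= rail_limit_amps

# ASSUMPTION: 2.0 A per drive at spin-up is a placeholder, not a measured value.
ASSUMED_SPINUP_AMPS = 2.0

load = spinup_load_amps(17, ASSUMED_SPINUP_AMPS)  # 17 drives, as counted above
print(f"Peak 12 V draw: {load} A ({load * 12} W)")

# HYPOTHETICAL limits: one large rail versus one rail of a multi-rail unit.
print("Fits a 40 A single rail:", rail_has_headroom(load, rail_limit_amps=40.0))
print("Fits a 20 A split rail: ", rail_has_headroom(load, rail_limit_amps=20.0))
```

With these placeholder numbers the same load fits on one large rail but not on a smaller split rail, which is why the rail layout matters as much as the headline wattage.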

February 29, 2016

Author

Thanks. I don't know about the PSU, but four drives are in an external enclosure and the only cards installed are three SATA controllers.

My issues seem to have been cable related - I found a few older cables which did not have any locking mechanism on either end. Bet I knocked / shifted 'em. Replaced them all and I've not seen any errors since.

I've precleared drives and completed parity checks with no errors reported, so I think all is well again.

Thanks for everybody's help.
