June 22, 2023 I just noticed that a parity check was taking a long, long time. I pulled up the logs, but I'm not entirely sure what I'm looking at. There are lines like this:

Jun 20 04:40:02 server kernel: ata6.00: exception Emask 0x50 SAct 0xef0 SErr 0x4890800 action 0xe frozen
Jun 20 04:40:02 server kernel: ata6.00: irq_stat 0x0c400040, interface fatal error, connection status changed
Jun 20 04:40:02 server kernel: ata6: SError: { HostInt PHYRdyChg 10B8B LinkSeq DevExch }
Jun 20 04:40:02 server kernel: ata6.00: failed command: READ FPDMA QUEUED
Jun 20 04:40:02 server kernel: ata6.00: cmd 60/00:20:70:fe:9a/04:00:ac:00:00/40 tag 4 ncq dma 524288 in
Jun 20 04:40:02 server kernel: res 40/00:30:70:06:9b/00:00:ac:00:00/40 Emask 0x50 (ATA bus error)
Jun 20 04:40:02 server kernel: ata6.00: status: { DRDY }

Does this mean I need to find which drive is ATA6? Or is that a generic ATA error label? Disk 10 is also showing read errors, and I've seen some posts that talk about replacing the SATA cable. What should I do?

server-diagnostics-20230621-1854.zip
June 22, 2023 Community Expert ATA6 is Disk 10. Start by replacing the cables, since the disk looks healthy.
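For anyone who wants to check the ata6-to-disk mapping themselves: on Linux, the resolved sysfs path of a SATA block device usually contains the ATA port it is attached to. Below is a minimal Python sketch, assuming the standard /sys/block layout. The function names and the sample path in the comment are made up for illustration, and the sketch only reports kernel device names (sdX), not the array's own disk numbering.

```python
import os
import re


def ata_port_from_sysfs_path(path):
    """Return the ATA port number (6 for 'ata6') found in a sysfs path, or None."""
    # Hypothetical example of a resolved path:
    # /sys/devices/pci0000:00/0000:00:1f.2/ata6/host5/target5:0:0/5:0:0:0/block/sdf
    match = re.search(r"/ata(\d+)/", path)
    return int(match.group(1)) if match else None


def map_ata_ports(sys_block="/sys/block"):
    """Map ATA port number -> list of block device names, e.g. {6: ['sdf']}."""
    ports = {}
    if not os.path.isdir(sys_block):
        return ports  # not Linux, or sysfs is not mounted
    for name in sorted(os.listdir(sys_block)):
        # Each entry in /sys/block is a symlink into the device tree.
        real = os.path.realpath(os.path.join(sys_block, name))
        port = ata_port_from_sysfs_path(real)
        if port is not None:
            ports.setdefault(port, []).append(name)
    return ports


if __name__ == "__main__":
    for port, devices in sorted(map_ata_ports().items()):
        print(f"ata{port}: {', '.join(devices)}")
```

Devices that are not behind an ATA port (USB sticks, NVMe, loop devices) have no ataN component in their path and are simply skipped.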
June 26, 2023 Author I can replace the SATA cable, but the drive is in a hot swap bay, so the power comes from the drive array unit.
June 29, 2023 Author Well, a lot has happened since my last post. It would seem that something else is going on; I've had to rebuild a few times now. I've had multiple drive issues at this point, so I'm thinking maybe I have a bad controller, a bad hot swap bay, or a bad power supply. One of my drives won't show up (Disk 10), so maybe it's a bad hot swap bay (I've tried two at this point). Another is reporting seek errors, maybe from all the rebuilding? Not sure what to do. With all the drive issues, I'm concerned that I may lose some data... I've got an Amazon cart locked and loaded with new hot swap bays and a new drive controller. What should I do next? server-diagnostics-20230629-0736.zip
June 29, 2023 Author Just noticed that my rebuild speed is down to 3 MB/sec; it'll take 22 days to finish :-|
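As a rough sanity check on that estimate, the arithmetic can be run in both directions. This is a small Python sketch, assuming decimal megabytes (the post does not say whether MB means 10^6 or 2^20 bytes). The 100 MB/sec comparison speed is a hypothetical figure, not something measured in this thread, and the function names are illustrative.

```python
SECONDS_PER_DAY = 86_400
MB = 1_000_000  # assumption: decimal megabytes


def remaining_bytes_from_eta(days, bytes_per_second):
    """Bytes still to process, given an ETA in days and a constant speed."""
    return days * SECONDS_PER_DAY * bytes_per_second


def rebuild_eta_days(remaining_bytes, bytes_per_second):
    """Days needed to process remaining_bytes at a constant speed."""
    return remaining_bytes / bytes_per_second / SECONDS_PER_DAY


# Figures from the post: 3 MB/sec and a 22-day estimate.
remaining = remaining_bytes_from_eta(22, 3 * MB)
print(f"Implied data left to rebuild: {remaining / 1e12:.1f} TB")

# Hypothetical healthy speed, for comparison only.
healthy_days = rebuild_eta_days(remaining, 100 * MB)
print(f"Same work at 100 MB/sec: {healthy_days * 24:.1f} hours")
```

At 3 MB/sec, 22 days corresponds to about 5.7 TB of remaining work, so the estimate is consistent with a large disk rebuilding at a badly throttled speed rather than with a miscalculated ETA.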
June 29, 2023 Community Expert Solution I can't see any disk errors in the diags posted. I would first swap whichever is easier, the PSU or the backplane.
June 29, 2023 Author Strange, one drive was showing tons of seek errors…
June 29, 2023 Community Expert The raw "seek error rate" value is not a single number; it's several values packed into different bits. It's normal for it to look like a big, meaningless number when viewed raw.
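To illustrate what "several values packed into different bits" can look like: one commonly described layout, reported for some Seagate drives but not confirmed by anything in this thread or by vendor documentation, treats the raw value as 48 bits, with the upper 16 bits counting seek errors and the lower 32 bits counting total seeks. The Python sketch below decodes a raw value under that assumption only; the function name and the sample value are hypothetical.

```python
def split_seek_error_raw(raw):
    """Split a 48-bit raw value into (seek_errors, total_seeks).

    Assumed layout (not vendor-confirmed): upper 16 bits are the error
    count, lower 32 bits are the total number of seeks.
    """
    if not 0 <= raw < (1 << 48):
        raise ValueError("raw value must fit in 48 bits")
    seek_errors = raw >> 32          # upper 16 bits
    total_seeks = raw & 0xFFFFFFFF   # lower 32 bits
    return seek_errors, total_seeks


# Hypothetical raw value: large when read as one number, but with zero
# in the upper 16 bits under the assumed layout.
errors, seeks = split_seek_error_raw(123_456_789)
print(f"seek errors: {errors}, total seeks: {seeks}")
```

Under this reading, a raw value in the hundreds of millions can still mean zero seek errors, which would match the point that the big number is not alarming by itself.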
July 6, 2023 Author So I ended up replacing the backplanes (drive enclosures) as well as the power supply. Not very "scientific method" of me, but that seems to have settled things down. All appears to be well after 3-4 days now.