Borys
Posted February 5, 2023 (edited)

Hi all,

My regular parity check found 2044 errors (how do I find out what kind?) and the disk was marked as disabled, with its contents emulated. I removed the disk from the array and ran a preclear, which completed without any issues and at the expected speed. However, after adding the disk back to the array and starting the data rebuild, the disk has problems. At first everything was fine, but then the disk started making a low, short, vibrating noise at regular intervals and the speed dropped to 0. Any ideas?

The logs show something like this:

Feb 5 22:29:35 Tower kernel: ata4.00: failed command: READ FPDMA QUEUED
Feb 5 22:29:35 Tower kernel: ata4.00: cmd 60/00:58:e8:49:cd/04:00:5c:00:00/40 tag 11 ncq dma 524288 in
Feb 5 22:29:35 Tower kernel: res 40/00:48:a8:44:cd/00:00:5c:00:00/40 Emask 0x50 (ATA bus error)
Feb 5 22:29:35 Tower kernel: ata4.00: status: { DRDY }
Feb 5 22:29:35 Tower kernel: ata4.00: failed command: READ FPDMA QUEUED
Feb 5 22:29:35 Tower kernel: ata4.00: cmd 60/40:60:e8:4d:cd/01:00:5c:00:00/40 tag 12 ncq dma 163840 in
Feb 5 22:29:35 Tower kernel: res 40/00:48:a8:44:cd/00:00:5c:00:00/40 Emask 0x50 (ATA bus error)
Feb 5 22:29:35 Tower kernel: ata4.00: status: { DRDY }
Feb 5 22:29:35 Tower kernel: ata4: hard resetting link
Feb 5 22:29:41 Tower kernel: ata4: link is slow to respond, please be patient (ready=0)
Feb 5 22:29:45 Tower kernel: ata4: COMRESET failed (errno=-16)
Feb 5 22:29:45 Tower kernel: ata4: hard resetting link
Feb 5 22:29:51 Tower kernel: ata4: link is slow to respond, please be patient (ready=0)
Feb 5 22:29:53 Tower kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Feb 5 22:29:53 Tower kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.SAT0.PRT3._GTF.DSSP], AE_NOT_FOUND (20220331/psargs-330)
Feb 5 22:29:53 Tower kernel: ACPI Error: Aborting method \_SB.PCI0.SAT0.PRT3._GTF due to previous error (AE_NOT_FOUND) (20220331/psparse-529)
Feb 5 22:29:53 Tower kernel: ACPI BIOS Error (bug): Could not resolve symbol [\_SB.PCI0.SAT0.PRT3._GTF.DSSP], AE_NOT_FOUND (20220331/psargs-330)
Feb 5 22:29:53 Tower kernel: ACPI Error: Aborting method \_SB.PCI0.SAT0.PRT3._GTF due to previous error (AE_NOT_FOUND) (20220331/psparse-529)
Feb 5 22:29:53 Tower kernel: ata4.00: configured for UDMA/100
Feb 5 22:29:53 Tower kernel: ata4: EH complete
Feb 5 22:29:53 Tower kernel: ata4.00: exception Emask 0x50 SAct 0x7e00003f SErr 0x4890800 action 0xe frozen
Feb 5 22:29:53 Tower kernel: ata4.00: irq_stat 0x0d400040, interface fatal error, connection status changed
Feb 5 22:29:53 Tower kernel: ata4: SError: { HostInt PHYRdyChg 10B8B LinkSeq DevExch }
Feb 5 22:29:53 Tower kernel: ata4.00: failed command: READ FPDMA QUEUED

Edited February 5, 2023 by Borys: spelling :/

JorgeB
Posted February 6, 2023 (Solution)

Next time, please post the complete diagnostics. Based on that excerpt, though, this looks more like a power/connection problem. Replace or swap both cables (power and data) and the slot, then try again. If the problem follows the disk, it's likely a disk problem.
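
One rough way to see whether errors like the ones in the excerpt cluster on a single port is to tally them from the syslog. The sketch below is only an illustration, not an Unraid tool: the function name `summarize_ata_events` and the category labels are made up here, and the match strings are simply copied from the log lines in the first post. It assumes that a `res ...` line without an `ataN` prefix belongs to the most recently seen port.

```python
import re
from collections import Counter, defaultdict

# Match strings copied from the log excerpt in this thread.
# The category names on the left are hypothetical labels, not kernel terms.
PATTERNS = {
    "failed_command": re.compile(r"failed command"),
    "bus_error": re.compile(r"ATA bus error"),
    "hard_reset": re.compile(r"hard resetting link"),
    "comreset_failed": re.compile(r"COMRESET failed"),
    "interface_fatal": re.compile(r"interface fatal error"),
}
PORT = re.compile(r"kernel: (ata\d+)(?:\.\d+)?:")
LINK_UP = re.compile(r"SATA link up ([\d.]+) Gbps")


def summarize_ata_events(lines):
    """Count link-related events per ATA port and collect negotiated speeds."""
    counts = defaultdict(Counter)
    speeds = defaultdict(list)
    last_port = None
    for line in lines:
        m = PORT.search(line)
        if m:
            last_port = m.group(1)
        # Assumption: port-less continuation lines ("res ...") belong to
        # the port named on the preceding line.
        port = last_port
        if port is None:
            continue
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[port][name] += 1
        speed = LINK_UP.search(line)
        if speed:
            speeds[port].append(float(speed.group(1)))
    return counts, speeds


# A few lines taken verbatim from the excerpt above.
SAMPLE = [
    "Feb 5 22:29:35 Tower kernel: ata4.00: failed command: READ FPDMA QUEUED",
    "Feb 5 22:29:35 Tower kernel: res 40/00:48:a8:44:cd/00:00:5c:00:00/40 Emask 0x50 (ATA bus error)",
    "Feb 5 22:29:35 Tower kernel: ata4: hard resetting link",
    "Feb 5 22:29:45 Tower kernel: ata4: COMRESET failed (errno=-16)",
    "Feb 5 22:29:53 Tower kernel: ata4: SATA link up 1.5 Gbps (SStatus 113 SControl 310)",
]

if __name__ == "__main__":
    counts, speeds = summarize_ata_events(SAMPLE)
    for port in sorted(counts):
        print(port, dict(counts[port]), "link speeds:", speeds[port])
```

If the same `ataN` port keeps collecting resets and bus errors after the cables have been swapped, that points at the port or the disk rather than the cable. The full diagnostics are still the better source; this only summarizes what is already in the log.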

Borys (Author)
Posted February 6, 2023

OK, will do next time.

(Simplifying the journey and removing all the swearing.) I changed the cable and the disk died (it was a funny-looking old cable). I changed the cable again, but the disk was still dead. I swapped in a different disk and everything works fine. The dead disk was successfully recovered afterwards, I think; I will test it properly (preclear) later.

Also, the original cable was one leg of a 4-way split HBA cable, and that particular leg was flaky the first few times I attached it: a disk connected to it did not work.

Lesson learned: if something behaves oddly, even for a short period of time, just don't trust it.