KJB Posted June 16, 2019

Hi all,

I had an issue the other day where one of my drives became disabled overnight. I ran a full SMART check and the drive showed no issues, and there were no writes being performed at the time, so I put it down to a hardware error, changed the SATA cable and re-enabled the drive with the Trust My Array procedure.

Running through the logs for the drive at the moment, I keep seeing these errors:

Jun 15 02:45:56 SERVER kernel: ata10: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
Jun 15 02:45:56 SERVER kernel: ata10.00: configured for UDMA/133
Jun 15 02:45:56 SERVER kernel: ata10: EH complete
Jun 15 02:45:56 SERVER kernel: ata10.00: sense data available but port frozen
Jun 15 02:45:56 SERVER kernel: ata10: limiting SATA link speed to 1.5 Gbps
Jun 15 02:45:56 SERVER kernel: ata10.00: exception Emask 0x10 SAct 0x0 SErr 0x280100 action 0x6 frozen
Jun 15 02:45:56 SERVER kernel: ata10.00: irq_stat 0x48000001, interface fatal error
Jun 15 02:45:56 SERVER kernel: ata10: SError: { UnrecovData 10B8B BadCRC }
Jun 15 02:45:56 SERVER kernel: ata10.00: failed command: READ DMA EXT
Jun 15 02:45:56 SERVER kernel: ata10.00: cmd 25/00:00:78:df:fa/00:01:2a:03:00/e0 tag 20 dma 131072 in
Jun 15 02:45:56 SERVER kernel: ata10.00: status: { DRDY SENSE ERR }
Jun 15 02:45:56 SERVER kernel: ata10.00: error: { ICRC ABRT }
Jun 15 02:45:56 SERVER kernel: ata10: hard resetting link
Jun 15 02:45:57 SERVER kernel: ata10: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jun 15 02:45:57 SERVER kernel: ata10.00: configured for UDMA/133
Jun 15 02:45:57 SERVER kernel: ata10: EH complete

Any ideas what is causing this? I'm confident the drive is fine. Should I be looking at changing the SATA port on my MB? Another dud cable perhaps? Looking through the SMART data I have a tonne of UDMA CRC errors.

Cheers
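For anyone wading through a long syslog for the same thing, here is a rough Python sketch that tallies the SError flags per ATA port, so a recurring BadCRC on one port stands out. The function name `tally_serror_flags` and the regular expression are my own assumptions, written only against the line format shown in the post above; other kernel versions may print these lines differently.

```python
import re
from collections import Counter

# Matches lines such as:
#   "... kernel: ata10: SError: { UnrecovData 10B8B BadCRC }"
# The pattern is an assumption based on the log excerpt in this thread.
SERROR_RE = re.compile(r"kernel: (ata\d+): SError: \{ ([^}]*)\}")


def tally_serror_flags(log_text):
    """Return {port: Counter of SError flags} for a block of kernel log text."""
    counts = {}
    for match in SERROR_RE.finditer(log_text):
        port = match.group(1)
        flags = match.group(2).split()
        counts.setdefault(port, Counter()).update(flags)
    return counts


# Three lines copied from the log in the original post.
sample = """\
Jun 15 02:45:56 SERVER kernel: ata10.00: irq_stat 0x48000001, interface fatal error
Jun 15 02:45:56 SERVER kernel: ata10: SError: { UnrecovData 10B8B BadCRC }
Jun 15 02:45:56 SERVER kernel: ata10.00: error: { ICRC ABRT }
"""

for port, flags in tally_serror_flags(sample).items():
    print(port, dict(flags))
```

If one port keeps collecting BadCRC entries while the others stay clean, that points at that port's cable or connector rather than at the array as a whole.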
Abzstrak Posted June 16, 2019

Why are you confident the drive is fine? SMART is like a check-oil light in a car: hardly 100% accurate, it just tends to give more info. Why did you do a trust array? You should recalc parity at the least now, because that wasn't a great idea.

I'd run badblocks on the drive. I like to use random-pattern write tests, but be aware that it will wipe the drive. It's the best test I know of to determine whether the drive is problematic or not.
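To show the idea behind a random-pattern write test, here is a toy Python sketch: write a reproducible pseudo-random pattern, read it back, and report any block that differs. It is not badblocks and not a substitute for it. It runs against a scratch file, so reads will mostly be served from the page cache and say little about the physical medium. The names `pattern_test`, `_random_block` and `BLOCK_SIZE` are made up for illustration.

```python
import os
import random
import tempfile

BLOCK_SIZE = 4096  # bytes per block; an arbitrary choice for this sketch


def _random_block(rng):
    """Produce BLOCK_SIZE pseudo-random bytes from a seeded generator."""
    return rng.getrandbits(BLOCK_SIZE * 8).to_bytes(BLOCK_SIZE, "little")


def pattern_test(path, num_blocks, seed=0):
    """Write a seeded random pattern to `path`, read it back, and
    return the indices of blocks that do not match what was written."""
    rng = random.Random(seed)
    with open(path, "wb") as f:
        for _ in range(num_blocks):
            f.write(_random_block(rng))
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the device

    rng = random.Random(seed)  # same seed replays the same pattern
    bad_blocks = []
    with open(path, "rb") as f:
        for index in range(num_blocks):
            if f.read(BLOCK_SIZE) != _random_block(rng):
                bad_blocks.append(index)
    return bad_blocks


# Demonstration on a temporary scratch file, never on a real drive.
with tempfile.TemporaryDirectory() as tmp:
    scratch = os.path.join(tmp, "scratch.bin")
    print(pattern_test(scratch, num_blocks=16))
```

Seeding the generator means the pattern never has to be held in memory: the read pass simply regenerates it. Like any write-mode test, this overwrites whatever it is pointed at.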
KJB Posted June 16, 2019 Author (edited)

The only reason I'd suggest the drive is fine is that it was purchased 6 weeks ago and pre-cleared fine before use. I know drives can fail at any time, but is it likely within 4 weeks of a pre-clear completing? Parity was done after the new array was set and came up with 0 errors.

Edited June 16, 2019 by qwijibo
Abzstrak Posted June 16, 2019

20 minutes ago, qwijibo said:
The only reason I'd suggest the drive is fine is that it was purchased 6 weeks ago and pre-cleared fine before use. I know drives can fail at any time, but is it likely within 4 weeks of a pre-clear completing? Parity was done after the new array was set and came up with 0 errors.

Drives usually die early in their life, or pretty late. I mean you should recalc parity after you "trusted" the drive that was dropped out, because you couldn't trust it if it dropped out of an active array... it might have been fine, but you should recalc parity to be sure.

Your idea of the cable isn't bad; they are cheap. Always use new cables when swapping drives or building machines. Reusing old ones isn't worth the saving if they could potentially cause any issues at all.

Memtest would be a good idea too, just to cover bases.
JorgeB Posted June 16, 2019

3 hours ago, qwijibo said:
BadCRC

9 times out of 10 this is a bad SATA cable.
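The SMART side of the same symptom is attribute 199, UDMA_CRC_Error_Count. As far as I know its raw value is a running total that does not go back down, so the useful check is whether it is still rising after a cable swap. Below is a small Python sketch that pulls the raw value out of `smartctl -A` style text. The sample rows and their numbers are invented for illustration, not taken from the poster's drive, and `crc_error_count` and `still_climbing` are hypothetical helper names.

```python
# Invented example in the usual `smartctl -A` attribute-table layout.
SAMPLE_BEFORE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       1432
"""

# The same invented drive, sampled again some time after a cable swap.
SAMPLE_AFTER = SAMPLE_BEFORE.replace("1432", "1432")


def crc_error_count(smartctl_output):
    """Return the raw value of attribute 199, or None if it is not listed."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Columns: ID, name, flag, value, worst, thresh, type, updated,
        # when_failed, raw_value (10 whitespace-separated fields).
        if len(fields) >= 10 and fields[0] == "199":
            return int(fields[9])
    return None


def still_climbing(before, after):
    """True if the CRC counter grew between two samples."""
    return after > before


before = crc_error_count(SAMPLE_BEFORE)
after = crc_error_count(SAMPLE_AFTER)
print(before, after, still_climbing(before, after))
```

A large but static count after the swap fits an old cable problem that has since been fixed. A count that keeps growing suggests the cable, port or connector is still at fault.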
trurl Posted June 16, 2019

Go to Tools-Diagnostics and attach the complete diagnostics zip file to your next post.

6 hours ago, qwijibo said:
there were no writes being performed at the time

Only a write will disable a drive.
trurl Posted June 16, 2019

30 minutes ago, trurl said:
Only a write will disable a drive.

1. Something was writing to the disk and the write failed.
2. Something was reading from the disk, the read failed, Unraid got the data from the parity calculation, tried to write it back to the disk, and that write failed.

I wouldn't be surprised if scenario 2 is the most common way a disk with a bad connection gets disabled.
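The two scenarios above can be sketched as a toy model in Python. This is only an illustration of the logic as described in this post, using single XOR parity over one integer "block" per disk. It is not Unraid's actual code, and every name in it (`Disk`, `read_block`, `write_block`, `xor_all`) is made up.

```python
from functools import reduce


class Disk:
    """One-block toy disk with switchable read and write failures."""

    def __init__(self, data, read_ok=True, write_ok=True):
        self.data = data
        self.read_ok = read_ok
        self.write_ok = write_ok
        self.disabled = False


def xor_all(values):
    """XOR together an iterable of integers."""
    return reduce(lambda a, b: a ^ b, values, 0)


def write_block(target, value):
    """Scenario 1: a plain write fails, so the disk is disabled.
    (Updating parity on a successful write is left out of this sketch.)"""
    if target.write_ok:
        target.data = value
    else:
        target.disabled = True


def read_block(target, others, parity):
    """Scenario 2: a read fails, the block is rebuilt from parity plus the
    other disks, and the failed write-back is what disables the disk."""
    if target.read_ok:
        return target.data
    rebuilt = xor_all([d.data for d in others] + [parity])
    write_block(target, rebuilt)  # the write-back attempt
    return rebuilt


disks = [Disk(0b1010), Disk(0b0110), Disk(0b0011)]
parity = xor_all(d.data for d in disks)

flaky = disks[2]
flaky.read_ok = False   # e.g. a CRC error on the link
flaky.write_ok = False  # the same bad link breaks the write-back too

value = read_block(flaky, disks[:2], parity)
print(bin(value), flaky.disabled)
```

Note that the caller still gets the correct data in scenario 2, which is why a disk can end up disabled even though nothing on the server appeared to be writing at the time.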
KJB Posted June 17, 2019 Author

I replaced the cable and swapped to a different SATA port, and it has been fine for the past 24 hours. I'm putting it down to a dud cable. Or port. Thanks for the replies.