
Replace Potentially Failing Parity Drives?


Solved by hansolo77


I've been having recurring issues with my monthly parity checks. Some months go fine with 0 errors, then occasionally I'll get a couple hundred, then it's back to 0. My power went out on 9/28, and the system should have shut down safely (I'm on a UPS). Apparently it didn't, as the system started a parity check once it was back online. The UPS issue needs to be addressed at a later time. Anyway, the parity check returned over 160k errors. I'm tired of seeing it.

Before doing anything else, I shut the server down and unplugged and replugged all the power and data cables going from the HBA to the backplane, and from the parity drives to the motherboard. I also removed and reseated each RAM stick, and reseated the CPU with fresh thermal grease. I then pulled each drive, blew out the connectors and reseated those as well.

I then booted from a MEMTEST drive and ran it for 24 hours. After 4 complete passes I rebooted back into Unraid and did another manual check. After about 3 hours it started hitting errors, so I stopped it and restarted it as a correcting pass. This time it started hitting errors within about 20 minutes, but by the end of the cycle it had only found and corrected 16 errors. OK, that's a lot better than THOUSANDS. So I rebooted once more and did a manual check once again. Got home today and it came back clean. Whew.

 

I obviously missed my 10/1 scan, but given what I just went through, I think I'm ok. The next thing I did was run an extended Fix Common Problems scan. It found a bunch of files with wrong permissions. That might have been due to the parity corrections, hard to say. I went in and started a Docker Safe Perms scan to fix the permissions. While that was running, I had a log window open (system log, not the tool log). It was throwing up a bunch of errors (sorry if this spams the post):

Oct  4 14:56:22 Kyber  emhttpd: cmd: /usr/local/emhttp/plugins/fix.common.problems/scripts/newperms.sh
Oct  4 14:58:02 Kyber kernel: ata3.00: exception Emask 0x10 SAct 0xa5effff9 SErr 0x90202 action 0xe frozen
Oct  4 14:58:02 Kyber kernel: ata3.00: irq_stat 0x00400000, PHY RDY changed
Oct  4 14:58:02 Kyber kernel: ata3: SError: { RecovComm Persist PHYRdyChg 10B8B }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 60/40:00:40:11:6a/00:00:3b:00:00/40 tag 0 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: WRITE FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 61/c0:18:80:ea:91/00:00:43:00:00/40 tag 3 ncq dma 98304 out
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: WRITE FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 61/40:20:c0:d7:61/00:00:3c:00:00/40 tag 4 ncq dma 32768 out
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 60/40:28:40:09:91/00:00:43:00:00/40 tag 5 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 60/80:30:00:71:52/00:00:0c:00:00/40 tag 6 ncq dma 65536 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 60/80:38:40:70:52/00:00:0c:00:00/40 tag 7 ncq dma 65536 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 60/80:40:80:e1:05/00:00:0a:00:00/40 tag 8 ncq dma 65536 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 60/40:48:40:93:a6/00:00:09:00:00/40 tag 9 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 60/40:50:80:b3:67/00:00:08:00:00/40 tag 10 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 60/80:58:c0:b2:67/00:00:08:00:00/40 tag 11 ncq dma 65536 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 60/20:60:e0:c5:c5/00:00:07:00:00/40 tag 12 ncq dma 16384 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 60/20:68:a0:cc:05/00:00:06:00:00/40 tag 13 ncq dma 16384 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 60/40:70:c0:70:1f/00:00:44:00:00/40 tag 14 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: WRITE FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 61/40:78:40:d8:61/00:00:3c:00:00/40 tag 15 ncq dma 32768 out
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 60/40:80:80:2a:16/00:00:44:00:00/40 tag 16 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: WRITE FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 61/40:88:00:04:5a/00:00:3b:00:00/40 tag 17 ncq dma 32768 out
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: WRITE FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 61/40:90:80:d2:8d/00:00:3b:00:00/40 tag 18 ncq dma 32768 out
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 60/40:98:80:43:e9/00:00:42:00:00/40 tag 19 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 60/40:a8:80:47:91/00:00:43:00:00/40 tag 21 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 60/40:b0:40:36:23/00:00:44:00:00/40 tag 22 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 60/40:b8:40:70:91/00:00:43:00:00/40 tag 23 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 60/40:c0:80:48:91/00:00:43:00:00/40 tag 24 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 60/40:d0:00:7a:36/00:00:43:00:00/40 tag 26 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 60/40:e8:c0:4c:2a/00:00:44:00:00/40 tag 29 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata3.00: cmd 60/00:f8:c0:11:6a/01:00:3b:00:00/40 tag 31 ncq dma 131072 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:18:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata3.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata3: hard resetting link
Oct  4 14:58:02 Kyber kernel: ata2.00: exception Emask 0x10 SAct 0x3b9ffffd SErr 0x90202 action 0xe frozen
Oct  4 14:58:02 Kyber kernel: ata2.00: irq_stat 0x00400000, PHY RDY changed
Oct  4 14:58:02 Kyber kernel: ata2: SError: { RecovComm Persist PHYRdyChg 10B8B }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 60/40:00:40:d3:8d/00:00:3b:00:00/40 tag 0 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: WRITE FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 61/c0:10:80:ea:91/00:00:43:00:00/40 tag 2 ncq dma 98304 out
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 60/40:18:c0:13:5a/00:00:3b:00:00/40 tag 3 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 60/40:20:c0:11:5a/00:00:3b:00:00/40 tag 4 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: WRITE FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 61/40:28:c0:d7:61/00:00:3c:00:00/40 tag 5 ncq dma 32768 out
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 60/40:30:80:83:22/00:00:3b:00:00/40 tag 6 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 60/40:38:80:2a:16/00:00:44:00:00/40 tag 7 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 60/80:40:00:71:52/00:00:0c:00:00/40 tag 8 ncq dma 65536 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 60/80:48:40:70:52/00:00:0c:00:00/40 tag 9 ncq dma 65536 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 60/80:50:80:e1:05/00:00:0a:00:00/40 tag 10 ncq dma 65536 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 60/40:58:40:93:a6/00:00:09:00:00/40 tag 11 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 60/40:60:80:b3:67/00:00:08:00:00/40 tag 12 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 60/80:68:c0:b2:67/00:00:08:00:00/40 tag 13 ncq dma 65536 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 60/20:70:e0:c5:c5/00:00:07:00:00/40 tag 14 ncq dma 16384 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 60/20:78:a0:cc:05/00:00:06:00:00/40 tag 15 ncq dma 16384 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 60/40:80:40:70:91/00:00:43:00:00/40 tag 16 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: WRITE FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 61/40:88:40:d8:61/00:00:3c:00:00/40 tag 17 ncq dma 32768 out
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: WRITE FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 61/40:90:00:04:5a/00:00:3b:00:00/40 tag 18 ncq dma 32768 out
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: WRITE FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 61/40:98:80:1b:18/00:00:44:00:00/40 tag 19 ncq dma 32768 out
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: WRITE FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 61/40:a0:80:d2:8d/00:00:3b:00:00/40 tag 20 ncq dma 32768 out
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 60/40:b8:40:10:18/00:00:44:00:00/40 tag 23 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 60/40:c0:40:45:10/00:00:43:00:00/40 tag 24 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 60/c0:c8:80:ca:f2/00:00:42:00:00/40 tag 25 ncq dma 98304 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 60/40:d8:80:ba:79/00:00:43:00:00/40 tag 27 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 60/40:e0:80:b8:79/00:00:43:00:00/40 tag 28 ncq dma 32768 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)
Oct  4 14:58:02 Kyber kernel: ata2.00: status: { DRDY }
Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: READ FPDMA QUEUED
Oct  4 14:58:02 Kyber kernel: ata2.00: cmd 60/c0:e8:c0:b2:1d/00:00:44:00:00/40 tag 29 ncq dma 98304 in
Oct  4 14:58:02 Kyber kernel:         res 40/00:10:80:ea:91/00:00:43:00:00/40 Emask 0x10 (ATA bus error)

 

I checked the MAIN tab and I believe ATA2 and ATA3 are my Parity 1 and Parity 2 drives. When I click on the disk log for each of those drives, I get a similar log of errors, none of which show up for any of the actual array drives.
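To back up the "only ATA2 and ATA3" observation without scrolling through the whole paste, the failed commands can be tallied per port. This is only a rough sketch: the regular expression is written against the line format shown in the log above, and the function name `tally_failed_commands` is made up for illustration.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   "... kernel: ata3.00: failed command: READ FPDMA QUEUED"
FAILED_CMD = re.compile(r"kernel: (ata\d+)\.\d+: failed command: (.+)$")


def tally_failed_commands(lines):
    """Count failed commands per (port, command) pair."""
    counts = Counter()
    for line in lines:
        match = FAILED_CMD.search(line.rstrip())
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# A few lines copied from the log above; the last one is not a failed command.
sample = [
    "Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: READ FPDMA QUEUED",
    "Oct  4 14:58:02 Kyber kernel: ata3.00: failed command: WRITE FPDMA QUEUED",
    "Oct  4 14:58:02 Kyber kernel: ata2.00: failed command: READ FPDMA QUEUED",
    "Oct  4 14:58:02 Kyber kernel: ata3: hard resetting link",
]

for (port, command), n in sorted(tally_failed_commands(sample).items()):
    print(f"{port}  {command}: {n}")
```

On the server itself the same function could be fed the lines of the saved system log instead of `sample`. Any port that never shows up in the tally had no failed commands logged in that window.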

 

So is this an indication that my parity drives are failing? If so, that would explain a lot, such as the frequent parity checks failing with so many errors. The strange thing, though, is that neither drive seems to indicate anything wrong in the SMART logs. If this does in fact indicate a possible failure or corruption, how does one go about replacing the parity drives? I'm due for an upgrade soon anyway. These are 12TB drives and I'd like to maybe start moving up to 20TB.
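One thing that may help with the "failing drive or not" question is the `SErr 0x90202` value in the log. It is a bitmask of the SATA SError register, which reports problems on the link between controller and drive. The sketch below decodes it. The bit names are the ones the Linux libata driver prints, listed here from memory, so treat the table as an assumption; only the four flags that appear in the log itself are confirmed by it.

```python
# Bit positions of the SATA SError register, with the short names that
# Linux libata prints (assumed from memory; check against kernel source).
SERROR_BITS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serror(value):
    """Return the names of the SError flags set in value, lowest bit first."""
    return [name for bit, name in sorted(SERROR_BITS.items()) if value & (1 << bit)]


# SErr value taken from the "exception Emask 0x10 ... SErr 0x90202" lines above.
print(decode_serror(0x90202))
```

The decoded flags match the `SError: { RecovComm Persist PHYRdyChg 10B8B }` line the kernel printed. They describe communication and PHY-level events rather than bad sectors, which is at least consistent with SMART looking clean on both drives. That is a reading of the log, not a diagnosis.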

 

Just looking for some advice from the better educated people out there.  :)  Thanks!

Edited by hansolo77

It's suspicious that the issues are with two disks of the same model, both connected to the onboard controller. Disk1, on the other hand, is the same model but connected to the HBA, and there are no issues there. If you have free ports, connect one or both parity disks to the HBA. If you don't, swap them with other disks (except disk1, in case it causes problems for that one).


Yeah, my HBA is completely full with drives, no spares left. Is it possible that I'm having trouble because the parity drives are on the motherboard and not connected to the HBA? Like a speed problem, or something like cross compatibility (using SATA rather than SAS, even though all my drives are SATA). I have no problem troubleshooting, it's what I like to do. I'm just getting irritated with all the errors cropping up when there shouldn't be any.

 

Before trying to swap out the drives entirely, do you think changing SATA ports on the motherboard might help?  I'm not looking at it right now, but I think the motherboard had 4 or 6 SATA ports and the parities are only using 1 and 2.

 

If/when I go to swap the drives, how do I go about doing that?  Am I correct in this procedure:

  1. Stop Array
  2. Unassign Parity 1 and Disk 2
  3. Physically Remove Parity 1 and Disk 2
  4. Start Array (Detects drives are missing)
  5. Stop Array
  6. Swap their positions (Disk 2 to Parity 1, and Parity 1 to Disk 2)
  7. Reassign New Parity 1 and Disk 2
  8. Start Array

That would then either automatically start a parity rebuild, or I'd have to manually start it.  If I'm correct, do I need to do anything to the drives prior to rebuilding, like pre-clear them first?

14 minutes ago, hansolo77 said:

Like a speed problem, or something like cross compatibility

I suspect it's more of a compatibility problem, especially since there's another user with what look like issues between an identical AMD board SATA controller and an IronWolf drive, 12TB in that case.

 

I'm not suggesting you get new disks for now. Just connect both parity drives to the HBA and move other disks to the SATA controller in their place, just not disk1 since it's the same model. For example, connect disks 2 and 3 to the onboard SATA and both parity disks to the HBA in their place.


Ok, was I correct in the procedure?  I've never done this before.

 

Real Quick Edit:

I was checking out the parity drives just now and did a quick self test on them to see that they're still ok (which they are). I noticed in the SMART data that neither drive has had any reallocated sectors, and they really show no signs of pre-fail. However, what I ALSO noticed was that the 2nd drive is running at 6.0 Gb/s SATA speed while the first drive is only running at 1.5 Gb/s. Maybe there's a misconfiguration somewhere. They should both be running at 6, right?
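The negotiated speed of each port also shows up in the kernel log, so a mismatch like this can be spotted without opening each drive's SMART page. Below is a rough sketch that scans for the kernel's "SATA link up" messages. The two sample lines are illustrative, written in the kernel's usual format and not taken from this server's log, and the port numbers in them are assumptions.

```python
import re

# Matches kernel lines such as "ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)".
LINK_UP = re.compile(r"(ata\d+): SATA link up ([\d.]+) Gbps")


def negotiated_speeds(lines):
    """Return the most recently reported link speed (Gbps) for each ATA port."""
    speeds = {}
    for line in lines:
        match = LINK_UP.search(line)
        if match:
            speeds[match.group(1)] = float(match.group(2))
    return speeds


def below_expected(speeds, expected_gbps=6.0):
    """List ports whose negotiated speed is lower than expected_gbps."""
    return sorted(port for port, gbps in speeds.items() if gbps < expected_gbps)


# Illustrative lines only (hypothetical ports and values).
sample = [
    "kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)",
    "kernel: ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)",
]

print(below_expected(negotiated_speeds(sample)))
```

Linux can lower a port's link speed after repeated link errors, so a 1.5 Gb/s link on a 6 Gb/s drive may be a symptom of the same cabling or controller trouble rather than a separate misconfiguration.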

Edited by hansolo77

Just to quickly follow up. I took another look at the SATA cables connecting my parity drives. One of them had a loose connector plug on one end and the other one was pinched under the motherboard, from when I was trying to maneuver them out of the way to improve air flow. I went ahead and replaced both cables and just left them “in the way” for now. I did another correcting parity check and it found another 6k errors. But it fixed them, and I ran another check that came back clean. I then did a reboot and another check, and it was clean. Then I rebooted again and set the scheduler to run a parity check every Sunday, just so I’m not waiting a month for the next check. It came back clean again last night. So it might have been bad cables the whole time. I’ll wait another week with typical usage and another check before I mark this as resolved. I hate that it’s been doing this, but I’d hate it even more if it was a simple fix lol. Fingers crossed.

  • hansolo77 changed the title to Replace Potentially Failing Parity Drives?
  • 2 weeks later...
  • Solution

I'm going to mark this as resolved. I had another 100% successful (without error) parity check and the logs have been clean. I definitely feel at this point that the problem was with the SATA connector. It had a crack in it, so it was apparently keeping the connection loose. If things change, I'll try to remember to reopen this thread. Thanks for the support!

  • 2 months later...

Darn, 1st of the month and I'm getting the same symptoms again.  Lots of sync errors and the same errors in the system log as posted in the OP.  I've gone back through and re-read the thread.  Apparently my brain is numb.. I completely glazed over the suggestion you had of simply moving the drives around.  I'll give that a shot.
