rykr Posted June 9, 2023 Share Posted June 9, 2023 I have a 26TB array that is out of space. It has dual parity using 2 8TB drives. I bought 2 14TB drives. I replaced one of the 8TB parity drives with one of the 14TB drives and started the array and it started doing a parity re-sync. It ran ok for a couple of hours and now I'm getting a crazy number of read errors on 3 of the 4 data drives. What could possibly be going on here? Quote Link to comment
rykr Posted June 9, 2023 Author Share Posted June 9, 2023 I've checked smart status on the drives and done short tests. Drives appear ok? They also have have fans blowing on them and do not appear to be hot. Quote Link to comment
ChatNoir Posted June 9, 2023 Share Posted June 9, 2023 Just now, rykr said: What could possibly be going on here? please attach your diagnostics, there could be some information in them. Quote Link to comment
rykr Posted June 9, 2023 Author Share Posted June 9, 2023 Here ya go. vault-diagnostics-20230609-1217.zip Quote Link to comment
rykr Posted June 9, 2023 Author Share Posted June 9, 2023 lots of line like Jun 9 11:32:18 Vault kernel: md: disk3 read error, sector=2045235784 Jun 9 11:32:18 Vault kernel: md: disk3 read error, sector=2045235792 in the log Quote Link to comment
rykr Posted June 9, 2023 Author Share Posted June 9, 2023 Also a lot of stuff like this: Jun 9 11:31:16 Vault kernel: ata5.00: failed command: READ FPDMA QUEUED Jun 9 11:31:16 Vault kernel: ata5.00: cmd 60/08:00:80:29:19/00:00:84:02:00/40 tag 0 ncq dma 4096 in Jun 9 11:31:16 Vault kernel: res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) Jun 9 11:31:16 Vault kernel: ata5.00: status: { DRDY } Jun 9 11:31:16 Vault kernel: ata5.00: failed command: READ FPDMA QUEUED Jun 9 11:31:16 Vault kernel: ata5.00: cmd 60/08:08:40:77:18/00:00:80:01:00/40 tag 1 ncq dma 4096 in Jun 9 11:31:16 Vault kernel: res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) Jun 9 11:31:16 Vault kernel: ata5.00: status: { DRDY } Quote Link to comment
JorgeB Posted June 9, 2023 Share Posted June 9, 2023 Looks like the fairly common Ryzen onboard SATA controller problem under load, look for a BIOS update, you can also try v6.12-rc7, newer kernel may help, if issues persist best to use an addon controller, or use a different board. Quote Link to comment
rykr Posted June 9, 2023 Author Share Posted June 9, 2023 Thanks. I'll check those Quote Link to comment
rykr Posted June 9, 2023 Author Share Posted June 9, 2023 ok, checked BIOS update. There was one but only from 1 month ago with next to nothing in changelog. Didn't apply. Updated to 6.12-rc7 which was very smooth. I already have a 4 port external card so I moved two of my drives off the APU Sata ports from the board to the card. Now there is 4 drives connected to the chipset sata ports on the board and 4 drives connected to the external card. Restarted parity sync. Fingers crossed. If this fails, may look at board upgrade. Any suggestions? Should I go with one of the dual proc older Xeon boards? Quote Link to comment
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.