March 2, 2017

A couple of days ago I noticed that one of my disks had redballed. I figured it was likely bad because of errors I'd seen, so I decided to replace it. The drive I replaced was a Hitachi 2TB; I replaced it with an HGST 4TB Coolspin and started the rebuild. I'm only seeing between 1 and 2 MB per second rebuild speed. At this rate, it will take about 25 - 30 days to rebuild the drive. That doesn't seem normal.

Here is my system info:

ASUS M4A785-M Motherboard
AMD Sempron 145 CPU
1GB DDR2 RAM
Corsair CX500 PSU
UnRAID v5.0.6
Parity: Seagate ST4000DM000 4TB
Cache: None
Array: 10 2TB WD Green, 1 2TB Hitachi, 2 4TB HGST Coolspin (including the new one that's being rebuilt)

The new drive being rebuilt is connected to the motherboard. I checked the data and power connections to the drives when I replaced the drive and they seemed OK.

I'm seeing this repeatedly in the log:

/usr/bin/tail -f /var/log/syslog
Mar 2 08:57:22 Tower kernel: ata2.00: configured for UDMA/33
Mar 2 08:57:22 Tower kernel: ata2: EH complete
Mar 2 08:57:22 Tower kernel: ata2.00: exception Emask 0x50 SAct 0x0 SErr 0x90a00 action 0xe frozen
Mar 2 08:57:22 Tower kernel: ata2.00: irq_stat 0x01400000, PHY RDY changed
Mar 2 08:57:22 Tower kernel: ata2: SError: { Persist HostInt PHYRdyChg 10B8B }
Mar 2 08:57:22 Tower kernel: ata2.00: failed command: READ DMA EXT
Mar 2 08:57:22 Tower kernel: ata2.00: cmd 25/00:00:60:10:d8/00:04:1b:00:00/e0 tag 0 dma 524288 in
Mar 2 08:57:22 Tower kernel: res 50/00:00:5f:10:d8/00:00:1b:00:00/e0 Emask 0x50 (ATA bus error)
Mar 2 08:57:22 Tower kernel: ata2.00: status: { DRDY }
Mar 2 08:57:22 Tower kernel: ata2: hard resetting link
Mar 2 08:57:29 Tower kernel: ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Mar 2 08:57:29 Tower kernel: ata2.00: configured for UDMA/33
Mar 2 08:57:29 Tower kernel: ata2: EH complete
Mar 2 08:57:30 Tower kernel: ata2.00: exception Emask 0x50 SAct 0x0 SErr 0x90a00 action 0xe frozen
Mar 2 08:57:30 Tower kernel: ata2.00: irq_stat 0x01400000, PHY RDY changed
Mar 2 08:57:30 Tower kernel: ata2: SError: { Persist HostInt PHYRdyChg 10B8B }
Mar 2 08:57:30 Tower kernel: ata2.00: failed command: READ DMA EXT
Mar 2 08:57:30 Tower kernel: ata2.00: cmd 25/00:00:60:50:d8/00:04:1b:00:00/e0 tag 0 dma 524288 in
Mar 2 08:57:30 Tower kernel: res 50/00:00:5f:50:d8/00:00:1b:00:00/e0 Emask 0x50 (ATA bus error)
Mar 2 08:57:30 Tower kernel: ata2.00: status: { DRDY }
Mar 2 08:57:30 Tower kernel: ata2: hard resetting link

Perhaps an issue with the data connection? Any ideas on what to check or how to fix this?

syslog-2017-03-02.txt
Edited March 5, 2017 by bw1 (solved)
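When the syslog is full of entries like these, it can help to tally them per ATA port to see which links are misbehaving. The following is a minimal sketch, not a tool from this thread: the function name `count_link_events` and the two regular expressions are assumptions based only on the line shapes in the excerpt above, and other kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Port name (e.g. "ata2") followed by an optional device suffix (".00").
PORT = r"(ata\d+)(?:\.\d+)?"
EXCEPTION_RE = re.compile(rf"kernel: {PORT}: exception Emask")
RESET_RE = re.compile(rf"kernel: {PORT}: hard resetting link")


def count_link_events(lines):
    """Return (exceptions, resets) Counters keyed by ATA port name."""
    exceptions, resets = Counter(), Counter()
    for line in lines:
        match = EXCEPTION_RE.search(line)
        if match:
            exceptions[match.group(1)] += 1
            continue
        match = RESET_RE.search(line)
        if match:
            resets[match.group(1)] += 1
    return exceptions, resets


# Four lines copied from the excerpt in this post.
SAMPLE = [
    "Mar 2 08:57:22 Tower kernel: ata2.00: exception Emask 0x50 SAct 0x0 SErr 0x90a00 action 0xe frozen",
    "Mar 2 08:57:22 Tower kernel: ata2: hard resetting link",
    "Mar 2 08:57:30 Tower kernel: ata2.00: exception Emask 0x50 SAct 0x0 SErr 0x90a00 action 0xe frozen",
    "Mar 2 08:57:30 Tower kernel: ata2: hard resetting link",
]

if __name__ == "__main__":
    exceptions, resets = count_link_events(SAMPLE)
    for port in sorted(set(exceptions) | set(resets)):
        print(f"{port}: {exceptions[port]} exceptions, {resets[port]} link resets")
```

To run it against a real log, pass an open file instead of `SAMPLE`, for example `count_link_events(open("/var/log/syslog"))`. The tally only names the port; working out which disk sits on that port still needs the boot-time portion of the log.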
March 2, 2017
9 minutes ago, bw1 said: Perhaps an issue with the data connection?
Probably. Your syslog has rotated. The one you posted is just full of those errors. Without more I can't even tell which disk it is referring to. Can you get us the older logs? They are in /var/log. Once you get this straight you really should consider upgrading. It is difficult to support this old version since most haven't used it in years.
March 2, 2017 Author
1 minute ago, trurl said: Probably. Your syslog has rotated. The one you posted is just full of those errors. Without more I can't even tell which disk it is referring to. Can you get us the older logs? They are in /var/log. Once you get this straight you really should consider upgrading. It is difficult to support this old version since most haven't used it in years.
I attached the one that I created when I first started the rebuild. Hope that helps. I haven't checked here in a while. The 6.x version that I downloaded and have actually used on a test server was still beta. I'll definitely be looking into upgrading.
syslog-2017-02-28.txt
March 2, 2017
According to that older syslog, ata2 is disk2, but I don't think we can trust that since I assume you rebooted since that log was taken. Have you looked in /var/log for other logs? Did you test the new disk with preclear or anything?
March 2, 2017 Author
8 minutes ago, trurl said: According to that older syslog, ata2 is disk2, but I don't think we can trust that since I assume you rebooted since that log was taken. Have you looked in /var/log for other logs? Did you test the new disk with preclear or anything?
Yes, the new disk was precleared 3 times, about 3 years ago, and it's been in storage since. I assume the disk won't go bad in storage.
syslog.1 syslog.2
March 2, 2017 Author
Well those attached files don't look very readable! And I didn't zip them.
Edited March 2, 2017 by bw1
March 2, 2017
30 minutes ago, bw1 said: Well those attached files don't look very readable! And I didn't zip them.
I could use them OK. Looks like ata2 is disk2, but the disk you are rebuilding is disk3. Stop, shutdown and recheck the connections. The disk3 rebuild isn't going to be good if disk2 can't be read reliably.
March 2, 2017 Author
43 minutes ago, trurl said: I could use them OK. Looks like ata2 is disk2, but the disk you are rebuilding is disk3. Stop, shutdown and recheck the connections. The disk3 rebuild isn't going to be good if disk2 can't be read reliably.
OK, I cancelled the rebuild, stopped, and shut down. The connections looked fine, but I pulled the first 3 data connectors from the SS-500 5-in-3 enclosure and reconnected them anyway. Now I'm only getting 200-300 KB/s, so I think I made it worse.
syslog-2017-03-02-2.zip
March 2, 2017
Now disk2 and disk4 are resetting connections, so it is worse. Make sure both SATA and power connections are good at both ends. SATA connections should be square on the connector. If you have bundled your cables you may be putting stress on the connection.
March 2, 2017 Author
56 minutes ago, trurl said: Now disk2 and disk4 are resetting connections, so it is worse. Make sure both SATA and power connections are good at both ends. SATA connections should be square on the connector. If you have bundled your cables you may be putting stress on the connection.
Thanks for your help. I'll have to check the connections again, but I'll have to do that later. I do have another power supply that I can try, and I also have a motherboard if I need to swap that out. I'll have to check and see if I have more SATA cables. BTW, when I went to shut down, I still had Windows File Explorer connected to the flash share and I was getting errors unmounting the drives. I had to shut down my desktop computer that was previously connected, restart it, and then reconnect the browser to the Tower. That's when I noticed the Parity drive was missing. So I definitely have some kind of connection problem.
March 2, 2017
Those errors are symptomatic of a loose connection, drive disappearing then reappearing, with line corruption. That could be loose connectors or bad power, and bad power is my best guess. It's possible your power supply is failing, or there are too many drives on this power rail.
March 2, 2017 Author
2 hours ago, RobJ said: Those errors are symptomatic of a loose connection, drive disappearing then reappearing, with line corruption. That could be loose connectors or bad power, and bad power is my best guess. It's possible your power supply is failing, or there are too many drives on this power rail.
I thought the CX500 was good for 15+ drives. I only have 14 and they're low power drives. But thanks, that will be one thing I will check since I have another PSU available that has a higher output.
March 2, 2017
11 minutes ago, bw1 said: I thought the CX500 was good for 15+ drives. I only have 14 and they're low power drives. But thanks, that will be one thing I will check since I have another PSU available that has a higher output.
If that is a Corsair CX500, the Corsair CX series of power supplies does NOT have a good reputation! The advice has generally been to buy any Corsair power supply but the CX series. The higher end Corsairs are decent. I'm almost shocked that you have been able to run very long with 14 drives. Bad power supplies fail quicker, and often fail to maintain correct voltages under load. You might try a PSU tester, they're fairly inexpensive.
March 5, 2017 Author
On Thursday, March 02, 2017 at 2:29 PM, RobJ said: If that is a Corsair CX500, the Corsair CX series of power supplies does NOT have a good reputation! The advice has generally been to buy any Corsair power supply but the CX series. The higher end Corsairs are decent. I'm almost shocked that you have been able to run very long with 14 drives. Bad power supplies fail quicker, and often fail to maintain correct voltages under load. You might try a PSU tester, they're fairly inexpensive.
Yes, it is the Corsair CX500. Like I said, when I selected it, it was one of the power supplies recommended here for up to 15 drives. But maybe it has gone bad. I swapped the PSU out for a Seasonic X-650 and that seems to have fixed it:
Data-Rebuild in progress.
Total size: 4 TB
Current position: 326.82 GB (8%)
Estimated speed: 100.22 MB/sec
Estimated finish: 611 minutes
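The finish estimate in that status report can be checked with simple arithmetic: remaining data divided by current speed. The sketch below is only an illustration. The function name `rebuild_eta_seconds` is hypothetical, and it assumes decimal units (4 TB = 4000 GB, 1 GB = 1000 MB), which may not be exactly how UnRAID computes its figure.

```python
def rebuild_eta_seconds(total_gb, position_gb, speed_mb_s):
    """Seconds left in a rebuild, assuming decimal units (1 GB = 1000 MB)."""
    if speed_mb_s <= 0:
        raise ValueError("speed must be positive")
    remaining_mb = (total_gb - position_gb) * 1000
    return remaining_mb / speed_mb_s


if __name__ == "__main__":
    # Figures from the status report above (4 TB taken as 4000 GB).
    minutes = rebuild_eta_seconds(4000, 326.82, 100.22) / 60
    print(f"after PSU swap: about {minutes:.0f} minutes left")

    # A full rebuild at the 1 to 2 MB/s seen before the swap.
    for speed in (1, 2):
        days = rebuild_eta_seconds(4000, 0, speed) / 86400
        print(f"at {speed} MB/s: about {days:.0f} days")
```

Under those assumptions the first figure rounds to the reported 611 minutes, and a full 4 TB rebuild at 1 to 2 MB/s works out to roughly 23 to 46 days, the same order as the 25 - 30 days estimated in the first post.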