StewLoft Posted August 7, 2022 Author Share Posted August 7, 2022 OK, I stopped rebuild. Will run memtest next. Quote Link to comment
StewLoft Posted August 8, 2022 Author Share Posted August 8, 2022 memtest passed, no errors Quote Link to comment
StewLoft Posted August 8, 2022 Author Share Posted August 8, 2022 tower-diagnostics-20220807-1929.zip Quote Link to comment
trurl Posted August 8, 2022 Share Posted August 8, 2022 That might not have been a long enough memtest, but usually a problem will show up in the first pass, so I guess we will go with it. The best advice we had was probably here: 16 hours ago, JorgeB said: recommend running xfs_repair (without -n) on all array disks. But let's do it with -n first, and one disk at a time, starting with unmountable disk9. Check filesystem on disk9, and capture the output so you can post it. Quote Link to comment
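For anyone who prefers to script the check-only pass described above, here is a minimal sketch rather than a definitive procedure. It assumes disk9 is exposed as /dev/md9 (an assumption; confirm the device name on your own system) and that the array is started in maintenance mode. The helper names `xfs_check_command` and `run_and_capture` are hypothetical, and the actual run is left commented out.

```python
import subprocess

def xfs_check_command(device, dry_run=True):
    """Build the xfs_repair argument list; -n means check only, no changes."""
    cmd = ["xfs_repair"]
    if dry_run:
        cmd.append("-n")
    cmd.append(device)
    return cmd

def run_and_capture(cmd, log_path):
    """Run cmd, save combined stdout/stderr to log_path, return the exit code."""
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    with open(log_path, "w") as fh:
        fh.write(proc.stdout)
    return proc.returncode

# Assumption: disk9 maps to /dev/md9; verify this before running anything.
disk9_cmd = xfs_check_command("/dev/md9")
print(" ".join(disk9_cmd))

# Uncomment to actually run the check and keep the output for posting:
# run_and_capture(disk9_cmd, "disk9-xfs_repair-n.txt")
```

Keeping the dry run as the default makes it harder to start a real repair by accident; the output file can then be attached to a reply.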
JorgeB Posted August 8, 2022 Share Posted August 8, 2022 There are still ATA errors with multiple disks on the last diags:
Aug 7 19:26:59 Tower kernel: ata2: link is slow to respond, please be patient (ready=0)
Aug 7 19:27:03 Tower kernel: ata2: COMRESET failed (errno=-16)
Aug 7 19:27:08 Tower kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Aug 7 19:27:08 Tower kernel: ata2.00: configured for UDMA/133
Aug 7 19:27:08 Tower kernel: mdcmd (36): set md_write_method 1
Aug 7 19:27:08 Tower kernel:
Aug 7 19:27:12 Tower kernel: ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Aug 7 19:27:12 Tower kernel: ata6.00: configured for UDMA/133
Aug 7 19:27:13 Tower kernel: ata7: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Aug 7 19:27:13 Tower kernel: ata7.00: configured for UDMA/133
Aug 7 19:27:52 Tower kernel: ata6: link is slow to respond, please be patient (ready=0)
Aug 7 19:27:55 Tower kernel: ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Aug 7 19:27:55 Tower kernel: ata6.00: configured for UDMA/133
Ideally you fix those before running xfs_repair. If replacing the SATA cables didn't help, it could be a power problem: PSU or power cables. Quote Link to comment
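To see at a glance which ports are misbehaving in a longer syslog, the messages quoted above can be tallied per ATA port. This is only a sketch: the regular expression covers just the two problem messages that appear in this post ("link is slow to respond" and "COMRESET failed"), and the function name `count_ata_problems` is made up for illustration.

```python
import re
from collections import Counter

# Matches only the two problem messages quoted in the post above.
ATA_PROBLEM = re.compile(
    r"kernel: (ata\d+)(?:\.\d+)?: (link is slow to respond|COMRESET failed)"
)

def count_ata_problems(lines):
    """Return a Counter mapping ATA port (e.g. 'ata2') to problem-line count."""
    counts = Counter()
    for line in lines:
        match = ATA_PROBLEM.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

sample = [
    "Aug 7 19:26:59 Tower kernel: ata2: link is slow to respond, please be patient (ready=0)",
    "Aug 7 19:27:03 Tower kernel: ata2: COMRESET failed (errno=-16)",
    "Aug 7 19:27:08 Tower kernel: ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)",
    "Aug 7 19:27:52 Tower kernel: ata6: link is slow to respond, please be patient (ready=0)",
]

result = count_ata_problems(sample)
for port, n in sorted(result.items()):
    print(f"{port}: {n} problem line(s)")
```

Lines that only report a successful link ("SATA link up") are deliberately ignored, so a port shows up in the tally only when it had trouble.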
trurl Posted August 8, 2022 Share Posted August 8, 2022 3 hours ago, JorgeB said: power problem, PSU or power cables. That's what I'm beginning to suspect. What is the exact model of your power supply? How are the drives connected to the power supply? You probably don't want more than 3, maybe 4, drives on a single SATA power cable. Don't split SATA power connectors. You might be able to put more than 4 drives on a peripheral cable; if you need to split anything, do it there on the Molex connectors. Quote Link to comment
StewLoft Posted August 9, 2022 Author Share Posted August 9, 2022 Will I lose my data running xfs_repair? I checked the power cables, and even switched a few around when I switched the SATA cables. I can order another power supply and try that. Quote Link to comment
StewLoft Posted August 9, 2022 Author Share Posted August 9, 2022 11 hours ago, trurl said: You probably don't want more than 3, maybe 4, drives on a single SATA power cable. Don't split SATA power connectors. You might be able to put more than 4 drives on a peripheral cable, if you need to split anything do it there on the molex connectors. I am using Molex-to-SATA cables. Quote Link to comment
StewLoft Posted August 9, 2022 Author Share Posted August 9, 2022 My current power supply is an EVGA 600watt BA Series. Quote Link to comment
StewLoft Posted August 9, 2022 Author Share Posted August 9, 2022 Recommendations on a power supply? I have 13 HDDs and 1 SSD, do not use GPU power, and have 2 PCIe controllers, 5 fans, and a 65 W CPU with a stock cooler. Quote Link to comment
JorgeB Posted August 9, 2022 Share Posted August 9, 2022 Post new diags after checking the cables to see if there are still issues. A good quality single rail 550/600W PSU is enough for that. Corsair and Seasonic, for example, are usually well regarded, though avoid the cheapest models and go for a gold model. Quote Link to comment
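A rough way to sanity-check a PSU size is to add up a worst-case figure for each component. The sketch below uses the component counts and the 65 W CPU rating from the posts above; every other wattage is an illustrative assumption (not a measured or vendor-specified value) and should be replaced with numbers from the actual datasheets.

```python
# Counts come from the thread; wattages (except the CPU rating) are
# illustrative assumptions, not measured or vendor-specified values.
COMPONENTS = [
    # (name, count, assumed peak watts each)
    ("HDD at spin-up", 13, 25),
    ("SSD", 1, 5),
    ("PCIe controller", 2, 10),
    ("fan", 5, 3),
    ("CPU", 1, 65),  # 65 W rating as stated in the thread
    ("motherboard and RAM", 1, 50),
]

def peak_watts(components):
    """Sum count * watts over all components for a worst-case estimate."""
    return sum(count * watts for _name, count, watts in components)

total = peak_watts(COMPONENTS)
psu_watts = 600  # candidate PSU size discussed in the thread
headroom = psu_watts - total

print(f"Estimated peak draw: {total} W")
print(f"Headroom on a {psu_watts} W unit: {headroom} W")
```

With these assumed figures the estimate stays under 600 W, but the result is only as good as the inputs. Total wattage also says nothing about how the drives are distributed across individual power cables.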
StewLoft Posted August 10, 2022 Author Share Posted August 10, 2022 I checked both SATA cables and power cables for all the HDDs. Restarted, and the same 2 disks are showing unmountable again. Main page shows parity not valid, dashboard shows parity valid. Diagnostics attached. tower-diagnostics-20220809-1956.zip Quote Link to comment
trurl Posted August 10, 2022 Share Posted August 10, 2022 Disk1 disabled and unmountable, disk9 unmountable. That's the way it was in your previous diagnostics. Nothing you can do to hardware is going to fix those things since they have already happened. The filesystems of both have to be repaired and disk1 has to be rebuilt. But if hardware isn't working well then repair and rebuild isn't going to work well. Can't tell much about how well the hardware is currently working since diagnostics are immediately after reboot. 20 minutes ago, StewLoft said: Main page shows parity not valid, dashboard shows parity valid Not clear what you mean there, parity2 looks fine in diagnostics. Post screenshots. Quote Link to comment
StewLoft Posted August 10, 2022 Author Share Posted August 10, 2022 New Diagnostics. tower-diagnostics-20220809-2201.zip Quote Link to comment
StewLoft Posted August 10, 2022 Author Share Posted August 10, 2022 I am considering this power supply. Thermaltake Toughpower GF1 850W 80+ Gold SLI/Crossfire Ready Ultra Quiet 140mm Hydraulic Bearing Smart Zero Fan Full Modular Power Supply 10 Year Warranty PS-TPD-0850FNFAGU-1 Opinions? Quote Link to comment
trurl Posted August 10, 2022 Share Posted August 10, 2022 8 hours ago, StewLoft said: I am considering this power supply That looks very nice. I wish we could be more confident it would solve anything. You could get rid of the Marvell controllers, but I'm also not confident that would solve anything. These diagnostics are mostly full of call traces. Not clear to me if it has anything to do with drives, connections, cables, controllers... Maybe someone else will have some idea what those mean. Without reviewing the thread I can't remember every suggestion. I don't think you are using VMs, but is VT-d enabled in BIOS? Marvell and VT-d might not play well together. Have you tried reseating the controllers? Connections and cables: I am beginning to think drives, connections, and cables are not the real problem, but I will mention some things about that anyway. Are you bundling data cables? Don't. All connectors should sit squarely on the connection with plenty of slack in the cables. I know you did memtest, though it may have only been one pass. Couldn't hurt to let memtest run overnight. Quote Link to comment
StewLoft Posted August 11, 2022 Author Share Posted August 11, 2022 Not using VMs, cables are not bundled, and I did try reseating the Marvell controllers. I have ordered 1 non-Marvell controller in case that has something to do with it. Connectors are on solid and have plenty of slack. I am not sure what VT-d is. Should I just remove the 2 drives that are showing unmountable? Quote Link to comment
trurl Posted August 11, 2022 Share Posted August 11, 2022 You already have a disabled disk and single parity. You can't start the array if you remove any disks unless you New Config and rebuild parity. And if you New Config, you can't rebuild disk1. I guess there is some possibility that everything else would work better without those disks but maybe not. I need to review the thread. Quote Link to comment
trurl Posted August 11, 2022 Share Posted August 11, 2022 1 hour ago, StewLoft said: I have ordered 1 non-marvel controller What did you order? Do you still have the original disk1? Is it currently plugged in? Quote Link to comment
StewLoft Posted August 11, 2022 Author Share Posted August 11, 2022 The original disk1 is still in place in the array as disk1. Quote Link to comment
trurl Posted August 11, 2022 Share Posted August 11, 2022 21 hours ago, trurl said: What did you order? Quote Link to comment