fysmd Posted February 23, 2019 (edited) So I've just had my third drive in a fortnight go bad. Last week I took the precaution of adding a 2nd parity drive, but another drive has gone unmountable since then. The failing drives have all been old, but I'm wondering if there is something going on here that I'm not noticing. I've posted diagnostics from today and from last week, when the server was in a very poor state. Any advice / comments much appreciated. [off to buy another couple of drives!] server-diagnostics-20190223-0956.zip server-diagnostics-20190216-1045.zip Edited March 3, 2019 by fysmd: Mark solved - thanks!
JorgeB Posted February 23, 2019 In the older diags disk7 had dropped offline, so there's no SMART for it. In the newer ones, SMART for the disabled disk17 looks OK, so it might have been a cable issue; disk18 is failing though.
fysmd Posted February 23, 2019 Thanks, I noticed disk18 and have a replacement now, currently preclearing. While starting a preclear, disk2 just went offline!! It does seem that all these drives were on the same cable. Bit of rearrangement and that cable's out of play now. REALLY glad I added the second parity now! Other than disk18 with errors, would I be right in thinking that they're likely OK?
JorgeB Posted February 23, 2019 I only took a quick look, but the other array drives looked mostly fine. There's a 2TB WD with a very high load cycle count; you could disable head parking with wdidle, but there's probably not much point now.
fysmd Posted February 23, 2019 Sorry, that's not what I meant. I now have four drives which I removed from the array, likely because of the cable. Is there any reason to suspect them faulty? Once back at full strength I may run preclear on them a few times to test them, but is there anything else?
fysmd Posted February 23, 2019 Just had another drive go, disk2 now, diag attached. server-diagnostics-20190223-1439.zip
JorgeB Posted February 23, 2019 fysmd said: "is there any reason to suspect them faulty?" Post SMART reports for all 4.
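For anyone reading along later, here is a minimal sketch of pulling the handful of attributes that usually matter out of a SMART report. It assumes the text is the ATA attribute table as printed by `smartctl -A /dev/sdX` (ten columns, raw value last). The watch list and the sample rows are illustrative assumptions, not values from the diagnostics posted in this thread.

```python
# Attributes worth a look when deciding whether a drive or its cabling is at fault.
# This watch list is an assumption for illustration, not an official threshold set.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
    "Load_Cycle_Count",
}


def watched_raw_values(report):
    """Return {attribute_name: raw_value} for watched rows of a SMART attribute table."""
    found = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # this skips the header row and any surrounding text.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in WATCHED and raw.isdigit():
            found[name] = int(raw)
    return found


# Hypothetical sample rows, made up for this example.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       612345
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       12
"""

if __name__ == "__main__":
    for name, raw in watched_raw_values(SAMPLE).items():
        print(f"{name}: {raw}")
```

As a rule of thumb, reallocated or pending sectors point at the drive itself, while a climbing UDMA_CRC_Error_Count usually points at the data link (cable or connector) rather than the platters.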
JorgeB Posted February 23, 2019 fysmd said: "Just had another drive go, disk2" It also dropped offline, but its previous SMART was fine. You can bring it back online by rebooting, then post a new report to confirm, but it's likely still fine; same for the previous ones.
fysmd Posted February 23, 2019 Machine reboot rather than stop and restart the array? Damn, I'm in the middle of preclears.
trurl Posted February 23, 2019 Are you sure you have enough power for all those disks? Maybe you have an old power supply? You have some hardware problem, and it's not the disks.
JorgeB Posted February 23, 2019 fysmd said: "machine reboot rather than stop and restart array?" Stopping the array won't work. You can remove and re-insert the disk if the server supports hot plug.
Frank1940 Posted February 23, 2019 trurl said: "Are you sure you have enough power for all those disks? Maybe you have an old power supply?" I would be suspicious of the power supply. There have been several hardware problems lately which appear to have been fixed when the PSU was replaced. (Power supplies are much more complicated devices now than they were back in the early 2000s and, perhaps, there has been some manufacturing cost cutting going on...)
JorgeB Posted February 23, 2019 I would say if the 4 disks share a miniSAS cable, as the OP suggested, start there; if not, I agree the PSU is a good suspect.
fysmd Posted February 23, 2019 The drives are split over two 550W PSUs, but I will tot up how many / which are on each.
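A rough way to do that tally, as a sketch only. The 2 A per drive at 12 V during spin-up is an assumed ballpark for 3.5" drives (check the drive datasheets), and the rail rating, drive count and other load in the example are hypothetical numbers, not this server's.

```python
# Assumed peak 12 V draw per 3.5" drive while spinning up (ballpark, not measured).
SPINUP_AMPS_12V = 2.0


def spinup_load_amps(drive_count, amps_per_drive=SPINUP_AMPS_12V):
    """Worst-case 12 V current if every drive on the PSU spins up at once."""
    return drive_count * amps_per_drive


def headroom_amps(rail_amps, drive_count, other_load_amps=0.0):
    """12 V current left over after the drives and any other load on the rail."""
    return rail_amps - other_load_amps - spinup_load_amps(drive_count)


if __name__ == "__main__":
    # Hypothetical PSU: 40 A on the 12 V rail, 12 drives, 10 A for everything else.
    print(headroom_amps(rail_amps=40.0, drive_count=12, other_load_amps=10.0))
```

A negative or very small result suggests moving drives to the other PSU or staggering spin-up. Note this only checks the PSU budget; it says nothing about voltage drop across splitters or adaptors between the PSU and the drives.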
fysmd Posted February 24, 2019 During the rebuild two otherwise happy drives now report a >>LOT<< of read errors. I'm replacing the PSU which supplies all the drives affected so far. Is there anything lurking in the logs that I don't see? server-diagnostics-20190224-0822.zip
fysmd Posted February 24, 2019 So, after more investigation I found that the only component common to the misbehaving drives was one of these puppies: https://www.scan.co.uk/products/silverstone-cp06-1-to-4-sata-power-adaptor-cable-with-capacitor I removed it, and I'm now over three hours into the rebuild with no faults so far (everything's crossed).
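The "what do the bad drives have in common" check can be sketched as a set intersection. Everything in the example wiring map below (disk-to-component assignments and component names) is hypothetical and only shows the shape of the approach; it is not the actual layout of this server.

```python
def common_components(paths, failed):
    """Return the components shared by every failed drive.

    paths maps a drive name to the set of components it depends on
    (data cable, PSU, power adaptor, controller port, ...).
    """
    component_sets = [set(paths[drive]) for drive in failed]
    if not component_sets:
        return set()
    return set.intersection(*component_sets)


# Hypothetical wiring map, made up for illustration.
WIRING = {
    "disk2": {"psu_a", "sas_cable_1", "power_adaptor_x"},
    "disk7": {"psu_a", "sas_cable_2", "power_adaptor_x"},
    "disk17": {"psu_b", "sas_cable_1", "power_adaptor_x"},
    "disk5": {"psu_a", "sas_cable_1", "power_adaptor_y"},
}

if __name__ == "__main__":
    print(common_components(WIRING, ["disk2", "disk7", "disk17"]))
```

With this made-up map the only shared item is the power adaptor, even though the PSU and the SAS cable each look suspicious when considering only two of the drives.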
fysmd Posted March 3, 2019 Hi all, Well, a week, a full rebuild and much frustration later, I think I can confirm that the SATA power cable with capacitor was my issue. No issues reported since! Thanks for the help and advice all!