abhi.ko Posted February 7, 2022 (edited February 7, 2022): Hello all - out of the blue, multiple disks in my array are failing and multiple others are reporting errors. It seems like my controller or cables are causing the issue, but I am not sure; I checked everything recently when I added some RAM and all looked good. I currently have the server shut down since there are now 2 failed disks. The attached diagnostics were taken before the second one failed. I replaced one failed disk, and while the parity sync was running the log filled with errors and multiple disks reported errors: one failed at the beginning of the parity sync and the second towards the end. I recently updated to 6.10 rc2, but the initial issue started while I was on 6.9 stable, referenced here; the disk now in a failed state is the same one referenced in that thread. I added a Win 11 VM yesterday, which went fine and everything was working well, and then this started. Please help. I have an HA virtual machine that is always running, so my home automation is currently down as well. I am trying to determine the next course of action; all the hardware is pretty new. Attached: tower-diagnostics-20220207-0650.zip
trurl Posted February 8, 2022: Your log space had completely filled, and the diagnostics include only a little of it. Since you have so many disks, I wonder if power might be an issue. Are you using splitters?
abhi.ko Posted February 8, 2022 (edited February 8, 2022: attached diagnostics): I have an LSI 9300 16i controller with 4 SAS cables plugged into it, and it has a power connection from the PSU as well. The PSU is a 1000W EVGA unit, which I thought should be plenty of power; shouldn't it be? I just reconnected all the cables to the LSI card and re-seated the card. I started the server and see that the two disks are in a disabled error state. I have dual parity; can I just rebuild each disk onto itself? Attached is a diagnostics output taken without starting the array, not sure if that helps. Attached: tower-diagnostics-20220207-1833.zip
trurl Posted February 8, 2022: 2 hours ago, abhi.ko said: "diagnostics output without starting the array" Can't tell anything about filesystems (unmountable?) or shares without the array started.
abhi.ko Posted February 8, 2022 (edited February 8, 2022): What do you suggest? Should I start the array with the two disabled disks and run diagnostics to post here? The array is mountable, though. I am just worried about losing data. I keep getting reallocated-sector errors, and the counts keep going up on all the Toshiba disks in the array. It is weird that it is just those disks; they are spread across different trays (physically) in the case as well, so it is not like one tray/backplane has gone bad, and the other disks are not showing the same errors. Have any issues with Toshiba disks been reported, especially with RC2? Screenshot below of all the warnings I got when I turned the server on for a few minutes. Any advice on next steps, please?
itimpi Posted February 8, 2022: Reallocated sectors are something completely internal to the drive. If the count is not stable for a given drive, I would consider replacing it. If multiple drives are showing this symptom, did you purchase them all at the same time? In that case you could have a bad batch.
JorgeB Posted February 8, 2022: Bad power, be it a failing PSU or a problem with the connections, can also cause reallocated sectors. In those cases you can usually hear the drives powering up/down when that happens.
abhi.ko Posted February 8, 2022: Thank you both. How can we tell whether it is the drives, the PSU, or the cables? I do not know why it is only those Toshiba drives and not any of the other 18 or so drives.
JorgeB Posted February 8, 2022: Like mentioned, if it's a power issue you should hear the drives clicking or spinning up/down during normal usage.
abhi.ko Posted February 8, 2022 (edited February 8, 2022): Okay, I will turn it on, listen, and see if I hear anything. I have reconnected all the cables, re-seated the LSI card, and made sure all connections are tight. Question - I have two drives that are disabled in the array. What are the next steps when I turn it on? Do I unassign them, start the array, stop it again, reassign the same drives, and let the parity resync run for both drives at once? Or do I do one drive at a time, or something else? I have dual parity, so if one more drive becomes disabled I will lose data, won't I? I just rebuilt another old Seagate drive that failed last week, not sure if that is related to this or not, so I'm concerned that one of these drives with reallocated sectors will go bad before the parity sync finishes and cause me to lose data. Any suggestions for next steps would be very helpful. As I asked earlier, I can turn the array on, run and post diagnostics as a first step, and then shut down the server, if you think more information would help and if that is safer.
JorgeB Posted February 8, 2022: 7 minutes ago, abhi.ko said: "I have two drives that are disabled in the array - what are the next steps when I turn it on" First post the diags after array start like asked, so we can see if the emulated disks are mounting.
abhi.ko Posted February 8, 2022: Will do. At work now, but will post soon.
Michael_P Posted February 8, 2022: 2 hours ago, abhi.ko said: "I do not know why it is only those Toshiba drives" I had a few Toshiba drives do the same thing, and it was power related (too many drives on one line). So it's possible. Only the Toshiba drives would start reallocating sectors; the WD drives would just fall out of the array. If you're using splitters, try eliminating them or using as few as possible.
abhi.ko Posted February 8, 2022: Thank you @Michael_P. What do you mean by splitters? Like a SAS to SATA cable? I use a Norco 4224 case, which has a SAS backplane with 6 SAS connectors (1 per 4-drive tray). The 8 SATA ports on my motherboard are connected using 2 SAS to 4 SATA reverse breakout cables, and 16 drives go directly to the LSI 9300 16i card using SAS connectors similar to this. Both failed drives are on the SAS cables connected directly to the HBA card. Only one of the SATA-connected disks is showing errors. Do you mean the reverse breakout cables when you say splitters?
abhi.ko Posted February 8, 2022: 3 hours ago, JorgeB said: "First post the diags after array start like asked, so we can see if the emulated disks are mounting." Diagnostics attached. Also attached is a picture I took of the monitor connected to the server; it seems to be about the two disabled disks, but I attached it in case it gives more info. All other disks mounted fine. I noticed no sounds other than the normal boot-up and fan noises. Hopefully these diagnostics have enough information. Attached: tower-diagnostics-20220208-1240.zip
Vr2Io Posted February 8, 2022 (edited February 8, 2022): 18 hours ago, abhi.ko said: "psu is a 1000w EVGA psu which should be plenty of power I thought, shouldn't it be?" Yes, but that rating mostly refers to the 12V rail. 6 hours ago, abhi.ko said: "Is there any issues with Toshiba disks that has been reported, especially with RC2?" No, I have ~18 Toshiba 6TB disks with no issues. 1 hour ago, abhi.ko said: "What do you mean by splitters?" He means Molex/SATA power splitters; since you use a backplane, that shouldn't be the issue. Btw, it looks like a PSU problem.
JorgeB Posted February 8, 2022: 9 minutes ago, abhi.ko said: "seems like that is for the two disabled disks" Yep, check the filesystem on both disabled disks. Then, if they mount, look for a lost+found folder; if there are a lot of files there, it's probably best to re-sync parity instead of rebuilding.
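As a rough illustration of the "look for a lost+found folder" step above, the following Python sketch counts the files under `lost+found` on a mounted disk. It is only an assumption-laden example: the disk paths `/mnt/disk1` and `/mnt/disk2` are hypothetical placeholders (the thread does not say which disk numbers are disabled), and how many files counts as "a lot" is left to judgment.

```python
import os


def count_lost_and_found(mount_point):
    """Return the number of files under <mount_point>/lost+found (0 if the folder is absent)."""
    root = os.path.join(mount_point, "lost+found")
    if not os.path.isdir(root):
        return 0
    total = 0
    # Walk the whole tree, since recovered files may sit in subfolders.
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


if __name__ == "__main__":
    # Hypothetical disk numbers; substitute the emulated disks in question.
    for disk in ("/mnt/disk1", "/mnt/disk2"):
        print(disk, count_lost_and_found(disk))
```

A count of zero just means no `lost+found` folder exists on that disk or that it is empty.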
Vr2Io Posted February 8, 2022 (edited February 8, 2022): Please describe in more detail how the PSU power is cabled to the backplanes and disks. Also check each plug/socket for any signs of burn-out. How old are those Toshiba disks and the PSU?
Michael_P Posted February 8, 2022: 1 hour ago, abhi.ko said: "What do you mean by splitters?" 34 minutes ago, Vr2Io said: "He means molex/sata power spliter, as you use backplate then it won't that issue." I have the 4224 too, and it absolutely will be an issue if you're running all 6 backplanes off of 1 line back to the PSU (ask me how I know).
abhi.ko Posted February 8, 2022 (edited February 8, 2022): 2 hours ago, JorgeB said: "Yep, check filesystem on both disabled disks, then if they mount look for a lost+found folder, if there's a lot of files there it's probably best to re-sync parity instead of rebuilding." Thank you @JorgeB, I will do it. Should I do something about the power situation in my case before that? Based on the other comments here from @Michael_P and @Vr2Io (thank you both): yes, I am using power splitters to connect all 6 backplanes to a single PSU connector (picture attached), which I think might be causing all of this; please correct me if I am wrong. Should I get a different PSU? I currently have this, which I believe is a single +12V rail PSU with an 83A max output. I have a total of 23 disks including parity and cache (the cache is an SSD), and the majority of the HDDs are 7200RPM. If I should change it, do you have any recommendations? Or should I change how the drives are powered?
Vr2Io Posted February 8, 2022 (edited February 8, 2022): Your fully modular PSU has 1 PERIF and 3 SATA power sockets; all four could be used for the backplanes (modification needed), but that still leaves you two short of the 6 required. If you know how to fit Molex power plugs onto a cable you can DIY it (any mistake could burn everything); otherwise you need to ask EVGA about buying 3 more Molex cables, or find another PSU that suits your needs. https://www.moddiy.com/products/DIY-IDE-Molex-Power-EZ-Crimp-Connector-%2d-Black.html
Michael_P Posted February 8, 2022: 10 minutes ago, Vr2Io said: "DIY molex power plug" This is what I did to add extra connectors to one of the unused SATA lines. It's easy enough to tone out with a basic multimeter (use a splitter to check that all DIY connectors tone out the same).
Michael_P Posted February 8, 2022: Just to add: think max 4 drives per connector, and not all on the same line. The PSU is plenty, but you have to spread the load.
Vr2Io Posted February 8, 2022: 11 minutes ago, Michael_P said: "Just to add, think max 4 drives per connector, and not all on the same line. The PSU is plenty, but you have to spread the load" Yes, always follow that rule.
abhi.ko Posted February 8, 2022: 1 hour ago, Vr2Io said: "Your full modual PSU have 1 PERF and 3 SATA power socket, all four could use for backplane ( need modify ), but still miss two for 6 required. If you know how to DIY molex power plug on cable then you can DIY ( any wrong could burn all stuff ), otherwise you need ask EVGA to buy 3 more molex cable or found other PSU which suit your need. https://www.moddiy.com/products/DIY-IDE-Molex-Power-EZ-Crimp-Connector-%2d-Black.html" Thanks, but I am a little confused: this is all still on a single 12V line, right, irrespective of which connector we plug into? So how does that distribute the load? Apologies if I am missing something obvious. Is the 83A rating enough to boot up the whole system? I thought that was the problem and that I needed a beefier PSU with more amperage on the single 12V line. If yes, then I have a few of these lying around. Shouldn't these do the trick: connect them to the SATA connectors from the PSU, and connect the 6 backplanes to the 4 SATA/PERIF connectors? It is not 1:1, but distributing the load that way should help, right? Currently everything is on one connector to the PSU.
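To illustrate the "spread the load" advice given earlier in the thread: the rail rating caps the total current, but each cable run and connector back to the PSU carries only the drives hanging off it, so the per-cable current is what changes when the backplanes are redistributed. The Python sketch below is a back-of-the-envelope estimate under stated assumptions, not a measurement. The 2.0 A figure for 12V spin-up current per drive is an assumed round number (check the drive datasheets), and all 24 bays of the case are treated as populated with spinning drives as a worst case.

```python
# Assumed figure, not taken from this thread: rough 12V peak current
# drawn by one 3.5" 7200RPM drive while spinning up.
SPINUP_AMPS_12V = 2.0


def amps_per_cable(drives_per_cable, amps_per_drive=SPINUP_AMPS_12V):
    """Return the estimated 12V spin-up current carried by each PSU cable."""
    return [n * amps_per_drive for n in drives_per_cable]


# All 6 backplanes (4 bays each) fed from a single PSU cable via splitters.
one_cable = amps_per_cable([24])

# One backplane (4 drives) per PSU cable.
spread = amps_per_cable([4] * 6)

print(one_cable, sum(one_cable))
print(spread, sum(spread))
```

Under these assumptions the total is the same in both layouts and sits well below the 83A quoted for the rail, which fits the comment that the PSU itself is plenty; what differs is how much current a single cable and its connectors have to carry at spin-up.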