Harro, posted September 9, 2019 (edited):
Ever since changing to the LSI 9300-16i controller I have been plagued with disks dropping offline. The disks seem fine according to their SMART reports. I have ordered new mini-SAS cables, thinking they are the only item, besides a bad controller, common to all the drives that go offline. I have also ordered new drives in case these are deemed bad. I was running an extended SMART test on disk 14 when everything went tits up. In the last week I have rebuilt disks 1 and 14 four times, only to have them go offline in a day or so. Does anyone see anything else in my diagnostics? What steps should I take next?
Update: Replaced the cables and all has been working fine.
Attachment: tower-diagnostics-20190909-1350.zip
Edited September 13, 2019 by Harro (reason: Solved)
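When SMART reports look clean but cabling is the suspect, one counter worth reading is the interface CRC error count. Many SATA drives report it as attribute 199 (UDMA_CRC_Error_Count), and it tends to climb with bad cables rather than failing media. Below is a minimal Python sketch, assuming the text comes from `smartctl -A /dev/sdX`; the sample output is made up for illustration and is not taken from the attached diagnostics, and the helper name `raw_attribute` is hypothetical.

```python
import re

# Hypothetical excerpt in the usual `smartctl -A` attribute-table layout.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       37
"""


def raw_attribute(smart_text, name):
    """Return the raw value of a named SMART attribute, or None if absent."""
    for line in smart_text.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns: ID, name, flag, ..., raw value.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] == name:
            match = re.match(r"\d+", fields[9])
            return int(match.group()) if match else None
    return None


crc_errors = raw_attribute(SAMPLE, "UDMA_CRC_Error_Count")
print("CRC errors:", crc_errors)
```

A raw value that keeps rising between two readings would point toward the link (cables, backplane, connectors) rather than the disk surface, though this is a rule of thumb and not a definitive diagnosis.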
JonathanM, posted September 9, 2019:
50 minutes ago, Harro said: "Ever since changing to the lsi 9300-16i controller I have been plagued with disks dropping offline."
Do you have any suspicion it might be a counterfeit card? Have you verified the serial number with LSI?
Harro (Author), posted September 9, 2019:
I bought the card from ALLHDD.com, so it was not an eBay purchase. It was new in a sealed box, with a vacuum seal around the card. So I doubt it is counterfeit, but I will check the serial number on the LSI website.
JonathanM, posted September 9, 2019:
Any firmware updates available?
Harro (Author), posted September 9, 2019:
I updated the firmware to the latest version after installing the card. I also added the extra power to the card, as described in this thread.
Harro (Author), posted September 9, 2019:
Is it advisable to shut the server down and reseat all the cables? I now have 3 disks showing up in Unassigned Devices, so I can rebuild 2 of them, and I will copy the data off the 3rd, which is not a whole lot, maybe 4TB. Would I shrink the array, take the 3rd drive out, and let the 2 other drives rebuild?
Vr2Io, posted September 9, 2019 (edited):
What was the reason for changing to the 9300-16i, and which disk controller did you use before? Do you have one 9300? (lspci shows two, but I assume there should be one.)
Forget it, the 9300-16i has a PLX chip and two SAS3008 controllers.
Edited September 9, 2019 by Benson
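Seeing two entries in `lspci` is consistent with this card presenting two SAS3008 controllers. A rough Python sketch for counting them from `lspci` text follows. The sample lines, bus addresses, and device wording are assumptions for illustration, not copied from the diagnostics, and the helper name `sas_controllers` is hypothetical.

```python
# Illustrative `lspci` lines; addresses and device strings are assumptions.
SAMPLE_LSPCI = """\
00:02.0 VGA compatible controller: Example Vendor Example Device
03:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)
05:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)
"""


def sas_controllers(lspci_text, chip="SAS3008"):
    """Return the PCI addresses of lines that mention the given chip name."""
    addresses = []
    for line in lspci_text.splitlines():
        if chip in line:
            # The first whitespace-separated field is the bus address.
            addresses.append(line.split()[0])
    return addresses


found = sas_controllers(SAMPLE_LSPCI)
print(f"{len(found)} controller(s) found: {found}")
```

On a real system the input would be the output of `lspci`; two addresses from a single 9300-16i would match what the poster describes.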
Harro (Author), posted September 9, 2019 (edited):
6 minutes ago, Benson said: "The reason for change to 9300-16i, what disk controller use before ? You have one 9300 ? ( lspci show have two, but I assume should be one )"
I had 2 HP220 controllers, each handling 8 drives. I went with a single 16i so I could replace the 2 and use the PCIe 3.0 x16 to gain parity check speed. The 16i is actually 2 controllers on one card. At least that is what it showed when I flashed the newer firmware.
Edited September 9, 2019 by Harro
Vr2Io, posted September 9, 2019:
Did all the disks drop, or only half? Maybe try using only 8 ports of the 9300 and reinsert the HP220 for the other disks.
Harro (Author), posted September 9, 2019:
I am running 12 disks off the 9300, and the other disks are running on the onboard SATA. I was planning on inserting my old card again and seeing what happens.
Harro (Author), posted September 9, 2019:
1 hour ago, Benson said: "reinsert HP220 for another disks."
I reinserted the HP220, put 3 disks on that card, and am now rebuilding disk 1. Disk 1 was disabled before the restart, so I was expecting to rebuild. All other disks are online and looking OK.
Vr2Io, posted September 9, 2019 (edited):
36 minutes ago, Harro said: "Reinserted the HP220 and put 3 disks on that card and am now rebuilding disk 1. Disk 1 was reballed before the restart so I was expecting to rebuild. All other disks are online with are looking ok."
I suggest not using the ports that dropped disks (ideally, use only one of the controllers on the 9300-16i). This is just to test whether one controller has a problem.
After the rebuild, I would put the array in maintenance mode and run a parity check (no correction). This just puts load on the controller; if a disk shows errors or drops again, you won't need to rebuild any disk. Once the load test passes, start the array as usual and monitor whether everything is normal.
Edited September 9, 2019 by Benson
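The load test suggested here amounts to putting the controller under sustained read load and watching whether any disk disappears. One rough way to watch for that outside the web UI is to snapshot the device list before the check and again while it runs, then compare the two. This is a Python sketch under the assumption of a Linux host where `/dev/disk/by-id` lists attached disks; the helper names `list_disks` and `dropped` are hypothetical.

```python
import os


def list_disks(by_id_dir="/dev/disk/by-id"):
    """Return the set of whole-disk entries (partition links are skipped)."""
    try:
        names = os.listdir(by_id_dir)
    except FileNotFoundError:
        # Directory absent (e.g. not a Linux host): report no disks.
        return set()
    return {name for name in names if "-part" not in name}


def dropped(before, after):
    """Disks present in the first snapshot but missing from the second."""
    return sorted(set(before) - set(after))


if __name__ == "__main__":
    baseline = list_disks()
    # ... start the non-correcting parity check, wait, then re-sample ...
    current = list_disks()
    print("Dropped since baseline:", dropped(baseline, current))
```

Comparing snapshots only shows that a disk vanished, not why; the system log would still be needed to tell a controller reset from a cable fault.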
Harro (Author), posted September 9, 2019 (edited):
13 minutes ago, Benson said: "Suggest not use the port which have drop disk ( best could use one of the controller on 9300-16i ), this just test does one controller have problem or not."
I do have 1 disk on that side of the 9300-16i. That disk has not shown any signs of problems, and the other 3 disks are on the HP220. I have a new set of mini-SAS cables coming, which I suspect might be the problem. At least I am hoping it is the cables instead of the card.
The question that remains: do I format disk 1, which is rebuilding, since it is showing no file system, or do I let it rebuild?
Edited September 9, 2019 by Harro (reason: added txt)
Vr2Io, posted September 9, 2019:
1 minute ago, Harro said: "Atleast I am hoping instead on the card."
Sure. HBA problems are quite troublesome. Two controllers on one card also run too hot.
Vr2Io, posted September 9, 2019 (edited):
11 minutes ago, Harro said: "Question now remains is do I format the disk (1) that is rebuilding since it is showing no file system or let it rebuild?"
Too bad. If the emulated disk shows as unmountable, even a rebuild won't fix this. Wait for an expert to jump in. (Don't format the disk.)
Edited September 9, 2019 by Benson
Harro (Author), posted September 9, 2019:
2 minutes ago, Benson said: "Too bad, if emulate disk show unmountable, even rebuild won't fix this"
This is why I am leaning toward cancelling the Parity-Sync/Data-Rebuild, formatting, and then rebuilding.
Vr2Io, posted September 9, 2019:
You can format disk 1 if there is no data on it; otherwise, don't format it.
Harro (Author), posted September 9, 2019 (edited):
I stopped the Parity-Sync/Data-Rebuild, went into maintenance mode, and am checking the disk using the XFS repair.
Update: The XFS repair is done, the disk mounted once again, and it is rebuilding now.
Edited September 9, 2019 by Harro
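A file system check like this is commonly run as a dry run first and as a real repair only afterwards. The Python sketch below only builds the two command lines and does not execute them. `-n` (no-modify) and `-v` (verbose) are standard `xfs_repair` flags; the device path `/dev/md1` for disk 1 is an assumption about this Unraid setup and should be confirmed before running anything. The helper name `xfs_repair_command` is hypothetical.

```python
import shlex


def xfs_repair_command(device, dry_run=True):
    """Build an xfs_repair command line; dry_run adds -n (report only)."""
    cmd = ["xfs_repair", "-v"]
    if dry_run:
        cmd.append("-n")
    cmd.append(device)
    return cmd


DEVICE = "/dev/md1"  # assumed path for disk 1; verify on your own system

# First pass: report problems without modifying the disk.
print(shlex.join(xfs_repair_command(DEVICE)))
# Second pass: the actual repair, run with the array in maintenance mode.
print(shlex.join(xfs_repair_command(DEVICE, dry_run=False)))
```

Building the commands separately from running them keeps the dry run as the default, so the modifying pass has to be requested explicitly.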