Cessquill Posted September 19, 2023 (edited)

Hi - ongoing issue that I'm trying to diagnose. Once every week or so a drive will randomly drop off the array. Not the same drive - appears to be random (from what I can tell). See first diagnostics taken just after it happened this morning.

When it does happen, I can stop VM and all dockers, but I can never stop the docker or VM services (pizza wheels for a long time). Following that, I can not stop the array with the GUI. See second diagnostics taken automatically because I have to give it an unclean shutdown over IPMI.

Points to note
- All data drives (except Parity & SSDs) are connected to a SAS backplane
- It seems to be an Ironwolf drive that drops off (although I am aware of the LSI issues with them and have set up the drives as per recommendations)
- Every time it happens I can not stop the array

I have no idea what it could be with this, and would really love to solve it - it seems to be running a drive rebuild a lot of the time

Edited April 8 by Cessquill: Removed diagnostics file for safety
JorgeB Posted September 19, 2023

It's happening on disk spin up, so it might be the LSI + Ironwolf issue. You can temporarily disable spin down to try and confirm.
Cessquill (Author) Posted September 19, 2023

Thanks, I've set the spin down delay to never. I'm hoping it's not a further LSI/Ironwolf issue, since all drives have the fixes (and both the drive that dropped off this time and disk 19 previously are ST8000VN0022, a model that didn't initially suffer from that). That said, if it's not that, I'm stuck. The PSU should cope with all drives turning on.
Cessquill (Author) Posted September 20, 2023

As a thought, I was searching yesterday and discovered that the onboard LSI 2308 was running old firmware in IR mode...

    LSI Corporation SAS2 Flash Utility
    Version 12.00.00.00 (2011.11.08)
    Copyright (c) 2008-2011 LSI Corporation. All rights reserved

    Adapter Selected is a LSI SAS: SAS2308_1(Rev 5)

    Controller Number              : 0
    Controller                     : SAS2308_1(Rev 5)
    PCI Address                    : 00:02:00:00
    SAS Address                    : 5003048-0-11b9-9100
    NVDATA Version (Default)       : 0f.00.00.12
    NVDATA Version (Persistent)    : 0f.00.00.12
    Firmware Product ID            : 0x2714
    Firmware Version               : 15.00.00.00
    NVDATA Vendor                  : LSI
    NVDATA Product ID              : SMC2308-IR
    BIOS Version                   : 07.29.00.00
    UEFI BSD Version               : N/A
    FCODE Version                  : N/A
    Board Name                     : SMC2308-IR
    Board Assembly                 : N/A
    Board Tracer Number            : N/A

    Finished Processing Commands Successfully.
    Exiting SAS2Flash.

The last version I could find was 20.00.07.00. Regardless of whether it fixes my issues, would it be recommended to upgrade to the latest version and switch over to IT mode? The motherboard preceded switching to a SAS backplane, and since it worked I didn't give it a second thought.

@JorgeB - in researching I also found your excellent post about flashing firmware, thank you
JorgeB (Solution) Posted September 20, 2023

Cessquill said: would it be recommended to upgrade to the latest version and switch over to IT mode?

Yep
Cessquill (Author) Posted October 9, 2023

Well, after flashing the onboard LSI 2308 with the latest IT firmware (it was old and IR), disabling EPC & Low Current Spinup on all Ironwolf drives (regardless of whether they were the models affected), and waiting, I've had no more drive issues. Everything spins up and down normally again.

Thank you for your time and advice.
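
For anyone wanting to run the same check on their own controller, here is a rough Python sketch that reads a saved SAS2Flash listing like the one earlier in this thread and flags the two things that turned out to matter here: firmware older than 20.00.07.00 and an IR product ID. The function names are made up for illustration, 20.00.07.00 is only "the last version I could find" from the thread, and treating an `-IR` suffix on the product ID as a sign of IR firmware is an assumption based on this one board's output, not a documented rule.

```python
"""Sketch: flag old or IR-mode firmware in a saved SAS2Flash listing."""

# Trimmed copy of the listing posted in this thread.
SAMPLE_LISTING = """\
Controller Number              : 0
Controller                     : SAS2308_1(Rev 5)
PCI Address                    : 00:02:00:00
Firmware Version               : 15.00.00.00
NVDATA Product ID              : SMC2308-IR
Board Name                     : SMC2308-IR
"""

# Assumption: the newest release the poster could find, not an official constant.
LATEST_KNOWN = "20.00.07.00"


def parse_listing(text):
    """Turn 'Key : Value' lines into a dict, ignoring everything else."""
    fields = {}
    for line in text.splitlines():
        # Split on the first ' : ' only, so values such as the PCI
        # address (which contain colons) stay intact.
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def version_tuple(version):
    """Convert '15.00.00.00' to (15, 0, 0, 0) for numeric comparison."""
    return tuple(int(part) for part in version.split("."))


def findings(fields, latest=LATEST_KNOWN):
    """Return human-readable notes about anything worth a second look."""
    notes = []
    firmware = fields.get("Firmware Version")
    if firmware and version_tuple(firmware) < version_tuple(latest):
        notes.append(f"firmware {firmware} is older than {latest}")
    # Assumption: an '-IR' suffix on the product ID indicates IR firmware.
    if fields.get("NVDATA Product ID", "").upper().endswith("-IR"):
        notes.append("product ID suggests IR mode rather than IT")
    return notes


if __name__ == "__main__":
    for note in findings(parse_listing(SAMPLE_LISTING)):
        print(note)
```

This only inspects text; it does not talk to the controller or flash anything. The actual firmware update still needs the proper flashing procedure for your board.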