Random Drives Dropping from Array


Cessquill
Solved by JorgeB


Hi - ongoing issue that I'm trying to diagnose.

 

Once every week or so, a drive drops off the array.  It is not the same drive each time; as far as I can tell, which one drops is random.  See the first diagnostics, taken just after it happened this morning.

 

When it happens, I can stop the VM and all Docker containers, but I can never stop the Docker or VM services themselves (the spinner just runs for a long time).  After that, I cannot stop the array from the GUI.  See the second diagnostics, taken automatically because I had to force an unclean shutdown over IPMI.

 

Points to note

  • All array drives except parity and the SSDs are connected to a SAS backplane
  • It seems to be an Ironwolf drive that drops off (I am aware of the LSI issues with them and have already set up the drives as per the recommendations)
  • Every time it happens, I cannot stop the array

 

I have no idea what could be causing this and would really love to solve it.  The server seems to be running a drive rebuild a lot of the time.

 

Edited by Cessquill
Removed diagnostics file for safety
7 minutes ago, JorgeB said:

It's happening on disk spin up, so it might be the LSI + Ironwolf issue, you can temporarily disable spin down to try and confirm.

Thanks, I've set the spin down delay to never.

 

I'm hoping it's not another LSI/Ironwolf issue, since all drives have the fixes applied.  Both the drive that dropped off this time and disk 19, which dropped previously, are ST8000VN0022 models, which weren't originally affected by that issue.  That said, if it's not that, I'm stuck.  The PSU should cope with all drives spinning up at once.


As a thought, I was searching yesterday and discovered that the onboard LSI 2308 was running old firmware in IR mode...

 

LSI Corporation SAS2 Flash Utility
Version 12.00.00.00 (2011.11.08) 
Copyright (c) 2008-2011 LSI Corporation. All rights reserved 

        Adapter Selected is a LSI SAS: SAS2308_1(Rev 5)

        Controller Number              : 0
        Controller                     : SAS2308_1(Rev 5)
        PCI Address                    : 00:02:00:00
        SAS Address                    : 5003048-0-11b9-9100
        NVDATA Version (Default)       : 0f.00.00.12
        NVDATA Version (Persistent)    : 0f.00.00.12
        Firmware Product ID            : 0x2714 
        Firmware Version               : 15.00.00.00
        NVDATA Vendor                  : LSI
        NVDATA Product ID              : SMC2308-IR
        BIOS Version                   : 07.29.00.00
        UEFI BSD Version               : N/A
        FCODE Version                  : N/A
        Board Name                     : SMC2308-IR
        Board Assembly                 : N/A
        Board Tracer Number            : N/A

        Finished Processing Commands Successfully.
        Exiting SAS2Flash.

 

The latest version I could find is 20.00.07.00.  Regardless of whether it fixes my issues, would it be recommended to upgrade to the latest version and switch over to IT mode?  The motherboard predates my switch to a SAS backplane, and since it worked I didn't give the firmware a second thought.
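
For anyone wanting to script the same check, here is a minimal sketch in Python that reads the listing above and flags old firmware or IR mode. Assumptions on my part: the helper names (`parse_sas2flash_list`, `summarize`) are hypothetical, the text is the utility's listing output as pasted above, and the `-IR`/`-IT` suffix of the NVDATA Product ID is treated as the mode indicator, which matches this board but may not hold for every vendor.

```python
LATEST_KNOWN = "20.00.07.00"  # latest firmware version mentioned in this post

# A few lines copied from the listing above, used as sample input.
SAMPLE = """
        Controller                     : SAS2308_1(Rev 5)
        PCI Address                    : 00:02:00:00
        Firmware Version               : 15.00.00.00
        NVDATA Product ID              : SMC2308-IR
        Board Name                     : SMC2308-IR
"""


def parse_sas2flash_list(text):
    """Collect the 'Key : Value' lines of the listing into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first " : " only, so values that contain colons
        # (such as the PCI address) are kept whole.
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def version_tuple(version):
    """Turn '15.00.00.00' into (15, 0, 0, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def summarize(text, latest=LATEST_KNOWN):
    """Report the firmware version, whether it is older than `latest`, and the mode."""
    fields = parse_sas2flash_list(text)
    firmware = fields["Firmware Version"]
    product = fields["NVDATA Product ID"]
    if product.endswith("-IR"):
        mode = "IR"
    elif product.endswith("-IT"):
        mode = "IT"
    else:
        mode = "unknown"  # assumption: the suffix convention does not apply
    return {
        "firmware": firmware,
        "outdated": version_tuple(firmware) < version_tuple(latest),
        "mode": mode,
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

This only inspects saved text and does not touch the controller, so it is safe to run anywhere; the flashing itself still has to be done by hand.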

 

@JorgeB - in researching I also found your excellent post about flashing firmware, thank you

  • 3 weeks later...

Well, after flashing the onboard LSI 2308 with the latest IT firmware (it was old and in IR mode), disabling EPC and Low Current Spinup on all Ironwolf drives (regardless of whether they were the affected models), and waiting, I've had no more drive issues.  Everything spins up and down normally again.

 

Thank you for your time and advice.
