• [6.9.0][6.9.1] Disks on second HBA card not listed


    Dark-Raptor
    • Urgent

    After updating to 6.9.0, none of the drives on my second LSI SAS2308 show up. They still appear in the BIOS on the controller, and after downgrading back to 6.8.3 they all showed up again.

     

    I have done several reboots and cold starts, and the same thing happens every time.

     

    [6.9.0]

    iommu.PNG

    asgard-diagnostics-20210304-0032.zip

     

    [6.9.1]

     

    Just updated to 6.9.1 and I have the same issue. The mpt3sas max_queue_depth=10000 option was already set.

     

     

    [6.9.1].PNG

    [6.9.1]asgard-diagnostics-20210322-0405.zip




    User Feedback

    Recommended Comments

    There was a similar issue in the betas, but this fix should already be in 6.9.0. 

     

    Is your card using the mpt3sas driver? The changes vs. 6.9.0-beta29 include:

     

    Added workaround for mpt3sas not recognizing devices with certain LSI chipsets. We created this file:

    /etc/modprobe.d/mpt3sas-workaround.conf

    which contains this line:

    options mpt3sas max_queue_depth=10000

    When the mpt3sas module is loaded at boot, that option will be specified.  If you add "mpt3sas.max_queue_depth=10000" to syslinux kernel append line, you can remove it.  Likewise, if you manually load the module via 'go' file, can also remove it.  When/if the mpt3sas maintainer fixes the core issue in the driver we'll get rid of this workaround.
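A minimal sketch for confirming that the workaround described above is actually present. It assumes the option sits in a .conf file under /etc/modprobe.d/ as the release note says. The function name configured_queue_depth is hypothetical, and the sysfs path for the loaded module's parameter is an assumption that may not exist on every kernel build.

```shell
#!/bin/sh
# Print the max_queue_depth value configured for mpt3sas in a modprobe.d
# directory (hypothetical helper; defaults to /etc/modprobe.d).
configured_queue_depth() {
    dir="${1:-/etc/modprobe.d}"
    # Match lines such as: options mpt3sas max_queue_depth=10000
    cat "$dir"/*.conf 2>/dev/null |
        sed -n 's/^[[:space:]]*options[[:space:]]\{1,\}mpt3sas[[:space:]].*max_queue_depth=\([0-9]\{1,\}\).*/\1/p' |
        tail -n 1
}

echo "configured: $(configured_queue_depth)"

# Assumption: the loaded module exposes the parameter read-only in sysfs.
param=/sys/module/mpt3sas/parameters/max_queue_depth
if [ -r "$param" ]; then
    echo "in effect:  $(cat "$param")"
fi
```

If the first line prints an empty value, no matching options line was found in that directory; the option may still have been passed another way, such as on the kernel append line.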

     

    You are using the mpt3sas driver.

     

    01:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 [1000:0087] (rev 05)
        Subsystem: Broadcom / LSI 9207-8i SAS2.1 HBA [1000:3020]
        Kernel driver in use: mpt3sas
        Kernel modules: mpt3sas
    02:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 [1000:0087] (rev 05)
        Subsystem: Broadcom / LSI 9207-8i SAS2.1 HBA [1000:3020]
        Kernel driver in use: mpt3sas
        Kernel modules: mpt3sas
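A listing like the one above can be reduced to one line per SAS controller, which makes it easier to see which driver each card is bound to. This is only a sketch: sas_drivers is a hypothetical helper that filters the text output of lspci -nnk (-nn prints vendor and device IDs, -k prints the kernel driver), and it assumes the "Kernel driver in use:" wording shown in this thread.

```shell
#!/bin/sh
# Read "lspci -nnk" style output on stdin and print "<pci-address> <driver>"
# for every Serial Attached SCSI controller (hypothetical helper).
sas_drivers() {
    awk '
        # A device header line starts with the PCI address.
        /^[0-9a-f]/ {
            addr = ""
            if ($0 ~ /Serial Attached SCSI controller/) addr = $1
        }
        # Indented detail line that belongs to the current device.
        /Kernel driver in use:/ && addr != "" { print addr, $NF }
    '
}

# Only run against the live system when lspci is available.
if command -v lspci >/dev/null 2>&1; then
    lspci -nnk | sas_drivers
fi
```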

     

    Mar  4 00:08:33 Asgard kernel: mpt2sas_cm0: LSISAS2308: FWVersion(20.00.07.00), ChipRevision(0x05), BiosVersion(07.39.02.00)
    Mar  4 00:08:33 Asgard kernel: mpt2sas_cm0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)

     

    Below is the fix that is in 6.9.0:

    root@Tower:/etc/modprobe.d# cat mpt3sas.conf 
    options mpt3sas max_queue_depth=10000

     

     

    @JorgeB Do you think the queue depth may need to be doubled, since there are 2 identical cards?

    Edited by SimonF
    1 hour ago, SimonF said:

    Do you think the queue depth may need to be doubled, since there are 2 identical cards?

    That issue didn't affect the SAS2308 chip, and the other card is working, which is rather strange.


    I don't see how it can't be an Unraid issue: everything works when I revert to 6.8.3, and the drives also appear in the BIOS, so it's the OS not seeing them.

    In the log there is this line: "mpt2sas_cm0: port enable: FAILED with timeout (timeout=300s)". I think this could be the issue, but I don't know how to fix it.

     

    Edited by Dark-Raptor

    It could be an issue with the LSI driver included in the newer kernel. Did you try swapping the controllers?


    I have not swapped the cards, as I have reverted back to 6.8.3 and don't really want to leave the server offline for long periods of time. I can do this if needed, but I don't think it will add anything, and would think it's more related to why the OS is taking 5+ min to load compared to the 2ish min from 6.8.3.

    55 minutes ago, Dark-Raptor said:

    and would think it's more related to why the OS is taking 5+ min to load compared to the 2ish min from 6.8.3

    That's just a consequence of the issue. The HBA is timing out during boot:

     

    Mar 22 04:02:24 Asgard kernel: mpt2sas_cm0: port enable: FAILED with timeout (timeout=300s)

     

    And the timeout is 5 minutes.

     

    The main reason I asked you to swap them is to see whether it's the same controller that fails to initialize. If it's the other one, it could be more of a general problem with the LSI driver, though many users are running multiple LSI controllers without issues. If it's still the same one, it's likely something specific to that HBA that the new driver doesn't like, and that could be more difficult to get fixed.
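To compare results before and after swapping the cards, it can help to pull out just the controller instances that logged the port-enable timeout. The following is a rough sketch: failed_controllers is a hypothetical helper, the pattern is based only on the log line quoted in this thread, and /var/log/syslog as the log location is an assumption.

```shell
#!/bin/sh
# List the controller instances (e.g. mpt2sas_cm0) that logged a
# "port enable: FAILED" message. Hypothetical helper; reads the given log
# file(s), or standard input when no file is given.
failed_controllers() {
    sed -n 's/.*kernel: \([^:]*\): port enable: FAILED.*/\1/p' "$@" |
        sort -u
}

# Assumption: the live system log is at /var/log/syslog.
if [ -r /var/log/syslog ]; then
    failed_controllers /var/log/syslog
fi
```

The same function can be pointed at a syslog file extracted from a diagnostics archive by passing its path as the argument.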


    Hey, I have the same issue with my Adaptec 6805H HBA.

    It works flawlessly in 6.8.3, but if I update to 6.9.2, all drives connected to it are gone (which is basically my whole array).

     

    The controller is still recognized - see attached file. If I revert back, the drives are there again.

     

    Under /etc/modprobe.d/ I have the mpt3sas.conf, which already contains the max_queue_depth=10000.

    Any other idea? :(

    controller.PNG


    OK, found the issue myself: the driver is not working for it.

    Log says: 

     

    pm80xx0:: pm8001_pci_probe  1107:chip_init failed [ret: -16]

    So I have to change hardware, wait for a new driver, or roll back :/

    Edited by zyv
    9 hours ago, zyv said:

    Under /etc/modprobe.d/ I have the mpt3sas.conf, which already contains the max_queue_depth=10000.

    That's for LSI only.

     

    9 hours ago, zyv said:

    pm80xx0:: pm8001_pci_probe 1107:chip_init failed [ret: -16]

    There's a known issue with those controllers and the newer driver:

    https://forums.unraid.net/bug-reports/stable-releases/690-691-netapp-pmc-sierra-pm8003-scc-4-port-qsfp-pcie-x8-controller-didnt-find-the-hdds-r1300/?do=getNewComment&d=2&id=1300


    The issue is a bug in the pm80xx release that causes the pm8003 cards to not initialize (pm8001). The fix is a really tiny patch in the 5.10.30 kernel, but 6.9.2 is running 5.10.28. As noted in the post above, I compiled it, and anyone can download it to use on their 6.9.2 release:

    https://forums.unraid.net/bug-reports/stable-releases/690-691-692-netapp-pmc-sierra-pm8003-rev5-and-adaptec-6805h-hba-controller-didnt-find-the-hdds-r1300/page/2/?tab=comments#comment-14699
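A quick way to tell whether a running system is older than the kernel release mentioned above is to compare version strings. This is a sketch under assumptions: kernel_at_least is a hypothetical helper, the 5.10.30 threshold is taken from the post rather than verified against the kernel changelog, and the comparison relies on the version sort (sort -V) provided by GNU coreutils.

```shell
#!/bin/sh
# Return success when the given kernel version is at least the minimum
# version (default 5.10.30, the release the post says carries the patch).
# Hypothetical helper; relies on "sort -V" for version ordering.
kernel_at_least() {
    have="${1%%-*}"          # drop any suffix after the first "-"
    want="${2:-5.10.30}"
    lowest=$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)
    [ "$lowest" = "$want" ]
}

if kernel_at_least "$(uname -r)"; then
    echo "kernel $(uname -r): at or above 5.10.30"
else
    echo "kernel $(uname -r): older than 5.10.30"
fi
```

A result of "older than 5.10.30" only says the stock kernel predates that release; it does not detect a separately patched module.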





  • Status Definitions

    Open = Under consideration.
    Solved = The issue has been resolved.
    Solved version = The issue has been resolved in the indicated release version.
    Closed = Feedback or opinion better posted on our forum for discussion. Also for reports we cannot reproduce or need more information. In this case just add a comment and we will review it again.
    Retest = Please retest in latest release.

  • Priority Definitions

    Minor = Something not working correctly.
    Urgent = Server crash, data loss, or other showstopper.
    Annoyance = Doesn't affect functionality but should be fixed.
    Other = Announcement or other non-issue.