Upgrading HBA worth it?


GoChris

Wondering if upgrading my HBA SAS 6G cards will net me more speed in rebuilds and parity checks.

 

Case: Norco 4224, 6 backplanes, I'm told they are SAS 3 capable. Drives on all backplanes.

Drives: All 7200rpm

Current HBAs: Two 2-port SAS2 cards.

     Card 1: both ports -> Intel expander -> 4 ports to backplanes

     Card 2: both ports to remaining 2 backplanes

 

Will upgrading to a SAS3 12Gbps card actually increase the HDD speeds at all?

 

Cheers

Link to comment
6 hours ago, JorgeB said:

Which Intel expander? Also what model disks?

 

Intel RES2SV240 expander

 

Drives are a combination of:

HGST Deskstar NAS

Seagate Ironwolf (NAS and Pros)

And just some Seagate NAS and shucked desktop drives, might have a couple SMRs in there.

Not deployed yet, intended to replace some 4TB drives: WD Gold, Toshiba X300s (desktop performance, they say)

 

Edit: I do have a 2nd expander (same model) that's unused, unsure if it would help in any way.

Edited by GoChris
Link to comment
18 minutes ago, JorgeB said:

Those are PCIe 2.0, a PCIe 3.0 model connected to the expander with dual link would increase total bandwidth from the currently usable 3GB/s to 4.8GB/s, assuming of course the board/CPU supports PCIe 3.0.

Yes, the board does, thanks for all the info so far.

 

So in theory, if all 16 bays on the backplanes behind the expander are populated, the current 3GB/s would give a max of 192MB/s per drive? If so, it doesn't seem like an upgrade would net much at current HDD speeds.
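For anyone who wants to redo that division, here is a minimal sketch. It assumes the bandwidth is shared evenly by all 16 drives and uses the two totals quoted above (3GB/s now, 4.8GB/s with a PCIe 3.0 HBA on dual link). The function name is made up for illustration. In decimal units 3GB/s over 16 drives comes to 187.5MB/s; the 192MB/s figure is what you get when counting 3GB as 3072MB.

```python
def per_drive_mb_s(total_gb_s: float, drives: int) -> float:
    """Even share of the usable bandwidth per drive, in MB/s (1GB = 1000MB)."""
    return total_gb_s * 1000 / drives


# Totals quoted in the thread; 16 drives sit behind the expander.
for label, total_gb_s in (
    ("PCIe 2.0 HBA, ~3GB/s usable", 3.0),
    ("PCIe 3.0 HBA on dual link, ~4.8GB/s", 4.8),
):
    print(f"{label}: {per_drive_mb_s(total_gb_s, 16):.1f} MB/s per drive")
```

Either way the conclusion is the same: the per-drive share only matters if the drives themselves can sustain more than that.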

Link to comment
5 minutes ago, JorgeB said:

If you are getting lower speeds than that during rebuilds/parity checks, the HBAs are not the problem.

I am, yes. Alright, I wonder what it could be then. I rebuilt a 10TB drive a few days ago at 116.5MB/s. Good to know.

CPU is an AMD Ryzen 7 3800X

Edited by GoChris
Link to comment

Check the HBAs' link speed and width, you can do that with

lspci -d 1000: -vv
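The part of that output to look at is the LnkSta line (negotiated speed and width), not LnkCap (what the card is capable of). A rough sketch of pulling those two values out of saved lspci text and turning them into an approximate bandwidth figure follows. The function name and the per-lane table are my own assumptions for illustration, based on the standard PCIe line encodings; real usable throughput is lower once protocol overhead is counted.

```python
import re

# Approximate per-lane throughput in MB/s after line encoding
# (8b/10b for 2.5 and 5 GT/s, 128b/130b for 8 GT/s).
# Assumption for illustration; protocol overhead lowers the usable figure.
PER_LANE_MB_S = {"2.5": 250.0, "5": 500.0, "8": 984.6}

# Matches "LnkSta: Speed 5GT/s (ok), Width x8 (ok)" but not "LnkSta2: ...".
LNKSTA = re.compile(r"LnkSta:\s+Speed\s+([\d.]+)GT/s.*?Width\s+x(\d+)")


def negotiated_links(lspci_text):
    """Return (speed in GT/s, lane count, approx. MB/s) for each LnkSta line."""
    links = []
    for speed, width in LNKSTA.findall(lspci_text):
        per_lane = PER_LANE_MB_S.get(speed)
        total = per_lane * int(width) if per_lane is not None else None
        links.append((speed, int(width), total))
    return links


sample = "LnkSta: Speed 5GT/s (ok), Width x8 (ok)"
print(negotiated_links(sample))  # [('5', 8, 4000.0)]
```

A PCIe 2.0 x8 link therefore tops out around 4GB/s on paper, which is consistent with the roughly 3GB/s usable figure mentioned earlier in the thread.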

 

Also check that the expander is using dual link

 

cat /sys/class/sas_host/host1/device/port-1\:0/sas_port/port-1\:0/num_phys

 

if it's not host1 you need to adjust the command, e.g. for host7 it would be

 

cat /sys/class/sas_host/host7/device/port-7\:0/sas_port/port-7\:0/num_phys

 

Output of 8 means dual link, 4 means single link. You can also post the diags if you want me to check the correct host for you.
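If you don't know which host number to use, something like the following could list every SAS port at once instead of guessing. This is only a sketch: it assumes the same sysfs layout as the commands above, and the function names are hypothetical.

```python
from pathlib import Path


def sas_port_widths(sysfs_root: str = "/sys/class/sas_host") -> dict[str, int]:
    """Map each SAS port name (e.g. 'port-1:0') to its num_phys value."""
    widths = {}
    pattern = "host*/device/port-*/sas_port/port-*/num_phys"
    for num_phys_file in sorted(Path(sysfs_root).glob(pattern)):
        port_name = num_phys_file.parent.name
        widths[port_name] = int(num_phys_file.read_text().strip())
    return widths


def describe(num_phys: int) -> str:
    """Interpret num_phys the way the post above does for an expander port."""
    return {8: "dual link", 4: "single link"}.get(num_phys, f"{num_phys} phy(s)")


if __name__ == "__main__":
    for port, num_phys in sas_port_widths().items():
        print(f"{port}: num_phys={num_phys} ({describe(num_phys)})")
```

The 8/4 interpretation only applies to the port that goes to the expander; other ports will simply be reported with their phy count.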

Link to comment
Quote

Check the HBAs' link speed and width, you can do that with

09:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
        Subsystem: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon]
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 81
        IOMMU group: 17
        Region 0: I/O ports at e000 [size=256]
        Region 1: Memory at d04c0000 (64-bit, non-prefetchable) [size=16K]
        Region 3: Memory at d0080000 (64-bit, non-prefetchable) [size=256K]
        Expansion ROM at d0000000 [disabled] [size=512K]
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [68] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
                DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
                        MaxPayload 512 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s <64ns
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 5GT/s (ok), Width x8 (ok)
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range BC, TimeoutDis+ NROPrPrP- LTR-
                         10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS- TPHComp- ExtTPHComp-
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
                         AtomicOpsCtl: ReqEn-
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
                         EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: unsupported
        Capabilities: [d0] Vital Product Data
pcilib: sysfs_read_vpd: read failed: No such device
                Not readable
        Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [c0] MSI-X: Enable+ Count=15 Masked-
                Vector table: BAR=1 offset=00002000
                PBA: BAR=1 offset=00003800
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Capabilities: [138 v1] Power Budgeting <?>
        Capabilities: [150 v1] Single Root I/O Virtualization (SR-IOV)
                IOVCap: Migration-, Interrupt Message Number: 000
                IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy-
                IOVSta: Migration-
                Initial VFs: 16, Total VFs: 16, Number of VFs: 0, Function Dependency Link: 00
                VF offset: 1, stride: 1, Device ID: 0072
                Supported Page Size: 00000553, System Page Size: 00000001
                Region 0: Memory at 00000000d04c4000 (64-bit, non-prefetchable)
                Region 2: Memory at 00000000d00c0000 (64-bit, non-prefetchable)
                VF Migration: offset: 00000000, BIR: 0
        Capabilities: [190 v1] Alternative Routing-ID Interpretation (ARI)
                ARICap: MFVC- ACS-, Next Function: 0
                ARICtl: MFVC- ACS-, Function Group: 0
        Kernel driver in use: mpt3sas
        Kernel modules: mpt3sas

0a:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
        Subsystem: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon]
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 89
        IOMMU group: 18
        Region 0: I/O ports at d000 [size=256]
        Region 1: Memory at d0ac0000 (64-bit, non-prefetchable) [size=16K]
        Region 3: Memory at d0680000 (64-bit, non-prefetchable) [size=256K]
        Expansion ROM at d0600000 [disabled] [size=512K]
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [68] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
                DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
                        MaxPayload 512 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s <64ns
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 5GT/s (ok), Width x8 (ok)
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range BC, TimeoutDis+ NROPrPrP- LTR-
                         10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS- TPHComp- ExtTPHComp-
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
                         AtomicOpsCtl: ReqEn-
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
                         EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: unsupported
        Capabilities: [d0] Vital Product Data
pcilib: sysfs_read_vpd: read failed: No such device
                Not readable
        Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [c0] MSI-X: Enable+ Count=15 Masked-
                Vector table: BAR=1 offset=00002000
                PBA: BAR=1 offset=00003800
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Capabilities: [138 v1] Power Budgeting <?>
        Capabilities: [150 v1] Single Root I/O Virtualization (SR-IOV)
                IOVCap: Migration-, Interrupt Message Number: 000
                IOVCtl: Enable- Migration- Interrupt- MSE- ARIHierarchy-
                IOVSta: Migration-
                Initial VFs: 16, Total VFs: 16, Number of VFs: 0, Function Dependency Link: 00
                VF offset: 1, stride: 1, Device ID: 0072
                Supported Page Size: 00000553, System Page Size: 00000001
                Region 0: Memory at 00000000d0ac4000 (64-bit, non-prefetchable)
                Region 2: Memory at 00000000d06c0000 (64-bit, non-prefetchable)
                VF Migration: offset: 00000000, BIR: 0
        Capabilities: [190 v1] Alternative Routing-ID Interpretation (ARI)
                ARICap: MFVC- ACS-, Next Function: 0
                ARICtl: MFVC- ACS-, Function Group: 0
        Kernel driver in use: mpt3sas
        Kernel modules: mpt3sas

 

Quote

Also check that the expander is using dual link

 

root@Tower:~# cat /sys/class/sas_host/host11/device/port-11\:0/sas_port/port-11\:0/num_phys
8
root@Tower:~# cat /sys/class/sas_host/host12/device/port-12\:0/sas_port/port-12\:0/num_phys
1

 

I have also attached diagnostics. I appreciate your help immensely.

tower-diagnostics-20220829-1140.zip

Link to comment
6 minutes ago, GoChris said:
LnkSta: Speed 5GT/s (ok), Width x8 (ok)

 

7 minutes ago, GoChris said:
LnkSta: Speed 5GT/s (ok), Width x8 (ok)

 

HBA links are OK, expander is also using dual link

 

8 minutes ago, GoChris said:
root@Tower:~# cat /sys/class/sas_host/host11/device/port-11\:0/sas_port/port-11\:0/num_phys
8

 

Something else is the problem. Is 116.5MB/s the average speed or the starting speed? I.e., if you start a parity check, at what speed does it start? Make sure nothing else is using the array.

Link to comment
2 hours ago, JorgeB said:

HBA links are OK, expander is also using dual link

 

Something else is the problem. Is 116.5MB/s the average speed or the starting speed? I.e., if you start a parity check, at what speed does it start? Make sure nothing else is using the array.

That was the avg speed from the history, so it would have been faster at some point. I will stop any other activity, run a test and report back.

Link to comment
2 hours ago, JorgeB said:

Something else is the problem. Is 116.5MB/s the average speed or the starting speed? I.e., if you start a parity check, at what speed does it start? Make sure nothing else is using the array.

 

Alright, in the first 5 minutes it was between ~162 and 172MB/s, and seemed pretty stable in the high 160s once it got going. I know drives slow down toward the end of the disk (the inner tracks), so perhaps some of the older 4TB drives are quite slow? I will do some speed tests on all of them at different positions to see if any one of them is slow. The average still seems low compared to the starting speed. I thought most dockers were stopped during the last rebuild, but I can't be certain now.

Link to comment

I did a scan at 25% intervals using the Disk Speed container. The slowest 4TB drive does indeed slow to ~80MB/s at the 4TB mark, and one 8TB slows to ~88MB/s at its end. So perhaps the slow performance of those drives toward their ends is what's dropping the average down so much.

Link to comment
4 hours ago, GoChris said:

So perhaps the slow performance of those drives toward their ends is what's dropping the average down so much.

Most likely, and it's not a lack-of-bandwidth problem. Bandwidth could be the problem if the check started at, say, 180/190MB/s and was still going at that speed after a couple of hours.
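To illustrate why the average ends up well under the starting speed, here is a toy model. It assumes that at every position the check runs at the speed of the slowest drive that still has data there, and that each drive slows linearly from start to end. Only the ~80MB/s and ~88MB/s end speeds come from this thread; every other number in the drive list is invented for the example, and the function names are hypothetical.

```python
def drive_speed(pos_tb, size_tb, start_mb_s, end_mb_s):
    """Speed at a given position, assuming a linear falloff (a simplification)."""
    return start_mb_s + (end_mb_s - start_mb_s) * (pos_tb / size_tb)


def average_speed(drives, steps_per_tb=1000):
    """Overall MB/s of a pass limited by the slowest active drive at each position.

    drives: list of (size_tb, start_mb_s, end_mb_s) tuples.
    """
    total_tb = max(size for size, _, _ in drives)
    step = 1.0 / steps_per_tb
    elapsed = 0.0  # in units of TB per (MB/s); the units cancel in the result
    for i in range(round(total_tb * steps_per_tb)):
        pos = (i + 0.5) * step
        # Drives smaller than the current position no longer take part.
        speed = min(
            drive_speed(pos, size, start, end)
            for size, start, end in drives
            if size > pos
        )
        elapsed += step / speed
    return total_tb / elapsed


# Hypothetical array: only the 80 and 88 MB/s end speeds are from the thread.
drives = [(4, 165, 80), (8, 190, 88), (10, 220, 110)]
print(f"starts at {min(start for _, start, _ in drives)} MB/s, "
      f"averages {average_speed(drives):.1f} MB/s")
```

A bandwidth bottleneck would look different: the speed would sit flat at the cap for hours instead of starting high and sagging as the small drives reach their slow inner tracks.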

Link to comment
