wulperdinger Posted May 22

Hi guys, I was wondering if there is anything I can do to improve my parity check / data rebuild speed. My parity disk is a shucked WD Elements drive; the other disks are 18TB HGSTs and two 12TB WD Reds. With just the HGST disks, the data rebuild currently runs at ~80 MB/s and won't go any higher. The parity check is even slower, around 70 MB/s max (same speed when the mover (Samsung 990 Pro) is running). They are all connected via an LSI 9300-16i card; a few weeks back, six of them were connected to the mobo via SATA, but the overall speed didn't change.

Would it help if I upgraded the parity disk to an 18TB HGST too? Is there a significant speed difference if all the disks are the same? Is there even room for improvement, or do you guys see similar speeds for parity check / data rebuild?

Cheers
MAM59 Posted May 22

Parity disks should not only be the largest drives in the array, they should also be the fastest, because ALL write (and parity check) operations depend on them. A shucked WD Elements disk is nothing I would allow in my array, and certainly not as a parity drive. My checks run at 270 MB/s (down to 140 MB/s, average 193 MB/s).
wulperdinger Posted May 22 Author

8 minutes ago, MAM59 said: Parity disks should not only be the largest drives in the array, they should also be the fastest, because ALL write (and parity check) operations depend on them.

The WD Elements was just lying around, and I spontaneously decided to use it as a parity drive for more safety besides all my offline backup drives. 270 MB/s is sick compared to my speed.
MAM59 Posted May 22

1 hour ago, wulperdinger said: The WD Elements was just lying around

Let her lie and rest in peace. For "Elements", WD uses its slowest drives. They are meant for backup purposes; WD doesn't even dare to sell them without the Elements enclosure. They don't like being written to very often, and they don't like seeking around. The controller inside the Elements box compensates a bit for this with intensive caching; without the box, the drives are slow as a dog. So being able to "shuck" them is a questionable success. A parity drive needs the exact opposite optimizations.

You MAY use it within the array (but then it will pull down parity check speed quite a bit too). The other drives are slow too, but the parity disk marks the upper speed limit for the whole array: if you have a faster drive, it will be capped by the parity.
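The ceiling MAM59 describes can be checked empirically. Below is a minimal sketch, assuming an Unraid-style Linux host; the device names are placeholders (substitute the ones from your Main tab). The slowest result is roughly the best speed a parity operation can sustain.

```shell
#!/bin/sh
# Rough per-drive sequential read benchmark. Device names are examples.
# Run while the array is otherwise idle for meaningful numbers.
for dev in /dev/sdb /dev/sdc /dev/sdd; do
    [ -b "$dev" ] || { echo "skipping $dev (not present)"; continue; }
    hdparm -t "$dev"   # buffered (non-cached) sequential read timing
done
```

`hdparm -t` only reads from the raw device, so it is safe on a live array, but the numbers are only comparable when nothing else is using the disks.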
JorgeB Posted May 22

Post the output of:

lspci -d 1000: -vv
wulperdinger Posted May 22 Author

2 hours ago, JorgeB said: Post the output of: lspci -d 1000: -vv

Unfortunately, I have no idea what that means or what I'm supposed to do.
JorgeB Posted May 22

Just paste that into the terminal window and post the output here.
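For readers wondering what this command reveals: `lspci -vv` shows, among other things, the PCIe link the HBA actually negotiated. A quick way to extract just that part (assuming the card reports under vendor ID 1000, i.e. Broadcom/LSI, as the 9300-16i does):

```shell
# LnkCap is the link the card supports; LnkSta is what it negotiated.
# For a 9300-16i you want LnkSta to show "Speed 8GT/s, Width x8" --
# anything lower means the slot is throttling the card.
lspci -d 1000: -vv | grep -E 'LnkCap:|LnkSta:'
```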
wulperdinger Posted May 22 Author

1 hour ago, JorgeB said: Just paste that in the terminal window, and post there the output.

lspci -d 1000: -vv

08:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)
    Subsystem: Broadcom / LSI SAS 9300-16i
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 18
    Region 0: I/O ports at 4000
    Region 1: Memory at 42140000 (64-bit, non-prefetchable)
    Region 3: Memory at 42100000 (64-bit, non-prefetchable)
    Expansion ROM at 42000000 [disabled]
    Capabilities: [50] Power Management version 3
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [68] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0W
        DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
            RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
            MaxPayload 256 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend+
        LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM not supported
            ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 8GT/s, Width x8
            TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range BC, TimeoutDis+ NROPrPrP- LTR-
            10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
            EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
            FRS- TPHComp- ExtTPHComp-
            AtomicOpsCap: 32bit- 64bit- 128bitCAS-
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 10BitTagReq- OBFF Disabled,
            AtomicOpsCtl: ReqEn-
        LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
        LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
            Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
            EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
            Retimer- 2Retimers- CrosslinkRes: unsupported
    Capabilities: [a8] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Address: 0000000000000000  Data: 0000
        Masking: 00000000  Pending: 00000000
    Capabilities: [c0] MSI-X: Enable+ Count=96 Masked-
        Vector table: BAR=1 offset=0000e000
        PBA: BAR=1 offset=0000f000
    Capabilities: [100 v2] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
        CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
        AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
            MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
        HeaderLog: 00000000 00000000 00000000 00000000
    Capabilities: [1e0 v1] Secondary PCI Express
        LnkCtl3: LnkEquIntrruptEn- PerformEqu-
        LaneErrStat: 0
    Capabilities: [1c0 v1] Power Budgeting <?>
    Capabilities: [190 v1] Dynamic Power Allocation <?>
    Capabilities: [148 v1] Alternative Routing-ID Interpretation (ARI)
        ARICap: MFVC- ACS-, Next Function: 0
        ARICtl: MFVC- ACS-, Function Group: 0
    Kernel driver in use: mpt3sas
    Kernel modules: mpt3sas

0a:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS3008 PCI-Express Fusion-MPT SAS-3 (rev 02)
    Subsystem: Broadcom / LSI SAS 9300-16i
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 19
    Region 0: I/O ports at 3000
    Region 1: Memory at 42340000 (64-bit, non-prefetchable)
    Region 3: Memory at 42300000 (64-bit, non-prefetchable)
    Expansion ROM at 42200000 [disabled]
    Capabilities: [50] Power Management version 3
        Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [68] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0W
        DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
            RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
            MaxPayload 256 bytes, MaxReadReq 512 bytes
        DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend+
        LnkCap: Port #0, Speed 8GT/s, Width x8, ASPM not supported
            ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta: Speed 8GT/s, Width x8
            TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range BC, TimeoutDis+ NROPrPrP- LTR-
            10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
            EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
            FRS- TPHComp- ExtTPHComp-
            AtomicOpsCap: 32bit- 64bit- 128bitCAS-
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 10BitTagReq- OBFF Disabled,
            AtomicOpsCtl: ReqEn-
        LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
        LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
            Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
        LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
            EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
            Retimer- 2Retimers- CrosslinkRes: unsupported
    Capabilities: [a8] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Address: 0000000000000000  Data: 0000
        Masking: 00000000  Pending: 00000000
    Capabilities: [c0] MSI-X: Enable+ Count=96 Masked-
        Vector table: BAR=1 offset=0000e000
        PBA: BAR=1 offset=0000f000
    Capabilities: [100 v2] Advanced Error Reporting
        UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
        CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
        AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
            MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
        HeaderLog: 00000000 00000000 00000000 00000000
    Capabilities: [1e0 v1] Secondary PCI Express
        LnkCtl3: LnkEquIntrruptEn- PerformEqu-
        LaneErrStat: 0
    Capabilities: [1c0 v1] Power Budgeting <?>
    Capabilities: [190 v1] Dynamic Power Allocation <?>
    Capabilities: [148 v1] Alternative Routing-ID Interpretation (ARI)
        ARICap: MFVC- ACS-, Next Function: 0
        ARICtl: MFVC- ACS-, Function Group: 0
    Kernel driver in use: mpt3sas
    Kernel modules: mpt3sas

What information can you gather from that?
wulperdinger Posted May 22 Author

5 hours ago, MAM59 said: The other drives are slow too

Do you have any recommendations for faster drives with a capacity of 18TB+?
JorgeB Posted May 22

The HBA link is not downgraded. You can run the DiskSpeed container test to see if all the disks are performing normally.
MAM59 Posted May 22

2 hours ago, wulperdinger said: Do you have any recommendations for faster drives with a capacity of 18TB+?

"Recommendations" is saying too much. I used WD Red Pro drives before (not the slow ones without "Pro"), but then I found Toshiba ones. Those are a bit faster and also a lot cheaper. No problems so far (they have been running for 2 years now). But then, others may have other recommendations.

As long as a drive runs at 7200 rpm, uses CMR (not SMR!) and is "NAS-approved", it should work and be quite fast. CMR ensures fast write speeds; "NAS-approved" means the drive detects vibrations from other disks in the same enclosure and compensates for them.
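Some of the properties MAM59 lists can be read off a drive already in the machine. A small sketch using smartmontools; the device name is a placeholder, and note that smartctl cannot reliably distinguish CMR from SMR, so check the vendor's datasheet for that:

```shell
# Print the drive's identity block and keep the model and spindle speed.
# "Rotation Rate" distinguishes 7200 rpm drives from 5400 rpm ones.
# /dev/sdX is a placeholder -- use a device from your array.
smartctl -i /dev/sdX | grep -E 'Device Model|Model Number|Rotation Rate'
```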
speedyonthehill Posted May 23

Okay, does anyone actually have a remedy for the WD drives? I have a similar issue with 52TB across 13 drives in a mixed environment of 7200 and 5400 rpm disks. Is it possible to work around the parity/data rebuild from a failed drive? Can these checks be done separately, and if so, in what order?
roberth58 Posted May 23

You are running a SAS3 controller; use SAS3 drives for your parity. Right now you have a Ferrari controller hooked up to moped parity drives.
wulperdinger Posted May 30 Author

On 5/22/2024 at 4:36 PM, JorgeB said: HBA is not downgraded, you can run the diskspeed container test to see if all disks are performing normally.

Should I downgrade the HBA software? Here is a screenshot of my disk speeds. I switched out the parity disk, but the parity check speed is still very slow.
wulperdinger Posted May 30 Author

On 5/23/2024 at 5:30 AM, roberth58 said: You are running a SAS3 controller, use SAS3 drives for your parity. Right now you have a ferrari controller hooked up to moped parity drives.

I switched the parity drive, but the speed is still very slow (see pictures above).
JorgeB Posted May 30

The disks look OK. Also run the controller benchmark in the container and post that.
wulperdinger Posted May 30 Author

1 hour ago, JorgeB said: Disks look OK, also run the controller benchmark in the container and post that.

Do you mean this?
JorgeB Posted May 31

Post the test results; they look like this:
caplam Posted May 31

Did you check that nothing else is accessing the disks? It may seem a stupid question, but a few weeks ago the same kind of thing happened to me, and the culprit was my Time Machine backup running at the same time. The parity rebuild was stuck at a low speed (12 MB/s). The speed went up a few minutes after I cancelled the TM backup.
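caplam's suggestion can be checked without extra tools. A dependency-free sketch, assuming bash and a Linux /proc/diskstats (field 3 is the device name, field 6 sectors read, field 10 sectors written): a parity check only reads, so any array disk whose write counter grows is being touched by something else.

```shell
#!/bin/bash
# Sample /proc/diskstats twice, 5 seconds apart, and diff the
# read/write sector counters per whole disk (sda, sdb, ...).
snap() { awk '$3 ~ /^sd[a-z]+$/ { print $3, $6, $10 }' /proc/diskstats; }
snap > /tmp/disk.before
sleep 5
snap > /tmp/disk.after
echo "disk read_sectors written_sectors (last 5s)"
join /tmp/disk.before /tmp/disk.after | awk '{ print $1, $4 - $2, $5 - $3 }'
```

`join` expects sorted input; /proc/diskstats lists sdX devices in alphabetical order, which is sufficient here. A nonzero write column during a check points at the mover, a backup job, or a preclear.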
wulperdinger Posted May 31 Author

11 hours ago, caplam said: Did you check that you have no other access to the disks?

I only have binhex-krusader, DiskSpeed and Plex, all turned off when the parity check is running.
wulperdinger Posted May 31 Author Share Posted May 31 (edited) 13 hours ago, JorgeB said: Post the test results, looks like this: took me a few minutes to find how to do that it kinda makes sense now, somehow. Is there a reason for the low avg. speed with all drives active? edit: preclear is running atm on sdc , while the benchmark test was done Edited May 31 by wulperdinger Quote Link to comment
JorgeB Posted May 31

7 minutes ago, wulperdinger said: it kinda makes sense now

It shows the problem, but it doesn't really make sense. The disks are connected directly to the HBA, right? No enclosure/expander?
wulperdinger Posted May 31 Author Share Posted May 31 (edited) 8 minutes ago, JorgeB said: It shows the problem, but doesn't really make sense, the disks are connected directly to the HBA right? No enclosure/expander? preclear is running atm on sdc (started it long after I started the parity check), but even if it isn't running, the speed never exceeds 60-80mbit all connected via the HBA only Edited May 31 by wulperdinger Quote Link to comment
JorgeB Posted May 31

If there's no enclosure, I would suspect an issue with the HBA.
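Before swapping the card, the kernel log is worth a look. A simple sketch; the `mpt3sas` driver name comes from the lspci output posted earlier in the thread:

```shell
# Grep the kernel ring buffer for the LSI driver. Resets, timeouts or
# "fault state" messages during a parity check point at the card, its
# firmware, or the cabling rather than at the disks themselves.
dmesg | grep -iE 'mpt3sas|sas3008' | tail -n 20
```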