Slow Parity rebuild


SavellM


  • 4 weeks later...

Hello, I'm having a similar issue.  Parity rebuild always runs at about 40-41MB/s.  The parity drive is a brand new WD Red 8TB; the other drives are 8TB WD Whites and a pair of Seagate 4TBs, all plugged into a Dell Perc H310 SAS HBA reflashed with LSI 9211-8i firmware, using SAS-to-SATA breakout cables.

 

I'm wondering if the issue is that my SAS HBA is only initializing with one PCIe lane instead of 4?  The total bandwidth I see on it (~205MB/s) might seem to indicate that.  I have no Docker apps or VMs running right now, and the only plugins are Community Applications, Nerd Tools, Unassigned Devices and User Scripts.  I'm not currently transferring any files, either, and the CPUs are basically doing nothing.  I see these same speeds consistently for days at a time during the parity rebuild.  I would expect the parity rebuild to run north of 100MB/s on this hardware.

 

05:00.0 Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
        Subsystem: Dell 6Gbps SAS HBA Adapter
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 16
        Region 0: I/O ports at d000
        Region 1: Memory at f6940000 (64-bit, non-prefetchable)
        Region 3: Memory at f6900000 (64-bit, non-prefetchable)
        Expansion ROM at f6800000 [disabled]
        Capabilities: [50] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [68] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
                DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+ Unsupported+
                        RlxdOrd- ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
                LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s, Exit Latency L0s <64ns
                        ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range BC, TimeoutDis+, LTR-, OBFF Not Supported
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                         AtomicOpsCtl: ReqEn-
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                         EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [d0] Vital Product Data
pcilib: sysfs_read_vpd: read failed: Input/output error
                Not readable
        Capabilities: [a8] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [c0] MSI-X: Enable+ Count=15 Masked-
                Vector table: BAR=1 offset=0000e000
                PBA: BAR=1 offset=0000f800
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Capabilities: [138 v1] Power Budgeting <?>
        Kernel driver in use: mpt3sas
        Kernel modules: mpt3sas
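
The link fields are easier to pick out with a filter (05:00.0 is the HBA address from the listing above):

lspci -vv -s 05:00.0 | grep -E 'LnkCap|LnkSta'

LnkCap shows what the card supports (Speed 5GT/s, Width x8), while LnkSta shows what it actually negotiated - here only 2.5GT/s at Width x1.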
 

unraid-parity-rebuild-perf.jpg

samurai-diagnostics-20180303-1225.zip


....AAAAAND that was in fact the problem.  I dug around and found a BIOS option on my Gigabyte Z97X-UD3H mobo to switch the PCIe bandwidth of the bottom slot, where the SAS HBA is plugged in, from "Auto" to "x4", and I'm now seeing 725MB/s reads off my data drives and 145MB/s writes onto my parity drive.  It was useful to write it all out, though; I was pretty bummed about the overall disk performance until now.  It turns out that a single PCIe 2.0 lane doesn't support 6 drives very well!
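
Rough back-of-the-envelope math for anyone hitting the same wall (using the 2.5GT/s x1 link that LnkSta reported above):

    2.5 GT/s with 8b/10b encoding ≈ 250 MB/s per lane theoretical, call it ~200 MB/s usable
    ~200 MB/s shared across 6 drives ≈ 35-40 MB/s each, right where the rebuild was stuck
    x4 is four times that, and roughly double again if the link also trains at the full 5 GT/s

That lines up with the ~205MB/s total I was seeing before the BIOS change.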

6 minutes ago, John_M said:

Your H310 can use 8 lanes if they are available. You could put it in a different slot.

 

Unfortunately the IOMMU groupings on this motherboard are really weird.  I literally tried every combination and setting (spending an entire Saturday), and this bottom slot was the only place where I could keep it separate from the Nvidia GPU in one of the other PCIe slots and still be able to pass that GPU through to a VM for my wife to use.  So right now I have the GPU in the top x16 slot and the H310 in the bottom PCIe x4 slot, which seems to work.  The PCIe ACS override did NOT work - it would split out the IOMMU groups more logically, but that Nvidia GPU was never recognized for passthrough.
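
In case it helps anyone else map out their slots, a standard sysfs one-liner (nothing Unraid-specific) lists every device by IOMMU group:

for g in /sys/kernel/iommu_groups/*; do echo "IOMMU group ${g##*/}:"; for d in "$g"/devices/*; do echo -e "\t$(lspci -nns ${d##*/})"; done; done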

Edited by Kamikazejs

Yeah, you're probably right that there's a little more speed to be gained but this is Good Enough for now until I upgrade that server at the end of the year.  The drives are still plugged into the H310 HBA.  I don't trust the onboard SATA controllers for drives this large and new.


I guess I've read about potential data loss with onboard SATA controllers in too many places, and I've had issues myself on other hardware with onboard SATA ports wreaking havoc and causing massive data corruption (which, thankfully, btrfs in RAID 1 was able to correct), so at this point I'd just rather trust a more "enterprise"-grade device.  The SAS breakout cables are really nice too... they sure clean things up in that box.

On a side note, I'm a big fan of btrfs in RAID 1.  I would really, really like to see btrfs in RAID 1 for metadata and data made available in Unraid for volumes with critical data in the future, with parity as an additional layer of protection on top of that.  That might sound paranoid to some, but I'm a data storage guy for a living. =-)
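
(Outside of Unraid's pool settings that layout is just plain btrfs; a minimal sketch with hypothetical device names:

mkfs.btrfs -m raid1 -d raid1 /dev/sdX /dev/sdY
# or convert an existing single-device filesystem after adding a second disk:
btrfs device add /dev/sdY /mnt/pool
btrfs balance start -mconvert=raid1 -dconvert=raid1 /mnt/pool

Either way both metadata and data end up mirrored.)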

  • 6 months later...

I'm hoping that the highly intelligent folks on this thread will be able to assist. 

 

I just installed my first HBA (an LSI 9211-8i) flashed to IT mode. X399 Threadripper 2950X with two x8 and two x16 PCIe slots. The HBA is in an x8 slot with a GTX 1050 in an x16 slot next to it. 

I'm getting a 9 day 15 hour ETA on my rebuild of an 8TB disk. 

Read speeds on the array seem to be capped around 57MB/s. Writes are at 8.2MB/s.

8TB Seagate IW Parity Drive (MB port)

2x 500GB EVO 950/960 Cache drives (MB port)

4x 8TB WD Red data drives

3x 3TB WD Red data drives

I'm not familiar enough with IOMMU groups to know what I'm doing or what to look for...

 

Questions:

- Can I stop the parity sync/rebuild to check my BIOS settings?

- Do I need to look at anything in the IOMMU groups?

tower-diagnostics-20180915-1502.zip

 

Not sure if anyone has played with the X399 Taichi with the 2950X, but here is my BIOS/AMD PBS config:

 

I am trying to find any information I can on the PCIe Slot Configuration and the different modes.
X399 Taichi
Threadripper 2950X
32GB G.Skill 3200 (B-die) RAM, dual channel
I have a GTX 1050 in Slot 1
LSI 9211-8i in Slot 2 (card is x8 PCIe 2.0)
 
BIOS Config: Advanced/AMD PBS
PCIe x16 Switch - Auto
Promontory PCIe Switch - Auto
NVMe RAID mode - Disabled
PCIe Slot1 Configuration  - x16 Mode
PCIe Slot2 Configuration - x8 Mode
PCIe Slot4 Configuration - x16 Mode
PCIe Slot5 Configuration - x8 Mode
 
What is the x4x4 Mode on Slot2?
Edited by blurb2m
Added diagnostics.

Nothing jumps out as to why it's going so slow. You'll need to do some testing to find the problem, but that's harder to do with a disabled disk.

 

4 hours ago, blurb2m said:

Can I stop the parity sync/rebuild to check my BIOS settings? 

You can cancel a rebuild at any time, but it can't be resumed, i.e., it will always start from the beginning.

 

4 hours ago, blurb2m said:

Do I need to look at anything in the IOMMU groups?

That's for virtualization; it shouldn't affect rebuild speed. You can check with lspci whether the HBA is linking at x8, but with only 4 disks connected it shouldn't make much difference even if it isn't.

 

lspci -vv -s 41:00.0

Look for LnkSta
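
If the full output is too noisy, filtering down to the link lines is enough (same device address as above; adjust it to whatever lspci reports for your HBA):

lspci -vv -s 41:00.0 | grep -E 'LnkCap|LnkSta'

LnkCap is what the card supports, LnkSta is what it actually negotiated; for this card both should show 5GT/s and Width x8.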


It settled out at around 15 days with no Docker containers or VMs running, and there was no way I was going to wait that long.

So I tinkered for a bit and I changed the PCIe x16 Switch to Gen 2. That drastically shot my rates up.

800MB/s read on the array and 90-120MB/s write to the rebuild disk. (rebuild came down to 16.5 hours)

 

It's still a bit iffy whether the LSI will initialize when the system boots. Sometimes it freezes at "Initializing..."

One or two reboots and it will finish that in about 30 seconds, identify the disks connected and boot into Unraid.

Wondering if I should remove the MPT BIOS or just leave it be.

4 hours ago, blurb2m said:

That drastically shot my rates up.

800MB/s read on the array and 90-120MB/s write to the rebuild disk. (rebuild came down to 16.5 hours)

That's much better and close to full speed.

 

4 hours ago, blurb2m said:

It's still a bit iffy whether the LSI will initialize when the system boots. Sometimes it freezes at "Initializing..."

One or two reboots and it will finish that in about 30 seconds, identify the disks connected and boot into Unraid.

Wondering if I should remove the MPT BIOS or just leave it be.

You can remove it; the BIOS is only needed if you plan to boot from one of the connected devices. If not, it just increases boot time.
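
For reference, the usual LSI crossflash write-ups describe it as erase-and-reflash rather than a separate "remove BIOS" step; a rough sketch only (not verified here, so follow a full guide before touching the flash):

sas2flash -list                  # note the current firmware version and SAS address first
sas2flash -o -e 6                # erase the flash; do NOT reboot until the firmware is reflashed
sas2flash -o -f 2118it.bin       # reflash the IT firmware only; leaving out -b mptsas2.rom means no boot ROM is written back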


Sorry for my slow replies. I truly appreciate your help @johnnie.black !

Is there a guide for removing just the MPT BIOS, or does that require erasing everything and a new flash? I used the EFI version on my gaming rig to flash the FW and BIOS.

 

Side note: This 2950x doesn't care what I throw at it! 4k HDR HEVC transcode? Sure no problem!

Edited by blurb2m
  • 6 months later...

I'm experiencing what I consider a slow parity disk rebuild: 50MB/s.  I replaced my parity 1 disk with a bigger disk, and I plan on replacing the parity 2 disk with a bigger one in the next month or two so I can start using bigger disks.  I recently upgraded my computer to a Supermicro 24-bay system.

 

Primary Specs:

CPU: 2x Intel Xeon E5-2690 V2 Deca (10) Core 3Ghz

RAM: 64GB DDR3 (4 x 16GB - DDR3 - PC3-10600R REG ECC)

Storage Controller: 24 ports via 3x LSI 9210-8i HBA controllers, installed in the three PCI-E slots near the power supply (leaving room for dual graphics cards)

NIC: Integrated onboard 4x 1Gb Ethernet

 

Secondary Specs: Chassis/ Motherboard

Supermicro 4U 24x 3.5" drive bay single-node server 
Server Chassis/Case: CSE-846BA-R920B (upgraded to 920W-SQ quiet power supplies)
Motherboard: X9DRi-LN4F+ Rev 1.20
Backplane: BPN-SAS-846A 24-port 4U SAS 6Gbps direct-attached backplane, supports up to 24x 3.5-inch SAS2/SATA3 HDD/SSD
PCI-E expansion slots: Full height, 4x PCI-E 3.0 x16, 1x PCI-E 3.0 x8, 1x PCI-E 3.0 x4 (in an x8 slot)
* Integrated Quad Intel 1000BASE-T Ports
* Integrated IPMI 2.0 Management

24x 3.5" Supermicro caddy

 

I don't see a reason for the parity rebuild to be only 50MB/s.  I'm including my diagnostics file.  I ran the:

On 9/15/2018 at 6:10 PM, johnnie.black said:

lspci -vv

and verified that all my cards were running in PCIe x8 mode.  My LSI cards are PCIe 2.0; I want to upgrade to 3.0 someday, but that's not high on the priority list.  My BIOS says all the PCIe slots are in PCIe Gen3 x16 mode, with the exception of the x8 slot which is in x8, but they definitely exceed the requirements of the LSI cards.

 

I'd been reading and researching for 24 hours before posting here for an extra set of eyes.  Thank you in advance for your help.

rudder2-server-diagnostics-20190407-1333.tar.gz

5 hours ago, johnnie.black said:

Some SAS devices come with write cache disabled; see here:

https://forums.unraid.net/topic/72862-drive-write-speeds-really-slow-solved/

 

You are the man!  This was my problem.  Man, it would have saved me 24 hours if I could have found this.  LOL.  I wonder why Unraid is set to have write cache off by default on SAS drives.  (It's a drive setting, not an Unraid setting - corrected by jonathanm below.)  It's not a problem now that I know.  Thank you so much for your help!

Edited by Rudder2
3 minutes ago, Rudder2 said:

I wander why unRAID is set to have write cache off by default on SAS drives.

It doesn't. Unraid uses the drives as presented. Write cache is inherently risky and you need a UPS in place to mitigate the risk, so some enterprise drives ship with the setting off to lower that risk; it's up to you to enable write caching and ensure your infrastructure won't lose those writes.
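
For reference, checking and flipping that setting on a SAS drive is typically done with sdparm (device name is a placeholder; substitute the actual SAS disk):

sdparm --get=WCE /dev/sdX            # WCE 0 means write cache is currently off
sdparm --set=WCE --save /dev/sdX     # enable it; --save keeps the setting across power cycles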

1 minute ago, jonathanm said:

It doesn't. Unraid uses the drives as presented. Write cache is inherently risky and you need a UPS in place to mitigate the risk, so some enterprise drives ship with the setting off to lower that risk; it's up to you to enable write caching and ensure your infrastructure won't lose those writes.

Thank you for the information.  I have SATA enterprise drives from Seagate and they came with it enabled by default; I picked up these SAS Seagate drives and it was disabled.  Thank you for the correction.  It makes sense.  I have a UPS, so no issues.  I've never had a problem on my computer even with power outages over many years, so it sounds like the risk is small, but I can understand that in an enterprise environment even a small risk is too much.
