Parity bottleneck - Supermicro AOC-SAS2LP-MV8



I'm currently running a parity sync and the behaviour seems a little odd..

 

Initially the sync was slow at 30-40MB/sec, but as the smaller 2TB drives finished it increased, and then increased again to 100-120MB/sec once the 3TB drives had finished and only the 4TB drives were left to complete.  Up until the 3TB drives finished, the disk read stats graph indicated the total throughput was around 450MB/sec.  When left with just the 4TB drives, the stats graph showed a drop to ~350MB/sec.

 

8 drives are connected to a Supermicro AOC-SAS2LP-MV8 in a PCIe 2.0 x8 slot, which should be capable of 32 Gigabit/sec (i.e. 4GB/sec, or 500MB/sec per drive).  The remaining 4 drives are connected to SATA 3 6Gbit ports on an ASRock FM2A75M-ITX.
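A quick sketch of that slot math in Python (just the standard PCIe 2.0 figures, nothing measured on this system; it ignores any protocol overhead beyond 8b/10b encoding):

```python
# Back-of-envelope PCIe 2.0 x8 slot bandwidth.
lanes = 8
raw_gt_per_lane = 5.0                            # PCIe 2.0 signalling rate per lane, GT/s
usable_gbps = lanes * raw_gt_per_lane * 8 / 10   # 8b/10b encoding -> 32 Gbit/s of data
usable_mb_per_s = usable_gbps * 1000 / 8         # ~4000 MB/s
drives = 8
print(usable_gbps, usable_mb_per_s / drives)     # 32.0 Gbit/s total, 500.0 MB/s per drive
```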

 

The AOC-SAS2LP-MV8 reports it is linked at "5Gbps" during boot.  Given PCIe 8b/10b encoding overhead this perhaps means ~4Gbps real world.  I had assumed this figure must have meant gigaBYTES, but is it in fact linked at only 5 gigabit?  This would explain the bottleneck..

 

Syslog says: kernel: mvsas 0000:01:00.0: mvsas: PCI-E x8, Bandwidth Usage: 5.0 Gbps

 

Obviously the system is not capable of reading data at more than 450MB/sec (just under 4 gigabit per sec), but my question is: why is this the case?  Can anyone clarify what the "5.0 Gbps" in syslog refers to?  Should the card be linking at a higher speed?
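For reference, the conversion behind "just under 4 gigabit" (this assumes the stats graph reports decimal MB/s):

```python
observed_mb_per_s = 450
print(observed_mb_per_s * 8 / 1000)   # 3.6 Gbit/s observed across all 8 drives on the card
print(5.0 * 8 / 10)                   # 4.0 Gbit/s - usable data rate of a single 5 Gbps PCIe 2.0 lane
```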

 

[Attached screenshots: disk read stats graphs]

Link to comment

I think you are seeing what I saw with my prior issue.  Your SATA controller card has some theoretical top throughput that it can handle.  When you are doing a parity check you are initially starting out with all disks, and the speed of it at that point can be limited by 1) the speed of your slowest drive, and 2) the throughput capability of your SATA controllers.

 

If I hook up a firehose to a garden hose, I limit the water going through; same thing here with the data.  There's a ton of data going through that controller, so you may be getting close to the actual throughput your card can handle, and typically your smaller drives are going to be older and slower.  Once you get above 2TB, your 2TB drives spin down and are no longer used; if the bigger drives are faster you will now see higher throughput and your sync completion time will drop.  You are also now sending less data through the controller, which helps as well.  Typically my numbers start VERY low, increase as the 2TB disks drop off, and fly once only the 4TB disks are left.
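A toy model of what's being described (illustrative only; the 120MB/s drive speed is a made-up example, and the 450MB/s cap is the figure from the post above):

```python
# Per-drive parity check speed is roughly capped by the slowest active drive
# and by the controller's total throughput shared across its active drives.
def parity_speed(drive_speeds_mb_s, controller_cap_mb_s):
    slowest = min(drive_speeds_mb_s)
    fair_share = controller_cap_mb_s / len(drive_speeds_mb_s)
    return min(slowest, fair_share)

print(parity_speed([120] * 8, 450))   # ~56 MB/s while all 8 drives are active (controller-bound)
print(parity_speed([120] * 3, 450))   # 120 MB/s once only the 3 largest drives remain (drive-bound)
```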

 

Totally expected.  The best you can do is figure out how to best split your drives between the AOC-SAS2LP-MV8 controller and the onboard controller.  And remember, these numbers are largely moot: it is rare that all your drives will be spinning at the same time, except during parity checks.

 

 

Link to comment

I get what you are saying, but my expectation is that this card has a higher maximum throughput than 5 gigabit.  It advertises that each of its 8 ports is "up to 6 gigabit", and it's connected via a PCIe x8 interface, which is capable of far more than we are seeing.

 

If 5 gigabit is the true maximum speed of this card then it is false advertising, as not even a single drive can reach the stated maximum throughput.  It would be like an ISP selling me a gigabit internet connection when their upstream link is only 10Mbit.

 

I would appreciate if someone could confirm that the syslog is indicating it is linked at only 5 gigabit.

 

Can someone with another PCI-E x8 card post what their card links at?

Link to comment

When I got a couple of SAS2LPs I did some tests to see how they compared to my old SASLP.

 

Start-of-parity-check speed on my test server with only 8 HDDs, all WD Green 2TB, all connected to the SAS card:

SAS-LP - 80MB/s

SAS2-LP - 125MB/s

 

Can you confirm what speed the card links at, and the maximum speeds the disk usage graph displays during parity?

Link to comment

According to the log the SAS2LP links at 5.0Gbps and the SASLP at 2.5Gbps.  I believe this is per lane, as the SASLP is PCIe 1.0 x4 and the SAS2LP is PCIe 2.0 x8, although at the moment the card is in a PCIe 2.0 x4 slot.

 

Max theoretical bandwidth for the SASLP should be 4 lanes @ 250MB/s each (1000MB/s); the most I got was 640MB/s.

Max theoretical bandwidth for the SAS2LP should be 8 lanes @ 500MB/s each (4000MB/s, or 2000MB/s in my x4 slot); the most I got was 1000MB/s, limited by the HDDs used I believe.
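The lane math above as a quick calculation (theoretical figures only; the measured maximums are the 640MB/s and ~1000MB/s numbers I mentioned):

```python
def pcie_theoretical_mb_s(lanes, per_lane_mb_s):
    return lanes * per_lane_mb_s

print(pcie_theoretical_mb_s(4, 250))   # SASLP,  PCIe 1.0 x4 -> 1000 MB/s theoretical (640 MB/s measured)
print(pcie_theoretical_mb_s(8, 500))   # SAS2LP, PCIe 2.0 x8 -> 4000 MB/s theoretical
print(pcie_theoretical_mb_s(4, 500))   # SAS2LP in my x4 slot -> 2000 MB/s (got ~1000 MB/s, HDD-limited)
```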

 

Disk usage graph with the SAS2LP stays around 1GB/s.

 

Remember this is for 8 HDDs only; as you add more disks other bottlenecks come into play.  One of my servers has 22 HDDs and can't get more than 80MB/s on a parity check.

 

 

Mar 22 14:48:37 Tower kernel: mvsas 0000:01:00.0: mvsas: driver version 0.8.16

Mar 22 14:48:37 Tower kernel: mvsas 0000:01:00.0: mvsas: PCI-E x4, Bandwidth Usage: 2.5 Gbps

 

Mar 22 14:48:37 Tower kernel: mvsas 0000:02:00.0: mvsas: driver version 0.8.16

Mar 22 14:48:37 Tower kernel: mvsas 0000:02:00.0: mvsas: PCI-E x4, Bandwidth Usage: 5.0 Gbps

 

Link to comment

...  When left with just the 4TB drives, the stats graph showed a drop to ~350MB/sec.

 

This doesn't seem to indicate any bottleneck => you've only got three 4TB drives, so you're getting about 117MB/s for the final TB of your parity check.  Seems about right for the final TB of a 4TB unit -- remember that at this point they're moving towards the innermost (slowest) cylinders.
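(That per-drive figure is just the graph total split across the remaining drives:)

```python
print(350 / 3)   # ~117 MB/s per drive with three 4TB drives still reading
```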

 

Link to comment

...  When left with just the 4TB drives, the stats graph showed a drop to ~350MB/sec.

 

This doesn't seem to indicate any bottleneck => you've only got three 4TB drives, so you're getting about 117MB/s for the final TB of your parity check.  Seems about right for the final TB of a 4TB unit -- remember that at this point they're moving towards the innermost (slowest) cylinders.

 

That was after the bottleneck had gone.  The graph shows the bottleneck was at ~450MB/sec; with just the 4TB drives left there was no bottleneck.

 

The Supermicro card should be able to provide up to 6 Gigabit per port, but it can't even do that _combined_ across all 8 drives!

Link to comment

I checked your 3TB drives, and they're all 1TB/platter drives (the DM3000s were made in both 3-platter 1TB/platter and 4-platter 750GB/platter versions ... but yours are 1TB/platter).  Unless there are some latency issues being caused by mixing rotation speeds, it certainly appears you're indeed being bottlenecked by the controller (or its interface).

 

I think the reported link speed at boot is per lane, so that looks fine.  But I agree I'd expect a notably higher rate once you're past the 2TB drives.

 

Just to absolutely confirm this is an issue with the SAS2LP, I'd switch your connections so the 4 3TB drives are on the motherboard ports ... then run a parity check and see what the speeds are like.    If everything improves as expected, then you either have a defective card or your x16 slot isn't operating correctly.

 

... Just for grins, have you tried unplugging the card and re-plugging it, just to confirm it's firmly seated?

 

 

 

Link to comment

I checked your 3TB drives, and they're all 1TB/platter drives (the DM3000s were made in both 3-platter 1TB/platter and 4-platter 750GB/platter versions ... but yours are 1TB/platter).  Unless there are some latency issues being caused by mixing rotation speeds, it certainly appears you're indeed being bottlenecked by the controller (or its interface).

 

I think the reported link speed at boot is per lane, so that looks fine.  But I agree I'd expect a notably higher rate once you're past the 2TB drives.

 

Just to absolutely confirm this is an issue with the SAS2LP, I'd switch your connections so the 4 3TB drives are on the motherboard ports ... then run a parity check and see what the speeds are like.    If everything improves as expected, then you either have a defective card or your x16 slot isn't operating correctly.

 

... Just for grins, have you tried unplugging the card and re-plugging it, just to confirm it's firmly seated?

 

I have now moved the 4x 3TB drives to the motherboard, with the remainder on the Supermicro.  Restarted parity and it is still maxing out at just under 512MB/sec according to the disk stats graph, and ~40MB/sec "Estimated speed" on the Main tab.  I'll post a full graph in an estimated 28 hours' time (!)

Link to comment

The real test comes after you cross the 2TB point.  It's likely that your EADS drive is a 333GB/platter unit, so it's slowing everything down until you cross the point where it's no longer involved in the parity test.

 

But you should see a MAJOR jump after you cross 2TB if the issue was your SAS2LP card.

 

Link to comment

The real test comes after you cross the 2TB point.  It's likely that your EADS drive is a 333GB/platter unit, so it's slowing everything down until you cross the point where it's no longer involved in the parity test.

 

But you should see a MAJOR jump after you cross 2TB if the issue was your SAS2LP card.

 

The speed did indeed jump from ~40MB/sec to ~95MB/sec (about 600MB/sec according to disk graph) after the 2TB drives had finished.  However, I think this is more likely down to the fact that at this point in the operation, 4 drives are now on the motherboard SATA ports, and three drives including parity are on the Supermicro card. 

 

Either the 2TB drives are very slow at reading, as you suggest, or the Supermicro card is the bottleneck and it simply stops being a factor once only 3 drives on it are active.

 

I will perform individual speed tests on each drive when parity has completed, but my money is still on the Supermicro card.

 

It still took 15 hours to complete the first 2TB of the parity check, and I think that's too long when the hardware should be capable of much more.

Link to comment

Yes, 15 hrs is a very long time to traverse 2TB => I'd agree that either the SuperMicro card is having issues or your PCIe port has a problem.  The speed you're seeing looks like you're only using one PCIe lane.
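As a sanity check on those 15 hours (assuming 2TB = 2,000,000 MB, i.e. the decimal units the drive sizes use):

```python
avg_mb_per_s = 2_000_000 / (15 * 3600)
print(round(avg_mb_per_s))   # ~37 MB/s average over the first 2TB, in line with the ~40 MB/s estimate
```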

 

Did you unplug the card and re-plug it to confirm it was firmly in place?  You may also want to clean both the connectors and the slot with some electrical solvent (or a pencil eraser & air).

 

Link to comment

The speed you're seeing looks like you're only using one PCIe lane.

 

That makes sense.  One PCIe 2.0 lane is 500MB/s; in my tests 4 PCIe 1.0 lanes (1000MB/s theoretical) translated to 640MB/s "real world", so using only one 2.0 lane should get around 320MB/s "real world", and that's what you're getting: 8 x 40MB/s.
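Same estimate as a quick calculation (the ~64% "real world" efficiency is just taken from my SASLP numbers above, so treat it as a ballpark):

```python
lane_theoretical_mb_s = 500                  # one PCIe 2.0 lane after 8b/10b encoding
efficiency = 640 / 1000                      # real-world vs theoretical seen on the SASLP
print(lane_theoretical_mb_s * efficiency)    # 320 MB/s, i.e. 8 drives x 40 MB/s
```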

Link to comment
