RC11 on X9SCL-F-O LGA w/ i3-2100 slow Parity Checks with new SAS2LP-MV8



Tom doesn't necessarily know precisely what changed in each release - don't forget that he is dependent on the Linux kernel and its drivers.

 

You don't have to try every intermediate release - try doing a 'binary chop' on the set of releases, i.e. try the one midway between rc5 and rc11 and use the result to decide which half of the set to investigate next.  That way you don't have to try more than four of the ten intermediate releases.
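For anyone who wants to see the 'binary chop' spelled out, here is a minimal sketch in Python. It assumes the slowdown appears in one release and stays in every later one, which may not hold for this problem. The release labels (mid1 to mid10) and the is_slow() check are hypothetical placeholders; in practice is_slow() would mean booting that release and timing a parity check by hand.

```python
def first_slow_release(releases, is_slow):
    """Return the first release for which is_slow() is True.

    Assumes releases[0] is known to be fast, releases[-1] is known to be
    slow, and the slowdown persists once it has been introduced.
    """
    lo, hi = 0, len(releases) - 1  # lo: known fast, hi: known slow
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_slow(releases[mid]):
            hi = mid  # culprit is mid or earlier
        else:
            lo = mid  # culprit is after mid
    return releases[hi]


# Hypothetical example: rc5 (fast), ten intermediate releases, rc11 (slow).
releases = ["rc5"] + [f"mid{i}" for i in range(1, 11)] + ["rc11"]
tested = []


def is_slow(release):
    # Stand-in for a manual parity check; pretend the slowdown began at mid7.
    tested.append(release)
    return releases.index(release) >= releases.index("mid7")


culprit = first_slow_release(releases, is_slow)
print(culprit, "found after", len(tested), "tests")
```

With twelve entries in the list, the loop never needs more than four tests, which is where the 'four of the ten' figure comes from.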

Unfortunately that may not be the case. It's very possible that the behaviour could change back and forth between multiple releases, as the cause could be any number of interactions between kernel versions, software versions and device driver versions. The only way to know is to try every single version with a set of synthetic benchmarks.
Link to comment

Unfortunately that may not be the case. It's very possible that the behaviour could change back and forth between multiple releases, as the cause could be any number of interactions between kernel versions, software versions and device driver versions. The only way to know is to try every single version with a set of synthetic benchmarks.

 

I tried every version of unRAID since the version that initially caused the problem and they all have it (Beta13 to RC11).

 

I'll see what I can do, but I think Tom knows best what changed when and why...

 

He doesn't know. In the past I talked to him about the issue and he couldn't reproduce it. Since then, he hasn't replied to people experiencing the issue. I don't know what the deal is, but I'm very close to looking at alternative solutions. I've been waiting for a fix for almost a year... and I even narrowed it down to the version of unRAID that caused it to start happening.

 

Try Beta12a and see if you still have your speed. Then try Beta13. If it's anything like mine, 13 will be slow and 12a won't. If both of those perform badly, try Beta12. I'm 99% sure it was caused by Beta13, but it's been a while. Sadly, these earlier versions have critical SAS bugs that will randomly take down your entire system on SAS2LP cards... I don't recommend staying on them.

 

Changelog for Beta13:

- linux: use kernel version 3.1.0

- linux: restore linux r8169 driver

- linux: include 'bonding' module and '/sbin/ifenslave' command

- netatalk: use version 2.2.1

- samba: use version 3.6.1

Link to comment

(Posted this in the rc11 thread as well)

 

OK, I think I've reached some sort of conclusion. After a lot of testing, I think there are 2 'problems' here.

 

First thing is that after a reboot, sometimes the parity check speed is way too slow (about 18-28MB/s MAX...). This happens on rc11, but also on rc5. Sometimes I have to reboot a couple of times to get it 'right'. I suspect the SAS2LP card. I've ordered an IBM M1015 to see if that solves this erratic boot behaviour.

 

Once booted in a 'good mode', it looks like the overall transfer speed of rc5 is about 10-20% faster than rc11. This is confirmed by others in this thread as well. The parity check speed, however, shows a bigger difference, at least on my system. I ran both versions for over 1 hour (stock unRAID, no plugins, no SF), checking the parity check speed every 10-15 minutes. Both versions reached their 'constant' speed in about a minute and stayed there: rc5 ran at 120-130MB/s, while rc11 ran at 70-80MB/s. That is about 40% slower.
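As a quick check of the 'about 40%' figure, here is a small sketch that works it out from the speed ranges quoted above. Using the midpoint of each range is my own assumption; the function name is just for illustration.

```python
def percent_slower(baseline_mb_s, observed_mb_s):
    """How much slower observed is than baseline, as a percentage."""
    return 100.0 * (baseline_mb_s - observed_mb_s) / baseline_mb_s


# Speed ranges reported above, in MB/s.
rc5_low, rc5_high = 120, 130
rc11_low, rc11_high = 70, 80

# Midpoint of each range (assumption: the midpoint is representative).
rc5_mid = (rc5_low + rc5_high) / 2
rc11_mid = (rc11_low + rc11_high) / 2

typical = percent_slower(rc5_mid, rc11_mid)
best_case = percent_slower(rc5_low, rc11_high)   # smallest gap
worst_case = percent_slower(rc5_high, rc11_low)  # largest gap

print(f"typical: {typical:.0f}%, range: {best_case:.0f}% to {worst_case:.0f}%")
```

The midpoints give 40%, and the extremes of the two ranges put the slowdown somewhere between roughly a third and a bit under a half.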

 

For now, I'm back at rc5. Once I get the M1015 card, I'll do another session like this.

Link to comment

18 1/2 hour parity syncs on my servers this week. Totally unacceptable.

 

Why can't we get an official response? This affects enough people to warrant a reply that it's being looked into. There are factors that have been stated in this thread that prove it's either an unRAID, driver, or Linux issue, and not hardware related. We've even narrowed it down to the version that causes this. I've purchased over 4 licenses over the years and feel completely screwed here.

 

Link to comment

18 1/2 hour parity syncs on my servers this week. Totally unacceptable.

 

You're running software which is not 'production ready'.  If you feel so strongly, while waiting for the issue to be resolved, you can go back to an earlier V5, or even back to 4.7.

Link to comment

Got the M1015 today, struggled a bit to get the thing flashed; megarec.exe won't run on all MS-DOS versions. Took me about 2 hours to find a specific tool (Rufus 1.1.7, not the most recent version...) that could make a bootable USB key with a DOS version that could actually run megarec.exe... damn.

 

Flashing was a breeze on an ancient Gigabyte mobo I'm using as an unRAID test platform. I can see in the syslog that it's detected:

Mar 14 16:24:36 Tower kernel: mpt2sas0: LSISAS2008: FWVersion(15.00.00.00), ChipRevision(0x03), BiosVersion(00.00.00.00)
Mar 14 16:24:36 Tower kernel: mpt2sas0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)
Mar 14 16:24:36 Tower kernel: mpt2sas0: sending port enable !!
Mar 14 16:24:36 Tower kernel: mpt2sas0: host_add: handle(0x0001), sas_addr(0x500605b0033c0a20), phys(
Mar 14 16:24:36 Tower kernel: mpt2sas0: port enable: SUCCESS

 

Just installed it in the 'real' unRAID server, all looks well. I ran a parity check for about 30 minutes using v5.0-rc8, at speeds comparable to the SAS2LP: starting at 160MB/s and slowing down to about 140MB/s after 30 minutes.

 

I am now running v5.0-rc11, and also started a parity check. It also started at 160MB/s and slowed down a bit more after about 20 minutes, to 120MB/s. That is A LOT BETTER than rc11 running the SAS2LP, but not as fast as rc8 on a SAS2LP. Using rc11 on the SAS2LP would be under 60MB/s slowing to 20MB/s... unacceptable.

 

For now, it looks like unRAID rc11 has a lot of trouble using the SAS2LP. Replacing it with an M1015 definitely has a positive effect. Will keep an eye on it.

Link to comment

It's not all card related; once I installed the latest SimpleFeatures 1.0.11, the slow parity speed was back... so I've decided to stop messing around with stuff that won't work properly and reinstalled v5.0-rc5. For me that works like a charm. Stable, fast speeds. rc11 has no added value whatsoever on my setup. Nor has SimpleFeatures 1.0.11.

 

So basically I did not have to switch to the M1015; with rc5 it performs pretty much the same as the SAS2LP did in my system. But it was a nice experiment to show rc11 is not up to it yet. And the M1015 boots a lot faster in IT mode, and is a tiny bit faster with transfer speeds.

Link to comment

So is this a card issue with rc11 or a SimpleFeatures issue?  I currently run an old SimpleFeatures (not sure exactly which) with an M1015 on rc11 and I have no issues with parity checks.  So I am not going to update my SimpleFeatures :) jowi, I noted you indicated that SimpleFeatures may be polling. Do you leave the GUI up the whole time you're running checks, and by any chance do you have unmenu installed?  I usually check unmenu for parity check speeds as it's a more minimalistic web page and usually has less of an impact on speeds.

Link to comment

The SAS2LP card won't work properly on rc11, no matter what version of SF. (At least, for a lot of x9scm owners.) The M1015 does a better job on rc11, still not as fast as rc5 on a SAS2LP, but acceptable.

But running SF 1.0.11 on an M1015/rc11 again gives ridiculously slow parity check speeds. As bad as a SAS2LP/rc5 with SF 1.05 (20-30MB/s max).

Link to comment

You know I just noticed that after upgrading to SimpleFeatures 1.0.11, I too am having slow parity checks.  Mine's been running for 1 day, 2 hours and it's only at 45%.  Wonder which SimpleFeatures plugin is causing the issue.

It's not a plugin that causes it, it is SimpleFeatures itself...

Chances are if you close the (all!) SF webpage(s) (on all machines) and go to unmenu, the parity speed will get back to normal.

Link to comment
  • 2 months later...

I have recently built a 24-bay server using a HighPoint 2760A 24-drive controller card.  This card uses the same Marvell 88SE9485 controller chip as the AOC-SAS2LP-MV8.  Actually, the 2760A is like three AOC-SAS2LP-MV8's in one, as the 2760A has 3 Marvell 88SE9485 controller chips (instead of just one) which are connected by a PLX PCI bridge.

 

I'm mentioning all this because the 2760A is having a similar parity speed performance issue as described for the AOC-SAS2LP-MV8.  Interestingly, I've been able to conduct a special test on the 2760A that is difficult or impossible with the AOC-SAS2LP-MV8, and I wanted to share those results.

 

I currently only have 16 drives in the server (15 data, 1 parity).  When I first installed the drives in the server, I simply filled up the drive bays starting at the top and working down.  That resulted in the first 88SE9485 chip having 8 data drives on it, the second 88SE9485 having 7 data drives on it, and the third 88SE9485 only having the one parity drive on it.

 

In that configuration, I had parity check speeds of 4MB/s under 5RC6, and 40MB/s under 5RC12a.

 

As a troubleshooting step, I moved the drives around on the controller to spread out the load more evenly.  The first two 88SE9485 chips now had 6 drives on them, and the third 88SE9485 now had 4 drives.

 

My parity speeds jumped from ~40MB/s to ~70MB/s.

 

I didn't reboot to make the change.  All drives were still connected to the 2760A.  I simply put the drives on different ports, making sure that each 88SE9485 chip didn't have more than 6 drives to manage.

 

An earlier test with only 3 drives produced parity speeds well in excess of 110MB/s.

 

PCIe bandwidth is not an issue, as each drive had a full PCIe 2.0 lane available in my tests.  I'm running my test bare metal with zero plug-ins.
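To put a rough number on the PCIe claim, here is a back-of-the-envelope sketch. The 5 GT/s rate and 8b/10b encoding are the standard PCIe 2.0 figures; the 160MB/s per-drive number is an assumption borrowed from the fastest parity check speed reported earlier in this thread (on different hardware), and real usable bandwidth will be somewhat lower than the theoretical figure because of protocol overhead.

```python
# PCIe 2.0: 5 GT/s per lane with 8b/10b encoding (8 data bits per 10 bits sent).
PCIE2_TRANSFERS_PER_S = 5_000_000_000
ENCODING_NUMERATOR, ENCODING_DENOMINATOR = 8, 10


def pcie2_lane_mb_s():
    """Theoretical one-direction bandwidth of one PCIe 2.0 lane in MB/s."""
    data_bits_per_s = (
        PCIE2_TRANSFERS_PER_S * ENCODING_NUMERATOR // ENCODING_DENOMINATOR
    )
    return data_bits_per_s // 8 // 1_000_000  # bits -> bytes -> MB


# Assumption: a drive peaks at about 160 MB/s during a parity check.
ASSUMED_DRIVE_PEAK_MB_S = 160

lane_mb_s = pcie2_lane_mb_s()
headroom = lane_mb_s / ASSUMED_DRIVE_PEAK_MB_S
print(f"lane: {lane_mb_s} MB/s, about {headroom:.1f}x the assumed drive peak")
```

Even allowing for overhead, one lane per drive leaves a comfortable margin, which fits with the bus not being the bottleneck here.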

 

So, my observations are that:

  • The Marvell 88SE9485 controller seems to be sensitive to the number of drives it is managing, slowing down with each new drive added
  • Different unRAID builds have had a very dramatic influence on the speed, so the limiting factor appears to be software based

 

Hopefully these results shed new light on an issue my fellow unRAIDers have been struggling with for over a year.

 

You can see more of my results here:  http://lime-technology.com/forum/index.php?topic=27460.msg243897#msg243897

 

Surprisingly I haven't seen any mention of the mvsas driver, which is used on both of these cards.  I can't help but think that different mvsas driver revisions, shipped inside the different Linux kernels Tom has incorporated into the various Beta/RC builds, should be a prime suspect.

 

Does anyone have the changelog for the mvsas driver?

 

Link to comment
  • 2 weeks later...

Hi there,

 

I just bought a SAS2LP myself. I was wondering if you just plugged it into your server, or did you have to update or fiddle with its BIOS at all to get it working?

As soon as I get it I will also perform the same tests you did, on the latest unRAID beta.

Thanks for your reply in advance

 

Ivan

 

Link to comment