[Partially SOLVED] Is there an effort to solve the SAS2LP issue? (Tom Question)


TODDLT


I just bought a couple of refurb LSI 9240-8I cards for $89 each on eBay. I'll take my chances... no way am I paying Newegg prices. Now I just need to unload these SAS2LP cards :)

I'm thinking about doing the same with an M1015. Looks like the SAS2LPs are selling for about $80-90 US on eBay. I'll probably sell mine there and make back most of the cost of the M1015.


And these will work without needing to be flashed?

The H310s I got off of eBay needed to be flashed; they were in IR mode, not the IT mode that unRAID needs.

It seems that the discussion in this thread has shifted from trying to find a "software" resolution to the possible Supermicro card problem towards hardware-based solutions: replacing the card with one from another manufacturer, upgrading your processor, etc.

 

Is this Limetech's position on the issue? If a software-based solution is still being sought, I'd prefer not to spend the money on a hardware upgrade right now.


I was under the impression that this was a problem with the 64-bit driver and that LimeTech had little or no control over it. Wouldn't SuperMicro(?) have to write a new driver? I admit I don't know much about this sort of thing; it's very possible I've misunderstood the whole issue, but my fear is that SuperMicro may never get around to updating the driver.


Dell PERC H310s seem to be considerably less expensive than IBM M1015s on eBay (IBM M1115s are also a little cheaper) - at least at the moment. Any reason to pick one over the other as we consider retiring our SAS2LPs?

I switched to H310s for exactly that reason. I also wanted a few with ports out the back instead of out the top like the M1015's, for my N54L and N40L MicroServers. The above-mentioned H200 appears to be Dell's version of the M1015 (it looks like it based on pictures, anyway) and would likely be cheaper on eBay than an M1015.

Obviously I would definitely NOT recommend buying one of these cards now -- but the older SASLP-MV8 card is still a good choice (and is available very inexpensively).

 

How much slower is the SASLP? I know they're only rated at 3Gb/s but I'm not sure that means parity checks would be at half speed.
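
A quick sanity check on the ceiling, assuming the SASLP-MV8 sits in a PCIe x4 gen-1 slot (roughly 250 MB/s usable per lane, an approximation) feeding eight drives:

 echo "per-disk ceiling ~ $(( 4 * 250 / 8 )) MB/s"   # ~125 MB/s across 8 drives

So whether a check actually slows depends on how fast the drives are at the point being checked; the bus only becomes the limit where the drives could otherwise exceed ~125 MB/s.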

 

I've been looking for a good out-of-the-box controller. Don't have any Windows machines/licenses that I can use to flash a card.

 

If you are in the US and the card(s) are new, I would be willing to help you flash them; I'd just expect shipping to be paid. That's how I was able to do mine. It just requires any old PC that can boot from a USB port and has a PCIe slot. I recommend the Dell H310 cards. Why? Because I've just had good luck with them, and Dell has its own IT firmware for them - no cross-flashing needed.
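
If you want to confirm which firmware a card is running before (or after) flashing, here's a minimal sketch using LSI's sas2flash utility; it assumes a SAS2008-family card (H310, M1015, 9211-8i), and the exact output wording varies by firmware version:

 sas2flash -listall     # lists detected SAS2 controllers and firmware versions
 sas2flash -c 0 -list   # full details for controller 0; the firmware
                        # description indicates whether it is IT or IR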


I just bought a couple of refurb LSI 9240-8I cards for $89 each on eBay. I'll take my chances... no way am I paying Newegg prices. Now I just need to unload these SAS2LP cards :)

I'm thinking about doing the same with an M1015. Looks like the SAS2LPs are selling for about $80-90 US on eBay. I'll probably sell mine there and make back most of the cost of the M1015.

 

Just buy new: brand new, shiny, right out of the anti-static bag. I have dealt with this eBay vendor in the past - all new stuff and very fast shipping. Saving such a small amount on something that's very important in the grand scheme of what you are doing doesn't make sense to me. Don't be cheap; spend the extra $10-$15 for new card(s). Skip lunch one day or something.

 

Brand new Dell H310 with full size bracket.

http://www.ebay.com/itm/201391927743?_trksid=p2060353.m2749.l2649&ssPageName=STRK%3AMEBIDX%3AIT

 

EDIT: I also opted for this specific model: the SAS connectors are at the rear of the card and work perfectly in a server case. The other models have the same connectors, but at the top of the card. This is what it looks like installed.

 

unraidbuild5.jpg


I suspect this Adaptec card would perform very nicely, although I don't see anyone using it with UnRAID in this forum.

 

http://www.newegg.com/Product/Product.aspx?Item=N82E16816103231

 

It's a 4-lane, PCIe v2 card, so it will have about 2000 MB/s of bandwidth available ... or 250 MB/s per disk across eight drives => well above what any spinning disk can sustain.
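
As a back-of-the-envelope check (assuming roughly 500 MB/s of usable bandwidth per PCIe v2 lane, which is an approximation after protocol overhead):

 lanes=4; disks=8
 total=$((lanes * 500))       # ~2000 MB/s for an x4 PCIe v2 card
 echo "total ~${total} MB/s, ~$((total / disks)) MB/s per disk with ${disks} drives"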

 

BTW, the Dell H310 is an 8-lane PCIe V2.

 

Dell PERC H310 (PowerEdge RAID Controller)


I was under the impression that this was a problem with the 64-bit driver and that LimeTech had little or no control over it. Wouldn't SuperMicro(?) have to write a new driver? I admit I don't know much about this sort of thing; it's very possible I've misunderstood the whole issue, but my fear is that SuperMicro may never get around to updating the driver.

I must have missed that.  This makes sense.  Sucks, but makes sense. 

 

I wonder if anyone has asked Supermicro about this - perhaps pointed them to this thread. I mean, up until this started, I was happy with my one and only Supermicro purchase and was likely to buy another Supermicro product in the future. Now I'll think twice about it.


I was under the impression that this was a problem with the 64-bit driver and that LimeTech had little or no control over it. Wouldn't SuperMicro(?) have to write a new driver? I admit I don't know much about this sort of thing; it's very possible I've misunderstood the whole issue, but my fear is that SuperMicro may never get around to updating the driver.

My feeling is that it's not purely a software issue, but rather a hardware issue (or hardware combined with software), as some users (myself included) haven't seen any issues at all with the SAS2LP cards - no dropped drives, and parity speeds of ~115 MB/s.

 

And if the problem only affects a certain subset of users, then it becomes far harder to fix than if it affected everyone across the board.


Dell PERC H310s seem to be considerably less expensive than IBM M1015s on eBay (IBM M1115s are also a little cheaper) - at least at the moment. Any reason to pick one over the other as we consider retiring our SAS2LPs?

I switched to H310s for exactly that reason. I also wanted a few with ports out the back instead of out the top like the M1015's, for my N54L and N40L MicroServers. The above-mentioned H200 appears to be Dell's version of the M1015 (it looks like it based on pictures, anyway) and would likely be cheaper on eBay than an M1015.

 

The port location is a huge consideration. I had 0.5m SFF-8087 cables, which were great with the SAS2LP cards, but when I installed the M1015 cards they were too short, so I had to buy 1.0m SFF-8087 cables - an unexpected expense (especially since I bought six to cover all six backplanes on the Norco 4224).

 

If you are swapping out the cards, consider cable length before you buy new ones. I sure wish I had. :)


I believe the issue is not the driver or the controller itself, but that, for some reason, the way a parity check works in Unraid does not go well with this card.

 

Using a script I found on the forum, these are the results for simultaneous reads of six SSDs on the SAS2LP:

 

 sdb = 391.98 MB/sec
sdd = 391.84 MB/sec
sde = 392.20 MB/sec
sdc = 392.18 MB/sec
sdg = 436.42 MB/sec
sdf = 427.27 MB/sec 

 

So the card and driver are working OK, and the card has plenty of bandwidth.
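
For anyone who wants to reproduce this kind of test, here is a rough sketch of a simultaneous-read run (read-only, so non-destructive; device names are examples, it assumes GNU dd, and it needs root to read the raw devices):

 for d in sdb sdc sdd sde sdf sdg; do
   (dd if=/dev/$d of=/dev/null bs=1M count=4096 iflag=direct 2>&1 \
      | tail -1 | sed "s/^/$d = /") &
 done
 wait

Each dd prints its own throughput summary, so the per-device speeds can be compared directly while all drives are being read at once.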

 

Note that the parity sync speed is also normal:

 

oNT9RBV.jpg

 

Now a parity check on the same system:

 

BfJisNv.jpg

 

 

 

For comparison, here are the results with the same hardware but with the SAS2LP replaced by a Dell H310.

 

Bandwidth test:

 

sdd = 413.83 MB/sec
sdb = 441.63 MB/sec
sdc = 467.75 MB/sec
sdg = 435.47 MB/sec
sdf = 467.66 MB/sec
sde = 442.71 MB/sec

 

Parity sync:

 

KRYNT2o.jpg

 

Parity check:

 

JD5mcpm.jpg

 

 

While searching the forum for another subject, I found a 2013 post from Pauven (the creator of the tunables tester) describing a similar issue with a HighPoint 2760A, which I believe uses the same Marvell 9485 chipset. If you read his posts later in that thread, the issue was fixed by increasing the md_sync_window value, and at the time this "fix" also worked on the SAS2LP. Now I think we are seeing the same issue surface in Unraid v6, but at least for me and other users, changing the tunables values has little to no effect.
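
For reference, the v5-era workaround looked something like this (the values are illustrative only; on v6 the same tunables live under Settings -> Disk Settings, and as noted above they no longer seem to help here):

 mdcmd set md_num_stripes 4096   # total stripe buffers; keep well above the window
 mdcmd set md_sync_window 2048   # stripes the sync/check may have in flight at once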

 

The problem appears to occur only during an Unraid parity check. If it were purely a driver problem, the card should also perform slower during parity syncs and any other simultaneous reads, so I really doubt that Supermicro can do anything about this.

 

This does not mean it's a simple issue for Limetech to solve, especially because not all users are affected. I'm more a hardware than a software guy, but maybe it can only be solved by changing the parity check code. In any case, I suspect that if they can't solve it, nobody can.

 

Maybe Tom can update us and tell us whether any progress has been made.

 


VERY interesting results !! Clearly the card & driver are working fine ... and indeed the parity check WORKED fine with v5. SOMETHING has clearly changed in v6 re: the parity check that is causing these issues => I have to wonder if this is also the reason the parity checks are using so much more CPU % than with v5 (which is counter-intuitive, since the 64-bit OS should be more efficient) ... and thus causing slower parity checks in many cases [e.g. when I updated my oldest server from v5 to v6.1.3, the parity check time increased by 31%].

 

Particularly interesting that the parity sync is still at full speed. Can you try one more test? Do a drive rebuild and see whether it runs at the "good" speed shown for the parity sync or the "bad" speed that a parity check displays.

 


VERY interesting results !! Clearly the card & driver are working fine ... and indeed the parity check WORKED fine with v5.

[...]

Can you try one more test? Do a drive rebuild and see whether it runs at the "good" speed shown for the parity sync or the "bad" speed that a parity check displays.

 

FWIW, I did a drive rebuild when I first got the cards a couple of weeks ago, and it went at much higher speeds than the SASLPs that I replaced. Once it made it up to the SATA III drives, I was getting speeds in the 90s of MB/s.

I get phenomenally low parity-check speeds for some reason as well. Mine went all the way down to 1 MB/s before I just canceled it.


Thanks for the extra data point => hopefully this will provide Limetech with the clue they need to isolate why this issue is happening ... you've certainly exonerated the card and the driver, as clearly both are working just fine with both parity syncs and disk rebuilds -- and there's certainly not a different set of drivers used for the parity checks.

 


*Pout*  I want >300MB/sec transfer rates!!  My paltry 70-80MB/sec seems very turtleish!

Too bad I can't afford a crapload of 1TB or bigger SSD drives!  ;D

 

Actually you can now get 2TB units  :)

http://www.amazon.com/Samsung-2-5-Inch-SATA-Internal-MZ-7KE2T0BW/dp/B010QD6RX4/ref=sr_1_1?ie=UTF8&qid=1444930861&sr=8-1&keywords=samsung+ssd+2tb

 

So all you need is $21,319.68 (plus tax of course) and Amazon Prime will have your disks there in two days !!

Wonder what the parity check time would be for that 46TB array ??  8) 8)


I believe the issue is not the driver or the controller itself, but that, for some reason, the way a parity check works in Unraid does not go well with this card.

[...]

Maybe Tom can update us and tell us whether any progress has been made.

What I noticed and posted some time ago is that the read counts in the UnRaid GUI do not count up at the same rate during parity checks. With v5.x, the counts ("Reads" in the GUI) were very close to the same for all disks, which seems logical (assuming nothing else accesses the disks except the parity check). With v6 there are big differences. I haven't yet had an answer from Tom on how and where those counts are taken in the software - maybe that could give a hint as to why and where the speed degradation is caused.
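
One way to cross-check the GUI numbers is to read the kernel's own per-disk counters; field 4 of /proc/diskstats is "reads completed" (the GUI may derive its Reads column differently, so treat this only as a sanity check):

 awk '$3 ~ /^sd[a-z]+$/ { printf "%-5s reads=%s\n", $3, $4 }' /proc/diskstats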

 

Reads.JPG


The read counts have been different for a long time ... starting with v5. I think that simply depends on the controller, the interface (IDE vs. SATA), and the access mode (which should be AHCI for all modern drives). In any event, that's not likely the underlying reason for the difference. There IS a notable difference, though -- as this thread notes, it's not just a difference in CPU % between v5 and v6, but also a significant difference between parity syncs and drive rebuilds vs. parity checks. Definitely strange -- and it'd certainly be nice to see it resolved.

 


A bit more info on the parity check speed vs. syncs & rebuilds ...

 

I decided to do a new parity sync on my server just to see if it would indeed be faster, as Johnnie's results suggest is likely. Indeed, the parity sync on v6.1.3 was actually FASTER than a parity check had been on v5. I'm running a parity check now to compare the exact difference between the sync and the check ... it has a few hours to go, but based on where it's at, it's clearly going to take notably longer than the sync did [which, of course, makes no sense].

 

When that finishes, I plan to upgrade an older drive, so I'll do a rebuild as well (probably tomorrow) ... it'll be interesting to see if that also runs at the higher speed the sync did (I suspect it will). I'll post the relative differences after all is done.

 

The CPU utilization during the parity sync was FAR lower than it's been during the ongoing parity check ... and I suspect it will also be relatively low during the rebuild.
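
For anyone who wants to watch the live figures from the console while a check or sync runs, a rough sketch (this assumes unRAID's mdcmd helper is on the path; the exact mdResync* field names vary between versions):

 watch -n 5 'mdcmd status | grep -i resync; top -bn1 | head -5'
 # mdResyncPos shows progress; speed can be derived from the per-interval
 # deltas, and the top snippet shows CPU load alongside it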

 

CLEARLY there's something in the parity checks that has changed since v5 and is causing these issues.  It'd certainly be nice if Limetech would have an epiphany and realize what it is !!  One of those forehead-slapping moments when you suddenly realize what's going on  :)

 

