[Partially SOLVED] Is there an effort to solve the SAS2LP issue? (Tom Question)


TODDLT


Definitely interested in how the M1015s do. Until the recent spate of issues with the SAS2LPs, that's the card I'd have purchased if I were adding more SATA ports. But now I'm gun-shy about those cards and would likely avoid them, in favor of either the older SASLPs or the M1015.

 

Fortunately I don't need any additional ports at the moment ... although I may around Christmas time, as I plan to build a new server this winter.

 

Once I get the new cards flashed and installed I will run a new parity check and post the results. I am rather curious now to see the difference as well.

Link to comment

FTP site to dump old versions

 

host ftp.diggsnet.com

user [email protected]

pw unraid

 

 

EDIT: Also running mvsas

 

Tried your FTP server but only getting a hidden quota file.

 

I just looked and I see a whole bunch of uploads of the unRAID OS, so it is working. I'm doing this only until testing is over, of course, and with permission from LimeTech.

 

EDIT: Just saw Jonp's post saying this is temporary. Of course. If you need any particular files removed, or need it shut down right away, please just let me know.

 

Link to comment

Rebooted server. Shut off all VMs. Turned as much off as I could. Started a fresh parity check and I'm only getting 56MB/sec. It's going to take over a day. Since I don't perform parity checks monthly, I never really worried about the speeds, but that's pretty slow. I'll see if I can grab those Dell H310s from work. The wiki indicates they were crossflashed with LSI9211 firmware, but the Dell PERC H310 has an LSISAS2008 on it. Weird. I don't want to brick any cards.
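For anyone wanting to sanity-check the "over a day" figure, here is a minimal sketch of the arithmetic: a parity check reads the whole parity drive once, so the run time is roughly drive size divided by average speed. The 6 TB parity size is purely an assumption for illustration (the post does not state the drive size), and `parity_check_hours` is a hypothetical helper, not anything shipped with unRAID.

```python
def parity_check_hours(parity_size_tb: float, avg_speed_mb_s: float) -> float:
    """Rough hours needed to read a parity drive end to end at a steady average speed."""
    if avg_speed_mb_s <= 0:
        raise ValueError("average speed must be positive")
    # Decimal units, as drive vendors count them: 1 TB = 1,000,000 MB.
    total_mb = parity_size_tb * 1_000_000
    return total_mb / avg_speed_mb_s / 3600


if __name__ == "__main__":
    # ASSUMPTION: a 6 TB parity drive, chosen only to illustrate the calculation.
    for speed in (56, 100):
        print(f"{speed} MB/s -> {parity_check_hours(6, speed):.1f} h")
```

Real checks are not constant-speed (drives slow down toward the inner tracks, and smaller drives drop out along the way), so treat the result as a ballpark only.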

See this post to flash your H310 controller cards.

http://lime-technology.com/forum/index.php?topic=12767.msg259006#msg259006

 

With respect to my problem of reduced parity check speeds.

http://lime-technology.com/forum/index.php?topic=42629.msg406745#msg406745

 

I found a Core2Duo in my pool of spares and replaced the Celeron.

Fixed the problem, CPU usage is now at ~80% during parity check.

Speed is nice at around 100MB/s.

The C2D TDP is 65W vs. 35W of the Celeron but as this is my backup machine

I'm fine with that also.

Don't know if there is something like that, but the requirements for unRAID should

mention a multi-core processor as a "must".

 

Can you upload the files to flash the Dell H310 here? That download site is loaded with malware and advertisements that freak out my anti-virus, and I can't actually begin the download.

 

FTP site to dump old versions

 

host ftp.diggsnet.com

user [email protected]

pw unraid

Link to comment

If I can't get the two Dell H310s at work, they are on eBay brand new, shipped from the US by a reputable vendor. BTW, where are you getting the IBM M1015s from?

 

Link to comment

I am getting them from ebay:

 

http://www.ebay.ca/itm/IBM-M1015-46M0831-46C8933-PCI-e-Serial-ATA-300-SAS-LSI-9220-8I-LSI-9211-8I-/271971476950?hash=item3f52c365d6&autorefresh=true

 

I offered the guy $80 USD per card for 3 and he accepted. I only had to pay shipping once for the 3 cards, with tracking, so it ended up being $255 USD for the 3 cards. He was selling 7, and still has 3 available.

 

I am supposed to get them on Thursday, so I can confirm whether they are good once I have them in my hands. I know it's a risk (someone mentioned there are fake ones out there), but with eBay, PayPal and AMEX protection I figured I was covered in case something went wrong. :)

Link to comment

Does this issue affect only the SAS2LP cards? Not that I'm planning on rushing out and buying new controllers, but it is rather annoying. I started a parity check almost 30 hours ago and it isn't even half finished. What about the AOC-SASLP-MV8 controllers? On paper they look to be about the same unless I'm missing something.

 

Edit: Except for the speed. Missed that at first.

Link to comment

Others can likely comment better, but it appears to be primarily a SAS2LP issue. Some reported much better results with the SASLP, with the slowdowns appearing only after moving to the SAS2LP card.

 

Out of curiosity, are you on 6.1 or 6.1.1? There was a "fix" (a feature removal) in 6.1.1 that was supposed to help with the issue, but I didn't see any real difference personally. Then again, I appear to be at the high end of the scale, with minimal slowdown compared to many others.

Link to comment

6.1.1

Getting about 20MB/sec right now at 47% complete.

Link to comment

Is this issue new to 6.1.1, or were you having issues on 6.1 (or 6.0.1 if you were running that)?

Link to comment

Good deal. $80 for three, when he is selling them each for $95? Can't beat that. I'd rather have the SAS connectors at the rear of the card instead of pointing up in the air.

 

Link to comment

My SASLP works fine in V6. I was (am?) moving to the SAS2LP because the SASLP bogs down somewhat when fully loaded (I'm getting close). I was having some parity check speed issues, but I think they were CPU or drive related. I don't think you will have any issues with the SASLP until you have 7 or 8 drives connected, and even then it's still up in the 70s. Someone else can comment if they have seen otherwise.

Link to comment

I agree, but since I am in a Norco 4224 case I have a ton of head space, so am not too concerned there. I could see it being an issue with smaller cases though as those SAS connectors only have so much give and having to do a quick 90 degree turn in a small case could be frustrating.

 

And for clarity it's $80 per card, not $80 for three (just in case someone else tries to hit him up for some of the remaining and is expecting an insane deal). :)

 

Link to comment

I think this is a hard issue to get one's head around, because it seems to show up differently depending on your hardware configuration.

 

I'm wondering whether, as new features have been added to unRAID over the past year or two, the CPU power needed to run it well has gone up.

 

For example, I think there are a couple of people who had this issue, whose problems went away by swapping out a single core processor with a dual core one.  And if LT came out and said to me that this is what I needed to do in order to resolve the issue, I'd be fine with that.  Not ecstatically happy, but fine.  Other operating systems up their required system specs with new versions, why not unraid?

 

It also seems that more than parity check speeds need to be tested.  CPU utilization during those parity checks needs to be checked as well. 

 

My lowly single-core Celeron doesn't go above 50% utilization doing a parity check with 6.0B14, but with any version above that it gets pegged at 100%. A couple of people have mentioned parity check CPU utilization in the 80% range with a Core 2 Duo processor. In that scenario everything may seem fine because your processor is not getting maxed out, but is it really fine?

 

Link to comment

OK, I'm not sure how useful this is but I think it's noteworthy.

 

Earlier today I posted some copies of my system stats graphs during a V5 and V6 parity check with the SAS2LP installed. They did not show any appreciable difference in CPU usage, but the speeds in V6 were of course roughly half those in V5. The graphs also show the CPU cycling during a parity check in V6, but not in V5. However, it also did that at rest in V6, so I was assuming the cycling behavior had nothing to do with the card.

 

This evening I swapped back to the SASLP card and re-ran a parity check. The speeds went from the 30s to the 80s, which was expected. However: with the SAS2LP card in V6, the CPU utilization cycled from ~30% to ~65%. With the SASLP card in V6, CPU utilization was ~80% with spikes to 100%. When my first small (32 GB) drive drops out, you see a slight drop in CPU usage and a slight increase in speed to around 90.

 

So as speculated earlier, the CPU cycling may be connected to this issue, but my guess is that it has to do with how fast the data is getting off the card to the CPU: it's coming through in batches rather than as a smooth flow. Anyway, I'm above my pay grade and in speculation mode now, so I'll stick to providing what might be some useful information.

 

Bottom line:

 

V6 w/ SAS2LP:

Parity check in the 30s (MB/s)

CPU usage cycles from ~30% to ~65%.

 

V6 w/ SASLP:

Parity check in the 80s (MB/s)

CPU usage at a solid 80% with spikes to 100%.

 

Graphs below. In both you can see the slight change when the 32 GB SSD drops off.

[Image: 2015-09-07, V6 with SAS2LP, CPU usage during parity check]

[Image: 2015-09-07, V6 with SASLP, CPU usage during parity check]

Link to comment

What is the cpu governor?

 

On my system, cpufreq-info displays a lot of information. What may be interesting between the various V6 / V5 systems is what the existing cpu governor is.

 

~# cpufreq-info
cpufrequtils 008: cpufreq-info (C) Dominik Brodowski 2004-2009
Report errors and bugs to [email protected], please.
analyzing CPU 0:
  driver: acpi-cpufreq
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 10.0 us.
  hardware limits: 1.60 GHz - 3.40 GHz
  available frequency steps: 3.40 GHz, 3.40 GHz, 3.30 GHz, 3.10 GHz, 3.00 GHz, 2.90 GHz, 2.80 GHz, 2.60 GHz, 2.50 GHz, 2.40 GHz, 2.20 GHz, 2.10 GHz, 2.00 GHz, 1.90 GHz, 1.70 GHz, 1.60 GHz
  available cpufreq governors: conservative, userspace, powersave, ondemand, performance
  current policy: frequency should be within 1.60 GHz and 3.40 GHz.
                  The governor "ondemand" may decide which speed to use
                  within this range.
  current CPU frequency is 2.40 GHz (asserted by call to hardware).
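
To compare governors across several V5 and V6 systems without reading the full dump by eye, something like the following could pull out just the governor per CPU. This is a minimal sketch that assumes the output keeps the wording shown above (`analyzing CPU N:` and `The governor "..." may decide`); `governors_by_cpu` is a hypothetical helper name.

```python
import re
from typing import Dict


def governors_by_cpu(cpufreq_info_text: str) -> Dict[int, str]:
    """Map CPU number -> active governor, parsed from cpufreq-info style text."""
    governors: Dict[int, str] = {}
    cpu = None
    for line in cpufreq_info_text.splitlines():
        header = re.match(r"\s*analyzing CPU (\d+):", line)
        if header:
            cpu = int(header.group(1))
            continue
        # ASSUMPTION: the governor is reported with this exact phrasing.
        gov = re.search(r'The governor "([^"]+)" may decide', line)
        if gov and cpu is not None:
            governors[cpu] = gov.group(1)
    return governors


if __name__ == "__main__":
    sample = (
        "analyzing CPU 0:\n"
        "  driver: acpi-cpufreq\n"
        '                  The governor "ondemand" may decide which speed to use\n'
    )
    print(governors_by_cpu(sample))
```

On a live system the text could come from running `cpufreq-info` and capturing its output; posting just the resulting mapping alongside parity check speeds would make the systems easier to compare.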

Link to comment

In my experience there's no issue with the SASLP. Because it's a PCI-e x4 card, a fully loaded one is limited to about 75-80MB/s during a parity check.

The kind of speed you are seeing is more typical of a CPU-limited system, where the CPU stays pegged at 100% during a check. That shouldn't happen with a Haswell Xeon.
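
The bus limit mentioned above can be roughed out with a little arithmetic. This is a sketch under stated assumptions: that the SASLP is a PCIe 1.x x4 card, that the SAS2LP runs at PCIe 2.0 x8 (consistent with a 5GT/s, x8 link), and the usual nominal figures of about 250 MB/s per lane for PCIe 1.x and 500 MB/s per lane for PCIe 2.0. The helper name is hypothetical.

```python
# Nominal usable bandwidth per lane, per direction, after 8b/10b encoding.
PER_LANE_MB_S = {1: 250, 2: 500}


def per_drive_ceiling_mb_s(pcie_gen: int, lanes: int, drives: int) -> float:
    """Theoretical MB/s available to each drive when all drives read at once."""
    if drives <= 0:
        raise ValueError("need at least one drive")
    return PER_LANE_MB_S[pcie_gen] * lanes / drives


if __name__ == "__main__":
    # ASSUMPTION: SASLP = PCIe 1.x x4, SAS2LP = PCIe 2.0 x8, 8 drives on each card.
    print("SASLP :", per_drive_ceiling_mb_s(1, 4, 8), "MB/s per drive")
    print("SAS2LP:", per_drive_ceiling_mb_s(2, 8, 8), "MB/s per drive")
```

The theoretical figure for a full SASLP comes out above the 75-80MB/s seen in practice, which is expected once protocol and controller overhead are counted. The same arithmetic suggests the SAS2LP's bus is nowhere near being the limit, so its slowdown presumably lies elsewhere.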

 

Link to comment

I guess with all this excitement, I had a 2TB drive give me a big red X today. The drive was performing fine according to all the SMART reports. Since it is coming off a Supermicro SAS2 card, what I usually do is put it into a Windows external eSATA box and format it. Then I run a sector-by-sector check on it, which usually comes back with no problems, and reinstall it into unRAID. I have two drives currently in my box that I've done that with, and it has been 2 years with no problems. Why unRAID would red-ball them, who knows. So now my array is really vulnerable, but I think I'll get a new drive anyway for a backup while I do the Windows thing. Now I'm afraid to even touch my drives/array with the Supermicro cards in them. It is like "the ghost in the machine".

 

 

Link to comment

@opentoe, I experienced the same. Look at the mess here (partially self created): http://lime-technology.com/forum/index.php?topic=42594.msg407433#msg407433

 

Anyhow, I don't trust my array anymore. The more than 200 errors that the parity sync showed and corrected yesterday made me very nervous, and I really don't know which files were affected by that "correction". My main question: is my backup now being overwritten with corrupted files?

 

root@Tower:~# lspci -vv -d 1b4b:*
02:00.0 RAID bus controller: Marvell Technology Group Ltd. 88SE9485 SAS/SATA 6Gb/s controller (rev 03)
Subsystem: Marvell Technology Group Ltd. Device 9480
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 64 bytes
Interrupt: pin A routed to IRQ 17
Region 0: Memory at dfa40000 (64-bit, non-prefetchable) [size=128K]
Region 2: Memory at dfa00000 (64-bit, non-prefetchable) [size=256K]
Expansion ROM at dfa60000 [disabled] [size=64K]
Capabilities: [40] Power Management version 3
	Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold-)
	Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
	Address: 0000000000000000  Data: 0000
Capabilities: [70] Express (v2) Endpoint, MSI 00
	DevCap:	MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us
		ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
	DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
		RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
		MaxPayload 128 bytes, MaxReadReq 512 bytes
	DevSta:	CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr- TransPend-
	LnkCap:	Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Latency L0 <512ns, L1 <64us
		ClockPM- Surprise- LLActRep- BwNot-
	LnkCtl:	ASPM L0s Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
		ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
	LnkSta:	Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
	DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
	DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
	LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
		 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
		 Compliance De-emphasis: -6dB
	LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
		 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
Capabilities: [100 v1] Advanced Error Reporting
	UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
	UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
	UESvrt:	DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
	CESta:	RxErr+ BadTLP+ BadDLLP+ Rollover- Timeout+ NonFatalErr+
	CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
	AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
Capabilities: [140 v1] Virtual Channel
	Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
	Arb:	Fixed- WRR32- WRR64- WRR128-
	Ctrl:	ArbSelect=Fixed
	Status:	InProgress-
	VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
		Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
		Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=01
		Status:	NegoPending- InProgress-
Kernel driver in use: mvsas
Kernel modules: mvsas

 

I'm not affected by the speed drop. The parity check started at around 150MB/sec and finished yesterday at 105MB/sec.
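
When comparing `lspci -vv` dumps between affected and unaffected systems, two things are easy to miss in the wall of text: the negotiated link (`LnkSta`) and which status bits are set (a trailing `+`, for example on the `CESta` correctable-error line). The following is a minimal sketch for pulling those out, assuming the line layout matches the dump above; both function names are hypothetical.

```python
import re
from typing import List, Optional, Tuple


def link_status(lspci_text: str) -> Optional[Tuple[str, str]]:
    """Return (speed, width) from the LnkSta line, or None if it is absent."""
    m = re.search(r"LnkSta:\s*Speed\s+([^,\s]+),\s*Width\s+(x\d+)", lspci_text)
    return (m.group(1), m.group(2)) if m else None


def set_flags(lspci_text: str, register: str) -> List[str]:
    """List the flags marked '+' on the line for a register such as 'CESta'."""
    for line in lspci_text.splitlines():
        m = re.match(rf"\s*{re.escape(register)}:\s*(.*)", line)
        if m:
            return [tok[:-1] for tok in m.group(1).split() if tok.endswith("+")]
    return []


if __name__ == "__main__":
    sample = (
        "\tLnkSta:\tSpeed 5GT/s, Width x8, TrErr- Train- SlotClk+\n"
        "\tCESta:\tRxErr+ BadTLP+ BadDLLP+ Rollover- Timeout+ NonFatalErr+\n"
    )
    print(link_status(sample))
    print(set_flags(sample, "CESta"))
```

A set bit only says the condition was logged at some point since the register was last cleared, so on its own it does not prove an ongoing problem. It may still be worth noting whether the slow systems and the fast ones differ here.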

Link to comment

I see that you and bkastner, who appears to be only slightly affected, both have Xeons. I can't go and buy one just for testing, but if anyone reading this thread has a Xeon and slow parity check speeds with a SAS2LP, or normal speeds without a Xeon, please post here. Maybe we can find some logic to why only some users are affected.

Link to comment

I noticed that speeds had slowed to around 60MB/s on 6.0 & 6.0.1. Didn't install any of the RC's. Just upgraded to 6.1.1 on Sunday. First parity check finished last night with an average speed of 31.5 MB/s. Seems to really be slowing down read/write speeds as well.

Link to comment

Before you do anything you should post diagnostics. I had the same issue - one drive fell offline while doing a parity check testing for this issue. I ran SMART tests which were all fine on the drive, and it was suggested I remove the drive, start the array, stop the array and re-add the drive to get it rebuilt.

 

I posted diagnostics as requested, but got impatient and started the rebuild. The diagnostics ended up showing that the SAS2LP card had crashed and forced the drive offline. The rebuild also started "correcting parity".

 

It would have been better to re-add the drive and do a new config, but I had already started the rebuild. I ended up with 28 parity corrections, which may have written bad data to my drive instead of trusting the data that was on it. It's 4TB of TV shows, so hundreds of episodes, and I have no idea where the potential corruption is.

 

This was the final straw that caused me to look at another controller. I can handle slower parity checks (though again mine are not bad), but controller crashes, drives being thrown offline, potential data loss... these I can't handle.

Link to comment
