[Partially SOLVED] Is there an effort to solve the SAS2LP issue? (Tom Question)


TODDLT

Thanks for your report.  I modified my previous post to use '1b4b' instead - that's the proper vendor-id for that card.

 

Also, in your report notice the

 

Subsystem: Marvell Technology Group Ltd. Device 9480

 

That's what we're looking for.  9480 is the subdevice id of the 'older' cards.  The value 9485 is the 'new' subdevice id.  Maybe those are the ones with this issue?  Let's see...

Now that's interesting!  We recently had a user (3blackdots) with a 9485 that had the Marvell bug (no drives seen), post is here with my bug confirmation after it.  A few posts up, user opentoe had a 9480 where the drives showed up, but parity checks were very slow, summary post is here.

 

This seems like 2 strikes against Marvell now.  Would it be useful to get them involved?  Their reputation is at stake here, going to 'strike out' with unRAID users, if they can't give us a correction/patch or configuration change.

 

Ironic. Without even seeing these posts, I started a parity check earlier. I stopped it since it was going terribly slow: 60MB/sec or less, when it should be in the 100+ range on my system. I have two SAS2 cards, with all drives on those two cards, and I'm not using any of the mainboard's SATA connections. Right now I'm running that tunable script, so I don't want to touch the server. Both cards are attached to Norco 4224 backplanes. I would even buy a couple of cards from another brand, but if I do I would want compatible, well-performing cards. I would do it even just for test purposes, but I need to find the cards available somewhere online.

 

Are you on 6.1.1 version?

Link to comment
6.1.0 right now. Had really bad parity speeds, so right now I'm running that tunable script and selected FULLAUTO so that will take a couple of hours. My last parity took over 24 hours. I guess I'll see what happens after the script runs.

 

 

EDIT: From looking at the SSH screen, it doesn't look like I'm going to get any good speeds out of it. It started at 46MB/sec, went up to 60MB/sec and is already dropping as the md_sync window increases. I may just stop the tunable script; I doubt it is going to help me any. I was checking to see what controllers I could possibly get and test with, but the thread is from 2011, so it is a little outdated. Can anyone recommend a solid 8-port JBOD card (or cards) that doesn't show these slow parity check speeds?

 

Link to comment

I tried this on my system with the SAS2 issues.  No appreciable difference.  If you see above I posted my results in this thread.  I know all systems are different, but I think the major slowdown is something else.

Link to comment

I saw it. I may just stop the tunable script process. I always said I never had issues with my two SAS2 cards, which is partly correct. Even for a maintenance parity check I'd like to get good performance.

Link to comment

I did however find v6 beta1, 2 and 3 on the net, and in all three the SAS2LP is back to normal speed, so the change happened somewhere between beta 3 and beta 14. I can't find those versions, so if Mr-Hexen or anyone else can send me a link to the betas in between, so I can test and pinpoint the version where the issue begins, I will be very grateful.

 

Update:

 

Found v6 beta6, thanks google!

 

SAS2LP slowdown is present, so the issue was introduced in beta 4, 5 or 6

 

Any luck in finding beta-4 or beta5a? Going from the change summary, these changes might be possibilities to test out, even if they're a bit of a stretch [ http://dnld.lime-technology.com/stable/unRAIDServer-6.0.0-x86_64.txt ]:

 

Version 6.0-beta6 - linux: update to 3.15.0

Version 6.0-beta6 - CONFIG_IRQ_REMAP: Support for Interrupt Remapping

 

Version 6.0-beta4 - SCSI_MPT3SAS: LSI MPT Fusion SAS 3.0 Device Driver 

 

Version 6.0-beta1 - linux: version 3.10.24p x86_64

Link to comment

Awesome. Maybe this weird anomaly can be put to rest if we get together and work on it. I'm willing to help if anything is needed, but I only have my one main server running, so anything too invasive would be a risk. If any technical information is needed, just let me know. I let the tunable script continue, and from the look of the screen 60MB/sec may be the highest speed I will get with a parity check. I remember that on V5 I would get 111MB/sec with my parity checks.

 

Link to comment

What might be helpful is to post the output of this command:

 

lspci -vv -d 1b4b:*

 

This will list the details of the Marvell controller installed in your server.

 

There was an odd thing that happened with that card about 2 years ago.  During testing of a batch of about 10 cards, we came across one that would not be recognized by linux.  We set it aside and then encountered another one.  I think we ended up finding 3 or 4 cards that had this problem.  If only one or two cards I might have just RMA'ed them, but decided to look into it and discovered that the PCI subdevice ID was different.  Eventually found a patch as well.  (This patch has since been merged into linux upstream.)

 

Well the cards seemed to work ok with the motherboards we were using so we shipped 'em.

 

I've always wondered about that however.  Why would Marvell change the subdevice ID?  I have no idea.  But maybe we can discover a pattern of which controllers work and which suffer a slowdown.
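To look for such a pattern across reports, the slot, subsystem ID and bound driver can be pulled out of each lspci report. A minimal sketch, assuming the Subsystem line ends with the numeric ID (as in "Device 9480") and that a driver is bound to the card; the function name summarize_marvell is made up for illustration:

```shell
# Read `lspci -vv` output on stdin and print one line per controller:
#   <slot> <subsystem id> <kernel driver in use>
# Assumption: the Subsystem line ends with the numeric ID, e.g. "Device 9480".
# A controller with no "Kernel driver in use" line is not printed.
summarize_marvell() {
  awk '
    /^[0-9a-f]/             { slot = $1 }      # device header line, e.g. "01:00.0 RAID bus controller: ..."
    /Subsystem:/            { subsys = $NF }   # last field is the subsystem (subdevice) ID
    /Kernel driver in use:/ { print slot, subsys, $NF }
  '
}

# On a live server (not run here):
#   lspci -vv -d '1b4b:*' | summarize_marvell
```

Posting the one-line summary next to the measured parity check speed would make it easier to see whether 9480 and 9485 cards behave differently.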

 

For a completely different theory: besides going from 32-bit to 64-bit kernel, the other big change in unRaid is that we moved from non-preemptible to preemptible kernel.  Maybe this makes a difference?

 

Also, back in 6.0-beta6 we added this:

  CONFIG_SCSI_MVSAS_TASKLET: Support for interrupt tasklet (improves mvsas performance)

 

This is everything I know about this issue.  Needless to say, we cannot debug the mvsas driver code.  If we can find a pattern and/or find other postings on the 'net related to this issue, there might be a chance of a fix.

 

Here are my results.

 

root@SUN:~# lspci -vv -d 1b4b:*
01:00.0 RAID bus controller: Marvell Technology Group Ltd. 88SE9485 SAS/SATA 6Gb/s controller (rev c3)
        Subsystem: Marvell Technology Group Ltd. Device 9480
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 25
        Region 0: Memory at fb640000 (64-bit, non-prefetchable)
        Region 2: Memory at fb600000 (64-bit, non-prefetchable)
        Expansion ROM at fb660000 [disabled]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [70] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 256 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Latency L0 <512ns, L1 <64us
                        ClockPM- Surprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                        Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                        Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                        EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Capabilities: [140 v1] Virtual Channel
                Caps:  LPEVC=0 RefClk=100ns PATEntryBits=1
                Arb:    Fixed- WRR32- WRR64- WRR128-
                Ctrl:  ArbSelect=Fixed
                Status: InProgress-
                VC0:    Caps:  PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                        Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
                        Ctrl:  Enable+ ID=0 ArbSelect=Fixed TC/VC=01
                        Status: NegoPending- InProgress-
        Kernel driver in use: mvsas
        Kernel modules: mvsas

02:00.0 RAID bus controller: Marvell Technology Group Ltd. 88SE9485 SAS/SATA 6Gb/s controller (rev c3)
        Subsystem: Marvell Technology Group Ltd. Device 9480
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 26
        Region 0: Memory at fb540000 (64-bit, non-prefetchable)
        Region 2: Memory at fb500000 (64-bit, non-prefetchable)
        Expansion ROM at fb560000 [disabled]
        Capabilities: [40] Power Management version 3
                Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold-)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [70] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <1us, L1 <8us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 256 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 5GT/s, Width x8, ASPM L0s L1, Latency L0 <512ns, L1 <64us
                        ClockPM- Surprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
                        Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                        Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
                        EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
                AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
        Capabilities: [140 v1] Virtual Channel
                Caps:  LPEVC=0 RefClk=100ns PATEntryBits=1
                Arb:    Fixed- WRR32- WRR64- WRR128-
                Ctrl:  ArbSelect=Fixed
                Status: InProgress-
                VC0:    Caps:  PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                        Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
                        Ctrl:  Enable+ ID=0 ArbSelect=Fixed TC/VC=01
                        Status: NegoPending- InProgress-
        Kernel driver in use: mvsas
        Kernel modules: mvsas

 

 

Link to comment

 

Any luck in finding beta-4 or beta5a? Going from the change summary, these changes might be possibilities to test out, even if they're a bit of a stretch [ http://dnld.lime-technology.com/stable/unRAIDServer-6.0.0-x86_64.txt ]:

 

 

Not yet.

 

I'm able to get two Dell PERC H310's from work I think. Once I figure out how to flash them, I can just take out my two Supermicro cards, throw in the H310's and see if there is a difference right off the start. Is it hard to flash these cards?

Link to comment


I have a nearly identical setup to you and just bought 3 M1015 cards to replace my 2 SAS2 cards (which were almost full). I've had better experience than most with the SAS2LP cards, but it's at a point of minor issues (the last likely causing data issues as I got impatient), so I am done with the cards.

 

I found M1015 cards in Canada for $80US/piece so took the plunge. Even with the exchange rate it's still a pretty good deal I think.

Link to comment

Rebooted server. Shut off all VM's. Turned as much off as I could. Started a fresh parity check and only getting 56MB/sec. Going to take over a day. Since I don't perform parity checks monthly, never really worried about the speeds but that's pretty slow. I'll see if I can grab those Dell H310's from work. On the wiki it indicates they were crossflashed with LSI9211 but the Dell Perc H310 has an LSISAS2008 on it. Weird. Don't want to brick any cards.
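For a rough sense of what these speeds mean in wall-clock time, check duration is simply disk size divided by average speed. A minimal sketch, assuming a hypothetical 4 TB parity disk (no disk size is given in this thread), decimal units (1 TB = 1,000,000 MB) and a constant speed over the whole disk, which real checks do not have; parity_hours is a made-up helper name:

```shell
# Estimate parity check duration in hours.
#   $1 = parity disk size in TB (decimal), $2 = average speed in MB/sec
# Assumption: constant speed for the whole check, so this is only a ballpark.
parity_hours() {
  awk -v tb="$1" -v mbs="$2" 'BEGIN { printf "%.1f\n", tb * 1000000 / mbs / 3600 }'
}

parity_hours 4 56    # prints 19.8
parity_hours 4 111   # prints 10.0
```

Under those assumptions, dropping from the 111MB/sec seen on V5 to 56MB/sec roughly doubles the time a check takes.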

 

Link to comment

Under the WebGUI, Settings > Display Settings: do you guys all have "disable page updates while parity operation is running" checked off?

 

It's at the bottom of the page.

 

I turned off page frequency updating.

When I get those H310 cards from work, I'll slap them in the unraid box and perform a parity check. If the speeds are much different than 50-60MB/sec, then for sure it has something to do with the Supermicro cards and the later versions of unraid. I do remember using an older SAS card with version 5 and getting around 110MB/sec parity check speeds. And I thought Windows was the one with all these odd quirks.

 

 

 

Link to comment

 

I have just about every release ever made, but I don't have a public place to put them.  With some work and time, I could possibly set up an FTP location, but I would prefer something else.  If someone has a repository, I'm happy to copy them there, or another simple method, let me know.

Link to comment

So I decided to try and wrap my head around this issue as well.  I am not a slackware guy but a RedHat / CentOS guy by trade.  It looks like at least on my system there are modules for mpt3sas and mpt2sas but my system is not using them at all.  It is using the generic mvsas driver. 

 

 

root@Zeus:/dev# udevadm info -a -n /dev/sdf

 

Udevadm info starts with the device specified by the devpath and then
walks up the chain of parent devices. It prints for every device
found, all possible attributes in the udev rules key format.
A rule to match, can be composed by the attributes of the device
and the attributes from one single parent device.

  looking at device '/devices/pci0000:00/0000:00:02.0/0000:01:00.0/host7/port-7:2/end_device-7:2/target7:0:2/7:0:2:0/block/sdf':
    KERNEL=="sdf"
    SUBSYSTEM=="block"
    DRIVER==""
    ATTR{ro}=="0"
    ATTR{size}=="468862128"
    ATTR{stat}=="  411208  2337955 322828096  2362920  2521355  4666016 715467024 100873423        0 13123529 103234345"
    ATTR{range}=="16"
    ATTR{discard_alignment}=="0"
    ATTR{events}==""
    ATTR{ext_range}=="256"
    ATTR{events_poll_msecs}=="-1"
    ATTR{alignment_offset}=="0"
    ATTR{inflight}=="       0        0"
    ATTR{removable}=="0"
    ATTR{capability}=="50"
    ATTR{events_async}==""

  looking at parent device '/devices/pci0000:00/0000:00:02.0/0000:01:00.0/host7/port-7:2/end_device-7:2/target7:0:2/7:0:2:0':
    KERNELS=="7:0:2:0"
    SUBSYSTEMS=="scsi"
    DRIVERS=="sd"
    ATTRS{rev}=="BBF0"
    ATTRS{type}=="0"
    ATTRS{scsi_level}=="6"
    ATTRS{model}=="MKNSSDEC240GB   "
    ATTRS{state}=="running"
    ATTRS{queue_type}=="none"
    ATTRS{iodone_cnt}=="0x31c3ed"
    ATTRS{iorequest_cnt}=="0x3206bf"
    ATTRS{queue_ramp_up_period}=="120000"
    ATTRS{device_busy}=="0"
    ATTRS{evt_capacity_change_reported}=="0"
    ATTRS{timeout}=="30"
    ATTRS{evt_media_change}=="0"
    ATTRS{ioerr_cnt}=="0xc9"
    ATTRS{queue_depth}=="31"
    ATTRS{vendor}=="ATA     "
    ATTRS{evt_soft_threshold_reached}=="0"
    ATTRS{device_blocked}=="0"
    ATTRS{evt_mode_parameter_change_reported}=="0"
    ATTRS{evt_lun_change_reported}=="0"
    ATTRS{evt_inquiry_change_reported}=="0"
    ATTRS{iocounterbits}=="32"
    ATTRS{vpd_pg80}==""
    ATTRS{vpd_pg83}==""
    ATTRS{eh_timeout}=="10"

  looking at parent device '/devices/pci0000:00/0000:00:02.0/0000:01:00.0/host7/port-7:2/end_device-7:2/target7:0:2':
    KERNELS=="target7:0:2"
    SUBSYSTEMS=="scsi"
    DRIVERS==""

  looking at parent device '/devices/pci0000:00/0000:00:02.0/0000:01:00.0/host7/port-7:2/end_device-7:2':
    KERNELS=="end_device-7:2"
    SUBSYSTEMS==""
    DRIVERS==""

  looking at parent device '/devices/pci0000:00/0000:00:02.0/0000:01:00.0/host7/port-7:2':
    KERNELS=="port-7:2"
    SUBSYSTEMS==""
    DRIVERS==""

  looking at parent device '/devices/pci0000:00/0000:00:02.0/0000:01:00.0/host7':
    KERNELS=="host7"
    SUBSYSTEMS=="scsi"
    DRIVERS==""

  looking at parent device '/devices/pci0000:00/0000:00:02.0/0000:01:00.0':
    KERNELS=="0000:01:00.0"
    SUBSYSTEMS=="pci"
    DRIVERS=="mvsas"
    ATTRS{irq}=="18"
    ATTRS{subsystem_vendor}=="0x1b4b"
    ATTRS{broken_parity_status}=="0"
    ATTRS{class}=="0x010400"
    ATTRS{driver_override}=="(null)"
    ATTRS{consistent_dma_mask_bits}=="64"
    ATTRS{dma_mask_bits}=="64"
    ATTRS{local_cpus}=="3f"
    ATTRS{device}=="0x9485"
    ATTRS{enable}=="1"
    ATTRS{msi_bus}=="1"
    ATTRS{local_cpulist}=="0-5"
    ATTRS{vendor}=="0x1b4b"
    ATTRS{subsystem_device}=="0x9485"
    ATTRS{numa_node}=="0"
    ATTRS{d3cold_allowed}=="1"

  looking at parent device '/devices/pci0000:00/0000:00:02.0':
    KERNELS=="0000:00:02.0"
    SUBSYSTEMS=="pci"
    DRIVERS=="pcieport"
    ATTRS{irq}=="18"
    ATTRS{subsystem_vendor}=="0x1565"
    ATTRS{broken_parity_status}=="0"
    ATTRS{class}=="0x060400"
    ATTRS{driver_override}=="(null)"
    ATTRS{consistent_dma_mask_bits}=="32"
    ATTRS{dma_mask_bits}=="32"
    ATTRS{local_cpus}=="3f"
    ATTRS{device}=="0x5a16"
    ATTRS{enable}=="1"
    ATTRS{msi_bus}=="1"
    ATTRS{local_cpulist}=="0-5"
    ATTRS{vendor}=="0x1002"
    ATTRS{subsystem_device}=="0x1709"
    ATTRS{numa_node}=="0"
    ATTRS{d3cold_allowed}=="0"

  looking at parent device '/devices/pci0000:00':
    KERNELS=="pci0000:00"
    SUBSYSTEMS==""
    DRIVERS==""

 

 

This command will show you which driver is being used for your drives (replace the ? with your drive letter):    udevadm info -a -n /dev/sd?

 

or for the shortened version

udevadm info -a -n /dev/sdf | grep -oP 'DRIVERS?=="\K[^"]+'

sd
mvsas
pcieport
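To collect the same information for every drive on a card in one go, the filter can be wrapped in a small function and run in a loop. A minimal sketch; driver_chain is a made-up name, and it uses sed rather than grep -P so it does not depend on PCRE support:

```shell
# Read `udevadm info -a` output on stdin and print the non-empty driver names
# on one line, starting at the block device and walking up to the PCI bridge.
driver_chain() {
  sed -n 's/^[[:space:]]*DRIVERS\{0,1\}=="\([^"][^"]*\)".*/\1/p' | paste -sd' ' -
}

# On a live server (not run here); adjust the glob to the drives on your card:
#   for dev in /dev/sd[d-k]; do
#     printf '%s: ' "$dev"
#     udevadm info -a -n "$dev" | driver_chain
#   done
```

For the /dev/sdf output above this would give "sd mvsas pcieport" on a single line.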

 

The /dev/sdX names will obviously differ between systems; on mine, the 8-port SAS2LP holds /dev/sdd through /dev/sdk. I am curious to know whether beta3 loads the mvsas driver or something different.

 

Another side note: my card is the AOC-SAS2LP-MV8, based on the Marvell 9480 host controller.

 

I am also curious to know what unRAID5 loads.  I don't have a spare card and I am moving data around right now but I will try and load up 5 and see what driver gets loaded when / if I get a chance. 

 

Can you guys run this command on one of your drives attached to your SAS2LP and post the results?

Link to comment

Well at least we don't have to worry about the mpt3sas and mpt2sas drivers in unraid.  I don't think the SAS2LP cards use those drivers.  Mr Google has confirmed that the Supermicro AOC-SAS2LP-MV8 is supported in mvsas driver.  Heading in search of release notes for all the beta and rc 6.x to see if I can figure out when the driver changed.

Link to comment

I think this is the issue.  CONFIG_SCSI_MVSAS_TASKLET: Support for interrupt tasklet (improves mvsas performance) was introduced beta 6. 

 

johnnie.black, I'll bet that if you can get beta6 to test with, you will see the speed drop compared with the tests you did on beta3.

 

Tom, is there any way we can get a current build without this tweak to test?
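Before comparing builds, it may help to confirm how the option is set in a given kernel. A minimal sketch, assuming the kernel config text is available from somewhere (for example /proc/config.gz, which only exists when the kernel was built with CONFIG_IKCONFIG_PROC; whether unRAID builds expose it is not confirmed in this thread); kconfig_state is a made-up helper name:

```shell
# Read a kernel .config on stdin and report the state of one option:
#   prints the value (e.g. "y" or "m"), "not set", or "absent".
kconfig_state() {
  awk -v opt="$1" '
    $0 ~ "^" opt "="             { split($0, kv, "="); print kv[2]; found = 1; exit }
    $0 == "# " opt " is not set" { print "not set"; found = 1; exit }
    END { if (!found) print "absent" }
  '
}

# On a live server (not run here), if /proc/config.gz exists:
#   zcat /proc/config.gz | kconfig_state CONFIG_SCSI_MVSAS_TASKLET
```

Running this on a fast beta and a slow beta would show whether the tasklet option actually differs between them.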

Link to comment

One question to the masses: what FS are you running for your array storage and cache? I am running XFS for the array and 3 SSDs in a BTRFS raid pool. Just wondering whether this is an XFS file system issue or something else.

 

Checking out the release notes for the Linux kernel that was introduced in beta6 as well: 3.15.0. Maybe there is something in there that started to hose us.

Link to comment
