"Ideal SATA/SAS Controllers for Linux MD RAID"


This article might be interesting to some. Not all controllers are supported in Linux/unRAID, but there's lots of info to digest:

 

http://blog.zorinaq.com/?e=10

 

"From 16 to 2 ports: Ideal SATA/SAS Controllers for ZFS & Linux MD RAID

 

I need a lot of reliable and cheap storage space (media collection, backups). Hardware RAID tends to be expensive and clunky. I recognize quite a few advantages in ZFS on Solaris/FreeBSD, and Linux MD RAID:

 

   Performance. In many cases they are as fast as hardware RAID, and sometimes faster because the OS is aware of the RAID layout and can optimize I/O patterns for it. Indeed, even the most compute-intensive RAID5 or RAID6 parity calculations take negligible CPU time on a modern processor. For a concrete example, Linux 2.6.32 on a Phenom II X4 945 3.0GHz computes RAID6 parity at close to 8 GB/s on a single core (check dmesg: "raid6: using algorithm sse2x4 (7976 MB/s)"). So achieving a throughput of 500 MB/s on a Linux MD raid6 array requires spending roughly 6% of one core, or about 1.6% of this quad-core CPU's total time, computing parity. Now regarding the optimized I/O patterns, here is an interesting anecdote: one of the steps that YouTube took in its early days to scale its infrastructure up was to switch from hardware RAID to software RAID on its database server. They noticed a 20-30% increase in I/O throughput. Watch Seattle Conference on Scalability: YouTube Scalability @ 34'50".

   Scalability. ZFS and Linux MD RAID allow building arrays across multiple disk controllers, or multiple SAN devices, alleviating throughput bottlenecks that can arise on PCIe links or GbE links, whereas hardware RAID is restricted to a single controller, with no room for expansion.

   Reliability. No hardware RAID = one less hardware component that can fail.

   Ease of recoverability. The data can be recovered by putting the disks in any server. There is no reliance on a particular model of RAID controller.

   Flexibility. It is possible to create arrays on any disk on any type of controller in the system, or to move disks from one controller to another.

   Ease of administration. There is only one software interface to learn: zpool(1M)/zfs(1M) or mdadm(8). No need to install proprietary vendor tools, or to reboot into BIOSes to manage arrays.

   Cost. Obviously cheaper since there is no hardware RAID controller to buy.
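
The parity-cost estimate in the Performance item can be reproduced with a few lines of Python. This is only a sketch: the function names are hypothetical, and the model (array throughput divided by the benchmarked parity speed) is a simplifying assumption that ignores everything else the MD driver does.

```python
import re


def raid6_parity_speed(dmesg_line):
    """Extract (algorithm, MB/s) from a 'raid6: using algorithm ...' dmesg line."""
    match = re.search(r"raid6: using algorithm (\S+) \((\d+) MB/s\)", dmesg_line)
    if match is None:
        raise ValueError("not a raid6 benchmark line")
    return match.group(1), int(match.group(2))


def parity_cpu_share(array_mb_s, parity_mb_s, cores=1):
    """Fraction of total CPU time spent computing parity (simplified model)."""
    return array_mb_s / parity_mb_s / cores


# The dmesg line quoted in the article.
algo, speed = raid6_parity_speed("raid6: using algorithm sse2x4 (7976 MB/s)")
one_core = parity_cpu_share(500, speed)
quad_core = parity_cpu_share(500, speed, cores=4)
print(f"{algo}: {one_core:.1%} of one core, {quad_core:.1%} of a quad-core CPU")
```

Either way the cost is small next to what the disks themselves can deliver.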

 

Consequently, many ZFS and Linux MD RAID users, such as me, look for non-RAID controllers that are simply reliable, fast, cheap, and otherwise come with no bells and whistles. Most motherboards have up to 4 or 6 onboard ports (be sure to always enable AHCI mode in the BIOS as it is the best designed hardware interface that a chip can present to the OS to enable maximum performance), but for more than 4 or 6 disks, there are surprisingly few choices of controllers. Over the years, I have spent quite some time on the controller manufacturers' websites, the LKML, linux-ide and ZFS mailing lists, and have established a list of SATA/SAS controllers that are ideal, in my opinion, for ZFS or Linux MD RAID. I also included links to online retailers because some of these controllers are not that easy to find online.

 

The list contains SAS controllers because they are just as good an option as SATA controllers: many of them are as inexpensive as SATA controllers (even though they target the enterprise market), they are fully compatible with SATA 3Gbps and 6Gbps disks, and they support all the usual features: hotplug, queueing, etc. A SAS controller typically presents SFF-8087 connectors, also known as internal mini SAS, or even iPASS connectors. Up to 4 SATA drives can be connected to such a connector with an SFF-8087 to 4xSATA forward breakout cable (as opposed to reverse breakout). This type of cable usually sells for $15-30. Here are a few links if you have trouble finding them.

 

There are really only 4 significant manufacturers of discrete non-RAID SATA/SAS controller chips on the market: LSI, Marvell, JMicron, and Silicon Image. Controller cards from Adaptec, Areca, HighPoint, Intel, Supermicro, Tyan, etc, often use chips from one of these 4 manufacturers.

 

Here is my list of non-RAID SATA/SAS controllers, from 16-port to 2-port controllers, with the kernel driver used to support them under Linux and Solaris. There is also limited information on FreeBSD support. I focused on native PCIe controllers, with a single PCI-X exception (the very popular 88SX6081). The MB/s/port number in square brackets indicates the maximum practical throughput that can be expected from each SATA port, assuming concurrent I/O on all ports, given the bottleneck of the host link or bus (PCIe or PCI-X). I also assumed, for all PCIe controllers, that only 60-70% of the maximum theoretical PCIe throughput can be achieved, and for all PCI-X controllers, that only 80% of the maximum theoretical PCI-X throughput can be achieved on this bus. These assumptions concur with what I have seen in real-world benchmarks assuming a Max_Payload_Size setting of either 128 or 256 bytes for PCIe (which is very often the default), and a more or less default PCI latency timer setting for PCI-X. As of May 2010, modern disks can easily reach 120-130MB/s of sequential throughput at the beginning of the platter, so avoid controllers with a throughput of less than 150MB/s/port if you want to reduce the possibility of bottlenecks to zero.
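
Since the PCIe estimates above depend on the Max_Payload_Size in effect, it can be worth checking what a given card actually negotiated. The sketch below parses text in the style of `lspci -vv` output. The sample excerpt and the device name in it are made up for illustration, the function name is hypothetical, and the regular expression assumes the active setting is printed on a line of the form "MaxPayload N bytes, MaxReadReq M bytes", which may differ between lspci versions.

```python
import re

# Illustrative excerpt in the style of `lspci -vv`; not captured from a real system.
SAMPLE_LSPCI = """\
03:00.0 SATA controller: Example Vendor example 2-port controller
    Capabilities: [70] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0
        DevCtl: RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
            MaxPayload 128 bytes, MaxReadReq 512 bytes
"""


def active_max_payload(lspci_text):
    """Map each device address to its active Max_Payload_Size in bytes."""
    result = {}
    device = None
    for line in lspci_text.splitlines():
        # Device header lines start in column 0; detail lines are indented.
        if line and not line[0].isspace():
            device = line.split()[0]
        # The DevCap line (supported maximum) has no MaxReadReq field,
        # so this pattern only matches the currently active setting.
        match = re.search(r"MaxPayload (\d+) bytes, MaxReadReq", line)
        if match and device is not None:
            result[device] = int(match.group(1))
    return result


print(active_max_payload(SAMPLE_LSPCI))
```

On a real system the same function could be fed the output of `lspci -vv`, which usually has to be run as root for the capability details to be shown.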

16 ports

 

   [SAS] LSI SAS2116, 6Gbps, PCIe (gen2) x8 [150-175MB/s/port]

   Availability: $400 $510. LSI HBA based on this chip: LSISAS9200-16e, LSISAS9201-16i. [update 2010-10-27: only the model with external ports used to be available but now the one with internal ports is available and less expensive.]

       Linux support: mpt2sas (2.6.30+)

       Solaris support: mpt_sas (snv_137+)

 

8 ports

 

   [SAS] Marvell 88SE9485 or 88SE9480, 6Gbps, PCIe (gen2) x8 [300-350MB/s/port]

   Availability: $280. Areca HBA based on the 9480: ARC-1320. HighPoint HBA based on the 9485: RocketRAID 2720. Lots of bandwidth available to each port. However it is currently not supported by Solaris. I would recommend the LSI SAS2008 instead, which is cheaper, better supported, and provides just as much bandwidth.

       Linux support: mvsas (94xx: 2.6.31+, ARC-1320: 2.6.32+)

       Solaris support: not supported (see 88SE6480)

   [SAS] LSI SAS2008, 6Gbps, PCIe (gen2) x8 [300-350MB/s/port]

   Availability: $140 $150 $230 $250 $290. [update 2010-12-21: Intel HBA based on this chip: RS2WC080]. Supermicro HBAs based on this chip: AOC-USAS2-L8i AOC-USAS2-L8e (these are 2 "UIO" cards with the electronic components mounted on the other side of the PCB which may not be mechanically compatible with all chassis). LSI HBAs based on this chip: LSISAS9200-8e LSISAS9210-8i LSISAS9211-8i LSISAS9212-4i4e. Lots of bandwidth per port. Good Linux and Solaris support.

       Linux support: mpt2sas (2.6.30+)

       Solaris support: mpt_sas (snv_118+ or s10u8+)

   [SAS] LSI SAS1068E, 3Gbps, PCIe (gen1) x8 [150-175MB/s/port]

   Availability: $110 $120 $150 $150. Intel HBAs based on this chip: SASUC8I. Supermicro HBAs based on this chip: AOC-USAS-L8i AOC-USASLP-L8i (these are 2 "UIO" cards - see warning above.) [update 2010-10-27: LSI HBAs based on this chip: LSISAS3081E-R LSISAS3801E.] Can provide 150-175MB/s/port of concurrent I/O, which is good enough for HDDs (but not SSDs). Good Linux and Solaris support. This chip is popular because it has very good Solaris support and was chosen by Sun for their second generation Sun Fire X4540 Server "Thumper".

       Linux support: mptsas

       Solaris support: mpt

       FreeBSD support: mpt (supported at least since 7.3)

   [SATA] Marvell 88SX6081, 3Gbps, PCI-X 64-bit 133MHz [107MB/s/port]

   Availability: $100. Supermicro HBA based on this chip: AOC-SAT2-MV8. Based on PCI-X, which is an aging technology being replaced with PCIe. The approximate 107MB/s/port of concurrent I/O it supports is a bottleneck with modern HDDs. However this chip is especially popular because it has very good Solaris support and was chosen by Sun for their first generation Sun Fire X4500 Server "Thumper".

       Linux support: sata_mv (no suspend support)

       Solaris support: marvell88sx

       FreeBSD support: ata (supported at least since 7.0, if the hptrr driver is commented out)

   [SAS] Marvell 88SE6485 or 88SE6480, 3Gbps, PCIe (gen1) x4 [75-88MB/s/port]

   Availability: $100. Supermicro HBAs based on this chip: AOC-SASLP-MV8. The PCIe x4 link is a bottleneck for 8 drives, restricting the concurrent I/O to 75-88MB/s/port. A better and slightly more expensive alternative is the LSI SAS1068E.

       Linux support: mvsas (6485: 2.6.25 or 2.6.31 ?, 6480: 2.6.25+)

       Solaris support: not supported

 

4 ports

 

   [SAS] LSI SAS2004, 6Gbps, PCIe (gen2) x4 [300-350MB/s/port]

   Availability: $160. LSI HBA based on this chip: LSISAS9211-4i. Quite expensive. I would recommend either buying two 2-port controllers instead, or a (cheaper!) 8-port controller.

       Linux support: mpt2sas (2.6.30+)

       Solaris support: mpt_sas (snv_118+ or s10u8+)

   [SAS] LSI SAS1064E, 3Gbps, PCIe (gen1) x8 [300-350MB/s/port]

   Availability: $120 $130. Intel HBA based on this chip: SASWT4I. [update 2010-10-27: LSI HBA based on this chip: LSISAS3041E-R.] Also quite expensive. I recommend instead buying two 2-port, or one cheaper 8-port controller.

       Linux support: mptsas

       Solaris support: mpt

       FreeBSD support: mpt (supported at least since 7.3)

   [SAS] Marvell 88SE6445 or 88SE6440, 3Gbps, PCIe (gen1) x4 [150-175MB/s/port]

   Availability: $80. Areca HBA based on the 6440: ARC-1300. Adaptec HBA based on the 6440: ASC-1045/1405. Provides good bandwidth at a decent price.

       Linux support: mvsas (6445: 2.6.25 or 2.6.31 ?, 6440: 2.6.25+, ARC-1300: 2.6.32+)

       Solaris support: not supported (see 88SE6480)

   [SATA] Marvell 88SX7042, 3Gbps, PCIe (gen1) x4 [150-175MB/s/port]

   Availability: $70. Adaptec HBA based on this chip: AAR-1430SA. Rosewill HBA based on this chip: RC-218. This is the only 4-port SATA controller supported by Linux providing acceptable throughput to each port. [2010-05-30 update: I bought one for $50 from Newegg in October 2009. Listed at $70 when I wrote this blog. Currently out of stock and listed at $90. Its popularity is spreading...]

       Linux support: sata_mv (no suspend support)

       Solaris support: no - marvell88sx does not support it

   [SAS] Marvell 88SE6340, 3Gbps, PCIe (gen1) x1 [38-44MB/s/port]

   Hard to find. Only found references to this chip on Marvell's website. Performance is low anyway (38-44MB/s/port).

       Linux support: mvsas

       Solaris support: not supported (see 88SE6480)

   [SATA] Marvell 88SE6145 or 88SE6141, 3Gbps, PCIe (gen1) x1 [38-44MB/s/port]

   Hard to find. Chip seems to be mostly found on motherboards for onboard SATA. Performance is low anyway (38-44MB/s/port).

       Linux support: ahci

       Solaris support: ahci

       FreeBSD support: ahci

 

2 ports

 

   [SATA] Marvell 88SE9128 or 88SE9125 or 88SE9120, 6Gbps, PCIe (gen2) x1 [150-175MB/s/port]

   Availability: $25 $35. HighPoint HBA based on this chip: Rocket 620. LyCOM HBA based on this chip: PE-115. Koutech HBA based on this chip: PESA230. This is the only 2-port chip on the market with no bottleneck caused by the PCIe link at Max_Payload_Size=128. Pretty surprising that it is being sold for such a low price.

       Linux support: ahci

       Solaris support: not supported [update 2010-09-21: Despite being AHCI-compliant, this series of chips seems unsupported by Solaris according to reader comments, see below.]

       FreeBSD support: ahci

   [SATA] Marvell 88SE6121, 3Gbps, PCIe (gen1) x1 [75-88MB/s/port]

   Hard to find. Chip seems to be mostly found on motherboards for onboard SATA.

       Linux support: ahci

       Solaris support: ahci

       FreeBSD support: ahci

   [SATA] JMicron JMB362 or JMB363 or JMB366, 3Gbps, PCIe (gen1) x1 [75-88MB/s/port]

   Availability: $22.

       Linux support: ahci

       Solaris support: ahci

       FreeBSD support: ahci

   [SATA] SiI3132, 3Gbps, PCIe (gen1) x1 [75-88MB/s/port]

   Availability: $20. Warning: the overall bottleneck of the PCIe link is 150-175MB/s, or 75-88MB/s/port, but the chip also has a 110-120MB/s bottleneck per port. So a single SATA device on a single port cannot fully use the 150-175MB/s by itself; it will be bottlenecked at 110-120MB/s.

       Linux support: sata_sil24

       Solaris support: si3124 - snv_135 and older unable to cope with bus reset
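
The SiI3132 warning combines two limits: each active port gets its share of the PCIe link, but never more than the chip's own per-port cap. A minimal sketch of that reasoning, with a hypothetical function name and the figures taken from the warning:

```python
def per_port_throughput(link_mb_s, chip_cap_mb_s, active_ports):
    """Throughput per active port: link share, capped by the chip's per-port limit."""
    if active_ports < 1:
        raise ValueError("at least one port must be active")
    return min(chip_cap_mb_s, link_mb_s / active_ports)


# Low and high ends of the ranges quoted for the SiI3132 (MB/s).
for link, cap in ((150, 110), (175, 120)):
    for ports in (1, 2):
        rate = per_port_throughput(link, cap, ports)
        print(f"link {link}, cap {cap}, {ports} active port(s): {rate:g} MB/s per port")
```

With one active port the chip cap dominates; with both ports busy the PCIe link does.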

 

Finding cards based on these controller chips can be surprisingly difficult (I have had to zoom on product images on newegg.com to read the inscription on the chip before buying), hence the reason I included some links to online retailers.

 

For reference, the maximum practical throughputs per port I assumed have been computed with these formulas:

 

   For PCIe gen2: 300-350MB/s (60-70% of 500MB/s) * pcie-link-width / number-of-ports

   For PCIe gen1: 150-175MB/s (60-70% of 250MB/s) * pcie-link-width / number-of-ports

   For PCI-X 64-bit 133MHz: 853MB/s (80% of 1066MB/s) / number-of-ports
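
These formulas are easy to turn into a small calculator for checking the bracketed figures in the list, or for estimating a card that is not listed. The sketch below is only an illustration: the function and constant names are hypothetical, and the efficiency factors are the 60-70% and 80% assumptions stated earlier.

```python
import math

PCIE_LANE_MB_S = {1: 250, 2: 500}   # theoretical MB/s per lane for gen1 and gen2
PCIE_EFFICIENCY_PCT = (60, 70)      # assumed achievable share of the PCIe link
PCIX_64_133_MB_S = 1066             # theoretical MB/s of PCI-X 64-bit 133MHz
PCIX_EFFICIENCY_PCT = 80            # assumed achievable share of the PCI-X bus


def round_half_up(value):
    """Round to the nearest integer, with .5 going up."""
    return math.floor(value + 0.5)


def pcie_per_port(gen, lanes, ports):
    """(low, high) practical MB/s per port with concurrent I/O on all ports."""
    lane = PCIE_LANE_MB_S[gen]
    return tuple(
        round_half_up(lane * pct * lanes / (100 * ports))
        for pct in PCIE_EFFICIENCY_PCT
    )


def pcix_per_port(ports):
    """Practical MB/s per port on a PCI-X 64-bit 133MHz bus."""
    return round_half_up(PCIX_64_133_MB_S * PCIX_EFFICIENCY_PCT / (100 * ports))


print(pcie_per_port(gen=2, lanes=8, ports=16))  # LSI SAS2116
print(pcie_per_port(gen=1, lanes=4, ports=8))   # Marvell 88SE6485 / 88SE6480
print(pcix_per_port(ports=8))                   # Marvell 88SX6081
```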

 

To anyone building ZFS or Linux MD RAID storage servers, I recommend first making use of all onboard AHCI ports on the motherboard. Then put any extra disks on a discrete controller; I specifically recommend these:

 

   For a 2-port controller: Marvell 88SE9128 or 88SE9125 or 88SE9120. I recommend it not primarily because it is SATA 6Gbps, but because it supports PCIe gen2, which allows the controller to handle an overall throughput of at least 300-350MB/s, or 150-175MB/s/port, with a default PCIe Max_Payload_Size setting of 128 bytes. It is also fully AHCI compliant, in other words robust, well-designed, and compatible with virtually all operating systems; a notable exception is Solaris, for which I recommend instead the next best controller: JMicron JMB362 or JMB363 or JMB366. The icing on the cake is that cards using these chips are inexpensive (starting from $22, or $11/port).

   For an 8-port controller: LSI SAS1068E. Controllers based on this chip can be found inexpensively (starting from $110, or $13.75/port) and are supported out of the box by many current and older Linux and Solaris versions. In fact this chip is the one that Sun used in their second generation Sun Fire X4540 Server "Thumper". Its limit of 150-175MB/s/port with concurrent I/O on all ports, imposed by the PCIe bottleneck, is sufficient for current disks. However if you need more throughput (e.g. you are using SSDs), then go for its more expensive successor, LSI SAS2008, which supports PCIe gen2 and should allow for 300-350MB/s/port before hitting the PCIe bottleneck."
