SSD-based unRAID install



I need to store a large number of images and Photoshop files on SSDs. Large batch jobs are run against these images, which causes quite a bit of disk thrashing and generally runs very slowly on a spinning drive.  Using an SSD makes a drastic difference, but there is no real requirement for performance beyond what one SSD will typically deliver, so using some form of striped RAID array wouldn't provide much benefit.

 

Will unRAID work in an all-SSD arrangement?  An SSD cache to absorb writes and a set of 3-8 SSDs for the storage?  I really like unRAID over the other options as long as it can perform well.  "Well" = performance equivalent to a single SSD, on a NAS that is fast enough not to introduce a bottleneck itself.

Link to comment

Someone here, Weeboo I think, actually built a test server with SSDs. As I recall it FLEW.  unRAID would work very well with SSDs, and the best part is you could probably get away without a cache drive at all. First, a cache drive would mean a LOT of extra write/erase cycles for an SSD (though this says you shouldn't worry: http://techreport.com/review/27436/the-ssd-endurance-experiment-two-freaking-petabytes), but second, without the latency of a spinner, writing to the parity array should still be blazing fast (and I believe that was Weebo's experience).

 

The md driver that creates the unified share structure does impart a small penalty, but it should be nominal.  Your biggest bottleneck will for sure be the network; even a gig-E network will really choke things off. That is, unless you plan to run these batch jobs on the unRAID box itself, bare metal, in Docker, or in a VM?

 

If you happen to have at least two SSDs sitting around, why not set up a quick array and see how it performs with your typical operations?

Link to comment

Someone here, Weeboo I think, actually built a test server with SSDs. As I recall it FLEW.

 

That was BubbaQ.

The parity bottleneck still exists, and then you have the network on top of that.

Now that was years ago and technology is better today.

 

So far,

I can burst at a pretty decent clip with spinners. With a couple of kernel tunings and the right parity/data drives, I sometimes burst at 90MB/s.

 

That being said, you can use an SSD as a cache/work path and let the mover move things to spinners, or do what I've done in the past:

Leave the data on a mounted SSD, but rsync it to the array on a schedule in an incremental-update fashion.
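A minimal sketch of that kind of scheduled sync; the mount point and share names here are hypothetical, not taken from my setup:

# Hypothetical paths: /mnt/ssd is the standalone SSD mount,
# /mnt/user/photos is the parity-protected share.
# -a preserves attributes; only new/changed files are copied on each run.
rsync -a /mnt/ssd/photos/ /mnt/user/photos/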

 

These are my kernel tunings in my go script (I have 4GB of RAM).

 

sysctl vm.vfs_cache_pressure=10
sysctl vm.swappiness=100
sysctl vm.dirty_ratio=20
# (you can set it higher as an experiment).
# sysctl vm.min_free_kbytes=8192
# sysctl vm.min_free_kbytes=65535
sysctl vm.min_free_kbytes=131072
sysctl vm.highmem_is_dirtyable=1

 

vm.highmem_is_dirtyable=1 had the biggest effect, as it allowed the buffer cache to be utilized for bursting.

This requires the machine to be on a UPS.

 

I use the 3TB Seagate 7200 RPM drives, as they are among the fastest cheap drives. They get about 190MB/s on the outer tracks.

The new Hitachi 6TB drives get around 225MB/s on the outer tracks.

 

Yes, you can use SSDs, but you are still limited by the network speed.

Link to comment

And SSD parity would still add some bottleneck, sure, but nothing like what a spinner adds by virtue of having to read, spin 360, write.  That "spin 360" is orders of magnitude faster on an SSD.  Another thing he could do is operate in reconstruct-write mode (note to self: bug limetech to add this to the GUI!!!). For spinners, people might not want to operate like that all the time; for an SSD array there would be no reason not to, just like there'd be no reason to ever spin down the array either.
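For reference, reconstruct-write can already be toggled from the console with the same mdcmd command used in the cron example later in this thread:

/root/mdcmd set md_write_method 1    # reconstruct-write ("turbo write") on
/root/mdcmd set md_write_method 0    # back to the normal read/modify/write method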

 

Come on man, we all want to see what a full array of Samsung 840 512GB drives would do ;-)

 

But yes, in the end, he's going to be limited by the network unless he is using some form of virtualization to run his batch processes on his server.

 

OP: take note, that is the real question you should be asking if you haven't already. No matter how fast you make your mass storage, it will be hamstrung by even a 2x bonded gig-E network. May we ask what your operating environment is?

Link to comment

And SSD parity would still add some bottleneck, sure, but nothing like what a spinner adds by virtue of having to read, spin 360, write.  That "spin 360" is orders of magnitude faster on an SSD.

 

According to BubbaQ's test, it was not cost-effective, nor was the throughput as high as expected.

 

Another thing he could do is operate in reconstruct-write mode (note to self: bug limetech to add this to the GUI!!!). For spinners, people might not want to operate like that all the time; for an SSD array there would be no reason not to, just like there'd be no reason to ever spin down the array either.

 

Turbo write does actually improve throughput, unless you are writing to multiple drives at the same time.

Then the 'other' writes get in the way of the reads.

 

(note to self: bug limetech to add this to the GUI!!!).

 

I rsync a file called md_write_method into /etc/cron.d from my /boot/local/etc/cron.d/

 

30 08 * * * /root/mdcmd set md_write_method 1
30 23 * * * /root/mdcmd set md_write_method 0
#
# * * * * * <command to be executed>
# | | | | |
# | | | | +---- Day of the Week   (range: 0-6, 0 standing for Sunday)
# | | | +------ Month of the Year (range: 1-12)
# | | +-------- Day of the Month  (range: 1-31)
# | +---------- Hour              (range: 0-23)
# +------------ Minute            (range: 0-59)

 

go script commands:

 

rsync -rv /boot/local/etc/cron.d/ /etc/cron.d/
find /etc/cron.d -type f -exec chmod a-wx {} \;

 

But yes, in the end, he's going to be limited by the network unless he is using some form of virtualization to run his batch processes on his server.

 

OP: take note, that is the real question you should be asking if you haven't already. No matter how fast you make your mass storage, it will be hamstrung by even a 2x bonded gig-E network. May we ask what your operating environment is?

 

Even with bonding, it will still be limited to Gigabit Ethernet.

So instead of one stream maxing out at 90MB/s, I think you can get two streams each maxing out at 90MB/s.

 

Depending on the size of these images and how they are to be processed, regular spinners and a lot of RAM in the server may still work fine.

I edit MP3s all day long, and I can say that modifying them locally on an SSD versus over the network still shows a radical difference in speed.

 

If the SSDs are available, I would say try it out.

 

I might try these experiments:

 

1. Kernel tunings on spinners (using high-quality, fast spinners).

 

2. High-quality, fast spinner for parity with turbo write enabled; data on SSDs.

The spinner will be the bottleneck here; see my prior post. Even if I had 512GB SSDs, I might invest in a large, fast spinner just for the extra oomph.  226MB/s on a spinner is no slouch, the fastest I've benchmarked yet.  Plus it sets the stage for other spinners and archiving. Since these are images and files being actively worked on, you can use rsync with --link-dest=DIR to keep incremental daily backups of the files on a protected storage array (see the sketch after this list).

 

3. SSD for parity with turbo write enabled; data on SSDs.

I would get a very high-quality SSD. The most recent Samsung EVOs have up to 1GB of DRAM cache on the drive for the 500GB models and up.

 

4. SSD for parity, no turbo write; data on SSDs.
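A minimal sketch of the --link-dest rotation mentioned in experiment 2; the paths and share names are hypothetical. Unchanged files are hard-linked against the previous day's snapshot, so each daily backup only consumes space for what actually changed:

# Hypothetical layout: working files on /mnt/ssd/work,
# daily snapshots under /mnt/user/backup/YYYY-MM-DD on the protected array.
TODAY=$(date +%F)
YESTERDAY=$(date -d yesterday +%F)
rsync -a --link-dest=/mnt/user/backup/$YESTERDAY \
    /mnt/ssd/work/ /mnt/user/backup/$TODAY/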

 

So, looking at the original post, you could mount an SSD as a workspace and, when you're finished with a project, move it to the protected array.

 

 

There are also cache-only shares.

You can work on the SSD cache-only share and then move the data to another share that is not cache-only, still on the same SSD; that should be a very fast operation. After that, it gets moved to the protected array by the nightly mover.
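As a rough illustration of that workflow, with hypothetical share names, a move between two top-level directories on the same cache SSD is essentially a rename:

# 'scratch' is a cache-only share; 'photos' is a normal share that uses the cache.
# Both live on /mnt/cache, so this completes almost instantly; the nightly
# mover later migrates the 'photos' data to the protected array.
mv /mnt/cache/scratch/finished-project /mnt/cache/photos/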

Link to comment

I can't think of any reason it shouldn't work VERY well -- and certainly don't see any reason for a cache drive.

 

An all-SSD array using modern high-performance SATA-3 SSDs could easily complete the four disk operations of a parity write quickly enough to keep up with a Gb network ... so parity shouldn't be a bottleneck at all.

 

Simple enough to test -- pop in 2 good SSDs and set up a basic UnRAID server with them ... then see what kind of performance you get.  In fact, I've got a pair of Crucial MX100's en route for a system I'm building next week ... if I get a chance (and remember to do it), I'll set up a quick UnRAID boot with them before I load the system and see what kind of speeds I get across my network and what kind of parity check speed it yields.

 

Link to comment

Throughput isn't really a requirement. The batch job opens a bunch of <1MB images, processes them, and spits out a couple of ~100KB images for each source file, all to/from the same share.  Latency is the real killer here, and the latency of rotational disks really slows things down.  I'm also looking to improve indexing and thumbnailing speed when using Adobe Bridge against a store of several million images; again, not really throughput, more latency.  I don't think unRAID supports SMB3 yet, which would help greatly with transaction chattiness and throughput/latency versus SMB2, but SMB2 over 1Gb should be sufficient.

 

The workstation has a 1Gb connection, so that is the throughput bottleneck anyway.  Throughput would only come into play when copying mass data in/out of the shares... which is done from the workstation and in a sequential manner.

 

Local SSD would of course be best, but I would be stuck using consumer SSDs on a hardware RAID card, and I don't really want to do that.  Or a NAS with a traditional RAID 10 or something over CIFS, but that is costly.  Or even go as far as a higher-end NAS with an iSCSI or FC block target, but that is complicated.  In reality I only need the performance/latency/throughput of a single SSD, but the redundancy and space of some sort of RAID array.

Link to comment

Latency is the real killer here, and the latency of rotational disks really slows things down.

 

cache_dirs can help here along with the kernel tunings I use.

The tunings help with buffering writes.

 

I'm also looking to improve indexing and thumbnailing speed when using Adobe Bridge against a store of several million images; again, not really throughput, more latency.  I don't think unRAID supports SMB3 yet, which would help greatly with transaction chattiness and throughput/latency versus SMB2, but SMB2 over 1Gb should be sufficient.

 

This is where unRAID and reiserFS tend to fail.

That's a lot of files, so having them on SSD could help a great deal, since walking down the directory tree would be faster than on a spinner.

 

I also think unRAID 6 might help on a machine with a lot of RAM, i.e. more kernel data such as inode/dentry structures can be cached in RAM.

 

I'm excited to see your results from testing this on SSDs.

 

Here's some more reading, and BubbaQ's experiment:

http://lime-technology.com/forum/index.php?topic=23539.msg207541#msg207541

Link to comment

I'm also looking to improve indexing and thumbnailing speed when using Adobe Bridge against a store of several million images; again, not really throughput, more latency.  I don't think unRAID supports SMB3 yet, which would help greatly with transaction chattiness and throughput/latency versus SMB2, but SMB2 over 1Gb should be sufficient.

 

This is where unRAID and reiserFS tend to fail.

That's a lot of files, so having them on SSD could help a great deal, since walking down the directory tree would be faster than on a spinner.

 

What about xfs?

Link to comment

I'm also looking to improve indexing and thumbnailing speed when using Adobe Bridge against a store of several million images; again, not really throughput, more latency.  I don't think unRAID supports SMB3 yet, which would help greatly with transaction chattiness and throughput/latency versus SMB2, but SMB2 over 1Gb should be sufficient.

 

This is where unRAID and reiserFS tend to fail.

That's a lot of files, so having them on SSD could help a great deal, since walking down the directory tree would be faster than on a spinner.

 

What about xfs?

 

 

I have not tried it yet, so I cannot speak from experience.

Link to comment

As a JBOD for unRAID?  Or the workstation itself?

 

The workstation is an X9SRA board with an E5-1620 in it, so I have plenty of options for commercial-grade storage controllers in the workstation itself if I want to go that way.

 

 

JBOD, or a RAID0 array of mSATAs for size/speed (since that's what we were talking about). Looks like a neat way to get compact, high-speed storage.

At least for the working set of data.  If I had all the machines I used to have, I would use this card for my central syslog server. In days gone by I used an i-RAM card for the central syslog server; it worked great.

Link to comment

FYI, This might be an interesting product to test with.

 

 

Quad mSATA PCIe SSD

http://www.addonics.com/products/ad4mspx2.php

 

It would certainly be a very space-efficient way to put 4 SATA-3 mSATA SSDs in a system.  However, the SSDs are still limited to SATA-3 speeds.  If the goal is SPEED, it would be neat to see what could be done with a few PCIe-interfaced SSDs ... but that would be VERY expensive to set up.  The Intel Fultondale series units are VERY fast, but are also big-time pricey ($900 for a 400GB unit up to $6500 for a 2TB unit).  A motherboard with a pair of M.2 slots (there are a few) could be used for a pair of PCIe M.2 units, which are much less expensive ... a Plextor M6e 512GB unit is ~$400.  But with M.2 units you'd be limited to just a 2-drive system, plus any "standard" SSDs you installed via SATA ports or additional PCIe units in the PCIe slots.

 

Link to comment

While the mSATA drives themselves are limited to SATA III, that's pretty fast. On my HP MicroServer with one SATA III card I was getting 450MB/s.

If you RAID0 them on the card, I'm sure it will be even faster, since the card can accept data at PCIe x4 Gen 2.0 speed.

 

I posted it because it seems like a neat, compact way to set up an SSD array without other supporting hardware.

It does RAID.

Someone in another thread posted that the Marvell chipset this card uses was detected by unRAID both as individual drives and as a RAID array.

Keep in mind that using this card would be blazing new territory, but chances are good that it would work.

 

Using the 1TB mSATA cards, there could be 4TB in a single slot.

 

There are some other cool devices to get the density higher.

I have both of these, although I do not use them in RAID, and using them as two single drives requires port multiplier (PMP) support.

 

StarTech.com Dual mSATA SSD to 2.5-Inch SATA RAID Adapter Converter (25SAT22MSAT)

http://www.amazon.com/o/ASIN/B00ITJ7WDC?tag=e0495-20

 

2 x 2.5" Drive to 3.5" Bay KINGWIN HDCV-2 Dual 2.5" to 3.5" SATA HDD Converter w/ Raid

http://www.newegg.com/Product/Product.aspx?Item=N82E16817990020

 

If I had a larger unRAID server I probably would use the Addonics card as my cache/scratch pad.

 

SanDisk also has a neat external 2.5" SAS array chassis.

Link to comment

As long as unRAID doesn't add too much overhead, I think I am going to go that route, with an SSD-only array and an SSD cache drive to handle any writes.  The 6.0 beta looks to be adding the SMB3 protocol, which helps a great deal with small files by reducing the amount of chatter that SMB2 has.

 

The batch process is currently running off a 7200 RPM-based unRAID array and it works, it's just slow.  If it tests out, I'll get a Norco 2132 or 4164 chassis and run it as a slave chassis off the main one; I already have a large handful of 2.5" drives I could move to it, plus a bunch of 10K SAS 2.5" drives I wouldn't mind using.

 

If unRAID doesn't test out, I'll get a hardware RAID card to run the SSDs on and spin up a Windows Server VM to host the share.  I have a couple of HP P410s lying around; they're not ideal for SSDs, but I already have them.

 

The StarTech card does look very cool.  Maybe for a future RAID0 scratch disk for Photoshop or something.  The workstation already has 32GB (I'm looking at doubling that), Photoshop uses most of it, and it still writes a huge amount of data to the scratch disk.

Link to comment

sysctl vm.highmem_is_dirtyable=1

error: "vm.highmem_is_dirtyable" is an unknown key

 

Should that work from the command line (all the other lines work)? Does unRAID 6 (64-bit) have highmem?

 

I suppose not. This is more geared to unRAID 5, since it's 32-bit.  For years I could not understand why my extra memory wasn't being used for cache buffering on writes. Once I enabled this parameter, it started working as I had expected.

 

With 64-bit unRAID, I guess this parameter is of no use.

 

root@unRAIDb:/# sysctl -A | grep dirt
error: permission denied on key 'net.ipv4.route.flush'
error: permission denied on key 'vm.compact_memory'
vm.dirty_background_bytes = 0
vm.dirty_background_ratio = 10
vm.dirty_bytes = 0
vm.dirty_expire_centisecs = 3000
vm.dirty_ratio = 20
vm.dirty_writeback_centisecs = 500

Link to comment

sysctl vm.highmem_is_dirtyable=1

error: "vm.highmem_is_dirtyable" is an unknown key

 

Should that work from the command line (all the other lines work)? Does unRAID 6 (64-bit) have highmem?

 

I suppose not. This is more geared to unRAID 5, since it's 32-bit.  For years I could not understand why my extra memory wasn't being used for cache buffering on writes. Once I enabled this parameter, it started working as I had expected.

 

With 64-bit unRAID, I guess this parameter is of no use.

 

So for cache buffering, should one use all the other lines in go, or not use any of them?

 

Cheers,

 

Neil.

Link to comment

I do not think you need any of them on unRAID 6.

 

You can try adjusting vm.vfs_cache_pressure to prefer keeping dentry/inode cache structures in memory longer.

This is something you have to test with your load. I've always tuned this down a bit to prefer dentry/inode structures over cache buffers.

 

sysctl vm.vfs_cache_pressure=10

 

vfs_cache_pressure
------------------
This percentage value controls the tendency of the kernel to reclaim
the memory which is used for caching of directory and inode objects.

At the default value of vfs_cache_pressure=100 the kernel will attempt to
reclaim dentries and inodes at a "fair" rate with respect to pagecache and
swapcache reclaim.  Decreasing vfs_cache_pressure causes the kernel to prefer
to retain dentry and inode caches. When vfs_cache_pressure=0, the kernel will
never reclaim dentries and inodes due to memory pressure and this can easily
lead to out-of-memory conditions. Increasing vfs_cache_pressure beyond 100
causes the kernel to prefer to reclaim dentries and inodes.

Increasing vfs_cache_pressure significantly beyond 100 may have negative
performance impact. Reclaim code needs to take various locks to find freeable
directory and inode objects. With vfs_cache_pressure=1000, it will look for
ten times more freeable objects than there are.

Link to comment
