Interstellar

Disk Settings Page - Tunables - What does what?


I've googled and searched the forums but I can't for the life of me find a definitive answer on what the three tunables do.

 

I assume they are directly linked to performance, but how or why?

 

Without even a basic idea of what they do, or what errors changing them might cause, I've left them well alone!

 

Can someone save me from my poor search skills?

 

Cheers!


 

I was wondering the same thing a while back and found this from Tom in an old post:

 

None of these tunables have anything to do with the Cache Drive; they are 'tweaks' for the low-level unraid array driver only.

 

First, you can create a file in the 'config' directory of the Flash called 'extra.cfg'.  This file is read when the array is Started and may be used to define the values (or override default values) of these tunables.  Here are the current defaults:

 

md_num_stripes=1280

md_write_limit=768

md_sync_window=288
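As a sketch, an extra.cfg with these defaults is just three key=value lines. On a real server it belongs in the 'config' directory of the flash (commonly mounted at /boot — that path is an assumption); the temp directory below is only to demonstrate the format.

```shell
# Generate an extra.cfg containing the default tunables. On an actual
# unRAID box this would go in the flash's 'config' directory; we write
# to a temp dir here purely to show the file format.
config_dir="$(mktemp -d)"
cat > "$config_dir/extra.cfg" <<'EOF'
md_num_stripes=1280
md_write_limit=768
md_sync_window=288
EOF
cat "$config_dir/extra.cfg"
```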

 

md_num_stripes defines the maximum number of active 'stripes' the driver can maintain.  You can think of a 'stripe' as a 4K I/O request.  For example, say you are reading a large (many-megabyte) file from the server.  The driver will "see" a series of (largely) sequential 4K read requests, determine which disk each read is for, and pass each 4K request on to the disk driver.  At most 1280 such requests can be active at once.  If 1280 requests are already active and another comes in, then the process making that request has to wait until a stripe becomes free (as a result of previous I/O finishing).

 

Bottom line is this: the greater this number, the more I/O can be queued down into the disk drives.  However, each 'stripe' requires memory in the amount of (4096 x highest disk number in array).  So if you have 20 disks, each stripe will require 81920 bytes of memory; multiplied by 1280, that's over 104MB.  The default value was chosen to maximize performance in systems with only 512MB of RAM.  If you have more RAM you can experiment with higher values.  If you go too high and the system starts running out of memory, 'random' processes will start getting killed (not good).
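That memory math can be checked with shell arithmetic (numbers straight from the post; "MB" here is decimal, matching the "over 104MB" figure):

```shell
# Memory per stripe = 4096 bytes x highest disk number in the array.
disks=20
stripes=1280
per_stripe=$((4096 * disks))                 # 81920 bytes with 20 disks
total_bytes=$((per_stripe * stripes))        # 104857600 bytes
echo "per stripe: $per_stripe bytes"
echo "total: $((total_bytes / 1000000)) MB"  # just over 104 MB (decimal)
```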

 

md_write_limit specifies the maximum number of stripes which may be used for writes.  This matters because if you start writing a large file, there can easily be 'md_num_stripes' requests immediately queued, which would make reads to different disks suffer greatly.  So you want to pick a number here that is large but still leaves some stripes available for reads.  You will find there is a point of diminishing returns.  If you set the number low, say 32, writes will be very slow.  If you leave it at default, writes will be much faster.  If you set it to, say, 1000, writes will be a "little bit" faster still.  By increasing both md_num_stripes and md_write_limit you might get 10% more performance than the default values.  The best you can get is going to be something less than 50% of the raw speed of the disk - if you get above 33% let me know :)

 

md_sync_window specifies the maximum number of parity-sync stripes active during a parity-sync or parity-check operation.  Again, the larger this number, the faster parity-sync will run, with diminishing returns at some point (due mainly to saturating PCI and/or PCIe buses).

 

You want to make sure that md_write_limit + md_sync_window < md_num_stripes, so that reads do not get starved if you start writing a large file while a parity-sync/check is in progress.
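That constraint is easy to sanity-check before applying new values; a minimal sketch with the defaults:

```shell
# Verify md_write_limit + md_sync_window stays below md_num_stripes,
# so some stripes are always left free for reads. Values are the defaults.
num_stripes=1280
write_limit=768
sync_window=288
if [ $((write_limit + sync_window)) -lt "$num_stripes" ]; then
  echo "OK: $((write_limit + sync_window)) < $num_stripes"
else
  echo "WARNING: reads may starve during a parity-sync/check"
fi
```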

 

Hopefully I will be able to write up a more thorough document on this subject along with some of theory behind it.

 

Note: if you change the values of these tunables via the 'extra.cfg' file, you must Stop the array, then Start the array for the values to take effect.  The md_write_limit and md_sync_window may be dynamically changed by typing these commands in a telnet session:

 

mdcmd set md_write_limit <value>

mdcmd set md_sync_window <value>

 

 

 


Thanks!

 

Currently I have 4GB and soon will have 8GB (4x2GB pulled from a Mac Pro).

 

So I can start increasing the values, excellent!

 

I don't run parity checks during the day, so contention there doesn't matter; I can increase that value too.

 

The server does get choked by simultaneous reads and writes though; how would I improve that? Increase md_num_stripes?

 

I only have 6 drives so:

 

12800*4096*6/1024/1024 = 300MB?
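A quick check of that arithmetic (12800 stripes at 4096 bytes across 6 disks, converted to MiB):

```shell
# 12800 stripes x 4096 bytes x 6 disks, in binary megabytes.
total_mb=$((12800 * 4096 * 6 / 1024 / 1024))
echo "$total_mb MB"   # 300 MB exactly
```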


Well...

 

I've done some testing.

 

Doubling all the values added 5% to my performance.

Doubling them all again brought the gain to 10% over the original performance.

Multiplying by 10 gives basically an 11-12% gain over the original.

 

 


Mine are

 

Tunable (md_num_stripes): 2784

Tunable (md_write_limit): 1536

Tunable (md_sync_window): 576

 

I also adjusted these kernel tunables:

 

vm.dirty_expire_centisecs = 100

vm.dirty_writeback_centisecs = 50

vm.dirty_ratio = 10

vm.dirty_background_ratio = 5

vm.vfs_cache_pressure = 10
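These are standard Linux VM sysctls; one way to apply them at runtime (as root) is sketched below, using the values from the list above. On unRAID such lines are typically added to the flash's 'go' script so they survive a reboot, though whether these particular values suit your workload is another matter.

```shell
# Apply the posted VM tunables at runtime (root required).
sysctl -w vm.dirty_expire_centisecs=100
sysctl -w vm.dirty_writeback_centisecs=50
sysctl -w vm.dirty_ratio=10
sysctl -w vm.dirty_background_ratio=5
sysctl -w vm.vfs_cache_pressure=10
```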

 

 

What I get is burst caching of data up to 1GB at almost 60MB/s.

After that it slowly drops off to around 30MB/s as the cache fills.

But for average user files, I get a high-speed burst from my machine onto the file server.



 

What do those 5 new ones do?

 

 

As for further testing, I think these values REALLY come into their own when you're writing/reading to more than one disk at a time.

 

Using values 8 times the original (if it was optimised for 512MB and I have 4GB, that's 8 times as much).
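Scaling the three defaults by that factor of 8 gives the following (a quick sketch; these are just the defaults multiplied, not tested recommendations):

```shell
# Defaults (1280 / 768 / 288) scaled by the 512MB-to-4GB RAM ratio.
factor=8
echo "md_num_stripes=$((1280 * factor))"  # 10240
echo "md_write_limit=$((768 * factor))"   # 6144
echo "md_sync_window=$((288 * factor))"   # 2304
```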

 

Two machines writing/reading 1GB, both over gigabit, resulted in a 10% increase in read performance and a whopping 40% increase in write performance!

 

Single Machine - Read/Write - Orig Settings: 85.5 / 28.8 MB/sec

Single Machine - Read/Write - New Settings: 83.4 / 33.5 MB/sec (-2.5% / 16.3%)

 

Dual Machine - Read/Write - Orig Settings: 56.6 / 12.4 MB/sec

Dual Machine - Read/Write - New Settings: 62 / 17.3 MB/sec (9.5% / 39%)

 

I started to run out of free memory beyond 10 times the original value, so once I get 8GB I might up it by 16 times and see what is what...

 

With regards to the slight loss in read speed, I think that's down to the disk itself, as it's a 1TB drive at 75% full; all the other testing resulted in ~84.5 MB/sec.

 


vm.dirty_expire_centisecs = 100

vm.dirty_writeback_centisecs = 50

vm.dirty_ratio = 10

vm.dirty_background_ratio = 5

vm.vfs_cache_pressure = 10

 

Do you have any links to an explanation of what these do?

 

A 1GB burst sounds good though, as most of my Time Machine delta backups are smaller than that, so it would be awesome if I could write at 60MB/sec for the first few hundred MB!



 

WeeboTech - is this on the server in your signature (8GB RAM)?  Anyone have recommended settings for 6 disks and 4GB RAM?



 

This is on my 8GB server.  The settings are not too aggressive and you could probably get away with them on 4GB without issue; I did before I upgraded to 8GB.


With regards to: http://lime-technology.com/forum/index.php?topic=15224

 

 

Would you mind posting an explanation of what the 5 vm tunables do in terms of increasing/reducing the values?

 

I've set my server to use your values and it seems even slower (I have 8GB).

 

Cheers

 

I don't remember all the settings or why I chose them over 2 years ago.

 

I do know I readjusted the vm.dirty numbers so that more data was cached in the buffer cache.

This works out well for small files, as they transfer as fast as possible without waiting for the sync to disk.

It could affect very large transfers, but not on my system.

 

The tunable md values are defined elsewhere, and I'm still running 4.5.4, so they may be incorrect for the current version.

 

Also, I do have 'force NCQ' disabled.

 

