unraid-tunables-tester.sh - A New Utility to Optimize unRAID md_* Tunables



Restarted the test after watching a tail on the syslog - there are two places the NumStripes variable is calculated: in the Calc function and inside the RunTest loop at the end.

 

Sorry I failed to mention that.  It simply increments to the next set of test values.  I forgot it doesn't call CalcValues.  I should probably make a version that does all the calcs in one place so you don't have to go hunting.

Link to comment

I've found rebuilds or parity generates to be faster than read checks.

It could also be the direction of the traffic on the bus or controller.

The PCIe bus is full duplex. It could also be that write caching on the hardware comes into play.

When I used my Areca with write caching, the improvement in parity generate speed was measurable.

 

After tuning on my server, I found that read checks were slightly faster than rebuilds.  Stock read checks were much slower.

 

At least in my case, with my hardware, the tunables appeared to improve read check performance to hardware maximum, while rebuilds were already at maximum.

Link to comment

Try copying a file from the cache drive.

 

That's an elegant solution to eliminate network performance as a factor.

 

Having a very high performance cache drive, perhaps a modern fast ssd, would also help mitigate external influence. A RAM drive, with extremely low latency and high transfer speeds, would still be best.

 

Any way to create a RAM drive and assign it as a cache drive (or simply mount it at all) in Linux?

Link to comment

I found the best way to test writes is to eliminate some of the I/O by reading from /dev/zero.

quick scriptlet.

 

bs=1024

count=4000000

count=10000000

 

dd if=/dev/zero bs=$bs count=$count of=$1

One issue when testing writes is that you want as empty a file system as possible.

Allocation of the directory and file nodes takes time. Searching through the filesystem looking for free blocks takes time.

 

Remove the file before your next test or it will be skewed.

OR

Create the file, then overwrite the same file each time; this way the allocation is already done.
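In other words (the path and size here are just examples, not part of anyone's script):

# Option 1: delete the previous run's file so the next run is not skewed by it
rm -f /mnt/disk1/test.dd

# Option 2: always write to the same filename, so after the first run the allocation already exists
dd if=/dev/zero bs=1024 count=4000000 of=/mnt/disk1/test.dd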

Link to comment

Thrashing, by its technical definition, is a term from virtual memory and paging.

 

So I googled the term and WeeboTech is right about the technical definition of Thrashing.  I've certainly been using the term incorrectly.

 

I was looking for a single word that describes high seek activity with very large head movements caused when a hard drive is responding to competing requests that send the actuator arms back and forth between the inner and outer cylinders of a platter, ultimately causing total data transfer rates (summed together) to be far below the achievable rate if the hard drive was able to focus on a single request.  Basically, a drive that is so busy seeking instead of reading and writing that data transfer rates fall.

 

While in my mind the mental image of a hard drive's actuator arm rapidly seeking over the platters, like a teenager flailing their arms about, was best described by the word thrashing (like something you would do at a metal band concert), technically this is not Thrashing at all.

 

From what I've just read, a more technically correct definition of the scenario I am describing is a drive Saturated with Random I/O from Competing Requests, though I would still love a single word that describes this condition, and perhaps WeeboTech is right and the word is Competing.

 

Regardless, though some of us were using the wrong term, we were all talking about the same problem: drives saturated with random I/O from competing requests, and the need to prioritize some requests over others.

 

Anyway, back to the core topic.

Link to comment

Try copying a file from the cache drive.

 

That's an elegant solution to eliminate network performance as a factor.

 

Having a very high performance cache drive, perhaps a modern fast ssd, would also help mitigate external influence. A RAM drive, with extremely low latency and high transfer speeds, would still be best.

 

Any way to create a RAM drive and assign it as a cache drive (or simply mount it at all) in Linux?

 

From http://lime-technology.com/forum/index.php?topic=25250.msg220405#msg220405:

 

mkdir -m 777 /tmp/ramdrv 

 

Sync and clear the caches and see what is left when choosing the size:

sync && echo 3 > /proc/sys/vm/drop_caches

free -lm

 

Create the ram drive:

mount -t tmpfs -o size=3G tmpfs /tmp/ramdrv

The size is important and should be chosen conservatively if you don't have a swap partition enabled.  G = Gig, M = Meg and so on.  Tmpfs will be created with a set max size - it will not grow larger than what you tell it.  It will also be swappable if you have a swap partition enabled, which is nice for those "oops" moments as it keeps the server from crashing.

 

This method will provide some protection against the server running out of memory since you are defining a max size and it will never get larger.
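When you are done testing, the RAM drive can be torn down and its memory released (assuming nothing still has files open on it):

umount /tmp/ramdrv
rmdir /tmp/ramdrv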

Link to comment

I found the best way to test writes is to eliminate some of the I/O by reading from /dev/zero.

quick scriptlet.

 

bs=1024

count=4000000

count=10000000

 

dd if=/dev/zero bs=$bs count=$count of=$1

Sounds great!  But I have no idea what any of that means.  Can you put it in layman's terms?

 

 

One issue when testing writes is that you want as empty a file system as possible.

Allocation of the directory and file nodes takes time. Searching through the filesystem looking for free blocks takes time.

 

Remove the file before your next test or it will be skewed.

OR

Create the file, then overwrite the same file each time; this way the allocation is already done.

 

That makes sense.  So a newly added, empty drive is a great place to start? 

 

I would think you would also want the drive to be the same model as your fastest drive, otherwise if you are testing a slow drive you might come up with values that don't work well for faster drives.  Of course, I am also making the assumption that values that work well on your fastest drive will also work well on your slowest drive.

 

Any chance you might make a script out of this?  This is way beyond my technical ability.

Link to comment

From http://lime-technology.com/forum/index.php?topic=25250.msg220405#msg220405:

 

mkdir -m 777 /tmp/ramdrv 

 

Sync and clear the caches and see what is left when choosing the size:

sync && echo 3 > /proc/sys/vm/drop_caches

free -lm

 

Create the ram drive:

mount -t tmpfs -o size=3G tmpfs /tmp/ramdrv

The size is important and should be chosen conservatively if you don't have a swap partition enabled.  G = Gig, M = Meg and so on.  Tmpfs will be created with a set max size - it will not grow larger than what you tell it.  It will also be swappable if you have a swap partition enabled, which is nice for those "oops" moments as it keeps the server from crashing.

 

Awesome, thanks unevent.  So after creating a ramdisk, you could load it up with test files, then run the write tests.
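For example (file names and sizes are only placeholders, not part of unevent's instructions):

dd if=/dev/zero bs=1M count=2048 of=/tmp/ramdrv/testfile.bin     # stage a 2GB test file on the RAM drive
cp /tmp/ramdrv/testfile.bin /mnt/disk1/testfile.bin              # write test: copy from RAM to the array
rm /mnt/disk1/testfile.bin                                       # clean up before the next run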

 

Do you think there is a way to turn those steps into a script?  How would you make a script smart enough to choose conservatively?  Sounds like there are several variables to consider.

 

Not that I understand it, but WeeboTech's suggestion might be better.  I don't think you have to create a ramdisk or allocate memory.  I think it might be a way to create test data out of thin air, so to speak.  I'm concerned that allocating some memory to a ramdisk could influence the results, not to mention the possibility of crashing the server if you run out of memory.

Link to comment

 

Any chance you might make a script out of this?  This is way beyond my technical ability.

 

I had a script called write4gb which is similar to this:

 

#!/bin/bash
bs=1024
count=4000000
# count=10000000

dd if=/dev/zero bs=$bs count=$count of=$1

 

Ideally you want it to be 2x your RAM, so I had the write10gb script.

 

#!/bin/bash
bs=1024
# count=4000000
count=10000000

dd if=/dev/zero bs=$bs count=$count of=$1

 

 

you just run it with ./write4gb /mnt/disk1/test.dd

 

 

Over here I have my writeread10gb script.

It writes 10GB, then reads it back.

https://code.google.com/p/unraid-weebotech/downloads/detail?name=writeread10gb&can=2&q=#makechanges

 

You can cannibalize it and use only the write part of it.

What I like about it is that it will give you a status per interval so you can see how buffer cache and kernel tuning come into play.
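As an aside, plain GNU dd can be coaxed into printing interval status too: sending it SIGUSR1 makes it report how much it has written and the current rate.  A rough sketch (the path is just a placeholder, and this is not WeeboTech's script):

dd if=/dev/zero bs=1024 count=10000000 of=/mnt/disk1/test.dd &
DDPID=$!
sleep 2                                          # give dd a moment to install its signal handler
while kill -USR1 $DDPID 2>/dev/null; do          # each USR1 makes dd print its progress to stderr
    sleep 5
done
wait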

 

Throughout the board I mention that I can burst at high speed. This is because of buffer caching and some tunings. I'm able to get upwards of 60MB/s, sometimes higher, depending on the drive.  It bursts to about 500GB, then it slows down.

 

For me, this is what I need as I move a lot of MP3s around, tag them, add artwork, etc. etc.

So I want to burst-write as fast as I can, even if it's only for a smaller subset of the data I may move, i.e. movies.

 

In any case, the fastest drives for parity and for the data disk you are testing matter.

Plus, as empty a filesystem as possible.

Link to comment

Not that I understand it, but WeeboTech's suggestion might be better.  I don't think you have to create a ramdisk or allocate memory.  I think it might be a way to create test data out of thin air, so to speak.  I'm concerned that allocating some memory to a ramdisk could influence the results, not to mention the possibility of crashing the server if you run out of memory.

 

I would suggest reading /dev/zero as the kernel will provide 0's until you stop reading.

There is also /dev/urandom, but that takes more CPU cycles and I've found it to slow things down.

 

Anyway, it's not out of thin air, it's from the kernel.

It skips the need for a ramdisk.  Personally, when benchmarking writes AND kernel tuning, I would not use the RAM disk.

 

I would drop the kernel caches.

sync && echo 3 > /proc/sys/vm/drop_caches

 

Then read from /dev/zero, writing to where I want to test, with a specific size limit.

 

If you use the ramdisk approach, then you are holding ram that could possibly be used for buffering.

 

For most people it would not matter; with my tuning and usage pattern it does.

For example testing with sysctl vm.highmem_is_dirtyable=1 has a big effect with burstable buffering.
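For anyone who wants to experiment with that (the setting only exists on kernels built with highmem support, and the change does not survive a reboot):

sysctl vm.highmem_is_dirtyable          # show the current value
sysctl -w vm.highmem_is_dirtyable=1     # count highmem toward dirtyable memory, until the next reboot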

Link to comment

I would suggest reading /dev/zero as the kernel will provide 0's until you stop reading.

There is also /dev/urandom, but that takes more CPU cycles and I've found it to slow things down.

 

Anyway, it's not out of thin air, it's from the kernel.

It skips the need for a ramdisk.  Personally, when benchmarking writes AND kernel tuning, I would not use the RAM disk.

 

I would drop the kernel caches.

sync && echo 3 > /proc/sys/vm/drop_caches

 

Then read from /dev/zero, writing to where I want to test, with a specific size limit.

 

If you use the ramdisk approach, then you are holding ram that could possibly be used for buffering.

 

For most people it would not matter; with my tuning and usage pattern it does.

For example testing with sysctl vm.highmem_is_dirtyable=1 has a big effect with burstable buffering.

 

Alright, so I read through some documentation on the dd command (completely new to me), and I can't believe I'm gonna say this, but I'm thinking about incorporating a routine to test md_write_limit using dd.

 

There are some logistical problems that I didn't have to worry about with a parity check, so I'm open to suggestions on the following:

 

How to choose what drive to write to?

  A) Write to All Drives and select the fastest for the test

  B) Let the user specify

  C) Automatically select the emptiest drive

  D) Automatically select the largest drive, and if more than one select the emptiest

  E) Make A-D selectable options and make the user choose

  F) Another option I failed to think of

 

dd also has a nickname - data destroyer.  Misuse of the tool can easily cause a loss of data.  I need to make sure that the dangerous options are not user selectable.  It looks all too easy to perform a full disk wipe by specifying of=/dev/sda instead of of=/mnt/disk18/testfile10gbzeros.txt.  Scary.  cp is child's play next to dd.  I didn't see any logic in your script to prevent someone from specifying a whole drive or partition as the output file...

 

Since we will be writing a file, I think we need to make sure that the filename doesn't already exist (with real user data).  I bet that would really piss someone off.
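A simple guard (a sketch only; the path and variable match the test loop below) is to refuse to run if the target file already exists:

TESTFILE=/mnt/$DiskNum/testfile10gbzeros.txt
if [ -e "$TESTFILE" ]; then
    echo "Refusing to overwrite existing file: $TESTFILE"
    exit 1
fi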

 

Do the kernel caches need to be dropped after each test, or can I run 10 tests back to back with different md_write_limit values without dropping the caches?

 

I assume with dd, I would log the time, call dd to write the test file, and when it completes log the time again, right?

 

Something like this?

testloop (){
     /root/mdcmd set md_write_limit $WriteLimit
     sync && echo 3 > /proc/sys/vm/drop_caches          # flush and drop caches so each pass starts cold
     StartTime=$(date +%s.%N)
     dd if=/dev/zero bs=1024 count=10485760 of=/mnt/$DiskNum/testfile10gbzeros.txt   # write 10GB of zeros
     EndTime=$(date +%s.%N)
     Duration=`echo "$EndTime $StartTime" | awk '{ printf( "%0.3f\n", ($1 - $2) )}'`
     Speed=`echo "$Duration" | awk '{ printf( "%0.1f\n", ((10737418240 / $1) / 1048576) )}'`   # MB/s
}

 

I appreciate any insight.

 

-Paul

 

I EDITED TO CORRECT A STUPID MISTAKE ON SPECIFYING THE FILE PATH...

Link to comment

I would 'start' with B, but only in that it selects a /mnt/diskX path.

 

Do not use of=/dev/sdg/testfile10gbzeros.txt as you will destroy the partition.

 

Use /mnt/disk1-24/filename to ensure that you are writing to an existing filesystem.

 

Put these at the start of your script, and you can use the script's name as the basename of the temporary file.

 

#!/bin/bash

 

[ ${DEBUG:=0} -gt 0 ] && set -x -v

 

P=${0##*/}              # basename of program

R=${0%%$P}              # dirname of program

P=${P%.*}              # strip off after last . character

 

TMPFILE=/mnt/disk${1}

 

if [ ! -d ${TMPFILE} ]
  then echo "Supplied disk is not mounted."
       exit
fi

 

TMPFILE=${TMPFILE}/${P}.test

 

Then add:

 

trap "rm -f ${TMPFILE}" HUP INT QUIT TERM EXIT

 

 

There's more than one way to skin a cat; we can use the output of mount to find all mounted disks and present a menu.

 

What the user would supply is the disk number.

 

Later on you can calculate the other options or present a menu using select. (I'll send you some examples).
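As a rough sketch of that idea (my own example, not WeeboTech's), something like this pulls the mounted /mnt/diskN paths out of mount's output and hands them to select:

disks=$(mount | awk '$3 ~ /^\/mnt\/disk[0-9]+$/ { print $3 }')   # mountpoint is the 3rd field of mount output
select d in $disks
do
    [ -n "$d" ] && break
done
TMPFILE=${d}/${P}.test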

Link to comment

Well I ran a parity check last night and only saw about a 10 minute improvement. I was really hoping for more. Still ended up taking 9:10 to complete with no errors. Oh well. Must be those 2TB drives.

 

Yup.  That's a pretty good time considering your mix of drives.

Link to comment

All good info, thanks WeeboTech.

 

I was thinking, regarding selecting an empty drive - it's not just about selecting the emptiest drive, but rather about selecting a drive with free outer sectors, where performance will be greatest.  It's possible (not likely, but possible) to have a drive 90% full, but because some files were deleted the outer cylinders are empty.

 

Any way to check for that condition?

 

And if the answer is yes, is there any way to control where on a drive the file gets written to?

Link to comment

Any way to check for that condition?

 

And if the answer is yes, is there any way to control where on a drive the file gets written to?

 

Technically, anything can be programmed.  Practically, this may be splitting hairs.

 

If you have it configurable to run on a chosen drive, that should be enough for the first go-round.

 

If you can iterate through arguments, then you can test each disk supplied.

 

I.E.

 

for i in $*
do  [ ! -d /mnt/disk${i} ] && continue
    TMPFILE=/mnt/disk${i}/${P}.test
    some_function_to_test_the_disk ${TMPFILE}   # placeholder for whatever test you run per disk
done

 

this would allow people to test a range of disks with

 

./script.sh 1 5 7 4 3

 

This also means if they did not supply an argument for a disk write test, it would skip the code in the loop.

 

I have diskrawread.sh you can look at via googlecode. It's different from a filesystem test.

It's a read test to find the maximum read speed of a drive.
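The core of a raw read test boils down to something like this (a sketch, not his actual diskrawread.sh; replace sdX with the drive you want to read, and note that reading is non-destructive):

dd if=/dev/sdX of=/dev/null bs=1M count=4096   # read 4GB straight off the device, bypassing the filesystem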

Link to comment

r.e. write testing from the cache drive (or an internal RAM drive)  ==> agree this would provide the best test of the maximum possible write speed of the array with zero impact from the network.

 

However ... in real life writes are done over the network  :)    And if the writes are at a speed significantly below what the network can sustain (which is certainly the case with Gb networks), then there's no benefit to using md_write_limit settings higher than whatever maximizes your writes across the network.

 

There IS one other factor I didn't consider and/or test (because I can't) => writes TO the cache drive ... these should be very fast as they are (a) done at drive speed without any parity computations;  and (b) don't have any impact on parity checks, since the cache is "outside" of any parity checks.  Not sure how much impact md_write_limit might have on this, but it might be interesting to test it.
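If anyone wants to try it, a quick sanity check could be as simple as timing a dd write straight to the cache drive (assuming it is mounted at /mnt/cache; the filename is just a placeholder), before and after changing md_write_limit:

sync && echo 3 > /proc/sys/vm/drop_caches
dd if=/dev/zero bs=1024 count=4000000 of=/mnt/cache/cachewrite.test   # ~4GB write to the cache drive
rm /mnt/cache/cachewrite.test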

 

 

Link to comment

r.e. write testing from the cache drive (or an internal RAM drive)  ==> agree this would provide the best test of the maximum possible write speed of the array with zero impact from the network.

 

However ... in real life writes are done over the network  :)    And if the writes are at a speed significantly below what the network can sustain (which is certainly the case with Gb networks), then there's no benefit to using md_write_limit settings higher than whatever maximizes your writes across the network.

 

There IS one other factor I didn't consider and/or test (because I can't) => writes TO the cache drive ... these should be very fast as they are (a) done at drive speed without any parity computations;  and (b) don't have any impact on parity checks, since the cache is "outside" of any parity checks.  Not sure how much impact md_write_limit might have on this, but it might be interesting to test it.

Moves from the cache drive to the array would not be limited by network speed, so there is a reason to test beyond what a network can provide.

 

I used to be a network engineer, and I don't trust LAN connections as a factor when benchmarking hard drive speed.  Too variable.

 

I'm not saying you aren't able to get good results testing transfers from a Windows PC over a LAN connection, but I wouldn't engineer a test script around it, way too many variables.  It's also more complex than the dd method.

 

As far as I know, since the cache drive is outside the array, md_* parameters don't affect it.

Link to comment

Here are the results of the rebuild test.

 

Tunables Report from  unRAID Tunables Tester v2.2 by Pauven

NOTE: Use the smallest set of values that produce good results. Larger values
      increase server memory use, and may cause stability issues with unRAID,
      especially if you have any add-ons or plug-ins installed.

Test | num_stripes | write_limit | sync_window |   Speed 
---------------------------------------------------------------------------
   1  |    1544     |     768     |     8     |  45.2 MB/s 
   2  |    1544     |     768     |     72     |  59.8 MB/s 
   3  |    1544     |     768     |     136     |  59.9 MB/s 
   4  |    1544     |     768     |     200     |  73.9 MB/s 
   5  |    1544     |     768     |     264     |  88.4 MB/s 
   6  |    1544     |     768     |     328     | 101.6 MB/s 
   7  |    1544     |     768     |     392     | 106.9 MB/s 
   8  |    1544     |     768     |     456     | 112.4 MB/s 

Completed: 0 Hrs 33 Min 52 Sec.

Best Bang for the Buck: Test 2 with a speed of 59.8 MB/s

     Tunable (md_num_stripes): 1544
     Tunable (md_write_limit): 768
     Tunable (md_sync_window): 72

These settings will consume 60MB of RAM on your hardware.


Unthrottled values for your server came from Test 8 with a speed of 112.4 MB/s

     Tunable (md_num_stripes): 1544
     Tunable (md_write_limit): 768
     Tunable (md_sync_window): 456

These settings will consume 60MB of RAM on your hardware.
This is -60MB less than your current utilization of 120MB.
NOTE: Adding additional drives will increase memory consumption.

In unRAID, go to Settings > Disk Settings to set your chosen parameter values.

Link to comment

I was looking for a single word that describes high seek activity with very large head movements caused when a hard drive is responding to competing requests that send the actuator arms back and forth between the inner and outer cylinders of a platter, ultimately causing total data transfer rates (summed together) to be far below the achievable rate if the hard drive was able to focus on a single request.  Basically, a drive that is so busy seeking instead of reading and writing that data transfer rates fall.

 

While in my mind the mental image of a hard drive's actuator arm rapidly seeking over the platters, like a teenager flailing their arms about, was best described by the word thrashing (like something you would do at a metal band concert), technically this is not Thrashing at all.

 

While it does seem that the most common (but not the only) definition of thrashing on the 'net is virtual memory thrashing, that word has been used since I was a young CS student 50 years ago to mean exactly what you described above ... my CS profs used to compare it to a needle on an LP jumping back & forth between songs so much that you couldn't tell what either of the songs was.  They called it thrashing the disk, and it's stuck with me ever since, and still seems like a good use of the word.

Once virtual memory systems became common in the 70's, page file thrashing became a significant problem ... and that's what the current Wikipedia definition refers to.  [But note that disk thrashing was discussed in CS BEFORE there were ever virtual memory systems.]  This was also often called "system thrashing", to differentiate it from the related problems found in database systems: "index thrashing", "buffer cache thrashing", and other behaviors that would significantly slow down database accesses.  They ALL refer to essentially the same thing -- a LOT of disk head movement.

 

My mental picture has always been exactly that phonograph arm jumping around on the platter so much that everything simply runs slower.    Regardless of what you call it, when you have a lot of activities causing significant head movement, the time for those activities will be longer than if they were done sequentially, simply because seeks are the slowest part of a disk access.

 

By the way, your teenager is indeed thrashing according to Webster  :)  ["... or stir about violently : toss about ..."]

 

Enough on this ... I understand the modern usage that tends to equate "thrashing" to "system level virtual memory thrashing"  => but it's still a word that means exactly what you described to me  :)

Link to comment
