Drive performance testing (version 2.6.5) for UNRAID 5 thru 6.4



1 hour ago, CraziFuzzy said:

I definitely don't see any reason why this would gain anything from the added complication of going the Docker route. A plugin seems the obvious place for this functionality.

 

If it was only intended to be used under unRAID, this would likely be true, but I've always wanted to eventually expand past unRAID, and switching to Docker now, while it's still in early development, makes the most sense since I've validated that I can access the host's hardware from inside the container. Running under Docker is also a good compromise for those with limited RAM available, as it uses already-provided functionality for starting/stopping. As a plugin, I would need to provide an extra layer to start & stop the Lucee app server.

 

1 hour ago, SiNtEnEl said:

Just tested the Docker approach by running dd from a container and I'm getting the same disk performance as the alpha version of diskspeed. I was concerned it might influence the benchmarks.

 

I see the same. There's likely an extra pass-through layer with Docker, and if there is, any delays it introduces are masked by the latencies of the drive itself.
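For anyone who wants to repeat that comparison, it can be as simple as handing the raw device to a throwaway container and running the same read inside and outside of it (the device path and image here are just examples):

# On the host (iflag=direct bypasses the page cache so the drive is actually hit):
dd if=/dev/sdX of=/dev/null bs=1M count=4096 iflag=direct

# The same read from inside a throwaway container:
docker run --rm --device /dev/sdX ubuntu \
    dd if=/dev/sdX of=/dev/null bs=1M count=4096 iflag=direct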

 

 

Link to comment
3 hours ago, jbartlett said:

 

If it was only intended to be used under unRAID, this would likely be true, but I've always wanted to eventually expand past unRAID, and switching to Docker now, while it's still in early development, makes the most sense since I've validated that I can access the host's hardware from inside the container. Running under Docker is also a good compromise for those with limited RAM available, as it uses already-provided functionality for starting/stopping. As a plugin, I would need to provide an extra layer to start & stop the Lucee app server.

Out of curiosity, what is Lucee actually providing here that the built-in server on unRAID can't?

Additionally, I can't imagine someone with bare-minimum RAM putting much focus on tracking disk performance often.


Link to comment
2 hours ago, CraziFuzzy said:

Out of curiosity, what is Lucee actually providing here that the built-in server on unRAID can't?

Additionally, I can't imagine someone with bare-minimum RAM putting much focus on tracking disk performance often.

 

Lucee is a free app server for the ColdFusion scripting language. unRAID only supports PHP natively.

 

We're not talking about just systems with minimal RAM installed; many people (like myself) have systems with large amounts of RAM that is mostly allocated to VMs. And regardless of going with a plugin or Docker, the same RAM allocation is going to take place, as Lucee/Java would need to be installed & running. Going the Docker route keeps it more tightly contained, and the end-user GUI experience is identical.

Link to comment

From a development standpoint, here are the steps I had to take to deploy each version:

 

Plugin:

  • Check whether a new Java version is available. If so, update the path reference in the PLG file.
  • Check whether a new Lucee version is available. If so, update the path reference and the Lucee version in the PLG file so existing installs can download just the new lucee.jar file instead of everything.
  • Every so often, check whether there are new versions of the utility packages (such as nvme-cli) and integrate them.
  • Zip up the source code and place it on my web server.
  • Update the PLG file to reference the new source file name, bump the plugin's version number, then deploy the updated PLG.
  • Update the plugin to ensure everything updates correctly.
  • Uninstall & reinstall the plugin to ensure everything installs correctly.

Docker:

  • Copy the directory with the code to my "dockerfile" directory.
  • Build the Docker image.
  • Push the Docker image.

Any updates to any part, such as Java, Lucee, or the utility packages, are automatically bundled in with no effort on my part (the three steps are sketched below).
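Roughly, those three steps translate to the following on the command line (the paths and image name here are hypothetical examples, not the real ones):

# Hypothetical paths and image name, for illustration only
cp -r ~/src/diskspeed/. ~/dockerfile/          # copy the code next to the Dockerfile
cd ~/dockerfile
docker build -t youruser/diskspeed:latest .    # build the image (assuming the base layers pull in current Java/Lucee)
docker push youruser/diskspeed:latest          # push it so servers can pull the update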

Link to comment

Oooo, I'm able to figure out the max data transfer speeds for storage controllers that report a link speed. I can show you on a graph how much data you're transferring per drive alongside the max possible throughput.

 

So basically, I can give you a percentage of how much bandwidth you're using and how much you have free. Then you can decide for yourself how to best utilize that bandwidth and with what drives.

 

SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon]

LSI Logic / Symbios Logic
Serial Attached SCSI controller

Current Link Speed: 5GT/s width x8 (4 GB/s max throughput)
Maximum Link Speed: 5GT/s width x8 (4 GB/s max throughput)
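
For what it's worth, the same numbers can be read straight from lspci, and the 4 GB/s figure falls out of the encoding overhead at that link speed (the PCI address below is just an example):

# PCI address is an example - find yours with: lspci | grep -i sas
lspci -vv -s 01:00.0 | grep -E 'LnkCap|LnkSta'
#   LnkCap: ... Speed 5GT/s, Width x8 ...
#   LnkSta: ... Speed 5GT/s, Width x8 ...
#
# 5 GT/s with 8b/10b encoding = 4 Gbit/s = 500 MB/s per lane; 500 MB/s x 8 lanes = 4 GB/s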

 

"Exciting stuff."

-Cave Johnson

Link to comment
10 minutes ago, jbartlett said:

So basically, I can give you a percentage of how much bandwidth you're using and how much you have free. Then you can decide for yourself how to best utilize that bandwidth and with what drives.

That's cool, but maybe mention that the theoretical max PCIe bandwidth will never be reached; 75 to 80% is usually the best real-world speed possible for PCIe. Other protocols like SATA and SAS have less overhead and higher usable bandwidth, around 90-92% in my experience.
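
To put rough numbers on that (illustrative only, using the nominal 4 GB/s PCIe 2.0 x8 and 600 MB/s SATA III line rates and mid-range overhead figures):

# Illustrative arithmetic only - real numbers depend on the controller and drives
awk 'BEGIN {
    pcie = 4000 * 0.78    # ~78% of the 4 GB/s PCIe 2.0 x8 figure  -> ~3120 MB/s
    sata = 600  * 0.91    # ~91% of SATA III 600 MB/s              -> ~546 MB/s
    printf "usable PCIe 2.0 x8 ~ %.0f MB/s, usable SATA III ~ %.0f MB/s\n", pcie, sata
}'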

Link to comment
2 hours ago, jbartlett said:

Looks like I can do away with the whole thing of reading the drives in increasing block sizes to determine the optimum block size. The drives return this information! Multiplying the max sectors per request value by the logical sector size yields the optimal block size.

Would it be at all beneficial to include a way to test whether or not it is actually optimal?

Link to comment
56 minutes ago, jonathanm said:

Would it be at all beneficial to include a way to test whether or not it is actually optimal?

 

I created a test script that reads 5 GB of data, starting with a block size of 512 bytes and doubling it every pass, which is the same logic I used in the app. Here's the log of how long each pass took and the average speed against my 6TB WD Red Pro 2 drive. The drive reported that 16K was the optimal block size.

 

5120000000 bytes (5.1 GB, 4.8 GiB) copied, 446.494 s, 11.5 MB/s (512B)
5120000000 bytes (5.1 GB, 4.8 GiB) copied, 229.114 s, 22.3 MB/s (1K)
5120000000 bytes (5.1 GB, 4.8 GiB) copied, 116.944 s, 43.8 MB/s (2K)
5120000000 bytes (5.1 GB, 4.8 GiB) copied, 62.9478 s, 81.3 MB/s (4K)
5120000000 bytes (5.1 GB, 4.8 GiB) copied, 36.6456 s, 140 MB/s (8k)
5120000000 bytes (5.1 GB, 4.8 GiB) copied, 31.7995 s, 161 MB/s (16K)
5120000000 bytes (5.1 GB, 4.8 GiB) copied, 31.8183 s, 161 MB/s (32K)

5120000000 bytes (5.1 GB, 4.8 GiB) copied, 31.7485 s, 161 MB/s (64K)
5119934464 bytes (5.1 GB, 4.8 GiB) copied, 31.758 s, 161 MB/s (128K)
5119934464 bytes (5.1 GB, 4.8 GiB) copied, 31.7385 s, 161 MB/s (256K)
5119672320 bytes (5.1 GB, 4.8 GiB) copied, 31.7558 s, 161 MB/s (512K)
5119148032 bytes (5.1 GB, 4.8 GiB) copied, 31.7352 s, 161 MB/s (1M)
5119148032 bytes (5.1 GB, 4.8 GiB) copied, 31.758 s, 161 MB/s (2M)
5117050880 bytes (5.1 GB, 4.8 GiB) copied, 31.7343 s, 161 MB/s (4M)
5117050880 bytes (5.1 GB, 4.8 GiB) copied, 31.7281 s, 161 MB/s (8M)
5117050880 bytes (5.1 GB, 4.8 GiB) copied, 31.7283 s, 161 MB/s (16M)
5100273664 bytes (5.1 GB, 4.8 GiB) copied, 31.6791 s, 161 MB/s (32M)
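
For anyone who wants to run the same kind of sweep, the loop can be as simple as this sketch (not the actual script - /dev/sdX is a placeholder and it only reads from the drive):

#!/bin/bash
# Read ~5 GB at each block size, doubling from 512 bytes up to 32 MB.
DEV=/dev/sdX                            # placeholder - point at the drive to test
TOTAL=5120000000                        # ~5 GB per pass
bs=512
while [ "$bs" -le $(( 32 * 1024 * 1024 )) ]; do
    count=$(( TOTAL / bs ))
    # iflag=direct bypasses the page cache so every pass actually hits the disk
    dd if="$DEV" of=/dev/null bs="$bs" count="$count" iflag=direct 2>&1 | tail -n 1
    bs=$(( bs * 2 ))
done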

Link to comment

Because diskspeed is not available for unRAID 6.4.0, are there other ways to check the performance of the disks in my array? I am in the process of re-organizing one of my servers, building a RAID0 parity volume on an ARC 1210 and replacing some slower disks in the array, which have been slowing down parity checks for some time now. Or should I go back to unRAID 6.3.5 just to check performance?

Link to comment
On 2/3/2018 at 2:13 AM, jbartlett said:

 

I created a test script that reads 5 GB of data, starting with a block size of 512 bytes and doubling it every pass, which is the same logic I used in the app. Here's the log of how long each pass took and the average speed against my 6TB WD Red Pro 2 drive. The drive reported that 16K was the optimal block size.

 

5120000000 bytes (5.1 GB, 4.8 GiB) copied, 446.494 s, 11.5 MB/s (512B)
5120000000 bytes (5.1 GB, 4.8 GiB) copied, 229.114 s, 22.3 MB/s (1K)
5120000000 bytes (5.1 GB, 4.8 GiB) copied, 116.944 s, 43.8 MB/s (2K)
5120000000 bytes (5.1 GB, 4.8 GiB) copied, 62.9478 s, 81.3 MB/s (4K)
5120000000 bytes (5.1 GB, 4.8 GiB) copied, 36.6456 s, 140 MB/s (8k)
5120000000 bytes (5.1 GB, 4.8 GiB) copied, 31.7995 s, 161 MB/s (16K)
5120000000 bytes (5.1 GB, 4.8 GiB) copied, 31.8183 s, 161 MB/s (32K)

5120000000 bytes (5.1 GB, 4.8 GiB) copied, 31.7485 s, 161 MB/s (64K)
5119934464 bytes (5.1 GB, 4.8 GiB) copied, 31.758 s, 161 MB/s (128K)
5119934464 bytes (5.1 GB, 4.8 GiB) copied, 31.7385 s, 161 MB/s (256K)
5119672320 bytes (5.1 GB, 4.8 GiB) copied, 31.7558 s, 161 MB/s (512K)
5119148032 bytes (5.1 GB, 4.8 GiB) copied, 31.7352 s, 161 MB/s (1M)
5119148032 bytes (5.1 GB, 4.8 GiB) copied, 31.758 s, 161 MB/s (2M)
5117050880 bytes (5.1 GB, 4.8 GiB) copied, 31.7343 s, 161 MB/s (4M)
5117050880 bytes (5.1 GB, 4.8 GiB) copied, 31.7281 s, 161 MB/s (8M)
5117050880 bytes (5.1 GB, 4.8 GiB) copied, 31.7283 s, 161 MB/s (16M)
5100273664 bytes (5.1 GB, 4.8 GiB) copied, 31.6791 s, 161 MB/s (32M)

If the drive claims 512 byte sectors and a buffer capacity of 32 sectors, then 16 kB transfers will result in the fewest "transactions" for a large file.

Making larger requests will just force the OS layer to split the operation into the 16 kB transfer sizes that the disk can handle.


But I wonder if a drive could potentially claim too large a buffer capacity, where you get pipe stalls while the pipe gets filled.

 

I'm just not convinced that there couldn't be drives that specify a very large buffer but actually perform better if the full buffer size isn't used.

 

Another curious thing: if I look at a number of drives I have, they all claim (hdparm output):

R/W multiple sector transfer: Max = 16

 

But they give varying responses for the current R/W multiple sector transfer:

Current = 0

Current = ?

Current = 16

Link to comment
11 hours ago, pwm said:

Another curious thing: if I look at a number of drives I have, they all claim (hdparm output):

R/W multiple sector transfer: Max = 16

 

But they give varying responses for the current R/W multiple sector transfer:

Current = 0

Current = ?

Current = 16

 

I'm getting the value from "MaxSectorsPerRequest" using blockdev, and for the drives I double-checked, it matched the value in the drive's "queue" device directory.
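
For anyone wanting to do the same cross-check (sdX is a placeholder):

blockdev --getmaxsect /dev/sdX              # max sectors per request
blockdev --getss /dev/sdX                   # logical sector size in bytes
cat /sys/block/sdX/queue/max_sectors_kb     # related per-request limit in the queue directory (in KiB)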

 

However, I'm now likewise a little suspicious since all my drives are reporting the same value. Could be a coincidence, but....

 

I have not been fully satisfied with how I was doing the tests: running a balls-to-the-wall read for 3 seconds at each block size to get a baseline scan, then 8-second tests on the three block sizes whose 3-second tests had the highest results. But those 3-second tests kept showing a false spike towards the top end of the block sizes, and a scan on a system with many drives just took too long - I know people will get impatient.

 

I'm going to default to the MaxSectorsPerRequest value, which gives a good baseline starting point, but allow people to do an in-depth scan with a smarter method: do a ten-second read starting at the MaxSectorsPerRequest value, then check one step above & below it to see whether the result above is equal (within a tiny margin of error) and the one below is less. If the results show improvement with a bigger or smaller block size, adjust and rescan. Then optionally apply that same result to all drives with the same make/model/rev on the same controller. The option of testing individual drives will be there too.
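
Roughly, the idea in shell form (not the app's actual code - /dev/sdX is a placeholder, and it uses a fixed-size read per sample instead of a strictly timed ten-second one to keep the sketch short):

#!/bin/bash
# Start at MaxSectorsPerRequest x logical sector size, then probe the neighbours.
DEV=/dev/sdX                                 # placeholder
SAMPLE=$(( 2 * 1024 * 1024 * 1024 ))         # bytes read per sample

speed() {   # average MB/s reading $DEV with block size $1
    dd if="$DEV" of=/dev/null bs="$1" count=$(( SAMPLE / $1 )) iflag=direct 2>&1 \
        | tail -n 1 | awk '{ print $(NF-1) }'    # summary line ends "..., 12.3 s, 161 MB/s"
}

better() {  # true if $1 beats $2 by more than a ~2% margin of error
    awk -v a="$1" -v b="$2" 'BEGIN { exit !(a > b * 1.02) }'
}

bs=$(( $(blockdev --getmaxsect "$DEV") * $(blockdev --getss "$DEV") ))
best=$(speed "$bs")
echo "baseline: $bs bytes -> $best MB/s"

while :; do
    up=$(speed $(( bs * 2 )))
    down=$(speed $(( bs / 2 )))
    if   better "$up"   "$best"; then bs=$(( bs * 2 )); best=$up
    elif better "$down" "$best"; then bs=$(( bs / 2 )); best=$down
    else break                               # neither direction improves - keep the current size
    fi
    echo "moved to: $bs bytes -> $best MB/s"
done
echo "chosen block size: $bs bytes (~$best MB/s)"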

Edited by jbartlett
Link to comment
5 minutes ago, jbartlett said:

 

I'm getting the value from "MaxSectorsPerRequest" using blockdev, and for the drives I double-checked, it matched the value in the drive's "queue" device directory.

 

However, I'm now likewise a little suspicious since all my drives are reporting the same value. Could be a coincidence, but....

 

I have not been fully satisfied with how I was doing the tests: running a balls-to-the-wall read for 3 seconds at each block size to get a baseline scan, then 8-second tests on the three block sizes whose 3-second tests had the highest results. But those 3-second tests kept showing a false spike towards the top end of the block sizes, and a scan on a system with many drives just took too long - I know people will get impatient.

 

I'm going to default to the MaxSectorsPerRequest value, which gives a good baseline starting point, but allow people to do an in-depth scan with a smarter method: do a ten-second read starting at the MaxSectorsPerRequest value, then check one step above & below it to see whether the result above is equal (within a tiny margin of error) and the one below is less. If the results show improvement with a bigger or smaller block size, adjust and rescan. Then optionally apply that same result to all drives with the same make/model/rev on the same controller. The option of testing individual drives will be there too.

I don't think the drive can perform better with transfers larger than the max sectors per request value, and I don't think the Linux disk subsystem will attempt any larger transfers.

 

That parameter should be a "contract" in the interaction between disk and OS, and the OS shouldn't try to "overclock" by sending larger blocks than what the drive promises to support.


But since hdparm shows both a max and a current value, that indicates this is a tunable parameter, so the drive could potentially be tuned to a lower value. The question then is whether your API shows the max value or the current value. My guess is that if the drive specifies a max of 16 sectors and is tuned to 8 sectors, then Linux will never send more than 8 sectors at a time. But will it be the value 8 or the value 16 that you see in your blockdev information?

 

I'm assuming you are thinking about the "--getmaxsect" parameter, but the documentation doesn't mention whether it corresponds to "max" or "current" from hdparm. I would suspect it's the "current" value, i.e. taking any tuning into consideration. The -m parameter of hdparm relates to the tunable value, i.e. "current", so it would be strange if --getmaxsect of blockdev didn't too.

http://man7.org/linux/man-pages/man8/blockdev.8.html

http://www.manpages.info/linux/hdparm.8.html
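
One way to check empirically which one --getmaxsect tracks would be to compare the two on the same drive (sdX is a placeholder):

hdparm -I /dev/sdX | grep -i "R/W multiple sector transfer"   # shows Max = ... Current = ...
blockdev --getmaxsect /dev/sdX                                # compare blockdev's value against both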

 

Link to comment
47 minutes ago, jbartlett said:

But those 3-second tests kept showing a false spike towards the top end of the block sizes

It's actually expected that you get a tiny bit better performance for really large transfers because you make fewer transitions from your program into the Linux kernel. But the difference should be negligible; that's what can be seen for transfer sizes between 32 kB and 16 MB. It's hard to tell exactly how much it matters from a single test run - there is quite a bit of random noise in every time measurement.
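
Some rough arithmetic behind the "should be negligible" claim (illustrative only - the per-call cost is an assumed ballpark):

# 5 GB read with 32 KB blocks -> ~152,000 read() calls; with 16 MB blocks -> ~300 calls.
# At an assumed few microseconds per call, the difference is well under a second
# over a ~32 s run, i.e. lost in the measurement noise.
awk 'BEGIN { printf "32K: %d calls   16M: %d calls\n", 5e9 / 32768, 5e9 / 16777216 }'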

 

But something else must be happening in your 32 MB test. It's hard to say what optimization step you trigger in your code or in the Linux kernel when going to 32 MB blocks. Will you see similar times to the 32 MB blocks if you continue with 64 MB, 128 MB, 256 MB blocks?

 

One note: some Intel processors support 2 MB and even 1 GB virtual memory pages besides the normal 4 kB pages of the old 386 chips. So it isn't impossible that you're triggering some very specific optimization code that isn't really meaningful within the scope of the current project.

Link to comment
