Pauven

unraid-tunables-tester.sh - A New Utility to Optimize unRAID md_* Tunables

1039 posts in this topic Last Reply


Posted (edited)
25 minutes ago, Marshalleq said:

I actually hadn't edited it.  I use VI and Nano so I don't think there's much chance of it coming from my end? - are you saying it was published to this site with windows line endings and I need to change it?  In which case I probably would need to do that with something (I run Mac) but this would be a first in 20 years!

The forum may be mucking it up for us, let me post a link from a sane paste site.

 

https://paste.ee/p/wcwWV


Some quick notes for the curious on some of my decisions:

Each call to echo costs CPU and I/O scheduler time, even when it only writes to the screen. It's generally better to produce many lines of output with a single echo or cat statement than with many separate calls. I left some of the separate calls in, just to avoid the frustration of building a single echo that did the same thing.


In some places I've opted to use cat. This happens in two circumstances:
1.) Where there's indentation. A CR-escaped echo (`echo "\`) prints the indentation literally. You could use a heredoc with echo, but it can be unpredictable depending on which implementation of echo is present on the target machine. It's easier and safer in this circumstance to use `cat`.
2.) Where there are bash glob characters such as `*`. For similar reasons as above, if you use a CR-escaped echo to print a block that contains one, it will be expanded. To work around this, you can temporarily turn off this processing and then turn it back on, with `set -f` and `set +f` respectively. That is 3 separate calls to do what a single cat handles with no problem. cat can be measurably slower in certain circumstances, but it's generally negligible, especially since we've reduced the number of calls threefold.
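
As a rough illustration of the two cases above (this is a minimal sketch, not code lifted from the script; the function name and banner text are made up), a single `cat` with a heredoc prints both the indentation and the `*` characters exactly as written:

```shell
#!/bin/bash
# Hypothetical banner text, not taken from the script.
print_banner() {
    # Quoting the delimiter ('EOF') also keeps $ and backticks literal.
    cat << 'EOF'
*** Tunables report ***
    md_sync_window : 512
    md_num_stripes : 1408
EOF
}

print_banner
```

One process is started for the whole block, and nothing in it is subject to glob expansion, so no `set -f` / `set +f` pair is needed.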

You'll also notice I use two different heredoc forms with cat, `<< EOF` and `<<-EOF`. The former keeps any whitespace to the left of the text, and requires the terminating EOF to start in the first column and be the only thing on its line. The latter strips leading tab characters (tabs only, not spaces) from every line of the heredoc, including the line holding the terminating EOF, so the terminator can be indented too. In this manner I can keep the indentation of the script's text blocks without including it in the resulting output.

 

The other changes are not a matter of preference but are agreed upon by most of the industry as "proper syntax" (`$()` for command substitution in lieu of backticks, double quotes around variables to prevent word splitting and globbing, etc.)
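
For anyone wondering why those two habits matter, here is a small sketch. The file name and the `count_args` helper are invented for the demonstration and do not appear in the script:

```shell
#!/bin/bash
# Hypothetical file name containing spaces and a glob character.
name='my report *.txt'

# $(...) nests cleanly; backticks would need backslash escaping to do this.
upper=$(printf '%s' "$(basename "/tmp/$name")" | tr '[:lower:]' '[:upper:]')

# Helper that reports how many arguments it received.
count_args() { echo "$#"; }

# Quoted: the value reaches the function as exactly one argument.
quoted=$(count_args "$name")

# Unquoted: the value is split on whitespace, and * would also be glob-expanded.
set -f    # globbing off, so the demo does not depend on the current directory
unquoted=$(count_args $name)
set +f

echo "$upper"
echo "quoted=$quoted unquoted=$unquoted"
```

The unquoted call sees several arguments instead of one, which is the kind of thing that quietly breaks a script when a path or a value contains a space.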

I didn't bother messing with the logic as I wanted the script to retain its original functionality, and just have more longevity. Currently no errors should be output during a FULLAUTO run, from my test earlier today. If you run into an error message ping me here and I'll see what I can do for now. I don't plan on adopting this project, just don't like seeing hard work lost or error messages on my terminal windows.
(I suppose I could just wrap everything with a 2>/dev/null)

 

 

PS: I've been very ill, so I've made a substantial number of typographical errors in my notes and comments in the script. The script runs okay, as I had to test it - but trust me, it didn't run the first time I tried after making all of the changes. Not even close. Hopefully my brain starts to pick up speed again over the next several days.

Edited by Xaero


Hey, I'm in the middle of running this while I cook dinner - thought I'd share the output in case you can shed some light / adjust the script.  Sorry to see you've been sick BTW, hopefully on the mend!

 

unRAID Tunables Tester v2.2 by Pauven

unraid-tunables-tester.sh: line 80: /root/mdcmd: No such file or directory

 

unraid-tunables-tester.sh: line 388: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 389: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 390: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 394: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 397: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 400: [: : integer expression expected

Test 1 - md_sync_window=384 - Test Range Entered - Time Remaining: 1s unraid-tunables-tester.sh: line 425: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 429: /root/mdcmd: No such file or directory

Test 1   - md_sync_window=384  - Completed in 240.717 seconds =   0.0 MB/s

unraid-tunables-tester.sh: line 388: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 389: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 390: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 394: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 397: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 400: [: : integer expression expected

Test 2   - md_sync_window=512  - Test Range Entered - Time Remaining: 1s unraid-tunables-tester.sh: line 425: /root/mdcmd: No such file or directory

 

It repeats from here.


Excuse me for asking if it has been answered but is there a post in this long thread that explains the setting up and how to use this script?

18 minutes ago, DanielCoffey said:

Excuse me for asking if it has been answered but is there a post in this long thread that explains the setting up and how to use this script?

Basically, just copy it to /boot and run it.


@Xaero there still seems to be quite a major issue on my hardware - see below results:

 

Completed: 0 Hrs 56 Min 11 Sec.

 

Press ENTER To Continue

 

 

unraid-tunables-tester.sh: line 55: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 598: [: : integer expression expected

Best Bang for the Buck: Test 0 with a speed of 1 MB/s

 

     Tunable (md_num_stripes): 0

     Tunable (md_write_limit): 0

     Tunable (md_sync_window): 0

 

These settings will consume 0MB of RAM on your hardware.

 

 

Unthrottled values for your server came from Test 0 with a speed of  MB/s

 

     Tunable (md_num_stripes): 0

     Tunable (md_write_limit): 0

     Tunable (md_sync_window): 0

 

These settings will consume 0MB of RAM on your hardware.

This is -299MB less than your current utilization of 299MB.

NOTE: Adding additional drives will increase memory consumption.

 

In unRAID, go to Settings > Disk Settings to set your chosen parameter values.

 

Full Test Results have been written to the file TunablesReport.txt.

Show TunablesReport.txt now? (Y to show):

 

I was expecting more for 56 minutes of testing :D

 

It doesn't look to me like it's autodetecting mdcmd which I think you said it would earlier....


Can I use CA User Scripts to run this without editing it? It does not put scripts in /boot of course.

2 minutes ago, DanielCoffey said:

Can I use CA User Scripts to run this without editing it? It does not put scripts in /boot of course.

Anything is possible, but it's not designed for that.  It's designed to run manually so you can run it and rerun it to get the optimum settings for disk performance on your particular setup.


In the interim I seem to have gotten around the problem by running ln -s /usr/local/sbin/mdcmd /root/mdcmd, which is what I had to do on the old script too if I recall correctly.

Posted (edited)
24 minutes ago, Marshalleq said:

In the interim I seem to have gotten around the problem by running ln -s /usr/local/sbin/mdcmd /root/mdcmd, which is what I had to do on the old script too if I recall correctly.

 

1 hour ago, Marshalleq said:

Hey, I'm in the middle of running this while I cook dinner - thought I'd share the output in case you can shed some light / adjust the script.  Sorry to see you've been sick BTW, hopefully on the mend!

 

unRAID Tunables Tester v2.2 by Pauven

unraid-tunables-tester.sh: line 80: /root/mdcmd: No such file or directory

 

unraid-tunables-tester.sh: line 388: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 389: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 390: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 394: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 397: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 400: [: : integer expression expected

Test 1 - md_sync_window=384 - Test Range Entered - Time Remaining: 1s unraid-tunables-tester.sh: line 425: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 429: /root/mdcmd: No such file or directory

Test 1   - md_sync_window=384  - Completed in 240.717 seconds =   0.0 MB/s

unraid-tunables-tester.sh: line 388: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 389: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 390: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 394: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 397: /root/mdcmd: No such file or directory

unraid-tunables-tester.sh: line 400: [: : integer expression expected

Test 2   - md_sync_window=512  - Test Range Entered - Time Remaining: 1s unraid-tunables-tester.sh: line 425: /root/mdcmd: No such file or directory

 

It repeats from here.

You aren't running my script.

If you do a ctrl-f on my script and type "/root" you'll see there is no hard coded reference to that location.

Your output is also showing:

"unRAID Tunables Tester v2.2 by Pauven"

I've added to the end of that and it should read:

"unRAID Tunables Tester v2.2 by Pauven - MOD by Xaero"

 

Both of these changes were intentional. One was to help differentiate, the other was because we shouldn't be using a hard coded nonstandard path like /root.

 

[Attached image: Capture+_2019-07-19-02-28-44.png]

Edited by Xaero


Hmm, that's interesting, OK I'll have another look at the script, I must have gotten the old one mixed up with the new one somehow.

Posted (edited)

Thanks, sorted that and yes it's much cleaner now.  I did run your script once previously but somehow did overwrite it with the wrong one lol.  Thanks again, I'll do some more testing again later, when there's no-one on my server to skew the results!

Edited by Marshalleq
Fixing incorrect autocorrect - Grrr.


Thanks for updating this!

 

Best Bang for the Buck: Test 2 with a speed of 163.8 MB/s

     Tunable (md_num_stripes): 1408
     Tunable (md_sync_window): 512
 
These settings will consume 66MB of RAM on your hardware.


Unthrottled values for your server came from Test 14 with a speed of 173.3 MB/s

     Tunable (md_num_stripes): 4480
     Tunable (md_sync_window): 2048

These settings will consume 210MB of RAM on your hardware.
This is 18MB more than your current utilization of 192MB.
NOTE: Adding additional drives will increase memory consumption.

In unRAID, go to Settings > Disk Settings to set your chosen parameter values.

 


I got the script working without issue on my server and got the following results...

Best Bang for the Buck: Test 3 with a speed of 151.1 MB/s
     Tunable (md_num_stripes): 1664
     Tunable (md_sync_window): 768
These settings will consume 32MB of RAM on your hardware.

Unthrottled values for your server came from Test 35 with a speed of 159.4 MB/s
     Tunable (md_num_stripes): 5952
     Tunable (md_sync_window): 2680
These settings will consume 116MB of RAM on your hardware.
This is 91MB more than your current utilization of 25MB.

 

Now to see what a difference it makes to the full Parity check. Times under vanilla settings range from 17h15 to 19h00.


It came in at 17h15 again so it shows that the default settings are at least in the right ballpark. The longer parity checks I had in the past tended to be ones where I had suffered an unclean shutdown or dropped a drive due to cable wiggle and it needed a good look at everything.

Posted (edited)

I was surprised to see that this script starts a parity check when you run it. Is this normal?

 

Edit: Tried canceling the parity check, but it just restarted when the next test started.

Edited by wgstarks

1 hour ago, wgstarks said:

I was surprised to see that this script starts a parity check when you run it. Is this normal?

Yes, since the point of this is to optimize the tunables for parity checks.

 

Side benefit is that it might improve regular performance as well, but it benchmarks parity check times.

On 7/19/2019 at 8:31 PM, StevenD said:

Thanks for updating this!

 


Best Bang for the Buck: Test 2 with a speed of 163.8 MB/s

     Tunable (md_num_stripes): 1408
     Tunable (md_sync_window): 512
 
These settings will consume 66MB of RAM on your hardware.


Unthrottled values for your server came from Test 14 with a speed of 173.3 MB/s

     Tunable (md_num_stripes): 4480
     Tunable (md_sync_window): 2048

These settings will consume 210MB of RAM on your hardware.
This is 18MB more than your current utilization of 192MB.
NOTE: Adding additional drives will increase memory consumption.

In unRAID, go to Settings > Disk Settings to set your chosen parameter values.

 

It seems ~4x the default value gives the best result.

On 7/19/2019 at 4:27 PM, Marshalleq said:

MB: Asus Prime X370-A Memory: 128GB Corsair Ballistix BLS16G4D26BFSE CPU: AMD Threadripper 1950X

PCIe: Geforce 1070ti - Intel Dual Port e1000e NIC - Dell Perc H310 in IT mode

Notice mistake on MB model 😉

4 hours ago, wgstarks said:

I was surprised to see that this script starts a parity check when you run it. Is this normal?

 

Edit: Tried canceling the parity check, but it just restarted when the next test started.

 

3 hours ago, jonathanm said:

Yes, since the point of this is to optimize the tunables for parity checks.

 

Side benefit is that it might improve regular performance as well, but it benchmarks parity check times.

 

To add to jonathanm's answer, the script starts a series of partial, read-only, non-correcting parity checks, each with slightly tweaked parameters, and logs the performance of each combination of settings.  Essentially, it is measuring the peak performance of reading from all discs simultaneously, and showing how that can be tweaked to improve peak performance.

 

 

6 hours ago, DanielCoffey said:

It came in at 17h15 again so it shows that the default settings are at least in the right ballpark. The longer parity checks I had in the past tended to be ones where I had suffered an unclean shutdown or dropped a drive due to cable wiggle and it needed a good look at everything.

 

Improving peak performance is not the same thing as improving the time of a full parity check, as your parity check only spends a few minutes at the beginning of the drives where peak performance has an impact, as performance gradually tapers off from the beginning of your drive to the end.  If your peak performance was abnormally slow (i.e. 50 MB/s), then that would affect a much larger percentage of the parity check, and improving that to 150MB/s would make a huge improvement in parity check times, but increasing from 164 MB/s to 173 MB/s won't make much of a difference, since essentially you were already close to max performance and that small increase will only affect perhaps the first few % of the drive space.

 

In a similar way, I could improve aerodynamics on my car to increase top speed from 164 MPH to 173 MPH, but that won't necessarily help my work commute where I'm limited to speeds below 65 MPH.  But if for some reason my car couldn't go faster than 50 MPH, any increase at all would help my commute time.
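
To put rough numbers on that argument, here is a toy model, with every figure invented for illustration: assume a disk's speed falls linearly from 180 MB/s on the outer tracks to 90 MB/s on the inner tracks, that tuning only changes the peak speed the array can reach, and that the disk holds 8,000,000 MB. None of this comes from the tunables tester itself; `pass_hours` is a hypothetical helper.

```shell
#!/bin/bash
# Estimate full-pass time (hours) under a toy model: disk speed falls linearly
# from $2 MB/s (outer tracks) to $3 MB/s (inner tracks), and tuning caps
# throughput at $1 MB/s. $4 is the disk size in MB. All numbers are made up.
pass_hours() {
    awk -v cap="$1" -v outer="$2" -v inner="$3" -v size="$4" 'BEGIN {
        steps = 10000; secs = 0
        for (i = 0; i < steps; i++) {
            x = (i + 0.5) / steps                 # midpoint of this slice
            v = outer - (outer - inner) * x       # what the disk could deliver here
            if (v > cap) v = cap                  # what the tunables allow
            secs += (size / steps) / v
        }
        printf "%.2f\n", secs / 3600
    }'
}

for cap in 50 150 164 173; do
    echo "peak ${cap} MB/s -> $(pass_hours "$cap" 180 90 8000000) h"
done
```

Under these assumptions, raising a 50 MB/s cap to 150 MB/s cuts the estimated time by more than half, while going from 164 MB/s to 173 MB/s changes it by only a few minutes, because the cap stops mattering once the disk itself becomes the slower part.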

 

There are a handful of drive controllers (like the one in my sig) that suffer extremely slow parity check speeds with stock Unraid settings, so I see a huge performance increase from tweaking the tunables.

 

There is also some evidence that tweaking these tunables can help with multi-tasking (i.e. streaming a movie without stuttering during a parity check), and for some users this seems to be true.  I know there are some users who have concerns that maximizing parity check speed takes away bandwidth for streaming, though I don't think we ever actually saw evidence of this.

 

 

On 7/19/2019 at 12:11 AM, Xaero said:

I didn't bother messing with the logic as I wanted the script to retain its original functionality, and just have more longevity. 


That's a shame, as that is really what is needed to make this script compatible with 6.x.  LT changed the tunables from 5.x to 6.x, and the original script needs updating to work properly with the 6.x tunables.  Fixing a few obsolete code segments to make it run without errors on 6.x doesn't mean you will get usable results on 6.x.

 

I had created a beta version for Unraid 6.x a while back, but testing showed it was not producing usable results.  I documented a new tunables testing strategy based on those results, but never did get around to implementing them.  It seems that finding good settings on 6.x is harder than it was for 5.x - possibly because 6.x just runs better and there's less issues to be resolved. 

 

I still have my documented changes for the next version around here somewhere...

 

 

On 7/19/2019 at 12:11 AM, Xaero said:

I don't plan on adopting this project, just don't like seeing hard work lost or error messages on my terminal windows.

 

That's another shame.  Seems like you know what you're doing, more so than I do with regards to Linux.  I'm a Windows developer, and my limited experience with Linux and Bash (that's what this script is, right?) is this script.  For me to pick it up again, I have to essentially re-learn everything.  I keep thinking a much stronger developer than I will pick this up someday.

 

 

I'm not trying to convince users not to use this tool, and I certainly appreciate someone trying to keep it alive, but I did want to clarify that the logic needs improvement for Unraid 6.x, and you may not get accurate recommendations with this Unraid 5.x tunables tester.

 

Paul

4 hours ago, Benson said:

Notice mistake on MB model 😉

Oh yeah, I should probably update that!  Thanks ;)


Hello!
 

3 hours ago, Pauven said:

I had created a beta version for Unraid 6.x a while back, but testing showed it was not producing usable results.  I documented a new tunables testing strategy based on those results, but never did get around to implementing them.  It seems that finding good settings on 6.x is harder than it was for 5.x - possibly because 6.x just runs better and there's less issues to be resolved. 

 

I still have my documented changes for the next version around here somewhere...

This also matches my surface-level findings from running the old version of the script: the increase is marginal at best, but it's still worth squeezing out any performance you can. I know that between 3.x and 4.x, the Linux kernel has seen some pretty insane improvements in I/O overhead to begin with, and Unraid gets to take advantage of that out of the box. It also means that for a (relatively) old Linux goon like myself, I have a lot of new learning to pick up about which kernel tunables can yield which results. We had a science. That science has changed. I'd also be curious to see how BFQ would affect Unraid's performance, including during parity checks. I can only imagine that BFQ would enable much higher total throughput, at the cost of substantially higher CPU utilization. If I had more free time, I'd play with it... which brings us to the next two points...
 

3 hours ago, Pauven said:
On 7/18/2019 at 11:11 PM, Xaero said:

I didn't bother messing with the logic as I wanted the script to retain its original functionality, and just have more longevity. 


That's a shame, as that is really what is needed to make this script compatible with 6.x.  LT changed the tunables from 5.x to 6.x, and the original script needs updating to work properly with the 6.x tunables.  Fixing a few obsolete code segments to make it run without errors on 6.x doesn't mean you will get usable results on 6.x.

Indeed, the point was to get something from the script usable with the existing code base and logic - not to create something new. The testing automation ideology implemented isn't fundamentally broken - just the amount of flexibility is. Ideally, one would create a much more in-depth test pattern for a finalized script for 6.x and Linux kernel versions in the 4.x->5.x family. You'd need to find ideal baseline values for several tunables, and then play with those values in tandem to find a suitable combination, since some of these values will impact each other. I think, looking at the behavior of this script in the real world, and using a bit of computer science, a lot of the "testing time" can be eliminated. For example, performance always seems to be best at direct power-of-two intervals. We usually see okay performance at 512, with a slight drop off at 640, trending downward until we hit 1024. Knowing this, we can make big jumps upward until we see a decline, and then make smaller jumps downward in value to reach a nominal value with less "work". Obviously this would be great for getting initial rough values quickly. Higher granularity could then be used to optimize further. There's probably some math that could be done with disk allocation logic (stripe, sector, cache size et al) for even further optimizing, but that's a pretty large amount of research that needs to be done.
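
A sketch of that "big jumps, then small steps" idea, purely for illustration. The `measure` function is a stand-in that returns a made-up speed curve peaking at 896; a real version would have to time a partial parity check for each value, and all function names here are hypothetical:

```shell
#!/bin/bash
# Stand-in for a real measurement: returns a made-up speed (MB/s) that
# peaks at a value of 896. A real run would time a partial parity check.
measure() {
    local d=$(( $1 > 896 ? $1 - 896 : 896 - $1 ))
    echo $(( 170 - d / 32 ))
}

# Coarse pass: double the value until the measured speed stops improving.
coarse_search() {
    local w=$1 best_w=$1 best_s s
    best_s=$(measure "$w")
    while (( w < 8192 )); do
        w=$(( w * 2 ))
        s=$(measure "$w")
        (( s <= best_s )) && break
        best_w=$w; best_s=$s
    done
    echo "$best_w"
}

# Fine pass: probe around the coarse winner in smaller steps.
fine_search() {
    local center=$1 step=$2 w s best_w=$1 best_s
    best_s=$(measure "$center")
    for (( w = center / 2; w <= center * 2; w += step )); do
        s=$(measure "$w")
        if (( s > best_s )); then best_w=$w; best_s=$s; fi
    done
    echo "$best_w"
}

rough=$(coarse_search 128)
echo "coarse pick: $rough, refined pick: $(fine_search "$rough" 128)"
```

With this particular fake curve the coarse pass needs only a handful of measurements to land near the peak, and the fine pass then finds a slightly better value between the powers of two. Whether real hardware behaves as smoothly as the fake curve is exactly the open question.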

I actually don't know what tunables are available currently (It's not hard to dump them with mdcmd) or what their direct impact is on various workloads.

 

3 hours ago, Pauven said:
On 7/18/2019 at 11:11 PM, Xaero said:

I don't plan on adopting this project, just don't like seeing hard work lost or error messages on my terminal windows.

 

That's another shame.  Seems like you know what you're doing, more so than I do with regards to Linux.  I'm a Windows developer, and my limited experience with Linux and Bash (that's what this script is, right?) is this script.  For me to pick it up again, I have to essentially re-learn everything.  I keep thinking a much stronger developer than I will pick this up someday.

I'll take the compliment. I'm still ever-improving. I'm an amateur developer, but I have many, many years of experience with Linux as a platform, including building automated install suites for various distributions (though Arch is probably the most friendly for this type of application). And various performance tuning for other applications, specifically low-latency gaming applications related to rhythm-games. You'd be surprised how much it is possible to optimize each step of the chain when you really dig into it. Unfortunately, life has sucked all of my free time away from me. I want to spend more time focusing on projects like this - I just don't have that time available. Spending an hour or two while I'm sick fixing a script is affordable, spending a couple of hours a day for several weeks designing and implementing a replacement isn't an option yet.

It's interesting that even though the two tunables the script currently tests are not as relevant today, there is still a fairly substantial performance improvement available using them. No, it won't shave hours off your parity check, but it will make the difference between saturating gigabit connections and not. In my case, I have a bottleneck upstream that caps my parity check speed around 70 MB/s. Once I get rid of this slow port expander...

9 minutes ago, Xaero said:

Indeed, the point was to get something from the script usable with the existing code base and logic - not to create something new. The testing automation ideology implemented isn't fundamentally broken - just the amount of flexibility is. Ideally, one would create a much more in-depth test pattern for a finalized script for 6.x and Linux kernel versions in the 4.x->5.x family.

 

While the testing ideology isn't broken, due to the v6.x changes in tunables, the script no longer provides a complete picture.  Going from memory, nr_requests is a new tunable that affects performance, and I had planned to include it in my v6 compatible beta, and in testing we saw some interesting results around 4 on some machines. 

 

And while md_write_limit is gone in v6, we have a new md_sync_thresh that needs to be tested.  So instead of 3 tunables, with v6 there's really at least 4 - which only further complicates testing.

 

 

12 minutes ago, Xaero said:

I actually don't know what tunables are available currently (It's not hard to dump them with mdcmd) or what their direct impact is on various workloads.

 

You can find them on the Disk Settings.  Note, some of them wouldn't be part of this type of testing, like md_write_method is beyond the scope of what this tunables tester is doing (and is irrelevant for parity checks anyway).

 

I vaguely recall doing some type of testing with NCQ - but I don't think I ever automated that.  I think that was more of a manual effort, test with it both on and off, and see what works for your machine.

 

poll_attributes I think is a newer tunable added in a fairly recent version - I don't think it existed back when I was working on my v6.x beta tunables tester.  I don't even know what this does.

 

[Attached image: image.thumb.png.e9c6ee8029901d7d353f7e12dc749980.png]

 

18 minutes ago, Xaero said:

I think, looking at the behavior of this script in the real world, and using a bit of computer science, a lot of the "testing time" can be eliminated. For example, performance always seems to be best at direct power-of-two intervals. We usually see okay performance at 512, with a slight drop off at 640, trending downward until we hit 1024. Knowing this, we can make big jumps upward until we see a decline, and then make smaller jumps downward in value to reach a nominal value with less "work". Obviously this would be great for getting initial rough values quickly. Higher granularity could then be used to optimize further. There's probably some math that could be done with disk allocation logic (stripe, sector, cache size et al) for even further optimizing, but that's a pretty large amount of research that needs to be done.

 

Believe it or not, the v5 script does a lot of that.  At runtime it provides multiple test types to choose from, and the quicker tests do exactly as you describe.  All of the tests try to skip around, hunting for a value region that performs better, then focusing in on that region and testing more values.  Older versions of the tunables tester made big jumps and tested much faster, but as I refined the tool I added more detailed testing that didn't skip around as much, because I've also found, from examining dozens upon dozens of user submitted results from different Unraid builds, that it is a mistake to make any major assumptions about what values work best.  There are some machines that get faster with smaller values (below 512).  Performance is not always best at powers of two intervals - for some machines yes, but not all. 

 

These tunables seem to be controlling how Unraid communicates with the drive controller.  And there are dozens of different drive controllers, each with unique behaviors and tuning requirements.  Add in that many users have "Frankenstein" builds using different combinations of controller cards (some from the CPU/northbridge, some from the chipset/southbridge, and the rest from sometimes mismatched controller cards), and what you end up with is an entirely unpredictable set of tuning parameters to make that type of machine perform well.

 

While I don't disagree with your sentiment, making a tunables tester that works equally well on every type of build in the real world doesn't align very well with expedited testing that skips around too much - what works great on one machine doesn't work at all on another.  It was very frustrating trying to identify a testing pattern that worked well on any and all machines.  To me, the big picture is that it's better to spend 8 hours identifying a set of parameters that provide a real-world benefit, rather than wasting 1 hour to come up with parameters that aren't really that great.  It's not like you run this thing every day - for most users it is a run once and then never again.  Trying to save time for something you run once, at the cost of accuracy, isn't ideal.

 

Paul


 

52 minutes ago, Pauven said:

While the testing ideology isn't broken, due to the v6.x changes in tunables, the script no longer provides a complete picture.  Going from memory, nr_requests is a new tunable that affects performance, and I had planned to include it in my v6 compatible beta, and in testing we saw some interesting results around 4 on some machines. 

 

And while md_write_limit is gone in v6, we have a new md_sync_thresh that needs to be tested.  So instead of 3 tunables, with v6 there's really at least 4 - which only further complicates testing.

51 minutes ago, Pauven said:

You can find them on the Disk Settings.  Note, some of them wouldn't be part of this type of testing, like md_write_method is beyond the scope of what this tunables tester is doing (and is irrelevant for parity checks anyway).

Even then, this is an incomplete picture. Those are just the tunables exposed via the webui. Most likely there are even more tunables available than that. And yes, they do directly change how data is lined up to be written to the array. The latter point about mismatched hardware is why I think a different I/O scheduler could also make a pretty big impact. I believe I read in the past of people switching to CFQ to speed up write operations (currently unraid defaults to noop, which is more or less a hands-off I/O scheduler). Something like BFQ could probably mask or mitigate the majority of the I/O wait time for different operations. In particular, read operations would probably see a big relief. We also have a new kernel option available in 4.12+: blk-mq and scsi-mq. These are multi-queue options designed for dense multi-disk storage applications to have substantially lower access times. For example, the Kyber scheduler, which is similar in principle to BFQ, but simpler, has been shown to drop SSD array times from 8ms to 1ms or lower. That's a huge impact when there are hundreds or thousands of operations in the queue. I'd have to read more, but I believe at the kernel level, with the old (current) system, the I/O queues are each singular. Meaning, every device using the noop queue shares that one queue, whereas in the mq implementation, each block or SCSI device gets its own separate queue. That means increased CPU overhead for command routing, but each device gets more readily available queue handling. This would in theory mean less blocking from varying disk arrangements. A slower disk with 16m of cache won't be holding back a 10krpm disk with 256m of cache, for example.

These sorts of implications are why testing MQ and BFQ/Kyber is higher on my priority list. I feel like limetech should probably be looking at these options as well - but they are brand new.
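
If you want to see which scheduler each disk is using right now, the kernel exposes it through sysfs, with the active one shown in square brackets. A minimal sketch (the `active_scheduler` helper name is made up; the sysfs path is the standard Linux one, but which schedulers are listed depends on how your kernel was built):

```shell
#!/bin/bash
# Extract the active scheduler from a sysfs "scheduler" line such as
# "noop deadline [cfq]"; the kernel marks the active one with brackets.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# List the active scheduler for every block device that exposes one.
for f in /sys/block/*/queue/scheduler; do
    [ -r "$f" ] || continue
    dev=${f#/sys/block/}; dev=${dev%%/*}
    printf '%s: %s\n' "$dev" "$(active_scheduler < "$f")"
done
```

Switching is done by writing one of the listed names back to the same file as root, which only works for schedulers the running kernel actually offers, so treat any experiment along these lines as something to test on a non-critical box first.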
