
How to adjust poll_spindown?


bobkart

I started a new server build, and decided to try unRAID 6 official, as opposed to beta12 as I've been using up until now (and still am on my primary server).

 

I believe I'm getting the same problem as I reported with beta12 (compared to beta10a, reported here: http://lime-technology.com/forum/index.php?topic=38290.0 ).

 

The workaround arrived at there was to set a longer poll_spindown time (100 instead of 10).  In trying to apply that to the official release, I see poll_attributes instead, and it's already set quite long (30 minutes).

 

I searched around a bit and found this brief discussion:

 

http://lime-technology.com/forum/index.php?topic=38295.msg356200#msg356200

 

The relevant bit is "poll_spindown remains a variable, just not tunable from the webGui."

 

Apparently the mdcmd command is used to adjust these tunables from the command line.  What I'm not finding is a way to ask for the value of a tunable.  There's 'mdcmd status', which dumps /proc/mdcmd, but I don't see poll_(anything) in that output.
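One way to check a status dump for a given tunable is to parse it and filter by name. This is only a minimal sketch: it assumes the dump consists of `key=value` lines, and the sample text and the helper names `parse_status` and `find_tunables` are hypothetical, not real mdcmd output or tooling.

```python
def parse_status(text):
    """Parse 'key=value' lines into a dict, skipping lines without '='."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            values[key.strip()] = value.strip()
    return values


def find_tunables(values, fragment):
    """Return only the entries whose key contains the given fragment."""
    return {k: v for k, v in values.items() if fragment in k}


# Hypothetical sample text, NOT captured from a real server.
sample = """mdState=STARTED
md_num_stripes=1280
md_sync_window=384
"""

status = parse_status(sample)
print(find_tunables(status, "poll_"))    # empty dict if no poll_* key is present
print(find_tunables(status, "md_sync"))
```

To run this against a real server, the sample string would be replaced by the saved output of 'mdcmd status'. An empty result for "poll_" only confirms that the dump does not list the tunable, not that the tunable is gone.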

 

I could just start trying to set poll_spindown to different values to see if it fixes my problem, but ideally I could at least see what it's set to now, and also see that I'm actually changing it when I do try to adjust it.

 

(Just FYI, the reason I suspect this or something similar is that Parity Checks/Syncs proceed much more quickly on 6.0-beta10 than on 6.1.6.  I've tried all the obvious stuff like disabling page updates during parity operations, and even completely disabling page updates.  I still see markedly slower speeds, both peak and average, with *nothing else changed* other than the bzroot/bzimage files.  Peak speeds can be over 200 MB/s in the "good" situation versus not much more than 160 MB/s in the problematic one.  That's how I know it's not drives/cables/SATA ports/motherboard/CPU/RAM/...)

 

Thanks for your help.

Link to comment

From v6.1.4 and up there's a new setting, md_sync_thresh, that can have a great impact on parity check speed, especially if you have tuned the md_num_stripes or md_sync_window values. Are you using default settings for these?

 

Example from one of my servers, using default and tuned md_sync_thresh values:

 

Avg speed: 132.5 MB/s   Duration: 16:46:13

Avg speed: 148 MB/s     Duration: 15:01:14

 

This is supposed to be a GUI tunable for v6.2.
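As a sanity check on the two runs above, average speed multiplied by duration should give the amount of data read per drive. The sketch below does that arithmetic; it assumes MB means 10^6 bytes, and the helper names are made up for illustration.

```python
def to_seconds(hms):
    """Convert an 'H:MM:SS' duration string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def data_read_tb(avg_mb_per_s, hms):
    """Data read in TB, assuming 1 MB = 10**6 bytes and 1 TB = 10**12 bytes."""
    return avg_mb_per_s * to_seconds(hms) / 1_000_000


# The two (average speed, duration) pairs quoted in this post.
runs = [(132.5, "16:46:13"), (148.0, "15:01:14")]

for speed, duration in runs:
    print(f"{speed} MB/s for {duration} -> {data_read_tb(speed, duration):.2f} TB")

# Time difference between the two runs, in seconds.
print(to_seconds("16:46:13") - to_seconds("15:01:14"))
```

Both runs work out to roughly 8 TB, which matches a full pass over an 8TB drive, so the two figures are directly comparable.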

Link to comment

Hi JB, thanks for that information.

 

I found this mention of md_sync_thresh: http://lime-technology.com/forum/index.php?topic=44952.msg429454#msg429454

 

So it sounds like you're suggesting that my problem *could* be due to md_sync_thresh being (now by default) half of md_sync_window compared to (in the past) one less than md_sync_window.
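To make the two candidate settings concrete, here is a minimal sketch that computes both from a given md_sync_window. The function name is hypothetical, and rounding the half value down with integer division is an assumption; 384 is the default md_sync_window mentioned later in this thread.

```python
def thresh_candidates(md_sync_window):
    """Return the two md_sync_thresh candidates discussed in this thread."""
    return {
        "half_window": md_sync_window // 2,       # assumed rounding: floor
        "window_minus_one": md_sync_window - 1,   # the older behaviour
    }


print(thresh_candidates(384))
```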

 

Your average speed and duration data points look JUST like mine: I get ~15 hours when it's working (beta10a) and more like mid-16s when not (6.1.6).  Looks like you're running the same drives as I am (Seagate 8TB).

 

What were your md_sync_thresh settings for the two runs?  I'm guessing the slower run used half of md_sync_window.  Did the faster run use one less than md_sync_window?

 

And to answer your question, no I've not adjusted the two tunables that *are* still pinned out to the GUI.

 

EDIT: I still would like to know how to reveal the values of tunables that *aren't* accessible via the GUI.

Link to comment

In my limited testing, some controllers, like the SASLP and the SAS2LP, are faster with a value of md_sync_window/2, but most others work better with a value close to md_sync_window-1. In the example I used above (HP N54L with 4 8TB Seagates), md_sync_window was always 1280. The optimal thresh value is anything above 1000; I'm using 1024.

Link to comment

I see.  So you bumped md_sync_window up from the default of 384 to 1280, then settled on an md_sync_thresh of 1024, leaving md_num_stripes at the default setting of 1280.

 

Let me know if I have that right . . . once my baseline Parity Check finishes (on 6.0-beta10a), I'll try those/similar settings on 6.1.6.

 

Is it just coincidence that you're using 1280 for both md_sync_window and md_num_stripes?  Or is there a connection?

 

Thanks again JB.

Link to comment

md_num_stripes should be higher than md_sync_window; in this case I'm using 2816. There's a very good script for finding optimal values, the unraid tunables tester. It's not v6.1 compatible, but it's easily modifiable. Unfortunately it's not aware of md_sync_thresh, so it's best to use it on releases prior to v6.1.4, where it's useful for establishing optimal md_num_stripes and md_sync_window values. Then you just need to tweak the md_sync_thresh value.
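The relationship between the three values can be written down as a quick check. This is a sketch only: `check_tunables` is a hypothetical helper, and the second rule (md_sync_thresh below md_sync_window) is an assumption inferred from the "window/2" and "window-1" values discussed in this thread, not a documented constraint.

```python
def check_tunables(md_num_stripes, md_sync_window, md_sync_thresh):
    """Return a list of problems with a set of tunables (empty if none)."""
    problems = []
    # Stated in this thread: md_num_stripes should be higher than md_sync_window.
    if md_num_stripes <= md_sync_window:
        problems.append("md_num_stripes should be higher than md_sync_window")
    # Assumption: thresh stays below the window (window/2 up to window-1).
    if md_sync_thresh >= md_sync_window:
        problems.append("md_sync_thresh is expected to be below md_sync_window")
    return problems


# Values from the N54L example in this thread.
print(check_tunables(2816, 1280, 1024))
# What it would look like if md_num_stripes had stayed at 1280.
print(check_tunables(1280, 1280, 1024))
```

The first call reports no problems; the second flags md_num_stripes, which is the distinction the previous question was asking about.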

Link to comment

Ah, okay.  I'll keep your numbers in mind during my testing.

 

I have a run going now on 6.1.6 with settings based on what I used for the baseline run (on 6.0-beta10a):

  md_num_stripes = 40960

  md_sync_window = 12288

  md_sync_thresh = 12280

 

So far (~15%) it's yielding speeds comparable to the baseline run.  Only 3% memory used (I've got way more RAM than I need on this motherboard!).

 

I may go back to 6.0-beta10a to try the tunables tester . . . in the past when I've used it, it revealed only minor improvements, but I was using SAS HBAs then and I'm straight to the motherboard now.

 

But this isn't likely to be the motherboard I (eventually) use for this build.  I just ordered that one (ASRock Rack C2550D4I).  RAM should move straight across from the board I'm using now as that's a Supermicro A1SAM-C2750F.  Although I might back it down from 64GB to 32GB.  I do notice that my over-the-network writes finish more quickly with lots of RAM (Linux buffering).

Link to comment

The latest run came in at 15.9 hours, 140.7 MB/s.

 

I think there's still room for improvement, because I had some earlier results (with just three drives instead of six) in a SAS enclosure (SE3016) that came in just under 15 hours (148.9 MB/s).

 

I guess I'll keep playing with it.

 

One thing I noticed on the two 6.1.6 servers I'm experimenting with right now: after each Parity Check, where the duration is normally reported, it says unavailable (no parity-check entries logged).

 

Some entries from last month are in the History, but nothing recent.  The only thing I can think of is that I Disabled Page Updates around when it seems to have started not logging them.

 

A search of this forum didn't turn up any similar complaints.  Am I the first to encounter this problem?

Link to comment

My tunables testing on 6.0-beta10 yielded a time of 15 hours 5 minutes (147.3 MB/s).

 

Switching to 6.1.6 with the same settings gave me a time within 10 seconds of that one.

 

Now I just hope these settings work on the ASRock C2550D4I board I just got.

 

Also, I'm still getting the no parity-check entries logged problem . . . tried the things in that thread (remove parity-check log file, switch back and forth to dashboard), no joy.

Link to comment

Archived

This topic is now archived and is closed to further replies.
