bobkart Posted February 2, 2016

I started a new server build and decided to try unRAID 6 official, as opposed to beta12, which I've been using up until now (and still am on my primary server). I believe I'm getting the same problem I reported with beta12 (compared to beta10a, reported here: http://lime-technology.com/forum/index.php?topic=38290.0). The workaround arrived at there was to set a longer poll_spindown time (100 instead of 10).

In trying to apply that to the official release, I see poll_attributes instead, and it's already set quite long (30 minutes). I searched around a bit and found this brief discussion: http://lime-technology.com/forum/index.php?topic=38295.msg356200#msg356200 The relevant bit is "poll_spindown remains a variable, just not tunable from the webGui."

Apparently the mdcmd command is used to adjust these tunables from the command line. What I'm not finding is a way to ask for the value of a tunable. There's 'mdcmd status', which dumps /proc/mdcmd, but I don't see poll_(anything) in that output. I could just start trying to set poll_spindown to different values to see if it fixes my problem, but ideally I could at least see what it's set to now, and also confirm that I'm actually changing it when I do try to adjust it.

(Just FYI, the reason I suspect this or something similar is that Parity Checks/Syncs proceed much more quickly when I try them with 6.0-beta10 compared to 6.1.6. I've tried all the obvious stuff like disabling page updates during parity operations, and even completely disabling page updates. I still see markedly slower speeds, both peak and average, with *nothing else changed* other than the bzroot/bzimage files. Peak speeds can be over 200 MB/s in the "good" situation versus not much more than 160 MB/s in the problematic one. That's how I know it's not drives/cables/SATA ports/motherboard/CPU/RAM/...)

Thanks for your help.
JorgeB Posted February 2, 2016

From v6.1.4 and up there's a new setting, md_sync_thresh, that can have a great impact on parity check speed, especially if you have tuned the md_num_stripes or md_sync_window values. Are you using default settings for these?

Example from one of my servers, using the default and then a tuned md_sync_thresh value:

Avg speed: 132.5 MB/s Duration: 16:46:13
Avg speed: 148 MB/s Duration: 15:01:14

This is supposed to be a GUI tunable for v6.2.
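To put the two runs quoted above side by side, here is a minimal Python sketch. The helper names are my own (hypothetical); only the speeds and durations come from the post.

```python
def hms_to_seconds(hms):
    """Convert an 'H:MM:SS' duration string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


# Figures quoted in the post: average speed (MB/s) and duration of each run.
slow_speed, fast_speed = 132.5, 148.0
slow_secs = hms_to_seconds("16:46:13")
fast_secs = hms_to_seconds("15:01:14")

speed_gain = percent_change(slow_speed, fast_speed)
time_saved = slow_secs - fast_secs

print("speed gain: %.1f%%" % speed_gain)
print("time saved: %d s (%.2f h)" % (time_saved, time_saved / 3600.0))
```

By this arithmetic the faster run is roughly 11.7% quicker on average speed and finishes about 1 hour 45 minutes sooner.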
bobkart Posted February 2, 2016 Author Share Posted February 2, 2016 Hi JB, thanks for that information. I found this mention of md_sync_thresh: http://lime-technology.com/forum/index.php?topic=44952.msg429454#msg429454 So it sounds like you're suggesting that my problem *could* be due to md_sync_thresh being (now by default) half of md_sync_window compared to (in the past) one less than md_sync_window. Your average speed and duration data points look JUST like mine: I get ~15 hours when it's working (beta10a) and more like mid-16s when not (6.1.6). Looks like you're running the same drives as I am (Seagate 8TB). What are your settings of md_sync_thresh between your two run? I'm guessing the slower run is for a setting of half of md_sync_window, is the faster run for a setting of one less than md_sync_window? And to answer your question, no I've not adjusted the two tunables that *are* still pinned out to the GUI. EDIT: I still would like to know how to reveal the values of tunables that *aren't* accessible via the GUI. Link to comment
JorgeB Posted February 2, 2016

In my limited testing there are some controllers that are faster with a md_sync_window/2 value, like the SASLP and the SAS2LP, but most others work better with a value close to the current md_sync_window-1. In the example I used above (HP N54L with 4 8TB Seagates), md_sync_window was always 1280. The optimal thresh value is anything above 1000; I'm using 1024.
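The two md_sync_thresh rules discussed in this thread (half of md_sync_window, and one less than md_sync_window) can be worked out for any window size. This is only a sketch of the arithmetic, with a hypothetical helper name; it does not read or change anything on a server. The value 384 is the md_sync_window default cited elsewhere in this thread, and 1280 is the window used in the post above.

```python
def thresh_candidates(md_sync_window):
    """Return the two md_sync_thresh starting points discussed in the thread."""
    return {
        "half_window": md_sync_window // 2,      # window/2 rule
        "window_minus_one": md_sync_window - 1,  # window-1 rule
    }


for window in (384, 1280):
    print(window, thresh_candidates(window))
```

For a window of 384 the half-window value is 192, which matches the default md_sync_thresh quoted in this thread. For a window of 1280 the two rules give 640 and 1279, so the 1024 used above sits between them, nearer the upper end.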
bobkart Posted February 2, 2016 Author Share Posted February 2, 2016 Got it, thanks again JB. Any way to read out what these are set to? The 'mdcmd status' command isn't showing any of them. Link to comment
JorgeB Posted February 2, 2016

md_sync_window is set on the disk settings page; the md_sync_thresh default is 192. I don't know of a way to show the current value.
bobkart Posted February 2, 2016 Author Share Posted February 2, 2016 I see. So you bumped md_sync_window up from the default of 384 to 1280, then settled on an md_sync_threshold of 1024, leaving md_num_stripes at the default setting of 1280. Let me know if I have that right . . . once my baseline Parity Check finishes (on 6.0-beta10a), I'll try those/similar settings on 6.1.6. Is it just coincidence that you're using 1280 for both md_sync_window and md_num_stripes? Or is there a connection. Thanks again JB. Link to comment
JorgeB Posted February 3, 2016

md_num_stripes should be higher than md_sync_window; in this case I'm using 2816.

There's a very good script to find optimal values, the unraid tunables tester. It's not v6.1 compatible, but it's easily modifiable. Unfortunately it's not aware of md_sync_thresh, so it's best to use it on releases prior to v6.1.4, where it's useful for establishing optimal md_num_stripes and md_sync_window values. Then you just need to tweak the md_sync_thresh value.
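The ordering described above can be checked before trying a set of values. A minimal Python sketch follows, with a hypothetical function name. The first rule comes from the post above; the second (md_sync_thresh below md_sync_window) is my assumption, inferred from the window/2 and window-1 values discussed in this thread.

```python
def check_tunables(md_num_stripes, md_sync_window, md_sync_thresh):
    """Return a list of warnings for a proposed set of tunables."""
    problems = []
    # Rule from the thread: md_num_stripes should be higher than md_sync_window.
    if md_num_stripes <= md_sync_window:
        problems.append("md_num_stripes should be higher than md_sync_window")
    # Assumption: thresh stays below the window (window/2 up to window-1).
    if md_sync_thresh >= md_sync_window:
        problems.append("md_sync_thresh is assumed to stay below md_sync_window")
    return problems


# Values from the post above: 2816 stripes, window 1280, thresh 1024.
print(check_tunables(2816, 1280, 1024))
# Stripes equal to the window, which the first rule flags.
print(check_tunables(1280, 1280, 1024))
```

An empty list means neither rule was violated; it says nothing about whether the values are optimal for a given controller.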
bobkart Posted February 3, 2016 Author Share Posted February 3, 2016 Ah, okay. I'll keep your numbers in mind during my testing. I have a run going now on 6.1.6 with settings based on what I used for the baseline run (on 6.0-beta10a): md_num_stripes = 40960 md_sync_window = 12288 md_sync_thresh = 12280 So far (~15%) it's yielding speeds comparable to the baseline run. Only 3% memory used (I've got way more RAM than I need on this motherboard!). I may go back to 6.0-beta10a to try the tunables tester . . . in the past when I've used it, it revealed only minor improvements, but I was using SAS HBAs then and I'm straight to the motherboard now. But this isn't likely to be the motherboard I (eventually) use for this build. I just ordered that one (ASRock Rack C2550D4I). RAM should move straight across from the board I'm using now as that's a Supermicro A1SAM-C2750F. Although I might back it down from 64GB to 32GB. I do notice that my over-the-network writes finish more quickly with lots of RAM (Linux buffering). Link to comment
bobkart Posted February 4, 2016 Author Share Posted February 4, 2016 The latest run came in at 15.9 hours, 140.7 MB/s. I think there's still room for improvement, because I had some earlier results (with just three drives instead of six) in a SAS enclosure (SE3016) that came in just under 15 hours (148.9 MB/s). I guess I'll keep playing with it. One thing I noticed, on two 6.1.6 servers I'm experimenting with right now, that after each Parity Check, where the duration is normally reported, it says unavailable (no parity-check entries logged). Some entries from last month are in the History, but nothing recent. The only thing I can think of is that I Disabled Page Updates around when it seems to have started not logging them. A search of this forum didn't turn up any similar complaints, am I the first to encounter this problem? Link to comment
JorgeB Posted February 4, 2016

Quote: "Some entries from last month are in the History, but nothing recent. The only thing I can think of is that I Disabled Page Updates around when it seems to have started not logging them. A search of this forum didn't turn up any similar complaints, am I the first to encounter this problem?"

See here: http://lime-technology.com/forum/index.php?topic=45607.msg436176#msg436176
bobkart Posted February 9, 2016 Author Share Posted February 9, 2016 My tunables testing on 6.0-beta10 yielded a time of 15 hours 5 minutes (147.3 MB/s). Switching to 6.1.6 with the same settings gave me a time within 10 seconds of that one. Now I just hope these settings work on the ASRock C2550D4I board I just got. Also, I'm still getting the no parity-check entries logged problem . . . tried the things in that thread (remove parity-check log file, switch back and forth to dashboard), no joy. Link to comment
Archived
This topic is now archived and is closed to further replies.