• Power consumption is not stable since Unraid 6.10 / 6.11


    mgutt
    • Solved Minor

    Found the problem. Read the first comment of this bug report.

     

    TL;DR

    Multiple users reported spikes in their power consumption after updating to Unraid 6.10 and this is still present in Unraid 6.11:

     

    https://forums.unraid.net/topic/105909-mein-10-zoll-server/?do=findComment&comment=1143526

    [Image: power consumption graph]

     

    https://forums.unraid.net/topic/108966-strom-sparen-mit-powertop-stromverbrauch-von-unraid-verbessern/?do=findComment&comment=1177306

    [Image: power consumption graph]

     

    https://forums.unraid.net/topic/105909-mein-10-zoll-server/?do=findComment&comment=1176094

    [Image: power consumption graph]

     

    In Unraid 6.9 the power consumption was relatively stable. Example:

    [Image: power consumption graph under Unraid 6.9]

     

     

    I will try to monitor the processes on my test server, which uses Unraid 6.11.1. Hopefully I can find out which process causes those spikes. I will update this bug report later...

     

    EDIT: This is what came to my mind:

    - close all web GUI windows

    - connect through SSH and execute this:

    watch -d "ps -Ao comm,pid,pcpu --sort=-pcpu | head -n 12"

    - watch the power consumption in parallel...
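
    To line up CPU load with the readings of a power meter afterwards, the same `ps` call can be logged with timestamps instead of being watched live. This is only a minimal sketch: the function names and the log path /tmp/top_procs.log are made up for this example.

```shell
# Prefix every line read from stdin with the timestamp string given as $1.
stamp_lines() {
  while IFS= read -r line; do
    printf '%s %s\n' "$1" "$line"
  done
}

# Take one snapshot of the top CPU consumers (same ps call as in the steps above)
# and stamp each line with the current date and time.
snapshot() {
  ps -Ao comm,pid,pcpu --sort=-pcpu | head -n 12 | stamp_lines "$(date '+%F %T')"
}
```

    Possible usage over SSH: `while sleep 2; do snapshot >> /tmp/top_procs.log; done`, then compare the timestamps with the spikes on the power meter.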

     

    EDIT2:

    Every couple of minutes I was able to capture an "rpcd_lsad" process:

    [Image: process list showing "rpcd_lsad"]

     

    Compared with the default situation:

    [Image: process list in the default situation]

     

    In addition you can see that "emhttpd", "update_3", "file_manager", "device_list" and "parity_list" are constantly producing load.

     

    1.) The "emhttpd" process should be the main web server process that delivers the Unraid WebGUI. Any possible improvements need to be checked by @limetech.

     

    2.) The "update_3" process is a PHP script:

    # pgrep -af update_3
    15795 /usr/bin/php -q /usr/local/emhttp/webGui/nchan/update_3

    It seems to be constantly logging the network status of multiple network devices and ports. Is it really necessary to do this with the overhead of the PHP parser? And is it necessary to run this all the time?

     

    3.) The "file_manager" process is caused by the new Dynamix File Manager plugin by @bonienl. I wonder why it's active although I do not have the GUI open?! After uninstalling the plugin it disappears (logical), after installing it is not present (?!), but after opening a directory and closing the GUI it stays present:

    # pgrep -af file_manager
    24508 /usr/bin/php -q /usr/local/emhttp/plugins/dynamix.file.manager/nchan/file_manager

    I think it should auto-exit after the GUI has been closed.

     

    4.) The "rpcd_lsad" process is related to SMB:

    # find / -xdev -name rpcd_lsad
    /usr/libexec/samba/rpcd_lsad

    Not sure why it is starting and stopping every couple of minutes as I'm not using Active Directory?!

     

    5.) Are the "parity_list" and "device_list" PHP scripts really necessary while the GUI is closed?!

    root@Tower:~# pgrep -af parity_list
    2893 /usr/bin/php -q /usr/local/emhttp/webGui/nchan/parity_list
    root@Tower:~# pgrep -af device_list
    2889 /usr/bin/php -q /usr/local/emhttp/webGui/nchan/device_list

     

    6.) The bash script "disk_load" is constantly writing disk stats to an ini file. Is it really needed although the GUI is closed?

    root@Tower:~# pgrep -af disk_load
    2891 /bin/bash /usr/local/emhttp/webGui/nchan/disk_load

     

    EDIT3:

    I deleted the file (it's a test server ;) ), and the syslog proves that it is executed every minute:

    root@Tower:~# rm /usr/libexec/samba/rpcd_lsad
    root@Tower:~# tail -f /var/log/syslog
    ...
    Oct  8 15:12:01 Tower  winbindd[2222]: [2022/10/08 15:12:01.690211,  0] ../../source3/winbindd/winbindd_samr.c:71(open_internal_samr_conn)
    Oct  8 15:12:01 Tower  winbindd[2222]:   open_internal_samr_conn: Could not connect to samr pipe: NT_STATUS_CONNECTION_REFUSED
    Oct  8 15:13:01 Tower  winbindd[2222]: [2022/10/08 15:13:01.766520,  0] ../../source3/winbindd/winbindd_samr.c:71(open_internal_samr_conn)
    Oct  8 15:13:01 Tower  winbindd[2222]:   open_internal_samr_conn: Could not connect to samr pipe: NT_STATUS_CONNECTION_REFUSED
    Oct  8 15:14:01 Tower  winbindd[2222]: [2022/10/08 15:14:01.844427,  0] ../../source3/winbindd/winbindd_samr.c:71(open_internal_samr_conn)
    Oct  8 15:14:01 Tower  winbindd[2222]:   open_internal_samr_conn: Could not connect to samr pipe: NT_STATUS_CONNECTION_REFUSED
    Oct  8 15:14:01 Tower  winbindd[2222]: [2022/10/08 15:14:01.861246,  0] ../../source3/winbindd/winbindd_samr.c:71(open_internal_samr_conn)
    Oct  8 15:14:01 Tower  winbindd[2222]:   open_internal_samr_conn: Could not connect to samr pipe: NT_STATUS_CONNECTION_REFUSED
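
    The once-per-minute pattern can also be checked by counting the error lines per minute instead of reading the log by eye. A small sketch; the function name is made up, and it only matches the exact error text shown in the log above:

```shell
# Count the winbindd "samr pipe" errors per minute in syslog lines read from stdin.
# Output format: "<month> <day> <HH:MM> <count>", one line per minute, sorted.
samr_errors_per_minute() {
  awk '/open_internal_samr_conn: Could not connect to samr pipe/ {
         count[$1 " " $2 " " substr($3, 1, 5)]++
       }
       END { for (minute in count) print minute, count[minute] }' | sort
}
```

    Possible usage: `samr_errors_per_minute < /var/log/syslog`. For the excerpt above this gives one error at 15:12, one at 15:13 and two at 15:14.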

     

    EDIT4:

    Before disabling SMB:

    [Image: power consumption graph before disabling SMB]

     

    After disabling SMB (only a small improvement visible):

    [Image: power consumption graph after disabling SMB]

     

    EDIT5: After killing the "parity_list" and "device_list" processes it's still fluctuating, but the overall consumption is reduced:

    [Image: power consumption graph after killing "parity_list" and "device_list"]

     

    EDIT6: After killing "update_3"

    [Image: power consumption graph after killing "update_3"]

     

    As you can see the power consumption has become lower, but it's still fluctuating.

     

    EDIT7: After killing all PHP processes (update_1, update_2, ...) and the bash script /usr/local/emhttp/webGui/nchan/disk_load, it looks much more like Unraid 6.9:

    [Image: power consumption graph after killing all PHP processes and disk_load]

     

    In comparison, Unraid 6.9 only had the constantly running diskload script:

    root@Tower:~# pgrep -af emhttp
    7362 /usr/local/sbin/emhttpd
    7590 /bin/bash /usr/local/emhttp/webGui/scripts/diskload
    root@Tower:~# pgrep -af php
    11519 php-fpm: master process (/etc/php-fpm/php-fpm.conf)

     

    I will reboot the server and do some more tests, but I think the constantly running PHP scripts are the reason for the higher power consumption and the spikes (of course, the SMB issue has to be checked separately).

     




    User Feedback

    Recommended Comments

    Ok, I can reproduce the spikes and "solve" the issue as follows:

     

    - Reboot server

    - connect through SSH and execute the following (it returns only a single process):

    root@Tower:~# pgrep -af emhttp
    1835 /usr/local/sbin/emhttpd

    - open the WebGUI

    - open the Main page

    - open the Dashboard

    - close the Browser

    - now the following 10 Bash and PHP scripts keep running, and the power consumption rises by several watts and fluctuates:

    root@Tower:~# pgrep -af emhttp
    1835 /usr/local/sbin/emhttpd
    10100 /bin/bash /usr/local/emhttp/webGui/nchan/disk_load
    10240 /usr/bin/php -q /usr/local/emhttp/webGui/nchan/notify_poller
    10242 /usr/bin/php -q /usr/local/emhttp/webGui/nchan/session_check
    10360 /usr/bin/php -q /usr/local/emhttp/webGui/nchan/device_list
    10514 /usr/bin/php -q /usr/local/emhttp/webGui/nchan/ups_status
    10364 /usr/bin/php -q /usr/local/emhttp/webGui/nchan/parity_list
    10506 /usr/bin/php -q /usr/local/emhttp/webGui/nchan/wg_poller
    10508 /usr/bin/php -q /usr/local/emhttp/webGui/nchan/update_1
    10510 /usr/bin/php -q /usr/local/emhttp/webGui/nchan/update_2
    10512 /usr/bin/php -q /usr/local/emhttp/webGui/nchan/update_3

     

    - now I kill all scripts and remove the pid file:

    pkill -cf /usr/local/emhttp/webGui/nchan
    rm /var/run/nchan.pid
    

    - power consumption is low and stable (does not show any spikes)

     

    - ...until the WebGUI is opened again, which restarts all of the above scripts

     

    Note: The script /usr/local/emhttp/webGui/nchan/ups_status is the only one which is automatically stopped, but only if you load an additional WebGUI page after visiting the Dashboard. If you directly close the browser, it will run forever, too.
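
    To check quickly which of these background scripts are still alive after closing the browser, the `pgrep` output can be reduced to the script names. A minimal sketch; the function name is made up, and it only looks at scripts below webGui/nchan (so the File Manager plugin's script is not included):

```shell
# Print the names of the webGui nchan scripts found in `pgrep -af` output on stdin.
# The script path is expected to be the last field of each line.
nchan_names() {
  awk '$NF ~ /\/webGui\/nchan\// { name = $NF; sub(/.*\//, "", name); print name }'
}
```

    Possible usage: `pgrep -af emhttp | nchan_names`. Piping the result into `wc -l` should count 10 scripts for the list above, and none right after a reboot.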

     

    @bonienl ich777 said I should notify you 😉

    Link to comment

    If you play around a bit with:

    pkill -cf /usr/local/emhttp/webGui/nchan
    rm /var/run/nchan.pid

     

    and save maybe 5 W per hour, then with 10,000 Unraid users running 24/7 on 6.10/6.11 that already adds up to 50,000 W saved per hour. At 0.70 cents per kWh that comes to ... converted into Deutsche Marks and horses....

    Joking aside, I hope for a fix soon, because from my point of view the problem is not that "Minor". And of course thanks to @mgutt for his consistently good contributions to the community.

    Edited by T-Birth
    Link to comment

    I won't classify this as a bug, because it is all intended behavior.

    The Nchan processes are started as soon as you visit the Main page or Dashboard page in your web browser.

    They stay running so that multiple users or browsers can open the GUI and view the pages at the same time with up-to-date info.

     

    There is, however, a mechanism to stop a background nchan process as soon as you leave the Main page or Dashboard page. It only works when there is a single user and a single browser viewing the GUI.

    To use this start/stop mechanism you'll need to edit two files:

     

    ArrayOperation.page

    Nchan="device_list,disk_load,parity_list"

    change to:

    Nchan="device_list:stop,disk_load:stop,parity_list:stop"

     

    DashStats.page

    Nchan="wg_poller,update_1,update_2,update_3,ups_status:stop"

    change to:

    Nchan="wg_poller:stop,update_1:stop,update_2:stop,update_3:stop,ups_status:stop"
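
    The same change can be sketched as a text filter that appends ":stop" to every entry of an `Nchan="..."` line that does not have it yet. This is only an illustration: the function name is made up, it works on stdin/stdout and does not touch any file, and where the two .page files live is not stated here.

```shell
# Read a .page file from stdin and append ":stop" to every entry of its
# Nchan="..." line that does not already end in ":stop". Other lines pass through.
add_stop() {
  awk '
    /^Nchan="/ {
      list = $0
      sub(/^Nchan="/, "", list)   # strip the leading Nchan="
      sub(/"$/, "", list)         # strip the closing quote
      n = split(list, items, ",")
      out = ""
      for (i = 1; i <= n; i++) {
        if (items[i] !~ /:stop$/) items[i] = items[i] ":stop"
        out = out (i > 1 ? "," : "") items[i]
      }
      print "Nchan=\"" out "\""
      next
    }
    { print }
  '
}
```

    Possible usage: `add_stop < DashStats.page > DashStats.page.new`, then compare both files before replacing the original.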
    

     

    You can view the running nchan processes in an easy way by opening a terminal window and using:

    htop -F nchan

     

    Keep the terminal window open and navigate through the GUI; you will see processes being started and stopped when visiting or leaving the Main page and Dashboard page.

     

    P.S. Regarding the file manager: this process runs because the file manager allows file operations to continue in the background. E.g. you can start a (large) copy operation and then close the GUI; the copy operation will continue until done.

    When you re-open the GUI and visit the file manager again, it will display the current progress status.

     

    Link to comment
    On 10/20/2022 at 2:35 PM, T-Birth said:

    a fix, because from my point of view the problem is not that "Minor"

     

    This is all relative. My main server is an AMD Epyc based system and consumes around 80W in idle time.

    When I stop the nchan processes, it saves me 1W, not really helpful in my case. True power savings I only get by powering off the server.


     

    Link to comment
    On 10/20/2022 at 2:35 PM, T-Birth said:

    and save maybe 5 W per hour, then with 10,000 Unraid users running 24/7 on 6.10/6.11 that already adds up to 50,000 W saved per hour. At 0.70 cents per kWh that comes to ... converted into Deutsche Marks and horses....

     

    You need to do the math right!

     

    Average kWh price in NL = €0,73.

    A 5W power reduction would save me: 5 / 1000 * 0,73 * 24 * 30 = €2,63 per month
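
    The same calculation as a small helper, so other wattages and prices can be plugged in. The function name is made up; it assumes a constant load and a 30-day month, exactly as in the formula above:

```shell
# Monthly cost of a constant load: watts / 1000 * price per kWh * 24 h * 30 days.
# $1 = watts, $2 = price per kWh (use a decimal point, e.g. 0.73)
monthly_cost() {
  awk -v watts="$1" -v price="$2" 'BEGIN { printf "%.2f\n", watts / 1000 * price * 24 * 30 }'
}
```

    `monthly_cost 5 0.73` prints 2.63, matching the figure above.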

     

    Link to comment
    6 hours ago, bonienl said:

    saves me 1W

    Maybe valid for your system, but mine consumes 3 or 4 W more, or in other words 30 to 40% more, than with Unraid 6.9. For me it's a huge deal breaker. I will not update as long as this isn't fixed.

    Link to comment
    32 minutes ago, mgutt said:

    Maybe valid for your system, but mine consumes 3 or 4 W more, or in other words 30 to 40% more, than with Unraid 6.9. For me it's a huge deal breaker. I will not update as long as this isn't fixed.

     

    In my earlier answer I explained what you can do to fix your problem.

     

    Link to comment
    1 hour ago, mgutt said:

    I will not update as long as this isn't fixed.

    Same here, but I'll try the described start/stop mechanism to see if I can get close to the consumption of 6.9.2.

    In 6.9.2 the power consumption of my server is around 0,62 kWh per day.

    I tested 6.11 for nearly a month. Power consumption was around 0,72 kWh per day, which is 16% more than with 6.9.
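
    Assuming both figures are meant as energy per day (kWh), the percentage and the corresponding average power draw can be checked like this. The function names are made up for the example:

```shell
# Percentage increase from a "before" value to an "after" value.
pct_increase() {
  awk -v before="$1" -v after="$2" 'BEGIN { printf "%.1f\n", (after - before) / before * 100 }'
}

# Average power in watts for a given energy use in kWh per day.
avg_watts() {
  awk -v kwh="$1" 'BEGIN { printf "%.1f\n", kwh * 1000 / 24 }'
}
```

    `pct_increase 0.62 0.72` prints 16.1, and `avg_watts` gives 25.8 W for 0.62 kWh per day versus 30.0 W for 0.72 kWh per day, i.e. a difference of roughly 4 W.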

     

    Link to comment
    13 hours ago, Enks said:

    but I'll try the described start/stop mechanism

    Will not work as I already explained. If you open the dashboard and close the window, it never fires the stop mechanism.

    Link to comment
    4 minutes ago, mgutt said:

    Will not work as I already explained. If you open the dashboard and close the window, it never fires the stop mechanism.

     

    As explained earlier, you need to move to another page first before closing the window.

    The stop command is only executed when leaving a page.

     

    Link to comment
    On 10/8/2022 at 12:21 PM, mgutt said:

    If you directly close the browser, it will run forever, too.

    Maybe what the WebGUI needs is a sort of reaper task that periodically cleans up the nchan processes if no clients remain connected. Can nchan detect whether there are active subscribers to a channel? If not, is there a way for an nchan client page to send a periodic keep-alive back to the server?

    Link to comment

    I have created an nchan monitoring script which will terminate the running nchan processes when no more subscribers are present.

     

    With this solution there is no need to add the ":stop" option as explained earlier. Running nchan processes will stop automatically when the user closes the browser(s) to the GUI.

     

    I'll make this solution available in the next version; users don't need to do anything.
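
    The monitoring script itself is not shown in this thread. As a rough illustration of the idea only: Nchan can expose a status page via its `nchan_stub_status` directive, which includes a `subscribers: N` line. Assuming such a page is reachable (the URL in the comment below is hypothetical, and so are the function names), the check could be sketched like this:

```shell
# Extract the subscriber count from nchan_stub_status output read from stdin.
subscriber_count() {
  awk -F': *' '$1 == "subscribers" { print $2 }'
}

# Succeeds (exit status 0) only when the given count is exactly 0,
# i.e. no browser is subscribed to any channel any more.
no_subscribers() {
  [ "$1" = "0" ]
}

# Possible usage (dry run, hypothetical URL, assumes the status page is enabled):
#   count="$(curl -s http://localhost/nchan_stub_status | subscriber_count)"
#   no_subscribers "$count" && echo "no subscribers left, nchan scripts could be stopped"
```

    This only shows how "no more subscribers" could be detected; how the real script stops the processes is not covered here.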

     

    Link to comment
    On 11/12/2022 at 4:05 PM, bonienl said:

    I have created an nchan monitoring script which will terminate the running nchan processes when no more subscribers are present.

    Thank you! Works flawlessly:

    root@Tower:~# pgrep -af http
    3337 /usr/local/sbin/emhttpd
    4334 /usr/bin/php -q /usr/local/emhttp/webGui/nchan/notify_poller
    4336 /usr/bin/php -q /usr/local/emhttp/webGui/nchan/session_check
    4338 /usr/bin/php -q /usr/local/emhttp/webGui/nchan/device_list
    4340 /bin/bash /usr/local/emhttp/webGui/nchan/disk_load
    4342 /usr/bin/php -q /usr/local/emhttp/webGui/nchan/parity_list
    4487 /usr/bin/php -q /usr/local/emhttp/webGui/nchan/wg_poller
    4489 /usr/bin/php -q /usr/local/emhttp/webGui/nchan/update_1
    4491 /usr/bin/php -q /usr/local/emhttp/webGui/nchan/update_2
    4493 /usr/bin/php -q /usr/local/emhttp/webGui/nchan/update_3
    
    # browser closed, some seconds later...
    root@Tower:~# pgrep -af http
    3337 /usr/local/sbin/emhttpd
    root@Tower:~#

     

    Link to comment



