CPU Governor state



I have set the CPU governor to Performance via the Tips and Tweaks plugin. While the Unraid server boots, the console output says "Enabled CPU frequency scaling governor: powersave".

 

How can I check which governor is set after the server has booted?


Hi,

 

I think it's best to set this on the motherboard. I have a feeling that the setting is currently on powersave in your BIOS and the plugin can't modify it. If you can't find the setting, try upgrading the BIOS to the latest firmware and check again.

 

I hope it helps.

6 hours ago, Warrentheo said:

The best method I have seen is the "Tips and Tricks" plugin, seen here: (Allows you to set it as you see fit)

Yes, I use the Tips and Tweaks plugin and set the CPU governor there to Performance. But while booting, the server says "Enabled CPU frequency scaling governor: powersave".

 

  • 2 years later...
On 7/21/2018 at 10:05 AM, gizmer said:

How can I check which governor is set after the server booted up?

I asked myself the same question and found out that this message is produced by the underlying Slackware:

https://www.linuxquestions.org/questions/slackware-14/locking-all-cpu's-to-their-maximum-frequency-4175607506/

 

So I checked my file as follows:

cat /etc/rc.d/rc.cpufreq
#!/bin/sh
#
# rc.cpufreq:  Settings for CPU frequency and voltage scaling in the kernel.
#              For more information, see the kernel documentation in
#              /usr/src/linux/Documentation/cpu-freq/


# Default CPU scaling governor to try.  Some possible choices are:
# performance:  The CPUfreq governor "performance" sets the CPU statically
#             to the highest frequency within the borders of scaling_min_freq
#             and scaling_max_freq.
# powersave:  The CPUfreq governor "powersave" sets the CPU statically to the
#             lowest frequency within the borders of scaling_min_freq and
#             scaling_max_freq.
# userspace:  The CPUfreq governor "userspace" allows the user, or any
#             userspace program running with UID "root", to set the CPU to a
#             specific frequency by making a sysfs file "scaling_setspeed"
#             available in the CPU-device directory.
# ondemand:   The CPUfreq governor "ondemand" sets the CPU depending on the
#             current usage.
# conservative:  The CPUfreq governor "conservative", much like the "ondemand"
#             governor, sets the CPU depending on the current usage.  It
#             differs in behaviour in that it gracefully increases and
#             decreases the CPU speed rather than jumping to max speed the
#             moment there is any load on the CPU.
# schedutil:  The CPUfreq governor "schedutil" aims at better integration with
#             the Linux kernel scheduler. Load estimation is achieved through
#             the scheduler's Per-Entity Load Tracking (PELT) mechanism, which
#             also provides information about the recent load.
SCALING_GOVERNOR=ondemand

# For CPUs using intel_pstate, always use the performance governor. This also
# provides power savings on Intel processors while avoiding the ramp-up lag
# present when using the powersave governor (which is the default if ondemand
# is requested on these machines):
if [ "$(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver 2> /dev/null)" = "intel_pstate" ]; then
  SCALING_GOVERNOR="performance"
fi

# If rc.cpufreq is given an option, use it for the CPU scaling governor instead:
if [ ! -z "$1" -a "$1" != "start" ]; then
  SCALING_GOVERNOR=$1
fi

# To force a particular option without having to edit this file, uncomment the
# line in /etc/default/cpufreq and edit it to select the desired option:
if [ -r /etc/default/cpufreq ]; then
  . /etc/default/cpufreq
fi

# If you need to load a specific CPUFreq driver, load it here.  Most likely you don't.
#/sbin/modprobe acpi-cpufreq

# Attempt to apply the CPU scaling governor setting.  This may or may not
# actually override the default value depending on if the choice is supported
# by the architecture, processor, or underlying CPUFreq driver.  For example,
# processors that use the Intel P-state driver will only be able to set
# performance or powersave here.
echo $SCALING_GOVERNOR | tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor 1> /dev/null 2> /dev/null

# Report what CPU scaling governor is in use after applying the setting:
if [ -r /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor ]; then
  echo "Enabled CPU frequency scaling governor:  $(cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor)"
fi

As you can see, the default is "ondemand", but it is followed by this condition, which makes "performance" the default on intel_pstate systems:

# For CPUs using intel_pstate, always use the performance governor. This also
# provides power savings on Intel processors while avoiding the ramp-up lag
# present when using the powersave governor
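
The order in which rc.cpufreq arrives at its governor can be sketched as a small function. This is a simplified sketch, not the script itself: pick_governor is a hypothetical helper, the /etc/default/cpufreq override (which is applied last in the real script) is left out, and the kernel can still reject the result.

```shell
#!/bin/sh
# Simplified sketch of the selection order in rc.cpufreq.
# pick_governor is a hypothetical helper, not part of Slackware or Unraid.
#   $1 = scaling driver name (e.g. intel_pstate, acpi-cpufreq)
#   $2 = optional argument passed to rc.cpufreq
pick_governor() {
  driver=$1
  arg=$2
  governor=ondemand                  # script default
  if [ "$driver" = "intel_pstate" ]; then
    governor=performance             # intel_pstate special case
  fi
  if [ -n "$arg" ] && [ "$arg" != "start" ]; then
    governor=$arg                    # an explicit argument wins
  fi
  echo "$governor"
}

pick_governor intel_pstate start       # prints: performance
pick_governor acpi-cpufreq start       # prints: ondemand
pick_governor intel_pstate powersave   # prints: powersave
```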

By that explanation, it is not recommended to set anything other than "performance" for an Intel CPU. I had problems with "powersave" in the past, but that was with an Intel Atom CPU (I now have an i3):

https://forums.plex.tv/t/cpu-scaling-governor-powersave-causes-massive-buffering/604018

 

I never experienced similar problems with "ondemand", so I wanted this governor for the i3, too.

 

Nevertheless I checked the active governor as follows:

cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
performance

OK, performance has been set as expected. Let's try to override it:

/etc/rc.d/rc.cpufreq ondemand
Enabled CPU frequency scaling governor:  performance

Hmm, that does not work. The reason seems to be this remark in the script:

# ... For example,
# processors that use the Intel P-state driver will only be able to set
# performance or powersave here.
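
Because the kernel can silently refuse a governor, it helps to read the value back after writing it. A minimal sketch of that check follows; try_governor is a hypothetical helper, the optional second argument only exists so the function can be pointed at another directory, and writing to the real sysfs files needs root.

```shell
#!/bin/sh
# try_governor is a hypothetical helper: write the requested governor to
# every core, then read it back to see whether the kernel accepted it.
#   $1 = requested governor
#   $2 = optional sysfs CPU directory (default /sys/devices/system/cpu)
try_governor() {
  want=$1
  root=${2:-/sys/devices/system/cpu}
  rc=0
  for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
    [ -e "$f" ] || continue
    # A rejected write only produces an error message, which is discarded.
    { echo "$want" > "$f"; } 2>/dev/null
    got=$(cat "$f")
    cpu=${f%/cpufreq/scaling_governor}
    if [ "$got" = "$want" ]; then
      echo "${cpu##*/}: $got"
    else
      echo "${cpu##*/}: requested $want, still $got"
      rc=1
    fi
  done
  return $rc
}

# Example (as root): try_governor ondemand
```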

Let's check all cores:

cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
performance
performance
performance
performance

So all cores stay on "performance". Does this mean our CPU cores support only "performance"?! EDIT: Yes, recent Intel CPUs support only "performance" or "powersave" (the latter with massive lag):

https://wiki.archlinux.org/index.php/CPU_frequency_scaling#Scaling_governors
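
Note that scaling_governor only shows what is currently set. The governors the driver accepts are listed in scaling_available_governors in the same directory. A small sketch that prints the driver, the current governor and the available governors per core; show_governors is a hypothetical helper, and its optional argument is only there so it can be pointed at another directory.

```shell
#!/bin/sh
# show_governors is a hypothetical helper; the file names are the standard
# cpufreq sysfs attributes.
#   $1 = optional sysfs CPU directory (default /sys/devices/system/cpu)
show_governors() {
  root=${1:-/sys/devices/system/cpu}
  for dir in "$root"/cpu[0-9]*/cpufreq; do
    [ -r "$dir/scaling_governor" ] || continue
    cpu=${dir%/cpufreq}
    printf '%s: driver=%s current=%s available=%s\n' \
      "${cpu##*/}" \
      "$(cat "$dir/scaling_driver" 2>/dev/null)" \
      "$(cat "$dir/scaling_governor")" \
      "$(cat "$dir/scaling_available_governors" 2>/dev/null)"
  done
  return 0
}

show_governors
```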

 

And the most important part:

The performance governor should give better power saving functionality than the old ondemand governor.

To me this does not really look like proper P-state handling: my CPU's maximum is 3.6 GHz, and even under really low load the frequency is never reduced:

[Screenshot: CPU frequencies at low load]

 

So let's try to find out what goes wrong here. First, the intel_pstate values:

ls /sys/devices/system/cpu/intel_pstate/*
/sys/devices/system/cpu/intel_pstate/hwp_dynamic_boost  /sys/devices/system/cpu/intel_pstate/num_pstates
/sys/devices/system/cpu/intel_pstate/max_perf_pct       /sys/devices/system/cpu/intel_pstate/status
/sys/devices/system/cpu/intel_pstate/min_perf_pct       /sys/devices/system/cpu/intel_pstate/turbo_pct
/sys/devices/system/cpu/intel_pstate/no_turbo
root@Thoth:~# cat /sys/devices/system/cpu/intel_pstate/hwp_dynamic_boost
0
root@Thoth:~# cat /sys/devices/system/cpu/intel_pstate/max_perf_pct
100
root@Thoth:~# cat /sys/devices/system/cpu/intel_pstate/min_perf_pct
22
root@Thoth:~# cat /sys/devices/system/cpu/intel_pstate/no_turbo
1
root@Thoth:~# cat /sys/devices/system/cpu/intel_pstate/num_pstates
29
root@Thoth:~# cat /sys/devices/system/cpu/intel_pstate/status
active
root@Thoth:~# cat /sys/devices/system/cpu/intel_pstate/turbo_pct
0
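
The individual cat calls can also be collapsed into one loop that prints every attribute with its name. A minimal sketch; dump_attrs is a hypothetical helper and takes the directory as an optional argument.

```shell
#!/bin/sh
# dump_attrs is a hypothetical helper that prints "name: value" for every
# readable regular file in a directory.
#   $1 = optional directory (default: the intel_pstate sysfs directory)
dump_attrs() {
  dir=${1:-/sys/devices/system/cpu/intel_pstate}
  for f in "$dir"/*; do
    [ -f "$f" ] && [ -r "$f" ] || continue
    printf '%s: %s\n' "${f##*/}" "$(cat "$f" 2>/dev/null)"
  done
  return 0
}

dump_attrs
```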

Explanations can be found here:

https://www.kernel.org/doc/html/v4.12/admin-guide/pm/intel_pstate.html#user-space-interface-in-sysfs

 

For example, num_pstates returns the number of P-states supported by the CPU; my CPU has 29. We also know that the status is "active", which means changing the P-states should work, but we do not know how:

https://www.kernel.org/doc/html/v4.12/admin-guide/pm/intel_pstate.html#active-mode

Quote

There are two P-state selection algorithms provided by intel_pstate in the active mode: powersave and performance. The way they both operate depends on whether or not the hardware-managed P-states (HWP) feature has been enabled in the processor and possibly on the processor model.

 

Do we have active mode with HWP or not? I found this sentence:

https://01.org/linuxgraphics/gfx-docs/drm/admin-guide/pm/intel_pstate.html#user-space-interface-in-sysfs

Quote

 

hwp_dynamic_boost

This attribute is only present if intel_pstate works in the active mode with the HWP feature enabled in the processor. If set (equal to 1), it causes the minimum P-state limit to be increased dynamically for a short time whenever a task previously waiting on I/O is selected to run on a given logical CPU (the purpose of this mechanism is to improve performance).

 

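Our value is zero, so the dynamic boost is switched off. But the attribute is present, which by this description already suggests active mode with HWP.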

 

Another hint that we are using Active with HWP is this explanation:

https://www.kernel.org/doc/html/v4.12/admin-guide/pm/intel_pstate.html#energy-vs-performance-hints

Quote

 

If intel_pstate works in the active mode with the HWP feature enabled in the processor, additional attributes are present in every CPUFreq policy directory in sysfs. They are intended to allow user space to help intel_pstate to adjust the processor’s internal P-state selection logic by focusing it on performance or on energy-efficiency, or somewhere between the two extremes:

energy_performance_preference

Current value of the energy vs performance hint for the given policy (or the CPU represented by it).

The hint can be changed by writing to this attribute.

energy_performance_available_preferences

List of strings that can be written to the energy_performance_preference attribute.

They represent different energy vs performance hints and should be self-explanatory, except that default represents whatever hint value was set by the platform firmware.

 

Let's check if they are present:

Quote

/sys/devices/system/cpu/cpu0/cpufreq# ls
affected_cpus     cpuinfo_transition_latency                related_cpus                 scaling_driver    scaling_min_freq
cpuinfo_max_freq  energy_performance_available_preferences  scaling_available_governors  scaling_governor  scaling_setspeed
cpuinfo_min_freq  energy_performance_preference             scaling_cur_freq             scaling_max_freq

Both are present, so we can be sure that we are using active mode with HWP:

https://www.kernel.org/doc/html/v4.12/admin-guide/pm/intel_pstate.html#active-mode-with-hwp

Quote

 

Active Mode With HWP

If the processor supports the HWP feature, it will be enabled during the processor initialization and cannot be disabled after that. It is possible to avoid enabling it by passing the intel_pstate=no_hwp argument to the kernel in the command line.

 

If the HWP feature has been enabled, intel_pstate relies on the processor to select P-states by itself, but still it can give hints to the processor’s internal P-state selection logic. What those hints are depends on which P-state selection algorithm has been applied to the given policy (or to the CPU it corresponds to).

 

Even though the P-state selection is carried out by the processor automatically, intel_pstate registers utilization update callbacks with the CPU scheduler in this mode. However, they are not used for running a P-state selection algorithm, but for periodic updates of the current CPU frequency information to be made available from the scaling_cur_freq policy attribute in sysfs.

 

HWP + performance

In this configuration intel_pstate will write 0 to the processor’s Energy-Performance Preference (EPP) knob (if supported) or its Energy-Performance Bias (EPB) knob (otherwise), which means that the processor’s internal P-state selection logic is expected to focus entirely on performance.

 

This will override the EPP/EPB setting coming from the sysfs interface (see Energy vs Performance Hints below).

Also, in this configuration the range of P-states available to the processor’s internal P-state selection logic is always restricted to the upper boundary (that is, the maximum P-state that the driver is allowed to use).

 

HWP + powersave

In this configuration intel_pstate will set the processor’s Energy-Performance Preference (EPP) knob (if supported) or its Energy-Performance Bias (EPB) knob (otherwise) to whatever value it was previously set to via sysfs (or whatever default value it was set to by the platform firmware). This usually causes the processor’s internal P-state selection logic to be less performance-focused.
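
The presence check used above to detect HWP can be scripted. This is a sketch under the rule from the quoted kernel documentation (the energy_performance_preference attribute exists in active mode with HWP); hwp_active is a hypothetical helper, and other kernel versions or drivers may expose the attribute under different conditions.

```shell
#!/bin/sh
# hwp_active is a hypothetical helper. It succeeds if intel_pstate reports
# "active" and cpu0 exposes the energy_performance_preference attribute.
#   $1 = optional sysfs CPU directory (default /sys/devices/system/cpu)
hwp_active() {
  root=${1:-/sys/devices/system/cpu}
  [ "$(cat "$root/intel_pstate/status" 2>/dev/null)" = "active" ] &&
    [ -e "$root/cpu0/cpufreq/energy_performance_preference" ]
}

if hwp_active; then
  echo "intel_pstate: active mode with HWP"
else
  echo "intel_pstate: HWP not detected"
fi
```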

 

 

I stopped everything that writes to the Unraid server and spun down all disks. In this state the server consumes 24 W. With "performance", the CPU still does not downclock:

watch -n1 "cat /proc/cpuinfo | grep \"^[c]pu MHz\""

cpu MHz         : 3600.114
cpu MHz         : 3600.910
cpu MHz         : 3601.040
cpu MHz         : 3600.269

Does HWP + Performance mean it never changes the p-state? Which algorithm is used and where can I find it or influence it?

 

I tried it with powersave:

/etc/rc.d/rc.cpufreq powersave
Enabled CPU frequency scaling governor:  powersave

I don't know why, but all disks spun up with a delay of several seconds, and some very small writes (1.4 kB/s) took place. I waited one minute and spun them down again. The power consumption stayed at 24 W. The frequency is only a little lower:

watch -n10 "cat /proc/cpuinfo | grep \"^[c]pu MHz\""
cpu MHz         : 3263.100
cpu MHz         : 3021.631
cpu MHz         : 3252.913
cpu MHz         : 2819.033
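
As an alternative to parsing /proc/cpuinfo, the per-core frequency can be read from scaling_cur_freq in the cpufreq directory. That file reports kHz, so the sketch below divides by 1000. core_mhz is a hypothetical helper, and its optional argument only exists so it can be pointed at another directory.

```shell
#!/bin/sh
# core_mhz is a hypothetical helper. scaling_cur_freq reports kHz, so the
# value is divided by 1000 (integer division) to get MHz.
#   $1 = optional sysfs CPU directory (default /sys/devices/system/cpu)
core_mhz() {
  root=${1:-/sys/devices/system/cpu}
  for f in "$root"/cpu[0-9]*/cpufreq/scaling_cur_freq; do
    [ -r "$f" ] || continue
    khz=$(cat "$f")
    cpu=${f%/cpufreq/scaling_cur_freq}
    printf '%s: %d MHz\n' "${cpu##*/}" "$(( ${khz:-0} / 1000 ))"
  done
  return 0
}

core_mhz
```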

I checked the available HWP profiles and which one is used:

cat /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_preference
balance_performance
cat /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_available_preferences
default performance balance_performance balance_power power

I did not find any documentation about these variables, only this answer to the same question:

https://superuser.com/a/1449813/129262

Quote

 

The EPP settings affect "how aggressively the hardware enters and exits CPU idle states (C-states) and Processor Performance States (P-states)" for example.

It looks like "balance_performance" is the default setting for having a responsive, well-performing system but with "potentially-significant energy saving"

 

So let's try them out:

echo "power" > /sys/devices/system/cpu/cpu0/cpufreq/energy_performance_preference
echo "power" > /sys/devices/system/cpu/cpu1/cpufreq/energy_performance_preference
echo "power" > /sys/devices/system/cpu/cpu2/cpufreq/energy_performance_preference
echo "power" > /sys/devices/system/cpu/cpu3/cpufreq/energy_performance_preference
cat /sys/devices/system/cpu/cpu*/cpufreq/energy_performance_preference
power
power
power
power

I tested "power" and "balance_power". No difference in power consumption.
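
The four echo lines can be written as a loop that also checks the hint against energy_performance_available_preferences before writing it. A minimal sketch; set_epp is a hypothetical helper and needs root on the real sysfs files. According to the documentation quoted above, the "performance" governor overrides this hint, so it is mainly useful with "powersave".

```shell
#!/bin/sh
# set_epp is a hypothetical helper. It writes one energy/performance hint
# to every core, but only if the hint is listed as available for that core.
#   $1 = hint (e.g. balance_power)
#   $2 = optional sysfs CPU directory (default /sys/devices/system/cpu)
set_epp() {
  pref=$1
  root=${2:-/sys/devices/system/cpu}
  for dir in "$root"/cpu[0-9]*/cpufreq; do
    [ -w "$dir/energy_performance_preference" ] || continue
    case " $(cat "$dir/energy_performance_available_preferences" 2>/dev/null) " in
      *" $pref "*)
        echo "$pref" > "$dir/energy_performance_preference"
        ;;
      *)
        echo "${dir%/cpufreq}: '$pref' is not available" >&2
        return 1
        ;;
    esac
  done
  return 0
}

# Example (as root): set_epp balance_power
```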

 

If I run "/etc/rc.d/rc.cpufreq performance", "/sys/devices/system/cpu/cpu0/cpufreq/energy_performance_preference" becomes "performance". If I run it with "powersave", the preference becomes "balance_performance".

 

Conclusion:

Even with "performance" set, there is no room to save more energy in the idle state. "powersave" only seems to influence how long a core stays in a slower state, which could cause latency issues, but the lowest P-state is the same for all profiles. This is completely different from my Atom CPU, which immediately showed lower energy consumption after I changed the governor to "ondemand". So it depends on the CPU in use. In the end, rc.cpufreq already selects "ondemand" where it is available, so no further optimization seems to be needed.

 

The next thing we could check is the C-states. For this I used "powertop" from the Nerd Pack. It seems we already get the best result, as C-state C10 is used most of the time:

[Screenshot: powertop C-state overview]

 

 

Edited by mgutt