5.0 Final: Possible bug in cpu frequency scaling.


Possible bug in cpu frequency scaling.

 

After the upgrade, my parity checks were very slow: 35MB/sec when they were over 100MB/sec before.  The funny thing was that there was no appreciable I/O wait time, so it was not a disk bottleneck.

 

I noticed one core was maxed out at 100%, and when I checked, all 4 cores were sitting at 800 MHz.  I changed all 4 governors to performance instead of ondemand, and parity checks went back to the expected range.

 

When I change back to ondemand, they don't get off the minimum 800 MHz.

 

In any event, I'll get into the frequency scaling tunables later and see if I can tweak the upscaling parameters.  But something changed, since with the exact same config as before, all 4 CPUs would scale up to 3.5 GHz when doing a parity check using the ondemand governor.
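
For anyone wanting to check the same thing on their own box, here is a minimal sketch that prints the governor and current clock of every core from sysfs. The function name and the optional root-directory argument are my own additions for illustration; the sysfs file names are the standard cpufreq ones, but whether they exist depends on a cpufreq driver being loaded.

```shell
#!/bin/sh
# Print "<cpu> <governor> <current kHz>" for each core that exposes cpufreq.
# show_cpufreq and its optional CPU_ROOT argument are hypothetical helpers.
show_cpufreq() {
    root="${1:-/sys/devices/system/cpu}"
    for dir in "$root"/cpu[0-9]*; do
        # Skip cores without a cpufreq directory (no driver bound).
        [ -r "$dir/cpufreq/scaling_governor" ] || continue
        printf '%s %s %s\n' "${dir##*/}" \
            "$(cat "$dir/cpufreq/scaling_governor")" \
            "$(cat "$dir/cpufreq/scaling_cur_freq" 2>/dev/null)"
    done
}
```

Run `show_cpufreq` during a parity check; a pegged core that still reports the minimum frequency under ondemand would match what I saw.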

I've had the code below in my go script for some time as I experienced the same problem as you just reported on earlier rc builds:

# Configure ondemand governor (log only if the write succeeded)
echo 50 > /sys/devices/system/cpu/cpufreq/ondemand/up_threshold && logger "Go Script - ondemand up_threshold set to 50"
echo 50 > /sys/devices/system/cpu/cpufreq/ondemand/sampling_down_factor && logger "Go Script - ondemand sampling_down_factor set to 50"

 

watch -n.1 'cat /proc/cpuinfo|grep MHz'

up_threshold: default was/is 80

sampling_down_factor: default was/is 1
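
If you want to see what a tunable was before overwriting it, something like the sketch below may help. The function name and its optional directory argument are hypothetical; the default path is the global ondemand directory used in the go script lines above, and on some kernels the tunables may live under each CPU's cpufreq directory instead.

```shell
#!/bin/sh
# set_tunable NAME VALUE [DIR]
# Writes VALUE to an ondemand tunable and reports "NAME: old -> new".
# Hypothetical helper; DIR defaults to the global ondemand directory.
set_tunable() {
    file="${3:-/sys/devices/system/cpu/cpufreq/ondemand}/$1"
    if [ ! -w "$file" ]; then
        echo "cannot write $file" >&2
        return 1
    fi
    old=$(cat "$file")
    echo "$2" > "$file" && echo "$1: $old -> $2"
}
```

For example, `set_tunable up_threshold 50` makes the same change as the first go script line and shows the previous value.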

 


Adjusting some of the tunables got parity checks going at full speed again, so ondemand is actually working properly.  But the tuning had to be so sensitive for up-scaling, and for staying there, that now even idling cores are jumping up to high frequency, and CPU temp is trending higher.

 

 

I find the same issue on the latest rc builds.  I believe there was a change in the sysload calculation, as my sysload is near zero yet I see the cpu clock bouncing as you describe.  Older kernels reported larger sysload numbers for an otherwise idle system (0.5-0.9 with just python running; now it is 0.0-0.2 for the same average load).  I tried setting up_threshold to 90 and sampling_down_factor back to 1 as a worst case, and it still bounced heavily.  Not sure what is going on in the newer kernel.


I have seen similar behavior with single-threaded apps that peg one core while the overall average across all cores is not high.  When the ratio between your low and high CPU speeds is close to the ratio between the up and down thresholds, you get this kind of thrashing.  When ondemand does bump up, it jumps to the max, and the load is instantly below the throttle-down threshold.  Going from 800 MHz to 3500 MHz is over a 4x factor.  That is one scenario where ondemand is not the best choice, particularly when you have multiple cores.

 

Switching to the conservative governor is a good workaround, since instead of bouncing between min and max speeds, it will find a middle ground.  Once I changed to conservative and started a parity check, within a few seconds the one heavily used core settled at 2.4 GHz (which leaves it at a healthy 60% constant utilization), and all the others stayed at 800 MHz.  Parity check speeds are back in the normal (100MB/s) range, along with CPU temps.
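
To put rough numbers on the thrashing, here is a small worked calculation. It assumes utilization scales linearly with clock speed (a simplification), and takes the 60% at 2.4 GHz figure above as the amount of work the parity check needs on that core. The helper name is made up for the example.

```shell
#!/bin/sh
# util_at WORK_MHZ FREQ_MHZ -> utilization in percent, capped at 100.
# Hypothetical helper; assumes load scales linearly with frequency.
util_at() {
    pct=$(( $1 * 100 / $2 ))
    if [ "$pct" -gt 100 ]; then
        pct=100
    fi
    echo "$pct"
}

# 60% of 2400 MHz is 1440 MHz worth of work on the busy core.
work=$(( 2400 * 60 / 100 ))

for freq in 800 2400 3500; do
    echo "$freq MHz: $(util_at "$work" "$freq")%"
done
# prints:
# 800 MHz: 100%
# 2400 MHz: 60%
# 3500 MHz: 41%
```

So at 800 MHz the core is pegged, and at 3500 MHz the same work shows up as roughly 41%, well under the default up_threshold of 80, which is consistent with the bounce between min and max described above.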

 

Folks who have unexplained changes in parity check speeds may want to consider changing the governor to conservative for all cores.

 

echo conservative > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo conservative > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo conservative > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
echo conservative > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
... etc for all the cores you have ...
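
Rather than one echo per core, a loop can cover however many cores you have. The sketch below also checks that the governor is actually listed as available before writing it. The function name and the optional root argument are my own; run it as root on a real system.

```shell
#!/bin/sh
# set_governor_all GOVERNOR [CPU_ROOT]
# Sets GOVERNOR on every core that lists it in scaling_available_governors.
# Hypothetical helper; CPU_ROOT defaults to the real sysfs location.
set_governor_all() {
    gov="$1"
    root="${2:-/sys/devices/system/cpu}"
    for dir in "$root"/cpu[0-9]*/cpufreq; do
        [ -d "$dir" ] || continue
        if grep -qw "$gov" "$dir/scaling_available_governors" 2>/dev/null; then
            echo "$gov" > "$dir/scaling_governor"
        else
            echo "$dir: governor '$gov' not available" >&2
        fi
    done
}
```

`set_governor_all conservative` should then be equivalent to the echo lines above, for all cores at once.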

Something similar might be happening on my server.

 

How can I check this, and where and how can I change these governors? Is this a feature of 5.0 final? I am still on RC16b.

  • 1 month later...

Any updated info on this?

 

I cannot seem to set ondemand, but I can set it to powersave and then back to performance:

 

root@Tower:~# cpufreq-info
cpufrequtils 007: cpufreq-info © Dominik Brodowski 2004-2009
Report errors and bugs to [email protected], please.
analyzing CPU 0:
  driver: p4-clockmod
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 10.00 ms.
  hardware limits: 400 MHz - 3.20 GHz
  available frequency steps: 400 MHz, 800 MHz, 1.20 GHz, 1.60 GHz, 2.00 GHz, 2.40 GHz, 2.80 GHz, 3.20 GHz
  available cpufreq governors: conservative, userspace, powersave, ondemand, performance
  current policy: frequency should be within 400 MHz and 3.20 GHz.
                  The governor "performance" may decide which speed to use
                  within this range.
  current CPU frequency is 3.20 GHz (asserted by call to hardware).
  cpufreq stats: 400 MHz:0.00%, 800 MHz:0.00%, 1.20 GHz:0.00%, 1.60 GHz:0.00%, 2.00 GHz:0.00%, 2.40 GHz:0.00%, 2.80 GHz:0.00%, 3.20 GHz:100.00%

root@Tower:~# cpufreq-set -g ondemand

root@Tower:~# cpufreq-info
cpufrequtils 007: cpufreq-info © Dominik Brodowski 2004-2009
Report errors and bugs to [email protected], please.
analyzing CPU 0:
  driver: p4-clockmod
  CPUs which run at the same hardware frequency: 0
  CPUs which need to have their frequency coordinated by software: 0
  maximum transition latency: 10.00 ms.
  hardware limits: 400 MHz - 3.20 GHz
  available frequency steps: 400 MHz, 800 MHz, 1.20 GHz, 1.60 GHz, 2.00 GHz, 2.40 GHz, 2.80 GHz, 3.20 GHz
  available cpufreq governors: conservative, userspace, powersave, ondemand, performance
  current policy: frequency should be within 400 MHz and 3.20 GHz.
                  The governor "performance" may decide which speed to use
                  within this range.
  current CPU frequency is 3.20 GHz (asserted by call to hardware).
  cpufreq stats: 400 MHz:0.00%, 800 MHz:0.00%, 1.20 GHz:0.00%, 1.60 GHz:0.00%, 2.00 GHz:0.00%, 2.40 GHz:0.00%, 2.80 GHz:0.00%, 3.20 GHz:100.00%
root@Tower:~#

 

The p4-clockmod driver has always been problematic.  Many people chose to outright blacklist that driver.  Can't you use something like the acpi-cpufreq driver instead?  What's your CPU anyway?

 

 


It's a Cedarmill Celeron in an Asus P5LD2-VM R2.0.

 

It was working in BubbaRaid...I could see the frequency throttle up and down in php_sysinfo.

 

I'll try to comment out the modprobe p4_clockmod line in my go script.

 

 

Yup, I did comment it out, but it still loads the p4-clockmod module, and even if I issue a modprobe acpi-cpufreq or modprobe cpufreq_ondemand, the driver stays p4-clockmod.

 

It's only a single-core Celeron, and if I echo ondemand or conservative > scaling_governor, the kernel always complains with:

 

"ondemand governor failed, too long transition latency of HW, fallback to performance governor"

 

I see that the transition latency is set to 10000001.  Per the comment in the second link, I think the kernel needs to be recompiled to change this, since I can't echo a new value into the file:

AnnoLoki

November 21st, 2009 on 21:28

ACPI_CPUFREQ only gives me 2 speeds! 3.2 and 2.4GHz. Clockmod gives me 8 from 3.2GHz to 400MHz. Just because someone deemed that my processors too slow for me to use this, despite the fact I’ve been using it just fine for ages?! Thankfully it’s open source and very easy to undo if you’re building your own kernel/modules, and so I thought I’d share what I found cuz it *%#! annoyed the hell out of me too.

 

If you run:

cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_transition_latency

you will notice the latency is 10000001, which I thought looked highly suspiciously like someone had hardcoded that number to be 1 more than a max number set somewhere, below which ondemand/etc would work. You can find it in the file (assuming kernel sources are in /usr/src/linux):

/usr/src/linux/arch/x86/kernel/cpu/cpufreq/p4-clockmod.c

Search that file for the number 10000001, decrease it by one to 10000000, and recompile the module (if you have p4-clockmod compiled as a module; if it’s linked into the kernel you’ll need to recompile the kernel and reboot).

 

I have found that changing the up_threshold of the ondemand governor from its default to 40 makes it ramp its speed up when I need it to, and stay there until it drops back down again. With my usage pattern it’s not making these switches very often, so any delays in changing the speed are perfectly acceptable -to me-. To do this, simply:

echo 40 > /sys/devices/system/cpu/cpu0/cpufreq/ondemand/up_threshold

 

Hope this is useful to you / anyone.

It's just really weird since it was working in 4.4.2 or whatever BubbaRaid was based on.
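
To check whether a given box is hitting this, the latency the driver reports can be compared against the limit. This is only a sketch: the 10000000 ns limit is taken from the quoted comment's reasoning (one less than the hardcoded 10000001), and the function name and arguments are hypothetical.

```shell
#!/bin/sh
# latency_ok [CPUFREQ_DIR] [LIMIT_NS]
# Returns 0 if the reported transition latency is within LIMIT_NS,
# 1 if it exceeds it, 2 if the value cannot be read.
# Hypothetical helper; the default limit is an assumption from the quote.
latency_ok() {
    dir="${1:-/sys/devices/system/cpu/cpu0/cpufreq}"
    limit="${2:-10000000}"
    lat=$(cat "$dir/cpuinfo_transition_latency" 2>/dev/null) || return 2
    if [ "$lat" -gt "$limit" ]; then
        echo "latency $lat ns exceeds $limit ns; expect the fallback to performance"
        return 1
    fi
    echo "latency $lat ns is within $limit ns"
}
```

With p4-clockmod reporting 10000001, `latency_ok` should take the first branch, which lines up with the "too long transition latency of HW" message in the syslog.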

 

I'll try to comment out the modprobe p4_clockmod line in my go script.

 

So, did you?

 

 

What happens when you copy/paste the following lines into a telnet session to your server?

modprobe acpi-cpufreq
cpufreq-info | grep driver
for i in /sys/devices/system/cpu/cpu[[:digit:]]* ;do  echo $i ; cat $i/cpufreq/scaling_available_governors ;done
for i in /sys/devices/system/cpu/cpu[[:digit:]]* ;do echo ondemand > $i/cpufreq/scaling_governor ;done
for i in /sys/devices/system/cpu/cpu[[:digit:]]* ;do echo $i ; cat $i/cpufreq/scaling_governor ;done
cat /proc/cpuinfo | grep MHz

 

 


Yup, I did comment it out but it still loads the p4-clockmod module

Then blacklist it.  Modify this line in the syslinux.cfg file like this:

   append  p4-clockmod.blacklist=yes  initrd=bzroot

...and reboot.  Then...

cpufreq-info | grep driver

 

 


Thanks for the help, but that didn't seem to work either, and neither did modprobe -r p4-clockmod, since the driver is built into the kernel.

 

According to:

https://bugzilla.redhat.com/show_bug.cgi?id=853179

 

I'm not sure the Cedarmill Celeron even supports acpi-cpufreq, since eist/est is not listed in the /proc/cpuinfo flags.

 

Close to giving up now. It's not a huge deal, as sleep is working, so the server does power down. I'm just really puzzled since it seemed to be working before. Oh well.

 

 


Thanks for the help but that didn't seem to work either and neither ... p4-clockmod since it was built into the kernel.

Did you actually do it?  What did `cpufreq-info | grep driver` show after the reboot?

Can you post a syslog from after the reboot, so I can see the effect of the blacklist option?

 

 


Yup, I added:

 

root@Tower:~# cd /boot/
root@Tower:/boot# more syslinux.cfg
default menu.c32
menu title Lime Technology LLC
prompt 0
timeout 50
label unRAID OS
  menu default
  kernel bzimage
  append p4-clockmod.blacklist=yes initrd=bzroot
label unRAID OS (Safe Mode)
  kernel bzimage
  append initrd=bzroot unraidsafemode
label Memtest86+
  kernel memtest

root@Tower:/boot# cpufreq-info |grep driver
  driver: p4-clockmod

syslog:

root@Tower:/var/log# grep p4 syslog
Oct  7 18:06:12 Tower kernel: Kernel command line: p4-clockmod.blacklist=yes initrd=bzroot BOOT_IMAGE=bzimage
Oct  7 18:06:12 Tower kernel: p4-clockmod: P4/Xeon CPU On-Demand Clock Modulation available
Oct  7 18:06:40 Tower cmdline[3429]: p4-clockmod.blacklist=yes initrd=bzroot BOOT_IMAGE=bzimage

root@Tower:/var/log# grep ondemand syslog
Oct  7 18:06:12 Tower kernel: ondemand governor failed, too long transition latency of HW, fallback to performance governor

root@Tower:/var/log# modprobe -r p4-clockmod
FATAL: Module p4_clockmod is builtin

 

Happy to try anything else, but not looking good at this point.

Like I said, the Celeron may not support the new implementation of frequency scaling :

http://ark.intel.com/products/27129/Intel-Celeron-D-Processor-352-(512K-Cache-3_20-GHz-533-MHz-FSB)?wapkw=celeron+d+352
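
For anyone else in the same spot, the two checks I ended up caring about can be scripted roughly as below. This is a sketch under assumptions: it relies on a modules.builtin file being present under /lib/modules for the running kernel (I have not confirmed every unRAID build ships it), and both function names and their optional file arguments are made up for the example.

```shell
#!/bin/sh
# is_builtin MODULE [MODULES_BUILTIN_FILE]
# Succeeds if MODULE is listed as compiled into the kernel.
# Treats '-' and '_' in the module name as interchangeable.
is_builtin() {
    pattern=$(printf '%s' "$1" | sed 's/[-_]/[-_]/g')
    file="${2:-/lib/modules/$(uname -r)/modules.builtin}"
    grep -Eq "/${pattern}\.ko\$" "$file" 2>/dev/null
}

# has_est [CPUINFO_FILE]
# Succeeds if the first "flags" line lists the est (EIST) flag.
has_est() {
    grep -m1 '^flags' "${1:-/proc/cpuinfo}" | grep -qw est
}
```

If `is_builtin p4-clockmod` succeeds, modprobe -r cannot remove the driver; if `has_est` fails, acpi-cpufreq may have nothing to work with on this CPU anyway.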

 

 

 


That's so very strange!  I've always had success blacklisting modules that way. (for example, the ancient ide piix driver, which for some unknown reason is still in unRAID.)  You're right, it doesn't look good.  Perhaps Limetech will show some mercy and remove the p4-clockmod from the kernel in the next release. (you'll have to email him about that).

 

 

  • 3 weeks later...

unRaid 5.0.1 has the kernel updated to 3.9.11, but I have not changed anything under "CPU Frequency scaling".  The default governor is "ondemand" and here is the config for the frequency scaling section (all drivers are "built-in" as opposed to being modules):

 

#
# x86 CPU frequency scaling drivers
#
# CONFIG_X86_INTEL_PSTATE is not set
# CONFIG_X86_PCC_CPUFREQ is not set
CONFIG_X86_ACPI_CPUFREQ=y
CONFIG_X86_ACPI_CPUFREQ_CPB=y
CONFIG_X86_POWERNOW_K6=y
CONFIG_X86_POWERNOW_K7=y
CONFIG_X86_POWERNOW_K7_ACPI=y
CONFIG_X86_POWERNOW_K8=y
# CONFIG_X86_GX_SUSPMOD is not set
# CONFIG_X86_SPEEDSTEP_CENTRINO is not set
# CONFIG_X86_SPEEDSTEP_ICH is not set
# CONFIG_X86_SPEEDSTEP_SMI is not set
CONFIG_X86_P4_CLOCKMOD=y
CONFIG_X86_CPUFREQ_NFORCE2=y
# CONFIG_X86_LONGRUN is not set
# CONFIG_X86_LONGHAUL is not set
# CONFIG_X86_E_POWERSAVER is not set
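
A listing like this can be summarized per driver with a small filter. This is a sketch only: the function name is hypothetical, it matches any CONFIG_X86_ option fed to it (so pipe in just this section), and reading the running kernel's config via /proc/config.gz only works if the kernel was built with that feature.

```shell
#!/bin/sh
# summarize_x86_config: reads kernel config lines on stdin and prints
# "<OPTION> built-in|module|disabled" for each CONFIG_X86_ entry.
# Hypothetical helper for illustration.
summarize_x86_config() {
    sed -n \
        -e 's/^CONFIG_X86_\([A-Z0-9_]*\)=y$/\1 built-in/p' \
        -e 's/^CONFIG_X86_\([A-Z0-9_]*\)=m$/\1 module/p' \
        -e 's/^# CONFIG_X86_\([A-Z0-9_]*\) is not set$/\1 disabled/p'
}
```

For example, `zcat /proc/config.gz | grep -A20 'CPU frequency scaling drivers' | summarize_x86_config` would show P4_CLOCKMOD as built-in for the config above, assuming /proc/config.gz exists.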

 

What are the desired changes in this area?


I'm not 100% sure. I just know that the frequency used to scale in whatever kernel + config BubbaRaid was using, even with p4_clockmod. With 5.0 I'm unable to unload p4_clockmod to try acpi_cpufreq, since it's compiled into the kernel, but I'm not sure my CPU supports acpi_cpufreq anyway (has to be EIST?).

 

I think somewhere along the line someone figured out that ondemand and p4_clockmod were not compatible for some reason and basically forced "performance" by setting the transition latency to 10000001 so that ondemand could never load.

 

This is not super critical for me, as I sleep the server after 15 min of inactivity, so not much power is wasted.

 


For 5.0.1, changed the cpu frequency scaling drivers from "built-in" to "modules" - this will at least let you manually remove/install these drivers if desired.  Also added a couple more; the complete list is:

 

drivers/cpufreq/acpi-cpufreq.o
drivers/cpufreq/mperf.o
drivers/cpufreq/powernow-k8.o
drivers/cpufreq/pcc-cpufreq.o
drivers/cpufreq/powernow-k6.o
drivers/cpufreq/powernow-k7.o
drivers/cpufreq/speedstep-ich.o
drivers/cpufreq/speedstep-lib.o
drivers/cpufreq/speedstep-smi.o
drivers/cpufreq/p4-clockmod.o
drivers/cpufreq/cpufreq-nforce2.o
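
With the drivers built as modules, swapping one for another by hand could look something like the sketch below. The function name and the MODPROBE override variable are hypothetical (the override is only there so the steps can be tried without touching the running system); modprobe -r and the scaling_driver sysfs file are standard.

```shell
#!/bin/sh
# swap_cpufreq_driver OLD NEW
# Unloads OLD, loads NEW, then prints the driver cpu0 is now using.
# Set MODPROBE to another command (e.g. "true") to rehearse the steps.
swap_cpufreq_driver() {
    old="$1"
    new="$2"
    run="${MODPROBE:-modprobe}"
    $run -r "$old" || { echo "could not unload $old" >&2; return 1; }
    $run "$new" || { echo "could not load $new" >&2; return 1; }
    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver 2>/dev/null || true
}
```

For example, `swap_cpufreq_driver p4-clockmod acpi-cpufreq`, run as root. Whether acpi-cpufreq actually binds still depends on the CPU and BIOS.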

 


Worth a shot.

 

Thanks!

 

  • 1 month later...

Talk about counterintuitive ... I swapped out an Athlon X2 7750 for a Phenom II X4 975 and the parity check went from an initial 130MB/s down to 60MB/s.  CPU freq was sitting at 800 and yet CPU% was barely 100%.  Changed to the performance governor and speeds are back ... if not faster :o. [sigh]  Time to start playing with scaling :(

 

You know, before people start playing with the MD parms, I think they need to get into their scaling parms.

 

Soooooo exactly what am I tweaking?
