cpufreq/arm-bl-cpufreq: Add simple cpufreq big.LITTLE switcher frontend

Message ID 1332173724-9327-1-git-send-email-dave.martin@linaro.org
State Not Applicable

Commit Message

Dave Martin March 19, 2012, 4:15 p.m. UTC
This patch adds a very simple cpufreq-based frontend to the ARM
big.LITTLE switcher.

This driver simply simulates two performance points corresponding
to the big and little clusters.

Note that this driver requires the ARM switcher implementation to
be loaded in order to work.  The switcher must have been built with
ASYNC = FALSE in big-little/Makefile in order for the synchronous
switching interface to be exposed.

Bugs and limitations:

  * Switching twice in quick succession currently tends to cause
    a deadlock inside the switcher.  I still need to identify
    exactly what is going wrong here.

  * There is currently no tracing interface, and no interface for
    reporting what the dummy frequencies exposed by the driver
    actually mean in terms of real cluster / performance point
    combinations.  For the very simple case supported,
    cpuinfo_max_freq corresponds to big and cpuinfo_min_freq
    corresponds to LITTLE.

  * scaling_cur_freq doesn't accurately report what the current
    performance point is.

  * a different governor can be set on each CPU -- it's not clear
    whether this is a general cpufreq feature or a bug in my
    driver, but we should find a way to prevent it, if possible.

  * cpufreq will trigger spurious extra cluster switches.  This is
    a "feature" since I didn't tell cpufreq that this matters, but
    it may not be desirable.  Setting policy->cpus can probably fix
    this.

  * The low-level switcher interface for switching cluster does not
    currently provide a way to specify the desired destination
    cluster.  This is a possible design flaw in the switcher
    interface.  Currently I work around this by taking a spinlock,
    but this means that the lock has to be held across the cluster
    switch by the CPU instigating the switch.  This "works" but
    it's not clear whether it's desirable.

Signed-off-by: Dave Martin <dave.martin@linaro.org>
---
 Documentation/cpu-freq/cpufreq-arm-bl.txt |   41 ++++++
 arch/arm/Kconfig                          |    1 +
 drivers/cpufreq/Kconfig.arm               |   18 +++
 drivers/cpufreq/Makefile                  |    4 +
 drivers/cpufreq/arm-bl-cpufreq.h          |   13 ++
 drivers/cpufreq/arm-bl-cpufreq_driver.c   |  207 +++++++++++++++++++++++++++++
 drivers/cpufreq/arm-bl-cpufreq_hvc.S      |   15 ++
 7 files changed, 299 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/cpu-freq/cpufreq-arm-bl.txt
 create mode 100644 drivers/cpufreq/arm-bl-cpufreq.h
 create mode 100644 drivers/cpufreq/arm-bl-cpufreq_driver.c
 create mode 100644 drivers/cpufreq/arm-bl-cpufreq_hvc.S

Comments

Nicolas Pitre March 19, 2012, 6:41 p.m. UTC | #1
On Mon, 19 Mar 2012, Dave Martin wrote:

> This patch adds a very simple cpufreq-based frontend to the ARM
> big.LITTLE switcher.
> 
> This driver simply simulates two performance points corresponding
> to the big and little clusters.

Good!

> Note that this driver requires the ARM switcher implementation to
> be loaded in order to work.  The switcher must have been built with
> ASYNC = FALSE in big-little/Makefile in order for the synchronous
> switching interface to be exposed.
> 
> Bugs and limitations:
> 
>   * Switching twice in quick succession currently tends to cause
>     a deadlock inside the switcher.  I still need to identify
>     exactly what is going wrong here.
> 
>   * There is currently no tracing interface, and no interface for
>     reporting what the dummy frequencies exposed by the driver
>     actually mean in terms of real cluster / performance point
>     combinations.  For the very simple case supported,
>     cpuinfo_max_freq corresponds to big and cpuinfo_min_freq
>     corresponds to LITTLE.
> 
>   * scaling_cur_freq doesn't accurately report what the current
>     performance point is.

Also, it is worth remembering that any governor using a feedback loop is 
likely to become confused in some conditions when used on the software 
model as the model is probably not making the actual performance any 
different whether the A7 or the A15 is being emulated.

>   * a different governor can be set on each CPU -- it's not clear
>     whether this is a general cpufreq feature or a bug in my
>     driver, but we should find a way to prevent it, if possible.

Since the in-kernel switcher will make all CPU pairs independent from 
each other, this won't be a concern there, at least from a correctness 
point of view.  Whether or not this is something logical to do will be a 
separate matter.

>   * cpufreq will trigger spurious extra cluster switches.  This is
>     a "feature" since I didn't tell cpufreq that this matters, but
>     it may not be desirable.  Setting policy->cpus can probably fix
>     this.
> 
>   * The low-level switcher interface for switching cluster does not
>     currently provide a way to specify the desired destination
>     cluster.  This is a possible design flaw in the switcher
>     interface.  Currently I work around this by taking a spinlock,
>     but this means that the lock has to be held across the cluster
>     switch by the CPU instigating the switch.  This "works" but
>     it's not clear whether it's desirable.

From the "client" point of view i.e. Linux, I don't think this matters 
much as the actual spinlock may not be busily contended during the 
switch since execution is being redirected to the switcher itself.

> +Within each CPU's cpufreq directory in sysfs (/sys/devices/system/cpu/cpu?/cpufreq/):
> +
> +cpuinfo_min_freq:
> +
> +	reports the dummy frequency value which corresponds to the "big"
> +	cluster.

You probably meant cpuinfo_max_freq here.

BTW, it seems that the arm-soc's for-next branch has everything needed 
to boot on the model when it is passed the appropriate DTB.


Nicolas
Dave Martin March 20, 2012, 10:25 a.m. UTC | #2
On Mon, Mar 19, 2012 at 02:41:16PM -0400, Nicolas Pitre wrote:
> On Mon, 19 Mar 2012, Dave Martin wrote:
> 
> > This patch adds a very simple cpufreq-based frontend to the ARM
> > big.LITTLE switcher.
> > 
> > This driver simply simulates two performance points corresponding
> > to the big and little clusters.
> 
> Good!
> 
> > Note that this driver requires the ARM switcher implementation to
> > be loaded in order to work.  The switcher must have been built with
> > ASYNC = FALSE in big-little/Makefile in order for the synchronous
> > switching interface to be exposed.
> > 
> > Bugs and limitations:
> > 
> >   * Switching twice in quick succession currently tends to cause
> >     a deadlock inside the switcher.  I still need to identify
> >     exactly what is going wrong here.
> > 
> >   * There is currently no tracing interface, and no interface for
> >     reporting what the dummy frequencies exposed by the driver
> >     actually mean in terms of real cluster / performance point
> >     combinations.  For the very simple case supported,
> >     cpuinfo_max_freq corresponds to big and cpuinfo_min_freq
> >     corresponds to LITTLE.
> > 
> >   * scaling_cur_freq doesn't accurately report what the current
> >     performance point is.
> 
> Also, it is worth remembering that any governor using a feedback loop is 
> likely to become confused in some conditions when used on the software 
> model as the model is probably not making the actual performance any 
> different whether the A7 or the A15 is being emulated.

Indeed.  Actually, the non-dumb governors like ondemand and conservative
explicitly require the transition latency to be set, so they refuse to
work right now.  For now, I consider that to be a "feature", but we
can sort it out later.
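
As a rough illustration (not part of the patch), the shell sketch below
checks from userspace whether a transition latency has been set for a CPU.
It assumes that CPUFREQ_ETERNAL appears in cpuinfo_transition_latency as the
unsigned 32-bit value 4294967295; CPU_SYSFS and latency_is_set are
hypothetical names introduced only for this example.

```shell
# Assumed override so the helper can be pointed at a test directory.
CPU_SYSFS=${CPU_SYSFS:-/sys/devices/system/cpu}

# Assumption: CPUFREQ_ETERNAL (-1 in the kernel) is printed as an unsigned
# 32-bit number by sysfs.
ETERNAL=4294967295

# latency_is_set [cpuN]
# Returns 0 if a real latency is reported, 1 if it is still "eternal",
# and 2 if the sysfs file cannot be read.
latency_is_set () {
	lat=$(cat "$CPU_SYSFS/${1:-cpu0}/cpufreq/cpuinfo_transition_latency" 2>/dev/null) || return 2
	[ "$lat" != "$ETERNAL" ]
}
```

With the driver as posted, latency_is_set would be expected to return 1,
which would be consistent with ondemand and conservative refusing to start.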

I locally implemented the driver->get() method, so the frequency can
now be read out through cpu?/cpufreq/cpuinfo_cur_freq.  I'll follow
up with an updated patch later.
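
A minimal sketch of how the readout could be used from a script, assuming
the updated driver reports one of the two dummy frequencies through
cpuinfo_cur_freq.  CPU_SYSFS and current_cluster are hypothetical names for
this example, and cpuinfo_cur_freq is normally readable by root only.

```shell
# Assumed override so the helper can be pointed at a test directory.
CPU_SYSFS=${CPU_SYSFS:-/sys/devices/system/cpu}

# current_cluster [cpuN]
# Map the frequency in cpuinfo_cur_freq back to a cluster name by comparing
# it with cpuinfo_max_freq (big) and cpuinfo_min_freq (little).
current_cluster () {
	dir=$CPU_SYSFS/${1:-cpu0}/cpufreq
	cur=$(cat "$dir/cpuinfo_cur_freq") || return 1
	big=$(cat "$dir/cpuinfo_max_freq") || return 1
	little=$(cat "$dir/cpuinfo_min_freq") || return 1

	if [ "$cur" = "$big" ]; then
		echo big
	elif [ "$cur" = "$little" ]; then
		echo little
	else
		echo "unknown ($cur)"
	fi
}
```

Comparing against the min and max values rather than hard-coding the dummy
frequencies keeps the script independent of the numbers the driver picks.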

> >   * a different governor can be set on each CPU -- it's not clear
> >     whether this is a general cpufreq feature or a bug in my
> >     driver, but we should find a way to prevent it, if possible.
> 
> Since the in-kernel switcher will make all CPU pairs independent from 
> each other, this won't be a concern there, at least from a correctness 
> point of view.  Whether or not this is something logical to do will be a 
> separate matter.

You're right, now that I think a bit more about it.

Having different governors on different clusters may not be useful, but
it is safe -- so there is no point disallowing it if that would
complicate the implementation.
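
For anyone who wants to see what is actually configured, here is a small
sketch that lists the governor on each CPU and reports whether they all
agree.  CPU_SYSFS, show_governors and governors_match are hypothetical
names for this example; only the scaling_governor files come from cpufreq.

```shell
# Assumed override so the helpers can be pointed at a test directory.
CPU_SYSFS=${CPU_SYSFS:-/sys/devices/system/cpu}

# Print "<cpu>: <governor>" for every CPU that has a cpufreq directory.
show_governors () {
	for f in "$CPU_SYSFS"/cpu[0-9]*/cpufreq/scaling_governor; do
		[ -r "$f" ] || continue
		cpu=${f%/cpufreq/scaling_governor}
		echo "${cpu##*/}: $(cat "$f")"
	done
}

# Succeed only if at most one distinct governor is in use.
governors_match () {
	n=$(show_governors | cut -d' ' -f2 | sort -u | wc -l)
	[ "$((n))" -le 1 ]
}
```

A test script could call governors_match before a run and log the output of
show_governors if it fails.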

> >   * cpufreq will trigger spurious extra cluster switches.  This is
> >     a "feature" since I didn't tell cpufreq that this matters, but
> >     it may not be desirable.  Setting policy->cpus can probably fix
> >     this.
> > 
> >   * The low-level switcher interface for switching cluster does not
> >     currently provide a way to specify the desired destination
> >     cluster.  This is a possible design flaw in the switcher
> >     interface.  Currently I work around this by taking a spinlock,
> >     but this means that the lock has to be held across the cluster
> >     switch by the CPU instigating the switch.  This "works" but
> >     it's not clear whether it's desirable.
> 
> From the "client" point of view i.e. Linux, I don't think this matters 
> much as the actual spinlock may not be busily contended during the 
> switch since execution is being redirected to the switcher itself.

OK -- I suggest we leave things as-is for now, then.

> > +Within each CPU's cpufreq directory in sysfs (/sys/devices/system/cpu/cpu?/cpufreq/):
> > +
> > +cpuinfo_min_freq:
> > +
> > +	reports the dummy frequency value which corresponds to the "big"
> > +	cluster.
> 
> You probably meant cpuinfo_max_freq here.

Well spotted :)


Thanks for the review.

> BTW, it seems that the arm-soc's for-next branch has everything needed 
> to boot on the model when it is passed the appropriate DTB.

I had not yet got around to testing this ... have you got it running?

I suspect that we might still need a couple of extra patches in the
short term, but rebasing onto there would still be a good idea.
Anything which takes us closer to mainline is good.


Cheers
---Dave
Avik Sil March 20, 2012, 10:48 a.m. UTC | #3
Hi Dave,

I've used this patch with this kernel:
git://git.linaro.org/people/dmart/linux-2.6-arm.git; branch
arm/vexpressdt-rtsm with the following configs:

CONFIG_CPU_FREQ=y
CONFIG_CPU_FREQ_TABLE=y
CONFIG_CPU_FREQ_STAT=y
# CONFIG_CPU_FREQ_STAT_DETAILS is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set
# CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=y
CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=y
CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y

I'm able to use the switcher with performance and powersave governors
but not with the userspace governor. The default governor is always
performance. Let me know if I'm missing something.

Regards,
Avik

On Monday 19 March 2012 09:45 PM, Dave Martin wrote:

> This patch adds a very simple cpufreq-based frontend to the ARM
> big.LITTLE switcher.
> 
> This driver simply simulates two performance points corresponding
> to the big and little clusters.
> 
> Note that this driver requires the ARM switcher implementation to
> be loaded in order to work.  The switcher must have been built with
> ASYNC = FALSE in big-little/Makefile in order for the synchronous
> switching interface to be exposed.
> 
> Bugs and limitations:
> 
>   * Switching twice in quick succession currently tends to cause
>     a deadlock inside the switcher.  I still need to identify
>     exactly what is going wrong here.
> 
>   * There is currently no tracing interface, and no interface for
>     reporting what the dummy frequencies exposed by the driver
>     actually mean in terms of real cluster / performance point
>     combinations.  For the very simple case supported,
>     cpuinfo_max_freq corresponds to big and cpuinfo_min_freq
>     corresponds to LITTLE.
> 
>   * scaling_cur_freq doesn't accurately report what the current
>     performance point is.
> 
>   * a different governor can be set on each CPU -- it's not clear
>     whether this is a general cpufreq feature or a bug in my
>     driver, but we should find a way to prevent it, if possible.
> 
>   * cpufreq will trigger spurious extra cluster switches.  This is
>     a "feature" since I didn't tell cpufreq that this matters, but
>     it may not be desirable.  Setting policy->cpus can probably fix
>     this.
> 
>   * The low-level switcher interface for switching cluster does not
>     currently provide a way to specify the desired destination
>     cluster.  This is a possible design flaw in the switcher
>     interface.  Currently I work around this by taking a spinlock,
>     but this means that the lock has to be held across the cluster
>     switch by the CPU instigating the switch.  This "works" but
>     it's not clear whether it's desirable.
> 
> Signed-off-by: Dave Martin <dave.martin@linaro.org>
> ---
>  Documentation/cpu-freq/cpufreq-arm-bl.txt |   41 ++++++
>  arch/arm/Kconfig                          |    1 +
>  drivers/cpufreq/Kconfig.arm               |   18 +++
>  drivers/cpufreq/Makefile                  |    4 +
>  drivers/cpufreq/arm-bl-cpufreq.h          |   13 ++
>  drivers/cpufreq/arm-bl-cpufreq_driver.c   |  207 +++++++++++++++++++++++++++++
>  drivers/cpufreq/arm-bl-cpufreq_hvc.S      |   15 ++
>  7 files changed, 299 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/cpu-freq/cpufreq-arm-bl.txt
>  create mode 100644 drivers/cpufreq/arm-bl-cpufreq.h
>  create mode 100644 drivers/cpufreq/arm-bl-cpufreq_driver.c
>  create mode 100644 drivers/cpufreq/arm-bl-cpufreq_hvc.S
> 
> diff --git a/Documentation/cpu-freq/cpufreq-arm-bl.txt b/Documentation/cpu-freq/cpufreq-arm-bl.txt
> new file mode 100644
> index 0000000..0a47c47
> --- /dev/null
> +++ b/Documentation/cpu-freq/cpufreq-arm-bl.txt
> @@ -0,0 +1,41 @@
> +Synchronous cluster switching interface for the ARM big.LITTLE switcher
> +-----------------------------------------------------------------------
> +
> +The arm-bl-cpufreq driver provides a simple interface which models two
> +clusters as two performance points.
> +
> +Within each CPU's cpufreq directory in sysfs (/sys/devices/system/cpu/cpu?/cpufreq/):
> +
> +cpuinfo_min_freq:
> +
> +	reports the dummy frequency value which corresponds to the "big"
> +	cluster.
> +
> +cpuinfo_min_freq:
> +
> +	reports the dummy frequency value which corresponds to the
> +	"little" cluster.
> +
> +
> +To switch clusters, either the built-in "powersave" or "performance"
> +governors can be used to force the "little" or "big" cluster
> +respectively; or alternatively the "userspace" governor can be used.
> +
> +The following script fragment demonstrates how the userspace governor
> +can be used to switch:
> +
> +
> +for x in /sys/devices/system/cpu/cpu[0-9]*; do
> +	echo userspace >$x/cpufreq/scaling_governor
> +done
> +
> +big_freq=`cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq`
> +little_freq=`cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq`
> +
> +switch_to_big () {
> +	echo $big_freq >/sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
> +}
> +
> +switch_to_little () {
> +	echo $little_freq >/sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
> +}
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 455367d..907d44a 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -301,6 +301,7 @@ config ARCH_VERSATILE
>  config ARCH_VEXPRESS
>  	bool "ARM Ltd. Versatile Express family"
>  	select ARCH_WANT_OPTIONAL_GPIOLIB
> +	select ARCH_HAS_CPUFREQ
>  	select ARM_AMBA
>  	select ARM_TIMER_SP804
>  	select CLKDEV_LOOKUP
> diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
> index 72a0044..0e4f6d0 100644
> --- a/drivers/cpufreq/Kconfig.arm
> +++ b/drivers/cpufreq/Kconfig.arm
> @@ -30,3 +30,21 @@ config ARM_EXYNOS4210_CPUFREQ
>  	  SoC (S5PV310 or S5PC210).
>  
>  	  If in doubt, say N.
> +
> +config ARM_BL_CPUFREQ
> +	depends on EXPERIMENTAL
> +	depends on ARCH_VEXPRESS_DT
> +	tristate "Simple cpufreq interface for the ARM big.LITTLE switcher"
> +	help
> +	  Provides a simple cpufreq interface to control the ARM
> +	  big.LITTLE switcher.
> +
> +	  Note that this code is not currently safe unless the
> +	  big.LITTLE switcher binary has been loaded separately by an
> +	  external bootloader or firmware before entering the kernel.
> +	  Otherwise, you can still build this code as a module,
> +	  providing that you don't load it.
> +
> +	  Refer to Documentation/cpu-freq/cpufreq-arm-bl.txt for details.
> +
> +	  If unsure, say N.
> diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
> index a48bc02..ecf492d 100644
> --- a/drivers/cpufreq/Makefile
> +++ b/drivers/cpufreq/Makefile
> @@ -40,6 +40,10 @@ obj-$(CONFIG_X86_CPUFREQ_NFORCE2)	+= cpufreq-nforce2.o
>  ##################################################################################
>  # ARM SoC drivers
>  obj-$(CONFIG_UX500_SOC_DB8500)		+= db8500-cpufreq.o
> +obj-$(CONFIG_ARM_BL_CPUFREQ)		+= arm-bl-cpufreq.o
> +arm-bl-cpufreq-y			+= arm-bl-cpufreq_driver.o \
> +					   arm-bl-cpufreq_hvc.o
> +AFLAGS_arm-bl-cpufreq_hvc.o		:= -march=armv7-a
>  obj-$(CONFIG_ARM_S3C64XX_CPUFREQ)	+= s3c64xx-cpufreq.o
>  obj-$(CONFIG_ARM_S5PV210_CPUFREQ)	+= s5pv210-cpufreq.o
>  obj-$(CONFIG_ARM_EXYNOS4210_CPUFREQ)	+= exynos4210-cpufreq.o
> diff --git a/drivers/cpufreq/arm-bl-cpufreq.h b/drivers/cpufreq/arm-bl-cpufreq.h
> new file mode 100644
> index 0000000..2f9b0dc
> --- /dev/null
> +++ b/drivers/cpufreq/arm-bl-cpufreq.h
> @@ -0,0 +1,13 @@
> +#ifndef ARM_BL_CPUFREQ_HVC_H
> +#define ARM_BL_CPUFREQ_HVC_H
> +
> +#ifndef __ASSEMBLY__
> +int __arm_bl_get_cluster(void);
> +void __arm_bl_switch_cluster(void);
> +#endif /* ! __ASSEMBLY__ */
> +
> +/* Hypervisor call numbers for the ARM big.LITTLE switcher: */
> +#define ARM_BL_HVC_SWITCH_CLUSTER 1
> +#define ARM_BL_HVC_GET_MPIDR 2
> +
> +#endif /* ! ARM_BL_CPUFREQ_HVC_H */
> diff --git a/drivers/cpufreq/arm-bl-cpufreq_driver.c b/drivers/cpufreq/arm-bl-cpufreq_driver.c
> new file mode 100644
> index 0000000..96743dd
> --- /dev/null
> +++ b/drivers/cpufreq/arm-bl-cpufreq_driver.c
> @@ -0,0 +1,207 @@
> +/*
> + * arm-bl-cpufreq.c: Simple cpufreq backend for the ARM big.LITTLE switcher
> + * Copyright (C) 2012  Linaro Limited
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation; either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, write to the Free Software Foundation, Inc.,
> + * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
> + */
> +
> +/* WARNING: This code is experimental and depends on external firmware */
> +
> +#include <linux/bug.h>
> +#include <linux/cache.h>
> +#include <linux/cpufreq.h>
> +#include <linux/cpumask.h>
> +#include <linux/init.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/printk.h>
> +#include <linux/string.h>
> +#include <linux/spinlock.h>
> +
> +#include "arm-bl-cpufreq.h"
> +
> +#define DRIVER_NAME "arm-bl"
> +#define MODULE_NAME "arm-bl-cpufreq"
> +
> +#define info(format...) printk(KERN_INFO MODULE_NAME ": " format)
> +#define warn(format...) printk(KERN_WARNING MODULE_NAME ": " format)
> +
> +/* Dummy frequencies representing the big and little clusters: */
> +#define FREQ_BIG	1000000
> +#define FREQ_LITTLE	 100000
> +
> +/*  Cluster numbers */
> +#define CLUSTER_BIG	0
> +#define CLUSTER_LITTLE	1
> +
> +static DEFINE_SPINLOCK(switcher_lock);
> +
> +static struct cpufreq_frequency_table __read_mostly bl_freqs[] = {
> +	{ CLUSTER_BIG,		FREQ_BIG		},
> +	{ CLUSTER_LITTLE,	FREQ_LITTLE		},
> +	{ 0,			CPUFREQ_TABLE_END	},
> +};
> +
> +
> +/* Miscellaneous helpers */
> +
> +static unsigned int cluster_to_freq(int cluster)
> +{
> +	switch(cluster) {
> +	case 0: return FREQ_BIG;
> +	case 1: return FREQ_LITTLE;
> +	default:
> +		WARN(1, "%s: %s(): invalid cluster number %d, assuming 0\n",
> +		     MODULE_NAME, __func__, cluster);
> +		return FREQ_BIG;
> +	}
> +}
> +
> +/*
> + * Functions to get the current status.
> + *
> + * If you intend to use the result (i.e., it's not just for diagnostic
> + * purposes) then you should be holding switcher_lock ... otherwise
> + * the current cluster may change unexpectedly.
> + */
> +static int get_current_cluster(void)
> +{
> +	return (__arm_bl_get_cluster() >> 8) & 0xF;
> +}
> +
> +static unsigned int get_current_freq(void)
> +{
> +	return cluster_to_freq(get_current_cluster());
> +}
> +
> +/*
> + * Switch to the requested cluster.
> + * There is no "switch_to_frequency" function, because the cpufreq frequency
> + * table helpers can easily look up the appropriate cluster number for us.
> + */
> +static void switch_to_cluster(int cluster)
> +{
> +	info("Switching to cluster %d\n", cluster);
> +
> +	spin_lock(&switcher_lock);
> +	if(cluster != get_current_cluster())
> +		__arm_bl_switch_cluster();
> +	spin_unlock(&switcher_lock);
> +}
> +
> +
> +/* Cpufreq methods and module code */
> +
> +static int bl_cpufreq_init(struct cpufreq_policy *policy)
> +{
> +	int err;
> +
> +	/*
> +	 * Set CPU and policy min and max frequencies based on bl_freqs:
> +	 */
> +	err = cpufreq_frequency_table_cpuinfo(policy, bl_freqs);
> +	if (err)
> +		goto error;
> +
> +	/*
> +	 * No need for locking here:
> +	 * cpufreq is not active until initialisation has finished.
> +	 * Ideally, transition_latency should be calibrated here.
> +	 */
> +	policy->cpuinfo.transition_latency = CPUFREQ_ETERNAL;
> +	policy->cur = get_current_freq();
> +	policy->policy = CPUFREQ_POLICY_PERFORMANCE;
> +
> +	/*
> +	 * A b.L switch can be triggered from any CPU, but will affect them all.
> +	 * The set of related CPUs should perhaps be determined from the
> +	 * system CPU topology, rather than just the set of CPUs present...
> +	 */
> +	policy->shared_type = CPUFREQ_SHARED_TYPE_ANY;
> +	cpumask_copy(policy->related_cpus, cpu_present_mask);
> +	/*
> +	 * We do not set ->cpus here, because it doesn't actually matter if
> +	 * we try to switch on two CPUs at the same time.  Setting ->cpus
> +	 * to cpu_present_mask might provide a way to avoid the need to take
> +	 * switcher_lock when switching, though.
> +	 */
> +
> +	info("cpufreq initialised successfully\n");
> +	return 0;
> +	
> +error:
> +	warn("%s: cpufreq initialisation failed (%d)\n", __func__, err);
> +	return err;
> +}
> +
> +static int bl_cpufreq_verify(struct cpufreq_policy *policy)
> +{
> +	return cpufreq_frequency_table_verify(policy, bl_freqs);
> +}
> +
> +static int bl_cpufreq_target(struct cpufreq_policy *policy,
> +			     unsigned int target_freq,
> +			     unsigned int relation)
> +{
> +	int err, index;
> +
> +	err = cpufreq_frequency_table_target(policy, bl_freqs, target_freq,
> +					     relation, &index);
> +	if (err)
> +		return err;
> +
> +	switch_to_cluster(bl_freqs[index].index);
> +	return 0;
> +}
> +
> +static struct cpufreq_driver __read_mostly bl_cpufreq_driver = {
> +	.owner = THIS_MODULE,
> +	.name = DRIVER_NAME,
> +
> +	.init = bl_cpufreq_init,
> +	.verify = bl_cpufreq_verify,
> +	.target = bl_cpufreq_target,
> +	/* what else? */
> +};	
> +
> +static int __init bl_cpufreq_module_init(void)
> +{
> +	int err;
> +
> +	err = cpufreq_register_driver(&bl_cpufreq_driver);
> +	if(err)
> +		info("cpufreq backend driver registration failed (%d)\n", err);
> +	else
> +		info("cpufreq backend driver registered.\n");
> +
> +	return err;
> +}
> +module_init(bl_cpufreq_module_init);
> +
> +static void __exit bl_cpufreq_module_exit(void)
> +{
> +	cpufreq_unregister_driver(&bl_cpufreq_driver);
> +
> +	/* Restore the "default" cluster: */
> +	switch_to_cluster(CLUSTER_BIG);
> +
> +	info("cpufreq backend driver unloaded.\n");
> +}
> +module_exit(bl_cpufreq_module_exit);
> +
> +
> +MODULE_AUTHOR("Dave Martin");
> +MODULE_DESCRIPTION("Simple cpufreq interface for the ARM big.LITTLE switcher");
> +MODULE_LICENSE("GPL");
> diff --git a/drivers/cpufreq/arm-bl-cpufreq_hvc.S b/drivers/cpufreq/arm-bl-cpufreq_hvc.S
> new file mode 100644
> index 0000000..6c7eb6d
> --- /dev/null
> +++ b/drivers/cpufreq/arm-bl-cpufreq_hvc.S
> @@ -0,0 +1,15 @@
> +#include <linux/linkage.h>
> +
> +#include "arm-bl-cpufreq.h"
> +
> +.arch_extension virt
> +
> +ENTRY(__arm_bl_get_cluster)
> +	hvc	#ARM_BL_HVC_GET_MPIDR
> +	bx	lr
> +ENDPROC(__arm_bl_get_cluster)
> +
> +ENTRY(__arm_bl_switch_cluster)
> +	hvc	#ARM_BL_HVC_SWITCH_CLUSTER
> +	bx	lr
> +ENDPROC(__arm_bl_switch_cluster)
Dave Martin March 20, 2012, 12:09 p.m. UTC | #4
On Tue, Mar 20, 2012 at 10:48 AM, Avik Sil <avik.sil@linaro.org> wrote:
> Hi Dave,
>
> I've used this patch with this kernel:
> git://git.linaro.org/people/dmart/linux-2.6-arm.git; branch
> arm/vexpressdt-rtsm with the following configs:
>
> CONFIG_CPU_FREQ=y
> CONFIG_CPU_FREQ_TABLE=y
> CONFIG_CPU_FREQ_STAT=y
> # CONFIG_CPU_FREQ_STAT_DETAILS is not set
> # CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
> CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE=y
> # CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set
> # CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
> CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
> CONFIG_CPU_FREQ_GOV_POWERSAVE=y
> CONFIG_CPU_FREQ_GOV_USERSPACE=y
> CONFIG_CPU_FREQ_GOV_ONDEMAND=y
> CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y
>
> I'm able to use the switcher with performance and powersave governors
> but not with the userspace governor. The default governor is always
> performance. Let me know if I'm missing something.

Currently, I didn't anticipate that we might want to default to the
userspace governor -- this doesn't necessarily make a lot of sense:
the userspace governor needs active control from userspace in order to
be useful, so it makes sense for that userspace daemon to switch to
the userspace governor when it starts up.  Before the daemon starts
up, it makes sense to default to a trivial in-kernel governor instead.

Having said that, if someone has selected a default governor through the
kernel config we should try to honour it -- I'll need to check with
other cpufreq drivers to see what is appropriate here.  I may have
done something wrong.


In the meantime, you can check that the userspace governor is usable:

for x in /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor; do
    echo userspace >$x
done

... and then try to switch the frequency by echoing to

/sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
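
The two steps above can be sketched as a pair of shell helpers.  This is
only an illustration under the patch's own convention that
cpuinfo_max_freq stands for big and cpuinfo_min_freq for little;
CPU_SYSFS, use_userspace_governor and switch_cluster are hypothetical
names, not something the driver provides.

```shell
# Assumed override so the helpers can be pointed at a test directory.
CPU_SYSFS=${CPU_SYSFS:-/sys/devices/system/cpu}

# Select the userspace governor on every CPU, as in the loop above.
use_userspace_governor () {
	for x in "$CPU_SYSFS"/cpu[0-9]*/cpufreq/scaling_governor; do
		echo userspace >"$x" || return 1
	done
}

# switch_cluster big|little
# Write the matching dummy frequency to cpu0's scaling_setspeed.
switch_cluster () {
	dir=$CPU_SYSFS/cpu0/cpufreq
	case $1 in
	big)	freq=$(cat "$dir/cpuinfo_max_freq") || return 1 ;;
	little)	freq=$(cat "$dir/cpuinfo_min_freq") || return 1 ;;
	*)	echo "usage: switch_cluster big|little" >&2; return 2 ;;
	esac
	echo "$freq" >"$dir/scaling_setspeed"
}
```

Typical use would be "use_userspace_governor && switch_cluster little".
Given the deadlock noted in the patch description, it is probably wise to
leave a pause between consecutive switches.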

Cheers
---Dave
Paul Larson March 20, 2012, 12:18 p.m. UTC | #5
From previous discussions about testing, I think we definitely want to
default to userspace governor so that we maintain full control over when
switching happens during testing.
On Mar 20, 2012 7:09 AM, "Dave Martin" <dave.martin@linaro.org> wrote:

> On Tue, Mar 20, 2012 at 10:48 AM, Avik Sil <avik.sil@linaro.org> wrote:
> > Hi Dave,
> >
> > I've used this patch with this kernel:
> > git://git.linaro.org/people/dmart/linux-2.6-arm.git; branch
> > arm/vexpressdt-rtsm with the following configs:
> >
> > CONFIG_CPU_FREQ=y
> > CONFIG_CPU_FREQ_TABLE=y
> > CONFIG_CPU_FREQ_STAT=y
> > # CONFIG_CPU_FREQ_STAT_DETAILS is not set
> > # CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
> > CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE=y
> > # CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set
> > # CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set
> > CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
> > CONFIG_CPU_FREQ_GOV_POWERSAVE=y
> > CONFIG_CPU_FREQ_GOV_USERSPACE=y
> > CONFIG_CPU_FREQ_GOV_ONDEMAND=y
> > CONFIG_CPU_FREQ_GOV_CONSERVATIVE=y
> >
> > I'm able to use the switcher with performance and powersave governors
> > but not with the userspace governor. The default governor is always
> > performance. Let me know if I'm missing something.
>
> Currently, I didn't anticipate that we might want to default to the
> userspace governor -- this doesn't necessarily make a lot of sense:
> the userspace governor needs active control from userspace in order to
> be useful, so it makes sense for that userspace daemon to switch to
> the userspace governor when it starts up.  Before the daemon starts
> up, it makes sense to default to a trivial in-kernel governor instead.
>
> Having said that, if someone has selected a default governor through the
> kernel config we should try to honour it -- I'll need to check with
> other cpufreq drivers to see what is appropriate here.  I may have
> done something wrong.
>
>
> In the meantime, you can check that the userspace governor is usable:
>
> for x in /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor; do
>    echo userspace >$x
> done
>
> ... and then try to switch the frequency by echoing to
>
> /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
>
> Cheers
> ---Dave
Dave Martin March 20, 2012, 2:26 p.m. UTC | #6
On Tue, Mar 20, 2012 at 12:18 PM, Paul Larson <paul.larson@linaro.org> wrote:
> From previous discussions about testing, I think we definitely want to
> default to userspace governor so that we maintain full control over when
> switching happens during testing.

It turns out that I was overriding the default governor for my driver
unnecessarily, due to a misreading of the cpufreq documentation.

Booting up using the userspace governor still feels a bit wrong --
until something actually starts to drive it it's equivalent to
"performance" anyway -- but it should now work as expected.

I'll push out an updated patch later today.

Cheers
---Dave
Nicolas Pitre March 20, 2012, 3:32 p.m. UTC | #7
On Tue, 20 Mar 2012, Dave Martin wrote:

> > BTW, it seems that the arm-soc's for-next branch has everything needed 
> > to boot on the model when it is passed the appropriate DTB.
> 
> I had not yet got around to testing this ... have you got it running?

Yes.

> I suspect that we might still need a couple of extra patches in the
> short term, but rebasing onto there would still be a good idea.

What patch do you have in mind?


Nicolas
Dave Martin March 20, 2012, 4:09 p.m. UTC | #8
On Tue, Mar 20, 2012 at 3:32 PM, Nicolas Pitre <nicolas.pitre@linaro.org> wrote:
> On Tue, 20 Mar 2012, Dave Martin wrote:
>
>> > BTW, it seems that the arm-soc's for-next branch has everything needed
>> > to boot on the model when it is passed the appropriate DTB.
>>
>> I had not yet got around to testing this ... have you got it running?
>
> Yes.
>
>> I suspect that we might still need a couple of extra patches in the
>> short term, but rebasing onto there would still be a good idea.
>
> What patch do you have in mind?

Some hacks to get the kernel to match the model with regard to
expected coretile IDs, some more hacks to get the smc91x network
driver working, and a hack to make the CLCD work -- of course Pawel
may have done some or all of that already.  I haven't had time to
review what's in arm-soc yet...

Cheers
---Dave
Grant Likely March 20, 2012, 4:10 p.m. UTC | #9
On Mon, 19 Mar 2012 16:15:24 +0000, Dave Martin <dave.martin@linaro.org> wrote:
> This patch adds a very simple cpufreq-based frontend to the ARM
> big.LITTLE switcher.
> 
> This driver simply simulates two performance points corresponding
> to the big and little clusters.
> 
> Note that this driver requires the ARM switcher implementation to
> be loaded in order to work.  The switcher must have been built with
> ASYNC = FALSE in big-little/Makefile in order for the synchronous
> switching interface to be exposed.
> 
> Bugs and limitations:
> 
>   * Switching twice in quick succession currently tends to cause
>     a deadlock inside the switcher.  I still need to identify
>     exactly what is going wrong here.
> 
>   * There is currently no tracing interface, and no interface for
>     reporting what the dummy frequencies exposed by the driver
>     actually mean in terms of real cluster / performance point
>     combinations.  For the very simple case supported,
>     cpuinfo_max_freq corresponds to big and cpuinfo_min_freq
>     corresponds to LITTLE.
> 
>   * scaling_cur_freq doesn't accurately report what the current
>     performance point is.
> 
>   * a different governor can be set on each CPU -- it's not clear
>     whether this is a general cpufreq feature or a bug in my
>     driver, but we should find a way to prevent it, if possible.
> 
>   * cpufreq will trigger spurious extra cluster switches.  This is
>     a "feature" since I didn't tell cpufreq that this matters, but
>     it may not be desirable.  Setting policy->cpus can probably fix
>     this.
> 
>   * The low-level switcher interface for switching cluster does not
>     currently provide a way to specify the desired destination
>     cluster.  This is a possible design flaw in the switcher
>     interface.  Currently I work around this by taking a spinlock,
>     but this means that the lock has to be held across the cluster
>     switch by the CPU instigating the switch.  This "works" but
>     it's not clear whether it's desirable.
> 
> Signed-off-by: Dave Martin <dave.martin@linaro.org>

I've read through this entire thread and it looks like between you and
Nicolas things are in good shape.  All I'm left to add is a minor
comment about coding style:

> +#define DRIVER_NAME "arm-bl"
> +#define MODULE_NAME "arm-bl-cpufreq"
> +
> +#define info(format...) printk(KERN_INFO MODULE_NAME ": " format)
> +#define warn(format...) printk(KERN_WARNING MODULE_NAME ": " format)

pr_fmt
pr_info()
pr_warning()

g.
Dave Martin March 20, 2012, 4:38 p.m. UTC | #10
On Tue, Mar 20, 2012 at 4:10 PM, Grant Likely <grant.likely@secretlab.ca> wrote:
> On Mon, 19 Mar 2012 16:15:24 +0000, Dave Martin <dave.martin@linaro.org> wrote:
>> [...]
>
> I've read through this entire thread and it looks like between you and
> Nicolas things are in good shape.  All I'm left to add is a minor
> comment about coding style:
>
>> +#define DRIVER_NAME "arm-bl"
>> +#define MODULE_NAME "arm-bl-cpufreq"
>> +
>> +#define info(format...) printk(KERN_INFO MODULE_NAME ": " format)
>> +#define warn(format...) printk(KERN_WARNING MODULE_NAME ": " format)
>
> pr_fmt
> pr_info()
> pr_warning()

Ah, excellent.  I'm still a little backward when it comes to the
details of writing drivers, I'm afraid ... I thought I oughtn't to
have to reinvent this kind of thing, though.

Is there a way to get the equivalent of MODULE_NAME without having to
define it explicitly?

Cheers
---Dave
Grant Likely March 20, 2012, 9:19 p.m. UTC | #11
On Tue, 20 Mar 2012 16:38:32 +0000, Dave Martin <dave.martin@linaro.org> wrote:
> On Tue, Mar 20, 2012 at 4:10 PM, Grant Likely <grant.likely@secretlab.ca> wrote:
> > On Mon, 19 Mar 2012 16:15:24 +0000, Dave Martin <dave.martin@linaro.org> wrote:
> >> +#define DRIVER_NAME "arm-bl"
> >> +#define MODULE_NAME "arm-bl-cpufreq"
> >> +
> >> +#define info(format...) printk(KERN_INFO MODULE_NAME ": " format)
> >> +#define warn(format...) printk(KERN_WARNING MODULE_NAME ": " format)
> >
> > pr_fmt
> > pr_info()
> > pr_warning()
> 
> Ah, excellent.  I'm still a little backward when it comes to the
> details of writing drivers, I'm afraid ... I thought I oughtn't to
> have to reinvent this kind of thing, though.
> 
> Is there a way to get the equivalent of MODULE_NAME without having to
> define it explicitly?

Not that I know of; but why would you want the module name and the
driver name to be different?  :-)

g.
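
For reference, the pr_fmt approach suggested above can be tried outside
the kernel. The following is only a userspace sketch: pr_info() and
pr_warning() are approximated with fprintf(), the KBUILD_MODNAME value
is a stand-in for what kbuild would normally supply (typically the
module name with dashes turned into underscores), and report_switch()
is a hypothetical helper that exists purely for illustration.

```c
#include <stdio.h>

/*
 * Stand-in for illustration only: in a real kernel build, kbuild
 * defines KBUILD_MODNAME for every object, so a driver would not
 * define it itself.
 */
#ifndef KBUILD_MODNAME
#define KBUILD_MODNAME "arm_bl_cpufreq"
#endif

/*
 * In a driver this line goes above the first #include, so that
 * <linux/printk.h> sees it; every pr_*() call then gets the prefix.
 */
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt

/* Userspace approximations of the kernel macros (GNU ##__VA_ARGS__). */
#define pr_info(fmt, ...)    fprintf(stdout, pr_fmt(fmt), ##__VA_ARGS__)
#define pr_warning(fmt, ...) fprintf(stderr, pr_fmt(fmt), ##__VA_ARGS__)

/* Hypothetical helper: logs a switch, returns the characters written. */
int report_switch(int cluster)
{
	return pr_info("Switching to cluster %d\n", cluster);
}
```

With this in place the hand-rolled info()/warn() macros and the explicit
MODULE_NAME string would no longer be needed for log prefixes. Note that
newer kernels spell the warning macro pr_warn().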

Patch

diff --git a/Documentation/cpu-freq/cpufreq-arm-bl.txt b/Documentation/cpu-freq/cpufreq-arm-bl.txt
new file mode 100644
index 0000000..0a47c47
--- /dev/null
+++ b/Documentation/cpu-freq/cpufreq-arm-bl.txt
@@ -0,0 +1,41 @@ 
+Synchronous cluster switching interface for the ARM big.LITTLE switcher
+-----------------------------------------------------------------------
+
+The arm-bl-cpufreq driver provides a simple interface which models two
+clusters as two performance points.
+
+Within each CPU's cpufreq directory in sysfs (/sys/devices/system/cpu/cpu?/cpufreq/):
+
+cpuinfo_max_freq:
+
+	reports the dummy frequency value which corresponds to the "big"
+	cluster.
+
+cpuinfo_min_freq:
+
+	reports the dummy frequency value which corresponds to the
+	"little" cluster.
+
+
+To switch clusters, either the built-in "powersave" or "performance"
+governors can be used to force the "little" or "big" cluster
+respectively; alternatively, the "userspace" governor can be used.
+
+The following script fragment demonstrates how the userspace governor
+can be used to switch:
+
+
+for x in /sys/devices/system/cpu/cpu[0-9]*; do
+	echo userspace >$x/cpufreq/scaling_governor
+done
+
+big_freq=`cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq`
+little_freq=`cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq`
+
+switch_to_big () {
+	echo $big_freq >/sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
+}
+
+switch_to_little () {
+	echo $little_freq >/sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed
+}
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 455367d..907d44a 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -301,6 +301,7 @@  config ARCH_VERSATILE
 config ARCH_VEXPRESS
 	bool "ARM Ltd. Versatile Express family"
 	select ARCH_WANT_OPTIONAL_GPIOLIB
+	select ARCH_HAS_CPUFREQ
 	select ARM_AMBA
 	select ARM_TIMER_SP804
 	select CLKDEV_LOOKUP
diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
index 72a0044..0e4f6d0 100644
--- a/drivers/cpufreq/Kconfig.arm
+++ b/drivers/cpufreq/Kconfig.arm
@@ -30,3 +30,21 @@  config ARM_EXYNOS4210_CPUFREQ
 	  SoC (S5PV310 or S5PC210).
 
 	  If in doubt, say N.
+
+config ARM_BL_CPUFREQ
+	depends on EXPERIMENTAL
+	depends on ARCH_VEXPRESS_DT
+	tristate "Simple cpufreq interface for the ARM big.LITTLE switcher"
+	help
+	  Provides a simple cpufreq interface to control the ARM
+	  big.LITTLE switcher.
+
+	  Note that this code is not currently safe unless the
+	  big.LITTLE switcher binary has been loaded separately by an
+	  external bootloader or firmware before entering the kernel.
+	  Otherwise, you can still build this code as a module,
+	  providing that you don't load it.
+
+	  Refer to Documentation/cpu-freq/cpufreq-arm-bl.txt for details.
+
+	  If unsure, say N.
diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
index a48bc02..ecf492d 100644
--- a/drivers/cpufreq/Makefile
+++ b/drivers/cpufreq/Makefile
@@ -40,6 +40,10 @@  obj-$(CONFIG_X86_CPUFREQ_NFORCE2)	+= cpufreq-nforce2.o
 ##################################################################################
 # ARM SoC drivers
 obj-$(CONFIG_UX500_SOC_DB8500)		+= db8500-cpufreq.o
+obj-$(CONFIG_ARM_BL_CPUFREQ)		+= arm-bl-cpufreq.o
+arm-bl-cpufreq-y			+= arm-bl-cpufreq_driver.o \
+					   arm-bl-cpufreq_hvc.o
+AFLAGS_arm-bl-cpufreq_hvc.o		:= -march=armv7-a
 obj-$(CONFIG_ARM_S3C64XX_CPUFREQ)	+= s3c64xx-cpufreq.o
 obj-$(CONFIG_ARM_S5PV210_CPUFREQ)	+= s5pv210-cpufreq.o
 obj-$(CONFIG_ARM_EXYNOS4210_CPUFREQ)	+= exynos4210-cpufreq.o
diff --git a/drivers/cpufreq/arm-bl-cpufreq.h b/drivers/cpufreq/arm-bl-cpufreq.h
new file mode 100644
index 0000000..2f9b0dc
--- /dev/null
+++ b/drivers/cpufreq/arm-bl-cpufreq.h
@@ -0,0 +1,13 @@ 
+#ifndef ARM_BL_CPUFREQ_HVC_H
+#define ARM_BL_CPUFREQ_HVC_H
+
+#ifndef __ASSEMBLY__
+int __arm_bl_get_cluster(void);
+void __arm_bl_switch_cluster(void);
+#endif /* ! __ASSEMBLY__ */
+
+/* Hypervisor call numbers for the ARM big.LITTLE switcher: */
+#define ARM_BL_HVC_SWITCH_CLUSTER 1
+#define ARM_BL_HVC_GET_MPIDR 2
+
+#endif /* ! ARM_BL_CPUFREQ_HVC_H */
diff --git a/drivers/cpufreq/arm-bl-cpufreq_driver.c b/drivers/cpufreq/arm-bl-cpufreq_driver.c
new file mode 100644
index 0000000..96743dd
--- /dev/null
+++ b/drivers/cpufreq/arm-bl-cpufreq_driver.c
@@ -0,0 +1,207 @@ 
+/*
+ * arm-bl-cpufreq_driver.c: Simple cpufreq backend for the ARM big.LITTLE switcher
+ * Copyright (C) 2012  Linaro Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+/* WARNING: This code is experimental and depends on external firmware */
+
+#include <linux/bug.h>
+#include <linux/cache.h>
+#include <linux/cpufreq.h>
+#include <linux/cpumask.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/printk.h>
+#include <linux/string.h>
+#include <linux/spinlock.h>
+
+#include "arm-bl-cpufreq.h"
+
+#define DRIVER_NAME "arm-bl"
+#define MODULE_NAME "arm-bl-cpufreq"
+
+#define info(format...) printk(KERN_INFO MODULE_NAME ": " format)
+#define warn(format...) printk(KERN_WARNING MODULE_NAME ": " format)
+
+/* Dummy frequencies representing the big and little clusters: */
+#define FREQ_BIG	1000000
+#define FREQ_LITTLE	 100000
+
+/*  Cluster numbers */
+#define CLUSTER_BIG	0
+#define CLUSTER_LITTLE	1
+
+static DEFINE_SPINLOCK(switcher_lock);
+
+static struct cpufreq_frequency_table __read_mostly bl_freqs[] = {
+	{ CLUSTER_BIG,		FREQ_BIG		},
+	{ CLUSTER_LITTLE,	FREQ_LITTLE		},
+	{ 0,			CPUFREQ_TABLE_END	},
+};
+
+
+/* Miscellaneous helpers */
+
+static unsigned int cluster_to_freq(int cluster)
+{
+	switch (cluster) {
+	case CLUSTER_BIG: return FREQ_BIG;
+	case CLUSTER_LITTLE: return FREQ_LITTLE;
+	default:
+		WARN(1, "%s: %s(): invalid cluster number %d, assuming 0\n",
+		     MODULE_NAME, __func__, cluster);
+		return FREQ_BIG;
+	}
+}
+
+/*
+ * Functions to get the current status.
+ *
+ * If you intend to use the result (i.e., it's not just for diagnostic
+ * purposes) then you should be holding switcher_lock ... otherwise
+ * the current cluster may change unexpectedly.
+ */
+static int get_current_cluster(void)
+{
+	return (__arm_bl_get_cluster() >> 8) & 0xF;
+}
+
+static unsigned int get_current_freq(void)
+{
+	return cluster_to_freq(get_current_cluster());
+}
+
+/*
+ * Switch to the requested cluster.
+ * There is no "switch_to_frequency" function, because the cpufreq frequency
+ * table helpers can easily look up the appropriate cluster number for us.
+ */
+static void switch_to_cluster(int cluster)
+{
+	info("Switching to cluster %d\n", cluster);
+
+	spin_lock(&switcher_lock);
+	if (cluster != get_current_cluster())
+		__arm_bl_switch_cluster();
+	spin_unlock(&switcher_lock);
+}
+
+
+/* Cpufreq methods and module code */
+
+static int bl_cpufreq_init(struct cpufreq_policy *policy)
+{
+	int err;
+
+	/*
+	 * Set CPU and policy min and max frequencies based on bl_freqs:
+	 */
+	err = cpufreq_frequency_table_cpuinfo(policy, bl_freqs);
+	if (err)
+		goto error;
+
+	/*
+	 * No need for locking here:
+	 * cpufreq is not active until initialisation has finished.
+	 * Ideally, transition_latency should be calibrated here.
+	 */
+	policy->cpuinfo.transition_latency = CPUFREQ_ETERNAL;
+	policy->cur = get_current_freq();
+	policy->policy = CPUFREQ_POLICY_PERFORMANCE;
+
+	/*
+	 * A b.L switch can be triggered from any CPU, but will affect them all.
+	 * The set of related CPUs should perhaps be determined from the
+	 * system CPU topology, rather than just the set of CPUs present...
+	 */
+	policy->shared_type = CPUFREQ_SHARED_TYPE_ANY;
+	cpumask_copy(policy->related_cpus, cpu_present_mask);
+	/*
+	 * We do not set ->cpus here, because it doesn't actually matter if
+	 * we try to switch on two CPUs at the same time.  Setting ->cpus
+	 * to cpu_present_mask might provide a way to avoid the need to take
+	 * switcher_lock when switching, though.
+	 */
+
+	info("cpufreq initialised successfully\n");
+	return 0;
+
+error:
+	warn("%s: cpufreq initialisation failed (%d)\n", __func__, err);
+	return err;
+}
+
+static int bl_cpufreq_verify(struct cpufreq_policy *policy)
+{
+	return cpufreq_frequency_table_verify(policy, bl_freqs);
+}
+
+static int bl_cpufreq_target(struct cpufreq_policy *policy,
+			     unsigned int target_freq,
+			     unsigned int relation)
+{
+	unsigned int index;
+	int err = cpufreq_frequency_table_target(policy, bl_freqs, target_freq,
+						 relation, &index);
+
+	if (err)
+		return err;
+
+	switch_to_cluster(bl_freqs[index].index);
+	return 0;
+}
+
+static struct cpufreq_driver __read_mostly bl_cpufreq_driver = {
+	.owner = THIS_MODULE,
+	.name = DRIVER_NAME,
+
+	.init = bl_cpufreq_init,
+	.verify = bl_cpufreq_verify,
+	.target = bl_cpufreq_target,
+	/* what else? */
+};
+
+static int __init bl_cpufreq_module_init(void)
+{
+	int err;
+
+	err = cpufreq_register_driver(&bl_cpufreq_driver);
+	if (err)
+		warn("cpufreq backend driver registration failed (%d)\n", err);
+	else
+		info("cpufreq backend driver registered.\n");
+
+	return err;
+}
+module_init(bl_cpufreq_module_init);
+
+static void __exit bl_cpufreq_module_exit(void)
+{
+	cpufreq_unregister_driver(&bl_cpufreq_driver);
+
+	/* Restore the "default" cluster: */
+	switch_to_cluster(CLUSTER_BIG);
+
+	info("cpufreq backend driver unloaded.\n");
+}
+module_exit(bl_cpufreq_module_exit);
+
+
+MODULE_AUTHOR("Dave Martin");
+MODULE_DESCRIPTION("Simple cpufreq interface for the ARM big.LITTLE switcher");
+MODULE_LICENSE("GPL");
diff --git a/drivers/cpufreq/arm-bl-cpufreq_hvc.S b/drivers/cpufreq/arm-bl-cpufreq_hvc.S
new file mode 100644
index 0000000..6c7eb6d
--- /dev/null
+++ b/drivers/cpufreq/arm-bl-cpufreq_hvc.S
@@ -0,0 +1,15 @@ 
+#include <linux/linkage.h>
+
+#include "arm-bl-cpufreq.h"
+
+.arch_extension virt
+
+ENTRY(__arm_bl_get_cluster)
+	hvc	#ARM_BL_HVC_GET_MPIDR
+	bx	lr
+ENDPROC(__arm_bl_get_cluster)
+
+ENTRY(__arm_bl_switch_cluster)
+	hvc	#ARM_BL_HVC_SWITCH_CLUSTER
+	bx	lr
+ENDPROC(__arm_bl_switch_cluster)
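
The commit message notes that the low-level switch call is a pure
toggle, so the driver reads the current cluster and switches only when
it differs from the target, all under a lock. The userspace sketch
below models that pattern under stated assumptions: fake_get_mpidr()
and fake_switch_cluster() are hypothetical stand-ins for the two
hypervisor calls, the simulated MPIDR-like value keeps the cluster
number in bits [11:8] as the driver's mask assumes, and a C11
atomic_flag plays the role of the kernel spinlock.

```c
#include <stdatomic.h>
#include <stdint.h>

#define CLUSTER_BIG	0
#define CLUSTER_LITTLE	1

/* Simulated firmware state; the real value comes from an HVC call. */
static uint32_t fake_mpidr = (uint32_t)CLUSTER_BIG << 8;

/* Userspace analogue of DEFINE_SPINLOCK(switcher_lock). */
static atomic_flag switcher_lock = ATOMIC_FLAG_INIT;

/* Hypothetical stand-ins for the hypervisor calls. */
static uint32_t fake_get_mpidr(void)
{
	return fake_mpidr;
}

static void fake_switch_cluster(void)
{
	fake_mpidr ^= 1u << 8;	/* toggle only: no destination argument */
}

int get_current_cluster(void)
{
	/* Same decoding as the driver: low four bits of MPIDR[15:8]. */
	return (fake_get_mpidr() >> 8) & 0xF;
}

/* Returns 1 if a switch was issued, 0 if already on the target. */
int switch_to_cluster(int cluster)
{
	int switched = 0;

	while (atomic_flag_test_and_set_explicit(&switcher_lock,
						 memory_order_acquire))
		;	/* spin until the lock is free */

	/* Check and toggle must be one atomic step with respect to */
	/* other callers, otherwise two requests could cancel out.  */
	if (cluster != get_current_cluster()) {
		fake_switch_cluster();
		switched = 1;
	}

	atomic_flag_clear_explicit(&switcher_lock, memory_order_release);
	return switched;
}
```

Because the lock covers both the read and the toggle, it is necessarily
held across the switch itself, which is the cost the commit message
describes. An interface that accepted the destination cluster directly
would make the check redundant.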