diff mbox series

[V4,1/2] base/drivers/arch_topology: Replace mutex with READ_ONCE / WRITE_ONCE

Message ID 1543234847-21611-1-git-send-email-daniel.lezcano@linaro.org
State Superseded

Commit Message

Daniel Lezcano Nov. 26, 2018, 12:20 p.m. UTC
The mutex protects access to a per-CPU variable. A race is possible
only when the cpufreq governor module is loaded at the same time as
the CPU capacity is changed through sysfs.

There is no real benefit in using a mutex to protect a single variable
assignment when no task can ever take the lock and block.

Replace the mutex with READ_ONCE / WRITE_ONCE.

Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>

Cc: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>

---
 drivers/base/arch_topology.c  | 7 +------
 include/linux/arch_topology.h | 2 +-
 2 files changed, 2 insertions(+), 7 deletions(-)

-- 
2.7.4
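
For readers unfamiliar with the pattern described in the commit message, here is a minimal userspace sketch of it. It is not the kernel's implementation: the macros below are simplified stand-ins for the kernel's READ_ONCE / WRITE_ONCE, NR_CPUS and the plain array are assumptions replacing the real per-CPU variable, and the helper names are hypothetical. The volatile casts only ask the compiler to emit a single access of the declared width; they provide no ordering guarantees, and they assume an aligned word-sized access is not torn on the target.

```c
#include <assert.h>

#define SCHED_CAPACITY_SCALE 1024UL
#define NR_CPUS 4 /* assumption for this sketch only */

/* Simplified stand-ins for the kernel macros (GNU C, needs __typeof__). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* A plain array stands in for the per-CPU cpu_scale variable. */
static unsigned long cpu_scale[NR_CPUS] = {
	SCHED_CAPACITY_SCALE, SCHED_CAPACITY_SCALE,
	SCHED_CAPACITY_SCALE, SCHED_CAPACITY_SCALE,
};

/* Hypothetical counterpart of topology_set_cpu_scale(): one store, no lock. */
static void set_cpu_scale(unsigned int cpu, unsigned long capacity)
{
	assert(cpu < NR_CPUS);
	WRITE_ONCE(cpu_scale[cpu], capacity);
}

/* Hypothetical counterpart of topology_get_cpu_scale(): one load, no lock. */
static unsigned long get_cpu_scale(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	return READ_ONCE(cpu_scale[cpu]);
}
```

Each update touches a single word, so a reader sees either the old or the new capacity, which is the property the commit message relies on when it drops the mutex.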

Comments

Daniel Lezcano Nov. 26, 2018, 12:49 p.m. UTC | #1
On 26/11/2018 13:48, Quentin Perret wrote:
> On Monday 26 Nov 2018 at 13:20:43 (+0100), Daniel Lezcano wrote:
>> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
>> index fd5325b..e0c5b60 100644
>> --- a/drivers/base/arch_topology.c
>> +++ b/drivers/base/arch_topology.c
>> @@ -243,9 +243,20 @@ static int __init register_cpufreq_notifier(void)
>>  	 * until we have the necessary code to parse the cpu capacity, so
>>  	 * skip registering cpufreq notifier.
>>  	 */
>> -	if (!acpi_disabled || !raw_capacity)
>> +	if (!acpi_disabled)
>>  		return -EINVAL;
>>  
>> +	if (!raw_capacity) {
>> +
>> +		pr_info("cpu_capacity: No capacity defined in DT, set default "
>> +		       "values to %ld\n", SCHED_CAPACITY_SCALE);
>> +
>> +		raw_capacity = kmalloc_array(num_possible_cpus(),
>> +					     sizeof(*raw_capacity), GFP_KERNEL);
>> +		if (!raw_capacity)
>> +			return -ENOMEM;
>> +	}
>> +
>>  	if (!alloc_cpumask_var(&cpus_to_visit, GFP_KERNEL)) {
>>  		pr_err("cpu_capacity: failed to allocate memory for cpus_to_visit\n");
>>  		return -ENOMEM;
> 
> I just tried on my Juno by removing the capacity-dmips-mhz values from
> the DT and got the following:
> 
>   $ cat /sys/devices/system/cpu/cpufreq/policy*/scaling_available_frequencies
>   450000 575000 700000 775000 850000
>   450000 625000 800000 950000 1100000
>   $ cat /sys/devices/system/cpu/cpu*/cpu_capacity
>   791
>   1024
>   1024
>   791
>   791
>   791
> 
> Same thing with a partially-filled DT (which is the expected behaviour
> now). So feel free to add:
> 
> Tested-by: Quentin Perret <quentin.perret@arm.com>

Thanks for testing!


-- 
 <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro:  <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog
Greg Kroah-Hartman Nov. 26, 2018, 3:06 p.m. UTC | #2
On Mon, Nov 26, 2018 at 01:20:43PM +0100, Daniel Lezcano wrote:
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -243,9 +243,20 @@ static int __init register_cpufreq_notifier(void)
>  	 * until we have the necessary code to parse the cpu capacity, so
>  	 * skip registering cpufreq notifier.
>  	 */
> -	if (!acpi_disabled || !raw_capacity)
> +	if (!acpi_disabled)
>  		return -EINVAL;
>  
> +	if (!raw_capacity) {
> +
> +		pr_info("cpu_capacity: No capacity defined in DT, set default "
> +		       "values to %ld\n", SCHED_CAPACITY_SCALE);


Why the extra blank line?

And what is userspace going to do with this noise?  Is this an error?
Just normal operation?  A device should never be saying anything to the
log for normal boot functionality.  When is this called?

And no need for the "cpu_capacity:" right?  Shouldn't the pr_info() line
handle the prefix for you?

thanks,

greg k-h
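
The prefix handling mentioned in this review refers to the kernel's pr_fmt() convention: when a source file defines pr_fmt() before including the printk header, pr_info() and friends prepend it to every message. The snippet below is only a userspace sketch of that mechanism. Whether arch_topology.c defines pr_fmt() is not shown in this thread, so the prefix string here is an assumption; pr_info_to() and log_default_capacity() are hypothetical stand-ins that format into a buffer instead of the kernel log.

```c
#include <stdio.h>

/* In kernel code this would be defined before including <linux/printk.h>. */
#define pr_fmt(fmt) "cpu_capacity: " fmt

/*
 * Hypothetical stand-in for pr_info(): the format string is pasted after
 * the pr_fmt() prefix at compile time, then formatted into buf.
 * (##__VA_ARGS__ is a GNU extension, as used by the kernel itself.)
 */
#define pr_info_to(buf, size, fmt, ...) \
	snprintf((buf), (size), pr_fmt(fmt), ##__VA_ARGS__)

/* The caller no longer spells out the "cpu_capacity:" prefix by hand. */
static int log_default_capacity(char *buf, size_t size, unsigned long scale)
{
	return pr_info_to(buf, size,
			  "no capacity defined in DT, defaulting to %lu\n",
			  scale);
}
```

With such a definition in place, individual messages carry only their own text and the prefix stays consistent across the file.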
Rob Herring Nov. 27, 2018, 1:42 a.m. UTC | #3
On Mon, Nov 26, 2018 at 01:20:43PM +0100, Daniel Lezcano wrote:
> In the case of asymmetric SoC with the same micro-architecture, we
> have a group of CPUs with smaller OPPs than the other group. One
> example is the 96boards dragonboard 820c. There is no dmips/MHz
> difference between both groups, so no need to specify the values in
> the DT. Unfortunately, without these defined, there is no scaling
> capacity computation triggered, so we need to write
> 'capacity-dmips-mhz' for each CPU with the same value in order to
> force the scaled capacity computation.
> 
> In order to fix this situation, allocate 'raw_capacity' so the pointer
> is set and the init_cpu_capacity_callback() function can be called.
> 
> This was tested on db820c:
>  - specified values in the DT (correct results)
>  - partial values defined in the DT (error + fallback to defaults)
>  - no specified values in the DT (correct results)
> 
> correct results are:
>   cat /sys/devices/system/cpu/cpu*/cpu_capacity
>    758
>    758
>   1024
>   1024
> 
>   ... respectively for CPU0, CPU1, CPU2 and CPU3.
> 
> That reflects the capacity for the max frequencies 1593600 and 2150400.
> 
> Cc: Chris Redpath <chris.redpath@linaro.org>
> Cc: Quentin Perret <quentin.perret@linaro.org>
> Cc: Viresh Kumar <viresh.kumar@linaro.org>
> Cc: Amit Kucheria <amit.kucheria@linaro.org>
> Cc: Nicolas Dechesne <nicolas.dechesne@linaro.org>
> Cc: Niklas Cassel <niklas.cassel@linaro.org>
> Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org>
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> ---
>  Documentation/devicetree/bindings/arm/cpu-capacity.txt |  6 ++++++

Acked-by: Rob Herring <robh@kernel.org>

>  drivers/base/arch_topology.c                           | 13 ++++++++++++-
>  2 files changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/devicetree/bindings/arm/cpu-capacity.txt b/Documentation/devicetree/bindings/arm/cpu-capacity.txt
> index 84262cd..f53a3c9 100644
> --- a/Documentation/devicetree/bindings/arm/cpu-capacity.txt
> +++ b/Documentation/devicetree/bindings/arm/cpu-capacity.txt
> @@ -54,6 +54,12 @@ fall back to the default capacity value for every CPU. If cpufreq is not
>  available, final capacities are calculated by directly using capacity-dmips-
>  mhz values (normalized w.r.t. the highest value found while parsing the DT).
>  
> +If capacity-dmips-mhz is not specified or if the parsing fails, the
> +default capacity value will be computed against the highest frequency.
> +When all CPUs have the same OPP, they will have the same capacity
> +value otherwise the capacity will be scaled down for CPUs having lower
> +frequencies.
> +
>  ===========================================
>  4 - Examples
>  ===========================================
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index fd5325b..e0c5b60 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -243,9 +243,20 @@ static int __init register_cpufreq_notifier(void)
>  	 * until we have the necessary code to parse the cpu capacity, so
>  	 * skip registering cpufreq notifier.
>  	 */
> -	if (!acpi_disabled || !raw_capacity)
> +	if (!acpi_disabled)
>  		return -EINVAL;
>  
> +	if (!raw_capacity) {
> +
> +		pr_info("cpu_capacity: No capacity defined in DT, set default "
> +		       "values to %ld\n", SCHED_CAPACITY_SCALE);
> +
> +		raw_capacity = kmalloc_array(num_possible_cpus(),
> +					     sizeof(*raw_capacity), GFP_KERNEL);
> +		if (!raw_capacity)
> +			return -ENOMEM;
> +	}
> +
>  	if (!alloc_cpumask_var(&cpus_to_visit, GFP_KERNEL)) {
>  		pr_err("cpu_capacity: failed to allocate memory for cpus_to_visit\n");
>  		return -ENOMEM;
> -- 
> 2.7.4
>
Daniel Lezcano Nov. 27, 2018, 8:31 a.m. UTC | #4
On 27/11/2018 04:57, Viresh Kumar wrote:
> On 26-11-18, 13:20, Daniel Lezcano wrote:
>> diff --git a/Documentation/devicetree/bindings/arm/cpu-capacity.txt b/Documentation/devicetree/bindings/arm/cpu-capacity.txt
>> index 84262cd..f53a3c9 100644
>> --- a/Documentation/devicetree/bindings/arm/cpu-capacity.txt
>> +++ b/Documentation/devicetree/bindings/arm/cpu-capacity.txt
>> @@ -54,6 +54,12 @@ fall back to the default capacity value for every CPU. If cpufreq is not
>>  available, final capacities are calculated by directly using capacity-dmips-
>>  mhz values (normalized w.r.t. the highest value found while parsing the DT).
>>  
>> +If capacity-dmips-mhz is not specified or if the parsing fails, the
>> +default capacity value will be computed against the highest frequency.
>> +When all CPUs have the same OPP, they will have the same capacity
>> +value otherwise the capacity will be scaled down for CPUs having lower
>> +frequencies.
> 
> I know you added this based on Quentin's feedback, but I wonder if this is
> really required and if it is improving anything at all. This is what the
> documentation says currently without this patch:
> 
> "
> capacity-dmips-mhz is an optional cpu node [1] property: u32 value
> representing CPU capacity expressed in normalized DMIPS/MHz. At boot time, the
> maximum frequency available to the cpu is then used to calculate the capacity
> value internally used by the kernel.
> 
> capacity-dmips-mhz property is all-or-nothing: if it is specified for a cpu
> node, it has to be specified for every other cpu nodes, or the system will
> fall back to the default capacity value for every CPU. If cpufreq is not
> available, final capacities are calculated by directly using capacity-dmips-
> mhz values (normalized w.r.t. the highest value found while parsing the DT).
> "
> 
> So it already clearly says two things:
> - If all CPUs don't have this property, we fallback to default capacity for
>   every CPU.
> - And the OS may also normalize the capacity based on the maximum frequency.
> 
> What more do we want to add here ?


I think what is new is the silver-gold platform. I agree the description
above gives us the information but in a condensed way. With this extra
paragraph we elaborate a bit and make it more clear for SMP/AMP systems.

Quentin Perret Nov. 27, 2018, 9:09 a.m. UTC | #5
On Tuesday 27 Nov 2018 at 09:27:35 (+0530), Viresh Kumar wrote:
> On 26-11-18, 13:20, Daniel Lezcano wrote:
> > diff --git a/Documentation/devicetree/bindings/arm/cpu-capacity.txt b/Documentation/devicetree/bindings/arm/cpu-capacity.txt
> > index 84262cd..f53a3c9 100644
> > --- a/Documentation/devicetree/bindings/arm/cpu-capacity.txt
> > +++ b/Documentation/devicetree/bindings/arm/cpu-capacity.txt
> > @@ -54,6 +54,12 @@ fall back to the default capacity value for every CPU. If cpufreq is not
> >  available, final capacities are calculated by directly using capacity-dmips-
> >  mhz values (normalized w.r.t. the highest value found while parsing the DT).
> >  
> > +If capacity-dmips-mhz is not specified or if the parsing fails, the
> > +default capacity value will be computed against the highest frequency.
> > +When all CPUs have the same OPP, they will have the same capacity
> > +value otherwise the capacity will be scaled down for CPUs having lower
> > +frequencies.
> 
> I know you added this based on Quentin's feedback, but I wonder if this is
> really required and if it is improving anything at all. This is what the
> documentation says currently without this patch:
> 
> "
> capacity-dmips-mhz is an optional cpu node [1] property: u32 value
> representing CPU capacity expressed in normalized DMIPS/MHz. At boot time, the
> maximum frequency available to the cpu is then used to calculate the capacity
> value internally used by the kernel.
> 
> capacity-dmips-mhz property is all-or-nothing: if it is specified for a cpu
> node, it has to be specified for every other cpu nodes, or the system will
> fall back to the default capacity value for every CPU. If cpufreq is not
> available, final capacities are calculated by directly using capacity-dmips-
> mhz values (normalized w.r.t. the highest value found while parsing the DT).
> "
> 
> So it already clearly says two things:
> - If all CPUs don't have this property, we fallback to default capacity for
>   every CPU.


Which is not what we do with this patch any more. We fallback to
scaling with frequency. So I do think the doc needs updating one way or
another. You could define more clearly what "default capacity" means if
you want and say that is scaled with frequency, that'd be fine by me.

Thanks,
Quentin
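
The frequency-scaled fallback discussed in this thread can be illustrated with a short arithmetic sketch. It is a simplification, not the code in arch_topology.c: scale_capacity_by_freq() is a hypothetical helper, and it assumes every CPU starts from the same default raw capacity, so that only the ratio of each CPU's maximum frequency to the highest maximum frequency in the system matters. The frequencies used in the usage note are the ones quoted in the thread.

```c
#include <assert.h>

#define SCHED_CAPACITY_SHIFT 10
#define SCHED_CAPACITY_SCALE (1UL << SCHED_CAPACITY_SHIFT)

/*
 * Hypothetical helper: scale a CPU's capacity by the ratio of its maximum
 * frequency to the highest maximum frequency found in the system. Integer
 * division truncates the result.
 */
static unsigned long scale_capacity_by_freq(unsigned long max_freq_khz,
					    unsigned long highest_freq_khz)
{
	assert(highest_freq_khz > 0 && max_freq_khz <= highest_freq_khz);
	return (unsigned long)
		(((unsigned long long)max_freq_khz << SCHED_CAPACITY_SHIFT) /
		 highest_freq_khz);
}
```

For the db820c frequencies, scale_capacity_by_freq(1593600, 2150400) gives 758 and the faster cluster gets 1024. For the Juno frequencies, scale_capacity_by_freq(850000, 1100000) gives 791. Both agree with the cpu_capacity values reported above.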
Patch

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index edfcf8d..fd5325b 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -31,12 +31,11 @@  void arch_set_freq_scale(struct cpumask *cpus, unsigned long cur_freq,
 		per_cpu(freq_scale, i) = scale;
 }
 
-static DEFINE_MUTEX(cpu_scale_mutex);
 DEFINE_PER_CPU(unsigned long, cpu_scale) = SCHED_CAPACITY_SCALE;
 
 void topology_set_cpu_scale(unsigned int cpu, unsigned long capacity)
 {
-	per_cpu(cpu_scale, cpu) = capacity;
+	WRITE_ONCE(per_cpu(cpu_scale, cpu), capacity);
 }
 
 static ssize_t cpu_capacity_show(struct device *dev,
@@ -71,10 +70,8 @@  static ssize_t cpu_capacity_store(struct device *dev,
 	if (new_capacity > SCHED_CAPACITY_SCALE)
 		return -EINVAL;
 
-	mutex_lock(&cpu_scale_mutex);
 	for_each_cpu(i, &cpu_topology[this_cpu].core_sibling)
 		topology_set_cpu_scale(i, new_capacity);
-	mutex_unlock(&cpu_scale_mutex);
 
 	schedule_work(&update_topology_flags_work);
 
@@ -141,7 +138,6 @@  void topology_normalize_cpu_scale(void)
 		return;
 
 	pr_debug("cpu_capacity: capacity_scale=%u\n", capacity_scale);
-	mutex_lock(&cpu_scale_mutex);
 	for_each_possible_cpu(cpu) {
 		pr_debug("cpu_capacity: cpu=%d raw_capacity=%u\n",
 			 cpu, raw_capacity[cpu]);
@@ -151,7 +147,6 @@  void topology_normalize_cpu_scale(void)
 		pr_debug("cpu_capacity: CPU%d cpu_capacity=%lu\n",
 			cpu, topology_get_cpu_scale(NULL, cpu));
 	}
-	mutex_unlock(&cpu_scale_mutex);
 }
 
 bool __init topology_parse_cpu_capacity(struct device_node *cpu_node, int cpu)
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index d9bdc1a..12c439f 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -20,7 +20,7 @@  struct sched_domain;
 static inline
 unsigned long topology_get_cpu_scale(struct sched_domain *sd, int cpu)
 {
-	return per_cpu(cpu_scale, cpu);
+	return READ_ONCE(per_cpu(cpu_scale, cpu));
 }
 
 void topology_set_cpu_scale(unsigned int cpu, unsigned long capacity);