[v2] PM: EM: Fix potential division-by-zero error in em_compute_costs()

Message ID tencent_EE27C7D1D6BDB3EE57A2C467CC59A866C405@qq.com

Commit Message

Yaxiong Tian April 11, 2025, 1:28 a.m. UTC
From: Yaxiong Tian <tianyaxiong@kylinos.cn>

When the device is of a non-CPU type, table[i].performance won't be
initialized in the previous em_init_performance(), resulting in division
by zero when calculating costs in em_compute_costs().

Since the 'cost' algorithm is only used for EAS energy efficiency
calculations and is currently not utilized by other device drivers, we
should add the _is_cpu_device(dev) check to prevent this division-by-zero
issue.

Fixes: 1b600da51073 ("PM: EM: Optimize em_cpu_energy() and remove division")
Signed-off-by: Yaxiong Tian <tianyaxiong@kylinos.cn>
---
 kernel/power/energy_model.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Lukasz Luba April 14, 2025, 8:08 a.m. UTC | #1
Hi Yaxiong,

On 4/11/25 02:28, Yaxiong Tian wrote:
> From: Yaxiong Tian <tianyaxiong@kylinos.cn>
> 
> When the device is of a non-CPU type, table[i].performance won't be
> initialized in the previous em_init_performance(), resulting in division
> by zero when calculating costs in em_compute_costs().
> 
> Since the 'cost' algorithm is only used for EAS energy efficiency
> calculations and is currently not utilized by other device drivers, we
> should add the _is_cpu_device(dev) check to prevent this division-by-zero
> issue.
> 
> Fixes: <1b600da51073> ("PM: EM: Optimize em_cpu_energy() and remove division")
> Signed-off-by: Yaxiong Tian <tianyaxiong@kylinos.cn>
> ---
>   kernel/power/energy_model.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c
> index d9b7e2b38c7a..d1fa7e8787b5 100644
> --- a/kernel/power/energy_model.c
> +++ b/kernel/power/energy_model.c
> @@ -244,7 +244,7 @@ static int em_compute_costs(struct device *dev, struct em_perf_state *table,
>   					cost, ret);
>   				return -EINVAL;
>   			}
> -		} else {
> +		} else if (_is_cpu_device(dev)) {
>   			/* increase resolution of 'cost' precision */
>   			power_res = table[i].power * 10;
>   			cost = power_res / table[i].performance;


As the test robot pointed out, please set the 'cost' to 0
where it's declared.

The rest should be fine.

Regards,
Lukasz
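
To illustrate the request above, here is a minimal userspace sketch of the patched branch with 'cost' initialized at its declaration. The struct perf_state type, the compute_costs() name and the is_cpu parameter are hypothetical stand-ins for struct em_perf_state, em_compute_costs() and _is_cpu_device(dev); the artificial-EM get_cost() branch is left out. Without the initializer, a non-CPU device would reach the final store with an indeterminate 'cost'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's struct em_perf_state. */
struct perf_state {
	unsigned long performance;	/* stays 0 if never initialized (non-CPU) */
	unsigned long power;
	unsigned long cost;
};

/*
 * Userspace model of the patched branch: divide only for CPU devices and
 * start 'cost' at 0 so the store at the end of the loop body is always
 * defined. For CPU devices, 'performance' is assumed to be non-zero.
 */
static void compute_costs(struct perf_state *table, size_t nr_states,
			  bool is_cpu)
{
	assert(table != NULL || nr_states == 0);

	for (size_t i = 0; i < nr_states; i++) {
		unsigned long cost = 0;	/* initialized where it is declared */

		if (is_cpu) {
			/* increase resolution of 'cost' precision */
			cost = table[i].power * 10 / table[i].performance;
		}

		table[i].cost = cost;
	}
}
```

With this shape, a non-CPU table whose 'performance' fields are all 0 never reaches the division and ends up with a cost of 0 in every entry.
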
Yaxiong Tian April 14, 2025, 9:08 a.m. UTC | #2
On 2025/4/14 16:08, Lukasz Luba wrote:
> Hi Yaxiong,
> 
> On 4/11/25 02:28, Yaxiong Tian wrote:
>> From: Yaxiong Tian <tianyaxiong@kylinos.cn>
>>
>> When the device is of a non-CPU type, table[i].performance won't be
>> initialized in the previous em_init_performance(), resulting in division
>> by zero when calculating costs in em_compute_costs().
>>
>> Since the 'cost' algorithm is only used for EAS energy efficiency
>> calculations and is currently not utilized by other device drivers, we
>> should add the _is_cpu_device(dev) check to prevent this division-by-zero
>> issue.
>>
>> Fixes: <1b600da51073> ("PM: EM: Optimize em_cpu_energy() and remove 
>> division")
>> Signed-off-by: Yaxiong Tian <tianyaxiong@kylinos.cn>
>> ---
>>   kernel/power/energy_model.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c
>> index d9b7e2b38c7a..d1fa7e8787b5 100644
>> --- a/kernel/power/energy_model.c
>> +++ b/kernel/power/energy_model.c
>> @@ -244,7 +244,7 @@ static int em_compute_costs(struct device *dev, 
>> struct em_perf_state *table,
>>                       cost, ret);
>>                   return -EINVAL;
>>               }
>> -        } else {
>> +        } else if (_is_cpu_device(dev)) {
>>               /* increase resolution of 'cost' precision */
>>               power_res = table[i].power * 10;
>>               cost = power_res / table[i].performance;
> 
> 
> As the test robot pointed out, please set the 'cost' to 0
> where it's declared.
> 
> The rest should be fine.
> 
> Regards,
> Lukasz

Okay.
I wasn’t sure whether the new patch should reuse the current Message-ID
or create a new one, so I checked with ChatGPT and kept the original
Message-ID for v2. If a new Message-ID is required, I can resend the
email.
Yaxiong Tian April 15, 2025, 1:12 a.m. UTC | #3
On 2025/4/14 16:08, Lukasz Luba wrote:
> Hi Yaxiong,
> 
> On 4/11/25 02:28, Yaxiong Tian wrote:
>> From: Yaxiong Tian <tianyaxiong@kylinos.cn>
>>
>> When the device is of a non-CPU type, table[i].performance won't be
>> initialized in the previous em_init_performance(), resulting in division
>> by zero when calculating costs in em_compute_costs().
>>
>> Since the 'cost' algorithm is only used for EAS energy efficiency
>> calculations and is currently not utilized by other device drivers, we
>> should add the _is_cpu_device(dev) check to prevent this division-by-zero
>> issue.
>>
>> Fixes: <1b600da51073> ("PM: EM: Optimize em_cpu_energy() and remove 
>> division")
>> Signed-off-by: Yaxiong Tian <tianyaxiong@kylinos.cn>
>> ---
>>   kernel/power/energy_model.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c
>> index d9b7e2b38c7a..d1fa7e8787b5 100644
>> --- a/kernel/power/energy_model.c
>> +++ b/kernel/power/energy_model.c
>> @@ -244,7 +244,7 @@ static int em_compute_costs(struct device *dev, 
>> struct em_perf_state *table,
>>                       cost, ret);
>>                   return -EINVAL;
>>               }
>> -        } else {
>> +        } else if (_is_cpu_device(dev)) {
>>               /* increase resolution of 'cost' precision */
>>               power_res = table[i].power * 10;
>>               cost = power_res / table[i].performance;
> 
> 
> As the test robot pointed out, please set the 'cost' to 0
> where it's declared.
> 
> The rest should be fine.
> 
> Regards,
> Lukasz

Sorry, the V3 version with cost=0 still has issues.

I noticed that if the cost is set to 0, the condition
"if (table[i].cost >= prev_cost)" in the following code will always
evaluate to true. This will incorrectly set the flags to
EM_PERF_STATE_INEFFICIENT.

Should we change ">=" to ">"?
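
The effect of ">=" versus ">" on an all-zero cost table can be checked with a small userspace sketch. It assumes the flagging loop walks the table from the highest state down and starts with prev_cost at ULONG_MAX; that detail is not shown in this thread and is an assumption here. The names perf_state, STATE_INEFFICIENT, mark_inefficient() and the 'strict' parameter are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for EM_PERF_STATE_INEFFICIENT. */
#define STATE_INEFFICIENT 1UL

/* Hypothetical, simplified stand-in for struct em_perf_state. */
struct perf_state {
	unsigned long cost;
	unsigned long flags;
};

/*
 * Flag states whose cost is not better than the cheapest state seen so far.
 * Assumption: iteration goes from the last (highest) state to the first,
 * with prev_cost starting at ULONG_MAX. 'strict' selects ">" over ">=".
 * Returns the number of states that were flagged.
 */
static size_t mark_inefficient(struct perf_state *table, size_t nr_states,
			       bool strict)
{
	unsigned long prev_cost = ULONG_MAX;
	size_t flagged = 0;

	for (size_t i = nr_states; i-- > 0; ) {
		bool worse = strict ? table[i].cost > prev_cost
				    : table[i].cost >= prev_cost;

		if (worse) {
			table[i].flags |= STATE_INEFFICIENT;
			flagged++;
		} else {
			prev_cost = table[i].cost;
		}
	}

	return flagged;
}
```

Under these assumptions, a three-entry table with every cost at 0 gets two states flagged with ">=" (the highest state only lowers prev_cost to 0) and none with ">".
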
Rafael J. Wysocki April 15, 2025, 5:19 p.m. UTC | #4
On Tue, Apr 15, 2025 at 4:03 AM Yaxiong Tian <iambestgod@qq.com> wrote:
>
>
>
> On 2025/4/15 09:12, Yaxiong Tian wrote:
> >
> >
> > On 2025/4/14 16:08, Lukasz Luba wrote:
> >> Hi Yaxiong,
> >>
> >> On 4/11/25 02:28, Yaxiong Tian wrote:
> >>> From: Yaxiong Tian <tianyaxiong@kylinos.cn>
> >>>
> >>> When the device is of a non-CPU type, table[i].performance won't be
> >>> initialized in the previous em_init_performance(), resulting in division
> >>> by zero when calculating costs in em_compute_costs().
> >>>
> >>> Since the 'cost' algorithm is only used for EAS energy efficiency
> >>> calculations and is currently not utilized by other device drivers, we
> >>> should add the _is_cpu_device(dev) check to prevent this
> >>> division-by-zero
> >>> issue.
> >>>
> >>> Fixes: <1b600da51073> ("PM: EM: Optimize em_cpu_energy() and remove
> >>> division")
> >>> Signed-off-by: Yaxiong Tian <tianyaxiong@kylinos.cn>
> >>> ---
> >>>   kernel/power/energy_model.c | 2 +-
> >>>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>>
> >>> diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c
> >>> index d9b7e2b38c7a..d1fa7e8787b5 100644
> >>> --- a/kernel/power/energy_model.c
> >>> +++ b/kernel/power/energy_model.c
> >>> @@ -244,7 +244,7 @@ static int em_compute_costs(struct device *dev,
> >>> struct em_perf_state *table,
> >>>                       cost, ret);
> >>>                   return -EINVAL;
> >>>               }
> >>> -        } else {
> >>> +        } else if (_is_cpu_device(dev)) {
> >>>               /* increase resolution of 'cost' precision */
> >>>               power_res = table[i].power * 10;
> >>>               cost = power_res / table[i].performance;
> >>
> >>
> >> As the test robot pointed out, please set the 'cost' to 0
> >> where it's declared.
> >>
> >> The rest should be fine.
> >>
> >> Regards,
> >> Lukasz
> >
> > Sorry, the V3 version with cost=0 still has issues.
> >
> > I noticed that if the cost is set to 0, the condition
> > "if (table[i].cost >= prev_cost)" in the following code will always
> > evaluate to true. This will incorrectly set the flags to
> > EM_PERF_STATE_INEFFICIENT.
> >
> > Should we change ">=" to ">"?
> >
>
> Sorry again, setting EM_PERF_STATE_INEFFICIENT in this case is correct.
> Earlier, I misunderstood the definition/usage of EM_PERF_STATE_INEFFICIENT.

Well, EM_PERF_STATE_INEFFICIENT is only looked at in CPU energy
models, so setting it in a non-CPU one is redundant.

Patch

diff --git a/kernel/power/energy_model.c b/kernel/power/energy_model.c
index d9b7e2b38c7a..d1fa7e8787b5 100644
--- a/kernel/power/energy_model.c
+++ b/kernel/power/energy_model.c
@@ -244,7 +244,7 @@  static int em_compute_costs(struct device *dev, struct em_perf_state *table,
 					cost, ret);
 				return -EINVAL;
 			}
-		} else {
+		} else if (_is_cpu_device(dev)) {
 			/* increase resolution of 'cost' precision */
 			power_res = table[i].power * 10;
 			cost = power_res / table[i].performance;