[v2,4/4] cpufreq: schedutil: Always call drvier if need_freq_update is set

Message ID 1905098.zDJocX6404@kreacher
State New
Series [v2,1/4] cpufreq: Introduce CPUFREQ_NEED_UPDATE_LIMITS driver flag

Commit Message

Rafael J. Wysocki Oct. 23, 2020, 3:36 p.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Because sugov_update_next_freq() may skip a frequency update even if
the need_freq_update flag has been set for the policy at hand, policy
limits updates may not take effect as expected.

For example, if the intel_pstate driver operates in the passive mode
with HWP enabled, it needs to update the HWP min and max limits when
the policy min and max limits change, respectively, but that may not
happen if the target frequency does not change along with the limit
at hand.  In particular, if the policy min is changed first, causing
the target frequency to be adjusted to it, and the policy max limit
is changed later to the same value, the HWP max limit will not be
updated to follow it as expected, because the target frequency is
still equal to the policy min limit and it will not change until
that limit is updated.

To address this issue, modify get_next_freq() to clear
need_freq_update only if the CPUFREQ_NEED_UPDATE_LIMITS flag is
not set for the cpufreq driver in use (and it should be set for all
potentially affected drivers) and make sugov_update_next_freq()
check need_freq_update and continue when it is set regardless of
whether or not the new target frequency is equal to the old one.

Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
Reported-by: Zhang Rui <rui.zhang@intel.com>
Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

New patch in v2.

---
 kernel/sched/cpufreq_schedutil.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)
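
The scenario from the commit message (policy min raised first, policy max then lowered to the same value) can be replayed in user space. The following is only a minimal sketch under stated assumptions: struct model_policy, the model_* functions and the driver_calls counter are hypothetical stand-ins for the schedutil internals, frequency resolution is reduced to the identity function, and the "patched" switch selects between the old logic and the logic of this v2 patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct sugov_policy (only the relevant fields). */
struct model_policy {
	unsigned int next_freq;
	unsigned int cached_raw_freq;
	bool need_freq_update;
	bool driver_needs_limits;	/* models CPUFREQ_NEED_UPDATE_LIMITS */
	unsigned int driver_calls;	/* times the driver would be reached */
};

/* Models get_next_freq(); frequency resolution is the identity here. */
static unsigned int model_get_next_freq(struct model_policy *p,
					unsigned int freq, bool patched)
{
	if (freq == p->cached_raw_freq && !p->need_freq_update)
		return p->next_freq;

	if (!patched)
		p->need_freq_update = false;
	else if (p->need_freq_update)
		p->need_freq_update = p->driver_needs_limits;

	p->cached_raw_freq = freq;
	return freq;
}

/* Models sugov_update_next_freq(); true means "go on and call the driver". */
static bool model_update_next_freq(struct model_policy *p,
				   unsigned int next_freq, bool patched)
{
	if (p->next_freq == next_freq && !(patched && p->need_freq_update))
		return false;

	p->next_freq = next_freq;
	if (patched)
		p->need_freq_update = false;
	return true;
}

/* One governor pass for a requested raw frequency. */
static void model_pass(struct model_policy *p, unsigned int freq, bool patched)
{
	unsigned int next = model_get_next_freq(p, freq, patched);

	if (model_update_next_freq(p, next, patched))
		p->driver_calls++;
}

/* Replays the two limit changes and returns how often the driver ran. */
static unsigned int model_scenario(bool patched)
{
	struct model_policy p = {
		.next_freq = 1000000,
		.cached_raw_freq = 1000000,
		.driver_needs_limits = true,
	};

	p.need_freq_update = true;	/* policy min raised to 2000000 */
	model_pass(&p, 2000000, patched);
	p.need_freq_update = true;	/* policy max lowered to 2000000 */
	model_pass(&p, 2000000, patched);
	return p.driver_calls;
}
```

In this model, model_scenario(false) reaches the driver once, so the second limit change is lost, while model_scenario(true) reaches it twice.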

Comments

Viresh Kumar Oct. 27, 2020, 4:25 a.m. UTC | #1
Spelling mistake in $subject (driver)

On 23-10-20, 17:36, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Because sugov_update_next_freq() may skip a frequency update even if
> the need_freq_update flag has been set for the policy at hand, policy
> limits updates may not take effect as expected.
> 
> For example, if the intel_pstate driver operates in the passive mode
> with HWP enabled, it needs to update the HWP min and max limits when
> the policy min and max limits change, respectively, but that may not
> happen if the target frequency does not change along with the limit
> at hand.  In particular, if the policy min is changed first, causing
> the target frequency to be adjusted to it, and the policy max limit
> is changed later to the same value, the HWP max limit will not be
> updated to follow it as expected, because the target frequency is
> still equal to the policy min limit and it will not change until
> that limit is updated.
> 
> To address this issue, modify get_next_freq() to clear
> need_freq_update only if the CPUFREQ_NEED_UPDATE_LIMITS flag is
> not set for the cpufreq driver in use (and it should be set for all
> potentially affected drivers) and make sugov_update_next_freq()
> check need_freq_update and continue when it is set regardless of
> whether or not the new target frequency is equal to the old one.
> 
> Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
> Reported-by: Zhang Rui <rui.zhang@intel.com>
> Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> 
> New patch in v2.
> 
> ---
>  kernel/sched/cpufreq_schedutil.c |    8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> Index: linux-pm/kernel/sched/cpufreq_schedutil.c
> ===================================================================
> --- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
> +++ linux-pm/kernel/sched/cpufreq_schedutil.c
> @@ -102,11 +102,12 @@ static bool sugov_should_update_freq(str
>  static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
>  				   unsigned int next_freq)
>  {
> -	if (sg_policy->next_freq == next_freq)
> +	if (sg_policy->next_freq == next_freq && !sg_policy->need_freq_update)
>  		return false;
>  
>  	sg_policy->next_freq = next_freq;
>  	sg_policy->last_freq_update_time = time;
> +	sg_policy->need_freq_update = false;
>  
>  	return true;
>  }
> @@ -164,7 +165,10 @@ static unsigned int get_next_freq(struct
>  	if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
>  		return sg_policy->next_freq;
>  
> -	sg_policy->need_freq_update = false;
> +	if (sg_policy->need_freq_update)
> +		sg_policy->need_freq_update =
> +			cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
> +

The behavior here is a bit different from what we did in cpufreq.c. In cpufreq
core we are _always_ allowing the call to reach the driver's target() routine,
but here we do it only if limits have changed. Wonder if we should have similar
behavior here as well ?

Over that the code here can be rewritten a bit like:

	if (sg_policy->need_freq_update)
		sg_policy->need_freq_update = cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
	else if (freq == sg_policy->cached_raw_freq)
		return sg_policy->next_freq;
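
Whether this restructuring changes behavior relative to the form in the patch can be checked exhaustively, since only three booleans are involved. The sketch below is a user-space illustration, not kernel code: struct outcome, patch_form(), suggested_form() and count_mismatches() are hypothetical names, same_freq models "freq == sg_policy->cached_raw_freq", and drv_flag models the result of cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS).

```c
#include <stdbool.h>

struct outcome {
	bool early_return;	/* the cached next_freq was returned */
	bool need_freq_update;	/* flag value after the check */
};

/* Control flow as posted in the v2 patch. */
static struct outcome patch_form(bool same_freq, bool need, bool drv_flag)
{
	struct outcome o = { false, need };

	if (same_freq && !need) {
		o.early_return = true;
		return o;
	}
	if (need)
		o.need_freq_update = drv_flag;
	return o;
}

/* Control flow as suggested in the review comment. */
static struct outcome suggested_form(bool same_freq, bool need, bool drv_flag)
{
	struct outcome o = { false, need };

	if (need)
		o.need_freq_update = drv_flag;
	else if (same_freq)
		o.early_return = true;
	return o;
}

/* Compares both forms over all 8 input combinations. */
static int count_mismatches(void)
{
	int mismatches = 0;

	for (int i = 0; i < 8; i++) {
		bool same = i & 1, need = i & 2, flag = i & 4;
		struct outcome a = patch_form(same, need, flag);
		struct outcome b = suggested_form(same, need, flag);

		if (a.early_return != b.early_return ||
		    a.need_freq_update != b.need_freq_update)
			mismatches++;
	}
	return mismatches;
}
```

Under these modeling assumptions the two forms agree on every input, so the suggestion is a readability change only.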

-- 
viresh
Zhang Rui Oct. 27, 2020, 8:47 a.m. UTC | #2
On Fri, 2020-10-23 at 17:36 +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Because sugov_update_next_freq() may skip a frequency update even if
> the need_freq_update flag has been set for the policy at hand, policy
> limits updates may not take effect as expected.
> 
> For example, if the intel_pstate driver operates in the passive mode
> with HWP enabled, it needs to update the HWP min and max limits when
> the policy min and max limits change, respectively, but that may not
> happen if the target frequency does not change along with the limit
> at hand.  In particular, if the policy min is changed first, causing
> the target frequency to be adjusted to it, and the policy max limit
> is changed later to the same value, the HWP max limit will not be
> updated to follow it as expected, because the target frequency is
> still equal to the policy min limit and it will not change until
> that limit is updated.
> 
> To address this issue, modify get_next_freq() to clear
> need_freq_update only if the CPUFREQ_NEED_UPDATE_LIMITS flag is
> not set for the cpufreq driver in use (and it should be set for all
> potentially affected drivers) and make sugov_update_next_freq()
> check need_freq_update and continue when it is set regardless of
> whether or not the new target frequency is equal to the old one.
> 
> Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
> Reported-by: Zhang Rui <rui.zhang@intel.com>
> Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

I have confirmed that the problem is gone with this patch series
applied.

Tested-by: Zhang Rui <rui.zhang@intel.com>

thanks,
rui

> ---
> 
> New patch in v2.
> 
> ---
>  kernel/sched/cpufreq_schedutil.c |    8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> Index: linux-pm/kernel/sched/cpufreq_schedutil.c
> ===================================================================
> --- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
> +++ linux-pm/kernel/sched/cpufreq_schedutil.c
> @@ -102,11 +102,12 @@ static bool sugov_should_update_freq(str
>  static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
>  				   unsigned int next_freq)
>  {
> -	if (sg_policy->next_freq == next_freq)
> +	if (sg_policy->next_freq == next_freq && !sg_policy->need_freq_update)
>  		return false;
>  
>  	sg_policy->next_freq = next_freq;
>  	sg_policy->last_freq_update_time = time;
> +	sg_policy->need_freq_update = false;
>  
>  	return true;
>  }
> @@ -164,7 +165,10 @@ static unsigned int get_next_freq(struct
>  	if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
>  		return sg_policy->next_freq;
>  
> -	sg_policy->need_freq_update = false;
> +	if (sg_policy->need_freq_update)
> +		sg_policy->need_freq_update =
> +			cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
> +
>  	sg_policy->cached_raw_freq = freq;
>  	return cpufreq_driver_resolve_freq(policy, freq);
>  }
> 
> 
>
Rafael J. Wysocki Oct. 27, 2020, 1:14 p.m. UTC | #3
On Tue, Oct 27, 2020 at 5:26 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> Spelling mistake in $subject (driver)
>
> On 23-10-20, 17:36, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Because sugov_update_next_freq() may skip a frequency update even if
> > the need_freq_update flag has been set for the policy at hand, policy
> > limits updates may not take effect as expected.
> >
> > For example, if the intel_pstate driver operates in the passive mode
> > with HWP enabled, it needs to update the HWP min and max limits when
> > the policy min and max limits change, respectively, but that may not
> > happen if the target frequency does not change along with the limit
> > at hand.  In particular, if the policy min is changed first, causing
> > the target frequency to be adjusted to it, and the policy max limit
> > is changed later to the same value, the HWP max limit will not be
> > updated to follow it as expected, because the target frequency is
> > still equal to the policy min limit and it will not change until
> > that limit is updated.
> >
> > To address this issue, modify get_next_freq() to clear
> > need_freq_update only if the CPUFREQ_NEED_UPDATE_LIMITS flag is
> > not set for the cpufreq driver in use (and it should be set for all
> > potentially affected drivers) and make sugov_update_next_freq()
> > check need_freq_update and continue when it is set regardless of
> > whether or not the new target frequency is equal to the old one.
> >
> > Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
> > Reported-by: Zhang Rui <rui.zhang@intel.com>
> > Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >
> > New patch in v2.
> >
> > ---
> >  kernel/sched/cpufreq_schedutil.c |    8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > Index: linux-pm/kernel/sched/cpufreq_schedutil.c
> > ===================================================================
> > --- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
> > +++ linux-pm/kernel/sched/cpufreq_schedutil.c
> > @@ -102,11 +102,12 @@ static bool sugov_should_update_freq(str
> >  static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
> >                                  unsigned int next_freq)
> >  {
> > -     if (sg_policy->next_freq == next_freq)
> > +     if (sg_policy->next_freq == next_freq && !sg_policy->need_freq_update)
> >               return false;
> >
> >       sg_policy->next_freq = next_freq;
> >       sg_policy->last_freq_update_time = time;
> > +     sg_policy->need_freq_update = false;
> >
> >       return true;
> >  }
> > @@ -164,7 +165,10 @@ static unsigned int get_next_freq(struct
> >       if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
> >               return sg_policy->next_freq;
> >
> > -     sg_policy->need_freq_update = false;
> > +     if (sg_policy->need_freq_update)
> > +             sg_policy->need_freq_update =
> > +                     cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
> > +
>
> The behavior here is a bit different from what we did in cpufreq.c. In cpufreq
> core we are _always_ allowing the call to reach the driver's target() routine,
> but here we do it only if limits have changed. Wonder if we should have similar
> behavior here as well ?

I didn't think about that, but now that you mentioned it, I think that
this is a good idea.

Will send an updated patch with that implemented shortly.

> Over that the code here can be rewritten a bit like:
>
>         if (sg_policy->need_freq_update)
>                 sg_policy->need_freq_update = cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
>         else if (freq == sg_policy->cached_raw_freq)
>                 return sg_policy->next_freq;

Right, but it will be somewhat different anyway. :-)
Viresh Kumar Oct. 29, 2020, 11:23 a.m. UTC | #4
On 29-10-20, 12:12, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Because sugov_update_next_freq() may skip a frequency update even if
> the need_freq_update flag has been set for the policy at hand, policy
> limits updates may not take effect as expected.
> 
> For example, if the intel_pstate driver operates in the passive mode
> with HWP enabled, it needs to update the HWP min and max limits when
> the policy min and max limits change, respectively, but that may not
> happen if the target frequency does not change along with the limit
> at hand.  In particular, if the policy min is changed first, causing
> the target frequency to be adjusted to it, and the policy max limit
> is changed later to the same value, the HWP max limit will not be
> updated to follow it as expected, because the target frequency is
> still equal to the policy min limit and it will not change until
> that limit is updated.
> 
> To address this issue, modify get_next_freq() to let the driver
> callback run if the CPUFREQ_NEED_UPDATE_LIMITS cpufreq driver flag
> is set regardless of whether or not the new frequency to set is
> equal to the previous one.
> 
> Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
> Reported-by: Zhang Rui <rui.zhang@intel.com>
> Tested-by: Zhang Rui <rui.zhang@intel.com>
> Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> 
> v2.1 -> v2.2:
>    * Instead of updating need_freq_update if CPUFREQ_NEED_UPDATE_LIMITS is set
>      in get_next_freq() and checking it again in sugov_update_next_freq(),
>      check CPUFREQ_NEED_UPDATE_LIMITS directly in sugov_update_next_freq().
>    * Update the subject.
> 
> v2 -> v2.1:
>    * Fix typo in the subject.
>    * Make get_next_freq() and sugov_update_next_freq() ignore the
>      sg_policy->next_freq == next_freq case when CPUFREQ_NEED_UPDATE_LIMITS
>      is set for the driver.
>    * Add Tested-by from Rui (this version lets the driver callback run more
>      often than the v2, so the behavior in the Rui's case doesn't change).
> 
> ---
>  kernel/sched/cpufreq_schedutil.c |    6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> Index: linux-pm/kernel/sched/cpufreq_schedutil.c
> ===================================================================
> --- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
> +++ linux-pm/kernel/sched/cpufreq_schedutil.c
> @@ -102,7 +102,8 @@ static bool sugov_should_update_freq(str
>  static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
>  				   unsigned int next_freq)
>  {
> -	if (sg_policy->next_freq == next_freq)
> +	if (sg_policy->next_freq == next_freq &&
> +	    !cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS))
>  		return false;

Since sg_policy->next_freq is used elsewhere as well, this is the best
we can do here.

>  	sg_policy->next_freq = next_freq;
> @@ -161,7 +162,8 @@ static unsigned int get_next_freq(struct
>  
>  	freq = map_util_freq(util, freq, max);
>  
> -	if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
> +	if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update &&
> +	    !cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS))
>  		return sg_policy->next_freq;
>  
>  	sg_policy->need_freq_update = false;

But I was wondering if instead of this we just do this here:

        if (!cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS))
                sg_policy->cached_raw_freq = freq;

And so the above check will always fail.

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>


-- 
viresh
Rafael J. Wysocki Oct. 29, 2020, 11:29 a.m. UTC | #5
On Thu, Oct 29, 2020 at 12:23 PM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 29-10-20, 12:12, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Because sugov_update_next_freq() may skip a frequency update even if
> > the need_freq_update flag has been set for the policy at hand, policy
> > limits updates may not take effect as expected.
> >
> > For example, if the intel_pstate driver operates in the passive mode
> > with HWP enabled, it needs to update the HWP min and max limits when
> > the policy min and max limits change, respectively, but that may not
> > happen if the target frequency does not change along with the limit
> > at hand.  In particular, if the policy min is changed first, causing
> > the target frequency to be adjusted to it, and the policy max limit
> > is changed later to the same value, the HWP max limit will not be
> > updated to follow it as expected, because the target frequency is
> > still equal to the policy min limit and it will not change until
> > that limit is updated.
> >
> > To address this issue, modify get_next_freq() to let the driver
> > callback run if the CPUFREQ_NEED_UPDATE_LIMITS cpufreq driver flag
> > is set regardless of whether or not the new frequency to set is
> > equal to the previous one.
> >
> > Fixes: f6ebbcf08f37 ("cpufreq: intel_pstate: Implement passive mode with HWP enabled")
> > Reported-by: Zhang Rui <rui.zhang@intel.com>
> > Tested-by: Zhang Rui <rui.zhang@intel.com>
> > Cc: 5.9+ <stable@vger.kernel.org> # 5.9+
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >
> > v2.1 -> v2.2:
> >    * Instead of updating need_freq_update if CPUFREQ_NEED_UPDATE_LIMITS is set
> >      in get_next_freq() and checking it again in sugov_update_next_freq(),
> >      check CPUFREQ_NEED_UPDATE_LIMITS directly in sugov_update_next_freq().
> >    * Update the subject.
> >
> > v2 -> v2.1:
> >    * Fix typo in the subject.
> >    * Make get_next_freq() and sugov_update_next_freq() ignore the
> >      sg_policy->next_freq == next_freq case when CPUFREQ_NEED_UPDATE_LIMITS
> >      is set for the driver.
> >    * Add Tested-by from Rui (this version lets the driver callback run more
> >      often than the v2, so the behavior in the Rui's case doesn't change).
> >
> > ---
> >  kernel/sched/cpufreq_schedutil.c |    6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > Index: linux-pm/kernel/sched/cpufreq_schedutil.c
> > ===================================================================
> > --- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
> > +++ linux-pm/kernel/sched/cpufreq_schedutil.c
> > @@ -102,7 +102,8 @@ static bool sugov_should_update_freq(str
> >  static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
> >                                  unsigned int next_freq)
> >  {
> > -     if (sg_policy->next_freq == next_freq)
> > +     if (sg_policy->next_freq == next_freq &&
> > +         !cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS))
> >               return false;
>
> Since sg_policy->next_freq is used elsewhere as well, this is the best
> we can do here.
>
> >       sg_policy->next_freq = next_freq;
> > @@ -161,7 +162,8 @@ static unsigned int get_next_freq(struct
> >
> >       freq = map_util_freq(util, freq, max);
> >
> > -     if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
> > +     if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update &&
> > +         !cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS))
> >               return sg_policy->next_freq;
> >
> >       sg_policy->need_freq_update = false;
>
> But I was wondering if instead of this we just do this here:
>
>         if (!cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS))
>                 sg_policy->cached_raw_freq = freq;
>
> And so the above check will always fail.

I wrote it this way, because I want to avoid looking at the driver
flags at all unless the update is going to be skipped.  Otherwise we
may end up fetching a new cache line here every time even if that is
not needed.
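
The ordering argument relies on the short-circuit evaluation of && in C. The sketch below is a user-space illustration under stated assumptions: flag_lookups, model_test_flags(), v22_takes_shortcut() and alt_update_cache() are hypothetical names, and incrementing a counter stands in for touching the memory that holds the driver flags.

```c
#include <stdbool.h>

/* Counts simulated reads of the driver flags. */
static unsigned int flag_lookups;

/* Hypothetical stand-in for
 * cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS). */
static bool model_test_flags(bool driver_flag)
{
	flag_lookups++;
	return driver_flag;
}

/*
 * Check ordered as in v2.2: the flags are consulted last, so they are
 * read only when the cheaper conditions already say "skip the update".
 * Returns true if the cached next_freq would be returned.
 */
static bool v22_takes_shortcut(unsigned int freq, unsigned int cached_raw_freq,
			       bool need_freq_update, bool driver_flag)
{
	return freq == cached_raw_freq && !need_freq_update &&
	       !model_test_flags(driver_flag);
}

/*
 * Alternative from the review: decide whether to cache the raw frequency.
 * The flags are read on every call that reaches this point.
 */
static void alt_update_cache(unsigned int *cached_raw_freq, unsigned int freq,
			     bool driver_flag)
{
	if (!model_test_flags(driver_flag))
		*cached_raw_freq = freq;
}
```

In this model, a call to v22_takes_shortcut() with a changed frequency or with need_freq_update set leaves flag_lookups untouched, whereas every call to alt_update_cache() increments it.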

> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>


Thanks!

Patch

Index: linux-pm/kernel/sched/cpufreq_schedutil.c
===================================================================
--- linux-pm.orig/kernel/sched/cpufreq_schedutil.c
+++ linux-pm/kernel/sched/cpufreq_schedutil.c
@@ -102,11 +102,12 @@  static bool sugov_should_update_freq(str
 static bool sugov_update_next_freq(struct sugov_policy *sg_policy, u64 time,
 				   unsigned int next_freq)
 {
-	if (sg_policy->next_freq == next_freq)
+	if (sg_policy->next_freq == next_freq && !sg_policy->need_freq_update)
 		return false;
 
 	sg_policy->next_freq = next_freq;
 	sg_policy->last_freq_update_time = time;
+	sg_policy->need_freq_update = false;
 
 	return true;
 }
@@ -164,7 +165,10 @@  static unsigned int get_next_freq(struct
 	if (freq == sg_policy->cached_raw_freq && !sg_policy->need_freq_update)
 		return sg_policy->next_freq;
 
-	sg_policy->need_freq_update = false;
+	if (sg_policy->need_freq_update)
+		sg_policy->need_freq_update =
+			cpufreq_driver_test_flags(CPUFREQ_NEED_UPDATE_LIMITS);
+
 	sg_policy->cached_raw_freq = freq;
 	return cpufreq_driver_resolve_freq(policy, freq);
 }