Message ID | 20230918112937.493352-1-pierre.gondois@arm.com |
---|---|
State | New |
Headers | show |
Series | cpufreq: Rebuild sched-domains when removing cpufreq driver | expand |
On 9/19/23 1:19 PM, Pierre Gondois wrote: > > > On 9/19/23 01:03, Dietmar Eggemann wrote: >> On 18/09/2023 13:29, Pierre Gondois wrote: >>> The Energy Aware Scheduler (EAS) relies on the schedutil governor. >>> When moving to/from the schedutil governor, sched domains must be >>> rebuilt to allow re-evaluating the enablement conditions of EAS. >>> This is done through sched_cpufreq_governor_change(). >>> Hi Pierre. It looks correct to update when removing the cpufreq governer. >>> Having a cpufreq governor assumes having a cpufreq driver running. >>> Inserting/removing a cpufreq driver should trigger a re-evaluation >>> of EAS enablement conditions, avoiding to see EAS enabled when >>> removing a running cpufreq driver. >>> >>> Add a sched_cpufreq_governor_change() call in cpufreq driver removal >>> path. >> >> Rebuilding SDs when inserting the driver is already covered by >> >> cpufreq_online() >> cpufreq_set_policy() >> sched_cpufreq_governor_change() >> if (old or new gov eq. schedutil) >> schedule_work(&rebuild_sd_work) >> >> So what's missing is only a sched_cpufreq_governor_change() call when >> removing the driver, right? > > Yes exact, removing a cpufreq driver (e.g. `rmmod cppc_cpufreq.ko`) goes > through: > cpufreq_remove_dev() > \-__cpufreq_offline() > > so the path you mentioned is not used in this case. > One Doubt, while looking through code. Not well versed with this area. cpuhp_cpufreq_offline is being registered with CPU hotplug. That ends up calling cpufreq_offline. This may cause non desired issues. 1. rebuild of sched domains twice instead, once by CPU hotplug and once by this. 2. offline/online of CPU (non-SMT) may not disabling EAS. > Regards, > Pierre
On 20/09/2023 07:29, Shrikanth Hegde wrote: > > On 9/19/23 1:19 PM, Pierre Gondois wrote: >> >> On 9/19/23 01:03, Dietmar Eggemann wrote: >>> On 18/09/2023 13:29, Pierre Gondois wrote: [...] >>> Rebuilding SDs when inserting the driver is already covered by >>> >>> cpufreq_online() >>> cpufreq_set_policy() >>> sched_cpufreq_governor_change() >>> if (old or new gov eq. schedutil) >>> schedule_work(&rebuild_sd_work) >>> >>> So what's missing is only a sched_cpufreq_governor_change() call when >>> removing the driver, right? >> >> Yes exact, removing a cpufreq driver (e.g. `rmmod cppc_cpufreq.ko`) goes >> through: >> cpufreq_remove_dev() >> \-__cpufreq_offline() >> >> so the path you mentioned is not used in this case. But IMHO the name of sched_cpufreq_governor_change() is now misleading when called from the driver removal path. It's not the governor which has changed. We want to disable EAS when unloading the driver in (3): ... if (!arch_scale_freq_invariant()) { if (sched_debug()) { pr_warn("rd %*pbl: Disabling EAS: frequency-invariant load tracking not yet supported", cpumask_pr_args(cpu_map)); } goto free; } ... > One Doubt, while looking through code. Not well versed with this area. > > cpuhp_cpufreq_offline is being registered with CPU hotplug. That ends up > calling cpufreq_offline. This may cause non desired issues. > 1. rebuild of sched domains twice instead, once by CPU hotplug and once by this. That's possible when you CPU hp out the last CPU of a policy (or Perf Domain (PD). Otherwise `cpuhp_cpufreq_offline -> cpufreq_offline() -> __cpufreq_offline()` returns early in `if (!policy_is_inactive(policy))` condition. partition_sched_domains_locked() does: (1) detach_destroy_domains() (2) build_sched_domains() (3) build_perf_domains() But only the first workqueue event (from cpuset_hotplug_workfn or rebuild_sd_workfn) will rebuild Sched Domains and PDs (1)-(3), the second one will only build PDs (3) again. That's not nice but AFAICS it is not functional incorrect. > 2. offline/online of CPU (non-SMT) may not disabling EAS. Can't see this issue right now. When we offline the last CPU of a PD on a 2 PD system, EAS should be stopped since the root domain does not have any asymmetric CPU capacities left and when we online it again, EAS should be started (sched_energy_set()).
On 9/29/23 05:27, Viresh Kumar wrote: > On 28-09-23, 14:49, Pierre Gondois wrote: >> Another solution would be to call sched_cpufreq_governor_change() >> from cpufreq_schedutil's init()/exit() callbacks. This would make >> more sense as EAS/schedutil cpufreq are tightly bound, and it would >> allow to cover all the possible paths. >> >> When tried locally, it seems to cover all scenarios: >> - insmod/rmmod a cpufreq driver >> - changing the governor policy >> - offlining all the CPUs of a pd > > Right now it is done for all governors. We don't need that ? We just care about > schedutil here ? > Yes exact, EAS requires the schedutil governor to be used by all policies. The purpose of this function is to re-evaluate if EAS enablement conditions are met. I.e., if one policy changed from/to schedutil, if EAS must be [de|re]activated.
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index 60ed89000e82..0a4979c34fd1 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -1673,6 +1673,8 @@ static void __cpufreq_offline(unsigned int cpu, struct cpufreq_policy *policy) cpufreq_driver->exit(policy); policy->freq_table = NULL; } + + sched_cpufreq_governor_change(policy, policy->governor); } static int cpufreq_offline(unsigned int cpu)
The Energy Aware Scheduler (EAS) relies on the schedutil governor. When moving to/from the schedutil governor, sched domains must be rebuilt to allow re-evaluating the enablement conditions of EAS. This is done through sched_cpufreq_governor_change(). Having a cpufreq governor assumes having a cpufreq driver running. Inserting/removing a cpufreq driver should trigger a re-evaluation of EAS enablement conditions, avoiding to see EAS enabled when removing a running cpufreq driver. Add a sched_cpufreq_governor_change() call in cpufreq driver removal path. Signed-off-by: Pierre Gondois <pierre.gondois@arm.com> --- drivers/cpufreq/cpufreq.c | 2 ++ 1 file changed, 2 insertions(+)