Message ID | 1559571436-29091-1-git-send-email-vincent.guittot@linaro.org |
---|---|
State | Superseded |
Headers | show |
Series | sched/fair: clean up asym packing | expand |
Hi, On 03/06/2019 15:17, Vincent Guittot wrote: > Clean up asym packing to follow the default load balance behavior: > - classify the group by creating a group_asym_capacity field. Being nitpicky here, this doesn't classify the group in the usual way - it doesn't get a specific group_type value (group_classify()). So maybe "classify" isn't the best term here. Also, why tag this group in update_sd_pick_busiest()? It would make more sense to do so in update_sg_lb_stats() like with the other sg_lb_stats fields: diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 93c24473c8a0..537710026c3a 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -8298,6 +8298,10 @@ static inline void update_sg_lb_stats(struct lb_env *env, } } + if (sgs->sum_nr_running && + sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu)) + sgs->group_asym_capacity = 1; + /* Adjust by relative CPU capacity of the group */ sgs->group_capacity = group->sgc->capacity; sgs->avg_load = (sgs->group_load*SCHED_CAPACITY_SCALE) / sgs->group_capacity; @@ -8391,9 +8395,7 @@ static bool update_sd_pick_busiest(struct lb_env *env, * perform better since they share less core resources. Hence when we * have idle threads, we want them to be the higher ones. */ - if (sgs->sum_nr_running && - sched_asym_prefer(env->dst_cpu, sg->asym_prefer_cpu)) { - sgs->group_asym_capacity = 1; + if (sgs->group_asym_capacity) { if (!sds->busiest) return true;
On Mon, 3 Jun 2019 at 20:15, Valentin Schneider <valentin.schneider@arm.com> wrote: > > Hi, > > On 03/06/2019 15:17, Vincent Guittot wrote: > > Clean up asym packing to follow the default load balance behavior: > > - classify the group by creating a group_asym_capacity field. > > Being nitpicky here, this doesn't classify the group in the usual way > - it doesn't get a specific group_type value (group_classify()). So maybe > "classify" isn't the best term here. My original goal was to add a group type to classify the group but this would have broken the current behavior whereas I only want to move code > > Also, why tag this group in update_sd_pick_busiest()? It would make more > sense to do so in update_sg_lb_stats() like with the other sg_lb_stats fields: With your proposal below, the test is called for every groups' statistic update whereas it is only done lastly after checking other rules in the current code and I don't want to modify the current behavior but only move code to set imbalance in calculate imbalance. A bigger cleanup will come in next steps > > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c > index 93c24473c8a0..537710026c3a 100644 > --- a/kernel/sched/fair.c > +++ b/kernel/sched/fair.c > @@ -8298,6 +8298,10 @@ static inline void update_sg_lb_stats(struct lb_env *env, > } > } > > + if (sgs->sum_nr_running && > + sched_asym_prefer(env->dst_cpu, group->asym_prefer_cpu)) > + sgs->group_asym_capacity = 1; > + > /* Adjust by relative CPU capacity of the group */ > sgs->group_capacity = group->sgc->capacity; > sgs->avg_load = (sgs->group_load*SCHED_CAPACITY_SCALE) / sgs->group_capacity; > @@ -8391,9 +8395,7 @@ static bool update_sd_pick_busiest(struct lb_env *env, > * perform better since they share less core resources. Hence when we > * have idle threads, we want them to be the higher ones. > */ > - if (sgs->sum_nr_running && > - sched_asym_prefer(env->dst_cpu, sg->asym_prefer_cpu)) { > - sgs->group_asym_capacity = 1; > + if (sgs->group_asym_capacity) { > if (!sds->busiest) > return true; >
On 03/06/2019 19:32, Vincent Guittot wrote: > On Mon, 3 Jun 2019 at 20:15, Valentin Schneider [...] > My original goal was to add a group type to classify the group but > this would have broken the current behavior whereas I only want to > move code > >> >> Also, why tag this group in update_sd_pick_busiest()? It would make more >> sense to do so in update_sg_lb_stats() like with the other sg_lb_stats fields: > > With your proposal below, the test is called for every groups' > statistic update whereas it is only done lastly after checking other > rules in the current code and I don't want to modify the current > behavior but only move code to set imbalance in calculate imbalance. > Adding a new group_type would make sense. From a behavioral point of view your change is fine, but from a logical one it sits halfway between being a new stat and being a new group_type. I'd rather see a new group_type, though as you said that's a different topic than cleaning up duplicate operations. > A bigger cleanup will come in next steps > [...]
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index f35930f..93c2447 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -7888,6 +7888,7 @@ struct sg_lb_stats { unsigned int group_weight; enum group_type group_type; int group_no_capacity; + int group_asym_capacity; unsigned long group_misfit_task_load; /* A CPU has a task too big for its capacity */ #ifdef CONFIG_NUMA_BALANCING unsigned int nr_numa_running; @@ -8382,9 +8383,17 @@ static bool update_sd_pick_busiest(struct lb_env *env, * ASYM_PACKING needs to move all the work to the highest * prority CPUs in the group, therefore mark all groups * of lower priority than ourself as busy. + * + * This is primarily intended to used at the sibling level. Some + * cores like POWER7 prefer to use lower numbered SMT threads. In the + * case of POWER7, it can move to lower SMT modes only when higher + * threads are idle. When in lower SMT modes, the threads will + * perform better since they share less core resources. Hence when we + * have idle threads, we want them to be the higher ones. */ if (sgs->sum_nr_running && sched_asym_prefer(env->dst_cpu, sg->asym_prefer_cpu)) { + sgs->group_asym_capacity = 1; if (!sds->busiest) return true; @@ -8522,51 +8531,6 @@ static inline void update_sd_lb_stats(struct lb_env *env, struct sd_lb_stats *sd } /** - * check_asym_packing - Check to see if the group is packed into the - * sched domain. - * - * This is primarily intended to used at the sibling level. Some - * cores like POWER7 prefer to use lower numbered SMT threads. In the - * case of POWER7, it can move to lower SMT modes only when higher - * threads are idle. When in lower SMT modes, the threads will - * perform better since they share less core resources. Hence when we - * have idle threads, we want them to be the higher ones. - * - * This packing function is run on idle threads. It checks to see if - * the busiest CPU in this domain (core in the P7 case) has a higher - * CPU number than the packing function is being run on. Here we are - * assuming lower CPU number will be equivalent to lower a SMT thread - * number. - * - * Return: 1 when packing is required and a task should be moved to - * this CPU. The amount of the imbalance is returned in env->imbalance. - * - * @env: The load balancing environment. - * @sds: Statistics of the sched_domain which is to be packed - */ -static int check_asym_packing(struct lb_env *env, struct sd_lb_stats *sds) -{ - int busiest_cpu; - - if (!(env->sd->flags & SD_ASYM_PACKING)) - return 0; - - if (env->idle == CPU_NOT_IDLE) - return 0; - - if (!sds->busiest) - return 0; - - busiest_cpu = sds->busiest->asym_prefer_cpu; - if (sched_asym_prefer(busiest_cpu, env->dst_cpu)) - return 0; - - env->imbalance = sds->busiest_stat.group_load; - - return 1; -} - -/** * fix_small_imbalance - Calculate the minor imbalance that exists * amongst the groups of a sched_domain, during * load balancing. @@ -8650,6 +8614,11 @@ static inline void calculate_imbalance(struct lb_env *env, struct sd_lb_stats *s local = &sds->local_stat; busiest = &sds->busiest_stat; + if (busiest->group_asym_capacity) { + env->imbalance = busiest->group_load; + return; + } + if (busiest->group_type == group_imbalanced) { /* * In the group_imb case we cannot rely on group-wide averages @@ -8754,8 +8723,8 @@ static struct sched_group *find_busiest_group(struct lb_env *env) busiest = &sds.busiest_stat; /* ASYM feature bypasses nice load balance check */ - if (check_asym_packing(env, &sds)) - return sds.busiest; + if (busiest->group_asym_capacity) + goto force_balance; /* There is no busy sibling group to pull tasks from */ if (!sds.busiest || busiest->sum_nr_running == 0)
Clean up asym packing to follow the default load balance behavior: - classify the group by creating a group_asym_capacity field. - calculate the imbalance in calculate_imbalance() instead of bypassing it. We don't need to test twice same conditions anymore to detect asym packing and we consolidate the calculation of imbalance in calculate_imbalance(). There is no functional changes. Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> --- kernel/sched/fair.c | 63 ++++++++++++++--------------------------------------- 1 file changed, 16 insertions(+), 47 deletions(-) -- 2.7.4