Message ID | 1466615004-3503-6-git-send-email-morten.rasmussen@arm.com |
---|---|
State | New |
On Mon, Jul 11, 2016 at 12:04:49PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 22, 2016 at 06:03:16PM +0100, Morten Rasmussen wrote:
> > Systems with the SD_ASYM_CPUCAPACITY flag set indicate that sched_groups
> > at this level or below do not include cpus of all capacities available
> > (e.g. group containing little-only or big-only cpus in big.LITTLE
> > systems). It is therefore necessary to put in more effort in finding an
> > appropriate cpu at task wake-up by enabling balancing at wake-up
> > (SD_BALANCE_WAKE).
>
> > --- a/kernel/sched/core.c
> > +++ b/kernel/sched/core.c
> > @@ -6397,6 +6397,9 @@ sd_init(struct sched_domain_topology_level *tl, int cpu)
> >  	 * Convert topological properties into behaviour.
> >  	 */
> >
> > +	if (sd->flags & SD_ASYM_CPUCAPACITY)
> > +		sd->flags |= SD_BALANCE_WAKE;
> > +
>
> So I'm a bit confused on the exact requirements for this; as also per
> the previous patch.
>
> Should all sched domains get BALANCE_WAKE if one (typically the top)
> domain has ASYM_CAP set?
>
> The previous patch set it on the actual asym one and one below that, but
> what if there's more levels below that? Imagine ARM gaining SMT or
> somesuch. Should not then that level also get BALANCE_WAKE in order to
> 'correctly' place light/heavy tasks?
>
> IOW, are you trying to fudge the behaviour semantics by creating 'weird'
> ASYM_CAP rules instead of having a more complex behaviour rule here?

That is one possible way of describing it :-)

The proposed semantic is to set ASYM_CAP at all levels, starting from the
bottom, up until you have sched_groups containing all types of cpus
available in the system, or you reach the top level.

The fundamental reason for these weird semantics is that we somehow need
to know at the lower levels, which may be capacity symmetric, whether we
need to consider balancing at a higher level to see the asymmetry or not.

If the flag isn't set bottom up, we need some other way of knowing if the
system is asymmetric, or we would have to go look for the flag further up
the sched_domain hierarchy each time.

I'm not saying this is the perfect solution; I'm happy to discuss
alternatives.

The example in the previous patch has the flag set on both levels, as we
have two clusters of different cpus and therefore have to go to the top
to 'see' all the types of cpus we have in the system. If you add SMT, you
would add a third level at the bottom with ASYM_CAP set as well, as you
still have to balance at the top level to have the full range of choice
of cpu type.

Should someone build a system with multiple big.LITTLE cluster pairs and
essentially add another sched_domain level on top, then that level should
_not_ have the ASYM_CAP flag set. The sched_groups at this level would
span both big and little cpus of the cluster pair, so there is little
reason to expand the search scope at wake-up further.

I hope that makes sense.
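To make the bottom-up rule concrete, here is a purely illustrative sketch of a topology table for a two-cluster big.LITTLE system under the proposed semantics. The table and flag-function names are hypothetical and not part of the patch set:

/*
 * Hypothetical example only -- not from the patch set. On a two-cluster
 * big.LITTLE system the sched_groups at both MC (cores within one
 * cluster) and DIE (one group per cluster) level contain only big or
 * only little cpus, so under the proposed bottom-up semantics both
 * levels carry SD_ASYM_CPUCAPACITY.
 */
static inline int cpu_bl_flags(void)
{
	return SD_ASYM_CPUCAPACITY;
}

static struct sched_domain_topology_level bl_topology[] = {
	{ cpu_coregroup_mask, cpu_bl_flags, SD_INIT_NAME(MC) },
	{ cpu_cpu_mask, cpu_bl_flags, SD_INIT_NAME(DIE) },
	{ NULL, },
};

A wake-up at any level can then tell from its own sd->flags that a wider search may be worthwhile, without having to walk up the hierarchy looking for the flag.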
On Mon, Jul 11, 2016 at 11:37:18AM +0100, Morten Rasmussen wrote:
> On Mon, Jul 11, 2016 at 12:04:49PM +0200, Peter Zijlstra wrote:
> > On Wed, Jun 22, 2016 at 06:03:16PM +0100, Morten Rasmussen wrote:
> > > Systems with the SD_ASYM_CPUCAPACITY flag set indicate that sched_groups
> > > at this level or below do not include cpus of all capacities available
> > > (e.g. group containing little-only or big-only cpus in big.LITTLE
> > > systems). It is therefore necessary to put in more effort in finding an
> > > appropriate cpu at task wake-up by enabling balancing at wake-up
> > > (SD_BALANCE_WAKE).
> >
> > > --- a/kernel/sched/core.c
> > > +++ b/kernel/sched/core.c
> > > @@ -6397,6 +6397,9 @@ sd_init(struct sched_domain_topology_level *tl, int cpu)
> > >  	 * Convert topological properties into behaviour.
> > >  	 */
> > >
> > > +	if (sd->flags & SD_ASYM_CPUCAPACITY)
> > > +		sd->flags |= SD_BALANCE_WAKE;
> > > +
> >
> > So I'm a bit confused on the exact requirements for this; as also per
> > the previous patch.
> >
> > Should all sched domains get BALANCE_WAKE if one (typically the top)
> > domain has ASYM_CAP set?
> >
> > The previous patch set it on the actual asym one and one below that, but
> > what if there's more levels below that? Imagine ARM gaining SMT or
> > somesuch. Should not then that level also get BALANCE_WAKE in order to
> > 'correctly' place light/heavy tasks?
> >
> > IOW, are you trying to fudge the behaviour semantics by creating 'weird'
> > ASYM_CAP rules instead of having a more complex behaviour rule here?
>
> That is one possible way of describing it :-)
>
> The proposed semantic is to set ASYM_CAP at all levels, starting from the
> bottom, up until you have sched_groups containing all types of cpus
> available in the system, or you reach the top level.
>
> The fundamental reason for these weird semantics is that we somehow need
> to know at the lower levels, which may be capacity symmetric, whether we
> need to consider balancing at a higher level to see the asymmetry or not.
>
> If the flag isn't set bottom up, we need some other way of knowing if the
> system is asymmetric, or we would have to go look for the flag further up
> the sched_domain hierarchy each time.
>
> I'm not saying this is the perfect solution; I'm happy to discuss
> alternatives.

One alternative to setting ASYM_CAP bottom up would be to set it only
where the asymmetry can be observed, and instead come up with a more
complicated way of setting BALANCE_WAKE bottom up until and including
the first level having ASYM_CAP set.

I looked at it briefly and realized that I couldn't find a clean way of
implementing it, as I don't think we have visibility of which flags will
be set at higher levels in the sched_domain hierarchy when the lower
levels are initialized. IOW, we would have behavioural flag settings
depend on topology flag settings at a different level.
On Mon, Jul 11, 2016 at 01:24:04PM +0200, Peter Zijlstra wrote:
> On Mon, Jul 11, 2016 at 12:04:58PM +0100, Morten Rasmussen wrote:
>
> > One alternative to setting ASYM_CAP bottom up would be to set it only
> > where the asymmetry can be observed, and instead come up with a more
> > complicated way of setting BALANCE_WAKE bottom up until and including
> > the first level having ASYM_CAP set.
>
> Right, that is what I was thinking.
>
> > I looked at it briefly and realized that I couldn't find a clean way of
> > implementing it, as I don't think we have visibility of which flags will
> > be set at higher levels in the sched_domain hierarchy when the lower
> > levels are initialized. IOW, we would have behavioural flag settings
> > depend on topology flag settings at a different level.
>
> Looks doable if we pass @child into sd_init() in build_sched_domain().
> Then we could simply do:
>
> 	*sd = (struct sched_domain){
> 		/* ... */
> 		.child = child,
> 	};
>
> 	if (sd->flags & ASYM_CAP) {
> 		struct sched_domain *t = sd;
> 		while (t) {
> 			t->sd_flags |= BALANCE_WAKE;
> 			t = t->child;
> 		}
> 	}
>
> Or something like that.

It appears to be working fine. I will roll it into v3 along with the
simpler and more sane ASYM_CAP semantics :)
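For reference, a minimal sketch of what folding that suggestion into sd_init() might look like (illustrative only, not the actual v3 code). Note the struct member is sd->flags rather than sd_flags, and kernel/sched/sched.h already provides for_each_lower_domain() for walking the child chain:

static struct sched_domain *
sd_init(struct sched_domain_topology_level *tl,
	struct sched_domain *child, int cpu)
{
	struct sched_domain *sd = *per_cpu_ptr(tl->data.sd, cpu);

	*sd = (struct sched_domain){
		/* ...existing field initialisation elided... */
		.child			= child,
	};

	/*
	 * Convert topological properties into behaviour.
	 */
	if (sd->flags & SD_ASYM_CPUCAPACITY) {
		struct sched_domain *t = sd;

		/* Enable wake-up balancing at this level and every level below. */
		for_each_lower_domain(t)
			t->flags |= SD_BALANCE_WAKE;
	}

	/* ...remaining flag conversion elided... */

	return sd;
}

build_sched_domain() already has @child available when it calls sd_init(), so the extra plumbing would be limited to passing it through.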
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 351609279341..fe39118ffdfb 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6397,6 +6397,9 @@ sd_init(struct sched_domain_topology_level *tl, int cpu)
 	 * Convert topological properties into behaviour.
 	 */
 
+	if (sd->flags & SD_ASYM_CPUCAPACITY)
+		sd->flags |= SD_BALANCE_WAKE;
+
 	if (sd->flags & SD_SHARE_CPUCAPACITY) {
 		sd->flags |= SD_PREFER_SIBLING;
 		sd->imbalance_pct = 110;
Systems with the SD_ASYM_CPUCAPACITY flag set indicate that sched_groups
at this level or below do not include cpus of all capacities available
(e.g. group containing little-only or big-only cpus in big.LITTLE
systems). It is therefore necessary to put in more effort in finding an
appropriate cpu at task wake-up by enabling balancing at wake-up
(SD_BALANCE_WAKE).

cc: Ingo Molnar <mingo@redhat.com>
cc: Peter Zijlstra <peterz@infradead.org>

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
---
 kernel/sched/core.c | 3 +++
 1 file changed, 3 insertions(+)

--
1.9.1
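For context on where the new behaviour flag takes effect: on task wake-up, select_task_rq_fair() is called with sd_flag == SD_BALANCE_WAKE, and only domains with that flag set are considered for the slow find_idlest_group() scan. The excerpt below is a simplified sketch based on the 4.7-era domain walk, not part of this patch:

	/* Simplified sketch of the wake-up domain walk (details elided). */
	for_each_domain(cpu, tmp) {
		if (!(tmp->flags & SD_LOAD_BALANCE))
			break;

		/* On wake-up, sd_flag == SD_BALANCE_WAKE. */
		if (tmp->flags & sd_flag)
			sd = tmp;	/* widest level allowing wake balancing */
		else if (!want_affine)
			break;
	}

If no domain has SD_BALANCE_WAKE set, sd stays NULL and the wake-up falls back to select_idle_sibling() around the previous cpu, which is why the flag matters on capacity-asymmetric systems.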