Message ID | 1464001138-25063-7-git-send-email-morten.rasmussen@arm.com |
---|---|
State | New |
Headers | show |
On 23 May 2016 at 12:58, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > If the system has cpu of different compute capacities (e.g. big.LITTLE) > let affine wakeups be constrained to cpus of the same type. Can you explain why you don't want wake affine with cpus with different compute capacity ? > > cc: Ingo Molnar <mingo@redhat.com> > cc: Peter Zijlstra <peterz@infradead.org> > > Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com> > --- > kernel/sched/core.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index d9619a3..558ec4a 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -6410,6 +6410,9 @@ sd_init(struct sched_domain_topology_level *tl, int cpu) > sd->idle_idx = 1; > } > > + if (sd->flags & SD_ASYM_CPUCAPACITY) > + sd->flags &= ~SD_WAKE_AFFINE; > + > sd->private = &tl->data; > > return sd; > -- > 1.9.1 >
On Tue, May 24, 2016 at 11:10:28AM +0200, Vincent Guittot wrote: > On 23 May 2016 at 12:58, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > > If the system has cpu of different compute capacities (e.g. big.LITTLE) > > let affine wakeups be constrained to cpus of the same type. > > Can you explain why you don't want wake affine with cpus with > different compute capacity ? I should have made the overall idea a bit more clear. The idea is to deal with cross-capacity migrations in the find_idlest_{group, cpu}{} path so we don't have to touch select_idle_sibling(). select_idle_sibling() is critical for wake-up latency, and I'm assumed that people wouldn't like adding extra overhead in there to deal with capacity and utilization. So the overall idea is that symmetric capacity systems, everything should work as normal. For asymmetric capacity systems, we restrict select_idle_sibling() to only look among same-capacity cpus and then use wake_cap() to use find_idlest_{group, cpu}() to look wider if we think should look for cpu with higher capacity than the previous one. So, for assymmetric cpus we take one of the two routes depending on whether a cpu of the same capacity as the previous one is okay. Do that make any sense? > > > > > cc: Ingo Molnar <mingo@redhat.com> > > cc: Peter Zijlstra <peterz@infradead.org> > > > > Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com> > > --- > > kernel/sched/core.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > > index d9619a3..558ec4a 100644 > > --- a/kernel/sched/core.c > > +++ b/kernel/sched/core.c > > @@ -6410,6 +6410,9 @@ sd_init(struct sched_domain_topology_level *tl, int cpu) > > sd->idle_idx = 1; > > } > > > > + if (sd->flags & SD_ASYM_CPUCAPACITY) > > + sd->flags &= ~SD_WAKE_AFFINE; > > + > > sd->private = &tl->data; > > > > return sd; > > -- > > 1.9.1 > >
On 24 May 2016 at 12:29, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > On Tue, May 24, 2016 at 11:10:28AM +0200, Vincent Guittot wrote: >> On 23 May 2016 at 12:58, Morten Rasmussen <morten.rasmussen@arm.com> wrote: >> > If the system has cpu of different compute capacities (e.g. big.LITTLE) >> > let affine wakeups be constrained to cpus of the same type. >> >> Can you explain why you don't want wake affine with cpus with >> different compute capacity ? > > I should have made the overall idea a bit more clear. The idea is to > deal with cross-capacity migrations in the find_idlest_{group, cpu}{} > path so we don't have to touch select_idle_sibling(). > select_idle_sibling() is critical for wake-up latency, and I'm assumed > that people wouldn't like adding extra overhead in there to deal with > capacity and utilization. So this means that we will never use the quick path of select_idle_sibling for cross capacity migration but always the one with extra overhead? Patch 9 adds more tests for enabling wake_affine path. Can't it also be used for cross capacity migration ? so we can use wake_affine if the task or the cpus (even with different capacity) doesn't need this extra overhead > > So the overall idea is that symmetric capacity systems, everything > should work as normal. For asymmetric capacity systems, we restrict > select_idle_sibling() to only look among same-capacity cpus and then use > wake_cap() to use find_idlest_{group, cpu}() to look wider if we think > should look for cpu with higher capacity than the previous one. So, for > assymmetric cpus we take one of the two routes depending on whether a > cpu of the same capacity as the previous one is okay. > > Do that make any sense? > >> >> > >> > cc: Ingo Molnar <mingo@redhat.com> >> > cc: Peter Zijlstra <peterz@infradead.org> >> > >> > Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com> >> > --- >> > kernel/sched/core.c | 3 +++ >> > 1 file changed, 3 insertions(+) >> > >> > diff --git a/kernel/sched/core.c b/kernel/sched/core.c >> > index d9619a3..558ec4a 100644 >> > --- a/kernel/sched/core.c >> > +++ b/kernel/sched/core.c >> > @@ -6410,6 +6410,9 @@ sd_init(struct sched_domain_topology_level *tl, int cpu) >> > sd->idle_idx = 1; >> > } >> > >> > + if (sd->flags & SD_ASYM_CPUCAPACITY) >> > + sd->flags &= ~SD_WAKE_AFFINE; >> > + >> > sd->private = &tl->data; >> > >> > return sd; >> > -- >> > 1.9.1 >> >
On Tue, May 24, 2016 at 02:12:38PM +0200, Vincent Guittot wrote: > On 24 May 2016 at 12:29, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > > On Tue, May 24, 2016 at 11:10:28AM +0200, Vincent Guittot wrote: > >> On 23 May 2016 at 12:58, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > >> > If the system has cpu of different compute capacities (e.g. big.LITTLE) > >> > let affine wakeups be constrained to cpus of the same type. > >> > >> Can you explain why you don't want wake affine with cpus with > >> different compute capacity ? > > > > I should have made the overall idea a bit more clear. The idea is to > > deal with cross-capacity migrations in the find_idlest_{group, cpu}{} > > path so we don't have to touch select_idle_sibling(). > > select_idle_sibling() is critical for wake-up latency, and I'm assumed > > that people wouldn't like adding extra overhead in there to deal with > > capacity and utilization. > > So this means that we will never use the quick path of > select_idle_sibling for cross capacity migration but always the one > with extra overhead? Yes. select_idle_sibling() is only used to choose among equal capacity cpus (capacity_orig). > Patch 9 adds more tests for enabling wake_affine path. Can't it also > be used for cross capacity migration ? so we can use wake_affine if > the task or the cpus (even with different capacity) doesn't need this > extra overhead The test in patch 9 is to determine whether we are happy with the capacity of the previous cpu, or we should go look for one with more capacity. I don't see how we can use select_idle_sibling() unmodified for sched domains containing cpus of different capacity to select an appropriate cpu. It is just picking an idle cpu, it might have high capacity or low, it wouldn't care. How would you avoid the overhead of checking capacity and utilization of the cpus and still pick an appropriate cpu?
On 24 May 2016 at 15:16, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > On Tue, May 24, 2016 at 02:12:38PM +0200, Vincent Guittot wrote: >> On 24 May 2016 at 12:29, Morten Rasmussen <morten.rasmussen@arm.com> wrote: >> > On Tue, May 24, 2016 at 11:10:28AM +0200, Vincent Guittot wrote: >> >> On 23 May 2016 at 12:58, Morten Rasmussen <morten.rasmussen@arm.com> wrote: >> >> > If the system has cpu of different compute capacities (e.g. big.LITTLE) >> >> > let affine wakeups be constrained to cpus of the same type. >> >> >> >> Can you explain why you don't want wake affine with cpus with >> >> different compute capacity ? >> > >> > I should have made the overall idea a bit more clear. The idea is to >> > deal with cross-capacity migrations in the find_idlest_{group, cpu}{} >> > path so we don't have to touch select_idle_sibling(). >> > select_idle_sibling() is critical for wake-up latency, and I'm assumed >> > that people wouldn't like adding extra overhead in there to deal with >> > capacity and utilization. >> >> So this means that we will never use the quick path of >> select_idle_sibling for cross capacity migration but always the one >> with extra overhead? > > Yes. select_idle_sibling() is only used to choose among equal capacity > cpus (capacity_orig). > >> Patch 9 adds more tests for enabling wake_affine path. Can't it also >> be used for cross capacity migration ? so we can use wake_affine if >> the task or the cpus (even with different capacity) doesn't need this >> extra overhead > > The test in patch 9 is to determine whether we are happy with the > capacity of the previous cpu, or we should go look for one with more > capacity. I don't see how we can use select_idle_sibling() unmodified > for sched domains containing cpus of different capacity to select an > appropriate cpu. It is just picking an idle cpu, it might have high > capacity or low, it wouldn't care. > > How would you avoid the overhead of checking capacity and utilization of > the cpus and still pick an appropriate cpu? My point is that there is some wake up case where we don't care about the capacity and utilization of cpus even for cross capacity migration and we will never take benefit of this fast path. You have added an extra check for setting want_affine in patch 9 which uses capacity and utilization of cpu to disable this fast path when a task needs more capacity than available. Can't you use this function to disable the want_affine for cross-capacity migration situation that cares of the capacity and need the full scan of sched_domain but keep it enable for other cases ?
On Tue, May 24, 2016 at 03:27:05PM +0200, Vincent Guittot wrote: > On 24 May 2016 at 15:16, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > > On Tue, May 24, 2016 at 02:12:38PM +0200, Vincent Guittot wrote: > >> On 24 May 2016 at 12:29, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > >> > On Tue, May 24, 2016 at 11:10:28AM +0200, Vincent Guittot wrote: > >> >> On 23 May 2016 at 12:58, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > >> >> > If the system has cpu of different compute capacities (e.g. big.LITTLE) > >> >> > let affine wakeups be constrained to cpus of the same type. > >> >> > >> >> Can you explain why you don't want wake affine with cpus with > >> >> different compute capacity ? > >> > > >> > I should have made the overall idea a bit more clear. The idea is to > >> > deal with cross-capacity migrations in the find_idlest_{group, cpu}{} > >> > path so we don't have to touch select_idle_sibling(). > >> > select_idle_sibling() is critical for wake-up latency, and I'm assumed > >> > that people wouldn't like adding extra overhead in there to deal with > >> > capacity and utilization. > >> > >> So this means that we will never use the quick path of > >> select_idle_sibling for cross capacity migration but always the one > >> with extra overhead? > > > > Yes. select_idle_sibling() is only used to choose among equal capacity > > cpus (capacity_orig). > > > >> Patch 9 adds more tests for enabling wake_affine path. Can't it also > >> be used for cross capacity migration ? so we can use wake_affine if > >> the task or the cpus (even with different capacity) doesn't need this > >> extra overhead > > > > The test in patch 9 is to determine whether we are happy with the > > capacity of the previous cpu, or we should go look for one with more > > capacity. I don't see how we can use select_idle_sibling() unmodified > > for sched domains containing cpus of different capacity to select an > > appropriate cpu. It is just picking an idle cpu, it might have high > > capacity or low, it wouldn't care. > > > > How would you avoid the overhead of checking capacity and utilization of > > the cpus and still pick an appropriate cpu? > > My point is that there is some wake up case where we don't care about > the capacity and utilization of cpus even for cross capacity migration > and we will never take benefit of this fast path. > You have added an extra check for setting want_affine in patch 9 which > uses capacity and utilization of cpu to disable this fast path when a > task needs more capacity than available. Can't you use this function > to disable the want_affine for cross-capacity migration situation that > cares of the capacity and need the full scan of sched_domain but keep > it enable for other cases ? It is not clear to me what the other cases are. What kind of cases do you have in mind?
On 24 May 2016 at 15:36, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > On Tue, May 24, 2016 at 03:27:05PM +0200, Vincent Guittot wrote: >> On 24 May 2016 at 15:16, Morten Rasmussen <morten.rasmussen@arm.com> wrote: >> > On Tue, May 24, 2016 at 02:12:38PM +0200, Vincent Guittot wrote: >> >> On 24 May 2016 at 12:29, Morten Rasmussen <morten.rasmussen@arm.com> wrote: >> >> > On Tue, May 24, 2016 at 11:10:28AM +0200, Vincent Guittot wrote: >> >> >> On 23 May 2016 at 12:58, Morten Rasmussen <morten.rasmussen@arm.com> wrote: >> >> >> > If the system has cpu of different compute capacities (e.g. big.LITTLE) >> >> >> > let affine wakeups be constrained to cpus of the same type. >> >> >> >> >> >> Can you explain why you don't want wake affine with cpus with >> >> >> different compute capacity ? >> >> > >> >> > I should have made the overall idea a bit more clear. The idea is to >> >> > deal with cross-capacity migrations in the find_idlest_{group, cpu}{} >> >> > path so we don't have to touch select_idle_sibling(). >> >> > select_idle_sibling() is critical for wake-up latency, and I'm assumed >> >> > that people wouldn't like adding extra overhead in there to deal with >> >> > capacity and utilization. >> >> >> >> So this means that we will never use the quick path of >> >> select_idle_sibling for cross capacity migration but always the one >> >> with extra overhead? >> > >> > Yes. select_idle_sibling() is only used to choose among equal capacity >> > cpus (capacity_orig). >> > >> >> Patch 9 adds more tests for enabling wake_affine path. Can't it also >> >> be used for cross capacity migration ? so we can use wake_affine if >> >> the task or the cpus (even with different capacity) doesn't need this >> >> extra overhead >> > >> > The test in patch 9 is to determine whether we are happy with the >> > capacity of the previous cpu, or we should go look for one with more >> > capacity. I don't see how we can use select_idle_sibling() unmodified >> > for sched domains containing cpus of different capacity to select an >> > appropriate cpu. It is just picking an idle cpu, it might have high >> > capacity or low, it wouldn't care. >> > >> > How would you avoid the overhead of checking capacity and utilization of >> > the cpus and still pick an appropriate cpu? >> >> My point is that there is some wake up case where we don't care about >> the capacity and utilization of cpus even for cross capacity migration >> and we will never take benefit of this fast path. >> You have added an extra check for setting want_affine in patch 9 which >> uses capacity and utilization of cpu to disable this fast path when a >> task needs more capacity than available. Can't you use this function >> to disable the want_affine for cross-capacity migration situation that >> cares of the capacity and need the full scan of sched_domain but keep >> it enable for other cases ? > > It is not clear to me what the other cases are. What kind of cases do > you have in mind? As an example, you have a task A that have to be on a big CPU because of the requirement of compute capacity, that wakes up a task B that can run on any cpu according to its utilization. The fast wake up path is fine for task B whatever prev cpu is.
On Tue, May 24, 2016 at 03:52:00PM +0200, Vincent Guittot wrote: > On 24 May 2016 at 15:36, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > > On Tue, May 24, 2016 at 03:27:05PM +0200, Vincent Guittot wrote: > >> On 24 May 2016 at 15:16, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > >> > On Tue, May 24, 2016 at 02:12:38PM +0200, Vincent Guittot wrote: > >> >> On 24 May 2016 at 12:29, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > >> >> > On Tue, May 24, 2016 at 11:10:28AM +0200, Vincent Guittot wrote: > >> >> >> On 23 May 2016 at 12:58, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > >> >> >> > If the system has cpu of different compute capacities (e.g. big.LITTLE) > >> >> >> > let affine wakeups be constrained to cpus of the same type. > >> >> >> > >> >> >> Can you explain why you don't want wake affine with cpus with > >> >> >> different compute capacity ? > >> >> > > >> >> > I should have made the overall idea a bit more clear. The idea is to > >> >> > deal with cross-capacity migrations in the find_idlest_{group, cpu}{} > >> >> > path so we don't have to touch select_idle_sibling(). > >> >> > select_idle_sibling() is critical for wake-up latency, and I'm assumed > >> >> > that people wouldn't like adding extra overhead in there to deal with > >> >> > capacity and utilization. > >> >> > >> >> So this means that we will never use the quick path of > >> >> select_idle_sibling for cross capacity migration but always the one > >> >> with extra overhead? > >> > > >> > Yes. select_idle_sibling() is only used to choose among equal capacity > >> > cpus (capacity_orig). > >> > > >> >> Patch 9 adds more tests for enabling wake_affine path. Can't it also > >> >> be used for cross capacity migration ? so we can use wake_affine if > >> >> the task or the cpus (even with different capacity) doesn't need this > >> >> extra overhead > >> > > >> > The test in patch 9 is to determine whether we are happy with the > >> > capacity of the previous cpu, or we should go look for one with more > >> > capacity. I don't see how we can use select_idle_sibling() unmodified > >> > for sched domains containing cpus of different capacity to select an > >> > appropriate cpu. It is just picking an idle cpu, it might have high > >> > capacity or low, it wouldn't care. > >> > > >> > How would you avoid the overhead of checking capacity and utilization of > >> > the cpus and still pick an appropriate cpu? > >> > >> My point is that there is some wake up case where we don't care about > >> the capacity and utilization of cpus even for cross capacity migration > >> and we will never take benefit of this fast path. > >> You have added an extra check for setting want_affine in patch 9 which > >> uses capacity and utilization of cpu to disable this fast path when a > >> task needs more capacity than available. Can't you use this function > >> to disable the want_affine for cross-capacity migration situation that > >> cares of the capacity and need the full scan of sched_domain but keep > >> it enable for other cases ? > > > > It is not clear to me what the other cases are. What kind of cases do > > you have in mind? > > As an example, you have a task A that have to be on a big CPU because > of the requirement of compute capacity, that wakes up a task B that > can run on any cpu according to its utilization. The fast wake up path > is fine for task B whatever prev cpu is. In that case, we will take always take fast path (select_idle_sibling()) for task B if wake_wide() allows it, which should be fine. wake_cap() will return true as the B's prev_cpu is either a big cpu (first criteria) or have sufficient capacity for B (second criteria). Given that wake_wide() allows returns false as well and there are no restrictions, want_affine will be true. Depending on where wake_affine() sends us, we will use select_idle_sibling() to search around B's prev_cpu or this cpu (where task A is running). We avoid the overhead of looking for cpu capacity and utilization, but we have restricted the search space for select_idle_sibling(). In case B's prev_cpu is a little cpu, the choice whether we looks for little or big capacity cpus depends on the wake_affine()'s decision. So the search space isn't as wide as it could be. To expand the search space we would have be able to adjust the sched_domain level at which select_idle_sibling() is operating, so we can look at same-capacity cpus only in the fast path for tasks like A, and look at all cpus for tasks like B. It could possibly be done, if we dare touching select_idle_sibling() ;-) I still have to look at those patches PeterZ posted a while back. TLDR; The fast path should already be used for task B, but the cpu search space is restricted to a specific subset of cpus selected by wake_affine() which isn't ideal, but much less invasive in terms of code changes.
On 24 May 2016 at 17:02, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > On Tue, May 24, 2016 at 03:52:00PM +0200, Vincent Guittot wrote: >> On 24 May 2016 at 15:36, Morten Rasmussen <morten.rasmussen@arm.com> wrote: >> > On Tue, May 24, 2016 at 03:27:05PM +0200, Vincent Guittot wrote: >> >> On 24 May 2016 at 15:16, Morten Rasmussen <morten.rasmussen@arm.com> wrote: >> >> > On Tue, May 24, 2016 at 02:12:38PM +0200, Vincent Guittot wrote: >> >> >> On 24 May 2016 at 12:29, Morten Rasmussen <morten.rasmussen@arm.com> wrote: >> >> >> > On Tue, May 24, 2016 at 11:10:28AM +0200, Vincent Guittot wrote: >> >> >> >> On 23 May 2016 at 12:58, Morten Rasmussen <morten.rasmussen@arm.com> wrote: >> >> >> >> > If the system has cpu of different compute capacities (e.g. big.LITTLE) >> >> >> >> > let affine wakeups be constrained to cpus of the same type. >> >> >> >> >> >> >> >> Can you explain why you don't want wake affine with cpus with >> >> >> >> different compute capacity ? >> >> >> > >> >> >> > I should have made the overall idea a bit more clear. The idea is to >> >> >> > deal with cross-capacity migrations in the find_idlest_{group, cpu}{} >> >> >> > path so we don't have to touch select_idle_sibling(). >> >> >> > select_idle_sibling() is critical for wake-up latency, and I'm assumed >> >> >> > that people wouldn't like adding extra overhead in there to deal with >> >> >> > capacity and utilization. >> >> >> >> >> >> So this means that we will never use the quick path of >> >> >> select_idle_sibling for cross capacity migration but always the one >> >> >> with extra overhead? >> >> > >> >> > Yes. select_idle_sibling() is only used to choose among equal capacity >> >> > cpus (capacity_orig). >> >> > >> >> >> Patch 9 adds more tests for enabling wake_affine path. Can't it also >> >> >> be used for cross capacity migration ? so we can use wake_affine if >> >> >> the task or the cpus (even with different capacity) doesn't need this >> >> >> extra overhead >> >> > >> >> > The test in patch 9 is to determine whether we are happy with the >> >> > capacity of the previous cpu, or we should go look for one with more >> >> > capacity. I don't see how we can use select_idle_sibling() unmodified >> >> > for sched domains containing cpus of different capacity to select an >> >> > appropriate cpu. It is just picking an idle cpu, it might have high >> >> > capacity or low, it wouldn't care. >> >> > >> >> > How would you avoid the overhead of checking capacity and utilization of >> >> > the cpus and still pick an appropriate cpu? >> >> >> >> My point is that there is some wake up case where we don't care about >> >> the capacity and utilization of cpus even for cross capacity migration >> >> and we will never take benefit of this fast path. >> >> You have added an extra check for setting want_affine in patch 9 which >> >> uses capacity and utilization of cpu to disable this fast path when a >> >> task needs more capacity than available. Can't you use this function >> >> to disable the want_affine for cross-capacity migration situation that >> >> cares of the capacity and need the full scan of sched_domain but keep >> >> it enable for other cases ? >> > >> > It is not clear to me what the other cases are. What kind of cases do >> > you have in mind? >> >> As an example, you have a task A that have to be on a big CPU because >> of the requirement of compute capacity, that wakes up a task B that >> can run on any cpu according to its utilization. The fast wake up path >> is fine for task B whatever prev cpu is. > > In that case, we will take always take fast path (select_idle_sibling()) > for task B if wake_wide() allows it, which should be fine. Even if want_affine is set, the wake up of task B will not use the fast path. The affine_sd will not be set because the sched_domain, which have both cpus, will not have the SD_WAKE_AFFINE flag according to this patch, isn't it ? So task B can't use the fast path whereas nothing prevent him to take benefit of it Am I missing something ? > > wake_cap() will return true as the B's prev_cpu is either a big cpu > (first criteria) or have sufficient capacity for B (second criteria). > Given that wake_wide() allows returns false as well and there are no > restrictions, want_affine will be true. Depending on where wake_affine() > sends us, we will use select_idle_sibling() to search around B's > prev_cpu or this cpu (where task A is running). > > We avoid the overhead of looking for cpu capacity and utilization, but > we have restricted the search space for select_idle_sibling(). In case > B's prev_cpu is a little cpu, the choice whether we looks for little or > big capacity cpus depends on the wake_affine()'s decision. So the search > space isn't as wide as it could be. > > To expand the search space we would have be able to adjust the > sched_domain level at which select_idle_sibling() is operating, so we > can look at same-capacity cpus only in the fast path for tasks like A, > and look at all cpus for tasks like B. It could possibly be done, if we > dare touching select_idle_sibling() ;-) I still have to look at those > patches PeterZ posted a while back. > > TLDR; The fast path should already be used for task B, but the cpu > search space is restricted to a specific subset of cpus selected by > wake_affine() which isn't ideal, but much less invasive in terms of code > changes.
On Tue, May 24, 2016 at 05:53:27PM +0200, Vincent Guittot wrote: > On 24 May 2016 at 17:02, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > > On Tue, May 24, 2016 at 03:52:00PM +0200, Vincent Guittot wrote: > >> On 24 May 2016 at 15:36, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > >> > On Tue, May 24, 2016 at 03:27:05PM +0200, Vincent Guittot wrote: > >> >> On 24 May 2016 at 15:16, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > >> >> > On Tue, May 24, 2016 at 02:12:38PM +0200, Vincent Guittot wrote: > >> >> >> On 24 May 2016 at 12:29, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > >> >> >> > On Tue, May 24, 2016 at 11:10:28AM +0200, Vincent Guittot wrote: > >> >> >> >> On 23 May 2016 at 12:58, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > >> >> >> >> > If the system has cpu of different compute capacities (e.g. big.LITTLE) > >> >> >> >> > let affine wakeups be constrained to cpus of the same type. > >> >> >> >> > >> >> >> >> Can you explain why you don't want wake affine with cpus with > >> >> >> >> different compute capacity ? > >> >> >> > > >> >> >> > I should have made the overall idea a bit more clear. The idea is to > >> >> >> > deal with cross-capacity migrations in the find_idlest_{group, cpu}{} > >> >> >> > path so we don't have to touch select_idle_sibling(). > >> >> >> > select_idle_sibling() is critical for wake-up latency, and I'm assumed > >> >> >> > that people wouldn't like adding extra overhead in there to deal with > >> >> >> > capacity and utilization. > >> >> >> > >> >> >> So this means that we will never use the quick path of > >> >> >> select_idle_sibling for cross capacity migration but always the one > >> >> >> with extra overhead? > >> >> > > >> >> > Yes. select_idle_sibling() is only used to choose among equal capacity > >> >> > cpus (capacity_orig). > >> >> > > >> >> >> Patch 9 adds more tests for enabling wake_affine path. Can't it also > >> >> >> be used for cross capacity migration ? so we can use wake_affine if > >> >> >> the task or the cpus (even with different capacity) doesn't need this > >> >> >> extra overhead > >> >> > > >> >> > The test in patch 9 is to determine whether we are happy with the > >> >> > capacity of the previous cpu, or we should go look for one with more > >> >> > capacity. I don't see how we can use select_idle_sibling() unmodified > >> >> > for sched domains containing cpus of different capacity to select an > >> >> > appropriate cpu. It is just picking an idle cpu, it might have high > >> >> > capacity or low, it wouldn't care. > >> >> > > >> >> > How would you avoid the overhead of checking capacity and utilization of > >> >> > the cpus and still pick an appropriate cpu? > >> >> > >> >> My point is that there is some wake up case where we don't care about > >> >> the capacity and utilization of cpus even for cross capacity migration > >> >> and we will never take benefit of this fast path. > >> >> You have added an extra check for setting want_affine in patch 9 which > >> >> uses capacity and utilization of cpu to disable this fast path when a > >> >> task needs more capacity than available. Can't you use this function > >> >> to disable the want_affine for cross-capacity migration situation that > >> >> cares of the capacity and need the full scan of sched_domain but keep > >> >> it enable for other cases ? > >> > > >> > It is not clear to me what the other cases are. What kind of cases do > >> > you have in mind? > >> > >> As an example, you have a task A that have to be on a big CPU because > >> of the requirement of compute capacity, that wakes up a task B that > >> can run on any cpu according to its utilization. The fast wake up path > >> is fine for task B whatever prev cpu is. > > > > In that case, we will take always take fast path (select_idle_sibling()) > > for task B if wake_wide() allows it, which should be fine. > > Even if want_affine is set, the wake up of task B will not use the fast path. > The affine_sd will not be set because the sched_domain, which have > both cpus, will not have the SD_WAKE_AFFINE flag according to this > patch, isn't it ? > So task B can't use the fast path whereas nothing prevent him to take > benefit of it > > Am I missing something ? No, I think you are right. Very good point. The cpumask test with sched_domain_span() will of cause return false. So yes, in this case the slow path is taken. It isn't wrong as such, just slower for asymmetric capacity systems :-) It is clearly not as optimized for asymmetric capacity systems as it could be, but my focus was to not ruin existing behaviour and minimize overhead for others. There are a lot of different routes through those conditions in the first half of select_task_rq_fair() that aren't obvious. I worry that some users depend on them and that I don't see/understand all of them. If people agree on changing things, it is fine with me. I just tried to avoid getting the patches shot down on that account ;-)
On 25 May 2016 at 11:12, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > On Tue, May 24, 2016 at 05:53:27PM +0200, Vincent Guittot wrote: >> On 24 May 2016 at 17:02, Morten Rasmussen <morten.rasmussen@arm.com> wrote: >> > On Tue, May 24, 2016 at 03:52:00PM +0200, Vincent Guittot wrote: >> >> On 24 May 2016 at 15:36, Morten Rasmussen <morten.rasmussen@arm.com> wrote: >> >> > On Tue, May 24, 2016 at 03:27:05PM +0200, Vincent Guittot wrote: >> >> >> On 24 May 2016 at 15:16, Morten Rasmussen <morten.rasmussen@arm.com> wrote: >> >> >> > On Tue, May 24, 2016 at 02:12:38PM +0200, Vincent Guittot wrote: >> >> >> >> On 24 May 2016 at 12:29, Morten Rasmussen <morten.rasmussen@arm.com> wrote: [snip] >> >> > It is not clear to me what the other cases are. What kind of cases do >> >> > you have in mind? >> >> >> >> As an example, you have a task A that have to be on a big CPU because >> >> of the requirement of compute capacity, that wakes up a task B that >> >> can run on any cpu according to its utilization. The fast wake up path >> >> is fine for task B whatever prev cpu is. >> > >> > In that case, we will take always take fast path (select_idle_sibling()) >> > for task B if wake_wide() allows it, which should be fine. >> >> Even if want_affine is set, the wake up of task B will not use the fast path. >> The affine_sd will not be set because the sched_domain, which have >> both cpus, will not have the SD_WAKE_AFFINE flag according to this >> patch, isn't it ? >> So task B can't use the fast path whereas nothing prevent him to take >> benefit of it >> >> Am I missing something ? > > No, I think you are right. Very good point. The cpumask test with > sched_domain_span() will of cause return false. So yes, in this case the > slow path is taken. It isn't wrong as such, just slower for asymmetric > capacity systems :-) > So, I still don't see why the function wake_cap that is introduce by patch 9 can't be used for testing cross capacity migration at wake up ? The only reason for which we would like to skip fast wake up path for cross capacity migration, is whether the task needs more capacity than the capacity of cpu or prev_cpu. You already do that for prev_cpu, can't you add same test for cpu ? > It is clearly not as optimized for asymmetric capacity systems as it > could be, but my focus was to not ruin existing behaviour and minimize > overhead for others. There are a lot of different routes through those > conditions in the first half of select_task_rq_fair() that aren't > obvious. I worry that some users depend on them and that I don't > see/understand all of them. > > If people agree on changing things, it is fine with me. I just tried to > avoid getting the patches shot down on that account ;-)
On Thu, May 26, 2016 at 08:45:03AM +0200, Vincent Guittot wrote: > On 25 May 2016 at 11:12, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > > On Tue, May 24, 2016 at 05:53:27PM +0200, Vincent Guittot wrote: > >> On 24 May 2016 at 17:02, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > >> > On Tue, May 24, 2016 at 03:52:00PM +0200, Vincent Guittot wrote: > >> >> On 24 May 2016 at 15:36, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > >> >> > On Tue, May 24, 2016 at 03:27:05PM +0200, Vincent Guittot wrote: > >> >> >> On 24 May 2016 at 15:16, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > >> >> >> > On Tue, May 24, 2016 at 02:12:38PM +0200, Vincent Guittot wrote: > >> >> >> >> On 24 May 2016 at 12:29, Morten Rasmussen <morten.rasmussen@arm.com> wrote: > > [snip] > > >> >> > It is not clear to me what the other cases are. What kind of cases do > >> >> > you have in mind? > >> >> > >> >> As an example, you have a task A that have to be on a big CPU because > >> >> of the requirement of compute capacity, that wakes up a task B that > >> >> can run on any cpu according to its utilization. The fast wake up path > >> >> is fine for task B whatever prev cpu is. > >> > > >> > In that case, we will take always take fast path (select_idle_sibling()) > >> > for task B if wake_wide() allows it, which should be fine. > >> > >> Even if want_affine is set, the wake up of task B will not use the fast path. > >> The affine_sd will not be set because the sched_domain, which have > >> both cpus, will not have the SD_WAKE_AFFINE flag according to this > >> patch, isn't it ? > >> So task B can't use the fast path whereas nothing prevent him to take > >> benefit of it > >> > >> Am I missing something ? > > > > No, I think you are right. Very good point. The cpumask test with > > sched_domain_span() will of cause return false. So yes, in this case the > > slow path is taken. It isn't wrong as such, just slower for asymmetric > > capacity systems :-) > > > > So, I still don't see why the function wake_cap that is introduce by > patch 9 can't be used for testing cross capacity migration at wake up > ? > The only reason for which we would like to skip fast wake up path for > cross capacity migration, is whether the task needs more capacity than > the capacity of cpu or prev_cpu. You already do that for prev_cpu, > can't you add same test for cpu ? I think I finally see what you mean. Thanks for your patience :-) It should be safe to let wake_affine be cross-capacity as long as we only take the fast path when both prev_cpu and this_cpu have sufficient capacity for the task. select_idle_sibling() can never end up looking at cpus with different capacities than those of this_cpu and prev_cpu. I will give that a try. I have a few ideas on how it can extended to use the fast path in a few additional cases as well.
diff --git a/kernel/sched/core.c b/kernel/sched/core.c index d9619a3..558ec4a 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -6410,6 +6410,9 @@ sd_init(struct sched_domain_topology_level *tl, int cpu) sd->idle_idx = 1; } + if (sd->flags & SD_ASYM_CPUCAPACITY) + sd->flags &= ~SD_WAKE_AFFINE; + sd->private = &tl->data; return sd;
If the system has cpu of different compute capacities (e.g. big.LITTLE) let affine wakeups be constrained to cpus of the same type. cc: Ingo Molnar <mingo@redhat.com> cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com> --- kernel/sched/core.c | 3 +++ 1 file changed, 3 insertions(+) -- 1.9.1