Message ID | 20161205092735.GA9161@linaro.org
---|---
State | New
On Mon, 05 Dec, at 10:27:36AM, Vincent Guittot wrote:
> Hi Matt,
>
> Thanks for the results.
>
> During the review, it has been pointed out by Morten that the test condition
> (100*this_avg_load < imbalance_scale*min_avg_load) makes more sense than
> (100*min_avg_load > imbalance_scale*this_avg_load). But I see lower
> performance with this change. Could you run tests with the change below on
> top of the patchset?
>
> ---
>  kernel/sched/fair.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index e8d1ae7..0129fbb 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -5514,7 +5514,7 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
>  	if (!idlest ||
>  	    (min_runnable_load > (this_runnable_load + imbalance)) ||
>  	    ((this_runnable_load < (min_runnable_load + imbalance)) &&
> -	     (100*min_avg_load > imbalance_scale*this_avg_load)))
> +	     (100*this_avg_load < imbalance_scale*min_avg_load)))
>  		return NULL;
>  	return idlest;
>  }

Queued for testing.
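[Editor's note: to make the semantic difference between the two predicates concrete, here is a minimal userspace sketch. It is not kernel code: the load values and the imbalance_scale of 117 are made up for illustration, and the surrounding runnable_load checks from the hunk above are ignored so that only the avg_load clause is compared.]

#include <stdio.h>

int main(void)
{
	unsigned long imbalance_scale = 117;	/* illustrative; the kernel derives this from sd->imbalance_pct */
	unsigned long this_avg_load = 1000;	/* local group's average load */

	for (unsigned long min_avg_load = 800; min_avg_load <= 1200; min_avg_load += 100) {
		/* old condition: stay local only when the best remote group is clearly busier */
		int old_cond = 100 * min_avg_load > imbalance_scale * this_avg_load;
		/* new condition: stay local unless a remote group is clearly less loaded */
		int new_cond = 100 * this_avg_load < imbalance_scale * min_avg_load;

		printf("min_avg_load=%4lu  old=%d  new=%d\n", min_avg_load, old_cond, new_cond);
	}
	return 0;
}

With these numbers the old predicate keeps the task local only once min_avg_load reaches 1200, while the new one already does so from min_avg_load = 900 upward; in other words, the new condition favours locality whenever the remote minimum is not clearly lower.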
On Mon, 05 Dec, at 01:35:46PM, Matt Fleming wrote:
> On Mon, 05 Dec, at 10:27:36AM, Vincent Guittot wrote:
> >
> > Hi Matt,
> >
> > Thanks for the results.
> >
> > During the review, it has been pointed out by Morten that the test condition
> > (100*this_avg_load < imbalance_scale*min_avg_load) makes more sense than
> > (100*min_avg_load > imbalance_scale*this_avg_load). But I see lower
> > performance with this change. Could you run tests with the change below on
> > top of the patchset?
> >
> > [patch snipped]
>
> Queued for testing.

Most of the machines didn't notice the difference with this new patch.
However, I did come across one test that showed a negative change:

hackbench-thread-pipes
                      4.9.0-rc6           4.9.0-rc6           4.9.0-rc6           4.9.0-rc6
                      tip-sched    fix-fig-for-fork             fix-sig fix-fig-for-fork-v2
Amean    1    0.1266 (  0.00%)    0.1269 ( -0.23%)    0.1287 ( -1.69%)    0.1357 ( -7.22%)
Amean    4    0.4989 (  0.00%)    0.5174 ( -3.72%)    0.5251 ( -5.27%)    0.5117 ( -2.58%)
Amean    7    0.8510 (  0.00%)    0.8517 ( -0.08%)    0.8964 ( -5.34%)    0.8801 ( -3.42%)
Amean    12   1.0699 (  0.00%)    1.0484 (  2.00%)    1.0147 (  5.15%)    1.0759 ( -0.56%)
Amean    21   1.2816 (  0.00%)    1.2140 (  5.27%)    1.1879 (  7.31%)    1.2414 (  3.13%)
Amean    30   1.4440 (  0.00%)    1.4003 (  3.03%)    1.3969 (  3.26%)    1.4057 (  2.65%)
Amean    48   1.5773 (  0.00%)    1.5983 ( -1.33%)    1.3984 ( 11.34%)    1.5624 (  0.94%)
Amean    79   2.2343 (  0.00%)    2.3066 ( -3.24%)    2.0053 ( 10.25%)    2.2630 ( -1.29%)
Amean    96   2.6736 (  0.00%)    2.4313 (  9.06%)    2.4181 (  9.55%)    2.4717 (  7.55%)

             4.9.0-rc6        4.9.0-rc6        4.9.0-rc6            4.9.0-rc6
             tip-sched fix-fig-for-fork          fix-sig  fix-fig-for-fork-v2
User            129.53           128.64           127.70               131.00
System         1784.54          1744.21          1654.08              1744.00
Elapsed          92.07            90.44            86.95                91.00

Looking at the Amean rows for 48 and 79 groups, there's a noticeable
drop-off in performance of ~10%, which should be outside the noise for
this test. This is a 2-socket, 4-NUMA-node (yes, really), 24-CPU AMD
Opteron machine from circa 2010.

Given the age of this machine, I don't think it's worth holding up the
patch.
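[Editor's note: in this mmtests-style report hackbench values are runtimes, so lower is better, and each percentage is the gain over the tip-sched baseline in the same row. A hypothetical snippet reproducing the arithmetic; the report computes from unrounded means, so the last digit can occasionally differ from a recomputation based on the printed values.]

#include <stdio.h>

/* percentage improvement over the baseline; since hackbench reports
 * runtimes, a positive result means faster than the baseline */
static double pct_gain(double baseline, double value)
{
	return (baseline - value) / baseline * 100.0;
}

int main(void)
{
	/* Amean, 48 groups: tip-sched vs. fix-sig */
	printf("48 groups: %.2f%%\n", pct_gain(1.5773, 1.3984));  /* ~11.34% */
	/* Amean, 79 groups: tip-sched vs. fix-sig */
	printf("79 groups: %.2f%%\n", pct_gain(2.2343, 2.0053));  /* ~10.25% */
	return 0;
}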
On 8 December 2016 at 15:09, Matt Fleming <matt@codeblueprint.co.uk> wrote:
> On Mon, 05 Dec, at 01:35:46PM, Matt Fleming wrote:
>> On Mon, 05 Dec, at 10:27:36AM, Vincent Guittot wrote:
>> > [original mail and patch snipped]
>>
>> Queued for testing.
>
> Most of the machines didn't notice the difference with this new patch.
> However, I did come across one test that showed a negative change:

OK, so you don't see a performance impact between the two test conditions
except on the machine below. So IMHO we should use the new test condition,
which tries to keep the task local if there is no other CPU that is
obviously less loaded.

I'm going to prepare a new version of the patchset and apply all the
comments.

Thanks,
Vincent

> [hackbench results snipped; see the previous mail]
>
> Looking at the Amean rows for 48 and 79 groups, there's a noticeable
> drop-off in performance of ~10%, which should be outside the noise for
> this test. This is a 2-socket, 4-NUMA-node (yes, really), 24-CPU AMD
> Opteron machine from circa 2010.
>
> Given the age of this machine, I don't think it's worth holding up the
> patch.
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e8d1ae7..0129fbb 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5514,7 +5514,7 @@ find_idlest_group(struct sched_domain *sd, struct task_struct *p,
 	if (!idlest ||
 	    (min_runnable_load > (this_runnable_load + imbalance)) ||
 	    ((this_runnable_load < (min_runnable_load + imbalance)) &&
-	     (100*min_avg_load > imbalance_scale*this_avg_load)))
+	     (100*this_avg_load < imbalance_scale*min_avg_load)))
 		return NULL;
 	return idlest;
 }
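[Editor's note: for readers without the full function at hand, here is a standalone userspace paraphrase of the post-patch decision. The variable names mirror kernel/sched/fair.c, but the bool return and the have_idlest parameter are illustrative stand-ins; this is a sketch of the three clauses, not the kernel function itself.]

#include <stdbool.h>

static bool stay_local(bool have_idlest,
		       unsigned long this_runnable_load,
		       unsigned long min_runnable_load,
		       unsigned long this_avg_load,
		       unsigned long min_avg_load,
		       unsigned long imbalance,
		       unsigned long imbalance_scale)
{
	/* no remote candidate group at all */
	if (!have_idlest)
		return true;

	/* the best remote group is clearly busier on runnable load */
	if (min_runnable_load > this_runnable_load + imbalance)
		return true;

	/* runnable loads are comparable, and on average load the local
	 * group is within the imbalance_scale slack of the remote
	 * minimum, so prefer locality */
	if (this_runnable_load < min_runnable_load + imbalance &&
	    100 * this_avg_load < imbalance_scale * min_avg_load)
		return true;

	return false;
}

int main(void)
{
	/* remote group slightly busier than local: stays local after the patch */
	return stay_local(true, 90, 100, 1000, 1050, 20, 117) ? 0 : 1;
}

Returning true here corresponds to find_idlest_group() returning NULL, i.e. the caller keeps the task in the local group.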