Message ID | 20161222151215.GA23448@linaro.org
---|---
State | New
Vincent Guittot <vincent.guittot@linaro.org> writes:

> On Tuesday 13 Dec 2016 at 09:47:30 (+0800), Huang, Ying wrote:
>> Hi, Vincent,
>>
>> Vincent Guittot <vincent.guittot@linaro.org> writes:
>>
>> > Hi Ying,
>> >
>> > On 12 December 2016 at 06:43, kernel test robot
>> > <ying.huang@linux.intel.com> wrote:
>> >> Greeting,
>> >>
>> >> FYI, we noticed a 149% regression of ftq.noise.50% due to commit:
>> >>
>> >> commit: 4e5160766fcc9f41bbd38bac11f92dce993644aa ("sched/fair: Propagate asynchrous detach")
>> >> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
>> >>
>> >> in testcase: ftq
>> >> on test machine: 8 threads Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz with 8G memory
>> >> with following parameters:
>> >>
>> >> nr_task: 100%
>> >> samples: 6000ss
>> >> test: cache
>> >> freq: 20
>> >> cpufreq_governor: powersave
>> >
>> > Why use powersave? Are you testing every governor?
>>
>> We will test the performance and powersave governors for FTQ.
>
> Ok, thanks.
>
>> >> test-description: The FTQ benchmarks measure hardware and software interference or 'noise' on a node from the applications perspective.
>> >> test-url: https://github.com/rminnich/ftq
>> >
>> > It's a bit difficult to understand exactly what is measured and what
>> > ftq.noise.50% is, because this result is not part of the bench, which
>> > seems to only record a log of data in a file, and ftq.noise.50% seems
>> > to be lkp specific.
>>
>> Yes. FTQ itself has no noise statistics built in, although it is an OS
>> noise benchmark. ftq.noise.50% is calculated as below:
>>
>> There is a score for every sample of ftq. The lower the score, the
>> higher the noise. ftq.noise.50% is the number (per 1000000 samples) of
>> samples whose score is less than 50% of the mean score.
>
> Ok, so IIUC we have moved from 0.03% to 0.11% for ftq.noise.50%.
>
> I have not been able to reproduce the regression on the different systems that I have access to, so I can only guess the root cause of the regression.
>
> Could it be possible to test whether the patch below fixes the regression?
>
> ---
>  kernel/sched/fair.c | 29 ++++++++++++++++++++++++++++-
>  1 file changed, 28 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 090a9bb..8efa113 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3138,6 +3138,31 @@ static inline int propagate_entity_load_avg(struct sched_entity *se)
>  	return 1;
>  }
>
> +/* Check if we need to update the load and the utilization of a group_entity */
> +static inline bool skip_blocked_update(struct sched_entity *se)
> +{
> +	struct cfs_rq *gcfs_rq = group_cfs_rq(se);
> +
> +	/*
> +	 * If the sched_entity still has a non-null load or utilization,
> +	 * we have to decay it.
> +	 */
> +	if (se->avg.load_avg || se->avg.util_avg)
> +		return false;
> +
> +	/*
> +	 * If there is a pending propagation, we have to update the load and
> +	 * the utilization of the sched_entity.
> +	 */
> +	if (gcfs_rq->propagate_avg)
> +		return false;
> +
> +	/*
> +	 * Otherwise, the load and the utilization of the sched_entity are
> +	 * already null, so it would be a waste of time to try to decay them.
> +	 */
> +	return true;
> +}
>  #else /* CONFIG_FAIR_GROUP_SCHED */
>
>  static inline void update_tg_load_avg(struct cfs_rq *cfs_rq, int force) {}
> @@ -6858,6 +6883,7 @@ static void update_blocked_averages(int cpu)
>  {
>  	struct rq *rq = cpu_rq(cpu);
>  	struct cfs_rq *cfs_rq;
> +	struct sched_entity *se;
>  	unsigned long flags;
>
>  	raw_spin_lock_irqsave(&rq->lock, flags);
> @@ -6876,7 +6902,8 @@ static void update_blocked_averages(int cpu)
>  			update_tg_load_avg(cfs_rq, 0);
>
>  		/* Propagate pending load changes to the parent */
> -		if (cfs_rq->tg->se[cpu])
> +		se = cfs_rq->tg->se[cpu];
> +		if (se && !skip_blocked_update(se))
>  			update_load_avg(cfs_rq->tg->se[cpu], 0);
>  	}
>  	raw_spin_unlock_irqrestore(&rq->lock, flags);

The test results are as follows.

=========================================================================================
compiler/cpufreq_governor/freq/kconfig/nr_task/rootfs/samples/tbox_group/test/testcase:
  gcc-6/powersave/20/x86_64-rhel-7.2/100%/debian-x86_64-2016-08-31.cgz/6000ss/lkp-hsw-d01/cache/ftq

commit:
  4e5160766fcc9f41bbd38bac11f92dce993644aa: first bad commit
  09a43ace1f986b003c118fdf6ddf1fd685692d49: parent of first bad commit
  0613870ea53a7a279d8d37f2a3ce40aafc155fc8: debug commit with above patch

4e5160766fcc9f41 09a43ace1f986b003c118fdf6d 0613870ea53a7a279d8d37f2a3
---------------- -------------------------- --------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
     61670 ±228%     -96.5%       2148 ± 11%     -94.7%       3281 ± 58%  ftq.noise.25%
      3463 ± 10%     -60.0%       1386 ± 19%     -26.3%       2552 ± 58%  ftq.noise.50%
      1116 ± 23%     -72.6%     305.99 ± 30%     -35.8%     716.15 ± 64%  ftq.noise.75%
   3843815 ±  3%      +3.1%    3963589 ±  1%     -49.6%    1938221 ±100%  ftq.time.involuntary_context_switches
      5.33 ± 30%     +21.4%       6.46 ± 14%     -71.7%       1.50 ±108%  time.system_time

It appears that system_time and involuntary_context_switches dropped a
lot after applying the debug patch, which is good from a noise point of
view. ftq.noise.50% is reduced compared with the first bad commit, but
it has not been restored to the level of the parent of the first bad
commit.

Best Regards,
Huang, Ying
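As a concrete illustration of the ftq.noise.N% definition given above, the
computation boils down to counting the samples whose score falls below N% of
the mean score and scaling the count to one million samples. The sketch below
only illustrates that definition and is not the actual lkp tooling; the input
format (one FTQ work-count score per line on stdin) is an assumption.

/*
 * Sketch only: compute the ftq.noise.50% statistic as described above.
 * This is not the lkp implementation; it just illustrates the definition.
 * Assumed input: one FTQ work-count score per line on stdin.
 */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
	double *score = NULL, sum = 0.0, mean, threshold, v;
	size_t n = 0, cap = 0, below = 0, i;

	while (scanf("%lf", &v) == 1) {
		if (n == cap) {
			cap = cap ? 2 * cap : 4096;
			score = realloc(score, cap * sizeof(*score));
			if (!score)
				return 1;
		}
		score[n++] = v;
		sum += v;
	}
	if (!n)
		return 1;

	mean = sum / n;
	threshold = 0.5 * mean;	/* 50%; use 0.25 or 0.75 for the other metrics */

	/* lower scores mean more noise, so count the "noisy" samples */
	for (i = 0; i < n; i++)
		if (score[i] < threshold)
			below++;

	/* report per 1,000,000 samples, matching the metric's definition */
	printf("ftq.noise.50%%: %.0f\n", (double)below * 1000000.0 / n);
	free(score);
	return 0;
}

Read this way, a higher ftq.noise.50% value simply means that more samples
were disturbed, which is why the regression report treats an increase as bad.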
Hi Ying,

On 28 December 2016 at 09:17, Huang, Ying <ying.huang@intel.com> wrote:
> Vincent Guittot <vincent.guittot@linaro.org> writes:
>
>> On Tuesday 13 Dec 2016 at 09:47:30 (+0800), Huang, Ying wrote:
>>> Hi, Vincent,
>>>
>>> Vincent Guittot <vincent.guittot@linaro.org> writes:
>>>
>>> > Hi Ying,
>>> >
>>> > On 12 December 2016 at 06:43, kernel test robot
>>> > <ying.huang@linux.intel.com> wrote:
>>> >> Greeting,
>>> >>
>>> >> FYI, we noticed a 149% regression of ftq.noise.50% due to commit:
>>> >>
>>> >> commit: 4e5160766fcc9f41bbd38bac11f92dce993644aa ("sched/fair: Propagate asynchrous detach")
>>> >> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
>>> >>
>>> >> in testcase: ftq
>>> >> on test machine: 8 threads Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz with 8G memory
>>> >> with following parameters:
>>> >>
>>> >> nr_task: 100%
>>> >> samples: 6000ss
>>> >> test: cache
>>> >> freq: 20
>>> >> cpufreq_governor: powersave
>>> >
>>> > Why use powersave? Are you testing every governor?
>>>
>>> We will test the performance and powersave governors for FTQ.
>>
>> Ok, thanks.
>>
>>> >> test-description: The FTQ benchmarks measure hardware and software interference or 'noise' on a node from the applications perspective.
>>> >> test-url: https://github.com/rminnich/ftq
>>> >
>>> > It's a bit difficult to understand exactly what is measured and what
>>> > ftq.noise.50% is, because this result is not part of the bench, which
>>> > seems to only record a log of data in a file, and ftq.noise.50% seems
>>> > to be lkp specific.
>>>
>>> Yes. FTQ itself has no noise statistics built in, although it is an OS
>>> noise benchmark. ftq.noise.50% is calculated as below:
>>>
>>> There is a score for every sample of ftq. The lower the score, the
>>> higher the noise. ftq.noise.50% is the number (per 1000000 samples) of
>>> samples whose score is less than 50% of the mean score.
>>
>> Ok, so IIUC we have moved from 0.03% to 0.11% for ftq.noise.50%.
>>
>> I have not been able to reproduce the regression on the different systems that I have access to, so I can only guess the root cause of the regression.
>>
>> Could it be possible to test whether the patch below fixes the regression?
>>
>> ---
>>  kernel/sched/fair.c | 29 ++++++++++++++++++++++++++++-
>>  1 file changed, 28 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 090a9bb..8efa113 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -3138,6 +3138,31 @@ static inline int propagate_entity_load_avg(struct sched_entity *se)
>>  	return 1;
>>  }
>>
>> +/* Check if we need to update the load and the utilization of a group_entity */
>> +static inline bool skip_blocked_update(struct sched_entity *se)
>> +{
>> +	struct cfs_rq *gcfs_rq = group_cfs_rq(se);
>> +
>> +	/*
>> +	 * If the sched_entity still has a non-null load or utilization,
>> +	 * we have to decay it.
>> +	 */
>> +	if (se->avg.load_avg || se->avg.util_avg)
>> +		return false;
>> +
>> +	/*
>> +	 * If there is a pending propagation, we have to update the load and
>> +	 * the utilization of the sched_entity.
>> +	 */
>> +	if (gcfs_rq->propagate_avg)
>> +		return false;
>> +
>> +	/*
>> +	 * Otherwise, the load and the utilization of the sched_entity are
>> +	 * already null, so it would be a waste of time to try to decay them.
>> +	 */
>> +	return true;
>> +}
>>  #else /* CONFIG_FAIR_GROUP_SCHED */
>>
>>  static inline void update_tg_load_avg(struct cfs_rq *cfs_rq, int force) {}
>> @@ -6858,6 +6883,7 @@ static void update_blocked_averages(int cpu)
>>  {
>>  	struct rq *rq = cpu_rq(cpu);
>>  	struct cfs_rq *cfs_rq;
>> +	struct sched_entity *se;
>>  	unsigned long flags;
>>
>>  	raw_spin_lock_irqsave(&rq->lock, flags);
>> @@ -6876,7 +6902,8 @@ static void update_blocked_averages(int cpu)
>>  			update_tg_load_avg(cfs_rq, 0);
>>
>>  		/* Propagate pending load changes to the parent */
>> -		if (cfs_rq->tg->se[cpu])
>> +		se = cfs_rq->tg->se[cpu];
>> +		if (se && !skip_blocked_update(se))
>>  			update_load_avg(cfs_rq->tg->se[cpu], 0);
>>  	}
>>  	raw_spin_unlock_irqrestore(&rq->lock, flags);
>
> The test results are as follows.
>
> =========================================================================================
> compiler/cpufreq_governor/freq/kconfig/nr_task/rootfs/samples/tbox_group/test/testcase:
>   gcc-6/powersave/20/x86_64-rhel-7.2/100%/debian-x86_64-2016-08-31.cgz/6000ss/lkp-hsw-d01/cache/ftq
>
> commit:
>   4e5160766fcc9f41bbd38bac11f92dce993644aa: first bad commit
>   09a43ace1f986b003c118fdf6ddf1fd685692d49: parent of first bad commit
>   0613870ea53a7a279d8d37f2a3ce40aafc155fc8: debug commit with above patch
>
> 4e5160766fcc9f41 09a43ace1f986b003c118fdf6d 0613870ea53a7a279d8d37f2a3
> ---------------- -------------------------- --------------------------
>          %stddev     %change         %stddev     %change         %stddev
>              \          |                \          |                \
>      61670 ±228%     -96.5%       2148 ± 11%     -94.7%       3281 ± 58%  ftq.noise.25%
>       3463 ± 10%     -60.0%       1386 ± 19%     -26.3%       2552 ± 58%  ftq.noise.50%
>       1116 ± 23%     -72.6%     305.99 ± 30%     -35.8%     716.15 ± 64%  ftq.noise.75%
>    3843815 ±  3%      +3.1%    3963589 ±  1%     -49.6%    1938221 ±100%  ftq.time.involuntary_context_switches
>       5.33 ± 30%     +21.4%       6.46 ± 14%     -71.7%       1.50 ±108%  time.system_time
>
> It appears that system_time and involuntary_context_switches dropped a
> lot after applying the debug patch, which is good from a noise point of
> view. ftq.noise.50% is reduced compared with the first bad commit, but
> it has not been restored to the level of the parent of the first bad
> commit.

Thanks for testing. I will try to improve it a bit more, but I am not
sure that I can reduce it further.

Regards,
Vincent

>
> Best Regards,
> Huang, Ying
Hi Vincent and Ying,

On 01/02/2017 04:42 PM, Vincent Guittot wrote:
> Hi Ying,
>
> On 28 December 2016 at 09:17, Huang, Ying <ying.huang@intel.com> wrote:
>> Vincent Guittot <vincent.guittot@linaro.org> writes:
>>
>>> On Tuesday 13 Dec 2016 at 09:47:30 (+0800), Huang, Ying wrote:
>>>> Hi, Vincent,
>>>>
>>>> Vincent Guittot <vincent.guittot@linaro.org> writes:

[...]

>>> ---
>>>  kernel/sched/fair.c | 29 ++++++++++++++++++++++++++++-
>>>  1 file changed, 28 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>> index 090a9bb..8efa113 100644
>>> --- a/kernel/sched/fair.c
>>> +++ b/kernel/sched/fair.c
>>> @@ -3138,6 +3138,31 @@ static inline int propagate_entity_load_avg(struct sched_entity *se)
>>>  	return 1;
>>>  }
>>>
>>> +/* Check if we need to update the load and the utilization of a group_entity */
>>> +static inline bool skip_blocked_update(struct sched_entity *se)
>>> +{
>>> +	struct cfs_rq *gcfs_rq = group_cfs_rq(se);
>>> +
>>> +	/*
>>> +	 * If the sched_entity still has a non-null load or utilization,
>>> +	 * we have to decay it.
>>> +	 */
>>> +	if (se->avg.load_avg || se->avg.util_avg)
>>> +		return false;
>>> +
>>> +	/*
>>> +	 * If there is a pending propagation, we have to update the load and
>>> +	 * the utilization of the sched_entity.
>>> +	 */
>>> +	if (gcfs_rq->propagate_avg)
>>> +		return false;
>>> +
>>> +	/*
>>> +	 * Otherwise, the load and the utilization of the sched_entity are
>>> +	 * already null, so it would be a waste of time to try to decay them.
>>> +	 */
>>> +	return true;
>>> +}
>>>  #else /* CONFIG_FAIR_GROUP_SCHED */
>>>
>>>  static inline void update_tg_load_avg(struct cfs_rq *cfs_rq, int force) {}
>>> @@ -6858,6 +6883,7 @@ static void update_blocked_averages(int cpu)
>>>  {
>>>  	struct rq *rq = cpu_rq(cpu);
>>>  	struct cfs_rq *cfs_rq;
>>> +	struct sched_entity *se;
>>>  	unsigned long flags;
>>>
>>>  	raw_spin_lock_irqsave(&rq->lock, flags);
>>> @@ -6876,7 +6902,8 @@ static void update_blocked_averages(int cpu)
>>>  			update_tg_load_avg(cfs_rq, 0);
>>>
>>>  		/* Propagate pending load changes to the parent */
>>> -		if (cfs_rq->tg->se[cpu])
>>> +		se = cfs_rq->tg->se[cpu];
>>> +		if (se && !skip_blocked_update(se))
>>>  			update_load_avg(cfs_rq->tg->se[cpu], 0);
>>>  	}
>>>  	raw_spin_unlock_irqrestore(&rq->lock, flags);
>>
>> The test results are as follows.
>>
>> =========================================================================================
>> compiler/cpufreq_governor/freq/kconfig/nr_task/rootfs/samples/tbox_group/test/testcase:
>>   gcc-6/powersave/20/x86_64-rhel-7.2/100%/debian-x86_64-2016-08-31.cgz/6000ss/lkp-hsw-d01/cache/ftq
>>
>> commit:
>>   4e5160766fcc9f41bbd38bac11f92dce993644aa: first bad commit
>>   09a43ace1f986b003c118fdf6ddf1fd685692d49: parent of first bad commit
>>   0613870ea53a7a279d8d37f2a3ce40aafc155fc8: debug commit with above patch
>>
>> 4e5160766fcc9f41 09a43ace1f986b003c118fdf6d 0613870ea53a7a279d8d37f2a3
>> ---------------- -------------------------- --------------------------
>>          %stddev     %change         %stddev     %change         %stddev
>>              \          |                \          |                \
>>      61670 ±228%     -96.5%       2148 ± 11%     -94.7%       3281 ± 58%  ftq.noise.25%
>>       3463 ± 10%     -60.0%       1386 ± 19%     -26.3%       2552 ± 58%  ftq.noise.50%
>>       1116 ± 23%     -72.6%     305.99 ± 30%     -35.8%     716.15 ± 64%  ftq.noise.75%
>>    3843815 ±  3%      +3.1%    3963589 ±  1%     -49.6%    1938221 ±100%  ftq.time.involuntary_context_switches
>>       5.33 ± 30%     +21.4%       6.46 ± 14%     -71.7%       1.50 ±108%  time.system_time
>>
>> It appears that system_time and involuntary_context_switches dropped a
>> lot after applying the debug patch, which is good from a noise point of
>> view. ftq.noise.50% is reduced compared with the first bad commit, but
>> it has not been restored to the level of the parent of the first bad
>> commit.
>
> Thanks for testing. I will try to improve it a bit more, but I am not
> sure that I can reduce it further.

Is this a desktop system where this regression comes from autogroups
(1 level of taskgroups) or a server system with systemd (2 levels of
taskgroups)?

Since the PELT rewrite (v4.2) I have ~60 autogroups per cpu
(&rq->leaf_cfs_rq_list) on my Ubuntu desktop system permanently
(Intel i7-4750HQ), whereas in v4.1 there were 0 - 10.

$ for i in `seq 0 7`; do cat /proc/sched_debug | grep "cfs_rq\[$i\]:/autogroup-" | wc -l; done
58
61
63
65
60
59
62
56

Couldn't we still remove these autogroups by

  if (!cfs_rq->nr_running && !se->avg.load_avg && !se->avg.util_avg)

in update_blocked_averages()?

Vincent, like we discussed in September last year, the proper fix would
probably be a cfs_rq->nr_attached, which IMHO is not doable w/o being an
atomic because of migrate_task_rq_fair()->remove_entity_load_avg() not
holding the rq lock.
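For concreteness, the check suggested above would presumably sit in the same
loop of update_blocked_averages() that the debug patch touches. The sketch
below is untested and purely illustrative: the helper name is made up for the
example, and the actual unlinking from rq->leaf_cfs_rq_list (and its locking
implications) is deliberately not shown.

/*
 * Illustrative sketch only, not a tested patch: a group cfs_rq with no
 * runnable tasks and a group entity whose load and utilization have fully
 * decayed has nothing left to update or propagate.
 */
static inline bool group_cfs_rq_is_idle(struct cfs_rq *cfs_rq,
					struct sched_entity *se)
{
	return !cfs_rq->nr_running &&
	       !se->avg.load_avg &&
	       !se->avg.util_avg;
}

/* Inside the for_each_leaf_cfs_rq() loop of update_blocked_averages(): */
	se = cfs_rq->tg->se[cpu];
	if (se && group_cfs_rq_is_idle(cfs_rq, se)) {
		/* candidate for removal from rq->leaf_cfs_rq_list (not shown) */
		continue;
	}

A cfs_rq->nr_attached counter, as mentioned above, would make this condition
more direct, at the cost of an atomic update, since
migrate_task_rq_fair()->remove_entity_load_avg() can run without the rq lock
held.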
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 090a9bb..8efa113 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3138,6 +3138,31 @@ static inline int propagate_entity_load_avg(struct sched_entity *se)
 	return 1;
 }
 
+/* Check if we need to update the load and the utilization of a group_entity */
+static inline bool skip_blocked_update(struct sched_entity *se)
+{
+	struct cfs_rq *gcfs_rq = group_cfs_rq(se);
+
+	/*
+	 * If the sched_entity still has a non-null load or utilization,
+	 * we have to decay it.
+	 */
+	if (se->avg.load_avg || se->avg.util_avg)
+		return false;
+
+	/*
+	 * If there is a pending propagation, we have to update the load and
+	 * the utilization of the sched_entity.
+	 */
+	if (gcfs_rq->propagate_avg)
+		return false;
+
+	/*
+	 * Otherwise, the load and the utilization of the sched_entity are
+	 * already null, so it would be a waste of time to try to decay them.
+	 */
+	return true;
+}
 #else /* CONFIG_FAIR_GROUP_SCHED */
 
 static inline void update_tg_load_avg(struct cfs_rq *cfs_rq, int force) {}
@@ -6858,6 +6883,7 @@ static void update_blocked_averages(int cpu)
 {
 	struct rq *rq = cpu_rq(cpu);
 	struct cfs_rq *cfs_rq;
+	struct sched_entity *se;
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&rq->lock, flags);
@@ -6876,7 +6902,8 @@ static void update_blocked_averages(int cpu)
 			update_tg_load_avg(cfs_rq, 0);
 
 		/* Propagate pending load changes to the parent */
-		if (cfs_rq->tg->se[cpu])
+		se = cfs_rq->tg->se[cpu];
+		if (se && !skip_blocked_update(se))
 			update_load_avg(cfs_rq->tg->se[cpu], 0);
 	}
 	raw_spin_unlock_irqrestore(&rq->lock, flags);