Message ID: 20170328063541.12912-1-dietmar.eggemann@arm.com
Series: CFS load tracking trace events
On 28 March 2017 at 08:35, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:
> This patch-set introduces trace events for load (and utilization)
> tracking for the following three cfs scheduler bricks: cfs_rq,
> sched_entity and task_group.
>
> I've decided to send it out because people are discussing problems with
> PELT-related functionality, and as a base for the related discussion
> next week at the OSPM summit in Pisa.
>
> The requirements for the design are:
>
> (1) Support for all combinations of the following kernel configuration
>     options: CONFIG_SMP, CONFIG_FAIR_GROUP_SCHED,
>     CONFIG_SCHED_AUTOGROUP, CONFIG_DEBUG_KERNEL.
>
> (2) All key=value pairs have to appear in all combinations of the
>     kernel configuration options mentioned under (1).
>
> (3) Stable interface, i.e. only use keys for the key=value pairs which
>     are independent of the underlying (load tracking) implementation.
>
> (4) Minimally invasive to the actual cfs scheduler code, i.e. the trace
>     events should not have to be guarded by an if condition (e.g.
>     entity_is_task(se) to distinguish between task and task_group) or a
>     kernel configuration switch.
>
> The trace events only expose one load (and one utilization) key=value
> pair besides those used to identify the cfs scheduler brick. They do
> not provide any internal implementation details of the actual (PELT)
> load-tracking mechanism.
> This limitation might be too strict, since for a cfs_rq we can only
> trace either runnable load (cfs_rq->runnable_load_avg) or
> runnable/blocked load (cfs_rq->avg.load_avg).
> Other load metrics, like instantaneous load ({cfs_rq|se}->load.weight),
> are not traced but could easily be added.
>
> The following keys are used to identify the cfs scheduler brick:
>
> (1) Cpu number the cfs scheduler brick is attached to.
>
> (2) Task_group path and (css) id.
>
> (3) Task name and pid.

Do you really need both path/name and id/pid ?

The path/name looks quite intrusive, so can't we just use id/pid ?
>
> In case a key does not apply, due to an unset kernel configuration
> option or the fact that a sched_entity can represent either a task or a
> task_group, its value is set to an invalid default:
>
> (1) Task_group path: "(null)"
>
> (2) Task_group id: -1
>
> (3) Task name: "(null)"
>
> (4) Task pid: -1
>
> Load tracking trace events are placed into kernel/sched/fair.c:
>
> (1) for cfs_rq:
>
>     - In the PELT core function __update_load_avg().
>
>     - During sched_entity attach/detach.
>
>     - During sched_entity load/util propagation down the task_group
>       hierarchy.
>
> (2) for sched_entity:
>
>     - In the PELT core function __update_load_avg().
>
>     - During sched_entity load/util propagation down the task_group
>       hierarchy.
>
> (3) for task_group:
>
>     - In its PELT update function.
>
> An alternative to using __update_load_avg() would be to put trace
> events into update_cfs_rq_load_avg() for cfs_rq's, and into
> set_task_rq_fair(), update_load_avg(), sync_entity_load_avg() for
> sched_entities. This would also get rid of the if(cfs_rq)/else
> condition in __update_load_avg().
>
> These trace events still look a bit fragile.
> First of all, this patch-set has to use cfs-specific data structures
> in the global task scheduler trace file include/trace/events/sched.h.
> And second, the accessor functions (rq_of(), task_of(), etc.) are
> private to the cfs scheduler class. If they were public, these trace
> events would be easier to code. That said, group_cfs_rq() is already
> made public by this patch-set.
>
> This version is based on tip/sched/core as of yesterday (bc4278987e38).
> It has been compile-tested on ~160 configurations via 0day's kbuild
> test robot.
>
> Dietmar Eggemann (5):
>   sched/autogroup: Define autogroup_path() for !CONFIG_SCHED_DEBUG
>   sched/events: Introduce cfs_rq load tracking trace event
>   sched/fair: Export group_cfs_rq()
>   sched/events: Introduce sched_entity load tracking trace event
>   sched/events: Introduce task_group load tracking trace event
>
>  include/linux/sched.h        |  10 +++
>  include/trace/events/sched.h | 143 +++++++++++++++++++++++++++++++++++++++++++
>  kernel/sched/autogroup.c     |   2 -
>  kernel/sched/autogroup.h     |   2 -
>  kernel/sched/fair.c          |  26 ++++----
>  5 files changed, 167 insertions(+), 16 deletions(-)
>
> --
> 2.11.0
On 03/28/2017 12:05 PM, Vincent Guittot wrote:
> On 28 March 2017 at 08:35, Dietmar Eggemann <dietmar.eggemann@arm.com> wrote:

[...]

>> The following keys are used to identify the cfs scheduler brick:
>>
>> (1) Cpu number the cfs scheduler brick is attached to.
>>
>> (2) Task_group path and (css) id.
>>
>> (3) Task name and pid.
>
> Do you really need both path/name and id/pid ?
>
> The path/name looks quite intrusive so can't we just use id/pid ?

One problem is that all autogroups use id=0. Another thing with
task_groups is that dealing with path="/tg1/tg11" is so much more
intuitive than id="7".

IMHO, we do need both task name and pid to be able to clearly identify
a task (same name/different pid, or during the fork phase, where the
forkee still has the name of the forker).

You're right, the implementation with path is more complicated, but I
guess that's worth it. We could get rid of 'id' though.

[...]