[4/7] sched: Track group sched_entity usage contributions

Message ID 1412684017-16595-5-git-send-email-vincent.guittot@linaro.org
State New

Commit Message

Vincent Guittot Oct. 7, 2014, 12:13 p.m. UTC
From: Morten Rasmussen <morten.rasmussen@arm.com>

Add usage contribution tracking for group entities. Unlike
se->avg.load_avg_contrib, se->avg.utilization_avg_contrib for group
entities is the sum of se->avg.utilization_avg_contrib for all entities on the
group runqueue. It is _not_ influenced in any way by the task group
h_load. Hence it represents the actual CPU usage of the group, not
its intended load contribution, which may differ significantly from the
utilization on lightly utilized systems.

cc: Paul Turner <pjt@google.com>
cc: Ben Segall <bsegall@google.com>

Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
---
 kernel/sched/debug.c | 3 +++
 kernel/sched/fair.c  | 5 +++++
 2 files changed, 8 insertions(+)
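
The aggregation described in the commit message can be sketched as a
user-space toy model. This is not kernel code: the toy_* structures and
functions are hypothetical stand-ins, and only the field names
utilization_avg_contrib and utilization_load_avg are taken from the patch.
It assumes the group runqueue already sums the contributions of its queued
entities, and shows the group entity adopting that sum unscaled by shares
or h_load.

```c
#include <assert.h>
#include <stddef.h>

/* Toy runqueue: holds the summed utilization of its queued entities. */
struct toy_cfs_rq {
	unsigned long utilization_load_avg;
};

/* Toy entity: my_q is non-NULL only for a group entity. */
struct toy_entity {
	unsigned long utilization_avg_contrib;
	struct toy_cfs_rq *my_q;
};

/* Queue an entity: its contribution is added to the runqueue sum. */
static void toy_enqueue(struct toy_cfs_rq *rq, const struct toy_entity *se)
{
	rq->utilization_load_avg += se->utilization_avg_contrib;
}

/*
 * Group entity update, mirroring the "else" branch added by the patch:
 * the contribution becomes the sum on the group's own runqueue.
 * Returns the change so a caller could propagate it to the parent.
 */
static long toy_update_group_utilization(struct toy_entity *gse)
{
	long old_contrib = (long)gse->utilization_avg_contrib;

	assert(gse->my_q != NULL);	/* only valid for group entities */
	gse->utilization_avg_contrib = gse->my_q->utilization_load_avg;

	return (long)gse->utilization_avg_contrib - old_contrib;
}
```

With two tasks contributing 200 and 300 queued on the group runqueue, the
group entity's contribution becomes 500 regardless of the group's weight.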

Comments

Benjamin Segall Oct. 7, 2014, 8:15 p.m. UTC | #1
Vincent Guittot <vincent.guittot@linaro.org> writes:

> From: Morten Rasmussen <morten.rasmussen@arm.com>
>
> Adds usage contribution tracking for group entities. Unlike
> se->avg.load_avg_contrib, se->avg.utilization_avg_contrib for group
> entities is the sum of se->avg.utilization_avg_contrib for all entities on the
> group runqueue. It is _not_ influenced in any way by the task group
> h_load. Hence it is representing the actual cpu usage of the group, not
> its intended load contribution which may differ significantly from the
> utilization on lightly utilized systems.


Just noting that this version also has usage disappear immediately when
a task blocks, although it does what you probably want on migration.

Also, group-ses don't ever use their running_avg_sum so it's kinda a
waste, but I'm not sure it's worth doing anything about.

>
> cc: Paul Turner <pjt@google.com>
> cc: Ben Segall <bsegall@google.com>
>
> Signed-off-by: Morten Rasmussen <morten.rasmussen@arm.com>
> ---
>  kernel/sched/debug.c | 3 +++
>  kernel/sched/fair.c  | 5 +++++
>  2 files changed, 8 insertions(+)
>
> diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
> index e0fbc0f..efb47ed 100644
> --- a/kernel/sched/debug.c
> +++ b/kernel/sched/debug.c
> @@ -94,8 +94,10 @@ static void print_cfs_group_stats(struct seq_file *m, int cpu, struct task_group
>  	P(se->load.weight);
>  #ifdef CONFIG_SMP
>  	P(se->avg.runnable_avg_sum);
> +	P(se->avg.running_avg_sum);
>  	P(se->avg.avg_period);
>  	P(se->avg.load_avg_contrib);
> +	P(se->avg.utilization_avg_contrib);
>  	P(se->avg.decay_count);
>  #endif
>  #undef PN
> @@ -633,6 +635,7 @@ void proc_sched_show_task(struct task_struct *p, struct seq_file *m)
>  	P(se.avg.running_avg_sum);
>  	P(se.avg.avg_period);
>  	P(se.avg.load_avg_contrib);
> +	P(se.avg.utilization_avg_contrib);
>  	P(se.avg.decay_count);
>  #endif
>  	P(policy);
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index d6de526..d3e9067 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2381,6 +2381,8 @@ static inline u64 __synchronize_entity_decay(struct sched_entity *se)
>  		return 0;
>  
>  	se->avg.load_avg_contrib = decay_load(se->avg.load_avg_contrib, decays);
> +	se->avg.utilization_avg_contrib =
> +			decay_load(se->avg.utilization_avg_contrib, decays);
>  	se->avg.decay_count = 0;
>  
>  	return decays;
> @@ -2525,6 +2527,9 @@ static long __update_entity_utilization_avg_contrib(struct sched_entity *se)
>  
>  	if (entity_is_task(se))
>  		__update_task_entity_utilization(se);
> +	else
> +		se->avg.utilization_avg_contrib =
> +					group_cfs_rq(se)->utilization_load_avg;
>  
>  	return se->avg.utilization_avg_contrib - old_contrib;
>  }
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Vincent Guittot Oct. 8, 2014, 7:16 a.m. UTC | #2
On 7 October 2014 22:15,  <bsegall@google.com> wrote:
> Vincent Guittot <vincent.guittot@linaro.org> writes:
>
>> From: Morten Rasmussen <morten.rasmussen@arm.com>
>>
>> Adds usage contribution tracking for group entities. Unlike
>> se->avg.load_avg_contrib, se->avg.utilization_avg_contrib for group
>> entities is the sum of se->avg.utilization_avg_contrib for all entities on the
>> group runqueue. It is _not_ influenced in any way by the task group
>> h_load. Hence it is representing the actual cpu usage of the group, not
>> its intended load contribution which may differ significantly from the
>> utilization on lightly utilized systems.
>
>
> Just noting that this version also has usage disappear immediately when
> a task blocks, although it does what you probably want on migration.

Yes. For this patchset, we prefer to stay aligned with the current
implementation, which only takes runnable tasks into account, in order
to limit the change in behavior.

IMHO, accounting for blocked tasks in utilization_avg_contrib should be
part of a dedicated patchset with dedicated non-regression tests.
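
The difference between the two behaviors discussed here can be sketched as
a user-space toy model. Everything named toy_* is hypothetical, including
the blocked_utilization field, which is not part of this patch. The decay
factor assumes the usual load-tracking convention that a contribution
halves every 32 periods (y^32 = 0.5); floating point is used only to keep
the sketch short.

```c
#include <assert.h>

#define TOY_Y 0.97857206	/* approximates y, where y^32 = 0.5 */

struct toy_rq {
	double utilization_load_avg;	/* runnable entities only */
	double blocked_utilization;	/* hypothetical, not in this patch */
};

/* Behavior of this series: a blocking task's usage vanishes at once. */
static void toy_dequeue_sleep(struct toy_rq *rq, double contrib)
{
	assert(rq->utilization_load_avg >= contrib);
	rq->utilization_load_avg -= contrib;
}

/* Hypothetical alternative: park the contribution so it can decay. */
static void toy_dequeue_sleep_blocked(struct toy_rq *rq, double contrib)
{
	toy_dequeue_sleep(rq, contrib);
	rq->blocked_utilization += contrib;
}

/* Decay the parked contribution once per elapsed period. */
static void toy_decay_blocked(struct toy_rq *rq, unsigned int periods)
{
	while (periods--)
		rq->blocked_utilization *= TOY_Y;
}

/* Usage as seen by a consumer: runnable plus any parked contribution. */
static double toy_usage(const struct toy_rq *rq)
{
	return rq->utilization_load_avg + rq->blocked_utilization;
}
```

In the first variant the usage drops to zero the moment the task sleeps; in
the second it falls to about half after 32 periods, which is the smoother
behavior a blocked-usage patchset would aim for.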

>
> Also, group-ses don't ever use their running_avg_sum so it's kinda a
> waste, but I'm not sure it's worth doing anything about.
>
>>
Morten Rasmussen Oct. 8, 2014, 11:13 a.m. UTC | #3
On Tue, Oct 07, 2014 at 09:15:39PM +0100, bsegall@google.com wrote:
> Vincent Guittot <vincent.guittot@linaro.org> writes:
> 
> > From: Morten Rasmussen <morten.rasmussen@arm.com>
> >
> > Adds usage contribution tracking for group entities. Unlike
> > se->avg.load_avg_contrib, se->avg.utilization_avg_contrib for group
> > entities is the sum of se->avg.utilization_avg_contrib for all entities on the
> > group runqueue. It is _not_ influenced in any way by the task group
> > h_load. Hence it is representing the actual cpu usage of the group, not
> > its intended load contribution which may differ significantly from the
> > utilization on lightly utilized systems.
> 
> 
> Just noting that this version also has usage disappear immediately when
> a task blocks, although it does what you probably want on migration.

Yes, as was discussed at Ksummit, we should include blocked
usage. It gives a much more stable metric.

> Also, group-ses don't ever use their running_avg_sum so it's kinda a
> waste, but I'm not sure it's worth doing anything about.

Yeah, but since we still have to maintain runnable_avg_sum for group-ses,
there isn't much that can be saved, I think. Maybe we could avoid calling
update_entity_load_avg() from set_next_entity() for group-ses, but I'm
not sure how much it helps.
Patch

diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index e0fbc0f..efb47ed 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -94,8 +94,10 @@  static void print_cfs_group_stats(struct seq_file *m, int cpu, struct task_group
 	P(se->load.weight);
 #ifdef CONFIG_SMP
 	P(se->avg.runnable_avg_sum);
+	P(se->avg.running_avg_sum);
 	P(se->avg.avg_period);
 	P(se->avg.load_avg_contrib);
+	P(se->avg.utilization_avg_contrib);
 	P(se->avg.decay_count);
 #endif
 #undef PN
@@ -633,6 +635,7 @@  void proc_sched_show_task(struct task_struct *p, struct seq_file *m)
 	P(se.avg.running_avg_sum);
 	P(se.avg.avg_period);
 	P(se.avg.load_avg_contrib);
+	P(se.avg.utilization_avg_contrib);
 	P(se.avg.decay_count);
 #endif
 	P(policy);
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d6de526..d3e9067 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -2381,6 +2381,8 @@  static inline u64 __synchronize_entity_decay(struct sched_entity *se)
 		return 0;
 
 	se->avg.load_avg_contrib = decay_load(se->avg.load_avg_contrib, decays);
+	se->avg.utilization_avg_contrib =
+			decay_load(se->avg.utilization_avg_contrib, decays);
 	se->avg.decay_count = 0;
 
 	return decays;
@@ -2525,6 +2527,9 @@  static long __update_entity_utilization_avg_contrib(struct sched_entity *se)
 
 	if (entity_is_task(se))
 		__update_task_entity_utilization(se);
+	else
+		se->avg.utilization_avg_contrib =
+					group_cfs_rq(se)->utilization_load_avg;
 
 	return se->avg.utilization_avg_contrib - old_contrib;
 }