[v3,0/2] Exposing nice CPU usage to userspace

Message ID 20240923142006.3592304-1-joshua.hahnjy@gmail.com

Message

Joshua Hahn Sept. 23, 2024, 2:20 p.m. UTC
From: Joshua Hahn <joshua.hahn6@gmail.com>

v2 -> v3: Signed-off-by & renamed subject for clarity.
v1 -> v2: Edited commit messages for clarity.

Niced CPU usage is a metric reported in host-level /proc/stat, but is
not reported in cgroup-level statistics in cpu.stat. However, when a
host contains multiple tasks across different workloads, it becomes
difficult to gauge how much CPU time each workload is spending on niced
processes based on /proc/stat alone, since host-level metrics do not
provide this cgroup-level granularity.

Exposing this metric will allow users to accurately probe the niced CPU
metric for each workload, and make more informed decisions when
directing higher priority tasks.
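
As a rough illustration of probing this per workload, the sketch below
computes the niced share of a cgroup's CPU time between two samples of
its cpu.stat counters. The key name nice_usec is an assumption (this
cover letter does not name the new field), as is the idea that niced
time is counted within usage_usec; the sample numbers are made up.

```python
def niced_share(before, after):
    """Fraction of the CPU time used between two samples that was niced.

    Both arguments are dicts of cumulative cpu.stat counters, in microseconds.
    "nice_usec" is an assumed key name, not one confirmed by this thread.
    """
    total = after["usage_usec"] - before["usage_usec"]
    niced = after["nice_usec"] - before["nice_usec"]
    if total <= 0:
        # The cgroup used no CPU time in the interval.
        return 0.0
    return niced / total


# Made-up samples for one workload's cgroup, taken some interval apart.
before = {"usage_usec": 1_000_000, "nice_usec": 200_000}
after = {"usage_usec": 3_000_000, "nice_usec": 700_000}

share = niced_share(before, after)
print(f"{share:.0%}")  # 25%
```

Working on deltas rather than raw counters matters because the
cpu.stat values are cumulative since the cgroup was created.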

Joshua Hahn (2):
  Tracking cgroup-level niced CPU time
  Selftests for niced CPU statistics

 include/linux/cgroup-defs.h               |  1 +
 kernel/cgroup/rstat.c                     | 16 ++++-
 tools/testing/selftests/cgroup/test_cpu.c | 72 +++++++++++++++++++++++
 3 files changed, 86 insertions(+), 3 deletions(-)

Comments

Michal Koutný Sept. 26, 2024, 6:10 p.m. UTC | #1
On Mon, Sep 23, 2024 at 07:20:04AM GMT, Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
> From: Joshua Hahn <joshua.hahn6@gmail.com>
> 
> v2 -> v3: Signed-off-by & renamed subject for clarity.
> v1 -> v2: Edited commit messages for clarity.

Thanks for the version changelog, appreciated!

...
> Exposing this metric will allow users to accurately probe the niced CPU
> metric for each workload, and make more informed decisions when
> directing higher priority tasks.

Possibly an example of how this value (combined with some other?) is
used for decisions could shed some light on this and justify adding this
attribute.

Thanks,
Michal

(I'll respond here to Tejun's message from v2 thread.)

On Tue, Sep 10, 2024 at 11:01:07AM GMT, Tejun Heo <tj@kernel.org> wrote:
> I think it's as useful as system-wide nice metric is.

Exactly -- and I don't understand how that system-wide value (without
any cgroups) is useful.
If I don't know how many niced and non-niced tasks there are and what
their runnable patterns are, the aggregated nice time can have ambiguous
interpretations.

> I think there are benefits to mirroring system wide metrics, at least
> ones as widely spread as nice.

I agree with benefits of mirroring of some system wide metrics when they
are useful <del>but not all of them because it's difficult/impossible to take
them away once they're exposed</del>. Actually, readers _should_ handle
missing keys gracefully, so this may be just fine.
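
A minimal sketch of what such a tolerant reader could look like,
assuming the flat "key value" layout of cpu.stat. The helper name
parse_cpu_stat and the key nice_usec are illustrative assumptions, not
taken from the patch.

```python
def parse_cpu_stat(text):
    """Parse flat-keyed "key value" lines into a dict of ints.

    Lines that do not match the expected shape are skipped, so the
    reader keeps working if keys are added or removed.
    """
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        key, value = parts
        try:
            stats[key] = int(value)
        except ValueError:
            continue
    return stats


# Contents as a kernel without the new field might report them.
without_nice = "usage_usec 300\nuser_usec 200\nsystem_usec 100\n"
# The same with the assumed (hypothetical) nice_usec key present.
with_nice = without_nice + "nice_usec 50\n"

# .get() returns None for an absent key instead of raising KeyError.
print(parse_cpu_stat(without_nice).get("nice_usec"))  # None
print(parse_cpu_stat(with_nice).get("nice_usec"))     # 50
```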

(Is this nice time widely spread? (I remember the field from `top`, still
not sure how to use it.) Are other proc_stat(5) fields different?

I see how this can be the global analog on leaf cgroups but
interpreting middle cgroups with children of different cpu.weights?)