diff mbox series

[v3,2/2] cgroup/rstat: Selftests for niced CPU statistics

Message ID 20240923142006.3592304-3-joshua.hahnjy@gmail.com
State Superseded
Series Exposing nice CPU usage to userspace

Commit Message

Joshua Hahn Sept. 23, 2024, 2:20 p.m. UTC
From: Joshua Hahn <joshua.hahn6@gmail.com>

Create a cgroup with a single niced CPU hog process running.
fork() is used to create the niced process because an unprivileged
process cannot lower its nice value again (see man nice(3)). If fork()
were not used to create the CPU hog, the rest of the cgroup selftest
suite would run as a niced process.

Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
---
 tools/testing/selftests/cgroup/test_cpu.c | 72 +++++++++++++++++++++++
 1 file changed, 72 insertions(+)

Comments

Michal Koutný Sept. 26, 2024, 6:10 p.m. UTC | #1
On Mon, Sep 23, 2024 at 07:20:06AM GMT, Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
> +/*
> + * Creates a nice process that consumes CPU and checks that the elapsed
> + * usertime in the cgroup is close to the expected time.
> + */
> +static int test_cpucg_nice(const char *root)
> +{
> +	int ret = KSFT_FAIL;
> +	int status;
> +	long user_usec, nice_usec;
> +	long usage_seconds = 2;
> +	long expected_nice_usec = usage_seconds * USEC_PER_SEC;
> +	char *cpucg;
> +	pid_t pid;
> +
> +	cpucg = cg_name(root, "cpucg_test");
> +	if (!cpucg)
> +		goto cleanup;
> +
> +	if (cg_create(cpucg))
> +		goto cleanup;
> +
> +	user_usec = cg_read_key_long(cpucg, "cpu.stat", "user_usec");
> +	nice_usec = cg_read_key_long(cpucg, "cpu.stat", "nice_usec");
> +	if (user_usec != 0 || nice_usec != 0)
> +		goto cleanup;

Can you please make the check distinguish between a non-zero nice_usec
and a non-existent nice_usec (KSFT_FAIL vs KSFT_SKIP)? That way the
selftest is usable on older kernels too.
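
A minimal sketch of that distinction, assuming the lookup reports a missing key as -1. The helper stat_key_long(), the enum precheck, and check_fresh_stats() are hypothetical names used only for illustration; they parse a cpu.stat-style buffer directly rather than calling the selftest utilities.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: find "key value" at the start of a line in
 * cpu.stat-style text. Returns the value, or -1 if the key is absent.
 */
static long stat_key_long(const char *buf, const char *key)
{
	size_t len = strlen(key);
	const char *line = buf;

	while (line && *line) {
		if (!strncmp(line, key, len) && line[len] == ' ')
			return atol(line + len + 1);
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Illustrative stand-ins for KSFT_PASS / KSFT_SKIP / KSFT_FAIL. */
enum precheck { PRECHECK_OK, PRECHECK_SKIP, PRECHECK_FAIL };

/* Classify the stats of a freshly created, still empty cgroup. */
static enum precheck check_fresh_stats(const char *cpu_stat)
{
	long user_usec = stat_key_long(cpu_stat, "user_usec");
	long nice_usec = stat_key_long(cpu_stat, "nice_usec");

	if (nice_usec < 0)
		return PRECHECK_SKIP;	/* key missing: kernel lacks nice_usec */
	if (user_usec != 0 || nice_usec != 0)
		return PRECHECK_FAIL;	/* key present but already non-zero */
	return PRECHECK_OK;
}
```

The missing-key case is tested first so that an older kernel is reported as a skip instead of falling through to the non-zero check.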

> +
> +	/*
> +	 * We fork here to create a new process that can be niced without
> +	 * polluting the nice value of other selftests
> +	 */
> +	pid = fork();
> +	if (pid < 0) {
> +		goto cleanup;
> +	} else if (pid == 0) {
> +		struct cpu_hog_func_param param = {
> +			.nprocs = 1,
> +			.ts = {
> +				.tv_sec = usage_seconds,
> +				.tv_nsec = 0,
> +			},
> +			.clock_type = CPU_HOG_CLOCK_PROCESS,
> +		};
> +
> +		/* Try to keep niced CPU usage as constrained to hog_cpu as possible */
> +		nice(1);
> +		cg_run(cpucg, hog_cpus_timed, (void *)&param);

Notice that cg_run() does fork itself internally.
So you can call hog_cpus_timed(cpucg, (void *)&param) directly, no
need for the fork with cg_run(). (Alternatively substitute fork in this
test with the fork in cg_run() but with extension of cpu_hog_func_param
with the nice value.)


Thanks,
Michal
Joshua Hahn Sept. 30, 2024, 6:07 p.m. UTC | #2
On Thu, Sep 26, 2024 at 2:10 PM Michal Koutný <mkoutny@suse.com> wrote:
>
> On Mon, Sep 23, 2024 at 07:20:06AM GMT, Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
> > +/*
> > + * Creates a nice process that consumes CPU and checks that the elapsed
> > + * usertime in the cgroup is close to the expected time.
> > + */
> > +     user_usec = cg_read_key_long(cpucg, "cpu.stat", "user_usec");
> > +     nice_usec = cg_read_key_long(cpucg, "cpu.stat", "nice_usec");
> > +     if (user_usec != 0 || nice_usec != 0)
> > +             goto cleanup;
>
> Can you please make the check distinguish between a non-zero nice_usec
> and a non-existent nice_usec (KSFT_FAIL vs KSFT_SKIP)? That way the
> selftest is usable on older kernels too.

Yes, this sounds good to me -- I will include it in a v4, which I am
hoping to send out soon.

> > +
> > +     /*
> > +      * We fork here to create a new process that can be niced without
> > +      * polluting the nice value of other selftests
> > +      */
> > +     pid = fork();
> > +     if (pid < 0) {
> > +             goto cleanup;
> > +     } else if (pid == 0) {
> > +             struct cpu_hog_func_param param = {
> > +                     .nprocs = 1,
> > +                     .ts = {
> > +                             .tv_sec = usage_seconds,
> > +                             .tv_nsec = 0,
> > +                     },
> > +                     .clock_type = CPU_HOG_CLOCK_PROCESS,
> > +             };
> > +
> > +             /* Try to keep niced CPU usage as constrained to hog_cpu as possible */
> > +             nice(1);
> > +             cg_run(cpucg, hog_cpus_timed, (void *)&param);
>
> Notice that cg_run() does fork itself internally.
> So you can call hog_cpus_timed(cpucg, (void *)&param) directly, no
> need for the fork with cg_run(). (Alternatively substitute fork in this
> test with the fork in cg_run() but with extension of cpu_hog_func_param
> with the nice value.)
>
> Thanks,
> Michal

Thank you for your feedback, Michal.
The reason I used a fork in the testing is so that I could isolate the niced
portion of the test to only the CPU hog. If I were to nice(1) --> cg_hog()
in a single process without forking, this would mean that the cleanup portion
of the test would also be run as a niced process, contributing to the stat and
potentially dirtying the value (which is tested for accuracy via
`values_close`).
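
To make the accuracy concern concrete, here is a sketch of a percentage-tolerance comparison. It assumes the check compares the absolute difference against a percentage of the sum of both values; values_close_sketch() is a hypothetical stand-in, and the real values_close helper may differ in detail.

```c
#include <stdlib.h>

/*
 * Hypothetical tolerance check: true when a and b differ by at most
 * err_pct percent of their sum. The formula is an assumption for
 * illustration, not a copy of the selftest helper.
 */
static int values_close_sketch(long a, long b, int err_pct)
{
	return labs(a - b) <= (a + b) / 100 * err_pct;
}
```

Under that assumed formula, a 2-second target with a tolerance of 1 leaves roughly 40 ms of slack, so stray niced CPU time from other parts of the test could plausibly push the value out of range.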

The other thing that I considered when writing this was that while it is
possible to make a process nicer, it is impossible to make a process less
nice. This would mean that the comparison & cleanup portions would also be
run nicely if I do not call fork().
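
The isolation argument can be sketched as follows: nice() is called only in a forked child, and the parent verifies that its own nice value is unchanged afterwards. The function name nice_in_child_only() is hypothetical, and the CPU hog itself is left out.

```c
#include <errno.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that raises its own nice value, then check that the
 * parent's nice value did not change.
 * Returns 0 if unchanged, 1 if it changed, -1 on error.
 */
static int nice_in_child_only(int increment)
{
	int before, after, status;
	pid_t pid;

	/* getpriority() may legitimately return -1, so check errno too. */
	errno = 0;
	before = getpriority(PRIO_PROCESS, 0);
	if (before == -1 && errno)
		return -1;

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Same caveat for nice(): -1 can be a valid new nice value. */
		errno = 0;
		if (nice(increment) == -1 && errno)
			_exit(1);
		/* The niced CPU hog would run here. */
		_exit(0);
	}

	if (waitpid(pid, &status, 0) != pid)
		return -1;
	if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
		return -1;

	errno = 0;
	after = getpriority(PRIO_PROCESS, 0);
	if (after == -1 && errno)
		return -1;

	return after == before ? 0 : 1;
}
```

Because the nice value is a per-process attribute, the comparison and cleanup that follow in the parent run at the original priority.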

What do you think? Do you think that this increase in granularity /
accuracy is worth the increase in code complexity? I do agree that it
would be much easier to read if there was no fork.

Alternatively, I can add a new parameter to cpu_hog_func_param that
takes in a nice value. For this however, I am afraid of changing the
function signature of existing utility functions, since it would mean
breaking support for older functions or others currently working on this.

Thank you again for your detailed feedback -- I will also fix the
diffstat and indentation issues you brought up in the first part of the patch.

Joshua
Michal Koutný Oct. 1, 2024, 12:56 p.m. UTC | #3
On Mon, Sep 30, 2024 at 02:07:22PM GMT, Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
> The reason I used a fork in the testing is so that I could isolate the niced
> portion of the test to only the CPU hog. If I were to nice(1) --> cg_hog()
> in a single process without forking, this would mean that the cleanup portion
> of the test would also be run as a niced process,

The cleanup runs in a parent process and nice is called after fork in a
child in those considered cases (at least that's what I meant).

> contributing to the stat and potentially dirtying the value (which is
> tested for accuracy via `values_close`).

Yes, a test that randomly fails (false negative) is a nuisance. One fork
is needed; the second one does not separate tasks of different priority.

> What do you think?

My motivation comes from debugging cgroup selftests, where strace is quite
useful; your implementation adds an unnecessary fork, which makes the
strace output (slightly) less readable.

> Do you think that this increase in granularity / accuracy is worth the
> increase in code complexity? I do agree that it would be much easier
> to read if there was no fork.

I think both options (dropping cg_run, or extending cpu_hog_func_param)
could be reasonably small changes (existing users of cpu_hog_func_param
would default to a nice value of zero, so the actual change would only be
in hog_cpus_timed()).
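
The "default to zero" point relies on C zero-initializing any member omitted from a designated initializer. A sketch under stated assumptions: struct hog_param_sketch mirrors the fields visible in the patch, while the nice member, the enum, and both constructor functions are hypothetical.

```c
#include <time.h>

/* Illustrative stand-in for the selftest's clock type enum. */
enum hog_clock_sketch { HOG_CLOCK_PROCESS_SKETCH, HOG_CLOCK_WALL_SKETCH };

/* Hypothetical extended parameter struct with a new nice member. */
struct hog_param_sketch {
	int nprocs;
	struct timespec ts;
	enum hog_clock_sketch clock_type;
	int nice;	/* assumed new field; 0 means "do not renice" */
};

/* An existing-style initializer that never mentions .nice. */
static struct hog_param_sketch legacy_param(long seconds)
{
	struct hog_param_sketch param = {
		.nprocs = 1,
		.ts = { .tv_sec = seconds, .tv_nsec = 0 },
		.clock_type = HOG_CLOCK_PROCESS_SKETCH,
	};

	return param;	/* param.nice is implicitly zero */
}

/* A new caller that opts in to a nice value. */
static struct hog_param_sketch niced_param(long seconds, int nice_val)
{
	struct hog_param_sketch param = legacy_param(seconds);

	param.nice = nice_val;
	return param;
}
```

Existing callers would compile unchanged, and only the hog function would need to look at the new member.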

> Alternatively, I can add a new parameter to cpu_hog_func_param that
> takes in a nice value. For this however, I am afraid of changing the
> function signature of existing utility functions, since it would mean
> breaking support for older functions or others currently working on this.

The function is internal to the cgroup selftests and others can rebase,
so it doesn't have to stick to a particular signature.

HTH,
Michal
Joshua Hahn Oct. 1, 2024, 5:50 p.m. UTC | #4
> My motivation comes from debugging cgroup selftests, where strace is quite
> useful; your implementation adds an unnecessary fork, which makes the
> strace output (slightly) less readable.

This makes sense, thank you for the context. I hadn't given debugging
much thought, but I can imagine that it gets harder to read once the
code and the strace output become cluttered.

> > Do you think that this increase in granularity / accuracy is worth the
> > increase in code complexity? I do agree that it would be much easier
> > to read if there was no fork.
>
> I think both options (dropping cg_run, or extending cpu_hog_func_param)
> could be reasonably small changes (existing users of cpu_hog_func_param
> would default to a nice value of zero, so the actual change would only be
> in hog_cpus_timed()).

I think I will stick with the no cg_run option. Initially, I had wanted to
use it to keep the same style as the other selftests in test_cpu.c, but I
think it makes the code needlessly harder to read.

Thank you again,
Joshua

Patch

diff --git a/tools/testing/selftests/cgroup/test_cpu.c b/tools/testing/selftests/cgroup/test_cpu.c
index dad2ed82f3ef..cd5550391f49 100644
--- a/tools/testing/selftests/cgroup/test_cpu.c
+++ b/tools/testing/selftests/cgroup/test_cpu.c
@@ -8,6 +8,7 @@ 
 #include <pthread.h>
 #include <stdio.h>
 #include <time.h>
+#include <unistd.h>
 
 #include "../kselftest.h"
 #include "cgroup_util.h"
@@ -229,6 +230,76 @@  static int test_cpucg_stats(const char *root)
 	return ret;
 }
 
+/*
+ * Creates a nice process that consumes CPU and checks that the elapsed
+ * usertime in the cgroup is close to the expected time.
+ */
+static int test_cpucg_nice(const char *root)
+{
+	int ret = KSFT_FAIL;
+	int status;
+	long user_usec, nice_usec;
+	long usage_seconds = 2;
+	long expected_nice_usec = usage_seconds * USEC_PER_SEC;
+	char *cpucg;
+	pid_t pid;
+
+	cpucg = cg_name(root, "cpucg_test");
+	if (!cpucg)
+		goto cleanup;
+
+	if (cg_create(cpucg))
+		goto cleanup;
+
+	user_usec = cg_read_key_long(cpucg, "cpu.stat", "user_usec");
+	nice_usec = cg_read_key_long(cpucg, "cpu.stat", "nice_usec");
+	if (user_usec != 0 || nice_usec != 0)
+		goto cleanup;
+
+	/*
+	 * We fork here to create a new process that can be niced without
+	 * polluting the nice value of other selftests
+	 */
+	pid = fork();
+	if (pid < 0) {
+		goto cleanup;
+	} else if (pid == 0) {
+		struct cpu_hog_func_param param = {
+			.nprocs = 1,
+			.ts = {
+				.tv_sec = usage_seconds,
+				.tv_nsec = 0,
+			},
+			.clock_type = CPU_HOG_CLOCK_PROCESS,
+		};
+
+		/* Try to keep niced CPU usage as constrained to hog_cpu as possible */
+		nice(1);
+		cg_run(cpucg, hog_cpus_timed, (void *)&param);
+		exit(0);
+	} else {
+		waitpid(pid, &status, 0);
+		if (!WIFEXITED(status))
+			goto cleanup;
+
+		user_usec = cg_read_key_long(cpucg, "cpu.stat", "user_usec");
+		nice_usec = cg_read_key_long(cpucg, "cpu.stat", "nice_usec");
+		if (nice_usec > user_usec || user_usec <= 0)
+			goto cleanup;
+
+		if (!values_close(nice_usec, expected_nice_usec, 1))
+			goto cleanup;
+
+		ret = KSFT_PASS;
+	}
+
+cleanup:
+	cg_destroy(cpucg);
+	free(cpucg);
+
+	return ret;
+}
+
 static int
 run_cpucg_weight_test(
 		const char *root,
@@ -686,6 +757,7 @@  struct cpucg_test {
 } tests[] = {
 	T(test_cpucg_subtree_control),
 	T(test_cpucg_stats),
+	T(test_cpucg_nice),
 	T(test_cpucg_weight_overprovisioned),
 	T(test_cpucg_weight_underprovisioned),
 	T(test_cpucg_nested_weight_overprovisioned),