diff mbox series

kunit: Taint kernel if any tests run

Message ID 20220429043913.626647-1-davidgow@google.com
State New
Headers show
Series kunit: Taint kernel if any tests run | expand

Commit Message

David Gow April 29, 2022, 4:39 a.m. UTC
KUnit tests are not supposed to run on production systems: they may do
deliberately illegal things to trigger errors, and have security
implications (assertions will often deliberately leak kernel addresses).

Add a new taint type, TAINT_KUNIT to signal that a KUnit test has been
run. This will be printed as 'N' (for kuNit, as K, U and T were already
taken).

This should discourage people from running KUnit tests on production
systems, and to make it easier to tell if tests have been run
accidentally (by loading the wrong configuration, etc.)

Signed-off-by: David Gow <davidgow@google.com>
---

This is something I'd been thinking about for a while, and it came up
again, so I'm finally giving it a go.

Two notes:
- I decided to add a new type of taint, as none of the existing ones
  really seemed to fit. We could live with considering KUnit tests as
  TAINT_WARN or TAINT_CRAP or something otherwise, but neither are quite
  right.
- The taint_flags table gives a couple of checkpatch.pl errors around
  bracket placement. I've kept the new entry consistent with what's
  there rather than reformatting the whole table, but be prepared for
  complaints about spaces.

Thoughts?
-- David

---
 Documentation/admin-guide/tainted-kernels.rst | 1 +
 include/linux/panic.h                         | 3 ++-
 kernel/panic.c                                | 1 +
 lib/kunit/test.c                              | 4 ++++
 4 files changed, 8 insertions(+), 1 deletion(-)

Comments

Luis Chamberlain May 13, 2022, 3:35 p.m. UTC | #1
On Fri, May 13, 2022 at 04:32:11PM +0800, David Gow wrote:
> Most in-kernel tests (such as KUnit tests) are not supposed to run on
> production systems: they may do deliberately illegal things to trigger
> errors, and have security implications (for example, KUnit assertions
> will often deliberately leak kernel addresses).
> 
> Add a new taint type, TAINT_TEST to signal that a test has been run.
> This will be printed as 'N' (originally for kuNit, as every other
> sensible letter was taken.)
> 
> This should discourage people from running these tests on production
> systems, and to make it easier to tell if tests have been run
> accidentally (by loading the wrong configuration, etc.)
> 
> Signed-off-by: David Gow <davidgow@google.com>
> ---
> 
> Updated this to handle the most common case of selftest modules, in
> addition to KUnit tests. There's room for other tests or test frameworks
> to use this as well, either with a call to add_taint() from within the
> kernel, or by writing to /proc/sys/kernel/tainted.
> 
> The 'N' character for the taint is even less useful now that it's no
> longer short for kuNit, but all the letters in TEST are taken. :-(
> 
> Changes since v2:
> https://lore.kernel.org/linux-kselftest/20220430030019.803481-1-davidgow@google.com/
> - Rename TAINT_KUNIT -> TAINT_TEST.
> - Split into separate patches for adding the taint, and triggering it.
> - Taint on a kselftest_module being loaded (patch 3/3)
> 
> Changes since v1:
> https://lore.kernel.org/linux-kselftest/20220429043913.626647-1-davidgow@google.com/
> - Make the taint per-module, to handle the case when tests are in
>   (longer lasting) modules. (Thanks Greg KH).
> 
> Note that this still has checkpatch.pl warnings around bracket
> placement, which are intentional as part of matching the surrounding
> code.
> 
> ---
>  Documentation/admin-guide/tainted-kernels.rst | 1 +
>  include/linux/panic.h                         | 3 ++-
>  kernel/panic.c                                | 1 +
>  3 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/admin-guide/tainted-kernels.rst b/Documentation/admin-guide/tainted-kernels.rst
> index ceeed7b0798d..546f3071940d 100644
> --- a/Documentation/admin-guide/tainted-kernels.rst
> +++ b/Documentation/admin-guide/tainted-kernels.rst
> @@ -100,6 +100,7 @@ Bit  Log  Number  Reason that got the kernel tainted
>   15  _/K   32768  kernel has been live patched
>   16  _/X   65536  auxiliary taint, defined for and used by distros
>   17  _/T  131072  kernel was built with the struct randomization plugin
> + 18  _/N  262144  an in-kernel test (such as a KUnit test) has been run

I think mentioning just kunit fuzzes its interpretation here.
Best to keep that out.

Other than that:

Acked-by: Luis Chamberlain <mcgrof@kernel.org>

  Luis
Luis Chamberlain May 13, 2022, 3:38 p.m. UTC | #2
On Fri, May 13, 2022 at 04:32:13PM +0800, David Gow wrote:
> Make any kselftest test module (using the kselftest_module framework)
> taint the kernel with TAINT_TEST on module load.
> 
> Note that several selftests use kernel modules which are not based on
> the kselftest_module framework, and so will not automatically taint the
> kernel. These modules will have to be manually modified if they should
> taint the kernel this way.
> 
> Similarly, selftests which do not load modules into the kernel generally
> should not taint the kernel (or possibly should only do so on failure),
> as it's assumed that testing from user-space should be safe. Regardless,
> they can write to /proc/sys/kernel/tainted if required.
> 
> Signed-off-by: David Gow <davidgow@google.com>

Not all selftest modules use KSTM_MODULE_LOADERS() so I'd like to see a
modpost target as well, otherwise this just covers a sliver of
selftests.

  Luis
Daniel Latypov May 13, 2022, 7:08 p.m. UTC | #3
On Fri, May 13, 2022 at 1:32 AM David Gow <davidgow@google.com> wrote:
>
> Make KUnit trigger the new TAINT_TEST taint when any KUnit test is run.
> Due to KUnit tests not being intended to run on production systems, and
> potentially causing problems (or security issues like leaking kernel
> addresses), the kernel's state should not be considered safe for
> production use after KUnit tests are run.
>
> Signed-off-by: David Gow <davidgow@google.com>

Tested-by: Daniel Latypov <dlatypov@google.com>

Looks good to me.

There's an edge case where we might have 0 suites or 0 tests and we
still taint the kernel, but I don't think we need to deal with that.
At the start of kunit_run_tests() is the cleanest place to do this.

I wasn't quite sure where this applied, but I manually applied the changes here.
Without this patch, this command exits fine:
$ ./tools/testing/kunit/kunit.py run --kernel_args=panic_on_taint=0x40000

With it, I get
[12:03:31] Kernel panic - not syncing: panic_on_taint set ...
[12:03:31] CPU: 0 PID: 1 Comm: swapper Tainted: G                 N
5.17.0-00001-gea9ee5e7aed8-dirty #60

I'm a bit surprised that it prints 'G' and not 'N', but this does seem
to be the right mask
$ python3 -c 'print(hex(1<<18))'
0x40000
and it only takes effect when this patch is applied.
I'll chalk that up to my ignorance of how taint works.
David Gow May 14, 2022, 3:04 a.m. UTC | #4
On Sat, May 14, 2022 at 3:09 AM Daniel Latypov <dlatypov@google.com> wrote:
>
> On Fri, May 13, 2022 at 1:32 AM David Gow <davidgow@google.com> wrote:
> >
> > Make KUnit trigger the new TAINT_TEST taint when any KUnit test is run.
> > Due to KUnit tests not being intended to run on production systems, and
> > potentially causing problems (or security issues like leaking kernel
> > addresses), the kernel's state should not be considered safe for
> > production use after KUnit tests are run.
> >
> > Signed-off-by: David Gow <davidgow@google.com>
>
> Tested-by: Daniel Latypov <dlatypov@google.com>
>
> Looks good to me.
>
> There's an edge case where we might have 0 suites or 0 tests and we
> still taint the kernel, but I don't think we need to deal with that.
> At the start of kunit_run_tests() is the cleanest place to do this.

Hmm... thinking about it, I think it might be worth not tainting if 0
suites run, but tainting if 0 tests run.

If we taint even if there are no suites present, that'll make things
awkward for the "build KUnit in, but not any tests" case: the kernel
would be tainted regardless. Given Android might be having the KUnit
execution stuff built-in (but using modules for tests), it's probably
worth not tainting there. (Though I think they have a separate way of
disabling KUnit as well, so it's probably not a complete
deal-breaker).

The case of having suites but no tests should still taint the kernel,
as suite_init functions could still run.

Assuming that seems sensible, I'll send out a v4 with that changed.

> I wasn't quite sure where this applied, but I manually applied the changes here.
> Without this patch, this command exits fine:
> $ ./tools/testing/kunit/kunit.py run --kernel_args=panic_on_taint=0x40000
>
> With it, I get
> [12:03:31] Kernel panic - not syncing: panic_on_taint set ...
> [12:03:31] CPU: 0 PID: 1 Comm: swapper Tainted: G                 N

This is showing both 'G' and 'N' ('G' being the character for GPL --
i.e. the kernel is not tainted by proprietary modules: 'P').

Jani did suggest a better way of printing these in the v1 discussion
(printing the actual names of taints present), which I might do in a
follow-up.

> 5.17.0-00001-gea9ee5e7aed8-dirty #60
>
> I'm a bit surprised that it prints 'G' and not 'N', but this does seem
> to be the right mask
> $ python3 -c 'print(hex(1<<18))'
> 0x40000
> and it only takes effect when this patch is applied.
> I'll chalk that up to my ignorance of how taint works.

-- David
David Gow May 14, 2022, 8:34 a.m. UTC | #5
On Fri, May 13, 2022 at 11:38 PM Luis Chamberlain <mcgrof@kernel.org> wrote:
>
> On Fri, May 13, 2022 at 04:32:13PM +0800, David Gow wrote:
> > Make any kselftest test module (using the kselftest_module framework)
> > taint the kernel with TAINT_TEST on module load.
> >
> > Note that several selftests use kernel modules which are not based on
> > the kselftest_module framework, and so will not automatically taint the
> > kernel. These modules will have to be manually modified if they should
> > taint the kernel this way.
> >
> > Similarly, selftests which do not load modules into the kernel generally
> > should not taint the kernel (or possibly should only do so on failure),
> > as it's assumed that testing from user-space should be safe. Regardless,
> > they can write to /proc/sys/kernel/tainted if required.
> >
> > Signed-off-by: David Gow <davidgow@google.com>
>
> Not all selftest modules use KSTM_MODULE_LOADERS() so I'd like to see a
> modpost target as well, otherwise this just covers a sliver of
> selftests.
>

My personal feeling is that the ideal way of solving this is actually
to port those modules which aren't using KSTM_MODULE_LOADERS() (or
KUnit, or some other system) to do so, or to otherwise manually tag
them as selftests and/or make them taint the kernel.

That being said, we can gain a bit my making the module-loading
helpers in kselftest/module.sh manually taint the kernel with
/proc/sys/kernel/tainted, which will catch quite a few of them (even
if tainting from userspace before they're loaded is suboptimal).

I've also started experimenting with a "test" MODULE_INFO field, which
modpost would add with the -t option. That still requires sprinkling
MODULE_INFO() everwhere, or the '-t' option to a bunch of makefiles,
or doing something more drastic to set it automatically for modules in
a given directory / makefile. Or the staging thing of checking the
directory / prefix in modpost.

I'll play around some more and have something to show in v4. (If we
have a MODULE_INFO field, we should use it for KUnit modules, but we'd
still have to taint the kernel manually for built-in tests anyway, so
it'd be redundant...)

-- David
Daniel Latypov May 14, 2022, 7:25 p.m. UTC | #6
On Fri, May 13, 2022 at 8:04 PM David Gow <davidgow@google.com> wrote:
> Hmm... thinking about it, I think it might be worth not tainting if 0
> suites run, but tainting if 0 tests run.
>
> If we taint even if there are no suites present, that'll make things
> awkward for the "build KUnit in, but not any tests" case: the kernel
> would be tainted regardless. Given Android might be having the KUnit

Actually, this is what the code does right now. I was wrong.
If there are 0 suites => not tainted.
If there are 0 tests in the suites => tainted.

For kunit being built in, it first goes through this func
   206  static void kunit_exec_run_tests(struct suite_set *suite_set)
   207  {
   208          struct kunit_suite * const * const *suites;
   209
   210          kunit_print_tap_header(suite_set);
   211
   212          for (suites = suite_set->start; suites <
suite_set->end; suites++)
   213                  __kunit_test_suites_init(*suites);
   214  }

So for the "build KUnit in, but not any tests" case, you'll never
enter that for-loop.
Thus you'll never call __kunit_test_suites_init() => kunit_run_tests().

For module-based tests, we have the same behavior.
If there's 0 test suites, we never enter the second loop, so we never taint.
But if there's >0 suites, then we will, regardless of the # of test cases.

   570  int __kunit_test_suites_init(struct kunit_suite * const * const suites)
   571  {
   572          unsigned int i;
   573
   574          for (i = 0; suites[i] != NULL; i++) {
   575                  kunit_init_suite(suites[i]);
   576                  kunit_run_tests(suites[i]);
   577          }
   578          return 0;
   579  }

So this change should already work as intended.

> execution stuff built-in (but using modules for tests), it's probably
> worth not tainting there. (Though I think they have a separate way of
> disabling KUnit as well, so it's probably not a complete
> deal-breaker).
>
> The case of having suites but no tests should still taint the kernel,
> as suite_init functions could still run.

Yes, suite_init functions are the concern. I agree we should taint if
there are >0 suites but 0 test cases.

I don't think it's worth trying to be fancy and tainting iff there >0
test cases or a suite_init/exit function ran.

>
> Assuming that seems sensible, I'll send out a v4 with that changed.

Just to be clear: that shouldn't be necessary.

>
> > I wasn't quite sure where this applied, but I manually applied the changes here.
> > Without this patch, this command exits fine:
> > $ ./tools/testing/kunit/kunit.py run --kernel_args=panic_on_taint=0x40000
> >
> > With it, I get
> > [12:03:31] Kernel panic - not syncing: panic_on_taint set ...
> > [12:03:31] CPU: 0 PID: 1 Comm: swapper Tainted: G                 N
>
> This is showing both 'G' and 'N' ('G' being the character for GPL --

I just somehow missed the fact there was an 'N' at the end there.
Thanks, I thought I was going crazy. I guess I was just going blind.


Daniel
Brendan Higgins May 17, 2022, 8:58 p.m. UTC | #7
On Sat, May 14, 2022 at 3:25 PM Daniel Latypov <dlatypov@google.com> wrote:
>
> On Fri, May 13, 2022 at 8:04 PM David Gow <davidgow@google.com> wrote:
> > Hmm... thinking about it, I think it might be worth not tainting if 0
> > suites run, but tainting if 0 tests run.
> >
> > If we taint even if there are no suites present, that'll make things
> > awkward for the "build KUnit in, but not any tests" case: the kernel
> > would be tainted regardless. Given Android might be having the KUnit
>
> Actually, this is what the code does right now. I was wrong.
> If there are 0 suites => not tainted.
> If there are 0 tests in the suites => tainted.
>
> For kunit being built in, it first goes through this func
>    206  static void kunit_exec_run_tests(struct suite_set *suite_set)
>    207  {
>    208          struct kunit_suite * const * const *suites;
>    209
>    210          kunit_print_tap_header(suite_set);
>    211
>    212          for (suites = suite_set->start; suites <
> suite_set->end; suites++)
>    213                  __kunit_test_suites_init(*suites);
>    214  }
>
> So for the "build KUnit in, but not any tests" case, you'll never
> enter that for-loop.
> Thus you'll never call __kunit_test_suites_init() => kunit_run_tests().
>
> For module-based tests, we have the same behavior.
> If there's 0 test suites, we never enter the second loop, so we never taint.
> But if there's >0 suites, then we will, regardless of the # of test cases.
>
>    570  int __kunit_test_suites_init(struct kunit_suite * const * const suites)
>    571  {
>    572          unsigned int i;
>    573
>    574          for (i = 0; suites[i] != NULL; i++) {
>    575                  kunit_init_suite(suites[i]);
>    576                  kunit_run_tests(suites[i]);
>    577          }
>    578          return 0;
>    579  }
>
> So this change should already work as intended.
>
> > execution stuff built-in (but using modules for tests), it's probably
> > worth not tainting there. (Though I think they have a separate way of
> > disabling KUnit as well, so it's probably not a complete
> > deal-breaker).
> >
> > The case of having suites but no tests should still taint the kernel,
> > as suite_init functions could still run.
>
> Yes, suite_init functions are the concern. I agree we should taint if
> there are >0 suites but 0 test cases.
>
> I don't think it's worth trying to be fancy and tainting iff there >0
> test cases or a suite_init/exit function ran.
>
> >
> > Assuming that seems sensible, I'll send out a v4 with that changed.
>
> Just to be clear: that shouldn't be necessary.

Agreed. I think the current behavior is acceptable, and should be
unobtrusive to Android: Joe has a patch that introduces a kernel param
which disables running KUnit tests at the suite level which would
happen before this taint occurs. So the only way the taint happens is
if we actually try to execute some test cases (whether or not the
cases actually run).

> > > I wasn't quite sure where this applied, but I manually applied the changes here.
> > > Without this patch, this command exits fine:
> > > $ ./tools/testing/kunit/kunit.py run --kernel_args=panic_on_taint=0x40000
> > >
> > > With it, I get
> > > [12:03:31] Kernel panic - not syncing: panic_on_taint set ...
> > > [12:03:31] CPU: 0 PID: 1 Comm: swapper Tainted: G                 N
> >
> > This is showing both 'G' and 'N' ('G' being the character for GPL --
>
> I just somehow missed the fact there was an 'N' at the end there.
> Thanks, I thought I was going crazy. I guess I was just going blind.
>
>
> Daniel
diff mbox series

Patch

diff --git a/Documentation/admin-guide/tainted-kernels.rst b/Documentation/admin-guide/tainted-kernels.rst
index ceeed7b0798d..8f18fc4659d4 100644
--- a/Documentation/admin-guide/tainted-kernels.rst
+++ b/Documentation/admin-guide/tainted-kernels.rst
@@ -100,6 +100,7 @@  Bit  Log  Number  Reason that got the kernel tainted
  15  _/K   32768  kernel has been live patched
  16  _/X   65536  auxiliary taint, defined for and used by distros
  17  _/T  131072  kernel was built with the struct randomization plugin
+ 18  _/N  262144  a KUnit test has been run
 ===  ===  ======  ========================================================
 
 Note: The character ``_`` is representing a blank in this table to make reading
diff --git a/include/linux/panic.h b/include/linux/panic.h
index f5844908a089..1d316c26bf27 100644
--- a/include/linux/panic.h
+++ b/include/linux/panic.h
@@ -74,7 +74,8 @@  static inline void set_arch_panic_timeout(int timeout, int arch_default_timeout)
 #define TAINT_LIVEPATCH			15
 #define TAINT_AUX			16
 #define TAINT_RANDSTRUCT		17
-#define TAINT_FLAGS_COUNT		18
+#define TAINT_KUNIT			18
+#define TAINT_FLAGS_COUNT		19
 #define TAINT_FLAGS_MAX			((1UL << TAINT_FLAGS_COUNT) - 1)
 
 struct taint_flag {
diff --git a/kernel/panic.c b/kernel/panic.c
index eb4dfb932c85..b24ca63ed738 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -404,6 +404,7 @@  const struct taint_flag taint_flags[TAINT_FLAGS_COUNT] = {
 	[ TAINT_LIVEPATCH ]		= { 'K', ' ', true },
 	[ TAINT_AUX ]			= { 'X', ' ', true },
 	[ TAINT_RANDSTRUCT ]		= { 'T', ' ', true },
+	[ TAINT_KUNIT ]			= { 'N', ' ', false },
 };
 
 /**
diff --git a/lib/kunit/test.c b/lib/kunit/test.c
index 0f66c13d126e..ea8e9162445d 100644
--- a/lib/kunit/test.c
+++ b/lib/kunit/test.c
@@ -11,6 +11,7 @@ 
 #include <kunit/test-bug.h>
 #include <linux/kernel.h>
 #include <linux/moduleparam.h>
+#include <linux/panic.h>
 #include <linux/sched/debug.h>
 #include <linux/sched.h>
 
@@ -498,6 +499,9 @@  int kunit_run_tests(struct kunit_suite *suite)
 	struct kunit_result_stats suite_stats = { 0 };
 	struct kunit_result_stats total_stats = { 0 };
 
+	/* Taint the kernel so we know we've run tests. */
+	add_taint(TAINT_KUNIT, LOCKDEP_STILL_OK);
+
 	kunit_print_subtest_start(suite);
 
 	kunit_suite_for_each_test_case(suite, test_case) {