mbox series

[RFC,net-next,mlxsw,00/14] selftests: Fixes for kernel CI

Message ID cover.1711385795.git.petrm@nvidia.com
Headers show
Series selftests: Fixes for kernel CI | expand

Message

Petr Machata March 25, 2024, 5:29 p.m. UTC
As discussed on the bi-weekly call on Jan 30, and in mailing around
kernel CI effort, some changes are desirable in the suite of forwarding
selftests the better to work with the CI tooling. Namely:

- The forwarding selftests use a configuration file where names of
  interfaces are defined and various variables can be overridden. There
  is also forwarding.config.sample that users can use as a template to
  refer to when creating the config file. What happens a fair bit is
  that users either do not know about this at all, or simply forget, and
  are confused by cryptic failures about interfaces that cannot be
  created.

  In patches #1 - #3 have lib.sh just be the single source of truth with
  regards to which variables exist. That includes the topology variables
  which were previously only in the sample file, and any "tweak
  variables", such as what tools to use, sleep times, etc.

  forwarding.config.sample then becomes just a placeholder with a couple
  examples. Unless specific HW should be exercised, or specific tools
  used, the defaults are usually just fine.

- Several net/forwarding/ selftests (and one net/ one) cannot be run on
  veth pairs, they need an actual HW interface to run on. They are
  generic in the sense that any capable HW should pass them, which is
  why they have been put to net/forwarding/ as opposed to drivers/net/,
  but they do not generalize to veth. The fact that these tests are in
  net/forwarding/, but still complaining when run, is confusing.

  In patches #4 - #6 move these tests to a new directory
  drivers/net/hw.

- The following patches extend the codebase to handle well test results
  other than pass and fail.

  Patch #7 is preparatory. It converts several log_test_skip to XFAIL,
  so that tests do not spuriously end up returning non-0 when they
  are not supposed to.

  In patches #8 - #10, introduce some missing ksft constants, then support
  having those constants in RET, and then finally in EXIT_STATUS.

- The traffic scheduler tests generate a large amount of network traffic
  to test the behavior of the scheduler. This demands a relatively
  high-performance computer. On slow machines, such as with a debugging
  kernel, the test would spuriously fail.

  It can still be useful to "go through the motions" though, to possibly
  catch bugs in setup of the scheduler graph and passing packets around.
  Thus we still want to run the tests, just with lowered demands.

  To that end, in patches #11 - #12, introduce an environment variable
  KSFT_MACHINE_SLOW, with obvious meaning. Tests can then make checks
  more lenient, such as mark failures as XFAIL. A helper, xfail_on_slow,
  is provided to mark performance-sensitive parts of the selftest.

- In patch #13, use a similar mechanism to mark a NH group stats
  selftest to XFAIL HW stats tests when run on VETH pairs.

- All these changes complicate the hitherto straightforward logging and
  checking logic, so in patch #14, add a selftest that checks this
  functionality in lib.sh.

Petr Machata (14):
  selftests: net: libs: Change variable fallback syntax
  selftests: forwarding.config.sample: Move overrides to lib.sh
  selftests: forwarding: README: Document customization
  selftests: forwarding: ipip_lib: Do not import lib.sh
  selftests: forwarding: Move several selftests
  selftests: forwarding: Ditch skip_on_veth()
  selftests: forwarding: Change inappropriate log_test_skip() calls
  selftests: lib: Define more kselftest exit codes
  selftests: forwarding: Have RET track kselftest framework constants
  selftests: forwarding: Convert log_test() to recognize RET values
  selftests: forwarding: Support for performance sensitive tests
  selftests: forwarding: Mark performance-sensitive tests
  selftests: forwarding: router_mpath_nh_lib: Don't skip, xfail on veth
  selftests: forwarding: Add a test for testing lib.sh functionality

 .../testing/selftests/drivers/net/hw/Makefile |  25 ++
 .../net/hw}/devlink_port_split.py             |   0
 .../forwarding => drivers/net/hw}/ethtool.sh  |   5 +-
 .../net/hw}/ethtool_extended_state.sh         |   5 +-
 .../net/hw}/ethtool_lib.sh                    |   0
 .../net/hw}/ethtool_mm.sh                     |   3 +-
 .../net/hw}/ethtool_rmon.sh                   |   7 +-
 .../net/hw}/hw_stats_l3.sh                    |  19 +-
 .../net/hw}/hw_stats_l3_gre.sh                |   7 +-
 .../forwarding => drivers/net/hw}/loopback.sh |   5 +-
 .../testing/selftests/drivers/net/hw/settings |   1 +
 .../selftests/drivers/net/mlxsw/mlxsw_lib.sh  |   2 +-
 .../net/mlxsw/spectrum-2/resource_scale.sh    |   1 -
 .../net/mlxsw/spectrum/resource_scale.sh      |   1 -
 tools/testing/selftests/net/Makefile          |   1 -
 .../testing/selftests/net/forwarding/Makefile |   9 +-
 tools/testing/selftests/net/forwarding/README |  33 +++
 .../net/forwarding/forwarding.config.sample   |  53 ++--
 .../selftests/net/forwarding/ipip_lib.sh      |   1 -
 tools/testing/selftests/net/forwarding/lib.sh | 255 +++++++++++++-----
 .../selftests/net/forwarding/lib_sh_test.sh   | 208 ++++++++++++++
 .../net/forwarding/router_mpath_nh_lib.sh     |  12 +-
 .../selftests/net/forwarding/sch_ets_tests.sh |  19 +-
 .../selftests/net/forwarding/sch_red.sh       |  10 +-
 .../selftests/net/forwarding/sch_tbf_core.sh  |   2 +-
 .../selftests/net/forwarding/tc_common.sh     |   2 +-
 .../selftests/net/forwarding/tc_tunnel_key.sh |   2 -
 tools/testing/selftests/net/lib.sh            |  48 +++-
 28 files changed, 565 insertions(+), 171 deletions(-)
 create mode 100644 tools/testing/selftests/drivers/net/hw/Makefile
 rename tools/testing/selftests/{net => drivers/net/hw}/devlink_port_split.py (100%)
 rename tools/testing/selftests/{net/forwarding => drivers/net/hw}/ethtool.sh (98%)
 rename tools/testing/selftests/{net/forwarding => drivers/net/hw}/ethtool_extended_state.sh (96%)
 rename tools/testing/selftests/{net/forwarding => drivers/net/hw}/ethtool_lib.sh (100%)
 rename tools/testing/selftests/{net/forwarding => drivers/net/hw}/ethtool_mm.sh (99%)
 rename tools/testing/selftests/{net/forwarding => drivers/net/hw}/ethtool_rmon.sh (92%)
 rename tools/testing/selftests/{net/forwarding => drivers/net/hw}/hw_stats_l3.sh (96%)
 rename tools/testing/selftests/{net/forwarding => drivers/net/hw}/hw_stats_l3_gre.sh (92%)
 rename tools/testing/selftests/{net/forwarding => drivers/net/hw}/loopback.sh (92%)
 create mode 100644 tools/testing/selftests/drivers/net/hw/settings
 create mode 100755 tools/testing/selftests/net/forwarding/lib_sh_test.sh

Comments

Jakub Kicinski March 26, 2024, 12:34 a.m. UTC | #1
On Mon, 25 Mar 2024 18:29:10 +0100 Petr Machata wrote:
> +The forwarding selftests framework uses a number of variables that
> +influence its behavior and tools it invokes, and how it invokes them, in
> +various ways. A number of these variables can be overridden. The way these
> +overridable variables are specified is typically one of the following two
> +syntaxes:
> +
> +	: "${VARIABLE:=default_value}"
> +	VARIABLE=${VARIABLE:=default_value}
> +
> +Any of these variables can be overridden. Notably net/forwarding/lib.sh and
> +net/lib.sh contain a number of overridable variables.
> +
> +One way of overriding these variables is through the environment:
> +
> +	PAUSE_ON_FAIL=yes ./some_test.sh

I like this conversion a lot. Makes me want to propose that we make this
a standard feature of kselftest. If "env" file exists in the test
directory kselftest would load its contents before running every test.

That's more of a broader question to anyone reading on linux-kselftest@
if there's no interest more than happy to merge as is :)

> +The variable NETIFS is special. Since it is an array variable, there is no
> +way to pass it through the environment. Its value can instead be given as
> +consecutive arguments to the selftest:
> +
> +	./some_test.sh swp{1..8}

Did you consider allowing them to be defined as NETIF_0, NETIF_1 etc.?
We can have lib.sh convert that into an array with a ugly-but-short
loop, it's a bit tempting to get rid of the exception.
Jakub Kicinski March 26, 2024, 12:43 a.m. UTC | #2
On Mon, 25 Mar 2024 18:29:16 +0100 Petr Machata wrote:
> +set_ret()
> +{
> +	local nret=$1; shift

May be worth throwing in a comment that $1 must be a legal ksft ret
code, not any exit code from random commands.
Jakub Kicinski March 26, 2024, 12:48 a.m. UTC | #3
On Mon, 25 Mar 2024 18:29:07 +0100 Petr Machata wrote:
> As discussed on the bi-weekly call on Jan 30, and in mailing around
> kernel CI effort, some changes are desirable in the suite of forwarding
> selftests the better to work with the CI tooling. 

Excellent.

Did you skip CCing netdev@ on purpose or by accident?
Feel free to send it to the list without the RFC tag
(pw-bot will auto-discard it if it sees RFC).
Petr Machata March 26, 2024, 10:31 a.m. UTC | #4
Jakub Kicinski <kuba@kernel.org> writes:

> On Mon, 25 Mar 2024 18:29:10 +0100 Petr Machata wrote:
>> +The forwarding selftests framework uses a number of variables that
>> +influence its behavior and tools it invokes, and how it invokes them, in
>> +various ways. A number of these variables can be overridden. The way these
>> +overridable variables are specified is typically one of the following two
>> +syntaxes:
>> +
>> +	: "${VARIABLE:=default_value}"
>> +	VARIABLE=${VARIABLE:=default_value}
>> +
>> +Any of these variables can be overridden. Notably net/forwarding/lib.sh and
>> +net/lib.sh contain a number of overridable variables.
>> +
>> +One way of overriding these variables is through the environment:
>> +
>> +	PAUSE_ON_FAIL=yes ./some_test.sh
>
> I like this conversion a lot. Makes me want to propose that we make this

Convention you mean?
Nothing was converted, this has always worked.

> a standard feature of kselftest. If "env" file exists in the test
> directory kselftest would load its contents before running every test.
>
> That's more of a broader question to anyone reading on linux-kselftest@
> if there's no interest more than happy to merge as is :)
>
>> +The variable NETIFS is special. Since it is an array variable, there is no
>> +way to pass it through the environment. Its value can instead be given as
>> +consecutive arguments to the selftest:
>> +
>> +	./some_test.sh swp{1..8}
>
> Did you consider allowing them to be defined as NETIF_0, NETIF_1 etc.?
> We can have lib.sh convert that into an array with a ugly-but-short
> loop, it's a bit tempting to get rid of the exception.

The exception is a bit annoying, yeah. But it works today, should stay,
and therefore should be documented, so the paragraph won't go away. I
use it all the time, too. I basically don't use the config file, I just
use the env overrides and the argv interface names. It's very handy.

The alternative is also very verbose:

	NETIF_1=swp1 NETIF_2=swp2 NETIF_3=swp3 [...] ./some_test.sh.

Maybe we could do this though?

	NETIFS="swp1 swp2 swp3 swp4 swp5 swp6 swp7 swp8" ./some_test.sh

And like this it won't make you want to pull your hair from all the
repetition:

	NETIFS=$(echo swp{1..8}) ./some_test.sh

But NETIFS is going to be a special case one way or another. That you
need to specify it through several variables, or a variable with a
special value, means you need to explain it as a special case in the
documentation. At which point you have two exceptions, and an
interaction between them, to describe.
Petr Machata March 26, 2024, 11:12 a.m. UTC | #5
Jakub Kicinski <kuba@kernel.org> writes:

> On Mon, 25 Mar 2024 18:29:16 +0100 Petr Machata wrote:
>> +set_ret()
>> +{
>> +	local nret=$1; shift
>
> May be worth throwing in a comment that $1 must be a legal ksft ret
> code, not any exit code from random commands.

Hmm, yeah. I'll rename to ret_set_ksft_status() and the variable to
ksft_status, I think that will be harder to miss than a comment. The
verbosity of the new name doesn't matter, it's an internal helper.
Petr Machata March 26, 2024, 11:13 a.m. UTC | #6
Jakub Kicinski <kuba@kernel.org> writes:

> On Mon, 25 Mar 2024 18:29:07 +0100 Petr Machata wrote:
>> As discussed on the bi-weekly call on Jan 30, and in mailing around
>> kernel CI effort, some changes are desirable in the suite of forwarding
>> selftests the better to work with the CI tooling. 
>
> Excellent.
>
> Did you skip CCing netdev@ on purpose or by accident?
> Feel free to send it to the list without the RFC tag
> (pw-bot will auto-discard it if it sees RFC).

D'oh, I forgot to toggle my dev tree from internal to external, so it
went to nbu-linux-internal instead. I'll resend.
Jakub Kicinski March 26, 2024, 2:13 p.m. UTC | #7
On Tue, 26 Mar 2024 11:31:31 +0100 Petr Machata wrote:
> Jakub Kicinski <kuba@kernel.org> writes:
> 
> > On Mon, 25 Mar 2024 18:29:10 +0100 Petr Machata wrote:  
> >> +The forwarding selftests framework uses a number of variables that
> >> +influence its behavior and tools it invokes, and how it invokes them, in
> >> +various ways. A number of these variables can be overridden. The way these
> >> +overridable variables are specified is typically one of the following two
> >> +syntaxes:
> >> +
> >> +	: "${VARIABLE:=default_value}"
> >> +	VARIABLE=${VARIABLE:=default_value}
> >> +
> >> +Any of these variables can be overridden. Notably net/forwarding/lib.sh and
> >> +net/lib.sh contain a number of overridable variables.
> >> +
> >> +One way of overriding these variables is through the environment:
> >> +
> >> +	PAUSE_ON_FAIL=yes ./some_test.sh  
> >
> > I like this conversion a lot. Makes me want to propose that we make this  
> 
> Convention you mean?

Yes, sorry

> Nothing was converted, this has always worked.

Right, for forwarding and perhaps net.

> > a standard feature of kselftest. If "env" file exists in the test
> > directory kselftest would load its contents before running every test.
> >
> > That's more of a broader question to anyone reading on linux-kselftest@
> > if there's no interest more than happy to merge as is :)
> >  
> >> +The variable NETIFS is special. Since it is an array variable, there is no
> >> +way to pass it through the environment. Its value can instead be given as
> >> +consecutive arguments to the selftest:
> >> +
> >> +	./some_test.sh swp{1..8}  
> >
> > Did you consider allowing them to be defined as NETIF_0, NETIF_1 etc.?
> > We can have lib.sh convert that into an array with a ugly-but-short
> > loop, it's a bit tempting to get rid of the exception.  
> 
> The exception is a bit annoying, yeah. But it works today, should stay,
> and therefore should be documented, so the paragraph won't go away. I
> use it all the time, too. I basically don't use the config file, I just
> use the env overrides and the argv interface names. It's very handy.
> 
> The alternative is also very verbose:
> 
> 	NETIF_1=swp1 NETIF_2=swp2 NETIF_3=swp3 [...] ./some_test.sh.
> 
> Maybe we could do this though?
> 
> 	NETIFS="swp1 swp2 swp3 swp4 swp5 swp6 swp7 swp8" ./some_test.sh
> 
> And like this it won't make you want to pull your hair from all the
> repetition:
> 
> 	NETIFS=$(echo swp{1..8}) ./some_test.sh
> 
> But NETIFS is going to be a special case one way or another. That you
> need to specify it through several variables, or a variable with a
> special value, means you need to explain it as a special case in the
> documentation. At which point you have two exceptions, and an
> interaction between them, to describe.

I think there's some value in passing all inputs in the same way (thru
env rather than argv). I guess it's subjective, you're coding it up, 
so you can pick.
Petr Machata March 26, 2024, 5:32 p.m. UTC | #8
Jakub Kicinski <kuba@kernel.org> writes:

> On Tue, 26 Mar 2024 11:31:31 +0100 Petr Machata wrote:
>> Jakub Kicinski <kuba@kernel.org> writes:
>>
>> > a standard feature of kselftest. If "env" file exists in the test
>> > directory kselftest would load its contents before running every test.
>> >
>> > That's more of a broader question to anyone reading on linux-kselftest@
>> > if there's no interest more than happy to merge as is :)
>> >
>> > On Mon, 25 Mar 2024 18:29:10 +0100 Petr Machata wrote:
>> >
>> >> +The variable NETIFS is special. Since it is an array variable, there is no
>> >> +way to pass it through the environment. Its value can instead be given as
>> >> +consecutive arguments to the selftest:
>> >> +
>> >> +	./some_test.sh swp{1..8}  
>> >
>> > Did you consider allowing them to be defined as NETIF_0, NETIF_1 etc.?
>> > We can have lib.sh convert that into an array with a ugly-but-short
>> > loop, it's a bit tempting to get rid of the exception.  
>> 
>> Maybe we could do this though?
>> 
>> 	NETIFS="swp1 swp2 swp3 swp4 swp5 swp6 swp7 swp8" ./some_test.sh
>> 
>> But NETIFS is going to be a special case one way or another. That you
>> need to specify it through several variables, or a variable with a
>> special value, means you need to explain it as a special case in the
>> documentation. At which point you have two exceptions, and an
>> interaction between them, to describe.
>
> I think there's some value in passing all inputs in the same way (thru
> env rather than argv). I guess it's subjective, you're coding it up, 
> so you can pick.

I kinda like the NETIFS="a b c" approach. If somebody wants to code that
up, I'll be happy to review :) I might get around to it at some point.