[net-next,0/5] selftests: drv-net: support testing with a remote system

Message ID	20240412233705.1066444-1-kuba@kernel.org
Headers	From: Jakub Kicinski <kuba@kernel.org>
	To: davem@davemloft.net
	Cc: netdev@vger.kernel.org, edumazet@google.com, pabeni@redhat.com, shuah@kernel.org, petrm@nvidia.com, linux-kselftest@vger.kernel.org, willemb@google.com, Jakub Kicinski <kuba@kernel.org>
	Subject: [PATCH net-next 0/5] selftests: drv-net: support testing with a remote system
	Date: Fri, 12 Apr 2024 16:37:00 -0700
	Message-ID: <20240412233705.1066444-1-kuba@kernel.org>
Series	selftests: drv-net: support testing with a remote system

Jakub Kicinski April 12, 2024, 11:37 p.m. UTC

Hi!

Implement support for tests which require access to a remote system /
endpoint which can generate traffic.
This series concludes the "groundwork" for upstream driver tests.

I wanted to support the three models which came up in discussions:
 - SW testing with netdevsim
 - "local" testing with two ports on the same system in a loopback
 - "remote" testing via SSH
so there is a tiny bit of an abstraction which wraps up how "remote"
commands are executed. Otherwise hopefully there's nothing surprising.
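
To illustrate the kind of abstraction meant here, a minimal sketch of how
one interface could cover both a namespace-based endpoint and an SSH one.
The names NetnsEndpoint, SshEndpoint and wrap() are hypothetical and are
not taken from the patches; the sketch only builds the command line and
does not execute anything.

```python
import shlex


class NetnsEndpoint:
    """Hypothetical endpoint living in a network namespace on the same host."""

    def __init__(self, name):
        self.name = name

    def wrap(self, command):
        # Run the command inside the namespace via `ip netns exec`.
        return f"ip netns exec {shlex.quote(self.name)} {command}"


class SshEndpoint:
    """Hypothetical endpoint reached over SSH."""

    def __init__(self, host):
        self.host = host

    def wrap(self, command):
        # Quote the whole command so the remote shell sees it as one string.
        return f"ssh {shlex.quote(self.host)} {shlex.quote(command)}"


def wrap_for(endpoint, command):
    # The caller does not need to know which kind of endpoint it talks to.
    return endpoint.wrap(command)


print(wrap_for(NetnsEndpoint("ns-ep"), "ping -c 1 192.0.2.1"))
print(wrap_for(SshEndpoint("test-peer"), "ping -c 1 192.0.2.1"))
```

A test written against wrap_for() stays the same whichever endpoint type
the environment provides.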

I'm only adding a ping test. I had a bigger one written but I was
worried we'll get into discussing the details of the test itself
and how I chose to hack up netdevsim, instead of the test infra...
So that test will be a follow up :)

---

TBH, this series is on top of the one I posted in the morning:
https://lore.kernel.org/all/20240412141436.828666-1-kuba@kernel.org/
but it applies cleanly, and all it needs is the ifindex definition
in netdevsim. Testing with real HW works fine even without the other
series.

Jakub Kicinski (5):
  selftests: drv-net: define endpoint structures
  selftests: drv-net: add stdout to the command failed exception
  selftests: drv-net: factor out parsing of the env
  selftests: drv-net: construct environment for running tests which
    require an endpoint
  selftests: drv-net: add a trivial ping test

 tools/testing/selftests/drivers/net/Makefile  |   4 +-
 .../testing/selftests/drivers/net/README.rst  |  31 ++++
 .../selftests/drivers/net/lib/py/__init__.py  |   1 +
 .../selftests/drivers/net/lib/py/endpoint.py  |  13 ++
 .../selftests/drivers/net/lib/py/env.py       | 136 +++++++++++++++---
 .../selftests/drivers/net/lib/py/ep_netns.py  |  15 ++
 .../selftests/drivers/net/lib/py/ep_ssh.py    |  34 +++++
 tools/testing/selftests/drivers/net/ping.py   |  32 +++++
 .../testing/selftests/net/lib/py/__init__.py  |   1 +
 tools/testing/selftests/net/lib/py/netns.py   |  31 ++++
 tools/testing/selftests/net/lib/py/utils.py   |  22 +--
 11 files changed, 291 insertions(+), 29 deletions(-)
 create mode 100644 tools/testing/selftests/drivers/net/lib/py/endpoint.py
 create mode 100644 tools/testing/selftests/drivers/net/lib/py/ep_netns.py
 create mode 100644 tools/testing/selftests/drivers/net/lib/py/ep_ssh.py
 create mode 100755 tools/testing/selftests/drivers/net/ping.py
 create mode 100644 tools/testing/selftests/net/lib/py/netns.py

Jakub Kicinski April 15, 2024, 2:19 p.m. UTC | #1

On Mon, 15 Apr 2024 10:57:31 +0200 Paolo Abeni wrote:
> If I read correctly the above will do a full ssh handshake for each
> command. If the test script/setup is complex, I think/fear the overhead
> could become a bit cumbersome.

Connection reuse. I wasn't sure if I should add a hint to the README,
let me do so.
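
A rough sketch of what connection reuse can look like, assuming it refers
to OpenSSH connection multiplexing (the ControlMaster, ControlPath and
ControlPersist options). The helper name ssh_argv and the socket location
are illustrative assumptions, not part of the series.

```python
import os


def ssh_argv(host, command, control_dir="~/.ssh"):
    """Build an ssh command line that reuses one master connection."""
    control_path = os.path.join(os.path.expanduser(control_dir), "cm-%r@%h:%p")
    return [
        "ssh",
        "-o", "ControlMaster=auto",           # create a master if none exists
        "-o", f"ControlPath={control_path}",  # socket shared by later sessions
        "-o", "ControlPersist=60",            # keep an idle master for 60 s
        host,
        command,
    ]


print(ssh_argv("test-peer", "ip link show"))
```

With these options only the first command pays for the full handshake;
later ones attach to the existing master through the control socket. The
same settings can live in ~/.ssh/config instead of the command line.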

> Would using something like Fabric to create a single connection at
> endpoint instantiation time and re-using it for all the commands be too
> much?

IDK what "Fabric" is, if it's commonly used we can add the option
in tree. If less commonly - I hope the dynamic loading scheme
will allow users to very easily drop in their own class that
integrates with Fabric, without dirtying the tree? :)

Jakub Kicinski April 15, 2024, 2:33 p.m. UTC | #2

On Mon, 15 Apr 2024 11:31:05 +0200 Paolo Abeni wrote:
> On Fri, 2024-04-12 at 16:37 -0700, Jakub Kicinski wrote:
> > +def ping_v4(cfg) -> None:
> > +    if not cfg.v4:
> > +        raise KsftXfailEx()
> > +
> > +    cmd(f"ping -c 1 -W0.5 {cfg.ep_v4}")
> > +    cmd(f"ping -c 1 -W0.5 {cfg.v4}", host=cfg.endpoint)  
> 
> Very minor nit, I personally find a bit more readable:
> 
> 	cfg.endpoint.cmd()
> 
> Which is already supported by the current infra, right?
> 
> With both endpoint possibly remote could be:
> 
> 	cfg.ep1.cmd()
> 	cfg.ep2.cmd()

As I said in the cover letter, I don't want to push us too much towards
classes. The argument format makes local and local+remote tests look more
similar.

I could be wrong 🤷️
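
To make the comparison concrete, a simplified sketch of the argument
style: the same cmd() call is used with or without host=. This cmd() is a
plain function written for illustration and LocalShell is a hypothetical
stand-in for an endpoint object; neither is the helper from the series.

```python
import subprocess


class LocalShell:
    """Hypothetical host object; a remote one would wrap the command in ssh."""

    def run(self, command):
        return subprocess.run(command, shell=True, capture_output=True,
                              text=True, check=True)


def cmd(command, host=None):
    # Same call shape everywhere: no host means "run it here".
    if host is None:
        return subprocess.run(command, shell=True, capture_output=True,
                              text=True, check=True)
    return host.run(command)


print(cmd("echo local").stdout.strip())
print(cmd("echo via-host", host=LocalShell()).stdout.strip())
```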

Willem de Bruijn April 15, 2024, 3:23 p.m. UTC | #3

Jakub Kicinski wrote:
> On Sun, 14 Apr 2024 13:04:46 -0400 Willem de Bruijn wrote:
> > 1. Cleaning up remote state in all conditions, including timeout/kill.
> > 
> >    Some tests require a setup phase before the test, and a matching
> >    cleanup phase. If any of the configured state is variable (even
> >    just a randomized filepath) this needs to be communicated to the
> >    cleanup phase. The remote filepath is handled well here. But if
> >    a test needs per-test setup? Say, change MTU or an Ethtool feature.
> >    Multiple related tests may want to share a setup/cleanup.
> > 
> >    Related: some tests may benefit from a lightweight stateless
> >    check phase to detect preconditions before committing to any setup.
> >    Again, say an Ethtool feature like rx-gro-hw, or AF_XDP metadata rx.
> 
> I think this falls into the "frameworking debate" we were having with
> Petr. The consensus seems to be to keep things as simple as possible.

Makes sense. We can find the sticking points as we go along.
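
One lightweight way to pair setup with cleanup without extra framework is
to register the undo step right after each change. A sketch using
contextlib.ExitStack; set_feature() is a placeholder that only records
calls, standing in for something like an MTU or ethtool change.

```python
from contextlib import ExitStack

log = []


def set_feature(name, value):
    # Placeholder for e.g. an MTU or ethtool change; it only records the call.
    log.append((name, value))


def run_test():
    with ExitStack() as cleanup:
        set_feature("mtu", 9000)
        # Register the undo step right after the change succeeds.
        cleanup.callback(set_feature, "mtu", 1500)
        raise RuntimeError("test body failed")


try:
    run_test()
except RuntimeError:
    pass

print(log)
```

The restore call runs even though the test body raised. This covers
exceptions only; it does not help if the process is killed outright.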

tools/testing/selftests/net already has a couple of hardware feature
tests that probably see little use, since they require manual
testing (csum, gro, toeplitz, ...). Really excited to include them in
this infra to hopefully see more regular testing across more hardware.

> If we see that tests are poorly written and would benefit from extra
> structure we should try to impose some, but every local custom is
> something people will have to learn.

The above were just observations from embedding tests like those
mentioned in our internal custom test framework. Especially with
heterogeneous hardware, a lot of it is "can we run this test on this
platform", or "disable this feature as it interacts with the tested
feature" (e.g., HW-GRO and csum.c).

> timeout/kill is provided to us already by the kselftest harness.
> 
> > 2. Synchronizing peers. Often both peers need to be started at the
> >    same time, but then the client may need to wait until the server
> >    is listening. Paolo added a nice local script to detect a listening
> >    socket with sockstat. Less of a problem with TCP tests than UDP or
> >    raw packet tests.
> 
> Yes, definitely. We should probably add that with the first test that
> needs it.
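
For the TCP case, a minimal sketch of waiting for the server side: poll
with connect attempts until one succeeds. The helper name wait_tcp_listen
is made up for illustration, and this is not the sockstat-based script
mentioned above. A connect probe is visible to the server as a short-lived
connection, and the approach does not carry over to UDP or raw sockets.

```python
import socket
import time


def wait_tcp_listen(host, port, timeout=5.0, interval=0.05):
    """Poll until a TCP connect to (host, port) succeeds or the deadline passes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or unreachable); retry after a short pause.
            time.sleep(interval)
    return False
```

A client would call wait_tcp_listen() after starting the server and
before sending traffic, and fail the test if it returns False.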

Willem de Bruijn April 15, 2024, 3:30 p.m. UTC | #4

Jakub Kicinski wrote:
> Hi!
> 
> Implement support for tests which require access to a remote system /
> endpoint which can generate traffic.
> This series concludes the "groundwork" for upstream driver tests.
> 
> I wanted to support the three models which came up in discussions:
>  - SW testing with netdevsim
>  - "local" testing with two ports on the same system in a loopback
>  - "remote" testing via SSH
> so there is a tiny bit of an abstraction which wraps up how "remote"
> commands are executed. Otherwise hopefully there's nothing surprising.
> 
> I'm only adding a ping test. I had a bigger one written but I was
> worried we'll get into discussing the details of the test itself
> and how I chose to hack up netdevsim, instead of the test infra...
> So that test will be a follow up :)
> 
> ---
> 
> TBH, this series is on top of the one I posted in the morning:
> https://lore.kernel.org/all/20240412141436.828666-1-kuba@kernel.org/
> but it applies cleanly, and all it needs is the ifindex definition
> in netdevsim. Testing with real HW works fine even without the other
> series.
> 
> Jakub Kicinski (5):
>   selftests: drv-net: define endpoint structures
>   selftests: drv-net: add stdout to the command failed exception
>   selftests: drv-net: factor out parsing of the env
>   selftests: drv-net: construct environment for running tests which
>     require an endpoint
>   selftests: drv-net: add a trivial ping test

For the series:

Reviewed-by: Willem de Bruijn <willemb@google.com>

I left some comments for discussion, but did not spell out the more
important part: series looks great to me. Thanks for building this!

Paolo Abeni April 15, 2024, 4:02 p.m. UTC | #5

On Mon, 2024-04-15 at 07:19 -0700, Jakub Kicinski wrote:
> On Mon, 15 Apr 2024 10:57:31 +0200 Paolo Abeni wrote:
> > If I read correctly the above will do a full ssh handshake for each
> > command. If the test script/setup is complex, I think/fear the overhead
> > could become a bit cumbersome.
> 
> Connection reuse. I wasn't sure if I should add a hint to the README,
> let me do so.
> 
> > Would using something like Fabric to create a single connection at
> > endpoint instantiation time and re-using it for all the commands be too
> > much?
> 
> IDK what "Fabric" is, if it's commonly used we can add the option
> in tree. If less commonly - I hope the dynamic loading scheme
> will allow users to very easily drop in their own class that
> integrates with Fabric, without dirtying the tree? :)

I'm really not a Python expert. 'Fabric' is a Python library to execute
commands over ssh:

https://www.fabfile.org/

No idea how common it is.

I'm fine with ssh connection sharing.

Thanks,

Paolo

Paolo Abeni April 15, 2024, 4:09 p.m. UTC | #6

On Mon, 2024-04-15 at 07:33 -0700, Jakub Kicinski wrote:
> On Mon, 15 Apr 2024 11:31:05 +0200 Paolo Abeni wrote:
> > On Fri, 2024-04-12 at 16:37 -0700, Jakub Kicinski wrote:
> > > +def ping_v4(cfg) -> None:
> > > +    if not cfg.v4:
> > > +        raise KsftXfailEx()
> > > +
> > > +    cmd(f"ping -c 1 -W0.5 {cfg.ep_v4}")
> > > +    cmd(f"ping -c 1 -W0.5 {cfg.v4}", host=cfg.endpoint)  
> > 
> > Very minor nit, I personally find a bit more readable:
> > 
> > 	cfg.endpoint.cmd()
> > 
> > Which is already supported by the current infra, right?
> > 
> > With both endpoint possibly remote could be:
> > 
> > 	cfg.ep1.cmd()
> > 	cfg.ep2.cmd()
> 
> As I said in the cover letter, I don't want to push us too much towards
> classes. The argument format makes local and local+remote tests look more
> similar.

I guess it's a matter of personal preferences. I know mine are usually
quite twisted ;)

I'm fine with either syntax.

Cheers,

Paolo
