mbox series

[net-next,v3,0/8] net/selftests: TCP-AO selftests updates

Message ID 20240815-tcp-ao-selftests-upd-6-12-v3-0-7bd2e22bb81c@gmail.com
Headers show
Series net/selftests: TCP-AO selftests updates | expand

Message

Dmitry Safonov via B4 Relay Aug. 15, 2024, 9:32 p.m. UTC
First 3 patches are more-or-less cleanups/preparations.

Patches 4/5 are fixes for netns file descriptors leaks/open.

Patch 6 was sent to me/contributed off-list by Mohammad, who wants 32-bit
kernels to run TCP-AO.

Patch 7 is a workaround/fix for slow VMs. Albeit, I can't reproduce
the issue, but I hope it will fix netdev flakes for connect-deny-*
tests.

And the biggest change is adding TCP-AO tracepoints to selftests.
I think it's a good addition by the following reasons:
- The related tracepoints are now tested;
- It allows tcp-ao selftests to raise expectations on the kernel
  behavior - up from the syscalls exit statuses + net counters.
- Provides tracepoints usage samples.

As tracepoints are not a stable ABI, any kernel changes done to them
will be reflected to the selftests, which also will allow users
to see how to change their code. It's quite better than parsing dmesg
(what BGP was doing pre-tracepoints, ugh).

Somewhat arguably, the code parses trace_pipe, rather than uses
libtraceevent (which any sane user should do). The reason behind that is
the same as for rt-netlink macros instead of libmnl: I'm trying
to minimize the library dependencies of the selftests. And the
performance of formatting text in kernel and parsing it again in a test
is not critical.

Current output sample:
> ok 73 Trace events matched expectations: 13 tcp_hash_md5_required[2] tcp_hash_md5_unexpected[4] tcp_hash_ao_required[3] tcp_ao_key_not_found[4]

Previously, tracepoints selftests were part of kernel tcp tracepoints
submission [1], but since then the code was quite changed:
- Now generic tracing setup is in lib/ftrace.c, separate from
  lib/ftrace-tcp.c which utilizes TCP trace points. This separation
  allows future selftests to trace non-TCP events, i.e. to find out
  an skb's drop reason, which was useful in the creation of TCP-CLOSE
  stress-test (not in this patch set, but used in attempt to reproduce
  the issue from [2]).
- Another change is that in the previous submission the trace events
  where used only to detect unexpected TCP-AO/TCP-MD5 events. In this
  version the selftests will fail if an expected trace event didn't
  appear.
  Let's see how reliable this is on the netdev bot - it obviously passes
  on my testing, but potentially may require a temporary XFAIL patch
  if it misbehaves on a slow VM.

[1] https://lore.kernel.org/lkml/20240224-tcp-ao-tracepoints-v1-0-15f31b7f30a7@arista.com/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=33700a0c9b56

Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
---
Changes in v3:
- Corrected the selftests printing of tcp header flags, parsed from
  trace points
- Fixed an issue with VRF kconfig checks (and tests)
- Made check for unexpected trace events XFAIL, yet looking into the
  reason behind the fail
- Link to v2: https://lore.kernel.org/r/20240802-tcp-ao-selftests-upd-6-12-v2-0-370c99358161@gmail.com

Changes in v2:
- Fixed two issues with parsing TCP-AO events: the socket state and TCP
  segment flags. Hopefully, won't fail on netdev.
- Reword patch 1 & 2 messages to be more informative and at some degree
  formal (Paolo)
- Since commit e33a02ed6a4f ("selftests: Add printf attribute to
  kselftest prints") it's possible to use __printf instead of "raw" gcc
  attribute - switch using that, as checkpatch suggests.
- Link to v1: https://lore.kernel.org/r/20240730-tcp-ao-selftests-upd-6-12-v1-0-ffd4bf15d638@gmail.com

---
Dmitry Safonov (7):
      selftests/net: Clean-up double assignment
      selftests/net: Provide test_snprintf() helper
      selftests/net: Be consistent in kconfig checks
      selftests/net: Open /proc/thread-self in open_netns()
      selftests/net: Don't forget to close nsfd after switch_save_ns()
      selftests/net: Synchronize client/server before counters checks
      selftests/net: Add trace events matching to tcp_ao

Mohammad Nassiri (1):
      selftests/tcp_ao: Fix printing format for uint64_t

 tools/testing/selftests/net/tcp_ao/Makefile        |   3 +-
 tools/testing/selftests/net/tcp_ao/bench-lookups.c |   2 +-
 tools/testing/selftests/net/tcp_ao/config          |   1 +
 tools/testing/selftests/net/tcp_ao/connect-deny.c  |  25 +-
 tools/testing/selftests/net/tcp_ao/connect.c       |   6 +-
 tools/testing/selftests/net/tcp_ao/icmps-discard.c |   2 +-
 .../testing/selftests/net/tcp_ao/key-management.c  |  18 +-
 tools/testing/selftests/net/tcp_ao/lib/aolib.h     | 176 ++++++-
 .../testing/selftests/net/tcp_ao/lib/ftrace-tcp.c  | 549 +++++++++++++++++++++
 tools/testing/selftests/net/tcp_ao/lib/ftrace.c    | 466 +++++++++++++++++
 tools/testing/selftests/net/tcp_ao/lib/kconfig.c   |  31 +-
 tools/testing/selftests/net/tcp_ao/lib/setup.c     |  17 +-
 tools/testing/selftests/net/tcp_ao/lib/sock.c      |   1 -
 tools/testing/selftests/net/tcp_ao/lib/utils.c     |  26 +
 tools/testing/selftests/net/tcp_ao/restore.c       |  30 +-
 tools/testing/selftests/net/tcp_ao/rst.c           |   2 +-
 tools/testing/selftests/net/tcp_ao/self-connect.c  |  19 +-
 tools/testing/selftests/net/tcp_ao/seq-ext.c       |  28 +-
 .../selftests/net/tcp_ao/setsockopt-closed.c       |   6 +-
 tools/testing/selftests/net/tcp_ao/unsigned-md5.c  |  35 +-
 20 files changed, 1376 insertions(+), 67 deletions(-)
---
base-commit: a9c60712d71ff07197b2982899b9db28ed548ded
change-id: 20240730-tcp-ao-selftests-upd-6-12-4d3e53a74f3f

Best regards,

Comments

Dmitry Safonov Aug. 19, 2024, 5:19 p.m. UTC | #1
Going to update with v4, I see some netdev failures on non-appeared
trace events.
I need to review the synchronization on the tracer thread.

pw-bot: changes-requested

On Thu, 15 Aug 2024 at 22:32, Dmitry Safonov via B4 Relay
<devnull+0x7f454c46.gmail.com@kernel.org> wrote:
>
> First 3 patches are more-or-less cleanups/preparations.
>
> Patches 4/5 are fixes for netns file descriptors leaks/open.
>
> Patch 6 was sent to me/contributed off-list by Mohammad, who wants 32-bit
> kernels to run TCP-AO.
>
> Patch 7 is a workaround/fix for slow VMs. Albeit, I can't reproduce
> the issue, but I hope it will fix netdev flakes for connect-deny-*
> tests.
>
> And the biggest change is adding TCP-AO tracepoints to selftests.
> I think it's a good addition by the following reasons:
> - The related tracepoints are now tested;
> - It allows tcp-ao selftests to raise expectations on the kernel
>   behavior - up from the syscalls exit statuses + net counters.
> - Provides tracepoints usage samples.
>
> As tracepoints are not a stable ABI, any kernel changes done to them
> will be reflected to the selftests, which also will allow users
> to see how to change their code. It's quite better than parsing dmesg
> (what BGP was doing pre-tracepoints, ugh).
>
> Somewhat arguably, the code parses trace_pipe, rather than uses
> libtraceevent (which any sane user should do). The reason behind that is
> the same as for rt-netlink macros instead of libmnl: I'm trying
> to minimize the library dependencies of the selftests. And the
> performance of formatting text in kernel and parsing it again in a test
> is not critical.
>
> Current output sample:
> > ok 73 Trace events matched expectations: 13 tcp_hash_md5_required[2] tcp_hash_md5_unexpected[4] tcp_hash_ao_required[3] tcp_ao_key_not_found[4]
>
> Previously, tracepoints selftests were part of kernel tcp tracepoints
> submission [1], but since then the code was quite changed:
> - Now generic tracing setup is in lib/ftrace.c, separate from
>   lib/ftrace-tcp.c which utilizes TCP trace points. This separation
>   allows future selftests to trace non-TCP events, i.e. to find out
>   an skb's drop reason, which was useful in the creation of TCP-CLOSE
>   stress-test (not in this patch set, but used in attempt to reproduce
>   the issue from [2]).
> - Another change is that in the previous submission the trace events
>   where used only to detect unexpected TCP-AO/TCP-MD5 events. In this
>   version the selftests will fail if an expected trace event didn't
>   appear.
>   Let's see how reliable this is on the netdev bot - it obviously passes
>   on my testing, but potentially may require a temporary XFAIL patch
>   if it misbehaves on a slow VM.
>
> [1] https://lore.kernel.org/lkml/20240224-tcp-ao-tracepoints-v1-0-15f31b7f30a7@arista.com/
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=33700a0c9b56
>
> Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
[..]

Sorry for the noise,
             Dmitry
Simon Horman Aug. 21, 2024, 7:10 p.m. UTC | #2
On Thu, Aug 15, 2024 at 10:32:27PM +0100, Dmitry Safonov via B4 Relay wrote:
> From: Dmitry Safonov <0x7f454c46@gmail.com>
> 
> Instead of pre-allocating a fixed-sized buffer of TEST_MSG_BUFFER_SIZE
> and printing into it, call vsnprintf() with str = NULL, which will
> return the needed size of the buffer. This hack is documented in
> man 3 vsnprintf.
> 
> Essentially, in C++ terms, it re-invents std::stringstream, which is
> going to be used to print different tracing paths and formatted strings.
> Use it straight away in __test_print() - which is thread-safe version of
> printing in selftests.
> 
> Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>

Hi Dmitry,

Some minor nits, as it looks like there will be a v4.

> ---
>  tools/testing/selftests/net/tcp_ao/lib/aolib.h | 59 ++++++++++++++++++++++----
>  1 file changed, 50 insertions(+), 9 deletions(-)
> 
> diff --git a/tools/testing/selftests/net/tcp_ao/lib/aolib.h b/tools/testing/selftests/net/tcp_ao/lib/aolib.h
> index fbc7f6111815..60a63419cabb 100644
> --- a/tools/testing/selftests/net/tcp_ao/lib/aolib.h
> +++ b/tools/testing/selftests/net/tcp_ao/lib/aolib.h
> @@ -37,17 +37,58 @@ extern void __test_xfail(const char *buf);
>  extern void __test_error(const char *buf);
>  extern void __test_skip(const char *buf);
>  
> -__attribute__((__format__(__printf__, 2, 3)))
> -static inline void __test_print(void (*fn)(const char *), const char *fmt, ...)
> +static inline char *test_snprintf(const char *fmt, va_list vargs)
>  {
> -#define TEST_MSG_BUFFER_SIZE 4096
> -	char buf[TEST_MSG_BUFFER_SIZE];
> -	va_list arg;
> +	char *ret = NULL;
> +	size_t size = 0;
> +	va_list tmp;
> +	int n = 0;
>  
> -	va_start(arg, fmt);
> -	vsnprintf(buf, sizeof(buf), fmt, arg);
> -	va_end(arg);
> -	fn(buf);
> +	va_copy(tmp, vargs);
> +	n = vsnprintf(ret, size, fmt, tmp);
> +	if (n < 0)
> +		return NULL;
> +
> +	size = (size_t) n + 1;

nit: I'm not sure this cast is needed.
     But if it is, then the space after ')' should be dropped.

> +	ret = malloc(size);
> +	if (ret == NULL)

nit: if (!ret)

> +		return NULL;
> +
> +	n = vsnprintf(ret, size, fmt, vargs);
> +	if (n < 0 || n > size - 1) {
> +		free(ret);
> +		return NULL;
> +	}
> +	return ret;
> +}

...
Simon Horman Aug. 21, 2024, 7:11 p.m. UTC | #3
On Thu, Aug 15, 2024 at 10:32:29PM +0100, Dmitry Safonov via B4 Relay wrote:
> From: Dmitry Safonov <0x7f454c46@gmail.com>
> 
> It turns to be that open_netns() is called rarely from the child-thread
> and more often from parent-thread. Yet, on initialization of kconfig
> checks, either of threads may reach kconfig_lock mutex first.
> VRF-related checks do create a temprary ksft-check VRF in

nit: temporary

     Flagged by checkpatch.pl --codespell

> an unshare()'d namespace and than setns() back to the original.
> As original was opened from "/proc/self/ns/net", it's valid for
> thread-leader (parent), but it's invalid for the child, resulting
> in the following failure on tests that check has_vrfs() support:
> > # ok 54 TCP-AO required on socket + TCP-MD5 key: prefailed as expected: Key was rejected by service
> > # not ok 55 # error 381[unsigned-md5.c:24] Failed to add a VRF: -17
> > # not ok 56 # error 383[unsigned-md5.c:33] Failed to add a route to VRF: -22: Key was rejected by service
> > not ok 1 selftests: net/tcp_ao: unsigned-md5_ipv6 # exit=1
> 
> Use "/proc/thread-self/ns/net" which is valid for any thread.
> 
> Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>

...
Dmitry Safonov Aug. 21, 2024, 9:35 p.m. UTC | #4
Hi Simon,

On Wed, 21 Aug 2024 at 20:10, Simon Horman <horms@kernel.org> wrote:
>
> On Thu, Aug 15, 2024 at 10:32:27PM +0100, Dmitry Safonov via B4 Relay wrote:
> > From: Dmitry Safonov <0x7f454c46@gmail.com>
> >
> > Instead of pre-allocating a fixed-sized buffer of TEST_MSG_BUFFER_SIZE
> > and printing into it, call vsnprintf() with str = NULL, which will
> > return the needed size of the buffer. This hack is documented in
> > man 3 vsnprintf.
> >
> > Essentially, in C++ terms, it re-invents std::stringstream, which is
> > going to be used to print different tracing paths and formatted strings.
> > Use it straight away in __test_print() - which is thread-safe version of
> > printing in selftests.
> >
> > Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
>
> Hi Dmitry,
>
> Some minor nits, as it looks like there will be a v4.

Thanks, both seem reasonable.
Did you get them with checkpatch.pl or with your trained eyes? :)

These days I run b4 prep --check and on latest version it just gave a
bunch of fmt-strings with columns > 100.

Thanks,
             Dmitry
Dmitry Safonov Aug. 21, 2024, 9:44 p.m. UTC | #5
On Wed, 21 Aug 2024 at 20:11, Simon Horman <horms@kernel.org> wrote:
>
> On Thu, Aug 15, 2024 at 10:32:29PM +0100, Dmitry Safonov via B4 Relay wrote:
> > From: Dmitry Safonov <0x7f454c46@gmail.com>
> >
> > It turns to be that open_netns() is called rarely from the child-thread
> > and more often from parent-thread. Yet, on initialization of kconfig
> > checks, either of threads may reach kconfig_lock mutex first.
> > VRF-related checks do create a temprary ksft-check VRF in
>
> nit: temporary
>
>      Flagged by checkpatch.pl --codespell

A-ha, b4 has this b4.prep-perpatch-check-cmd git setting:
https://github.com/mricon/b4/blob/37811c93f50e70f325e45107a9a20ffc69f2f6dc/src/b4/ez.py#L1667C20-L1667C43

Going to set it and hopefully, it will help avoid spellings/typos in
future, thanks!
Simon Horman Aug. 22, 2024, 10:13 a.m. UTC | #6
On Wed, Aug 21, 2024 at 10:35:10PM +0100, Dmitry Safonov wrote:
> Hi Simon,
> 
> On Wed, 21 Aug 2024 at 20:10, Simon Horman <horms@kernel.org> wrote:
> >
> > On Thu, Aug 15, 2024 at 10:32:27PM +0100, Dmitry Safonov via B4 Relay wrote:
> > > From: Dmitry Safonov <0x7f454c46@gmail.com>
> > >
> > > Instead of pre-allocating a fixed-sized buffer of TEST_MSG_BUFFER_SIZE
> > > and printing into it, call vsnprintf() with str = NULL, which will
> > > return the needed size of the buffer. This hack is documented in
> > > man 3 vsnprintf.
> > >
> > > Essentially, in C++ terms, it re-invents std::stringstream, which is
> > > going to be used to print different tracing paths and formatted strings.
> > > Use it straight away in __test_print() - which is thread-safe version of
> > > printing in selftests.
> > >
> > > Signed-off-by: Dmitry Safonov <0x7f454c46@gmail.com>
> >
> > Hi Dmitry,
> >
> > Some minor nits, as it looks like there will be a v4.
> 
> Thanks, both seem reasonable.
> Did you get them with checkpatch.pl or with your trained eyes? :)
> 
> These days I run b4 prep --check and on latest version it just gave a
> bunch of fmt-strings with columns > 100.

Hi Dimitry,

For networking code I usually run:

checkpatch.pl --strict --codespell --min-conf-desc-length=80

Where 80 is, I believe, still in line with preferences for Networking code.
Although I'm not entirely sure it is applicable to this patch.

As to your question, in this case I think it is the --strict that causes
checkpatch to flag the issues I raised. Sorry for not mentioning that in my
previous email.
Simon Horman Aug. 22, 2024, 10:14 a.m. UTC | #7
On Wed, Aug 21, 2024 at 10:44:30PM +0100, Dmitry Safonov wrote:
> On Wed, 21 Aug 2024 at 20:11, Simon Horman <horms@kernel.org> wrote:
> >
> > On Thu, Aug 15, 2024 at 10:32:29PM +0100, Dmitry Safonov via B4 Relay wrote:
> > > From: Dmitry Safonov <0x7f454c46@gmail.com>
> > >
> > > It turns to be that open_netns() is called rarely from the child-thread
> > > and more often from parent-thread. Yet, on initialization of kconfig
> > > checks, either of threads may reach kconfig_lock mutex first.
> > > VRF-related checks do create a temprary ksft-check VRF in
> >
> > nit: temporary
> >
> >      Flagged by checkpatch.pl --codespell
> 
> A-ha, b4 has this b4.prep-perpatch-check-cmd git setting:
> https://github.com/mricon/b4/blob/37811c93f50e70f325e45107a9a20ffc69f2f6dc/src/b4/ez.py#L1667C20-L1667C43
> 
> Going to set it and hopefully, it will help avoid spellings/typos in
> future, thanks!

--codespell is very handy for that :)
Dmitry Safonov Aug. 23, 2024, 2:27 p.m. UTC | #8
On Thu, 22 Aug 2024 at 11:13, Simon Horman <horms@kernel.org> wrote:
>
> On Wed, Aug 21, 2024 at 10:35:10PM +0100, Dmitry Safonov wrote:
> > Hi Simon,
> >
> > On Wed, 21 Aug 2024 at 20:10, Simon Horman <horms@kernel.org> wrote:
> > >
[..]
> > > Hi Dmitry,
> > >
> > > Some minor nits, as it looks like there will be a v4.
> >
> > Thanks, both seem reasonable.
> > Did you get them with checkpatch.pl or with your trained eyes? :)
> >
> > These days I run b4 prep --check and on latest version it just gave a
> > bunch of fmt-strings with columns > 100.
>
> Hi Dimitry,
>
> For networking code I usually run:
>
> checkpatch.pl --strict --codespell --min-conf-desc-length=80
>
> Where 80 is, I believe, still in line with preferences for Networking code.
> Although I'm not entirely sure it is applicable to this patch.
>
> As to your question, in this case I think it is the --strict that causes
> checkpatch to flag the issues I raised. Sorry for not mentioning that in my
> previous email.

Good, thanks!
Now I can see/fix them :-)

Cheers,
             Dmitry