mbox series

[00/12] selftests/net: Add TCP-AO tests

Message ID 20231215-tcp-ao-selftests-v1-0-f6c08180b985@arista.com
Headers show
Series selftests/net: Add TCP-AO tests | expand

Message

Dmitry Safonov Dec. 15, 2023, 2:36 a.m. UTC
Hi,

An essential part of any big kernel submissions is selftests.
At the beginning of TCP-AO project, I made patches to fcnal-test.sh
and nettest.c to have the benefits of easy refactoring, early noticing
breakages, putting a moat around the code, documenting
and designing uAPI.

While tests based on fcnal-test.sh/nettest.c provided initial testing*
and were very easy to add, the pile of TCP-AO quickly grew out of
one-binary + shell-script testing.

The design of the TCP-AO testing is a bit different than one-big
selftest binary as I did previously in net/ipsec.c. I found it
beneficial to avoid implementing a tests runner/scheduler and delegate
it to the user or Makefile. The approach is very influenced
by CRIU/ZDTM testing[1]: it provides a static library with helper
functions and selftest binaries that create specific scenarios.
I also tried to utilize kselftest.h.

test_init() function does all needed preparations. To not leave
any traces after a selftest exists, it creates a network namespace
and if the test wants to establish a TCP connection, a child netns.
The parent and child netns have veth pair with proper ip addresses
and routes set up. Both peers, the client and server are different
pthreads. The treading model was chosen over forking mostly by easiness
of cleanup on a failure: no need to search for children, handle SIGCHLD,
make sure not to wait for a dead peer to perform anything, etc.
Any thread that does exit() naturally kills the tests, sweet!
The selftests are compiled currently in two variants: ipv4 and ipv6.
Ipv4-mapped-ipv6 addresses might be a third variant to add, but it's not
there in this version. As pretty much all tests are shared between two
address families, most of the code can be shared, too. To differ in code
what kind of test is running, Makefile supplies -DIPV6_TEST to compiler
and ifdeffery in tests can do things that have to be different between
address families. This is similar to TARGETS_C_BOTHBITS in x86 selftests
and also to tests code sharing in CRIU/ZDTM.

The total number of tests is 832.

Comments

Hangbin Liu Dec. 18, 2023, 9:09 a.m. UTC | #1
On Fri, Dec 15, 2023 at 02:36:21AM +0000, Dmitry Safonov wrote:
> Sample output:
> > 1..36
> > # 1106[lib/setup.c:207] rand seed 1660754406
> > TAP version 13
> > ok 1   Worst case connect       512 keys: min=0ms max=1ms mean=0.583329ms stddev=0.076376
> > ok 2 Connect random-search      512 keys: min=0ms max=1ms mean=0.53412ms stddev=0.0516779
> > ok 3    Worst case delete       512 keys: min=2ms max=11ms mean=6.04139ms stddev=0.245792
> > ok 4        Add a new key       512 keys: min=0ms max=13ms mean=0.673415ms stddev=0.0820618
> > ok 5 Remove random-search       512 keys: min=5ms max=9ms mean=6.65969ms stddev=0.258064
> > ok 6         Remove async       512 keys: min=0ms max=0ms mean=0.041825ms stddev=0.0204512
> > ok 7   Worst case connect       1024 keys: min=0ms max=2ms mean=0.520357ms stddev=0.0721358
> > ok 8 Connect random-search      1024 keys: min=0ms max=2ms mean=0.535312ms stddev=0.0517355
> > ok 9    Worst case delete       1024 keys: min=5ms max=9ms mean=8.27219ms stddev=0.287614
> > ok 10        Add a new key      1024 keys: min=0ms max=1ms mean=0.688121ms stddev=0.0829531
> > ok 11 Remove random-search      1024 keys: min=5ms max=9ms mean=8.37649ms stddev=0.289422
> > ok 12         Remove async      1024 keys: min=0ms max=0ms mean=0.0457096ms stddev=0.0213798
> > ok 13   Worst case connect      2048 keys: min=0ms max=2ms mean=0.748804ms stddev=0.0865335
> > ok 14 Connect random-search     2048 keys: min=0ms max=2ms mean=0.782993ms stddev=0.0625697
> > ok 15    Worst case delete      2048 keys: min=5ms max=10ms mean=8.23106ms stddev=0.286898
> > ok 16        Add a new key      2048 keys: min=0ms max=1ms mean=0.812988ms stddev=0.0901658
> > ok 17 Remove random-search      2048 keys: min=8ms max=9ms mean=8.84949ms stddev=0.297481
> > ok 18         Remove async      2048 keys: min=0ms max=0ms mean=0.0297223ms stddev=0.0172402
> > ok 19   Worst case connect      4096 keys: min=1ms max=5ms mean=1.53352ms stddev=0.123836
> > ok 20 Connect random-search     4096 keys: min=1ms max=5ms mean=1.52226ms stddev=0.0872429
> > ok 21    Worst case delete      4096 keys: min=5ms max=9ms mean=8.25874ms stddev=0.28738
> > ok 22        Add a new key      4096 keys: min=0ms max=3ms mean=1.67382ms stddev=0.129376
> > ok 23 Remove random-search      4096 keys: min=5ms max=10ms mean=8.26178ms stddev=0.287433
> > ok 24         Remove async      4096 keys: min=0ms max=0ms mean=0.0340009ms stddev=0.0184393
> > ok 25   Worst case connect      8192 keys: min=2ms max=4ms mean=2.86208ms stddev=0.169177
> > ok 26 Connect random-search     8192 keys: min=2ms max=4ms mean=2.87592ms stddev=0.119915
> > ok 27    Worst case delete      8192 keys: min=6ms max=11ms mean=7.55291ms stddev=0.274826
> > ok 28        Add a new key      8192 keys: min=1ms max=5ms mean=2.56797ms stddev=0.160249
> > ok 29 Remove random-search      8192 keys: min=5ms max=10ms mean=7.14002ms stddev=0.267208
> > ok 30         Remove async      8192 keys: min=0ms max=0ms mean=0.0320066ms stddev=0.0178904
> > ok 31   Worst case connect      16384 keys: min=5ms max=6ms mean=5.55334ms stddev=0.235655
> > ok 32 Connect random-search     16384 keys: min=5ms max=6ms mean=5.52614ms stddev=0.166225
> > ok 33    Worst case delete      16384 keys: min=5ms max=11ms mean=7.39109ms stddev=0.271866
> > ok 34        Add a new key      16384 keys: min=2ms max=4ms mean=3.35799ms stddev=0.183248
> > ok 35 Remove random-search      16384 keys: min=5ms max=8ms mean=6.86078ms stddev=0.261931
> > ok 36         Remove async      16384 keys: min=0ms max=0ms mean=0.0302384ms stddev=0.0173892
> > # Totals: pass:36 fail:0 xfail:0 xpass:0 skip:0 error:0
> 
> From the output it's visible that the current simplified approach with
> linked-list of MKTs scales quite fine even for thousands of keys.
> And that also means that the majority of the time for delete is eaten by
> synchronize_rcu() [which I can confirm separately by tracing].
> 
> Signed-off-by: Dmitry Safonov <dima@arista.com>

Tested-by: Hangbin Liu <liuhangbin@gmail.com>
Hangbin Liu Dec. 18, 2023, 9:36 a.m. UTC | #2
Hi Dmitry,

I just found the patch set has been merged. Sorry for the noise of
tested-by replies. Please feel free to ignore the question for
unsigned-md5_ipv6. The test passed with 6.7.0-rc5.

Thanks
Hangbin

On Fri, Dec 15, 2023 at 02:36:14AM +0000, Dmitry Safonov wrote:
> Hi,
> 
> An essential part of any big kernel submissions is selftests.
> At the beginning of TCP-AO project, I made patches to fcnal-test.sh
> and nettest.c to have the benefits of easy refactoring, early noticing
> breakages, putting a moat around the code, documenting
> and designing uAPI.
> 
> While tests based on fcnal-test.sh/nettest.c provided initial testing*
> and were very easy to add, the pile of TCP-AO quickly grew out of
> one-binary + shell-script testing.
> 
> The design of the TCP-AO testing is a bit different than one-big
> selftest binary as I did previously in net/ipsec.c. I found it
> beneficial to avoid implementing a tests runner/scheduler and delegate
> it to the user or Makefile. The approach is very influenced
> by CRIU/ZDTM testing[1]: it provides a static library with helper
> functions and selftest binaries that create specific scenarios.
> I also tried to utilize kselftest.h.
> 
> test_init() function does all needed preparations. To not leave
> any traces after a selftest exists, it creates a network namespace
> and if the test wants to establish a TCP connection, a child netns.
> The parent and child netns have veth pair with proper ip addresses
> and routes set up. Both peers, the client and server are different
> pthreads. The treading model was chosen over forking mostly by easiness
> of cleanup on a failure: no need to search for children, handle SIGCHLD,
> make sure not to wait for a dead peer to perform anything, etc.
> Any thread that does exit() naturally kills the tests, sweet!
> The selftests are compiled currently in two variants: ipv4 and ipv6.
> Ipv4-mapped-ipv6 addresses might be a third variant to add, but it's not
> there in this version. As pretty much all tests are shared between two
> address families, most of the code can be shared, too. To differ in code
> what kind of test is running, Makefile supplies -DIPV6_TEST to compiler
> and ifdeffery in tests can do things that have to be different between
> address families. This is similar to TARGETS_C_BOTHBITS in x86 selftests
> and also to tests code sharing in CRIU/ZDTM.
> 
> The total number of tests is 832.
> From them rst_ipv{4,6} has currently one flaky subtest, that may fail:
> > not ok 9 client connection was not reset: 0
> I'll investigate what happens there. Also, unsigned-md5_ipv{4,6}
> are flaky because of netns counter checks: it doesn't expect that
> there may be retransmitted TCP segments from a previous sub-selftest.
> That will be fixed. Besides, key-management_ipv{4,6} has 3 sub-tests
> passing with XFAIL:
> > ok 15 # XFAIL listen() after current/rnext keys set: the socket has current/rnext keys: 100:200
> > ok 16 # XFAIL listen socket, delete current key from before listen(): failed to delete the key 100:100 -16
> > ok 17 # XFAIL listen socket, delete rnext key from before listen(): failed to delete the key 200:200 -16
> ...
> > # Totals: pass:117 fail:0 xfail:3 xpass:0 skip:0 error:0
> Those need some more kernel work to pass instead of xfail.
> 
> The overview of selftests (see the diffstat at the bottom):
> ├── lib
> │   ├── aolib.h
> │   │   The header for all selftests to include.
> │   ├── kconfig.c
> │   │   Kernel kconfig detector to SKIP tests that depend on something.
> │   ├── netlink.c
> │   │   Netlink helper to add/modify/delete VETH/IPs/routes/VRFs
> │   │   I considered just using libmnl, but this is around 400 lines
> │   │   and avoids selftests dependency on out-of-tree sources/packets.
> │   ├── proc.c
> │   │   SNMP/netstat procfs parser and the counters comparator.
> │   ├── repair.c
> │   │   Heavily influenced by libsoccr and reduced to minimum TCP
> │   │   socket checkpoint/repair. Shouldn't be used out of selftests,
> │   │   though.
> │   ├── setup.c
> │   │   All the needed netns/veth/ips/etc preparations for test init.
> │   ├── sock.c
> │   │   Socket helpers: {s,g}etsockopt()s/connect()/listen()/etc.
> │   └── utils.c
> │       Random stuff (a pun intended).
> ├── bench-lookups.c
> │   The only benchmark in selftests currently: checks how well TCP-AO
> │   setsockopt()s perform, depending on the amount of keys on a socket.
> ├── connect.c
> │   Trivial sample, can be used as a boilerplate to write a new test.
> ├── connect-deny.c
> │   More-or-less what could be expected for TCP-AO in fcnal-test.sh
> ├── icmps-accept.c -> icmps-discard.c
> ├── icmps-discard.c
> │   Verifies RFC5925 (7.8) by checking that TCP-AO connection can be
> │   broken if ICMPs are accepted and survives when ::accept_icmps = 0
> ├── key-management.c
> │   Key manipulations, rotations between randomized hashing algorithms
> │   and counter checks for those scenarios.
> ├── restore.c
> │   TCP_AO_REPAIR: verifies that a socket can be re-created without
> │   TCP-AO connection being interrupted.
> ├── rst.c
> │   As RST segments are signed on a separate code-path in kernel,
> │   verifies passive/active TCP send_reset().
> ├── self-connect.c
> │   Verifies that TCP self-connect and also simultaneous open work.
> ├── seq-ext.c
> │   Utilizes TCP_AO_REPAIR to check that on SEQ roll-over SNE
> │   increment is performed and segments with different SNEs fail to
> │   pass verification.
> ├── setsockopt-closed.c
> │   Checks that {s,g}etsockopt()s are extendable syscalls and common
> │   error-paths for them.
> └── unsigned-md5.c
>     Checks listen() socket for (non-)matching peers with: AO/MD5/none
>     keys. As well as their interaction with VRFs and AO_REQUIRED flag.
> 
> There are certainly more test scenarios that can be added, but even so,
> I'm pretty happy that this much of TCP-AO functionality and uAPIs got
> covered. These selftests were iteratively developed by me during TCP-AO
> kernel upstreaming and the resulting kernel patches would have been
> worse without having these tests. They provided the user-side
> perspective but also allowed safer refactoring with less possibility
> of introducing a regression. Now it's time to use them to dig
> a moat around the TCP-AO code!
> 
> There are also people from other network companies that work on TCP-AO
> (+testing), so sharing these selftests will allow them to contribute
> and may benefit from their efforts.
> 
> The following changes since commit c7402612e2e61b76177f22e6e7f705adcbecc6fe:
> 
>   Merge tag 'net-6.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net (2023-12-14 13:11:49 -0800)
> 
> are available in the Git repository at:
> 
>   git@github.com:0x7f454c46/linux.git tcp-ao-selftests-v1
> 
> for you to fetch changes up to 85dc9bc676985d81f9043fd9c3a506f30851597b:
> 
>   selftests/net: Add TCP-AO key-management test (2023-12-15 00:44:49 +0000)
> 
> ----------------------------------------------------------------
> 
> * Planning to submit basic TCP-AO tests to fcnal-test.sh/nettest.c
>   separately.
> 
> [1]: https://github.com/checkpoint-restore/criu/tree/criu-dev/test/zdtm/static
> 
> Signed-off-by: Dmitry Safonov <dima@arista.com>
> ---
> Dmitry Safonov (12):
>       selftests/net: Add TCP-AO library
>       selftests/net: Verify that TCP-AO complies with ignoring ICMPs
>       selftests/net: Add TCP-AO ICMPs accept test
>       selftests/net: Add a test for TCP-AO keys matching
>       selftests/net: Add test for TCP-AO add setsockopt() command
>       selftests/net: Add TCP-AO + TCP-MD5 + no sign listen socket tests
>       selftests/net: Add test/benchmark for removing MKTs
>       selftests/net: Add TCP_REPAIR TCP-AO tests
>       selftests/net: Add SEQ number extension test
>       selftests/net: Add TCP-AO RST test
>       selftests/net: Add TCP-AO selfconnect/simultaneous connect test
>       selftests/net: Add TCP-AO key-management test
> 
>  tools/testing/selftests/Makefile                   |    1 +
>  tools/testing/selftests/net/tcp_ao/.gitignore      |    2 +
>  tools/testing/selftests/net/tcp_ao/Makefile        |   59 +
>  tools/testing/selftests/net/tcp_ao/bench-lookups.c |  358 ++++++
>  tools/testing/selftests/net/tcp_ao/connect-deny.c  |  264 +++++
>  tools/testing/selftests/net/tcp_ao/connect.c       |   90 ++
>  tools/testing/selftests/net/tcp_ao/icmps-accept.c  |    1 +
>  tools/testing/selftests/net/tcp_ao/icmps-discard.c |  449 ++++++++
>  .../testing/selftests/net/tcp_ao/key-management.c  | 1180 ++++++++++++++++++++
>  tools/testing/selftests/net/tcp_ao/lib/aolib.h     |  605 ++++++++++
>  tools/testing/selftests/net/tcp_ao/lib/kconfig.c   |  148 +++
>  tools/testing/selftests/net/tcp_ao/lib/netlink.c   |  415 +++++++
>  tools/testing/selftests/net/tcp_ao/lib/proc.c      |  273 +++++
>  tools/testing/selftests/net/tcp_ao/lib/repair.c    |  254 +++++
>  tools/testing/selftests/net/tcp_ao/lib/setup.c     |  342 ++++++
>  tools/testing/selftests/net/tcp_ao/lib/sock.c      |  592 ++++++++++
>  tools/testing/selftests/net/tcp_ao/lib/utils.c     |   30 +
>  tools/testing/selftests/net/tcp_ao/restore.c       |  236 ++++
>  tools/testing/selftests/net/tcp_ao/rst.c           |  415 +++++++
>  tools/testing/selftests/net/tcp_ao/self-connect.c  |  197 ++++
>  tools/testing/selftests/net/tcp_ao/seq-ext.c       |  245 ++++
>  .../selftests/net/tcp_ao/setsockopt-closed.c       |  835 ++++++++++++++
>  tools/testing/selftests/net/tcp_ao/unsigned-md5.c  |  742 ++++++++++++
>  23 files changed, 7733 insertions(+)
> ---
> base-commit: c7402612e2e61b76177f22e6e7f705adcbecc6fe
> change-id: 20231213-tcp-ao-selftests-d0f323006667
> 
> Best regards,
> -- 
> Dmitry Safonov <dima@arista.com>
>