[bpf-next,v3,0/5] selftests/bpf: xsk selftests

Message ID 20201125183749.13797-1-weqaar.a.janjua@intel.com

Message

Weqaar Janjua Nov. 25, 2020, 6:37 p.m. UTC
This patch set adds AF_XDP selftests based on veth to selftests/bpf.

# Topology:
# ---------
#                 -----------
#               _ | Process | _
#              /  -----------  \
#             /        |        \
#            /         |         \
#      -----------     |     -----------
#      | Thread1 |     |     | Thread2 |
#      -----------     |     -----------
#           |          |          |
#      -----------     |     -----------
#      |  xskX   |     |     |  xskY   |
#      -----------     |     -----------
#           |          |          |
#      -----------     |     ----------
#      |  vethX  | --------- |  vethY |
#      -----------   peer    ----------
#           |          |          |
#      namespaceX      |     namespaceY

These selftests test AF_XDP SKB and Native/DRV modes using veth Virtual
Ethernet interfaces.

The test program contains two threads; each thread owns a single socket
with a unique UMEM. The threads validate in-order packet delivery and
packet content by sending packets to each other.

Prerequisites set up by the test_xsk.sh script:

   Set up veth interfaces as per the topology shown above:
   * set up two veth interfaces and one namespace
   ** veth<xxxx> in root namespace
   ** veth<yyyy> in af_xdp<yyyy> namespace
   ** namespace af_xdp<yyyy>
   * create a spec file veth.spec that includes this run-time configuration
   *** xxxx and yyyy are randomly generated 4-digit numbers used to avoid
       conflicts with any existing interface
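
For illustration, the framework patch quoted later in this thread writes the
spec file as a single line of the form "VETH0:VETH1,NS". The sketch below
shows one way to split such a line using only bash parameter expansion. The
function name parse_veth_spec and the SPEC_* variables are hypothetical and
not part of the patch set; the sample values are taken from the log output
further down.

```shell
#!/bin/bash
# Hypothetical helper (not part of the patch set): split a spec line of the
# form "VETH0:VETH1,NS" into three variables.
parse_veth_spec() {
	local line=$1
	local rest=${line#*:}      # everything after the first ':'
	SPEC_VETH0=${line%%:*}     # text before the first ':'
	SPEC_VETH1=${rest%%,*}     # text between ':' and ','
	SPEC_NS=${rest#*,}         # text after the ','
}

parse_veth_spec "ve1417:ve6185,af_xdp6185"
echo "${SPEC_VETH0} ${SPEC_VETH1} ${SPEC_NS}"
```

Parameter expansion avoids spawning cat and cut for each field; whether that
matters for a test script is a judgment call.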

The following tests are provided:

1. AF_XDP SKB mode
   Generic mode XDP is driver independent, used when the driver does
   not have support for XDP. Works on any netdevice using sockets and
   generic XDP path. XDP hook from netif_receive_skb().
   a. nopoll - soft-irq processing
   b. poll - using poll() syscall
   c. Socket Teardown
      Create a Tx and a Rx socket, Tx from one socket, Rx on another.
      Destroy both sockets, then repeat multiple times. Only nopoll mode
      is used.
   d. Bi-directional Sockets
      Configure sockets as bi-directional tx/rx sockets, set up fill
      and completion rings on each socket, tx/rx in both directions.
      Only nopoll mode is used.

2. AF_XDP DRV/Native mode
   Works on any netdevice with XDP_REDIRECT support, driver dependent.
   Processes packets before SKB allocation. Provides better performance
   than SKB. Driver hook available just after DMA of buffer descriptor.
   a. nopoll
   b. poll
   c. Socket Teardown
   d. Bi-directional Sockets
   * Only copy mode is supported because veth does not currently support
     zero-copy mode

Total tests: 8
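
The count of 8 follows from two modes (SKB, DRV) times four variants. A
minimal sketch that enumerates the combinations is shown below; the function
name list_xsk_tests is hypothetical, the labels mirror the ones printed in
the test output later in this thread, and the ordering is illustrative only.

```shell
#!/bin/bash
# Hypothetical enumeration of the test matrix: 2 modes x 4 variants = 8.
list_xsk_tests() {
	local variant mode
	for variant in "NOPOLL" "POLL" "SOCKET TEARDOWN" "BIDIRECTIONAL SOCKETS"; do
		for mode in SKB DRV; do
			echo "${mode} ${variant}"
		done
	done
}

list_xsk_tests
```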

Flow:
* Single process spawns two threads: Tx and Rx
* Each of these two threads attaches to a veth interface within its
  assigned namespace
* Each thread creates one AF_XDP socket connected to a unique umem
  for each veth interface
* Tx thread transmits 10k packets from veth<xxxx> to veth<yyyy>
* Rx thread verifies that all 10k packets were received and delivered
  in-order, and have the right content
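
The in-order check in the last step can be sketched in a few lines of bash.
This is only an illustration of the idea, not the validation code in
xdpxceiver.c: it assumes one sequence number per input line, starting at 0,
and the function name validate_in_order is hypothetical.

```shell
#!/bin/bash
# Hypothetical check: succeed only if stdin holds exactly the sequence
# numbers 0, 1, ..., N-1, one per line and in that order.
validate_in_order() {
	local expected_count=$1 expected=0 seq
	while read -r seq; do
		if [ "$seq" -ne "$expected" ]; then
			echo "out of order: expected ${expected}, got ${seq}" >&2
			return 1
		fi
		expected=$((expected + 1))
	done
	# All lines were in order; also require that none were missing.
	[ "$expected" -eq "$expected_count" ]
}

seq 0 9999 | validate_in_order 10000 && echo "in-order: PASS"
```

A gap, a reordering, or a short count all make the function return non-zero.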

v2 changes:
* Move selftests/xsk to selftests/bpf
* Remove Makefiles under selftests/xsk, and utilize selftests/bpf/Makefile

v3 changes:
* merge all test scripts test_xsk_*.sh into test_xsk.sh

This patch set requires applying patch from bpf stable tree:
commit 36ccdf85829a by Björn Töpel <bjorn.topel@intel.com>
[PATCH bpf v2] net, xsk: Avoid taking multiple skbuff references

Structure of the patch set:

Patch 1: This patch adds XSK Selftests framework under selftests/bpf
Patch 2: Adds tests: SKB poll and nopoll mode, and mac-ip-udp debug
Patch 3: Adds tests: DRV poll and nopoll mode
Patch 4: Adds tests: SKB and DRV Socket Teardown
Patch 5: Adds tests: SKB and DRV Bi-directional Sockets

Thanks: Weqaar

Weqaar Janjua (5):
  selftests/bpf: xsk selftests framework
  selftests/bpf: xsk selftests - SKB POLL, NOPOLL
  selftests/bpf: xsk selftests - DRV POLL, NOPOLL
  selftests/bpf: xsk selftests - Socket Teardown - SKB, DRV
  selftests/bpf: xsk selftests - Bi-directional Sockets - SKB, DRV

 tools/testing/selftests/bpf/Makefile       |    7 +-
 tools/testing/selftests/bpf/test_xsk.sh    |  238 +++++
 tools/testing/selftests/bpf/xdpxceiver.c   | 1056 ++++++++++++++++++++
 tools/testing/selftests/bpf/xdpxceiver.h   |  158 +++
 tools/testing/selftests/bpf/xsk_env.sh     |   28 +
 tools/testing/selftests/bpf/xsk_prereqs.sh |  119 +++
 6 files changed, 1604 insertions(+), 2 deletions(-)
 create mode 100755 tools/testing/selftests/bpf/test_xsk.sh
 create mode 100644 tools/testing/selftests/bpf/xdpxceiver.c
 create mode 100644 tools/testing/selftests/bpf/xdpxceiver.h
 create mode 100755 tools/testing/selftests/bpf/xsk_env.sh
 create mode 100755 tools/testing/selftests/bpf/xsk_prereqs.sh

Comments

Yonghong Song Nov. 26, 2020, 6:44 a.m. UTC | #1
On 11/25/20 10:37 AM, Weqaar Janjua wrote:
> This patch adds AF_XDP selftests framework under selftests/bpf.
>
> Topology:
> ---------
>       -----------           -----------
>       |  xskX   | --------- |  xskY   |
>       -----------     |     -----------
>            |          |          |
>       -----------     |     ----------
>       |  vethX  | --------- |  vethY |
>       -----------   peer    ----------
>            |          |          |
>       namespaceX      |     namespaceY
>
> Prerequisites setup by script test_xsk.sh:
>
>     Set up veth interfaces as per the topology shown ^^:
>     * setup two veth interfaces and one namespace
>     ** veth<xxxx> in root namespace
>     ** veth<yyyy> in af_xdp<xxxx> namespace
>     ** namespace af_xdp<xxxx>
>     * create a spec file veth.spec that includes this run-time configuration
>     *** xxxx and yyyy are randomly generated 4 digit numbers used to avoid
>         conflict with any existing interface
>     * tests the veth and xsk layers of the topology
>
> Signed-off-by: Weqaar Janjua <weqaar.a.janjua@intel.com>
> ---
>   tools/testing/selftests/bpf/Makefile       |   5 +-
>   tools/testing/selftests/bpf/test_xsk.sh    | 146 +++++++++++++++++++++
>   tools/testing/selftests/bpf/xsk_env.sh     |  11 ++
>   tools/testing/selftests/bpf/xsk_prereqs.sh | 119 +++++++++++++++++
>   4 files changed, 280 insertions(+), 1 deletion(-)
>   create mode 100755 tools/testing/selftests/bpf/test_xsk.sh
>   create mode 100755 tools/testing/selftests/bpf/xsk_env.sh
>   create mode 100755 tools/testing/selftests/bpf/xsk_prereqs.sh
>
> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
> index 3d5940cd110d..596ee5c27906 100644
> --- a/tools/testing/selftests/bpf/Makefile
> +++ b/tools/testing/selftests/bpf/Makefile
> @@ -46,7 +46,9 @@ endif
>
>   TEST_GEN_FILES =
>   TEST_FILES = test_lwt_ip_encap.o \
> -	test_tc_edt.o
> +	test_tc_edt.o \
> +	xsk_prereqs.sh \
> +	xsk_env.sh
>
>   # Order correspond to 'make run_tests' order
>   TEST_PROGS := test_kmod.sh \
> @@ -70,6 +72,7 @@ TEST_PROGS := test_kmod.sh \
>   	test_bpftool_build.sh \
>   	test_bpftool.sh \
>   	test_bpftool_metadata.sh \
> +	test_xsk.sh
>
>   TEST_PROGS_EXTENDED := with_addr.sh \
>   	with_tunnels.sh \
> diff --git a/tools/testing/selftests/bpf/test_xsk.sh b/tools/testing/selftests/bpf/test_xsk.sh
> new file mode 100755
> index 000000000000..1836f2d2f617
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/test_xsk.sh
> @@ -0,0 +1,146 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright(c) 2020 Intel Corporation, Weqaar Janjua <weqaar.a.janjua@intel.com>
> +
> +# AF_XDP selftests based on veth
> +#
> +# End-to-end AF_XDP over Veth test
> +#
> +# Topology:
> +# ---------
> +#      -----------           -----------
> +#      |  xskX   | --------- |  xskY   |
> +#      -----------     |     -----------
> +#           |          |          |
> +#      -----------     |     ----------
> +#      |  vethX  | --------- |  vethY |
> +#      -----------   peer    ----------
> +#           |          |          |
> +#      namespaceX      |     namespaceY
> +#
> +# AF_XDP is an address family optimized for high performance packet processing,
> +# it is XDP’s user-space interface.
> +#
> +# An AF_XDP socket is linked to a single UMEM which is a region of virtual
> +# contiguous memory, divided into equal-sized frames.
> +#
> +# Refer to AF_XDP Kernel Documentation for detailed information:
> +# https://www.kernel.org/doc/html/latest/networking/af_xdp.html
> +#
> +# Prerequisites setup by script:
> +#
> +#   Set up veth interfaces as per the topology shown ^^:
> +#   * setup two veth interfaces and one namespace
> +#   ** veth<xxxx> in root namespace
> +#   ** veth<yyyy> in af_xdp<xxxx> namespace
> +#   ** namespace af_xdp<xxxx>
> +#   * create a spec file veth.spec that includes this run-time configuration
> +#   *** xxxx and yyyy are randomly generated 4 digit numbers used to avoid
> +#       conflict with any existing interface
> +#   * tests the veth and xsk layers of the topology
> +#
> +# Kernel configuration:
> +# ---------------------
> +# See "config" file for recommended kernel config options.
> +#
> +# Turn on XDP sockets and veth support when compiling i.e.
> +# 	Networking support -->
> +# 		Networking options -->
> +# 			[ * ] XDP sockets
> +#
> +# Executing Tests:
> +# ----------------
> +# Must run with CAP_NET_ADMIN capability.
> +#
> +# Run (summary only):
> +#  sudo make summary=1 run_tests
> +#
> +# Run (full color-coded output):
> +#   sudo make colorconsole=1 run_tests
> +#
> +# Run (full output without color-coding):
> +#   sudo make run_tests
> +#
> +# Clean:
> +#  sudo make clean

Can I just run test_xsk.sh in the tools/testing/selftests/bpf/ directory?
This will be easier than the above for bpf developers. If it does not
work, I would recommend making it work.

I did that and there are some test failures.

root@arch-fb-vm1:~/net-next/net-next/tools/testing/selftests/bpf ./test_xsk.sh
[ 3857.572549] ip (2547) used greatest stack depth: 11864 bytes left
setting up ve1417: root: 192.168.222.1/30
setting up ve6185: af_xdp6185: 192.168.222.2/30
[ 3857.673408] IPv6: ADDRCONF(NETDEV_CHANGE): ve6185: link becomes ready
Spec file created: veth.spec
PREREQUISITES: [ PASS ]
# Interface found: ve1417
# Interface found: ve6185
# NS switched: af_xdp6185
1..1
# Interface [ve6185] vector [Rx]
# Interface [ve1417] vector [Tx]
# Sending 10000 packets on interface ve1417
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [0], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
SKB NOPOLL: [ FAIL ]
# Interface found: ve1417
# Interface found: ve6185
# NS switched: af_xdp6185
1..1
# Interface [ve6185] vector [Rx]
# Interface [ve1417] vector [Tx]
# Sending 10000 packets on interface ve1417
# End-of-tranmission frame received: PASS
# Received 10000 packets on interface ve6185
ok 1 PASS: SKB POLL
# Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
SKB POLL: [ PASS ]
# Interface found: ve1417
# Interface found: ve6185
# NS switched: af_xdp6185
1..1
# Interface [ve6185] vector [Rx]
# Interface [ve1417] vector [Tx]
# Sending 10000 packets on interface ve1417
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [95], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
DRV NOPOLL: [ FAIL ]
# Interface found: ve1417
# Interface found: ve6185
# NS switched: af_xdp6185
1..1
# Interface [ve6185] vector [Rx]
# Interface [ve1417] vector [Tx]
# Sending 10000 packets on interface ve1417
# End-of-tranmission frame received: PASS
# Received 10000 packets on interface ve6185
ok 1 PASS: DRV POLL
# Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
DRV POLL: [ PASS ]
# Interface found: ve1417
# Interface found: ve6185
# NS switched: af_xdp6185
1..1
# Creating socket
# Interface [ve6185] vector [Rx]
# Interface [ve1417] vector [Tx]
# Sending 10000 packets on interface ve1417
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [29], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
SKB SOCKET TEARDOWN: [ FAIL ]
# Interface found: ve1417
# Interface found: ve6185
# NS switched: af_xdp6185
1..1
# Creating socket
# Interface [ve6185] vector [Rx]
# Interface [ve1417] vector [Tx]
# Sending 10000 packets on interface ve1417
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [23], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
DRV SOCKET TEARDOWN: [ FAIL ]
# Interface found: ve1417
# Interface found: ve6185
# NS switched: af_xdp6185
1..1
# Creating socket
# Interface [ve6185] vector [Rx]
# Interface [ve1417] vector [Tx]
# Sending 10000 packets on interface ve1417
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [88], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
SKB BIDIRECTIONAL SOCKETS: [ FAIL ]
# Interface found: ve1417
# Interface found: ve6185
# NS switched: af_xdp6185
1..1
# Creating socket
# Interface [ve6185] vector [Rx]
# Interface [ve1417] vector [Tx]
# Sending 10000 packets on interface ve1417
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [1], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
DRV BIDIRECTIONAL SOCKETS: [ FAIL ]
cleaning up...
removing link ve6185
removing ns af_xdp6185
removing spec file: veth.spec
root@arch-fb-vm1:~/net-next/net-next/tools/testing/selftests/bpf

I do have the following:
    CONFIG_VETH=y
    CONFIG_XDP_SOCKETS=y

What other config options am I missing?
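
A quick way to compare a kernel config file against a list of required
options is sketched below. Nothing in this thread establishes that a missing
option is the cause of the failures; the sketch only automates the check for
the two options named above. The function name missing_configs is
hypothetical, and the location of the running kernel's config (for example
/boot/config-$(uname -r) or /proc/config.gz) depends on the system.

```shell
#!/bin/bash
# Hypothetical helper: print every option from the argument list that is
# not set to y or m in the given config file.
# Usage: missing_configs <config-file> <option>...
missing_configs() {
	local file=$1 opt
	shift
	for opt in "$@"; do
		grep -Eq "^${opt}=(y|m)$" "$file" || echo "$opt"
	done
}

# Demonstration on a throwaway config fragment.
cfg=$(mktemp)
printf 'CONFIG_VETH=y\n# CONFIG_XDP_SOCKETS is not set\n' > "$cfg"
missing_configs "$cfg" CONFIG_VETH CONFIG_XDP_SOCKETS
rm -f "$cfg"
```

The same function could be pointed at the "config" file under selftests/bpf
to build the list of options to check.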

BTW, I cherry-picked the following patch from the bpf tree in this experiment.
   commit e7f4a5919bf66e530e08ff352d9b78ed89574e6b (HEAD -> xsk)
   Author: Björn Töpel <bjorn.topel@intel.com>
   Date:   Mon Nov 23 18:56:00 2020 +0100

       net, xsk: Avoid taking multiple skbuff references

> +
> +. xsk_prereqs.sh
> +
> +TEST_NAME="PREREQUISITES"
> +
> +URANDOM=/dev/urandom
> +[ ! -e "${URANDOM}" ] && { echo "${URANDOM} not found. Skipping tests."; test_exit 1 1; }
> +
> +VETH0_POSTFIX=$(cat ${URANDOM} | tr -dc '0-9' | fold -w 256 | head -n 1 | head --bytes 4)
> +VETH0=ve${VETH0_POSTFIX}
> +VETH1_POSTFIX=$(cat ${URANDOM} | tr -dc '0-9' | fold -w 256 | head -n 1 | head --bytes 4)
> +VETH1=ve${VETH1_POSTFIX}
> +NS1=af_xdp${VETH1_POSTFIX}
> +IPADDR_VETH0=192.168.222.1/30
> +IPADDR_VETH1=192.168.222.2/30
> +MTU=1500
> +
> +setup_vethPairs() {
> +	echo "setting up ${VETH0}: root: ${IPADDR_VETH0}"
> +	ip netns add ${NS1}
> +	ip link add ${VETH0} type veth peer name ${VETH1}
> +	ip addr add dev ${VETH0} ${IPADDR_VETH0}
> +	echo "setting up ${VETH1}: ${NS1}: ${IPADDR_VETH1}"
> +	ip link set ${VETH1} netns ${NS1}
> +	ip netns exec ${NS1} ip addr add dev ${VETH1} ${IPADDR_VETH1}
> +	ip netns exec ${NS1} ip link set ${VETH1} mtu ${MTU}
> +	ip netns exec ${NS1} ip link set ${VETH1} up
> +	ip link set ${VETH0} mtu ${MTU}
> +	ip link set ${VETH0} up
> +}
> +
> +validate_root_exec
> +validate_veth_support ${VETH0}
> +validate_configs
> +setup_vethPairs
> +
> +retval=$?
> +if [ $retval -ne 0 ]; then
> +	test_status $retval "${TEST_NAME}"
> +	cleanup_exit ${VETH0} ${VETH1} ${NS1}
> +	exit $retval
> +fi
> +
> +echo "${VETH0}:${VETH1},${NS1}" > ${SPECFILE}
> +
> +echo "Spec file created: ${SPECFILE}"
> +
> +test_status $retval "${TEST_NAME}"
> +
> +## START TESTS
> +
> +statusList=()
> +
> +### TEST 1
> +TEST_NAME="XSK FRAMEWORK"
> +
> +echo "Switching interfaces [${VETH0}, ${VETH1}] to XDP Generic mode"
> +vethXDPgeneric ${VETH0} ${VETH1} ${NS1}
> +
> +retval=$?
> +if [ $retval -eq 0 ]; then
> +	echo "Switching interfaces [${VETH0}, ${VETH1}] to XDP Native mode"
> +	vethXDPnative ${VETH0} ${VETH1} ${NS1}
> +fi
> +
> +retval=$?
> +test_status $retval "${TEST_NAME}"
> +statusList+=($retval)
> +
> +## END TESTS
> +
> +cleanup_exit ${VETH0} ${VETH1} ${NS1}
> +
> +for _status in "${statusList[@]}"
> +do
> +	if [ $_status -ne 0 ]; then
> +		test_exit $ksft_fail 0
> +	fi
> +done
> +
> +test_exit $ksft_pass 0
> diff --git a/tools/testing/selftests/bpf/xsk_env.sh b/tools/testing/selftests/bpf/xsk_env.sh
> new file mode 100755
> index 000000000000..2c41b4284cae
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/xsk_env.sh
> @@ -0,0 +1,11 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright(c) 2020 Intel Corporation.
> +
> +. xsk_prereqs.sh
> +
> +validate_veth_spec_file
> +
> +VETH0=$(cat ${SPECFILE} | cut -d':' -f 1)
> +VETH1=$(cat ${SPECFILE} | cut -d':' -f 2 | cut -d',' -f 1)
> +NS1=$(cat ${SPECFILE} | cut -d':' -f 2 | cut -d',' -f 2)
> diff --git a/tools/testing/selftests/bpf/xsk_prereqs.sh b/tools/testing/selftests/bpf/xsk_prereqs.sh
> new file mode 100755
> index 000000000000..694c5f5ab5e3
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/xsk_prereqs.sh
> @@ -0,0 +1,119 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright(c) 2020 Intel Corporation.
> +
> +ksft_pass=0
> +ksft_fail=1
> +ksft_xfail=2
> +ksft_xpass=3
> +ksft_skip=4
> +
> +GREEN='\033[0;92m'
> +YELLOW='\033[0;93m'
> +RED='\033[0;31m'
> +NC='\033[0m'
> +STACK_LIM=131072
> +SPECFILE=veth.spec
> +
> +validate_root_exec()
> +{
> +	msg="skip all tests:"
> +	if [ $UID != 0 ]; then
> +		echo $msg must be run as root >&2
> +		test_exit $ksft_fail 2
> +	else
> +		return $ksft_pass
> +	fi
> +}
> +
> +validate_veth_support()
> +{
> +	msg="skip all tests:"
> +	if [ $(ip link add $1 type veth 2>/dev/null; echo $?;) != 0 ]; then
> +		echo $msg veth kernel support not available >&2
> +		test_exit $ksft_skip 1
> +	else
> +		ip link del $1
> +		return $ksft_pass
> +	fi
> +}
> +
> +validate_veth_spec_file()
> +{
> +	if [ ! -f ${SPECFILE} ]; then
> +		test_exit $ksft_skip 1
> +	fi
> +}
> +
> +test_status()
> +{
> +	statusval=$1
> +	if [ -n "${colorconsole+set}" ]; then
> +		if [ $statusval -eq 2 ]; then
> +			echo -e "${YELLOW}$2${NC}: [ ${RED}FAIL${NC} ]"
> +		elif [ $statusval -eq 1 ]; then
> +			echo -e "${YELLOW}$2${NC}: [ ${RED}SKIPPED${NC} ]"
> +		elif [ $statusval -eq 0 ]; then
> +			echo -e "${YELLOW}$2${NC}: [ ${GREEN}PASS${NC} ]"
> +		fi
> +	else
> +		if [ $statusval -eq 2 ]; then
> +			echo -e "$2: [ FAIL ]"
> +		elif [ $statusval -eq 1 ]; then
> +			echo -e "$2: [ SKIPPED ]"
> +		elif [ $statusval -eq 0 ]; then
> +			echo -e "$2: [ PASS ]"
> +		fi
> +	fi
> +}
> +
> +test_exit()
> +{
> +	retval=$1
> +	if [ $2 -ne 0 ]; then
> +		test_status $2 $(basename $0)
> +	fi
> +	exit $retval
> +}
> +
> +clear_configs()
> +{
> +	if [ $(ip netns show | grep $3 &>/dev/null; echo $?;) == 0 ]; then
> +		[ $(ip netns exec $3 ip link show $2 &>/dev/null; echo $?;) == 0 ] &&
> +			{ echo "removing link $2"; ip netns exec $3 ip link del $2; }
> +		echo "removing ns $3"
> +		ip netns del $3
> +	fi
> +	#Once we delete a veth pair node, the entire veth pair is removed,
> +	#this is just to be cautious just incase the NS does not exist then
> +	#veth node inside NS won't get removed so we explicitly remove it
> +	[ $(ip link show $1 &>/dev/null; echo $?;) == 0 ] &&
> +		{ echo "removing link $1"; ip link del $1; }
> +	if [ -f ${SPECFILE} ]; then
> +		echo "removing spec file:" ${SPECFILE}
> +		rm -f ${SPECFILE}
> +	fi
> +}
> +
> +cleanup_exit()
> +{
> +	echo "cleaning up..."
> +	clear_configs $1 $2 $3
> +}
> +
> +validate_configs()
> +{
> +	[ ! $(type -P ip) ] && { echo "'ip' not found. Skipping tests."; test_exit $ksft_skip 1; }
> +}
> +
> +vethXDPgeneric()
> +{
> +	ip link set dev $1 xdpdrv off
> +	ip netns exec $3 ip link set dev $2 xdpdrv off
> +}
> +
> +vethXDPnative()
> +{
> +	ip link set dev $1 xdpgeneric off
> +	ip netns exec $3 ip link set dev $2 xdpgeneric off
> +}
>
Björn Töpel Nov. 26, 2020, 9:01 a.m. UTC | #2
On 2020-11-26 07:44, Yonghong Song wrote:
>
[...]
>
> What other config options am I missing?
>
> BTW, I cherry-picked the following patch from the bpf tree in this experiment.
>    commit e7f4a5919bf66e530e08ff352d9b78ed89574e6b (HEAD -> xsk)
>    Author: Björn Töpel <bjorn.topel@intel.com>
>    Date:   Mon Nov 23 18:56:00 2020 +0100
>
>        net, xsk: Avoid taking multiple skbuff references
>

Hmm, I'm getting an oops, unless I cherry-pick:

36ccdf85829a ("net, xsk: Avoid taking multiple skbuff references")

*AND*

537cf4e3cc2f ("xsk: Fix umem cleanup bug at socket destruct")

from bpf/master.


Björn
Weqaar Janjua Nov. 26, 2020, 9:22 p.m. UTC | #3
On Thu, 26 Nov 2020 at 09:01, Björn Töpel <bjorn.topel@intel.com> wrote:
>
> On 2020-11-26 07:44, Yonghong Song wrote:
> >
> [...]
> >
> > What other config options am I missing?
> >
> > BTW, I cherry-picked the following patch from the bpf tree in this experiment.
> >    commit e7f4a5919bf66e530e08ff352d9b78ed89574e6b (HEAD -> xsk)
> >    Author: Björn Töpel <bjorn.topel@intel.com>
> >    Date:   Mon Nov 23 18:56:00 2020 +0100
> >
> >        net, xsk: Avoid taking multiple skbuff references
> >
>
> Hmm, I'm getting an oops, unless I cherry-pick:
>
> 36ccdf85829a ("net, xsk: Avoid taking multiple skbuff references")
>
> *AND*
>
> 537cf4e3cc2f ("xsk: Fix umem cleanup bug at socket destruct")
>
> from bpf/master.
>

Same as Björn's findings above: with the second patch, 537cf4e3cc2f,
additionally applied, all tests pass for me:

PREREQUISITES: [ PASS ]
SKB NOPOLL: [ PASS ]
SKB POLL: [ PASS ]
DRV NOPOLL: [ PASS ]
DRV POLL: [ PASS ]
SKB SOCKET TEARDOWN: [ PASS ]
DRV SOCKET TEARDOWN: [ PASS ]
SKB BIDIRECTIONAL SOCKETS: [ PASS ]
DRV BIDIRECTIONAL SOCKETS: [ PASS ]

With the first patch alone, the kernel panics as soon as we enter
DRV/Native NOPOLL mode, whereas in your case the NOPOLL tests were failing
with packets being *lost* as per the seqnum mismatch.

Can you please test this out with both patches and let us know?

> Can I just run test_xsk.sh in the tools/testing/selftests/bpf/ directory?
> This will be easier than the above for bpf developers. If it does not
> work, I would recommend making it work.
>
Yes, test_xsk.sh is self-contained; I will update the instructions in there with v4.

Thanks,
/Weqaar
>
> Björn
Weqaar Janjua Nov. 27, 2020, 5:54 p.m. UTC | #4
On Fri, 27 Nov 2020 at 04:19, Yonghong Song <yhs@fb.com> wrote:
>
> On 11/26/20 1:22 PM, Weqaar Janjua wrote:
> > On Thu, 26 Nov 2020 at 09:01, Björn Töpel <bjorn.topel@intel.com> wrote:
> >>
> >> On 2020-11-26 07:44, Yonghong Song wrote:
> >>>
> >> [...]
> >>>
> >>> What other config options am I missing?
> >>>
> >>> BTW, I cherry-picked the following patch from the bpf tree in this experiment.
> >>>     commit e7f4a5919bf66e530e08ff352d9b78ed89574e6b (HEAD -> xsk)
> >>>     Author: Björn Töpel <bjorn.topel@intel.com>
> >>>     Date:   Mon Nov 23 18:56:00 2020 +0100
> >>>
> >>>         net, xsk: Avoid taking multiple skbuff references
> >>>
> >>
> >> Hmm, I'm getting an oops, unless I cherry-pick:
> >>
> >> 36ccdf85829a ("net, xsk: Avoid taking multiple skbuff references")
> >>
> >> *AND*
> >>
> >> 537cf4e3cc2f ("xsk: Fix umem cleanup bug at socket destruct")
> >>
> >> from bpf/master.
> >>
> >
> > Same as Björn's findings above: with the second patch, 537cf4e3cc2f,
> > additionally applied, all tests pass for me:
> >
> > PREREQUISITES: [ PASS ]
> > SKB NOPOLL: [ PASS ]
> > SKB POLL: [ PASS ]
> > DRV NOPOLL: [ PASS ]
> > DRV POLL: [ PASS ]
> > SKB SOCKET TEARDOWN: [ PASS ]
> > DRV SOCKET TEARDOWN: [ PASS ]
> > SKB BIDIRECTIONAL SOCKETS: [ PASS ]
> > DRV BIDIRECTIONAL SOCKETS: [ PASS ]
> >
> > With the first patch alone, the kernel panics as soon as we enter
> > DRV/Native NOPOLL mode, whereas in your case the NOPOLL tests were failing
> > with packets being *lost* as per the seqnum mismatch.
> >
> > Can you please test this out with both patches and let us know?
>
> I applied both the above patches in bpf-next as well as this patch set,
> I still see failures. I am attaching my config file. Maybe you can take
> a look at what is the issue.
>

Thanks for the config, can you please confirm the compiler version,
and resource limits i.e. stack size, memory, etc.?

Only NOPOLL tests are failing for you as I see it, do the same tests
fail every time?
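
One way to answer the "every time" question is to save the output of each
run and extract the names of the failing tests. The sketch below assumes
only the "NAME: [ FAIL ]" summary lines shown in the logs in this thread;
the function name failed_tests is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the names of tests whose summary line on
# stdin reads "NAME: [ FAIL ]" (trailing whitespace is tolerated).
failed_tests() {
	sed -n 's/^\(.*\): \[ FAIL \][[:space:]]*$/\1/p'
}

failed_tests <<'EOF'
PREREQUISITES: [ PASS ]
SKB NOPOLL: [ FAIL ]
SKB POLL: [ PASS ]
DRV NOPOLL: [ FAIL ]
EOF
```

Piping the combined output of several runs through failed_tests and then
through "sort | uniq -c" would give a per-test failure count.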

I will need to spend some time debugging this to have a fix.

Thanks,
/Weqaar

> >
> >> Can I just run test_xsk.sh in the tools/testing/selftests/bpf/ directory?
> >> This will be easier than the above for bpf developers. If it does not
> >> work, I would recommend making it work.
> >>
> > Yes, test_xsk.sh is self-contained; I will update the instructions in there with v4.
>
> That will be great. Thanks!
>
> >
> > Thanks,
> > /Weqaar
> >>
> >> Björn
Yonghong Song Nov. 28, 2020, 3:13 a.m. UTC | #5
On 11/27/20 9:54 AM, Weqaar Janjua wrote:
> On Fri, 27 Nov 2020 at 04:19, Yonghong Song <yhs@fb.com> wrote:
>>
>> On 11/26/20 1:22 PM, Weqaar Janjua wrote:
>>> On Thu, 26 Nov 2020 at 09:01, Björn Töpel <bjorn.topel@intel.com> wrote:
>>>>
>>>> On 2020-11-26 07:44, Yonghong Song wrote:
>>>>>
>>>> [...]
>>>>>
>>>>> What other config options am I missing?
>>>>>
>>>>> BTW, I cherry-picked the following patch from the bpf tree in this experiment.
>>>>>      commit e7f4a5919bf66e530e08ff352d9b78ed89574e6b (HEAD -> xsk)
>>>>>      Author: Björn Töpel <bjorn.topel@intel.com>
>>>>>      Date:   Mon Nov 23 18:56:00 2020 +0100
>>>>>
>>>>>          net, xsk: Avoid taking multiple skbuff references
>>>>>
>>>>
>>>> Hmm, I'm getting an oops, unless I cherry-pick:
>>>>
>>>> 36ccdf85829a ("net, xsk: Avoid taking multiple skbuff references")
>>>>
>>>> *AND*
>>>>
>>>> 537cf4e3cc2f ("xsk: Fix umem cleanup bug at socket destruct")
>>>>
>>>> from bpf/master.
>>>>
>>>
>>> Same as Björn's findings above: with the second patch, 537cf4e3cc2f,
>>> additionally applied, all tests pass for me:
>>>
>>> PREREQUISITES: [ PASS ]
>>> SKB NOPOLL: [ PASS ]
>>> SKB POLL: [ PASS ]
>>> DRV NOPOLL: [ PASS ]
>>> DRV POLL: [ PASS ]
>>> SKB SOCKET TEARDOWN: [ PASS ]
>>> DRV SOCKET TEARDOWN: [ PASS ]
>>> SKB BIDIRECTIONAL SOCKETS: [ PASS ]
>>> DRV BIDIRECTIONAL SOCKETS: [ PASS ]
>>>
>>> With the first patch alone, the kernel panics as soon as we enter
>>> DRV/Native NOPOLL mode, whereas in your case the NOPOLL tests were failing
>>> with packets being *lost* as per the seqnum mismatch.
>>>
>>> Can you please test this out with both patches and let us know?
>>
>> I applied both the above patches in bpf-next as well as this patch set,
>> I still see failures. I am attaching my config file. Maybe you can take
>> a look at what is the issue.
>>
> Thanks for the config, can you please confirm the compiler version,
> and resource limits i.e. stack size, memory, etc.?


root@arch-fb-vm1:~/net-next/net-next/tools/testing/selftests/bpf ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 15587
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 15587
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

compiler: gcc 8.2

>
> Only NOPOLL tests are failing for you as I see it, do the same tests
> fail every time?

In my case, with above two bpf patches applied as well, I got:
$ ./test_xsk.sh
setting up ve9127: root: 192.168.222.1/30
setting up ve4520: af_xdp4520: 192.168.222.2/30
Spec file created: veth.spec
PREREQUISITES: [ PASS ]
# Interface found: ve9127
# Interface found: ve4520
# NS switched: af_xdp4520
1..1
# Interface [ve4520] vector [Rx]
# Interface [ve9127] vector [Tx]
# Sending 10000 packets on interface ve9127
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [59], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
SKB NOPOLL: [ FAIL ]
# Interface found: ve9127
# Interface found: ve4520
# NS switched: af_xdp4520
1..1
# Interface [ve4520] vector [Rx]
# Interface [ve9127] vector [Tx]
# Sending 10000 packets on interface ve9127
# End-of-tranmission frame received: PASS
# Received 10000 packets on interface ve4520
ok 1 PASS: SKB POLL
# Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
SKB POLL: [ PASS ]
# Interface found: ve9127
# Interface found: ve4520
# NS switched: af_xdp4520
1..1
# Interface [ve4520] vector [Rx]
# Interface [ve9127] vector [Tx]
# Sending 10000 packets on interface ve9127
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [153], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
DRV NOPOLL: [ FAIL ]
# Interface found: ve9127
# Interface found: ve4520
# NS switched: af_xdp4520
1..1
# Interface [ve4520] vector [Rx]
# Interface [ve9127] vector [Tx]
# Sending 10000 packets on interface ve9127
# End-of-tranmission frame received: PASS
# Received 10000 packets on interface ve4520
ok 1 PASS: DRV POLL
# Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
DRV POLL: [ PASS ]
# Interface found: ve9127
# Interface found: ve4520
# NS switched: af_xdp4520
1..1
# Creating socket
# Interface [ve4520] vector [Rx]
# Interface [ve9127] vector [Tx]
# Sending 10000 packets on interface ve9127
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [54], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
SKB SOCKET TEARDOWN: [ FAIL ]
# Interface found: ve9127
# Interface found: ve4520
# NS switched: af_xdp4520
1..1
# Creating socket
# Interface [ve4520] vector [Rx]
# Interface [ve9127] vector [Tx]
# Sending 10000 packets on interface ve9127
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [0], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
DRV SOCKET TEARDOWN: [ FAIL ]
# Interface found: ve9127
# Interface found: ve4520
# NS switched: af_xdp4520
1..1
# Creating socket
# Interface [ve4520] vector [Rx]
# Interface [ve9127] vector [Tx]
# Sending 10000 packets on interface ve9127
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [64], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
SKB BIDIRECTIONAL SOCKETS: [ FAIL ]
# Interface found: ve9127
# Interface found: ve4520
# NS switched: af_xdp4520
1..1
# Creating socket
# Interface [ve4520] vector [Rx]
# Interface [ve9127] vector [Tx]
# Sending 10000 packets on interface ve9127
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [83], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
DRV BIDIRECTIONAL SOCKETS: [ FAIL ]
cleaning up...
removing link ve4520
removing ns af_xdp4520
removing spec file: veth.spec

In the second run, one test that previously passed (DRV POLL) now fails.

./test_xsk.sh
setting up ve2458: root: 192.168.222.1/30
setting up ve4468: af_xdp4468: 192.168.222.2/30
[  286.597111] IPv6: ADDRCONF(NETDEV_CHANGE): ve4468: link becomes ready
Spec file created: veth.spec
PREREQUISITES: [ PASS ]
# Interface found: ve2458
# Interface found: ve4468
# NS switched: af_xdp4468
1..1
# Interface [ve4468] vector [Rx]
# Interface [ve2458] vector [Tx]
# Sending 10000 packets on interface ve2458
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [67], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
SKB NOPOLL: [ FAIL ]
# Interface found: ve2458
# Interface found: ve4468
# NS switched: af_xdp4468
1..1
# Interface [ve4468] vector [Rx]
# Interface [ve2458] vector [Tx]
# Sending 10000 packets on interface ve2458
# End-of-tranmission frame received: PASS
# Received 10000 packets on interface ve4468
ok 1 PASS: SKB POLL
# Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
SKB POLL: [ PASS ]
# Interface found: ve2458
# Interface found: ve4468
# NS switched: af_xdp4468
1..1
# Interface [ve4468] vector [Rx]
# Interface [ve2458] vector [Tx]
# Sending 10000 packets on interface ve2458
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [191], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
DRV NOPOLL: [ FAIL ]
# Interface found: ve2458
# Interface found: ve4468
# NS switched: af_xdp4468
1..1
# Interface [ve4468] vector [Rx]
# Interface [ve2458] vector [Tx]
# Sending 10000 packets on interface ve2458
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [0], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
DRV POLL: [ FAIL ]
# Interface found: ve2458
# Interface found: ve4468
# NS switched: af_xdp4468
1..1
# Creating socket
# Interface [ve4468] vector [Rx]
# Interface [ve2458] vector [Tx]
# Sending 10000 packets on interface ve2458
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [0], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
SKB SOCKET TEARDOWN: [ FAIL ]
# Interface found: ve2458
# Interface found: ve4468
# NS switched: af_xdp4468
1..1
# Creating socket
# Interface [ve4468] vector [Rx]
# Interface [ve2458] vector [Tx]
# Sending 10000 packets on interface ve2458
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [171], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
DRV SOCKET TEARDOWN: [ FAIL ]
# Interface found: ve2458
# Interface found: ve4468
# NS switched: af_xdp4468
1..1
# Creating socket
# Interface [ve4468] vector [Rx]
# Interface [ve2458] vector [Tx]
# Sending 10000 packets on interface ve2458
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [124], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
SKB BIDIRECTIONAL SOCKETS: [ FAIL ]
# Interface found: ve2458
# Interface found: ve4468
# NS switched: af_xdp4468
1..1
# Creating socket
# Interface [ve4468] vector [Rx]
# Interface [ve2458] vector [Tx]
# Sending 10000 packets on interface ve2458
not ok 1 ERROR: [worker_pkt_validate] prev_pkt [195], payloadseqnum [0]
# Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
DRV BIDIRECTIONAL SOCKETS: [ FAIL ]
cleaning up...
removing link ve4468
removing ns af_xdp4468
removing spec file: veth.spec

> 

> I will need to spend some time debugging this to have a fix.


Thanks.

> 

> Thanks,

> /Weqaar

> 

>>>

>>>> Can I just run test_xsk.sh at tools/testing/selftests/bpf/ directory?

>>>> This will be easier than the above for bpf developers. If it does not

>>>> work, I would like to recommend to make it work.

>>>>

>>> yes, test_xsk.sh is self-contained, will update the instructions in there with v4.

>>

>> That will be great. Thanks!

>>

>>>

>>> Thanks,

>>> /Weqaar

>>>>

>>>> Björn
Weqaar Janjua Dec. 7, 2020, 9:55 p.m. UTC | #6
On Sat, 28 Nov 2020 at 03:13, Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 11/27/20 9:54 AM, Weqaar Janjua wrote:
> > On Fri, 27 Nov 2020 at 04:19, Yonghong Song <yhs@fb.com> wrote:
> >>
> >>
> >>
> >> On 11/26/20 1:22 PM, Weqaar Janjua wrote:
> >>> On Thu, 26 Nov 2020 at 09:01, Björn Töpel <bjorn.topel@intel.com> wrote:
> >>>>
> >>>> On 2020-11-26 07:44, Yonghong Song wrote:
> >>>>>
> >>>> [...]
> >>>>>
> >>>>> What other configures I am missing?
> >>>>>
> >>>>> BTW, I cherry-picked the following pick from bpf tree in this experiment.
> >>>>>      commit e7f4a5919bf66e530e08ff352d9b78ed89574e6b (HEAD -> xsk)
> >>>>>      Author: Björn Töpel <bjorn.topel@intel.com>
> >>>>>      Date:   Mon Nov 23 18:56:00 2020 +0100
> >>>>>
> >>>>>          net, xsk: Avoid taking multiple skbuff references
> >>>>>
> >>>>
> >>>> Hmm, I'm getting an oops, unless I cherry-pick:
> >>>>
> >>>> 36ccdf85829a ("net, xsk: Avoid taking multiple skbuff references")
> >>>>
> >>>> *AND*
> >>>>
> >>>> 537cf4e3cc2f ("xsk: Fix umem cleanup bug at socket destruct")
> >>>>
> >>>> from bpf/master.
> >>>>
> >>>
> >>> Same as Bjorn's findings ^^^, additionally applying the second patch
> >>> 537cf4e3cc2f [PASS] all tests for me
> >>>
> >>> PREREQUISITES: [ PASS ]
> >>> SKB NOPOLL: [ PASS ]
> >>> SKB POLL: [ PASS ]
> >>> DRV NOPOLL: [ PASS ]
> >>> DRV POLL: [ PASS ]
> >>> SKB SOCKET TEARDOWN: [ PASS ]
> >>> DRV SOCKET TEARDOWN: [ PASS ]
> >>> SKB BIDIRECTIONAL SOCKETS: [ PASS ]
> >>> DRV BIDIRECTIONAL SOCKETS: [ PASS ]
> >>>
> >>> With the first patch alone, as soon as we enter DRV/Native NOPOLL mode
> >>> kernel panics, whereas in your case NOPOLL tests were failing with
> >>> packets being *lost* as per seqnum mismatch.
> >>>
> >>> Can you please test this out with both patches and let us know?
> >>
> >> I applied both the above patches in bpf-next as well as this patch set,
> >> I still see failures. I am attaching my config file. Maybe you can take
> >> a look at what is the issue.
> >>
> > Thanks for the config, can you please confirm the compiler version,
> > and resource limits i.e. stack size, memory, etc.?
>
> root@arch-fb-vm1:~/net-next/net-next/tools/testing/selftests/bpf ulimit -a
> core file size          (blocks, -c) unlimited
> data seg size           (kbytes, -d) unlimited
> scheduling priority             (-e) 0
> file size               (blocks, -f) unlimited
> pending signals                 (-i) 15587
> max locked memory       (kbytes, -l) unlimited
> max memory size         (kbytes, -m) unlimited
> open files                      (-n) 1024
> pipe size            (512 bytes, -p) 8
> POSIX message queues     (bytes, -q) 819200
> real-time priority              (-r) 0
> stack size              (kbytes, -s) 8192
> cpu time               (seconds, -t) unlimited
> max user processes              (-u) 15587
> virtual memory          (kbytes, -v) unlimited
> file locks                      (-x) unlimited
>
> compiler: gcc 8.2
>
> >
> > Only NOPOLL tests are failing for you as I see it, do the same tests
> > fail every time?
>
> In my case, with above two bpf patches applied as well, I got:
> $ ./test_xsk.sh
> setting up ve9127: root: 192.168.222.1/30
>
> setting up ve4520: af_xdp4520: 192.168.222.2/30
>
> Spec file created: veth.spec
>
> PREREQUISITES: [ PASS ]
>
> # Interface found: ve9127
>
> # Interface found: ve4520
>
> # NS switched: af_xdp4520
>
> 1..1
>
> # Interface [ve4520] vector [Rx]
>
> # Interface [ve9127] vector [Tx]
>
> # Sending 10000 packets on interface ve9127
>
> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [59], payloadseqnum [0]
>
> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
>
> SKB NOPOLL: [ FAIL ]
>
> # Interface found: ve9127
>
> # Interface found: ve4520
>
> # NS switched: af_xdp4520
> # NS switched: af_xdp4520
>
> 1..1
> # Interface [ve4520] vector [Rx]
> # Interface [ve9127] vector [Tx]
> # Sending 10000 packets on interface ve9127
> # End-of-tranmission frame received: PASS
> # Received 10000 packets on interface ve4520
> ok 1 PASS: SKB POLL
> # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
> SKB POLL: [ PASS ]
> # Interface found: ve9127
> # Interface found: ve4520
> # NS switched: af_xdp4520
> 1..1
> # Interface [ve4520] vector [Rx]
> # Interface [ve9127] vector [Tx]
> # Sending 10000 packets on interface ve9127
> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [153], payloadseqnum [0]
> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
> DRV NOPOLL: [ FAIL ]
> # Interface found: ve9127
> # Interface found: ve4520
> # NS switched: af_xdp4520
> 1..1
> # Interface [ve4520] vector [Rx]
> # Interface [ve9127] vector [Tx]
> # Sending 10000 packets on interface ve9127
> # End-of-tranmission frame received: PASS
> # Received 10000 packets on interface ve4520
> ok 1 PASS: DRV POLL
> # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
> DRV POLL: [ PASS ]
> # Interface found: ve9127
> # Interface found: ve4520
> # NS switched: af_xdp4520
> 1..1
> # Creating socket
> # Interface [ve4520] vector [Rx]
> # Interface [ve9127] vector [Tx]
> # Sending 10000 packets on interface ve9127
> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [54], payloadseqnum [0]
> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
> SKB SOCKET TEARDOWN: [ FAIL ]
> # Interface found: ve9127
> # Interface found: ve4520
> # NS switched: af_xdp4520
> 1..1
> # Creating socket
> # Interface [ve4520] vector [Rx]
> # Interface [ve9127] vector [Tx]
> # Sending 10000 packets on interface ve9127
> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [0], payloadseqnum [0]
> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
> DRV SOCKET TEARDOWN: [ FAIL ]
> # Interface found: ve9127
> # Interface found: ve4520
> # NS switched: af_xdp4520
> 1..1
> # Creating socket
> # Interface [ve4520] vector [Rx]
> # Interface [ve9127] vector [Tx]
> # Sending 10000 packets on interface ve9127
> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [64], payloadseqnum [0]
> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
> SKB BIDIRECTIONAL SOCKETS: [ FAIL ]
> # Interface found: ve9127
> # Interface found: ve4520
> # NS switched: af_xdp4520
> 1..1
> # Creating socket
> # Interface [ve4520] vector [Rx]
> # Interface [ve9127] vector [Tx]
> # Sending 10000 packets on interface ve9127
> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [83], payloadseqnum [0]
> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
> DRV BIDIRECTIONAL SOCKETS: [ FAIL ]
> cleaning up...
> removing link ve4520
> removing ns af_xdp4520
> removing spec file: veth.spec
>
> Second runs have one previous success becoming failure.
>
> ./test_xsk.sh
> setting up ve2458: root: 192.168.222.1/30
>
> setting up ve4468: af_xdp4468: 192.168.222.2/30
>
> [  286.597111] IPv6: ADDRCONF(NETDEV_CHANGE): ve4468: link becomes ready
>
> Spec file created: veth.spec
>
> PREREQUISITES: [ PASS ]
>
> # Interface found: ve2458
>
> # Interface found: ve4468
>
> # NS switched: af_xdp4468
>
> 1..1
>
> # Interface [ve4468] vector [Rx]
>
> # Interface [ve2458] vector [Tx]
>
> # Sending 10000 packets on interface ve2458
>
> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [67], payloadseqnum [0]
>
> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
>
> SKB NOPOLL: [ FAIL ]
>
> # Interface found: ve2458
>
> # Interface found: ve4468
>
> # NS switched: af_xdp4468
>
> 1..1
>
> # Interface [ve4468] vector [Rx]
>
> # Interface [ve2458] vector [Tx]
>
> # Sending 10000 packets on interface ve2458
>
> # End-of-tranmission frame received: PASS
> # Received 10000 packets on interface ve4468
> ok 1 PASS: SKB POLL
> # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
> SKB POLL: [ PASS ]
> # Interface found: ve2458
> # Interface found: ve4468
> # NS switched: af_xdp4468
> 1..1
> # Interface [ve4468] vector [Rx]
> # Interface [ve2458] vector [Tx]
> # Sending 10000 packets on interface ve2458
> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [191], payloadseqnum [0]
> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
> DRV NOPOLL: [ FAIL ]
> # Interface found: ve2458
> # Interface found: ve4468
> # NS switched: af_xdp4468
> 1..1
> # Interface [ve4468] vector [Rx]
> # Interface [ve2458] vector [Tx]
> # Sending 10000 packets on interface ve2458
> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [0], payloadseqnum [0]
> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
> DRV POLL: [ FAIL ]
> # Interface found: ve2458
> # Interface found: ve4468
> # NS switched: af_xdp4468
> 1..1
> # Creating socket
> # Interface [ve4468] vector [Rx]
> # Interface [ve2458] vector [Tx]
> # Sending 10000 packets on interface ve2458
> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [0], payloadseqnum [0]
> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
> SKB SOCKET TEARDOWN: [ FAIL ]
> # Interface found: ve2458
> # Interface found: ve4468
> # NS switched: af_xdp4468
> 1..1
> # Creating socket
> # Interface [ve4468] vector [Rx]
> # Interface [ve2458] vector [Tx]
> # Sending 10000 packets on interface ve2458
> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [171], payloadseqnum [0]
> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
> DRV SOCKET TEARDOWN: [ FAIL ]
> # Interface found: ve2458
> # Interface found: ve4468
> # NS switched: af_xdp4468
> 1..1
> # Creating socket
> # Interface [ve4468] vector [Rx]
> # Interface [ve2458] vector [Tx]
> # Sending 10000 packets on interface ve2458
> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [124], payloadseqnum [0]
> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
> SKB BIDIRECTIONAL SOCKETS: [ FAIL ]
> # Interface found: ve2458
> # Interface found: ve4468
> # NS switched: af_xdp4468
> 1..1
> # Creating socket
> # Interface [ve4468] vector [Rx]
> # Interface [ve2458] vector [Tx]
> # Sending 10000 packets on interface ve2458
> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [195], payloadseqnum [0]
> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
> DRV BIDIRECTIONAL SOCKETS: [ FAIL ]
> cleaning up...
> removing link ve4468
> removing ns af_xdp4468
> removing spec file: veth.spec
>
> >
> > I will need to spend some time debugging this to have a fix.
>
> Thanks.
>
> >
> > Thanks,
> > /Weqaar
> >
> >>>
> >>>> Can I just run test_xsk.sh at tools/testing/selftests/bpf/ directory?
> >>>> This will be easier than the above for bpf developers. If it does not
> >>>> work, I would like to recommend to make it work.
> >>>>
> >>> yes, test_xsk.sh is self-contained, will update the instructions in there with v4.
> >>
> >> That will be great. Thanks!
> >>
v4 is out on the list, incorporating most if not all your suggestions
to the best of my memory.

I was able to reproduce the issue you were seeing (from your logs) ->
veth interfaces were receiving packets from the IPv6 neighboring
system (thanks @Björn Töpel for mentioning this).

The packet validation algorithm in *xdpxceiver* *assumed* all packets
would be IPv4 and intended for Rx.
Rx now validates packets on both ip->tos = 0x9 (the id for xsk tests)
and ip->version = 0x4, and ignores the rest.
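
To illustrate the filter described above, here is a minimal sketch. It is
not the actual xdpxceiver code: the helper name is_xsk_test_pkt(), the
constant names, and the assumption of an untagged 14-byte Ethernet header
are all hypothetical. Only the two checks (version 0x4, tos 0x9) come
from the description above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN  14   /* assumption: untagged Ethernet header */
#define IP_VERSION_4 0x4  /* ip->version value accepted by Rx */
#define XSK_TEST_TOS 0x9  /* ip->tos value identifying xsk test packets */

/*
 * Hypothetical helper: return true only for frames that look like xsk
 * test traffic. Anything else (for example IPv6 packets that show up on
 * the veth pair) is reported as "not ours" so the caller can skip it
 * instead of treating it as a sequence number mismatch.
 */
static bool is_xsk_test_pkt(const uint8_t *frame, size_t len)
{
	const uint8_t *ip;

	/* Need the Ethernet header plus the first two IP header bytes. */
	if (len < ETH_HDR_LEN + 2)
		return false;

	ip = frame + ETH_HDR_LEN;

	/* The version lives in the high nibble of the first IP byte. */
	if ((ip[0] >> 4) != IP_VERSION_4)
		return false;

	/* The second byte of an IPv4 header is the TOS field. */
	return ip[1] == XSK_TEST_TOS;
}
```

With a check like this in the Rx path, a stray frame is skipped before
the sequence number comparison in worker_pkt_validate, rather than being
counted as a lost or out-of-order packet.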

Hoping the tests now work -> PASS in your environment.

Thanks,
/Weqaar

> >>>
> >>> Thanks,
> >>> /Weqaar
> >>>>
> >>>> Björn
Yonghong Song Dec. 8, 2020, 3:48 a.m. UTC | #7
On 12/7/20 1:55 PM, Weqaar Janjua wrote:
> On Sat, 28 Nov 2020 at 03:13, Yonghong Song <yhs@fb.com> wrote:
>>
>>
>>
>> On 11/27/20 9:54 AM, Weqaar Janjua wrote:
>>> On Fri, 27 Nov 2020 at 04:19, Yonghong Song <yhs@fb.com> wrote:
>>>>
>>>>
>>>>
>>>> On 11/26/20 1:22 PM, Weqaar Janjua wrote:
>>>>> On Thu, 26 Nov 2020 at 09:01, Björn Töpel <bjorn.topel@intel.com> wrote:
>>>>>>
>>>>>> On 2020-11-26 07:44, Yonghong Song wrote:
>>>>>>>
>>>>>> [...]
>>>>>>>
>>>>>>> What other configures I am missing?
>>>>>>>
>>>>>>> BTW, I cherry-picked the following pick from bpf tree in this experiment.
>>>>>>>       commit e7f4a5919bf66e530e08ff352d9b78ed89574e6b (HEAD -> xsk)
>>>>>>>       Author: Björn Töpel <bjorn.topel@intel.com>
>>>>>>>       Date:   Mon Nov 23 18:56:00 2020 +0100
>>>>>>>
>>>>>>>           net, xsk: Avoid taking multiple skbuff references
>>>>>>>
>>>>>>
>>>>>> Hmm, I'm getting an oops, unless I cherry-pick:
>>>>>>
>>>>>> 36ccdf85829a ("net, xsk: Avoid taking multiple skbuff references")
>>>>>>
>>>>>> *AND*
>>>>>>
>>>>>> 537cf4e3cc2f ("xsk: Fix umem cleanup bug at socket destruct")
>>>>>>
>>>>>> from bpf/master.
>>>>>>
>>>>>
>>>>> Same as Bjorn's findings ^^^, additionally applying the second patch
>>>>> 537cf4e3cc2f [PASS] all tests for me
>>>>>
>>>>> PREREQUISITES: [ PASS ]
>>>>> SKB NOPOLL: [ PASS ]
>>>>> SKB POLL: [ PASS ]
>>>>> DRV NOPOLL: [ PASS ]
>>>>> DRV POLL: [ PASS ]
>>>>> SKB SOCKET TEARDOWN: [ PASS ]
>>>>> DRV SOCKET TEARDOWN: [ PASS ]
>>>>> SKB BIDIRECTIONAL SOCKETS: [ PASS ]
>>>>> DRV BIDIRECTIONAL SOCKETS: [ PASS ]
>>>>>
>>>>> With the first patch alone, as soon as we enter DRV/Native NOPOLL mode
>>>>> kernel panics, whereas in your case NOPOLL tests were failing with
>>>>> packets being *lost* as per seqnum mismatch.
>>>>>
>>>>> Can you please test this out with both patches and let us know?
>>>>
>>>> I applied both the above patches in bpf-next as well as this patch set,
>>>> I still see failures. I am attaching my config file. Maybe you can take
>>>> a look at what is the issue.
>>>>
>>> Thanks for the config, can you please confirm the compiler version,
>>> and resource limits i.e. stack size, memory, etc.?
>>
>> root@arch-fb-vm1:~/net-next/net-next/tools/testing/selftests/bpf ulimit -a
>> core file size          (blocks, -c) unlimited
>> data seg size           (kbytes, -d) unlimited
>> scheduling priority             (-e) 0
>> file size               (blocks, -f) unlimited
>> pending signals                 (-i) 15587
>> max locked memory       (kbytes, -l) unlimited
>> max memory size         (kbytes, -m) unlimited
>> open files                      (-n) 1024
>> pipe size            (512 bytes, -p) 8
>> POSIX message queues     (bytes, -q) 819200
>> real-time priority              (-r) 0
>> stack size              (kbytes, -s) 8192
>> cpu time               (seconds, -t) unlimited
>> max user processes              (-u) 15587
>> virtual memory          (kbytes, -v) unlimited
>> file locks                      (-x) unlimited
>>
>> compiler: gcc 8.2
>>
>>>
>>> Only NOPOLL tests are failing for you as I see it, do the same tests
>>> fail every time?
>>
>> In my case, with above two bpf patches applied as well, I got:
>> $ ./test_xsk.sh
>> setting up ve9127: root: 192.168.222.1/30
>>
>> setting up ve4520: af_xdp4520: 192.168.222.2/30
>>
>> Spec file created: veth.spec
>>
>> PREREQUISITES: [ PASS ]
>>
>> # Interface found: ve9127
>>
>> # Interface found: ve4520
>>
>> # NS switched: af_xdp4520
>>
>> 1..1
>>
>> # Interface [ve4520] vector [Rx]
>>
>> # Interface [ve9127] vector [Tx]
>>
>> # Sending 10000 packets on interface ve9127
>>
>> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [59], payloadseqnum [0]
>>
>> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
>>
>> SKB NOPOLL: [ FAIL ]
>>
>> # Interface found: ve9127
>>
>> # Interface found: ve4520
>>
>> # NS switched: af_xdp4520
>> # NS switched: af_xdp4520
>>
>> 1..1
>> # Interface [ve4520] vector [Rx]
>> # Interface [ve9127] vector [Tx]
>> # Sending 10000 packets on interface ve9127
>> # End-of-tranmission frame received: PASS
>> # Received 10000 packets on interface ve4520
>> ok 1 PASS: SKB POLL
>> # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
>> SKB POLL: [ PASS ]
>> # Interface found: ve9127
>> # Interface found: ve4520
>> # NS switched: af_xdp4520
>> 1..1
>> # Interface [ve4520] vector [Rx]
>> # Interface [ve9127] vector [Tx]
>> # Sending 10000 packets on interface ve9127
>> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [153], payloadseqnum [0]
>> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
>> DRV NOPOLL: [ FAIL ]
>> # Interface found: ve9127
>> # Interface found: ve4520
>> # NS switched: af_xdp4520
>> 1..1
>> # Interface [ve4520] vector [Rx]
>> # Interface [ve9127] vector [Tx]
>> # Sending 10000 packets on interface ve9127
>> # End-of-tranmission frame received: PASS
>> # Received 10000 packets on interface ve4520
>> ok 1 PASS: DRV POLL
>> # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
>> DRV POLL: [ PASS ]
>> # Interface found: ve9127
>> # Interface found: ve4520
>> # NS switched: af_xdp4520
>> 1..1
>> # Creating socket
>> # Interface [ve4520] vector [Rx]
>> # Interface [ve9127] vector [Tx]
>> # Sending 10000 packets on interface ve9127
>> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [54], payloadseqnum [0]
>> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
>> SKB SOCKET TEARDOWN: [ FAIL ]
>> # Interface found: ve9127
>> # Interface found: ve4520
>> # NS switched: af_xdp4520
>> 1..1
>> # Creating socket
>> # Interface [ve4520] vector [Rx]
>> # Interface [ve9127] vector [Tx]
>> # Sending 10000 packets on interface ve9127
>> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [0], payloadseqnum [0]
>> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
>> DRV SOCKET TEARDOWN: [ FAIL ]
>> # Interface found: ve9127
>> # Interface found: ve4520
>> # NS switched: af_xdp4520
>> 1..1
>> # Creating socket
>> # Interface [ve4520] vector [Rx]
>> # Interface [ve9127] vector [Tx]
>> # Sending 10000 packets on interface ve9127
>> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [64], payloadseqnum [0]
>> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
>> SKB BIDIRECTIONAL SOCKETS: [ FAIL ]
>> # Interface found: ve9127
>> # Interface found: ve4520
>> # NS switched: af_xdp4520
>> 1..1
>> # Creating socket
>> # Interface [ve4520] vector [Rx]
>> # Interface [ve9127] vector [Tx]
>> # Sending 10000 packets on interface ve9127
>> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [83], payloadseqnum [0]
>> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
>> DRV BIDIRECTIONAL SOCKETS: [ FAIL ]
>> cleaning up...
>> removing link ve4520
>> removing ns af_xdp4520
>> removing spec file: veth.spec
>>
>> Second runs have one previous success becoming failure.
>>
>> ./test_xsk.sh
>> setting up ve2458: root: 192.168.222.1/30
>>
>> setting up ve4468: af_xdp4468: 192.168.222.2/30
>>
>> [  286.597111] IPv6: ADDRCONF(NETDEV_CHANGE): ve4468: link becomes ready
>>
>> Spec file created: veth.spec
>>
>> PREREQUISITES: [ PASS ]
>>
>> # Interface found: ve2458
>>
>> # Interface found: ve4468
>>
>> # NS switched: af_xdp4468
>>
>> 1..1
>>
>> # Interface [ve4468] vector [Rx]
>>
>> # Interface [ve2458] vector [Tx]
>>
>> # Sending 10000 packets on interface ve2458
>>
>> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [67], payloadseqnum [0]
>>
>> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
>>
>> SKB NOPOLL: [ FAIL ]
>>
>> # Interface found: ve2458
>>
>> # Interface found: ve4468
>>
>> # NS switched: af_xdp4468
>>
>> 1..1
>>
>> # Interface [ve4468] vector [Rx]
>>
>> # Interface [ve2458] vector [Tx]
>>
>> # Sending 10000 packets on interface ve2458
>>
>> # End-of-tranmission frame received: PASS
>> # Received 10000 packets on interface ve4468
>> ok 1 PASS: SKB POLL
>> # Totals: pass:1 fail:0 xfail:0 xpass:0 skip:0 error:0
>> SKB POLL: [ PASS ]
>> # Interface found: ve2458
>> # Interface found: ve4468
>> # NS switched: af_xdp4468
>> 1..1
>> # Interface [ve4468] vector [Rx]
>> # Interface [ve2458] vector [Tx]
>> # Sending 10000 packets on interface ve2458
>> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [191], payloadseqnum [0]
>> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
>> DRV NOPOLL: [ FAIL ]
>> # Interface found: ve2458
>> # Interface found: ve4468
>> # NS switched: af_xdp4468
>> 1..1
>> # Interface [ve4468] vector [Rx]
>> # Interface [ve2458] vector [Tx]
>> # Sending 10000 packets on interface ve2458
>> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [0], payloadseqnum [0]
>> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
>> DRV POLL: [ FAIL ]
>> # Interface found: ve2458
>> # Interface found: ve4468
>> # NS switched: af_xdp4468
>> 1..1
>> # Creating socket
>> # Interface [ve4468] vector [Rx]
>> # Interface [ve2458] vector [Tx]
>> # Sending 10000 packets on interface ve2458
>> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [0], payloadseqnum [0]
>> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
>> SKB SOCKET TEARDOWN: [ FAIL ]
>> # Interface found: ve2458
>> # Interface found: ve4468
>> # NS switched: af_xdp4468
>> 1..1
>> # Creating socket
>> # Interface [ve4468] vector [Rx]
>> # Interface [ve2458] vector [Tx]
>> # Sending 10000 packets on interface ve2458
>> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [171], payloadseqnum [0]
>> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
>> DRV SOCKET TEARDOWN: [ FAIL ]
>> # Interface found: ve2458
>> # Interface found: ve4468
>> # NS switched: af_xdp4468
>> 1..1
>> # Creating socket
>> # Interface [ve4468] vector [Rx]
>> # Interface [ve2458] vector [Tx]
>> # Sending 10000 packets on interface ve2458
>> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [124], payloadseqnum [0]
>> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
>> SKB BIDIRECTIONAL SOCKETS: [ FAIL ]
>> # Interface found: ve2458
>> # Interface found: ve4468
>> # NS switched: af_xdp4468
>> 1..1
>> # Creating socket
>> # Interface [ve4468] vector [Rx]
>> # Interface [ve2458] vector [Tx]
>> # Sending 10000 packets on interface ve2458
>> not ok 1 ERROR: [worker_pkt_validate] prev_pkt [195], payloadseqnum [0]
>> # Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
>> DRV BIDIRECTIONAL SOCKETS: [ FAIL ]
>> cleaning up...
>> removing link ve4468
>> removing ns af_xdp4468
>> removing spec file: veth.spec
>>
>>>
>>> I will need to spend some time debugging this to have a fix.
>>
>> Thanks.
>>
>>>
>>> Thanks,
>>> /Weqaar
>>>
>>>>>
>>>>>> Can I just run test_xsk.sh at tools/testing/selftests/bpf/ directory?
>>>>>> This will be easier than the above for bpf developers. If it does not
>>>>>> work, I would like to recommend to make it work.
>>>>>>
>>>>> yes, test_xsk.sh is self-contained, will update the instructions in there with v4.
>>>>
>>>> That will be great. Thanks!
>>>>
> v4 is out on the list, incorporating most if not all your suggestions
> to the best of my memory.
> 
> I was able to reproduce the issue you were seeing (from your logs) ->
> veth interfaces were receiving packets from the IPv6 neighboring
> system (thanks @Björn Töpel for mentioning this).
> 
> The packet validation algo in *xdpxceiver* *assumed* all packets would
> be IPv4 and intended for Rx.
> Rx validates packets on both ip->tos = 0x9 (id for xsk tests) and
> ip->version = 0x4, ignores the rest.
> 
> Hoping the tests now work -> PASS in your environment.

Yes, now all tests passed in my environment. I will reply to the v4
with a Tested-by tag. Now I think xsk people can really look at details.

> 
> Thanks,
> /Weqaar
> 
>>>>>
>>>>> Thanks,
>>>>> /Weqaar
>>>>>>
>>>>>> Björn