From patchwork Tue Jun 22 16:15:30 2021
X-Patchwork-Submitter: David Woodhouse
X-Patchwork-Id: 465566
From: David Woodhouse
To: netdev@vger.kernel.org
Cc: Jason Wang, Eugenio Pérez
Subject: [PATCH v2 1/4] net: tun: fix tun_xdp_one() for IFF_TUN mode
Date: Tue, 22 Jun 2021 17:15:30 +0100
Message-Id: <20210622161533.1214662-1-dwmw2@infradead.org>
In-Reply-To: <03ee62602dd7b7101f78e0802249a6e2e4c10b7f.camel@infradead.org>
References: <03ee62602dd7b7101f78e0802249a6e2e4c10b7f.camel@infradead.org>

In tun_get_user(), skb->protocol is either taken from the tun_pi header or inferred from the first byte of the packet in IFF_TUN mode, while eth_type_trans() is called only in the IFF_TAP mode where the payload is expected to be an Ethernet frame.

The alternative path in tun_xdp_one() was unconditionally using eth_type_trans(), which corrupts packets in IFF_TUN mode. Fix it to do the correct thing for IFF_TUN mode, as tun_get_user() does.
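For illustration only, the IFF_TUN | IFF_NO_PI inference described above can be sketched in userspace as follows. This is not the kernel code: the helper name sketch_tun_proto() and the SKETCH_ constants are hypothetical, and the EtherType values are simply repeated from <linux/if_ether.h> so that the sketch builds on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* EtherType values (ETH_P_IP, ETH_P_IPV6), repeated here as an assumption
 * so the sketch does not depend on kernel headers. */
#define SKETCH_ETH_P_IP   0x0800
#define SKETCH_ETH_P_IPV6 0x86DD

/*
 * Hypothetical helper: infer the protocol of a raw IP packet from the
 * version nibble in its first byte. Returns the EtherType in host byte
 * order, or 0 when the packet is empty or the version is unknown (the
 * case in which the patch drops the packet with -EINVAL).
 */
static uint16_t sketch_tun_proto(const uint8_t *pkt, size_t len)
{
	assert(pkt != NULL || len == 0);

	uint8_t ip_version = len ? (pkt[0] >> 4) : 0;

	switch (ip_version) {
	case 4:
		return SKETCH_ETH_P_IP;
	case 6:
		return SKETCH_ETH_P_IPV6;
	default:
		return 0;
	}
}
```

A caller would pass the first bytes of the packet as read from the tun fd; an Ethernet frame fed through this path would generally be misclassified, which is why IFF_TAP keeps using eth_type_trans().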
Fixes: 043d222f93ab ("tuntap: accept an array of XDP buffs through sendmsg()")
Signed-off-by: David Woodhouse
Acked-by: Jason Wang
---
drivers/net/tun.c | 44 +++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 43 insertions(+), 1 deletion(-) diff --git a/drivers/net/tun.c b/drivers/net/tun.c index 4cf38be26dc9..f812dcdc640e 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -2394,8 +2394,50 @@ static int tun_xdp_one(struct tun_struct *tun, err = -EINVAL; goto out; } + switch (tun->flags & TUN_TYPE_MASK) { + case IFF_TUN: + if (tun->flags & IFF_NO_PI) { + u8 ip_version = skb->len ? (skb->data[0] >> 4) : 0; + + switch (ip_version) { + case 4: + skb->protocol = htons(ETH_P_IP); + break; + case 6: + skb->protocol = htons(ETH_P_IPV6); + break; + default: + atomic_long_inc(&tun->dev->rx_dropped); + kfree_skb(skb); + err = -EINVAL; + goto out; + } + } else { + struct tun_pi *pi = (struct tun_pi *)skb->data; + if (!pskb_may_pull(skb, sizeof(*pi))) { + atomic_long_inc(&tun->dev->rx_dropped); + kfree_skb(skb); + err = -ENOMEM; + goto out; + } + skb_pull_inline(skb, sizeof(*pi)); + skb->protocol = pi->proto; + } + + skb_reset_mac_header(skb); + skb->dev = tun->dev; + break; + case IFF_TAP: + if (!pskb_may_pull(skb, ETH_HLEN)) { + atomic_long_inc(&tun->dev->rx_dropped); + kfree_skb(skb); + err = -ENOMEM; + goto out; + } + skb->protocol = eth_type_trans(skb, tun->dev); + break; + } - skb->protocol = eth_type_trans(skb, tun->dev); skb_reset_network_header(skb); skb_probe_transport_header(skb); skb_record_rx_queue(skb, tfile->queue_index);

From patchwork Tue Jun 22 16:15:31 2021
X-Patchwork-Submitter: David Woodhouse
X-Patchwork-Id: 466295
From: David Woodhouse
To: netdev@vger.kernel.org
Cc: Jason Wang, Eugenio Pérez
Subject: [PATCH v2 2/4] net: tun: don't assume IFF_VNET_HDR in tun_xdp_one() tx path
Date: Tue, 22 Jun 2021 17:15:31 +0100
Message-Id: <20210622161533.1214662-2-dwmw2@infradead.org>
In-Reply-To: <20210622161533.1214662-1-dwmw2@infradead.org>
References: <03ee62602dd7b7101f78e0802249a6e2e4c10b7f.camel@infradead.org> <20210622161533.1214662-1-dwmw2@infradead.org>

Sometimes it's just a data packet. The virtio_net_hdr processing should be conditional on IFF_VNET_HDR, just as it is in tun_get_user().

Fixes: 043d222f93ab ("tuntap: accept an array of XDP buffs through sendmsg()")
Signed-off-by: David Woodhouse
---
drivers/net/tun.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/net/tun.c b/drivers/net/tun.c index f812dcdc640e..96933887d03d 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -2331,7 +2331,7 @@ static int tun_xdp_one(struct tun_struct *tun, { unsigned int datasize = xdp->data_end - xdp->data; struct tun_xdp_hdr *hdr = xdp->data_hard_start; - struct virtio_net_hdr *gso = &hdr->gso; + struct virtio_net_hdr *gso = NULL; struct bpf_prog *xdp_prog; struct sk_buff *skb = NULL; u32 rxhash = 0, act; @@ -2340,9 +2340,12 @@ static int tun_xdp_one(struct tun_struct *tun, bool skb_xdp = false; struct page *page; + if (tun->flags & IFF_VNET_HDR) + gso = &hdr->gso; + xdp_prog = rcu_dereference(tun->xdp_prog); if (xdp_prog) { - if (gso->gso_type) { + if (gso && gso->gso_type) { skb_xdp = true; goto build; } @@ -2388,7 +2391,7 @@ static int
tun_xdp_one(struct tun_struct *tun, skb_reserve(skb, xdp->data - xdp->data_hard_start); skb_put(skb, xdp->data_end - xdp->data); - if (virtio_net_hdr_to_skb(skb, gso, tun_is_little_endian(tun))) { + if (gso && virtio_net_hdr_to_skb(skb, gso, tun_is_little_endian(tun))) { atomic_long_inc(&tun->rx_frame_errors); kfree_skb(skb); err = -EINVAL;

From patchwork Tue Jun 22 16:15:32 2021
X-Patchwork-Submitter: David Woodhouse
X-Patchwork-Id: 465567
From: David Woodhouse
To: netdev@vger.kernel.org
Cc: Jason Wang, Eugenio Pérez
Subject: [PATCH v2 3/4] vhost_net: validate virtio_net_hdr only if it exists
Date: Tue, 22 Jun 2021 17:15:32 +0100
Message-Id: <20210622161533.1214662-3-dwmw2@infradead.org>
In-Reply-To: <20210622161533.1214662-1-dwmw2@infradead.org>
References: <03ee62602dd7b7101f78e0802249a6e2e4c10b7f.camel@infradead.org> <20210622161533.1214662-1-dwmw2@infradead.org>

When the underlying socket doesn't handle the virtio_net_hdr, the existing code in vhost_net_build_xdp() would attempt to validate stack noise, by copying zero bytes into the local copy of the header and then validating that.

Skip the whole pointless pointer arithmetic and partial copy (of zero bytes) in this case.
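As a rough userspace illustration of the check being made conditional, the sketch below mirrors the hdr_len fix-up and skips it entirely when there is no header to look at. It is an assumption-laden model rather than the vhost code: struct sketch_vnet_hdr, sketch_check_hdr() and SKETCH_F_NEEDS_CSUM are hypothetical names, the fields are kept in host byte order, and the copy from the iov_iter is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as VIRTIO_NET_HDR_F_NEEDS_CSUM; redefined so the sketch
 * builds without the virtio headers. */
#define SKETCH_F_NEEDS_CSUM 1

/* Hypothetical, host-endian subset of struct virtio_net_hdr. */
struct sketch_vnet_hdr {
	uint8_t  flags;
	uint16_t hdr_len;
	uint16_t csum_start;
	uint16_t csum_offset;
};

/*
 * Hypothetical helper: when sock_hlen is zero there is no header, so
 * nothing is read or validated (gso may be NULL). Otherwise, if a
 * checksum is requested and hdr_len does not cover the checksum field,
 * grow hdr_len and reject the packet if it then exceeds len.
 */
static int sketch_check_hdr(struct sketch_vnet_hdr *gso, size_t sock_hlen,
			    size_t len)
{
	if (!sock_hlen)
		return 0;

	assert(gso != NULL);

	if (gso->flags & SKETCH_F_NEEDS_CSUM) {
		unsigned int need = (unsigned int)gso->csum_start +
				    gso->csum_offset + 2;

		if (need > gso->hdr_len) {
			/* Truncates to 16 bits, like cpu_to_vhost16() */
			gso->hdr_len = (uint16_t)need;
			if (gso->hdr_len > len)
				return -EINVAL;
		}
	}
	return 0;
}
```

The early return is the point of the patch: with sock_hlen == 0 the old code ran the same checks on bytes that were never copied in.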
Fixes: 0a0be13b8fe2 ("vhost_net: batch submitting XDP buffers to underlayer sockets") Signed-off-by: David Woodhouse Acked-by: Jason Wang --- drivers/vhost/net.c | 43 ++++++++++++++++++++++--------------------- 1 file changed, 22 insertions(+), 21 deletions(-) diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c index df82b124170e..1e3652eb53af 100644 --- a/drivers/vhost/net.c +++ b/drivers/vhost/net.c @@ -690,7 +690,6 @@ static int vhost_net_build_xdp(struct vhost_net_virtqueue *nvq, dev); struct socket *sock = vhost_vq_get_backend(vq); struct page_frag *alloc_frag = &net->page_frag; - struct virtio_net_hdr *gso; struct xdp_buff *xdp = &nvq->xdp[nvq->batched_xdp]; struct tun_xdp_hdr *hdr; size_t len = iov_iter_count(from); @@ -715,29 +714,31 @@ static int vhost_net_build_xdp(struct vhost_net_virtqueue *nvq, return -ENOMEM; buf = (char *)page_address(alloc_frag->page) + alloc_frag->offset; - copied = copy_page_from_iter(alloc_frag->page, - alloc_frag->offset + - offsetof(struct tun_xdp_hdr, gso), - sock_hlen, from); - if (copied != sock_hlen) - return -EFAULT; - hdr = buf; - gso = &hdr->gso; - - if ((gso->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) && - vhost16_to_cpu(vq, gso->csum_start) + - vhost16_to_cpu(vq, gso->csum_offset) + 2 > - vhost16_to_cpu(vq, gso->hdr_len)) { - gso->hdr_len = cpu_to_vhost16(vq, - vhost16_to_cpu(vq, gso->csum_start) + - vhost16_to_cpu(vq, gso->csum_offset) + 2); - - if (vhost16_to_cpu(vq, gso->hdr_len) > len) - return -EINVAL; + if (sock_hlen) { + struct virtio_net_hdr *gso = &hdr->gso; + + copied = copy_page_from_iter(alloc_frag->page, + alloc_frag->offset + + offsetof(struct tun_xdp_hdr, gso), + sock_hlen, from); + if (copied != sock_hlen) + return -EFAULT; + + if ((gso->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) && + vhost16_to_cpu(vq, gso->csum_start) + + vhost16_to_cpu(vq, gso->csum_offset) + 2 > + vhost16_to_cpu(vq, gso->hdr_len)) { + gso->hdr_len = cpu_to_vhost16(vq, + vhost16_to_cpu(vq, gso->csum_start) + + vhost16_to_cpu(vq, 
gso->csum_offset) + 2); + + if (vhost16_to_cpu(vq, gso->hdr_len) > len) + return -EINVAL; + } + len -= sock_hlen; } - len -= sock_hlen; copied = copy_page_from_iter(alloc_frag->page, alloc_frag->offset + pad, len, from);

From patchwork Tue Jun 22 16:15:33 2021
X-Patchwork-Submitter: David Woodhouse
X-Patchwork-Id: 466294
From: David Woodhouse
To: netdev@vger.kernel.org
Cc: Jason Wang, Eugenio Pérez
Subject: [PATCH v2 4/4] vhost_net: Add self test with tun device
Date: Tue, 22 Jun 2021 17:15:33 +0100
Message-Id: <20210622161533.1214662-4-dwmw2@infradead.org>
In-Reply-To: <20210622161533.1214662-1-dwmw2@infradead.org>
References: <03ee62602dd7b7101f78e0802249a6e2e4c10b7f.camel@infradead.org> <20210622161533.1214662-1-dwmw2@infradead.org>

This creates a tun device and brings it up, then finds out the link-local address the kernel automatically assigns to it.

It sends a ping to that address, from a fake LL address of its own, and then waits for a response.

If the virtio_net_hdr stuff is all working correctly, it gets a response and manages to understand it.
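To make the "finds out the link-local address" step concrete, here is one possible way to parse a single line of /proc/net/if_inet6. It is a sketch, not the parser used in the test: sketch_parse_if_inet6() is a hypothetical helper, and it assumes the usual line layout of 32 hex digits, four hex fields (index, prefix length, scope, flags) and the interface name.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: parse one line of /proc/net/if_inet6. Returns 1
 * and fills addr[] when the line holds an fe80 address for ifname,
 * 0 otherwise.
 */
static int sketch_parse_if_inet6(const char *line, const char *ifname,
				 unsigned char addr[16])
{
	char hex[33], name[32];
	unsigned int idx, plen, scope, flags;

	assert(line && ifname && addr);

	if (sscanf(line, "%32[0-9a-fA-F] %x %x %x %x %31s",
		   hex, &idx, &plen, &scope, &flags, name) != 6)
		return 0;

	if (strlen(hex) != 32 || strcmp(name, ifname) != 0 ||
	    strncmp(hex, "fe80", 4) != 0)
		return 0;

	/* Convert each pair of hex digits into one address byte */
	for (int i = 0; i < 16; i++) {
		unsigned int byte;

		if (sscanf(hex + 2 * i, "%2x", &byte) != 1)
			return 0;
		addr[i] = (unsigned char)byte;
	}
	return 1;
}
```

A caller would feed it each line returned by fgets() on /proc/net/if_inet6 and stop at the first match for the tun interface name obtained from TUNSETIFF.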
Signed-off-by: David Woodhouse --- tools/testing/selftests/Makefile | 1 + tools/testing/selftests/vhost/Makefile | 16 + tools/testing/selftests/vhost/config | 2 + .../testing/selftests/vhost/test_vhost_net.c | 522 ++++++++++++++++++ 4 files changed, 541 insertions(+) create mode 100644 tools/testing/selftests/vhost/Makefile create mode 100644 tools/testing/selftests/vhost/config create mode 100644 tools/testing/selftests/vhost/test_vhost_net.c diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile index 6c575cf34a71..300c03cfd0c7 100644 --- a/tools/testing/selftests/Makefile +++ b/tools/testing/selftests/Makefile @@ -71,6 +71,7 @@ TARGETS += user TARGETS += vDSO TARGETS += vm TARGETS += x86 +TARGETS += vhost TARGETS += zram #Please keep the TARGETS list alphabetically sorted # Run "make quicktest=1 run_tests" or diff --git a/tools/testing/selftests/vhost/Makefile b/tools/testing/selftests/vhost/Makefile new file mode 100644 index 000000000000..f5e565d80733 --- /dev/null +++ b/tools/testing/selftests/vhost/Makefile @@ -0,0 +1,16 @@ +# SPDX-License-Identifier: GPL-2.0 +all: + +include ../lib.mk + +.PHONY: all clean + +BINARIES := test_vhost_net + +test_vhost_net: test_vhost_net.c ../kselftest.h ../kselftest_harness.h + $(CC) $(CFLAGS) -g $< -o $@ + +TEST_PROGS += $(BINARIES) +EXTRA_CLEAN := $(BINARIES) + +all: $(BINARIES) diff --git a/tools/testing/selftests/vhost/config b/tools/testing/selftests/vhost/config new file mode 100644 index 000000000000..6391c1f32c34 --- /dev/null +++ b/tools/testing/selftests/vhost/config @@ -0,0 +1,2 @@ +CONFIG_VHOST_NET=y +CONFIG_TUN=y diff --git a/tools/testing/selftests/vhost/test_vhost_net.c b/tools/testing/selftests/vhost/test_vhost_net.c new file mode 100644 index 000000000000..14acf2c0e049 --- /dev/null +++ b/tools/testing/selftests/vhost/test_vhost_net.c @@ -0,0 +1,522 @@ +// SPDX-License-Identifier: LGPL-2.1 + +#include "../kselftest_harness.h" +#include "../../../virtio/asm/barrier.h" + +#include + 
+#include +#include + +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#include +#include +#include +#include +#include + +#include +#include +#include + +static unsigned char hexnybble(char hex) +{ + switch (hex) { + case '0'...'9': + return hex - '0'; + case 'a'...'f': + return 10 + hex - 'a'; + case 'A'...'F': + return 10 + hex - 'A'; + default: + exit (KSFT_SKIP); + } +} + +static unsigned char hexchar(char *hex) +{ + return (hexnybble(hex[0]) << 4) | hexnybble(hex[1]); +} + +int open_tun(int vnet_hdr_sz, struct in6_addr *addr) +{ + int tun_fd = open("/dev/net/tun", O_RDWR); + if (tun_fd == -1) + return -1; + + struct ifreq ifr = { 0 }; + + ifr.ifr_flags = IFF_TUN | IFF_NO_PI; + if (vnet_hdr_sz) + ifr.ifr_flags |= IFF_VNET_HDR; + + if (ioctl(tun_fd, TUNSETIFF, (void *)&ifr) < 0) + goto out_tun; + + if (vnet_hdr_sz && + ioctl(tun_fd, TUNSETVNETHDRSZ, &vnet_hdr_sz) < 0) + goto out_tun; + + int sockfd = socket(AF_INET6, SOCK_DGRAM, IPPROTO_IP); + if (sockfd == -1) + goto out_tun; + + if (ioctl(sockfd, SIOCGIFFLAGS, (void *)&ifr) < 0) + goto out_sock; + + ifr.ifr_flags |= IFF_UP; + if (ioctl(sockfd, SIOCSIFFLAGS, (void *)&ifr) < 0) + goto out_sock; + + close(sockfd); + + FILE *inet6 = fopen("/proc/net/if_inet6", "r"); + if (!inet6) + goto out_tun; + + char buf[80]; + while (fgets(buf, sizeof(buf), inet6)) { + size_t len = strlen(buf), namelen = strlen(ifr.ifr_name); + if (!strncmp(buf, "fe80", 4) && + buf[len - namelen - 2] == ' ' && + !strncmp(buf + len - namelen - 1, ifr.ifr_name, namelen)) { + for (int i = 0; i < 16; i++) { + addr->s6_addr[i] = hexchar(buf + i*2); + } + fclose(inet6); + return tun_fd; + } + } + /* Not found */ + fclose(inet6); + out_sock: + close(sockfd); + out_tun: + close(tun_fd); + return -1; +} + +#define RING_SIZE 32 +#define RING_MASK(x) ((x) & (RING_SIZE-1)) + +struct pkt_buf { + unsigned char data[2048]; +}; + +struct test_vring { + struct vring_desc desc[RING_SIZE]; + struct vring_avail 
avail; + __virtio16 avail_ring[RING_SIZE]; + struct vring_used used; + struct vring_used_elem used_ring[RING_SIZE]; + struct pkt_buf pkts[RING_SIZE]; +} rings[2]; + +static int setup_vring(int vhost_fd, int tun_fd, int call_fd, int kick_fd, int idx) +{ + struct test_vring *vring = &rings[idx]; + int ret; + + memset(vring, 0, sizeof(*vring)); + + struct vhost_vring_state vs = { }; + vs.index = idx; + vs.num = RING_SIZE; + if (ioctl(vhost_fd, VHOST_SET_VRING_NUM, &vs) < 0) { + perror("VHOST_SET_VRING_NUM"); + return -1; + } + + vs.num = 0; + if (ioctl(vhost_fd, VHOST_SET_VRING_BASE, &vs) < 0) { + perror("VHOST_SET_VRING_BASE"); + return -1; + } + + struct vhost_vring_addr va = { }; + va.index = idx; + va.desc_user_addr = (uint64_t)vring->desc; + va.avail_user_addr = (uint64_t)&vring->avail; + va.used_user_addr = (uint64_t)&vring->used; + if (ioctl(vhost_fd, VHOST_SET_VRING_ADDR, &va) < 0) { + perror("VHOST_SET_VRING_ADDR"); + return -1; + } + + struct vhost_vring_file vf = { }; + vf.index = idx; + vf.fd = tun_fd; + if (ioctl(vhost_fd, VHOST_NET_SET_BACKEND, &vf) < 0) { + perror("VHOST_NET_SET_BACKEND"); + return -1; + } + + vf.fd = call_fd; + if (ioctl(vhost_fd, VHOST_SET_VRING_CALL, &vf) < 0) { + perror("VHOST_SET_VRING_CALL"); + return -1; + } + + vf.fd = kick_fd; + if (ioctl(vhost_fd, VHOST_SET_VRING_KICK, &vf) < 0) { + perror("VHOST_SET_VRING_KICK"); + return -1; + } + + return 0; +} + +int setup_vhost(int vhost_fd, int tun_fd, int call_fd, int kick_fd, uint64_t want_features) +{ + int ret; + + if (ioctl(vhost_fd, VHOST_SET_OWNER, NULL) < 0) { + perror("VHOST_SET_OWNER"); + return -1; + } + + uint64_t features; + if (ioctl(vhost_fd, VHOST_GET_FEATURES, &features) < 0) { + perror("VHOST_GET_FEATURES"); + return -1; + } + + if ((features & want_features) != want_features) + return KSFT_SKIP; + + if (ioctl(vhost_fd, VHOST_SET_FEATURES, &want_features) < 0) { + perror("VHOST_SET_FEATURES"); + return -1; + } + + struct vhost_memory *vmem = alloca(sizeof(*vmem) +
sizeof(vmem->regions[0])); + + memset(vmem, 0, sizeof(*vmem) + sizeof(vmem->regions[0])); + vmem->nregions = 1; + /* + * I just want to map the *whole* of userspace address space. But + * from userspace I don't know what that is. On x86_64 it would be: + * + * vmem->regions[0].guest_phys_addr = 4096; + * vmem->regions[0].memory_size = 0x7fffffffe000; + * vmem->regions[0].userspace_addr = 4096; + * + * For now, just ensure we put everything inside a single BSS region. + */ + vmem->regions[0].guest_phys_addr = (uint64_t)&rings; + vmem->regions[0].userspace_addr = (uint64_t)&rings; + vmem->regions[0].memory_size = sizeof(rings); + + if (ioctl(vhost_fd, VHOST_SET_MEM_TABLE, vmem) < 0) { + perror("VHOST_SET_MEM_TABLE"); + return -1; + } + + if (setup_vring(vhost_fd, tun_fd, call_fd, kick_fd, 0)) + return -1; + + if (setup_vring(vhost_fd, tun_fd, call_fd, kick_fd, 1)) + return -1; + + return 0; +} + + +static char ping_payload[16] = "VHOST TEST PACKT"; + +static inline uint32_t csum_partial(uint16_t *buf, int nwords) +{ + uint32_t sum = 0; + for(sum=0; nwords>0; nwords--) + sum += ntohs(*buf++); + return sum; +} + +static inline uint16_t csum_finish(uint32_t sum) +{ + sum = (sum >> 16) + (sum &0xffff); + sum += (sum >> 16); + return htons((uint16_t)(~sum)); +} + +static int create_icmp_echo(unsigned char *data, struct in6_addr *dst, + struct in6_addr *src, uint16_t id, uint16_t seq) +{ + const int icmplen = ICMP_MINLEN + sizeof(ping_payload); + const int plen = sizeof(struct ip6_hdr) + icmplen; + + struct ip6_hdr *iph = (void *)data; + struct icmp6_hdr *icmph = (void *)(data + sizeof(*iph)); + + /* IPv6 Header */ + iph->ip6_flow = htonl((6 << 28) + /* version 6 */ + (0 << 20) + /* traffic class */ + (0 << 0)); /* flow ID */ + iph->ip6_nxt = IPPROTO_ICMPV6; + iph->ip6_plen = htons(icmplen); + iph->ip6_hlim = 128; + iph->ip6_src = *src; + iph->ip6_dst = *dst; + + /* ICMPv6 echo request */ + icmph->icmp6_type = ICMP6_ECHO_REQUEST; + icmph->icmp6_code = 0; + 
icmph->icmp6_data16[0] = htons(id); /* ID */ + icmph->icmp6_data16[1] = htons(seq); /* sequence */ + + /* Some arbitrary payload */ + memcpy(&icmph[1], ping_payload, sizeof(ping_payload)); + + /* + * IPv6 upper-layer checksums include a pseudo-header + * for IPv6 which contains the source address, the + * destination address, the upper-layer packet length + * and next-header field. See RFC8200 §8.1. The + * checksum is as follows: + * + * checksum 32 bytes of real IPv6 header: + * src addr (16 bytes) + * dst addr (16 bytes) + * 8 bytes more: + * length of ICMPv6 in bytes (be32) + * 3 bytes of 0 + * next header byte (IPPROTO_ICMPV6) + * Then the actual ICMPv6 bytes + */ + uint32_t sum = csum_partial((uint16_t *)&iph->ip6_src, 8); /* 8 uint16_t */ + sum += csum_partial((uint16_t *)&iph->ip6_dst, 8); /* 8 uint16_t */ + + /* The easiest way to checksum the following 8-byte + * part of the pseudo-header without horridly violating + * C type aliasing rules is *not* to build it in memory + * at all. We know the length fits in 16 bits so the + * partial checksum of 00 00 LL LL 00 00 00 NH ends up + * being just LLLL + NH.
+ */ + sum += IPPROTO_ICMPV6; + sum += ICMP_MINLEN + sizeof(ping_payload); + + sum += csum_partial((uint16_t *)icmph, icmplen / 2); + icmph->icmp6_cksum = csum_finish(sum); + return plen; +} + + +static int check_icmp_response(unsigned char *data, uint32_t len, struct in6_addr *dst, struct in6_addr *src) +{ + struct ip6_hdr *iph = (void *)data; + return ( len >= 41 && (ntohl(iph->ip6_flow) >> 28)==6 /* IPv6 header */ + && iph->ip6_nxt == IPPROTO_ICMPV6 /* IPv6 next header field = ICMPv6 */ + && !memcmp(&iph->ip6_src, src, 16) /* source == tun device address */ + && !memcmp(&iph->ip6_dst, dst, 16) /* destination == our fake address */ + && len >= 40 + ICMP_MINLEN + sizeof(ping_payload) /* No short-packet segfaults */ + && data[40] == ICMP6_ECHO_REPLY /* ICMPv6 reply */ + && !memcmp(&data[40 + ICMP_MINLEN], ping_payload, sizeof(ping_payload)) /* Same payload in response */ + ); + +} + +#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ +#define vio16(x) (x) +#define vio32(x) (x) +#define vio64(x) (x) +#else +#define vio16(x) __builtin_bswap16(x) +#define vio32(x) __builtin_bswap32(x) +#define vio64(x) __builtin_bswap64(x) +#endif + + +int test_vhost(int vnet_hdr_sz, int xdp, uint64_t features) +{ + int call_fd = eventfd(0, EFD_CLOEXEC|EFD_NONBLOCK); + int kick_fd = eventfd(0, EFD_CLOEXEC|EFD_NONBLOCK); + int vhost_fd = open("/dev/vhost-net", O_RDWR); + int tun_fd = -1; + int ret = KSFT_SKIP; + + if (call_fd < 0 || kick_fd < 0 || vhost_fd < 0) + goto err; + + memset(rings, 0, sizeof(rings)); + + /* Pick up the link-local address that the kernel + * assigns to the tun device. */ + struct in6_addr tun_addr; + tun_fd = open_tun(vnet_hdr_sz, &tun_addr); + if (tun_fd < 0) + goto err; + + if (features & (1ULL << VHOST_NET_F_VIRTIO_NET_HDR)) { + if (vnet_hdr_sz) { + ret = -1; + goto err; + } + + vnet_hdr_sz = (features & ((1ULL << VIRTIO_NET_F_MRG_RXBUF) | + (1ULL << VIRTIO_F_VERSION_1))) ?
+ sizeof(struct virtio_net_hdr_mrg_rxbuf) : + sizeof(struct virtio_net_hdr); + } + + if (!xdp) { + int sndbuf = RING_SIZE * 2048; + if (ioctl(tun_fd, TUNSETSNDBUF, &sndbuf) < 0) { + perror("TUNSETSNDBUF"); + ret = -1; + goto err; + } + } + + ret = setup_vhost(vhost_fd, tun_fd, call_fd, kick_fd, features); + if (ret) + goto err; + + /* A fake link-local address for the userspace end */ + struct in6_addr local_addr = { 0 }; + local_addr.s6_addr16[0] = htons(0xfe80); + local_addr.s6_addr16[7] = htons(1); + + /* Set up RX and TX descriptors; the latter with ping packets ready to + * send to the kernel, but don't actually send them yet. */ + for (int i = 0; i < RING_SIZE; i++) { + struct pkt_buf *pkt = &rings[1].pkts[i]; + int plen = create_icmp_echo(&pkt->data[vnet_hdr_sz], &tun_addr, + &local_addr, 0x4747, i); + + rings[1].desc[i].addr = vio64((uint64_t)pkt); + rings[1].desc[i].len = vio32(plen + vnet_hdr_sz); + rings[1].avail_ring[i] = vio16(i); + + + pkt = &rings[0].pkts[i]; + rings[0].desc[i].addr = vio64((uint64_t)pkt); + rings[0].desc[i].len = vio32(sizeof(*pkt)); + rings[0].desc[i].flags = vio16(VRING_DESC_F_WRITE); + rings[0].avail_ring[i] = vio16(i); + } + barrier(); + + rings[0].avail.idx = vio16(RING_SIZE); + rings[1].avail.idx = vio16(1); + + barrier(); + eventfd_write(kick_fd, 1); + + uint16_t rx_seen_used = 0; + struct timeval tv = { 1, 0 }; + while (1) { + fd_set rfds = { 0 }; + FD_SET(call_fd, &rfds); + + if (select(call_fd + 1, &rfds, NULL, NULL, &tv) <= 0) { + ret = -1; + goto err; + } + + uint16_t rx_used_idx = vio16(rings[0].used.idx); + barrier(); + + while(rx_used_idx != rx_seen_used) { + uint32_t desc = vio32(rings[0].used_ring[RING_MASK(rx_seen_used)].id); + uint32_t len = vio32(rings[0].used_ring[RING_MASK(rx_seen_used)].len); + + if (desc >= RING_SIZE || len < vnet_hdr_sz) + return -1; + + uint64_t addr = vio64(rings[0].desc[desc].addr); + if (!addr) + return -1; + + if (check_icmp_response((void *)(addr + vnet_hdr_sz), len - vnet_hdr_sz, +
&local_addr, &tun_addr)) { + ret = 0; + printf("Success (%d %d %llx)\n", vnet_hdr_sz, xdp, (unsigned long long)features); + goto err; + } + rx_seen_used++; + + /* Give the same buffer back */ + rings[0].avail.idx = vio16(rx_seen_used + RING_SIZE); + barrier(); + eventfd_write(kick_fd, 1); + } + + uint64_t ev_val; + eventfd_read(call_fd, &ev_val); + } + + err: + if (call_fd != -1) + close(call_fd); + if (kick_fd != -1) + close(kick_fd); + if (vhost_fd != -1) + close(vhost_fd); + if (tun_fd != -1) + close(tun_fd); + + return ret; +} + + +int main(void) +{ + int ret; + + ret = test_vhost(0, 0, ((1ULL << VHOST_NET_F_VIRTIO_NET_HDR) | + (1ULL << VIRTIO_F_VERSION_1))); + if (ret && ret != KSFT_SKIP) + return ret; + + ret = test_vhost(0, 1, ((1ULL << VHOST_NET_F_VIRTIO_NET_HDR) | + (1ULL << VIRTIO_F_VERSION_1))); + if (ret && ret != KSFT_SKIP) + return ret; + + ret = test_vhost(0, 0, ((1ULL << VHOST_NET_F_VIRTIO_NET_HDR))); + if (ret && ret != KSFT_SKIP) + return ret; + + ret = test_vhost(0, 1, ((1ULL << VHOST_NET_F_VIRTIO_NET_HDR))); + if (ret && ret != KSFT_SKIP) + return ret; + + ret = test_vhost(10, 0, 0); + if (ret && ret != KSFT_SKIP) + return ret; + + ret = test_vhost(10, 1, 0); + if (ret && ret != KSFT_SKIP) + return ret; + +#if 0 /* These ones will fail */ + ret = test_vhost(0, 0, 0); + if (ret && ret != KSFT_SKIP) + return ret; + + ret = test_vhost(0, 1, 0); + if (ret && ret != KSFT_SKIP) + return ret; + + ret = test_vhost(12, 0, 0); + if (ret && ret != KSFT_SKIP) + return ret; + + ret = test_vhost(12, 1, 0); + if (ret && ret != KSFT_SKIP) + return ret; +#endif + + return ret; +}