From patchwork Tue Jun 1 22:18:38 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tanner Love
X-Patchwork-Id: 453108
From: Tanner Love
To: netdev@vger.kernel.org
Cc: davem@davemloft.net, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Eric Dumazet, Willem de Bruijn, Petar Penkov, Jakub Kicinski, Tanner Love
Subject: [PATCH net-next v3 1/3] net: flow_dissector: extend bpf flow dissector support with vnet hdr
Date: Tue, 1 Jun 2021 18:18:38 -0400
Message-Id: <20210601221841.1251830-2-tannerlove.kernel@gmail.com>
X-Mailer: git-send-email 2.32.0.rc0.204.g9fa02ecfa5-goog
In-Reply-To: <20210601221841.1251830-1-tannerlove.kernel@gmail.com>
References:
<20210601221841.1251830-1-tannerlove.kernel@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Tanner Love Amend the bpf flow dissector program type to accept virtio_net_hdr members. Do this to enable bpf flow dissector programs to perform virtio-net header validation. The next patch in this series will add a flow dissection hook in virtio_net_hdr_to_skb and make use of this extended functionality. That commit message has more background on the use case. Signed-off-by: Tanner Love Reviewed-by: Willem de Bruijn Reviewed-by: Petar Penkov --- drivers/net/bonding/bond_main.c | 2 +- include/linux/skbuff.h | 26 ++++++++++++---- include/net/flow_dissector.h | 6 ++++ include/uapi/linux/bpf.h | 6 ++++ net/core/filter.c | 55 +++++++++++++++++++++++++++++++++ net/core/flow_dissector.c | 24 ++++++++++++-- tools/include/uapi/linux/bpf.h | 6 ++++ 7 files changed, 116 insertions(+), 9 deletions(-) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 7e469c203ca5..5d2d7d5c5704 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -3554,7 +3554,7 @@ static bool bond_flow_dissect(struct bonding *bond, struct sk_buff *skb, case BOND_XMIT_POLICY_ENCAP34: memset(fk, 0, sizeof(*fk)); return __skb_flow_dissect(NULL, skb, &flow_keys_bonding, - fk, NULL, 0, 0, 0, 0); + fk, NULL, 0, 0, 0, 0, NULL); default: break; } diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index dbf820a50a39..fef8f4b5db6e 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -1312,18 +1312,20 @@ struct bpf_flow_dissector; bool bpf_flow_dissect(struct bpf_prog *prog, struct bpf_flow_dissector *ctx, __be16 proto, int nhoff, int hlen, unsigned int flags); +struct virtio_net_hdr; bool __skb_flow_dissect(const struct net *net, const struct sk_buff *skb, struct flow_dissector *flow_dissector, void *target_container, const void *data, - __be16 proto, int nhoff, int hlen, unsigned int 
flags); + __be16 proto, int nhoff, int hlen, unsigned int flags, + const struct virtio_net_hdr *vhdr); static inline bool skb_flow_dissect(const struct sk_buff *skb, struct flow_dissector *flow_dissector, void *target_container, unsigned int flags) { return __skb_flow_dissect(NULL, skb, flow_dissector, - target_container, NULL, 0, 0, 0, flags); + target_container, NULL, 0, 0, 0, flags, NULL); } static inline bool skb_flow_dissect_flow_keys(const struct sk_buff *skb, @@ -1332,7 +1334,20 @@ static inline bool skb_flow_dissect_flow_keys(const struct sk_buff *skb, { memset(flow, 0, sizeof(*flow)); return __skb_flow_dissect(NULL, skb, &flow_keys_dissector, - flow, NULL, 0, 0, 0, flags); + flow, NULL, 0, 0, 0, flags, NULL); +} + +static inline bool +__skb_flow_dissect_flow_keys_basic(const struct net *net, + const struct sk_buff *skb, + struct flow_keys_basic *flow, + const void *data, __be16 proto, + int nhoff, int hlen, unsigned int flags, + const struct virtio_net_hdr *vhdr) +{ + memset(flow, 0, sizeof(*flow)); + return __skb_flow_dissect(net, skb, &flow_keys_basic_dissector, flow, + data, proto, nhoff, hlen, flags, vhdr); } static inline bool @@ -1342,9 +1357,8 @@ skb_flow_dissect_flow_keys_basic(const struct net *net, const void *data, __be16 proto, int nhoff, int hlen, unsigned int flags) { - memset(flow, 0, sizeof(*flow)); - return __skb_flow_dissect(net, skb, &flow_keys_basic_dissector, flow, - data, proto, nhoff, hlen, flags); + return __skb_flow_dissect_flow_keys_basic(net, skb, flow, data, proto, + nhoff, hlen, flags, NULL); } void skb_flow_dissect_meta(const struct sk_buff *skb, diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h index ffd386ea0dbb..0796ad745e69 100644 --- a/include/net/flow_dissector.h +++ b/include/net/flow_dissector.h @@ -370,6 +370,12 @@ struct bpf_flow_dissector { const struct sk_buff *skb; const void *data; const void *data_end; + __u8 vhdr_flags; + __u8 vhdr_gso_type; + __u16 vhdr_hdr_len; + __u16 vhdr_gso_size; + 
__u16 vhdr_csum_start; + __u16 vhdr_csum_offset; }; static inline void diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 418b9b813d65..de525defd462 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -5155,6 +5155,12 @@ struct __sk_buff { __u32 gso_segs; __bpf_md_ptr(struct bpf_sock *, sk); __u32 gso_size; + __u8 vhdr_flags; + __u8 vhdr_gso_type; + __u16 vhdr_hdr_len; + __u16 vhdr_gso_size; + __u16 vhdr_csum_start; + __u16 vhdr_csum_offset; }; struct bpf_tunnel_key { diff --git a/net/core/filter.c b/net/core/filter.c index 239de1306de9..af45e769ced6 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -8358,6 +8358,16 @@ static bool flow_dissector_is_valid_access(int off, int size, return false; info->reg_type = PTR_TO_FLOW_KEYS; return true; + case bpf_ctx_range(struct __sk_buff, len): + return size == size_default; + case bpf_ctx_range(struct __sk_buff, vhdr_flags): + case bpf_ctx_range(struct __sk_buff, vhdr_gso_type): + return size == sizeof(__u8); + case bpf_ctx_range(struct __sk_buff, vhdr_hdr_len): + case bpf_ctx_range(struct __sk_buff, vhdr_gso_size): + case bpf_ctx_range(struct __sk_buff, vhdr_csum_start): + case bpf_ctx_range(struct __sk_buff, vhdr_csum_offset): + return size == sizeof(__u16); default: return false; } @@ -8390,6 +8400,51 @@ static u32 flow_dissector_convert_ctx_access(enum bpf_access_type type, si->dst_reg, si->src_reg, offsetof(struct bpf_flow_dissector, flow_keys)); break; + + case offsetof(struct __sk_buff, vhdr_flags): + *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct bpf_flow_dissector, vhdr_flags), + si->dst_reg, si->src_reg, + offsetof(struct bpf_flow_dissector, vhdr_flags)); + break; + + case offsetof(struct __sk_buff, vhdr_gso_type): + *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct bpf_flow_dissector, vhdr_gso_type), + si->dst_reg, si->src_reg, + offsetof(struct bpf_flow_dissector, vhdr_gso_type)); + break; + + case offsetof(struct __sk_buff, vhdr_hdr_len): + *insn++ = 
BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct bpf_flow_dissector, vhdr_hdr_len), + si->dst_reg, si->src_reg, + offsetof(struct bpf_flow_dissector, vhdr_hdr_len)); + break; + + case offsetof(struct __sk_buff, vhdr_gso_size): + *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct bpf_flow_dissector, vhdr_gso_size), + si->dst_reg, si->src_reg, + offsetof(struct bpf_flow_dissector, vhdr_gso_size)); + break; + + case offsetof(struct __sk_buff, vhdr_csum_start): + *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct bpf_flow_dissector, vhdr_csum_start), + si->dst_reg, si->src_reg, + offsetof(struct bpf_flow_dissector, vhdr_csum_start)); + break; + + case offsetof(struct __sk_buff, vhdr_csum_offset): + *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct bpf_flow_dissector, vhdr_csum_offset), + si->dst_reg, si->src_reg, + offsetof(struct bpf_flow_dissector, vhdr_csum_offset)); + break; + + case offsetof(struct __sk_buff, len): + *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct bpf_flow_dissector, skb), + si->dst_reg, si->src_reg, + offsetof(struct bpf_flow_dissector, skb)); + *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct sk_buff, len), + si->dst_reg, si->dst_reg, + offsetof(struct sk_buff, len)); + break; } return insn - insn_buf; diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c index 3ed7c98a98e1..4b171ebec084 100644 --- a/net/core/flow_dissector.c +++ b/net/core/flow_dissector.c @@ -28,6 +28,7 @@ #include #include #include +#include #if IS_ENABLED(CONFIG_NF_CONNTRACK) #include #include @@ -904,6 +905,7 @@ bool bpf_flow_dissect(struct bpf_prog *prog, struct bpf_flow_dissector *ctx, * @hlen: packet header length, if @data is NULL use skb_headlen(skb) * @flags: flags that control the dissection process, e.g. * FLOW_DISSECTOR_F_STOP_AT_ENCAP. 
+ * @vhdr: virtio_net_header to include in kernel context for BPF flow dissector * * The function will try to retrieve individual keys into target specified * by flow_dissector from either the skbuff or a raw buffer specified by the @@ -915,7 +917,8 @@ bool __skb_flow_dissect(const struct net *net, const struct sk_buff *skb, struct flow_dissector *flow_dissector, void *target_container, const void *data, - __be16 proto, int nhoff, int hlen, unsigned int flags) + __be16 proto, int nhoff, int hlen, unsigned int flags, + const struct virtio_net_hdr *vhdr) { struct flow_dissector_key_control *key_control; struct flow_dissector_key_basic *key_basic; @@ -1001,6 +1004,23 @@ bool __skb_flow_dissect(const struct net *net, __be16 n_proto = proto; struct bpf_prog *prog; + if (vhdr) { + ctx.vhdr_flags = vhdr->flags; + ctx.vhdr_gso_type = vhdr->gso_type; + ctx.vhdr_hdr_len = + __virtio16_to_cpu(virtio_legacy_is_little_endian(), + vhdr->hdr_len); + ctx.vhdr_gso_size = + __virtio16_to_cpu(virtio_legacy_is_little_endian(), + vhdr->gso_size); + ctx.vhdr_csum_start = + __virtio16_to_cpu(virtio_legacy_is_little_endian(), + vhdr->csum_start); + ctx.vhdr_csum_offset = + __virtio16_to_cpu(virtio_legacy_is_little_endian(), + vhdr->csum_offset); + } + if (skb) { ctx.skb = skb; /* we can't use 'proto' in the skb case @@ -1610,7 +1630,7 @@ u32 __skb_get_hash_symmetric(const struct sk_buff *skb) memset(&keys, 0, sizeof(keys)); __skb_flow_dissect(NULL, skb, &flow_keys_dissector_symmetric, &keys, NULL, 0, 0, 0, - FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL); + FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL, NULL); return __flow_hash_from_keys(&keys, &hashrnd); } diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 418b9b813d65..de525defd462 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -5155,6 +5155,12 @@ struct __sk_buff { __u32 gso_segs; __bpf_md_ptr(struct bpf_sock *, sk); __u32 gso_size; + __u8 vhdr_flags; + __u8 vhdr_gso_type; + __u16 
vhdr_hdr_len; + __u16 vhdr_gso_size; + __u16 vhdr_csum_start; + __u16 vhdr_csum_offset; }; struct bpf_tunnel_key {

From patchwork Tue Jun 1 22:18:39 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tanner Love
X-Patchwork-Id: 451780
From: Tanner Love
To: netdev@vger.kernel.org
Cc: davem@davemloft.net, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Eric Dumazet, Willem de Bruijn, Petar Penkov, Jakub Kicinski, Tanner Love, kernel test robot
Subject: [PATCH net-next v3 2/3] virtio_net: add optional flow dissection in virtio_net_hdr_to_skb
Date: Tue, 1 Jun 2021 18:18:39 -0400
Message-Id: <20210601221841.1251830-3-tannerlove.kernel@gmail.com>
X-Mailer:
git-send-email 2.32.0.rc0.204.g9fa02ecfa5-goog In-Reply-To: <20210601221841.1251830-1-tannerlove.kernel@gmail.com> References: <20210601221841.1251830-1-tannerlove.kernel@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Tanner Love Syzkaller bugs have resulted from loose specification of virtio_net_hdr[1]. Enable execution of a BPF flow dissector program in virtio_net_hdr_to_skb to validate the vnet header and drop bad input. The existing behavior of accepting these vnet headers is part of the ABI. But individual admins may want to enforce restrictions. For example, verifying that a GSO_TCPV4 gso_type matches packet contents: unencapsulated TCP/IPV4 packet with payload exceeding gso_size and hdr_len at payload offset. Introduce a new sysctl net.core.flow_dissect_vnet_hdr controlling a static key to decide whether to perform flow dissection. When the key is false, virtio_net_hdr_to_skb computes as before. [1] https://syzkaller.appspot.com/bug?id=b419a5ca95062664fe1a60b764621eb4526e2cd0 [ um build error ] Reported-by: kernel test robot Signed-off-by: Tanner Love Suggested-by: Willem de Bruijn --- include/linux/virtio_net.h | 25 +++++++++++++++++++++---- net/core/flow_dissector.c | 3 +++ net/core/sysctl_net_core.c | 9 +++++++++ 3 files changed, 33 insertions(+), 4 deletions(-) diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h index b465f8f3e554..cdc6152b40c6 100644 --- a/include/linux/virtio_net.h +++ b/include/linux/virtio_net.h @@ -25,10 +25,13 @@ static inline int virtio_net_hdr_set_proto(struct sk_buff *skb, return 0; } +DECLARE_STATIC_KEY_FALSE(sysctl_flow_dissect_vnet_hdr_key); + static inline int virtio_net_hdr_to_skb(struct sk_buff *skb, const struct virtio_net_hdr *hdr, bool little_endian) { + struct flow_keys_basic keys; unsigned int gso_type = 0; unsigned int thlen = 0; unsigned int p_off = 0; @@ -78,13 +81,24 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb, p_off = 
skb_transport_offset(skb) + thlen; if (!pskb_may_pull(skb, p_off)) return -EINVAL; - } else { + } + + /* BPF flow dissection for optional strict validation. + * + * Admins can define permitted packets more strictly, such as dropping + * deprecated UDP_UFO packets and requiring skb->protocol to be non-zero + * and matching packet headers. + */ + if (static_branch_unlikely(&sysctl_flow_dissect_vnet_hdr_key) && + !__skb_flow_dissect_flow_keys_basic(NULL, skb, &keys, NULL, 0, 0, 0, + 0, hdr)) + return -EINVAL; + + if (!(hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM)) { /* gso packets without NEEDS_CSUM do not set transport_offset. * probe and drop if does not match one of the above types. */ if (gso_type && skb->network_header) { - struct flow_keys_basic keys; - if (!skb->protocol) { __be16 protocol = dev_parse_header_protocol(skb); @@ -92,8 +106,11 @@ static inline int virtio_net_hdr_to_skb(struct sk_buff *skb, if (protocol && protocol != skb->protocol) return -EINVAL; } + retry: - if (!skb_flow_dissect_flow_keys_basic(NULL, skb, &keys, + /* only if flow dissection not already done */ + if (!static_branch_unlikely(&sysctl_flow_dissect_vnet_hdr_key) && + !skb_flow_dissect_flow_keys_basic(NULL, skb, &keys, NULL, 0, 0, 0, 0)) { /* UFO does not specify ipv4 or 6: try both */ diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c index 4b171ebec084..ae2ce382f4f4 100644 --- a/net/core/flow_dissector.c +++ b/net/core/flow_dissector.c @@ -35,6 +35,9 @@ #endif #include +DEFINE_STATIC_KEY_FALSE(sysctl_flow_dissect_vnet_hdr_key); +EXPORT_SYMBOL(sysctl_flow_dissect_vnet_hdr_key); + static void dissector_set_key(struct flow_dissector *flow_dissector, enum flow_dissector_key_id key_id) { diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c index c8496c1142c9..c01b9366bb75 100644 --- a/net/core/sysctl_net_core.c +++ b/net/core/sysctl_net_core.c @@ -36,6 +36,8 @@ static int net_msg_warn; /* Unused, but still a sysctl */ int sysctl_fb_tunnels_only_for_init_net 
__read_mostly = 0; EXPORT_SYMBOL(sysctl_fb_tunnels_only_for_init_net); +DECLARE_STATIC_KEY_FALSE(sysctl_flow_dissect_vnet_hdr_key); + /* 0 - Keep current behavior: * IPv4: inherit all current settings from init_net * IPv6: reset all settings to default @@ -580,6 +582,13 @@ static struct ctl_table net_core_table[] = { .extra1 = SYSCTL_ONE, .extra2 = &int_3600, }, + { + .procname = "flow_dissect_vnet_hdr", + .data = &sysctl_flow_dissect_vnet_hdr_key.key, + .maxlen = sizeof(sysctl_flow_dissect_vnet_hdr_key), + .mode = 0644, + .proc_handler = proc_do_static_key, + }, { } };

From patchwork Tue Jun 1 22:18:40 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tanner Love
X-Patchwork-Id: 453107
From: Tanner Love
To: netdev@vger.kernel.org
Cc: davem@davemloft.net, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Eric Dumazet, Willem de Bruijn, Petar Penkov, Jakub Kicinski, Tanner Love
Subject: [PATCH net-next v3 3/3] selftests/net: amend bpf flow dissector prog to do vnet hdr validation
Date: Tue, 1 Jun 2021 18:18:40 -0400
Message-Id: <20210601221841.1251830-4-tannerlove.kernel@gmail.com>
X-Mailer: git-send-email 2.32.0.rc0.204.g9fa02ecfa5-goog
In-Reply-To: <20210601221841.1251830-1-tannerlove.kernel@gmail.com>
References: <20210601221841.1251830-1-tannerlove.kernel@gmail.com>
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

From: Tanner Love

Change the BPF flow dissector program to perform various checks on the virtio_net_hdr fields after doing flow dissection.

Amend test_flow_dissector.(c|sh) to add test cases that inject packets with reasonable or unreasonable virtio-net headers and assert that bad packets are dropped and good packets are not. Do this via packet socket; the kernel executes tpacket_snd, which enters virtio_net_hdr_to_skb, where flow dissection / vnet header validation occurs.
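For illustration only, the kind of consistency check the selftest program applies to a TCP/IPv4 GSO header can be sketched in plain userspace C. This is a minimal sketch and not the code in this patch: struct vnet_view, struct keys_view and tcpv4_gso_hdr_ok() are hypothetical names, the VNET_HDR_* constants are assumed to mirror the values in linux/virtio_net.h, and the explicit underflow guard on the payload length is an addition of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the uapi values in linux/virtio_net.h. */
#define VNET_HDR_F_NEEDS_CSUM 1
#define VNET_HDR_GSO_TCPV4    1
#define VNET_HDR_GSO_ECN      0x80

/* Size of a TCP header without options and offset of its checksum field. */
#define TCP_HDR_LEN  20
#define TCP_CSUM_OFF 16

/* Hypothetical stand-in for the skb->vhdr_* context fields. */
struct vnet_view {
	uint8_t  flags;
	uint8_t  gso_type;
	uint16_t hdr_len;
	uint16_t gso_size;
	uint16_t csum_start;
	uint16_t csum_offset;
};

/* Hypothetical subset of the dissected flow keys. */
struct keys_view {
	uint16_t thoff;     /* transport header offset */
	bool     is_tcp_v4; /* addr_proto is IPv4 and ip_proto is TCP */
	bool     is_encap;
};

/* True if the header is consistent with an unencapsulated TCP/IPv4 GSO
 * packet of pkt_len bytes, measured from the same base as thoff.
 */
static bool tcpv4_gso_hdr_ok(const struct vnet_view *h,
			     const struct keys_view *k, uint32_t pkt_len)
{
	uint32_t hdr_end;

	assert(h && k);
	hdr_end = (uint32_t)k->thoff + TCP_HDR_LEN;

	/* The ECN bit may accompany the GSO type; mask it off first. */
	if ((h->gso_type & ~VNET_HDR_GSO_ECN) != VNET_HDR_GSO_TCPV4)
		return false;
	if (!(h->flags & VNET_HDR_F_NEEDS_CSUM))
		return false;
	if (k->is_encap || !k->is_tcp_v4)
		return false;
	/* Payload must exceed gso_size (sketch-only underflow guard first). */
	if (pkt_len < hdr_end || h->gso_size >= pkt_len - hdr_end)
		return false;
	/* hdr_len, when set, must point at the start of the payload. */
	if (h->hdr_len && h->hdr_len != hdr_end)
		return false;
	/* Checksum starts at the TCP header and lands in its check field. */
	return h->csum_start == k->thoff && h->csum_offset == TCP_CSUM_OFF;
}
```

In the selftest itself the equivalent checks run inside the BPF program and a failed check is reported as BPF_DROP, which makes virtio_net_hdr_to_skb return -EINVAL when the sysctl is enabled.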
Signed-off-by: Tanner Love Reviewed-by: Willem de Bruijn --- tools/testing/selftests/bpf/progs/bpf_flow.c | 188 +++++++++++++----- .../selftests/bpf/test_flow_dissector.c | 181 +++++++++++++++-- .../selftests/bpf/test_flow_dissector.sh | 19 ++ 3 files changed, 321 insertions(+), 67 deletions(-) diff --git a/tools/testing/selftests/bpf/progs/bpf_flow.c b/tools/testing/selftests/bpf/progs/bpf_flow.c index 95a5a0778ed7..681cfbdba1d7 100644 --- a/tools/testing/selftests/bpf/progs/bpf_flow.c +++ b/tools/testing/selftests/bpf/progs/bpf_flow.c @@ -13,6 +13,7 @@ #include #include #include +#include #include #include #include @@ -71,15 +72,119 @@ struct { __type(value, struct bpf_flow_keys); } last_dissection SEC(".maps"); -static __always_inline int export_flow_keys(struct bpf_flow_keys *keys, - int ret) +/* Drops invalid virtio-net headers */ +static __always_inline int validate_virtio_net_hdr(const struct __sk_buff *skb) { + const struct bpf_flow_keys *keys = skb->flow_keys; + + /* Check gso */ + if (skb->vhdr_gso_type != VIRTIO_NET_HDR_GSO_NONE) { + if (!(skb->vhdr_flags & VIRTIO_NET_HDR_F_NEEDS_CSUM)) + return BPF_DROP; + + if (keys->is_encap) + return BPF_DROP; + + switch (skb->vhdr_gso_type & ~VIRTIO_NET_HDR_GSO_ECN) { + case VIRTIO_NET_HDR_GSO_TCPV4: + if (keys->addr_proto != ETH_P_IP || + keys->ip_proto != IPPROTO_TCP) + return BPF_DROP; + + if (skb->vhdr_gso_size >= skb->len - keys->thoff - + sizeof(struct tcphdr)) + return BPF_DROP; + + break; + case VIRTIO_NET_HDR_GSO_TCPV6: + if (keys->addr_proto != ETH_P_IPV6 || + keys->ip_proto != IPPROTO_TCP) + return BPF_DROP; + + if (skb->vhdr_gso_size >= skb->len - keys->thoff - + sizeof(struct tcphdr)) + return BPF_DROP; + + break; + case VIRTIO_NET_HDR_GSO_UDP: + if (keys->ip_proto != IPPROTO_UDP) + return BPF_DROP; + + if (skb->vhdr_gso_size >= skb->len - keys->thoff - + sizeof(struct udphdr)) + return BPF_DROP; + + break; + default: + return BPF_DROP; + } + } + + /* Check hdr_len */ + if (skb->vhdr_hdr_len) { + switch 
(keys->ip_proto) {
+		case IPPROTO_TCP:
+			if (skb->vhdr_hdr_len != skb->flow_keys->thoff +
+				sizeof(struct tcphdr))
+				return BPF_DROP;
+
+			break;
+		case IPPROTO_UDP:
+			if (skb->vhdr_hdr_len != keys->thoff +
+				sizeof(struct udphdr))
+				return BPF_DROP;
+
+			break;
+		}
+	}
+
+	/* Check csum */
+	if (skb->vhdr_flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
+		if (keys->addr_proto != ETH_P_IP &&
+		    keys->addr_proto != ETH_P_IPV6)
+			return BPF_DROP;
+
+		if (skb->vhdr_csum_start != keys->thoff)
+			return BPF_DROP;
+
+		switch (keys->ip_proto) {
+		case IPPROTO_TCP:
+			if (skb->vhdr_csum_offset !=
+			    offsetof(struct tcphdr, check))
+				return BPF_DROP;
+
+			break;
+		case IPPROTO_UDP:
+			if (skb->vhdr_csum_offset !=
+			    offsetof(struct udphdr, check))
+				return BPF_DROP;
+
+			break;
+		default:
+			return BPF_DROP;
+		}
+	}
+
+	return BPF_OK;
+}
+
+/* Common steps to perform regardless of where protocol parsing finishes:
+ * 1. store flow keys in map
+ * 2. if parse result is BPF_OK, parse the vnet hdr if present
+ * 3. return the parse result
+ */
+static __always_inline int parse_epilogue(struct __sk_buff *skb, int ret)
+{
+	const struct bpf_flow_keys *keys = skb->flow_keys;
 	__u32 key = (__u32)(keys->sport) << 16 | keys->dport;
 	struct bpf_flow_keys val;

 	memcpy(&val, keys, sizeof(val));
 	bpf_map_update_elem(&last_dissection, &key, &val, BPF_ANY);
-	return ret;
+
+	if (ret != BPF_OK)
+		return ret;
+	return validate_virtio_net_hdr(skb);
 }

 #define IPV6_FLOWLABEL_MASK __bpf_constant_htonl(0x000FFFFF)
@@ -114,8 +219,6 @@ static __always_inline void *bpf_flow_dissect_get_header(struct __sk_buff *skb,
 /* Dispatches on ETHERTYPE */
 static __always_inline int parse_eth_proto(struct __sk_buff *skb, __be16 proto)
 {
-	struct bpf_flow_keys *keys = skb->flow_keys;
-
 	switch (proto) {
 	case bpf_htons(ETH_P_IP):
 		bpf_tail_call_static(skb, &jmp_table, IP);
@@ -131,12 +234,10 @@ static __always_inline int parse_eth_proto(struct __sk_buff *skb, __be16 proto)
 	case bpf_htons(ETH_P_8021AD):
 		bpf_tail_call_static(skb, &jmp_table, VLAN);
 		break;
-	default:
-		/* Protocol not supported */
-		return export_flow_keys(keys, BPF_DROP);
 	}

-	return export_flow_keys(keys, BPF_DROP);
+	/* Protocol not supported */
+	return parse_epilogue(skb, BPF_DROP);
 }

 SEC("flow_dissector")
@@ -162,28 +263,28 @@ static __always_inline int parse_ip_proto(struct __sk_buff *skb, __u8 proto)
 	case IPPROTO_ICMP:
 		icmp = bpf_flow_dissect_get_header(skb, sizeof(*icmp), &_icmp);
 		if (!icmp)
-			return export_flow_keys(keys, BPF_DROP);
-		return export_flow_keys(keys, BPF_OK);
+			return parse_epilogue(skb, BPF_DROP);
+		return parse_epilogue(skb, BPF_OK);
 	case IPPROTO_IPIP:
 		keys->is_encap = true;
 		if (keys->flags & BPF_FLOW_DISSECTOR_F_STOP_AT_ENCAP)
-			return export_flow_keys(keys, BPF_OK);
+			return parse_epilogue(skb, BPF_OK);

 		return parse_eth_proto(skb, bpf_htons(ETH_P_IP));
 	case IPPROTO_IPV6:
 		keys->is_encap = true;
 		if (keys->flags & BPF_FLOW_DISSECTOR_F_STOP_AT_ENCAP)
-			return export_flow_keys(keys, BPF_OK);
+			return parse_epilogue(skb, BPF_OK);

 		return parse_eth_proto(skb, bpf_htons(ETH_P_IPV6));
 	case IPPROTO_GRE:
 		gre = bpf_flow_dissect_get_header(skb, sizeof(*gre), &_gre);
 		if (!gre)
-			return export_flow_keys(keys, BPF_DROP);
+			return parse_epilogue(skb, BPF_DROP);

 		if (bpf_htons(gre->flags & GRE_VERSION))
 			/* Only inspect standard GRE packets with version 0 */
-			return export_flow_keys(keys, BPF_OK);
+			return parse_epilogue(skb, BPF_OK);

 		keys->thoff += sizeof(*gre); /* Step over GRE Flags and Proto */
 		if (GRE_IS_CSUM(gre->flags))
@@ -195,13 +296,13 @@ static __always_inline int parse_ip_proto(struct __sk_buff *skb, __u8 proto)

 		keys->is_encap = true;
 		if (keys->flags & BPF_FLOW_DISSECTOR_F_STOP_AT_ENCAP)
-			return export_flow_keys(keys, BPF_OK);
+			return parse_epilogue(skb, BPF_OK);

 		if (gre->proto == bpf_htons(ETH_P_TEB)) {
 			eth = bpf_flow_dissect_get_header(skb, sizeof(*eth),
 							  &_eth);
 			if (!eth)
-				return export_flow_keys(keys, BPF_DROP);
+				return parse_epilogue(skb, BPF_DROP);

 			keys->thoff += sizeof(*eth);

@@ -212,37 +313,35 @@ static __always_inline int parse_ip_proto(struct __sk_buff *skb, __u8 proto)
 	case IPPROTO_TCP:
 		tcp = bpf_flow_dissect_get_header(skb, sizeof(*tcp), &_tcp);
 		if (!tcp)
-			return export_flow_keys(keys, BPF_DROP);
+			return parse_epilogue(skb, BPF_DROP);

 		if (tcp->doff < 5)
-			return export_flow_keys(keys, BPF_DROP);
+			return parse_epilogue(skb, BPF_DROP);

 		if ((__u8 *)tcp + (tcp->doff << 2) > data_end)
-			return export_flow_keys(keys, BPF_DROP);
+			return parse_epilogue(skb, BPF_DROP);

 		keys->sport = tcp->source;
 		keys->dport = tcp->dest;
-		return export_flow_keys(keys, BPF_OK);
+		return parse_epilogue(skb, BPF_OK);
 	case IPPROTO_UDP:
 	case IPPROTO_UDPLITE:
 		udp = bpf_flow_dissect_get_header(skb, sizeof(*udp), &_udp);
 		if (!udp)
-			return export_flow_keys(keys, BPF_DROP);
+			return parse_epilogue(skb, BPF_DROP);

 		keys->sport = udp->source;
 		keys->dport = udp->dest;
-		return export_flow_keys(keys, BPF_OK);
+		return parse_epilogue(skb, BPF_OK);
 	default:
-		return export_flow_keys(keys, BPF_DROP);
+		return parse_epilogue(skb, BPF_DROP);
 	}

-	return export_flow_keys(keys, BPF_DROP);
+	return parse_epilogue(skb, BPF_DROP);
 }

 static __always_inline int parse_ipv6_proto(struct __sk_buff *skb, __u8 nexthdr)
 {
-	struct bpf_flow_keys *keys = skb->flow_keys;
-
 	switch (nexthdr) {
 	case IPPROTO_HOPOPTS:
 	case IPPROTO_DSTOPTS:
@@ -255,7 +354,7 @@ static __always_inline int parse_ipv6_proto(struct __sk_buff *skb, __u8 nexthdr)
 		return parse_ip_proto(skb, nexthdr);
 	}

-	return export_flow_keys(keys, BPF_DROP);
+	return parse_epilogue(skb, BPF_DROP);
 }

 PROG(IP)(struct __sk_buff *skb)
@@ -268,11 +367,11 @@ PROG(IP)(struct __sk_buff *skb)

 	iph = bpf_flow_dissect_get_header(skb, sizeof(*iph), &_iph);
 	if (!iph)
-		return export_flow_keys(keys, BPF_DROP);
+		return parse_epilogue(skb, BPF_DROP);

 	/* IP header cannot be smaller than 20 bytes */
 	if (iph->ihl < 5)
-		return export_flow_keys(keys, BPF_DROP);
+		return parse_epilogue(skb, BPF_DROP);

 	keys->addr_proto = ETH_P_IP;
 	keys->ipv4_src = iph->saddr;
@@ -281,7 +380,7 @@ PROG(IP)(struct __sk_buff *skb)

 	keys->thoff += iph->ihl << 2;
 	if (data + keys->thoff > data_end)
-		return export_flow_keys(keys, BPF_DROP);
+		return parse_epilogue(skb, BPF_DROP);

 	if (iph->frag_off & bpf_htons(IP_MF | IP_OFFSET)) {
 		keys->is_frag = true;
@@ -302,7 +401,7 @@ PROG(IP)(struct __sk_buff *skb)
 	}

 	if (done)
-		return export_flow_keys(keys, BPF_OK);
+		return parse_epilogue(skb, BPF_OK);

 	return parse_ip_proto(skb, iph->protocol);
 }
@@ -314,7 +413,7 @@ PROG(IPV6)(struct __sk_buff *skb)

 	ip6h = bpf_flow_dissect_get_header(skb, sizeof(*ip6h), &_ip6h);
 	if (!ip6h)
-		return export_flow_keys(keys, BPF_DROP);
+		return parse_epilogue(skb, BPF_DROP);

 	keys->addr_proto = ETH_P_IPV6;
 	memcpy(&keys->ipv6_src, &ip6h->saddr, 2*sizeof(ip6h->saddr));
@@ -324,7 +423,7 @@ PROG(IPV6)(struct __sk_buff *skb)
 	keys->flow_label = ip6_flowlabel(ip6h);

 	if (keys->flags & BPF_FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL)
-		return export_flow_keys(keys, BPF_OK);
+		return parse_epilogue(skb, BPF_OK);

 	return parse_ipv6_proto(skb, ip6h->nexthdr);
 }
@@ -336,7 +435,7 @@ PROG(IPV6OP)(struct __sk_buff *skb)

 	ip6h = bpf_flow_dissect_get_header(skb, sizeof(*ip6h), &_ip6h);
 	if (!ip6h)
-		return export_flow_keys(keys, BPF_DROP);
+		return parse_epilogue(skb, BPF_DROP);

 	/* hlen is in 8-octets and does not include the first 8 bytes
 	 * of the header
@@ -354,7 +453,7 @@ PROG(IPV6FR)(struct __sk_buff *skb)

 	fragh = bpf_flow_dissect_get_header(skb, sizeof(*fragh), &_fragh);
 	if (!fragh)
-		return export_flow_keys(keys, BPF_DROP);
+		return parse_epilogue(skb, BPF_DROP);

 	keys->thoff += sizeof(*fragh);
 	keys->is_frag = true;
@@ -367,9 +466,9 @@ PROG(IPV6FR)(struct __sk_buff *skb)
 		 * explicitly asked for.
 		 */
 		if (!(keys->flags & BPF_FLOW_DISSECTOR_F_PARSE_1ST_FRAG))
-			return export_flow_keys(keys, BPF_OK);
+			return parse_epilogue(skb, BPF_OK);
 	} else {
-		return export_flow_keys(keys, BPF_OK);
+		return parse_epilogue(skb, BPF_OK);
 	}

 	return parse_ipv6_proto(skb, fragh->nexthdr);
@@ -377,14 +476,13 @@ PROG(IPV6FR)(struct __sk_buff *skb)

 PROG(MPLS)(struct __sk_buff *skb)
 {
-	struct bpf_flow_keys *keys = skb->flow_keys;
 	struct mpls_label *mpls, _mpls;

 	mpls = bpf_flow_dissect_get_header(skb, sizeof(*mpls), &_mpls);
 	if (!mpls)
-		return export_flow_keys(keys, BPF_DROP);
+		return parse_epilogue(skb, BPF_DROP);

-	return export_flow_keys(keys, BPF_OK);
+	return parse_epilogue(skb, BPF_OK);
 }

 PROG(VLAN)(struct __sk_buff *skb)
@@ -396,10 +494,10 @@ PROG(VLAN)(struct __sk_buff *skb)
 	if (keys->n_proto == bpf_htons(ETH_P_8021AD)) {
 		vlan = bpf_flow_dissect_get_header(skb, sizeof(*vlan), &_vlan);
 		if (!vlan)
-			return export_flow_keys(keys, BPF_DROP);
+			return parse_epilogue(skb, BPF_DROP);

 		if (vlan->h_vlan_encapsulated_proto != bpf_htons(ETH_P_8021Q))
-			return export_flow_keys(keys, BPF_DROP);
+			return parse_epilogue(skb, BPF_DROP);

 		keys->nhoff += sizeof(*vlan);
 		keys->thoff += sizeof(*vlan);
@@ -407,14 +505,14 @@ PROG(VLAN)(struct __sk_buff *skb)

 	vlan = bpf_flow_dissect_get_header(skb, sizeof(*vlan), &_vlan);
 	if (!vlan)
-		return export_flow_keys(keys, BPF_DROP);
+		return parse_epilogue(skb, BPF_DROP);

 	keys->nhoff += sizeof(*vlan);
 	keys->thoff += sizeof(*vlan);
 	/* Only allow 8021AD + 8021Q double tagging and no triple tagging.*/
 	if (vlan->h_vlan_encapsulated_proto == bpf_htons(ETH_P_8021AD) ||
 	    vlan->h_vlan_encapsulated_proto == bpf_htons(ETH_P_8021Q))
-		return export_flow_keys(keys, BPF_DROP);
+		return parse_epilogue(skb, BPF_DROP);

 	keys->n_proto = vlan->h_vlan_encapsulated_proto;
 	return parse_eth_proto(skb, vlan->h_vlan_encapsulated_proto);
diff --git a/tools/testing/selftests/bpf/test_flow_dissector.c b/tools/testing/selftests/bpf/test_flow_dissector.c
index 571cc076dd7d..583c13f75ead 100644
--- a/tools/testing/selftests/bpf/test_flow_dissector.c
+++ b/tools/testing/selftests/bpf/test_flow_dissector.c
@@ -17,6 +17,8 @@
 #include
 #include
 #include
+#include
+#include
 #include
 #include
 #include
@@ -65,7 +67,8 @@ struct guehdr {
 static uint8_t cfg_dsfield_inner;
 static uint8_t cfg_dsfield_outer;
 static uint8_t cfg_encap_proto;
-static bool cfg_expect_failure = false;
+static bool cfg_expect_norx;
+static bool cfg_expect_snd_failure;
 static int cfg_l3_extra = AF_UNSPEC; /* optional SIT prefix */
 static int cfg_l3_inner = AF_UNSPEC;
 static int cfg_l3_outer = AF_UNSPEC;
@@ -77,8 +80,14 @@ static int cfg_port_gue = 6080;
 static bool cfg_only_rx;
 static bool cfg_only_tx;
 static int cfg_src_port = 9;
+static bool cfg_tx_pf_packet;
+static bool cfg_use_vnet;
+static bool cfg_vnet_use_hdr_len_bad;
+static bool cfg_vnet_use_gso;
+static bool cfg_vnet_use_csum_off;
+static bool cfg_partial_udp_hdr;

-static char buf[ETH_DATA_LEN];
+static char buf[ETH_MAX_MTU];

 #define INIT_ADDR4(name, addr4, port) \
 	static struct sockaddr_in name = { \
@@ -273,8 +282,48 @@ static int l3_length(int family)
 		return sizeof(struct ipv6hdr);
 }

+static int build_vnet_header(void *header, int il3_len)
+{
+	struct virtio_net_hdr *vh = header;
+
+	vh->hdr_len = ETH_HLEN + il3_len + sizeof(struct udphdr);
+
+	if (cfg_partial_udp_hdr) {
+		vh->hdr_len -= (sizeof(struct udphdr) >> 1);
+		return sizeof(*vh);
+	}
+
+	/* Alteration must increase hdr_len; if not, kernel overwrites it */
+	if (cfg_vnet_use_hdr_len_bad)
+		vh->hdr_len++;
+
+	if (cfg_vnet_use_csum_off) {
+		vh->flags |= VIRTIO_NET_HDR_F_NEEDS_CSUM;
+		vh->csum_start = ETH_HLEN + il3_len;
+		vh->csum_offset = __builtin_offsetof(struct udphdr, check);
+	}
+
+	if (cfg_vnet_use_gso) {
+		vh->gso_type = VIRTIO_NET_HDR_GSO_UDP;
+		vh->gso_size = ETH_DATA_LEN - il3_len;
+	}
+
+	return sizeof(*vh);
+}
+
+static int build_eth_header(void *header)
+{
+	struct ethhdr *eth = header;
+	uint16_t proto = cfg_l3_inner == PF_INET ? ETH_P_IP : ETH_P_IPV6;
+
+	eth->h_proto = htons(proto);
+
+	return ETH_HLEN;
+}
+
 static int build_packet(void)
 {
+	int l2_len = 0;
 	int ol3_len = 0, ol4_len = 0, il3_len = 0, il4_len = 0;
 	int el3_len = 0;

@@ -294,23 +343,29 @@ static int build_packet(void)
 	il3_len = l3_length(cfg_l3_inner);
 	il4_len = sizeof(struct udphdr);

-	if (el3_len + ol3_len + ol4_len + il3_len + il4_len + cfg_payload_len >=
-	    sizeof(buf))
+	if (cfg_use_vnet)
+		l2_len += build_vnet_header(buf, il3_len);
+	if (cfg_tx_pf_packet)
+		l2_len += build_eth_header(buf + l2_len);
+
+	if (l2_len + el3_len + ol3_len + ol4_len + il3_len + il4_len +
+	    cfg_payload_len >= sizeof(buf))
 		error(1, 0, "packet too large\n");

 	/*
 	 * Fill packet from inside out, to calculate correct checksums.
 	 * But create ip before udp headers, as udp uses ip for pseudo-sum.
 	 */
-	memset(buf + el3_len + ol3_len + ol4_len + il3_len + il4_len,
+	memset(buf + l2_len + el3_len + ol3_len + ol4_len + il3_len + il4_len,
 	       cfg_payload_char, cfg_payload_len);

 	/* add zero byte for udp csum padding */
-	buf[el3_len + ol3_len + ol4_len + il3_len + il4_len + cfg_payload_len] = 0;
+	buf[l2_len + el3_len + ol3_len + ol4_len + il3_len + il4_len +
+	    cfg_payload_len] = 0;

 	switch (cfg_l3_inner) {
 	case PF_INET:
-		build_ipv4_header(buf + el3_len + ol3_len + ol4_len,
+		build_ipv4_header(buf + l2_len + el3_len + ol3_len + ol4_len,
 				  IPPROTO_UDP,
 				  in_saddr4.sin_addr.s_addr,
 				  in_daddr4.sin_addr.s_addr,
@@ -318,7 +373,7 @@ static int build_packet(void)
 				  cfg_dsfield_inner);
 		break;
 	case PF_INET6:
-		build_ipv6_header(buf + el3_len + ol3_len + ol4_len,
+		build_ipv6_header(buf + l2_len + el3_len + ol3_len + ol4_len,
 				  IPPROTO_UDP,
 				  &in_saddr6, &in_daddr6,
 				  il4_len + cfg_payload_len,
@@ -326,22 +381,25 @@ static int build_packet(void)
 		break;
 	}

-	build_udp_header(buf + el3_len + ol3_len + ol4_len + il3_len,
+	build_udp_header(buf + l2_len + el3_len + ol3_len + ol4_len + il3_len,
 			 cfg_payload_len, CFG_PORT_INNER, cfg_l3_inner);

+	if (cfg_partial_udp_hdr)
+		return l2_len + il3_len + (il4_len >> 1);
+
 	if (!cfg_encap_proto)
-		return il3_len + il4_len + cfg_payload_len;
+		return l2_len + il3_len + il4_len + cfg_payload_len;

 	switch (cfg_l3_outer) {
 	case PF_INET:
-		build_ipv4_header(buf + el3_len, cfg_encap_proto,
+		build_ipv4_header(buf + l2_len + el3_len, cfg_encap_proto,
 				  out_saddr4.sin_addr.s_addr,
 				  out_daddr4.sin_addr.s_addr,
 				  ol4_len + il3_len + il4_len + cfg_payload_len,
 				  cfg_dsfield_outer);
 		break;
 	case PF_INET6:
-		build_ipv6_header(buf + el3_len, cfg_encap_proto,
+		build_ipv6_header(buf + l2_len + el3_len, cfg_encap_proto,
 				  &out_saddr6, &out_daddr6,
 				  ol4_len + il3_len + il4_len + cfg_payload_len,
 				  cfg_dsfield_outer);
@@ -350,17 +408,17 @@ static int build_packet(void)

 	switch (cfg_encap_proto) {
 	case IPPROTO_UDP:
-		build_gue_header(buf + el3_len + ol3_len + ol4_len -
+		build_gue_header(buf + l2_len + el3_len + ol3_len + ol4_len -
 				 sizeof(struct guehdr),
 				 cfg_l3_inner == PF_INET ? IPPROTO_IPIP
 							 : IPPROTO_IPV6);
-		build_udp_header(buf + el3_len + ol3_len,
+		build_udp_header(buf + l2_len + el3_len + ol3_len,
 				 sizeof(struct guehdr) + il3_len + il4_len +
 				 cfg_payload_len,
 				 cfg_port_gue, cfg_l3_outer);
 		break;
 	case IPPROTO_GRE:
-		build_gre_header(buf + el3_len + ol3_len,
+		build_gre_header(buf + l2_len + el3_len + ol3_len,
 				 cfg_l3_inner == PF_INET ? ETH_P_IP
 							 : ETH_P_IPV6);
 		break;
@@ -368,7 +426,7 @@ static int build_packet(void)

 	switch (cfg_l3_extra) {
 	case PF_INET:
-		build_ipv4_header(buf,
+		build_ipv4_header(buf + l2_len,
 				  cfg_l3_outer == PF_INET ? IPPROTO_IPIP
 							  : IPPROTO_IPV6,
 				  extra_saddr4.sin_addr.s_addr,
@@ -377,7 +435,7 @@ static int build_packet(void)
 				  cfg_payload_len, 0);
 		break;
 	case PF_INET6:
-		build_ipv6_header(buf,
+		build_ipv6_header(buf + l2_len,
 				  cfg_l3_outer == PF_INET ? IPPROTO_IPIP
 							  : IPPROTO_IPV6,
 				  &extra_saddr6, &extra_daddr6,
@@ -386,15 +444,46 @@ static int build_packet(void)
 		break;
 	}

-	return el3_len + ol3_len + ol4_len + il3_len + il4_len +
+	return l2_len + el3_len + ol3_len + ol4_len + il3_len + il4_len +
 	       cfg_payload_len;
 }

+static int setup_tx_pfpacket(void)
+{
+	struct sockaddr_ll laddr = {0};
+	const int one = 1;
+	uint16_t proto;
+	int fd;
+
+	fd = socket(PF_PACKET, SOCK_RAW, 0);
+	if (fd == -1)
+		error(1, errno, "socket tx");
+
+	if (cfg_use_vnet &&
+	    setsockopt(fd, SOL_PACKET, PACKET_VNET_HDR, &one, sizeof(one)))
+		error(1, errno, "setsockopt vnet");
+
+	proto = cfg_l3_inner == PF_INET ? ETH_P_IP : ETH_P_IPV6;
+	laddr.sll_family = AF_PACKET;
+	laddr.sll_protocol = htons(proto);
+	laddr.sll_ifindex = if_nametoindex("lo");
+	if (!laddr.sll_ifindex)
+		error(1, errno, "if_nametoindex");
+
+	if (bind(fd, (void *)&laddr, sizeof(laddr)))
+		error(1, errno, "bind");
+
+	return fd;
+}
+
 /* sender transmits encapsulated over RAW or unencap'd over UDP */
 static int setup_tx(void)
 {
 	int family, fd, ret;

+	if (cfg_tx_pf_packet)
+		return setup_tx_pfpacket();
+
 	if (cfg_l3_extra)
 		family = cfg_l3_extra;
 	else if (cfg_l3_outer)
@@ -464,6 +553,13 @@ static int do_tx(int fd, const char *pkt, int len)
 	int ret;

 	ret = write(fd, pkt, len);
+
+	if (cfg_expect_snd_failure) {
+		if (ret == -1)
+			return 0;
+		error(1, 0, "expected tx to fail but it did not");
+	}
+
 	if (ret == -1)
 		error(1, errno, "send");
 	if (ret != len)
@@ -571,7 +667,7 @@ static int do_main(void)
 	 * success (== 0) only if received all packets
 	 * unless failure is expected, in which case none must arrive.
 	 */
-	if (cfg_expect_failure)
+	if (cfg_expect_norx || cfg_expect_snd_failure)
 		return rx != 0;
 	else
 		return rx != tx;
@@ -623,8 +719,12 @@ static void parse_opts(int argc, char **argv)
 {
 	int c;

-	while ((c = getopt(argc, argv, "d:D:e:f:Fhi:l:n:o:O:Rs:S:t:Tx:X:")) != -1) {
+	while ((c = getopt(argc, argv,
+			   "cd:D:e:Ef:FGghi:l:Ln:o:O:pRs:S:t:TUvx:X:")) != -1) {
 		switch (c) {
+		case 'c':
+			cfg_vnet_use_csum_off = true;
+			break;
 		case 'd':
 			if (cfg_l3_outer == AF_UNSPEC)
 				error(1, 0, "-d must be preceded by -o");
@@ -653,11 +753,17 @@ static void parse_opts(int argc, char **argv)
 			else
 				usage(argv[0]);
 			break;
+		case 'E':
+			cfg_expect_snd_failure = true;
+			break;
 		case 'f':
 			cfg_src_port = strtol(optarg, NULL, 0);
 			break;
 		case 'F':
-			cfg_expect_failure = true;
+			cfg_expect_norx = true;
+			break;
+		case 'g':
+			cfg_vnet_use_gso = true;
 			break;
 		case 'h':
 			usage(argv[0]);
@@ -673,6 +779,9 @@ static void parse_opts(int argc, char **argv)
 		case 'l':
 			cfg_payload_len = strtol(optarg, NULL, 0);
 			break;
+		case 'L':
+			cfg_vnet_use_hdr_len_bad = true;
+			break;
 		case 'n':
 			cfg_num_pkt = strtol(optarg, NULL, 0);
 			break;
@@ -682,6 +791,9 @@ static void parse_opts(int argc, char **argv)
 		case 'O':
 			cfg_l3_extra = parse_protocol_family(argv[0], optarg);
 			break;
+		case 'p':
+			cfg_tx_pf_packet = true;
+			break;
 		case 'R':
 			cfg_only_rx = true;
 			break;
@@ -703,6 +815,12 @@ static void parse_opts(int argc, char **argv)
 		case 'T':
 			cfg_only_tx = true;
 			break;
+		case 'U':
+			cfg_partial_udp_hdr = true;
+			break;
+		case 'v':
+			cfg_use_vnet = true;
+			break;
 		case 'x':
 			cfg_dsfield_outer = strtol(optarg, NULL, 0);
 			break;
@@ -733,7 +851,26 @@ static void parse_opts(int argc, char **argv)
 	 */
 	if (((cfg_dsfield_outer & 0x3) == 0x3) &&
 	    ((cfg_dsfield_inner & 0x3) == 0x0))
-		cfg_expect_failure = true;
+		cfg_expect_norx = true;
+
+	/* Don't wait around for packets that we expect to fail to send */
+	if (cfg_expect_snd_failure && !cfg_num_secs)
+		cfg_num_secs = 3;
+
+	if (cfg_partial_udp_hdr && cfg_encap_proto)
+		error(1, 0,
+		      "ops: can't specify partial UDP hdr (-U) and encap (-e)");
+
+	if (cfg_use_vnet && cfg_encap_proto)
+		error(1, 0, "options: can't specify encap (-e) with vnet (-v)");
+	if (cfg_use_vnet && !cfg_tx_pf_packet)
+		error(1, 0, "options: vnet (-v) requires psock for tx (-p)");
+	if (cfg_vnet_use_gso && !cfg_use_vnet)
+		error(1, 0, "options: gso (-g) requires vnet (-v)");
+	if (cfg_vnet_use_csum_off && !cfg_use_vnet)
+		error(1, 0, "options: vnet csum (-c) requires vnet (-v)");
+	if (cfg_vnet_use_hdr_len_bad && !cfg_use_vnet)
+		error(1, 0, "options: bad vnet hdrlen (-L) requires vnet (-v)");
 }

 static void print_opts(void)
diff --git a/tools/testing/selftests/bpf/test_flow_dissector.sh b/tools/testing/selftests/bpf/test_flow_dissector.sh
index 174b72a64a4c..5852cf815eeb 100755
--- a/tools/testing/selftests/bpf/test_flow_dissector.sh
+++ b/tools/testing/selftests/bpf/test_flow_dissector.sh
@@ -51,6 +51,9 @@ if [[ -z $(ip netns identify $$) ]]; then
 		echo "Skipping root flow dissector test, bpftool not found" >&2
 	fi

+	orig_flow_dissect_sysctl=$(