From patchwork Mon Jan 25 17:09:17 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jesper Dangaard Brouer X-Patchwork-Id: 371168 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-21.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, MENTIONS_GIT_HOSTING, SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D0524C433E6 for ; Tue, 26 Jan 2021 20:13:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9847E22255 for ; Tue, 26 Jan 2021 20:13:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732605AbhAZFYh (ORCPT ); Tue, 26 Jan 2021 00:24:37 -0500 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:36173 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729891AbhAYRKy (ORCPT ); Mon, 25 Jan 2021 12:10:54 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1611594568; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=gPZFpN+oBirNrqNtbu3XBQMkOtPoKz1KHF5o+KEb6HE=; b=VkuiXn/KifsgeSE30SQ9/9PQd4hVtznGEMOCoJPVXnL+Ehp0Uri4cyXX7Yn2Shvqln9Q+j bbRkHnkvw+enCsRR75sOz70/ZAPOZifaM1822MUXHH9YPCwA3jGvEEAgCEGRhCoSrLpBAK EGDWzvCkwV9Mm8/xGQ2G+ddCirCNQrI= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-251-ZlPuu6xcNCW0jYoYMCCNRQ-1; Mon, 25 Jan 2021 12:09:24 -0500 X-MC-Unique: ZlPuu6xcNCW0jYoYMCCNRQ-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id BD6B8802B45; Mon, 25 Jan 2021 17:09:21 +0000 (UTC) Received: from firesoul.localdomain (unknown [10.40.208.19]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3354A1007639; Mon, 25 Jan 2021 17:09:18 +0000 (UTC) Received: from [192.168.42.3] (localhost [IPv6:::1]) by firesoul.localdomain (Postfix) with ESMTP id 39E2732233490; Mon, 25 Jan 2021 18:09:17 +0100 (CET) Subject: [PATCH bpf-next V13 1/7] bpf: Remove MTU check in __bpf_skb_max_len From: Jesper Dangaard Brouer To: bpf@vger.kernel.org Cc: Jesper Dangaard Brouer , netdev@vger.kernel.org, Daniel Borkmann , Alexei Starovoitov , maze@google.com, lmb@cloudflare.com, shaun@tigera.io, Lorenzo Bianconi , marek@cloudflare.com, John Fastabend , Jakub Kicinski , eyal.birger@gmail.com, colrack@gmail.com Date: Mon, 25 Jan 2021 18:09:17 +0100 Message-ID: <161159455715.321749.12122913350638192791.stgit@firesoul> In-Reply-To: <161159451743.321749.17528005626909164523.stgit@firesoul> References: <161159451743.321749.17528005626909164523.stgit@firesoul> User-Agent: StGit/0.19 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Multiple BPF-helpers that can manipulate/increase the size of the SKB uses __bpf_skb_max_len() as the max-length. This function limit size against the current net_device MTU (skb->dev->mtu). When a BPF-prog grow the packet size, then it should not be limited to the MTU. The MTU is a transmit limitation, and software receiving this packet should be allowed to increase the size. Further more, current MTU check in __bpf_skb_max_len uses the MTU from ingress/current net_device, which in case of redirects uses the wrong net_device. This patch keeps a sanity max limit of SKB_MAX_ALLOC (16KiB). The real limit is elsewhere in the system. Jesper's testing[1] showed it was not possible to exceed 8KiB when expanding the SKB size via BPF-helper. The limiting factor is the define KMALLOC_MAX_CACHE_SIZE which is 8192 for SLUB-allocator (CONFIG_SLUB) in-case PAGE_SIZE is 4096. This define is in-effect due to this being called from softirq context see code __gfp_pfmemalloc_flags() and __do_kmalloc_node(). Jakub's testing showed that frames above 16KiB can cause NICs to reset (but not crash). Keep this sanity limit at this level as memory layer can differ based on kernel config. [1] https://github.com/xdp-project/bpf-examples/tree/master/MTU-tests Signed-off-by: Jesper Dangaard Brouer Acked-by: John Fastabend --- net/core/filter.c | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/net/core/filter.c b/net/core/filter.c index 9ab94e90d660..5beadd659091 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -3552,11 +3552,7 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff, return 0; } -static u32 __bpf_skb_max_len(const struct sk_buff *skb) -{ - return skb->dev ? skb->dev->mtu + skb->dev->hard_header_len : - SKB_MAX_ALLOC; -} +#define BPF_SKB_MAX_LEN SKB_MAX_ALLOC BPF_CALL_4(sk_skb_adjust_room, struct sk_buff *, skb, s32, len_diff, u32, mode, u64, flags) @@ -3605,7 +3601,7 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff, { u32 len_cur, len_diff_abs = abs(len_diff); u32 len_min = bpf_skb_net_base_len(skb); - u32 len_max = __bpf_skb_max_len(skb); + u32 len_max = BPF_SKB_MAX_LEN; __be16 proto = skb->protocol; bool shrink = len_diff < 0; u32 off; @@ -3688,7 +3684,7 @@ static int bpf_skb_trim_rcsum(struct sk_buff *skb, unsigned int new_len) static inline int __bpf_skb_change_tail(struct sk_buff *skb, u32 new_len, u64 flags) { - u32 max_len = __bpf_skb_max_len(skb); + u32 max_len = BPF_SKB_MAX_LEN; u32 min_len = __bpf_skb_min_len(skb); int ret; @@ -3764,7 +3760,7 @@ static const struct bpf_func_proto sk_skb_change_tail_proto = { static inline int __bpf_skb_change_head(struct sk_buff *skb, u32 head_room, u64 flags) { - u32 max_len = __bpf_skb_max_len(skb); + u32 max_len = BPF_SKB_MAX_LEN; u32 new_len = skb->len + head_room; int ret; From patchwork Mon Jan 25 17:09:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jesper Dangaard Brouer X-Patchwork-Id: 371213 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 976C0C433E0 for ; Tue, 26 Jan 2021 05:26:23 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5D46022795 for ; Tue, 26 Jan 2021 05:26:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1731328AbhAZFZ7 (ORCPT ); Tue, 26 Jan 2021 00:25:59 -0500 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:46974 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729700AbhAYRLQ (ORCPT ); Mon, 25 Jan 2021 12:11:16 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1611594583; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5+dS/DO1/Ydnxzy1kuuwCDu6pwjLE9LFukvK7qV/gt4=; b=M5NOsf4S1W8+mnqg5XxeFOFb/qf13a8pom9x5o5YjzUsyurtrlt4vTCMh2e82g3+TPV4Jm bGu0UY/yaALAT7eccPlj//TAkAiMqQaYhXD9zKmekzFoFYHKwJGd3HxsriQ2ZnAeabdRTP 2cOaVtFhGq65/sYO8bYkdTAlC89E61Y= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-357-JVt3ddp-Mp2IYcIN-nQ4ng-1; Mon, 25 Jan 2021 12:09:39 -0500 X-MC-Unique: JVt3ddp-Mp2IYcIN-nQ4ng-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 13BA0107ACE3; Mon, 25 Jan 2021 17:09:37 +0000 (UTC) Received: from firesoul.localdomain (unknown [10.40.208.19]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5A6C45D9DC; Mon, 25 Jan 2021 17:09:33 +0000 (UTC) Received: from [192.168.42.3] (localhost [IPv6:::1]) by firesoul.localdomain (Postfix) with ESMTP id 7024C32233490; Mon, 25 Jan 2021 18:09:32 +0100 (CET) Subject: [PATCH bpf-next V13 4/7] bpf: add BPF-helper for MTU checking From: Jesper Dangaard Brouer To: bpf@vger.kernel.org Cc: Jesper Dangaard Brouer , netdev@vger.kernel.org, Daniel Borkmann , Alexei Starovoitov , maze@google.com, lmb@cloudflare.com, shaun@tigera.io, Lorenzo Bianconi , marek@cloudflare.com, John Fastabend , Jakub Kicinski , eyal.birger@gmail.com, colrack@gmail.com Date: Mon, 25 Jan 2021 18:09:32 +0100 Message-ID: <161159457239.321749.9067604476261493815.stgit@firesoul> In-Reply-To: <161159451743.321749.17528005626909164523.stgit@firesoul> References: <161159451743.321749.17528005626909164523.stgit@firesoul> User-Agent: StGit/0.19 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org This BPF-helper bpf_check_mtu() works for both XDP and TC-BPF programs. The SKB object is complex and the skb->len value (accessible from BPF-prog) also include the length of any extra GRO/GSO segments, but without taking into account that these GRO/GSO segments get added transport (L4) and network (L3) headers before being transmitted. Thus, this BPF-helper is created such that the BPF-programmer don't need to handle these details in the BPF-prog. The API is designed to help the BPF-programmer, that want to do packet context size changes, which involves other helpers. These other helpers usually does a delta size adjustment. This helper also support a delta size (len_diff), which allow BPF-programmer to reuse arguments needed by these other helpers, and perform the MTU check prior to doing any actual size adjustment of the packet context. It is on purpose, that we allow the len adjustment to become a negative result, that will pass the MTU check. This might seem weird, but it's not this helpers responsibility to "catch" wrong len_diff adjustments. Other helpers will take care of these checks, if BPF-programmer chooses to do actual size adjustment. V13: - Enforce flag BPF_MTU_CHK_SEGS cannot use len_diff. V12: - Simplify segment check that calls skb_gso_validate_network_len. - Helpers should return long V9: - Use dev->hard_header_len (instead of ETH_HLEN) - Annotate with unlikely req from Daniel - Fix logic error using skb_gso_validate_network_len from Daniel V6: - Took John's advice and dropped BPF_MTU_CHK_RELAX - Returned MTU is kept at L3-level (like fib_lookup) V4: Lot of changes - ifindex 0 now use current netdev for MTU lookup - rename helper from bpf_mtu_check to bpf_check_mtu - fix bug for GSO pkt length (as skb->len is total len) - remove __bpf_len_adj_positive, simply allow negative len adj Signed-off-by: Jesper Dangaard Brouer --- include/uapi/linux/bpf.h | 67 ++++++++++++++++++++++++ net/core/filter.c | 114 ++++++++++++++++++++++++++++++++++++++++ tools/include/uapi/linux/bpf.h | 67 ++++++++++++++++++++++++ 3 files changed, 248 insertions(+) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 05bfc8c843dc..f17381a337ec 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3839,6 +3839,61 @@ union bpf_attr { * Return * A pointer to a struct socket on success or NULL if the file is * not a socket. + * + * long bpf_check_mtu(void *ctx, u32 ifindex, u32 *mtu_len, s32 len_diff, u64 flags) + * Description + * Check ctx packet size against MTU of net device (based on + * *ifindex*). This helper will likely be used in combination with + * helpers that adjust/change the packet size. The argument + * *len_diff* can be used for querying with a planned size + * change. This allows to check MTU prior to changing packet ctx. + * + * Specifying *ifindex* zero means the MTU check is performed + * against the current net device. This is practical if this isn't + * used prior to redirect. + * + * The Linux kernel route table can configure MTUs on a more + * specific per route level, which is not provided by this helper. + * For route level MTU checks use the **bpf_fib_lookup**\ () + * helper. + * + * *ctx* is either **struct xdp_md** for XDP programs or + * **struct sk_buff** for tc cls_act programs. + * + * The *flags* argument can be a combination of one or more of the + * following values: + * + * **BPF_MTU_CHK_SEGS** + * This flag will only works for *ctx* **struct sk_buff**. + * If packet context contains extra packet segment buffers + * (often knows as GSO skb), then MTU check is harder to + * check at this point, because in transmit path it is + * possible for the skb packet to get re-segmented + * (depending on net device features). This could still be + * a MTU violation, so this flag enables performing MTU + * check against segments, with a different violation + * return code to tell it apart. Check cannot use len_diff. + * + * On return *mtu_len* pointer contains the MTU value of the net + * device. Remember the net device configured MTU is the L3 size, + * which is returned here and XDP and TX length operate at L2. + * Helper take this into account for you, but remember when using + * MTU value in your BPF-code. On input *mtu_len* must be a valid + * pointer and be initialized (to zero), else verifier will reject + * BPF program. + * + * Return + * * 0 on success, and populate MTU value in *mtu_len* pointer. + * + * * < 0 if any input argument is invalid (*mtu_len* not updated) + * + * MTU violations return positive values, but also populate MTU + * value in *mtu_len* pointer, as this can be needed for + * implementing PMTU handing: + * + * * **BPF_MTU_CHK_RET_FRAG_NEEDED** + * * **BPF_MTU_CHK_RET_SEGS_TOOBIG** + * */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -4004,6 +4059,7 @@ union bpf_attr { FN(ktime_get_coarse_ns), \ FN(ima_inode_hash), \ FN(sock_from_file), \ + FN(check_mtu), \ /* */ /* integer value in 'imm' field of BPF_CALL instruction selects which helper @@ -5036,6 +5092,17 @@ struct bpf_redir_neigh { }; }; +/* bpf_check_mtu flags*/ +enum bpf_check_mtu_flags { + BPF_MTU_CHK_SEGS = (1U << 0), +}; + +enum bpf_check_mtu_ret { + BPF_MTU_CHK_RET_SUCCESS, /* check and lookup successful */ + BPF_MTU_CHK_RET_FRAG_NEEDED, /* fragmentation required to fwd */ + BPF_MTU_CHK_RET_SEGS_TOOBIG, /* GSO re-segmentation needed to fwd */ +}; + enum bpf_task_fd_type { BPF_FD_TYPE_RAW_TRACEPOINT, /* tp name */ BPF_FD_TYPE_TRACEPOINT, /* tp name */ diff --git a/net/core/filter.c b/net/core/filter.c index 7b1b9baaabd5..f6777e8f9ff7 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -5624,6 +5624,116 @@ static const struct bpf_func_proto bpf_skb_fib_lookup_proto = { .arg4_type = ARG_ANYTHING, }; +static struct net_device *__dev_via_ifindex(struct net_device *dev_curr, + u32 ifindex) +{ + struct net *netns = dev_net(dev_curr); + + /* Non-redirect use-cases can use ifindex=0 and save ifindex lookup */ + if (ifindex == 0) + return dev_curr; + + return dev_get_by_index_rcu(netns, ifindex); +} + +BPF_CALL_5(bpf_skb_check_mtu, struct sk_buff *, skb, + u32, ifindex, u32 *, mtu_len, s32, len_diff, u64, flags) +{ + int ret = BPF_MTU_CHK_RET_FRAG_NEEDED; + struct net_device *dev = skb->dev; + int skb_len, dev_len; + int mtu; + + if (unlikely(flags & ~(BPF_MTU_CHK_SEGS))) + return -EINVAL; + + if (unlikely(flags & BPF_MTU_CHK_SEGS && len_diff)) + return -EINVAL; + + dev = __dev_via_ifindex(dev, ifindex); + if (unlikely(!dev)) + return -ENODEV; + + mtu = READ_ONCE(dev->mtu); + + dev_len = mtu + dev->hard_header_len; + skb_len = skb->len + len_diff; /* minus result pass check */ + if (skb_len <= dev_len) { + ret = BPF_MTU_CHK_RET_SUCCESS; + goto out; + } + /* At this point, skb->len exceed MTU, but as it include length of all + * segments, it can still be below MTU. The SKB can possibly get + * re-segmented in transmit path (see validate_xmit_skb). Thus, user + * must choose if segs are to be MTU checked. + */ + if (skb_is_gso(skb)) { + ret = BPF_MTU_CHK_RET_SUCCESS; + + if (flags & BPF_MTU_CHK_SEGS && + !skb_gso_validate_network_len(skb, mtu)) + ret = BPF_MTU_CHK_RET_SEGS_TOOBIG; + } +out: + /* BPF verifier guarantees valid pointer */ + *mtu_len = mtu; + + return ret; +} + +BPF_CALL_5(bpf_xdp_check_mtu, struct xdp_buff *, xdp, + u32, ifindex, u32 *, mtu_len, s32, len_diff, u64, flags) +{ + struct net_device *dev = xdp->rxq->dev; + int xdp_len = xdp->data_end - xdp->data; + int ret = BPF_MTU_CHK_RET_SUCCESS; + int mtu, dev_len; + + /* XDP variant doesn't support multi-buffer segment check (yet) */ + if (unlikely(flags)) + return -EINVAL; + + dev = __dev_via_ifindex(dev, ifindex); + if (unlikely(!dev)) + return -ENODEV; + + mtu = READ_ONCE(dev->mtu); + + /* Add L2-header as dev MTU is L3 size */ + dev_len = mtu + dev->hard_header_len; + + xdp_len += len_diff; /* minus result pass check */ + if (xdp_len > dev_len) + ret = BPF_MTU_CHK_RET_FRAG_NEEDED; + + /* BPF verifier guarantees valid pointer */ + *mtu_len = mtu; + + return ret; +} + +static const struct bpf_func_proto bpf_skb_check_mtu_proto = { + .func = bpf_skb_check_mtu, + .gpl_only = true, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_CTX, + .arg2_type = ARG_ANYTHING, + .arg3_type = ARG_PTR_TO_INT, + .arg4_type = ARG_ANYTHING, + .arg5_type = ARG_ANYTHING, +}; + +static const struct bpf_func_proto bpf_xdp_check_mtu_proto = { + .func = bpf_xdp_check_mtu, + .gpl_only = true, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_CTX, + .arg2_type = ARG_ANYTHING, + .arg3_type = ARG_PTR_TO_INT, + .arg4_type = ARG_ANYTHING, + .arg5_type = ARG_ANYTHING, +}; + #if IS_ENABLED(CONFIG_IPV6_SEG6_BPF) static int bpf_push_seg6_encap(struct sk_buff *skb, u32 type, void *hdr, u32 len) { @@ -7193,6 +7303,8 @@ tc_cls_act_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) return &bpf_get_socket_uid_proto; case BPF_FUNC_fib_lookup: return &bpf_skb_fib_lookup_proto; + case BPF_FUNC_check_mtu: + return &bpf_skb_check_mtu_proto; case BPF_FUNC_sk_fullsock: return &bpf_sk_fullsock_proto; case BPF_FUNC_sk_storage_get: @@ -7262,6 +7374,8 @@ xdp_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) return &bpf_xdp_adjust_tail_proto; case BPF_FUNC_fib_lookup: return &bpf_xdp_fib_lookup_proto; + case BPF_FUNC_check_mtu: + return &bpf_xdp_check_mtu_proto; #ifdef CONFIG_INET case BPF_FUNC_sk_lookup_udp: return &bpf_xdp_sk_lookup_udp_proto; diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 05bfc8c843dc..f17381a337ec 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -3839,6 +3839,61 @@ union bpf_attr { * Return * A pointer to a struct socket on success or NULL if the file is * not a socket. + * + * long bpf_check_mtu(void *ctx, u32 ifindex, u32 *mtu_len, s32 len_diff, u64 flags) + * Description + * Check ctx packet size against MTU of net device (based on + * *ifindex*). This helper will likely be used in combination with + * helpers that adjust/change the packet size. The argument + * *len_diff* can be used for querying with a planned size + * change. This allows to check MTU prior to changing packet ctx. + * + * Specifying *ifindex* zero means the MTU check is performed + * against the current net device. This is practical if this isn't + * used prior to redirect. + * + * The Linux kernel route table can configure MTUs on a more + * specific per route level, which is not provided by this helper. + * For route level MTU checks use the **bpf_fib_lookup**\ () + * helper. + * + * *ctx* is either **struct xdp_md** for XDP programs or + * **struct sk_buff** for tc cls_act programs. + * + * The *flags* argument can be a combination of one or more of the + * following values: + * + * **BPF_MTU_CHK_SEGS** + * This flag will only works for *ctx* **struct sk_buff**. + * If packet context contains extra packet segment buffers + * (often knows as GSO skb), then MTU check is harder to + * check at this point, because in transmit path it is + * possible for the skb packet to get re-segmented + * (depending on net device features). This could still be + * a MTU violation, so this flag enables performing MTU + * check against segments, with a different violation + * return code to tell it apart. Check cannot use len_diff. + * + * On return *mtu_len* pointer contains the MTU value of the net + * device. Remember the net device configured MTU is the L3 size, + * which is returned here and XDP and TX length operate at L2. + * Helper take this into account for you, but remember when using + * MTU value in your BPF-code. On input *mtu_len* must be a valid + * pointer and be initialized (to zero), else verifier will reject + * BPF program. + * + * Return + * * 0 on success, and populate MTU value in *mtu_len* pointer. + * + * * < 0 if any input argument is invalid (*mtu_len* not updated) + * + * MTU violations return positive values, but also populate MTU + * value in *mtu_len* pointer, as this can be needed for + * implementing PMTU handing: + * + * * **BPF_MTU_CHK_RET_FRAG_NEEDED** + * * **BPF_MTU_CHK_RET_SEGS_TOOBIG** + * */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -4004,6 +4059,7 @@ union bpf_attr { FN(ktime_get_coarse_ns), \ FN(ima_inode_hash), \ FN(sock_from_file), \ + FN(check_mtu), \ /* */ /* integer value in 'imm' field of BPF_CALL instruction selects which helper @@ -5036,6 +5092,17 @@ struct bpf_redir_neigh { }; }; +/* bpf_check_mtu flags*/ +enum bpf_check_mtu_flags { + BPF_MTU_CHK_SEGS = (1U << 0), +}; + +enum bpf_check_mtu_ret { + BPF_MTU_CHK_RET_SUCCESS, /* check and lookup successful */ + BPF_MTU_CHK_RET_FRAG_NEEDED, /* fragmentation required to fwd */ + BPF_MTU_CHK_RET_SEGS_TOOBIG, /* GSO re-segmentation needed to fwd */ +}; + enum bpf_task_fd_type { BPF_FD_TYPE_RAW_TRACEPOINT, /* tp name */ BPF_FD_TYPE_TRACEPOINT, /* tp name */ From patchwork Mon Jan 25 17:09:42 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jesper Dangaard Brouer X-Patchwork-Id: 371169 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5508CC433E0 for ; Tue, 26 Jan 2021 20:12:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0F2A322241 for ; Tue, 26 Jan 2021 20:12:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726747AbhAZFYv (ORCPT ); Tue, 26 Jan 2021 00:24:51 -0500 Received: from us-smtp-delivery-124.mimecast.com ([63.128.21.124]:20461 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730081AbhAYRLO (ORCPT ); Mon, 25 Jan 2021 12:11:14 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1611594587; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=iFs5YuS+UQBHxuk28XB6kdJ0sgDe9yXGAPFUS7xIqxM=; b=YUJWlxeXsCiP26ikHdBXjJmvDAi9X/H/dvH33urFo+HJ1m/hT7qFGkmdErRHAbLWrOfSXQ UqMV0SBIRn9iAuP+UezCvrkw15fqhcaE5DrxpLA0VXBhddMSs/V64AH2qWaKfQJwjwXGDL Ca0SDL+VwWjphbZ8a0jmyuZobyyB4VM= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-34-B49__65TOGeZ-SiN5OW5Qg-1; Mon, 25 Jan 2021 12:09:45 -0500 X-MC-Unique: B49__65TOGeZ-SiN5OW5Qg-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id D853719251A3; Mon, 25 Jan 2021 17:09:43 +0000 (UTC) Received: from firesoul.localdomain (unknown [10.40.208.19]) by smtp.corp.redhat.com (Postfix) with ESMTP id 84D3319C78; Mon, 25 Jan 2021 17:09:43 +0000 (UTC) Received: from [192.168.42.3] (localhost [IPv6:::1]) by firesoul.localdomain (Postfix) with ESMTP id 9063932233490; Mon, 25 Jan 2021 18:09:42 +0100 (CET) Subject: [PATCH bpf-next V13 6/7] selftests/bpf: use bpf_check_mtu in selftest test_cls_redirect From: Jesper Dangaard Brouer To: bpf@vger.kernel.org Cc: Jesper Dangaard Brouer , netdev@vger.kernel.org, Daniel Borkmann , Alexei Starovoitov , maze@google.com, lmb@cloudflare.com, shaun@tigera.io, Lorenzo Bianconi , marek@cloudflare.com, John Fastabend , Jakub Kicinski , eyal.birger@gmail.com, colrack@gmail.com Date: Mon, 25 Jan 2021 18:09:42 +0100 Message-ID: <161159458253.321749.4626116952494155329.stgit@firesoul> In-Reply-To: <161159451743.321749.17528005626909164523.stgit@firesoul> References: <161159451743.321749.17528005626909164523.stgit@firesoul> User-Agent: StGit/0.19 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org This demonstrate how bpf_check_mtu() helper can easily be used together with bpf_skb_adjust_room() helper, prior to doing size adjustment, as delta argument is already setup. Hint: This specific test can be selected like this: ./test_progs -t cls_redirect Signed-off-by: Jesper Dangaard Brouer Acked-by: John Fastabend --- .../selftests/bpf/progs/test_cls_redirect.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/tools/testing/selftests/bpf/progs/test_cls_redirect.c b/tools/testing/selftests/bpf/progs/test_cls_redirect.c index c9f8464996ea..3c1e042962e6 100644 --- a/tools/testing/selftests/bpf/progs/test_cls_redirect.c +++ b/tools/testing/selftests/bpf/progs/test_cls_redirect.c @@ -70,6 +70,7 @@ typedef struct { uint64_t errors_total_encap_adjust_failed; uint64_t errors_total_encap_buffer_too_small; uint64_t errors_total_redirect_loop; + uint64_t errors_total_encap_mtu_violate; } metrics_t; typedef enum { @@ -407,6 +408,7 @@ static INLINING ret_t forward_with_gre(struct __sk_buff *skb, encap_headers_t *e payload_off - sizeof(struct ethhdr) - sizeof(struct iphdr); int32_t delta = sizeof(struct gre_base_hdr) - encap_overhead; uint16_t proto = ETH_P_IP; + uint32_t mtu_len = 0; /* Loop protection: the inner packet's TTL is decremented as a safeguard * against any forwarding loop. As the only interesting field is the TTL @@ -479,6 +481,11 @@ static INLINING ret_t forward_with_gre(struct __sk_buff *skb, encap_headers_t *e } } + if (bpf_check_mtu(skb, skb->ifindex, &mtu_len, delta, 0)) { + metrics->errors_total_encap_mtu_violate++; + return TC_ACT_SHOT; + } + if (bpf_skb_adjust_room(skb, delta, BPF_ADJ_ROOM_NET, BPF_F_ADJ_ROOM_FIXED_GSO | BPF_F_ADJ_ROOM_NO_CSUM_RESET) || From patchwork Mon Jan 25 17:09:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jesper Dangaard Brouer X-Patchwork-Id: 371170 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.0 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0FB77C433E0 for ; Tue, 26 Jan 2021 20:09:13 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A86CF22255 for ; Tue, 26 Jan 2021 20:09:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1732713AbhAZF0f (ORCPT ); Tue, 26 Jan 2021 00:26:35 -0500 Received: from us-smtp-delivery-124.mimecast.com ([63.128.21.124]:58937 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730100AbhAYRLb (ORCPT ); Mon, 25 Jan 2021 12:11:31 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1611594601; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=O/95Lw7Txt8aVPZGrqa5hMWBxFMssMIZa0dCMHYZkq0=; b=i4LITeK2ynJd7gbDScO/jaFE0Q6ljmZaX1ATU4FmO54zvr1K4Yyx4OV/t601Mw+p2iLHf1 DcVEQk4cLN2LqPN+AneyvANoNFCX/SFLJwySb53VqmB47rIw7XLJUF6zkqwinZ048gcvvX zrHqrMBu/TKOTZ/Y2BDdILNrZ5IzotM= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-310-idIqiHX-Pv2YWh9RUKP0Xw-1; Mon, 25 Jan 2021 12:09:57 -0500 X-MC-Unique: idIqiHX-Pv2YWh9RUKP0Xw-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 48995100F347; Mon, 25 Jan 2021 17:09:55 +0000 (UTC) Received: from firesoul.localdomain (unknown [10.40.208.19]) by smtp.corp.redhat.com (Postfix) with ESMTP id AE1925E1A4; Mon, 25 Jan 2021 17:09:48 +0000 (UTC) Received: from [192.168.42.3] (localhost [IPv6:::1]) by firesoul.localdomain (Postfix) with ESMTP id 9FC9532233490; Mon, 25 Jan 2021 18:09:47 +0100 (CET) Subject: [PATCH bpf-next V13 7/7] selftests/bpf: tests using bpf_check_mtu BPF-helper From: Jesper Dangaard Brouer To: bpf@vger.kernel.org Cc: Jesper Dangaard Brouer , netdev@vger.kernel.org, Daniel Borkmann , Alexei Starovoitov , maze@google.com, lmb@cloudflare.com, shaun@tigera.io, Lorenzo Bianconi , marek@cloudflare.com, John Fastabend , Jakub Kicinski , eyal.birger@gmail.com, colrack@gmail.com Date: Mon, 25 Jan 2021 18:09:47 +0100 Message-ID: <161159458760.321749.4512713874492237529.stgit@firesoul> In-Reply-To: <161159451743.321749.17528005626909164523.stgit@firesoul> References: <161159451743.321749.17528005626909164523.stgit@firesoul> User-Agent: StGit/0.19 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Adding selftest for BPF-helper bpf_check_mtu(). Making sure it can be used from both XDP and TC. V11: - Addresse nitpicks from Andrii Nakryiko V10: - Remove errno non-zero test in CHECK_ATTR() - Addresse comments from Andrii Nakryiko Signed-off-by: Jesper Dangaard Brouer Acked-by: Andrii Nakryiko --- tools/testing/selftests/bpf/prog_tests/check_mtu.c | 216 ++++++++++++++++++++ tools/testing/selftests/bpf/progs/test_check_mtu.c | 198 ++++++++++++++++++ 2 files changed, 414 insertions(+) create mode 100644 tools/testing/selftests/bpf/prog_tests/check_mtu.c create mode 100644 tools/testing/selftests/bpf/progs/test_check_mtu.c diff --git a/tools/testing/selftests/bpf/prog_tests/check_mtu.c b/tools/testing/selftests/bpf/prog_tests/check_mtu.c new file mode 100644 index 000000000000..9e2fd01b7c65 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/check_mtu.c @@ -0,0 +1,216 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020 Jesper Dangaard Brouer */ + +#include /* before test_progs.h, avoid bpf_util.h redefines */ +#include +#include "test_check_mtu.skel.h" +#include "network_helpers.h" + +#include +#include + +#define IFINDEX_LO 1 + +static __u32 duration; /* Hint: needed for CHECK macro */ + +static int read_mtu_device_lo(void) +{ + const char *filename = "/sys/class/net/lo/mtu"; + char buf[11] = {}; + int value, n, fd; + + fd = open(filename, 0, O_RDONLY); + if (fd == -1) + return -1; + + n = read(fd, buf, sizeof(buf)); + close(fd); + + if (n == -1) + return -2; + + value = strtoimax(buf, NULL, 10); + if (errno == ERANGE) + return -3; + + return value; +} + +static void test_check_mtu_xdp_attach() +{ + struct bpf_link_info link_info; + __u32 link_info_len = sizeof(link_info); + struct test_check_mtu *skel; + struct bpf_program *prog; + struct bpf_link *link; + int err = 0; + int fd; + + skel = test_check_mtu__open_and_load(); + if (CHECK(!skel, "open and load skel", "failed")) + return; /* Exit if e.g. helper unknown to kernel */ + + prog = skel->progs.xdp_use_helper_basic; + + link = bpf_program__attach_xdp(prog, IFINDEX_LO); + if (CHECK(IS_ERR(link), "link_attach", "failed: %ld\n", PTR_ERR(link))) + goto out; + skel->links.xdp_use_helper_basic = link; + + memset(&link_info, 0, sizeof(link_info)); + fd = bpf_link__fd(link); + err = bpf_obj_get_info_by_fd(fd, &link_info, &link_info_len); + if (CHECK(err, "link_info", "failed: %d\n", err)) + goto out; + + CHECK(link_info.type != BPF_LINK_TYPE_XDP, "link_type", + "got %u != exp %u\n", link_info.type, BPF_LINK_TYPE_XDP); + CHECK(link_info.xdp.ifindex != IFINDEX_LO, "link_ifindex", + "got %u != exp %u\n", link_info.xdp.ifindex, IFINDEX_LO); + + err = bpf_link__detach(link); + CHECK(err, "link_detach", "failed %d\n", err); + +out: + test_check_mtu__destroy(skel); +} + +static void test_check_mtu_run_xdp(struct test_check_mtu *skel, + struct bpf_program *prog, + __u32 mtu_expect) +{ + const char *prog_name = bpf_program__name(prog); + int retval_expect = XDP_PASS; + __u32 mtu_result = 0; + char buf[256] = {}; + int err; + struct bpf_prog_test_run_attr tattr = { + .repeat = 1, + .data_in = &pkt_v4, + .data_size_in = sizeof(pkt_v4), + .data_out = buf, + .data_size_out = sizeof(buf), + .prog_fd = bpf_program__fd(prog), + }; + + err = bpf_prog_test_run_xattr(&tattr); + CHECK_ATTR(err != 0, "bpf_prog_test_run", + "prog_name:%s (err %d errno %d retval %d)\n", + prog_name, err, errno, tattr.retval); + + CHECK(tattr.retval != retval_expect, "retval", + "progname:%s unexpected retval=%d expected=%d\n", + prog_name, tattr.retval, retval_expect); + + /* Extract MTU that BPF-prog got */ + mtu_result = skel->bss->global_bpf_mtu_xdp; + ASSERT_EQ(mtu_result, mtu_expect, "MTU-compare-user"); +} + + +static void test_check_mtu_xdp(__u32 mtu, __u32 ifindex) +{ + struct test_check_mtu *skel; + int err; + + skel = test_check_mtu__open(); + if (CHECK(!skel, "skel_open", "failed")) + return; + + /* Update "constants" in BPF-prog *BEFORE* libbpf load */ + skel->rodata->GLOBAL_USER_MTU = mtu; + skel->rodata->GLOBAL_USER_IFINDEX = ifindex; + + err = test_check_mtu__load(skel); + if (CHECK(err, "skel_load", "failed: %d\n", err)) + goto cleanup; + + test_check_mtu_run_xdp(skel, skel->progs.xdp_use_helper, mtu); + test_check_mtu_run_xdp(skel, skel->progs.xdp_exceed_mtu, mtu); + test_check_mtu_run_xdp(skel, skel->progs.xdp_minus_delta, mtu); + +cleanup: + test_check_mtu__destroy(skel); +} + +static void test_check_mtu_run_tc(struct test_check_mtu *skel, + struct bpf_program *prog, + __u32 mtu_expect) +{ + const char *prog_name = bpf_program__name(prog); + int retval_expect = BPF_OK; + __u32 mtu_result = 0; + char buf[256] = {}; + int err; + struct bpf_prog_test_run_attr tattr = { + .repeat = 1, + .data_in = &pkt_v4, + .data_size_in = sizeof(pkt_v4), + .data_out = buf, + .data_size_out = sizeof(buf), + .prog_fd = bpf_program__fd(prog), + }; + + err = bpf_prog_test_run_xattr(&tattr); + CHECK_ATTR(err != 0, "bpf_prog_test_run", + "prog_name:%s (err %d errno %d retval %d)\n", + prog_name, err, errno, tattr.retval); + + CHECK(tattr.retval != retval_expect, "retval", + "progname:%s unexpected retval=%d expected=%d\n", + prog_name, tattr.retval, retval_expect); + + /* Extract MTU that BPF-prog got */ + mtu_result = skel->bss->global_bpf_mtu_tc; + ASSERT_EQ(mtu_result, mtu_expect, "MTU-compare-user"); +} + + +static void test_check_mtu_tc(__u32 mtu, __u32 ifindex) +{ + struct test_check_mtu *skel; + int err; + + skel = test_check_mtu__open(); + if (CHECK(!skel, "skel_open", "failed")) + return; + + /* Update "constants" in BPF-prog *BEFORE* libbpf load */ + skel->rodata->GLOBAL_USER_MTU = mtu; + skel->rodata->GLOBAL_USER_IFINDEX = ifindex; + + err = test_check_mtu__load(skel); + if (CHECK(err, "skel_load", "failed: %d\n", err)) + goto cleanup; + + test_check_mtu_run_tc(skel, skel->progs.tc_use_helper, mtu); + test_check_mtu_run_tc(skel, skel->progs.tc_exceed_mtu, mtu); + test_check_mtu_run_tc(skel, skel->progs.tc_exceed_mtu_da, mtu); + test_check_mtu_run_tc(skel, skel->progs.tc_minus_delta, mtu); +cleanup: + test_check_mtu__destroy(skel); +} + +void test_check_mtu(void) +{ + __u32 mtu_lo; + + if (test__start_subtest("bpf_check_mtu XDP-attach")) + test_check_mtu_xdp_attach(); + + mtu_lo = read_mtu_device_lo(); + if (CHECK(mtu_lo < 0, "reading MTU value", "failed (err:%d)", mtu_lo)) + return; + + if (test__start_subtest("bpf_check_mtu XDP-run")) + test_check_mtu_xdp(mtu_lo, 0); + + if (test__start_subtest("bpf_check_mtu XDP-run ifindex-lookup")) + test_check_mtu_xdp(mtu_lo, IFINDEX_LO); + + if (test__start_subtest("bpf_check_mtu TC-run")) + test_check_mtu_tc(mtu_lo, 0); + + if (test__start_subtest("bpf_check_mtu TC-run ifindex-lookup")) + test_check_mtu_tc(mtu_lo, IFINDEX_LO); +} diff --git a/tools/testing/selftests/bpf/progs/test_check_mtu.c b/tools/testing/selftests/bpf/progs/test_check_mtu.c new file mode 100644 index 000000000000..b7787b43f9db --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_check_mtu.c @@ -0,0 +1,198 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2020 Jesper Dangaard Brouer */ + +#include +#include +#include + +#include +#include + +char _license[] SEC("license") = "GPL"; + +/* Userspace will update with MTU it can see on device */ +static volatile const int GLOBAL_USER_MTU; +static volatile const __u32 GLOBAL_USER_IFINDEX; + +/* BPF-prog will update these with MTU values it can see */ +__u32 global_bpf_mtu_xdp = 0; +__u32 global_bpf_mtu_tc = 0; + +SEC("xdp") +int xdp_use_helper_basic(struct xdp_md *ctx) +{ + __u32 mtu_len = 0; + + if (bpf_check_mtu(ctx, 0, &mtu_len, 0, 0)) + return XDP_ABORTED; + + return XDP_PASS; +} + +SEC("xdp") +int xdp_use_helper(struct xdp_md *ctx) +{ + int retval = XDP_PASS; /* Expected retval on successful test */ + __u32 mtu_len = 0; + __u32 ifindex = 0; + int delta = 0; + + /* When ifindex is zero, save net_device lookup and use ctx netdev */ + if (GLOBAL_USER_IFINDEX > 0) + ifindex = GLOBAL_USER_IFINDEX; + + if (bpf_check_mtu(ctx, ifindex, &mtu_len, delta, 0)) { + /* mtu_len is also valid when check fail */ + retval = XDP_ABORTED; + goto out; + } + + if (mtu_len != GLOBAL_USER_MTU) + retval = XDP_DROP; + +out: + global_bpf_mtu_xdp = mtu_len; + return retval; +} + +SEC("xdp") +int xdp_exceed_mtu(struct xdp_md *ctx) +{ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + __u32 ifindex = GLOBAL_USER_IFINDEX; + __u32 data_len = data_end - data; + int retval = XDP_ABORTED; /* Fail */ + __u32 mtu_len = 0; + int delta; + int err; + + /* Exceed MTU with 1 via delta adjust */ + delta = GLOBAL_USER_MTU - (data_len - ETH_HLEN) + 1; + + err = bpf_check_mtu(ctx, ifindex, &mtu_len, delta, 0); + if (err) { + retval = XDP_PASS; /* Success in exceeding MTU check */ + if (err != BPF_MTU_CHK_RET_FRAG_NEEDED) + retval = XDP_DROP; + } + + global_bpf_mtu_xdp = mtu_len; + return retval; +} + +SEC("xdp") +int xdp_minus_delta(struct xdp_md *ctx) +{ + int retval = XDP_PASS; /* Expected retval on successful test */ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + __u32 ifindex = GLOBAL_USER_IFINDEX; + __u32 data_len = data_end - data; + __u32 mtu_len = 0; + int delta; + + /* Borderline test case: Minus delta exceeding packet length allowed */ + delta = -((data_len - ETH_HLEN) + 1); + + /* Minus length (adjusted via delta) still pass MTU check, other helpers + * are responsible for catching this, when doing actual size adjust + */ + if (bpf_check_mtu(ctx, ifindex, &mtu_len, delta, 0)) + retval = XDP_ABORTED; + + global_bpf_mtu_xdp = mtu_len; + return retval; +} + +SEC("classifier") +int tc_use_helper(struct __sk_buff *ctx) +{ + int retval = BPF_OK; /* Expected retval on successful test */ + __u32 mtu_len = 0; + int delta = 0; + + if (bpf_check_mtu(ctx, 0, &mtu_len, delta, 0)) { + retval = BPF_DROP; + goto out; + } + + if (mtu_len != GLOBAL_USER_MTU) + retval = BPF_REDIRECT; +out: + global_bpf_mtu_tc = mtu_len; + return retval; +} + +SEC("classifier") +int tc_exceed_mtu(struct __sk_buff *ctx) +{ + __u32 ifindex = GLOBAL_USER_IFINDEX; + int retval = BPF_DROP; /* Fail */ + __u32 skb_len = ctx->len; + __u32 mtu_len = 0; + int delta; + int err; + + /* Exceed MTU with 1 via delta adjust */ + delta = GLOBAL_USER_MTU - (skb_len - ETH_HLEN) + 1; + + err = bpf_check_mtu(ctx, ifindex, &mtu_len, delta, 0); + if (err) { + retval = BPF_OK; /* Success in exceeding MTU check */ + if (err != BPF_MTU_CHK_RET_FRAG_NEEDED) + retval = BPF_DROP; + } + + global_bpf_mtu_tc = mtu_len; + return retval; +} + +SEC("classifier") +int tc_exceed_mtu_da(struct __sk_buff *ctx) +{ + /* SKB Direct-Access variant */ + void *data_end = (void *)(long)ctx->data_end; + void *data = (void *)(long)ctx->data; + __u32 ifindex = GLOBAL_USER_IFINDEX; + __u32 data_len = data_end - data; + int retval = BPF_DROP; /* Fail */ + __u32 mtu_len = 0; + int delta; + int err; + + /* Exceed MTU with 1 via delta adjust */ + delta = GLOBAL_USER_MTU - (data_len - ETH_HLEN) + 1; + + err = bpf_check_mtu(ctx, ifindex, &mtu_len, delta, 0); + if (err) { + retval = BPF_OK; /* Success in exceeding MTU check */ + if (err != BPF_MTU_CHK_RET_FRAG_NEEDED) + retval = BPF_DROP; + } + + global_bpf_mtu_tc = mtu_len; + return retval; +} + +SEC("classifier") +int tc_minus_delta(struct __sk_buff *ctx) +{ + int retval = BPF_OK; /* Expected retval on successful test */ + __u32 ifindex = GLOBAL_USER_IFINDEX; + __u32 skb_len = ctx->len; + __u32 mtu_len = 0; + int delta; + + /* Borderline test case: Minus delta exceeding packet length allowed */ + delta = -((skb_len - ETH_HLEN) + 1); + + /* Minus length (adjusted via delta) still pass MTU check, other helpers + * are responsible for catching this, when doing actual size adjust + */ + if (bpf_check_mtu(ctx, ifindex, &mtu_len, delta, 0)) + retval = BPF_DROP; + + global_bpf_mtu_xdp = mtu_len; + return retval; +}