From patchwork Fri Oct 30 16:50:52 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jesper Dangaard Brouer X-Patchwork-Id: 315972 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-14.9 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, MENTIONS_GIT_HOSTING, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 000CBC56201 for ; Fri, 30 Oct 2020 16:51:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A597120756 for ; Fri, 30 Oct 2020 16:51:03 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="PmRJ2QpZ" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727109AbgJ3QvC (ORCPT ); Fri, 30 Oct 2020 12:51:02 -0400 Received: from us-smtp-delivery-124.mimecast.com ([63.128.21.124]:39898 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725808AbgJ3QvB (ORCPT ); Fri, 30 Oct 2020 12:51:01 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1604076660; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=EzMI+v2ILDpTyy8xf2G2Jpbn7b3vnXL/BR6ybaJuMdo=; b=PmRJ2QpZK640cjvInODblH+7U3zk5Nh+easoWAbfnkgCF6mLPTeYKM20StGwQrzY7V8KUG HMNLLAgpe5jFGaQfII/pRI0yu791iu1HZwRXnu9wiFhOEMHWxgja5Ki+v5/KX4OSr9AIZH w3GCmHYrxpfuXn2r/B1PiC/8XXOf/+k= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-556-rqwsCfzDOreIGSGrdL-K6A-1; Fri, 30 Oct 2020 12:50:55 -0400 X-MC-Unique: rqwsCfzDOreIGSGrdL-K6A-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 87710760C7; Fri, 30 Oct 2020 16:50:53 +0000 (UTC) Received: from firesoul.localdomain (unknown [10.40.208.22]) by smtp.corp.redhat.com (Postfix) with ESMTP id 326AC5D99B; Fri, 30 Oct 2020 16:50:53 +0000 (UTC) Received: from [192.168.42.3] (localhost [IPv6:::1]) by firesoul.localdomain (Postfix) with ESMTP id 42D9F30736C8B; Fri, 30 Oct 2020 17:50:52 +0100 (CET) Subject: [PATCH bpf-next V5 1/5] bpf: Remove MTU check in __bpf_skb_max_len From: Jesper Dangaard Brouer To: bpf@vger.kernel.org Cc: Jesper Dangaard Brouer , netdev@vger.kernel.org, Daniel Borkmann , Alexei Starovoitov , maze@google.com, lmb@cloudflare.com, shaun@tigera.io, Lorenzo Bianconi , marek@cloudflare.com, John Fastabend , Jakub Kicinski , eyal.birger@gmail.com Date: Fri, 30 Oct 2020 17:50:52 +0100 Message-ID: <160407665219.1525159.16849637089559475976.stgit@firesoul> In-Reply-To: <160407661383.1525159.12855559773280533146.stgit@firesoul> References: <160407661383.1525159.12855559773280533146.stgit@firesoul> User-Agent: StGit/0.19 MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Multiple BPF-helpers that can manipulate/increase the size of the SKB uses __bpf_skb_max_len() as the max-length. This function limit size against the current net_device MTU (skb->dev->mtu). When a BPF-prog grow the packet size, then it should not be limited to the MTU. The MTU is a transmit limitation, and software receiving this packet should be allowed to increase the size. Further more, current MTU check in __bpf_skb_max_len uses the MTU from ingress/current net_device, which in case of redirects uses the wrong net_device. Patch V4 keeps a sanity max limit of SKB_MAX_ALLOC (16KiB). The real limit is elsewhere in the system. Jesper's testing[1] showed it was not possible to exceed 8KiB when expanding the SKB size via BPF-helper. The limiting factor is the define KMALLOC_MAX_CACHE_SIZE which is 8192 for SLUB-allocator (CONFIG_SLUB) in-case PAGE_SIZE is 4096. This define is in-effect due to this being called from softirq context see code __gfp_pfmemalloc_flags() and __do_kmalloc_node(). Jakub's testing showed that frames above 16KiB can cause NICs to reset (but not crash). Keep this sanity limit at this level as memory layer can differ based on kernel config. [1] https://github.com/xdp-project/bpf-examples/tree/master/MTU-tests V3: replace __bpf_skb_max_len() with define and use IPv6 max MTU size. Signed-off-by: Jesper Dangaard Brouer --- net/core/filter.c | 12 ++++-------- 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/net/core/filter.c b/net/core/filter.c index 2ca5eecebacf..1ee97fdeea64 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -3552,11 +3552,7 @@ static int bpf_skb_net_shrink(struct sk_buff *skb, u32 off, u32 len_diff, return 0; } -static u32 __bpf_skb_max_len(const struct sk_buff *skb) -{ - return skb->dev ? skb->dev->mtu + skb->dev->hard_header_len : - SKB_MAX_ALLOC; -} +#define BPF_SKB_MAX_LEN SKB_MAX_ALLOC BPF_CALL_4(sk_skb_adjust_room, struct sk_buff *, skb, s32, len_diff, u32, mode, u64, flags) @@ -3605,7 +3601,7 @@ BPF_CALL_4(bpf_skb_adjust_room, struct sk_buff *, skb, s32, len_diff, { u32 len_cur, len_diff_abs = abs(len_diff); u32 len_min = bpf_skb_net_base_len(skb); - u32 len_max = __bpf_skb_max_len(skb); + u32 len_max = BPF_SKB_MAX_LEN; __be16 proto = skb->protocol; bool shrink = len_diff < 0; u32 off; @@ -3688,7 +3684,7 @@ static int bpf_skb_trim_rcsum(struct sk_buff *skb, unsigned int new_len) static inline int __bpf_skb_change_tail(struct sk_buff *skb, u32 new_len, u64 flags) { - u32 max_len = __bpf_skb_max_len(skb); + u32 max_len = BPF_SKB_MAX_LEN; u32 min_len = __bpf_skb_min_len(skb); int ret; @@ -3764,7 +3760,7 @@ static const struct bpf_func_proto sk_skb_change_tail_proto = { static inline int __bpf_skb_change_head(struct sk_buff *skb, u32 head_room, u64 flags) { - u32 max_len = __bpf_skb_max_len(skb); + u32 max_len = BPF_SKB_MAX_LEN; u32 new_len = skb->len + head_room; int ret;