From patchwork Thu Jan 7 13:50:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Aya Levin X-Patchwork-Id: 358710 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, UNPARSEABLE_RELAY, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8B29CC433DB for ; Thu, 7 Jan 2021 13:51:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3D47822AB9 for ; Thu, 7 Jan 2021 13:51:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728450AbhAGNvx (ORCPT ); Thu, 7 Jan 2021 08:51:53 -0500 Received: from mail-il-dmz.mellanox.com ([193.47.165.129]:43893 "EHLO mellanox.co.il" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726854AbhAGNvx (ORCPT ); Thu, 7 Jan 2021 08:51:53 -0500 Received: from Internal Mail-Server by MTLPINE1 (envelope-from ayal@mellanox.com) with SMTP; 7 Jan 2021 15:50:58 +0200 Received: from dev-l-vrt-210.mtl.labs.mlnx (dev-l-vrt-210.mtl.labs.mlnx [10.234.210.1]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 107DowbB002543; Thu, 7 Jan 2021 15:50:58 +0200 Received: from dev-l-vrt-210.mtl.labs.mlnx (localhost [127.0.0.1]) by dev-l-vrt-210.mtl.labs.mlnx (8.15.2/8.15.2/Debian-8ubuntu1) with ESMTP id 107DowTB030489; Thu, 7 Jan 2021 15:50:58 +0200 Received: (from ayal@localhost) by dev-l-vrt-210.mtl.labs.mlnx (8.15.2/8.15.2/Submit) id 107DogfQ030482; Thu, 7 Jan 2021 15:50:42 +0200 From: Aya Levin To: "David S. Miller" , Jakub Kicinski Cc: Daniel Axtens , netdev@vger.kernel.org, Moshe Shemesh , Tariq Toukan , Dan Carpenter , jengelh@medozas.de, kaber@trash.net, kuznet@ms2.inr.ac.ru, yoshfuji@linux-ipv6.org, Aya Levin Subject: [PATCH net V3] net: ipv6: Validate GSO SKB before finish IPv6 processing Date: Thu, 7 Jan 2021 15:50:18 +0200 Message-Id: <1610027418-30438-1-git-send-email-ayal@nvidia.com> X-Mailer: git-send-email 1.8.4.3 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org There are cases where GSO segment's length exceeds the egress MTU: - Forwarding of a TCP GRO skb, when DF flag is not set. - Forwarding of an skb that arrived on a virtualisation interface (virtio-net/vhost/tap) with TSO/GSO size set by other network stack. - Local GSO skb transmitted on an NETIF_F_TSO tunnel stacked over an interface with a smaller MTU. - Arriving GRO skb (or GSO skb in a virtualised environment) that is bridged to a NETIF_F_TSO tunnel stacked over an interface with an insufficient MTU. If so: - Consume the SKB and its segments. - Issue an ICMP packet with 'Packet Too Big' message containing the MTU, allowing the source host to reduce its Path MTU appropriately. Note: These cases are handled in the same manner in IPv4 output finish. This patch aligns the behavior of IPv6 and the one of IPv4. Fixes: 9e50849054a4 ("netfilter: ipv6: move POSTROUTING invocation before fragmentation") Signed-off-by: Aya Levin Reviewed-by: Tariq Toukan --- net/ipv6/ip6_output.c | 41 ++++++++++++++++++++++++++++++++++++++++- 1 file changed, 40 insertions(+), 1 deletion(-) Hi, Please queue to -stable >= v2.6.35. Thanks, Aya v2: Addressed error: uninitialized symbol 'ret', reported by kernel test robot / Dan Carpenter . v3: Per Jakub's request, added further details in commit message, and extended the CCed list. diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index 749ad72386b2..077d43af8226 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -125,8 +125,43 @@ static int ip6_finish_output2(struct net *net, struct sock *sk, struct sk_buff * return -EINVAL; } +static int +ip6_finish_output_gso_slowpath_drop(struct net *net, struct sock *sk, + struct sk_buff *skb, unsigned int mtu) +{ + struct sk_buff *segs, *nskb; + netdev_features_t features; + int ret = 0; + + /* Please see corresponding comment in ip_finish_output_gso + * describing the cases where GSO segment length exceeds the + * egress MTU. + */ + features = netif_skb_features(skb); + segs = skb_gso_segment(skb, features & ~NETIF_F_GSO_MASK); + if (IS_ERR_OR_NULL(segs)) { + kfree_skb(skb); + return -ENOMEM; + } + + consume_skb(skb); + + skb_list_walk_safe(segs, segs, nskb) { + int err; + + skb_mark_not_on_list(segs); + err = ip6_fragment(net, sk, segs, ip6_finish_output2); + if (err && ret == 0) + ret = err; + } + + return ret; +} + static int __ip6_finish_output(struct net *net, struct sock *sk, struct sk_buff *skb) { + unsigned int mtu; + #if defined(CONFIG_NETFILTER) && defined(CONFIG_XFRM) /* Policy lookup after SNAT yielded a new policy */ if (skb_dst(skb)->xfrm) { @@ -135,7 +170,11 @@ static int __ip6_finish_output(struct net *net, struct sock *sk, struct sk_buff } #endif - if ((skb->len > ip6_skb_dst_mtu(skb) && !skb_is_gso(skb)) || + mtu = ip6_skb_dst_mtu(skb); + if (skb_is_gso(skb) && !skb_gso_validate_network_len(skb, mtu)) + return ip6_finish_output_gso_slowpath_drop(net, sk, skb, mtu); + + if ((skb->len > mtu && !skb_is_gso(skb)) || dst_allfrag(skb_dst(skb)) || (IP6CB(skb)->frag_max_size && skb->len > IP6CB(skb)->frag_max_size)) return ip6_fragment(net, sk, skb, ip6_finish_output2);