From patchwork Fri Jan 22 14:12:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Greg KH X-Patchwork-Id: 369419 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-18.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 98640C433E6 for ; Fri, 22 Jan 2021 14:23:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6926123A54 for ; Fri, 22 Jan 2021 14:23:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728555AbhAVOWv (ORCPT ); Fri, 22 Jan 2021 09:22:51 -0500 Received: from mail.kernel.org ([198.145.29.99]:40024 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728545AbhAVOWa (ORCPT ); Fri, 22 Jan 2021 09:22:30 -0500 Received: by mail.kernel.org (Postfix) with ESMTPSA id D1AA123AC4; Fri, 22 Jan 2021 14:15:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1611324949; bh=uArgpiAfCjeXqj2bldbMln1PPWrVV37LaS7jzvA4RLI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=YBWAu+0MFcLsgvSgnM07eBgZR4xfBi46LO5Q8vP9CNv0BRL5PCTQA5+kvoCJkBzq1 wEECO+FxE97+aEfKzJUAcbE95gqOucbo6tKXqg+EXaPHC0WEh7qBKZ1p8EvAMqoSl+ cCViIsE4/+4NYfuMEI+Oiju8DlZTFnuD2g1DMrww= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Eric Dumazet , Paolo Abeni , Greg Thelen , Alexander Duyck , "Michael S. Tsirkin" , Jakub Kicinski Subject: [PATCH 4.19 16/22] net: avoid 32 x truesize under-estimation for tiny skbs Date: Fri, 22 Jan 2021 15:12:34 +0100 Message-Id: <20210122135732.555755452@linuxfoundation.org> X-Mailer: git-send-email 2.30.0 In-Reply-To: <20210122135731.921636245@linuxfoundation.org> References: <20210122135731.921636245@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Eric Dumazet [ Upstream commit 3226b158e67cfaa677fd180152bfb28989cb2fac ] Both virtio net and napi_get_frags() allocate skbs with a very small skb->head While using page fragments instead of a kmalloc backed skb->head might give a small performance improvement in some cases, there is a huge risk of under estimating memory usage. For both GOOD_COPY_LEN and GRO_MAX_HEAD, we can fit at least 32 allocations per page (order-3 page in x86), or even 64 on PowerPC We have been tracking OOM issues on GKE hosts hitting tcp_mem limits but consuming far more memory for TCP buffers than instructed in tcp_mem[2] Even if we force napi_alloc_skb() to only use order-0 pages, the issue would still be there on arches with PAGE_SIZE >= 32768 This patch makes sure that small skb head are kmalloc backed, so that other objects in the slab page can be reused instead of being held as long as skbs are sitting in socket queues. Note that we might in the future use the sk_buff napi cache, instead of going through a more expensive __alloc_skb() Another idea would be to use separate page sizes depending on the allocated length (to never have more than 4 frags per page) I would like to thank Greg Thelen for his precious help on this matter, analysing crash dumps is always a time consuming task. Fixes: fd11a83dd363 ("net: Pull out core bits of __netdev_alloc_skb and add __napi_alloc_skb") Signed-off-by: Eric Dumazet Cc: Paolo Abeni Cc: Greg Thelen Reviewed-by: Alexander Duyck Acked-by: Michael S. Tsirkin Link: https://lore.kernel.org/r/20210113161819.1155526-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski Signed-off-by: Greg Kroah-Hartman --- net/core/skbuff.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -459,13 +459,17 @@ EXPORT_SYMBOL(__netdev_alloc_skb); struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len, gfp_t gfp_mask) { - struct napi_alloc_cache *nc = this_cpu_ptr(&napi_alloc_cache); + struct napi_alloc_cache *nc; struct sk_buff *skb; void *data; len += NET_SKB_PAD + NET_IP_ALIGN; - if ((len > SKB_WITH_OVERHEAD(PAGE_SIZE)) || + /* If requested length is either too small or too big, + * we use kmalloc() for skb->head allocation. + */ + if (len <= SKB_WITH_OVERHEAD(1024) || + len > SKB_WITH_OVERHEAD(PAGE_SIZE) || (gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA))) { skb = __alloc_skb(len, gfp_mask, SKB_ALLOC_RX, NUMA_NO_NODE); if (!skb) @@ -473,6 +477,7 @@ struct sk_buff *__napi_alloc_skb(struct goto skb_success; } + nc = this_cpu_ptr(&napi_alloc_cache); len += SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); len = SKB_DATA_ALIGN(len);