From patchwork Mon Apr 7 14:21:20 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiayuan Chen X-Patchwork-Id: 878891 Received: from out-179.mta0.migadu.com (out-179.mta0.migadu.com [91.218.175.179]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3C38D191461 for ; Mon, 7 Apr 2025 14:23:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=91.218.175.179 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744035825; cv=none; b=e9VPJWEes0CD/NwnAWq0I3jJagmEzL7yB1UkbKBacgvljWjpoyo3pG1975GDZtdcGEr8htl41xpnEfxwnMCfwyPytemD/15kA07e3INL7JXfAHUMIKqR1D4PVKm0exKldcST4IUp6nrlr6OyGbPxyVgb+WZPJOOnAXrJ8wKSkmE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744035825; c=relaxed/simple; bh=GrUS7nmfi1SY0aJ/MVjihRp1EO08eS93vornewj9Onk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=f6Pchucb5Aspj0Z3DZf+sq2+EX/Zrf6Nx/fplrWJpa8IfpZ0+BIKt/cIvBEyL0H/QQgoesAK5P0yqwdsw8CA9BHV2BC1bp9jxuh/qjZvzvlRHutiDxbbuYDCCKgUuXKxQ5q+THH/YESvWvNI56g8Ll8eX/1TwmGMryBSrTi0lTU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev; spf=pass smtp.mailfrom=linux.dev; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b=qKl2Sqwn; arc=none smtp.client-ip=91.218.175.179 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.dev Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linux.dev Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=linux.dev header.i=@linux.dev header.b="qKl2Sqwn" X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1744035819; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=RqXWt/QORD9OH3iap+B7ewziZ+4G9b9oNDtYRsxkesI=; b=qKl2Sqwn7u0DClCEbYAfaFyB2yP5VQGVJ4Hp9zbys19NNYR+vCYztGag0xgO6LIGsCZ8Q6 2rsACaScnW0iFkzfhNW+M0d86b3WsEHozJ4jErCeUsf2KnmBbX3z7dvJOCSA0Dx2TLMPRz EyYt5RHn7giksGXVNk/lMc1kXqJ+WKY= From: Jiayuan Chen To: bpf@vger.kernel.org Cc: mrpre@163.com, Jiayuan Chen , John Fastabend , Jakub Sitnicki , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Simon Horman , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Martin KaFai Lau , Eduard Zingerman , Song Liu , Yonghong Song , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , Mykola Lysenko , Shuah Khan , linux-kernel@vger.kernel.org, netdev@vger.kernel.org, linux-kselftest@vger.kernel.org Subject: [PATCH bpf-next v1 1/4] bpf, sockmap: Fix data lost during EAGAIN retries Date: Mon, 7 Apr 2025 22:21:20 +0800 Message-ID: <20250407142234.47591-2-jiayuan.chen@linux.dev> In-Reply-To: <20250407142234.47591-1-jiayuan.chen@linux.dev> References: <20250407142234.47591-1-jiayuan.chen@linux.dev> Precedence: bulk X-Mailing-List: linux-kselftest@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT We call skb_bpf_redirect_clear() to clean _sk_redir before handling skb in backlog, but when sk_psock_handle_skb() return EAGAIN due to sk_rcvbuf limit, the redirect info in _sk_redir is not recovered. Fix skb redir loss during EAGAIN retries by restoring _sk_redir information using skb_bpf_set_redir(). Before this patch: ''' ./bench sockmap -c 2 -p 1 -a --rx-verdict-ingress Setting up benchmark 'sockmap'... create socket fd c1:13 p1:14 c2:15 p2:16 Benchmark 'sockmap' started. Send Speed 1343.172 MB/s, BPF Speed 1343.238 MB/s, Rcv Speed 65.271 MB/s Send Speed 1352.022 MB/s, BPF Speed 1352.088 MB/s, Rcv Speed 0 MB/s Send Speed 1354.105 MB/s, BPF Speed 1354.105 MB/s, Rcv Speed 0 MB/s Send Speed 1355.018 MB/s, BPF Speed 1354.887 MB/s, Rcv Speed 0 MB/s ''' Due to the high send rate, the RX processing path may frequently hit the sk_rcvbuf limit. Once triggered, incorrect _sk_redir will cause the flow to mistakenly enter the "!ingress" path, leading to send failures. (The Rcv speed depends on tcp_rmem). After this patch: ''' ./bench sockmap -c 2 -p 1 -a --rx-verdict-ingress Setting up benchmark 'sockmap'... create socket fd c1:13 p1:14 c2:15 p2:16 Benchmark 'sockmap' started. Send Speed 1347.236 MB/s, BPF Speed 1347.367 MB/s, Rcv Speed 65.402 MB/s Send Speed 1353.320 MB/s, BPF Speed 1353.320 MB/s, Rcv Speed 65.536 MB/s Send Speed 1353.186 MB/s, BPF Speed 1353.121 MB/s, Rcv Speed 65.536 MB/s ''' Signed-off-by: Jiayuan Chen --- net/core/skmsg.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/net/core/skmsg.c b/net/core/skmsg.c index 0ddc4c718833..29cb5ffd56c0 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -680,7 +680,8 @@ static void sk_psock_backlog(struct work_struct *work) if (ret <= 0) { if (ret == -EAGAIN) { sk_psock_skb_state(psock, state, len, off); - + /* Restore redir info we cleared before */ + skb_bpf_set_redir(skb, psock->sk, ingress); /* Delay slightly to prioritize any * other work that might be here. */