From patchwork Mon Apr 14 13:13:01 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Chia-Yu Chang \(Nokia\)"
X-Patchwork-Id: 881146
From: chia-yu.chang@nokia-bell-labs.com
To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com,
 jhs@mojatatu.com, kuba@kernel.org, stephen@networkplumber.org,
 xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net,
 edumazet@google.com, horms@kernel.org, andrew+netdev@lunn.ch,
 donald.hunter@gmail.com, ast@fiberby.net, liuhangbin@gmail.com,
 shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org,
 ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com,
 g.white@cablelabs.com, ingemar.s.johansson@ericsson.com,
 mirja.kuehlewind@ericsson.com, cheshire@apple.com, rs.ietf@gmx.at,
 Jason_Livingood@comcast.com, vidhi_goel@apple.com
Cc: Chia-Yu Chang
Subject: [PATCH v3 net-next 01/15] tcp: reorganize SYN ECN code
Date: Mon, 14 Apr 2025 15:13:01 +0200
Message-Id: <20250414131315.97456-2-chia-yu.chang@nokia-bell-labs.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250414131315.97456-1-chia-yu.chang@nokia-bell-labs.com>
References: <20250414131315.97456-1-chia-yu.chang@nokia-bell-labs.com>
Precedence: bulk
X-Mailing-List: linux-kselftest@vger.kernel.org

From: Ilpo Järvinen

Prepare for AccECN, which needs access here to the IP ECN field value;
that value is only available after INET_ECN_xmit().

No functional changes.

Signed-off-by: Ilpo Järvinen
Signed-off-by: Chia-Yu Chang
Reviewed-by: Eric Dumazet
---
 net/ipv4/tcp_output.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 13295a59d22e..9a1ab946ff62 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -350,10 +350,11 @@ static void tcp_ecn_send_syn(struct sock *sk, struct sk_buff *skb)
 	tp->ecn_flags = 0;

 	if (use_ecn) {
-		TCP_SKB_CB(skb)->tcp_flags |= TCPHDR_ECE | TCPHDR_CWR;
-		tcp_ecn_mode_set(tp, TCP_ECN_MODE_RFC3168);
 		if (tcp_ca_needs_ecn(sk) || bpf_needs_ecn)
 			INET_ECN_xmit(sk);
+
+		TCP_SKB_CB(skb)->tcp_flags |= TCPHDR_ECE | TCPHDR_CWR;
+		tcp_ecn_mode_set(tp, TCP_ECN_MODE_RFC3168);
 	}
 }

From patchwork Mon Apr 14 13:13:04 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Chia-Yu Chang \(Nokia\)"
X-Patchwork-Id: 881144
From: chia-yu.chang@nokia-bell-labs.com
To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com,
 jhs@mojatatu.com, kuba@kernel.org, stephen@networkplumber.org,
 xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net,
 edumazet@google.com, horms@kernel.org, andrew+netdev@lunn.ch,
 donald.hunter@gmail.com, ast@fiberby.net, liuhangbin@gmail.com,
 shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org,
 ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com,
 g.white@cablelabs.com, ingemar.s.johansson@ericsson.com,
 mirja.kuehlewind@ericsson.com, cheshire@apple.com, rs.ietf@gmx.at,
 Jason_Livingood@comcast.com, vidhi_goel@apple.com
Cc: Olivier Tilmans, Chia-Yu Chang
Subject: [PATCH v3 net-next 04/15] tcp: accecn: AccECN negotiation
Date: Mon, 14 Apr 2025 15:13:04 +0200
Message-Id: <20250414131315.97456-5-chia-yu.chang@nokia-bell-labs.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250414131315.97456-1-chia-yu.chang@nokia-bell-labs.com>
References: <20250414131315.97456-1-chia-yu.chang@nokia-bell-labs.com>
Precedence: bulk
X-Mailing-List: linux-kselftest@vger.kernel.org

From: Ilpo Järvinen

Accurate ECN negotiation parts based on the specification:
  https://tools.ietf.org/id/draft-ietf-tcpm-accurate-ecn-28.txt

Accurate ECN is negotiated using the ECE, CWR and AE flags in the
TCP header. TCP falls back to RFC3168 ECN if one of the ends
supports only RFC3168-style ECN.

The AccECN negotiation includes reflecting the IP ECN field value
seen in the SYN and SYNACK back to the sender, using the same bits
as the negotiation, to allow responding to SYN CE marks and to
detect ECN field mangling. CE marks should not occur currently
because SYN=1 segments are sent with Non-ECT in the IP ECN field
(but a proposal exists to remove this restriction).

Reflecting the SYN IP ECN field in the SYNACK is relatively simple.
Reflecting the SYNACK IP ECN field in the final/third ACK of the
handshake is more challenging.
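For illustration only (not part of the patch): a minimal userspace sketch of the ECN reflection described above. It assumes the ACE encoding used by tcp_accecn_reflector_flags() and tcp_accecn_extract_syn_ect() in the diff of this patch, and the transition rule of tcp_ect_transition_valid(). All accecn_* names and the enum are hypothetical stand-ins; the kernel helper additionally shifts the value into the TCP flag bits with FIELD_PREP().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* IP ECN codepoints; assumed to match the kernel's INET_ECN_* values. */
enum ecn_codepoint {
	ECN_NOT_ECT = 0,
	ECN_ECT_1   = 1,
	ECN_ECT_0   = 2,
	ECN_CE      = 3,
};

/* Hypothetical helper: raw 3-bit ACE value (AE:CWR:ECE) that reflects the
 * ECN codepoint seen on the peer's SYN or SYNACK.
 */
uint8_t accecn_reflect_ace(uint8_t ect)
{
	/* Not-ECT -> 0b010, ECT(1) -> 0b011, ECT(0) -> 0b100 */
	uint8_t ace = (uint8_t)(ect + 2);

	if (ect == ECN_CE)
		ace++;		/* CE -> 0b110; 0b101 is not used here */
	return ace;
}

/* Hypothetical helper: inverse mapping, from an echoed ACE value back to
 * the ECN codepoint the peer says it received.
 */
uint8_t accecn_ace_to_ect(uint8_t ace)
{
	if (ace & 0x1)
		return ECN_ECT_1;
	if (!(ace & 0x2))
		return ECN_ECT_0;
	if (ace & 0x4)
		return ECN_CE;
	return ECN_NOT_ECT;
}

/* Hypothetical helper: is "sent as snt, received as rcv" a change the
 * network may legitimately make? Only ECT(0)/ECT(1) may change en route.
 */
bool accecn_transition_valid(uint8_t snt, uint8_t rcv)
{
	if (rcv == snt)
		return true;
	/* Not-ECT altered to something, or something became Not-ECT */
	if (snt == ECN_NOT_ECT || rcv == ECN_NOT_ECT)
		return false;
	/* CE must never revert to ECT(0)/ECT(1) */
	if (snt == ECN_CE)
		return false;
	return true;
}

/* Every codepoint must survive the reflect/extract round trip. */
void accecn_roundtrip_check(void)
{
	for (uint8_t ect = ECN_NOT_ECT; ect <= ECN_CE; ect++)
		assert(accecn_ace_to_ect(accecn_reflect_ace(ect)) == ect);
}
```

Calling accecn_roundtrip_check() from a test harness exercises all four codepoints. Keeping the mapping as plain arithmetic, as the patch does, avoids a lookup table for what is only a four-entry encoding.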
Linux TCP code is not well prepared for using the final/third ACK as
a signalling channel, which makes things somewhat complicated here.

Co-developed-by: Olivier Tilmans
Signed-off-by: Olivier Tilmans
Signed-off-by: Ilpo Järvinen
Co-developed-by: Chia-Yu Chang
Signed-off-by: Chia-Yu Chang
---
 include/linux/tcp.h        |   9 ++-
 include/net/tcp.h          |  80 ++++++++++++++++++-
 net/ipv4/syncookies.c      |   3 +
 net/ipv4/sysctl_net_ipv4.c |   3 +-
 net/ipv4/tcp.c             |   2 +
 net/ipv4/tcp_input.c       | 155 +++++++++++++++++++++++++++++++++----
 net/ipv4/tcp_ipv4.c        |   3 +-
 net/ipv4/tcp_minisocks.c   |  51 ++++++++++--
 net/ipv4/tcp_output.c      |  78 +++++++++++++++----
 net/ipv6/syncookies.c      |   1 +
 net/ipv6/tcp_ipv6.c        |   1 +
 11 files changed, 343 insertions(+), 43 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index e36018203bd0..af38fff24aa4 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -156,6 +156,10 @@ struct tcp_request_sock {
 #if IS_ENABLED(CONFIG_MPTCP)
 	bool				drop_req;
 #endif
+	u8				accecn_ok  : 1,
+					syn_ect_snt: 2,
+					syn_ect_rcv: 2;
+	u8				accecn_fail_mode:4;
 	u32				txhash;
 	u32				rcv_isn;
 	u32				snt_isn;
@@ -376,7 +380,10 @@ struct tcp_sock {
 	u8	compressed_ack;
 	u8	dup_ack_counter:2,
 		tlp_retrans:1,	/* TLP is a retransmission */
-		unused:5;
+		syn_ect_snt:2,	/* AccECN ECT memory, only */
+		syn_ect_rcv:2,	/* ... needed during 3WHS + first seqno */
+		wait_third_ack:1; /* Wait 3rd ACK in simultaneous open */
+	u8	accecn_fail_mode:4;	/* AccECN failure handling */
 	u8	thin_lto    : 1,/* Use linear timeouts for thin streams */
 		fastopen_connect:1, /* FASTOPEN_CONNECT sockopt */
 		fastopen_no_cookie:1, /* Allow send/recv SYN+data without a cookie */
diff --git a/include/net/tcp.h b/include/net/tcp.h
index cc28255deef7..f36a1a3d538f 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -27,6 +27,7 @@
 #include
 #include
 #include
+#include
 #include
 #include

@@ -234,6 +235,37 @@ static_assert((1 << ATO_BITS) > TCP_DELACK_MAX);
 #define TCPOLEN_MSS_ALIGNED		4
 #define TCPOLEN_EXP_SMC_BASE_ALIGNED	8

+/* tp->accecn_fail_mode */
+#define TCP_ACCECN_ACE_FAIL_SEND	BIT(0)
+#define TCP_ACCECN_ACE_FAIL_RECV	BIT(1)
+#define TCP_ACCECN_OPT_FAIL_SEND	BIT(2)
+#define TCP_ACCECN_OPT_FAIL_RECV	BIT(3)
+
+static inline bool tcp_accecn_ace_fail_send(const struct tcp_sock *tp)
+{
+	return tp->accecn_fail_mode & TCP_ACCECN_ACE_FAIL_SEND;
+}
+
+static inline bool tcp_accecn_ace_fail_recv(const struct tcp_sock *tp)
+{
+	return tp->accecn_fail_mode & TCP_ACCECN_ACE_FAIL_RECV;
+}
+
+static inline bool tcp_accecn_opt_fail_send(const struct tcp_sock *tp)
+{
+	return tp->accecn_fail_mode & TCP_ACCECN_OPT_FAIL_SEND;
+}
+
+static inline bool tcp_accecn_opt_fail_recv(const struct tcp_sock *tp)
+{
+	return tp->accecn_fail_mode & TCP_ACCECN_OPT_FAIL_RECV;
+}
+
+static inline void tcp_accecn_fail_mode_set(struct tcp_sock *tp, u8 mode)
+{
+	tp->accecn_fail_mode |= mode;
+}
+
 /* Flags in tp->nonagle */
 #define TCP_NAGLE_OFF		1	/* Nagle's algo is disabled */
 #define TCP_NAGLE_CORK		2	/* Socket is corked */
@@ -420,6 +452,23 @@ static inline u8 tcp_accecn_ace(const struct tcphdr *th)
 	return (th->ae << 2) | (th->cwr << 1) | th->ece;
 }

+/* Infer the ECT value our SYN arrived with from the echoed ACE field */
+static inline int tcp_accecn_extract_syn_ect(u8 ace)
+{
+	if (ace & 0x1)
+		return INET_ECN_ECT_1;
+	if (!(ace & 0x2))
+		return INET_ECN_ECT_0;
+	if (ace & 0x4)
+		return INET_ECN_CE;
+	return INET_ECN_NOT_ECT;
+}
+
+bool tcp_accecn_validate_syn_feedback(struct sock *sk, u8 ace, u8 sent_ect);
+void tcp_accecn_third_ack(struct sock *sk, const struct sk_buff *skb,
+			  u8 syn_ect_snt);
+void tcp_ecn_received_counters(struct sock *sk, const struct sk_buff *skb);
+
 enum tcp_tw_status {
 	TCP_TW_SUCCESS = 0,
 	TCP_TW_RST = 1,
@@ -657,6 +706,15 @@ static inline bool cookie_ecn_ok(const struct net *net, const struct dst_entry *
 		dst_feature(dst, RTAX_FEATURE_ECN);
 }

+/* AccECN specification, 5.1: [...] a server can determine that it
+ * negotiated AccECN as [...] if the ACK contains an ACE field with
+ * the value 0b010 to 0b111 (decimal 2 to 7).
+ */
+static inline bool cookie_accecn_ok(const struct tcphdr *th)
+{
+	return tcp_accecn_ace(th) > 0x1;
+}
+
 #if IS_ENABLED(CONFIG_BPF)
 static inline bool cookie_bpf_ok(struct sk_buff *skb)
 {
@@ -968,6 +1026,7 @@ static inline u32 tcp_rsk_tsval(const struct tcp_request_sock *treq)

 #define TCPHDR_ACE (TCPHDR_ECE | TCPHDR_CWR | TCPHDR_AE)
 #define TCPHDR_SYN_ECN	(TCPHDR_SYN | TCPHDR_ECE | TCPHDR_CWR)
+#define TCPHDR_SYNACK_ACCECN (TCPHDR_SYN | TCPHDR_ACK | TCPHDR_CWR)

 #define TCP_ACCECN_CEP_ACE_MASK 0x7
 #define TCP_ACCECN_ACE_MAX_DELTA 6
@@ -1051,6 +1110,15 @@ struct tcp_skb_cb {

 #define TCP_SKB_CB(__skb)	((struct tcp_skb_cb *)&((__skb)->cb[0]))

+static inline u16 tcp_accecn_reflector_flags(u8 ect)
+{
+	u32 flags = ect + 2;
+
+	if (ect == 3)
+		flags++;
+	return FIELD_PREP(TCPHDR_ACE, flags);
+}
+
 extern const struct inet_connection_sock_af_ops ipv4_specific;

 #if IS_ENABLED(CONFIG_IPV6)
@@ -1173,7 +1241,10 @@ enum tcp_ca_ack_event_flags {
 #define TCP_CONG_NON_RESTRICTED		BIT(0)
 /* Requires ECN/ECT set on all packets */
 #define TCP_CONG_NEEDS_ECN		BIT(1)
-#define TCP_CONG_MASK	(TCP_CONG_NON_RESTRICTED | TCP_CONG_NEEDS_ECN)
+/* Require successfully negotiated AccECN capability */
+#define TCP_CONG_NEEDS_ACCECN		BIT(2)
+#define TCP_CONG_MASK  (TCP_CONG_NON_RESTRICTED | TCP_CONG_NEEDS_ECN | \
+			TCP_CONG_NEEDS_ACCECN)

 union tcp_cc_info;

@@ -1305,6 +1376,13 @@ static inline bool tcp_ca_needs_ecn(const struct sock *sk)
 	return icsk->icsk_ca_ops->flags & TCP_CONG_NEEDS_ECN;
 }

+static inline bool tcp_ca_needs_accecn(const struct sock *sk)
+{
+	const struct inet_connection_sock *icsk = inet_csk(sk);
+
+	return icsk->icsk_ca_ops->flags & TCP_CONG_NEEDS_ACCECN;
+}
+
 static inline void tcp_ca_event(struct sock *sk, const enum tcp_ca_event event)
 {
 	const struct inet_connection_sock *icsk = inet_csk(sk);
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 5459a78b9809..3a44eb9c1d1a 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -403,6 +403,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb)
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct inet_request_sock *ireq;
 	struct net *net = sock_net(sk);
+	struct tcp_request_sock *treq;
 	struct request_sock *req;
 	struct sock *ret = sk;
 	struct flowi4 fl4;
@@ -428,6 +429,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb)
 	}

 	ireq = inet_rsk(req);
+	treq = tcp_rsk(req);

 	sk_rcv_saddr_set(req_to_sk(req), ip_hdr(skb)->daddr);
 	sk_daddr_set(req_to_sk(req), ip_hdr(skb)->saddr);
@@ -482,6 +484,7 @@ struct sock *cookie_v4_check(struct sock *sk, struct sk_buff *skb)
 	if (!req->syncookie)
 		ireq->rcv_wscale = rcv_wscale;
 	ireq->ecn_ok &= cookie_ecn_ok(net, &rt->dst);
+	treq->accecn_ok = ireq->ecn_ok && cookie_accecn_ok(th);

 	ret = tcp_get_cookie_sock(sk, skb, req, &rt->dst);
 	/* ip_queue_xmit() depends on our flow being setup
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 3a43010d726f..75ec1a599b52 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -47,6 +47,7 @@ static unsigned int udp_child_hash_entries_max = UDP_HTABLE_SIZE_MAX;
 static int tcp_plb_max_rounds = 31;
 static int tcp_plb_max_cong_thresh = 256;
 static unsigned int tcp_tw_reuse_delay_max = TCP_PAWS_MSL * MSEC_PER_SEC;
+static int tcp_ecn_mode_max = 5;

 /* obsolete */
 static int sysctl_tcp_low_latency __read_mostly;
@@ -728,7 +729,7 @@ static struct ctl_table ipv4_net_table[] = {
 		.mode		= 0644,
 		.proc_handler	= proc_dou8vec_minmax,
 		.extra1		= SYSCTL_ZERO,
-		.extra2		= SYSCTL_TWO,
+		.extra2		= &tcp_ecn_mode_max,
 	},
 	{
 		.procname	= "tcp_ecn_fallback",
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 372c58170f4c..73f8cc715bff 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3364,6 +3364,8 @@ int tcp_disconnect(struct sock *sk, int flags)
 	tp->window_clamp = 0;
 	tp->delivered = 0;
 	tp->delivered_ce = 0;
+	tp->wait_third_ack = 0;
+	tp->accecn_fail_mode = 0;
 	tcp_accecn_init_counters(tp);
 	if (icsk->icsk_ca_initialized && icsk->icsk_ca_ops->release)
 		icsk->icsk_ca_ops->release(sk);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 8dbb625f5e8a..cc34664805f8 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -401,14 +401,93 @@ static void tcp_data_ecn_check(struct sock *sk, const struct sk_buff *skb)
 	}
 }

-static void tcp_ecn_rcv_synack(struct tcp_sock *tp, const struct tcphdr *th)
+/* AccECN specification, 3.1.2: If a TCP server that implements AccECN
+ * receives a SYN with the three TCP header flags (AE, CWR and ECE) set
+ * to any combination other than 000, 011 or 111, it MUST negotiate the
+ * use of AccECN as if they had been set to 111.
+ */
+static bool tcp_accecn_syn_requested(const struct tcphdr *th)
+{
+	u8 ace = tcp_accecn_ace(th);
+
+	return ace && ace != 0x3;
+}
+
+/* Check ECN field transition to detect invalid transitions */
+static bool tcp_ect_transition_valid(u8 snt, u8 rcv)
+{
+	if (rcv == snt)
+		return true;
+
+	/* Non-ECT altered to something or something became non-ECT */
+	if (snt == INET_ECN_NOT_ECT || rcv == INET_ECN_NOT_ECT)
+		return false;
+	/* CE -> ECT(0/1)? */
+	if (snt == INET_ECN_CE)
+		return false;
+	return true;
+}
+
+bool tcp_accecn_validate_syn_feedback(struct sock *sk, u8 ace, u8 sent_ect)
 {
-	if (tcp_ecn_mode_rfc3168(tp) && (!th->ece || th->cwr))
+	u8 ect = tcp_accecn_extract_syn_ect(ace);
+	struct tcp_sock *tp = tcp_sk(sk);
+
+	if (!sock_net(sk)->ipv4.sysctl_tcp_ecn_fallback)
+		return true;
+
+	if (!tcp_ect_transition_valid(sent_ect, ect)) {
+		tcp_accecn_fail_mode_set(tp, TCP_ACCECN_ACE_FAIL_RECV);
+		return false;
+	}
+
+	return true;
+}
+
+/* See Table 2 of the AccECN draft */
+static void tcp_ecn_rcv_synack(struct sock *sk, const struct tcphdr *th,
+			       u8 ip_dsfield)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+	u8 ace = tcp_accecn_ace(th);
+
+	switch (ace) {
+	case 0x0:
+	case 0x7:
 		tcp_ecn_mode_set(tp, TCP_ECN_DISABLED);
+		break;
+	case 0x1:
+	case 0x5:
+		if (tcp_ecn_mode_pending(tp))
+			/* Downgrade from AccECN, or requested initially */
+			tcp_ecn_mode_set(tp, TCP_ECN_MODE_RFC3168);
+		break;
+	default:
+		tcp_ecn_mode_set(tp, TCP_ECN_MODE_ACCECN);
+		tp->syn_ect_rcv = ip_dsfield & INET_ECN_MASK;
+		if (INET_ECN_is_ce(ip_dsfield) &&
+		    tcp_accecn_validate_syn_feedback(sk, ace,
+						     tp->syn_ect_snt)) {
+			tp->received_ce++;
+			tp->received_ce_pending++;
+		}
+		break;
+	}
 }

-static void tcp_ecn_rcv_syn(struct tcp_sock *tp, const struct tcphdr *th)
+static void tcp_ecn_rcv_syn(struct tcp_sock *tp, const struct tcphdr *th,
+			    const struct sk_buff *skb)
 {
+	if (tcp_ecn_mode_pending(tp)) {
+		if (!tcp_accecn_syn_requested(th)) {
+			/* Downgrade to classic ECN feedback */
+			tcp_ecn_mode_set(tp, TCP_ECN_MODE_RFC3168);
+		} else {
+			tp->syn_ect_rcv = TCP_SKB_CB(skb)->ip_dsfield &
+					  INET_ECN_MASK;
+			tcp_ecn_mode_set(tp, TCP_ECN_MODE_ACCECN);
+		}
+	}
 	if (tcp_ecn_mode_rfc3168(tp) && (!th->ece || !th->cwr))
 		tcp_ecn_mode_set(tp, TCP_ECN_DISABLED);
 }
@@ -3834,7 +3913,7 @@ bool tcp_oow_rate_limited(struct net *net, const struct sk_buff *skb,
 }

 /* RFC 5961 7 [ACK Throttling] */
-static void tcp_send_challenge_ack(struct sock *sk)
+static void tcp_send_challenge_ack(struct sock *sk, bool accecn_reflector)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct net *net = sock_net(sk);
@@ -3864,7 +3943,9 @@ static void tcp_send_challenge_ack(struct sock *sk)
 		WRITE_ONCE(net->ipv4.tcp_challenge_count, count - 1);
 send_ack:
 		NET_INC_STATS(net, LINUX_MIB_TCPCHALLENGEACK);
-		tcp_send_ack(sk);
+		__tcp_send_ack(sk, tp->rcv_nxt,
+			       !accecn_reflector ? 0 :
+			       tcp_accecn_reflector_flags(tp->syn_ect_rcv));
 	}
 }

@@ -4031,7 +4112,7 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
 		/* RFC 5961 5.2 [Blind Data Injection Attack].[Mitigation] */
 		if (before(ack, prior_snd_una - max_window)) {
 			if (!(flag & FLAG_NO_CHALLENGE_ACK))
-				tcp_send_challenge_ack(sk);
+				tcp_send_challenge_ack(sk, false);
 			return -SKB_DROP_REASON_TCP_TOO_OLD_ACK;
 		}
 		goto old_ack;
@@ -6025,8 +6106,7 @@ static void tcp_urg(struct sock *sk, struct sk_buff *skb, const struct tcphdr *t
 }

 /* Updates Accurate ECN received counters from the received IP ECN field */
-static void tcp_ecn_received_counters(struct sock *sk,
-				      const struct sk_buff *skb)
+void tcp_ecn_received_counters(struct sock *sk, const struct sk_buff *skb)
 {
 	u8 ecnfield = TCP_SKB_CB(skb)->ip_dsfield & INET_ECN_MASK;
 	u8 is_ce = INET_ECN_is_ce(ecnfield);
@@ -6067,6 +6147,7 @@ static bool tcp_reset_check(const struct sock *sk, const struct sk_buff *skb)
 static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb,
 				  const struct tcphdr *th, int syn_inerr)
 {
+	bool send_accecn_reflector = false;
 	struct tcp_sock *tp = tcp_sk(sk);
 	SKB_DR(reason);

@@ -6160,7 +6241,7 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb,
 		if (tp->syn_fastopen && !tp->data_segs_in &&
 		    sk->sk_state == TCP_ESTABLISHED)
 			tcp_fastopen_active_disable(sk);
-		tcp_send_challenge_ack(sk);
+		tcp_send_challenge_ack(sk, false);
 		SKB_DR_SET(reason, TCP_RESET);
 		goto discard;
 	}
@@ -6171,16 +6252,27 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb,
 	 * RFC 5961 4.2 : Send a challenge ack
 	 */
 	if (th->syn) {
+		if (tcp_ecn_mode_accecn(tp))
+			send_accecn_reflector = true;
 		if (sk->sk_state == TCP_SYN_RECV && sk->sk_socket && th->ack &&
 		    TCP_SKB_CB(skb)->seq + 1 == TCP_SKB_CB(skb)->end_seq &&
 		    TCP_SKB_CB(skb)->seq + 1 == tp->rcv_nxt &&
-		    TCP_SKB_CB(skb)->ack_seq == tp->snd_nxt)
+		    TCP_SKB_CB(skb)->ack_seq == tp->snd_nxt) {
+			if (!tcp_ecn_disabled(tp)) {
+				u8 ect = tp->syn_ect_rcv;
+
+				tp->wait_third_ack = true;
+				__tcp_send_ack(sk, tp->rcv_nxt,
+					       !send_accecn_reflector ? 0 :
+					       tcp_accecn_reflector_flags(ect));
+			}
 			goto pass;
+		}
 syn_challenge:
 		if (syn_inerr)
 			TCP_INC_STATS(sock_net(sk), TCP_MIB_INERRS);
 		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPSYNCHALLENGE);
-		tcp_send_challenge_ack(sk);
+		tcp_send_challenge_ack(sk, send_accecn_reflector);
 		SKB_DR_SET(reason, TCP_INVALID_SYN);
 		goto discard;
 	}
@@ -6393,6 +6485,12 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb)
 		return;

 step5:
+	if (unlikely(tp->wait_third_ack)) {
+		tp->wait_third_ack = 0;
+		if (tcp_ecn_mode_accecn(tp))
+			tcp_accecn_third_ack(sk, skb, tp->syn_ect_snt);
+		tcp_fast_path_on(tp);
+	}
 	tcp_ecn_received_counters(sk, skb);

 	reason = tcp_ack(sk, skb, FLAG_SLOWPATH | FLAG_UPDATE_TS_RECENT);
@@ -6645,7 +6743,8 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
 		 *    state to ESTABLISHED..."
 		 */

-		tcp_ecn_rcv_synack(tp, th);
+		if (tcp_ecn_mode_any(tp))
+			tcp_ecn_rcv_synack(sk, th, TCP_SKB_CB(skb)->ip_dsfield);

 		tcp_init_wl(tp, TCP_SKB_CB(skb)->seq);
 		tcp_try_undo_spurious_syn(sk);
@@ -6717,7 +6816,9 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
 						  TCP_DELACK_MAX, false);
 			goto consume;
 		}
-		tcp_send_ack(sk);
+		__tcp_send_ack(sk, tp->rcv_nxt,
+			       !tcp_ecn_mode_accecn(tp) ?
0 : + tcp_accecn_reflector_flags(tp->syn_ect_rcv)); return -1; } @@ -6776,7 +6877,7 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb, tp->snd_wl1 = TCP_SKB_CB(skb)->seq; tp->max_window = tp->snd_wnd; - tcp_ecn_rcv_syn(tp, th); + tcp_ecn_rcv_syn(tp, th, skb); tcp_mtup_init(sk); tcp_sync_mss(sk, icsk->icsk_pmtu_cookie); @@ -6958,7 +7059,7 @@ tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb) } /* accept old ack during closing */ if ((int)reason < 0) { - tcp_send_challenge_ack(sk); + tcp_send_challenge_ack(sk, false); reason = -reason; goto discard; } @@ -7005,9 +7106,16 @@ tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb) tp->lsndtime = tcp_jiffies32; tcp_initialize_rcv_mss(sk); - tcp_fast_path_on(tp); + if (likely(!tp->wait_third_ack)) { + if (tcp_ecn_mode_accecn(tp)) + tcp_accecn_third_ack(sk, skb, tp->syn_ect_snt); + tcp_fast_path_on(tp); + } if (sk->sk_shutdown & SEND_SHUTDOWN) tcp_shutdown(sk, SEND_SHUTDOWN); + + if (sk->sk_socket && tp->wait_third_ack) + goto consume; break; case TCP_FIN_WAIT1: { @@ -7177,6 +7285,15 @@ static void tcp_ecn_create_request(struct request_sock *req, bool ect, ecn_ok; u32 ecn_ok_dst; + if (tcp_accecn_syn_requested(th) && + (net->ipv4.sysctl_tcp_ecn >= 3 || tcp_ca_needs_accecn(listen_sk))) { + inet_rsk(req)->ecn_ok = 1; + tcp_rsk(req)->accecn_ok = 1; + tcp_rsk(req)->syn_ect_rcv = TCP_SKB_CB(skb)->ip_dsfield & + INET_ECN_MASK; + return; + } + if (!th_ecn) return; @@ -7184,7 +7301,8 @@ static void tcp_ecn_create_request(struct request_sock *req, ecn_ok_dst = dst_feature(dst, DST_FEATURE_ECN_MASK); ecn_ok = READ_ONCE(net->ipv4.sysctl_tcp_ecn) || ecn_ok_dst; - if (((!ect || th->res1) && ecn_ok) || tcp_ca_needs_ecn(listen_sk) || + if (((!ect || th->res1 || th->ae) && ecn_ok) || + tcp_ca_needs_ecn(listen_sk) || (ecn_ok_dst & DST_FEATURE_ECN_CA) || tcp_bpf_ca_needs_ecn((struct sock *)req)) inet_rsk(req)->ecn_ok = 1; @@ -7202,6 +7320,9 @@ static void tcp_openreq_init(struct request_sock 
*req, tcp_rsk(req)->snt_synack = 0; tcp_rsk(req)->snt_tsval_first = 0; tcp_rsk(req)->last_oow_ack_time = 0; + tcp_rsk(req)->accecn_ok = 0; + tcp_rsk(req)->syn_ect_rcv = 0; + tcp_rsk(req)->syn_ect_snt = 0; req->mss = rx_opt->mss_clamp; req->ts_recent = rx_opt->saw_tstamp ? rx_opt->rcv_tsval : 0; ireq->tstamp_ok = rx_opt->tstamp_ok; diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index d5b5c32115d2..5c5d4b94b59c 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -1189,7 +1189,7 @@ static int tcp_v4_send_synack(const struct sock *sk, struct dst_entry *dst, enum tcp_synack_type synack_type, struct sk_buff *syn_skb) { - const struct inet_request_sock *ireq = inet_rsk(req); + struct inet_request_sock *ireq = inet_rsk(req); struct flowi4 fl4; int err = -1; struct sk_buff *skb; @@ -1202,6 +1202,7 @@ static int tcp_v4_send_synack(const struct sock *sk, struct dst_entry *dst, skb = tcp_make_synack(sk, dst, req, foc, synack_type, syn_skb); if (skb) { + tcp_rsk(req)->syn_ect_snt = inet_sk(sk)->tos & INET_ECN_MASK; __tcp_v4_send_check(skb, ireq->ir_loc_addr, ireq->ir_rmt_addr); tos = READ_ONCE(inet_sk(sk)->tos); diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index 43d7852ce07e..779a206a5ca6 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -461,12 +461,51 @@ void tcp_openreq_init_rwin(struct request_sock *req, ireq->rcv_wscale = rcv_wscale; } -static void tcp_ecn_openreq_child(struct tcp_sock *tp, - const struct request_sock *req) +void tcp_accecn_third_ack(struct sock *sk, const struct sk_buff *skb, + u8 syn_ect_snt) { - tcp_ecn_mode_set(tp, inet_rsk(req)->ecn_ok ? 
- TCP_ECN_MODE_RFC3168 : - TCP_ECN_DISABLED); + u8 ace = tcp_accecn_ace(tcp_hdr(skb)); + struct tcp_sock *tp = tcp_sk(sk); + + switch (ace) { + case 0x0: + tcp_accecn_fail_mode_set(tp, TCP_ACCECN_ACE_FAIL_RECV); + break; + case 0x7: + case 0x5: + case 0x1: + /* Unused but legal values */ + break; + default: + /* Validation only applies to first non-data packet */ + if (TCP_SKB_CB(skb)->seq == TCP_SKB_CB(skb)->end_seq && + !TCP_SKB_CB(skb)->sacked && + tcp_accecn_validate_syn_feedback(sk, ace, syn_ect_snt)) { + if ((tcp_accecn_extract_syn_ect(ace) == INET_ECN_CE) && + !tp->delivered_ce) + tp->delivered_ce++; + } + break; + } +} + +static void tcp_ecn_openreq_child(struct sock *sk, + const struct request_sock *req, + const struct sk_buff *skb) +{ + const struct tcp_request_sock *treq = tcp_rsk(req); + struct tcp_sock *tp = tcp_sk(sk); + + if (treq->accecn_ok) { + tcp_ecn_mode_set(tp, TCP_ECN_MODE_ACCECN); + tp->syn_ect_snt = treq->syn_ect_snt; + tcp_accecn_third_ack(sk, skb, treq->syn_ect_snt); + tcp_ecn_received_counters(sk, skb); + } else { + tcp_ecn_mode_set(tp, inet_rsk(req)->ecn_ok ? 
+ TCP_ECN_MODE_RFC3168 : + TCP_ECN_DISABLED); + } } void tcp_ca_openreq_child(struct sock *sk, const struct dst_entry *dst) @@ -631,7 +670,7 @@ struct sock *tcp_create_openreq_child(const struct sock *sk, if (skb->len >= TCP_MSS_DEFAULT + newtp->tcp_header_len) newicsk->icsk_ack.last_seg_size = skb->len - newtp->tcp_header_len; newtp->rx_opt.mss_clamp = req->mss; - tcp_ecn_openreq_child(newtp, req); + tcp_ecn_openreq_child(newsk, req, skb); newtp->fastopen_req = NULL; RCU_INIT_POINTER(newtp->fastopen_rsk, NULL); diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 9c978d12c7cf..b4eac0725682 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -322,7 +322,7 @@ static u16 tcp_select_window(struct sock *sk) /* Packet ECN state for a SYN-ACK */ static void tcp_ecn_send_synack(struct sock *sk, struct sk_buff *skb) { - const struct tcp_sock *tp = tcp_sk(sk); + struct tcp_sock *tp = tcp_sk(sk); TCP_SKB_CB(skb)->tcp_flags &= ~TCPHDR_CWR; if (tcp_ecn_disabled(tp)) @@ -330,6 +330,13 @@ static void tcp_ecn_send_synack(struct sock *sk, struct sk_buff *skb) else if (tcp_ca_needs_ecn(sk) || tcp_bpf_ca_needs_ecn(sk)) INET_ECN_xmit(sk); + + if (tp->ecn_flags & TCP_ECN_MODE_ACCECN) { + TCP_SKB_CB(skb)->tcp_flags &= ~TCPHDR_ACE; + TCP_SKB_CB(skb)->tcp_flags |= + tcp_accecn_reflector_flags(tp->syn_ect_rcv); + tp->syn_ect_snt = inet_sk(sk)->tos & INET_ECN_MASK; + } } /* Packet ECN state for a SYN. 
*/ @@ -337,8 +344,20 @@ static void tcp_ecn_send_syn(struct sock *sk, struct sk_buff *skb) { struct tcp_sock *tp = tcp_sk(sk); bool bpf_needs_ecn = tcp_bpf_ca_needs_ecn(sk); - bool use_ecn = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_ecn) == 1 || - tcp_ca_needs_ecn(sk) || bpf_needs_ecn; + bool use_ecn, use_accecn; + u8 tcp_ecn = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_ecn); + + /* ============== ========================== + * tcp_ecn values Outgoing connections + * ============== ========================== + * 0,2,5 Do not request ECN + * 1,4 Request ECN connection + * 3 Request AccECN connection + * ============== ========================== + */ + use_accecn = tcp_ecn == 3 || tcp_ca_needs_accecn(sk); + use_ecn = tcp_ecn == 1 || tcp_ecn == 4 || + tcp_ca_needs_ecn(sk) || bpf_needs_ecn || use_accecn; if (!use_ecn) { const struct dst_entry *dst = __sk_dst_get(sk); @@ -354,35 +373,58 @@ static void tcp_ecn_send_syn(struct sock *sk, struct sk_buff *skb) INET_ECN_xmit(sk); TCP_SKB_CB(skb)->tcp_flags |= TCPHDR_ECE | TCPHDR_CWR; - tcp_ecn_mode_set(tp, TCP_ECN_MODE_RFC3168); + if (use_accecn) { + TCP_SKB_CB(skb)->tcp_flags |= TCPHDR_AE; + tcp_ecn_mode_set(tp, TCP_ECN_MODE_PENDING); + tp->syn_ect_snt = inet_sk(sk)->tos & INET_ECN_MASK; + } else { + tcp_ecn_mode_set(tp, TCP_ECN_MODE_RFC3168); + } } } static void tcp_ecn_clear_syn(struct sock *sk, struct sk_buff *skb) { - if (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_ecn_fallback)) + if (READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_ecn_fallback)) { /* tp->ecn_flags are cleared at a later point in time when * SYN ACK is ultimatively being received. 
*/ - TCP_SKB_CB(skb)->tcp_flags &= ~(TCPHDR_ECE | TCPHDR_CWR); + TCP_SKB_CB(skb)->tcp_flags &= ~TCPHDR_ACE; + } +} + +static void tcp_accecn_echo_syn_ect(struct tcphdr *th, u8 ect) +{ + th->ae = !!(ect & INET_ECN_ECT_0); + th->cwr = ect != INET_ECN_ECT_0; + th->ece = ect == INET_ECN_ECT_1; } static void tcp_ecn_make_synack(const struct request_sock *req, struct tcphdr *th) { - if (inet_rsk(req)->ecn_ok) + if (tcp_rsk(req)->accecn_ok) + tcp_accecn_echo_syn_ect(th, tcp_rsk(req)->syn_ect_rcv); + else if (inet_rsk(req)->ecn_ok) th->ece = 1; } -static void tcp_accecn_set_ace(struct tcphdr *th, struct tcp_sock *tp) +static void tcp_accecn_set_ace(struct tcp_sock *tp, struct sk_buff *skb, + struct tcphdr *th) { u32 wire_ace; - wire_ace = tp->received_ce + TCP_ACCECN_CEP_INIT_OFFSET; - th->ece = !!(wire_ace & 0x1); - th->cwr = !!(wire_ace & 0x2); - th->ae = !!(wire_ace & 0x4); - tp->received_ce_pending = 0; + /* The final packet of the 3WHS or anything like it must reflect + * the SYN/ACK ECT instead of putting CEP into ACE field, such + * case show up in tcp_flags. + */ + if (likely(!(TCP_SKB_CB(skb)->tcp_flags & TCPHDR_ACE))) { + wire_ace = tp->received_ce + TCP_ACCECN_CEP_INIT_OFFSET; + th->ece = !!(wire_ace & 0x1); + th->cwr = !!(wire_ace & 0x2); + th->ae = !!(wire_ace & 0x4); + tp->received_ce_pending = 0; + } } /* Set up ECN state for a packet on a ESTABLISHED socket that is about to @@ -396,9 +438,10 @@ static void tcp_ecn_send(struct sock *sk, struct sk_buff *skb, if (!tcp_ecn_mode_any(tp)) return; - INET_ECN_xmit(sk); + if (!tcp_accecn_ace_fail_recv(tp)) + INET_ECN_xmit(sk); if (tcp_ecn_mode_accecn(tp)) { - tcp_accecn_set_ace(th, tp); + tcp_accecn_set_ace(tp, skb, th); skb_shinfo(skb)->gso_type |= SKB_GSO_TCP_ACCECN; } else { /* Not-retransmitted data segment: set ECT and inject CWR. */ @@ -3414,7 +3457,10 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs) tcp_retrans_try_collapse(sk, skb, avail_wnd); } - /* RFC3168, section 6.1.1.1. 
ECN fallback */ + /* RFC3168, section 6.1.1.1. ECN fallback + * As AccECN uses the same SYN flags (+ AE), this check covers both + * cases. + */ if ((TCP_SKB_CB(skb)->tcp_flags & TCPHDR_SYN_ECN) == TCPHDR_SYN_ECN) tcp_ecn_clear_syn(sk, skb); diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c index 9d83eadd308b..50046460ee0b 100644 --- a/net/ipv6/syncookies.c +++ b/net/ipv6/syncookies.c @@ -264,6 +264,7 @@ struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb) if (!req->syncookie) ireq->rcv_wscale = rcv_wscale; ireq->ecn_ok &= cookie_ecn_ok(net, dst); + tcp_rsk(req)->accecn_ok = ireq->ecn_ok && cookie_accecn_ok(th); ret = tcp_get_cookie_sock(sk, skb, req, dst); if (!ret) { diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c index 7dcb33f879ee..34381f94f3ca 100644 --- a/net/ipv6/tcp_ipv6.c +++ b/net/ipv6/tcp_ipv6.c @@ -542,6 +542,7 @@ static int tcp_v6_send_synack(const struct sock *sk, struct dst_entry *dst, skb = tcp_make_synack(sk, dst, req, foc, synack_type, syn_skb); if (skb) { + tcp_rsk(req)->syn_ect_snt = np->tclass & INET_ECN_MASK; __tcp_v6_send_check(skb, &ireq->ir_v6_loc_addr, &ireq->ir_v6_rmt_addr);
From patchwork Mon Apr 14 13:13:05 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Chia-Yu Chang \(Nokia\)" X-Patchwork-Id: 881145
From: chia-yu.chang@nokia-bell-labs.com
To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com, jhs@mojatatu.com, kuba@kernel.org, stephen@networkplumber.org, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, edumazet@google.com, horms@kernel.org, andrew+netdev@lunn.ch, donald.hunter@gmail.com, ast@fiberby.net, liuhangbin@gmail.com, shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org, ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com, g.white@cablelabs.com, ingemar.s.johansson@ericsson.com, mirja.kuehlewind@ericsson.com, cheshire@apple.com, rs.ietf@gmx.at, Jason_Livingood@comcast.com, vidhi_goel@apple.com
Cc: Chia-Yu Chang
Subject: [PATCH v3 net-next 05/15] tcp: accecn: add AccECN rx byte counters
Date: Mon, 14 Apr 2025 15:13:05 +0200
Message-Id: <20250414131315.97456-6-chia-yu.chang@nokia-bell-labs.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250414131315.97456-1-chia-yu.chang@nokia-bell-labs.com>
References: <20250414131315.97456-1-chia-yu.chang@nokia-bell-labs.com>
Precedence: bulk
X-Mailing-List: linux-kselftest@vger.kernel.org
MIME-Version: 1.0

From: Ilpo Järvinen

These counters track IP ECN field payload byte sums for all arriving (acceptable) packets. The AccECN option (added by a later patch in the series) echoes these counters back to sender side.
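For illustration only (not part of this patch): a minimal userspace sketch of the per-codepoint byte accounting described above. It assumes the usual IP ECN codepoint values (ECT(1) = 1, ECT(0) = 2, CE = 3), which the patch also asserts via BUILD_BUG_ON, so that "ecnfield - 1" indexes a three-entry array. The struct ecn_rx_state and the ecn_rx_*() helpers are hypothetical names invented for this sketch, and GSO segment counting is left out for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* IP ECN codepoints: the two low bits of the TOS / traffic class byte. */
enum {
	ECN_NOT_ECT = 0x0,
	ECN_ECT_1   = 0x1,
	ECN_ECT_0   = 0x2,
	ECN_CE      = 0x3,
	ECN_MASK    = 0x3,
};

/* Hypothetical stand-in for the tcp_sock fields touched by the patch. */
struct ecn_rx_state {
	uint32_t received_ce;           /* CE-marked packets, pure ACKs included */
	uint32_t received_ecn_bytes[3]; /* payload bytes: [0]=ECT(1), [1]=ECT(0), [2]=CE */
};

/* Zero all counters, as done when a connection is (re)initialised. */
static void ecn_rx_init(struct ecn_rx_state *st)
{
	st->received_ce = 0;
	for (int i = 0; i < 3; i++)
		st->received_ecn_bytes[i] = 0;
}

/* Account one received segment; this sketch treats each call as one packet. */
static void ecn_rx_account(struct ecn_rx_state *st, uint8_t dsfield,
			   uint32_t payload_len)
{
	uint8_t ecnfield = dsfield & ECN_MASK;

	/* Not-ECT packets update neither the packet nor the byte counters. */
	if (ecnfield == ECN_NOT_ECT)
		return;

	/* The CE packet counter covers all segments, including pure ACKs. */
	if (ecnfield == ECN_CE)
		st->received_ce++;

	/* Byte counters only move for segments that carry payload. */
	if (payload_len > 0) {
		assert(ecnfield >= ECN_ECT_1 && ecnfield <= ECN_CE);
		st->received_ecn_bytes[ecnfield - 1] += payload_len;
	}
}
```

In this sketch a CE-marked pure ACK bumps received_ce but leaves received_ecn_bytes[] untouched, which mirrors the payload_len > 0 check in tcp_ecn_received_counters().
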
Signed-off-by: Ilpo Järvinen Signed-off-by: Neal Cardwell Signed-off-by: Chia-Yu Chang --- include/linux/tcp.h | 1 + include/net/tcp.h | 18 +++++++++++++++++- net/ipv4/tcp.c | 3 ++- net/ipv4/tcp_input.c | 13 +++++++++---- net/ipv4/tcp_minisocks.c | 3 ++- 5 files changed, 31 insertions(+), 7 deletions(-) diff --git a/include/linux/tcp.h b/include/linux/tcp.h index af38fff24aa4..9cbfefd693e3 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -303,6 +303,7 @@ struct tcp_sock { u32 delivered; /* Total data packets delivered incl. rexmits */ u32 delivered_ce; /* Like the above but only ECE marked packets */ u32 received_ce; /* Like the above but for rcvd CE marked pkts */ + u32 received_ecn_bytes[3]; u8 received_ce_pending:4, /* Not yet transmit cnt of received_ce */ unused2:4; u32 app_limited; /* limited until "delivered" reaches this val */ diff --git a/include/net/tcp.h b/include/net/tcp.h index f36a1a3d538f..6ffa4ae085db 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -467,7 +467,8 @@ static inline int tcp_accecn_extract_syn_ect(u8 ace) bool tcp_accecn_validate_syn_feedback(struct sock *sk, u8 ace, u8 sent_ect); void tcp_accecn_third_ack(struct sock *sk, const struct sk_buff *skb, u8 syn_ect_snt); -void tcp_ecn_received_counters(struct sock *sk, const struct sk_buff *skb); +void tcp_ecn_received_counters(struct sock *sk, const struct sk_buff *skb, + u32 payload_len); enum tcp_tw_status { TCP_TW_SUCCESS = 0, @@ -1035,11 +1036,26 @@ static inline u32 tcp_rsk_tsval(const struct tcp_request_sock *treq) * See draft-ietf-tcpm-accurate-ecn for the latest values. 
*/ #define TCP_ACCECN_CEP_INIT_OFFSET 5 +#define TCP_ACCECN_E1B_INIT_OFFSET 1 +#define TCP_ACCECN_E0B_INIT_OFFSET 1 +#define TCP_ACCECN_CEB_INIT_OFFSET 0 + +static inline void __tcp_accecn_init_bytes_counters(int *counter_array) +{ + BUILD_BUG_ON(INET_ECN_ECT_1 != 0x1); + BUILD_BUG_ON(INET_ECN_ECT_0 != 0x2); + BUILD_BUG_ON(INET_ECN_CE != 0x3); + + counter_array[INET_ECN_ECT_1 - 1] = 0; + counter_array[INET_ECN_ECT_0 - 1] = 0; + counter_array[INET_ECN_CE - 1] = 0; +} static inline void tcp_accecn_init_counters(struct tcp_sock *tp) { tp->received_ce = 0; tp->received_ce_pending = 0; + __tcp_accecn_init_bytes_counters(tp->received_ecn_bytes); } /* State flags for sacked in struct tcp_skb_cb */ diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 73f8cc715bff..278990dba721 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -5092,6 +5092,7 @@ static void __init tcp_struct_check(void) CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, delivered); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, delivered_ce); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, received_ce); + CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, received_ecn_bytes); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, app_limited); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, rcv_wnd); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, rx_opt); @@ -5099,7 +5100,7 @@ static void __init tcp_struct_check(void) /* 32bit arches with 8byte alignment on u64 fields might need padding * before tcp_clock_cache. 
*/ - CACHELINE_ASSERT_GROUP_SIZE(struct tcp_sock, tcp_sock_write_txrx, 97 + 7); + CACHELINE_ASSERT_GROUP_SIZE(struct tcp_sock, tcp_sock_write_txrx, 109 + 3); /* RX read-write hotpath cache lines */ CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_rx, bytes_received); diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index cc34664805f8..c017e342f092 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -6106,7 +6106,8 @@ static void tcp_urg(struct sock *sk, struct sk_buff *skb, const struct tcphdr *t } /* Updates Accurate ECN received counters from the received IP ECN field */ -void tcp_ecn_received_counters(struct sock *sk, const struct sk_buff *skb) +void tcp_ecn_received_counters(struct sock *sk, const struct sk_buff *skb, + u32 payload_len) { u8 ecnfield = TCP_SKB_CB(skb)->ip_dsfield & INET_ECN_MASK; u8 is_ce = INET_ECN_is_ce(ecnfield); @@ -6121,6 +6122,9 @@ void tcp_ecn_received_counters(struct sock *sk, const struct sk_buff *skb) tp->received_ce += pcount; tp->received_ce_pending = min(tp->received_ce_pending + pcount, 0xfU); + + if (payload_len > 0) + tp->received_ecn_bytes[ecnfield - 1] += payload_len; } } @@ -6398,7 +6402,7 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb) flag |= __tcp_replace_ts_recent(tp, delta); - tcp_ecn_received_counters(sk, skb); + tcp_ecn_received_counters(sk, skb, 0); /* We know that such packets are checksummed * on entry. 
@@ -6444,7 +6448,8 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb) /* Bulk data transfer: receiver */ tcp_cleanup_skb(skb); __skb_pull(skb, tcp_header_len); - tcp_ecn_received_counters(sk, skb); + tcp_ecn_received_counters(sk, skb, + len - tcp_header_len); eaten = tcp_queue_rcv(sk, skb, &fragstolen); tcp_event_data_recv(sk, skb); @@ -6491,7 +6496,7 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb) tcp_accecn_third_ack(sk, skb, tp->syn_ect_snt); tcp_fast_path_on(tp); } - tcp_ecn_received_counters(sk, skb); + tcp_ecn_received_counters(sk, skb, len - th->doff * 4); reason = tcp_ack(sk, skb, FLAG_SLOWPATH | FLAG_UPDATE_TS_RECENT); if ((int)reason < 0) { diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index 779a206a5ca6..3f8225bae49f 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -497,10 +497,11 @@ static void tcp_ecn_openreq_child(struct sock *sk, struct tcp_sock *tp = tcp_sk(sk); if (treq->accecn_ok) { + const struct tcphdr *th = (const struct tcphdr *)skb->data; tcp_ecn_mode_set(tp, TCP_ECN_MODE_ACCECN); tp->syn_ect_snt = treq->syn_ect_snt; tcp_accecn_third_ack(sk, skb, treq->syn_ect_snt); - tcp_ecn_received_counters(sk, skb); + tcp_ecn_received_counters(sk, skb, skb->len - th->doff * 4); } else { tcp_ecn_mode_set(tp, inet_rsk(req)->ecn_ok ? 
TCP_ECN_MODE_RFC3168 :

From patchwork Mon Apr 14 13:13:06 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Chia-Yu Chang \(Nokia\)"
X-Patchwork-Id: 881143
From: chia-yu.chang@nokia-bell-labs.com
To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com, jhs@mojatatu.com, kuba@kernel.org, stephen@networkplumber.org, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, edumazet@google.com, horms@kernel.org, andrew+netdev@lunn.ch, donald.hunter@gmail.com, ast@fiberby.net, liuhangbin@gmail.com, shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org, ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com, g.white@cablelabs.com, ingemar.s.johansson@ericsson.com, mirja.kuehlewind@ericsson.com, cheshire@apple.com, rs.ietf@gmx.at, Jason_Livingood@comcast.com, vidhi_goel@apple.com
Cc: Chia-Yu Chang
Subject: [PATCH v3 net-next 06/15] tcp: accecn: AccECN needs to know delivered bytes
Date: Mon, 14 Apr 2025 15:13:06 +0200
Message-Id: <20250414131315.97456-7-chia-yu.chang@nokia-bell-labs.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250414131315.97456-1-chia-yu.chang@nokia-bell-labs.com>
References: <20250414131315.97456-1-chia-yu.chang@nokia-bell-labs.com>
Precedence: bulk
X-Mailing-List: linux-kselftest@vger.kernel.org
From: Ilpo Järvinen

AccECN byte counter estimation requires delivered bytes, which can be calculated while processing SACK blocks and the cumulative ACK. The delivered bytes will be used to estimate the byte counters between AccECN options (on ACKs without the option).

Non-SACK calculation is quite annoying, inaccurate, and likely bogus.

Signed-off-by: Ilpo Järvinen
Signed-off-by: Chia-Yu Chang
---
 net/ipv4/tcp_input.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index c017e342f092..5bd7fc9bcf66 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -1170,6 +1170,7 @@ struct tcp_sacktag_state { u64 last_sackt; u32 reord; u32 sack_delivered; + u32 delivered_bytes; int flag; unsigned int mss_now; struct rate_sample *rate; @@ -1531,7 +1532,7 @@ static int tcp_match_skb_to_sack(struct sock *sk, struct sk_buff *skb, static u8 tcp_sacktag_one(struct sock *sk, struct tcp_sacktag_state *state, u8 sacked, u32 start_seq, u32 end_seq, - int dup_sack, int pcount, + int dup_sack, int pcount, u32 plen, u64 xmit_time) { struct tcp_sock *tp = tcp_sk(sk); @@ -1591,6 +1592,7 @@ static u8 tcp_sacktag_one(struct sock *sk, tp->sacked_out += pcount; /* Out-of-order packets delivered */ state->sack_delivered += pcount; + state->delivered_bytes += plen; /* Lost marker hint past SACKed? Tweak RFC3517 cnt */ if (tp->lost_skb_hint && @@ -1632,7 +1634,7 @@ static bool tcp_shifted_skb(struct sock *sk, struct sk_buff *prev, * tcp_highest_sack_seq() when skb is highest_sack.
*/ tcp_sacktag_one(sk, state, TCP_SKB_CB(skb)->sacked, - start_seq, end_seq, dup_sack, pcount, + start_seq, end_seq, dup_sack, pcount, skb->len, tcp_skb_timestamp_us(skb)); tcp_rate_skb_delivered(sk, skb, state->rate); @@ -1924,6 +1926,7 @@ static struct sk_buff *tcp_sacktag_walk(struct sk_buff *skb, struct sock *sk, TCP_SKB_CB(skb)->end_seq, dup_sack, tcp_skb_pcount(skb), + skb->len, tcp_skb_timestamp_us(skb)); tcp_rate_skb_delivered(sk, skb, state->rate); if (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED) @@ -3540,6 +3543,8 @@ static int tcp_clean_rtx_queue(struct sock *sk, const struct sk_buff *ack_skb, if (sacked & TCPCB_SACKED_ACKED) { tp->sacked_out -= acked_pcount; + /* snd_una delta covers these skbs */ + sack->delivered_bytes -= skb->len; } else if (tcp_is_sack(tp)) { tcp_count_delivered(tp, acked_pcount, ece_ack); if (!tcp_skb_spurious_retrans(tp, skb)) @@ -3643,6 +3648,10 @@ static int tcp_clean_rtx_queue(struct sock *sk, const struct sk_buff *ack_skb, delta = prior_sacked - tp->sacked_out; tp->lost_cnt_hint -= min(tp->lost_cnt_hint, delta); } + + sack->delivered_bytes = (skb ? + TCP_SKB_CB(skb)->seq : tp->snd_una) - + prior_snd_una; } else if (skb && rtt_update && sack_rtt_us >= 0 && sack_rtt_us > tcp_stamp_us_delta(tp->tcp_mstamp, tcp_skb_timestamp_us(skb))) { @@ -4097,6 +4106,7 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag) sack_state.first_sackt = 0; sack_state.rate = &rs; sack_state.sack_delivered = 0; + sack_state.delivered_bytes = 0; /* We very likely will need to access rtx queue. 
*/ prefetch(sk->tcp_rtx_queue.rb_node);

From patchwork Mon Apr 14 13:13:10 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Chia-Yu Chang \(Nokia\)"
X-Patchwork-Id: 881141
From: chia-yu.chang@nokia-bell-labs.com
To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com, jhs@mojatatu.com, kuba@kernel.org, stephen@networkplumber.org, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, edumazet@google.com, horms@kernel.org, andrew+netdev@lunn.ch, donald.hunter@gmail.com, ast@fiberby.net, liuhangbin@gmail.com, shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org, ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com, g.white@cablelabs.com, ingemar.s.johansson@ericsson.com, mirja.kuehlewind@ericsson.com, cheshire@apple.com, rs.ietf@gmx.at, Jason_Livingood@comcast.com, vidhi_goel@apple.com
Cc: Chia-Yu Chang
Subject: [PATCH v3 net-next 10/15] tcp: accecn: AccECN option send control
Date: Mon, 14 Apr 2025 15:13:10 +0200
Message-Id: <20250414131315.97456-11-chia-yu.chang@nokia-bell-labs.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250414131315.97456-1-chia-yu.chang@nokia-bell-labs.com>
References: <20250414131315.97456-1-chia-yu.chang@nokia-bell-labs.com>
Precedence: bulk
X-Mailing-List: linux-kselftest@vger.kernel.org

From: Ilpo Järvinen

Instead of sending the option in every ACK, limit sending to those ACKs where the option is necessary:
- Handshake
- "Change-triggered ACK" + the ACK following it. The 2nd ACK is necessary to unambiguously indicate which of the ECN byte counters is increasing. The first ACK has two counters increasing due to the ecnfield edge.
- ACKs with CE, to allow CEP delta validations to take advantage of the option.
- Force the option to be sent at least once per 2^22 bytes. The check is done using the bit edges of the byte counters (avoids the need for extra variables).
- AccECN option beacon, to send the option a few times per RTT even if nothing in the ECN state requires that. The default is 3 times per RTT, and its period can be set via sysctl_tcp_ecn_option_beacon.
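As a rough illustration of the last two rules above (the 2^22-byte bit-edge check and the beacon timing), here is a minimal standalone sketch in userspace C. It is not the patch code: the helper names `accecn_crossed_2_22_edge()` and `accecn_beacon_due()` are hypothetical, and the smoothed RTT is assumed to be passed as plain microseconds, whereas the kernel stores `srtt_us` left-shifted by 3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not a kernel function.
 * XOR of the old and new counter values has a bit set wherever the two
 * differ. Masking off the low 22 bits leaves a non-zero result only if
 * some bit at position 22 or above changed, i.e. the counter crossed a
 * 2^22-byte boundary. No extra "bytes since last option" state is needed.
 */
static bool accecn_crossed_2_22_edge(uint32_t oldbytes, uint32_t newbytes)
{
	return ((newbytes ^ oldbytes) & ~((1u << 22) - 1)) != 0;
}

/* Hypothetical helper, not a kernel function.
 * With beacon_per_rtt = N, the option is due once at least 1/N of the
 * smoothed RTT has elapsed since it was last sent. A value of 0 disables
 * the beacon. Assumption: srtt_us is in plain microseconds here.
 */
static bool accecn_beacon_due(uint64_t now_us, uint64_t last_opt_us,
			      uint32_t srtt_us, uint8_t beacon_per_rtt)
{
	if (!beacon_per_rtt)
		return false;

	return (now_us - last_opt_us) * beacon_per_rtt >= srtt_us;
}

/* Small self-check exercising both helpers. */
void accecn_sketch_selfcheck(void)
{
	/* 0x3FFFFF -> 0x400000 flips bit 22: option demanded. */
	assert(accecn_crossed_2_22_edge(0x3FFFFFu, 0x400000u));
	/* Growth that stays below bit 22: no demand. */
	assert(!accecn_crossed_2_22_edge(0u, 1000u));

	/* 3 beacons per 30 ms RTT: due after 10 ms, not after 9.999 ms. */
	assert(accecn_beacon_due(10000, 0, 30000, 3));
	assert(!accecn_beacon_due(9999, 0, 30000, 3));
}
```

The XOR-and-mask form also fires when the 32-bit counter wraps, since a wrap changes the high bits as well.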
Signed-off-by: Ilpo Järvinen Co-developed-by: Chia-Yu Chang Signed-off-by: Chia-Yu Chang --- include/linux/tcp.h | 3 +++ include/net/netns/ipv4.h | 1 + include/net/tcp.h | 1 + net/ipv4/sysctl_net_ipv4.c | 9 ++++++++ net/ipv4/tcp.c | 5 ++++- net/ipv4/tcp_input.c | 36 +++++++++++++++++++++++++++++++- net/ipv4/tcp_ipv4.c | 1 + net/ipv4/tcp_minisocks.c | 2 ++ net/ipv4/tcp_output.c | 42 ++++++++++++++++++++++++++++++-------- 9 files changed, 90 insertions(+), 10 deletions(-) diff --git a/include/linux/tcp.h b/include/linux/tcp.h index 0e032d9631ac..9619524d8901 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -309,7 +309,10 @@ struct tcp_sock { u8 received_ce_pending:4, /* Not yet transmit cnt of received_ce */ unused2:4; u8 accecn_minlen:2,/* Minimum length of AccECN option sent */ + prev_ecnfield:2,/* ECN bits from the previous segment */ + accecn_opt_demand:2,/* Demand AccECN option for n next ACKs */ est_ecnfield:2;/* ECN field for AccECN delivered estimates */ + u64 accecn_opt_tstamp; /* Last AccECN option sent timestamp */ u32 app_limited; /* limited until "delivered" reaches this val */ u32 rcv_wnd; /* Current receiver window */ /* diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h index 4569a9ef4fb8..ff8b5b56ad00 100644 --- a/include/net/netns/ipv4.h +++ b/include/net/netns/ipv4.h @@ -149,6 +149,7 @@ struct netns_ipv4 { u8 sysctl_tcp_ecn; u8 sysctl_tcp_ecn_option; + u8 sysctl_tcp_ecn_option_beacon; u8 sysctl_tcp_ecn_fallback; u8 sysctl_ip_default_ttl; diff --git a/include/net/tcp.h b/include/net/tcp.h index bfff2a9f95bf..3ee5b52441e3 100644 --- a/include/net/tcp.h +++ b/include/net/tcp.h @@ -1068,6 +1068,7 @@ static inline void tcp_accecn_init_counters(struct tcp_sock *tp) __tcp_accecn_init_bytes_counters(tp->received_ecn_bytes); __tcp_accecn_init_bytes_counters(tp->delivered_ecn_bytes); tp->accecn_minlen = 0; + tp->accecn_opt_demand = 0; tp->est_ecnfield = 0; } diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c index 
1d7fd86ca7b9..3ceefd2a77d7 100644 --- a/net/ipv4/sysctl_net_ipv4.c +++ b/net/ipv4/sysctl_net_ipv4.c @@ -740,6 +740,15 @@ static struct ctl_table ipv4_net_table[] = { .extra1 = SYSCTL_ZERO, .extra2 = SYSCTL_TWO, }, + { + .procname = "tcp_ecn_option_beacon", + .data = &init_net.ipv4.sysctl_tcp_ecn_option_beacon, + .maxlen = sizeof(u8), + .mode = 0644, + .proc_handler = proc_dou8vec_minmax, + .extra1 = SYSCTL_ZERO, + .extra2 = SYSCTL_FOUR, + }, { .procname = "tcp_ecn_fallback", .data = &init_net.ipv4.sysctl_tcp_ecn_fallback, diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 89799f73c451..25a986ad1c2f 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -3368,6 +3368,8 @@ int tcp_disconnect(struct sock *sk, int flags) tp->wait_third_ack = 0; tp->accecn_fail_mode = 0; tcp_accecn_init_counters(tp); + tp->prev_ecnfield = 0; + tp->accecn_opt_tstamp = 0; if (icsk->icsk_ca_initialized && icsk->icsk_ca_ops->release) icsk->icsk_ca_ops->release(sk); memset(icsk->icsk_ca_priv, 0, sizeof(icsk->icsk_ca_priv)); @@ -5106,6 +5108,7 @@ static void __init tcp_struct_check(void) CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, delivered_ecn_bytes); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, received_ce); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, received_ecn_bytes); + CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, accecn_opt_tstamp); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, app_limited); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, rcv_wnd); CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, rx_opt); @@ -5113,7 +5116,7 @@ static void __init tcp_struct_check(void) /* 32bit arches with 8byte alignment on u64 fields might need padding * before tcp_clock_cache. 
*/ - CACHELINE_ASSERT_GROUP_SIZE(struct tcp_sock, tcp_sock_write_txrx, 122 + 6); + CACHELINE_ASSERT_GROUP_SIZE(struct tcp_sock, tcp_sock_write_txrx, 130 + 6); /* RX read-write hotpath cache lines */ CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_rx, bytes_received); diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 41e45b9aff3f..1e8e49881ca4 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -466,6 +466,7 @@ static void tcp_ecn_rcv_synack(struct sock *sk, const struct tcphdr *th, default: tcp_ecn_mode_set(tp, TCP_ECN_MODE_ACCECN); tp->syn_ect_rcv = ip_dsfield & INET_ECN_MASK; + tp->accecn_opt_demand = 2; if (INET_ECN_is_ce(ip_dsfield) && tcp_accecn_validate_syn_feedback(sk, ace, tp->syn_ect_snt)) { @@ -486,6 +487,7 @@ static void tcp_ecn_rcv_syn(struct tcp_sock *tp, const struct tcphdr *th, } else { tp->syn_ect_rcv = TCP_SKB_CB(skb)->ip_dsfield & INET_ECN_MASK; + tp->prev_ecnfield = tp->syn_ect_rcv; tcp_ecn_mode_set(tp, TCP_ECN_MODE_ACCECN); } } @@ -6278,6 +6280,7 @@ void tcp_ecn_received_counters(struct sock *sk, const struct sk_buff *skb, u8 ecnfield = TCP_SKB_CB(skb)->ip_dsfield & INET_ECN_MASK; u8 is_ce = INET_ECN_is_ce(ecnfield); struct tcp_sock *tp = tcp_sk(sk); + bool ecn_edge; if (!INET_ECN_is_not_ect(ecnfield)) { u32 pcount = is_ce * max_t(u16, 1, skb_shinfo(skb)->gso_segs); @@ -6291,9 +6294,36 @@ void tcp_ecn_received_counters(struct sock *sk, const struct sk_buff *skb, if (payload_len > 0) { u8 minlen = tcp_ecnfield_to_accecn_optfield(ecnfield); + u32 oldbytes = tp->received_ecn_bytes[ecnfield - 1]; + tp->received_ecn_bytes[ecnfield - 1] += payload_len; tp->accecn_minlen = max_t(u8, tp->accecn_minlen, minlen); + + /* Demand AccECN option at least every 2^22 bytes to + * avoid overflowing the ECN byte counters. 
+ */ + if ((tp->received_ecn_bytes[ecnfield - 1] ^ oldbytes) & + ~((1 << 22) - 1)) { + u8 opt_demand = max_t(u8, 1, + tp->accecn_opt_demand); + + tp->accecn_opt_demand = opt_demand; + } + } + } + + ecn_edge = tp->prev_ecnfield != ecnfield; + if (ecn_edge || is_ce) { + tp->prev_ecnfield = ecnfield; + /* Demand Accurate ECN change-triggered ACKs. Two ACK are + * demanded to indicate unambiguously the ecnfield value + * in the latter ACK. + */ + if (tcp_ecn_mode_accecn(tp)) { + if (ecn_edge) + inet_csk(sk)->icsk_ack.pending |= ICSK_ACK_NOW; + tp->accecn_opt_demand = 2; } } } @@ -6426,8 +6456,12 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb, * RFC 5961 4.2 : Send a challenge ack */ if (th->syn) { - if (tcp_ecn_mode_accecn(tp)) + if (tcp_ecn_mode_accecn(tp)) { + u8 opt_demand = max_t(u8, 1, tp->accecn_opt_demand); + send_accecn_reflector = true; + tp->accecn_opt_demand = opt_demand; + } if (sk->sk_state == TCP_SYN_RECV && sk->sk_socket && th->ack && TCP_SKB_CB(skb)->seq + 1 == TCP_SKB_CB(skb)->end_seq && TCP_SKB_CB(skb)->seq + 1 == tp->rcv_nxt && diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index 3f3e285fc973..2e95dad66fe3 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -3451,6 +3451,7 @@ static int __net_init tcp_sk_init(struct net *net) { net->ipv4.sysctl_tcp_ecn = 2; net->ipv4.sysctl_tcp_ecn_option = 2; + net->ipv4.sysctl_tcp_ecn_option_beacon = 3; net->ipv4.sysctl_tcp_ecn_fallback = 1; net->ipv4.sysctl_tcp_base_mss = TCP_BASE_MSS; diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c index 3f8225bae49f..e0f2bd2cee9e 100644 --- a/net/ipv4/tcp_minisocks.c +++ b/net/ipv4/tcp_minisocks.c @@ -501,6 +501,8 @@ static void tcp_ecn_openreq_child(struct sock *sk, tcp_ecn_mode_set(tp, TCP_ECN_MODE_ACCECN); tp->syn_ect_snt = treq->syn_ect_snt; tcp_accecn_third_ack(sk, skb, treq->syn_ect_snt); + tp->prev_ecnfield = treq->syn_ect_rcv; + tp->accecn_opt_demand = 1; tcp_ecn_received_counters(sk, skb, skb->len - th->doff 
* 4); } else { tcp_ecn_mode_set(tp, inet_rsk(req)->ecn_ok ? diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index a36de6c539da..a76061dc4e5f 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -806,8 +806,12 @@ static void tcp_options_write(struct tcphdr *th, struct tcp_sock *tp, *ptr++ = htonl(((e0b & 0xffffff) << 8) | TCPOPT_NOP); } - if (tp) + if (tp) { tp->accecn_minlen = 0; + tp->accecn_opt_tstamp = tp->tcp_mstamp; + if (tp->accecn_opt_demand) + tp->accecn_opt_demand--; + } } if (unlikely(OPTION_SACK_ADVERTISE & options)) { @@ -984,6 +988,18 @@ static int tcp_options_fit_accecn(struct tcp_out_options *opts, int required, return size; } +static bool tcp_accecn_option_beacon_check(const struct sock *sk) +{ + const struct tcp_sock *tp = tcp_sk(sk); + + if (!sock_net(sk)->ipv4.sysctl_tcp_ecn_option_beacon) + return false; + + return tcp_stamp_us_delta(tp->tcp_mstamp, tp->accecn_opt_tstamp) * + sock_net(sk)->ipv4.sysctl_tcp_ecn_option_beacon >= + (tp->srtt_us >> 3); +} + /* Compute TCP options for SYN packets. This is not the final * network wire format yet. */ @@ -1237,13 +1253,18 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb if (tcp_ecn_mode_accecn(tp) && sock_net(sk)->ipv4.sysctl_tcp_ecn_option) { - int saving = opts->num_sack_blocks > 0 ? 2 : 0; - int remaining = MAX_TCP_OPTION_SPACE - size; - - opts->ecn_bytes = tp->received_ecn_bytes; - size += tcp_options_fit_accecn(opts, tp->accecn_minlen, - remaining, - saving); + if (sock_net(sk)->ipv4.sysctl_tcp_ecn_option >= 2 || + tp->accecn_opt_demand || + tcp_accecn_option_beacon_check(sk)) { + int saving = opts->num_sack_blocks > 0 ? 
2 : 0; + int remaining = MAX_TCP_OPTION_SPACE - size; + + opts->ecn_bytes = tp->received_ecn_bytes; + size += tcp_options_fit_accecn(opts, + tp->accecn_minlen, + remaining, + saving); + } } if (unlikely(BPF_SOCK_OPS_TEST_FLAG(tp, @@ -2959,6 +2980,11 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle, sent_pkts = 0; tcp_mstamp_refresh(tp); + + /* AccECN option beacon depends on mstamp, it may change mss */ + if (tcp_ecn_mode_accecn(tp) && tcp_accecn_option_beacon_check(sk)) + mss_now = tcp_current_mss(sk); + if (!push_one) { /* Do MTU probing. */ result = tcp_mtu_probe(sk); From patchwork Mon Apr 14 13:13:11 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Chia-Yu Chang \(Nokia\)" X-Patchwork-Id: 881142 Received: from EUR05-VI1-obe.outbound.protection.outlook.com (mail-vi1eur05on2049.outbound.protection.outlook.com [40.107.21.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2FFD42E336F; Mon, 14 Apr 2025 13:14:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.21.49 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744636461; cv=fail; b=JmWuyfD8uFfXDANElthp6RCmOE0ZPH8wdZ9aH8hqYWRjFNzozmeGMSwuM/UdZT46ciZsvMSH33Dw5B6zzvC8dXU3rilkOlTjLPpv4Wq1cDr9NjmJzxdXdXrBiGMVIZdbR4otl8+qE6w1skcdTeck7DgtSvNq5zD+AJFwApLLaeU= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744636461; c=relaxed/simple; bh=KD7/NR3z7EZK3pNxTn5FeoJCQ5ELLAzCb70Q6PtqoD0=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version:Content-Type; b=BmKuRiaAkyzssAW8UIEYNtygoqr7mNFu3sU74gKkw7w6e0XG2KukfZ/1/B4uGXH3GZNCgNt2nnsELGxlBmunX69lA26kUNkVulXBKdJgL9EKJDpJYieH0Ls96LmQCoi8c7nMVVkeF3STltfIrNcvODJtpWHpB52tejDvT7r2QkE= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; 
From: chia-yu.chang@nokia-bell-labs.com
To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com, jhs@mojatatu.com, kuba@kernel.org, stephen@networkplumber.org, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, edumazet@google.com, horms@kernel.org, andrew+netdev@lunn.ch, donald.hunter@gmail.com, ast@fiberby.net, liuhangbin@gmail.com, shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org, ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com, g.white@cablelabs.com, ingemar.s.johansson@ericsson.com, mirja.kuehlewind@ericsson.com, cheshire@apple.com, rs.ietf@gmx.at, Jason_Livingood@comcast.com, vidhi_goel@apple.com
Cc: Chia-Yu Chang
Subject: [PATCH v3 net-next 11/15] tcp: accecn: AccECN option failure handling
Date: Mon, 14 Apr 2025 15:13:11 +0200
Message-Id: <20250414131315.97456-12-chia-yu.chang@nokia-bell-labs.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250414131315.97456-1-chia-yu.chang@nokia-bell-labs.com>
References: <20250414131315.97456-1-chia-yu.chang@nokia-bell-labs.com>
Precedence: bulk
X-Mailing-List: linux-kselftest@vger.kernel.org
MIME-Version: 1.0

From: Chia-Yu Chang

The AccECN option may fail in various ways; handle these cases:
- Remove the option from SYN/ACK retransmits to handle blackholes
- If no option arrives in the SYN/ACK, assume the option is not usable
- If an option arrives later, re-enable it
- If the option is zeroed, disable AccECN option processing

Signed-off-by: Ilpo Järvinen
Signed-off-by: Chia-Yu Chang
---
 include/linux/tcp.h      |  6 ++--
 include/net/tcp.h        |  7 +++++
 net/ipv4/tcp.c           |  1 +
 net/ipv4/tcp_input.c     | 67 +++++++++++++++++++++++++++++++++++-----
 net/ipv4/tcp_minisocks.c | 38 +++++++++++++++++++++++
 net/ipv4/tcp_output.c    |  7 +++--
 6 files changed, 115 insertions(+), 11 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 9619524d8901..782e4dd58bf7 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -160,7 +160,8 @@ struct tcp_request_sock {
 	u8				accecn_ok  : 1,
 					syn_ect_snt: 2,
 					syn_ect_rcv: 2;
-	u8				accecn_fail_mode:4;
+	u8				accecn_fail_mode:4,
+					saw_accecn_opt  :2;
 	u32				txhash;
 	u32				rcv_isn;
 	u32				snt_isn;
@@ -391,7 +392,8 @@ struct tcp_sock {
 		syn_ect_snt:2,	/* AccECN ECT memory, only */
 		syn_ect_rcv:2,	/* ... needed durign 3WHS + first seqno */
 		wait_third_ack:1; /* Wait 3rd ACK in simultaneous open */
-	u8	accecn_fail_mode:4;	/* AccECN failure handling */
+	u8	accecn_fail_mode:4,	/* AccECN failure handling */
+		saw_accecn_opt:2;	/* An AccECN option was seen */
 	u8	thin_lto    : 1,/* Use linear timeouts for thin streams */
 		fastopen_connect:1, /* FASTOPEN_CONNECT sockopt */
 		fastopen_no_cookie:1, /* Allow send/recv SYN+data without a cookie */
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 3ee5b52441e3..0ade2873b84e 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -276,6 +276,12 @@ static inline void tcp_accecn_fail_mode_set(struct tcp_sock *tp, u8 mode)
 	tp->accecn_fail_mode |= mode;
 }

+/* tp->saw_accecn_opt states */
+#define TCP_ACCECN_OPT_NOT_SEEN		0x0
+#define TCP_ACCECN_OPT_EMPTY_SEEN	0x1
+#define TCP_ACCECN_OPT_COUNTER_SEEN	0x2
+#define TCP_ACCECN_OPT_FAIL_SEEN	0x3
+
 /* Flags in tp->nonagle */
 #define TCP_NAGLE_OFF		1	/* Nagle's algo is disabled */
 #define TCP_NAGLE_CORK		2	/* Socket is corked */
@@ -477,6 +483,7 @@ static inline int tcp_accecn_extract_syn_ect(u8 ace)
 bool tcp_accecn_validate_syn_feedback(struct sock *sk, u8 ace, u8 sent_ect);
 void tcp_accecn_third_ack(struct sock *sk, const struct sk_buff *skb,
 			  u8 syn_ect_snt);
+u8 tcp_accecn_option_init(const struct sk_buff *skb, u8 opt_offset);
 void tcp_ecn_received_counters(struct sock *sk, const struct sk_buff *skb,
 			       u32 payload_len);

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 25a986ad1c2f..8e3582c1b5bb 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3367,6 +3367,7 @@ int tcp_disconnect(struct sock *sk, int flags)
 	tp->delivered_ce = 0;
 	tp->wait_third_ack = 0;
 	tp->accecn_fail_mode = 0;
+	tp->saw_accecn_opt = TCP_ACCECN_OPT_NOT_SEEN;
 	tcp_accecn_init_counters(tp);
 	tp->prev_ecnfield = 0;
 	tp->accecn_opt_tstamp = 0;
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 1e8e49881ca4..8f1e10530880 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -446,8 +446,8 @@ bool tcp_accecn_validate_syn_feedback(struct sock *sk, u8 ace, u8 sent_ect)
 }

 /* See Table 2 of the AccECN draft */
-static void tcp_ecn_rcv_synack(struct sock *sk, const struct tcphdr *th,
-			       u8 ip_dsfield)
+static void tcp_ecn_rcv_synack(struct sock *sk, const struct sk_buff *skb,
+			       const struct tcphdr *th, u8 ip_dsfield)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	u8 ace = tcp_accecn_ace(th);
@@ -466,7 +466,19 @@ static void tcp_ecn_rcv_synack(struct sock *sk, const struct tcphdr *th,
 	default:
 		tcp_ecn_mode_set(tp, TCP_ECN_MODE_ACCECN);
 		tp->syn_ect_rcv = ip_dsfield & INET_ECN_MASK;
-		tp->accecn_opt_demand = 2;
+		if (tp->rx_opt.accecn &&
+		    tp->saw_accecn_opt < TCP_ACCECN_OPT_COUNTER_SEEN) {
+			u8 saw_opt = tcp_accecn_option_init(skb,
+							    tp->rx_opt.accecn);
+
+			tp->saw_accecn_opt = saw_opt;
+			if (tp->saw_accecn_opt == TCP_ACCECN_OPT_FAIL_SEEN) {
+				u8 fail_mode = TCP_ACCECN_OPT_FAIL_RECV;
+
+				tcp_accecn_fail_mode_set(tp, fail_mode);
+			}
+			tp->accecn_opt_demand = 2;
+		}
 		if (INET_ECN_is_ce(ip_dsfield) &&
 		    tcp_accecn_validate_syn_feedback(sk, ace,
 						     tp->syn_ect_snt)) {
@@ -586,7 +598,23 @@ static bool tcp_accecn_process_option(struct tcp_sock *tp,
 	bool order1, res;
 	unsigned int i;

+	if (tcp_accecn_opt_fail_recv(tp))
+		return false;
+
 	if (!(flag & FLAG_SLOWPATH) || !tp->rx_opt.accecn) {
+		if (!tp->saw_accecn_opt) {
+			/* Too late to enable after this point due to
+			 * potential counter wraps
+			 */
+			if (tp->bytes_sent >= (1 << 23) - 1) {
+				u8 fail_mode = TCP_ACCECN_OPT_FAIL_RECV;
+
+				tp->saw_accecn_opt = TCP_ACCECN_OPT_FAIL_SEEN;
+				tcp_accecn_fail_mode_set(tp, fail_mode);
+			}
+			return false;
+		}
+
 		if (estimate_ecnfield) {
 			u8 ecnfield = estimate_ecnfield - 1;

@@ -602,6 +630,13 @@ static bool tcp_accecn_process_option(struct tcp_sock *tp,
 	order1 = (ptr[0] == TCPOPT_ACCECN1);
 	ptr += 2;

+	if (tp->saw_accecn_opt < TCP_ACCECN_OPT_COUNTER_SEEN) {
+		tp->saw_accecn_opt = tcp_accecn_option_init(skb,
+							    tp->rx_opt.accecn);
+		if (tp->saw_accecn_opt == TCP_ACCECN_OPT_FAIL_SEEN)
+			tcp_accecn_fail_mode_set(tp, TCP_ACCECN_OPT_FAIL_RECV);
+	}
+
 	res = !!estimate_ecnfield;
 	for (i = 0; i < 3; i++) {
 		if (optlen >= TCPOLEN_ACCECN_PERFIELD) {
@@ -6457,10 +6492,25 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb,
 	 */
 	if (th->syn) {
 		if (tcp_ecn_mode_accecn(tp)) {
-			u8 opt_demand = max_t(u8, 1, tp->accecn_opt_demand);
-
 			send_accecn_reflector = true;
-			tp->accecn_opt_demand = opt_demand;
+			if (tp->rx_opt.accecn &&
+			    tp->saw_accecn_opt < TCP_ACCECN_OPT_COUNTER_SEEN) {
+				u8 offset = tp->rx_opt.accecn;
+				u8 opt_demand;
+				u8 saw_opt;
+
+				saw_opt = tcp_accecn_option_init(skb, offset);
+				tp->saw_accecn_opt = saw_opt;
+				if (tp->saw_accecn_opt ==
+				    TCP_ACCECN_OPT_FAIL_SEEN) {
+					u8 fail_mode = TCP_ACCECN_OPT_FAIL_RECV;
+
+					tcp_accecn_fail_mode_set(tp, fail_mode);
+				}
+				opt_demand = max_t(u8, 1,
+						   tp->accecn_opt_demand);
+				tp->accecn_opt_demand = opt_demand;
+			}
 		}
 		if (sk->sk_state == TCP_SYN_RECV && sk->sk_socket && th->ack &&
 		    TCP_SKB_CB(skb)->seq + 1 == TCP_SKB_CB(skb)->end_seq &&
@@ -6954,7 +7004,8 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
 		 */

 		if (tcp_ecn_mode_any(tp))
-			tcp_ecn_rcv_synack(sk, th, TCP_SKB_CB(skb)->ip_dsfield);
+			tcp_ecn_rcv_synack(sk, skb, th,
+					   TCP_SKB_CB(skb)->ip_dsfield);

 		tcp_init_wl(tp, TCP_SKB_CB(skb)->seq);
 		tcp_try_undo_spurious_syn(sk);
@@ -7531,6 +7582,8 @@ static void tcp_openreq_init(struct request_sock *req,
 	tcp_rsk(req)->snt_tsval_first = 0;
 	tcp_rsk(req)->last_oow_ack_time = 0;
 	tcp_rsk(req)->accecn_ok = 0;
+	tcp_rsk(req)->saw_accecn_opt = TCP_ACCECN_OPT_NOT_SEEN;
+	tcp_rsk(req)->accecn_fail_mode = 0;
 	tcp_rsk(req)->syn_ect_rcv = 0;
 	tcp_rsk(req)->syn_ect_snt = 0;
 	req->mss = rx_opt->mss_clamp;
diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index e0f2bd2cee9e..8bb4953fc8bd 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -501,6 +501,7 @@ static void tcp_ecn_openreq_child(struct sock *sk,
 		tcp_ecn_mode_set(tp, TCP_ECN_MODE_ACCECN);
 		tp->syn_ect_snt = treq->syn_ect_snt;
 		tcp_accecn_third_ack(sk, skb, treq->syn_ect_snt);
+		tp->saw_accecn_opt = treq->saw_accecn_opt;
 		tp->prev_ecnfield = treq->syn_ect_rcv;
 		tp->accecn_opt_demand = 1;
 		tcp_ecn_received_counters(sk, skb, skb->len - th->doff * 4);
@@ -555,6 +556,30 @@ static void smc_check_reset_syn_req(const struct tcp_sock *oldtp,
 #endif
 }

+u8 tcp_accecn_option_init(const struct sk_buff *skb, u8 opt_offset)
+{
+	unsigned char *ptr = skb_transport_header(skb) + opt_offset;
+	unsigned int optlen = ptr[1] - 2;
+
+	WARN_ON_ONCE(ptr[0] != TCPOPT_ACCECN0 && ptr[0] != TCPOPT_ACCECN1);
+	ptr += 2;
+
+	/* Detect option zeroing: an AccECN connection "MAY check that the
+	 * initial value of the EE0B field or the EE1B field is non-zero"
+	 */
+	if (optlen < TCPOLEN_ACCECN_PERFIELD)
+		return TCP_ACCECN_OPT_EMPTY_SEEN;
+	if (get_unaligned_be24(ptr) == 0)
+		return TCP_ACCECN_OPT_FAIL_SEEN;
+	if (optlen < TCPOLEN_ACCECN_PERFIELD * 3)
+		return TCP_ACCECN_OPT_COUNTER_SEEN;
+	ptr += TCPOLEN_ACCECN_PERFIELD * 2;
+	if (get_unaligned_be24(ptr) == 0)
+		return TCP_ACCECN_OPT_FAIL_SEEN;
+
+	return TCP_ACCECN_OPT_COUNTER_SEEN;
+}
+
 /* This is not only more efficient than what we used to do, it eliminates
  * a lot of code duplication between IPv4/IPv6 SYN recv processing. -DaveM
  *
@@ -716,6 +741,7 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
 	bool own_req;

 	tmp_opt.saw_tstamp = 0;
+	tmp_opt.accecn = 0;
 	if (th->doff > (sizeof(struct tcphdr)>>2)) {
 		tcp_parse_options(sock_net(sk), skb, &tmp_opt, 0, NULL);

@@ -893,6 +919,18 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
 	if (!(flg & TCP_FLAG_ACK))
 		return NULL;

+	if (tcp_rsk(req)->accecn_ok && tmp_opt.accecn &&
+	    tcp_rsk(req)->saw_accecn_opt < TCP_ACCECN_OPT_COUNTER_SEEN) {
+		u8 saw_opt = tcp_accecn_option_init(skb, tmp_opt.accecn);
+
+		tcp_rsk(req)->saw_accecn_opt = saw_opt;
+		if (tcp_rsk(req)->saw_accecn_opt == TCP_ACCECN_OPT_FAIL_SEEN) {
+			u8 fail_mode = TCP_ACCECN_OPT_FAIL_RECV;
+
+			tcp_rsk(req)->accecn_fail_mode |= fail_mode;
+		}
+	}
+
 	/* For Fast Open no more processing is needed (sk is the
 	 * child socket).
 	 */
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index a76061dc4e5f..8e1535635aab 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1085,6 +1085,7 @@ static unsigned int tcp_syn_options(struct sock *sk, struct sk_buff *skb,
 	/* Simultaneous open SYN/ACK needs AccECN option but not SYN */
 	if (unlikely((TCP_SKB_CB(skb)->tcp_flags & TCPHDR_ACK) &&
 		     tcp_ecn_mode_accecn(tp) &&
+		     inet_csk(sk)->icsk_retransmits < 2 &&
 		     sock_net(sk)->ipv4.sysctl_tcp_ecn_option &&
 		     remaining >= TCPOLEN_ACCECN_BASE)) {
 		u32 saving = tcp_synack_options_combine_saving(opts);
@@ -1174,7 +1175,7 @@ static unsigned int tcp_synack_options(const struct sock *sk,
 	smc_set_option_cond(tcp_sk(sk), ireq, opts, &remaining);

 	if (treq->accecn_ok && sock_net(sk)->ipv4.sysctl_tcp_ecn_option &&
-	    remaining >= TCPOLEN_ACCECN_BASE) {
+	    req->num_timeout < 1 && remaining >= TCPOLEN_ACCECN_BASE) {
 		u32 saving = tcp_synack_options_combine_saving(opts);

 		opts->ecn_bytes = synack_ecn_bytes;
@@ -1252,7 +1253,9 @@ static unsigned int tcp_established_options(struct sock *sk, struct sk_buff *skb
 	}

 	if (tcp_ecn_mode_accecn(tp) &&
-	    sock_net(sk)->ipv4.sysctl_tcp_ecn_option) {
+	    sock_net(sk)->ipv4.sysctl_tcp_ecn_option &&
+	    tp->saw_accecn_opt &&
+	    !tcp_accecn_opt_fail_send(tp)) {
 		if (sock_net(sk)->ipv4.sysctl_tcp_ecn_option >= 2 ||
 		    tp->accecn_opt_demand ||
 		    tcp_accecn_option_beacon_check(sk)) {

From patchwork Mon Apr 14 13:13:13 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Chia-Yu Chang \(Nokia\)"
X-Patchwork-Id: 881140
From: chia-yu.chang@nokia-bell-labs.com
To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com, jhs@mojatatu.com, kuba@kernel.org, stephen@networkplumber.org, xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net, edumazet@google.com, horms@kernel.org, andrew+netdev@lunn.ch, donald.hunter@gmail.com, ast@fiberby.net, liuhangbin@gmail.com, shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org, ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com, g.white@cablelabs.com, ingemar.s.johansson@ericsson.com, mirja.kuehlewind@ericsson.com, cheshire@apple.com, rs.ietf@gmx.at, Jason_Livingood@comcast.com, vidhi_goel@apple.com
Cc: Chia-Yu Chang
Subject: [PATCH v3 net-next 13/15] tcp: accecn: AccECN ACE field multi-wrap heuristic
Date: Mon, 14 Apr 2025 15:13:13 +0200
Message-Id: <20250414131315.97456-14-chia-yu.chang@nokia-bell-labs.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250414131315.97456-1-chia-yu.chang@nokia-bell-labs.com>
References: <20250414131315.97456-1-chia-yu.chang@nokia-bell-labs.com>
Precedence: bulk
X-Mailing-List: linux-kselftest@vger.kernel.org
MIME-Version: 1.0

From: Ilpo Järvinen

Armed with the ceb delta from the option, the delivered bytes, and the
delivered packets, it is possible to estimate how many times the ACE
field wrapped. This calculation is necessary only if more than one wrap
is possible. Without SACK, delivered bytes and packets are not always
trustworthy, in which case TCP falls back to the simpler no-or-all
wraps ceb algorithm.

Signed-off-by: Ilpo Järvinen
Signed-off-by: Chia-Yu Chang
---
 net/ipv4/tcp_input.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 54f798161d14..c6dac3c2d47a 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -733,6 +733,24 @@ static u32 __tcp_accecn_process(struct sock *sk, const struct sk_buff *skb,
 	d_ceb = tp->delivered_ecn_bytes[INET_ECN_CE - 1] - old_ceb;
 	if (!d_ceb)
 		return delta;
+
+	if ((delivered_pkts >= (TCP_ACCECN_CEP_ACE_MASK + 1) * 2) &&
+	    (tcp_is_sack(tp) ||
+	     ((1 << inet_csk(sk)->icsk_ca_state) &
+	      (TCPF_CA_Open | TCPF_CA_CWR)))) {
+		u32 est_d_cep;
+
+		if (delivered_bytes <= d_ceb)
+			return safe_delta;
+
+		est_d_cep = DIV_ROUND_UP_ULL((u64)d_ceb *
+					     delivered_pkts,
+					     delivered_bytes);
+		return min(safe_delta,
+			   delta +
+			   (est_d_cep & ~TCP_ACCECN_CEP_ACE_MASK));
+	}
+
 	if (d_ceb > delta * tp->mss_cache)
 		return safe_delta;
 	if (d_ceb <

From patchwork Mon Apr 14 13:13:15 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: "Chia-Yu Chang \(Nokia\)"
X-Patchwork-Id: 881139
	(mail-vi1eur03on2053.outbound.protection.outlook.com [40.107.103.53])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id A278F2D4B44;
	Mon, 14 Apr 2025 13:14:28 +0000 (UTC)
From: chia-yu.chang@nokia-bell-labs.com
To: netdev@vger.kernel.org, dave.taht@gmail.com, pabeni@redhat.com,
	jhs@mojatatu.com, kuba@kernel.org, stephen@networkplumber.org,
	xiyou.wangcong@gmail.com, jiri@resnulli.us, davem@davemloft.net,
	edumazet@google.com, horms@kernel.org, andrew+netdev@lunn.ch,
	donald.hunter@gmail.com, ast@fiberby.net, liuhangbin@gmail.com,
	shuah@kernel.org, linux-kselftest@vger.kernel.org, ij@kernel.org,
	ncardwell@google.com, koen.de_schepper@nokia-bell-labs.com,
	g.white@cablelabs.com, ingemar.s.johansson@ericsson.com,
	mirja.kuehlewind@ericsson.com, cheshire@apple.com, rs.ietf@gmx.at,
	Jason_Livingood@comcast.com, vidhi_goel@apple.com
Cc: Chia-Yu Chang
Subject: [PATCH v3 net-next 15/15] tcp: try to avoid safer when ACKs are thinned
Date: Mon, 14 Apr 2025 15:13:15 +0200
Message-Id: <20250414131315.97456-16-chia-yu.chang@nokia-bell-labs.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20250414131315.97456-1-chia-yu.chang@nokia-bell-labs.com>
References: <20250414131315.97456-1-chia-yu.chang@nokia-bell-labs.com>
Precedence: bulk
X-Mailing-List: linux-kselftest@vger.kernel.org
MIME-Version: 1.0

From: Ilpo Järvinen

Add an EWMA of newly acked packets. When ACK thinning occurs, use it to
select between the safer and the unsafe cep delta in AccECN processing.
If the number of packets ACKed per ACK tends to be large, don't
conservatively assume ACE field overflow.

Signed-off-by: Ilpo Järvinen
Signed-off-by: Chia-Yu Chang
---
 include/linux/tcp.h  |  1 +
 net/ipv4/tcp.c       |  4 +++-
 net/ipv4/tcp_input.c | 20 +++++++++++++++++++-
 3 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 782e4dd58bf7..230f55b22a51 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -313,6 +313,7 @@ struct tcp_sock {
 		prev_ecnfield:2,/* ECN bits from the previous segment */
 		accecn_opt_demand:2,/* Demand AccECN option for n next ACKs */
 		est_ecnfield:2;/* ECN field for AccECN delivered estimates */
+	u16	pkts_acked_ewma;/* Pkts acked EWMA for AccECN cep heuristic */
 	u64	accecn_opt_tstamp;	/* Last AccECN option sent timestamp */
 	u32	app_limited;	/* limited until "delivered" reaches this val */
 	u32	rcv_wnd;	/* Current receiver window */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 8e3582c1b5bb..673224273540 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3371,6 +3371,7 @@ int tcp_disconnect(struct sock *sk, int flags)
 	tcp_accecn_init_counters(tp);
 	tp->prev_ecnfield = 0;
 	tp->accecn_opt_tstamp = 0;
+	tp->pkts_acked_ewma = 0;
 	if (icsk->icsk_ca_initialized && icsk->icsk_ca_ops->release)
 		icsk->icsk_ca_ops->release(sk);
 	memset(icsk->icsk_ca_priv, 0, sizeof(icsk->icsk_ca_priv));
@@ -5109,6 +5110,7 @@ static void __init tcp_struct_check(void)
 	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, delivered_ecn_bytes);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, received_ce);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, received_ecn_bytes);
+	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, pkts_acked_ewma);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, accecn_opt_tstamp);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, app_limited);
 	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_txrx, rcv_wnd);
@@ -5117,7 +5119,7 @@ static void __init tcp_struct_check(void)
 	/* 32bit arches with 8byte alignment on u64 fields might need padding
 	 * before tcp_clock_cache.
 	 */
-	CACHELINE_ASSERT_GROUP_SIZE(struct tcp_sock, tcp_sock_write_txrx, 130 + 6);
+	CACHELINE_ASSERT_GROUP_SIZE(struct tcp_sock, tcp_sock_write_txrx, 132 + 4);
 
 	/* RX read-write hotpath cache lines */
 	CACHELINE_ASSERT_GROUP_MEMBER(struct tcp_sock, tcp_sock_write_rx, bytes_received);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index c6dac3c2d47a..5bdd82d3c201 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -689,6 +689,10 @@ static void tcp_count_delivered(struct tcp_sock *tp, u32 delivered,
 		tcp_count_delivered_ce(tp, delivered);
 }
 
+#define PKTS_ACKED_WEIGHT	6
+#define PKTS_ACKED_PREC		6
+#define ACK_COMP_THRESH		4
+
 /* Returns the ECN CE delta */
 static u32 __tcp_accecn_process(struct sock *sk, const struct sk_buff *skb,
 				u32 delivered_pkts, u32 delivered_bytes,
@@ -708,6 +712,19 @@ static u32 __tcp_accecn_process(struct sock *sk, const struct sk_buff *skb,
 	opt_deltas_valid = tcp_accecn_process_option(tp, skb,
 						     delivered_bytes, flag);
 
+	if (delivered_pkts) {
+		if (!tp->pkts_acked_ewma) {
+			tp->pkts_acked_ewma = delivered_pkts << PKTS_ACKED_PREC;
+		} else {
+			u32 ewma = tp->pkts_acked_ewma;
+
+			ewma = (((ewma << PKTS_ACKED_WEIGHT) - ewma) +
+				(delivered_pkts << PKTS_ACKED_PREC)) >>
+				PKTS_ACKED_WEIGHT;
+			tp->pkts_acked_ewma = min_t(u32, ewma, 0xFFFFU);
+		}
+	}
+
 	if (!(flag & FLAG_SLOWPATH)) {
 		/* AccECN counter might overflow on large ACKs */
 		if (delivered_pkts <= TCP_ACCECN_CEP_ACE_MASK)
@@ -756,7 +773,8 @@ static u32 __tcp_accecn_process(struct sock *sk, const struct sk_buff *skb,
 		if (d_ceb <
 		    safe_delta * tp->mss_cache >> TCP_ACCECN_SAFETY_SHIFT)
 			return delta;
-	}
+	} else if (tp->pkts_acked_ewma > (ACK_COMP_THRESH << PKTS_ACKED_PREC))
+		return delta;
 
 	return safe_delta;
 }
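
For readers who want to check the arithmetic outside the kernel, the two
heuristics touched by the diffs above (the byte-ratio wrap estimate and
the packets-per-ACK EWMA) can be sketched in userspace C roughly as
follows. This is only an illustration under stated assumptions, not the
kernel code: ACE_MASK = 0x7 assumes a 3-bit ACE field, the function names
est_cep_delta(), pkts_acked_ewma_update() and acks_usually_large() are
hypothetical, and, unlike the diff, the sketch also clamps the first EWMA
sample to 16 bits.

```c
#include <stdint.h>

#define ACE_MASK          0x7U	/* assumption: 3-bit ACE field */
#define PKTS_ACKED_WEIGHT 6	/* EWMA gain is 1 / 2^6 */
#define PKTS_ACKED_PREC   6	/* fixed-point fraction bits */
#define ACK_COMP_THRESH   4	/* packets per ACK treated as "large" */

/* Estimate the CE packet delta from the CE byte delta: scale delivered
 * packets by the CE share of delivered bytes (rounded up), keep only
 * whole ACE wraps from that estimate, and never exceed safe_delta.
 */
static uint32_t est_cep_delta(uint32_t delta, uint32_t safe_delta,
			      uint32_t d_ceb, uint32_t delivered_pkts,
			      uint32_t delivered_bytes)
{
	uint64_t est;
	uint32_t cand;

	/* Byte counts that make no sense: fall back to the safe value. */
	if (delivered_bytes <= d_ceb)
		return safe_delta;

	est = ((uint64_t)d_ceb * delivered_pkts + delivered_bytes - 1) /
	      delivered_bytes;
	cand = delta + ((uint32_t)est & ~ACE_MASK);
	return cand < safe_delta ? cand : safe_delta;
}

/* Fixed-point EWMA of newly acked packets per ACK, stored in 16 bits. */
static uint16_t pkts_acked_ewma_update(uint16_t old, uint32_t delivered_pkts)
{
	uint32_t ewma;

	if (!delivered_pkts)
		return old;	/* ACK delivered nothing: keep the average */

	if (!old)
		ewma = delivered_pkts << PKTS_ACKED_PREC;	/* first sample */
	else
		ewma = ((((uint32_t)old << PKTS_ACKED_WEIGHT) - old) +
			(delivered_pkts << PKTS_ACKED_PREC)) >>
		       PKTS_ACKED_WEIGHT;

	return ewma > 0xFFFFU ? 0xFFFFU : (uint16_t)ewma;
}

/* True when the average exceeds ACK_COMP_THRESH packets per ACK. */
static int acks_usually_large(uint16_t ewma)
{
	return ewma > (ACK_COMP_THRESH << PKTS_ACKED_PREC);
}
```

With delta = 2, safe_delta = 18, 20 packets and 20000 bytes delivered, of
which 10000 bytes are CE-marked, the estimate is 10 CE packets; one whole
wrap (8) is added to delta, giving 10 instead of the conservative 18.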