From patchwork Fri May 8 17:46:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 219602 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.4 required=3.0 tests=DKIMWL_WL_MED, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7D319C38A2A for ; Fri, 8 May 2020 17:46:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4D0B52063A for ; Fri, 8 May 2020 17:46:18 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="A4p0wfg0" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727772AbgEHRqR (ORCPT ); Fri, 8 May 2020 13:46:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50432 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-FAIL-OK-FAIL) by vger.kernel.org with ESMTP id S1726746AbgEHRqQ (ORCPT ); Fri, 8 May 2020 13:46:16 -0400 Received: from mail-qt1-x84a.google.com (mail-qt1-x84a.google.com [IPv6:2607:f8b0:4864:20::84a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6F1A8C05BD43 for ; Fri, 8 May 2020 10:46:16 -0700 (PDT) Received: by mail-qt1-x84a.google.com with SMTP id g55so2739598qtk.14 for ; Fri, 08 May 2020 10:46:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=UqpfYUZiz2CYMv/nVzIiUXbRSrzC4CcACp4sK8TRSAU=; b=A4p0wfg0IhgheMp7WR/OQnHxXdgubuGhFI2Y5Vy50ajaPDddfNOgFet2K0o6uegC6x pmvlOKf+hcUWttXUjdPKeywlO8avZAsYOU2snDA7vXQoEv+4jM9puRvjNvmlNdZovQiq o5U4+Aw6nhk3qWCPMaEWS3H+btVXsxX1gHBSOrhriFd+5oywOSwLSRDv3VUC85Nezdeq QKvZW+BlIYP0fe6ook8HKEVJLsLwjzEYUw3hZm521uuTQ41+v20RBZ1SWREGhyfIHfsO v2IFljWWbJkLEON4WnqkLX5K1hgJKs0rYEgtuRgshlM+LqSPblZeWP2WaVVbefN8QDAZ jbRA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=UqpfYUZiz2CYMv/nVzIiUXbRSrzC4CcACp4sK8TRSAU=; b=ixB1dFyploH4KIoCjD6yh3M0vnLyzpDgWYq2x0iE1qJigiWv+IiQxRhQ3PA34GdMAl bxYrW2uNCp6M3LSX188PhWckXPbQBdeMfRLoYQYsqo3iJvO+MVw6Dsl9RgOq67KR3TDX WNmsLDFBUC1Rff+sKLJLEQS58ilFKGjQd2AOFiOod8ofl2jSym65VX3swFGMZqi6P4IA CxMj7BoyLLKmUdXxVakBmt7KZqF6uVLKsyUUZ3seOCMTOF+DkraiMCNnYwa+GLb9UA5c Q2TGCSt9kzoxsJn1vPkojCce8KclGAc6ywlYhh65NcHg+eaqZd4FSRLZDkyVvlIc3bp0 zGqQ== X-Gm-Message-State: AGi0PubQ/KOzwoJYHMzTnLj+7cAPl9aAyveqO2XlZKeOWWTZ371BtpT5 8Xd+h12H3cx++x+Q0vXdphdvLCwzXVe0Z3RFFYvH2/8OkasMez5qwm+IlpRsQLvxG9ettpigDlg s/MHaGxfpFxpP6CT11HwbFrjS4d3SJOUt9Ax+zh0/X5BADtU9CbSLAg== X-Google-Smtp-Source: APiQypKZ8LsvOa6NQbAR7GWo4M5qmASiPs+SICNao6LiQGN8AztsfPRtFHcOLeicQAOYm4iX5qybIvc= X-Received: by 2002:ad4:4c4d:: with SMTP id cs13mr3852416qvb.207.1588959975398; Fri, 08 May 2020 10:46:15 -0700 (PDT) Date: Fri, 8 May 2020 10:46:08 -0700 In-Reply-To: <20200508174611.228805-1-sdf@google.com> Message-Id: <20200508174611.228805-2-sdf@google.com> Mime-Version: 1.0 References: <20200508174611.228805-1-sdf@google.com> X-Mailer: git-send-email 2.26.2.645.ge9eca65c58-goog Subject: [PATCH bpf-next v5 1/4] selftests/bpf: generalize helpers to control background listener From: Stanislav Fomichev To: netdev@vger.kernel.org, bpf@vger.kernel.org Cc: davem@davemloft.net, ast@kernel.org, daniel@iogearbox.net, Stanislav Fomichev , Andrey Ignatov , Martin KaFai Lau , Jakub Sitnicki Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Move the following routines that let us start a background listener thread and connect to a server by fd to the test_prog: * start_server - socket+bind+listen * connect_to_fd - connect to the server identified by fd These will be used in the next commit. Also, extend these helpers to support AF_INET6 and accept the family as an argument. v5: * drop pthread.h (Martin KaFai Lau) * add SO_SNDTIMEO (Martin KaFai Lau) v4: * export extra helper to start server without a thread (Martin KaFai Lau) * tcp_rtt is no longer starting background thread (Martin KaFai Lau) v2: * put helpers into network_helpers.c (Andrii Nakryiko) Acked-by: Andrey Ignatov Acked-by: Martin KaFai Lau Cc: Jakub Sitnicki Signed-off-by: Stanislav Fomichev --- tools/testing/selftests/bpf/Makefile | 2 +- tools/testing/selftests/bpf/network_helpers.c | 93 ++++++++++++++ tools/testing/selftests/bpf/network_helpers.h | 10 ++ .../selftests/bpf/prog_tests/tcp_rtt.c | 116 +----------------- 4 files changed, 108 insertions(+), 113 deletions(-) create mode 100644 tools/testing/selftests/bpf/network_helpers.c create mode 100644 tools/testing/selftests/bpf/network_helpers.h diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 3d942be23d09..8f25966b500b 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -354,7 +354,7 @@ endef TRUNNER_TESTS_DIR := prog_tests TRUNNER_BPF_PROGS_DIR := progs TRUNNER_EXTRA_SOURCES := test_progs.c cgroup_helpers.c trace_helpers.c \ - flow_dissector_load.h + network_helpers.c flow_dissector_load.h TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read \ $(wildcard progs/btf_dump_test_case_*.c) TRUNNER_BPF_BUILD_RULE := CLANG_BPF_BUILD_RULE diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c new file mode 100644 index 000000000000..0073dddb72fd --- /dev/null +++ b/tools/testing/selftests/bpf/network_helpers.c @@ -0,0 +1,93 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include +#include +#include +#include +#include +#include +#include +#include + +#include "network_helpers.h" + +#define clean_errno() (errno == 0 ? "None" : strerror(errno)) +#define log_err(MSG, ...) fprintf(stderr, "(%s:%d: errno: %s) " MSG "\n", \ + __FILE__, __LINE__, clean_errno(), ##__VA_ARGS__) + +int start_server(int family, int type) +{ + struct sockaddr_storage addr = {}; + socklen_t len; + int fd; + + if (family == AF_INET) { + struct sockaddr_in *sin = (void *)&addr; + + sin->sin_family = AF_INET; + len = sizeof(*sin); + } else { + struct sockaddr_in6 *sin6 = (void *)&addr; + + sin6->sin6_family = AF_INET6; + len = sizeof(*sin6); + } + + fd = socket(family, type | SOCK_NONBLOCK, 0); + if (fd < 0) { + log_err("Failed to create server socket"); + return -1; + } + + if (bind(fd, (const struct sockaddr *)&addr, len) < 0) { + log_err("Failed to bind socket"); + close(fd); + return -1; + } + + if (type == SOCK_STREAM) { + if (listen(fd, 1) < 0) { + log_err("Failed to listed on socket"); + close(fd); + return -1; + } + } + + return fd; +} + +static const struct timeval timeo_sec = { .tv_sec = 3 }; +static const size_t timeo_optlen = sizeof(timeo_sec); + +int connect_to_fd(int family, int type, int server_fd) +{ + struct sockaddr_storage addr; + socklen_t len = sizeof(addr); + int fd; + + fd = socket(family, type, 0); + if (fd < 0) { + log_err("Failed to create client socket"); + return -1; + } + + if (setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &timeo_sec, timeo_optlen)) { + log_err("Failed to set SO_RCVTIMEO"); + goto out; + } + + if (getsockname(server_fd, (struct sockaddr *)&addr, &len)) { + log_err("Failed to get server addr"); + goto out; + } + + if (connect(fd, (const struct sockaddr *)&addr, len) < 0) { + log_err("Fail to connect to server with family %d", family); + goto out; + } + + return fd; + +out: + close(fd); + return -1; +} diff --git a/tools/testing/selftests/bpf/network_helpers.h b/tools/testing/selftests/bpf/network_helpers.h new file mode 100644 index 000000000000..30068eacc1a2 --- /dev/null +++ b/tools/testing/selftests/bpf/network_helpers.h @@ -0,0 +1,10 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef __NETWORK_HELPERS_H +#define __NETWORK_HELPERS_H +#include +#include + +int start_server(int family, int type); +int connect_to_fd(int family, int type, int server_fd); + +#endif diff --git a/tools/testing/selftests/bpf/prog_tests/tcp_rtt.c b/tools/testing/selftests/bpf/prog_tests/tcp_rtt.c index e56b52ab41da..9013a0c01eed 100644 --- a/tools/testing/selftests/bpf/prog_tests/tcp_rtt.c +++ b/tools/testing/selftests/bpf/prog_tests/tcp_rtt.c @@ -1,6 +1,7 @@ // SPDX-License-Identifier: GPL-2.0 #include #include "cgroup_helpers.h" +#include "network_helpers.h" struct tcp_rtt_storage { __u32 invoked; @@ -87,34 +88,6 @@ static int verify_sk(int map_fd, int client_fd, const char *msg, __u32 invoked, return err; } -static int connect_to_server(int server_fd) -{ - struct sockaddr_storage addr; - socklen_t len = sizeof(addr); - int fd; - - fd = socket(AF_INET, SOCK_STREAM, 0); - if (fd < 0) { - log_err("Failed to create client socket"); - return -1; - } - - if (getsockname(server_fd, (struct sockaddr *)&addr, &len)) { - log_err("Failed to get server addr"); - goto out; - } - - if (connect(fd, (const struct sockaddr *)&addr, len) < 0) { - log_err("Fail to connect to server"); - goto out; - } - - return fd; - -out: - close(fd); - return -1; -} static int run_test(int cgroup_fd, int server_fd) { @@ -145,7 +118,7 @@ static int run_test(int cgroup_fd, int server_fd) goto close_bpf_object; } - client_fd = connect_to_server(server_fd); + client_fd = connect_to_fd(AF_INET, SOCK_STREAM, server_fd); if (client_fd < 0) { err = -1; goto close_bpf_object; @@ -180,103 +153,22 @@ static int run_test(int cgroup_fd, int server_fd) return err; } -static int start_server(void) -{ - struct sockaddr_in addr = { - .sin_family = AF_INET, - .sin_addr.s_addr = htonl(INADDR_LOOPBACK), - }; - int fd; - - fd = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0); - if (fd < 0) { - log_err("Failed to create server socket"); - return -1; - } - - if (bind(fd, (const struct sockaddr *)&addr, sizeof(addr)) < 0) { - log_err("Failed to bind socket"); - close(fd); - return -1; - } - - return fd; -} - -static pthread_mutex_t server_started_mtx = PTHREAD_MUTEX_INITIALIZER; -static pthread_cond_t server_started = PTHREAD_COND_INITIALIZER; -static volatile bool server_done = false; - -static void *server_thread(void *arg) -{ - struct sockaddr_storage addr; - socklen_t len = sizeof(addr); - int fd = *(int *)arg; - int client_fd; - int err; - - err = listen(fd, 1); - - pthread_mutex_lock(&server_started_mtx); - pthread_cond_signal(&server_started); - pthread_mutex_unlock(&server_started_mtx); - - if (CHECK_FAIL(err < 0)) { - perror("Failed to listed on socket"); - return ERR_PTR(err); - } - - while (true) { - client_fd = accept(fd, (struct sockaddr *)&addr, &len); - if (client_fd == -1 && errno == EAGAIN) { - usleep(50); - continue; - } - break; - } - if (CHECK_FAIL(client_fd < 0)) { - perror("Failed to accept client"); - return ERR_PTR(err); - } - - while (!server_done) - usleep(50); - - close(client_fd); - - return NULL; -} - void test_tcp_rtt(void) { int server_fd, cgroup_fd; - pthread_t tid; - void *server_res; cgroup_fd = test__join_cgroup("/tcp_rtt"); if (CHECK_FAIL(cgroup_fd < 0)) return; - server_fd = start_server(); + server_fd = start_server(AF_INET, SOCK_STREAM); if (CHECK_FAIL(server_fd < 0)) goto close_cgroup_fd; - if (CHECK_FAIL(pthread_create(&tid, NULL, server_thread, - (void *)&server_fd))) - goto close_server_fd; - - pthread_mutex_lock(&server_started_mtx); - pthread_cond_wait(&server_started, &server_started_mtx); - pthread_mutex_unlock(&server_started_mtx); - CHECK_FAIL(run_test(cgroup_fd, server_fd)); - server_done = true; - CHECK_FAIL(pthread_join(tid, &server_res)); - CHECK_FAIL(IS_ERR(server_res)); - -close_server_fd: close(server_fd); + close_cgroup_fd: close(cgroup_fd); } From patchwork Fri May 8 17:46:11 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stanislav Fomichev X-Patchwork-Id: 219601 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-17.4 required=3.0 tests=DKIMWL_WL_MED, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 62E22C47257 for ; Fri, 8 May 2020 17:46:24 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3AE522063A for ; Fri, 8 May 2020 17:46:24 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="XweMujkC" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727857AbgEHRqX (ORCPT ); Fri, 8 May 2020 13:46:23 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50460 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-FAIL-OK-FAIL) by vger.kernel.org with ESMTP id S1727835AbgEHRqW (ORCPT ); Fri, 8 May 2020 13:46:22 -0400 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DAE84C05BD43 for ; Fri, 8 May 2020 10:46:21 -0700 (PDT) Received: by mail-yb1-xb4a.google.com with SMTP id x13so2910672ybg.23 for ; Fri, 08 May 2020 10:46:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=2fjtxto67J12FTXb5mOaBLRD24qL9TShTZ2jVbmQo1k=; b=XweMujkCC07i8rME8+2dz55Jnx0/csxjgr8AaGuquSnswYziLVW3arDLhQoLFDGyq/ DDqwxkexFztwQHDij8FEEi/DJbD/KmxOwhOzNoWeFJi5NvRJEc2dz9+dfDJehPnYJ3dD ffgBfHM+Kvv9uVYftfSkc8RJMQkLC8U8mj0No/H6oxj05VUi0hRTdFJKaEcG4m8fz9Pv Rgu2V0IoOGhYYRfwYnpkvfxQHP/H7g/aK8L1AFfEhXVgkx2ZEmCsIS/rEpU8ej6AeaSv G0A3qLLmF1c3UZ/8NMFdjZkzqb2QwMwCcNnWd1ypLg2RumRp1ZfDO55l0KX7qMVXqLd7 OAhw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=2fjtxto67J12FTXb5mOaBLRD24qL9TShTZ2jVbmQo1k=; b=R3f/N0tBAKPvNua7QJI0mADrAKRbB084r7wl7kr3U5mYAiEuENn48DhnOl13XLBh+7 jfzGpC57yY2Sr/xerEUXcMdFOBY2mLYJLMBgvJWby7tCIAOiwdQ2FkoUsVM+o/d8Tbj6 lAJB3JgvkWzvr7+1aWJ5vaIzH/4jt2BBYVhOnjlq3qZnOM9hhuZjhnjz+lPw2pEDQdiK rMOEl2LWG6QJNoDL7XPnXFNXzABQbkmx4vgH9a0UJhn+Z3kFAP1S+q6ghyNxRih0hcNp AYM/3XKungujGgHx4NGQezud40IJnFCbTNfi6wkkm8Za2A4lDm1ubfogEctgKPlRppIE WZLQ== X-Gm-Message-State: AGi0Pua3ulcOWTMi1ZnkIB7ILRO13cxfq/6qXTagc8v+Db1vz6+wBPf4 e0NvDhvM6b7ROo68I1aqqFxnc1GPhhTCmXpdT+fbIiIN/QSHqxc1jq0UjqxHM+e6VOlkHgPRSoj vogbN9iO4HrWj2lthFjlDTdR40lkMiGsplTYLg2+KKtvHhYY5VfI0Sg== X-Google-Smtp-Source: APiQypKHs05nPrG5xDS7I0N2ivMnOPhWWZOIIyeknI001yaibB083j8YfL2AYvLgMKN8RUR+6nu/p3s= X-Received: by 2002:a25:b7d3:: with SMTP id u19mr6451534ybj.434.1588959980999; Fri, 08 May 2020 10:46:20 -0700 (PDT) Date: Fri, 8 May 2020 10:46:11 -0700 In-Reply-To: <20200508174611.228805-1-sdf@google.com> Message-Id: <20200508174611.228805-5-sdf@google.com> Mime-Version: 1.0 References: <20200508174611.228805-1-sdf@google.com> X-Mailer: git-send-email 2.26.2.645.ge9eca65c58-goog Subject: [PATCH bpf-next v5 4/4] bpf: allow any port in bpf_bind helper From: Stanislav Fomichev To: netdev@vger.kernel.org, bpf@vger.kernel.org Cc: davem@davemloft.net, ast@kernel.org, daniel@iogearbox.net, Stanislav Fomichev , Andrey Ignatov , Martin KaFai Lau Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org We want to have a tighter control on what ports we bind to in the BPF_CGROUP_INET{4,6}_CONNECT hooks even if it means connect() becomes slightly more expensive. The expensive part comes from the fact that we now need to call inet_csk_get_port() that verifies that the port is not used and allocates an entry in the hash table for it. Since we can't rely on "snum || !bind_address_no_port" to prevent us from calling POST_BIND hook anymore, let's add another bind flag to indicate that the call site is BPF program. v5: * fix wrong AF_INET (should be AF_INET6) in the bpf program for v6 v3: * More bpf_bind documentation refinements (Martin KaFai Lau) * Add UDP tests as well (Martin KaFai Lau) * Don't start the thread, just do socket+bind+listen (Martin KaFai Lau) v2: * Update documentation (Andrey Ignatov) * Pass BIND_FORCE_ADDRESS_NO_PORT conditionally (Andrey Ignatov) Acked-by: Andrey Ignatov Acked-by: Martin KaFai Lau Signed-off-by: Stanislav Fomichev --- include/net/inet_common.h | 2 + include/uapi/linux/bpf.h | 9 +- net/core/filter.c | 18 ++- net/ipv4/af_inet.c | 10 +- net/ipv6/af_inet6.c | 12 +- tools/include/uapi/linux/bpf.h | 9 +- .../bpf/prog_tests/connect_force_port.c | 115 ++++++++++++++++++ .../selftests/bpf/progs/connect_force_port4.c | 28 +++++ .../selftests/bpf/progs/connect_force_port6.c | 28 +++++ 9 files changed, 203 insertions(+), 28 deletions(-) create mode 100644 tools/testing/selftests/bpf/prog_tests/connect_force_port.c create mode 100644 tools/testing/selftests/bpf/progs/connect_force_port4.c create mode 100644 tools/testing/selftests/bpf/progs/connect_force_port6.c diff --git a/include/net/inet_common.h b/include/net/inet_common.h index c38f4f7d660a..cb2818862919 100644 --- a/include/net/inet_common.h +++ b/include/net/inet_common.h @@ -39,6 +39,8 @@ int inet_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len); #define BIND_FORCE_ADDRESS_NO_PORT (1 << 0) /* Grab and release socket lock. */ #define BIND_WITH_LOCK (1 << 1) +/* Called from BPF program. */ +#define BIND_FROM_BPF (1 << 2) int __inet_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len, u32 flags); int inet_getname(struct socket *sock, struct sockaddr *uaddr, diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index b3643e27e264..6e5e7caa3739 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -1994,10 +1994,11 @@ union bpf_attr { * * This helper works for IPv4 and IPv6, TCP and UDP sockets. The * domain (*addr*\ **->sa_family**) must be **AF_INET** (or - * **AF_INET6**). Looking for a free port to bind to can be - * expensive, therefore binding to port is not permitted by the - * helper: *addr*\ **->sin_port** (or **sin6_port**, respectively) - * must be set to zero. + * **AF_INET6**). It's advised to pass zero port (**sin_port** + * or **sin6_port**) which triggers IP_BIND_ADDRESS_NO_PORT-like + * behavior and lets the kernel efficiently pick up an unused + * port as long as 4-tuple is unique. Passing non-zero port might + * lead to degraded performance. * Return * 0 on success, or a negative error in case of failure. * diff --git a/net/core/filter.c b/net/core/filter.c index fa9ddab5dd1f..da0634979f53 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -4525,32 +4525,28 @@ BPF_CALL_3(bpf_bind, struct bpf_sock_addr_kern *, ctx, struct sockaddr *, addr, { #ifdef CONFIG_INET struct sock *sk = ctx->sk; + u32 flags = BIND_FROM_BPF; int err; - /* Binding to port can be expensive so it's prohibited in the helper. - * Only binding to IP is supported. - */ err = -EINVAL; if (addr_len < offsetofend(struct sockaddr, sa_family)) return err; if (addr->sa_family == AF_INET) { if (addr_len < sizeof(struct sockaddr_in)) return err; - if (((struct sockaddr_in *)addr)->sin_port != htons(0)) - return err; - return __inet_bind(sk, addr, addr_len, - BIND_FORCE_ADDRESS_NO_PORT); + if (((struct sockaddr_in *)addr)->sin_port == htons(0)) + flags |= BIND_FORCE_ADDRESS_NO_PORT; + return __inet_bind(sk, addr, addr_len, flags); #if IS_ENABLED(CONFIG_IPV6) } else if (addr->sa_family == AF_INET6) { if (addr_len < SIN6_LEN_RFC2133) return err; - if (((struct sockaddr_in6 *)addr)->sin6_port != htons(0)) - return err; + if (((struct sockaddr_in6 *)addr)->sin6_port == htons(0)) + flags |= BIND_FORCE_ADDRESS_NO_PORT; /* ipv6_bpf_stub cannot be NULL, since it's called from * bpf_cgroup_inet6_connect hook and ipv6 is already loaded */ - return ipv6_bpf_stub->inet6_bind(sk, addr, addr_len, - BIND_FORCE_ADDRESS_NO_PORT); + return ipv6_bpf_stub->inet6_bind(sk, addr, addr_len, flags); #endif /* CONFIG_IPV6 */ } #endif /* CONFIG_INET */ diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c index 68e74b1b0f26..fcf0d12a407a 100644 --- a/net/ipv4/af_inet.c +++ b/net/ipv4/af_inet.c @@ -526,10 +526,12 @@ int __inet_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len, err = -EADDRINUSE; goto out_release_sock; } - err = BPF_CGROUP_RUN_PROG_INET4_POST_BIND(sk); - if (err) { - inet->inet_saddr = inet->inet_rcv_saddr = 0; - goto out_release_sock; + if (!(flags & BIND_FROM_BPF)) { + err = BPF_CGROUP_RUN_PROG_INET4_POST_BIND(sk); + if (err) { + inet->inet_saddr = inet->inet_rcv_saddr = 0; + goto out_release_sock; + } } } diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c index 552c2592b81c..771a462a8322 100644 --- a/net/ipv6/af_inet6.c +++ b/net/ipv6/af_inet6.c @@ -407,11 +407,13 @@ static int __inet6_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len, err = -EADDRINUSE; goto out; } - err = BPF_CGROUP_RUN_PROG_INET6_POST_BIND(sk); - if (err) { - sk->sk_ipv6only = saved_ipv6only; - inet_reset_saddr(sk); - goto out; + if (!(flags & BIND_FROM_BPF)) { + err = BPF_CGROUP_RUN_PROG_INET6_POST_BIND(sk); + if (err) { + sk->sk_ipv6only = saved_ipv6only; + inet_reset_saddr(sk); + goto out; + } } } diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index b3643e27e264..6e5e7caa3739 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -1994,10 +1994,11 @@ union bpf_attr { * * This helper works for IPv4 and IPv6, TCP and UDP sockets. The * domain (*addr*\ **->sa_family**) must be **AF_INET** (or - * **AF_INET6**). Looking for a free port to bind to can be - * expensive, therefore binding to port is not permitted by the - * helper: *addr*\ **->sin_port** (or **sin6_port**, respectively) - * must be set to zero. + * **AF_INET6**). It's advised to pass zero port (**sin_port** + * or **sin6_port**) which triggers IP_BIND_ADDRESS_NO_PORT-like + * behavior and lets the kernel efficiently pick up an unused + * port as long as 4-tuple is unique. Passing non-zero port might + * lead to degraded performance. * Return * 0 on success, or a negative error in case of failure. * diff --git a/tools/testing/selftests/bpf/prog_tests/connect_force_port.c b/tools/testing/selftests/bpf/prog_tests/connect_force_port.c new file mode 100644 index 000000000000..47fbb20cb6a6 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/connect_force_port.c @@ -0,0 +1,115 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include "cgroup_helpers.h" +#include "network_helpers.h" + +static int verify_port(int family, int fd, int expected) +{ + struct sockaddr_storage addr; + socklen_t len = sizeof(addr); + __u16 port; + + if (getsockname(fd, (struct sockaddr *)&addr, &len)) { + log_err("Failed to get server addr"); + return -1; + } + + if (family == AF_INET) + port = ((struct sockaddr_in *)&addr)->sin_port; + else + port = ((struct sockaddr_in6 *)&addr)->sin6_port; + + if (ntohs(port) != expected) { + log_err("Unexpected port %d, expected %d", ntohs(port), + expected); + return -1; + } + + return 0; +} + +static int run_test(int cgroup_fd, int server_fd, int family, int type) +{ + struct bpf_prog_load_attr attr = { + .prog_type = BPF_PROG_TYPE_CGROUP_SOCK_ADDR, + }; + struct bpf_object *obj; + int expected_port; + int prog_fd; + int err; + int fd; + + if (family == AF_INET) { + attr.file = "./connect_force_port4.o"; + attr.expected_attach_type = BPF_CGROUP_INET4_CONNECT; + expected_port = 22222; + } else { + attr.file = "./connect_force_port6.o"; + attr.expected_attach_type = BPF_CGROUP_INET6_CONNECT; + expected_port = 22223; + } + + err = bpf_prog_load_xattr(&attr, &obj, &prog_fd); + if (err) { + log_err("Failed to load BPF object"); + return -1; + } + + err = bpf_prog_attach(prog_fd, cgroup_fd, attr.expected_attach_type, + 0); + if (err) { + log_err("Failed to attach BPF program"); + goto close_bpf_object; + } + + fd = connect_to_fd(family, type, server_fd); + if (fd < 0) { + err = -1; + goto close_bpf_object; + } + + err = verify_port(family, fd, expected_port); + + close(fd); + +close_bpf_object: + bpf_object__close(obj); + return err; +} + +void test_connect_force_port(void) +{ + int server_fd, cgroup_fd; + + cgroup_fd = test__join_cgroup("/connect_force_port"); + if (CHECK_FAIL(cgroup_fd < 0)) + return; + + server_fd = start_server(AF_INET, SOCK_STREAM); + if (CHECK_FAIL(server_fd < 0)) + goto close_cgroup_fd; + CHECK_FAIL(run_test(cgroup_fd, server_fd, AF_INET, SOCK_STREAM)); + close(server_fd); + + server_fd = start_server(AF_INET6, SOCK_STREAM); + if (CHECK_FAIL(server_fd < 0)) + goto close_cgroup_fd; + CHECK_FAIL(run_test(cgroup_fd, server_fd, AF_INET6, SOCK_STREAM)); + close(server_fd); + + server_fd = start_server(AF_INET, SOCK_DGRAM); + if (CHECK_FAIL(server_fd < 0)) + goto close_cgroup_fd; + CHECK_FAIL(run_test(cgroup_fd, server_fd, AF_INET, SOCK_DGRAM)); + close(server_fd); + + server_fd = start_server(AF_INET6, SOCK_DGRAM); + if (CHECK_FAIL(server_fd < 0)) + goto close_cgroup_fd; + CHECK_FAIL(run_test(cgroup_fd, server_fd, AF_INET6, SOCK_DGRAM)); + close(server_fd); + +close_cgroup_fd: + close(cgroup_fd); +} diff --git a/tools/testing/selftests/bpf/progs/connect_force_port4.c b/tools/testing/selftests/bpf/progs/connect_force_port4.c new file mode 100644 index 000000000000..1b8eb34b2db0 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/connect_force_port4.c @@ -0,0 +1,28 @@ +// SPDX-License-Identifier: GPL-2.0 +#include + +#include +#include +#include +#include + +#include +#include + +char _license[] SEC("license") = "GPL"; +int _version SEC("version") = 1; + +SEC("cgroup/connect4") +int _connect4(struct bpf_sock_addr *ctx) +{ + struct sockaddr_in sa = {}; + + sa.sin_family = AF_INET; + sa.sin_port = bpf_htons(22222); + sa.sin_addr.s_addr = bpf_htonl(0x7f000001); /* 127.0.0.1 */ + + if (bpf_bind(ctx, (struct sockaddr *)&sa, sizeof(sa)) != 0) + return 0; + + return 1; +} diff --git a/tools/testing/selftests/bpf/progs/connect_force_port6.c b/tools/testing/selftests/bpf/progs/connect_force_port6.c new file mode 100644 index 000000000000..ae6f7d750b4c --- /dev/null +++ b/tools/testing/selftests/bpf/progs/connect_force_port6.c @@ -0,0 +1,28 @@ +// SPDX-License-Identifier: GPL-2.0 +#include + +#include +#include +#include +#include + +#include +#include + +char _license[] SEC("license") = "GPL"; +int _version SEC("version") = 1; + +SEC("cgroup/connect6") +int _connect6(struct bpf_sock_addr *ctx) +{ + struct sockaddr_in6 sa = {}; + + sa.sin6_family = AF_INET6; + sa.sin6_port = bpf_htons(22223); + sa.sin6_addr.s6_addr32[3] = bpf_htonl(1); /* ::1 */ + + if (bpf_bind(ctx, (struct sockaddr *)&sa, sizeof(sa)) != 0) + return 0; + + return 1; +}