From patchwork Fri Jun 4 15:17:30 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jason A. Donenfeld" X-Patchwork-Id: 454544 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3C5EBC47083 for ; Fri, 4 Jun 2021 15:18:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 24E036140B for ; Fri, 4 Jun 2021 15:18:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230147AbhFDPTr (ORCPT ); Fri, 4 Jun 2021 11:19:47 -0400 Received: from mail.zx2c4.com ([104.131.123.232]:54916 "EHLO mail.zx2c4.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229822AbhFDPTp (ORCPT ); Fri, 4 Jun 2021 11:19:45 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=zx2c4.com; s=20210105; t=1622819875; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=yBKFhxPsXa9fMNRa2Z76DvOFp9cyO77PK+EqAtHPkcI=; b=D7ylREqfEpWAWkunLfhUCnnZHyqZ6G9xnBmi+6goAPQWnlc3hJaEY+EbLZl7PTEBrGrlRD sQJERMBUmNwxj3Fw/n9xnwzPBqhkLfbZLlyyhNwXiQhZQo07Fsxc/Qq2bYHl/hLCCqRdzi pbsYjQ8wRns1N/RuXdA4emoV0IuJtIE= Received: by mail.zx2c4.com (ZX2C4 Mail Server) with ESMTPSA id 38e50d81 (TLSv1.3:AEAD-AES256-GCM-SHA384:256:NO); Fri, 4 Jun 2021 15:17:55 +0000 (UTC) From: "Jason A. Donenfeld" To: netdev@vger.kernel.org, davem@davemloft.net, kuba@kernel.org Cc: "Jason A. Donenfeld" , Rui Salvaterra , stable@vger.kernel.org Subject: [PATCH net 1/9] wireguard: selftests: remove old conntrack kconfig value Date: Fri, 4 Jun 2021 17:17:30 +0200 Message-Id: <20210604151738.220232-2-Jason@zx2c4.com> In-Reply-To: <20210604151738.220232-1-Jason@zx2c4.com> References: <20210604151738.220232-1-Jason@zx2c4.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org On recent kernels, this config symbol is no longer used. Reported-by: Rui Salvaterra Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld --- tools/testing/selftests/wireguard/qemu/kernel.config | 1 - 1 file changed, 1 deletion(-) diff --git a/tools/testing/selftests/wireguard/qemu/kernel.config b/tools/testing/selftests/wireguard/qemu/kernel.config index 4eecb432a66c..74db83a0aedd 100644 --- a/tools/testing/selftests/wireguard/qemu/kernel.config +++ b/tools/testing/selftests/wireguard/qemu/kernel.config @@ -19,7 +19,6 @@ CONFIG_NETFILTER_XTABLES=y CONFIG_NETFILTER_XT_NAT=y CONFIG_NETFILTER_XT_MATCH_LENGTH=y CONFIG_NETFILTER_XT_MARK=y -CONFIG_NF_CONNTRACK_IPV4=y CONFIG_NF_NAT_IPV4=y CONFIG_IP_NF_IPTABLES=y CONFIG_IP_NF_FILTER=y From patchwork Fri Jun 4 15:17:31 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jason A. 
Donenfeld" X-Patchwork-Id: 455004 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2AA56C07E94 for ; Fri, 4 Jun 2021 15:18:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1740761402 for ; Fri, 4 Jun 2021 15:18:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230341AbhFDPTs (ORCPT ); Fri, 4 Jun 2021 11:19:48 -0400 Received: from mail.zx2c4.com ([104.131.123.232]:54916 "EHLO mail.zx2c4.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229778AbhFDPTq (ORCPT ); Fri, 4 Jun 2021 11:19:46 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=zx2c4.com; s=20210105; t=1622819878; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=wFEvuNyHfPiVav+6JQ1EDLZB4jXWgskHb1VvAjhvm2A=; b=Nfqk1Tx6XQy4/vZ7k+Dk9sFbGxN6IaInu3npUWf/qA0QCjVlgdsFO/1xakviI3UBOG05Vh iRsJtgAGK0e/QWfHaSjxjeh6hhFzzWzaGuIcDMhR11l1EPDh7QOg/sc78hDUMpdrWKYJSR QQR3yVv8YQtoC0+Vf/JiB1tD+YZ+rqs= Received: by mail.zx2c4.com (ZX2C4 Mail Server) with ESMTPSA id 5c409dc9 (TLSv1.3:AEAD-AES256-GCM-SHA384:256:NO); Fri, 4 Jun 2021 15:17:58 +0000 (UTC) From: "Jason A. Donenfeld" To: netdev@vger.kernel.org, davem@davemloft.net, kuba@kernel.org Cc: "Jason A. Donenfeld" , Hangbin Liu , stable@vger.kernel.org Subject: [PATCH net 2/9] wireguard: selftests: make sure rp_filter is disabled on vethc Date: Fri, 4 Jun 2021 17:17:31 +0200 Message-Id: <20210604151738.220232-3-Jason@zx2c4.com> In-Reply-To: <20210604151738.220232-1-Jason@zx2c4.com> References: <20210604151738.220232-1-Jason@zx2c4.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Some distros may enable strict rp_filter by default, which will prevent vethc from receiving the packets with an unrouteable reverse path address. Reported-by: Hangbin Liu Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld --- tools/testing/selftests/wireguard/netns.sh | 1 + 1 file changed, 1 insertion(+) diff --git a/tools/testing/selftests/wireguard/netns.sh b/tools/testing/selftests/wireguard/netns.sh index 7ed7cd95e58f..ebc4ee0fe179 100755 --- a/tools/testing/selftests/wireguard/netns.sh +++ b/tools/testing/selftests/wireguard/netns.sh @@ -363,6 +363,7 @@ ip1 -6 rule add table main suppress_prefixlength 0 ip1 -4 route add default dev wg0 table 51820 ip1 -4 rule add not fwmark 51820 table 51820 ip1 -4 rule add table main suppress_prefixlength 0 +n1 bash -c 'printf 0 > /proc/sys/net/ipv4/conf/vethc/rp_filter' # Flood the pings instead of sending just one, to trigger routing table reference counting bugs. n1 ping -W 1 -c 100 -f 192.168.99.7 n1 ping -W 1 -c 100 -f abab::1111 From patchwork Fri Jun 4 15:17:32 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jason A. 
Donenfeld" X-Patchwork-Id: 454042 Delivered-To: patch@linaro.org Received: by 2002:a02:735a:0:0:0:0:0 with SMTP id a26csp508357jae; Fri, 4 Jun 2021 08:18:50 -0700 (PDT) X-Google-Smtp-Source: ABdhPJw+0EqU+fxMdigjW8KUyLPx5zlrAe7L5JVjOQ2IiUZkRSKx4IbhdWH8NIAzU0ACqNH4Exay X-Received: by 2002:a17:907:6288:: with SMTP id nd8mr4736032ejc.223.1622819929829; Fri, 04 Jun 2021 08:18:49 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1622819929; cv=none; d=google.com; s=arc-20160816; b=b2k91CsccZV2ckqi7NRY/mwK/625hd5o14xMBO9iA/3vXNOeZKqD8dsFsSN+fRV483 xL17BWfWBkfPzoyI45X9yWyQ+zgt8TUdh5A4mh8HwCLq5CXjamEtU0AAN5a/jP9kgAYC L+ZR2DG8FTXdDSnPI75pUOp7VspjudJP8YRO5q7CP04O9LXtq4S7GLGSwjyicedm/4Zy 2t/AHtht02clU+AvgfAML0yXdmGTSXYgEcIpQwBsySNg2QEjf2MPaOq4OIoOP50ABbMt ZWfmuD758nCDLGoGBnap9JkVFZrYiRtqN70bgRCS51QeUF9yzvkRL8bM4WT8WULgUj3w rSUQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=03qVuV/z5QIT1sBJaboojuxrUS02tPyZze9WNKWgRyc=; b=G9F5T0UQw93BkLzRinc3ZDA72lcLSbgsf9e9viSIT9zvMZAWDrDO4ezLCvALCCNpgM SEjlQXQj1JF46NueJMWOI8lIZXyazoQ6sM1QPpOpREmCVxdtGIc9ok2dQUKovKF9P827 6DCDkRoor2N71BAomMIc7brNQC4zc6mtr2FDyEUynpGSBCRFyuMkHuiUnA7b0sKt4sHi p5r+uZItaEB9vq8xvkHYlZDKd/SxVAHAXShYOOnD83r9KDpqWLA+y4BQxZ0WsF7YB+8v vHLbzGvrAQpVz+Cd6Y/oBLlHVjQHBSyrSMBFcEJ3cz6Z7dA0IMbmmCBZFhwEIoj10UX/ mUwA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@zx2c4.com header.s=20210105 header.b=qHDJOl+2; spf=pass (google.com: domain of netdev-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=netdev-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=zx2c4.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id l15si4894987ejc.521.2021.06.04.08.18.49; Fri, 04 Jun 2021 08:18:49 -0700 (PDT) Received-SPF: pass (google.com: domain of netdev-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@zx2c4.com header.s=20210105 header.b=qHDJOl+2; spf=pass (google.com: domain of netdev-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=netdev-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=zx2c4.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230378AbhFDPTu (ORCPT + 8 others); Fri, 4 Jun 2021 11:19:50 -0400 Received: from mail.zx2c4.com ([104.131.123.232]:54916 "EHLO mail.zx2c4.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230173AbhFDPTs (ORCPT ); Fri, 4 Jun 2021 11:19:48 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=zx2c4.com; s=20210105; t=1622819881; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=03qVuV/z5QIT1sBJaboojuxrUS02tPyZze9WNKWgRyc=; b=qHDJOl+2DFRnTHLPWmLGtEk9LFtAH95XFfHLCT61pQ6AiX+wRpOOwKvcnKZNV3HMB0si5N AVyCny87O4Gju85rR5c7+nr10FMuzZYiso5SY5MnGd/HIsUSPUlmvppUMiSqMDDLCMCe8I 46F5mVKr4Qq6lciwbZY0q2aWzTTH6gU= Received: by mail.zx2c4.com (ZX2C4 Mail Server) with ESMTPSA id 9c8a12b9 (TLSv1.3:AEAD-AES256-GCM-SHA384:256:NO); Fri, 4 Jun 2021 15:18:00 +0000 (UTC) From: "Jason A. Donenfeld" To: netdev@vger.kernel.org, davem@davemloft.net, kuba@kernel.org Cc: "Jason A. 
Donenfeld" , Linus Torvalds , stable@vger.kernel.org Subject: [PATCH net 3/9] wireguard: do not use -O3 Date: Fri, 4 Jun 2021 17:17:32 +0200 Message-Id: <20210604151738.220232-4-Jason@zx2c4.com> In-Reply-To: <20210604151738.220232-1-Jason@zx2c4.com> References: <20210604151738.220232-1-Jason@zx2c4.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Apparently, various versions of gcc have O3-related miscompiles. Looking at the difference between -O2 and -O3 for gcc 11 doesn't indicate miscompiles, but the difference also doesn't seem so significant for performance that it's worth risking. Link: https://lore.kernel.org/lkml/CAHk-=wjuoGyxDhAF8SsrTkN0-YfCx7E6jUN3ikC_tn2AKWTTsA@mail.gmail.com/ Link: https://lore.kernel.org/lkml/CAHmME9otB5Wwxp7H8bR_i2uH2esEMvoBMC8uEXBMH9p0q1s6Bw@mail.gmail.com/ Reported-by: Linus Torvalds Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld --- drivers/net/wireguard/Makefile | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) -- 2.31.1 diff --git a/drivers/net/wireguard/Makefile b/drivers/net/wireguard/Makefile index fc52b2cb500b..dbe1f8514efc 100644 --- a/drivers/net/wireguard/Makefile +++ b/drivers/net/wireguard/Makefile @@ -1,5 +1,4 @@ -ccflags-y := -O3 -ccflags-y += -D'pr_fmt(fmt)=KBUILD_MODNAME ": " fmt' +ccflags-y := -D'pr_fmt(fmt)=KBUILD_MODNAME ": " fmt' ccflags-$(CONFIG_WIREGUARD_DEBUG) += -DDEBUG wireguard-y := main.o wireguard-y += noise.o From patchwork Fri Jun 4 15:17:33 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jason A. Donenfeld" X-Patchwork-Id: 455003 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2DA1BC4708F for ; Fri, 4 Jun 2021 15:18:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1B08161400 for ; Fri, 4 Jun 2021 15:18:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230412AbhFDPTy (ORCPT ); Fri, 4 Jun 2021 11:19:54 -0400 Received: from mail.zx2c4.com ([104.131.123.232]:54948 "EHLO mail.zx2c4.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230374AbhFDPTv (ORCPT ); Fri, 4 Jun 2021 11:19:51 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=zx2c4.com; s=20210105; t=1622819883; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Ew2oDN9uOXatxBQ4LLdwymBBFALjUrTav2zPF6ydiF4=; b=PmPYj8u0I9/Jb6nV0nhvQM7LjqfR7bq9lxX7zTnhZHhMkyubPvdzy+EMF+hynlNGcv/Fuo zcovG6a7Yq2pzDIEXORUFL11CPyeytXXV9il1zLCbQ6VfTFLMjJNjTWfteVfbzuft7xoo6 EV2am2nIb2c+AMTVyvHXkonV+SCS66w= Received: by mail.zx2c4.com (ZX2C4 Mail Server) with ESMTPSA id a67f05fe (TLSv1.3:AEAD-AES256-GCM-SHA384:256:NO); Fri, 4 Jun 2021 15:18:03 +0000 (UTC) From: "Jason A. Donenfeld" To: netdev@vger.kernel.org, davem@davemloft.net, kuba@kernel.org Cc: "Jason A. 
Donenfeld" , stable@vger.kernel.org Subject: [PATCH net 4/9] wireguard: use synchronize_net rather than synchronize_rcu Date: Fri, 4 Jun 2021 17:17:33 +0200 Message-Id: <20210604151738.220232-5-Jason@zx2c4.com> In-Reply-To: <20210604151738.220232-1-Jason@zx2c4.com> References: <20210604151738.220232-1-Jason@zx2c4.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Many of the synchronization points are sometimes called under the rtnl lock, which means we should use synchronize_net rather than synchronize_rcu. Under the hood, this expands to using the expedited flavor of function in the event that rtnl is held, in order to not stall other concurrent changes. This fixes some very, very long delays when removing multiple peers at once, which would cause some operations to take several minutes. Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld --- drivers/net/wireguard/peer.c | 6 +++--- drivers/net/wireguard/socket.c | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireguard/peer.c b/drivers/net/wireguard/peer.c index cd5cb0292cb6..3a042d28eb2e 100644 --- a/drivers/net/wireguard/peer.c +++ b/drivers/net/wireguard/peer.c @@ -88,7 +88,7 @@ static void peer_make_dead(struct wg_peer *peer) /* Mark as dead, so that we don't allow jumping contexts after. */ WRITE_ONCE(peer->is_dead, true); - /* The caller must now synchronize_rcu() for this to take effect. */ + /* The caller must now synchronize_net() for this to take effect. */ } static void peer_remove_after_dead(struct wg_peer *peer) @@ -160,7 +160,7 @@ void wg_peer_remove(struct wg_peer *peer) lockdep_assert_held(&peer->device->device_update_lock); peer_make_dead(peer); - synchronize_rcu(); + synchronize_net(); peer_remove_after_dead(peer); } @@ -178,7 +178,7 @@ void wg_peer_remove_all(struct wg_device *wg) peer_make_dead(peer); list_add_tail(&peer->peer_list, &dead_peers); } - synchronize_rcu(); + synchronize_net(); list_for_each_entry_safe(peer, temp, &dead_peers, peer_list) peer_remove_after_dead(peer); } diff --git a/drivers/net/wireguard/socket.c b/drivers/net/wireguard/socket.c index d9ad850daa79..8c496b747108 100644 --- a/drivers/net/wireguard/socket.c +++ b/drivers/net/wireguard/socket.c @@ -430,7 +430,7 @@ void wg_socket_reinit(struct wg_device *wg, struct sock *new4, if (new4) wg->incoming_port = ntohs(inet_sk(new4)->inet_sport); mutex_unlock(&wg->socket_update_lock); - synchronize_rcu(); + synchronize_net(); sock_free(old4); sock_free(old6); } From patchwork Fri Jun 4 15:17:34 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jason A. 
Donenfeld" X-Patchwork-Id: 454542 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 019FAC47099 for ; Fri, 4 Jun 2021 15:18:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E19A86140B for ; Fri, 4 Jun 2021 15:18:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230436AbhFDPTz (ORCPT ); Fri, 4 Jun 2021 11:19:55 -0400 Received: from mail.zx2c4.com ([104.131.123.232]:54948 "EHLO mail.zx2c4.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230366AbhFDPTy (ORCPT ); Fri, 4 Jun 2021 11:19:54 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=zx2c4.com; s=20210105; t=1622819886; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=oBjY5vDoMMy6mIXgNjOV+szgQeVTSx5zMFLE7TfVt+g=; b=CppsgXqfq2kb1EWxc+aiUta+vUR1hBO77SpLtijLBzeEzAmfJQ5bqDZeyHoYwpP7yurDQY +m8PrhyynqqR3gUpi98WMw3gUJcv4ChKMLjeCAY+tvkxBhaZQyILJ6cSn2tiqg5XJ5LEr/ STK5I3dXPbRVC2rPjTGO3pN/reHbKHM= Received: by mail.zx2c4.com (ZX2C4 Mail Server) with ESMTPSA id 00ccca12 (TLSv1.3:AEAD-AES256-GCM-SHA384:256:NO); Fri, 4 Jun 2021 15:18:05 +0000 (UTC) From: "Jason A. Donenfeld" To: netdev@vger.kernel.org, davem@davemloft.net, kuba@kernel.org Cc: "Jason A. Donenfeld" , Arnd Bergmann , Matthew Wilcox , stable@vger.kernel.org Subject: [PATCH net 5/9] wireguard: peer: allocate in kmem_cache Date: Fri, 4 Jun 2021 17:17:34 +0200 Message-Id: <20210604151738.220232-6-Jason@zx2c4.com> In-Reply-To: <20210604151738.220232-1-Jason@zx2c4.com> References: <20210604151738.220232-1-Jason@zx2c4.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org With deployments having upwards of 600k peers now, this somewhat heavy structure could benefit from more fine-grained allocations. Specifically, instead of using a 2048-byte slab for a 1544-byte object, we can now use 1544-byte objects directly, thus saving almost 25% per-peer, or with 600k peers, that's a savings of 303 MiB. This also makes wireguard's memory usage more transparent in tools like slabtop and /proc/slabinfo. Fixes: 8b5553ace83c ("wireguard: queueing: get rid of per-peer ring buffers") Suggested-by: Arnd Bergmann Suggested-by: Matthew Wilcox Cc: stable@vger.kernel.org Signed-off-by: Jason A. 
Donenfeld --- drivers/net/wireguard/main.c | 7 +++++++ drivers/net/wireguard/peer.c | 21 +++++++++++++++++---- drivers/net/wireguard/peer.h | 3 +++ 3 files changed, 27 insertions(+), 4 deletions(-) diff --git a/drivers/net/wireguard/main.c b/drivers/net/wireguard/main.c index 7a7d5f1a80fc..0a3ebfdac794 100644 --- a/drivers/net/wireguard/main.c +++ b/drivers/net/wireguard/main.c @@ -28,6 +28,10 @@ static int __init mod_init(void) #endif wg_noise_init(); + ret = wg_peer_init(); + if (ret < 0) + goto err_peer; + ret = wg_device_init(); if (ret < 0) goto err_device; @@ -44,6 +48,8 @@ static int __init mod_init(void) err_netlink: wg_device_uninit(); err_device: + wg_peer_uninit(); +err_peer: return ret; } @@ -51,6 +57,7 @@ static void __exit mod_exit(void) { wg_genetlink_uninit(); wg_device_uninit(); + wg_peer_uninit(); } module_init(mod_init); diff --git a/drivers/net/wireguard/peer.c b/drivers/net/wireguard/peer.c index 3a042d28eb2e..1acd00ab2fbc 100644 --- a/drivers/net/wireguard/peer.c +++ b/drivers/net/wireguard/peer.c @@ -15,6 +15,7 @@ #include #include +static struct kmem_cache *peer_cache; static atomic64_t peer_counter = ATOMIC64_INIT(0); struct wg_peer *wg_peer_create(struct wg_device *wg, @@ -29,10 +30,10 @@ struct wg_peer *wg_peer_create(struct wg_device *wg, if (wg->num_peers >= MAX_PEERS_PER_DEVICE) return ERR_PTR(ret); - peer = kzalloc(sizeof(*peer), GFP_KERNEL); + peer = kmem_cache_zalloc(peer_cache, GFP_KERNEL); if (unlikely(!peer)) return ERR_PTR(ret); - if (dst_cache_init(&peer->endpoint_cache, GFP_KERNEL)) + if (unlikely(dst_cache_init(&peer->endpoint_cache, GFP_KERNEL))) goto err; peer->device = wg; @@ -64,7 +65,7 @@ struct wg_peer *wg_peer_create(struct wg_device *wg, return peer; err: - kfree(peer); + kmem_cache_free(peer_cache, peer); return ERR_PTR(ret); } @@ -193,7 +194,8 @@ static void rcu_release(struct rcu_head *rcu) /* The final zeroing takes care of clearing any remaining handshake key * material and other potentially sensitive information. */ - kfree_sensitive(peer); + memzero_explicit(peer, sizeof(*peer)); + kmem_cache_free(peer_cache, peer); } static void kref_release(struct kref *refcount) @@ -225,3 +227,14 @@ void wg_peer_put(struct wg_peer *peer) return; kref_put(&peer->refcount, kref_release); } + +int __init wg_peer_init(void) +{ + peer_cache = KMEM_CACHE(wg_peer, 0); + return peer_cache ? 0 : -ENOMEM; +} + +void wg_peer_uninit(void) +{ + kmem_cache_destroy(peer_cache); +} diff --git a/drivers/net/wireguard/peer.h b/drivers/net/wireguard/peer.h index 8d53b687a1d1..76e4d3128ad4 100644 --- a/drivers/net/wireguard/peer.h +++ b/drivers/net/wireguard/peer.h @@ -80,4 +80,7 @@ void wg_peer_put(struct wg_peer *peer); void wg_peer_remove(struct wg_peer *peer); void wg_peer_remove_all(struct wg_device *wg); +int wg_peer_init(void); +void wg_peer_uninit(void); + #endif /* _WG_PEER_H */ From patchwork Fri Jun 4 15:17:35 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jason A. 
Donenfeld" X-Patchwork-Id: 455002 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 682E4C47083 for ; Fri, 4 Jun 2021 15:18:16 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5102961402 for ; Fri, 4 Jun 2021 15:18:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230462AbhFDPUC (ORCPT ); Fri, 4 Jun 2021 11:20:02 -0400 Received: from mail.zx2c4.com ([104.131.123.232]:54988 "EHLO mail.zx2c4.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230366AbhFDPUB (ORCPT ); Fri, 4 Jun 2021 11:20:01 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=zx2c4.com; s=20210105; t=1622819889; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=8N0Fk94XQEKy2fk7aF/Vhnx+Uu/pat84MbHqxv22CaE=; b=XDtvQcg+bjXtf8pgWiLbyiYgq0WB64kVllnsnS55AemklCOhibeV2Ne5TSZPEovBf5bREe ac4cMk6XcBsXUa5Oz1NsYWm5WWfs5ACfKzERFOFZuejYmSwObk23TwVJAH1Xa+3XaG6bmh Xbn/g/wLECyxPEhWI/G07XitG51II3o= Received: by mail.zx2c4.com (ZX2C4 Mail Server) with ESMTPSA id 82391298 (TLSv1.3:AEAD-AES256-GCM-SHA384:256:NO); Fri, 4 Jun 2021 15:18:08 +0000 (UTC) From: "Jason A. Donenfeld" To: netdev@vger.kernel.org, davem@davemloft.net, kuba@kernel.org Cc: "Jason A. Donenfeld" , stable@vger.kernel.org Subject: [PATCH net 6/9] wireguard: allowedips: initialize list head in selftest Date: Fri, 4 Jun 2021 17:17:35 +0200 Message-Id: <20210604151738.220232-7-Jason@zx2c4.com> In-Reply-To: <20210604151738.220232-1-Jason@zx2c4.com> References: <20210604151738.220232-1-Jason@zx2c4.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org The randomized trie tests weren't initializing the dummy peer list head, resulting in a NULL pointer dereference when used. Fix this by initializing it in the randomized trie test, just like we do for the static unit test. While we're at it, all of the other strings like this have the word "self-test", so add it to the missing place here. Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. 
Donenfeld --- drivers/net/wireguard/selftest/allowedips.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/net/wireguard/selftest/allowedips.c b/drivers/net/wireguard/selftest/allowedips.c index 846db14cb046..0d2a43a2d400 100644 --- a/drivers/net/wireguard/selftest/allowedips.c +++ b/drivers/net/wireguard/selftest/allowedips.c @@ -296,6 +296,7 @@ static __init bool randomized_test(void) goto free; } kref_init(&peers[i]->refcount); + INIT_LIST_HEAD(&peers[i]->allowedips_list); } mutex_lock(&mutex); @@ -333,7 +334,7 @@ static __init bool randomized_test(void) if (wg_allowedips_insert_v4(&t, (struct in_addr *)mutated, cidr, peer, &mutex) < 0) { - pr_err("allowedips random malloc: FAIL\n"); + pr_err("allowedips random self-test malloc: FAIL\n"); goto free_locked; } if (horrible_allowedips_insert_v4(&h, From patchwork Fri Jun 4 15:17:36 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jason A. Donenfeld" X-Patchwork-Id: 454541 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 66680C4708F for ; Fri, 4 Jun 2021 15:18:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4EA5361415 for ; Fri, 4 Jun 2021 15:18:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230453AbhFDPUD (ORCPT ); Fri, 4 Jun 2021 11:20:03 -0400 Received: from mail.zx2c4.com ([104.131.123.232]:54988 "EHLO mail.zx2c4.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230374AbhFDPUC (ORCPT ); Fri, 4 Jun 2021 11:20:02 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=zx2c4.com; s=20210105; t=1622819891; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=AYEnH8/v2ltjpJ0YV9kW5qqM6u9qvKetv5dPxil2ZbI=; b=DtbIMxRe5ib4zZ1oFj8VBi0n++OQcWcwv35Bv0ZxwJM+21ThmOI9B03Q587RB1YyreeNvG 0D1BEzmEGjTwT40msBRAxqwITtBnxu8aVc5/3/wSinX0TO3EZMnUA0uIyCdcv8KooC/PT8 /GORrUYU9w80g5p9DoT87LlidlI+64Q= Received: by mail.zx2c4.com (ZX2C4 Mail Server) with ESMTPSA id 1e7dff62 (TLSv1.3:AEAD-AES256-GCM-SHA384:256:NO); Fri, 4 Jun 2021 15:18:11 +0000 (UTC) From: "Jason A. Donenfeld" To: netdev@vger.kernel.org, davem@davemloft.net, kuba@kernel.org Cc: "Jason A. Donenfeld" , stable@vger.kernel.org Subject: [PATCH net 7/9] wireguard: allowedips: remove nodes in O(1) Date: Fri, 4 Jun 2021 17:17:36 +0200 Message-Id: <20210604151738.220232-8-Jason@zx2c4.com> In-Reply-To: <20210604151738.220232-1-Jason@zx2c4.com> References: <20210604151738.220232-1-Jason@zx2c4.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Previously, deleting peers would require traversing the entire trie in order to rebalance nodes and safely free them. This meant that removing 1000 peers from a trie with a half million nodes would take an extremely long time, during which we're holding the rtnl lock. 
Large-scale users were reporting 200ms latencies added to the networking stack as a whole every time their userspace software would queue up significant removals. That's a serious situation. This commit fixes that by maintaining a double pointer to the parent's bit pointer for each node, and then using the already existing node list belonging to each peer to go directly to the node, fix up its pointers, and free it with RCU. This means removal is O(1) instead of O(n), and we don't use gobs of stack. The removal algorithm has the same downside as the code that it fixes: it won't collapse needlessly long runs of fillers. We can enhance that in the future if it ever becomes a problem. This commit documents that limitation with a TODO comment in code, a small but meaningful improvement over the prior situation. Currently the biggest flaw, which the next commit addresses, is that because this increases the node size on 64-bit machines from 60 bytes to 68 bytes. 60 rounds up to 64, but 68 rounds up to 128. So we wind up using twice as much memory per node, because of power-of-two allocations, which is a big bummer. We'll need to figure something out there. Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld --- drivers/net/wireguard/allowedips.c | 132 ++++++++++++----------------- drivers/net/wireguard/allowedips.h | 9 +- 2 files changed, 57 insertions(+), 84 deletions(-) diff --git a/drivers/net/wireguard/allowedips.c b/drivers/net/wireguard/allowedips.c index 3725e9cd85f4..2785cfd3a221 100644 --- a/drivers/net/wireguard/allowedips.c +++ b/drivers/net/wireguard/allowedips.c @@ -66,60 +66,6 @@ static void root_remove_peer_lists(struct allowedips_node *root) } } -static void walk_remove_by_peer(struct allowedips_node __rcu **top, - struct wg_peer *peer, struct mutex *lock) -{ -#define REF(p) rcu_access_pointer(p) -#define DEREF(p) rcu_dereference_protected(*(p), lockdep_is_held(lock)) -#define PUSH(p) ({ \ - WARN_ON(IS_ENABLED(DEBUG) && len >= 128); \ - stack[len++] = p; \ - }) - - struct allowedips_node __rcu **stack[128], **nptr; - struct allowedips_node *node, *prev; - unsigned int len; - - if (unlikely(!peer || !REF(*top))) - return; - - for (prev = NULL, len = 0, PUSH(top); len > 0; prev = node) { - nptr = stack[len - 1]; - node = DEREF(nptr); - if (!node) { - --len; - continue; - } - if (!prev || REF(prev->bit[0]) == node || - REF(prev->bit[1]) == node) { - if (REF(node->bit[0])) - PUSH(&node->bit[0]); - else if (REF(node->bit[1])) - PUSH(&node->bit[1]); - } else if (REF(node->bit[0]) == prev) { - if (REF(node->bit[1])) - PUSH(&node->bit[1]); - } else { - if (rcu_dereference_protected(node->peer, - lockdep_is_held(lock)) == peer) { - RCU_INIT_POINTER(node->peer, NULL); - list_del_init(&node->peer_list); - if (!node->bit[0] || !node->bit[1]) { - rcu_assign_pointer(*nptr, DEREF( - &node->bit[!REF(node->bit[0])])); - kfree_rcu(node, rcu); - node = DEREF(nptr); - } - } - --len; - } - } - -#undef REF -#undef DEREF -#undef PUSH -} - static unsigned int fls128(u64 a, u64 b) { return a ? 
fls64(a) + 64U : fls64(b); @@ -224,6 +170,7 @@ static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, RCU_INIT_POINTER(node->peer, peer); list_add_tail(&node->peer_list, &peer->allowedips_list); copy_and_assign_cidr(node, key, cidr, bits); + rcu_assign_pointer(node->parent_bit, trie); rcu_assign_pointer(*trie, node); return 0; } @@ -243,9 +190,9 @@ static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, if (!node) { down = rcu_dereference_protected(*trie, lockdep_is_held(lock)); } else { - down = rcu_dereference_protected(CHOOSE_NODE(node, key), - lockdep_is_held(lock)); + down = rcu_dereference_protected(CHOOSE_NODE(node, key), lockdep_is_held(lock)); if (!down) { + rcu_assign_pointer(newnode->parent_bit, &CHOOSE_NODE(node, key)); rcu_assign_pointer(CHOOSE_NODE(node, key), newnode); return 0; } @@ -254,29 +201,37 @@ static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, parent = node; if (newnode->cidr == cidr) { + rcu_assign_pointer(down->parent_bit, &CHOOSE_NODE(newnode, down->bits)); rcu_assign_pointer(CHOOSE_NODE(newnode, down->bits), down); - if (!parent) + if (!parent) { + rcu_assign_pointer(newnode->parent_bit, trie); rcu_assign_pointer(*trie, newnode); - else - rcu_assign_pointer(CHOOSE_NODE(parent, newnode->bits), - newnode); - } else { - node = kzalloc(sizeof(*node), GFP_KERNEL); - if (unlikely(!node)) { - list_del(&newnode->peer_list); - kfree(newnode); - return -ENOMEM; + } else { + rcu_assign_pointer(newnode->parent_bit, &CHOOSE_NODE(parent, newnode->bits)); + rcu_assign_pointer(CHOOSE_NODE(parent, newnode->bits), newnode); } - INIT_LIST_HEAD(&node->peer_list); - copy_and_assign_cidr(node, newnode->bits, cidr, bits); - - rcu_assign_pointer(CHOOSE_NODE(node, down->bits), down); - rcu_assign_pointer(CHOOSE_NODE(node, newnode->bits), newnode); - if (!parent) - rcu_assign_pointer(*trie, node); - else - rcu_assign_pointer(CHOOSE_NODE(parent, node->bits), - node); + return 0; + } + + node = kzalloc(sizeof(*node), GFP_KERNEL); + if (unlikely(!node)) { + list_del(&newnode->peer_list); + kfree(newnode); + return -ENOMEM; + } + INIT_LIST_HEAD(&node->peer_list); + copy_and_assign_cidr(node, newnode->bits, cidr, bits); + + rcu_assign_pointer(down->parent_bit, &CHOOSE_NODE(node, down->bits)); + rcu_assign_pointer(CHOOSE_NODE(node, down->bits), down); + rcu_assign_pointer(newnode->parent_bit, &CHOOSE_NODE(node, newnode->bits)); + rcu_assign_pointer(CHOOSE_NODE(node, newnode->bits), newnode); + if (!parent) { + rcu_assign_pointer(node->parent_bit, trie); + rcu_assign_pointer(*trie, node); + } else { + rcu_assign_pointer(node->parent_bit, &CHOOSE_NODE(parent, node->bits)); + rcu_assign_pointer(CHOOSE_NODE(parent, node->bits), node); } return 0; } @@ -335,9 +290,30 @@ int wg_allowedips_insert_v6(struct allowedips *table, const struct in6_addr *ip, void wg_allowedips_remove_by_peer(struct allowedips *table, struct wg_peer *peer, struct mutex *lock) { + struct allowedips_node *node, *child, *tmp; + + if (list_empty(&peer->allowedips_list)) + return; ++table->seq; - walk_remove_by_peer(&table->root4, peer, lock); - walk_remove_by_peer(&table->root6, peer, lock); + list_for_each_entry_safe(node, tmp, &peer->allowedips_list, peer_list) { + list_del_init(&node->peer_list); + RCU_INIT_POINTER(node->peer, NULL); + if (node->bit[0] && node->bit[1]) + continue; + child = rcu_dereference_protected( + node->bit[!rcu_access_pointer(node->bit[0])], + lockdep_is_held(lock)); + if (child) + child->parent_bit = node->parent_bit; + 
*rcu_dereference_protected(node->parent_bit, lockdep_is_held(lock)) = child; + kfree_rcu(node, rcu); + + /* TODO: Note that we currently don't walk up and down in order to + * free any potential filler nodes. This means that this function + * doesn't free up as much as it could, which could be revisited + * at some point. + */ + } } int wg_allowedips_read_node(struct allowedips_node *node, u8 ip[16], u8 *cidr) diff --git a/drivers/net/wireguard/allowedips.h b/drivers/net/wireguard/allowedips.h index e5c83cafcef4..f08f552e6852 100644 --- a/drivers/net/wireguard/allowedips.h +++ b/drivers/net/wireguard/allowedips.h @@ -15,14 +15,11 @@ struct wg_peer; struct allowedips_node { struct wg_peer __rcu *peer; struct allowedips_node __rcu *bit[2]; - /* While it may seem scandalous that we waste space for v4, - * we're alloc'ing to the nearest power of 2 anyway, so this - * doesn't actually make a difference. - */ - u8 bits[16] __aligned(__alignof(u64)); u8 cidr, bit_at_a, bit_at_b, bitlen; + u8 bits[16] __aligned(__alignof(u64)); - /* Keep rarely used list at bottom to be beyond cache line. */ + /* Keep rarely used members at bottom to be beyond cache line. */ + struct allowedips_node *__rcu *parent_bit; /* XXX: this puts us at 68->128 bytes instead of 60->64 bytes!! */ union { struct list_head peer_list; struct rcu_head rcu; From patchwork Fri Jun 4 15:17:37 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jason A. Donenfeld" X-Patchwork-Id: 455001 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_NONE, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 66B14C47083 for ; Fri, 4 Jun 2021 15:18:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4F8B761404 for ; Fri, 4 Jun 2021 15:18:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230478AbhFDPUE (ORCPT ); Fri, 4 Jun 2021 11:20:04 -0400 Received: from mail.zx2c4.com ([104.131.123.232]:54988 "EHLO mail.zx2c4.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230366AbhFDPUD (ORCPT ); Fri, 4 Jun 2021 11:20:03 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=zx2c4.com; s=20210105; t=1622819894; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=u8KcRiC5mWVVM2THyvIX2Xl3zmS3c02PavxpNfenwpg=; b=iMjj2YQdc5ia0L6TzguFoF3wn3cD0KIqW4Hs2sSaULgAkb0Y3Jy+PIjvh+/1e57kJkgxcJ bwghpsxKjPpfVQgGskd4elWNrMlMXq9IyfYIIccioSbN7XmF2xHUKhEwIXPNfYK5lNoM83 KSH/+LOaWUMawQ0b70LQg1cOLU7u/iw= Received: by mail.zx2c4.com (ZX2C4 Mail Server) with ESMTPSA id 9f590534 (TLSv1.3:AEAD-AES256-GCM-SHA384:256:NO); Fri, 4 Jun 2021 15:18:13 +0000 (UTC) From: "Jason A. Donenfeld" To: netdev@vger.kernel.org, davem@davemloft.net, kuba@kernel.org Cc: "Jason A. 
Donenfeld" , Arnd Bergmann , Matthew Wilcox , stable@vger.kernel.org Subject: [PATCH net 8/9] wireguard: allowedips: allocate nodes in kmem_cache Date: Fri, 4 Jun 2021 17:17:37 +0200 Message-Id: <20210604151738.220232-9-Jason@zx2c4.com> In-Reply-To: <20210604151738.220232-1-Jason@zx2c4.com> References: <20210604151738.220232-1-Jason@zx2c4.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org The previous commit moved from O(n) to O(1) for removal, but in the process introduced an additional pointer member to a struct that increased the size from 60 to 68 bytes, putting nodes in the 128-byte slab. With deployed systems having as many as 2 million nodes, this represents a significant doubling in memory usage (128 MiB -> 256 MiB). Fix this by using our own kmem_cache, that's sized exactly right. This also makes wireguard's memory usage more transparent in tools like slabtop and /proc/slabinfo. Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Suggested-by: Arnd Bergmann Suggested-by: Matthew Wilcox Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld --- drivers/net/wireguard/allowedips.c | 31 ++++++++++++++++++++++++------ drivers/net/wireguard/allowedips.h | 5 ++++- drivers/net/wireguard/main.c | 10 +++++++++- 3 files changed, 38 insertions(+), 8 deletions(-) diff --git a/drivers/net/wireguard/allowedips.c b/drivers/net/wireguard/allowedips.c index 2785cfd3a221..c540dce8d224 100644 --- a/drivers/net/wireguard/allowedips.c +++ b/drivers/net/wireguard/allowedips.c @@ -6,6 +6,8 @@ #include "allowedips.h" #include "peer.h" +static struct kmem_cache *node_cache; + static void swap_endian(u8 *dst, const u8 *src, u8 bits) { if (bits == 32) { @@ -40,6 +42,11 @@ static void push_rcu(struct allowedips_node **stack, } } +static void node_free_rcu(struct rcu_head *rcu) +{ + kmem_cache_free(node_cache, container_of(rcu, struct allowedips_node, rcu)); +} + static void root_free_rcu(struct rcu_head *rcu) { struct allowedips_node *node, *stack[128] = { @@ -49,7 +56,7 @@ static void root_free_rcu(struct rcu_head *rcu) while (len > 0 && (node = stack[--len])) { push_rcu(stack, node->bit[0], &len); push_rcu(stack, node->bit[1], &len); - kfree(node); + kmem_cache_free(node_cache, node); } } @@ -164,7 +171,7 @@ static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, return -EINVAL; if (!rcu_access_pointer(*trie)) { - node = kzalloc(sizeof(*node), GFP_KERNEL); + node = kmem_cache_zalloc(node_cache, GFP_KERNEL); if (unlikely(!node)) return -ENOMEM; RCU_INIT_POINTER(node->peer, peer); @@ -180,7 +187,7 @@ static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, return 0; } - newnode = kzalloc(sizeof(*newnode), GFP_KERNEL); + newnode = kmem_cache_zalloc(node_cache, GFP_KERNEL); if (unlikely(!newnode)) return -ENOMEM; RCU_INIT_POINTER(newnode->peer, peer); @@ -213,10 +220,10 @@ static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, return 0; } - node = kzalloc(sizeof(*node), GFP_KERNEL); + node = kmem_cache_zalloc(node_cache, GFP_KERNEL); if (unlikely(!node)) { list_del(&newnode->peer_list); - kfree(newnode); + kmem_cache_free(node_cache, newnode); return -ENOMEM; } INIT_LIST_HEAD(&node->peer_list); @@ -306,7 +313,7 @@ void wg_allowedips_remove_by_peer(struct allowedips *table, if (child) child->parent_bit = node->parent_bit; *rcu_dereference_protected(node->parent_bit, lockdep_is_held(lock)) = child; - kfree_rcu(node, rcu); + call_rcu(&node->rcu, node_free_rcu); /* TODO: Note that we currently don't walk 
up and down in order to * free any potential filler nodes. This means that this function @@ -350,4 +357,16 @@ struct wg_peer *wg_allowedips_lookup_src(struct allowedips *table, return NULL; } +int __init wg_allowedips_slab_init(void) +{ + node_cache = KMEM_CACHE(allowedips_node, 0); + return node_cache ? 0 : -ENOMEM; +} + +void wg_allowedips_slab_uninit(void) +{ + rcu_barrier(); + kmem_cache_destroy(node_cache); +} + #include "selftest/allowedips.c" diff --git a/drivers/net/wireguard/allowedips.h b/drivers/net/wireguard/allowedips.h index f08f552e6852..32d611aaf3cc 100644 --- a/drivers/net/wireguard/allowedips.h +++ b/drivers/net/wireguard/allowedips.h @@ -19,7 +19,7 @@ struct allowedips_node { u8 bits[16] __aligned(__alignof(u64)); /* Keep rarely used members at bottom to be beyond cache line. */ - struct allowedips_node *__rcu *parent_bit; /* XXX: this puts us at 68->128 bytes instead of 60->64 bytes!! */ + struct allowedips_node *__rcu *parent_bit; union { struct list_head peer_list; struct rcu_head rcu; @@ -53,4 +53,7 @@ struct wg_peer *wg_allowedips_lookup_src(struct allowedips *table, bool wg_allowedips_selftest(void); #endif +int wg_allowedips_slab_init(void); +void wg_allowedips_slab_uninit(void); + #endif /* _WG_ALLOWEDIPS_H */ diff --git a/drivers/net/wireguard/main.c b/drivers/net/wireguard/main.c index 0a3ebfdac794..75dbe77b0b4b 100644 --- a/drivers/net/wireguard/main.c +++ b/drivers/net/wireguard/main.c @@ -21,10 +21,15 @@ static int __init mod_init(void) { int ret; + ret = wg_allowedips_slab_init(); + if (ret < 0) + goto err_allowedips; + #ifdef DEBUG + ret = -ENOTRECOVERABLE; if (!wg_allowedips_selftest() || !wg_packet_counter_selftest() || !wg_ratelimiter_selftest()) - return -ENOTRECOVERABLE; + goto err_peer; #endif wg_noise_init(); @@ -50,6 +55,8 @@ static int __init mod_init(void) err_device: wg_peer_uninit(); err_peer: + wg_allowedips_slab_uninit(); +err_allowedips: return ret; } @@ -58,6 +65,7 @@ static void __exit mod_exit(void) wg_genetlink_uninit(); wg_device_uninit(); wg_peer_uninit(); + wg_allowedips_slab_uninit(); } module_init(mod_init); From patchwork Fri Jun 4 15:17:38 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jason A. 
Donenfeld" X-Patchwork-Id: 454540 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.7 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_CR_TRAILER, INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_NONE, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BF238C4708F for ; Fri, 4 Jun 2021 15:18:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A940B61400 for ; Fri, 4 Jun 2021 15:18:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230523AbhFDPUG (ORCPT ); Fri, 4 Jun 2021 11:20:06 -0400 Received: from mail.zx2c4.com ([104.131.123.232]:54988 "EHLO mail.zx2c4.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230374AbhFDPUE (ORCPT ); Fri, 4 Jun 2021 11:20:04 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=zx2c4.com; s=20210105; t=1622819896; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=hNsbtY2jm3XIM081rnM4heIU6va6uBjsay4ai2kBiX4=; b=CiEmJiY+OZ35qNrub90nKhmYcSdYF9KS51zSru0PtaJTwpSEbkEhd4R5a59kyT8O9nHS2w GdHm0yt+xD5WKRrecsjkFgg5Jf7RQs6WCDRKCx7GCbNfu1sZ2Fjf8A/Boptcwru66FdTuQ ODQzUS4f50fhhP9wUvJA8cilaigBOlU= Received: by mail.zx2c4.com (ZX2C4 Mail Server) with ESMTPSA id 552c6aa5 (TLSv1.3:AEAD-AES256-GCM-SHA384:256:NO); Fri, 4 Jun 2021 15:18:16 +0000 (UTC) From: "Jason A. Donenfeld" To: netdev@vger.kernel.org, davem@davemloft.net, kuba@kernel.org Cc: "Jason A. Donenfeld" , stable@vger.kernel.org Subject: [PATCH net 9/9] wireguard: allowedips: free empty intermediate nodes when removing single node Date: Fri, 4 Jun 2021 17:17:38 +0200 Message-Id: <20210604151738.220232-10-Jason@zx2c4.com> In-Reply-To: <20210604151738.220232-1-Jason@zx2c4.com> References: <20210604151738.220232-1-Jason@zx2c4.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org When removing single nodes, it's possible that that node's parent is an empty intermediate node, in which case, it too should be removed. Otherwise the trie fills up and never is fully emptied, leading to gradual memory leaks over time for tries that are modified often. There was originally code to do this, but was removed during refactoring in 2016 and never reworked. Now that we have proper parent pointers from the previous commits, we can implement this properly. In order to reduce branching and expensive comparisons, we want to keep the double pointer for parent assignment (which lets us easily chain up to the root), but we still need to actually get the parent's base address. So encode the bit number into the last two bits of the pointer, and pack and unpack it as needed. This is a little bit clumsy but is the fastest and less memory wasteful of the compromises. Note that we align the root struct here to a minimum of 4, because it's embedded into a larger struct, and we're relying on having the bottom two bits for our flag, which would only be 16-bit aligned on m68k. The existing macro-based helpers were a bit unwieldy for adding the bit packing to, so this commit replaces them with safer and clearer ordinary functions. 
We add a test to the randomized/fuzzer part of the selftests, to free the randomized tries by-peer, refuzz it, and repeat, until it's supposed to be empty, and then then see if that actually resulted in the whole thing being emptied. That combined with kmemcheck should hopefully make sure this commit is doing what it should. Along the way this resulted in various other cleanups of the tests and fixes for recent graphviz. Fixes: e7096c131e51 ("net: WireGuard secure network tunnel") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld --- drivers/net/wireguard/allowedips.c | 102 ++++++------ drivers/net/wireguard/allowedips.h | 4 +- drivers/net/wireguard/selftest/allowedips.c | 162 ++++++++++---------- 3 files changed, 137 insertions(+), 131 deletions(-) diff --git a/drivers/net/wireguard/allowedips.c b/drivers/net/wireguard/allowedips.c index c540dce8d224..b7197e80f226 100644 --- a/drivers/net/wireguard/allowedips.c +++ b/drivers/net/wireguard/allowedips.c @@ -30,8 +30,11 @@ static void copy_and_assign_cidr(struct allowedips_node *node, const u8 *src, node->bitlen = bits; memcpy(node->bits, src, bits / 8U); } -#define CHOOSE_NODE(parent, key) \ - parent->bit[(key[parent->bit_at_a] >> parent->bit_at_b) & 1] + +static inline u8 choose(struct allowedips_node *node, const u8 *key) +{ + return (key[node->bit_at_a] >> node->bit_at_b) & 1; +} static void push_rcu(struct allowedips_node **stack, struct allowedips_node __rcu *p, unsigned int *len) @@ -112,7 +115,7 @@ static struct allowedips_node *find_node(struct allowedips_node *trie, u8 bits, found = node; if (node->cidr == bits) break; - node = rcu_dereference_bh(CHOOSE_NODE(node, key)); + node = rcu_dereference_bh(node->bit[choose(node, key)]); } return found; } @@ -144,8 +147,7 @@ static bool node_placement(struct allowedips_node __rcu *trie, const u8 *key, u8 cidr, u8 bits, struct allowedips_node **rnode, struct mutex *lock) { - struct allowedips_node *node = rcu_dereference_protected(trie, - lockdep_is_held(lock)); + struct allowedips_node *node = rcu_dereference_protected(trie, lockdep_is_held(lock)); struct allowedips_node *parent = NULL; bool exact = false; @@ -155,13 +157,24 @@ static bool node_placement(struct allowedips_node __rcu *trie, const u8 *key, exact = true; break; } - node = rcu_dereference_protected(CHOOSE_NODE(parent, key), - lockdep_is_held(lock)); + node = rcu_dereference_protected(parent->bit[choose(parent, key)], lockdep_is_held(lock)); } *rnode = parent; return exact; } +static inline void connect_node(struct allowedips_node **parent, u8 bit, struct allowedips_node *node) +{ + node->parent_bit_packed = (unsigned long)parent | bit; + rcu_assign_pointer(*parent, node); +} + +static inline void choose_and_connect_node(struct allowedips_node *parent, struct allowedips_node *node) +{ + u8 bit = choose(parent, node->bits); + connect_node(&parent->bit[bit], bit, node); +} + static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, u8 cidr, struct wg_peer *peer, struct mutex *lock) { @@ -177,8 +190,7 @@ static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, RCU_INIT_POINTER(node->peer, peer); list_add_tail(&node->peer_list, &peer->allowedips_list); copy_and_assign_cidr(node, key, cidr, bits); - rcu_assign_pointer(node->parent_bit, trie); - rcu_assign_pointer(*trie, node); + connect_node(trie, 2, node); return 0; } if (node_placement(*trie, key, cidr, bits, &node, lock)) { @@ -197,10 +209,10 @@ static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, if (!node) { 
down = rcu_dereference_protected(*trie, lockdep_is_held(lock)); } else { - down = rcu_dereference_protected(CHOOSE_NODE(node, key), lockdep_is_held(lock)); + const u8 bit = choose(node, key); + down = rcu_dereference_protected(node->bit[bit], lockdep_is_held(lock)); if (!down) { - rcu_assign_pointer(newnode->parent_bit, &CHOOSE_NODE(node, key)); - rcu_assign_pointer(CHOOSE_NODE(node, key), newnode); + connect_node(&node->bit[bit], bit, newnode); return 0; } } @@ -208,15 +220,11 @@ static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, parent = node; if (newnode->cidr == cidr) { - rcu_assign_pointer(down->parent_bit, &CHOOSE_NODE(newnode, down->bits)); - rcu_assign_pointer(CHOOSE_NODE(newnode, down->bits), down); - if (!parent) { - rcu_assign_pointer(newnode->parent_bit, trie); - rcu_assign_pointer(*trie, newnode); - } else { - rcu_assign_pointer(newnode->parent_bit, &CHOOSE_NODE(parent, newnode->bits)); - rcu_assign_pointer(CHOOSE_NODE(parent, newnode->bits), newnode); - } + choose_and_connect_node(newnode, down); + if (!parent) + connect_node(trie, 2, newnode); + else + choose_and_connect_node(parent, newnode); return 0; } @@ -229,17 +237,12 @@ static int add(struct allowedips_node __rcu **trie, u8 bits, const u8 *key, INIT_LIST_HEAD(&node->peer_list); copy_and_assign_cidr(node, newnode->bits, cidr, bits); - rcu_assign_pointer(down->parent_bit, &CHOOSE_NODE(node, down->bits)); - rcu_assign_pointer(CHOOSE_NODE(node, down->bits), down); - rcu_assign_pointer(newnode->parent_bit, &CHOOSE_NODE(node, newnode->bits)); - rcu_assign_pointer(CHOOSE_NODE(node, newnode->bits), newnode); - if (!parent) { - rcu_assign_pointer(node->parent_bit, trie); - rcu_assign_pointer(*trie, node); - } else { - rcu_assign_pointer(node->parent_bit, &CHOOSE_NODE(parent, node->bits)); - rcu_assign_pointer(CHOOSE_NODE(parent, node->bits), node); - } + choose_and_connect_node(node, down); + choose_and_connect_node(node, newnode); + if (!parent) + connect_node(trie, 2, node); + else + choose_and_connect_node(parent, node); return 0; } @@ -297,7 +300,8 @@ int wg_allowedips_insert_v6(struct allowedips *table, const struct in6_addr *ip, void wg_allowedips_remove_by_peer(struct allowedips *table, struct wg_peer *peer, struct mutex *lock) { - struct allowedips_node *node, *child, *tmp; + struct allowedips_node *node, *child, **parent_bit, *parent, *tmp; + bool free_parent; if (list_empty(&peer->allowedips_list)) return; @@ -307,19 +311,29 @@ void wg_allowedips_remove_by_peer(struct allowedips *table, RCU_INIT_POINTER(node->peer, NULL); if (node->bit[0] && node->bit[1]) continue; - child = rcu_dereference_protected( - node->bit[!rcu_access_pointer(node->bit[0])], - lockdep_is_held(lock)); + child = rcu_dereference_protected(node->bit[!rcu_access_pointer(node->bit[0])], + lockdep_is_held(lock)); if (child) - child->parent_bit = node->parent_bit; - *rcu_dereference_protected(node->parent_bit, lockdep_is_held(lock)) = child; + child->parent_bit_packed = node->parent_bit_packed; + parent_bit = (struct allowedips_node **)(node->parent_bit_packed & ~3UL); + *parent_bit = child; + parent = (void *)parent_bit - + offsetof(struct allowedips_node, bit[node->parent_bit_packed & 1]); + free_parent = !rcu_access_pointer(node->bit[0]) && + !rcu_access_pointer(node->bit[1]) && + (node->parent_bit_packed & 3) <= 1 && + !rcu_access_pointer(parent->peer); + if (free_parent) + child = rcu_dereference_protected( + parent->bit[!(node->parent_bit_packed & 1)], + lockdep_is_held(lock)); call_rcu(&node->rcu, node_free_rcu); - - /* 
TODO: Note that we currently don't walk up and down in order to - * free any potential filler nodes. This means that this function - * doesn't free up as much as it could, which could be revisited - * at some point. - */ + if (!free_parent) + continue; + if (child) + child->parent_bit_packed = parent->parent_bit_packed; + *(struct allowedips_node **)(parent->parent_bit_packed & ~3UL) = child; + call_rcu(&parent->rcu, node_free_rcu); } } diff --git a/drivers/net/wireguard/allowedips.h b/drivers/net/wireguard/allowedips.h index 32d611aaf3cc..2346c797eb4d 100644 --- a/drivers/net/wireguard/allowedips.h +++ b/drivers/net/wireguard/allowedips.h @@ -19,7 +19,7 @@ struct allowedips_node { u8 bits[16] __aligned(__alignof(u64)); /* Keep rarely used members at bottom to be beyond cache line. */ - struct allowedips_node *__rcu *parent_bit; + unsigned long parent_bit_packed; union { struct list_head peer_list; struct rcu_head rcu; @@ -30,7 +30,7 @@ struct allowedips { struct allowedips_node __rcu *root4; struct allowedips_node __rcu *root6; u64 seq; -}; +} __aligned(4); /* We pack the lower 2 bits of &root, but m68k only gives 16-bit alignment. */ void wg_allowedips_init(struct allowedips *table); void wg_allowedips_free(struct allowedips *table, struct mutex *mutex); diff --git a/drivers/net/wireguard/selftest/allowedips.c b/drivers/net/wireguard/selftest/allowedips.c index 0d2a43a2d400..e173204ae7d7 100644 --- a/drivers/net/wireguard/selftest/allowedips.c +++ b/drivers/net/wireguard/selftest/allowedips.c @@ -19,32 +19,22 @@ #include -static __init void swap_endian_and_apply_cidr(u8 *dst, const u8 *src, u8 bits, - u8 cidr) -{ - swap_endian(dst, src, bits); - memset(dst + (cidr + 7) / 8, 0, bits / 8 - (cidr + 7) / 8); - if (cidr) - dst[(cidr + 7) / 8 - 1] &= ~0U << ((8 - (cidr % 8)) % 8); -} - static __init void print_node(struct allowedips_node *node, u8 bits) { char *fmt_connection = KERN_DEBUG "\t\"%p/%d\" -> \"%p/%d\";\n"; - char *fmt_declaration = KERN_DEBUG - "\t\"%p/%d\"[style=%s, color=\"#%06x\"];\n"; + char *fmt_declaration = KERN_DEBUG "\t\"%p/%d\"[style=%s, color=\"#%06x\"];\n"; + u8 ip1[16], ip2[16], cidr1, cidr2; char *style = "dotted"; - u8 ip1[16], ip2[16]; u32 color = 0; + if (node == NULL) + return; if (bits == 32) { fmt_connection = KERN_DEBUG "\t\"%pI4/%d\" -> \"%pI4/%d\";\n"; - fmt_declaration = KERN_DEBUG - "\t\"%pI4/%d\"[style=%s, color=\"#%06x\"];\n"; + fmt_declaration = KERN_DEBUG "\t\"%pI4/%d\"[style=%s, color=\"#%06x\"];\n"; } else if (bits == 128) { fmt_connection = KERN_DEBUG "\t\"%pI6/%d\" -> \"%pI6/%d\";\n"; - fmt_declaration = KERN_DEBUG - "\t\"%pI6/%d\"[style=%s, color=\"#%06x\"];\n"; + fmt_declaration = KERN_DEBUG "\t\"%pI6/%d\"[style=%s, color=\"#%06x\"];\n"; } if (node->peer) { hsiphash_key_t key = { { 0 } }; @@ -55,24 +45,20 @@ static __init void print_node(struct allowedips_node *node, u8 bits) hsiphash_1u32(0xabad1dea, &key) % 200; style = "bold"; } - swap_endian_and_apply_cidr(ip1, node->bits, bits, node->cidr); - printk(fmt_declaration, ip1, node->cidr, style, color); + wg_allowedips_read_node(node, ip1, &cidr1); + printk(fmt_declaration, ip1, cidr1, style, color); if (node->bit[0]) { - swap_endian_and_apply_cidr(ip2, - rcu_dereference_raw(node->bit[0])->bits, bits, - node->cidr); - printk(fmt_connection, ip1, node->cidr, ip2, - rcu_dereference_raw(node->bit[0])->cidr); - print_node(rcu_dereference_raw(node->bit[0]), bits); + wg_allowedips_read_node(rcu_dereference_raw(node->bit[0]), ip2, &cidr2); + printk(fmt_connection, ip1, cidr1, ip2, cidr2); } if 
(node->bit[1]) { - swap_endian_and_apply_cidr(ip2, - rcu_dereference_raw(node->bit[1])->bits, - bits, node->cidr); - printk(fmt_connection, ip1, node->cidr, ip2, - rcu_dereference_raw(node->bit[1])->cidr); - print_node(rcu_dereference_raw(node->bit[1]), bits); + wg_allowedips_read_node(rcu_dereference_raw(node->bit[1]), ip2, &cidr2); + printk(fmt_connection, ip1, cidr1, ip2, cidr2); } + if (node->bit[0]) + print_node(rcu_dereference_raw(node->bit[0]), bits); + if (node->bit[1]) + print_node(rcu_dereference_raw(node->bit[1]), bits); } static __init void print_tree(struct allowedips_node __rcu *top, u8 bits) @@ -121,8 +107,8 @@ static __init inline union nf_inet_addr horrible_cidr_to_mask(u8 cidr) { union nf_inet_addr mask; - memset(&mask, 0x00, 128 / 8); - memset(&mask, 0xff, cidr / 8); + memset(&mask, 0, sizeof(mask)); + memset(&mask.all, 0xff, cidr / 8); if (cidr % 32) mask.all[cidr / 32] = (__force u32)htonl( (0xFFFFFFFFUL << (32 - (cidr % 32))) & 0xFFFFFFFFUL); @@ -149,42 +135,36 @@ horrible_mask_self(struct horrible_allowedips_node *node) } static __init inline bool -horrible_match_v4(const struct horrible_allowedips_node *node, - struct in_addr *ip) +horrible_match_v4(const struct horrible_allowedips_node *node, struct in_addr *ip) { return (ip->s_addr & node->mask.ip) == node->ip.ip; } static __init inline bool -horrible_match_v6(const struct horrible_allowedips_node *node, - struct in6_addr *ip) +horrible_match_v6(const struct horrible_allowedips_node *node, struct in6_addr *ip) { - return (ip->in6_u.u6_addr32[0] & node->mask.ip6[0]) == - node->ip.ip6[0] && - (ip->in6_u.u6_addr32[1] & node->mask.ip6[1]) == - node->ip.ip6[1] && - (ip->in6_u.u6_addr32[2] & node->mask.ip6[2]) == - node->ip.ip6[2] && + return (ip->in6_u.u6_addr32[0] & node->mask.ip6[0]) == node->ip.ip6[0] && + (ip->in6_u.u6_addr32[1] & node->mask.ip6[1]) == node->ip.ip6[1] && + (ip->in6_u.u6_addr32[2] & node->mask.ip6[2]) == node->ip.ip6[2] && (ip->in6_u.u6_addr32[3] & node->mask.ip6[3]) == node->ip.ip6[3]; } static __init void -horrible_insert_ordered(struct horrible_allowedips *table, - struct horrible_allowedips_node *node) +horrible_insert_ordered(struct horrible_allowedips *table, struct horrible_allowedips_node *node) { struct horrible_allowedips_node *other = NULL, *where = NULL; u8 my_cidr = horrible_mask_to_cidr(node->mask); hlist_for_each_entry(other, &table->head, table) { - if (!memcmp(&other->mask, &node->mask, - sizeof(union nf_inet_addr)) && - !memcmp(&other->ip, &node->ip, - sizeof(union nf_inet_addr)) && - other->ip_version == node->ip_version) { + if (other->ip_version == node->ip_version && + !memcmp(&other->mask, &node->mask, sizeof(union nf_inet_addr)) && + !memcmp(&other->ip, &node->ip, sizeof(union nf_inet_addr))) { other->value = node->value; kfree(node); return; } + } + hlist_for_each_entry(other, &table->head, table) { where = other; if (horrible_mask_to_cidr(other->mask) <= my_cidr) break; @@ -201,8 +181,7 @@ static __init int horrible_allowedips_insert_v4(struct horrible_allowedips *table, struct in_addr *ip, u8 cidr, void *value) { - struct horrible_allowedips_node *node = kzalloc(sizeof(*node), - GFP_KERNEL); + struct horrible_allowedips_node *node = kzalloc(sizeof(*node), GFP_KERNEL); if (unlikely(!node)) return -ENOMEM; @@ -219,8 +198,7 @@ static __init int horrible_allowedips_insert_v6(struct horrible_allowedips *table, struct in6_addr *ip, u8 cidr, void *value) { - struct horrible_allowedips_node *node = kzalloc(sizeof(*node), - GFP_KERNEL); + struct horrible_allowedips_node *node = 
kzalloc(sizeof(*node), GFP_KERNEL); if (unlikely(!node)) return -ENOMEM; @@ -234,39 +212,43 @@ horrible_allowedips_insert_v6(struct horrible_allowedips *table, } static __init void * -horrible_allowedips_lookup_v4(struct horrible_allowedips *table, - struct in_addr *ip) +horrible_allowedips_lookup_v4(struct horrible_allowedips *table, struct in_addr *ip) { struct horrible_allowedips_node *node; - void *ret = NULL; hlist_for_each_entry(node, &table->head, table) { - if (node->ip_version != 4) - continue; - if (horrible_match_v4(node, ip)) { - ret = node->value; - break; - } + if (node->ip_version == 4 && horrible_match_v4(node, ip)) + return node->value; } - return ret; + return NULL; } static __init void * -horrible_allowedips_lookup_v6(struct horrible_allowedips *table, - struct in6_addr *ip) +horrible_allowedips_lookup_v6(struct horrible_allowedips *table, struct in6_addr *ip) { struct horrible_allowedips_node *node; - void *ret = NULL; hlist_for_each_entry(node, &table->head, table) { - if (node->ip_version != 6) + if (node->ip_version == 6 && horrible_match_v6(node, ip)) + return node->value; + } + return NULL; +} + + +static __init void +horrible_allowedips_remove_by_value(struct horrible_allowedips *table, void *value) +{ + struct horrible_allowedips_node *node; + struct hlist_node *h; + + hlist_for_each_entry_safe(node, h, &table->head, table) { + if (node->value != value) continue; - if (horrible_match_v6(node, ip)) { - ret = node->value; - break; - } + hlist_del(&node->table); + kfree(node); } - return ret; + } static __init bool randomized_test(void) @@ -397,23 +379,33 @@ static __init bool randomized_test(void) print_tree(t.root6, 128); } - for (i = 0; i < NUM_QUERIES; ++i) { - prandom_bytes(ip, 4); - if (lookup(t.root4, 32, ip) != - horrible_allowedips_lookup_v4(&h, (struct in_addr *)ip)) { - pr_err("allowedips random self-test: FAIL\n"); - goto free; + for (j = 0;; ++j) { + for (i = 0; i < NUM_QUERIES; ++i) { + prandom_bytes(ip, 4); + if (lookup(t.root4, 32, ip) != horrible_allowedips_lookup_v4(&h, (struct in_addr *)ip)) { + horrible_allowedips_lookup_v4(&h, (struct in_addr *)ip); + pr_err("allowedips random v4 self-test: FAIL\n"); + goto free; + } + prandom_bytes(ip, 16); + if (lookup(t.root6, 128, ip) != horrible_allowedips_lookup_v6(&h, (struct in6_addr *)ip)) { + pr_err("allowedips random v6 self-test: FAIL\n"); + goto free; + } } + if (j >= NUM_PEERS) + break; + mutex_lock(&mutex); + wg_allowedips_remove_by_peer(&t, peers[j], &mutex); + mutex_unlock(&mutex); + horrible_allowedips_remove_by_value(&h, peers[j]); } - for (i = 0; i < NUM_QUERIES; ++i) { - prandom_bytes(ip, 16); - if (lookup(t.root6, 128, ip) != - horrible_allowedips_lookup_v6(&h, (struct in6_addr *)ip)) { - pr_err("allowedips random self-test: FAIL\n"); - goto free; - } + if (t.root4 || t.root6) { + pr_err("allowedips random self-test removal: FAIL\n"); + goto free; } + ret = true; free:
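
The removal path in this patch leans on a small pointer-packing trick: because every child slot (and, thanks to the new __aligned(4), the trie root as well) sits on at least a 4-byte boundary, the low two bits of parent_bit_packed are free to record which slot of the parent the node hangs from, and the parent node itself can be recovered from the slot address with offsetof(). The following userspace sketch reimplements just that trick with illustrative names (struct node, connect_node(), parent_of()); it is a minimal model of the scheme, not the kernel code.

#include <stddef.h>
#include <stdio.h>

struct node {
	struct node *bit[2];
	unsigned long parent_bit_packed;	/* slot address | slot index */
};

/* Pointer-sized slots are at least 4-byte aligned here, so the low 2 bits
 * of a slot address are always zero and can carry the slot index (0 or 1
 * for a child slot, 2 for the trie root). */
static void connect_node(struct node **slot, unsigned int idx, struct node *child)
{
	child->parent_bit_packed = (unsigned long)slot | idx;
	*slot = child;
}

static struct node *parent_of(const struct node *child)
{
	unsigned long p = child->parent_bit_packed;

	if ((p & 3) > 1)	/* slot is the trie root: there is no parent node */
		return NULL;
	/* Walk back from the address of parent->bit[idx] to the parent itself. */
	return (struct node *)((char *)(p & ~3UL) - offsetof(struct node, bit[p & 1]));
}

int main(void)
{
	struct node root = { { NULL, NULL }, 0 }, leaf = { { NULL, NULL }, 0 };
	struct node *trie = NULL;

	connect_node(&trie, 2, &root);		/* root hangs directly off the trie pointer */
	connect_node(&root.bit[1], 1, &leaf);	/* leaf is root's bit[1] child */

	printf("parent of leaf is root: %d\n", parent_of(&leaf) == &root);
	printf("root has no parent node: %d\n", parent_of(&root) == NULL);
	return 0;
}

With the parent recoverable from the packed slot pointer alone, the remove path can free an empty intermediate node after unlinking a leaf without keeping a separate back-pointer to the parent node.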
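
The reworked selftest pairs the trie with the deliberately slow "horrible" linear implementation and now also exercises removal: after each wg_allowedips_remove_by_peer()/horrible_allowedips_remove_by_value() pair it re-runs the randomized lookups, so a leaked or dangling node shows up as a lookup mismatch or as a non-empty root at the end. The sketch below shows that differential-testing pattern with stand-in structures (a direct-indexed table against an unsorted array), not the WireGuard ones.

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_KEYS    64
#define NUM_QUERIES 1024

static bool fast[256];			/* "real" structure: direct-indexed table */
static unsigned char slow[NUM_KEYS];	/* "horrible" reference: unsorted array */
static int slow_len;

static bool slow_lookup(unsigned char k)
{
	for (int i = 0; i < slow_len; ++i)
		if (slow[i] == k)
			return true;
	return false;
}

static void slow_remove(unsigned char k)
{
	for (int i = 0; i < slow_len; ++i)
		if (slow[i] == k)
			slow[i] = slow[--slow_len];
}

int main(void)
{
	unsigned char keys[NUM_KEYS];

	srand(1);
	for (int i = 0; i < NUM_KEYS; ++i) {
		keys[i] = rand() & 0xff;
		fast[keys[i]] = true;
		if (!slow_lookup(keys[i]))
			slow[slow_len++] = keys[i];
	}

	/* As in the patched selftest: query, remove one entry, query again. */
	for (int j = 0;; ++j) {
		for (int i = 0; i < NUM_QUERIES; ++i) {
			unsigned char q = rand() & 0xff;
			if (fast[q] != slow_lookup(q)) {
				fprintf(stderr, "self-test: FAIL after %d removals\n", j);
				return 1;
			}
		}
		if (j >= NUM_KEYS)
			break;
		fast[keys[j]] = false;
		slow_remove(keys[j]);
	}
	printf("self-test: OK\n");
	return 0;
}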