Message ID | 20210117102319.193756-1-dong.menglong@zte.com.cn |
---|---|
State | New |
Series | [net-next] net: core: Namespace-ify sysctl_wmem_default and sysctl_rmem_default |
On Sun, Jan 17, 2021 at 06:23:19PM +0800, menglong8.dong@gmail.com wrote:
> From: Menglong Dong <dong.menglong@zte.com.cn>
>
> For now, sysctl_wmem_default and sysctl_rmem_default are globally
> unified. It's not convenient in some case. For example, when we
> use docker and try to control the default udp socket receive buffer
> for each container.
>
> For that reason, make sysctl_wmem_default and sysctl_rmem_default
> per-namespace.
>
> Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
> ---

Hey Menglong,

I was about to review the two patches you sent:

1. [PATCH net-next] net: core: Namespace-ify sysctl_rmem_max and sysctl_wmem_max
   https://lore.kernel.org/lkml/20210117104743.217194-1-dong.menglong@zte.com.cn
2. [PATCH net-next] net: core: Namespace-ify sysctl_wmem_default and sysctl_rmem_default
   https://lore.kernel.org/lkml/20210117102319.193756-1-dong.menglong@zte.com.cn

and I had to spend some time figuring out that 2. is dependent on 1. I
first thought I got the base wrong.

I'd suggest you resend both patches as a part of a single series with a
cover letter mentioning the goal and use-case for these changes and also
pass --base=<base-commit>
when creating the patch series which makes it way easier to figure out
what to apply it to when wanting to review a series in the larger
context of a tree.

Thanks!
Christian

On Mon, 18 Jan 2021 12:15:18 +0100 Christian Brauner wrote:
> On Sun, Jan 17, 2021 at 06:23:19PM +0800, menglong8.dong@gmail.com wrote:
> > From: Menglong Dong <dong.menglong@zte.com.cn>
> >
> > For now, sysctl_wmem_default and sysctl_rmem_default are globally
> > unified. It's not convenient in some case. For example, when we
> > use docker and try to control the default udp socket receive buffer
> > for each container.
> >
> > For that reason, make sysctl_wmem_default and sysctl_rmem_default
> > per-namespace.
> >
> > Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
> > ---
>
> Hey Menglong,
>
> I was about to review the two patches you sent:
>
> 1. [PATCH net-next] net: core: Namespace-ify sysctl_rmem_max and sysctl_wmem_max
>    https://lore.kernel.org/lkml/20210117104743.217194-1-dong.menglong@zte.com.cn
> 2. [PATCH net-next] net: core: Namespace-ify sysctl_wmem_default and sysctl_rmem_default
>    https://lore.kernel.org/lkml/20210117102319.193756-1-dong.menglong@zte.com.cn

And perhaps

0. [PATCH net-next] net: core: init every ctl_table in netns_core_table

?

I'm dropping these three from patchwork please follow Christian
suggestions on how to repost properly, thanks!

> and I had to spend some time figuring out that 2. is dependent on 1. I
> first thought I got the base wrong.
>
> I'd suggest you resend both patches as a part of a single series with a
> cover letter mentioning the goal and use-case for these changes and also
> pass --base=<base-commit>
> when creating the patch series which makes it way easier to figure out
> what to apply it to when wanting to review a series in the larger
> context of a tree.

On Mon, 18 Jan 2021 13:15:28 -0800 Jakub Kicinski wrote:
> On Mon, 18 Jan 2021 12:15:18 +0100 Christian Brauner wrote:
> > On Sun, Jan 17, 2021 at 06:23:19PM +0800, menglong8.dong@gmail.com wrote:
> > > From: Menglong Dong <dong.menglong@zte.com.cn>
> > >
> > > For now, sysctl_wmem_default and sysctl_rmem_default are globally
> > > unified. It's not convenient in some case. For example, when we
> > > use docker and try to control the default udp socket receive buffer
> > > for each container.
> > >
> > > For that reason, make sysctl_wmem_default and sysctl_rmem_default
> > > per-namespace.
> > >
> > > Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
> > > ---
> >
> > Hey Menglong,
> >
> > I was about to review the two patches you sent:
> >
> > 1. [PATCH net-next] net: core: Namespace-ify sysctl_rmem_max and sysctl_wmem_max
> >    https://lore.kernel.org/lkml/20210117104743.217194-1-dong.menglong@zte.com.cn
> > 2. [PATCH net-next] net: core: Namespace-ify sysctl_wmem_default and sysctl_rmem_default
> >    https://lore.kernel.org/lkml/20210117102319.193756-1-dong.menglong@zte.com.cn
>
> And perhaps
>
> 0. [PATCH net-next] net: core: init every ctl_table in netns_core_table
>
> ?
>
> I'm dropping these three from patchwork please follow Christian
> suggestions on how to repost properly, thanks!

Ah, you already did.

diff --git a/include/net/netns/core.h b/include/net/netns/core.h
index 36c2d998a43c..317b47df6d08 100644
--- a/include/net/netns/core.h
+++ b/include/net/netns/core.h
@@ -9,6 +9,8 @@ struct netns_core {
 	/* core sysctls */
 	struct ctl_table_header *sysctl_hdr;
 
+	int sysctl_wmem_default;
+	int sysctl_rmem_default;
 	int sysctl_somaxconn;
 
 #ifdef CONFIG_PROC_FS
diff --git a/include/net/sock.h b/include/net/sock.h
index bdc4323ce53c..b846a6d24459 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2653,9 +2653,6 @@ extern __u32 sysctl_rmem_max;
 extern int sysctl_tstamp_allow_data;
 extern int sysctl_optmem_max;
 
-extern __u32 sysctl_wmem_default;
-extern __u32 sysctl_rmem_default;
-
 DECLARE_STATIC_KEY_FALSE(net_high_order_alloc_disable_key);
 
 static inline int sk_get_wmem0(const struct sock *sk, const struct proto *proto)
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 2ef3b4557f40..eb4ea99131d6 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -374,6 +374,8 @@ static __net_init int setup_net(struct net *net, struct user_namespace *user_ns)
 
 static int __net_init net_defaults_init_net(struct net *net)
 {
+	net->core.sysctl_rmem_default = SK_RMEM_MAX;
+	net->core.sysctl_wmem_default = SK_WMEM_MAX;
 	net->core.sysctl_somaxconn = SOMAXCONN;
 	return 0;
 }
diff --git a/net/core/sock.c b/net/core/sock.c
index bbcd4b97eddd..2421e4ea1915 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -270,8 +270,6 @@ __u32 sysctl_wmem_max __read_mostly = SK_WMEM_MAX;
 EXPORT_SYMBOL(sysctl_wmem_max);
 __u32 sysctl_rmem_max __read_mostly = SK_RMEM_MAX;
 EXPORT_SYMBOL(sysctl_rmem_max);
-__u32 sysctl_wmem_default __read_mostly = SK_WMEM_MAX;
-__u32 sysctl_rmem_default __read_mostly = SK_RMEM_MAX;
 
 /* Maximal space eaten by iovec or ancillary data plus some space */
 int sysctl_optmem_max __read_mostly = sizeof(unsigned long)*(2*UIO_MAXIOV+512);
@@ -2970,8 +2968,8 @@ void sock_init_data(struct socket *sock, struct sock *sk)
 	timer_setup(&sk->sk_timer, NULL, 0);
 
 	sk->sk_allocation = GFP_KERNEL;
-	sk->sk_rcvbuf = sysctl_rmem_default;
-	sk->sk_sndbuf = sysctl_wmem_default;
+	sk->sk_rcvbuf = sock_net(sk)->core.sysctl_rmem_default;
+	sk->sk_sndbuf = sock_net(sk)->core.sysctl_wmem_default;
 	sk->sk_state = TCP_CLOSE;
 	sk_set_socket(sk, sock);
 
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index 966d976dee84..5c1c75e42a09 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -326,22 +326,6 @@ static struct ctl_table net_core_table[] = {
 		.proc_handler = proc_dointvec_minmax,
 		.extra1 = &min_rcvbuf,
 	},
-	{
-		.procname = "wmem_default",
-		.data = &sysctl_wmem_default,
-		.maxlen = sizeof(int),
-		.mode = 0644,
-		.proc_handler = proc_dointvec_minmax,
-		.extra1 = &min_sndbuf,
-	},
-	{
-		.procname = "rmem_default",
-		.data = &sysctl_rmem_default,
-		.maxlen = sizeof(int),
-		.mode = 0644,
-		.proc_handler = proc_dointvec_minmax,
-		.extra1 = &min_rcvbuf,
-	},
 	{
 		.procname = "dev_weight",
 		.data = &weight_p,
@@ -584,6 +568,22 @@ static struct ctl_table netns_core_table[] = {
 		.extra1 = SYSCTL_ZERO,
 		.proc_handler = proc_dointvec_minmax
 	},
+	{
+		.procname = "wmem_default",
+		.data = &init_net.core.sysctl_wmem_default,
+		.maxlen = sizeof(int),
+		.mode = 0644,
+		.proc_handler = proc_dointvec_minmax,
+		.extra1 = &min_sndbuf,
+	},
+	{
+		.procname = "rmem_default",
+		.data = &init_net.core.sysctl_rmem_default,
+		.maxlen = sizeof(int),
+		.mode = 0644,
+		.proc_handler = proc_dointvec_minmax,
+		.extra1 = &min_rcvbuf,
+	},
 	{ }
 };
 
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 2ed0b01f72f0..0fbdcda6f314 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1709,7 +1709,7 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
 
 	sk->sk_protocol = ip_hdr(skb)->protocol;
 	sk->sk_bound_dev_if = arg->bound_dev_if;
-	sk->sk_sndbuf = sysctl_wmem_default;
+	sk->sk_sndbuf = sock_net(sk)->core.sysctl_wmem_default;
 	ipc.sockc.mark = fl4.flowi4_mark;
 	err = ip_append_data(sk, &fl4, ip_reply_glue_bits, arg->iov->iov_base,
 			     len, 0, &ipc, &rt, MSG_DONTWAIT);
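
For readers who want to see the effect of the change outside the kernel, the following is a minimal userspace sketch of the lookup the patch introduces: each namespace carries its own defaults, and a new socket copies them from the namespace it belongs to. It is not kernel code. The `*_model` types and functions and the `MODEL_SK_*_MAX` constants are hypothetical stand-ins invented for this illustration; the field names mirror the patch, and the numeric defaults are placeholders rather than the values the kernel computes for SK_RMEM_MAX and SK_WMEM_MAX.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder defaults (assumption); the kernel derives the real
 * SK_RMEM_MAX / SK_WMEM_MAX values from its own definitions. */
#define MODEL_SK_RMEM_MAX 212992
#define MODEL_SK_WMEM_MAX 212992

/* Simplified stand-in for struct netns_core after the patch. */
struct netns_core_model {
	int sysctl_wmem_default;
	int sysctl_rmem_default;
};

/* Simplified stand-in for struct net. */
struct net_model {
	struct netns_core_model core;
};

/* Simplified stand-in for struct sock. */
struct sock_model {
	const struct net_model *net;
	int sk_rcvbuf;
	int sk_sndbuf;
};

/* Mirrors what net_defaults_init_net() does in the patch: every new
 * namespace starts from the same built-in defaults. */
void net_defaults_init_model(struct net_model *net)
{
	assert(net != NULL);
	net->core.sysctl_rmem_default = MODEL_SK_RMEM_MAX;
	net->core.sysctl_wmem_default = MODEL_SK_WMEM_MAX;
}

/* Mirrors the changed lines in sock_init_data(): buffer sizes come from
 * the socket's own namespace instead of a single global variable. */
void sock_init_data_model(struct sock_model *sk, const struct net_model *net)
{
	assert(sk != NULL && net != NULL);
	sk->net = net;
	sk->sk_rcvbuf = net->core.sysctl_rmem_default;
	sk->sk_sndbuf = net->core.sysctl_wmem_default;
}
```

In this model, initializing two `net_model` instances, lowering `core.sysctl_rmem_default` in one of them, and then calling `sock_init_data_model()` against each yields sockets with different `sk_rcvbuf` values, which is the per-container behavior the commit message asks for.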