Message ID | 20200911143022.414783-2-nicolas.rybowski@tessares.net |
---|---|
State | New |
Series | [bpf-next,v2,1/5] bpf: expose is_mptcp flag to bpf_tcp_sock |
On Fri, Sep 11, 2020 at 9:43 AM Nicolas Rybowski <nicolas.rybowski@tessares.net> wrote:
>
> It has been observed that the kernel sockets created for the subflows
> (except the first one) are not in the same cgroup as their parents.
> That's because the additional subflows are created by kernel workers.
>
> This is a problem because eBPF programs attached to the parent's
> cgroup won't be executed for the children. The same applies to any other
> cgroup feature linked to a sk.
>
> This patch fixes this behaviour.
>
> As the subflow sockets are created by the kernel, we can't use
> 'mem_cgroup_sk_alloc' because the current context is that of the
> kworker. This is why we have to do low-level memcg manipulation, if
> required.
>
> Suggested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
> Suggested-by: Paolo Abeni <pabeni@redhat.com>
> Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
> Signed-off-by: Nicolas Rybowski <nicolas.rybowski@tessares.net>

Acked-by: Song Liu <songliubraving@fb.com>

[...]
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index e8cac2655c82..535a3f9f8cfc 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -1130,6 +1130,30 @@ int __mptcp_subflow_connect(struct sock *sk, int ifindex,
 	return err;
 }
 
+static void mptcp_attach_cgroup(struct sock *parent, struct sock *child)
+{
+#ifdef CONFIG_SOCK_CGROUP_DATA
+	struct sock_cgroup_data *parent_skcd = &parent->sk_cgrp_data,
+				*child_skcd = &child->sk_cgrp_data;
+
+	/* only the additional subflows created by kworkers have to be modified */
+	if (cgroup_id(sock_cgroup_ptr(parent_skcd)) !=
+	    cgroup_id(sock_cgroup_ptr(child_skcd))) {
+#ifdef CONFIG_MEMCG
+		struct mem_cgroup *memcg = parent->sk_memcg;
+
+		mem_cgroup_sk_free(child);
+		if (memcg && css_tryget(&memcg->css))
+			child->sk_memcg = memcg;
+#endif /* CONFIG_MEMCG */
+
+		cgroup_sk_free(child_skcd);
+		*child_skcd = *parent_skcd;
+		cgroup_sk_clone(child_skcd);
+	}
+#endif /* CONFIG_SOCK_CGROUP_DATA */
+}
+
 int mptcp_subflow_create_socket(struct sock *sk, struct socket **new_sock)
 {
 	struct mptcp_subflow_context *subflow;
@@ -1150,6 +1174,9 @@ int mptcp_subflow_create_socket(struct sock *sk, struct socket **new_sock)
 
 	lock_sock(sf->sk);
 
+	/* the newly created socket has to be in the same cgroup as its parent */
+	mptcp_attach_cgroup(sk, sf->sk);
+
 	/* kernel sockets do not by default acquire net ref, but TCP timer
 	 * needs it.
 	 */
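
For readers following the reference handling in mptcp_attach_cgroup(), here is a minimal userspace sketch of the same order of operations: compare ids, drop the child's reference, copy the parent's association, then take a new reference on it. This is not kernel code. Every mock_* name below is hypothetical and exists only for illustration; the real helpers (cgroup_sk_free(), cgroup_sk_clone(), css_tryget()) do considerably more, and the memcg half of the patch is not modelled here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup: just an id and a reference count. */
struct mock_cgroup {
	unsigned long id;
	int refcnt;
};

/* Hypothetical stand-in for struct sock: only the cgroup association. */
struct mock_sock {
	struct mock_cgroup *cgrp;
};

static void mock_cgroup_get(struct mock_cgroup *cg)
{
	cg->refcnt++;
}

static void mock_cgroup_put(struct mock_cgroup *cg)
{
	cg->refcnt--;
}

/*
 * Illustrative model of the handoff: leave the child alone when it already
 * shares the parent's cgroup, otherwise release the child's reference,
 * copy the parent's association and take a reference on the new cgroup.
 */
static void mock_attach_cgroup(struct mock_sock *parent, struct mock_sock *child)
{
	assert(parent->cgrp != NULL && child->cgrp != NULL);

	/* Sockets already in the parent's cgroup need no change. */
	if (parent->cgrp->id == child->cgrp->id)
		return;

	mock_cgroup_put(child->cgrp);	/* models cgroup_sk_free(child_skcd) */
	child->cgrp = parent->cgrp;	/* models *child_skcd = *parent_skcd */
	mock_cgroup_get(child->cgrp);	/* models cgroup_sk_clone(child_skcd) */
}
```

In this model, a child that starts in a different (say, kworker-owned) cgroup ends up pointing at the parent's cgroup with that cgroup's count raised by one, and a second call is a no-op because the ids then match.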