From patchwork Tue Mar 17 10:53:52 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Greg KH X-Patchwork-Id: 229334 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=3.0 tests=DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8DCF8C10F29 for ; Tue, 17 Mar 2020 11:01:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 63B1E205ED for ; Tue, 17 Mar 2020 11:01:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1584442917; bh=Lh4BxuQ7K6cIEQ/7yvgksfX22SmWU44NFAT21/scgfI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:List-ID:From; b=gU/f51os0xRNOkU8XjE0cs12i9wC+WjJZ75pxz7U3JiNiMWqcRQDOcIqR+JFq22qF 0xOqZCBxNLH5lCf5SDjb0zAsy+fF0rBixwRAVqOievimNosyrzFpJCaqpkVsNhb2UY ckz7WYGV/MdTatZU47QOfNFNk4EjPYdkeGVmltCc= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727828AbgCQLB4 (ORCPT ); Tue, 17 Mar 2020 07:01:56 -0400 Received: from mail.kernel.org ([198.145.29.99]:41468 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727822AbgCQLBz (ORCPT ); Tue, 17 Mar 2020 07:01:55 -0400 Received: from localhost (83-86-89-107.cable.dynamic.v4.ziggo.nl [83.86.89.107]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 0B11920714; Tue, 17 Mar 2020 11:01:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1584442914; bh=Lh4BxuQ7K6cIEQ/7yvgksfX22SmWU44NFAT21/scgfI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=va6xlzLvo2nZ2vxXver/PSTTbHiVZNTSCUzCc/oiABT0NZ3HHlfN5xASbTg8Xxee5 7z+cQiLTU2qSTdtTpODXWnTY4E47CHz/AumrqGu8Rv3As2CMc6DjsG9bL/SNxr9qk6 QfgGzatqwN6y2igJdSWHkdrrk+NZd8D0+baMlzh0= From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Dmitry Yakunin , Konstantin Khlebnikov , "David S. Miller" Subject: [PATCH 5.4 005/123] cgroup, netclassid: periodically release file_lock on classid updating Date: Tue, 17 Mar 2020 11:53:52 +0100 Message-Id: <20200317103308.001944033@linuxfoundation.org> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20200317103307.343627747@linuxfoundation.org> References: <20200317103307.343627747@linuxfoundation.org> User-Agent: quilt/0.66 MIME-Version: 1.0 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Dmitry Yakunin [ Upstream commit 018d26fcd12a75fb9b5fe233762aa3f2f0854b88 ] In our production environment we have faced with problem that updating classid in cgroup with heavy tasks cause long freeze of the file tables in this tasks. By heavy tasks we understand tasks with many threads and opened sockets (e.g. balancers). This freeze leads to an increase number of client timeouts. This patch implements following logic to fix this issue: аfter iterating 1000 file descriptors file table lock will be released thus providing a time gap for socket creation/deletion. Now update is non atomic and socket may be skipped using calls: dup2(oldfd, newfd); close(oldfd); But this case is not typical. Moreover before this patch skip is possible too by hiding socket fd in unix socket buffer. New sockets will be allocated with updated classid because cgroup state is updated before start of the file descriptors iteration. So in common cases this patch has no side effects. Signed-off-by: Dmitry Yakunin Reviewed-by: Konstantin Khlebnikov Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman --- net/core/netclassid_cgroup.c | 47 +++++++++++++++++++++++++++++++++---------- 1 file changed, 37 insertions(+), 10 deletions(-) --- a/net/core/netclassid_cgroup.c +++ b/net/core/netclassid_cgroup.c @@ -53,30 +53,60 @@ static void cgrp_css_free(struct cgroup_ kfree(css_cls_state(css)); } +/* + * To avoid freezing of sockets creation for tasks with big number of threads + * and opened sockets lets release file_lock every 1000 iterated descriptors. + * New sockets will already have been created with new classid. + */ + +struct update_classid_context { + u32 classid; + unsigned int batch; +}; + +#define UPDATE_CLASSID_BATCH 1000 + static int update_classid_sock(const void *v, struct file *file, unsigned n) { int err; + struct update_classid_context *ctx = (void *)v; struct socket *sock = sock_from_file(file, &err); if (sock) { spin_lock(&cgroup_sk_update_lock); - sock_cgroup_set_classid(&sock->sk->sk_cgrp_data, - (unsigned long)v); + sock_cgroup_set_classid(&sock->sk->sk_cgrp_data, ctx->classid); spin_unlock(&cgroup_sk_update_lock); } + if (--ctx->batch == 0) { + ctx->batch = UPDATE_CLASSID_BATCH; + return n + 1; + } return 0; } +static void update_classid_task(struct task_struct *p, u32 classid) +{ + struct update_classid_context ctx = { + .classid = classid, + .batch = UPDATE_CLASSID_BATCH + }; + unsigned int fd = 0; + + do { + task_lock(p); + fd = iterate_fd(p->files, fd, update_classid_sock, &ctx); + task_unlock(p); + cond_resched(); + } while (fd); +} + static void cgrp_attach(struct cgroup_taskset *tset) { struct cgroup_subsys_state *css; struct task_struct *p; cgroup_taskset_for_each(p, css, tset) { - task_lock(p); - iterate_fd(p->files, 0, update_classid_sock, - (void *)(unsigned long)css_cls_state(css)->classid); - task_unlock(p); + update_classid_task(p, css_cls_state(css)->classid); } } @@ -98,10 +128,7 @@ static int write_classid(struct cgroup_s css_task_iter_start(css, 0, &it); while ((p = css_task_iter_next(&it))) { - task_lock(p); - iterate_fd(p->files, 0, update_classid_sock, - (void *)(unsigned long)cs->classid); - task_unlock(p); + update_classid_task(p, cs->classid); cond_resched(); } css_task_iter_end(&it);