From patchwork Mon Oct 23 16:09:54 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sumit Semwal X-Patchwork-Id: 116743 Delivered-To: patch@linaro.org Received: by 10.140.22.164 with SMTP id 33csp4816380qgn; Mon, 23 Oct 2017 09:10:19 -0700 (PDT) X-Received: by 10.98.57.215 with SMTP id u84mr13462300pfj.300.1508775019801; Mon, 23 Oct 2017 09:10:19 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1508775019; cv=none; d=google.com; s=arc-20160816; b=A5cr0y/+4LYRznqCQiaR7V1rWQiOZEHObt7aBOqSoIJZ2Yj/AGiTkDUTr/6+NSsl7E hJIeJLesZf0Fs5iChsviP+IApXU536Vcag1wCeMG996RWmaPAEtZnEmfULhFMldoFbaQ KhyGRR/bXPdLHjypQrns/YDILqaYZy48kbjdK95uNHMMFThfd8ywEahhzQvCIEshR0XB WLh2cEZ5EgkuAODjb5lI4Mf1XlJc0lVQjzFukSxkAAzPGOuVrM9xQaTTYpkjthQUQxx+ 6/cyaRbrtLn1CWvNAMeUX65FFqsTtxBzU0wInkW/lIcIfhY2WUB2Fu4cgVQ2xjL8KSoZ 5C3A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:message-id:date:subject:cc:to:from :dkim-signature:arc-authentication-results; bh=MrLWKTfqthKid2l+Dt9Py8CMLZ70QpT9JsBLMJo2F9I=; b=hxUSYB0HyfDgmH4EM6l5XYuPHVElO92sjom/5hizbyLJWJaPMFQyOiTbqZDgN+o3xo Bn0bLiwctiS6XIlI9T7UkDWiqPfavObW6ZT2dFwWg54RBM0xehGWbekRnk7UFDGBGWNu fBlNd3mInm/TCBlXmHQV/l6NNszNKb2cSHgcVobwrtfVoeIMy9o8jn6fKCT/ST6Y0TXk kmrp3fxg7YgqvRNCg/m6sVFufZDCBMyQGpFmY6iHTt+9N5ammiZlDMKxbqmkOeLgV/fN VZ1I2zJyeS9rvnUVhjZSRTcFp/NgeSKQ081NZ/odi6IwlqNvrwEireIJhMUiPC2PPKyt g0/g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=QRjW8PV5; spf=pass (google.com: best guess record for domain of stable-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=stable-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id f15si4222639plr.601.2017.10.23.09.10.19; Mon, 23 Oct 2017 09:10:19 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of stable-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=QRjW8PV5; spf=pass (google.com: best guess record for domain of stable-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=stable-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751277AbdJWQKS (ORCPT + 9 others); Mon, 23 Oct 2017 12:10:18 -0400 Received: from mail-pf0-f195.google.com ([209.85.192.195]:56451 "EHLO mail-pf0-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751260AbdJWQKR (ORCPT ); Mon, 23 Oct 2017 12:10:17 -0400 Received: by mail-pf0-f195.google.com with SMTP id b85so17431146pfj.13 for ; Mon, 23 Oct 2017 09:10:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id; bh=MrLWKTfqthKid2l+Dt9Py8CMLZ70QpT9JsBLMJo2F9I=; b=QRjW8PV5pZzOJUkXO8IkfTTpruoxPCRZ3Xs5BV6cN+pkUUTgFIpVmbLqe3mS4uMyKc 1FJATHn5fzyDncwY77f6v7pm7TY6ivVUjkQscIuS+N1/cnL9UjeiVW/LGnIOfW4t24SI LeaTudJZE3CMBx50w8xUCTXKoAKTcoQySMRsg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=MrLWKTfqthKid2l+Dt9Py8CMLZ70QpT9JsBLMJo2F9I=; b=Nn6PfpKOn+h+dGx9zK+XMZMvwTTdSMYLS5JwscSAKLFGybh0QxMJfcLZ5cXUgJWx1Q /BOVg25cfWTWYzhFzIxde1dd4trVuIizWwdZFwy14pJt8vBAzHdt1FBVlSM3+/0H+POa YkZrgw2c8/FPZae9YYJdu4NCrAKsmRLMpBYxpHvb+cKujMid1dPGUKJCtawGIzxiPNhv m/HFUTMzpIDbRSzEShFVuG//4Ajmf+yH/ngHc6+rbHl28PtzkYMO4+R2Ct061wDOQawE tRC0rAIcVu7ndWSY7Fpi/xEe6MogjeFNfCeAPI2xsalW2Nm7mE9npV4MG0iLPEPeNFTA NrYA== X-Gm-Message-State: AMCzsaU7fIjfMjl2LS4MpM4uya/Iett3IHUMvx0Si7A581iVseeT/+SU YQEchzyIq0f+82Ll0bfQR1WWGM6K0euqQg== X-Google-Smtp-Source: ABhQp+RCFb8Nx76/TbK6bAXq454Vbs/HnbY4BqeNABaYO6Tj980SWYm3mWgyo7JjkeQlLNFNCIab8Q== X-Received: by 10.99.97.139 with SMTP id v133mr11631104pgb.300.1508775016383; Mon, 23 Oct 2017 09:10:16 -0700 (PDT) Received: from phantom.lan ([106.51.116.52]) by smtp.gmail.com with ESMTPSA id y30sm12781213pge.27.2017.10.23.09.10.12 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Mon, 23 Oct 2017 09:10:15 -0700 (PDT) From: Sumit Semwal To: stable@vger.kernel.org Cc: Oleg Nesterov , Peter Zijlstra , Linus Torvalds , Mike Galbraith , Thomas Gleixner , hartsjc@redhat.com, vbendel@redhat.com, Ingo Molnar , Sumit Semwal Subject: [PATCH review for 4.4] sched/autogroup: Fix autogroup_move_group() to never skip sched_move_task() Date: Mon, 23 Oct 2017 21:39:54 +0530 Message-Id: <1508774994-25698-1-git-send-email-sumit.semwal@linaro.org> X-Mailer: git-send-email 2.7.4 Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Oleg Nesterov [ Upstream commit 18f649ef344127ef6de23a5a4272dbe2fdb73dde ] The PF_EXITING check in task_wants_autogroup() is no longer needed. Remove it, but see the next patch. However the comment is correct in that autogroup_move_group() must always change task_group() for every thread so the sysctl_ check is very wrong; we can race with cgroups and even sys_setsid() is not safe because a task running with task_group() == ag->tg must participate in refcounting: int main(void) { int sctl = open("/proc/sys/kernel/sched_autogroup_enabled", O_WRONLY); assert(sctl > 0); if (fork()) { wait(NULL); // destroy the child's ag/tg pause(); } assert(pwrite(sctl, "1\n", 2, 0) == 2); assert(setsid() > 0); if (fork()) pause(); kill(getppid(), SIGKILL); sleep(1); // The child has gone, the grandchild runs with kref == 1 assert(pwrite(sctl, "0\n", 2, 0) == 2); assert(setsid() > 0); // runs with the freed ag/tg for (;;) sleep(1); return 0; } crashes the kernel. It doesn't really need sleep(1), it doesn't matter if autogroup_move_group() actually frees the task_group or this happens later. ----- Without this commit, 4.4.y + hikey kernel crashes while running 'autogroup01' test in ltp (it seems to be written to catch this break itself): Unable to handle kernel NULL pointer dereference at virtual address 00000080 pgd = ffffffc050b11000 [00000080] *pgd=0000000070744003, *pud=0000000070744003, *pmd=0000000000000000 Internal error: Oops: 96000006 [#1] PREEMPT SMP Modules linked in: bnep arc4 wl18xx wlcore mac80211 cfg80211 wlcore_sdio spidev btwilink st_drv bluetooth rfkill CPU: 7 PID: 7965 Comm: autogroup01 Tainted: G W 4.4.92-rc1+ #1 Hardware name: HiKey Development Board (DT) task: ffffffc04ccf0000 ti: ffffffc0491ec000 task.ti: ffffffc0491ec000 PC is at pick_next_task_fair+0x620/0x8f0 LR is at pick_next_task_fair+0x620/0x8f0 pc : [] lr : [] pstate: 600001c5 sp : ffffffc0491efcc0 ... ... ... Call trace: [] pick_next_task_fair+0x620/0x8f0 [] __schedule+0x124/0x768 [] schedule+0x44/0xb8 [] do_nanosleep+0x94/0x168 [] hrtimer_nanosleep+0x94/0x148 [] SyS_nanosleep+0x74/0xa0 [] el0_svc_naked+0x24/0x28 Code: 54ffdae1 aa1303e0 aa1403e1 97ffdb61 (f9404013) With this, the test case passes: hikey:/opt/ltp# ./testcases/bin/autogroup01 tst_test.c:934: INFO: Timeout per run is 0h 05m 00s autogroup01.c:71: PASS: Bug not reproduced Summary: passed 1 failed 0 skipped 0 warnings 0 Reported-by: Vern Lovejoy Signed-off-by: Oleg Nesterov Signed-off-by: Peter Zijlstra (Intel) Cc: Linus Torvalds Cc: Mike Galbraith Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: hartsjc@redhat.com Cc: vbendel@redhat.com Link: http://lkml.kernel.org/r/20161114184609.GA15965@redhat.com Signed-off-by: Ingo Molnar Signed-off-by: Sumit Semwal [sumits: submit to 4.4 LTS, post testing on Hikey] --- kernel/sched/auto_group.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-) -- 2.7.4 diff --git a/kernel/sched/auto_group.c b/kernel/sched/auto_group.c index 750ed601ddf7..8620fd01b3d0 100644 --- a/kernel/sched/auto_group.c +++ b/kernel/sched/auto_group.c @@ -111,14 +111,11 @@ bool task_wants_autogroup(struct task_struct *p, struct task_group *tg) { if (tg != &root_task_group) return false; - /* - * We can only assume the task group can't go away on us if - * autogroup_move_group() can see us on ->thread_group list. + * If we race with autogroup_move_group() the caller can use the old + * value of signal->autogroup but in this case sched_move_task() will + * be called again before autogroup_kref_put(). */ - if (p->flags & PF_EXITING) - return false; - return true; } @@ -138,13 +135,17 @@ autogroup_move_group(struct task_struct *p, struct autogroup *ag) } p->signal->autogroup = autogroup_kref_get(ag); - - if (!READ_ONCE(sysctl_sched_autogroup_enabled)) - goto out; - + /* + * We can't avoid sched_move_task() after we changed signal->autogroup, + * this process can already run with task_group() == prev->tg or we can + * race with cgroup code which can read autogroup = prev under rq->lock. + * In the latter case for_each_thread() can not miss a migrating thread, + * cpu_cgroup_attach() must not be possible after cgroup_exit() and it + * can't be removed from thread list, we hold ->siglock. + */ for_each_thread(p, t) sched_move_task(t); -out: + unlock_task_sighand(p, &flags); autogroup_kref_put(prev); }