From patchwork Wed Oct 10 09:29:46 2018
X-Patchwork-Submitter: Amit Pundir
X-Patchwork-Id: 148544
From: Amit Pundir
To: Greg KH
Cc: Stable, Prateek Sood, Tejun Heo
Subject: [PATCH for-4.14.y 1/4] cgroup/cpuset: remove circular dependency deadlock
Date: Wed, 10 Oct 2018 14:59:46 +0530
Message-Id: <1539163789-32338-1-git-send-email-amit.pundir@linaro.org>
X-Mailer: git-send-email 2.7.4
X-Mailing-List: stable@vger.kernel.org

From: Prateek Sood

commit aa24163b2ee5c92120e32e99b5a93143a0f4258e upstream.

Remove a circular dependency deadlock that can occur when CPU hotplug
runs concurrently with cgroup and cpuset updates triggered from
userspace.
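Compressed to two actors, the cycle is a classic ABBA inversion: the
hotplug path takes cpu_hotplug_lock and then (indirectly) needs
cpuset_mutex, while the cpuset write path takes cpuset_mutex and then
cpu_hotplug_lock. The following is a minimal user-space sketch of that
shape only; the mutexes, thread functions, and names (hotplug_lock,
cpuset_lock, hotplug_path, cpuset_write_path) are illustrative
stand-ins, not the kernel primitives:

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* Stand-ins for cpu_hotplug_lock and cpuset_mutex (illustrative only). */
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t cpuset_lock  = PTHREAD_MUTEX_INITIALIZER;

/* Like Process A below: hotplug lock first, then the cpuset lock. */
static void *hotplug_path(void *unused)
{
	(void)unused;
	pthread_mutex_lock(&hotplug_lock);   /* //held */
	usleep(1000);                        /* widen the race window */
	pthread_mutex_lock(&cpuset_lock);    /* //waiting: other thread holds it */
	pthread_mutex_unlock(&cpuset_lock);
	pthread_mutex_unlock(&hotplug_lock);
	return NULL;
}

/* Like Process C below: cpuset lock first, then the hotplug lock. */
static void *cpuset_write_path(void *unused)
{
	(void)unused;
	pthread_mutex_lock(&cpuset_lock);    /* //held */
	usleep(1000);
	pthread_mutex_lock(&hotplug_lock);   /* //waiting: ABBA deadlock */
	pthread_mutex_unlock(&hotplug_lock);
	pthread_mutex_unlock(&cpuset_lock);
	return NULL;
}

int main(void)
{
	pthread_t a, c;

	pthread_create(&a, NULL, hotplug_path, NULL);
	pthread_create(&c, NULL, cpuset_write_path, NULL);
	pthread_join(a, NULL);  /* if both sleeps hit, neither join returns */
	pthread_join(c, NULL);
	puts("lock order did not collide on this run");
	return 0;
}

The real kernel cycle is longer: Process A only reaches cpuset_mutex
transitively, through kthreadd and cgroup_threadgroup_rwsem, but the
ordering conflict is the same: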
Process A => kthreadd => Process B => Process C => Process A

Process A
cpu_subsys_offline();
  cpu_down();
    _cpu_down();
      percpu_down_write(&cpu_hotplug_lock); //held
      cpuhp_invoke_callback();
        workqueue_offline_cpu();
          queue_work_on(); // unbind_work on system_highpri_wq
            __queue_work();
              insert_work();
                wake_up_worker();
          flush_work();
            wait_for_completion();

worker_thread();
  manage_workers();
    create_worker();
      kthread_create_on_node();
        wake_up_process(kthreadd_task);

kthreadd
kthreadd();
  kernel_thread();
    do_fork();
      copy_process();
        percpu_down_read(&cgroup_threadgroup_rwsem);
          __rwsem_down_read_failed_common(); //waiting

Process B
kernfs_fop_write();
  cgroup_file_write();
    cgroup_procs_write();
      percpu_down_write(&cgroup_threadgroup_rwsem); //held
      cgroup_attach_task();
        cgroup_migrate();
          cgroup_migrate_execute();
            cpuset_can_attach();
              mutex_lock(&cpuset_mutex); //waiting

Process C
kernfs_fop_write();
  cgroup_file_write();
    cpuset_write_resmask();
      mutex_lock(&cpuset_mutex); //held
      update_cpumask();
        update_cpumasks_hier();
          rebuild_sched_domains_locked();
            get_online_cpus();
              percpu_down_read(&cpu_hotplug_lock); //waiting

Eliminate the deadlock by reversing the locking order for cpuset_mutex
and cpu_hotplug_lock.

Signed-off-by: Prateek Sood
Signed-off-by: Tejun Heo
Signed-off-by: Amit Pundir
---
Build tested on 4.14.74 for ARCH=arm/arm64 allmodconfig.

 kernel/cgroup/cpuset.c | 53 ++++++++++++++++++++++++++++----------------------
 1 file changed, 30 insertions(+), 23 deletions(-)

--
2.7.4

diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 4657e2924ecb..54f4855b92fa 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -817,6 +817,18 @@ static int generate_sched_domains(cpumask_var_t **domains,
 	return ndoms;
 }
 
+static void cpuset_sched_change_begin(void)
+{
+	cpus_read_lock();
+	mutex_lock(&cpuset_mutex);
+}
+
+static void cpuset_sched_change_end(void)
+{
+	mutex_unlock(&cpuset_mutex);
+	cpus_read_unlock();
+}
+
 /*
  * Rebuild scheduler domains.
  *
@@ -826,16 +838,14 @@ static int generate_sched_domains(cpumask_var_t **domains,
  * 'cpus' is removed, then call this routine to rebuild the
  * scheduler's dynamic sched domains.
  *
- * Call with cpuset_mutex held. Takes get_online_cpus().
  */
-static void rebuild_sched_domains_locked(void)
+static void rebuild_sched_domains_cpuslocked(void)
 {
 	struct sched_domain_attr *attr;
 	cpumask_var_t *doms;
 	int ndoms;
 
 	lockdep_assert_held(&cpuset_mutex);
-	get_online_cpus();
 
 	/*
 	 * We have raced with CPU hotplug. Don't do anything to avoid
@@ -843,27 +853,25 @@ static void rebuild_sched_domains_locked(void)
 	 * Anyways, hotplug work item will rebuild sched domains.
 	 */
 	if (!cpumask_equal(top_cpuset.effective_cpus, cpu_active_mask))
-		goto out;
+		return;
 
 	/* Generate domain masks and attrs */
 	ndoms = generate_sched_domains(&doms, &attr);
 
 	/* Have scheduler rebuild the domains */
 	partition_sched_domains(ndoms, doms, attr);
-out:
-	put_online_cpus();
 }
 #else /* !CONFIG_SMP */
-static void rebuild_sched_domains_locked(void)
+static void rebuild_sched_domains_cpuslocked(void)
 {
 }
 #endif /* CONFIG_SMP */
 
 void rebuild_sched_domains(void)
 {
-	mutex_lock(&cpuset_mutex);
-	rebuild_sched_domains_locked();
-	mutex_unlock(&cpuset_mutex);
+	cpuset_sched_change_begin();
+	rebuild_sched_domains_cpuslocked();
+	cpuset_sched_change_end();
 }
 
 /**
@@ -949,7 +957,7 @@ static void update_cpumasks_hier(struct cpuset *cs, struct cpumask *new_cpus)
 	rcu_read_unlock();
 
 	if (need_rebuild_sched_domains)
-		rebuild_sched_domains_locked();
+		rebuild_sched_domains_cpuslocked();
 }
 
 /**
@@ -1281,7 +1289,7 @@ static int update_relax_domain_level(struct cpuset *cs, s64 val)
 		cs->relax_domain_level = val;
 		if (!cpumask_empty(cs->cpus_allowed) &&
 		    is_sched_load_balance(cs))
-			rebuild_sched_domains_locked();
+			rebuild_sched_domains_cpuslocked();
 	}
 
 	return 0;
@@ -1314,7 +1322,6 @@ static void update_tasks_flags(struct cpuset *cs)
  *
  * Call with cpuset_mutex held.
  */
-
 static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs,
 		       int turning_on)
 {
@@ -1347,7 +1354,7 @@ static int update_flag(cpuset_flagbits_t bit, struct cpuset *cs,
 	spin_unlock_irq(&callback_lock);
 
 	if (!cpumask_empty(trialcs->cpus_allowed) && balance_flag_changed)
-		rebuild_sched_domains_locked();
+		rebuild_sched_domains_cpuslocked();
 
 	if (spread_flag_changed)
 		update_tasks_flags(cs);
@@ -1615,7 +1622,7 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
 	cpuset_filetype_t type = cft->private;
 	int retval = 0;
 
-	mutex_lock(&cpuset_mutex);
+	cpuset_sched_change_begin();
 	if (!is_cpuset_online(cs)) {
 		retval = -ENODEV;
 		goto out_unlock;
@@ -1651,7 +1658,7 @@ static int cpuset_write_u64(struct cgroup_subsys_state *css, struct cftype *cft,
 		break;
 	}
 out_unlock:
-	mutex_unlock(&cpuset_mutex);
+	cpuset_sched_change_end();
 	return retval;
 }
 
@@ -1662,7 +1669,7 @@ static int cpuset_write_s64(struct cgroup_subsys_state *css, struct cftype *cft,
 	cpuset_filetype_t type = cft->private;
 	int retval = -ENODEV;
 
-	mutex_lock(&cpuset_mutex);
+	cpuset_sched_change_begin();
 	if (!is_cpuset_online(cs))
 		goto out_unlock;
 
@@ -1675,7 +1682,7 @@ static int cpuset_write_s64(struct cgroup_subsys_state *css, struct cftype *cft,
 		break;
 	}
 out_unlock:
-	mutex_unlock(&cpuset_mutex);
+	cpuset_sched_change_end();
 	return retval;
 }
 
@@ -1714,7 +1721,7 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
 	kernfs_break_active_protection(of->kn);
 	flush_work(&cpuset_hotplug_work);
 
-	mutex_lock(&cpuset_mutex);
+	cpuset_sched_change_begin();
 	if (!is_cpuset_online(cs))
 		goto out_unlock;
 
@@ -1738,7 +1745,7 @@ static ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
 
 	free_trial_cpuset(trialcs);
 out_unlock:
-	mutex_unlock(&cpuset_mutex);
+	cpuset_sched_change_end();
 	kernfs_unbreak_active_protection(of->kn);
 	css_put(&cs->css);
 	flush_workqueue(cpuset_migrate_mm_wq);
@@ -2039,14 +2046,14 @@ static int cpuset_css_online(struct cgroup_subsys_state *css)
 
 /*
  * If the cpuset being removed has its flag 'sched_load_balance'
 * enabled, then simulate turning sched_load_balance off, which
- * will call rebuild_sched_domains_locked().
+ * will call rebuild_sched_domains_cpuslocked().
  */
 
 static void cpuset_css_offline(struct cgroup_subsys_state *css)
 {
 	struct cpuset *cs = css_cs(css);
 
-	mutex_lock(&cpuset_mutex);
+	cpuset_sched_change_begin();
 	if (is_sched_load_balance(cs))
 		update_flag(CS_SCHED_LOAD_BALANCE, cs, 0);
 
@@ -2054,7 +2061,7 @@ static void cpuset_css_offline(struct cgroup_subsys_state *css)
 	cpuset_dec();
 	clear_bit(CS_ONLINE, &cs->flags);
 
-	mutex_unlock(&cpuset_mutex);
+	cpuset_sched_change_end();
 }
 
 static void cpuset_css_free(struct cgroup_subsys_state *css)
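For reference, the fixed ordering in the same toy terms: every writer
enters through one helper that takes the hotplug lock before the cpuset
lock, mirroring the new cpuset_sched_change_begin()/end() pair above.
This is also why rebuild_sched_domains_cpuslocked() can drop
get_online_cpus(): its callers already hold the hotplug lock. As before,
this is a sketch with pthread stand-ins (sched_change_begin,
sched_change_end, cpuset_write_path_fixed are illustrative names), not
kernel code:

#include <pthread.h>

static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t cpuset_lock  = PTHREAD_MUTEX_INITIALIZER;

/* Mirrors cpuset_sched_change_begin(): one global order, hotplug first. */
static void sched_change_begin(void)
{
	pthread_mutex_lock(&hotplug_lock);
	pthread_mutex_lock(&cpuset_lock);
}

/* Mirrors cpuset_sched_change_end(): release in reverse order. */
static void sched_change_end(void)
{
	pthread_mutex_unlock(&cpuset_lock);
	pthread_mutex_unlock(&hotplug_lock);
}

/* No path can now hold cpuset_lock while waiting for hotplug_lock,
 * so the ABBA pair from the trace above cannot form. */
static void *cpuset_write_path_fixed(void *unused)
{
	(void)unused;
	sched_change_begin();
	/* ... update cpumasks, rebuild sched domains ... */
	sched_change_end();
	return NULL;
}

int main(void)
{
	pthread_t t;

	pthread_create(&t, NULL, cpuset_write_path_fixed, NULL);
	pthread_join(t, NULL);
	return 0;
}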