From patchwork Thu Dec 3 23:25:48 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 337679 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A5018C4361A for ; Thu, 3 Dec 2020 23:27:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 51C822245C for ; Thu, 3 Dec 2020 23:27:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727146AbgLCX1N (ORCPT ); Thu, 3 Dec 2020 18:27:13 -0500 Received: from mga14.intel.com ([192.55.52.115]:51637 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725912AbgLCX1N (ORCPT ); Thu, 3 Dec 2020 18:27:13 -0500 IronPort-SDR: p72wGICYHJ6FGueZkYBIlwC/0chyuTXyZvTRRwuneTXo4vNAgrt2Bir486KUI+dHpYctSYucdM bsIeSNoCgx7A== X-IronPort-AV: E=McAfee;i="6000,8403,9824"; a="172512669" X-IronPort-AV: E=Sophos;i="5.78,390,1599548400"; d="scan'208";a="172512669" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Dec 2020 15:26:17 -0800 IronPort-SDR: x8OssM1Ed5zcbgkQkNUJ1Y221zW9VIEJqsZlrdW8kXNai8PR21SiqOIDT/S3BmAihFTWPmOQgf I9O2yHMvDY7A== X-IronPort-AV: E=Sophos;i="5.78,390,1599548400"; d="scan'208";a="482158932" Received: from rchatre-mobl1.jf.intel.com ([10.54.70.7]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Dec 2020 15:26:16 -0800 From: Reinette Chatre To: tglx@linutronix.de, fenghua.yu@intel.com, bp@alien8.de, tony.luck@intel.com Cc: kuo-lang.tseng@intel.com, shakeelb@google.com, valentin.schneider@arm.com, mingo@redhat.com, babu.moger@amd.com, james.morse@arm.com, hpa@zytor.com, x86@kernel.org, linux-kernel@vger.kernel.org, Reinette Chatre , stable@vger.kernel.org Subject: [PATCH 1/3] x86/resctrl: Move setting task's active CPU in a mask into helpers Date: Thu, 3 Dec 2020 15:25:48 -0800 Message-Id: <77973e75a10bf7ef9b33c664544667deee9e1a8e.1607036601.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Fenghua Yu The code of setting the CPU on which a task is running in a CPU mask is moved into a couple of helpers. The new helper task_on_cpu() will be reused shortly. Signed-off-by: Fenghua Yu Signed-off-by: Reinette Chatre Reviewed-by: Tony Luck Cc: stable@vger.kernel.org Reported-by: James Morse Signed-off-by: Reinette Chatre --- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 47 +++++++++++++++++++------- 1 file changed, 34 insertions(+), 13 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index 6f4ca4bea625..68db7d2dec8f 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -525,6 +525,38 @@ static void rdtgroup_remove(struct rdtgroup *rdtgrp) kfree(rdtgrp); } +#ifdef CONFIG_SMP +/* Get the CPU if the task is on it. */ +static bool task_on_cpu(struct task_struct *t, int *cpu) +{ + /* + * This is safe on x86 w/o barriers as the ordering of writing to + * task_cpu() and t->on_cpu is reverse to the reading here. The + * detection is inaccurate as tasks might move or schedule before + * the smp function call takes place. In such a case the function + * call is pointless, but there is no other side effect. + */ + if (t->on_cpu) { + *cpu = task_cpu(t); + + return true; + } + + return false; +} + +static void set_task_cpumask(struct task_struct *t, struct cpumask *mask) +{ + int cpu; + + if (mask && task_on_cpu(t, &cpu)) + cpumask_set_cpu(cpu, mask); +} +#else +static inline void +set_task_cpumask(struct task_struct *t, struct cpumask *mask) { } +#endif + struct task_move_callback { struct callback_head work; struct rdtgroup *rdtgrp; @@ -2327,19 +2359,8 @@ static void rdt_move_group_tasks(struct rdtgroup *from, struct rdtgroup *to, t->closid = to->closid; t->rmid = to->mon.rmid; -#ifdef CONFIG_SMP - /* - * This is safe on x86 w/o barriers as the ordering - * of writing to task_cpu() and t->on_cpu is - * reverse to the reading here. The detection is - * inaccurate as tasks might move or schedule - * before the smp function call takes place. In - * such a case the function call is pointless, but - * there is no other side effect. - */ - if (mask && t->on_cpu) - cpumask_set_cpu(task_cpu(t), mask); -#endif + /* If the task is on a CPU, set the CPU in the mask. */ + set_task_cpumask(t, mask); } } read_unlock(&tasklist_lock); From patchwork Thu Dec 3 23:25:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 338394 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B8C3CC433FE for ; Thu, 3 Dec 2020 23:27:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6343D221EA for ; Thu, 3 Dec 2020 23:27:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727106AbgLCX12 (ORCPT ); Thu, 3 Dec 2020 18:27:28 -0500 Received: from mga14.intel.com ([192.55.52.115]:51637 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726721AbgLCX12 (ORCPT ); Thu, 3 Dec 2020 18:27:28 -0500 IronPort-SDR: oEEPEtKtjH1sDLhr8G1/Yn7b2X8TIjlPA84LGXjrUVglf+XVfsd7BI1NjNQLDPV1WgdR9hmIk1 2a9MUgNggM0Q== X-IronPort-AV: E=McAfee;i="6000,8403,9824"; a="172512671" X-IronPort-AV: E=Sophos;i="5.78,390,1599548400"; d="scan'208";a="172512671" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Dec 2020 15:26:17 -0800 IronPort-SDR: +LdZ/278WWi8CEGZqjbS7Jhp8rdz0UTi4xLf/cqjKHakIoLogKIW1xmQ4TK0jK5EsaRgqgbniV 3x1HqJEBTYww== X-IronPort-AV: E=Sophos;i="5.78,390,1599548400"; d="scan'208";a="482158935" Received: from rchatre-mobl1.jf.intel.com ([10.54.70.7]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Dec 2020 15:26:16 -0800 From: Reinette Chatre To: tglx@linutronix.de, fenghua.yu@intel.com, bp@alien8.de, tony.luck@intel.com Cc: kuo-lang.tseng@intel.com, shakeelb@google.com, valentin.schneider@arm.com, mingo@redhat.com, babu.moger@amd.com, james.morse@arm.com, hpa@zytor.com, x86@kernel.org, linux-kernel@vger.kernel.org, Reinette Chatre , stable@vger.kernel.org Subject: [PATCH 2/3] x86/resctrl: Update PQR_ASSOC MSR synchronously when moving task to resource group Date: Thu, 3 Dec 2020 15:25:49 -0800 Message-Id: X-Mailer: git-send-email 2.26.2 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Fenghua Yu Currently when moving a task to a resource group the PQR_ASSOC MSR is updated with the new closid and rmid in an added task callback. If the task is running the work is run as soon as possible. If the task is not running the work is executed later in the kernel exit path when the kernel returns to the task again. Updating the PQR_ASSOC MSR as soon as possible on the CPU a moved task is running is the right thing to do. Queueing work for a task that is not running is unnecessary (the PQR_ASSOC MSR is already updated when the task is scheduled in) and causing system resource waste with the way in which it is implemented: Work to update the PQR_ASSOC register is queued every time the user writes a task id to the "tasks" file, even if the task already belongs to the resource group. This could result in multiple pending work items associated with a single task even if they are all identical and even though only a single update with most recent values is needed. Specifically, even if a task is moved between different resource groups while it is sleeping then it is only the last move that is relevant but yet a work item is queued during each move. This unnecessary queueing of work items could result in significant system resource waste, especially on tasks sleeping for a long time. For example, as demonstrated by Shakeel Butt in [1] writing the same task id to the "tasks" file can quickly consume significant memory. The same problem (wasted system resources) occurs when moving a task between different resource groups. As pointed out by Valentin Schneider in [2] there is an additional issue with the way in which the queueing of work is done in that the task_struct update is currently done after the work is queued, resulting in a race with the register update possibly done before the data needed by the update is available. To solve these issues, the PQR_ASSOC MSR is updated in a synchronous way right after the new closid and rmid are ready during the task movement, only if the task is running. If a moved task is not running nothing is done since the PQR_ASSOC MSR will be updated next time the task is scheduled. This is the same way used to update the register when tasks are moved as part of resource group removal. [1] https://lore.kernel.org/lkml/CALvZod7E9zzHwenzf7objzGKsdBmVwTgEJ0nPgs0LUFU3SN5Pw@mail.gmail.com/ [2] https://lore.kernel.org/lkml/20201123022433.17905-1-valentin.schneider@arm.com Fixes: e02737d5b826 ("x86/intel_rdt: Add tasks files") Reported-by: Shakeel Butt Reported-by: Valentin Schneider Signed-off-by: Fenghua Yu Signed-off-by: Reinette Chatre Reviewed-by: Tony Luck Cc: stable@vger.kernel.org Reviewed-by: James Morse Reviewed-by: Valentin Schneider --- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 123 ++++++++++--------------- 1 file changed, 50 insertions(+), 73 deletions(-) diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index 68db7d2dec8f..9d62f1fadcc3 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -525,6 +525,16 @@ static void rdtgroup_remove(struct rdtgroup *rdtgrp) kfree(rdtgrp); } +static void _update_task_closid_rmid(void *task) +{ + /* + * If the task is still current on this CPU, update PQR_ASSOC MSR. + * Otherwise, the MSR is updated when the task is scheduled in. + */ + if (task == current) + resctrl_sched_in(); +} + #ifdef CONFIG_SMP /* Get the CPU if the task is on it. */ static bool task_on_cpu(struct task_struct *t, int *cpu) @@ -552,94 +562,61 @@ static void set_task_cpumask(struct task_struct *t, struct cpumask *mask) if (mask && task_on_cpu(t, &cpu)) cpumask_set_cpu(cpu, mask); } -#else -static inline void -set_task_cpumask(struct task_struct *t, struct cpumask *mask) { } -#endif - -struct task_move_callback { - struct callback_head work; - struct rdtgroup *rdtgrp; -}; -static void move_myself(struct callback_head *head) +static void update_task_closid_rmid(struct task_struct *t) { - struct task_move_callback *callback; - struct rdtgroup *rdtgrp; - - callback = container_of(head, struct task_move_callback, work); - rdtgrp = callback->rdtgrp; - - /* - * If resource group was deleted before this task work callback - * was invoked, then assign the task to root group and free the - * resource group. - */ - if (atomic_dec_and_test(&rdtgrp->waitcount) && - (rdtgrp->flags & RDT_DELETED)) { - current->closid = 0; - current->rmid = 0; - rdtgroup_remove(rdtgrp); - } + int cpu; - if (unlikely(current->flags & PF_EXITING)) - goto out; + if (task_on_cpu(t, &cpu)) + smp_call_function_single(cpu, _update_task_closid_rmid, t, 1); +} - preempt_disable(); - /* update PQR_ASSOC MSR to make resource group go into effect */ - resctrl_sched_in(); - preempt_enable(); +#else +static inline void +set_task_cpumask(struct task_struct *t, struct cpumask *mask) { } -out: - kfree(callback); +static void update_task_closid_rmid(struct task_struct *t) +{ + _update_task_closid_rmid(t); } +#endif static int __rdtgroup_move_task(struct task_struct *tsk, struct rdtgroup *rdtgrp) { - struct task_move_callback *callback; - int ret; - - callback = kzalloc(sizeof(*callback), GFP_KERNEL); - if (!callback) - return -ENOMEM; - callback->work.func = move_myself; - callback->rdtgrp = rdtgrp; - /* - * Take a refcount, so rdtgrp cannot be freed before the - * callback has been invoked. + * Set the task's closid/rmid before the PQR_ASSOC MSR can be + * updated by them. + * + * For ctrl_mon groups, move both closid and rmid. + * For monitor groups, can move the tasks only from + * their parent CTRL group. */ - atomic_inc(&rdtgrp->waitcount); - ret = task_work_add(tsk, &callback->work, TWA_RESUME); - if (ret) { - /* - * Task is exiting. Drop the refcount and free the callback. - * No need to check the refcount as the group cannot be - * deleted before the write function unlocks rdtgroup_mutex. - */ - atomic_dec(&rdtgrp->waitcount); - kfree(callback); - rdt_last_cmd_puts("Task exited\n"); - } else { - /* - * For ctrl_mon groups move both closid and rmid. - * For monitor groups, can move the tasks only from - * their parent CTRL group. - */ - if (rdtgrp->type == RDTCTRL_GROUP) { - tsk->closid = rdtgrp->closid; + + if (rdtgrp->type == RDTCTRL_GROUP) { + tsk->closid = rdtgrp->closid; + tsk->rmid = rdtgrp->mon.rmid; + } else if (rdtgrp->type == RDTMON_GROUP) { + if (rdtgrp->mon.parent->closid == tsk->closid) { tsk->rmid = rdtgrp->mon.rmid; - } else if (rdtgrp->type == RDTMON_GROUP) { - if (rdtgrp->mon.parent->closid == tsk->closid) { - tsk->rmid = rdtgrp->mon.rmid; - } else { - rdt_last_cmd_puts("Can't move task to different control group\n"); - ret = -EINVAL; - } + } else { + rdt_last_cmd_puts("Can't move task to different control group\n"); + return -EINVAL; } + } else { + rdt_last_cmd_puts("Invalid resource group type\n"); + return -EINVAL; } - return ret; + + /* + * By now, the task's closid and rmid are set. If the task is current + * on a CPU, the PQR_ASSOC MSR needs to be updated to make the resource + * group go into effect. If the task is not current, the MSR will be + * updated when the task is scheduled in. + */ + update_task_closid_rmid(tsk); + + return 0; } static bool is_closid_match(struct task_struct *t, struct rdtgroup *r) From patchwork Thu Dec 3 23:25:50 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Reinette Chatre X-Patchwork-Id: 337678 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 15136C4167B for ; Thu, 3 Dec 2020 23:27:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id BDF582245C for ; Thu, 3 Dec 2020 23:27:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387856AbgLCX1i (ORCPT ); Thu, 3 Dec 2020 18:27:38 -0500 Received: from mga14.intel.com ([192.55.52.115]:51723 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2387843AbgLCX1i (ORCPT ); Thu, 3 Dec 2020 18:27:38 -0500 IronPort-SDR: tJ66gV4RpVbKrQcRLdRafvyurlwcAJK2QaYVfxaMzmaTX+AJWAJzk+LGzUFHKjg48tq9XBx24g tIs3a5OsIgDQ== X-IronPort-AV: E=McAfee;i="6000,8403,9824"; a="172512672" X-IronPort-AV: E=Sophos;i="5.78,390,1599548400"; d="scan'208";a="172512672" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Dec 2020 15:26:17 -0800 IronPort-SDR: hf9phKmrwuysyQSh5Fyyzqiq5yTbSINxKTCIHZeLb36yX+EJE/fpJT5AUjw6THII2VLzenPfjB aAEISBQPSGNQ== X-IronPort-AV: E=Sophos;i="5.78,390,1599548400"; d="scan'208";a="482158937" Received: from rchatre-mobl1.jf.intel.com ([10.54.70.7]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Dec 2020 15:26:16 -0800 From: Reinette Chatre To: tglx@linutronix.de, fenghua.yu@intel.com, bp@alien8.de, tony.luck@intel.com Cc: kuo-lang.tseng@intel.com, shakeelb@google.com, valentin.schneider@arm.com, mingo@redhat.com, babu.moger@amd.com, james.morse@arm.com, hpa@zytor.com, x86@kernel.org, linux-kernel@vger.kernel.org, Reinette Chatre , stable@vger.kernel.org Subject: [PATCH 3/3] x86/resctrl: Don't move a task to the same resource group Date: Thu, 3 Dec 2020 15:25:50 -0800 Message-Id: <68b2ee2310d9fd1f12bddceeae2f4f997fa4fda9.1607036601.git.reinette.chatre@intel.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Fenghua Yu Shakeel Butt reported in [1] that a user can request a task to be moved to a resource group even if the task is already in the group. It just wastes time to do the move operation which could be costly to send IPI to a different CPU. Add a sanity check to ensure that the move operation only happens when the task is not already in the resource group. [1] https://lore.kernel.org/lkml/CALvZod7E9zzHwenzf7objzGKsdBmVwTgEJ0nPgs0LUFU3SN5Pw@mail.gmail.com/ Fixes: e02737d5b826 ("x86/intel_rdt: Add tasks files") Reported-by: Shakeel Butt Signed-off-by: Fenghua Yu Signed-off-by: Reinette Chatre Reviewed-by: Tony Luck Cc: stable@vger.kernel.org --- arch/x86/kernel/cpu/resctrl/rdtgroup.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/arch/x86/kernel/cpu/resctrl/rdtgroup.c b/arch/x86/kernel/cpu/resctrl/rdtgroup.c index 9d62f1fadcc3..d3523032265c 100644 --- a/arch/x86/kernel/cpu/resctrl/rdtgroup.c +++ b/arch/x86/kernel/cpu/resctrl/rdtgroup.c @@ -584,6 +584,13 @@ static void update_task_closid_rmid(struct task_struct *t) static int __rdtgroup_move_task(struct task_struct *tsk, struct rdtgroup *rdtgrp) { + /* If the task is already in rdtgrp, no need to move the task. */ + if ((rdtgrp->type == RDTCTRL_GROUP && tsk->closid == rdtgrp->closid && + tsk->rmid == rdtgrp->mon.rmid) || + (rdtgrp->type == RDTMON_GROUP && tsk->rmid == rdtgrp->mon.rmid && + tsk->closid == rdtgrp->mon.parent->closid)) + return 0; + /* * Set the task's closid/rmid before the PQR_ASSOC MSR can be * updated by them.