From patchwork Sun Jan 31 23:05:44 2021
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 374202
From: Frederic Weisbecker
To: "Paul E. McKenney", Peter Zijlstra
Cc: LKML, Frederic Weisbecker, Paolo Bonzini, "Rafael J. Wysocki",
    Thomas Gleixner, stable@vger.kernel.org, Ingo Molnar
Subject: [PATCH 1/5] rcu: Pull deferred rcuog wake up to rcu_eqs_enter() callers
Date: Mon, 1 Feb 2021 00:05:44 +0100
Message-Id: <20210131230548.32970-2-frederic@kernel.org>
In-Reply-To: <20210131230548.32970-1-frederic@kernel.org>
References: <20210131230548.32970-1-frederic@kernel.org>
X-Mailing-List: stable@vger.kernel.org

Deferred wakeup of rcuog kthreads upon RCU idle mode entry is going to
be handled differently depending on whether it is initiated by idle,
user or guest entry. Prepare for this by pulling that control up to the
callers of rcu_eqs_enter().

Signed-off-by: Frederic Weisbecker
Cc: Paul E. McKenney
Cc: Rafael J. Wysocki
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Ingo Molnar
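To make the intent concrete, the shape of the change is roughly the
following (a simplified sketch only; the exact diff is below):

    /*
     * Before: rcu_eqs_enter() flushed the deferred wakeup itself,
     * forcing idle, user and guest entry to share one behavior.
     */
    static noinstr void rcu_eqs_enter(bool user)
    {
            struct rcu_data *rdp = this_cpu_ptr(&rcu_data);

            do_nocb_deferred_wakeup(rdp);
            /* ... enter the extended quiescent state ... */
    }

    /*
     * After: each caller flushes the deferred wakeup on its own, so
     * later patches can give idle, user and guest entry distinct
     * treatments.
     */
    void rcu_idle_enter(void)
    {
            struct rcu_data *rdp = this_cpu_ptr(&rcu_data);

            lockdep_assert_irqs_disabled();
            do_nocb_deferred_wakeup(rdp);
            rcu_eqs_enter(false);
    }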
---
 kernel/rcu/tree.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 40e5e3dd253e..63032e5620b9 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -644,7 +644,6 @@ static noinstr void rcu_eqs_enter(bool user)
 	trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, atomic_read(&rdp->dynticks));
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
 	rdp = this_cpu_ptr(&rcu_data);
-	do_nocb_deferred_wakeup(rdp);
 	rcu_prepare_for_idle();
 	rcu_preempt_deferred_qs(current);
@@ -672,7 +671,10 @@ static noinstr void rcu_eqs_enter(bool user)
  */
 void rcu_idle_enter(void)
 {
+	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
+
 	lockdep_assert_irqs_disabled();
+	do_nocb_deferred_wakeup(rdp);
 	rcu_eqs_enter(false);
 }
 EXPORT_SYMBOL_GPL(rcu_idle_enter);
@@ -691,7 +693,14 @@ EXPORT_SYMBOL_GPL(rcu_idle_enter);
  */
 noinstr void rcu_user_enter(void)
 {
+	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
+
 	lockdep_assert_irqs_disabled();
+
+	instrumentation_begin();
+	do_nocb_deferred_wakeup(rdp);
+	instrumentation_end();
+
 	rcu_eqs_enter(true);
 }
 #endif /* CONFIG_NO_HZ_FULL */
From patchwork Sun Jan 31 23:05:45 2021
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 374592
From: Frederic Weisbecker
To: "Paul E. McKenney", Peter Zijlstra
Cc: LKML, Frederic Weisbecker, Paolo Bonzini, "Rafael J. Wysocki",
    Thomas Gleixner, stable@vger.kernel.org, Ingo Molnar
Subject: [PATCH 2/5] rcu/nocb: Perform deferred wake up before last idle's need_resched() check
Date: Mon, 1 Feb 2021 00:05:45 +0100
Message-Id: <20210131230548.32970-3-frederic@kernel.org>
In-Reply-To: <20210131230548.32970-1-frederic@kernel.org>
References: <20210131230548.32970-1-frederic@kernel.org>
X-Mailing-List: stable@vger.kernel.org

Entering RCU idle mode may cause a deferred wake up of an RCU NOCB_GP
kthread (rcuog) to be serviced. Usually a local wake up happening while
running the idle task is handled by one of the need_resched() checks
carefully placed within the idle loop, which can then break out to the
scheduler. Unfortunately, the call to rcu_idle_enter() already sits
beyond the last generic need_resched() check, so we may halt the CPU
with a resched request unhandled, leaving the woken task hanging.

Fix this by splitting the rcuog wakeup handling out of rcu_idle_enter()
and placing it before the last generic need_resched() check in the idle
loop. It is then assumed that no call to call_rcu() will be performed
afterward in the idle loop until the CPU is put in low power mode.

Reported-by: Paul E. McKenney
Fixes: 96d3fd0d315a ("rcu: Break call_rcu() deadlock involving scheduler and perf")
Cc: stable@vger.kernel.org
Cc: Rafael J. Wysocki
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Ingo Molnar
Signed-off-by: Frederic Weisbecker
---
 include/linux/rcupdate.h | 2 ++
 kernel/rcu/tree.c        | 3 ---
 kernel/rcu/tree_plugin.h | 5 +++++
 kernel/sched/idle.c      | 3 +++
 4 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index fd02c5fa60cb..36c2119de702 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -110,8 +110,10 @@ static inline void rcu_user_exit(void) { }
 
 #ifdef CONFIG_RCU_NOCB_CPU
 void rcu_init_nohz(void);
+void rcu_nocb_flush_deferred_wakeup(void);
 #else /* #ifdef CONFIG_RCU_NOCB_CPU */
 static inline void rcu_init_nohz(void) { }
+static inline void rcu_nocb_flush_deferred_wakeup(void) { }
 #endif /* #else #ifdef CONFIG_RCU_NOCB_CPU */
 
 /**
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 63032e5620b9..82838e93b498 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -671,10 +671,7 @@ static noinstr void rcu_eqs_enter(bool user)
  */
 void rcu_idle_enter(void)
 {
-	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
-
 	lockdep_assert_irqs_disabled();
-	do_nocb_deferred_wakeup(rdp);
 	rcu_eqs_enter(false);
 }
 EXPORT_SYMBOL_GPL(rcu_idle_enter);
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 7e291ce0a1d6..d5b38c28abd1 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2187,6 +2187,11 @@ static void do_nocb_deferred_wakeup(struct rcu_data *rdp)
 		do_nocb_deferred_wakeup_common(rdp);
 }
 
+void rcu_nocb_flush_deferred_wakeup(void)
+{
+	do_nocb_deferred_wakeup(this_cpu_ptr(&rcu_data));
+}
+
 void __init rcu_init_nohz(void)
 {
 	int cpu;
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 305727ea0677..b601a3aa2152 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -55,6 +55,7 @@ __setup("hlt", cpu_idle_nopoll_setup);
 static noinline int __cpuidle cpu_idle_poll(void)
 {
 	trace_cpu_idle(0, smp_processor_id());
+	rcu_nocb_flush_deferred_wakeup();
 	stop_critical_timings();
 	rcu_idle_enter();
 	local_irq_enable();
@@ -173,6 +174,8 @@ static void cpuidle_idle_call(void)
 	struct cpuidle_driver *drv = cpuidle_get_cpu_driver(dev);
 	int next_state, entered_state;
 
+	rcu_nocb_flush_deferred_wakeup();
+
 	/*
 	 * Check if the idle task must be rescheduled. If it is the
 	 * case, exit the function after re-enabling the local irq.
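The fix hinges on ordering within the idle loop. A simplified sketch of
the resulting cpuidle_idle_call() flow after this patch (assuming the
v5.11-era kernel/sched/idle.c; unrelated details elided):

    static void cpuidle_idle_call(void)
    {
            /* May wake rcuog, which sets TIF_NEED_RESCHED on this CPU: */
            rcu_nocb_flush_deferred_wakeup();

            /*
             * Last generic need_resched() check: a wakeup performed
             * just above is still noticed here, and we break back out
             * to do_idle(), which hands off to the scheduler.
             */
            if (need_resched()) {
                    local_irq_enable();
                    return;
            }

            /*
             * Past this point there is no further rescheduling
             * opportunity before the CPU is halted, which is why the
             * flush can no longer live inside rcu_idle_enter().
             */
            stop_critical_timings();
            rcu_idle_enter();
            /* ... enter the selected low power state ... */
    }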
From patchwork Sun Jan 31 23:05:46 2021
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 374201
From: Frederic Weisbecker
To: "Paul E. McKenney", Peter Zijlstra
Cc: LKML, Frederic Weisbecker, Paolo Bonzini, "Rafael J. Wysocki",
    Thomas Gleixner, stable@vger.kernel.org, Ingo Molnar
Subject: [PATCH 3/5] rcu/nocb: Trigger self-IPI on late deferred wake up before user resume
Date: Mon, 1 Feb 2021 00:05:46 +0100
Message-Id: <20210131230548.32970-4-frederic@kernel.org>
In-Reply-To: <20210131230548.32970-1-frederic@kernel.org>
References: <20210131230548.32970-1-frederic@kernel.org>
X-Mailing-List: stable@vger.kernel.org

Entering RCU idle mode may cause a deferred wake up of an RCU NOCB_GP
kthread (rcuog) to be serviced. Unfortunately, the call to
rcu_user_enter() is already past the last rescheduling opportunity
before we resume to userspace or guest mode, so we may return there
with the woken task ignored.

The last-resort fix that covers every callsite is to trigger a self-IPI
(nohz_full requires the architecture to implement
arch_irq_work_raise()), which forces a reschedule on the IRQ tail or on
guest exit. Eventually, every site that wants saner treatment will need
to carefully place a call to rcu_nocb_flush_deferred_wakeup() before
the last explicit need_resched() check upon resume.

Reported-by: Paul E. McKenney
Fixes: 96d3fd0d315a ("rcu: Break call_rcu() deadlock involving scheduler and perf")
Cc: stable@vger.kernel.org
Cc: Rafael J. Wysocki
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Ingo Molnar
Signed-off-by: Frederic Weisbecker
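For clarity, the mechanism reduces to the following (a condensed view
of the additions in the diff below, with explanatory comments added):

    /*
     * Intentionally empty: only the IRQ tail matters. irq_work_queue()
     * raises a self-IPI on the local CPU, and the return-from-interrupt
     * path notices TIF_NEED_RESCHED and calls into the scheduler.
     */
    static void late_wakeup_func(struct irq_work *work)
    {
    }

    static DEFINE_PER_CPU(struct irq_work, late_wakeup_work) =
            IRQ_WORK_INIT(late_wakeup_func);

    /*
     * On the EQS entry path, once it is too late to reschedule
     * directly: if flushing the deferred wakeup actually woke rcuog
     * and that set need_resched(), fire the self-IPI as a last-resort
     * rescheduling point.
     */
    if (do_nocb_deferred_wakeup(rdp) && need_resched())
            irq_work_queue(this_cpu_ptr(&late_wakeup_work));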
---
 kernel/rcu/tree.c        | 21 ++++++++++++++++++++-
 kernel/rcu/tree.h        |  2 +-
 kernel/rcu/tree_plugin.h | 25 ++++++++++++++++---------
 3 files changed, 37 insertions(+), 11 deletions(-)

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 82838e93b498..4b1e5bd16492 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -677,6 +677,18 @@ void rcu_idle_enter(void)
 EXPORT_SYMBOL_GPL(rcu_idle_enter);
 
 #ifdef CONFIG_NO_HZ_FULL
+
+/*
+ * An empty function that will trigger a reschedule on
+ * IRQ tail once IRQs get re-enabled on userspace resume.
+ */
+static void late_wakeup_func(struct irq_work *work)
+{
+}
+
+static DEFINE_PER_CPU(struct irq_work, late_wakeup_work) =
+	IRQ_WORK_INIT(late_wakeup_func);
+
 /**
  * rcu_user_enter - inform RCU that we are resuming userspace.
  *
@@ -694,12 +706,19 @@ noinstr void rcu_user_enter(void)
 
 	lockdep_assert_irqs_disabled();
 
+	/*
+	 * We may be past the last rescheduling opportunity in the entry code.
+	 * Trigger a self IPI that will fire and reschedule once we resume to
+	 * user/guest mode.
+	 */
 	instrumentation_begin();
-	do_nocb_deferred_wakeup(rdp);
+	if (do_nocb_deferred_wakeup(rdp) && need_resched())
+		irq_work_queue(this_cpu_ptr(&late_wakeup_work));
 	instrumentation_end();
 
 	rcu_eqs_enter(true);
 }
+
 #endif /* CONFIG_NO_HZ_FULL */
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 7708ed161f4a..9226f4021a36 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -433,7 +433,7 @@ static bool rcu_nocb_try_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
 static void __call_rcu_nocb_wake(struct rcu_data *rdp, bool was_empty,
 				 unsigned long flags);
 static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp);
-static void do_nocb_deferred_wakeup(struct rcu_data *rdp);
+static bool do_nocb_deferred_wakeup(struct rcu_data *rdp);
 static void rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp);
 static void rcu_spawn_cpu_nocb_kthread(int cpu);
 static void __init rcu_spawn_nocb_kthreads(void);
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index d5b38c28abd1..384856e4d13e 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1631,8 +1631,8 @@ bool rcu_is_nocb_cpu(int cpu)
  * Kick the GP kthread for this NOCB group. Caller holds ->nocb_lock
  * and this function releases it.
  */
-static void wake_nocb_gp(struct rcu_data *rdp, bool force,
-			 unsigned long flags)
+static bool wake_nocb_gp(struct rcu_data *rdp, bool force,
+			 unsigned long flags)
 	__releases(rdp->nocb_lock)
 {
 	bool needwake = false;
@@ -1643,7 +1643,7 @@ static void wake_nocb_gp(struct rcu_data *rdp, bool force,
 		trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("AlreadyAwake"));
 		rcu_nocb_unlock_irqrestore(rdp, flags);
-		return;
+		return false;
 	}
 	del_timer(&rdp->nocb_timer);
 	rcu_nocb_unlock_irqrestore(rdp, flags);
@@ -1656,6 +1656,8 @@ static void wake_nocb_gp(struct rcu_data *rdp, bool force,
 	raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
 	if (needwake)
 		wake_up_process(rdp_gp->nocb_gp_kthread);
+
+	return needwake;
 }
@@ -2152,20 +2154,23 @@ static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp)
 }
 
 /* Do a deferred wakeup of rcu_nocb_kthread(). */
-static void do_nocb_deferred_wakeup_common(struct rcu_data *rdp)
+static bool do_nocb_deferred_wakeup_common(struct rcu_data *rdp)
 {
 	unsigned long flags;
 	int ndw;
+	int ret;
 
 	rcu_nocb_lock_irqsave(rdp, flags);
 	if (!rcu_nocb_need_deferred_wakeup(rdp)) {
 		rcu_nocb_unlock_irqrestore(rdp, flags);
-		return;
+		return false;
 	}
 	ndw = READ_ONCE(rdp->nocb_defer_wakeup);
 	WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
-	wake_nocb_gp(rdp, ndw == RCU_NOCB_WAKE_FORCE, flags);
+	ret = wake_nocb_gp(rdp, ndw == RCU_NOCB_WAKE_FORCE, flags);
 	trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("DeferredWake"));
+
+	return ret;
 }
 
 /* Do a deferred wakeup of rcu_nocb_kthread() from a timer handler. */
@@ -2181,10 +2186,11 @@ static void do_nocb_deferred_wakeup_timer(struct timer_list *t)
  * This means we do an inexact common-case check. Note that if
 * we miss, ->nocb_timer will eventually clean things up.
  */
-static void do_nocb_deferred_wakeup(struct rcu_data *rdp)
+static bool do_nocb_deferred_wakeup(struct rcu_data *rdp)
 {
 	if (rcu_nocb_need_deferred_wakeup(rdp))
-		do_nocb_deferred_wakeup_common(rdp);
+		return do_nocb_deferred_wakeup_common(rdp);
+	return false;
 }
 
 void rcu_nocb_flush_deferred_wakeup(void)
@@ -2523,8 +2529,9 @@ static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp)
 	return false;
 }
 
-static void do_nocb_deferred_wakeup(struct rcu_data *rdp)
+static bool do_nocb_deferred_wakeup(struct rcu_data *rdp)
 {
+	return false;
 }
 
 static void rcu_spawn_cpu_nocb_kthread(int cpu)
From patchwork Sun Jan 31 23:05:47 2021
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 374591
From: Frederic Weisbecker
To: "Paul E. McKenney", Peter Zijlstra
Cc: LKML, Frederic Weisbecker, Paolo Bonzini, "Rafael J. Wysocki",
    Thomas Gleixner, stable@vger.kernel.org, Ingo Molnar
Subject: [PATCH 4/5] entry: Explicitly flush pending rcuog wakeup before last rescheduling point
Date: Mon, 1 Feb 2021 00:05:47 +0100
Message-Id: <20210131230548.32970-5-frederic@kernel.org>
In-Reply-To: <20210131230548.32970-1-frederic@kernel.org>
References: <20210131230548.32970-1-frederic@kernel.org>
X-Mailing-List: stable@vger.kernel.org

Following the idle loop model, cleanly check for pending rcuog wakeup
before the last rescheduling point upon resuming to user mode. This way
we can avoid doing it from rcu_user_enter() with the last-resort
self-IPI hack that enforces rescheduling.

Signed-off-by: Frederic Weisbecker
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Paul E. McKenney
Cc: Rafael J. Wysocki
---
 kernel/entry/common.c |  7 +++++++
 kernel/rcu/tree.c     | 12 +++++++-----
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 378341642f94..7c61460a0867 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -184,6 +184,10 @@ static unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
 		 * enabled above.
 		 */
 		local_irq_disable_exit_to_user();
+
+		/* Check if any of the above work has queued a deferred wakeup */
+		rcu_nocb_flush_deferred_wakeup();
+
 		ti_work = READ_ONCE(current_thread_info()->flags);
 	}
 
@@ -197,6 +201,9 @@ static void exit_to_user_mode_prepare(struct pt_regs *regs)
 
 	lockdep_assert_irqs_disabled();
 
+	/* Flush pending rcuog wakeup before the last need_resched() check */
+	rcu_nocb_flush_deferred_wakeup();
+
 	if (unlikely(ti_work & EXIT_TO_USER_MODE_WORK))
 		ti_work = exit_to_user_mode_loop(regs, ti_work);
 
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 4b1e5bd16492..2ebc211fffcb 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -707,13 +707,15 @@ noinstr void rcu_user_enter(void)
 	lockdep_assert_irqs_disabled();
 
 	/*
-	 * We may be past the last rescheduling opportunity in the entry code.
-	 * Trigger a self IPI that will fire and reschedule once we resume to
-	 * user/guest mode.
+	 * Other than the generic entry implementation, we may be past the last
+	 * rescheduling opportunity in the entry code. Trigger a self IPI
+	 * that will fire and reschedule once we resume in user/guest mode.
 	 */
 	instrumentation_begin();
-	if (do_nocb_deferred_wakeup(rdp) && need_resched())
-		irq_work_queue(this_cpu_ptr(&late_wakeup_work));
+	if (!IS_ENABLED(CONFIG_GENERIC_ENTRY) || (current->flags & PF_VCPU)) {
+		if (do_nocb_deferred_wakeup(rdp) && need_resched())
+			irq_work_queue(this_cpu_ptr(&late_wakeup_work));
+	}
 	instrumentation_end();
 
 	rcu_eqs_enter(true);
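The per-iteration ordering in exit_to_user_mode_loop() is the
load-bearing part. A simplified sketch of the loop body after this
patch, with annotations:

    while (ti_work & EXIT_TO_USER_MODE_WORK) {
            local_irq_enable_exit_to_user(ti_work);

            /*
             * Signal handling, schedule(), etc. run here and may
             * invoke call_rcu(), which on a NOCB CPU can defer an
             * rcuog wakeup.
             */

            local_irq_disable_exit_to_user();

            /*
             * Flush that wakeup while IRQs are off; it may set
             * TIF_NEED_RESCHED...
             */
            rcu_nocb_flush_deferred_wakeup();

            /* ...which this re-read observes, so the loop runs again: */
            ti_work = READ_ONCE(current_thread_info()->flags);
    }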
From patchwork Sun Jan 31 23:05:48 2021
X-Patchwork-Submitter: Frederic Weisbecker
X-Patchwork-Id: 374200
From: Frederic Weisbecker
To: "Paul E. McKenney", Peter Zijlstra
Cc: LKML, Frederic Weisbecker, Paolo Bonzini, "Rafael J. Wysocki",
    Thomas Gleixner, stable@vger.kernel.org, Ingo Molnar
Subject: [PATCH 5/5] entry/kvm: Explicitly flush pending rcuog wakeup before last rescheduling point
Date: Mon, 1 Feb 2021 00:05:48 +0100
Message-Id: <20210131230548.32970-6-frederic@kernel.org>
In-Reply-To: <20210131230548.32970-1-frederic@kernel.org>
References: <20210131230548.32970-1-frederic@kernel.org>
X-Mailing-List: stable@vger.kernel.org

Following the idle loop model, cleanly check for pending rcuog wakeup
before the last rescheduling point upon resuming to guest mode. This
way we can avoid doing it from rcu_user_enter() with the last-resort
self-IPI hack that enforces rescheduling.

Suggested-by: Peter Zijlstra
Signed-off-by: Frederic Weisbecker
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Paul E. McKenney
Cc: Rafael J. Wysocki
Cc: Paolo Bonzini
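On x86 the hook lands in kvm_vcpu_exit_request(), which the vcpu entry
path evaluates with IRQs disabled just before entering the guest.
Roughly (a simplified sketch of the vcpu_enter_guest() flow, not the
literal code):

    local_irq_disable();
    vcpu->mode = IN_GUEST_MODE;

    /*
     * kvm_vcpu_exit_request() now first flushes any deferred rcuog
     * wakeup via xfer_to_guest_mode_prepare(), so a resulting
     * TIF_NEED_RESCHED is seen by xfer_to_guest_mode_work_pending()
     * within the same check:
     */
    if (kvm_vcpu_exit_request(vcpu)) {
            vcpu->mode = OUTSIDE_GUEST_MODE;
            local_irq_enable();
            return 1;       /* bail out; the work/resched gets handled */
    }

    /* ... enter the guest ... */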
---
 arch/x86/kvm/x86.c        |  1 +
 include/linux/entry-kvm.h | 14 ++++++++++++++
 kernel/rcu/tree.c         | 44 +++++++++++++++++++++++++++++---------
 kernel/rcu/tree_plugin.h  |  1 +
 4 files changed, 50 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9a8969a6dd06..7fd4f70c229b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1773,6 +1773,7 @@ EXPORT_SYMBOL_GPL(kvm_emulate_wrmsr);
 
 bool kvm_vcpu_exit_request(struct kvm_vcpu *vcpu)
 {
+	xfer_to_guest_mode_prepare();
 	return vcpu->mode == EXITING_GUEST_MODE || kvm_request_pending(vcpu) ||
 		xfer_to_guest_mode_work_pending();
 }
diff --git a/include/linux/entry-kvm.h b/include/linux/entry-kvm.h
index 9b93f8584ff7..8b2b1d68b954 100644
--- a/include/linux/entry-kvm.h
+++ b/include/linux/entry-kvm.h
@@ -46,6 +46,20 @@ static inline int arch_xfer_to_guest_mode_handle_work(struct kvm_vcpu *vcpu,
  */
 int xfer_to_guest_mode_handle_work(struct kvm_vcpu *vcpu);
 
+/**
+ * xfer_to_guest_mode_prepare - Perform last minute preparation work that
+ *				needs to be handled while IRQs are disabled
+ *				before entering guest mode.
+ *
+ * Has to be invoked with interrupts disabled before the last call
+ * to xfer_to_guest_mode_work_pending().
+ */
+static inline void xfer_to_guest_mode_prepare(void)
+{
+	lockdep_assert_irqs_disabled();
+	rcu_nocb_flush_deferred_wakeup();
+}
+
 /**
  * __xfer_to_guest_mode_work_pending - Check if work is pending
  *
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 2ebc211fffcb..ce17b8477442 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -678,9 +678,10 @@ EXPORT_SYMBOL_GPL(rcu_idle_enter);
 
 #ifdef CONFIG_NO_HZ_FULL
 
+#if !defined(CONFIG_GENERIC_ENTRY) || !defined(CONFIG_KVM_XFER_TO_GUEST_WORK)
 /*
  * An empty function that will trigger a reschedule on
- * IRQ tail once IRQs get re-enabled on userspace resume.
+ * IRQ tail once IRQs get re-enabled on userspace/guest resume.
  */
 static void late_wakeup_func(struct irq_work *work)
 {
@@ -689,6 +690,37 @@ static void late_wakeup_func(struct irq_work *work)
 static DEFINE_PER_CPU(struct irq_work, late_wakeup_work) =
 	IRQ_WORK_INIT(late_wakeup_func);
 
+/*
+ * If either:
+ *
+ * 1) the task is about to enter guest mode and $ARCH doesn't support KVM generic work
+ * 2) the task is about to enter user mode and $ARCH doesn't support generic entry
+ *
+ * then the late RCU wake ups aren't supported in the resched loops and our
+ * last resort is to fire a local irq_work that will trigger a reschedule once
+ * IRQs get re-enabled again.
+ */
+noinstr static void rcu_irq_work_resched(void)
+{
+	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
+
+	if (IS_ENABLED(CONFIG_GENERIC_ENTRY) && !(current->flags & PF_VCPU))
+		return;
+
+	if (IS_ENABLED(CONFIG_KVM_XFER_TO_GUEST_WORK) && (current->flags & PF_VCPU))
+		return;
+
+	instrumentation_begin();
+	if (do_nocb_deferred_wakeup(rdp) && need_resched()) {
+		irq_work_queue(this_cpu_ptr(&late_wakeup_work));
+	}
+	instrumentation_end();
+}
+
+#else
+static inline void rcu_irq_work_resched(void) { }
+#endif
+
 /**
  * rcu_user_enter - inform RCU that we are resuming userspace.
  *
@@ -702,8 +734,6 @@ static DEFINE_PER_CPU(struct irq_work, late_wakeup_work) =
  */
 noinstr void rcu_user_enter(void)
 {
-	struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
-
 	lockdep_assert_irqs_disabled();
 
 	/*
@@ -711,13 +741,7 @@ noinstr void rcu_user_enter(void)
 	 * rescheduling opportunity in the entry code. Trigger a self IPI
 	 * that will fire and reschedule once we resume in user/guest mode.
 	 */
-	instrumentation_begin();
-	if (!IS_ENABLED(CONFIG_GENERIC_ENTRY) || (current->flags & PF_VCPU)) {
-		if (do_nocb_deferred_wakeup(rdp) && need_resched())
-			irq_work_queue(this_cpu_ptr(&late_wakeup_work));
-	}
-	instrumentation_end();
-
+	rcu_irq_work_resched();
 	rcu_eqs_enter(true);
 }
 
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 384856e4d13e..cdc1b7651c03 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -2197,6 +2197,7 @@ void rcu_nocb_flush_deferred_wakeup(void)
 {
 	do_nocb_deferred_wakeup(this_cpu_ptr(&rcu_data));
 }
+EXPORT_SYMBOL_GPL(rcu_nocb_flush_deferred_wakeup);
 
 void __init rcu_init_nohz(void)
 {
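Taken together, the series establishes a simple contract: flush pending
rcuog wakeups with IRQs disabled before the last need_resched() check,
and only then enter the RCU extended quiescent state. A sketch of that
pattern for an architecture lacking the generic entry code
(arch_exit_to_user() is a hypothetical name for illustration, not an
existing hook):

    static void arch_exit_to_user(void)	/* hypothetical, for illustration */
    {
            lockdep_assert_irqs_disabled();

            /* May wake rcuog and set TIF_NEED_RESCHED: */
            rcu_nocb_flush_deferred_wakeup();

            while (need_resched()) {
                    local_irq_enable();
                    schedule();
                    local_irq_disable();
                    /* schedule() may have queued a new deferred wakeup: */
                    rcu_nocb_flush_deferred_wakeup();
            }

            /* No rcuog wakeup may be left pending past this point: */
            rcu_user_enter();
    }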