Message ID | 20210107091841.19798-1-ran.wang_1@nxp.com |
---|---|
State | New |
Headers | show |
Series | [RFC] rt: kernel/sched/core: fix kthread_park() pending too long when CPU un-plugged | expand |
On Thu, Jan 07, 2021 at 05:18:41PM +0800, Ran Wang wrote: > + > + if (IS_ENABLED(CONFIG_PREEMPT_RT) && > + !strncmp(p->comm, "ksoftirqd/", 10)) > + schedule_hrtimeout(&to, > + HRTIMER_MODE_REL | HRTIMER_MODE_HARD); > + else > + schedule_hrtimeout(&to, HRTIMER_MODE_REL); This is horrific, why did you not self-censor and spare me the mental anguish of having to formulate a CoC compliant response? It also violates coding style, but given the total lack of any sense, that seems like a minor detail. Why can't we use HRTIMER_MODE_HARD unconditionally?
Hi Sebastian, Peter Thursday, January 7, 2021 11:29 PM, Sebastian Siewior wrote: > > On 2021-01-07 11:45:39 [+0100], Peter Zijlstra wrote: > > On Thu, Jan 07, 2021 at 05:18:41PM +0800, Ran Wang wrote: > > > + > > > + if (IS_ENABLED(CONFIG_PREEMPT_RT) && > > > + !strncmp(p->comm, "ksoftirqd/", 10)) > > > + schedule_hrtimeout(&to, > > > + HRTIMER_MODE_REL | HRTIMER_MODE_HARD); > > > + else > > > + schedule_hrtimeout(&to, HRTIMER_MODE_REL); > > > > This is horrific, why did you not self-censor and spare me the mental > > anguish of having to formulate a CoC compliant response? > > > > It also violates coding style, but given the total lack of any sense, > > that seems like a minor detail. > > > > Why can't we use HRTIMER_MODE_HARD unconditionally? > > I had a similar patch in -RT and dropped it in v5.10-rc7-rt16. > It was added because RT couldn't boot since creating the boot-threads didn't work before the ksoftirqd was up. This was fixed by commit > 26c7295be0c5e ("kthread: Do not preempt current task if it is going to call schedule()") I tried applying above commit to linux-5.6.y-rt, it could resolve my problem on LX2160ARDB, THANKS! > and live was good again. > tglx (also) suggested to add HRTIMER_MODE_HARD unconditionally (it looked at SYSTEM_STATE back then) and I was only worried some > abuse via userland. Got it. Regards, Ran > This sleep can be triggered by ptrace/strace() and with brief testing I can trigger the sleep there but I don't get it anywhere near where I > would notice it with cyclictest. > > Sebastian
On 2021-01-08 08:45:14 [+0000], Ran Wang wrote: > Hi Sebastian, Peter Hi, > > I had a similar patch in -RT and dropped it in v5.10-rc7-rt16. > > It was added because RT couldn't boot since creating the boot-threads didn't work before the ksoftirqd was up. This was fixed by commit > > 26c7295be0c5e ("kthread: Do not preempt current task if it is going to call schedule()") > > I tried applying above commit to linux-5.6.y-rt, it could resolve my problem on LX2160ARDB, THANKS! so in other words all this could have been avoided by using a supported or maintained RT series. The v5.4 series has this patch, v5.6 isn't maintained anymore so it is likely that there is more missing. Sebastian
Hi Sebastian, On Friday, January 8, 2021 5:05 PM, Sebastian Siewior wrote: > > On 2021-01-08 08:45:14 [+0000], Ran Wang wrote: > > Hi Sebastian, Peter > Hi, > > > > I had a similar patch in -RT and dropped it in v5.10-rc7-rt16. > > > It was added because RT couldn't boot since creating the boot-threads didn't work before the ksoftirqd was up. This was fixed by commit > > > 26c7295be0c5e ("kthread: Do not preempt current task if it is > > > going to call schedule()") > > > > I tried applying above commit to linux-5.6.y-rt, it could resolve my problem on LX2160ARDB, THANKS! > > so in other words all this could have been avoided by using a supported or maintained RT series. The v5.4 series has this patch, v5.6 isn't > maintained anymore so it is likely that there is more missing. Thanks for let me know this. The reason I trying linux-5.6-rt is that I have encountered other more serious issues on later RT version (even with v5.10-rc7-rt16), one of them is CPU hot plug got stuck in irq_work_sync() which called by sugov_stop(), failure happen at 1st loop stress test every time. I will try to collect more information and create another mail thread later. Thanks & Regards, Ran > Sebastian
On Thu, Jan 07, 2021 at 04:28:43PM +0100, Sebastian Siewior wrote: > This sleep can be triggered by ptrace/strace() and with brief testing I > can trigger the sleep there but I don't get it anywhere near where I > would notice it with cyclictest. It's a single task wakeup (the caller), doing that from hardirq context really should not be a problem, we do lots of that in RT already.
On 2021-01-08 10:32:36 [+0100], Peter Zijlstra wrote: > It's a single task wakeup (the caller), doing that from hardirq context > really should not be a problem, we do lots of that in RT already. I'm not worry about that single wakeup but about an artificial case where you manage to accumulate multiple single wake ups in a short time frame. Sebastian
diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 792da55..4cc742a 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -2054,10 +2054,15 @@ unsigned long wait_task_inactive(struct task_struct *p, long match_state) ktime_t to = NSEC_PER_SEC / HZ; set_current_state(TASK_UNINTERRUPTIBLE); - schedule_hrtimeout(&to, HRTIMER_MODE_REL); + + if (IS_ENABLED(CONFIG_PREEMPT_RT) && + !strncmp(p->comm, "ksoftirqd/", 10)) + schedule_hrtimeout(&to, + HRTIMER_MODE_REL | HRTIMER_MODE_HARD); + else + schedule_hrtimeout(&to, HRTIMER_MODE_REL); continue; } - /* * Ahh, all good. It wasn't running, and it wasn't * runnable, which means that it will never become
When doing CPU un-plug stress test, function smpboot_park_threads() would get call to park kernel threads (which including ksoftirqd) on that CPU core, and function wait_task_inactive() would yield for those queued task(s) by calling schedule_hrtimerout() with mode of HRTIMER_MODE_REL. stack trace: ... smpboot_thread_fn cpuhp_thread_fun cpuhp_invoke_callback smpboot_park_threads smpboot_park_thread: ksoftirqd/1 kthread_park wait_task_inactive schedule_hrtimerout However, when PREEMPT_RT is set, this would cause a pending issue since schedule_hrtimerout() depend on thread ksoftirqd to complete related work if it using HRTIMER_MODE_SOFT. So force using HRTIMER_MODE_HARD in such case. Suggested-by: Jiafei Pan <jiafei.pan@nxp.com> Signed-off-by: Ran Wang <ran.wang_1@nxp.com> --- kernel/sched/core.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)