Message ID | 5cc8b55c.1c69fb81.c3759.1c27@mx.google.com |
---|---|
State | Superseded |
Headers | show |
Series | next/master boot bisection: next-20190430 on beagle-xm | expand |
On 2019-04-30 13:51:40 [-0700], kernelci.org bot wrote: > next/master boot bisection: next-20190430 on beagle-xm > > Summary: > Start: f43b05fd4c17 Add linux-next specific files for 20190430 > Details: https://kernelci.org/boot/id/5cc84d7359b514b7ab55847b > Plain log: https://storage.kernelci.org//next/master/next-20190430/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-7/lab-baylibre/boot-omap3-beagle-xm.txt > HTML log: https://storage.kernelci.org//next/master/next-20190430/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-7/lab-baylibre/boot-omap3-beagle-xm.html > Result: 6d25be5782e4 sched/core, workqueues: Distangle worker accounting from rq lock > > Checks: > revert: PASS > verify: PASS > > Parameters: > Tree: next > URL: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git > Branch: master > Target: beagle-xm > CPU arch: arm > Lab: lab-baylibre > Compiler: gcc-7 > Config: multi_v7_defconfig+CONFIG_SMP=n > Test suite: boot > > Breaking commit found: > > ------------------------------------------------------------------------------- > commit 6d25be5782e482eb93e3de0c94d0a517879377d0 > Author: Thomas Gleixner <tglx@linutronix.de> > Date: Wed Mar 13 17:55:48 2019 +0100 > > sched/core, workqueues: Distangle worker accounting from rq lock According to the bootlog it just stopped its output. This commit is in next since a week or two so I don't understand why this pops up now. I just revived my BBB and I can boot that commit in question. Currently that as close as I get to a beagle-xm. Looking at https://kernelci.org/boot/id/5cc9a64359b514a77f5584af/ it seems that the very same board managed to boot linux-next for next-20190501. Side note: I can't boot next-20190501 on my BBB, bisect points to commit 1a5cd7c23cc52 ("bus: ti-sysc: Enable all clocks directly during init to read revision") any idea? Sebastian
On 2019-05-01 09:29:44 [-0700], Tony Lindgren wrote: > Hi, > > * Sebastian Andrzej Siewior <bigeasy@linutronix.de> [190501 15:37]: > > > > On 2019-04-30 13:51:40 [-0700], kernelci.org bot wrote: > > > next/master boot bisection: next-20190430 on beagle-xm > > > > > > Summary: > > > Start: f43b05fd4c17 Add linux-next specific files for 20190430 > > > Details: https://kernelci.org/boot/id/5cc84d7359b514b7ab55847b > > > Plain log: https://storage.kernelci.org//next/master/next-20190430/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-7/lab-baylibre/boot-omap3-beagle-xm.txt > > > HTML log: https://storage.kernelci.org//next/master/next-20190430/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-7/lab-baylibre/boot-omap3-beagle-xm.html > > > Result: 6d25be5782e4 sched/core, workqueues: Distangle worker accounting from rq lock > > > > > > Checks: > > > revert: PASS > > > verify: PASS > > > > > > Parameters: > > > Tree: next > > > URL: git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git > > > Branch: master > > > Target: beagle-xm > > > CPU arch: arm > > > Lab: lab-baylibre > > > Compiler: gcc-7 > > > Config: multi_v7_defconfig+CONFIG_SMP=n > > > Test suite: boot > > > > > > Breaking commit found: > > > > > > ------------------------------------------------------------------------------- > > > commit 6d25be5782e482eb93e3de0c94d0a517879377d0 > > > Author: Thomas Gleixner <tglx@linutronix.de> > > > Date: Wed Mar 13 17:55:48 2019 +0100 > > > > > > sched/core, workqueues: Distangle worker accounting from rq lock > > > > According to the bootlog it just stopped its output. This commit is in > > next since a week or two so I don't understand why this pops up now. > > Adding Kevin to Cc, he just confirmed on #armlinux irc that he is able to > reproduce this with CONFIG_SMP=n and root=/dev/ram0. I could not reproduce > this issue so far on omap3 with NFSroot at least. So that problem remains even that the job for today passed? > > I just revived my BBB and I can boot that commit in question. Currently > > that as close as I get to a beagle-xm. > > Looking at > > https://kernelci.org/boot/id/5cc9a64359b514a77f5584af/ > > it seems that the very same board managed to boot linux-next for > > next-20190501. > > > > Side note: I can't boot next-20190501 on my BBB, bisect points to commit > > 1a5cd7c23cc52 ("bus: ti-sysc: Enable all clocks directly during init to read revision") > > > > any idea? > > Oh interesting thanks for letting me know. Next boots fine for me here > with NFSroot on BBB. > > Do you have some output on what happens so I can investigate? Nope, the console remains dark. > Regards, > > Tony Sebastian
On 2019-05-01 13:21:49 [-0700], Tony Lindgren wrote: > Hi, Hi, > OK I found two issues. It seems that d_can also needs osc clock > on am335x. And there's no revision register for d_can.. We're now > reading the CTL register unnecessarily. > > Below is what I hope fixes the boot issue for you, care to boot > test? yup, that boots. Thanks. > If this helps I'll send out proper patches for for both issues. > > Regards, > > Tony Sebastian
"kernelci.org bot" <bot@kernelci.org> writes: > * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * > * This automated bisection report was sent to you on the basis * > * that you may be involved with the breaking commit it has * > * found. No manual investigation has been done to verify it, * > * and the root cause of the problem may be somewhere else. * > * Hope this helps! * > * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * > > next/master boot bisection: next-20190430 on beagle-xm > > Summary: > Start: f43b05fd4c17 Add linux-next specific files for 20190430 > Details: https://kernelci.org/boot/id/5cc84d7359b514b7ab55847b > Plain log: https://storage.kernelci.org//next/master/next-20190430/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-7/lab-baylibre/boot-omap3-beagle-xm.txt > HTML log: https://storage.kernelci.org//next/master/next-20190430/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-7/lab-baylibre/boot-omap3-beagle-xm.html > Result: 6d25be5782e4 sched/core, workqueues: Distangle worker accounting from rq lock I was able to reproduce this in next-20190430, but... I'm not sure what fixed it, but this is passing again in today's linux-next (next-20190501) Kevin
On 2019-05-01 14:48:44 [-0700], Kevin Hilman wrote: > "kernelci.org bot" <bot@kernelci.org> writes: > > > * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * > > * This automated bisection report was sent to you on the basis * > > * that you may be involved with the breaking commit it has * > > * found. No manual investigation has been done to verify it, * > > * and the root cause of the problem may be somewhere else. * > > * Hope this helps! * > > * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * > > > > next/master boot bisection: next-20190430 on beagle-xm > > > > Summary: > > Start: f43b05fd4c17 Add linux-next specific files for 20190430 > > Details: https://kernelci.org/boot/id/5cc84d7359b514b7ab55847b > > Plain log: https://storage.kernelci.org//next/master/next-20190430/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-7/lab-baylibre/boot-omap3-beagle-xm.txt > > HTML log: https://storage.kernelci.org//next/master/next-20190430/arm/multi_v7_defconfig+CONFIG_SMP=n/gcc-7/lab-baylibre/boot-omap3-beagle-xm.html > > Result: 6d25be5782e4 sched/core, workqueues: Distangle worker accounting from rq lock > > I was able to reproduce this in next-20190430, but... > > I'm not sure what fixed it, but this is passing again in today's > linux-next (next-20190501) Okay, thanks for the confirmation. > Kevin Sebastian
diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 4778c48a7fda..6184a0856aab 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1685,10 +1685,6 @@ static inline void ttwu_activate(struct rq *rq, struct task_struct *p, int en_fl { activate_task(rq, p, en_flags); p->on_rq = TASK_ON_RQ_QUEUED; - - /* If a worker is waking up, notify the workqueue: */ - if (p->flags & PF_WQ_WORKER) - wq_worker_waking_up(p, cpu_of(rq)); } /* @@ -2106,56 +2102,6 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags) return success; } -/** - * try_to_wake_up_local - try to wake up a local task with rq lock held - * @p: the thread to be awakened - * @rf: request-queue flags for pinning - * - * Put @p on the run-queue if it's not already there. The caller must - * ensure that this_rq() is locked, @p is bound to this_rq() and not - * the current task. - */ -static void try_to_wake_up_local(struct task_struct *p, struct rq_flags *rf) -{ - struct rq *rq = task_rq(p); - - if (WARN_ON_ONCE(rq != this_rq()) || - WARN_ON_ONCE(p == current)) - return; - - lockdep_assert_held(&rq->lock); - - if (!raw_spin_trylock(&p->pi_lock)) { - /* - * This is OK, because current is on_cpu, which avoids it being - * picked for load-balance and preemption/IRQs are still - * disabled avoiding further scheduler activity on it and we've - * not yet picked a replacement task. - */ - rq_unlock(rq, rf); - raw_spin_lock(&p->pi_lock); - rq_relock(rq, rf); - } - - if (!(p->state & TASK_NORMAL)) - goto out; - - trace_sched_waking(p); - - if (!task_on_rq_queued(p)) { - if (p->in_iowait) { - delayacct_blkio_end(p); - atomic_dec(&rq->nr_iowait); - } - ttwu_activate(rq, p, ENQUEUE_WAKEUP | ENQUEUE_NOCLOCK); - } - - ttwu_do_wakeup(rq, p, 0, rf); - ttwu_stat(p, smp_processor_id(), 0); -out: - raw_spin_unlock(&p->pi_lock); -} - /** * wake_up_process - Wake up a specific process * @p: The process to be woken up. @@ -3472,19 +3418,6 @@ static void __sched notrace __schedule(bool preempt) atomic_inc(&rq->nr_iowait); delayacct_blkio_start(); } - - /* - * If a worker went to sleep, notify and ask workqueue - * whether it wants to wake up a task to maintain - * concurrency. - */ - if (prev->flags & PF_WQ_WORKER) { - struct task_struct *to_wakeup; - - to_wakeup = wq_worker_sleeping(prev); - if (to_wakeup) - try_to_wake_up_local(to_wakeup, &rf); - } } switch_count = &prev->nvcsw; } @@ -3544,6 +3477,20 @@ static inline void sched_submit_work(struct task_struct *tsk) { if (!tsk->state || tsk_is_pi_blocked(tsk)) return; + + /* + * If a worker went to sleep, notify and ask workqueue whether + * it wants to wake up a task to maintain concurrency. + * As this function is called inside the schedule() context, + * we disable preemption to avoid it calling schedule() again + * in the possible wakeup of a kworker. + */ + if (tsk->flags & PF_WQ_WORKER) { + preempt_disable(); + wq_worker_sleeping(tsk); + preempt_enable_no_resched(); + } + /* * If we are going to sleep and we have plugged IO queued, * make sure to submit it to avoid deadlocks. @@ -3552,6 +3499,12 @@ static inline void sched_submit_work(struct task_struct *tsk) blk_schedule_flush_plug(tsk); } +static void sched_update_worker(struct task_struct *tsk) +{ + if (tsk->flags & PF_WQ_WORKER) + wq_worker_running(tsk); +} + asmlinkage __visible void __sched schedule(void) { struct task_struct *tsk = current; @@ -3562,6 +3515,7 @@ asmlinkage __visible void __sched schedule(void) __schedule(false); sched_preempt_enable_no_resched(); } while (need_resched()); + sched_update_worker(tsk); } EXPORT_SYMBOL(schedule); diff --git a/kernel/workqueue.c b/kernel/workqueue.c index ddee541ea97a..56180c9286f5 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -841,43 +841,32 @@ static void wake_up_worker(struct worker_pool *pool) } /** - * wq_worker_waking_up - a worker is waking up + * wq_worker_running - a worker is running again * @task: task waking up - * @cpu: CPU @task is waking up to * - * This function is called during try_to_wake_up() when a worker is - * being awoken. - * - * CONTEXT: - * spin_lock_irq(rq->lock) + * This function is called when a worker returns from schedule() */ -void wq_worker_waking_up(struct task_struct *task, int cpu) +void wq_worker_running(struct task_struct *task) { struct worker *worker = kthread_data(task); - if (!(worker->flags & WORKER_NOT_RUNNING)) { - WARN_ON_ONCE(worker->pool->cpu != cpu); + if (!worker->sleeping) + return; + if (!(worker->flags & WORKER_NOT_RUNNING)) atomic_inc(&worker->pool->nr_running); - } + worker->sleeping = 0; } /** * wq_worker_sleeping - a worker is going to sleep * @task: task going to sleep * - * This function is called during schedule() when a busy worker is - * going to sleep. Worker on the same cpu can be woken up by - * returning pointer to its task. - * - * CONTEXT: - * spin_lock_irq(rq->lock) - * - * Return: - * Worker task on @cpu to wake up, %NULL if none. + * This function is called from schedule() when a busy worker is + * going to sleep. */ -struct task_struct *wq_worker_sleeping(struct task_struct *task) +void wq_worker_sleeping(struct task_struct *task) { - struct worker *worker = kthread_data(task), *to_wakeup = NULL; + struct worker *next, *worker = kthread_data(task); struct worker_pool *pool; /* @@ -886,13 +875,15 @@ struct task_struct *wq_worker_sleeping(struct task_struct *task) * checking NOT_RUNNING. */ if (worker->flags & WORKER_NOT_RUNNING) - return NULL; + return; pool = worker->pool; - /* this can only happen on the local cpu */ - if (WARN_ON_ONCE(pool->cpu != raw_smp_processor_id())) - return NULL; + if (WARN_ON_ONCE(worker->sleeping)) + return; + + worker->sleeping = 1; + spin_lock_irq(&pool->lock); /* * The counterpart of the following dec_and_test, implied mb, @@ -906,9 +897,12 @@ struct task_struct *wq_worker_sleeping(struct task_struct *task) * lock is safe. */ if (atomic_dec_and_test(&pool->nr_running) && - !list_empty(&pool->worklist)) - to_wakeup = first_idle_worker(pool); - return to_wakeup ? to_wakeup->task : NULL; + !list_empty(&pool->worklist)) { + next = first_idle_worker(pool); + if (next) + wake_up_process(next->task); + } + spin_unlock_irq(&pool->lock); } /** @@ -4929,7 +4923,7 @@ static void rebind_workers(struct worker_pool *pool) * * WRITE_ONCE() is necessary because @worker->flags may be * tested without holding any lock in - * wq_worker_waking_up(). Without it, NOT_RUNNING test may + * wq_worker_running(). Without it, NOT_RUNNING test may * fail incorrectly leading to premature concurrency * management operations. */ diff --git a/kernel/workqueue_internal.h b/kernel/workqueue_internal.h index cb68b03ca89a..498de0e909a4 100644 --- a/kernel/workqueue_internal.h +++ b/kernel/workqueue_internal.h @@ -44,6 +44,7 @@ struct worker { unsigned long last_active; /* L: last active timestamp */ unsigned int flags; /* X: flags */ int id; /* I: worker id */ + int sleeping; /* None */ /* * Opaque string set with work_set_desc(). Printed out with task @@ -72,8 +73,8 @@ static inline struct worker *current_wq_worker(void) * Scheduler hooks for concurrency managed workqueue. Only to be used from * sched/ and workqueue.c. */ -void wq_worker_waking_up(struct task_struct *task, int cpu); -struct task_struct *wq_worker_sleeping(struct task_struct *task); +void wq_worker_running(struct task_struct *task); +void wq_worker_sleeping(struct task_struct *task); work_func_t wq_worker_last_func(struct task_struct *task); #endif /* _KERNEL_WORKQUEUE_INTERNAL_H */