Message ID: 3232b3192e2e373cc4aaf43359d454c5ad53cddb.1348568074.git.viresh.kumar@linaro.org
State: Accepted
On Tue, 2012-09-25 at 16:06 +0530, Viresh Kumar wrote:
> @@ -1066,8 +1076,9 @@ int queue_work(struct workqueue_struct *wq, struct work_struct *work)
>  {
>  	int ret;
>
> -	ret = queue_work_on(get_cpu(), wq, work);
> -	put_cpu();
> +	preempt_disable();
> +	ret = queue_work_on(wq_select_cpu(), wq, work);
> +	preempt_enable();
>
> 	return ret;
>  }

Right, so the problem I see here is that wq_select_cpu() is horridly
expensive..

> @@ -1102,7 +1113,7 @@ static void delayed_work_timer_fn(unsigned long __data)
>  	struct delayed_work *dwork = (struct delayed_work *)__data;
>  	struct cpu_workqueue_struct *cwq = get_work_cwq(&dwork->work);
>
> -	__queue_work(smp_processor_id(), cwq->wq, &dwork->work);
> +	__queue_work(wq_select_cpu(), cwq->wq, &dwork->work);
>  }

Shouldn't timer migration have sorted this one?
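For context: the wq_select_cpu() Peter calls expensive is only a thin
wrapper around sched_select_cpu(SD_NUMA, -1), which is added by a
separate scheduler patch in this series and is not shown in this
thread. The sketch below is an approximation of what such a helper has
to do on every queue_work() call; the helper name, the loop structure,
and the flag-based stopping level are assumptions, not the actual patch.

/*
 * Rough sketch only, not the real sched_select_cpu().  Every call pays
 * for an RCU-protected walk of the sched domain hierarchy plus
 * idle_cpu() checks over each domain's span, which is far more than
 * the single per-cpu read that get_cpu()/smp_processor_id() used to
 * cost.  for_each_domain() is kernel-internal (kernel/sched/sched.h).
 */
static int sched_select_cpu_sketch(unsigned int sd_flags)
{
	int cpu = smp_processor_id();
	int target = cpu;
	struct sched_domain *sd;
	int i;

	rcu_read_lock();
	for_each_domain(cpu, sd) {
		for_each_cpu(i, sched_domain_span(sd)) {
			if (!idle_cpu(i)) {
				target = i;
				goto unlock;
			}
		}
		/* Assumed semantics: stop widening at the requested level. */
		if (sd->flags & sd_flags)
			break;
	}
unlock:
	rcu_read_unlock();
	return target;	/* no busy CPU found: fall back to the local one */
}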
On 25 September 2012 16:52, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, 2012-09-25 at 16:06 +0530, Viresh Kumar wrote:
>> @@ -1066,8 +1076,9 @@ int queue_work(struct workqueue_struct *wq, struct work_struct *work)
>>  {
>>  	int ret;
>>
>> -	ret = queue_work_on(get_cpu(), wq, work);
>> -	put_cpu();
>> +	preempt_disable();
>> +	ret = queue_work_on(wq_select_cpu(), wq, work);
>> +	preempt_enable();
>>
>> 	return ret;
>>  }
>
> Right, so the problem I see here is that wq_select_cpu() is horridly
> expensive..

But this is the initial idea we had during LPC. Can you suggest any
improvements here?

>> @@ -1102,7 +1113,7 @@ static void delayed_work_timer_fn(unsigned long __data)
>>  	struct delayed_work *dwork = (struct delayed_work *)__data;
>>  	struct cpu_workqueue_struct *cwq = get_work_cwq(&dwork->work);
>>
>> -	__queue_work(smp_processor_id(), cwq->wq, &dwork->work);
>> +	__queue_work(wq_select_cpu(), cwq->wq, &dwork->work);
>>  }
>
> Shouldn't timer migration have sorted this one?

Maybe yes. Will investigate more on it.

Thanks for your early feedback.

--
viresh
On 25 September 2012 13:30, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 25 September 2012 16:52, Peter Zijlstra <peterz@infradead.org> wrote:
>> On Tue, 2012-09-25 at 16:06 +0530, Viresh Kumar wrote:
>>> @@ -1066,8 +1076,9 @@ int queue_work(struct workqueue_struct *wq, struct work_struct *work)
>>>  {
>>>  	int ret;
>>>
>>> -	ret = queue_work_on(get_cpu(), wq, work);
>>> -	put_cpu();
>>> +	preempt_disable();
>>> +	ret = queue_work_on(wq_select_cpu(), wq, work);
>>> +	preempt_enable();
>>>
>>> 	return ret;
>>>  }
>>
>> Right, so the problem I see here is that wq_select_cpu() is horridly
>> expensive..
>
> But this is the initial idea we had during LPC. Can you suggest any
> improvements here?

The main outcome of the LPC discussion was that we should be able to
select a CPU other than the local one. Using the same policy as the
timer is a first step toward consolidating the interface; a next step
would be to update the policy of the function.

Vincent

>
>>> @@ -1102,7 +1113,7 @@ static void delayed_work_timer_fn(unsigned long __data)
>>>  	struct delayed_work *dwork = (struct delayed_work *)__data;
>>>  	struct cpu_workqueue_struct *cwq = get_work_cwq(&dwork->work);
>>>
>>> -	__queue_work(smp_processor_id(), cwq->wq, &dwork->work);
>>> +	__queue_work(wq_select_cpu(), cwq->wq, &dwork->work);
>>>  }
>>
>> Shouldn't timer migration have sorted this one?
>
> Maybe yes. Will investigate more on it.
>
> Thanks for your early feedback.
>
> --
> viresh
On Tue, 2012-09-25 at 17:00 +0530, Viresh Kumar wrote:
> But this is the initial idea we had during LPC.

Yeah.. that's true.

> Can you suggest any improvements here?

We could uhm... /me tries thinking ... reuse some of the NOHZ magic?
Would that be sufficient, not waking a NOHZ cpu, or do you really want
to avoid waking any idle cpu?
On Tue, 2012-09-25 at 13:40 +0200, Peter Zijlstra wrote:
> On Tue, 2012-09-25 at 17:00 +0530, Viresh Kumar wrote:
> > But this is the initial idea we had during LPC.
>
> Yeah.. that's true.
>
> > Can you suggest any improvements here?
>
> We could uhm... /me tries thinking ... reuse some of the NOHZ magic?
> Would that be sufficient, not waking a NOHZ cpu, or do you really want
> to avoid waking any idle cpu?

Depending on the trade-off, we could have the NOHZ code track a
non-NOHZ-idle cpu and avoid having to compute one every time we need it.
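A minimal sketch of the caching Peter suggests, assuming hooks in the
NOHZ idle-enter/exit path; every name below is hypothetical and no such
hooks exist in the patch as posted. The point is that the queueing side
collapses to a single atomic read instead of a domain walk.

/*
 * Hypothetical caching of a known busy CPU, maintained from the NOHZ
 * idle-enter/exit path so queue_work() never has to compute one.
 */
static atomic_t nohz_busy_cpu = ATOMIC_INIT(0);

/* NOHZ idle-exit hook: this CPU is busy again, make it the candidate. */
static void nohz_note_busy_cpu(int cpu)
{
	atomic_set(&nohz_busy_cpu, cpu);
}

/*
 * NOHZ idle-enter hook: stop advertising ourselves.  Best effort only;
 * the replacement picked here may itself be idle.
 */
static void nohz_forget_busy_cpu(int cpu)
{
	if (atomic_read(&nohz_busy_cpu) == cpu)
		atomic_set(&nohz_busy_cpu, cpumask_any(cpu_online_mask));
}

/*
 * What wq_select_cpu() could then collapse to.  Callers are assumed to
 * hold preemption disabled, as queue_work() does in the patch.
 */
static int wq_select_cpu_cached(void)
{
	int cpu = smp_processor_id();

	return idle_cpu(cpu) ? atomic_read(&nohz_busy_cpu) : cpu;
}

This answers only the NOHZ half of Peter's question; avoiding every
idle CPU, not just NOHZ-idle ones, would still require consulting the
scheduler.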
Hello,

On Tue, Sep 25, 2012 at 04:06:08PM +0530, Viresh Kumar wrote:
> +config MIGRATE_WQ
> +	bool "(EXPERIMENTAL) Migrate Workqueues to non-idle cpu"
> +	depends on SMP && EXPERIMENTAL
> +	help
> +	  Workqueues queues work on current cpu, if the caller haven't passed a
> +	  preferred cpu. This may wake up an idle CPU, which is actually not
> +	  required. This work can be processed by any CPU and so we must select
> +	  a non-idle CPU here. This patch adds in support in workqueue
> +	  framework to get preferred CPU details from the scheduler, instead of
> +	  using current CPU.

I don't think it's a good idea to make behavior like this a config
option. The behavior difference is subtle and may induce incorrect
behavior.

> +/* This enables migration of a work to a non-IDLE cpu instead of current cpu */
> +#ifdef CONFIG_MIGRATE_WQ
> +static int wq_select_cpu(void)
> +{
> +	return sched_select_cpu(SD_NUMA, -1);
> +}
> +#else
> +#define wq_select_cpu() smp_processor_id()
> +#endif
> +
>  /* Serializes the accesses to the list of workqueues. */
>  static DEFINE_SPINLOCK(workqueue_lock);
>  static LIST_HEAD(workqueues);
> @@ -995,7 +1005,7 @@ static void __queue_work(unsigned int cpu, struct workqueue_struct *wq,
>  	struct global_cwq *last_gcwq;
>
>  	if (unlikely(cpu == WORK_CPU_UNBOUND))
> -		cpu = raw_smp_processor_id();
> +		cpu = wq_select_cpu();
>
>  	/*
>  	 * It's multi cpu. If @wq is non-reentrant and @work
> @@ -1066,8 +1076,9 @@ int queue_work(struct workqueue_struct *wq, struct work_struct *work)
>  {
>  	int ret;
>
> -	ret = queue_work_on(get_cpu(), wq, work);
> -	put_cpu();
> +	preempt_disable();
> +	ret = queue_work_on(wq_select_cpu(), wq, work);
> +	preempt_enable();

First of all, I'm not entirely sure this is safe. queue_work() used to
*guarantee* that the work item would execute on the local CPU. I don't
think there are many callers which depend on that, but I'd be surprised
if this doesn't lead to some subtle problems somewhere. It might not be
realistic to audit all users and we might have to just let it happen
and watch for the fallout. Dunno, still wanna see some level of
auditing.

Also, I'm wondering why this is necessary at all for workqueues. For
schedule/queue_work(), you pretty much know the current cpu is not
idle. For delayed work, sure, but for immediate scheduling, why?

Thanks.
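The class of breakage Tejun is pointing at looks roughly like the
hypothetical caller below; it is not taken from an actual audit. The
caller queues with plain queue_work() and touches per-cpu state from
the work function, relying on the old local-CPU guarantee.

/*
 * Hypothetical caller relying on queue_work()'s old guarantee that the
 * work runs on the queueing CPU.  If wq_select_cpu() moves the work
 * elsewhere, the work function silently bumps another CPU's counter.
 */
struct wq_stats {
	u64 events;
};

static DEFINE_PER_CPU(struct wq_stats, wq_stats);

static void wq_stats_work_fn(struct work_struct *work)
{
	/* Updates the wrong CPU's stats if the work was migrated. */
	this_cpu_inc(wq_stats.events);
}

static DECLARE_WORK(wq_stats_work, wq_stats_work_fn);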
On 25 September 2012 23:26, Tejun Heo <tj@kernel.org> wrote:
> On Tue, Sep 25, 2012 at 04:06:08PM +0530, Viresh Kumar wrote:
>> +config MIGRATE_WQ
>> +	bool "(EXPERIMENTAL) Migrate Workqueues to non-idle cpu"
>> +	depends on SMP && EXPERIMENTAL
>> +	help
>> +	  Workqueues queues work on current cpu, if the caller haven't passed a
>> +	  preferred cpu. This may wake up an idle CPU, which is actually not
>> +	  required. This work can be processed by any CPU and so we must select
>> +	  a non-idle CPU here. This patch adds in support in workqueue
>> +	  framework to get preferred CPU details from the scheduler, instead of
>> +	  using current CPU.
>
> I don't think it's a good idea to make behavior like this a config
> option. The behavior difference is subtle and may induce incorrect
> behavior.

Ok. Will remove it.

>> @@ -1066,8 +1076,9 @@ int queue_work(struct workqueue_struct *wq, struct work_struct *work)
>>  {
>>  	int ret;
>>
>> -	ret = queue_work_on(get_cpu(), wq, work);
>> -	put_cpu();
>> +	preempt_disable();
>> +	ret = queue_work_on(wq_select_cpu(), wq, work);
>> +	preempt_enable();
>
> First of all, I'm not entirely sure this is safe. queue_work() used to
> *guarantee* that the work item would execute on the local CPU. I don't
> think there are many callers which depend on that, but I'd be surprised
> if this doesn't lead to some subtle problems somewhere. It might not be
> realistic to audit all users and we might have to just let it happen
> and watch for the fallout. Dunno, still wanna see some level of
> auditing.

Ok.

> Also, I'm wondering why this is necessary at all for workqueues. For
> schedule/queue_work(), you pretty much know the current cpu is not
> idle. For delayed work, sure, but for immediate scheduling, why?

This was done for the following scenario:
- A cpu has programmed a timer and is IDLE now.
- The cpu enters its interrupt handler when the timer fires and queues
  a work. As the cpu is currently IDLE, we should queue this work to
  some other cpu.

I know this patch migrates works in all cases. Will fix it by queuing
work to another cpu only for this case in V2.

--
viresh
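The V2 behavior Viresh describes could look roughly like the sketch
below: keep the cheap local path whenever the queueing CPU is busy, and
only consult the scheduler when queueing from an otherwise idle CPU
(the timer-interrupt-on-idle-CPU case). This is a sketch under that
assumption, not the actual V2 patch.

/*
 * Sketch of the V2 gating, not the posted patch.  Callers hold
 * preemption disabled, as queue_work() does after this series.
 */
static int wq_select_cpu(void)
{
	int cpu = smp_processor_id();

	/* Common case: we are doing real work here anyway, stay local. */
	if (!idle_cpu(cpu))
		return cpu;

	/*
	 * Idle CPU, e.g. woken only for a timer interrupt: ask the
	 * scheduler for a busy CPU instead of waking this one fully.
	 */
	return sched_select_cpu(SD_NUMA, -1);
}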
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 5944511..da17bd0 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1594,6 +1594,17 @@ config HMP_SLOW_CPU_MASK
 	  Specify the cpuids of the slow CPUs in the system as a list string,
 	  e.g. cpuid 0+1 should be specified as 0-1.
 
+config MIGRATE_WQ
+	bool "(EXPERIMENTAL) Migrate Workqueues to non-idle cpu"
+	depends on SMP && EXPERIMENTAL
+	help
+	  Workqueues queues work on current cpu, if the caller haven't passed a
+	  preferred cpu. This may wake up an idle CPU, which is actually not
+	  required. This work can be processed by any CPU and so we must select
+	  a non-idle CPU here. This patch adds in support in workqueue
+	  framework to get preferred CPU details from the scheduler, instead of
+	  using current CPU.
+
 config HAVE_ARM_SCU
 	bool
 	help
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 692a55b..fd8df4a 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -456,6 +456,16 @@ static inline void debug_work_activate(struct work_struct *work) { }
 static inline void debug_work_deactivate(struct work_struct *work) { }
 #endif
 
+/* This enables migration of a work to a non-IDLE cpu instead of current cpu */
+#ifdef CONFIG_MIGRATE_WQ
+static int wq_select_cpu(void)
+{
+	return sched_select_cpu(SD_NUMA, -1);
+}
+#else
+#define wq_select_cpu() smp_processor_id()
+#endif
+
 /* Serializes the accesses to the list of workqueues. */
 static DEFINE_SPINLOCK(workqueue_lock);
 static LIST_HEAD(workqueues);
@@ -995,7 +1005,7 @@ static void __queue_work(unsigned int cpu, struct workqueue_struct *wq,
 	struct global_cwq *last_gcwq;
 
 	if (unlikely(cpu == WORK_CPU_UNBOUND))
-		cpu = raw_smp_processor_id();
+		cpu = wq_select_cpu();
 
 	/*
 	 * It's multi cpu. If @wq is non-reentrant and @work
@@ -1066,8 +1076,9 @@ int queue_work(struct workqueue_struct *wq, struct work_struct *work)
 {
 	int ret;
 
-	ret = queue_work_on(get_cpu(), wq, work);
-	put_cpu();
+	preempt_disable();
+	ret = queue_work_on(wq_select_cpu(), wq, work);
+	preempt_enable();
 
 	return ret;
 }
@@ -1102,7 +1113,7 @@ static void delayed_work_timer_fn(unsigned long __data)
 	struct delayed_work *dwork = (struct delayed_work *)__data;
 	struct cpu_workqueue_struct *cwq = get_work_cwq(&dwork->work);
 
-	__queue_work(smp_processor_id(), cwq->wq, &dwork->work);
+	__queue_work(wq_select_cpu(), cwq->wq, &dwork->work);
 }
 
 /**
@@ -1158,7 +1169,7 @@ int queue_delayed_work_on(int cpu, struct workqueue_struct *wq,
 		if (gcwq && gcwq->cpu != WORK_CPU_UNBOUND)
 			lcpu = gcwq->cpu;
 		else
-			lcpu = raw_smp_processor_id();
+			lcpu = wq_select_cpu();
 	} else
 		lcpu = WORK_CPU_UNBOUND;
 
@@ -2823,8 +2834,8 @@ EXPORT_SYMBOL_GPL(cancel_work_sync);
 static inline void __flush_delayed_work(struct delayed_work *dwork)
 {
 	if (del_timer_sync(&dwork->timer))
-		__queue_work(raw_smp_processor_id(),
-			get_work_cwq(&dwork->work)->wq, &dwork->work);
+		__queue_work(wq_select_cpu(), get_work_cwq(&dwork->work)->wq,
+			&dwork->work);
 }
 
 /**
Workqueues queue work on the current cpu if the caller hasn't passed a
preferred cpu. This may wake up an idle CPU, which is actually not
required: the work can be processed by any CPU, so we should select a
non-idle CPU. This patch adds support in the workqueue framework for
getting the preferred CPU from the scheduler, instead of using the
current CPU.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 arch/arm/Kconfig   | 11 +++++++++++
 kernel/workqueue.c | 25 ++++++++++++++++++-------
 2 files changed, 29 insertions(+), 7 deletions(-)