Message ID | 1346352988-32444-14-git-send-email-paulmck@linux.vnet.ibm.com |
---|---|
State | Accepted |
Commit | 803b0ebae921714d1c36f0996db8125eda5fae53 |
Headers | show |
On Thu, Aug 30, 2012 at 11:56:27AM -0700, Paul E. McKenney wrote: > From: "Paul E. McKenney" <paul.mckenney@linaro.org> > > The can_stop_idle_tick() function complains if a softirq vector is > raised too late in the idle-entry process, presumably in order to > prevent dangling softirq invocations from being delayed across the > full idle period, which might be indefinitely long -- and if softirq > was asserted any later than the call to this function, such a delay > might well happen. > > However, RCU needs to be able to use softirq to stop idle entry in > order to be able to drain RCU callbacks from the current CPU, which in > turn enables faster entry into dyntick-idle mode, which in turn reduces > power consumption. Because RCU takes this action at a well-defined > point in the idle-entry path, it is safe for RCU to take this approach. > > This commit therefore silences the error message that is sometimes > produced when the going-idle CPU suddenly finds that it has an RCU_SOFTIRQ > to process. The error message will continue to be issued for other > softirq vectors. > > Reported-by: Sedat Dilek <sedat.dilek@gmail.com> > Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > Tested-by: Sedat Dilek <sedat.dilek@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> > --- > include/linux/interrupt.h | 2 ++ > kernel/time/tick-sched.c | 3 ++- > 2 files changed, 4 insertions(+), 1 deletions(-) > > diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h > index c5f856a..5e4e617 100644 > --- a/include/linux/interrupt.h > +++ b/include/linux/interrupt.h > @@ -430,6 +430,8 @@ enum > NR_SOFTIRQS > }; > > +#define SOFTIRQ_STOP_IDLE_MASK (~(1 << RCU_SOFTIRQ)) > + > /* map softirq index to softirq name. update 'softirq_to_name' in > * kernel/softirq.c when adding a new softirq. > */ > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c > index 024540f..4b1785a 100644 > --- a/kernel/time/tick-sched.c > +++ b/kernel/time/tick-sched.c > @@ -436,7 +436,8 @@ static bool can_stop_idle_tick(int cpu, struct tick_sched *ts) > if (unlikely(local_softirq_pending() && cpu_online(cpu))) { > static int ratelimit; > > - if (ratelimit < 10) { > + if (ratelimit < 10 && > + (local_softirq_pending() & SOFTIRQ_STOP_IDLE_MASK)) { > printk(KERN_ERR "NOHZ: local_softirq_pending %02x\n", > (unsigned int) local_softirq_pending()); > ratelimit++; > -- > 1.7.8 >
On Thu, 2012-08-30 at 11:56 -0700, Paul E. McKenney wrote: > From: "Paul E. McKenney" <paul.mckenney@linaro.org> > > The can_stop_idle_tick() function complains if a softirq vector is > raised too late in the idle-entry process, presumably in order to > prevent dangling softirq invocations from being delayed across the > full idle period, which might be indefinitely long -- and if softirq > was asserted any later than the call to this function, such a delay > might well happen. > > However, RCU needs to be able to use softirq to stop idle entry in > order to be able to drain RCU callbacks from the current CPU, which in > turn enables faster entry into dyntick-idle mode, which in turn reduces > power consumption. Because RCU takes this action at a well-defined > point in the idle-entry path, it is safe for RCU to take this approach. > > This commit therefore silences the error message that is sometimes > produced when the going-idle CPU suddenly finds that it has an RCU_SOFTIRQ > to process. The error message will continue to be issued for other > softirq vectors. > > Reported-by: Sedat Dilek <sedat.dilek@gmail.com> > Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > Tested-by: Sedat Dilek <sedat.dilek@gmail.com> > --- > include/linux/interrupt.h | 2 ++ > kernel/time/tick-sched.c | 3 ++- > 2 files changed, 4 insertions(+), 1 deletions(-) > > diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h > index c5f856a..5e4e617 100644 > --- a/include/linux/interrupt.h > +++ b/include/linux/interrupt.h > @@ -430,6 +430,8 @@ enum > NR_SOFTIRQS > }; > > +#define SOFTIRQ_STOP_IDLE_MASK (~(1 << RCU_SOFTIRQ)) > + > /* map softirq index to softirq name. update 'softirq_to_name' in > * kernel/softirq.c when adding a new softirq. > */ > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c > index 024540f..4b1785a 100644 > --- a/kernel/time/tick-sched.c > +++ b/kernel/time/tick-sched.c > @@ -436,7 +436,8 @@ static bool can_stop_idle_tick(int cpu, struct tick_sched *ts) > if (unlikely(local_softirq_pending() && cpu_online(cpu))) { > static int ratelimit; > > - if (ratelimit < 10) { > + if (ratelimit < 10 && > + (local_softirq_pending() & SOFTIRQ_STOP_IDLE_MASK)) { > printk(KERN_ERR "NOHZ: local_softirq_pending %02x\n", > (unsigned int) local_softirq_pending()); > ratelimit++; Urgh.. yuck. So either add a very verbose comment here on why its OK for RCU (the changelog is rather vague about it), or try and come up with something better. Where does RCU flush the pending softirq? Does it flush all softirqs or only the RCU one? Can we move the check after RCU does this so we can avoid the special case?
On Thu, Sep 06, 2012 at 05:12:08PM +0200, Peter Zijlstra wrote: > On Thu, 2012-08-30 at 11:56 -0700, Paul E. McKenney wrote: > > From: "Paul E. McKenney" <paul.mckenney@linaro.org> > > > > The can_stop_idle_tick() function complains if a softirq vector is > > raised too late in the idle-entry process, presumably in order to > > prevent dangling softirq invocations from being delayed across the > > full idle period, which might be indefinitely long -- and if softirq > > was asserted any later than the call to this function, such a delay > > might well happen. > > > > However, RCU needs to be able to use softirq to stop idle entry in > > order to be able to drain RCU callbacks from the current CPU, which in > > turn enables faster entry into dyntick-idle mode, which in turn reduces > > power consumption. Because RCU takes this action at a well-defined > > point in the idle-entry path, it is safe for RCU to take this approach. > > > > This commit therefore silences the error message that is sometimes > > produced when the going-idle CPU suddenly finds that it has an RCU_SOFTIRQ > > to process. The error message will continue to be issued for other > > softirq vectors. > > > > Reported-by: Sedat Dilek <sedat.dilek@gmail.com> > > Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> > > Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> > > Tested-by: Sedat Dilek <sedat.dilek@gmail.com> > > --- > > include/linux/interrupt.h | 2 ++ > > kernel/time/tick-sched.c | 3 ++- > > 2 files changed, 4 insertions(+), 1 deletions(-) > > > > diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h > > index c5f856a..5e4e617 100644 > > --- a/include/linux/interrupt.h > > +++ b/include/linux/interrupt.h > > @@ -430,6 +430,8 @@ enum > > NR_SOFTIRQS > > }; > > > > +#define SOFTIRQ_STOP_IDLE_MASK (~(1 << RCU_SOFTIRQ)) > > + > > /* map softirq index to softirq name. update 'softirq_to_name' in > > * kernel/softirq.c when adding a new softirq. > > */ > > diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c > > index 024540f..4b1785a 100644 > > --- a/kernel/time/tick-sched.c > > +++ b/kernel/time/tick-sched.c > > @@ -436,7 +436,8 @@ static bool can_stop_idle_tick(int cpu, struct tick_sched *ts) > > if (unlikely(local_softirq_pending() && cpu_online(cpu))) { > > static int ratelimit; > > > > - if (ratelimit < 10) { > > + if (ratelimit < 10 && > > + (local_softirq_pending() & SOFTIRQ_STOP_IDLE_MASK)) { > > printk(KERN_ERR "NOHZ: local_softirq_pending %02x\n", > > (unsigned int) local_softirq_pending()); > > ratelimit++; > > Urgh.. yuck. So either add a very verbose comment here on why its OK for > RCU (the changelog is rather vague about it), or try and come up with > something better. OK, what questions does the changelog fail to answer? > Where does RCU flush the pending softirq? Does it flush all softirqs or > only the RCU one? Can we move the check after RCU does this so we can > avoid the special case? You lost me on this one. RCU doesn't flush pending softirqs except by them eventually being invoked. And it doesn't mess with any but its own softirq vector. But yes, RCU was a lot simpler before people started worrying about its energy efficiency. ;-) Thanx, Paul
On Thu, 2012-09-06 at 14:35 -0700, Paul E. McKenney wrote: > But yes, RCU was a lot simpler before people started worrying about > its energy efficiency. ;-) Tell them to buy bigger batteries. -- Steve
On Thu, Sep 06, 2012 at 05:57:45PM -0400, Steven Rostedt wrote: > On Thu, 2012-09-06 at 14:35 -0700, Paul E. McKenney wrote: > > > But yes, RCU was a lot simpler before people started worrying about > > its energy efficiency. ;-) > > Tell them to buy bigger batteries. If nuclear batteries are good enough for Curiosity, they are good enough for you!!! ;-) Thanx, Paul
diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h index c5f856a..5e4e617 100644 --- a/include/linux/interrupt.h +++ b/include/linux/interrupt.h @@ -430,6 +430,8 @@ enum NR_SOFTIRQS }; +#define SOFTIRQ_STOP_IDLE_MASK (~(1 << RCU_SOFTIRQ)) + /* map softirq index to softirq name. update 'softirq_to_name' in * kernel/softirq.c when adding a new softirq. */ diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c index 024540f..4b1785a 100644 --- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -436,7 +436,8 @@ static bool can_stop_idle_tick(int cpu, struct tick_sched *ts) if (unlikely(local_softirq_pending() && cpu_online(cpu))) { static int ratelimit; - if (ratelimit < 10) { + if (ratelimit < 10 && + (local_softirq_pending() & SOFTIRQ_STOP_IDLE_MASK)) { printk(KERN_ERR "NOHZ: local_softirq_pending %02x\n", (unsigned int) local_softirq_pending()); ratelimit++;