[RFC,0/9] local_clock() vs noinstr

Message ID 20230508211951.901961964@infradead.org

Message

Peter Zijlstra May 8, 2023, 9:19 p.m. UTC
Hi all!

A recent commit of mine marked local_clock() as noinstr.

  776f22913b8e ("sched/clock: Make local_clock() noinstr")

Sadly both me and objtool missed the fact that this is subtly broken; but
Sebastian tripped over it [*]:

| vmlinux.o: warning: objtool: native_sched_clock+0x97: call to preempt_schedule_notrace_thunk() leaves .noinstr.text section
| vmlinux.o: warning: objtool: kvm_clock_read+0x22: call to preempt_schedule_notrace_thunk() leaves .noinstr.text section
| vmlinux.o: warning: objtool: local_clock+0xb4: call to preempt_schedule_notrace_thunk() leaves .noinstr.text section

Specifically, local_clock() (and many of the sched_clock() implementations
it relies upon) use preempt_{dis,en}able_notrace(), which obviously calls
out to schedule().

Now, noinstr code *should* never trigger this, since it should already run
in a non-preemptible context. Specifically, entry code should have IRQs
disabled while __cpuidle code should have preemption disabled.
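
To illustrate why the nested pair is mostly harmless, here is a minimal
userspace sketch. It is not kernel code: every model_* name is
hypothetical, and the counter only approximates how the preempt count
gates the call to schedule().

```c
/* Userspace model only; all model_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>

static int preempt_count;   /* models the per-CPU preempt count */
static bool need_resched;   /* models a pending reschedule request */
static int schedule_calls;  /* counts calls that would leave noinstr */

static void model_schedule(void)
{
	schedule_calls++;
	need_resched = false;
}

static void model_preempt_disable(void)
{
	preempt_count++;
}

static void model_preempt_enable(void)
{
	assert(preempt_count > 0);
	/* Only the outermost enable may reschedule. */
	if (--preempt_count == 0 && need_resched)
		model_schedule();
}

/* Models a clock read wrapped in a disable/enable pair. */
static unsigned long long model_local_clock(void)
{
	unsigned long long now;

	model_preempt_disable();
	now = 42; /* stand-in for the real clock read */
	model_preempt_enable();
	return now;
}

/*
 * Returns how often schedule() was reached from inside the clock read,
 * with a reschedule pending, for a preemptible or non-preemptible caller.
 */
static int model_run(bool caller_preemptible)
{
	int calls;

	preempt_count = 0;
	schedule_calls = 0;
	need_resched = true;

	if (!caller_preemptible)
		model_preempt_disable();

	model_local_clock();
	calls = schedule_calls; /* sample before the caller's own enable */

	if (!caller_preemptible)
		model_preempt_enable();

	return calls;
}
```

In this model, model_run(true) reaches schedule() once from inside the
clock read, while model_run(false) never does: the inner enable only
drops the count back to the caller's level.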

So while it is mostly harmless, I figured it wouldn't be too hard to clean this
up a little -- but that was ~10 patches. Anyway, here goes...

Compile tested only on x86_64/s390/arm64 -- I've just fed it to the
robots.

---
 arch/arm64/include/asm/arch_timer.h    |  8 +----
 arch/arm64/include/asm/io.h            | 12 +++----
 arch/loongarch/include/asm/loongarch.h |  2 +-
 arch/loongarch/kernel/time.c           |  6 ++--
 arch/s390/include/asm/timex.h          | 13 +++++---
 arch/s390/kernel/time.c                | 11 ++++++-
 arch/x86/kernel/kvmclock.c             |  4 +--
 arch/x86/kernel/tsc.c                  | 38 ++++++++++++++++-----
 arch/x86/xen/time.c                    |  3 +-
 drivers/clocksource/arm_arch_timer.c   | 60 ++++++++++++++++++++++++++--------
 drivers/clocksource/hyperv_timer.c     |  4 +--
 drivers/cpuidle/cpuidle.c              |  8 ++---
 drivers/cpuidle/poll_state.c           |  4 +--
 include/clocksource/hyperv_timer.h     |  4 +--
 include/linux/rbtree_latch.h           |  2 +-
 include/linux/sched/clock.h            | 17 +++++++++-
 include/linux/seqlock.h                | 15 +++++----
 kernel/printk/printk.c                 |  2 +-
 kernel/sched/clock.c                   | 19 +++++++----
 kernel/time/sched_clock.c              | 24 ++++++++++----
 kernel/time/timekeeping.c              |  4 +--
 21 files changed, 176 insertions(+), 84 deletions(-)


* https://lkml.kernel.org/r/20230309072724.3F6zRkvw@linutronix.de
  TL;DR: PREEMPT_DYNAMIC=n PREEMPT=y DEBUG_ENTRY=y

Comments

Peter Zijlstra May 9, 2023, 6:42 a.m. UTC | #1
On Tue, May 09, 2023 at 08:13:59AM +0200, Heiko Carstens wrote:
> 
> On Mon, May 08, 2023 at 11:19:57PM +0200, Peter Zijlstra wrote:
> > With the intent to provide local_clock_noinstr(), a variant of
> > local_clock() that's safe to be called from noinstr code (with the
> > assumption that any such code will already be non-preemptible),
> > prepare for things by providing a noinstr sched_clock_noinstr()
> > function.
> > 
> > Specifically, preempt_enable_*() calls out to schedule(), which upsets
> > noinstr validation efforts.
> > 
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> > ---
> >  arch/s390/include/asm/timex.h |   13 +++++++++----
> >  arch/s390/kernel/time.c       |   11 ++++++++++-
> >  2 files changed, 19 insertions(+), 5 deletions(-)
> ...
> > +static __always_inline unsigned long __get_tod_clock_monotonic(void)
> > +{
> > +	return get_tod_clock() - tod_clock_base.tod;
> > +}
> > +
> >  /**
> >   * get_clock_monotonic - returns current time in clock rate units
> >   *
> > @@ -216,7 +221,7 @@ static inline unsigned long get_tod_cloc
> >  	unsigned long tod;
> >  
> >  	preempt_disable_notrace();
> > -	tod = get_tod_clock() - tod_clock_base.tod;
> > +	tod = __get_tod_clock_monotonic();
> >  	preempt_enable_notrace();
> >  	return tod;
> >  }
> ...
> > +unsigned long long noinstr sched_clock_noinstr(void)
> > +{
> > +	return tod_to_ns(__get_tod_clock_monotonic());
> > +}
> > +
> >  /*
> >   * Scheduler clock - returns current time in nanosec units.
> >   */
> >  unsigned long long notrace sched_clock(void)
> >  {
> > -	return tod_to_ns(get_tod_clock_monotonic());
> > +	unsigned long long ns;
> > +	preempt_disable_notrace();
> > +	ns = tod_to_ns(get_tod_clock_monotonic());
> > +	preempt_enable_notrace();
> > +	return ns;
> >  }
> >  NOKPROBE_SYMBOL(sched_clock);
> 
> This disables preemption twice within sched_clock(). So this should either
> call __get_tod_clock_monotonic() instead, or the function could stay as it
> is, which I would prefer.

Duh. Will fix.
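
For illustration only, the double disable can be modeled in userspace as
below. This is a sketch, not the eventual fix: all model_* names are
hypothetical stand-ins for the helpers in the quoted diff, and the depth
counter merely records how deeply the disable/enable pairs nest.

```c
/* Userspace model only; all model_* names are hypothetical. */
#include <assert.h>

static int preempt_depth;  /* current nesting of disable/enable pairs */
static int max_depth;      /* deepest nesting seen during one call */

static void model_preempt_disable(void)
{
	if (++preempt_depth > max_depth)
		max_depth = preempt_depth;
}

static void model_preempt_enable(void)
{
	assert(preempt_depth > 0);
	preempt_depth--;
}

/* Stand-in for __get_tod_clock_monotonic(): no preemption handling. */
static unsigned long model_raw_clock(void)
{
	return 1000;
}

/* Stand-in for get_tod_clock_monotonic(): wraps the raw read once. */
static unsigned long model_clock(void)
{
	unsigned long tod;

	model_preempt_disable();
	tod = model_raw_clock();
	model_preempt_enable();
	return tod;
}

/* Shape of sched_clock() as posted: wraps an already wrapped helper. */
static unsigned long model_sched_clock_posted(void)
{
	unsigned long ns;

	model_preempt_disable();
	ns = model_clock();
	model_preempt_enable();
	return ns;
}

/* Shape of sched_clock() left as it was: a single wrap via the helper. */
static unsigned long model_sched_clock_single(void)
{
	return model_clock();
}

/* Reports the deepest disable nesting reached while calling fn(). */
static int model_max_depth(unsigned long (*fn)(void))
{
	preempt_depth = 0;
	max_depth = 0;
	fn();
	return max_depth;
}
```

In this model the posted shape nests two levels deep, while leaving the
function as it was nests only one.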

Peter Zijlstra May 9, 2023, 7:02 p.m. UTC | #2
On Tue, May 09, 2023 at 06:18:08PM +0200, Rafael J. Wysocki wrote:
> On Mon, May 8, 2023 at 11:34 PM Peter Zijlstra <peterz@infradead.org> wrote:
> > --- a/drivers/cpuidle/poll_state.c
> > +++ b/drivers/cpuidle/poll_state.c
> > @@ -15,7 +15,7 @@ static int __cpuidle poll_idle(struct cp
> >  {
> >         u64 time_start;
> >
> > -       time_start = local_clock();
> > +       time_start = local_clock_noinstr();
> >
> >         dev->poll_time_limit = false;
> >
> > @@ -32,7 +32,7 @@ static int __cpuidle poll_idle(struct cp
> >                                 continue;
> >
> >                         loop_count = 0;
> > -                       if (local_clock() - time_start > limit) {
> > +                       if (local_clock_noinstr() - time_start > limit) {
> >                                 dev->poll_time_limit = true;
> >                                 break;
> >                         }
> >
> 
> The above LGTM, but the teo governor uses local_clock() too.  Should
> it use the _noinstr() version?

Only the callsites in noinstr or __cpuidle functions need it; IIRC the
governors are neither and should be OK.

Sebastian Andrzej Siewior May 10, 2023, 1:43 p.m. UTC | #3
On 2023-05-08 23:19:51 [+0200], Peter Zijlstra wrote:
> Hi all!
Hi Peter,

> Compile tested only on x86_64/s390/arm64 -- I've just fed it to the
Thanks, this works.

Sebastian