Message ID | 1526925446-6067-1-git-send-email-fabrizio.castro@bp.renesas.com |
---|---|
State | Superseded |
Series | time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting |
On Mon, May 21, 2018 at 06:57:26PM +0100, Fabrizio Castro wrote:
> From: John Stultz <john.stultz@linaro.org>
>
> commit 3d88d56c5873f6eebe23e05c3da701960146b801 upstream.
>
> Due to how the MONOTONIC_RAW accumulation logic was handled,
> there is the potential for a 1ns discontinuity when we do
> accumulations. This small discontinuity has for the most part
> gone un-noticed, but since ARM64 enabled CLOCK_MONOTONIC_RAW
> in their vDSO clock_gettime implementation, we've seen failures
> with the inconsistency-check test in kselftest.
>
> This patch addresses the issue by using the same sub-ns
> accumulation handling that CLOCK_MONOTONIC uses, which avoids
> the issue for in-kernel users.
>
> Since the ARM64 vDSO implementation has its own clock_gettime
> calculation logic, this patch reduces the frequency of errors,
> but failures are still seen. The ARM64 vDSO will need to be
> updated to include the sub-nanosecond xtime_nsec values in its
> calculation for this issue to be completely fixed.
>
> Signed-off-by: John Stultz <john.stultz@linaro.org>
> Tested-by: Daniel Mentz <danielmentz@google.com>
> Cc: Prarit Bhargava <prarit@redhat.com>
> Cc: Kevin Brodsky <kevin.brodsky@arm.com>
> Cc: Richard Cochran <richardcochran@gmail.com>
> Cc: Stephen Boyd <stephen.boyd@linaro.org>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: "stable #4 . 8+" <stable@vger.kernel.org>
> Cc: Miroslav Lichvar <mlichvar@redhat.com>
> Link: http://lkml.kernel.org/r/1496965462-20003-3-git-send-email-john.stultz@linaro.org
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> [fabrizio: cherry-pick to 4.4. Kept cycle_t type for function
> logarithmic_accumulation local variable "interval". Dropped
> casting of "interval" variable]
> Signed-off-by: Fabrizio Castro <fabrizio.castro@bp.renesas.com>
> Signed-off-by: Biju Das <biju.das@bp.renesas.com>
> ---
> Hello Greg,
>
> we noticed tools/testing/selftests/timers/clocksource-switch.c
> was failing for us, this patch fixes the cause of the failure.
> Are you happy to take this patch?

For what kernel tree(s)?

And why did you not cc: the developers and maintainer of this subsystem?

thanks,

greg k-h
Hello Greg,

Thank you for your feedback.

> Subject: Re: [PATCH] time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting
>
> On Mon, May 21, 2018 at 06:57:26PM +0100, Fabrizio Castro wrote:
> > From: John Stultz <john.stultz@linaro.org>
> >
> > commit 3d88d56c5873f6eebe23e05c3da701960146b801 upstream.
> >
> > Due to how the MONOTONIC_RAW accumulation logic was handled,
> > there is the potential for a 1ns discontinuity when we do
> > accumulations. This small discontinuity has for the most part
> > gone un-noticed, but since ARM64 enabled CLOCK_MONOTONIC_RAW
> > in their vDSO clock_gettime implementation, we've seen failures
> > with the inconsistency-check test in kselftest.
> >
> > This patch addresses the issue by using the same sub-ns
> > accumulation handling that CLOCK_MONOTONIC uses, which avoids
> > the issue for in-kernel users.
> >
> > Since the ARM64 vDSO implementation has its own clock_gettime
> > calculation logic, this patch reduces the frequency of errors,
> > but failures are still seen. The ARM64 vDSO will need to be
> > updated to include the sub-nanosecond xtime_nsec values in its
> > calculation for this issue to be completely fixed.
> >
> > Signed-off-by: John Stultz <john.stultz@linaro.org>
> > Tested-by: Daniel Mentz <danielmentz@google.com>
> > Cc: Prarit Bhargava <prarit@redhat.com>
> > Cc: Kevin Brodsky <kevin.brodsky@arm.com>
> > Cc: Richard Cochran <richardcochran@gmail.com>
> > Cc: Stephen Boyd <stephen.boyd@linaro.org>
> > Cc: Will Deacon <will.deacon@arm.com>
> > Cc: "stable #4 . 8+" <stable@vger.kernel.org>
> > Cc: Miroslav Lichvar <mlichvar@redhat.com>
> > Link: http://lkml.kernel.org/r/1496965462-20003-3-git-send-email-john.stultz@linaro.org
> > Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> > [fabrizio: cherry-pick to 4.4. Kept cycle_t type for function
> > logarithmic_accumulation local variable "interval". Dropped
> > casting of "interval" variable]
> > Signed-off-by: Fabrizio Castro <fabrizio.castro@bp.renesas.com>
> > Signed-off-by: Biju Das <biju.das@bp.renesas.com>
> > ---
> > Hello Greg,
> >
> > we noticed tools/testing/selftests/timers/clocksource-switch.c
> > was failing for us, this patch fixes the cause of the failure.
> > Are you happy to take this patch?
>
> For what kernel tree(s)?

4.4.y

>
> And why did you not cc: the developers and maintainer of this subsystem?

Sorry, I wasn't sure about bothering the other developers with this.
I'll repost with all the relevant people in cc.

Thanks,
Fab

>
> thanks,
>
> greg k-h

Renesas Electronics Europe Ltd, Dukes Meadow, Millboard Road, Bourne End, Buckinghamshire, SL8 5FH, UK. Registered in England & Wales under Registered No. 04586709.
diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
index f0f1793..115216e 100644
--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -56,7 +56,7 @@ struct tk_read_base {
  *			interval.
  * @xtime_remainder:	Shifted nano seconds left over when rounding
  *			@cycle_interval
- * @raw_interval:	Raw nano seconds accumulated per NTP interval.
+ * @raw_interval:	Shifted raw nano seconds accumulated per NTP interval.
  * @ntp_error:		Difference between accumulated time and NTP time in ntp
  *			shifted nano seconds.
  * @ntp_error_shift:	Shift conversion between clock shifted nano seconds and
@@ -97,7 +97,7 @@ struct timekeeper {
 	cycle_t			cycle_interval;
 	u64			xtime_interval;
 	s64			xtime_remainder;
-	u32			raw_interval;
+	u64			raw_interval;
 	/* The ntp_tick_length() value currently being used.
 	 * This cached copy ensures we consistently apply the tick
 	 * length for an entire tick, as ntp_tick_length may change
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 6e48668..fed86b2 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -277,8 +277,7 @@ static void tk_setup_internals(struct timekeeper *tk, struct clocksource *clock)
 	/* Go back from cycles -> shifted ns */
 	tk->xtime_interval = (u64) interval * clock->mult;
 	tk->xtime_remainder = ntpinterval - tk->xtime_interval;
-	tk->raw_interval =
-		((u64) interval * clock->mult) >> clock->shift;
+	tk->raw_interval = interval * clock->mult;
 
 	 /* if changing clocks, convert xtime_nsec shift units */
 	if (old_clock) {
@@ -1767,7 +1766,7 @@ static cycle_t logarithmic_accumulation(struct timekeeper *tk, cycle_t offset,
 						unsigned int *clock_set)
 {
 	cycle_t interval = tk->cycle_interval << shift;
-	u64 raw_nsecs;
+	u64 snsec_per_sec;
 
 	/* If the offset is smaller than a shifted interval, do nothing */
 	if (offset < interval)
@@ -1782,14 +1781,15 @@ static cycle_t logarithmic_accumulation(struct timekeeper *tk, cycle_t offset,
 	*clock_set |= accumulate_nsecs_to_secs(tk);
 
 	/* Accumulate raw time */
-	raw_nsecs = (u64)tk->raw_interval << shift;
-	raw_nsecs += tk->raw_time.tv_nsec;
-	if (raw_nsecs >= NSEC_PER_SEC) {
-		u64 raw_secs = raw_nsecs;
-		raw_nsecs = do_div(raw_secs, NSEC_PER_SEC);
-		tk->raw_time.tv_sec += raw_secs;
+	tk->tkr_raw.xtime_nsec += (u64)tk->raw_time.tv_nsec << tk->tkr_raw.shift;
+	tk->tkr_raw.xtime_nsec += tk->raw_interval << shift;
+	snsec_per_sec = (u64)NSEC_PER_SEC << tk->tkr_raw.shift;
+	while (tk->tkr_raw.xtime_nsec >= snsec_per_sec) {
+		tk->tkr_raw.xtime_nsec -= snsec_per_sec;
+		tk->raw_time.tv_sec++;
 	}
-	tk->raw_time.tv_nsec = raw_nsecs;
+	tk->raw_time.tv_nsec = tk->tkr_raw.xtime_nsec >> tk->tkr_raw.shift;
+	tk->tkr_raw.xtime_nsec -= (u64)tk->raw_time.tv_nsec << tk->tkr_raw.shift;
 
 	/* Accumulate error between NTP and clock interval */
 	tk->ntp_error += tk->ntp_tick << shift;
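
The sub-nanosecond accounting described in the commit message can be illustrated outside the kernel. What follows is a minimal userspace sketch under stated assumptions, not the kernel code: `struct raw_clock`, `accumulate_truncated()`, `accumulate_shifted()` and `total_ns()` are hypothetical names introduced here, and the `mult`/`shift` values in the note below are made up for illustration. It contrasts truncating each interval to whole nanoseconds (roughly what the removed lines did) with carrying the remainder in shifted units (roughly what the added lines do).

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical stand-in for the raw clock state; not a kernel structure. */
struct raw_clock {
	uint64_t sec;	/* whole seconds */
	uint64_t nsec;	/* whole nanoseconds, always < NSEC_PER_SEC */
	uint64_t frac;	/* sub-ns remainder in shifted units, always < (1 << shift) */
};

/*
 * Truncating scheme: convert the interval to whole nanoseconds first,
 * then accumulate. The fractional part of every interval is discarded.
 */
static void accumulate_truncated(struct raw_clock *c, uint64_t cycles,
				 uint32_t mult, uint32_t shift)
{
	uint64_t nsecs;

	assert(shift < 32);	/* sketch assumption: keeps shifted values in 64 bits */
	nsecs = ((cycles * mult) >> shift) + c->nsec;
	c->sec += nsecs / NSEC_PER_SEC;
	c->nsec = nsecs % NSEC_PER_SEC;
}

/*
 * Shifted scheme: accumulate in shifted nanoseconds, split off whole
 * seconds and whole nanoseconds, and keep the leftover for next time.
 */
static void accumulate_shifted(struct raw_clock *c, uint64_t cycles,
			       uint32_t mult, uint32_t shift)
{
	uint64_t snsec_per_sec, snsec;

	assert(shift < 32);	/* sketch assumption: keeps shifted values in 64 bits */
	snsec_per_sec = NSEC_PER_SEC << shift;
	snsec = (c->nsec << shift) + c->frac + cycles * mult;

	while (snsec >= snsec_per_sec) {
		snsec -= snsec_per_sec;
		c->sec++;
	}
	c->nsec = snsec >> shift;
	c->frac = snsec - (c->nsec << shift);
}

/* Total elapsed whole nanoseconds represented by the clock state. */
static uint64_t total_ns(const struct raw_clock *c)
{
	return c->sec * NSEC_PER_SEC + c->nsec;
}
```

With the made-up values `cycles = 1000`, `mult = 333` and `shift = 8`, each interval is worth 1300.78125 ns. After 1000 calls on zero-initialised state, the truncating variant reports 1300000 ns and the shifted variant reports 1300781 ns, because only the latter carries the 0.78125 ns remainder forward.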