time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting

Message ID 1526925446-6067-1-git-send-email-fabrizio.castro@bp.renesas.com
State Superseded

Commit Message

Fabrizio Castro May 21, 2018, 5:57 p.m. UTC
From: John Stultz <john.stultz@linaro.org>

commit 3d88d56c5873f6eebe23e05c3da701960146b801 upstream.

Due to how the MONOTONIC_RAW accumulation logic was handled,
there is the potential for a 1ns discontinuity when we do
accumulations. This small discontinuity has for the most part
gone un-noticed, but since ARM64 enabled CLOCK_MONOTONIC_RAW
in their vDSO clock_gettime implementation, we've seen failures
with the inconsistency-check test in kselftest.

This patch addresses the issue by using the same sub-ns
accumulation handling that CLOCK_MONOTONIC uses, which avoids
the issue for in-kernel users.

Since the ARM64 vDSO implementation has its own clock_gettime
calculation logic, this patch reduces the frequency of errors,
but failures are still seen. The ARM64 vDSO will need to be
updated to include the sub-nanosecond xtime_nsec values in its
calculation for this issue to be completely fixed.

Signed-off-by: John Stultz <john.stultz@linaro.org>
Tested-by: Daniel Mentz <danielmentz@google.com>

Cc: Prarit Bhargava <prarit@redhat.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Stephen Boyd <stephen.boyd@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: "stable #4 . 8+" <stable@vger.kernel.org>
Cc: Miroslav Lichvar <mlichvar@redhat.com>
Link: http://lkml.kernel.org/r/1496965462-20003-3-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>

[fabrizio: cherry-pick to 4.4. Kept cycle_t type for function
logarithmic_accumulation local variable "interval". Dropped
casting of "interval" variable]
Signed-off-by: Fabrizio Castro <fabrizio.castro@bp.renesas.com>
Signed-off-by: Biju Das <biju.das@bp.renesas.com>

---
Hello Greg,

we noticed tools/testing/selftests/timers/clocksource-switch.c
was failing for us, this patch fixes the cause of the failure.
Are you happy to take this patch?

Thanks,
Fab

 include/linux/timekeeper_internal.h |  4 ++--
 kernel/time/timekeeping.c           | 20 ++++++++++----------
 2 files changed, 12 insertions(+), 12 deletions(-)

-- 
2.7.4
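
The 1ns discontinuity described in the commit message can be reproduced
outside the kernel with a small model. This is only a sketch under
assumed values: struct raw_clock, read_ns() and the MULT, SHIFT and
INTERVAL constants are hypothetical names and numbers chosen for
illustration, not the kernel's. It compares a read taken just before an
accumulation with a read taken just after it, first with the remainder
truncated away and then with the shifted remainder kept.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only (assumptions, not real clocksource parameters). */
#define SHIFT    8u     /* fixed-point shift: 1 ns == 1 << SHIFT shifted ns */
#define MULT     333u   /* shifted ns per cycle, about 1.3 ns per cycle */
#define INTERVAL 1000u  /* cycles folded in per accumulation */

struct raw_clock {
	uint64_t ns;    /* whole nanoseconds accumulated so far */
	uint64_t frac;  /* shifted sub-nanosecond remainder, below 1 << SHIFT */
};

/* Read the clock 'delta' cycles after the last accumulation point. */
static uint64_t read_ns(const struct raw_clock *c, uint64_t delta)
{
	return c->ns + ((delta * MULT + c->frac) >> SHIFT);
}

/* Old-style accumulation: the sub-nanosecond remainder is thrown away. */
static void accumulate_truncating(struct raw_clock *c)
{
	c->ns += ((uint64_t)INTERVAL * MULT) >> SHIFT;
}

/* New-style accumulation: the shifted remainder is carried forward. */
static void accumulate_shifted(struct raw_clock *c)
{
	uint64_t total = c->frac + (uint64_t)INTERVAL * MULT;

	c->ns += total >> SHIFT;
	c->frac = total & ((1u << SHIFT) - 1);
}

/*
 * Difference between a read taken 'extra' cycles past the interval just
 * before accumulating and the read at the same instant just after.
 * A negative result means the clock appeared to step backwards.
 */
static int64_t step_across_accumulation(void (*accumulate)(struct raw_clock *),
					uint64_t extra)
{
	struct raw_clock c = { 0, 0 };
	uint64_t before, after;

	assert(accumulate != NULL);
	before = read_ns(&c, INTERVAL + extra);
	accumulate(&c);
	after = read_ns(&c, extra);
	return (int64_t)(after - before);
}
```

With these numbers, step_across_accumulation(accumulate_truncating, 1)
returns -1 (the read goes from 1302ns to 1301ns), while
step_across_accumulation(accumulate_shifted, 1) returns 0. A reader that
ignores the carried remainder would still see the step, which is
consistent with the commit message's note about the ARM64 vDSO.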

Comments

Greg KH May 21, 2018, 7:03 p.m. UTC | #1
On Mon, May 21, 2018 at 06:57:26PM +0100, Fabrizio Castro wrote:
> Hello Greg,
>
> we noticed tools/testing/selftests/timers/clocksource-switch.c
> was failing for us, this patch fixes the cause of the failure.
> Are you happy to take this patch?

For what kernel tree(s)?

And why did you not cc: the developers and maintainer of this subsystem?

thanks,

greg k-h
Fabrizio Castro May 22, 2018, 8:53 a.m. UTC | #2
Hello Greg,

Thank you for your feedback.

> Subject: Re: [PATCH] time: Fix CLOCK_MONOTONIC_RAW sub-nanosecond accounting
>
> On Mon, May 21, 2018 at 06:57:26PM +0100, Fabrizio Castro wrote:
> > Hello Greg,
> >
> > we noticed tools/testing/selftests/timers/clocksource-switch.c
> > was failing for us, this patch fixes the cause of the failure.
> > Are you happy to take this patch?
>
> For what kernel tree(s)?

4.4.y

>
> And why did you not cc: the developers and maintainer of this subsystem?

Sorry, I wasn't sure about bothering the other developers with this. I'll repost with all the relevant people in cc.

Thanks,
Fab


Patch

diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
index f0f1793..115216e 100644
--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -56,7 +56,7 @@  struct tk_read_base {
  *			interval.
  * @xtime_remainder:	Shifted nano seconds left over when rounding
  *			@cycle_interval
- * @raw_interval:	Raw nano seconds accumulated per NTP interval.
+ * @raw_interval:	Shifted raw nano seconds accumulated per NTP interval.
  * @ntp_error:		Difference between accumulated time and NTP time in ntp
  *			shifted nano seconds.
  * @ntp_error_shift:	Shift conversion between clock shifted nano seconds and
@@ -97,7 +97,7 @@  struct timekeeper {
 	cycle_t			cycle_interval;
 	u64			xtime_interval;
 	s64			xtime_remainder;
-	u32			raw_interval;
+	u64			raw_interval;
 	/* The ntp_tick_length() value currently being used.
 	 * This cached copy ensures we consistently apply the tick
 	 * length for an entire tick, as ntp_tick_length may change
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index 6e48668..fed86b2 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -277,8 +277,7 @@  static void tk_setup_internals(struct timekeeper *tk, struct clocksource *clock)
 	/* Go back from cycles -> shifted ns */
 	tk->xtime_interval = (u64) interval * clock->mult;
 	tk->xtime_remainder = ntpinterval - tk->xtime_interval;
-	tk->raw_interval =
-		((u64) interval * clock->mult) >> clock->shift;
+	tk->raw_interval = interval * clock->mult;
 
 	 /* if changing clocks, convert xtime_nsec shift units */
 	if (old_clock) {
@@ -1767,7 +1766,7 @@  static cycle_t logarithmic_accumulation(struct timekeeper *tk, cycle_t offset,
 						unsigned int *clock_set)
 {
 	cycle_t interval = tk->cycle_interval << shift;
-	u64 raw_nsecs;
+	u64 snsec_per_sec;
 
 	/* If the offset is smaller than a shifted interval, do nothing */
 	if (offset < interval)
@@ -1782,14 +1781,15 @@  static cycle_t logarithmic_accumulation(struct timekeeper *tk, cycle_t offset,
 	*clock_set |= accumulate_nsecs_to_secs(tk);
 
 	/* Accumulate raw time */
-	raw_nsecs = (u64)tk->raw_interval << shift;
-	raw_nsecs += tk->raw_time.tv_nsec;
-	if (raw_nsecs >= NSEC_PER_SEC) {
-		u64 raw_secs = raw_nsecs;
-		raw_nsecs = do_div(raw_secs, NSEC_PER_SEC);
-		tk->raw_time.tv_sec += raw_secs;
+	tk->tkr_raw.xtime_nsec += (u64)tk->raw_time.tv_nsec << tk->tkr_raw.shift;
+	tk->tkr_raw.xtime_nsec += tk->raw_interval << shift;
+	snsec_per_sec = (u64)NSEC_PER_SEC << tk->tkr_raw.shift;
+	while (tk->tkr_raw.xtime_nsec >= snsec_per_sec) {
+		tk->tkr_raw.xtime_nsec -= snsec_per_sec;
+		tk->raw_time.tv_sec++;
 	}
-	tk->raw_time.tv_nsec = raw_nsecs;
+	tk->raw_time.tv_nsec = tk->tkr_raw.xtime_nsec >> tk->tkr_raw.shift;
+	tk->tkr_raw.xtime_nsec -= (u64)tk->raw_time.tv_nsec << tk->tkr_raw.shift;
 
 	/* Accumulate error between NTP and clock interval */
 	tk->ntp_error += tk->ntp_tick << shift;