From patchwork Thu Jun 11 22:54:55 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: John Stultz X-Patchwork-Id: 49775 Return-Path: X-Original-To: linaro@patches.linaro.org Delivered-To: linaro@patches.linaro.org Received: from mail-wi0-f198.google.com (mail-wi0-f198.google.com [209.85.212.198]) by ip-10-151-82-157.ec2.internal (Postfix) with ESMTPS id 6D2A420C81 for ; Thu, 11 Jun 2015 22:55:15 +0000 (UTC) Received: by wibdt2 with SMTP id dt2sf192019wib.3 for ; Thu, 11 Jun 2015 15:55:14 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:delivered-to:from:to:cc:subject :date:message-id:in-reply-to:references:x-original-sender :x-original-authentication-results:precedence:mailing-list:list-id :list-post:list-help:list-archive:list-unsubscribe; bh=YDT4ed1l9INcrw6HsYrrRl3CZiZkgG3k41Z85+HGTnQ=; b=VECwa38cXwlCavhWVqpHkPSj/KLnm2rWOoG+DZC03lE7n1c5DxB1VXezebZs/MDXtz K5QgKBdDjuhxiO0q6wKjtIHVDQj8lIh5ba1Jvs+EfDgMi2AJXrFpyQFNnpkRKYUWEGwj gPCHa9RPv57XkH4gYv8v8hryYtRzJYt7ZZDr6gyqdpsblqiAlRgPWQOS1I49gifYgHT8 6wZYnQK5iPWntl+iezUx3aU+ArvJ3falWuDs7nRCEhwTNEnBs+qNjOYmWckUeNrPdkg0 6tuafFPlxY6diJlXSaWjzeDOmenNrm1mfWyxZvk7apteaqIfd2zgEagNHI+zEnKePDTw vnPg== X-Gm-Message-State: ALoCoQntWDvvySL7gidrm7yU0v9WSaCGUnN9gK/cHNwXEmWeeHtk72wrA30yfYXBfnOuQgjSrqiz X-Received: by 10.112.28.111 with SMTP id a15mr10824661lbh.21.1434063314775; Thu, 11 Jun 2015 15:55:14 -0700 (PDT) MIME-Version: 1.0 X-BeenThere: patchwork-forward@linaro.org Received: by 10.152.120.227 with SMTP id lf3ls500025lab.93.gmail; Thu, 11 Jun 2015 15:55:14 -0700 (PDT) X-Received: by 10.152.6.105 with SMTP id z9mr12048876laz.98.1434063314538; Thu, 11 Jun 2015 15:55:14 -0700 (PDT) Received: from mail-lb0-f170.google.com (mail-lb0-f170.google.com. [209.85.217.170]) by mx.google.com with ESMTPS id w10si1807118law.138.2015.06.11.15.55.14 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 11 Jun 2015 15:55:14 -0700 (PDT) Received-SPF: pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.217.170 as permitted sender) client-ip=209.85.217.170; Received: by lbcue7 with SMTP id ue7so10378078lbc.0 for ; Thu, 11 Jun 2015 15:55:14 -0700 (PDT) X-Received: by 10.112.222.133 with SMTP id qm5mr12348787lbc.86.1434063314349; Thu, 11 Jun 2015 15:55:14 -0700 (PDT) X-Forwarded-To: patchwork-forward@linaro.org X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org Delivered-To: patches@linaro.org Received: by 10.112.108.230 with SMTP id hn6csp135171lbb; Thu, 11 Jun 2015 15:55:13 -0700 (PDT) X-Received: by 10.68.135.233 with SMTP id pv9mr18410064pbb.115.1434063309224; Thu, 11 Jun 2015 15:55:09 -0700 (PDT) Received: from mail-pd0-f180.google.com (mail-pd0-f180.google.com. [209.85.192.180]) by mx.google.com with ESMTPS id oi3si2659851pdb.222.2015.06.11.15.55.08 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 11 Jun 2015 15:55:09 -0700 (PDT) Received-SPF: pass (google.com: domain of john.stultz@linaro.org designates 209.85.192.180 as permitted sender) client-ip=209.85.192.180; Received: by pdbnf5 with SMTP id nf5so11473110pdb.2 for ; Thu, 11 Jun 2015 15:55:08 -0700 (PDT) X-Received: by 10.68.69.48 with SMTP id b16mr18018742pbu.144.1434063308323; Thu, 11 Jun 2015 15:55:08 -0700 (PDT) Received: from localhost.localdomain (c-71-63-136-39.hsd1.or.comcast.net. [71.63.136.39]) by mx.google.com with ESMTPSA id ax2sm1676093pac.21.2015.06.11.15.55.07 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Thu, 11 Jun 2015 15:55:07 -0700 (PDT) From: John Stultz To: linux-kernel@vger.kernel.org Cc: John Stultz , Prarit Bhargava , Daniel Bristot de Oliveira , Richard Cochran , Jan Kara , Jiri Bohac , Thomas Gleixner , Ingo Molnar , Shuah Khan Subject: [PATCH 3/5] time: Do leapsecond adjustment to avoid early timer expirations Date: Thu, 11 Jun 2015 15:54:55 -0700 Message-Id: <1434063297-28657-4-git-send-email-john.stultz@linaro.org> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1434063297-28657-1-git-send-email-john.stultz@linaro.org> References: <1434063297-28657-1-git-send-email-john.stultz@linaro.org> X-Removed-Original-Auth: Dkim didn't pass. X-Original-Sender: john.stultz@linaro.org X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.217.170 as permitted sender) smtp.mail=patch+caf_=patchwork-forward=linaro.org@linaro.org Precedence: list Mailing-list: list patchwork-forward@linaro.org; contact patchwork-forward+owners@linaro.org List-ID: X-Google-Group-Id: 836684582541 List-Post: , List-Help: , List-Archive: List-Unsubscribe: , Currently, leapsecond adjustments are done at tick time. As a result, the leapsecond was applied at the first timer tick *after* the leapsecond (~1-10ms late depending on HZ), rather then exactly on the second edge. This was in part historical from back when we were always tick based, but correcting this since has been avoided since it adds extra conditional checks in the gettime fastpath, which has performance overhead. However, it was recently pointed out that ABS_TIME CLOCK_REALTIME timers set for right after the leapsecond could fire a second early, since some timers may be expired before we trigger the timekeeping timer, which then applies the leapsecond. This isn't quite as bad as it sounds, since behaviorally it is similar to what is possible w/ ntpd made leapsecond adjustments done w/o using the kernel discipline. Where due to latencies, timers may fire just prior to the settimeofday call. (Also, one should note that all applications using CLOCK_REALTIME timers should always be careful, since they are prone to quirks from settimeofday() disturbances.) However, the purpose of having the kernel do the leap adjustment is to avoid such latencies, so I think this is worth fixing. So in order to properly keep those timers from firing a second early, this patch modifies the ntp and timekeeping logic so that we keep enough state so that the update_base_offsets_now accessor, which provides the hrtimer core the current time, can check and apply the leapsecond adjustment on the second edge. This prevents the hrtimer core from expiring timers too early. This patch does not modify any other time read path, so no additional overhead is incurred. However, this also means that the leap-second continues to be applied at tick time for all other read-paths. Apologies to Richard Cochran, who pushed for similar changes years ago, which I resisted due to the concerns about the performance overhead. While I suspect this isn't extremely critical, folks who care about strict leap-second correctness will likely want to watch this. Potentially a -stable candidate eventually. Cc: Prarit Bhargava Cc: Daniel Bristot de Oliveira Cc: Richard Cochran Cc: Jan Kara Cc: Jiri Bohac Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Shuah Khan Originally-suggested-by: Richard Cochran Reported-by: Daniel Bristot de Oliveira Reported-by: Prarit Bhargava Signed-off-by: John Stultz New in v2: * Removed redundent state update calls * Simplified logic to only handle leap insertion * Reduced scope to only address hrtimer expiration issue * Incorporated logic suggestions from Thomas. --- include/linux/time64.h | 1 + include/linux/timekeeper_internal.h | 2 ++ kernel/time/ntp.c | 42 ++++++++++++++++++++++++++++++------- kernel/time/ntp_internal.h | 1 + kernel/time/timekeeping.c | 23 +++++++++++++++++++- 5 files changed, 61 insertions(+), 8 deletions(-) diff --git a/include/linux/time64.h b/include/linux/time64.h index 12d4e82..77b5df2 100644 --- a/include/linux/time64.h +++ b/include/linux/time64.h @@ -29,6 +29,7 @@ struct timespec64 { #define FSEC_PER_SEC 1000000000000000LL /* Located here for timespec[64]_valid_strict */ +#define TIME64_MAX ((s64)~((u64)1 << 63)) #define KTIME_MAX ((s64)~((u64)1 << 63)) #define KTIME_SEC_MAX (KTIME_MAX / NSEC_PER_SEC) diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h index e1f5a11..2524722 100644 --- a/include/linux/timekeeper_internal.h +++ b/include/linux/timekeeper_internal.h @@ -50,6 +50,7 @@ struct tk_read_base { * @offs_tai: Offset clock monotonic -> clock tai * @tai_offset: The current UTC to TAI offset in seconds * @clock_was_set_seq: The sequence number of clock was set events + * @next_leap_ktime: CLOCK_MONOTONIC time value of a pending leap-second * @raw_time: Monotonic raw base time in timespec64 format * @cycle_interval: Number of clock cycles in one NTP interval * @xtime_interval: Number of clock shifted nano seconds in one NTP @@ -90,6 +91,7 @@ struct timekeeper { ktime_t offs_tai; s32 tai_offset; unsigned int clock_was_set_seq; + ktime_t next_leap_ktime; struct timespec64 raw_time; /* The following members are for timekeeping internal use */ diff --git a/kernel/time/ntp.c b/kernel/time/ntp.c index 7aa2161..033743e 100644 --- a/kernel/time/ntp.c +++ b/kernel/time/ntp.c @@ -77,6 +77,9 @@ static long time_adjust; /* constant (boot-param configurable) NTP tick adjustment (upscaled) */ static s64 ntp_tick_adj; +/* second value of the next pending leapsecond, or TIME64_MAX if no leap */ +static time64_t ntp_next_leap_sec = TIME64_MAX; + #ifdef CONFIG_NTP_PPS /* @@ -350,6 +353,7 @@ void ntp_clear(void) tick_length = tick_length_base; time_offset = 0; + ntp_next_leap_sec = TIME64_MAX; /* Clear PPS state variables */ pps_clear(); } @@ -360,6 +364,21 @@ u64 ntp_tick_length(void) return tick_length; } +/** + * ntp_get_next_leap - Returns the next leapsecond in CLOCK_REALTIME ktime_t + * + * Provides the time of the next leapsecond against CLOCK_REALTIME in + * a ktime_t format. Returns KTIME_MAX if no leapsecond is pending. + */ +ktime_t ntp_get_next_leap(void) +{ + ktime_t ret; + + if ((time_state == TIME_INS) && (time_status & STA_INS)) + return ktime_set(ntp_next_leap_sec, 0); + ret.tv64 = KTIME_MAX; + return ret; +} /* * this routine handles the overflow of the microsecond field @@ -383,15 +402,21 @@ int second_overflow(unsigned long secs) */ switch (time_state) { case TIME_OK: - if (time_status & STA_INS) + if (time_status & STA_INS) { time_state = TIME_INS; - else if (time_status & STA_DEL) + ntp_next_leap_sec = secs + SECS_PER_DAY - + (secs % SECS_PER_DAY); + } else if (time_status & STA_DEL) { time_state = TIME_DEL; + ntp_next_leap_sec = secs + SECS_PER_DAY - + ((secs+1) % SECS_PER_DAY); + } break; case TIME_INS: - if (!(time_status & STA_INS)) + if (!(time_status & STA_INS)) { + ntp_next_leap_sec = TIME64_MAX; time_state = TIME_OK; - else if (secs % SECS_PER_DAY == 0) { + } else if (secs % SECS_PER_DAY == 0) { leap = -1; time_state = TIME_OOP; printk(KERN_NOTICE @@ -399,19 +424,21 @@ int second_overflow(unsigned long secs) } break; case TIME_DEL: - if (!(time_status & STA_DEL)) + if (!(time_status & STA_DEL)) { + ntp_next_leap_sec = TIME64_MAX; time_state = TIME_OK; - else if ((secs + 1) % SECS_PER_DAY == 0) { + } else if ((secs + 1) % SECS_PER_DAY == 0) { leap = 1; + ntp_next_leap_sec = TIME64_MAX; time_state = TIME_WAIT; printk(KERN_NOTICE "Clock: deleting leap second 23:59:59 UTC\n"); } break; case TIME_OOP: + ntp_next_leap_sec = TIME64_MAX; time_state = TIME_WAIT; break; - case TIME_WAIT: if (!(time_status & (STA_INS | STA_DEL))) time_state = TIME_OK; @@ -548,6 +575,7 @@ static inline void process_adj_status(struct timex *txc, struct timespec64 *ts) if ((time_status & STA_PLL) && !(txc->status & STA_PLL)) { time_state = TIME_OK; time_status = STA_UNSYNC; + ntp_next_leap_sec = TIME64_MAX; /* restart PPS frequency calibration */ pps_reset_freq_interval(); } diff --git a/kernel/time/ntp_internal.h b/kernel/time/ntp_internal.h index bbd102a..6543050 100644 --- a/kernel/time/ntp_internal.h +++ b/kernel/time/ntp_internal.h @@ -5,6 +5,7 @@ extern void ntp_init(void); extern void ntp_clear(void); /* Returns how long ticks are at present, in ns / 2^NTP_SCALE_SHIFT. */ extern u64 ntp_tick_length(void); +extern ktime_t ntp_get_next_leap(void); extern int second_overflow(unsigned long secs); extern int ntp_validate_timex(struct timex *); extern int __do_adjtimex(struct timex *, struct timespec64 *, s32 *); diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c index 849b932..5d67ffb 100644 --- a/kernel/time/timekeeping.c +++ b/kernel/time/timekeeping.c @@ -540,6 +540,17 @@ int pvclock_gtod_unregister_notifier(struct notifier_block *nb) EXPORT_SYMBOL_GPL(pvclock_gtod_unregister_notifier); /* + * tk_update_leap_state - helper to update the next_leap_ktime + */ +static inline void tk_update_leap_state(struct timekeeper *tk) +{ + tk->next_leap_ktime = ntp_get_next_leap(); + if (tk->next_leap_ktime.tv64 != KTIME_MAX) + /* Convert to monotonic time */ + tk->next_leap_ktime = ktime_sub(tk->next_leap_ktime, tk->offs_real); +} + +/* * Update the ktime_t based scalar nsec members of the timekeeper */ static inline void tk_update_ktime_data(struct timekeeper *tk) @@ -580,6 +591,7 @@ static void timekeeping_update(struct timekeeper *tk, unsigned int action) ntp_clear(); } + tk_update_leap_state(tk); tk_update_ktime_data(tk); update_vsyscall(tk); @@ -1956,15 +1968,22 @@ ktime_t ktime_get_update_offsets_now(unsigned int *cwsseq, ktime_t *offs_real, base = tk->tkr_mono.base; nsecs = timekeeping_get_ns(&tk->tkr_mono); + base = ktime_add_ns(base, nsecs); + if (*cwsseq != tk->clock_was_set_seq) { *cwsseq = tk->clock_was_set_seq; *offs_real = tk->offs_real; *offs_boot = tk->offs_boot; *offs_tai = tk->offs_tai; } + + /* Handle leapsecond insertion adjustments */ + if (unlikely(base.tv64 >= tk->next_leap_ktime.tv64)) + *offs_real = ktime_sub(tk->offs_real, ktime_set(1, 0)); + } while (read_seqcount_retry(&tk_core.seq, seq)); - return ktime_add_ns(base, nsecs); + return base; } /** @@ -2006,6 +2025,8 @@ int do_adjtimex(struct timex *txc) __timekeeping_set_tai_offset(tk, tai); timekeeping_update(tk, TK_MIRROR | TK_CLOCK_WAS_SET); } + tk_update_leap_state(tk); + write_seqcount_end(&tk_core.seq); raw_spin_unlock_irqrestore(&timekeeper_lock, flags);