Message ID | 1411050873-9310-2-git-send-email-pawel.moll@arm.com |
---|---|
State | New |
On Thu, Sep 18, 2014 at 03:34:32PM +0100, Pawel Moll wrote:
> @@ -4456,6 +4459,13 @@ static void __perf_event_header__init_id(struct perf_event_header *header,
>  		data->cpu_entry.cpu	 = raw_smp_processor_id();
>  		data->cpu_entry.reserved = 0;
>  	}
> +
> +	if (sample_type & PERF_SAMPLE_CLOCK_RAW_MONOTONIC) {
> +		struct timespec now;
> +
> +		getrawmonotonic(&now);
> +		data->clock_raw_monotonic = timespec_to_ns(&now);
> +	}
>  }
>

This cannot work, getrawmonotonic() isn't NMI-safe and there's
nothing stopping this being used from NMI context.

Also getrawmonotonic() + timespec_to_ns() will make tglx sad, he's just
done a tree-wide eradication of silly conversions and now you're adding
a ns -> timespec -> ns dance right back.

I _think_ you want ktime_get_mono_fast_ns(), but this does bring us
right back to the question/discussion on which timebase you'd want to
sync again. MONO does make sense for most cases, but I think we've had
fairly sane stories for people wanting to sync against other clocks.

A well..
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
On Mon, 2014-09-29 at 16:28 +0100, Peter Zijlstra wrote:
> On Thu, Sep 18, 2014 at 03:34:32PM +0100, Pawel Moll wrote:
> > @@ -4456,6 +4459,13 @@ static void __perf_event_header__init_id(struct perf_event_header *header,
> >  		data->cpu_entry.cpu	 = raw_smp_processor_id();
> >  		data->cpu_entry.reserved = 0;
> >  	}
> > +
> > +	if (sample_type & PERF_SAMPLE_CLOCK_RAW_MONOTONIC) {
> > +		struct timespec now;
> > +
> > +		getrawmonotonic(&now);
> > +		data->clock_raw_monotonic = timespec_to_ns(&now);
> > +	}
> >  }
> >
>
> This cannot work, getrawmonotonic() isn't NMI-safe and there's
> nothing stopping this being used from NMI context.
>
> Also getrawmonotonic() + timespec_to_ns() will make tglx sad, he's just
> done a tree-wide eradication of silly conversions and now you're adding
> a ns -> timespec -> ns dance right back.

Last thing I want is to make Thomas sad... For obvious reasons ;-)

> I _think_ you want ktime_get_mono_fast_ns(),

With pleasure, it's exactly what I need.

> but this does bring us
> right back to the question/discussion on which timebase you'd want to
> sync again. MONO does make sense for most cases, but I think we've had
> fairly sane stories for people wanting to sync against other clocks.

Yes. I've asked the same question somewhere in the thread.

ftrace has got a switch and a selection of trace_clocks in
kernel/trace/trace.c - do we want something similar (in integer form
probably, though) in perf_events.h with an additional "flag" in struct
perf_event_attr? It could be used to pick a time source for
PERF_SAMPLE_CLOCK (PERF_SAMPLE_TRACE_CLOCK?) sample.

Pawel
diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 707617a..28b73b2 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -602,6 +602,8 @@ struct perf_sample_data {
 	 * Transaction flags for abort events:
 	 */
 	u64				txn;
+	/* Raw monotonic timestamp, for userspace time correlation */
+	u64				clock_raw_monotonic;
 };
 
 static inline void perf_sample_data_init(struct perf_sample_data *data,
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index 9269de2..e5a75c5 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -137,8 +137,9 @@ enum perf_event_sample_format {
 	PERF_SAMPLE_DATA_SRC			= 1U << 15,
 	PERF_SAMPLE_IDENTIFIER			= 1U << 16,
 	PERF_SAMPLE_TRANSACTION			= 1U << 17,
+	PERF_SAMPLE_CLOCK_RAW_MONOTONIC		= 1U << 18,
 
-	PERF_SAMPLE_MAX = 1U << 18,		/* non-ABI */
+	PERF_SAMPLE_MAX = 1U << 19,		/* non-ABI */
 };
 
 /*
@@ -686,6 +687,7 @@ enum perf_event_type {
 	 *	{ u64			weight;   } && PERF_SAMPLE_WEIGHT
 	 *	{ u64			data_src; } && PERF_SAMPLE_DATA_SRC
 	 *	{ u64			transaction; } && PERF_SAMPLE_TRANSACTION
+	 *	{ u64			clock_raw_monotonic; } && PERF_SAMPLE_CLOCK_RAW_MONOTONIC
 	 * };
 	 */
 	PERF_RECORD_SAMPLE			= 9,
diff --git a/kernel/events/core.c b/kernel/events/core.c
index f9c1ed0..f6df547 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1216,6 +1216,9 @@ static void perf_event__header_size(struct perf_event *event)
 	if (sample_type & PERF_SAMPLE_TRANSACTION)
 		size += sizeof(data->txn);
 
+	if (sample_type & PERF_SAMPLE_CLOCK_RAW_MONOTONIC)
+		size += sizeof(data->clock_raw_monotonic);
+
 	event->header_size = size;
 }
 
@@ -4456,6 +4459,13 @@ static void __perf_event_header__init_id(struct perf_event_header *header,
 		data->cpu_entry.cpu	 = raw_smp_processor_id();
 		data->cpu_entry.reserved = 0;
 	}
+
+	if (sample_type & PERF_SAMPLE_CLOCK_RAW_MONOTONIC) {
+		struct timespec now;
+
+		getrawmonotonic(&now);
+		data->clock_raw_monotonic = timespec_to_ns(&now);
+	}
 }
 
 void perf_event_header__init_id(struct perf_event_header *header,
@@ -4714,6 +4724,9 @@ void perf_output_sample(struct perf_output_handle *handle,
 	if (sample_type & PERF_SAMPLE_TRANSACTION)
 		perf_output_put(handle, data->txn);
 
+	if (sample_type & PERF_SAMPLE_CLOCK_RAW_MONOTONIC)
+		perf_output_put(handle, data->clock_raw_monotonic);
+
 	if (!event->attr.watermark) {
 		int wakeup_events = event->attr.wakeup_events;
 
This patch adds an option to sample the raw monotonic clock value with
any perf event, with the aim of allowing time correlation between data
coming from perf and additional performance-related information
generated in userspace.

In order to correlate timestamps in the perf data stream with events
happening in userspace (be it JITed debug symbols or hwmon-originating
environment data), the user requests a more or less periodic event
(a sched_switch trace event or a hrtimer-based cpu-clock being the most
obvious examples) with PERF_SAMPLE_TIME *and*
PERF_SAMPLE_CLOCK_RAW_MONOTONIC, and stamps user-originating data with
values obtained from clock_gettime(CLOCK_MONOTONIC_RAW). Then, during
analysis, one looks at the perf events immediately preceding and
following (in terms of the clock_raw_monotonic sample) the userspace
event and does a simple linear approximation to get the equivalent perf
time.

        perf event     user event
       -----O--------------+-------------O------> t_mono
            :              |             :
            :              V             :
       -----O----------------------------O------> t_perf

Signed-off-by: Pawel Moll <pawel.moll@arm.com>
---
 include/linux/perf_event.h      |  2 ++
 include/uapi/linux/perf_event.h |  4 +++-
 kernel/events/core.c            | 13 +++++++++++++
 3 files changed, 18 insertions(+), 1 deletion(-)
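
For illustration only, the userspace side of the correlation described in
the commit message might look roughly like the sketch below. It is not
part of the patch: struct sync_point, now_mono_raw_ns() and mono_to_perf()
are hypothetical names, and the sketch assumes the analysis tool has
already decoded, for the two perf samples bracketing the user event, both
the PERF_SAMPLE_TIME value and the proposed clock_raw_monotonic value.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical pair decoded from one perf sample: the proposed
 * clock_raw_monotonic field and the PERF_SAMPLE_TIME field, both in ns.
 */
struct sync_point {
	uint64_t t_mono;
	uint64_t t_perf;
};

/* Stamp a userspace event with CLOCK_MONOTONIC_RAW, in nanoseconds. */
static uint64_t now_mono_raw_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC_RAW, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/*
 * Linear approximation of the perf time for a userspace timestamp that
 * lies between two perf samples (before.t_mono <= t_mono <= after.t_mono).
 */
static uint64_t mono_to_perf(struct sync_point before,
			     struct sync_point after, uint64_t t_mono)
{
	double frac, span;

	assert(before.t_mono <= t_mono && t_mono <= after.t_mono);
	assert(before.t_perf <= after.t_perf);

	/* Degenerate interval: both samples carry the same raw timestamp. */
	if (after.t_mono == before.t_mono)
		return before.t_perf;

	frac = (double)(t_mono - before.t_mono) /
	       (double)(after.t_mono - before.t_mono);
	span = (double)(after.t_perf - before.t_perf);

	/* Round to the nearest nanosecond. */
	return before.t_perf + (uint64_t)(frac * span + 0.5);
}
```

A tool would call now_mono_raw_ns() when it records its own data and
mono_to_perf() during analysis. Floating point is used for the fraction
to sidestep 64-bit overflow when multiplying two nanosecond deltas; the
closer together the bracketing samples are, the smaller the error from
drift between the two clocks.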