Message ID | 1463182576-6035-1-git-send-email-mathieu.poirier@linaro.org |
---|---|
State | New |
Headers | show |
On 16 May 2016 at 00:01, Jiri Olsa <jolsa@redhat.com> wrote: > On Fri, May 13, 2016 at 05:36:16PM -0600, Mathieu Poirier wrote: >> The following patch[1] adds a new "cpumode" to the perf_sample >> structure that gets initialised as events are read from the data >> event file. >> >> With the advent of HW tracers and more specifically the decoding of >> the traces they generate, function perf_session__deliver_synth_event() >> gets called directly from the infrastructure that decodes the traces, >> for example[2]. > > hum, but this one ends up calling perf_session__deliver_synth_event > with sample == NULL right? Wrong example - this highlights the problem [1]. Here "sample" is created from scratch and "->cpumode" isn't initialised. Looking at things with a rested head this morning it is apparent to me that adding something like: sample.cpumode = event->header.misc; to line 1048 is probably a better solution. Thanks, Mathieu [1]. http://lxr.free-electrons.com/source/tools/perf/util/intel-pt.c#L1070 > > I checked callers of perf_session__deliver_synth_event and it seems > all either init sample->cpumode or pass sample as NULL.. what do I miss? > > thanks, > jirka > >> >> As such initialisation of perf_sample::cpumode doesn't get done, which >> prevents the shared object name from being printed in the perf report >> output. >> >> Before this patch: >> >> 4.13% 4.13% uname [unknown] [H] 0x0000007fae8b4758 >> 3.74% 3.74% uname [unknown] [H] 0x0000007fae8b4e50 >> 2.06% 2.06% uname [unknown] [H] 0x0000007fae938af4 >> 1.65% 1.65% uname [unknown] [H] 0x0000007fae938ae4 >> 1.59% 1.59% uname [unknown] [H] 0x0000007fae98f7f4 >> 1.50% 1.50% uname [unknown] [H] 0x0000007fae8b4e40 >> 1.43% 1.43% uname [unknown] [H] 0x0000007fae938ac4 >> 1.31% 1.31% uname [unknown] [H] 0x0000007fae86b0c0 >> 1.26% 1.26% uname [unknown] [H] 0x0000007fae99b888 >> >> And after this patch: >> >> 4.13% 4.13% uname libc-2.21.so [.] 0x0000000000078758 >> 3.74% 3.74% uname libc-2.21.so [.] 0x0000000000078e50 >> 2.06% 2.06% uname libc-2.21.so [.] 0x00000000000fcaf4 >> 1.65% 1.65% uname libc-2.21.so [.] 0x00000000000fcae4 >> 1.59% 1.59% uname ld-2.21.so [.] 0x000000000000a7f4 >> 1.50% 1.50% uname libc-2.21.so [.] 0x0000000000078e40 >> 1.43% 1.43% uname libc-2.21.so [.] 0x00000000000fcac4 >> 1.31% 1.31% uname libc-2.21.so [.] 0x000000000002f0c0 >> 1.26% 1.26% uname ld-2.21.so [.] 0x0000000000016888 >> >> This was tested using CoreSight but the same is very likely to happen on >> IntelPT. I'm pretty sure this isn't the right solution but we can start >> building from that. >> >> Comments welcomed. >> >> [1]. commit 473398a21d28 ("perf tools: Add cpumode to struct perf_sample") >> [2]. http://lxr.free-electrons.com/source/tools/perf/util/intel-pt.c#L1806 >> >> Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> >> --- >> tools/perf/util/session.c | 4 ++++ >> 1 file changed, 4 insertions(+) >> >> diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c >> index 4abd85c6346d..28cc405e7245 100644 >> --- a/tools/perf/util/session.c >> +++ b/tools/perf/util/session.c >> @@ -1352,6 +1352,10 @@ int perf_session__deliver_synth_event(struct perf_session *session, >> { >> struct perf_evlist *evlist = session->evlist; >> struct perf_tool *tool = session->tool; >> + u8 mode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK; >> + >> + if (sample) >> + sample->cpumode = mode; >> >> events_stats__inc(&evlist->stats, event->header.type); >> >> -- >> 2.5.0 >>
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c index 4abd85c6346d..28cc405e7245 100644 --- a/tools/perf/util/session.c +++ b/tools/perf/util/session.c @@ -1352,6 +1352,10 @@ int perf_session__deliver_synth_event(struct perf_session *session, { struct perf_evlist *evlist = session->evlist; struct perf_tool *tool = session->tool; + u8 mode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK; + + if (sample) + sample->cpumode = mode; events_stats__inc(&evlist->stats, event->header.type);
The following patch[1] adds a new "cpumode" to the perf_sample structure that gets initialised as events are read from the data event file. With the advent of HW tracers and more specifically the decoding of the traces they generate, function perf_session__deliver_synth_event() gets called directly from the infrastructure that decodes the traces, for example[2]. As such initialisation of perf_sample::cpumode doesn't get done, which prevents the shared object name from being printed in the perf report output. Before this patch: 4.13% 4.13% uname [unknown] [H] 0x0000007fae8b4758 3.74% 3.74% uname [unknown] [H] 0x0000007fae8b4e50 2.06% 2.06% uname [unknown] [H] 0x0000007fae938af4 1.65% 1.65% uname [unknown] [H] 0x0000007fae938ae4 1.59% 1.59% uname [unknown] [H] 0x0000007fae98f7f4 1.50% 1.50% uname [unknown] [H] 0x0000007fae8b4e40 1.43% 1.43% uname [unknown] [H] 0x0000007fae938ac4 1.31% 1.31% uname [unknown] [H] 0x0000007fae86b0c0 1.26% 1.26% uname [unknown] [H] 0x0000007fae99b888 And after this patch: 4.13% 4.13% uname libc-2.21.so [.] 0x0000000000078758 3.74% 3.74% uname libc-2.21.so [.] 0x0000000000078e50 2.06% 2.06% uname libc-2.21.so [.] 0x00000000000fcaf4 1.65% 1.65% uname libc-2.21.so [.] 0x00000000000fcae4 1.59% 1.59% uname ld-2.21.so [.] 0x000000000000a7f4 1.50% 1.50% uname libc-2.21.so [.] 0x0000000000078e40 1.43% 1.43% uname libc-2.21.so [.] 0x00000000000fcac4 1.31% 1.31% uname libc-2.21.so [.] 0x000000000002f0c0 1.26% 1.26% uname ld-2.21.so [.] 0x0000000000016888 This was tested using CoreSight but the same is very likely to happen on IntelPT. I'm pretty sure this isn't the right solution but we can start building from that. Comments welcomed. [1]. commit 473398a21d28 ("perf tools: Add cpumode to struct perf_sample") [2]. http://lxr.free-electrons.com/source/tools/perf/util/intel-pt.c#L1806 Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> --- tools/perf/util/session.c | 4 ++++ 1 file changed, 4 insertions(+) -- 2.5.0