Message ID | 1507315102-5942-1-git-send-email-mark.rutland@arm.com |
---|---|
State | New |
Headers | show |
Series | perf tools: unbreak perf record for arm/arm64 | expand |
On Fri, Oct 06, 2017 at 07:38:22PM +0100, Mark Rutland wrote: > Currently, perf record is broken on arm/arm64 systems when the PMU is > specified explicitly as part of the event, e.g. > > $ ./perf record -e armv8_cortex_a53/cpu_cycles/u true > > In such cases, perf record fails to open events unless > perf_event_paranoid is set to -1, even if the PMU in question supports > mode exclusion. Further, even when perf_event_paranoid is toggled, no > samples are recorded. > > This is an unintended side effect of commit: > > e3ba76deef23064f ("perf tools: Force uncore events to system wide monitoring) > > ... which assumes that if a PMU has an associated cpu_map, it is an > uncore PMU, and forces events for such PMUs to be system-wide. > > This is not true for arm/arm64 systems, which can have heterogeneous > CPUs. To account for this, multiple CPU PMUs are exposed, each with a > "cpus" field under sysfs, which the perf tool parses into a cpu_map. ARM > PMUs do not have a "cpumask" file, and only have a "cpus" file. For the > gory details as to why, see commit: > > 7e3fcffe95544010 ("perf pmu: Support alternative sysfs cpumask") > > Given all of this, we can instead identify uncore PMUs by explicitly > checking for a "cpumask" file, and restore arm/arm64 PMU support back to > a working state. This patch does so, adding a new perf_pmu::is_uncore > field, and splitting the existing cpumask parsing so that it can be > reused. > > Signed-off-by: Mark Rutland <mark.rutland@arm.com> > Fixes: e3ba76deef23064f ("perf tools: Force uncore events to system wide monitoring) It sucks that we haven't noticed this being broken for so long, but I can confirm that this fixes the issue: Acked-by: Will Deacon <will.deacon@arm.com> Tested-by Will Deacon <will.deacon@arm.com> Any chance we can get this into 4.14? You'll probably need to do some stable backports too, since this is a bit spread out. Will
On Mon, Oct 09, 2017 at 11:22:00AM +0100, Will Deacon wrote: > It sucks that we haven't noticed this being broken for so long, but I can > confirm that this fixes the issue: > > Acked-by: Will Deacon <will.deacon@arm.com> > Tested-by Will Deacon <will.deacon@arm.com> Cheers. > Any chance we can get this into 4.14? You'll probably need to do some > stable backports too, since this is a bit spread out. FWIW, I'm happy to do the backport(s) to stable. As long as the Fixes tag and stable Cc are kept, I should get pinged by the stable maintainers if/when the patch fails to apply. Thanks, Mark.
Em Mon, Oct 09, 2017 at 11:22:00AM +0100, Will Deacon escreveu: > On Fri, Oct 06, 2017 at 07:38:22PM +0100, Mark Rutland wrote: > > Currently, perf record is broken on arm/arm64 systems when the PMU is > > specified explicitly as part of the event, e.g. > > > > $ ./perf record -e armv8_cortex_a53/cpu_cycles/u true > > > > In such cases, perf record fails to open events unless > > perf_event_paranoid is set to -1, even if the PMU in question supports > > mode exclusion. Further, even when perf_event_paranoid is toggled, no > > samples are recorded. > > > > This is an unintended side effect of commit: > > > > e3ba76deef23064f ("perf tools: Force uncore events to system wide monitoring) > > > > ... which assumes that if a PMU has an associated cpu_map, it is an > > uncore PMU, and forces events for such PMUs to be system-wide. > > > > This is not true for arm/arm64 systems, which can have heterogeneous > > CPUs. To account for this, multiple CPU PMUs are exposed, each with a > > "cpus" field under sysfs, which the perf tool parses into a cpu_map. ARM > > PMUs do not have a "cpumask" file, and only have a "cpus" file. For the > > gory details as to why, see commit: > > > > 7e3fcffe95544010 ("perf pmu: Support alternative sysfs cpumask") > > > > Given all of this, we can instead identify uncore PMUs by explicitly > > checking for a "cpumask" file, and restore arm/arm64 PMU support back to > > a working state. This patch does so, adding a new perf_pmu::is_uncore > > field, and splitting the existing cpumask parsing so that it can be > > reused. > > > > Signed-off-by: Mark Rutland <mark.rutland@arm.com> > > Fixes: e3ba76deef23064f ("perf tools: Force uncore events to system wide monitoring) > > It sucks that we haven't noticed this being broken for so long, but I can > confirm that this fixes the issue: > > Acked-by: Will Deacon <will.deacon@arm.com> > Tested-by Will Deacon <will.deacon@arm.com> > > Any chance we can get this into 4.14? You'll probably need to do some stable > backports too, since this is a bit spread out. Sure, I've added this to my perf/urgent branch, that, together with a few other fixes is undergoing testing now. - Arnaldo
On Mon, Oct 09, 2017 at 04:00:00PM -0300, Arnaldo Carvalho de Melo wrote: > Em Mon, Oct 09, 2017 at 11:22:00AM +0100, Will Deacon escreveu: > > On Fri, Oct 06, 2017 at 07:38:22PM +0100, Mark Rutland wrote: > > > Currently, perf record is broken on arm/arm64 systems when the PMU is > > > specified explicitly as part of the event, e.g. > > > > > > $ ./perf record -e armv8_cortex_a53/cpu_cycles/u true > > > > > > In such cases, perf record fails to open events unless > > > perf_event_paranoid is set to -1, even if the PMU in question supports > > > mode exclusion. Further, even when perf_event_paranoid is toggled, no > > > samples are recorded. > > > > > > This is an unintended side effect of commit: > > > > > > e3ba76deef23064f ("perf tools: Force uncore events to system wide monitoring) > > > > > > ... which assumes that if a PMU has an associated cpu_map, it is an > > > uncore PMU, and forces events for such PMUs to be system-wide. > > > > > > This is not true for arm/arm64 systems, which can have heterogeneous > > > CPUs. To account for this, multiple CPU PMUs are exposed, each with a > > > "cpus" field under sysfs, which the perf tool parses into a cpu_map. ARM > > > PMUs do not have a "cpumask" file, and only have a "cpus" file. For the > > > gory details as to why, see commit: > > > > > > 7e3fcffe95544010 ("perf pmu: Support alternative sysfs cpumask") > > > > > > Given all of this, we can instead identify uncore PMUs by explicitly > > > checking for a "cpumask" file, and restore arm/arm64 PMU support back to > > > a working state. This patch does so, adding a new perf_pmu::is_uncore > > > field, and splitting the existing cpumask parsing so that it can be > > > reused. > > > > > > Signed-off-by: Mark Rutland <mark.rutland@arm.com> > > > Fixes: e3ba76deef23064f ("perf tools: Force uncore events to system wide monitoring) > > > > It sucks that we haven't noticed this being broken for so long, but I can > > confirm that this fixes the issue: > > > > Acked-by: Will Deacon <will.deacon@arm.com> > > Tested-by Will Deacon <will.deacon@arm.com> > > > > Any chance we can get this into 4.14? You'll probably need to do some stable > > backports too, since this is a bit spread out. > > Sure, I've added this to my perf/urgent branch, that, together with a > few other fixes is undergoing testing now. Thanks, Arnaldo. Will
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c index f6257fb..39b1596 100644 --- a/tools/perf/util/parse-events.c +++ b/tools/perf/util/parse-events.c @@ -309,10 +309,11 @@ static char *get_config_name(struct list_head *head_terms) static struct perf_evsel * __add_event(struct list_head *list, int *idx, struct perf_event_attr *attr, - char *name, struct cpu_map *cpus, + char *name, struct perf_pmu *pmu, struct list_head *config_terms, bool auto_merge_stats) { struct perf_evsel *evsel; + struct cpu_map *cpus = pmu ? pmu->cpus : NULL; event_attr_init(attr); @@ -323,7 +324,7 @@ static char *get_config_name(struct list_head *head_terms) (*idx)++; evsel->cpus = cpu_map__get(cpus); evsel->own_cpus = cpu_map__get(cpus); - evsel->system_wide = !!cpus; + evsel->system_wide = pmu ? pmu->is_uncore : false; evsel->auto_merge_stats = auto_merge_stats; if (name) @@ -1233,7 +1234,7 @@ static int __parse_events_add_pmu(struct parse_events_state *parse_state, if (!head_config) { attr.type = pmu->type; - evsel = __add_event(list, &parse_state->idx, &attr, NULL, pmu->cpus, NULL, auto_merge_stats); + evsel = __add_event(list, &parse_state->idx, &attr, NULL, pmu, NULL, auto_merge_stats); return evsel ? 0 : -ENOMEM; } @@ -1254,7 +1255,7 @@ static int __parse_events_add_pmu(struct parse_events_state *parse_state, return -EINVAL; evsel = __add_event(list, &parse_state->idx, &attr, - get_config_name(head_config), pmu->cpus, + get_config_name(head_config), pmu, &config_terms, auto_merge_stats); if (evsel) { evsel->unit = info.unit; diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c index ac16a9d..1c4d7b4 100644 --- a/tools/perf/util/pmu.c +++ b/tools/perf/util/pmu.c @@ -470,17 +470,36 @@ static void pmu_read_sysfs(void) closedir(dir); } +static struct cpu_map *__pmu_cpumask(const char *path) +{ + FILE *file; + struct cpu_map *cpus; + + file = fopen(path, "r"); + if (!file) + return NULL; + + cpus = cpu_map__read(file); + fclose(file); + return cpus; +} + +/* + * Uncore PMUs have a "cpumask" file under sysfs. CPU PMUs (e.g. on arm/arm64) + * may have a "cpus" file. + */ +#define CPUS_TEMPLATE_UNCORE "%s/bus/event_source/devices/%s/cpumask" +#define CPUS_TEMPLATE_CPU "%s/bus/event_source/devices/%s/cpus" + static struct cpu_map *pmu_cpumask(const char *name) { - struct stat st; char path[PATH_MAX]; - FILE *file; struct cpu_map *cpus; const char *sysfs = sysfs__mountpoint(); const char *templates[] = { - "%s/bus/event_source/devices/%s/cpumask", - "%s/bus/event_source/devices/%s/cpus", - NULL + CPUS_TEMPLATE_UNCORE, + CPUS_TEMPLATE_CPU, + NULL }; const char **template; @@ -489,20 +508,25 @@ static struct cpu_map *pmu_cpumask(const char *name) for (template = templates; *template; template++) { snprintf(path, PATH_MAX, *template, sysfs, name); - if (stat(path, &st) == 0) - break; + cpus = __pmu_cpumask(path); + if (cpus) + return cpus; } - if (!*template) - return NULL; + return NULL; +} - file = fopen(path, "r"); - if (!file) - return NULL; +static bool pmu_is_uncore(const char *name) +{ + char path[PATH_MAX]; + struct cpu_map *cpus; + const char *sysfs = sysfs__mountpoint(); - cpus = cpu_map__read(file); - fclose(file); - return cpus; + snprintf(path, PATH_MAX, CPUS_TEMPLATE_UNCORE, sysfs, name); + cpus = __pmu_cpumask(path); + cpu_map__put(cpus); + + return !!cpus; } /* @@ -617,6 +641,8 @@ static struct perf_pmu *pmu_lookup(const char *name) pmu->cpus = pmu_cpumask(name); + pmu->is_uncore = pmu_is_uncore(name); + INIT_LIST_HEAD(&pmu->format); INIT_LIST_HEAD(&pmu->aliases); list_splice(&format, &pmu->format); diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h index 389e972..fe0de05 100644 --- a/tools/perf/util/pmu.h +++ b/tools/perf/util/pmu.h @@ -22,6 +22,7 @@ struct perf_pmu { char *name; __u32 type; bool selectable; + bool is_uncore; struct perf_event_attr *default_config; struct cpu_map *cpus; struct list_head format; /* HEAD struct perf_pmu_format -> list */
Currently, perf record is broken on arm/arm64 systems when the PMU is specified explicitly as part of the event, e.g. $ ./perf record -e armv8_cortex_a53/cpu_cycles/u true In such cases, perf record fails to open events unless perf_event_paranoid is set to -1, even if the PMU in question supports mode exclusion. Further, even when perf_event_paranoid is toggled, no samples are recorded. This is an unintended side effect of commit: e3ba76deef23064f ("perf tools: Force uncore events to system wide monitoring) ... which assumes that if a PMU has an associated cpu_map, it is an uncore PMU, and forces events for such PMUs to be system-wide. This is not true for arm/arm64 systems, which can have heterogeneous CPUs. To account for this, multiple CPU PMUs are exposed, each with a "cpus" field under sysfs, which the perf tool parses into a cpu_map. ARM PMUs do not have a "cpumask" file, and only have a "cpus" file. For the gory details as to why, see commit: 7e3fcffe95544010 ("perf pmu: Support alternative sysfs cpumask") Given all of this, we can instead identify uncore PMUs by explicitly checking for a "cpumask" file, and restore arm/arm64 PMU support back to a working state. This patch does so, adding a new perf_pmu::is_uncore field, and splitting the existing cpumask parsing so that it can be reused. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Fixes: e3ba76deef23064f ("perf tools: Force uncore events to system wide monitoring) Cc: stable@vger.kernel.org Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Will Deacon <will.deacon@arm.com> --- tools/perf/util/parse-events.c | 9 ++++--- tools/perf/util/pmu.c | 56 +++++++++++++++++++++++++++++++----------- tools/perf/util/pmu.h | 1 + 3 files changed, 47 insertions(+), 19 deletions(-) -- 1.9.1