[V12,00/14] perf/core: Add ability for an event to "pause" or "resume" AUX area tracing

Message ID 20241010143152.19071-1-adrian.hunter@intel.com
State Superseded

Commit Message

Adrian Hunter Oct. 10, 2024, 2:31 p.m. UTC
Hi

Note for V12:
	There was a small conflict between the Intel PT changes in
	"KVM: x86: Fix Intel PT Host/Guest mode when host tracing" and the
	changes in this patch set, so I have put the patch sets together,
	along with outstanding fix "perf/x86/intel/pt: Fix buffer full but
	size is 0 case"

	Cover letter for KVM changes (patches 2 to 4):

	There is a long-standing problem whereby running Intel PT on host and guest
	in Host/Guest mode causes VM-Entry failure.

	The motivation for this patch set is to provide a fix for stable kernels
	prior to the advent of the "Mediated Passthrough vPMU" patch set:

		https://lore.kernel.org/kvm/20240801045907.4010984-1-mizhang@google.com/

	which would render a large part of the fix unnecessary but likely not be
	suitable for backport to stable due to its size and complexity.

	Ideally, this patch set would be applied before "Mediated Passthrough vPMU".

	Note that the fix does not conflict with "Mediated Passthrough vPMU", it
	is just that "Mediated Passthrough vPMU" will make the code to stop and
	restart Intel PT unnecessary.

Note for V11:
	Moving aux_paused into a union within struct hw_perf_event caused
	a regression because aux_paused was being written unconditionally
	even though it is valid only for AUX (e.g. Intel PT) PMUs.
	That is fixed in V11.

Hardware traces, such as instruction traces, can produce a vast amount of
trace data, so being able to reduce tracing to more specific circumstances
can be useful.

The ability to pause or resume tracing when another event happens can do
that.

These patches add such a facility and show how it would work for Intel
Processor Trace.
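
As a rough illustration of the intended semantics (a toy user-space model,
not kernel code), the sketch below shows a tracer that starts paused and is
resumed or paused when a triggering event overflows.  The bit order follows
the aux_start_paused, aux_pause, aux_resume ordering used in this series;
every name prefixed with model_ or MODEL_ is hypothetical, and the model
assumes a triggering event sets at most one of pause and resume.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Bit values mirror the aux_start_paused, aux_pause, aux_resume ordering
 * described in this series. The MODEL_ names are hypothetical.
 */
enum {
	MODEL_AUX_START_PAUSED	= 1U << 0,
	MODEL_AUX_PAUSE		= 1U << 1,
	MODEL_AUX_RESUME	= 1U << 2,
};

/* Hypothetical stand-in for an AUX area tracer such as Intel PT */
struct model_tracer {
	bool paused;
	unsigned int traced;	/* units of work recorded while not paused */
};

/* Set the initial state from the AUX event's own aux_action bits */
static inline void model_tracer_init(struct model_tracer *t, uint32_t aux_action)
{
	t->paused = (aux_action & MODEL_AUX_START_PAUSED) != 0;
	t->traced = 0;
}

/*
 * Called when a triggering event overflows.
 * Assumption: at most one of PAUSE / RESUME is set on a given trigger.
 */
static inline void model_trigger_overflow(struct model_tracer *t, uint32_t aux_action)
{
	if (aux_action & MODEL_AUX_PAUSE)
		t->paused = true;
	else if (aux_action & MODEL_AUX_RESUME)
		t->paused = false;
}

/* Work is recorded only while tracing is not paused */
static inline void model_run(struct model_tracer *t, unsigned int units)
{
	if (!t->paused)
		t->traced += units;
}
```

In this model, work done before the first resume trigger or after a pause
trigger never reaches the trace, which is the data reduction described above.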

Maintainers of other AUX area tracing implementations are requested to
consider if this is something they might employ and then whether or not
the ABI would work for them.  Note, thank you to James Clark (ARM) for
evaluating the API for Coresight.  Suzuki K Poulose (ARM) also responded
positively to the RFC.

Changes to perf tools are now (since V4) fleshed out.

Please note, Intel® Architecture Instruction Set Extensions and Future
Features Programming Reference March 2024 319433-052, currently:

	https://cdrdv2.intel.com/v1/dl/getContent/671368

introduces hardware pause / resume for Intel PT in a feature named
Intel PT Trigger Tracing.

For that, more fields in perf_event_attr will be necessary.  The main
differences are:
	- it can be applied not just to overflows, but optionally to
	every event
	- a packet is emitted into the trace, optionally with IP
	information
	- no PMI
	- works with PMC and DR (breakpoint) events only
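
To make the proposed encoding easier to comment on, here is a minimal sketch
(not part of the patch) of how the 5-bit reference number fields could be
packed into and read back from a 32-bit aux_action value.  The shifts and the
0x1f width are taken from the proposed PERF_AUX_ACTION_NR (0x1f << 4) and
PERF_AUX_ACTION_NR_ON_EVT (0x1f << 13) values in the patch further down; the
SKETCH_ and sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions copied from the proposed enum; the names are hypothetical */
#define SKETCH_AUX_NR_SHIFT		4
#define SKETCH_AUX_NR_MASK		(0x1fU << SKETCH_AUX_NR_SHIFT)
#define SKETCH_AUX_NR_ON_EVT_SHIFT	13
#define SKETCH_AUX_NR_ON_EVT_MASK	(0x1fU << SKETCH_AUX_NR_ON_EVT_SHIFT)

/* The two reference number fields must not overlap each other */
static_assert((SKETCH_AUX_NR_MASK & SKETCH_AUX_NR_ON_EVT_MASK) == 0,
	      "reference number fields overlap");
/* Neither field may cover the proposed NO_IP bits (9 and 18) */
static_assert((SKETCH_AUX_NR_MASK & (1U << 9)) == 0 &&
	      (SKETCH_AUX_NR_ON_EVT_MASK & (1U << 18)) == 0,
	      "reference number field overlaps a NO_IP bit");

/* Store an overflow reference number (0..31), leaving other bits untouched */
static inline uint32_t sketch_set_nr(uint32_t aux_action, uint32_t nr)
{
	return (aux_action & ~SKETCH_AUX_NR_MASK) |
	       ((nr << SKETCH_AUX_NR_SHIFT) & SKETCH_AUX_NR_MASK);
}

/* Read back the overflow reference number */
static inline uint32_t sketch_get_nr(uint32_t aux_action)
{
	return (aux_action & SKETCH_AUX_NR_MASK) >> SKETCH_AUX_NR_SHIFT;
}

/* Same pair of helpers for the per-event-occurrence reference number */
static inline uint32_t sketch_set_nr_on_evt(uint32_t aux_action, uint32_t nr)
{
	return (aux_action & ~SKETCH_AUX_NR_ON_EVT_MASK) |
	       ((nr << SKETCH_AUX_NR_ON_EVT_SHIFT) & SKETCH_AUX_NR_ON_EVT_MASK);
}

static inline uint32_t sketch_get_nr_on_evt(uint32_t aux_action)
{
	return (aux_action & SKETCH_AUX_NR_ON_EVT_MASK) >> SKETCH_AUX_NR_ON_EVT_SHIFT;
}
```

The overflow fields occupy bits 0 to 9 and the per-occurrence fields bits 10
to 18, 19 bits in total, which matches the 13 reserved bits left in the
proposed bitfield.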

Here are the proposed additions to perf_event_attr, please comment:


Changes in V12:
	Add previously sent patch "perf/x86/intel/pt: Fix buffer full
	but size is 0 case"

	Add previously sent patch set "KVM: x86: Fix Intel PT Host/Guest
	mode when host tracing"

	Rebase on current tip plus patch set "KVM: x86: Fix Intel PT Host/Guest
	mode when host tracing"

Changes in V11:
      perf/core: Add aux_pause, aux_resume, aux_start_paused
	Make assignment to event->hw.aux_paused conditional on
	(pmu->capabilities & PERF_PMU_CAP_AUX_PAUSE).

      perf/x86/intel: Do not enable large PEBS for events with aux actions or aux sampling
	Remove definition of has_aux_action() because it has
	already been added as an inline function.

      perf/x86/intel/pt: Fix sampling synchronization
      perf tools: Enable evsel__is_aux_event() to work for ARM/ARM64
      perf tools: Enable evsel__is_aux_event() to work for S390_CPUMSF
	Dropped because they have already been applied

Changes in V10:
      perf/core: Add aux_pause, aux_resume, aux_start_paused
	Move aux_paused into a union within struct hw_perf_event.
	Additional comment wrt PERF_EF_PAUSE/PERF_EF_RESUME.
	Factor out has_aux_action() as an inline function.
	Use scoped_guard for irqsave.
	Move calls of perf_event_aux_pause() from __perf_event_output()
	to __perf_event_overflow().

Changes in V9:
      perf/x86/intel/pt: Fix sampling synchronization
	New patch

      perf/core: Add aux_pause, aux_resume, aux_start_paused
	Move aux_paused to struct hw_perf_event

      perf/x86/intel/pt: Add support for pause / resume
	Add more comments and barriers for resume_allowed and
	pause_allowed
	Always use WRITE_ONCE with resume_allowed


Changes in V8:

      perf tools: Parse aux-action
	Fix clang warning:
	     util/auxtrace.c:821:7: error: missing field 'aux_action' initializer [-Werror,-Wmissing-field-initializers]
	     821 |         {NULL},
	         |              ^

Changes in V7:

	Add Andi's Reviewed-by for patches 2-12
	Re-base

Changes in V6:

      perf/core: Add aux_pause, aux_resume, aux_start_paused
	Removed READ/WRITE_ONCE from __perf_event_aux_pause()
	Expanded comment about guarding against NMI

Changes in V5:

    perf/core: Add aux_pause, aux_resume, aux_start_paused
	Added James' Ack

    perf/x86/intel: Do not enable large PEBS for events with aux actions or aux sampling
	New patch

    perf tools
	Added Ian's Ack

Changes in V4:

    perf/core: Add aux_pause, aux_resume, aux_start_paused
	Rename aux_output_cfg -> aux_action
	Reorder aux_action bits from:
		aux_pause, aux_resume, aux_start_paused
	to:
		aux_start_paused, aux_pause, aux_resume
	Fix aux_action bits __u64 -> __u32

    coresight: Have a stab at support for pause / resume
	Dropped

    perf tools
	All new patches

Changes in RFC V3:

    coresight: Have a stab at support for pause / resume
	'mode' -> 'flags' so it at least compiles

Changes in RFC V2:

	Use ->stop() / ->start() instead of ->pause_resume()
	Move aux_start_paused bit into aux_output_cfg
	Tighten up when Intel PT pause / resume is allowed
	Add an example of how it might work for CoreSight


Adrian Hunter (14):
      perf/x86/intel/pt: Fix buffer full but size is 0 case
      KVM: x86: Fix Intel PT IA32_RTIT_CTL MSR validation
      KVM: x86: Fix Intel PT Host/Guest mode when host tracing also
      KVM: selftests: Add guest Intel PT test
      perf/core: Add aux_pause, aux_resume, aux_start_paused
      perf/x86/intel/pt: Add support for pause / resume
      perf/x86/intel: Do not enable large PEBS for events with aux actions or aux sampling
      perf tools: Add aux_start_paused, aux_pause and aux_resume
      perf tools: Add aux-action config term
      perf tools: Parse aux-action
      perf tools: Add missing_features for aux_start_paused, aux_pause, aux_resume
      perf intel-pt: Improve man page format
      perf intel-pt: Add documentation for pause / resume
      perf intel-pt: Add a test for pause / resume

 arch/x86/events/intel/core.c                       |   4 +-
 arch/x86/events/intel/pt.c                         | 209 +++++++-
 arch/x86/events/intel/pt.h                         |  16 +
 arch/x86/include/asm/intel_pt.h                    |   4 +
 arch/x86/kvm/vmx/vmx.c                             |  26 +-
 arch/x86/kvm/vmx/vmx.h                             |   1 -
 include/linux/perf_event.h                         |  28 +
 include/uapi/linux/perf_event.h                    |  11 +-
 kernel/events/core.c                               |  72 ++-
 kernel/events/internal.h                           |   1 +
 tools/include/uapi/linux/perf_event.h              |  11 +-
 tools/perf/Documentation/perf-intel-pt.txt         | 596 +++++++++++++--------
 tools/perf/Documentation/perf-record.txt           |   4 +
 tools/perf/builtin-record.c                        |   4 +-
 tools/perf/tests/shell/test_intel_pt.sh            |  28 +
 tools/perf/util/auxtrace.c                         |  67 ++-
 tools/perf/util/auxtrace.h                         |   6 +-
 tools/perf/util/evsel.c                            |  13 +-
 tools/perf/util/evsel.h                            |   1 +
 tools/perf/util/evsel_config.h                     |   1 +
 tools/perf/util/parse-events.c                     |  10 +
 tools/perf/util/parse-events.h                     |   1 +
 tools/perf/util/parse-events.l                     |   1 +
 tools/perf/util/perf_event_attr_fprintf.c          |   3 +
 tools/perf/util/pmu.c                              |   1 +
 tools/testing/selftests/kvm/Makefile               |   1 +
 .../selftests/kvm/include/x86_64/processor.h       |   1 +
 tools/testing/selftests/kvm/x86_64/intel_pt.c      | 381 +++++++++++++
 28 files changed, 1238 insertions(+), 264 deletions(-)
 create mode 100644 tools/testing/selftests/kvm/x86_64/intel_pt.c


Regards
Adrian

Comments

Leo Yan Oct. 13, 2024, 2:48 p.m. UTC | #1
On Thu, Oct 10, 2024 at 05:31:48PM +0300, Adrian Hunter wrote:
> Display "feature is not supported" error message if aux_start_paused,
> aux_pause or aux_resume result in a perf_event_open() error.
> 
> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
> Acked-by: Ian Rogers <irogers@google.com>
> Reviewed-by: Andi Kleen <ak@linux.intel.com>

This patch looks good to me.

One case is that the Linux kernel supports the aux_pause_resume feature, but
the PMU event does not support it. So we might consider adding an extra
patch in perf:

diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index 927aa61e7b14..9a3191df2ec5 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -3373,6 +3373,10 @@ int evsel__open_strerror(struct evsel *evsel, struct target *target,
                        return scnprintf(msg, size,
        "%s: PMU Hardware doesn't support 'aux_output' feature",
                                         evsel__name(evsel));
+               if (evsel->core.attr.aux_action)
+                       return scnprintf(msg, size,
+       "%s: PMU Hardware doesn't support 'aux_action' feature",
+                                        evsel__name(evsel));
                if (evsel->core.attr.sample_period != 0)
                        return scnprintf(msg, size,
        "%s: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'",

Thanks,
Leo

> ---
>  tools/perf/util/evsel.c | 10 +++++++++-
>  tools/perf/util/evsel.h |  1 +
>  2 files changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
> index d34ceab9e454..927aa61e7b14 100644
> --- a/tools/perf/util/evsel.c
> +++ b/tools/perf/util/evsel.c
> @@ -2147,7 +2147,13 @@ bool evsel__detect_missing_features(struct evsel *evsel)
>  	 * Must probe features in the order they were added to the
>  	 * perf_event_attr interface.
>  	 */
> -	if (!perf_missing_features.branch_counters &&
> +	if (!perf_missing_features.aux_pause_resume &&
> +	    (evsel->core.attr.aux_pause || evsel->core.attr.aux_resume ||
> +	     evsel->core.attr.aux_start_paused)) {
> +		perf_missing_features.aux_pause_resume = true;
> +		pr_debug2_peo("Kernel has no aux_pause/aux_resume support, bailing out\n");
> +		return false;
> +	} else if (!perf_missing_features.branch_counters &&
>  	    (evsel->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_COUNTERS)) {
>  		perf_missing_features.branch_counters = true;
>  		pr_debug2("switching off branch counters support\n");
> @@ -3397,6 +3403,8 @@ int evsel__open_strerror(struct evsel *evsel, struct target *target,
>  			return scnprintf(msg, size, "clockid feature not supported.");
>  		if (perf_missing_features.clockid_wrong)
>  			return scnprintf(msg, size, "wrong clockid (%d).", clockid);
> +		if (perf_missing_features.aux_pause_resume)
> +			return scnprintf(msg, size, "The 'aux_pause / aux_resume' feature is not supported, update the kernel.");
>  		if (perf_missing_features.aux_output)
>  			return scnprintf(msg, size, "The 'aux_output' feature is not supported, update the kernel.");
>  		if (!target__has_cpu(target))
> diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
> index 15e745a9a798..778fcdb8261f 100644
> --- a/tools/perf/util/evsel.h
> +++ b/tools/perf/util/evsel.h
> @@ -221,6 +221,7 @@ struct perf_missing_features {
>  	bool weight_struct;
>  	bool read_lost;
>  	bool branch_counters;
> +	bool aux_pause_resume;
>  };
>  
>  extern struct perf_missing_features perf_missing_features;
> -- 
> 2.43.0
> 
>

Patch

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index 0c557f0a17b3..05dcc43f11bb 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -369,6 +369,22 @@  enum perf_event_read_format {
 	PERF_FORMAT_MAX = 1U << 5,		/* non-ABI */
 };
 
+enum {
+	PERF_AUX_ACTION_START_PAUSED		=   1U << 0,
+	PERF_AUX_ACTION_PAUSE			=   1U << 1,
+	PERF_AUX_ACTION_RESUME			=   1U << 2,
+	PERF_AUX_ACTION_EMIT			=   1U << 3,
+	PERF_AUX_ACTION_NR			= 0x1f << 4,
+	PERF_AUX_ACTION_NO_IP			=   1U << 9,
+	PERF_AUX_ACTION_PAUSE_ON_EVT		=   1U << 10,
+	PERF_AUX_ACTION_RESUME_ON_EVT		=   1U << 11,
+	PERF_AUX_ACTION_EMIT_ON_EVT		=   1U << 12,
+	PERF_AUX_ACTION_NR_ON_EVT		= 0x1f << 13,
+	PERF_AUX_ACTION_NO_IP_ON_EVT		=   1U << 18,
+	PERF_AUX_ACTION_MASK			= ~PERF_AUX_ACTION_START_PAUSED,
+	PERF_AUX_PAUSE_RESUME_MASK		= PERF_AUX_ACTION_PAUSE | PERF_AUX_ACTION_RESUME,
+};
+
 #define PERF_ATTR_SIZE_VER0	64	/* sizeof first published struct */
 #define PERF_ATTR_SIZE_VER1	72	/* add: config2 */
 #define PERF_ATTR_SIZE_VER2	80	/* add: branch_sample_type */
@@ -515,10 +531,19 @@  struct perf_event_attr {
 	union {
 		__u32	aux_action;
 		struct {
-			__u32	aux_start_paused :  1, /* start AUX area tracing paused */
-				aux_pause        :  1, /* on overflow, pause AUX area tracing */
-				aux_resume       :  1, /* on overflow, resume AUX area tracing */
-				__reserved_3     : 29;
+			__u32	aux_start_paused  :  1, /* start AUX area tracing paused */
+				aux_pause         :  1, /* on overflow, pause AUX area tracing */
+				aux_resume        :  1, /* on overflow, resume AUX area tracing */
+				aux_emit          :  1, /* generate AUX records instead of events */
+				aux_nr            :  5, /* AUX area tracing reference number */
+				aux_no_ip         :  1, /* suppress IP in AUX records */
+				/* Following apply to event occurrence not overflows */
+				aux_pause_on_evt  :  1, /* on event, pause AUX area tracing */
+				aux_resume_on_evt :  1, /* on event, resume AUX area tracing */
+				aux_emit_on_evt   :  1, /* generate AUX records instead of events */
+				aux_nr_on_evt     :  5, /* AUX area tracing reference number */
+				aux_no_ip_on_evt  :  1, /* suppress IP in AUX records */
+				__reserved_3      : 13;
 		};
 	};