perf/x86/intel: Don't extend the pseudo-encoding to GP counters

Message ID 1648482543-14923-1-git-send-email-kan.liang@linux.intel.com
State Accepted
Commit 4a263bf331c512849062805ef1b4ac40301a9829

Commit Message

Liang, Kan March 28, 2022, 3:49 p.m. UTC
From: Kan Liang <kan.liang@linux.intel.com>

The INST_RETIRED.PREC_DIST event (0x0100) doesn't count on SPR.
perf stat -e cpu/event=0xc0,umask=0x0/,cpu/event=0x0,umask=0x1/ -C0

 Performance counter stats for 'CPU(s) 0':

           607,246      cpu/event=0xc0,umask=0x0/
                 0      cpu/event=0x0,umask=0x1/

The encoding for INST_RETIRED.PREC_DIST is a pseudo-encoding, which
doesn't work on the generic counters. However, current perf extends its
mask to the generic counters.

The pseudo event-code for a fixed counter must be 0x00. Check for it and
avoid extending the mask for fixed-counter events that use the
pseudo-encoding, e.g., the ref-cycles and PREC_DIST events.

With the patch,
perf stat -e cpu/event=0xc0,umask=0x0/,cpu/event=0x0,umask=0x1/ -C0

 Performance counter stats for 'CPU(s) 0':

           583,184      cpu/event=0xc0,umask=0x0/
           583,048      cpu/event=0x0,umask=0x1/

Fixes: 2de71ee153ef ("perf/x86/intel: Fix ICL/SPR INST_RETIRED.PREC_DIST encodings")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: stable@vger.kernel.org
---
 arch/x86/events/intel/core.c      | 6 +++++-
 arch/x86/include/asm/perf_event.h | 5 +++++
 2 files changed, 10 insertions(+), 1 deletion(-)
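
As an illustration of the check described in the commit message, here is a minimal userspace sketch (not kernel code). It assumes the raw code used in the constraint tables places the umask in bits 15:8 above the event select in bits 7:0, so cpu/event=0x0,umask=0x1/ corresponds to 0x0100. raw_event_code() is a hypothetical helper introduced only for this example; use_fixed_pseudo_encoding() mirrors the helper added by the patch, with uint64_t standing in for the kernel's u64.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): build the 16-bit raw code
 * from an event select (bits 7:0) and a unit mask (bits 15:8).
 */
static inline uint64_t raw_event_code(uint8_t event, uint8_t umask)
{
	return ((uint64_t)umask << 8) | event;
}

/*
 * Userspace copy of the helper added by the patch: a code whose
 * event-select byte is 0x00 is treated as a fixed-counter
 * pseudo-encoding.
 */
static inline bool use_fixed_pseudo_encoding(uint64_t code)
{
	return !(code & 0xff);
}
```

Under these definitions, 0x0100 (PREC_DIST) and 0x0300 (ref-cycles) are classified as pseudo-encodings, while 0x00c0 and 0x01c0 are not, because their event-select byte is 0xc0.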

Comments

Stephane Eranian March 28, 2022, 5:11 p.m. UTC | #1
On Mon, Mar 28, 2022 at 8:50 AM <kan.liang@linux.intel.com> wrote:
>
> From: Kan Liang <kan.liang@linux.intel.com>
>
> The INST_RETIRED.PREC_DIST event (0x0100) doesn't count on SPR.
> perf stat -e cpu/event=0xc0,umask=0x0/,cpu/event=0x0,umask=0x1/ -C0
>
>  Performance counter stats for 'CPU(s) 0':
>
>            607,246      cpu/event=0xc0,umask=0x0/
>                  0      cpu/event=0x0,umask=0x1/
>
> The encoding for INST_RETIRED.PREC_DIST is pseudo-encoding, which
> doesn't work on the generic counters. However, current perf extends its
> mask to the generic counters.
>
> The pseudo event-code for a fixed counter must be 0x00. Check and avoid
> extending the mask for the fixed counter event which using the
> pseudo-encoding, e.g., ref-cycles and PREC_DIST event.
>
> With the patch,
> perf stat -e cpu/event=0xc0,umask=0x0/,cpu/event=0x0,umask=0x1/ -C0
>
>  Performance counter stats for 'CPU(s) 0':
>
>            583,184      cpu/event=0xc0,umask=0x0/
>            583,048      cpu/event=0x0,umask=0x1/
>
> Fixes: 2de71ee153ef ("perf/x86/intel: Fix ICL/SPR INST_RETIRED.PREC_DIST encodings")
> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
> Cc: stable@vger.kernel.org
> ---
>  arch/x86/events/intel/core.c      | 6 +++++-
>  arch/x86/include/asm/perf_event.h | 5 +++++
>  2 files changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index db32ef6..1d2e49d 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -5668,7 +5668,11 @@ static void intel_pmu_check_event_constraints(struct event_constraint *event_con
>                         /* Disabled fixed counters which are not in CPUID */
>                         c->idxmsk64 &= intel_ctrl;
>
> -                       if (c->idxmsk64 != INTEL_PMC_MSK_FIXED_REF_CYCLES)
> +                       /*
> +                        * Don't extend the pseudo-encoding to the
> +                        * generic counters
> +                        */
> +                       if (!use_fixed_pseudo_encoding(c->code))
>                                 c->idxmsk64 |= (1ULL << num_counters) - 1;
>                 }
>                 c->idxmsk64 &=
> diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
> index 48e6ef56..cd85f03 100644
> --- a/arch/x86/include/asm/perf_event.h
> +++ b/arch/x86/include/asm/perf_event.h
> @@ -242,6 +242,11 @@ struct x86_pmu_capability {
>  #define INTEL_PMC_IDX_FIXED_SLOTS      (INTEL_PMC_IDX_FIXED + 3)
>  #define INTEL_PMC_MSK_FIXED_SLOTS      (1ULL << INTEL_PMC_IDX_FIXED_SLOTS)
>
> +static inline bool use_fixed_pseudo_encoding(u64 code)
> +{
> +       return !(code & 0xff);
> +}
> +
I ack the problem.

That does not take into account the old encoding for PREC_DIST, 0x01c0,
which is also forced to fixed counter 0 on ICL and should not be extended.

That also limits the options for the SLOTS event, which can be measured
by a GP counter. Yet to work with PERF_METRICS, it has to be programmed
into fixed counter 3.

>  /*
>   * We model BTS tracing as another fixed-mode PMC.
>   *
> --
> 2.7.4
>
Liang, Kan March 28, 2022, 6:30 p.m. UTC | #2
On 3/28/2022 1:11 PM, Stephane Eranian wrote:
> On Mon, Mar 28, 2022 at 8:50 AM <kan.liang@linux.intel.com> wrote:
>>
>> From: Kan Liang <kan.liang@linux.intel.com>
>>
>> The INST_RETIRED.PREC_DIST event (0x0100) doesn't count on SPR.
>> perf stat -e cpu/event=0xc0,umask=0x0/,cpu/event=0x0,umask=0x1/ -C0
>>
>>   Performance counter stats for 'CPU(s) 0':
>>
>>             607,246      cpu/event=0xc0,umask=0x0/
>>                   0      cpu/event=0x0,umask=0x1/
>>
>> The encoding for INST_RETIRED.PREC_DIST is pseudo-encoding, which
>> doesn't work on the generic counters. However, current perf extends its
>> mask to the generic counters.
>>
>> The pseudo event-code for a fixed counter must be 0x00. Check and avoid
>> extending the mask for the fixed counter event which using the
>> pseudo-encoding, e.g., ref-cycles and PREC_DIST event.
>>
>> With the patch,
>> perf stat -e cpu/event=0xc0,umask=0x0/,cpu/event=0x0,umask=0x1/ -C0
>>
>>   Performance counter stats for 'CPU(s) 0':
>>
>>             583,184      cpu/event=0xc0,umask=0x0/
>>             583,048      cpu/event=0x0,umask=0x1/
>>
>> Fixes: 2de71ee153ef ("perf/x86/intel: Fix ICL/SPR INST_RETIRED.PREC_DIST encodings")
>> Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
>> Cc: stable@vger.kernel.org
>> ---
>>   arch/x86/events/intel/core.c      | 6 +++++-
>>   arch/x86/include/asm/perf_event.h | 5 +++++
>>   2 files changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
>> index db32ef6..1d2e49d 100644
>> --- a/arch/x86/events/intel/core.c
>> +++ b/arch/x86/events/intel/core.c
>> @@ -5668,7 +5668,11 @@ static void intel_pmu_check_event_constraints(struct event_constraint *event_con
>>                          /* Disabled fixed counters which are not in CPUID */
>>                          c->idxmsk64 &= intel_ctrl;
>>
>> -                       if (c->idxmsk64 != INTEL_PMC_MSK_FIXED_REF_CYCLES)
>> +                       /*
>> +                        * Don't extend the pseudo-encoding to the
>> +                        * generic counters
>> +                        */
>> +                       if (!use_fixed_pseudo_encoding(c->code))
>>                                  c->idxmsk64 |= (1ULL << num_counters) - 1;
>>                  }
>>                  c->idxmsk64 &=
>> diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
>> index 48e6ef56..cd85f03 100644
>> --- a/arch/x86/include/asm/perf_event.h
>> +++ b/arch/x86/include/asm/perf_event.h
>> @@ -242,6 +242,11 @@ struct x86_pmu_capability {
>>   #define INTEL_PMC_IDX_FIXED_SLOTS      (INTEL_PMC_IDX_FIXED + 3)
>>   #define INTEL_PMC_MSK_FIXED_SLOTS      (1ULL << INTEL_PMC_IDX_FIXED_SLOTS)
>>
>> +static inline bool use_fixed_pseudo_encoding(u64 code)
>> +{
>> +       return !(code & 0xff);
>> +}
>> +
> I ack the problem.
> 
> That does not take into account the old encoding for PREC_DIST 0x01c0
> which is also forced to
> fixed counter0 on ICL and should not be extended.

The old encoding is not documented in the ICL event list now. The only 
PREC_DIST event for ICL is using the pseudo encoding.

   {
     "EventCode": "0x00",
     "UMask": "0x01",
     "EventName": "INST_RETIRED.PREC_DIST",
     "BriefDescription": "Precise instruction retired event with a reduced effect of PEBS shadow in IP distribution",
     "PublicDescription": "A version of INST_RETIRED that allows for a more unbiased distribution of samples across instructions retired. It utilizes the Precise Distribution of Instructions Retired (PDIR) feature to mitigate some bias in how retired instructions get sampled. Use on Fixed Counter 0.",
     "Counter": "Fixed counter 0",

Ideally, I think we should remove the old encoding 0x01c0 from the 
constraints table rather than force it to fixed counter 0 only.
If so, that should be a separate patch.

> 
> That also limits the options for the SLOTS events which can be
> measured by a GP. Yet to work
> with PERF_METRICS, it has to be programmed into fixed counter 3.

For the SLOTS event that works with PERF_METRICS, current perf already
restricts it to fixed counter 3, as below.
FIXED_EVENT_CONSTRAINT(0x0400, 3),	/* SLOTS */
This patch doesn't change that behavior.

The GP version of SLOTS is 0x01a4. According to the event list, it can
be scheduled on all GP counters, so it is not added to the constraints
table.

     "EventCode": "0xa4",
     "UMask": "0x01",
     "EventName": "TOPDOWN.SLOTS_P",
     "BriefDescription": "TMA slots available for an unhalted logical processor. General counter - architectural event",
     "PublicDescription": "Counts the number of available slots for an unhalted logical processor. The event increments by machine-width of the narrowest pipeline as employed by the Top-down Microarchitecture Analysis method. The count is distributed among unhalted logical processors (hyper-threads) who share the same physical core.",
     "Counter": "0,1,2,3,4,5,6,7",
     "PEBScounters": "0,1,2,3,4,5,6,7",

Even if we eventually decide to extend 0x01a4 to fixed counter 3 and
add an entry FIXED_EVENT_CONSTRAINT(0x01a4, 3) to the constraints table,
this patch doesn't limit it.

Thanks,
Kan
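
The two cases discussed in this reply can be compared with a small userspace sketch of the mask-extension step, before and after the patch. This is an assumption-laden model, not the kernel function: struct constraint_sketch is a hypothetical stand-in for the code and idxmsk64 fields of struct event_constraint, the intel_ctrl masking and the separate topdown handling are omitted, and fixed counter n is assumed to occupy bit 32 + n of the mask (INTEL_PMC_IDX_FIXED is 32 in perf_event.h).

```c
#include <stdbool.h>
#include <stdint.h>

#define INTEL_PMC_IDX_FIXED		32
#define INTEL_PMC_MSK_FIXED_REF_CYCLES	(1ULL << (INTEL_PMC_IDX_FIXED + 2))

/* Hypothetical stand-in for the relevant event_constraint fields. */
struct constraint_sketch {
	uint64_t code;		/* raw event code, e.g. 0x0100 */
	uint64_t idxmsk64;	/* bitmask of counters the event may use */
};

static inline bool use_fixed_pseudo_encoding(uint64_t code)
{
	return !(code & 0xff);
}

/* Pre-patch rule: only the ref-cycles mask is kept off the GP counters. */
static uint64_t extend_mask_old(struct constraint_sketch c, int num_counters)
{
	if (c.idxmsk64 != INTEL_PMC_MSK_FIXED_REF_CYCLES)
		c.idxmsk64 |= (1ULL << num_counters) - 1;
	return c.idxmsk64;
}

/* Post-patch rule: any pseudo-encoding is kept off the GP counters. */
static uint64_t extend_mask_new(struct constraint_sketch c, int num_counters)
{
	if (!use_fixed_pseudo_encoding(c.code))
		c.idxmsk64 |= (1ULL << num_counters) - 1;
	return c.idxmsk64;
}
```

Assuming 8 GP counters, PREC_DIST (0x0100 on fixed counter 0) gains the low eight mask bits under the old rule and keeps only bit 32 under the new one. A hypothetical FIXED_EVENT_CONSTRAINT(0x01a4, 3) entry would still be extended to the GP counters, as would the old 0x01c0 encoding, which is the case raised in comment #1.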

> 
>>   /*
>>    * We model BTS tracing as another fixed-mode PMC.
>>    *
>> --
>> 2.7.4
>>
Patch

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index db32ef6..1d2e49d 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -5668,7 +5668,11 @@  static void intel_pmu_check_event_constraints(struct event_constraint *event_con
 			/* Disabled fixed counters which are not in CPUID */
 			c->idxmsk64 &= intel_ctrl;
 
-			if (c->idxmsk64 != INTEL_PMC_MSK_FIXED_REF_CYCLES)
+			/*
+			 * Don't extend the pseudo-encoding to the
+			 * generic counters
+			 */
+			if (!use_fixed_pseudo_encoding(c->code))
 				c->idxmsk64 |= (1ULL << num_counters) - 1;
 		}
 		c->idxmsk64 &=
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 48e6ef56..cd85f03 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -242,6 +242,11 @@  struct x86_pmu_capability {
 #define INTEL_PMC_IDX_FIXED_SLOTS	(INTEL_PMC_IDX_FIXED + 3)
 #define INTEL_PMC_MSK_FIXED_SLOTS	(1ULL << INTEL_PMC_IDX_FIXED_SLOTS)
 
+static inline bool use_fixed_pseudo_encoding(u64 code)
+{
+	return !(code & 0xff);
+}
+
 /*
  * We model BTS tracing as another fixed-mode PMC.
  *