[14/18] tools/power turbostat: Avoid probing the same perf counters

Message ID 9d6b56ebcc57108902335e35b8242aa3cf5b2051.1749406068.git.len.brown@intel.com
State New

Commit Message

Len Brown June 8, 2025, 6:17 p.m. UTC
From: Zhang Rui <rui.zhang@intel.com>

For the RAPL package energy status counter, Intel and AMD share the same
perf_subsys and perf_name, but with different MSR addresses.

rapl_counter_arch_infos[0] and rapl_counter_arch_infos[1] both describe
this counter, one entry per Vendor.

As a result, the perf counter is probed twice, which causes a failure in
get_rapl_counters() because expected_read_size and actual_read_size
don't match.

Fix the problem by skipping the already probed counter.
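
The failure mode and the fix can be modelled with a minimal, self-contained
sketch. This is not the turbostat code: struct arch_info, struct counter_info,
probe_all() and count_perf_slots() are hypothetical stand-ins, the perf probe is
assumed to always succeed, and the sketch assumes the expected read size is
derived from the number of slots marked as perf-backed while the actual read
size follows the number of perf events opened.

```c
#include <assert.h>
#include <stddef.h>

enum counter_source { COUNTER_SOURCE_NONE, COUNTER_SOURCE_PERF, COUNTER_SOURCE_MSR };

#define NUM_SLOTS 2	/* hypothetical number of counter slots per domain */

/* Simplified stand-in for one rapl_counter_arch_infos[] entry. */
struct arch_info {
	const char *perf_subsys;
	const char *perf_name;
	unsigned int msr;	/* MSR address, illustrative only */
	int rci_index;		/* slot this entry fills */
};

/* Simplified stand-in for the per-domain counter state. */
struct counter_info {
	enum counter_source source[NUM_SLOTS];
	int events_opened;	/* perf events added to the group */
};

/* Number of slots marked as backed by perf. */
static int count_perf_slots(const struct counter_info *rci)
{
	int n = 0;

	for (int i = 0; i < NUM_SLOTS; i++)
		if (rci->source[i] == COUNTER_SOURCE_PERF)
			n++;
	return n;
}

/*
 * Probe every table entry and return the number of perf events opened.
 * With skip_probed set, an entry whose slot is already filled is skipped,
 * which mirrors the check added by this patch.
 */
static int probe_all(struct counter_info *rci, const struct arch_info *tbl,
		     size_t n, int skip_probed)
{
	for (size_t i = 0; i < n; i++) {
		const struct arch_info *cai = &tbl[i];

		assert(cai->rci_index >= 0 && cai->rci_index < NUM_SLOTS);

		if (skip_probed && rci->source[cai->rci_index] != COUNTER_SOURCE_NONE)
			continue;

		/* Assumption: the perf probe always succeeds in this model. */
		rci->source[cai->rci_index] = COUNTER_SOURCE_PERF;
		rci->events_opened++;
	}
	return rci->events_opened;
}
```

Without the check, two entries that share a slot open two events while only
one slot is marked, so the two sizes disagree. With the check, both counts
stay at one.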

Note that this is not a perfect fix. For example, if different
vendors/platforms use the same MSR address for different purposes, the code
can be fooled when it probes a rapl_counter_arch_infos[] entry that does
not belong to the running Vendor/Platform.

In the long run, it would be better to move rapl_counter_arch_infos[] into
platform_features so that it becomes Vendor/Platform specific.
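
One possible shape for that longer-term direction is sketched here. It is
purely illustrative and not part of this patch: enum vendor,
struct platform_features_sketch and select_platform() are hypothetical names,
and the real platform_features structure may look quite different.

```c
#include <assert.h>
#include <stddef.h>

enum vendor { VENDOR_INTEL, VENDOR_AMD };	/* hypothetical */

/* Simplified stand-in for one RAPL counter description. */
struct arch_info {
	const char *perf_name;
	unsigned int msr;	/* MSR address, illustrative only */
	int rci_index;
};

/* One table per vendor, so a foreign entry is never probed. */
static const struct arch_info intel_rapl[] = {
	{ "energy-pkg", 0x611, 0 },
};

static const struct arch_info amd_rapl[] = {
	{ "energy-pkg", 0xc001029b, 0 },
};

/* Hypothetical per-platform descriptor that owns its RAPL table. */
struct platform_features_sketch {
	const struct arch_info *rapl_infos;
	size_t num_rapl_infos;
};

static const struct platform_features_sketch intel_features = {
	intel_rapl, sizeof(intel_rapl) / sizeof(intel_rapl[0]),
};

static const struct platform_features_sketch amd_features = {
	amd_rapl, sizeof(amd_rapl) / sizeof(amd_rapl[0]),
};

/* Pick the descriptor for the running vendor. */
static const struct platform_features_sketch *select_platform(enum vendor v)
{
	assert(v == VENDOR_INTEL || v == VENDOR_AMD);
	return v == VENDOR_AMD ? &amd_features : &intel_features;
}
```

With a layout like this, the probe loop would only walk the table selected
for the running platform, so duplicate perf names across vendors could not
collide.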

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
---
 tools/power/x86/turbostat/turbostat.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)
Patch

diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c
index 6f91ec3f3f14..8deb6a23c7dd 100644
--- a/tools/power/x86/turbostat/turbostat.c
+++ b/tools/power/x86/turbostat/turbostat.c
@@ -7991,6 +7991,21 @@  void rapl_perf_init(void)
 
 			struct rapl_counter_info_t *rci = &rapl_counter_info_perdomain[next_domain];
 
+			/*
+			 * rapl_counter_arch_infos[] can have multiple entries describing the same
+			 * counter, due to the difference from different platforms/Vendors.
+			 * E.g. rapl_counter_arch_infos[0] and rapl_counter_arch_infos[1] share the
+			 * same perf_subsys and perf_name, but with different MSR address.
+			 * rapl_counter_arch_infos[0] is for Intel and rapl_counter_arch_infos[1]
+			 * is for AMD.
+			 * In this case, it is possible that multiple rapl_counter_arch_infos[]
+			 * entries are probed just because their perf/msr is duplicate and valid.
+			 *
+			 * Thus a check is needed to avoid re-probing the same counters.
+			 */
+			if (rci->source[cai->rci_index] != COUNTER_SOURCE_NONE)
+				break;
+
 			/* Use perf API for this counter */
 			if (add_rapl_perf_counter(cpu, rci, cai, &scale, &unit) != -1) {
 				rci->source[cai->rci_index] = COUNTER_SOURCE_PERF;