From patchwork Fri Sep 11 07:02:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: thermal-bot for Julien Panis X-Patchwork-Id: 264110 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-14.5 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, MENTIONS_GIT_HOSTING, SIGNED_OFF_BY, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6CDC7C2BC11 for ; Fri, 11 Sep 2020 07:03:48 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 26762221EF for ; Fri, 11 Sep 2020 07:03:48 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="lnBSJV9b"; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="McWhkeW8" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1725804AbgIKHDP (ORCPT ); Fri, 11 Sep 2020 03:03:15 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:45112 "EHLO galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725785AbgIKHCd (ORCPT ); Fri, 11 Sep 2020 03:02:33 -0400 Date: Fri, 11 Sep 2020 07:02:28 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1599807749; h=from:from:sender:sender:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nK+Y+pGcwGyQYhqw3U+d78NVAhziKaGB5comF2R4ZiA=; b=lnBSJV9bIdkFa8DN3Lvz1CA+G3rVL4l21bP2jfAMs+ZdaDyK1jsBHKNXGotvd9EfBcyy/a axVjq6KYlMPwbuwcdc/Y4dYsAtglaQ1WbovpuIh0Y4Vsrvm4Dxy2HByVIUIhfINLrbfc5I 0IU203uWaLSSQaIjT+QqeRQiYhahDubrjOy9P1qVsXnpW0Az+ism2Qb+ZFwW+SkIXe7pz/ aamGOkOniX6aDAo2FHgnPaFVI/8yl2+/MAX++j1vX+c4WDR86UmelLADNs+YJYH/euD41V iNfiRTppl8ZpIRU9PNROcy0dlmqJLe2JITHOhLhFlhIQflvUz1cfGqDFknlPjw== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1599807749; h=from:from:sender:sender:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nK+Y+pGcwGyQYhqw3U+d78NVAhziKaGB5comF2R4ZiA=; b=McWhkeW8OMH0xUppniI1KSnVOTBV6QAD5eiaCGz+ICy/zhpAoWGHQv1jLJzFAwcFxzvPZm KgQQdlKEUdrXpkCA== From: "tip-bot2 for Kim Phillips" Reply-to: linux-kernel@vger.kernel.org To: linux-tip-commits@vger.kernel.org Subject: [tip: perf/core] perf/amd/uncore: Set all slices and threads to restore perf stat -a behaviour Cc: Kim Phillips , "Peter Zijlstra (Intel)" , stable@vger.kernel.org, x86 , LKML In-Reply-To: <20200908214740.18097-2-kim.phillips@amd.com> References: <20200908214740.18097-2-kim.phillips@amd.com> MIME-Version: 1.0 Message-ID: <159980774865.20229.4594347442051094162.tip-bot2@tip-bot2> Robot-ID: Robot-Unsubscribe: Contact to get blacklisted from these emails Sender: stable-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org The following commit has been merged into the perf/core branch of tip: Commit-ID: c8fe99d0701fec9fb849ec880a86bc5592530496 Gitweb: https://git.kernel.org/tip/c8fe99d0701fec9fb849ec880a86bc5592530496 Author: Kim Phillips AuthorDate: Tue, 08 Sep 2020 16:47:34 -05:00 Committer: Peter Zijlstra CommitterDate: Thu, 10 Sep 2020 11:19:34 +02:00 perf/amd/uncore: Set all slices and threads to restore perf stat -a behaviour Commit 2f217d58a8a0 ("perf/x86/amd/uncore: Set the thread mask for F17h L3 PMCs") inadvertently changed the uncore driver's behaviour wrt perf tool invocations with or without a CPU list, specified with -C / --cpu=. Change the behaviour of the driver to assume the former all-cpu (-a) case, which is the more commonly desired default. This fixes '-a -A' invocations without explicit cpu lists (-C) to not count L3 events only on behalf of the first thread of the first core in the L3 domain. BEFORE: Activity performed by the first thread of the last core (CPU#43) in CPU#40's L3 domain is not reported by CPU#40: sudo perf stat -a -A -e l3_request_g1.caching_l3_cache_accesses taskset -c 43 perf bench mem memcpy -s 32mb -l 100 -f default ... CPU36 21,835 l3_request_g1.caching_l3_cache_accesses CPU40 87,066 l3_request_g1.caching_l3_cache_accesses CPU44 17,360 l3_request_g1.caching_l3_cache_accesses ... AFTER: The L3 domain activity is now reported by CPU#40: sudo perf stat -a -A -e l3_request_g1.caching_l3_cache_accesses taskset -c 43 perf bench mem memcpy -s 32mb -l 100 -f default ... CPU36 354,891 l3_request_g1.caching_l3_cache_accesses CPU40 1,780,870 l3_request_g1.caching_l3_cache_accesses CPU44 315,062 l3_request_g1.caching_l3_cache_accesses ... Fixes: 2f217d58a8a0 ("perf/x86/amd/uncore: Set the thread mask for F17h L3 PMCs") Signed-off-by: Kim Phillips Signed-off-by: Peter Zijlstra (Intel) Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20200908214740.18097-2-kim.phillips@amd.com --- arch/x86/events/amd/uncore.c | 28 ++++++++-------------------- 1 file changed, 8 insertions(+), 20 deletions(-) diff --git a/arch/x86/events/amd/uncore.c b/arch/x86/events/amd/uncore.c index 76400c0..e7e61c8 100644 --- a/arch/x86/events/amd/uncore.c +++ b/arch/x86/events/amd/uncore.c @@ -181,28 +181,16 @@ static void amd_uncore_del(struct perf_event *event, int flags) } /* - * Convert logical CPU number to L3 PMC Config ThreadMask format + * Return a full thread and slice mask until per-CPU is + * properly supported. */ -static u64 l3_thread_slice_mask(int cpu) +static u64 l3_thread_slice_mask(void) { - u64 thread_mask, core = topology_core_id(cpu); - unsigned int shift, thread = 0; + if (boot_cpu_data.x86 <= 0x18) + return AMD64_L3_SLICE_MASK | AMD64_L3_THREAD_MASK; - if (topology_smt_supported() && !topology_is_primary_thread(cpu)) - thread = 1; - - if (boot_cpu_data.x86 <= 0x18) { - shift = AMD64_L3_THREAD_SHIFT + 2 * (core % 4) + thread; - thread_mask = BIT_ULL(shift); - - return AMD64_L3_SLICE_MASK | thread_mask; - } - - core = (core << AMD64_L3_COREID_SHIFT) & AMD64_L3_COREID_MASK; - shift = AMD64_L3_THREAD_SHIFT + thread; - thread_mask = BIT_ULL(shift); - - return AMD64_L3_EN_ALL_SLICES | core | thread_mask; + return AMD64_L3_EN_ALL_SLICES | AMD64_L3_EN_ALL_CORES | + AMD64_L3_F19H_THREAD_MASK; } static int amd_uncore_event_init(struct perf_event *event) @@ -232,7 +220,7 @@ static int amd_uncore_event_init(struct perf_event *event) * For other events, the two fields do not affect the count. */ if (l3_mask && is_llc_event(event)) - hwc->config |= l3_thread_slice_mask(event->cpu); + hwc->config |= l3_thread_slice_mask(); uncore = event_to_amd_uncore(event); if (!uncore)