From patchwork Fri Mar 3 19:18:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Rafael J. Wysocki" X-Patchwork-Id: 658669 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 08188C7EE32 for ; Fri, 3 Mar 2023 19:24:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231649AbjCCTYc (ORCPT ); Fri, 3 Mar 2023 14:24:32 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55554 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231599AbjCCTYU (ORCPT ); Fri, 3 Mar 2023 14:24:20 -0500 Received: from cloudserver094114.home.pl (cloudserver094114.home.pl [79.96.170.134]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 06B62125BA; Fri, 3 Mar 2023 11:24:17 -0800 (PST) Received: from localhost (127.0.0.1) (HELO v370.home.net.pl) by /usr/run/smtp (/usr/run/postfix/private/idea_relay_lmtp) via UNIX with SMTP (IdeaSmtpServer 5.1.0) id c9c6336197d2ec2a; Fri, 3 Mar 2023 20:24:16 +0100 Received: from kreacher.localnet (unknown [213.134.183.41]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by v370.home.net.pl (Postfix) with ESMTPSA id 35F8F20619DE; Fri, 3 Mar 2023 20:24:15 +0100 (CET) From: "Rafael J. Wysocki" To: Linux PM Cc: Zhang Rui , Linux ACPI , LKML , Daniel Lezcano , Srinivas Pandruvada , Viresh Kumar , Quanxian Wang Subject: [PATCH v1 0/4] thermal: core/ACPI: Fix processor cooling device regression Date: Fri, 03 Mar 2023 20:18:30 +0100 Message-ID: <2148907.irdbgypaU6@kreacher> MIME-Version: 1.0 X-CLIENT-IP: 213.134.183.41 X-CLIENT-HOSTNAME: 213.134.183.41 X-VADE-SPAMSTATE: clean X-VADE-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrgedvhedrudelledguddukecutefuodetggdotefrodftvfcurfhrohhfihhlvgemucfjqffogffrnfdpggftiffpkfenuceurghilhhouhhtmecuudehtdenucesvcftvggtihhpihgvnhhtshculddquddttddmnecujfgurhephffvvefufffkggfgtgesthfuredttddtjeenucfhrhhomhepfdftrghfrggvlhculfdrucghhihsohgtkhhifdcuoehrjhifsehrjhifhihsohgtkhhirdhnvghtqeenucggtffrrghtthgvrhhnpeegfffhudejlefhtdegffekteduhfethffhieettefhkeevgfdvgfefieekiefgheenucffohhmrghinhepkhgvrhhnvghlrdhorhhgnecukfhppedvudefrddufeegrddukeefrdegudenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepihhnvghtpedvudefrddufeegrddukeefrdeguddphhgvlhhopehkrhgvrggthhgvrhdrlhhotggrlhhnvghtpdhmrghilhhfrhhomhepfdftrghfrggvlhculfdrucghhihsohgtkhhifdcuoehrjhifsehrjhifhihsohgtkhhirdhnvghtqedpnhgspghrtghpthhtohepkedprhgtphhtthhopehlihhnuhigqdhpmhesvhhgvghrrdhkvghrnhgvlhdrohhrghdprhgtphhtthhopehruhhirdiihhgrnhhgsehinhhtvghlrdgtohhmpdhrtghpthhtoheplhhinhhugidqrggtphhisehvghgvrhdrkhgvrhhnvghlrdhorhhgpdhrtghpthhtoheplhhinhhugidqkhgvrhhnvghlsehvghgvrhdr khgvrhhnvghlrdhorhhgpdhrtghpthhtohepuggrnhhivghlrdhlvgiitggrnhhosehlihhnrghrohdrohhrghdprhgtphhtthhopehsrhhinhhivhgrshdrphgrnhgurhhuvhgruggrsehlihhnuhigrdhinhhtvghlrdgtohhm X-DCC--Metrics: v370.home.net.pl 1024; Body=8 Fuz1=8 Fuz2=8 Precedence: bulk List-ID: X-Mailing-List: linux-pm@vger.kernel.org Hi All, As reported by Rui in this thread: Link: https://lore.kernel.org/linux-pm/53ec1f06f61c984100868926f282647e57ecfb2d.camel@intel.com/ some recent changes in the thermal core cause the CPU cooling devices registered by the ACPI processor driver to become unusable in some cases and somewhat crippled in general. The problem is that the ACPI processor driver changes its ->get_max_state() callback return value depending on whether or not cpufreq is available and there is a cpufreq policy for a given CPU. However, the thermal core has always assumed that the return value of that callback will not change, which in fact is relied on by the cooling device statistics code. In particular, when the ->get_max_state() grows, the memory buffer allocated for storing the statistics will be too small and corruption may ensue as a result. For this reason, the issue needs to be addressed in the ACPI processor driver and not in the thermal core, but the core needs to help somewhat too. Namely, it needs to provide a helper allowing an interested driver to update the max_state value for an already registered cooling device in certain situations which will also cause the statistics to be rebuilt. This series implements the above and for details please refer to the individual patch chagelogs. Please also note that it has only been lightly tested, so more testing and of course review of it is welcome. Thanks!