From patchwork Fri Mar 17 16:48:04 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Rafael J. Wysocki"
X-Patchwork-Id: 664746
From: "Rafael J. Wysocki"
To: Linux PM
Cc: Zhang Rui, Linux ACPI, LKML, Daniel Lezcano, Srinivas Pandruvada,
 Viresh Kumar, Quanxian Wang
Subject: [PATCH v3 0/4] thermal: core/ACPI: Fix processor cooling device regression
Date: Fri, 17 Mar 2023 17:48:04 +0100
Message-ID: <2678096.mvXUDI8C0e@kreacher>
X-Mailing-List: linux-acpi@vger.kernel.org

Hi All,

The two previous revisions of this patch series were posted as

https://lore.kernel.org/linux-pm/2148907.irdbgypaU6@kreacher/
https://lore.kernel.org/linux-acpi/2692681.mvXUDI8C0e@kreacher/

As reported by Rui in this thread:

Link:
https://lore.kernel.org/linux-pm/53ec1f06f61c984100868926f282647e57ecfb2d.camel@intel.com/

some recent changes in the thermal core cause the CPU cooling devices
registered by the ACPI processor driver to become unusable in some cases
and somewhat crippled in general.

The problem is that the ACPI processor driver changes its
->get_max_state() callback return value depending on whether or not
cpufreq is available and there is a cpufreq policy for a given CPU.
However, the thermal core has always assumed that the return value of
that callback will not change, which in fact is relied on by the cooling
device statistics code. In particular, when the ->get_max_state() return
value grows, the memory buffer allocated for storing the statistics will
be too small and corruption may ensue as a result.

For this reason, the issue needs to be addressed in the ACPI processor
driver and not in the thermal core, but the core needs to help somewhat
too. Namely, it needs to provide a helper allowing an interested driver
to update the max_state value for an already registered cooling device
in certain situations, which will also cause the statistics to be
rebuilt.

This series implements the above. For details, please refer to the
individual patch changelogs.

Thanks!
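
For illustration only, the failure mode and the intended fix described
above can be modeled in plain userspace C. This is a rough sketch, not
code from the series: struct cdev_model, stats_setup(), stats_account(),
cdev_update_max_state() and model_demo() are all hypothetical names, and
the per-state counter table merely stands in for the cooling device
statistics buffer that is sized from max_state.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct cdev_model {
	unsigned long max_state;	/* cached ->get_max_state() value */
	unsigned long *time_in_state;	/* max_state + 1 counters */
};

/* Allocate the statistics table for the currently cached max_state. */
static int stats_setup(struct cdev_model *cdev)
{
	cdev->time_in_state = calloc(cdev->max_state + 1,
				     sizeof(*cdev->time_in_state));
	return cdev->time_in_state ? 0 : -1;
}

/*
 * Account one visit to @state.  Without the bounds check, a state above
 * the cached max_state would write past the end of the table, which is
 * the corruption scenario described in the cover letter.
 */
static int stats_account(struct cdev_model *cdev, unsigned long state)
{
	assert(cdev->time_in_state != NULL);
	if (state > cdev->max_state)
		return -1;
	cdev->time_in_state[state]++;
	return 0;
}

/*
 * Model of the "update" helper idea: adopt the new max_state and rebuild
 * the statistics table so that it matches the new number of states.
 */
static int cdev_update_max_state(struct cdev_model *cdev,
				 unsigned long new_max)
{
	free(cdev->time_in_state);
	cdev->time_in_state = NULL;
	cdev->max_state = new_max;
	return stats_setup(cdev);
}

/* Walk through the scenario; returns 0 if the model behaves as expected. */
int model_demo(void)
{
	struct cdev_model cdev = { .max_state = 3, .time_in_state = NULL };
	int ret = -1;

	if (stats_setup(&cdev))
		return -1;
	/* State 7 is out of range until the driver reports a larger max. */
	if (stats_account(&cdev, 7) != -1)
		goto out;
	/* E.g. a cpufreq policy shows up and max_state grows to 10. */
	if (cdev_update_max_state(&cdev, 10))
		goto out;
	ret = stats_account(&cdev, 7);
out:
	free(cdev.time_in_state);
	return ret;
}
```

In this model the old statistics are simply discarded on update. How the
real helper treats previously collected data is described in the
individual patches, not here.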