[1/4] PCI: Check PCI_PM_CTRL in pci_dev_wait()

Message ID	20240613054204.5850-2-mario.limonciello@amd.com
State	New
Series	Verify devices transition from D3cold to D0
	[0/4] Verify devices transition from D3cold to D0
	[1/4] PCI: Check PCI_PM_CTRL in pci_dev_wait()
	[2/4] PCI: Verify functions currently in D3cold have entered D0
	[3/4] PCI: Allow Ryzen XHCI controllers into D3cold and drop delays
	[4/4] PCI: Drop Radeon quirk for Macbook Pro 8.2

Headers

Received-SPF: Pass (protection.outlook.com: domain of amd.com designates
 165.204.84.17 as permitted sender) receiver=protection.outlook.com;
 client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C
From: Mario Limonciello <mario.limonciello@amd.com>
To: Bjorn Helgaas <bhelgaas@google.com>, Mathias Nyman
	<mathias.nyman@intel.com>, Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: "open list:PCI SUBSYSTEM" <linux-pci@vger.kernel.org>, open list
	<linux-kernel@vger.kernel.org>, "open list:USB XHCI DRIVER"
	<linux-usb@vger.kernel.org>, Daniel Drake <drake@endlessos.org>, Gary Li
	<Gary.Li@amd.com>, Mika Westerberg <mika.westerberg@linux.intel.com>, "Mario
 Limonciello" <mario.limonciello@amd.com>
Subject: [PATCH 1/4] PCI: Check PCI_PM_CTRL in pci_dev_wait()
Date: Thu, 13 Jun 2024 00:42:01 -0500
Message-ID: <20240613054204.5850-2-mario.limonciello@amd.com>
In-Reply-To: <20240613054204.5850-1-mario.limonciello@amd.com>
References: <20240613054204.5850-1-mario.limonciello@amd.com>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain

Commit Message

Mario Limonciello June 13, 2024, 5:42 a.m. UTC

A device that has gone through a reset may return a value in PCI_COMMAND
but that doesn't mean it's finished transitioning to D0.  On devices that
support power management explicitly check PCI_PM_CTRL to ensure the
transition happened.  Devicees that don't support power management will
continue to use PCI_COMMAND.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
 drivers/pci/pci.c | 26 +++++++++++++++++++-------
 1 file changed, 19 insertions(+), 7 deletions(-)
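
The readiness test described in the commit message can be sketched in userspace C. This is only an illustration, not the kernel code: `dev_ready()`, `possible_error16()` and `possible_error32()` are hypothetical names, and the register values are passed in as arguments instead of being read from config space. The mask value 0x0003 matches PCI_PM_CTRL_STATE_MASK, and an all-ones read-back is treated as a failed read, which is the idea behind PCI_POSSIBLE_ERROR().

```c
#include <stdbool.h>
#include <stdint.h>

/* Redefined locally so the sketch builds outside the kernel tree. */
#define PM_CTRL_STATE_MASK 0x0003u /* power state bits of PMCSR */
#define PM_STATE_D0        0x0000u /* encoding of D0 in those bits */

/* Hypothetical helpers: an all-ones value suggests the read did not complete. */
static bool possible_error16(uint16_t v) { return v == UINT16_MAX; }
static bool possible_error32(uint32_t v) { return v == UINT32_MAX; }

/*
 * Hypothetical model of the check proposed in this patch.
 * pm_cap is the offset of the PM capability, or 0 if the device has none.
 */
static bool dev_ready(uint8_t pm_cap, uint16_t pmcsr, uint32_t command)
{
	if (pm_cap)
		/* PM-capable: require a valid PMCSR read that reports D0. */
		return !possible_error16(pmcsr) &&
		       (pmcsr & PM_CTRL_STATE_MASK) == PM_STATE_D0;

	/* No PM capability: any valid PCI_COMMAND read counts as ready. */
	return !possible_error32(command);
}
```

With this model, a PM-capable device whose PMCSR still reads 0x0003 (D3hot) is reported as not ready even if PCI_COMMAND already returns valid data, which is the case the patch is meant to catch.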

Comments

Markus Elfring June 19, 2024, 6:33 p.m. UTC | #1

> A device that has gone through a reset may return a value in PCI_COMMAND
> but that doesn't mean it's finished transitioning to D0.  On devices that
> support power management explicitly check PCI_PM_CTRL to ensure the
> transition happened.  Devicees that don't support power management will

                        Devices?


> continue to use PCI_COMMAND.

Would the tag “Fixes” be relevant for such a change description?

Regards,
Markus

Mario Limonciello June 19, 2024, 6:44 p.m. UTC | #2

On 6/19/2024 13:33, Markus Elfring wrote:
>> A device that has gone through a reset may return a value in PCI_COMMAND
>> but that doesn't mean it's finished transitioning to D0.  On devices that
>> support power management explicitly check PCI_PM_CTRL to ensure the
>> transition happened.  Devicees that don't support power management will
> 
>                          Devices?

Yes, thanks.  I'll fix that up for the next version once we have some 
alignment on the functionality outlined in these patches.

> 
> 
>> continue to use PCI_COMMAND.
> 
> Would the tag “Fixes” be relevant for such a change description?
> 
> Regards,
> Markus

I did trace back the history of the wait function and it goes back to 
4.6.  In my mind yes; it is a fix, but I don't think it should go that 
far back automatically.  I think we should prioritize getting it fixed 
for 6.11 or 6.12 and then can revisit how far back to do a stable backport.

For example AMD Rembrandt (where this race condition was found) isn't 
enabled until 5.17 or 5.18 IIRC.

The backports would have a dependency on 08e3ed12ca861 (from 6.5-rc1) 
and bae26849372b8 (from 5.5-rc1) and 821cdad5c46ca (from 4.14) and 
5adecf817dd63 (from 4.6-rc1).

Patch

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 59e0949fb079..41961e28a86c 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1270,21 +1270,33 @@  static int pci_dev_wait(struct pci_dev *dev, char *reset_type, int timeout)
 	 * the read (except when CRS SV is enabled and the read was for the
 	 * Vendor ID; in that case it synthesizes 0x0001 data).
 	 *
-	 * Wait for the device to return a non-CRS completion.  Read the
-	 * Command register instead of Vendor ID so we don't have to
-	 * contend with the CRS SV value.
+	 * Wait for the device to return a non-CRS completion.  On devices
+	 * that support PM control read the PM control register to ensure
+	 * the device has transitioned to D0.  On devices that don't support
+	 * PM control, read the Command register instead of Vendor ID so
+	 * we don't have to contend with the CRS SV value.
 	 */
 	for (;;) {
-		u32 id;
 
 		if (pci_dev_is_disconnected(dev)) {
 			pci_dbg(dev, "disconnected; not waiting\n");
 			return -ENOTTY;
 		}
 
-		pci_read_config_dword(dev, PCI_COMMAND, &id);
-		if (!PCI_POSSIBLE_ERROR(id))
-			break;
+		if (dev->pm_cap) {
+			u16 pmcsr;
+
+			pci_read_config_word(dev, dev->pm_cap + PCI_PM_CTRL, &pmcsr);
+			if (!PCI_POSSIBLE_ERROR(pmcsr) &&
+			    (pmcsr & PCI_PM_CTRL_STATE_MASK) == PCI_D0)
+				break;
+		} else {
+			u32 id;
+
+			pci_read_config_dword(dev, PCI_COMMAND, &id);
+			if (!PCI_POSSIBLE_ERROR(id))
+				break;
+		}
 
 		if (delay > timeout) {
 			pci_warn(dev, "not ready %dms after %s; giving up\n",
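
The hunk above is cut off before the end of the loop. As a rough sketch of how such a poll-until-D0 loop behaves, the following userspace C simulates a device whose PMCSR reads all-ones until a given time. Everything here is an assumption made for illustration: `struct fake_dev`, `fake_read_pmcsr()` and `wait_for_d0()` are hypothetical, the doubling delay is not visible in the hunk shown here, and a simulated clock stands in for sleeping.

```c
#include <stdint.h>

/* Hypothetical simulated device; not a kernel structure. */
struct fake_dev {
	int ready_at_ms; /* simulated time at which PMCSR starts reporting D0 */
	int now_ms;      /* simulated clock, advanced instead of sleeping */
};

/* Returns all-ones (failed read) until the device is ready, then D0. */
static uint16_t fake_read_pmcsr(const struct fake_dev *d)
{
	return d->now_ms >= d->ready_at_ms ? 0x0000 : 0xffff;
}

/* Returns 0 once PMCSR reports D0, or -1 when the delay exceeds the timeout. */
static int wait_for_d0(struct fake_dev *d, int timeout_ms)
{
	int delay = 1;

	for (;;) {
		uint16_t pmcsr = fake_read_pmcsr(d);

		/* Valid read with power state bits equal to D0: done. */
		if (pmcsr != 0xffff && (pmcsr & 0x0003) == 0)
			return 0;

		/* Same give-up condition as the "delay > timeout" test above. */
		if (delay > timeout_ms)
			return -1;

		d->now_ms += delay; /* stand-in for sleeping "delay" ms */
		delay *= 2;         /* assumed exponential backoff */
	}
}
```

Under these assumptions the poll times fall at 0, 1, 3, 7, 15, ... ms, so a device that becomes ready at 10 ms is detected on the poll at 15 ms.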
