[v4,0/7] Add managed SOFT RESERVE resource handling

Message ID 20250603221949.53272-1-Smita.KoralahalliChannabasappa@amd.com
Series: Add managed SOFT RESERVE resource handling

Message

Smita Koralahalli June 3, 2025, 10:19 p.m. UTC
Add the ability to manage SOFT RESERVE iomem resources prior to them being
added to the iomem resource tree. This allows drivers, such as CXL, to
remove any pieces of the SOFT RESERVE resource that intersect with created
CXL regions.

The current approach of leaving the SOFT RESERVE resources as is can cause
failures during hotplug of devices, such as CXL, because the resource is
not available for reuse after teardown of the device.

The approach is to add SOFT RESERVE resources to a separate tree during
boot. This allows any drivers to update the SOFT RESERVE resources before
they are merged into the iomem resource tree. In addition, a notifier chain
is added so that drivers can be notified when these SOFT RESERVE resources
are added to the iomem resource tree.

The CXL driver is modified to use a worker thread that waits for the CXL
PCI and CXL mem drivers to be loaded and for their probe routine to
complete. Then the driver walks through any created CXL regions to trim any
intersections with SOFT RESERVE resources in the iomem tree.
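
A rough, illustrative sketch (not the code from this series) of how the Soft
Reserved entries that intersect a CXL region's address range could be
enumerated with the kernel's walk_iomem_res_desc() helper. The sr_overlap_ctx
type and the two functions below are hypothetical names; the actual patches go
further and adjust or remove the real resource tree nodes rather than only
reporting the overlap:

#include <linux/ioport.h>
#include <linux/minmax.h>
#include <linux/printk.h>

struct sr_overlap_ctx {			/* hypothetical helper type */
	resource_size_t start, end;	/* CXL region address range */
};

/* walk_iomem_res_desc() hands the callback a copy of each matching entry */
static int report_soft_reserved_overlap(struct resource *res, void *arg)
{
	struct sr_overlap_ctx *ctx = arg;
	resource_size_t start = max(res->start, ctx->start);
	resource_size_t end = min(res->end, ctx->end);

	if (start <= end)
		pr_info("Soft Reserved overlap with region: %pa-%pa\n",
			&start, &end);
	return 0;
}

static void report_region_overlaps(resource_size_t start, resource_size_t end)
{
	struct sr_overlap_ctx ctx = { .start = start, .end = end };

	walk_iomem_res_desc(IORES_DESC_SOFT_RESERVED, IORESOURCE_MEM,
			    start, end, &ctx, report_soft_reserved_overlap);
}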

The dax driver uses the new soft reserve notifier chain so it can consume
any remaining SOFT RESERVES once they're added to the iomem tree.

The following scenarios have been tested:

Example 1: Exact alignment, soft reserved is a child of the region

|---------- "Soft Reserved" -----------|
|-------------- "Region #" ------------|

Before:
  1050000000-304fffffff : CXL Window 0
    1050000000-304fffffff : region0
      1050000000-304fffffff : Soft Reserved
        1080000000-2fffffffff : dax0.0
          1080000000-2fffffffff : System RAM (kmem)

After:
  1050000000-304fffffff : CXL Window 0
    1050000000-304fffffff : region1
      1080000000-2fffffffff : dax0.0
        1080000000-2fffffffff : System RAM (kmem)

Example 2: Start and/or end aligned and soft reserved spans multiple
regions

|----------- "Soft Reserved" -----------|
|-------- "Region #" -------|
or
|----------- "Soft Reserved" -----------|
|-------- "Region #" -------|

Before:
  850000000-684fffffff : Soft Reserved
    850000000-284fffffff : CXL Window 0
      850000000-284fffffff : region3
        850000000-284fffffff : dax0.0
          850000000-284fffffff : System RAM (kmem)
    2850000000-484fffffff : CXL Window 1
      2850000000-484fffffff : region4
        2850000000-484fffffff : dax1.0
          2850000000-484fffffff : System RAM (kmem)
    4850000000-684fffffff : CXL Window 2
      4850000000-684fffffff : region5
        4850000000-684fffffff : dax2.0
          4850000000-684fffffff : System RAM (kmem)

After:
  850000000-284fffffff : CXL Window 0
    850000000-284fffffff : region3
      850000000-284fffffff : dax0.0
        850000000-284fffffff : System RAM (kmem)
  2850000000-484fffffff : CXL Window 1
    2850000000-484fffffff : region4
      2850000000-484fffffff : dax1.0
        2850000000-484fffffff : System RAM (kmem)
  4850000000-684fffffff : CXL Window 2
    4850000000-684fffffff : region5
      4850000000-684fffffff : dax2.0
        4850000000-684fffffff : System RAM (kmem)

Example 3: No alignment
|---------- "Soft Reserved" ----------|
	|---- "Region #" ----|

Before:
  00000000-3050000ffd : Soft Reserved
    ..
    ..
    1050000000-304fffffff : CXL Window 0
      1050000000-304fffffff : region1
        1080000000-2fffffffff : dax0.0
          1080000000-2fffffffff : System RAM (kmem)

After:
  00000000-104fffffff : Soft Reserved
    ..
    ..
  1050000000-304fffffff : CXL Window 0
    1050000000-304fffffff : region1
      1080000000-2fffffffff : dax0.0
        1080000000-2fffffffff : System RAM (kmem)
  3050000000-3050000ffd : Soft Reserved

v4 updates:
 - Split first patch into 4 smaller patches.
 - Correct the logic for cxl_pci_loaded() and cxl_mem_active() to return
   false by default instead of true.
 - Clean up cxl_wait_for_pci_mem() to remove config checks for cxl_pci
   and cxl_mem.
 - Fix multiple bugs and build issues, including corrections to
   walk_iomem_res_desc() usage and the alignment calculations.
 
v3 updates:
 - Remove the srmem resource tree from kernel/resource.c; it is no longer
   needed in the current implementation. All SOFT RESERVE resources are
   now placed on the iomem resource tree.
 - Remove the no longer needed SOFT_RESERVED_MANAGED kernel config option.
 - Add the 'nid' parameter back to hmem_register_resource().
 - Remove the no longer used soft reserve notification chain (introduced
   in v2). The dax driver is now notified of SOFT RESERVED resources by
   the CXL driver.

v2 updates:
 - Add config option SOFT_RESERVE_MANAGED to control use of the
   separate srmem resource tree at boot.
 - Only add SOFT RESERVE resources to the soft reserve tree during
   boot, they go to the iomem resource tree after boot.
 - Remove the resource trimming code in the previous patch to reuse
   the existing code in kernel/resource.c.
 - Add functionality for the cxl_acpi driver to wait for the cxl_pci
   and cxl_mem drivers to load.

Smita Koralahalli (7):
  cxl/region: Avoid null pointer dereference in is_cxl_region()
  cxl/core: Remove CONFIG_CXL_SUSPEND and always build suspend.o
  cxl/pci: Add pci_loaded tracking to mark PCI driver readiness
  cxl/acpi: Add background worker to wait for cxl_pci and cxl_mem probe
  cxl/region: Introduce SOFT RESERVED resource removal on region
    teardown
  dax/hmem: Save the DAX HMEM platform device pointer
  cxl/dax: Defer DAX consumption of SOFT RESERVED resources until after
    CXL region creation

 drivers/cxl/Kconfig        |   4 -
 drivers/cxl/acpi.c         |  25 ++++++
 drivers/cxl/core/Makefile  |   2 +-
 drivers/cxl/core/region.c  | 163 ++++++++++++++++++++++++++++++++++++-
 drivers/cxl/core/suspend.c |  34 +++++++-
 drivers/cxl/cxl.h          |   7 ++
 drivers/cxl/cxlmem.h       |   9 --
 drivers/cxl/cxlpci.h       |   1 +
 drivers/cxl/pci.c          |   2 +
 drivers/dax/hmem/device.c  |  47 +++++------
 drivers/dax/hmem/hmem.c    |  10 ++-
 include/linux/dax.h        |  11 ++-
 include/linux/pm.h         |   7 --
 13 files changed, 270 insertions(+), 52 deletions(-)

Comments

Zhijian Li (Fujitsu) June 4, 2025, 8:43 a.m. UTC | #1
Smita,

Thanks for your awesome work. I just tested the scenarios you listed, and they work as expected. Thanks again.
(Minor comments inlined)

Tested-by: Li Zhijian <lizhijian@fujitsu.com>


To the CXL community,

The scenarios mentioned here essentially cover what a correct firmware may provide. However,
I would like to discuss one more scenario that I can simulate with a modified QEMU:
The E820 exposes a SOFT RESERVED region which is the same as a CFMW, but the HDM decoders are not committed. This means no region will be auto-created during boot.

As an example, after boot, the iomem tree is as follows:
1050000000-304fffffff : CXL Window 0
   1050000000-304fffffff : Soft Reserved
     <No region>

In this case, the SOFT RESERVED resource is not trimmed, so the end-user cannot create a new region.
My question is: Is this scenario a problem? If it is, should we fix it in this patchset or create a new patch?




On 04/06/2025 06:19, Smita Koralahalli wrote:
> Add the ability to manage SOFT RESERVE iomem resources prior to them being
> added to the iomem resource tree. This allows drivers, such as CXL, to
> remove any pieces of the SOFT RESERVE resource that intersect with created
> CXL regions.
> 
> The current approach of leaving the SOFT RESERVE resources as is can cause
> failures during hotplug of devices, such as CXL, because the resource is
> not available for reuse after teardown of the device.
> 
> The approach is to add SOFT RESERVE resources to a separate tree during
> boot.

No special tree at all since V3


> This allows any drivers to update the SOFT RESERVE resources before
> they are merged into the iomem resource tree. In addition a notifier chain
> is added so that drivers can be notified when these SOFT RESERVE resources
> are added to the ioeme resource tree.
> 
> The CXL driver is modified to use a worker thread that waits for the CXL
> PCI and CXL mem drivers to be loaded and for their probe routine to
> complete. Then the driver walks through any created CXL regions to trim any
> intersections with SOFT RESERVE resources in the iomem tree.
> 
> The dax driver uses the new soft reserve notifier chain so it can consume
> any remaining SOFT RESERVES once they're added to the iomem tree.
> 
> The following scenarios have been tested:
> 
> Example 1: Exact alignment, soft reserved is a child of the region
> 
> |---------- "Soft Reserved" -----------|
> |-------------- "Region #" ------------|
> 
> Before:
>    1050000000-304fffffff : CXL Window 0
>      1050000000-304fffffff : region0
>        1050000000-304fffffff : Soft Reserved
>          1080000000-2fffffffff : dax0.0

BTW, I'm curious how to set up a dax with an address range different from its corresponding region.


>            1080000000-2fffffffff : System RAM (kmem)
> 
> After:
>    1050000000-304fffffff : CXL Window 0
>      1050000000-304fffffff : region1
>        1080000000-2fffffffff : dax0.0
>          1080000000-2fffffffff : System RAM (kmem)
> 
> Example 2: Start and/or end aligned and soft reserved spans multiple
> regions

Tested

> 
> |----------- "Soft Reserved" -----------|
> |-------- "Region #" -------|
> or
> |----------- "Soft Reserved" -----------|
> |-------- "Region #" -------|

Typo? should be:
|----------- "Soft Reserved" -----------|
             |-------- "Region #" -------|

> 
> Example 3: No alignment
> |---------- "Soft Reserved" ----------|
> 	|---- "Region #" ----|

Tested.


Thanks
Zhijian
Zhijian Li (Fujitsu) June 4, 2025, 9:29 a.m. UTC | #2
On 04/06/2025 06:19, Smita Koralahalli wrote:
> Introduce a pci_loaded flag similar to mem_active, and define
> mark_cxl_pci_loaded() to indicate when the PCI driver has initialized.
> 
> This will be used by other CXL components, such as cxl_acpi and the soft
> reserved resource handling logic, to coordinate initialization and ensure
> that dependent operations proceed only after both cxl_pci and cxl_mem
> drivers are ready.
> 
> Co-developed-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
> Signed-off-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
> ---
>   drivers/cxl/core/suspend.c | 8 ++++++++
>   drivers/cxl/cxlpci.h       | 1 +
>   drivers/cxl/pci.c          | 2 ++
>   3 files changed, 11 insertions(+)
> 
> diff --git a/drivers/cxl/core/suspend.c b/drivers/cxl/core/suspend.c
> index 5ba4b4de0e33..72818a2c8ec8 100644
> --- a/drivers/cxl/core/suspend.c
> +++ b/drivers/cxl/core/suspend.c
> @@ -3,8 +3,10 @@
>   #include <linux/atomic.h>
>   #include <linux/export.h>
>   #include "cxlmem.h"
> +#include "cxlpci.h"
>   
>   static atomic_t mem_active;
> +static atomic_t pci_loaded;


I find it odd to place these changes in suspend.c. Also, I noticed that in a
subsequent patch, DJ has mentioned (and I agree) that this patch is unnecessary.


Thanks
Zhijian


>   
>   bool cxl_mem_active(void)
>   {
> @@ -25,3 +27,9 @@ void cxl_mem_active_dec(void)
>   	atomic_dec(&mem_active);
>   }
>   EXPORT_SYMBOL_NS_GPL(cxl_mem_active_dec, "CXL");
> +
> +void mark_cxl_pci_loaded(void)
> +{
> +	atomic_inc(&pci_loaded);
> +}
> +EXPORT_SYMBOL_NS_GPL(mark_cxl_pci_loaded, "CXL");
> diff --git a/drivers/cxl/cxlpci.h b/drivers/cxl/cxlpci.h
> index 54e219b0049e..5a811ac63fcf 100644
> --- a/drivers/cxl/cxlpci.h
> +++ b/drivers/cxl/cxlpci.h
> @@ -135,4 +135,5 @@ void read_cdat_data(struct cxl_port *port);
>   void cxl_cor_error_detected(struct pci_dev *pdev);
>   pci_ers_result_t cxl_error_detected(struct pci_dev *pdev,
>   				    pci_channel_state_t state);
> +void mark_cxl_pci_loaded(void);
>   #endif /* __CXL_PCI_H__ */
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index 785aa2af5eaa..b019bd324dba 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -1189,6 +1189,8 @@ static int __init cxl_pci_driver_init(void)
>   	if (rc)
>   		pci_unregister_driver(&cxl_pci_driver);
>   
> +	mark_cxl_pci_loaded();
> +
>   	return rc;
>   }
>
Zhijian Li (Fujitsu) June 4, 2025, 9:40 a.m. UTC | #3
On 04/06/2025 06:19, Smita Koralahalli wrote:
>   drivers/cxl/acpi.c         | 23 +++++++++++++++++++++++
>   drivers/cxl/core/suspend.c | 21 +++++++++++++++++++++
>   drivers/cxl/cxl.h          |  2 ++
>   3 files changed, 46 insertions(+)
> 
> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
> index cb14829bb9be..978f63b32b41 100644
> --- a/drivers/cxl/acpi.c
> +++ b/drivers/cxl/acpi.c
> @@ -813,6 +813,24 @@ static int pair_cxl_resource(struct device *dev, void *data)
>   	return 0;
>   }
>   
> +static void cxl_softreserv_mem_work_fn(struct work_struct *work)
> +{
> +	/* Wait for cxl_pci and cxl_mem drivers to load */
> +	cxl_wait_for_pci_mem();
> +
> +	/*
> +	 * Wait for the driver probe routines to complete after cxl_pci
> +	 * and cxl_mem drivers are loaded.
> +	 */
> +	wait_for_device_probe();
> +}
> +static DECLARE_WORK(cxl_sr_work, cxl_softreserv_mem_work_fn);
> +
> +static void cxl_softreserv_mem_update(void)
> +{
> +	schedule_work(&cxl_sr_work);
> +}
> +
>   static int cxl_acpi_probe(struct platform_device *pdev)
>   {
>   	int rc;
> @@ -887,6 +905,10 @@ static int cxl_acpi_probe(struct platform_device *pdev)
>   
>   	/* In case PCI is scanned before ACPI re-trigger memdev attach */
>   	cxl_bus_rescan();
> +
> +	/* Update SOFT RESERVE resources that intersect with CXL regions */
> +	cxl_softreserv_mem_update();
> +
>   	return 0;
>   }
>   
> @@ -918,6 +940,7 @@ static int __init cxl_acpi_init(void)
>   
>   static void __exit cxl_acpi_exit(void)
>   {
> +	cancel_work_sync(&cxl_sr_work);
>   	platform_driver_unregister(&cxl_acpi_driver);
>   	cxl_bus_drain();
>   }
> diff --git a/drivers/cxl/core/suspend.c b/drivers/cxl/core/suspend.c
> index 72818a2c8ec8..c0d8f70aed56 100644
> --- a/drivers/cxl/core/suspend.c
> +++ b/drivers/cxl/core/suspend.c
> @@ -2,12 +2,15 @@
>   /* Copyright(c) 2022 Intel Corporation. All rights reserved. */
>   #include <linux/atomic.h>
>   #include <linux/export.h>
> +#include <linux/wait.h>
>   #include "cxlmem.h"
>   #include "cxlpci.h"
>   
>   static atomic_t mem_active;
>   static atomic_t pci_loaded;
>   
> +static DECLARE_WAIT_QUEUE_HEAD(cxl_wait_queue);

Given that this file (suspend.c) focuses on power management functions,
it might be more appropriate to move the wait queue declaration and its
related changes to acpi.c, where its caller is.


Thanks
Zhijian

> +
>   bool cxl_mem_active(void)
>   {
>   	if (IS_ENABLED(CONFIG_CXL_MEM))
> @@ -19,6 +22,7 @@ bool cxl_mem_active(void)
>   void cxl_mem_active_inc(void)
>   {
>   	atomic_inc(&mem_active);
> +	wake_up(&cxl_wait_queue);
>   }
>   EXPORT_SYMBOL_NS_GPL(cxl_mem_active_inc, "CXL");
>   
> @@ -28,8 +32,25 @@ void cxl_mem_active_dec(void)
>   }
>   EXPORT_SYMBOL_NS_GPL(cxl_mem_active_dec, "CXL");
>   
> +static bool cxl_pci_loaded(void)
> +{
> +	if (IS_ENABLED(CONFIG_CXL_PCI))
> +		return atomic_read(&pci_loaded) != 0;
> +
> +	return false;
> +}
> +
>   void mark_cxl_pci_loaded(void)
>   {
>   	atomic_inc(&pci_loaded);
> +	wake_up(&cxl_wait_queue);
>   }
>   EXPORT_SYMBOL_NS_GPL(mark_cxl_pci_loaded, "CXL");
> +
> +void cxl_wait_for_pci_mem(void)
> +{
> +	if (!wait_event_timeout(cxl_wait_queue, cxl_pci_loaded() &&
> +				cxl_mem_active(), 30 * HZ))
> +		pr_debug("Timeout waiting for cxl_pci or cxl_mem probing\n");
> +}
Dave Jiang June 4, 2025, 2:35 p.m. UTC | #4
On 6/4/25 2:40 AM, Zhijian Li (Fujitsu) wrote:
> 
> 
> On 04/06/2025 06:19, Smita Koralahalli wrote:
>>   drivers/cxl/acpi.c         | 23 +++++++++++++++++++++++
>>   drivers/cxl/core/suspend.c | 21 +++++++++++++++++++++
>>   drivers/cxl/cxl.h          |  2 ++
>>   3 files changed, 46 insertions(+)
>>
>> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
>> index cb14829bb9be..978f63b32b41 100644
>> --- a/drivers/cxl/acpi.c
>> +++ b/drivers/cxl/acpi.c
>> @@ -813,6 +813,24 @@ static int pair_cxl_resource(struct device *dev, void *data)
>>   	return 0;
>>   }
>>   
>> +static void cxl_softreserv_mem_work_fn(struct work_struct *work)
>> +{
>> +	/* Wait for cxl_pci and cxl_mem drivers to load */
>> +	cxl_wait_for_pci_mem();
>> +
>> +	/*
>> +	 * Wait for the driver probe routines to complete after cxl_pci
>> +	 * and cxl_mem drivers are loaded.
>> +	 */
>> +	wait_for_device_probe();
>> +}
>> +static DECLARE_WORK(cxl_sr_work, cxl_softreserv_mem_work_fn);
>> +
>> +static void cxl_softreserv_mem_update(void)
>> +{
>> +	schedule_work(&cxl_sr_work);
>> +}
>> +
>>   static int cxl_acpi_probe(struct platform_device *pdev)
>>   {
>>   	int rc;
>> @@ -887,6 +905,10 @@ static int cxl_acpi_probe(struct platform_device *pdev)
>>   
>>   	/* In case PCI is scanned before ACPI re-trigger memdev attach */
>>   	cxl_bus_rescan();
>> +
>> +	/* Update SOFT RESERVE resources that intersect with CXL regions */
>> +	cxl_softreserv_mem_update();
>> +
>>   	return 0;
>>   }
>>   
>> @@ -918,6 +940,7 @@ static int __init cxl_acpi_init(void)
>>   
>>   static void __exit cxl_acpi_exit(void)
>>   {
>> +	cancel_work_sync(&cxl_sr_work);
>>   	platform_driver_unregister(&cxl_acpi_driver);
>>   	cxl_bus_drain();
>>   }
>> diff --git a/drivers/cxl/core/suspend.c b/drivers/cxl/core/suspend.c
>> index 72818a2c8ec8..c0d8f70aed56 100644
>> --- a/drivers/cxl/core/suspend.c
>> +++ b/drivers/cxl/core/suspend.c
>> @@ -2,12 +2,15 @@
>>   /* Copyright(c) 2022 Intel Corporation. All rights reserved. */
>>   #include <linux/atomic.h>
>>   #include <linux/export.h>
>> +#include <linux/wait.h>
>>   #include "cxlmem.h"
>>   #include "cxlpci.h"
>>   
>>   static atomic_t mem_active;
>>   static atomic_t pci_loaded;
>>   
>> +static DECLARE_WAIT_QUEUE_HEAD(cxl_wait_queue);
> 
> Given that this file (suspend.c) focuses on power management functions,
> it might be more appropriate to move the wait queue declaration and its
> related changes to acpi.c in where the its caller is.

You mean drivers/cxl/acpi.c and not drivers/cxl/core/acpi.c right? The core one is my mistake and I'm going to create a patch to remove that.

DJ

> 
> 
> Thanks
> Zhijian
> 
>> +
>>   bool cxl_mem_active(void)
>>   {
>>   	if (IS_ENABLED(CONFIG_CXL_MEM))
>> @@ -19,6 +22,7 @@ bool cxl_mem_active(void)
>>   void cxl_mem_active_inc(void)
>>   {
>>   	atomic_inc(&mem_active);
>> +	wake_up(&cxl_wait_queue);
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_mem_active_inc, "CXL");
>>   
>> @@ -28,8 +32,25 @@ void cxl_mem_active_dec(void)
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_mem_active_dec, "CXL");
>>   
>> +static bool cxl_pci_loaded(void)
>> +{
>> +	if (IS_ENABLED(CONFIG_CXL_PCI))
>> +		return atomic_read(&pci_loaded) != 0;
>> +
>> +	return false;
>> +}
>> +
>>   void mark_cxl_pci_loaded(void)
>>   {
>>   	atomic_inc(&pci_loaded);
>> +	wake_up(&cxl_wait_queue);
>>   }
>>   EXPORT_SYMBOL_NS_GPL(mark_cxl_pci_loaded, "CXL");
>> +
>> +void cxl_wait_for_pci_mem(void)
>> +{
>> +	if (!wait_event_timeout(cxl_wait_queue, cxl_pci_loaded() &&
>> +				cxl_mem_active(), 30 * HZ))
>> +		pr_debug("Timeout waiting for cxl_pci or cxl_mem probing\n");
>> +}
Smita Koralahalli June 4, 2025, 6:31 p.m. UTC | #5
On 6/3/2025 4:45 PM, Dave Jiang wrote:
> 
> 
> On 6/3/25 3:19 PM, Smita Koralahalli wrote:
>> Introduce a waitqueue mechanism to coordinate initialization between the
>> cxl_pci and cxl_mem drivers.
>>
>> Launch a background worker from cxl_acpi_probe() that waits for both
>> drivers to complete initialization before invoking wait_for_device_probe().
>> Without this, the probe completion wait could begin prematurely, before
>> the drivers are present, leading to missed updates.
>>
>> Co-developed-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
>> Signed-off-by: Nathan Fontenot <Nathan.Fontenot@amd.com>
>> Co-developed-by: Terry Bowman <terry.bowman@amd.com>
>> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
>> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
>> ---
>>   drivers/cxl/acpi.c         | 23 +++++++++++++++++++++++
>>   drivers/cxl/core/suspend.c | 21 +++++++++++++++++++++
>>   drivers/cxl/cxl.h          |  2 ++
>>   3 files changed, 46 insertions(+)
>>
>> diff --git a/drivers/cxl/acpi.c b/drivers/cxl/acpi.c
>> index cb14829bb9be..978f63b32b41 100644
>> --- a/drivers/cxl/acpi.c
>> +++ b/drivers/cxl/acpi.c
>> @@ -813,6 +813,24 @@ static int pair_cxl_resource(struct device *dev, void *data)
>>   	return 0;
>>   }
>>   
>> +static void cxl_softreserv_mem_work_fn(struct work_struct *work)
>> +{
>> +	/* Wait for cxl_pci and cxl_mem drivers to load */
>> +	cxl_wait_for_pci_mem();
>> +
>> +	/*
>> +	 * Wait for the driver probe routines to complete after cxl_pci
>> +	 * and cxl_mem drivers are loaded.
>> +	 */
>> +	wait_for_device_probe();
>> +}
>> +static DECLARE_WORK(cxl_sr_work, cxl_softreserv_mem_work_fn);
>> +
>> +static void cxl_softreserv_mem_update(void)
>> +{
>> +	schedule_work(&cxl_sr_work);
>> +}
>> +
>>   static int cxl_acpi_probe(struct platform_device *pdev)
>>   {
>>   	int rc;
>> @@ -887,6 +905,10 @@ static int cxl_acpi_probe(struct platform_device *pdev)
>>   
>>   	/* In case PCI is scanned before ACPI re-trigger memdev attach */
>>   	cxl_bus_rescan();
>> +
>> +	/* Update SOFT RESERVE resources that intersect with CXL regions */
>> +	cxl_softreserv_mem_update();
>> +
>>   	return 0;
>>   }
>>   
>> @@ -918,6 +940,7 @@ static int __init cxl_acpi_init(void)
>>   
>>   static void __exit cxl_acpi_exit(void)
>>   {
>> +	cancel_work_sync(&cxl_sr_work);
>>   	platform_driver_unregister(&cxl_acpi_driver);
>>   	cxl_bus_drain();
>>   }
>> diff --git a/drivers/cxl/core/suspend.c b/drivers/cxl/core/suspend.c
>> index 72818a2c8ec8..c0d8f70aed56 100644
>> --- a/drivers/cxl/core/suspend.c
>> +++ b/drivers/cxl/core/suspend.c
>> @@ -2,12 +2,15 @@
>>   /* Copyright(c) 2022 Intel Corporation. All rights reserved. */
>>   #include <linux/atomic.h>
>>   #include <linux/export.h>
>> +#include <linux/wait.h>
>>   #include "cxlmem.h"
>>   #include "cxlpci.h"
>>   
>>   static atomic_t mem_active;
>>   static atomic_t pci_loaded;
>>   
>> +static DECLARE_WAIT_QUEUE_HEAD(cxl_wait_queue);
>> +
>>   bool cxl_mem_active(void)
>>   {
>>   	if (IS_ENABLED(CONFIG_CXL_MEM))
>> @@ -19,6 +22,7 @@ bool cxl_mem_active(void)
>>   void cxl_mem_active_inc(void)
>>   {
>>   	atomic_inc(&mem_active);
>> +	wake_up(&cxl_wait_queue);
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_mem_active_inc, "CXL");
>>   
>> @@ -28,8 +32,25 @@ void cxl_mem_active_dec(void)
>>   }
>>   EXPORT_SYMBOL_NS_GPL(cxl_mem_active_dec, "CXL");
>>   
>> +static bool cxl_pci_loaded(void)
>> +{
>> +	if (IS_ENABLED(CONFIG_CXL_PCI))
>> +		return atomic_read(&pci_loaded) != 0;
>> +
>> +	return false;
>> +}
>> +
>>   void mark_cxl_pci_loaded(void)
>>   {
>>   	atomic_inc(&pci_loaded);
>> +	wake_up(&cxl_wait_queue);
>>   }
>>   EXPORT_SYMBOL_NS_GPL(mark_cxl_pci_loaded, "CXL");
>> +
>> +void cxl_wait_for_pci_mem(void)
>> +{
>> +	if (!wait_event_timeout(cxl_wait_queue, cxl_pci_loaded() &&
>> +				cxl_mem_active(), 30 * HZ))
> 
> I'm trying to understand why cxl_pci_loaded() is needed. cxl_mem_active() goes above 0 when a cxl_mem_probe() instance succeeds. cxl_mem_probe() being triggered implies that an instance of cxl_pci_probe() has been called, since cxl_mem_probe() is triggered from devm_cxl_add_memdev() with the memdev being added and the cxl_mem driver also having been loaded. So does cxl_mem_active() not imply cxl_pci_loaded(), making it unnecessary?

Yeah you are right. I will remove this check.
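
A minimal sketch of what the wait could reduce to once the cxl_pci_loaded()
check is dropped (illustrative only, not necessarily the exact follow-up code):

void cxl_wait_for_pci_mem(void)
{
	/* cxl_mem_active() implies cxl_pci has already probed */
	if (!wait_event_timeout(cxl_wait_queue, cxl_mem_active(), 30 * HZ))
		pr_debug("Timeout waiting for cxl_mem probing\n");
}
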

Thanks
Smita
> 
> DJ
> 
> 
>> +		pr_debug("Timeout waiting for cxl_pci or cxl_mem probing\n");
>> +}
>> +EXPORT_SYMBOL_NS_GPL(cxl_wait_for_pci_mem, "CXL");
>> diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
>> index a9ab46eb0610..1ba7d39c2991 100644
>> --- a/drivers/cxl/cxl.h
>> +++ b/drivers/cxl/cxl.h
>> @@ -902,6 +902,8 @@ void cxl_coordinates_combine(struct access_coordinate *out,
>>   
>>   bool cxl_endpoint_decoder_reset_detected(struct cxl_port *port);
>>   
>> +void cxl_wait_for_pci_mem(void);
>> +
>>   /*
>>    * Unit test builds overrides this to __weak, find the 'strong' version
>>    * of these symbols in tools/testing/cxl/.
>
Zhijian Li (Fujitsu) June 5, 2025, 2:18 a.m. UTC | #6
On 04/06/2025 22:35, Dave Jiang wrote:
>>>    
>>> +static DECLARE_WAIT_QUEUE_HEAD(cxl_wait_queue);
>> Given that this file (suspend.c) focuses on power management functions,
>> it might be more appropriate to move the wait queue declaration and its
>> related changes to acpi.c in where the its caller is.
> You mean drivers/cxl/acpi.c and not drivers/cxl/core/acpi.c right? 

Yes


> The core one is my mistake and I'm going to create a patch to remove that.


Completely missed its existence - thank you for catching and cleaning that up!

Thanks
Zhijian


> 
> DJ