[PATCH v8 3/9] dax/hmem: Request cxl_acpi and cxl_pci before walking Soft Reserved ranges

Smita Koralahalli posted 9 patches 1 week, 6 days ago
Posted by Smita Koralahalli 1 week, 6 days ago
From: Dan Williams <dan.j.williams@intel.com>

Ensure cxl_acpi has published CXL Window resources before HMEM walks Soft
Reserved ranges.

Replace MODULE_SOFTDEP("pre: cxl_acpi") with an explicit, synchronous
request_module("cxl_acpi"). MODULE_SOFTDEP() only guarantees eventual
loading; it does not enforce that the dependency has finished init
before the current module runs. This can cause HMEM to start before
cxl_acpi has populated the resource tree, breaking detection of overlaps
between Soft Reserved and CXL Windows.

Also, request cxl_pci before HMEM walks Soft Reserved ranges. Unlike
cxl_acpi, cxl_pci attach is asynchronous and creates dependent devices
that trigger further module loads. Asynchronous probe flushing
(wait_for_device_probe()) is added later in the series in a deferred
context before HMEM makes ownership decisions for Soft Reserved ranges.

Add an additional explicit Kconfig ordering so that CXL_ACPI and CXL_PCI
must be initialized before DEV_DAX_HMEM. This prevents HMEM from consuming
Soft Reserved ranges before CXL drivers have had a chance to claim them.

Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
---
 drivers/dax/Kconfig     |  2 ++
 drivers/dax/hmem/hmem.c | 17 ++++++++++-------
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/dax/Kconfig b/drivers/dax/Kconfig
index d656e4c0eb84..3683bb3f2311 100644
--- a/drivers/dax/Kconfig
+++ b/drivers/dax/Kconfig
@@ -48,6 +48,8 @@ config DEV_DAX_CXL
 	tristate "CXL DAX: direct access to CXL RAM regions"
 	depends on CXL_BUS && CXL_REGION && DEV_DAX
 	default CXL_REGION && DEV_DAX
+	depends on CXL_ACPI >= DEV_DAX_HMEM
+	depends on CXL_PCI >= DEV_DAX_HMEM
 	help
 	  CXL RAM regions are either mapped by platform-firmware
 	  and published in the initial system-memory map as "System RAM", mapped
diff --git a/drivers/dax/hmem/hmem.c b/drivers/dax/hmem/hmem.c
index a3d45032355c..85e751675f65 100644
--- a/drivers/dax/hmem/hmem.c
+++ b/drivers/dax/hmem/hmem.c
@@ -145,6 +145,16 @@ static __init int dax_hmem_init(void)
 {
 	int rc;
 
+	/*
+	 * Ensure that cxl_acpi and cxl_pci have a chance to kick off
+	 * CXL topology discovery at least once before scanning the
+	 * iomem resource tree for IORES_DESC_CXL resources.
+	 */
+	if (IS_ENABLED(CONFIG_DEV_DAX_CXL)) {
+		request_module("cxl_acpi");
+		request_module("cxl_pci");
+	}
+
 	rc = platform_driver_register(&dax_hmem_platform_driver);
 	if (rc)
 		return rc;
@@ -165,13 +175,6 @@ static __exit void dax_hmem_exit(void)
 module_init(dax_hmem_init);
 module_exit(dax_hmem_exit);
 
-/* Allow for CXL to define its own dax regions */
-#if IS_ENABLED(CONFIG_CXL_REGION)
-#if IS_MODULE(CONFIG_CXL_ACPI)
-MODULE_SOFTDEP("pre: cxl_acpi");
-#endif
-#endif
-
 MODULE_ALIAS("platform:hmem*");
 MODULE_ALIAS("platform:hmem_platform*");
 MODULE_DESCRIPTION("HMEM DAX: direct access to 'specific purpose' memory");
-- 
2.17.1
Re: [PATCH v8 3/9] dax/hmem: Request cxl_acpi and cxl_pci before walking Soft Reserved ranges
Posted by Dan Williams 1 week, 5 days ago
Smita Koralahalli wrote:
> From: Dan Williams <dan.j.williams@intel.com>
> 
> Ensure cxl_acpi has published CXL Window resources before HMEM walks Soft
> Reserved ranges.
> 
> Replace MODULE_SOFTDEP("pre: cxl_acpi") with an explicit, synchronous
> request_module("cxl_acpi"). MODULE_SOFTDEP() only guarantees eventual
> loading, it does not enforce that the dependency has finished init
> before the current module runs. This can cause HMEM to start before
> cxl_acpi has populated the resource tree, breaking detection of overlaps
> between Soft Reserved and CXL Windows.
> 
> Also, request cxl_pci before HMEM walks Soft Reserved ranges. Unlike
> cxl_acpi, cxl_pci attach is asynchronous and creates dependent devices
> that trigger further module loads. Asynchronous probe flushing
> (wait_for_device_probe()) is added later in the series in a deferred
> context before HMEM makes ownership decisions for Soft Reserved ranges.
> 
> Add an additional explicit Kconfig ordering so that CXL_ACPI and CXL_PCI
> must be initialized before DEV_DAX_HMEM. This prevents HMEM from consuming
> Soft Reserved ranges before CXL drivers have had a chance to claim them.
> 
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
> Reviewed-by: Alison Schofield <alison.schofield@intel.com>
> ---
>  drivers/dax/Kconfig     |  2 ++
>  drivers/dax/hmem/hmem.c | 17 ++++++++++-------
>  2 files changed, 12 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/dax/Kconfig b/drivers/dax/Kconfig
> index d656e4c0eb84..3683bb3f2311 100644
> --- a/drivers/dax/Kconfig
> +++ b/drivers/dax/Kconfig
> @@ -48,6 +48,8 @@ config DEV_DAX_CXL
>  	tristate "CXL DAX: direct access to CXL RAM regions"
>  	depends on CXL_BUS && CXL_REGION && DEV_DAX
>  	default CXL_REGION && DEV_DAX
> +	depends on CXL_ACPI >= DEV_DAX_HMEM
> +	depends on CXL_PCI >= DEV_DAX_HMEM

As I learned from Keith's recent CXL_PMEM dependency fix for CXL_ACPI
[1], this wants to be:

depends on DEV_DAX_HMEM || !DEV_DAX_HMEM
depends on CXL_ACPI || !CXL_ACPI
depends on CXL_PCI || !CXL_PCI

...to make sure that DEV_DAX_CXL can never be built-in unless all of its
dependencies are built-in.

[1]: http://lore.kernel.org/69aa341fcf526_6423c1002c@dwillia2-mobl4.notmuch

At this point I am wondering if all of the feedback I have for this
series should just be incremental fixes. I also want to have a canned
unit test that verifies the base expectations. That can also be
something I reply incrementally.
Re: [PATCH v8 3/9] dax/hmem: Request cxl_acpi and cxl_pci before walking Soft Reserved ranges
Posted by Koralahalli Channabasappa, Smita 1 week, 5 days ago
Hi Dan,

On 3/23/2026 12:54 PM, Dan Williams wrote:
> Smita Koralahalli wrote:
>> From: Dan Williams <dan.j.williams@intel.com>
>>
>> Ensure cxl_acpi has published CXL Window resources before HMEM walks Soft
>> Reserved ranges.
>>
>> Replace MODULE_SOFTDEP("pre: cxl_acpi") with an explicit, synchronous
>> request_module("cxl_acpi"). MODULE_SOFTDEP() only guarantees eventual
>> loading, it does not enforce that the dependency has finished init
>> before the current module runs. This can cause HMEM to start before
>> cxl_acpi has populated the resource tree, breaking detection of overlaps
>> between Soft Reserved and CXL Windows.
>>
>> Also, request cxl_pci before HMEM walks Soft Reserved ranges. Unlike
>> cxl_acpi, cxl_pci attach is asynchronous and creates dependent devices
>> that trigger further module loads. Asynchronous probe flushing
>> (wait_for_device_probe()) is added later in the series in a deferred
>> context before HMEM makes ownership decisions for Soft Reserved ranges.
>>
>> Add an additional explicit Kconfig ordering so that CXL_ACPI and CXL_PCI
>> must be initialized before DEV_DAX_HMEM. This prevents HMEM from consuming
>> Soft Reserved ranges before CXL drivers have had a chance to claim them.
>>
>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
>> Reviewed-by: Dave Jiang <dave.jiang@intel.com>
>> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
>> Reviewed-by: Alison Schofield <alison.schofield@intel.com>
>> ---
>>   drivers/dax/Kconfig     |  2 ++
>>   drivers/dax/hmem/hmem.c | 17 ++++++++++-------
>>   2 files changed, 12 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/dax/Kconfig b/drivers/dax/Kconfig
>> index d656e4c0eb84..3683bb3f2311 100644
>> --- a/drivers/dax/Kconfig
>> +++ b/drivers/dax/Kconfig
>> @@ -48,6 +48,8 @@ config DEV_DAX_CXL
>>   	tristate "CXL DAX: direct access to CXL RAM regions"
>>   	depends on CXL_BUS && CXL_REGION && DEV_DAX
>>   	default CXL_REGION && DEV_DAX
>> +	depends on CXL_ACPI >= DEV_DAX_HMEM
>> +	depends on CXL_PCI >= DEV_DAX_HMEM
> 
> As I learned from Keith's recent CXL_PMEM dependency fix for CXL_ACPI
> [1], this wants to be:
> 
> depends on DEV_DAX_HMEM || !DEV_DAX_HMEM
> depends on CXL_ACPI || !CXL_ACPI
> depends on CXL_PCI || !CXL_PCI
> 
> ...to make sure that DEV_DAX_CXL can never be built-in unless all of its
> dependencies are built-in.
> 
> [1]: http://lore.kernel.org/69aa341fcf526_6423c1002c@dwillia2-mobl4.notmuch
> 
> At this point I am wondering if all of the feedback I have for this
> series should just be incremental fixes. I also want to have a canned
> unit test that verifies the base expectations. That can also be
> something I reply incrementally.

Two things on the Kconfig change:

When DEV_DAX_HMEM = y and CXL_ACPI = m and CXL_PCI = m

1. Regarding switching from >= to || ! pattern:

The >= pattern disabled DEV_DAX_CXL entirely when DEV_DAX_HMEM = y and 
CXL_ACPI/CXL_PCI = m. So, HMEM unconditionally owned all ranges - the 
CXL deferral path is never entered.

With the || ! pattern, DEV_DAX_CXL is enabled, which makes the
ownership behavior depend on when the CXL_ACPI/CXL_PCI probes start.

On my system I see:

   [  7.379] dax_hmem_platform_probe began
   [  7.384] alloc_dev_dax_range: dax0.0
   [ 28.560] cxl acpi probe started     <- 21 seconds later

HMEM ends up owning the ranges in this case because the CXL windows
aren't published yet when HMEM probes (built-in code runs before modules
load, and request_module() might not work this early?), so
region_intersects() returns REGION_DISJOINT for all CXL ranges.

But it could go the other way if the CXL ACPI and PCI probes start
before the deferred work is queued in HMEM. (I think this is the
expected path if DEV_DAX_CXL is enabled.)

Do you think this is okay for now with the resource exclusion handling?

2. Separate build issue with DEV_DAX_HMEM = y,  CXL_BUS/ACPI/PCI = m and
CXL_REGION = y.

I hit a build error while testing the above config (sorry, I should
have checked this config earlier).

When DEV_DAX_HMEM = y and the CXL core is built as a module, hmem.c
calls cxl_region_contains_resource(), which lives in cxl_core.ko,
causing an undefined reference at link time.

This happens with both the >= and || ! Kconfig patterns.

The current #ifdef CONFIG_CXL_REGION guard in include/cxl/cxl.h
evaluates to true even when the code behind CXL_REGION is built into a
module. Changing the guard to check reachability of that module worked
for me to overcome the error:

-#ifdef CONFIG_CXL_REGION
+#if IS_REACHABLE(CONFIG_CXL_BUS) && defined(CONFIG_CXL_REGION)
bool cxl_region_contains_resource(struct resource *res);
#else
...

I'm not sure whether CONFIG_CXL_BUS is the right check here, or whether
it should check CXL_ACPI or CXL_PCI more specifically.

Thanks
Smita
Re: [PATCH v8 3/9] dax/hmem: Request cxl_acpi and cxl_pci before walking Soft Reserved ranges
Posted by Dan Williams 1 week, 4 days ago
Koralahalli Channabasappa, Smita wrote:
[..]
> > As I learned from Keith's recent CXL_PMEM dependency fix for CXL_ACPI
> > [1], this wants to be:
> > 
> > depends on DEV_DAX_HMEM || !DEV_DAX_HMEM
> > depends on CXL_ACPI || !CXL_ACPI
> > depends on CXL_PCI || !CXL_PCI
> > 
> > ...to make sure that DEV_DAX_CXL can never be built-in unless all of its
> > dependencies are built-in.
> > 
> > [1]: http://lore.kernel.org/69aa341fcf526_6423c1002c@dwillia2-mobl4.notmuch
> > 
> > At this point I am wondering if all of the feedback I have for this
> > series should just be incremental fixes. I also want to have a canned
> > unit test that verifies the base expectations. That can also be
> > something I reply incrementally.
> 
> Two things on the Kconfig change:
> 
> When DEV_DAX_HMEM = y and CXL_ACPI = m and CXL_PCI = m

Right, this should not be possible. The patch I am testing moves the
optional CXL dependencies to DEV_DAX_HMEM where they belong. I
mistakenly showed them against DEV_DAX_CXL in my comment.

> 1. Regarding switching from >= to || ! pattern:
> 
> The >= pattern disabled DEV_DAX_CXL entirely when DEV_DAX_HMEM = y and 
> CXL_ACPI/CXL_PCI = m. So, HMEM unconditionally owned all ranges - the 
> CXL deferral path is never entered.

That is one of the broken configurations to fix. It should never be
possible to set DEV_DAX_HMEM=y unless CXL_ACPI and CXL_PCI are both
disabled or both built-in.

> When DEV_DAX_HMEM = y and CXL core is built as a module hmem.c calls 
> cxl_region_contains_resource() which lives in cxl_core.ko causing an 
> undefined reference at link time.

Yes, I hit this as well; it requires another CXL_BUS dependency.