[PATCH v6] cxl/region: Fix wrong logic for finding a free switch cxl decoder

Zijun Hu posted 1 patch 1 month, 1 week ago
drivers/cxl/core/region.c | 35 +++++++++++++++++++++++------------
1 file changed, 23 insertions(+), 12 deletions(-)
[PATCH v6] cxl/region: Fix wrong logic for finding a free switch cxl decoder
Posted by Zijun Hu 1 month, 1 week ago
From: Zijun Hu <quic_zijuhu@quicinc.com>

Provided that all child switch cxl decoders are sorted by ID in ascending
order, then it is wrong for current match_free_decoder()'s logic to find
a free cxl decoder as explained below:

Port
 ├── cxld A        <---->  region A
 ├── cxld B        // no region
 ├── cxld C        <---->  region C

Current logic will find cxld B as a free one, but cxld B is not true
free since region C has not been torn down, so current logic is wrong.

Fixed by verifying if cxl decoder with ID (@port->commit_end + 1) can
be returned as finding result.

Link: https://lore.kernel.org/all/670af54931b8_964fe29427@dwillia2-xfh.jf.intel.com.notmuch/
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
---
This patch is incremental to below patch series with title "cxl: Initialization and shutdown fixes"
http://lore.kernel.org/172862483180.2150669.5564474284074502692.stgit@dwillia2-xfh.jf.intel.com

This patch is based on Dan's recommendation shown by below link:
https://lore.kernel.org/all/670af54931b8_964fe29427@dwillia2-xfh.jf.intel.com.notmuch/
---
Changes in v6:
- Use Dan's recommendation that validate clxd with ID (@port->commit_end + 1)
- Correct commit message and update Link tag this commit derive from
- Link to v5: https://lore.kernel.org/r/20240917-const_dfc_prepare-v5-1-0e20f673ee0c@quicinc.com

Changes in v5:
- Use changes suggested by Dan Williams for cxl/region patch
- Correct title and commit message for cxl/region change
- Seperate qcom/emac change for patch series
- Link to v4: https://lore.kernel.org/r/20240905-const_dfc_prepare-v4-0-4180e1d5a244@quicinc.com

Changes in v4:
- Drop driver core patch
- Correct commit message for cxl/region patch
- Correct title and commit message for qcom/emac patch
- Link to v3: https://lore.kernel.org/r/20240824-const_dfc_prepare-v3-0-32127ea32bba@quicinc.com

Changes in v3:
- Git rebase
- Correct commit message for the driver core patch
- Use changes suggested by Ira Weiny cxl/region
- Drop firewire core patch
- Make qcom/emac follow cxl/region solution suggested by Greg
- Link to v2: https://lore.kernel.org/r/20240815-const_dfc_prepare-v2-0-8316b87b8ff9@quicinc.com

Changes in v2:
- Give up introducing the API constify_device_find_child_helper()
- Correct commit message and inline comments
- Implement a driver specific and equivalent one instead of device_find_child()
- Link to v1: https://lore.kernel.org/r/20240811-const_dfc_prepare-v1-0-d67cc416b3d3@quicinc.com
---
 drivers/cxl/core/region.c | 35 +++++++++++++++++++++++------------
 1 file changed, 23 insertions(+), 12 deletions(-)

diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index e701e4b04032..a7a82da39393 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -796,8 +796,9 @@ static size_t show_targetN(struct cxl_region *cxlr, char *buf, int pos)
 
 static int match_free_decoder(struct device *dev, void *data)
 {
+	struct cxl_port *port = to_cxl_port(dev->parent);
 	struct cxl_decoder *cxld;
-	int *id = data;
+	struct device **target_dev = data;
 
 	if (!is_switch_decoder(dev))
 		return 0;
@@ -805,15 +806,24 @@ static int match_free_decoder(struct device *dev, void *data)
 	cxld = to_cxl_decoder(dev);
 
 	/* enforce ordered allocation */
-	if (cxld->id != *id)
-		return 0;
+	if (cxld->id == port->commit_end + 1) {
+		if (!cxld->region) {
+			*target_dev = dev;
+			return 1;
+		}
 
-	if (!cxld->region)
-		return 1;
+		dev_dbg(dev->parent,
+			"%s: cxld '%s' was reserved by '%s'\n", __func__,
+			dev_name(dev), dev_name(&cxld->region->dev));
+		return -EBUSY;
+	}
 
-	(*id)++;
+	if (cxld->flags & CXL_DECODER_F_ENABLE)
+		return 0;
 
-	return 0;
+	dev_dbg(dev->parent, "%s: cxld '%s' was out of order shut down\n",
+		__func__, dev_name(dev));
+	return -EINVAL;
 }
 
 static int match_auto_decoder(struct device *dev, void *data)
@@ -839,17 +849,18 @@ cxl_region_find_decoder(struct cxl_port *port,
 			struct cxl_endpoint_decoder *cxled,
 			struct cxl_region *cxlr)
 {
-	struct device *dev;
-	int id = 0;
+	struct device *dev = NULL;
 
 	if (port == cxled_to_port(cxled))
 		return &cxled->cxld;
 
-	if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags))
+	if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) {
 		dev = device_find_child(&port->dev, &cxlr->params,
 					match_auto_decoder);
-	else
-		dev = device_find_child(&port->dev, &id, match_free_decoder);
+	} else {
+		device_for_each_child(&port->dev, &dev, match_free_decoder);
+		get_device(dev);
+	}
 	if (!dev)
 		return NULL;
 	/*

---
base-commit: 9bd133f05b1dca5ca4399a76d04d0f6f4d454e44
change-id: 20240811-const_dfc_prepare-3ff23c6598e5

Best regards,
-- 
Zijun Hu <quic_zijuhu@quicinc.com>

Re: [PATCH v6] cxl/region: Fix wrong logic for finding a free switch cxl decoder
Posted by Dan Williams 1 month, 1 week ago
Zijun Hu wrote:
> From: Zijun Hu <quic_zijuhu@quicinc.com>
> 
> Provided that all child switch cxl decoders are sorted by ID in ascending
> order, then it is wrong for current match_free_decoder()'s logic to find
> a free cxl decoder as explained below:
> 
> Port
>  ├── cxld A        <---->  region A
>  ├── cxld B        // no region
>  ├── cxld C        <---->  region C
> 
> Current logic will find cxld B as a free one, but cxld B is not true
> free since region C has not been torn down, so current logic is wrong.
> 
> Fixed by verifying if cxl decoder with ID (@port->commit_end + 1) can
> be returned as finding result.
> 
> Link: https://lore.kernel.org/all/670af54931b8_964fe29427@dwillia2-xfh.jf.intel.com.notmuch/
> Suggested-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
> ---
> This patch is incremental to below patch series with title "cxl: Initialization and shutdown fixes"
> http://lore.kernel.org/172862483180.2150669.5564474284074502692.stgit@dwillia2-xfh.jf.intel.com
> 
> This patch is based on Dan's recommendation shown by below link:
> https://lore.kernel.org/all/670af54931b8_964fe29427@dwillia2-xfh.jf.intel.com.notmuch/

...and Dan and already sent a patch he is happier with here:

http://lore.kernel.org/172895072669.39002.9296583943188706348.stgit@dwillia2-xfh.jf.intel.com
Re: [PATCH v6] cxl/region: Fix wrong logic for finding a free switch cxl decoder
Posted by Zijun Hu 1 month, 1 week ago
On 2024/10/17 07:02, Dan Williams wrote:
> Zijun Hu wrote:
>> From: Zijun Hu <quic_zijuhu@quicinc.com>
>>
>> Provided that all child switch cxl decoders are sorted by ID in ascending
>> order, then it is wrong for current match_free_decoder()'s logic to find
>> a free cxl decoder as explained below:
>>
>> Port
>>  ├── cxld A        <---->  region A
>>  ├── cxld B        // no region
>>  ├── cxld C        <---->  region C
>>
>> Current logic will find cxld B as a free one, but cxld B is not true
>> free since region C has not been torn down, so current logic is wrong.
>>
>> Fixed by verifying if cxl decoder with ID (@port->commit_end + 1) can
>> be returned as finding result.
>>
>> Link: https://lore.kernel.org/all/670af54931b8_964fe29427@dwillia2-xfh.jf.intel.com.notmuch/
>> Suggested-by: Dan Williams <dan.j.williams@intel.com>
>> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
>> ---
>> This patch is incremental to below patch series with title "cxl: Initialization and shutdown fixes"
>> http://lore.kernel.org/172862483180.2150669.5564474284074502692.stgit@dwillia2-xfh.jf.intel.com
>>
>> This patch is based on Dan's recommendation shown by below link:
>> https://lore.kernel.org/all/670af54931b8_964fe29427@dwillia2-xfh.jf.intel.com.notmuch/
> 
> ...and Dan and already sent a patch he is happier with here:
> 
> http://lore.kernel.org/172895072669.39002.9296583943188706348.stgit@dwillia2-xfh.jf.intel.com

this is a optimized one based your recommendation, namely, above link.
please don't hesitate to use it as your V2 if you like this one.

you maybe also have below change when you send V2
Reported-by: Zijun Hu <zijun_hu@icloud.com>
to
Reported-by: Zijun Hu <quic_zijuhu@quicinc.com>

So let us close discussion for this mail thread and go to your above thread.