From: Zijun Hu <quic_zijuhu@quicinc.com>
Assuming all child switch decoders are sorted by ID in ascending
order, the current match_free_decoder() logic for finding a free
cxl decoder is wrong, as illustrated below:
Port
├── cxld A <----> region A
├── cxld B // no region
├── cxld C <----> region C
The current logic picks cxld B as a free cxld. That is wrong: region C
has not been torn down yet, so cxld B cannot be regarded as free.
Fix this by finding a genuinely free cxld by ID instead of by region state.
Link: https://lore.kernel.org/all/66e08f9beb6a2_326462945d@dwillia2-xfh.jf.intel.com.notmuch/
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>
---
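For illustration only, the scenario above can be modelled in userspace.
This is a minimal sketch, not kernel code: struct model_port, old_pick()
and new_pick() are hypothetical names, the busy[] array stands in for
both cxld->region and the decoder_alloc bitmap, and the plain loops stand
in for device_find_child() and find_last_bit().

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_DECODER_NR_MAX 32	/* mirrors CXL_DECODER_NR_MAX in the patch */

/* Hypothetical stand-in for a port and its switch decoders. */
struct model_port {
	int nr_decoders;			/* decoders present, IDs 0..nr_decoders-1 */
	bool busy[MODEL_DECODER_NR_MAX];	/* true while a region is attached */
};

/* Old behaviour: the first decoder, in ID order, without a region. */
static int old_pick(const struct model_port *port)
{
	assert(port->nr_decoders <= MODEL_DECODER_NR_MAX);

	for (int i = 0; i < port->nr_decoders; i++)
		if (!port->busy[i])
			return i;
	return -1;
}

/* New behaviour: only the ID right after the highest busy decoder. */
static int new_pick(const struct model_port *port)
{
	int last = -1;	/* -1 models "no bit set" */
	int id;

	assert(port->nr_decoders <= MODEL_DECODER_NR_MAX);

	for (int i = 0; i < MODEL_DECODER_NR_MAX; i++)
		if (port->busy[i])
			last = i;

	id = last + 1;
	if (id >= MODEL_DECODER_NR_MAX)
		return -1;	/* highest possible ID is already busy */
	if (id >= port->nr_decoders)
		return -1;	/* models the child lookup finding no such ID */
	return id;
}
```

With decoders 0 and 2 busy and decoder 1 free, old_pick() returns 1
while new_pick() returns 3 on a four-decoder port and -1 on a
three-decoder port.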
Changes in v5:
- Use changes suggested by Dan Williams for cxl/region patch
- Correct title and commit message for cxl/region change
- Separate qcom/emac change for patch series
- Link to v4: https://lore.kernel.org/r/20240905-const_dfc_prepare-v4-0-4180e1d5a244@quicinc.com
Changes in v4:
- Drop driver core patch
- Correct commit message for cxl/region patch
- Correct title and commit message for qcom/emac patch
- Link to v3: https://lore.kernel.org/r/20240824-const_dfc_prepare-v3-0-32127ea32bba@quicinc.com
Changes in v3:
- Git rebase
- Correct commit message for the driver core patch
- Use changes suggested by Ira Weiny for cxl/region
- Drop firewire core patch
- Make qcom/emac follow cxl/region solution suggested by Greg
- Link to v2: https://lore.kernel.org/r/20240815-const_dfc_prepare-v2-0-8316b87b8ff9@quicinc.com
Changes in v2:
- Give up introducing the API constify_device_find_child_helper()
- Correct commit message and inline comments
- Implement a driver specific and equivalent one instead of device_find_child()
- Link to v1: https://lore.kernel.org/r/20240811-const_dfc_prepare-v1-0-d67cc416b3d3@quicinc.com
---
drivers/cxl/core/port.c | 3 ++-
drivers/cxl/core/region.c | 43 ++++++++++++++++++++++++++-----------------
drivers/cxl/cxl.h | 5 +++++
3 files changed, 33 insertions(+), 18 deletions(-)
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 1d5007e3795a..749a281819b4 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1750,7 +1750,8 @@ static int cxl_decoder_init(struct cxl_port *port, struct cxl_decoder *cxld)
struct device *dev;
int rc;
- rc = ida_alloc(&port->decoder_ida, GFP_KERNEL);
+ rc = ida_alloc_max(&port->decoder_ida, CXL_DECODER_NR_MAX - 1,
+ GFP_KERNEL);
if (rc < 0)
return rc;
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 21ad5f242875..d3e191ae3c20 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -794,26 +794,16 @@ static size_t show_targetN(struct cxl_region *cxlr, char *buf, int pos)
return rc;
}
-static int match_free_decoder(struct device *dev, void *data)
+static int match_decoder_id(struct device *dev, void *data)
{
struct cxl_decoder *cxld;
- int *id = data;
+ int id = *(int *)data;
if (!is_switch_decoder(dev))
return 0;
cxld = to_cxl_decoder(dev);
-
- /* enforce ordered allocation */
- if (cxld->id != *id)
- return 0;
-
- if (!cxld->region)
- return 1;
-
- (*id)++;
-
- return 0;
+ return cxld->id == id;
}
static int match_auto_decoder(struct device *dev, void *data)
@@ -840,16 +830,31 @@ cxl_region_find_decoder(struct cxl_port *port,
struct cxl_region *cxlr)
{
struct device *dev;
- int id = 0;
if (port == cxled_to_port(cxled))
return &cxled->cxld;
- if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags))
+ if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) {
dev = device_find_child(&port->dev, &cxlr->params,
match_auto_decoder);
- else
- dev = device_find_child(&port->dev, &id, match_free_decoder);
+ } else {
+ int id, last;
+
+ /*
+ * Find next available decoder, but fail new decoder
+ * allocations if out-of-order region destruction has
+ * occurred
+ */
+ last = find_last_bit(port->decoder_alloc, CXL_DECODER_NR_MAX);
+ if (last >= CXL_DECODER_NR_MAX)
+ id = 0;
+ else if (last + 1 < CXL_DECODER_NR_MAX)
+ id = last + 1;
+ else
+ return NULL;
+
+ dev = device_find_child(&port->dev, &id, match_decoder_id);
+ }
if (!dev)
return NULL;
/*
@@ -943,6 +948,9 @@ static void cxl_rr_free_decoder(struct cxl_region_ref *cxl_rr)
dev_WARN_ONCE(&cxlr->dev, cxld->region != cxlr, "region mismatch\n");
if (cxld->region == cxlr) {
+ struct cxl_port *port = to_cxl_port(cxld->dev.parent);
+
+ clear_bit(cxld->id, port->decoder_alloc);
cxld->region = NULL;
put_device(&cxlr->dev);
}
@@ -977,6 +985,7 @@ static int cxl_rr_ep_add(struct cxl_region_ref *cxl_rr,
cxl_rr->nr_eps++;
if (!cxld->region) {
+ set_bit(cxld->id, port->decoder_alloc);
cxld->region = cxlr;
get_device(&cxlr->dev);
}
diff --git a/drivers/cxl/cxl.h b/drivers/cxl/cxl.h
index 9afb407d438f..750cd027d0b0 100644
--- a/drivers/cxl/cxl.h
+++ b/drivers/cxl/cxl.h
@@ -578,6 +578,9 @@ struct cxl_dax_region {
struct range hpa_range;
};
+/* Max as of CXL 3.1 (8.2.4.20.1 CXL HDM Decoder Capability Register) */
+#define CXL_DECODER_NR_MAX 32
+
/**
* struct cxl_port - logical collection of upstream port devices and
* downstream port devices to construct a CXL memory
@@ -591,6 +594,7 @@ struct cxl_dax_region {
* @regions: cxl_region_ref instances, regions mapped by this port
* @parent_dport: dport that points to this port in the parent
* @decoder_ida: allocator for decoder ids
+ * @decoder_alloc: decoder busy/free (@cxld->region set) bitmap
* @reg_map: component and ras register mapping parameters
* @nr_dports: number of entries in @dports
* @hdm_end: track last allocated HDM decoder instance for allocation ordering
@@ -611,6 +615,7 @@ struct cxl_port {
struct xarray regions;
struct cxl_dport *parent_dport;
struct ida decoder_ida;
+ DECLARE_BITMAP(decoder_alloc, CXL_DECODER_NR_MAX);
struct cxl_register_map reg_map;
int nr_dports;
int hdm_end;
---
base-commit: 6a36d828bdef0e02b1e6c12e2160f5b83be6aab5
change-id: 20240811-const_dfc_prepare-3ff23c6598e5
Best regards,
--
Zijun Hu <quic_zijuhu@quicinc.com>
Zijun Hu wrote:
> From: Zijun Hu <quic_zijuhu@quicinc.com>
>
> Provided that all child switch decoders are sorted by ID in ascending
> order, then it is wrong for current match_free_decoder()'s logic to
> find a free cxl decoder as explained below:
>
> Port
> ├── cxld A <----> region A
> ├── cxld B // no region
> ├── cxld C <----> region C
>
> Current logic will find cxld B as a free cxld, that is wrong since
> region C has not been torn down, so can not regard cxld B as free.
>
> Fixed by finding a real free clxd by ID instead of region state.
>
> Link: https://lore.kernel.org/all/66e08f9beb6a2_326462945d@dwillia2-xfh.jf.intel.com.notmuch/
> Suggested-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com>

Thanks for this.

Now, on the way to recommending this I had the thought that another way
to do this would be to assume that the next decoder to allocate is
"port->commit_end + 1" and then double check that no previous decoder is
disabled. I put that idea aside because I did not want to depend on new
child device iteration infrastructure.

However, it turns out that since this suggestion, I have found that the
current behavior of cxl_region_detach(), where it early exits upon
detecting out-of-order shutdown, leads to use after free bugs. Part of
the fix for that is introducing a new
device_for_each_child_reverse_from() helper that is identical to
device_for_each_child_reverse(), but takes a starting child device for
the iteration.

With that, this allocator would not need to walk forward through the
list, it could just start at port->commit_end + 1, and then walk
backward to make sure no decoders are idle.

Let me post that fix, Cc: you, and then you can fix this allocation
order validation to use device_for_each_child_reverse_from() rather than
add more decoder_alloc tracking logic to 'struct cxl_port' since
port->commit_end is already sufficient.
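A rough userspace sketch of the "commit_end + 1, then walk backward"
idea, for illustration only. Everything here is an assumption:
struct model_cxl_port and pick_from_commit_end() are hypothetical names,
commit_end is taken to be -1 when nothing is committed, "idle" is taken
to mean "no region attached", and a plain descending loop stands in for
device_for_each_child_reverse_from(), whose signature is not given in
this thread.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_DECODER_MAX 32

/* Hypothetical stand-in for the port state the allocator consults. */
struct model_cxl_port {
	int nr_decoders;		/* decoders present, IDs 0..nr_decoders-1 */
	int commit_end;			/* ID of last committed decoder, -1 if none */
	bool busy[MODEL_DECODER_MAX];	/* true while a region is attached */
};

/*
 * Candidate is commit_end + 1. Walk backward from the candidate and
 * refuse the allocation if any lower-numbered decoder is idle, which
 * would indicate out-of-order region destruction.
 */
static int pick_from_commit_end(const struct model_cxl_port *port)
{
	int id = port->commit_end + 1;

	assert(port->nr_decoders <= MODEL_DECODER_MAX);

	if (id >= port->nr_decoders)
		return -1;	/* no decoder left to hand out */

	for (int i = id - 1; i >= 0; i--)
		if (!port->busy[i])
			return -1;	/* hole below the candidate */

	return id;
}
```

In this model no extra bitmap is needed: commit_end supplies the
starting point and the backward walk only has to confirm that there are
no holes below it.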