drivers/cxl/core/region.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
When validating decoder IW/IG when setting up regions, the granularity
is irrelevant when iw=1 - all accesses will always route to the only
target anyway - so all ig values are "correct". Loosen the requirement
that `ig = (parent_iw * parent_ig)` when iw=1.
On some Zen5 platforms, the platform BIOS specifies a 256-byte
interleave granularity window for host bridges when there is only
one target downstream. This leads to Linux rejecting the configuration
of a region with a x2 root with two x1 hostbridges.
Decoder Programming:
root - iw:2 ig:256
hb1 - iw:1 ig:256 (Linux expects 512)
hb2 - iw:1 ig:256 (Linux expects 512)
ep1 - iw:2 ig:256
ep2 - iw:2 ig:256
This change allows all decoders downstream of a passthrough decoder to
also be configured as passthrough (iw:1 ig:X), but still disallows
downstream decoders from applying subsequent interleaves.
e.g. in the above example if there was another decoder south of hb1
attempting to interleave 2 endpoints - Linux would enforce hb1.ig=512
because the southern decoder would have iw:2 and require ig=pig*piw.
Signed-off-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
---
drivers/cxl/core/region.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
index 04bc6cad092c..dec262eadf9a 100644
--- a/drivers/cxl/core/region.c
+++ b/drivers/cxl/core/region.c
@@ -1553,7 +1553,7 @@ static int cxl_port_setup_targets(struct cxl_port *port,
if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) {
if (cxld->interleave_ways != iw ||
- cxld->interleave_granularity != ig ||
+ (iw > 1 && cxld->interleave_granularity != ig) ||
cxled->spa_range.start != p->res->start ||
cxled->spa_range.end != p->res->end ||
((cxld->flags & CXL_DECODER_F_ENABLE) == 0)) {
--
2.47.1
Gregory Price wrote: > When validating decoder IW/IG when setting up regions, the granularity > is irrelevant when iw=1 - all accesses will always route to the only > target anyway - so all ig values are "correct". Loosen the requirement > that `ig = (parent_iw * parent_ig)` when iw=1. > > On some Zen5 platforms, the platform BIOS specifies a 256-byte > interleave granularity window for host bridges when there is only > one target downstream. This leads to Linux rejecting the configuration > of a region with a x2 root with two x1 hostbridges. > > Decoder Programming: > root - iw:2 ig:256 > hb1 - iw:1 ig:256 (Linux expects 512) > hb2 - iw:1 ig:256 (Linux expects 512) > ep1 - iw:2 ig:256 > ep2 - iw:2 ig:256 > > This change allows all decoders downstream of a passthrough decoder to > also be configured as passthrough (iw:1 ig:X), but still disallows > downstream decoders from applying subsequent interleaves. > > e.g. in the above example if there was another decoder south of hb1 > attempting to interleave 2 endpoints - Linux would enforce hb1.ig=512 > because the southern decoder would have iw:2 and require ig=pig*piw. > > Signed-off-by: Gregory Price <gourry@gourry.net> > Reviewed-by: Dave Jiang <dave.jiang@intel.com> Thanks for the respin! Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Hi Gregory and CXL community Cc Goto-san Wow, our platform has encountered a similar issue, and I am intending to consult the community regarding this matter. I drafted similar patch locally, but I wonder if we should "ignore" the IG or "program" the IG to the decoder. Let me post the mail(question) from my drafts in your thread(I hope you I hope you won't mind). ====================================== [Question] granularity is a don't care if not interleaving? I saw this sentence " granularity is a don't care if not interleaving" in this patch "[ndctl PATCH 2/6] cxl/list: Add interleave parameters to decoder listings" [1] This reminds me that our platform programed an unmatched interleave_granularity in HDM decoders between endpoint and the host-bridge, see below: CXL Root CFMW0 / \ CFMW1 decoder0.0 decoder0.1 128 GiB IW:1 IG:256 IW:1 IG:256 128 GiB \ / Host Bridge / \ decoder5.0 decoder5.1 IW:1 IG:256 IW:1 IG:256 / \ Endpoint9 Endpoint10 | | decoder9.0 decoder10.0 IW:1 IG:1024 IW:1 IG:1024 With this setup, the Linux kernel attempts to create regions for Endpoint9 and Endpoint10 but fails because the endpoint decoders’ interleave granularity (IG=1024) does not match their parent decoders’ IG (256). Ideally, the endpoint decoders are expected to be configured for IG=256. Currently, we learned that we have only special handling for the root decoders [2][3]. My question are: Q1: whether "granularity is a don't care if not interleaving" is applied to all HDM decoders(including root decoder and HDM decoder) In current cxl cli , it will not show any interleave_granularity at all when ways==1(no-interleaving) $ cxl list -PDE | grep granularity # show nothing when ways==1 Per the CXL Spec r3.1 IG: "The number of consecutive bytes that are assigned to each target in the Target List." Q2: Does this imply a configuration where the number of ways>1? Q3: Does the IG also represent the device's capabilities? When programming, should one also consider whether the device supports it? If "granularity is a don't care if not interleaving" is true, how about below changes diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c index 75cd5dbb41e4..647fe2ce18ca 100644 --- a/drivers/cxl/core/region.c +++ b/drivers/cxl/core/region.c @@ -1435,6 +1435,11 @@ static int cxl_port_setup_targets(struct cxl_port *port, + if (cxld->interleave_ways == 1 && cxld->interleave_granularity != ig) { + cxld->interleave_granularity = ig; + /* program HDM decoder */ + ... + } if (cxld->interleave_ways != iw || cxld->interleave_granularity != ig || cxld->hpa_range.start != p->res->start || [1] https://lore.kernel.org/all/165973188300.1528532.222988685552982872.stgit@dwillia2-xfh.jf.intel.com/ [2] https://lore.kernel.org/all/165853776917.2430596.16823264262010844458.stgit@dwillia2-xfh.jf.intel.com/ [3] https://lore.kernel.org/all/169824893473.1403938.16110924262989774582.stgit@bgt-140510-bm03.eng.stellus.in/ Thanks Zhijian On 03/04/2025 07:25, Gregory Price wrote: > When validating decoder IW/IG when setting up regions, the granularity > is irrelevant when iw=1 - all accesses will always route to the only > target anyway - so all ig values are "correct". Loosen the requirement > that `ig = (parent_iw * parent_ig)` when iw=1. > > On some Zen5 platforms, the platform BIOS specifies a 256-byte > interleave granularity window for host bridges when there is only > one target downstream. This leads to Linux rejecting the configuration > of a region with a x2 root with two x1 hostbridges. > > Decoder Programming: > root - iw:2 ig:256 > hb1 - iw:1 ig:256 (Linux expects 512) > hb2 - iw:1 ig:256 (Linux expects 512) > ep1 - iw:2 ig:256 > ep2 - iw:2 ig:256 > > This change allows all decoders downstream of a passthrough decoder to > also be configured as passthrough (iw:1 ig:X), but still disallows > downstream decoders from applying subsequent interleaves. > > e.g. in the above example if there was another decoder south of hb1 > attempting to interleave 2 endpoints - Linux would enforce hb1.ig=512 > because the southern decoder would have iw:2 and require ig=pig*piw. > > Signed-off-by: Gregory Price <gourry@gourry.net> > Reviewed-by: Dave Jiang <dave.jiang@intel.com> > --- > drivers/cxl/core/region.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c > index 04bc6cad092c..dec262eadf9a 100644 > --- a/drivers/cxl/core/region.c > +++ b/drivers/cxl/core/region.c > @@ -1553,7 +1553,7 @@ static int cxl_port_setup_targets(struct cxl_port *port, > > if (test_bit(CXL_REGION_F_AUTO, &cxlr->flags)) { > if (cxld->interleave_ways != iw || > - cxld->interleave_granularity != ig || > + (iw > 1 && cxld->interleave_granularity != ig) || > cxled->spa_range.start != p->res->start || > cxled->spa_range.end != p->res->end || > ((cxld->flags & CXL_DECODER_F_ENABLE) == 0)) {
Zhijian Li (Fujitsu) wrote: > Hi Gregory and CXL community > Cc Goto-san > > Wow, our platform has encountered a similar issue, and I am intending to consult > the community regarding this matter. > > I drafted similar patch locally, but I wonder if we should "ignore" the IG > or "program" the IG to the decoder. > > Let me post the mail(question) from my drafts in your thread(I hope you I hope you won't mind). > ====================================== > [Question] granularity is a don't care if not interleaving? > I saw this sentence " granularity is a don't care if not interleaving" in this > patch "[ndctl PATCH 2/6] cxl/list: Add interleave parameters to decoder listings" [1] > > This reminds me that our platform programed an unmatched interleave_granularity in HDM decoders > between endpoint and the host-bridge, see below: > > CXL Root > CFMW0 / \ CFMW1 > decoder0.0 decoder0.1 > 128 GiB IW:1 IG:256 IW:1 IG:256 128 GiB > \ / > Host Bridge > / \ > decoder5.0 decoder5.1 > IW:1 IG:256 IW:1 IG:256 > / \ > Endpoint9 Endpoint10 > | | > decoder9.0 decoder10.0 > IW:1 IG:1024 IW:1 IG:1024 Why 1024? Yes, the value does not matter, but attempting 1024 feels more like a unit test than a production use case. > > With this setup, the Linux kernel attempts to create regions for Endpoint9 and Endpoint10 > but fails because the endpoint decoders’ interleave granularity (IG=1024) does not > match their parent decoders’ IG (256). Ideally, the endpoint decoders are expected to be > configured for IG=256. > > Currently, we learned that we have only special handling for the root decoders [2][3]. > > My question are: > Q1: whether "granularity is a don't care if not interleaving" is applied to > all HDM decoders(including root decoder and HDM decoder) All decoders. > In current cxl cli , it will not show any interleave_granularity at all when ways==1(no-interleaving) > $ cxl list -PDE | grep granularity # show nothing when ways==1 Right, because the value theoretically has no functional impact in the ways==1 case. However, it errantly ends up having practical impact in these corners cases where code performs granularity comparisons without considering that ways may be 1. > Per the CXL Spec r3.1 > IG: "The number of consecutive bytes that are assigned to each target in the Target List." > Q2: Does this imply a configuration where the number of ways>1? Right, the granularity is the boundary at which the decoder switches to the next target in the target list. When ways=1 granularity can be infinity or zero by that definition. > Q3: Does the IG also represent the device's capabilities? When programming, should one also > consider whether the device supports it? Yes, see bits [9:8] in the CXL HDM Decoder Capability Register (CXL 3.2 8.2.4.20.1). So even though the math should not matter, I would still expect the driver to try to be careful to make sure that IG+8 is less than the address-bit max. > If "granularity is a don't care if not interleaving" is true, how about below changes Part of me says, "yes, that should be ok", another part of me says "what is the practical benefit of allowing any granularity to be specified?". So the fix from Gregory is limited to the case of "whoops, the platform BIOS thought this was a good idea even though it does not matter in practice, teach Linux to be lenient in this case.". The proposal to accept that in all case allows user-created regions to have odd large granularity sizes in the iw=1 case, and I am skeptical it is worth supporting that now.
Hi Dan I am grateful for your prompt response. On 03/04/2025 12:42, Dan Williams wrote: > Zhijian Li (Fujitsu) wrote: >> Hi Gregory and CXL community >> Cc Goto-san >> >> | | >> decoder9.0 decoder10.0 >> IW:1 IG:1024 IW:1 IG:1024 > > Why 1024? Yes, the value does not matter, but attempting 1024 feels more > like a unit test than a production use case. I am uncertain, it appears to be this way when we get the device. I presume it should not be a side effect in no-interleaving case. Thank you for your answers to these questions. Your reply has truly cleared up my confusion. Once again, thank you! Thanks Zhijian >> >> My question are: >> Q1: whether "granularity is a don't care if not interleaving" is applied to >> all HDM decoders(including root decoder and HDM decoder) > > All decoders.> >> In current cxl cli , it will not show any interleave_granularity at all when ways==1(no-interleaving) >> $ cxl list -PDE | grep granularity # show nothing when ways==1 > > Right, because the value theoretically has no functional impact in the > ways==1 case. However, it errantly ends up having practical impact in > these corners cases where code performs granularity comparisons without > considering that ways may be 1. > >> Per the CXL Spec r3.1 >> IG: "The number of consecutive bytes that are assigned to each target in the Target List." >> Q2: Does this imply a configuration where the number of ways>1? > > Right, the granularity is the boundary at which the decoder switches to > the next target in the target list. When ways=1 granularity can be > infinity or zero by that definition. > >> Q3: Does the IG also represent the device's capabilities? When programming, should one also >> consider whether the device supports it? > > Yes, see bits [9:8] in the CXL HDM Decoder Capability Register (CXL 3.2 > 8.2.4.20.1). So even though the math should not matter, I would still > expect the driver to try to be careful to make sure that IG+8 is less > than the address-bit max. > >> If "granularity is a don't care if not interleaving" is true, how about below changes > > Part of me says, "yes, that should be ok", another part of me says "what > is the practical benefit of allowing any granularity to be specified?". > > So the fix from Gregory is limited to the case of "whoops, the platform > BIOS thought this was a good idea even though it does not matter in > practice, teach Linux to be lenient in this case.". > > The proposal to accept that in all case allows user-created regions to > have odd large granularity sizes in the iw=1 case, and I am skeptical > it is worth supporting that now.
© 2016 - 2025 Red Hat, Inc.