[PATCH] PCI: dwc: Use multiple ATU regions for large bridge windows

Samuel Holland posted 1 patch 2 months ago
.../pci/controller/dwc/pcie-designware-host.c | 34 ++++++++++++-------
1 file changed, 22 insertions(+), 12 deletions(-)
[PATCH] PCI: dwc: Use multiple ATU regions for large bridge windows
Posted by Samuel Holland 2 months ago
Some SoCs may allocate more address space for a bridge window than can
be covered by a single ATU region. Allow using a larger bridge window
by allocating multiple adjacent ATU regions.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
---
An example of where this is needed is the ESWIN EIC7700 SoC[1]. The SoC
decodes 128 GiB of address space to the PCIe controller. Without this
change, only 8 GiB is usable; after this change 48 GiB (6 ATU regions)
is usable, which allows using PCIe cards with >8 GiB BARs:

eic7700-pcie 54000000.pcie: host bridge /soc/pcie@54000000 ranges:
eic7700-pcie 54000000.pcie:       IO 0x0040800000..0x0040ffffff -> 0x0040800000
eic7700-pcie 54000000.pcie:      MEM 0x0041000000..0x004fffffff -> 0x0041000000
eic7700-pcie 54000000.pcie:      MEM 0x8000000000..0x89ffffffff -> 0x8000000000
eic7700-pcie 54000000.pcie: iATU: unroll T, 8 ob, 4 ib, align 4K, limit 8G
eic7700-pcie 54000000.pcie: PCIe Gen.2 x1 link up
eic7700-pcie 54000000.pcie: PCI host bridge to bus 0000:00

[1]: https://lore.kernel.org/linux-pci/20250923120946.1218-1-zhangsenchuan@eswincomputing.com/

 .../pci/controller/dwc/pcie-designware-host.c | 34 ++++++++++++-------
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index 20c9333bcb1c..148076331d7b 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -873,30 +873,40 @@ static int dw_pcie_iatu_setup(struct dw_pcie_rp *pp)
 
 	i = 0;
 	resource_list_for_each_entry(entry, &pp->bridge->windows) {
+		u64 total_size;
+
 		if (resource_type(entry->res) != IORESOURCE_MEM)
 			continue;
 
-		if (pci->num_ob_windows <= ++i)
-			break;
-
-		atu.index = i;
 		atu.type = PCIE_ATU_TYPE_MEM;
 		atu.parent_bus_addr = entry->res->start - pci->parent_bus_offset;
 		atu.pci_addr = entry->res->start - entry->offset;
 
 		/* Adjust iATU size if MSG TLP region was allocated before */
 		if (pp->msg_res && pp->msg_res->parent == entry->res)
-			atu.size = resource_size(entry->res) -
+			total_size = resource_size(entry->res) -
 					resource_size(pp->msg_res);
 		else
-			atu.size = resource_size(entry->res);
+			total_size = resource_size(entry->res);
 
-		ret = dw_pcie_prog_outbound_atu(pci, &atu);
-		if (ret) {
-			dev_err(pci->dev, "Failed to set MEM range %pr\n",
-				entry->res);
-			return ret;
-		}
+		do {
+			if (pci->num_ob_windows <= ++i)
+				break;
+
+			atu.index = i;
+			atu.size = min(total_size, pci->region_limit + 1);
+
+			ret = dw_pcie_prog_outbound_atu(pci, &atu);
+			if (ret) {
+				dev_err(pci->dev, "Failed to set MEM range %pr\n",
+					entry->res);
+				return ret;
+			}
+
+			atu.parent_bus_addr += atu.size;
+			atu.pci_addr += atu.size;
+			total_size -= atu.size;
+		} while (total_size);
 	}
 
 	if (pp->io_size) {
-- 
2.47.2

base-commit: 5a6f65d1502551f84c158789e5d89299c78907c7
branch: up/pci-bridge-window
Re: [PATCH] PCI: dwc: Use multiple ATU regions for large bridge windows
Posted by Charles Mirabile 1 month, 4 weeks ago
Hi Samuel—

On Wed, Oct 15, 2025 at 04:15:01PM -0700, Samuel Holland wrote:
> Some SoCs may allocate more address space for a bridge window than can
> be covered by a single ATU region. Allow using a larger bridge window
> by allocating multiple adjacent ATU regions.

I had a similar patch floating around that I wanted to upstream, but I
figured I should take a look at lore before sending it in case someone
else had the same idea. Looks like great mind think alike :^). I have
attached my version the patch to this email and will leave some feedback
inline. If you want to incorporate some of my changes, feel free to add

Acked-by: Charles Mirabile <cmirabil@redhat.com>

or even a Co-developed-by and my DCO.

> 
> Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
> ---
> An example of where this is needed is the ESWIN EIC7700 SoC[1]. The SoC
> decodes 128 GiB of address space to the PCIe controller. Without this
> change, only 8 GiB is usable; after this change 48 GiB (6 ATU regions)
> is usable, which allows using PCIe cards with >8 GiB BARs:
> 
> eic7700-pcie 54000000.pcie: host bridge /soc/pcie@54000000 ranges:
> eic7700-pcie 54000000.pcie:       IO 0x0040800000..0x0040ffffff -> 0x0040800000
> eic7700-pcie 54000000.pcie:      MEM 0x0041000000..0x004fffffff -> 0x0041000000
> eic7700-pcie 54000000.pcie:      MEM 0x8000000000..0x89ffffffff -> 0x8000000000
> eic7700-pcie 54000000.pcie: iATU: unroll T, 8 ob, 4 ib, align 4K, limit 8G
> eic7700-pcie 54000000.pcie: PCIe Gen.2 x1 link up
> eic7700-pcie 54000000.pcie: PCI host bridge to bus 0000:00
> 
> [1]: https://lore.kernel.org/linux-pci/20250923120946.1218-1-zhangsenchuan@eswincomputing.com/
> 
>  .../pci/controller/dwc/pcie-designware-host.c | 34 ++++++++++++-------
>  1 file changed, 22 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> index 20c9333bcb1c..148076331d7b 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -873,30 +873,40 @@ static int dw_pcie_iatu_setup(struct dw_pcie_rp *pp)
>  
>  	i = 0;
>  	resource_list_for_each_entry(entry, &pp->bridge->windows) {
> +		u64 total_size;

`resource_size_t` might be a better fit for this

> +
>  		if (resource_type(entry->res) != IORESOURCE_MEM)
>  			continue;
>  
> -		if (pci->num_ob_windows <= ++i)
> -			break;
> -
> -		atu.index = i;
>  		atu.type = PCIE_ATU_TYPE_MEM;
>  		atu.parent_bus_addr = entry->res->start - pci->parent_bus_offset;
>  		atu.pci_addr = entry->res->start - entry->offset;
>  
>  		/* Adjust iATU size if MSG TLP region was allocated before */
>  		if (pp->msg_res && pp->msg_res->parent == entry->res)
> -			atu.size = resource_size(entry->res) -
> +			total_size = resource_size(entry->res) -
>  					resource_size(pp->msg_res);
>  		else
> -			atu.size = resource_size(entry->res);
> +			total_size = resource_size(entry->res);
>  
> -		ret = dw_pcie_prog_outbound_atu(pci, &atu);
> -		if (ret) {
> -			dev_err(pci->dev, "Failed to set MEM range %pr\n",
> -				entry->res);
> -			return ret;
> -		}
> +		do {
> +			if (pci->num_ob_windows <= ++i)
> +				break;

I think it might be bad if you were to only able to partially map a given
resource. In my version, I keep the original check outside the loop with
merely `break`, but return an error from probe in this check.

> +
> +			atu.index = i;
> +			atu.size = min(total_size, pci->region_limit + 1);

I had to look up the difference because I couldn't remember—I used `MIN`
here instead of `min`. I think `MIN` is approriate, but I am not sure it
really matters so you could keep `min`. 

> +
> +			ret = dw_pcie_prog_outbound_atu(pci, &atu);
> +			if (ret) {
> +				dev_err(pci->dev, "Failed to set MEM range %pr\n",
> +					entry->res);
> +				return ret;
> +			}
> +
> +			atu.parent_bus_addr += atu.size;
> +			atu.pci_addr += atu.size;
> +			total_size -= atu.size;
> +		} while (total_size);
>  	}
>  
>  	if (pp->io_size) {
> -- 
> 2.47.2
> 
> base-commit: 5a6f65d1502551f84c158789e5d89299c78907c7
> branch: up/pci-bridge-window

I also included an attempt at the inbound window version after seeing
Nilkas's feedback.

Best—Charlie

Signed-off-by: Charles Mirabile <cmirabil@redhat.com>
---
diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
index 20c9333bcb1c..f30961482799 100644
--- a/drivers/pci/controller/dwc/pcie-designware-host.c
+++ b/drivers/pci/controller/dwc/pcie-designware-host.c
@@ -873,29 +873,49 @@ static int dw_pcie_iatu_setup(struct dw_pcie_rp *pp)
 
 	i = 0;
 	resource_list_for_each_entry(entry, &pp->bridge->windows) {
+		resource_size_t res_size;
+
 		if (resource_type(entry->res) != IORESOURCE_MEM)
 			continue;
 
-		if (pci->num_ob_windows <= ++i)
+		if (pci->num_ob_windows <= i + 1)
 			break;
 
-		atu.index = i;
 		atu.type = PCIE_ATU_TYPE_MEM;
 		atu.parent_bus_addr = entry->res->start - pci->parent_bus_offset;
 		atu.pci_addr = entry->res->start - entry->offset;
 
 		/* Adjust iATU size if MSG TLP region was allocated before */
 		if (pp->msg_res && pp->msg_res->parent == entry->res)
-			atu.size = resource_size(entry->res) -
+			res_size = resource_size(entry->res) -
 					resource_size(pp->msg_res);
 		else
-			atu.size = resource_size(entry->res);
+			res_size = resource_size(entry->res);
+
+		while (res_size > 0) {
+			/*
+			 * Make sure to fail probe if we run out of windows
+			 * in the middle and we would end up only partially
+			 * mapping a single resource
+			 */
+			if (pci->num_ob_windows <= ++i) {
+				dev_err(pci->dev, "Exhausted outbound windows mapping %pr\n",
+					entry->res);
+				return -ENOMEM;
+			}
+			atu.index = i;
+			atu.size = MIN(pci->region_limit + 1, res_size);
 
-		ret = dw_pcie_prog_outbound_atu(pci, &atu);
-		if (ret) {
-			dev_err(pci->dev, "Failed to set MEM range %pr\n",
-				entry->res);
-			return ret;
+			ret = dw_pcie_prog_outbound_atu(pci, &atu);
+			if (ret) {
+				dev_err(pci->dev, "Failed to set MEM range %pr\n",
+					entry->res);
+				return ret;
+			}
+
+			atu.parent_bus_addr += atu.size;
+			atu.pci_addr += atu.size;
+			res_size -= atu.size;
 		}
 	}
 
@@ -926,20 +946,38 @@ static int dw_pcie_iatu_setup(struct dw_pcie_rp *pp)
 
 	i = 0;
 	resource_list_for_each_entry(entry, &pp->bridge->dma_ranges) {
+		resource_size_t res_start, res_size, window_size;
+
 		if (resource_type(entry->res) != IORESOURCE_MEM)
 			continue;
 
 		if (pci->num_ib_windows <= i)
 			break;
 
-		ret = dw_pcie_prog_inbound_atu(pci, i++, PCIE_ATU_TYPE_MEM,
-					       entry->res->start,
-					       entry->res->start - entry->offset,
-					       resource_size(entry->res));
-		if (ret) {
-			dev_err(pci->dev, "Failed to set DMA range %pr\n",
-				entry->res);
-			return ret;
+		res_size = resource_size(entry->res);
+		res_start = entry->res->start;
+		while (res_size >= 0) {
+			/*
+			 * Make sure to fail probe if we run out of windows
+			 * in the middle and we would end up only partially
+			 * mapping a single resource
+			 */
+			if (pci->num_ib_windows <= i) {
+				dev_err(pci->dev, "Exhausted inbound windows mapping %pr\n",
+					entry->res);
+				return -ENOMEM;
+			}
+			window_size = MIN(pci->region_limit + 1, res_size);
+			ret = dw_pcie_prog_inbound_atu(pci, i++, PCIE_ATU_TYPE_MEM, res_start,
+						       res_start - entry->offset, window_size);
+			if (ret) {
+				dev_err(pci->dev, "Failed to set DMA range %pr\n",
+					entry->res);
+				return ret;
+			}
+
+			res_start += window_size;
+			res_size -= window_size;
 		}
 	}
 
-- 
2.43.0

base-commit: 211ddde0823f1442e4ad052a2f30f050145ccada

Re: [PATCH] PCI: dwc: Use multiple ATU regions for large bridge windows
Posted by Frank Li 2 months ago
On Wed, Oct 15, 2025 at 04:15:01PM -0700, Samuel Holland wrote:
> Some SoCs may allocate more address space for a bridge window than can
> be covered by a single ATU region. Allow using a larger bridge window
> by allocating multiple adjacent ATU regions.
>
> Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
> ---

Nice feature.

Reviewed-by: Frank Li <Frank.Li@nxp.com>

> An example of where this is needed is the ESWIN EIC7700 SoC[1]. The SoC
> decodes 128 GiB of address space to the PCIe controller. Without this
> change, only 8 GiB is usable; after this change 48 GiB (6 ATU regions)
> is usable, which allows using PCIe cards with >8 GiB BARs:
>
> eic7700-pcie 54000000.pcie: host bridge /soc/pcie@54000000 ranges:
> eic7700-pcie 54000000.pcie:       IO 0x0040800000..0x0040ffffff -> 0x0040800000
> eic7700-pcie 54000000.pcie:      MEM 0x0041000000..0x004fffffff -> 0x0041000000
> eic7700-pcie 54000000.pcie:      MEM 0x8000000000..0x89ffffffff -> 0x8000000000
> eic7700-pcie 54000000.pcie: iATU: unroll T, 8 ob, 4 ib, align 4K, limit 8G
> eic7700-pcie 54000000.pcie: PCIe Gen.2 x1 link up
> eic7700-pcie 54000000.pcie: PCI host bridge to bus 0000:00
>
> [1]: https://lore.kernel.org/linux-pci/20250923120946.1218-1-zhangsenchuan@eswincomputing.com/
>
>  .../pci/controller/dwc/pcie-designware-host.c | 34 ++++++++++++-------
>  1 file changed, 22 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> index 20c9333bcb1c..148076331d7b 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -873,30 +873,40 @@ static int dw_pcie_iatu_setup(struct dw_pcie_rp *pp)
>
>  	i = 0;
>  	resource_list_for_each_entry(entry, &pp->bridge->windows) {
> +		u64 total_size;
> +
>  		if (resource_type(entry->res) != IORESOURCE_MEM)
>  			continue;
>
> -		if (pci->num_ob_windows <= ++i)
> -			break;
> -
> -		atu.index = i;
>  		atu.type = PCIE_ATU_TYPE_MEM;
>  		atu.parent_bus_addr = entry->res->start - pci->parent_bus_offset;
>  		atu.pci_addr = entry->res->start - entry->offset;
>
>  		/* Adjust iATU size if MSG TLP region was allocated before */
>  		if (pp->msg_res && pp->msg_res->parent == entry->res)
> -			atu.size = resource_size(entry->res) -
> +			total_size = resource_size(entry->res) -
>  					resource_size(pp->msg_res);
>  		else
> -			atu.size = resource_size(entry->res);
> +			total_size = resource_size(entry->res);
>
> -		ret = dw_pcie_prog_outbound_atu(pci, &atu);
> -		if (ret) {
> -			dev_err(pci->dev, "Failed to set MEM range %pr\n",
> -				entry->res);
> -			return ret;
> -		}
> +		do {
> +			if (pci->num_ob_windows <= ++i)
> +				break;
> +
> +			atu.index = i;
> +			atu.size = min(total_size, pci->region_limit + 1);
> +
> +			ret = dw_pcie_prog_outbound_atu(pci, &atu);
> +			if (ret) {
> +				dev_err(pci->dev, "Failed to set MEM range %pr\n",
> +					entry->res);
> +				return ret;
> +			}
> +
> +			atu.parent_bus_addr += atu.size;
> +			atu.pci_addr += atu.size;
> +			total_size -= atu.size;
> +		} while (total_size);
>  	}
>
>  	if (pp->io_size) {
> --
> 2.47.2
>
> base-commit: 5a6f65d1502551f84c158789e5d89299c78907c7
> branch: up/pci-bridge-window
Re: [PATCH] PCI: dwc: Use multiple ATU regions for large bridge windows
Posted by Niklas Cassel 2 months ago
On Wed, Oct 15, 2025 at 04:15:01PM -0700, Samuel Holland wrote:
> Some SoCs may allocate more address space for a bridge window than can
> be covered by a single ATU region. Allow using a larger bridge window
> by allocating multiple adjacent ATU regions.
> 
> Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
> ---
> An example of where this is needed is the ESWIN EIC7700 SoC[1]. The SoC
> decodes 128 GiB of address space to the PCIe controller. Without this
> change, only 8 GiB is usable; after this change 48 GiB (6 ATU regions)
> is usable, which allows using PCIe cards with >8 GiB BARs:
> 
> eic7700-pcie 54000000.pcie: host bridge /soc/pcie@54000000 ranges:
> eic7700-pcie 54000000.pcie:       IO 0x0040800000..0x0040ffffff -> 0x0040800000
> eic7700-pcie 54000000.pcie:      MEM 0x0041000000..0x004fffffff -> 0x0041000000
> eic7700-pcie 54000000.pcie:      MEM 0x8000000000..0x89ffffffff -> 0x8000000000
> eic7700-pcie 54000000.pcie: iATU: unroll T, 8 ob, 4 ib, align 4K, limit 8G
> eic7700-pcie 54000000.pcie: PCIe Gen.2 x1 link up
> eic7700-pcie 54000000.pcie: PCI host bridge to bus 0000:00
> 
> [1]: https://lore.kernel.org/linux-pci/20250923120946.1218-1-zhangsenchuan@eswincomputing.com/
> 
>  .../pci/controller/dwc/pcie-designware-host.c | 34 ++++++++++++-------
>  1 file changed, 22 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware-host.c b/drivers/pci/controller/dwc/pcie-designware-host.c
> index 20c9333bcb1c..148076331d7b 100644
> --- a/drivers/pci/controller/dwc/pcie-designware-host.c
> +++ b/drivers/pci/controller/dwc/pcie-designware-host.c
> @@ -873,30 +873,40 @@ static int dw_pcie_iatu_setup(struct dw_pcie_rp *pp)
>  
>  	i = 0;
>  	resource_list_for_each_entry(entry, &pp->bridge->windows) {
> +		u64 total_size;
> +
>  		if (resource_type(entry->res) != IORESOURCE_MEM)
>  			continue;
>  
> -		if (pci->num_ob_windows <= ++i)
> -			break;
> -
> -		atu.index = i;
>  		atu.type = PCIE_ATU_TYPE_MEM;
>  		atu.parent_bus_addr = entry->res->start - pci->parent_bus_offset;
>  		atu.pci_addr = entry->res->start - entry->offset;
>  
>  		/* Adjust iATU size if MSG TLP region was allocated before */
>  		if (pp->msg_res && pp->msg_res->parent == entry->res)
> -			atu.size = resource_size(entry->res) -
> +			total_size = resource_size(entry->res) -
>  					resource_size(pp->msg_res);
>  		else
> -			atu.size = resource_size(entry->res);
> +			total_size = resource_size(entry->res);
>  
> -		ret = dw_pcie_prog_outbound_atu(pci, &atu);
> -		if (ret) {
> -			dev_err(pci->dev, "Failed to set MEM range %pr\n",
> -				entry->res);
> -			return ret;
> -		}
> +		do {
> +			if (pci->num_ob_windows <= ++i)
> +				break;
> +
> +			atu.index = i;
> +			atu.size = min(total_size, pci->region_limit + 1);
> +
> +			ret = dw_pcie_prog_outbound_atu(pci, &atu);
> +			if (ret) {
> +				dev_err(pci->dev, "Failed to set MEM range %pr\n",
> +					entry->res);
> +				return ret;
> +			}
> +
> +			atu.parent_bus_addr += atu.size;
> +			atu.pci_addr += atu.size;
> +			total_size -= atu.size;

1) Personal opinion, but perhaps:

			total_size -= atu.size;
			if (total_size) {
				atu.parent_bus_addr += atu.size;
				atu.pci_addr += atu.size;
			}

To more clearly show that these lines are performed/relevant only if
total_size > atu_limit ?


2) Perhaps convert the loop from:

do {

} while (total_size);


To a:

while (total_size) {

}

I don't see any reason for this to be a do-while loop.

while () also has the benefit that you will not "waste" an outbound iatu
if you get a window of size 0.


3) Since you are modifying the outbound range to support sizes > atu_limit,
I think you should do the same change for dma-ranges (inbound translation):

https://github.com/torvalds/linux/blob/v6.18-rc1/drivers/pci/controller/dwc/pcie-designware-host.c#L928-L948

Because it feels inconsistent if we allow size > atu_limit for outbound,
but not for inbound translation.


Kind regards,
Niklas