[PATCH 02/23] PCI: Rewrite bridge window head alignment function

Ilpo Järvinen posted 23 patches 1 month, 3 weeks ago
Posted by Ilpo Järvinen 1 month, 3 weeks ago
The calculation of bridge window head alignment is done by
calculate_mem_align() [*]. With the default bridge window alignment, it
is used for both head and tail alignment.

The selected head alignment does not always result in tight-fitting
resources (gap at d4f00000-d4ffffff):

    d4800000-dbffffff : PCI Bus 0000:06
      d4800000-d48fffff : PCI Bus 0000:07
        d4800000-d4803fff : 0000:07:00.0
          d4800000-d4803fff : nvme
      d4900000-d49fffff : PCI Bus 0000:0a
        d4900000-d490ffff : 0000:0a:00.0
          d4900000-d490ffff : r8169
        d4910000-d4913fff : 0000:0a:00.0
      d4a00000-d4cfffff : PCI Bus 0000:0b
        d4a00000-d4bfffff : 0000:0b:00.0
          d4a00000-d4bfffff : 0000:0b:00.0
        d4c00000-d4c07fff : 0000:0b:00.0
      d4d00000-d4dfffff : PCI Bus 0000:15
        d4d00000-d4d07fff : 0000:15:00.0
          d4d00000-d4d07fff : xhci-hcd
      d4e00000-d4efffff : PCI Bus 0000:16
        d4e00000-d4e7ffff : 0000:16:00.0
        d4e80000-d4e803ff : 0000:16:00.0
          d4e80000-d4e803ff : ahci
      d5000000-dbffffff : PCI Bus 0000:0c

This has not caused problems (for years) with the default bridge
window tail alignment, which grossly over-estimates the required tail
alignment and leaves more tail room than necessary. With the introduction
of relaxed tail alignment that leaves no extra tail room whatsoever,
any gaps will immediately turn into assignment failures.

Introduce a head alignment calculation that ensures no gaps are left and
apply the new approach when using relaxed alignment. We may want to
consider using it for the normal alignment eventually, but as the first
step, solve only the problem with the relaxed tail alignment.

([*] I don't understand the algorithm in calculate_mem_align().)

Fixes: 5d0a8965aea9 ("[PATCH] 2.5.14: New PCI allocation code (alpha, arm, parisc) [2/2]")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220775
Reported-by: Malte Schröder <malte+lkml@tnxip.de>
Tested-by: Malte Schröder <malte+lkml@tnxip.de>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Cc: stable@vger.kernel.org
---

Somewhat annoyingly, there's a difference in what the aligns array
contains between the legacy alignment approach (which I dare not touch
as I really don't understand what the algorithm tries to do) and this
new head alignment algorithm, and both arrays consume stack space. After
making the new approach the only available approach in the follow-up
patch, only one array remains (however, that follow-up change is also
somewhat riskier when it comes to regressions).

That being said, the new head alignment could work with the same aligns
array as the legacy approach, it just won't necessarily produce an
optimal (the smallest possible) head alignment when the if (r_size <=
align) condition is used. Just let me know if that approach is
preferred (to save some stack space).
---
 drivers/pci/setup-bus.c | 53 ++++++++++++++++++++++++++++++++++-------
 1 file changed, 44 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/setup-bus.c b/drivers/pci/setup-bus.c
index 4b918ff4d2d8..80e5a8fc62e7 100644
--- a/drivers/pci/setup-bus.c
+++ b/drivers/pci/setup-bus.c
@@ -1228,6 +1228,45 @@ static inline resource_size_t calculate_mem_align(resource_size_t *aligns,
 	return min_align;
 }
 
+/*
+ * Calculate bridge window head alignment that leaves no gaps in between
+ * resources.
+ */
+static resource_size_t calculate_head_align(resource_size_t *aligns,
+					    int max_order)
+{
+	resource_size_t head_align = 1;
+	resource_size_t remainder = 0;
+	int order;
+
+	/* Take the largest alignment as the starting point. */
+	head_align <<= max_order + __ffs(SZ_1M);
+
+	for (order = max_order - 1; order >= 0; order--) {
+		resource_size_t align1 = 1;
+
+		align1 <<= order + __ffs(SZ_1M);
+
+		/*
+		 * Account smaller resources with alignment < max_order that
+		 * could be used to fill head room if alignment less than
+		 * max_order is used.
+		 */
+		remainder += aligns[order];
+
+		/*
+		 * Test if head fill is enough to satisfy the alignment of
+		 * the larger resources after reducing the alignment.
+		 */
+		while ((head_align > align1) && (remainder >= head_align / 2)) {
+			head_align /= 2;
+			remainder -= head_align;
+		}
+	}
+
+	return head_align;
+}
+
 /**
  * pbus_upstream_space_available - Check no upstream resource limits allocation
  * @bus:	The bus
@@ -1315,13 +1354,13 @@ static void pbus_size_mem(struct pci_bus *bus, unsigned long type,
 {
 	struct pci_dev *dev;
 	resource_size_t min_align, win_align, align, size, size0, size1 = 0;
-	resource_size_t aligns[28]; /* Alignments from 1MB to 128TB */
+	resource_size_t aligns[28] = {}; /* Alignments from 1MB to 128TB */
+	resource_size_t aligns2[28] = {};/* Alignments from 1MB to 128TB */
 	int order, max_order;
 	struct resource *b_res = pbus_select_window_for_type(bus, type);
 	resource_size_t children_add_size = 0;
 	resource_size_t children_add_align = 0;
 	resource_size_t add_align = 0;
-	resource_size_t relaxed_align;
 	resource_size_t old_size;
 
 	if (!b_res)
@@ -1331,7 +1370,6 @@ static void pbus_size_mem(struct pci_bus *bus, unsigned long type,
 	if (b_res->parent)
 		return;
 
-	memset(aligns, 0, sizeof(aligns));
 	max_order = 0;
 	size = 0;
 
@@ -1382,6 +1420,7 @@ static void pbus_size_mem(struct pci_bus *bus, unsigned long type,
 			 */
 			if (r_size <= align)
 				aligns[order] += align;
+			aligns2[order] += align;
 			if (order > max_order)
 				max_order = order;
 
@@ -1406,9 +1445,7 @@ static void pbus_size_mem(struct pci_bus *bus, unsigned long type,
 
 	if (bus->self && size0 &&
 	    !pbus_upstream_space_available(bus, b_res, size0, min_align)) {
-		relaxed_align = 1ULL << (max_order + __ffs(SZ_1M));
-		relaxed_align = max(relaxed_align, win_align);
-		min_align = min(min_align, relaxed_align);
+		min_align = calculate_head_align(aligns2, max_order);
 		size0 = calculate_memsize(size, min_size, 0, 0, old_size, win_align);
 		resource_set_range(b_res, min_align, size0);
 		pci_info(bus->self, "bridge window %pR to %pR requires relaxed alignment rules\n",
@@ -1422,9 +1459,7 @@ static void pbus_size_mem(struct pci_bus *bus, unsigned long type,
 
 		if (bus->self && size1 &&
 		    !pbus_upstream_space_available(bus, b_res, size1, add_align)) {
-			relaxed_align = 1ULL << (max_order + __ffs(SZ_1M));
-			relaxed_align = max(relaxed_align, win_align);
-			min_align = min(min_align, relaxed_align);
+			min_align = calculate_head_align(aligns2, max_order);
 			size1 = calculate_memsize(size, min_size, add_size, children_add_size,
 						  old_size, win_align);
 			pci_info(bus->self,
-- 
2.39.5

Re: [PATCH 02/23] PCI: Rewrite bridge window head alignment function
Posted by Bjorn Helgaas 1 week, 5 days ago
On Fri, Dec 19, 2025 at 07:40:15PM +0200, Ilpo Järvinen wrote:
> The calculation of bridge window head alignment is done by
> calculate_mem_align() [*]. With the default bridge window alignment, it
> is used for both head and tail alignment.
> ...
> 
> Fixes: 5d0a8965aea9 ("[PATCH] 2.5.14: New PCI allocation code (alpha, arm, parisc) [2/2]")

check_commits complains that this SHA1 doesn't exist:

  In commit

    a21a27a0e893 ("PCI: Rewrite bridge window head alignment function")

  Fixes tag

    Fixes: 5d0a8965aea9 ("[PATCH] 2.5.14: New PCI allocation code (alpha, arm, parisc) [2/2]")

  has these problem(s):

    - Target SHA1 does not exist

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5d0a8965aea9
does find it, but says it's not reachable.

It's so old (2002) that I'm not sure it's worth including it as a
Fixes: tag.

Bjorn
Re: [PATCH 02/23] PCI: Rewrite bridge window head alignment function
Posted by Ilpo Järvinen 1 week, 5 days ago
On Mon, 26 Jan 2026, Bjorn Helgaas wrote:

> On Fri, Dec 19, 2025 at 07:40:15PM +0200, Ilpo Järvinen wrote:
> > The calculation of bridge window head alignment is done by
> > calculate_mem_align() [*]. With the default bridge window alignment, it
> > is used for both head and tail alignment.
> > ...
> > 
> > Fixes: 5d0a8965aea9 ("[PATCH] 2.5.14: New PCI allocation code (alpha, arm, parisc) [2/2]")
> 
> check_commits complains that this SHA1 doesn't exist:
> 
>   In commit
> 
>     a21a27a0e893 ("PCI: Rewrite bridge window head alignment function")
> 
>   Fixes tag
> 
>     Fixes: 5d0a8965aea9 ("[PATCH] 2.5.14: New PCI allocation code (alpha, arm, parisc) [2/2]")
> 
>   has these problem(s):
> 
>     - Target SHA1 does not exist
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5d0a8965aea9
> does find it, but says it's not reachable.
> 
> It's so old (2002) that I'm not sure it's worth including it as a
> Fixes: tag.

Hi,

The commit is in the history repo, and yes, even the git web ui for some 
reason says it's not reachable by any branch:

https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=5d0a8965aea93bd799ebcd671e562d90f3ec2711

...But it's part of a tag for sure:

$ git describe --contains 5d0a8965aea93bd799ebcd671e562d90f3ec2711
v2.5.15~11^2~5^2~10

The composition of the history repo is strange; things don't always
appear properly linear there for some reason, but I found that commit by
going backwards with git annotate code-line-shaid^ in a "loop" until I
arrived at the commit that introduced it. Maybe this entire lineage of
commits is headed only by a tag, dunno.

Many things in the resource fitting and assignment algorithm lead back to 
that same commit BTW (and its commit message isn't very helpful in 
explaining why things were made the way they were).

If you don't want to put it into a Fixes tag, could you put that history
repo URL into a Link tag instead? I do find it relevant where this came
from.

-- 
 i.
Re: [PATCH 02/23] PCI: Rewrite bridge window head alignment function
Posted by Bjorn Helgaas 1 week, 4 days ago
On Tue, Jan 27, 2026 at 01:22:22PM +0200, Ilpo Järvinen wrote:
> On Mon, 26 Jan 2026, Bjorn Helgaas wrote:
> > On Fri, Dec 19, 2025 at 07:40:15PM +0200, Ilpo Järvinen wrote:
> > > The calculation of bridge window head alignment is done by
> > > calculate_mem_align() [*]. With the default bridge window alignment, it
> > > is used for both head and tail alignment.
> ...

> > > Fixes: 5d0a8965aea9 ("[PATCH] 2.5.14: New PCI allocation code (alpha, arm, parisc) [2/2]")
> > 
> > check_commits complains that this SHA1 doesn't exist:
> > 
> >   In commit
> > 
> >     a21a27a0e893 ("PCI: Rewrite bridge window head alignment function")
> > 
> >   Fixes tag
> > 
> >     Fixes: 5d0a8965aea9 ("[PATCH] 2.5.14: New PCI allocation code (alpha, arm, parisc) [2/2]")
> > 
> >   has these problem(s):
> > 
> >     - Target SHA1 does not exist
> > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=5d0a8965aea9
> > does find it, but says it's not reachable.
> > 
> > It's so old (2002) that I'm not sure it's worth including it as a
> > Fixes: tag.
> 
> Hi,
> 
> The commit is in the history repo, and yes, even the git web ui for some 
> reason says it's not reachable by any branch:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=5d0a8965aea93bd799ebcd671e562d90f3ec2711
> 
> ...But it's part of a tag for sure:
> 
> $ git describe --contains 5d0a8965aea93bd799ebcd671e562d90f3ec2711
> v2.5.15~11^2~5^2~10

Thanks, I made it a Link tag instead:

  Link: https://git.kernel.org/history/history/c/5d0a8965aea9 ("[PATCH] 2.5.14: New PCI allocation code (alpha, arm, parisc) [2/2]")