Current PCIe initialization logic may leave root ports operating with
non-optimal Maximum Payload Size (MPS) settings. While downstream device
configuration is handled during bus enumeration, root port MPS values
inherited from firmware or hardware defaults might not utilize the full
capabilities supported by the controller hardware. This can result in
suboptimal data transfer efficiency across the PCIe hierarchy.
During host controller probing phase, when PCIe bus tuning is enabled,
the implementation now configures root port MPS settings to their
hardware-supported maximum values. Specifically, when configuring the MPS
for a PCIe device, if the device is a root port and the bus tuning is not
disabled (PCIE_BUS_TUNE_OFF), the MPS is set to 128 << dev->pcie_mpss to
match the Root Port's maximum supported payload size. The Max Read Request
Size (MRRS) is subsequently adjusted through existing companion logic to
maintain compatibility with PCIe specifications.
Note that this initial setting of the root port MPS to the maximum might
be reduced later during the enumeration of downstream devices if any of
those devices do not support the maximum MPS of the root port.
Explicit initialization at host probing stage ensures consistent PCIe
topology configuration before downstream devices perform their own MPS
negotiations. This proactive approach addresses platform-specific
requirements where controller drivers depend on properly initialized
root port settings, while maintaining backward compatibility through
PCIE_BUS_TUNE_OFF conditional checks. Hardware capabilities are fully
utilized without altering existing device negotiation behaviors.
Suggested-by: Niklas Cassel <cassel@kernel.org>
Suggested-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Hans Zhang <18255117159@163.com>
---
drivers/pci/probe.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 4b8693ec9e4c..9f8803da914c 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -2178,6 +2178,16 @@ static void pci_configure_mps(struct pci_dev *dev)
return;
}
+ /*
+ * Unless MPS strategy is PCIE_BUS_TUNE_OFF (don't touch MPS at all),
+ * start off by setting root ports' MPS to MPSS. Depending on the MPS
+ * strategy, and the MPSS of the devices below the root port, the MPS
+ * of the root port might get overridden later.
+ */
+ if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT &&
+ pcie_bus_config != PCIE_BUS_TUNE_OFF)
+ pcie_set_mps(dev, 128 << dev->pcie_mpss);
+
if (!bridge || !pci_is_pcie(bridge))
return;
--
2.25.1
On Fri, Jun 20, 2025 at 11:55:06PM +0800, Hans Zhang wrote: > Current PCIe initialization logic may leave root ports operating with > non-optimal Maximum Payload Size (MPS) settings. While downstream device > configuration is handled during bus enumeration, root port MPS values > inherited from firmware or hardware defaults ... Apparently Root Port MPS configuration is different from that for downstream devices? > might not utilize the full > capabilities supported by the controller hardware. This can result in > suboptimal data transfer efficiency across the PCIe hierarchy. > > During host controller probing phase, when PCIe bus tuning is enabled, > the implementation now configures root port MPS settings to their > hardware-supported maximum values. Specifically, when configuring the MPS > for a PCIe device, if the device is a root port and the bus tuning is not > disabled (PCIE_BUS_TUNE_OFF), the MPS is set to 128 << dev->pcie_mpss to > match the Root Port's maximum supported payload size. The Max Read Request > Size (MRRS) is subsequently adjusted through existing companion logic to > maintain compatibility with PCIe specifications. > > Note that this initial setting of the root port MPS to the maximum might > be reduced later during the enumeration of downstream devices if any of > those devices do not support the maximum MPS of the root port. > > Explicit initialization at host probing stage ensures consistent PCIe > topology configuration before downstream devices perform their own MPS > negotiations. This proactive approach addresses platform-specific > requirements where controller drivers depend on properly initialized > root port settings, while maintaining backward compatibility through > PCIE_BUS_TUNE_OFF conditional checks. Hardware capabilities are fully > utilized without altering existing device negotiation behaviors. This last paragraph seems kind of like marketing without any real content. Is there something important in there? Nits: s/root port/Root Port/ Reword "implementation now configures" to be clear about whether "now" refers to before this patch or after. Update the MRRS "to maintain compatibility" part. I'm dubious about there being a spec compatibility issue with respect to MRRS. Cite the relevant section if there is an issue. > Suggested-by: Niklas Cassel <cassel@kernel.org> > Suggested-by: Manivannan Sadhasivam <mani@kernel.org> > Signed-off-by: Hans Zhang <18255117159@163.com> > --- > drivers/pci/probe.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c > index 4b8693ec9e4c..9f8803da914c 100644 > --- a/drivers/pci/probe.c > +++ b/drivers/pci/probe.c > @@ -2178,6 +2178,16 @@ static void pci_configure_mps(struct pci_dev *dev) > return; > } > > + /* > + * Unless MPS strategy is PCIE_BUS_TUNE_OFF (don't touch MPS at all), > + * start off by setting root ports' MPS to MPSS. Depending on the MPS > + * strategy, and the MPSS of the devices below the root port, the MPS > + * of the root port might get overridden later. > + */ > + if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT && > + pcie_bus_config != PCIE_BUS_TUNE_OFF) > + pcie_set_mps(dev, 128 << dev->pcie_mpss); > + > if (!bridge || !pci_is_pcie(bridge)) > return; > > -- > 2.25.1 >
On Tue, Sep 02, 2025 at 12:48:28PM -0500, Bjorn Helgaas wrote: > On Fri, Jun 20, 2025 at 11:55:06PM +0800, Hans Zhang wrote: > > Current PCIe initialization logic may leave root ports operating with > > non-optimal Maximum Payload Size (MPS) settings. While downstream device > > configuration is handled during bus enumeration, root port MPS values > > inherited from firmware or hardware defaults ... > > Apparently Root Port MPS configuration is different from that for > downstream devices? pci_host_probe() will call pci_scan_root_bus_bridge(), which will call pci_scan_single_device(), which will call pci_device_add(), which will call pci_configure_device(), which will call pci_configure_mps(). This will be done for both bridges and endpoints. The bridge will be scanned/added first, before devices behind the bridge. While pci_configure_device()/pci_configure_mps() will be called for both bridges and endpoints, pci_configure_mps() will do an early return for devices where pci_upstream_bridge() returns NULL, i.e. for devices where that does not have an upstream bridge, i.e. for the root bridge itself: https://github.com/torvalds/linux/blob/v6.17-rc4/drivers/pci/probe.c#L2181-L2182 So MPS will not be touched for root bridges. This patch ensures that MPS for root bridges gets initialized to MPSS (Max supported MPS). Later, when pci_configure_device()/pci_configure_mps() is called for a device behind the bridge, if the MPSS of the device behind the bridge is smaller than the MPS of the bridge, the code reduces the MPS of the bridge: https://github.com/torvalds/linux/blob/v6.17-rc4/drivers/pci/probe.c#L2205 My only question with this patch is if there is a bridge behind a bridge, will the bridge behind the bridge still have pci_pcie_type() == PCI_EXP_TYPE_ROOT_PORT ? If so, perhaps we should modify this patch from: + if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT && + pcie_bus_config != PCIE_BUS_TUNE_OFF) { + pcie_write_mps(dev, 128 << dev->pcie_mpss); + } + if (!bridge || !pci_is_pcie(bridge)) return; to: + if (!bridge && pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT && + pcie_bus_config != PCIE_BUS_TUNE_OFF) { + pcie_write_mps(dev, 128 << dev->pcie_mpss); + } + if (!bridge || !pci_is_pcie(bridge)) return; > > During host controller probing phase, when PCIe bus tuning is enabled, > > the implementation now configures root port MPS settings to their > > hardware-supported maximum values. Specifically, when configuring the MPS > > for a PCIe device, if the device is a root port and the bus tuning is not > > disabled (PCIE_BUS_TUNE_OFF), the MPS is set to 128 << dev->pcie_mpss to > > match the Root Port's maximum supported payload size. The Max Read Request > > Size (MRRS) is subsequently adjusted through existing companion logic to > > maintain compatibility with PCIe specifications. > > > > Note that this initial setting of the root port MPS to the maximum might > > be reduced later during the enumeration of downstream devices if any of > > those devices do not support the maximum MPS of the root port. > > > > Explicit initialization at host probing stage ensures consistent PCIe > > topology configuration before downstream devices perform their own MPS > > negotiations. This proactive approach addresses platform-specific > > requirements where controller drivers depend on properly initialized > > root port settings, while maintaining backward compatibility through > > PCIE_BUS_TUNE_OFF conditional checks. Hardware capabilities are fully > > utilized without altering existing device negotiation behaviors. > > This last paragraph seems kind of like marketing without any real > content. Is there something important in there? > > Nits: > s/root port/Root Port/ > > Reword "implementation now configures" to be clear about whether "now" > refers to before this patch or after. > > Update the MRRS "to maintain compatibility" part. I'm dubious about > there being a spec compatibility issue with respect to MRRS. Cite the > relevant section if there is an issue. I'm not sure why the commit message mentions MRRS at all. Sure, pcie_write_mrrs() might set MRRS to MPS, but that is existing logic and not really related to the change in this patch IMO. Kind regards, Niklas
On 2025/9/3 01:48, Bjorn Helgaas wrote: > On Fri, Jun 20, 2025 at 11:55:06PM +0800, Hans Zhang wrote: >> Current PCIe initialization logic may leave root ports operating with >> non-optimal Maximum Payload Size (MPS) settings. While downstream device >> configuration is handled during bus enumeration, root port MPS values >> inherited from firmware or hardware defaults ... > > Apparently Root Port MPS configuration is different from that for > downstream devices? Dear Bjorn, Thank you very much for your reply. Yes, at the very beginning, the situation I tested was like the previous reply: https://lore.kernel.org/linux-pci/bb40385c-6839-484c-90b2-d6c7ecb95ba9@163.com/ Niklas helped find the documentation description of RK3588 TRM: https://lore.kernel.org/linux-pci/aACoEpueUHBLjgbb@ryzen/ Dear Niklas, If I have misunderstood Bjorn's review opinion, please help me clarify it together. Thank you again for helping me ping the Maintainer. When I wanted to ping, you did it before me. > >> might not utilize the full >> capabilities supported by the controller hardware. This can result in >> suboptimal data transfer efficiency across the PCIe hierarchy. >> >> During host controller probing phase, when PCIe bus tuning is enabled, >> the implementation now configures root port MPS settings to their >> hardware-supported maximum values. Specifically, when configuring the MPS >> for a PCIe device, if the device is a root port and the bus tuning is not >> disabled (PCIE_BUS_TUNE_OFF), the MPS is set to 128 << dev->pcie_mpss to >> match the Root Port's maximum supported payload size. The Max Read Request >> Size (MRRS) is subsequently adjusted through existing companion logic to >> maintain compatibility with PCIe specifications. >> >> Note that this initial setting of the root port MPS to the maximum might >> be reduced later during the enumeration of downstream devices if any of >> those devices do not support the maximum MPS of the root port. >> >> Explicit initialization at host probing stage ensures consistent PCIe >> topology configuration before downstream devices perform their own MPS >> negotiations. This proactive approach addresses platform-specific >> requirements where controller drivers depend on properly initialized >> root port settings, while maintaining backward compatibility through >> PCIE_BUS_TUNE_OFF conditional checks. Hardware capabilities are fully >> utilized without altering existing device negotiation behaviors. > > This last paragraph seems kind of like marketing without any real > content. Is there something important in there? I think the above elaboration is sufficient. I will delete it. > > Nits: > s/root port/Root Port/ Will change. > > Reword "implementation now configures" to be clear about whether "now" > refers to before this patch or after. Will be modified. > > Update the MRRS "to maintain compatibility" part. I'm dubious about > there being a spec compatibility issue with respect to MRRS. Cite the > relevant section if there is an issue. The description is inaccurate. I will correct it. I plan to modify the commit message as follows: If there are any incorrect descriptions, please correct them. Thank you very much. Current PCIe initialization logic may leave Root Ports operating with non-optimal Maximum Payload Size (MPS) settings. While downstream device configuration is handled during bus enumeration, Root Port MPS values inherited from firmware or hardware defaults might not utilize the full capabilities supported by the controller hardware. This can result in suboptimal data transfer efficiency across the PCIe hierarchy. With this patch, during the host controller probing phase and when PCIe bus tuning is enabled, the implementation configures Root Port MPS settings to their hardware-supported maximum values. Specifically, when configuring the MPS for a PCIe device, if the device is a Root Port and the bus tuning is not disabled (PCIE_BUS_TUNE_OFF), the MPS is set to 128 << dev->pcie_mpss to match the Root Port's maximum supported payload size. The Max Read Request Size (MRRS) is subsequently adjusted by existing logic in pci_configure_mps() to ensure it is not less than the MPS, maintaining compliance with PCIe specifications (see PCIe r7.0, sec 7.5.3.4). Note that this initial setting of the Root Port MPS to the maximum might be reduced later during the enumeration of downstream devices if any of those devices do not support the maximum MPS of the Root Port. Best regards, Hans > >> Suggested-by: Niklas Cassel <cassel@kernel.org> >> Suggested-by: Manivannan Sadhasivam <mani@kernel.org> >> Signed-off-by: Hans Zhang <18255117159@163.com> >> --- >> drivers/pci/probe.c | 10 ++++++++++ >> 1 file changed, 10 insertions(+) >> >> diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c >> index 4b8693ec9e4c..9f8803da914c 100644 >> --- a/drivers/pci/probe.c >> +++ b/drivers/pci/probe.c >> @@ -2178,6 +2178,16 @@ static void pci_configure_mps(struct pci_dev *dev) >> return; >> } >> >> + /* >> + * Unless MPS strategy is PCIE_BUS_TUNE_OFF (don't touch MPS at all), >> + * start off by setting root ports' MPS to MPSS. Depending on the MPS >> + * strategy, and the MPSS of the devices below the root port, the MPS >> + * of the root port might get overridden later. >> + */ >> + if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT && >> + pcie_bus_config != PCIE_BUS_TUNE_OFF) >> + pcie_set_mps(dev, 128 << dev->pcie_mpss); >> + >> if (!bridge || !pci_is_pcie(bridge)) >> return; >> >> -- >> 2.25.1 >>
On Thu, Sep 04, 2025 at 01:11:22AM +0800, Hans Zhang wrote: > On 2025/9/3 01:48, Bjorn Helgaas wrote: > > On Fri, Jun 20, 2025 at 11:55:06PM +0800, Hans Zhang wrote: > > > Current PCIe initialization logic may leave root ports operating with > > > non-optimal Maximum Payload Size (MPS) settings. While downstream device > > > configuration is handled during bus enumeration, root port MPS values > > > inherited from firmware or hardware defaults ... > > > > Apparently Root Port MPS configuration is different from that for > > downstream devices? > Yes, at the very beginning, the situation I tested was like the previous > reply: > https://lore.kernel.org/linux-pci/bb40385c-6839-484c-90b2-d6c7ecb95ba9@163.com/ I meant that this commit log suggests the *code path* is different for Root Ports than other devices. > > > might not utilize the full > > > capabilities supported by the controller hardware. This can result in > > > suboptimal data transfer efficiency across the PCIe hierarchy. > > > > > > During host controller probing phase, when PCIe bus tuning is enabled, > > > the implementation now configures root port MPS settings to their > > > hardware-supported maximum values. Specifically, when configuring the MPS > > > for a PCIe device, if the device is a root port and the bus tuning is not > > > disabled (PCIE_BUS_TUNE_OFF), the MPS is set to 128 << dev->pcie_mpss to > > > match the Root Port's maximum supported payload size. The Max Read Request > > > Size (MRRS) is subsequently adjusted through existing companion logic to > > > maintain compatibility with PCIe specifications. > > > > > > Note that this initial setting of the root port MPS to the maximum might > > > be reduced later during the enumeration of downstream devices if any of > > > those devices do not support the maximum MPS of the root port. > > > > > > Explicit initialization at host probing stage ensures consistent PCIe > > > topology configuration before downstream devices perform their own MPS > > > negotiations. This proactive approach addresses platform-specific > > > requirements where controller drivers depend on properly initialized > > > root port settings, while maintaining backward compatibility through > > > PCIE_BUS_TUNE_OFF conditional checks. Hardware capabilities are fully > > > utilized without altering existing device negotiation behaviors. > > Update the MRRS "to maintain compatibility" part. I'm dubious about > > there being a spec compatibility issue with respect to MRRS. Cite the > > relevant section if there is an issue. > > The description is inaccurate. I will correct it. > > I plan to modify the commit message as follows: > If there are any incorrect descriptions, please correct them. Thank you very > much. > > Current PCIe initialization logic may leave Root Ports operating with > non-optimal Maximum Payload Size (MPS) settings. While downstream device > configuration is handled during bus enumeration, Root Port MPS values > inherited from firmware or hardware defaults might not utilize the full > capabilities supported by the controller hardware. This can result in > suboptimal data transfer efficiency across the PCIe hierarchy. > > With this patch, during the host controller probing phase and when PCIe > bus tuning is enabled, the implementation configures Root Port MPS > settings to their hardware-supported maximum values. Specifically, when > configuring the MPS for a PCIe device, if the device is a Root Port and > the bus tuning is not disabled (PCIE_BUS_TUNE_OFF), the MPS is set to > 128 << dev->pcie_mpss to match the Root Port's maximum supported payload > size. The Max Read Request Size (MRRS) is subsequently adjusted by > existing logic in pci_configure_mps() to ensure it is not less than the > MPS, maintaining compliance with PCIe specifications (see PCIe r7.0, > sec 7.5.3.4). AFAICS, sec 7.5.3.4 doesn't say anything about a relationship between MRRS and MPS. MPS is a constraint on the payload size of TLPs with data (Memory Writes, Completions with Data, etc). MRRS is a constraint on the Length field in a Memory Read Request. A single Memory Read can be satisified with several Completions. MPS is one of the things that determines how many Completions are required. This has more details and a lot of good discussion: https://www.xilinx.com/support/documentation/white_papers/wp350.pdf
On Fri, Jun 20, 2025 at 11:55:06PM +0800, Hans Zhang wrote: > Current PCIe initialization logic may leave root ports operating with > non-optimal Maximum Payload Size (MPS) settings. While downstream device > configuration is handled during bus enumeration, root port MPS values > inherited from firmware or hardware defaults might not utilize the full > capabilities supported by the controller hardware. This can result in > suboptimal data transfer efficiency across the PCIe hierarchy. > > During host controller probing phase, when PCIe bus tuning is enabled, > the implementation now configures root port MPS settings to their > hardware-supported maximum values. Specifically, when configuring the MPS > for a PCIe device, if the device is a root port and the bus tuning is not > disabled (PCIE_BUS_TUNE_OFF), the MPS is set to 128 << dev->pcie_mpss to > match the Root Port's maximum supported payload size. The Max Read Request > Size (MRRS) is subsequently adjusted through existing companion logic to > maintain compatibility with PCIe specifications. > > Note that this initial setting of the root port MPS to the maximum might > be reduced later during the enumeration of downstream devices if any of > those devices do not support the maximum MPS of the root port. > > Explicit initialization at host probing stage ensures consistent PCIe > topology configuration before downstream devices perform their own MPS > negotiations. This proactive approach addresses platform-specific > requirements where controller drivers depend on properly initialized > root port settings, while maintaining backward compatibility through > PCIE_BUS_TUNE_OFF conditional checks. Hardware capabilities are fully > utilized without altering existing device negotiation behaviors. > > Suggested-by: Niklas Cassel <cassel@kernel.org> > Suggested-by: Manivannan Sadhasivam <mani@kernel.org> > Signed-off-by: Hans Zhang <18255117159@163.com> Acked-by: Manivannan Sadhasivam <mani@kernel.org> - Mani > --- > drivers/pci/probe.c | 10 ++++++++++ > 1 file changed, 10 insertions(+) > > diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c > index 4b8693ec9e4c..9f8803da914c 100644 > --- a/drivers/pci/probe.c > +++ b/drivers/pci/probe.c > @@ -2178,6 +2178,16 @@ static void pci_configure_mps(struct pci_dev *dev) > return; > } > > + /* > + * Unless MPS strategy is PCIE_BUS_TUNE_OFF (don't touch MPS at all), > + * start off by setting root ports' MPS to MPSS. Depending on the MPS > + * strategy, and the MPSS of the devices below the root port, the MPS > + * of the root port might get overridden later. > + */ > + if (pci_pcie_type(dev) == PCI_EXP_TYPE_ROOT_PORT && > + pcie_bus_config != PCIE_BUS_TUNE_OFF) > + pcie_set_mps(dev, 128 << dev->pcie_mpss); > + > if (!bridge || !pci_is_pcie(bridge)) > return; > > -- > 2.25.1 > -- மணிவண்ணன் சதாசிவம்
© 2016 - 2025 Red Hat, Inc.