drivers/pci/pcie/portdrv.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
Trying to enable bwctrl on a Thunderbolt port causes a boot hang on some
systems though the exact reason is not yet understood. As per the spec
Thunderbolt PCIe Downstream Ports have a fake Max Link Speed of 2.5 GT/s
(USB4 v2 sec 11.2.1):
"Max Link Speed field in the Link Capabilities Register set to 0001b
(data rate of 2.5 GT/s only).
Note: These settings do not represent actual throughput.
Throughput is implementation specific and based on the USB4 Fabric
performance."
More generally if 2.5 GT/s is the only supported link speed there is no
point in throtteling as this is already the lowest possible PCIe speed
so don't advertise the capability stopping bwctrl from being probed on
these ports.
Link: https://lore.kernel.org/linux-pci/Z1R4VNwCOlh9Sg9n@wunner.de/
Fixes: 665745f27487 ("PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller")
Tested-by: Niklas Schnelle <niks@kernel.org>
Signed-off-by: Niklas Schnelle <niks@kernel.org>
---
Note: This issue causes a boot hang on my personal workstation see the
Link for details.
---
drivers/pci/pcie/portdrv.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c
index 5e10306b63081b1ddd13e0a545418e2a8610c14c..e5f80e4a11aad4ce60b2ce998b40ec9fda8c653d 100644
--- a/drivers/pci/pcie/portdrv.c
+++ b/drivers/pci/pcie/portdrv.c
@@ -270,7 +270,8 @@ static int get_port_device_capability(struct pci_dev *dev)
u32 linkcap;
pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &linkcap);
- if (linkcap & PCI_EXP_LNKCAP_LBNC)
+ if (linkcap & PCI_EXP_LNKCAP_LBNC &&
+ (linkcap & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB)
services |= PCIE_PORT_SERVICE_BWCTRL;
}
---
base-commit: 40384c840ea1944d7c5a392e8975ed088ecf0b37
change-id: 20241207-fix_bwctrl_thunderbolt-bd1f96b3d98f
Best regards,
--
Niklas Schnelle <niks@kernel.org>
On Sat, Dec 07, 2024 at 07:44:09PM +0100, Niklas Schnelle wrote:
> Trying to enable bwctrl on a Thunderbolt port causes a boot hang on some
> systems though the exact reason is not yet understood.
Probably worth highlighting the discrete Thunderbolt chip which exhibits
this issue, i.e. Intel JHL7540 (Titan Ridge).
> --- a/drivers/pci/pcie/portdrv.c
> +++ b/drivers/pci/pcie/portdrv.c
> @@ -270,7 +270,8 @@ static int get_port_device_capability(struct pci_dev *dev)
> u32 linkcap;
>
> pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &linkcap);
> - if (linkcap & PCI_EXP_LNKCAP_LBNC)
> + if (linkcap & PCI_EXP_LNKCAP_LBNC &&
> + (linkcap & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB)
> services |= PCIE_PORT_SERVICE_BWCTRL;
> }
This is fine in principle because PCIe r6.2 sec 8.2.1 states:
"A device must support 2.5 GT/s and is not permitted to skip support
for any data rates between 2.5 GT/s and the highest supported rate."
However the Implementation Note at the end of PCIe r6.2 sec 7.5.3.18
cautions:
"It is strongly encouraged that software primarily utilize the
Supported Link Speeds Vector instead of the Max Link Speed field,
so that software can determine the exact set of supported speeds
on current and future hardware. This can avoid software being
confused if a future specification defines Links that do not
require support for all slower speeds."
First of all, the Supported Link Speeds field in the Link Capabilities
register (which you're querying here) was renamed to Max Link Speed in
PCIe r3.1 and a new Link Capabilities 2 register was added which contains
a new Supported Link Speeds field. Software is supposed to query the
latter if the device implements the Link Capabilities 2 register
(see the other Implementation Note at the end of PCIe r6.2 sec 7.5.3.18).
Second, the above-quoted Implementation Note says that software should
not rely on future spec versions to mandate that *all* link speeds
(2.5 GT/s and all intermediate speeds up to the maximum supported speed)
are supported.
Since v6.13-rc1, we cache the supported speeds in the "supported_speeds"
field in struct pci_dev, taking care of the PCIe 3.0 versus later versions
issue.
So to make this future-proof what you could do is check whether only a
*single* speed is supported (which could be something else than 2.5 GT/s
if future spec versions allow that), i.e.:
- if (linkcap & PCI_EXP_LNKCAP_LBNC)
+ if (linkcap & PCI_EXP_LNKCAP_LBNC &&
+ hweight8(dev->supported_speeds) > 1)
...and optionally add a code comment, e.g.:
/* Enable bandwidth control if more than one speed is supported. */
Thanks,
Lukas
On Tue, 2024-12-10 at 11:05 +0100, Lukas Wunner wrote:
> On Sat, Dec 07, 2024 at 07:44:09PM +0100, Niklas Schnelle wrote:
> > Trying to enable bwctrl on a Thunderbolt port causes a boot hang on some
> > systems though the exact reason is not yet understood.
>
> Probably worth highlighting the discrete Thunderbolt chip which exhibits
> this issue, i.e. Intel JHL7540 (Titan Ridge).
Agree will ad for v2.
>
> > --- a/drivers/pci/pcie/portdrv.c
> > +++ b/drivers/pci/pcie/portdrv.c
> > @@ -270,7 +270,8 @@ static int get_port_device_capability(struct pci_dev *dev)
> > u32 linkcap;
> >
> > pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &linkcap);
> > - if (linkcap & PCI_EXP_LNKCAP_LBNC)
> > + if (linkcap & PCI_EXP_LNKCAP_LBNC &&
> > + (linkcap & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB)
> > services |= PCIE_PORT_SERVICE_BWCTRL;
> > }
>
> This is fine in principle because PCIe r6.2 sec 8.2.1 states:
>
> "A device must support 2.5 GT/s and is not permitted to skip support
> for any data rates between 2.5 GT/s and the highest supported rate."
>
> However the Implementation Note at the end of PCIe r6.2 sec 7.5.3.18
> cautions:
>
> "It is strongly encouraged that software primarily utilize the
> Supported Link Speeds Vector instead of the Max Link Speed field,
> so that software can determine the exact set of supported speeds
> on current and future hardware. This can avoid software being
> confused if a future specification defines Links that do not
> require support for all slower speeds."
>
> First of all, the Supported Link Speeds field in the Link Capabilities
> register (which you're querying here) was renamed to Max Link Speed in
> PCIe r3.1 and a new Link Capabilities 2 register was added which contains
> a new Supported Link Speeds field. Software is supposed to query the
> latter if the device implements the Link Capabilities 2 register
> (see the other Implementation Note at the end of PCIe r6.2 sec 7.5.3.18).
Would it maybe make sense to update the comment for PCI_EXP_LNKCAP_SLS
in pci_regs.h to point out that in PCIe r3.1 and newer this is called
the Max Link Speed field? This would certainly helped me here.
>
> Second, the above-quoted Implementation Note says that software should
> not rely on future spec versions to mandate that *all* link speeds
> (2.5 GT/s and all intermediate speeds up to the maximum supported speed)
> are supported.
>
> Since v6.13-rc1, we cache the supported speeds in the "supported_speeds"
> field in struct pci_dev, taking care of the PCIe 3.0 versus later versions
> issue.
>
> So to make this future-proof what you could do is check whether only a
> *single* speed is supported (which could be something else than 2.5 GT/s
> if future spec versions allow that), i.e.:
>
> - if (linkcap & PCI_EXP_LNKCAP_LBNC)
> + if (linkcap & PCI_EXP_LNKCAP_LBNC &&
> + hweight8(dev->supported_speeds) > 1)
This also makes sense to me in that the argument holds that if there is
only one supported speed bwctrl can't control it. That said it is
definitely more general than this patch.
Sadly, I tried it and in my case it doesn't work. Taking a closer look
at lspci -vvv of the Thunderbolt port as well as a debug print reveals
why:
07:00.0 PCI bridge: Intel Corporation JHL7540 Thunderbolt 3 Bridge [Titan Ridge 4C 2018] (rev 06) (prog-if 00 [Normal decode])
...
LnkCap: Port #0, Speed 2.5GT/s, Width x4, ASPM L1, Exit Latency L1 <1us
ClockPM- Surprise- LLActRep- BwNot+ ASPMOptComp+
LnkCtl: ASPM Disabled; LnkDisable- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x4
TrErr- Train- SlotClk+ DLActive- BWMgmt+ ABWMgmt-
...
LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
...
So it seems that on this Thunderbolt chip the LnkCap field
says 2.5 GT/s only as per the USB 4 spec you quoted but LnkCap2
is 0x0E i.e. 2.5-8 GT/s.
I wonder if this is related to why the hang occurs. Could it be that
bwctrl tries to enable speeds above 2.5 GT/s and that causes links to
fail?
Thanks,
Niklas
On Sat, Dec 07, 2024 at 07:44:09PM +0100, Niklas Schnelle wrote:
> Trying to enable bwctrl on a Thunderbolt port causes a boot hang on some
> systems though the exact reason is not yet understood. As per the spec
> Thunderbolt PCIe Downstream Ports have a fake Max Link Speed of 2.5 GT/s
> (USB4 v2 sec 11.2.1):
>
> "Max Link Speed field in the Link Capabilities Register set to 0001b
> (data rate of 2.5 GT/s only).
> Note: These settings do not represent actual throughput.
> Throughput is implementation specific and based on the USB4 Fabric
> performance."
>
> More generally if 2.5 GT/s is the only supported link speed there is no
> point in throtteling as this is already the lowest possible PCIe speed
> so don't advertise the capability stopping bwctrl from being probed on
> these ports.
>
> Link: https://lore.kernel.org/linux-pci/Z1R4VNwCOlh9Sg9n@wunner.de/
> Fixes: 665745f27487 ("PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller")
> Tested-by: Niklas Schnelle <niks@kernel.org>
> Signed-off-by: Niklas Schnelle <niks@kernel.org>
Makes sense to me,
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
On Sat, 7 Dec 2024, Niklas Schnelle wrote:
> Trying to enable bwctrl on a Thunderbolt port causes a boot hang on some
> systems though the exact reason is not yet understood. As per the spec
> Thunderbolt PCIe Downstream Ports have a fake Max Link Speed of 2.5 GT/s
> (USB4 v2 sec 11.2.1):
>
> "Max Link Speed field in the Link Capabilities Register set to 0001b
> (data rate of 2.5 GT/s only).
> Note: These settings do not represent actual throughput.
> Throughput is implementation specific and based on the USB4 Fabric
> performance."
>
> More generally if 2.5 GT/s is the only supported link speed there is no
> point in throtteling as this is already the lowest possible PCIe speed
> so don't advertise the capability stopping bwctrl from being probed on
> these ports.
Hi,
Thanks for finding the reason this far.
Mika mentioned earlier to me this Link Speed stuff is all made up on
Thunderbolt but I didn't make any changes back then because I thought
bwctrl is not going to change the speed anyway in that case.
Seem reasonable way to workaround the Thunderbold problem and agreed,
there's not much point in adding bwctrl for 2.5GT/s only ports.
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
--
i.
> Link: https://lore.kernel.org/linux-pci/Z1R4VNwCOlh9Sg9n@wunner.de/
> Fixes: 665745f27487 ("PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller")
> Tested-by: Niklas Schnelle <niks@kernel.org>
> Signed-off-by: Niklas Schnelle <niks@kernel.org>
> ---
> Note: This issue causes a boot hang on my personal workstation see the
> Link for details.
> ---
> drivers/pci/pcie/portdrv.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c
> index 5e10306b63081b1ddd13e0a545418e2a8610c14c..e5f80e4a11aad4ce60b2ce998b40ec9fda8c653d 100644
> --- a/drivers/pci/pcie/portdrv.c
> +++ b/drivers/pci/pcie/portdrv.c
> @@ -270,7 +270,8 @@ static int get_port_device_capability(struct pci_dev *dev)
> u32 linkcap;
>
> pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &linkcap);
> - if (linkcap & PCI_EXP_LNKCAP_LBNC)
> + if (linkcap & PCI_EXP_LNKCAP_LBNC &&
> + (linkcap & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB)
> services |= PCIE_PORT_SERVICE_BWCTRL;
> }
>
>
> ---
> base-commit: 40384c840ea1944d7c5a392e8975ed088ecf0b37
> change-id: 20241207-fix_bwctrl_thunderbolt-bd1f96b3d98f
>
> Best regards,
>
On Sat, 2024-12-07 at 19:44 +0100, Niklas Schnelle wrote:
> Trying to enable bwctrl on a Thunderbolt port causes a boot hang on some
> systems though the exact reason is not yet understood. As per the spec
> Thunderbolt PCIe Downstream Ports have a fake Max Link Speed of 2.5 GT/s
> (USB4 v2 sec 11.2.1):
>
> "Max Link Speed field in the Link Capabilities Register set to 0001b
> (data rate of 2.5 GT/s only).
> Note: These settings do not represent actual throughput.
> Throughput is implementation specific and based on the USB4 Fabric
> performance."
>
> More generally if 2.5 GT/s is the only supported link speed there is no
> point in throtteling as this is already the lowest possible PCIe speed
> so don't advertise the capability stopping bwctrl from being probed on
> these ports.
>
> Link: https://lore.kernel.org/linux-pci/Z1R4VNwCOlh9Sg9n@wunner.de/
> Fixes: 665745f27487 ("PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller")
> Tested-by: Niklas Schnelle <niks@kernel.org>
> Signed-off-by: Niklas Schnelle <niks@kernel.org>
Should probably add but forgot:
Suggested-by: Lukas Wunner <lukas@wunner.de>
> ---
> Note: This issue causes a boot hang on my personal workstation see the
> Link for details.
> ---
> drivers/pci/pcie/portdrv.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/pci/pcie/portdrv.c b/drivers/pci/pcie/portdrv.c
> index 5e10306b63081b1ddd13e0a545418e2a8610c14c..e5f80e4a11aad4ce60b2ce998b40ec9fda8c653d 100644
> --- a/drivers/pci/pcie/portdrv.c
> +++ b/drivers/pci/pcie/portdrv.c
> @@ -270,7 +270,8 @@ static int get_port_device_capability(struct pci_dev *dev)
> u32 linkcap;
>
> pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &linkcap);
> - if (linkcap & PCI_EXP_LNKCAP_LBNC)
> + if (linkcap & PCI_EXP_LNKCAP_LBNC &&
> + (linkcap & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB)
> services |= PCIE_PORT_SERVICE_BWCTRL;
> }
>
>
> ---
> base-commit: 40384c840ea1944d7c5a392e8975ed088ecf0b37
> change-id: 20241207-fix_bwctrl_thunderbolt-bd1f96b3d98f
>
> Best regards,
© 2016 - 2025 Red Hat, Inc.