drivers/pci/quirks.c | 50 ++++++++++++++++++-------------------------------- 1 file changed, 18 insertions(+), 32 deletions(-)
Discard Vendor:Device ID matching in the PCIe failed link retraining
quirk and ignore the link status for the removal of the 2.5GT/s speed
clamp, whether applied by the quirk itself or the firmware earlier on.
Revert to the original target link speed if this final link retraining
has failed.
This is so that link training noise in hot-plug scenarios does not make
a link remain clamped to the 2.5GT/s speed where an event race has led
the quirk to apply the speed clamp for one device, only to leave it in
place for a subsequent device to be plugged in.
Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures")
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Cc: <stable@vger.kernel.org> # v6.5+
---
drivers/pci/quirks.c | 50 ++++++++++++++++++--------------------------------
1 file changed, 18 insertions(+), 32 deletions(-)
linux-pcie-failed-link-retrain-unclamp-always.diff
Index: linux-macro/drivers/pci/quirks.c
===================================================================
--- linux-macro.orig/drivers/pci/quirks.c
+++ linux-macro/drivers/pci/quirks.c
@@ -79,11 +79,10 @@ static bool pcie_lbms_seen(struct pci_de
* Restrict the speed to 2.5GT/s then with the Target Link Speed field,
* request a retrain and check the result.
*
- * If this turns out successful and we know by the Vendor:Device ID it is
- * safe to do so, then lift the restriction, letting the devices negotiate
- * a higher speed. Also check for a similar 2.5GT/s speed restriction the
- * firmware may have already arranged and lift it with ports that already
- * report their data link being up.
+ * If this turns out successful, or where a 2.5GT/s speed restriction has
+ * been previously arranged by the firmware and the port reports its link
+ * already being up, lift the restriction, in a hope it is safe to do so,
+ * letting the devices negotiate a higher speed.
*
* Otherwise revert the speed to the original setting and request a retrain
* again to remove any residual state, ignoring the result as it's supposed
@@ -94,52 +93,39 @@ static bool pcie_lbms_seen(struct pci_de
*/
int pcie_failed_link_retrain(struct pci_dev *dev)
{
- static const struct pci_device_id ids[] = {
- { PCI_VDEVICE(ASMEDIA, 0x2824) }, /* ASMedia ASM2824 */
- {}
- };
- u16 lnksta, lnkctl2;
+ u16 lnksta, lnkctl2, oldlnkctl2;
int ret = -ENOTTY;
+ u32 lnkcap;
if (!pci_is_pcie(dev) || !pcie_downstream_port(dev) ||
!pcie_cap_has_lnkctl2(dev) || !dev->link_active_reporting)
return ret;
pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
+ pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &oldlnkctl2);
if (!(lnksta & PCI_EXP_LNKSTA_DLLLA) && pcie_lbms_seen(dev, lnksta)) {
- u16 oldlnkctl2;
-
pci_info(dev, "broken device, retraining non-functional downstream link at 2.5GT/s\n");
- pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &oldlnkctl2);
ret = pcie_set_target_speed(dev, PCIE_SPEED_2_5GT, false);
- if (ret) {
- pci_info(dev, "retraining failed\n");
- pcie_set_target_speed(dev, PCIE_LNKCTL2_TLS2SPEED(oldlnkctl2),
- true);
- return ret;
- }
-
- pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
+ if (ret)
+ goto err;
}
pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &lnkctl2);
-
- if ((lnksta & PCI_EXP_LNKSTA_DLLLA) &&
- (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
- pci_match_id(ids, dev)) {
- u32 lnkcap;
-
+ pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
+ if ((lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
+ (lnkcap & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB) {
pci_info(dev, "removing 2.5GT/s downstream link speed restriction\n");
- pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
ret = pcie_set_target_speed(dev, PCIE_LNKCAP_SLS2SPEED(lnkcap), false);
- if (ret) {
- pci_info(dev, "retraining failed\n");
- return ret;
- }
+ if (ret)
+ goto err;
}
return ret;
+err:
+ pci_info(dev, "retraining failed\n");
+ pcie_set_target_speed(dev, PCIE_LNKCTL2_TLS2SPEED(oldlnkctl2), true);
+ return ret;
}
static ktime_t fixup_debug_start(struct pci_dev *dev,
On Mon, 1 Dec 2025, Maciej W. Rozycki wrote:
> Discard Vendor:Device ID matching in the PCIe failed link retraining
> quirk and ignore the link status for the removal of the 2.5GT/s speed
> clamp, whether applied by the quirk itself or the firmware earlier on.
> Revert to the original target link speed if this final link retraining
> has failed.
>
> This is so that link training noise in hot-plug scenarios does not make
> a link remain clamped to the 2.5GT/s speed where an event race has led
> the quirk to apply the speed clamp for one device, only to leave it in
> place for a subsequent device to be plugged in.
>
> Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures")
> Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
> Cc: <stable@vger.kernel.org> # v6.5+
> ---
> drivers/pci/quirks.c | 50 ++++++++++++++++++--------------------------------
> 1 file changed, 18 insertions(+), 32 deletions(-)
>
> linux-pcie-failed-link-retrain-unclamp-always.diff
> Index: linux-macro/drivers/pci/quirks.c
> ===================================================================
> --- linux-macro.orig/drivers/pci/quirks.c
> +++ linux-macro/drivers/pci/quirks.c
> @@ -79,11 +79,10 @@ static bool pcie_lbms_seen(struct pci_de
> * Restrict the speed to 2.5GT/s then with the Target Link Speed field,
> * request a retrain and check the result.
> *
> - * If this turns out successful and we know by the Vendor:Device ID it is
> - * safe to do so, then lift the restriction, letting the devices negotiate
> - * a higher speed. Also check for a similar 2.5GT/s speed restriction the
> - * firmware may have already arranged and lift it with ports that already
> - * report their data link being up.
> + * If this turns out successful, or where a 2.5GT/s speed restriction has
> + * been previously arranged by the firmware and the port reports its link
> + * already being up, lift the restriction, in a hope it is safe to do so,
> + * letting the devices negotiate a higher speed.
> *
> * Otherwise revert the speed to the original setting and request a retrain
> * again to remove any residual state, ignoring the result as it's supposed
> @@ -94,52 +93,39 @@ static bool pcie_lbms_seen(struct pci_de
> */
> int pcie_failed_link_retrain(struct pci_dev *dev)
> {
> - static const struct pci_device_id ids[] = {
> - { PCI_VDEVICE(ASMEDIA, 0x2824) }, /* ASMedia ASM2824 */
> - {}
> - };
> - u16 lnksta, lnkctl2;
> + u16 lnksta, lnkctl2, oldlnkctl2;
> int ret = -ENOTTY;
> + u32 lnkcap;
>
> if (!pci_is_pcie(dev) || !pcie_downstream_port(dev) ||
> !pcie_cap_has_lnkctl2(dev) || !dev->link_active_reporting)
> return ret;
>
> pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
> + pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &oldlnkctl2);
> if (!(lnksta & PCI_EXP_LNKSTA_DLLLA) && pcie_lbms_seen(dev, lnksta)) {
> - u16 oldlnkctl2;
> -
> pci_info(dev, "broken device, retraining non-functional downstream link at 2.5GT/s\n");
>
> - pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &oldlnkctl2);
> ret = pcie_set_target_speed(dev, PCIE_SPEED_2_5GT, false);
> - if (ret) {
> - pci_info(dev, "retraining failed\n");
> - pcie_set_target_speed(dev, PCIE_LNKCTL2_TLS2SPEED(oldlnkctl2),
> - true);
> - return ret;
> - }
> -
> - pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
> + if (ret)
> + goto err;
> }
>
> pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &lnkctl2);
> -
> - if ((lnksta & PCI_EXP_LNKSTA_DLLLA) &&
> - (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
> - pci_match_id(ids, dev)) {
> - u32 lnkcap;
> -
> + pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
> + if ((lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
> + (lnkcap & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB) {
I'm trying to recall, if there was some particular reason why
->supported_speeds couldn't be used in this function. It would avoid the
need to read LinkCap at all.
> pci_info(dev, "removing 2.5GT/s downstream link speed restriction\n");
> - pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
> ret = pcie_set_target_speed(dev, PCIE_LNKCAP_SLS2SPEED(lnkcap), false);
> - if (ret) {
> - pci_info(dev, "retraining failed\n");
> - return ret;
> - }
> + if (ret)
> + goto err;
> }
>
> return ret;
return 0;
> +err:
> + pci_info(dev, "retraining failed\n");
> + pcie_set_target_speed(dev, PCIE_LNKCTL2_TLS2SPEED(oldlnkctl2), true);
> + return ret;
> }
>
> static ktime_t fixup_debug_start(struct pci_dev *dev,
>
--
i.
On Mon, 1 Dec 2025, Ilpo Järvinen wrote:
> > + pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
> > + if ((lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
> > + (lnkcap & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB) {
>
> I'm trying to recall, if there was some particular reason why
> ->supported_speeds couldn't be used in this function. It would avoid the
> need to read LinkCap at all.
Thanks for the hint. There's probably none and it's just me missing some
of the zillion bits and pieces. I'll wait a couple of days for any other
people to chime in and respin with this update included if everyone is
otherwise happy to proceed with this update.
> > + if (ret)
> > + goto err;
> > }
> >
> > return ret;
>
> return 0;
It can still return -ENOTTY if neither of the two latter conditionals
matched, meaning the quirk was not applicable after all. ISTR you had
issues with the structure of this code before; I am not sure if it can
be made any better in a reasonable way. It is not a failure per se, so
the newly-added common error path does not apply. This is the case for:
"Return an error if retraining was not needed[...]" from the introductory
comment.
Shall I add a comment above the return statement referring to this?
Maciej
On Mon, 1 Dec 2025, Maciej W. Rozycki wrote:
> On Mon, 1 Dec 2025, Ilpo Järvinen wrote:
>
> > > + pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
> > > + if ((lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
> > > + (lnkcap & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB) {
> >
> > I'm trying to recall, if there was some particular reason why
> > ->supported_speeds couldn't be used in this function. It would avoid the
> > need to read LinkCap at all.
>
> Thanks for the hint. There's probably none and it's just me missing some
> of the zillion bits and pieces. I'll wait a couple of days for any other
> people to chime in and respin with this update included if everyone is
> otherwise happy to proceed with this update.
>
> > > + if (ret)
> > > + goto err;
> > > }
> > >
> > > return ret;
> >
> > return 0;
>
> It can still return -ENOTTY if neither of the two latter conditionals
> matched, meaning the quirk was not applicable after all. ISTR you had
> issues with the structure of this code before; I am not sure if it can
> be made any better in a reasonable way. It is not a failure per se, so
> the newly-added common error path does not apply. This is the case for:
> "Return an error if retraining was not needed[...]" from the introductory
> comment.
>
> Shall I add a comment above the return statement referring to this?
I think it's fine as is, I just didn't review with enough context to
notice what it was initialized to (the usual thing when adding a
rollback path is to forget to change the normal path to return 0, thus
"auto commenting" it without checking enough, I'm sorry about that).
--
i.
© 2016 - 2025 Red Hat, Inc.