drivers/pci/quirks.c | 50 ++++++++++++++++++-------------------------------- 1 file changed, 18 insertions(+), 32 deletions(-)
Discard Vendor:Device ID matching in the PCIe failed link retraining
quirk and ignore the link status for the removal of the 2.5GT/s speed
clamp, whether applied by the quirk itself or the firmware earlier on.
Revert to the original target link speed if this final link retraining
has failed.
This is so that link training noise in hot-plug scenarios does not make
a link remain clamped to the 2.5GT/s speed where an event race has led
the quirk to apply the speed clamp for one device, only to leave it in
place for a subsequent device to be plugged in.
Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures")
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Cc: <stable@vger.kernel.org> # v6.5+
---
drivers/pci/quirks.c | 50 ++++++++++++++++++--------------------------------
1 file changed, 18 insertions(+), 32 deletions(-)
linux-pcie-failed-link-retrain-unclamp-always.diff
Index: linux-macro/drivers/pci/quirks.c
===================================================================
--- linux-macro.orig/drivers/pci/quirks.c
+++ linux-macro/drivers/pci/quirks.c
@@ -79,11 +79,10 @@ static bool pcie_lbms_seen(struct pci_de
* Restrict the speed to 2.5GT/s then with the Target Link Speed field,
* request a retrain and check the result.
*
- * If this turns out successful and we know by the Vendor:Device ID it is
- * safe to do so, then lift the restriction, letting the devices negotiate
- * a higher speed. Also check for a similar 2.5GT/s speed restriction the
- * firmware may have already arranged and lift it with ports that already
- * report their data link being up.
+ * If this turns out successful, or where a 2.5GT/s speed restriction has
+ * been previously arranged by the firmware and the port reports its link
+ * already being up, lift the restriction, in a hope it is safe to do so,
+ * letting the devices negotiate a higher speed.
*
* Otherwise revert the speed to the original setting and request a retrain
* again to remove any residual state, ignoring the result as it's supposed
@@ -94,52 +93,39 @@ static bool pcie_lbms_seen(struct pci_de
*/
int pcie_failed_link_retrain(struct pci_dev *dev)
{
- static const struct pci_device_id ids[] = {
- { PCI_VDEVICE(ASMEDIA, 0x2824) }, /* ASMedia ASM2824 */
- {}
- };
- u16 lnksta, lnkctl2;
+ u16 lnksta, lnkctl2, oldlnkctl2;
int ret = -ENOTTY;
+ u32 lnkcap;
if (!pci_is_pcie(dev) || !pcie_downstream_port(dev) ||
!pcie_cap_has_lnkctl2(dev) || !dev->link_active_reporting)
return ret;
pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
+ pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &oldlnkctl2);
if (!(lnksta & PCI_EXP_LNKSTA_DLLLA) && pcie_lbms_seen(dev, lnksta)) {
- u16 oldlnkctl2;
-
pci_info(dev, "broken device, retraining non-functional downstream link at 2.5GT/s\n");
- pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &oldlnkctl2);
ret = pcie_set_target_speed(dev, PCIE_SPEED_2_5GT, false);
- if (ret) {
- pci_info(dev, "retraining failed\n");
- pcie_set_target_speed(dev, PCIE_LNKCTL2_TLS2SPEED(oldlnkctl2),
- true);
- return ret;
- }
-
- pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
+ if (ret)
+ goto err;
}
pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &lnkctl2);
-
- if ((lnksta & PCI_EXP_LNKSTA_DLLLA) &&
- (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
- pci_match_id(ids, dev)) {
- u32 lnkcap;
-
+ pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
+ if ((lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
+ (lnkcap & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB) {
pci_info(dev, "removing 2.5GT/s downstream link speed restriction\n");
- pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
ret = pcie_set_target_speed(dev, PCIE_LNKCAP_SLS2SPEED(lnkcap), false);
- if (ret) {
- pci_info(dev, "retraining failed\n");
- return ret;
- }
+ if (ret)
+ goto err;
}
return ret;
+err:
+ pci_info(dev, "retraining failed\n");
+ pcie_set_target_speed(dev, PCIE_LNKCTL2_TLS2SPEED(oldlnkctl2), true);
+ return ret;
}
static ktime_t fixup_debug_start(struct pci_dev *dev,
On Mon, 1 Dec 2025, Maciej W. Rozycki wrote: > Discard Vendor:Device ID matching in the PCIe failed link retraining > quirk and ignore the link status for the removal of the 2.5GT/s speed > clamp, whether applied by the quirk itself or the firmware earlier on. > Revert to the original target link speed if this final link retraining > has failed. I think we should either remove the quirk or only execute the quirk when the downstream port is the specific ASMedia 0x2824. Hardware companies that develop PCIe devices rely on the linux kernel for a significant amount of their testing & the action taken by this quirk is going to introduce noise into those tests by initiating unexpected speed changes etc. As long as we have this quirk messing with link speeds we'll just continue to see issue reports over time in my opinion.
On Wed, 3 Dec 2025, Matthew W Carlis wrote: > > Discard Vendor:Device ID matching in the PCIe failed link retraining > > quirk and ignore the link status for the removal of the 2.5GT/s speed > > clamp, whether applied by the quirk itself or the firmware earlier on. > > Revert to the original target link speed if this final link retraining > > has failed. > > I think we should either remove the quirk or only execute the quirk when the > downstream port is the specific ASMedia 0x2824. Hardware companies that > develop PCIe devices rely on the linux kernel for a significant amount of > their testing & the action taken by this quirk is going to introduce > noise into those tests by initiating unexpected speed changes etc. Conversely, ISTM this could be good motivation for hardware designers to reduce hot-plug noise. After all LBMS is only supposed to be ever set for links in the active state and not while training, so perhaps debouncing is needed for the transient state? > As long as we have this quirk messing with link speeds we'll just > continue to see issue reports over time in my opinion. Well, the issues happened because I made an unfortunate design decision with the original implementation which did not clean up after itself, just because I have no use for hot-plug scenarios and didn't envisage it could be causing issues. The most recent update brings the device back to its original state whether retraining succeeded or not, so it should now be transparent to your noisy hot-plug scenarios, while helping stubborn devices at the same time. You might have not noticed this code existed if it had been in its currently proposed shape right from the beginning. It's only those who do nothing that make no mistakes. Maciej
I'm sorry my last response kind of messed up the threading in this chain... I don't understand why we still think its a good idea to have this action in the kernel for any device type since it seems to only help Maciej W. Rozycki's specific situation which is very likely to be the only one of its kind. In addition the Delock adapter isn't what I would consider a particularly "solid" device. For what its worth I have a particular Gen5 device in my lab here that gets stuck in an infinite link up-down loop when you force the speed to Gen2 when also combined with a specific downstream port... I'm sure there is another device somewhere else in the world that has the same issue when you force it to Gen1. The kernel should assume a PCIe link will automatically train to the best speed that the devices can achieve & if the link fails to train then it should be up to the user space to decide what to do with it in my opinion.
On Thu, 4 Dec 2025, Matthew W Carlis wrote: > For what its worth I have a particular Gen5 device in my lab here that gets stuck > in an infinite link up-down loop when you force the speed to Gen2 when also combined > with a specific downstream port... I'm sure there is another device somewhere else > in the world that has the same issue when you force it to Gen1. Well, it's an erratum vs another erratum case. Then should such a device at its maximum link speed trigger the workaround somehow, with this fix in place any temporary clamp will be lifted anyway and the link retrained, so it will recover from the link up-down loop. So no functional regression. > The kernel should assume a PCIe link will automatically train to the best > speed that the devices can achieve & if the link fails to train then it should > be up to the user space to decide what to do with it in my opinion. It might be tough where you have your rootfs device down the problematic link. Thank you for your input. Maciej
On 12/1/2025 9:22 AM, Maciej W. Rozycki wrote:
> Discard Vendor:Device ID matching in the PCIe failed link retraining
> quirk and ignore the link status for the removal of the 2.5GT/s speed
> clamp, whether applied by the quirk itself or the firmware earlier on.
> Revert to the original target link speed if this final link retraining
> has failed.
>
> This is so that link training noise in hot-plug scenarios does not make
> a link remain clamped to the 2.5GT/s speed where an event race has led
> the quirk to apply the speed clamp for one device, only to leave it in
> place for a subsequent device to be plugged in.
>
> Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures")
> Signed-off-by: Maciej W. Rozycki<macro@orcam.me.uk>
> Cc:<stable@vger.kernel.org> # v6.5+
> ---
> drivers/pci/quirks.c | 50 ++++++++++++++++++--------------------------------
> 1 file changed, 18 insertions(+), 32 deletions(-)
>
> linux-pcie-failed-link-retrain-unclamp-always.diff
> Index: linux-macro/drivers/pci/quirks.c
> ===================================================================
> --- linux-macro.orig/drivers/pci/quirks.c
> +++ linux-macro/drivers/pci/quirks.c
Thanks a lot for your patch.
The patch works, and the issue has been resolved in our testing.
However, this patch does not cleanly apply to the 6.12 LTS kernel.
To apply the fix cleanly, a series of patches is required.
PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
2389d8dc38fee PCI/bwctrl: Replace lbms_count with PCI_LINK_LBMS_SEEN flag
15b8968dcb90f PCI/bwctrl: Fix NULL pointer deref on unbind and bind
e3f30d563a388 PCI: Make pci_destroy_dev() concurrent safe
667f053b05f00 PCI/bwctrl: Fix NULL pointer dereference on bus number
exhaustion
e93d9fcfd7dc6 PCI: Refactor pcie_update_link_speed()
9989e0ca7462c PCI: Fix link speed calculation on retrain failure
b85af48de3ece PCI: Adjust the position of reading the Link Control 2
register
026e4bffb0af9 PCI/bwctrl: Fix pcie_bwctrl_select_speed() return type
d278b098282d1 thermal: Add PCIe cooling driver
de9a6c8d5dbfe PCI/bwctrl: Add pcie_set_target_speed() to set PCIe Link Speed
665745f274870 PCI/bwctrl: Re-add BW notification portdrv as PCIe BW
controller
3491f50966686 PCI: Abstract LBMS seen check into pcie_lbms_seen()
Could you please provide a version of this patch that can be
cleanly cherry-picked for the 6.12 LTS (6.12.y) branch?
Alternatively, is it okay to back-port the above patch series to 6.12.y?
Tested-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Thanks,
Alok
On Tue, 2 Dec 2025, ALOK TIWARI wrote:
> On 12/1/2025 9:22 AM, Maciej W. Rozycki wrote:
> > Discard Vendor:Device ID matching in the PCIe failed link retraining
> > quirk and ignore the link status for the removal of the 2.5GT/s speed
> > clamp, whether applied by the quirk itself or the firmware earlier on.
> > Revert to the original target link speed if this final link retraining
> > has failed.
> >
> > This is so that link training noise in hot-plug scenarios does not make
> > a link remain clamped to the 2.5GT/s speed where an event race has led
> > the quirk to apply the speed clamp for one device, only to leave it in
> > place for a subsequent device to be plugged in.
> >
> > Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures")
> > Signed-off-by: Maciej W. Rozycki<macro@orcam.me.uk>
> > Cc:<stable@vger.kernel.org> # v6.5+
> > ---
> > drivers/pci/quirks.c | 50
> > ++++++++++++++++++--------------------------------
> > 1 file changed, 18 insertions(+), 32 deletions(-)
> >
> > linux-pcie-failed-link-retrain-unclamp-always.diff
> > Index: linux-macro/drivers/pci/quirks.c
> > ===================================================================
> > --- linux-macro.orig/drivers/pci/quirks.c
> > +++ linux-macro/drivers/pci/quirks.c
>
> Thanks a lot for your patch.
> The patch works, and the issue has been resolved in our testing.
>
> However, this patch does not cleanly apply to the 6.12 LTS kernel.
> To apply the fix cleanly, a series of patches is required.
>
> PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining
> 2389d8dc38fee PCI/bwctrl: Replace lbms_count with PCI_LINK_LBMS_SEEN flag
> 15b8968dcb90f PCI/bwctrl: Fix NULL pointer deref on unbind and bind
> e3f30d563a388 PCI: Make pci_destroy_dev() concurrent safe
> 667f053b05f00 PCI/bwctrl: Fix NULL pointer dereference on bus number
> exhaustion
> e93d9fcfd7dc6 PCI: Refactor pcie_update_link_speed()
> 9989e0ca7462c PCI: Fix link speed calculation on retrain failure
> b85af48de3ece PCI: Adjust the position of reading the Link Control 2 register
> 026e4bffb0af9 PCI/bwctrl: Fix pcie_bwctrl_select_speed() return type
> d278b098282d1 thermal: Add PCIe cooling driver
> de9a6c8d5dbfe PCI/bwctrl: Add pcie_set_target_speed() to set PCIe Link Speed
> 665745f274870 PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller
> 3491f50966686 PCI: Abstract LBMS seen check into pcie_lbms_seen()
>
> Could you please provide a version of this patch that can be
> cleanly cherry-picked for the 6.12 LTS (6.12.y) branch?
>
> Alternatively, is it okay to back-port the above patch series to 6.12.y?
>
>
> Tested-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Thanks for testing.
As the change has a Fixes tag, stable maintainers will eventually get to
it (after the change has progressed into Linus' tree).
As per usual, if the patch won't apply to some old kernel version cleanly,
stable maintainers will decide themselves whether they end up taking some
extra changes or ask for a backport from the submitter of the original
change.
--
i.
On Tue, 2 Dec 2025, Ilpo Järvinen wrote: > > > linux-pcie-failed-link-retrain-unclamp-always.diff > > > Index: linux-macro/drivers/pci/quirks.c > > > =================================================================== > > > --- linux-macro.orig/drivers/pci/quirks.c > > > +++ linux-macro/drivers/pci/quirks.c > > > > Thanks a lot for your patch. > > The patch works, and the issue has been resolved in our testing. Great, thanks for running the verification! > > However, this patch does not cleanly apply to the 6.12 LTS kernel. > > To apply the fix cleanly, a series of patches is required. > > > > PCI: Always lift 2.5GT/s restriction in PCIe failed link retraining > > 2389d8dc38fee PCI/bwctrl: Replace lbms_count with PCI_LINK_LBMS_SEEN flag > > 15b8968dcb90f PCI/bwctrl: Fix NULL pointer deref on unbind and bind > > e3f30d563a388 PCI: Make pci_destroy_dev() concurrent safe > > 667f053b05f00 PCI/bwctrl: Fix NULL pointer dereference on bus number > > exhaustion > > e93d9fcfd7dc6 PCI: Refactor pcie_update_link_speed() > > 9989e0ca7462c PCI: Fix link speed calculation on retrain failure > > b85af48de3ece PCI: Adjust the position of reading the Link Control 2 register > > 026e4bffb0af9 PCI/bwctrl: Fix pcie_bwctrl_select_speed() return type > > d278b098282d1 thermal: Add PCIe cooling driver > > de9a6c8d5dbfe PCI/bwctrl: Add pcie_set_target_speed() to set PCIe Link Speed > > 665745f274870 PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller > > 3491f50966686 PCI: Abstract LBMS seen check into pcie_lbms_seen() Thank you for identifying the dependencies. Clearly the majority of the changes is not functionally related to this fix at all and any need for them is purely mechanical. > As the change has a Fixes tag, stable maintainers will eventually get to > it (after the change has progressed into Linus' tree). > > As per usual, if the patch won't apply to some old kernel version cleanly, > stable maintainers will decide themselves whether they end up taking some > extra changes or ask for a backport from the submitter of the original > change. Dependencies can be listed if need be along with a backport request. However in this case there have been lots of syntactic updates to this function, which do not necessarily qualify for backporting, as that is supposed to include critical fixes only. Certainly changes to prevent systems from breaking do qualify, but helper macros and functions, or code restructuring do not. Since I'm going to wait for further feedback and respin anyway, I will check if there are critical dependencies required that are missing on the stable branches and list any along with the backport request. Then any remaining syntactic sugar can, as you say, be handled on a case by case basis in the backport process. Maciej
On Wed, 3 Dec 2025, Maciej W. Rozycki wrote:
> Since I'm going to wait for further feedback and respin anyway, I will
> check if there are critical dependencies required that are missing on the
> stable branches and list any along with the backport request. Then any
> remaining syntactic sugar can, as you say, be handled on a case by case
> basis in the backport process.
I've gone through the relevant active stable branches and there's no fix
required to be additionally backported to either 6.6 or 6.12. There are
only mechanical updates needed, which I've included in the change below,
and which works for me with 6.12. Also nothing extra is needed for 6.17
as the code is the same as in the trunk.
Maciej
---
drivers/pci/quirks.c | 53 ++++++++++++++++++++-------------------------------
1 file changed, 21 insertions(+), 32 deletions(-)
Index: linux-macro/drivers/pci/quirks.c
===================================================================
--- linux-macro.orig/drivers/pci/quirks.c
+++ linux-macro/drivers/pci/quirks.c
@@ -68,11 +68,10 @@
* Restrict the speed to 2.5GT/s then with the Target Link Speed field,
* request a retrain and check the result.
*
- * If this turns out successful and we know by the Vendor:Device ID it is
- * safe to do so, then lift the restriction, letting the devices negotiate
- * a higher speed. Also check for a similar 2.5GT/s speed restriction the
- * firmware may have already arranged and lift it with ports that already
- * report their data link being up.
+ * If this turns out successful, or where a 2.5GT/s speed restriction has
+ * been previously arranged by the firmware and the port reports its link
+ * already being up, lift the restriction, in a hope it is safe to do so,
+ * letting the devices negotiate a higher speed.
*
* Otherwise revert the speed to the original setting and request a retrain
* again to remove any residual state, ignoring the result as it's supposed
@@ -83,23 +82,19 @@
*/
int pcie_failed_link_retrain(struct pci_dev *dev)
{
- static const struct pci_device_id ids[] = {
- { PCI_VDEVICE(ASMEDIA, 0x2824) }, /* ASMedia ASM2824 */
- {}
- };
- u16 lnksta, lnkctl2;
+ u16 lnksta, lnkctl2, oldlnkctl2;
int ret = -ENOTTY;
+ u32 lnkcap;
if (!pci_is_pcie(dev) || !pcie_downstream_port(dev) ||
!pcie_cap_has_lnkctl2(dev) || !dev->link_active_reporting)
return ret;
- pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &lnkctl2);
pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
+ pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &oldlnkctl2);
+ lnkctl2 = oldlnkctl2;
if ((lnksta & (PCI_EXP_LNKSTA_LBMS | PCI_EXP_LNKSTA_DLLLA)) ==
PCI_EXP_LNKSTA_LBMS) {
- u16 oldlnkctl2 = lnkctl2;
-
pci_info(dev, "broken device, retraining non-functional downstream link at 2.5GT/s\n");
lnkctl2 &= ~PCI_EXP_LNKCTL2_TLS;
@@ -107,36 +102,30 @@ int pcie_failed_link_retrain(struct pci_
pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, lnkctl2);
ret = pcie_retrain_link(dev, false);
- if (ret) {
- pci_info(dev, "retraining failed\n");
- pcie_capability_write_word(dev, PCI_EXP_LNKCTL2,
- oldlnkctl2);
- pcie_retrain_link(dev, true);
- return ret;
- }
-
- pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
+ if (ret)
+ goto err;
}
- if ((lnksta & PCI_EXP_LNKSTA_DLLLA) &&
- (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
- pci_match_id(ids, dev)) {
- u32 lnkcap;
-
+ pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
+ if ((lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
+ (lnkcap & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB) {
pci_info(dev, "removing 2.5GT/s downstream link speed restriction\n");
- pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
+
lnkctl2 &= ~PCI_EXP_LNKCTL2_TLS;
lnkctl2 |= lnkcap & PCI_EXP_LNKCAP_SLS;
pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, lnkctl2);
ret = pcie_retrain_link(dev, false);
- if (ret) {
- pci_info(dev, "retraining failed\n");
- return ret;
- }
+ if (ret)
+ goto err;
}
return ret;
+err:
+ pci_info(dev, "retraining failed\n");
+ pcie_capability_write_word(dev, PCI_EXP_LNKCTL2, oldlnkctl2);
+ pcie_retrain_link(dev, true);
+ return ret;
}
static ktime_t fixup_debug_start(struct pci_dev *dev,
On Mon, 1 Dec 2025, Maciej W. Rozycki wrote:
> Discard Vendor:Device ID matching in the PCIe failed link retraining
> quirk and ignore the link status for the removal of the 2.5GT/s speed
> clamp, whether applied by the quirk itself or the firmware earlier on.
> Revert to the original target link speed if this final link retraining
> has failed.
>
> This is so that link training noise in hot-plug scenarios does not make
> a link remain clamped to the 2.5GT/s speed where an event race has led
> the quirk to apply the speed clamp for one device, only to leave it in
> place for a subsequent device to be plugged in.
>
> Fixes: a89c82249c37 ("PCI: Work around PCIe link training failures")
> Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
> Cc: <stable@vger.kernel.org> # v6.5+
> ---
> drivers/pci/quirks.c | 50 ++++++++++++++++++--------------------------------
> 1 file changed, 18 insertions(+), 32 deletions(-)
>
> linux-pcie-failed-link-retrain-unclamp-always.diff
> Index: linux-macro/drivers/pci/quirks.c
> ===================================================================
> --- linux-macro.orig/drivers/pci/quirks.c
> +++ linux-macro/drivers/pci/quirks.c
> @@ -79,11 +79,10 @@ static bool pcie_lbms_seen(struct pci_de
> * Restrict the speed to 2.5GT/s then with the Target Link Speed field,
> * request a retrain and check the result.
> *
> - * If this turns out successful and we know by the Vendor:Device ID it is
> - * safe to do so, then lift the restriction, letting the devices negotiate
> - * a higher speed. Also check for a similar 2.5GT/s speed restriction the
> - * firmware may have already arranged and lift it with ports that already
> - * report their data link being up.
> + * If this turns out successful, or where a 2.5GT/s speed restriction has
> + * been previously arranged by the firmware and the port reports its link
> + * already being up, lift the restriction, in a hope it is safe to do so,
> + * letting the devices negotiate a higher speed.
> *
> * Otherwise revert the speed to the original setting and request a retrain
> * again to remove any residual state, ignoring the result as it's supposed
> @@ -94,52 +93,39 @@ static bool pcie_lbms_seen(struct pci_de
> */
> int pcie_failed_link_retrain(struct pci_dev *dev)
> {
> - static const struct pci_device_id ids[] = {
> - { PCI_VDEVICE(ASMEDIA, 0x2824) }, /* ASMedia ASM2824 */
> - {}
> - };
> - u16 lnksta, lnkctl2;
> + u16 lnksta, lnkctl2, oldlnkctl2;
> int ret = -ENOTTY;
> + u32 lnkcap;
>
> if (!pci_is_pcie(dev) || !pcie_downstream_port(dev) ||
> !pcie_cap_has_lnkctl2(dev) || !dev->link_active_reporting)
> return ret;
>
> pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
> + pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &oldlnkctl2);
> if (!(lnksta & PCI_EXP_LNKSTA_DLLLA) && pcie_lbms_seen(dev, lnksta)) {
> - u16 oldlnkctl2;
> -
> pci_info(dev, "broken device, retraining non-functional downstream link at 2.5GT/s\n");
>
> - pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &oldlnkctl2);
> ret = pcie_set_target_speed(dev, PCIE_SPEED_2_5GT, false);
> - if (ret) {
> - pci_info(dev, "retraining failed\n");
> - pcie_set_target_speed(dev, PCIE_LNKCTL2_TLS2SPEED(oldlnkctl2),
> - true);
> - return ret;
> - }
> -
> - pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
> + if (ret)
> + goto err;
> }
>
> pcie_capability_read_word(dev, PCI_EXP_LNKCTL2, &lnkctl2);
> -
> - if ((lnksta & PCI_EXP_LNKSTA_DLLLA) &&
> - (lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
> - pci_match_id(ids, dev)) {
> - u32 lnkcap;
> -
> + pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
> + if ((lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
> + (lnkcap & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB) {
I'm trying to recall, if there was some particular reason why
->supported_speeds couldn't be used in this function. It would avoid the
need to read LinkCap at all.
> pci_info(dev, "removing 2.5GT/s downstream link speed restriction\n");
> - pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
> ret = pcie_set_target_speed(dev, PCIE_LNKCAP_SLS2SPEED(lnkcap), false);
> - if (ret) {
> - pci_info(dev, "retraining failed\n");
> - return ret;
> - }
> + if (ret)
> + goto err;
> }
>
> return ret;
return 0;
> +err:
> + pci_info(dev, "retraining failed\n");
> + pcie_set_target_speed(dev, PCIE_LNKCTL2_TLS2SPEED(oldlnkctl2), true);
> + return ret;
> }
>
> static ktime_t fixup_debug_start(struct pci_dev *dev,
>
--
i.
On Mon, 1 Dec 2025, Ilpo Järvinen wrote:
> > + pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
> > + if ((lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
> > + (lnkcap & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB) {
>
> I'm trying to recall, if there was some particular reason why
> ->supported_speeds couldn't be used in this function. It would avoid the
> need to read LinkCap at all.
Thanks for the hint. There's probably none and it's just me missing some
of the zillion bits and pieces. I'll wait a couple of days for any other
people to chime in and respin with this update included if everyone is
otherwise happy to proceed with this update.
> > + if (ret)
> > + goto err;
> > }
> >
> > return ret;
>
> return 0;
It can still return -ENOTTY if neither of the two latter conditionals
matched, meaning the quirk was not applicable after all. ISTR you had
issues with the structure of this code before; I am not sure if it can
be made any better in a reasonable way. It is not a failure per se, so
the newly-added common error path does not apply. This is the case for:
"Return an error if retraining was not needed[...]" from the introductory
comment.
Shall I add a comment above the return statement referring to this?
Maciej
On Mon, 1 Dec 2025, Maciej W. Rozycki wrote:
> > > + pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
> > > + if ((lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
> > > + (lnkcap & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB) {
> >
> > I'm trying to recall, if there was some particular reason why
> > ->supported_speeds couldn't be used in this function. It would avoid the
> > need to read LinkCap at all.
>
> Thanks for the hint. There's probably none and it's just me missing some
> of the zillion bits and pieces. I'll wait a couple of days for any other
> people to chime in and respin with this update included if everyone is
> otherwise happy to proceed with this update.
I take it no further feedback will be gathered, so I've sent v2 now, but
I've figured out backporting v1 as it is will result in less intrusion to
the trunk commit, so I have only made a change to use `->supported_speeds'
a follow-up patch in a series. Please let me know if this works for you.
Maciej
On Mon, 1 Dec 2025, Maciej W. Rozycki wrote:
> On Mon, 1 Dec 2025, Ilpo Järvinen wrote:
>
> > > + pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
> > > + if ((lnkctl2 & PCI_EXP_LNKCTL2_TLS) == PCI_EXP_LNKCTL2_TLS_2_5GT &&
> > > + (lnkcap & PCI_EXP_LNKCAP_SLS) != PCI_EXP_LNKCAP_SLS_2_5GB) {
> >
> > I'm trying to recall, if there was some particular reason why
> > ->supported_speeds couldn't be used in this function. It would avoid the
> > need to read LinkCap at all.
>
> Thanks for the hint. There's probably none and it's just me missing some
> of the zillion bits and pieces. I'll wait a couple of days for any other
> people to chime in and respin with this update included if everyone is
> otherwise happy to proceed with this update.
>
> > > + if (ret)
> > > + goto err;
> > > }
> > >
> > > return ret;
> >
> > return 0;
>
> It can still return -ENOTTY if neither of the two latter conditionals
> matched, meaning the quirk was not applicable after all. ISTR you had
> issues with the structure of this code before; I am not sure if it can
> be made any better in a reasonable way. It is not a failure per se, so
> the newly-added common error path does not apply. This is the case for:
> "Return an error if retraining was not needed[...]" from the introductory
> comment.
>
> Shall I add a comment above the return statement referring to this?
I think it's fine as is, I just didn't review with enough context to
notice what it was initialized to (the usual thing when adding a
rollback path is to forget to change the normal path to return 0, thus
"auto commenting" it without checking enough, I'm sorry about that).
--
i.
© 2016 - 2026 Red Hat, Inc.