When comparing lspci output between Windows and Linux for hotplugged
Thunderbolt 5 eGPU devices, Windows enables ASPM L1 but Linux doesn't:
Windows: LnkCtl: ASPM L1 Enabled
Linux: LnkCtl: ASPM Disabled
This difference in ASPM configuration can cause behavioral differences
between the two operating systems for the same hardware.
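For readers unfamiliar with the lspci lines above: the "ASPM ..." text reflects the ASPM Control field, bits 1:0 of the PCIe Link Control register. A minimal sketch of that decoding follows; the helper name lnkctl_aspm_str() is hypothetical and is not taken from lspci or the kernel.

```c
#include <stdint.h>

/* ASPM Control occupies bits 1:0 of the PCIe Link Control register. */
#define LNKCTL_ASPM_MASK 0x3u

/*
 * Hypothetical helper (not an lspci or kernel function): map the ASPM
 * Control field to the kind of text shown in the lspci output above.
 */
static const char *lnkctl_aspm_str(uint16_t lnkctl)
{
	static const char *const names[] = {
		"ASPM Disabled",       /* 00b */
		"ASPM L0s Enabled",    /* 01b */
		"ASPM L1 Enabled",     /* 10b */
		"ASPM L0s L1 Enabled", /* 11b */
	};

	return names[lnkctl & LNKCTL_ASPM_MASK];
}
```

Under this decoding, the Windows case corresponds to a field value of 10b and the Linux case to 00b.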
The root cause is that Linux's default ASPM policy (POLICY_DEFAULT) relies
on firmware/BIOS configuration. For hotplugged devices like Thunderbolt/USB4
eGPUs, the BIOS may not have configured ASPM since the device wasn't present
at boot time. As a result, link->aspm_enabled is 0, link->aspm_default is
set to 0, and Linux never enables ASPM for these devices.
Devicetree platforms already have special handling to enable L0s/L1 by
default regardless of firmware configuration. Extend this same logic to
removable devices when firmware hasn't configured any ASPM states.
This makes Linux behavior more consistent with Windows for hotplugged
Thunderbolt/USB4 devices.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=221319
Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
drivers/pci/pcie/aspm.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
index 925373b98dff0..77497d90be0b7 100644
--- a/drivers/pci/pcie/aspm.c
+++ b/drivers/pci/pcie/aspm.c
@@ -804,8 +804,15 @@ static void pcie_aspm_override_default_link_state(struct pcie_link_state *link)
 	struct pci_dev *pdev = link->downstream;
 	u32 override;
 
-	/* For devicetree platforms, enable L0s and L1 by default */
-	if (of_have_populated_dt()) {
+	/*
+	 * For devicetree platforms, enable L0s and L1 by default.
+	 *
+	 * For removable devices (e.g., Thunderbolt/USB4), enable L0s and L1
+	 * by default if BIOS didn't configure any ASPM states. This handles
+	 * hotplugged devices where firmware may not have configured ASPM.
+	 */
+	if (of_have_populated_dt() ||
+	    (dev_is_removable(&pdev->dev) && !link->aspm_enabled)) {
 		if (link->aspm_support & PCIE_LINK_STATE_L0S)
 			link->aspm_default |= PCIE_LINK_STATE_L0S;
 		if (link->aspm_support & PCIE_LINK_STATE_L1)
--
2.43.0
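The condition this patch adds can be illustrated with a small userspace model. This is a sketch under stated assumptions, not kernel code: struct link_model, model_default_state(), the have_dt and removable flags, and the STATE_* bit values are all hypothetical stand-ins for the kernel's pcie_link_state, of_have_populated_dt(), dev_is_removable() and PCIE_LINK_STATE_* definitions. It also assumes, as the commit message describes, that the default starts out equal to what firmware enabled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only; not the kernel's definitions. */
#define STATE_L0S 0x1u
#define STATE_L1  0x2u

/* Hypothetical, simplified stand-in for struct pcie_link_state. */
struct link_model {
	uint32_t aspm_support; /* states the link is capable of */
	uint32_t aspm_enabled; /* states firmware left enabled */
	uint32_t aspm_default; /* states the default policy will use */
	bool removable;        /* stands in for dev_is_removable() */
};

static void model_default_state(struct link_model *link, bool have_dt)
{
	/* Assumption: the default starts from the firmware configuration. */
	link->aspm_default = link->aspm_enabled;

	/*
	 * Override on devicetree platforms, or (the proposed addition) for
	 * removable devices where firmware enabled nothing.
	 */
	if (have_dt || (link->removable && !link->aspm_enabled)) {
		if (link->aspm_support & STATE_L0S)
			link->aspm_default |= STATE_L0S;
		if (link->aspm_support & STATE_L1)
			link->aspm_default |= STATE_L1;
	}
}
```

In this model, a removable link that firmware left unconfigured ends up with every supported state in its default, while a removable link on which firmware already enabled something keeps exactly the firmware choice, because the !aspm_enabled test fails.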
+Mika Westerberg
On Tue, May 5, 2026 at 12:53 AM Mario Limonciello
<mario.limonciello@amd.com> wrote:
>
> When comparing lspci output between Windows and Linux for hotplugged
> Thunderbolt 5 eGPU devices, Windows enables ASPM L1 but Linux doesn't:
>
> Windows: LnkCtl: ASPM L1 Enabled
> Linux: LnkCtl: ASPM Disabled
>
> This difference in ASPM configuration can cause behavioral differences
> between the two operating systems for the same hardware.
>
> The root cause is that Linux's default ASPM policy (POLICY_DEFAULT) relies
> on firmware/BIOS configuration. For hotplugged devices like Thunderbolt/USB4
> eGPUs, the BIOS may not have configured ASPM since the device wasn't present
> at boot time. As a result, link->aspm_enabled is 0, link->aspm_default is
> set to 0, and Linux never enables ASPM for these devices.
>
> Devicetree platforms already have special handling to enable L0s/L1 by
> default regardless of firmware configuration. Extend this same logic to
> removable devices when firmware hasn't configured any ASPM states.
>
> This makes Linux behavior more consistent with Windows for hotplugged
> Thunderbolt/USB4 devices.
>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=221319
> Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> drivers/pci/pcie/aspm.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> index 925373b98dff0..77497d90be0b7 100644
> --- a/drivers/pci/pcie/aspm.c
> +++ b/drivers/pci/pcie/aspm.c
> @@ -804,8 +804,15 @@ static void pcie_aspm_override_default_link_state(struct pcie_link_state *link)
> struct pci_dev *pdev = link->downstream;
> u32 override;
>
> - /* For devicetree platforms, enable L0s and L1 by default */
> - if (of_have_populated_dt()) {
> + /*
> + * For devicetree platforms, enable L0s and L1 by default.
> + *
> + * For removable devices (e.g., Thunderbolt/USB4), enable L0s and L1
> + * by default if BIOS didn't configure any ASPM states. This handles
> + * hotplugged devices where firmware may not have configured ASPM.
> + */
> + if (of_have_populated_dt() ||
> + (dev_is_removable(&pdev->dev) && !link->aspm_enabled)) {
> if (link->aspm_support & PCIE_LINK_STATE_L0S)
> link->aspm_default |= PCIE_LINK_STATE_L0S;
> if (link->aspm_support & PCIE_LINK_STATE_L1)
> --
> 2.43.0
>
Hi,
On Tue, May 05, 2026 at 08:09:22PM +0200, Rafael J. Wysocki wrote:
> +Mika Westerberg
>
> On Tue, May 5, 2026 at 12:53 AM Mario Limonciello
> <mario.limonciello@amd.com> wrote:
> >
> > When comparing lspci output between Windows and Linux for hotplugged
> > Thunderbolt 5 eGPU devices, Windows enables ASPM L1 but Linux doesn't:
> >
> > Windows: LnkCtl: ASPM L1 Enabled
> > Linux: LnkCtl: ASPM Disabled
> >
> > This difference in ASPM configuration can cause behavioral differences
> > between the two operating systems for the same hardware.
> >
> > The root cause is that Linux's default ASPM policy (POLICY_DEFAULT) relies
> > on firmware/BIOS configuration. For hotplugged devices like Thunderbolt/USB4
> > eGPUs, the BIOS may not have configured ASPM since the device wasn't present
> > at boot time. As a result, link->aspm_enabled is 0, link->aspm_default is
> > set to 0, and Linux never enables ASPM for these devices.
> >
> > Devicetree platforms already have special handling to enable L0s/L1 by
> > default regardless of firmware configuration. Extend this same logic to
> > removable devices when firmware hasn't configured any ASPM states.
> >
> > This makes Linux behavior more consistent with Windows for hotplugged
> > Thunderbolt/USB4 devices.
> >
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=221319
> > Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
> > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> > ---
> > drivers/pci/pcie/aspm.c | 11 +++++++++--
> > 1 file changed, 9 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > index 925373b98dff0..77497d90be0b7 100644
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -804,8 +804,15 @@ static void pcie_aspm_override_default_link_state(struct pcie_link_state *link)
> > struct pci_dev *pdev = link->downstream;
> > u32 override;
> >
> > - /* For devicetree platforms, enable L0s and L1 by default */
> > - if (of_have_populated_dt()) {
> > + /*
> > + * For devicetree platforms, enable L0s and L1 by default.
> > + *
> > + * For removable devices (e.g., Thunderbolt/USB4), enable L0s and L1
> > + * by default if BIOS didn't configure any ASPM states. This handles
> > + * hotplugged devices where firmware may not have configured ASPM.
> > + */
Only L1 is supported over TB/USB4 tunnel (no L0s, no L1 substates). The
PCIe endpoint and the downstream port it connects to of course can support
the full range as that's a real PCIe link.
> > + if (of_have_populated_dt() ||
> > + (dev_is_removable(&pdev->dev) && !link->aspm_enabled)) {
> > if (link->aspm_support & PCIE_LINK_STATE_L0S)
> > link->aspm_default |= PCIE_LINK_STATE_L0S;
> > if (link->aspm_support & PCIE_LINK_STATE_L1)
> > --
> > 2.43.0
> >
On 5/5/26 23:53, Mika Westerberg wrote:
> Hi,
>
> On Tue, May 05, 2026 at 08:09:22PM +0200, Rafael J. Wysocki wrote:
>> +Mika Westerberg
>>
>> On Tue, May 5, 2026 at 12:53 AM Mario Limonciello
>> <mario.limonciello@amd.com> wrote:
>>>
>>> When comparing lspci output between Windows and Linux for hotplugged
>>> Thunderbolt 5 eGPU devices, Windows enables ASPM L1 but Linux doesn't:
>>>
>>> Windows: LnkCtl: ASPM L1 Enabled
>>> Linux: LnkCtl: ASPM Disabled
>>>
>>> This difference in ASPM configuration can cause behavioral differences
>>> between the two operating systems for the same hardware.
>>>
>>> The root cause is that Linux's default ASPM policy (POLICY_DEFAULT) relies
>>> on firmware/BIOS configuration. For hotplugged devices like Thunderbolt/USB4
>>> eGPUs, the BIOS may not have configured ASPM since the device wasn't present
>>> at boot time. As a result, link->aspm_enabled is 0, link->aspm_default is
>>> set to 0, and Linux never enables ASPM for these devices.
>>>
>>> Devicetree platforms already have special handling to enable L0s/L1 by
>>> default regardless of firmware configuration. Extend this same logic to
>>> removable devices when firmware hasn't configured any ASPM states.
>>>
>>> This makes Linux behavior more consistent with Windows for hotplugged
>>> Thunderbolt/USB4 devices.
>>>
>>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=221319
>>> Assisted-by: Claude Opus 4.6 <noreply@anthropic.com>
>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>>> ---
>>> drivers/pci/pcie/aspm.c | 11 +++++++++--
>>> 1 file changed, 9 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
>>> index 925373b98dff0..77497d90be0b7 100644
>>> --- a/drivers/pci/pcie/aspm.c
>>> +++ b/drivers/pci/pcie/aspm.c
>>> @@ -804,8 +804,15 @@ static void pcie_aspm_override_default_link_state(struct pcie_link_state *link)
>>> struct pci_dev *pdev = link->downstream;
>>> u32 override;
>>>
>>> - /* For devicetree platforms, enable L0s and L1 by default */
>>> - if (of_have_populated_dt()) {
>>> + /*
>>> + * For devicetree platforms, enable L0s and L1 by default.
>>> + *
>>> + * For removable devices (e.g., Thunderbolt/USB4), enable L0s and L1
>>> + * by default if BIOS didn't configure any ASPM states. This handles
>>> + * hotplugged devices where firmware may not have configured ASPM.
>>> + */
>
> Only L1 is supported over TB/USB4 tunnel (no L0s, no L1 substates). The
> PCIe endpoint and the downstream port it connects to of course can support
> the full range as that's a real PCIe link.
>
OK - the comment should be updated but I do expect that below code
(link->aspm_support) should remain OK.
>>> + if (of_have_populated_dt() ||
>>> + (dev_is_removable(&pdev->dev) && !link->aspm_enabled)) {
>>> if (link->aspm_support & PCIE_LINK_STATE_L0S)
>>> link->aspm_default |= PCIE_LINK_STATE_L0S;
>>> if (link->aspm_support & PCIE_LINK_STATE_L1)
>>> --
>>> 2.43.0
>>>
On Wed, May 06, 2026 at 10:10:47AM -0500, Mario Limonciello wrote:
> On 5/5/26 23:53, Mika Westerberg wrote:
> > On Tue, May 05, 2026 at 08:09:22PM +0200, Rafael J. Wysocki wrote:
> > > On Tue, May 5, 2026 at 12:53 AM Mario Limonciello
> > > <mario.limonciello@amd.com> wrote:
> ...
> > > > + * For devicetree platforms, enable L0s and L1 by default.
> > > > + *
> > > > + * For removable devices (e.g., Thunderbolt/USB4), enable L0s and L1
> > > > + * by default if BIOS didn't configure any ASPM states. This handles
> > > > + * hotplugged devices where firmware may not have configured ASPM.
> > > > + */
> >
> > Only L1 is supported over TB/USB4 tunnel (no L0s, no L1 substates). The
> > PCIe endpoint and the downstream port it connects to of course can support
> > the full range as that's a real PCIe link.
>
> OK - the comment should be updated but I do expect that below code
> (link->aspm_support) should remain OK.
TB/USB4 are examples of removable devices but they're not the only
ones, so I think it's OK for the comment to mention L0s. In fact, it
*should* mention L0s since the code below includes L0s, and mentioning
only L1 would just be confusing.
> > > > + if (of_have_populated_dt() ||
> > > > + (dev_is_removable(&pdev->dev) && !link->aspm_enabled)) {
> > > > if (link->aspm_support & PCIE_LINK_STATE_L0S)
> > > > link->aspm_default |= PCIE_LINK_STATE_L0S;
> > > > if (link->aspm_support & PCIE_LINK_STATE_L1)
> > > > --
> > > > 2.43.0
> > > >
>
On 5/6/26 10:27, Bjorn Helgaas wrote:
> On Wed, May 06, 2026 at 10:10:47AM -0500, Mario Limonciello wrote:
>> On 5/5/26 23:53, Mika Westerberg wrote:
>>> On Tue, May 05, 2026 at 08:09:22PM +0200, Rafael J. Wysocki wrote:
>>>> On Tue, May 5, 2026 at 12:53 AM Mario Limonciello
>>>> <mario.limonciello@amd.com> wrote:
>> ...
>
>>>>> + * For devicetree platforms, enable L0s and L1 by default.
>>>>> + *
>>>>> + * For removable devices (e.g., Thunderbolt/USB4), enable L0s and L1
>>>>> + * by default if BIOS didn't configure any ASPM states. This handles
>>>>> + * hotplugged devices where firmware may not have configured ASPM.
>>>>> + */
>>>
>>> Only L1 is supported over TB/USB4 tunnel (no L0s, no L1 substates). The
>>> PCIe endpoint and the downstream port it connects to of course can support
>>> the full range as that's a real PCIe link.
>>
>> OK - the comment should be updated but I do expect that below code
>> (link->aspm_support) should remain OK.
>
> TB/USB4 are examples of removable devices but they're not the only
> ones, so I think it's OK for the comment to mention L0s. In fact, it
> *should* mention L0s since the code below includes L0s, and mentioning
> only L1 would just be confusing.
>
It sounds like you're suggesting no changes to this proposal then, right?
On Thu, May 14, 2026 at 12:14:29PM -0500, Mario Limonciello wrote:
> On 5/6/26 10:27, Bjorn Helgaas wrote:
> > On Wed, May 06, 2026 at 10:10:47AM -0500, Mario Limonciello wrote:
> > > On 5/5/26 23:53, Mika Westerberg wrote:
> > > > On Tue, May 05, 2026 at 08:09:22PM +0200, Rafael J. Wysocki wrote:
> > > > > On Tue, May 5, 2026 at 12:53 AM Mario Limonciello
> > > > > <mario.limonciello@amd.com> wrote:
> > > ...
> >
> > > > > > + * For devicetree platforms, enable L0s and L1 by default.
> > > > > > + *
> > > > > > + * For removable devices (e.g., Thunderbolt/USB4), enable L0s and L1
> > > > > > + * by default if BIOS didn't configure any ASPM states. This handles
> > > > > > + * hotplugged devices where firmware may not have configured ASPM.
> > > > > > + */
> > > >
> > > > Only L1 is supported over TB/USB4 tunnel (no L0s, no L1 substates). The
> > > > PCIe endpoint and the downstream port it connects to of course can support
> > > > the full range as that's a real PCIe link.
> > >
> > > OK - the comment should be updated but I do expect that below code
> > > (link->aspm_support) should remain OK.
> >
> > TB/USB4 are examples of removable devices but they're not the only
> > ones, so I think it's OK for the comment to mention L0s. In fact, it
> > *should* mention L0s since the code below includes L0s, and mentioning
> > only L1 would just be confusing.
> >
> It sounds like you're suggesting no changes to this proposal then, right?
I'm fine with the comment mentioning "L0s and L1", which matches the
code:
  if (link->aspm_support & PCIE_LINK_STATE_L0S)
    link->aspm_default |= PCIE_LINK_STATE_L0S;
  if (link->aspm_support & PCIE_LINK_STATE_L1)
    link->aspm_default |= PCIE_LINK_STATE_L1;
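The snippet above only adds states that the link reports as supported, which is why a comment mentioning "L0s and L1" and an L1-only tunneled link can coexist. A minimal sketch of that gating follows. It assumes, as discussed earlier in the thread but not verified here, that a TB/USB4 tunneled link reports only L1 in aspm_support; override_states() and the STATE_* bit values are hypothetical and are not the kernel's definitions.

```c
#include <stdint.h>

/* Illustrative bit values only; not the kernel's definitions. */
#define STATE_L0S 0x1u
#define STATE_L1  0x2u

/*
 * Hypothetical helper: given the states a link supports, return the
 * states the override would add to the default.
 */
static uint32_t override_states(uint32_t aspm_support)
{
	uint32_t aspm_default = 0;

	if (aspm_support & STATE_L0S)
		aspm_default |= STATE_L0S;
	if (aspm_support & STATE_L1)
		aspm_default |= STATE_L1;

	return aspm_default;
}
```

With an L1-only support mask the helper returns only L1, and with both states supported it returns both, so the same code path would serve tunneled and native links.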
Sashiko had several comments. I'm hoping for those to be addressed or
responded to. Unfortunately the complexity of the ASPM code means the
sashiko comments are also very complicated and a lot of work to go
through.
On Mon, May 04, 2026 at 05:52:46PM -0500, Mario Limonciello wrote:
> When comparing lspci output between Windows and Linux for hotplugged
> Thunderbolt 5 eGPU devices, Windows enables ASPM L1 but Linux doesn't:
>
> Windows: LnkCtl: ASPM L1 Enabled
> Linux: LnkCtl: ASPM Disabled
>
> This difference in ASPM configuration can cause behavioral differences
> between the two operating systems for the same hardware.

A tangent, not a comment on the patch itself, but what sort of
behavioral differences are these? If ASPM is working correctly, the
only differences *should* be in power consumption and performance.
On 5/5/26 11:05, Bjorn Helgaas wrote:
> On Mon, May 04, 2026 at 05:52:46PM -0500, Mario Limonciello wrote:
>> When comparing lspci output between Windows and Linux for hotplugged
>> Thunderbolt 5 eGPU devices, Windows enables ASPM L1 but Linux doesn't:
>>
>> Windows: LnkCtl: ASPM L1 Enabled
>> Linux: LnkCtl: ASPM Disabled
>>
>> This difference in ASPM configuration can cause behavioral differences
>> between the two operating systems for the same hardware.
>
> A tangent, not a comment on the patch itself, but what sort of
> behavioral differences are these? If ASPM is working correctly, the
> only differences *should* be in power consumption and performance.

This originally stemmed from a significant performance difference that was
observed between Windows and Linux with eGPUs. The link in the patch points
at that bug if you want to look more closely at it.

I was hopeful that aligning ASPM would align the behavior, but alas this
didn't.

It was still a difference that I figured we should discuss whether it should
be changed to be consistent.
[+cc Mika]

On Tue, May 05, 2026 at 11:08:14AM -0500, Mario Limonciello wrote:
> On 5/5/26 11:05, Bjorn Helgaas wrote:
> > On Mon, May 04, 2026 at 05:52:46PM -0500, Mario Limonciello wrote:
> > > When comparing lspci output between Windows and Linux for hotplugged
> > > Thunderbolt 5 eGPU devices, Windows enables ASPM L1 but Linux doesn't:
> > >
> > > Windows: LnkCtl: ASPM L1 Enabled
> > > Linux: LnkCtl: ASPM Disabled
> > >
> > > This difference in ASPM configuration can cause behavioral differences
> > > between the two operating systems for the same hardware.
> >
> > A tangent, not a comment on the patch itself, but what sort of
> > behavioral differences are these? If ASPM is working correctly, the
> > only differences *should* be in power consumption and performance.
>
> This originally stemmed from a significant performance difference that was
> observed between Windows and Linux with eGPUs. The link in the patch points
> at that bug if you want to look more closely at it.

Hmm. The bug (https://bugzilla.kernel.org/show_bug.cgi?id=221319)
reports "instant reboot", which is definitely a behavioral difference.
But AFAICS this patch would just fix something noticed along the way
but not the reboot itself.

To avoid confusion, I would use "performance difference" or "power
difference" when describing this patch.

> I was hopeful that aligning ASPM would align the behavior, but alas this
> didn't.
>
> It was still a difference that I figured we should discuss whether it should
> be changed to be consistent.

Definitely. I hope we can at least enable L1.1. L1.2 is a whole
'nother issue.
On 5/5/26 16:42, Bjorn Helgaas wrote:
> [+cc Mika]
>
> On Tue, May 05, 2026 at 11:08:14AM -0500, Mario Limonciello wrote:
>> On 5/5/26 11:05, Bjorn Helgaas wrote:
>>> On Mon, May 04, 2026 at 05:52:46PM -0500, Mario Limonciello wrote:
>>>> When comparing lspci output between Windows and Linux for hotplugged
>>>> Thunderbolt 5 eGPU devices, Windows enables ASPM L1 but Linux doesn't:
>>>>
>>>> Windows: LnkCtl: ASPM L1 Enabled
>>>> Linux: LnkCtl: ASPM Disabled
>>>>
>>>> This difference in ASPM configuration can cause behavioral differences
>>>> between the two operating systems for the same hardware.
>>>
>>> A tangent, not a comment on the patch itself, but what sort of
>>> behavioral differences are these? If ASPM is working correctly, the
>>> only differences *should* be in power consumption and performance.
>>
>> This originally stemmed from a significant performance difference that was
>> observed between Windows and Linux with eGPUs. The link in the patch points
>> at that bug if you want to look more closely at it.
>
> Hmm. The bug (https://bugzilla.kernel.org/show_bug.cgi?id=221319)
> reports "instant reboot", which is definitely a behavioral difference.
> But AFAICS this patch would just fix something noticed along the way
> but not the reboot itself.
>
> To avoid confusion, I would use "performance difference" or "power
> difference" when describing this patch.

There is a lot of traffic in that bug and similar eGPU bugs; but some
people have narrowed down that using NVIDIA's GSP "causes the instant
reboot" but the performance difference is tangential to the reboot (or
maybe it's part of the cause - I don't actually know).

The reboots /seem/ to be caused by sync floods which I originally
hypothesized to be caused by Linux using AER and Windows not using it
(potentially leading to a flood of errors in Linux), but turning off AER
from kernel command line didn't change that.

>
>> I was hopeful that aligning ASPM would align the behavior, but alas this
>> didn't.
>>
>> It was still a difference that I figured we should discuss whether it should
>> be changed to be consistent.
>
> Definitely. I hope we can at least enable L1.1. L1.2 is a whole
> 'nother issue.

Yup.