[v1] thunderbolt: Defer DP tunnel teardown until display driver is ready

[PATCH] thunderbolt: Defer DP tunnel teardown until display driver is ready

Posted by ChunAn Wu 1 week, 5 days ago

When the Thunderbolt driver loads early (e.g., from initramfs)
and discovers a BIOS-established DisplayPort tunnel, it starts
asynchronous DPRX polling which checks if the GPU driver has
read DPCD from the connected monitor within a 12-second timeout
(TB_DPRX_TIMEOUT).

On systems with Full Disk Encryption (FDE/LUKS), the GPU driver
(i915, xe, amdgpu, etc.) resides on the encrypted root filesystem
and cannot load until the user enters the passphrase. This creates
a driver load ordering issue where the DPRX timeout fires before
the GPU driver has had a chance to initialize, causing the
Thunderbolt driver to permanently tear down the DP tunnel and
remove the DP IN adapter from available resources. Recovery
requires a physical re-plug of the dock.

Fix this by deferring the DP tunnel teardown when no PCI display
driver has bound yet. Register a PCI bus notifier that watches
for display class (PCI_BASE_CLASS_DISPLAY) driver bind events.
When the DPRX timeout fires:

 - If no display driver is bound: tear down the tunnel but keep
   the DP IN adapter in the available resources list, allowing
   a retry.
 - If a display driver is already bound: proceed with the
   existing behavior of permanently removing the DP IN resource.

When a display driver eventually binds, the notifier triggers a
DP tunnel retry via a scheduled work item, re-establishing the
connection.
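
The policy above can be modeled in a few lines of userspace C. This is
only a sketch of the described behavior, not the driver code: the enum,
struct cm_model and both model_* functions are hypothetical names, and
the tunnel retry is reduced to a state change.

```c
#include <stdbool.h>

/* Hypothetical userspace model of the deferral policy; not kernel code. */
enum dp_in_state {
	DP_IN_AVAILABLE,	/* adapter may be used for a new tunnel */
	DP_IN_TUNNELED,		/* tunnel is set up, DPRX not yet confirmed */
	DP_IN_REMOVED,		/* dropped until the next hotplug */
};

struct cm_model {
	bool display_bound;	/* a display driver has bound */
	enum dp_in_state dp_in;
	int retries;		/* number of retries triggered by a bind */
};

/* The DPRX timeout fired for the tunneled DP IN adapter. */
static void model_dprx_timeout(struct cm_model *cm)
{
	if (cm->dp_in != DP_IN_TUNNELED)
		return;

	if (!cm->display_bound)
		cm->dp_in = DP_IN_AVAILABLE;	/* defer: keep the resource */
	else
		cm->dp_in = DP_IN_REMOVED;	/* existing behavior */
}

/* A display driver bound; only the first bind triggers a retry. */
static void model_display_bound(struct cm_model *cm)
{
	if (cm->display_bound)
		return;

	cm->display_bound = true;
	if (cm->dp_in == DP_IN_AVAILABLE) {
		cm->dp_in = DP_IN_TUNNELED;
		cm->retries++;
	}
}
```

Starting from a tunneled adapter with no display driver bound, a timeout
leaves the adapter available, the first bind retries the tunnel, and a
later timeout removes the adapter as before.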

This approach requires no changes to GPU drivers and handles all
GPU vendors (Intel, AMD, NVIDIA) through the generic PCI base
class check (0x03xx covers VGA, XGA, 3D, and other display
controllers). It also handles the FDE case gracefully since the
defer and retry can span an unbounded passphrase wait.
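
For reference, the base class is the top byte of the 24-bit PCI class
code, so the check reduces to a shift and a compare. A minimal userspace
sketch, assuming a hypothetical helper name is_display_class(); only the
value 0x03 is taken from the kernel's PCI_BASE_CLASS_DISPLAY:

```c
#include <stdbool.h>
#include <stdint.h>

/* Same value as the kernel's PCI_BASE_CLASS_DISPLAY. */
#define BASE_CLASS_DISPLAY 0x03

/*
 * class_code is laid out as base class (bits 23:16), subclass (15:8)
 * and programming interface (7:0). Hypothetical helper for illustration.
 */
static bool is_display_class(uint32_t class_code)
{
	return ((class_code >> 16) & 0xff) == BASE_CLASS_DISPLAY;
}
```

With this layout 0x030000 (VGA), 0x030200 (3D) and 0x038000 (other
display controller) all match, while 0x020000 (Ethernet) does not.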

Tested on Dell Pro Max 14 MC14250 with Dell SD25TB5 Thunderbolt
5 Dock and LUKS full disk encryption. Simulated a 58-second
delay between TB and GPU driver loading -- display came up
successfully after display driver bound.

Signed-off-by: ChunAn Wu <an.wu@canonical.com>
---
 drivers/thunderbolt/tb.c | 96 ++++++++++++++++++++++++++++++++++++----
 1 file changed, 88 insertions(+), 8 deletions(-)

diff --git a/drivers/thunderbolt/tb.c b/drivers/thunderbolt/tb.c
index 95d84612e06e..48e0b540fbec 100644
--- a/drivers/thunderbolt/tb.c
+++ b/drivers/thunderbolt/tb.c
@@ -62,6 +62,9 @@ MODULE_PARM_DESC(asym_threshold,
  * @remove_work: Work used to remove any unplugged routers after
  *		 runtime resume
  * @groups: Bandwidth groups used in this domain.
+ * @pci_nb: PCI bus notifier to detect when a display driver binds
+ * @display_bound: Set when a PCI display driver has bound
+ * @display_retry_work: Work to retry DP tunneling after display driver binds
  */
 struct tb_cm {
 	struct list_head tunnel_list;
@@ -69,6 +72,9 @@ struct tb_cm {
 	bool hotplug_active;
 	struct delayed_work remove_work;
 	struct tb_bandwidth_group groups[MAX_GROUPS];
+	struct notifier_block pci_nb;
+	bool display_bound;
+	struct work_struct display_retry_work;
 };
 
 static inline struct tb *tcm_to_tb(struct tb_cm *tcm)
@@ -1914,6 +1920,58 @@ static struct tb_port *tb_find_dp_out(struct tb *tb, struct tb_port *in)
 	return NULL;
 }
 
+static void tb_tunnel_dp(struct tb *tb);
+
+/*
+ * Check if any PCI display class (0x03xx) device has a driver bound.
+ * Used to decide whether to defer DPRX polling at boot.
+ */
+static bool tb_is_display_driver_bound(void)
+{
+	struct pci_dev *pdev = NULL;
+
+	while ((pdev = pci_get_base_class(PCI_BASE_CLASS_DISPLAY, pdev))) {
+		if (pdev->driver) {
+			pci_dev_put(pdev);
+			return true;
+		}
+	}
+	return false;
+}
+
+static void tb_display_retry_work_fn(struct work_struct *work)
+{
+	struct tb_cm *tcm = container_of(work, struct tb_cm, display_retry_work);
+	struct tb *tb = tcm_to_tb(tcm);
+
+	mutex_lock(&tb->lock);
+	tb_dbg(tb, "display driver bound, retrying DP tunneling\n");
+	tb_tunnel_dp(tb);
+	mutex_unlock(&tb->lock);
+}
+
+static int tb_pci_notifier_fn(struct notifier_block *nb, unsigned long action,
+			      void *data)
+{
+	struct tb_cm *tcm = container_of(nb, struct tb_cm, pci_nb);
+	struct device *dev = data;
+	struct pci_dev *pdev;
+
+	if (action != BUS_NOTIFY_BOUND_DRIVER)
+		return NOTIFY_OK;
+
+	pdev = to_pci_dev(dev);
+	if ((pdev->class >> 16) != PCI_BASE_CLASS_DISPLAY)
+		return NOTIFY_OK;
+
+	if (!tcm->display_bound) {
+		tcm->display_bound = true;
+		schedule_work(&tcm->display_retry_work);
+	}
+
+	return NOTIFY_OK;
+}
+
 static void tb_dp_tunnel_active(struct tb_tunnel *tunnel, void *data)
 {
 	struct tb_port *in = tunnel->src_port;
@@ -1955,6 +2013,7 @@ static void tb_dp_tunnel_active(struct tb_tunnel *tunnel, void *data)
 		}
 	} else {
 		struct tb_port *in = tunnel->src_port;
+		struct tb_cm *tcm = tb_priv(tb);
 
 		/*
 		 * This tunnel failed to establish. This means DPRX
@@ -1963,16 +2022,26 @@ static void tb_dp_tunnel_active(struct tb_tunnel *tunnel, void *data)
 		 * loaded or not all DP cables where connected to the
 		 * discrete router.
 		 *
-		 * In both cases we remove the DP IN adapter from the
-		 * available resources as it is not usable. This will
-		 * also tear down the tunnel and try to re-use the
-		 * released DP OUT.
+		 * If no display driver has bound yet (common during boot
+		 * with FDE/LUKS where the GPU driver loads late from
+		 * the encrypted root filesystem), tear down the tunnel
+		 * but keep the DP IN resource available. The PCI bus
+		 * notifier will trigger a retry once a display driver
+		 * binds.
 		 *
-		 * It will be added back only if there is hotplug for
-		 * the DP IN again.
+		 * Otherwise, remove the DP IN adapter from available
+		 * resources as it is not usable. It will be added back
+		 * only if there is hotplug for the DP IN again.
 		 */
-		tb_tunnel_warn(tunnel, "not active, tearing down\n");
-		tb_dp_resource_unavailable(tb, in, "DPRX negotiation failed");
+		if (!tcm->display_bound && !tb_is_display_driver_bound()) {
+			tb_tunnel_warn(tunnel,
+				       "not active, deferring until display driver loads\n");
+			tb_deactivate_and_free_tunnel(tunnel);
+		} else {
+			tb_tunnel_warn(tunnel, "not active, tearing down\n");
+			tb_dp_resource_unavailable(tb, in,
+						   "DPRX negotiation failed");
+		}
 	}
 	mutex_unlock(&tb->lock);
 
@@ -2984,6 +3053,9 @@ static void tb_deinit(struct tb *tb)
 	struct tb_cm *tcm = tb_priv(tb);
 	int i;
 
+	bus_unregister_notifier(&pci_bus_type, &tcm->pci_nb);
+	cancel_work_sync(&tcm->display_retry_work);
+
 	/* Cancel all the release bandwidth workers */
 	for (i = 0; i < ARRAY_SIZE(tcm->groups); i++)
 		cancel_delayed_work_sync(&tcm->groups[i].release_work);
@@ -3410,8 +3482,16 @@ struct tb *tb_probe(struct tb_nhi *nhi)
 	INIT_LIST_HEAD(&tcm->tunnel_list);
 	INIT_LIST_HEAD(&tcm->dp_resources);
 	INIT_DELAYED_WORK(&tcm->remove_work, tb_remove_work);
+	INIT_WORK(&tcm->display_retry_work, tb_display_retry_work_fn);
 	tb_init_bandwidth_groups(tcm);
 
+	/* Check if a display driver is already bound (e.g. hotplug after boot) */
+	tcm->display_bound = tb_is_display_driver_bound();
+
+	/* Watch for display driver binding to defer DPRX until GPU is ready */
+	tcm->pci_nb.notifier_call = tb_pci_notifier_fn;
+	bus_register_notifier(&pci_bus_type, &tcm->pci_nb);
+
 	tb_dbg(tb, "using software connection manager\n");
 
 	/*
-- 
2.34.1

Re: [PATCH] thunderbolt: Defer DP tunnel teardown until display driver is ready

Posted by Mika Westerberg 1 week, 5 days ago

Hi,

On Wed, May 27, 2026 at 02:41:21PM +0800, ChunAn Wu wrote:
> @@ -1914,6 +1920,58 @@ static struct tb_port *tb_find_dp_out(struct tb *tb, struct tb_port *in)
>  	return NULL;
>  }
>  
> +static void tb_tunnel_dp(struct tb *tb);
> +
> +/*
> + * Check if any PCI display class (0x03xx) device has a driver bound.
> + * Used to decide whether to defer DPRX polling at boot.
> + */
> +static bool tb_is_display_driver_bound(void)
> +{
> +	struct pci_dev *pdev = NULL;
> +
> +	while ((pdev = pci_get_base_class(PCI_BASE_CLASS_DISPLAY, pdev))) {

There is no way we are going to call PCI functions from the core of the CM.
We are actually going in the opposite direction, to be able to support
non-PCIe hosts.

Why not make the TB driver part of the encrypted volume as well, if the
graphics driver is there? Or make the graphics drivers part of the
initramfs?

Re: [PATCH] thunderbolt: Defer DP tunnel teardown until display driver is ready

Posted by An Wu 1 week, 4 days ago

Hi Mika,

Thank you for the feedback.

Sorry for the mess, and I understand the concern that the Thunderbolt
CM core should not call PCI-specific functions, especially since the
direction is to support non-PCIe hosts as well.

Putting graphics drivers into the initramfs does not look practical
for us, because we may need to include many possible graphics drivers
and dependencies, which would increase the initramfs size and
complexity. Moving Thunderbolt out of the initramfs may also cause
regressions for users relying on Thunderbolt docks early in boot, such
as keyboards in the recovery/LUKS shell or network devices for
early/rootfs use cases.

The problem I am trying to solve is that graphics driver readiness can
affect Thunderbolt DP tunneling, but the graphics and Thunderbolt
drivers currently run independently without any coordination. As a
result, Thunderbolt may treat a temporary graphics-side readiness
issue as a permanent DP tunnel failure.

So the goal is not to make Thunderbolt depend on PCI, but to find an
acceptable way for these components to coordinate, or for Thunderbolt
to retry/check readiness in a more generic way without adding
PCI-specific logic into the CM core.

Could you please give us guidance on what direction would be
acceptable upstream?

BR
    An

On Wed, May 27, 2026 at 3:14 PM Mika Westerberg
<mika.westerberg@linux.intel.com> wrote:
>
> Hi,
>
> There is no way we are going to call PCI functions from the core of the CM.
> We are actually going in the opposite direction, to be able to support
> non-PCIe hosts.
>
> Why not make the TB driver part of the encrypted volume as well, if the
> graphics driver is there? Or make the graphics drivers part of the
> initramfs?

Re: [PATCH] thunderbolt: Defer DP tunnel teardown until display driver is ready

Posted by Mika Westerberg 1 week, 4 days ago

Hi,

On Thu, May 28, 2026 at 09:03:30AM +0800, An Wu wrote:
> Hi Mika,
> 
> Thank you for the feedback.
> 
> Sorry for the mess, and I understand the concern that the Thunderbolt
> CM core should not call PCI-specific functions, especially since the
> direction is to support non-PCIe hosts as well.
> 
> Putting graphics drivers into the initramfs does not look practical
> for us, because we may need to include many possible graphics drivers
> and dependencies, which would increase the initramfs size and
> complexity. Moving Thunderbolt out of the initramfs may also cause
> regressions for users relying on Thunderbolt docks early in boot, such
> as keyboards in the recovery/LUKS shell or network devices for
> early/rootfs use cases.
> 
> The problem I am trying to solve is that graphics driver readiness can
> affect Thunderbolt DP tunneling, but the graphics and Thunderbolt
> drivers currently run independently without any coordination. As a
> result, Thunderbolt may treat a temporary graphics-side readiness
> issue as a permanent DP tunnel failure.
> 
> So the goal is not to make Thunderbolt depend on PCI, but to find an
> acceptable way for these components to coordinate, or for Thunderbolt
> to retry/check readiness in a more generic way without adding
> PCI-specific logic into the CM core.
> 
> Could you please give us guidance on what direction would be
> acceptable upstream?

The DPRX timeout is there for a reason, although the reason is not really
that common. Basically, if there is nothing connected to the DP IN we can
detect that and use another DP IN to provide the user with a working DP
tunnel.

The timeout itself is currently 10 + 2 = 12s to allow i915 to enter runtime
suspend and still be able to detect (via polling) a connected monitor.
However, it is not really "written in stone". The VESA spec wants it to be
5s, but for our usage that is way too short. I have no problem increasing
it, but then some users may suffer for the reason above (if a DP IN is not
connected). Maybe increasing it is a reasonable compromise?

The other option is to put the DP IN in a "penalty box" for a while, but I
don't think this helps because you need the hotplug event for the DP OUT
part and that's not happening after we have acked it.

Re: [PATCH] thunderbolt: Defer DP tunnel teardown until display driver is ready

Posted by An Wu 1 week ago

Hi Mika,


Thank you for the suggestion and for explaining the rationale behind
the current timeout value.

In our case, unfortunately, increasing the timeout alone wouldn't fully
address the issue. We have LUKS encryption in the middle of the boot
process, which means the timing of user interaction is unpredictable —
users may walk away and return at arbitrary points, making it difficult
to rely on any fixed timeout value regardless of how generous it is.

Appreciate you sharing your perspective on this.
It helps us better understand the constraints we need to work within.
We’ll continue investigating how to address this problem under the
current conditions.

Best regards,

An

On Thu, May 28, 2026 at 6:29 PM Mika Westerberg
<mika.westerberg@linux.intel.com> wrote:
>
> Hi,
>
> The DPRX timeout is there for a reason, although the reason is not really
> that common. Basically, if there is nothing connected to the DP IN we can
> detect that and use another DP IN to provide the user with a working DP
> tunnel.
>
> The timeout itself is currently 10 + 2 = 12s to allow i915 to enter runtime
> suspend and still be able to detect (via polling) a connected monitor.
> However, it is not really "written in stone". The VESA spec wants it to be
> 5s, but for our usage that is way too short. I have no problem increasing
> it, but then some users may suffer for the reason above (if a DP IN is not
> connected). Maybe increasing it is a reasonable compromise?
>
> The other option is to put the DP IN in a "penalty box" for a while, but I
> don't think this helps because you need the hotplug event for the DP OUT
> part and that's not happening after we have acked it.

Re: [PATCH] thunderbolt: Defer DP tunnel teardown until display driver is ready

Posted by An Wu 1 week ago

Hi Mika

Another approach I considered is using register_module_notifier() to
detect when a display driver module is loaded, then retrigger the DP
tunnel setup. However, since struct module does not carry any device
class or subsystem metadata, there is no generic way to identify
whether a loaded module is a display driver. We would need to maintain
a hardcoded list of known GPU module names (i915, xe, amdgpu, etc.),
which is fragile and not scalable.
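
To illustrate the problem, a name-based check would look roughly like
the sketch below. It is plain userspace C with a hypothetical helper and
list, not proposed kernel code. Any driver missing from the list, for
example nouveau, would silently be ignored:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hardcoded list; it has to be extended for every new driver. */
static const char *const known_gpu_modules[] = { "i915", "xe", "amdgpu" };

/* Return true only if the module name is an exact match in the list. */
static bool is_known_gpu_module(const char *name)
{
	size_t count = sizeof(known_gpu_modules) / sizeof(known_gpu_modules[0]);
	size_t i;

	for (i = 0; i < count; i++) {
		if (strcmp(name, known_gpu_modules[i]) == 0)
			return true;
	}
	return false;
}
```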

BR
    An

On Mon, Jun 1, 2026 at 10:45 AM An Wu <an.wu@canonical.com> wrote:
>
> Hi Mika,
>
>
> Thank you for the suggestion and for explaining the rationale behind
> the current timeout value.
>
> In our case, unfortunately, increasing the timeout alone wouldn't fully
> address the issue. We have LUKS encryption in the middle of the boot
> process, which means the timing of user interaction is unpredictable —
> users may walk away and return at arbitrary points, making it difficult
> to rely on any fixed timeout value regardless of how generous it is.
>
> Appreciate you sharing your perspective on this.
> It helps us better understand the constraints we need to work within.
> We’ll continue investigating how to address this problem under the
> current conditions.
>
> Best regards,
>
> An
>

Re: [PATCH] thunderbolt: Defer DP tunnel teardown until display driver is ready

Posted by Mika Westerberg 1 week ago

Hi,

On Mon, Jun 01, 2026 at 02:50:21PM +0800, An Wu wrote:
> Hi Mika
> 
> Another approach I considered is using register_module_notifier() to
> detect when a display driver module is loaded, then retrigger the DP
> tunnel setup. However, since struct module does not carry any device
> class or subsystem metadata, there is no generic way to identify
> whether a loaded module is a display driver. We would need to maintain
> a hardcoded list of known GPU module names (i915, xe, amdgpu, etc.),
> which is fragile and not scalable.

Indeed. Perhaps do not try to solve this in the kernel and do it in
userspace instead?

Have you actually measured how much the initramfs size "increases" if you
include the relevant graphics drivers and their dependencies?

Re: [PATCH] thunderbolt: Defer DP tunnel teardown until display driver is ready

Posted by An Wu 1 week ago

Hi Mika,

We tried putting the graphics modules into the initramfs and the size
increased from 56 MB to over 200 MB. We will discuss with the team the
possibility of fixing this in userspace and follow up once we have a
clearer picture.

Really appreciate your time and patience.

BR
    An

On Mon, Jun 1, 2026 at 3:04 PM Mika Westerberg
<mika.westerberg@linux.intel.com> wrote:
>
> Hi,
>
> Indeed. Perhaps do not try to solve this in the kernel and do it in
> userspace instead?
>
> Have you actually measured how much the initramfs size "increases" if you
> include the relevant graphics drivers and their dependencies?

Re: [PATCH] thunderbolt: Defer DP tunnel teardown until display driver is ready

Posted by An Wu 3 days, 8 hours ago

Hi Mika,

After further consideration, we've decided to drop this patch and
probably note this as a limitation. Really appreciate your patience, and
sorry for the noise.

BR
    An

On Mon, Jun 1, 2026 at 4:18 PM An Wu <an.wu@canonical.com> wrote:
>
> Hi Mika,
>
> We tried putting the graphics modules into the initramfs and the size
> increased from 56 MB to over 200 MB. We will discuss with the team the
> possibility of fixing this in userspace and follow up once we have a
> clearer picture.
>
> Really appreciate your time and patience.
>
> BR
>     An
>