drivers/cxl/core/port.c | 9 +++++++++ 1 file changed, 9 insertions(+)
xa_erase() in cxl_dport_remove() runs without the port device lock,
creating a race with any caller that does xa_load() on port->dports
and then dereferences the returned dport pointer. A concurrent
cxl_dport_remove() can erase and free the dport between the xa_load()
and the caller acquiring the port lock, causing a use-after-free.
For non-root ports the port lock is already held by the caller on two
paths:
1. Driver unbind: devres_release_all() is called from
__device_release_driver() which holds port->dev.mutex.
2. Dynamic endpoint removal: cxl_detach_ep() takes the port lock
before calling del_dports() -> del_dport() -> devres_release_group(),
which synchronously runs cxl_dport_remove().
Use cond_cxl_root_lock/unlock(), which only acquires the port lock when
the port is a root port and the lock is therefore not already held.
This matches the pattern used in __devm_cxl_add_dport() for the same
reason.
Reported-by: Sashiko
Fixes: 391785859e7e ("cxl/port: Move dport tracking to an xarray")
Signed-off-by: Terry Bowman <terry.bowman@amd.com>
---
drivers/cxl/core/port.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index 0c5957d1d329..80ce7c4d357c 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1088,8 +1088,17 @@ static void cxl_dport_remove(void *data)
struct cxl_dport *dport = data;
struct cxl_port *port = dport->port;
+ /*
+ * For non-root ports the port lock is already held by the caller
+ * (driver unbind via devres_release_all(), or cxl_detach_ep() via
+ * devres_release_group()). Acquiring it again unconditionally would
+ * deadlock. Use cond_cxl_root_lock() which only acquires when the
+ * port is a root port and the lock is therefore not yet held.
+ */
+ cond_cxl_root_lock(port);
port->nr_dports--;
xa_erase(&port->dports, (unsigned long) dport->dport_dev);
+ cond_cxl_root_unlock(port);
put_device(dport->dport_dev);
}
--
2.34.1
On 6/5/2026 1:20 PM, Terry Bowman wrote:
> xa_erase() in cxl_dport_remove() runs without the port device lock,
> creating a race with any caller that does xa_load() on port->dports
> and then dereferences the returned dport pointer. A concurrent
> cxl_dport_remove() can erase and free the dport between the xa_load()
> and the caller acquiring the port lock, causing a use-after-free.
>
> For non-root ports the port lock is already held by the caller on two
> paths:
>
> 1. Driver unbind: devres_release_all() is called from
> __device_release_driver() which holds port->dev.mutex.
>
> 2. Dynamic endpoint removal: cxl_detach_ep() takes the port lock
> before calling del_dports() -> del_dport() -> devres_release_group(),
> which synchronously runs cxl_dport_remove().
>
> Use cond_cxl_root_lock/unlock(), which only acquires the port lock when
> the port is a root port and the lock is therefore not already held.
> This matches the pattern used in __devm_cxl_add_dport() for the same
> reason.
>
> Reported-by: Sashiko
> Fixes: 391785859e7e ("cxl/port: Move dport tracking to an xarray")
> Signed-off-by: Terry Bowman <terry.bowman@amd.com>
So I think this is a real bug, but I had to think about it pretty long
and hard to convince myself. The scenario I'm envisioning would be an
error occurs and the driver is force unbound while the error routine is
running. I guess it could also happen if someone does something weird
with force unbinding and rebinding the driver, but I'm not sure that's
possible.
What I guess I'm getting at is this could benefit from an example of
how this can happen in the commit log. With that:
Reviewed-by: Ben Cheatham <benjamin.cheatham@amd.com>
> ---
> drivers/cxl/core/port.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 0c5957d1d329..80ce7c4d357c 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1088,8 +1088,17 @@ static void cxl_dport_remove(void *data)
> struct cxl_dport *dport = data;
> struct cxl_port *port = dport->port;
>
> + /*
> + * For non-root ports the port lock is already held by the caller
> + * (driver unbind via devres_release_all(), or cxl_detach_ep() via
> + * devres_release_group()). Acquiring it again unconditionally would
> + * deadlock. Use cond_cxl_root_lock() which only acquires when the
> + * port is a root port and the lock is therefore not yet held.
> + */
> + cond_cxl_root_lock(port);
> port->nr_dports--;
> xa_erase(&port->dports, (unsigned long) dport->dport_dev);
> + cond_cxl_root_unlock(port);
> put_device(dport->dport_dev);
> }
>
© 2016 - 2026 Red Hat, Inc.