The resource-tracking (restrack) database is the back-end for the netlink
"rdma resource show" interface, which pins objects with
rdma_restrack_get().
The QP/CQ/SRQ destroy flows call rdma_restrack_del() at the end of
ib_destroy_*_user(), after device->ops.destroy_*() has already freed the
vendor object. In that window a concurrent netlink dump can still look
the object up and touch freed memory, for instance a use-after-free via
ib_query_qp().
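
The ordering problem above can be illustrated with a small userspace toy
model. This is only a sketch: struct obj, tracked, lookup(), old_destroy()
and new_destroy() are hypothetical names, not kernel code, and the
"concurrent dump" is simulated by a lookup placed at the racy point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tracked QP/CQ/SRQ. */
struct obj {
	int vendor_freed;	/* set once the "driver" has freed its state */
};

/* Single-slot stand-in for the restrack database. */
static struct obj *tracked;

/* What a netlink dump would do: find the object by lookup. */
static struct obj *lookup(void)
{
	return tracked;
}

/* Old ordering: driver destroy first, untrack last.
 * Returns 1 if a dump in the window saw an already-freed object. */
static int old_destroy(struct obj *o)
{
	struct obj *seen;
	int stale;

	o->vendor_freed = 1;		/* models device->ops.destroy_*() */
	seen = lookup();		/* dump racing with destroy */
	stale = seen && seen->vendor_freed;
	tracked = NULL;			/* models rdma_restrack_del() at the end */
	return stale;
}

/* Fixed ordering: hide the object before the driver frees it. */
static int new_destroy(struct obj *o)
{
	struct obj *seen;
	int stale;

	tracked = NULL;			/* object is no longer visible */
	o->vendor_freed = 1;
	seen = lookup();		/* the same racing dump finds nothing */
	stale = seen && seen->vendor_freed;
	return stale;
}
```

In this model old_destroy() reports a stale hit while new_destroy() does
not, which is the property the series is after.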
Fix this by splitting the delete into a begin/commit/abort sequence:
begin_del() parks the entry as XA_ZERO_ENTRY (so lookups return NULL),
drops the birth reference and waits for in-flight readers to drain,
while keeping the index reserved. The destroy paths run begin_del()
first, then commit_del() on success or abort_del() on error.
abort_del() re-inserts into the reserved slot, so it needs no allocation
and cannot fail.
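
A minimal userspace sketch of the begin/commit/abort flow described above.
It is not the kernel implementation: every model_* name, the fixed-size
table and reserved_marker (standing in for XA_ZERO_ENTRY) are assumptions
made for illustration, and the single-threaded model reports remaining
readers instead of waiting for them.

```c
#include <assert.h>
#include <stddef.h>

#define NSLOTS 8

struct entry {
	int id;		/* index in the table */
	int refs;	/* 1 birth reference + 1 per in-flight reader */
};

/* Sentinel: "index reserved, nothing visible" (role of XA_ZERO_ENTRY). */
static struct entry reserved_marker;
static struct entry *table[NSLOTS];

static int model_add(struct entry *e, int id)
{
	if (id < 0 || id >= NSLOTS || table[id])
		return -1;	/* out of range, in use, or reserved */
	e->id = id;
	e->refs = 1;		/* birth reference */
	table[id] = e;
	return 0;
}

/* Reader lookup: reserved and empty slots both read as "not found". */
static struct entry *model_get(int id)
{
	struct entry *e;

	if (id < 0 || id >= NSLOTS)
		return NULL;
	e = table[id];
	if (!e || e == &reserved_marker)
		return NULL;
	e->refs++;
	return e;
}

static void model_put(struct entry *e)
{
	e->refs--;
}

/* Park the entry, drop the birth reference, keep the index reserved.
 * Returns the number of readers still holding a reference; a real
 * implementation would wait until that reaches zero. */
static int model_begin_del(struct entry *e)
{
	table[e->id] = &reserved_marker;
	e->refs--;
	return e->refs;
}

/* Success path: release the reserved index. */
static void model_commit_del(struct entry *e)
{
	table[e->id] = NULL;
}

/* Error path: store back into the reserved slot; no allocation needed. */
static void model_abort_del(struct entry *e)
{
	e->refs = 1;
	table[e->id] = e;
}

/* Shape of a destroy path: begin, driver destroy, then commit or abort. */
static int model_destroy(struct entry *e, int (*driver_destroy)(struct entry *))
{
	int ret;

	model_begin_del(e);
	ret = driver_destroy(e);
	if (ret) {
		model_abort_del(e);
		return ret;
	}
	model_commit_del(e);
	return 0;
}

/* Two hypothetical driver callbacks: one succeeds, one fails. */
static int driver_destroy_ok(struct entry *e)   { (void)e; return 0; }
static int driver_destroy_busy(struct entry *e) { (void)e; return -16; }
```

With driver_destroy_busy() the entry becomes visible again at the same
index after the call; with driver_destroy_ok() the index is free for
reuse.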
The first two patches remove DCT and raw RSS QP restrack tracking as
they have never worked (their ID is unset/reserved at create time).
Signed-off-by: Edward Srouji <edwards@nvidia.com>
---
Patrisious Haddad (6):
      RDMA/mlx5: Remove DCT restrack tracking
      RDMA/mlx5: Remove raw RSS QP restrack tracking
      RDMA/core: Add rdma_restrack_begin/abort/commit_del() operations
      RDMA/core: Fix use after free in ib_query_qp()
      RDMA/core: Fix potential use after free in ib_destroy_cq_user()
      RDMA/core: Fix potential use after free in ib_destroy_srq_user()

 drivers/infiniband/core/restrack.c    | 120 ++++++++++++++++++++++++++++++----
 drivers/infiniband/core/restrack.h    |   3 +
 drivers/infiniband/core/verbs.c       |  21 ++++--
 drivers/infiniband/hw/mlx5/qp.c       |   2 +
 drivers/infiniband/hw/mlx5/restrack.c |   3 -
 5 files changed, 130 insertions(+), 19 deletions(-)
---
base-commit: d6ab440240a04b8737ee4c7bb21af9182e451733
change-id: 20260607-restrack-uaf-fix-d3e0bccf0be1
Best regards,
--
Edward Srouji <edwards@nvidia.com>