[PATCH rdma v1] RDMA/mlx5: Fix devx subscribe-event unwind NULL dereference

Prathamesh Deshpande posted 1 patch 1 month, 3 weeks ago
There is a newer version of this series
Posted by Prathamesh Deshpande 1 month, 3 weeks ago
MLX5_IB_METHOD_DEVX_SUBSCRIBE_EVENT() links event_sub into sub_list
before initializing the fields used by the shared error path.

If eventfd_ctx_fdget() then fails, the unwind path dereferences
event_sub->ev_file in uverbs_uobject_put() and calls
subscribe_event_xa_dealloc() with event_sub->xa_key_level1 still unset.

Also, if kzalloc_obj() for event_sub fails after
subscribe_event_xa_alloc() succeeds, the current iteration is not yet
tracked in sub_list, so the shared unwind path cannot undo the XA
allocation.

Initialize the shared-unwind fields before linking event_sub into
sub_list and explicitly unwind the XA allocation on event_sub allocation
failure.

Fixes: 759738537142 ("IB/mlx5: Enable subscription for device events over DEVX")
Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@gmail.com>
---
 drivers/infiniband/hw/mlx5/devx.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/devx.c b/drivers/infiniband/hw/mlx5/devx.c
index 645ebcc0832d..3d1528b1c816 100644
--- a/drivers/infiniband/hw/mlx5/devx.c
+++ b/drivers/infiniband/hw/mlx5/devx.c
@@ -2160,10 +2160,16 @@ static int UVERBS_HANDLER(MLX5_IB_METHOD_DEVX_SUBSCRIBE_EVENT)(
 
 		event_sub = kzalloc_obj(*event_sub);
 		if (!event_sub) {
+			subscribe_event_xa_dealloc(devx_event_table,
+						   key_level1,
+						   obj,
+						   obj_id);
 			err = -ENOMEM;
 			goto err;
 		}
 
+		event_sub->ev_file = ev_file;
+		event_sub->xa_key_level1 = key_level1;
 		list_add_tail(&event_sub->event_list, &sub_list);
 		uverbs_uobject_get(&ev_file->uobj);
 		if (use_eventfd) {
@@ -2178,9 +2184,6 @@ static int UVERBS_HANDLER(MLX5_IB_METHOD_DEVX_SUBSCRIBE_EVENT)(
 		}
 
 		event_sub->cookie = cookie;
-		event_sub->ev_file = ev_file;
-		/* May be needed upon cleanup the devx object/subscription */
-		event_sub->xa_key_level1 = key_level1;
 		event_sub->xa_key_level2 = obj_id;
 		INIT_LIST_HEAD(&event_sub->obj_list);
 	}
-- 
2.43.0
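
The ordering rule behind this patch (set every field the shared unwind
path reads before the node becomes reachable from the list) can be
shown with a minimal userspace sketch. Everything in it is a
hypothetical stand-in: struct file_ref, struct sub, add_sub() and
unwind() are illustrative names, not the mlx5 structures or helpers,
and a plain singly linked list replaces the kernel list API.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; not the mlx5 structures. */
struct file_ref {
	int refcount;
};

struct sub {
	struct sub *next;
	struct file_ref *file;	/* read by the unwind path */
	unsigned int key;	/* read by the unwind path */
	int cookie;		/* not needed for unwinding */
};

/* Shared unwind: relies on file and key being valid for every node. */
static void unwind(struct sub **head, unsigned int *released, int *n)
{
	while (*head) {
		struct sub *s = *head;

		*head = s->next;
		released[(*n)++] = s->key;
		s->file->refcount--;	/* would crash if s->file were NULL */
		free(s);
	}
}

/* Adds one node; fail_late simulates a failure after the node is linked. */
static int add_sub(struct sub **head, struct file_ref *file,
		   unsigned int key, int fail_late)
{
	struct sub *s = calloc(1, sizeof(*s));

	if (!s)
		return -1;

	/* Set everything unwind() reads before publishing the node. */
	s->file = file;
	s->key = key;
	file->refcount++;
	s->next = *head;
	*head = s;

	if (fail_late)
		return -1;	/* caller unwinds; the node is already consistent */

	s->cookie = 42;
	return 0;
}

/* Returns the number of nodes released by the unwind path. */
int demo_unwind(void)
{
	struct file_ref f = { .refcount = 1 };
	struct sub *head = NULL;
	unsigned int released[4];
	int n = 0;

	assert(add_sub(&head, &f, 7, 0) == 0);
	if (add_sub(&head, &f, 9, 1) != 0)
		unwind(&head, released, &n);

	assert(head == NULL);
	assert(f.refcount == 1);	/* every reference taken was dropped */
	return n;
}
```

If the two assignments in add_sub() were moved below the fail_late
check, unwind() would dereference a NULL file pointer for the second
node, which is the failure mode the commit message describes.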
Re: [PATCH rdma v1] RDMA/mlx5: Fix devx subscribe-event unwind NULL dereference
Posted by Yishai Hadas 1 month, 3 weeks ago
On 25/04/2026 3:59, Prathamesh Deshpande wrote:
> MLX5_IB_METHOD_DEVX_SUBSCRIBE_EVENT() links event_sub into sub_list
> before initializing the fields used by the shared error path.
> 
> If eventfd_ctx_fdget() then fails, the unwind path dereferences
> event_sub->ev_file in uverbs_uobject_put() and calls
> subscribe_event_xa_dealloc() with event_sub->xa_key_level1 still unset.
> 
> Also, if kzalloc_obj() for event_sub fails after
> subscribe_event_xa_alloc() succeeds, the current iteration is not yet
> tracked in sub_list, so the shared unwind path cannot undo the XA
> allocation.
> 
> Initialize the shared-unwind fields before linking event_sub into
> sub_list and explicitly unwind the XA allocation on event_sub allocation
> failure.
> 
> Fixes: 759738537142 ("IB/mlx5: Enable subscription for device events over DEVX")
> Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@gmail.com>

LGTM
Reviewed-by: Yishai Hadas <yishaih@nvidia.com>

> ---
>   drivers/infiniband/hw/mlx5/devx.c | 9 ++++++---
>   1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/mlx5/devx.c b/drivers/infiniband/hw/mlx5/devx.c
> index 645ebcc0832d..3d1528b1c816 100644
> --- a/drivers/infiniband/hw/mlx5/devx.c
> +++ b/drivers/infiniband/hw/mlx5/devx.c
> @@ -2160,10 +2160,16 @@ static int UVERBS_HANDLER(MLX5_IB_METHOD_DEVX_SUBSCRIBE_EVENT)(
>   
>   		event_sub = kzalloc_obj(*event_sub);
>   		if (!event_sub) {
> +			subscribe_event_xa_dealloc(devx_event_table,
> +						   key_level1,
> +						   obj,
> +						   obj_id);
>   			err = -ENOMEM;
>   			goto err;
>   		}
>   
> +		event_sub->ev_file = ev_file;
> +		event_sub->xa_key_level1 = key_level1;
>   		list_add_tail(&event_sub->event_list, &sub_list);
>   		uverbs_uobject_get(&ev_file->uobj);
>   		if (use_eventfd) {
> @@ -2178,9 +2184,6 @@ static int UVERBS_HANDLER(MLX5_IB_METHOD_DEVX_SUBSCRIBE_EVENT)(
>   		}
>   
>   		event_sub->cookie = cookie;
> -		event_sub->ev_file = ev_file;
> -		/* May be needed upon cleanup the devx object/subscription */
> -		event_sub->xa_key_level1 = key_level1;
>   		event_sub->xa_key_level2 = obj_id;
>   		INIT_LIST_HEAD(&event_sub->obj_list);
>   	}
Re: [PATCH rdma v1] RDMA/mlx5: Fix devx subscribe-event unwind NULL dereference
Posted by Yishai Hadas 1 month, 3 weeks ago
On 27/04/2026 15:16, Yishai Hadas wrote:
> On 25/04/2026 3:59, Prathamesh Deshpande wrote:
>> MLX5_IB_METHOD_DEVX_SUBSCRIBE_EVENT() links event_sub into sub_list
>> before initializing the fields used by the shared error path.
>>
>> If eventfd_ctx_fdget() then fails, the unwind path dereferences
>> event_sub->ev_file in uverbs_uobject_put() and calls
>> subscribe_event_xa_dealloc() with event_sub->xa_key_level1 still unset.
>>
>> Also, if kzalloc_obj() for event_sub fails after
>> subscribe_event_xa_alloc() succeeds, the current iteration is not yet
>> tracked in sub_list, so the shared unwind path cannot undo the XA
>> allocation.
>>
>> Initialize the shared-unwind fields before linking event_sub into
>> sub_list and explicitly unwind the XA allocation on event_sub allocation
>> failure.
>>
>> Fixes: 759738537142 ("IB/mlx5: Enable subscription for device events over DEVX")
>> Signed-off-by: Prathamesh Deshpande <prathameshdeshpande7@gmail.com>
> 
> LGTM
> Reviewed-by: Yishai Hadas <yishaih@nvidia.com>

Prathamesh,

Please see the review note from sashiko on your patch [1]; it seems
right to me.

Can you please send a V2 that addresses it?

The chunks below [2], applied on top of your V1 together with an
improved commit log, can be considered a proper solution.

I would add something like the following to the commit log:

"subscribe_event_xa_alloc() creates the XA entry exactly once (on the
first occurrence of KEY_A), so subscribe_event_xa_dealloc() must also be
called exactly once for it.
Enforce that by adding a helper function named devx_key_in_sub_list()
and calling subscribe_event_xa_dealloc() only when the last occurrence
is being cleaned up."
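
The "exactly once" rule in the text above can be illustrated with a
minimal userspace sketch. It is an assumption-laden model, not the
driver code: struct table stands in for the level-1 XA, table_alloc()
and table_dealloc() are hypothetical replacements for
subscribe_event_xa_alloc() and subscribe_event_xa_dealloc(), and the
sketch simply assumes that a second dealloc for the same key is an
error.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_KEYS 16

/* Hypothetical stand-in for the level-1 XA: one slot per key. */
struct table {
	bool present[MAX_KEYS];
	int dealloc_calls;
};

struct sub {
	struct sub *next;
	unsigned int key;
};

/* Idempotent: the entry is created on the first occurrence only. */
static void table_alloc(struct table *t, unsigned int key)
{
	t->present[key] = true;
}

/* Assumed rule for this sketch: a second dealloc of a key is a bug. */
static void table_dealloc(struct table *t, unsigned int key)
{
	assert(t->present[key]);
	t->present[key] = false;
	t->dealloc_calls++;
}

/* Same idea as devx_key_in_sub_list(): is the key still referenced? */
static bool key_in_list(const struct sub *head, unsigned int key)
{
	for (; head; head = head->next)
		if (head->key == key)
			return true;
	return false;
}

/* Returns how many times table_dealloc() ran during the unwind. */
int demo_dealloc_once(void)
{
	struct table t = { 0 };
	struct sub c = { NULL, 3 };
	struct sub b = { &c, 5 };
	struct sub a = { &b, 3 };	/* key 3 appears twice in the list */
	struct sub *head = &a;

	for (struct sub *s = head; s; s = s->next)
		table_alloc(&t, s->key);

	while (head) {
		struct sub *s = head;

		head = s->next;		/* unlink before checking the rest */
		if (!key_in_list(head, s->key))
			table_dealloc(&t, s->key);
	}

	assert(!t.present[3] && !t.present[5]);
	return t.dealloc_calls;
}
```

Without the key_in_list() check, the loop would call table_dealloc()
for key 3 twice and trip the assertion on the second call.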

[1] 
https://sashiko.dev/#/patchset/20260425010107.19586-1-prathameshdeshpande7%40gmail.com

[2]
diff --git a/drivers/infiniband/hw/mlx5/devx.c b/drivers/infiniband/hw/mlx5/devx.c
index 3d1528b1c816..c2ae5a140471 100644
--- a/drivers/infiniband/hw/mlx5/devx.c
+++ b/drivers/infiniband/hw/mlx5/devx.c
@@ -1913,6 +1913,17 @@ static int UVERBS_HANDLER(MLX5_IB_METHOD_DEVX_OBJ_ASYNC_QUERY)(
         return err;
  }

+static bool devx_key_in_sub_list(struct list_head *list, u32 key_level1)
+{
+       struct devx_event_subscription *s;
+
+       list_for_each_entry(s, list, event_list)
+               if (s->xa_key_level1 == key_level1)
+                       return true;
+
+       return false;
+}
+
  static void
  subscribe_event_xa_dealloc(struct mlx5_devx_event_table *devx_event_table,
                            u32 key_level1,
@@ -2160,10 +2171,11 @@ static int UVERBS_HANDLER(MLX5_IB_METHOD_DEVX_SUBSCRIBE_EVENT)(

                 event_sub = kzalloc_obj(*event_sub);
                 if (!event_sub) {
-                       subscribe_event_xa_dealloc(devx_event_table,
-                                                  key_level1,
-                                                  obj,
-                                                  obj_id);
+                       if (!devx_key_in_sub_list(&sub_list, key_level1))
+                               subscribe_event_xa_dealloc(devx_event_table,
+                                                          key_level1,
+                                                          obj,
+                                                          obj_id);
                         err = -ENOMEM;
                         goto err;
                 }
@@ -2228,10 +2240,11 @@ static int UVERBS_HANDLER(MLX5_IB_METHOD_DEVX_SUBSCRIBE_EVENT)(
         list_for_each_entry_safe(event_sub, tmp_sub, &sub_list, event_list) {
                 list_del(&event_sub->event_list);

-               subscribe_event_xa_dealloc(devx_event_table,
-                                          event_sub->xa_key_level1,
-                                          obj,
-                                          obj_id);
+               if (!devx_key_in_sub_list(&sub_list, event_sub->xa_key_level1))
+                       subscribe_event_xa_dealloc(devx_event_table,
+                                                  event_sub->xa_key_level1,
+                                                  obj,
+                                                  obj_id);

                 if (event_sub->eventfd)
                         eventfd_ctx_put(event_sub->eventfd);

Yishai

> 
>> ---
>>   drivers/infiniband/hw/mlx5/devx.c | 9 ++++++---
>>   1 file changed, 6 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/infiniband/hw/mlx5/devx.c b/drivers/infiniband/hw/mlx5/devx.c
>> index 645ebcc0832d..3d1528b1c816 100644
>> --- a/drivers/infiniband/hw/mlx5/devx.c
>> +++ b/drivers/infiniband/hw/mlx5/devx.c
>> @@ -2160,10 +2160,16 @@ static int UVERBS_HANDLER(MLX5_IB_METHOD_DEVX_SUBSCRIBE_EVENT)(
>>           event_sub = kzalloc_obj(*event_sub);
>>           if (!event_sub) {
>> +            subscribe_event_xa_dealloc(devx_event_table,
>> +                           key_level1,
>> +                           obj,
>> +                           obj_id);
>>               err = -ENOMEM;
>>               goto err;
>>           }
>> +        event_sub->ev_file = ev_file;
>> +        event_sub->xa_key_level1 = key_level1;
>>           list_add_tail(&event_sub->event_list, &sub_list);
>>           uverbs_uobject_get(&ev_file->uobj);
>>           if (use_eventfd) {
>> @@ -2178,9 +2184,6 @@ static int UVERBS_HANDLER(MLX5_IB_METHOD_DEVX_SUBSCRIBE_EVENT)(
>>           }
>>           event_sub->cookie = cookie;
>> -        event_sub->ev_file = ev_file;
>> -        /* May be needed upon cleanup the devx object/subscription */
>> -        event_sub->xa_key_level1 = key_level1;
>>           event_sub->xa_key_level2 = obj_id;
>>           INIT_LIST_HEAD(&event_sub->obj_list);
>>       }
> 

Re: [PATCH rdma v1] RDMA/mlx5: Fix devx subscribe-event unwind NULL dereference
Posted by Prathamesh Deshpande 1 month, 3 weeks ago
On 28 Apr 2026 17:55:22 +0300, Yishai Hadas wrote:
> Can you please send a V2 that addresses it?

Thanks for the feedback, Yishai. I have sent v2 which addresses these points.