net/devlink/rate.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
From: Shay Drory <shayd@nvidia.com>
The function devl_rate_nodes_destroy is documented to "Unset parent for
all rate objects". However, it was only calling the driver-specific
`rate_leaf_parent_set` or `rate_node_parent_set` ops and decrementing
the parent's refcount, without actually setting the
`devlink_rate->parent` pointer to NULL.
This leaves a dangling pointer in the `devlink_rate` struct, which is
inconsistent with the behavior of `devlink_nl_rate_parent_node_set`,
where the parent pointer is correctly cleared.
This patch fixes the issue by explicitly setting `devlink_rate->parent`
to NULL after notifying the driver, thus fulfilling the function's
documented behavior for all rate objects.
Fixes: d75559845078 ("devlink: Allow setting parent node of rate objects")
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
---
net/devlink/rate.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/devlink/rate.c b/net/devlink/rate.c
index 264fb82cba19..d157a8419bca 100644
--- a/net/devlink/rate.c
+++ b/net/devlink/rate.c
@@ -828,13 +828,15 @@ void devl_rate_nodes_destroy(struct devlink *devlink)
if (!devlink_rate->parent)
continue;
- refcount_dec(&devlink_rate->parent->refcnt);
if (devlink_rate_is_leaf(devlink_rate))
ops->rate_leaf_parent_set(devlink_rate, NULL, devlink_rate->priv,
NULL, NULL);
else if (devlink_rate_is_node(devlink_rate))
ops->rate_node_parent_set(devlink_rate, NULL, devlink_rate->priv,
NULL, NULL);
+
+ refcount_dec(&devlink_rate->parent->refcnt);
+ devlink_rate->parent = NULL;
}
list_for_each_entry_safe(devlink_rate, tmp, &devlink->rate_list, list) {
if (devlink_rate_is_node(devlink_rate)) {
base-commit: 8c0726e861f3920bac958d76cf134b5a3aa14ce4
--
2.31.1
On Tue, 11 Nov 2025 14:14:39 +0200 Tariq Toukan wrote: > The function devl_rate_nodes_destroy is documented to "Unset parent for > all rate objects". However, it was only calling the driver-specific > `rate_leaf_parent_set` or `rate_node_parent_set` ops and decrementing > the parent's refcount, without actually setting the > `devlink_rate->parent` pointer to NULL. > > This leaves a dangling pointer in the `devlink_rate` struct, which is > inconsistent with the behavior of `devlink_nl_rate_parent_node_set`, > where the parent pointer is correctly cleared. > > This patch fixes the issue by explicitly setting `devlink_rate->parent` > to NULL after notifying the driver, thus fulfilling the function's > documented behavior for all rate objects. What is the _real_ issue you're solving here? If the function destroys all nodes maybe it doesn't matter that the pointer isn't cleared. -- pw-bot: cr
On 13/11/2025 4:12, Jakub Kicinski wrote: > External email: Use caution opening links or attachments > > > On Tue, 11 Nov 2025 14:14:39 +0200 Tariq Toukan wrote: >> The function devl_rate_nodes_destroy is documented to "Unset parent for >> all rate objects". However, it was only calling the driver-specific >> `rate_leaf_parent_set` or `rate_node_parent_set` ops and decrementing >> the parent's refcount, without actually setting the >> `devlink_rate->parent` pointer to NULL. >> >> This leaves a dangling pointer in the `devlink_rate` struct, which is >> inconsistent with the behavior of `devlink_nl_rate_parent_node_set`, >> where the parent pointer is correctly cleared. >> >> This patch fixes the issue by explicitly setting `devlink_rate->parent` >> to NULL after notifying the driver, thus fulfilling the function's >> documented behavior for all rate objects. > > What is the _real_ issue you're solving here? If the function destroys > all nodes maybe it doesn't matter that the pointer isn't cleared. > -- > pw-bot: cr The problem is a leaf which have this node as a parent, now pointing to invalid memory. When this leaf will be destroyed, in devl_rate_leaf_destroy, we can get NULL-ptr error, or refcount error. Is this answer your question?
On Thu, 13 Nov 2025 10:33:09 +0200 Shay Drori wrote: > > On Tue, 11 Nov 2025 14:14:39 +0200 Tariq Toukan wrote: > >> The function devl_rate_nodes_destroy is documented to "Unset parent for > >> all rate objects". However, it was only calling the driver-specific > >> `rate_leaf_parent_set` or `rate_node_parent_set` ops and decrementing > >> the parent's refcount, without actually setting the > >> `devlink_rate->parent` pointer to NULL. > >> > >> This leaves a dangling pointer in the `devlink_rate` struct, which is > >> inconsistent with the behavior of `devlink_nl_rate_parent_node_set`, > >> where the parent pointer is correctly cleared. > >> > >> This patch fixes the issue by explicitly setting `devlink_rate->parent` > >> to NULL after notifying the driver, thus fulfilling the function's > >> documented behavior for all rate objects. > > > > What is the _real_ issue you're solving here? If the function destroys > > all nodes maybe it doesn't matter that the pointer isn't cleared. > > The problem is a leaf which have this node as a parent, now pointing to > invalid memory. When this leaf will be destroyed, in > devl_rate_leaf_destroy, we can get NULL-ptr error, or refcount error. > > Is this answer your question? Kind of. I was hoping you can add a concrete example to the commit message. What sequence of user operations are needed with mlx5 to make the kernel oops.
© 2016 - 2026 Red Hat, Inc.