From: Li RongQing <lirongqing@baidu.com>
In devlink_health_report(), the DEVLINK_CMD_HEALTH_REPORTER_RECOVER
notification is sent immediately after setting the error state, before
checking if recovery should be aborted via devlink_health_recover_abort().

When devlink_health_recover_abort() returns true (e.g., due to rate
limiting), the recovery process terminates early, but userspace has already
received a notification implying that recovery is underway. This creates a
misleading view of the reporter's activity.

Move the notification after the abort check, ensuring it is only sent when
recovery will actually proceed. This aligns the notification with the
actual recovery behavior.

Signed-off-by: Li RongQing <lirongqing@baidu.com>
---
net/devlink/health.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/devlink/health.c b/net/devlink/health.c
index 136a67c..e9999fc 100644
--- a/net/devlink/health.c
+++ b/net/devlink/health.c
@@ -665,7 +665,6 @@ int devlink_health_report(struct devlink_health_reporter *reporter,
 	reporter->error_count++;
 	prev_health_state = reporter->health_state;
 	reporter->health_state = DEVLINK_HEALTH_REPORTER_STATE_ERROR;
-	devlink_recover_notify(reporter, DEVLINK_CMD_HEALTH_REPORTER_RECOVER);
 	if (devlink_health_recover_abort(reporter, prev_health_state)) {
 		trace_devlink_health_recover_aborted(devlink,
@@ -686,6 +685,7 @@ int devlink_health_report(struct devlink_health_reporter *reporter,
 	if (!reporter->auto_recover)
 		return 0;
+	devlink_recover_notify(reporter, DEVLINK_CMD_HEALTH_REPORTER_RECOVER);
 	devl_lock(devlink);
 	ret = devlink_health_reporter_recover(reporter, priv_ctx, NULL);
 	devl_unlock(devlink);
--
2.9.4
On Tue, 24 Feb 2026 21:10:03 -0500 lirongqing wrote:
> In devlink_health_report(), the DEVLINK_CMD_HEALTH_REPORTER_RECOVER
> notification is sent immediately after setting the error state, before
> checking if recovery should be aborted via devlink_health_recover_abort().
>
> When devlink_health_recover_abort() returns true (e.g., due to rate
> limiting), the recovery process terminates early, but userspace has already
> received a notification implying that recovery is underway. This creates a
> misleading view of the reporter's activity.
>
> Move the notification after the abort check, ensuring it is only sent when
> recovery will actually proceed. This aligns the notification with the
> actual recovery behavior.

Hm, we don't have solid documentation for this notification,
but I think it's supposed to be triggered on any change in the health
state. It's not just a notification that recovery has taken place.
devlink_health_reporter_state_update() for instance sends it whether
the update is healthy -> error or error -> healthy.
-- 
pw-bot: reject
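
To make the objection concrete, here is a toy userspace model of the two
orderings. It is only a sketch: every toy_* name is hypothetical, none of
this is kernel code, and the abort rule (abort when the reporter was
already in the error state) is an assumption made for illustration. It
counts how many notifications userspace would see for a single
healthy -> error report when auto_recover is off.

```c
#include <stdbool.h>

/* Toy model only: hypothetical stand-ins, not the devlink structures. */
enum toy_state { TOY_HEALTHY, TOY_ERROR };

struct toy_reporter {
	enum toy_state state;
	bool auto_recover;
	int notifications;	/* events userspace has seen so far */
};

static void toy_notify(struct toy_reporter *r)
{
	r->notifications++;
}

/* Assumed abort rule for this sketch: do not retry recovery when the
 * reporter was already in the error state. */
static bool toy_recover_abort(enum toy_state prev)
{
	return prev == TOY_ERROR;
}

/* notify_late == false: notify right after the state is set to error.
 * notify_late == true:  notify only once recovery is about to run,
 *                       as in the proposed patch. */
static int toy_report(struct toy_reporter *r, bool notify_late)
{
	enum toy_state prev = r->state;

	r->state = TOY_ERROR;
	if (!notify_late)
		toy_notify(r);

	if (toy_recover_abort(prev))
		return -1;
	if (!r->auto_recover)
		return 0;

	if (notify_late)
		toy_notify(r);
	r->state = TOY_HEALTHY;	/* pretend recovery succeeded */
	return 0;
}

/* One healthy -> error report with auto_recover off; returns the
 * number of notifications emitted. */
int toy_count(bool notify_late)
{
	struct toy_reporter r = { TOY_HEALTHY, false, 0 };

	toy_report(&r, notify_late);
	return r.notifications;
}
```

In this model toy_count(false) yields one notification and
toy_count(true) yields none, even though the state changed from healthy
to error in both cases. If the notification is meant to track state
changes, as suggested above, the late ordering would hide that
transition from userspace.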