[PATCH v1] cpuidle: Do not return from cpuidle_play_dead() on callback failures

Rafael J. Wysocki posted 1 patch 1 week, 1 day ago
drivers/cpuidle/cpuidle.c |    7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
[PATCH v1] cpuidle: Do not return from cpuidle_play_dead() on callback failures
Posted by Rafael J. Wysocki 1 week, 1 day ago
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

If the :enter_dead() idle state callback fails for a certain state,
there may be still a shallower state for which it will work.

Because the only caller of cpuidle_play_dead(), native_play_dead(),
falls back to hlt_play_dead() if it returns an error, it should
better try all of the idle states for which :enter_dead() is present
before failing, so change it accordingly.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/cpuidle/cpuidle.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

Index: linux-pm/drivers/cpuidle/cpuidle.c
===================================================================
--- linux-pm.orig/drivers/cpuidle/cpuidle.c
+++ linux-pm/drivers/cpuidle/cpuidle.c
@@ -70,9 +70,10 @@ int cpuidle_play_dead(void)
 		return -ENODEV;
 
 	/* Find lowest-power state that supports long-term idle */
-	for (i = drv->state_count - 1; i >= 0; i--)
-		if (drv->states[i].enter_dead)
-			return drv->states[i].enter_dead(dev, i);
+	for (i = drv->state_count - 1; i >= 0; i--) {
+		if (drv->states[i].enter_dead && !drv->states[i].enter_dead(dev, i))
+			return 0;
+	}
 
 	return -ENODEV;
 }
Re: [PATCH v1] cpuidle: Do not return from cpuidle_play_dead() on callback failures
Posted by Peter Zijlstra 1 week ago
On Thu, Nov 14, 2024 at 06:46:20PM +0100, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> If the :enter_dead() idle state callback fails for a certain state,
> there may be still a shallower state for which it will work.
> 
> Because the only caller of cpuidle_play_dead(), native_play_dead(),
> falls back to hlt_play_dead() if it returns an error, it should
> better try all of the idle states for which :enter_dead() is present
> before failing, so change it accordingly.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/cpuidle/cpuidle.c |    7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> Index: linux-pm/drivers/cpuidle/cpuidle.c
> ===================================================================
> --- linux-pm.orig/drivers/cpuidle/cpuidle.c
> +++ linux-pm/drivers/cpuidle/cpuidle.c
> @@ -70,9 +70,10 @@ int cpuidle_play_dead(void)
>  		return -ENODEV;
>  
>  	/* Find lowest-power state that supports long-term idle */
> -	for (i = drv->state_count - 1; i >= 0; i--)
> -		if (drv->states[i].enter_dead)
> -			return drv->states[i].enter_dead(dev, i);
> +	for (i = drv->state_count - 1; i >= 0; i--) {
> +		if (drv->states[i].enter_dead && !drv->states[i].enter_dead(dev, i))
> +			return 0;
> +	}

Hmm, strictly speaking there is no 'success' return from play_dead(). On
success, the CPU is dead :-)
Re: [PATCH v1] cpuidle: Do not return from cpuidle_play_dead() on callback failures
Posted by Rafael J. Wysocki 1 week ago
On Fri, Nov 15, 2024 at 11:14 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, Nov 14, 2024 at 06:46:20PM +0100, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > If the :enter_dead() idle state callback fails for a certain state,
> > there may be still a shallower state for which it will work.
> >
> > Because the only caller of cpuidle_play_dead(), native_play_dead(),
> > falls back to hlt_play_dead() if it returns an error, it should
> > better try all of the idle states for which :enter_dead() is present
> > before failing, so change it accordingly.
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >  drivers/cpuidle/cpuidle.c |    7 ++++---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > Index: linux-pm/drivers/cpuidle/cpuidle.c
> > ===================================================================
> > --- linux-pm.orig/drivers/cpuidle/cpuidle.c
> > +++ linux-pm/drivers/cpuidle/cpuidle.c
> > @@ -70,9 +70,10 @@ int cpuidle_play_dead(void)
> >               return -ENODEV;
> >
> >       /* Find lowest-power state that supports long-term idle */
> > -     for (i = drv->state_count - 1; i >= 0; i--)
> > -             if (drv->states[i].enter_dead)
> > -                     return drv->states[i].enter_dead(dev, i);
> > +     for (i = drv->state_count - 1; i >= 0; i--) {
> > +             if (drv->states[i].enter_dead && !drv->states[i].enter_dead(dev, i))
> > +                     return 0;
> > +     }
>
> Hmm, strictly speaking there is no 'success' return from play_dead(). On
> success, the CPU is dead :-)

Well, would you prefer something like

for (i = drv->state_count - 1; i >= 0; i--) {
        if (drv->states[i].enter_dead)
                drv->states[i].enter_dead(dev, i);
}

and adding a comment before the final return statement that
:enter_dead() only returns on failure?
Re: [PATCH v1] cpuidle: Do not return from cpuidle_play_dead() on callback failures
Posted by Peter Zijlstra 1 week ago
On Fri, Nov 15, 2024 at 01:46:29PM +0100, Rafael J. Wysocki wrote:
> On Fri, Nov 15, 2024 at 11:14 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Thu, Nov 14, 2024 at 06:46:20PM +0100, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > >
> > > If the :enter_dead() idle state callback fails for a certain state,
> > > there may be still a shallower state for which it will work.
> > >
> > > Because the only caller of cpuidle_play_dead(), native_play_dead(),
> > > falls back to hlt_play_dead() if it returns an error, it should
> > > better try all of the idle states for which :enter_dead() is present
> > > before failing, so change it accordingly.
> > >
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > ---
> > >  drivers/cpuidle/cpuidle.c |    7 ++++---
> > >  1 file changed, 4 insertions(+), 3 deletions(-)
> > >
> > > Index: linux-pm/drivers/cpuidle/cpuidle.c
> > > ===================================================================
> > > --- linux-pm.orig/drivers/cpuidle/cpuidle.c
> > > +++ linux-pm/drivers/cpuidle/cpuidle.c
> > > @@ -70,9 +70,10 @@ int cpuidle_play_dead(void)
> > >               return -ENODEV;
> > >
> > >       /* Find lowest-power state that supports long-term idle */
> > > -     for (i = drv->state_count - 1; i >= 0; i--)
> > > -             if (drv->states[i].enter_dead)
> > > -                     return drv->states[i].enter_dead(dev, i);
> > > +     for (i = drv->state_count - 1; i >= 0; i--) {
> > > +             if (drv->states[i].enter_dead && !drv->states[i].enter_dead(dev, i))
> > > +                     return 0;
> > > +     }
> >
> > Hmm, strictly speaking there is no 'success' return from play_dead(). On
> > success, the CPU is dead :-)
> 
> Well, would you prefer something like
> 
> for (i = drv->state_count - 1; i >= 0; i--) {
>         if (drv->states[i].enter_dead)
>                 drv->states[i].enter_dead(dev, i);
> }
> 
> and adding a comment before the final return statement that
> :enter_dead() only returns on failure?

Yeah, but perhaps remove the return value entirely if we're going to
ignore it anyway. And then assume that any return is a failure to die.

I mean, something like:

	if (drv->states[i].enter_dead && !drv->states[i].enter_dead(dev, i))
		panic("Dead CPU walking...");

is 'fun' but not very useful.
Re: [PATCH v1] cpuidle: Do not return from cpuidle_play_dead() on callback failures
Posted by Rafael J. Wysocki 1 week ago
On Fri, Nov 15, 2024 at 1:55 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Fri, Nov 15, 2024 at 01:46:29PM +0100, Rafael J. Wysocki wrote:
> > On Fri, Nov 15, 2024 at 11:14 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Thu, Nov 14, 2024 at 06:46:20PM +0100, Rafael J. Wysocki wrote:
> > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > >
> > > > If the :enter_dead() idle state callback fails for a certain state,
> > > > there may be still a shallower state for which it will work.
> > > >
> > > > Because the only caller of cpuidle_play_dead(), native_play_dead(),
> > > > falls back to hlt_play_dead() if it returns an error, it should
> > > > better try all of the idle states for which :enter_dead() is present
> > > > before failing, so change it accordingly.
> > > >
> > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > ---
> > > >  drivers/cpuidle/cpuidle.c |    7 ++++---
> > > >  1 file changed, 4 insertions(+), 3 deletions(-)
> > > >
> > > > Index: linux-pm/drivers/cpuidle/cpuidle.c
> > > > ===================================================================
> > > > --- linux-pm.orig/drivers/cpuidle/cpuidle.c
> > > > +++ linux-pm/drivers/cpuidle/cpuidle.c
> > > > @@ -70,9 +70,10 @@ int cpuidle_play_dead(void)
> > > >               return -ENODEV;
> > > >
> > > >       /* Find lowest-power state that supports long-term idle */
> > > > -     for (i = drv->state_count - 1; i >= 0; i--)
> > > > -             if (drv->states[i].enter_dead)
> > > > -                     return drv->states[i].enter_dead(dev, i);
> > > > +     for (i = drv->state_count - 1; i >= 0; i--) {
> > > > +             if (drv->states[i].enter_dead && !drv->states[i].enter_dead(dev, i))
> > > > +                     return 0;
> > > > +     }
> > >
> > > Hmm, strictly speaking there is no 'success' return from play_dead(). On
> > > success, the CPU is dead :-)
> >
> > Well, would you prefer something like
> >
> > for (i = drv->state_count - 1; i >= 0; i--) {
> >         if (drv->states[i].enter_dead)
> >                 drv->states[i].enter_dead(dev, i);
> > }
> >
> > and adding a comment before the final return statement that
> > :enter_dead() only returns on failure?
>
> Yeah, but perhaps remove the return value entirely if we're going to
> ignore it anyway. And then assume that any return is a failure to die.
>
> I mean, something like:
>
>         if (drv->states[i].enter_dead && !drv->states[i].enter_dead(dev, i))
>                 panic("Dead CPU walking...");
>
> is 'fun' but not very useful.

The panic would be hard to debug if it ever triggers I'm afraid and
there is the fallback to HLT in the caller.

An error message could be printed here though:

    if (drv->states[i].enter_dead && !drv->states[i].enter_dead(dev, i))
            pr_err("CPU %d: Unexpectedly undead\n", dev->cpu);
Re: [PATCH v1] cpuidle: Do not return from cpuidle_play_dead() on callback failures
Posted by Peter Zijlstra 1 week ago
On Fri, Nov 15, 2024 at 02:25:23PM +0100, Rafael J. Wysocki wrote:
> > I mean, something like:
> >
> >         if (drv->states[i].enter_dead && !drv->states[i].enter_dead(dev, i))
> >                 panic("Dead CPU walking...");
> >
> > is 'fun' but not very useful.
> 
> The panic would be hard to debug if it ever triggers I'm afraid and
> there is the fallback to HLT in the caller.

I was being facetious, removing the return value and simply calling them
all in reverse order is fine.
Re: [PATCH v1] cpuidle: Do not return from cpuidle_play_dead() on callback failures
Posted by Mario Limonciello 1 week ago
On 11/14/2024 11:46, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> If the :enter_dead() idle state callback fails for a certain state,
> there may be still a shallower state for which it will work.
> 
> Because the only caller of cpuidle_play_dead(), native_play_dead(),
> falls back to hlt_play_dead() if it returns an error, it should
> better try all of the idle states for which :enter_dead() is present
> before failing, so change it accordingly.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com> # 6.12-rc7
> ---
>   drivers/cpuidle/cpuidle.c |    7 ++++---
>   1 file changed, 4 insertions(+), 3 deletions(-)
> 
> Index: linux-pm/drivers/cpuidle/cpuidle.c
> ===================================================================
> --- linux-pm.orig/drivers/cpuidle/cpuidle.c
> +++ linux-pm/drivers/cpuidle/cpuidle.c
> @@ -70,9 +70,10 @@ int cpuidle_play_dead(void)
>   		return -ENODEV;
>   
>   	/* Find lowest-power state that supports long-term idle */
> -	for (i = drv->state_count - 1; i >= 0; i--)
> -		if (drv->states[i].enter_dead)
> -			return drv->states[i].enter_dead(dev, i);
> +	for (i = drv->state_count - 1; i >= 0; i--) {
> +		if (drv->states[i].enter_dead && !drv->states[i].enter_dead(dev, i))
> +			return 0;
> +	}
>   
>   	return -ENODEV;
>   }
> 
> 
>