[PATCH v2 1/2] tick: Remove unreasonable detached state set in tick_shutdown()

Bibo Mao posted 2 patches 4 weeks ago
There is a newer version of this series
Posted by Bibo Mao 4 weeks ago
clockevents_switch_state() checks whether the device is already in the
requested state and does nothing if it is.

tick_shutdown() first sets the detached state and then calls
clockevents_exchange_device(), which in turn calls
clockevents_switch_state(). Because the device is already marked as
detached, clockevents_switch_state() does nothing, so the tick timer
device is not shut down when the CPU goes offline. In a guest VM, the
timer interrupt then prevents the vCPU from sleeping after the vCPU has
been hot removed.

Remove the state assignment before the call to
clockevents_exchange_device(). The state is set by
clockevents_switch_state() once the switch succeeds.
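
The failure mode described above can be reproduced with a small user-space
mock. This is only a sketch: the mock_* names and the simplified state enum
are assumptions made for illustration and are not the kernel's definitions.
Only the early-return check and the call order follow the description in
this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
enum mock_state { MOCK_DETACHED, MOCK_SHUTDOWN, MOCK_PERIODIC };

struct mock_clock_event_device {
	enum mock_state state;
	int hw_shutdown_calls;	/* how often the "hardware" was stopped */
};

/* Models the early return: nothing happens if the state already matches. */
static void mock_switch_state(struct mock_clock_event_device *dev,
			      enum mock_state state)
{
	if (dev->state == state)
		return;
	if (state == MOCK_DETACHED || state == MOCK_SHUTDOWN)
		dev->hw_shutdown_calls++;	/* stands in for the driver callback */
	dev->state = state;
}

/*
 * Models tick_shutdown(). preset_detached == true mimics the code before
 * this patch, false mimics the patched code. Returns true if the mock
 * hardware was actually shut down.
 */
static bool mock_tick_shutdown(bool preset_detached)
{
	struct mock_clock_event_device dev = { .state = MOCK_PERIODIC };

	if (preset_detached)
		dev.state = MOCK_DETACHED;	/* the removed state assignment */
	mock_switch_state(&dev, MOCK_DETACHED);	/* as done on device exchange */
	return dev.hw_shutdown_calls > 0;
}
```

Calling mock_tick_shutdown(true) leaves the mock hardware running, while
mock_tick_shutdown(false) stops it, which mirrors the behaviour before and
after the patch.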

Fixes: bf9a001fb8e4 ("clocksource/drivers/timer-tegra: Remove clockevents shutdown call on offlining")
Fixes: cd165ce8314f ("clocksource/drivers/qcom: Remove clockevents shutdown call on offlining")
Fixes: 30f8c70a85bc ("clocksource/drivers/armada-370-xp: Remove clockevents shutdown call on offlining")
Fixes: ba23b6c7f974 ("clocksource/drivers/exynos_mct: Remove clockevents shutdown call on offlining")
Fixes: 15b810e0496e ("clocksource/drivers/arm_global_timer: Remove clockevents shutdown call on offlining")
Fixes: 78b5c2ca5f27 ("clocksource/drivers/arm_arch_timer: Remove clockevents shutdown call on offlining")
Fixes: 900053d9eedf ("ARM: smp_twd: Remove clockevents shutdown call on offlining")

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
---
 kernel/time/tick-common.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index 9a3859443c04..eb9b777f5492 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -424,11 +424,6 @@ void tick_shutdown(unsigned int cpu)
 
 	td->mode = TICKDEV_MODE_PERIODIC;
 	if (dev) {
-		/*
-		 * Prevent that the clock events layer tries to call
-		 * the set mode function!
-		 */
-		clockevent_set_state(dev, CLOCK_EVT_STATE_DETACHED);
 		clockevents_exchange_device(dev, NULL);
 		dev->event_handler = clockevents_handle_noop;
 		td->evtdev = NULL;
-- 
2.39.3
Re: [PATCH v2 1/2] tick: Remove unreasonable detached state set in tick_shutdown()
Posted by Thomas Gleixner 4 weeks ago
On Thu, Sep 04 2025 at 15:17, Bibo Mao wrote:
> Function clockevents_switch_state() will check whether it has already
> switched to specified state, do nothing if it has.
>
> In function tick_shutdown(), it will set detached state at first and
> call clockevents_switch_state() in clockevents_exchange_device(). The
> function clockevents_switch_state() will do nothing since it is already
> detached state. So the tick timer device will not be shutdown when CPU
> is offline. In guest VM system, timer interrupt will prevent vCPU to
> sleep if vCPU is hot removed.
>
> Here remove state set before calling clockevents_exchange_device(),
> its state will be set in function clockevents_switch_state() if it
> succeeds to do so.

This explanation is incomplete. tick_shutdown() did this because it was
originally invoked on a live CPU and not on the outgoing CPU.

That got changed in

  3b1596a21fbf ("clockevents: Shutdown and unregister current clockevents at CPUHP_AP_TICK_DYING")

which is the actual root cause.

The pile of 'Fixes:' below is just enumerating the subsequent problems.

> Fixes: bf9a001fb8e4 ("clocksource/drivers/timer-tegra: Remove clockevents shutdown call on offlining")
> Fixes: cd165ce8314f ("clocksource/drivers/qcom: Remove clockevents shutdown call on offlining")
> Fixes: 30f8c70a85bc ("clocksource/drivers/armada-370-xp: Remove clockevents shutdown call on offlining")
> Fixes: ba23b6c7f974 ("clocksource/drivers/exynos_mct: Remove clockevents shutdown call on offlining")
> Fixes: 15b810e0496e ("clocksource/drivers/arm_global_timer: Remove clockevents shutdown call on offlining")
> Fixes: 78b5c2ca5f27 ("clocksource/drivers/arm_arch_timer: Remove clockevents shutdown call on offlining")
> Fixes: 900053d9eedf ("ARM: smp_twd: Remove clockevents shutdown call on offlining")
>
> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
> Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
> ---
>  kernel/time/tick-common.c | 5 -----
>  1 file changed, 5 deletions(-)
>
> diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
> index 9a3859443c04..eb9b777f5492 100644
> --- a/kernel/time/tick-common.c
> +++ b/kernel/time/tick-common.c
> @@ -424,11 +424,6 @@ void tick_shutdown(unsigned int cpu)
>  
>  	td->mode = TICKDEV_MODE_PERIODIC;
>  	if (dev) {
> -		/*
> -		 * Prevent that the clock events layer tries to call
> -		 * the set mode function!
> -		 */
> -		clockevent_set_state(dev, CLOCK_EVT_STATE_DETACHED);
>  		clockevents_exchange_device(dev, NULL);
>  		dev->event_handler = clockevents_handle_noop;
>  		td->evtdev = NULL;

Can this pretty please clean up the misleading comment above
tick_shutdown() as well?

 * Shutdown an event device on a given cpu:
 *
 * This is called on a life CPU, when a CPU is dead. So we cannot
 * access the hardware device itself.
 * We just set the mode and remove it from the lists.

That should have been removed or updated with 3b1596a21fbf too, no?

With that the cpu argument is no longer useful either, because this is
now guaranteed to be invoked on the outgoing CPU, no?

Thanks,

        tglx
Re: [PATCH v2 1/2] tick: Remove unreasonable detached state set in tick_shutdown()
Posted by Bibo Mao 4 weeks ago

On 2025/9/4 11:57 PM, Thomas Gleixner wrote:
> On Thu, Sep 04 2025 at 15:17, Bibo Mao wrote:
>> Function clockevents_switch_state() will check whether it has already
>> switched to specified state, do nothing if it has.
>>
>> In function tick_shutdown(), it will set detached state at first and
>> call clockevents_switch_state() in clockevents_exchange_device(). The
>> function clockevents_switch_state() will do nothing since it is already
>> detached state. So the tick timer device will not be shutdown when CPU
>> is offline. In guest VM system, timer interrupt will prevent vCPU to
>> sleep if vCPU is hot removed.
>>
>> Here remove state set before calling clockevents_exchange_device(),
>> its state will be set in function clockevents_switch_state() if it
>> succeeds to do so.
> 
> This explanation is incomplete. tick_shutdown() did this because it was
> originally invoked on a live CPU and not on the outgoing CPU.
> 
> That got changed in
> 
>    3b1596a21fbf ("clockevents: Shutdown and unregister current clockevents at CPUHP_AP_TICK_DYING")
> 
> which is the actual root cause.
> 
> The pile of 'Fixes:' below is just enumerating the subsequent problems.
> 
>> Fixes: bf9a001fb8e4 ("clocksource/drivers/timer-tegra: Remove clockevents shutdown call on offlining")
>> Fixes: cd165ce8314f ("clocksource/drivers/qcom: Remove clockevents shutdown call on offlining")
>> Fixes: 30f8c70a85bc ("clocksource/drivers/armada-370-xp: Remove clockevents shutdown call on offlining")
>> Fixes: ba23b6c7f974 ("clocksource/drivers/exynos_mct: Remove clockevents shutdown call on offlining")
>> Fixes: 15b810e0496e ("clocksource/drivers/arm_global_timer: Remove clockevents shutdown call on offlining")
>> Fixes: 78b5c2ca5f27 ("clocksource/drivers/arm_arch_timer: Remove clockevents shutdown call on offlining")
>> Fixes: 900053d9eedf ("ARM: smp_twd: Remove clockevents shutdown call on offlining")
>>
>> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
>> Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
>> ---
>>   kernel/time/tick-common.c | 5 -----
>>   1 file changed, 5 deletions(-)
>>
>> diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
>> index 9a3859443c04..eb9b777f5492 100644
>> --- a/kernel/time/tick-common.c
>> +++ b/kernel/time/tick-common.c
>> @@ -424,11 +424,6 @@ void tick_shutdown(unsigned int cpu)
>>   
>>   	td->mode = TICKDEV_MODE_PERIODIC;
>>   	if (dev) {
>> -		/*
>> -		 * Prevent that the clock events layer tries to call
>> -		 * the set mode function!
>> -		 */
>> -		clockevent_set_state(dev, CLOCK_EVT_STATE_DETACHED);
>>   		clockevents_exchange_device(dev, NULL);
>>   		dev->event_handler = clockevents_handle_noop;
>>   		td->evtdev = NULL;
> 
> Can this pretty please clean up the misleading comment above
> tick_shutdown() as well?
> 
>   * Shutdown an event device on a given cpu:
>   *
>   * This is called on a life CPU, when a CPU is dead. So we cannot
>   * access the hardware device itself.
>   * We just set the mode and remove it from the lists.
> 
> That should have been removed or updated with 3b1596a21fbf too, no?
> 
> With that the cpu argument is no longer useful either, because this is
> now guaranteed to be invoked on the outgoing CPU, no?

It is not easy with my poor English to spell out the comments :(
How about the patch like this?

clockevents_switch_state() checks whether the device is already in the
requested state and does nothing if it is.

tick_shutdown() first sets the detached state and then calls
clockevents_exchange_device(), which in turn calls
clockevents_switch_state(). Because the device is already marked as
detached, clockevents_switch_state() does nothing, so the tick timer
device is not shut down when the CPU goes offline.

tick_shutdown() did this because it was originally invoked on a live
CPU and not on the outgoing CPU. Now that the function is called on the
outgoing CPU, the hardware device can be accessed.

Remove the state assignment before the call to
clockevents_exchange_device(). The state is set by
clockevents_switch_state() once the switch succeeds.

Fixes: 3b1596a21fbf ("clockevents: Shutdown and unregister current clockevents at CPUHP_AP_TICK_DYING")


 /*
- * Shutdown an event device on a given cpu:
+ * Shutdown an event device on the outgoing CPU:
  *
- * This is called on a life CPU, when a CPU is dead. So we cannot
- * access the hardware device itself.
- * We just set the mode and remove it from the lists.
+ * Called by the dying CPU during teardown, with clockevents_lock held
+ * and interrupts disabled.
  */
-void tick_shutdown(unsigned int cpu)
+void tick_shutdown(void)
 {
-	struct tick_device *td = &per_cpu(tick_cpu_device, cpu);
+	struct tick_device *td = this_cpu_ptr(&tick_cpu_device);
 	struct clock_event_device *dev = td->evtdev;

 	td->mode = TICKDEV_MODE_PERIODIC;
 	if (dev) {
-		/*
-		 * Prevent that the clock events layer tries to call
-		 * the set mode function!
-		 */
-		clockevent_set_state(dev, CLOCK_EVT_STATE_DETACHED);
 		clockevents_exchange_device(dev, NULL);
 		dev->event_handler = clockevents_handle_noop;
 		td->evtdev = NULL;

Regards
Bibo Mao
> 
> Thanks,
> 
>          tglx
> 

Re: [PATCH v2 1/2] tick: Remove unreasonable detached state set in tick_shutdown()
Posted by Thomas Gleixner 3 weeks, 6 days ago
On Fri, Sep 05 2025 at 10:04, Bibo Mao wrote:
> On 2025/9/4 11:57 PM, Thomas Gleixner wrote:
>
> It is not easy with my poor English to spell out the comments :(
> How about the patch like this?

Looks about right.
Re: [PATCH v2 1/2] tick: Remove unreasonable detached state set in tick_shutdown()
Posted by Frederic Weisbecker 4 weeks ago
On Thu, Sep 04, 2025 at 05:57:30PM +0200, Thomas Gleixner wrote:
> On Thu, Sep 04 2025 at 15:17, Bibo Mao wrote:
> > Function clockevents_switch_state() will check whether it has already
> > switched to specified state, do nothing if it has.
> >
> > In function tick_shutdown(), it will set detached state at first and
> > call clockevents_switch_state() in clockevents_exchange_device(). The
> > function clockevents_switch_state() will do nothing since it is already
> > detached state. So the tick timer device will not be shutdown when CPU
> > is offline. In guest VM system, timer interrupt will prevent vCPU to
> > sleep if vCPU is hot removed.
> >
> > Here remove state set before calling clockevents_exchange_device(),
> > its state will be set in function clockevents_switch_state() if it
> > succeeds to do so.
> 
> This explanation is incomplete. tick_shutdown() did this because it was
> originally invoked on a live CPU and not on the outgoing CPU.

Ok I didn't know that.

> 
> That got changed in
> 
>   3b1596a21fbf ("clockevents: Shutdown and unregister current clockevents at CPUHP_AP_TICK_DYING")
> 
> which is the actual root cause.
> 
> The pile of 'Fixes:' below is just enumerating the subsequent problems.
> 
> > Fixes: bf9a001fb8e4 ("clocksource/drivers/timer-tegra: Remove clockevents shutdown call on offlining")
> > Fixes: cd165ce8314f ("clocksource/drivers/qcom: Remove clockevents shutdown call on offlining")
> > Fixes: 30f8c70a85bc ("clocksource/drivers/armada-370-xp: Remove clockevents shutdown call on offlining")
> > Fixes: ba23b6c7f974 ("clocksource/drivers/exynos_mct: Remove clockevents shutdown call on offlining")
> > Fixes: 15b810e0496e ("clocksource/drivers/arm_global_timer: Remove clockevents shutdown call on offlining")
> > Fixes: 78b5c2ca5f27 ("clocksource/drivers/arm_arch_timer: Remove clockevents shutdown call on offlining")
> > Fixes: 900053d9eedf ("ARM: smp_twd: Remove clockevents shutdown call on offlining")

Makes sense.

> >
> > Signed-off-by: Bibo Mao <maobibo@loongson.cn>
> > Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
> > ---
> >  kernel/time/tick-common.c | 5 -----
> >  1 file changed, 5 deletions(-)
> >
> > diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
> > index 9a3859443c04..eb9b777f5492 100644
> > --- a/kernel/time/tick-common.c
> > +++ b/kernel/time/tick-common.c
> > @@ -424,11 +424,6 @@ void tick_shutdown(unsigned int cpu)
> >  
> >  	td->mode = TICKDEV_MODE_PERIODIC;
> >  	if (dev) {
> > -		/*
> > -		 * Prevent that the clock events layer tries to call
> > -		 * the set mode function!
> > -		 */
> > -		clockevent_set_state(dev, CLOCK_EVT_STATE_DETACHED);
> >  		clockevents_exchange_device(dev, NULL);
> >  		dev->event_handler = clockevents_handle_noop;
> >  		td->evtdev = NULL;
> 
> Can this pretty please clean up the misleading comment above
> tick_shutdown() as well?
> 
>  * Shutdown an event device on a given cpu:
>  *
>  * This is called on a life CPU, when a CPU is dead. So we cannot
>  * access the hardware device itself.
>  * We just set the mode and remove it from the lists.
> 
> That should have been removed or updated with 3b1596a21fbf too, no?

Right, missed that.

> 
> With that the cpu argument is no longer useful either, because this is
> now guaranteed to be invoked on the outgoing CPU, no?

Right.

Thanks!

> 
> Thanks,
> 
>         tglx
> 

-- 
Frederic Weisbecker
SUSE Labs
Re: [PATCH v2 1/2] tick: Remove unreasonable detached state set in tick_shutdown()
Posted by Frederic Weisbecker 4 weeks ago
On Thu, Sep 04, 2025 at 03:17:31PM +0800, Bibo Mao wrote:
> Function clockevents_switch_state() will check whether it has already
> switched to specified state, do nothing if it has.
> 
> In function tick_shutdown(), it will set detached state at first and
> call clockevents_switch_state() in clockevents_exchange_device(). The
> function clockevents_switch_state() will do nothing since it is already
> detached state. So the tick timer device will not be shutdown when CPU
> is offline. In guest VM system, timer interrupt will prevent vCPU to
> sleep if vCPU is hot removed.
> 
> Here remove state set before calling clockevents_exchange_device(),
> its state will be set in function clockevents_switch_state() if it
> succeeds to do so.
> 
> Fixes: bf9a001fb8e4 ("clocksource/drivers/timer-tegra: Remove clockevents shutdown call on offlining")
> Fixes: cd165ce8314f ("clocksource/drivers/qcom: Remove clockevents shutdown call on offlining")
> Fixes: 30f8c70a85bc ("clocksource/drivers/armada-370-xp: Remove clockevents shutdown call on offlining")
> Fixes: ba23b6c7f974 ("clocksource/drivers/exynos_mct: Remove clockevents shutdown call on offlining")
> Fixes: 15b810e0496e ("clocksource/drivers/arm_global_timer: Remove clockevents shutdown call on offlining")
> Fixes: 78b5c2ca5f27 ("clocksource/drivers/arm_arch_timer: Remove clockevents shutdown call on offlining")
> Fixes: 900053d9eedf ("ARM: smp_twd: Remove clockevents shutdown call on offlining")
> 
> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
> Reviewed-by: Frederic Weisbecker <frederic@kernel.org>

Thanks a lot!

-- 
Frederic Weisbecker
SUSE Labs