[PATCH] tick/broadcast-hrtimer: Prevent the timer device on broadcast duty CPU from being disabled

Yu Liao posted 1 patch 2 years ago
kernel/time/tick-broadcast-hrtimer.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)
[PATCH] tick/broadcast-hrtimer: Prevent the timer device on broadcast duty CPU from being disabled
Posted by Yu Liao 2 years ago
It was found that running the LTP hotplug stress test on an aarch64
system could produce rcu_sched stall warnings.

The issue is the following:

CPU1 (owns the broadcast hrtimer)	CPU2

				tick_broadcast_enter()
				//shut down local timer device
				...
				tick_broadcast_exit()
				//exits with tick_broadcast_force_mask set,
				timer device remains disabled

				initiates offlining of CPU1
take_cpu_down()
//CPU1 shuts down and does
not send broadcast IPI anymore
				takedown_cpu()
				  hotplug_cpu__broadcast_tick_pull()
				  //move broadcast hrtimer to this CPU
				    clockevents_program_event()
				      bc_set_next()
					hrtimer_start()
					//does not call hrtimer_reprogram()
					to program timer device if expires
					equals dev->next_event, so the timer
					device remains disabled.

CPU2 takes over the broadcast duty but its local timer device is disabled,
causing many CPUs to become stuck.

Fix this by calling tick_program_event() to reprogram the local timer
device in this scenario.

Signed-off-by: Yu Liao <liaoyu15@huawei.com>
---
 kernel/time/tick-broadcast-hrtimer.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/kernel/time/tick-broadcast-hrtimer.c b/kernel/time/tick-broadcast-hrtimer.c
index e28f9210f8a1..6a4a612581fb 100644
--- a/kernel/time/tick-broadcast-hrtimer.c
+++ b/kernel/time/tick-broadcast-hrtimer.c
@@ -42,10 +42,22 @@ static int bc_shutdown(struct clock_event_device *evt)
  */
 static int bc_set_next(ktime_t expires, struct clock_event_device *bc)
 {
+	ktime_t next_event = this_cpu_ptr(&tick_cpu_device)->evtdev->next_event;
+
 	/*
-	 * This is called either from enter/exit idle code or from the
-	 * broadcast handler. In all cases tick_broadcast_lock is held.
-	 *
+	 * This can be called from CPU offline operation to move broadcast
+	 * assignment. If tick_broadcast_force_mask is set, the CPU local
+	 * timer device may be disabled. And hrtimer_reprogram() will not be
+	 * called if the timer is not the first expiring timer. Reprogram
+	 * the cpu local timer device to ensure we can take over the
+	 * broadcast duty.
+	 */
+	if (tick_check_broadcast_expired() && expires >= next_event)
+		tick_program_event(next_event, 1);
+
+	/*
+	 * This is called from enter/exit idle code, broadcast handler or
+	 * CPU offline operation. In all cases tick_broadcast_lock is held.
 	 * hrtimer_cancel() cannot be called here neither from the
 	 * broadcast handler nor from the enter/exit idle code. The idle
 	 * code can run into the problem described in bc_shutdown() and the
-- 
2.33.0
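
The failure sequence described in the commit message can be approximated with a small userspace model. This is only a sketch, not kernel code: `struct model_dev` and every `model_*` function are hypothetical stand-ins, and the reprogramming rule is simplified to "touch the hardware only if the new expiry is earlier than the recorded next event". It shows why the unpatched path leaves the device disabled when `expires` equals `next_event`, and how the proposed condition re-arms it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a CPU-local clock event device. */
struct model_dev {
	int64_t next_event;	/* software view of the next expiry, in ns */
	bool hw_armed;		/* whether the hardware would actually fire */
};

/* Models tick_broadcast_enter(): the local timer device is shut down. */
static void model_broadcast_enter(struct model_dev *dev)
{
	assert(dev);
	dev->hw_armed = false;
}

/* Models tick_program_event(): always programs the hardware. */
static void model_tick_program_event(struct model_dev *dev, int64_t expires)
{
	dev->next_event = expires;
	dev->hw_armed = true;
}

/*
 * Models hrtimer_start() (simplified assumption): the hardware is only
 * reprogrammed when the new expiry is earlier than the recorded one.
 */
static void model_hrtimer_start(struct model_dev *dev, int64_t expires)
{
	if (expires >= dev->next_event)
		return;		/* no reprogramming, hardware state untouched */
	model_tick_program_event(dev, expires);
}

/* Models bc_set_next() with and without the check proposed in the patch. */
static void model_bc_set_next(struct model_dev *dev, int64_t expires,
			      bool broadcast_expired, bool patched)
{
	if (patched && broadcast_expired && expires >= dev->next_event)
		model_tick_program_event(dev, dev->next_event);

	model_hrtimer_start(dev, expires);
}
```

In this model, a device that was shut down by `model_broadcast_enter()` stays disarmed after an unpatched `model_bc_set_next()` call whose expiry equals `next_event`, while the patched variant arms it again.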
Re: [PATCH] tick/broadcast-hrtimer: Prevent the timer device on broadcast duty CPU from being disabled
Posted by Thomas Gleixner 1 year, 10 months ago
On Mon, Dec 18 2023 at 10:58, Yu Liao wrote:
>  static int bc_set_next(ktime_t expires, struct clock_event_device *bc)
>  {
> +	ktime_t next_event = this_cpu_ptr(&tick_cpu_device)->evtdev->next_event;
> +
>  	/*
> -	 * This is called either from enter/exit idle code or from the
> -	 * broadcast handler. In all cases tick_broadcast_lock is held.
> -	 *
> +	 * This can be called from CPU offline operation to move broadcast
> +	 * assignment. If tick_broadcast_force_mask is set, the CPU local
> +	 * timer device may be disabled. And hrtimer_reprogram() will not be
> +	 * called if the timer is not the first expiring timer. Reprogram
> +	 * the cpu local timer device to ensure we can take over the
> +	 * broadcast duty.
> +	 */
> +	if (tick_check_broadcast_expired() && expires >= next_event)
> +		tick_program_event(next_event, 1);

I'm not really enthused about another conditional here and that
condition is more than obscure.

The problem is that the local clockevent might be shut down, right?

So checking for that state is the right thing to do and the proper place
is in hotplug_cpu__broadcast_tick_pull(), no?

Thanks,

        tglx
Re: [PATCH] tick/broadcast-hrtimer: Prevent the timer device on broadcast duty CPU from being disabled
Posted by Yu Liao 1 year, 5 months ago
Hi Thomas,

Sorry it took so long to reply.

On 2024/1/25 3:18, Thomas Gleixner wrote:
>> +	if (tick_check_broadcast_expired() && expires >= next_event)
>> +		tick_program_event(next_event, 1);
> 
> I'm not really enthused about another conditional here and that
> condition is more than obscure.
> 
> The problem is that the local clockevent might be shut down, right?
> 
> So checking for that state is the right thing to do and the proper place
> is in hotplug_cpu__broadcast_tick_pull(), no?

We can't check the clockevent state. When exiting broadcast mode the state is
switched to oneshot, but the clockevent is still shut down: some devices
(e.g. ARM arch timers) do not implement a set_state_oneshot handler, so the
switch operation only changes the state value.

Yeah, hotplug_cpu__broadcast_tick_pull() is the proper place to do the check.
Thank you for your advice. I have modified the patch and sent you v2.

Best regards,
Yu
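
The point about the state value can be illustrated with a userspace sketch. It is a model under stated assumptions, not the kernel implementation: `struct model_clockevent` and the `model_*` helpers are hypothetical names, and the hardware is reduced to a single `hw_running` flag. The model shows a device with no `set_state_oneshot` handler reporting the oneshot state while the hardware is still stopped, so a check on the state alone would not detect the problem.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_state { MODEL_STATE_SHUTDOWN, MODEL_STATE_ONESHOT };

/* Hypothetical stand-in for a clock event device. */
struct model_clockevent {
	enum model_state state;		/* software-visible state value */
	bool hw_running;		/* whether the hardware can fire */
	/* Optional handler; NULL models a driver that does not provide one. */
	void (*set_state_oneshot)(struct model_clockevent *dev);
};

/* Models a state switch: without a handler only the state value changes. */
static void model_switch_state(struct model_clockevent *dev,
			       enum model_state state)
{
	assert(dev);
	if (state == MODEL_STATE_SHUTDOWN)
		dev->hw_running = false;
	else if (dev->set_state_oneshot)
		dev->set_state_oneshot(dev);
	dev->state = state;
}

/* Models programming the next event, which is what starts the hardware. */
static void model_program_event(struct model_clockevent *dev)
{
	dev->hw_running = true;
}

/* A check that looks only at the state value, as discussed above. */
static bool model_state_check_sees_shutdown(const struct model_clockevent *dev)
{
	return dev->state == MODEL_STATE_SHUTDOWN;
}
```

After a shutdown followed by a switch back to oneshot, `model_state_check_sees_shutdown()` returns false although `hw_running` is still false; only `model_program_event()` brings the modelled hardware back.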
Re: [PATCH] tick/broadcast-hrtimer: Prevent the timer device on broadcast duty CPU from being disabled
Posted by Yu Liao 1 year, 11 months ago
Hi Thomas,

Kindly ping..

On 2023/12/18 10:58, Yu Liao wrote:
> It was found that running the LTP hotplug stress test on an aarch64
> system could produce rcu_sched stall warnings.
> 
> The issue is the following:
> 
> CPU1 (owns the broadcast hrtimer)	CPU2
> 
> 				tick_broadcast_enter()
> 				//shut down local timer device
> 				...
> 				tick_broadcast_exit()
> 				//exits with tick_broadcast_force_mask set,
> 				timer device remains disabled
> 
> 				initiates offlining of CPU1
> take_cpu_down()
> //CPU1 shuts down and does
> not send broadcast IPI anymore
> 				takedown_cpu()
> 				  hotplug_cpu__broadcast_tick_pull()
> 				  //move broadcast hrtimer to this CPU
> 				    clockevents_program_event()
> 				      bc_set_next()
> 					hrtimer_start()
> 					//does not call hrtimer_reprogram()
> 					to program timer device if expires
> 					equals dev->next_event, so the timer
> 					device remains disabled.
> 
> CPU2 takes over the broadcast duty but local timer device is disabled,
> causing many CPUs to become stuck.
> 
> Fix this by calling tick_program_event() to reprogram the local timer
> device in this scenario.
> 
> Signed-off-by: Yu Liao <liaoyu15@huawei.com>
> ---
>  kernel/time/tick-broadcast-hrtimer.c | 18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/kernel/time/tick-broadcast-hrtimer.c b/kernel/time/tick-broadcast-hrtimer.c
> index e28f9210f8a1..6a4a612581fb 100644
> --- a/kernel/time/tick-broadcast-hrtimer.c
> +++ b/kernel/time/tick-broadcast-hrtimer.c
> @@ -42,10 +42,22 @@ static int bc_shutdown(struct clock_event_device *evt)
>   */
>  static int bc_set_next(ktime_t expires, struct clock_event_device *bc)
>  {
> +	ktime_t next_event = this_cpu_ptr(&tick_cpu_device)->evtdev->next_event;
> +
>  	/*
> -	 * This is called either from enter/exit idle code or from the
> -	 * broadcast handler. In all cases tick_broadcast_lock is held.
> -	 *
> +	 * This can be called from CPU offline operation to move broadcast
> +	 * assignment. If tick_broadcast_force_mask is set, the CPU local
> +	 * timer device may be disabled. And hrtimer_reprogram() will not be
> +	 * called if the timer is not the first expiring timer. Reprogram
> +	 * the cpu local timer device to ensure we can take over the
> +	 * broadcast duty.
> +	 */
> +	if (tick_check_broadcast_expired() && expires >= next_event)
> +		tick_program_event(next_event, 1);
> +
> +	/*
> +	 * This is called from enter/exit idle code, broadcast handler or
> +	 * CPU offline operation. In all cases tick_broadcast_lock is held.
>  	 * hrtimer_cancel() cannot be called here neither from the
>  	 * broadcast handler nor from the enter/exit idle code. The idle
>  	 * code can run into the problem described in bc_shutdown() and the
Re: [PATCH] tick/broadcast-hrtimer: Prevent the timer device on broadcast duty CPU from being disabled
Posted by Thomas Gleixner 1 year, 11 months ago
On Fri, Jan 12 2024 at 15:40, Yu Liao wrote:
> Hi Thomas,
>
> Kindly ping..

It's in my backlog ...