Date: Fri, 10 Apr 2026 20:54:48 -0000
From: "tip-bot2 for Thomas Gleixner"
Sender: tip-bot2@linutronix.de
Reply-to: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Subject: [tip: timers/urgent] clockevents: Prevent timer interrupt starvation
Cc: Calvin Owens, Thomas Gleixner, Borislav Petkov, x86@kernel.org,
 linux-kernel@vger.kernel.org
In-Reply-To: <20260407083247.562657657@kernel.org>
References: <20260407083247.562657657@kernel.org>
Message-ID: <177585448816.801717.4161239325203144549.tip-bot2@tip-bot2>

The
following commit has been merged into the timers/urgent branch of tip:

Commit-ID:     d6e152d905bdb1f32f9d99775e2f453350399a6a
Gitweb:        https://git.kernel.org/tip/d6e152d905bdb1f32f9d99775e2f453350399a6a
Author:        Thomas Gleixner
AuthorDate:    Tue, 07 Apr 2026 10:54:17 +02:00
Committer:     Thomas Gleixner
CommitterDate: Fri, 10 Apr 2026 22:45:38 +02:00

clockevents: Prevent timer interrupt starvation

Calvin reported an odd NMI watchdog lockup which claims that the CPU locked
up in user space. He provided a reproducer, which sets up a timerfd based
timer and then rearms it in a loop with an absolute expiry time of 1ns.

As the expiry time is in the past, the timer ends up as the first expiring
timer in the per CPU hrtimer base and the clockevent device is programmed
with the minimum delta value. If the machine is fast enough, this ends up
in an endless loop of programming the delta value to the minimum value
defined by the clock event device, before the timer interrupt can fire,
which starves the interrupt and consequently triggers the lockup detector
because the hrtimer callback of the lockup mechanism is never invoked.

As a first step to prevent this, avoid reprogramming the clock event device
when:

     - a forced minimum delta event is pending
     - the new expiry delta is less than or equal to the minimum delta

Thanks to Calvin for providing the reproducer and to Borislav for testing
and providing data from his Zen5 machine.

The problem is not limited to Zen5, but depending on the underlying
clock event device (e.g. TSC deadline timer on Intel) and the CPU speed
it is not necessarily observable.

This change serves only as the last resort and further changes will be made
to prevent this scenario earlier in the call chain as far as possible.

[ tglx: Updated to restore the old behaviour vs.
  !force and delta <= 0 and fixed up the tick-broadcast handlers as
  pointed out by Borislav ]

Fixes: d316c57ff6bf ("[PATCH] clockevents: add core functionality")
Reported-by: Calvin Owens
Signed-off-by: Thomas Gleixner
Tested-by: Calvin Owens
Tested-by: Borislav Petkov
Link: https://lore.kernel.org/lkml/acMe-QZUel-bBYUh@mozart.vkv.me/
Link: https://patch.msgid.link/20260407083247.562657657@kernel.org
---
 include/linux/clockchips.h   |  2 ++
 kernel/time/clockevents.c    | 27 +++++++++++++++++++--------
 kernel/time/hrtimer.c        |  1 +
 kernel/time/tick-broadcast.c |  8 +++++++-
 kernel/time/tick-common.c    |  1 +
 kernel/time/tick-sched.c     |  1 +
 6 files changed, 31 insertions(+), 9 deletions(-)

diff --git a/include/linux/clockchips.h b/include/linux/clockchips.h
index b0df28d..50cdc9d 100644
--- a/include/linux/clockchips.h
+++ b/include/linux/clockchips.h
@@ -80,6 +80,7 @@ enum clock_event_state {
  * @shift:		nanoseconds to cycles divisor (power of two)
  * @state_use_accessors:current state of the device, assigned by the core code
  * @features:		features
+ * @next_event_forced:	True if the last programming was a forced event
  * @retries:		number of forced programming retries
  * @set_state_periodic:	switch state to periodic
  * @set_state_oneshot:	switch state to oneshot
@@ -108,6 +109,7 @@ struct clock_event_device {
 	u32			shift;
 	enum clock_event_state	state_use_accessors;
 	unsigned int		features;
+	unsigned int		next_event_forced;
 	unsigned long		retries;
 
 	int			(*set_state_periodic)(struct clock_event_device *);
diff --git a/kernel/time/clockevents.c b/kernel/time/clockevents.c
index eaae1ce..3857099 100644
--- a/kernel/time/clockevents.c
+++ b/kernel/time/clockevents.c
@@ -172,6 +172,7 @@ void clockevents_shutdown(struct clock_event_device *dev)
 {
 	clockevents_switch_state(dev, CLOCK_EVT_STATE_SHUTDOWN);
 	dev->next_event = KTIME_MAX;
+	dev->next_event_forced = 0;
 }
 
 /**
@@ -305,7 +306,6 @@ int clockevents_program_event(struct clock_event_device *dev, ktime_t expires,
 {
 	unsigned long long clc;
 	int64_t delta;
-	int rc;
 
 	if (WARN_ON_ONCE(expires < 0))
 		return -ETIME;
@@ -324,16 +324,27 @@ int clockevents_program_event(struct clock_event_device *dev, ktime_t expires,
 		return dev->set_next_ktime(expires, dev);
 
 	delta = ktime_to_ns(ktime_sub(expires, ktime_get()));
-	if (delta <= 0)
-		return force ? clockevents_program_min_delta(dev) : -ETIME;
 
-	delta = min(delta, (int64_t) dev->max_delta_ns);
-	delta = max(delta, (int64_t) dev->min_delta_ns);
+	/* Required for tick_periodic() during early boot */
+	if (delta <= 0 && !force)
+		return -ETIME;
+
+	if (delta > (int64_t)dev->min_delta_ns) {
+		delta = min(delta, (int64_t) dev->max_delta_ns);
+		clc = ((unsigned long long) delta * dev->mult) >> dev->shift;
+		if (!dev->set_next_event((unsigned long) clc, dev))
+			return 0;
+	}
 
-	clc = ((unsigned long long) delta * dev->mult) >> dev->shift;
-	rc = dev->set_next_event((unsigned long) clc, dev);
+	if (dev->next_event_forced)
+		return 0;
 
-	return (rc && force) ? clockevents_program_min_delta(dev) : rc;
+	if (dev->set_next_event(dev->min_delta_ticks, dev)) {
+		if (!force || clockevents_program_min_delta(dev))
+			return -ETIME;
+	}
+	dev->next_event_forced = 1;
+	return 0;
 }
 
 /*
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index 860af7a..1e37142 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -1888,6 +1888,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
 	BUG_ON(!cpu_base->hres_active);
 	cpu_base->nr_events++;
 	dev->next_event = KTIME_MAX;
+	dev->next_event_forced = 0;
 
 	raw_spin_lock_irqsave(&cpu_base->lock, flags);
 	entry_time = now = hrtimer_update_base(cpu_base);
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index f63c658..7e57fa3 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -76,8 +76,10 @@ const struct clock_event_device *tick_get_wakeup_device(int cpu)
  */
 static void tick_broadcast_start_periodic(struct clock_event_device *bc)
 {
-	if (bc)
+	if (bc) {
+		bc->next_event_forced = 0;
 		tick_setup_periodic(bc, 1);
+	}
 }
 
 /*
@@ -403,6 +405,7 @@ static void tick_handle_periodic_broadcast(struct clock_event_device *dev)
 	bool bc_local;
 
 	raw_spin_lock(&tick_broadcast_lock);
+	tick_broadcast_device.evtdev->next_event_forced = 0;
 
 	/* Handle spurious interrupts gracefully */
 	if (clockevent_state_shutdown(tick_broadcast_device.evtdev)) {
@@ -696,6 +699,7 @@ static void tick_handle_oneshot_broadcast(struct clock_event_device *dev)
 
 	raw_spin_lock(&tick_broadcast_lock);
 	dev->next_event = KTIME_MAX;
+	tick_broadcast_device.evtdev->next_event_forced = 0;
 	next_event = KTIME_MAX;
 	cpumask_clear(tmpmask);
 	now = ktime_get();
@@ -1063,6 +1067,7 @@ static void tick_broadcast_setup_oneshot(struct clock_event_device *bc,
 
 
 	bc->event_handler = tick_handle_oneshot_broadcast;
+	bc->next_event_forced = 0;
 	bc->next_event = KTIME_MAX;
 
 	/*
@@ -1175,6 +1180,7 @@ void hotplug_cpu__broadcast_tick_pull(int deadcpu)
 		}
 
 		/* This moves the broadcast assignment to this CPU: */
+		bc->next_event_forced = 0;
 		clockevents_program_event(bc, bc->next_event, 1);
 	}
 	raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags);
diff --git a/kernel/time/tick-common.c b/kernel/time/tick-common.c
index d305d85..6a9198a 100644
--- a/kernel/time/tick-common.c
+++ b/kernel/time/tick-common.c
@@ -110,6 +110,7 @@ void tick_handle_periodic(struct clock_event_device *dev)
 	int cpu = smp_processor_id();
 	ktime_t next = dev->next_event;
 
+	dev->next_event_forced = 0;
 	tick_periodic(cpu);
 
 	/*
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 36449f0..d1f27df 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -1513,6 +1513,7 @@ static void tick_nohz_lowres_handler(struct clock_event_device *dev)
 	struct tick_sched *ts = this_cpu_ptr(&tick_cpu_sched);
 
 	dev->next_event = KTIME_MAX;
+	dev->next_event_forced = 0;
 
 	if (likely(tick_nohz_handler(&ts->sched_timer) == HRTIMER_RESTART))
 		tick_program_event(hrtimer_get_expires(&ts->sched_timer), 1);