Date: Wed, 08 Apr 2026 10:10:36 -0000
From: "tip-bot2 for Mark Rutland"
Sender: tip-bot2@linutronix.de
Reply-to: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Subject: [tip: sched/hrtick] entry: Split kernel mode logic from irqentry_{enter,exit}()
Cc: Mark Rutland, Thomas Gleixner, Jinjie Ruan, "Peter Zijlstra (Intel)", x86@kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20260407131650.3813777-5-mark.rutland@arm.com>
References: <20260407131650.3813777-5-mark.rutland@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Message-ID: <177564303671.801717.16124769678851048072.tip-bot2@tip-bot2>

The following commit has been merged into the sched/hrtick branch of tip:

Commit-ID:     c5538d0141b383808f440186fcd0bc2799af2853
Gitweb:        https://git.kernel.org/tip/c5538d0141b383808f440186fcd0bc2799af2853
Author:        Mark Rutland
AuthorDate:    Tue, 07 Apr 2026 14:16:44 +01:00
Committer:     Thomas Gleixner
CommitterDate: Wed, 08 Apr 2026 11:43:32 +02:00

entry: Split kernel mode logic from irqentry_{enter,exit}()

The generic irqentry code has entry/exit functions specifically for
exceptions taken from user mode, but doesn't have entry/exit functions
specifically for exceptions taken from kernel mode.

It would be helpful to have separate entry/exit functions specifically
for exceptions taken from kernel mode. This would make the structure of
the entry code more consistent, and would make it easier for
architectures to manage logic specific to exceptions taken from kernel
mode.

Move the logic specific to kernel mode out of irqentry_enter() and
irqentry_exit() into new irqentry_enter_from_kernel_mode() and
irqentry_exit_to_kernel_mode() functions. These are marked
__always_inline and placed in irq-entry-common.h, as with
irqentry_enter_from_user_mode() and irqentry_exit_to_user_mode(), so
that they can be inlined into architecture-specific wrappers. The
existing out-of-line irqentry_enter() and irqentry_exit() functions are
retained as callers of the new functions.

The lockdep assertion from irqentry_exit() is moved into
irqentry_exit_to_user_mode() and irqentry_exit_to_kernel_mode(). This
was previously missing from irqentry_exit_to_user_mode() when called
directly, and any new lockdep assertion failure resulting from this
change is a latent bug.

Aside from the lockdep change noted above, there should be no
functional change as a result of this change.

[ tglx: Updated kernel doc ]

Signed-off-by: Mark Rutland
Signed-off-by: Thomas Gleixner
Reviewed-by: Jinjie Ruan
Acked-by: Peter Zijlstra (Intel)
Link: https://patch.msgid.link/20260407131650.3813777-5-mark.rutland@arm.com
---
 include/linux/irq-entry-common.h | 134 ++++++++++++++++++++++++++++++-
 kernel/entry/common.c            | 103 +---------------------
 2 files changed, 142 insertions(+), 95 deletions(-)

diff --git a/include/linux/irq-entry-common.h b/include/linux/irq-entry-common.h
index d1e8591..66bc168 100644
--- a/include/linux/irq-entry-common.h
+++ b/include/linux/irq-entry-common.h
@@ -304,6 +304,8 @@ static __always_inline void irqentry_enter_from_user_mode(struct pt_regs *regs)
  */
 static __always_inline void irqentry_exit_to_user_mode(struct pt_regs *regs)
 {
+	lockdep_assert_irqs_disabled();
+
 	instrumentation_begin();
 	irqentry_exit_to_user_mode_prepare(regs);
 	instrumentation_end();
@@ -357,6 +359,138 @@ void dynamic_irqentry_exit_cond_resched(void);
 #endif /* CONFIG_PREEMPT_DYNAMIC */
 
 /**
+ * irqentry_enter_from_kernel_mode - Establish state before invoking the irq handler
+ * @regs:	Pointer to currents pt_regs
+ *
+ * Invoked from architecture specific entry code with interrupts disabled.
+ * Can only be called when the interrupt entry came from kernel mode. The
+ * calling code must be non-instrumentable. When the function returns all
+ * state is correct and the subsequent functions can be instrumented.
+ *
+ * The function establishes state (lockdep, RCU (context tracking), tracing) and
+ * is provided for architectures which require a strict split between entry from
+ * kernel and user mode and therefore cannot use irqentry_enter() which handles
+ * both entry modes.
+ *
+ * Returns: An opaque object that must be passed to irqentry_exit_to_kernel_mode().
+ */
+static __always_inline irqentry_state_t irqentry_enter_from_kernel_mode(struct pt_regs *regs)
+{
+	irqentry_state_t ret = {
+		.exit_rcu = false,
+	};
+
+	/*
+	 * If this entry hit the idle task invoke ct_irq_enter() whether
+	 * RCU is watching or not.
+	 *
+	 * Interrupts can nest when the first interrupt invokes softirq
+	 * processing on return which enables interrupts.
+	 *
+	 * Scheduler ticks in the idle task can mark quiescent state and
+	 * terminate a grace period, if and only if the timer interrupt is
+	 * not nested into another interrupt.
+	 *
+	 * Checking for rcu_is_watching() here would prevent the nesting
+	 * interrupt to invoke ct_irq_enter(). If that nested interrupt is
+	 * the tick then rcu_flavor_sched_clock_irq() would wrongfully
+	 * assume that it is the first interrupt and eventually claim
+	 * quiescent state and end grace periods prematurely.
+	 *
+	 * Unconditionally invoke ct_irq_enter() so RCU state stays
+	 * consistent.
+	 *
+	 * TINY_RCU does not support EQS, so let the compiler eliminate
+	 * this part when enabled.
+	 */
+	if (!IS_ENABLED(CONFIG_TINY_RCU) &&
+	    (is_idle_task(current) || arch_in_rcu_eqs())) {
+		/*
+		 * If RCU is not watching then the same careful
+		 * sequence vs. lockdep and tracing is required
+		 * as in irqentry_enter_from_user_mode().
+		 */
+		lockdep_hardirqs_off(CALLER_ADDR0);
+		ct_irq_enter();
+		instrumentation_begin();
+		kmsan_unpoison_entry_regs(regs);
+		trace_hardirqs_off_finish();
+		instrumentation_end();
+
+		ret.exit_rcu = true;
+		return ret;
+	}
+
+	/*
+	 * If RCU is watching then RCU only wants to check whether it needs
+	 * to restart the tick in NOHZ mode. rcu_irq_enter_check_tick()
+	 * already contains a warning when RCU is not watching, so no point
+	 * in having another one here.
+	 */
+	lockdep_hardirqs_off(CALLER_ADDR0);
+	instrumentation_begin();
+	kmsan_unpoison_entry_regs(regs);
+	rcu_irq_enter_check_tick();
+	trace_hardirqs_off_finish();
+	instrumentation_end();
+
+	return ret;
+}
+
+/**
+ * irqentry_exit_to_kernel_mode - Run preempt checks and establish state after
+ *				  invoking the interrupt handler
+ * @regs:	Pointer to current's pt_regs
+ * @state:	Return value from matching call to irqentry_enter_from_kernel_mode()
+ *
+ * This is the counterpart of irqentry_enter_from_kernel_mode() and runs the
+ * necessary preemption check if possible and required. It returns to the caller
+ * with interrupts disabled and the correct state vs. tracing, lockdep and RCU
+ * required to return to the interrupted context.
+ *
+ * It is the last action before returning to the low level ASM code which just
+ * needs to return.
+ */
+static __always_inline void irqentry_exit_to_kernel_mode(struct pt_regs *regs,
+							 irqentry_state_t state)
+{
+	lockdep_assert_irqs_disabled();
+
+	if (!regs_irqs_disabled(regs)) {
+		/*
+		 * If RCU was not watching on entry this needs to be done
+		 * carefully and needs the same ordering of lockdep/tracing
+		 * and RCU as the return to user mode path.
+		 */
+		if (state.exit_rcu) {
+			instrumentation_begin();
+			/* Tell the tracer that IRET will enable interrupts */
+			trace_hardirqs_on_prepare();
+			lockdep_hardirqs_on_prepare();
+			instrumentation_end();
+			ct_irq_exit();
+			lockdep_hardirqs_on(CALLER_ADDR0);
+			return;
+		}
+
+		instrumentation_begin();
+		if (IS_ENABLED(CONFIG_PREEMPTION))
+			irqentry_exit_cond_resched();
+
+		/* Covers both tracing and lockdep */
+		trace_hardirqs_on();
+		instrumentation_end();
+	} else {
+		/*
+		 * IRQ flags state is correct already. Just tell RCU if it
+		 * was not watching on entry.
+		 */
+		if (state.exit_rcu)
+			ct_irq_exit();
+	}
+}
+
+/**
  * irqentry_enter - Handle state tracking on ordinary interrupt entries
  * @regs:	Pointer to pt_regs of interrupted context
  *
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index b5e05d8..1034be0 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -105,70 +105,16 @@ __always_inline unsigned long exit_to_user_mode_loop(struct pt_regs *regs,
 
 noinstr irqentry_state_t irqentry_enter(struct pt_regs *regs)
 {
-	irqentry_state_t ret = {
-		.exit_rcu = false,
-	};
-
 	if (user_mode(regs)) {
-		irqentry_enter_from_user_mode(regs);
-		return ret;
-	}
+		irqentry_state_t ret = {
+			.exit_rcu = false,
+		};
 
-	/*
-	 * If this entry hit the idle task invoke ct_irq_enter() whether
-	 * RCU is watching or not.
-	 *
-	 * Interrupts can nest when the first interrupt invokes softirq
-	 * processing on return which enables interrupts.
-	 *
-	 * Scheduler ticks in the idle task can mark quiescent state and
-	 * terminate a grace period, if and only if the timer interrupt is
-	 * not nested into another interrupt.
-	 *
-	 * Checking for rcu_is_watching() here would prevent the nesting
-	 * interrupt to invoke ct_irq_enter(). If that nested interrupt is
-	 * the tick then rcu_flavor_sched_clock_irq() would wrongfully
-	 * assume that it is the first interrupt and eventually claim
-	 * quiescent state and end grace periods prematurely.
-	 *
-	 * Unconditionally invoke ct_irq_enter() so RCU state stays
-	 * consistent.
-	 *
-	 * TINY_RCU does not support EQS, so let the compiler eliminate
-	 * this part when enabled.
-	 */
-	if (!IS_ENABLED(CONFIG_TINY_RCU) &&
-	    (is_idle_task(current) || arch_in_rcu_eqs())) {
-		/*
-		 * If RCU is not watching then the same careful
-		 * sequence vs. lockdep and tracing is required
-		 * as in irqentry_enter_from_user_mode().
-		 */
-		lockdep_hardirqs_off(CALLER_ADDR0);
-		ct_irq_enter();
-		instrumentation_begin();
-		kmsan_unpoison_entry_regs(regs);
-		trace_hardirqs_off_finish();
-		instrumentation_end();
-
-		ret.exit_rcu = true;
+		irqentry_enter_from_user_mode(regs);
 		return ret;
 	}
 
-	/*
-	 * If RCU is watching then RCU only wants to check whether it needs
-	 * to restart the tick in NOHZ mode. rcu_irq_enter_check_tick()
-	 * already contains a warning when RCU is not watching, so no point
-	 * in having another one here.
-	 */
-	lockdep_hardirqs_off(CALLER_ADDR0);
-	instrumentation_begin();
-	kmsan_unpoison_entry_regs(regs);
-	rcu_irq_enter_check_tick();
-	trace_hardirqs_off_finish();
-	instrumentation_end();
-
-	return ret;
+	return irqentry_enter_from_kernel_mode(regs);
 }
 
 /**
@@ -212,43 +158,10 @@ void dynamic_irqentry_exit_cond_resched(void)
 
 noinstr void irqentry_exit(struct pt_regs *regs, irqentry_state_t state)
 {
-	lockdep_assert_irqs_disabled();
-
-	/* Check whether this returns to user mode */
-	if (user_mode(regs)) {
+	if (user_mode(regs))
 		irqentry_exit_to_user_mode(regs);
-	} else if (!regs_irqs_disabled(regs)) {
-		/*
-		 * If RCU was not watching on entry this needs to be done
-		 * carefully and needs the same ordering of lockdep/tracing
-		 * and RCU as the return to user mode path.
-		 */
-		if (state.exit_rcu) {
-			instrumentation_begin();
-			/* Tell the tracer that IRET will enable interrupts */
-			trace_hardirqs_on_prepare();
-			lockdep_hardirqs_on_prepare();
-			instrumentation_end();
-			ct_irq_exit();
-			lockdep_hardirqs_on(CALLER_ADDR0);
-			return;
-		}
-
-		instrumentation_begin();
-		if (IS_ENABLED(CONFIG_PREEMPTION))
-			irqentry_exit_cond_resched();
-
-		/* Covers both tracing and lockdep */
-		trace_hardirqs_on();
-		instrumentation_end();
-	} else {
-		/*
-		 * IRQ flags state is correct already. Just tell RCU if it
-		 * was not watching on entry.
-		 */
-		if (state.exit_rcu)
-			ct_irq_exit();
-	}
+	else
+		irqentry_exit_to_kernel_mode(regs, state);
 }
 
 irqentry_state_t noinstr irqentry_nmi_enter(struct pt_regs *regs)
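
Illustration only, not part of the commit: the commit message says the new
helpers are meant to be inlined into architecture-specific wrappers. The
userspace model below is a minimal sketch of that call pattern under loud
assumptions. Every name in it (struct mock_regs, mock_irqentry_state_t,
mock_enter_from_kernel_mode(), mock_exit_to_kernel_mode(),
arch_handle_kernel_irq(), mock_handler()) is hypothetical and is not a kernel
API. It models only the exit_rcu pairing between entry and exit; lockdep,
tracing, KMSAN and the preemption check are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct mock_regs { bool user_mode; };
typedef struct { bool exit_rcu; } mock_irqentry_state_t;

/* Models "is_idle_task(current) || arch_in_rcu_eqs()" from the patch. */
static bool mock_in_idle;
/* Models the ct_irq_enter()/ct_irq_exit() balance. */
static int mock_rcu_nesting;
/* Bookkeeping for the demo handler. */
static int mock_handler_calls;
static int mock_nesting_seen_in_handler;

static inline mock_irqentry_state_t
mock_enter_from_kernel_mode(struct mock_regs *regs)
{
	mock_irqentry_state_t ret = { .exit_rcu = false };

	(void)regs;
	if (mock_in_idle) {
		/* Entry informed RCU, so exit has to undo it. */
		mock_rcu_nesting++;
		ret.exit_rcu = true;
	}
	return ret;
}

static inline void
mock_exit_to_kernel_mode(struct mock_regs *regs, mock_irqentry_state_t state)
{
	(void)regs;
	if (state.exit_rcu)
		mock_rcu_nesting--;
}

/* Stand-in for an instrumentable interrupt handler body. */
static void mock_handler(void)
{
	mock_handler_calls++;
	mock_nesting_seen_in_handler = mock_rcu_nesting;
}

/*
 * Hypothetical arch wrapper for exceptions known to come from kernel mode:
 * enter, run the handler, then hand the opaque state back to the exit helper.
 * Returns the modeled RCU nesting count after exit.
 */
static int arch_handle_kernel_irq(struct mock_regs *regs, void (*handler)(void))
{
	mock_irqentry_state_t state;

	assert(!regs->user_mode);
	state = mock_enter_from_kernel_mode(regs);
	handler();
	mock_exit_to_kernel_mode(regs, state);
	return mock_rcu_nesting;
}
```

The point of the sketch is that the wrapper treats the returned state as
opaque and passes it unchanged to the matching exit call, which is what the
kernel-doc in the patch requires of real callers.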