From nobody Sun Apr 5 20:05:09 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D2865C54EBE for ; Fri, 13 Jan 2023 12:36:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241644AbjAMMgt (ORCPT ); Fri, 13 Jan 2023 07:36:49 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33460 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241238AbjAMMcT (ORCPT ); Fri, 13 Jan 2023 07:32:19 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 61B77564CD; Fri, 13 Jan 2023 04:31:15 -0800 (PST) Date: Fri, 13 Jan 2023 12:31:13 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1673613074; h=from:from:sender:sender:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=+VJkT3N/Q0nHnNyST+KtiVYGPfPMPkjFyNMmUbOTZkk=; b=kMWkEbULDlsMk71PD94Gb8ojrNp7YXko2Nwv1n2jNG6H6v/dW6akNTq/OhziiIPoWo/RN3 N7zbyOSI88Z2+xZFS01si1JVBB3MgrAJdiFNGhVER+fUe4+y8m3L9ygp1DKS9LGM2QZ7Tu hFyBuWUJPUiXN8pgrZDe+/s+oW92mYgiZD80meQsE6lvrOBDadj2YQynzX6sTABPY4FFD8 40suTSEmrxATDKBLO9+PHF6Ww+kpQjmi5rxMT9otDkG63EW2aaeH/l8C9jWQNIwDZwizyU e0AHWtQl3dScIwfv3EnowcC1M7g76wTf0S+/r80EOFJ4H1pXjw1Xvub63DpJrQ== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1673613074; h=from:from:sender:sender:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=+VJkT3N/Q0nHnNyST+KtiVYGPfPMPkjFyNMmUbOTZkk=; b=tWtJlV2VrPuV+MTCvRG23ebIRP1F8SOwdnoVPov+kUga2GbR1uBalSq4riT2QcQ8TimRFl 5VEYRP38a4lREnAw== From: "tip-bot2 for Peter Zijlstra" Sender: tip-bot2@linutronix.de Reply-to: linux-kernel@vger.kernel.org To: linux-tip-commits@vger.kernel.org Subject: [tip: sched/core] cpuidle: Fix ct_idle_*() usage Cc: "Peter Zijlstra (Intel)" , Ingo Molnar , Tony Lindgren , Ulf Hansson , "Rafael J. Wysocki" , Frederic Weisbecker , x86@kernel.org, linux-kernel@vger.kernel.org In-Reply-To: <20230112195540.130014793@infradead.org> References: <20230112195540.130014793@infradead.org> MIME-Version: 1.0 Message-ID: <167361307365.4906.6746266098740384424.tip-bot2@tip-bot2> Robot-ID: Robot-Unsubscribe: Contact to get blacklisted from these emails Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org The following commit has been merged into the sched/core branch of tip: Commit-ID: a01353cf1896ea5b8a7bbc5e2b2d38feed8b7aaa Gitweb: https://git.kernel.org/tip/a01353cf1896ea5b8a7bbc5e2b2d38fee= d8b7aaa Author: Peter Zijlstra AuthorDate: Thu, 12 Jan 2023 20:43:27 +01:00 Committer: Ingo Molnar CommitterDate: Fri, 13 Jan 2023 11:48:15 +01:00 cpuidle: Fix ct_idle_*() usage The whole disable-RCU, enable-IRQS dance is very intricate since changing IRQ state is traced, which depends on RCU. Add two helpers for the cpuidle case that mirror the entry code: ct_cpuidle_enter() ct_cpuidle_exit() And fix all the cases where the enter/exit dance was buggy. Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar Tested-by: Tony Lindgren Tested-by: Ulf Hansson Acked-by: Rafael J. Wysocki Acked-by: Frederic Weisbecker Link: https://lore.kernel.org/r/20230112195540.130014793@infradead.org --- arch/arm/mach-imx/cpuidle-imx6q.c | 4 +- arch/arm/mach-imx/cpuidle-imx6sx.c | 4 +- arch/arm/mach-omap2/cpuidle34xx.c | 4 +- arch/arm/mach-omap2/cpuidle44xx.c | 8 ++--- drivers/acpi/processor_idle.c | 8 +++-- drivers/cpuidle/cpuidle-big_little.c | 4 +- drivers/cpuidle/cpuidle-mvebu-v7.c | 4 +- drivers/cpuidle/cpuidle-psci.c | 4 +- drivers/cpuidle/cpuidle-riscv-sbi.c | 4 +- drivers/cpuidle/cpuidle-tegra.c | 8 ++--- drivers/cpuidle/cpuidle.c | 11 +++---- include/linux/clockchips.h | 4 +- include/linux/cpuidle.h | 34 ++++++++++++++++++-- kernel/sched/idle.c | 45 +++++++-------------------- kernel/time/tick-broadcast.c | 6 +++- 15 files changed, 86 insertions(+), 66 deletions(-) diff --git a/arch/arm/mach-imx/cpuidle-imx6q.c b/arch/arm/mach-imx/cpuidle-= imx6q.c index d086cba..c24c78a 100644 --- a/arch/arm/mach-imx/cpuidle-imx6q.c +++ b/arch/arm/mach-imx/cpuidle-imx6q.c @@ -25,9 +25,9 @@ static int imx6q_enter_wait(struct cpuidle_device *dev, imx6_set_lpm(WAIT_UNCLOCKED); raw_spin_unlock(&cpuidle_lock); =20 - ct_idle_enter(); + ct_cpuidle_enter(); cpu_do_idle(); - ct_idle_exit(); + ct_cpuidle_exit(); =20 raw_spin_lock(&cpuidle_lock); if (num_idle_cpus-- =3D=3D num_online_cpus()) diff --git a/arch/arm/mach-imx/cpuidle-imx6sx.c b/arch/arm/mach-imx/cpuidle= -imx6sx.c index 1dc01f6..479f062 100644 --- a/arch/arm/mach-imx/cpuidle-imx6sx.c +++ b/arch/arm/mach-imx/cpuidle-imx6sx.c @@ -47,9 +47,9 @@ static int imx6sx_enter_wait(struct cpuidle_device *dev, cpu_pm_enter(); cpu_cluster_pm_enter(); =20 - ct_idle_enter(); + ct_cpuidle_enter(); cpu_suspend(0, imx6sx_idle_finish); - ct_idle_exit(); + ct_cpuidle_exit(); =20 cpu_cluster_pm_exit(); cpu_pm_exit(); diff --git a/arch/arm/mach-omap2/cpuidle34xx.c b/arch/arm/mach-omap2/cpuidl= e34xx.c index cedf5cb..6d63769 100644 --- a/arch/arm/mach-omap2/cpuidle34xx.c +++ b/arch/arm/mach-omap2/cpuidle34xx.c @@ -133,9 +133,9 @@ static int omap3_enter_idle(struct cpuidle_device *dev, } =20 /* Execute ARM wfi */ - ct_idle_enter(); + ct_cpuidle_enter(); omap_sram_idle(); - ct_idle_exit(); + ct_cpuidle_exit(); =20 /* * Call idle CPU PM enter notifier chain to restore diff --git a/arch/arm/mach-omap2/cpuidle44xx.c b/arch/arm/mach-omap2/cpuidl= e44xx.c index 953ad88..3c97d56 100644 --- a/arch/arm/mach-omap2/cpuidle44xx.c +++ b/arch/arm/mach-omap2/cpuidle44xx.c @@ -105,9 +105,9 @@ static int omap_enter_idle_smp(struct cpuidle_device *d= ev, } raw_spin_unlock_irqrestore(&mpu_lock, flag); =20 - ct_idle_enter(); + ct_cpuidle_enter(); omap4_enter_lowpower(dev->cpu, cx->cpu_state); - ct_idle_exit(); + ct_cpuidle_exit(); =20 raw_spin_lock_irqsave(&mpu_lock, flag); if (cx->mpu_state_vote =3D=3D num_online_cpus()) @@ -186,10 +186,10 @@ static int omap_enter_idle_coupled(struct cpuidle_dev= ice *dev, } } =20 - ct_idle_enter(); + ct_cpuidle_enter(); omap4_enter_lowpower(dev->cpu, cx->cpu_state); cpu_done[dev->cpu] =3D true; - ct_idle_exit(); + ct_cpuidle_exit(); =20 /* Wakeup CPU1 only if it is not offlined */ if (dev->cpu =3D=3D 0 && cpumask_test_cpu(1, cpu_online_mask)) { diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c index 566f7db..a589cfa 100644 --- a/drivers/acpi/processor_idle.c +++ b/drivers/acpi/processor_idle.c @@ -642,6 +642,8 @@ static int __cpuidle acpi_idle_enter_bm(struct cpuidle_= driver *drv, */ bool dis_bm =3D pr->flags.bm_control; =20 + instrumentation_begin(); + /* If we can skip BM, demote to a safe state. */ if (!cx->bm_sts_skip && acpi_idle_bm_check()) { dis_bm =3D false; @@ -663,11 +665,11 @@ static int __cpuidle acpi_idle_enter_bm(struct cpuidl= e_driver *drv, raw_spin_unlock(&c3_lock); } =20 - ct_idle_enter(); + ct_cpuidle_enter(); =20 acpi_idle_do_entry(cx); =20 - ct_idle_exit(); + ct_cpuidle_exit(); =20 /* Re-enable bus master arbitration */ if (dis_bm) { @@ -677,6 +679,8 @@ static int __cpuidle acpi_idle_enter_bm(struct cpuidle_= driver *drv, raw_spin_unlock(&c3_lock); } =20 + instrumentation_end(); + return index; } =20 diff --git a/drivers/cpuidle/cpuidle-big_little.c b/drivers/cpuidle/cpuidle= -big_little.c index fffd4ed..5858db2 100644 --- a/drivers/cpuidle/cpuidle-big_little.c +++ b/drivers/cpuidle/cpuidle-big_little.c @@ -126,13 +126,13 @@ static int bl_enter_powerdown(struct cpuidle_device *= dev, struct cpuidle_driver *drv, int idx) { cpu_pm_enter(); - ct_idle_enter(); + ct_cpuidle_enter(); =20 cpu_suspend(0, bl_powerdown_finisher); =20 /* signals the MCPM core that CPU is out of low power state */ mcpm_cpu_powered_up(); - ct_idle_exit(); + ct_cpuidle_exit(); =20 cpu_pm_exit(); =20 diff --git a/drivers/cpuidle/cpuidle-mvebu-v7.c b/drivers/cpuidle/cpuidle-m= vebu-v7.c index c9568aa..20bfb26 100644 --- a/drivers/cpuidle/cpuidle-mvebu-v7.c +++ b/drivers/cpuidle/cpuidle-mvebu-v7.c @@ -36,9 +36,9 @@ static int mvebu_v7_enter_idle(struct cpuidle_device *dev, if (drv->states[index].flags & MVEBU_V7_FLAG_DEEP_IDLE) deepidle =3D true; =20 - ct_idle_enter(); + ct_cpuidle_enter(); ret =3D mvebu_v7_cpu_suspend(deepidle); - ct_idle_exit(); + ct_cpuidle_exit(); =20 cpu_pm_exit(); =20 diff --git a/drivers/cpuidle/cpuidle-psci.c b/drivers/cpuidle/cpuidle-psci.c index 969808c..58b2cbb 100644 --- a/drivers/cpuidle/cpuidle-psci.c +++ b/drivers/cpuidle/cpuidle-psci.c @@ -74,7 +74,7 @@ static int __psci_enter_domain_idle_state(struct cpuidle_= device *dev, else pm_runtime_put_sync_suspend(pd_dev); =20 - ct_idle_enter(); + ct_cpuidle_enter(); =20 state =3D psci_get_domain_state(); if (!state) @@ -82,7 +82,7 @@ static int __psci_enter_domain_idle_state(struct cpuidle_= device *dev, =20 ret =3D psci_cpu_suspend_enter(state) ? -1 : idx; =20 - ct_idle_exit(); + ct_cpuidle_exit(); =20 if (s2idle) dev_pm_genpd_resume(pd_dev); diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-= riscv-sbi.c index cbdbb11..0a480f5 100644 --- a/drivers/cpuidle/cpuidle-riscv-sbi.c +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c @@ -126,7 +126,7 @@ static int __sbi_enter_domain_idle_state(struct cpuidle= _device *dev, else pm_runtime_put_sync_suspend(pd_dev); =20 - ct_idle_enter(); + ct_cpuidle_enter(); =20 if (sbi_is_domain_state_available()) state =3D sbi_get_domain_state(); @@ -135,7 +135,7 @@ static int __sbi_enter_domain_idle_state(struct cpuidle= _device *dev, =20 ret =3D sbi_suspend(state) ? -1 : idx; =20 - ct_idle_exit(); + ct_cpuidle_exit(); =20 if (s2idle) dev_pm_genpd_resume(pd_dev); diff --git a/drivers/cpuidle/cpuidle-tegra.c b/drivers/cpuidle/cpuidle-tegr= a.c index 3ca5cfb..9c2903c 100644 --- a/drivers/cpuidle/cpuidle-tegra.c +++ b/drivers/cpuidle/cpuidle-tegra.c @@ -183,7 +183,7 @@ static int tegra_cpuidle_state_enter(struct cpuidle_dev= ice *dev, tegra_pm_set_cpu_in_lp2(); cpu_pm_enter(); =20 - ct_idle_enter(); + ct_cpuidle_enter(); =20 switch (index) { case TEGRA_C7: @@ -199,7 +199,7 @@ static int tegra_cpuidle_state_enter(struct cpuidle_dev= ice *dev, break; } =20 - ct_idle_exit(); + ct_cpuidle_exit(); =20 cpu_pm_exit(); tegra_pm_clear_cpu_in_lp2(); @@ -240,10 +240,10 @@ static int tegra_cpuidle_enter(struct cpuidle_device = *dev, =20 if (index =3D=3D TEGRA_C1) { if (do_rcu) - ct_idle_enter(); + ct_cpuidle_enter(); ret =3D arm_cpuidle_simple_enter(dev, drv, index); if (do_rcu) - ct_idle_exit(); + ct_cpuidle_exit(); } else ret =3D tegra_cpuidle_state_enter(dev, index, cpu); =20 diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c index 95c801f..08374c7 100644 --- a/drivers/cpuidle/cpuidle.c +++ b/drivers/cpuidle/cpuidle.c @@ -14,6 +14,7 @@ #include #include #include +#include #include #include #include @@ -152,12 +153,12 @@ static void enter_s2idle_proper(struct cpuidle_driver= *drv, */ stop_critical_timings(); if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE)) - ct_idle_enter(); + ct_cpuidle_enter(); target_state->enter_s2idle(dev, drv, index); if (WARN_ON_ONCE(!irqs_disabled())) - local_irq_disable(); + raw_local_irq_disable(); if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE)) - ct_idle_exit(); + ct_cpuidle_exit(); tick_unfreeze(); start_critical_timings(); =20 @@ -235,14 +236,14 @@ int cpuidle_enter_state(struct cpuidle_device *dev, s= truct cpuidle_driver *drv, =20 stop_critical_timings(); if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE)) - ct_idle_enter(); + ct_cpuidle_enter(); =20 entered_state =3D target_state->enter(dev, drv, index); if (WARN_ONCE(!irqs_disabled(), "%ps leaked IRQ state", target_state->ent= er)) raw_local_irq_disable(); =20 if (!(target_state->flags & CPUIDLE_FLAG_RCU_IDLE)) - ct_idle_exit(); + ct_cpuidle_exit(); start_critical_timings(); =20 sched_clock_idle_wakeup_event(); diff --git a/include/linux/clockchips.h b/include/linux/clockchips.h index 8ae9a95..9aac31d 100644 --- a/include/linux/clockchips.h +++ b/include/linux/clockchips.h @@ -211,7 +211,7 @@ extern int tick_receive_broadcast(void); extern void tick_setup_hrtimer_broadcast(void); extern int tick_check_broadcast_expired(void); # else -static inline int tick_check_broadcast_expired(void) { return 0; } +static __always_inline int tick_check_broadcast_expired(void) { return 0; } static inline void tick_setup_hrtimer_broadcast(void) { } # endif =20 @@ -219,7 +219,7 @@ static inline void tick_setup_hrtimer_broadcast(void) {= } =20 static inline void clockevents_suspend(void) { } static inline void clockevents_resume(void) { } -static inline int tick_check_broadcast_expired(void) { return 0; } +static __always_inline int tick_check_broadcast_expired(void) { return 0; } static inline void tick_setup_hrtimer_broadcast(void) { } =20 #endif /* !CONFIG_GENERIC_CLOCKEVENTS */ diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h index 0ddc11e..630c879 100644 --- a/include/linux/cpuidle.h +++ b/include/linux/cpuidle.h @@ -14,6 +14,7 @@ #include #include #include +#include =20 #define CPUIDLE_STATE_MAX 10 #define CPUIDLE_NAME_LEN 16 @@ -115,6 +116,35 @@ struct cpuidle_device { DECLARE_PER_CPU(struct cpuidle_device *, cpuidle_devices); DECLARE_PER_CPU(struct cpuidle_device, cpuidle_dev); =20 +static __always_inline void ct_cpuidle_enter(void) +{ + lockdep_assert_irqs_disabled(); + /* + * Idle is allowed to (temporary) enable IRQs. It + * will return with IRQs disabled. + * + * Trace IRQs enable here, then switch off RCU, and have + * arch_cpu_idle() use raw_local_irq_enable(). Note that + * ct_idle_enter() relies on lockdep IRQ state, so switch that + * last -- this is very similar to the entry code. + */ + trace_hardirqs_on_prepare(); + lockdep_hardirqs_on_prepare(); + instrumentation_end(); + ct_idle_enter(); + lockdep_hardirqs_on(_RET_IP_); +} + +static __always_inline void ct_cpuidle_exit(void) +{ + /* + * Carefully undo the above. + */ + lockdep_hardirqs_off(_RET_IP_); + ct_idle_exit(); + instrumentation_begin(); +} + /**************************** * CPUIDLE DRIVER INTERFACE * ****************************/ @@ -289,9 +319,9 @@ extern s64 cpuidle_governor_latency_req(unsigned int cp= u); if (!is_retention) \ __ret =3D cpu_pm_enter(); \ if (!__ret) { \ - ct_idle_enter(); \ + ct_cpuidle_enter(); \ __ret =3D low_level_idle_enter(state); \ - ct_idle_exit(); \ + ct_cpuidle_exit(); \ if (!is_retention) \ cpu_pm_exit(); \ } \ diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c index f26ab26..e924602 100644 --- a/kernel/sched/idle.c +++ b/kernel/sched/idle.c @@ -51,18 +51,22 @@ __setup("hlt", cpu_idle_nopoll_setup); =20 static noinline int __cpuidle cpu_idle_poll(void) { + instrumentation_begin(); trace_cpu_idle(0, smp_processor_id()); stop_critical_timings(); - ct_idle_enter(); - local_irq_enable(); + ct_cpuidle_enter(); =20 + raw_local_irq_enable(); while (!tif_need_resched() && (cpu_idle_force_poll || tick_check_broadcast_expired())) cpu_relax(); + raw_local_irq_disable(); =20 - ct_idle_exit(); + ct_cpuidle_exit(); start_critical_timings(); trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id()); + local_irq_enable(); + instrumentation_end(); =20 return 1; } @@ -85,44 +89,21 @@ void __weak arch_cpu_idle(void) */ void __cpuidle default_idle_call(void) { - if (current_clr_polling_and_test()) { - local_irq_enable(); - } else { - + instrumentation_begin(); + if (!current_clr_polling_and_test()) { trace_cpu_idle(1, smp_processor_id()); stop_critical_timings(); =20 - /* - * arch_cpu_idle() is supposed to enable IRQs, however - * we can't do that because of RCU and tracing. - * - * Trace IRQs enable here, then switch off RCU, and have - * arch_cpu_idle() use raw_local_irq_enable(). Note that - * ct_idle_enter() relies on lockdep IRQ state, so switch that - * last -- this is very similar to the entry code. - */ - trace_hardirqs_on_prepare(); - lockdep_hardirqs_on_prepare(); - ct_idle_enter(); - lockdep_hardirqs_on(_THIS_IP_); - + ct_cpuidle_enter(); arch_cpu_idle(); - - /* - * OK, so IRQs are enabled here, but RCU needs them disabled to - * turn itself back on.. funny thing is that disabling IRQs - * will cause tracing, which needs RCU. Jump through hoops to - * make it 'work'. - */ raw_local_irq_disable(); - lockdep_hardirqs_off(_THIS_IP_); - ct_idle_exit(); - lockdep_hardirqs_on(_THIS_IP_); - raw_local_irq_enable(); + ct_cpuidle_exit(); =20 start_critical_timings(); trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id()); } + local_irq_enable(); + instrumentation_end(); } =20 static int call_cpuidle_s2idle(struct cpuidle_driver *drv, diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c index f7fe6fe..93bf2b4 100644 --- a/kernel/time/tick-broadcast.c +++ b/kernel/time/tick-broadcast.c @@ -622,9 +622,13 @@ struct cpumask *tick_get_broadcast_oneshot_mask(void) * to avoid a deep idle transition as we are about to get the * broadcast IPI right away. */ -int tick_check_broadcast_expired(void) +noinstr int tick_check_broadcast_expired(void) { +#ifdef _ASM_GENERIC_BITOPS_INSTRUMENTED_NON_ATOMIC_H + return arch_test_bit(smp_processor_id(), cpumask_bits(tick_broadcast_forc= e_mask)); +#else return cpumask_test_cpu(smp_processor_id(), tick_broadcast_force_mask); +#endif } =20 /*