From: Yujun Dong
To: Ingo Molnar, Valentin Schneider, Vincent Guittot, Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, Yujun Dong
Subject: [PATCH] cpuidle, sched: Use smp_mb__after_atomic() in current_clr_polling()
Date: Mon, 30 Dec 2024 22:16:24 +0800
Message-ID: <20241230141624.155356-1-yujundong@pascal-lab.net>

In architectures that use the polling bit, current_clr_polling() employs
smp_mb() to ensure that the clearing of the polling bit is visible to
other cores before checking TIF_NEED_RESCHED. However, smp_mb() can be
costly. Given that clear_bit() is an atomic operation, replacing smp_mb()
with smp_mb__after_atomic() is appropriate.

Many architectures implement smp_mb__after_atomic() as a lighter-weight
barrier compared to smp_mb(), leading to performance improvements.
For instance, on x86, smp_mb__after_atomic() is a no-op. This change
eliminates a smp_mb() instruction in the cpuidle wake-up path, saving
several CPU cycles and thereby reducing wake-up latency.

Architectures that do not use the polling bit will retain the original
smp_mb() behavior to ensure that existing dependencies remain unaffected.

Signed-off-by: Yujun Dong
---
 include/linux/sched/idle.h | 23 ++++++++++++++++-------
 1 file changed, 16 insertions(+), 7 deletions(-)

diff --git a/include/linux/sched/idle.h b/include/linux/sched/idle.h
index e670ac282333..439f6029d3b9 100644
--- a/include/linux/sched/idle.h
+++ b/include/linux/sched/idle.h
@@ -79,6 +79,21 @@ static __always_inline bool __must_check current_clr_polling_and_test(void)
 	return unlikely(tif_need_resched());
 }
 
+static __always_inline void current_clr_polling(void)
+{
+	__current_clr_polling();
+
+	/*
+	 * Ensure we check TIF_NEED_RESCHED after we clear the polling bit.
+	 * Once the bit is cleared, we'll get IPIs with every new
+	 * TIF_NEED_RESCHED and the IPI handler, scheduler_ipi(), will also
+	 * fold.
+	 */
+	smp_mb__after_atomic(); /* paired with resched_curr() */
+
+	preempt_fold_need_resched();
+}
+
 #else
 static inline void __current_set_polling(void) { }
 static inline void __current_clr_polling(void) { }
@@ -91,21 +106,15 @@ static inline bool __must_check current_clr_polling_and_test(void)
 {
 	return unlikely(tif_need_resched());
 }
-#endif
 
 static __always_inline void current_clr_polling(void)
 {
 	__current_clr_polling();
 
-	/*
-	 * Ensure we check TIF_NEED_RESCHED after we clear the polling bit.
-	 * Once the bit is cleared, we'll get IPIs with every new
-	 * TIF_NEED_RESCHED and the IPI handler, scheduler_ipi(), will also
-	 * fold.
-	 */
 	smp_mb(); /* paired with resched_curr() */
 
 	preempt_fold_need_resched();
 }
+#endif
 
 #endif /* _LINUX_SCHED_IDLE_H */
-- 
2.47.1
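
The ordering argument in the commit message can be pictured with a small userspace C11 analogy. This is only a sketch under stated assumptions, not kernel code: every `model_*` name is hypothetical, the two flags are modeled as separate atomics for simplicity, and `atomic_thread_fence(memory_order_seq_cst)` stands in for the barrier that follows the atomic clear. The property it illustrates is that either the idle side sees the reschedule request or the waker sees that polling has stopped and falls back to an IPI.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; not the kernel's thread_info flags. */
static atomic_bool model_polling = true;
static atomic_bool model_need_resched = false;

/* Put the model back into the "idle CPU is polling" state. */
static void model_reset(void)
{
	atomic_store(&model_polling, true);
	atomic_store(&model_need_resched, false);
}

/* Idle side: stop polling, then report whether a reschedule is pending. */
static bool model_clr_polling(void)
{
	/* Atomic read-modify-write, standing in for clear_bit(). */
	atomic_exchange_explicit(&model_polling, false, memory_order_relaxed);

	/* Full fence between the RMW above and the load below. */
	atomic_thread_fence(memory_order_seq_cst);

	return atomic_load_explicit(&model_need_resched, memory_order_relaxed);
}

/* Waker side: request a reschedule; returns true if an IPI would be needed. */
static bool model_resched(void)
{
	atomic_store_explicit(&model_need_resched, true, memory_order_relaxed);

	/* Matching fence on the waker side of the pairing. */
	atomic_thread_fence(memory_order_seq_cst);

	/* A target that has stopped polling must be interrupted instead. */
	return !atomic_load_explicit(&model_polling, memory_order_relaxed);
}

/* At least one side must have noticed the other, or the wake-up is lost. */
static void model_check_invariant(bool idle_saw_request, bool ipi_needed)
{
	assert(idle_saw_request || ipi_needed);
}
```

Calling `model_resched()` before `model_clr_polling()` yields no IPI and an idle side that sees the request; the opposite order yields an idle side that sees nothing and a waker that asks for an IPI. In both cases `model_check_invariant()` holds. In this model, dropping either fence would allow both loads to miss the other side's store.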