[PATCH v2 1/3] x86/entry: Test ti_work for zero before processing individual bits

Posted by Xin Li (Intel) 1 year, 3 months ago
In most cases, the ti_work value passed to arch_exit_to_user_mode_prepare()
is zero, e.g., 99% of the time in kernel build tests.  So an obvious
optimization is to test ti_work for zero before processing individual
bits in it.

In addition, Intel 0day tests found no performance regression with this
change.

Suggested-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
---

Changes since v1:
* Leave fpregs_assert_state_consistent() unconditional and independent
  of ti_work. (Brian Gerst and Thomas Gleixner)
* Add arch_exit_work() to spare an extra indentation level. (Thomas Gleixner)
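
For readers who want to try the idea outside the kernel, here is a
minimal userspace sketch of the split described above: a helper that
tests each bit, and a caller that tests the whole word for zero first.
All names (WORK_NOTIFY, WORK_FPU_LOAD, exit_work(), exit_prepare(),
count_slow_path()) are hypothetical stand-ins, not kernel symbols, and
__builtin_expect() assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bits standing in for the kernel's _TIF_* values. */
#define WORK_NOTIFY   (1UL << 0)
#define WORK_FPU_LOAD (1UL << 1)

/*
 * Models the role of arch_exit_work(): test each bit individually.
 * Returns the number of actions that would have been taken.
 */
int exit_work(unsigned long work)
{
	int actions = 0;

	if (work & WORK_NOTIFY)
		actions++;	/* placeholder for the notifier call */
	if (work & WORK_FPU_LOAD)
		actions++;	/* placeholder for the FPU reload */
	return actions;
}

/*
 * Models the caller: a single test for zero gates all per-bit tests.
 * The hint marks the non-zero case as the unlikely one.
 */
int exit_prepare(unsigned long work)
{
	if (__builtin_expect(work != 0, 0))
		return exit_work(work);
	return 0;
}

/* Counts how many samples would have entered the per-bit helper. */
size_t count_slow_path(const unsigned long *samples, size_t n)
{
	size_t i, slow = 0;

	assert(samples != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (samples[i] != 0)
			slow++;
	return slow;
}
```

Feeding count_slow_path() a trace of mostly-zero values shows how
rarely the helper would be entered; the 99% figure quoted above comes
from kernel build tests, not from this sketch.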
---
 arch/x86/include/asm/entry-common.h | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
index fb2809b20b0a..db970828f385 100644
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -44,8 +44,7 @@ static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
 }
 #define arch_enter_from_user_mode arch_enter_from_user_mode
 
-static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
-						  unsigned long ti_work)
+static inline void arch_exit_work(unsigned long ti_work)
 {
 	if (ti_work & _TIF_USER_RETURN_NOTIFY)
 		fire_user_return_notifiers();
@@ -56,6 +55,13 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
 	fpregs_assert_state_consistent();
 	if (unlikely(ti_work & _TIF_NEED_FPU_LOAD))
 		switch_fpu_return();
+}
+
+static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
+						  unsigned long ti_work)
+{
+	if (IS_ENABLED(CONFIG_X86_DEBUG_FPU) || unlikely(ti_work))
+		arch_exit_work(ti_work);
 
 #ifdef CONFIG_COMPAT
 	/*
-- 
2.46.0
[tip: x86/fred] x86/entry: Test ti_work for zero before processing individual bits
Posted by tip-bot2 for Xin Li (Intel) 1 year, 3 months ago
The following commit has been merged into the x86/fred branch of tip:

Commit-ID:     0dfac6f267fa091aa348c6a6742b463c9e7c98e3
Gitweb:        https://git.kernel.org/tip/0dfac6f267fa091aa348c6a6742b463c9e7c98e3
Author:        Xin Li (Intel) <xin@zytor.com>
AuthorDate:    Thu, 22 Aug 2024 00:39:04 -07:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Sun, 25 Aug 2024 19:23:00 +02:00

x86/entry: Test ti_work for zero before processing individual bits

In most cases, the ti_work value passed to arch_exit_to_user_mode_prepare()
is zero, e.g., 99% of the time in kernel build tests.  So an obvious
optimization is to test ti_work for zero before processing individual
bits in it.

Omit the optimization when FPU debugging is enabled, otherwise the
FPU consistency check would be skipped whenever ti_work is zero.
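
A minimal userspace sketch of why the debug case needs the override,
under stated assumptions: the struct, the function names and the
WORK_FPU_LOAD bit are hypothetical, and the debug switch is modelled as
a runtime bool so both settings can be exercised in one program. In the
patch it is IS_ENABLED(CONFIG_X86_DEBUG_FPU), a compile-time constant.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define WORK_FPU_LOAD (1UL << 0)	/* hypothetical flag bit */

struct exit_stats {
	int consistency_checks;	/* runs of the unconditional check */
	int fpu_reloads;	/* runs of the flag-dependent work */
};

/* The check sits inside the helper and does not depend on the flags. */
void model_exit_work(struct exit_stats *st, unsigned long work)
{
	assert(st != NULL);
	st->consistency_checks++;
	if (work & WORK_FPU_LOAD)
		st->fpu_reloads++;
}

/* With debug set, the helper runs even when no work bit is set. */
void model_exit_prepare(struct exit_stats *st, bool debug,
			unsigned long work)
{
	if (debug || work)
		model_exit_work(st, work);
}
```

With debug false and work zero the helper is not entered, so the check
inside it does not run; with debug true it runs on every call.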

Intel 0day tests did not find a performance regression with this change.

Suggested-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240822073906.2176342-2-xin@zytor.com

---
 arch/x86/include/asm/entry-common.h | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
index fb2809b..db97082 100644
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -44,8 +44,7 @@ static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
 }
 #define arch_enter_from_user_mode arch_enter_from_user_mode
 
-static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
-						  unsigned long ti_work)
+static inline void arch_exit_work(unsigned long ti_work)
 {
 	if (ti_work & _TIF_USER_RETURN_NOTIFY)
 		fire_user_return_notifiers();
@@ -56,6 +55,13 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
 	fpregs_assert_state_consistent();
 	if (unlikely(ti_work & _TIF_NEED_FPU_LOAD))
 		switch_fpu_return();
+}
+
+static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
+						  unsigned long ti_work)
+{
+	if (IS_ENABLED(CONFIG_X86_DEBUG_FPU) || unlikely(ti_work))
+		arch_exit_work(ti_work);
 
 #ifdef CONFIG_COMPAT
 	/*