From nobody Fri Dec 19 20:54:29 2025
From: "Xin Li (Intel)"
To: linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
	peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com
Subject: [PATCH v1 1/3] x86/entry: Test ti_work for zero before processing individual bits
Date: Tue, 6 Aug 2024 22:47:20 -0700
Message-ID: <20240807054722.682375-2-xin@zytor.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20240807054722.682375-1-xin@zytor.com>
References: <20240807054722.682375-1-xin@zytor.com>

In most cases, the ti_work value passed to
arch_exit_to_user_mode_prepare() is zero, e.g., 99% of the time in
kernel build tests.  So an obvious optimization is to test ti_work for
zero before processing the individual bits in it.

In addition, Intel 0day tests found no performance regression with this
change.

Suggested-by: H. Peter Anvin (Intel)
Signed-off-by: Xin Li (Intel)
---
 arch/x86/include/asm/entry-common.h | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
index fb2809b20b0a..4c78b99060b5 100644
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -47,15 +47,17 @@ static __always_inline void arch_enter_from_user_mode(struct pt_regs *regs)
 static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
 						  unsigned long ti_work)
 {
-	if (ti_work & _TIF_USER_RETURN_NOTIFY)
-		fire_user_return_notifiers();
+	if (unlikely(ti_work)) {
+		if (ti_work & _TIF_USER_RETURN_NOTIFY)
+			fire_user_return_notifiers();
 
-	if (unlikely(ti_work & _TIF_IO_BITMAP))
-		tss_update_io_bitmap();
+		if (unlikely(ti_work & _TIF_IO_BITMAP))
+			tss_update_io_bitmap();
 
-	fpregs_assert_state_consistent();
-	if (unlikely(ti_work & _TIF_NEED_FPU_LOAD))
-		switch_fpu_return();
+		fpregs_assert_state_consistent();
+		if (unlikely(ti_work & _TIF_NEED_FPU_LOAD))
+			switch_fpu_return();
+	}
 
 #ifdef CONFIG_COMPAT
 	/*
-- 
2.45.2
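As a userspace illustration of the idea in patch 1/3, the sketch below
guards several per-bit checks with a single test of the whole word, so
the common all-zero case takes one branch. It is only a model: the
WORK_* bit values, exit_work() and the handler counting are hypothetical
names made up for this example, and unlikely() is defined here with the
GCC/Clang __builtin_expect builtin rather than taken from the kernel.

```c
#include <stdio.h>

/* Hypothetical work bits; the values are illustrative, not the kernel's. */
#define WORK_NOTIFY	(1UL << 0)
#define WORK_IO_BITMAP	(1UL << 1)
#define WORK_FPU_LOAD	(1UL << 2)

/* Branch hint, assuming a GCC- or Clang-compatible compiler. */
#define unlikely(x)	__builtin_expect(!!(x), 0)

/*
 * Process pending work bits and return how many handlers would have run.
 * The outer test lets the all-zero case skip every per-bit check.
 */
static unsigned int exit_work(unsigned long work)
{
	unsigned int handlers_run = 0;

	if (unlikely(work)) {
		if (work & WORK_NOTIFY)
			handlers_run++;		/* stand-in for a notifier call */

		if (unlikely(work & WORK_IO_BITMAP))
			handlers_run++;		/* stand-in for a bitmap update */

		if (unlikely(work & WORK_FPU_LOAD))
			handlers_run++;		/* stand-in for an FPU reload */
	}

	return handlers_run;
}

/* Small helper that prints the result for a given work word. */
static void show_exit_work(unsigned long work)
{
	printf("work=%#lx handlers=%u\n", work, exit_work(work));
}
```

Calling show_exit_work(0) exercises the fast path; passing a word with
one or more WORK_* bits set exercises the slow path. Whether the outer
test pays off depends on how often the word is zero, which the commit
message reports as about 99% for kernel builds.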
From nobody Fri Dec 19 20:54:29 2025
From: "Xin Li (Intel)"
To: linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
	peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com
Subject: [PATCH v1 2/3] x86/msr: Switch between WRMSRNS and WRMSR with the alternatives mechanism
Date: Tue, 6 Aug 2024 22:47:21 -0700
Message-ID: <20240807054722.682375-3-xin@zytor.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20240807054722.682375-1-xin@zytor.com>
References: <20240807054722.682375-1-xin@zytor.com>

From: Andrew Cooper

Per the discussion about FRED MSR writes with the WRMSRNS instruction
[1], use the alternatives mechanism to choose WRMSRNS when it is
available, and otherwise fall back to WRMSR.

[1] https://lore.kernel.org/lkml/15f56e6a-6edd-43d0-8e83-bb6430096514@citrix.com/

Signed-off-by: Andrew Cooper
Signed-off-by: Xin Li (Intel)
---
 arch/x86/include/asm/msr.h | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/msr.h b/arch/x86/include/asm/msr.h
index d642037f9ed5..3e402d717815 100644
--- a/arch/x86/include/asm/msr.h
+++ b/arch/x86/include/asm/msr.h
@@ -99,19 +99,6 @@ static __always_inline void __wrmsr(unsigned int msr, u32 low, u32 high)
 		     : : "c" (msr), "a"(low), "d" (high) : "memory");
 }
 
-/*
- * WRMSRNS behaves exactly like WRMSR with the only difference being
- * that it is not a serializing instruction by default.
- */
-static __always_inline void __wrmsrns(u32 msr, u32 low, u32 high)
-{
-	/* Instruction opcode for WRMSRNS; supported in binutils >= 2.40. */
-	asm volatile("1: .byte 0x0f,0x01,0xc6\n"
-		     "2:\n"
-		     _ASM_EXTABLE_TYPE(1b, 2b, EX_TYPE_WRMSR)
-		     : : "c" (msr), "a"(low), "d" (high));
-}
-
 #define native_rdmsr(msr, val1, val2)			\
 do {							\
 	u64 __val = __rdmsr((msr));			\
@@ -312,9 +299,22 @@ do {							\
 
 #endif	/* !CONFIG_PARAVIRT_XXL */
 
+/* Instruction opcode for WRMSRNS supported in binutils >= 2.40 */
+#define WRMSRNS _ASM_BYTES(0x0f,0x01,0xc6)
+
+/* Non-serializing WRMSR, when available.  Falls back to a serializing WRMSR. */
 static __always_inline void wrmsrns(u32 msr, u64 val)
 {
-	__wrmsrns(msr, val, val >> 32);
+	/*
+	 * WRMSR is 2 bytes.  WRMSRNS is 3 bytes.  Pad WRMSR with a redundant
+	 * DS prefix to avoid a trailing NOP.
+	 */
+	asm volatile("1: "
+		     ALTERNATIVE("ds wrmsr",
+				 WRMSRNS, X86_FEATURE_WRMSRNS)
+		     "2:\n"
+		     _ASM_EXTABLE_TYPE(1b, 2b, EX_TYPE_WRMSR)
+		     : : "c" (msr), "a" ((u32)val), "d" ((u32)(val >> 32)));
 }
 
 /*
-- 
2.45.2
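Two details of the wrmsrns() change in patch 2/3 can be checked in plain
C without touching an MSR. The sketch below is a userspace model, not
kernel code: it lays out the two encodings to show that "ds wrmsr" and
WRMSRNS are both 3 bytes long, and it splits a 64-bit value into the
ECX/EAX/EDX operands the same way the new asm constraints do. The names
struct msr_operands and split_msr_value(), and the MSR index used in any
call, are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* DS segment-override prefix (0x3e) followed by WRMSR (0x0f 0x30). */
static const uint8_t ds_wrmsr_bytes[] = { 0x3e, 0x0f, 0x30 };

/* WRMSRNS encoding, as given by the WRMSRNS macro in the patch. */
static const uint8_t wrmsrns_bytes[] = { 0x0f, 0x01, 0xc6 };

/* Register operands consumed by WRMSR and WRMSRNS. */
struct msr_operands {
	uint32_t ecx;	/* MSR index */
	uint32_t eax;	/* low 32 bits of the value */
	uint32_t edx;	/* high 32 bits of the value */
};

/* Mirror the "c", "a" and "d" constraints of the new wrmsrns(). */
static struct msr_operands split_msr_value(uint32_t msr, uint64_t val)
{
	struct msr_operands ops = {
		.ecx = msr,
		.eax = (uint32_t)val,
		.edx = (uint32_t)(val >> 32),
	};

	return ops;
}

/* Both alternatives occupy the same number of bytes, so no NOP padding. */
static int encodings_have_equal_length(void)
{
	return sizeof(ds_wrmsr_bytes) == sizeof(wrmsrns_bytes);
}
```

Because the original and replacement sequences have equal length, the
alternatives patching can swap one for the other in place, which is the
point of the redundant DS prefix mentioned in the code comment.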
From nobody Fri Dec 19 20:54:29 2025
From: "Xin Li (Intel)"
To: linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
	peterz@infradead.org, andrew.cooper3@citrix.com, seanjc@google.com
Subject: [PATCH v1 3/3] x86/entry: Set FRED RSP0 on return to userspace instead of context switch
Date: Tue, 6 Aug 2024 22:47:22 -0700
Message-ID: <20240807054722.682375-4-xin@zytor.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20240807054722.682375-1-xin@zytor.com>
References: <20240807054722.682375-1-xin@zytor.com>

FRED RSP0 is a per-task constant pointing to the top of the task's
kernel stack for user-level event delivery, and it needs to be updated
when a task is scheduled in.

Introduce a new TI flag, TIF_LOAD_USER_STATES, to track whether FRED
RSP0 needs to be loaded, and do the actual load of FRED RSP0 in
arch_exit_to_user_mode_prepare() if the TI flag is set, thereby avoiding
a fair number of WRMSRs in both KVM and the kernel.

Suggested-by: Sean Christopherson
Signed-off-by: Xin Li (Intel)
---
 arch/x86/include/asm/entry-common.h | 5 +++++
 arch/x86/include/asm/switch_to.h    | 3 +--
 arch/x86/include/asm/thread_info.h  | 2 ++
 arch/x86/kernel/cpu/cpuid-deps.c    | 1 -
 4 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/entry-common.h b/arch/x86/include/asm/entry-common.h
index 4c78b99060b5..ae365579efb3 100644
--- a/arch/x86/include/asm/entry-common.h
+++ b/arch/x86/include/asm/entry-common.h
@@ -51,6 +51,11 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
 		if (ti_work & _TIF_USER_RETURN_NOTIFY)
 			fire_user_return_notifiers();
 
+		if (cpu_feature_enabled(X86_FEATURE_FRED) &&
+		    (ti_work & _TIF_LOAD_USER_STATES))
+			wrmsrns(MSR_IA32_FRED_RSP0,
+				(unsigned long)task_stack_page(current) + THREAD_SIZE);
+
 		if (unlikely(ti_work & _TIF_IO_BITMAP))
 			tss_update_io_bitmap();
 
diff --git a/arch/x86/include/asm/switch_to.h b/arch/x86/include/asm/switch_to.h
index c3bd0c0758c9..a31ea544cc0e 100644
--- a/arch/x86/include/asm/switch_to.h
+++ b/arch/x86/include/asm/switch_to.h
@@ -71,8 +71,7 @@ static inline void update_task_stack(struct task_struct *task)
 	this_cpu_write(cpu_tss_rw.x86_tss.sp1, task->thread.sp0);
 #else
 	if (cpu_feature_enabled(X86_FEATURE_FRED)) {
-		/* WRMSRNS is a baseline feature for FRED. */
-		wrmsrns(MSR_IA32_FRED_RSP0, (unsigned long)task_stack_page(task) + THREAD_SIZE);
+		set_thread_flag(TIF_LOAD_USER_STATES);
 	} else if (cpu_feature_enabled(X86_FEATURE_XENPV)) {
 		/* Xen PV enters the kernel on the thread stack. */
 		load_sp0(task_top_of_stack(task));
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 12da7dfd5ef1..fb51904651c0 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -106,6 +106,7 @@ struct thread_info {
 #define TIF_BLOCKSTEP		25	/* set when we want DEBUGCTLMSR_BTF */
 #define TIF_LAZY_MMU_UPDATES	27	/* task is updating the mmu lazily */
 #define TIF_ADDR32		29	/* 32-bit address space on 64 bits */
+#define TIF_LOAD_USER_STATES	30	/* Load user level states */
 
 #define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
 #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
@@ -128,6 +129,7 @@ struct thread_info {
 #define _TIF_BLOCKSTEP		(1 << TIF_BLOCKSTEP)
 #define _TIF_LAZY_MMU_UPDATES	(1 << TIF_LAZY_MMU_UPDATES)
 #define _TIF_ADDR32		(1 << TIF_ADDR32)
+#define _TIF_LOAD_USER_STATES	(1 << TIF_LOAD_USER_STATES)
 
 /* flags to check in __switch_to() */
 #define _TIF_WORK_CTXSW_BASE					\
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index b7d9f530ae16..8bd84114c2d9 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -83,7 +83,6 @@ static const struct cpuid_dep cpuid_deps[] = {
 	{ X86_FEATURE_AMX_TILE,			X86_FEATURE_XFD       },
 	{ X86_FEATURE_SHSTK,			X86_FEATURE_XSAVES    },
 	{ X86_FEATURE_FRED,			X86_FEATURE_LKGS      },
-	{ X86_FEATURE_FRED,			X86_FEATURE_WRMSRNS   },
 	{}
 };
 
-- 
2.45.2
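
The deferral described in patch 3/3 can be modelled in userspace to see
why it saves MSR writes. In the sketch below a context switch only marks
the incoming task, and the simulated MSR is written on the next return
to user mode. Everything here is hypothetical scaffolding for the
illustration: struct sim_task, struct sim_cpu, sim_switch_to() and
sim_exit_to_user() do not exist in the kernel, and the plain field
rsp0_msr stands in for the real MSR. The sketch also clears the flag
after the write; that is an assumption of this model, since the diff
above does not show where the flag is cleared.

```c
#include <stdint.h>

/* Same bit number the patch assigns to TIF_LOAD_USER_STATES. */
#define SIM_LOAD_USER_STATES	(1UL << 30)

/* Hypothetical stand-in for a task: its flags and top-of-stack address. */
struct sim_task {
	unsigned long flags;
	uint64_t stack_top;
};

/* Hypothetical stand-in for a CPU: current task plus a fake RSP0 MSR. */
struct sim_cpu {
	struct sim_task *current;
	uint64_t rsp0_msr;
	unsigned int msr_writes;
};

/* Context switch: record that RSP0 is stale instead of writing it now. */
static void sim_switch_to(struct sim_cpu *cpu, struct sim_task *next)
{
	cpu->current = next;
	next->flags |= SIM_LOAD_USER_STATES;
}

/* Return to user mode: perform the deferred write only if it is pending. */
static void sim_exit_to_user(struct sim_cpu *cpu)
{
	struct sim_task *task = cpu->current;

	if (task->flags & SIM_LOAD_USER_STATES) {
		/* Assumption of this sketch: clear the flag once handled. */
		task->flags &= ~SIM_LOAD_USER_STATES;
		cpu->rsp0_msr = task->stack_top;
		cpu->msr_writes++;
	}
}
```

With this model, several back-to-back calls to sim_switch_to() followed
by one sim_exit_to_user() result in a single simulated MSR write,
whereas writing at every switch would cost one write per switch. Tasks
that are switched in and out without returning to user mode, such as
kernel threads, never trigger the write in this model.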