Date: Fri, 13 Oct 2023 11:18:27 -0000
From: "tip-bot2 for Brian Gerst"
Sender: tip-bot2@linutronix.de
Reply-to: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Subject: [tip: x86/entry] x86/entry/64: Convert SYSRET validation tests to C
Cc: Brian Gerst, Ingo Molnar, Andy Lutomirski, Borislav Petkov,
    Denys Vlasenko, "H. Peter Anvin", Linus Torvalds, Peter Zijlstra,
    Thomas Gleixner, Josh Poimboeuf, Uros Bizjak, x86@kernel.org,
    linux-kernel@vger.kernel.org
In-Reply-To: <20231011224351.130935-2-brgerst@gmail.com>
References: <20231011224351.130935-2-brgerst@gmail.com>
Message-ID: <169719590773.3135.13248965487116739403.tip-bot2@tip-bot2>
X-Mailing-List: linux-kernel@vger.kernel.org

The following commit has been merged into the x86/entry branch of tip:

Commit-ID:     ca282b486a570a0bfda5c1a4595ace7fa14243bf
Gitweb:        https://git.kernel.org/tip/ca282b486a570a0bfda5c1a4595ace7fa14243bf
Author:        Brian Gerst
AuthorDate:    Wed, 11 Oct 2023 18:43:49 -04:00
Committer:     Ingo Molnar
CommitterDate: Fri, 13 Oct 2023 13:05:28 +02:00

x86/entry/64: Convert SYSRET validation tests to C

No change in functionality expected.

Signed-off-by: Brian Gerst
Signed-off-by: Ingo Molnar
Cc: Andy Lutomirski
Cc: Borislav Petkov
Cc: Brian Gerst
Cc: Denys Vlasenko
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Josh Poimboeuf
Cc: Uros Bizjak
Link: https://lore.kernel.org/r/20231011224351.130935-2-brgerst@gmail.com
---
 arch/x86/entry/common.c        | 43 ++++++++++++++++++++++++++-
 arch/x86/entry/entry_64.S      | 53 +--------------------------------
 arch/x86/include/asm/syscall.h |  2 +-
 3 files changed, 45 insertions(+), 53 deletions(-)

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 0551bcb..9021465 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -71,7 +71,8 @@ static __always_inline bool do_syscall_x32(struct pt_regs *regs, int nr)
 	return false;
 }
 
-__visible noinstr void do_syscall_64(struct pt_regs *regs, int nr)
+/* Returns true to return using SYSRET, or false to use IRET */
+__visible noinstr bool do_syscall_64(struct pt_regs *regs, int nr)
 {
 	add_random_kstack_offset();
 	nr = syscall_enter_from_user_mode(regs, nr);
@@ -85,6 +86,46 @@ __visible noinstr void do_syscall_64(struct pt_regs *regs, int nr)
 
 	instrumentation_end();
 	syscall_exit_to_user_mode(regs);
+
+	/*
+	 * Check that the register state is valid for using SYSRET to exit
+	 * to userspace. Otherwise use the slower but fully capable IRET
+	 * exit path.
+	 */
+
+	/* XEN PV guests always use the IRET path */
+	if (cpu_feature_enabled(X86_FEATURE_XENPV))
+		return false;
+
+	/* SYSRET requires RCX == RIP and R11 == EFLAGS */
+	if (unlikely(regs->cx != regs->ip || regs->r11 != regs->flags))
+		return false;
+
+	/* CS and SS must match the values set in MSR_STAR */
+	if (unlikely(regs->cs != __USER_CS || regs->ss != __USER_DS))
+		return false;
+
+	/*
+	 * On Intel CPUs, SYSRET with non-canonical RCX/RIP will #GP
+	 * in kernel space. This essentially lets the user take over
+	 * the kernel, since userspace controls RSP.
+	 *
+	 * Change top bits to match the most significant bit (47th or 56th bit
+	 * depending on paging mode) in the address.
+	 */
+	if (unlikely(!__is_canonical_address(regs->ip, __VIRTUAL_MASK_SHIFT + 1)))
+		return false;
+
+	/*
+	 * SYSRET cannot restore RF. It can restore TF, but unlike IRET,
+	 * restoring TF results in a trap from userspace immediately after
+	 * SYSRET.
+	 */
+	if (unlikely(regs->flags & (X86_EFLAGS_RF | X86_EFLAGS_TF)))
+		return false;
+
+	/* Use SYSRET to exit to userspace */
+	return true;
 }
 #endif
 
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 7574639..1730640 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -126,57 +126,8 @@ SYM_INNER_LABEL(entry_SYSCALL_64_after_hwframe, SYM_L_GLOBAL)
 	 * In the Xen PV case we must use iret anyway.
 	 */
 
-	ALTERNATIVE "", "jmp swapgs_restore_regs_and_return_to_usermode", \
-		X86_FEATURE_XENPV
-
-	movq RCX(%rsp), %rcx
-	movq RIP(%rsp), %r11
-
-	cmpq %rcx, %r11 /* SYSRET requires RCX == RIP */
-	jne swapgs_restore_regs_and_return_to_usermode
-
-	/*
-	 * On Intel CPUs, SYSRET with non-canonical RCX/RIP will #GP
-	 * in kernel space. This essentially lets the user take over
-	 * the kernel, since userspace controls RSP.
-	 *
-	 * If width of "canonical tail" ever becomes variable, this will need
-	 * to be updated to remain correct on both old and new CPUs.
-	 *
-	 * Change top bits to match most significant bit (47th or 56th bit
-	 * depending on paging mode) in the address.
-	 */
-#ifdef CONFIG_X86_5LEVEL
-	ALTERNATIVE "shl $(64 - 48), %rcx; sar $(64 - 48), %rcx", \
-		"shl $(64 - 57), %rcx; sar $(64 - 57), %rcx", X86_FEATURE_LA57
-#else
-	shl $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx
-	sar $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx
-#endif
-
-	/* If this changed %rcx, it was not canonical */
-	cmpq %rcx, %r11
-	jne swapgs_restore_regs_and_return_to_usermode
-
-	cmpq $__USER_CS, CS(%rsp) /* CS must match SYSRET */
-	jne swapgs_restore_regs_and_return_to_usermode
-
-	movq R11(%rsp), %r11
-	cmpq %r11, EFLAGS(%rsp) /* R11 == RFLAGS */
-	jne swapgs_restore_regs_and_return_to_usermode
-
-	/*
-	 * SYSRET cannot restore RF. It can restore TF, but unlike IRET,
-	 * restoring TF results in a trap from userspace immediately after
-	 * SYSRET.
-	 */
-	testq $(X86_EFLAGS_RF|X86_EFLAGS_TF), %r11
-	jnz swapgs_restore_regs_and_return_to_usermode
-
-	/* nothing to check for RSP */
-
-	cmpq $__USER_DS, SS(%rsp) /* SS must match SYSRET */
-	jne swapgs_restore_regs_and_return_to_usermode
+	ALTERNATIVE "testb %al, %al; jz swapgs_restore_regs_and_return_to_usermode", \
+		"jmp swapgs_restore_regs_and_return_to_usermode", X86_FEATURE_XENPV
 
 	/*
 	 * We win! This label is here just for ease of understanding
diff --git a/arch/x86/include/asm/syscall.h b/arch/x86/include/asm/syscall.h
index c7e25c9..f44e2f9 100644
--- a/arch/x86/include/asm/syscall.h
+++ b/arch/x86/include/asm/syscall.h
@@ -126,7 +126,7 @@ static inline int syscall_get_arch(struct task_struct *task)
 		? AUDIT_ARCH_I386 : AUDIT_ARCH_X86_64;
 }
 
-void do_syscall_64(struct pt_regs *regs, int nr);
+bool do_syscall_64(struct pt_regs *regs, int nr);
 
 #endif /* CONFIG_X86_32 */
 
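
For readers following the conversion, the checks that moved into
do_syscall_64() can be exercised in user space. The sketch below is not
kernel code. struct regs_sketch, sysret_allowed(), is_canonical() and the
SKETCH_* constants are hypothetical stand-ins; the selector and flag values
are the usual x86-64 ones written out by hand as an assumption; and the Xen
PV feature test is reduced to a plain boolean argument. is_canonical()
compares the upper bits directly instead of sign-extending as the removed
shl/sar pair did, which should give the same answer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the pt_regs fields the patch inspects. */
struct regs_sketch {
	uint64_t ip, cx, r11, flags, cs, ss;
};

/* Assumed values: the usual x86-64 user selectors and EFLAGS bits. */
#define SKETCH_USER_CS   0x33ULL
#define SKETCH_USER_DS   0x2bULL
#define SKETCH_EFLAGS_TF (1ULL << 8)
#define SKETCH_EFLAGS_RF (1ULL << 16)

/*
 * An address is canonical for a given width when bits 63..(vaddr_bits - 1)
 * are all zero or all one. vaddr_bits is 48 with 4-level paging and 57
 * with 5-level paging; the valid range for this helper is 1..64.
 */
static bool is_canonical(uint64_t vaddr, unsigned int vaddr_bits)
{
	uint64_t top = vaddr >> (vaddr_bits - 1);
	uint64_t ones = UINT64_MAX >> (vaddr_bits - 1);

	return top == 0 || top == ones;
}

/* Mirrors the order of the checks added to do_syscall_64(). */
static bool sysret_allowed(const struct regs_sketch *regs, bool xen_pv,
			   unsigned int vaddr_bits)
{
	/* Xen PV guests always take the IRET path. */
	if (xen_pv)
		return false;
	/* SYSRET loads RIP from RCX and RFLAGS from R11. */
	if (regs->cx != regs->ip || regs->r11 != regs->flags)
		return false;
	/* CS and SS must be the user selectors. */
	if (regs->cs != SKETCH_USER_CS || regs->ss != SKETCH_USER_DS)
		return false;
	/* A non-canonical return address must go through IRET. */
	if (!is_canonical(regs->ip, vaddr_bits))
		return false;
	/* RF cannot be restored by SYSRET, and TF would trap at once. */
	if (regs->flags & (SKETCH_EFLAGS_RF | SKETCH_EFLAGS_TF))
		return false;
	return true;
}
```

In the patched entry_64.S the boolean result arrives in %al, so the
assembly only needs "testb %al, %al" to choose between the SYSRET and IRET
exits.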