From nobody Sun Feb 8 12:36:59 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 97023C001DC for ; Fri, 21 Jul 2023 16:10:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231969AbjGUQK5 (ORCPT ); Fri, 21 Jul 2023 12:10:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38974 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231422AbjGUQKr (ORCPT ); Fri, 21 Jul 2023 12:10:47 -0400 Received: from mail-qt1-x835.google.com (mail-qt1-x835.google.com [IPv6:2607:f8b0:4864:20::835]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0F92430D1 for ; Fri, 21 Jul 2023 09:10:44 -0700 (PDT) Received: by mail-qt1-x835.google.com with SMTP id d75a77b69052e-4039a2b71c1so15688961cf.0 for ; Fri, 21 Jul 2023 09:10:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1689955842; x=1690560642; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=OTO5jd/nKObBWWPyprEXCBFVCoZH0swGcg3ifta3qUY=; b=klznSxfG7LYALMJrU/cCJd6ae9flDFrCupw/4zMfPDfMc/6pxRyNDDrYO4dpOSm+bf WXpj9XmbpP968pnZn6MZp4JcBVAeYzSuhsBEl/A2v1+1Nuy3kDXvn7CPSHAEbHveLfWb 9axPhnlMUgKebJK/D46Huzk1/2hNHsJTsEov/wyh7p2CNFSZCV2jOvEh9W4ZYk1+TkXs N44GXPanNogkhrgwMJ7lnMvSwEgxMmKjosn5ghfyb543tkcvbkgsRLNvQ6xRkMMJaMMJ uT+4OMFsU8B7S44IP+QEHn8uqtMg9/h7RHO9UK2DXscZT2ipZCj0i8139FStjrdivFWN 0foA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1689955842; x=1690560642; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=OTO5jd/nKObBWWPyprEXCBFVCoZH0swGcg3ifta3qUY=; b=NrqbcEjX/tbwlScmPOnoK4trxL0632WtqKDn3ILnzQXLsfzzOl+O28GHCbTB32kXOb dPIQqESmpLM2vHUJ8wCw1/qxMnOd3R3BAoPZ6o2uD+QcUasfMm4gBi8yMKSjDvNFpCI1 GP7sHOrsewovDnrDLBvYtYpjLMgYbrYtAumlACwLaz7Ke5KKioa+DxUljR9ALGYwc7Uy /hEzz6YxzeMeiXdGZ/aOjzSEYW7u56oLbf1BdnIfWmmRmS2uq6/TawguOvAt4VpI4Rfw tamZYUzjA+Ie3cK4mT2b9qItGSysfZu0jWz3GXBitPWPBkieUu4q4o9XOwLgELnncKyT 7w/g== X-Gm-Message-State: ABy/qLaKMxHwV8QhgDKm0jDuORDBKBSHpi/uAPvqITHq9mhce7uIcs6c y4Ye/jWaVAqABYXHq15M+cocxXjb1Q== X-Google-Smtp-Source: APBJJlH9rAE/jv5DsRe1RD4SCNmmw1j7ePI0a+ooD7kwPoMKpYq9mnQjXJdx4emBAOcxX2FlhhRAPg== X-Received: by 2002:a05:622a:4c:b0:403:999d:afe5 with SMTP id y12-20020a05622a004c00b00403999dafe5mr531794qtw.41.1689955842419; Fri, 21 Jul 2023 09:10:42 -0700 (PDT) Received: from citadel.. (047-026-243-217.res.spectrum.com. [47.26.243.217]) by smtp.gmail.com with ESMTPSA id c16-20020ac87d90000000b004054fbf9273sm311286qtd.80.2023.07.21.09.10.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 21 Jul 2023 09:10:41 -0700 (PDT) From: Brian Gerst To: linux-kernel@vger.kernel.org, x86@kernel.org Cc: Thomas Gleixner , Borislav Petkov , "H . Peter Anvin" , Andy Lutomirski , =?UTF-8?q?Mika=20Penttil=C3=A4?= , Brian Gerst Subject: [PATCH v2 2/6] x86/entry/64: Convert SYSRET validation tests to C Date: Fri, 21 Jul 2023 12:10:13 -0400 Message-ID: <20230721161018.50214-3-brgerst@gmail.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230721161018.50214-1-brgerst@gmail.com> References: <20230721161018.50214-1-brgerst@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Signed-off-by: Brian Gerst --- arch/x86/entry/common.c | 40 ++++++++++++++++++++++++- arch/x86/entry/entry_64.S | 55 ++-------------------------------- arch/x86/include/asm/syscall.h | 2 +- 3 files changed, 42 insertions(+), 55 deletions(-) diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c index 6c2826417b33..cccdd18c8304 100644 --- a/arch/x86/entry/common.c +++ b/arch/x86/entry/common.c @@ -70,7 +70,8 @@ static __always_inline bool do_syscall_x32(struct pt_regs= *regs, int nr) return false; } =20 -__visible noinstr void do_syscall_64(struct pt_regs *regs, int nr) +/* Returns true to return using SYSRET, or false to use IRET */ +__visible noinstr bool do_syscall_64(struct pt_regs *regs, int nr) { add_random_kstack_offset(); nr =3D syscall_enter_from_user_mode(regs, nr); @@ -84,6 +85,43 @@ __visible noinstr void do_syscall_64(struct pt_regs *reg= s, int nr) =20 instrumentation_end(); syscall_exit_to_user_mode(regs); + + /* + * Check that the register state is valid for using SYSRET to exit + * to userspace. Otherwise use the slower but fully capable IRET + * exit path. + */ + + /* XEN PV guests always use IRET path */ + if (cpu_feature_enabled(X86_FEATURE_XENPV)) + return false; + + /* SYSRET requires RCX =3D=3D RIP and R11 =3D=3D EFLAGS */ + if (unlikely(regs->cx !=3D regs->ip || regs->r11 !=3D regs->flags)) + return false; + + /* CS and SS must match the values set in MSR_STAR */ + if (unlikely(regs->cs !=3D __USER_CS || regs->ss !=3D __USER_DS)) + return false; + + /* + * On Intel CPUs, SYSRET with non-canonical RCX/RIP will #GP + * in kernel space. This essentially lets the user take over + * the kernel, since userspace controls RSP. + */ + if (unlikely(!__is_canonical_address(regs->ip, __VIRTUAL_MASK_SHIFT + 1))) + return false; + + /* + * SYSRET cannot restore RF. It can restore TF, but unlike IRET, + * restoring TF results in a trap from userspace immediately after + * SYSRET. + */ + if (unlikely(regs->flags & (X86_EFLAGS_RF | X86_EFLAGS_TF))) + return false; + + /* Use SYSRET to exit to userspace */ + return true; } #endif =20 diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index c01776a51545..b1288e22cae8 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -123,60 +123,9 @@ SYM_INNER_LABEL(entry_SYSCALL_64_after_hwframe, SYM_L_= GLOBAL) * Try to use SYSRET instead of IRET if we're returning to * a completely clean 64-bit userspace context. If we're not, * go to the slow exit path. - * In the Xen PV case we must use iret anyway. */ - - ALTERNATIVE "", "jmp swapgs_restore_regs_and_return_to_usermode", \ - X86_FEATURE_XENPV - - movq RCX(%rsp), %rcx - movq RIP(%rsp), %r11 - - cmpq %rcx, %r11 /* SYSRET requires RCX =3D=3D RIP */ - jne swapgs_restore_regs_and_return_to_usermode - - /* - * On Intel CPUs, SYSRET with non-canonical RCX/RIP will #GP - * in kernel space. This essentially lets the user take over - * the kernel, since userspace controls RSP. - * - * If width of "canonical tail" ever becomes variable, this will need - * to be updated to remain correct on both old and new CPUs. - * - * Change top bits to match most significant bit (47th or 56th bit - * depending on paging mode) in the address. - */ -#ifdef CONFIG_X86_5LEVEL - ALTERNATIVE "shl $(64 - 48), %rcx; sar $(64 - 48), %rcx", \ - "shl $(64 - 57), %rcx; sar $(64 - 57), %rcx", X86_FEATURE_LA57 -#else - shl $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx - sar $(64 - (__VIRTUAL_MASK_SHIFT+1)), %rcx -#endif - - /* If this changed %rcx, it was not canonical */ - cmpq %rcx, %r11 - jne swapgs_restore_regs_and_return_to_usermode - - cmpq $__USER_CS, CS(%rsp) /* CS must match SYSRET */ - jne swapgs_restore_regs_and_return_to_usermode - - movq R11(%rsp), %r11 - cmpq %r11, EFLAGS(%rsp) /* R11 =3D=3D RFLAGS */ - jne swapgs_restore_regs_and_return_to_usermode - - /* - * SYSRET cannot restore RF. It can restore TF, but unlike IRET, - * restoring TF results in a trap from userspace immediately after - * SYSRET. - */ - testq $(X86_EFLAGS_RF|X86_EFLAGS_TF), %r11 - jnz swapgs_restore_regs_and_return_to_usermode - - /* nothing to check for RSP */ - - cmpq $__USER_DS, SS(%rsp) /* SS must match SYSRET */ - jne swapgs_restore_regs_and_return_to_usermode + testb %al, %al + jz swapgs_restore_regs_and_return_to_usermode =20 /* * We win! This label is here just for ease of understanding diff --git a/arch/x86/include/asm/syscall.h b/arch/x86/include/asm/syscall.h index 4fb36fba4b5a..be6c5515e0b9 100644 --- a/arch/x86/include/asm/syscall.h +++ b/arch/x86/include/asm/syscall.h @@ -126,7 +126,7 @@ static inline int syscall_get_arch(struct task_struct *= task) ? AUDIT_ARCH_I386 : AUDIT_ARCH_X86_64; } =20 -void do_syscall_64(struct pt_regs *regs, int nr); +bool do_syscall_64(struct pt_regs *regs, int nr); =20 #endif /* CONFIG_X86_32 */ =20 --=20 2.41.0