From: Xin Li
To: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-edac@vger.kernel.org, linux-hyperv@vger.kernel.org,
    kvm@vger.kernel.org, xen-devel@lists.xenproject.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
    dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
    luto@kernel.org, pbonzini@redhat.com, seanjc@google.com,
    peterz@infradead.org, jgross@suse.com, ravi.v.shankar@intel.com,
    mhiramat@kernel.org, andrew.cooper3@citrix.com,
    jiangshanlai@gmail.com, nik.borisov@suse.com
Subject: [PATCH v12 31/37] x86/fred: Fixup fault on ERETU by jumping to fred_entrypoint_user
Date: Mon, 2 Oct 2023 23:24:52 -0700
Message-Id: <20231003062458.23552-32-xin3.li@intel.com>
In-Reply-To: <20231003062458.23552-1-xin3.li@intel.com>
References: <20231003062458.23552-1-xin3.li@intel.com>

If the stack frame contains an invalid user context (e.g. due to invalid
SS, a non-canonical RIP, etc.) the ERETU instruction will fault (#SS or
#GP).

From a Linux point of view, this really should be considered a user space
failure, so use the standard fault fixup mechanism to intercept the fault,
fix up the exception frame, and redirect execution to
fred_entrypoint_user. The end result is that it appears just as if the
hardware had taken the exception immediately after completing the
transition to user space.

Suggested-by: H. Peter Anvin (Intel)
Tested-by: Shan Kang
Signed-off-by: Xin Li
---

Changes since v8:
* Reflect the FRED spec 5.0 change that ERETS and ERETU add 8 to %rsp
  before popping the return context from the stack.

Changes since v6:
* Add a comment to explain why it is safe to write to the previous FRED
  stack frame. (Lai Jiangshan).

Changes since v5:
* Move the NMI bit from an invalid stack frame, which caused ERETU to
  fault, to the fault handler's stack frame, thus to unblock NMI ASAP if
  NMI is blocked (Lai Jiangshan).
---
 arch/x86/entry/entry_64_fred.S             |  5 +-
 arch/x86/include/asm/extable_fixup_types.h |  4 +-
 arch/x86/mm/extable.c                      | 79 ++++++++++++++++++++++
 3 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/arch/x86/entry/entry_64_fred.S b/arch/x86/entry/entry_64_fred.S
index 5781c3411b44..d1c2fc4af8ae 100644
--- a/arch/x86/entry/entry_64_fred.S
+++ b/arch/x86/entry/entry_64_fred.S
@@ -3,6 +3,7 @@
  * The actual FRED entry points.
  */
 
+#include
 #include
 
 #include "calling.h"
@@ -34,7 +35,9 @@ SYM_CODE_START_NOALIGN(asm_fred_entrypoint_user)
 	call	fred_entry_from_user
 SYM_INNER_LABEL(asm_fred_exit_user, SYM_L_GLOBAL)
 	FRED_EXIT
-	ERETU
+1:	ERETU
+
+	_ASM_EXTABLE_TYPE(1b, asm_fred_entrypoint_user, EX_TYPE_ERETU)
 SYM_CODE_END(asm_fred_entrypoint_user)
 
 	.fill asm_fred_entrypoint_kernel - ., 1, 0xcc
diff --git a/arch/x86/include/asm/extable_fixup_types.h b/arch/x86/include/asm/extable_fixup_types.h
index 991e31cfde94..1585c798a02f 100644
--- a/arch/x86/include/asm/extable_fixup_types.h
+++ b/arch/x86/include/asm/extable_fixup_types.h
@@ -64,6 +64,8 @@
 #define	EX_TYPE_UCOPY_LEN4		(EX_TYPE_UCOPY_LEN | EX_DATA_IMM(4))
 #define	EX_TYPE_UCOPY_LEN8		(EX_TYPE_UCOPY_LEN | EX_DATA_IMM(8))
 
-#define	EX_TYPE_ZEROPAD			20 /* longword load with zeropad on fault */
+#define	EX_TYPE_ZEROPAD			20 /* longword load with zeropad on fault */
+
+#define	EX_TYPE_ERETU			21
 
 #endif
diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c
index 271dcb2deabc..fc40a4e12f3a 100644
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -6,6 +6,7 @@
 #include
 
 #include
+#include
 #include
 #include
 #include
@@ -223,6 +224,80 @@ static bool ex_handler_ucopy_len(const struct exception_table_entry *fixup,
 	return ex_handler_uaccess(fixup, regs, trapnr, fault_address);
 }
 
+#ifdef CONFIG_X86_FRED
+static bool ex_handler_eretu(const struct exception_table_entry *fixup,
+			     struct pt_regs *regs, unsigned long error_code)
+{
+	struct pt_regs *uregs = (struct pt_regs *)
+		(regs->sp - offsetof(struct pt_regs, orig_ax));
+	unsigned short ss = uregs->ss;
+	unsigned short cs = uregs->cs;
+
+	/*
+	 * Move the NMI bit from the invalid stack frame, which caused ERETU
+	 * to fault, to the fault handler's stack frame, thus to unblock NMI
+	 * with the fault handler's ERETS instruction ASAP if NMI is blocked.
+	 */
+	regs->fred_ss.nmi = uregs->fred_ss.nmi;
+
+	/*
+	 * Sync event information to uregs, i.e., the ERETU return frame, but
+	 * is it safe to write to the ERETU return frame which is just above
+	 * current event stack frame?
+	 *
+	 * The RSP used by FRED to push a stack frame is not the value in %rsp,
+	 * it is calculated from %rsp with the following 2 steps:
+	 * 1) RSP = %rsp - (IA32_FRED_CONFIG & 0x1c0)	// Reserve N*64 bytes
+	 * 2) RSP = RSP & ~0x3f		// Align to a 64-byte cache line
+	 * when an event delivery doesn't trigger a stack level change.
+	 *
+	 * Here is an example with N*64 (N=1) bytes reserved:
+	 *
+	 *  64-byte cache line ==>  ______________
+	 *                         |___Reserved___|
+	 *                         |__Event_data__|
+	 *                         |_____SS_______|
+	 *                         |_____RSP______|
+	 *                         |_____FLAGS____|
+	 *                         |_____CS_______|
+	 *                         |_____IP_______|
+	 *  64-byte cache line ==> |__Error_code__| <== ERETU return frame
+	 *                         |______________|
+	 *                         |______________|
+	 *                         |______________|
+	 *                         |______________|
+	 *                         |______________|
+	 *                         |______________|
+	 *                         |______________|
+	 *  64-byte cache line ==> |______________| <== RSP after step 1) and 2)
+	 *                         |___Reserved___|
+	 *                         |__Event_data__|
+	 *                         |_____SS_______|
+	 *                         |_____RSP______|
+	 *                         |_____FLAGS____|
+	 *                         |_____CS_______|
+	 *                         |_____IP_______|
+	 *  64-byte cache line ==> |__Error_code__| <== ERETS return frame
+	 *
+	 * Thus a new FRED stack frame will always be pushed below a previous
+	 * FRED stack frame ((N*64) bytes may be reserved between), and it is
+	 * safe to write to a previous FRED stack frame as they never overlap.
+	 */
+	fred_info(uregs)->edata = fred_event_data(regs);
+	uregs->ssx = regs->ssx;
+	uregs->fred_ss.ss = ss;
+	/* The NMI bit was moved away above */
+	uregs->fred_ss.nmi = 0;
+	uregs->csx = regs->csx;
+	uregs->fred_cs.sl = 0;
+	uregs->fred_cs.wfe = 0;
+	uregs->cs = cs;
+	uregs->orig_ax = error_code;
+
+	return ex_handler_default(fixup, regs);
+}
+#endif
+
 int ex_get_fixup_type(unsigned long ip)
 {
 	const struct exception_table_entry *e = search_exception_tables(ip);
@@ -300,6 +375,10 @@ int fixup_exception(struct pt_regs *regs, int trapnr, unsigned long error_code,
 		return ex_handler_ucopy_len(e, regs, trapnr, fault_addr, reg, imm);
 	case EX_TYPE_ZEROPAD:
 		return ex_handler_zeropad(e, regs, fault_addr);
+#ifdef CONFIG_X86_FRED
+	case EX_TYPE_ERETU:
+		return ex_handler_eretu(e, regs, error_code);
+#endif
 	}
 	BUG();
 }
-- 
2.34.1
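
For readers following the stack-frame comment in ex_handler_eretu(), the
two-step RSP calculation can be modelled in a few lines of user-space C.
This is only a sketch under stated assumptions, not kernel code:
fred_push_base(), fred_frames_disjoint() and the FRED_* macro names are
hypothetical, and the 64-byte frame size is read off the diagram in the
comment (eight 8-byte slots).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the values come from the comment in the patch. */
#define FRED_CONFIG_RESERVE_MASK	0x1c0ULL /* N*64 bytes to reserve */
#define FRED_FRAME_ALIGN		0x3fULL  /* 64-byte cache line    */
#define FRED_FRAME_BYTES		64ULL    /* assumed: 8 slots * 8  */

/*
 * Model of the two steps in the comment: the address below which a new
 * FRED frame is pushed when event delivery does not change stack level.
 */
static uint64_t fred_push_base(uint64_t rsp, uint64_t fred_config)
{
	/* Step 1: reserve N*64 bytes below the current %rsp. */
	uint64_t base = rsp - (fred_config & FRED_CONFIG_RESERVE_MASK);

	/* Step 2: align down to a 64-byte cache line. */
	return base & ~FRED_FRAME_ALIGN;
}

/*
 * prev_frame is the address of the lowest slot (the error code) of the
 * previous frame, i.e. %rsp at the time ERETU faulted. The new frame
 * occupies [new_top - 64, new_top); it cannot overlap the previous one
 * as long as new_top <= prev_frame.
 */
static bool fred_frames_disjoint(uint64_t prev_frame, uint64_t fred_config)
{
	uint64_t new_top = fred_push_base(prev_frame, fred_config);
	uint64_t new_bottom = new_top - FRED_FRAME_BYTES;

	return new_bottom < new_top && new_top <= prev_frame;
}
```

With %rsp = 0x10000 and N = 1 (IA32_FRED_CONFIG & 0x1c0 == 0x40), the
model places the top of the new frame at 0xffc0, 64 bytes below the
faulting ERETU frame, which matches the diagram. Both steps only ever
move the address downwards, which is why the write to uregs is safe.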