From nobody Mon Apr 27 10:42:20 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F2AC2C43334 for ; Tue, 14 Jun 2022 12:01:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1355778AbiFNMBr (ORCPT ); Tue, 14 Jun 2022 08:01:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51780 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1353519AbiFNMBj (ORCPT ); Tue, 14 Jun 2022 08:01:39 -0400 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 20F3D47AEE for ; Tue, 14 Jun 2022 05:01:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1655208098; x=1686744098; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=+YKWMSbb5ilC548cSjiQyw/GUTbjzrOI6q8bG06K3nw=; b=Kd83AkEE9wW0uZnnPr++k7jOXEybJuCjfS6JRSlKmrkC0uoUdPVpGrrx 5YeF9SKSNIB7d8Oiab/Y4ZyoW87JqbXVItBeNjHlQ6LTuuoPPxtJfgiBN RBsQWrHkK8MwN3MG8slEwSztejVJ4oFhYt5ioiwUSOMHNL54vnwD3oUjz uYp2q6Vv8naqAx9QT6I4GYfUAmYDirCJS5OcEEAvDacWxZMEbYra5Fjap V72VM0BdsUjye0vbADfINRUXQFiHlVLgJ8gJ2nni/XYv2WmElYG9XL8H5 yk2wHJ7dyN0rd932Hpvph7uDWk9Ha6V1AbpOuMznDSm37xP+moPKqf4ub Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10377"; a="276137977" X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="276137977" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Jun 2022 05:01:37 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,300,1647327600"; d="scan'208";a="761935995" Received: from black.fi.intel.com ([10.237.72.28]) by orsmga005.jf.intel.com with ESMTP; 14 Jun 2022 05:01:32 -0700 Received: by black.fi.intel.com 
	(Postfix, from userid 1000) id E3C5D18F; Tue, 14 Jun 2022 15:01:36 +0300 (EEST)
From: "Kirill A. Shutemov"
To: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@intel.com, luto@kernel.org, peterz@infradead.org
Cc: ak@linux.intel.com, dan.j.williams@intel.com, david@redhat.com, hpa@zytor.com, linux-kernel@vger.kernel.org, sathyanarayanan.kuppuswamy@linux.intel.com, seanjc@google.com, thomas.lendacky@amd.com, x86@kernel.org, "Kirill A. Shutemov"
Subject: [PATCHv4 1/3] x86/tdx: Fix early #VE handling
Date: Tue, 14 Jun 2022 15:01:33 +0300
Message-Id: <20220614120135.14812-2-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220614120135.14812-1-kirill.shutemov@linux.intel.com>
References: <20220614120135.14812-1-kirill.shutemov@linux.intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Advance RIP past the handled instruction in tdx_early_handle_ve() after
handling the exception. Failure to do so re-executes the same instruction
and leads to an infinite loop of exceptions.

Signed-off-by: Kirill A. Shutemov
Fixes: 32e72854fa5f ("x86/tdx: Port I/O: Add early boot support")
Reviewed-by: Kuppuswamy Sathyanarayanan
---
 arch/x86/coco/tdx/tdx.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 03deb4d6920d..faae53f8d559 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -447,13 +447,17 @@ static bool handle_io(struct pt_regs *regs, u32 exit_qual)
 __init bool tdx_early_handle_ve(struct pt_regs *regs)
 {
 	struct ve_info ve;
+	bool ret;
 
 	tdx_get_ve_info(&ve);
 
 	if (ve.exit_reason != EXIT_REASON_IO_INSTRUCTION)
 		return false;
 
-	return handle_io(regs, ve.exit_qual);
+	ret = handle_io(regs, ve.exit_qual);
+	if (ret)
+		regs->ip += ve.instr_len;
+	return ret;
 }
 
 void tdx_get_ve_info(struct ve_info *ve)
-- 
2.35.1

From nobody Mon Apr 27 10:42:20 2026
From: "Kirill A. Shutemov"
To: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@intel.com, luto@kernel.org, peterz@infradead.org
Cc: ak@linux.intel.com, dan.j.williams@intel.com, david@redhat.com, hpa@zytor.com, linux-kernel@vger.kernel.org, sathyanarayanan.kuppuswamy@linux.intel.com, seanjc@google.com, thomas.lendacky@amd.com, x86@kernel.org, "Kirill A. Shutemov"
Subject: [PATCHv4 2/3] x86/tdx: Clarify RIP adjustments in #VE handler
Date: Tue, 14 Jun 2022 15:01:34 +0300
Message-Id: <20220614120135.14812-3-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220614120135.14812-1-kirill.shutemov@linux.intel.com>
References: <20220614120135.14812-1-kirill.shutemov@linux.intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

After successful #VE handling, tdx_handle_virt_exception() has to move RIP
to the next instruction.
The handler needs to know the length of the instruction. If the #VE
happened due to instruction execution, the GET_VEINFO TDX module call
provides info on the instruction in R10, including its length. For a #VE
due to an EPT violation, the info in R10 is not usable and the kernel has
to decode the instruction manually to find out its length.

Restructure the code to make it explicit that the instruction length
depends on the type of #VE. The handler for each exit reason returns the
instruction length on success or -errno on failure.

Suggested-by: Dave Hansen
Signed-off-by: Kirill A. Shutemov
---
 arch/x86/coco/tdx/tdx.c | 178 +++++++++++++++++++++++++++-------------
 1 file changed, 123 insertions(+), 55 deletions(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index faae53f8d559..7d6d484a6d28 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -124,6 +124,51 @@ static u64 get_cc_mask(void)
 	return BIT_ULL(gpa_width - 1);
 }
 
+/*
+ * TDX module spec states that #VE may be injected by the Intel TDX module in
+ * several cases:
+ *
+ * - Emulation of the architectural #VE injection on EPT violation;
+ *
+ * - As a result of guest TD execution of a disallowed instruction,
+ *   a disallowed MSR access, or CPUID virtualization;
+ *
+ * - A notification to the guest TD about anomalous behavior;
+ *
+ * The last one is opt-in and is not used by the kernel.
+ *
+ * Intel Software Developer's Manual describes cases when instruction length
+ * field can be used in section "Information for VM Exits Due to Instruction
+ * Execution".
+ *
+ * For TDX, it ultimately means GET_VEINFO provides reliable instruction length
+ * information if #VE occurred due to instruction execution, but not for EPT
+ * violations.
+ */
+static int ve_instr_len(struct ve_info *ve)
+{
+	switch (ve->exit_reason) {
+	case EXIT_REASON_HLT:
+	case EXIT_REASON_MSR_READ:
+	case EXIT_REASON_MSR_WRITE:
+	case EXIT_REASON_CPUID:
+	case EXIT_REASON_IO_INSTRUCTION:
+		/* It is safe to use ve->instr_len for #VE due to instructions */
+		return ve->instr_len;
+	case EXIT_REASON_EPT_VIOLATION:
+		/*
+		 * For EPT violations, ve->instr_len is not defined. For those,
+		 * the kernel must decode instructions manually and should not
+		 * be using this function.
+		 */
+		WARN_ONCE(1, "ve->instr_len is not defined for EPT violations");
+		return 0;
+	default:
+		WARN_ONCE(1, "Unexpected #VE-type: %lld\n", ve->exit_reason);
+		return ve->instr_len;
+	}
+}
+
 static u64 __cpuidle __halt(const bool irq_disabled, const bool do_sti)
 {
 	struct tdx_hypercall_args args = {
@@ -147,7 +192,7 @@ static u64 __cpuidle __halt(const bool irq_disabled, const bool do_sti)
 	return __tdx_hypercall(&args, do_sti ? TDX_HCALL_ISSUE_STI : 0);
 }
 
-static bool handle_halt(void)
+static int handle_halt(struct ve_info *ve)
 {
 	/*
 	 * Since non safe halt is mainly used in CPU offlining
@@ -158,9 +203,9 @@ static bool handle_halt(void)
 	const bool do_sti = false;
 
 	if (__halt(irq_disabled, do_sti))
-		return false;
+		return -EIO;
 
-	return true;
+	return ve_instr_len(ve);
 }
 
 void __cpuidle tdx_safe_halt(void)
@@ -180,7 +225,7 @@ void __cpuidle tdx_safe_halt(void)
 		WARN_ONCE(1, "HLT instruction emulation failed\n");
 }
 
-static bool read_msr(struct pt_regs *regs)
+static int read_msr(struct pt_regs *regs, struct ve_info *ve)
 {
 	struct tdx_hypercall_args args = {
 		.r10 = TDX_HYPERCALL_STANDARD,
@@ -194,14 +239,14 @@ static bool read_msr(struct pt_regs *regs)
 	 * (GHCI), section titled "TDG.VP.VMCALL".
 	 */
 	if (__tdx_hypercall(&args, TDX_HCALL_HAS_OUTPUT))
-		return false;
+		return -EIO;
 
 	regs->ax = lower_32_bits(args.r11);
 	regs->dx = upper_32_bits(args.r11);
-	return true;
+	return ve_instr_len(ve);
 }
 
-static bool write_msr(struct pt_regs *regs)
+static int write_msr(struct pt_regs *regs, struct ve_info *ve)
 {
 	struct tdx_hypercall_args args = {
 		.r10 = TDX_HYPERCALL_STANDARD,
@@ -215,10 +260,13 @@ static bool write_msr(struct pt_regs *regs)
 	 * can be found in TDX Guest-Host-Communication Interface
 	 * (GHCI) section titled "TDG.VP.VMCALL".
 	 */
-	return !__tdx_hypercall(&args, 0);
+	if (__tdx_hypercall(&args, 0))
+		return -EIO;
+
+	return ve_instr_len(ve);
 }
 
-static bool handle_cpuid(struct pt_regs *regs)
+static int handle_cpuid(struct pt_regs *regs, struct ve_info *ve)
 {
 	struct tdx_hypercall_args args = {
 		.r10 = TDX_HYPERCALL_STANDARD,
@@ -236,7 +284,7 @@ static bool handle_cpuid(struct pt_regs *regs)
 	 */
 	if (regs->ax < 0x40000000 || regs->ax > 0x4FFFFFFF) {
 		regs->ax = regs->bx = regs->cx = regs->dx = 0;
-		return true;
+		return ve_instr_len(ve);
 	}
 
 	/*
@@ -245,7 +293,7 @@ static bool handle_cpuid(struct pt_regs *regs)
 	 * (GHCI), section titled "VP.VMCALL".
 	 */
 	if (__tdx_hypercall(&args, TDX_HCALL_HAS_OUTPUT))
-		return false;
+		return -EIO;
 
 	/*
 	 * As per TDX GHCI CPUID ABI, r12-r15 registers contain contents of
@@ -257,7 +305,7 @@ static bool handle_cpuid(struct pt_regs *regs)
 	regs->cx = args.r14;
 	regs->dx = args.r15;
 
-	return true;
+	return ve_instr_len(ve);
 }
 
 static bool mmio_read(int size, unsigned long addr, unsigned long *val)
@@ -283,7 +331,7 @@ static bool mmio_write(int size, unsigned long addr, unsigned long val)
 			       EPT_WRITE, addr, val);
 }
 
-static bool handle_mmio(struct pt_regs *regs, struct ve_info *ve)
+static int handle_mmio(struct pt_regs *regs, struct ve_info *ve)
 {
 	char buffer[MAX_INSN_SIZE];
 	unsigned long *reg, val;
@@ -294,34 +342,36 @@ static bool handle_mmio(struct pt_regs *regs, struct ve_info *ve)
 
 	/* Only in-kernel MMIO is supported */
 	if (WARN_ON_ONCE(user_mode(regs)))
-		return false;
+		return -EFAULT;
 
 	if (copy_from_kernel_nofault(buffer, (void *)regs->ip, MAX_INSN_SIZE))
-		return false;
+		return -EFAULT;
 
 	if (insn_decode(&insn, buffer, MAX_INSN_SIZE, INSN_MODE_64))
-		return false;
+		return -EINVAL;
 
 	mmio = insn_decode_mmio(&insn, &size);
 	if (WARN_ON_ONCE(mmio == MMIO_DECODE_FAILED))
-		return false;
+		return -EINVAL;
 
 	if (mmio != MMIO_WRITE_IMM && mmio != MMIO_MOVS) {
 		reg = insn_get_modrm_reg_ptr(&insn, regs);
 		if (!reg)
-			return false;
+			return -EINVAL;
 	}
 
-	ve->instr_len = insn.length;
-
 	/* Handle writes first */
 	switch (mmio) {
 	case MMIO_WRITE:
 		memcpy(&val, reg, size);
-		return mmio_write(size, ve->gpa, val);
+		if (!mmio_write(size, ve->gpa, val))
+			return -EIO;
+		return insn.length;
 	case MMIO_WRITE_IMM:
 		val = insn.immediate.value;
-		return mmio_write(size, ve->gpa, val);
+		if (!mmio_write(size, ve->gpa, val))
+			return -EIO;
+		return insn.length;
 	case MMIO_READ:
 	case MMIO_READ_ZERO_EXTEND:
 	case MMIO_READ_SIGN_EXTEND:
@@ -334,15 +384,15 @@ static bool handle_mmio(struct pt_regs *regs, struct ve_info *ve)
 		 * decoded or handled properly. It was likely not using io.h
 		 * helpers or accessed MMIO accidentally.
 		 */
-		return false;
+		return -EINVAL;
 	default:
 		WARN_ONCE(1, "Unknown insn_decode_mmio() decode value?");
-		return false;
+		return -EINVAL;
 	}
 
 	/* Handle reads */
 	if (!mmio_read(size, ve->gpa, &val))
-		return false;
+		return -EIO;
 
 	switch (mmio) {
 	case MMIO_READ:
@@ -364,13 +414,13 @@ static bool handle_mmio(struct pt_regs *regs, struct ve_info *ve)
 	default:
 		/* All other cases has to be covered with the first switch() */
 		WARN_ON_ONCE(1);
-		return false;
+		return -EINVAL;
 	}
 
 	if (extend_size)
 		memset(reg, extend_val, extend_size);
 	memcpy(reg, &val, size);
-	return true;
+	return insn.length;
 }
 
 static bool handle_in(struct pt_regs *regs, int size, int port)
@@ -421,13 +471,14 @@ static bool handle_out(struct pt_regs *regs, int size, int port)
  *
  * Return True on success or False on failure.
  */
-static bool handle_io(struct pt_regs *regs, u32 exit_qual)
+static int handle_io(struct pt_regs *regs, struct ve_info *ve)
 {
+	u32 exit_qual = ve->exit_qual;
 	int size, port;
-	bool in;
+	bool in, ret;
 
 	if (VE_IS_IO_STRING(exit_qual))
-		return false;
+		return -EIO;
 
 	in = VE_IS_IO_IN(exit_qual);
 	size = VE_GET_IO_SIZE(exit_qual);
@@ -435,9 +486,13 @@ static bool handle_io(struct pt_regs *regs, u32 exit_qual)
 
 
 	if (in)
-		return handle_in(regs, size, port);
+		ret = handle_in(regs, size, port);
 	else
-		return handle_out(regs, size, port);
+		ret = handle_out(regs, size, port);
+	if (!ret)
+		return -EIO;
+
+	return ve_instr_len(ve);
 }
 
 /*
@@ -447,17 +502,19 @@ static bool handle_io(struct pt_regs *regs, u32 exit_qual)
 __init bool tdx_early_handle_ve(struct pt_regs *regs)
 {
 	struct ve_info ve;
-	bool ret;
+	int insn_len;
 
 	tdx_get_ve_info(&ve);
 
 	if (ve.exit_reason != EXIT_REASON_IO_INSTRUCTION)
 		return false;
 
-	ret = handle_io(regs, ve.exit_qual);
-	if (ret)
-		regs->ip += ve.instr_len;
-	return ret;
+	insn_len = handle_io(regs, &ve);
+	if (insn_len < 0)
+		return false;
+
+	regs->ip += insn_len;
+	return true;
 }
 
 void tdx_get_ve_info(struct ve_info *ve)
@@ -490,54 +547,65 @@ void tdx_get_ve_info(struct ve_info *ve)
 	ve->instr_info = upper_32_bits(out.r10);
 }
 
-/* Handle the user initiated #VE */
-static bool virt_exception_user(struct pt_regs *regs, struct ve_info *ve)
+/*
+ * Handle the user initiated #VE.
+ *
+ * On success, returns the number of bytes RIP should be incremented (>=0)
+ * or -errno on error.
+ */
+static int virt_exception_user(struct pt_regs *regs, struct ve_info *ve)
 {
 	switch (ve->exit_reason) {
 	case EXIT_REASON_CPUID:
-		return handle_cpuid(regs);
+		return handle_cpuid(regs, ve);
 	default:
 		pr_warn("Unexpected #VE: %lld\n", ve->exit_reason);
-		return false;
+		return -EIO;
 	}
 }
 
-/* Handle the kernel #VE */
-static bool virt_exception_kernel(struct pt_regs *regs, struct ve_info *ve)
+/*
+ * Handle the kernel #VE.
+ *
+ * On success, returns the number of bytes RIP should be incremented (>=0)
+ * or -errno on error.
+ */
+static int virt_exception_kernel(struct pt_regs *regs, struct ve_info *ve)
 {
 	switch (ve->exit_reason) {
 	case EXIT_REASON_HLT:
-		return handle_halt();
+		return handle_halt(ve);
 	case EXIT_REASON_MSR_READ:
-		return read_msr(regs);
+		return read_msr(regs, ve);
 	case EXIT_REASON_MSR_WRITE:
-		return write_msr(regs);
+		return write_msr(regs, ve);
 	case EXIT_REASON_CPUID:
-		return handle_cpuid(regs);
+		return handle_cpuid(regs, ve);
 	case EXIT_REASON_EPT_VIOLATION:
 		return handle_mmio(regs, ve);
 	case EXIT_REASON_IO_INSTRUCTION:
-		return handle_io(regs, ve->exit_qual);
+		return handle_io(regs, ve);
 	default:
 		pr_warn("Unexpected #VE: %lld\n", ve->exit_reason);
-		return false;
+		return -EIO;
 	}
 }
 
 bool tdx_handle_virt_exception(struct pt_regs *regs, struct ve_info *ve)
 {
-	bool ret;
+	int insn_len;
 
 	if (user_mode(regs))
-		ret = virt_exception_user(regs, ve);
+		insn_len = virt_exception_user(regs, ve);
 	else
-		ret = virt_exception_kernel(regs, ve);
+		insn_len = virt_exception_kernel(regs, ve);
+	if (insn_len < 0)
+		return false;
 
 	/* After successful #VE handling, move the IP */
-	if (ret)
-		regs->ip += ve->instr_len;
+	regs->ip += insn_len;
 
-	return ret;
+	return true;
 }
 
 static bool tdx_tlb_flush_required(bool private)
-- 
2.35.1

From nobody Mon Apr 27 10:42:20 2026
From: "Kirill A. Shutemov"
To: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@intel.com, luto@kernel.org, peterz@infradead.org
Cc: ak@linux.intel.com, dan.j.williams@intel.com, david@redhat.com, hpa@zytor.com, linux-kernel@vger.kernel.org, sathyanarayanan.kuppuswamy@linux.intel.com, seanjc@google.com, thomas.lendacky@amd.com, x86@kernel.org, "Kirill A. Shutemov"
Subject: [PATCHv4 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page
Date: Tue, 14 Jun 2022 15:01:35 +0300
Message-Id: <20220614120135.14812-4-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220614120135.14812-1-kirill.shutemov@linux.intel.com>
References: <20220614120135.14812-1-kirill.shutemov@linux.intel.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

load_unaligned_zeropad() can lead to unwanted loads across page boundaries.
The unwanted loads are typically harmless. But, they might be made to
totally unrelated or even unmapped memory. load_unaligned_zeropad() relies
on exception fixup (#PF, #GP and now #VE) to recover from these unwanted
loads.

In TDX guests, the second page can be a shared page, and the VMM may
configure it to trigger #VE.

The kernel assumes that #VE on a shared page is an MMIO access and tries to
decode the instruction to handle it. In the case of
load_unaligned_zeropad() this may result in confusion, as it is not an MMIO
access.

Fix it by detecting split page MMIO accesses and failing them.
load_unaligned_zeropad() will recover using exception fixups.

The issue was discovered by analysis. It was not triggered during testing.

Signed-off-by: Kirill A. Shutemov
---
 arch/x86/coco/tdx/tdx.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c
index 7d6d484a6d28..3bcaf2170ede 100644
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -333,8 +333,8 @@ static bool mmio_write(int size, unsigned long addr, unsigned long val)
 
 static int handle_mmio(struct pt_regs *regs, struct ve_info *ve)
 {
+	unsigned long *reg, val, vaddr;
 	char buffer[MAX_INSN_SIZE];
-	unsigned long *reg, val;
 	struct insn insn = {};
 	enum mmio_type mmio;
 	int size, extend_size;
@@ -360,6 +360,19 @@ static int handle_mmio(struct pt_regs *regs, struct ve_info *ve)
 			return -EINVAL;
 	}
 
+	/*
+	 * Reject EPT violation #VEs that split pages.
+	 *
+	 * MMIO accesses are supposed to be naturally aligned and therefore
+	 * never cross a page boundary. A split page access indicates a bug
+	 * or load_unaligned_zeropad() stepping into an unmapped shared page.
+	 *
+	 * load_unaligned_zeropad() will recover using exception fixups.
+	 */
+	vaddr = (unsigned long)insn_get_addr_ref(&insn, regs);
+	if (vaddr / PAGE_SIZE != (vaddr + size - 1) / PAGE_SIZE)
+		return -EFAULT;
+
 	/* Handle writes first */
 	switch (mmio) {
 	case MMIO_WRITE:
-- 
2.35.1
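
A note on the split-page test in patch 3/3. What follows is a minimal userspace sketch of that kind of check, not code from the series. The names access_splits_page() and EXAMPLE_PAGE_SIZE are hypothetical stand-ins introduced only for illustration (the kernel code uses PAGE_SIZE and the decoded operand size). The sketch compares the page of the first byte with the page of the last byte actually touched, vaddr + size - 1, so that a naturally aligned access ending exactly at a page boundary is not counted as a split.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's PAGE_SIZE (4 KiB on x86). */
#define EXAMPLE_PAGE_SIZE 4096UL

/*
 * Return true if an access of `size` bytes starting at `vaddr` touches
 * more than one page.
 *
 * The access covers bytes [vaddr, vaddr + size - 1], so the page of the
 * last byte is computed from vaddr + size - 1, not from vaddr + size.
 */
static bool access_splits_page(uintptr_t vaddr, size_t size)
{
	/* A zero-sized access has no last byte; callers must not pass it. */
	assert(size > 0);

	return vaddr / EXAMPLE_PAGE_SIZE !=
	       (vaddr + size - 1) / EXAMPLE_PAGE_SIZE;
}
```

With 4 KiB pages, an 8-byte access at 0x1ff8 ends at 0x1fff and stays within one page, while the same access at 0x1ffc ends at 0x2003 and would be treated as a split access.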