From: Jeremi Piotrowski
To: linux-kernel@vger.kernel.org
Cc: Wei Liu, Dexuan Cui, Tianyu Lan, Michael Kelley, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org, linux-hyperv@vger.kernel.org, Brijesh Singh, Michael Roth, Ashish Kalra, Tom Lendacky
Subject: [RFC PATCH v2 1/7] x86/hyperv: Allocate RMP table during boot
Date: Mon, 13 Feb 2023 10:33:56 +0000
Message-Id: <20230213103402.1189285-2-jpiotrowski@linux.microsoft.com>
In-Reply-To:
 <20230213103402.1189285-1-jpiotrowski@linux.microsoft.com>
References: <20230213103402.1189285-1-jpiotrowski@linux.microsoft.com>

Hyper-V VMs can host SNP-isolated nested VMs on AMD CPUs. One of the pieces of SNP is the RMP (Reverse Map) table, which tracks page assignment to firmware, hypervisor or guest. On bare metal this table is allocated by UEFI, but on Hyper-V it is the responsibility of the OS to allocate one if necessary. The nested_feature 'HV_X64_NESTED_NO_RMP_TABLE' will be set to communicate that no RMP table is available. The actual RMP table is exclusively controlled by the Hyper-V hypervisor and is not virtualized to the VM. The SNP code in the kernel uses the RMP table for its own tracking, so it is necessary for init code to allocate one.

While not strictly necessary, follow the requirements defined by the "SEV Secure Nested Paging Firmware ABI Specification" Rev 1.54, section 8.8.2 when allocating the RMP:
- RMP_BASE and RMP_END must be set identically across all cores.
- RMP_BASE must be 1 MB aligned.
- RMP_END – RMP_BASE + 1 must be a multiple of 1 MB.
- The RMP is large enough to protect itself.

The allocation is done in the init_mem_mapping() hook, which is the earliest hook I found that has both max_pfn and memblock initialized. At this point we are still under the memblock_set_current_limit(ISA_END_ADDRESS) condition, but explicitly passing the end to memblock_phys_alloc_range() allows us to allocate past that value.

The RMP table is needed when the hypervisor has access to SNP, which can be determined using X86_FEATURE_SEV_SNP, but we need to exclude SNP guests themselves (since SNP guests are not capable of virtualization). This is why we check for cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT).
Signed-off-by: Jeremi Piotrowski
---
 arch/x86/hyperv/hv_init.c          |  5 ++++
 arch/x86/include/asm/hyperv-tlfs.h |  3 ++
 arch/x86/include/asm/mshyperv.h    |  3 ++
 arch/x86/include/asm/sev.h         |  2 ++
 arch/x86/kernel/cpu/mshyperv.c     | 45 ++++++++++++++++++++++++++++++
 arch/x86/kernel/sev.c              |  1 -
 6 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 29774126e931..0c540fff1a20 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -117,6 +117,11 @@ static int hv_cpu_init(unsigned int cpu)
 		}
 	}
 
+	if (hv_needs_snp_rmp()) {
+		wrmsrl(MSR_AMD64_RMP_BASE, rmp_res.start);
+		wrmsrl(MSR_AMD64_RMP_END, rmp_res.end);
+	}
+
 	return hyperv_init_ghcb();
 }
 
diff --git a/arch/x86/include/asm/hyperv-tlfs.h b/arch/x86/include/asm/hyperv-tlfs.h
index e3efaf6e6b62..01cc2c3f9f20 100644
--- a/arch/x86/include/asm/hyperv-tlfs.h
+++ b/arch/x86/include/asm/hyperv-tlfs.h
@@ -152,6 +152,9 @@
  */
 #define HV_X64_NESTED_ENLIGHTENED_TLB	BIT(22)
 
+/* Nested SNP on Hyper-V */
+#define HV_X64_NESTED_NO_RMP_TABLE	BIT(23)
+
 /* HYPERV_CPUID_ISOLATION_CONFIG.EAX bits. */
 #define HV_PARAVISOR_PRESENT		BIT(0)
 
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 61f0c206bff0..3533b002cede 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -190,6 +190,9 @@ static inline void hv_ghcb_terminate(unsigned int set, unsigned int reason) {}
 
 extern bool hv_isolation_type_snp(void);
 
+extern struct resource rmp_res;
+bool hv_needs_snp_rmp(void);
+
 static inline bool hv_is_synic_reg(unsigned int reg)
 {
 	if ((reg >= HV_REGISTER_SCONTROL) &&
diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index 2916f4150ac7..db5438663229 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -83,6 +83,8 @@ extern bool handle_vc_boot_ghcb(struct pt_regs *regs);
 /* RMPUPDATE detected 4K page and 2MB page overlap. */
 #define RMPUPDATE_FAIL_OVERLAP		7
 
+#define RMPTABLE_CPU_BOOKKEEPING_SZ	0x4000
+
 /* RMP page size */
 #define RMP_PG_SIZE_4K	0
 #define RMP_PG_SIZE_2M	1
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 831613959a92..777c9d812dfa 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -17,6 +17,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -31,6 +32,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -488,6 +490,48 @@ static bool __init ms_hyperv_msi_ext_dest_id(void)
 	return eax & HYPERV_VS_PROPERTIES_EAX_EXTENDED_IOAPIC_RTE;
 }
 
+struct resource rmp_res = {
+	.name  = "RMP",
+	.start = 0,
+	.end   = 0,
+	.flags = IORESOURCE_SYSTEM_RAM,
+};
+
+/*
+ * HV_X64_NESTED_NO_RMP_TABLE indicates to the nested hypervisor that no RMP
+ * table is provided/necessary, but kernel code requires access to one so we
+ * use that bit as an indication that we need to allocate one ourselves.
+ */
+bool hv_needs_snp_rmp(void)
+{
+	return IS_ENABLED(CONFIG_KVM_AMD_SEV) &&
+	       boot_cpu_has(X86_FEATURE_SEV_SNP) &&
+	       !cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) &&
+	       (ms_hyperv.nested_features & HV_X64_NESTED_NO_RMP_TABLE);
+}
+
+static void __init ms_hyperv_init_mem_mapping(void)
+{
+	phys_addr_t addr;
+	u64 calc_rmp_sz;
+
+	if (!hv_needs_snp_rmp())
+		return;
+
+	calc_rmp_sz = (max_pfn << 4) + RMPTABLE_CPU_BOOKKEEPING_SZ;
+	calc_rmp_sz = round_up(calc_rmp_sz, SZ_1M);
+	addr = memblock_phys_alloc_range(calc_rmp_sz, SZ_1M, 0, max_pfn << PAGE_SHIFT);
+	if (!addr) {
+		pr_warn("Unable to allocate RMP table\n");
+		return;
+	}
+	rmp_res.start = addr;
+	rmp_res.end = addr + calc_rmp_sz - 1;
+	wrmsrl(MSR_AMD64_RMP_BASE, rmp_res.start);
+	wrmsrl(MSR_AMD64_RMP_END, rmp_res.end);
+	insert_resource(&iomem_resource, &rmp_res);
+}
+
 const __initconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
 	.name			= "Microsoft Hyper-V",
 	.detect			= ms_hyperv_platform,
@@ -495,4 +539,5 @@ const __initconst struct hypervisor_x86 x86_hyper_ms_hyperv = {
 	.init.x2apic_available	= ms_hyperv_x2apic_available,
 	.init.msi_ext_dest_id	= ms_hyperv_msi_ext_dest_id,
 	.init.init_platform	= ms_hyperv_init_platform,
+	.init.init_mem_mapping	= ms_hyperv_init_mem_mapping,
 };
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 1dd1b36bdfea..7fa39dc17edd 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -87,7 +87,6 @@ struct rmpentry {
  * The first 16KB from the RMP_BASE is used by the processor for the
  * bookkeeping, the range needs to be added during the RMP entry lookup.
  */
-#define RMPTABLE_CPU_BOOKKEEPING_SZ	0x4000
 #define RMPENTRY_SHIFT			8
 #define rmptable_page_offset(x)	(RMPTABLE_CPU_BOOKKEEPING_SZ + (((unsigned long)x) >> RMPENTRY_SHIFT))
 
-- 
2.25.1

From: Jeremi Piotrowski
To: linux-kernel@vger.kernel.org
Cc: Wei Liu, Dexuan Cui, Tianyu Lan, Michael Kelley, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org, linux-hyperv@vger.kernel.org, Brijesh Singh, Michael Roth, Ashish Kalra, Tom Lendacky
Subject: [RFC PATCH v2 2/7] x86/sev:
 Add support for NestedVirtSnpMsr
Date: Mon, 13 Feb 2023 10:33:57 +0000
Message-Id: <20230213103402.1189285-3-jpiotrowski@linux.microsoft.com>
In-Reply-To: <20230213103402.1189285-1-jpiotrowski@linux.microsoft.com>
References: <20230213103402.1189285-1-jpiotrowski@linux.microsoft.com>

The rmpupdate and psmash instructions, which are used in AMD's SEV-SNP to update the RMP (Reverse Map) table, can't be trapped. For nested scenarios, AMD defined MSR versions of these instructions which can be trapped and must be emulated by the L0 hypervisor. One instance where these MSRs are used is Hyper-V VMs which expose SNP hardware isolation capabilities to the L1 guest.

The MSRs are defined in "AMD64 Architecture Programmer's Manual, Volume 2: System Programming", section 15.36.19.

Signed-off-by: Jeremi Piotrowski
---
 arch/x86/include/asm/cpufeatures.h |  1 +
 arch/x86/include/asm/msr-index.h   |  2 +
 arch/x86/kernel/sev.c              | 80 ++++++++++++++++++++++++++----
 3 files changed, 73 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 480b4eaef310..e6e2e824f67b 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -423,6 +423,7 @@
 #define X86_FEATURE_SEV_SNP		(19*32+ 4) /* AMD Secure Encrypted Virtualization - Secure Nested Paging */
 #define X86_FEATURE_V_TSC_AUX		(19*32+ 9) /* "" Virtual TSC_AUX */
 #define X86_FEATURE_SME_COHERENT	(19*32+10) /* "" AMD hardware-enforced cache coherency */
+#define X86_FEATURE_NESTED_VIRT_SNP_MSR	(19*32+29) /* Virtualizable RMPUPDATE and PSMASH MSR available */
 
 /*
  * BUG word(s)
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 35100c630617..d6103e607896 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -567,6 +567,8 @@
 #define MSR_AMD64_SEV_SNP_ENABLED	BIT_ULL(MSR_AMD64_SEV_SNP_ENABLED_BIT)
 #define MSR_AMD64_RMP_BASE		0xc0010132
 #define MSR_AMD64_RMP_END		0xc0010133
+#define MSR_AMD64_VIRT_RMPUPDATE	0xc001f001
+#define MSR_AMD64_VIRT_PSMASH		0xc001f002
 
 #define MSR_AMD64_VIRT_SPEC_CTRL	0xc001011f
 
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index 7fa39dc17edd..ad09dd3747a1 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -2566,6 +2566,32 @@ int snp_lookup_rmpentry(u64 pfn, int *level)
 }
 EXPORT_SYMBOL_GPL(snp_lookup_rmpentry);
 
+static bool virt_snp_msr(void)
+{
+	return boot_cpu_has(X86_FEATURE_NESTED_VIRT_SNP_MSR);
+}
+
+/*
+ * This version of psmash is not implemented in hardware but always
+ * traps to the L0 hypervisor. It doesn't follow usual wrmsr conventions.
+ * Inputs:
+ *   rax: 2MB aligned GPA
+ * Outputs:
+ *   rax: psmash return code
+ */
+static u64 virt_psmash(u64 paddr)
+{
+	int ret;
+
+	asm volatile(
+		"wrmsr\n\t"
+		: "=a"(ret)
+		: "a"(paddr), "c"(MSR_AMD64_VIRT_PSMASH)
+		: "memory", "cc"
+	);
+	return ret;
+}
+
 /*
  * psmash is used to smash a 2MB aligned page into 4K
  * pages while preserving the Validated bit in the RMP.
@@ -2581,11 +2607,15 @@ int psmash(u64 pfn)
 	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
 		return -ENXIO;
 
-	/* Binutils version 2.36 supports the PSMASH mnemonic. */
-	asm volatile(".byte 0xF3, 0x0F, 0x01, 0xFF"
-		     : "=a"(ret)
-		     : "a"(paddr)
-		     : "memory", "cc");
+	if (virt_snp_msr()) {
+		ret = virt_psmash(paddr);
+	} else {
+		/* Binutils version 2.36 supports the PSMASH mnemonic. */
+		asm volatile(".byte 0xF3, 0x0F, 0x01, 0xFF"
+			     : "=a"(ret)
+			     : "a"(paddr)
+			     : "memory", "cc");
+	}
 
 	return ret;
 }
@@ -2601,6 +2631,31 @@ static int invalidate_direct_map(unsigned long pfn, int npages)
 	return set_memory_np((unsigned long)pfn_to_kaddr(pfn), npages);
 }
 
+/*
+ * This version of rmpupdate is not implemented in hardware but always
+ * traps to the L0 hypervisor. It doesn't follow usual wrmsr conventions.
+ * Inputs:
+ *   rax: 4KB aligned GPA
+ *   rdx: bytes 7:0 of new rmp entry
+ *   r8:  bytes 15:8 of new rmp entry
+ * Outputs:
+ *   rax: rmpupdate return code
+ */
+static u64 virt_rmpupdate(unsigned long paddr, struct rmp_state *val)
+{
+	int ret;
+	register u64 hi asm("r8") = ((u64 *)val)[1];
+	register u64 lo asm("rdx") = ((u64 *)val)[0];
+
+	asm volatile(
+		"wrmsr\n\t"
+		: "=a"(ret)
+		: "a"(paddr), "c"(MSR_AMD64_VIRT_RMPUPDATE), "r"(lo), "r"(hi)
+		: "memory", "cc"
+	);
+	return ret;
+}
+
 static int rmpupdate(u64 pfn, struct rmp_state *val)
 {
 	unsigned long paddr = pfn << PAGE_SHIFT;
@@ -2626,11 +2681,16 @@ static int rmpupdate(u64 pfn, struct rmp_state *val)
 }
 
 retry:
-	/* Binutils version 2.36 supports the RMPUPDATE mnemonic. */
-	asm volatile(".byte 0xF2, 0x0F, 0x01, 0xFE"
-		     : "=a"(ret)
-		     : "a"(paddr), "c"((unsigned long)val)
-		     : "memory", "cc");
+	if (virt_snp_msr()) {
+		ret = virt_rmpupdate(paddr, val);
+	} else {
+		/* Binutils version 2.36 supports the RMPUPDATE mnemonic. */
+		asm volatile(".byte 0xF2, 0x0F, 0x01, 0xFE"
+			     : "=a"(ret)
+			     : "a"(paddr), "c"((unsigned long)val)
+			     : "memory", "cc");
+	}
 
 	if (ret) {
 		if (!retries) {
-- 
2.25.1

From: Jeremi Piotrowski
To: linux-kernel@vger.kernel.org
Cc: Wei Liu, Dexuan Cui, Tianyu Lan, Michael Kelley, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org, linux-hyperv@vger.kernel.org, Brijesh Singh, Michael Roth, Ashish Kalra, Tom Lendacky
Subject: [RFC PATCH v2 3/7] x86/sev: Maintain shadow
 rmptable on Hyper-V
Date: Mon, 13 Feb 2023 10:33:58 +0000
Message-Id: <20230213103402.1189285-4-jpiotrowski@linux.microsoft.com>
In-Reply-To: <20230213103402.1189285-1-jpiotrowski@linux.microsoft.com>
References: <20230213103402.1189285-1-jpiotrowski@linux.microsoft.com>

Hyper-V can expose the SEV-SNP feature to guests, and manages the system-wide RMP (Reverse Map) table. The SNP implementation in the kernel needs access to the rmptable for tracking pages and deciding when/how to issue rmpupdate/psmash. When running as a Hyper-V guest with SNP support, an rmptable is allocated by the kernel during boot for this purpose. Keep the table in sync with issued rmpupdate/psmash instructions.

The logic for how to update the rmptable comes from "AMD64 Architecture Programmer's Manual, Volume 3", which describes the psmash and rmpupdate instructions. To ensure correctness of the SNP host code, the most important fields are "assigned" and "page size".
Signed-off-by: Jeremi Piotrowski
---
 arch/x86/include/asm/sev.h     |  4 ++
 arch/x86/kernel/cpu/mshyperv.c |  2 +
 arch/x86/kernel/sev.c          | 69 ++++++++++++++++++++++++++++++++++
 3 files changed, 75 insertions(+)

diff --git a/arch/x86/include/asm/sev.h b/arch/x86/include/asm/sev.h
index db5438663229..4d3591ebff5d 100644
--- a/arch/x86/include/asm/sev.h
+++ b/arch/x86/include/asm/sev.h
@@ -218,6 +218,8 @@ int psmash(u64 pfn);
 int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, int asid, bool immutable);
 int rmp_make_shared(u64 pfn, enum pg_level level);
 void sev_dump_rmpentry(u64 pfn);
+bool snp_soft_rmptable(void);
+void __init snp_set_soft_rmptable(void);
 #else
 static inline void sev_es_ist_enter(struct pt_regs *regs) { }
 static inline void sev_es_ist_exit(void) { }
@@ -251,6 +253,8 @@ static inline int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, int as }
 static inline int rmp_make_shared(u64 pfn, enum pg_level level) { return -ENODEV; }
 static inline void sev_dump_rmpentry(u64 pfn) {}
+static inline bool snp_soft_rmptable(void) { return false; }
+static inline void __init snp_set_soft_rmptable(void) {}
 #endif
 
diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index 777c9d812dfa..101c38e9cae7 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -530,6 +530,8 @@ static void __init ms_hyperv_init_mem_mapping(void)
 	wrmsrl(MSR_AMD64_RMP_BASE, rmp_res.start);
 	wrmsrl(MSR_AMD64_RMP_END, rmp_res.end);
 	insert_resource(&iomem_resource, &rmp_res);
+
+	snp_set_soft_rmptable();
 }
 
diff --git a/arch/x86/kernel/sev.c b/arch/x86/kernel/sev.c
index ad09dd3747a1..712f1a9623ce 100644
--- a/arch/x86/kernel/sev.c
+++ b/arch/x86/kernel/sev.c
@@ -2566,6 +2566,22 @@ int snp_lookup_rmpentry(u64 pfn, int *level)
 }
 EXPORT_SYMBOL_GPL(snp_lookup_rmpentry);
 
+static bool soft_rmptable __ro_after_init;
+
+/*
+ * Test if the rmptable needs to be managed by software and is not maintained
+ * by (virtualized) hardware.
+ */
+bool snp_soft_rmptable(void)
+{
+	return soft_rmptable;
+}
+
+void __init snp_set_soft_rmptable(void)
+{
+	soft_rmptable = true;
+}
+
 static bool virt_snp_msr(void)
 {
 	return boot_cpu_has(X86_FEATURE_NESTED_VIRT_SNP_MSR);
@@ -2592,6 +2608,26 @@ static u64 virt_psmash(u64 paddr)
 	return ret;
 }
 
+static void snp_update_rmptable_psmash(u64 pfn)
+{
+	int level;
+	struct rmpentry *entry = __snp_lookup_rmpentry(pfn, &level);
+
+	if (WARN_ON(IS_ERR_OR_NULL(entry)))
+		return;
+
+	if (level == PG_LEVEL_2M) {
+		int i;
+
+		entry->info.pagesize = RMP_PG_SIZE_4K;
+		for (i = 1; i < PTRS_PER_PMD; i++) {
+			struct rmpentry *it = &entry[i];
+			*it = *entry;
+			it->info.gpa = entry->info.gpa + i * PAGE_SIZE;
+		}
+	}
+}
+
 /*
  * psmash is used to smash a 2MB aligned page into 4K
  * pages while preserving the Validated bit in the RMP.
@@ -2609,6 +2645,8 @@ int psmash(u64 pfn)
 
 	if (virt_snp_msr()) {
 		ret = virt_psmash(paddr);
+		if (!ret && snp_soft_rmptable())
+			snp_update_rmptable_psmash(pfn);
 	} else {
 		/* Binutils version 2.36 supports the PSMASH mnemonic. */
 		asm volatile(".byte 0xF3, 0x0F, 0x01, 0xFF"
@@ -2656,6 +2694,35 @@ static u64 virt_rmpupdate(unsigned long paddr, struct rmp_state *val)
 	return ret;
 }
 
+static void snp_update_rmptable_rmpupdate(u64 pfn, int level, struct rmp_state *val)
+{
+	int prev_level;
+	struct rmpentry *entry = __snp_lookup_rmpentry(pfn, &prev_level);
+
+	if (WARN_ON(IS_ERR_OR_NULL(entry)))
+		return;
+
+	if (level > PG_LEVEL_4K) {
+		int i;
+		struct rmpentry tmp_rmp = {
+			.info = {
+				.assigned = val->assigned,
+			},
+		};
+		for (i = 1; i < PTRS_PER_PMD; i++)
+			entry[i] = tmp_rmp;
+	}
+	if (!val->assigned) {
+		memset(entry, 0, sizeof(*entry));
+	} else {
+		entry->info.assigned = val->assigned;
+		entry->info.pagesize = val->pagesize;
+		entry->info.immutable = val->immutable;
+		entry->info.gpa = val->gpa;
+		entry->info.asid = val->asid;
+	}
+}
+
 static int rmpupdate(u64 pfn, struct rmp_state *val)
 {
 	unsigned long paddr = pfn << PAGE_SHIFT;
@@ -2684,6 +2751,8 @@ static int rmpupdate(u64 pfn, struct rmp_state *val)
 
 	if (virt_snp_msr()) {
 		ret = virt_rmpupdate(paddr, val);
+		if (!ret && snp_soft_rmptable())
+			snp_update_rmptable_rmpupdate(pfn, level, val);
 	} else {
 		/* Binutils version 2.36 supports the RMPUPDATE mnemonic. */
 		asm volatile(".byte 0xF2, 0x0F, 0x01, 0xFE"
-- 
2.25.1

From: Jeremi Piotrowski
To: linux-kernel@vger.kernel.org
Cc: Wei Liu, Dexuan Cui, Tianyu Lan, Michael Kelley, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org, linux-hyperv@vger.kernel.org, Brijesh Singh, Michael Roth, Ashish Kalra, Tom Lendacky
Subject: [RFC PATCH v2 4/7] x86/amd: Configure necessary MSRs for SNP during CPU init when running as a guest
Date: Mon, 13 Feb 2023 10:33:59 +0000
Message-Id:
 <20230213103402.1189285-5-jpiotrowski@linux.microsoft.com>
In-Reply-To: <20230213103402.1189285-1-jpiotrowski@linux.microsoft.com>
References: <20230213103402.1189285-1-jpiotrowski@linux.microsoft.com>

Hyper-V may expose the SEV-SNP CPU features to the guest, but it is the guest kernel's responsibility to configure them. early_detect_mem_encrypt() checks SYSCFG[MEM_ENCRYPT] and HWCR[SMMLOCK], and if these are not set the SEV-SNP CPU flags are cleared. These checks are only really necessary on bare metal and provide no value when running virtualized; they prevent further initialization from happening. So check if we are running under a hypervisor, and if so, update SYSCFG and skip the HWCR check.

Signed-off-by: Jeremi Piotrowski
---
 arch/x86/kernel/cpu/amd.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index c7884198ad5b..4418a418109b 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -565,6 +565,9 @@ static void early_detect_mem_encrypt(struct cpuinfo_x86 *c)
	 * don't advertise the feature under CONFIG_X86_32.
	 */
	if (cpu_has(c, X86_FEATURE_SME) || cpu_has(c, X86_FEATURE_SEV)) {
+		if (cpu_has(c, X86_FEATURE_HYPERVISOR))
+			msr_set_bit(MSR_AMD64_SYSCFG, MSR_AMD64_SYSCFG_MEM_ENCRYPT_BIT);
+
		/* Check if memory encryption is enabled */
		rdmsrl(MSR_AMD64_SYSCFG, msr);
		if (!(msr & MSR_AMD64_SYSCFG_MEM_ENCRYPT))
@@ -584,7 +587,7 @@ static void early_detect_mem_encrypt(struct cpuinfo_x86 *c)
		setup_clear_cpu_cap(X86_FEATURE_SME);
 
	rdmsrl(MSR_K7_HWCR, msr);
-	if (!(msr & MSR_K7_HWCR_SMMLOCK))
+	if (!(msr & MSR_K7_HWCR_SMMLOCK) && !cpu_has(c, X86_FEATURE_HYPERVISOR))
		goto clear_sev;
 
	return;
-- 
2.25.1

From: Jeremi Piotrowski
To: linux-kernel@vger.kernel.org
Cc: Wei Liu, Dexuan Cui, Tianyu Lan, Michael Kelley, linux-hyperv@vger.kernel.org, Brijesh Singh, Michael Roth, Ashish Kalra, Tom Lendacky, Joerg Roedel, Suravee Suthikulpanit, iommu@lists.linux.dev
Subject: [RFC PATCH v2 5/7] iommu/amd: Don't fail snp_enable when running virtualized
Date: Mon, 13 Feb 2023 10:34:00 +0000
Message-Id: <20230213103402.1189285-6-jpiotrowski@linux.microsoft.com>
In-Reply-To: <20230213103402.1189285-1-jpiotrowski@linux.microsoft.com>
References: <20230213103402.1189285-1-jpiotrowski@linux.microsoft.com>

Hyper-V VMs do not have access to an IOMMU but can support hosting SNP VMs. amd_iommu_snp_enable() is on the SNP init path and should not fail in that case.

Signed-off-by: Jeremi Piotrowski
---
 drivers/iommu/amd/init.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index d1270e3c5baf..8049dbe78a27 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -3619,6 +3619,12 @@ int amd_iommu_pc_set_reg(struct amd_iommu *iommu, u8 bank, u8 cntr, u8 fxn, u64
 #ifdef CONFIG_AMD_MEM_ENCRYPT
 int amd_iommu_snp_enable(void)
 {
+	/*
+	 * If we're running virtualized there doesn't have to be an IOMMU for SNP to work.
+	 */
+	if (init_state == IOMMU_NOT_FOUND && boot_cpu_has(X86_FEATURE_HYPERVISOR))
+		return 0;
+
	/*
	 * The SNP support requires that IOMMU must be enabled, and is
	 * not configured in the passthrough mode.
--=20 2.25.1 From nobody Fri Sep 12 06:10:00 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 573AFC64EC4 for ; Mon, 13 Feb 2023 10:35:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230078AbjBMKe7 (ORCPT ); Mon, 13 Feb 2023 05:34:59 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41262 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231168AbjBMKeq (ORCPT ); Mon, 13 Feb 2023 05:34:46 -0500 Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 51A861259F; Mon, 13 Feb 2023 02:34:41 -0800 (PST) Received: from vm02.corp.microsoft.com (unknown [167.220.196.155]) by linux.microsoft.com (Postfix) with ESMTPSA id DECB920C8B77; Mon, 13 Feb 2023 02:34:38 -0800 (PST) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com DECB920C8B77 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.microsoft.com; s=default; t=1676284480; bh=xy4n0PsUoqcPr+Y27mq9kcRn+aDYjUAZLw0FdyzM2VM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=CpP2X4BqcHytwtx7qnNrvxPPHrAXu2hHh1Wa7HYW5nAQo+xEO2DaM1YDITuJBBzSu 41WPUDBqw1nTbnH1BqA9pv3TQmFBaXE2svX+DNk9SdfBSvcXGo9cgdBgVR+0OI3fht 8ubwil6ix/Hz8s4QxF5VEtXGYsjG0Ccyq39FqXyA= From: Jeremi Piotrowski To: linux-kernel@vger.kernel.org Cc: Jeremi Piotrowski , Wei Liu , Dexuan Cui , Tianyu Lan , Michael Kelley , linux-hyperv@vger.kernel.org, Brijesh Singh , Michael Roth , Ashish Kalra , Tom Lendacky , linux-crypto@vger.kernel.org Subject: [RFC PATCH v2 6/7] crypto: ccp - Introduce quirk to always reclaim pages after SEV-legacy commands Date: Mon, 13 Feb 2023 10:34:01 +0000 Message-Id: <20230213103402.1189285-7-jpiotrowski@linux.microsoft.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: 
 <20230213103402.1189285-1-jpiotrowski@linux.microsoft.com>
References: <20230213103402.1189285-1-jpiotrowski@linux.microsoft.com>

On Hyper-V, the rmp_mark_pages_shared() call after a SEV_PLATFORM_STATUS
fails with return code 2 (FAIL_PERMISSION) due to the page having the
immutable bit set in the RMP (SNP has been initialized). The comment
above this spot mentions that firmware automatically clears the
immutable bit, but I can't find any mention of this behavior in the SNP
Firmware ABI Spec.

Introduce a quirk to always attempt the page reclaim and set it for the
platform PSP. It would be possible to make this behavior unconditional
as the firmware spec defines that page reclaim results in success if the
page does not have the immutable bit set.

Signed-off-by: Jeremi Piotrowski
---
 drivers/crypto/ccp/sev-dev.c     | 6 +++++-
 drivers/crypto/ccp/sp-dev.h      | 4 ++++
 drivers/crypto/ccp/sp-platform.c | 1 +
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/ccp/sev-dev.c b/drivers/crypto/ccp/sev-dev.c
index 6c4fdcaed72b..4719c0cafa28 100644
--- a/drivers/crypto/ccp/sev-dev.c
+++ b/drivers/crypto/ccp/sev-dev.c
@@ -658,8 +658,12 @@ static int __snp_cmd_buf_copy(int cmd, void *cmd_buf, bool to_fw, int fw_err)
 	 * no not need to reclaim the page.
 	 */
 	if (from_fw && sev_legacy_cmd_buf_writable(cmd)) {
-		if (rmp_mark_pages_shared(__pa(cmd_buf), 1))
+		if (psp_master->vdata->quirks & PSP_QUIRK_ALWAYS_RECLAIM) {
+			if (snp_reclaim_pages(__pa(cmd_buf), 1, true))
+				return -EFAULT;
+		} else if (rmp_mark_pages_shared(__pa(cmd_buf), 1)) {
 			return -EFAULT;
+		}
 
 		/*
 		 * No need to go further if firmware failed to execute command.
 		 */
 		if (fw_err)

diff --git a/drivers/crypto/ccp/sp-dev.h b/drivers/crypto/ccp/sp-dev.h
index c05f1fa82ff4..d50f274462d4 100644
--- a/drivers/crypto/ccp/sp-dev.h
+++ b/drivers/crypto/ccp/sp-dev.h
@@ -28,6 +28,9 @@
 #define CACHE_NONE 0x00
 #define CACHE_WB_NO_ALLOC 0xb7
 
+/* PSP requires a reclaim after every firmware command */
+#define PSP_QUIRK_ALWAYS_RECLAIM	BIT(0)
+
 /* Structure to hold CCP device data */
 struct ccp_device;
 struct ccp_vdata {
@@ -59,6 +62,7 @@ struct psp_vdata {
 	const unsigned int feature_reg;
 	const unsigned int inten_reg;
 	const unsigned int intsts_reg;
+	const unsigned int quirks;
 };
 
 /* Structure to hold SP device data */
diff --git a/drivers/crypto/ccp/sp-platform.c b/drivers/crypto/ccp/sp-platform.c
index 1926efbc7b32..937448f6391a 100644
--- a/drivers/crypto/ccp/sp-platform.c
+++ b/drivers/crypto/ccp/sp-platform.c
@@ -103,6 +103,7 @@ static void sp_platform_fill_vdata(struct sp_dev_vdata *vdata,
 		.feature_reg = pdata->feature_reg,
 		.inten_reg = pdata->irq_en_reg,
 		.intsts_reg = pdata->irq_st_reg,
+		.quirks = PSP_QUIRK_ALWAYS_RECLAIM,
 	};
 
 	memcpy(sev, &sevtmp, sizeof(*sev));
-- 
2.25.1

From nobody Fri Sep 12 06:10:00 2025
From: Jeremi Piotrowski
To: linux-kernel@vger.kernel.org
Cc: Jeremi Piotrowski, Wei Liu, Dexuan Cui, Tianyu Lan, Michael Kelley, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, x86@kernel.org, linux-hyperv@vger.kernel.org, Brijesh Singh, Michael Roth, Ashish Kalra, Tom Lendacky
Subject: [RFC PATCH v2 7/7] x86/fault: Handle RMP faults with 0 address when nested
Date: Mon, 13 Feb 2023 10:34:02 +0000
Message-Id: <20230213103402.1189285-8-jpiotrowski@linux.microsoft.com>
In-Reply-To: <20230213103402.1189285-1-jpiotrowski@linux.microsoft.com>
References: <20230213103402.1189285-1-jpiotrowski@linux.microsoft.com>

When using SNP, accessing an encrypted guest page from the host triggers
an RMP fault. The page fault handling code can currently handle this by
looking up the corresponding rmp entry.

If the same operation happens when using nested virtualization, the L0
hypervisor sees a #NPF but the CPU does not provide the address of the
fault if the CPU was running at L1 at the time of the fault. This
happens on Hyper-V when using nested SNP guests. Hyper-V has no choice
but to use a placeholder address (0) when injecting the page fault to
L1.
We need to handle this, and the only sane thing to do is to forward a
SIGBUS to the task.

One path where this happens is when the SNP guest issues a
KVM_HC_CLOCK_PAIRING hypercall, which leads to KVM calling
kvm_write_guest() on a guest supplied address. This results in the
following backtrace:

[  191.862660]  exc_page_fault+0x71/0x170
[  191.862664]  asm_exc_page_fault+0x2c/0x40
[  191.862666] RIP: 0010:copy_user_enhanced_fast_string+0xa/0x40
...
[  191.862677]  ? __kvm_write_guest_page+0x6e/0xa0 [kvm]
[  191.862700]  kvm_write_guest_page+0x52/0xc0 [kvm]
[  191.862788]  kvm_write_guest+0x44/0x80 [kvm]
[  191.862807]  kvm_emulate_hypercall+0x1ca/0x5a0 [kvm]
[  191.862830]  ? kvm_emulate_monitor+0x40/0x40 [kvm]
[  191.862849]  svm_invoke_exit_handler+0x74/0x180 [kvm_amd]
[  191.862854]  sev_handle_vmgexit+0xf42/0x17f0 [kvm_amd]
[  191.862858]  ? __this_cpu_preempt_check+0x13/0x20
[  191.862860]  ? sev_post_map_gfn+0xf0/0xf0 [kvm_amd]
[  191.862863]  svm_invoke_exit_handler+0x74/0x180 [kvm_amd]
[  191.862866]  svm_handle_exit+0xb5/0x2b0 [kvm_amd]
[  191.862869]  kvm_arch_vcpu_ioctl_run+0x12a8/0x1aa0 [kvm]
[  191.862891]  kvm_vcpu_ioctl+0x24f/0x6d0 [kvm]
[  191.862910]  ? kvm_vm_ioctl_irq_line+0x27/0x40 [kvm]
[  191.862929]  ? _copy_to_user+0x25/0x30
[  191.862932]  ? kvm_vm_ioctl+0x291/0xea0 [kvm]
[  191.862951]  ? kvm_vm_ioctl+0x291/0xea0 [kvm]
[  191.862970]  ? __fget_light+0xc5/0x100
[  191.862972]  __x64_sys_ioctl+0x91/0xc0
[  191.862975]  do_syscall_64+0x5c/0x80
[  191.862976]  ? exit_to_user_mode_prepare+0x53/0x240
[  191.862978]  ? syscall_exit_to_user_mode+0x17/0x40
[  191.862980]  ? do_syscall_64+0x69/0x80
[  191.862981]  ? do_syscall_64+0x69/0x80
[  191.862982]  ? syscall_exit_to_user_mode+0x17/0x40
[  191.862983]  ? do_syscall_64+0x69/0x80
[  191.862984]  ? syscall_exit_to_user_mode+0x17/0x40
[  191.862985]  ? do_syscall_64+0x69/0x80
[  191.862986]  ?
do_syscall_64+0x69/0x80
[  191.862987]  entry_SYSCALL_64_after_hwframe+0x46/0xb0

Without this fix the handler returns without doing anything and the
result is a soft-lockup of the CPU.

Signed-off-by: Jeremi Piotrowski
---
 arch/x86/mm/fault.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index f2b16dcfbd9a..8706fd34f3a9 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -34,6 +34,7 @@
 #include <asm/vdso.h>			/* fixup_vdso_exception() */
 #include <asm/irq_stack.h>
 #include <asm/sev.h>			/* snp_lookup_rmpentry() */
+#include <asm/hypervisor.h>		/* hypervisor_is_type() */
 
 #define CREATE_TRACE_POINTS
 #include <asm/trace/exceptions.h>
@@ -1282,6 +1283,18 @@ static int handle_user_rmp_page_fault(struct pt_regs *regs, unsigned long error_
 	pte_t *pte;
 	u64 pfn;
 
+	/*
+	 * When an rmp fault occurs while not inside the SNP guest, the L0
+	 * hypervisor sees a NPF and does not have access to the address that
+	 * caused the fault to forward to L1 hypervisor. Hyper-V places a 0 in
+	 * the PF as a placeholder. SIGBUS the task since there's nothing
+	 * better that we can do.
+	 */
+	if (!address && hypervisor_is_type(X86_HYPER_MS_HYPERV)) {
+		do_sigbus(regs, error_code, address, VM_FAULT_SIGBUS);
+		return 1;
+	}
+
 	pgd = __va(read_cr3_pa());
 	pgd += pgd_index(address);
 
-- 
2.25.1
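[Editor's sketch, not part of the series] The check added by patch 7/7 reduces to a small predicate: a fault address of 0 under Hyper-V is an L0-injected placeholder, not a real fault address, so the handler must SIGBUS the task rather than walk the page tables. A minimal standalone illustration, with a hypothetical `hyper_type` enum standing in for the kernel's `hypervisor_is_type(X86_HYPER_MS_HYPERV)` probe:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's detected-hypervisor type. */
typedef enum { HYPER_NONE, HYPER_MS_HYPERV } hyper_type;

/*
 * Mirrors the decision in handle_user_rmp_page_fault(): address 0 while
 * running on Hyper-V means L0 could not forward the real fault address,
 * so the caller should deliver SIGBUS instead of looking up the RMP entry.
 */
bool rmp_fault_is_l0_placeholder(unsigned long address, hyper_type h)
{
	return address == 0 && h == HYPER_MS_HYPERV;
}
```

Any non-zero address, or a zero address on a non-Hyper-V platform, falls through to the normal rmp-entry lookup path.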