From: Yosry Ahmed
To: Sean Christopherson
Cc: Paolo Bonzini, kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Yosry Ahmed
Subject: [PATCH v3 10/16] KVM: selftests: Reuse virt mapping functions for nested EPTs
Date: Thu, 27 Nov 2025 01:34:34 +0000
Message-ID: <20251127013440.3324671-11-yosry.ahmed@linux.dev>
In-Reply-To: <20251127013440.3324671-1-yosry.ahmed@linux.dev>
References: <20251127013440.3324671-1-yosry.ahmed@linux.dev>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

__tdp_pg_map() bears a lot of resemblance to __virt_pg_map(). The main
differences are:

- It uses the EPT struct overlay instead of the PTE masks.
- It always assumes 4-level EPTs.

To reuse __virt_pg_map(), initialize the PTE masks in the nested MMU
with EPT PTE masks. EPTs have no 'present' or 'user' bits, so use the
'readable' bit instead, like shadow_{present/user}_mask, ignoring the
fact that entries can be present but not readable if the CPU has
VMX_EPT_EXECUTE_ONLY_BIT. This is simple and sufficient for testing.

Add an executable bitmask and update __virt_pg_map() and friends to set
the bit on newly created entries to match the EPT behavior. It's a nop
for x86 page tables.
Another benefit of reusing the code is having separate handling for
upper-level PTEs vs. 4K PTEs, which avoids some quirks like setting the
large bit on a 4K PTE in the EPTs.

No functional change intended.

Suggested-by: Sean Christopherson
Signed-off-by: Yosry Ahmed
---
 .../selftests/kvm/include/x86/processor.h       |   3 +
 .../testing/selftests/kvm/lib/x86/processor.c   |  12 +-
 tools/testing/selftests/kvm/lib/x86/vmx.c       | 115 ++++--------------
 3 files changed, 33 insertions(+), 97 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/x86/processor.h b/tools/testing/selftests/kvm/include/x86/processor.h
index fb2b2e53d453..62e10b296719 100644
--- a/tools/testing/selftests/kvm/include/x86/processor.h
+++ b/tools/testing/selftests/kvm/include/x86/processor.h
@@ -1447,6 +1447,7 @@ struct pte_masks {
 	uint64_t dirty;
 	uint64_t huge;
 	uint64_t nx;
+	uint64_t x;
 	uint64_t c;
 	uint64_t s;
 };
@@ -1464,6 +1465,7 @@ struct kvm_mmu {
 #define PTE_DIRTY_MASK(mmu)	((mmu)->pte_masks.dirty)
 #define PTE_HUGE_MASK(mmu)	((mmu)->pte_masks.huge)
 #define PTE_NX_MASK(mmu)	((mmu)->pte_masks.nx)
+#define PTE_X_MASK(mmu)		((mmu)->pte_masks.x)
 #define PTE_C_MASK(mmu)		((mmu)->pte_masks.c)
 #define PTE_S_MASK(mmu)		((mmu)->pte_masks.s)
 
@@ -1474,6 +1476,7 @@ struct kvm_mmu {
 #define pte_dirty(mmu, pte)	(!!(*(pte) & PTE_DIRTY_MASK(mmu)))
 #define pte_huge(mmu, pte)	(!!(*(pte) & PTE_HUGE_MASK(mmu)))
 #define pte_nx(mmu, pte)	(!!(*(pte) & PTE_NX_MASK(mmu)))
+#define pte_x(mmu, pte)		(!!(*(pte) & PTE_X_MASK(mmu)))
 #define pte_c(mmu, pte)		(!!(*(pte) & PTE_C_MASK(mmu)))
 #define pte_s(mmu, pte)		(!!(*(pte) & PTE_S_MASK(mmu)))
 
diff --git a/tools/testing/selftests/kvm/lib/x86/processor.c b/tools/testing/selftests/kvm/lib/x86/processor.c
index bff75ff05364..8b0e17f8ca37 100644
--- a/tools/testing/selftests/kvm/lib/x86/processor.c
+++ b/tools/testing/selftests/kvm/lib/x86/processor.c
@@ -162,8 +162,7 @@ struct kvm_mmu *mmu_create(struct kvm_vm *vm, int pgtable_levels,
 	struct kvm_mmu *mmu = calloc(1, sizeof(*mmu));
 
 	TEST_ASSERT(mmu, "-ENOMEM when allocating MMU");
-	if (pte_masks)
-		mmu->pte_masks = *pte_masks;
+	mmu->pte_masks = *pte_masks;
 	mmu->root_gpa = vm_alloc_page_table(vm);
 	mmu->pgtable_levels = pgtable_levels;
 	return mmu;
@@ -179,6 +178,7 @@ static void mmu_init(struct kvm_vm *vm)
 		.dirty		= BIT_ULL(6),
 		.huge		= BIT_ULL(7),
 		.nx		= BIT_ULL(63),
+		.x		= 0,
 		.c		= vm->arch.c_bit,
 		.s		= vm->arch.s_bit,
 	};
@@ -225,7 +225,7 @@ static uint64_t *virt_create_upper_pte(struct kvm_vm *vm,
 	paddr = vm_untag_gpa(vm, paddr);
 
 	if (!pte_present(mmu, pte)) {
-		*pte = PTE_PRESENT_MASK(mmu) | PTE_WRITABLE_MASK(mmu);
+		*pte = PTE_PRESENT_MASK(mmu) | PTE_WRITABLE_MASK(mmu) | PTE_X_MASK(mmu);
 		if (current_level == target_level)
 			*pte |= PTE_HUGE_MASK(mmu) | (paddr & PHYSICAL_PAGE_MASK);
 		else
@@ -271,6 +271,9 @@ void __virt_pg_map(struct kvm_vm *vm, struct kvm_mmu *mmu, uint64_t vaddr,
 	TEST_ASSERT(vm_untag_gpa(vm, paddr) == paddr,
 		    "Unexpected bits in paddr: %lx", paddr);
 
+	TEST_ASSERT(!PTE_X_MASK(mmu) || !PTE_NX_MASK(mmu),
+		    "X and NX bit masks cannot be used simultaneously");
+
 	/*
 	 * Allocate upper level page tables, if not already present.  Return
 	 * early if a hugepage was created.
@@ -288,7 +291,8 @@ void __virt_pg_map(struct kvm_vm *vm, struct kvm_mmu *mmu, uint64_t vaddr,
 	pte = virt_get_pte(vm, mmu, pte, vaddr, PG_LEVEL_4K);
 	TEST_ASSERT(!pte_present(mmu, pte),
 		    "PTE already present for 4k page at vaddr: 0x%lx", vaddr);
-	*pte = PTE_PRESENT_MASK(mmu) | PTE_WRITABLE_MASK(mmu) | (paddr & PHYSICAL_PAGE_MASK);
+	*pte = PTE_PRESENT_MASK(mmu) | PTE_WRITABLE_MASK(mmu) | PTE_X_MASK(mmu)
+		| (paddr & PHYSICAL_PAGE_MASK);
 
 	/*
 	 * Neither SEV nor TDX supports shared page tables, so only the final
diff --git a/tools/testing/selftests/kvm/lib/x86/vmx.c b/tools/testing/selftests/kvm/lib/x86/vmx.c
index a909fad57fd5..0cba31cae896 100644
--- a/tools/testing/selftests/kvm/lib/x86/vmx.c
+++ b/tools/testing/selftests/kvm/lib/x86/vmx.c
@@ -25,21 +25,6 @@ bool enable_evmcs;
 struct hv_enlightened_vmcs *current_evmcs;
 struct hv_vp_assist_page *current_vp_assist;
 
-struct eptPageTableEntry {
-	uint64_t readable:1;
-	uint64_t writable:1;
-	uint64_t executable:1;
-	uint64_t memory_type:3;
-	uint64_t ignore_pat:1;
-	uint64_t page_size:1;
-	uint64_t accessed:1;
-	uint64_t dirty:1;
-	uint64_t ignored_11_10:2;
-	uint64_t address:40;
-	uint64_t ignored_62_52:11;
-	uint64_t suppress_ve:1;
-};
-
 int vcpu_enable_evmcs(struct kvm_vcpu *vcpu)
 {
 	uint16_t evmcs_ver;
@@ -58,12 +43,31 @@ int vcpu_enable_evmcs(struct kvm_vcpu *vcpu)
 
 void vm_enable_ept(struct kvm_vm *vm)
 {
+	struct pte_masks pte_masks;
+
 	TEST_ASSERT(kvm_cpu_has_ept(), "KVM doesn't support nested EPT");
 
 	if (vm->arch.nested.mmu)
 		return;
 
+	/*
+	 * EPTs do not have 'present' or 'user' bits, instead bit 0 is the
+	 * 'readable' bit. In some cases, EPTs can be execute-only and an entry
+	 * is present but not readable. However, for the purposes of testing we
+	 * assume 'present' == 'user' == 'readable' for simplicity.
+	 */
+	pte_masks = (struct pte_masks){
+		.present	= BIT_ULL(0),
+		.user		= BIT_ULL(0),
+		.writable	= BIT_ULL(1),
+		.x		= BIT_ULL(2),
+		.accessed	= BIT_ULL(5),
+		.dirty		= BIT_ULL(6),
+		.huge		= BIT_ULL(7),
+		.nx		= 0,
+	};
+
 	/* EPTP_PWL_4 is always used */
-	vm->arch.nested.mmu = mmu_create(vm, 4, NULL);
+	vm->arch.nested.mmu = mmu_create(vm, 4, &pte_masks);
 }
 
 /* Allocate memory regions for nested VMX tests.
@@ -372,82 +376,6 @@ void prepare_vmcs(struct vmx_pages *vmx, void *guest_rip, void *guest_rsp)
 	init_vmcs_guest_state(guest_rip, guest_rsp);
 }
 
-static void tdp_create_pte(struct kvm_vm *vm,
-			   struct eptPageTableEntry *pte,
-			   uint64_t nested_paddr,
-			   uint64_t paddr,
-			   int current_level,
-			   int target_level)
-{
-	if (!pte->readable) {
-		pte->writable = true;
-		pte->readable = true;
-		pte->executable = true;
-		pte->page_size = (current_level == target_level);
-		if (pte->page_size)
-			pte->address = paddr >> vm->page_shift;
-		else
-			pte->address = vm_alloc_page_table(vm) >> vm->page_shift;
-	} else {
-		/*
-		 * Entry already present. Assert that the caller doesn't want
-		 * a hugepage at this level, and that there isn't a hugepage at
-		 * this level.
-		 */
-		TEST_ASSERT(current_level != target_level,
-			    "Cannot create hugepage at level: %u, nested_paddr: 0x%lx",
-			    current_level, nested_paddr);
-		TEST_ASSERT(!pte->page_size,
-			    "Cannot create page table at level: %u, nested_paddr: 0x%lx",
-			    current_level, nested_paddr);
-	}
-}
-
-
-void __tdp_pg_map(struct kvm_vm *vm, uint64_t nested_paddr, uint64_t paddr,
-		  int target_level)
-{
-	const uint64_t page_size = PG_LEVEL_SIZE(target_level);
-	void *eptp_hva = addr_gpa2hva(vm, vm->arch.nested.mmu->root_gpa);
-	struct eptPageTableEntry *pt = eptp_hva, *pte;
-	uint16_t index;
-
-	TEST_ASSERT(vm->mode == VM_MODE_PXXVYY_4K,
-		    "Unknown or unsupported guest mode: 0x%x", vm->mode);
-
-	TEST_ASSERT((nested_paddr >> 48) == 0,
-		    "Nested physical address 0x%lx is > 48-bits and requires 5-level EPT",
-		    nested_paddr);
-	TEST_ASSERT((nested_paddr % page_size) == 0,
-		    "Nested physical address not on page boundary,\n"
-		    "  nested_paddr: 0x%lx page_size: 0x%lx",
-		    nested_paddr, page_size);
-	TEST_ASSERT((nested_paddr >> vm->page_shift) <= vm->max_gfn,
-		    "Physical address beyond beyond maximum supported,\n"
-		    "  nested_paddr: 0x%lx vm->max_gfn: 0x%lx vm->page_size: 0x%x",
-		    paddr, vm->max_gfn, vm->page_size);
-	TEST_ASSERT((paddr % page_size) == 0,
-		    "Physical address not on page boundary,\n"
-		    "  paddr: 0x%lx page_size: 0x%lx",
-		    paddr, page_size);
-	TEST_ASSERT((paddr >> vm->page_shift) <= vm->max_gfn,
-		    "Physical address beyond beyond maximum supported,\n"
-		    "  paddr: 0x%lx vm->max_gfn: 0x%lx vm->page_size: 0x%x",
-		    paddr, vm->max_gfn, vm->page_size);
-
-	for (int level = PG_LEVEL_512G; level >= PG_LEVEL_4K; level--) {
-		index = (nested_paddr >> PG_LEVEL_SHIFT(level)) & 0x1ffu;
-		pte = &pt[index];
-
-		tdp_create_pte(vm, pte, nested_paddr, paddr, level, target_level);
-
-		if (pte->page_size)
-			break;
-
-		pt = addr_gpa2hva(vm, pte->address * vm->page_size);
-	}
-}
-
 /*
  * Map a range of EPT guest physical addresses to the VM's physical address
  *
@@ -468,6 +396,7 @@ void __tdp_pg_map(struct kvm_vm *vm, uint64_t nested_paddr, uint64_t paddr,
 void __tdp_map(struct kvm_vm *vm, uint64_t nested_paddr, uint64_t paddr,
 	       uint64_t size, int level)
 {
+	struct kvm_mmu *mmu = vm->arch.nested.mmu;
 	size_t page_size = PG_LEVEL_SIZE(level);
 	size_t npages = size / page_size;
 
@@ -475,7 +404,7 @@ void __tdp_map(struct kvm_vm *vm, uint64_t nested_paddr, uint64_t paddr,
 	TEST_ASSERT(paddr + size > paddr, "Paddr overflow");
 
 	while (npages--) {
-		__tdp_pg_map(vm, nested_paddr, paddr, level);
+		__virt_pg_map(vm, mmu, nested_paddr, paddr, level);
 		nested_paddr += page_size;
 		paddr += page_size;
 	}
-- 
2.52.0.158.g65b55ccf14-goog