From: Jinyu Tang
To: Anup Patel, Anup Patel, Paolo Bonzini
Cc: Atish Patra, Paul Walmsley, Paul Walmsley, Palmer Dabbelt, Albert Ou,
 Alexandre Ghiti, Shuah Khan, Sean Christopherson, Radim Krčmář,
 Andrew Jones, Conor Dooley, Yong-Xuan Wang, Nutty Liu,
 kvm@vger.kernel.org, kvm-riscv@lists.infradead.org,
 linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, Jinyu Tang
Subject: [PATCH v2 1/2] KVM: riscv: Check hugetlb block mappings against memslot bounds
Date: Thu, 4 Jun 2026 22:26:01 +0800
Message-ID: <20260604142602.3582602-2-tjytimi@163.com>
In-Reply-To: <20260604142602.3582602-1-tjytimi@163.com>
References: <20260604142602.3582602-1-tjytimi@163.com>

RISC-V KVM has used the hugetlb VMA size directly as the G-stage mapping
size since stage-2 page table support was added. That is safe only if the
block covered by the fault is fully contained in the memslot and the
userspace address has the same offset as the GPA within that block.

The THP path already checks those constraints before installing a PMD block
mapping.
The hugetlb path did not, so an unaligned memslot could make KVM install
a PMD or PUD sized G-stage block that covers memory outside the slot or
maps the wrong host pages.

Pass the target mapping size into fault_supports_gstage_huge_mapping().
The same helper can be used for both THP PMD mappings and hugetlb PMD/PUD
mappings.

Select hugetlb mapping sizes through the same memslot-boundary check,
falling back from PUD to PMD to PAGE_SIZE. When a smaller hugetlb mapping
size is selected, fault the GFN aligned to that selected size instead of
the original VMA size.

Also keep hugetlb mappings out of transparent_hugepage_adjust(). Once the
hugetlb path has chosen PAGE_SIZE, promoting it again through the THP
helper would miss the hugetlb fallback decision.

Fixes: 9d05c1fee837 ("RISC-V: KVM: Implement stage2 page table programming")
Signed-off-by: Jinyu Tang
Reviewed-by: Nutty Liu
Reviewed-by: Anup Patel
---
v1 -> v2:
- Squash the helper parameterization into this hugetlb fix.
- Use the ALIGN()/ALIGN_DOWN() form suggested by Nutty Liu and Anup
  for the memslot boundary check.

 arch/riscv/kvm/mmu.c | 54 ++++++++++++++++++++++++++++++++++++--------
 1 file changed, 44 insertions(+), 10 deletions(-)

diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
index 2d3def024..0adf017a2 100644
--- a/arch/riscv/kvm/mmu.c
+++ b/arch/riscv/kvm/mmu.c
@@ -286,7 +286,8 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 }
 
 static bool fault_supports_gstage_huge_mapping(struct kvm_memory_slot *memslot,
-					       unsigned long hva)
+					       unsigned long hva,
+					       unsigned long map_size)
 {
 	hva_t uaddr_start, uaddr_end;
 	gpa_t gpa_start;
@@ -300,8 +301,8 @@ static bool fault_supports_gstage_huge_mapping(struct kvm_memory_slot *memslot,
 
 	/*
 	 * Pages belonging to memslots that don't have the same alignment
-	 * within a PMD for userspace and GPA cannot be mapped with g-stage
-	 * PMD entries, because we'll end up mapping the wrong pages.
+	 * within a huge page for userspace and GPA cannot be mapped with
+	 * g-stage block entries, because we'll end up mapping the wrong pages.
 	 *
 	 * Consider a layout like the following:
 	 *
@@ -321,7 +322,7 @@ static bool fault_supports_gstage_huge_mapping(struct kvm_memory_slot *memslot,
 	 * e -> g
 	 * f -> h
 	 */
-	if ((gpa_start & (PMD_SIZE - 1)) != (uaddr_start & (PMD_SIZE - 1)))
+	if ((gpa_start & (map_size - 1)) != (uaddr_start & (map_size - 1)))
 		return false;
 
 	/*
@@ -336,7 +337,8 @@ static bool fault_supports_gstage_huge_mapping(struct kvm_memory_slot *memslot,
 	 * userspace_addr or the base_gfn, as both are equally aligned (per
 	 * the check above) and equally sized.
 	 */
-	return (hva >= ALIGN(uaddr_start, PMD_SIZE)) && (hva < ALIGN_DOWN(uaddr_end, PMD_SIZE));
+	return (hva >= ALIGN(uaddr_start, map_size)) &&
+	       (hva < ALIGN_DOWN(uaddr_end, map_size));
 }
 
 static int get_hva_mapping_size(struct kvm *kvm,
@@ -404,7 +406,7 @@ static unsigned long transparent_hugepage_adjust(struct kvm *kvm,
 	 * sure that the HVA and GPA are sufficiently aligned and that the
 	 * block map is contained within the memslot.
 	 */
-	if (fault_supports_gstage_huge_mapping(memslot, hva)) {
+	if (fault_supports_gstage_huge_mapping(memslot, hva, PMD_SIZE)) {
 		int sz;
 
 		sz = get_hva_mapping_size(kvm, hva);
@@ -421,12 +423,33 @@ static unsigned long transparent_hugepage_adjust(struct kvm *kvm,
 	return PAGE_SIZE;
 }
 
+static unsigned long hugetlb_mapping_size(struct kvm_memory_slot *memslot,
+					  unsigned long hva,
+					  unsigned long map_size)
+{
+	switch (map_size) {
+	case PUD_SIZE:
+		if (fault_supports_gstage_huge_mapping(memslot, hva, PUD_SIZE))
+			return PUD_SIZE;
+		fallthrough;
+	case PMD_SIZE:
+		if (fault_supports_gstage_huge_mapping(memslot, hva, PMD_SIZE))
+			return PMD_SIZE;
+		fallthrough;
+	case PAGE_SIZE:
+		return PAGE_SIZE;
+	default:
+		return map_size;
+	}
+}
+
 int kvm_riscv_mmu_map(struct kvm_vcpu *vcpu, struct kvm_memory_slot *memslot,
 		      gpa_t gpa, unsigned long hva, bool is_write,
 		      struct kvm_gstage_mapping *out_map)
 {
 	int ret;
 	kvm_pfn_t hfn;
+	bool is_hugetlb;
 	bool writable;
 	short vma_pageshift;
 	gfn_t gfn = gpa >> PAGE_SHIFT;
@@ -460,16 +483,23 @@ int kvm_riscv_mmu_map(struct kvm_vcpu *vcpu, struct kvm_memory_slot *memslot,
 		return -EFAULT;
 	}
 
-	if (is_vm_hugetlb_page(vma))
+	is_hugetlb = is_vm_hugetlb_page(vma);
+	if (is_hugetlb)
 		vma_pageshift = huge_page_shift(hstate_vma(vma));
 	else
 		vma_pageshift = PAGE_SHIFT;
 	vma_pagesize = 1ULL << vma_pageshift;
 	if (logging || (vma->vm_flags & VM_PFNMAP))
 		vma_pagesize = PAGE_SIZE;
+	else if (is_hugetlb)
+		vma_pagesize = hugetlb_mapping_size(memslot, hva, vma_pagesize);
 
+	/*
+	 * For hugetlb mappings, vma_pagesize might have been reduced from the
+	 * VMA size to a smaller safe mapping size.
+	 */
 	if (vma_pagesize == PMD_SIZE || vma_pagesize == PUD_SIZE)
-		gfn = (gpa & huge_page_mask(hstate_vma(vma))) >> PAGE_SHIFT;
+		gfn = ALIGN_DOWN(gpa, vma_pagesize) >> PAGE_SHIFT;
 
 	/*
 	 * Read mmu_invalidate_seq so that KVM can detect if the results of
@@ -511,8 +541,12 @@ int kvm_riscv_mmu_map(struct kvm_vcpu *vcpu, struct kvm_memory_slot *memslot,
 	if (mmu_invalidate_retry(kvm, mmu_seq))
 		goto out_unlock;
 
-	/* Check if we are backed by a THP and thus use block mapping if possible */
-	if (!logging && (vma_pagesize == PAGE_SIZE))
+	/*
+	 * Check if we are backed by a THP and thus use block mapping if
+	 * possible. Hugetlb mappings already selected their target size above,
+	 * so do not promote them through the THP helper.
+	 */
+	if (!logging && !is_hugetlb && vma_pagesize == PAGE_SIZE)
 		vma_pagesize = transparent_hugepage_adjust(kvm, memslot, hva, &hfn, &gpa);
 
 	if (writable) {
-- 
2.43.0

From: Jinyu Tang
To: Anup Patel, Anup Patel, Paolo Bonzini
Cc: Atish Patra, Paul Walmsley, Paul Walmsley, Palmer Dabbelt, Albert Ou,
 Alexandre Ghiti, Shuah Khan, Sean Christopherson, Radim Krčmář,
 Andrew Jones, Conor Dooley, Yong-Xuan Wang, Nutty Liu,
 kvm@vger.kernel.org, kvm-riscv@lists.infradead.org,
 linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, Jinyu Tang
Subject: [PATCH v2 2/2] KVM: selftests: Add a hugetlb memslot alignment test mode
Date: Thu, 4 Jun 2026 22:26:02 +0800
Message-ID: <20260604142602.3582602-3-tjytimi@163.com>
In-Reply-To: <20260604142602.3582602-1-tjytimi@163.com>
References:
 <20260604142602.3582602-1-tjytimi@163.com>

kvm_page_table_test can already exercise hugetlb-backed guest memory, but
it always creates the test memslot with GPA alignment matching the hugetlb
backing size. That misses the case where a valid hugetlb memslot is later
moved so that the memslot GPA and HVA no longer have the same offset within
the backing huge page.

Add a -u option that moves the test memslot GPA by one guest page after
creating the hugetlb memslot. The memslot is created through the normal
helper first, so the backing allocation remains valid and hugetlb aligned.
Moving the memslot then creates a deliberate HVA/GPA offset mismatch before
the guest mapping is installed.

This mode is useful for checking that architecture MMUs do not install a
block mapping when the block would map the wrong host pages or cover memory
outside the memslot. The option is restricted to hugetlb-backed test memory
because it's specifically about hugetlb block mapping eligibility.

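As an illustration of the mismatch that -u creates, here is a standalone
userspace sketch. It is not part of the selftest or the kernel: the
sketch_* names and the 4 KiB / 2 MiB sizes are assumptions picked for the
example, and real values depend on the architecture and configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed sizes, for illustration only. */
#define SKETCH_PAGE_SIZE (UINT64_C(1) << 12)	/* 4 KiB guest page */
#define SKETCH_PMD_SIZE  (UINT64_C(1) << 21)	/* 2 MiB huge page */

/*
 * Hypothetical helper: true if the slot's base GPA and base HVA have the
 * same offset within a block of map_size bytes.
 */
static bool sketch_same_block_offset(uint64_t gpa_start, uint64_t hva_start,
				     uint64_t map_size)
{
	/* The mask arithmetic below only works for power-of-two sizes. */
	assert(map_size != 0 && (map_size & (map_size - 1)) == 0);

	return (gpa_start & (map_size - 1)) == (hva_start & (map_size - 1));
}

/* Hypothetical helper: models moving the slot's base GPA by one guest page. */
static uint64_t sketch_misalign_gpa(uint64_t gpa_start)
{
	return gpa_start + SKETCH_PAGE_SIZE;
}
```

With a 2 MiB aligned HVA and GPA the offsets match at 2 MiB granularity.
After the one-page move they match only at page granularity, so a 2 MiB
block would pair guest addresses with the wrong host pages.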
Signed-off-by: Jinyu Tang
Reviewed-by: Anup Patel
---
v1 -> v2:
- Keep the selftest change unchanged from v1

 .../selftests/kvm/kvm_page_table_test.c | 28 +++++++++++++++----
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/kvm/kvm_page_table_test.c b/tools/testing/selftests/kvm/kvm_page_table_test.c
index fc5242fb9..a910e3abb 100644
--- a/tools/testing/selftests/kvm/kvm_page_table_test.c
+++ b/tools/testing/selftests/kvm/kvm_page_table_test.c
@@ -230,6 +230,7 @@ struct test_params {
 	u64 phys_offset;
 	u64 test_mem_size;
 	enum vm_mem_backing_src_type src_type;
+	bool misalign_slot_gpa;
 };
 
 static struct kvm_vm *pre_init_before_test(enum vm_guest_mode mode, void *arg)
@@ -244,6 +245,7 @@ static struct kvm_vm *pre_init_before_test(enum vm_guest_mode mode, void *arg)
 	u64 guest_num_pages;
 	u64 alignment;
 	void *host_test_mem;
+	struct userspace_mem_region *region;
 	struct kvm_vm *vm;
 
 	/* Align up the test memory size */
@@ -276,13 +278,22 @@ static struct kvm_vm *pre_init_before_test(enum vm_guest_mode mode, void *arg)
 	/* Add an extra memory slot with specified backing src type */
 	vm_userspace_mem_region_add(vm, src_type, guest_test_phys_mem,
 				    TEST_MEM_SLOT_INDEX, guest_num_pages, 0);
+	region = memslot2region(vm, TEST_MEM_SLOT_INDEX);
+	host_test_mem = region->host_mem;
+
+	if (p->misalign_slot_gpa) {
+		TEST_ASSERT(is_backing_src_hugetlb(src_type),
+			    "Memslot GPA misalignment requires hugetlb backing");
+		TEST_ASSERT(guest_num_pages > 1,
+			    "Need at least two guest pages to misalign memslot GPA");
+
+		guest_test_phys_mem += guest_page_size;
+		vm_mem_region_move(vm, TEST_MEM_SLOT_INDEX, guest_test_phys_mem);
+	}
 
 	/* Do mapping(GVA->GPA) for the testing memory slot */
 	virt_map(vm, guest_test_virt_mem, guest_test_phys_mem, guest_num_pages);
 
-	/* Cache the HVA pointer of the region */
-	host_test_mem = addr_gpa2hva(vm, (gpa_t)guest_test_phys_mem);
-
 	/* Export shared structure test_args to guest */
 	sync_global_to_guest(vm, test_args);
 
@@ -417,8 +428,8 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 static void help(char *name)
 {
 	puts("");
-	printf("usage: %s [-h] [-p offset] [-m mode] "
-	       "[-b mem-size] [-v vcpus] [-s mem-type]\n", name);
+	printf("usage: %s [-h] [-p offset] [-m mode] [-b mem-size]\n", name);
+	printf(" [-v vcpus] [-s mem-type] [-u]\n");
 	puts("");
 	printf(" -p: specify guest physical test memory offset\n"
 	       " Warning: a low offset can conflict with the loaded test code.\n");
@@ -428,6 +439,8 @@ static void help(char *name)
 	printf(" -v: specify the number of vCPUs to run\n"
 	       " (default: 1)\n");
 	backing_src_help("-s");
+	printf(" -u: move the test memslot GPA by one guest page after creating\n"
+	       " the memslot, forcing a hugetlb HVA/GPA offset mismatch\n");
 	puts("");
 }
 
@@ -442,7 +455,7 @@ int main(int argc, char *argv[])
 
 	guest_modes_append_default();
 
-	while ((opt = getopt(argc, argv, "hp:m:b:v:s:")) != -1) {
+	while ((opt = getopt(argc, argv, "hp:m:b:v:s:u")) != -1) {
 		switch (opt) {
 		case 'p':
 			p.phys_offset = strtoull(optarg, NULL, 0);
@@ -461,6 +474,9 @@ int main(int argc, char *argv[])
 		case 's':
 			p.src_type = parse_backing_src_type(optarg);
 			break;
+		case 'u':
+			p.misalign_slot_gpa = true;
+			break;
 		case 'h':
 		default:
 			help(argv[0]);
-- 
2.43.0
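
For readers following the size selection described in patch 1/2, the
PUD -> PMD -> PAGE_SIZE fallback can be modelled in userspace roughly as
below. This is a sketch under stated assumptions, not the kernel code: the
sk_* names and struct sk_slot are hypothetical, the 4 KiB / 2 MiB / 1 GiB
sizes are fixed for illustration, and the memslot is reduced to a base GPA,
a base HVA and a length in bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed sizes, for illustration only. */
#define SK_PAGE_SIZE (UINT64_C(1) << 12)
#define SK_PMD_SIZE  (UINT64_C(1) << 21)
#define SK_PUD_SIZE  (UINT64_C(1) << 30)

/* Hypothetical, simplified stand-in for a memslot. */
struct sk_slot {
	uint64_t gpa_start;
	uint64_t hva_start;
	uint64_t size;		/* bytes */
};

static uint64_t sk_align_down(uint64_t x, uint64_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);	/* power of two only */
	return x & ~(a - 1);
}

static uint64_t sk_align_up(uint64_t x, uint64_t a)
{
	return sk_align_down(x + a - 1, a);
}

/* Both conditions from the commit message: same offset, block inside slot. */
static bool sk_block_ok(const struct sk_slot *s, uint64_t hva, uint64_t map_size)
{
	uint64_t start = s->hva_start;
	uint64_t end = start + s->size;

	if ((s->gpa_start & (map_size - 1)) != (start & (map_size - 1)))
		return false;

	return hva >= sk_align_up(start, map_size) &&
	       hva < sk_align_down(end, map_size);
}

/* Try the backing size first, then fall back to smaller sizes. */
static uint64_t sk_pick_size(const struct sk_slot *s, uint64_t hva,
			     uint64_t backing_size)
{
	switch (backing_size) {
	case SK_PUD_SIZE:
		if (sk_block_ok(s, hva, SK_PUD_SIZE))
			return SK_PUD_SIZE;
		/* fall through */
	case SK_PMD_SIZE:
		if (sk_block_ok(s, hva, SK_PMD_SIZE))
			return SK_PMD_SIZE;
		/* fall through */
	case SK_PAGE_SIZE:
		return SK_PAGE_SIZE;
	default:
		return backing_size;	/* other sizes are passed through */
	}
}
```

In this model a 4 MiB slot backed by 1 GiB pages cannot hold a whole PUD
block, so the selection drops to PMD_SIZE. The same slot with its GPA moved
by one page fails the offset check at both sizes and ends at PAGE_SIZE.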