From: fangyu.yu@linux.alibaba.com
To: pbonzini@redhat.com, corbet@lwn.net, anup@brainfault.org,
    atish.patra@linux.dev, pjw@kernel.org, palmer@dabbelt.com,
    aou@eecs.berkeley.edu, alex@ghiti.fr, skhan@linuxfoundation.org
Cc: guoren@kernel.org, radim.krcmar@oss.qualcomm.com,
    andrew.jones@oss.qualcomm.com, linux-doc@vger.kernel.org,
    kvm@vger.kernel.org, kvm-riscv@lists.infradead.org,
    linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
    Fangyu Yu
Subject: [PATCH v8 1/3] RISC-V: KVM: Support runtime configuration for per-VM's HGATP mode
Date: Fri, 3 Apr 2026 23:30:16 +0800
Message-Id: <20260403153019.9916-2-fangyu.yu@linux.alibaba.com>
In-Reply-To: <20260403153019.9916-1-fangyu.yu@linux.alibaba.com>
References: <20260403153019.9916-1-fangyu.yu@linux.alibaba.com>

From: Fangyu Yu

Introduce a per-VM architecture-specific field to support runtime
configuration of the G-stage page table format:

- kvm->arch.pgd_levels: the number of page table levels for the
  selected mode.

This field replaces the previous global variables kvm_riscv_gstage_mode
and kvm_riscv_gstage_pgd_levels, enabling different virtual machines to
independently select their G-stage page table format instead of being
forced to share the maximum mode detected by the kernel at boot time.

Signed-off-by: Fangyu Yu
Reviewed-by: Andrew Jones
Reviewed-by: Anup Patel
Reviewed-by: Guo Ren
---
 arch/riscv/include/asm/kvm_gstage.h | 37 ++++++++++++----
 arch/riscv/include/asm/kvm_host.h   |  1 +
 arch/riscv/kvm/gstage.c             | 65 ++++++++++++++---------------
 arch/riscv/kvm/main.c               | 12 +++---
 arch/riscv/kvm/mmu.c                | 20 +++++----
 arch/riscv/kvm/vm.c                 |  5 ++-
 arch/riscv/kvm/vmid.c               |  3 +-
 7 files changed, 86 insertions(+), 57 deletions(-)

diff --git a/arch/riscv/include/asm/kvm_gstage.h b/arch/riscv/include/asm/kvm_gstage.h
index 595e2183173e..5aa58d1f692a 100644
--- a/arch/riscv/include/asm/kvm_gstage.h
+++ b/arch/riscv/include/asm/kvm_gstage.h
@@ -29,16 +29,22 @@ struct kvm_gstage_mapping {
 #define kvm_riscv_gstage_index_bits	10
 #endif
 
-extern unsigned long kvm_riscv_gstage_mode;
-extern unsigned long kvm_riscv_gstage_pgd_levels;
+extern unsigned long kvm_riscv_gstage_max_pgd_levels;
 
 #define kvm_riscv_gstage_pgd_xbits	2
 #define kvm_riscv_gstage_pgd_size	(1UL << (HGATP_PAGE_SHIFT + kvm_riscv_gstage_pgd_xbits))
-#define kvm_riscv_gstage_gpa_bits	(HGATP_PAGE_SHIFT + \
-					 (kvm_riscv_gstage_pgd_levels * \
-					  kvm_riscv_gstage_index_bits) + \
-					 kvm_riscv_gstage_pgd_xbits)
-#define kvm_riscv_gstage_gpa_size	((gpa_t)(1ULL << kvm_riscv_gstage_gpa_bits))
+
+static inline unsigned long kvm_riscv_gstage_gpa_bits(unsigned long pgd_levels)
+{
+	return (HGATP_PAGE_SHIFT +
+		pgd_levels * kvm_riscv_gstage_index_bits +
+		kvm_riscv_gstage_pgd_xbits);
+}
+
+static inline gpa_t kvm_riscv_gstage_gpa_size(unsigned long pgd_levels)
+{
+	return BIT_ULL(kvm_riscv_gstage_gpa_bits(pgd_levels));
+}
 
 bool kvm_riscv_gstage_get_leaf(struct kvm_gstage *gstage, gpa_t addr,
 			       pte_t **ptepp, u32 *ptep_level);
@@ -69,4 +75,21 @@ void kvm_riscv_gstage_wp_range(struct kvm_gstage *gstage, gpa_t start, gpa_t end
 
 void kvm_riscv_gstage_mode_detect(void);
 
+static inline unsigned long kvm_riscv_gstage_mode(unsigned long pgd_levels)
+{
+	switch (pgd_levels) {
+	case 2:
+		return HGATP_MODE_SV32X4;
+	case 3:
+		return HGATP_MODE_SV39X4;
+	case 4:
+		return HGATP_MODE_SV48X4;
+	case 5:
+		return HGATP_MODE_SV57X4;
+	default:
+		WARN_ON_ONCE(1);
+		return HGATP_MODE_OFF;
+	}
+}
+
 #endif
diff --git a/arch/riscv/include/asm/kvm_host.h b/arch/riscv/include/asm/kvm_host.h
index 24585304c02b..478f699e9dec 100644
--- a/arch/riscv/include/asm/kvm_host.h
+++ b/arch/riscv/include/asm/kvm_host.h
@@ -94,6 +94,7 @@ struct kvm_arch {
 	/* G-stage page table */
 	pgd_t *pgd;
 	phys_addr_t pgd_phys;
+	unsigned long pgd_levels;
 
 	/* Guest Timer */
 	struct kvm_guest_timer timer;
diff --git a/arch/riscv/kvm/gstage.c b/arch/riscv/kvm/gstage.c
index b67d60d722c2..4beb9322fe76 100644
--- a/arch/riscv/kvm/gstage.c
+++ b/arch/riscv/kvm/gstage.c
@@ -12,22 +12,21 @@
 #include
 
 #ifdef CONFIG_64BIT
-unsigned long kvm_riscv_gstage_mode __ro_after_init = HGATP_MODE_SV39X4;
-unsigned long kvm_riscv_gstage_pgd_levels __ro_after_init = 3;
+unsigned long kvm_riscv_gstage_max_pgd_levels __ro_after_init = 3;
 #else
-unsigned long kvm_riscv_gstage_mode __ro_after_init = HGATP_MODE_SV32X4;
-unsigned long kvm_riscv_gstage_pgd_levels __ro_after_init = 2;
+unsigned long kvm_riscv_gstage_max_pgd_levels __ro_after_init = 2;
 #endif
 
 #define gstage_pte_leaf(__ptep)	\
 	(pte_val(*(__ptep)) & (_PAGE_READ | _PAGE_WRITE | _PAGE_EXEC))
 
-static inline unsigned long gstage_pte_index(gpa_t addr, u32 level)
+static inline unsigned long gstage_pte_index(struct kvm_gstage *gstage,
+					     gpa_t addr, u32 level)
 {
 	unsigned long mask;
 	unsigned long shift = HGATP_PAGE_SHIFT + (kvm_riscv_gstage_index_bits * level);
 
-	if (level == (kvm_riscv_gstage_pgd_levels - 1))
+	if (level == gstage->kvm->arch.pgd_levels - 1)
 		mask = (PTRS_PER_PTE * (1UL << kvm_riscv_gstage_pgd_xbits)) - 1;
 	else
 		mask = PTRS_PER_PTE - 1;
@@ -40,12 +39,13 @@ static inline unsigned long gstage_pte_page_vaddr(pte_t pte)
 	return (unsigned long)pfn_to_virt(__page_val_to_pfn(pte_val(pte)));
 }
 
-static int gstage_page_size_to_level(unsigned long page_size, u32 *out_level)
+static int gstage_page_size_to_level(struct kvm_gstage *gstage, unsigned long page_size,
+				     u32 *out_level)
 {
 	u32 i;
 	unsigned long psz = 1UL << 12;
 
-	for (i = 0; i < kvm_riscv_gstage_pgd_levels; i++) {
+	for (i = 0; i < gstage->kvm->arch.pgd_levels; i++) {
 		if (page_size == (psz << (i * kvm_riscv_gstage_index_bits))) {
 			*out_level = i;
 			return 0;
@@ -55,21 +55,23 @@ static int gstage_page_size_to_level(unsigned long page_size, u32 *out_level)
 	return -EINVAL;
 }
 
-static int gstage_level_to_page_order(u32 level, unsigned long *out_pgorder)
+static int gstage_level_to_page_order(struct kvm_gstage *gstage, u32 level,
+				      unsigned long *out_pgorder)
 {
-	if (kvm_riscv_gstage_pgd_levels < level)
+	if (gstage->kvm->arch.pgd_levels < level)
 		return -EINVAL;
 
 	*out_pgorder = 12 + (level * kvm_riscv_gstage_index_bits);
 	return 0;
 }
 
-static int gstage_level_to_page_size(u32 level, unsigned long *out_pgsize)
+static int gstage_level_to_page_size(struct kvm_gstage *gstage, u32 level,
+				     unsigned long *out_pgsize)
 {
 	int rc;
 	unsigned long page_order = PAGE_SHIFT;
 
-	rc = gstage_level_to_page_order(level, &page_order);
+	rc = gstage_level_to_page_order(gstage, level, &page_order);
 	if (rc)
 		return rc;
 
@@ -81,11 +83,11 @@ bool kvm_riscv_gstage_get_leaf(struct kvm_gstage *gstage, gpa_t addr,
 			       pte_t **ptepp, u32 *ptep_level)
 {
 	pte_t *ptep;
-	u32 current_level = kvm_riscv_gstage_pgd_levels - 1;
+	u32 current_level = gstage->kvm->arch.pgd_levels - 1;
 
 	*ptep_level = current_level;
 	ptep = (pte_t *)gstage->pgd;
-	ptep = &ptep[gstage_pte_index(addr, current_level)];
+	ptep = &ptep[gstage_pte_index(gstage, addr, current_level)];
 	while (ptep && pte_val(ptep_get(ptep))) {
 		if (gstage_pte_leaf(ptep)) {
 			*ptep_level = current_level;
@@ -97,7 +99,7 @@ bool kvm_riscv_gstage_get_leaf(struct kvm_gstage *gstage, gpa_t addr,
 			current_level--;
 			*ptep_level = current_level;
 			ptep = (pte_t *)gstage_pte_page_vaddr(ptep_get(ptep));
-			ptep = &ptep[gstage_pte_index(addr, current_level)];
+			ptep = &ptep[gstage_pte_index(gstage, addr, current_level)];
 		} else {
 			ptep = NULL;
 		}
@@ -110,7 +112,7 @@ static void gstage_tlb_flush(struct kvm_gstage *gstage, u32 level, gpa_t addr)
 {
 	unsigned long order = PAGE_SHIFT;
 
-	if (gstage_level_to_page_order(level, &order))
+	if (gstage_level_to_page_order(gstage, level, &order))
 		return;
 	addr &= ~(BIT(order) - 1);
 
@@ -125,9 +127,9 @@ int kvm_riscv_gstage_set_pte(struct kvm_gstage *gstage,
 			     struct kvm_mmu_memory_cache *pcache,
 			     const struct kvm_gstage_mapping *map)
 {
-	u32 current_level = kvm_riscv_gstage_pgd_levels - 1;
+	u32 current_level = gstage->kvm->arch.pgd_levels - 1;
 	pte_t *next_ptep = (pte_t *)gstage->pgd;
-	pte_t *ptep = &next_ptep[gstage_pte_index(map->addr, current_level)];
+	pte_t *ptep = &next_ptep[gstage_pte_index(gstage, map->addr, current_level)];
 
 	if (current_level < map->level)
 		return -EINVAL;
@@ -151,7 +153,7 @@ int kvm_riscv_gstage_set_pte(struct kvm_gstage *gstage,
 		}
 
 		current_level--;
-		ptep = &next_ptep[gstage_pte_index(map->addr, current_level)];
+		ptep = &next_ptep[gstage_pte_index(gstage, map->addr, current_level)];
 	}
 
 	if (pte_val(*ptep) != pte_val(map->pte)) {
@@ -175,7 +177,7 @@ int kvm_riscv_gstage_map_page(struct kvm_gstage *gstage,
 	out_map->addr = gpa;
 	out_map->level = 0;
 
-	ret = gstage_page_size_to_level(page_size, &out_map->level);
+	ret = gstage_page_size_to_level(gstage, page_size, &out_map->level);
 	if (ret)
 		return ret;
 
@@ -217,7 +219,7 @@ void kvm_riscv_gstage_op_pte(struct kvm_gstage *gstage, gpa_t addr,
 	u32 next_ptep_level;
 	unsigned long next_page_size, page_size;
 
-	ret = gstage_level_to_page_size(ptep_level, &page_size);
+	ret = gstage_level_to_page_size(gstage, ptep_level, &page_size);
 	if (ret)
 		return;
 
@@ -229,7 +231,7 @@ void kvm_riscv_gstage_op_pte(struct kvm_gstage *gstage, gpa_t addr,
 	if (ptep_level && !gstage_pte_leaf(ptep)) {
 		next_ptep = (pte_t *)gstage_pte_page_vaddr(ptep_get(ptep));
 		next_ptep_level = ptep_level - 1;
-		ret = gstage_level_to_page_size(next_ptep_level, &next_page_size);
+		ret = gstage_level_to_page_size(gstage, next_ptep_level, &next_page_size);
 		if (ret)
 			return;
 
@@ -263,7 +265,7 @@ void kvm_riscv_gstage_unmap_range(struct kvm_gstage *gstage,
 
 	while (addr < end) {
 		found_leaf = kvm_riscv_gstage_get_leaf(gstage, addr, &ptep, &ptep_level);
-		ret = gstage_level_to_page_size(ptep_level, &page_size);
+		ret = gstage_level_to_page_size(gstage, ptep_level, &page_size);
 		if (ret)
 			break;
 
@@ -297,7 +299,7 @@ void kvm_riscv_gstage_wp_range(struct kvm_gstage *gstage, gpa_t start, gpa_t end
 
 	while (addr < end) {
 		found_leaf = kvm_riscv_gstage_get_leaf(gstage, addr, &ptep, &ptep_level);
-		ret = gstage_level_to_page_size(ptep_level, &page_size);
+		ret = gstage_level_to_page_size(gstage, ptep_level, &page_size);
 		if (ret)
 			break;
 
@@ -319,39 +321,34 @@ void __init kvm_riscv_gstage_mode_detect(void)
 	/* Try Sv57x4 G-stage mode */
 	csr_write(CSR_HGATP, HGATP_MODE_SV57X4 << HGATP_MODE_SHIFT);
 	if ((csr_read(CSR_HGATP) >> HGATP_MODE_SHIFT) == HGATP_MODE_SV57X4) {
-		kvm_riscv_gstage_mode = HGATP_MODE_SV57X4;
-		kvm_riscv_gstage_pgd_levels = 5;
+		kvm_riscv_gstage_max_pgd_levels = 5;
 		goto done;
 	}
 
	/* Try Sv48x4 G-stage mode */
 	csr_write(CSR_HGATP, HGATP_MODE_SV48X4 << HGATP_MODE_SHIFT);
 	if ((csr_read(CSR_HGATP) >> HGATP_MODE_SHIFT) == HGATP_MODE_SV48X4) {
-		kvm_riscv_gstage_mode = HGATP_MODE_SV48X4;
-		kvm_riscv_gstage_pgd_levels = 4;
+		kvm_riscv_gstage_max_pgd_levels = 4;
 		goto done;
 	}
 
 	/* Try Sv39x4 G-stage mode */
 	csr_write(CSR_HGATP, HGATP_MODE_SV39X4 << HGATP_MODE_SHIFT);
 	if ((csr_read(CSR_HGATP) >> HGATP_MODE_SHIFT) == HGATP_MODE_SV39X4) {
-		kvm_riscv_gstage_mode = HGATP_MODE_SV39X4;
-		kvm_riscv_gstage_pgd_levels = 3;
+		kvm_riscv_gstage_max_pgd_levels = 3;
 		goto done;
 	}
 #else /* CONFIG_32BIT */
 	/* Try Sv32x4 G-stage mode */
 	csr_write(CSR_HGATP, HGATP_MODE_SV32X4 << HGATP_MODE_SHIFT);
 	if ((csr_read(CSR_HGATP) >> HGATP_MODE_SHIFT) == HGATP_MODE_SV32X4) {
-		kvm_riscv_gstage_mode = HGATP_MODE_SV32X4;
-		kvm_riscv_gstage_pgd_levels = 2;
+		kvm_riscv_gstage_max_pgd_levels = 2;
 		goto done;
 	}
 #endif
 
 	/* KVM depends on !HGATP_MODE_OFF */
-	kvm_riscv_gstage_mode = HGATP_MODE_OFF;
-	kvm_riscv_gstage_pgd_levels = 0;
+	kvm_riscv_gstage_max_pgd_levels = 0;
 
 done:
 	csr_write(CSR_HGATP, 0);
diff --git a/arch/riscv/kvm/main.c b/arch/riscv/kvm/main.c
index 0f3fe3986fc0..90ee0a032b9a 100644
--- a/arch/riscv/kvm/main.c
+++ b/arch/riscv/kvm/main.c
@@ -105,17 +105,17 @@ static int __init riscv_kvm_init(void)
 		return rc;
 
 	kvm_riscv_gstage_mode_detect();
-	switch (kvm_riscv_gstage_mode) {
-	case HGATP_MODE_SV32X4:
+	switch (kvm_riscv_gstage_max_pgd_levels) {
+	case 2:
 		str = "Sv32x4";
 		break;
-	case HGATP_MODE_SV39X4:
+	case 3:
 		str = "Sv39x4";
 		break;
-	case HGATP_MODE_SV48X4:
+	case 4:
 		str = "Sv48x4";
 		break;
-	case HGATP_MODE_SV57X4:
+	case 5:
 		str = "Sv57x4";
 		break;
 	default:
@@ -164,7 +164,7 @@ static int __init riscv_kvm_init(void)
 			 (rc) ? slist : "no features");
 	}
 
-	kvm_info("using %s G-stage page table format\n", str);
+	kvm_info("highest G-stage page table mode is %s\n", str);
 
 	kvm_info("VMID %ld bits available\n", kvm_riscv_gstage_vmid_bits());
 
diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
index 088d33ba90ed..fbcdd75cb9af 100644
--- a/arch/riscv/kvm/mmu.c
+++ b/arch/riscv/kvm/mmu.c
@@ -67,7 +67,7 @@ int kvm_riscv_mmu_ioremap(struct kvm *kvm, gpa_t gpa, phys_addr_t hpa,
 		if (!writable)
 			map.pte = pte_wrprotect(map.pte);
 
-		ret = kvm_mmu_topup_memory_cache(&pcache, kvm_riscv_gstage_pgd_levels);
+		ret = kvm_mmu_topup_memory_cache(&pcache, kvm->arch.pgd_levels);
 		if (ret)
 			goto out;
 
@@ -186,7 +186,7 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 	 * space addressable by the KVM guest GPA space.
 	 */
 	if ((new->base_gfn + new->npages) >=
-	    (kvm_riscv_gstage_gpa_size >> PAGE_SHIFT))
+	    kvm_riscv_gstage_gpa_size(kvm->arch.pgd_levels) >> PAGE_SHIFT)
 		return -EFAULT;
 
 	hva = new->userspace_addr;
@@ -472,7 +472,7 @@ int kvm_riscv_mmu_map(struct kvm_vcpu *vcpu, struct kvm_memory_slot *memslot,
 	memset(out_map, 0, sizeof(*out_map));
 
 	/* We need minimum second+third level pages */
-	ret = kvm_mmu_topup_memory_cache(pcache, kvm_riscv_gstage_pgd_levels);
+	ret = kvm_mmu_topup_memory_cache(pcache, kvm->arch.pgd_levels);
 	if (ret) {
 		kvm_err("Failed to topup G-stage cache\n");
 		return ret;
@@ -575,6 +575,7 @@ int kvm_riscv_mmu_alloc_pgd(struct kvm *kvm)
 		return -ENOMEM;
 	kvm->arch.pgd = page_to_virt(pgd_page);
 	kvm->arch.pgd_phys = page_to_phys(pgd_page);
+	kvm->arch.pgd_levels = kvm_riscv_gstage_max_pgd_levels;
 
 	return 0;
 }
@@ -590,10 +591,12 @@ void kvm_riscv_mmu_free_pgd(struct kvm *kvm)
 		gstage.flags = 0;
 		gstage.vmid = READ_ONCE(kvm->arch.vmid.vmid);
 		gstage.pgd = kvm->arch.pgd;
-		kvm_riscv_gstage_unmap_range(&gstage, 0UL, kvm_riscv_gstage_gpa_size, false);
+		kvm_riscv_gstage_unmap_range(&gstage, 0UL,
+					     kvm_riscv_gstage_gpa_size(kvm->arch.pgd_levels), false);
 		pgd = READ_ONCE(kvm->arch.pgd);
 		kvm->arch.pgd = NULL;
 		kvm->arch.pgd_phys = 0;
+		kvm->arch.pgd_levels = 0;
 	}
 	spin_unlock(&kvm->mmu_lock);
 
@@ -603,11 +606,12 @@ void kvm_riscv_mmu_free_pgd(struct kvm *kvm)
 
 void kvm_riscv_mmu_update_hgatp(struct kvm_vcpu *vcpu)
 {
-	unsigned long hgatp = kvm_riscv_gstage_mode << HGATP_MODE_SHIFT;
-	struct kvm_arch *k = &vcpu->kvm->arch;
+	struct kvm_arch *ka = &vcpu->kvm->arch;
+	unsigned long hgatp = kvm_riscv_gstage_mode(ka->pgd_levels)
+			      << HGATP_MODE_SHIFT;
 
-	hgatp |= (READ_ONCE(k->vmid.vmid) << HGATP_VMID_SHIFT) & HGATP_VMID;
-	hgatp |= (k->pgd_phys >> PAGE_SHIFT) & HGATP_PPN;
+	hgatp |= (READ_ONCE(ka->vmid.vmid) << HGATP_VMID_SHIFT) & HGATP_VMID;
+	hgatp |= (ka->pgd_phys >> PAGE_SHIFT) & HGATP_PPN;
 
 	ncsr_write(CSR_HGATP, hgatp);
 
diff --git a/arch/riscv/kvm/vm.c b/arch/riscv/kvm/vm.c
index 13c63ae1a78b..fb7c4e07961f 100644
--- a/arch/riscv/kvm/vm.c
+++ b/arch/riscv/kvm/vm.c
@@ -199,7 +199,10 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 		r = KVM_USER_MEM_SLOTS;
 		break;
 	case KVM_CAP_VM_GPA_BITS:
-		r = kvm_riscv_gstage_gpa_bits;
+		if (!kvm)
+			r = kvm_riscv_gstage_gpa_bits(kvm_riscv_gstage_max_pgd_levels);
+		else
+			r = kvm_riscv_gstage_gpa_bits(kvm->arch.pgd_levels);
 		break;
 	default:
 		r = 0;
diff --git a/arch/riscv/kvm/vmid.c b/arch/riscv/kvm/vmid.c
index cf34d448289d..c15bdb1dd8be 100644
--- a/arch/riscv/kvm/vmid.c
+++ b/arch/riscv/kvm/vmid.c
@@ -26,7 +26,8 @@ static DEFINE_SPINLOCK(vmid_lock);
 void __init kvm_riscv_gstage_vmid_detect(void)
 {
 	/* Figure-out number of VMID bits in HW */
-	csr_write(CSR_HGATP, (kvm_riscv_gstage_mode << HGATP_MODE_SHIFT) | HGATP_VMID);
+	csr_write(CSR_HGATP, (kvm_riscv_gstage_mode(kvm_riscv_gstage_max_pgd_levels) <<
+			      HGATP_MODE_SHIFT) | HGATP_VMID);
 	vmid_bits = csr_read(CSR_HGATP);
 	vmid_bits = (vmid_bits & HGATP_VMID) >> HGATP_VMID_SHIFT;
 	vmid_bits = fls_long(vmid_bits);
-- 
2.50.1
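
The GPA widths quoted later in this series follow directly from the new
kvm_riscv_gstage_gpa_bits() helper above. As a standalone illustration of
that arithmetic (ordinary user-space C, not kernel code, assuming the RV64
constants HGATP_PAGE_SHIFT = 12, 9 index bits per translation level, and
2 extra index bits for the 16 KiB G-stage root table):

#include <stdio.h>

/* Mirrors kvm_riscv_gstage_gpa_bits() with RV64 constants:
 * 12-bit page shift, 9 index bits per level, 2 extra root-table bits. */
static unsigned long gstage_gpa_bits(unsigned long pgd_levels)
{
	return 12 + pgd_levels * 9 + 2;
}

int main(void)
{
	static const char * const mode[] = { "Sv39x4", "Sv48x4", "Sv57x4" };
	unsigned long levels;

	/* 3, 4 and 5 levels correspond to Sv39x4, Sv48x4 and Sv57x4. */
	for (levels = 3; levels <= 5; levels++)
		printf("%s: %lu levels -> %lu GPA bits\n",
		       mode[levels - 3], levels, gstage_gpa_bits(levels));
	return 0;
}

This prints 41, 50 and 59 GPA bits for 3, 4 and 5 levels respectively,
which matches the thresholds used by patch 3/3.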
From: fangyu.yu@linux.alibaba.com
To: pbonzini@redhat.com, corbet@lwn.net, anup@brainfault.org,
    atish.patra@linux.dev, pjw@kernel.org, palmer@dabbelt.com,
    aou@eecs.berkeley.edu, alex@ghiti.fr, skhan@linuxfoundation.org
Cc: guoren@kernel.org, radim.krcmar@oss.qualcomm.com,
    andrew.jones@oss.qualcomm.com, linux-doc@vger.kernel.org,
    kvm@vger.kernel.org, kvm-riscv@lists.infradead.org,
    linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
    Fangyu Yu
Subject: [PATCH v8 2/3] RISC-V: KVM: Cache gstage pgd_levels in struct kvm_gstage
Date: Fri, 3 Apr 2026 23:30:17 +0800
Message-Id: <20260403153019.9916-3-fangyu.yu@linux.alibaba.com>
In-Reply-To: <20260403153019.9916-1-fangyu.yu@linux.alibaba.com>
References: <20260403153019.9916-1-fangyu.yu@linux.alibaba.com>

From: Fangyu Yu

G-stage page-table helpers frequently chase gstage->kvm->arch to fetch
pgd_levels. This adds noise and repeats the same dereference chain in
hot paths.

Add pgd_levels to struct kvm_gstage and initialize it from kvm->arch
when setting up a gstage instance. Introduce kvm_riscv_gstage_init()
to centralize initialization and switch the gstage code to use
gstage->pgd_levels.

Suggested-by: Anup Patel
Signed-off-by: Fangyu Yu
Reviewed-by: Anup Patel
---
 arch/riscv/include/asm/kvm_gstage.h | 10 ++++++
 arch/riscv/kvm/gstage.c             | 10 +++---
 arch/riscv/kvm/mmu.c                | 50 ++++++-----------------------
 3 files changed, 25 insertions(+), 45 deletions(-)

diff --git a/arch/riscv/include/asm/kvm_gstage.h b/arch/riscv/include/asm/kvm_gstage.h
index 5aa58d1f692a..70d9d483365e 100644
--- a/arch/riscv/include/asm/kvm_gstage.h
+++ b/arch/riscv/include/asm/kvm_gstage.h
@@ -15,6 +15,7 @@ struct kvm_gstage {
 #define KVM_GSTAGE_FLAGS_LOCAL	BIT(0)
 	unsigned long vmid;
 	pgd_t *pgd;
+	unsigned long pgd_levels;
 };
 
 struct kvm_gstage_mapping {
@@ -92,4 +93,13 @@ static inline unsigned long kvm_riscv_gstage_mode(unsigned long pgd_levels)
 	}
 }
 
+static inline void kvm_riscv_gstage_init(struct kvm_gstage *gstage, struct kvm *kvm)
+{
+	gstage->kvm = kvm;
+	gstage->flags = 0;
+	gstage->vmid = READ_ONCE(kvm->arch.vmid.vmid);
+	gstage->pgd = kvm->arch.pgd;
+	gstage->pgd_levels = kvm->arch.pgd_levels;
+}
+
 #endif
diff --git a/arch/riscv/kvm/gstage.c b/arch/riscv/kvm/gstage.c
index 4beb9322fe76..7c4c34bc191b 100644
--- a/arch/riscv/kvm/gstage.c
+++ b/arch/riscv/kvm/gstage.c
@@ -26,7 +26,7 @@ static inline unsigned long gstage_pte_index(struct kvm_gstage *gstage,
 	unsigned long mask;
 	unsigned long shift = HGATP_PAGE_SHIFT + (kvm_riscv_gstage_index_bits * level);
 
-	if (level == gstage->kvm->arch.pgd_levels - 1)
+	if (level == gstage->pgd_levels - 1)
 		mask = (PTRS_PER_PTE * (1UL << kvm_riscv_gstage_pgd_xbits)) - 1;
 	else
 		mask = PTRS_PER_PTE - 1;
@@ -45,7 +45,7 @@ static int gstage_page_size_to_level(struct kvm_gstage *gstage, unsigned long pa
 	u32 i;
 	unsigned long psz = 1UL << 12;
 
-	for (i = 0; i < gstage->kvm->arch.pgd_levels; i++) {
+	for (i = 0; i < gstage->pgd_levels; i++) {
 		if (page_size == (psz << (i * kvm_riscv_gstage_index_bits))) {
 			*out_level = i;
 			return 0;
@@ -58,7 +58,7 @@ static int gstage_page_size_to_level(struct kvm_gstage *gstage, unsigned long pa
 static int gstage_level_to_page_order(struct kvm_gstage *gstage, u32 level,
 				      unsigned long *out_pgorder)
 {
-	if (gstage->kvm->arch.pgd_levels < level)
+	if (gstage->pgd_levels < level)
 		return -EINVAL;
 
 	*out_pgorder = 12 + (level * kvm_riscv_gstage_index_bits);
@@ -83,7 +83,7 @@ bool kvm_riscv_gstage_get_leaf(struct kvm_gstage *gstage, gpa_t addr,
 			       pte_t **ptepp, u32 *ptep_level)
 {
 	pte_t *ptep;
-	u32 current_level = gstage->kvm->arch.pgd_levels - 1;
+	u32 current_level = gstage->pgd_levels - 1;
 
 	*ptep_level = current_level;
 	ptep = (pte_t *)gstage->pgd;
@@ -127,7 +127,7 @@ int kvm_riscv_gstage_set_pte(struct kvm_gstage *gstage,
 			     struct kvm_mmu_memory_cache *pcache,
 			     const struct kvm_gstage_mapping *map)
 {
-	u32 current_level = gstage->kvm->arch.pgd_levels - 1;
+	u32 current_level = gstage->pgd_levels - 1;
 	pte_t *next_ptep = (pte_t *)gstage->pgd;
 	pte_t *ptep = &next_ptep[gstage_pte_index(gstage, map->addr, current_level)];
 
diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c
index fbcdd75cb9af..2d3def024270 100644
--- a/arch/riscv/kvm/mmu.c
+++ b/arch/riscv/kvm/mmu.c
@@ -24,10 +24,7 @@ static void mmu_wp_memory_region(struct kvm *kvm, int slot)
 	phys_addr_t end = (memslot->base_gfn + memslot->npages) << PAGE_SHIFT;
 	struct kvm_gstage gstage;
 
-	gstage.kvm = kvm;
-	gstage.flags = 0;
-	gstage.vmid = READ_ONCE(kvm->arch.vmid.vmid);
-	gstage.pgd = kvm->arch.pgd;
+	kvm_riscv_gstage_init(&gstage, kvm);
 
 	spin_lock(&kvm->mmu_lock);
 	kvm_riscv_gstage_wp_range(&gstage, start, end);
@@ -49,10 +46,7 @@ int kvm_riscv_mmu_ioremap(struct kvm *kvm, gpa_t gpa, phys_addr_t hpa,
 	struct kvm_gstage_mapping map;
 	struct kvm_gstage gstage;
 
-	gstage.kvm = kvm;
-	gstage.flags = 0;
-	gstage.vmid = READ_ONCE(kvm->arch.vmid.vmid);
-	gstage.pgd = kvm->arch.pgd;
+	kvm_riscv_gstage_init(&gstage, kvm);
 
 	end = (gpa + size + PAGE_SIZE - 1) & PAGE_MASK;
 	pfn = __phys_to_pfn(hpa);
@@ -89,10 +83,7 @@ void kvm_riscv_mmu_iounmap(struct kvm *kvm, gpa_t gpa, unsigned long size)
 {
 	struct kvm_gstage gstage;
 
-	gstage.kvm = kvm;
-	gstage.flags = 0;
-	gstage.vmid = READ_ONCE(kvm->arch.vmid.vmid);
-	gstage.pgd = kvm->arch.pgd;
+	kvm_riscv_gstage_init(&gstage, kvm);
 
 	spin_lock(&kvm->mmu_lock);
 	kvm_riscv_gstage_unmap_range(&gstage, gpa, size, false);
@@ -109,10 +100,7 @@ void kvm_arch_mmu_enable_log_dirty_pt_masked(struct kvm *kvm,
 	phys_addr_t end = (base_gfn + __fls(mask) + 1) << PAGE_SHIFT;
 	struct kvm_gstage gstage;
 
-	gstage.kvm = kvm;
-	gstage.flags = 0;
-	gstage.vmid = READ_ONCE(kvm->arch.vmid.vmid);
-	gstage.pgd = kvm->arch.pgd;
+	kvm_riscv_gstage_init(&gstage, kvm);
 
 	kvm_riscv_gstage_wp_range(&gstage, start, end);
 }
@@ -141,10 +129,7 @@ void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
 	phys_addr_t size = slot->npages << PAGE_SHIFT;
 	struct kvm_gstage gstage;
 
-	gstage.kvm = kvm;
-	gstage.flags = 0;
-	gstage.vmid = READ_ONCE(kvm->arch.vmid.vmid);
-	gstage.pgd = kvm->arch.pgd;
+	kvm_riscv_gstage_init(&gstage, kvm);
 
 	spin_lock(&kvm->mmu_lock);
 	kvm_riscv_gstage_unmap_range(&gstage, gpa, size, false);
@@ -250,10 +235,7 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
 	if (!kvm->arch.pgd)
 		return false;
 
-	gstage.kvm = kvm;
-	gstage.flags = 0;
-	gstage.vmid = READ_ONCE(kvm->arch.vmid.vmid);
-	gstage.pgd = kvm->arch.pgd;
+	kvm_riscv_gstage_init(&gstage, kvm);
 	mmu_locked = spin_trylock(&kvm->mmu_lock);
 	kvm_riscv_gstage_unmap_range(&gstage, range->start << PAGE_SHIFT,
 				     (range->end - range->start) << PAGE_SHIFT,
@@ -275,10 +257,7 @@ bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 
 	WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
 
-	gstage.kvm = kvm;
-	gstage.flags = 0;
-	gstage.vmid = READ_ONCE(kvm->arch.vmid.vmid);
-	gstage.pgd = kvm->arch.pgd;
+	kvm_riscv_gstage_init(&gstage, kvm);
 	if (!kvm_riscv_gstage_get_leaf(&gstage, range->start << PAGE_SHIFT,
 				       &ptep, &ptep_level))
 		return false;
@@ -298,10 +277,7 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range)
 
 	WARN_ON(size != PAGE_SIZE && size != PMD_SIZE && size != PUD_SIZE);
 
-	gstage.kvm = kvm;
-	gstage.flags = 0;
-	gstage.vmid = READ_ONCE(kvm->arch.vmid.vmid);
-	gstage.pgd = kvm->arch.pgd;
+	kvm_riscv_gstage_init(&gstage, kvm);
 	if (!kvm_riscv_gstage_get_leaf(&gstage, range->start << PAGE_SHIFT,
 				       &ptep, &ptep_level))
 		return false;
@@ -463,10 +439,7 @@ int kvm_riscv_mmu_map(struct kvm_vcpu *vcpu, struct kvm_memory_slot *memslot,
 	struct kvm_gstage gstage;
 	struct page *page;
 
-	gstage.kvm = kvm;
-	gstage.flags = 0;
-	gstage.vmid = READ_ONCE(kvm->arch.vmid.vmid);
-	gstage.pgd = kvm->arch.pgd;
+	kvm_riscv_gstage_init(&gstage, kvm);
 
 	/* Setup initial state of output mapping */
 	memset(out_map, 0, sizeof(*out_map));
@@ -587,10 +560,7 @@ void kvm_riscv_mmu_free_pgd(struct kvm *kvm)
 
 	spin_lock(&kvm->mmu_lock);
 	if (kvm->arch.pgd) {
-		gstage.kvm = kvm;
-		gstage.flags = 0;
-		gstage.vmid = READ_ONCE(kvm->arch.vmid.vmid);
-		gstage.pgd = kvm->arch.pgd;
+		kvm_riscv_gstage_init(&gstage, kvm);
 		kvm_riscv_gstage_unmap_range(&gstage, 0UL,
 					     kvm_riscv_gstage_gpa_size(kvm->arch.pgd_levels), false);
 		pgd = READ_ONCE(kvm->arch.pgd);
-- 
2.50.1
From: fangyu.yu@linux.alibaba.com
To: pbonzini@redhat.com, corbet@lwn.net, anup@brainfault.org,
    atish.patra@linux.dev, pjw@kernel.org, palmer@dabbelt.com,
    aou@eecs.berkeley.edu, alex@ghiti.fr, skhan@linuxfoundation.org
Cc: guoren@kernel.org, radim.krcmar@oss.qualcomm.com,
    andrew.jones@oss.qualcomm.com, linux-doc@vger.kernel.org,
    kvm@vger.kernel.org, kvm-riscv@lists.infradead.org,
    linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
    Fangyu Yu
Subject: [PATCH v8 3/3] RISC-V: KVM: Reuse KVM_CAP_VM_GPA_BITS to select HGATP.MODE
Date: Fri, 3 Apr 2026 23:30:18 +0800
Message-Id: <20260403153019.9916-4-fangyu.yu@linux.alibaba.com>
In-Reply-To: <20260403153019.9916-1-fangyu.yu@linux.alibaba.com>
References: <20260403153019.9916-1-fangyu.yu@linux.alibaba.com>

From: Fangyu Yu

Reuse KVM_CAP_VM_GPA_BITS to advertise and select the effective G-stage
GPA width for a VM. KVM_CHECK_EXTENSION(KVM_CAP_VM_GPA_BITS) returns the
effective GPA bits for a VM, while KVM_ENABLE_CAP(KVM_CAP_VM_GPA_BITS)
allows userspace to downsize the effective GPA width by selecting a
smaller G-stage page table format:

- gpa_bits <= 41 selects Sv39x4 (pgd_levels=3)
- gpa_bits <= 50 selects Sv48x4 (pgd_levels=4)
- gpa_bits <= 59 selects Sv57x4 (pgd_levels=5)

Reject the request with -EINVAL for unsupported values and with -EBUSY
if vCPUs have been created or any memslot is populated.

Signed-off-by: Fangyu Yu
Reviewed-by: Andrew Jones
Reviewed-by: Guo Ren
Reviewed-by: Anup Patel
---
 arch/riscv/kvm/vm.c | 44 ++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 42 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/kvm/vm.c b/arch/riscv/kvm/vm.c
index fb7c4e07961f..a9f083feeb76 100644
--- a/arch/riscv/kvm/vm.c
+++ b/arch/riscv/kvm/vm.c
@@ -214,12 +214,52 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 
 int kvm_vm_ioctl_enable_cap(struct kvm *kvm, struct kvm_enable_cap *cap)
 {
+	if (cap->flags)
+		return -EINVAL;
+
 	switch (cap->cap) {
 	case KVM_CAP_RISCV_MP_STATE_RESET:
-		if (cap->flags)
-			return -EINVAL;
 		kvm->arch.mp_state_reset = true;
 		return 0;
+	case KVM_CAP_VM_GPA_BITS: {
+		unsigned long gpa_bits = cap->args[0];
+		unsigned long new_levels;
+		int r = 0;
+
+		/* Decide target pgd levels from requested gpa_bits */
+#ifdef CONFIG_64BIT
+		if (gpa_bits <= 41)
+			new_levels = 3;	/* Sv39x4 */
+		else if (gpa_bits <= 50)
+			new_levels = 4;	/* Sv48x4 */
+		else if (gpa_bits <= 59)
+			new_levels = 5;	/* Sv57x4 */
+		else
+			return -EINVAL;
+#else
+		/* 32-bit: only Sv32x4 */
+		if (gpa_bits <= 34)
+			new_levels = 2;
+		else
+			return -EINVAL;
+#endif
+		if (new_levels > kvm_riscv_gstage_max_pgd_levels)
+			return -EINVAL;
+
+		/* Follow KVM's lock ordering: kvm->lock -> kvm->slots_lock. */
+		mutex_lock(&kvm->lock);
+		mutex_lock(&kvm->slots_lock);
+
+		if (kvm->created_vcpus || !kvm_are_all_memslots_empty(kvm))
+			r = -EBUSY;
+		else
+			kvm->arch.pgd_levels = new_levels;
+
+		mutex_unlock(&kvm->slots_lock);
+		mutex_unlock(&kvm->lock);
+
+		return r;
+	}
 	default:
 		return -EINVAL;
 	}
-- 
2.50.1
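
For illustration only, a minimal user-space sketch of how a VMM could use
this capability right after KVM_CREATE_VM and before creating any vCPU or
memslot. The vm_fd variable, the limit_gpa_bits() name and the surrounding
error handling are placeholders, and the kernel headers in use must already
define KVM_CAP_VM_GPA_BITS:

#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Sketch: restrict a freshly created VM to a smaller G-stage, e.g. Sv39x4
 * via want_bits = 41.  vm_fd is assumed to come from
 * ioctl(kvm_fd, KVM_CREATE_VM, 0). */
static int limit_gpa_bits(int vm_fd, unsigned long want_bits)
{
	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_VM_GPA_BITS,
		.args = { want_bits },
	};
	int cur_bits;

	/* Reports the VM's current effective GPA width (see patch 1/3). */
	cur_bits = ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_VM_GPA_BITS);
	if (cur_bits < 0 || (unsigned long)cur_bits < want_bits)
		return -1;

	/* Must run before any vCPU or memslot exists, else KVM returns -EBUSY. */
	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
}

Calling limit_gpa_bits(vm_fd, 41) would select a three-level G-stage on a
host whose maximum detected mode is Sv48x4 or Sv57x4.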