From nobody Mon Jun 15 00:12:35 2026 Received: from mxhk.zte.com.cn (mxhk.zte.com.cn [160.30.148.35]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4111E38C403; Tue, 7 Apr 2026 09:13:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=160.30.148.35 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775553239; cv=none; b=M7Szss9ucAxUhkztBfIJ6IP0lFGL/IELo3oBLn1p+dDzBVAFE83DpbuwM/M85VYYerA+5czlJEw9UJa0Hb9GM+4f8yl+qDvqq29SoQkLJHkU4jjd9MNQHvsDwrWkqSnYDe5lAqPVcq4T1RPaGFlALr/yvtV3Ix1qj39Sw9URz6E= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775553239; c=relaxed/simple; bh=J0rvlCsL4WnPEwfc1/LQty3RPIMHMc0i95c+FlH4W6I=; h=Message-ID:In-Reply-To:References:Date:Mime-Version:From:To:Cc: Subject:Content-Type; b=X6En+FmNhFxTTWdQWK57N2sB0PiKqgoxz6GGH4rQrN7tJxEcTXgR6LuX2wr+HXoqXVLC+NeksgPLXIJRPdIKqAkReSi4ziMA6oR3XgHHBJjv2qALC8vU4d0ubix7+x1CguNh6u5jCy4utIQUAPVHA3VDgmq93/VxZSzJoKXuyLI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=zte.com.cn; spf=pass smtp.mailfrom=zte.com.cn; arc=none smtp.client-ip=160.30.148.35 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=zte.com.cn Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=zte.com.cn Received: from mse-fl1.zte.com.cn (unknown [10.5.228.132]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange x25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mxhk.zte.com.cn (FangMail) with ESMTPS id 4fqgVN3l4Wz8Xs70; Tue, 07 Apr 2026 17:13:48 +0800 (CST) Received: from szxl2zmapp05.zte.com.cn ([10.1.32.37]) by mse-fl1.zte.com.cn with SMTP id 6379DZhh068225; Tue, 7 Apr 2026 17:13:35 +0800 (+08) (envelope-from wang.yechao255@zte.com.cn) Received: from 
mapi (szxlzmapp03[null]) by mapi (Zmail) with MAPI id mid12; Tue, 7 Apr 2026 17:13:38 +0800 (CST) X-Zmail-TransId: 2b0569d4cac2d33-9076c X-Mailer: Zmail v1.0 Message-ID: <20260407171338173YQ_-quPfgvXzr9BX4WY-q@zte.com.cn> In-Reply-To: <20260407171052241tmZDFGusMP_wlEsBVVtJo@zte.com.cn> References: 20260407171052241tmZDFGusMP_wlEsBVVtJo@zte.com.cn Date: Tue, 7 Apr 2026 17:13:38 +0800 (CST) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 From: To: , , , , , Cc: , , , Subject: =?UTF-8?B?W1BBVENIIDEvM10gUklTQy1WOiBLVk06IFJlZmFjdG9yIGt2bV9hcmNoX2NvbW1pdF9tZW1vcnlfcmVnaW9uKCk=?= X-MAIL: mse-fl1.zte.com.cn 6379DZhh068225 X-TLS: YES X-SPF-DOMAIN: zte.com.cn X-ENVELOPE-SENDER: wang.yechao255@zte.com.cn X-SPF: None X-SOURCE-IP: 10.5.228.132 unknown Tue, 07 Apr 2026 17:13:48 +0800 X-Fangmail-Anti-Spam-Filtered: true X-Fangmail-MID-QID: 69D4CACC.003/4fqgVN3l4Wz8Xs70 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Wang Yechao Refactor kvm_arch_commit_memory_region() in preparation for a future commit, so that the follow-up change is cleaner and easier to understand. The refactored function also more closely resembles its arm64 and x86 counterparts. Signed-off-by: Wang Yechao --- arch/riscv/kvm/mmu.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c index c3539f660142..d2116c09c589 100644 --- a/arch/riscv/kvm/mmu.c +++ b/arch/riscv/kvm/mmu.c @@ -156,12 +156,22 @@ void kvm_arch_commit_memory_region(struct kvm *kvm, const struct kvm_memory_slot *new, enum kvm_mr_change change) { + bool log_dirty_pages =3D new && new->flags & KVM_MEM_LOG_DIRTY_PAGES; + bool read_only =3D new && new->flags & KVM_MEM_READONLY; + + /* + * Nothing more to do for RO slots (which can't be dirtied and can't be + * made writable) or CREATE/MOVE/DELETE of a slot. 
+ */ + if ((change !=3D KVM_MR_FLAGS_ONLY) || read_only) + return; + /* * At this point memslot has been committed and there is an * allocated dirty_bitmap[], dirty pages will be tracked while * the memory slot is write protected. */ - if (change !=3D KVM_MR_DELETE && new->flags & KVM_MEM_LOG_DIRTY_PAGES) { + if (log_dirty_pages) { if (kvm_dirty_log_manual_protect_and_init_set(kvm)) return; mmu_wp_memory_region(kvm, new->id); --=20 2.47.3 From nobody Mon Jun 15 00:12:35 2026 Received: from mxhk.zte.com.cn (mxhk.zte.com.cn [160.30.148.35]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B60213A0B13; Tue, 7 Apr 2026 09:15:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=160.30.148.35 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775553314; cv=none; b=gGU/xjgZspJxBbmMIX8DxyVgbKWDMC87Qqlwk82kfndfodrwPvjP88LZFHXyptgLjpHZvhI6xjJ4h2m6opQ31OjVnDfG1mlcVNPiq7kM4eDpdudEPEEnzwHZF1EvOc57UMbh0KN4K65Xu/y9uiAMQZJ4lHoEEcWqOHI6sRo1zxU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775553314; c=relaxed/simple; bh=Cxqvhvo5oKEKZ1cdBCU2bgE9T3FwDSksh9rd9Vt7PQA=; h=Message-ID:In-Reply-To:References:Date:Mime-Version:From:To:Cc: Subject:Content-Type; b=oaroQZhDoQHONdEQg/3FumDpIfNYC9qD1t4HKqbTz2gvJMcn/qVGHfApqDt7HOUqfFMFkoitX5iJ1t6NNehaBfx+2lemKpIKd6PEwkUqHAtuvEfVQWOu0w54llf4oM9JppuxN/cDRg4T1/gMW+WP8XCeiPNfSX5t+jQ6aJShQFM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=zte.com.cn; spf=pass smtp.mailfrom=zte.com.cn; arc=none smtp.client-ip=160.30.148.35 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=zte.com.cn Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=zte.com.cn Received: from mse-fl1.zte.com.cn (unknown [10.5.228.132]) (using TLSv1.3 with cipher 
TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange x25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mxhk.zte.com.cn (FangMail) with ESMTPS id 4fqgWz00Dnz7QYRQ; Tue, 07 Apr 2026 17:15:10 +0800 (CST) Received: from szxlzmapp02.zte.com.cn ([10.5.231.79]) by mse-fl1.zte.com.cn with SMTP id 6379Ere8070472; Tue, 7 Apr 2026 17:14:53 +0800 (+08) (envelope-from wang.yechao255@zte.com.cn) Received: from mapi (szxlzmapp01[null]) by mapi (Zmail) with MAPI id mid12; Tue, 7 Apr 2026 17:14:55 +0800 (CST) X-Zmail-TransId: 2b0369d4cb0fb57-00521 X-Mailer: Zmail v1.0 Message-ID: <20260407171455735miXfwO3C51btdouaaGUEc@zte.com.cn> In-Reply-To: <20260407171052241tmZDFGusMP_wlEsBVVtJo@zte.com.cn> References: 20260407171052241tmZDFGusMP_wlEsBVVtJo@zte.com.cn Date: Tue, 7 Apr 2026 17:14:55 +0800 (CST) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 From: To: , , , , , Cc: , , , Subject: =?UTF-8?B?W1BBVENIIDIvM10gUklTQy1WOiBLVk06IGFkZCB0cmFjZXBvaW50cyBmb3IgZ3Vlc3QgcGFnZSBmYXVsdHM=?= X-MAIL: mse-fl1.zte.com.cn 6379Ere8070472 X-TLS: YES X-SPF-DOMAIN: zte.com.cn X-ENVELOPE-SENDER: wang.yechao255@zte.com.cn X-SPF: None X-SOURCE-IP: 10.5.228.132 unknown Tue, 07 Apr 2026 17:15:11 +0800 X-Fangmail-Anti-Spam-Filtered: true X-Fangmail-MID-QID: 69D4CB1E.001/4fqgWz00Dnz7QYRQ Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Wang Yechao Add the kvm_page_fault event tracepoints to count the number of KVM guest page faults. Signed-off-by: Wang Yechao --- arch/riscv/kvm/trace.h | 25 +++++++++++++++++++++++++ arch/riscv/kvm/vcpu_exit.c | 3 +++ 2 files changed, 28 insertions(+) diff --git a/arch/riscv/kvm/trace.h b/arch/riscv/kvm/trace.h index 3d54175d805c..9056cc9883cf 100644 --- a/arch/riscv/kvm/trace.h +++ b/arch/riscv/kvm/trace.h @@ -56,6 +56,31 @@ TRACE_EVENT(kvm_exit, __entry->htinst) ); +/* + * Tracepoint for page fault. 
+ */ +TRACE_EVENT(kvm_page_fault, + TP_PROTO(struct kvm_vcpu *vcpu, u64 fault_address, u64 error_code), + TP_ARGS(vcpu, fault_address, error_code), + + TP_STRUCT__entry( + __field(unsigned int, vcpu_id) + __field(u64, fault_address) + __field(u64, error_code) + ), + + TP_fast_assign( + __entry->vcpu_id =3D vcpu->vcpu_id; + __entry->fault_address =3D fault_address; + __entry->error_code =3D error_code; + ), + + TP_printk("vcpu %u address 0x%016llx error_code 0x%llx", + __entry->vcpu_id, + __entry->fault_address, + __entry->error_code) +); + #endif /* _TRACE_RSICV_KVM_H */ #undef TRACE_INCLUDE_PATH diff --git a/arch/riscv/kvm/vcpu_exit.c b/arch/riscv/kvm/vcpu_exit.c index 0bb0c51e3c89..0cfb0149da9f 100644 --- a/arch/riscv/kvm/vcpu_exit.c +++ b/arch/riscv/kvm/vcpu_exit.c @@ -11,6 +11,7 @@ #include #include #include +#include "trace.h" static int gstage_page_fault(struct kvm_vcpu *vcpu, struct kvm_run *run, struct kvm_cpu_trap *trap) @@ -43,6 +44,8 @@ static int gstage_page_fault(struct kvm_vcpu *vcpu, struc= t kvm_run *run, }; } + trace_kvm_page_fault(vcpu, fault_addr, trap->scause); + ret =3D kvm_riscv_mmu_map(vcpu, memslot, fault_addr, hva, (trap->scause =3D=3D EXC_STORE_GUEST_PAGE_FAULT) ? 
true : false, &host_map); --=20 2.47.3 From nobody Mon Jun 15 00:12:35 2026 Received: from mxct.zte.com.cn (mxct.zte.com.cn [183.62.165.209]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 768342FE58C; Tue, 7 Apr 2026 09:16:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=183.62.165.209 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775553407; cv=none; b=L+JiRdyH8Qa7OH5OzCDIBV9Elu8tNaqnuYuZuqWlNcugQbXGRBvBMoppPZLDEGv80XSMsBjkEOFNCElf+mkcoubUIocXq60YOUeaNZ6F3W3yrrUTwbC49+QOPWM4uSWmEPQhWisyGR0DtDm4Akg0B+55bh1B4Kd2CynuEOYihMs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1775553407; c=relaxed/simple; bh=C8puQMPeY6aoJLNYF/hlJqKdr+d1hCqlZajadk/FkxE=; h=Message-ID:In-Reply-To:References:Date:Mime-Version:From:To:Cc: Subject:Content-Type; b=P0fXTIJfuuvbiiwhXreBr+x/mJF5VL1zXrCeEdoDKTEyatqW1nAUAP28q++dvl9CdEYRhc0wiZlI1QW5Hv5F8ed8iLRZPy9IuvGI0x2KTJjFAtRZtp/UnMOkArj0JpteDPaxDDHdcNLf2wozZv17oQ9oA9qfmmBofrFkfLMgoZo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=zte.com.cn; spf=pass smtp.mailfrom=zte.com.cn; arc=none smtp.client-ip=183.62.165.209 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=zte.com.cn Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=zte.com.cn Received: from mse-fl2.zte.com.cn (unknown [10.5.228.133]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange x25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mxct.zte.com.cn (FangMail) with ESMTPS id 4fqgYg4SmVz501bD; Tue, 07 Apr 2026 17:16:39 +0800 (CST) Received: from szxlzmapp04.zte.com.cn ([10.5.231.166]) by mse-fl2.zte.com.cn with SMTP id 6379GVZl050576; Tue, 7 Apr 2026 17:16:31 +0800 (+08) (envelope-from 
wang.yechao255@zte.com.cn) Received: from mapi (szxlzmapp03[null]) by mapi (Zmail) with MAPI id mid12; Tue, 7 Apr 2026 17:16:33 +0800 (CST) X-Zmail-TransId: 2b0569d4cb716e6-99116 X-Mailer: Zmail v1.0 Message-ID: <2026040717163343908VqFt1HjxIYObFoWo2Xe@zte.com.cn> In-Reply-To: <20260407171052241tmZDFGusMP_wlEsBVVtJo@zte.com.cn> References: 20260407171052241tmZDFGusMP_wlEsBVVtJo@zte.com.cn Date: Tue, 7 Apr 2026 17:16:33 +0800 (CST) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 From: To: , , , , , Cc: , , , Subject: =?UTF-8?B?W1BBVENIIDMvM10gUklTQy1WOiBLVk06IFJlY292ZXIgZ3N0YWdlIGh1Z2UgcGFnZSBtYXBwaW5ncyBkdXJpbmcgZGlzYWJsZS1kaXJ0eS1sb2c=?= X-MAIL: mse-fl2.zte.com.cn 6379GVZl050576 X-TLS: YES X-SPF-DOMAIN: zte.com.cn X-ENVELOPE-SENDER: wang.yechao255@zte.com.cn X-SPF: None X-SOURCE-IP: 10.5.228.133 unknown Tue, 07 Apr 2026 17:16:39 +0800 X-Fangmail-Anti-Spam-Filtered: true X-Fangmail-MID-QID: 69D4CB77.000/4fqgYg4SmVz501bD Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Wang Yechao When dirty logging is enabled, the gstage mappings are split into 4K pages to track dirty pages. If the migration fails or is canceled, recover the gstage huge page mappings when dirty logging is disabled, so that the VM's performance stays consistent with what it was before dirty logging was enabled. 
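For a sense of scale, the sketch below counts how many leaf mappings are needed to cover a region at 4K versus 2M granularity. It is illustrative userspace C, not kernel code: the SKETCH_* macros and sketch_leaf_mappings() are hypothetical names introduced only for this example, and the 2M huge page size is an assumption about the backing memory.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants for illustration: 4 KiB base pages, 2 MiB huge pages. */
#define SKETCH_PAGE_SIZE_4K (UINT64_C(1) << 12)
#define SKETCH_PAGE_SIZE_2M (UINT64_C(1) << 21)

/*
 * Number of leaf mappings needed to cover region_size bytes when every
 * mapping uses the given granule. region_size is assumed to be a multiple
 * of granule, as it would be for an aligned, hugepage-backed region.
 */
static uint64_t sketch_leaf_mappings(uint64_t region_size, uint64_t granule)
{
	assert(granule != 0 && region_size % granule == 0);
	return region_size / granule;
}
```

For a 1 GiB region, sketch_leaf_mappings() gives 262144 mappings at 4K and 512 at 2M, so each recovered 2M block stands in for 512 small mappings. If every 4K mapping has to be established by its own guest page fault, that ratio is the saving the recovery aims for.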
With this patch, dirty_log_perf_test shows a decrease in the number of vCPU faults: $ perf stat -e kvm:kvm_page_fault \ /dirty_log_perf_test -s anonymous_hugetlb_1gb -v 1 -e -b 1G Before: 524,819 kvm:kvm_page_fault After : 263,211 kvm:kvm_page_fault Signed-off-by: Wang Yechao --- arch/riscv/include/asm/kvm_gstage.h | 4 +++ arch/riscv/kvm/gstage.c | 42 ++++++++++++++++++++++++ arch/riscv/kvm/mmu.c | 51 +++++++++++++++++++++++++++++ 3 files changed, 97 insertions(+) diff --git a/arch/riscv/include/asm/kvm_gstage.h b/arch/riscv/include/asm/k= vm_gstage.h index 373748c6745e..6e5aaa487adf 100644 --- a/arch/riscv/include/asm/kvm_gstage.h +++ b/arch/riscv/include/asm/kvm_gstage.h @@ -57,6 +57,10 @@ int kvm_riscv_gstage_split_huge(struct kvm_gstage *gstag= e, struct kvm_mmu_memory_cache *pcache, gpa_t addr, u32 target_level, bool flush); +void kvm_riscv_gstage_recover_huge(struct kvm_gstage *gstage, gpa_t addr, + unsigned long target_page_size, + unsigned long *page_size); + enum kvm_riscv_gstage_op { GSTAGE_OP_NOP =3D 0, /* Nothing */ GSTAGE_OP_CLEAR, /* Clear/Unmap */ diff --git a/arch/riscv/kvm/gstage.c b/arch/riscv/kvm/gstage.c index ffec3e5ddcaf..54881a38b363 100644 --- a/arch/riscv/kvm/gstage.c +++ b/arch/riscv/kvm/gstage.c @@ -335,6 +335,48 @@ int kvm_riscv_gstage_split_huge(struct kvm_gstage *gst= age, return 0; } +void kvm_riscv_gstage_recover_huge(struct kvm_gstage *gstage, gpa_t addr, + unsigned long target_page_size, + unsigned long *page_size) +{ + u32 current_level =3D kvm_riscv_gstage_pgd_levels - 1; + pte_t *next_ptep =3D (pte_t *)gstage->pgd; + u32 target_level, out_level; + pte_t *ptep, *child_ptep; + int ret; + + out_level =3D 0; + ret =3D gstage_page_size_to_level(target_page_size, &target_level); + if (ret) + goto out; + + while (current_level >=3D target_level) { + ptep =3D (pte_t *)&next_ptep[gstage_pte_index(addr, current_level)]; + + out_level =3D current_level; + if (!pte_val(ptep_get(ptep))) + goto out; + + /* The mapping is already a huge page 
mapping. */ + if (gstage_pte_leaf(ptep)) + goto out; + + next_ptep =3D (pte_t *)gstage_pte_page_vaddr(ptep_get(ptep)); + current_level--; + } + + /* Replace the huge PTE with the first PTE entry of the child page table.= */ + child_ptep =3D (pte_t *)&next_ptep[0]; + set_pte(ptep, __pte(pte_val(ptep_get(child_ptep)))); + + gstage_tlb_flush(gstage, target_level, addr); + + put_page(virt_to_page(next_ptep)); + +out: + gstage_level_to_page_size(out_level, page_size); +} + void kvm_riscv_gstage_op_pte(struct kvm_gstage *gstage, gpa_t addr, pte_t *ptep, u32 ptep_level, enum kvm_riscv_gstage_op op) { diff --git a/arch/riscv/kvm/mmu.c b/arch/riscv/kvm/mmu.c index d2116c09c589..0b7077946e90 100644 --- a/arch/riscv/kvm/mmu.c +++ b/arch/riscv/kvm/mmu.c @@ -16,6 +16,8 @@ #include #include +static void kvm_mmu_recover_huge_pages(struct kvm *kvm, int slot); + static void mmu_wp_memory_region(struct kvm *kvm, int slot) { struct kvm_memslots *slots =3D kvm_memslots(kvm); @@ -175,6 +177,17 @@ void kvm_arch_commit_memory_region(struct kvm *kvm, if (kvm_dirty_log_manual_protect_and_init_set(kvm)) return; mmu_wp_memory_region(kvm, new->id); + } else { + /* + * Recover huge page mappings in the slot now that dirty logging + * is disabled, i.e. now that KVM does not have to track guest + * writes at 4KiB granularity. + * + * Dirty logging might be disabled by userspace if an ongoing VM + * live migration is cancelled and the VM must continue running + * on the source. 
+ */ + kvm_mmu_recover_huge_pages(kvm, new->id); } } @@ -620,3 +633,41 @@ void kvm_riscv_mmu_update_hgatp(struct kvm_vcpu *vcpu) if (!kvm_riscv_gstage_vmid_bits()) kvm_riscv_local_hfence_gvma_all(); } + +static void kvm_mmu_recover_huge_pages(struct kvm *kvm, int slot) +{ + struct kvm_memslots *slots =3D kvm_memslots(kvm); + struct kvm_memory_slot *memslot =3D id_to_memslot(slots, slot); + unsigned long hva =3D gfn_to_hva(kvm, memslot->base_gfn); + phys_addr_t start =3D memslot->base_gfn << PAGE_SHIFT; + phys_addr_t end =3D (memslot->base_gfn + memslot->npages) << PAGE_SHIFT; + phys_addr_t addr =3D start; + struct kvm_gstage gstage; + unsigned long page_size; + + if (!fault_supports_gstage_huge_mapping(memslot, hva)) + return; + + gstage.kvm =3D kvm; + gstage.flags =3D 0; + gstage.vmid =3D READ_ONCE(kvm->arch.vmid.vmid); + gstage.pgd =3D kvm->arch.pgd; + + spin_lock(&kvm->mmu_lock); + + while (addr < end) { + cond_resched_lock(&kvm->mmu_lock); + + if (get_hva_mapping_size(kvm, hva) < PMD_SIZE) { + addr +=3D PMD_SIZE; + hva +=3D PMD_SIZE; + continue; + } + + kvm_riscv_gstage_recover_huge(&gstage, addr, PMD_SIZE, &page_size); + + addr +=3D page_size; + hva +=3D page_size; + } + spin_unlock(&kvm->mmu_lock); +} --=20 2.47.3
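
The recovery walk in kvm_mmu_recover_huge_pages() and kvm_riscv_gstage_recover_huge() above can be modelled in userspace roughly as follows. This is a toy sketch under simplifying assumptions, not the kernel implementation: it has a single table level, every toy_* name is hypothetical, and it leaves out locking, TLB flushing, and the check that the host backing is itself a huge page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define TOY_PTES_PER_TABLE 512u

/* Hypothetical toy model of one 2M-level entry; not the RISC-V PTE format. */
enum toy_kind { TOY_EMPTY, TOY_HUGE_LEAF, TOY_TABLE };

struct toy_entry {
	enum toy_kind kind;
	uint64_t frame;   /* base frame number, valid for TOY_HUGE_LEAF */
	uint64_t *child;  /* TOY_PTES_PER_TABLE frame numbers, valid for TOY_TABLE */
};

/*
 * Collapse one entry if it points to a child table: reuse the first child
 * frame as the base of the huge mapping and free the table. Returns 1 if
 * the entry was collapsed, 0 if it was empty or already a huge leaf.
 */
static int toy_recover_entry(struct toy_entry *e)
{
	if (e->kind != TOY_TABLE)
		return 0;

	e->frame = e->child[0];
	free(e->child);
	e->child = NULL;
	e->kind = TOY_HUGE_LEAF;
	return 1;
}

/* Walk n consecutive entries and return how many were collapsed. */
static size_t toy_recover_range(struct toy_entry *entries, size_t n)
{
	size_t collapsed = 0;

	for (size_t i = 0; i < n; i++)
		collapsed += (size_t)toy_recover_entry(&entries[i]);
	return collapsed;
}

/* Helper: build a TOY_TABLE entry whose children map contiguous frames. */
static struct toy_entry toy_make_table(uint64_t first_frame)
{
	struct toy_entry e = { .kind = TOY_TABLE, .frame = 0, .child = NULL };

	e.child = malloc(TOY_PTES_PER_TABLE * sizeof(*e.child));
	assert(e.child != NULL);
	for (size_t i = 0; i < TOY_PTES_PER_TABLE; i++)
		e.child[i] = first_frame + i;
	return e;
}
```

As in the patch, empty entries and entries that are already huge leaves are skipped, and only an entry that still points to a child table is rewritten. Taking the first child frame as the new base is only meaningful when the children map one contiguous, suitably aligned range, which the toy helper guarantees by construction.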