[PATCH] KVM: x86/mmu: move reused pages to the top of active_mmu_pages

Hamza Mahfooz posted 1 patch 2 weeks, 5 days ago
There is a newer version of this series
Move reused shadow pages to the head of active_mmu_pages in
__kvm_mmu_get_shadow_page(). This will allow us to move towards an
LRU-approximation eviction strategy instead of straight FIFO.

Signed-off-by: Hamza Mahfooz <someguy@effective-light.com>
---
 arch/x86/kvm/mmu/mmu.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 02c450686b4a..2fe04e01863d 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2395,7 +2395,8 @@ static struct kvm_mmu_page *__kvm_mmu_get_shadow_page(struct kvm *kvm,
 	if (!sp) {
 		created = true;
 		sp = kvm_mmu_alloc_shadow_page(kvm, caches, gfn, sp_list, role);
-	}
+	} else if (!list_is_head(&sp->link, &kvm->arch.active_mmu_pages))
+		list_move(&sp->link, &kvm->arch.active_mmu_pages);
 
 	trace_kvm_mmu_get_page(sp, created);
 	return sp;
-- 
2.52.0
Re: [PATCH] KVM: x86/mmu: move reused pages to the top of active_mmu_pages
Posted by Sean Christopherson 2 weeks, 3 days ago
On Tue, Jan 20, 2026, Hamza Mahfooz wrote:
> Move reused shadow pages to the head of active_mmu_pages in
> __kvm_mmu_get_shadow_page(). This will allow us to move towards an
> LRU-approximation eviction strategy instead of straight FIFO.

Does this actually have a (positive) impact on real-world workloads?  It seems
like an obvious improvement, but there's enough subtlety around active_mmu_pages
that I don't want to make any changes without a strong benefit.

Specifically, kvm_zap_obsolete_pages() has a hard dependency on the list being
FIFO.  We _might_ be ok if we make sure to filter out obsolete pages, but only
because of KVM's behavior of (a) only allowing two memslot generations at any
given time and (b) zapping all shadow pages from the old/obsolete generation
prior to kvm_zap_obsolete_pages() exiting.

But it most definitely makes me nervous.

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 3911ac9bddfd..929085d46dd7 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -2327,6 +2327,16 @@ static struct kvm_mmu_page *kvm_mmu_find_shadow_page(struct kvm *kvm,
 
        if (collisions > kvm->stat.max_mmu_page_hash_collisions)
                kvm->stat.max_mmu_page_hash_collisions = collisions;
+
+       /*
+        * If a shadow page was found, move it to the head of the active pages
+        * as a rudimentary form of LRU-reclaim (KVM reclaims shadow pages from
+        * tail=>head if the VM hits the limit on the number of MMU pages).
+        */
+       if (sp && !WARN_ON_ONCE(is_obsolete_sp(kvm, sp)) &&
+           !list_is_head(&sp->link, &kvm->arch.active_mmu_pages))
+               list_move(&sp->link, &kvm->arch.active_mmu_pages);
+
        return sp;
 }

> Signed-off-by: Hamza Mahfooz <someguy@effective-light.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 02c450686b4a..2fe04e01863d 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -2395,7 +2395,8 @@ static struct kvm_mmu_page *__kvm_mmu_get_shadow_page(struct kvm *kvm,
>  	if (!sp) {
>  		created = true;
>  		sp = kvm_mmu_alloc_shadow_page(kvm, caches, gfn, sp_list, role);
> -	}
> +	} else if (!list_is_head(&sp->link, &kvm->arch.active_mmu_pages))
> +		list_move(&sp->link, &kvm->arch.active_mmu_pages);

As alluded to above, I think I'd prefer to put this in kvm_mmu_find_shadow_page()?
Largely a moot point, but it seems like we'd want to move a page to the head of
the list if we look it up for any reason.
Re: [PATCH] KVM: x86/mmu: move reused pages to the top of active_mmu_pages
Posted by Hamza Mahfooz 2 weeks, 2 days ago
On Thu, Jan 22, 2026 at 04:27:42PM -0800, Sean Christopherson wrote:
> Does this actually have a (positive) impact on real-world workloads?  It seems
> like an obvious improvement, but there's enough subtlety around active_mmu_pages
> that I don't want to make any changes without a strong benefit.
> 

My testing mostly focused on correctness, though I did see the fault rate
go down on a long-running VM that I used to host a web server (I only
gave it around a gig of RAM, so it is on the more extreme end).

> Specifically, kvm_zap_obsolete_pages() has a hard dependency on the list being
> FIFO.  We _might_ be ok if we make sure to filter out obsolete pages, but only
> because of KVM's behavior of (a) only allowing two memslot generations at any
> given time and (b) zapping all shadow pages from the old/obsolete generation
> prior to kvm_zap_obsolete_pages() exiting.

for_each_valid_sp() already filters out obsolete pages, so we should be
good to go from that perspective.

> As alluded to above, I think I'd prefer to put this in kvm_mmu_find_shadow_page()?
> Largely a moot point, but it seems like we'd want to move a page to the head of
> the list if we look it up for any reason.

Sounds good to me. I put it in __kvm_mmu_get_shadow_page() since it seemed
clearer, and it's currently the only caller of kvm_mmu_find_shadow_page().