This series is extracted out from the NUMA aware page table series[1].
MMU shrinker changes were in patches 1 to 9 in the old series.

This series changes the KVM MMU shrinker behaviour to empty the MMU page
caches which are used during page fault and MMU load operations. It also
incorporates feedback from the NUMA aware page table series[1] regarding
the MMU shrinker.

The KVM MMU shrinker has not been very effective in alleviating pain
under memory pressure. It frees pages which are actively being used,
which results in VM degradation: the VM will take faults and bring them
back into its page tables. More discussion happened at [2]; the overall
consensus was to repurpose the shrinker into code which frees pages from
the KVM MMU page caches.

Recently [3], there was a discussion about disabling the shrinker for
the TDP MMU. The revival of this series is a result of that discussion.

There are two major differences from the old series:

1. There is no global accounting of cache pages; the count is calculated
   dynamically in mmu_shrink_count(). This has two effects: i) counting
   will be inaccurate, but the code is much simpler, and ii) kvm_lock is
   taken here, which should be fine as mmu_shrink_scan() also holds the
   lock for its operation.

2. Only mmu_shadow_page_cache and mmu_shadowed_info_cache are emptied.
   This version doesn't empty split_shadow_page_cache, as it is used
   only during the dirty logging operation and there is one per VM,
   unlike the other two which are per vCPU. I am not fully convinced
   that adding it is needed, as it would add the cost of one more mutex
   and of synchronizing it in the shrinker. Also, if a VM is being dirty
   tracked it will most likely be migrated (memory pressure might be the
   reason in the first place), so it is better not to hinder the
   migration effort and to let vCPUs free up their caches. If someone
   convinces me to cover the split cache as well, I can send a separate
   patch to add it.

[1] https://lore.kernel.org/kvm/20230306224127.1689967-1-vipinsh@google.com/
[2] https://lore.kernel.org/lkml/Y45dldZnI6OIf+a5@google.com/
[3] https://lore.kernel.org/kvm/20240819214014.GA2313467.vipinsh@google.com/#t

v1:
- No global counting of pages in cache, as this number might not remain
  the same between calls of mmu_shrink_count() and mmu_shrink_scan().
- Count cache pages in mmu_shrink_count(). KVM can tolerate inaccuracy
  here.
- Empty mmu_shadow_page_cache and mmu_shadowed_info_cache only. Don't
  empty split_shadow_page_cache.

v0: Patches 1-9 from the NUMA aware page table series.
https://lore.kernel.org/kvm/20230306224127.1689967-1-vipinsh@google.com/

Vipin Sharma (2):
  KVM: x86/mmu: Change KVM mmu shrinker to no-op
  KVM: x86/mmu: Use MMU shrinker to shrink KVM MMU memory caches

 arch/x86/include/asm/kvm_host.h |   7 +-
 arch/x86/kvm/mmu/mmu.c          | 139 +++++++++++++-------------------
 arch/x86/kvm/mmu/paging_tmpl.h  |  14 ++--
 include/linux/kvm_host.h        |   1 +
 virt/kvm/kvm_main.c             |   8 +-
 5 files changed, 78 insertions(+), 91 deletions(-)


base-commit: 12680d7b8ac4db2eba6237a21a93d2b0e78a52a6
-- 
2.46.0.662.g92d0881bb0-goog
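[Editor's note] For illustration only, here is a minimal sketch of the
count/scan shape described in the cover letter; it is not the actual
patches in this series. kvm_lock, vm_list, kvm_for_each_vcpu() and the
two per-vCPU caches exist in KVM today, but the per-vCPU "cache_lock"
and the way freed objects are counted are assumptions about how the
series might synchronize shrinking against concurrent page faults.

/*
 * Sketch only, not the actual series. "cache_lock" is an assumed per-vCPU
 * mutex protecting the MMU memory caches against concurrent use by the
 * vCPU during page faults; the real code would also need to respect
 * sc->gfp_mask and reclaim-context constraints.
 */
static unsigned long mmu_shrink_count(struct shrinker *shrink,
				      struct shrink_control *sc)
{
	unsigned long pages = 0;
	struct kvm_vcpu *vcpu;
	unsigned long i;
	struct kvm *kvm;

	mutex_lock(&kvm_lock);
	list_for_each_entry(kvm, &vm_list, vm_list) {
		kvm_for_each_vcpu(i, vcpu, kvm) {
			/* Racy snapshot; inaccuracy is tolerated here. */
			pages += READ_ONCE(vcpu->arch.mmu_shadow_page_cache.nobjs);
			pages += READ_ONCE(vcpu->arch.mmu_shadowed_info_cache.nobjs);
		}
	}
	mutex_unlock(&kvm_lock);

	return pages;
}

static unsigned long mmu_shrink_scan(struct shrinker *shrink,
				     struct shrink_control *sc)
{
	unsigned long freed = 0;
	struct kvm_vcpu *vcpu;
	unsigned long i;
	struct kvm *kvm;

	mutex_lock(&kvm_lock);
	list_for_each_entry(kvm, &vm_list, vm_list) {
		kvm_for_each_vcpu(i, vcpu, kvm) {
			mutex_lock(&vcpu->arch.cache_lock);	/* assumed lock */
			freed += vcpu->arch.mmu_shadow_page_cache.nobjs;
			freed += vcpu->arch.mmu_shadowed_info_cache.nobjs;
			kvm_mmu_free_memory_cache(&vcpu->arch.mmu_shadow_page_cache);
			kvm_mmu_free_memory_cache(&vcpu->arch.mmu_shadowed_info_cache);
			mutex_unlock(&vcpu->arch.cache_lock);

			if (freed >= sc->nr_to_scan)
				goto out;
		}
	}
out:
	mutex_unlock(&kvm_lock);
	return freed;
}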
On 2024-09-13 02:43 PM, Vipin Sharma wrote:
> This series is extracted out from the NUMA aware page table series[1].
> MMU shrinker changes were in patches 1 to 9 in the old series.

I'm curious how you tested this series. Would it be possible to write a
selftest to exercise KVM's shrinker interactions? I don't think it needs
to be anything fancy to be useful (e.g. just run a VM, trigger lots of
shrinking, and make sure nothing blows up).

There appears to be a debugfs interface which could be used to trigger
shrinking from a selftest.

https://docs.kernel.org/admin-guide/mm/shrinker_debugfs.html
On Wed, Sep 25, 2024 at 4:51 PM David Matlack <dmatlack@google.com> wrote:
>
> On 2024-09-13 02:43 PM, Vipin Sharma wrote:
> > This series is extracted out from the NUMA aware page table series[1].
> > MMU shrinker changes were in patches 1 to 9 in the old series.
>
> I'm curious how you tested this series. Would it be possible to write a
> selftest to exercise KVM's shrinker interactions? I don't think it needs
> to be anything fancy to be useful (e.g. just run a VM, trigger lots of
> shrinking, and make sure nothing blows up).

My testing was dropping caches (echo 2 > /proc/sys/vm/drop_caches) in
the background while running the dirty_log_perf_test selftest multiple
times. I added printk() in shrink_count() and shrink_scan() to make sure
pages were being reported and released.

I can write a test which spawns a thread to drop caches and a VM which
touches all of its pages to generate page faults. The only downside is
that it will not detect whether the KVM MMU shrinker is actually being
invoked, counting, and freeing pages.

>
> There appears to be a debugfs interface which could be used to trigger
> shrinking from a selftest.
>
> https://docs.kernel.org/admin-guide/mm/shrinker_debugfs.html

This is interesting and it does what is needed to test the KVM MMU
shrinker. However, it needs CONFIG_DEBUG_FS and CONFIG_SHRINKER_DEBUG.
I think using shrinker_debugfs will be better; the selftest can just
skip if it cannot find the shrinker_debugfs files. The one downside is
that the test will not run if those configs are not enabled.

Which one do you prefer? I prefer shrinker_debugfs but am concerned
about its dependency on those two configs; I am not sure if it is okay
to have this kind of dependency in a selftest.
On Mon, Sep 30, 2024 at 9:43 AM Vipin Sharma <vipinsh@google.com> wrote:
>
> On Wed, Sep 25, 2024 at 4:51 PM David Matlack <dmatlack@google.com> wrote:
> >
> > On 2024-09-13 02:43 PM, Vipin Sharma wrote:
> > > This series is extracted out from the NUMA aware page table series[1].
> > > MMU shrinker changes were in patches 1 to 9 in the old series.
> >
> > I'm curious how you tested this series. Would it be possible to write a
> > selftest to exercise KVM's shrinker interactions? I don't think it needs
> > to be anything fancy to be useful (e.g. just run a VM, trigger lots of
> > shrinking, and make sure nothing blows up).
>
> My testing was dropping caches (echo 2 > /proc/sys/vm/drop_caches) in
> the background while running the dirty_log_perf_test selftest multiple
> times. I added printk() in shrink_count() and shrink_scan() to make
> sure pages were being reported and released.
>
> I can write a test which spawns a thread to drop caches and a VM which
> touches all of its pages to generate page faults. The only downside is
> that it will not detect whether the KVM MMU shrinker is actually being
> invoked, counting, and freeing pages.
>
> >
> > There appears to be a debugfs interface which could be used to trigger
> > shrinking from a selftest.
> >
> > https://docs.kernel.org/admin-guide/mm/shrinker_debugfs.html
>
> This is interesting and it does what is needed to test the KVM MMU
> shrinker. However, it needs CONFIG_DEBUG_FS and CONFIG_SHRINKER_DEBUG.
> I think using shrinker_debugfs will be better; the selftest can just
> skip if it cannot find the shrinker_debugfs files. The one downside is
> that the test will not run if those configs are not enabled.
>
> Which one do you prefer? I prefer shrinker_debugfs but am concerned
> about its dependency on those two configs; I am not sure if it is okay
> to have this kind of dependency in a selftest.

Configs are there to be enabled. If shrinker_debugfs is the right way to
test this, I don't see any reason to shy away from using it.
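[Editor's note] For reference, a minimal sketch of how a selftest could
drive the shrinker through shrinker_debugfs. Assumptions, per the
documentation linked above: CONFIG_DEBUG_FS and CONFIG_SHRINKER_DEBUG
are enabled, debugfs is mounted at /sys/kernel/debug, the program runs
as root, and the KVM MMU shrinker directory name contains "kvm"; the
exact directory name and scan-file format should be double-checked
against that documentation on the target kernel.

/* Sketch of a selftest-style helper that triggers shrinking via debugfs. */
#include <glob.h>
#include <stdio.h>

int main(void)
{
	char path[4096];
	char line[256];
	glob_t g;
	FILE *f;

	/* Assumption: the KVM MMU shrinker directory name contains "kvm". */
	if (glob("/sys/kernel/debug/shrinker/*kvm*", 0, NULL, &g) || !g.gl_pathc) {
		fprintf(stderr, "SKIP: shrinker debugfs not available\n");
		return 0;
	}

	/* Print the object counts reported by the shrinker's count callback. */
	snprintf(path, sizeof(path), "%s/count", g.gl_pathv[0]);
	f = fopen(path, "r");
	if (f) {
		while (fgets(line, sizeof(line), f))
			printf("count: %s", line);
		fclose(f);
	}

	/*
	 * Trigger the scan callback. Per the shrinker_debugfs documentation,
	 * the format is "<cgroup inode id> <node id> <nr objects to scan>";
	 * 0 is used for the cgroup id of a non-memcg-aware shrinker.
	 */
	snprintf(path, sizeof(path), "%s/scan", g.gl_pathv[0]);
	f = fopen(path, "w");
	if (!f) {
		perror("open scan");
		globfree(&g);
		return 1;
	}
	fprintf(f, "0 0 1000\n");
	fclose(f);

	globfree(&g);
	return 0;
}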