[PULL 06/34] accel/tcg: Optimize jump cache flush during tlb range flush

Richard Henderson posted 34 patches 3 years, 2 months ago
Maintainers: "Alex Bennée" <alex.bennee@linaro.org>, "Philippe Mathieu-Daudé" <f4bug@amsat.org>, Cornelia Huck <cohuck@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, Palmer Dabbelt <palmer@dabbelt.com>, Alistair Francis <Alistair.Francis@wdc.com>, Richard Henderson <richard.henderson@linaro.org>, WANG Xuerui <git@xen0n.name>, Riku Voipio <riku.voipio@iki.fi>, Jiaxun Yang <jiaxun.yang@flygoat.com>, Laurent Vivier <laurent@vivier.eu>, Thomas Huth <thuth@redhat.com>
There is a newer version of this series
[PULL 06/34] accel/tcg: Optimize jump cache flush during tlb range flush
Posted by Richard Henderson 3 years, 2 months ago
From: Idan Horowitz <idan.horowitz@gmail.com>

When the length of the range is large enough, clearing the whole cache is
faster than iterating over the (possibly extremely large) set of pages
contained in the range.

This mimics the pre-existing similar optimization done on the flush of the
tlb itself.

Signed-off-by: Idan Horowitz <idan.horowitz@gmail.com>
Message-Id: <20220110164754.1066025-1-idan.horowitz@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 accel/tcg/cputlb.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index 5e0d0eebc3..926d9a9192 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -783,6 +783,15 @@ static void tlb_flush_range_by_mmuidx_async_0(CPUState *cpu,
     }
     qemu_spin_unlock(&env_tlb(env)->c.lock);
 
+    /*
+     * If the length is larger than the jump cache size, then it will take
+     * longer to clear each entry individually than it will to clear it all.
+     */
+    if (d.len >= (TARGET_PAGE_SIZE * TB_JMP_CACHE_SIZE)) {
+        cpu_tb_jmp_cache_clear(cpu);
+        return;
+    }
+
     for (target_ulong i = 0; i < d.len; i += TARGET_PAGE_SIZE) {
         tb_flush_jmp_cache(cpu, d.addr + i);
     }
-- 
2.25.1