From nobody Mon Sep 15 21:48:58 2025
Date: Mon, 9 Jan 2023 21:53:42 +0000
Message-ID: <20230109215347.3119271-2-rananta@google.com>
In-Reply-To: <20230109215347.3119271-1-rananta@google.com>
References: <20230109215347.3119271-1-rananta@google.com>
Subject: [RFC PATCH 1/6] arm64: tlb: Refactor the core flush algorithm of __flush_tlb_range
From: Raghavendra Rao Ananta
To: Oliver Upton, Marc Zyngier, Ricardo Koller, Reiji Watanabe,
    James Morse, Alexandru Elisei, Suzuki K Poulose
Cc: Paolo Bonzini, Catalin Marinas, Will Deacon, Jing Zhang, Colton Lewis,
    Raghavendra Rao Anata, linux-arm-kernel@lists.infradead.org,
    kvmarm@lists.linux.dev, linux-kernel@vger.kernel.org, kvm@vger.kernel.org

Currently, the core TLB flush functionality of __flush_tlb_range()
hardcodes vae1is (and variants) for the flush operation. In the
upcoming patches, the KVM code reuses this core algorithm with
ipas2e1is for range-based TLB invalidations based on the IPA. Hence,
extract the core flush functionality of __flush_tlb_range() into its
own macro that accepts an 'op' argument to pass any TLBI instruction.

No functional changes intended.

Signed-off-by: Raghavendra Rao Ananta
---
 arch/arm64/include/asm/tlbflush.h | 107 +++++++++++++++---------------
 1 file changed, 54 insertions(+), 53 deletions(-)

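The arithmetic described in the new comment is easier to see in isolation.
Below is a minimal user-space sketch (not kernel code): TLBI_RANGE_NUM() and
TLBI_RANGE_PAGES() are simplified stand-ins for the kernel's __TLBI_RANGE_NUM()
and __TLBI_RANGE_PAGES() macros, and the loop mirrors the one in the macro
being added (the !system_supports_tlb_range() fallback is omitted).

/*
 * For a given 'scale', one range TLBI covers (num + 1) << (5*scale + 1)
 * pages, with num limited to 0..30.
 */
#include <stdio.h>

#define TLBI_RANGE_PAGES(num, scale)  ((unsigned long)((num) + 1) << (5 * (scale) + 1))
#define TLBI_RANGE_NUM(pages, scale)  (((int)((pages) >> (5 * (scale) + 1)) & 31) - 1)

int main(void)
{
        unsigned long pages = 513;      /* e.g. 2MiB + 4KiB worth of 4KiB pages */
        int scale = 0;

        while (pages > 0) {
                if (pages % 2 == 1) {   /* odd leftover: one single-page TLBI */
                        pages -= 1;
                        printf("single-page flush, %lu pages left\n", pages);
                        continue;
                }

                int num = TLBI_RANGE_NUM(pages, scale);
                if (num >= 0) {         /* this scale can cover part of the range */
                        unsigned long chunk = TLBI_RANGE_PAGES(num, scale);

                        printf("range TLBI: scale=%d num=%d -> %lu pages\n", scale, num, chunk);
                        pages -= chunk;
                }
                scale++;
        }
        return 0;
}
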
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 412a3b9a3c25d..9a57eae14e576 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -278,14 +278,60 @@ static inline void flush_tlb_page(struct vm_area_struct *vma,
  */
 #define MAX_TLBI_OPS	PTRS_PER_PTE

+/* When the CPU does not support TLB range operations, flush the TLB
+ * entries one by one at the granularity of 'stride'. If the TLB
+ * range ops are supported, then:
+ *
+ * 1. If 'pages' is odd, flush the first page through non-range
+ *    operations;
+ *
+ * 2. For remaining pages: the minimum range granularity is decided
+ *    by 'scale', so multiple range TLBI operations may be required.
+ *    Start from scale = 0, flush the corresponding number of pages
+ *    ((num+1)*2^(5*scale+1) starting from 'addr'), then increase it
+ *    until no pages left.
+ *
+ * Note that certain ranges can be represented by either num = 31 and
+ * scale or num = 0 and scale + 1. The loop below favours the latter
+ * since num is limited to 30 by the __TLBI_RANGE_NUM() macro.
+ */
+#define __flush_tlb_range_op(op, start, pages, stride, asid, tlb_level, tlbi_user) do { \
+	int num = 0; \
+	int scale = 0; \
+	unsigned long addr; \
+	\
+	while (pages > 0) { \
+		if (!system_supports_tlb_range() || \
+		    pages % 2 == 1) { \
+			addr = __TLBI_VADDR(start, asid); \
+			__tlbi_level(op, addr, tlb_level); \
+			if (tlbi_user) \
+				__tlbi_user_level(op, addr, tlb_level); \
+			start += stride; \
+			pages -= stride >> PAGE_SHIFT; \
+			continue; \
+		} \
+		\
+		num = __TLBI_RANGE_NUM(pages, scale); \
+		if (num >= 0) { \
+			addr = __TLBI_VADDR_RANGE(start, asid, scale, \
+						  num, tlb_level); \
+			__tlbi(r##op, addr); \
+			if (tlbi_user) \
+				__tlbi_user(r##op, addr); \
+			start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
+			pages -= __TLBI_RANGE_PAGES(num, scale); \
+		} \
+		scale++; \
+	} \
+} while (0)
+
 static inline void __flush_tlb_range(struct vm_area_struct *vma,
 				     unsigned long start, unsigned long end,
 				     unsigned long stride, bool last_level,
 				     int tlb_level)
 {
-	int num = 0;
-	int scale = 0;
-	unsigned long asid, addr, pages;
+	unsigned long asid, pages;

 	start = round_down(start, stride);
 	end = round_up(end, stride);
@@ -307,56 +353,11 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
 	dsb(ishst);
 	asid = ASID(vma->vm_mm);

-	/*
-	 * When the CPU does not support TLB range operations, flush the TLB
-	 * entries one by one at the granularity of 'stride'. If the TLB
-	 * range ops are supported, then:
-	 *
-	 * 1. If 'pages' is odd, flush the first page through non-range
-	 *    operations;
-	 *
-	 * 2. For remaining pages: the minimum range granularity is decided
-	 *    by 'scale', so multiple range TLBI operations may be required.
-	 *    Start from scale = 0, flush the corresponding number of pages
-	 *    ((num+1)*2^(5*scale+1) starting from 'addr'), then increase it
-	 *    until no pages left.
-	 *
-	 * Note that certain ranges can be represented by either num = 31 and
-	 * scale or num = 0 and scale + 1. The loop below favours the latter
-	 * since num is limited to 30 by the __TLBI_RANGE_NUM() macro.
-	 */
-	while (pages > 0) {
-		if (!system_supports_tlb_range() ||
-		    pages % 2 == 1) {
-			addr = __TLBI_VADDR(start, asid);
-			if (last_level) {
-				__tlbi_level(vale1is, addr, tlb_level);
-				__tlbi_user_level(vale1is, addr, tlb_level);
-			} else {
-				__tlbi_level(vae1is, addr, tlb_level);
-				__tlbi_user_level(vae1is, addr, tlb_level);
-			}
-			start += stride;
-			pages -= stride >> PAGE_SHIFT;
-			continue;
-		}
-
-		num = __TLBI_RANGE_NUM(pages, scale);
-		if (num >= 0) {
-			addr = __TLBI_VADDR_RANGE(start, asid, scale,
-						  num, tlb_level);
-			if (last_level) {
-				__tlbi(rvale1is, addr);
-				__tlbi_user(rvale1is, addr);
-			} else {
-				__tlbi(rvae1is, addr);
-				__tlbi_user(rvae1is, addr);
-			}
-			start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT;
-			pages -= __TLBI_RANGE_PAGES(num, scale);
-		}
-		scale++;
-	}
+	if (last_level)
+		__flush_tlb_range_op(vale1is, start, pages, stride, asid, tlb_level, true);
+	else
+		__flush_tlb_range_op(vae1is, start, pages, stride, asid, tlb_level, true);
+
 	dsb(ish);
 }

-- 
2.39.0.314.g84b9a713c41-goog

From nobody Mon Sep 15 21:48:58 2025
Date: Mon, 9 Jan 2023 21:53:43 +0000
Message-ID: <20230109215347.3119271-3-rananta@google.com>
In-Reply-To: <20230109215347.3119271-1-rananta@google.com>
References: <20230109215347.3119271-1-rananta@google.com>
Subject: [RFC PATCH 2/6] KVM: arm64: Add support for FEAT_TLBIRANGE
From: Raghavendra Rao Ananta
To: Oliver Upton, Marc Zyngier, Ricardo Koller, Reiji Watanabe,
    James Morse, Alexandru Elisei, Suzuki K Poulose
Cc: Paolo Bonzini, Catalin Marinas, Will Deacon, Jing Zhang, Colton Lewis,
    Raghavendra Rao Anata, linux-arm-kernel@lists.infradead.org,
    kvmarm@lists.linux.dev, linux-kernel@vger.kernel.org, kvm@vger.kernel.org

Define a generic macro, __kvm_tlb_flush_range(), to invalidate the TLBs
over a range of addresses. Use it to define
__kvm_tlb_flush_range_vmid_ipa() (for VHE and nVHE) to flush a range of
stage-2 page-table entries by IPA in one go.

If the system supports FEAT_TLBIRANGE, the following patches will
conveniently replace global TLBIs such as vmalls12e1is in the map,
unmap, and dirty-logging paths with ripas2e1is instead.

Signed-off-by: Raghavendra Rao Ananta
---
 arch/arm64/include/asm/kvm_asm.h   | 21 +++++++++++++++++++++
 arch/arm64/kvm/hyp/nvhe/hyp-main.c | 11 +++++++++++
 arch/arm64/kvm/hyp/nvhe/tlb.c      | 24 ++++++++++++++++++++++++
 arch/arm64/kvm/hyp/vhe/tlb.c       | 20 ++++++++++++++++++++
 4 files changed, 76 insertions(+)

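The interesting part of __kvm_tlb_flush_range() is when it gives up and falls
back to a full VMID flush. The sketch below restates that policy as standalone
C; MAX_TLBI_OPS and MAX_TLBI_RANGE_PAGES are treated as given constants whose
values are assumptions of this sketch (what a 4KiB-page kernel would use), not
definitions taken from the patch.

#include <stdbool.h>

#define PAGE_SHIFT            12
#define PAGE_SIZE             (1UL << PAGE_SHIFT)
#define MAX_TLBI_OPS          512          /* PTRS_PER_PTE with 4KiB pages */
#define MAX_TLBI_RANGE_PAGES  (1UL << 21)  /* largest span a range TLBI sequence can encode */

/* Returns true when the range is better served by __kvm_tlb_flush_vmid(). */
bool needs_full_vmid_flush(bool has_tlbirange, unsigned long start, unsigned long end)
{
        unsigned long pages = (end - start) >> PAGE_SHIFT;

        /* No FEAT_TLBIRANGE: a long range would mean too many single-page TLBIs. */
        if (!has_tlbirange && (end - start) >= MAX_TLBI_OPS * PAGE_SIZE)
                return true;

        /* Even with range TLBIs, the encoding cannot express arbitrarily large spans. */
        if (pages >= MAX_TLBI_RANGE_PAGES)
                return true;

        return false;
}
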
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 43c3bc0f9544d..bdf94ae0333b0 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -79,6 +79,7 @@ enum __kvm_host_smccc_func {
 	__KVM_HOST_SMCCC_FUNC___pkvm_init_vm,
 	__KVM_HOST_SMCCC_FUNC___pkvm_init_vcpu,
 	__KVM_HOST_SMCCC_FUNC___pkvm_teardown_vm,
+	__KVM_HOST_SMCCC_FUNC___kvm_tlb_flush_range_vmid_ipa,
 };

 #define DECLARE_KVM_VHE_SYM(sym)	extern char sym[]
@@ -221,10 +222,30 @@ DECLARE_KVM_NVHE_SYM(__per_cpu_end);
 DECLARE_KVM_HYP_SYM(__bp_harden_hyp_vecs);
 #define __bp_harden_hyp_vecs	CHOOSE_HYP_SYM(__bp_harden_hyp_vecs)

+#define __kvm_tlb_flush_range(op, mmu, start, end, tlb_level) do { \
+	unsigned long pages, stride; \
+	\
+	stride = PAGE_SIZE; \
+	start = round_down(start, stride); \
+	end = round_up(end, stride); \
+	pages = (end - start) >> PAGE_SHIFT; \
+	\
+	if ((!system_supports_tlb_range() && \
+	     (end - start) >= (MAX_TLBI_OPS * stride)) || \
+	    pages >= MAX_TLBI_RANGE_PAGES) { \
+		__kvm_tlb_flush_vmid(mmu); \
+		break; \
+	} \
+	\
+	__flush_tlb_range_op(op, start, pages, stride, 0, tlb_level, false); \
+} while (0)
+
 extern void __kvm_flush_vm_context(void);
 extern void __kvm_flush_cpu_context(struct kvm_s2_mmu *mmu);
 extern void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, phys_addr_t ipa,
 				     int level);
+extern void __kvm_tlb_flush_range_vmid_ipa(struct kvm_s2_mmu *mmu, phys_addr_t start,
+					   phys_addr_t end, int level);
 extern void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu);

 extern void __kvm_timer_set_cntvoff(u64 cntvoff);
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 728e01d4536b0..ac52d0fbb9719 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -116,6 +116,16 @@ static void handle___kvm_flush_vm_context(struct kvm_cpu_context *host_ctxt)
 	__kvm_flush_vm_context();
 }

+static void handle___kvm_tlb_flush_range_vmid_ipa(struct kvm_cpu_context *host_ctxt)
+{
+	DECLARE_REG(struct kvm_s2_mmu *, mmu, host_ctxt, 1);
+	DECLARE_REG(phys_addr_t, start, host_ctxt, 2);
+	DECLARE_REG(phys_addr_t, end, host_ctxt, 3);
+	DECLARE_REG(int, level, host_ctxt, 4);
+
+	__kvm_tlb_flush_range_vmid_ipa(kern_hyp_va(mmu), start, end, level);
+}
+
 static void handle___kvm_tlb_flush_vmid_ipa(struct kvm_cpu_context *host_ctxt)
 {
 	DECLARE_REG(struct kvm_s2_mmu *, mmu, host_ctxt, 1);
@@ -314,6 +324,7 @@ static const hcall_t host_hcall[] = {
 	HANDLE_FUNC(__kvm_adjust_pc),
 	HANDLE_FUNC(__kvm_vcpu_run),
 	HANDLE_FUNC(__kvm_flush_vm_context),
+	HANDLE_FUNC(__kvm_tlb_flush_range_vmid_ipa),
 	HANDLE_FUNC(__kvm_tlb_flush_vmid_ipa),
 	HANDLE_FUNC(__kvm_tlb_flush_vmid),
 	HANDLE_FUNC(__kvm_flush_cpu_context),
diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c
index d296d617f5896..292f5c4834d08 100644
--- a/arch/arm64/kvm/hyp/nvhe/tlb.c
+++ b/arch/arm64/kvm/hyp/nvhe/tlb.c
@@ -55,6 +55,30 @@ static void __tlb_switch_to_host(struct tlb_inv_context *cxt)
 	}
 }

+void __kvm_tlb_flush_range_vmid_ipa(struct kvm_s2_mmu *mmu, phys_addr_t start,
+				    phys_addr_t end, int level)
+{
+	struct tlb_inv_context cxt;
+
+	dsb(ishst);
+
+	/* Switch to requested VMID */
+	__tlb_switch_to_guest(mmu, &cxt);
+
+	__kvm_tlb_flush_range(ipas2e1is, mmu, start, end, level);
+
+	dsb(ish);
+	__tlbi(vmalle1is);
+	dsb(ish);
+	isb();
+
+	/* See the comment below in __kvm_tlb_flush_vmid_ipa() */
+	if (icache_is_vpipt())
+		icache_inval_all_pou();
+
+	__tlb_switch_to_host(&cxt);
+}
+
 void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, phys_addr_t ipa,
 			      int level)
 {
diff --git a/arch/arm64/kvm/hyp/vhe/tlb.c b/arch/arm64/kvm/hyp/vhe/tlb.c
index 24cef9b87f9e9..2631cc09e4184 100644
--- a/arch/arm64/kvm/hyp/vhe/tlb.c
+++ b/arch/arm64/kvm/hyp/vhe/tlb.c
@@ -79,6 +79,26 @@ static void __tlb_switch_to_host(struct tlb_inv_context *cxt)
 	local_irq_restore(cxt->flags);
 }

+void __kvm_tlb_flush_range_vmid_ipa(struct kvm_s2_mmu *mmu, phys_addr_t start,
+				    phys_addr_t end, int level)
+{
+	struct tlb_inv_context cxt;
+
+	dsb(ishst);
+
+	/* Switch to requested VMID */
+	__tlb_switch_to_guest(mmu, &cxt);
+
+	__kvm_tlb_flush_range(ipas2e1is, mmu, start, end, level);
+
+	dsb(ish);
+	__tlbi(vmalle1is);
+	dsb(ish);
+	isb();
+
+	__tlb_switch_to_host(&cxt);
+}
+
 void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu, phys_addr_t ipa,
 			      int level)
 {
-- 
2.39.0.314.g84b9a713c41-goog

From nobody Mon Sep 15 21:48:58 2025
Date: Mon, 9 Jan 2023 21:53:44 +0000
Message-ID: <20230109215347.3119271-4-rananta@google.com>
In-Reply-To: <20230109215347.3119271-1-rananta@google.com>
References: <20230109215347.3119271-1-rananta@google.com>
Subject: [RFC PATCH 3/6] KVM: Define kvm_flush_remote_tlbs_range
From: Raghavendra Rao Ananta
To: Oliver Upton, Marc Zyngier, Ricardo Koller, Reiji Watanabe,
    James Morse, Alexandru Elisei, Suzuki K Poulose
Cc: Paolo Bonzini, Catalin Marinas, Will Deacon, Jing Zhang, Colton Lewis,
    Raghavendra Rao Anata, linux-arm-kernel@lists.infradead.org,
    kvmarm@lists.linux.dev, linux-kernel@vger.kernel.org, kvm@vger.kernel.org

Define kvm_flush_remote_tlbs_range() to limit the TLB flush to a
certain range of addresses, and replace the existing call to
kvm_flush_remote_tlbs() in the MMU notifier path with it.
Architectures such as arm64 can override it to flush only the
necessary addresses, instead of invalidating the entire VM's TLB
entries.

Signed-off-by: Raghavendra Rao Ananta
---
 arch/arm64/kvm/mmu.c     | 10 ++++++++++
 include/linux/kvm_host.h |  1 +
 virt/kvm/kvm_main.c      |  7 ++++++-
 3 files changed, 17 insertions(+), 1 deletion(-)

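Condensed, the structure this patch sets up looks like the sketch below. The
config guard shown is assumed to be the same HAVE_KVM_ARCH_TLB_FLUSH_ALL guard
that already wraps kvm_flush_remote_tlbs() in virt/kvm/kvm_main.c, and
'struct kvm' is stubbed so the sketch stands alone.

struct kvm { int dummy; };

void kvm_flush_remote_tlbs(struct kvm *kvm);    /* existing full flush */

#ifndef CONFIG_HAVE_KVM_ARCH_TLB_FLUSH_ALL
/* Generic fallback: the range is ignored and everything is flushed. */
void kvm_flush_remote_tlbs_range(struct kvm *kvm, unsigned long start, unsigned long end)
{
        (void)start;
        (void)end;
        kvm_flush_remote_tlbs(kvm);
}
#endif

/*
 * arm64 selects HAVE_KVM_ARCH_TLB_FLUSH_ALL and supplies its own definition,
 * which issues a ranged stage-2 invalidation when FEAT_TLBIRANGE is present
 * and degrades to the full flush otherwise (see the arch/arm64/kvm/mmu.c hunk
 * below).
 */
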
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 39d9a334efb57..70f76bc909c5d 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -91,6 +91,16 @@ void kvm_flush_remote_tlbs(struct kvm *kvm)
 	kvm_call_hyp(__kvm_tlb_flush_vmid, &kvm->arch.mmu);
 }

+void kvm_flush_remote_tlbs_range(struct kvm *kvm, unsigned long start, unsigned long end)
+{
+	struct kvm_s2_mmu *mmu = &kvm->arch.mmu;
+
+	if (system_supports_tlb_range())
+		kvm_call_hyp(__kvm_tlb_flush_range_vmid_ipa, mmu, start, end, 0);
+	else
+		kvm_flush_remote_tlbs(kvm);
+}
+
 static bool kvm_is_device_pfn(unsigned long pfn)
 {
 	return !pfn_is_map_memory(pfn);
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index f51eb9419bfc3..a76cede9dc3bb 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1359,6 +1359,7 @@ int kvm_vcpu_yield_to(struct kvm_vcpu *target);
 void kvm_vcpu_on_spin(struct kvm_vcpu *vcpu, bool usermode_vcpu_not_eligible);

 void kvm_flush_remote_tlbs(struct kvm *kvm);
+void kvm_flush_remote_tlbs_range(struct kvm *kvm, unsigned long start, unsigned long end);

 #ifdef KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE
 int kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 03e6a38094c17..f538ecc984f5b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -376,6 +376,11 @@ void kvm_flush_remote_tlbs(struct kvm *kvm)
 	++kvm->stat.generic.remote_tlb_flush;
 }
 EXPORT_SYMBOL_GPL(kvm_flush_remote_tlbs);
+
+void kvm_flush_remote_tlbs_range(struct kvm *kvm, unsigned long start, unsigned long end)
+{
+	kvm_flush_remote_tlbs(kvm);
+}
 #endif

 static void kvm_flush_shadow_all(struct kvm *kvm)
@@ -637,7 +642,7 @@ static __always_inline int __kvm_handle_hva_range(struct kvm *kvm,
 	}

 	if (range->flush_on_ret && ret)
-		kvm_flush_remote_tlbs(kvm);
+		kvm_flush_remote_tlbs_range(kvm, range->start, range->end - 1);

 	if (locked) {
 		KVM_MMU_UNLOCK(kvm);
-- 
2.39.0.314.g84b9a713c41-goog

From nobody Mon Sep 15 21:48:58 2025
Date: Mon, 9 Jan 2023 21:53:45 +0000
Message-ID: <20230109215347.3119271-5-rananta@google.com>
In-Reply-To: <20230109215347.3119271-1-rananta@google.com>
References: <20230109215347.3119271-1-rananta@google.com>
Subject: [RFC PATCH 4/6] KVM: arm64: Optimize TLBIs in the dirty logging path
From: Raghavendra Rao Ananta
To: Oliver Upton, Marc Zyngier, Ricardo Koller, Reiji Watanabe,
    James Morse, Alexandru Elisei, Suzuki K Poulose
Cc: Paolo Bonzini, Catalin Marinas, Will Deacon, Jing Zhang, Colton Lewis,
    Raghavendra Rao Anata, linux-arm-kernel@lists.infradead.org,
    kvmarm@lists.linux.dev, linux-kernel@vger.kernel.org, kvm@vger.kernel.org

Currently the dirty-logging paths, including
kvm_arch_flush_remote_tlbs_memslot() and kvm_mmu_wp_memory_region(),
invalidate the entire VM's TLB entries using kvm_flush_remote_tlbs().
Since these functions already know the range of IPAs affected, this is
highly inefficient on systems that support FEAT_TLBIRANGE. Hence, use
kvm_flush_remote_tlbs_range() to flush only the relevant TLB entries
instead.

Signed-off-by: Raghavendra Rao Ananta
---
 arch/arm64/kvm/arm.c | 7 ++++++-
 arch/arm64/kvm/mmu.c | 2 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

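As a concrete illustration of the range handed to kvm_flush_remote_tlbs_range()
for a memslot, the numbers below are made up for the example (a 1GiB memslot of
4KiB pages starting at IPA 0x40000000); only the shift arithmetic mirrors the
patch.

#include <stdio.h>

int main(void)
{
        unsigned long page_shift = 12;          /* 4KiB pages */
        unsigned long base_gfn   = 0x40000;     /* memslot starts at IPA 0x40000000 */
        unsigned long npages     = 1UL << 18;   /* 1GiB worth of 4KiB pages */

        unsigned long start = base_gfn << page_shift;
        unsigned long end   = (base_gfn + npages) << page_shift;

        printf("flush IPA range [0x%lx, 0x%lx) instead of the whole VM\n", start, end);
        return 0;
}
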
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 00da570ed72bd..179520888c697 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1433,7 +1433,12 @@ void kvm_arch_sync_dirty_log(struct kvm *kvm, struct kvm_memory_slot *memslot)
 void kvm_arch_flush_remote_tlbs_memslot(struct kvm *kvm,
 					const struct kvm_memory_slot *memslot)
 {
-	kvm_flush_remote_tlbs(kvm);
+	phys_addr_t start, end;
+
+	start = memslot->base_gfn << PAGE_SHIFT;
+	end = (memslot->base_gfn + memslot->npages) << PAGE_SHIFT;
+
+	kvm_flush_remote_tlbs_range(kvm, start, end);
 }

 static int kvm_vm_ioctl_set_device_addr(struct kvm *kvm,
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 70f76bc909c5d..e34b81f5922ce 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -976,7 +976,7 @@ static void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot)
 	write_lock(&kvm->mmu_lock);
 	stage2_wp_range(&kvm->arch.mmu, start, end);
 	write_unlock(&kvm->mmu_lock);
-	kvm_flush_remote_tlbs(kvm);
+	kvm_flush_remote_tlbs_range(kvm, start, end);
 }

 /**
-- 
2.39.0.314.g84b9a713c41-goog

From nobody Mon Sep 15 21:48:58 2025
Date: Mon, 9 Jan 2023 21:53:46 +0000
Message-ID: <20230109215347.3119271-6-rananta@google.com>
In-Reply-To: <20230109215347.3119271-1-rananta@google.com>
References: <20230109215347.3119271-1-rananta@google.com>
Subject: [RFC PATCH 5/6] KVM: arm64: Optimize the stage2 map path with TLBI range instructions
From: Raghavendra Rao Ananta
To: Oliver Upton, Marc Zyngier, Ricardo Koller, Reiji Watanabe,
    James Morse, Alexandru Elisei, Suzuki K Poulose
Cc: Paolo Bonzini, Catalin Marinas, Will Deacon, Jing Zhang, Colton Lewis,
    Raghavendra Rao Anata, linux-arm-kernel@lists.infradead.org,
    kvmarm@lists.linux.dev, linux-kernel@vger.kernel.org, kvm@vger.kernel.org

Currently, when the stage-2 map path coalesces a bunch of pages into a
hugepage, KVM invalidates the entire VM's TLB entries. This causes a
performance penalty for the guest whose pages have already been
coalesced earlier, as it has to refill its TLB entries unnecessarily
again. Hence, if the system supports it, use
__kvm_tlb_flush_range_vmid_ipa() to flush only the range of pages that
have been combined into a hugepage, while leaving the other TLB entries
alone.

Signed-off-by: Raghavendra Rao Ananta
---
 arch/arm64/kvm/hyp/pgtable.c | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

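kvm_table_pte_flush() (added below) flushes exactly one granule at the level of
the table entry being replaced. The sketch shows what that range works out to,
assuming a 4KiB translation granule and four page-table levels; granule_size()
is a stand-in for the kernel's kvm_granule_size().

#include <stdio.h>

static unsigned long long granule_size(int level)
{
        /* level 3 = 4KiB, level 2 = 2MiB, level 1 = 1GiB, level 0 = 512GiB */
        return 1ULL << (12 + 9 * (3 - level));
}

int main(void)
{
        unsigned long long addr = 0x80000000ULL;        /* illustrative IPA */
        int level = 2;                                  /* table being replaced by a 2MiB block */

        printf("flush IPA range [0x%llx, 0x%llx)\n", addr, addr + granule_size(level));
        return 0;
}
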
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index b11cf2c618a6c..099032bb01bce 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -686,6 +686,22 @@ static bool stage2_try_set_pte(const struct kvm_pgtable_visit_ctx *ctx, kvm_pte_
 	return cmpxchg(ctx->ptep, ctx->old, new) == ctx->old;
 }

+static void kvm_table_pte_flush(struct kvm_s2_mmu *mmu, u64 addr, u32 level, u32 tlb_level)
+{
+	if (system_supports_tlb_range()) {
+		u64 end = addr + kvm_granule_size(level);
+
+		kvm_call_hyp(__kvm_tlb_flush_range_vmid_ipa, mmu, addr, end, tlb_level);
+	} else {
+		/*
+		 * Invalidate the whole stage-2, as we may have numerous leaf
+		 * entries below us which would otherwise need invalidating
+		 * individually.
+		 */
+		kvm_call_hyp(__kvm_tlb_flush_vmid, mmu);
+	}
+}
+
 /**
  * stage2_try_break_pte() - Invalidates a pte according to the
  *			    'break-before-make' requirements of the
@@ -693,6 +709,7 @@ static bool stage2_try_set_pte(const struct kvm_pgtable_visit_ctx *ctx, kvm_pte_
  *
  * @ctx: context of the visited pte.
  * @mmu: stage-2 mmu
+ * @tlb_level: The level at which the leaf pages are expected (for FEAT_TTL hint)
  *
  * Returns: true if the pte was successfully broken.
  *
@@ -701,7 +718,7 @@ static bool stage2_try_set_pte(const struct kvm_pgtable_visit_ctx *ctx, kvm_pte_
  * on the containing table page.
  */
 static bool stage2_try_break_pte(const struct kvm_pgtable_visit_ctx *ctx,
-				 struct kvm_s2_mmu *mmu)
+				 struct kvm_s2_mmu *mmu, u32 tlb_level)
 {
 	struct kvm_pgtable_mm_ops *mm_ops = ctx->mm_ops;

@@ -722,7 +739,7 @@ static bool stage2_try_break_pte(const struct kvm_pgtable_visit_ctx *ctx,
 	 * value (if any).
 	 */
 	if (kvm_pte_table(ctx->old, ctx->level))
-		kvm_call_hyp(__kvm_tlb_flush_vmid, mmu);
+		kvm_table_pte_flush(mmu, ctx->addr, ctx->level, tlb_level);
 	else if (kvm_pte_valid(ctx->old))
 		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu, ctx->addr, ctx->level);

@@ -804,7 +821,7 @@ static int stage2_map_walker_try_leaf(const struct kvm_pgtable_visit_ctx *ctx,
 	if (!stage2_pte_needs_update(ctx->old, new))
 		return -EAGAIN;

-	if (!stage2_try_break_pte(ctx, data->mmu))
+	if (!stage2_try_break_pte(ctx, data->mmu, ctx->level))
 		return -EAGAIN;

 	/* Perform CMOs before installation of the guest stage-2 PTE */
@@ -861,7 +878,11 @@ static int stage2_map_walk_leaf(const struct kvm_pgtable_visit_ctx *ctx,
 	if (!childp)
 		return -ENOMEM;

-	if (!stage2_try_break_pte(ctx, data->mmu)) {
+	/*
+	 * As the table will be replaced with a block, one level down would
+	 * be the current page entries held by the table.
+	 */
+	if (!stage2_try_break_pte(ctx, data->mmu, ctx->level + 1)) {
 		mm_ops->put_page(childp);
 		return -EAGAIN;
 	}
-- 
2.39.0.314.g84b9a713c41-goog

From nobody Mon Sep 15 21:48:58 2025
Date: Mon, 9 Jan 2023 21:53:47 +0000
Message-ID: <20230109215347.3119271-7-rananta@google.com>
In-Reply-To: <20230109215347.3119271-1-rananta@google.com>
References: <20230109215347.3119271-1-rananta@google.com>
Subject: [RFC PATCH 6/6] KVM: arm64: Create a fast stage-2 unmap path
From: Raghavendra Rao Ananta
To: Oliver Upton, Marc Zyngier, Ricardo Koller, Reiji Watanabe,
    James Morse, Alexandru Elisei, Suzuki K Poulose
Cc: Paolo Bonzini, Catalin Marinas, Will Deacon, Jing Zhang, Colton Lewis,
    Raghavendra Rao Anata, linux-arm-kernel@lists.infradead.org,
    kvmarm@lists.linux.dev, linux-kernel@vger.kernel.org, kvm@vger.kernel.org

The current implementation of the stage-2 unmap walker traverses the
entire page-table to clear and flush the TLBs for each entry. This
could be very expensive if the VM is not backed by hugepages. The
unmap operation can be made more efficient by disconnecting the table
at the very top (the level at which the largest block mapping can be
hosted) and doing the rest of the unmapping using free_removed_table().
If the system supports FEAT_TLBIRANGE, flush the entire range that has
been disconnected from the rest of the page-table.

Suggested-by: Ricardo Koller
Signed-off-by: Raghavendra Rao Ananta
---
 arch/arm64/kvm/hyp/pgtable.c | 44 ++++++++++++++++++++++++++++++++++++
 1 file changed, 44 insertions(+)

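A rough sense of the savings, under assumptions made only for this example
(4KiB granule, a 1GiB unmap backed entirely by page-granularity mappings):

#include <stdio.h>

int main(void)
{
        unsigned long long unmap_size = 1ULL << 30;     /* 1GiB */
        unsigned long long page_size  = 1ULL << 12;     /* 4KiB */
        unsigned long long leaf_ptes  = unmap_size / page_size;

        /* Per-entry walker: roughly one TLBI (plus a PTE clear) per leaf entry. */
        printf("per-entry unmap: ~%llu TLBIs\n", leaf_ptes);

        /* Fast path: one break-before-make on the top-level entry, one ranged
         * flush covering the whole span, then free_removed_table(). */
        printf("fast unmap: 1 table disconnect + 1 ranged flush\n");
        return 0;
}
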
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 099032bb01bce..7bcd898de2805 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -1021,6 +1021,49 @@ static int stage2_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx,
 	return 0;
 }

+/*
+ * The fast walker executes only if the unmap size is exactly equal to the
+ * largest block mapping supported (i.e. at KVM_PGTABLE_MIN_BLOCK_LEVEL),
+ * such that the underneath hierarchy at KVM_PGTABLE_MIN_BLOCK_LEVEL can
+ * be disconnected from the rest of the page-table without the need to
+ * traverse all the PTEs, at all the levels, and unmap each and every one
+ * of them. The disconnected table can be freed using free_removed_table().
+ */
+static int fast_stage2_unmap_walker(const struct kvm_pgtable_visit_ctx *ctx,
+				    enum kvm_pgtable_walk_flags visit)
+{
+	struct kvm_pgtable_mm_ops *mm_ops = ctx->mm_ops;
+	kvm_pte_t *childp = kvm_pte_follow(ctx->old, mm_ops);
+	struct kvm_s2_mmu *mmu = ctx->arg;
+
+	if (!kvm_pte_valid(ctx->old) || ctx->level != KVM_PGTABLE_MIN_BLOCK_LEVEL)
+		return 0;
+
+	if (!stage2_try_break_pte(ctx, mmu, 0))
+		return -EAGAIN;
+
+	/*
+	 * Gain back a reference for stage2_unmap_walker() to free
+	 * this table entry from KVM_PGTABLE_MIN_BLOCK_LEVEL - 1.
+	 */
+	mm_ops->get_page(ctx->ptep);
+
+	mm_ops->free_removed_table(childp, ctx->level);
+	return 0;
+}
+
+static void kvm_pgtable_try_fast_stage2_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size)
+{
+	struct kvm_pgtable_walker walker = {
+		.cb	= fast_stage2_unmap_walker,
+		.arg	= pgt->mmu,
+		.flags	= KVM_PGTABLE_WALK_TABLE_PRE,
+	};
+
+	if (size == kvm_granule_size(KVM_PGTABLE_MIN_BLOCK_LEVEL))
+		kvm_pgtable_walk(pgt, addr, size, &walker);
+}
+
 int kvm_pgtable_stage2_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size)
 {
 	struct kvm_pgtable_walker walker = {
@@ -1029,6 +1072,7 @@ int kvm_pgtable_stage2_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size)
 		.flags	= KVM_PGTABLE_WALK_LEAF | KVM_PGTABLE_WALK_TABLE_POST,
 	};

+	kvm_pgtable_try_fast_stage2_unmap(pgt, addr, size);
 	return kvm_pgtable_walk(pgt, addr, size, &walker);
 }

-- 
2.39.0.314.g84b9a713c41-goog