From: Leonardo Bras
To: Marc Zyngier, Oliver Upton, Joey Gouly, Steffen Eiden, Suzuki K Poulose,
    Zenghui Yu, Catalin Marinas, Will Deacon, Fuad Tabba, Leonardo Bras,
    Raghavendra Rao Ananta
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
    linux-kernel@vger.kernel.org
Subject: [PATCH v1 1/2] KVM: arm64: Introduce KVM_PGTABLE_WALK_SKIP_LEVEL* walk flags
Date: Wed, 10 Jun 2026 21:21:08 +0100
Message-ID: <20260610202112.2695205-3-leo.bras@arm.com>
In-Reply-To: <20260610202112.2695205-2-leo.bras@arm.com>
References: <20260610202112.2695205-2-leo.bras@arm.com>

Add new walk flags that tell kvm_pgtable_walk() to skip entries at a given
level and at every deeper level when walking the page tables.

Signed-off-by: Leonardo Bras
---
 arch/arm64/include/asm/kvm_pgtable.h | 13 +++++++++++++
 arch/arm64/kvm/hyp/pgtable.c         | 15 ++++++++++++++-
 2 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index 41a8687938eb..20c7c12e0e76 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -311,31 +311,44 @@ typedef bool (*kvm_pgtable_force_pte_cb_t)(u64 addr, u64 end,
  * @KVM_PGTABLE_WALK_SHARED:		Indicates the page-tables may be shared
  *					with other software walkers.
  * @KVM_PGTABLE_WALK_IGNORE_EAGAIN:	Don't terminate the walk early if
  *					the walker returns -EAGAIN.
  * @KVM_PGTABLE_WALK_SKIP_BBM_TLBI:	Visit and update table entries
  *					without Break-before-make's
  *					TLB invalidation.
  * @KVM_PGTABLE_WALK_SKIP_CMO:		Visit and update table entries
  *					without Cache maintenance
  *					operations required.
+ * @KVM_PGTABLE_WALK_SKIP_LEVEL0:	Skip visiting level-0+ entries
+ * @KVM_PGTABLE_WALK_SKIP_LEVEL1:	Skip visiting level-1+ entries
+ * @KVM_PGTABLE_WALK_SKIP_LEVEL2:	Skip visiting level-2+ entries
+ * @KVM_PGTABLE_WALK_SKIP_LEVEL3:	Skip visiting level-3 entries
  */
 enum kvm_pgtable_walk_flags {
 	KVM_PGTABLE_WALK_LEAF			= BIT(0),
 	KVM_PGTABLE_WALK_TABLE_PRE		= BIT(1),
 	KVM_PGTABLE_WALK_TABLE_POST		= BIT(2),
 	KVM_PGTABLE_WALK_SHARED			= BIT(3),
 	KVM_PGTABLE_WALK_IGNORE_EAGAIN		= BIT(4),
 	KVM_PGTABLE_WALK_SKIP_BBM_TLBI		= BIT(5),
 	KVM_PGTABLE_WALK_SKIP_CMO		= BIT(6),
+	KVM_PGTABLE_WALK_SKIP_LEVEL0		= BIT(7),
+	KVM_PGTABLE_WALK_SKIP_LEVEL1		= BIT(8),
+	KVM_PGTABLE_WALK_SKIP_LEVEL2		= BIT(9),
+	KVM_PGTABLE_WALK_SKIP_LEVEL3		= BIT(10),
 };
 
+#define KVM_PGTABLE_WALK_SKIP_LEVELS	(KVM_PGTABLE_WALK_SKIP_LEVEL0 | \
+					 KVM_PGTABLE_WALK_SKIP_LEVEL1 | \
+					 KVM_PGTABLE_WALK_SKIP_LEVEL2 | \
+					 KVM_PGTABLE_WALK_SKIP_LEVEL3)
+
 struct kvm_pgtable_visit_ctx {
 	kvm_pte_t				*ptep;
 	kvm_pte_t				old;
 	void					*arg;
 	struct kvm_pgtable_mm_ops		*mm_ops;
 	u64					start;
 	u64					addr;
 	u64					end;
 	s8					level;
 	enum kvm_pgtable_walk_flags		flags;
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 91a7dfad6686..48d88a290a53 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -137,20 +137,33 @@ static bool kvm_pgtable_walk_continue(const struct kvm_pgtable_walker *walker,
 	 * Ignore the return code altogether for walkers outside a fault handler
 	 * (e.g. write protecting a range of memory) and chug along with the
 	 * page table walk.
 	 */
 	if (r == -EAGAIN)
 		return walker->flags & KVM_PGTABLE_WALK_IGNORE_EAGAIN;
 
 	return !r;
 }
 
+static __always_inline bool kvm_pgtable_skip_level(s8 level, enum kvm_pgtable_walk_flags flags)
+{
+	flags &= KVM_PGTABLE_WALK_SKIP_LEVELS;
+
+	if (likely(!flags))
+		return false;
+
+	if (level >= (fls(flags) - ffs(KVM_PGTABLE_WALK_SKIP_LEVELS)))
+		return true;
+
+	return false;
+}
+
 static int __kvm_pgtable_walk(struct kvm_pgtable_walk_data *data,
 			      struct kvm_pgtable_mm_ops *mm_ops, kvm_pteref_t pgtable, s8 level);
 
 static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data,
 				      struct kvm_pgtable_mm_ops *mm_ops,
 				      kvm_pteref_t pteref, s8 level)
 {
 	enum kvm_pgtable_walk_flags flags = data->walker->flags;
 	kvm_pte_t *ptep = kvm_dereference_pteref(data->walker, pteref);
 	struct kvm_pgtable_visit_ctx ctx = {
@@ -185,21 +198,21 @@ static inline int __kvm_pgtable_visit(struct kvm_pgtable_walk_data *data,
 	 * into a newly installed or replaced table.
 	 */
 	if (reload) {
 		ctx.old = READ_ONCE(*ptep);
 		table = kvm_pte_table(ctx.old, level);
 	}
 
 	if (!kvm_pgtable_walk_continue(data->walker, ret))
 		goto out;
 
-	if (!table) {
+	if (!table || kvm_pgtable_skip_level(level + 1, ctx.flags)) {
 		data->addr = ALIGN_DOWN(data->addr, kvm_granule_size(level));
 		data->addr += kvm_granule_size(level);
 		goto out;
 	}
 
 	childp = (kvm_pteref_t)kvm_pte_follow(ctx.old, mm_ops);
 	ret = __kvm_pgtable_walk(data, mm_ops, childp, level + 1);
 	if (!kvm_pgtable_walk_continue(data->walker, ret))
 		goto out;
 
-- 
2.54.0

From: Leonardo Bras
To: Marc Zyngier, Oliver Upton, Joey Gouly, Steffen Eiden, Suzuki K Poulose,
    Zenghui Yu, Catalin Marinas, Will Deacon, Fuad Tabba, Leonardo Bras,
    Raghavendra Rao Ananta
Cc: linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev,
    linux-kernel@vger.kernel.org
Subject: [PATCH v1 2/2] KVM: arm64: Make stage2_split_walker() skip unnecessary walks
Date: Wed, 10 Jun 2026 21:21:09 +0100
Message-ID: <20260610202112.2695205-4-leo.bras@arm.com>
In-Reply-To: <20260610202112.2695205-2-leo.bras@arm.com>
References: <20260610202112.2695205-2-leo.bras@arm.com>

Currently, when splitting a hugepage, all of its child and sibling nodes
are walked, with the walker simply returning early if there is nothing to
do. This means every pagetable entry in the splitting range gets a callback
from the walker function, even if it is a level-3 entry.

Optimize splitting by skipping all level-3 entries, as they are already the
smallest block size and can't be split any further (i.e. set the
KVM_PGTABLE_WALK_SKIP_LEVEL3 flag).

The optimization was measured in two scenarios involving eager splitting on
a VM with 1 memslot of 64GB:
- Scenario 1: No manual protect, whole memslot split at dirty-track enable
  (KVM_SET_USER_MEMORY_REGION2 ioctl with KVM_MEM_LOG_DIRTY_PAGES)
- Scenario 2: Manual protect, split happens during dirty-bit clean
  (KVM_CLEAR_DIRTY_LOG ioctl), average for 2 iterations.

Scenario 1, improvement on dirty-track enable for the memslot:
- Memory was already split (4k pages):	-35.47% runtime
- THP backed memory:			-11.94% runtime
- 64x1GB hugetlb memory:		-14.46% runtime

Scenario 2, improvement on dirty-log clean for the memslot:
- Memory was already split (4k pages):	-26.36% runtime
- THP backed memory:			-12.05% runtime
- 64x1GB hugetlb memory:		-13.87% runtime

Signed-off-by: Leonardo Bras
---
 arch/arm64/kvm/hyp/pgtable.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 48d88a290a53..70103934a04a 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -1565,21 +1565,22 @@ static int stage2_split_walker(const struct kvm_pgtable_visit_ctx *ctx,
 	new = kvm_init_table_pte(childp, mm_ops);
 	stage2_make_pte(ctx, new);
 	return 0;
 }
 
 int kvm_pgtable_stage2_split(struct kvm_pgtable *pgt, u64 addr, u64 size,
 			     struct kvm_mmu_memory_cache *mc)
 {
 	struct kvm_pgtable_walker walker = {
 		.cb	= stage2_split_walker,
-		.flags	= KVM_PGTABLE_WALK_LEAF,
+		.flags	= KVM_PGTABLE_WALK_LEAF |
+			  KVM_PGTABLE_WALK_SKIP_LEVEL3,
 		.arg	= mc,
 	};
 	int ret;
 
 	ret = kvm_pgtable_walk(pgt, addr, size, &walker);
 	dsb(ishst);
 	return ret;
 }
 
 int __kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm_s2_mmu *mmu,
-- 
2.54.0