From nobody Sun Nov 24 13:33:28 2024 Received: from mail-yw1-f202.google.com (mail-yw1-f202.google.com [209.85.128.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7D6C91F7578 for ; Tue, 5 Nov 2024 18:43:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730832236; cv=none; b=t0tBV08b9rodssqpkRPcGBqfFdSY2QRfL8brMPo+oekeLWTBFrU+BABA3kjp2lqpfoD+h714vV2vfDotEgumUkaBZcQqP9TgrqeoJI+n8CkFYu74E1/AfV6PKOBa24KCTHNl9+w5dbKU6c8fqhASHQt+ssXEIZrgkB9a0utyWk0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730832236; c=relaxed/simple; bh=NUxSfRX4cW7cHK1sEEySqBH76Z0YY3GolZI8i3B0KXc=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=e8ZIW0rnhizDzRRKGadxev7hE5dKTTJNLrMvVlQy0pnhw9EHBxrAUaoRe8tCntAfsyK94PHValWF+iFi4No0srVGZ0zBs4Lx6dj2Djtav1hS7c6qANjYUGhyoYjN902xIaSQela9GsEYjP5RQ3iF1Zzpwo/OD1yiK4TvMqwlUjI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jthoughton.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=J5kHK/OJ; arc=none smtp.client-ip=209.85.128.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jthoughton.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="J5kHK/OJ" Received: by mail-yw1-f202.google.com with SMTP id 00721157ae682-6e9d6636498so111694847b3.2 for ; Tue, 05 Nov 2024 10:43:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1730832233; x=1731437033; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=878SGuKh3XmWcBJqcVpzoEFvF+zSdyFMaOS0SizaVl0=; b=J5kHK/OJQotjkpAFnmz0SpIlBwugvNynd8cVpQG9dHbrDnigzyhY0TqZ/jdVcG7CpR iaZht1Apr689NOIId/9y2JFAAs72kRlra1UpWdD3pLgmhajLJ8GJEVJPWRzQ53eZ6TyT Xx/iMZpOqjYZzwV2/J0dWkPYAKnVTFjrOpRYOnPDGfuggSzFY/rUdCpy6NXEK5rnc0yF 5y4CHRIRQFOsJpcKmGgvqWlJR54QVGpVJKWSMmTReq6ylA8eLfK6eMQ3BTrcUk93Os+K fSx6HeGsI0bpp9VgDQHH942yK7Bn2MM21ZNYRQGcKy+k0p476cxZJY72GCSESrI/4+9U txYA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1730832233; x=1731437033; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=878SGuKh3XmWcBJqcVpzoEFvF+zSdyFMaOS0SizaVl0=; b=NBqXfgXhQgOflKVqS4x7V+aOTt2m6oBRvH/4uVmHweXrWRqAOSKf3j3/Hzo7DiuHaA Qz+fTzIMtiCYDmCOmLAatSTcJcbyPNuBNqOt+PP0l8FObov6FLhqDyhDOrJGlTlYMIDH 9UVwb3uxk3vAexdKIpE4z8/KfzShc80hIkZZe5dAUjxrnX0Fkpm7LCrBewv8CpcmCBVL iONIwwbMtsyQlECz9KMHGM5f9UU9RteO03m3KwQOTctBkfz6E5+RVg0M0MquoUFhHcag tZ4yzPQLpXXsf/GHkHt99bj0eJiKvx1zhoe3htQoAL9Ih/1voWPAF71QPr44o2OItth1 m0Yw== X-Forwarded-Encrypted: i=1; AJvYcCXN/X8MZ3eXGETx5E6NB0x+YYOPdxgE3fGMDBWlYFZmkMCZ9KSg2CpbFSnosjJxqEAKHKniVu2cOmHkLJA=@vger.kernel.org X-Gm-Message-State: AOJu0Yyziz1oStZpFXLn61eM4/4TzGAqZvwPW06C7jC+04W069+ee5xP 51WPMsEBKPc3KOEq1RoVP65OxLlH6wfFxBNdV0H8Jpf33OkF9CwgpzDwUAjdVd+GSUSZvpACB0v 5yGBx719FC8hTTnKwEQ== X-Google-Smtp-Source: AGHT+IH7ISi4GnII+0BnZH0fUyJfZEHfRMZQzMU+f5JniaDp4cZHgc7uAg+CKsWk5yAjOsMg6WNz5CqdG6JaBDbt X-Received: from jthoughton.c.googlers.com ([fda3:e722:ac3:cc00:13d:fb22:ac12:a84b]) (user=jthoughton job=sendgmr) by 2002:a05:6902:1d1:b0:e2e:3031:3f0c with SMTP id 3f1490d57ef6-e30e5b0ee45mr14173276.7.1730832233341; Tue, 05 Nov 2024 10:43:53 -0800 (PST) Date: Tue, 5 Nov 2024 18:43:32 +0000 In-Reply-To: <20241105184333.2305744-1-jthoughton@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20241105184333.2305744-1-jthoughton@google.com> X-Mailer: git-send-email 2.47.0.199.ga7371fff76-goog Message-ID: <20241105184333.2305744-11-jthoughton@google.com> Subject: [PATCH v8 10/11] KVM: x86/mmu: Support rmap walks without holding mmu_lock when aging gfns From: James Houghton To: Sean Christopherson , Paolo Bonzini Cc: David Matlack , David Rientjes , James Houghton , Marc Zyngier , Oliver Upton , Wei Xu , Yu Zhao , Axel Rasmussen , kvm@vger.kernel.org, linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Sean Christopherson When A/D bits are supported on sptes, it is safe to simply clear the Accessed bits. The less obvious case is marking sptes for access tracking in the non-A/D case (for EPT only). In this case, we have to be sure that it is okay for TLB entries to exist for non-present sptes. For example, when doing dirty tracking, if we come across a non-present SPTE, we need to know that we need to do a TLB invalidation. This case is already supported today (as we already support *not* doing TLBIs for clear_young(); there is a separate notifier for clearing *and* flushing, clear_flush_young()). This works today because GET_DIRTY_LOG flushes the TLB before returning to userspace. Signed-off-by: Sean Christopherson Co-developed-by: James Houghton Signed-off-by: James Houghton --- arch/x86/kvm/mmu/mmu.c | 72 +++++++++++++++++++++++------------------- 1 file changed, 39 insertions(+), 33 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 71019762a28a..bdd6abf9b44e 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -952,13 +952,11 @@ static unsigned long kvm_rmap_get(struct kvm_rmap_hea= d *rmap_head) * locking is the same, but the caller is disallowed from modifying the rm= ap, * and so the unlock flow is a nop if the rmap is/was empty. */ -__maybe_unused static unsigned long kvm_rmap_lock_readonly(struct kvm_rmap_head *rmap_hea= d) { return __kvm_rmap_lock(rmap_head); } =20 -__maybe_unused static void kvm_rmap_unlock_readonly(struct kvm_rmap_head *rmap_head, unsigned long old_val) { @@ -1677,37 +1675,48 @@ static void rmap_add(struct kvm_vcpu *vcpu, const s= truct kvm_memory_slot *slot, } =20 static bool kvm_rmap_age_gfn_range(struct kvm *kvm, - struct kvm_gfn_range *range, bool test_only) + struct kvm_gfn_range *range, + bool test_only) { - struct slot_rmap_walk_iterator iterator; + struct kvm_rmap_head *rmap_head; struct rmap_iterator iter; + unsigned long rmap_val; bool young =3D false; u64 *sptep; + gfn_t gfn; + int level; + u64 spte; =20 - for_each_slot_rmap_range(range->slot, PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL, - range->start, range->end - 1, &iterator) { - for_each_rmap_spte(iterator.rmap, &iter, sptep) { - u64 spte =3D *sptep; + for (level =3D PG_LEVEL_4K; level <=3D KVM_MAX_HUGEPAGE_LEVEL; level++) { + for (gfn =3D range->start; gfn < range->end; + gfn +=3D KVM_PAGES_PER_HPAGE(level)) { + rmap_head =3D gfn_to_rmap(gfn, level, range->slot); + rmap_val =3D kvm_rmap_lock_readonly(rmap_head); =20 - if (!is_accessed_spte(spte)) - continue; + for_each_rmap_spte_lockless(rmap_head, &iter, sptep, spte) { + if (!is_accessed_spte(spte)) + continue; + + if (test_only) { + kvm_rmap_unlock_readonly(rmap_head, rmap_val); + return true; + } =20 - if (test_only) - return true; - - if (spte_ad_enabled(spte)) { - clear_bit((ffs(shadow_accessed_mask) - 1), - (unsigned long *)sptep); - } else { - /* - * WARN if mmu_spte_update() signals the need - * for a TLB flush, as Access tracking a SPTE - * should never trigger an _immediate_ flush. - */ - spte =3D mark_spte_for_access_track(spte); - WARN_ON_ONCE(mmu_spte_update(sptep, spte)); + if (spte_ad_enabled(spte)) + clear_bit((ffs(shadow_accessed_mask) - 1), + (unsigned long *)sptep); + else + /* + * If the following cmpxchg fails, the + * spte is being concurrently modified + * and should most likely stay young. + */ + cmpxchg64(sptep, spte, + mark_spte_for_access_track(spte)); + young =3D true; } - young =3D true; + + kvm_rmap_unlock_readonly(rmap_head, rmap_val); } } return young; @@ -1725,11 +1734,8 @@ bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_ran= ge *range) if (tdp_mmu_enabled) young =3D kvm_tdp_mmu_age_gfn_range(kvm, range); =20 - if (kvm_has_shadow_mmu_sptes(kvm)) { - write_lock(&kvm->mmu_lock); + if (kvm_has_shadow_mmu_sptes(kvm)) young |=3D kvm_rmap_age_gfn_range(kvm, range, false); - write_unlock(&kvm->mmu_lock); - } =20 return young; } @@ -1741,11 +1747,11 @@ bool kvm_test_age_gfn(struct kvm *kvm, struct kvm_g= fn_range *range) if (tdp_mmu_enabled) young =3D kvm_tdp_mmu_test_age_gfn(kvm, range); =20 - if (!young && kvm_has_shadow_mmu_sptes(kvm)) { - write_lock(&kvm->mmu_lock); + if (young) + return young; + + if (kvm_has_shadow_mmu_sptes(kvm)) young |=3D kvm_rmap_age_gfn_range(kvm, range, true); - write_unlock(&kvm->mmu_lock); - } =20 return young; } --=20 2.47.0.199.ga7371fff76-goog