From: Sean Christopherson <seanjc@google.com>
Date: Fri, 10 Jan 2025 17:04:05 -0800
Subject: [PATCH 1/5] KVM: Bound the number of dirty ring entries in a single reset at INT_MAX
To: Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Peter Xu, Yan Zhao,
	Maxim Levitsky, Sean Christopherson
Message-ID: <20250111010409.1252942-2-seanjc@google.com>
In-Reply-To: <20250111010409.1252942-1-seanjc@google.com>
References: <20250111010409.1252942-1-seanjc@google.com>

Cap the number of ring entries that are reset in a single ioctl to INT_MAX
to ensure userspace isn't confused by a wrap into negative space, and so
that, in a truly pathological scenario, KVM doesn't miss a TLB flush due to
the count wrapping to zero.  While the size of the ring is fixed at 0x10000
entries and KVM (currently) supports at most 4096 vCPUs, userspace is
allowed to harvest entries from the ring while the reset is in-progress,
i.e. it's possible for the ring to always have harvested entries.

Opportunistically return an actual error code from the helper so that a
future fix to handle pending signals can gracefully return -EINTR.

Cc: Peter Xu
Cc: Yan Zhao
Cc: Maxim Levitsky
Fixes: fb04a1eddb1a ("KVM: X86: Implement ring-based dirty memory tracking")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 include/linux/kvm_dirty_ring.h |  8 +++++---
 virt/kvm/dirty_ring.c          | 10 +++++-----
 virt/kvm/kvm_main.c            |  9 ++++++---
 3 files changed, 16 insertions(+), 11 deletions(-)
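For reference, the userspace side of the flow being bounded here looks roughly
like the sketch below (illustrative only, not part of the patch): a VMM thread
harvests dirty GFNs from a vCPU's mmap'ed ring by acknowledging them with
KVM_DIRTY_GFN_F_RESET, and KVM_RESET_DIRTY_RINGS then recycles the acknowledged
entries via the loop modified below.  Because harvesting may continue while the
ioctl runs, that loop has no natural upper bound, hence the INT_MAX cap.  The
struct, flags, and ioctl are the real dirty-ring UAPI from <linux/kvm.h>; the
helper names and the fetch_index bookkeeping are made up for the example.

  #include <linux/kvm.h>
  #include <sys/ioctl.h>
  #include <stdint.h>

  struct dirty_ring {
          struct kvm_dirty_gfn *gfns;  /* mmap of the vCPU fd at KVM_DIRTY_LOG_PAGE_OFFSET * page_size */
          uint32_t size;               /* number of entries, power of two */
          uint32_t fetch_index;        /* userspace-private cursor */
  };

  /* Consume dirty GFNs and mark them so KVM_RESET_DIRTY_RINGS can recycle them. */
  static int harvest_ring(struct dirty_ring *ring)
  {
          int harvested = 0;

          for (;;) {
                  struct kvm_dirty_gfn *e = &ring->gfns[ring->fetch_index & (ring->size - 1)];
                  uint32_t flags = __atomic_load_n(&e->flags, __ATOMIC_ACQUIRE);

                  /* Only entries that are dirty and not yet acknowledged are collectable. */
                  if ((flags & (KVM_DIRTY_GFN_F_DIRTY | KVM_DIRTY_GFN_F_RESET)) !=
                      KVM_DIRTY_GFN_F_DIRTY)
                          break;

                  /* ... record e->slot / e->offset in the VMM's own dirty tracking ... */

                  /* Acknowledge the entry; the kernel-side reset loop checks this flag. */
                  __atomic_store_n(&e->flags, flags | KVM_DIRTY_GFN_F_RESET, __ATOMIC_RELEASE);
                  ring->fetch_index++;
                  harvested++;
          }
          return harvested;
  }

  /* Returns the total number of entries reset across all vCPU rings. */
  static int reset_rings(int vm_fd)
  {
          return ioctl(vm_fd, KVM_RESET_DIRTY_RINGS);
  }

If another thread keeps calling harvest_ring() while the ioctl is in flight,
the kernel loop can keep finding freshly acknowledged entries, which is the
pathological case the INT_MAX bound guards against.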
diff --git a/include/linux/kvm_dirty_ring.h b/include/linux/kvm_dirty_ring.h
index 4862c98d80d3..82829243029d 100644
--- a/include/linux/kvm_dirty_ring.h
+++ b/include/linux/kvm_dirty_ring.h
@@ -49,9 +49,10 @@ static inline int kvm_dirty_ring_alloc(struct kvm_dirty_ring *ring,
 }
 
 static inline int kvm_dirty_ring_reset(struct kvm *kvm,
-                                       struct kvm_dirty_ring *ring)
+                                       struct kvm_dirty_ring *ring,
+                                       int *nr_entries_reset)
 {
-        return 0;
+        return -ENOENT;
 }
 
 static inline void kvm_dirty_ring_push(struct kvm_vcpu *vcpu,
@@ -81,7 +82,8 @@ int kvm_dirty_ring_alloc(struct kvm_dirty_ring *ring, int index, u32 size);
  * called with kvm->slots_lock held, returns the number of
  * processed pages.
  */
-int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring);
+int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring,
+                         int *nr_entries_reset);
 
 /*
  * returns =0: successfully pushed
diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c
index 7bc74969a819..2faf894dec5a 100644
--- a/virt/kvm/dirty_ring.c
+++ b/virt/kvm/dirty_ring.c
@@ -104,19 +104,19 @@ static inline bool kvm_dirty_gfn_harvested(struct kvm_dirty_gfn *gfn)
         return smp_load_acquire(&gfn->flags) & KVM_DIRTY_GFN_F_RESET;
 }
 
-int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring)
+int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring,
+                         int *nr_entries_reset)
 {
         u32 cur_slot, next_slot;
         u64 cur_offset, next_offset;
         unsigned long mask;
-        int count = 0;
         struct kvm_dirty_gfn *entry;
         bool first_round = true;
 
         /* This is only needed to make compilers happy */
         cur_slot = cur_offset = mask = 0;
 
-        while (true) {
+        while (likely((*nr_entries_reset) < INT_MAX)) {
                 entry = &ring->dirty_gfns[ring->reset_index & (ring->size - 1)];
 
                 if (!kvm_dirty_gfn_harvested(entry))
@@ -129,7 +129,7 @@ int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring)
                 kvm_dirty_gfn_set_invalid(entry);
 
                 ring->reset_index++;
-                count++;
+                (*nr_entries_reset)++;
                 /*
                  * Try to coalesce the reset operations when the guest is
                  * scanning pages in the same slot.
@@ -166,7 +166,7 @@ int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring)
 
         trace_kvm_dirty_ring_reset(ring);
 
-        return count;
+        return 0;
 }
 
 void kvm_dirty_ring_push(struct kvm_vcpu *vcpu, u32 slot, u64 offset)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 9d54473d18e3..2d63b4d46ccb 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4877,15 +4877,18 @@ static int kvm_vm_ioctl_reset_dirty_pages(struct kvm *kvm)
 {
         unsigned long i;
         struct kvm_vcpu *vcpu;
-        int cleared = 0;
+        int cleared = 0, r;
 
         if (!kvm->dirty_ring_size)
                 return -EINVAL;
 
         mutex_lock(&kvm->slots_lock);
 
-        kvm_for_each_vcpu(i, vcpu, kvm)
-                cleared += kvm_dirty_ring_reset(vcpu->kvm, &vcpu->dirty_ring);
+        kvm_for_each_vcpu(i, vcpu, kvm) {
+                r = kvm_dirty_ring_reset(vcpu->kvm, &vcpu->dirty_ring, &cleared);
+                if (r)
+                        break;
+        }
 
         mutex_unlock(&kvm->slots_lock);
 
-- 
2.47.1.613.gc27f4b7a9f-goog