From nobody Tue Dec 16 05:55:04 2025
From: Sean Christopherson
Date: Fri, 16 May 2025 14:35:35 -0700
Subject: [PATCH v3 1/6] KVM: Bound the number of dirty ring entries in a single reset at INT_MAX
To: Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Peter Xu, Yan Zhao, Maxim Levitsky, Binbin Wu, James Houghton, Sean Christopherson, Pankaj Gupta
Message-ID: <20250516213540.2546077-2-seanjc@google.com>
In-Reply-To: <20250516213540.2546077-1-seanjc@google.com>
References: <20250516213540.2546077-1-seanjc@google.com>

Cap the number of ring entries that are reset in a single ioctl to INT_MAX
to ensure userspace isn't confused by a wrap into negative space, and so
that, in a truly pathological scenario, KVM doesn't miss a TLB flush due
to the count wrapping to zero. While the size of the ring is fixed at
0x10000 entries and KVM (currently) supports at most 4096 vCPUs, userspace
is allowed to harvest entries from the ring while the reset is in-progress,
i.e. it's possible for the ring to always have harvested entries.

Opportunistically return an actual error code from the helper so that a
future fix to handle pending signals can gracefully return -EINTR. Drop
the function comment now that the return code is a standard 0/-errno (and
because a future commit will add a proper lockdep assertion).

Opportunistically drop a similarly stale comment for kvm_dirty_ring_push().
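For illustration only (not part of the patch): a minimal userspace sketch of
the calling convention described above, where the helper returns 0/-errno and
accumulates the number of reset entries through an out-parameter that is
capped at INT_MAX. The names toy_ring, toy_ring_reset() and toy_reset_all()
are hypothetical stand-ins, not KVM symbols.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for one dirty ring: a count of harvested entries. */
struct toy_ring {
	unsigned int reset_index;
	unsigned int nr_harvested;
};

/*
 * Reset harvested entries, adding to the running total in *nr_entries_reset.
 * The loop stops once the total reaches INT_MAX, so the signed count can
 * never wrap negative or back to zero. Returns 0 or -errno, never a count.
 */
static int toy_ring_reset(struct toy_ring *ring, int *nr_entries_reset)
{
	if (!ring || !nr_entries_reset)
		return -EINVAL;

	while (*nr_entries_reset < INT_MAX) {
		if (!ring->nr_harvested)
			break;

		ring->nr_harvested--;
		ring->reset_index++;
		(*nr_entries_reset)++;
	}

	return 0;
}

/* Caller: one running total shared by all rings, stop at the first error. */
static int toy_reset_all(struct toy_ring *rings, size_t nr_rings, int *cleared)
{
	int r = 0;

	*cleared = 0;
	for (size_t i = 0; i < nr_rings; i++) {
		r = toy_ring_reset(&rings[i], cleared);
		if (r)
			break;
	}

	return r;
}
```

Because the total is shared across rings, the cap applies to the ioctl as a
whole rather than to each ring individually, which is what keeps the value
reported to userspace non-negative in this sketch.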
Cc: Peter Xu
Cc: Yan Zhao
Cc: Maxim Levitsky
Cc: Binbin Wu
Fixes: fb04a1eddb1a ("KVM: X86: Implement ring-based dirty memory tracking")
Reviewed-by: James Houghton
Signed-off-by: Sean Christopherson
Reviewed-by: Binbin Wu
Reviewed-by: Peter Xu
Reviewed-by: Yan Zhao
---
 include/linux/kvm_dirty_ring.h | 18 +++++-------------
 virt/kvm/dirty_ring.c          | 10 +++++-----
 virt/kvm/kvm_main.c            |  9 ++++++---
 3 files changed, 16 insertions(+), 21 deletions(-)

diff --git a/include/linux/kvm_dirty_ring.h b/include/linux/kvm_dirty_ring.h
index da4d9b5f58f1..eb10d87adf7d 100644
--- a/include/linux/kvm_dirty_ring.h
+++ b/include/linux/kvm_dirty_ring.h
@@ -49,9 +49,10 @@ static inline int kvm_dirty_ring_alloc(struct kvm *kvm, struct kvm_dirty_ring *r
 }
 
 static inline int kvm_dirty_ring_reset(struct kvm *kvm,
-				       struct kvm_dirty_ring *ring)
+				       struct kvm_dirty_ring *ring,
+				       int *nr_entries_reset)
 {
-	return 0;
+	return -ENOENT;
 }
 
 static inline void kvm_dirty_ring_push(struct kvm_vcpu *vcpu,
@@ -77,17 +78,8 @@ bool kvm_arch_allow_write_without_running_vcpu(struct kvm *kvm);
 u32 kvm_dirty_ring_get_rsvd_entries(struct kvm *kvm);
 int kvm_dirty_ring_alloc(struct kvm *kvm, struct kvm_dirty_ring *ring,
 			 int index, u32 size);
-
-/*
- * called with kvm->slots_lock held, returns the number of
- * processed pages.
- */
-int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring);
-
-/*
- * returns =0: successfully pushed
- *         <0: unable to push, need to wait
- */
+int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring,
+			 int *nr_entries_reset);
 void kvm_dirty_ring_push(struct kvm_vcpu *vcpu, u32 slot, u64 offset);
 
 bool kvm_dirty_ring_check_request(struct kvm_vcpu *vcpu);
diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c
index d14ffc7513ee..77986f34eff8 100644
--- a/virt/kvm/dirty_ring.c
+++ b/virt/kvm/dirty_ring.c
@@ -105,19 +105,19 @@ static inline bool kvm_dirty_gfn_harvested(struct kvm_dirty_gfn *gfn)
 	return smp_load_acquire(&gfn->flags) & KVM_DIRTY_GFN_F_RESET;
 }
 
-int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring)
+int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring,
+			 int *nr_entries_reset)
 {
 	u32 cur_slot, next_slot;
 	u64 cur_offset, next_offset;
 	unsigned long mask;
-	int count = 0;
 	struct kvm_dirty_gfn *entry;
 	bool first_round = true;
 
 	/* This is only needed to make compilers happy */
 	cur_slot = cur_offset = mask = 0;
 
-	while (true) {
+	while (likely((*nr_entries_reset) < INT_MAX)) {
 		entry = &ring->dirty_gfns[ring->reset_index & (ring->size - 1)];
 
 		if (!kvm_dirty_gfn_harvested(entry))
@@ -130,7 +130,7 @@ int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring)
 		kvm_dirty_gfn_set_invalid(entry);
 
 		ring->reset_index++;
-		count++;
+		(*nr_entries_reset)++;
 		/*
 		 * Try to coalesce the reset operations when the guest is
 		 * scanning pages in the same slot.
@@ -167,7 +167,7 @@ int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring)
 
 	trace_kvm_dirty_ring_reset(ring);
 
-	return count;
+	return 0;
 }
 
 void kvm_dirty_ring_push(struct kvm_vcpu *vcpu, u32 slot, u64 offset)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index b24db92e98f3..571688507204 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4903,15 +4903,18 @@ static int kvm_vm_ioctl_reset_dirty_pages(struct kvm *kvm)
 {
 	unsigned long i;
 	struct kvm_vcpu *vcpu;
-	int cleared = 0;
+	int cleared = 0, r;
 
 	if (!kvm->dirty_ring_size)
 		return -EINVAL;
 
 	mutex_lock(&kvm->slots_lock);
 
-	kvm_for_each_vcpu(i, vcpu, kvm)
-		cleared += kvm_dirty_ring_reset(vcpu->kvm, &vcpu->dirty_ring);
+	kvm_for_each_vcpu(i, vcpu, kvm) {
+		r = kvm_dirty_ring_reset(vcpu->kvm, &vcpu->dirty_ring, &cleared);
+		if (r)
+			break;
+	}
 
 	mutex_unlock(&kvm->slots_lock);
 
-- 
2.49.0.1112.g889b7c5bd8-goog

From nobody Tue Dec 16 05:55:04 2025
From: Sean Christopherson
Date: Fri, 16 May 2025 14:35:36 -0700
Subject: [PATCH v3 2/6] KVM: Bail from the dirty ring reset flow if a signal is pending
To: Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Peter Xu, Yan Zhao, Maxim Levitsky, Binbin Wu, James Houghton, Sean Christopherson, Pankaj Gupta
Message-ID: <20250516213540.2546077-3-seanjc@google.com>
In-Reply-To: <20250516213540.2546077-1-seanjc@google.com>
References: <20250516213540.2546077-1-seanjc@google.com>

Abort a dirty ring reset if the current task has a pending signal, as the
hard limit of INT_MAX entries doesn't ensure KVM will respond to a signal
in a timely fashion.
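For illustration only (not part of the patch): a rough userspace analogue of
bailing out of a long-running loop when a signal has arrived. In the kernel
the check is signal_pending(current). This sketch assumes a flag set from an
ISO C signal handler plays that role. toy_signal_seen, toy_on_signal() and
toy_reset_interruptible() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <signal.h>

/* Set asynchronously by the handler, polled by the reset loop. */
static volatile sig_atomic_t toy_signal_seen;

static void toy_on_signal(int sig)
{
	(void)sig;
	toy_signal_seen = 1;
}

/*
 * Reset up to *nr_harvested entries, checking for a "pending signal" at the
 * top of every iteration. Entries reset before the signal arrived stay
 * counted in *nr_entries_reset, so the caller still knows how much work was
 * done. Returns 0 on completion or -EINTR when interrupted.
 */
static int toy_reset_interruptible(unsigned int *nr_harvested,
				   int *nr_entries_reset)
{
	while (*nr_entries_reset < INT_MAX) {
		if (toy_signal_seen)
			return -EINTR;

		if (!*nr_harvested)
			break;

		(*nr_harvested)--;
		(*nr_entries_reset)++;
	}

	return 0;
}
```

Checking at the top of the loop, before touching the next entry, means that
in this sketch an interrupted call leaves the remaining entries untouched so
that a later call can resume from where it stopped.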
Fixes: fb04a1eddb1a ("KVM: X86: Implement ring-based dirty memory tracking")
Reviewed-by: James Houghton
Signed-off-by: Sean Christopherson
Reviewed-by: Binbin Wu
Reviewed-by: Peter Xu
Reviewed-by: Yan Zhao
---
 virt/kvm/dirty_ring.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c
index 77986f34eff8..e844e869e8c7 100644
--- a/virt/kvm/dirty_ring.c
+++ b/virt/kvm/dirty_ring.c
@@ -118,6 +118,9 @@ int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring,
 	cur_slot = cur_offset = mask = 0;
 
 	while (likely((*nr_entries_reset) < INT_MAX)) {
+		if (signal_pending(current))
+			return -EINTR;
+
 		entry = &ring->dirty_gfns[ring->reset_index & (ring->size - 1)];
 
 		if (!kvm_dirty_gfn_harvested(entry))
-- 
2.49.0.1112.g889b7c5bd8-goog

From nobody Tue Dec 16 05:55:04 2025
From: Sean Christopherson
Date: Fri, 16 May 2025 14:35:37 -0700
Subject: [PATCH v3 3/6] KVM: Conditionally reschedule when resetting the dirty ring
To: Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Peter Xu, Yan Zhao, Maxim Levitsky, Binbin Wu, James Houghton, Sean Christopherson, Pankaj Gupta
Message-ID: <20250516213540.2546077-4-seanjc@google.com>
In-Reply-To: <20250516213540.2546077-1-seanjc@google.com>
References: <20250516213540.2546077-1-seanjc@google.com>

When resetting a dirty ring, conditionally reschedule on each iteration
after the first. The recently introduced hard limit mitigates the issue
of an endless reset, but isn't sufficient to completely prevent RCU
stalls, soft lockups, etc., nor is the hard limit intended to guard
against such badness.

Note! Take care to check for reschedule even in the "continue" paths,
as a pathological scenario (or malicious userspace) could dirty the same
gfn over and over, i.e. always hit the continue path.

  rcu: INFO: rcu_sched self-detected stall on CPU
  rcu: 4-....: (5249 ticks this GP) idle=51e4/1/0x4000000000000000 softirq=309/309 fqs=2563
  rcu: (t=5250 jiffies g=-319 q=608 ncpus=24)
  CPU: 4 UID: 1000 PID: 1067 Comm: dirty_log_test Tainted: G L 6.13.0-rc3-17fa7a24ea1e-HEAD-vm #814
  Tainted: [L]=SOFTLOCKUP
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:kvm_arch_mmu_enable_log_dirty_pt_masked+0x26/0x200 [kvm]
  Call Trace:
   kvm_reset_dirty_gfn.part.0+0xb4/0xe0 [kvm]
   kvm_dirty_ring_reset+0x58/0x220 [kvm]
   kvm_vm_ioctl+0x10eb/0x15d0 [kvm]
   __x64_sys_ioctl+0x8b/0xb0
   do_syscall_64+0x5b/0x160
   entry_SYSCALL_64_after_hwframe+0x4b/0x53

  Tainted: [L]=SOFTLOCKUP
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:kvm_arch_mmu_enable_log_dirty_pt_masked+0x17/0x200 [kvm]
  Call Trace:
   kvm_reset_dirty_gfn.part.0+0xb4/0xe0 [kvm]
   kvm_dirty_ring_reset+0x58/0x220 [kvm]
   kvm_vm_ioctl+0x10eb/0x15d0 [kvm]
   __x64_sys_ioctl+0x8b/0xb0
   do_syscall_64+0x5b/0x160
   entry_SYSCALL_64_after_hwframe+0x4b/0x53

Fixes: fb04a1eddb1a ("KVM: X86: Implement ring-based dirty memory tracking")
Reviewed-by: James Houghton
Signed-off-by: Sean Christopherson
Reviewed-by: Peter Xu
Reviewed-by: Yan Zhao
---
 virt/kvm/dirty_ring.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c
index e844e869e8c7..97cca0c02fd1 100644
--- a/virt/kvm/dirty_ring.c
+++ b/virt/kvm/dirty_ring.c
@@ -134,6 +134,16 @@ int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring,
 
 		ring->reset_index++;
 		(*nr_entries_reset)++;
+
+		/*
+		 * While the size of each ring is fixed, it's possible for the
+		 * ring to be constantly re-dirtied/harvested while the reset
+		 * is in-progress (the hard limit exists only to guard against
+		 * wrapping the count into negative space).
+		 */
+		if (!first_round)
+			cond_resched();
+
 		/*
 		 * Try to coalesce the reset operations when the guest is
 		 * scanning pages in the same slot.
-- 
2.49.0.1112.g889b7c5bd8-goog

From nobody Tue Dec 16 05:55:04 2025
From: Sean Christopherson
Date: Fri, 16 May 2025 14:35:38 -0700
Subject: [PATCH v3 4/6] KVM: Check for empty mask of harvested dirty ring entries in caller
To: Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Peter Xu, Yan Zhao, Maxim Levitsky, Binbin Wu, James Houghton, Sean Christopherson, Pankaj Gupta
Message-ID: <20250516213540.2546077-5-seanjc@google.com>
In-Reply-To: <20250516213540.2546077-1-seanjc@google.com>
References: <20250516213540.2546077-1-seanjc@google.com>

When resetting a dirty ring, explicitly check that there is work to be
done before calling kvm_reset_dirty_gfn(), e.g. if no harvested entries
are found and/or on the loop's first iteration, and delete the extremely
misleading comment "This is only needed to make compilers happy". KVM
absolutely relies on mask to be zero-initialized, i.e. the comment is an
outright lie. Furthermore, the compiler is right to complain that KVM is
calling a function with uninitialized data, as there are no guarantees
the implementation details of kvm_reset_dirty_gfn() will be visible to
kvm_dirty_ring_reset().

While the flaw could be fixed by simply deleting (or rewording) the
comment, and duplicating the check is unfortunate, checking mask in the
caller will allow for additional cleanups.

Opportunistically drop the zero-initialization of cur_slot and cur_offset.
If a bug were introduced where either the slot or offset was consumed
before mask is set to a non-zero value, then it is highly desirable for
the compiler (or some other sanitizer) to yell.
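For illustration only (not part of the patch): a simplified userspace model
of the batching loop with the empty-mask check hoisted into the caller. It
assumes 64-bit masks and handles only forward deltas (the real code also
coalesces backwards). All toy_* names are hypothetical. Unlike the patch, the
sketch zero-initializes cur_slot and cur_offset, purely to keep strict
userspace warning flags quiet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_BITS_PER_MASK 64
#define TOY_MAX_CALLS 8

struct toy_entry {
	uint32_t slot;
	uint64_t offset;
};

/* Records every reset request so the batching can be inspected. */
struct toy_log {
	size_t nr_calls;
	struct {
		uint32_t slot;
		uint64_t offset;
		uint64_t mask;
	} calls[TOY_MAX_CALLS];
};

/* Callee no longer tolerates an empty mask; the caller must check. */
static void toy_reset_gfns(struct toy_log *log, uint32_t slot,
			   uint64_t offset, uint64_t mask)
{
	assert(mask);
	assert(log->nr_calls < TOY_MAX_CALLS);

	log->calls[log->nr_calls].slot = slot;
	log->calls[log->nr_calls].offset = offset;
	log->calls[log->nr_calls].mask = mask;
	log->nr_calls++;
}

static void toy_batch_reset(const struct toy_entry *entries, size_t nr,
			    struct toy_log *log)
{
	/* Zeroed only for this sketch; the patch leaves these uninitialized. */
	uint32_t cur_slot = 0;
	uint64_t cur_offset = 0;
	uint64_t mask = 0;
	bool first_round = true;

	for (size_t i = 0; i < nr; i++) {
		uint32_t next_slot = entries[i].slot;
		uint64_t next_offset = entries[i].offset;

		if (!first_round && next_slot == cur_slot) {
			int64_t delta = (int64_t)(next_offset - cur_offset);

			/* Same slot and close enough: fold into the batch. */
			if (delta >= 0 && delta < TOY_BITS_PER_MASK) {
				mask |= UINT64_C(1) << delta;
				continue;
			}
		}

		/* Flush the gathered batch, if there is one. */
		if (mask)
			toy_reset_gfns(log, cur_slot, cur_offset, mask);

		/* Start a new batch at the entry that didn't fit. */
		cur_slot = next_slot;
		cur_offset = next_offset;
		mask = 1;
		first_round = false;
	}

	/* The last batch is always left pending by the loop. */
	if (mask)
		toy_reset_gfns(log, cur_slot, cur_offset, mask);
}
```

With an empty input the callee is never invoked, which is the property the
caller-side "if (mask)" checks are meant to guarantee in this sketch.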
Cc: Peter Xu
Cc: Yan Zhao
Cc: Maxim Levitsky
Cc: Binbin Wu
Reviewed-by: Pankaj Gupta
Reviewed-by: James Houghton
Signed-off-by: Sean Christopherson
Reviewed-by: Binbin Wu
Reviewed-by: Peter Xu
Reviewed-by: Yan Zhao
---
 virt/kvm/dirty_ring.c | 44 ++++++++++++++++++++++++++++++++++---------
 1 file changed, 35 insertions(+), 9 deletions(-)

diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c
index 97cca0c02fd1..84c75483a089 100644
--- a/virt/kvm/dirty_ring.c
+++ b/virt/kvm/dirty_ring.c
@@ -55,9 +55,6 @@ static void kvm_reset_dirty_gfn(struct kvm *kvm, u32 slot, u64 offset, u64 mask)
 	struct kvm_memory_slot *memslot;
 	int as_id, id;
 
-	if (!mask)
-		return;
-
 	as_id = slot >> 16;
 	id = (u16)slot;
 
@@ -108,15 +105,24 @@ static inline bool kvm_dirty_gfn_harvested(struct kvm_dirty_gfn *gfn)
 int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring,
 			 int *nr_entries_reset)
 {
+	/*
+	 * To minimize mmu_lock contention, batch resets for harvested entries
+	 * whose gfns are in the same slot, and are within N frame numbers of
+	 * each other, where N is the number of bits in an unsigned long. For
+	 * simplicity, process the current set of entries when the next entry
+	 * can't be included in the batch.
+	 *
+	 * Track the current batch slot, the gfn offset into the slot for the
+	 * batch, and the bitmask of gfns that need to be reset (relative to
+	 * offset). Note, the offset may be adjusted backwards, e.g. so that
+	 * a sequence of gfns X, X-1, ... X-N can be batched.
+	 */
 	u32 cur_slot, next_slot;
 	u64 cur_offset, next_offset;
-	unsigned long mask;
+	unsigned long mask = 0;
 	struct kvm_dirty_gfn *entry;
 	bool first_round = true;
 
-	/* This is only needed to make compilers happy */
-	cur_slot = cur_offset = mask = 0;
-
 	while (likely((*nr_entries_reset) < INT_MAX)) {
 		if (signal_pending(current))
 			return -EINTR;
@@ -164,14 +170,34 @@ int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_dirty_ring *ring,
 				continue;
 			}
 		}
-		kvm_reset_dirty_gfn(kvm, cur_slot, cur_offset, mask);
+
+		/*
+		 * Reset the slot for all the harvested entries that have been
+		 * gathered, but not yet fully processed.
+		 */
+		if (mask)
+			kvm_reset_dirty_gfn(kvm, cur_slot, cur_offset, mask);
+
+		/*
+		 * The current slot was reset or this is the first harvested
+		 * entry, (re)initialize the metadata.
+		 */
 		cur_slot = next_slot;
 		cur_offset = next_offset;
 		mask = 1;
 		first_round = false;
 	}
 
-	kvm_reset_dirty_gfn(kvm, cur_slot, cur_offset, mask);
+	/*
+	 * Perform a final reset if there are harvested entries that haven't
+	 * been processed, which is guaranteed if at least one harvested was
+	 * found. The loop only performs a reset when the "next" entry can't
+	 * be batched with the "current" entry(s), and that reset processes the
+	 * _current_ entry(s); i.e. the last harvested entry, a.k.a. next, will
+	 * always be left pending.
+	 */
+	if (mask)
+		kvm_reset_dirty_gfn(kvm, cur_slot, cur_offset, mask);
 
 	/*
 	 * The request KVM_REQ_DIRTY_RING_SOFT_FULL will be cleared
-- 
2.49.0.1112.g889b7c5bd8-goog

From nobody Tue Dec 16 05:55:04 2025
From: Sean Christopherson
Date: Fri, 16 May 2025 14:35:39 -0700
Subject: [PATCH v3 5/6] KVM: Use mask of harvested dirty ring entries to coalesce dirty ring resets
To: Paolo Bonzini
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Peter Xu, Yan Zhao, Maxim Levitsky, Binbin Wu, James Houghton, Sean Christopherson, Pankaj Gupta
Message-ID: <20250516213540.2546077-6-seanjc@google.com>
In-Reply-To: <20250516213540.2546077-1-seanjc@google.com>
References: <20250516213540.2546077-1-seanjc@google.com>

Use "mask" instead of a dedicated boolean to track whether or not there
is at least one to-be-reset entry for the current slot+offset. In the
body of the loop, mask is zero only on the first iteration, i.e. !mask is
equivalent to first_round.

Opportunistically combine the adjacent "if (mask)" statements into a single
if-statement.

No functional change intended.
Cc: Peter Xu Cc: Yan Zhao Cc: Maxim Levitsky Reviewed-by: Pankaj Gupta Reviewed-by: James Houghton Signed-off-by: Sean Christopherson Reviewed-by: Binbin Wu Reviewed-by: Peter Xu Reviewed-by: Yan Zhao --- virt/kvm/dirty_ring.c | 60 +++++++++++++++++++++---------------------- 1 file changed, 29 insertions(+), 31 deletions(-) diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c index 84c75483a089..54734025658a 100644 --- a/virt/kvm/dirty_ring.c +++ b/virt/kvm/dirty_ring.c @@ -121,7 +121,6 @@ int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_di= rty_ring *ring, u64 cur_offset, next_offset; unsigned long mask =3D 0; struct kvm_dirty_gfn *entry; - bool first_round =3D true; =20 while (likely((*nr_entries_reset) < INT_MAX)) { if (signal_pending(current)) @@ -141,42 +140,42 @@ int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_= dirty_ring *ring, ring->reset_index++; (*nr_entries_reset)++; =20 - /* - * While the size of each ring is fixed, it's possible for the - * ring to be constantly re-dirtied/harvested while the reset - * is in-progress (the hard limit exists only to guard against - * wrapping the count into negative space). - */ - if (!first_round) + if (mask) { + /* + * While the size of each ring is fixed, it's possible + * for the ring to be constantly re-dirtied/harvested + * while the reset is in-progress (the hard limit exists + * only to guard against the count becoming negative). + */ cond_resched(); =20 - /* - * Try to coalesce the reset operations when the guest is - * scanning pages in the same slot. - */ - if (!first_round && next_slot =3D=3D cur_slot) { - s64 delta =3D next_offset - cur_offset; + /* + * Try to coalesce the reset operations when the guest + * is scanning pages in the same slot. 
+ */ + if (next_slot =3D=3D cur_slot) { + s64 delta =3D next_offset - cur_offset; =20 - if (delta >=3D 0 && delta < BITS_PER_LONG) { - mask |=3D 1ull << delta; - continue; - } + if (delta >=3D 0 && delta < BITS_PER_LONG) { + mask |=3D 1ull << delta; + continue; + } =20 - /* Backwards visit, careful about overflows! */ - if (delta > -BITS_PER_LONG && delta < 0 && - (mask << -delta >> -delta) =3D=3D mask) { - cur_offset =3D next_offset; - mask =3D (mask << -delta) | 1; - continue; + /* Backwards visit, careful about overflows! */ + if (delta > -BITS_PER_LONG && delta < 0 && + (mask << -delta >> -delta) =3D=3D mask) { + cur_offset =3D next_offset; + mask =3D (mask << -delta) | 1; + continue; + } } - } =20 - /* - * Reset the slot for all the harvested entries that have been - * gathered, but not yet fully processed. - */ - if (mask) + /* + * Reset the slot for all the harvested entries that + * have been gathered, but not yet fully processed. + */ kvm_reset_dirty_gfn(kvm, cur_slot, cur_offset, mask); + } =20 /* * The current slot was reset or this is the first harvested @@ -185,7 +184,6 @@ int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_di= rty_ring *ring, cur_slot =3D next_slot; cur_offset =3D next_offset; mask =3D 1; - first_round =3D false; } =20 /* --=20 2.49.0.1112.g889b7c5bd8-goog From nobody Tue Dec 16 05:55:04 2025 Received: from mail-pg1-f201.google.com (mail-pg1-f201.google.com [209.85.215.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9CBC1283C8E for ; Fri, 16 May 2025 21:35:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.215.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747431355; cv=none; b=GIu29NVucNbEQhaSHIkBstcn7bh0XCPUBoWGdYa9BkB0rhTH3WpJ/EmOR79foaaPicFKzEwqfb5sy0b+g5mZkgAke14J1/WMz99WZnmGSgHkxWuyGlY/C55lAhxTBQ3KUJ0y3Lj0mvgbGpdP5AqmEwLsl7RdlkN9Ey6Nbk4fOxk= 
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747431355; c=relaxed/simple; bh=Vec1LaJROunpusnyM0UhIbnSVhckBolPI8kAYzzwmlU=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=XcAt9xbEyDwyYXVoQSjtwVlbHWx2ugRcXCLJZ9KzbY6Kw2Pta7K9eJ5koCFV9vzuOTgbaFSOupF4oNnmvzmFF2xCF5keeQPB0Ux6jd3NlUlrPSaBXxn3lttq5Zp4Z2kwMbIA1Od+pZhIr/xctUcY/lxP0M8F1G3EC9GkWiEyaWg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=38f3hyRU; arc=none smtp.client-ip=209.85.215.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--seanjc.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="38f3hyRU" Received: by mail-pg1-f201.google.com with SMTP id 41be03b00d2f7-b26e0fee459so1429053a12.1 for ; Fri, 16 May 2025 14:35:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1747431353; x=1748036153; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=4V663rFh0jkdB7SoksoChtdcIhj8HRg19/fZOy64aqQ=; b=38f3hyRUKv5Hktv2viVfMUFkesdhOkdcANjFyEK6njt3zMH4WSUnpREpUUYVAbb8Jt WQruqwkqTdDzb/YqclByShZ9gjUfPzQMXhAf9mjiYrxfS4PaNuTN2gF0aLQMXp0sjsOY sJPyiHpHDtlb7iVqwDv+N6DS/Z8juz1N/69lip4E70NgvQtB/X6DO6eaGoHtyQI457Al n6Pw1C7xa9L2lz4G0sZmlXZi0cO/Zm9TLNTNCZD1HPASBHp82WUyO46hClub/uI3p9Yl 6QMnQ6MiTBvmIopqDn/YLIUrdLr/AgGkz96rcR2QIJD05iCbOu7k82D/uFV7houkpn9I 9hLA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1747431353; x=1748036153; 
h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=4V663rFh0jkdB7SoksoChtdcIhj8HRg19/fZOy64aqQ=; b=gLrF48D/gW03EwZpJUopAvZ4MgXC2Gmt5UCfIPTk3gCRUudpwMqDlO3FotfJnQi+1r l9DN60V6g3/H1lyn0XvN/OxTYiJ2Ydq2Rt7BbAa/o/70FpBGbPQ/e3SwtgVcOn0Uw/pM OdhyiyaaCFuNCcfj2YEB2zVNqAhB0+PIE4Nb+OTJkENqOHarsZoXoKzPu1dCskNCBv6o Z+GjZlKcWfgAIrkBnks0kwOHmLdfQUdPcyVLNtAM7CYjAaOs+1gExy/tIe61yazvowvd EGyhowwo24GP5ilSEu54LPgK1H0Dw+2DZsA7nEn5HXdBM7lpwRLsqZEKe4ieNVQ+Mo8G 4drg== X-Forwarded-Encrypted: i=1; AJvYcCVCshpksfVAH4tZxZZa7Naky8V8sJYpWJrt1YpX637f/d/nwMuFF+y2hQ/TvyqnjnAxi+6l9i+9o6gGNEA=@vger.kernel.org X-Gm-Message-State: AOJu0YyF8JZqSwxv+7TcecJzdHxj6xykwQ/lQd/eBwatUG8z6MPxae8u xZM3OhmejfJtJ3Rfk04PZsDcLuFqDuPEwJ2PAe/1rMivXuA4QI06LuRN3fbVtMnS2TFIYa+dIiR AuC/GnA== X-Google-Smtp-Source: AGHT+IHTShpa0P01pbMtjz7mDDMOXo31hETLUMs1wvsmqDIdr8kGVg9NIsmrYu0AlJiKZALyAjJyizqyiqw= X-Received: from pjbrs6.prod.google.com ([2002:a17:90b:2b86:b0:2fc:d77:541]) (user=seanjc job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:2d50:b0:2fa:3174:e344 with SMTP id 98e67ed59e1d1-30e4dbb700bmr12723672a91.14.1747431353067; Fri, 16 May 2025 14:35:53 -0700 (PDT) Reply-To: Sean Christopherson Date: Fri, 16 May 2025 14:35:40 -0700 In-Reply-To: <20250516213540.2546077-1-seanjc@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250516213540.2546077-1-seanjc@google.com> X-Mailer: git-send-email 2.49.0.1112.g889b7c5bd8-goog Message-ID: <20250516213540.2546077-7-seanjc@google.com> Subject: [PATCH v3 6/6] KVM: Assert that slots_lock is held when resetting per-vCPU dirty rings From: Sean Christopherson To: Paolo Bonzini Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, Peter Xu , Yan Zhao , Maxim Levitsky , Binbin Wu , James Houghton , Sean Christopherson , Pankaj Gupta Content-Transfer-Encoding: quoted-printable 
Content-Type: text/plain; charset="utf-8" Assert that slots_lock is held in kvm_dirty_ring_reset() and add a comment to explain _why_ slots_lock needs to be held for the duration of the reset. Link: https://lore.kernel.org/all/aCSns6Q5oTkdXUEe@google.com Suggested-by: James Houghton Signed-off-by: Sean Christopherson Reviewed-by: Peter Xu Reviewed-by: Yan Zhao --- virt/kvm/dirty_ring.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/virt/kvm/dirty_ring.c b/virt/kvm/dirty_ring.c index 54734025658a..1ba02a06378c 100644 --- a/virt/kvm/dirty_ring.c +++ b/virt/kvm/dirty_ring.c @@ -122,6 +122,14 @@ int kvm_dirty_ring_reset(struct kvm *kvm, struct kvm_d= irty_ring *ring, unsigned long mask =3D 0; struct kvm_dirty_gfn *entry; =20 + /* + * Ensure concurrent calls to KVM_RESET_DIRTY_RINGS are serialized, + * e.g. so that KVM fully resets all entries processed by a given call + * before returning to userspace. Holding slots_lock also protects + * the various memslot accesses. + */ + lockdep_assert_held(&kvm->slots_lock); + while (likely((*nr_entries_reset) < INT_MAX)) { if (signal_pending(current)) return -EINTR; --=20 2.49.0.1112.g889b7c5bd8-goog
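For readers following the series, here is a rough userspace sketch of the coalescing logic that patches 4/6 and 5/6 converge on, where "mask" doubles as the "is anything pending" flag. It is an illustration only, not the kernel code: struct entry, struct reset, record_reset() and coalesce() are hypothetical names invented for this example, record_reset() merely stands in for kvm_reset_dirty_gfn(), and BITS_PER_LONG is assumed to be 64. Signal handling, cond_resched() and the INT_MAX cap are intentionally left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BITS_PER_LONG 64	/* assumption: 64-bit host */

/* Hypothetical stand-in for a harvested dirty ring entry. */
struct entry { uint32_t slot; uint64_t offset; };

/* Hypothetical record of one coalesced reset request. */
struct reset { uint32_t slot; uint64_t offset; uint64_t mask; };

/* Stand-in for kvm_reset_dirty_gfn(): only records the request. */
static size_t record_reset(struct reset *out, size_t n, uint32_t slot,
			   uint64_t offset, uint64_t mask)
{
	assert(mask != 0);	/* callers only reset when something is pending */
	out[n] = (struct reset){ slot, offset, mask };
	return n + 1;
}

/*
 * Coalesce harvested entries into (slot, offset, mask) resets.  "out" must
 * have room for "nr" records.  Returns the number of resets recorded.
 */
static size_t coalesce(const struct entry *in, size_t nr, struct reset *out)
{
	uint32_t cur_slot = 0;
	uint64_t cur_offset = 0;
	uint64_t mask = 0;	/* zero means no entry is pending yet */
	size_t n = 0;

	for (size_t i = 0; i < nr; i++) {
		uint32_t next_slot = in[i].slot;
		uint64_t next_offset = in[i].offset;

		if (mask) {
			if (next_slot == cur_slot) {
				int64_t delta = (int64_t)(next_offset - cur_offset);

				/* Forward visit that fits in the current mask. */
				if (delta >= 0 && delta < BITS_PER_LONG) {
					mask |= 1ull << delta;
					continue;
				}

				/* Backward visit: rebase only if no bit is lost. */
				if (delta > -BITS_PER_LONG && delta < 0 &&
				    (mask << -delta >> -delta) == mask) {
					cur_offset = next_offset;
					mask = (mask << -delta) | 1;
					continue;
				}
			}

			/* "next" can't be batched, flush the current batch. */
			n = record_reset(out, n, cur_slot, cur_offset, mask);
		}

		/* Start a new batch with "next" as its only member. */
		cur_slot = next_slot;
		cur_offset = next_offset;
		mask = 1;
	}

	/* The last harvested entry is always left pending by the loop. */
	if (mask)
		n = record_reset(out, n, cur_slot, cur_offset, mask);

	return n;
}
```

As a usage example under the same assumptions, feeding the entries (slot 1, offset 10), (1, 12), (1, 8), (2, 8), (2, 200) through coalesce() records three resets: slot 1 at offset 8 with mask 0x15 (offsets 8, 10 and 12), then slot 2 at offset 8 with mask 0x1, then slot 2 at offset 200 with mask 0x1. An empty input records nothing, which is the case the "if (mask)" check on the final reset guards against.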