From nobody Sat Feb 7 22:21:28 2026 Received: from baidu.com (mx24.baidu.com [111.206.215.185]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 81D3A19CCEA; Thu, 24 Apr 2025 02:56:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=111.206.215.185 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745463401; cv=none; b=pARz3S7Zmm74HrBQVaTy+O6lFLEaKkDZBJDUaeBwpx9Sa9AT22KeOFfFC+22FNbbMBFZiWVUENhMvPudsTOrt3nW+r3WtMrBvCaYycOSBqr2D7VKbRz/6dlf200/FGOB9LC4X7ivz7t4XVREewMdHTSmtRy78xicVXdk7q9cbko= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1745463401; c=relaxed/simple; bh=zQalGsbxTBX06jBm/oT6wKizmyMtivzPZ9QWHX/5RRI=; h=From:To:CC:Subject:Date:Message-ID:References:In-Reply-To: Content-Type:MIME-Version; b=LRASfcQECnp9gukdUUiK1XtPj+llpwe9ZIZ0qyiOzsRlg9KIbpaiVDBfn69jMd+4dFVnlEpX0g0RFN40XcovNdKFi1j6nn/Mwl4yTs27p5Lj3S6J+L8JfMbZfahYad4yNZIrKQivDmlXr8CYrDa5eZp/Q4Mln3D8npMZ0luvBgI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=baidu.com; spf=pass smtp.mailfrom=baidu.com; arc=none smtp.client-ip=111.206.215.185 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=baidu.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=baidu.com From: "Li,Rongqing" To: Sean Christopherson CC: "pbonzini@redhat.com" , "kvm@vger.kernel.org" , "linux-kernel@vger.kernel.org" , "Li,Zhaoxin(ACG CCN)" Subject: =?gb2312?B?tPC4tDogWz8/Pz9dIFJlOiBbUEFUQ0hdIEtWTTogVXNlIGNhbGxfcmN1KCkg?= =?gb2312?B?aW4ga3ZtX2lvX2J1c19yZWdpc3Rlcl9kZXY=?= Thread-Topic: [????] Re: [PATCH] KVM: Use call_rcu() in kvm_io_bus_register_dev Thread-Index: AQHbtDGgrzSppzuzck+TBQD3uwy4MbOw0jWAgAFLEUA= Date: Thu, 24 Apr 2025 02:56:22 +0000 Message-ID: <4bfe7a8f5020448e903e6335173afc75@baidu.com> References: <20250423092509.3162-1-lirongqing@baidu.com> In-Reply-To: Accept-Language: zh-CN, en-US Content-Language: zh-CN X-MS-Has-Attach: X-MS-TNEF-Correlator: Content-Transfer-Encoding: quoted-printable Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-FEAS-Client-IP: 172.31.50.16 X-FE-Last-Public-Client-IP: 100.100.100.38 X-FE-Policy-ID: 52:10:53:SYSTEM Content-Type: text/plain; charset="utf-8" > On Wed, Apr 23, 2025, lirongqing wrote: > > From: Li RongQing > > > > Use call_rcu() instead of costly synchronize_srcu_expedited(), this > > can reduce the VM bootup time, and reduce VM migration downtime > > > > Signed-off-by: lizhaoxin >=20 > If lizhaoxin is a co-author, then this needs: >=20 > Co-developed-by: lizhaoxin >=20 Thanks, I will add > If they are _the_ author, then the From: above is wrong. >=20 > > Signed-off-by: Li RongQing > > --- > > include/linux/kvm_host.h | 1 + > > virt/kvm/kvm_main.c | 11 +++++++++-- > > 2 files changed, 10 insertions(+), 2 deletions(-) > > > > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index > > 291d49b..e772704 100644 > > --- a/include/linux/kvm_host.h > > +++ b/include/linux/kvm_host.h > > @@ -203,6 +203,7 @@ struct kvm_io_range { #define NR_IOBUS_DEVS > 1000 > > > > struct kvm_io_bus { > > + struct rcu_head rcu; > > int dev_count; > > int ioeventfd_count; > > struct kvm_io_range range[]; > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index > > 2e591cc..af730a5 100644 > > --- a/virt/kvm/kvm_main.c > > +++ b/virt/kvm/kvm_main.c > > @@ -5865,6 +5865,13 @@ int kvm_io_bus_read(struct kvm_vcpu *vcpu, > enum kvm_bus bus_idx, gpa_t addr, > > return r < 0 ? r : 0; > > } > > > > +static void free_kvm_io_bus(struct rcu_head *rcu) { > > + struct kvm_io_bus *bus =3D container_of(rcu, struct kvm_io_bus, rcu); > > + > > + kfree(bus); > > +} > > + > > int kvm_io_bus_register_dev(struct kvm *kvm, enum kvm_bus bus_idx, > gpa_t addr, > > int len, struct kvm_io_device *dev) { @@ -5903,8 +5910,8 > @@ > > int kvm_io_bus_register_dev(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t > addr, > > memcpy(new_bus->range + i + 1, bus->range + i, > > (bus->dev_count - i) * sizeof(struct kvm_io_range)); > > rcu_assign_pointer(kvm->buses[bus_idx], new_bus); > > - synchronize_srcu_expedited(&kvm->srcu); > > - kfree(bus); > > + > > + call_srcu(&kvm->srcu, &bus->rcu, free_kvm_io_bus); >=20 > I don't think this is safe from a functional correctness perspective, as = KVM must > guarantee all readers see the new device before KVM returns control to > userspace. > E.g. I'm pretty sure KVM_REGISTER_COALESCED_MMIO is used while vCPUs are > active. >=20 > However, I'm pretty sure the only readers that actually rely on SRCU are = vCPUs, > so I _think_ the synchronize_srcu_expedited() is necessary if and only if= vCPUs > have been created. >=20 > That could race with concurrent vCPU creation in a few flows that don't t= ake > kvm->lock, but that should be ok from an ABI perspective. False > kvm->positives (vCPU > creation fails) are benign, and false negatives (vCPU created after the c= heck) are > inherently racy, i.e. userspace can't guarantee the vCPU sees any particu= lar > ordering. >=20 > So this? >=20 > if (READ_ONCE(kvm->created_vcpus)) { > synchronize_srcu_expedited(&kvm->srcu); > kfree(bus); > } else { > call_srcu(&kvm->srcu, &bus->rcu, free_kvm_io_bus); > } If call_srcu is able to used only before creating vCPU, the upper will have= little effect, since most device are created after creating vCPU We want to optimize the ioeventfd creation, since a VM will create lots of = ioeventfd, can ioeventfd uses call_srcu? Like: --- a/virt/kvm/eventfd.c +++ b/virt/kvm/eventfd.c @@ -853,8 +853,8 @@ static int kvm_assign_ioeventfd_idx(struct kvm *kvm, kvm_iodevice_init(&p->dev, &ioeventfd_ops); - ret =3D kvm_io_bus_register_dev(kvm, bus_idx, p->addr, p->length, - &p->dev); + ret =3D __kvm_io_bus_register_dev(kvm, bus_idx, p->addr, p->length, + &p->dev, false); if (ret < 0) goto unlock_fail; diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 2e591cc..ff294b0 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -5865,8 +5865,15 @@ int kvm_io_bus_read(struct kvm_vcpu *vcpu, enum kvm_= bus bus_idx, gpa_t addr, return r < 0 ? r : 0; } -int kvm_io_bus_register_dev(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t a= ddr, - int len, struct kvm_io_device *dev) +static void free_kvm_io_bus(struct rcu_head *rcu) +{ + struct kvm_io_bus *bus =3D container_of(rcu, struct kvm_io_bus, rcu= ); + + kfree(bus); +} + +int __kvm_io_bus_register_dev(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t= addr, + int len, struct kvm_io_device *dev, bool sync) { int i; struct kvm_io_bus *new_bus, *bus; @@ -5903,12 +5910,22 @@ int kvm_io_bus_register_dev(struct kvm *kvm, enum k= vm_bus bus_idx, gpa_t addr, memcpy(new_bus->range + i + 1, bus->range + i, (bus->dev_count - i) * sizeof(struct kvm_io_range)); rcu_assign_pointer(kvm->buses[bus_idx], new_bus); - synchronize_srcu_expedited(&kvm->srcu); - kfree(bus); + + if (sync) { + synchronize_srcu_expedited(&kvm->srcu); + kfree(bus); + } + else + call_srcu(&kvm->srcu, &bus->rcu, free_kvm_io_bus); return 0; }=20 Thanks -Li