From: Stefano Garzarella
To: virtualization@lists.linux.dev
Cc: Stefano Garzarella, Jason Wang, "Michael S. Tsirkin", Eugenio Pérez,
 netdev@vger.kernel.org, Stefan Hajnoczi, kvm@vger.kernel.org,
 linux-kernel@vger.kernel.org
Subject: [PATCH] vhost/vsock: improve RCU read sections around vhost_vsock_get()
Date: Wed, 26 Nov 2025 14:38:26 +0100
Message-ID: <20251126133826.142496-1-sgarzare@redhat.com>
X-Mailer: git-send-email 2.51.1

From: Stefano Garzarella

vhost_vsock_get() uses hash_for_each_possible_rcu() to find the
`vhost_vsock` associated with the `guest_cid`.

hash_for_each_possible_rcu() should only be called within an RCU read
section, as mentioned in the following comment in
include/linux/rculist.h:

/**
 * hlist_for_each_entry_rcu - iterate over rcu list of given type
 * @pos:	the type * to use as a loop cursor.
 * @head:	the head for your list.
 * @member:	the name of the hlist_node within the struct.
 * @cond:	optional lockdep expression if called from non-RCU protection.
 *
 * This list-traversal primitive may safely run concurrently with
 * the _rcu list-mutation primitives such as hlist_add_head_rcu()
 * as long as the traversal is guarded by rcu_read_lock().
 */
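For illustration, the reader-side pattern that this comment requires
looks roughly like the sketch below; every name in it (struct
example_entry, example_hash, example_reader, ...) is hypothetical and
not taken from this driver:

#include <linux/hashtable.h>
#include <linux/printk.h>
#include <linux/rcupdate.h>

struct example_entry {
	struct hlist_node node;
	u32 key;
};

static DEFINE_HASHTABLE(example_hash, 8);

static struct example_entry *example_lookup(u32 key)
{
	struct example_entry *e;

	/* Safe against concurrent hash_add_rcu()/hash_del_rcu() only
	 * while the caller is inside an RCU read section.
	 */
	hash_for_each_possible_rcu(example_hash, e, node, key) {
		if (e->key == key)
			return e;
	}
	return NULL;
}

static void example_reader(u32 key)
{
	struct example_entry *e;

	rcu_read_lock();
	e = example_lookup(key);
	if (e)
		pr_debug("found key %u\n", e->key);
	/* e must not be dereferenced after rcu_read_unlock() */
	rcu_read_unlock();
}

The key point is that the pointer returned by the lookup is only
guaranteed to stay valid until rcu_read_unlock().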
Currently, all calls to vhost_vsock_get() are between rcu_read_lock()
and rcu_read_unlock(), except for the calls in vhost_vsock_set_cid()
and vhost_vsock_reset_orphans(). In both cases the current code is
safe, but we can make it more robust.

Regarding vhost_vsock_set_cid(): when the kernel is built with
CONFIG_PROVE_RCU_LIST enabled, we get the following RCU warning when
user space issues `ioctl(dev, VHOST_VSOCK_SET_GUEST_CID, ...)`:

  WARNING: suspicious RCU usage
  6.18.0-rc7 #62 Not tainted
  -----------------------------
  drivers/vhost/vsock.c:74 RCU-list traversed in non-reader section!!

  other info that might help us debug this:

  rcu_scheduler_active = 2, debug_locks = 1
  1 lock held by rpc-libvirtd/3443:
   #0: ffffffffc05032a8 (vhost_vsock_mutex){+.+.}-{4:4}, at: vhost_vsock_dev_ioctl+0x2ff/0x530 [vhost_vsock]

  stack backtrace:
  CPU: 2 UID: 0 PID: 3443 Comm: rpc-libvirtd Not tainted 6.18.0-rc7 #62 PREEMPT(none)
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-7.fc42 06/10/2025
  Call Trace:
   dump_stack_lvl+0x75/0xb0
   dump_stack+0x14/0x1a
   lockdep_rcu_suspicious.cold+0x4e/0x97
   vhost_vsock_get+0x8f/0xa0 [vhost_vsock]
   vhost_vsock_dev_ioctl+0x307/0x530 [vhost_vsock]
   __x64_sys_ioctl+0x4f2/0xa00
   x64_sys_call+0xed0/0x1da0
   do_syscall_64+0x73/0xfa0
   entry_SYSCALL_64_after_hwframe+0x76/0x7e
   ...

This is not a real problem, because the vhost_vsock_get() caller,
i.e. vhost_vsock_set_cid(), holds the `vhost_vsock_mutex` used by the
hash table writers. Still, to prevent the warning, add a
lockdep_is_held() condition to hash_for_each_possible_rcu() so that,
when CONFIG_PROVE_RCU_LIST is enabled, lockdep verifies that the
caller is either in an RCU read section or holds `vhost_vsock_mutex`.
Also clarify the comment on vhost_vsock_get() to better describe the
locking requirements and the scope of validity of the returned
pointer.

As for vhost_vsock_reset_orphans(), this function is currently only
called via vsock_for_each_connected_socket(), which holds the
`vsock_table_lock` spinlock (itself an RCU read-side critical
section). However, add an explicit RCU read lock there to make the
RCU requirements explicit, and to prevent issues if the calling
context changes in the future or if vhost_vsock_reset_orphans() is
called from other contexts.
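Reduced to a sketch that continues the hypothetical example above
(the real change is in the diff below), the shape of the
vhost_vsock_set_cid() part of the fix is:

#include <linux/lockdep.h>
#include <linux/mutex.h>

static DEFINE_MUTEX(example_mutex);

/* Writer side: updates are serialized by example_mutex. */
static void example_insert(struct example_entry *e)
{
	mutex_lock(&example_mutex);
	hash_add_rcu(example_hash, &e->node, e->key);
	mutex_unlock(&example_mutex);
}

static struct example_entry *example_lookup(u32 key)
{
	struct example_entry *e;

	/* The extra condition tells lockdep (CONFIG_PROVE_RCU_LIST) that
	 * holding example_mutex is also a legitimate way to traverse the
	 * list, so a caller like the writer above does not trigger the
	 * "RCU-list traversed in non-reader section!!" warning.
	 */
	hash_for_each_possible_rcu(example_hash, e, node, key,
				   lockdep_is_held(&example_mutex)) {
		if (e->key == key)
			return e;
	}
	return NULL;
}

On builds without CONFIG_PROVE_RCU_LIST the extra condition compiles
away, so it only widens what lockdep accepts as a legal calling
context.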
Fixes: 834e772c8db0 ("vhost/vsock: fix use-after-free in network stack callers")
Cc: stefanha@redhat.com
Signed-off-by: Stefano Garzarella
Reviewed-by: Stefan Hajnoczi
---
 drivers/vhost/vsock.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index ae01457ea2cd..78cc66fbb3dd 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -64,14 +64,15 @@ static u32 vhost_transport_get_local_cid(void)
 	return VHOST_VSOCK_DEFAULT_HOST_CID;
 }
 
-/* Callers that dereference the return value must hold vhost_vsock_mutex or the
- * RCU read lock.
+/* Callers must be in an RCU read section or hold the vhost_vsock_mutex.
+ * The return value can only be dereferenced while within the section.
  */
 static struct vhost_vsock *vhost_vsock_get(u32 guest_cid)
 {
 	struct vhost_vsock *vsock;
 
-	hash_for_each_possible_rcu(vhost_vsock_hash, vsock, hash, guest_cid) {
+	hash_for_each_possible_rcu(vhost_vsock_hash, vsock, hash, guest_cid,
+				   lockdep_is_held(&vhost_vsock_mutex)) {
 		u32 other_cid = vsock->guest_cid;
 
 		/* Skip instances that have no CID yet */
@@ -707,9 +708,15 @@ static void vhost_vsock_reset_orphans(struct sock *sk)
 	 * executing.
 	 */
 
+	rcu_read_lock();
+
 	/* If the peer is still valid, no need to reset connection */
-	if (vhost_vsock_get(vsk->remote_addr.svm_cid))
+	if (vhost_vsock_get(vsk->remote_addr.svm_cid)) {
+		rcu_read_unlock();
 		return;
+	}
+
+	rcu_read_unlock();
 
 	/* If the close timeout is pending, let it expire. This avoids races
 	 * with the timeout callback.
-- 
2.51.1
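For completeness, a minimal user-space sequence that reaches
vhost_vsock_set_cid(), and with CONFIG_PROVE_RCU_LIST enabled (before
this patch) triggers the warning above, could look like this sketch
(error handling mostly omitted; the CID value is arbitrary):

#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>

int main(void)
{
	/* Any CID above the reserved ones; 42 is arbitrary. */
	uint64_t guest_cid = 42;
	int fd = open("/dev/vhost-vsock", O_RDWR);

	if (fd < 0)
		return 1;

	/* Reaches vhost_vsock_dev_ioctl() -> vhost_vsock_set_cid(),
	 * which calls vhost_vsock_get() under vhost_vsock_mutex.
	 */
	ioctl(fd, VHOST_VSOCK_SET_GUEST_CID, &guest_cid);
	close(fd);
	return 0;
}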