From: Cindy Lu
To: lulu@redhat.com, mst@redhat.com, jasowang@redhat.com, kvm@vger.kernel.org, virtualization@lists.linux.dev, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [RFC 2/3] vhost/net: add netfilter socket support
Date: Sun, 8 Feb 2026 22:32:23 +0800
Message-ID: <20260208143441.2177372-3-lulu@redhat.com>
In-Reply-To: <20260208143441.2177372-1-lulu@redhat.com>
References: <20260208143441.2177372-1-lulu@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Introduce the netfilter socket plumbing and the VHOST_NET_SET_FILTER ioctl.
Initialize the netfilter state on open and release it on reset/close.

Key points:
- Add filter_sock + filter_lock to vhost_net
- Validate SOCK_SEQPACKET AF_UNIX filter socket from userspace
- Add vhost_net_set_filter() and VHOST_NET_SET_FILTER ioctl handler
- Initialize filter state on open and clean up on reset/release

Signed-off-by: Cindy Lu
---
 drivers/vhost/net.c | 109 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 109 insertions(+)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 7f886d3dba7d..f02deff0e53c 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -131,6 +131,7 @@ struct vhost_net_virtqueue {
 	struct vhost_net_buf rxq;
 	/* Batched XDP buffs */
 	struct xdp_buff *xdp;
+
 };
 
 struct vhost_net {
@@ -147,6 +148,15 @@ struct vhost_net {
 	bool tx_flush;
 	/* Private page frag cache */
 	struct page_frag_cache pf_cache;
+
+	/*
+	 * Optional vhost-net filter offload socket.
+	 * When configured, RX packets can be routed through a userspace
+	 * filter chain via a SOCK_SEQPACKET control socket. Access to
+	 * filter_sock is protected by filter_lock.
+	 */
+	struct socket *filter_sock;
+	spinlock_t filter_lock;
 };
 
 static unsigned vhost_net_zcopy_mask __read_mostly;
@@ -1128,6 +1138,95 @@ static int get_rx_bufs(struct vhost_net_virtqueue *nvq,
 	return r;
 }
 
+/*
+ * Validate and acquire the filter socket from userspace.
+ *
+ * Returns:
+ * - NULL when fd == -1 (explicitly disable filter)
+ * - a ref-counted struct socket on success
+ * - ERR_PTR(-errno) on validation failure
+ */
+static struct socket *get_filter_socket(int fd)
+{
+	int r;
+	struct socket *sock;
+
+	/* Special case: userspace asks to disable filter. */
+	if (fd == -1)
+		return NULL;
+
+	sock = sockfd_lookup(fd, &r);
+	if (!sock)
+		return ERR_PTR(-ENOTSOCK);
+
+	if (sock->sk->sk_family != AF_UNIX ||
+	    sock->sk->sk_type != SOCK_SEQPACKET) {
+		sockfd_put(sock);
+		return ERR_PTR(-EINVAL);
+	}
+
+	return sock;
+}
+
+/*
+ * Drop the currently configured filter socket, if any.
+ *
+ * Caller does not need to hold filter_lock; this function clears the pointer
+ * under the lock and releases the socket reference afterwards.
+ */
+static void vhost_net_filter_stop(struct vhost_net *n)
+{
+	struct socket *sock = n->filter_sock;
+
+	spin_lock(&n->filter_lock);
+	n->filter_sock = NULL;
+	spin_unlock(&n->filter_lock);
+
+	if (sock)
+		sockfd_put(sock);
+}
+
+/*
+ * Install or remove a filter socket for this vhost-net device.
+ *
+ * The ioctl passes an fd for a SOCK_SEQPACKET AF_UNIX socket created by
+ * userspace. We validate the socket type, replace any existing filter socket,
+ * and keep a reference so RX path can safely send filter requests.
+ */
+static long vhost_net_set_filter(struct vhost_net *n, int fd)
+{
+	struct socket *sock;
+	int r;
+
+	mutex_lock(&n->dev.mutex);
+	r = vhost_dev_check_owner(&n->dev);
+	if (r)
+		goto out;
+
+	sock = get_filter_socket(fd);
+	if (IS_ERR(sock)) {
+		r = PTR_ERR(sock);
+		goto out;
+	}
+
+	vhost_net_filter_stop(n);
+
+	if (!sock) {
+		r = 0;
+		goto out;
+	}
+
+	spin_lock(&n->filter_lock);
+	n->filter_sock = sock;
+	spin_unlock(&n->filter_lock);
+
+	r = 0;
+
+out:
+	mutex_unlock(&n->dev.mutex);
+	return r;
+}
+
 /* Expects to be always run from workqueue - which acts as
  * read-size critical section for our kind of RCU. */
 static void handle_rx(struct vhost_net *net)
@@ -1383,6 +1482,8 @@ static int vhost_net_open(struct inode *inode, struct file *f)
 
 	f->private_data = n;
 	page_frag_cache_init(&n->pf_cache);
+	spin_lock_init(&n->filter_lock);
+	n->filter_sock = NULL;
 
 	return 0;
 }
@@ -1433,6 +1534,7 @@ static int vhost_net_release(struct inode *inode, struct file *f)
 	struct socket *tx_sock;
 	struct socket *rx_sock;
 
+	vhost_net_filter_stop(n);
 	vhost_net_stop(n, &tx_sock, &rx_sock);
 	vhost_net_flush(n);
 	vhost_dev_stop(&n->dev);
@@ -1637,6 +1739,8 @@ static long vhost_net_reset_owner(struct vhost_net *n)
 	err = vhost_dev_check_owner(&n->dev);
 	if (err)
 		goto done;
+
+	vhost_net_filter_stop(n);
 	umem = vhost_dev_reset_owner_prepare();
 	if (!umem) {
 		err = -ENOMEM;
@@ -1737,6 +1841,7 @@ static long vhost_net_ioctl(struct file *f, unsigned int ioctl,
 	void __user *argp = (void __user *)arg;
 	u64 __user *featurep = argp;
 	struct vhost_vring_file backend;
+	struct vhost_net_filter filter;
 	u64 features, count, copied;
 	int r, i;
 
@@ -1745,6 +1850,10 @@ static long vhost_net_ioctl(struct file *f, unsigned int ioctl,
 		if (copy_from_user(&backend, argp, sizeof backend))
 			return -EFAULT;
 		return vhost_net_set_backend(n, backend.index, backend.fd);
+	case VHOST_NET_SET_FILTER:
+		if (copy_from_user(&filter, argp, sizeof(filter)))
+			return -EFAULT;
+		return vhost_net_set_filter(n, filter.fd);
 	case VHOST_GET_FEATURES:
 		features = vhost_net_features[0];
 		if (copy_to_user(featurep, &features, sizeof features))
-- 
2.52.0