From nobody Sat Nov 15 10:35:46 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=reject dis=none) header.from=oracle.com ARC-Seal: i=1; a=rsa-sha256; t=1752785111; cv=none; d=zohomail.com; s=zohoarc; b=a/n3vXSPl+n4vzCnVNhSteH/KIdkjTYco7fnaYa3/MJad1LSiTIsYUPpyVlEe454yY6X/qWUO5x43PSwH5cCpDtUPdKoU2XEf5H28ohflP7KYQv97J2N5qQsvUwCDKibwwOqspXh9GXi52GvFJ6iRdIUfaJt/43+ABxi+/NuVCQ= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1752785111; h=Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=jEMH++dDedb77qsgBrEOe/ieXuj7ISA+5MLtq4+Sies=; b=c7QnnVtIYwLTmBmH3wV3iUwXvx2adrumZrwDe57p1JGb9yPvEfuBGnzogD2iKGOHB46aolfOA72ZYtv808cX1jOolcgVFax8t0YFpliyLcpuTPnLE7J+R9ul91od7yDjxY9UAKwmbb4wN41K/8vNoLW+d5eKoCLTGh3WMQqp/II= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=reject dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1752785111273136.63078165284526; Thu, 17 Jul 2025 13:45:11 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ucVT1-0007Bh-BG; Thu, 17 Jul 2025 16:44:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ucTW8-0008WF-Be for qemu-devel@nongnu.org; Thu, 17 Jul 2025 14:39:40 -0400 Received: from mx0b-00069f02.pphosted.com ([205.220.177.32]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ucTW4-00009G-5F for qemu-devel@nongnu.org; Thu, 17 Jul 2025 14:39:40 -0400 Received: from pps.filterd (m0246631.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 56HH0iHs027536; Thu, 17 Jul 2025 18:39:35 GMT Received: from phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta03.appoci.oracle.com [138.1.37.129]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 47uk1b3s4b-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 17 Jul 2025 18:39:34 +0000 (GMT) Received: from pps.filterd (phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 56HHNQRY023999; Thu, 17 Jul 2025 18:39:33 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 47ue5d2tba-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 17 Jul 2025 18:39:33 +0000 Received: from phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 56HIcoiv007425; Thu, 17 Jul 2025 18:39:33 GMT Received: from ca-dev63.us.oracle.com (ca-dev63.us.oracle.com [10.211.8.221]) by phxpaimrmta03.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 47ue5d2t8q-8; Thu, 17 Jul 2025 18:39:33 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :date:from:in-reply-to:message-id:references:subject:to; s= corp-2025-04-25; bh=jEMH++dDedb77qsgBrEOe/ieXuj7ISA+5MLtq4+Sies=; b= StY7DoXSnMdbB5velfWadQ1tsyYiP6uP9Ku16hgj9BPW5je3vvpWvTlhkD8SReKl wOYuePOOvWcyJ0X6zzGEl2Q82ZipEcpEZDL7EyR9cjymrWa2SLjslxy3M3k9s9fV hS0tNj5euUnRPKQh8BMVXCLvwgRtL4ubv1IUHZPfMcZ88SNQktkpUitwzooivTA3 iWyBzZirFdVXplwpU24AOSegT5xnut79ZHIBfj3xlVt85r2vtQdQHwfEeVqC6I5w wnJLhn2djuQNm6KjqtYI7ar8EOIa0/3c0lYyLwp85l94FFM/pUEXQOMC4jMC1lT8 l+CRDOSEPBwE9UZUgLH9Rg== From: Steve Sistare To: qemu-devel@nongnu.org Cc: Jason Wang , "Michael S. Tsirkin" , Stefano Garzarella , Peter Xu , Fabiano Rosas , Hamza Khan , Steve Sistare Subject: [RFC V2 7/8] tap: cpr support Date: Thu, 17 Jul 2025 11:39:27 -0700 Message-Id: <1752777568-236368-8-git-send-email-steven.sistare@oracle.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1752777568-236368-1-git-send-email-steven.sistare@oracle.com> References: <1752777568-236368-1-git-send-email-steven.sistare@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1099,Hydra:6.1.9,FMLib:17.12.80.40 definitions=2025-07-17_02,2025-07-17_02,2025-03-28_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 adultscore=0 phishscore=0 spamscore=0 mlxlogscore=999 malwarescore=0 mlxscore=0 suspectscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2505160000 definitions=main-2507170165 X-Proofpoint-Spam-Details-Enc: AW1haW4tMjUwNzE3MDE2NSBTYWx0ZWRfX+xFLPmS9IJMl jDh6wl3jEYLbxz5n1NHPoe5nZcqo2QpT7qo4wnihhXvvNXxoNaLSoLOgybsZwkRsMeS1QVBvA+C 7d2QSKZcZNG2qYS4e8agrW/ohFdlE0kZMLNXCP0qx/9RAAC9E2Bas4CFdxY2BhXIi59p/FoNXgN Y2oHkuUgEiSZmSMJl0ITKY9u4e2WjixBNEcQHbP/Ybh6ZdCOmpBuZICy08zEHCogW60bmZLgpru O2685TuWq2eZCtMFKvVzsTgMK0s4KIcBcZL21uacVKLaHo9wq7zZxoOtGyq8Zmt9HCEwDAA2Y0O c/zEusUFH3ecpSyf/+o7eTMnqalBVO6GcHFnexygVhl+oN12eQYM9Pw+W+Yx6XanD0iCvGpnDyR 7ejSV61A3wergECiIIJK9/FyaGdLvxETu0+OtdF/kB1gq822HM/KUdEUyL3YoHdrOwfEU2a7 X-Proofpoint-GUID: JjMflIiwhoPO3hSA330PDfJC50M978U8 X-Proofpoint-ORIG-GUID: JjMflIiwhoPO3hSA330PDfJC50M978U8 X-Authority-Analysis: v=2.4 cv=J8mq7BnS c=1 sm=1 tr=0 ts=68794366 b=1 cx=c_pps a=WeWmnZmh0fydH62SvGsd2A==:117 a=WeWmnZmh0fydH62SvGsd2A==:17 a=Wb1JkmetP80A:10 a=yPCof4ZbAAAA:8 a=zKPUYSCetFGJfgUTIKcA:9 Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=205.220.177.32; envelope-from=steven.sistare@oracle.com; helo=mx0b-00069f02.pphosted.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @oracle.com) X-ZM-MESSAGEID: 1752785113850124100 Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Provide the cpr=3Don option to preserve TAP and vhost descriptors during cpr-transfer, so the management layer does not need to create a new device for the target. Save all tap fd's in canonical order, leveraging the index argument of cpr_save_fd. For the i'th queue, the tap device fd is saved at index 2*i, and the vhostfd (if any) at index 2*i+1. tap and vhost fd's are passed by name to the monitor when a NIC is hot plugged, but the name is not known to qemu after cpr. Allow the manager to pass -1 for the fd "name" in the new qemu args to indicate that QEMU should search for a saved value. Example: -netdev tap,id=3Dhostnet2,fds=3D-1:-1,vhostfds=3D-1:-1,cpr=3Don Signed-off-by: Steve Sistare --- qapi/net.json | 5 +++- include/migration/cpr.h | 2 +- hw/vfio/device.c | 2 +- migration/cpr.c | 11 ++++---- net/tap.c | 70 ++++++++++++++++++++++++++++++++++++++-------= ---- 5 files changed, 67 insertions(+), 23 deletions(-) diff --git a/qapi/net.json b/qapi/net.json index 97ea183..5c7422b 100644 --- a/qapi/net.json +++ b/qapi/net.json @@ -238,6 +238,8 @@ # @poll-us: maximum number of microseconds that could be spent on busy # polling for tap (since 2.7) # +# @cpr: preserve fds and vhostfds during cpr-transfer. +# # Since: 1.2 ## { 'struct': 'NetdevTapOptions', @@ -256,7 +258,8 @@ '*vhostfds': 'str', '*vhostforce': 'bool', '*queues': 'uint32', - '*poll-us': 'uint32'} } + '*poll-us': 'uint32', + '*cpr': 'bool'} } =20 ## # @NetdevSocketOptions: diff --git a/include/migration/cpr.h b/include/migration/cpr.h index 0fa57dd..baff57f 100644 --- a/include/migration/cpr.h +++ b/include/migration/cpr.h @@ -45,7 +45,7 @@ void cpr_state_close(void); struct QIOChannel *cpr_state_ioc(void); =20 bool cpr_incoming_needed(void *opaque); -int cpr_get_fd_param(const char *name, const char *fdname, int index, +int cpr_get_fd_param(const char *name, const char *fdname, int index, bool= cpr, Error **errp); =20 QEMUFile *cpr_transfer_output(MigrationChannel *channel, Error **errp); diff --git a/hw/vfio/device.c b/hw/vfio/device.c index 96cf214..9eb6699 100644 --- a/hw/vfio/device.c +++ b/hw/vfio/device.c @@ -351,7 +351,7 @@ void vfio_device_free_name(VFIODevice *vbasedev) =20 void vfio_device_set_fd(VFIODevice *vbasedev, const char *str, Error **err= p) { - vbasedev->fd =3D cpr_get_fd_param(vbasedev->dev->id, str, 0, errp); + vbasedev->fd =3D cpr_get_fd_param(vbasedev->dev->id, str, 0, true, err= p); } =20 static VFIODeviceIOOps vfio_device_io_ops_ioctl; diff --git a/migration/cpr.c b/migration/cpr.c index e97be9d..6d01b8c 100644 --- a/migration/cpr.c +++ b/migration/cpr.c @@ -282,6 +282,7 @@ bool cpr_incoming_needed(void *opaque) * @name: CPR name for the descriptor * @fdname: An integer-valued string, or a name passed to a getfd command * @index: CPR index of the descriptor + * @cpr: use cpr * @errp: returned error message * * If CPR is not being performed, then use @fdname to find the fd. @@ -291,22 +292,22 @@ bool cpr_incoming_needed(void *opaque) * On success returns the fd value, else returns -1. */ int cpr_get_fd_param(const char *name, const char *fdname, int index, - Error **errp) + bool cpr, Error **errp) { ERRP_GUARD(); int fd; =20 - if (cpr_is_incoming()) { + if (cpr && cpr_is_incoming()) { fd =3D cpr_find_fd(name, index); if (fd < 0) { error_setg(errp, "cannot find saved value for fd %s", fdname); } } else { fd =3D monitor_fd_param(monitor_cur(), fdname, errp); - if (fd >=3D 0) { - cpr_save_fd(name, index, fd); - } else { + if (fd < 0) { error_prepend(errp, "Could not parse object fd %s:", fdname); + } else if (cpr) { + cpr_save_fd(name, index, fd); } } return fd; diff --git a/net/tap.c b/net/tap.c index 1b239fd..6a12751 100644 --- a/net/tap.c +++ b/net/tap.c @@ -35,6 +35,7 @@ #include "net/eth.h" #include "net/net.h" #include "clients.h" +#include "migration/cpr.h" #include "monitor/monitor.h" #include "system/system.h" #include "qapi/error.h" @@ -59,6 +60,7 @@ typedef struct TAPState { bool has_ufo; bool has_uso; bool enabled; + bool cpr; VHostNetState *vhost_net; unsigned host_vnet_hdr_len; Notifier exit; @@ -290,6 +292,9 @@ static void tap_cleanup(NetClientState *nc) { TAPState *s =3D DO_UPCAST(TAPState, nc, nc); =20 + if (s->cpr) { + cpr_delete_fd_all(nc->name); + } if (s->vhost_net) { vhost_net_cleanup(s->vhost_net); g_free(s->vhost_net); @@ -642,18 +647,24 @@ static int net_tap_init(const NetdevTapOptions *tap, = int *vnet_hdr, return fd; } =20 +/* CPR fd's for each queue are saved at these indices */ +#define TAP_FD_INDEX(queue) (2 * (queue) + 0) +#define TAP_VHOSTFD_INDEX(queue) (2 * (queue) + 1) + #define MAX_TAP_QUEUES 1024 =20 static void net_init_tap_one(const NetdevTapOptions *tap, NetClientState *= peer, const char *model, const char *name, const char *ifname, const char *script, const char *downscript, const char *vhostfdna= me, - int vnet_hdr, int fd, Error **errp) + int vnet_hdr, int fd, int index, Error **errp) { Error *err =3D NULL; TAPState *s =3D net_tap_fd_init(peer, model, name, fd, vnet_hdr); + bool cpr =3D tap->has_cpr ? tap->cpr : false; int vhostfd; =20 + s->cpr =3D cpr; tap_set_sndbuf(s->fd, tap, &err); if (err) { error_propagate(errp, err); @@ -688,7 +699,7 @@ static void net_init_tap_one(const NetdevTapOptions *ta= p, NetClientState *peer, } =20 if (vhostfdname) { - vhostfd =3D monitor_fd_param(monitor_cur(), vhostfdname, &err); + vhostfd =3D cpr_get_fd_param(name, vhostfdname, index, cpr, &e= rr); if (vhostfd =3D=3D -1) { error_propagate(errp, err); goto failed; @@ -699,7 +710,13 @@ static void net_init_tap_one(const NetdevTapOptions *t= ap, NetClientState *peer, goto failed; } } else { - vhostfd =3D open("/dev/vhost-net", O_RDWR); + vhostfd =3D cpr ? cpr_find_fd(name, index) : -1; + if (vhostfd < 0) { + vhostfd =3D open("/dev/vhost-net", O_RDWR); + if (cpr && vhostfd >=3D 0) { + cpr_save_fd(name, index, vhostfd); + } + } if (vhostfd < 0) { error_setg_errno(errp, errno, "tap: open vhost char device failed"); @@ -727,6 +744,9 @@ static void net_init_tap_one(const NetdevTapOptions *ta= p, NetClientState *peer, return; =20 failed: + if (cpr) { + cpr_delete_fd_all(name); + } qemu_del_net_client(&s->nc); } =20 @@ -759,7 +779,8 @@ static int get_fds(char *str, char *fds[], int max) int net_init_tap(const Netdev *netdev, const char *name, NetClientState *peer, Error **errp) { - const NetdevTapOptions *tap; + const NetdevTapOptions *tap =3D &netdev->u.tap; + bool cpr =3D tap->has_cpr ? tap->cpr : false; int fd, vnet_hdr =3D 0, i =3D 0, queues; /* for the no-fd, no-helper case */ const char *script; @@ -795,7 +816,7 @@ int net_init_tap(const Netdev *netdev, const char *name, goto out; } =20 - fd =3D monitor_fd_param(monitor_cur(), tap->fd, errp); + fd =3D cpr_get_fd_param(name, tap->fd, TAP_FD_INDEX(0), cpr, errp); if (fd =3D=3D -1) { ret =3D -1; goto out; @@ -818,13 +839,14 @@ int net_init_tap(const Netdev *netdev, const char *na= me, =20 net_init_tap_one(tap, peer, "tap", name, NULL, script, downscript, - vhostfdname, vnet_hdr, fd, &err); + vhostfdname, vnet_hdr, fd, TAP_VHOSTFD_INDEX(0), = &err); if (err) { error_propagate(errp, err); close(fd); ret =3D -1; goto out; } + } else if (tap->fds) { char **fds; char **vhost_fds; @@ -855,7 +877,7 @@ int net_init_tap(const Netdev *netdev, const char *name, } =20 for (i =3D 0; i < nfds; i++) { - fd =3D monitor_fd_param(monitor_cur(), fds[i], errp); + fd =3D cpr_get_fd_param(name, fds[i], TAP_FD_INDEX(i), cpr, er= rp); if (fd =3D=3D -1) { ret =3D -1; goto free_fail; @@ -884,7 +906,7 @@ int net_init_tap(const Netdev *netdev, const char *name, net_init_tap_one(tap, peer, "tap", name, ifname, script, downscript, tap->vhostfds ? vhost_fds[i] : NULL, - vnet_hdr, fd, &err); + vnet_hdr, fd, TAP_VHOSTFD_INDEX(i), &err); if (err) { error_propagate(errp, err); ret =3D -1; @@ -912,9 +934,15 @@ free_fail: goto out; } =20 - fd =3D net_bridge_run_helper(tap->helper, - tap->br ?: DEFAULT_BRIDGE_INTERFACE, - errp); + fd =3D cpr ? cpr_find_fd(name, TAP_FD_INDEX(0)) : -1; + if (fd < 0) { + fd =3D net_bridge_run_helper(tap->helper, + tap->br ?: DEFAULT_BRIDGE_INTERFACE, + errp); + if (cpr && fd >=3D 0) { + cpr_save_fd(name, TAP_FD_INDEX(0), fd); + } + } if (fd =3D=3D -1) { ret =3D -1; goto out; @@ -934,13 +962,14 @@ free_fail: =20 net_init_tap_one(tap, peer, "bridge", name, ifname, script, downscript, vhostfdname, - vnet_hdr, fd, &err); + vnet_hdr, fd, TAP_VHOSTFD_INDEX(0), &err); if (err) { error_propagate(errp, err); close(fd); ret =3D -1; goto out; } + } else { g_autofree char *default_script =3D NULL; g_autofree char *default_downscript =3D NULL; @@ -965,8 +994,14 @@ free_fail: } =20 for (i =3D 0; i < queues; i++) { - fd =3D net_tap_init(tap, &vnet_hdr, i >=3D 1 ? "no" : script, - ifname, sizeof ifname, queues > 1, errp); + fd =3D cpr ? cpr_find_fd(name, TAP_FD_INDEX(i)) : -1; + if (fd < 0) { + fd =3D net_tap_init(tap, &vnet_hdr, i >=3D 1 ? "no" : scri= pt, + ifname, sizeof ifname, queues > 1, errp); + if (cpr && fd >=3D 0) { + cpr_save_fd(name, TAP_FD_INDEX(i), fd); + } + } if (fd =3D=3D -1) { ret =3D -1; goto out; @@ -984,7 +1019,9 @@ free_fail: net_init_tap_one(tap, peer, "tap", name, ifname, i >=3D 1 ? "no" : script, i >=3D 1 ? "no" : downscript, - vhostfdname, vnet_hdr, fd, &err); + vhostfdname, vnet_hdr, + fd, TAP_VHOSTFD_INDEX(i), + &err); if (err) { error_propagate(errp, err); close(fd); @@ -995,6 +1032,9 @@ free_fail: } =20 out: + if (ret && cpr) { + cpr_delete_fd_all(name); + } return ret; } =20 --=20 1.8.3.1