From: Pasha Tatashin
To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com,
	pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com,
	rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org,
	ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org,
	aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org,
	tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com,
	roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk,
	mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org,
	hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com,
	joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com,
	song@kernel.org, linux@weissschuh.net, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org,
	tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
	dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
	rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org,
	cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com,
	Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com,
	aleksander.lobakin@intel.com, ira.weiny@intel.com,
	andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de,
	bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com,
	stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net,
	brauner@kernel.org, linux-api@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com,
	jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com,
	hughd@google.com, skhawaja@google.com, chrisl@kernel.org
Subject: [PATCH v6 07/20] liveupdate: luo_session: Add ioctls for file preservation
Date: Sat, 15 Nov 2025 18:33:53 -0500
Message-ID: <20251115233409.768044-8-pasha.tatashin@soleen.com>
In-Reply-To: <20251115233409.768044-1-pasha.tatashin@soleen.com>
References: <20251115233409.768044-1-pasha.tatashin@soleen.com>

Introduce the userspace interface and internal logic required to manage
the lifecycle of file descriptors within a session. Previously, a session
was merely a container; this change makes it a functional management unit.

The following capabilities are added:

A new set of ioctl commands is added. They operate on the file descriptor
returned by CREATE_SESSION and allow userspace to:

- LIVEUPDATE_SESSION_PRESERVE_FD: Add a file descriptor to a session
  to be preserved across the live update.
- LIVEUPDATE_SESSION_RETRIEVE_FD: Retrieve a preserved file in the
  new kernel using its unique token.
- LIVEUPDATE_SESSION_FINISH: Signal that restoration of the session is
  complete, so the kernel can release the preserved resources.

The session's .release handler is enhanced to be state-aware. When a
session's file descriptor is closed, the handler finishes a retrieved
(incoming) session or unpreserves the files of an outgoing session before
removing and freeing it. If finishing fails, a warning is logged and the
session is left in place.
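
For illustration only, the three session ioctls could be driven from
userspace roughly as sketched below. This is a minimal sketch under stated
assumptions, not part of the patch: the struct layouts and command numbers
mirror the UAPI additions in this patch (written with <stdint.h> types so
the sketch builds without the new header), the value used for
LIVEUPDATE_IOCTL_TYPE is a placeholder because its definition is not shown
here, and the helper names session_preserve_fd(), session_retrieve_fd()
and session_finish() are hypothetical. The session file descriptor is
assumed to come from the CREATE_SESSION or RETRIEVE_SESSION ioctl.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* ASSUMPTION: placeholder value; the real type is defined outside this patch. */
#define LIVEUPDATE_IOCTL_TYPE 0xBA

/* Command numbers as added by this patch. */
enum {
	LIVEUPDATE_CMD_SESSION_BASE = 0x40,
	LIVEUPDATE_CMD_SESSION_PRESERVE_FD = LIVEUPDATE_CMD_SESSION_BASE,
	LIVEUPDATE_CMD_SESSION_RETRIEVE_FD = 0x41,
	LIVEUPDATE_CMD_SESSION_FINISH = 0x42,
};

/* Local mirrors of the UAPI structs, using stdint types. */
struct liveupdate_session_preserve_fd {
	uint32_t size;
	int32_t fd;
	uint64_t token __attribute__((aligned(8)));
};

struct liveupdate_session_retrieve_fd {
	uint32_t size;
	int32_t fd;
	uint64_t token __attribute__((aligned(8)));
};

struct liveupdate_session_finish {
	uint32_t size;
	uint32_t reserved;
};

#define LIVEUPDATE_SESSION_PRESERVE_FD \
	_IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_SESSION_PRESERVE_FD)
#define LIVEUPDATE_SESSION_RETRIEVE_FD \
	_IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_SESSION_RETRIEVE_FD)
#define LIVEUPDATE_SESSION_FINISH \
	_IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_SESSION_FINISH)

/* The kernel reads the leading size field first, so the layout matters. */
static_assert(sizeof(struct liveupdate_session_preserve_fd) == 16,
	      "unexpected preserve_fd layout");
static_assert(sizeof(struct liveupdate_session_finish) == 8,
	      "unexpected finish layout");

/* Hypothetical helper: ask the session to preserve @fd under @token. */
static int session_preserve_fd(int session_fd, int fd, uint64_t token)
{
	struct liveupdate_session_preserve_fd arg = {
		.size = sizeof(arg),
		.fd = fd,
		.token = token,
	};

	return ioctl(session_fd, LIVEUPDATE_SESSION_PRESERVE_FD, &arg);
}

/* Hypothetical helper: returns the restored fd for @token, or -1 on error. */
static int session_retrieve_fd(int session_fd, uint64_t token)
{
	struct liveupdate_session_retrieve_fd arg = {
		.size = sizeof(arg),
		.fd = -1,
		.token = token,
	};

	if (ioctl(session_fd, LIVEUPDATE_SESSION_RETRIEVE_FD, &arg) < 0)
		return -1;

	return arg.fd;
}

/* Hypothetical helper: signal that restoration of the session is complete. */
static int session_finish(int session_fd)
{
	struct liveupdate_session_finish arg = {
		.size = sizeof(arg),
		.reserved = 0,	/* must be zero per the kernel-doc */
	};

	return ioctl(session_fd, LIVEUPDATE_SESSION_FINISH, &arg);
}
```

With these helpers, a caller would preserve a descriptor in the old kernel
with session_preserve_fd(session_fd, memfd, token), then in the new kernel
call session_retrieve_fd(session_fd, token) for each token and finally
session_finish(session_fd). Each helper sets the size field because the
kernel dispatcher reads it first and rejects requests shorter than the
command's minimum size.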

Signed-off-by: Pasha Tatashin
---
 include/uapi/linux/liveupdate.h | 103 ++++++++++++++++++
 kernel/liveupdate/luo_session.c | 187 +++++++++++++++++++++++++++++++-
 2 files changed, 286 insertions(+), 4 deletions(-)

diff --git a/include/uapi/linux/liveupdate.h b/include/uapi/linux/liveupdate.h
index 6e04254ee535..3902ffab4c53 100644
--- a/include/uapi/linux/liveupdate.h
+++ b/include/uapi/linux/liveupdate.h
@@ -53,6 +53,14 @@ enum {
 	LIVEUPDATE_CMD_RETRIEVE_SESSION = 0x01,
 };
 
+/* ioctl commands for session file descriptors */
+enum {
+	LIVEUPDATE_CMD_SESSION_BASE = 0x40,
+	LIVEUPDATE_CMD_SESSION_PRESERVE_FD = LIVEUPDATE_CMD_SESSION_BASE,
+	LIVEUPDATE_CMD_SESSION_RETRIEVE_FD = 0x41,
+	LIVEUPDATE_CMD_SESSION_FINISH = 0x42,
+};
+
 /**
  * struct liveupdate_ioctl_create_session - ioctl(LIVEUPDATE_IOCTL_CREATE_SESSION)
  * @size: Input; sizeof(struct liveupdate_ioctl_create_session)
@@ -110,4 +118,99 @@ struct liveupdate_ioctl_retrieve_session {
 #define LIVEUPDATE_IOCTL_RETRIEVE_SESSION \
 	_IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_RETRIEVE_SESSION)
 
+/* Session specific IOCTLs */
+
+/**
+ * struct liveupdate_session_preserve_fd - ioctl(LIVEUPDATE_SESSION_PRESERVE_FD)
+ * @size: Input; sizeof(struct liveupdate_session_preserve_fd)
+ * @fd: Input; The user-space file descriptor to be preserved.
+ * @token: Input; An opaque, unique token for the preserved resource.
+ *
+ * Holds parameters for preserving a file descriptor.
+ *
+ * User sets the @fd field identifying the file descriptor to preserve
+ * (e.g., memfd, kvm, iommufd, VFIO). The kernel validates if this FD type
+ * and its dependencies are supported for preservation. If validation passes,
+ * the kernel marks the FD internally and *initiates the process* of preparing
+ * its state for saving. The actual snapshotting of the state typically occurs
+ * during the subsequent %LIVEUPDATE_IOCTL_PREPARE execution phase, though
+ * some finalization might occur during freeze.
+ * On successful validation and initiation, the kernel uses the @token
+ * field as an opaque identifier representing the resource being preserved.
+ * This token confirms the FD is targeted for preservation and is required for
+ * the subsequent %LIVEUPDATE_SESSION_RETRIEVE_FD call after the live update.
+ *
+ * Return: 0 on success (validation passed, preservation initiated), negative
+ * error code on failure (e.g., unsupported FD type, dependency issue,
+ * validation failed).
+ */
+struct liveupdate_session_preserve_fd {
+	__u32 size;
+	__s32 fd;
+	__aligned_u64 token;
+};
+
+#define LIVEUPDATE_SESSION_PRESERVE_FD \
+	_IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_SESSION_PRESERVE_FD)
+
+/**
+ * struct liveupdate_session_retrieve_fd - ioctl(LIVEUPDATE_SESSION_RETRIEVE_FD)
+ * @size: Input; sizeof(struct liveupdate_session_retrieve_fd)
+ * @fd: Output; The new file descriptor representing the fully restored
+ * kernel resource.
+ * @token: Input; An opaque token that was used to preserve the resource.
+ *
+ * Retrieve a previously preserved file descriptor.
+ *
+ * User sets the @token field to the value obtained from a successful
+ * %LIVEUPDATE_SESSION_PRESERVE_FD call before the live update. On success,
+ * the kernel restores the state (saved during the PREPARE/FREEZE phases)
+ * associated with the token and populates the @fd field with a new file
+ * descriptor referencing the restored resource in the current (new) kernel.
+ * This operation must be performed *before* signaling completion via
+ * %LIVEUPDATE_SESSION_FINISH.
+ *
+ * Return: 0 on success, negative error code on failure (e.g., invalid token).
+ */
+struct liveupdate_session_retrieve_fd {
+	__u32 size;
+	__s32 fd;
+	__aligned_u64 token;
+};
+
+#define LIVEUPDATE_SESSION_RETRIEVE_FD \
+	_IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_SESSION_RETRIEVE_FD)
+
+/**
+ * struct liveupdate_session_finish - ioctl(LIVEUPDATE_SESSION_FINISH)
+ * @size: Input; sizeof(struct liveupdate_session_finish)
+ * @reserved: Input; Must be zero. Reserved for future use.
+ *
+ * Signals the completion of the restoration process for a retrieved session.
+ * This is the final operation that should be performed on a session file
+ * descriptor after a live update.
+ *
+ * This ioctl must be called once all required file descriptors for the session
+ * have been successfully retrieved (using %LIVEUPDATE_SESSION_RETRIEVE_FD) and
+ * are fully restored from the userspace and kernel perspective.
+ *
+ * Upon success, the kernel releases its ownership of the preserved resources
+ * associated with this session. This allows internal resources to be freed,
+ * typically by decrementing reference counts on the underlying preserved
+ * objects.
+ *
+ * If this operation fails, the resources remain preserved in memory. Userspace
+ * may attempt to call finish again. The resources will otherwise be reset
+ * during the next live update cycle.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+struct liveupdate_session_finish {
+	__u32 size;
+	__u32 reserved;
+};
+
+#define LIVEUPDATE_SESSION_FINISH \
+	_IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_SESSION_FINISH)
+
 #endif /* _UAPI_LIVEUPDATE_H */
diff --git a/kernel/liveupdate/luo_session.c b/kernel/liveupdate/luo_session.c
index cb74bfaba479..82ba6e3578f5 100644
--- a/kernel/liveupdate/luo_session.c
+++ b/kernel/liveupdate/luo_session.c
@@ -174,26 +174,189 @@ static void luo_session_remove(struct luo_session_header *sh,
 	sh->count--;
 }
 
+static int luo_session_finish_one(struct luo_session *session)
+{
+	guard(mutex)(&session->mutex);
+	return luo_file_finish(session);
+}
+
+static void luo_session_unfreeze_one(struct luo_session *session)
+{
+	guard(mutex)(&session->mutex);
+	luo_file_unfreeze(session);
+}
+
+static int luo_session_freeze_one(struct luo_session *session)
+{
+	guard(mutex)(&session->mutex);
+	return luo_file_freeze(session);
+}
+
 static int luo_session_release(struct inode *inodep, struct file *filep)
 {
 	struct luo_session *session = filep->private_data;
 	struct luo_session_header *sh;
+	int err = 0;
 
 	/* If retrieved is set, it means this session is from incoming list */
-	if (session->retrieved)
+	if (session->retrieved) {
 		sh = &luo_session_global.incoming;
-	else
+
+		err = luo_session_finish_one(session);
+		if (err) {
+			pr_warn("Unable to finish session [%s] on release\n",
+				session->name);
+		} else {
+			luo_session_remove(sh, session);
+			luo_session_free(session);
+		}
+
+	} else {
 		sh = &luo_session_global.outgoing;
 
-	luo_session_remove(sh, session);
-	luo_session_free(session);
+		scoped_guard(mutex, &session->mutex)
+			luo_file_unpreserve_files(session);
+		luo_session_remove(sh, session);
+		luo_session_free(session);
+	}
+
+	return err;
+}
+
+static int luo_session_preserve_fd(struct luo_session *session,
+				   struct luo_ucmd *ucmd)
+{
+	struct liveupdate_session_preserve_fd *argp = ucmd->cmd;
+	int err;
+
+	guard(mutex)(&session->mutex);
+	err = luo_preserve_file(session, argp->token, argp->fd);
+	if (err)
+		return err;
+
+	err = luo_ucmd_respond(ucmd, sizeof(*argp));
+	if (err)
+		pr_warn("The file was successfully preserved, but response to user failed\n");
+
+	return err;
+}
+
+static int luo_session_retrieve_fd(struct luo_session *session,
+				   struct luo_ucmd *ucmd)
+{
+	struct liveupdate_session_retrieve_fd *argp = ucmd->cmd;
+	struct file *file;
+	int err;
+
+	argp->fd = get_unused_fd_flags(O_CLOEXEC);
+	if (argp->fd < 0)
+		return argp->fd;
+
+	guard(mutex)(&session->mutex);
+	err = luo_retrieve_file(session, argp->token, &file);
+	if (err < 0)
+		goto err_put_fd;
+
+	err = luo_ucmd_respond(ucmd, sizeof(*argp));
+	if (err)
+		goto err_put_file;
+
+	fd_install(argp->fd, file);
 
 	return 0;
+
+err_put_file:
+	fput(file);
+err_put_fd:
+	put_unused_fd(argp->fd);
+
+	return err;
+}
+
+static int luo_session_finish(struct luo_session *session,
+			      struct luo_ucmd *ucmd)
+{
+	struct liveupdate_session_finish *argp = ucmd->cmd;
+	int err = luo_session_finish_one(session);
+
+	if (err)
+		return err;
+
+	return luo_ucmd_respond(ucmd, sizeof(*argp));
+}
+
+union ucmd_buffer {
+	struct liveupdate_session_finish finish;
+	struct liveupdate_session_preserve_fd preserve;
+	struct liveupdate_session_retrieve_fd retrieve;
+};
+
+struct luo_ioctl_op {
+	unsigned int size;
+	unsigned int min_size;
+	unsigned int ioctl_num;
+	int (*execute)(struct luo_session *session, struct luo_ucmd *ucmd);
+};
+
+#define IOCTL_OP(_ioctl, _fn, _struct, _last) \
+	[_IOC_NR(_ioctl) - LIVEUPDATE_CMD_SESSION_BASE] = { \
+		.size = sizeof(_struct) + \
+			BUILD_BUG_ON_ZERO(sizeof(union ucmd_buffer) < \
+					  sizeof(_struct)), \
+		.min_size = offsetofend(_struct, _last), \
+		.ioctl_num = _ioctl, \
+		.execute = _fn, \
+	}
+
+static const struct luo_ioctl_op luo_session_ioctl_ops[] = {
+	IOCTL_OP(LIVEUPDATE_SESSION_FINISH, luo_session_finish,
+		 struct liveupdate_session_finish, reserved),
+	IOCTL_OP(LIVEUPDATE_SESSION_PRESERVE_FD, luo_session_preserve_fd,
+		 struct liveupdate_session_preserve_fd, token),
+	IOCTL_OP(LIVEUPDATE_SESSION_RETRIEVE_FD, luo_session_retrieve_fd,
+		 struct liveupdate_session_retrieve_fd, token),
+};
+
+static long luo_session_ioctl(struct file *filep, unsigned int cmd,
+			      unsigned long arg)
+{
+	struct luo_session *session = filep->private_data;
+	const struct luo_ioctl_op *op;
+	struct luo_ucmd ucmd = {};
+	union ucmd_buffer buf;
+	unsigned int nr;
+	int ret;
+
+	nr = _IOC_NR(cmd);
+	if (nr < LIVEUPDATE_CMD_SESSION_BASE || (nr - LIVEUPDATE_CMD_SESSION_BASE) >=
+	    ARRAY_SIZE(luo_session_ioctl_ops)) {
+		return -EINVAL;
+	}
+
+	ucmd.ubuffer = (void __user *)arg;
+	ret = get_user(ucmd.user_size, (u32 __user *)ucmd.ubuffer);
+	if (ret)
+		return ret;
+
+	op = &luo_session_ioctl_ops[nr - LIVEUPDATE_CMD_SESSION_BASE];
+	if (op->ioctl_num != cmd)
+		return -ENOIOCTLCMD;
+	if (ucmd.user_size < op->min_size)
+		return -EINVAL;
+
+	ucmd.cmd = &buf;
+	ret = copy_struct_from_user(ucmd.cmd, op->size, ucmd.ubuffer,
+				    ucmd.user_size);
+	if (ret)
+		return ret;
+
+	return op->execute(session, &ucmd);
 }
 
 static const struct file_operations luo_session_fops = {
 	.owner = THIS_MODULE,
 	.release = luo_session_release,
+	.unlocked_ioctl = luo_session_ioctl,
 };
 
 /* Create a "struct file" for session */
@@ -391,6 +554,8 @@ int luo_session_deserialize(void)
 		session->count = sh->ser[i].count;
 		session->files = sh->ser[i].files ? phys_to_virt(sh->ser[i].files) : 0;
 		session->pgcnt = sh->ser[i].pgcnt;
+		scoped_guard(mutex, &session->mutex)
+			luo_file_deserialize(session);
 	}
 
 	kho_restore_free(sh->header_ser);
@@ -405,9 +570,14 @@ int luo_session_serialize(void)
 	struct luo_session_header *sh = &luo_session_global.outgoing;
 	struct luo_session *session;
 	int i = 0;
+	int err;
 
 	guard(rwsem_write)(&sh->rwsem);
 	list_for_each_entry(session, &sh->list, list) {
+		err = luo_session_freeze_one(session);
+		if (err)
+			goto err_undo;
+
 		strscpy(sh->ser[i].name, session->name,
 			sizeof(sh->ser[i].name));
 		sh->ser[i].count = session->count;
@@ -418,4 +588,13 @@ int luo_session_serialize(void)
 	sh->header_ser->count = sh->count;
 
 	return 0;
+
+err_undo:
+	list_for_each_entry_continue_reverse(session, &sh->list, list) {
+		luo_session_unfreeze_one(session);
+		i--;
+		memset(&sh->ser[i], 0, sizeof(sh->ser[i]));
+	}
+
+	return err;
 }
-- 
2.52.0.rc1.455.g30608eb744-goog