From nobody Tue Dec 2 00:04:03 2025 Received: from mail-yw1-f178.google.com (mail-yw1-f178.google.com [209.85.128.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6D631329C67 for ; Tue, 25 Nov 2025 16:58:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.178 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089941; cv=none; b=GMFArPByBNQixAyEeti0sglA8nfCKcB3m0LsabuXlqIWZLJAJOVGfsHLYM/jMVogSkinViVMpCgwojJok88rs6y47XpkqVwEjUjhMtUlt4r3lFVc5O0To3+VH67thpCG1J2gMqN6TEERuEO9ARdl+mVbdO8XTrhGzuOv5dySTLU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089941; c=relaxed/simple; bh=WEusQLnBtna6buMpqlL2tqSWEtVqXOBlWahKJLdWuRY=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=d7jpl5WkhW8lrn28YnCqGQjqm5MoBzk143yRz5BM3pQ6uDoLKvdoKtvDrtGY76XbT2pNZ/aNUlzY5v8cIVhta+HYElRwlpSsR6zBaTBYjOs0QgJ7N1o9+IspY+YJdcPJnEPBkU9zoByiosC9eo4kBKmyrwh06Q4Z9KOdy9r0Luk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=A04ZTY6O; arc=none smtp.client-ip=209.85.128.178 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="A04ZTY6O" Received: by mail-yw1-f178.google.com with SMTP id 00721157ae682-78a7af9fe4fso59827197b3.2 for ; Tue, 25 Nov 2025 08:58:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1764089938; x=1764694738; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=ZVXlOcghSkBH1UGo3jOcilbAwCQIPzNeYGCZyURm31Q=; b=A04ZTY6OQHcT3KFLL9ZucmStLDLBiGXHZrKhDCqDYAanuHOGg01cQu9RtE5otmiYip xk9xIAt5MiuoV/Pafa72Nj+nBIyx2WKoUj7j+I+GRHFBq78zdiHMg12bziBHswHREg7E yXtYaMecke8nhihoeIny9SvmrJcDQhKgh81L5A+/hrY0ZoclFjC5naLAMBBAhiH57AgU Qu2GrfXnbNQk9AYx1f/yx9OovMe+0UX4ox0mwk8I4EFuCLFYyCzBXmnoXToy/aKayYin amUDvCg5BujiaM7f9JNWuboJIrPJK7ieb6YfMyQHVAUuhI2OcmbBbGHQl+NE5Ovg7LH6 +3pA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764089938; x=1764694738; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-gg:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=ZVXlOcghSkBH1UGo3jOcilbAwCQIPzNeYGCZyURm31Q=; b=l8U1Sk0LYNvI8AvhcfF2uD27wFYaljvTM5sCib0/KfS7BxOQtfopy6Z1aFdrqjObpu a8naxMCQ00S9pLT0P9MHFcbQt2gxZ8tgSqnc2A4oEnhkyn7q4KzTxHZVxfhkkAjJMNQe 3Rz2EiV2HV7PG3AbRr9f6VOcMYZAJry0i5tVm/UFpEvNjMGneTVDthN/a5ecDPgBMh88 cOo7xFYXCRQIkxxGlKW3WgxXj0hppIIeHB+rw3A8IJP7SUG7dsWruyufJTCyUU2LdvlD qK9Y/s3RMV/oEIpx/insL/Tq9/xRSRVcfgZqML7es4RkS4anqlvqjnN3QAy15uvRFEKB JOYQ== X-Forwarded-Encrypted: i=1; AJvYcCXwI8RkSMe0HR6ZcJecN5HcBv0DK+2yHrYzViCnYXw+C6mjpI01OA4CGY6ILsZIOXs3okF4JIt/VQCevtI=@vger.kernel.org X-Gm-Message-State: AOJu0Yy95vbDcGFgyz4Zk7pitiDxKwveUram3HDDS2ZCRm89CGxTM329 KYGxi5nzNV3zHjV3oC+Wfd3mmCgoyg91ScN6NuCg39y+2me9Q4qBExcdcwyBTnoQy84= X-Gm-Gg: ASbGncu4J9Bj+FmTCTRp+wDZiBuuPyx+BmR9sZlHNJfjXv5AnUZvFezyjApOB6mfADS nhNglaN76YiSLBMWI0tHSbkQOlQ+xLcGRUBAy3ajSKQIiktK2gyOiT3RW2IM0LQzbthMP4SYId8 LQr+zAnyrktFP9/TjiI9uWEnK5zY6BHnmijW7QKtabhf2OPLCvPSalTXpQ73DohzEu+8YlLGhZm D3rahp83BK93aUOMtVkWfQIOfU5cQ2PFkM83pLGp5AdPN7bhyxnAb5kQttxiR6A9ShWvYmrPT2X YBYizIjECyAABlkEz/9k7CILDvmsHiCRpFwzC302K+u7i4i+mEf7DjedzTGVAowL599InT4gWLf 3HhNW7JZjqUTvxt7UwZ77JLZe3d171gyyQaHO8bumo/B/jrq4zu8eNw9DtAtEthzDhh1Kq1Vsax 
13SjgQLaKUsrYNHVCI1X2NP+VcpCi/nj6LyhTbJ7BgI5DQYu707iKWQRfL1iHYene8 X-Google-Smtp-Source: AGHT+IFzLqONdmoMxtvQP22Mx+4CTT73xksgrCrTHrFT5fvQtJf9pzYLQiDyCLe6HMYfhVNW4vvqQw== X-Received: by 2002:a05:690c:4c13:b0:787:ce99:eaa0 with SMTP id 00721157ae682-78a8b580502mr146674547b3.70.1764089938106; Tue, 25 Nov 2025 08:58:58 -0800 (PST) Received: from soleen.c.googlers.com.com (182.221.85.34.bc.googleusercontent.com. [34.85.221.182]) by smtp.gmail.com with ESMTPSA id 00721157ae682-78a798a5518sm57284357b3.14.2025.11.25.08.58.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 08:58:57 -0800 (PST) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, 
lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org
Subject: [PATCH v8 01/18] liveupdate: luo_core: Live Update Orchestrator
Date: Tue, 25 Nov 2025 11:58:31 -0500
Message-ID: <20251125165850.3389713-2-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.52.0.460.gd25c4c69ec-goog
In-Reply-To: <20251125165850.3389713-1-pasha.tatashin@soleen.com>
References: <20251125165850.3389713-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Introduce LUO, a mechanism intended to facilitate kernel updates while
keeping designated devices operational across the transition (e.g., via
kexec). The primary use case is updating hypervisors with minimal
disruption to running virtual machines. For the userspace side of a
hypervisor update there is copyless migration; LUO is for updating the
kernel.

This initial patch lays the groundwork for the LUO subsystem. Further
functionality, including the implementation of state transition logic,
integration with KHO, and hooks for subsystems and file descriptors,
will be added in subsequent patches.

Create a character device at /dev/liveupdate. A new uAPI header,
include/uapi/linux/liveupdate.h, will define the necessary structures.
The magic number for the ioctl interface is registered in
Documentation/userspace-api/ioctl/ioctl-number.rst.

Signed-off-by: Pasha Tatashin
Reviewed-by: Pratyush Yadav
Reviewed-by: Mike Rapoport (Microsoft)
Tested-by: David Matlack
---
 .../userspace-api/ioctl/ioctl-number.rst |   2 +
 include/linux/liveupdate.h               |  35 ++++++
 include/uapi/linux/liveupdate.h          |  46 ++++++++
 kernel/liveupdate/Kconfig                |  21 ++++
 kernel/liveupdate/Makefile               |   5 +
 kernel/liveupdate/luo_core.c             | 111 ++++++++++++++++++
 6 files changed, 220 insertions(+)
 create mode 100644 include/linux/liveupdate.h
 create mode 100644 include/uapi/linux/liveupdate.h
 create mode 100644 kernel/liveupdate/luo_core.c

diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index 7c527a01d1cf..7232b3544cec 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -385,6 +385,8 @@ Code  Seq#  Include File  Comments
 0xB8  01-02  uapi/misc/mrvl_cn10k_dpi.h  Marvell CN10K DPI driver
 0xB8  all  uapi/linux/mshv.h  Microsoft Hyper-V /dev/mshv driver
+0xBA  00-0F  uapi/linux/liveupdate.h  Pasha Tatashin
+
 0xC0  00-0F  linux/usb/iowarrior.h
 0xCA  00-0F  uapi/misc/cxl.h  Dead since 6.15
 0xCA  10-2F  uapi/misc/ocxl.h
diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h
new file mode 100644
index 000000000000..c6a1d6bd90cb
--- /dev/null
+++ b/include/linux/liveupdate.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin
+ */
+#ifndef _LINUX_LIVEUPDATE_H
+#define _LINUX_LIVEUPDATE_H
+
+#include
+#include
+#include
+
+#ifdef CONFIG_LIVEUPDATE
+
+/* Return true if live update orchestrator is enabled */
+bool liveupdate_enabled(void);
+
+/* Called during kexec to tell LUO that the system has entered reboot */
+int liveupdate_reboot(void);
+
+#else /* CONFIG_LIVEUPDATE */
+
+static inline bool liveupdate_enabled(void)
+{
+	return false;
+}
+
+static inline int liveupdate_reboot(void)
+{
+	return 0;
+}
+
+#endif /* CONFIG_LIVEUPDATE */
+#endif /* _LINUX_LIVEUPDATE_H */
diff --git a/include/uapi/linux/liveupdate.h b/include/uapi/linux/liveupdate.h
new file mode 100644
index 000000000000..df34c1642c4d
--- /dev/null
+++ b/include/uapi/linux/liveupdate.h
@@ -0,0 +1,46 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+
+/*
+ * Userspace interface for /dev/liveupdate
+ * Live Update Orchestrator
+ *
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin
+ */
+
+#ifndef _UAPI_LIVEUPDATE_H
+#define _UAPI_LIVEUPDATE_H
+
+#include
+#include
+
+/**
+ * DOC: General ioctl format
+ *
+ * The ioctl interface follows a general format to allow for extensibility. Each
+ * ioctl is passed in a structure pointer as the argument providing the size of
+ * the structure in the first u32. The kernel checks that any structure space
+ * beyond what it understands is 0. This allows userspace to use the backward
+ * compatible portion while consistently using the newer, larger, structures.
+ *
+ * ioctls use a standard meaning for common errnos:
+ *
+ *  - ENOTTY: The IOCTL number itself is not supported at all
+ *  - E2BIG: The IOCTL number is supported, but the provided structure has
+ *    non-zero in a part the kernel does not understand.
+ *  - EOPNOTSUPP: The IOCTL number is supported, and the structure is
+ *    understood, however a known field has a value the kernel does not
+ *    understand or support.
+ *  - EINVAL: Everything about the IOCTL was understood, but a field is not
+ *    correct.
+ *  - ENOENT: A provided token does not exist.
+ *  - ENOMEM: Out of memory.
+ *  - EOVERFLOW: Mathematics overflowed.
+ *
+ * As well as additional errnos, within specific ioctls.
+ */
+
+/* The ioctl type, documented in ioctl-number.rst */
+#define LIVEUPDATE_IOCTL_TYPE		0xBA
+
+#endif /* _UAPI_LIVEUPDATE_H */
diff --git a/kernel/liveupdate/Kconfig b/kernel/liveupdate/Kconfig
index a973a54447de..9b2515f31afb 100644
--- a/kernel/liveupdate/Kconfig
+++ b/kernel/liveupdate/Kconfig
@@ -51,4 +51,25 @@ config KEXEC_HANDOVER_ENABLE_DEFAULT
 	  The default behavior can still be overridden at boot time by passing
 	  'kho=off'.
 
+config LIVEUPDATE
+	bool "Live Update Orchestrator"
+	depends on KEXEC_HANDOVER
+	help
+	  Enable the Live Update Orchestrator. Live Update is a mechanism,
+	  typically based on kexec, that allows the kernel to be updated
+	  while keeping selected devices operational across the transition.
+	  These devices are intended to be reclaimed by the new kernel and
+	  re-attached to their original workload without requiring a device
+	  reset.
+
+	  The ability to hand over a device from the current kernel to the
+	  next depends on specific support within device drivers and related
+	  kernel subsystems.
+
+	  This feature primarily targets virtual machine hosts to quickly update
+	  the kernel hypervisor with minimal disruption to the running virtual
+	  machines.
+
+	  If unsure, say N.
+
 endmenu
diff --git a/kernel/liveupdate/Makefile b/kernel/liveupdate/Makefile
index f52ce1ebcf86..08954c1770c4 100644
--- a/kernel/liveupdate/Makefile
+++ b/kernel/liveupdate/Makefile
@@ -1,5 +1,10 @@
 # SPDX-License-Identifier: GPL-2.0
 
+luo-y := \
+		luo_core.o
+
 obj-$(CONFIG_KEXEC_HANDOVER) += kexec_handover.o
 obj-$(CONFIG_KEXEC_HANDOVER_DEBUG) += kexec_handover_debug.o
 obj-$(CONFIG_KEXEC_HANDOVER_DEBUGFS) += kexec_handover_debugfs.o
+
+obj-$(CONFIG_LIVEUPDATE) += luo.o
diff --git a/kernel/liveupdate/luo_core.c b/kernel/liveupdate/luo_core.c
new file mode 100644
index 000000000000..30ad8836360b
--- /dev/null
+++ b/kernel/liveupdate/luo_core.c
@@ -0,0 +1,111 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin
+ */
+
+/**
+ * DOC: Live Update Orchestrator (LUO)
+ *
+ * Live Update is a specialized, kexec-based reboot process that allows a
+ * running kernel to be updated from one version to another while preserving
+ * the state of selected resources and keeping designated hardware devices
+ * operational. For these devices, DMA activity may continue throughout the
+ * kernel transition.
+ *
+ * While the primary use case driving this work is supporting live updates of
+ * the Linux kernel when it is used as a hypervisor in cloud environments, the
+ * LUO framework itself is designed to be workload-agnostic. Live Update
+ * facilitates a full kernel version upgrade for any type of system.
+ *
+ * For example, a non-hypervisor system running an in-memory cache like
+ * memcached with many gigabytes of data can use LUO. The userspace service
+ * can place its cache into a memfd, have its state preserved by LUO, and
+ * restore it immediately after the kernel kexec.
+ *
+ * Whether the system is running virtual machines, containers, a
+ * high-performance database, or networking services, LUO's primary goal is to
+ * enable a full kernel update by preserving critical userspace state and
+ * keeping essential devices operational.
+ *
+ * The core of LUO is a mechanism that tracks the progress of a live update,
+ * along with a callback API that allows other kernel subsystems to participate
+ * in the process. Example subsystems that can hook into LUO include: kvm,
+ * iommu, interrupts, vfio, participating filesystems, and memory management.
+ *
+ * LUO uses Kexec Handover to transfer memory state from the current kernel to
+ * the next kernel. For more details see
+ * Documentation/core-api/kho/concepts.rst.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include
+#include
+#include
+
+static struct {
+	bool enabled;
+} luo_global;
+
+static int __init early_liveupdate_param(char *buf)
+{
+	return kstrtobool(buf, &luo_global.enabled);
+}
+early_param("liveupdate", early_liveupdate_param);
+
+/* Public Functions */
+
+/**
+ * liveupdate_reboot() - Kernel reboot notifier for live update final
+ * serialization.
+ *
+ * This function is invoked directly from the reboot() syscall pathway
+ * if kexec is in progress.
+ *
+ * If any callback fails, this function aborts KHO, undoes the freeze()
+ * callbacks, and returns an error.
+ */
+int liveupdate_reboot(void)
+{
+	return 0;
+}
+
+/**
+ * liveupdate_enabled - Check if the live update feature is enabled.
+ *
+ * This function returns the state of the live update feature flag, which
+ * can be controlled via the ``liveupdate`` kernel command-line parameter.
+ *
+ * @return true if live update is enabled, false otherwise.
+ */
+bool liveupdate_enabled(void)
+{
+	return luo_global.enabled;
+}
+
+struct luo_device_state {
+	struct miscdevice miscdev;
+};
+
+static const struct file_operations luo_fops = {
+	.owner = THIS_MODULE,
+};
+
+static struct luo_device_state luo_dev = {
+	.miscdev = {
+		.minor = MISC_DYNAMIC_MINOR,
+		.name = "liveupdate",
+		.fops = &luo_fops,
+	},
+};
+
+static int __init liveupdate_ioctl_init(void)
+{
+	if (!liveupdate_enabled())
+		return 0;
+
+	return misc_register(&luo_dev.miscdev);
+}
+late_initcall(liveupdate_ioctl_init);
-- 
2.52.0.460.gd25c4c69ec-goog

From nobody Tue Dec 2 00:04:03 2025
Received: from mail-yw1-f173.google.com (mail-yw1-f173.google.com [209.85.128.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 53721329E45 for ; Tue, 25 Nov 2025 16:59:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.173 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089943; cv=none; b=iGmJsA363cz15XTdgw2s0x0qnxIRQm9Dyq2wWzFCiKqieQ+KcdK5quLts2wjp2UVff6+VS1g451b6BhKVOVW9tW8zr7WJTDsunK+UmIX1P7/S4nPaXS8QuQd4LAQ3KcxLVi+m21Bw4rdAAD/fRV3AmMPSsVtWffniL1pdMPiGUg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089943; c=relaxed/simple; bh=rJhPiNy/cHUqBMKkEMx4A4UA5MNXKD3qKlXEfa/ftHk=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=q9BWqqzB7Q6MmSG1eUtVtN0UHLQZgr3RapwoygWBVVZzKTmEjoMB541zNcwg5wxGD0S0DG0l8TLp4brbiACoHa2npdb92Q1aGu+oP8EYyFa8XPIsGewIirQxzZOwPFg5VJBOwm5dMG5KCdOty3ROJZo1RWS8HAbyl1mHmRfkxwE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=W+AkcV6U; arc=none smtp.client-ip=209.85.128.173 Authentication-Results: 
smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="W+AkcV6U" Received: by mail-yw1-f173.google.com with SMTP id 00721157ae682-789524e6719so131047b3.1 for ; Tue, 25 Nov 2025 08:59:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1764089940; x=1764694740; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=yriDSUM+ad1aFjpJ92e3as8wKB+tAVJNH4fXWEg+Y9Q=; b=W+AkcV6U7AeqG6Qtv8IW3gBATCfN0Coq+hiwXoPqGWZ7h60VFtKXKRAVm8tfFNm7ho MYOMP4a73sX4sR5biRSiPw2Il7HNyEEshAFpL9v3mdzOPqBCJnBoNjrB9Kx2YV6RSrez rOwzsdRunzHEnE1OmedaCmDP6ut452LtO45xLlpTK/R/6u/Rfp68gGxnBsWvkae8Nwri 0FPvdvdBF8UjtF9b9fdm9ciZaR5LvEBzkb6ksAJCpWgKeNPL5NCGrwLXxRbIsYf+WYLy wzcVGzdAYNB1k79rP6wwerUs7FdzfPhZPAyvH7ZuWxjwhnKPtxG92ZRNIuSDsjIBCugj w36A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764089940; x=1764694740; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-gg:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=yriDSUM+ad1aFjpJ92e3as8wKB+tAVJNH4fXWEg+Y9Q=; b=Pw6vrNBxyoQvwpzK6a4LyudNZz7g9dfaDD+oiYlPkf+m1msHcl8rS9gpPlq3s5icLC W6chlkhIHACEEXKys6njWwO5qJppnpylWSYYCeWO/agfZNF+fzJqRRuXNjYOsaQ2yNnn xaWT1uD1R1X06ccg1l/n4ahHyaYL6mvTQCEelvsYmE1ViOmE/6GY8g7gvkwnM542ec5o mYc3AnlO5k35skopIPE04UwiqfqnqKVRX/J0QDzcC/KFTIbDxf7E7UHCZHz1t1p+z23e p1o+DwGnjH9XmLMdXzgLgDpNFnvn+i7SOl2N58ZusF/hXug4w/9CTqEMRRd26AtVklib vsTg== X-Forwarded-Encrypted: i=1; AJvYcCUk9OPK+McwABSMF26h7jUJTsnCe6mEbB8HEXaXUUHDUHMa34jI0e8a1od1FrygU1dfG1kk1nlnJ9EWSC8=@vger.kernel.org X-Gm-Message-State: 
AOJu0YwP1r8d2ZQgAty9tOvl8ZYsifvmhLLiR1baSiJ2JU8LzPh+1Dfy odiOV8jO79m9F9JkVzvKbkO9TFwRV4uaubemAHg7AzVmANQcisvlSJezforYGal//Uc= X-Gm-Gg: ASbGnctTNFKti0hoWp7wia+1v32zyPhelxJPa6A3Lem0apSHUwj72BUWV/yFNVrTEMV 1v4SI+Ki43axzCyfdek4kNpZbnSDBmyiGZQTPgS05WKLBUVkv2GpoDibwMop1R2l0RPl/gzSXJr CCQzXQlqVBTJV+OthUjMXwwq0QXj+2OdubgYVvZUvL2pS6ZBqW/fCzSVLnzEvwS64gAiBJNE26k OtOqsQakv+1R2mZVvGuBOjSvRSBQYyLeqNvewi61ZjvmTCipweh5TcNCIFB5Toc6fWRHf9pkghV 4A8U3iBpBzuZfFlrIwDJky0GRMf7yX5zU1kDpjarZdSbWOquS7Wi39TLJuv7nXq9f7t23YNehXW /aFvM3UWtEI1tvWGEo2aAsiUN+Wmi9JdH5kSfpUL2dulibgf0bKhvjsYw0copiXxqJ6NBY3ROaR tycSMW5Cf9PZPTlGddx9majaTGQrxa7Fpc23+wc56Kc2fyFJoOvvjHIwaIsVrlEyaMbhBpG6y6O Fs= X-Google-Smtp-Source: AGHT+IGM7GGVoSfxCAXkl5wg3aMgKhNb9HlP4vsUOFssrEUk0d2XFcXQJwhzNmOgejiMFvAK+iBZcA== X-Received: by 2002:a05:690c:4c08:b0:731:817b:4d73 with SMTP id 00721157ae682-78a8ab3014dmr159149187b3.14.1764089940135; Tue, 25 Nov 2025 08:59:00 -0800 (PST) Received: from soleen.c.googlers.com.com (182.221.85.34.bc.googleusercontent.com. 
[34.85.221.182]) by smtp.gmail.com with ESMTPSA id 00721157ae682-78a798a5518sm57284357b3.14.2025.11.25.08.58.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 08:58:59 -0800 (PST) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org Subject: [PATCH v8 02/18] liveupdate: luo_core: integrate with KHO Date: Tue, 25 Nov 2025 11:58:32 -0500 Message-ID: 
<20251125165850.3389713-3-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.52.0.460.gd25c4c69ec-goog
In-Reply-To: <20251125165850.3389713-1-pasha.tatashin@soleen.com>
References: <20251125165850.3389713-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Integrate the LUO with the KHO framework to enable passing LUO state
across a kexec reboot.

This patch implements the lifecycle integration with KHO:

1. Incoming State: During early boot (`early_initcall`), LUO checks if
   KHO is active. If so, it retrieves the "LUO" subtree, verifies the
   "luo-v1" compatibility string, and reads the `liveupdate-number` to
   track the update count.

2. Outgoing State: During late initialization (`late_initcall`), LUO
   allocates a new FDT for the next kernel, populates it with the basic
   header (compatible string and incremented update number), and
   registers it with KHO (`kho_add_subtree`).

3. Finalization: The `liveupdate_reboot()` notifier is updated to invoke
   `kho_finalize()`. This ensures that all memory segments marked for
   preservation are properly serialized before the kexec jump.

Signed-off-by: Pasha Tatashin
Reviewed-by: Pratyush Yadav
Reviewed-by: Mike Rapoport (Microsoft)
Tested-by: David Matlack
---
 include/linux/kho/abi/luo.h      |  58 ++++++++++++
 kernel/liveupdate/luo_core.c     | 154 ++++++++++++++++++++++++++++++-
 kernel/liveupdate/luo_internal.h |  22 +++++
 3 files changed, 233 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/kho/abi/luo.h
 create mode 100644 kernel/liveupdate/luo_internal.h

diff --git a/include/linux/kho/abi/luo.h b/include/linux/kho/abi/luo.h
new file mode 100644
index 000000000000..2099b51929e5
--- /dev/null
+++ b/include/linux/kho/abi/luo.h
@@ -0,0 +1,58 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin
+ */
+
+/**
+ * DOC: Live Update Orchestrator ABI
+ *
+ * This header defines the stable Application Binary Interface used by the
+ * Live Update Orchestrator to pass state from a pre-update kernel to a
+ * post-update kernel. The ABI is built upon the Kexec HandOver framework
+ * and uses a Flattened Device Tree to describe the preserved data.
+ *
+ * This interface is a contract. Any modification to the FDT structure, node
+ * properties, compatible strings, or the layout of the `__packed` serialization
+ * structures defined here constitutes a breaking change. Such changes require
+ * incrementing the version number in the relevant `_COMPATIBLE` string to
+ * prevent a new kernel from misinterpreting data from an old kernel.
+ *
+ * Changes are allowed provided the compatibility version is incremented;
+ * however, backward/forward compatibility is only guaranteed for kernels
+ * supporting the same ABI version.
+ *
+ * FDT Structure Overview:
+ *   The entire LUO state is encapsulated within a single KHO entry named "LUO".
+ *   This entry contains an FDT with the following layout:
+ *
+ *   .. code-block:: none
+ *
+ *     / {
+ *         compatible = "luo-v1";
+ *         liveupdate-number = <...>;
+ *     };
+ *
+ * Main LUO Node (/):
+ *
+ *   - compatible: "luo-v1"
+ *     Identifies the overall LUO ABI version.
+ *   - liveupdate-number: u64
+ *     A counter tracking the number of successful live updates performed.
+ */
+
+#ifndef _LINUX_KHO_ABI_LUO_H
+#define _LINUX_KHO_ABI_LUO_H
+
+/*
+ * The LUO FDT hooks all LUO state for sessions, fds, etc.
+ * In the root it also carries "liveupdate-number" 64-bit property that
+ * corresponds to the number of live-updates performed on this machine.
+ */
+#define LUO_FDT_SIZE		PAGE_SIZE
+#define LUO_FDT_KHO_ENTRY_NAME	"LUO"
+#define LUO_FDT_COMPATIBLE	"luo-v1"
+#define LUO_FDT_LIVEUPDATE_NUM	"liveupdate-number"
+
+#endif /* _LINUX_KHO_ABI_LUO_H */
diff --git a/kernel/liveupdate/luo_core.c b/kernel/liveupdate/luo_core.c
index 30ad8836360b..9f9fe9a81b29 100644
--- a/kernel/liveupdate/luo_core.c
+++ b/kernel/liveupdate/luo_core.c
@@ -41,12 +41,26 @@
 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
+#include
+#include
+#include
 #include
+#include
 #include
 #include
+#include
+#include
+#include
+#include
+
+#include "kexec_handover_internal.h"
+#include "luo_internal.h"
 
 static struct {
 	bool enabled;
+	void *fdt_out;
+	void *fdt_in;
+	u64 liveupdate_num;
 } luo_global;
 
 static int __init early_liveupdate_param(char *buf)
@@ -55,6 +69,129 @@ static int __init early_liveupdate_param(char *buf)
 }
 early_param("liveupdate", early_liveupdate_param);
 
+static int __init luo_early_startup(void)
+{
+	phys_addr_t fdt_phys;
+	int err, ln_size;
+	const void *ptr;
+
+	if (!kho_is_enabled()) {
+		if (liveupdate_enabled())
+			pr_warn("Disabling liveupdate because KHO is disabled\n");
+		luo_global.enabled = false;
+		return 0;
+	}
+
+	/* Retrieve LUO subtree, and verify its format. */
+	err = kho_retrieve_subtree(LUO_FDT_KHO_ENTRY_NAME, &fdt_phys);
+	if (err) {
+		if (err != -ENOENT) {
+			pr_err("failed to retrieve FDT '%s' from KHO: %pe\n",
+			       LUO_FDT_KHO_ENTRY_NAME, ERR_PTR(err));
+			return err;
+		}
+
+		return 0;
+	}
+
+	luo_global.fdt_in = phys_to_virt(fdt_phys);
+	err = fdt_node_check_compatible(luo_global.fdt_in, 0,
+					LUO_FDT_COMPATIBLE);
+	if (err) {
+		pr_err("FDT '%s' is incompatible with '%s' [%d]\n",
+		       LUO_FDT_KHO_ENTRY_NAME, LUO_FDT_COMPATIBLE, err);
+
+		return -EINVAL;
+	}
+
+	ln_size = 0;
+	ptr = fdt_getprop(luo_global.fdt_in, 0, LUO_FDT_LIVEUPDATE_NUM,
+			  &ln_size);
+	if (!ptr || ln_size != sizeof(luo_global.liveupdate_num)) {
+		pr_err("Unable to get live update number '%s' [%d]\n",
+		       LUO_FDT_LIVEUPDATE_NUM, ln_size);
+
+		return -EINVAL;
+	}
+
+	luo_global.liveupdate_num = get_unaligned((u64 *)ptr);
+	pr_info("Retrieved live update data, liveupdate number: %llu\n",
+		luo_global.liveupdate_num);
+
+	return 0;
+}
+
+static int __init liveupdate_early_init(void)
+{
+	int err;
+
+	err = luo_early_startup();
+	if (err) {
+		luo_global.enabled = false;
+		luo_restore_fail("The incoming tree failed to initialize properly [%pe], disabling live update\n",
+				 ERR_PTR(err));
+	}
+
+	return err;
+}
+early_initcall(liveupdate_early_init);
+
+/* Called during boot to create outgoing LUO fdt tree */
+static int __init luo_fdt_setup(void)
+{
+	const u64 ln = luo_global.liveupdate_num + 1;
+	void *fdt_out;
+	int err;
+
+	fdt_out = kho_alloc_preserve(LUO_FDT_SIZE);
+	if (IS_ERR(fdt_out)) {
+		pr_err("failed to allocate/preserve FDT memory\n");
+		return PTR_ERR(fdt_out);
+	}
+
+	err = fdt_create(fdt_out, LUO_FDT_SIZE);
+	err |= fdt_finish_reservemap(fdt_out);
+	err |= fdt_begin_node(fdt_out, "");
+	err |= fdt_property_string(fdt_out, "compatible", LUO_FDT_COMPATIBLE);
+	err |= fdt_property(fdt_out, LUO_FDT_LIVEUPDATE_NUM, &ln, sizeof(ln));
+	err |= fdt_end_node(fdt_out);
+	err |= fdt_finish(fdt_out);
+	if (err)
+		goto exit_free;
+
+	err = kho_add_subtree(LUO_FDT_KHO_ENTRY_NAME, fdt_out);
+	if (err)
+		goto exit_free;
+	luo_global.fdt_out = fdt_out;
+
+	return 0;
+
+exit_free:
+	kho_unpreserve_free(fdt_out);
+	pr_err("failed to prepare LUO FDT: %d\n", err);
+
+	return err;
+}
+
+/*
+ * late initcall because it initializes the outgoing tree that is needed only
+ * once userspace starts using /dev/liveupdate.
+ */
+static int __init luo_late_startup(void)
+{
+	int err;
+
+	if (!liveupdate_enabled())
+		return 0;
+
+	err = luo_fdt_setup();
+	if (err)
+		luo_global.enabled = false;
+
+	return err;
+}
+late_initcall(luo_late_startup);
+
 /* Public Functions */
 
 /**
@@ -69,7 +206,22 @@ early_param("liveupdate", early_liveupdate_param);
  */
 int liveupdate_reboot(void)
 {
-	return 0;
+	int err;
+
+	if (!liveupdate_enabled())
+		return 0;
+
+	err = kho_finalize();
+	if (err) {
+		pr_err("kho_finalize failed %d\n", err);
+		/*
+		 * kho_finalize() may return libfdt errors, to avoid passing to
+		 * userspace unknown errors, change this to EAGAIN.
+		 */
+		err = -EAGAIN;
+	}
+
+	return err;
 }
 
 /**
diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_internal.h
new file mode 100644
index 000000000000..8612687b2000
--- /dev/null
+++ b/kernel/liveupdate/luo_internal.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin
+ */
+
+#ifndef _LINUX_LUO_INTERNAL_H
+#define _LINUX_LUO_INTERNAL_H
+
+#include
+
+/*
+ * Handles a deserialization failure: devices and memory are in an
+ * unpredictable state.
+ *
+ * Continuing the boot process after a failure is dangerous because it could
+ * lead to leaks of private data.
+ */
+#define luo_restore_fail(__fmt, ...) panic(__fmt, ##__VA_ARGS__)
+
+#endif /* _LINUX_LUO_INTERNAL_H */
-- 
2.52.0.460.gd25c4c69ec-goog

From nobody Tue Dec 2 00:04:03 2025
Received: from mail-yw1-f176.google.com (mail-yw1-f176.google.com [209.85.128.176]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1ACFC329E72 for ; Tue, 25 Nov 2025 16:59:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.176 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089944; cv=none; b=W6bzGTqGZvDo9it2N5aHE+GM+66w9ogfNnNdddyvZsb5DTRoAkRqNc19hmbusPsWAwHr49i/1PUAJAGyr7CrFzLhoPLBX0aa1fEHu/9b63CZWiBJAJ9z5zk61aYEwygacyhWdP1n8UxAzldLqg4YTk7LsXMi5QfoC7H/F2RWVrI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089944; c=relaxed/simple; bh=9ZUelKfkvPljv8oQXIuVk1CLomlpXexOI6vdtHAFbG0=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=sdqOCJRDLGdoQ1rz6jB21sziyVTVGsGsTo3p+s/ucwplG3vg6W8S42aQV3qAjJiVBlBIsO6Scx2zBsBMDb2dQuhw9lP7sXwAAqCmWag4o4nPbC20vgjgR/O4u+0Cqkr67HS6lPzdonAFAY6rKqlYPYX7JfiZhPjp5A8I5r1jtHQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=FrlAXVcb; arc=none smtp.client-ip=209.85.128.176 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="FrlAXVcb" Received: by mail-yw1-f176.google.com with SMTP id 00721157ae682-789524e6719so132217b3.1 for ; Tue, 25 Nov 2025 08:59:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; 
s=google; t=1764089942; x=1764694742; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=XpaVBm1+//tbxkEQIq/GBWXpVmrV48zOyAjTKxFhzXo=; b=FrlAXVcbXKUKJcNPrA0qKapgEF9UGLCsmf2E2yXVjGe4RX/6b3gE5dunyIWrHMqCaT lWoHwF727hAakvpRD8uMfFVvi/sFXq9MgB12IxlF+ySjdjWTNS8H0dUyelhqbmVbINDe rLAMTvltT4ezjlK8okW7p/XSjCFjQlSAoxBgzJZkxBbc0eBvgBkcCvMWhtAkH3PcjGxN Yg76Z+7qEHuLH0cax9MTkIVeHQWnMTmFdG1Doriq7V3RIb9GrW+EJ5La9lh78HSankZf v1fPfW0Yd97G+LVXu3IMPshEzhl6XXKLAHQ31oU8alh9Cu8h8ys0Ln60bgicf6LOJrdh UZaQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764089942; x=1764694742; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-gg:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=XpaVBm1+//tbxkEQIq/GBWXpVmrV48zOyAjTKxFhzXo=; b=l8FK7wEZP9618WuWTCRXgV732v3z1hH15BIPrPWU4Sj0qABzX+M9dgjFmJ05Fxc41G VqqYbAP1IecJS4I+wsETF0o77N51kCv4c0JVQmmLSEcQMQ+XBLKt1+gR2WumrIb7InZX yBxWxxDTGvquxTUhUrortpTgFdS0rK1xxoKyM+TpIDlKwSg1uzhW2pcbZsDTLC35uJCw AbAJkZO1goAjYKOTMtIac9C90tiAU8prARdHbjtPnCJrYz7KESMfXAcN+hZEBsj9tLk4 vE3N4dti3qJKfzyGXy4oCVQHMdhJkLWelYcCDvT3/Ae3na+KFkiD9wbweW3Cjaj8k9fB W3QA== X-Forwarded-Encrypted: i=1; AJvYcCUv9/TaXxMgf0XM3j377DAsbsHCzeaBVaw71Q7jXBGS0/dRG++KUVlt6TZzb1LfT2axGJfxjPerk9BwYrE=@vger.kernel.org X-Gm-Message-State: AOJu0Yx8d/z9WIaCQRNaRFX6yZNaGiIqElv3XDexXts65XgeVjwGlZjq a43H37Y4UbBV77wfmwgp6Xvtgru8UqxAwrqTzc6wnX8sgDtjYWBOc6MW/ICwVcvQl90= X-Gm-Gg: ASbGncv0wgjTj1rZap+BODOGeFTIlVle6hJZVwnkmi3yprpKHL6qeIb0LIWmCp9QxWN IUz/K7WOWyYlm8BWT/NtpiPAy9r0yLM8gtz3mDkzMKpQ0n8BGKm2WosQL6casCGqf7WwIUr16pl WyItE3/HkKP1htBKtwAloHz3fbG+NLephJZGS/4KTsNjmbmNSURMNUwUk14GCqgkXnGBRwCBtuV sRvEfu0mb5nVZZ31wRxLZaoBr/QwxGO7CAPuOlm3SidoArFov+aGzPnE+mWeisDyxkd5zbfRRHF prxkdudXJ85Wn7N4KGrK3X5RHQj1rhisd2BIAra+il49TEdE4zeXvZfcCxj/xgz5N1gmlscmHZp 
0WHbwRpVHorTSLd2RKw/v7/Utuo6FXXF0IQGImXJwTrdZolPf3I5Sn+va+YpdZy3h8evA3pI3oV KeQrWyL+o1Gpb2XD7painmojche1ydAplF844joQuYE3BDRR3UF2tPZeP3otQsAoQD X-Google-Smtp-Source: AGHT+IGT9/1MeLVmxLppNhw6ATfBnky7RFeGJGgEupGtHFRdbpi4HufaHsrUZln7Z2M/tCwpjqT1eg== X-Received: by 2002:a05:690e:c41:b0:640:db57:8d93 with SMTP id 956f58d0204a3-642f8dee6bemr14317878d50.15.1764089942010; Tue, 25 Nov 2025 08:59:02 -0800 (PST) Received: from soleen.c.googlers.com.com (182.221.85.34.bc.googleusercontent.com. [34.85.221.182]) by smtp.gmail.com with ESMTPSA id 00721157ae682-78a798a5518sm57284357b3.14.2025.11.25.08.59.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 08:59:01 -0800 (PST) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, 
djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org
Subject: [PATCH v8 03/18] kexec: call liveupdate_reboot() before kexec
Date: Tue, 25 Nov 2025 11:58:33 -0500
Message-ID: <20251125165850.3389713-4-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.52.0.460.gd25c4c69ec-goog
In-Reply-To: <20251125165850.3389713-1-pasha.tatashin@soleen.com>
References: <20251125165850.3389713-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Modify kernel_kexec() to call liveupdate_reboot().

This ensures that the Live Update Orchestrator is notified just before the kernel executes the kexec jump. The liveupdate_reboot() function triggers the final freeze event, allowing participating FDs to perform last-minute checks or state saving within the blackout window.

If liveupdate_reboot() returns an error (indicating a failure during LUO finalization), the kexec operation is aborted to prevent proceeding with an inconsistent state. An error is returned to the user.

Signed-off-by: Pasha Tatashin Reviewed-by: Mike Rapoport (Microsoft) Reviewed-by: Pratyush Yadav Tested-by: David Matlack --- kernel/kexec_core.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c index a8890dd03a1d..3122235c225b 100644 --- a/kernel/kexec_core.c +++ b/kernel/kexec_core.c @@ -15,6 +15,7 @@ #include #include #include +#include #include #include #include @@ -1145,6 +1146,10 @@ int kernel_kexec(void) goto Unlock; } =20 + error =3D liveupdate_reboot(); + if (error) + goto Unlock; + #ifdef CONFIG_KEXEC_JUMP if (kexec_image->preserve_context) { /* --=20 2.52.0.460.gd25c4c69ec-goog From nobody Tue Dec 2 00:04:03 2025 Received: from mail-yx1-f44.google.com (mail-yx1-f44.google.com [74.125.224.44]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 653EA32B985 for ; Tue, 25 Nov 2025 16:59:05 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.224.44 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089950; cv=none; b=sSeKRJjYxaCR9aFTNl011s6fnJzL1tHtSzipF5WJLmQ5cJsZbUkygMWruEnnqMMEqBwvxDYUXqyWBNY806Qs7mePVsW8jDQYyDJo+Ej8an5mtq8/x2sTWjtWUbfZqlK/qFwkK6QqkWschyR5bnvKbfPNDd7Mswn7Z/xTbvNDTp8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089950; c=relaxed/simple; bh=D5fCL6iZKQn14ocxeWrxxDSn1MadfocFwYnc4F7rD3s=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=QI8ZbJkeZdIDhPLUJcXEm+14l21FNqcGD0F3dQC8WVwWcfuURoUmILH2BJEfav86k6NRPEodyuzct4h7KyQWmRO+3MmIliCUj5DGyNlDjXmHkjsF8FF9gkZnjFer2L2JXg6HT5e5IevGC71Nq8ib+bucxEoWoQMPqtPQGWdpvXU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=ioqupjqd; arc=none 
smtp.client-ip=74.125.224.44 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="ioqupjqd" Received: by mail-yx1-f44.google.com with SMTP id 956f58d0204a3-63fc72db706so4787893d50.2 for ; Tue, 25 Nov 2025 08:59:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1764089944; x=1764694744; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=Wyqf8xtaqHL0pKyPog8HLWpOnzGWnLymZvnKt+DgBKU=; b=ioqupjqdeZLNrduN0j1yLQ6I8hBU+UaLzzAZoHzyUdlwWZWk2sVVHDJQHXIsCBSpnp /zjSePdD5VgeE+cGqhAQJKxQBvANIbXzYYXrJ3zBFug/P6yeWtl5vKYN4pECbdGF6u3V 2RnQyMBe1NWN4n/g9sN83/e2rbdnSVy46Lxv667bBQTB4FVUZM676aA9BVNHjndC4Ppu L5N7kJoVykPO5Rttio/ECYkCrnK05APswqQ8t+Qqo4YudRgRFiVP8rQlfrSCWdhdPu9u anXeL1qpRyDe8R6dUtzpPEKwJ+IIwROju+sLVUmVoqoAP4DvWGg5XWS/BOHkm3h5XUZU SeKA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764089944; x=1764694744; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-gg:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=Wyqf8xtaqHL0pKyPog8HLWpOnzGWnLymZvnKt+DgBKU=; b=Q4YHtQvkQT23mBHNJwC73/+062A53TplmqJkCwyatUVRG3ZrlAYH1YMNkJ+cvUYhuf 5OFkYoynNjWOH4PcbPRpgYp1hgICxWAc1Hv+P8OwOtlYO8hm4wy2LkTwjJHm9kuZimGm LgQdFCclKTycwMd2uvRiX/IU3IibxfnPsDf8gugXNXHLz3XYC/tC/neqS1maYPPpUhz0 mNorptEdkHaWUfRjWghZlufzCb154Dr6X5jwdsNNV+yVDJL57MK3lygst/Ea4d7vkaQ9 zxRiFAOoqiqZi52ZL9IYiePV10U9L8ov9OV9XGmJXZ11xmpXPu7fOlPcByzcPsZ9fP/t S2ag== X-Forwarded-Encrypted: i=1; AJvYcCWjsEstgYGdmcZKPO9nnjdUxb+65DjS2Prjw1/K+4XGq7/t0YZWjanmgyDU3IiWC2YcedrDGSgNCSK/Rww=@vger.kernel.org X-Gm-Message-State: 
AOJu0YwebJLoIb6NfXKsw4T09gG6+aw2+3op9GqJSKNuuy0rCVIJya4z iEAm1Kx0eScAJHx9VdKdW48XJ+nBaV2hoH0ahHg0tTYrzE+++z6iqc7VlmH5yFSaSDY= X-Gm-Gg: ASbGncvxTOlqinvnZ0rW1rLMXlgGF6WVhWA3NiXNy8memWarhBM7u7+FpeTGdsfl0z9 E8i/fNjlUBswyfZ6UTefWj9IuFQoVqzwftDa0BJ5mMjOual5f3z91hNnG1iFLjhbW2INKxM9AKv W+4Rh0I0EQQwxjyeEckGkHmcay2kE8VDwETwCfgEdTg+OAeQMSHKA2BmAt/d4xtHPkzOe8enIEj 5/g5qtvjjfGiu0KDMbX/nP8CcAwukLDTIDUsuKRRXagukrd3t1ftfRpcE6kd1n9ylu/08O/c6c+ ge331G5cVlIaoHr+A8RhNls2oQOmWViiyThIuk7yi+YpMr1NXBOdgMZnfE6/xfaN2ALGnljvpVG Dzn1CBqZjfSyW1xQC8MU/FIrZZ/rRYQBXvrtwM5MgS1PruoLWiHe0uPEgPvhIA/6MAQ607mlEzg G5zUxNS2gXx2z/23j7rbigw1bPTbGbkWRrNbC7Ly+Kc965enQtrKaI+XF9A7+d3eqJstG0BRvfZ YQXPWk= X-Google-Smtp-Source: AGHT+IGfxDKXNSzhQkFq+6+SZCG/hKvm9UVWzL+QoPHX7xpizdSRSw8Hy33eE0xmfgWRrieaDgkUhw== X-Received: by 2002:a05:690e:12c9:b0:63f:a856:5f90 with SMTP id 956f58d0204a3-64329212a6cmr2380372d50.4.1764089943942; Tue, 25 Nov 2025 08:59:03 -0800 (PST) Received: from soleen.c.googlers.com.com (182.221.85.34.bc.googleusercontent.com. 
[34.85.221.182]) by smtp.gmail.com with ESMTPSA id 00721157ae682-78a798a5518sm57284357b3.14.2025.11.25.08.59.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 08:59:03 -0800 (PST) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org Subject: [PATCH v8 04/18] liveupdate: luo_session: add sessions support Date: Tue, 25 Nov 2025 11:58:34 -0500 Message-ID: 
<20251125165850.3389713-5-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.52.0.460.gd25c4c69ec-goog
In-Reply-To: <20251125165850.3389713-1-pasha.tatashin@soleen.com>
References: <20251125165850.3389713-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Introduce the concept of "Live Update Sessions" within the LUO framework.

LUO sessions provide a mechanism to group and manage `struct file *` instances (representing file descriptors) that need to be preserved across a kexec-based live update. Each session is identified by a unique name and acts as a container for file objects whose state is critical to a userspace workload, such as a virtual machine or a high-performance database, aiming to maintain their functionality across a kernel transition.

This groundwork establishes the framework for preserving file-backed state across kernel updates, with the actual file data preservation mechanisms to be implemented in subsequent patches.

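As a mental model of the "named container" semantics described above, here is a small userspace sketch in C. It is not the code from this patch: session_registry, session_create(), session_find() and SESSION_MAX are hypothetical names chosen for illustration, and rejecting over-long names with -EINVAL is an assumption of the sketch. It only shows unique-name insertion with a capacity limit, plus lookup by name.

```c
/* Userspace model of a session registry; NOT the kernel implementation. */
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define SESSION_NAME_LENGTH 64	/* mirrors LIVEUPDATE_SESSION_NAME_LENGTH */
#define SESSION_MAX 4		/* hypothetical small limit for the sketch */

struct session {
	char name[SESSION_NAME_LENGTH];
	struct session *next;
};

struct session_registry {
	struct session *head;
	long count;
};

/* Add a uniquely named session; returns 0 or a negative errno value. */
static int session_create(struct session_registry *reg, const char *name)
{
	struct session *it, *s;

	/* Assumption of this sketch: over-long names are rejected. */
	if (strlen(name) >= SESSION_NAME_LENGTH)
		return -EINVAL;

	/* The registry has a fixed capacity. */
	if (reg->count == SESSION_MAX)
		return -ENOMEM;

	/* Names must be unique within the registry. */
	for (it = reg->head; it; it = it->next)
		if (!strcmp(it->name, name))
			return -EEXIST;

	s = calloc(1, sizeof(*s));
	if (!s)
		return -ENOMEM;

	strcpy(s->name, name);	/* length was checked above */
	s->next = reg->head;
	reg->head = s;
	reg->count++;

	return 0;
}

/* Look up a session by name; returns NULL if no such session exists. */
static struct session *session_find(struct session_registry *reg,
				    const char *name)
{
	struct session *it;

	for (it = reg->head; it; it = it->next)
		if (!strcmp(it->name, name))
			return it;

	return NULL;
}
```

With a zero-initialized registry, creating "vm-1" succeeds, creating it a second time returns -EEXIST, and session_find() returns NULL for a name that was never created. The sketch never frees its sessions and has no notion of file descriptors or serialization.
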
Signed-off-by: Pasha Tatashin Reviewed-by: Mike Rapoport (Microsoft) Reviewed-by: Pratyush Yadav Tested-by: David Matlack --- include/linux/kho/abi/luo.h | 71 +++++ include/uapi/linux/liveupdate.h | 3 + kernel/liveupdate/Makefile | 3 +- kernel/liveupdate/luo_core.c | 9 + kernel/liveupdate/luo_internal.h | 29 ++ kernel/liveupdate/luo_session.c | 463 +++++++++++++++++++++++++++++++ 6 files changed, 577 insertions(+), 1 deletion(-) create mode 100644 kernel/liveupdate/luo_session.c diff --git a/include/linux/kho/abi/luo.h b/include/linux/kho/abi/luo.h index 2099b51929e5..bf1ab2910959 100644 --- a/include/linux/kho/abi/luo.h +++ b/include/linux/kho/abi/luo.h @@ -32,6 +32,11 @@ * / { * compatible =3D "luo-v1"; * liveupdate-number =3D <...>; + * + * luo-session { + * compatible =3D "luo-session-v1"; + * luo-session-header =3D ; + * }; * }; * * Main LUO Node (/): @@ -40,11 +45,37 @@ * Identifies the overall LUO ABI version. * - liveupdate-number: u64 * A counter tracking the number of successful live updates performed. + * + * Session Node (luo-session): + * This node describes all preserved user-space sessions. + * + * - compatible: "luo-session-v1" + * Identifies the session ABI version. + * - luo-session-header: u64 + * The physical address of a `struct luo_session_header_ser`. This str= ucture + * is the header for a contiguous block of memory containing an array = of + * `struct luo_session_ser`, one for each preserved session. + * + * Serialization Structures: + * The FDT properties point to memory regions containing arrays of simpl= e, + * `__packed` structures. These structures contain the actual preserved = state. + * + * - struct luo_session_header_ser: + * Header for the session array. Contains the total page count of the + * preserved memory block and the number of `struct luo_session_ser` + * entries that follow. 
+ * + * - struct luo_session_ser: + * Metadata for a single session, including its name and a physical po= inter + * to another preserved memory block containing an array of + * `struct luo_file_ser` for all files in that session. */ =20 #ifndef _LINUX_KHO_ABI_LUO_H #define _LINUX_KHO_ABI_LUO_H =20 +#include + /* * The LUO FDT hooks all LUO state for sessions, fds, etc. * In the root it also carries "liveupdate-number" 64-bit property that @@ -55,4 +86,44 @@ #define LUO_FDT_COMPATIBLE "luo-v1" #define LUO_FDT_LIVEUPDATE_NUM "liveupdate-number" =20 +/* + * LUO FDT session node + * LUO_FDT_SESSION_HEADER: is a u64 physical address of struct + * luo_session_header_ser + */ +#define LUO_FDT_SESSION_NODE_NAME "luo-session" +#define LUO_FDT_SESSION_COMPATIBLE "luo-session-v1" +#define LUO_FDT_SESSION_HEADER "luo-session-header" + +/** + * struct luo_session_header_ser - Header for the serialized session data = block. + * @count: The number of `struct luo_session_ser` entries that immediately + * follow this header in the memory block. + * + * This structure is located at the beginning of a contiguous block of + * physical memory preserved across the kexec. It provides the necessary + * metadata to interpret the array of session entries that follow. + * + * If this structure is modified, `LUO_FDT_SESSION_COMPATIBLE` must be upd= ated. + */ +struct luo_session_header_ser { + u64 count; +} __packed; + +/** + * struct luo_session_ser - Represents the serialized metadata for a LUO s= ession. + * @name: The unique name of the session, provided by the userspac= e at + * the time of session creation. + * + * This structure is used to package session-specific metadata for transfer + * between kernels via Kexec Handover. An array of these structures (one p= er + * session) is created and passed to the new kernel, allowing it to recons= truct + * the session context. + * + * If this structure is modified, `LUO_FDT_SESSION_COMPATIBLE` must be upd= ated. 
+ */ +struct luo_session_ser { + char name[LIVEUPDATE_SESSION_NAME_LENGTH]; +} __packed; + #endif /* _LINUX_KHO_ABI_LUO_H */ diff --git a/include/uapi/linux/liveupdate.h b/include/uapi/linux/liveupdat= e.h index df34c1642c4d..40578ae19668 100644 --- a/include/uapi/linux/liveupdate.h +++ b/include/uapi/linux/liveupdate.h @@ -43,4 +43,7 @@ /* The ioctl type, documented in ioctl-number.rst */ #define LIVEUPDATE_IOCTL_TYPE 0xBA =20 +/* The maximum length of session name including null termination */ +#define LIVEUPDATE_SESSION_NAME_LENGTH 64 + #endif /* _UAPI_LIVEUPDATE_H */ diff --git a/kernel/liveupdate/Makefile b/kernel/liveupdate/Makefile index 08954c1770c4..6af93caa58cf 100644 --- a/kernel/liveupdate/Makefile +++ b/kernel/liveupdate/Makefile @@ -1,7 +1,8 @@ # SPDX-License-Identifier: GPL-2.0 =20 luo-y :=3D \ - luo_core.o + luo_core.o \ + luo_session.o =20 obj-$(CONFIG_KEXEC_HANDOVER) +=3D kexec_handover.o obj-$(CONFIG_KEXEC_HANDOVER_DEBUG) +=3D kexec_handover_debug.o diff --git a/kernel/liveupdate/luo_core.c b/kernel/liveupdate/luo_core.c index 9f9fe9a81b29..a0f7788cd003 100644 --- a/kernel/liveupdate/luo_core.c +++ b/kernel/liveupdate/luo_core.c @@ -118,6 +118,10 @@ static int __init luo_early_startup(void) pr_info("Retrieved live update data, liveupdate number: %lld\n", luo_global.liveupdate_num); =20 + err =3D luo_session_setup_incoming(luo_global.fdt_in); + if (err) + return err; + return 0; } =20 @@ -154,6 +158,7 @@ static int __init luo_fdt_setup(void) err |=3D fdt_begin_node(fdt_out, ""); err |=3D fdt_property_string(fdt_out, "compatible", LUO_FDT_COMPATIBLE); err |=3D fdt_property(fdt_out, LUO_FDT_LIVEUPDATE_NUM, &ln, sizeof(ln)); + err |=3D luo_session_setup_outgoing(fdt_out); err |=3D fdt_end_node(fdt_out); err |=3D fdt_finish(fdt_out); if (err) @@ -211,6 +216,10 @@ int liveupdate_reboot(void) if (!liveupdate_enabled()) return 0; =20 + err =3D luo_session_serialize(); + if (err) + return err; + err =3D kho_finalize(); if (err) { pr_err("kho_finalize 
failed %d\n", err); diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_inter= nal.h index 8612687b2000..05ae91695ec6 100644 --- a/kernel/liveupdate/luo_internal.h +++ b/kernel/liveupdate/luo_internal.h @@ -19,4 +19,33 @@ */ #define luo_restore_fail(__fmt, ...) panic(__fmt, ##__VA_ARGS__) =20 +/** + * struct luo_session - Represents an active or incoming Live Update sessi= on. + * @name: A unique name for this session, used for identification and + * retrieval. + * @ser: Pointer to the serialized data for this session. + * @list: A list_head member used to link this session into a global= list + * of either outgoing (to be preserved) or incoming (restored= from + * previous kernel) sessions. + * @retrieved: A boolean flag indicating whether this session has been + * retrieved by a consumer in the new kernel. + * @mutex: protects fields in the luo_session. + */ +struct luo_session { + char name[LIVEUPDATE_SESSION_NAME_LENGTH]; + struct luo_session_ser *ser; + struct list_head list; + bool retrieved; + struct mutex mutex; +}; + +int luo_session_create(const char *name, struct file **filep); +int luo_session_retrieve(const char *name, struct file **filep); +int __init luo_session_setup_outgoing(void *fdt); +int __init luo_session_setup_incoming(void *fdt); +int luo_session_serialize(void); +int luo_session_deserialize(void); +bool luo_session_quiesce(void); +void luo_session_resume(void); + #endif /* _LINUX_LUO_INTERNAL_H */ diff --git a/kernel/liveupdate/luo_session.c b/kernel/liveupdate/luo_sessio= n.c new file mode 100644 index 000000000000..5829fe79896a --- /dev/null +++ b/kernel/liveupdate/luo_session.c @@ -0,0 +1,463 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +/** + * DOC: LUO Sessions + * + * LUO Sessions provide the core mechanism for grouping and managing `stru= ct + * file *` instances that need to be preserved across a kexec-based live + * update. 
Each session acts as a named container for a set of file object= s, + * allowing a userspace agent to manage the lifecycle of resources critica= l to a + * workload. + * + * Core Concepts: + * + * - Named Containers: Sessions are identified by a unique, user-provided = name, + * which is used for both creation in the current kernel and retrieval i= n the + * next kernel. + * + * - Userspace Interface: Session management is driven from userspace via + * ioctls on /dev/liveupdate. + * + * - Serialization: Session metadata is preserved using the KHO framework.= When + * a live update is triggered via kexec, an array of `struct luo_session= _ser` + * is populated and placed in a preserved memory region. An FDT node is = also + * created, containing the count of sessions and the physical address of= this + * array. + * + * Session Lifecycle: + * + * 1. Creation: A userspace agent calls `luo_session_create()` to create a + * new, empty session and receives a file descriptor for it. + * + * 2. Serialization: When the `reboot(LINUX_REBOOT_CMD_KEXEC)` syscall is + * made, `luo_session_serialize()` is called. It iterates through all + * active sessions and writes their metadata into a memory area preser= ved + * by KHO. + * + * 3. Deserialization (in new kernel): After kexec, `luo_session_deserial= ize()` + * runs, reading the serialized data and creating a list of `struct + * luo_session` objects representing the preserved sessions. + * + * 4. Retrieval: A userspace agent in the new kernel can then call + * `luo_session_retrieve()` with a session name to get a new file + * descriptor and access the preserved state. 
+ */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "luo_internal.h" + +/* 16 4K pages, give space for 744 sessions */ +#define LUO_SESSION_PGCNT 16ul +#define LUO_SESSION_MAX (((LUO_SESSION_PGCNT << PAGE_SHIFT) - \ + sizeof(struct luo_session_header_ser)) / \ + sizeof(struct luo_session_ser)) + +/** + * struct luo_session_header - Header struct for managing LUO sessions. + * @count: The number of sessions currently tracked in the @list. + * @list: The head of the linked list of `struct luo_session` instan= ces. + * @rwsem: A read-write semaphore providing synchronized access to the + * session list and other fields in this structure. + * @header_ser: The header data of serialization array. + * @ser: The serialized session data (an array of + * `struct luo_session_ser`). + * @active: Set to true when first initialized. If previous kernel did= not + * send session data, active stays false for incoming. + */ +struct luo_session_header { + long count; + struct list_head list; + struct rw_semaphore rwsem; + struct luo_session_header_ser *header_ser; + struct luo_session_ser *ser; + bool active; +}; + +/** + * struct luo_session_global - Global container for managing LUO sessions. + * @incoming: The sessions passed from the previous kernel. + * @outgoing: The sessions that are going to be passed to the next ker= nel. 
+ */ +struct luo_session_global { + struct luo_session_header incoming; + struct luo_session_header outgoing; +}; + +static struct luo_session_global luo_session_global =3D { + .incoming =3D { + .list =3D LIST_HEAD_INIT(luo_session_global.incoming.list), + .rwsem =3D __RWSEM_INITIALIZER(luo_session_global.incoming.rwsem), + }, + .outgoing =3D { + .list =3D LIST_HEAD_INIT(luo_session_global.outgoing.list), + .rwsem =3D __RWSEM_INITIALIZER(luo_session_global.outgoing.rwsem), + }, +}; + +static struct luo_session *luo_session_alloc(const char *name) +{ + struct luo_session *session =3D kzalloc(sizeof(*session), GFP_KERNEL); + + if (!session) + return ERR_PTR(-ENOMEM); + + strscpy(session->name, name, sizeof(session->name)); + INIT_LIST_HEAD(&session->list); + mutex_init(&session->mutex); + + return session; +} + +static void luo_session_free(struct luo_session *session) +{ + mutex_destroy(&session->mutex); + kfree(session); +} + +static int luo_session_insert(struct luo_session_header *sh, + struct luo_session *session) +{ + struct luo_session *it; + + guard(rwsem_write)(&sh->rwsem); + + /* + * For outgoing we should make sure there is room in serialization array + * for new session. + */ + if (sh =3D=3D &luo_session_global.outgoing) { + if (sh->count =3D=3D LUO_SESSION_MAX) + return -ENOMEM; + } + + /* + * For small number of sessions this loop won't hurt performance + * but if we ever start using a lot of sessions, this might + * become a bottleneck during deserialization time, as it would + * cause O(n*n) complexity. 
+ */ + list_for_each_entry(it, &sh->list, list) { + if (!strncmp(it->name, session->name, sizeof(it->name))) + return -EEXIST; + } + list_add_tail(&session->list, &sh->list); + sh->count++; + + return 0; +} + +static void luo_session_remove(struct luo_session_header *sh, + struct luo_session *session) +{ + guard(rwsem_write)(&sh->rwsem); + list_del(&session->list); + sh->count--; +} + +static int luo_session_release(struct inode *inodep, struct file *filep) +{ + struct luo_session *session =3D filep->private_data; + struct luo_session_header *sh; + + /* If retrieved is set, it means this session is from incoming list */ + if (session->retrieved) + sh =3D &luo_session_global.incoming; + else + sh =3D &luo_session_global.outgoing; + + luo_session_remove(sh, session); + luo_session_free(session); + + return 0; +} + +static const struct file_operations luo_session_fops =3D { + .owner =3D THIS_MODULE, + .release =3D luo_session_release, +}; + +/* Create a "struct file" for session */ +static int luo_session_getfile(struct luo_session *session, struct file **= filep) +{ + char name_buf[128]; + struct file *file; + + lockdep_assert_held(&session->mutex); + snprintf(name_buf, sizeof(name_buf), "[luo_session] %s", session->name); + file =3D anon_inode_getfile(name_buf, &luo_session_fops, session, O_RDWR); + if (IS_ERR(file)) + return PTR_ERR(file); + + *filep =3D file; + + return 0; +} + +int luo_session_create(const char *name, struct file **filep) +{ + struct luo_session *session; + int err; + + session =3D luo_session_alloc(name); + if (IS_ERR(session)) + return PTR_ERR(session); + + err =3D luo_session_insert(&luo_session_global.outgoing, session); + if (err) + goto err_free; + + scoped_guard(mutex, &session->mutex) + err =3D luo_session_getfile(session, filep); + if (err) + goto err_remove; + + return 0; + +err_remove: + luo_session_remove(&luo_session_global.outgoing, session); +err_free: + luo_session_free(session); + + return err; +} + +int 
luo_session_retrieve(const char *name, struct file **filep) +{ + struct luo_session_header *sh =3D &luo_session_global.incoming; + struct luo_session *session =3D NULL; + struct luo_session *it; + int err; + + scoped_guard(rwsem_read, &sh->rwsem) { + list_for_each_entry(it, &sh->list, list) { + if (!strncmp(it->name, name, sizeof(it->name))) { + session =3D it; + break; + } + } + } + + if (!session) + return -ENOENT; + + guard(mutex)(&session->mutex); + if (session->retrieved) + return -EINVAL; + + err =3D luo_session_getfile(session, filep); + if (!err) + session->retrieved =3D true; + + return err; +} + +int __init luo_session_setup_outgoing(void *fdt_out) +{ + struct luo_session_header_ser *header_ser; + u64 header_ser_pa; + int err; + + header_ser =3D kho_alloc_preserve(LUO_SESSION_PGCNT << PAGE_SHIFT); + if (IS_ERR(header_ser)) + return PTR_ERR(header_ser); + header_ser_pa =3D virt_to_phys(header_ser); + + err =3D fdt_begin_node(fdt_out, LUO_FDT_SESSION_NODE_NAME); + err |=3D fdt_property_string(fdt_out, "compatible", + LUO_FDT_SESSION_COMPATIBLE); + err |=3D fdt_property(fdt_out, LUO_FDT_SESSION_HEADER, &header_ser_pa, + sizeof(header_ser_pa)); + err |=3D fdt_end_node(fdt_out); + + if (err) + goto err_unpreserve; + + luo_session_global.outgoing.header_ser =3D header_ser; + luo_session_global.outgoing.ser =3D (void *)(header_ser + 1); + luo_session_global.outgoing.active =3D true; + + return 0; + +err_unpreserve: + kho_unpreserve_free(header_ser); + return err; +} + +int __init luo_session_setup_incoming(void *fdt_in) +{ + struct luo_session_header_ser *header_ser; + int err, header_size, offset; + u64 header_ser_pa; + const void *ptr; + + offset =3D fdt_subnode_offset(fdt_in, 0, LUO_FDT_SESSION_NODE_NAME); + if (offset < 0) { + pr_err("Unable to get session node: [%s]\n", + LUO_FDT_SESSION_NODE_NAME); + return -EINVAL; + } + + err =3D fdt_node_check_compatible(fdt_in, offset, + LUO_FDT_SESSION_COMPATIBLE); + if (err) { + pr_err("Session node incompatible 
[%s]\n", + LUO_FDT_SESSION_COMPATIBLE); + return -EINVAL; + } + + header_size =3D 0; + ptr =3D fdt_getprop(fdt_in, offset, LUO_FDT_SESSION_HEADER, &header_size); + if (!ptr || header_size !=3D sizeof(u64)) { + pr_err("Unable to get session header '%s' [%d]\n", + LUO_FDT_SESSION_HEADER, header_size); + return -EINVAL; + } + + header_ser_pa =3D get_unaligned((u64 *)ptr); + header_ser =3D phys_to_virt(header_ser_pa); + + luo_session_global.incoming.header_ser =3D header_ser; + luo_session_global.incoming.ser =3D (void *)(header_ser + 1); + luo_session_global.incoming.active =3D true; + + return 0; +} + +int luo_session_deserialize(void) +{ + struct luo_session_header *sh =3D &luo_session_global.incoming; + static bool is_deserialized; + static int err; + + /* If has been deserialized, always return the same error code */ + if (is_deserialized) + return err; + + is_deserialized =3D true; + if (!sh->active) + return 0; + + /* + * Note on error handling: + * + * If deserialization fails (e.g., allocation failure or corrupt data), + * we intentionally skip cleanup of sessions that were already restored. + * + * A partial failure leaves the preserved state inconsistent. + * Implementing a safe "undo" to unwind complex dependencies (sessions, + * files, hardware state) is error-prone and provides little value, as + * the system is effectively in a broken state. + * + * We treat these resources as leaked. The expected recovery path is for + * userspace to detect the failure and trigger a reboot, which will + * reliably reset devices and reclaim memory. 
+ */ + for (int i =3D 0; i < sh->header_ser->count; i++) { + struct luo_session *session; + + session =3D luo_session_alloc(sh->ser[i].name); + if (IS_ERR(session)) { + pr_warn("Failed to allocate session [%s] during deserialization %pe\n", + sh->ser[i].name, session); + return PTR_ERR(session); + } + + err =3D luo_session_insert(sh, session); + if (err) { + pr_warn("Failed to insert session [%s] %pe\n", + session->name, ERR_PTR(err)); + luo_session_free(session); + return err; + } + } + + kho_restore_free(sh->header_ser); + sh->header_ser =3D NULL; + sh->ser =3D NULL; + + return 0; +} + +int luo_session_serialize(void) +{ + struct luo_session_header *sh =3D &luo_session_global.outgoing; + struct luo_session *session; + int i =3D 0; + + guard(rwsem_write)(&sh->rwsem); + list_for_each_entry(session, &sh->list, list) { + strscpy(sh->ser[i].name, session->name, + sizeof(sh->ser[i].name)); + i++; + } + sh->header_ser->count =3D sh->count; + + return 0; +} + +/** + * luo_session_quiesce - Ensure no active sessions exist and lock session = lists. + * + * Acquires exclusive write locks on both incoming and outgoing session li= sts. + * It then validates no sessions exist in either list. + * + * This mechanism is used during file handler un/registration to ensure th= at no + * sessions are currently using the handler, and no new sessions can be cr= eated + * while un/registration is in progress. + * + * This prevents registering new handlers while sessions are active or + * while deserialization is in progress. + * + * Return: + * true - System is quiescent (0 sessions) and locked. + * false - Active sessions exist. The locks are released internally. 
+ */ +bool luo_session_quiesce(void) +{ + down_write(&luo_session_global.incoming.rwsem); + down_write(&luo_session_global.outgoing.rwsem); + + if (luo_session_global.incoming.count || + luo_session_global.outgoing.count) { + up_write(&luo_session_global.outgoing.rwsem); + up_write(&luo_session_global.incoming.rwsem); + return false; + } + + return true; +} + +/** + * luo_session_resume - Unlock session lists and resume normal activity. + * + * Releases the exclusive locks acquired by a successful call to + * luo_session_quiesce(). + */ +void luo_session_resume(void) +{ + up_write(&luo_session_global.outgoing.rwsem); + up_write(&luo_session_global.incoming.rwsem); +} --=20 2.52.0.460.gd25c4c69ec-goog From nobody Tue Dec 2 00:04:03 2025 Received: from mail-yx1-f54.google.com (mail-yx1-f54.google.com [74.125.224.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4F1DA32B9A4 for ; Tue, 25 Nov 2025 16:59:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.224.54 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089949; cv=none; b=u2qRk8jb8kBe8nvqTuyJGD1GIF5uuvVvJ+CCYEaZIcUbFVzAYG/7DenLTPapx7QdLx67cJbMQZWmw3/aToR1Hd2W2yz/LuwADVdfDehJd8qFsoTt9frR2mnD7eKtN60EKd9oOqbIgPLZ7z9ushgeAtbj9KtKldtU4vF4R4J/xDY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089949; c=relaxed/simple; bh=3tM7kSnLRy7sFArgZjMsRTIjTqxCZSzVQzifmn+OWtY=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=U2BGcj/dOjVB6bIINgbIUUY86M4dktC9r69Y55BRdi1E9Cjwgxbcig2BwMLxf+KqgzZDd/VL2ONq/RO3UshmzM24uOWVzhUDCHvIfsYTpo61FykwjjzRbI6W2EArdRBvCEF1rK0o+e0b6zPO28wAcatoU5jt6/37tc69GUODshc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com 
[34.85.221.182]) by smtp.gmail.com with ESMTPSA id 00721157ae682-78a798a5518sm57284357b3.14.2025.11.25.08.59.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 08:59:05 -0800 (PST) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org Subject: [PATCH v8 05/18] liveupdate: luo_core: add user interface Date: Tue, 25 Nov 2025 11:58:35 -0500 Message-ID: 
<20251125165850.3389713-6-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.52.0.460.gd25c4c69ec-goog
In-Reply-To: <20251125165850.3389713-1-pasha.tatashin@soleen.com>
References: <20251125165850.3389713-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Introduce the user-space interface for the Live Update Orchestrator via
ioctl commands, enabling external control over the live update process
and management of preserved resources.

A single userspace agent is expected to drive the live update; therefore,
only one process can hold this device open at a time.

The following ioctl commands are introduced:

LIVEUPDATE_IOCTL_CREATE_SESSION
Provides a way for userspace to create a named session for grouping file
descriptors that need to be preserved. It returns a new file descriptor
representing the session.

LIVEUPDATE_IOCTL_RETRIEVE_SESSION
Allows the userspace agent in the new kernel to reclaim a preserved
session by its name, receiving a new file descriptor to manage the
restored resources.
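As a rough illustration of how an agent might fill in the argument for
LIVEUPDATE_IOCTL_CREATE_SESSION, the user-space sketch below mirrors the
three-field layout (size, fd, name) in a local type. The names
create_session_arg and luo_create_session() are hypothetical, and the ioctl
request number is passed in as a parameter because the value of
LIVEUPDATE_IOCTL_TYPE is not shown in this patch. Treat it as a sketch under
those assumptions, not as the interface itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Same value as LIVEUPDATE_SESSION_NAME_LENGTH in the uAPI header. */
#define SESSION_NAME_LENGTH 64

/* Local mirror of struct liveupdate_ioctl_create_session (hypothetical name). */
struct create_session_arg {
	uint32_t size;                     /* in:  sizeof(struct) */
	int32_t  fd;                       /* out: new session file descriptor */
	uint8_t  name[SESSION_NAME_LENGTH]; /* in:  NUL-terminated session name */
};

_Static_assert(sizeof(struct create_session_arg) == 72, "unexpected layout");

/*
 * Hypothetical helper: returns the session fd on success, or -1 with errno
 * set. 'request' is expected to be LIVEUPDATE_IOCTL_CREATE_SESSION.
 */
static int luo_create_session(int dev_fd, unsigned long request,
			      const char *name)
{
	struct create_session_arg arg;
	size_t len = strlen(name);

	/* The name must fit including its terminating NUL. */
	if (len >= sizeof(arg.name)) {
		errno = ENAMETOOLONG;
		return -1;
	}

	/* Zero everything so the name stays NUL-terminated and padded. */
	memset(&arg, 0, sizeof(arg));
	arg.size = sizeof(arg);
	arg.fd = -1;
	memcpy(arg.name, name, len);

	if (ioctl(dev_fd, request, &arg) < 0)
		return -1;

	return arg.fd;
}
```

The retrieve argument has the same three fields, so the same pattern would
presumably apply to LIVEUPDATE_IOCTL_RETRIEVE_SESSION in the new kernel.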
Signed-off-by: Pasha Tatashin Reviewed-by: Mike Rapoport (Microsoft) Reviewed-by: Pratyush Yadav Tested-by: David Matlack --- include/uapi/linux/liveupdate.h | 64 +++++++++++ kernel/liveupdate/luo_core.c | 178 +++++++++++++++++++++++++++++++ kernel/liveupdate/luo_internal.h | 21 ++++ 3 files changed, 263 insertions(+) diff --git a/include/uapi/linux/liveupdate.h b/include/uapi/linux/liveupdat= e.h index 40578ae19668..1183cf984b5f 100644 --- a/include/uapi/linux/liveupdate.h +++ b/include/uapi/linux/liveupdate.h @@ -46,4 +46,68 @@ /* The maximum length of session name including null termination */ #define LIVEUPDATE_SESSION_NAME_LENGTH 64 =20 +/* The /dev/liveupdate ioctl commands */ +enum { + LIVEUPDATE_CMD_BASE =3D 0x00, + LIVEUPDATE_CMD_CREATE_SESSION =3D LIVEUPDATE_CMD_BASE, + LIVEUPDATE_CMD_RETRIEVE_SESSION =3D 0x01, +}; + +/** + * struct liveupdate_ioctl_create_session - ioctl(LIVEUPDATE_IOCTL_CREATE_= SESSION) + * @size: Input; sizeof(struct liveupdate_ioctl_create_session) + * @fd: Output; The new file descriptor for the created session. + * @name: Input; A null-terminated string for the session name, max + * length %LIVEUPDATE_SESSION_NAME_LENGTH including termination + * character. + * + * Creates a new live update session for managing preserved resources. + * This ioctl can only be called on the main /dev/liveupdate device. + * + * Return: 0 on success, negative error code on failure. + */ +struct liveupdate_ioctl_create_session { + __u32 size; + __s32 fd; + __u8 name[LIVEUPDATE_SESSION_NAME_LENGTH]; +}; + +#define LIVEUPDATE_IOCTL_CREATE_SESSION \ + _IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_CREATE_SESSION) + +/** + * struct liveupdate_ioctl_retrieve_session - ioctl(LIVEUPDATE_IOCTL_RETRI= EVE_SESSION) + * @size: Input; sizeof(struct liveupdate_ioctl_retrieve_session) + * @fd: Output; The new file descriptor for the retrieved session. + * @name: Input; A null-terminated string identifying the session to re= trieve. 
+ * The name must exactly match the name used when the session was + * created in the previous kernel. + * + * Retrieves a handle (a new file descriptor) for a preserved session by i= ts + * name. This is the primary mechanism for a userspace agent to regain con= trol + * of its preserved resources after a live update. + * + * The userspace application provides the null-terminated `name` of a sess= ion + * it created before the live update. If a preserved session with a matchi= ng + * name is found, the kernel instantiates it and returns a new file descri= ptor + * in the `fd` field. This new session FD can then be used for all file-sp= ecific + * operations, such as restoring individual file descriptors with + * LIVEUPDATE_SESSION_RETRIEVE_FD. + * + * It is the responsibility of the userspace application to know the names= of + * the sessions it needs to retrieve. If no session with the given name is + * found, the ioctl will fail with -ENOENT. + * + * This ioctl can only be called on the main /dev/liveupdate device when t= he + * system is in the LIVEUPDATE_STATE_UPDATED state. + */ +struct liveupdate_ioctl_retrieve_session { + __u32 size; + __s32 fd; + __u8 name[LIVEUPDATE_SESSION_NAME_LENGTH]; +}; + +#define LIVEUPDATE_IOCTL_RETRIEVE_SESSION \ + _IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_RETRIEVE_SESSION) + #endif /* _UAPI_LIVEUPDATE_H */ diff --git a/kernel/liveupdate/luo_core.c b/kernel/liveupdate/luo_core.c index a0f7788cd003..f7ecaf7740d1 100644 --- a/kernel/liveupdate/luo_core.c +++ b/kernel/liveupdate/luo_core.c @@ -41,7 +41,13 @@ =20 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt =20 +#include +#include +#include +#include +#include #include +#include #include #include #include @@ -246,12 +252,183 @@ bool liveupdate_enabled(void) return luo_global.enabled; } =20 +/** + * DOC: LUO ioctl Interface + * + * The IOCTL user-space control interface for the LUO subsystem. 
+ * It registers a character device, typically found at ``/dev/liveupdate``, + * which allows a userspace agent to manage the LUO state machine and its + * associated resources, such as preservable file descriptors. + * + * To ensure that the state machine is controlled by a single entity, acce= ss + * to this device is exclusive: only one process is permitted to have + * ``/dev/liveupdate`` open at any given time. Subsequent open attempts wi= ll + * fail with -EBUSY until the first process closes its file descriptor. + * This singleton model simplifies state management by preventing conflict= ing + * commands from multiple userspace agents. + */ + struct luo_device_state { struct miscdevice miscdev; + atomic_t in_use; }; =20 +static int luo_ioctl_create_session(struct luo_ucmd *ucmd) +{ + struct liveupdate_ioctl_create_session *argp =3D ucmd->cmd; + struct file *file; + int err; + + argp->fd =3D get_unused_fd_flags(O_CLOEXEC); + if (argp->fd < 0) + return argp->fd; + + err =3D luo_session_create(argp->name, &file); + if (err) + goto err_put_fd; + + err =3D luo_ucmd_respond(ucmd, sizeof(*argp)); + if (err) + goto err_put_file; + + fd_install(argp->fd, file); + + return 0; + +err_put_file: + fput(file); +err_put_fd: + put_unused_fd(argp->fd); + + return err; +} + +static int luo_ioctl_retrieve_session(struct luo_ucmd *ucmd) +{ + struct liveupdate_ioctl_retrieve_session *argp =3D ucmd->cmd; + struct file *file; + int err; + + argp->fd =3D get_unused_fd_flags(O_CLOEXEC); + if (argp->fd < 0) + return argp->fd; + + err =3D luo_session_retrieve(argp->name, &file); + if (err < 0) + goto err_put_fd; + + err =3D luo_ucmd_respond(ucmd, sizeof(*argp)); + if (err) + goto err_put_file; + + fd_install(argp->fd, file); + + return 0; + +err_put_file: + fput(file); +err_put_fd: + put_unused_fd(argp->fd); + + return err; +} + +static int luo_open(struct inode *inodep, struct file *filep) +{ + struct luo_device_state *ldev =3D container_of(filep->private_data, + struct 
luo_device_state, + miscdev); + + if (atomic_cmpxchg(&ldev->in_use, 0, 1)) + return -EBUSY; + + /* Always return -EIO to user if deserialization fail */ + if (luo_session_deserialize()) { + atomic_set(&ldev->in_use, 0); + return -EIO; + } + + return 0; +} + +static int luo_release(struct inode *inodep, struct file *filep) +{ + struct luo_device_state *ldev =3D container_of(filep->private_data, + struct luo_device_state, + miscdev); + atomic_set(&ldev->in_use, 0); + + return 0; +} + +union ucmd_buffer { + struct liveupdate_ioctl_create_session create; + struct liveupdate_ioctl_retrieve_session retrieve; +}; + +struct luo_ioctl_op { + unsigned int size; + unsigned int min_size; + unsigned int ioctl_num; + int (*execute)(struct luo_ucmd *ucmd); +}; + +#define IOCTL_OP(_ioctl, _fn, _struct, _last) = \ + [_IOC_NR(_ioctl) - LIVEUPDATE_CMD_BASE] =3D { \ + .size =3D sizeof(_struct) + \ + BUILD_BUG_ON_ZERO(sizeof(union ucmd_buffer) < \ + sizeof(_struct)), \ + .min_size =3D offsetofend(_struct, _last), \ + .ioctl_num =3D _ioctl, \ + .execute =3D _fn, \ + } + +static const struct luo_ioctl_op luo_ioctl_ops[] =3D { + IOCTL_OP(LIVEUPDATE_IOCTL_CREATE_SESSION, luo_ioctl_create_session, + struct liveupdate_ioctl_create_session, name), + IOCTL_OP(LIVEUPDATE_IOCTL_RETRIEVE_SESSION, luo_ioctl_retrieve_session, + struct liveupdate_ioctl_retrieve_session, name), +}; + +static long luo_ioctl(struct file *filep, unsigned int cmd, unsigned long = arg) +{ + const struct luo_ioctl_op *op; + struct luo_ucmd ucmd =3D {}; + union ucmd_buffer buf; + unsigned int nr; + int err; + + nr =3D _IOC_NR(cmd); + if (nr < LIVEUPDATE_CMD_BASE || + (nr - LIVEUPDATE_CMD_BASE) >=3D ARRAY_SIZE(luo_ioctl_ops)) { + return -EINVAL; + } + + ucmd.ubuffer =3D (void __user *)arg; + err =3D get_user(ucmd.user_size, (u32 __user *)ucmd.ubuffer); + if (err) + return err; + + op =3D &luo_ioctl_ops[nr - LIVEUPDATE_CMD_BASE]; + if (op->ioctl_num !=3D cmd) + return -ENOIOCTLCMD; + if (ucmd.user_size < op->min_size) + 
return -EINVAL; + + ucmd.cmd =3D &buf; + err =3D copy_struct_from_user(ucmd.cmd, op->size, ucmd.ubuffer, + ucmd.user_size); + if (err) + return err; + + return op->execute(&ucmd); +} + static const struct file_operations luo_fops =3D { .owner =3D THIS_MODULE, + .open =3D luo_open, + .release =3D luo_release, + .unlocked_ioctl =3D luo_ioctl, }; =20 static struct luo_device_state luo_dev =3D { @@ -260,6 +437,7 @@ static struct luo_device_state luo_dev =3D { .name =3D "liveupdate", .fops =3D &luo_fops, }, + .in_use =3D ATOMIC_INIT(0), }; =20 static int __init liveupdate_ioctl_init(void) diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_inter= nal.h index 05ae91695ec6..1292ac47eef8 100644 --- a/kernel/liveupdate/luo_internal.h +++ b/kernel/liveupdate/luo_internal.h @@ -9,6 +9,27 @@ #define _LINUX_LUO_INTERNAL_H =20 #include +#include + +struct luo_ucmd { + void __user *ubuffer; + u32 user_size; + void *cmd; +}; + +static inline int luo_ucmd_respond(struct luo_ucmd *ucmd, + size_t kernel_cmd_size) +{ + /* + * Copy the minimum of what the user provided and what we actually + * have. 
+ */ + if (copy_to_user(ucmd->ubuffer, ucmd->cmd, + min_t(size_t, ucmd->user_size, kernel_cmd_size))) { + return -EFAULT; + } + return 0; +} =20 /* * Handles a deserialization failure: devices and memory is in unpredictab= le --=20 2.52.0.460.gd25c4c69ec-goog From nobody Tue Dec 2 00:04:03 2025 Received: from mail-yw1-f173.google.com (mail-yw1-f173.google.com [209.85.128.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 82A0732BF20 for ; Tue, 25 Nov 2025 16:59:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.173 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089953; cv=none; b=TaqucktFdriYBBisyjXyw61Vg5TUgyLDWU2fIlWVqZqBSY48O07vjTa7JHb5rMdBiWUV0pn+cec6I6UHGnifMwA5A4IyGUzIZsdLxsz4GyiZZcjpzuUv+BjU+tG6tKjm+S6SD2fRgfEN4wN+L9SE8jkSD/o1UPblg1vMLcVZTFg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089953; c=relaxed/simple; bh=LT/Xn+S4gSuiGh/cZ5DWuBHN4O1PvHlBCmJABbmQ7DM=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=N1zhJzhGcm9KjaovWRhPOczWm6kC5463VLyzwUn+xwUvmyrG0zRDh7imOBoYqNe5PmvEzbAC/OvjinCKelGlQOBAAoKFws354Ig9tUWFdzdSkP2t2aLhh0aWCU2/N0noM2lO2KsU4x4CXDCeFRIU8yG+uzC2Iwh3/R5HyadnIDk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=Uf7maMGe; arc=none smtp.client-ip=209.85.128.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="Uf7maMGe" Received: by 
kamEPXmCpdMSKzlTHrOOrXZK/J1Tgnvz6M2E6wGymCAvYb5BjlkhI2wF5+HjRAjIwB9O6aaZL07 oTigp93jRiC1XOwZE1iUbgGHv7Jww8gey+eTj8TDLA4OsZBge36u/yzNkS4MNaEHNyCWObcqZOw ui0Bk56ZT6mElNfC8lDsscL6d5qiFU2hFix2GcrVxnYptx34epD+cVh2WPBBdwjQL4u+1At00nb r1rXPNfBL+V2u50V2S5XB3eG6OIhq6PV28/ByFGatVi7nspV0lQNGDDUQ9EcIjVdvf2ME6PHURV J4flnU= X-Google-Smtp-Source: AGHT+IHvVEpp6zCFqvLo2bARIyvTR1RTQKWx3w94e6i5XlGrNPoZJRRs0PURPhj2vhunLQ8gEPfPVw== X-Received: by 2002:a05:690c:680a:b0:787:ed0d:42c1 with SMTP id 00721157ae682-78ab6fa988bmr30368477b3.62.1764089947949; Tue, 25 Nov 2025 08:59:07 -0800 (PST) Received: from soleen.c.googlers.com.com (182.221.85.34.bc.googleusercontent.com. [34.85.221.182]) by smtp.gmail.com with ESMTPSA id 00721157ae682-78a798a5518sm57284357b3.14.2025.11.25.08.59.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 08:59:07 -0800 (PST) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, 
quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org Subject: [PATCH v8 06/18] liveupdate: luo_file: implement file systems callbacks Date: Tue, 25 Nov 2025 11:58:36 -0500 Message-ID: <20251125165850.3389713-7-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.52.0.460.gd25c4c69ec-goog In-Reply-To: <20251125165850.3389713-1-pasha.tatashin@soleen.com> References: <20251125165850.3389713-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable This patch implements the core mechanism for managing preserved files throughout the live update lifecycle. It provides the logic to invoke the file handler callbacks (preserve, unpreserve, freeze, unfreeze, retrieve, and finish) at the appropriate stages. During the reboot phase, luo_file_freeze() serializes the final metadata for each file (handler compatible string, token, and data handle) into a memory region preserved by KHO. In the new kernel, luo_file_deserialize() reconstructs the in-memory file list from this data, preparing the session for retrieval. 
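To make the serialized metadata concrete, here is a small user-space sketch
(not kernel code) of the per-file record that this patch defines as
struct luo_file_ser. The helpers file_ser_pack() and file_ser_find() are
hypothetical; they only stand in for the idea that the freeze step writes
(compatible, data, token) records into a flat array and that retrieval later
looks a record up by its token.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same value as LIVEUPDATE_HNDL_COMPAT_LENGTH in the patch. */
#define HNDL_COMPAT_LENGTH 48

/* User-space mirror of struct luo_file_ser: 48 + 8 + 8 bytes, packed. */
struct file_ser {
	char     compatible[HNDL_COMPAT_LENGTH];
	uint64_t data;
	uint64_t token;
} __attribute__((packed));

_Static_assert(sizeof(struct file_ser) == 64, "record must be 64 bytes");

/*
 * Hypothetical helper: fill one record. Returns 0 on success, or -1 if the
 * compatible string does not fit with its terminating NUL.
 */
static int file_ser_pack(struct file_ser *rec, const char *compatible,
			 uint64_t data, uint64_t token)
{
	size_t len = strlen(compatible);

	if (len >= sizeof(rec->compatible))
		return -1;

	memset(rec, 0, sizeof(*rec));
	memcpy(rec->compatible, compatible, len);
	rec->data = data;
	rec->token = token;
	return 0;
}

/* Hypothetical helper: linear search by token; returns NULL if absent. */
static const struct file_ser *file_ser_find(const struct file_ser *recs,
					    size_t count, uint64_t token)
{
	for (size_t i = 0; i < count; i++) {
		if (recs[i].token == token)
			return &recs[i];
	}
	return NULL;
}
```

Because the record is packed and fixed-size, an array of them can be walked
with nothing but a base address and a count, which is what
struct luo_file_set_ser carries.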
Signed-off-by: Pasha Tatashin
Reviewed-by: Mike Rapoport (Microsoft)
Reviewed-by: Pratyush Yadav
Tested-by: David Matlack
---
 include/linux/kho/abi/luo.h      |  39 +-
 include/linux/liveupdate.h       |  98 ++++
 kernel/liveupdate/Makefile       |   1 +
 kernel/liveupdate/luo_file.c     | 880 +++++++++++++++++++++++++++++++
 kernel/liveupdate/luo_internal.h |  38 ++
 5 files changed, 1055 insertions(+), 1 deletion(-)
 create mode 100644 kernel/liveupdate/luo_file.c

diff --git a/include/linux/kho/abi/luo.h b/include/linux/kho/abi/luo.h
index bf1ab2910959..bb099c92e469 100644
--- a/include/linux/kho/abi/luo.h
+++ b/include/linux/kho/abi/luo.h
@@ -69,6 +69,11 @@
  *     Metadata for a single session, including its name and a physical pointer
  *     to another preserved memory block containing an array of
  *     `struct luo_file_ser` for all files in that session.
+ *
+ *   - struct luo_file_ser:
+ *     Metadata for a single preserved file. Contains the `compatible` string to
+ *     find the correct handler in the new kernel, a user-provided `token` for
+ *     identification, and an opaque `data` handle for the handler to use.
  */
 
 #ifndef _LINUX_KHO_ABI_LUO_H
@@ -86,13 +91,43 @@
 #define LUO_FDT_COMPATIBLE	"luo-v1"
 #define LUO_FDT_LIVEUPDATE_NUM	"liveupdate-number"
 
+#define LIVEUPDATE_HNDL_COMPAT_LENGTH	48
+
+/**
+ * struct luo_file_ser - Represents a serialized preserved file.
+ * @compatible:  File handler compatible string.
+ * @data:        Opaque data handle owned by the file handler.
+ * @token:       User-provided token for this file.
+ *
+ * If this structure is modified, LUO_FDT_SESSION_COMPATIBLE must be updated.
+ */
+struct luo_file_ser {
+	char compatible[LIVEUPDATE_HNDL_COMPAT_LENGTH];
+	u64 data;
+	u64 token;
+} __packed;
+
+/**
+ * struct luo_file_set_ser - Represents the serialized metadata for a file set
+ * @files:   The physical address of a contiguous memory block that holds
+ *           the serialized state of files (array of luo_file_ser) in this file
+ *           set.
+ * @count: The total number of files that were part of this session duri= ng + * serialization. Used for iteration and validation during + * restoration. + */ +struct luo_file_set_ser { + u64 files; + u64 count; +} __packed; + /* * LUO FDT session node * LUO_FDT_SESSION_HEADER: is a u64 physical address of struct * luo_session_header_ser */ #define LUO_FDT_SESSION_NODE_NAME "luo-session" -#define LUO_FDT_SESSION_COMPATIBLE "luo-session-v1" +#define LUO_FDT_SESSION_COMPATIBLE "luo-session-v2" #define LUO_FDT_SESSION_HEADER "luo-session-header" =20 /** @@ -114,6 +149,7 @@ struct luo_session_header_ser { * struct luo_session_ser - Represents the serialized metadata for a LUO s= ession. * @name: The unique name of the session, provided by the userspac= e at * the time of session creation. + * @file_set_ser: Serialized files belonging to this session, * * This structure is used to package session-specific metadata for transfer * between kernels via Kexec Handover. An array of these structures (one p= er @@ -124,6 +160,7 @@ struct luo_session_header_ser { */ struct luo_session_ser { char name[LIVEUPDATE_SESSION_NAME_LENGTH]; + struct luo_file_set_ser file_set_ser; } __packed; =20 #endif /* _LINUX_KHO_ABI_LUO_H */ diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h index c6a1d6bd90cb..122ad8f16ff9 100644 --- a/include/linux/liveupdate.h +++ b/include/linux/liveupdate.h @@ -8,8 +8,93 @@ #define _LINUX_LIVEUPDATE_H =20 #include +#include +#include #include #include +#include + +struct liveupdate_file_handler; +struct file; + +/** + * struct liveupdate_file_op_args - Arguments for file operation callbacks. + * @handler: The file handler being called. + * @retrieved: The retrieve status for the 'can_finish / finish' + * operation. + * @file: The file object. For retrieve: [OUT] The callback se= ts + * this to the new file. For other ops: [IN] The caller= sets + * this to the file being operated on. 
+ * @serialized_data: The opaque u64 handle, preserve/prepare/freeze may u= pdate + * this field. + * + * This structure bundles all parameters for the file operation callbacks. + * The 'data' and 'file' fields are used for both input and output. + */ +struct liveupdate_file_op_args { + struct liveupdate_file_handler *handler; + bool retrieved; + struct file *file; + u64 serialized_data; +}; + +/** + * struct liveupdate_file_ops - Callbacks for live-updatable files. + * @can_preserve: Required. Lightweight check to see if this handler is + * compatible with the given file. + * @preserve: Required. Performs state-saving for the file. + * @unpreserve: Required. Cleans up any resources allocated by @preserve. + * @freeze: Optional. Final actions just before kernel transition. + * @unfreeze: Optional. Undo freeze operations. + * @retrieve: Required. Restores the file in the new kernel. + * @can_finish: Optional. Check if this FD can finish, i.e. all restorat= ion + * pre-requirements for this FD are satisfied. Called prior= to + * finish, in order to do successful finish calls for all + * resources in the session. + * @finish: Required. Final cleanup in the new kernel. + * @owner: Module reference + * + * All operations (except can_preserve) receive a pointer to a + * 'struct liveupdate_file_op_args' containing the necessary context. 
+ */ +struct liveupdate_file_ops { + bool (*can_preserve)(struct liveupdate_file_handler *handler, + struct file *file); + int (*preserve)(struct liveupdate_file_op_args *args); + void (*unpreserve)(struct liveupdate_file_op_args *args); + int (*freeze)(struct liveupdate_file_op_args *args); + void (*unfreeze)(struct liveupdate_file_op_args *args); + int (*retrieve)(struct liveupdate_file_op_args *args); + bool (*can_finish)(struct liveupdate_file_op_args *args); + void (*finish)(struct liveupdate_file_op_args *args); + struct module *owner; +}; + +/** + * struct liveupdate_file_handler - Represents a handler for a live-updata= ble file type. + * @ops: Callback functions + * @compatible: The compatibility string (e.g., "memfd-v1", "vfiof= d-v1") + * that uniquely identifies the file type this handler + * supports. This is matched against the compatible s= tring + * associated with individual &struct file instances. + * + * Modules that want to support live update for specific file types should + * register an instance of this structure. LUO uses this registration to + * determine if a given file can be preserved and to find the appropriate + * operations to manage its state across the update. + */ +struct liveupdate_file_handler { + const struct liveupdate_file_ops *ops; + const char compatible[LIVEUPDATE_HNDL_COMPAT_LENGTH]; + + /* private: */ + + /* + * Used for linking this handler instance into a global list of + * registered file handlers. 
+ */ + struct list_head __private list; +}; =20 #ifdef CONFIG_LIVEUPDATE =20 @@ -19,6 +104,9 @@ bool liveupdate_enabled(void); /* Called during kexec to tell LUO that entered into reboot */ int liveupdate_reboot(void); =20 +int liveupdate_register_file_handler(struct liveupdate_file_handler *fh); +int liveupdate_unregister_file_handler(struct liveupdate_file_handler *fh); + #else /* CONFIG_LIVEUPDATE */ =20 static inline bool liveupdate_enabled(void) @@ -31,5 +119,15 @@ static inline int liveupdate_reboot(void) return 0; } =20 +static inline int liveupdate_register_file_handler(struct liveupdate_file_= handler *fh) +{ + return -EOPNOTSUPP; +} + +static inline int liveupdate_unregister_file_handler(struct liveupdate_fil= e_handler *fh) +{ + return -EOPNOTSUPP; +} + #endif /* CONFIG_LIVEUPDATE */ #endif /* _LINUX_LIVEUPDATE_H */ diff --git a/kernel/liveupdate/Makefile b/kernel/liveupdate/Makefile index 6af93caa58cf..7cad2eece32d 100644 --- a/kernel/liveupdate/Makefile +++ b/kernel/liveupdate/Makefile @@ -2,6 +2,7 @@ =20 luo-y :=3D \ luo_core.o \ + luo_file.o \ luo_session.o =20 obj-$(CONFIG_KEXEC_HANDOVER) +=3D kexec_handover.o diff --git a/kernel/liveupdate/luo_file.c b/kernel/liveupdate/luo_file.c new file mode 100644 index 000000000000..e9727cb1275a --- /dev/null +++ b/kernel/liveupdate/luo_file.c @@ -0,0 +1,880 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +/** + * DOC: LUO File Descriptors + * + * LUO provides the infrastructure to preserve specific, stateful file + * descriptors across a kexec-based live update. The primary goal is to al= low + * workloads, such as virtual machines using vfio, memfd, or iommufd, to + * retain access to their essential resources without interruption. + * + * The framework is built around a callback-based handler model and a well- + * defined lifecycle for each preserved file. 
+ *
+ * Handler Registration:
+ * Kernel modules responsible for a specific file type (e.g., memfd, vfio)
+ * register a &struct liveupdate_file_handler. This handler provides a set of
+ * callbacks that LUO invokes at different stages of the update process, most
+ * notably:
+ *
+ *   - can_preserve(): A lightweight check to determine if the handler is
+ *     compatible with a given 'struct file'.
+ *   - preserve(): The heavyweight operation that saves the file's state and
+ *     returns an opaque u64 handle. This is typically performed while the
+ *     workload is still active to minimize the downtime during the
+ *     actual reboot transition.
+ *   - unpreserve(): Cleans up any resources allocated by .preserve(), called
+ *     if the preservation process is aborted before the reboot (i.e., the
+ *     session is closed).
+ *   - freeze(): A final pre-reboot opportunity to prepare the state for kexec.
+ *     This runs inside the reboot() syscall, so userspace can no longer
+ *     mutate the file.
+ *   - unfreeze(): Undoes the actions of .freeze(), called if the live update
+ *     is aborted after the freeze phase.
+ *   - retrieve(): Reconstructs the file in the new kernel from the preserved
+ *     handle.
+ *   - finish(): Performs the final check and cleanup in the new kernel. After
+ *     a successful finish call, LUO gives up ownership of this file.
+ *
+ * File Preservation Lifecycle happy path:
+ *
+ * 1. Preserve (Normal Operation): A userspace agent preserves files one by one
+ *    via an ioctl. For each file, luo_preserve_file() finds a compatible
+ *    handler, calls its .preserve() operation, and creates an internal &struct
+ *    luo_file to track the live state.
+ *
+ * 2. Freeze (Pre-Reboot): Just before the kexec, luo_file_freeze() is called.
+ *    It iterates through all preserved files, calls their respective .freeze()
+ *    operation, and serializes their final metadata (compatible string, token,
+ *    and data handle) into a contiguous memory block for KHO.
+ *
+ * 3. Deserialize: After kexec, luo_file_deserialize() runs when the session is
+ *    deserialized (which is when /dev/liveupdate is first opened). It reads the
+ *    serialized data from the KHO memory region and reconstructs the in-memory
+ *    list of &struct luo_file instances for the new kernel, linking them to
+ *    their corresponding handlers.
+ *
+ * 4. Retrieve (New Kernel - Userspace Ready): The userspace agent can now
+ *    restore file descriptors by providing a token. luo_retrieve_file()
+ *    searches for the matching token, calls the handler's .retrieve() op to
+ *    re-create the 'struct file', and returns a new FD. Files can be
+ *    retrieved in ANY order.
+ *
+ * 5. Finish (New Kernel - Cleanup): Once session retrieval is complete,
+ *    luo_file_finish() is called. It iterates through all files, invokes their
+ *    .finish() operations for final cleanup, and releases all associated kernel
+ *    resources.
+ *
+ * File Preservation Lifecycle unhappy paths:
+ *
+ * 1. Abort Before Reboot: If the userspace agent aborts the live update
+ *    process before calling reboot (e.g., by closing the session file
+ *    descriptor), the session's release handler calls
+ *    luo_file_unpreserve_files(). This invokes the .unpreserve() callback on
+ *    all preserved files, ensuring all allocated resources are cleaned up and
+ *    returning the system to a clean state.
+ *
+ * 2. Freeze Failure: During the reboot() syscall, if any handler's .freeze()
+ *    op fails, the .unfreeze() op is invoked on all previously *successful*
+ *    freezes to roll back their state. The reboot() syscall then returns an
+ *    error to userspace, canceling the live update.
+ *
+ * 3. Finish Failure: In the new kernel, if a handler's .can_finish() op fails,
+ *    the luo_file_finish() operation is aborted. LUO retains ownership of
+ *    all files within that session, including those that were not yet
+ *    processed. The userspace agent can attempt to call the finish operation
+ *    again later. If the issue cannot be resolved, these resources will be held
+ *    by LUO until the next live update cycle, at which point they will be
+ *    discarded.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "luo_internal.h"
+
+static LIST_HEAD(luo_file_handler_list);
+
+/* 2 4K pages, give space for 128 files per file_set */
+#define LUO_FILE_PGCNT		2ul
+#define LUO_FILE_MAX							\
+	((LUO_FILE_PGCNT << PAGE_SHIFT) / sizeof(struct luo_file_ser))
+
+/**
+ * struct luo_file - Represents a single preserved file instance.
+ * @fh:            Pointer to the &struct liveupdate_file_handler that manages
+ *                 this type of file.
+ * @file:          Pointer to the kernel's &struct file that is being preserved.
+ *                 This is NULL in the new kernel until the file is successfully
+ *                 retrieved.
+ * @serialized_data: The opaque u64 handle to the serialized state of the file.
+ *                 This handle is passed back to the handler's .freeze(),
+ *                 .retrieve(), and .finish() callbacks, allowing it to track
+ *                 and update its serialized state across phases.
+ * @retrieved:     A flag indicating whether a user/kernel in the new kernel has
+ *                 successfully called retrieve() on this file. This prevents
+ *                 multiple retrieval attempts.
+ * @mutex:         A mutex that protects the fields of this specific instance
+ *                 (e.g., @retrieved, @file), ensuring that operations like
+ *                 retrieving or finishing a file are atomic.
+ * @list:          The list_head linking this instance into its parent
+ *                 file_set's list of preserved files.
+ * @token:         The user-provided unique token used to identify this file.
+ *
+ * This structure is the core in-kernel representation of a single file being
+ * managed through a live update. An instance is created by luo_preserve_file()
+ * to link a 'struct file' to its corresponding handler, a user-provided token,
+ * and the serialized state handle returned by the handler's .preserve()
+ * operation.
+ *
+ * These instances are tracked in a per-file_set list. The @serialized_data
+ * field, which holds a handle to the file's serialized state, may be updated
+ * during the .freeze() callback before being serialized for the next kernel.
+ * After reboot, these structures are recreated by luo_file_deserialize() and
+ * are finally cleaned up by luo_file_finish().
+ */
+struct luo_file {
+	struct liveupdate_file_handler *fh;
+	struct file *file;
+	u64 serialized_data;
+	bool retrieved;
+	struct mutex mutex;
+	struct list_head list;
+	u64 token;
+};
+
+static int luo_alloc_files_mem(struct luo_file_set *file_set)
+{
+	size_t size;
+	void *mem;
+
+	if (file_set->files)
+		return 0;
+
+	WARN_ON_ONCE(file_set->count);
+
+	size = LUO_FILE_PGCNT << PAGE_SHIFT;
+	mem = kho_alloc_preserve(size);
+	if (IS_ERR(mem))
+		return PTR_ERR(mem);
+
+	file_set->files = mem;
+
+	return 0;
+}
+
+static void luo_free_files_mem(struct luo_file_set *file_set)
+{
+	/* If file_set has files, no need to free preservation memory */
+	if (file_set->count)
+		return;
+
+	if (!file_set->files)
+		return;
+
+	kho_unpreserve_free(file_set->files);
+	file_set->files = NULL;
+}
+
+static bool luo_token_is_used(struct luo_file_set *file_set, u64 token)
+{
+	struct luo_file *iter;
+
+	list_for_each_entry(iter, &file_set->files_list, list) {
+		if (iter->token == token)
+			return true;
+	}
+
+	return false;
+}
+
+/**
+ * luo_preserve_file - Initiate the preservation of a file descriptor.
+ * @file_set: The file_set to which the preserved file will be added.
+ * @token:    A unique, user-provided identifier for the file.
+ * @fd:       The file descriptor to be preserved.
+ *
+ * This function orchestrates the first phase of preserving a file. Upon entry,
+ * it takes a reference to the 'struct file' via fget(), effectively making LUO
+ * a co-owner of the file. This reference is held until the file is either
+ * unpreserved or successfully finished in the next kernel, preventing the file
+ * from being prematurely destroyed.
+ *
+ * With that reference held, the function performs the following
+ * steps:
+ *
+ * 1. Validates that the @token is not already in use within the file_set.
+ * 2. Ensures the file_set's memory for files serialization is allocated
+ *    (allocates if needed).
+ * 3. Iterates through registered handlers, calling can_preserve() to find one
+ *    compatible with the given @fd.
+ * 4. Calls the handler's .preserve() operation, which saves the file's state
+ *    and returns an opaque private data handle.
+ * 5. Adds the new instance to the file_set's internal list.
+ *
+ * On success, LUO takes a reference to the 'struct file' and considers it
+ * under its management until it is unpreserved or finished.
+ *
+ * In case of any failure, all intermediate allocations (file reference, memory
+ * for the 'luo_file' struct, etc.) are cleaned up before returning an error.
+ *
+ * Context: Can be called from an ioctl handler during normal system operation.
+ * Return: 0 on success. Returns a negative errno on failure:
+ *         -EEXIST if the token is already used.
+ *         -EBADF if the file descriptor is invalid.
+ *         -ENOSPC if the file_set is full.
+ *         -ENOENT if no compatible handler is found.
+ *         -ENOMEM on memory allocation failure.
+ *         Other errors might be returned by .preserve().
+ */
+int luo_preserve_file(struct luo_file_set *file_set, u64 token, int fd)
+{
+	struct liveupdate_file_op_args args = {0};
+	struct liveupdate_file_handler *fh;
+	struct luo_file *luo_file;
+	struct file *file;
+	int err;
+
+	if (luo_token_is_used(file_set, token))
+		return -EEXIST;
+
+	if (file_set->count == LUO_FILE_MAX)
+		return -ENOSPC;
+
+	file = fget(fd);
+	if (!file)
+		return -EBADF;
+
+	err = luo_alloc_files_mem(file_set);
+	if (err)
+		goto err_fput;
+
+	err = -ENOENT;
+	luo_list_for_each_private(fh, &luo_file_handler_list, list) {
+		if (fh->ops->can_preserve(fh, file)) {
+			err = 0;
+			break;
+		}
+	}
+
+	/* err is still -ENOENT if no handler was found */
+	if (err)
+		goto err_free_files_mem;
+
+	luo_file = kzalloc(sizeof(*luo_file), GFP_KERNEL);
+	if (!luo_file) {
+		err = -ENOMEM;
+		goto err_free_files_mem;
+	}
+
+	luo_file->file = file;
+	luo_file->fh = fh;
+	luo_file->token = token;
+	luo_file->retrieved = false;
+	mutex_init(&luo_file->mutex);
+
+	args.handler = fh;
+	args.file = file;
+	err = fh->ops->preserve(&args);
+	if (err)
+		goto err_kfree;
+
+	luo_file->serialized_data = args.serialized_data;
+	list_add_tail(&luo_file->list, &file_set->files_list);
+	file_set->count++;
+
+	return 0;
+
+err_kfree:
+	kfree(luo_file);
+err_free_files_mem:
+	luo_free_files_mem(file_set);
+err_fput:
+	fput(file);
+
+	return err;
+}
+
+/**
+ * luo_file_unpreserve_files - Unpreserves all files from a file_set.
+ * @file_set: The files to be cleaned up.
+ *
+ * This function serves as the primary cleanup path for a file_set. It is
+ * invoked when the userspace agent closes the file_set's file descriptor.
+ *
+ * For each file, it performs the following cleanup actions:
+ *   1. Calls the handler's .unpreserve() callback to allow the handler to
+ *      release any resources it allocated.
+ *   2. Removes the file from the file_set's internal tracking list.
+ *   3. Releases the reference to the 'struct file' that was taken by
+ *      luo_preserve_file() via fput(), returning ownership.
+ *   4. Frees the memory associated with the internal 'struct luo_file'.
+ *
+ * After all individual files are unpreserved, it frees the contiguous memory
+ * block that was allocated to hold their serialization data.
+ */
+void luo_file_unpreserve_files(struct luo_file_set *file_set)
+{
+	struct luo_file *luo_file;
+
+	while (!list_empty(&file_set->files_list)) {
+		struct liveupdate_file_op_args args = {0};
+
+		luo_file = list_last_entry(&file_set->files_list,
+					   struct luo_file, list);
+
+		args.handler = luo_file->fh;
+		args.file = luo_file->file;
+		args.serialized_data = luo_file->serialized_data;
+		luo_file->fh->ops->unpreserve(&args);
+
+		list_del(&luo_file->list);
+		file_set->count--;
+
+		fput(luo_file->file);
+		mutex_destroy(&luo_file->mutex);
+		kfree(luo_file);
+	}
+
+	luo_free_files_mem(file_set);
+}
+
+static int luo_file_freeze_one(struct luo_file_set *file_set,
+			       struct luo_file *luo_file)
+{
+	int err = 0;
+
+	guard(mutex)(&luo_file->mutex);
+
+	if (luo_file->fh->ops->freeze) {
+		struct liveupdate_file_op_args args = {0};
+
+		args.handler = luo_file->fh;
+		args.file = luo_file->file;
+		args.serialized_data = luo_file->serialized_data;
+
+		err = luo_file->fh->ops->freeze(&args);
+		if (!err)
+			luo_file->serialized_data = args.serialized_data;
+	}
+
+	return err;
+}
+
+static void luo_file_unfreeze_one(struct luo_file_set *file_set,
+				  struct luo_file *luo_file)
+{
+	guard(mutex)(&luo_file->mutex);
+
+	if (luo_file->fh->ops->unfreeze) {
+		struct liveupdate_file_op_args args = {0};
+
+		args.handler = luo_file->fh;
+		args.file = luo_file->file;
+		args.serialized_data = luo_file->serialized_data;
+
+		luo_file->fh->ops->unfreeze(&args);
+	}
+
+	luo_file->serialized_data = 0;
+}
+
+static void __luo_file_unfreeze(struct luo_file_set *file_set,
+				struct luo_file *failed_entry)
+{
+	struct list_head *files_list = &file_set->files_list;
+	struct luo_file *luo_file;
+
+	list_for_each_entry(luo_file, files_list, list) {
+		if (luo_file == failed_entry)
+			break;
+
+		luo_file_unfreeze_one(file_set, luo_file);
+	}
+
+	memset(file_set->files, 0, LUO_FILE_PGCNT << PAGE_SHIFT);
+}
+
+/**
+ * luo_file_freeze - Freezes all preserved files and serializes their metadata.
+ * @file_set:     The file_set whose files are to be frozen.
+ * @file_set_ser: Where to put the serialized file_set.
+ *
+ * This function is called from the reboot() syscall path, just before the
+ * kernel transitions to the new image via kexec. Its purpose is to perform the
+ * final preparation and serialization of all preserved files in the file_set.
+ *
+ * It iterates through each preserved file in FIFO order (the order of
+ * preservation) and performs two main actions:
+ *
+ * 1. Freezes the File: It calls the handler's .freeze() callback for each
+ *    file. This gives the handler a final opportunity to quiesce the device or
+ *    prepare its state for the upcoming reboot. The handler may update its
+ *    private data handle during this step.
+ *
+ * 2. Serializes Metadata: After a successful freeze, it copies the final file
+ *    metadata (the handler's compatible string, the user token, and the final
+ *    private data handle) into the pre-allocated contiguous memory buffer
+ *    (file_set->files) that will be handed over to the next kernel via KHO.
+ *
+ * Error Handling (Rollback):
+ * This function is atomic. If any handler's .freeze() operation fails, the
+ * entire live update is aborted. The __luo_file_unfreeze() helper is
+ * immediately called to invoke the .unfreeze() op on all files that were
+ * successfully frozen before the point of failure, rolling them back to a
+ * running state. The function then returns an error, causing the reboot()
+ * syscall to fail.
+ *
+ * Context: Called only from the liveupdate_reboot() path.
+ * Return: 0 on success, or a negative errno on failure.
+ */
+int luo_file_freeze(struct luo_file_set *file_set,
+		    struct luo_file_set_ser *file_set_ser)
+{
+	struct luo_file_ser *file_ser = file_set->files;
+	struct luo_file *luo_file;
+	int err;
+	int i;
+
+	if (!file_set->count)
+		return 0;
+
+	if (WARN_ON(!file_ser))
+		return -EINVAL;
+
+	i = 0;
+	list_for_each_entry(luo_file, &file_set->files_list, list) {
+		err = luo_file_freeze_one(file_set, luo_file);
+		if (err < 0) {
+			pr_warn("Freeze failed for token[%#0llx] handler[%s] err[%pe]\n",
+				luo_file->token, luo_file->fh->compatible,
+				ERR_PTR(err));
+			goto err_unfreeze;
+		}
+
+		strscpy(file_ser[i].compatible, luo_file->fh->compatible,
+			sizeof(file_ser[i].compatible));
+		file_ser[i].data = luo_file->serialized_data;
+		file_ser[i].token = luo_file->token;
+		i++;
+	}
+
+	file_set_ser->count = file_set->count;
+	if (file_set->files)
+		file_set_ser->files = virt_to_phys(file_set->files);
+
+	return 0;
+
+err_unfreeze:
+	__luo_file_unfreeze(file_set, luo_file);
+
+	return err;
+}
+
+/**
+ * luo_file_unfreeze - Unfreezes all files in a file_set and clears serialization
+ * @file_set:     The file_set whose files are to be unfrozen.
+ * @file_set_ser: Serialized file_set.
+ *
+ * This function rolls back the state of all files in a file_set after the
+ * freeze phase has begun but must be aborted. It is the counterpart to
+ * luo_file_freeze().
+ *
+ * It invokes the __luo_file_unfreeze() helper with a NULL argument, which
+ * signals the helper to iterate through all files in the file_set and call
+ * their respective .unfreeze() handler callbacks.
+ *
+ * Context: This is called when the live update is aborted during
+ *          the reboot() syscall, after luo_file_freeze() has been called.
+ */
+void luo_file_unfreeze(struct luo_file_set *file_set,
+		       struct luo_file_set_ser *file_set_ser)
+{
+	if (!file_set->count)
+		return;
+
+	__luo_file_unfreeze(file_set, NULL);
+	memset(file_set_ser, 0, sizeof(*file_set_ser));
+}
+
+/**
+ * luo_retrieve_file - Restores a preserved file from a file_set by its token.
+ * @file_set: The file_set from which to retrieve the file.
+ * @token:    The unique token identifying the file to be restored.
+ * @filep:    Output parameter; on success, this is populated with a pointer
+ *            to the newly retrieved 'struct file'.
+ *
+ * This function is the primary mechanism for recreating a file in the new
+ * kernel after a live update. It searches the file_set's list of deserialized
+ * files for an entry matching the provided @token.
+ *
+ * The operation is idempotent: if a file has already been successfully
+ * retrieved, this function will simply return a pointer to the existing
+ * 'struct file' and report success without re-executing the retrieve
+ * operation. This is handled by checking the 'retrieved' flag under a lock.
+ *
+ * File retrieval can happen in any order; it is not bound by the order of
+ * preservation.
+ *
+ * Context: Can be called from an ioctl or other in-kernel code in the new
+ *          kernel.
+ * Return: 0 on success. Returns a negative errno on failure:
+ *         -ENOENT if no file with the matching token is found.
+ *         Any error code returned by the handler's .retrieve() op.
+ */
+int luo_retrieve_file(struct luo_file_set *file_set, u64 token,
+		      struct file **filep)
+{
+	struct liveupdate_file_op_args args = {0};
+	struct luo_file *luo_file;
+	int err;
+
+	if (list_empty(&file_set->files_list))
+		return -ENOENT;
+
+	list_for_each_entry(luo_file, &file_set->files_list, list) {
+		if (luo_file->token == token)
+			break;
+	}
+
+	if (luo_file->token != token)
+		return -ENOENT;
+
+	guard(mutex)(&luo_file->mutex);
+	if (luo_file->retrieved) {
+		/*
+		 * Someone is asking for this file again, so get a reference
+		 * for them.
+		 */
+		get_file(luo_file->file);
+		*filep = luo_file->file;
+		return 0;
+	}
+
+	args.handler = luo_file->fh;
+	args.serialized_data = luo_file->serialized_data;
+	err = luo_file->fh->ops->retrieve(&args);
+	if (!err) {
+		luo_file->file = args.file;
+
+		/* Get reference so we can keep this file in LUO until finish */
+		get_file(luo_file->file);
+		*filep = luo_file->file;
+		luo_file->retrieved = true;
+	}
+
+	return err;
+}
+
+static int luo_file_can_finish_one(struct luo_file_set *file_set,
+				   struct luo_file *luo_file)
+{
+	bool can_finish = true;
+
+	guard(mutex)(&luo_file->mutex);
+
+	if (luo_file->fh->ops->can_finish) {
+		struct liveupdate_file_op_args args = {0};
+
+		args.handler = luo_file->fh;
+		args.file = luo_file->file;
+		args.serialized_data = luo_file->serialized_data;
+		args.retrieved = luo_file->retrieved;
+		can_finish = luo_file->fh->ops->can_finish(&args);
+	}
+
+	return can_finish ? 0 : -EBUSY;
+}
+
+static void luo_file_finish_one(struct luo_file_set *file_set,
+				struct luo_file *luo_file)
+{
+	struct liveupdate_file_op_args args = {0};
+
+	guard(mutex)(&luo_file->mutex);
+
+	args.handler = luo_file->fh;
+	args.file = luo_file->file;
+	args.serialized_data = luo_file->serialized_data;
+	args.retrieved = luo_file->retrieved;
+
+	luo_file->fh->ops->finish(&args);
+}
+
+/**
+ * luo_file_finish - Completes the lifecycle for all files in a file_set.
+ * @file_set: The file_set to be finalized.
+ *
+ * This function orchestrates the final teardown of a live update file_set in
+ * the new kernel. It should be called after all necessary files have been
+ * retrieved and the userspace agent is ready to release the preserved state.
+ *
+ * The function iterates through all tracked files. For each file, it performs
+ * the following sequence of cleanup actions:
+ *
+ * 1. Calls can_finish() on every file in the file_set, whether or not the
+ *    file has been retrieved. If all can_finish() calls return true,
+ *    continues to finish.
+ * 2. Calls the handler's .finish() callback (via luo_file_finish_one) to
+ *    allow for final resource cleanup within the handler.
+ * 3. Releases LUO's ownership reference on the 'struct file' via fput(). This
+ *    is the counterpart to the get_file() call in luo_retrieve_file().
+ * 4. Removes the 'struct luo_file' from the file_set's internal list.
+ * 5. Frees the memory for the 'struct luo_file' instance itself.
+ *
+ * After successfully finishing all individual files, it frees the
+ * contiguous memory block that was used to transfer the serialized metadata
+ * from the previous kernel.
+ *
+ * Error Handling (Atomic Failure):
+ * This operation is atomic. If any handler's .can_finish() op fails, the entire
+ * function aborts immediately and returns an error.
+ *
+ * Context: Can be called from an ioctl handler in the new kernel.
+ * Return: 0 on success, or a negative errno on failure.
+ */
+int luo_file_finish(struct luo_file_set *file_set)
+{
+	struct list_head *files_list = &file_set->files_list;
+	struct luo_file *luo_file;
+	int err;
+
+	if (!file_set->count)
+		return 0;
+
+	list_for_each_entry(luo_file, files_list, list) {
+		err = luo_file_can_finish_one(file_set, luo_file);
+		if (err)
+			return err;
+	}
+
+	while (!list_empty(&file_set->files_list)) {
+		luo_file = list_last_entry(&file_set->files_list,
+					   struct luo_file, list);
+
+		luo_file_finish_one(file_set, luo_file);
+
+		if (luo_file->file)
+			fput(luo_file->file);
+		list_del(&luo_file->list);
+		file_set->count--;
+		mutex_destroy(&luo_file->mutex);
+		kfree(luo_file);
+	}
+
+	if (file_set->files) {
+		kho_restore_free(file_set->files);
+		file_set->files = NULL;
+	}
+
+	return 0;
+}
+
+/**
+ * luo_file_deserialize - Reconstructs the list of preserved files in the new kernel.
+ * @file_set:     The incoming file_set to fill with deserialized data.
+ * @file_set_ser: Serialized KHO file_set data from the previous kernel.
+ *
+ * This function is called during the early boot process of the new kernel. It
+ * takes the raw, contiguous memory block of 'struct luo_file_ser' entries,
+ * provided by the previous kernel, and transforms it back into a live,
+ * in-memory linked list of 'struct luo_file' instances.
+ *
+ * For each serialized entry, it performs the following steps:
+ *   1. Reads the 'compatible' string.
+ *   2. Searches the global list of registered file handlers for one that
+ *      matches the compatible string.
+ *   3. Allocates a new 'struct luo_file'.
+ *   4. Populates the new structure with the deserialized data (token, private
+ *      data handle) and links it to the found handler. The 'file' pointer is
+ *      initialized to NULL, as the file has not been retrieved yet.
+ *   5. Adds the new 'struct luo_file' to the file_set's files_list.
+ *
+ * This prepares the file_set for userspace, which can later call
+ * luo_retrieve_file() to restore the actual file descriptors.
+ *
+ * Context: Called from session deserialization.
+ */
+int luo_file_deserialize(struct luo_file_set *file_set,
+			 struct luo_file_set_ser *file_set_ser)
+{
+	struct luo_file_ser *file_ser;
+	u64 i;
+
+	if (!file_set_ser->files) {
+		WARN_ON(file_set_ser->count);
+		return 0;
+	}
+
+	file_set->count = file_set_ser->count;
+	file_set->files = phys_to_virt(file_set_ser->files);
+
+	/*
+	 * Note on error handling:
+	 *
+	 * If deserialization fails (e.g., allocation failure or corrupt data),
+	 * we intentionally skip cleanup of files that were already restored.
+	 *
+	 * A partial failure leaves the preserved state inconsistent.
+	 * Implementing a safe "undo" to unwind complex dependencies (sessions,
+	 * files, hardware state) is error-prone and provides little value, as
+	 * the system is effectively in a broken state.
+	 *
+	 * We treat these resources as leaked. The expected recovery path is for
+	 * userspace to detect the failure and trigger a reboot, which will
+	 * reliably reset devices and reclaim memory.
+	 */
+	file_ser = file_set->files;
+	for (i = 0; i < file_set->count; i++) {
+		struct liveupdate_file_handler *fh;
+		bool handler_found = false;
+		struct luo_file *luo_file;
+
+		luo_list_for_each_private(fh, &luo_file_handler_list, list) {
+			if (!strcmp(fh->compatible, file_ser[i].compatible)) {
+				handler_found = true;
+				break;
+			}
+		}
+
+		if (!handler_found) {
+			pr_warn("No registered handler for compatible '%s'\n",
+				file_ser[i].compatible);
+			return -ENOENT;
+		}
+
+		luo_file = kzalloc(sizeof(*luo_file), GFP_KERNEL);
+		if (!luo_file)
+			return -ENOMEM;
+
+		luo_file->fh = fh;
+		luo_file->file = NULL;
+		luo_file->serialized_data = file_ser[i].data;
+		luo_file->token = file_ser[i].token;
+		luo_file->retrieved = false;
+		mutex_init(&luo_file->mutex);
+		list_add_tail(&luo_file->list, &file_set->files_list);
+	}
+
+	return 0;
+}
+
+void luo_file_set_init(struct luo_file_set *file_set)
+{
+	INIT_LIST_HEAD(&file_set->files_list);
+}
+
+void luo_file_set_destroy(struct luo_file_set *file_set)
+{
+	WARN_ON(file_set->count);
+	WARN_ON(!list_empty(&file_set->files_list));
+}
+
+/**
+ * liveupdate_register_file_handler - Register a file handler with LUO.
+ * @fh: Pointer to a caller-allocated &struct liveupdate_file_handler.
+ * The caller must initialize this structure, including a unique
+ * 'compatible' string and valid 'ops' callbacks. This function adds the
+ * handler to the global list of supported file handlers.
+ *
+ * Context: Typically called during module initialization for file types that
+ * support live update preservation.
+ *
+ * Return: 0 on success. Negative errno on failure.
+ */
+int liveupdate_register_file_handler(struct liveupdate_file_handler *fh)
+{
+	struct liveupdate_file_handler *fh_iter;
+	int err;
+
+	if (!liveupdate_enabled())
+		return -EOPNOTSUPP;
+
+	/* Sanity check that all required callbacks are set */
+	if (!fh->ops->preserve || !fh->ops->unpreserve || !fh->ops->retrieve ||
+	    !fh->ops->finish || !fh->ops->can_preserve) {
+		return -EINVAL;
+	}
+
+	/*
+	 * Ensure the system is quiescent (no active sessions).
+	 * This prevents registering new handlers while sessions are active or
+	 * while deserialization is in progress.
+	 */
+	if (!luo_session_quiesce())
+		return -EBUSY;
+
+	/* Check for duplicate compatible strings */
+	luo_list_for_each_private(fh_iter, &luo_file_handler_list, list) {
+		if (!strcmp(fh_iter->compatible, fh->compatible)) {
+			pr_err("File handler registration failed: Compatible string '%s' already registered.\n",
+			       fh->compatible);
+			err = -EEXIST;
+			goto err_resume;
+		}
+	}
+
+	/* Pin the module implementing the handler */
+	if (!try_module_get(fh->ops->owner)) {
+		err = -EAGAIN;
+		goto err_resume;
+	}
+
+	INIT_LIST_HEAD(&ACCESS_PRIVATE(fh, list));
+	list_add_tail(&ACCESS_PRIVATE(fh, list), &luo_file_handler_list);
+	luo_session_resume();
+
+	return 0;
+
+err_resume:
+	luo_session_resume();
+	return err;
+}
+
+/**
+ * liveupdate_unregister_file_handler - Unregister a liveupdate file handler
+ * @fh: The file handler to unregister
+ *
+ * Unregisters the file handler from the liveupdate core. This function
+ * reverses the operations of liveupdate_register_file_handler().
+ *
+ * It ensures safe removal by checking that:
+ * No live update session is currently in progress.
+ *
+ * If the unregistration fails, the internal test state is reverted.
+ *
+ * Return: 0 on success, -EOPNOTSUPP when live update is not enabled, or -EBUSY
+ * when a live update is in progress and sessions cannot be quiesced.
+ */
+int liveupdate_unregister_file_handler(struct liveupdate_file_handler *fh)
+{
+	if (!liveupdate_enabled())
+		return -EOPNOTSUPP;
+
+	if (!luo_session_quiesce())
+		return -EBUSY;
+
+	list_del(&ACCESS_PRIVATE(fh, list));
+	module_put(fh->ops->owner);
+	luo_session_resume();
+
+	return 0;
+}
diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_internal.h
index 1292ac47eef8..c8973b543d1d 100644
--- a/kernel/liveupdate/luo_internal.h
+++ b/kernel/liveupdate/luo_internal.h
@@ -40,6 +40,28 @@ static inline int luo_ucmd_respond(struct luo_ucmd *ucmd,
  */
 #define luo_restore_fail(__fmt, ...) panic(__fmt, ##__VA_ARGS__)
 
+/* Mimics list_for_each_entry() but for private list head entries */
+#define luo_list_for_each_private(pos, head, member)				\
+	for (struct list_head *__iter = (head)->next;				\
+	     __iter != (head) &&						\
+	     ({ pos = container_of(__iter, typeof(*(pos)), member); 1; });	\
+	     __iter = __iter->next)
+
+/**
+ * struct luo_file_set - A set of files that belong to the same session.
+ * @files_list: An ordered list of files associated with this session, it is
+ *              ordered by preservation time.
+ * @files:      The physically contiguous memory block that holds the serialized
+ *              state of files.
+ * @count:      A counter tracking the number of files currently stored in the
+ *              @files_list for this session.
+ */
+struct luo_file_set {
+	struct list_head files_list;
+	struct luo_file_ser *files;
+	long count;
+};
+
 /**
  * struct luo_session - Represents an active or incoming Live Update session.
  * @name:       A unique name for this session, used for identification and
@@ -50,6 +72,7 @@ static inline int luo_ucmd_respond(struct luo_ucmd *ucmd,
  *              previous kernel) sessions.
  * @retrieved:  A boolean flag indicating whether this session has been
  *              retrieved by a consumer in the new kernel.
+ * @file_set:   A set of files that belong to this session.
  * @mutex:      protects fields in the luo_session.
  */
 struct luo_session {
@@ -57,6 +80,7 @@ struct luo_session {
 	struct luo_session_ser *ser;
 	struct list_head list;
 	bool retrieved;
+	struct luo_file_set file_set;
 	struct mutex mutex;
 };
 
@@ -69,4 +93,18 @@ int luo_session_deserialize(void);
 bool luo_session_quiesce(void);
 void luo_session_resume(void);
 
+int luo_preserve_file(struct luo_file_set *file_set, u64 token, int fd);
+void luo_file_unpreserve_files(struct luo_file_set *file_set);
+int luo_file_freeze(struct luo_file_set *file_set,
+		    struct luo_file_set_ser *file_set_ser);
+void luo_file_unfreeze(struct luo_file_set *file_set,
+		       struct luo_file_set_ser *file_set_ser);
+int luo_retrieve_file(struct luo_file_set *file_set, u64 token,
+		      struct file **filep);
+int luo_file_finish(struct luo_file_set *file_set);
+int luo_file_deserialize(struct luo_file_set *file_set,
+			 struct luo_file_set_ser *file_set_ser);
+void luo_file_set_init(struct luo_file_set *file_set);
+void luo_file_set_destroy(struct luo_file_set *file_set);
+
 #endif /* _LINUX_LUO_INTERNAL_H */
-- 
2.52.0.460.gd25c4c69ec-goog

From nobody Tue Dec 2 00:04:03 2025
Received: from mail-yw1-f171.google.com (mail-yw1-f171.google.com [209.85.128.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5509D32C33C for ; Tue, 25 Nov 2025 16:59:12 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.171
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089954; cv=none; b=L7rP5uHQ2zxczY3eiG6GdMvA5ol+4x+WgenOlWIWxPOluLP8wvuOSiMMyiznyCB36Q6kidb0qvNSj78cM1WcqDp6wJ3p1MB6sIjR/GS8WqTS69oYgABQPNODHCgvR84jYajVVRAx75TCRtiuiWcFfZtWBWgpQBFcZKRgSCSD+nI=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089954; c=relaxed/simple; bh=Cbc3NGxw4p43iE18wWJ6M1vSK4ZtypGbaDU6K1as4Zs=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version;
b=LzTGTWCan8TUTTN7W1OzcAyS2ukR91rLUOXsRjuJCSVGGitGrSX3iy5vhdioqDjMYAcY5ibFh3qo++oMLVi+8BDlFr6finr62qI3t4MZnIh7F8ehfBHCzUUJfGcA+K0JLnZrq403k63UFMpc3QodNlUkpBOSPaGKI2DUt5MCrug= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=kwKMZXOz; arc=none smtp.client-ip=209.85.128.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="kwKMZXOz" Received: by mail-yw1-f171.google.com with SMTP id 00721157ae682-78a7af9ff1dso56808377b3.1 for ; Tue, 25 Nov 2025 08:59:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1764089951; x=1764694751; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=KpceFFbzwkhlrhy2Jm5dMG8T2+lFVQ4MvwlS6QKhInw=; b=kwKMZXOzeK+iKAk/hCKYREHQ9j1WU4bjUMLwzsF/5rTkvzcO7kO77bKqckfn12z+6g 9nkTFvLM9f7ZJ3cyp5H/IjOpZ1T9uXeE55Ankt4nF3hpbMN68UYVGuaVjCe/jqkm2r1E VK7k6VHvtH0jBoFUO1surWSEZDTjr6qKeP0aRSx7PZ97H0sJV4EwGys4xm7vIxGrGwIW UF/k2Hur80EDr5WAM5grxePlFot4iLKJSJpbUrTPSIBE2AiRBoXGtkS8FYu4Uf0CYFc3 vV14/Jmaabm5UzvqK4Nx3w5XGWgRB7E4BuMrQ3sSQ1WkWzeDQAa659qhfcosPwaijnmG WlQw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764089951; x=1764694751; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-gg:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=KpceFFbzwkhlrhy2Jm5dMG8T2+lFVQ4MvwlS6QKhInw=; b=cCr4nS0ysWeXT2tGhSd+o87mj94EE7qm890n0RWRfbNhaVWmE/eZNeD5tGoujGD7/M 
KuzJlEcRR+OBLVieqeqpn9AYNywfLsovM6rMP/g10Qdu9+eRfPvkX/vMuFA8L6hCrbT8 9EBZVWAcRHs2v9jcVM7ZVY6Vxd4Qqm2Grr/cJJl0XrcT2l07di9WAcMaR/d9t9bWqQ4h V19NWuEpU2eZN2Jy/XBsSp1M3Xjjgc5c+x7QEOMmOMkoFhoCyrCB0YcohZ7f6sa2KS1O veayLRQKefLJQZ7o2BTsstTRQctk3yoXiyZ2qCT6elGLLRg1ICVf4RXVFQbHJfdohFJh V6Fg== X-Forwarded-Encrypted: i=1; AJvYcCV0AAkzYba3U7PBXDtOdYcA/RLt8vp9/hP5kx/8Qg2iPprDXCq/U6bu++CCm4OO9bM5mAGCbnscgtdDPoE=@vger.kernel.org X-Gm-Message-State: AOJu0Yw/yHdu8X2rLKb2KE44U4pFdyBWMNiq2b7H2JzeciUbxmuliQXt jh2Pn0p6ZQBtklpeXxlOTZdQDNZMk6VvvJC3pvg7B6jX/ioQ+MjpxhiOkBGnLh2aGJA= X-Gm-Gg: ASbGncttaWQ/HM3DxtLhlk81eMU81hOYfEKgnx/8qpSlpVfmSzJyift2L5Tjy3FNNFd w47P1LXQ38evThUIApKu+eimT1/0OOQkJyvcblJOdhiqh/GUL4Fh/9FbY8J+OWx9jN5sk5NXbie cXFBR4G4zVlG0ihJi0BBk7zXvGeaz3IeGq++bh4SkjKCiQfjpnMWJHg1CHdlP6aj+LxlMf2zHEk ixCBfoSPh6YJi8DVGUAQUJ8PlwhP/b06iXpHKZsrFLjsNc/7UFNWFrf6rQZoab56SiZZa6oSYiA zaDUvZEK1CPo8xnSfuv6hTkE+yPGQ8Fw8crdAyosmHp2Fc3clvAH4o1lPm685aH00MgRi9nJToD h9AvVFePUx1l32suuqkmoUU49dxCMQm3tDwVA1pPCsJfEFGWAeKZnFx12LROvSLz2uCYs1zt5c/ LWUGgNGulU1sX6s8mj8Yu/I5itIl9k4yv+iZ3lJIRl6KhT7yTb1b5xB9QlaCvp6UgwUh30mFME1 yQ= X-Google-Smtp-Source: AGHT+IFu8J5gRTX3GIgwTSli4sL6rf21GHMjG3PjVUXauq1tChQ7CPdMv6IXpi0oqNufVhgEK9FyzA== X-Received: by 2002:a05:690c:6707:b0:789:6c45:5df with SMTP id 00721157ae682-78a8b472270mr127798697b3.23.1764089949946; Tue, 25 Nov 2025 08:59:09 -0800 (PST) Received: from soleen.c.googlers.com.com (182.221.85.34.bc.googleusercontent.com. 
[34.85.221.182]) by smtp.gmail.com with ESMTPSA id 00721157ae682-78a798a5518sm57284357b3.14.2025.11.25.08.59.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 08:59:09 -0800 (PST) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org Subject: [PATCH v8 07/18] liveupdate: luo_session: Add ioctls for file preservation Date: Tue, 25 Nov 2025 11:58:37 -0500 Message-ID: 
 <20251125165850.3389713-8-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.52.0.460.gd25c4c69ec-goog
In-Reply-To: <20251125165850.3389713-1-pasha.tatashin@soleen.com>
References: <20251125165850.3389713-1-pasha.tatashin@soleen.com>

Introduce the userspace interface and internal logic required to manage
the lifecycle of file descriptors within a session. Previously, a session
was merely a container; this change makes it a functional management unit.

A new set of ioctl commands is added. They operate on the session file
descriptor returned by CREATE_SESSION and allow userspace to:

- LIVEUPDATE_SESSION_PRESERVE_FD: Add a file descriptor to a session so
  that it is preserved across the live update.
- LIVEUPDATE_SESSION_RETRIEVE_FD: Retrieve a preserved file in the new
  kernel using its unique token.
- LIVEUPDATE_SESSION_FINISH: Signal that restoration of a retrieved
  session is complete, so the kernel can release the preserved resources.

The session's .release handler is now state-aware. When a session's file
descriptor is closed, a retrieved (incoming) session is finished, while an
outgoing session has its files unpreserved, before the associated file
resources are freed.
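
For illustration only, the calling convention of the new session ioctls
can be sketched from userspace as below. This is a minimal sketch, not
part of the patch: the structs are local mirrors of the uAPI definitions
added here (real code would include <linux/liveupdate.h>), the helper
names luo_fill_preserve(), luo_preserve_fd() and luo_retrieve_fd() are
hypothetical, and the ioctl request code is passed in as a parameter
because the value of LIVEUPDATE_IOCTL_TYPE is not shown in this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Local mirrors of the uAPI structs from this patch (assumption: same layout). */
struct liveupdate_session_preserve_fd {
	uint32_t size;
	int32_t fd;
	uint64_t token __attribute__((aligned(8)));
};

struct liveupdate_session_retrieve_fd {
	uint32_t size;
	int32_t fd;
	uint64_t token __attribute__((aligned(8)));
};

/* Both structs are expected to be 16 bytes with no padding. */
static_assert(sizeof(struct liveupdate_session_preserve_fd) == 16,
	      "unexpected preserve_fd layout");
static_assert(sizeof(struct liveupdate_session_retrieve_fd) == 16,
	      "unexpected retrieve_fd layout");

/* Hypothetical helper: fill the argument block, with @size set first. */
void luo_fill_preserve(struct liveupdate_session_preserve_fd *arg,
		       int fd, uint64_t token)
{
	memset(arg, 0, sizeof(*arg));
	arg->size = sizeof(*arg);	/* the kernel reads this u32 first */
	arg->fd = fd;
	arg->token = token;
}

/*
 * Hypothetical wrapper: @request stands in for
 * LIVEUPDATE_SESSION_PRESERVE_FD from the real header.
 * Returns 0 on success, -1 with errno set on failure.
 */
int luo_preserve_fd(int session_fd, unsigned long request,
		    int fd, uint64_t token)
{
	struct liveupdate_session_preserve_fd arg;

	luo_fill_preserve(&arg, fd, token);
	return ioctl(session_fd, request, &arg);
}

/*
 * Hypothetical wrapper: @request stands in for
 * LIVEUPDATE_SESSION_RETRIEVE_FD. Returns the new fd written back by
 * the kernel, or -1 with errno set on failure.
 */
int luo_retrieve_fd(int session_fd, unsigned long request, uint64_t token)
{
	struct liveupdate_session_retrieve_fd arg;

	memset(&arg, 0, sizeof(arg));
	arg.size = sizeof(arg);
	arg.fd = -1;			/* output field, filled by the kernel */
	arg.token = token;

	if (ioctl(session_fd, request, &arg) < 0)
		return -1;
	return arg.fd;
}
```

The token passed to the preserve call is the one userspace has to present
again to the retrieve call in the new kernel, so a caller would normally
record it alongside whatever state it needs to rebuild after the update.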

Signed-off-by: Pasha Tatashin
Reviewed-by: Pratyush Yadav
Reviewed-by: Mike Rapoport (Microsoft)
Tested-by: David Matlack
---
 include/uapi/linux/liveupdate.h | 103 ++++++++++++++++++
 kernel/liveupdate/luo_session.c | 187 +++++++++++++++++++++++++++++++-
 2 files changed, 288 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/liveupdate.h b/include/uapi/linux/liveupdate.h
index 1183cf984b5f..30bc66ee9436 100644
--- a/include/uapi/linux/liveupdate.h
+++ b/include/uapi/linux/liveupdate.h
@@ -53,6 +53,14 @@ enum {
 	LIVEUPDATE_CMD_RETRIEVE_SESSION = 0x01,
 };
 
+/* ioctl commands for session file descriptors */
+enum {
+	LIVEUPDATE_CMD_SESSION_BASE = 0x40,
+	LIVEUPDATE_CMD_SESSION_PRESERVE_FD = LIVEUPDATE_CMD_SESSION_BASE,
+	LIVEUPDATE_CMD_SESSION_RETRIEVE_FD = 0x41,
+	LIVEUPDATE_CMD_SESSION_FINISH = 0x42,
+};
+
 /**
  * struct liveupdate_ioctl_create_session - ioctl(LIVEUPDATE_IOCTL_CREATE_SESSION)
  * @size: Input; sizeof(struct liveupdate_ioctl_create_session)
@@ -110,4 +118,99 @@ struct liveupdate_ioctl_retrieve_session {
 #define LIVEUPDATE_IOCTL_RETRIEVE_SESSION \
 	_IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_RETRIEVE_SESSION)
 
+/* Session specific IOCTLs */
+
+/**
+ * struct liveupdate_session_preserve_fd - ioctl(LIVEUPDATE_SESSION_PRESERVE_FD)
+ * @size:  Input; sizeof(struct liveupdate_session_preserve_fd)
+ * @fd:    Input; The user-space file descriptor to be preserved.
+ * @token: Input; An opaque, unique token for preserved resource.
+ *
+ * Holds parameters for preserving a file descriptor.
+ *
+ * User sets the @fd field identifying the file descriptor to preserve
+ * (e.g., memfd, kvm, iommufd, VFIO). The kernel validates if this FD type
+ * and its dependencies are supported for preservation. If validation passes,
+ * the kernel marks the FD internally and *initiates the process* of preparing
+ * its state for saving. The actual snapshotting of the state typically occurs
+ * during the subsequent %LIVEUPDATE_IOCTL_PREPARE execution phase, though
+ * some finalization might occur during freeze.
+ * On successful validation and initiation, the kernel uses the @token
+ * field with an opaque identifier representing the resource being preserved.
+ * This token confirms the FD is targeted for preservation and is required for
+ * the subsequent %LIVEUPDATE_SESSION_RETRIEVE_FD call after the live update.
+ *
+ * Return: 0 on success (validation passed, preservation initiated), negative
+ * error code on failure (e.g., unsupported FD type, dependency issue,
+ * validation failed).
+ */
+struct liveupdate_session_preserve_fd {
+	__u32		size;
+	__s32		fd;
+	__aligned_u64	token;
+};
+
+#define LIVEUPDATE_SESSION_PRESERVE_FD \
+	_IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_SESSION_PRESERVE_FD)
+
+/**
+ * struct liveupdate_session_retrieve_fd - ioctl(LIVEUPDATE_SESSION_RETRIEVE_FD)
+ * @size:  Input; sizeof(struct liveupdate_session_retrieve_fd)
+ * @fd:    Output; The new file descriptor representing the fully restored
+ *         kernel resource.
+ * @token: Input; An opaque token that was used to preserve the resource.
+ *
+ * Retrieve a previously preserved file descriptor.
+ *
+ * User sets the @token field to the value obtained from a successful
+ * %LIVEUPDATE_SESSION_PRESERVE_FD call before the live update. On success,
+ * the kernel restores the state (saved during the PREPARE/FREEZE phases)
+ * associated with the token and populates the @fd field with a new file
+ * descriptor referencing the restored resource in the current (new) kernel.
+ * This operation must be performed *before* signaling completion via
+ * %LIVEUPDATE_SESSION_FINISH.
+ *
+ * Return: 0 on success, negative error code on failure (e.g., invalid token).
+ */
+struct liveupdate_session_retrieve_fd {
+	__u32		size;
+	__s32		fd;
+	__aligned_u64	token;
+};
+
+#define LIVEUPDATE_SESSION_RETRIEVE_FD \
+	_IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_SESSION_RETRIEVE_FD)
+
+/**
+ * struct liveupdate_session_finish - ioctl(LIVEUPDATE_SESSION_FINISH)
+ * @size:     Input; sizeof(struct liveupdate_session_finish)
+ * @reserved: Input; Must be zero. Reserved for future use.
+ *
+ * Signals the completion of the restoration process for a retrieved session.
+ * This is the final operation that should be performed on a session file
+ * descriptor after a live update.
+ *
+ * This ioctl must be called once all required file descriptors for the session
+ * have been successfully retrieved (using %LIVEUPDATE_SESSION_RETRIEVE_FD) and
+ * are fully restored from the userspace and kernel perspective.
+ *
+ * Upon success, the kernel releases its ownership of the preserved resources
+ * associated with this session. This allows internal resources to be freed,
+ * typically by decrementing reference counts on the underlying preserved
+ * objects.
+ *
+ * If this operation fails, the resources remain preserved in memory. Userspace
+ * may attempt to call finish again. The resources will otherwise be reset
+ * during the next live update cycle.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+struct liveupdate_session_finish {
+	__u32		size;
+	__u32		reserved;
+};
+
+#define LIVEUPDATE_SESSION_FINISH \
+	_IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_SESSION_FINISH)
+
 #endif /* _UAPI_LIVEUPDATE_H */
diff --git a/kernel/liveupdate/luo_session.c b/kernel/liveupdate/luo_session.c
index 5829fe79896a..b08f5f329cee 100644
--- a/kernel/liveupdate/luo_session.c
+++ b/kernel/liveupdate/luo_session.c
@@ -125,6 +125,8 @@ static struct luo_session *luo_session_alloc(const char *name)
 		return ERR_PTR(-ENOMEM);
 
 	strscpy(session->name, name, sizeof(session->name));
+	INIT_LIST_HEAD(&session->file_set.files_list);
+	luo_file_set_init(&session->file_set);
 	INIT_LIST_HEAD(&session->list);
 	mutex_init(&session->mutex);
 
@@ -133,6 +135,7 @@ static struct luo_session *luo_session_alloc(const char *name)
 
 static void luo_session_free(struct luo_session *session)
 {
+	luo_file_set_destroy(&session->file_set);
 	mutex_destroy(&session->mutex);
 	kfree(session);
 }
@@ -177,16 +180,46 @@ static void luo_session_remove(struct luo_session_header *sh,
 	sh->count--;
 }
 
+static int luo_session_finish_one(struct luo_session *session)
+{
+	guard(mutex)(&session->mutex);
+	return luo_file_finish(&session->file_set);
+}
+
+static void luo_session_unfreeze_one(struct luo_session *session,
+				     struct luo_session_ser *ser)
+{
+	guard(mutex)(&session->mutex);
+	luo_file_unfreeze(&session->file_set, &ser->file_set_ser);
+}
+
+static int luo_session_freeze_one(struct luo_session *session,
+				  struct luo_session_ser *ser)
+{
+	guard(mutex)(&session->mutex);
+	return luo_file_freeze(&session->file_set, &ser->file_set_ser);
+}
+
 static int luo_session_release(struct inode *inodep, struct file *filep)
 {
 	struct luo_session *session = filep->private_data;
 	struct luo_session_header *sh;
 
 	/* If retrieved is set, it means this session is from incoming list */
-	if (session->retrieved)
+	if (session->retrieved) {
+		int err = luo_session_finish_one(session);
+
+		if (err) {
+			pr_warn("Unable to finish session [%s] on release\n",
+				session->name);
+			return err;
+		}
 		sh = &luo_session_global.incoming;
-	else
+	} else {
+		scoped_guard(mutex, &session->mutex)
+			luo_file_unpreserve_files(&session->file_set);
 		sh = &luo_session_global.outgoing;
+	}
 
 	luo_session_remove(sh, session);
 	luo_session_free(session);
@@ -194,9 +227,140 @@ static int luo_session_release(struct inode *inodep, struct file *filep)
 	return 0;
 }
 
+static int luo_session_preserve_fd(struct luo_session *session,
+				   struct luo_ucmd *ucmd)
+{
+	struct liveupdate_session_preserve_fd *argp = ucmd->cmd;
+	int err;
+
+	guard(mutex)(&session->mutex);
+	err = luo_preserve_file(&session->file_set, argp->token, argp->fd);
+	if (err)
+		return err;
+
+	err = luo_ucmd_respond(ucmd, sizeof(*argp));
+	if (err)
+		pr_warn("The file was successfully preserved, but response to user failed\n");
+
+	return err;
+}
+
+static int luo_session_retrieve_fd(struct luo_session *session,
+				   struct luo_ucmd *ucmd)
+{
+	struct liveupdate_session_retrieve_fd *argp = ucmd->cmd;
+	struct file *file;
+	int err;
+
+	argp->fd = get_unused_fd_flags(O_CLOEXEC);
+	if (argp->fd < 0)
+		return argp->fd;
+
+	guard(mutex)(&session->mutex);
+	err = luo_retrieve_file(&session->file_set, argp->token, &file);
+	if (err < 0)
+		goto err_put_fd;
+
+	err = luo_ucmd_respond(ucmd, sizeof(*argp));
+	if (err)
+		goto err_put_file;
+
+	fd_install(argp->fd, file);
+
+	return 0;
+
+err_put_file:
+	fput(file);
+err_put_fd:
+	put_unused_fd(argp->fd);
+
+	return err;
+}
+
+static int luo_session_finish(struct luo_session *session,
+			      struct luo_ucmd *ucmd)
+{
+	struct liveupdate_session_finish *argp = ucmd->cmd;
+	int err = luo_session_finish_one(session);
+
+	if (err)
+		return err;
+
+	return luo_ucmd_respond(ucmd, sizeof(*argp));
+}
+
+union ucmd_buffer {
+	struct liveupdate_session_finish finish;
+	struct liveupdate_session_preserve_fd preserve;
+	struct liveupdate_session_retrieve_fd retrieve;
+};
+
+struct luo_ioctl_op {
+	unsigned int size;
+	unsigned int min_size;
+	unsigned int ioctl_num;
+	int (*execute)(struct luo_session *session, struct luo_ucmd *ucmd);
+};
+
+#define IOCTL_OP(_ioctl, _fn, _struct, _last)					\
+	[_IOC_NR(_ioctl) - LIVEUPDATE_CMD_SESSION_BASE] = {			\
+		.size = sizeof(_struct) +					\
+			BUILD_BUG_ON_ZERO(sizeof(union ucmd_buffer) <		\
+					  sizeof(_struct)),			\
+		.min_size = offsetofend(_struct, _last),			\
+		.ioctl_num = _ioctl,						\
+		.execute = _fn,							\
+	}
+
+static const struct luo_ioctl_op luo_session_ioctl_ops[] = {
+	IOCTL_OP(LIVEUPDATE_SESSION_FINISH, luo_session_finish,
+		 struct liveupdate_session_finish, reserved),
+	IOCTL_OP(LIVEUPDATE_SESSION_PRESERVE_FD, luo_session_preserve_fd,
+		 struct liveupdate_session_preserve_fd, token),
+	IOCTL_OP(LIVEUPDATE_SESSION_RETRIEVE_FD, luo_session_retrieve_fd,
+		 struct liveupdate_session_retrieve_fd, token),
+};
+
+static long luo_session_ioctl(struct file *filep, unsigned int cmd,
+			      unsigned long arg)
+{
+	struct luo_session *session = filep->private_data;
+	const struct luo_ioctl_op *op;
+	struct luo_ucmd ucmd = {};
+	union ucmd_buffer buf;
+	unsigned int nr;
+	int ret;
+
+	nr = _IOC_NR(cmd);
+	if (nr < LIVEUPDATE_CMD_SESSION_BASE || (nr - LIVEUPDATE_CMD_SESSION_BASE) >=
+	    ARRAY_SIZE(luo_session_ioctl_ops)) {
+		return -EINVAL;
+	}
+
+	ucmd.ubuffer = (void __user *)arg;
+	ret = get_user(ucmd.user_size, (u32 __user *)ucmd.ubuffer);
+	if (ret)
+		return ret;
+
+	op = &luo_session_ioctl_ops[nr - LIVEUPDATE_CMD_SESSION_BASE];
+	if (op->ioctl_num != cmd)
+		return -ENOIOCTLCMD;
+	if (ucmd.user_size < op->min_size)
+		return -EINVAL;
+
+	ucmd.cmd = &buf;
+	ret = copy_struct_from_user(ucmd.cmd, op->size, ucmd.ubuffer,
+				    ucmd.user_size);
+	if (ret)
+		return ret;
+
+	return op->execute(session, &ucmd);
+}
+
 static const struct file_operations luo_session_fops = {
 	.owner = THIS_MODULE,
 	.release = luo_session_release,
+	.unlocked_ioctl = luo_session_ioctl,
 };
 
 /* Create a "struct file" for session */
@@ -392,6 +556,11 @@ int luo_session_deserialize(void)
 				session->name, ERR_PTR(err));
 			return err;
 		}
+
+		scoped_guard(mutex, &session->mutex) {
+			luo_file_deserialize(&session->file_set,
+					     &sh->ser[i].file_set_ser);
+		}
 	}
 
 	kho_restore_free(sh->header_ser);
@@ -406,9 +575,14 @@ int luo_session_serialize(void)
 	struct luo_session_header *sh = &luo_session_global.outgoing;
 	struct luo_session *session;
 	int i = 0;
+	int err;
 
 	guard(rwsem_write)(&sh->rwsem);
 	list_for_each_entry(session, &sh->list, list) {
+		err = luo_session_freeze_one(session, &sh->ser[i]);
+		if (err)
+			goto err_undo;
+
 		strscpy(sh->ser[i].name, session->name,
 			sizeof(sh->ser[i].name));
 		i++;
@@ -416,6 +590,15 @@ int luo_session_serialize(void)
 	sh->header_ser->count = sh->count;
 
 	return 0;
+
+err_undo:
+	list_for_each_entry_continue_reverse(session, &sh->list, list) {
+		i--;
+		luo_session_unfreeze_one(session, &sh->ser[i]);
+		memset(sh->ser[i].name, 0, sizeof(sh->ser[i].name));
+	}
+
+	return err;
 }
 
 /**
-- 
2.52.0.460.gd25c4c69ec-goog

From nobody Tue Dec 2 00:04:03 2025
From: Pasha Tatashin
Subject: [PATCH v8 08/18] docs: add luo documentation
Date: Tue, 25 Nov 2025 11:58:38 -0500
Message-ID: <20251125165850.3389713-9-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.52.0.460.gd25c4c69ec-goog
In-Reply-To: <20251125165850.3389713-1-pasha.tatashin@soleen.com>
References: <20251125165850.3389713-1-pasha.tatashin@soleen.com>

Add the documentation files for the Live Update Orchestrator

Signed-off-by: Pasha Tatashin
Reviewed-by: Mike Rapoport (Microsoft)
Reviewed-by: Pratyush Yadav
Tested-by: David Matlack
---
 Documentation/core-api/index.rst           |  1 +
 Documentation/core-api/liveupdate.rst      | 54 ++++++++++++++++++++++
 Documentation/userspace-api/index.rst      |  1 +
 Documentation/userspace-api/liveupdate.rst | 20 ++++++++
 4 files changed, 76 insertions(+)
 create mode 100644 Documentation/core-api/liveupdate.rst
 create mode 100644 Documentation/userspace-api/liveupdate.rst

diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst
index 6cbdcbfa79c3..5eb0fbbbc323 100644
--- a/Documentation/core-api/index.rst
+++ b/Documentation/core-api/index.rst
@@ -138,6 +138,7 @@ Documents that don't fit elsewhere or which have yet to be categorized.
    :maxdepth: 1
 
    librs
+   liveupdate
    netlink
 
 .. only:: subproject and html
diff --git a/Documentation/core-api/liveupdate.rst b/Documentation/core-api/liveupdate.rst
new file mode 100644
index 000000000000..cca1993008d8
--- /dev/null
+++ b/Documentation/core-api/liveupdate.rst
@@ -0,0 +1,54 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+========================
+Live Update Orchestrator
+========================
+:Author: Pasha Tatashin
+
+.. kernel-doc:: kernel/liveupdate/luo_core.c
+   :doc: Live Update Orchestrator (LUO)
+
+LUO Sessions
+============
+.. kernel-doc:: kernel/liveupdate/luo_session.c
+   :doc: LUO Sessions
+
+LUO Preserving File Descriptors
+===============================
+.. kernel-doc:: kernel/liveupdate/luo_file.c
+   :doc: LUO File Descriptors
+
+Live Update Orchestrator ABI
+============================
+.. kernel-doc:: include/linux/kho/abi/luo.h
+   :doc: Live Update Orchestrator ABI
+
+Public API
+==========
+.. kernel-doc:: include/linux/liveupdate.h
+
+.. kernel-doc:: include/linux/kho/abi/luo.h
+   :functions:
+
+.. kernel-doc:: kernel/liveupdate/luo_core.c
+   :export:
+
+.. kernel-doc:: kernel/liveupdate/luo_file.c
+   :export:
+
+Internal API
+============
+.. kernel-doc:: kernel/liveupdate/luo_core.c
+   :internal:
+
+.. kernel-doc:: kernel/liveupdate/luo_session.c
+   :internal:
+
+.. kernel-doc:: kernel/liveupdate/luo_file.c
+   :internal:
+
+See Also
+========
+
+- :doc:`Live Update uAPI `
+- :doc:`/core-api/kho/concepts`
diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst
index b8c73be4fb11..8a61ac4c1bf1 100644
--- a/Documentation/userspace-api/index.rst
+++ b/Documentation/userspace-api/index.rst
@@ -61,6 +61,7 @@ Everything else
    :maxdepth: 1
 
    ELF
+   liveupdate
    netlink/index
    sysfs-platform_profile
    vduse
diff --git a/Documentation/userspace-api/liveupdate.rst b/Documentation/userspace-api/liveupdate.rst
new file mode 100644
index 000000000000..41c0473e4f16
--- /dev/null
+++ b/Documentation/userspace-api/liveupdate.rst
@@ -0,0 +1,20 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+================
+Live Update uAPI
+================
+:Author: Pasha Tatashin
+
+ioctl interface
+===============
+.. kernel-doc:: kernel/liveupdate/luo_core.c
+   :doc: LUO ioctl Interface
+
+ioctl uAPI
+===========
+.. kernel-doc:: include/uapi/linux/liveupdate.h
+
+See Also
+========
+
+- :doc:`Live Update Orchestrator `
-- 
2.52.0.460.gd25c4c69ec-goog

From nobody Tue Dec 2 00:04:03 2025
From: Pasha Tatashin
Subject: [PATCH v8 09/18] MAINTAINERS: add liveupdate entry
Date: Tue, 25 Nov 2025 11:58:39 -0500
Message-ID: <20251125165850.3389713-10-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.52.0.460.gd25c4c69ec-goog
In-Reply-To: <20251125165850.3389713-1-pasha.tatashin@soleen.com>
References: <20251125165850.3389713-1-pasha.tatashin@soleen.com>

Add a MAINTAINERS file entry for the new Live Update Orchestrator
introduced in previous patches.
Signed-off-by: Pasha Tatashin
Reviewed-by: Mike Rapoport (Microsoft)
Reviewed-by: Pratyush Yadav
Tested-by: David Matlack
---
 MAINTAINERS | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index b46425e3b4d3..868d3d23fdea 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14466,6 +14466,18 @@ F:	kernel/module/livepatch.c
 F:	samples/livepatch/
 F:	tools/testing/selftests/livepatch/
 
+LIVE UPDATE
+M:	Pasha Tatashin
+M:	Mike Rapoport
+L:	linux-kernel@vger.kernel.org
+S:	Maintained
+F:	Documentation/core-api/liveupdate.rst
+F:	Documentation/userspace-api/liveupdate.rst
+F:	include/linux/liveupdate.h
+F:	include/linux/liveupdate/
+F:	include/uapi/linux/liveupdate.h
+F:	kernel/liveupdate/
+
 LLC (802.2)
 L:	netdev@vger.kernel.org
 S:	Odd fixes
-- 
2.52.0.460.gd25c4c69ec-goog

From nobody Tue Dec 2 00:04:03 2025
From: Pasha Tatashin
Subject: [PATCH v8 10/18] mm: shmem: use SHMEM_F_* flags instead of VM_* flags
Date: Tue, 25 Nov 2025 11:58:40 -0500
Message-ID: <20251125165850.3389713-11-pasha.tatashin@soleen.com>
In-Reply-To: <20251125165850.3389713-1-pasha.tatashin@soleen.com>
References: <20251125165850.3389713-1-pasha.tatashin@soleen.com>

From: Pratyush Yadav

shmem_inode_info::flags can have the VM flags VM_NORESERVE and
VM_LOCKED. These are used to suppress pre-accounting or to lock the
pages in the inode respectively. Using the VM flags directly makes it
difficult to add shmem-specific flags that are unrelated to VM behavior
since one would need to find a VM flag not used by shmem and re-purpose
it.

Introduce SHMEM_F_NORESERVE and SHMEM_F_LOCKED which represent the same
information, but their bits are independent of the VM flags. Callers can
still pass VM_NORESERVE to shmem_get_inode(), but it gets transformed to
the shmem-specific flag internally.

No functional changes intended.

Signed-off-by: Pratyush Yadav
Signed-off-by: Pasha Tatashin
Reviewed-by: Mike Rapoport (Microsoft)
Tested-by: David Matlack
---
 include/linux/shmem_fs.h |  6 ++++++
 mm/shmem.c               | 28 +++++++++++++++-------------
 2 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 0e47465ef0fd..650874b400b5 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include
 
 struct swap_iocb;
 
@@ -19,6 +20,11 @@ struct swap_iocb;
 #define SHMEM_MAXQUOTAS 2
 #endif
 
+/* Suppress pre-accounting of the entire object size. */
+#define SHMEM_F_NORESERVE	BIT(0)
+/* Disallow swapping. */
+#define SHMEM_F_LOCKED		BIT(1)
+
 struct shmem_inode_info {
 	spinlock_t		lock;
 	unsigned int		seals;		/* shmem seals */
diff --git a/mm/shmem.c b/mm/shmem.c
index 58701d14dd96..1d5036dec08a 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -175,20 +175,20 @@ static inline struct shmem_sb_info *SHMEM_SB(struct super_block *sb)
  */
 static inline int shmem_acct_size(unsigned long flags, loff_t size)
 {
-	return (flags & VM_NORESERVE) ?
+	return (flags & SHMEM_F_NORESERVE) ?
 		0 : security_vm_enough_memory_mm(current->mm, VM_ACCT(size));
 }
 
 static inline void shmem_unacct_size(unsigned long flags, loff_t size)
 {
-	if (!(flags & VM_NORESERVE))
+	if (!(flags & SHMEM_F_NORESERVE))
 		vm_unacct_memory(VM_ACCT(size));
 }
 
 static inline int shmem_reacct_size(unsigned long flags,
 		loff_t oldsize, loff_t newsize)
 {
-	if (!(flags & VM_NORESERVE)) {
+	if (!(flags & SHMEM_F_NORESERVE)) {
 		if (VM_ACCT(newsize) > VM_ACCT(oldsize))
 			return security_vm_enough_memory_mm(current->mm,
 					VM_ACCT(newsize) - VM_ACCT(oldsize));
@@ -206,7 +206,7 @@ static inline int shmem_reacct_size(unsigned long flags,
  */
 static inline int shmem_acct_blocks(unsigned long flags, long pages)
 {
-	if (!(flags & VM_NORESERVE))
+	if (!(flags & SHMEM_F_NORESERVE))
 		return 0;
 
 	return security_vm_enough_memory_mm(current->mm,
@@ -215,7 +215,7 @@ static inline int shmem_acct_blocks(unsigned long flags, long pages)
 
 static inline void shmem_unacct_blocks(unsigned long flags, long pages)
 {
-	if (flags & VM_NORESERVE)
+	if (flags & SHMEM_F_NORESERVE)
 		vm_unacct_memory(pages * VM_ACCT(PAGE_SIZE));
 }
 
@@ -1551,7 +1551,7 @@ int shmem_writeout(struct folio *folio, struct swap_iocb **plug,
 	int nr_pages;
 	bool split = false;
 
-	if ((info->flags & VM_LOCKED) || sbinfo->noswap)
+	if ((info->flags & SHMEM_F_LOCKED) || sbinfo->noswap)
 		goto redirty;
 
 	if (!total_swap_pages)
@@ -2910,15 +2910,15 @@ int shmem_lock(struct file *file, int lock, struct ucounts *ucounts)
 	 * ipc_lock_object() when called from shmctl_do_lock(),
 	 * no serialization needed when called from shm_destroy().
 	 */
-	if (lock && !(info->flags & VM_LOCKED)) {
+	if (lock && !(info->flags & SHMEM_F_LOCKED)) {
 		if (!user_shm_lock(inode->i_size, ucounts))
 			goto out_nomem;
-		info->flags |= VM_LOCKED;
+		info->flags |= SHMEM_F_LOCKED;
 		mapping_set_unevictable(file->f_mapping);
 	}
-	if (!lock && (info->flags & VM_LOCKED) && ucounts) {
+	if (!lock && (info->flags & SHMEM_F_LOCKED) && ucounts) {
 		user_shm_unlock(inode->i_size, ucounts);
-		info->flags &= ~VM_LOCKED;
+		info->flags &= ~SHMEM_F_LOCKED;
 		mapping_clear_unevictable(file->f_mapping);
 	}
 	retval = 0;
@@ -3062,7 +3062,7 @@ static struct inode *__shmem_get_inode(struct mnt_idmap *idmap,
 	spin_lock_init(&info->lock);
 	atomic_set(&info->stop_eviction, 0);
 	info->seals = F_SEAL_SEAL;
-	info->flags = flags & VM_NORESERVE;
+	info->flags = (flags & VM_NORESERVE) ? SHMEM_F_NORESERVE : 0;
 	info->i_crtime = inode_get_mtime(inode);
 	info->fsflags = (dir == NULL) ? 0 :
 		SHMEM_I(dir)->fsflags & SHMEM_FL_INHERITED;
@@ -5804,8 +5804,10 @@ static inline struct inode *shmem_get_inode(struct mnt_idmap *idmap,
 /* common code */
 
 static struct file *__shmem_file_setup(struct vfsmount *mnt, const char *name,
-			loff_t size, unsigned long flags, unsigned int i_flags)
+				       loff_t size, unsigned long vm_flags,
+				       unsigned int i_flags)
 {
+	unsigned long flags = (vm_flags & VM_NORESERVE) ? SHMEM_F_NORESERVE : 0;
 	struct inode *inode;
 	struct file *res;
 
@@ -5822,7 +5824,7 @@ static struct file *__shmem_file_setup(struct vfsmount *mnt, const char *name,
 		return ERR_PTR(-ENOMEM);
 
 	inode = shmem_get_inode(&nop_mnt_idmap, mnt->mnt_sb, NULL,
-				S_IFREG | S_IRWXUGO, 0, flags);
+				S_IFREG | S_IRWXUGO, 0, vm_flags);
 	if (IS_ERR(inode)) {
 		shmem_unacct_size(flags, size);
 		return ERR_CAST(inode);
-- 
2.52.0.460.gd25c4c69ec-goog

From nobody Tue Dec 2 00:04:03 2025
From: Pasha Tatashin
Subject: [PATCH v8 11/18] mm: shmem: allow freezing inode mapping
Date: Tue, 25 Nov 2025 11:58:41 -0500
Message-ID: <20251125165850.3389713-12-pasha.tatashin@soleen.com>
In-Reply-To: <20251125165850.3389713-1-pasha.tatashin@soleen.com>
References: <20251125165850.3389713-1-pasha.tatashin@soleen.com>

From: Pratyush Yadav

To prepare a shmem inode for live update, its index -> folio mappings
must be serialized. Once the mappings are serialized, they cannot change
since it would cause the serialized data to become inconsistent. This
can be done by pinning the folios to avoid migration, and by making sure
no folios can be added to or removed from the inode.

While mechanisms to pin folios already exist, the only way to stop
folios from being added or removed is with the grow and shrink file
seals. But file seals come with their own semantics, one of which is
that they can't be removed. This doesn't work with liveupdate since it
can be cancelled or error out, which would need the seals to be removed
and the file's normal functionality to be restored.

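The seal behaviour described above can be observed from userspace. The
following is a minimal sketch and not part of this series: the helper
names make_size_sealed_memfd() and try_resize() are hypothetical and
exist only for illustration, while memfd_create(), F_ADD_SEALS and
F_GET_SEALS are the existing Linux interfaces. It assumes a Linux system
with glibc 2.27 or later.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only: create a memfd of `size`
 * bytes and seal it against growing and shrinking.
 * Returns the file descriptor, or -1 on error.
 */
static int make_size_sealed_memfd(off_t size)
{
	int fd = memfd_create("seal-demo", MFD_CLOEXEC | MFD_ALLOW_SEALING);

	if (fd < 0)
		return -1;

	if (ftruncate(fd, size) < 0 ||
	    fcntl(fd, F_ADD_SEALS, F_SEAL_GROW | F_SEAL_SHRINK) < 0) {
		close(fd);
		return -1;
	}

	return fd;
}

/*
 * Hypothetical helper, for illustration only: attempt to resize `fd`.
 * Returns 0 if the resize succeeded, otherwise the errno reported by
 * ftruncate().
 */
static int try_resize(int fd, off_t new_size)
{
	if (ftruncate(fd, new_size) == 0)
		return 0;

	return errno;
}
```

On a descriptor returned by make_size_sealed_memfd(), both a grow and a
shrink attempt through try_resize() are expected to fail with EPERM.
fcntl() provides F_ADD_SEALS and F_GET_SEALS but no operation that drops
a seal, so a resize restriction that has to be lifted again when a live
update is cancelled cannot be expressed with seals.
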
Introduce SHMEM_F_MAPPING_FROZEN to indicate this instead. It is
internal to shmem and is not directly exposed to userspace. It functions
similarly to F_SEAL_GROW | F_SEAL_SHRINK, but additionally disallows
hole punching, and can be removed.

Signed-off-by: Pratyush Yadav
Signed-off-by: Pasha Tatashin
Reviewed-by: Mike Rapoport (Microsoft)
Tested-by: David Matlack
---
 include/linux/shmem_fs.h | 17 +++++++++++++++++
 mm/shmem.c               | 11 +++++++++++
 2 files changed, 28 insertions(+)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 650874b400b5..d34a64eafe60 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -24,6 +24,14 @@ struct swap_iocb;
 #define SHMEM_F_NORESERVE	BIT(0)
 /* Disallow swapping. */
 #define SHMEM_F_LOCKED		BIT(1)
+/*
+ * Disallow growing, shrinking, or hole punching in the inode. Combined with
+ * folio pinning, makes sure the inode's mapping stays fixed.
+ *
+ * In some ways similar to F_SEAL_GROW | F_SEAL_SHRINK, but can be removed and
+ * isn't directly visible to userspace.
+ */
+#define SHMEM_F_MAPPING_FROZEN	BIT(2)
 
 struct shmem_inode_info {
 	spinlock_t		lock;
@@ -186,6 +194,15 @@ static inline bool shmem_file(struct file *file)
 	return shmem_mapping(file->f_mapping);
 }
 
+/* Must be called with inode lock taken exclusive. */
+static inline void shmem_freeze(struct inode *inode, bool freeze)
+{
+	if (freeze)
+		SHMEM_I(inode)->flags |= SHMEM_F_MAPPING_FROZEN;
+	else
+		SHMEM_I(inode)->flags &= ~SHMEM_F_MAPPING_FROZEN;
+}
+
 /*
  * If fallocate(FALLOC_FL_KEEP_SIZE) has been used, there may be pages
  * beyond i_size's notion of EOF, which fallocate has committed to reserving:
diff --git a/mm/shmem.c b/mm/shmem.c
index 1d5036dec08a..786573479360 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1297,6 +1297,8 @@ static int shmem_setattr(struct mnt_idmap *idmap,
 			return -EPERM;
 
 		if (newsize != oldsize) {
+			if (info->flags & SHMEM_F_MAPPING_FROZEN)
+				return -EPERM;
 			error = shmem_reacct_size(SHMEM_I(inode)->flags,
 					oldsize, newsize);
 			if (error)
@@ -3289,6 +3291,10 @@ shmem_write_begin(const struct kiocb *iocb, struct address_space *mapping,
 			return -EPERM;
 	}
 
+	if (unlikely((info->flags & SHMEM_F_MAPPING_FROZEN) &&
+		     pos + len > inode->i_size))
+		return -EPERM;
+
 	ret = shmem_get_folio(inode, index, pos + len, &folio, SGP_WRITE);
 	if (ret)
 		return ret;
@@ -3662,6 +3668,11 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset,
 
 	inode_lock(inode);
 
+	if (info->flags & SHMEM_F_MAPPING_FROZEN) {
+		error = -EPERM;
+		goto out;
+	}
+
 	if (mode & FALLOC_FL_PUNCH_HOLE) {
 		struct address_space *mapping = file->f_mapping;
 		loff_t unmap_start = round_up(offset, PAGE_SIZE);
-- 
2.52.0.460.gd25c4c69ec-goog

From nobody Tue Dec 2 00:04:03 2025
From: Pasha Tatashin
Subject: [PATCH v8 12/18] mm: shmem: export some functions to internal.h
Date: Tue, 25 Nov 2025 11:58:42 -0500
Message-ID: <20251125165850.3389713-13-pasha.tatashin@soleen.com>
In-Reply-To: <20251125165850.3389713-1-pasha.tatashin@soleen.com>
References: <20251125165850.3389713-1-pasha.tatashin@soleen.com>

From: Pratyush Yadav

shmem_inode_acct_blocks(), shmem_recalc_inode(), and
shmem_add_to_page_cache() are used by shmem_alloc_and_add_folio(). This
functionality will be used by memfd LUO integration.

Signed-off-by: Pratyush Yadav
Signed-off-by: Pasha Tatashin
Reviewed-by: Mike Rapoport (Microsoft)
Tested-by: David Matlack
---
 mm/internal.h |  6 ++++++
 mm/shmem.c    | 10 +++++-----
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 1561fc2ff5b8..4ba155524f80 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1562,6 +1562,12 @@ void __meminit __init_page_from_nid(unsigned long pfn, int nid);
 unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memcg,
 			  int priority);
 
+int shmem_add_to_page_cache(struct folio *folio,
+			    struct address_space *mapping,
+			    pgoff_t index, void *expected, gfp_t gfp);
+int shmem_inode_acct_blocks(struct inode *inode, long pages);
+bool shmem_recalc_inode(struct inode *inode, long alloced, long swapped);
+
 #ifdef CONFIG_SHRINKER_DEBUG
 static inline __printf(2, 0) int shrinker_debugfs_name_alloc(
 			struct shrinker *shrinker, const char *fmt, va_list ap)
diff --git a/mm/shmem.c b/mm/shmem.c
index 786573479360..679721e48a87 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -219,7 +219,7 @@ static inline void shmem_unacct_blocks(unsigned long flags, long pages)
 		vm_unacct_memory(pages * VM_ACCT(PAGE_SIZE));
 }
 
-static int shmem_inode_acct_blocks(struct inode *inode, long pages)
+int shmem_inode_acct_blocks(struct inode *inode, long pages)
 {
 	struct
shmem_inode_info *info =3D SHMEM_I(inode); struct shmem_sb_info *sbinfo =3D SHMEM_SB(inode->i_sb); @@ -435,7 +435,7 @@ static void shmem_free_inode(struct super_block *sb, si= ze_t freed_ispace) * * Return: true if swapped was incremented from 0, for shmem_writeout(). */ -static bool shmem_recalc_inode(struct inode *inode, long alloced, long swa= pped) +bool shmem_recalc_inode(struct inode *inode, long alloced, long swapped) { struct shmem_inode_info *info =3D SHMEM_I(inode); bool first_swapped =3D false; @@ -861,9 +861,9 @@ static void shmem_update_stats(struct folio *folio, int= nr_pages) /* * Somewhat like filemap_add_folio, but error if expected item has gone. */ -static int shmem_add_to_page_cache(struct folio *folio, - struct address_space *mapping, - pgoff_t index, void *expected, gfp_t gfp) +int shmem_add_to_page_cache(struct folio *folio, + struct address_space *mapping, + pgoff_t index, void *expected, gfp_t gfp) { XA_STATE_ORDER(xas, &mapping->i_pages, index, folio_order(folio)); unsigned long nr =3D folio_nr_pages(folio); --=20 2.52.0.460.gd25c4c69ec-goog From nobody Tue Dec 2 00:04:03 2025 Received: from mail-yw1-f180.google.com (mail-yw1-f180.google.com [209.85.128.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8981732AAB8 for ; Tue, 25 Nov 2025 16:59:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.180 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089975; cv=none; b=NwdtxF6tCUI6Cphogwa1ja6Ib60srX9t5sTU9Tu80tu7jqJFPeVu1lA8C5yrFQiBHVR1bmECw8sSKdP8RRDe/4zWd9PIYcrKOCRo7mtLMsOdmmi1rFGjBjX0d8Jzjd9QPxYOETsk6G8McFBQyZkjlDC0D1hNxE92QSyCyyGgpMw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089975; c=relaxed/simple; bh=Cgp78p6MJr5ikWgAbD441bmUNWaMvB9QxQCFESAbYw0=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: 
MIME-Version; b=J/BLN67zJl9LeW61X/3J8lFI7GrUKR2A4/SmNJoV7cSSVfNoaiIkn2HY4kFCTfysd0M7RL5sJfPjB/XAUbqv3vXfV6rJaqmaYtiCqwQzp3ZNpFQWh3IogC5Ph8g14xYxRGUXmipTislG+rKRXAAeUpOORtcE+llri4mfIQbAdVA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=RZN45G2w; arc=none smtp.client-ip=209.85.128.180 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="RZN45G2w" Received: by mail-yw1-f180.google.com with SMTP id 00721157ae682-7895017c722so54441427b3.2 for ; Tue, 25 Nov 2025 08:59:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1764089972; x=1764694772; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=yx9quZG9gLF2dl0IKuMBsHxvDi//DiVhoRUqED9zxDI=; b=RZN45G2wbDuSV0HRp6JijJQd/3rGnFubrD45DMPC3tre2NkFVEESsBIeAFhFooL0MT o1ZVVNXOb47qOMZc+Exxpti8Ougo1j3zr1hgKEMP5fE/+ToSFLciJxk2z4OWuc1jt09B oZ4VDa7J1t71IiDxa7oDEVPLN+geY+ViV5IL49YQGlJ9FUVIATdKexbFn75GUSHK9Ct/ 91UGUM0Zqq3g7HG1TkhX1AuR8kiMwudG1DND+Au5tYlddVZYpkJdxdyZMG+uo0530UxL BBkekJaVLAr7A5jsJzkB/KoJ9JkBShDkuxmJOSW90kXWPO5h+g6Vu1+Ezmzeexclc8Zc CNfw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764089972; x=1764694772; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-gg:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=yx9quZG9gLF2dl0IKuMBsHxvDi//DiVhoRUqED9zxDI=; b=WGk4ckFZGYkTD/C1rzuY8fuyKgGGW1P+kCkrRU8U51YW2ZGvoKl1MKg6xIWNSRICU9 
BIec/sYGNSQeJnrH9EgfHt1+2XDsqzqZgdAdreZVbId4cYEd9JMA0xYI5okZyQt7t4hT vpaoMbCsX56xT2j+wUnQTvTCVAr3kdTAisYjK5iUr//XLcB4/1E85g+yLKtQcN1YJ/Y1 R6sYYVGCiH2GxA6qNPOgwgzwsSMQuH22BNvrV6CvELxYyDLf+67Z9jrh2slua/Xuwa/s mvYyrQKniFFw/H1/D/t2TKKjX7bA/UBEuaoOgn3pWdlKEF6kNZAkgUmqu4btbHXaQg7N RPZg== X-Forwarded-Encrypted: i=1; AJvYcCXE9xjji8/J1c7YoDrHvpvpwmElMCYu1rqYFH4Xw8ANJOHXOx2aNaD0uXYKLo5F1YwryxdAT3d4A2H8ZAc=@vger.kernel.org X-Gm-Message-State: AOJu0YxrZykPBCE7CV+hFg/XxbY4sTYYa9NtBHe9kbCcolJscQthKSB9 NnrQ6zubPXxmksUxzDpnuaCOqxCTO37xsfXhEAevl5LsfM1x2zh0OcWi5mX1D2OpX+Y= X-Gm-Gg: ASbGncv39SD2A2ISAbbzjzKZXycxFoazokYkwa40qOW6BI07h0KG+H4dYLeieksIYdu kI7LOCvOGrayi+SkKQ02OwDvOAfZfHUqfEsXnBH6pb3/XV6+7S/DInLU9zYPYF92/5JNWJsG54S Zm00SDqXwuje7p5c4S3PexTct1PTOQmcLZTzrb7jmVcYlREYImpp1D3gOQWLBwRXSMCoHYc3MeI CjfCJ+ysrlP9PGLRinDVud88BTEZQNSfVHsNrIMDpx/oc/mC4+E3nlOSrG6ilN319sKg3FMmQ/j MWpE44CbN8F3jwEOOTKp5xdjgPuzhdrw5ct05j+fsDdWiWdNIwdVu6uTkY65YnCJZAKesEdq3Kc ll1N+7+shaIMyvS9wXr1AJGKe8yCCyqzhRlvsGJaAjk+L+W55A99zmBXaPzJSKAkIuKLb9PjoMg IEng9dS6oNrsnRBHANb9OcO87IfIPaRHXStuiWvivOG+L/59QQEEN8JobSpqRpN4C5 X-Google-Smtp-Source: AGHT+IEW6KgCKYwZ7XEqOpbUcHlowFzPqPsBiqMYeBlVaw8DURiXeYQOxuH5IPmz+1TT/Eb1rS/paQ== X-Received: by 2002:a05:690c:338f:b0:788:737:4830 with SMTP id 00721157ae682-78ab6fce723mr28091657b3.66.1764089972519; Tue, 25 Nov 2025 08:59:32 -0800 (PST) Received: from soleen.c.googlers.com.com (182.221.85.34.bc.googleusercontent.com. 
[34.85.221.182]) by smtp.gmail.com with ESMTPSA id 00721157ae682-78a798a5518sm57284357b3.14.2025.11.25.08.59.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 08:59:32 -0800 (PST) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org Subject: [PATCH v8 13/18] liveupdate: luo_file: add private argument to store runtime state Date: Tue, 25 Nov 2025 11:58:43 -0500 Message-ID: 
<20251125165850.3389713-14-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.52.0.460.gd25c4c69ec-goog In-Reply-To: <20251125165850.3389713-1-pasha.tatashin@soleen.com> References: <20251125165850.3389713-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Pratyush Yadav Currently file handlers only get the serialized_data field to store their state. This field has a pointer to the serialized state of the file, and it becomes a part of LUO file's serialized state. File handlers may also need some runtime state to track information that shouldn't make it into the serialized data. One such example is a vmalloc pointer. While kho_preserve_vmalloc() preserves the memory backing a vmalloc allocation, it does not store the original vmap pointer, since that has no use being passed to the next kernel. The pointer is needed to free the memory in case the file is unpreserved. Provide a private field in struct luo_file and pass it to all the callbacks. The field can be set by preserve, and must be freed by unpreserve. Signed-off-by: Pratyush Yadav Co-developed-by: Pasha Tatashin Signed-off-by: Pasha Tatashin Reviewed-by: Mike Rapoport (Microsoft) Tested-by: David Matlack --- include/linux/liveupdate.h | 5 +++++ kernel/liveupdate/luo_file.c | 9 +++++++++ 2 files changed, 14 insertions(+) diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h index 122ad8f16ff9..a7f6ee5b6771 100644 --- a/include/linux/liveupdate.h +++ b/include/linux/liveupdate.h @@ -27,6 +27,10 @@ struct file; * this to the file being operated on. * @serialized_data: The opaque u64 handle, preserve/prepare/freeze may u= pdate * this field. + * @private_data: Private data for the file used to hold runtime state= that + * is not preserved.
Set by the handler's .preserve() + * callback, and must be freed in the handler's + * .unpreserve() callback. * * This structure bundles all parameters for the file operation callbacks. * The 'data' and 'file' fields are used for both input and output. @@ -36,6 +40,7 @@ struct liveupdate_file_op_args { bool retrieved; struct file *file; u64 serialized_data; + void *private_data; }; =20 /** diff --git a/kernel/liveupdate/luo_file.c b/kernel/liveupdate/luo_file.c index e9727cb1275a..ddff87917b21 100644 --- a/kernel/liveupdate/luo_file.c +++ b/kernel/liveupdate/luo_file.c @@ -129,6 +129,10 @@ static LIST_HEAD(luo_file_handler_list); * This handle is passed back to the handler's .freeze(), * .retrieve(), and .finish() callbacks, allowing it to tr= ack * and update its serialized state across phases. + * @private_data: Pointer to the private data for the file used to hold r= untime + * state that is not preserved. Set by the handler's .pres= erve() + * callback, and must be freed in the handler's .unpreserv= e() + * callback. * @retrieved: A flag indicating whether a user/kernel in the new kern= el has * successfully called retrieve() on this file. This preve= nts * multiple retrieval attempts. 
@@ -155,6 +159,7 @@ struct luo_file { struct liveupdate_file_handler *fh; struct file *file; u64 serialized_data; + void *private_data; bool retrieved; struct mutex mutex; struct list_head list; @@ -298,6 +303,7 @@ int luo_preserve_file(struct luo_file_set *file_set, u6= 4 token, int fd) goto err_kfree; =20 luo_file->serialized_data =3D args.serialized_data; + luo_file->private_data =3D args.private_data; list_add_tail(&luo_file->list, &file_set->files_list); file_set->count++; =20 @@ -344,6 +350,7 @@ void luo_file_unpreserve_files(struct luo_file_set *fil= e_set) args.handler =3D luo_file->fh; args.file =3D luo_file->file; args.serialized_data =3D luo_file->serialized_data; + args.private_data =3D luo_file->private_data; luo_file->fh->ops->unpreserve(&args); =20 list_del(&luo_file->list); @@ -370,6 +377,7 @@ static int luo_file_freeze_one(struct luo_file_set *fil= e_set, args.handler =3D luo_file->fh; args.file =3D luo_file->file; args.serialized_data =3D luo_file->serialized_data; + args.private_data =3D luo_file->private_data; =20 err =3D luo_file->fh->ops->freeze(&args); if (!err) @@ -390,6 +398,7 @@ static void luo_file_unfreeze_one(struct luo_file_set *= file_set, args.handler =3D luo_file->fh; args.file =3D luo_file->file; args.serialized_data =3D luo_file->serialized_data; + args.private_data =3D luo_file->private_data; =20 luo_file->fh->ops->unfreeze(&args); } --=20 2.52.0.460.gd25c4c69ec-goog From nobody Tue Dec 2 00:04:03 2025 Received: from mail-yw1-f180.google.com (mail-yw1-f180.google.com [209.85.128.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B1AA5330D32 for ; Tue, 25 Nov 2025 16:59:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.180 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089978; cv=none; 
b=Xw8PLPc/VCjc//N1fU1nkgVBFntDBwTxKmjGm97Mhz3SS9EQlWKzv/QmHcifsqtsGtCyBHH8ugfxo/+daNijqcMqUjGcqPSpNUjVjsUk7YSF7sbNiVPqQf2j7hxsyRxQv8ragad+quyoFWOBGxnNWNRHyrysShvB2QuyTr82RBU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089978; c=relaxed/simple; bh=FoZhYQ5anN9QvxDg+LqoX1tvpbpX6PKiEw9R3dzkS2M=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=W+e/2McFx3PxvweXwsQwbGLEvhd49S9U3Py4PgZEMKww7fsea0KHcr0cTfNnIa+prxOT+efLYuk/8fzpbg48xW45dqKxBMJN6S56xRS9OkrHS/IY7CaU8SUIb3QB9XSXUoaBdMqPIFUshhMgWA2CrQlp9dcyaB69FpDy93XxXCg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=YtQtC6Cg; arc=none smtp.client-ip=209.85.128.180 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="YtQtC6Cg" Received: by mail-yw1-f180.google.com with SMTP id 00721157ae682-787c9f90eccso59974387b3.3 for ; Tue, 25 Nov 2025 08:59:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1764089974; x=1764694774; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=hoq1KHM//ceWZHYzHvP2GAeyKGu/ekSRGep2L5s8MQE=; b=YtQtC6Cg9C6m65HMIAnsVp0IM5jeaLNY89HFe+MJVan7NgwA5/CUxtovNkY0SakgTS pNpRqjZFOMMqo3i1q4hnsnd+S9QWWVh9xYwnYDCR58bhR3rG3BvfIo1/bKMnJc+fufre Hoj+ksyxMmKoHQddOKEbh7NzYpj6iK9iJ51e4o2nHPurpXzB/liExEC5kZ+ATHC4T47F 5Jt0xZtRJpYoOIjlJhid2fWJrGPvoNg5YDTwxiXPN4ElQQiIzPHGqJnYN32I0447zbWm k/0zJ5DXi3Ycr8Wo7dCUVRsmLWPkMVTRXbaAyt9Ss/IfxvDndW8WJp0f50nYutbaojaC UcYg== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764089974; x=1764694774; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-gg:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=hoq1KHM//ceWZHYzHvP2GAeyKGu/ekSRGep2L5s8MQE=; b=eXjNcD8qSH9BU5DylV0gVM/cU/Rkuk74aqRi2VNLfM+cIypqbtX7zHxWZmJO7qCPcj COJ824hMPR2Pp6Jids7t7kgOGtqRh2q8p21H8qEpSSYiLwcs2cV7IQrdaz6VpviNRZD9 zCBJ/W1uv3vHFi4tXKIcMkGAbSlUqi8n+ObfVaLAMjlj4azmJSG0TMpyMEZKMP0epknf SnLZNRx+hX2KM5pwrT4fMsZRFIUgM4AmZD68bhWxO/VOm2oB7jrjY0AsOPe6udcxPFln 7hJnRVAogisMGvCzM60aMzvNAajnDF5VZX02o2dTmbqbrVuICQzD5HNsoXeRSqHTTjJ6 6XAw== X-Forwarded-Encrypted: i=1; AJvYcCWbL3tM2rEscbbRbsSEJHt+t98ulAkWBNsh6PZa336qhel6pR33vOQuXy5A6QS4q/6d1kJqM9JEWOB7I5E=@vger.kernel.org X-Gm-Message-State: AOJu0YyukrhFPU3Lvc+qubqQFTquFjExdtcQBpmoOba6xwhTUF7WGT04 u7XME9Rmw0qjT7eGVbjkcGSX2mJkaCXUazfyF24iwx+430X/Azj7B60YYCfy8v7wMHw= X-Gm-Gg: ASbGncuIDAYBCBT+OT2NjYs9VopMnz9nNpIgezQM27zZJnD49WvHT4AGEeIaQkBlurA bkTjGVHS71aZqjYXaCb0YEDqpnS96tBf8TUU2tnxNcPoM/MJe/zO8ZFSMdwIEsTq5G9ihlA7eIM +WvcP7X4qI/XQm61uwVg8P8q3C+jW0eqSBgkd5X+4MTugKsmUvj8KYpF5WoB4D1AV/SM3VrkJ2V HmgLLWspf8ywEPlhlXMVRMNqA7Ibts/qBswnZ8YAuLUoVf3g6UQ7y5Gg3ldQtXIyE0F90PsSS9Z nIAzLe8X8eWrQi6SMXTu1Lc12d7TnYFQm3uyKUWgn7GA8jAC9UV/G4tyd7LZe4JeBqxjCrJicrY MFkKVO1Q1QMsa8v6qR0V6VRwwMTzdvxOcUA1974OZdrQgFzsRga7U6+jA+fUfCiBxLfG3thbNmA L/Z8e8l75DzlkICPnPi2XYovpJZcJeEzlKB8CAElMvU4FWrlSakqz8rmGgTfTahH1tVzuTajcDE EcI22M= X-Google-Smtp-Source: AGHT+IGhyqko2XNGAQ+MbliOV7Q7lZRrugm5nEIedcDe3Hixtd5DaNp0Tw4exj+v0JxhFqTVbLupfA== X-Received: by 2002:a05:690c:45c5:b0:786:45ce:9bd3 with SMTP id 00721157ae682-78ab6f345bamr29492047b3.34.1764089974430; Tue, 25 Nov 2025 08:59:34 -0800 (PST) Received: from soleen.c.googlers.com.com (182.221.85.34.bc.googleusercontent.com. 
[34.85.221.182]) by smtp.gmail.com with ESMTPSA id 00721157ae682-78a798a5518sm57284357b3.14.2025.11.25.08.59.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 08:59:33 -0800 (PST) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org Subject: [PATCH v8 14/18] mm: memfd_luo: allow preserving memfd Date: Tue, 25 Nov 2025 11:58:44 -0500 Message-ID: 
<20251125165850.3389713-15-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.52.0.460.gd25c4c69ec-goog In-Reply-To: <20251125165850.3389713-1-pasha.tatashin@soleen.com> References: <20251125165850.3389713-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Pratyush Yadav The ability to preserve a memfd allows userspace to use KHO and LUO to transfer its memory contents to the next kernel. This is useful in many ways. For one, it can be used with IOMMUFD as the backing store for IOMMU page tables. Preserving IOMMUFD is essential for performing a hypervisor live update with passthrough devices. memfd support provides the first building block for making that possible. For another, for applications with a large amount of memory that takes time to reconstruct, reboots to consume kernel upgrades can be very expensive. memfd with LUO gives those applications reboot-persistent memory that they can use to quickly save and reconstruct that state. While memfd is backed by either hugetlbfs or shmem, support is currently added only for shmem. To be more precise, support for anonymous shmem files is added. The handover to the next kernel is not transparent. Not all properties of the file are preserved; only its memory contents, position, and size are. The recreated file gets the UID and GID of the task doing the restore, and the task's cgroup gets charged with the memory. Once preserved, the file cannot grow or shrink, and all its pages are pinned to avoid migrations and swapping. The file can still be read from or written to. Use vmalloc to get the buffer to hold the folios, and preserve it using kho_preserve_vmalloc(). This doesn't have the size limit.
Signed-off-by: Pratyush Yadav Co-developed-by: Pasha Tatashin Signed-off-by: Pasha Tatashin Reviewed-by: Mike Rapoport (Microsoft) Tested-by: David Matlack --- MAINTAINERS | 2 + include/linux/kho/abi/memfd.h | 77 +++++ mm/Makefile | 1 + mm/memfd_luo.c | 516 ++++++++++++++++++++++++++++++++++ 4 files changed, 596 insertions(+) create mode 100644 include/linux/kho/abi/memfd.h create mode 100644 mm/memfd_luo.c diff --git a/MAINTAINERS b/MAINTAINERS index 868d3d23fdea..425c46bba764 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -14469,6 +14469,7 @@ F: tools/testing/selftests/livepatch/ LIVE UPDATE M: Pasha Tatashin M: Mike Rapoport +R: Pratyush Yadav L: linux-kernel@vger.kernel.org S: Maintained F: Documentation/core-api/liveupdate.rst @@ -14477,6 +14478,7 @@ F: include/linux/liveupdate.h F: include/linux/liveupdate/ F: include/uapi/linux/liveupdate.h F: kernel/liveupdate/ +F: mm/memfd_luo.c =20 LLC (802.2) L: netdev@vger.kernel.org diff --git a/include/linux/kho/abi/memfd.h b/include/linux/kho/abi/memfd.h new file mode 100644 index 000000000000..da7d063474a1 --- /dev/null +++ b/include/linux/kho/abi/memfd.h @@ -0,0 +1,77 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + * + * Copyright (C) 2025 Amazon.com Inc. or its affiliates. + * Pratyush Yadav + */ + +#ifndef _LINUX_KHO_ABI_MEMFD_H +#define _LINUX_KHO_ABI_MEMFD_H + +#include +#include + +/** + * DOC: memfd Live Update ABI + * + * This header defines the ABI for preserving the state of a memfd across a + * kexec reboot using the LUO. + * + * The state is serialized into a packed structure `struct memfd_luo_ser` + * which is handed over to the next kernel via the KHO mechanism. + * + * This interface is a contract. Any modification to the structure layout + * constitutes a breaking change. Such changes require incrementing the + * version number in the MEMFD_LUO_FH_COMPATIBLE string. + */ + +/** + * MEMFD_LUO_FOLIO_DIRTY - The folio is dirty. 
+ * + * This flag indicates the folio contains data from user. A non-dirty foli= o is + * one that was allocated (say using fallocate(2)) but not written to. + */ +#define MEMFD_LUO_FOLIO_DIRTY BIT(0) + +/** + * MEMFD_LUO_FOLIO_UPTODATE - The folio is up-to-date. + * + * An up-to-date folio has been zeroed out. shmem zeroes out folios on fir= st + * use. This flag tracks which folios need zeroing. + */ +#define MEMFD_LUO_FOLIO_UPTODATE BIT(1) + +/** + * struct memfd_luo_folio_ser - Serialized state of a single folio. + * @pfn: The page frame number of the folio. + * @flags: Flags to describe the state of the folio. + * @index: The page offset (pgoff_t) of the folio within the original = file. + */ +struct memfd_luo_folio_ser { + u64 pfn:52; + u64 flags:12; + u64 index; +} __packed; + +/** + * struct memfd_luo_ser - Main serialization structure for a memfd. + * @pos: The file's current position (f_pos). + * @size: The total size of the file in bytes (i_size). + * @nr_folios: Number of folios in the folios array. + * @folios: KHO vmalloc descriptor pointing to the array of + * struct memfd_luo_folio_ser. 
+ */ +struct memfd_luo_ser { + u64 pos; + u64 size; + u64 nr_folios; + struct kho_vmalloc folios; +} __packed; + +/* The compatibility string for memfd file handler */ +#define MEMFD_LUO_FH_COMPATIBLE "memfd-v1" + +#endif /* _LINUX_KHO_ABI_MEMFD_H */ diff --git a/mm/Makefile b/mm/Makefile index 21abb3353550..7738ec416f00 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -100,6 +100,7 @@ obj-$(CONFIG_NUMA) +=3D memory-tiers.o obj-$(CONFIG_DEVICE_MIGRATION) +=3D migrate_device.o obj-$(CONFIG_TRANSPARENT_HUGEPAGE) +=3D huge_memory.o khugepaged.o obj-$(CONFIG_PAGE_COUNTER) +=3D page_counter.o +obj-$(CONFIG_LIVEUPDATE) +=3D memfd_luo.o obj-$(CONFIG_MEMCG_V1) +=3D memcontrol-v1.o obj-$(CONFIG_MEMCG) +=3D memcontrol.o vmpressure.o ifdef CONFIG_SWAP diff --git a/mm/memfd_luo.c b/mm/memfd_luo.c new file mode 100644 index 000000000000..4f6ba63b4310 --- /dev/null +++ b/mm/memfd_luo.c @@ -0,0 +1,516 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + * + * Copyright (C) 2025 Amazon.com Inc. or its affiliates. + * Pratyush Yadav + */ + +/** + * DOC: Memfd Preservation via LUO + * + * Overview + * =3D=3D=3D=3D=3D=3D=3D=3D + * + * Memory file descriptors (memfd) can be preserved over a kexec using the= Live + * Update Orchestrator (LUO) file preservation. This allows userspace to + * transfer its memory contents to the next kernel after a kexec. + * + * The preservation is not intended to be transparent. Only select propert= ies of + * the file are preserved. All others are reset to default. The preserved + * properties are described below. + * + * .. note:: + * The LUO API is not stabilized yet, so the preserved properties of a = memfd + * are also not stable and are subject to backwards incompatible change= s. + * + * .. note:: + * Currently a memfd backed by Hugetlb is not supported. Memfds created + * with ``MFD_HUGETLB`` will be rejected. 
+ * + * Preserved Properties + * =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + * + * The following properties of the memfd are preserved across kexec: + * + * File Contents + * All data stored in the file is preserved. + * + * File Size + * The size of the file is preserved. Holes in the file are filled by + * allocating pages for them during preservation. + * + * File Position + * The current file position is preserved, allowing applications to cont= inue + * reading/writing from their last position. + * + * File Status Flags + * memfds are always opened with ``O_RDWR`` and ``O_LARGEFILE``. This pr= operty + * is maintained. + * + * Non-Preserved Properties + * =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D + * + * All properties which are not preserved must be assumed to be reset to + * default. This section describes some of those properties which may be m= ore of + * note. + * + * ``FD_CLOEXEC`` flag + * A memfd can be created with the ``MFD_CLOEXEC`` flag that sets the + * ``FD_CLOEXEC`` on the file. This flag is not preserved and must be set + * again after restore via ``fcntl()``. + * + * Seals + * File seals are not preserved. The file is unsealed on restore and if + * needed, must be sealed again via ``fcntl()``. + */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "internal.h" + +static int memfd_luo_preserve_folios(struct file *file, + struct kho_vmalloc *kho_vmalloc, + struct memfd_luo_folio_ser **out_folios_ser, + u64 *nr_foliosp) +{ + struct inode *inode =3D file_inode(file); + struct memfd_luo_folio_ser *folios_ser; + unsigned int max_folios; + long i, size, nr_pinned; + struct folio **folios; + int err =3D -EINVAL; + pgoff_t offset; + u64 nr_folios; + + size =3D i_size_read(inode); + /* + * If the file has zero size, then the folios and nr_folios properties + * are not set. 
+ */ + if (!size) { + *nr_foliosp =3D 0; + *out_folios_ser =3D NULL; + memset(kho_vmalloc, 0, sizeof(*kho_vmalloc)); + return 0; + } + + /* + * Guess the number of folios based on inode size. Real number might end + * up being smaller if there are higher order folios. + */ + max_folios =3D PAGE_ALIGN(size) / PAGE_SIZE; + folios =3D kvmalloc_array(max_folios, sizeof(*folios), GFP_KERNEL); + if (!folios) + return -ENOMEM; + + /* + * Pin the folios so they don't move around behind our back. This also + * ensures none of the folios are in CMA -- which ensures they don't + * fall in KHO scratch memory. It also moves swapped out folios back to + * memory. + * + * A side effect of doing this is that it allocates a folio for all + * indices in the file. This might waste memory on sparse memfds. If + * that is really a problem in the future, we can have a + * memfd_pin_folios() variant that does not allocate a page on empty + * slots. + */ + nr_pinned =3D memfd_pin_folios(file, 0, size - 1, folios, max_folios, + &offset); + if (nr_pinned < 0) { + err =3D nr_pinned; + pr_err("failed to pin folios: %d\n", err); + goto err_free_folios; + } + nr_folios =3D nr_pinned; + + folios_ser =3D vcalloc(nr_folios, sizeof(*folios_ser)); + if (!folios_ser) { + err =3D -ENOMEM; + goto err_unpin; + } + + for (i =3D 0; i < nr_folios; i++) { + struct memfd_luo_folio_ser *pfolio =3D &folios_ser[i]; + struct folio *folio =3D folios[i]; + unsigned int flags =3D 0; + + err =3D kho_preserve_folio(folio); + if (err) + goto err_unpreserve; + + if (folio_test_dirty(folio)) + flags |=3D MEMFD_LUO_FOLIO_DIRTY; + if (folio_test_uptodate(folio)) + flags |=3D MEMFD_LUO_FOLIO_UPTODATE; + + pfolio->pfn =3D folio_pfn(folio); + pfolio->flags =3D flags; + pfolio->index =3D folio->index; + } + + err =3D kho_preserve_vmalloc(folios_ser, kho_vmalloc); + if (err) + goto err_unpreserve; + + kvfree(folios); + *nr_foliosp =3D nr_folios; + *out_folios_ser =3D folios_ser; + + /* + * Note: folios_ser is purposely not 
freed here. It is preserved + * memory (via KHO). In the 'unpreserve' path, we use the vmap pointer + * that is passed via private_data. + */ + return 0; + +err_unpreserve: + for (i =3D i - 1; i >=3D 0; i--) + kho_unpreserve_folio(folios[i]); + vfree(folios_ser); +err_unpin: + unpin_folios(folios, nr_folios); +err_free_folios: + kvfree(folios); + + return err; +} + +static void memfd_luo_unpreserve_folios(struct kho_vmalloc *kho_vmalloc, + struct memfd_luo_folio_ser *folios_ser, + u64 nr_folios) +{ + long i; + + if (!nr_folios) + return; + + kho_unpreserve_vmalloc(kho_vmalloc); + + for (i =3D 0; i < nr_folios; i++) { + const struct memfd_luo_folio_ser *pfolio =3D &folios_ser[i]; + struct folio *folio; + + if (!pfolio->pfn) + continue; + + folio =3D pfn_folio(pfolio->pfn); + + kho_unpreserve_folio(folio); + unpin_folio(folio); + } + + vfree(folios_ser); +} + +static int memfd_luo_preserve(struct liveupdate_file_op_args *args) +{ + struct inode *inode =3D file_inode(args->file); + struct memfd_luo_folio_ser *folios_ser; + struct memfd_luo_ser *ser; + u64 nr_folios; + int err =3D 0; + + inode_lock(inode); + shmem_freeze(inode, true); + + /* Allocate the main serialization structure in preserved memory */ + ser =3D kho_alloc_preserve(sizeof(*ser)); + if (IS_ERR(ser)) { + err =3D PTR_ERR(ser); + goto err_unlock; + } + + ser->pos =3D args->file->f_pos; + ser->size =3D i_size_read(inode); + + err =3D memfd_luo_preserve_folios(args->file, &ser->folios, + &folios_ser, &nr_folios); + if (err) + goto err_free_ser; + + ser->nr_folios =3D nr_folios; + inode_unlock(inode); + + args->private_data =3D folios_ser; + args->serialized_data =3D virt_to_phys(ser); + + return 0; + +err_free_ser: + kho_unpreserve_free(ser); +err_unlock: + shmem_freeze(inode, false); + inode_unlock(inode); + return err; +} + +static int memfd_luo_freeze(struct liveupdate_file_op_args *args) +{ + struct memfd_luo_ser *ser; + + if (WARN_ON_ONCE(!args->serialized_data)) + return -EINVAL; + + ser =3D 
phys_to_virt(args->serialized_data); + + /* + * The pos might have changed since preserve. Everything else stays the + * same. + */ + ser->pos =3D args->file->f_pos; + + return 0; +} + +static void memfd_luo_unpreserve(struct liveupdate_file_op_args *args) +{ + struct inode *inode =3D file_inode(args->file); + struct memfd_luo_ser *ser; + + if (WARN_ON_ONCE(!args->serialized_data)) + return; + + inode_lock(inode); + shmem_freeze(inode, false); + + ser =3D phys_to_virt(args->serialized_data); + + memfd_luo_unpreserve_folios(&ser->folios, args->private_data, + ser->nr_folios); + + kho_unpreserve_free(ser); + inode_unlock(inode); +} + +static void memfd_luo_discard_folios(const struct memfd_luo_folio_ser *fol= ios_ser, + u64 nr_folios) +{ + u64 i; + + for (i =3D 0; i < nr_folios; i++) { + const struct memfd_luo_folio_ser *pfolio =3D &folios_ser[i]; + struct folio *folio; + phys_addr_t phys; + + if (!pfolio->pfn) + continue; + + phys =3D PFN_PHYS(pfolio->pfn); + folio =3D kho_restore_folio(phys); + if (!folio) { + pr_warn_ratelimited("Unable to restore folio at physical address: %llx\= n", + phys); + continue; + } + + folio_put(folio); + } +} + +static void memfd_luo_finish(struct liveupdate_file_op_args *args) +{ + struct memfd_luo_folio_ser *folios_ser; + struct memfd_luo_ser *ser; + + if (args->retrieved) + return; + + ser =3D phys_to_virt(args->serialized_data); + if (!ser) + return; + + if (ser->nr_folios) { + folios_ser =3D kho_restore_vmalloc(&ser->folios); + if (!folios_ser) + goto out; + + memfd_luo_discard_folios(folios_ser, ser->nr_folios); + vfree(folios_ser); + } + +out: + kho_restore_free(ser); +} + +static int memfd_luo_retrieve_folios(struct file *file, + struct memfd_luo_folio_ser *folios_ser, + u64 nr_folios) +{ + struct inode *inode =3D file_inode(file); + struct address_space *mapping =3D inode->i_mapping; + struct folio *folio; + int err =3D -EIO; + long i; + + for (i =3D 0; i < nr_folios; i++) { + const struct memfd_luo_folio_ser *pfolio =3D 
&folios_ser[i]; + phys_addr_t phys; + u64 index; + int flags; + + if (!pfolio->pfn) + continue; + + phys =3D PFN_PHYS(pfolio->pfn); + folio =3D kho_restore_folio(phys); + if (!folio) { + pr_err("Unable to restore folio at physical address: %llx\n", + phys); + goto put_folios; + } + index =3D pfolio->index; + flags =3D pfolio->flags; + + /* Set up the folio for insertion. */ + __folio_set_locked(folio); + __folio_set_swapbacked(folio); + + err =3D mem_cgroup_charge(folio, NULL, mapping_gfp_mask(mapping)); + if (err) { + pr_err("shmem: failed to charge folio index %ld: %d\n", + i, err); + goto unlock_folio; + } + + err =3D shmem_add_to_page_cache(folio, mapping, index, NULL, + mapping_gfp_mask(mapping)); + if (err) { + pr_err("shmem: failed to add to page cache folio index %ld: %d\n", + i, err); + goto unlock_folio; + } + + if (flags & MEMFD_LUO_FOLIO_UPTODATE) + folio_mark_uptodate(folio); + if (flags & MEMFD_LUO_FOLIO_DIRTY) + folio_mark_dirty(folio); + + err =3D shmem_inode_acct_blocks(inode, 1); + if (err) { + pr_err("shmem: failed to account folio index %ld: %d\n", + i, err); + goto unlock_folio; + } + + shmem_recalc_inode(inode, 1, 0); + folio_add_lru(folio); + folio_unlock(folio); + folio_put(folio); + } + + return 0; + +unlock_folio: + folio_unlock(folio); + folio_put(folio); +put_folios: + /* + * Note: don't free the folios already added to the file. They will be + * freed when the file is freed. Free the ones not added yet here. 
+ */ + for (long j =3D i + 1; j < nr_folios; j++) { + const struct memfd_luo_folio_ser *pfolio =3D &folios_ser[j]; + + folio =3D kho_restore_folio(PFN_PHYS(pfolio->pfn)); + if (folio) + folio_put(folio); + } + + return err; +} + +static int memfd_luo_retrieve(struct liveupdate_file_op_args *args) +{ + struct memfd_luo_folio_ser *folios_ser; + struct memfd_luo_ser *ser; + struct file *file; + int err; + + ser =3D phys_to_virt(args->serialized_data); + if (!ser) + return -EINVAL; + + file =3D shmem_file_setup("", 0, VM_NORESERVE); + + if (IS_ERR(file)) { + pr_err("failed to setup file: %pe\n", file); + return PTR_ERR(file); + } + + vfs_setpos(file, ser->pos, MAX_LFS_FILESIZE); + file->f_inode->i_size =3D ser->size; + + if (ser->nr_folios) { + folios_ser =3D kho_restore_vmalloc(&ser->folios); + if (!folios_ser) { + err =3D -EINVAL; + goto put_file; + } + + err =3D memfd_luo_retrieve_folios(file, folios_ser, ser->nr_folios); + vfree(folios_ser); + if (err) + goto put_file; + } + + args->file =3D file; + kho_restore_free(ser); + + return 0; + +put_file: + fput(file); + + return err; +} + +static bool memfd_luo_can_preserve(struct liveupdate_file_handler *handler, + struct file *file) +{ + struct inode *inode =3D file_inode(file); + + return shmem_file(file) && !inode->i_nlink; +} + +static const struct liveupdate_file_ops memfd_luo_file_ops =3D { + .freeze =3D memfd_luo_freeze, + .finish =3D memfd_luo_finish, + .retrieve =3D memfd_luo_retrieve, + .preserve =3D memfd_luo_preserve, + .unpreserve =3D memfd_luo_unpreserve, + .can_preserve =3D memfd_luo_can_preserve, + .owner =3D THIS_MODULE, +}; + +static struct liveupdate_file_handler memfd_luo_handler =3D { + .ops =3D &memfd_luo_file_ops, + .compatible =3D MEMFD_LUO_FH_COMPATIBLE, +}; + +static int __init memfd_luo_init(void) +{ + int err =3D liveupdate_register_file_handler(&memfd_luo_handler); + + if (err && err !=3D -EOPNOTSUPP) { + pr_err("Could not register luo filesystem handler: %pe\n", + ERR_PTR(err)); + + return err; + } + 
+ return 0; +} +late_initcall(memfd_luo_init); --=20 2.52.0.460.gd25c4c69ec-goog From nobody Tue Dec 2 00:04:03 2025 Received: from mail-yx1-f53.google.com (mail-yx1-f53.google.com [74.125.224.53]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5B3F6331226 for ; Tue, 25 Nov 2025 16:59:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.224.53 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089979; cv=none; b=tQsqVX0ppIXTPKpOXd3gaqnpeY5+tGyHFuxzrqxNMXdR9EO+Ke/uIXKCvXVurX+SRTXfCNLGjZRjpwe7QhsZkRkEcVk3kcibzX1EJbBVUsnOBeUJclHslNhI//4UD3AOs7mn/L0BnfIaVpW+N1GpCNnk7rpA7+pbQR4qegYiQWs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089979; c=relaxed/simple; bh=fychgt9nAVC6qlaQbGXvu+POyBVW9WjNWP+jjeOAE6s=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Frgl+qjYPoEc+gsF/a6EmxKY25/Tbzsnk1jSprvaqTzXtLp6vac0D8GJtjOkY/g0U8dwfn/FP5ay7ps5ypCqEzFPv76BmAaBKE/DEb3roafudiG6KxG/R1LRL7cIyVwb2jlfcaa+fQ7dQAyq33MrwwUwtqlkcGBwnK+uEOFoiic= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=QFRoAdF3; arc=none smtp.client-ip=74.125.224.53 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="QFRoAdF3" Received: by mail-yx1-f53.google.com with SMTP id 956f58d0204a3-63f996d4e1aso6067488d50.0 for ; Tue, 25 Nov 2025 08:59:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1764089976; 
x=1764694776; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=RKFulOeeZC8khoIo0yKZZlpL1Ef0ob1cKcWS/buX2aU=; b=QFRoAdF3IFRPlbySoUDhL6X3A/qQ0oyijI7bkI/Dl/z7Cbhvia/AZ4zE3Ex6J6h6ro h3zKdjHMyfYexWkM1dHg690qtCGlpYwPNC1XkdQldRKhR5Nr29oZdxj5GD8r31Wa8iq5 sTpZxldZSHHzGNHhNXiK1NVfWY09tPyCd6wDnRcmZQLPExPEf3JuTic5bOKQtsGFHYvA 41SM3gDNZNKe8HgcqpGm8tkE9f1zzaVtWJPKE6QuHcye1wwgNNUV4sJDj8udaJl0Tpnm crXunP2f0rHlCq7vRIt8v3iS84L+37VEukQp5xyqK53ptQEBGgXb3vzpWK89maOJKs/A cbtA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764089976; x=1764694776; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-gg:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=RKFulOeeZC8khoIo0yKZZlpL1Ef0ob1cKcWS/buX2aU=; b=TaGV2b9Ux1SLF1o08eKuJSGxEAMRw77sFeiwLSDc/iZU0js4p7KaSIy6AZwLeKTU6g +RvU7xGRylvTXe+GWHFaoRz+DrGquW/nkKlsboqky7h3kN0oVAVGYZMXsWQbweJFTxMS eaDOGD++3JCzuQ/9x7LQH6fzvKI0cqVl9zZm3T0dQI41YFCyZ/AZ0hzPbwsqOVRSNCrL Fcpc2at0wyQN5geA2dL+nim63DVvJFI7LIJ+KXZYLHD8HE5gBxJ4ocMjeHGZFEP6aXeq 6a9BNUNPVRfp+QGUCbaJbpQzo9Lq3aP2mAFLtVlMIIxJpjruSdqE2mYZQI6JtTat9WM3 vVIA== X-Forwarded-Encrypted: i=1; AJvYcCXHxnz2klqHYKP7i3ETo+JrM44UD3CEkqsDsm32cKG2hBsoNjx4gXy+6DdjD/VngRHrC9PhKtGPkhON3IY=@vger.kernel.org X-Gm-Message-State: AOJu0YxmtPQbj04gQ5SAjXYsfqOGO0B3LQNkWOqiequ+4suLMxV4l+Mm eswxsR84ZOrArRml/77Ty10+ZxN7oBHp8p+Hi4LreSdFpFCw8RKbr2tYF+AseEFyCT4= X-Gm-Gg: ASbGncvyDMokmyMrwEzbmFJLrBlGW567pbTybXVuz3hH5BLT10O2HyHenH1FU4XG1Z8 NZdlTe1p8cFyBUg38bOKodcbEWygqmnrEGNAkZnss+NQc18hbW8ImLcVKCLqXABoBaia7qI9N9S MFK0AVxGIbYFtkZTnkWFsIrY7EjL+GLDP9CNwUfZLOYNwLLwo1vZcQdKzDpujaxfO/AUon/iwRV F+jZcSMo6hcHezqPEWzItljUTyOswq3jDT8zRDi2chvU/u0TWw4spbl6HbuPN1AwATRJQ0p4IEG 2DPzSviSl6r9SWmS/kXi7MDy5vJF1kS2pyv8nMjPY8VetWzXvERt6NmOgsUSJy66izpLj1Qbanu 
5xtTDQN/9xN0LCjEcQRlhKfiHSizwtkjoVxi+wVOFsZIRUqLqFXp3TG3APcIM6H50imxbY9qGTt 4q3/6cbScYR7s0yYnQR2CZCRJsjU9kvC/vI1c6O3cwGrKPZMFS3b1guioaz0yiy1WL04vP7Iu6Z Y4= X-Google-Smtp-Source: AGHT+IFFyrvPeYSGKaZ2gflHxpJKasjuM+nfTPdri6fpn7nF6c9WwHVJYxBbdaTgAkmoqf3bueMENQ== X-Received: by 2002:a53:d057:0:10b0:63f:a3d8:1b0e with SMTP id 956f58d0204a3-64302a3aa73mr10362858d50.12.1764089976255; Tue, 25 Nov 2025 08:59:36 -0800 (PST) Received: from soleen.c.googlers.com.com (182.221.85.34.bc.googleusercontent.com. [34.85.221.182]) by smtp.gmail.com with ESMTPSA id 00721157ae682-78a798a5518sm57284357b3.14.2025.11.25.08.59.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 08:59:35 -0800 (PST) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, 
wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org Subject: [PATCH v8 15/18] docs: add documentation for memfd preservation via LUO Date: Tue, 25 Nov 2025 11:58:45 -0500 Message-ID: <20251125165850.3389713-16-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.52.0.460.gd25c4c69ec-goog In-Reply-To: <20251125165850.3389713-1-pasha.tatashin@soleen.com> References: <20251125165850.3389713-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Pratyush Yadav Add the documentation under the "Preserving file descriptors" section of LUO's documentation. Signed-off-by: Pratyush Yadav Co-developed-by: Pasha Tatashin Signed-off-by: Pasha Tatashin Reviewed-by: Mike Rapoport (Microsoft) Tested-by: David Matlack --- Documentation/core-api/liveupdate.rst | 7 +++++++ Documentation/mm/index.rst | 1 + Documentation/mm/memfd_preservation.rst | 23 +++++++++++++++++++++++ MAINTAINERS | 1 + 4 files changed, 32 insertions(+) create mode 100644 Documentation/mm/memfd_preservation.rst diff --git a/Documentation/core-api/liveupdate.rst b/Documentation/core-api= /liveupdate.rst index cca1993008d8..7960eb15a81f 100644 --- a/Documentation/core-api/liveupdate.rst +++ b/Documentation/core-api/liveupdate.rst @@ -23,6 +23,13 @@ Live Update Orchestrator ABI .. kernel-doc:: include/linux/kho/abi/luo.h :doc: Live Update Orchestrator ABI =20 +The following types of file descriptors can be preserved: + +.. toctree:: + :maxdepth: 1 + + ../mm/memfd_preservation + Public API =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D .. 
kernel-doc:: include/linux/liveupdate.h diff --git a/Documentation/mm/index.rst b/Documentation/mm/index.rst index ba6a8872849b..7aa2a8886908 100644 --- a/Documentation/mm/index.rst +++ b/Documentation/mm/index.rst @@ -48,6 +48,7 @@ documentation, or deleted if it has served its purpose. hugetlbfs_reserv ksm memory-model + memfd_preservation mmu_notifier multigen_lru numa diff --git a/Documentation/mm/memfd_preservation.rst b/Documentation/mm/mem= fd_preservation.rst new file mode 100644 index 000000000000..66e0fb6d5ef0 --- /dev/null +++ b/Documentation/mm/memfd_preservation.rst @@ -0,0 +1,23 @@ +.. SPDX-License-Identifier: GPL-2.0-or-later + +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D +Memfd Preservation via LUO +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D + +.. kernel-doc:: mm/memfd_luo.c + :doc: Memfd Preservation via LUO + +Memfd Preservation ABI +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D + +.. kernel-doc:: include/linux/kho/abi/memfd.h + :doc: memfd Live Update ABI + +.. 
kernel-doc:: include/linux/kho/abi/memfd.h + :internal: + +See Also +=3D=3D=3D=3D=3D=3D=3D=3D + +- :doc:`/core-api/liveupdate` +- :doc:`/core-api/kho/concepts` diff --git a/MAINTAINERS b/MAINTAINERS index 425c46bba764..cabbf30d50e1 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -14473,6 +14473,7 @@ R: Pratyush Yadav L: linux-kernel@vger.kernel.org S: Maintained F: Documentation/core-api/liveupdate.rst +F: Documentation/mm/memfd_preservation.rst F: Documentation/userspace-api/liveupdate.rst F: include/linux/liveupdate.h F: include/linux/liveupdate/ --=20 2.52.0.460.gd25c4c69ec-goog From nobody Tue Dec 2 00:04:03 2025 Received: from mail-yx1-f51.google.com (mail-yx1-f51.google.com [74.125.224.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CA222331228 for ; Tue, 25 Nov 2025 16:59:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.224.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089982; cv=none; b=lFmANexhdja8VOZwaSjnIJfVH4+Q4i5X8RkvmOeMMJhsDY18wlVbeG+lfTw4ALe/1w7ye929FyzZ+7r0H8MZw6MpLErOhD3Wcsp8kKdInS+Y0twwTlQJCO6tFktj424dG3I03hARnHdkxj0cRI0uzgxQ+DXoixwcezmBArB/GVk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089982; c=relaxed/simple; bh=HEODgnQEmrvDg0TTbdzQfuYXj8+LwYzuO6fQwWMxlhU=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=I2kPiwJRPV00ZRB1VwJGKGVQadPYMz570AJYdC5Z0QEuVia9AopP7QZ6RipW/cYmzj/bYPfhHunhOKGYDgGIJCZYrYilNn7mqSAfBaQkCVni8eGLV8IrJDUslld4yQZKSQSPunskEE/06eAt7iTkVDL3Weq1SqtmD8vzwb+09Uo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=RCacG+XQ; arc=none smtp.client-ip=74.125.224.51 Authentication-Results: 
smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="RCacG+XQ" Received: by mail-yx1-f51.google.com with SMTP id 956f58d0204a3-640d4f2f13dso4748546d50.1 for ; Tue, 25 Nov 2025 08:59:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1764089978; x=1764694778; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=56ycedZJemtskFHXt7gNPJP5doNt+tRE9168VaCAAZg=; b=RCacG+XQx8y56aaDhcvOAVb6bUURp5uOw0M5pCX1l9OmlmHlW/yYv1zcA9o3i02xWa rlNrZRbHmYbD9uHZSBhB8yg9ZBClb6i1VYO3pDsQbZ3P0K43rNJsBjTEAEoeHSzaRTc1 4uE3hJnv/v9T+h0mRQOvHtWiRyrGWmzHXoeaz/7paGSKsHchD1K5DmrKcXbKIYJOXvkh AAeJCo5aLjw0Tg0OVo3LmYMTbb19XOkiaZc0sKIHuEXGDzJA6PpfDMVa36FQNbwFP2t0 jGBkaHIY9pS7TJDCyEO/j6YOGV8OYH6Z+U+KyvPbA9Xu6NZ4k+9fKvmceTsJzK0lkCCv YwUg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764089978; x=1764694778; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-gg:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=56ycedZJemtskFHXt7gNPJP5doNt+tRE9168VaCAAZg=; b=mG5mv0VkDTotz66BmwuYxAWuS0rencs2EO8pr7cYPrAZ7Hb1osGF7n/rX75jxTaLwJ a1UumZbypoe87gvu8FTIr07oCkSS4tpYHVJ8g/p96WnE3HSHiFUgtfFUKlXTHuZCjkBD rz3xiGCVBm93EJVro/NxBpg8BHTENgK2mBKh39IZmC5aZQ6BJP+Gp1C3PlGFETVOECIL s6hksDJa8ZcVZqBtO8i1TXnVHhWS67KAg1w1sQrPIqOkVuaAjDM4tKFu20gPHRkT1qCg tfNtQEA3oHNyT7pSansaOlsE6WbaWjqQFRJQBMgmffOh+2ila1+TlXIhljZkfXHaRzra 4FBw== X-Forwarded-Encrypted: i=1; AJvYcCX0BZ8MWTmfXsR2gNrw3np30CExYtlGlpnv7+Z/QsVyjsBTUzIkC+oYseJFFqDpbDxZGHM4BAsC+m4T9KA=@vger.kernel.org X-Gm-Message-State: 
AOJu0YwjVm7ENlGd0b0+kfOoZV5P4cWDkY+rSM1hbO7TcqFbvgoiTmD+ fmS7F4qB2TtQPw45W9FEYK9SX+mNPv/y9o5dN/ToldNLDb5XCnK6A0q0OeurKz1f4g4= X-Gm-Gg: ASbGncsHOX688ctfwYTtMWXjFHPM4YAholTunMH371UrFeclK6c+HR8tEFYXz5jOy6v 2KUdpMuAH1V7oImjj1CGnw/y4HafEPnf2PS0pqqK6f+pU6G94wpl+gmcXyx1l1b5cmJQYkg+esx eqL2fzOuHfmwmDARWbc1Nr14jm5/4bHzuLD9MDGDOnlwagSFomgiuvRwUFY/2aOJxjpAYVOQLUy 9J6Coz4fLvXe6naZb9hKhumJdWaqN9QAWN4MfqYKXk0ZUo62EWD31nOkPYsfTHbhI4uHcfvPC9X cnCVMEEZfZZ0mDvWh3xZe0AQqM8EnTka0zIJaBm5R6c2KYcSAHzXH9SyAyokomc9JpYQeBNfHhd OwqQMgrjsgctnmo3KDLSkhJ+A2Yzxge4s9delJYHSgMZFBsjdzzVwE1sqZ6Ih3YovdlWYTRQSXn okh+CFENoX9a/GsFq2PVyD5TVUW0rfc6hFZlua+7njXrIjjLKlvFNS8BF4MZko4Vnf X-Google-Smtp-Source: AGHT+IFkPMOigFOoUXlZI1AX5lk4gQ7DF9aW/Qjd084DWZEkCwHhiHMRNCk0TlUzNybxB3Ue7agvUw== X-Received: by 2002:a05:690e:1403:b0:641:f5bc:69a6 with SMTP id 956f58d0204a3-6432939b0e8mr2129643d50.84.1764089978372; Tue, 25 Nov 2025 08:59:38 -0800 (PST) Received: from soleen.c.googlers.com.com (182.221.85.34.bc.googleusercontent.com. [34.85.221.182]) by smtp.gmail.com with ESMTPSA id 00721157ae682-78a798a5518sm57284357b3.14.2025.11.25.08.59.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 08:59:37 -0800 (PST) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, 
linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org Subject: [PATCH v8 16/18] selftests/liveupdate: Add userspace API selftests Date: Tue, 25 Nov 2025 11:58:46 -0500 Message-ID: <20251125165850.3389713-17-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.52.0.460.gd25c4c69ec-goog In-Reply-To: <20251125165850.3389713-1-pasha.tatashin@soleen.com> References: <20251125165850.3389713-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Introduce a selftest suite for LUO. These tests validate the core userspace-facing API provided by the /dev/liveupdate device and its associated ioctls. The suite covers fundamental device behavior, session management, and the file preservation mechanism using memfd as a test case. This provides regression testing for the LUO uAPI. The following functionality is verified: Device Access: Basic open and close operations on /dev/liveupdate. Enforcement of exclusive device access (verifying EBUSY on a second open). 
Session Management: Successful creation of sessions with unique names. Failure to create sessions with duplicate names. File Preservation: Preserving a single memfd and verifying its content remains intact post-preservation. Preserving multiple memfds within a single session, each with unique data. A complex scenario involving multiple sessions, each containing a mix of empty and data-filled memfds. Note: This test suite is limited to verifying the pre-kexec functionality of LUO (e.g., session creation, file preservation). The post-kexec restoration of resources is not covered, as the kselftest framework does not currently support orchestrating a reboot and continuing execution in the new kernel. Signed-off-by: Pasha Tatashin Reviewed-by: Pratyush Yadav Reviewed-by: Mike Rapoport (Microsoft) Tested-by: David Matlack --- MAINTAINERS | 1 + tools/testing/selftests/Makefile | 1 + tools/testing/selftests/liveupdate/.gitignore | 9 + tools/testing/selftests/liveupdate/Makefile | 27 ++ tools/testing/selftests/liveupdate/config | 11 + .../testing/selftests/liveupdate/liveupdate.c | 348 ++++++++++++++++++ 6 files changed, 397 insertions(+) create mode 100644 tools/testing/selftests/liveupdate/.gitignore create mode 100644 tools/testing/selftests/liveupdate/Makefile create mode 100644 tools/testing/selftests/liveupdate/config create mode 100644 tools/testing/selftests/liveupdate/liveupdate.c diff --git a/MAINTAINERS b/MAINTAINERS index cabbf30d50e1..83bac6c48c98 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -14480,6 +14480,7 @@ F: include/linux/liveupdate/ F: include/uapi/linux/liveupdate.h F: kernel/liveupdate/ F: mm/memfd_luo.c +F: tools/testing/selftests/liveupdate/ =20 LLC (802.2) L: netdev@vger.kernel.org diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Mak= efile index c46ebdb9b8ef..56e44a98d6a5 100644 --- a/tools/testing/selftests/Makefile +++ b/tools/testing/selftests/Makefile @@ -54,6 +54,7 @@ TARGETS +=3D kvm TARGETS +=3D landlock TARGETS +=3D 
lib TARGETS +=3D livepatch +TARGETS +=3D liveupdate TARGETS +=3D lkdtm TARGETS +=3D lsm TARGETS +=3D membarrier diff --git a/tools/testing/selftests/liveupdate/.gitignore b/tools/testing/= selftests/liveupdate/.gitignore new file mode 100644 index 000000000000..661827083ab6 --- /dev/null +++ b/tools/testing/selftests/liveupdate/.gitignore @@ -0,0 +1,9 @@ +# SPDX-License-Identifier: GPL-2.0-only +* +!/**/ +!*.c +!*.h +!*.sh +!.gitignore +!config +!Makefile diff --git a/tools/testing/selftests/liveupdate/Makefile b/tools/testing/se= lftests/liveupdate/Makefile new file mode 100644 index 000000000000..620cb4ce85af --- /dev/null +++ b/tools/testing/selftests/liveupdate/Makefile @@ -0,0 +1,27 @@ +# SPDX-License-Identifier: GPL-2.0-only + +TEST_GEN_PROGS +=3D liveupdate + +include ../lib.mk + +CFLAGS +=3D $(KHDR_INCLUDES) +CFLAGS +=3D -Wall -O2 -Wno-unused-function +CFLAGS +=3D -MD + +LIB_O :=3D $(patsubst %.c, $(OUTPUT)/%.o, $(LIB_C)) +TEST_O :=3D $(patsubst %, %.o, $(TEST_GEN_PROGS)) +TEST_O +=3D $(patsubst %, %.o, $(TEST_GEN_PROGS_EXTENDED)) + +TEST_DEP_FILES :=3D $(patsubst %.o, %.d, $(LIB_O)) +TEST_DEP_FILES +=3D $(patsubst %.o, %.d, $(TEST_O)) +-include $(TEST_DEP_FILES) + +$(LIB_O): $(OUTPUT)/%.o: %.c + $(CC) $(CFLAGS) $(CPPFLAGS) $(TARGET_ARCH) -c $< -o $@ + +$(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED): $(OUTPUT)/%: %.o $(LIB_O) + $(CC) $(CFLAGS) $(CPPFLAGS) $(LDFLAGS) $(TARGET_ARCH) $< $(LIB_O) $(LDLIB= S) -o $@ + +EXTRA_CLEAN +=3D $(LIB_O) +EXTRA_CLEAN +=3D $(TEST_O) +EXTRA_CLEAN +=3D $(TEST_DEP_FILES) diff --git a/tools/testing/selftests/liveupdate/config b/tools/testing/self= tests/liveupdate/config new file mode 100644 index 000000000000..91d03f9a6a39 --- /dev/null +++ b/tools/testing/selftests/liveupdate/config @@ -0,0 +1,11 @@ +CONFIG_BLK_DEV_INITRD=3Dy +CONFIG_KEXEC_FILE=3Dy +CONFIG_KEXEC_HANDOVER=3Dy +CONFIG_KEXEC_HANDOVER_ENABLE_DEFAULT=3Dy +CONFIG_KEXEC_HANDOVER_DEBUGFS=3Dy +CONFIG_KEXEC_HANDOVER_DEBUG=3Dy +CONFIG_LIVEUPDATE=3Dy 
+CONFIG_LIVEUPDATE_TEST=3Dy +CONFIG_MEMFD_CREATE=3Dy +CONFIG_TMPFS=3Dy +CONFIG_SHMEM=3Dy diff --git a/tools/testing/selftests/liveupdate/liveupdate.c b/tools/testin= g/selftests/liveupdate/liveupdate.c new file mode 100644 index 000000000000..c2878e3d5ef9 --- /dev/null +++ b/tools/testing/selftests/liveupdate/liveupdate.c @@ -0,0 +1,348 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +/* + * Selftests for the Live Update Orchestrator. + * This test suite verifies the functionality and behavior of the + * /dev/liveupdate character device and its session management capabilitie= s. + * + * Tests include: + * - Device access: basic open/close, and enforcement of exclusive access. + * - Session management: creation of unique sessions, and duplicate name d= etection. + * - Resource preservation: successfully preserving individual and multipl= e memfds, + * verifying contents remain accessible. + * - Complex multi-session scenarios involving mixed empty and populated f= iles. + */ + +#include +#include +#include +#include +#include + +#include + +#include "../kselftest.h" +#include "../kselftest_harness.h" + +#define LIVEUPDATE_DEV "/dev/liveupdate" + +FIXTURE(liveupdate_device) { + int fd1; + int fd2; +}; + +FIXTURE_SETUP(liveupdate_device) +{ + self->fd1 =3D -1; + self->fd2 =3D -1; +} + +FIXTURE_TEARDOWN(liveupdate_device) +{ + if (self->fd1 >=3D 0) + close(self->fd1); + if (self->fd2 >=3D 0) + close(self->fd2); +} + +/* + * Test Case: Basic Open and Close + * + * Verifies that the /dev/liveupdate device can be opened and subsequently + * closed without errors. Skips if the device does not exist. 
+ */ +TEST_F(liveupdate_device, basic_open_close) +{ + self->fd1 =3D open(LIVEUPDATE_DEV, O_RDWR); + + if (self->fd1 < 0 && errno =3D=3D ENOENT) + SKIP(return, "%s does not exist.", LIVEUPDATE_DEV); + + ASSERT_GE(self->fd1, 0); + ASSERT_EQ(close(self->fd1), 0); + self->fd1 =3D -1; +} + +/* + * Test Case: Exclusive Open Enforcement + * + * Verifies that the /dev/liveupdate device can only be opened by one proc= ess + * at a time. It checks that a second attempt to open the device fails with + * the EBUSY error code. + */ +TEST_F(liveupdate_device, exclusive_open) +{ + self->fd1 =3D open(LIVEUPDATE_DEV, O_RDWR); + + if (self->fd1 < 0 && errno =3D=3D ENOENT) + SKIP(return, "%s does not exist.", LIVEUPDATE_DEV); + + ASSERT_GE(self->fd1, 0); + self->fd2 =3D open(LIVEUPDATE_DEV, O_RDWR); + EXPECT_LT(self->fd2, 0); + EXPECT_EQ(errno, EBUSY); +} + +/* Helper function to create a LUO session via ioctl. */ +static int create_session(int lu_fd, const char *name) +{ + struct liveupdate_ioctl_create_session args =3D {}; + + args.size =3D sizeof(args); + strncpy((char *)args.name, name, sizeof(args.name) - 1); + + if (ioctl(lu_fd, LIVEUPDATE_IOCTL_CREATE_SESSION, &args)) + return -errno; + + return args.fd; +} + +/* + * Test Case: Create Duplicate Session + * + * Verifies that attempting to create two sessions with the same name fails + * on the second attempt with EEXIST. 
+ */ +TEST_F(liveupdate_device, create_duplicate_session) +{ + int session_fd1, session_fd2; + + self->fd1 =3D open(LIVEUPDATE_DEV, O_RDWR); + if (self->fd1 < 0 && errno =3D=3D ENOENT) + SKIP(return, "%s does not exist", LIVEUPDATE_DEV); + + ASSERT_GE(self->fd1, 0); + + session_fd1 =3D create_session(self->fd1, "duplicate-session-test"); + ASSERT_GE(session_fd1, 0); + + session_fd2 =3D create_session(self->fd1, "duplicate-session-test"); + EXPECT_LT(session_fd2, 0); + EXPECT_EQ(-session_fd2, EEXIST); + + ASSERT_EQ(close(session_fd1), 0); +} + +/* + * Test Case: Create Distinct Sessions + * + * Verifies that creating two sessions with different names succeeds. + */ +TEST_F(liveupdate_device, create_distinct_sessions) +{ + int session_fd1, session_fd2; + + self->fd1 =3D open(LIVEUPDATE_DEV, O_RDWR); + if (self->fd1 < 0 && errno =3D=3D ENOENT) + SKIP(return, "%s does not exist", LIVEUPDATE_DEV); + + ASSERT_GE(self->fd1, 0); + + session_fd1 =3D create_session(self->fd1, "distinct-session-1"); + ASSERT_GE(session_fd1, 0); + + session_fd2 =3D create_session(self->fd1, "distinct-session-2"); + ASSERT_GE(session_fd2, 0); + + ASSERT_EQ(close(session_fd1), 0); + ASSERT_EQ(close(session_fd2), 0); +} + +static int preserve_fd(int session_fd, int fd_to_preserve, __u64 token) +{ + struct liveupdate_session_preserve_fd args =3D {}; + + args.size =3D sizeof(args); + args.fd =3D fd_to_preserve; + args.token =3D token; + + if (ioctl(session_fd, LIVEUPDATE_SESSION_PRESERVE_FD, &args)) + return -errno; + + return 0; +} + +/* + * Test Case: Preserve MemFD + * + * Verifies that a valid memfd can be successfully preserved in a session = and + * that its contents remain intact after the preservation call. 
+ */ +TEST_F(liveupdate_device, preserve_memfd) +{ + const char *test_str =3D "hello liveupdate"; + char read_buf[64] =3D {}; + int session_fd, mem_fd; + + self->fd1 =3D open(LIVEUPDATE_DEV, O_RDWR); + if (self->fd1 < 0 && errno =3D=3D ENOENT) + SKIP(return, "%s does not exist", LIVEUPDATE_DEV); + ASSERT_GE(self->fd1, 0); + + session_fd =3D create_session(self->fd1, "preserve-memfd-test"); + ASSERT_GE(session_fd, 0); + + mem_fd =3D memfd_create("test-memfd", 0); + ASSERT_GE(mem_fd, 0); + + ASSERT_EQ(write(mem_fd, test_str, strlen(test_str)), strlen(test_str)); + ASSERT_EQ(preserve_fd(session_fd, mem_fd, 0x1234), 0); + ASSERT_EQ(close(session_fd), 0); + + ASSERT_EQ(lseek(mem_fd, 0, SEEK_SET), 0); + ASSERT_EQ(read(mem_fd, read_buf, sizeof(read_buf)), strlen(test_str)); + ASSERT_STREQ(read_buf, test_str); + ASSERT_EQ(close(mem_fd), 0); +} + +/* + * Test Case: Preserve Multiple MemFDs + * + * Verifies that multiple memfds can be preserved in a single session, + * each with a unique token, and that their contents remain distinct and + * correct after preservation. 
+ */ +TEST_F(liveupdate_device, preserve_multiple_memfds) +{ + const char *test_str1 =3D "data for memfd one"; + const char *test_str2 =3D "data for memfd two"; + char read_buf[64] =3D {}; + int session_fd, mem_fd1, mem_fd2; + + self->fd1 =3D open(LIVEUPDATE_DEV, O_RDWR); + if (self->fd1 < 0 && errno =3D=3D ENOENT) + SKIP(return, "%s does not exist", LIVEUPDATE_DEV); + ASSERT_GE(self->fd1, 0); + + session_fd =3D create_session(self->fd1, "preserve-multi-memfd-test"); + ASSERT_GE(session_fd, 0); + + mem_fd1 =3D memfd_create("test-memfd-1", 0); + ASSERT_GE(mem_fd1, 0); + mem_fd2 =3D memfd_create("test-memfd-2", 0); + ASSERT_GE(mem_fd2, 0); + + ASSERT_EQ(write(mem_fd1, test_str1, strlen(test_str1)), strlen(test_str1)= ); + ASSERT_EQ(write(mem_fd2, test_str2, strlen(test_str2)), strlen(test_str2)= ); + + ASSERT_EQ(preserve_fd(session_fd, mem_fd1, 0xAAAA), 0); + ASSERT_EQ(preserve_fd(session_fd, mem_fd2, 0xBBBB), 0); + + memset(read_buf, 0, sizeof(read_buf)); + ASSERT_EQ(lseek(mem_fd1, 0, SEEK_SET), 0); + ASSERT_EQ(read(mem_fd1, read_buf, sizeof(read_buf)), strlen(test_str1)); + ASSERT_STREQ(read_buf, test_str1); + + memset(read_buf, 0, sizeof(read_buf)); + ASSERT_EQ(lseek(mem_fd2, 0, SEEK_SET), 0); + ASSERT_EQ(read(mem_fd2, read_buf, sizeof(read_buf)), strlen(test_str2)); + ASSERT_STREQ(read_buf, test_str2); + + ASSERT_EQ(close(mem_fd1), 0); + ASSERT_EQ(close(mem_fd2), 0); + ASSERT_EQ(close(session_fd), 0); +} + +/* + * Test Case: Preserve Complex Scenario + * + * Verifies a more complex scenario with multiple sessions and a mix of em= pty + * and non-empty memfds distributed across them. 
+ */ +TEST_F(liveupdate_device, preserve_complex_scenario) +{ + const char *data1 =3D "data for session 1"; + const char *data2 =3D "data for session 2"; + char read_buf[64] =3D {}; + int session_fd1, session_fd2; + int mem_fd_data1, mem_fd_empty1, mem_fd_data2, mem_fd_empty2; + + self->fd1 =3D open(LIVEUPDATE_DEV, O_RDWR); + if (self->fd1 < 0 && errno =3D=3D ENOENT) + SKIP(return, "%s does not exist", LIVEUPDATE_DEV); + ASSERT_GE(self->fd1, 0); + + session_fd1 =3D create_session(self->fd1, "complex-session-1"); + ASSERT_GE(session_fd1, 0); + session_fd2 =3D create_session(self->fd1, "complex-session-2"); + ASSERT_GE(session_fd2, 0); + + mem_fd_data1 =3D memfd_create("data1", 0); + ASSERT_GE(mem_fd_data1, 0); + ASSERT_EQ(write(mem_fd_data1, data1, strlen(data1)), strlen(data1)); + + mem_fd_empty1 =3D memfd_create("empty1", 0); + ASSERT_GE(mem_fd_empty1, 0); + + mem_fd_data2 =3D memfd_create("data2", 0); + ASSERT_GE(mem_fd_data2, 0); + ASSERT_EQ(write(mem_fd_data2, data2, strlen(data2)), strlen(data2)); + + mem_fd_empty2 =3D memfd_create("empty2", 0); + ASSERT_GE(mem_fd_empty2, 0); + + ASSERT_EQ(preserve_fd(session_fd1, mem_fd_data1, 0x1111), 0); + ASSERT_EQ(preserve_fd(session_fd1, mem_fd_empty1, 0x2222), 0); + ASSERT_EQ(preserve_fd(session_fd2, mem_fd_data2, 0x3333), 0); + ASSERT_EQ(preserve_fd(session_fd2, mem_fd_empty2, 0x4444), 0); + + ASSERT_EQ(lseek(mem_fd_data1, 0, SEEK_SET), 0); + ASSERT_EQ(read(mem_fd_data1, read_buf, sizeof(read_buf)), strlen(data1)); + ASSERT_STREQ(read_buf, data1); + + memset(read_buf, 0, sizeof(read_buf)); + ASSERT_EQ(lseek(mem_fd_data2, 0, SEEK_SET), 0); + ASSERT_EQ(read(mem_fd_data2, read_buf, sizeof(read_buf)), strlen(data2)); + ASSERT_STREQ(read_buf, data2); + + ASSERT_EQ(lseek(mem_fd_empty1, 0, SEEK_SET), 0); + ASSERT_EQ(read(mem_fd_empty1, read_buf, sizeof(read_buf)), 0); + + ASSERT_EQ(lseek(mem_fd_empty2, 0, SEEK_SET), 0); + ASSERT_EQ(read(mem_fd_empty2, read_buf, sizeof(read_buf)), 0); + + ASSERT_EQ(close(mem_fd_data1), 0); + 
ASSERT_EQ(close(mem_fd_empty1), 0); + ASSERT_EQ(close(mem_fd_data2), 0); + ASSERT_EQ(close(mem_fd_empty2), 0); + ASSERT_EQ(close(session_fd1), 0); + ASSERT_EQ(close(session_fd2), 0); +} + +/* + * Test Case: Preserve Unsupported File Descriptor + * + * Verifies that attempting to preserve a file descriptor that does not ha= ve + * a registered Live Update handler fails gracefully. + * Uses /dev/null as a representative of a file type (character device) + * that is not supported by the orchestrator. + */ +TEST_F(liveupdate_device, preserve_unsupported_fd) +{ + int session_fd, unsupported_fd; + int ret; + + self->fd1 =3D open(LIVEUPDATE_DEV, O_RDWR); + if (self->fd1 < 0 && errno =3D=3D ENOENT) + SKIP(return, "%s does not exist", LIVEUPDATE_DEV); + ASSERT_GE(self->fd1, 0); + + session_fd =3D create_session(self->fd1, "unsupported-fd-test"); + ASSERT_GE(session_fd, 0); + + unsupported_fd =3D open("/dev/null", O_RDWR); + ASSERT_GE(unsupported_fd, 0); + + ret =3D preserve_fd(session_fd, unsupported_fd, 0xDEAD); + EXPECT_EQ(ret, -ENOENT); + + ASSERT_EQ(close(unsupported_fd), 0); + ASSERT_EQ(close(session_fd), 0); +} + +TEST_HARNESS_MAIN --=20 2.52.0.460.gd25c4c69ec-goog From nobody Tue Dec 2 00:04:03 2025 Received: from mail-yx1-f50.google.com (mail-yx1-f50.google.com [74.125.224.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7D475331A62 for ; Tue, 25 Nov 2025 16:59:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=74.125.224.50 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089984; cv=none; b=ug9qflOGtLlKAwrnFrwJKfg7nflAEP2srGkPaII/zlxOUf6v6nikLVyBvM8f/E17Rj0ZIdeBep9Aof1gaeasmAFpOhhtIswsUBSrAx/AT6kV77fi0g/JsoBeLmDvjn5VF/f7LX4iffey936Dlb1bw/Q4Sz3Pm422LHmJhA5mJr8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089984; c=relaxed/simple; 
bh=nU/T6u7xPoCdN1CIsmHHus/vm+8czfvaxNynn5WVm3w=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=le6rSLQCmuuuvAQBTMf6JmPi8y9GZvUzyuKwmnERtCk4gwq70vpMunBNPJOVXaoeesoc+EbvhfPOQVEhjHeRICSoWgoHKnocL1Af6j/4D4qgqDwk15UM2JWGZGzNOmUtr3Ojg1EyxDfEl6TLwDHCwLA3mHnIo7lkQiBWz5RdGHM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=QDESGPoZ; arc=none smtp.client-ip=74.125.224.50 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="QDESGPoZ" Received: by mail-yx1-f50.google.com with SMTP id 956f58d0204a3-6420dc2e5feso4511160d50.3 for ; Tue, 25 Nov 2025 08:59:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1764089980; x=1764694780; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=1zn01wPWOlP3y55xcUHgz6OrhAWK/GekbUUj2NOJck4=; b=QDESGPoZwZI//DRxxH1o9Nrl7KrUeCeCkaHSAO68eUx3W9SR16cFn9JOTa8xmrEbfb iKzfMrBbYb2CLXl3tgifOn6dqX7qlmaMfn/eI2aqkB8WHL8zQsN457UtyD2fjtxOxCVg 55DxY+IdEj94UzA9LxxWUBW0IHPf6r/cSEo03hVKpjBrPQUngpQf7hXfzCaXfZFEtNOK knwC+jqQoaF2yN4qOfzCoXfS7E7iiPohYTdN9Fyi8JoSEeBu/Izel0KXsWfK+AuZibpe Q1naw3OgYi5qUdhJi0j54ynjicxysMBVw4edDvk44Gu1xMgZ6pK6ZGugZb+fDNZUlEEv +kLQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764089980; x=1764694780; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-gg:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; 
bh=1zn01wPWOlP3y55xcUHgz6OrhAWK/GekbUUj2NOJck4=; b=eiBhsPBihg/EjKF8cWvk4ZXwW88QRQHWYI6z9LArLEn3+in0sButt8lexzjmaP+XxX VY0aPoC3F2e1HyWEIsuExef6NNZXtlnqFIgkg+8eHlVFXw9crv+L7pG45dRbBwuLBU5T 0A6Kae7gztjeTay+LKdtW4k4iJ2HdwcPRXJ9PpSNFUEqacqtAbVdu3U/lKe15Ho1b0z9 a6KKKciPuCWVU+Z/y+a0BYL6PMdjnIMIXoKcvICKntJUDwXOtQD0ATD5SPWnF9/GrBT0 WbDwQoJnSqO/3DMox4qpY5SZE6Pub9iaO1aAwzqgfazl4Es+vyXarRWkYmf0pvmIyv22 wwKg== X-Forwarded-Encrypted: i=1; AJvYcCUYM6JTdUYjLn46p7QLjWqcqZQ9ZGaixrT5wTm9+eac4jJEoRYfRcDLas1bw7akPL7QtEbUgHF2QRkIUCM=@vger.kernel.org X-Gm-Message-State: AOJu0YzAomlCQt27ATo+mLamFS5DKb63u4l7A7GCaPgjfa9auyU4w+CF 3JMxm6nDu6Hd5a42z7g32yAnDTiZvQiCZFuLWMPOGvv+k+RzaEwBRPC80GXsK1vPQaU= X-Gm-Gg: ASbGnct4y3YNvQj3M9iVxk0VFfEOX8pIbP27Xg2k3/byXc4j31nWt1hmEPN6whEYi5+ Z93tC/L+u4fY9j27LsAa5K/M25fcOe6HOulTqMI5KZZ3RB3PmJdx0OOQTd9gSXf1FiDZSZLmLaj tdVlaNOkF5njYsIJ57v5CS+QdXnGPamEqPBqjcpTUq3fN8721MYv/9ImMm4qUHnhcsDwpp73Qh2 5RmnUhSidwLgrsZiev5bI0doUbrHBfqtZZQH/LG8zf7cQuyGSzn+GWoxzkOH0UzWWYNnTidUXuj tk2Vsa7ZQ7Pgrc4RG8bsyI5vWK4W4Lw4CSu8z1MTOo5QlnZ6ev4qtM44CIWDsaSVfI8wIB/tKAk DVjVmOo64JKjYxQzKFWQLynk4Eb6NOOscYgmK+phKUYmRPZcwCaqjOZemnLfevvvjPScheDWjMA gFVciW0R6FhjmJ7AJ0Wg40wxYyXbkbmHJ5hXYr1WRSzNy/jzWHzeZkTt7yY+96kSrH X-Google-Smtp-Source: AGHT+IHaLBHTW0gdKW5yIXp86TXIcw0le8lgb42O63AJOI349kwyVeCAoddVEYxKN/vR8v7sgdqIVQ== X-Received: by 2002:a05:690e:2459:b0:63e:30e1:4429 with SMTP id 956f58d0204a3-64302abcb44mr10008545d50.38.1764089980255; Tue, 25 Nov 2025 08:59:40 -0800 (PST) Received: from soleen.c.googlers.com.com (182.221.85.34.bc.googleusercontent.com. 
[34.85.221.182]) by smtp.gmail.com with ESMTPSA id 00721157ae682-78a798a5518sm57284357b3.14.2025.11.25.08.59.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 08:59:39 -0800 (PST) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org Subject: [PATCH v8 17/18] selftests/liveupdate: Add simple kexec-based selftest for LUO Date: Tue, 25 Nov 2025 11:58:47 -0500 Message-ID: 
<20251125165850.3389713-18-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.52.0.460.gd25c4c69ec-goog In-Reply-To: <20251125165850.3389713-1-pasha.tatashin@soleen.com> References: <20251125165850.3389713-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Introduce a kexec-based selftest, luo_kexec_simple, to validate the end-to-end lifecycle of a Live Update Orchestrator session across a reboot. While existing tests verify the uAPI in a pre-reboot context, this test ensures that the core functionality=E2=80=94preserving state via Kexec Hand= over and restoring it in a new kernel=E2=80=94works as expected. The test operates in two stages, managing its state across the reboot by preserving a dedicated "state session" containing a memfd. This mechanism dogfoods the LUO feature itself for state tracking, making the test self-contained. The test validates the following sequence: Stage 1 (Pre-kexec): - Creates a test session (test-session). - Creates and preserves a memfd with a known data pattern into the test session. - Creates the state-tracking session to signal progression to Stage 2. - Executes a kexec reboot via a helper script. Stage 2 (Post-kexec): - Retrieves the state-tracking session to confirm it is in the post-reboot stage. - Retrieves the preserved test session. - Restores the memfd from the test session and verifies its contents match the original data pattern written in Stage 1. - Finalizes both the test and state sessions to ensure a clean teardown. The test relies on a helper script (do_kexec.sh) to perform the reboot and a shared utility library (luo_test_utils.c) for common LUO operations, keeping the main test logic clean and focused. 
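As a rough illustration of the stage handling described above (a sketch only, not part of this patch): the helper below maps the result of looking up the state session to a stage number, along the lines of what luo_test() in luo_test_utils.c does. The name detect_stage() is hypothetical, and the lookup result is passed in as a plain integer so that the snippet runs without /dev/liveupdate.

```c
#include <errno.h>

/*
 * Hypothetical helper, for illustration only.
 * retrieve_result stands in for the return value of
 * luo_retrieve_session(): a non-negative fd on success, -errno on failure.
 */
static int detect_stage(int retrieve_result)
{
	if (retrieve_result == -ENOENT)
		return 1;	/* no state session yet: pre-kexec */
	if (retrieve_result >= 0)
		return 2;	/* state session survived: post-kexec */
	return retrieve_result;	/* any other error is passed through unchanged */
}
```

A caller would compare this value with the stage requested via --stage and stop on a mismatch.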
Signed-off-by: Pasha Tatashin Reviewed-by: Zhu Yanjun Reviewed-by: Mike Rapoport (Microsoft) Tested-by: David Matlack --- tools/testing/selftests/liveupdate/Makefile | 6 + .../testing/selftests/liveupdate/do_kexec.sh | 16 ++ .../selftests/liveupdate/luo_kexec_simple.c | 89 ++++++ .../selftests/liveupdate/luo_test_utils.c | 266 ++++++++++++++++++ .../selftests/liveupdate/luo_test_utils.h | 44 +++ 5 files changed, 421 insertions(+) create mode 100755 tools/testing/selftests/liveupdate/do_kexec.sh create mode 100644 tools/testing/selftests/liveupdate/luo_kexec_simple.c create mode 100644 tools/testing/selftests/liveupdate/luo_test_utils.c create mode 100644 tools/testing/selftests/liveupdate/luo_test_utils.h diff --git a/tools/testing/selftests/liveupdate/Makefile b/tools/testing/se= lftests/liveupdate/Makefile index 620cb4ce85af..bbbec633970c 100644 --- a/tools/testing/selftests/liveupdate/Makefile +++ b/tools/testing/selftests/liveupdate/Makefile @@ -1,7 +1,13 @@ # SPDX-License-Identifier: GPL-2.0-only =20 +LIB_C +=3D luo_test_utils.c + TEST_GEN_PROGS +=3D liveupdate =20 +TEST_GEN_PROGS_EXTENDED +=3D luo_kexec_simple + +TEST_FILES +=3D do_kexec.sh + include ../lib.mk =20 CFLAGS +=3D $(KHDR_INCLUDES) diff --git a/tools/testing/selftests/liveupdate/do_kexec.sh b/tools/testing= /selftests/liveupdate/do_kexec.sh new file mode 100755 index 000000000000..3c7c6cafbef8 --- /dev/null +++ b/tools/testing/selftests/liveupdate/do_kexec.sh @@ -0,0 +1,16 @@ +#!/bin/sh +# SPDX-License-Identifier: GPL-2.0 +set -e + +# Use $KERNEL and $INITRAMFS to pass custom Kernel and optional initramfs + +KERNEL=3D"${KERNEL:-/boot/bzImage}" +set -- -l -s --reuse-cmdline "$KERNEL" + +INITRAMFS=3D"${INITRAMFS:-/boot/initramfs}" +if [ -f "$INITRAMFS" ]; then + set -- "$@" --initrd=3D"$INITRAMFS" +fi + +kexec "$@" +kexec -e diff --git a/tools/testing/selftests/liveupdate/luo_kexec_simple.c b/tools/= testing/selftests/liveupdate/luo_kexec_simple.c new file mode 100644 index 000000000000..d7ac1f3dc4cb 
--- /dev/null +++ b/tools/testing/selftests/liveupdate/luo_kexec_simple.c @@ -0,0 +1,89 @@ +// SPDX-License-Identifier: GPL-2.0-only + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + * + * A simple selftest to validate the end-to-end lifecycle of a LUO session + * across a single kexec reboot. + */ + +#include "luo_test_utils.h" + +#define TEST_SESSION_NAME "test-session" +#define TEST_MEMFD_TOKEN 0x1A +#define TEST_MEMFD_DATA "hello kexec world" + +/* Constants for the state-tracking mechanism, specific to this test file.= */ +#define STATE_SESSION_NAME "kexec_simple_state" +#define STATE_MEMFD_TOKEN 999 + +/* Stage 1: Executed before the kexec reboot. */ +static void run_stage_1(int luo_fd) +{ + int session_fd; + + ksft_print_msg("[STAGE 1] Starting pre-kexec setup...\n"); + + ksft_print_msg("[STAGE 1] Creating state file for next stage (2)...\n"); + create_state_file(luo_fd, STATE_SESSION_NAME, STATE_MEMFD_TOKEN, 2); + + ksft_print_msg("[STAGE 1] Creating session '%s' and preserving memfd...\n= ", + TEST_SESSION_NAME); + session_fd =3D luo_create_session(luo_fd, TEST_SESSION_NAME); + if (session_fd < 0) + fail_exit("luo_create_session for '%s'", TEST_SESSION_NAME); + + if (create_and_preserve_memfd(session_fd, TEST_MEMFD_TOKEN, + TEST_MEMFD_DATA) < 0) { + fail_exit("create_and_preserve_memfd for token %#x", + TEST_MEMFD_TOKEN); + } + + close(luo_fd); + daemonize_and_wait(); +} + +/* Stage 2: Executed after the kexec reboot. 
*/ +static void run_stage_2(int luo_fd, int state_session_fd) +{ + int session_fd, mfd, stage; + + ksft_print_msg("[STAGE 2] Starting post-kexec verification...\n"); + + restore_and_read_stage(state_session_fd, STATE_MEMFD_TOKEN, &stage); + if (stage !=3D 2) + fail_exit("Expected stage 2, but state file contains %d", stage); + + ksft_print_msg("[STAGE 2] Retrieving session '%s'...\n", TEST_SESSION_NAM= E); + session_fd =3D luo_retrieve_session(luo_fd, TEST_SESSION_NAME); + if (session_fd < 0) + fail_exit("luo_retrieve_session for '%s'", TEST_SESSION_NAME); + + ksft_print_msg("[STAGE 2] Restoring and verifying memfd (token %#x)...\n", + TEST_MEMFD_TOKEN); + mfd =3D restore_and_verify_memfd(session_fd, TEST_MEMFD_TOKEN, + TEST_MEMFD_DATA); + if (mfd < 0) + fail_exit("restore_and_verify_memfd for token %#x", TEST_MEMFD_TOKEN); + close(mfd); + + ksft_print_msg("[STAGE 2] Test data verified successfully.\n"); + ksft_print_msg("[STAGE 2] Finalizing test session...\n"); + if (luo_session_finish(session_fd) < 0) + fail_exit("luo_session_finish for test session"); + close(session_fd); + + ksft_print_msg("[STAGE 2] Finalizing state session...\n"); + if (luo_session_finish(state_session_fd) < 0) + fail_exit("luo_session_finish for state session"); + close(state_session_fd); + + ksft_print_msg("\n--- SIMPLE KEXEC TEST PASSED ---\n"); +} + +int main(int argc, char *argv[]) +{ + return luo_test(argc, argv, STATE_SESSION_NAME, + run_stage_1, run_stage_2); +} diff --git a/tools/testing/selftests/liveupdate/luo_test_utils.c b/tools/te= sting/selftests/liveupdate/luo_test_utils.c new file mode 100644 index 000000000000..3c8721c505df --- /dev/null +++ b/tools/testing/selftests/liveupdate/luo_test_utils.c @@ -0,0 +1,266 @@ +// SPDX-License-Identifier: GPL-2.0-only + +/* + * Copyright (c) 2025, Google LLC. 
+ * Pasha Tatashin + */ + +#define _GNU_SOURCE + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "luo_test_utils.h" + +int luo_open_device(void) +{ + return open(LUO_DEVICE, O_RDWR); +} + +int luo_create_session(int luo_fd, const char *name) +{ + struct liveupdate_ioctl_create_session arg =3D { .size =3D sizeof(arg) }; + + snprintf((char *)arg.name, LIVEUPDATE_SESSION_NAME_LENGTH, "%.*s", + LIVEUPDATE_SESSION_NAME_LENGTH - 1, name); + + if (ioctl(luo_fd, LIVEUPDATE_IOCTL_CREATE_SESSION, &arg) < 0) + return -errno; + + return arg.fd; +} + +int luo_retrieve_session(int luo_fd, const char *name) +{ + struct liveupdate_ioctl_retrieve_session arg =3D { .size =3D sizeof(arg) = }; + + snprintf((char *)arg.name, LIVEUPDATE_SESSION_NAME_LENGTH, "%.*s", + LIVEUPDATE_SESSION_NAME_LENGTH - 1, name); + + if (ioctl(luo_fd, LIVEUPDATE_IOCTL_RETRIEVE_SESSION, &arg) < 0) + return -errno; + + return arg.fd; +} + +int create_and_preserve_memfd(int session_fd, int token, const char *data) +{ + struct liveupdate_session_preserve_fd arg =3D { .size =3D sizeof(arg) }; + long page_size =3D sysconf(_SC_PAGE_SIZE); + void *map =3D MAP_FAILED; + int mfd =3D -1, ret =3D -1; + + mfd =3D memfd_create("test_mfd", 0); + if (mfd < 0) + return -errno; + + if (ftruncate(mfd, page_size) !=3D 0) + goto out; + + map =3D mmap(NULL, page_size, PROT_WRITE, MAP_SHARED, mfd, 0); + if (map =3D=3D MAP_FAILED) + goto out; + + snprintf(map, page_size, "%s", data); + munmap(map, page_size); + + arg.fd =3D mfd; + arg.token =3D token; + if (ioctl(session_fd, LIVEUPDATE_SESSION_PRESERVE_FD, &arg) < 0) + goto out; + + ret =3D 0; +out: + if (ret !=3D 0 && errno !=3D 0) + ret =3D -errno; + if (mfd >=3D 0) + close(mfd); + return ret; +} + +int restore_and_verify_memfd(int session_fd, int token, + const char *expected_data) +{ + struct liveupdate_session_retrieve_fd arg =3D { .size =3D sizeof(arg) }; + long page_size =3D 
sysconf(_SC_PAGE_SIZE); + void *map =3D MAP_FAILED; + int mfd =3D -1, ret =3D -1; + + arg.token =3D token; + if (ioctl(session_fd, LIVEUPDATE_SESSION_RETRIEVE_FD, &arg) < 0) + return -errno; + mfd =3D arg.fd; + + map =3D mmap(NULL, page_size, PROT_READ, MAP_SHARED, mfd, 0); + if (map =3D=3D MAP_FAILED) + goto out; + + if (expected_data && strcmp(expected_data, map) !=3D 0) { + ksft_print_msg("Data mismatch! Expected '%s', Got '%s'\n", + expected_data, (char *)map); + ret =3D -EINVAL; + goto out_munmap; + } + + ret =3D mfd; +out_munmap: + munmap(map, page_size); +out: + if (ret < 0 && errno !=3D 0) + ret =3D -errno; + if (ret < 0 && mfd >=3D 0) + close(mfd); + return ret; +} + +int luo_session_finish(int session_fd) +{ + struct liveupdate_session_finish arg =3D { .size =3D sizeof(arg) }; + + if (ioctl(session_fd, LIVEUPDATE_SESSION_FINISH, &arg) < 0) + return -errno; + + return 0; +} + +void create_state_file(int luo_fd, const char *session_name, int token, + int next_stage) +{ + char buf[32]; + int state_session_fd; + + state_session_fd =3D luo_create_session(luo_fd, session_name); + if (state_session_fd < 0) + fail_exit("luo_create_session for state tracking"); + + snprintf(buf, sizeof(buf), "%d", next_stage); + if (create_and_preserve_memfd(state_session_fd, token, buf) < 0) + fail_exit("create_and_preserve_memfd for state tracking"); + + /* + * DO NOT close session FD, otherwise it is going to be unpreserved + */ +} + +void restore_and_read_stage(int state_session_fd, int token, int *stage) +{ + char buf[32] =3D {0}; + int mfd; + + mfd =3D restore_and_verify_memfd(state_session_fd, token, NULL); + if (mfd < 0) + fail_exit("failed to restore state memfd"); + + if (read(mfd, buf, sizeof(buf) - 1) < 0) + fail_exit("failed to read state mfd"); + + *stage =3D atoi(buf); + + close(mfd); +} + +void daemonize_and_wait(void) +{ + pid_t pid; + + ksft_print_msg("[STAGE 1] Forking persistent child to hold sessions...\n"= ); + + pid =3D fork(); + if (pid < 0) + 
fail_exit("fork failed"); + + if (pid > 0) { + ksft_print_msg("[STAGE 1] Child PID: %d. Resources are pinned.\n", pid); + ksft_print_msg("[STAGE 1] You may now perform kexec reboot.\n"); + exit(EXIT_SUCCESS); + } + + /* Detach from terminal so closing the window doesn't kill us */ + if (setsid() < 0) + fail_exit("setsid failed"); + + close(STDIN_FILENO); + close(STDOUT_FILENO); + close(STDERR_FILENO); + + /* Change dir to root to avoid locking filesystems */ + if (chdir("/") < 0) + exit(EXIT_FAILURE); + + while (1) + sleep(60); +} + +static int parse_stage_args(int argc, char *argv[]) +{ + static struct option long_options[] =3D { + {"stage", required_argument, 0, 's'}, + {0, 0, 0, 0} + }; + int option_index =3D 0; + int stage =3D 1; + int opt; + + optind =3D 1; + while ((opt =3D getopt_long(argc, argv, "s:", long_options, &option_index= )) !=3D -1) { + switch (opt) { + case 's': + stage =3D atoi(optarg); + if (stage !=3D 1 && stage !=3D 2) + fail_exit("Invalid stage argument"); + break; + default: + fail_exit("Unknown argument"); + } + } + return stage; +} + +int luo_test(int argc, char *argv[], + const char *state_session_name, + luo_test_stage1_fn stage1, + luo_test_stage2_fn stage2) +{ + int target_stage =3D parse_stage_args(argc, argv); + int luo_fd =3D luo_open_device(); + int state_session_fd; + int detected_stage; + + if (luo_fd < 0) { + ksft_exit_skip("Failed to open %s. Is the luo module loaded?\n", + LUO_DEVICE); + } + + state_session_fd =3D luo_retrieve_session(luo_fd, state_session_name); + if (state_session_fd =3D=3D -ENOENT) + detected_stage =3D 1; + else if (state_session_fd >=3D 0) + detected_stage =3D 2; + else + fail_exit("Failed to check for state session"); + + if (target_stage !=3D detected_stage) { + ksft_exit_fail_msg("Stage mismatch Requested --stage %d, but system is i= n stage %d.\n" + "(State session %s: %s)\n", + target_stage, detected_stage, state_session_name, + (detected_stage =3D=3D 2) ? 
"EXISTS" : "MISSING"); + } + + if (target_stage =3D=3D 1) + stage1(luo_fd); + else + stage2(luo_fd, state_session_fd); + + return 0; +} diff --git a/tools/testing/selftests/liveupdate/luo_test_utils.h b/tools/te= sting/selftests/liveupdate/luo_test_utils.h new file mode 100644 index 000000000000..90099bf49577 --- /dev/null +++ b/tools/testing/selftests/liveupdate/luo_test_utils.h @@ -0,0 +1,44 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + * + * Utility functions for LUO kselftests. + */ + +#ifndef LUO_TEST_UTILS_H +#define LUO_TEST_UTILS_H + +#include +#include +#include +#include "../kselftest.h" + +#define LUO_DEVICE "/dev/liveupdate" + +#define fail_exit(fmt, ...) \ + ksft_exit_fail_msg("[%s:%d] " fmt " (errno: %s)\n", \ + __func__, __LINE__, ##__VA_ARGS__, strerror(errno)) + +int luo_open_device(void); +int luo_create_session(int luo_fd, const char *name); +int luo_retrieve_session(int luo_fd, const char *name); +int luo_session_finish(int session_fd); + +int create_and_preserve_memfd(int session_fd, int token, const char *data); +int restore_and_verify_memfd(int session_fd, int token, const char *expect= ed_data); + +void create_state_file(int luo_fd, const char *session_name, int token, + int next_stage); +void restore_and_read_stage(int state_session_fd, int token, int *stage); + +void daemonize_and_wait(void); + +typedef void (*luo_test_stage1_fn)(int luo_fd); +typedef void (*luo_test_stage2_fn)(int luo_fd, int state_session_fd); + +int luo_test(int argc, char *argv[], const char *state_session_name, + luo_test_stage1_fn stage1, luo_test_stage2_fn stage2); + +#endif /* LUO_TEST_UTILS_H */ --=20 2.52.0.460.gd25c4c69ec-goog From nobody Tue Dec 2 00:04:03 2025 Received: from mail-yw1-f171.google.com (mail-yw1-f171.google.com [209.85.128.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 
744BF3321B0 for ; Tue, 25 Nov 2025 16:59:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089986; cv=none; b=MOBX40X3h3TljN5eouT8Ifmu7rO1GvfTt3o8jUdXm3G48pUvKOcwz++orK+PG2iKwoMdYXK++ukg7p5LXhX7zUERLe74/u5gKO4CYQLEIs/cy8Foq/at8fbq9YptibyNTcivNeku906XKxL5s9dwcCabTBpXWgzou0LEaA1nfNo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1764089986; c=relaxed/simple; bh=G7Q0itscrNM8vJpPgM2kTNcPU4YmQP4p7bNpWy/a7kk=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=HRAqAy8CN8x/2u8zaGn1ECYd2WGjr8ALhPWk1lf+EX0C0/U5z+zkaxMDmJ4B0G7VrYD+GzgtciSeWVJpkK5jn2ZXoCJyAOX6Nr+ng0a/z1ZDzTJz2n/zQbMhfHLzAz5xvNEVjT/MffIW6vVFd6soZ2e9wlexBFDnKPpG32fGIas= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=UenYKkM/; arc=none smtp.client-ip=209.85.128.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="UenYKkM/" Received: by mail-yw1-f171.google.com with SMTP id 00721157ae682-78802ac2296so56318427b3.3 for ; Tue, 25 Nov 2025 08:59:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1764089982; x=1764694782; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=6vFlri8REH/qac3zTP1mpTsu23ZBhs3xYfWJv9me6+k=; b=UenYKkM/KBIO10KPDFiut3GDa2YGiBynyrhPNKT5owCSZJWVhcYZ36ub05X50ZNk1e 
JRdljqFOImAexPoA7Es7HcPMyBwesh1MOc2m16b7s1t8IKlWW5ShfPykxdeVSn14OqYz cixbaKk9lBycxn+9ulbPn+qgHWeWOEEwEMbiRiywwM9moRAVNHO94zt0GM1bi9l4zyuI Ki5xKWJ+DijN19qZ2kGy/qZlNHVhDjKHwLtzBbQp2w3RiU3wOTnjviLPPVaQ3yhxzOB3 xwHw8kF4NPbNUSqdiZ0H19cdzaN0MpQbaIqfOrrMgURHDmrZRNegOU1hZZ7Rfb+ZjQ3H vRKA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1764089982; x=1764694782; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-gg:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=6vFlri8REH/qac3zTP1mpTsu23ZBhs3xYfWJv9me6+k=; b=QLfRT37nHJ/Vd/1nvo4QWKoi3ZjBgbSrHXoQNlODKAVPV2n2T2O6jIkc8DsjX8mYxH yRL631SlITphFzcqCh6OXTUjMscsLPseZzzaCVh117lxxyxHvQosE78iueJLGBfS/kVv BeWqagYfLYYCSGW8XGoN/pEw7EdsarbTj9jqGDh3FsEX1FTNaVzL8oz8k3KwFX4CLxP7 4Rwhd+/+fHewIXZOpl+k0wAqsZd2t10JTCkaQn2ddM1Ta7PflCKVAQEudVkzCCvGIiWZ dmeY+380H7uldWkjy1B96SvrmX2E6mCqzL2r7sq5mq6WBEP1XuQ/C1JUo4Kv/JDxbfhJ Xa6A== X-Forwarded-Encrypted: i=1; AJvYcCXwUPgJVMQ4hi9wNDHI/rePgJCp263oEamiUZJK8BejxWM8gmb1Ab/ZROsPKESvtm60OYoIKX6lZ1c/SNA=@vger.kernel.org X-Gm-Message-State: AOJu0YyyK/rmQugschucTpFs3rGPS/t+N8pyS4yGlKr2bSuPOdm61BZ1 rYhUVbcsO3//0E77Gm9emU7n4FdHbI9EQgEeAu+ZDzhUoywTxCUFCDHWQqiNoBJhmTs= X-Gm-Gg: ASbGncvaK5GlFsyPF0KFB3J7Y9v1AWU3KrnURCyTuewyIxMDe3F2zrExfStCGdlXK3l 6EUYIn8L84PdF3b/o2oXkIb3fA6ofWJ4PTa10Tnx2Pmtb+nE6umUb78V98OTzlNIQBZQq8gSuvc uqhwnfDBpmpt2NlqryyM+2tdbDwR/EaeGNHi7Ds40mwX3UlNGbuQ+7hW7ZL3TZj3VDAcZSoDUrK pvD9dZNzz07dK1ECh29uw0hbsiwciYMOfx/OyY9V00u0/w3RgPP1rH5rSSY9NBJ7d7uq9KJJ4/q h7Iw6iHcIpfKl4JN/fMK/Qwhdc2oSRMHZpZ6k0Vnjk/CvsSAuAyV5/98GH3b5f06LOum2oWv7eq Bf3TJQxfrFZuZ8W/Ni0OPWNfGdzPFOYNfWZxlxTY0Z+3EHFQpuHgjMcCrzeFPQdppN3MfdZaRyX RNWVAqfAYZfCmKsUXF9iGZcVe94I2HHE3MXhIP4Ck2c/qneC8I8g+MmvKRTlwlw+sPl5x+gmBbK Cc= X-Google-Smtp-Source: AGHT+IHRC9KjSLkE1gdlWKz3GVcjnQbHxjnBnEjPOoxO9pY/zLR64stKUm4wy05jQAzAnoAL6WDFUg== X-Received: by 2002:a05:690c:6c90:b0:783:7143:d825 with SMTP id 
00721157ae682-78a8b497584mr142558557b3.25.1764089982243; Tue, 25 Nov 2025 08:59:42 -0800 (PST) Received: from soleen.c.googlers.com.com (182.221.85.34.bc.googleusercontent.com. [34.85.221.182]) by smtp.gmail.com with ESMTPSA id 00721157ae682-78a798a5518sm57284357b3.14.2025.11.25.08.59.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 25 Nov 2025 08:59:41 -0800 (PST) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, 
	skhawaja@google.com, chrisl@kernel.org
Subject: [PATCH v8 18/18] selftests/liveupdate: Add kexec test for multiple and empty sessions
Date: Tue, 25 Nov 2025 11:58:48 -0500
Message-ID: <20251125165850.3389713-19-pasha.tatashin@soleen.com>
In-Reply-To: <20251125165850.3389713-1-pasha.tatashin@soleen.com>
References: <20251125165850.3389713-1-pasha.tatashin@soleen.com>

Introduce a new kexec-based selftest, luo_multi_session, to validate the
end-to-end lifecycle of a more complex LUO scenario.

While the existing luo_kexec_simple test covers the basic end-to-end
lifecycle, it is limited to a single session with one preserved file.
This new test expands coverage by verifying LUO's ability to handle a
mixed workload involving multiple sessions, some of which are
intentionally empty. This ensures that the LUO core correctly preserves
and restores the state of all session types across a reboot.

The test validates the following sequence:

Stage 1 (Pre-kexec):
  - Creates a state-tracking session that records stage 2 as the next
    stage to run.
  - Creates two empty test sessions (multi-test-empty-1,
    multi-test-empty-2).
  - Creates a session with one preserved memfd (multi-test-files-1).
  - Creates another session with two preserved memfds
    (multi-test-files-2), each containing unique data.
  - Executes a kexec reboot via the helper script.

Stage 2 (Post-kexec):
  - Retrieves the state-tracking session to confirm it is in the
    post-reboot stage.
  - Retrieves all four test sessions (both the empty and non-empty ones).
  - For the non-empty sessions, restores the preserved memfds and
    verifies their contents match the original data patterns.
  - Finalizes all test sessions and the state session to ensure a clean
    teardown and that all associated kernel resources are correctly
    released.

This test provides greater confidence in the robustness of the LUO
framework by validating its behavior in a more realistic, multi-faceted
scenario.

Signed-off-by: Pasha Tatashin
Reviewed-by: Mike Rapoport (Microsoft)
Tested-by: David Matlack
---
 tools/testing/selftests/liveupdate/Makefile   |   1 +
 .../selftests/liveupdate/luo_multi_session.c  | 162 ++++++++++++++++++
 2 files changed, 163 insertions(+)
 create mode 100644 tools/testing/selftests/liveupdate/luo_multi_session.c

diff --git a/tools/testing/selftests/liveupdate/Makefile b/tools/testing/selftests/liveupdate/Makefile
index bbbec633970c..080754787ede 100644
--- a/tools/testing/selftests/liveupdate/Makefile
+++ b/tools/testing/selftests/liveupdate/Makefile
@@ -5,6 +5,7 @@ LIB_C += luo_test_utils.c
 TEST_GEN_PROGS += liveupdate
 
 TEST_GEN_PROGS_EXTENDED += luo_kexec_simple
+TEST_GEN_PROGS_EXTENDED += luo_multi_session
 
 TEST_FILES += do_kexec.sh
 
diff --git a/tools/testing/selftests/liveupdate/luo_multi_session.c b/tools/testing/selftests/liveupdate/luo_multi_session.c
new file mode 100644
index 000000000000..0ee2d795beef
--- /dev/null
+++ b/tools/testing/selftests/liveupdate/luo_multi_session.c
@@ -0,0 +1,162 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin
+ *
+ * A selftest to validate the end-to-end lifecycle of multiple LUO sessions
+ * across a kexec reboot, including empty sessions and sessions with multiple
+ * files.
+ */
+
+#include "luo_test_utils.h"
+
+#define SESSION_EMPTY_1	"multi-test-empty-1"
+#define SESSION_EMPTY_2	"multi-test-empty-2"
+#define SESSION_FILES_1	"multi-test-files-1"
+#define SESSION_FILES_2	"multi-test-files-2"
+
+#define MFD1_TOKEN	0x1001
+#define MFD2_TOKEN	0x2002
+#define MFD3_TOKEN	0x3003
+
+#define MFD1_DATA	"Data for session files 1"
+#define MFD2_DATA	"First file for session files 2"
+#define MFD3_DATA	"Second file for session files 2"
+
+#define STATE_SESSION_NAME	"kexec_multi_state"
+#define STATE_MEMFD_TOKEN	998
+
+/* Stage 1: Executed before the kexec reboot. */
+static void run_stage_1(int luo_fd)
+{
+	int s_empty1_fd, s_empty2_fd, s_files1_fd, s_files2_fd;
+
+	ksft_print_msg("[STAGE 1] Starting pre-kexec setup for multi-session test...\n");
+
+	ksft_print_msg("[STAGE 1] Creating state file for next stage (2)...\n");
+	create_state_file(luo_fd, STATE_SESSION_NAME, STATE_MEMFD_TOKEN, 2);
+
+	ksft_print_msg("[STAGE 1] Creating empty sessions '%s' and '%s'...\n",
+		       SESSION_EMPTY_1, SESSION_EMPTY_2);
+	s_empty1_fd = luo_create_session(luo_fd, SESSION_EMPTY_1);
+	if (s_empty1_fd < 0)
+		fail_exit("luo_create_session for '%s'", SESSION_EMPTY_1);
+
+	s_empty2_fd = luo_create_session(luo_fd, SESSION_EMPTY_2);
+	if (s_empty2_fd < 0)
+		fail_exit("luo_create_session for '%s'", SESSION_EMPTY_2);
+
+	ksft_print_msg("[STAGE 1] Creating session '%s' with one memfd...\n",
+		       SESSION_FILES_1);
+
+	s_files1_fd = luo_create_session(luo_fd, SESSION_FILES_1);
+	if (s_files1_fd < 0)
+		fail_exit("luo_create_session for '%s'", SESSION_FILES_1);
+	if (create_and_preserve_memfd(s_files1_fd, MFD1_TOKEN, MFD1_DATA) < 0) {
+		fail_exit("create_and_preserve_memfd for token %#x",
+			  MFD1_TOKEN);
+	}
+
+	ksft_print_msg("[STAGE 1] Creating session '%s' with two memfds...\n",
+		       SESSION_FILES_2);
+
+	s_files2_fd = luo_create_session(luo_fd, SESSION_FILES_2);
+	if (s_files2_fd < 0)
+		fail_exit("luo_create_session for '%s'", SESSION_FILES_2);
+	if (create_and_preserve_memfd(s_files2_fd, MFD2_TOKEN, MFD2_DATA) < 0) {
+		fail_exit("create_and_preserve_memfd for token %#x",
+			  MFD2_TOKEN);
+	}
+	if (create_and_preserve_memfd(s_files2_fd, MFD3_TOKEN, MFD3_DATA) < 0) {
+		fail_exit("create_and_preserve_memfd for token %#x",
+			  MFD3_TOKEN);
+	}
+
+	close(luo_fd);
+	daemonize_and_wait();
+}
+
+/* Stage 2: Executed after the kexec reboot. */
+static void run_stage_2(int luo_fd, int state_session_fd)
+{
+	int s_empty1_fd, s_empty2_fd, s_files1_fd, s_files2_fd;
+	int mfd1, mfd2, mfd3, stage;
+
+	ksft_print_msg("[STAGE 2] Starting post-kexec verification...\n");
+
+	restore_and_read_stage(state_session_fd, STATE_MEMFD_TOKEN, &stage);
+	if (stage != 2) {
+		fail_exit("Expected stage 2, but state file contains %d",
+			  stage);
+	}
+
+	ksft_print_msg("[STAGE 2] Retrieving all sessions...\n");
+	s_empty1_fd = luo_retrieve_session(luo_fd, SESSION_EMPTY_1);
+	if (s_empty1_fd < 0)
+		fail_exit("luo_retrieve_session for '%s'", SESSION_EMPTY_1);
+
+	s_empty2_fd = luo_retrieve_session(luo_fd, SESSION_EMPTY_2);
+	if (s_empty2_fd < 0)
+		fail_exit("luo_retrieve_session for '%s'", SESSION_EMPTY_2);
+
+	s_files1_fd = luo_retrieve_session(luo_fd, SESSION_FILES_1);
+	if (s_files1_fd < 0)
+		fail_exit("luo_retrieve_session for '%s'", SESSION_FILES_1);
+
+	s_files2_fd = luo_retrieve_session(luo_fd, SESSION_FILES_2);
+	if (s_files2_fd < 0)
+		fail_exit("luo_retrieve_session for '%s'", SESSION_FILES_2);
+
+	ksft_print_msg("[STAGE 2] Verifying contents of session '%s'...\n",
+		       SESSION_FILES_1);
+	mfd1 = restore_and_verify_memfd(s_files1_fd, MFD1_TOKEN, MFD1_DATA);
+	if (mfd1 < 0)
+		fail_exit("restore_and_verify_memfd for token %#x", MFD1_TOKEN);
+	close(mfd1);
+
+	ksft_print_msg("[STAGE 2] Verifying contents of session '%s'...\n",
+		       SESSION_FILES_2);
+
+	mfd2 = restore_and_verify_memfd(s_files2_fd, MFD2_TOKEN, MFD2_DATA);
+	if (mfd2 < 0)
+		fail_exit("restore_and_verify_memfd for token %#x", MFD2_TOKEN);
+	close(mfd2);
+
+	mfd3 = restore_and_verify_memfd(s_files2_fd, MFD3_TOKEN, MFD3_DATA);
+	if (mfd3 < 0)
+		fail_exit("restore_and_verify_memfd for token %#x", MFD3_TOKEN);
+	close(mfd3);
+
+	ksft_print_msg("[STAGE 2] Test data verified successfully.\n");
+
+	ksft_print_msg("[STAGE 2] Finalizing all test sessions...\n");
+	if (luo_session_finish(s_empty1_fd) < 0)
+		fail_exit("luo_session_finish for '%s'", SESSION_EMPTY_1);
+	close(s_empty1_fd);
+
+	if (luo_session_finish(s_empty2_fd) < 0)
+		fail_exit("luo_session_finish for '%s'", SESSION_EMPTY_2);
+	close(s_empty2_fd);
+
+	if (luo_session_finish(s_files1_fd) < 0)
+		fail_exit("luo_session_finish for '%s'", SESSION_FILES_1);
+	close(s_files1_fd);
+
+	if (luo_session_finish(s_files2_fd) < 0)
+		fail_exit("luo_session_finish for '%s'", SESSION_FILES_2);
+	close(s_files2_fd);
+
+	ksft_print_msg("[STAGE 2] Finalizing state session...\n");
+	if (luo_session_finish(state_session_fd) < 0)
+		fail_exit("luo_session_finish for state session");
+	close(state_session_fd);
+
+	ksft_print_msg("\n--- MULTI-SESSION KEXEC TEST PASSED ---\n");
+}
+
+int main(int argc, char *argv[])
+{
+	return luo_test(argc, argv, STATE_SESSION_NAME,
+			run_stage_1, run_stage_2);
+}
-- 
2.52.0.460.gd25c4c69ec-goog
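
Note for readers without luo_test_utils.c at hand: the data check that
Stage 2 relies on can be sketched in plain C. This is only an
illustration under stated assumptions. The names make_memfd_with_data(),
verify_memfd_data() and self_check() are hypothetical and not part of
this series, and the sketch leaves out the LUO preserve/restore step
entirely, so it shows just the write-then-compare pattern on a memfd
rather than what the real helpers do.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): create a memfd and fill it
 * with a NUL-terminated data pattern. Returns the fd, or -1 on error.
 */
static int make_memfd_with_data(const char *name, const char *data)
{
	size_t len = strlen(data) + 1;
	int fd = memfd_create(name, 0);

	if (fd < 0)
		return -1;
	if (write(fd, data, len) != (ssize_t)len) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Hypothetical helper (not from the patch): return 0 if the memfd holds
 * exactly the expected pattern, -1 otherwise.
 */
static int verify_memfd_data(int fd, const char *expected)
{
	char buf[256];
	size_t len = strlen(expected) + 1;
	ssize_t n;

	if (len > sizeof(buf))
		return -1;
	/* Read from offset 0 so the current file position does not matter. */
	n = pread(fd, buf, sizeof(buf), 0);
	if (n != (ssize_t)len)
		return -1;
	return memcmp(buf, expected, len) == 0 ? 0 : -1;
}

/* Usage: round-trip one pattern and check that a mismatch is detected. */
static void self_check(void)
{
	int fd = make_memfd_with_data("mfd1", "Data for session files 1");

	assert(fd >= 0);
	assert(verify_memfd_data(fd, "Data for session files 1") == 0);
	assert(verify_memfd_data(fd, "unexpected pattern") != 0);
	close(fd);
}
```

In the actual test the memfd is handed to a LUO session before the
kexec and restored from it afterwards; giving each file a distinct
pattern, as MFD1_DATA through MFD3_DATA do, is what lets the comparison
catch a file that came back under the wrong token.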