From nobody Sun Oct 5 07:24:32 2025 Received: from mail-qk1-f171.google.com (mail-qk1-f171.google.com [209.85.222.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 283E1185E4A for ; Mon, 29 Sep 2025 01:03:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107815; cv=none; b=aL44hpp4IWzZuVX5Dk3VozptJUknVecysEabYk0iCrHp11jPtJymmsmI15TTpAWVjMdJBfanl/RBcuJr//RJAezJ+kOblz2lEbnidWg6jmzKxmHJ9FO4gmHEFKDUu0uFlcwU9kgmBXlPMXtGPmqi08pnQzm2SK0+R8OzkLvVwpg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107815; c=relaxed/simple; bh=H9bVmk4XNEIhyUuYBpnqxQGGI4EN7gtWE5L1w0h2bJI=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=fsSFxw3e843xKgkHlsOC6RCg3G2dgZ6/Hli5vlrhQB+XLxhpt2X6I/3xDr42GyC2hRnh3AzRZAGXl9xgt9bAShuSCnWkGLnKDxQkFm9YRe3nHMXBWekuXDJ3lLvL6rMnwZCCUYWvRPNThptfi+giyF3rBkkCabqX9kcNjliU9xg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=FXHMpFhK; arc=none smtp.client-ip=209.85.222.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="FXHMpFhK" Received: by mail-qk1-f171.google.com with SMTP id af79cd13be357-85d5cd6fe9fso292958185a.0 for ; Sun, 28 Sep 2025 18:03:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1759107813; x=1759712613; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=RU4szgNN13WysJ6xhtBfiTDK+ejkMnCmbQOqkk33H88=; b=FXHMpFhK55fIeMK2ud2Rt4xuVS/+SwhFg2c12I3fXTAYYDZ4ZrC3kU1ecCpYWlgRz3 BTPAsEhL1G/2oNODFHozE/EK2WTnPwqYy5URo5NSNhwCDuwA8o5ytL1kXfNAQijshBFS OdCBzt3LVy9ThgVez0fNbQM5Y+geFwOWozGDWS7SVzXvz+2f6IEgEDuWvX+Fq9D5kVp+ 7Dj5HlNilSGZ/amqU0hjc0QpyrElswpH/OFrx0uhRcdB4VbFMJ2vCXNNSbzzm7ZuEFyS jXG5Inw7pi+unnWtP0KAEr67QmnxGtu+Mn0e5hj9eM+85f/7AQExL7hiuJcBVDP/dnij c3Qw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759107813; x=1759712613; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=RU4szgNN13WysJ6xhtBfiTDK+ejkMnCmbQOqkk33H88=; b=EI8Sl6fIAyK0Bt8ukNTiq4LZZX7/nrhdQgQxg24cgESg0VzwlnOIThAa/eOL0mSgfS hwBS+55Ai84fIcVbTodMhvmCHjCGKRqzsbql2fDkRQsDiODdMg7BTlI5D3+WrzhVbqnX c+EYPHPssTm8SMXteAxGJoBITXllTqq7qUocYWEDG/heGyLCm9n02jrs0v10E9CavBGM vN4OUFVakPNGH201yCVhOm7de0SZELUFRxJxw5WzSSYZj7wJxOv1gK5D2rS6Cy4qEjJx yJJL7PQ6iUlfdmc6NluEbIUcl1D01ZJHW7Rr7bja1I/tYc5IclruYcAGJewTVgry7ckt t+hQ== X-Forwarded-Encrypted: i=1; AJvYcCXMoyxN1kHtsbAp1J0zf/beFuUYnJdskSQ0jnQcASf5BDvLyub7w9u9eXWjkKu6nFKWc9WVBdKIqRxTRLA=@vger.kernel.org X-Gm-Message-State: AOJu0Yy9q6TbL2Mibr7sGd5B1a9PZ9PEnQhXBhCz80IPuQWgpt3dC7H3 r2qbbnxQJhODn3k89Qr5m1K8VGyGICD41dpS+fOrKQiWPG1KTcSFPp5go2+fCN9La+4= X-Gm-Gg: ASbGncskH5e7m6ZQnP0eyEde2Ots5ON4ddPqfP8gVHzH3ESL268ZoWFvaEJh6VCRhQK EOBuOFupSY1niP2zUSVv2HHQTI+jr/CQB3zGrIwygZRYknU8PV2J+cM1gtsXWdoo+yxaFf1yUXr iX5ktL5T+LgPAnAIgC/zoHSQULntL/Tu0nv8oCaY3rftUJPqVI11PeKyvbeVc0c8p6ICsQyjBKq +rjPF1apRj+EhaLop+ZMacfRvjl7mp3WyeZm/azZ0iTpYfqLKY2MPxT4vQ8NTDnsH1JtYZwYxml TkV2lkX26LCLADaY6EQ1kustaIXwqUJiOknidvQrrCylBH0iS5e+NQP8zhunteIw8DvpUkeAR9+ PFqAzm6oqQwlWKn05vfGpoPgqNEAroyfivZUL+v/bRas9+K3Ig4mSIrq03VXmgXw5O9hn0UWLzX Eh52OwCi7Pmg/1rajfaQ== 
X-Google-Smtp-Source: AGHT+IEzNLDCIUEBf9S5DRDPsrWxtgBORWz8LvfpibpwyTvuEl9/0gmUqA9ZfzRNolrGZ8uq0mC63A== X-Received: by 2002:a05:620a:1a90:b0:84f:110c:b6e7 with SMTP id af79cd13be357-85adf7bb783mr2147257785a.6.1759107812795; Sun, 28 Sep 2025 18:03:32 -0700 (PDT) Received: from soleen.c.googlers.com.com (53.47.86.34.bc.googleusercontent.com. [34.86.47.53]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-4db0c0fbe63sm64561521cf.23.2025.09.28.18.03.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 28 Sep 2025 18:03:32 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, 
brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com Subject: [PATCH v4 01/30] kho: allow to drive kho from within kernel Date: Mon, 29 Sep 2025 01:02:52 +0000 Message-ID: <20250929010321.3462457-2-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com> References: <20250929010321.3462457-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Allow finalize and abort to be called from kernel modules, so that LUO can drive the KHO sequence via its own state machine. Signed-off-by: Pasha Tatashin --- include/linux/kexec_handover.h | 15 +++++++++ kernel/kexec_handover.c | 56 ++++++++++++++++++++++++++++++++-- 2 files changed, 69 insertions(+), 2 deletions(-) diff --git a/include/linux/kexec_handover.h b/include/linux/kexec_handover.h index 25042c1d8d54..04d0108db98e 100644 --- a/include/linux/kexec_handover.h +++ b/include/linux/kexec_handover.h @@ -67,6 +67,10 @@ void kho_memory_init(void); =20 void kho_populate(phys_addr_t fdt_phys, u64 fdt_len, phys_addr_t scratch_phys, u64 scratch_len); + +int kho_finalize(void); +int kho_abort(void); + #else static inline bool kho_is_enabled(void) { @@ -139,6 +143,17 @@ static inline void kho_populate(phys_addr_t fdt_phys, u64 fdt_len, phys_addr_t scratch_phys, u64 scratch_len) { } + +static inline int kho_finalize(void) +{ + return -EOPNOTSUPP; +} + +static inline int kho_abort(void) +{ + return -EOPNOTSUPP; +} + #endif /* CONFIG_KEXEC_HANDOVER */ =20 #endif /* LINUX_KEXEC_HANDOVER_H */ diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c 
index 76f0940fb485..0ba5a2dbae28 100644 --- a/kernel/kexec_handover.c +++ b/kernel/kexec_handover.c @@ -1067,7 +1067,7 @@ static int kho_out_update_debugfs_fdt(void) return err; } =20 -static int kho_abort(void) +static int __kho_abort(void) { int err; unsigned long order; @@ -1100,7 +1100,33 @@ static int kho_abort(void) return err; } =20 -static int kho_finalize(void) +int kho_abort(void) +{ + int ret =3D 0; + + if (!kho_enable) + return -EOPNOTSUPP; + + mutex_lock(&kho_out.lock); + + if (!kho_out.finalized) { + ret =3D -ENOENT; + goto unlock; + } + + ret =3D __kho_abort(); + if (ret) + goto unlock; + + kho_out.finalized =3D false; + ret =3D kho_out_update_debugfs_fdt(); + +unlock: + mutex_unlock(&kho_out.lock); + return ret; +} + +static int __kho_finalize(void) { int err =3D 0; u64 *preserved_mem_map; @@ -1149,6 +1175,32 @@ static int kho_finalize(void) return err; } =20 +int kho_finalize(void) +{ + int ret =3D 0; + + if (!kho_enable) + return -EOPNOTSUPP; + + mutex_lock(&kho_out.lock); + + if (kho_out.finalized) { + ret =3D -EEXIST; + goto unlock; + } + + ret =3D __kho_finalize(); + if (ret) + goto unlock; + + kho_out.finalized =3D true; + ret =3D kho_out_update_debugfs_fdt(); + +unlock: + mutex_unlock(&kho_out.lock); + return ret; +} + static int kho_out_finalize_get(void *data, u64 *val) { mutex_lock(&kho_out.lock); --=20 2.51.0.536.g15c5d4f767-goog From nobody Sun Oct 5 07:24:32 2025 Received: from mail-qk1-f170.google.com (mail-qk1-f170.google.com [209.85.222.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9B83D1D9A5D for ; Mon, 29 Sep 2025 01:03:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.170 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107818; cv=none; 
b=vGusv794SqfZTCexM/fM9V3Z2MCAlYHiKRaY5s1a9vi+z3Q+0D5QeRnnCv8bZUiI7w7sXEXlog+MbnmYQcrPSEUl7fJzjHx4RkgU3ELKkplDc081UI0/e2WYXy6b9Qa3+Rad+Vy8VpWlVmY4pJc+GxjdOAJaPNlxE+xnco34Ed4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107818; c=relaxed/simple; bh=uDWniFjf4ZkL5jUcn9/kgwbIQ/WTJAGcjIq0jJxdPuI=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=P8j5Jo2AUj12UYnqvNhaAMLbMdcrMAy5zAYYVlkbBM3Z1Eo9Ph/cDi9GjPq2oxdzOoJWciiWnHdWAKIICL1fm65ZTJYrU9sO9AZVT1+cg0mCWg5yX4eP8fXYGi/rMZ6N+peh3/ROuFc8xAyD3WhcA9j/OZx5iP2/y8lBKVGUGgs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=NA1ukK2I; arc=none smtp.client-ip=209.85.222.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="NA1ukK2I" Received: by mail-qk1-f170.google.com with SMTP id af79cd13be357-85e76e886a0so273372085a.1 for ; Sun, 28 Sep 2025 18:03:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1759107814; x=1759712614; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=M+0a641RVp4eNXltm9Frq8R4NOKxAZQzQ19sOBX9Om4=; b=NA1ukK2IL21cB5oShJC5TeYfSJCwQh7d/1Awn2XAY4LDHqEfNFH0YW2Tv7bLlmbrS3 BkJqITKNWiUptAYVzr9H+lWpH9tVeQ4zRfl6bHm/WmWp//pDxyoZ65oYrJAy2lEyhnJB QDLjPCNJEVpPOdSbh7ydCBRv2JNnewetSo1r1epbMioo6a87lkTtnANu3NEQJVfNNwBA zEsttr7A0vL5VlPxYNmYkCGakYt9xwM2Gv+IGO0XEG8+05+ei3iOEnk/DRPMS8A3Wn8z j6bKvpqTrSqNciLX2qfVkc/rB6UZ0oM00XEY2WdsdhsIopT1bGuaG8i1tFpNA6ggdTi3 oSjA== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759107814; x=1759712614; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=M+0a641RVp4eNXltm9Frq8R4NOKxAZQzQ19sOBX9Om4=; b=iChqqJHdE0EAFVMVPKWJxx4bp7UTJsdk22Y0RzXSL7bXGx3JZlt9wL6u6sc5Xonwej FmLdgCJ6mqNsTE6e+qCIc86bAJxN+BslJMba0Zee9sF40tZpABYz+djXu5W93OSzcKAy tAngu5x3B/52Qk/UySj0GD7Hb+JTl9dE+1Tj+B5o5Ph4qXbkyvWEO0FyO0r4DK3xeKrL +atuZY9r6skOYiY+nGAjNJqwHu+3+lcSYGRPUlWoh2YrIVJPVioZzUjMuMgM47Lzba00 WHEtMz8VClg0zKUJ6m/E9+cYkfLINSF69mqqC+wvZj/NGFWQuFMM2sNsqJqqyBN2Ack3 V73A== X-Forwarded-Encrypted: i=1; AJvYcCVWswA4EUSM/KJzIHFky8QwdYobifu0JNzPys0z32e/VUig4orrDCGTL6Q9n3Ug79a+Iv06oKZMChl40W4=@vger.kernel.org X-Gm-Message-State: AOJu0YyTMcLNUA/YBU+0XJKGbX8PkxGGcvsDlWz+23D8zbQk7z8h4Mpf KVaRURsGWX9Oc+wkDyUD8w/vSv2iB0XndqbhGEnc+8+a2sPMuUfZThOpz/uKQSmJZCk= X-Gm-Gg: ASbGnctpFPhDdEoGDh+8K/+Vo15QNgxI7wHS0zIGZThaZ+OfTAusTymUkqtO3Mb+2cw or11W4xVLfZ+FL4E9jVkXOKraWOkEQWRxgN3TtI2ykuYpproUfC5eyUJu0CLnQzQv/LqgVugGqn luk1pH3BzH8XuHziKAIIHny9Fpk6US8AaUeTxcWDu2He7CgGl6INqcPr01dW8WoSwiRJCYTSKPu IYY73soEprD5Y+INwQCxWx6FKwVp/fC3ZjUQGEEqriUB+c6ZtaQlIYUhzikQC40MxyF/q1GNCyC pk13ZtKmXjuSqIEL74XbMLvtul8LNt3Ksd9P+Pl64fMpH9/ti0TUqoxCrhte5W9ZnPGfxAkUgeW F+wsj12qfSbWsTYXJb4mdtm+E/kzyzisrjubAYG8L3d3ncHYQZXUqpfPCfpFXt4UyBldAWXaGbN XmyeePuCc= X-Google-Smtp-Source: AGHT+IGonW1CRHUUsXyfbJh6ZoXbzRjlubAVAKY6OPaG8ochgaHRZVLV4raeX9y82h2VGUt1hBc7Vw== X-Received: by 2002:a05:620a:25d4:b0:84c:e9:f9d7 with SMTP id af79cd13be357-85ae7fb3278mr1782998185a.62.1759107814346; Sun, 28 Sep 2025 18:03:34 -0700 (PDT) Received: from soleen.c.googlers.com.com (53.47.86.34.bc.googleusercontent.com. 
[34.86.47.53]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-4db0c0fbe63sm64561521cf.23.2025.09.28.18.03.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 28 Sep 2025 18:03:33 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com Subject: [PATCH v4 02/30] kho: make debugfs interface optional Date: Mon, 29 Sep 
2025 01:02:53 +0000 Message-ID: <20250929010321.3462457-3-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com> References: <20250929010321.3462457-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Currently, KHO is controlled via the debugfs interface, but once LUO is introduced, it can control KHO, and the debugfs interface becomes optional. Add a separate config option, CONFIG_KEXEC_HANDOVER_DEBUG, that enables the debugfs interface and allows inspecting the tree. Move all debugfs-related code to a new file to keep the .c files clear of ifdefs. Co-developed-by: Mike Rapoport (Microsoft) Signed-off-by: Mike Rapoport (Microsoft) Signed-off-by: Pasha Tatashin --- MAINTAINERS | 3 +- kernel/Kconfig.kexec | 10 ++ kernel/Makefile | 1 + kernel/kexec_handover.c | 255 +++++-------------------------- kernel/kexec_handover_debug.c | 218 ++++++++++++++++++++++++++ kernel/kexec_handover_internal.h | 44 ++++++ 6 files changed, 311 insertions(+), 220 deletions(-) create mode 100644 kernel/kexec_handover_debug.c create mode 100644 kernel/kexec_handover_internal.h diff --git a/MAINTAINERS b/MAINTAINERS index 156fa8eefa69..a6cbcc7fb396 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -13759,13 +13759,14 @@ KEXEC HANDOVER (KHO) M: Alexander Graf M: Mike Rapoport M: Changyuan Lyu +M: Pasha Tatashin L: kexec@lists.infradead.org L: linux-mm@kvack.org S: Maintained F: Documentation/admin-guide/mm/kho.rst F: Documentation/core-api/kho/* F: include/linux/kexec_handover.h -F: kernel/kexec_handover.c +F: kernel/kexec_handover* F: tools/testing/selftests/kho/ =20 KEYS-ENCRYPTED diff --git a/kernel/Kconfig.kexec b/kernel/Kconfig.kexec index 422270d64820..e68156d8c72b 100644 --- a/kernel/Kconfig.kexec +++ b/kernel/Kconfig.kexec @@ 
-109,6 +109,16 @@ config KEXEC_HANDOVER to keep data or state alive across the kexec. For this to work, both source and target kernels need to have this option enabled. =20 +config KEXEC_HANDOVER_DEBUG + bool "kexec handover debug interface" + depends on KEXEC_HANDOVER + depends on DEBUG_FS + help + Allow controlling the kexec handover device tree via the debugfs + interface, i.e. finalizing the state or aborting the finalization. + Also enables inspecting the KHO FDT trees through debugfs binary + blobs. + config CRASH_DUMP bool "kernel crash dumps" default ARCH_DEFAULT_CRASH_DUMP diff --git a/kernel/Makefile b/kernel/Makefile index df3dd8291bb6..9fe722305c9b 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -83,6 +83,7 @@ obj-$(CONFIG_KEXEC) +=3D kexec.o obj-$(CONFIG_KEXEC_FILE) +=3D kexec_file.o obj-$(CONFIG_KEXEC_ELF) +=3D kexec_elf.o obj-$(CONFIG_KEXEC_HANDOVER) +=3D kexec_handover.o +obj-$(CONFIG_KEXEC_HANDOVER_DEBUG) +=3D kexec_handover_debug.o obj-$(CONFIG_BACKTRACE_SELF_TEST) +=3D backtracetest.o obj-$(CONFIG_COMPAT) +=3D compat.o obj-$(CONFIG_CGROUPS) +=3D cgroup/ diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c index 0ba5a2dbae28..f0f6c6b8ad83 100644 --- a/kernel/kexec_handover.c +++ b/kernel/kexec_handover.c @@ -10,7 +10,6 @@ =20 #include #include -#include #include #include #include @@ -28,6 +27,7 @@ */ #include "../mm/internal.h" #include "kexec_internal.h" +#include "kexec_handover_internal.h" =20 #define KHO_FDT_COMPATIBLE "kho-v1" #define PROP_PRESERVED_MEMORY_MAP "preserved-memory-map" @@ -101,8 +101,6 @@ struct khoser_mem_chunk; =20 struct kho_serialization { struct page *fdt; - struct list_head fdt_list; - struct dentry *sub_fdt_dir; struct kho_mem_track track; /* First chunk of serialized preserved memory map */ struct khoser_mem_chunk *preserved_mem_map; @@ -465,8 +463,8 @@ static void __init kho_mem_deserialize(const void *fdt) * area for early allocations that happen before page allocator is * initialized. 
*/ -static struct kho_scratch *kho_scratch; -static unsigned int kho_scratch_cnt; +struct kho_scratch *kho_scratch; +unsigned int kho_scratch_cnt; =20 /* * The scratch areas are scaled by default as percent of memory allocated = from @@ -662,36 +660,24 @@ static void __init kho_reserve_scratch(void) kho_enable =3D false; } =20 -struct fdt_debugfs { - struct list_head list; - struct debugfs_blob_wrapper wrapper; - struct dentry *file; +struct kho_out { + struct blocking_notifier_head chain_head; + struct mutex lock; /* protects KHO FDT finalization */ + struct kho_serialization ser; + bool finalized; + struct kho_debugfs dbg; }; =20 -static int kho_debugfs_fdt_add(struct list_head *list, struct dentry *dir, - const char *name, const void *fdt) -{ - struct fdt_debugfs *f; - struct dentry *file; - - f =3D kmalloc(sizeof(*f), GFP_KERNEL); - if (!f) - return -ENOMEM; - - f->wrapper.data =3D (void *)fdt; - f->wrapper.size =3D fdt_totalsize(fdt); - - file =3D debugfs_create_blob(name, 0400, dir, &f->wrapper); - if (IS_ERR(file)) { - kfree(f); - return PTR_ERR(file); - } - - f->file =3D file; - list_add(&f->list, list); - - return 0; -} +static struct kho_out kho_out =3D { + .chain_head =3D BLOCKING_NOTIFIER_INIT(kho_out.chain_head), + .lock =3D __MUTEX_INITIALIZER(kho_out.lock), + .ser =3D { + .track =3D { + .orders =3D XARRAY_INIT(kho_out.ser.track.orders, 0), + }, + }, + .finalized =3D false, +}; =20 /** * kho_add_subtree - record the physical address of a sub FDT in KHO root = tree. @@ -704,7 +690,8 @@ static int kho_debugfs_fdt_add(struct list_head *list, = struct dentry *dir, * by KHO for the new kernel to retrieve it after kexec. * * A debugfs blob entry is also created at - * ``/sys/kernel/debug/kho/out/sub_fdts/@name``. 
+ * ``/sys/kernel/debug/kho/out/sub_fdts/@name`` when kernel is configured = with + * CONFIG_KEXEC_HANDOVER_DEBUG * * Return: 0 on success, error code on failure */ @@ -721,7 +708,7 @@ int kho_add_subtree(struct kho_serialization *ser, cons= t char *name, void *fdt) if (err) return err; =20 - return kho_debugfs_fdt_add(&ser->fdt_list, ser->sub_fdt_dir, name, fdt); + return kho_debugfs_fdt_add(&kho_out.dbg, name, fdt, false); } EXPORT_SYMBOL_GPL(kho_add_subtree); =20 @@ -1044,29 +1031,6 @@ void *kho_restore_vmalloc(const struct kho_vmalloc *= preservation) } EXPORT_SYMBOL_GPL(kho_restore_vmalloc); =20 -/* Handling for debug/kho/out */ - -static struct dentry *debugfs_root; - -static int kho_out_update_debugfs_fdt(void) -{ - int err =3D 0; - struct fdt_debugfs *ff, *tmp; - - if (kho_out.finalized) { - err =3D kho_debugfs_fdt_add(&kho_out.ser.fdt_list, kho_out.dir, - "fdt", page_to_virt(kho_out.ser.fdt)); - } else { - list_for_each_entry_safe(ff, tmp, &kho_out.ser.fdt_list, list) { - debugfs_remove(ff->file); - list_del(&ff->list); - kfree(ff); - } - } - - return err; -} - static int __kho_abort(void) { int err; @@ -1119,7 +1083,8 @@ int kho_abort(void) goto unlock; =20 kho_out.finalized =3D false; - ret =3D kho_out_update_debugfs_fdt(); + + kho_debugfs_cleanup(&kho_out.dbg); =20 unlock: mutex_unlock(&kho_out.lock); @@ -1169,7 +1134,7 @@ static int __kho_finalize(void) abort: if (err) { pr_err("Failed to convert KHO state tree: %d\n", err); - kho_abort(); + __kho_abort(); } =20 return err; @@ -1194,119 +1159,32 @@ int kho_finalize(void) goto unlock; =20 kho_out.finalized =3D true; - ret =3D kho_out_update_debugfs_fdt(); + ret =3D kho_debugfs_fdt_add(&kho_out.dbg, "fdt", + page_to_virt(kho_out.ser.fdt), true); =20 unlock: mutex_unlock(&kho_out.lock); return ret; } =20 -static int kho_out_finalize_get(void *data, u64 *val) +bool kho_finalized(void) { - mutex_lock(&kho_out.lock); - *val =3D kho_out.finalized; - mutex_unlock(&kho_out.lock); - - return 0; -} - -static int 
kho_out_finalize_set(void *data, u64 _val) -{ - int ret =3D 0; - bool val =3D !!_val; + bool ret; =20 mutex_lock(&kho_out.lock); - - if (val =3D=3D kho_out.finalized) { - if (kho_out.finalized) - ret =3D -EEXIST; - else - ret =3D -ENOENT; - goto unlock; - } - - if (val) - ret =3D kho_finalize(); - else - ret =3D kho_abort(); - - if (ret) - goto unlock; - - kho_out.finalized =3D val; - ret =3D kho_out_update_debugfs_fdt(); - -unlock: + ret =3D kho_out.finalized; mutex_unlock(&kho_out.lock); - return ret; -} - -DEFINE_DEBUGFS_ATTRIBUTE(fops_kho_out_finalize, kho_out_finalize_get, - kho_out_finalize_set, "%llu\n"); - -static int scratch_phys_show(struct seq_file *m, void *v) -{ - for (int i =3D 0; i < kho_scratch_cnt; i++) - seq_printf(m, "0x%llx\n", kho_scratch[i].addr); - - return 0; -} -DEFINE_SHOW_ATTRIBUTE(scratch_phys); =20 -static int scratch_len_show(struct seq_file *m, void *v) -{ - for (int i =3D 0; i < kho_scratch_cnt; i++) - seq_printf(m, "0x%llx\n", kho_scratch[i].size); - - return 0; -} -DEFINE_SHOW_ATTRIBUTE(scratch_len); - -static __init int kho_out_debugfs_init(void) -{ - struct dentry *dir, *f, *sub_fdt_dir; - - dir =3D debugfs_create_dir("out", debugfs_root); - if (IS_ERR(dir)) - return -ENOMEM; - - sub_fdt_dir =3D debugfs_create_dir("sub_fdts", dir); - if (IS_ERR(sub_fdt_dir)) - goto err_rmdir; - - f =3D debugfs_create_file("scratch_phys", 0400, dir, NULL, - &scratch_phys_fops); - if (IS_ERR(f)) - goto err_rmdir; - - f =3D debugfs_create_file("scratch_len", 0400, dir, NULL, - &scratch_len_fops); - if (IS_ERR(f)) - goto err_rmdir; - - f =3D debugfs_create_file("finalize", 0600, dir, NULL, - &fops_kho_out_finalize); - if (IS_ERR(f)) - goto err_rmdir; - - kho_out.dir =3D dir; - kho_out.ser.sub_fdt_dir =3D sub_fdt_dir; - return 0; - -err_rmdir: - debugfs_remove_recursive(dir); - return -ENOENT; + return ret; } =20 struct kho_in { - struct dentry *dir; phys_addr_t fdt_phys; phys_addr_t scratch_phys; - struct list_head fdt_list; + struct kho_debugfs dbg; 
}; =20 static struct kho_in kho_in =3D { - .fdt_list =3D LIST_HEAD_INIT(kho_in.fdt_list), }; =20 static const void *kho_get_fdt(void) @@ -1370,56 +1248,6 @@ int kho_retrieve_subtree(const char *name, phys_addr= _t *phys) } EXPORT_SYMBOL_GPL(kho_retrieve_subtree); =20 -/* Handling for debugfs/kho/in */ - -static __init int kho_in_debugfs_init(const void *fdt) -{ - struct dentry *sub_fdt_dir; - int err, child; - - kho_in.dir =3D debugfs_create_dir("in", debugfs_root); - if (IS_ERR(kho_in.dir)) - return PTR_ERR(kho_in.dir); - - sub_fdt_dir =3D debugfs_create_dir("sub_fdts", kho_in.dir); - if (IS_ERR(sub_fdt_dir)) { - err =3D PTR_ERR(sub_fdt_dir); - goto err_rmdir; - } - - err =3D kho_debugfs_fdt_add(&kho_in.fdt_list, kho_in.dir, "fdt", fdt); - if (err) - goto err_rmdir; - - fdt_for_each_subnode(child, fdt, 0) { - int len =3D 0; - const char *name =3D fdt_get_name(fdt, child, NULL); - const u64 *fdt_phys; - - fdt_phys =3D fdt_getprop(fdt, child, "fdt", &len); - if (!fdt_phys) - continue; - if (len !=3D sizeof(*fdt_phys)) { - pr_warn("node `%s`'s prop `fdt` has invalid length: %d\n", - name, len); - continue; - } - err =3D kho_debugfs_fdt_add(&kho_in.fdt_list, sub_fdt_dir, name, - phys_to_virt(*fdt_phys)); - if (err) { - pr_warn("failed to add fdt `%s` to debugfs: %d\n", name, - err); - continue; - } - } - - return 0; - -err_rmdir: - debugfs_remove_recursive(kho_in.dir); - return err; -} - static __init int kho_init(void) { int err =3D 0; @@ -1434,27 +1262,16 @@ static __init int kho_init(void) goto err_free_scratch; } =20 - debugfs_root =3D debugfs_create_dir("kho", NULL); - if (IS_ERR(debugfs_root)) { - err =3D -ENOENT; + err =3D kho_debugfs_init(); + if (err) goto err_free_fdt; - } =20 - err =3D kho_out_debugfs_init(); + err =3D kho_out_debugfs_init(&kho_out.dbg); if (err) goto err_free_fdt; =20 if (fdt) { - err =3D kho_in_debugfs_init(fdt); - /* - * Failure to create /sys/kernel/debug/kho/in does not prevent - * reviving state from KHO and setting up KHO for the 
next - * kexec. - */ - if (err) - pr_err("failed exposing handover FDT in debugfs: %d\n", - err); - + kho_in_debugfs_init(&kho_in.dbg, fdt); return 0; } =20 diff --git a/kernel/kexec_handover_debug.c b/kernel/kexec_handover_debug.c new file mode 100644 index 000000000000..b88d138a97be --- /dev/null +++ b/kernel/kexec_handover_debug.c @@ -0,0 +1,218 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * kexec_handover_debug.c - kexec handover debugfs interface + * Copyright (C) 2023 Alexander Graf + * Copyright (C) 2025 Microsoft Corporation, Mike Rapoport + * Copyright (C) 2025 Google LLC, Changyuan Lyu + * Copyright (C) 2025 Google LLC, Pasha Tatashin + */ + +#define pr_fmt(fmt) "KHO: " fmt + +#include +#include +#include +#include +#include "kexec_handover_internal.h" + +static struct dentry *debugfs_root; + +struct fdt_debugfs { + struct list_head list; + struct debugfs_blob_wrapper wrapper; + struct dentry *file; +}; + +static int __kho_debugfs_fdt_add(struct list_head *list, struct dentry *dir, + const char *name, const void *fdt) +{ + struct fdt_debugfs *f; + struct dentry *file; + + f =3D kmalloc(sizeof(*f), GFP_KERNEL); + if (!f) + return -ENOMEM; + + f->wrapper.data =3D (void *)fdt; + f->wrapper.size =3D fdt_totalsize(fdt); + + file =3D debugfs_create_blob(name, 0400, dir, &f->wrapper); + if (IS_ERR(file)) { + kfree(f); + return PTR_ERR(file); + } + + f->file =3D file; + list_add(&f->list, list); + + return 0; +} + +int kho_debugfs_fdt_add(struct kho_debugfs *dbg, const char *name, + const void *fdt, bool root) +{ + struct dentry *dir; + + if (root) + dir =3D dbg->dir; + else + dir =3D dbg->sub_fdt_dir; + + return __kho_debugfs_fdt_add(&dbg->fdt_list, dir, name, fdt); +} + +void kho_debugfs_cleanup(struct kho_debugfs *dbg) +{ + struct fdt_debugfs *ff, *tmp; + + list_for_each_entry_safe(ff, tmp, &dbg->fdt_list, list) { + debugfs_remove(ff->file); + list_del(&ff->list); + kfree(ff); + } +} + +static int kho_out_finalize_get(void *data, u64 *val) +{ + *val =3D 
kho_finalized(); + + return 0; +} + +static int kho_out_finalize_set(void *data, u64 _val) +{ + bool val =3D !!_val; + + if (val) + return kho_finalize(); + + return kho_abort(); +} + +DEFINE_DEBUGFS_ATTRIBUTE(kho_out_finalize_fops, kho_out_finalize_get, + kho_out_finalize_set, "%llu\n"); + +static int scratch_phys_show(struct seq_file *m, void *v) +{ + for (int i =3D 0; i < kho_scratch_cnt; i++) + seq_printf(m, "0x%llx\n", kho_scratch[i].addr); + + return 0; +} +DEFINE_SHOW_ATTRIBUTE(scratch_phys); + +static int scratch_len_show(struct seq_file *m, void *v) +{ + for (int i =3D 0; i < kho_scratch_cnt; i++) + seq_printf(m, "0x%llx\n", kho_scratch[i].size); + + return 0; +} +DEFINE_SHOW_ATTRIBUTE(scratch_len); + +__init void kho_in_debugfs_init(struct kho_debugfs *dbg, const void *fdt) +{ + struct dentry *dir, *sub_fdt_dir; + int err, child; + + INIT_LIST_HEAD(&dbg->fdt_list); + + dir =3D debugfs_create_dir("in", debugfs_root); + if (IS_ERR(dir)) { + err =3D PTR_ERR(dir); + goto err_out; + } + + sub_fdt_dir =3D debugfs_create_dir("sub_fdts", dir); + if (IS_ERR(sub_fdt_dir)) { + err =3D PTR_ERR(sub_fdt_dir); + goto err_rmdir; + } + + err =3D __kho_debugfs_fdt_add(&dbg->fdt_list, dir, "fdt", fdt); + if (err) + goto err_rmdir; + + fdt_for_each_subnode(child, fdt, 0) { + int len =3D 0; + const char *name =3D fdt_get_name(fdt, child, NULL); + const u64 *fdt_phys; + + fdt_phys =3D fdt_getprop(fdt, child, "fdt", &len); + if (!fdt_phys) + continue; + if (len !=3D sizeof(*fdt_phys)) { + pr_warn("node %s prop fdt has invalid length: %d\n", + name, len); + continue; + } + err =3D __kho_debugfs_fdt_add(&dbg->fdt_list, sub_fdt_dir, name, + phys_to_virt(*fdt_phys)); + if (err) { + pr_warn("failed to add fdt %s to debugfs: %d\n", name, + err); + continue; + } + } + + dbg->dir =3D dir; + dbg->sub_fdt_dir =3D sub_fdt_dir; + + return; +err_rmdir: + debugfs_remove_recursive(dir); +err_out: + /* + * Failure to create /sys/kernel/debug/kho/in does not prevent + * reviving state from KHO 
and setting up KHO for the next + * kexec. + */ + if (err) + pr_err("failed exposing handover FDT in debugfs: %d\n", err); +} + +__init int kho_out_debugfs_init(struct kho_debugfs *dbg) +{ + struct dentry *dir, *f, *sub_fdt_dir; + + INIT_LIST_HEAD(&dbg->fdt_list); + + dir =3D debugfs_create_dir("out", debugfs_root); + if (IS_ERR(dir)) + return -ENOMEM; + + sub_fdt_dir =3D debugfs_create_dir("sub_fdts", dir); + if (IS_ERR(sub_fdt_dir)) + goto err_rmdir; + + f =3D debugfs_create_file("scratch_phys", 0400, dir, NULL, + &scratch_phys_fops); + if (IS_ERR(f)) + goto err_rmdir; + + f =3D debugfs_create_file("scratch_len", 0400, dir, NULL, + &scratch_len_fops); + if (IS_ERR(f)) + goto err_rmdir; + + f =3D debugfs_create_file("finalize", 0600, dir, NULL, + &kho_out_finalize_fops); + if (IS_ERR(f)) + goto err_rmdir; + + dbg->dir =3D dir; + dbg->sub_fdt_dir =3D sub_fdt_dir; + return 0; + +err_rmdir: + debugfs_remove_recursive(dir); + return -ENOENT; +} + +__init int kho_debugfs_init(void) +{ + debugfs_root =3D debugfs_create_dir("kho", NULL); + if (IS_ERR(debugfs_root)) + return -ENOENT; + return 0; +} diff --git a/kernel/kexec_handover_internal.h b/kernel/kexec_handover_inter= nal.h new file mode 100644 index 000000000000..f6f172ddcae4 --- /dev/null +++ b/kernel/kexec_handover_internal.h @@ -0,0 +1,44 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef LINUX_KEXEC_HANDOVER_INTERNAL_H +#define LINUX_KEXEC_HANDOVER_INTERNAL_H + +#include +#include +#include + +#ifdef CONFIG_KEXEC_HANDOVER_DEBUG +#include + +struct kho_debugfs { + struct dentry *dir; + struct dentry *sub_fdt_dir; + struct list_head fdt_list; +}; + +#else +struct kho_debugfs {}; +#endif + +extern struct kho_scratch *kho_scratch; +extern unsigned int kho_scratch_cnt; + +bool kho_finalized(void); + +#ifdef CONFIG_KEXEC_HANDOVER_DEBUG +int kho_debugfs_init(void); +void kho_in_debugfs_init(struct kho_debugfs *dbg, const void *fdt); +int kho_out_debugfs_init(struct kho_debugfs *dbg); +int kho_debugfs_fdt_add(struct 
kho_debugfs *dbg, const char *name, + const void *fdt, bool root); +void kho_debugfs_cleanup(struct kho_debugfs *dbg); +#else +static inline int kho_debugfs_init(void) { return 0; } +static inline void kho_in_debugfs_init(struct kho_debugfs *dbg, + const void *fdt) { } +static inline int kho_out_debugfs_init(struct kho_debugfs *dbg) { return 0= ; } +static inline int kho_debugfs_fdt_add(struct kho_debugfs *dbg, const char = *name, + const void *fdt, bool root) { return 0; } +static inline void kho_debugfs_cleanup(struct kho_debugfs *dbg) {} +#endif /* CONFIG_KEXEC_HANDOVER_DEBUG */ + +#endif /* LINUX_KEXEC_HANDOVER_INTERNAL_H */ --=20 2.51.0.536.g15c5d4f767-goog From nobody Sun Oct 5 07:24:32 2025 Received: from mail-qt1-f176.google.com (mail-qt1-f176.google.com [209.85.160.176]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0C8741E32A2 for ; Mon, 29 Sep 2025 01:03:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.176 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107819; cv=none; b=EUOVdAo0/7WW7NphVsr+9s4z1wZMttiiPK9lV5Ry/Tqq9JIGVGyd7gHBJOzbA/JqMdE8itP+PEMqRuWBT1ZcNBWg9QTLJqFpWnjNs1wWkxQqZQbLikruDrtBbARmXLzIm9Vyf09D9pPOBn16xOtCTBcwiRXHCCF79IYpdzzh5zs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107819; c=relaxed/simple; bh=DTknv/itgwBRy5esQokbot0UdjnuFnwXLezUoKggTPU=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=nj+NfURdCVHutjzD7mXuyuFa5TYLuWBfwK6470QVGlB+H0Bd4F6gFP9oPAky0GH5s6z89b+wTh19qQeuqI/TNnNR1y+4VNli88lA4dG7nSIaWthS4OT8Z0vc3g7/IgKFOqFM6+QG2rgD3oK+o/0KZsHz+wdozxHQDXf7VZDUz6Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com 
header.b=cztYuVEi; arc=none smtp.client-ip=209.85.160.176 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="cztYuVEi" Received: by mail-qt1-f176.google.com with SMTP id d75a77b69052e-4df0467b510so21092911cf.3 for ; Sun, 28 Sep 2025 18:03:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1759107816; x=1759712616; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=InPKJ+DVmlS5ON5h6PY7wB65pY/z4Szssm5k6MkNkUg=; b=cztYuVEispUIg7tYxWmYRCGU7BVSNDGpkzhANNO2JGRCcyGNaS29iS/Otw1UtaowO4 QLNbHxYpkaDxh0twLDKwuJxLXNbyzVtg6yZWlFgdGXi8qXn30S/51TapGQeSwDAe+bYY nocDJWBncrL+fQyKgRyD0Kr5aBQVSjUIHACd66emjK8KLD/DrErlQ+yTn+mdjoHklASN yNaeUamrXF4f+ptq/h+XfmUYwGVitP4oyN+y6Kg8OXyDfsIkyFfuIpvOre7isajQ9euO kuvuHC5jmtqkl1M1S3ZSSQv+XVveVoXdpRKL1GvqoHMyClhOwAhRhqpW4NQH37xcFGuI ilEg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759107816; x=1759712616; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=InPKJ+DVmlS5ON5h6PY7wB65pY/z4Szssm5k6MkNkUg=; b=PsSFng2EVHe6HqwhiqEGHnL2JdxVAPX3gZmwy/sBe7UegP2qjaOFx6CnDTUPB3xiEi GECV3XE1l+4AYtS8NxquhT8Gc0Jkz15WdSqhZOOKDgOYVzUcWF2/BcgSbPwJH//74o0f MKqV1qDKIGidrHC8ap62AlYWtVxLSTuqT6c0QobupyITftQk2iAP4EZJxhEbQmtt+4ZH pAPiY0pyjchtK/RcX+ov1gWNEL7GbA0oBS2oyG3wQeTB11iEUEn66GpdWthb8IyTY8Xo 7ieMlI0pN0PMGqQ7teZt1bmM47053cr/ziRcdP4RVfMd2cKgwYACH3Ufk+qz5o5GNvbT GdZA== X-Forwarded-Encrypted: i=1; 
AJvYcCUZ7hx1ZLEgHwJio01HBsC1ux22Qb31y/gDNsnyI3DRWelZKsB6fGin4dVKdy2HDbWCML6+cjQz7lgrB3c=@vger.kernel.org X-Gm-Message-State: AOJu0Yx5F0h2rljB1dxWMN8PcFWTXvYvhEM36X3sHb1qFASG18o4Z0w8 1zi9Br3sY4jA2mwMoTVZpMm5ybJ3Sdyp7l67qgcaqaUiSfxbkCsHh6NxhC0LlE8jLqU= X-Gm-Gg: ASbGncu43PMheCK1kMNa+FxoIBksYV6+Oz6TeGDjQN5qI5a+La4OxKbR2rJI8Qmz2+u 42bDHi4LHVTx632E3Dw1LbBSiLuV1kztBJO2/v+pnr3RbogqN8vW0yczJtDooBofAQiIjP8xZxt 8aDLtFWpoMbGP1U/RCLSl7OgTVZa8UmDeRHLtNyhrxI9BKH8M8JYDw1seze7SruC0Q+BSJMLb9B XT1NI+0jbwhBgZ0jNP4ZpYlIYpHAhEiscb43TGSov9sN2hc587eB7o6VyNUj7PgF/5o5TucXHZj koO3anZ9wGLzltTu5EBbgjCDt5MUVOI8xHOw8aahVdQDJzOHqYHeiF6EuICtDBte0WX0uB1fHiO kvkexnlrHz4G1JBVND2g8EdM+sOCvVLzgOh5UIzFmxMiSOqaovkFgVTzNgmVI77O93Jh8uw1E1I 8ZZJmkadY= X-Google-Smtp-Source: AGHT+IEeqjVDugejiyRZkFI9fwQsbM4MYpnyw6njUkRoNLb5uLzTQ6O7u0O+iVlspc2d9wL32DOF7g== X-Received: by 2002:a05:622a:2619:b0:4b7:aa99:5449 with SMTP id d75a77b69052e-4da47353c82mr193134891cf.2.1759107815823; Sun, 28 Sep 2025 18:03:35 -0700 (PDT) Received: from soleen.c.googlers.com.com (53.47.86.34.bc.googleusercontent.com. 
[34.86.47.53]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-4db0c0fbe63sm64561521cf.23.2025.09.28.18.03.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 28 Sep 2025 18:03:35 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com Subject: [PATCH v4 03/30] kho: drop notifiers Date: Mon, 29 Sep 2025 01:02:54 +0000 
Message-ID: <20250929010321.3462457-4-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com> References: <20250929010321.3462457-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: "Mike Rapoport (Microsoft)" The KHO framework uses a notifier chain as the mechanism for clients to participate in the finalization process. While this works for a single, central state machine, it is too restrictive for kernel-internal components like pstore/reserve_mem or IMA. These components need a simpler, direct way to register their state for preservation (e.g., during their initcall) without being part of a complex, shutdown-time notifier sequence. The notifier model forces all participants into a single finalization flow and makes direct preservation from an arbitrary context difficult. This patch refactors the client participation model by removing the notifier chain and introducing a direct API for managing FDT subtrees. The core kho_finalize() and kho_abort() state machine remains, but clients now register their data with KHO beforehand. 
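For illustration only, and not part of this patch: the registration model described above can be sketched as a small user-space program. It assumes a plain singly linked list in place of the kernel's struct list_head, malloc() in place of kmalloc(), and no locking (the patch itself takes kho_out.fdts_lock). All model_* names are hypothetical stand-ins for kho_add_subtree(), kho_remove_subtree() and the sub-FDT walk in __kho_finalize().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical user-space stand-in for the patch's struct kho_sub_fdt. */
struct model_sub_fdt {
	struct model_sub_fdt *next;
	const char *name;	/* borrowed from the caller, not copied */
	void *fdt;		/* borrowed pointer to the client's blob */
};

/* Head of the registry; plays the role of kho_out.sub_fdts. */
struct model_sub_fdt *model_sub_fdts;

/* Models kho_add_subtree(name, fdt): record the blob at the list tail. */
int model_add_subtree(const char *name, void *fdt)
{
	struct model_sub_fdt *node, **pp;

	assert(name && fdt);

	node = malloc(sizeof(*node));
	if (!node)
		return -1;	/* the kernel version returns -ENOMEM */

	node->next = NULL;
	node->name = name;
	node->fdt = fdt;

	/* Walk to the end so registration order is preserved. */
	for (pp = &model_sub_fdts; *pp; pp = &(*pp)->next)
		;
	*pp = node;
	return 0;
}

/* Models kho_remove_subtree(fdt): drop the first entry for this blob. */
void model_remove_subtree(void *fdt)
{
	struct model_sub_fdt **pp, *node;

	for (pp = &model_sub_fdts; *pp; pp = &(*pp)->next) {
		if ((*pp)->fdt == fdt) {
			node = *pp;
			*pp = node->next;
			free(node);
			return;
		}
	}
}

/*
 * Models the sub-FDT loop at finalization time: visit every registered
 * blob, optionally hand it to @emit, and return how many were visited.
 */
int model_finalize(void (*emit)(const char *name, void *fdt))
{
	struct model_sub_fdt *node;
	int count = 0;

	for (node = model_sub_fdts; node; node = node->next) {
		if (emit)
			emit(node->name, node->fdt);
		count++;
	}
	return count;
}
```

The point of the sketch is the ordering: a client calls the add function once, from whatever context suits it, and the finalize step later consumes whatever is registered at that moment. No callback from KHO into the client is involved.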
Signed-off-by: Mike Rapoport (Microsoft) Signed-off-by: Pasha Tatashin --- include/linux/kexec_handover.h | 28 +---- kernel/kexec_handover.c | 184 +++++++++++++++---------------- kernel/kexec_handover_debug.c | 17 +-- kernel/kexec_handover_internal.h | 5 +- mm/memblock.c | 60 ++-------- 5 files changed, 118 insertions(+), 176 deletions(-) diff --git a/include/linux/kexec_handover.h b/include/linux/kexec_handover.h index 04d0108db98e..2faf290803ce 100644 --- a/include/linux/kexec_handover.h +++ b/include/linux/kexec_handover.h @@ -10,14 +10,7 @@ struct kho_scratch { phys_addr_t size; }; =20 -/* KHO Notifier index */ -enum kho_event { - KEXEC_KHO_FINALIZE =3D 0, - KEXEC_KHO_ABORT =3D 1, -}; - struct folio; -struct notifier_block; struct page; =20 #define DECLARE_KHOSER_PTR(name, type) \ @@ -37,8 +30,6 @@ struct page; (typeof((s).ptr))((s).phys ? phys_to_virt((s).phys) : NULL); \ }) =20 -struct kho_serialization; - struct kho_vmalloc_chunk; struct kho_vmalloc { DECLARE_KHOSER_PTR(first, struct kho_vmalloc_chunk *); @@ -57,12 +48,10 @@ int kho_preserve_vmalloc(void *ptr, struct kho_vmalloc = *preservation); struct folio *kho_restore_folio(phys_addr_t phys); struct page *kho_restore_pages(phys_addr_t phys, unsigned int nr_pages); void *kho_restore_vmalloc(const struct kho_vmalloc *preservation); -int kho_add_subtree(struct kho_serialization *ser, const char *name, void = *fdt); +int kho_add_subtree(const char *name, void *fdt); +void kho_remove_subtree(void *fdt); int kho_retrieve_subtree(const char *name, phys_addr_t *phys); =20 -int register_kho_notifier(struct notifier_block *nb); -int unregister_kho_notifier(struct notifier_block *nb); - void kho_memory_init(void); =20 void kho_populate(phys_addr_t fdt_phys, u64 fdt_len, phys_addr_t scratch_p= hys, @@ -114,23 +103,16 @@ static inline void *kho_restore_vmalloc(const struct = kho_vmalloc *preservation) return NULL; } =20 -static inline int kho_add_subtree(struct kho_serialization *ser, - const char *name, void *fdt) 
+static inline int kho_add_subtree(const char *name, void *fdt) { return -EOPNOTSUPP; } =20 -static inline int kho_retrieve_subtree(const char *name, phys_addr_t *phys) +static inline void kho_remove_subtree(void *fdt) { - return -EOPNOTSUPP; } =20 -static inline int register_kho_notifier(struct notifier_block *nb) -{ - return -EOPNOTSUPP; -} - -static inline int unregister_kho_notifier(struct notifier_block *nb) +static inline int kho_retrieve_subtree(const char *name, phys_addr_t *phys) { return -EOPNOTSUPP; } diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c index f0f6c6b8ad83..e0dc0ed565ef 100644 --- a/kernel/kexec_handover.c +++ b/kernel/kexec_handover.c @@ -15,7 +15,6 @@ #include #include #include -#include #include #include =20 @@ -99,33 +98,34 @@ struct kho_mem_track { =20 struct khoser_mem_chunk; =20 -struct kho_serialization { - struct page *fdt; - struct kho_mem_track track; - /* First chunk of serialized preserved memory map */ - struct khoser_mem_chunk *preserved_mem_map; +struct kho_sub_fdt { + struct list_head l; + const char *name; + void *fdt; }; =20 struct kho_out { - struct blocking_notifier_head chain_head; + void *fdt; + bool finalized; + struct mutex lock; /* protects KHO FDT finalization */ =20 - struct dentry *dir; + struct list_head sub_fdts; + struct mutex fdts_lock; =20 - struct mutex lock; /* protects KHO FDT finalization */ + struct kho_mem_track track; + /* First chunk of serialized preserved memory map */ + struct khoser_mem_chunk *preserved_mem_map; =20 - struct kho_serialization ser; - bool finalized; + struct kho_debugfs dbg; }; =20 static struct kho_out kho_out =3D { - .chain_head =3D BLOCKING_NOTIFIER_INIT(kho_out.chain_head), .lock =3D __MUTEX_INITIALIZER(kho_out.lock), - .ser =3D { - .fdt_list =3D LIST_HEAD_INIT(kho_out.ser.fdt_list), - .track =3D { - .orders =3D XARRAY_INIT(kho_out.ser.track.orders, 0), - }, + .track =3D { + .orders =3D XARRAY_INIT(kho_out.track.orders, 0), }, + .sub_fdts =3D 
LIST_HEAD_INIT(kho_out.sub_fdts), + .fdts_lock =3D __MUTEX_INITIALIZER(kho_out.fdts_lock), .finalized =3D false, }; =20 @@ -366,14 +366,14 @@ static void kho_mem_ser_free(struct khoser_mem_chunk = *first_chunk) } } =20 -static int kho_mem_serialize(struct kho_serialization *ser) +static int kho_mem_serialize(struct kho_out *kho_out) { struct khoser_mem_chunk *first_chunk =3D NULL; struct khoser_mem_chunk *chunk =3D NULL; struct kho_mem_phys *physxa; unsigned long order; =20 - xa_for_each(&ser->track.orders, order, physxa) { + xa_for_each(&kho_out->track.orders, order, physxa) { struct kho_mem_phys_bits *bits; unsigned long phys; =20 @@ -401,7 +401,7 @@ static int kho_mem_serialize(struct kho_serialization *= ser) } } =20 - ser->preserved_mem_map =3D first_chunk; + kho_out->preserved_mem_map =3D first_chunk; =20 return 0; =20 @@ -660,28 +660,8 @@ static void __init kho_reserve_scratch(void) kho_enable =3D false; } =20 -struct kho_out { - struct blocking_notifier_head chain_head; - struct mutex lock; /* protects KHO FDT finalization */ - struct kho_serialization ser; - bool finalized; - struct kho_debugfs dbg; -}; - -static struct kho_out kho_out =3D { - .chain_head =3D BLOCKING_NOTIFIER_INIT(kho_out.chain_head), - .lock =3D __MUTEX_INITIALIZER(kho_out.lock), - .ser =3D { - .track =3D { - .orders =3D XARRAY_INIT(kho_out.ser.track.orders, 0), - }, - }, - .finalized =3D false, -}; - /** * kho_add_subtree - record the physical address of a sub FDT in KHO root = tree. - * @ser: serialization control object passed by KHO notifiers. * @name: name of the sub tree. * @fdt: the sub tree blob. 
* @@ -695,34 +675,45 @@ static struct kho_out kho_out =3D { * * Return: 0 on success, error code on failure */ -int kho_add_subtree(struct kho_serialization *ser, const char *name, void = *fdt) +int kho_add_subtree(const char *name, void *fdt) { - int err =3D 0; - u64 phys =3D (u64)virt_to_phys(fdt); - void *root =3D page_to_virt(ser->fdt); + struct kho_sub_fdt *sub_fdt; + int err; =20 - err |=3D fdt_begin_node(root, name); - err |=3D fdt_property(root, PROP_SUB_FDT, &phys, sizeof(phys)); - err |=3D fdt_end_node(root); + sub_fdt =3D kmalloc(sizeof(*sub_fdt), GFP_KERNEL); + if (!sub_fdt) + return -ENOMEM; =20 - if (err) - return err; + INIT_LIST_HEAD(&sub_fdt->l); + sub_fdt->name =3D name; + sub_fdt->fdt =3D fdt; + + mutex_lock(&kho_out.fdts_lock); + list_add_tail(&sub_fdt->l, &kho_out.sub_fdts); + err =3D kho_debugfs_fdt_add(&kho_out.dbg, name, fdt, false); + mutex_unlock(&kho_out.fdts_lock); =20 - return kho_debugfs_fdt_add(&kho_out.dbg, name, fdt, false); + return err; } EXPORT_SYMBOL_GPL(kho_add_subtree); =20 -int register_kho_notifier(struct notifier_block *nb) +void kho_remove_subtree(void *fdt) { - return blocking_notifier_chain_register(&kho_out.chain_head, nb); -} -EXPORT_SYMBOL_GPL(register_kho_notifier); + struct kho_sub_fdt *sub_fdt; + + mutex_lock(&kho_out.fdts_lock); + list_for_each_entry(sub_fdt, &kho_out.sub_fdts, l) { + if (sub_fdt->fdt =3D=3D fdt) { + list_del(&sub_fdt->l); + kfree(sub_fdt); + kho_debugfs_fdt_remove(&kho_out.dbg, fdt); + break; + } + } + mutex_unlock(&kho_out.fdts_lock); =20 -int unregister_kho_notifier(struct notifier_block *nb) -{ - return blocking_notifier_chain_unregister(&kho_out.chain_head, nb); } -EXPORT_SYMBOL_GPL(unregister_kho_notifier); +EXPORT_SYMBOL_GPL(kho_remove_subtree); =20 /** * kho_preserve_folio - preserve a folio across kexec. 
@@ -737,7 +728,7 @@ int kho_preserve_folio(struct folio *folio) { const unsigned long pfn =3D folio_pfn(folio); const unsigned int order =3D folio_order(folio); - struct kho_mem_track *track =3D &kho_out.ser.track; + struct kho_mem_track *track =3D &kho_out.track; =20 return __kho_preserve_order(track, pfn, order); } @@ -755,7 +746,7 @@ EXPORT_SYMBOL_GPL(kho_preserve_folio); */ int kho_preserve_pages(struct page *page, unsigned int nr_pages) { - struct kho_mem_track *track =3D &kho_out.ser.track; + struct kho_mem_track *track =3D &kho_out.track; const unsigned long start_pfn =3D page_to_pfn(page); const unsigned long end_pfn =3D start_pfn + nr_pages; unsigned long pfn =3D start_pfn; @@ -851,7 +842,7 @@ static struct kho_vmalloc_chunk *new_vmalloc_chunk(stru= ct kho_vmalloc_chunk *cur =20 static void kho_vmalloc_unpreserve_chunk(struct kho_vmalloc_chunk *chunk) { - struct kho_mem_track *track =3D &kho_out.ser.track; + struct kho_mem_track *track =3D &kho_out.track; unsigned long pfn =3D PHYS_PFN(virt_to_phys(chunk)); =20 __kho_unpreserve(track, pfn, pfn + 1); @@ -1033,11 +1024,11 @@ EXPORT_SYMBOL_GPL(kho_restore_vmalloc); =20 static int __kho_abort(void) { - int err; + int err =3D 0; unsigned long order; struct kho_mem_phys *physxa; =20 - xa_for_each(&kho_out.ser.track.orders, order, physxa) { + xa_for_each(&kho_out.track.orders, order, physxa) { struct kho_mem_phys_bits *bits; unsigned long phys; =20 @@ -1047,17 +1038,13 @@ static int __kho_abort(void) xa_destroy(&physxa->phys_bits); kfree(physxa); } - xa_destroy(&kho_out.ser.track.orders); + xa_destroy(&kho_out.track.orders); =20 - if (kho_out.ser.preserved_mem_map) { - kho_mem_ser_free(kho_out.ser.preserved_mem_map); - kho_out.ser.preserved_mem_map =3D NULL; + if (kho_out.preserved_mem_map) { + kho_mem_ser_free(kho_out.preserved_mem_map); + kho_out.preserved_mem_map =3D NULL; } =20 - err =3D blocking_notifier_call_chain(&kho_out.chain_head, KEXEC_KHO_ABORT, - NULL); - err =3D notifier_to_errno(err); - if (err) 
pr_err("Failed to abort KHO finalization: %d\n", err); =20 @@ -1084,7 +1071,7 @@ int kho_abort(void) =20 kho_out.finalized =3D false; =20 - kho_debugfs_cleanup(&kho_out.dbg); + kho_debugfs_fdt_remove(&kho_out.dbg, kho_out.fdt); =20 unlock: mutex_unlock(&kho_out.lock); @@ -1095,41 +1082,46 @@ static int __kho_finalize(void) { int err =3D 0; u64 *preserved_mem_map; - void *fdt =3D page_to_virt(kho_out.ser.fdt); + void *root =3D kho_out.fdt; + struct kho_sub_fdt *fdt; =20 - err |=3D fdt_create(fdt, PAGE_SIZE); - err |=3D fdt_finish_reservemap(fdt); - err |=3D fdt_begin_node(fdt, ""); - err |=3D fdt_property_string(fdt, "compatible", KHO_FDT_COMPATIBLE); + err |=3D fdt_create(root, PAGE_SIZE); + err |=3D fdt_finish_reservemap(root); + err |=3D fdt_begin_node(root, ""); + err |=3D fdt_property_string(root, "compatible", KHO_FDT_COMPATIBLE); /** * Reserve the preserved-memory-map property in the root FDT, so * that all property definitions will precede subnodes created by * KHO callers. */ - err |=3D fdt_property_placeholder(fdt, PROP_PRESERVED_MEMORY_MAP, + err |=3D fdt_property_placeholder(root, PROP_PRESERVED_MEMORY_MAP, sizeof(*preserved_mem_map), (void **)&preserved_mem_map); if (err) goto abort; =20 - err =3D kho_preserve_folio(page_folio(kho_out.ser.fdt)); + err =3D kho_preserve_folio(virt_to_folio(kho_out.fdt)); if (err) goto abort; =20 - err =3D blocking_notifier_call_chain(&kho_out.chain_head, - KEXEC_KHO_FINALIZE, &kho_out.ser); - err =3D notifier_to_errno(err); + err =3D kho_mem_serialize(&kho_out); if (err) goto abort; =20 - err =3D kho_mem_serialize(&kho_out.ser); - if (err) - goto abort; + *preserved_mem_map =3D (u64)virt_to_phys(kho_out.preserved_mem_map); =20 - *preserved_mem_map =3D (u64)virt_to_phys(kho_out.ser.preserved_mem_map); + mutex_lock(&kho_out.fdts_lock); + list_for_each_entry(fdt, &kho_out.sub_fdts, l) { + phys_addr_t phys =3D virt_to_phys(fdt->fdt); =20 - err |=3D fdt_end_node(fdt); - err |=3D fdt_finish(fdt); + err |=3D fdt_begin_node(root, 
fdt->name); + err |=3D fdt_property(root, PROP_SUB_FDT, &phys, sizeof(phys)); + err |=3D fdt_end_node(root); + }; + mutex_unlock(&kho_out.fdts_lock); + + err |=3D fdt_end_node(root); + err |=3D fdt_finish(root); =20 abort: if (err) { @@ -1160,7 +1152,7 @@ int kho_finalize(void) =20 kho_out.finalized =3D true; ret =3D kho_debugfs_fdt_add(&kho_out.dbg, "fdt", - page_to_virt(kho_out.ser.fdt), true); + kho_out.fdt, true); =20 unlock: mutex_unlock(&kho_out.lock); @@ -1252,15 +1244,17 @@ static __init int kho_init(void) { int err =3D 0; const void *fdt =3D kho_get_fdt(); + struct page *fdt_page; =20 if (!kho_enable) return 0; =20 - kho_out.ser.fdt =3D alloc_page(GFP_KERNEL); - if (!kho_out.ser.fdt) { + fdt_page =3D alloc_page(GFP_KERNEL); + if (!fdt_page) { err =3D -ENOMEM; goto err_free_scratch; } + kho_out.fdt =3D page_to_virt(fdt_page); =20 err =3D kho_debugfs_init(); if (err) @@ -1288,8 +1282,8 @@ static __init int kho_init(void) return 0; =20 err_free_fdt: - put_page(kho_out.ser.fdt); - kho_out.ser.fdt =3D NULL; + put_page(fdt_page); + kho_out.fdt =3D NULL; err_free_scratch: for (int i =3D 0; i < kho_scratch_cnt; i++) { void *start =3D __va(kho_scratch[i].addr); @@ -1300,7 +1294,7 @@ static __init int kho_init(void) kho_enable =3D false; return err; } -late_initcall(kho_init); +fs_initcall(kho_init); =20 static void __init kho_release_scratch(void) { @@ -1436,7 +1430,7 @@ int kho_fill_kimage(struct kimage *image) if (!kho_out.finalized) return 0; =20 - image->kho.fdt =3D page_to_phys(kho_out.ser.fdt); + image->kho.fdt =3D virt_to_phys(kho_out.fdt); =20 scratch_size =3D sizeof(*kho_scratch) * kho_scratch_cnt; scratch =3D (struct kexec_buf){ diff --git a/kernel/kexec_handover_debug.c b/kernel/kexec_handover_debug.c index b88d138a97be..af4bad225630 100644 --- a/kernel/kexec_handover_debug.c +++ b/kernel/kexec_handover_debug.c @@ -61,14 +61,17 @@ int kho_debugfs_fdt_add(struct kho_debugfs *dbg, const = char *name, return __kho_debugfs_fdt_add(&dbg->fdt_list, dir, name, 
fdt); } =20 -void kho_debugfs_cleanup(struct kho_debugfs *dbg) +void kho_debugfs_fdt_remove(struct kho_debugfs *dbg, void *fdt) { - struct fdt_debugfs *ff, *tmp; - - list_for_each_entry_safe(ff, tmp, &dbg->fdt_list, list) { - debugfs_remove(ff->file); - list_del(&ff->list); - kfree(ff); + struct fdt_debugfs *ff; + + list_for_each_entry(ff, &dbg->fdt_list, list) { + if (ff->wrapper.data =3D=3D fdt) { + debugfs_remove(ff->file); + list_del(&ff->list); + kfree(ff); + break; + } } } =20 diff --git a/kernel/kexec_handover_internal.h b/kernel/kexec_handover_inter= nal.h index f6f172ddcae4..229a05558b99 100644 --- a/kernel/kexec_handover_internal.h +++ b/kernel/kexec_handover_internal.h @@ -30,7 +30,7 @@ void kho_in_debugfs_init(struct kho_debugfs *dbg, const v= oid *fdt); int kho_out_debugfs_init(struct kho_debugfs *dbg); int kho_debugfs_fdt_add(struct kho_debugfs *dbg, const char *name, const void *fdt, bool root); -void kho_debugfs_cleanup(struct kho_debugfs *dbg); +void kho_debugfs_fdt_remove(struct kho_debugfs *dbg, void *fdt); #else static inline int kho_debugfs_init(void) { return 0; } static inline void kho_in_debugfs_init(struct kho_debugfs *dbg, @@ -38,7 +38,8 @@ static inline void kho_in_debugfs_init(struct kho_debugfs= *dbg, static inline int kho_out_debugfs_init(struct kho_debugfs *dbg) { return 0= ; } static inline int kho_debugfs_fdt_add(struct kho_debugfs *dbg, const char = *name, const void *fdt, bool root) { return 0; } -static inline void kho_debugfs_cleanup(struct kho_debugfs *dbg) {} +static inline void kho_debugfs_fdt_remove(struct kho_debugfs *dbg, + void *fdt) { } #endif /* CONFIG_KEXEC_HANDOVER_DEBUG */ =20 #endif /* LINUX_KEXEC_HANDOVER_INTERNAL_H */ diff --git a/mm/memblock.c b/mm/memblock.c index e23e16618e9b..c4b2d4e4c715 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -2444,53 +2444,18 @@ int reserve_mem_release_by_name(const char *name) #define MEMBLOCK_KHO_FDT "memblock" #define MEMBLOCK_KHO_NODE_COMPATIBLE "memblock-v1" #define 
RESERVE_MEM_KHO_NODE_COMPATIBLE "reserve-mem-v1" -static struct page *kho_fdt; - -static int reserve_mem_kho_finalize(struct kho_serialization *ser) -{ - int err =3D 0, i; - - for (i =3D 0; i < reserved_mem_count; i++) { - struct reserve_mem_table *map =3D &reserved_mem_table[i]; - struct page *page =3D phys_to_page(map->start); - unsigned int nr_pages =3D map->size >> PAGE_SHIFT; - - err |=3D kho_preserve_pages(page, nr_pages); - } - - err |=3D kho_preserve_folio(page_folio(kho_fdt)); - err |=3D kho_add_subtree(ser, MEMBLOCK_KHO_FDT, page_to_virt(kho_fdt)); - - return notifier_from_errno(err); -} - -static int reserve_mem_kho_notifier(struct notifier_block *self, - unsigned long cmd, void *v) -{ - switch (cmd) { - case KEXEC_KHO_FINALIZE: - return reserve_mem_kho_finalize((struct kho_serialization *)v); - case KEXEC_KHO_ABORT: - return NOTIFY_DONE; - default: - return NOTIFY_BAD; - } -} - -static struct notifier_block reserve_mem_kho_nb =3D { - .notifier_call =3D reserve_mem_kho_notifier, -}; =20 static int __init prepare_kho_fdt(void) { int err =3D 0, i; + struct page *fdt_page; void *fdt; =20 - kho_fdt =3D alloc_page(GFP_KERNEL); - if (!kho_fdt) + fdt_page =3D alloc_page(GFP_KERNEL); + if (!fdt_page) return -ENOMEM; =20 - fdt =3D page_to_virt(kho_fdt); + fdt =3D page_to_virt(fdt_page); =20 err |=3D fdt_create(fdt, PAGE_SIZE); err |=3D fdt_finish_reservemap(fdt); @@ -2499,7 +2464,10 @@ static int __init prepare_kho_fdt(void) err |=3D fdt_property_string(fdt, "compatible", MEMBLOCK_KHO_NODE_COMPATI= BLE); for (i =3D 0; i < reserved_mem_count; i++) { struct reserve_mem_table *map =3D &reserved_mem_table[i]; + struct page *page =3D phys_to_page(map->start); + unsigned int nr_pages =3D map->size >> PAGE_SHIFT; =20 + err |=3D kho_preserve_pages(page, nr_pages); err |=3D fdt_begin_node(fdt, map->name); err |=3D fdt_property_string(fdt, "compatible", RESERVE_MEM_KHO_NODE_COM= PATIBLE); err |=3D fdt_property(fdt, "start", &map->start, sizeof(map->start)); @@ -2507,13 
+2475,14 @@ static int __init prepare_kho_fdt(void) err |=3D fdt_end_node(fdt); } err |=3D fdt_end_node(fdt); - err |=3D fdt_finish(fdt); =20 + err |=3D kho_preserve_folio(page_folio(fdt_page)); + err |=3D kho_add_subtree(MEMBLOCK_KHO_FDT, fdt); + if (err) { pr_err("failed to prepare memblock FDT for KHO: %d\n", err); - put_page(kho_fdt); - kho_fdt =3D NULL; + put_page(fdt_page); } =20 return err; @@ -2529,13 +2498,6 @@ static int __init reserve_mem_init(void) err =3D prepare_kho_fdt(); if (err) return err; - - err =3D register_kho_notifier(&reserve_mem_kho_nb); - if (err) { - put_page(kho_fdt); - kho_fdt =3D NULL; - } - return err; } late_initcall(reserve_mem_init); --=20 2.51.0.536.g15c5d4f767-goog From nobody Sun Oct 5 07:24:32 2025 Received: from mail-qk1-f170.google.com (mail-qk1-f170.google.com [209.85.222.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 824291D9A5F for ; Mon, 29 Sep 2025 01:03:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.170 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107820; cv=none; b=c12p2c8HsCQd2rAQ/CPOLmAAa99k6YmWau1c1BWKOluOS6j93LO8pXzwYso7S7nliydXTFcinElUJ4hzXkvVGOtN6W+FWtccSGjOaDcApHuNy0pCqBxMTmdpaRbuM1/9uIjI8NQsoUoco+YVhjjYGz1rJp3LguztbhegfYvv/58= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107820; c=relaxed/simple; bh=Y/Ivo5zgLtU+RH6gb+3c1+9ZNDdvFIGCYZipUGkL+4w=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=pcm8ew7+GK1eNkbmwOjezleUDnXqh7qbwzCvChh6j0vzwVwfdYuIftWYKsMVPxCrokd4T7ZzpkO7ihXqpyfig4zS2+iW/Z7G8kxKBM60CxIzKsI8LdpC2tA/jZZAa4TUp9YTUuCkal72CUYxg9AO/bMt11trwTkSpIJaFpOzkTI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) 
header.d=soleen.com header.i=@soleen.com header.b=HFMXSoS4; arc=none smtp.client-ip=209.85.222.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="HFMXSoS4" Received: by mail-qk1-f170.google.com with SMTP id af79cd13be357-85b94fe19e2so399147485a.3 for ; Sun, 28 Sep 2025 18:03:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1759107817; x=1759712617; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=ezlL3LcjIf2ht7RpS8bynkMbzZtMKTbogv0OCRBHCJM=; b=HFMXSoS4PQuaDTMFFqb7ZmjXNAegr54JYRE+RyQYVFaQ/yuGeomYzk10Iq6SDbWjI7 goGuisDvG07tJdh38tIEAnc1gRzEwSRJLEtOHwgtSDu3y51QMp18kGAmjGVWJCR0h5NW 3/umAhto3EsVdZjEUtW2rM+X1rt5Ivxrk2+ZB7OypK+OtRDGzgNzPbpXh5/k0UCOehWA u/yMslRAx3pVNLrfCVoEATFhEGV9JGqVR+j3xcFZrerfe7xyKdc6z3ZkKzxncAiXE30i zmD4Q5yqJ8Zhtz2bsVE9sB3v20j8kc0nKJMrPXN8z4fU6gavDpqpjKsofaP75834L7t5 gO3A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759107817; x=1759712617; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ezlL3LcjIf2ht7RpS8bynkMbzZtMKTbogv0OCRBHCJM=; b=rPmCVWM7V2zJXsI/423JSln589euZNGvmUjuP2VFd8nKnruuVfk7FDjDpKbNW0owm1 j38+7GXqjNXiySc/mahTKtdiiWf/tQSuFXFEr2US0tA4+c3alpKABTJ7DrUG56BwZlzL AscLXma74EslcT/kVVSQrAogsO+v5ZBjR2RgmslVRsDgJRpQj59NKPvPKYpxm7bi1E+L wQiMYZT4NhL3zn2O33ndwOjb/fws/CTKEFtk9LYnugloX1JoCBEKmuncCF86vAsupyTM 5kZCUWouHArLK++OFgUCQL64ntqFxyLzo6aj/xa+2xpMxbwM80fgL/f78EGwn9KX5qGd YBiQ== X-Forwarded-Encrypted: i=1; 
AJvYcCVjLrcle6IOsD0PS6kNW3W9Xynf9S3Uiaa4/Bv9PLCh//pCCxP2TK33Dw0BsjD2AfRkeVjg3bnnssTcXTk=@vger.kernel.org X-Gm-Message-State: AOJu0YzsW3yxMoASRjubhijT0t3dorhOvV2UKa+Ggbc3BmdhlC1mNWSJ 00rMzMYQ+VYnml0qcrEZX16gltpGulzF9fOO9Pc1Cd27H2KgDT/E65CQsKRQ8d6uwGc= X-Gm-Gg: ASbGncvf02VHhdaGWqT9rN61P6I7yue10/ApruG+cgodOtGSQC4cd6AqY2pG8DdAs3t fbEvvJ4q8HTlzO3KAVFf+VopL1NSY1CUXVMXTr134DjWsBprahwHZYAEuZEi0zYNScfTYmucJ27 kN21ciKwmTxnHC2c96wvqEhiLlfzqo3j+arjEP8Jqy0d3TxdCBKzmfcm/H/gNg0Bb/uBGBRj9Fy 8Ehz7B+1zBCLLOcdRN/1idaKxpQWVeRbdY0WQ7R+nxdtJuPzMU7nqTtsJwTN7N6f68osoZE33Fb qKyxwThlaXlH8z7ouwRoNVtm0gDdLKInTRtOGuz+w2MqvMG4NcbhUh5lWXbJSPVxE0jPJGQtsz2 L3D0KgyibZsX79xD/710kPV5kw5XvxgkFNYXzNh0P8KP/sDOxilGj6JZD+xnabxZXNZNyyLcoh0 RvfvS8zdDmuYHWylmMfuDgHNt15S4h X-Google-Smtp-Source: AGHT+IHaETUM84OTpHheMB0hdUMDHvd/ZJx9O8NiNJ3bWGiu/zS0OpwUNUxS2hk0EnKZS1qySrlkgw== X-Received: by 2002:a05:620a:f0c:b0:855:cfe0:b6eb with SMTP id af79cd13be357-85ae9c6d420mr1897035985a.75.1759107817220; Sun, 28 Sep 2025 18:03:37 -0700 (PDT) Received: from soleen.c.googlers.com.com (53.47.86.34.bc.googleusercontent.com. 
[34.86.47.53]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-4db0c0fbe63sm64561521cf.23.2025.09.28.18.03.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 28 Sep 2025 18:03:36 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com Subject: [PATCH v4 04/30] kho: add interfaces to unpreserve folios and page ranges
Date: Mon, 29 Sep 2025 01:02:55 +0000 Message-ID: <20250929010321.3462457-5-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com> References: <20250929010321.3462457-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Allow users of KHO to cancel the previous preservation by adding the necessary interfaces to unpreserve folio and pages. Co-developed-by: Changyuan Lyu Signed-off-by: Pasha Tatashin --- include/linux/kexec_handover.h | 12 +++++ kernel/kexec_handover.c | 85 ++++++++++++++++++++++++++++------ 2 files changed, 84 insertions(+), 13 deletions(-) diff --git a/include/linux/kexec_handover.h b/include/linux/kexec_handover.h index 2faf290803ce..4ba145713838 100644 --- a/include/linux/kexec_handover.h +++ b/include/linux/kexec_handover.h @@ -43,7 +43,9 @@ bool kho_is_enabled(void); bool is_kho_boot(void); =20 int kho_preserve_folio(struct folio *folio); +int kho_unpreserve_folio(struct folio *folio); int kho_preserve_pages(struct page *page, unsigned int nr_pages); +int kho_unpreserve_pages(struct page *page, unsigned int nr_pages); int kho_preserve_vmalloc(void *ptr, struct kho_vmalloc *preservation); struct folio *kho_restore_folio(phys_addr_t phys); struct page *kho_restore_pages(phys_addr_t phys, unsigned int nr_pages); @@ -76,11 +78,21 @@ static inline int kho_preserve_folio(struct folio *foli= o) return -EOPNOTSUPP; } =20 +static inline int kho_unpreserve_folio(struct folio *folio) +{ + return -EOPNOTSUPP; +} + static inline int kho_preserve_pages(struct page *page, unsigned int nr_pa= ges) { return -EOPNOTSUPP; } =20 +static inline int kho_unpreserve_pages(struct page *page, unsigned int nr_= pages) +{ + return -EOPNOTSUPP; +} + static inline int kho_preserve_vmalloc(void *ptr, 
struct kho_vmalloc *preservation) { diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c index e0dc0ed565ef..26e035eb1314 100644 --- a/kernel/kexec_handover.c +++ b/kernel/kexec_handover.c @@ -153,26 +153,33 @@ static void *xa_load_or_alloc(struct xarray *xa, unsi= gned long index, size_t sz) return elm; } =20 -static void __kho_unpreserve(struct kho_mem_track *track, unsigned long pf= n, - unsigned long end_pfn) +static void __kho_unpreserve_order(struct kho_mem_track *track, unsigned l= ong pfn, + unsigned int order) { struct kho_mem_phys_bits *bits; struct kho_mem_phys *physxa; + const unsigned long pfn_high =3D pfn >> order; =20 - while (pfn < end_pfn) { - const unsigned int order =3D - min(count_trailing_zeros(pfn), ilog2(end_pfn - pfn)); - const unsigned long pfn_high =3D pfn >> order; + physxa =3D xa_load(&track->orders, order); + if (!physxa) + return; + + bits =3D xa_load(&physxa->phys_bits, pfn_high / PRESERVE_BITS); + if (!bits) + return; =20 - physxa =3D xa_load(&track->orders, order); - if (!physxa) - continue; + clear_bit(pfn_high % PRESERVE_BITS, bits->preserve); +} + +static void __kho_unpreserve(struct kho_mem_track *track, unsigned long pf= n, + unsigned long end_pfn) +{ + unsigned int order; =20 - bits =3D xa_load(&physxa->phys_bits, pfn_high / PRESERVE_BITS); - if (!bits) - continue; + while (pfn < end_pfn) { + order =3D min(count_trailing_zeros(pfn), ilog2(end_pfn - pfn)); =20 - clear_bit(pfn_high % PRESERVE_BITS, bits->preserve); + __kho_unpreserve_order(track, pfn, order); =20 pfn +=3D 1 << order; } @@ -734,6 +741,30 @@ int kho_preserve_folio(struct folio *folio) } EXPORT_SYMBOL_GPL(kho_preserve_folio); =20 +/** + * kho_unpreserve_folio - unpreserve a folio. + * @folio: folio to unpreserve. + * + * Instructs KHO to unpreserve a folio that was preserved by + * kho_preserve_folio() before. The provided @folio (pfn and order) + * must exactly match a previously preserved folio. 
+ * + * Return: 0 on success, error code on failure + */ +int kho_unpreserve_folio(struct folio *folio) +{ + const unsigned long pfn =3D folio_pfn(folio); + const unsigned int order =3D folio_order(folio); + struct kho_mem_track *track =3D &kho_out.track; + + if (kho_out.finalized) + return -EBUSY; + + __kho_unpreserve_order(track, pfn, order); + return 0; +} +EXPORT_SYMBOL_GPL(kho_unpreserve_folio); + /** * kho_preserve_pages - preserve contiguous pages across kexec * @page: first page in the list. @@ -773,6 +804,34 @@ int kho_preserve_pages(struct page *page, unsigned int= nr_pages) } EXPORT_SYMBOL_GPL(kho_preserve_pages); =20 +/** + * kho_unpreserve_pages - unpreserve contiguous pages. + * @page: first page in the list. + * @nr_pages: number of pages. + * + * Instructs KHO to unpreserve @nr_pages contiguous pages starting from @= page. + * This call must exactly match the granularity at which memory was original= ly + * preserved (i.e. by a kho_preserve_pages() call with the same @page and + * @nr_pages). Unpreserving arbitrary sub-ranges of larger preserved block= s is + * not supported.
+ * + * Return: 0 on success, error code on failure + */ +int kho_unpreserve_pages(struct page *page, unsigned int nr_pages) +{ + struct kho_mem_track *track =3D &kho_out.track; + const unsigned long start_pfn =3D page_to_pfn(page); + const unsigned long end_pfn =3D start_pfn + nr_pages; + + if (kho_out.finalized) + return -EBUSY; + + __kho_unpreserve(track, start_pfn, end_pfn); + + return 0; +} +EXPORT_SYMBOL_GPL(kho_unpreserve_pages); + struct kho_vmalloc_hdr { DECLARE_KHOSER_PTR(next, struct kho_vmalloc_chunk *); }; --=20 2.51.0.536.g15c5d4f767-goog From nobody Sun Oct 5 07:24:32 2025 Received: from mail-qt1-f182.google.com (mail-qt1-f182.google.com [209.85.160.182]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 109FC1EE033 for ; Mon, 29 Sep 2025 01:03:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.182 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107822; cv=none; b=jRRKcsjoficqnl+gbOXNzKPQEC95QrW9sVvJwTZq+KzXKyxDMQVfdPIOArqdiDW5GRk90OhXQMlrsyF8JPUuF9kCdbmE6E42F0BlHJu+LNp+ciNXVN09pyT7qQ/oHImH0Vf4LY98zfIIWttHV3FnZwiqWZ1dr08d6+YjVgoegOE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107822; c=relaxed/simple; bh=H863SX6t2g0rcimmk1vD/Y1fEaIHCkBYP8Gf6AaW1Ug=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=fKExLLj/KJ5ZYw9mmaFSqVSfiGpnrLeyxsfWoKOKVL8QLYiI8lRhlKU3LeBLkytI0B5QDeFB8S6fTdHoTWZPpWdKUAgSOzafLn+fmXgCfTEbBmifyNMLxUrtGaeE/GWZAyx3jmCbGQCmC5LaRX7k6uq+zdOiy1/So7LYYBaHECs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=gQBi0rde; arc=none smtp.client-ip=209.85.160.182 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass 
(p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="gQBi0rde" Received: by mail-qt1-f182.google.com with SMTP id d75a77b69052e-4dce9229787so33641211cf.0 for ; Sun, 28 Sep 2025 18:03:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1759107819; x=1759712619; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=seHRxlr9qoYmGWxME4ah/Rh2okXUsVbz/ndLeC4LCP8=; b=gQBi0rdeGswziM8BkntC20uIOSiGooee+q/B2q41M6BgoZqgGeP5Bv8MuJEEWEkBUP +ZlZbzmEr4DRc+6zcNJezEjKAPSmoJRJzY/dRqGal2hAQjzTp+0G7hPw28GxMf9S5eiT kBioqI4KApASRB1+H4OudFm8ghWw4zPrbE9lHh1ZH2QSAQIJgsVriCjcp3y0rL8Sj2Gf p17+6MivXXOaxB8FxwvJJwHCkSWAhlVN9wcUg1cAun9FY0yON7xeTnJE2ZJxBU/raEZ/ F0Qq0uii5j4tPPWUADLs6IyfkBjRIrTrG5WrxYxQLH5aoLutzCpUFKSpMEmxHkHKZkam TuGg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759107819; x=1759712619; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=seHRxlr9qoYmGWxME4ah/Rh2okXUsVbz/ndLeC4LCP8=; b=NgtPQd9OgD7hIeM9JZZnNE2JvWZjshZXmcBYty35tzgAKLxBg/GTJGjRzCFTS0RB1w oMNWuKkep5ZfK5gtTKJnb8eFnwX6TVDRK6IWceZPQmJIFFazCxOLkxOHTxCpilAIlSVy pGc7ioGbgB9ZIQI4zZKX8U2U45SXW/9ytTV2USHzdGPvAbb0hMyd4LwOEE7sQ3zXHYV4 kDdYwT2vyw80/t6311QNjlXQrGf0l+98GwvVDNmJVIUS1EOmAw6CEFLDEzohOaK7GroL w1WRpJP8ugtrj1jlg0ub73CPEJ2opQp1ZFs640a+evjIobaCMY8PZZVD1MOt7f3lixCW hMng== X-Forwarded-Encrypted: i=1; AJvYcCX7bwoptcYY/1p4HTW1VGzEdLmimldiP30lyfT7VnRBv4WXInK3Q17DPzrtifa/isfvKEZQGhtGrIsLOyc=@vger.kernel.org X-Gm-Message-State: AOJu0YxX5ibDbohA39IALtzTOBtvKOaNb0jH7nNHeD8rwidhLXIpBS25 
1ihXwSHQcIQ9R3vitU5L62DfJ1Ap7WFq00Aq2fK5yHKQefGLKN9njTqGSPGC9coyI7M= X-Gm-Gg: ASbGnct96373072v3fTLb2wZ3X91i3vCNAtGG3X3QuRCjHgJKpQshS7ZY6Ge3pcYV++ dq1jSCxhR4mKEdTOZlkw7vvDEUbk3UlHXGBGVSy7wnDqTIaMNdM0teFjTtNSzUcQb8i/GQ7DQTo PH+YBIITY12Z6wtgzBSA+u+VpSQP8VeP5OQxHUv+rg0PxgQhBpKixqggcT/naROi7yhSeRSvC/B 7hFOSKtW1KQ+1ViLPJFDqcMLJh+67mVnt5yAcbcTeRxQ7CZmka76e4EQh4fUnZTIAdNGeHakK1H Yijso2wjnXEnjOt1VheKP5MNNGh9xTr/9bMZKI+aEPspkyN6OHrbY2Hq1o+7J/s2Lgf8hV4Avzi EQmwYSHCrpmLN6jYS7mKnAAcY55hXlYCGIia7vvMb2LunHIFaWGMTilprjVvs1yix1RR1uGDlNN UWzqvfBCg= X-Google-Smtp-Source: AGHT+IGWluQyWGzOMAK6ELVmstmCfELrwzvUE/VJRxrRrDSk2QQU4Sd30xTz+Cb2R/+uvVNJIiSTzA== X-Received: by 2002:a05:622a:4012:b0:4b3:4f82:2b2a with SMTP id d75a77b69052e-4da4744e220mr214493921cf.4.1759107818629; Sun, 28 Sep 2025 18:03:38 -0700 (PDT) Received: from soleen.c.googlers.com.com (53.47.86.34.bc.googleusercontent.com. [34.86.47.53]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-4db0c0fbe63sm64561521cf.23.2025.09.28.18.03.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 28 Sep 2025 18:03:38 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, 
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com Subject: [PATCH v4 05/30] kho: don't unpreserve memory during abort Date: Mon, 29 Sep 2025 01:02:56 +0000 Message-ID: <20250929010321.3462457-6-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com> References: <20250929010321.3462457-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" KHO allows clients to preserve memory regions at any point before the KHO state is finalized. The finalization process itself involves KHO performing its own actions, such as serializing the overall preserved memory map. If this finalization process is aborted, the current implementation destroys KHO's internal memory tracking structures (`kho_out.track.orders`). This behavior effectively unpreserves all memory from KHO's perspective, regardless of whether those preservations were made by clients before the finalization attempt or by KHO itself during finalization. This premature unpreservation is incorrect.
An abort of the finalization process should only undo actions taken by KHO as part of that specific finalization attempt. Individual memory regions preserved by clients prior to finalization should remain preserved, as their lifecycle is managed by the clients themselves. These clients might still need to call kho_unpreserve_folio() or kho_unpreserve_pages() based on their own logic, even after a KHO finalization attempt is aborted. Signed-off-by: Pasha Tatashin --- kernel/kexec_handover.c | 21 +-------------------- 1 file changed, 1 insertion(+), 20 deletions(-) diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c index 26e035eb1314..61b31cfc75f2 100644 --- a/kernel/kexec_handover.c +++ b/kernel/kexec_handover.c @@ -1083,31 +1083,12 @@ EXPORT_SYMBOL_GPL(kho_restore_vmalloc); =20 static int __kho_abort(void) { - int err =3D 0; - unsigned long order; - struct kho_mem_phys *physxa; - - xa_for_each(&kho_out.track.orders, order, physxa) { - struct kho_mem_phys_bits *bits; - unsigned long phys; - - xa_for_each(&physxa->phys_bits, phys, bits) - kfree(bits); - - xa_destroy(&physxa->phys_bits); - kfree(physxa); - } - xa_destroy(&kho_out.track.orders); - if (kho_out.preserved_mem_map) { kho_mem_ser_free(kho_out.preserved_mem_map); kho_out.preserved_mem_map =3D NULL; } =20 - if (err) - pr_err("Failed to abort KHO finalization: %d\n", err); - - return err; + return 0; } =20 int kho_abort(void) --=20 2.51.0.536.g15c5d4f767-goog From nobody Sun Oct 5 07:24:32 2025 Received: from mail-qt1-f180.google.com (mail-qt1-f180.google.com [209.85.160.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EEE06208994 for ; Mon, 29 Sep 2025 01:03:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.180 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107823; cv=none;
b=ZKnYs5igEr+i0mWdovTai3W8koaWs0j5SSOZV0QR+RHR3RsycDA/CNiTx5zmgtC9xfNkQsWhRZl095Wi7K4YEtXFTGPTOycHls8dxW+k46523xoqHAxDgSQNiHBNwLolwzxT2CTyW4HTYqeBhHW8BW6PjYGS7lHvTueEHZl3FCc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107823; c=relaxed/simple; bh=uocVzTpuxi0ebUpykEeNIS1BP1SwChvze955nF6VwOw=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=D9xHqwQO1Xtg2t86Yfgv/KV7b/uPiN7Vj2CNMnQ39K0b7+ZqKLPKD2zNfgqc6fP/eD2mtKPntgoFlFIkGo/dIw4w5j5Yf19X4ufkVuV9/7BruFjwrM5XX1o56fv9y5OXpi8ILEA/Ps99YkoaQ0oCWHduMbHDWZzCxMYRfr9GW6Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=OF6k+XtG; arc=none smtp.client-ip=209.85.160.180 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="OF6k+XtG" Received: by mail-qt1-f180.google.com with SMTP id d75a77b69052e-4d9f7a34daaso35129271cf.0 for ; Sun, 28 Sep 2025 18:03:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1759107820; x=1759712620; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=T9ZBZaC3vdnAz+H8XWrjcI+m01YkVPNevX7SPwh/ro8=; b=OF6k+XtG4d9+ONrAotco/CqbBeafIkR6hMFyu9AxUjIw33t1/slbCETYek9/e6CKSL WDHlANdhT1RVEI2NqV7APpmQEAZi7CBuFZP0mUqdjOU5sQHQNCLL9IVTMch8isWZbRRH Jj6laX9sMFlMUm2G69DVaUJEr9AjxJnovMFpySlIkxilGNQdQD58qMwHH8Sq2N5nRhCe 8pAjR3KYnH0QdteTEpLo3BDz+deetpnYjv3KZyTeTg5wG4Y0xVr6jTzvQCJtNXPvnDNC arl6gFii5y9ghms5oY8wCsmpm7/6D3KotfrkUd5LOUu39spYG5VMVNkhVRwmw5aKC2+B 9+yg== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759107820; x=1759712620; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=T9ZBZaC3vdnAz+H8XWrjcI+m01YkVPNevX7SPwh/ro8=; b=LR/vzwbLquXJVEMtooMjYSXm2DPsmCHljXP+2YDh1mfWCxPk6H5AlzqiRQjSUC6Tc7 61G+llpGtPkUvOooOGgb/nORXtbSE2QDpTWeZuEoEMIEgvmEJa3FKlPwFXMuNA+CV35D WbMeBNBdEdTEFmgZojIt+JFTLVLRq+3WaXY95KI9mX3aabJKxpinOwoIsVyG3CYiaT1K /GOAOMMRJ/oHsLnJEElmWG85AFzpWV04uMxc0HHdMAXHqF9ey0c/OzAPNPj6ZxiZjzRO dzxj9tUyP00kEfQqxo8CWoe5arpBnrroHsxKa0WTXb0qm1WYhbCr2bwWUThVomoXRoqb M90g== X-Forwarded-Encrypted: i=1; AJvYcCWBSXaVUQX1xkEfMxHThv/Cg+acSfwjjrYIGT/dmZzeUdj19gCY52IGm/dRD2zdVrmG0/T+Zrn6nrHWEmw=@vger.kernel.org X-Gm-Message-State: AOJu0YyYPTrBPAee4Qi08EkGawlo6KUll0Y4Y6ewK747LaVzzCBabhCN gxZzAz3t5PIg0qCvJFCia+n/xz8Z+na122QrdQ6zElToLpCQbLbTdb96rBQHjSN+ANk= X-Gm-Gg: ASbGncuORSCkHmRX1y4RFpblS1/6Fomf1+UbgU3f9+r2YyPn9ZKEBgHz5wa6pjy+ge5 5qN59P1gFvaliIi5edWeQ7wwDniX8k/aMdWNm5cezBtxjoXrpObIAzJHt8I86CcntJ4Ls0fUVF9 aP08S/kUF9DDhHuW96jjLlxkEI3VfwxrldXQ9Ggtb9tC3JB6JraP1y++KUtPuDeaVKHHviVjsM6 wosPfvqV1x4+act3goOsmfME7XXVfJJMCnXWCWtK5hqT7EE5odcHeeJjPV+F9124HCUYKVGCe3H 410kpd7gvO0uIz8SzorK1h+CY6H/MC0Kfi9uDAxpa9SgdzwU/l+RiVFZ9EDDXYDNRx39nwUOx1S e4TLED9Rs387Dd3l/C1ZDglHU1gjBSZTnNznZCwAa7J2tZFhPZMX2MQKAqwApw4bPgTorysyveR djsscAWXgox/TTXLAXd/iMLLSyTOo4 X-Google-Smtp-Source: AGHT+IE10H11AdCpFcZo+kud9wUci9g/QjNXpF8kId7N/q2vHl0kUjCR57f1sJWtfpnjlo9cA3j/tg== X-Received: by 2002:a05:622a:2609:b0:4da:155a:76fc with SMTP id d75a77b69052e-4da4744f3b0mr213359021cf.16.1759107819934; Sun, 28 Sep 2025 18:03:39 -0700 (PDT) Received: from soleen.c.googlers.com.com (53.47.86.34.bc.googleusercontent.com. 
[34.86.47.53]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-4db0c0fbe63sm64561521cf.23.2025.09.28.18.03.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 28 Sep 2025 18:03:39 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com Subject: [PATCH v4 06/30] liveupdate: kho: move to kernel/liveupdate Date: Mon, 29 
Sep 2025 01:02:57 +0000 Message-ID: <20250929010321.3462457-7-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com> References: <20250929010321.3462457-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Move KHO to kernel/liveupdate/ in preparation for placing all Live Update core kernel related files in the same place. Signed-off-by: Pasha Tatashin Reviewed-by: Jason Gunthorpe --- Documentation/core-api/kho/concepts.rst | 2 +- MAINTAINERS | 2 +- init/Kconfig | 2 ++ kernel/Kconfig.kexec | 25 ---------------- kernel/Makefile | 3 +- kernel/liveupdate/Kconfig | 30 +++++++++++++++++++ kernel/liveupdate/Makefile | 4 +++ kernel/{ =3D> liveupdate}/kexec_handover.c | 6 ++-- .../{ =3D> liveupdate}/kexec_handover_debug.c | 0 .../kexec_handover_internal.h | 0 10 files changed, 42 insertions(+), 32 deletions(-) create mode 100644 kernel/liveupdate/Kconfig create mode 100644 kernel/liveupdate/Makefile rename kernel/{ =3D> liveupdate}/kexec_handover.c (99%) rename kernel/{ =3D> liveupdate}/kexec_handover_debug.c (100%) rename kernel/{ =3D> liveupdate}/kexec_handover_internal.h (100%) diff --git a/Documentation/core-api/kho/concepts.rst b/Documentation/core-a= pi/kho/concepts.rst index 36d5c05cfb30..d626d1dbd678 100644 --- a/Documentation/core-api/kho/concepts.rst +++ b/Documentation/core-api/kho/concepts.rst @@ -70,5 +70,5 @@ in the FDT. That state is called the KHO finalization pha= se. =20 Public API =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D -.. kernel-doc:: kernel/kexec_handover.c +..
kernel-doc:: kernel/liveupdate/kexec_handover.c :export: diff --git a/MAINTAINERS b/MAINTAINERS index a6cbcc7fb396..e5c800ed4819 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -13766,7 +13766,7 @@ S: Maintained F: Documentation/admin-guide/mm/kho.rst F: Documentation/core-api/kho/* F: include/linux/kexec_handover.h -F: kernel/kexec_handover* +F: kernel/liveupdate/kexec_handover* F: tools/testing/selftests/kho/ =20 KEYS-ENCRYPTED diff --git a/init/Kconfig b/init/Kconfig index d295e79d3547..9952c112154f 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -2138,6 +2138,8 @@ config TRACEPOINTS =20 source "kernel/Kconfig.kexec" =20 +source "kernel/liveupdate/Kconfig" + endmenu # General setup =20 source "arch/Kconfig" diff --git a/kernel/Kconfig.kexec b/kernel/Kconfig.kexec index e68156d8c72b..15632358bcf7 100644 --- a/kernel/Kconfig.kexec +++ b/kernel/Kconfig.kexec @@ -94,31 +94,6 @@ config KEXEC_JUMP Jump between original kernel and kexeced kernel and invoke code in physical address mode via KEXEC =20 -config KEXEC_HANDOVER - bool "kexec handover" - depends on ARCH_SUPPORTS_KEXEC_HANDOVER && ARCH_SUPPORTS_KEXEC_FILE - depends on !DEFERRED_STRUCT_PAGE_INIT - select MEMBLOCK_KHO_SCRATCH - select KEXEC_FILE - select DEBUG_FS - select LIBFDT - select CMA - help - Allow kexec to hand over state across kernels by generating and - passing additional metadata to the target kernel. This is useful - to keep data or state alive across the kexec. For this to work, - both source and target kernels need to have this option enabled. - -config KEXEC_HANDOVER_DEBUG - bool "kexec handover debug interface" - depends on KEXEC_HANDOVER - depends on DEBUG_FS - help - Allow to control kexec handover device tree via debugfs - interface, i.e. finalize the state or aborting the finalization. - Also, enables inspecting the KHO fdt trees with the debugfs binary - blobs. 
- config CRASH_DUMP bool "kernel crash dumps" default ARCH_DEFAULT_CRASH_DUMP diff --git a/kernel/Makefile b/kernel/Makefile index 9fe722305c9b..e83669841b8c 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -52,6 +52,7 @@ obj-y +=3D printk/ obj-y +=3D irq/ obj-y +=3D rcu/ obj-y +=3D livepatch/ +obj-y +=3D liveupdate/ obj-y +=3D dma/ obj-y +=3D entry/ obj-y +=3D unwind/ @@ -82,8 +83,6 @@ obj-$(CONFIG_CRASH_DUMP_KUNIT_TEST) +=3D crash_core_test.o obj-$(CONFIG_KEXEC) +=3D kexec.o obj-$(CONFIG_KEXEC_FILE) +=3D kexec_file.o obj-$(CONFIG_KEXEC_ELF) +=3D kexec_elf.o -obj-$(CONFIG_KEXEC_HANDOVER) +=3D kexec_handover.o -obj-$(CONFIG_KEXEC_HANDOVER_DEBUG) +=3D kexec_handover_debug.o obj-$(CONFIG_BACKTRACE_SELF_TEST) +=3D backtracetest.o obj-$(CONFIG_COMPAT) +=3D compat.o obj-$(CONFIG_CGROUPS) +=3D cgroup/ diff --git a/kernel/liveupdate/Kconfig b/kernel/liveupdate/Kconfig new file mode 100644 index 000000000000..eebe564b385d --- /dev/null +++ b/kernel/liveupdate/Kconfig @@ -0,0 +1,30 @@ +# SPDX-License-Identifier: GPL-2.0-only + +menu "Live Update" + +config KEXEC_HANDOVER + bool "kexec handover" + depends on ARCH_SUPPORTS_KEXEC_HANDOVER && ARCH_SUPPORTS_KEXEC_FILE + depends on !DEFERRED_STRUCT_PAGE_INIT + select MEMBLOCK_KHO_SCRATCH + select KEXEC_FILE + select DEBUG_FS + select LIBFDT + select CMA + help + Allow kexec to hand over state across kernels by generating and + passing additional metadata to the target kernel. This is useful + to keep data or state alive across the kexec. For this to work, + both source and target kernels need to have this option enabled. + +config KEXEC_HANDOVER_DEBUG + bool "kexec handover debug interface" + depends on KEXEC_HANDOVER + depends on DEBUG_FS + help + Allow to control kexec handover device tree via debugfs + interface, i.e. finalize the state or aborting the finalization. + Also, enables inspecting the KHO fdt trees with the debugfs binary + blobs. 
+ +endmenu diff --git a/kernel/liveupdate/Makefile b/kernel/liveupdate/Makefile new file mode 100644 index 000000000000..67c7f71b33fa --- /dev/null +++ b/kernel/liveupdate/Makefile @@ -0,0 +1,4 @@ +# SPDX-License-Identifier: GPL-2.0 + +obj-$(CONFIG_KEXEC_HANDOVER) +=3D kexec_handover.o +obj-$(CONFIG_KEXEC_HANDOVER_DEBUG) +=3D kexec_handover_debug.o diff --git a/kernel/kexec_handover.c b/kernel/liveupdate/kexec_handover.c similarity index 99% rename from kernel/kexec_handover.c rename to kernel/liveupdate/kexec_handover.c index 61b31cfc75f2..71e98c44cf47 100644 --- a/kernel/kexec_handover.c +++ b/kernel/liveupdate/kexec_handover.c @@ -24,8 +24,8 @@ * KHO is tightly coupled with mm init and needs access to some of mm * internal APIs. */ -#include "../mm/internal.h" -#include "kexec_internal.h" +#include "../../mm/internal.h" +#include "../kexec_internal.h" #include "kexec_handover_internal.h" =20 #define KHO_FDT_COMPATIBLE "kho-v1" @@ -1129,7 +1129,7 @@ static int __kho_finalize(void) err |=3D fdt_finish_reservemap(root); err |=3D fdt_begin_node(root, ""); err |=3D fdt_property_string(root, "compatible", KHO_FDT_COMPATIBLE); - /** + /* * Reserve the preserved-memory-map property in the root FDT, so * that all property definitions will precede subnodes created by * KHO callers. 
diff --git a/kernel/kexec_handover_debug.c b/kernel/liveupdate/kexec_handov= er_debug.c similarity index 100% rename from kernel/kexec_handover_debug.c rename to kernel/liveupdate/kexec_handover_debug.c diff --git a/kernel/kexec_handover_internal.h b/kernel/liveupdate/kexec_han= dover_internal.h similarity index 100% rename from kernel/kexec_handover_internal.h rename to kernel/liveupdate/kexec_handover_internal.h --=20 2.51.0.536.g15c5d4f767-goog From nobody Sun Oct 5 07:24:32 2025 Received: from mail-qt1-f169.google.com (mail-qt1-f169.google.com [209.85.160.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C989F207A18 for ; Mon, 29 Sep 2025 01:03:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.169 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107827; cv=none; b=SfeFVsPRPOxGwG+YRlJ6UQf/AmHaeYzOo08lWczjmI/qixCcqJ2BgQo/zxgt6gxtgSKDTGeoWtBs8c8hFXJXUBZAMh3srANRJHCYh+sEjcZKgixJTp9hFOf4Y2WzNzqltisIBSVD6hWAEoqOqUUDgYyV6a95GV2EPVOiwdT8H/0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107827; c=relaxed/simple; bh=Kbh3Jvhz7McGkdnwd5ALkrlgP6XpRgJChHjdyyrukpg=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=m9N31o+KnBPIE5jBMJE9vgrG04LWijZmGmokTX6134F5gEnCLWct52SW5tsTARUPAoQ5GIwcx/IfwzYHI5nXzRHGyo2BM7u5PVlUYlAhNCY+OjGuwKhy85htgH+9OAFjpNBOa1Lg4i2l2CP21zur8F66mrZiDSpizmUvKEPzVNg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=bMUk3mqi; arc=none smtp.client-ip=209.85.160.169 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass 
smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="bMUk3mqi" Received: by mail-qt1-f169.google.com with SMTP id d75a77b69052e-4d9bcc2368eso44131671cf.2 for ; Sun, 28 Sep 2025 18:03:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1759107821; x=1759712621; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=2/KrKJyeC51poRw8B7dmjOqz1AtVPEr+fT1b5tJ+R+A=; b=bMUk3mqiK8XLlVCG2e+da62hQa1MatnfIfBYDujA9T6YLF/RMApyse4ab0Ph9mfr1u KkAXW0tvmBPcM4phSMxlSr64onIfCrhNWv9TtENEylJcAh/ttpHifjCRqG41jW660gPA V9LuIzWNymUwRunlPxuyHZxWEtilmCAE88PSLStbq6qba/yhUuaaIKFxLsYXfZUOhWJW CUnkAjbCFjnrNjlSId2qliZC+RG/YzZ/MskOJ7lwnOZpeIcBAXqSgQqthmXC1UTT6Nts pSgNI3bG12qc/chMA3wVbN7Og13GX+M0SB3oe+TO28drjtFv7oiFF/P7ekutqe2Xz6Gr vPyg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759107821; x=1759712621; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=2/KrKJyeC51poRw8B7dmjOqz1AtVPEr+fT1b5tJ+R+A=; b=C44s0rsLwyi7ZViy8Eq5vPvh3EwNKncj2NNpq6ttLanDGawAKGl2NkWpe0x6rytoQf cXCl/8PRQgLfXGBZLuAdX8ay8zmnM9nuLmvDRmtizBysnNajv3sJROreDW8/byholGSD ZT9O954x2POpy4ky3qese1kvT7li4fkQPKHdG577dpqDHAHv97dAeprFerEc4D1eiyEK Tx+G+j6r+mWojOB7RanLZLK7RnXyNaKirEmuIXtnnP8H+hfZtjP8a8CTzR+0YNQYzu53 TlKa7AoytYl3FSVH5trF3VbF+l9liG7J+gW5e8d1aZZBtuUlIG1g61PLhCTbDjWSxurg 3bOQ== X-Forwarded-Encrypted: i=1; AJvYcCX7JejepAB3F11QkP2/oWZRKjjpUvg72lA8GTP8c5wfwMrbg5mNm7Z8C6LbrmM9xaSE50JJPY3KxBMhH90=@vger.kernel.org X-Gm-Message-State: AOJu0YyS8gQhkfzT8b34PrbFmzsQedjnMMv/K9RRLgQhXIJf1J7pz7Mb SgSeB1So22XOQ+7sUrH5ps2qXCWDpoSt6s5VcNhB7ghKxrJz/eDv0XQ5K/qf3fBgiGs= X-Gm-Gg: 
ASbGnctlARb7/iQA0SJpZUqjwt5aG57A33Qkexs3l8L5TrHNi07IeMaFS4hArcThjQ/ Jd4IdVxJtoQgrKWLHWnAIU6qi3BHucW0NOQye5VkdCfTDdza+JxaCM24f0/TqTZq8TTVUb3UgDF M66W7EMysYhAFte+XxoGdbT6X9oAZ3z+a78kx6Q0g5wph9LZySFwbdqDLad1qsRnZOf2YVyBSCw aNuF0vA5ko8ll2BXX3BIGKirnnh4h7MBIkM8rv7TJbe0jRJQzlHHlP0S/taXRSQdy9Dhy76H244 duTpiaN24b8NVC087Q8R7lXaedNJ6Kutc1M2i2cvuI2zD+MyU4SQGKI+YBsb6ZqvL9apArTjzDa KA9djSoYLJch4NIdL6Vr0sjpvQOm4pyRhBoAttdTMbFrMpN3woC5AvpTMpNJ3K4DM6FW1McEdXt QtJ3FkfgSm+GGM6z9qIQ== X-Google-Smtp-Source: AGHT+IGFZwesMETmsyB8kVw5N3ny/xuT6AceRfCWnXxNytcydQFaT1YdGvc3ICpEpNLLBRz0AbG3bA== X-Received: by 2002:a05:622a:a708:b0:4de:45ff:1de with SMTP id d75a77b69052e-4de45ff259bmr91581871cf.21.1759107821282; Sun, 28 Sep 2025 18:03:41 -0700 (PDT) Received: from soleen.c.googlers.com.com (53.47.86.34.bc.googleusercontent.com. [34.86.47.53]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-4db0c0fbe63sm64561521cf.23.2025.09.28.18.03.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 28 Sep 2025 18:03:40 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, 
hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com Subject: [PATCH v4 07/30] liveupdate: luo_core: luo_ioctl: Live Update Orchestrator Date: Mon, 29 Sep 2025 01:02:58 +0000 Message-ID: <20250929010321.3462457-8-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com> References: <20250929010321.3462457-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Introduce the Live Update Orchestrator (LUO), a mechanism intended to facilitate kernel updates while keeping designated devices operational across the transition (e.g., via kexec). The primary use case is updating hypervisors with minimal disruption to running virtual machines. For the userspace side of a hypervisor update, copyless migration already exists; LUO covers updating the kernel. This initial patch lays the groundwork for the LUO subsystem. Further functionality, including the implementation of state transition logic, integration with KHO, and hooks for subsystems and file descriptors, will be added in subsequent patches. Create a character device at /dev/liveupdate. 
A new uAPI header, include/uapi/linux/liveupdate.h, will define the necessary structures. The ioctl magic number is registered in Documentation/userspace-api/ioctl/ioctl-number.rst. Signed-off-by: Pasha Tatashin --- .../userspace-api/ioctl/ioctl-number.rst | 2 + include/linux/liveupdate.h | 64 ++++ include/uapi/linux/liveupdate.h | 94 ++++++ kernel/liveupdate/Kconfig | 27 ++ kernel/liveupdate/Makefile | 6 + kernel/liveupdate/luo_core.c | 297 ++++++++++++++++++ kernel/liveupdate/luo_internal.h | 22 ++ kernel/liveupdate/luo_ioctl.c | 54 ++++ 8 files changed, 566 insertions(+) create mode 100644 include/linux/liveupdate.h create mode 100644 include/uapi/linux/liveupdate.h create mode 100644 kernel/liveupdate/luo_core.c create mode 100644 kernel/liveupdate/luo_internal.h create mode 100644 kernel/liveupdate/luo_ioctl.c diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst index 7c527a01d1cf..7232b3544cec 100644 --- a/Documentation/userspace-api/ioctl/ioctl-number.rst +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst @@ -385,6 +385,8 @@ Code Seq# Include File Comments 0xB8 01-02 uapi/misc/mrvl_cn10k_dpi.h Marvell CN10K DPI driver 0xB8 all uapi/linux/mshv.h Microsoft Hyper-V /dev/mshv driver +0xBA 00-0F uapi/linux/liveupdate.h Pasha Tatashin + 0xC0 00-0F linux/usb/iowarrior.h 0xCA 00-0F uapi/misc/cxl.h Dead since 6.15 0xCA 10-2F uapi/misc/ocxl.h diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h new file mode 100644 index 000000000000..85a6828c95b0 --- /dev/null +++ b/include/linux/liveupdate.h @@ -0,0 +1,64 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +/* + * Copyright (c) 2025, Google LLC. 
+ * Pasha Tatashin + */ +#ifndef _LINUX_LIVEUPDATE_H +#define _LINUX_LIVEUPDATE_H + +#include +#include +#include +#include + +#ifdef CONFIG_LIVEUPDATE + +/* Return true if live update orchestrator is enabled */ +bool liveupdate_enabled(void); + +/* Called during reboot to tell participants to complete serialization */ +int liveupdate_reboot(void); + +/* + * Return true if machine is in updated state (i.e. live update boot in + * progress) + */ +bool liveupdate_state_updated(void); + +/* + * Return true if machine is in normal state (i.e. no live update in progress). + */ +bool liveupdate_state_normal(void); + +enum liveupdate_state liveupdate_get_state(void); + +#else /* CONFIG_LIVEUPDATE */ + +static inline int liveupdate_reboot(void) +{ + return 0; +} + +static inline bool liveupdate_enabled(void) +{ + return false; +} + +static inline bool liveupdate_state_updated(void) +{ + return false; +} + +static inline bool liveupdate_state_normal(void) +{ + return true; +} + +static inline enum liveupdate_state liveupdate_get_state(void) +{ + return LIVEUPDATE_STATE_NORMAL; +} + +#endif /* CONFIG_LIVEUPDATE */ +#endif /* _LINUX_LIVEUPDATE_H */ diff --git a/include/uapi/linux/liveupdate.h b/include/uapi/linux/liveupdate.h new file mode 100644 index 000000000000..3cb09b2c4353 --- /dev/null +++ b/include/uapi/linux/liveupdate.h @@ -0,0 +1,94 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ + +/* + * Userspace interface for /dev/liveupdate + * Live Update Orchestrator + * + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +#ifndef _UAPI_LIVEUPDATE_H +#define _UAPI_LIVEUPDATE_H + +#include +#include + +/** + * enum liveupdate_state - Defines the possible states of the live update + * orchestrator. + * @LIVEUPDATE_STATE_UNDEFINED: State has not yet been initialized. + * @LIVEUPDATE_STATE_NORMAL: Default state, no live update in progress. 
+ * @LIVEUPDATE_STATE_PREPARED: Live update is prepared for reboot; the + * LIVEUPDATE_PREPARE callbacks have completed + * successfully. + * Devices might operate in a limited state; + * for example, the participating devices might + * not be allowed to unbind, and the + * setting up of new DMA mappings might be + * disabled in this state. + * @LIVEUPDATE_STATE_FROZEN: The final reboot event + * (%LIVEUPDATE_FREEZE) has been sent, and the + * system is performing its final state saving + * within the "blackout window". User + * workloads must be suspended. The actual + * reboot (kexec) into the next kernel is + * imminent. + * @LIVEUPDATE_STATE_UPDATED: The system has rebooted into the next + * kernel via live update; the system is now + * running the next kernel, awaiting the + * finish event. + * + * These states track the progress and outcome of a live update operation. + */ +enum liveupdate_state { + LIVEUPDATE_STATE_UNDEFINED = 0, + LIVEUPDATE_STATE_NORMAL = 1, + LIVEUPDATE_STATE_PREPARED = 2, + LIVEUPDATE_STATE_FROZEN = 3, + LIVEUPDATE_STATE_UPDATED = 4, +}; + +/** + * enum liveupdate_event - Events that trigger live update callbacks. + * @LIVEUPDATE_PREPARE: PREPARE should happen *before* the blackout window. + * Subsystems should prepare for an upcoming reboot by + * serializing their states. However, it must be considered + * that user applications, e.g. virtual machines, are still + * running during this phase. + * @LIVEUPDATE_FREEZE: FREEZE is sent from the reboot() syscall, when the current + * kernel is on its way out. This is the final opportunity + * for subsystems to save any state that must persist + * across the reboot. Callbacks for this event should be as + * fast as possible since they are on the critical path of + * rebooting into the next kernel. + * @LIVEUPDATE_FINISH: FINISH is sent in the newly booted kernel after a + * successful live update and normally *after* the blackout + * window. 
Subsystems should perform any final cleanup + * during this phase. This phase also provides an + * opportunity to clean up devices that were preserved but + * never explicitly reclaimed during the live update + * process. State restoration should have already occurred + * before this event. Callbacks for this event must not + * fail. The completion of this call transitions the + * machine from the ``updated`` to the ``normal`` state. + * @LIVEUPDATE_CANCEL: CANCEL the live update and go back to normal state. This + * event is user initiated, or is done automatically when + * the LIVEUPDATE_PREPARE or LIVEUPDATE_FREEZE stage fails. + * Subsystems should revert any actions taken during the + * corresponding prepare event. Callbacks for this event + * must not fail. + * + * These events represent the different stages and actions within the live + * update process that subsystems (like device drivers and bus drivers) + * need to be aware of to correctly serialize and restore their state. + * + */ +enum liveupdate_event { + LIVEUPDATE_PREPARE = 0, + LIVEUPDATE_FREEZE = 1, + LIVEUPDATE_FINISH = 2, + LIVEUPDATE_CANCEL = 3, +}; + +#endif /* _UAPI_LIVEUPDATE_H */ diff --git a/kernel/liveupdate/Kconfig b/kernel/liveupdate/Kconfig index eebe564b385d..f6b0bde188d9 100644 --- a/kernel/liveupdate/Kconfig +++ b/kernel/liveupdate/Kconfig @@ -1,7 +1,34 @@ # SPDX-License-Identifier: GPL-2.0-only +# +# Copyright (c) 2025, Google LLC. +# Pasha Tatashin +# +# Live Update Orchestrator +# menu "Live Update" +config LIVEUPDATE + bool "Live Update Orchestrator" + depends on KEXEC_HANDOVER + help + Enable the Live Update Orchestrator. Live Update is a mechanism, + typically based on kexec, that allows the kernel to be updated + while keeping selected devices operational across the transition. + These devices are intended to be reclaimed by the new kernel and + re-attached to their original workload without requiring a device + reset. 
+ + The ability to hand over a device from the current kernel to the next one depends + on specific support within device drivers and related kernel + subsystems. + + This feature primarily targets virtual machine hosts to quickly update + the kernel hypervisor with minimal disruption to the running virtual + machines. + + If unsure, say N. + config KEXEC_HANDOVER bool "kexec handover" depends on ARCH_SUPPORTS_KEXEC_HANDOVER && ARCH_SUPPORTS_KEXEC_FILE diff --git a/kernel/liveupdate/Makefile b/kernel/liveupdate/Makefile index 67c7f71b33fa..d90cc3b4bf7b 100644 --- a/kernel/liveupdate/Makefile +++ b/kernel/liveupdate/Makefile @@ -1,4 +1,10 @@ # SPDX-License-Identifier: GPL-2.0 +luo-y := \ + luo_core.o \ + luo_ioctl.o + obj-$(CONFIG_KEXEC_HANDOVER) += kexec_handover.o obj-$(CONFIG_KEXEC_HANDOVER_DEBUG) += kexec_handover_debug.o + +obj-$(CONFIG_LIVEUPDATE) += luo.o diff --git a/kernel/liveupdate/luo_core.c b/kernel/liveupdate/luo_core.c new file mode 100644 index 000000000000..954d533bd8c4 --- /dev/null +++ b/kernel/liveupdate/luo_core.c @@ -0,0 +1,297 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +/** + * DOC: Live Update Orchestrator (LUO) + * + * Live Update is a specialized, kexec-based reboot process that allows a + * running kernel to be updated from one version to another while preserving + * the state of selected resources and keeping designated hardware devices + * operational. For these devices, DMA activity may continue throughout the + * kernel transition. + * + * While the primary use case driving this work is supporting live updates of + * the Linux kernel when it is used as a hypervisor in cloud environments, the + * LUO framework itself is designed to be workload-agnostic. Much like Kernel + * Live Patching, which applies security fixes regardless of the workload, + * Live Update facilitates a full kernel version upgrade for any type of system. 
+ * + * For example, a non-hypervisor system running an in-memory cache like + * memcached with many gigabytes of data can use LUO. The userspace service + * can place its cache into a memfd, have its state preserved by LUO, and + * restore it immediately after the kernel kexec. + * + * Whether the system is running virtual machines, containers, a + * high-performance database, or networking services, LUO's primary goal is to + * enable a full kernel update by preserving critical userspace state and + * keeping essential devices operational. + * + * The core of LUO is a state machine that tracks the progress of a live update, + * along with a callback API that allows other kernel subsystems to participate + * in the process. Example subsystems that can hook into LUO include: kvm, + * iommu, interrupts, vfio, participating filesystems, and memory management. + * + * LUO uses Kexec Handover to transfer memory state from the current kernel to + * the next kernel. For more details see + * Documentation/core-api/kho/concepts.rst. + * + * The LUO state machine ensures that operations are performed in the correct + * sequence and provides a mechanism to track and recover from potential + * failures. 
+ */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include +#include +#include +#include +#include +#include "luo_internal.h" + +DECLARE_RWSEM(luo_state_rwsem); + +static enum liveupdate_state luo_state = LIVEUPDATE_STATE_UNDEFINED; + +static const char *const luo_state_str[] = { + [LIVEUPDATE_STATE_UNDEFINED] = "undefined", + [LIVEUPDATE_STATE_NORMAL] = "normal", + [LIVEUPDATE_STATE_PREPARED] = "prepared", + [LIVEUPDATE_STATE_FROZEN] = "frozen", + [LIVEUPDATE_STATE_UPDATED] = "updated", +}; + +static bool luo_enabled; + +static int __init early_liveupdate_param(char *buf) +{ + return kstrtobool(buf, &luo_enabled); +} +early_param("liveupdate", early_liveupdate_param); + +/* Return true if the current state is equal to the provided state */ +static inline bool is_current_luo_state(enum liveupdate_state expected_state) +{ + return liveupdate_get_state() == expected_state; +} + +static void __luo_set_state(enum liveupdate_state state) +{ + WRITE_ONCE(luo_state, state); +} + +static inline void luo_set_state(enum liveupdate_state state) +{ + pr_info("Switched from [%s] to [%s] state\n", + luo_current_state_str(), luo_state_str[state]); + __luo_set_state(state); +} + +static int luo_do_freeze_calls(void) +{ + return 0; +} + +static void luo_do_finish_calls(void) +{ +} + +/* Get the current state as a string */ +const char *luo_current_state_str(void) +{ + return luo_state_str[liveupdate_get_state()]; +} + +enum liveupdate_state liveupdate_get_state(void) +{ + return READ_ONCE(luo_state); +} + +int luo_prepare(void) +{ + return 0; +} + +/** + * luo_freeze() - Initiate the final freeze notification phase for live update. + * + * Attempts to transition the live update orchestrator state from + * %LIVEUPDATE_STATE_PREPARED to %LIVEUPDATE_STATE_FROZEN. 
This function is + * typically called just before the actual reboot system call (e.g., kexec) + * is invoked, either directly by the orchestration tool or potentially from + * within the reboot syscall path itself. + * + * @return 0 on success, negative error code otherwise. State is reverted to + * %LIVEUPDATE_STATE_NORMAL in case of an error during callbacks, and everything + * is canceled via cancel notification. + */ +int luo_freeze(void) +{ + int ret; + + if (down_write_killable(&luo_state_rwsem)) { + pr_warn("[freeze] event canceled by user\n"); + return -EAGAIN; + } + + if (!is_current_luo_state(LIVEUPDATE_STATE_PREPARED)) { + pr_warn("Can't switch to [%s] from [%s] state\n", + luo_state_str[LIVEUPDATE_STATE_FROZEN], + luo_current_state_str()); + up_write(&luo_state_rwsem); + + return -EINVAL; + } + + ret = luo_do_freeze_calls(); + if (!ret) + luo_set_state(LIVEUPDATE_STATE_FROZEN); + else + luo_set_state(LIVEUPDATE_STATE_NORMAL); + + up_write(&luo_state_rwsem); + + return ret; +} + +/** + * luo_finish - Finalize the live update process in the new kernel. + * + * This function is called after a successful live update reboot into a new + * kernel, once the new kernel is ready to transition to the normal operational + * state. It signals the completion of the live update sequence to subsystems. + * + * @return 0 on success, ``-EAGAIN`` if the state change was cancelled by the + * user while waiting for the lock, or ``-EINVAL`` if the orchestrator is not in + * the updated state. 
+ */ +int luo_finish(void) +{ + if (down_write_killable(&luo_state_rwsem)) { + pr_warn("[finish] event canceled by user\n"); + return -EAGAIN; + } + + if (!is_current_luo_state(LIVEUPDATE_STATE_UPDATED)) { + pr_warn("Can't switch to [%s] from [%s] state\n", + luo_state_str[LIVEUPDATE_STATE_NORMAL], + luo_current_state_str()); + up_write(&luo_state_rwsem); + + return -EINVAL; + } + + luo_do_finish_calls(); + luo_set_state(LIVEUPDATE_STATE_NORMAL); + + up_write(&luo_state_rwsem); + + return 0; +} + +int luo_cancel(void) +{ + return 0; +} + +void luo_state_read_enter(void) +{ + down_read(&luo_state_rwsem); +} + +void luo_state_read_exit(void) +{ + up_read(&luo_state_rwsem); +} + +static int __init luo_startup(void) +{ + __luo_set_state(LIVEUPDATE_STATE_NORMAL); + + return 0; +} +early_initcall(luo_startup); + +/* Public Functions */ + +/** + * liveupdate_reboot() - Kernel reboot notifier for live update final + * serialization. + * + * This function is invoked directly from the reboot() syscall pathway if a + * reboot is initiated while the live update state is %LIVEUPDATE_STATE_PREPARED + * (i.e., if the user did not explicitly trigger the frozen state). It handles + * the implicit transition into the final frozen state. + * + * It triggers the %LIVEUPDATE_FREEZE event callbacks for participating + * subsystems. These callbacks must perform final state saving very quickly as + * they execute during the blackout period just before kexec. + * + * If any %LIVEUPDATE_FREEZE callback fails, this function triggers the + * %LIVEUPDATE_CANCEL event for all participants to revert their state, aborts + * the live update, and returns an error. + */ +int liveupdate_reboot(void) +{ + if (!is_current_luo_state(LIVEUPDATE_STATE_PREPARED)) + return 0; + + return luo_freeze(); +} + +/** + * liveupdate_state_updated - Check if the system is in the live update + * 'updated' state. 
+ * + * This function checks if the live update orchestrator is in the + * ``LIVEUPDATE_STATE_UPDATED`` state. This state indicates that the system has + * successfully rebooted into a new kernel as part of a live update, and the + * preserved devices are expected to be in the process of being reclaimed. + * + * This is typically used by subsystems during early boot of the new kernel + * to determine if they need to attempt to restore state from a previous + * live update. + * + * @return true if the system is in the ``LIVEUPDATE_STATE_UPDATED`` state, + * false otherwise. + */ +bool liveupdate_state_updated(void) +{ + return is_current_luo_state(LIVEUPDATE_STATE_UPDATED); +} + +/** + * liveupdate_state_normal - Check if the system is in the live update 'normal' + * state. + * + * This function checks if the live update orchestrator is in the + * ``LIVEUPDATE_STATE_NORMAL`` state. This state indicates that no live update + * is in progress. It represents the default operational state of the system. + * + * This can be used to gate actions that should only be performed when no + * live update activity is occurring. + * + * @return true if the system is in the ``LIVEUPDATE_STATE_NORMAL`` state, + * false otherwise. + */ +bool liveupdate_state_normal(void) +{ + return is_current_luo_state(LIVEUPDATE_STATE_NORMAL); +} + +/** + * liveupdate_enabled - Check if the live update feature is enabled. + * + * This function returns the state of the live update feature flag, which + * can be controlled via the ``liveupdate`` kernel command-line parameter. + * + * @return true if live update is enabled, false otherwise. + */ +bool liveupdate_enabled(void) +{ + return luo_enabled; +} diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_internal.h new file mode 100644 index 000000000000..2e0861781673 --- /dev/null +++ b/kernel/liveupdate/luo_internal.h @@ -0,0 +1,22 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +/* + * Copyright (c) 2025, Google LLC. 
+ * Pasha Tatashin + */ + +#ifndef _LINUX_LUO_INTERNAL_H +#define _LINUX_LUO_INTERNAL_H + +int luo_cancel(void); +int luo_prepare(void); +int luo_freeze(void); +int luo_finish(void); + +void luo_state_read_enter(void); +void luo_state_read_exit(void); +extern struct rw_semaphore luo_state_rwsem; + +const char *luo_current_state_str(void); + +#endif /* _LINUX_LUO_INTERNAL_H */ diff --git a/kernel/liveupdate/luo_ioctl.c b/kernel/liveupdate/luo_ioctl.c new file mode 100644 index 000000000000..fc2afb450ad5 --- /dev/null +++ b/kernel/liveupdate/luo_ioctl.c @@ -0,0 +1,54 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "luo_internal.h" + +struct luo_device_state { + struct miscdevice miscdev; +}; + +static const struct file_operations luo_fops = { + .owner = THIS_MODULE, +}; + +static struct luo_device_state luo_dev = { + .miscdev = { + .minor = MISC_DYNAMIC_MINOR, + .name = "liveupdate", + .fops = &luo_fops, + }, +}; + +static int __init liveupdate_init(void) +{ + if (!liveupdate_enabled()) + return 0; + + return misc_register(&luo_dev.miscdev); +} +module_init(liveupdate_init); + +static void __exit liveupdate_exit(void) +{ + misc_deregister(&luo_dev.miscdev); +} +module_exit(liveupdate_exit); + +MODULE_LICENSE("GPL"); +MODULE_AUTHOR("Pasha Tatashin"); +MODULE_DESCRIPTION("Live Update Orchestrator"); +MODULE_VERSION("0.1"); -- 2.51.0.536.g15c5d4f767-goog From nobody Sun Oct 5 07:24:32 2025 Received: from mail-qt1-f176.google.com (mail-qt1-f176.google.com [209.85.160.176]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 62480220F2C for ; Mon, 29 Sep 2025 01:03:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.176 ARC-Seal: i=1; 
a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107827; cv=none; b=Q2ep4OIIsVMD5dK97iY8VPHWl27xWqbEx/n7SRuTmLmAwwJGqoE77SVRud/Ha6Gb+5+6sPyBKsOrSpQ+gTH+p+7WMKHbyQ1eMzhgZJowIgAGV0PR9YBgad49OOV2O2pNNXAIERqTE9HI2DqrqKG3IqnYgY8JEVHX6aw5cc/hn3o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107827; c=relaxed/simple; bh=+sjjN122RdXtJfHWbdGdwjJQhTnn9dy9doHBJONF8gc=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=QTZph0QeQ4Ltfc52juiKNyvEZX4XkFdm5PmANFgY6wivkYTp1JCu0tcR7k7r6UOq9SRSzTExPP0fDCTiGmDXq4FBRRa2N/yOqcsmy6fOBFzuCFPJYi4ybqb6vWiy7aJcRQQgjf7nBHyJ4IPNsiMIpitVZTjlukLvN9ivKRsy7oI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=XCOZm/G+; arc=none smtp.client-ip=209.85.160.176 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="XCOZm/G+" Received: by mail-qt1-f176.google.com with SMTP id d75a77b69052e-4df4d23fb59so16769541cf.1 for ; Sun, 28 Sep 2025 18:03:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1759107823; x=1759712623; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=2J5W7RP3mcD6qfnmKA/9lO7swWy58O6Kyen5gVTP9fM=; b=XCOZm/G+U5cS7RMtM3Xq7GpL3MgxI77b+WNiH7vNr91NQTQn3UE7d97TjHREsKsI5K Auu1xWipP09dcV7xNt1R57lWrTWBKHPCqmALm+2Xk+3hBMWfTRhXQYWtAclN9pKo9dGn 5P/km+i+yKJ6wparvtUeQiAjVa0PjpFlaH74CLLXmz5/ym5KGfvMMnRGwdJBiliL+TzA rD/+iyDKZ83NI/QXZ/RQHa/b/53D2sOenIu2GlWqif5gxEQcsto/ojl3lI3HL6MdfBC9 
ZxDQ4thZY6Igafv8PHgWlUfQUHrqSXVqVWbyitfSiWyj5F1WPiEgYhI/zhqcuGoBiZUY JZSg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759107823; x=1759712623; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=2J5W7RP3mcD6qfnmKA/9lO7swWy58O6Kyen5gVTP9fM=; b=DTrvlyD7fdX96pOgpsMkCqUO28vW8aQNGtc4NPeh0/dr+k4Uf2jGSaKWXiqIFuMdA2 oul56kxq81kopjVOKwDbe7yotMNfvT4DS3UrdnnPG7YsYmUZXDqi7hz5mZhVDBk/zfH2 VwwdQn0jW3Qbty9CaiTTFmcdwp51AkKuvNblWCRh+Zj8xzIjLfECCUK4vQukpr+MV2jL opiul8KT0EjLLMN3WjuXaQhuJ38efpVUfBkY4lnDDnOBZEBWRaxf3NVQiItJq1RhcCMG JLO4HHIewg9FNbVsTR4dh1aOCxRGMLEoOfAsOvtSVdl4QzNhklgANoLwvmhYM3DBh7en xnOA== X-Forwarded-Encrypted: i=1; AJvYcCWvAoLi8yGB+pEtGAmFY7oyMlMnKxUBajECqggYwCQBgdekFjdIo829z2Augpvsd+RYl9wj59xF8w2EIio=@vger.kernel.org X-Gm-Message-State: AOJu0Ywh/zMADi8ZhdAxU6+7SoMyLLcP1DynugOTh3A97BMRZvK8MbU5 Gomi3SB37iFaOWVBw8+1rQt2SgR+JIJ/IZaXbzgLk05lv4lqJiaZjzFlYqv29iKJoD8= X-Gm-Gg: ASbGncu9yeIQzJr9iv9aW8cizx+OA14Ri8+NXMn6sdgRSJYro+g9RrbQFRVqUAM7IAh 96SnrB8tcMjonZjOMi/HNK4FBtP3DL5yshFZmPoRhhz1qcC8QDwtuqlizel5wHGEQT8DMlFZxt6 Vq635lxJUNw30hEZcnziQR3GD0rwmHNViFN4HDGpfEtKMgqdeaEfNunlAEptdAX4pDila8NH+N5 cxYHcg0s0xdHBd+2Ryj3/iESxeItzs8pot7sarvtKyssneCbc58gA7O0p/7l5SZNjpk11R/g9U0 3EaNZnivK0deVTKd6AW1CdJwPLb2LBvZrdyGLnZZjto8LULZ/2Met4RvUi7vx/AdBaU8ZJv6DBo JXfXUUZmxUwXBskWKNvG7DpnKWdwNtBQ7CFwP9CXAHEbDw5KOr7qRu6xYeyErjqePqzFdpNWpGy WgAEmpFaxavKzo2gUJ0g== X-Google-Smtp-Source: AGHT+IGaV3SESLMA5+DIYYmSBm152vC0KVkg9BHG7ku4t1EPFpwQC7mmZrH2IaNVGBpmOmq3lM96pA== X-Received: by 2002:ac8:5f84:0:b0:4b4:8e38:8f96 with SMTP id d75a77b69052e-4da4d8e3cbamr195662391cf.83.1759107822880; Sun, 28 Sep 2025 18:03:42 -0700 (PDT) Received: from soleen.c.googlers.com.com (53.47.86.34.bc.googleusercontent.com. 
[34.86.47.53]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-4db0c0fbe63sm64561521cf.23.2025.09.28.18.03.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 28 Sep 2025 18:03:42 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com Subject: [PATCH v4 08/30] liveupdate: luo_core: integrate with KHO Date: Mon, 29 
Sep 2025 01:02:59 +0000 Message-ID: <20250929010321.3462457-9-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com> References: <20250929010321.3462457-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Integrate LUO with the KHO framework to enable passing LUO state across a kexec reboot. When LUO transitions to the "prepared" state, it tells KHO to finalize, so that all memory segments that were added to the KHO preservation list are preserved. Once LUO is in the "prepared" state, no new segments can be preserved. If LUO is canceled, it also tells KHO to cancel the serialization, so that LUO can later enter the prepared state again. This patch introduces the following changes: - During the KHO finalization phase, allocate an FDT blob. - Populate this FDT with a LUO compatibility string ("luo-v1"). LUO now depends on `CONFIG_KEXEC_HANDOVER`. The core state transition logic (`luo_do_*_calls`) remains unimplemented in this patch. 
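For illustration only (this sketch is not part of the patch): the state checks that luo_prepare(), luo_freeze(), luo_finish(), and luo_cancel() perform after this change can be modelled in plain userspace C. The model_* names are hypothetical, and the locking, callbacks, FDT setup, and KHO calls are deliberately left out.

```c
#include <assert.h>
#include <errno.h>

/* Userspace model only; values mirror enum liveupdate_state in the uAPI. */
enum model_state {
	MODEL_UNDEFINED = 0,
	MODEL_NORMAL = 1,
	MODEL_PREPARED = 2,
	MODEL_FROZEN = 3,
	MODEL_UPDATED = 4,
};

/* NORMAL -> PREPARED; any other starting state is rejected. */
static int model_prepare(enum model_state *s)
{
	if (*s != MODEL_NORMAL)
		return -EINVAL;
	*s = MODEL_PREPARED;
	return 0;
}

/* PREPARED -> FROZEN. */
static int model_freeze(enum model_state *s)
{
	if (*s != MODEL_PREPARED)
		return -EINVAL;
	*s = MODEL_FROZEN;
	return 0;
}

/* PREPARED or FROZEN -> NORMAL. */
static int model_cancel(enum model_state *s)
{
	if (*s != MODEL_PREPARED && *s != MODEL_FROZEN)
		return -EINVAL;
	*s = MODEL_NORMAL;
	return 0;
}

/* UPDATED -> NORMAL; UPDATED itself is only entered at boot. */
static int model_finish(enum model_state *s)
{
	if (*s != MODEL_UPDATED)
		return -EINVAL;
	*s = MODEL_NORMAL;
	return 0;
}

/* Walk one sequence: prepare, cancel, prepare again, then freeze. */
static void model_selfcheck(void)
{
	enum model_state s = MODEL_NORMAL;

	assert(model_prepare(&s) == 0 && s == MODEL_PREPARED);
	assert(model_prepare(&s) == -EINVAL);	/* already prepared */
	assert(model_cancel(&s) == 0 && s == MODEL_NORMAL);
	assert(model_prepare(&s) == 0);		/* can prepare again */
	assert(model_freeze(&s) == 0 && s == MODEL_FROZEN);
	assert(model_finish(&s) == -EINVAL);	/* not in UPDATED */
}
```

Calling model_selfcheck() from a small main() exercises the cancel-then-prepare-again path. The model has no transition into MODEL_UPDATED because, in the patch, that state is only set by luo_startup() when a compatible LUO subtree is found in KHO.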
Signed-off-by: Pasha Tatashin --- kernel/liveupdate/luo_core.c | 282 ++++++++++++++++++++++++++++++- kernel/liveupdate/luo_internal.h | 13 ++ 2 files changed, 292 insertions(+), 3 deletions(-) diff --git a/kernel/liveupdate/luo_core.c b/kernel/liveupdate/luo_core.c index 954d533bd8c4..10796481447a 100644 --- a/kernel/liveupdate/luo_core.c +++ b/kernel/liveupdate/luo_core.c @@ -47,9 +47,13 @@ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt #include +#include #include +#include #include +#include #include +#include #include #include "luo_internal.h" @@ -67,6 +71,21 @@ static const char *const luo_state_str[] = { static bool luo_enabled; +static void *luo_fdt_out; +static void *luo_fdt_in; + +/* + * The LUO FDT size depends on the number of participating subsystems. + * + * The current fixed size (4K) is large enough to handle a reasonable number of + * preserved entities. If this size ever becomes insufficient, it can either be + * increased, or a dynamic size calculation mechanism could be implemented in + * the future. 
+ */ +#define LUO_FDT_SIZE PAGE_SIZE +#define LUO_KHO_ENTRY_NAME "LUO" +#define LUO_COMPATIBLE "luo-v1" + static int __init early_liveupdate_param(char *buf) { return kstrtobool(buf, &luo_enabled); @@ -91,6 +110,52 @@ static inline void luo_set_state(enum liveupdate_state state) __luo_set_state(state); } +/* Called during the prepare phase, to create LUO fdt tree */ +static int luo_fdt_setup(void) +{ + void *fdt_out; + int ret; + + fdt_out = luo_contig_alloc_preserve(LUO_FDT_SIZE); + if (IS_ERR(fdt_out)) { + pr_err("failed to allocate/preserve FDT memory\n"); + return PTR_ERR(fdt_out); + } + + ret = fdt_create_empty_tree(fdt_out, LUO_FDT_SIZE); + if (ret) + goto exit_free; + + ret = fdt_setprop_string(fdt_out, 0, "compatible", LUO_COMPATIBLE); + if (ret) + goto exit_free; + + ret = kho_add_subtree(LUO_KHO_ENTRY_NAME, fdt_out); + if (ret) + goto exit_free; + luo_fdt_out = fdt_out; + + return 0; + +exit_free: + luo_contig_free_unpreserve(fdt_out, LUO_FDT_SIZE); + pr_err("failed to prepare LUO FDT: %d\n", ret); + + return ret; +} + +static void luo_fdt_destroy(void) +{ + kho_remove_subtree(luo_fdt_out); + luo_contig_free_unpreserve(luo_fdt_out, LUO_FDT_SIZE); + luo_fdt_out = NULL; +} + +static int luo_do_prepare_calls(void) +{ + return 0; +} + static int luo_do_freeze_calls(void) { return 0; @@ -100,6 +165,71 @@ static void luo_do_finish_calls(void) { } +static void luo_do_cancel_calls(void) +{ +} + +static int __luo_prepare(void) +{ + int ret; + + if (down_write_killable(&luo_state_rwsem)) { + pr_warn("[prepare] event canceled by user\n"); + return -EAGAIN; + } + + if (!is_current_luo_state(LIVEUPDATE_STATE_NORMAL)) { + pr_warn("Can't switch to [%s] from [%s] state\n", + luo_state_str[LIVEUPDATE_STATE_PREPARED], + luo_current_state_str()); + ret = -EINVAL; + goto exit_unlock; + } + + ret = luo_fdt_setup(); + if (ret) + goto exit_unlock; + + ret = luo_do_prepare_calls(); + if (ret) { + luo_fdt_destroy(); + goto exit_unlock; + } + + 
luo_set_state(LIVEUPDATE_STATE_PREPARED); + +exit_unlock: + up_write(&luo_state_rwsem); + + return ret; +} + +static int __luo_cancel(void) +{ + if (down_write_killable(&luo_state_rwsem)) { + pr_warn("[cancel] event canceled by user\n"); + return -EAGAIN; + } + + if (!is_current_luo_state(LIVEUPDATE_STATE_PREPARED) && + !is_current_luo_state(LIVEUPDATE_STATE_FROZEN)) { + pr_warn("Can't switch to [%s] from [%s] state\n", + luo_state_str[LIVEUPDATE_STATE_NORMAL], + luo_current_state_str()); + up_write(&luo_state_rwsem); + + return -EINVAL; + } + + luo_do_cancel_calls(); + luo_fdt_destroy(); + luo_set_state(LIVEUPDATE_STATE_NORMAL); + + up_write(&luo_state_rwsem); + + return 0; +} + /* Get the current state as a string */ const char *luo_current_state_str(void) { @@ -111,9 +241,28 @@ enum liveupdate_state liveupdate_get_state(void) return READ_ONCE(luo_state); } +/** + * luo_prepare - Initiate the live update preparation phase. + * + * This function is called to begin the live update process. It attempts to + * transition LUO to the ``LIVEUPDATE_STATE_PREPARED`` state. + * + * If the calls complete successfully, the orchestrator state is set + * to ``LIVEUPDATE_STATE_PREPARED``. If any call fails, a + * ``LIVEUPDATE_CANCEL`` is sent to roll back any actions. + * + * @return 0 on success, ``-EAGAIN`` if the state change was cancelled by the + * user while waiting for the lock, ``-EINVAL`` if the orchestrator is not in + * the normal state, or a negative error code returned by the calls. + */ int luo_prepare(void) { - return 0; + int err = __luo_prepare(); + + if (err) + return err; + + return kho_finalize(); } /** @@ -193,9 +342,28 @@ int luo_finish(void) return 0; } +/** + * luo_cancel - Cancel the ongoing live update from prepared or frozen states. + * + * This function is called to abort a live update that is currently in the + * ``LIVEUPDATE_STATE_PREPARED`` or ``LIVEUPDATE_STATE_FROZEN`` state. 
+ * + * If the state is correct, it triggers the ``LIVEUPDATE_CANCEL`` notifier= chain + * to allow subsystems to undo any actions performed during the prepare or + * freeze events. Finally, the orchestrator state is transitioned back to + * ``LIVEUPDATE_STATE_NORMAL``. + * + * @return 0 on success, or ``-EAGAIN`` if the state change was cancelled = by the + * user while waiting for the lock. + */ int luo_cancel(void) { - return 0; + int err =3D kho_abort(); + + if (err) + return err; + + return __luo_cancel(); } =20 void luo_state_read_enter(void) @@ -210,7 +378,36 @@ void luo_state_read_exit(void) =20 static int __init luo_startup(void) { - __luo_set_state(LIVEUPDATE_STATE_NORMAL); + phys_addr_t fdt_phys; + int ret; + + if (!kho_is_enabled()) { + if (luo_enabled) + pr_warn("Disabling liveupdate because KHO is disabled\n"); + luo_enabled =3D false; + return 0; + } + + /* Retrieve LUO subtree, and verify its format. */ + ret =3D kho_retrieve_subtree(LUO_KHO_ENTRY_NAME, &fdt_phys); + if (ret) { + if (ret !=3D -ENOENT) { + luo_restore_fail("failed to retrieve FDT '%s' from KHO: %d\n", + LUO_KHO_ENTRY_NAME, ret); + } + __luo_set_state(LIVEUPDATE_STATE_NORMAL); + + return 0; + } + + luo_fdt_in =3D __va(fdt_phys); + ret =3D fdt_node_check_compatible(luo_fdt_in, 0, LUO_COMPATIBLE); + if (ret) { + luo_restore_fail("FDT '%s' is incompatible with '%s' [%d]\n", + LUO_KHO_ENTRY_NAME, LUO_COMPATIBLE, ret); + } + + __luo_set_state(LIVEUPDATE_STATE_UPDATED); =20 return 0; } @@ -295,3 +492,82 @@ bool liveupdate_enabled(void) { return luo_enabled; } + +/** + * luo_contig_alloc_preserve - Allocate, zero, and preserve contiguous mem= ory. + * @size: The number of bytes to allocate. + * + * Allocates a physically contiguous block of zeroed pages that is large + * enough to hold @size bytes. The allocated memory is then registered with + * KHO for preservation across a kexec. + * + * Note: The actual allocated size will be rounded up to the nearest + * power-of-two page boundary. 
+ * + * @return A virtual pointer to the allocated and preserved memory on succ= ess, + * or an ERR_PTR() encoded error on failure. + */ +void *luo_contig_alloc_preserve(size_t size) +{ + int order, ret; + void *mem; + + if (!size) + return ERR_PTR(-EINVAL); + + order =3D get_order(size); + if (order > MAX_PAGE_ORDER) + return ERR_PTR(-E2BIG); + + mem =3D (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, order); + if (!mem) + return ERR_PTR(-ENOMEM); + + ret =3D kho_preserve_pages(virt_to_page(mem), 1 << order); + if (ret) { + free_pages((unsigned long)mem, order); + return ERR_PTR(ret); + } + + return mem; +} + +/** + * luo_contig_free_unpreserve - Unpreserve and free contiguous memory. + * @mem: Pointer to the memory allocated by luo_contig_alloc_preserve(). + * @size: The original size requested during allocation. This is used to + * recalculate the correct order for freeing the pages. + * + * Unregisters the memory from KHO preservation and frees the underlying + * pages back to the system. This function should be called to clean up + * memory allocated with luo_contig_alloc_preserve(). 
+ */ +void luo_contig_free_unpreserve(void *mem, size_t size) +{ + unsigned int order; + + if (!mem || !size) + return; + + order =3D get_order(size); + if (WARN_ON_ONCE(order > MAX_PAGE_ORDER)) + return; + + WARN_ON_ONCE(kho_unpreserve_pages(virt_to_page(mem), 1 << order)); + free_pages((unsigned long)mem, order); +} + +void luo_contig_free_restore(void *mem, size_t size) +{ + unsigned int order; + + if (!mem || !size) + return; + + order =3D get_order(size); + if (WARN_ON_ONCE(order > MAX_PAGE_ORDER)) + return; + + WARN_ON_ONCE(!kho_restore_pages(__pa(mem), 1 << order)); + free_pages((unsigned long)mem, order); +} diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_inter= nal.h index 2e0861781673..c98842caa4a0 100644 --- a/kernel/liveupdate/luo_internal.h +++ b/kernel/liveupdate/luo_internal.h @@ -8,6 +8,15 @@ #ifndef _LINUX_LUO_INTERNAL_H #define _LINUX_LUO_INTERNAL_H =20 +/* + * Handles a deserialization failure: devices and memory is in unpredictab= le + * state. + * + * Continuing the boot process after a failure is dangerous because it cou= ld + * lead to leaks of private data. + */ +#define luo_restore_fail(__fmt, ...) 
panic(__fmt, ##__VA_ARGS__) + int luo_cancel(void); int luo_prepare(void); int luo_freeze(void); @@ -19,4 +28,8 @@ extern struct rw_semaphore luo_state_rwsem; =20 const char *luo_current_state_str(void); =20 +void *luo_contig_alloc_preserve(size_t size); +void luo_contig_free_unpreserve(void *mem, size_t size); +void luo_contig_free_restore(void *mem, size_t size); + #endif /* _LINUX_LUO_INTERNAL_H */ --=20 2.51.0.536.g15c5d4f767-goog From nobody Sun Oct 5 07:24:32 2025 Received: from mail-qt1-f181.google.com (mail-qt1-f181.google.com [209.85.160.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0F382223311 for ; Mon, 29 Sep 2025 01:03:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.181 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107829; cv=none; b=kHu7YiStnlrprqwfJpaeVW2ADSbb7TNcHx4Oagd5HgSzyQ7U258QG4fUpi1f6I+2EvqOpVKzD4hEKhy0FJGm46TXfF9QOPDUuI3FBa/yZazFN9df2NOKOFGw36iIffrezRc1l2aR/LoltFEAX51B7bjOK7ABu+eug5WSVnMGuKU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107829; c=relaxed/simple; bh=zK93UZTn2t7PGQ3H6iYrbXG4ZbFX30XxJbMiQ2s5Z6o=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=PQZMT2PgbGsjcS1X2dIh3QKiB6Lt5oeSGfyHKoLHdPmqs/gDw9KHFM4oUcLpHJozEQZ6nvu+64q2SRuuxjLqPXMq+K/pbhnxkKtalbmDNopkEzdo0tNn9cMiq2rtQyTuY6IvwaR3EiNv/z1DIUu51NG6RUpfnRi5nZoJeGvrDJ8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=Z0B5rVxJ; arc=none smtp.client-ip=209.85.160.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com 
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="Z0B5rVxJ" Received: by mail-qt1-f181.google.com with SMTP id d75a77b69052e-4df0467b510so21093901cf.3 for ; Sun, 28 Sep 2025 18:03:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1759107825; x=1759712625; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=JRKEwXa1tg/M6bNaqOfoeFXrUpvvgwejj9DdMRFtL4w=; b=Z0B5rVxJDSr91zM/+frz61GmcvcSePgnhnSQtMyxvEcTWsebr44znOfryGcB/3AhSN xaX9KTmsnn3zsOXkmRwtCx9dOZcZii1jkzAkhh4W47WVK2iVMeMVl7VhsljxDNm5DnYV mrm65QLHMGDfq1z99nxqqGGAJqhoUfmdsU4MBGmWbMoeNmpkP72nqP5Uu/YzIk2BAwX9 +apPgqDBFDEp6V2VgHnL9gtMGn6qn+ym0UADdI6HLgYmBeKACHkScDtqfWm8mLDxZB7b 3Bcti+yLDegTnRPn/JfiAVzSoVG/Tai6eNWuHDrsaoBIYVkD6FNO6xOnwniUaQml5M4Y pMWA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759107825; x=1759712625; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=JRKEwXa1tg/M6bNaqOfoeFXrUpvvgwejj9DdMRFtL4w=; b=pGpE6DhDwaDPUQwcCpu353sBFJ3jfnDmv6D9RCo/NAbEXUTt7LXWG3x4AuKIE/PLyq Dl/uGYkbW7yTfHOy8Ahlr8A42vtUMiCOIwAWYaFb7LqyTUFsUqo5W//WnolnMFxlz3Qj P30oCR+pFSlv6I/2AmAur+TOWPPisu53IkJrD/Oa7OEHOrvP9zRNoab0HEeSp8ldvmUO SsIV8MQg29ulO0eh9SiYifLHv/EUcFqESHfDege8WCPOVbRa8UO69N8DEemIgCwOMO/R OYL9iJjt7zgOrwZe9LjMVU3TC1O87FDto1R9UeiBhgLVqV93C5otPZNicTMy1EZxIV63 3XGA== X-Forwarded-Encrypted: i=1; AJvYcCVUcIJTcJrpqo+xqkvEBz7QvVvmg8/g5pQka7gkFojZFLDeRkyWsxlnOD16e6txw7Hwq2/1EDi9ozAj4k8=@vger.kernel.org X-Gm-Message-State: AOJu0YzRWyAfttr6l0YuXqQlHGWTw9EbTfYo/aN6BDq4SVPLHYuZ/WY/ DV1BX/eA16hdN+BfcL2j2L617pFrF0Uu+4Yb7whTKbcc1Rd2BoqKX4ZqerZBg0Yf9N0= X-Gm-Gg: ASbGncs8PHAue5IWcMOf30RyCe90J5TplSmHE9UduBoYFNKWioU9JCnEmu8nbOWKBCb 
LvDpKCcudG6ngNQypQPHaOxXmYr6+e+3oIPnTgm5UgD+1JdLw528lTr2ChJi0/Wm4qNjvOSUjJi kyl3O1RsMja5+6Lg/xd2cIDFXbmtCNgfmsuOeg3OjiOkD6CNsk2f3McBQJ0LXDzwCb95W/5i9Vo pxeEsRhBrNFBT9KtCUDZ9BuaSBS40VB8ROjCg2O6ZfbHW/KlfYMq9LpDx1iPvsp6vgZxDWpUMew sHXWMeO0aHsplgNRVS5gRPnJP1IFpZ+KrKDaHvE9Q/H8SX6db0EpIwNju9jOK3IiWafV897Fm0x GjGPfBQiguz3AMNQQNRclhvtmbgz/mRRTs8ZfP5fC8os/upfbgtyLpxMB0C/X9mK3mncWeiX3Yo uWF1dTosRk+CyRj4Yrhg== X-Google-Smtp-Source: AGHT+IEwhqe/BqmKvNuWGS3uf0Nm2/PuSwk/9IAcyGtlVKkcjZXQAFpVwNkxDeP2e7Bl4f/no/k03w== X-Received: by 2002:ac8:7e86:0:b0:4d9:c572:f9b6 with SMTP id d75a77b69052e-4da4bbe418cmr172161361cf.55.1759107824543; Sun, 28 Sep 2025 18:03:44 -0700 (PDT) Received: from soleen.c.googlers.com.com (53.47.86.34.bc.googleusercontent.com. [34.86.47.53]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-4db0c0fbe63sm64561521cf.23.2025.09.28.18.03.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 28 Sep 2025 18:03:43 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, 
bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com Subject: [PATCH v4 09/30] liveupdate: luo_subsystems: add subsystem registration Date: Mon, 29 Sep 2025 01:03:00 +0000 Message-ID: <20250929010321.3462457-10-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com> References: <20250929010321.3462457-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Introduce the framework for kernel subsystems (e.g., KVM, IOMMU, device drivers) to register with LUO and participate in the live update process via callbacks. Subsystem Registration: - Defines struct liveupdate_subsystem in linux/liveupdate.h, which subsystems use to provide their name and optional callbacks (prepare, freeze, cancel, finish). The callbacks accept a u64 *data intended for passing state/handles. - Exports liveupdate_register_subsystem() and liveupdate_unregister_subsystem() API functions. - Adds drivers/misc/liveupdate/luo_subsystems.c to manage a list of registered subsystems. Registration/unregistration is restricted to specific LUO states (NORMAL/UPDATED). 
Callback Framework: - The main luo_core.c state transition functions now delegate to new luo_do_subsystems_*_calls() functions defined in luo_subsystems.c. - These new functions are intended to iterate through the registered subsystems and invoke their corresponding callbacks. FDT Integration: - Adds a /subsystems subnode within the main LUO FDT created in luo_core.c. This node has its own compatibility string (subsystems-v1). - luo_subsystems_fdt_setup() populates this node by adding a property for each registered subsystem, using the subsystem's name. Currently, these properties are initialized with a placeholder u64 value (0). - luo_subsystems_startup() is called from luo_core.c on boot to find and validate the /subsystems node in the FDT received via KHO. - Adds a stub API function liveupdate_get_subsystem_data() intended for subsystems to retrieve their persisted u64 data from the FDT in the new kernel. Signed-off-by: Pasha Tatashin --- include/linux/liveupdate.h | 66 +++++++ kernel/liveupdate/Makefile | 3 +- kernel/liveupdate/luo_core.c | 19 +- kernel/liveupdate/luo_internal.h | 7 + kernel/liveupdate/luo_subsystems.c | 291 +++++++++++++++++++++++++++++ 5 files changed, 383 insertions(+), 3 deletions(-) create mode 100644 kernel/liveupdate/luo_subsystems.c diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h index 85a6828c95b0..4c378a986cfe 100644 --- a/include/linux/liveupdate.h +++ b/include/linux/liveupdate.h @@ -12,6 +12,52 @@ #include #include =20 +struct liveupdate_subsystem; + +/** + * struct liveupdate_subsystem_ops - LUO events callback functions + * @prepare: Optional. Called during LUO prepare phase. Should perform + * preparatory actions and can store a u64 handle/state + * via the 'data' pointer for use in later callbacks. + * Return 0 on success, negative error code on failure. + * @freeze: Optional. Called during LUO freeze event (before actual = jump + * to new kernel). 
Should perform final state saving actions and + * can update the u64 handle/state via the 'data' pointer. Return + * 0 on success, negative error code on failure. + * @cancel: Optional. Called if the live update process is canceled after + * prepare (or freeze) was called. Receives the u64 data + * set by prepare/freeze. Used for cleanup. + * @boot: Optional. Called during boot after a live update. This callback is + * invoked when the subsystem registers in the updated state. + * @finish: Optional. Called after the live update is finished in the new + * kernel. + * Receives the u64 data set by prepare/freeze. Used for cleanup. + * @owner: Module reference + */ +struct liveupdate_subsystem_ops { + int (*prepare)(struct liveupdate_subsystem *handle, u64 *data); + int (*freeze)(struct liveupdate_subsystem *handle, u64 *data); + void (*cancel)(struct liveupdate_subsystem *handle, u64 data); + void (*boot)(struct liveupdate_subsystem *handle, u64 data); + void (*finish)(struct liveupdate_subsystem *handle, u64 data); + struct module *owner; +}; + +/** + * struct liveupdate_subsystem - Represents a subsystem participating in LUO + * @ops: Callback functions + * @name: Unique name identifying the subsystem. + * @list: List head used internally by LUO. Should not be modified by + * caller after registration. + * @private_data: For LUO internal use, cached value of data field.
+ */ +struct liveupdate_subsystem { + const struct liveupdate_subsystem_ops *ops; + const char *name; + struct list_head list; + u64 private_data; +}; + #ifdef CONFIG_LIVEUPDATE =20 /* Return true if live update orchestrator is enabled */ @@ -33,6 +79,10 @@ bool liveupdate_state_normal(void); =20 enum liveupdate_state liveupdate_get_state(void); =20 +int liveupdate_register_subsystem(struct liveupdate_subsystem *h); +int liveupdate_unregister_subsystem(struct liveupdate_subsystem *h); +int liveupdate_get_subsystem_data(struct liveupdate_subsystem *h, u64 *dat= a); + #else /* CONFIG_LIVEUPDATE */ =20 static inline int liveupdate_reboot(void) @@ -60,5 +110,21 @@ static inline enum liveupdate_state liveupdate_get_stat= e(void) return LIVEUPDATE_STATE_NORMAL; } =20 +static inline int liveupdate_register_subsystem(struct liveupdate_subsyste= m *h) +{ + return 0; +} + +static inline int liveupdate_unregister_subsystem(struct liveupdate_subsys= tem *h) +{ + return 0; +} + +static inline int liveupdate_get_subsystem_data(struct liveupdate_subsyste= m *h, + u64 *data) +{ + return -ENODATA; +} + #endif /* CONFIG_LIVEUPDATE */ #endif /* _LINUX_LIVEUPDATE_H */ diff --git a/kernel/liveupdate/Makefile b/kernel/liveupdate/Makefile index d90cc3b4bf7b..2881bab0c6df 100644 --- a/kernel/liveupdate/Makefile +++ b/kernel/liveupdate/Makefile @@ -2,7 +2,8 @@ =20 luo-y :=3D \ luo_core.o \ - luo_ioctl.o + luo_ioctl.o \ + luo_subsystems.o =20 obj-$(CONFIG_KEXEC_HANDOVER) +=3D kexec_handover.o obj-$(CONFIG_KEXEC_HANDOVER_DEBUG) +=3D kexec_handover_debug.o diff --git a/kernel/liveupdate/luo_core.c b/kernel/liveupdate/luo_core.c index 10796481447a..92edaeaaad3e 100644 --- a/kernel/liveupdate/luo_core.c +++ b/kernel/liveupdate/luo_core.c @@ -130,6 +130,10 @@ static int luo_fdt_setup(void) if (ret) goto exit_free; =20 + ret =3D luo_subsystems_fdt_setup(fdt_out); + if (ret) + goto exit_free; + ret =3D kho_add_subtree(LUO_KHO_ENTRY_NAME, fdt_out); if (ret) goto exit_free; @@ -153,20 +157,30 @@ 
static void luo_fdt_destroy(void) =20 static int luo_do_prepare_calls(void) { - return 0; + int ret; + + ret =3D luo_do_subsystems_prepare_calls(); + + return ret; } =20 static int luo_do_freeze_calls(void) { - return 0; + int ret; + + ret =3D luo_do_subsystems_freeze_calls(); + + return ret; } =20 static void luo_do_finish_calls(void) { + luo_do_subsystems_finish_calls(); } =20 static void luo_do_cancel_calls(void) { + luo_do_subsystems_cancel_calls(); } =20 static int __luo_prepare(void) @@ -408,6 +422,7 @@ static int __init luo_startup(void) } =20 __luo_set_state(LIVEUPDATE_STATE_UPDATED); + luo_subsystems_startup(luo_fdt_in); =20 return 0; } diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_inter= nal.h index c98842caa4a0..c62fbbb0790c 100644 --- a/kernel/liveupdate/luo_internal.h +++ b/kernel/liveupdate/luo_internal.h @@ -32,4 +32,11 @@ void *luo_contig_alloc_preserve(size_t size); void luo_contig_free_unpreserve(void *mem, size_t size); void luo_contig_free_restore(void *mem, size_t size); =20 +void luo_subsystems_startup(void *fdt); +int luo_subsystems_fdt_setup(void *fdt); +int luo_do_subsystems_prepare_calls(void); +int luo_do_subsystems_freeze_calls(void); +void luo_do_subsystems_finish_calls(void); +void luo_do_subsystems_cancel_calls(void); + #endif /* _LINUX_LUO_INTERNAL_H */ diff --git a/kernel/liveupdate/luo_subsystems.c b/kernel/liveupdate/luo_sub= systems.c new file mode 100644 index 000000000000..69f00d5c000e --- /dev/null +++ b/kernel/liveupdate/luo_subsystems.c @@ -0,0 +1,291 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +/** + * DOC: LUO Subsystems support + * + * Various kernel subsystems register with the Live Update Orchestrator to + * participate in the live update process. These subsystems are notified at + * different stages of the live update sequence, allowing them to serialize + * device state before the reboot and restore it afterwards. 
Examples incl= ude + * the device layer, interrupt controllers, KVM, IOMMU, and specific device + * drivers. + */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include +#include +#include +#include +#include +#include +#include "luo_internal.h" + +#define LUO_SUBSYSTEMS_NODE_NAME "subsystems" +#define LUO_SUBSYSTEMS_COMPATIBLE "subsystems-v1" + +static DEFINE_MUTEX(luo_subsystem_list_mutex); +static LIST_HEAD(luo_subsystems_list); +static void *luo_fdt_out; +static void *luo_fdt_in; + +/** + * luo_subsystems_fdt_setup - Adds and populates the 'subsystems' node in = the + * FDT. + * @fdt: Pointer to the LUO FDT blob. + * + * Add subsystems node and each subsystem to the LUO FDT blob. + * + * Returns: 0 on success, negative errno on failure. + */ +int luo_subsystems_fdt_setup(void *fdt) +{ + struct liveupdate_subsystem *subsystem; + const u64 zero_data =3D 0; + int ret, node_offset; + + guard(mutex)(&luo_subsystem_list_mutex); + ret =3D fdt_add_subnode(fdt, 0, LUO_SUBSYSTEMS_NODE_NAME); + if (ret < 0) + goto exit_error; + + node_offset =3D ret; + ret =3D fdt_setprop_string(fdt, node_offset, "compatible", + LUO_SUBSYSTEMS_COMPATIBLE); + if (ret < 0) + goto exit_error; + + list_for_each_entry(subsystem, &luo_subsystems_list, list) { + ret =3D fdt_setprop(fdt, node_offset, subsystem->name, + &zero_data, sizeof(zero_data)); + if (ret < 0) + goto exit_error; + } + + luo_fdt_out =3D fdt; + return 0; +exit_error: + pr_err("Failed to setup 'subsystems' node to FDT: %s\n", + fdt_strerror(ret)); + return -ENOSPC; +} + +/** + * luo_subsystems_startup - Validates the LUO subsystems FDT node at start= up. + * @fdt: Pointer to the LUO FDT blob passed from the previous kernel. + * + * This __init function checks the existence and validity of the '/subsyst= ems' + * node in the FDT. This node is considered mandatory. 
+ */ +void __init luo_subsystems_startup(void *fdt) +{ + int ret, node_offset; + + guard(mutex)(&luo_subsystem_list_mutex); + node_offset =3D fdt_subnode_offset(fdt, 0, LUO_SUBSYSTEMS_NODE_NAME); + if (node_offset < 0) + luo_restore_fail("Failed to find /subsystems node\n"); + + ret =3D fdt_node_check_compatible(fdt, node_offset, + LUO_SUBSYSTEMS_COMPATIBLE); + if (ret) { + luo_restore_fail("FDT '%s' is incompatible with '%s' [%d]\n", + LUO_SUBSYSTEMS_NODE_NAME, + LUO_SUBSYSTEMS_COMPATIBLE, ret); + } + luo_fdt_in =3D fdt; +} + +static int luo_get_subsystem_data(struct liveupdate_subsystem *h, u64 *dat= a) +{ + return 0; +} + +/** + * luo_do_subsystems_prepare_calls - Calls prepare callbacks and updates F= DT + * if all prepares succeed. Handles cancellation on failure. + * + * Phase 1: Calls 'prepare' for all subsystems and stores results temporar= ily. + * If any 'prepare' fails, calls 'cancel' on previously prepared subsystems + * and returns the error. + * Phase 2: If all 'prepare' calls succeeded, writes the stored data to th= e FDT. + * If any FDT write fails, calls 'cancel' on *all* prepared subsystems and + * returns the FDT error. + * + * Returns: 0 on success. Negative errno on failure. + */ +int luo_do_subsystems_prepare_calls(void) +{ + return 0; +} + +/** + * luo_do_subsystems_freeze_calls - Calls freeze callbacks and updates FDT + * if all freezes succeed. Handles cancellation on failure. + * + * Phase 1: Calls 'freeze' for all subsystems and stores results temporari= ly. + * If any 'freeze' fails, calls 'cancel' on previously called subsystems + * and returns the error. + * Phase 2: If all 'freeze' calls succeeded, writes the stored data to the= FDT. + * If any FDT write fails, calls 'cancel' on *all* subsystems and + * returns the FDT error. + * + * Returns: 0 on success. Negative errno on failure. + */ +int luo_do_subsystems_freeze_calls(void) +{ + return 0; +} + +/** + * luo_do_subsystems_finish_calls- Calls finish callbacks for all subsyste= ms. 
+ * + * This function is called at the end of live update cycle to do the final + * clean-up or housekeeping of the post-live update states. + */ +void luo_do_subsystems_finish_calls(void) +{ +} + +/** + * luo_do_subsystems_cancel_calls - Calls cancel callbacks for all subsyst= ems. + * + * This function is typically called when the live update process needs to= be + * aborted externally, for example, after the prepare phase may have run b= ut + * before actual reboot. It iterates through all registered subsystems and= calls + * the 'cancel' callback for those that implement it and likely completed + * prepare. + */ +void luo_do_subsystems_cancel_calls(void) +{ +} + +/** + * liveupdate_register_subsystem - Register a kernel subsystem handler wit= h LUO + * @h: Pointer to the liveupdate_subsystem structure allocated and populat= ed + * by the calling subsystem. + * + * Registers a subsystem handler that provides callbacks for different eve= nts + * of the live update cycle. Registration is typically done during the + * subsystem's module init or core initialization. + * + * Can only be called when LUO is in the NORMAL or UPDATED states. + * The provided name (@h->name) must be unique among registered subsystems. + * + * Return: 0 on success, negative error code otherwise. 
+ */ +int liveupdate_register_subsystem(struct liveupdate_subsystem *h) +{ + struct liveupdate_subsystem *iter; + int ret =3D 0; + + luo_state_read_enter(); + if (!liveupdate_state_normal() && !liveupdate_state_updated()) { + luo_state_read_exit(); + return -EBUSY; + } + + guard(mutex)(&luo_subsystem_list_mutex); + list_for_each_entry(iter, &luo_subsystems_list, list) { + if (iter =3D=3D h) { + pr_warn("Subsystem '%s' (%p) already registered.\n", + h->name, h); + ret =3D -EEXIST; + goto out_unlock; + } + + if (!strcmp(iter->name, h->name)) { + pr_err("Subsystem with name '%s' already registered.\n", + h->name); + ret =3D -EEXIST; + goto out_unlock; + } + } + + if (!try_module_get(h->ops->owner)) { + pr_warn("Subsystem '%s' unable to get reference.\n", h->name); + ret =3D -EAGAIN; + goto out_unlock; + } + + INIT_LIST_HEAD(&h->list); + list_add_tail(&h->list, &luo_subsystems_list); + +out_unlock: + /* + * If we are booting during live update, and subsystem provided a boot + * callback, do it now, since we know that subsystem has already + * initialized. + */ + if (!ret && liveupdate_state_updated() && h->ops->boot) { + u64 data; + + ret =3D luo_get_subsystem_data(h, &data); + if (!WARN_ON_ONCE(ret)) + h->ops->boot(h, data); + } + + luo_state_read_exit(); + + return ret; +} + +/** + * liveupdate_unregister_subsystem - Unregister a kernel subsystem handler from + * LUO + * @h: Pointer to the same liveupdate_subsystem structure that was used during + * registration. + * + * Unregisters a previously registered subsystem handler. Typically called + * during module exit or subsystem teardown. LUO removes the structure from its + * internal list; the caller is responsible for any necessary memory cleanup + * of the structure itself. + * + * Return: 0 on success, negative error code otherwise. + * -EINVAL if h is NULL. + * -ENOENT if the specified handler @h is not found in the registration list. + * -EBUSY if LUO is in neither the NORMAL nor the UPDATED state.
+ */ +int liveupdate_unregister_subsystem(struct liveupdate_subsystem *h) +{ + struct liveupdate_subsystem *iter; + bool found =3D false; + int ret =3D 0; + + luo_state_read_enter(); + if (!liveupdate_state_normal() && !liveupdate_state_updated()) { + luo_state_read_exit(); + return -EBUSY; + } + + guard(mutex)(&luo_subsystem_list_mutex); + list_for_each_entry(iter, &luo_subsystems_list, list) { + if (iter =3D=3D h) { + found =3D true; + break; + } + } + + if (found) { + list_del_init(&h->list); + } else { + pr_warn("Subsystem handler '%s' not found for unregistration.\n", + h->name); + ret =3D -ENOENT; + } + + module_put(h->ops->owner); + luo_state_read_exit(); + + return ret; +} + +int liveupdate_get_subsystem_data(struct liveupdate_subsystem *h, u64 *dat= a) +{ + return 0; +} --=20 2.51.0.536.g15c5d4f767-goog From nobody Sun Oct 5 07:24:32 2025 Received: from mail-qt1-f181.google.com (mail-qt1-f181.google.com [209.85.160.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5F2BC1D9A5F for ; Mon, 29 Sep 2025 01:03:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.181 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107830; cv=none; b=BUjsXy4puYAXFZl7igcoocKmIILXwq0UYiH51ZfqlM2mpLdR/eF8QfMjEIgD+R+nMNGMbsScF44zmOm2MkEtPHlw55qV3lguWeDtMEHG6apWQaQZ8ybA/OaA+QgEbUQsLvsQBZyhhlp6O/hFx5Hv2wyGMtJe2QAlXqVvTZ4xQRE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107830; c=relaxed/simple; bh=1eYRhAZ5gxArkp6W4PjKuMPzvA3/DnYF8EhUap2c5NM=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=i9GF9exY4Vf+DiCmGyS8URXRW4+KJP4Uk39NKwSYsOscBDyAY1t707q/OS8rSGUasrjhW6Wi6ieSOAaJZN4tFzGEsdnRnEDts/SKkB1AtaXYCRBeyiiJSD/jUroUcqQBGwttosTuhHOo0PZF3SpSn1ndtwsJpGgC0bmEu5BSVQw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass 
(p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=NiVqUYoh; arc=none smtp.client-ip=209.85.160.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="NiVqUYoh" Received: by mail-qt1-f181.google.com with SMTP id d75a77b69052e-4da894db6e9so37096681cf.0 for ; Sun, 28 Sep 2025 18:03:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1759107826; x=1759712626; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=qNgtpsyj7RLAHZgXPkVDbvjA113c53pRuAQVjz5hWqc=; b=NiVqUYohbc56gOCJnWj2aj9O10RP3g7MoGPa5exk1sHV7V/1h5hbbsycPvcyca70hW /tA8ACR6010OsdseTkQKj1jf051DBYyY3fYlgENFYAN0LhhG6i4SoT96Vx5SRpvp8cVA U058h7v/8XjOkkruVMgEPu2QX1F3QIMjWgYLUwu9QnePwj/0il+P5QP6GgRN7LRoCvjf EdlIy2P0N9nAzLksJQpfn+NWWVWf3LcYqOHMnF1p1+7FRbGivaltBaKd/MCST8kW5Loc iVjKlHPHZBy7YXf4sO7pCCh0DLFIGNfBu68mnfioYvPq0YlImmbZj0UPXDEX3Syt4oni A0CA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759107826; x=1759712626; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=qNgtpsyj7RLAHZgXPkVDbvjA113c53pRuAQVjz5hWqc=; b=dK6PkCS/0IZ7pZyV6cQIybN62Dyi+ENEKM/g12FMSwsRl7G4UL/MroujcwCA1hCLSD EX6RUdrpQ4PEYrEpr0EOs0ZbW79PDfPYZad0/DwD3Eu1BsmO70nb10IhNrCODsiZg+nE 7ps2684vZLaqLiIzC+g4XNDF3sFfJU+br7IHAGheev032o4QGelLuEKntRVvwHD3MSq+ XnSqxKM388fw6WLgf3LnnQZMczfccq5plWJYmTANnKGOtvrzkEB2sAdciykn1P5vbLVC 
QJnbWqnULeiAkN5KG4FJCvcaiMA3OQb1UO/Tt9jP0v5HBl02UhmPwTgD6znDXWTqBtAb xqgA== X-Forwarded-Encrypted: i=1; AJvYcCVTF9nyuh6sR8N2j5Ygz9mA6OrOwmCFci/KHB11kcqa2nIxWpVi/WpUvnvHdqy96tYnWyhgqFceyQa3wME=@vger.kernel.org X-Gm-Message-State: AOJu0YydsWIVIZCC49Gb6cEX9jGi0a6v+xLeSChXC2lWdD5j55HYi1Rm fLfqH3rgUXBWrF3Ev+BjvOsJPprxkXTqjec8IlkmV2UTi/jOP0KGSslAbIaLXy4t5g8= X-Gm-Gg: ASbGncveow+OXORkelinW9EAOCG/qiAcTXkXcvdJMceLiBT6fUCIxN3mW61hX86QRNJ jRmCwlmjx+pvs9mBM6gVDZUdT936qh0qO0/Sxmu2FMghnK2IqPa1pTX5XgAUKOJAT6B7EuW6zzF JCRyOdvBmX422CTSXttsU/vA/qcz0/uhWu+Q8Wb2CglOOSGw8Oac2WoPIaKkuqPRdV2/CPr6NfD BP4PE4Hj3UnAIscVM1M0bbDa5O8lW9/Hk6M8PcfdrLCztphqB6z6F87eLnDObnhVhWSp/s1aobI +2tMHX34y35FOiHbi/7aDVeYQOP0NWEfz8zlEeFnab9sb8YfjPOQ3U8GM0GkSLG/0jEJi5+/wXE dDvuIV1ZBEVzXzRodmCiE0Kp+hyaPSqzwJSkrt8L+ljsMjWbiejmF4rn7CElZObyAJ8W6qMyKY2 ep1t0dub2KXtCFzudFgg== X-Google-Smtp-Source: AGHT+IHxH0LChANCVLQGPE4rLJ8RYbxJBPP7307a2p1i+UBLkReBGwGJRjYxsz6FhDJmrWtglXB3xA== X-Received: by 2002:ac8:5fc5:0:b0:4e0:b72b:7f6d with SMTP id d75a77b69052e-4e0b72b8612mr35576991cf.29.1759107825963; Sun, 28 Sep 2025 18:03:45 -0700 (PDT) Received: from soleen.c.googlers.com.com (53.47.86.34.bc.googleusercontent.com. 
[34.86.47.53]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-4db0c0fbe63sm64561521cf.23.2025.09.28.18.03.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 28 Sep 2025 18:03:45 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com Subject: [PATCH v4 10/30] liveupdate: luo_subsystems: implement subsystem callbacks 
Date: Mon, 29 Sep 2025 01:03:01 +0000 Message-ID: <20250929010321.3462457-11-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com> References: <20250929010321.3462457-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Implement the core logic within luo_subsystems.c to handle the invocation of registered subsystem callbacks and manage the persistence of their state via the LUO FDT. This replaces the stub implementations from the previous patch. This completes the core mechanism enabling subsystems to actively participate in the LUO state machine, execute phase-specific logic, and persist/restore a u64 state across the live update transition using the FDT. Signed-off-by: Pasha Tatashin --- kernel/liveupdate/luo_subsystems.c | 167 ++++++++++++++++++++++++++++- 1 file changed, 164 insertions(+), 3 deletions(-) diff --git a/kernel/liveupdate/luo_subsystems.c b/kernel/liveupdate/luo_sub= systems.c index 69f00d5c000e..ebb7c0db08f3 100644 --- a/kernel/liveupdate/luo_subsystems.c +++ b/kernel/liveupdate/luo_subsystems.c @@ -101,8 +101,81 @@ void __init luo_subsystems_startup(void *fdt) luo_fdt_in =3D fdt; } =20 +static void __luo_do_subsystems_cancel_calls(struct liveupdate_subsystem *= boundary_subsystem) +{ + struct liveupdate_subsystem *subsystem; + + list_for_each_entry(subsystem, &luo_subsystems_list, list) { + if (subsystem =3D=3D boundary_subsystem) + break; + + if (subsystem->ops->cancel) { + subsystem->ops->cancel(subsystem, + subsystem->private_data); + } + subsystem->private_data =3D 0; + } +} + +static void luo_subsystems_retrieve_data_from_fdt(void) +{ + struct liveupdate_subsystem *subsystem; + int node_offset, prop_len; + const void *prop; + + if (!luo_fdt_in) + return; + + 
node_offset =3D fdt_subnode_offset(luo_fdt_in, 0, + LUO_SUBSYSTEMS_NODE_NAME); + list_for_each_entry(subsystem, &luo_subsystems_list, list) { + prop =3D fdt_getprop(luo_fdt_in, node_offset, + subsystem->name, &prop_len); + + if (!prop || prop_len !=3D sizeof(u64)) { + luo_restore_fail("In FDT node '/%s' can't find property '%s': %s\n", + LUO_SUBSYSTEMS_NODE_NAME, + subsystem->name, + fdt_strerror(node_offset)); + } + memcpy(&subsystem->private_data, prop, sizeof(u64)); + } +} + +static int luo_subsystems_commit_data_to_fdt(void) +{ + struct liveupdate_subsystem *subsystem; + int ret, node_offset; + + node_offset =3D fdt_subnode_offset(luo_fdt_out, 0, + LUO_SUBSYSTEMS_NODE_NAME); + list_for_each_entry(subsystem, &luo_subsystems_list, list) { + ret =3D fdt_setprop(luo_fdt_out, node_offset, subsystem->name, + &subsystem->private_data, sizeof(u64)); + if (ret < 0) { + pr_err("Failed to set FDT property for subsystem '%s' %s\n", + subsystem->name, fdt_strerror(ret)); + return -ENOENT; + } + } + + return 0; +} + static int luo_get_subsystem_data(struct liveupdate_subsystem *h, u64 *dat= a) { + int node_offset, prop_len; + const void *prop; + + node_offset =3D fdt_subnode_offset(luo_fdt_in, 0, + LUO_SUBSYSTEMS_NODE_NAME); + prop =3D fdt_getprop(luo_fdt_in, node_offset, h->name, &prop_len); + if (!prop || prop_len !=3D sizeof(u64)) { + luo_state_read_exit(); + return -ENOENT; + } + memcpy(data, prop, sizeof(u64)); + return 0; } =20 @@ -121,7 +194,30 @@ static int luo_get_subsystem_data(struct liveupdate_su= bsystem *h, u64 *data) */ int luo_do_subsystems_prepare_calls(void) { - return 0; + struct liveupdate_subsystem *subsystem; + int ret; + + guard(mutex)(&luo_subsystem_list_mutex); + list_for_each_entry(subsystem, &luo_subsystems_list, list) { + if (!subsystem->ops->prepare) + continue; + + ret =3D subsystem->ops->prepare(subsystem, + &subsystem->private_data); + if (ret < 0) { + pr_err("Subsystem '%s' prepare callback failed [%d]\n", + subsystem->name, ret); + 
__luo_do_subsystems_cancel_calls(subsystem); + + return ret; + } + } + + ret =3D luo_subsystems_commit_data_to_fdt(); + if (ret) + __luo_do_subsystems_cancel_calls(NULL); + + return ret; } =20 /** @@ -139,7 +235,30 @@ int luo_do_subsystems_prepare_calls(void) */ int luo_do_subsystems_freeze_calls(void) { - return 0; + struct liveupdate_subsystem *subsystem; + int ret; + + guard(mutex)(&luo_subsystem_list_mutex); + list_for_each_entry(subsystem, &luo_subsystems_list, list) { + if (!subsystem->ops->freeze) + continue; + + ret =3D subsystem->ops->freeze(subsystem, + &subsystem->private_data); + if (ret < 0) { + pr_err("Subsystem '%s' freeze callback failed [%d]\n", + subsystem->name, ret); + __luo_do_subsystems_cancel_calls(subsystem); + + return ret; + } + } + + ret =3D luo_subsystems_commit_data_to_fdt(); + if (ret) + __luo_do_subsystems_cancel_calls(NULL); + + return ret; } =20 /** @@ -150,6 +269,18 @@ int luo_do_subsystems_freeze_calls(void) */ void luo_do_subsystems_finish_calls(void) { + struct liveupdate_subsystem *subsystem; + + guard(mutex)(&luo_subsystem_list_mutex); + luo_subsystems_retrieve_data_from_fdt(); + + list_for_each_entry(subsystem, &luo_subsystems_list, list) { + if (subsystem->ops->finish) { + subsystem->ops->finish(subsystem, + subsystem->private_data); + } + subsystem->private_data =3D 0; + } } =20 /** @@ -163,6 +294,9 @@ void luo_do_subsystems_finish_calls(void) */ void luo_do_subsystems_cancel_calls(void) { + guard(mutex)(&luo_subsystem_list_mutex); + __luo_do_subsystems_cancel_calls(NULL); + luo_subsystems_commit_data_to_fdt(); } =20 /** @@ -285,7 +419,34 @@ int liveupdate_unregister_subsystem(struct liveupdate_= subsystem *h) return ret; } =20 +/** + * liveupdate_get_subsystem_data - Retrieve raw private data for a subsyst= em + * from FDT. + * @h: Pointer to the liveupdate_subsystem structure representing the + * subsystem instance. The 'name' field is used to find the property. 
+ * @data: Output pointer where the subsystem's raw private u64 data will= be + * stored via memcpy. + * + * Reads the 8-byte data property associated with the subsystem @h->name + * directly from the '/subsystems' node within the globally accessible + * 'luo_fdt_in' blob. Returns appropriate error codes if inputs are invali= d, or + * nodes/properties are missing or invalid. + * + * Return: 0 on success. -ENOENT on error. + */ int liveupdate_get_subsystem_data(struct liveupdate_subsystem *h, u64 *dat= a) { - return 0; + int ret; + + luo_state_read_enter(); + if (WARN_ON_ONCE(!luo_fdt_in || !liveupdate_state_updated())) { + luo_state_read_exit(); + return -ENOENT; + } + + scoped_guard(mutex, &luo_subsystem_list_mutex) + ret =3D luo_get_subsystem_data(h, data); + luo_state_read_exit(); + + return ret; } --=20 2.51.0.536.g15c5d4f767-goog From nobody Sun Oct 5 07:24:32 2025 Received: from mail-qk1-f174.google.com (mail-qk1-f174.google.com [209.85.222.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E55AF284B3B for ; Mon, 29 Sep 2025 01:04:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.174 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107890; cv=none; b=YDQFYACG1tLfBmPICDWT1JXr1evi39tvP5JXs64oWBjpPvsaT+nWaw6SPmWCpJ4itH4q46gnoznfTDmuFWpS2NBDekdK2vFgKFZwm88VLYbaYdjwDvUntkrwSyQpcBWCTt4vnNIyh8zqkV6jmPS4B58g/4T78ScENBw9lLWlWmE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107890; c=relaxed/simple; bh=y18JrZ+RKa4FFiODMr7HVly8Elqh7J5PFIpDsoHY7Xk=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=loI3XxR7/jLQvynw2cTIRN3ctwvsPWU2TQWPelM8F4gGToPT/gYWhYNgAeKopGiPH/VaQ++6bNUZ9tL1kxFM+4O0rfQRplR32SYLQHjI+wHM9zKIpQ+rY+EEaLGvd2miXe8yqxz42qdZpd93NtucbkLErN0tTo+zLvdxO1oqqaM= ARC-Authentication-Results: i=1; 
smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=gMCJdLBE; arc=none smtp.client-ip=209.85.222.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="gMCJdLBE" Received: by mail-qk1-f174.google.com with SMTP id af79cd13be357-85780d76b48so433959585a.1 for ; Sun, 28 Sep 2025 18:04:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1759107884; x=1759712684; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=0YV/C30kHxyQuWB9Hp3Rra2QjX9aAIcMuuehexAQLV8=; b=gMCJdLBEmWyhtV4xptlySNsujjhuvuMRlN9ylnJAYkaVj8a5O75K3BeEKyhyqsMQMa 63Aep2wmMFvl7+L4+2t4ZZpEW3WPtJP5GYBF62mre6mj5cv/7NJJuQvt4igiGi58S7hU XYevYo3L91n7aiG0LGts7G3WYooJS3NG2Vt5T6FYJJX9rtoAjfUmWCWlQlBJt8tOo9fw ytf6hTuleUqrrD3aqfXcj7YQN/LDTbJ0JsHtbVecIb8Pm+t774/V5c8r6R8w67qNtmsN XeJ/ruRWiOB8FIVdEF+lcN2SaacN53P9iLcXu8Oq4VSwlbLVF8t6carJwZK/WbR8hvBw /9Cw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759107884; x=1759712684; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=0YV/C30kHxyQuWB9Hp3Rra2QjX9aAIcMuuehexAQLV8=; b=G0tihyx/d0jffjDdg/wLSyT7Qz6NjFQ8yZDjTIwbreWT/BOrJOb8zTLXzBRMFs7mUV ulHPo/ApcpYlPo/BZHvJJ/3MXFGJULrrgYM0xxmSEksjIhR985Cwxo+/GJjS5HmKOuiI O3iUdtA+4ZikVEJN0AeohDc+2NfCGouMN8fnJllxPxMWbEmo4xGurix7uajQWTGwXje1 stiQy/zMoKK6lUcvAb1PYaOO44tkyz7mW/Rk+Uiir2JU2KfMvcza9MWL1z0PcHjQ37Qz 
mANkURbV5AGcKj+JlHtGLY6EUWMKw6n4tTZzDE+L/+jT2DBTX48tOxrHqRFKGvLbUYwz oNdw== X-Forwarded-Encrypted: i=1; AJvYcCVIjL15TC8hYD9eGvcMNXbUraAMj5a0rNy/EDTTpRSqWX4RPPI8rmsV+vIDyZRBrewhaHhPBmcPgiuGhcU=@vger.kernel.org X-Gm-Message-State: AOJu0YzuEoDHRQ3S/brpBAG2xDv0cv+B3tNgDaP7nY9rkioRQSOU9My2 JSNgOTrnWAgQpT5TTcoxEuWlvbeYZ/S79ffqfmCYKISplhcqKFXoo3ZJqxmiCldETRs= X-Gm-Gg: ASbGnctpWIIW6I4JMMBbS5ktcyBCjsf5neAMkPqG30g1euBsV4814DE2JJt7uuitleW 16uuq4XMBfJOQA71JNLTQTx9D+Vm0XcXlEacs+Gy9gp5NAFz66/xvzBEUdWMw0Pgy0wL7yXNgnN lIU8zDGY++oF4UKPEJp9aSDXpiUpH9RY4ZusuJhFAs46s6bZ+IAm5yCLe+ZHaDlA4kBUe9tR0BW MGhRdZkatlBerBkEa4p5E9fowG1s600q5Yxm6Le0WwhTfFfQrG/sQjTrbHbc4eGSeBn9HyQbryO D4fgqI72NizIyuH6cKvH4uUAN/jsXsEZCmkxFgO7hulxQygxafkYeJ8qVKbrviqcun/BxYSuyJO 0q7Wk6i8CBq1C+3k4MISIltTMEmUMTpofJGrxVlPqYsYldO5codD69nvwoa4SXfCe/WKCpUo0XL h9lPYO9C8Ethg6k6XM9Q== X-Google-Smtp-Source: AGHT+IE/2gkO6BHU1Wstnbl9JT+BQr3NoLV1YqF2Ma44UH3Fg76MALcfhQNt/9B9O7w+tSiqkpudoA== X-Received: by 2002:a05:620a:2805:b0:849:8fcc:69e4 with SMTP id af79cd13be357-85ae8c269e3mr1755065285a.68.1759107827446; Sun, 28 Sep 2025 18:03:47 -0700 (PDT) Received: from soleen.c.googlers.com.com (53.47.86.34.bc.googleusercontent.com. 
[34.86.47.53]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-4db0c0fbe63sm64561521cf.23.2025.09.28.18.03.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 28 Sep 2025 18:03:46 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com Subject: [PATCH v4 11/30] liveupdate: luo_session: Add sessions support Date: Mon, 
29 Sep 2025 01:03:02 +0000 Message-ID: <20250929010321.3462457-12-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com> References: <20250929010321.3462457-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Introduce the concept of "Live Update Sessions" within the LUO framework. LUO sessions provide a mechanism to group and manage `struct file *` instances (representing file descriptors) that need to be preserved across a kexec-based live update. Each session is identified by a unique name and acts as a container for file objects whose state is critical to a userspace workload, such as a virtual machine or a high-performance database, aiming to maintain their functionality across a kernel transition. This groundwork establishes the framework for preserving file-backed state across kernel updates, with the actual file data preservation mechanisms to be implemented in subsequent patches. 
Signed-off-by: Pasha Tatashin --- include/uapi/linux/liveupdate.h | 3 + kernel/liveupdate/Makefile | 1 + kernel/liveupdate/luo_internal.h | 34 ++ kernel/liveupdate/luo_session.c | 607 +++++++++++++++++++++++++++++++ 4 files changed, 645 insertions(+) create mode 100644 kernel/liveupdate/luo_session.c diff --git a/include/uapi/linux/liveupdate.h b/include/uapi/linux/liveupdat= e.h index 3cb09b2c4353..e8c0c210a790 100644 --- a/include/uapi/linux/liveupdate.h +++ b/include/uapi/linux/liveupdate.h @@ -91,4 +91,7 @@ enum liveupdate_event { LIVEUPDATE_CANCEL =3D 3, }; =20 +/* The maximum length of session name including null termination */ +#define LIVEUPDATE_SESSION_NAME_LENGTH 56 + #endif /* _UAPI_LIVEUPDATE_H */ diff --git a/kernel/liveupdate/Makefile b/kernel/liveupdate/Makefile index 2881bab0c6df..f64cfc92cbf0 100644 --- a/kernel/liveupdate/Makefile +++ b/kernel/liveupdate/Makefile @@ -3,6 +3,7 @@ luo-y :=3D \ luo_core.o \ luo_ioctl.o \ + luo_session.o \ luo_subsystems.o =20 obj-$(CONFIG_KEXEC_HANDOVER) +=3D kexec_handover.o diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_inter= nal.h index c62fbbb0790c..9223f71844ca 100644 --- a/kernel/liveupdate/luo_internal.h +++ b/kernel/liveupdate/luo_internal.h @@ -32,6 +32,40 @@ void *luo_contig_alloc_preserve(size_t size); void luo_contig_free_unpreserve(void *mem, size_t size); void luo_contig_free_restore(void *mem, size_t size); =20 +/** + * struct luo_session - Represents an active or incoming Live Update sessi= on. + * @name: A unique name for this session, used for identification and + * retrieval. + * @files_xa: An xarray used to store the files associated with this sess= ion. + * @ser: Pointer to the serialized data for this session. + * @count: A counter tracking the number of files currently stored in = the + * @files_xa for this session. 
+ * @list: A list_head member used to link this session into a global = list + * of either outgoing (to be preserved) or incoming (restored = from + * previous kernel) sessions. + * @retrieved: A boolean flag indicating whether this session has been ret= rieved + * by a consumer in the new kernel. Valid only during the + * LIVEUPDATE_STATE_UPDATED state. + * @mutex: Session lock, protects files_xa, and count. + * @state: State of this session: prepared/frozen/updated/normal. + * @files: The physical address of a contiguous memory block that holds + * the serialized state of files. + */ +struct luo_session { + char name[LIVEUPDATE_SESSION_NAME_LENGTH]; + struct xarray files_xa; + struct luo_session_ser *ser; + long count; + struct list_head list; + bool retrieved; + struct mutex mutex; + enum liveupdate_state state; + u64 files; +}; + +int luo_session_create(const char *name, struct file **filep); +int luo_session_retrieve(const char *name, struct file **filep); + void luo_subsystems_startup(void *fdt); int luo_subsystems_fdt_setup(void *fdt); int luo_do_subsystems_prepare_calls(void); diff --git a/kernel/liveupdate/luo_session.c b/kernel/liveupdate/luo_sessio= n.c new file mode 100644 index 000000000000..74dee42e24b7 --- /dev/null +++ b/kernel/liveupdate/luo_session.c @@ -0,0 +1,607 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +/** + * DOC: LUO Sessions + * + * LUO Sessions provide the core mechanism for grouping and managing file + * descriptors that need to be preserved across a kexec-based live update. + * Each session acts as a named container for a set of file objects, allow= ing + * a userspace agent (e.g., a Live Update Orchestration Daemon) to manage = the + * lifecycle of resources critical to a workload. + * + * Core Concepts: + * + * - Named Containers: Sessions are identified by a unique, user-provided = name, + * which is used for both creation and retrieval. 
+ * + * - Userspace Interface: Session management is driven from userspace via + * ioctls on /dev/liveupdate (e.g., CREATE_SESSION, RETRIEVE_SESSION). + * + * - Serialization: Session metadata is preserved using the KHO framework. + * During the 'prepare' phase, an array of `struct luo_session_ser` is + * allocated and preserved. An FDT node is also created, containing the + * count of sessions and the physical address of this array. + * + * Session Lifecycle and State Management: + * + * 1. Creation: A userspace agent calls `luo_session_create()` to create = a new, + * empty session, receiving a file descriptor handle. This can be done= in + * the NORMAL or UPDATED states. + * + * 2. Name Collision: In the UPDATED state, `luo_session_create()` checks= for + * name conflicts against sessions preserved from the previous kernel = to + * prevent ambiguity. + * + * 3. Preparation (`prepare` callback): When the global LUO PREPARE event= is + * triggered, the list of all created sessions is serialized. The main + * `ser` array is allocated, and each active `struct luo_session` is g= iven + * a direct pointer to its corresponding entry in this array. + * + * 4. Release After Prepare: When a session FD is closed *after* the PREP= ARE + * event, the `.release` handler uses the session's direct pointer to + * `memset(0)` its entry in the `ser` array. This effectively marks the + * session as defunct without needing to resize the already-preserved + * memory. + * + * 5. Boot (`boot` callback): In the new kernel, the FDT is read to locate + * the preserved `ser` array. The metadata (count, physical address) is + * stored in the `luo_session` global. + * + * 6. Lazy Deserialization: The actual `luo_session` list is populated on + * first use (e.g., by `retrieve`, `finish`, or `create`). During this + * process, any zeroed-out entries from step 4 are skipped. + * + * 7. 
Retrieval: The userspace agent calls `luo_session_retrieve()` in th= e new + * kernel to get a new FD handle for a preserved session by its name. + * + * 8. Finalization (`finish` callback): When the global LUO FINISH event = is + * sent, any preserved sessions that were successfully retrieved are m= oved + * to the `luo_session_global` list, making them available for a subse= quent + * live update. Any sessions that were not retrieved are considered st= ale + * and are cleaned up. + */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include +#include +#include +#include +#include +#include +#include +#include "luo_internal.h" + +#define LUO_SESSION_NODE_NAME "luo-session" +#define LUO_SESSION_COMPATIBLE "luo-session-v1" + +/** + * struct luo_session_ser - Represents the serialized metadata for a LUO s= ession. + * @name: The unique name of the session, copied from the `luo_session` + * structure. + * @files: The physical address of a contiguous memory block that holds + * the serialized state of files. + * @pgcnt: The number of pages occupied by the `files` memory block. + * @count: The total number of files that were part of this session duri= ng + * serialization. Used for iteration and validation during + * restoration. + * + * This structure is used to package session-specific metadata for transfer + * between kernels via Kexec Handover. An array of these structures (one p= er + * session) is created and passed to the new kernel, allowing it to recons= truct + * the session context. + * + * If this structure is modified, LUO_SESSION_COMPATIBLE must be updated. + */ +struct luo_session_ser { + char name[LIVEUPDATE_SESSION_NAME_LENGTH]; + u64 files; + u64 pgcnt; + u64 count; +} __packed; + +/** + * struct luo_session_global - Global container for managing LUO sessions. + * @count: The number of sessions currently tracked in the @list. + * @list: The head of the linked list of `struct luo_session` instances. 
+ * @rwsem: A read-write semaphore providing synchronized access to the ses= sion + * list and other fields in this structure. + * @ser: A pointer to the contiguous block of memory holding the seriali= zed + * session data (an array of `struct luo_session_ser`). For `_out`= , this + * is allocated and populated during `prepare`. For `_in`, this po= ints + * to the data restored from the previous kernel. + * @pgcnt: The size, in pages, of the memory block pointed to by @ser. + * @fdt: A pointer to the FDT blob that contains the metadata for this g= roup + * of sessions. This FDT is what is ultimately passed to the paren= t LUO + * subsystem for preservation. + */ +struct luo_session_global { + long count; + struct list_head list; + struct rw_semaphore rwsem; + struct luo_session_ser *ser; + u64 pgcnt; + void *fdt; + long ser_count; +}; + +static struct luo_session_global luo_session_global; + +static struct luo_session *luo_session_alloc(const char *name) +{ + struct luo_session *session =3D kzalloc(sizeof(*session), GFP_KERNEL); + + if (!session) + return NULL; + + strscpy(session->name, name, sizeof(session->name)); + xa_init(&session->files_xa); + session->count =3D 0; + INIT_LIST_HEAD(&session->list); + mutex_init(&session->mutex); + session->state =3D LIVEUPDATE_STATE_NORMAL; + + return session; +} + +static void luo_session_free(struct luo_session *session) +{ + WARN_ON(session->count); + xa_destroy(&session->files_xa); + mutex_destroy(&session->mutex); + kfree(session); +} + +static int luo_session_insert(struct luo_session *session) +{ + struct luo_session *it; + + lockdep_assert_held_write(&luo_session_global.rwsem); + /* + * For a small number of sessions this loop won't hurt performance + * but if we ever start using a lot of sessions, this might + * become a bottleneck during deserialization time, as it would + * cause O(n*n) complexity. 
+ */ + list_for_each_entry(it, &luo_session_global.list, list) { + if (!strncmp(it->name, session->name, sizeof(it->name))) + return -EEXIST; + } + list_add_tail(&session->list, &luo_session_global.list); + luo_session_global.count++; + + return 0; +} + +static void luo_session_remove(struct luo_session *session) +{ + lockdep_assert_held_write(&luo_session_global.rwsem); + list_del(&session->list); + luo_session_global.count--; +} + +/* One session switches from the updated state to normal state */ +static void luo_session_finish_one(struct luo_session *session) +{ +} + +/* Cancel one session from frozen or prepared state, back to normal */ +static void luo_session_cancel_one(struct luo_session *session) +{ +} + +/* One session is changed from normal to prepare state */ +static int luo_session_prepare_one(struct luo_session *session) +{ + return 0; +} + +static int luo_session_release(struct inode *inodep, struct file *filep) +{ + struct luo_session *session =3D filep->private_data; + + scoped_guard(rwsem_read, &luo_session_global.rwsem) { + scoped_guard(mutex, &session->mutex) { + if (session->ser) { + memset(session->ser, 0, + sizeof(struct luo_session_ser)); + } + } + } + + if (session->state =3D=3D LIVEUPDATE_STATE_UPDATED) + luo_session_finish_one(session); + if (session->state =3D=3D LIVEUPDATE_STATE_PREPARED || + session->state =3D=3D LIVEUPDATE_STATE_FROZEN) { + luo_session_cancel_one(session); + } + + scoped_guard(rwsem_write, &luo_session_global.rwsem) + luo_session_remove(session); + luo_session_free(session); + + return 0; +} + +static const struct file_operations luo_session_fops =3D { + .owner =3D THIS_MODULE, + .release =3D luo_session_release, +}; + +static void luo_session_deserialize(void) +{ + static int visited; + int i; + + if (visited) + return; + + guard(rwsem_write)(&luo_session_global.rwsem); + if (visited) + return; + visited++; + for (i =3D 0; i < luo_session_global.ser_count; i++) { + struct luo_session *session; + + /* + * If there is 
no name, this session was removed from + * preservation after prepare. So, skip it. + */ + if (!luo_session_global.ser[i].name[0]) + continue; + + session =3D luo_session_alloc(luo_session_global.ser[i].name); + if (!session) + luo_restore_fail("Failed to allocate session on boot\n"); + + if (luo_session_insert(session)) { + luo_restore_fail("Failed to insert session due to name conflict [%s]\n", + session->name); + } + + session->state =3D LIVEUPDATE_STATE_UPDATED; + session->count =3D luo_session_global.ser[i].count; + session->files =3D luo_session_global.ser[i].files; + } +} + +/* Create a "struct file" for session, and delete it in case of failure */ +static int luo_session_getfile(struct luo_session *session, struct file **= filep) +{ + char name_buf[128]; + struct file *file; + + scoped_guard(mutex, &session->mutex) { + lockdep_assert_held(&session->mutex); + snprintf(name_buf, sizeof(name_buf), "[luo_session] %s", + session->name); + file =3D anon_inode_getfile(name_buf, &luo_session_fops, session, + O_RDWR); + } + if (IS_ERR(file)) { + scoped_guard(rwsem_write, &luo_session_global.rwsem) + luo_session_remove(session); + luo_session_free(session); + return PTR_ERR(file); + } + + *filep =3D file; + return 0; +} + +int luo_session_create(const char *name, struct file **filep) +{ + struct luo_session *session; + int ret; + + guard(rwsem_read)(&luo_state_rwsem); + + /* New sessions cannot be added after prepared state */ + if (!liveupdate_state_normal() && !liveupdate_state_updated()) + return -EAGAIN; + + session =3D luo_session_alloc(name); + if (!session) + return -ENOMEM; + + scoped_guard(rwsem_write, &luo_session_global.rwsem) + ret =3D luo_session_insert(session); + if (ret) { + luo_session_free(session); + return ret; + } + + return luo_session_getfile(session, filep); +} + +int luo_session_retrieve(const char *name, struct file **filep) +{ + struct luo_session *session =3D NULL; + struct luo_session *it; + + guard(rwsem_read)(&luo_state_rwsem); + + /* 
Can only retrieve in the updated state */ + if (!liveupdate_state_updated()) + return -EAGAIN; + + luo_session_deserialize(); + scoped_guard(rwsem_read, &luo_session_global.rwsem) { + list_for_each_entry(it, &luo_session_global.list, list) { + if (!strncmp(it->name, name, sizeof(it->name))) { + session =3D it; + break; + } + } + } + + if (!session) + return -ENOENT; + + scoped_guard(mutex, &session->mutex) { + /* + * Session already retrieved or a session with the same name was + * created during updated state + */ + if (session->retrieved || session->state !=3D LIVEUPDATE_STATE_UPDATED) + return -EADDRINUSE; + + session->retrieved =3D true; + } + + return luo_session_getfile(session, filep); +} + +static void luo_session_global_preserved_cleanup(void) +{ + lockdep_assert_held_write(&luo_session_global.rwsem); + if (luo_session_global.ser && !IS_ERR(luo_session_global.ser)) { + luo_contig_free_unpreserve(luo_session_global.ser, + luo_session_global.pgcnt << PAGE_SHIFT); + } + if (luo_session_global.fdt && !IS_ERR(luo_session_global.fdt)) + luo_contig_free_unpreserve(luo_session_global.fdt, PAGE_SIZE); + + luo_session_global.fdt =3D NULL; + luo_session_global.ser =3D NULL; + luo_session_global.ser_count =3D 0; + luo_session_global.pgcnt =3D 0; +} + +static int luo_session_fdt_setup(void) +{ + u64 ser_pa; + int ret; + + lockdep_assert_held_write(&luo_session_global.rwsem); + luo_session_global.pgcnt =3D DIV_ROUND_UP(luo_session_global.count * + sizeof(struct luo_session_ser), PAGE_SIZE); + + if (luo_session_global.pgcnt > 0) { + size_t ser_size =3D luo_session_global.pgcnt << PAGE_SHIFT; + + luo_session_global.ser =3D luo_contig_alloc_preserve(ser_size); + if (IS_ERR(luo_session_global.ser)) { + ret =3D PTR_ERR(luo_session_global.ser); + goto exit_cleanup; + } + } + + luo_session_global.fdt =3D luo_contig_alloc_preserve(PAGE_SIZE); + if (IS_ERR(luo_session_global.fdt)) { + ret =3D PTR_ERR(luo_session_global.fdt); + goto exit_cleanup; + } + + ret =3D 
fdt_create(luo_session_global.fdt, PAGE_SIZE); + if (ret < 0) + goto exit_cleanup; + + ret =3D fdt_finish_reservemap(luo_session_global.fdt); + if (ret < 0) + goto exit_finish; + + ret =3D fdt_begin_node(luo_session_global.fdt, LUO_SESSION_NODE_NAME); + if (ret < 0) + goto exit_finish; + + ret =3D fdt_property_string(luo_session_global.fdt, "compatible", + LUO_SESSION_COMPATIBLE); + if (ret < 0) + goto exit_end_node; + + ret =3D fdt_property_u64(luo_session_global.fdt, "count", + luo_session_global.count); + if (ret < 0) + goto exit_end_node; + + ser_pa =3D luo_session_global.ser ? __pa(luo_session_global.ser) : 0; + ret =3D fdt_property_u64(luo_session_global.fdt, "data", ser_pa); + if (ret < 0) + goto exit_end_node; + + ret =3D fdt_property_u64(luo_session_global.fdt, "pgcnt", + luo_session_global.pgcnt); + if (ret < 0) + goto exit_end_node; + + ret =3D fdt_end_node(luo_session_global.fdt); + if (ret < 0) + goto exit_finish; + + ret =3D fdt_finish(luo_session_global.fdt); + if (ret < 0) + goto exit_cleanup; + + return 0; + +exit_end_node: + fdt_end_node(luo_session_global.fdt); +exit_finish: + fdt_finish(luo_session_global.fdt); +exit_cleanup: + luo_session_global_preserved_cleanup(); + + return ret; +} + +/* + * Change all sessions to normal state: make every file within each session + * to be in the normal state. 
+ */ +static void luo_session_cancel(struct liveupdate_subsystem *h, u64 data) +{ + struct luo_session *it; + + guard(rwsem_write)(&luo_session_global.rwsem); + list_for_each_entry(it, &luo_session_global.list, list) + luo_session_cancel_one(it); + luo_session_global_preserved_cleanup(); +} + +static int luo_session_prepare(struct liveupdate_subsystem *h, u64 *data) +{ + struct luo_session_ser *ser; + struct luo_session *it; + int ret; + + scoped_guard(rwsem_write, &luo_session_global.rwsem) { + ret =3D luo_session_fdt_setup(); + if (ret) + return ret; + + ser =3D luo_session_global.ser; + list_for_each_entry(it, &luo_session_global.list, list) { + if (it->state =3D=3D LIVEUPDATE_STATE_NORMAL) { + ret =3D luo_session_prepare_one(it); + if (ret) + break; + } + strscpy(ser->name, it->name, sizeof(ser->name)); + ser->count =3D it->count; + ser->files =3D it->files; + it->ser =3D ser; + ser++; + } + + if (!ret) + *data =3D __pa(luo_session_global.fdt); + } + + if (ret) + luo_session_cancel(h, 0); + + return ret; +} + +static int luo_session_freeze(struct liveupdate_subsystem *h, u64 *data) +{ + return 0; +} + +/* + * Finish every file within each session. If session has not been reclaimed + * remove it, otherwise keep this session, so it can participate in the + * next live update. + */ +static void luo_session_finish(struct liveupdate_subsystem *h, u64 data) +{ + struct luo_session *session, *tmp; + + luo_session_deserialize(); + + list_for_each_entry_safe(session, tmp, &luo_session_global.list, list) { + /* + * Skip sessions that were created in new kernel or have been + * finished already. 
+		 */
+		if (session->state != LIVEUPDATE_STATE_UPDATED)
+			continue;
+		luo_session_finish_one(session);
+		if (!session->retrieved) {
+			pr_warn("Removing unreclaimed session[%s]\n",
+				session->name);
+			scoped_guard(rwsem_write, &luo_session_global.rwsem)
+				luo_session_remove(session);
+			luo_session_free(session);
+		}
+	}
+
+	scoped_guard(rwsem_write, &luo_session_global.rwsem)
+		luo_session_global_preserved_cleanup();
+}
+
+static void luo_session_boot(struct liveupdate_subsystem *h, u64 data)
+{
+	u64 count, data_pa, pgcnt;
+	const void *prop;
+	int prop_len;
+	void *fdt;
+
+	fdt = __va(data);
+	if (fdt_node_check_compatible(fdt, 0, LUO_SESSION_COMPATIBLE))
+		luo_restore_fail("luo-session FDT incompatible\n");
+
+	prop = fdt_getprop(fdt, 0, "count", &prop_len);
+	if (!prop || prop_len != sizeof(u64))
+		luo_restore_fail("luo-session FDT missing or invalid 'count'\n");
+	count = be64_to_cpup(prop);
+
+	prop = fdt_getprop(fdt, 0, "data", &prop_len);
+	if (!prop || prop_len != sizeof(u64))
+		luo_restore_fail("luo-session FDT missing or invalid 'data'\n");
+	data_pa = be64_to_cpup(prop);
+
+	prop = fdt_getprop(fdt, 0, "pgcnt", &prop_len);
+	if (!prop || prop_len != sizeof(u64))
+		luo_restore_fail("luo-session FDT missing or invalid 'pgcnt'\n");
+	pgcnt = be64_to_cpup(prop);
+
+	if (!count)
+		return;
+
+	guard(rwsem_write)(&luo_session_global.rwsem);
+	luo_session_global.fdt = fdt;
+	luo_session_global.ser = __va(data_pa);
+	luo_session_global.ser_count = count;
+	luo_session_global.pgcnt = pgcnt;
+}
+
+static const struct liveupdate_subsystem_ops luo_session_subsys_ops = {
+	.prepare = luo_session_prepare,
+	.freeze = luo_session_freeze,
+	.cancel = luo_session_cancel,
+	.boot = luo_session_boot,
+	.finish = luo_session_finish,
+	.owner = THIS_MODULE,
+};
+
+static struct liveupdate_subsystem luo_session_subsys = {
+	.ops = &luo_session_subsys_ops,
+	.name = LUO_SESSION_COMPATIBLE,
+};
+
+static int __init luo_session_startup(void)
+{
+	int ret;
+
+	if (!liveupdate_enabled())
+		return 0;
+
+	init_rwsem(&luo_session_global.rwsem);
+	INIT_LIST_HEAD(&luo_session_global.list);
+
+	ret = liveupdate_register_subsystem(&luo_session_subsys);
+	if (ret) {
+		pr_warn("Failed to register luo_session subsystem [%d]\n", ret);
+		return ret;
+	}
+
+	return ret;
+}
+late_initcall(luo_session_startup);
-- 
2.51.0.536.g15c5d4f767-goog

From nobody Sun Oct 5 07:24:32 2025
From: Pasha Tatashin
To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org,
 bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com
Subject: [PATCH v4 12/30] liveupdate: luo_ioctl: add user interface
Date: Mon, 29 Sep 2025 01:03:03 +0000
Message-ID: <20250929010321.3462457-13-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog
In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com>
References: <20250929010321.3462457-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Introduce the user-space interface for the Live Update Orchestrator
via ioctl commands, enabling external control over the live update
process and management of preserved resources.

A single userspace agent is expected to drive the live update;
therefore, only one process can hold this device open at a time.

The following ioctl commands are introduced:

LIVEUPDATE_IOCTL_GET_STATE
Allows userspace to query the current state of the LUO state machine
(e.g., NORMAL, PREPARED, UPDATED).

LIVEUPDATE_IOCTL_SET_EVENT
Enables userspace to drive the LUO state machine by sending global
events. This includes:
  LIVEUPDATE_PREPARE
  To begin the state-saving process.
  LIVEUPDATE_FINISH
  To signal completion of restoration in the new kernel.
  LIVEUPDATE_CANCEL
  To abort a prepared update.

LIVEUPDATE_IOCTL_CREATE_SESSION
Provides a way for userspace to create a named session for grouping file
descriptors that need to be preserved. It returns a new file descriptor
representing the session.

LIVEUPDATE_IOCTL_RETRIEVE_SESSION
Allows the userspace agent in the new kernel to reclaim a preserved
session by its name, receiving a new file descriptor to manage the
restored resources.

Signed-off-by: Pasha Tatashin
---
 include/uapi/linux/liveupdate.h  | 199 ++++++++++++++++++++++++++++++
 kernel/liveupdate/luo_internal.h |  20 +++
 kernel/liveupdate/luo_ioctl.c    | 201 +++++++++++++++++++++++++++++++
 3 files changed, 420 insertions(+)

diff --git a/include/uapi/linux/liveupdate.h b/include/uapi/linux/liveupdate.h
index e8c0c210a790..2e38ef3094aa 100644
--- a/include/uapi/linux/liveupdate.h
+++ b/include/uapi/linux/liveupdate.h
@@ -14,6 +14,32 @@
 #include
 #include
 
+/**
+ * DOC: General ioctl format
+ *
+ * The ioctl interface follows a general format to allow for extensibility. Each
+ * ioctl is passed in a structure pointer as the argument providing the size of
+ * the structure in the first u32. The kernel checks that any structure space
+ * beyond what it understands is 0. This allows userspace to use the backward
+ * compatible portion while consistently using the newer, larger, structures.
+ *
+ * ioctls use a standard meaning for common errnos:
+ *
+ *  - ENOTTY: The IOCTL number itself is not supported at all
+ *  - E2BIG: The IOCTL number is supported, but the provided structure has
+ *    non-zero in a part the kernel does not understand.
+ *  - EOPNOTSUPP: The IOCTL number is supported, and the structure is
+ *    understood, however a known field has a value the kernel does not
+ *    understand or support.
+ *  - EINVAL: Everything about the IOCTL was understood, but a field is not
+ *    correct.
+ *  - ENOENT: A provided token does not exist.
+ *  - ENOMEM: Out of memory.
+ *  - EOVERFLOW: Mathematics overflowed.
+ *
+ * As well as additional errnos, within specific ioctls.
+ */
+
 /**
  * enum liveupdate_state - Defines the possible states of the live update
  * orchestrator.
@@ -94,4 +120,177 @@ enum liveupdate_event {
 /* The maximum length of session name including null termination */
 #define LIVEUPDATE_SESSION_NAME_LENGTH 56
 
+/* The ioctl type, documented in ioctl-number.rst */
+#define LIVEUPDATE_IOCTL_TYPE		0xBA
+
+/* The /dev/liveupdate ioctl commands */
+enum {
+	LIVEUPDATE_CMD_BASE = 0x00,
+	LIVEUPDATE_CMD_GET_STATE = LIVEUPDATE_CMD_BASE,
+	LIVEUPDATE_CMD_SET_EVENT = 0x01,
+	LIVEUPDATE_CMD_CREATE_SESSION = 0x02,
+	LIVEUPDATE_CMD_RETRIEVE_SESSION = 0x03,
+};
+
+/**
+ * struct liveupdate_ioctl_get_state - ioctl(LIVEUPDATE_IOCTL_GET_STATE)
+ * @size:  Input; sizeof(struct liveupdate_ioctl_get_state)
+ * @state: Output; The current live update state.
+ *
+ * Query the current state of the live update orchestrator.
+ *
+ * The kernel fills the @state with the current
+ * state of the live update subsystem. Possible states are:
+ *
+ * - %LIVEUPDATE_STATE_NORMAL:   Default state; no live update operation is
+ *                               currently in progress.
+ * - %LIVEUPDATE_STATE_PREPARED: The preparation phase (triggered by
+ *                               %LIVEUPDATE_PREPARE) has completed
+ *                               successfully. The system is ready for the
+ *                               reboot transition. Note that some
+ *                               device operations (e.g., unbinding, new DMA
+ *                               mappings) might be restricted in this state.
+ * - %LIVEUPDATE_STATE_UPDATED:  The system has successfully rebooted into the
+ *                               new kernel via live update. It is now running
+ *                               the new kernel code and is awaiting the
+ *                               completion signal from user space via
+ *                               %LIVEUPDATE_FINISH after restoration tasks are
+ *                               done.
+ *
+ * See the definition of &enum liveupdate_state for more details on each state.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+struct liveupdate_ioctl_get_state {
+	__u32 size;
+	__u32 state;
+};
+
+#define LIVEUPDATE_IOCTL_GET_STATE					\
+	_IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_GET_STATE)
+
+/**
+ * struct liveupdate_ioctl_set_event - ioctl(LIVEUPDATE_IOCTL_SET_EVENT)
+ * @size:  Input; sizeof(struct liveupdate_ioctl_set_event)
+ * @event: Input; The live update event.
+ *
+ * Notify the live update orchestrator about a global event that causes a
+ * state transition.
+ *
+ * The event can be one of the following:
+ *
+ * - %LIVEUPDATE_PREPARE: Initiates the live update preparation phase. This
+ *                        typically triggers the saving process for items marked
+ *                        via the PRESERVE ioctls. This typically occurs
+ *                        *before* the "blackout window", while user
+ *                        applications (e.g., VMs) may still be running. Kernel
+ *                        subsystems receiving the %LIVEUPDATE_PREPARE event
+ *                        should serialize necessary state. This command does
+ *                        not transfer data.
+ * - %LIVEUPDATE_FINISH:  Signal restoration completion and trigger cleanup.
+ *
+ *                        Signals that user space has completed all necessary
+ *                        restoration actions in the new kernel (after a live
+ *                        update reboot). Calling this ioctl triggers the
+ *                        cleanup phase: any resources that were successfully
+ *                        preserved but were *not* subsequently restored
+ *                        (reclaimed) via the RESTORE ioctls will have their
+ *                        preserved state discarded and associated kernel
+ *                        resources released. Involved devices may be reset. All
+ *                        desired restorations *must* be completed *before*
+ *                        this. Kernel callbacks for the %LIVEUPDATE_FINISH
+ *                        event must not fail. Successfully completing this
+ *                        phase transitions the system state from
+ *                        %LIVEUPDATE_STATE_UPDATED back to
+ *                        %LIVEUPDATE_STATE_NORMAL. This command does
+ *                        not transfer data.
+ * - %LIVEUPDATE_CANCEL:  Cancel the live update preparation phase.
+ *
+ *                        Notifies the live update subsystem to abort the
+ *                        preparation sequence potentially initiated by
+ *                        %LIVEUPDATE_PREPARE event.
+ *
+ *                        When triggered, subsystems receiving the
+ *                        %LIVEUPDATE_CANCEL event should revert any state
+ *                        changes or actions taken specifically for the aborted
+ *                        prepare phase (e.g., discard partially serialized
+ *                        state). The kernel releases resources allocated
+ *                        specifically for this *aborted preparation attempt*.
+ *
+ *                        This operation cancels the current *attempt* to
+ *                        prepare for a live update but does **not** remove
+ *                        previously validated items from the internal list
+ *                        of potentially preservable resources.
+ *
+ *                        This command does not transfer data. Kernel callbacks
+ *                        for the %LIVEUPDATE_CANCEL event must not fail.
+ *
+ * See the definition of &enum liveupdate_event for more details on each event.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+struct liveupdate_ioctl_set_event {
+	__u32 size;
+	__u32 event;
+};
+
+#define LIVEUPDATE_IOCTL_SET_EVENT					\
+	_IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_SET_EVENT)
+
+/**
+ * struct liveupdate_ioctl_create_session - ioctl(LIVEUPDATE_IOCTL_CREATE_SESSION)
+ * @size: Input; sizeof(struct liveupdate_ioctl_create_session)
+ * @fd:   Output; The new file descriptor for the created session.
+ * @name: Input; A null-terminated string for the session name, max
+ *        length %LIVEUPDATE_SESSION_NAME_LENGTH including termination
+ *        char.
+ *
+ * Creates a new live update session for managing preserved resources.
+ * This ioctl can only be called on the main /dev/liveupdate device.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+struct liveupdate_ioctl_create_session {
+	__u32 size;
+	__s32 fd;
+	__u8 name[LIVEUPDATE_SESSION_NAME_LENGTH];
+};
+
+#define LIVEUPDATE_IOCTL_CREATE_SESSION					\
+	_IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_CREATE_SESSION)
+
+/**
+ * struct liveupdate_ioctl_retrieve_session - ioctl(LIVEUPDATE_IOCTL_RETRIEVE_SESSION)
+ * @size: Input; sizeof(struct liveupdate_ioctl_retrieve_session)
+ * @fd:   Output; The new file descriptor for the retrieved session.
+ * @name: Input; A null-terminated string identifying the session to retrieve.
+ *        The name must exactly match the name used when the session was
+ *        created in the previous kernel.
+ *
+ * Retrieves a handle (a new file descriptor) for a preserved session by its
+ * name. This is the primary mechanism for a userspace agent to regain control
+ * of its preserved resources after a live update.
+ *
+ * The userspace application provides the null-terminated `name` of a session
+ * it created before the live update. If a preserved session with a matching
+ * name is found, the kernel instantiates it and returns a new file descriptor
+ * in the `fd` field. This new session FD can then be used for all file-specific
+ * operations, such as restoring individual file descriptors with
+ * LIVEUPDATE_SESSION_RESTORE_FD.
+ *
+ * It is the responsibility of the userspace application to know the names of
+ * the sessions it needs to retrieve. If no session with the given name is
+ * found, the ioctl will fail with -ENOENT.
+ *
+ * This ioctl can only be called on the main /dev/liveupdate device when the
+ * system is in the LIVEUPDATE_STATE_UPDATED state.
+ */
+struct liveupdate_ioctl_retrieve_session {
+	__u32 size;
+	__s32 fd;
+	__u8 name[64];
+};
+
+#define LIVEUPDATE_IOCTL_RETRIEVE_SESSION				\
+	_IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_RETRIEVE_SESSION)
 #endif /* _UAPI_LIVEUPDATE_H */
diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_internal.h
index 9223f71844ca..a14e0b685ccb 100644
--- a/kernel/liveupdate/luo_internal.h
+++ b/kernel/liveupdate/luo_internal.h
@@ -17,6 +17,26 @@
  */
 #define luo_restore_fail(__fmt, ...) panic(__fmt, ##__VA_ARGS__)
 
+struct luo_ucmd {
+	void __user *ubuffer;
+	u32 user_size;
+	void *cmd;
+};
+
+static inline int luo_ucmd_respond(struct luo_ucmd *ucmd,
+				   size_t kernel_cmd_size)
+{
+	/*
+	 * Copy the minimum of what the user provided and what we actually
+	 * have.
+	 */
+	if (copy_to_user(ucmd->ubuffer, ucmd->cmd,
+			 min_t(size_t, ucmd->user_size, kernel_cmd_size))) {
+		return -EFAULT;
+	}
+	return 0;
+}
+
 int luo_cancel(void);
 int luo_prepare(void);
 int luo_freeze(void);
diff --git a/kernel/liveupdate/luo_ioctl.c b/kernel/liveupdate/luo_ioctl.c
index fc2afb450ad5..01ccb8a6d3f4 100644
--- a/kernel/liveupdate/luo_ioctl.c
+++ b/kernel/liveupdate/luo_ioctl.c
@@ -5,6 +5,25 @@
  * Pasha Tatashin
  */
 
+/**
+ * DOC: LUO ioctl Interface
+ *
+ * The IOCTL user-space control interface for the LUO subsystem.
+ * It registers a character device, typically found at ``/dev/liveupdate``,
+ * which allows a userspace agent to manage the LUO state machine and its
+ * associated resources, such as preservable file descriptors.
+ *
+ * To ensure that the state machine is controlled by a single entity, access
+ * to this device is exclusive: only one process is permitted to have
+ * ``/dev/liveupdate`` open at any given time. Subsequent open attempts will
+ * fail with -EBUSY until the first process closes its file descriptor.
+ * This singleton model simplifies state management by preventing conflicting
+ * commands from multiple userspace agents.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include
 #include
 #include
 #include
@@ -19,10 +38,191 @@
 
 struct luo_device_state {
 	struct miscdevice miscdev;
+	atomic_t in_use;
+};
+
+static int luo_ioctl_get_state(struct luo_ucmd *ucmd)
+{
+	struct liveupdate_ioctl_get_state *argp = ucmd->cmd;
+
+	argp->state = liveupdate_get_state();
+
+	return luo_ucmd_respond(ucmd, sizeof(*argp));
+}
+
+static int luo_ioctl_set_event(struct luo_ucmd *ucmd)
+{
+	struct liveupdate_ioctl_set_event *argp = ucmd->cmd;
+	int ret;
+
+	switch (argp->event) {
+	case LIVEUPDATE_PREPARE:
+		ret = luo_prepare();
+		break;
+	case LIVEUPDATE_FINISH:
+		ret = luo_finish();
+		break;
+	case LIVEUPDATE_CANCEL:
+		ret = luo_cancel();
+		break;
+	default:
+		ret = -EOPNOTSUPP;
+	}
+
+	return ret;
+}
+
+static int luo_ioctl_create_session(struct luo_ucmd *ucmd)
+{
+	struct liveupdate_ioctl_create_session *argp = ucmd->cmd;
+	struct file *file;
+	int ret;
+
+	argp->fd = get_unused_fd_flags(O_CLOEXEC);
+	if (argp->fd < 0)
+		return argp->fd;
+
+	ret = luo_session_create(argp->name, &file);
+	if (ret)
+		return ret;
+
+	ret = luo_ucmd_respond(ucmd, sizeof(*argp));
+	if (ret) {
+		fput(file);
+		put_unused_fd(argp->fd);
+		return ret;
+	}
+
+	fd_install(argp->fd, file);
+
+	return 0;
+}
+
+static int luo_ioctl_retrieve_session(struct luo_ucmd *ucmd)
+{
+	struct liveupdate_ioctl_retrieve_session *argp = ucmd->cmd;
+	struct file *file;
+	int ret;
+
+	argp->fd = get_unused_fd_flags(O_CLOEXEC);
+	if (argp->fd < 0)
+		return argp->fd;
+
+	ret = luo_session_retrieve(argp->name, &file);
+	if (ret < 0) {
+		put_unused_fd(argp->fd);
+
+		return ret;
+	}
+
+	ret = luo_ucmd_respond(ucmd, sizeof(*argp));
+	if (ret) {
+		fput(file);
+		put_unused_fd(argp->fd);
+		return ret;
+	}
+
+	fd_install(argp->fd, file);
+
+	return 0;
+}
+
+static int luo_open(struct inode *inodep, struct file *filep)
+{
+	struct luo_device_state *ldev = container_of(filep->private_data,
+						     struct luo_device_state,
+						     miscdev);
+
+	if (atomic_cmpxchg(&ldev->in_use, 0, 1))
+		return -EBUSY;
+
+	return 0;
+}
+
+static int luo_release(struct inode *inodep, struct file *filep)
+{
+	struct luo_device_state *ldev = container_of(filep->private_data,
+						     struct luo_device_state,
+						     miscdev);
+	atomic_set(&ldev->in_use, 0);
+
+	return 0;
+}
+
+union ucmd_buffer {
+	struct liveupdate_ioctl_create_session create;
+	struct liveupdate_ioctl_get_state state;
+	struct liveupdate_ioctl_retrieve_session retrieve;
+	struct liveupdate_ioctl_set_event event;
 };
 
+struct luo_ioctl_op {
+	unsigned int size;
+	unsigned int min_size;
+	unsigned int ioctl_num;
+	int (*execute)(struct luo_ucmd *ucmd);
+};
+
+#define IOCTL_OP(_ioctl, _fn, _struct, _last)				\
+	[_IOC_NR(_ioctl) - LIVEUPDATE_CMD_BASE] = {			\
+		.size = sizeof(_struct) +				\
+			BUILD_BUG_ON_ZERO(sizeof(union ucmd_buffer) <	\
+					  sizeof(_struct)),		\
+		.min_size = offsetofend(_struct, _last),		\
+		.ioctl_num = _ioctl,					\
+		.execute = _fn,						\
+	}
+
+static const struct luo_ioctl_op luo_ioctl_ops[] = {
+	IOCTL_OP(LIVEUPDATE_IOCTL_CREATE_SESSION, luo_ioctl_create_session,
+		 struct liveupdate_ioctl_create_session, name),
+	IOCTL_OP(LIVEUPDATE_IOCTL_GET_STATE, luo_ioctl_get_state,
+		 struct liveupdate_ioctl_get_state, state),
+	IOCTL_OP(LIVEUPDATE_IOCTL_RETRIEVE_SESSION, luo_ioctl_retrieve_session,
+		 struct liveupdate_ioctl_retrieve_session, name),
+	IOCTL_OP(LIVEUPDATE_IOCTL_SET_EVENT, luo_ioctl_set_event,
+		 struct liveupdate_ioctl_set_event, event),
+};
+
+static long luo_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
+{
+	const struct luo_ioctl_op *op;
+	struct luo_ucmd ucmd = {};
+	union ucmd_buffer buf;
+	unsigned int nr;
+	int ret;
+
+	nr = _IOC_NR(cmd);
+	if (nr < LIVEUPDATE_CMD_BASE ||
+	    (nr - LIVEUPDATE_CMD_BASE) >= ARRAY_SIZE(luo_ioctl_ops)) {
+		return -EINVAL;
+	}
+
+	ucmd.ubuffer = (void __user *)arg;
+	ret = get_user(ucmd.user_size, (u32 __user *)ucmd.ubuffer);
+	if (ret)
+		return ret;
+
+	op = &luo_ioctl_ops[nr - LIVEUPDATE_CMD_BASE];
+	if (op->ioctl_num != cmd)
+		return -ENOIOCTLCMD;
+	if (ucmd.user_size < op->min_size)
+		return -EINVAL;
+
+	ucmd.cmd = &buf;
+	ret = copy_struct_from_user(ucmd.cmd, op->size, ucmd.ubuffer,
+				    ucmd.user_size);
+	if (ret)
+		return ret;
+
+	return op->execute(&ucmd);
+}
+
 static const struct file_operations luo_fops = {
 	.owner = THIS_MODULE,
+	.open = luo_open,
+	.release = luo_release,
+	.unlocked_ioctl = luo_ioctl,
 };
 
 static struct luo_device_state luo_dev = {
@@ -31,6 +231,7 @@ static struct luo_device_state luo_dev = {
 		.name = "liveupdate",
 		.fops = &luo_fops,
 	},
+	.in_use = ATOMIC_INIT(0),
 };
 
 static int __init liveupdate_init(void)
-- 
2.51.0.536.g15c5d4f767-goog

From nobody Sun Oct 5 07:24:32 2025
From: Pasha Tatashin
To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com
Subject: [PATCH v4 13/30] liveupdate: luo_file: implement file systems callbacks
Date: Mon, 29 Sep 2025 01:03:04 +0000 Message-ID: <20250929010321.3462457-14-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com> References: <20250929010321.3462457-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Implements the core logic within luo_file.c to invoke the prepare, reboot, finish, and cancel callbacks for preserved file instances, replacing the previous stub implementations. It also handles the persistence and retrieval of the u64 data payload associated with each file via the LUO FDT. Signed-off-by: Pasha Tatashin --- include/linux/liveupdate.h | 79 ++++ kernel/liveupdate/Makefile | 1 + kernel/liveupdate/luo_file.c | 599 +++++++++++++++++++++++++++++++ kernel/liveupdate/luo_internal.h | 14 + 4 files changed, 693 insertions(+) create mode 100644 kernel/liveupdate/luo_file.c diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h index 4c378a986cfe..c0a7f8c40719 100644 --- a/include/linux/liveupdate.h +++ b/include/linux/liveupdate.h @@ -13,6 +13,72 @@ #include =20 struct liveupdate_subsystem; +struct liveupdate_file_handler; +struct file; + +/** + * struct liveupdate_file_ops - Callbacks for live-updatable files. + * @prepare: Optional. Saves state for a specific file instance @fil= e, + * before update, potentially returning value via @data. + * Returns 0 on success, negative errno on failure. + * @freeze: Optional. Performs final actions just before kernel + * transition, potentially reading/updating the handle via + * @data. + * Returns 0 on success, negative errno on failure. + * @cancel: Optional. Cleans up state/resources if update is aborted + * after prepare/freeze succeeded, using the @data handle = (by + * value) from the successful prepare. 
Returns void. + * @finish: Optional. Performs final cleanup in the new kernel usin= g the + * preserved @data handle (by value). Returns void. + * @retrieve: Retrieve the preserved file. Must be called before fini= sh. + * @can_preserve: callback to determine if @file can be preserved by this + * handler. + * Return bool (true if preservable, false otherwise). + * @owner: Module reference + */ +struct liveupdate_file_ops { + int (*prepare)(struct liveupdate_file_handler *handler, + struct file *file, u64 *data); + int (*freeze)(struct liveupdate_file_handler *handler, + struct file *file, u64 *data); + void (*cancel)(struct liveupdate_file_handler *handler, + struct file *file, u64 data); + void (*finish)(struct liveupdate_file_handler *handler, + struct file *file, u64 data, bool reclaimed); + int (*retrieve)(struct liveupdate_file_handler *handler, + u64 data, struct file **file); + bool (*can_preserve)(struct liveupdate_file_handler *handler, + struct file *file); + struct module *owner; +}; + +/* The max size is set so it can be reliably used during serialization */ +#define LIVEUPDATE_HNDL_COMPAT_LENGTH 48 + +/** + * struct liveupdate_file_handler - Represents a handler for a live-updata= ble + * file type. + * @ops: Callback functions + * @compatible: The compatibility string (e.g., "memfd-v1", "vfiofd-v1") + * that uniquely identifies the file type this handler sup= ports. + * This is matched against the compatible string associate= d with + * individual &struct liveupdate_file instances. + * @list: used for linking this handler instance into a global li= st of + * registered file handlers. + * @count: Atomic counter of number of files that are preserved an= d use + * this handler. + * + * Modules that want to support live update for specific file types should + * register an instance of this structure. 
LUO uses this registration to + * determine if a given file can be preserved and to find the appropriate + * operations to manage its state across the update. + */ +struct liveupdate_file_handler { + const struct liveupdate_file_ops *ops; + const char compatible[LIVEUPDATE_HNDL_COMPAT_LENGTH]; + struct list_head list; + atomic_t count; +}; =20 /** * struct liveupdate_subsystem_ops - LUO events callback functions @@ -83,6 +149,9 @@ int liveupdate_register_subsystem(struct liveupdate_subs= ystem *h); int liveupdate_unregister_subsystem(struct liveupdate_subsystem *h); int liveupdate_get_subsystem_data(struct liveupdate_subsystem *h, u64 *dat= a); =20 +int liveupdate_register_file_handler(struct liveupdate_file_handler *h); +int liveupdate_unregister_file_handler(struct liveupdate_file_handler *h); + #else /* CONFIG_LIVEUPDATE */ =20 static inline int liveupdate_reboot(void) @@ -126,5 +195,15 @@ static inline int liveupdate_get_subsystem_data(struct= liveupdate_subsystem *h, return -ENODATA; } =20 +static inline int liveupdate_register_file_handler(struct liveupdate_file_= handler *h) +{ + return 0; +} + +static inline int liveupdate_unregister_file_handler(struct liveupdate_fil= e_handler *h) +{ + return 0; +} + #endif /* CONFIG_LIVEUPDATE */ #endif /* _LINUX_LIVEUPDATE_H */ diff --git a/kernel/liveupdate/Makefile b/kernel/liveupdate/Makefile index f64cfc92cbf0..282d36a18993 100644 --- a/kernel/liveupdate/Makefile +++ b/kernel/liveupdate/Makefile @@ -2,6 +2,7 @@ =20 luo-y :=3D \ luo_core.o \ + luo_file.o \ luo_ioctl.o \ luo_session.o \ luo_subsystems.o diff --git a/kernel/liveupdate/luo_file.c b/kernel/liveupdate/luo_file.c new file mode 100644 index 000000000000..69f3acf90da5 --- /dev/null +++ b/kernel/liveupdate/luo_file.c @@ -0,0 +1,599 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright (c) 2025, Google LLC. 
+ * Pasha Tatashin + */ + +/** + * DOC: LUO file descriptors + * + * LUO provides the infrastructure necessary to preserve + * specific types of stateful file descriptors across a kernel live + * update transition. The primary goal is to allow workloads, such as virt= ual + * machines using vfio, memfd, or iommufd to retain access to their essent= ial + * resources without interruption after the underlying kernel is updated. + * + * The framework operates based on handler registration and instance track= ing: + * + * 1. Handler Registration: Kernel modules responsible for specific file + * types (e.g., memfd, vfio) register a &struct liveupdate_file_handler + * handler. This handler contains callbacks + * (&liveupdate_file_handler.ops->prepare, + * &liveupdate_file_handler.ops->freeze, + * &liveupdate_file_handler.ops->finish, etc.) and a unique 'compatible' s= tring + * identifying the file type. Registration occurs via + * liveupdate_register_file_handler(). + * + * 2. File Instance Tracking: When a potentially preservable file needs to= be + * managed for live update, the core LUO logic (luo_preserve_file()) finds= a + * compatible registered handler using its + * &liveupdate_file_handler.ops->can_preserve callback. If found, an inte= rnal + * &struct luo_file instance is created, assigned a unique u64 'token', and + * added to a list. + * + * 3. State Persistence ... + */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "luo_internal.h" + +/* Registered files. */ +static DECLARE_RWSEM(luo_file_handler_list_rwsem); +static LIST_HEAD(luo_file_handler_list); + +/** + * struct luo_file_ser - Represents the serialized preserved files. + * @compatible: File handler compatible string. + * @data: Private data + * @token: User provided token for this file + * + * If this structure is modified, LUO_SESSION_COMPATIBLE must be updated. 
+ */ +struct luo_file_ser { + char compatible[LIVEUPDATE_HNDL_COMPAT_LENGTH]; + u64 data; + u64 token; +} __packed; + +/** + * struct luo_file - Represents a file descriptor instance preserved + * across live update. + * @fh: Pointer to the &struct liveupdate_file_handler containi= ng + * the implementation of prepare, freeze, cancel, and fini= sh + * operations specific to this file's type. + * @file: A pointer to the kernel's &struct file object represent= ing + * the open file descriptor that is being preserved. + * @private_data: Internal storage used by the live update core framework + * between phases. + * @reclaimed: Flag indicating whether this preserved file descriptor = has + * been successfully 'reclaimed' (e.g., requested via an i= octl) + * by user-space or the owning kernel subsystem in the new + * kernel after the live update. + * @state: The current state of file descriptor, it is allowed to + * prepare, freeze, and finish FDs before the global state + * switch. + * @mutex: Lock to protect FD state, and allow independently to ch= ange + * the FD state compared to global state. + * + * This structure holds the necessary callbacks and context for managing a + * specific open file descriptor throughout the different phases of a live + * update process. Instances of this structure are typically allocated, + * populated with file-specific details (&file, &arg, callbacks, compatibi= lity + * string, token), and linked into a central list managed by the LUO. The + * private_data field is used internally by the core logic to store state + * between phases. 
+ */ +struct luo_file { + struct liveupdate_file_handler *fh; + struct file *file; + u64 private_data; + bool reclaimed; + enum liveupdate_state state; + struct mutex mutex; +}; + +static int luo_file_prepare_one(struct luo_file *h) +{ + int ret =3D 0; + + guard(mutex)(&h->mutex); + if (h->state =3D=3D LIVEUPDATE_STATE_NORMAL) { + if (h->fh->ops->prepare) { + ret =3D h->fh->ops->prepare(h->fh, h->file, + &h->private_data); + } + if (!ret) + h->state =3D LIVEUPDATE_STATE_PREPARED; + } else { + WARN_ON_ONCE(h->state !=3D LIVEUPDATE_STATE_PREPARED && + h->state !=3D LIVEUPDATE_STATE_FROZEN); + } + + return ret; +} + +static int luo_file_freeze_one(struct luo_file *h) +{ + int ret =3D 0; + + guard(mutex)(&h->mutex); + if (h->state =3D=3D LIVEUPDATE_STATE_PREPARED) { + if (h->fh->ops->freeze) { + ret =3D h->fh->ops->freeze(h->fh, h->file, + &h->private_data); + } + if (!ret) + h->state =3D LIVEUPDATE_STATE_FROZEN; + } else { + WARN_ON_ONCE(h->state !=3D LIVEUPDATE_STATE_FROZEN); + } + + return ret; +} + +static void luo_file_finish_one(struct luo_file *h) +{ + guard(mutex)(&h->mutex); + if (h->state =3D=3D LIVEUPDATE_STATE_UPDATED) { + if (h->fh->ops->finish) { + h->fh->ops->finish(h->fh, h->file, h->private_data, + h->reclaimed); + } + h->state =3D LIVEUPDATE_STATE_NORMAL; + } else { + WARN_ON_ONCE(h->state !=3D LIVEUPDATE_STATE_NORMAL); + } +} + +static void luo_file_cancel_one(struct luo_file *h) +{ + guard(mutex)(&h->mutex); + if (h->state =3D=3D LIVEUPDATE_STATE_NORMAL) + return; + + if (WARN_ON_ONCE(h->state !=3D LIVEUPDATE_STATE_PREPARED && + h->state !=3D LIVEUPDATE_STATE_FROZEN)) { + return; + } + + if (h->fh->ops->cancel) + h->fh->ops->cancel(h->fh, h->file, h->private_data); + + h->private_data =3D 0; + h->state =3D LIVEUPDATE_STATE_NORMAL; +} + +static void __luo_file_cancel(struct luo_session *session) +{ + unsigned long token; + struct luo_file *h; + + xa_for_each(&session->files_xa, token, h) + luo_file_cancel_one(h); +} + +int luo_file_prepare(struct 
luo_session *session) +{ + struct luo_file *luo_file; + struct luo_file_ser *ser; + unsigned long token; + size_t ser_size; + int ret =3D 0; + int i; + + if (!session->count) + return 0; + + ser_size =3D session->count * sizeof(struct luo_file_ser); + ser =3D luo_contig_alloc_preserve(ser_size); + if (IS_ERR(ser)) + return PTR_ERR(ser); + + i =3D 0; + xa_for_each(&session->files_xa, token, luo_file) { + ret =3D luo_file_prepare_one(luo_file); + if (ret < 0) { + pr_err("Prepare failed for session[%s] token[%#0llx] handler[%s] ret[%d= ]\n", + session->name, (u64)token, luo_file->fh->compatible, ret); + goto exit_cleanup; + } + + strscpy(ser[i].compatible, luo_file->fh->compatible, + sizeof(ser[i].compatible)); + ser[i].data =3D luo_file->private_data; + ser[i].token =3D token; + i++; + } + + session->files =3D __pa(ser); + + return 0; + +exit_cleanup: + __luo_file_cancel(session); + luo_contig_free_unpreserve(ser, ser_size); + + return ret; +} + +int luo_file_freeze(struct luo_session *session) +{ + struct luo_file *luo_file; + struct luo_file_ser *ser; + unsigned long token; + size_t ser_size; + int ret =3D 0; + int i; + + if (!session->count) + return 0; + + if (WARN_ON(!session->files)) + return -EINVAL; + + ser =3D __va(session->files); + + i =3D 0; + xa_for_each(&session->files_xa, token, luo_file) { + ret =3D luo_file_freeze_one(luo_file); + if (ret < 0) { + pr_err("Freeze failed for session[%s] token[%#0llx] handler[%s] ret[%d]= \n", + session->name, (u64)token, luo_file->fh->compatible, ret); + goto exit_cleanup; + } + ser[i].data =3D luo_file->private_data; + i++; + } + + return 0; + +exit_cleanup: + __luo_file_cancel(session); + ser_size =3D session->count * sizeof(struct luo_file_ser); + luo_contig_free_unpreserve(ser, ser_size); + + return ret; +} + +void luo_file_finish(struct luo_session *session) +{ + struct luo_file *luo_file; + struct luo_file_ser *ser; + unsigned long token; + size_t ser_size; + + if (!session->count) + return; + + 
xa_for_each(&session->files_xa, token, luo_file) + luo_file_finish_one(luo_file); + + ser_size =3D session->count * sizeof(struct luo_file_ser); + ser =3D __va(session->files); + luo_contig_free_restore(ser, ser_size); +} + +void luo_file_cancel(struct luo_session *session) +{ + struct luo_file_ser *ser; + size_t ser_size; + + if (!session->count) + return; + + __luo_file_cancel(session); + + if (session->files) { + ser =3D __va(session->files); + ser_size =3D session->count * sizeof(struct luo_file_ser); + luo_contig_free_unpreserve(ser, ser_size); + session->files =3D 0; + } +} + +void luo_file_deserialize(struct luo_session *session) +{ + struct luo_file_ser *ser; + u64 i; + + if (!session->files) + return; + + guard(rwsem_read)(&luo_file_handler_list_rwsem); + ser =3D __va(session->files); + for (i =3D 0; i < session->count; i++) { + struct liveupdate_file_handler *fh; + bool handler_found =3D false; + struct luo_file *luo_file; + int ret; + + if (xa_load(&session->files_xa, ser[i].token)) { + luo_restore_fail("Duplicate token %llu found in incoming FDT for file d= escriptors.\n", + ser[i].token); + } + + list_for_each_entry(fh, &luo_file_handler_list, list) { + if (!strcmp(fh->compatible, ser[i].compatible)) { + handler_found =3D true; + break; + } + } + + if (!handler_found) { + luo_restore_fail("No registered handler for compatible '%s'\n", + ser[i].compatible); + } + + luo_file =3D kzalloc(sizeof(*luo_file), + GFP_KERNEL | __GFP_NOFAIL); + luo_file->fh =3D fh; + luo_file->file =3D NULL; + luo_file->private_data =3D ser[i].data; + luo_file->reclaimed =3D false; + mutex_init(&luo_file->mutex); + luo_file->state =3D LIVEUPDATE_STATE_UPDATED; + ret =3D xa_err(xa_store(&session->files_xa, ser[i].token, + luo_file, GFP_KERNEL | __GFP_NOFAIL)); + if (ret < 0) { + luo_restore_fail("Failed to store luo_file for token %llu in XArray: %d= \n", + ser[i].token, ret); + } + } +} + +/** + * luo_preserve_file - Register a file descriptor for live update manageme= nt. 
+ * @token: Token value for this file descriptor. + * @fd: file descriptor to be preserved. + * + * Context: Must be called when LUO is in 'normal' state. + * + * Return: 0 on success. Negative errno on failure. + */ +int luo_preserve_file(struct luo_session *session, u64 token, int fd) +{ + struct liveupdate_file_handler *fh; + struct luo_file *luo_file; + bool found =3D false; + int ret =3D -ENOENT; + struct file *file; + + file =3D fget(fd); + if (!file) { + pr_err("Bad file descriptor\n"); + return -EBADF; + } + + guard(rwsem_read)(&luo_file_handler_list_rwsem); + list_for_each_entry(fh, &luo_file_handler_list, list) { + if (fh->ops->can_preserve(fh, file)) { + found =3D true; + break; + } + } + + if (!found) + goto exit_cleanup; + + luo_file =3D kzalloc(sizeof(*luo_file), GFP_KERNEL); + if (!luo_file) { + ret =3D -ENOMEM; + goto exit_cleanup; + } + + luo_file->private_data =3D 0; + luo_file->reclaimed =3D false; + + luo_file->file =3D file; + luo_file->fh =3D fh; + mutex_init(&luo_file->mutex); + luo_file->state =3D LIVEUPDATE_STATE_NORMAL; + + if (xa_load(&session->files_xa, token)) { + ret =3D -EEXIST; + pr_warn("Token %llu is already taken\n", token); + mutex_destroy(&luo_file->mutex); + kfree(luo_file); + goto exit_cleanup; + } + + ret =3D xa_err(xa_store(&session->files_xa, token, luo_file, + GFP_KERNEL)); + if (ret < 0) { + pr_warn("Failed to store file for token %llu in XArray: %d\n", + token, ret); + mutex_destroy(&luo_file->mutex); + kfree(luo_file); + goto exit_cleanup; + } + atomic_inc(&luo_file->fh->count); + session->count++; + +exit_cleanup: + if (ret) + fput(file); + + return ret; +} + +/** + * luo_unpreserve_file - Unregister a file instance using its token. + * @token: The unique token of the file instance to unregister. + * + * Finds the &struct luo_file associated with the @token in the + * global list and removes it. 
This function *only* removes the entry from= the + * list; it does *not* free the memory allocated for the &struct luo_file + * itself. The caller is responsible for freeing the structure after this + * function returns successfully. + * + * Context: Can be called when a preserved file descriptor is closed or + * no longer needs live update management. + * + * Return: 0 on success. Negative errno on failure. + */ +int luo_unpreserve_file(struct luo_session *session, u64 token) +{ + struct luo_file *luo_file; + + luo_file =3D xa_erase(&session->files_xa, token); + if (!luo_file) + return -ENOENT; + + if (luo_file->file) + fput(luo_file->file); + mutex_destroy(&luo_file->mutex); + scoped_guard(rwsem_read, &luo_file_handler_list_rwsem) + atomic_dec(&luo_file->fh->count); + kfree(luo_file); + session->count--; + + return 0; +} + +/** + * luo_retrieve_file - Find a registered file instance by its token. + * @token: The unique token of the file instance to retrieve. + * @filep: Output parameter. On success (return value 0), this will point + * to the retrieved "struct file". + * + * Searches the global list for a &struct luo_file matching the @token. Us= es a + * read lock, allowing concurrent retrievals. + * + * Return: 0 on success. Negative errno on failure. 
+ */ +int luo_retrieve_file(struct luo_session *session, u64 token, + struct file **filep) +{ + struct luo_file *luo_file; + int ret =3D 0; + + luo_file =3D xa_load(&session->files_xa, token); + if (!luo_file) + return -ENOENT; + + if (luo_file->reclaimed) + return -EADDRINUSE; + + guard(mutex)(&luo_file->mutex); + if (luo_file->reclaimed) + return -EADDRINUSE; + + ret =3D luo_file->fh->ops->retrieve(luo_file->fh, luo_file->private_data, + filep); + if (!ret) { + /* Get a reference so, we can keep this file in LUO */ + luo_file->file =3D *filep; + get_file(luo_file->file); + luo_file->reclaimed =3D true; + } + + return ret; +} + +void luo_file_unpreserve_all_files(struct luo_session *session) +{ + unsigned long token; + struct luo_file *h; + + xa_for_each(&session->files_xa, token, h) + luo_unpreserve_file(session, token); +} + +void luo_file_unpreserve_unreclaimed_files(struct luo_session *session) +{ + unsigned long token; + struct luo_file *h; + + xa_for_each(&session->files_xa, token, h) { + if (!h->reclaimed) { + pr_err("Unpreserving unreclaimed file, session[%s] token[%#0llx] handle= r[%s]\n", + session->name, (u64)token, h->fh->compatible); + luo_unpreserve_file(session, token); + } + } +} + +/** + * liveupdate_register_file_handler - Register a file handler with LUO. + * @fh: Pointer to a caller-allocated &struct liveupdate_file_handler. + * The caller must initialize this structure, including a unique + * 'compatible' string and a valid 'fh' callbacks. This function adds the + * handler to the global list of supported file handlers. + * + * Context: Typically called during module initialization for file types t= hat + * support live update preservation. + * + * Return: 0 on success. Negative errno on failure. 
+ */ +int liveupdate_register_file_handler(struct liveupdate_file_handler *fh) +{ + struct liveupdate_file_handler *fh_iter; + + guard(rwsem_read)(&luo_state_rwsem); + if (!liveupdate_state_normal() && !liveupdate_state_updated()) + return -EBUSY; + + guard(rwsem_write)(&luo_file_handler_list_rwsem); + list_for_each_entry(fh_iter, &luo_file_handler_list, list) { + if (!strcmp(fh_iter->compatible, fh->compatible)) { + pr_err("File handler registration failed: Compatible string '%s' alread= y registered.\n", + fh->compatible); + return -EEXIST; + } + } + + if (!try_module_get(fh->ops->owner)) + return -EAGAIN; + + INIT_LIST_HEAD(&fh->list); + atomic_set(&fh->count, 0); + list_add_tail(&fh->list, &luo_file_handler_list); + + return 0; +} + +/** + * liveupdate_unregister_file_handler - Unregister a file handler. + * @fh: Pointer to the specific &struct liveupdate_file_handler instance + * that was previously passed to + * liveupdate_register_file_handler(). + * + * Removes the specified handler instance @fh from the global list of + * registered file handlers. This function only removes the entry from the + * list; it does not free the memory associated with @fh itself. The caller + * is responsible for freeing the structure memory after this function ret= urns + * successfully. + * + * Return: 0 on success. Negative errno on failure. 
+ */ +int liveupdate_unregister_file_handler(struct liveupdate_file_handler *fh) +{ + guard(rwsem_read)(&luo_state_rwsem); + if (!liveupdate_state_normal() && !liveupdate_state_updated()) + return -EBUSY; + + guard(rwsem_write)(&luo_file_handler_list_rwsem); + if (atomic_read(&fh->count)) { + pr_warn("Unable to unregister file handler, files are preserved\n"); + return -EBUSY; + } + + list_del_init(&fh->list); + module_put(fh->ops->owner); + + return 0; +} diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_inter= nal.h index a14e0b685ccb..c9bce82aac22 100644 --- a/kernel/liveupdate/luo_internal.h +++ b/kernel/liveupdate/luo_internal.h @@ -8,6 +8,8 @@ #ifndef _LINUX_LUO_INTERNAL_H #define _LINUX_LUO_INTERNAL_H =20 +#include + /* * Handles a deserialization failure: devices and memory is in unpredictab= le * state. @@ -93,4 +95,16 @@ int luo_do_subsystems_freeze_calls(void); void luo_do_subsystems_finish_calls(void); void luo_do_subsystems_cancel_calls(void); =20 +int luo_retrieve_file(struct luo_session *session, u64 token, struct file = **filep); +int luo_preserve_file(struct luo_session *session, u64 token, int fd); +int luo_unpreserve_file(struct luo_session *session, u64 token); +void luo_file_unpreserve_all_files(struct luo_session *session); +void luo_file_unpreserve_unreclaimed_files(struct luo_session *session); + +int luo_file_prepare(struct luo_session *session); +int luo_file_freeze(struct luo_session *session); +void luo_file_finish(struct luo_session *session); +void luo_file_cancel(struct luo_session *session); +void luo_file_deserialize(struct luo_session *session); + #endif /* _LINUX_LUO_INTERNAL_H */ --=20 2.51.0.536.g15c5d4f767-goog From nobody Sun Oct 5 07:24:32 2025 Received: from mail-qk1-f177.google.com (mail-qk1-f177.google.com [209.85.222.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 72F321F7569 for ; Mon, 
29 Sep 2025 01:04:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.177 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107877; cv=none; b=WQM5JLOdjwMvDxQSAgcCYQjSxtZ2sl6KpUYqxyn/h3nUGI8vFgXVhaPa4z+kAGF0Hi0HXSUkPb0lE59FSIr6kKSN9VFWRupdHxGj5kSL2qqf/x4z0dkbbwE6hBOATYlNVbNP+QBDtS2bVIO/NlSH2IgwT7cIvNvbWRXehUYhWw0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107877; c=relaxed/simple; bh=RlKQ5V2ZzZmQKlJsqIodazY91A7eX734eTJx5yjkWtY=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=AvTAP9sKv5MAo6IRWwDGLp6PnhiiR+5Rb4a1d1Yajddll4se/XI8gBPKguHYoJyvOpeGJbbWGa3vmoWUT556JqneeP2ApOHb3CeA67qb7K9IRS1IM6EUcV2I+vwt8KyfQoJdDeFItrLoScASwdpLvGL17uGvvgD0f3lsTFcHNXg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=MNmiv5OS; arc=none smtp.client-ip=209.85.222.177 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="MNmiv5OS" Received: by mail-qk1-f177.google.com with SMTP id af79cd13be357-856701dc22aso411032185a.3 for ; Sun, 28 Sep 2025 18:04:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1759107873; x=1759712673; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=eKSQxJd378SNA6x6L2UHx48FuJQFa6/qcFSRD3fSLhE=; b=MNmiv5OShONpqxuXIFblXBrdMtAN1FsQsfIpsGN3XnE0M9h0SSZuMOCOvhT89ZasFM JIhC2BZYkqGo6Ax4+a2Mzdv0zmqeKMSW0j/6r9H1NPoIye6wWCOoqhLsahQWgUn5odXi 
u502kAilTCZvDv/nemo0+48PLkjmWyW0PwqJmRcQZ58FUTmgizpdN01yMBMjrZ076DtL xyrF+waoytOZkepe57LPfsDjxMAFhXH5apQIvhAYzuYKlG8WrdyBnkjbgLd9hcIQUvU9 2Wv3pco2AaRFWPfB5uN3Byx1mR+0d/F/QxkOg+6bhilhl9Q6YmP0ArihnPEfHUsLgoXC RoUA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759107873; x=1759712673; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=eKSQxJd378SNA6x6L2UHx48FuJQFa6/qcFSRD3fSLhE=; b=Gz7oh5+o4GK0cH7eGk0vuve/UMV0D5yukpa0liYai8EtiR279DoLE0xaBlHnkVs7zM jJ/9oYni2yyuszvktGhkMIEjsGyKyfNUOzK0i9WeTpv66hYw7JvHk8XuS7nabq72gLlf vO/7Q1jzIKXNJP7kZBHSTU2ppVvXZmX+zhm852VgI9dqcAQO5SifE2T+iG9ouPncR4f1 MrJdZdH2L/M+YqrAmWDzgDxP2nUAUZflKA3al/KNG8tSmsfCY2vfmHqQmbpLDKfBaO4W sScgUZxeH/dov3KgWw/Hs2UFhjdSsUBdhW71Au7n4SkZEQg+9VgQGsUiPBEPW3iu05jh 4U8w== X-Forwarded-Encrypted: i=1; AJvYcCXx0YE64MigYBA9eh+tErJFjV1R9B4rrwSZL0dlTiM8Db9IZWV0KZQVloFcECGgm7wWJ3q5ObgxPsTVPNo=@vger.kernel.org X-Gm-Message-State: AOJu0Ywa/yirQw6/3fLLxyAUbMvgwpQ7yetZXy0w4KoPTTLVNT7puQjV ZbbEsbKOjLS1mF2NzwfOnx1PIqzadOX3CwEuEkArrveePrTRvbZO4uQvBideGFcfnb8= X-Gm-Gg: ASbGncskv1gvcf/joNu/iIXEoSr2cdSenJcZSOCbtdPO6QEjnqSLDzNLz4dZeCzlVxP mcwfun963A62VBqy639hR3WrIY5qEacdc3Qm0s5MHeteQQxOULLemY8NKR7G20VdL3kOz/78AfV e0frEfWueG2KcMHZnded+cWuMs47aPv8bVq/iEizTZWdkGRZOch6fi781UUr5YKFSMjMPGqzJoQ i4zoCWONBUabnOhnYNaXAYkO7rE+gj4Yhwo1Ot7WSn6vv+ITIPuVu39C6FO41KclxypmbJnESSH IW0JOXrRQFc6O3pXTN7oLuCD9i9ngeHI1XYx0eHXK5EjqucI09+FHmjX0KiBdPZjmWuQFQSBTyo vxsDbVM9GgBn+aNTF6MxQTObqC+DUOyvAyHZMHwfdm221J3tqBvbO6Vx5toNf6BIkJ6N9Pw+sv7 pc7L2LD08+ZidecHZa/A== X-Google-Smtp-Source: AGHT+IGVEo5UaPwVVHkZnI4b3W4Xx5HgtH7j7YrypJxN5nCk7Gsj9C4puufgqUDBeLEBZKkEb1nMwQ== X-Received: by 2002:a05:620a:4722:b0:826:ef9:3346 with SMTP id af79cd13be357-85ae033d193mr1866752785a.18.1759107873040; Sun, 28 Sep 2025 18:04:33 -0700 (PDT) Received: from soleen.c.googlers.com.com (53.47.86.34.bc.googleusercontent.com. 
[34.86.47.53]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-4db0c0fbe63sm64561521cf.23.2025.09.28.18.04.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 28 Sep 2025 18:04:32 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com Subject: [PATCH v4 14/30] liveupdate: luo_session: Add ioctls for file preservation 
and state management Date: Mon, 29 Sep 2025 01:03:05 +0000 Message-ID: <20250929010321.3462457-15-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com> References: <20250929010321.3462457-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Introduce the userspace interface and internal logic required to manage the lifecycle of file descriptors within a session. Previously, a session was merely a container; this change makes it a functional management unit. The following capabilities are added: A new set of ioctl commands that operate on the file descriptor returned by CREATE_SESSION. These allow userspace to: - LIVEUPDATE_SESSION_PRESERVE_FD: Add a file descriptor to a session to be preserved across the live update. - LIVEUPDATE_SESSION_UNPRESERVE_FD: Remove a previously added file descriptor from the session. - LIVEUPDATE_SESSION_RESTORE_FD: Retrieve a preserved file in the new kernel using its unique token. A state machine for each individual session, distinct from the global LUO state. This enables more granular control, allowing userspace to prepare or freeze specific sessions independently. This is managed via: - LIVEUPDATE_SESSION_SET_EVENT: An ioctl to send PREPARE, FREEZE, CANCEL, or FINISH events to a single session. - LIVEUPDATE_SESSION_GET_STATE: An ioctl to query the current state of a single session. The global subsystem callbacks (luo_session_prepare, luo_session_freeze) are updated to iterate through all existing sessions. They now trigger the appropriate per-session state transitions for any sessions that haven't already been transitioned individually by userspace. The session's .release handler is enhanced to be state-aware. 
When a session's file descriptor is closed, it now correctly cancels or finishes the session based on its current state before freeing all associated file resources, preventing resource leaks. Signed-off-by: Pasha Tatashin --- include/uapi/linux/liveupdate.h | 164 ++++++++++++++++++ kernel/liveupdate/luo_session.c | 284 +++++++++++++++++++++++++++++++- 2 files changed, 446 insertions(+), 2 deletions(-) diff --git a/include/uapi/linux/liveupdate.h b/include/uapi/linux/liveupdat= e.h index 2e38ef3094aa..59a0f561d148 100644 --- a/include/uapi/linux/liveupdate.h +++ b/include/uapi/linux/liveupdate.h @@ -132,6 +132,16 @@ enum { LIVEUPDATE_CMD_RETRIEVE_SESSION =3D 0x03, }; =20 +/* ioctl commands for session file descriptors */ +enum { + LIVEUPDATE_CMD_SESSION_BASE =3D 0x40, + LIVEUPDATE_CMD_SESSION_PRESERVE_FD =3D LIVEUPDATE_CMD_SESSION_BASE, + LIVEUPDATE_CMD_SESSION_UNPRESERVE_FD =3D 0x41, + LIVEUPDATE_CMD_SESSION_RESTORE_FD =3D 0x42, + LIVEUPDATE_CMD_SESSION_GET_STATE =3D 0x43, + LIVEUPDATE_CMD_SESSION_SET_EVENT =3D 0x44, +}; + /** * struct liveupdate_ioctl_get_state - ioctl(LIVEUPDATE_IOCTL_GET_STATE) * @size: Input; sizeof(struct liveupdate_ioctl_get_state) @@ -293,4 +303,158 @@ struct liveupdate_ioctl_retrieve_session { =20 #define LIVEUPDATE_IOCTL_RETRIEVE_SESSION \ _IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_RETRIEVE_SESSION) + +/* Session specific IOCTLs */ + +/** + * struct liveupdate_session_preserve_fd - ioctl(LIVEUPDATE_SESSION_PRESER= VE_FD) + * @size: Input; sizeof(struct liveupdate_session_preserve_fd) + * @fd: Input; The user-space file descriptor to be preserved. + * @token: Input; An opaque, unique token for the preserved resource. + * + * Validate and initiate preservation for a file + * descriptor. + * + * User sets the @fd field identifying the file descriptor to preserve + * (e.g., memfd, kvm, iommufd, VFIO). The kernel validates if this FD type + * and its dependencies are supported for preservation. 
If validation pass= es, + * the kernel marks the FD internally and *initiates the process* of prepa= ring + * its state for saving. The actual snapshotting of the state typically oc= curs + * during the subsequent %LIVEUPDATE_IOCTL_PREPARE execution phase, though + * some finalization might occur during freeze. + * On successful validation and initiation, the kernel uses the @token + * field as an opaque identifier representing the resource being preserv= ed. + * This token confirms the FD is targeted for preservation and is required= for + * the subsequent %LIVEUPDATE_SESSION_RESTORE_FD call after the live updat= e. + * + * Return: 0 on success (validation passed, preservation initiated), negat= ive + * error code on failure (e.g., unsupported FD type, dependency issue, + * validation failed). + */ +struct liveupdate_session_preserve_fd { + __u32 size; + __s32 fd; + __aligned_u64 token; +}; + +#define LIVEUPDATE_SESSION_PRESERVE_FD \ + _IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_SESSION_PRESERVE_FD) + +/** + * struct liveupdate_session_unpreserve_fd - ioctl(LIVEUPDATE_SESSION_UNPR= ESERVE_FD) + * @size: Input; sizeof(struct liveupdate_session_unpreserve_fd) + * @reserved: Must be zero. + * @token: Input; A token for the resource to be unpreserved. + * + * Remove a file descriptor from the preservation list. + * + * Allows user space to explicitly remove a file descriptor from the set of + * items marked as potentially preservable. User space provides a @token t= hat + * was previously used by a successful %LIVEUPDATE_SESSION_PRESERVE_FD call + * (potentially from a prior, possibly canceled, live update attempt). The + * kernel reads the token value from the provided user-space address. + * + * On success, the kernel removes the corresponding entry (identified by t= he + * token value read from the user pointer) from its internal preservation = list. + * The provided @token (representing the now-removed entry) becomes invalid + * after this call. 
+ * + * Return: 0 on success, negative error code on failure (e.g., -EBUSY or -= EINVAL + * if bad address provided, invalid token value read, token not found). + */ +struct liveupdate_session_unpreserve_fd { + __u32 size; + __u32 reserved; + __aligned_u64 token; +}; + +#define LIVEUPDATE_SESSION_UNPRESERVE_FD \ + _IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_SESSION_UNPRESERVE_FD) + +/** + * struct liveupdate_session_restore_fd - ioctl(LIVEUPDATE_SESSION_RESTORE= _FD) + * @size: Input; sizeof(struct liveupdate_session_restore_fd) + * @fd: Output; The new file descriptor representing the fully restored + * kernel resource. + * @token: Input; The opaque token that was used to preserve the resource. + * + * Restore a previously preserved file descriptor. + * + * User sets the @token field to the value used in a successful + * %LIVEUPDATE_SESSION_PRESERVE_FD call before the live update. On success, + * the kernel restores the state (saved during the PREPARE/FREEZE phases) + * associated with the token and populates the @fd field with a new file + * descriptor referencing the restored resource in the current (new) kerne= l. + * This operation must be performed *before* signaling completion via + * %LIVEUPDATE_IOCTL_FINISH. + * + * Return: 0 on success, negative error code on failure (e.g., invalid tok= en). + */ +struct liveupdate_session_restore_fd { + __u32 size; + __s32 fd; + __aligned_u64 token; +}; + +#define LIVEUPDATE_SESSION_RESTORE_FD \ + _IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_SESSION_RESTORE_FD) + +/** + * struct liveupdate_session_get_state - ioctl(LIVEUPDATE_SESSION_GET_STAT= E) + * @size: Input; sizeof(struct liveupdate_session_get_state) + * @incoming: Input; If 1, query the state of a restored file from the inc= oming + * (previous kernel's) set. If 0, query a file being prepared f= or + * preservation in the current set. + * @reserved: Must be zero. + * @state: Output; The live update state of this FD. 
+ * + * Query the current live update state of a specific preserved file descri= ptor. + * + * - %LIVEUPDATE_STATE_NORMAL: Default state + * - %LIVEUPDATE_STATE_PREPARED: Prepare callback has been performed on th= is FD. + * - %LIVEUPDATE_STATE_FROZEN: Freeze callback has been performed on thi= s FD. + * - %LIVEUPDATE_STATE_UPDATED: The system has successfully rebooted into= the + * new kernel. + * + * See the definition of &enum liveupdate_state for more details on each s= tate. + * + * Return: 0 on success, negative error code on failure. + */ +struct liveupdate_session_get_state { + __u32 size; + __u8 incoming; + __u8 reserved[3]; + __u32 state; +}; + +#define LIVEUPDATE_SESSION_GET_STATE \ + _IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_SESSION_GET_STATE) + +/** + * struct liveupdate_session_set_event - ioctl(LIVEUPDATE_SESSION_SET_EVEN= T) + * @size: Input; sizeof(struct liveupdate_session_set_event) + * @event: Input; The live update event. + * + * Notify a specific preserved file descriptor of an event that causes a state + * transition for that file descriptor. + * + * Event can be one of the following: + * + * - %LIVEUPDATE_PREPARE: Initiates the FD live update preparation phase. + * - %LIVEUPDATE_FREEZE: Initiates the FD live update freeze phase. + * - %LIVEUPDATE_CANCEL: Cancels the FD preparation or freeze phase. + * - %LIVEUPDATE_FINISH: Completes FD restoration and triggers cleanup. + * + * See the definition of &enum liveupdate_event for more details on each event. + * + * Return: 0 on success, negative error code on failure. 
+ */ +struct liveupdate_session_set_event { + __u32 size; + __u32 event; +}; + +#define LIVEUPDATE_SESSION_SET_EVENT \ + _IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_SESSION_SET_EVENT) + #endif /* _UAPI_LIVEUPDATE_H */ diff --git a/kernel/liveupdate/luo_session.c b/kernel/liveupdate/luo_sessio= n.c index 74dee42e24b7..966b68532d79 100644 --- a/kernel/liveupdate/luo_session.c +++ b/kernel/liveupdate/luo_session.c @@ -188,17 +188,66 @@ static void luo_session_remove(struct luo_session *se= ssion) /* One session switches from the updated state to normal state */ static void luo_session_finish_one(struct luo_session *session) { + scoped_guard(mutex, &session->mutex) { + if (session->state !=3D LIVEUPDATE_STATE_UPDATED) + return; + luo_file_finish(session); + session->files =3D 0; + luo_file_unpreserve_unreclaimed_files(session); + session->state =3D LIVEUPDATE_STATE_NORMAL; + } } =20 /* Cancel one session from frozen or prepared state, back to normal */ static void luo_session_cancel_one(struct luo_session *session) { + guard(mutex)(&session->mutex); + if (session->state =3D=3D LIVEUPDATE_STATE_FROZEN || + session->state =3D=3D LIVEUPDATE_STATE_PREPARED) { + luo_file_cancel(session); + session->state =3D LIVEUPDATE_STATE_NORMAL; + session->files =3D 0; + session->ser =3D NULL; + } } =20 /* One session is changed from normal to prepare state */ static int luo_session_prepare_one(struct luo_session *session) { - return 0; + int ret; + + guard(mutex)(&session->mutex); + if (session->state !=3D LIVEUPDATE_STATE_NORMAL) + return -EBUSY; + + ret =3D luo_file_prepare(session); + if (!ret) + session->state =3D LIVEUPDATE_STATE_PREPARED; + + return ret; +} + +/* One session is changed from prepared to frozen state */ +static int luo_session_freeze_one(struct luo_session *session) +{ + int ret; + + guard(mutex)(&session->mutex); + if (session->state !=3D LIVEUPDATE_STATE_PREPARED) + return -EBUSY; + + ret =3D luo_file_freeze(session); + + /* + * If freeze fails, it is canceled and, as a 
side effect, the session goes back to the + * normal state. + */ + if (!ret) + session->state =3D LIVEUPDATE_STATE_FROZEN; + else + session->state =3D LIVEUPDATE_STATE_NORMAL; + + return ret; } =20 static int luo_session_release(struct inode *inodep, struct file *filep) @@ -220,6 +269,8 @@ static int luo_session_release(struct inode *inodep, st= ruct file *filep) session->state =3D=3D LIVEUPDATE_STATE_FROZEN) { luo_session_cancel_one(session); } + scoped_guard(mutex, &session->mutex) + luo_file_unpreserve_all_files(session); =20 scoped_guard(rwsem_write, &luo_session_global.rwsem) luo_session_remove(session); @@ -228,9 +279,219 @@ static int luo_session_release(struct inode *inodep, = struct file *filep) return 0; } =20 +static int luo_session_preserve_fd(struct luo_session *session, + struct luo_ucmd *ucmd) +{ + struct liveupdate_session_preserve_fd *argp =3D ucmd->cmd; + int ret; + + guard(rwsem_read)(&luo_state_rwsem); + if (!liveupdate_state_normal() && !liveupdate_state_updated()) { + pr_warn("File can be preserved only in normal or updated state\n"); + return -EBUSY; + } + + guard(mutex)(&session->mutex); + + if (session->state !=3D LIVEUPDATE_STATE_NORMAL) + return -EBUSY; + + ret =3D luo_preserve_file(session, argp->token, argp->fd); + if (ret) + return ret; + + ret =3D luo_ucmd_respond(ucmd, sizeof(*argp)); + if (ret) + pr_warn("The file was successfully preserved, but response to user faile= d\n"); + + return ret; +} + +static int luo_session_unpreserve_fd(struct luo_session *session, + struct luo_ucmd *ucmd) +{ + struct liveupdate_session_unpreserve_fd *argp =3D ucmd->cmd; + int ret; + + if (argp->reserved) + return -EOPNOTSUPP; + + guard(rwsem_read)(&luo_state_rwsem); + if (!liveupdate_state_normal() && !liveupdate_state_updated()) { + pr_warn("File can be unpreserved only in normal or updated state\n"); + return -EBUSY; + } + + guard(mutex)(&session->mutex); + + if (session->state !=3D LIVEUPDATE_STATE_NORMAL) + return -EBUSY; + + ret =3D luo_unpreserve_file(session, 
argp->token); + if (ret) + return ret; + + ret =3D luo_ucmd_respond(ucmd, sizeof(*argp)); + if (ret) + pr_warn("The file was successfully unpreserved, but response to user fai= led\n"); + + return ret; +} + +static int luo_session_restore_fd(struct luo_session *session, + struct luo_ucmd *ucmd) +{ + struct liveupdate_session_restore_fd *argp =3D ucmd->cmd; + struct file *file; + int ret; + + guard(rwsem_read)(&luo_state_rwsem); + if (!liveupdate_state_updated()) + return -EBUSY; + + argp->fd =3D get_unused_fd_flags(O_CLOEXEC); + if (argp->fd < 0) + return argp->fd; + + guard(mutex)(&session->mutex); + + /* Session might have already finished independently of global state */ + if (session->state !=3D LIVEUPDATE_STATE_UPDATED) + return -EBUSY; + + ret =3D luo_retrieve_file(session, argp->token, &file); + if (ret < 0) { + put_unused_fd(argp->fd); + + return ret; + } + + ret =3D luo_ucmd_respond(ucmd, sizeof(*argp)); + if (ret) + return ret; + + fd_install(argp->fd, file); + + return 0; +} + +static int luo_session_get_state(struct luo_session *session, + struct luo_ucmd *ucmd) +{ + struct liveupdate_session_get_state *argp =3D ucmd->cmd; + + if (argp->reserved[0] | argp->reserved[1] | argp->reserved[2]) + return -EOPNOTSUPP; + + argp->state =3D READ_ONCE(session->state); + + return luo_ucmd_respond(ucmd, sizeof(*argp)); +} + +static int luo_session_set_event(struct luo_session *session, + struct luo_ucmd *ucmd) +{ + struct liveupdate_session_set_event *argp =3D ucmd->cmd; + int ret =3D 0; + + switch (argp->event) { + case LIVEUPDATE_PREPARE: + ret =3D luo_session_prepare_one(session); + break; + case LIVEUPDATE_FREEZE: + ret =3D luo_session_freeze_one(session); + break; + case LIVEUPDATE_FINISH: + luo_session_finish_one(session); + break; + case LIVEUPDATE_CANCEL: + luo_session_cancel_one(session); + break; + default: + ret =3D -EINVAL; + } + + return ret; +} + +union ucmd_buffer { + struct liveupdate_session_get_state state; + struct liveupdate_session_preserve_fd 
preserve; + struct liveupdate_session_restore_fd restore; + struct liveupdate_session_set_event event; + struct liveupdate_session_unpreserve_fd unpreserve; +}; + +struct luo_ioctl_op { + unsigned int size; + unsigned int min_size; + unsigned int ioctl_num; + int (*execute)(struct luo_session *session, struct luo_ucmd *ucmd); +}; + +#define IOCTL_OP(_ioctl, _fn, _struct, _last) = \ + [_IOC_NR(_ioctl) - LIVEUPDATE_CMD_SESSION_BASE] =3D { \ + .size =3D sizeof(_struct) + \ + BUILD_BUG_ON_ZERO(sizeof(union ucmd_buffer) < \ + sizeof(_struct)), \ + .min_size =3D offsetofend(_struct, _last), \ + .ioctl_num =3D _ioctl, \ + .execute =3D _fn, \ + } + +static const struct luo_ioctl_op luo_session_ioctl_ops[] =3D { + IOCTL_OP(LIVEUPDATE_SESSION_GET_STATE, luo_session_get_state, + struct liveupdate_session_get_state, state), + IOCTL_OP(LIVEUPDATE_SESSION_PRESERVE_FD, luo_session_preserve_fd, + struct liveupdate_session_preserve_fd, token), + IOCTL_OP(LIVEUPDATE_SESSION_RESTORE_FD, luo_session_restore_fd, + struct liveupdate_session_restore_fd, token), + IOCTL_OP(LIVEUPDATE_SESSION_SET_EVENT, luo_session_set_event, + struct liveupdate_session_set_event, event), + IOCTL_OP(LIVEUPDATE_SESSION_UNPRESERVE_FD, luo_session_unpreserve_fd, + struct liveupdate_session_unpreserve_fd, token), +}; + +static long luo_session_ioctl(struct file *filep, unsigned int cmd, + unsigned long arg) +{ + struct luo_session *session =3D filep->private_data; + const struct luo_ioctl_op *op; + struct luo_ucmd ucmd =3D {}; + union ucmd_buffer buf; + unsigned int nr; + int ret; + + nr =3D _IOC_NR(cmd); + if (nr < LIVEUPDATE_CMD_SESSION_BASE || (nr - LIVEUPDATE_CMD_SESSION_BASE= ) >=3D + ARRAY_SIZE(luo_session_ioctl_ops)) { + return -EINVAL; + } + + ucmd.ubuffer =3D (void __user *)arg; + ret =3D get_user(ucmd.user_size, (u32 __user *)ucmd.ubuffer); + if (ret) + return ret; + + op =3D &luo_session_ioctl_ops[nr - LIVEUPDATE_CMD_SESSION_BASE]; + if (op->ioctl_num !=3D cmd) + return -ENOIOCTLCMD; + if 
(ucmd.user_size < op->min_size) + return -EINVAL; + + ucmd.cmd =3D &buf; + ret =3D copy_struct_from_user(ucmd.cmd, op->size, ucmd.ubuffer, + ucmd.user_size); + if (ret) + return ret; + + return op->execute(session, &ucmd); +} + static const struct file_operations luo_session_fops =3D { .owner =3D THIS_MODULE, .release =3D luo_session_release, + .unlocked_ioctl =3D luo_session_ioctl, }; =20 static void luo_session_deserialize(void) @@ -267,6 +528,7 @@ static void luo_session_deserialize(void) session->state =3D LIVEUPDATE_STATE_UPDATED; session->count =3D luo_session_global.ser[i].count; session->files =3D luo_session_global.ser[i].files; + luo_file_deserialize(session); } } =20 @@ -501,7 +763,25 @@ static int luo_session_prepare(struct liveupdate_subsy= stem *h, u64 *data) =20 static int luo_session_freeze(struct liveupdate_subsystem *h, u64 *data) { - return 0; + struct luo_session *it; + int ret =3D 0; + + WARN_ON(!luo_session_global.fdt); + + scoped_guard(rwsem_read, &luo_session_global.rwsem) { + list_for_each_entry(it, &luo_session_global.list, list) { + if (it->state =3D=3D LIVEUPDATE_STATE_PREPARED) { + ret =3D luo_session_freeze_one(it); + if (ret) + break; + } + } + } + + if (ret) + luo_session_cancel(h, 0); + + return ret; } =20 /* --=20 2.51.0.536.g15c5d4f767-goog From nobody Sun Oct 5 07:24:32 2025 Received: from mail-qt1-f172.google.com (mail-qt1-f172.google.com [209.85.160.172]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 04A9A1FDE22 for ; Mon, 29 Sep 2025 01:04:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107878; cv=none; b=F0VyW6+rCcshWz0vfDWi0/ipv67+HFz5r7Zs0A2xH6QAbHgOkQc0mqUsitjliRV5Koq7hcdBvRyzx9pg+8CjKCqOawgs4s9lkHsMQFlSm0BTkdcKjL4873qupGTi2nhK5FypS3STvLQZ/v5wpyWTM8E0MmReh309stmCTmn4vY8= 
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107878; c=relaxed/simple; bh=hPTwkZ9Zs7OGfgoBOmsXzZchI4dTFATs/A5A5sAPQhI=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=TIbdUTb8uLgPwLYmSlUJ4Q0HbWUkiXmYOZb3alqMh4XX924Kgz+KMEk0sMuVXXclUV4h1F4Wl5xmMdK1lt1xvgDkU8cUXnoeIZH8m+8EM16sHrPR335nmVs+9e6g4B0E9I7YpRPdWLyPyZbJInplC/PnzEtsesRnU0tLTKoTzJs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=Ue76cvBa; arc=none smtp.client-ip=209.85.160.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="Ue76cvBa" Received: by mail-qt1-f172.google.com with SMTP id d75a77b69052e-4dce9229787so33646561cf.0 for ; Sun, 28 Sep 2025 18:04:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1759107875; x=1759712675; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=4673JtxFCCWFxWem0KySHim/AVBmSFNvOtZj4TDhxZ0=; b=Ue76cvBagnFBBmoFEthA+UebVXS0ul8p2//sjtmZ8tFttzpGA+evB4A+Z/cY5sYf9p sJJzKP55lHXMdDXLl13ztZsxflDiYoR6WXm6ase7jabD2wP/o1TWCFTteH0Kv8X50cDk Lg6Q0R3NGLTO3f9ckZMcsBulv8HP46ZpWvqpPgdIF9OnZkG2fOzQiQ0ITU+r17tqhwp9 1pMufNlR4jSjQZ+SpIkRTgeUeNGHksX8ZP5iSTuZwHQTENtaJ0gjvxPGh9JZu3RuFojf vgB8Nxpi2Qb3scndsQ0Wf27O8xvB8yeomNF7ZjSJLgr9e5anJIsUwejCiuta5kMfeLBs NViw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759107875; x=1759712675; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=4673JtxFCCWFxWem0KySHim/AVBmSFNvOtZj4TDhxZ0=; b=d1MBvx+lqMuRvYTEXOkmwD3dR7vuZrMgM/YTnMD8Tzs1j/wW6mV9HROnHTSbDi7OGn 20yyKYYUZoKkLsl5m7zS6snKcLw2HlnmWnfwn99fwJcpYHe3wLgaoL9NIMmhcNmeyBnp q7BjwiHk0t1FX5MT61sOYkGJtunOenXD0U+7WgeWqOA9yJIsmOsLKWlbJtyPFUIQ8Syp zBCKtyGAHZsMMDMDbbklWxyQTrzzdpgwq16wfTwLQL7qEWKivmYTGA08L1F339dz/yje 24sRDXn+XZROtqQ14a6lVc9Y6Jb1yoLsD5xlNOeVpfEww2GID6zgHYzBBdAt1tqhU9s3 OUog== X-Forwarded-Encrypted: i=1; AJvYcCVP7eItFnnP+MN/7/hyGu8cb1afkHFUTQyJexg1zdQx6NirsFYprlJ03hTCJrZ1/QFaPZor+nTBjOe88qA=@vger.kernel.org X-Gm-Message-State: AOJu0Yx1RiAwNWn0tXDiluZUhFYQa2YQgpG62ES9l38qrix+AaOAlB6z 9UfdVWGgsIJrCAWz5LIiqP2eW4C+jdaCoZWxpEu5hJgwKAgSoTZ2Fvs32a8C65MVbB0= X-Gm-Gg: ASbGncv8Mnubl00ICHte49yBx6L+m8wNa3vCKy/ADgDP035ZC4KHRcl/v4PWuWYIZqa 35zcralx+83sUIgNy1evv/sKbBZkKTncyPtCr2Wi5Jd+KJS3WWJ7DB66bZKfJZagogN1LcJx8RY /po4w+I3jW25eibf1GIFNLyr0N3A8w2rnKAz59hlvWFuzbOLln9XDEvN3U5BMYdbdGi39KH2O3I kQmSKZW78C0JNGkU6QGyF3HfmpRbAivdmnntcrptWFA1A73ikQjHcsNA92wSWVzTYp7GslyfTP6 qko1TxvHBBdvdmwllpGwt6k/tAjh0nnqOjR39m5H+GbITrUKp8Y5t+7N/W4GWIvo/McGlM+2bUx zL9FLQuj5Kigv2AVpLeFkJteGo31CEUU9DXw7xlS4IeVnA+3W/OjIwmjXJj5yf3J/rjcYZs3f9K Z43wg2EiVuNNt2I6Qp8A== X-Google-Smtp-Source: AGHT+IGDD1FThqyz2U7zZsapceDkvp6OYAFPiCrTNFDU7CmxN/gojkNSulBOcc5jk5r12Vod1QdZCA== X-Received: by 2002:a05:622a:5c8:b0:4e0:b5ef:2ba3 with SMTP id d75a77b69052e-4e0b5ef2f60mr33122951cf.37.1759107874548; Sun, 28 Sep 2025 18:04:34 -0700 (PDT) Received: from soleen.c.googlers.com.com (53.47.86.34.bc.googleusercontent.com. 
[34.86.47.53]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-4db0c0fbe63sm64561521cf.23.2025.09.28.18.04.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 28 Sep 2025 18:04:33 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com Subject: [PATCH v4 15/30] reboot: call liveupdate_reboot() before kexec Date: Mon, 
29 Sep 2025 01:03:06 +0000 Message-ID: <20250929010321.3462457-16-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com> References: <20250929010321.3462457-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Modify the reboot() syscall handler in kernel/reboot.c to call liveupdate_reboot() when processing the LINUX_REBOOT_CMD_KEXEC command. This ensures that the Live Update Orchestrator is notified just before the kernel executes the kexec jump. The liveupdate_reboot() function triggers the final LIVEUPDATE_FREEZE event, allowing participating subsystems to perform last-minute state saving within the blackout window, and transitions the LUO state machine to FROZEN. The call is placed immediately before kernel_kexec() to ensure LUO finalization happens at the latest possible moment before the kernel transition. If liveupdate_reboot() returns an error (indicating a failure during LUO finalization), the kexec operation is aborted to prevent proceeding with an inconsistent state. 
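The ordering described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: model_liveupdate_reboot() and model_kernel_kexec() are hypothetical stand-ins for liveupdate_reboot() and kernel_kexec(), and struct kexec_model exists purely so the control flow can be exercised outside the kernel.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical model state, not a kernel structure: luo_ret is the value
 * the liveupdate_reboot() stand-in reports, kexec_called records whether
 * the kexec stand-in was reached.
 */
struct kexec_model {
	int luo_ret;
	bool kexec_called;
};

/* Stand-in for liveupdate_reboot(): 0 on success, negative errno on failure. */
static int model_liveupdate_reboot(struct kexec_model *m)
{
	return m->luo_ret;
}

/*
 * Stand-in for kernel_kexec(). The real function does not return when the
 * jump succeeds; the model only records that the jump was attempted.
 */
static int model_kernel_kexec(struct kexec_model *m)
{
	m->kexec_called = true;
	return 0;
}

/* Models the LINUX_REBOOT_CMD_KEXEC case: finalize LUO first, then kexec. */
static int model_reboot_cmd_kexec(struct kexec_model *m)
{
	int ret;

	ret = model_liveupdate_reboot(m);
	if (ret)
		return ret;	/* abort: do not kexec with inconsistent state */

	return model_kernel_kexec(m);
}
```

With luo_ret set to 0 the model reaches the kexec stand-in; with a negative errno such as -EBUSY it returns that error and never attempts the jump, which mirrors the early break in the LINUX_REBOOT_CMD_KEXEC case.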
Signed-off-by: Pasha Tatashin --- kernel/reboot.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/kernel/reboot.c b/kernel/reboot.c index ec087827c85c..bdeb04a773db 100644 --- a/kernel/reboot.c +++ b/kernel/reboot.c @@ -13,6 +13,7 @@ #include #include #include +#include #include #include #include @@ -797,6 +798,9 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsig= ned int, cmd, =20 #ifdef CONFIG_KEXEC_CORE case LINUX_REBOOT_CMD_KEXEC: + ret =3D liveupdate_reboot(); + if (ret) + break; ret =3D kernel_kexec(); break; #endif --=20 2.51.0.536.g15c5d4f767-goog From nobody Sun Oct 5 07:24:32 2025 Received: from mail-qt1-f182.google.com (mail-qt1-f182.google.com [209.85.160.182]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 76D322080C0 for ; Mon, 29 Sep 2025 01:04:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.182 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107879; cv=none; b=QzQpGuL6cAefgvm5lg9Nm4Ka+1wtMGn98iZnbDeOfygz3lF6O19Itwlgx8PVO8N2x47Ck4qoolr1M3aUL7daOA7bFHZU0h39hOSi2jpCzW5E9D70tc4rK+b4Y13brg+pVeG+z8IgU11+EaHe8CImsggcwIglykuGnc09YOJuooQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107879; c=relaxed/simple; bh=DI5vibT0+29UIjhRz7D4vqAFmzDsP/l5eapzRzZNOQw=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=EoY5YBeQIpLuegu5hoiEj2wThBcDeVskuy5r7gyPKTSY0Y//y5pozCDqivnz7z6mF1lxgdH4U4K2PfWSc8D62AHhjsL9LHB5ExDmBzZgt4FiNpkTgq1dXc2o39PFPlbm5L8EkQmAT3Vy5fP4z/dGUJcpZO12FxjuQt/rmFTDFbw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=MKiGmsN2; arc=none smtp.client-ip=209.85.160.182 Authentication-Results: smtp.subspace.kernel.org; 
dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="MKiGmsN2" Received: by mail-qt1-f182.google.com with SMTP id d75a77b69052e-4de8bd25183so32895171cf.2 for ; Sun, 28 Sep 2025 18:04:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1759107876; x=1759712676; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=KQeH+ACDAvWfipNpkX5nREJXLCZp+2/jqJfKfCHTDvk=; b=MKiGmsN2WYYz/WAujnueowinTHOJeSDOIbt2w04aI5nxK/f7efwVsRW54abO0Irgsg AR9UMU9ASau5r40q2k5fNouDu7rwopqagFCKproUsyxGzkMjYwTthtUKhMq1EejtNzUR ctMHKRzBFhjGSKCEIKiyhHAQslJlMAOi3hkasAKTuT9MabwNSJv245Vo1pxwys4WCV/3 ullqK8XjuEBDSz3n1l9/spt+D30sEVauf4BfqPwzSeSCffl0lY3wPeCd7YMQkikvzJCT fNY1ib8uqrIh50E1+/kqgu18rX+noHwHfy86BtqPPo0jSz7XlKawyZgTOV/YXd+gHIWq Dc6g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759107876; x=1759712676; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=KQeH+ACDAvWfipNpkX5nREJXLCZp+2/jqJfKfCHTDvk=; b=U7LYspoyZU9sWhHzWAZkuAPGtQr7GW1DdCcV+u48e13FlWufpm5yAMk5SKFmSZTDKz GG4PzK1628C+iB7MOAEarRCiaBT6zAOBa8+choj08hs5B6t58TckolvnTTKqGYGUOAKH VYrDLK5RU6fNSwSTZUa40aMm2fV5Wzlbl4frf2NVKhrfrC8gbAUblGkMOP1Ha8rfmYk9 cdsqS7WgMRt3Egp3SMC/z2Tg3hiYAB8aI4SONZrj7RjRrxTQ3zYsdMtNFj3lmdXMkRXo mmYsYuKoWxwvWfEYxK84+1jHxq97pgsjwPXGaPBkEbmi6wTv0GYWpdo1/3mZpWfSlgof QH9w== X-Forwarded-Encrypted: i=1; AJvYcCUV34w45A97JWEd6nrcryt2HWM9GM881SSCVHuxGz2psODQyWDCjqBNne7IGu9Pk56tdwP+t/E4IFbWpIM=@vger.kernel.org X-Gm-Message-State: AOJu0YwWLOj6LTzN16ng8fhTqvEjY3pa5RwYc0R6iI687sWlatOh0WlZ 
EjH0ffvpb4KNp+0ZBeaHXe7tNqhIgEwTDIGgWFHi1B6RRM+wHTwv/KToPUvohcoJLIc= X-Gm-Gg: ASbGncvB9YPTOeZlaLkOualgKrQ8LWpFe3Z0KOk4ivfPB5fatcq63v4l+MDiTcyUW0m F/7yJWXS225REcYeXHZwx4HlCoE0SlYB+ZOFEyGPycIf2CRBmhiWwbVtu9x2+keVNFOwm5JaCEM /DBxvtK/P08iv9qEsNSexb61GUrEqfqsbtP/q+YCLtJW9mo7K+l3bh8kCvnbS39Nshark6NFkdv OcjBo6YWENDptL4cgfG0gS/bBJaI3sYhFTR88az+eq6gDiDFrKOWwePONHRoYEhxPcknINfIZEO 4xuPnHHkRxQT9JMSMnWUjld9o/5BNS/4AMFjR1xEB8lPSXVpA9MHuNHhPD/2pDfEPe9p9YwWYsY ZQITkhpu6WJ23sK7nBODNrJdjw1sHrqF4n12Au767inyg7OuSm8Qst2y1cpaa2yYnIFwBXggzM3 0vUa4jz7ccHqQInmHVow== X-Google-Smtp-Source: AGHT+IEPxuS/7M1Es5d/LhiTv2FSJGMeUGCP9yWRMSwEvaODB3P25ESHbhk9CzfQ3IGx46q/YialgQ== X-Received: by 2002:ac8:58c1:0:b0:4df:bab4:f710 with SMTP id d75a77b69052e-4dfbab4fa58mr62437871cf.25.1759107876179; Sun, 28 Sep 2025 18:04:36 -0700 (PDT) Received: from soleen.c.googlers.com.com (53.47.86.34.bc.googleusercontent.com. [34.86.47.53]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-4db0c0fbe63sm64561521cf.23.2025.09.28.18.04.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 28 Sep 2025 18:04:35 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, 
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com Subject: [PATCH v4 16/30] kho: move kho debugfs directory to liveupdate Date: Mon, 29 Sep 2025 01:03:07 +0000 Message-ID: <20250929010321.3462457-17-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com> References: <20250929010321.3462457-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

Now that LUO and KHO both live under kernel/liveupdate, it makes sense to also move the KHO debugfs files to liveupdate/.

The old names:
/sys/kernel/debug/kho/out/
/sys/kernel/debug/kho/in/

The new names:
/sys/kernel/debug/liveupdate/kho_out/
/sys/kernel/debug/liveupdate/kho_in/

Also, export liveupdate_debugfs_root so that LUO selftests can use it as well.

Signed-off-by: Pasha Tatashin --- kernel/liveupdate/kexec_handover_debug.c | 11 ++++++----- kernel/liveupdate/luo_internal.h | 4 ++++ 2 files changed, 10 insertions(+), 5 deletions(-) diff --git a/kernel/liveupdate/kexec_handover_debug.c b/kernel/liveupdate/k= exec_handover_debug.c index af4bad225630..f06d6cdfeab3 100644 --- a/kernel/liveupdate/kexec_handover_debug.c +++ b/kernel/liveupdate/kexec_handover_debug.c @@ -14,8 +14,9 @@ #include #include #include "kexec_handover_internal.h" +#include "luo_internal.h" =20 -static struct dentry *debugfs_root; +struct dentry *liveupdate_debugfs_root; =20 struct fdt_debugfs { struct list_head list; @@ -120,7 +121,7 @@ __init void kho_in_debugfs_init(struct kho_debugfs *dbg= , const void *fdt) =20 INIT_LIST_HEAD(&dbg->fdt_list); =20 - dir =3D debugfs_create_dir("in", debugfs_root); + dir =3D debugfs_create_dir("in", liveupdate_debugfs_root); if (IS_ERR(dir)) { err =3D PTR_ERR(dir); goto err_out; @@ -180,7 +181,7 @@ __init int kho_out_debugfs_init(struct kho_debugfs *dbg) =20 INIT_LIST_HEAD(&dbg->fdt_list); =20 - dir =3D debugfs_create_dir("out", debugfs_root); + dir =3D debugfs_create_dir("out", liveupdate_debugfs_root); if (IS_ERR(dir)) return -ENOMEM; =20 @@ -214,8 +215,8 @@ __init int kho_out_debugfs_init(struct kho_debugfs *dbg) =20 __init int kho_debugfs_init(void) { - debugfs_root =3D debugfs_create_dir("kho", NULL); - if (IS_ERR(debugfs_root)) + liveupdate_debugfs_root =3D debugfs_create_dir("liveupdate", NULL); + if (IS_ERR(liveupdate_debugfs_root)) return -ENOENT; return 0; } diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_inter= nal.h index c9bce82aac22..083b80754c9e 100644 --- a/kernel/liveupdate/luo_internal.h +++ b/kernel/liveupdate/luo_internal.h @@ -107,4 +107,8 @@ void luo_file_finish(struct luo_session *session); void luo_file_cancel(struct luo_session *session); void luo_file_deserialize(struct luo_session *session); =20 +#ifdef CONFIG_KEXEC_HANDOVER_DEBUG +extern struct dentry 
*liveupdate_debugfs_root; +#endif + #endif /* _LINUX_LUO_INTERNAL_H */ --=20 2.51.0.536.g15c5d4f767-goog From nobody Sun Oct 5 07:24:32 2025 Received: from mail-qt1-f179.google.com (mail-qt1-f179.google.com [209.85.160.179]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 17D3420468D for ; Mon, 29 Sep 2025 01:04:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.179 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107881; cv=none; b=FowsyGBTNCdq/+gtQT030jQZWk0nKy7xe++IXG2ZhjN0lAJaI10FVGJaEKof2By7N0CItIN0K3+4nam5KGaMdiqJG1Fdjw8fJNYkFBjkqHbMIdRE4IYTyFdN7Ox4tCE3yO3q6iT0bVbtkM+v/HwU6xBBXzJhjCgThjW0NR/WnO8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107881; c=relaxed/simple; bh=3gwWJy7+z6EF8lOvRyZj4VhN2DeytbPF5iMG3aF/JqQ=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=UYyESxNc9nkpc8paKvsuYlN8YhwZOhgMu1mKTdMEGucfrlvsZCe9Ty7P4olvTou401ecZpOXZnHInM7O59Lfclb6OQPaflGwMW7u0b1nzorGCLSo03zxJs+6Iq+bZjvOIZDzKsK2upmG0VXU/gvj6uoUMCUGbl5qFOs+CmOiuak= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=RxEmgXlF; arc=none smtp.client-ip=209.85.160.179 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="RxEmgXlF" Received: by mail-qt1-f179.google.com with SMTP id d75a77b69052e-4df60ea79d1so14511491cf.2 for ; Sun, 28 Sep 2025 18:04:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=soleen.com; s=google; t=1759107878; x=1759712678; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=SG9/lgBnsjBu7n3rYE1T/HXPjWC8Lull3DU31VlVdZs=; b=RxEmgXlFzaYGr8Lsn0mPk41DSW22CAJYCxjUqZiDg6AEEoJu9EELaEWVCaTeF7Qdva nJTfK5tVsF4tfwisKBEVj0vVkX+Yjy5B2t70UHjGUwmBSwfe+jO0JUvPoyWdYjNfFex4 rmpk5q6YpUMHbNBzVj6BU7p1hLCzMB1GcRHm0MHGl/LtfQRUz+X1q6DoeP7CZ8YeTp7Q L36DTBlN6cyUDLpAID6mcAsdcz7losQYtpzgylvggeQOo5cfUmjSZnAh+fb8Wmm2pSo/ rDpbVBOxRclg4vKmvt6aTJ3pOY4xyuysr55t84itEPXRog0w/+Zhxx4YZ9jVZoetvue5 wdQQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759107878; x=1759712678; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=SG9/lgBnsjBu7n3rYE1T/HXPjWC8Lull3DU31VlVdZs=; b=CBlLglDrerFzLODQOUzgX00dTXGWzxVnco3UKUA/yThPxjPTH2h6mIiKdeP+qziDQZ E/unRTyE8+/7+5nI5Qd98FnFOT7lOAouzWs2ct07cIWNz5E8+rPLULGRKyEEkcYVoVjh 6KuAJ4YsZsllPtwZs1GSPUFzgD71QmzVjCN0VhloLHwGNJ0NdQq7a0hrw0eG4YqxmvK4 6w08VtS/d5eTDT+2pbBtD4jQa/VcA3oar9SdWNBDqmKNZEz+x9FSMZvQF+Gab8QFRYoP iy6YzqLFrFk6BYcaZbZP8KFnL5mJPLUqMamX+R5+vz/bbw3zef6t2vQvGtunVhIHHpW0 PFjA== X-Forwarded-Encrypted: i=1; AJvYcCUFRQFnAezGwf7Q+SOZ5nSM8XUBCVnrWY1+UUB8Ms/aPelosExZEpuQN13K/FBht4supzKrDE1ViQVymrs=@vger.kernel.org X-Gm-Message-State: AOJu0YwbFlDxNCL+ZsQt4IIEpZtbP/7FhBnO+NfwCSej7TsCwud66cUt pYKuTbD+JoFbnqFUy4dSlUsio3m8bTdMqvP/PWw/OifVUqVTrzKjO+qCHZ0Wjvx85Kw= X-Gm-Gg: ASbGncugTTT8Sz0wH+GTXTicXCDCAf0iPV6NfmmRM3YbGm4OHxGx6dMYW8JDViVLXYW 3ACOWebHGMO4vwegiNUz2aB23KHyHvDdUU5+2kjbzgOGvGz0Uh55r6Qnl1UXJV6zWFt+x/aXElg np+fXF55dlVg0dqPLn1FU5y2r46K45G4UWorBl9FRVwcJ4QIPVJO6KDVJTj/D4cF6OGhcl3t1tk DHTNe+DG7kSPbOkM2LXXZQMh4oCYR4Qp+RIuo9O0vJ3zcwY+zbDYxZbdBG4MVpDjw04oGXHxh27 OvI+Lz5PwA5L6YZBh3hxNYav0ZIsY7qZU3g4f5GxSC/BYQFEa9hsofTOT53Dy3EBr5AHV68pZc3 
mbTF4kziggv2WOlaYvCN0iKeLOIq7GiEWzf7HtXuebAWROVAW/lJStU3QTQXiTV0MOio46rdycv Y32Cobgjs= X-Google-Smtp-Source: AGHT+IFBSJGsiJuTnL3aTk6ORRLg5HvmtexVkNoDC0xgqOUWiBHy/Femj87R1xZ3CMujKVmOA+t67A== X-Received: by 2002:ac8:7e89:0:b0:4dd:c935:93cc with SMTP id d75a77b69052e-4ddc944e0d0mr126398441cf.84.1759107877691; Sun, 28 Sep 2025 18:04:37 -0700 (PDT) Received: from soleen.c.googlers.com.com (53.47.86.34.bc.googleusercontent.com. [34.86.47.53]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-4db0c0fbe63sm64561521cf.23.2025.09.28.18.04.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 28 Sep 2025 18:04:37 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, 
stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com Subject: [PATCH v4 17/30] liveupdate: add selftests for subsystems un/registration Date: Mon, 29 Sep 2025 01:03:08 +0000 Message-ID: <20250929010321.3462457-18-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com> References: <20250929010321.3462457-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

Introduce a self-test mechanism for the LUO to allow verification of
core subsystem management functionality. This is primarily intended for
developers and system integrators validating the live update feature.

The tests are enabled via the new Kconfig option
CONFIG_LIVEUPDATE_SELFTESTS (default 'n') and are triggered through a
new ioctl command, LIVEUPDATE_IOCTL_SELFTESTS, which is served by the
luo_selftest file that this patch creates in the liveupdate debugfs
directory. This ioctl accepts commands defined in luo_selftests.h:

- LUO_CMD_SUBSYSTEM_REGISTER: Creates and registers a dummy LUO
  subsystem using the liveupdate_register_subsystem() function. It
  allocates a data page and copies initial data from userspace.
- LUO_CMD_SUBSYSTEM_UNREGISTER: Unregisters the specified dummy
  subsystem using the liveupdate_unregister_subsystem() function and
  cleans up associated test resources.
- LUO_CMD_SUBSYSTEM_GETDATA: Copies the data page associated with a
  registered test subsystem back to userspace, allowing verification of
  data potentially modified or preserved by test callbacks.
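
For illustration only, the sketch below shows how userspace might pack
one of these commands before handing it to the ioctl. The two structures
and the command numbers mirror the definitions added in luo_selftests.h
(with uint64_t standing in for __u64); the helpers luo_arg_init() and
luo_selftest_pack() are hypothetical names, not part of this patch, and
no ioctl is actually issued here.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Values mirrored from luo_selftests.h in this patch. */
#define LUO_NAME_LENGTH			32
#define LUO_CMD_SUBSYSTEM_REGISTER	0
#define LUO_CMD_SUBSYSTEM_UNREGISTER	1
#define LUO_CMD_SUBSYSTEM_GETDATA	2

struct luo_arg_subsystem {
	char name[LUO_NAME_LENGTH];
	void *data_page;
};

struct liveupdate_selftest {
	uint64_t cmd;
	uint64_t arg;
};

/* Hypothetical helper: fill the per-command argument. */
void luo_arg_init(struct luo_arg_subsystem *a, const char *name, void *page)
{
	assert(a && name);
	memset(a, 0, sizeof(*a));
	/* snprintf() truncates and always NUL-terminates the name. */
	snprintf(a->name, sizeof(a->name), "%s", name);
	a->data_page = page;
}

/* Hypothetical helper: wrap the argument in the outer ioctl structure. */
struct liveupdate_selftest luo_selftest_pack(uint64_t cmd,
					     struct luo_arg_subsystem *a)
{
	struct liveupdate_selftest st;

	st.cmd = cmd;
	/* The pointer travels as a 64-bit integer, as in the selftest. */
	st.arg = (uint64_t)(uintptr_t)a;
	return st;
}
```

The packed structure is what would then be passed as the third argument
to ioctl() together with LIVEUPDATE_IOCTL_SELFTESTS.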
This provides a way to test the fundamental registration and unregistration flows within the LUO framework from userspace without requiring a full live update sequence. Signed-off-by: Pasha Tatashin --- kernel/liveupdate/Kconfig | 15 ++ kernel/liveupdate/Makefile | 1 + kernel/liveupdate/luo_selftests.c | 345 ++++++++++++++++++++++++++++++ kernel/liveupdate/luo_selftests.h | 84 ++++++++ 4 files changed, 445 insertions(+) create mode 100644 kernel/liveupdate/luo_selftests.c create mode 100644 kernel/liveupdate/luo_selftests.h diff --git a/kernel/liveupdate/Kconfig b/kernel/liveupdate/Kconfig index f6b0bde188d9..8311c2593d32 100644 --- a/kernel/liveupdate/Kconfig +++ b/kernel/liveupdate/Kconfig @@ -29,6 +29,21 @@ config LIVEUPDATE =20 If unsure, say N. =20 +config LIVEUPDATE_SELFTESTS + bool "Live Update Orchestrator - self-tests" + depends on LIVEUPDATE + help + Say Y here to build self-tests for the LUO framework. When enabled, + these tests can be initiated via the ioctl interface to help verify + the core live update functionality. + + This option is primarily intended for developers working on the + live update feature or for validation purposes during system + integration. + + If you are unsure or are building a production kernel where size + or attack surface is a concern, say N. 
+ config KEXEC_HANDOVER bool "kexec handover" depends on ARCH_SUPPORTS_KEXEC_HANDOVER && ARCH_SUPPORTS_KEXEC_FILE diff --git a/kernel/liveupdate/Makefile b/kernel/liveupdate/Makefile index 282d36a18993..2df7dfdf45c1 100644 --- a/kernel/liveupdate/Makefile +++ b/kernel/liveupdate/Makefile @@ -11,3 +11,4 @@ obj-$(CONFIG_KEXEC_HANDOVER) +=3D kexec_handover.o obj-$(CONFIG_KEXEC_HANDOVER_DEBUG) +=3D kexec_handover_debug.o =20 obj-$(CONFIG_LIVEUPDATE) +=3D luo.o +obj-$(CONFIG_LIVEUPDATE_SELFTESTS) +=3D luo_selftests.o diff --git a/kernel/liveupdate/luo_selftests.c b/kernel/liveupdate/luo_self= tests.c new file mode 100644 index 000000000000..a476b88468fa --- /dev/null +++ b/kernel/liveupdate/luo_selftests.c @@ -0,0 +1,345 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +/** + * DOC: LUO Selftests + * + * We provide an ioctl-based selftest interface for the LUO. It provides a + * mechanism to test core LUO functionality, particularly the registration, + * unregistration, and data handling aspects of LUO subsystems, without + * requiring a full live update event sequence. + * + * The tests are intended primarily for developers working on the LUO fram= ework + * or for validation purposes during system integration. This functionalit= y is + * conditionally compiled based on the `CONFIG_LIVEUPDATE_SELFTESTS` Kconf= ig + * option and should typically be disabled in production kernels. + * + * Interface: + * The selftests are accessed via the `luo_selftest` debugfs file using + * the `LIVEUPDATE_IOCTL_SELFTESTS` ioctl command. The argument to the ioc= tl + * is a pointer to a `struct liveupdate_selftest` structure (defined in + * `luo_selftests.h`), which contains: + * - `cmd`: The specific selftest command to execute (e.g., + * `LUO_CMD_SUBSYSTEM_REGISTER`). + * - `arg`: A pointer to a command-specific argument structure.
For subsys= tem + * tests, this points to a `struct luo_arg_subsystem` (defined in + * `luo_selftests.h`). + * + * Commands: + * - `LUO_CMD_SUBSYSTEM_REGISTER`: + * Registers a new dummy LUO subsystem. It allocates kernel memory for test + * data, copies initial data from the user-provided `data_page`, sets up + * simple logging callbacks, and calls the core + * `liveupdate_register_subsystem()` + * function. Requires `arg` pointing to `struct luo_arg_subsystem`. + * - `LUO_CMD_SUBSYSTEM_UNREGISTER`: + * Unregisters a previously registered dummy subsystem identified by `name= `. + * It calls the core `liveupdate_unregister_subsystem()` function and then + * frees the associated kernel memory and internal tracking structures. + * Requires `arg` pointing to `struct luo_arg_subsystem` (only `name` used= ). + * - `LUO_CMD_SUBSYSTEM_GETDATA`: + * Copies the content of the kernel data page associated with the specified + * dummy subsystem (`name`) back to the user-provided `data_page`. This al= lows + * userspace to verify the state of the data after potential test operatio= ns. + * Requires `arg` pointing to `struct luo_arg_subsystem`. 
+ */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include "luo_internal.h" +#include "luo_selftests.h" + +static struct luo_subsystems { + struct liveupdate_subsystem handle; + char name[LUO_NAME_LENGTH]; + void *data; + bool in_use; + bool preserved; +} luo_subsystems[LUO_MAX_SUBSYSTEMS]; + +/* Only allow one selftest ioctl operation at a time */ +static DEFINE_MUTEX(luo_ioctl_mutex); + +static int luo_subsystem_prepare(struct liveupdate_subsystem *h, u64 *data) +{ + struct luo_subsystems *s =3D container_of(h, struct luo_subsystems, + handle); + unsigned long phys_addr =3D __pa(s->data); + int ret; + + ret =3D kho_preserve_pages(phys_to_page(phys_addr), 1); + if (ret) + return ret; + + s->preserved =3D true; + *data =3D phys_addr; + pr_info("Subsystem '%s' prepare data[%lx]\n", + s->name, phys_addr); + + if (strstr(s->name, NAME_PREPARE_FAIL)) + return -EAGAIN; + + return 0; +} + +static int luo_subsystem_freeze(struct liveupdate_subsystem *h, u64 *data) +{ + struct luo_subsystems *s =3D container_of(h, struct luo_subsystems, + handle); + + pr_info("Subsystem '%s' freeze data[%llx]\n", s->name, *data); + + return 0; +} + +static void luo_subsystem_cancel(struct liveupdate_subsystem *h, u64 data) +{ + struct luo_subsystems *s =3D container_of(h, struct luo_subsystems, + handle); + + pr_info("Subsystem '%s' cancel data[%llx]\n", s->name, data); + s->preserved =3D false; + WARN_ON(kho_unpreserve_pages(phys_to_page(data), 1)); +} + +static void luo_subsystem_finish(struct liveupdate_subsystem *h, u64 data) +{ + struct luo_subsystems *s =3D container_of(h, struct luo_subsystems, + handle); + + pr_info("Subsystem '%s' finish data[%llx]\n", s->name, data); +} + +static const struct liveupdate_subsystem_ops luo_selftest_subsys_ops =3D { + .prepare =3D luo_subsystem_prepare, + .freeze =3D luo_subsystem_freeze, + .cancel =3D luo_subsystem_cancel, + .finish =3D luo_subsystem_finish, +
.owner =3D THIS_MODULE, +}; + +static int luo_subsystem_idx(char *name) +{ + int i; + + for (i =3D 0; i < LUO_MAX_SUBSYSTEMS; i++) { + if (luo_subsystems[i].in_use && + !strcmp(luo_subsystems[i].name, name)) + break; + } + + if (i =3D=3D LUO_MAX_SUBSYSTEMS) { + pr_warn("Subsystem with name '%s' is not registered\n", name); + + return -EINVAL; + } + + return i; +} + +static void luo_put_and_free_subsystem(char *name) +{ + int i =3D luo_subsystem_idx(name); + + if (i < 0) + return; + + if (luo_subsystems[i].preserved) + kho_unpreserve_pages(virt_to_page(luo_subsystems[i].data), 1); + free_page((unsigned long)luo_subsystems[i].data); + luo_subsystems[i].in_use =3D false; + luo_subsystems[i].preserved =3D false; +} + +static int luo_get_and_alloc_subsystem(char *name, void __user *data, + struct liveupdate_subsystem **hp) +{ + unsigned long page_addr, i; + + page_addr =3D get_zeroed_page(GFP_KERNEL); + if (!page_addr) { + pr_warn("Failed to allocate memory for subsystem data\n"); + return -ENOMEM; + } + + if (copy_from_user((void *)page_addr, data, PAGE_SIZE)) { + free_page(page_addr); + return -EFAULT; + } + + for (i =3D 0; i < LUO_MAX_SUBSYSTEMS; i++) { + if (!luo_subsystems[i].in_use) + break; + } + + if (i =3D=3D LUO_MAX_SUBSYSTEMS) { + pr_warn("Maximum number of subsystems registered\n"); + free_page(page_addr); + return -ENOMEM; + } + + luo_subsystems[i].in_use =3D true; + luo_subsystems[i].handle.ops =3D &luo_selftest_subsys_ops; + luo_subsystems[i].handle.name =3D luo_subsystems[i].name; + strscpy(luo_subsystems[i].name, name, LUO_NAME_LENGTH); + luo_subsystems[i].data =3D (void *)page_addr; + + *hp =3D &luo_subsystems[i].handle; + + return 0; +} + +static int luo_cmd_subsystem_unregister(void __user *argp) +{ + struct luo_arg_subsystem arg; + int ret, i; + + if (copy_from_user(&arg, argp, sizeof(arg))) + return -EFAULT; + + i =3D luo_subsystem_idx(arg.name); + if (i < 0) + return i; + + ret =3D liveupdate_unregister_subsystem(&luo_subsystems[i].handle); + if
(ret) + return ret; + + luo_put_and_free_subsystem(arg.name); + + return 0; +} + +static int luo_cmd_subsystem_register(void __user *argp) +{ + struct liveupdate_subsystem *h; + struct luo_arg_subsystem arg; + int ret; + + if (copy_from_user(&arg, argp, sizeof(arg))) + return -EFAULT; + + ret =3D luo_get_and_alloc_subsystem(arg.name, + (void __user *)arg.data_page, &h); + if (ret) + return ret; + + ret =3D liveupdate_register_subsystem(h); + if (ret) + luo_put_and_free_subsystem(arg.name); + + return ret; +} + +static int luo_cmd_subsystem_getdata(void __user *argp) +{ + struct luo_arg_subsystem arg; + int i; + + if (copy_from_user(&arg, argp, sizeof(arg))) + return -EFAULT; + + i =3D luo_subsystem_idx(arg.name); + if (i < 0) + return i; + + if (copy_to_user(arg.data_page, luo_subsystems[i].data, + PAGE_SIZE)) { + return -EFAULT; + } + + return 0; +} + +static int luo_ioctl_selftests(void __user *argp) +{ + struct liveupdate_selftest luo_st; + void __user *cmd_argp; + int ret =3D 0; + + if (copy_from_user(&luo_st, argp, sizeof(luo_st))) + return -EFAULT; + + cmd_argp =3D (void __user *)luo_st.arg; + + mutex_lock(&luo_ioctl_mutex); + switch (luo_st.cmd) { + case LUO_CMD_SUBSYSTEM_REGISTER: + ret =3D luo_cmd_subsystem_register(cmd_argp); + break; + + case LUO_CMD_SUBSYSTEM_UNREGISTER: + ret =3D luo_cmd_subsystem_unregister(cmd_argp); + break; + + case LUO_CMD_SUBSYSTEM_GETDATA: + ret =3D luo_cmd_subsystem_getdata(cmd_argp); + break; + + default: + pr_warn("ioctl: unknown self-test command nr: 0x%llx\n", + luo_st.cmd); + ret =3D -ENOTTY; + break; + } + mutex_unlock(&luo_ioctl_mutex); + + return ret; +} + +static long luo_selftest_ioctl(struct file *filep, unsigned int cmd, + unsigned long arg) +{ + int ret =3D 0; + + if (_IOC_TYPE(cmd) !=3D LIVEUPDATE_IOCTL_TYPE) + return -ENOTTY; + + switch (cmd) { + case LIVEUPDATE_IOCTL_FREEZE: + ret =3D luo_freeze(); + break; + + case LIVEUPDATE_IOCTL_SELFTESTS: + ret =3D luo_ioctl_selftests((void __user *)arg); + break; + + 
default: + pr_warn("ioctl: unknown command nr: 0x%x\n", _IOC_NR(cmd)); + ret =3D -ENOTTY; + break; + } + + return ret; +} + +static const struct file_operations luo_selftest_fops =3D { + .open =3D nonseekable_open, + .unlocked_ioctl =3D luo_selftest_ioctl, +}; + +static int __init luo_selftest_init(void) +{ + if (!liveupdate_debugfs_root) { + pr_err("liveupdate root is not set\n"); + return 0; + } + debugfs_create_file_unsafe("luo_selftest", 0600, + liveupdate_debugfs_root, NULL, + &luo_selftest_fops); + return 0; +} + +late_initcall(luo_selftest_init); diff --git a/kernel/liveupdate/luo_selftests.h b/kernel/liveupdate/luo_self= tests.h new file mode 100644 index 000000000000..098f2e9e6a78 --- /dev/null +++ b/kernel/liveupdate/luo_selftests.h @@ -0,0 +1,84 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +#ifndef _LINUX_LUO_SELFTESTS_H +#define _LINUX_LUO_SELFTESTS_H + +#include +#include + +/* Maximum number of subsystems the self-tests can register */ +#define LUO_MAX_SUBSYSTEMS 16 +#define LUO_NAME_LENGTH 32 + +#define LUO_CMD_SUBSYSTEM_REGISTER 0 +#define LUO_CMD_SUBSYSTEM_UNREGISTER 1 +#define LUO_CMD_SUBSYSTEM_GETDATA 2 +struct luo_arg_subsystem { + char name[LUO_NAME_LENGTH]; + void *data_page; +}; + +/* + * Test name prefixes: + * normal: prepare and freeze callbacks do not fail + * prepare_fail: prepare callback fails for this test. + * freeze_fail: freeze callback fails for this test + */ +#define NAME_NORMAL "ksft_luo" +#define NAME_PREPARE_FAIL "ksft_prepare_fail" +#define NAME_FREEZE_FAIL "ksft_freeze_fail" + +/** + * struct liveupdate_selftest - Holds directions for the self-test operati= ons. + * @cmd: Selftest command defined in luo_selftests.h. + * @arg: Argument for the self test command. + * + * This structure is used only for the selftest purposes.
+ */ +struct liveupdate_selftest { + __u64 cmd; + __u64 arg; +}; + +/** + * LIVEUPDATE_IOCTL_FREEZE - Notify subsystems of imminent reboot + * transition. + * + * Argument: None. + * + * Notifies the live update subsystem and associated components that the k= ernel + * is about to execute the final reboot transition into the new kernel (e.= g., + * via kexec). This action triggers the internal %LIVEUPDATE_FREEZE kernel + * event. This event provides subsystems a final, brief opportunity (withi= n the + * "blackout window") to save critical state or perform last-moment quiesc= ing. + * Any remaining or deferred state saving for items marked via the PRESERVE + * ioctls typically occurs in response to the %LIVEUPDATE_FREEZE event. + * + * This ioctl should only be called when the system is in the + * %LIVEUPDATE_STATE_PREPARED state. This command does not transfer data. + * + * Return: 0 if the notification is successfully processed by the kernel (= but + * reboot follows). Returns a negative error code if the notification fails + * or if the system is not in the %LIVEUPDATE_STATE_PREPARED state. + */ +#define LIVEUPDATE_IOCTL_FREEZE \ + _IO(LIVEUPDATE_IOCTL_TYPE, 0x05) + +/** + * LIVEUPDATE_IOCTL_SELFTESTS - Interface for the LUO selftests + * + * Argument: Pointer to &struct liveupdate_selftest. + * + * Used by the LUO selftests; commands are declared in luo_selftests.h + * + * Return: 0 on success, negative error code on failure (e.g., invalid tok= en).
+ */ +#define LIVEUPDATE_IOCTL_SELFTESTS \ + _IOWR(LIVEUPDATE_IOCTL_TYPE, 0x08, struct liveupdate_selftest) + +#endif /* _LINUX_LUO_SELFTESTS_H */ --=20 2.51.0.536.g15c5d4f767-goog From nobody Sun Oct 5 07:24:32 2025 Received: from mail-qt1-f173.google.com (mail-qt1-f173.google.com [209.85.160.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 98CB5264614 for ; Mon, 29 Sep 2025 01:04:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.173 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107883; cv=none; b=qWe4lX4GdUEZvwAUHlWQIVM1KdsQ8ffNWQGAyHUZ2v8Fahv3Xs6xuNFZI/eZMZxU3HquMkcnTTMqG+8CxmxPQSO/yHWlOLwVXtT9duNOyAOM9BYYse5riikdAxCLEMfGHoDdbkpEV0t6DXuJcm6QAHNpKmfiVV5cl07b69Ke+PY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107883; c=relaxed/simple; bh=L6ao1Nu+sGNUW1SHO/j5boKkbrSZSkTVDZP0p9xXls8=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=o/s6aawhqSWQvBiBCrx/xzHEJLn/yEJ3tqQhUkD1itANFEeUPvn6PQpImH3Lt1SDOpKk6Rjs7AON7C1YaAl9zdVuIeBFpy8jdMPJ3RWS4S1fYZzAb60SjRhTalkGkaVi6K9HVFx2d9FdYvaxyHraogiERUo8yQPVSTGU4fIkWZo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=MZStL5Ln; arc=none smtp.client-ip=209.85.160.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="MZStL5Ln" Received: by mail-qt1-f173.google.com with SMTP id d75a77b69052e-4da37f6e64cso34636761cf.2 for ; Sun, 28 Sep 2025 18:04:40 
-0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1759107879; x=1759712679; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=hWJ63KeQmpqYfuZrXZOnIQWXyJkptFNNQr+9KCQ8c70=; b=MZStL5LnjrV0W2GN+iOHhhY4L2HZXuX/F5JU41Om2kidaQ71QBWs9X5CvEDewVfCBK lgr5sshX22zCiRELuHlkd2aCVRKAvUWvzq1JOCwmh4E1t2blWthE8JiLTUjB1Hyv2KsH /GqQFC9VZnv7wnL2U4BO5ZFlOj4u4Lsimv8W0pL+EuUSmMjHEn0ohuYbrPm3F5m+6YG9 Ay3FgHeEYj4v2TBfziFlI9MiMV+R9kfHvZ3w0PWr3GnQ+DLG3KQIsG1hOH5piRuESQSm Yj1DUKALslbFAVD8orwy/W9XYme/lzpSE8S9q1ZTEVvDovXXiF1jKSylW5OCT1MD5h6p eY9A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759107879; x=1759712679; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=hWJ63KeQmpqYfuZrXZOnIQWXyJkptFNNQr+9KCQ8c70=; b=QAfesWfvQNSrET/JQTkA9oDF1mPku25wbH5/6RzVUvdY11uXxB3dcgqR+KQ863EFLJ 7gs4ukWWZCeuXJ9wxzKPM8x0k65gXeuUkbpavq89lWdX0rUMTIBq2Ku01bqmcy9owwNJ 5K1fkB1NwYIp+41CqO6dECSeJnB0ND8T9H013dHa/ilZKtjddY+JaWHYi8tYDMCG4Qs0 fpb3fCcg3+7H8veXQvtEqqKe4anuZePoYqu9y6bIR16/Kce53mWpPFuOTgD825bCmcVD VDzVh5ouKu4LfyIr5bk7hzZoNpg1+hHQWD68rMWR+Eb6HCbMFOhsZmmUVgFcneERBKri UTlw== X-Forwarded-Encrypted: i=1; AJvYcCWVpWcdH4j4qlHwwdtxLOMZTve2aewKxT5bsNt8LiaMspb9CD9fYLcHUhh13uN2RtJF3K+kou8FfJ4LRKA=@vger.kernel.org X-Gm-Message-State: AOJu0YyuLNJhauN+uD0/fBfQG/jal9GBTq6aAJ4A8k+cCnrI+ms42VuN vxoshQpOAEW0VuPhltDdNT9/zDsSkJmQ71AAt5QwYn2VeeiH+RLyPlhndtkn4f+v/y0= X-Gm-Gg: ASbGnctU1sTBge8qdmeSfXPoNZI7LjqkjY2thog07cFeikf/VdaLpb7RqWBVsIMM12V 2fei/Q+mJNLrbuZAYuQxR683Jk4wFUYV7FDZDCCvZ93VCyOoYf6bvIdi35+WEV1QeuEsUhgnFpu enohHCcOVDGwoIEHFtSVFJrnc8TiNJnunjKqC6Gc2Xn7S3jmd63bTwFyQSbDixT5r7d0XeIiCCd 6HhnErsEHP9eFuRsvT6kSh7MmpEMnT52jexMVSebHOUdjihrltdqrDjLhVDM+1AHJ3SfB1ZeRy6 
/vcQU6cLgPHIbSjOyOsrxg8eLIVtEE4J3KhtckLSiOZzK0TX+MEYcdvgc8J5TQXsMB/XWws9eGC t2xkfg+oW5lorOTtOyG3JY4a/adZyYI75PqcG9w2G8m33zgESwiq48NvH/rIbhrTRUwDmy9v0lh vDuns3XL4= X-Google-Smtp-Source: AGHT+IGQ9JesfWQ9VOblraXsGcOaxMqov/TZFk+utQcihiAURpdIQFxS9/FIvzoyuC4KA4xQDrrIxg== X-Received: by 2002:a05:622a:5c8:b0:4e0:b5ef:2ba3 with SMTP id d75a77b69052e-4e0b5ef2f60mr33124721cf.37.1759107879208; Sun, 28 Sep 2025 18:04:39 -0700 (PDT) Received: from soleen.c.googlers.com.com (53.47.86.34.bc.googleusercontent.com. [34.86.47.53]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-4db0c0fbe63sm64561521cf.23.2025.09.28.18.04.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 28 Sep 2025 18:04:38 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, 
lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com Subject: [PATCH v4 18/30] selftests/liveupdate: add subsystem/state tests Date: Mon, 29 Sep 2025 01:03:09 +0000 Message-ID: <20250929010321.3462457-19-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com> References: <20250929010321.3462457-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

Introduce a new set of userspace selftests for the LUO. These tests
verify the functionality of the LUO by using the kernel-side selftest
ioctls provided by the LUO module, primarily focusing on subsystem
management and basic LUO state transitions.
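
As a rough illustration of what the subsystem tests expect, the sketch
below models the bookkeeping in plain userspace C. It is a simplified
stand-in for the 16-slot table kept by luo_selftests.c, not the kernel
code: the model_* names are hypothetical, the error codes are the ones
returned by the selftest table itself (-EINVAL for an unknown name,
-ENOMEM when every slot is taken), and any additional checks performed
by liveupdate_register_subsystem() are deliberately left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Limits mirrored from luo_selftests.h. */
#define LUO_MAX_SUBSYSTEMS	16
#define LUO_NAME_LENGTH		32

/* Hypothetical model of one entry in the selftest subsystem table. */
struct model_slot {
	char name[LUO_NAME_LENGTH];
	bool in_use;
};

static struct model_slot model_slots[LUO_MAX_SUBSYSTEMS];

/* Return the index of the in-use slot called @name, or -EINVAL. */
int model_find(const char *name)
{
	int i;

	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++) {
		if (model_slots[i].in_use &&
		    !strcmp(model_slots[i].name, name))
			return i;
	}
	return -EINVAL;
}

/* Claim the first free slot; -ENOMEM when the table is full. */
int model_register(const char *name)
{
	int i;

	assert(name);
	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++) {
		if (!model_slots[i].in_use)
			break;
	}
	if (i == LUO_MAX_SUBSYSTEMS)
		return -ENOMEM;

	snprintf(model_slots[i].name, LUO_NAME_LENGTH, "%s", name);
	model_slots[i].in_use = true;
	return 0;
}

/* Release the slot called @name; a second call fails with -EINVAL. */
int model_unregister(const char *name)
{
	int i = model_find(name);

	if (i < 0)
		return i;
	model_slots[i].in_use = false;
	return 0;
}
```

In this model a second unregister of the same name fails because the
slot is no longer marked in use, which is the behaviour the
double_unregister test checks for through errno.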
Signed-off-by: Pasha Tatashin --- tools/testing/selftests/Makefile | 1 + tools/testing/selftests/liveupdate/.gitignore | 1 + tools/testing/selftests/liveupdate/Makefile | 7 + tools/testing/selftests/liveupdate/config | 6 + .../testing/selftests/liveupdate/liveupdate.c | 348 ++++++++++++++++++ 5 files changed, 363 insertions(+) create mode 100644 tools/testing/selftests/liveupdate/.gitignore create mode 100644 tools/testing/selftests/liveupdate/Makefile create mode 100644 tools/testing/selftests/liveupdate/config create mode 100644 tools/testing/selftests/liveupdate/liveupdate.c diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Mak= efile index c46ebdb9b8ef..56e44a98d6a5 100644 --- a/tools/testing/selftests/Makefile +++ b/tools/testing/selftests/Makefile @@ -54,6 +54,7 @@ TARGETS +=3D kvm TARGETS +=3D landlock TARGETS +=3D lib TARGETS +=3D livepatch +TARGETS +=3D liveupdate TARGETS +=3D lkdtm TARGETS +=3D lsm TARGETS +=3D membarrier diff --git a/tools/testing/selftests/liveupdate/.gitignore b/tools/testing/= selftests/liveupdate/.gitignore new file mode 100644 index 000000000000..af6e773cf98f --- /dev/null +++ b/tools/testing/selftests/liveupdate/.gitignore @@ -0,0 +1 @@ +/liveupdate diff --git a/tools/testing/selftests/liveupdate/Makefile b/tools/testing/se= lftests/liveupdate/Makefile new file mode 100644 index 000000000000..2a573c36016e --- /dev/null +++ b/tools/testing/selftests/liveupdate/Makefile @@ -0,0 +1,7 @@ +# SPDX-License-Identifier: GPL-2.0-only +CFLAGS +=3D -Wall -O2 -Wno-unused-function +CFLAGS +=3D $(KHDR_INCLUDES) + +TEST_GEN_PROGS +=3D liveupdate + +include ../lib.mk diff --git a/tools/testing/selftests/liveupdate/config b/tools/testing/self= tests/liveupdate/config new file mode 100644 index 000000000000..382c85b89570 --- /dev/null +++ b/tools/testing/selftests/liveupdate/config @@ -0,0 +1,6 @@ +CONFIG_KEXEC_FILE=3Dy +CONFIG_KEXEC_HANDOVER=3Dy +CONFIG_KEXEC_HANDOVER_DEBUG=3Dy +CONFIG_LIVEUPDATE=3Dy 
+CONFIG_LIVEUPDATE_SYSFS_API=3Dy +CONFIG_LIVEUPDATE_SELFTESTS=3Dy diff --git a/tools/testing/selftests/liveupdate/liveupdate.c b/tools/testin= g/selftests/liveupdate/liveupdate.c new file mode 100644 index 000000000000..7c0ceaac0283 --- /dev/null +++ b/tools/testing/selftests/liveupdate/liveupdate.c @@ -0,0 +1,348 @@ +// SPDX-License-Identifier: GPL-2.0-only + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#include + +#include "../kselftest.h" +#include "../kselftest_harness.h" +#include "../../../../kernel/liveupdate/luo_selftests.h" + +struct subsystem_info { + void *data_page; + void *verify_page; + char test_name[LUO_NAME_LENGTH]; + bool registered; +}; + +FIXTURE(subsystem) { + int fd; + int fd_dbg; + struct subsystem_info si[LUO_MAX_SUBSYSTEMS]; +}; + +FIXTURE(state) { + int fd; + int fd_dbg; +}; + +#define LUO_DEVICE "/dev/liveupdate" +#define LUO_DBG_DEVICE "/sys/kernel/debug/liveupdate/luo_selftest" +static size_t page_size; + +const char *const luo_state_str[] =3D { + [LIVEUPDATE_STATE_UNDEFINED] =3D "undefined", + [LIVEUPDATE_STATE_NORMAL] =3D "normal", + [LIVEUPDATE_STATE_PREPARED] =3D "prepared", + [LIVEUPDATE_STATE_FROZEN] =3D "frozen", + [LIVEUPDATE_STATE_UPDATED] =3D "updated", +}; + +static int run_luo_selftest_cmd(int fd_dbg, __u64 cmd_code, + struct luo_arg_subsystem *subsys_arg) +{ + struct liveupdate_selftest k_arg; + + k_arg.cmd =3D cmd_code; + k_arg.arg =3D (__u64)(unsigned long)subsys_arg; + + return ioctl(fd_dbg, LIVEUPDATE_IOCTL_SELFTESTS, &k_arg); +} + +static int register_subsystem(int fd_dbg, struct subsystem_info *si) +{ + struct luo_arg_subsystem subsys_arg; + int ret; + + memset(&subsys_arg, 0, sizeof(subsys_arg)); + snprintf(subsys_arg.name, LUO_NAME_LENGTH, "%s", si->test_name); + subsys_arg.data_page =3D si->data_page; + + ret =3D run_luo_selftest_cmd(fd_dbg, LUO_CMD_SUBSYSTEM_REGISTER, + &subsys_arg); + if (!ret) + 
si->registered =3D true; + + return ret; +} + +static int unregister_subsystem(int fd_dbg, struct subsystem_info *si) +{ + struct luo_arg_subsystem subsys_arg; + int ret; + + memset(&subsys_arg, 0, sizeof(subsys_arg)); + snprintf(subsys_arg.name, LUO_NAME_LENGTH, "%s", si->test_name); + + ret =3D run_luo_selftest_cmd(fd_dbg, LUO_CMD_SUBSYSTEM_UNREGISTER, + &subsys_arg); + if (!ret) + si->registered =3D false; + + return ret; +} + +FIXTURE_SETUP(state) +{ + page_size =3D sysconf(_SC_PAGE_SIZE); + self->fd =3D open(LUO_DEVICE, O_RDWR); + if (self->fd < 0) + SKIP(return, "open(%s) failed [%d]", LUO_DEVICE, errno); + + self->fd_dbg =3D open(LUO_DBG_DEVICE, O_RDWR); + ASSERT_GE(self->fd_dbg, 0); +} + +FIXTURE_TEARDOWN(state) +{ + struct liveupdate_ioctl_set_event cancel =3D { + .size =3D sizeof(cancel), + .event =3D LIVEUPDATE_CANCEL, + }; + struct liveupdate_ioctl_get_state ligs =3D {.size =3D sizeof(ligs)}; + + ioctl(self->fd, LIVEUPDATE_IOCTL_GET_STATE, &ligs); + if (ligs.state !=3D LIVEUPDATE_STATE_NORMAL) + ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &cancel); + close(self->fd); +} + +FIXTURE_SETUP(subsystem) +{ + int i; + + page_size =3D sysconf(_SC_PAGE_SIZE); + memset(&self->si, 0, sizeof(self->si)); + self->fd =3D open(LUO_DEVICE, O_RDWR); + if (self->fd < 0) + SKIP(return, "open(%s) failed [%d]", LUO_DEVICE, errno); + + self->fd_dbg =3D open(LUO_DBG_DEVICE, O_RDWR); + ASSERT_GE(self->fd_dbg, 0); + + for (i =3D 0; i < LUO_MAX_SUBSYSTEMS; i++) { + snprintf(self->si[i].test_name, LUO_NAME_LENGTH, + NAME_NORMAL ".%d", i); + + self->si[i].data_page =3D mmap(NULL, page_size, + PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS, + -1, 0); + ASSERT_NE(MAP_FAILED, self->si[i].data_page); + memset(self->si[i].data_page, 'A' + i, page_size); + + self->si[i].verify_page =3D mmap(NULL, page_size, + PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS, + -1, 0); + ASSERT_NE(MAP_FAILED, self->si[i].verify_page); + memset(self->si[i].verify_page, 0, page_size); + } +} + 
+FIXTURE_TEARDOWN(subsystem)
+{
+	struct liveupdate_ioctl_set_event cancel = {
+		.size = sizeof(cancel),
+		.event = LIVEUPDATE_CANCEL,
+	};
+	struct liveupdate_ioctl_get_state ligs = {.size = sizeof(ligs)};
+	int i;
+
+	ioctl(self->fd, LIVEUPDATE_IOCTL_GET_STATE, &ligs);
+	if (ligs.state != LIVEUPDATE_STATE_NORMAL)
+		ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &cancel);
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++) {
+		if (self->si[i].registered)
+			unregister_subsystem(self->fd_dbg, &self->si[i]);
+		munmap(self->si[i].data_page, page_size);
+		munmap(self->si[i].verify_page, page_size);
+	}
+
+	close(self->fd);
+}
+
+TEST_F(state, normal)
+{
+	struct liveupdate_ioctl_get_state ligs = {.size = sizeof(ligs)};
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_GET_STATE, &ligs));
+	ASSERT_EQ(ligs.state, LIVEUPDATE_STATE_NORMAL);
+}
+
+TEST_F(state, prepared)
+{
+	struct liveupdate_ioctl_get_state ligs = {.size = sizeof(ligs)};
+	struct liveupdate_ioctl_set_event prepare = {
+		.size = sizeof(prepare),
+		.event = LIVEUPDATE_PREPARE,
+	};
+	struct liveupdate_ioctl_set_event cancel = {
+		.size = sizeof(cancel),
+		.event = LIVEUPDATE_CANCEL,
+	};
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &prepare));
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_GET_STATE, &ligs));
+	ASSERT_EQ(ligs.state, LIVEUPDATE_STATE_PREPARED);
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &cancel));
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_GET_STATE, &ligs));
+	ASSERT_EQ(ligs.state, LIVEUPDATE_STATE_NORMAL);
+}
+
+TEST_F(state, sysfs_prepared)
+{
+	struct liveupdate_ioctl_set_event prepare = {
+		.size = sizeof(prepare),
+		.event = LIVEUPDATE_PREPARE,
+	};
+	struct liveupdate_ioctl_set_event cancel = {
+		.size = sizeof(cancel),
+		.event = LIVEUPDATE_CANCEL,
+	};
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &prepare));
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &cancel));
+}
+
+TEST_F(state, sysfs_frozen)
+{
+	struct liveupdate_ioctl_set_event prepare = {
+		.size = sizeof(prepare),
+		.event = LIVEUPDATE_PREPARE,
+	};
+	struct liveupdate_ioctl_set_event cancel = {
+		.size = sizeof(cancel),
+		.event = LIVEUPDATE_CANCEL,
+	};
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &prepare));
+	ASSERT_EQ(0, ioctl(self->fd_dbg, LIVEUPDATE_IOCTL_FREEZE, NULL));
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &cancel));
+}
+
+TEST_F(subsystem, register_unregister)
+{
+	ASSERT_EQ(0, register_subsystem(self->fd_dbg, &self->si[0]));
+	ASSERT_EQ(0, unregister_subsystem(self->fd_dbg, &self->si[0]));
+}
+
+TEST_F(subsystem, double_unregister)
+{
+	ASSERT_EQ(0, register_subsystem(self->fd_dbg, &self->si[0]));
+	ASSERT_EQ(0, unregister_subsystem(self->fd_dbg, &self->si[0]));
+	EXPECT_NE(0, unregister_subsystem(self->fd_dbg, &self->si[0]));
+	EXPECT_TRUE(errno == EINVAL || errno == ENOENT);
+}
+
+TEST_F(subsystem, register_unregister_many)
+{
+	int i;
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, register_subsystem(self->fd_dbg, &self->si[i]));
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, unregister_subsystem(self->fd_dbg, &self->si[i]));
+}
+
+TEST_F(subsystem, getdata_verify)
+{
+	struct liveupdate_ioctl_get_state ligs = {.size = sizeof(ligs), .state = 0};
+	struct liveupdate_ioctl_set_event prepare = {
+		.size = sizeof(prepare),
+		.event = LIVEUPDATE_PREPARE,
+	};
+	struct liveupdate_ioctl_set_event cancel = {
+		.size = sizeof(cancel),
+		.event = LIVEUPDATE_CANCEL,
+	};
+	int i;
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, register_subsystem(self->fd_dbg, &self->si[i]));
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &prepare));
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_GET_STATE, &ligs));
+	ASSERT_EQ(ligs.state, LIVEUPDATE_STATE_PREPARED);
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++) {
+		struct luo_arg_subsystem subsys_arg;
+
+		memset(&subsys_arg, 0, sizeof(subsys_arg));
+		snprintf(subsys_arg.name, LUO_NAME_LENGTH, "%s",
+			 self->si[i].test_name);
+		subsys_arg.data_page = self->si[i].verify_page;
+
+		ASSERT_EQ(0, run_luo_selftest_cmd(self->fd_dbg,
+						  LUO_CMD_SUBSYSTEM_GETDATA,
+						  &subsys_arg));
+		ASSERT_EQ(0, memcmp(self->si[i].data_page,
+				    self->si[i].verify_page,
+				    page_size));
+	}
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &cancel));
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_GET_STATE, &ligs));
+	ASSERT_EQ(ligs.state, LIVEUPDATE_STATE_NORMAL);
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, unregister_subsystem(self->fd_dbg, &self->si[i]));
+}
+
+TEST_F(subsystem, prepare_fail)
+{
+	struct liveupdate_ioctl_set_event prepare = {
+		.size = sizeof(prepare),
+		.event = LIVEUPDATE_PREPARE,
+	};
+	struct liveupdate_ioctl_set_event cancel = {
+		.size = sizeof(cancel),
+		.event = LIVEUPDATE_CANCEL,
+	};
+	int i;
+
+	snprintf(self->si[LUO_MAX_SUBSYSTEMS - 1].test_name, LUO_NAME_LENGTH,
+		 NAME_PREPARE_FAIL ".%d", LUO_MAX_SUBSYSTEMS - 1);
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, register_subsystem(self->fd_dbg, &self->si[i]));
+
+	ASSERT_EQ(-1, ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &prepare));
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, unregister_subsystem(self->fd_dbg, &self->si[i]));
+
+	snprintf(self->si[LUO_MAX_SUBSYSTEMS - 1].test_name, LUO_NAME_LENGTH,
+		 NAME_NORMAL ".%d", LUO_MAX_SUBSYSTEMS - 1);
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, register_subsystem(self->fd_dbg, &self->si[i]));
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &prepare));
+	ASSERT_EQ(0, ioctl(self->fd_dbg, LIVEUPDATE_IOCTL_FREEZE, NULL));
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &cancel));
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, unregister_subsystem(self->fd_dbg, &self->si[i]));
+}
+
+TEST_HARNESS_MAIN
-- 
2.51.0.536.g15c5d4f767-goog

From nobody Sun
Oct 5 07:24:32 2025
From: Pasha Tatashin
To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com,
 changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org,
 dmatlack@google.com, rientjes@google.com, corbet@lwn.net,
 rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com,
 kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com,
 masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org,
 yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev,
 chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com,
 jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org,
 dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org,
 rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org,
 zhangguopeng@kylinos.cn, linux@weissschuh.net,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de,
 mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
 x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org,
 bartosz.golaszewski@linaro.org, cw00.choi@samsung.com,
 myungjoo.ham@samsung.com, yesanishhere@gmail.com,
 Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com,
 aleksander.lobakin@intel.com, ira.weiny@intel.com,
 andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de,
 bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com,
 stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net,
 brauner@kernel.org,
 linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com,
 parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com,
 skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com
Subject: [PATCH v4 19/30] docs: add luo documentation
Date: Mon, 29 Sep 2025 01:03:10 +0000
Message-ID: <20250929010321.3462457-20-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog
In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com>
References: <20250929010321.3462457-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Add the documentation files for the Live Update Orchestrator

Signed-off-by: Pasha Tatashin
---
 Documentation/core-api/index.rst           |  1 +
 Documentation/core-api/liveupdate.rst      | 57 ++++++++++++++++++++++
 Documentation/userspace-api/index.rst      |  1 +
 Documentation/userspace-api/liveupdate.rst | 25 ++++++++++
 4 files changed, 84 insertions(+)
 create mode 100644 Documentation/core-api/liveupdate.rst
 create mode 100644 Documentation/userspace-api/liveupdate.rst

diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst
index 6cbdcbfa79c3..5eb0fbbbc323 100644
--- a/Documentation/core-api/index.rst
+++ b/Documentation/core-api/index.rst
@@ -138,6 +138,7 @@ Documents that don't fit elsewhere or which have yet to be categorized.
    :maxdepth: 1
 
    librs
+   liveupdate
    netlink
 
 .. only:: subproject and html
diff --git a/Documentation/core-api/liveupdate.rst b/Documentation/core-api/liveupdate.rst
new file mode 100644
index 000000000000..7c1c3af6f960
--- /dev/null
+++ b/Documentation/core-api/liveupdate.rst
@@ -0,0 +1,57 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+========================
+Live Update Orchestrator
+========================
+:Author: Pasha Tatashin
+
+.. kernel-doc:: kernel/liveupdate/luo_core.c
+   :doc: Live Update Orchestrator (LUO)
+
+LUO Subsystems Participation
+============================
+.. kernel-doc:: kernel/liveupdate/luo_subsystems.c
+   :doc: LUO Subsystems support
+
+LUO Sessions
+============
+.. kernel-doc:: kernel/liveupdate/luo_session.c
+   :doc: LUO Sessions
+
+LUO Preserving File Descriptors
+===============================
+.. kernel-doc:: kernel/liveupdate/luo_file.c
+   :doc: LUO file descriptors
+
+Public API
+==========
+.. kernel-doc:: include/linux/liveupdate.h
+
+.. kernel-doc:: kernel/liveupdate/luo_core.c
+   :export:
+
+.. kernel-doc:: kernel/liveupdate/luo_subsystems.c
+   :export:
+
+.. kernel-doc:: kernel/liveupdate/luo_file.c
+   :export:
+
+Internal API
+============
+.. kernel-doc:: kernel/liveupdate/luo_core.c
+   :internal:
+
+.. kernel-doc:: kernel/liveupdate/luo_subsystems.c
+   :internal:
+
+.. kernel-doc:: kernel/liveupdate/luo_session.c
+   :internal:
+
+.. kernel-doc:: kernel/liveupdate/luo_file.c
+   :internal:
+
+See Also
+========
+
+- :doc:`Live Update uAPI `
+- :doc:`/core-api/kho/concepts`
diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst
index 0167e59b541e..64b0099ee161 100644
--- a/Documentation/userspace-api/index.rst
+++ b/Documentation/userspace-api/index.rst
@@ -61,6 +61,7 @@ Everything else
    :maxdepth: 1
 
    ELF
+   liveupdate
    netlink/index
    shadow_stack
    sysfs-platform_profile
diff --git a/Documentation/userspace-api/liveupdate.rst b/Documentation/userspace-api/liveupdate.rst
new file mode 100644
index 000000000000..70b5017c0e3c
--- /dev/null
+++ b/Documentation/userspace-api/liveupdate.rst
@@ -0,0 +1,25 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+================
+Live Update uAPI
+================
+:Author: Pasha Tatashin
+
+ioctl interface
+===============
+.. kernel-doc:: kernel/liveupdate/luo_ioctl.c
+   :doc: LUO ioctl Interface
+
+ioctl uAPI
+===========
+.. kernel-doc:: include/uapi/linux/liveupdate.h
+
+LUO selftests ioctl
+===================
+.. kernel-doc:: kernel/liveupdate/luo_selftests.c
+   :doc: LUO Selftests
+
+See Also
+========
+
+- :doc:`Live Update Orchestrator `
-- 
2.51.0.536.g15c5d4f767-goog

From nobody Sun Oct 5 07:24:32 2025
From: Pasha Tatashin
To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com,
 changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org,
 dmatlack@google.com, rientjes@google.com, corbet@lwn.net,
 rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com,
 kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com,
 masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org,
 yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev,
 chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com,
 jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org,
 dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org,
 rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org,
 zhangguopeng@kylinos.cn, linux@weissschuh.net,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de,
 mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
 x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org,
 bartosz.golaszewski@linaro.org, cw00.choi@samsung.com,
 myungjoo.ham@samsung.com, yesanishhere@gmail.com,
 Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com,
 aleksander.lobakin@intel.com, ira.weiny@intel.com,
 andriy.shevchenko@linux.intel.com,
 leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org,
 djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de,
 lennart@poettering.net, brauner@kernel.org,
 linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com,
 parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com,
 skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com
Subject: [PATCH v4 20/30] MAINTAINERS: add liveupdate entry
Date: Mon, 29 Sep 2025 01:03:11 +0000
Message-ID: <20250929010321.3462457-21-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog
In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com>
References: <20250929010321.3462457-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Add a MAINTAINERS file entry for the new Live Update Orchestrator
introduced in previous patches.

Signed-off-by: Pasha Tatashin
---
 MAINTAINERS | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index e5c800ed4819..e99af6101d3c 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14431,6 +14431,18 @@ F:	kernel/module/livepatch.c
 F:	samples/livepatch/
 F:	tools/testing/selftests/livepatch/
 
+LIVE UPDATE
+M:	Pasha Tatashin
+L:	linux-kernel@vger.kernel.org
+S:	Maintained
+F:	Documentation/ABI/testing/sysfs-kernel-liveupdate
+F:	Documentation/core-api/liveupdate.rst
+F:	Documentation/userspace-api/liveupdate.rst
+F:	include/linux/liveupdate.h
+F:	include/uapi/linux/liveupdate.h
+F:	kernel/liveupdate/
+F:	tools/testing/selftests/liveupdate/
+
 LLC (802.2)
 L:	netdev@vger.kernel.org
 S:	Odd fixes
-- 
2.51.0.536.g15c5d4f767-goog

From nobody Sun Oct 5 07:24:32 2025
From: Pasha Tatashin
To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com,
 changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org,
 dmatlack@google.com, rientjes@google.com, corbet@lwn.net,
 rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com,
 kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com,
 masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org,
 yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev,
 chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com,
 jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org,
 dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org,
 rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org,
 zhangguopeng@kylinos.cn, linux@weissschuh.net,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de,
 mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
 x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org,
 bartosz.golaszewski@linaro.org, cw00.choi@samsung.com,
 myungjoo.ham@samsung.com, yesanishhere@gmail.com,
 Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com,
 aleksander.lobakin@intel.com, ira.weiny@intel.com,
 andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de,
 bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com,
 stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net,
 brauner@kernel.org, linux-api@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, saeedm@nvidia.com,
 ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com,
 leonro@nvidia.com, witu@nvidia.com, hughd@google.com,
 skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com
Subject: [PATCH v4 21/30] mm: shmem: use SHMEM_F_* flags instead of VM_* flags
Date: Mon, 29 Sep 2025 01:03:12 +0000
Message-ID: <20250929010321.3462457-22-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog
In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com>
References: <20250929010321.3462457-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Pratyush Yadav

shmem_inode_info::flags can have the VM flags VM_NORESERVE and
VM_LOCKED. These are used to suppress pre-accounting or to lock the
pages in the inode respectively. Using the VM flags directly makes it
difficult to add shmem-specific flags that are unrelated to VM behavior
since one would need to find a VM flag not used by shmem and re-purpose
it.

Introduce SHMEM_F_NORESERVE and SHMEM_F_LOCKED which represent the same
information, but their bits are independent of the VM flags. Callers can
still pass VM_NORESERVE to shmem_get_inode(), but it gets transformed to
the shmem-specific flag internally.

No functional changes intended.

Signed-off-by: Pratyush Yadav
Signed-off-by: Pasha Tatashin
---
 include/linux/shmem_fs.h |  6 ++++++
 mm/shmem.c               | 29 ++++++++++++++++-------------
 2 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 0e47465ef0fd..650874b400b5 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include
 
 struct swap_iocb;
 
@@ -19,6 +20,11 @@ struct swap_iocb;
 #define SHMEM_MAXQUOTAS 2
 #endif
 
+/* Suppress pre-accounting of the entire object size. */
+#define SHMEM_F_NORESERVE	BIT(0)
+/* Disallow swapping. */
+#define SHMEM_F_LOCKED		BIT(1)
+
 struct shmem_inode_info {
 	spinlock_t		lock;
 	unsigned int		seals;		/* shmem seals */
diff --git a/mm/shmem.c b/mm/shmem.c
index b9081b817d28..ce3b912f62da 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -175,20 +175,20 @@ static inline struct shmem_sb_info *SHMEM_SB(struct super_block *sb)
  */
 static inline int shmem_acct_size(unsigned long flags, loff_t size)
 {
-	return (flags & VM_NORESERVE) ?
+	return (flags & SHMEM_F_NORESERVE) ?
 		0 : security_vm_enough_memory_mm(current->mm, VM_ACCT(size));
 }
 
 static inline void shmem_unacct_size(unsigned long flags, loff_t size)
 {
-	if (!(flags & VM_NORESERVE))
+	if (!(flags & SHMEM_F_NORESERVE))
 		vm_unacct_memory(VM_ACCT(size));
 }
 
 static inline int shmem_reacct_size(unsigned long flags,
 		loff_t oldsize, loff_t newsize)
 {
-	if (!(flags & VM_NORESERVE)) {
+	if (!(flags & SHMEM_F_NORESERVE)) {
 		if (VM_ACCT(newsize) > VM_ACCT(oldsize))
 			return security_vm_enough_memory_mm(current->mm,
 					VM_ACCT(newsize) - VM_ACCT(oldsize));
@@ -206,7 +206,7 @@ static inline int shmem_reacct_size(unsigned long flags,
  */
 static inline int shmem_acct_blocks(unsigned long flags, long pages)
 {
-	if (!(flags & VM_NORESERVE))
+	if (!(flags & SHMEM_F_NORESERVE))
 		return 0;
 
 	return security_vm_enough_memory_mm(current->mm,
@@ -215,7 +215,7 @@ static inline int shmem_acct_blocks(unsigned long flags, long pages)
 
 static inline void shmem_unacct_blocks(unsigned long flags, long pages)
 {
-	if (flags & VM_NORESERVE)
+	if (flags & SHMEM_F_NORESERVE)
 		vm_unacct_memory(pages * VM_ACCT(PAGE_SIZE));
 }
 
@@ -1551,7 +1551,7 @@ int shmem_writeout(struct folio *folio, struct swap_iocb **plug,
 	int nr_pages;
 	bool split = false;
 
-	if ((info->flags & VM_LOCKED) || sbinfo->noswap)
+	if ((info->flags & SHMEM_F_LOCKED) || sbinfo->noswap)
 		goto redirty;
 
 	if (!total_swap_pages)
@@ -2907,15 +2907,15 @@ int shmem_lock(struct file *file, int lock, struct ucounts *ucounts)
 	 * ipc_lock_object() when called from shmctl_do_lock(),
 	 * no
serialization needed when called from shm_destroy(). */ - if (lock && !(info->flags & VM_LOCKED)) { + if (lock && !(info->flags & SHMEM_F_LOCKED)) { if (!user_shm_lock(inode->i_size, ucounts)) goto out_nomem; - info->flags |=3D VM_LOCKED; + info->flags |=3D SHMEM_F_LOCKED; mapping_set_unevictable(file->f_mapping); } - if (!lock && (info->flags & VM_LOCKED) && ucounts) { + if (!lock && (info->flags & SHMEM_F_LOCKED) && ucounts) { user_shm_unlock(inode->i_size, ucounts); - info->flags &=3D ~VM_LOCKED; + info->flags &=3D ~SHMEM_F_LOCKED; mapping_clear_unevictable(file->f_mapping); } retval =3D 0; @@ -3059,7 +3059,8 @@ static struct inode *__shmem_get_inode(struct mnt_idm= ap *idmap, spin_lock_init(&info->lock); atomic_set(&info->stop_eviction, 0); info->seals =3D F_SEAL_SEAL; - info->flags =3D flags & VM_NORESERVE; + if (flags & VM_NORESERVE) + info->flags =3D SHMEM_F_NORESERVE; info->i_crtime =3D inode_get_mtime(inode); info->fsflags =3D (dir =3D=3D NULL) ? 0 : SHMEM_I(dir)->fsflags & SHMEM_FL_INHERITED; @@ -5801,8 +5802,10 @@ static inline struct inode *shmem_get_inode(struct m= nt_idmap *idmap, /* common code */ =20 static struct file *__shmem_file_setup(struct vfsmount *mnt, const char *n= ame, - loff_t size, unsigned long flags, unsigned int i_flags) + loff_t size, unsigned long vm_flags, + unsigned int i_flags) { + unsigned long flags =3D (vm_flags & VM_NORESERVE) ? 
SHMEM_F_NORESERVE : 0; struct inode *inode; struct file *res; =20 @@ -5819,7 +5822,7 @@ static struct file *__shmem_file_setup(struct vfsmoun= t *mnt, const char *name, return ERR_PTR(-ENOMEM); =20 inode =3D shmem_get_inode(&nop_mnt_idmap, mnt->mnt_sb, NULL, - S_IFREG | S_IRWXUGO, 0, flags); + S_IFREG | S_IRWXUGO, 0, vm_flags); if (IS_ERR(inode)) { shmem_unacct_size(flags, size); return ERR_CAST(inode); --=20 2.51.0.536.g15c5d4f767-goog From nobody Sun Oct 5 07:24:32 2025 Received: from mail-qk1-f175.google.com (mail-qk1-f175.google.com [209.85.222.175]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0956927F198 for ; Mon, 29 Sep 2025 01:04:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.175 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107888; cv=none; b=h4IhTG6ZtBljZqda30ZK5nhJwpE5bs7BrFXY5aHJqCxoib+U/RmfZNJz/8aEcTF0sZrxsR0BVdg3CuNJ3Pvk6D5fCMFL0kHhoG+m79/A+/2k6Tgb4bugXTkzMZPWymqbKrE90MKWQRhgheLg0QNe3Z+FvBy0uXuCY0pYpF3fkOM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107888; c=relaxed/simple; bh=wKkOsRSYJa3vc4lYOd4d9l8kUjqRigmITBj/yOFEkIE=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=b42x46mxPGdBL5yrFy9TSBGllRJA+r/HwDRoV/Ify3mwLrcjEcgViTCNBtRyAJ4lyxrpbcyK1xepl/K3C4bHZ/eQ2n4fBeXbVwPf2vz1HpA/hUb2QTyaJnWHyD3LVdXr71Xdhum3Q+DoZrN2W4fVaL6MeSStMSJ/gSYjbGzkyC8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=XVMFzDf5; arc=none smtp.client-ip=209.85.222.175 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com 
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="XVMFzDf5" Received: by mail-qk1-f175.google.com with SMTP id af79cd13be357-854cfde0ca2so546343385a.3 for ; Sun, 28 Sep 2025 18:04:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1759107885; x=1759712685; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=a5jod2izHbVl6GoMuI25PsEFlFJNITXcz8AeEB8I3yk=; b=XVMFzDf55KpUL/PhJzfKCV3iKDZ5TJ0eWvCFG0YemrtrljsavxwxjoXxLTYJVeXaJ/ godOw6SVBFUcXrgXwopBV1k5CagHO2hm+JLDxxkR+ykmzomjXOiRbvstKUCC56FTOa/U Rv7ojf+Ib0S/8DZNajvbPdZ5GKtFFT/Jep+Zkx1PcxUQhkRkmZ9dXS6uDgASRYm04V0N GAmUNrBuTrbLvzfxKWffm0+sqsVW1swJCFW6SxLiS29UEc/HhrGcsyjFo7pgAsCuvfft SpjsZ53vGFfxC97OzW9hC7vWMtpFUCIXPox2NMQSn5cH75eeizoMDwm4oOsm5jjOvYpY /X/w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759107885; x=1759712685; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=a5jod2izHbVl6GoMuI25PsEFlFJNITXcz8AeEB8I3yk=; b=prkHNJDloUOhApjLncg8OrW11rS6Vw49B/YpPJJdBxxX4/E9SSFEzr2WmcAzVY84Cy tONlE11uwfvLY5tesAU6Ahc5Ba9v3THOwwyUtE+NgjLAzuuLSx6LT3K8XITSp7uZzDm6 e2o5Dx3PCdMECHKV1iN7q5Wr8jYW52QVaD6JdbbCUEMhSZA8stWrkN2tmoCsXpgV2Ykh iaGqpyRoWQ75ryQU6HjY3Oq73wAK3l+A1g/Uhdisr73Qiw505yzDnDiJ/sE8wcnSAh6/ kT0Hn7yctwGjK0OeKjEXdEmHv18jlwFaJ8mUoj/Zy2MFRpjOhxOlPPQOOzg4Fc6xWQuT eltA== X-Forwarded-Encrypted: i=1; AJvYcCUdrVSKyele2uT2MyHqHrzn4DbDTqteYMc1RFhT9faaiPQpcGJj5wOGbwGUAgiljSIqpYPOzDsSSgPF87E=@vger.kernel.org X-Gm-Message-State: AOJu0Yw1uvLK/yjmatgWDhRC8714Qw0tTbVzlZ8ySDp2R/bIEXK8uawe B8+gYV7kEAhP3G208CZ8HFP6HvRwDQ1tfDat6cA/0xarIx5J74CRdKDSbzJfmDystgk= X-Gm-Gg: ASbGncvfcRHdK01L6NTHMtUb81W4plGIllg8V4E5hxCoRNsjf/k10m2wfwiUBnf5d+5 
BmSI4MWsbwVLL/cbtE9ydg00aYcMmPzn4K7CDuiXWK+vnr0mgEzFcBW2rzCTuYnMNunzVXcIGaF oP/1W5o/d5VPLN417uQE18dLfG9SnASX173DcnV/fq3QaacI4zztfuWpE75vfgKpVLxRYJID9kJ 5QMGVaTPkSC2z0hbyZvqdx+8libtGQThuF4cjLnEKZ6v0aH3pz329Xw3P/kvlBjMUBbGBIPDeNv 9+gRDwo1KGAln8r7WAeI3HFiOIFpxIvBujo+XRBBcaKy10n6FhxfKkw92tQXmeCaKWY/ifO13Ed cmi6u8esT7k+Z2VRw1T88cwWZCAqy82iJaDzeTZxHFi0fyNYcHzbj5iXdt4Se6UJyFoSa5QEJ3C qpzuy0bnk= X-Google-Smtp-Source: AGHT+IHcWK3+r2DO1mFFWuWSG7txN7iCjmZaeFDLs8Xu4WrTbTEg8yxHdhQIyiKJ1YNZRSeC44Oztw== X-Received: by 2002:a05:620a:618b:b0:85c:809:3f10 with SMTP id af79cd13be357-85c080943d0mr1326429385a.26.1759107884637; Sun, 28 Sep 2025 18:04:44 -0700 (PDT) Received: from soleen.c.googlers.com.com (53.47.86.34.bc.googleusercontent.com. [34.86.47.53]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-4db0c0fbe63sm64561521cf.23.2025.09.28.18.04.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 28 Sep 2025 18:04:44 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, 
cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com Subject: [PATCH v4 22/30] mm: shmem: allow freezing inode mapping Date: Mon, 29 Sep 2025 01:03:13 +0000 Message-ID: <20250929010321.3462457-23-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com> References: <20250929010321.3462457-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Pratyush Yadav To prepare a shmem inode for live update via the Live Update Orchestrator (LUO), its index -> folio mappings must be serialized. Once the mappings are serialized, they must not change, since any change would make the serialized data inconsistent. This can be done by pinning the folios to avoid migration, and by making sure no folios can be added to or removed from the inode. While mechanisms to pin folios already exist, the only way to stop folios from being added or removed is the grow and shrink file seals. But file seals come with their own semantics, one of which is that they can't be removed.
This doesn't work with live update, since a live update can be cancelled or can fail, in which case the seals would have to be removed and the file's normal functionality restored. Introduce SHMEM_F_MAPPING_FROZEN to indicate this instead. It is internal to shmem and is not directly exposed to userspace. It functions similarly to F_SEAL_GROW | F_SEAL_SHRINK, but additionally disallows hole punching, and can be removed. Signed-off-by: Pratyush Yadav Signed-off-by: Pasha Tatashin --- include/linux/shmem_fs.h | 17 +++++++++++++++++ mm/shmem.c | 12 +++++++++++- 2 files changed, 28 insertions(+), 1 deletion(-) diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h index 650874b400b5..a9f5db472a39 100644 --- a/include/linux/shmem_fs.h +++ b/include/linux/shmem_fs.h @@ -24,6 +24,14 @@ struct swap_iocb; #define SHMEM_F_NORESERVE BIT(0) /* Disallow swapping. */ #define SHMEM_F_LOCKED BIT(1) +/* + * Disallow growing, shrinking, or hole punching in the inode. Combined wi= th + * folio pinning, makes sure the inode's mapping stays fixed. + * + * In some ways similar to F_SEAL_GROW | F_SEAL_SHRINK, but can be removed= and + * isn't directly visible to userspace. + */ +#define SHMEM_F_MAPPING_FROZEN BIT(2) =20 struct shmem_inode_info { spinlock_t lock; @@ -186,6 +194,15 @@ static inline bool shmem_file(struct file *file) return shmem_mapping(file->f_mapping); } =20 +/* Must be called with inode lock taken exclusive.
*/ +static inline void shmem_i_mapping_freeze(struct inode *inode, bool freeze) +{ + if (freeze) + SHMEM_I(inode)->flags |=3D SHMEM_F_MAPPING_FROZEN; + else + SHMEM_I(inode)->flags &=3D ~SHMEM_F_MAPPING_FROZEN; +} + /* * If fallocate(FALLOC_FL_KEEP_SIZE) has been used, there may be pages * beyond i_size's notion of EOF, which fallocate has committed to reservi= ng: diff --git a/mm/shmem.c b/mm/shmem.c index ce3b912f62da..bd7d9afe5a27 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -1292,7 +1292,8 @@ static int shmem_setattr(struct mnt_idmap *idmap, loff_t newsize =3D attr->ia_size; =20 /* protected by i_rwsem */ - if ((newsize < oldsize && (info->seals & F_SEAL_SHRINK)) || + if ((info->flags & SHMEM_F_MAPPING_FROZEN) || + (newsize < oldsize && (info->seals & F_SEAL_SHRINK)) || (newsize > oldsize && (info->seals & F_SEAL_GROW))) return -EPERM; =20 @@ -3287,6 +3288,10 @@ shmem_write_begin(const struct kiocb *iocb, struct a= ddress_space *mapping, return -EPERM; } =20 + if (unlikely((info->flags & SHMEM_F_MAPPING_FROZEN) && + pos + len > inode->i_size)) + return -EPERM; + ret =3D shmem_get_folio(inode, index, pos + len, &folio, SGP_WRITE); if (ret) return ret; @@ -3660,6 +3665,11 @@ static long shmem_fallocate(struct file *file, int m= ode, loff_t offset, =20 inode_lock(inode); =20 + if (info->flags & SHMEM_F_MAPPING_FROZEN) { + error =3D -EPERM; + goto out; + } + if (mode & FALLOC_FL_PUNCH_HOLE) { struct address_space *mapping =3D file->f_mapping; loff_t unmap_start =3D round_up(offset, PAGE_SIZE); --=20 2.51.0.536.g15c5d4f767-goog From nobody Sun Oct 5 07:24:32 2025 Received: from mail-qt1-f173.google.com (mail-qt1-f173.google.com [209.85.160.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4DCE129AB05 for ; Mon, 29 Sep 2025 01:04:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.173 ARC-Seal: i=1; 
a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107891; cv=none; b=k5ufMMDrRpf3Ut/4zgr2su0KdlHvUPnb++o++ImBFCGMT+SgwVhSWjadjYKOG/X/N+4svODfakK7SdVMTMkv/Pqc6dHWLx5Sl740V4Ylb4/KZryikUbGHegG6wh30Cejz8F/TKFsWcYT5TF6Eis9kDKro7ga69p+mjNWXxOCnn8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107891; c=relaxed/simple; bh=f4psfNd7pl4tLW9WB19i5zd8tji59wv/dnyXClT/vO4=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=NeIsGzUKlKoQy6LziL9v+uoCacpoYRIg3YXDFnzFAvncHPBZ7iWyLZ1PMDrAqjUWAfoIKy0tmgzbM9bRUtHbaOrKTsE7O5FTS6vC/th6mui/L7e9mR+TWyEgrt7+pXOLN4gfp3mJSOnUTRBIX/vuO53cE4BEMEMI3wnndDjaLLk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=h+0JsHvW; arc=none smtp.client-ip=209.85.160.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="h+0JsHvW" Received: by mail-qt1-f173.google.com with SMTP id d75a77b69052e-4dce9229787so33647971cf.0 for ; Sun, 28 Sep 2025 18:04:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1759107886; x=1759712686; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=o6klfNbsQaSU6FDtMq1rEXrmqStKMVNSQ5o5W34IBdA=; b=h+0JsHvW4V17uP+cyXnzHlSbn/ZPjdURR4ckrzz45eBqytrocZT9uJN7LJuqs+iwy2 itazGG3QDsfzAJzhQDlDLLoIcMTGQ2ue1DEEv3ymBFL0uk5iqQKypFZp30VvW8I+vyRB WwQf9I56nR/b1/O0VcnHVPji7ElBPeI9I321hEcxdb4PZwD9y4fOVqrBFf0n4Ejo3ToK 4a6+g/TUGksmEC8qo4JuBpEDQpb0ZLuDVK8ykHvMkOGB/viUTjFwzcanZbEw4Ra4rgqq 
L1O25eHPZR/7f1QYOukG+AqSh0qwnkiotcXVziCTp5vfl1umWagjVIst3en+pvkizK4L prig== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759107886; x=1759712686; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=o6klfNbsQaSU6FDtMq1rEXrmqStKMVNSQ5o5W34IBdA=; b=YImFnnAcZNWxDQQi3cuO2l+5PjFWv5eE9Qq/gCwjpjAzRns01nnEM9NVBhBkvNjKz+ kiH7VZ52EvWFMufHrchaw2ld3IZv1rD3I0oGdjsoFogH+j3MQnQFlLEaBosB8IaU/WKp IykE04ofnN+vYXbsDgnh8QfxL+A3+Sw5OJK6Sa/evCOe78eJv4/h70eRFygM94m1j1Xp KXglPJol95nf1dD2iU8Aew70WgV56Jq+EFrdOJR1X0tK6IWzjFreeWWNaRbJSzZzzpBc L4jSs56iRwl5UPa7MLACmA8UCWsjKKGZ/0QPgqeXwSLCBmSfVLsg0XH3p2ECm7amASkH 12mQ== X-Forwarded-Encrypted: i=1; AJvYcCXz0e1TndHSNSXgkHybc3dyWRGk7x2CQtvUaXVTE7dV9f7a+gdJTynvsai4vl5gb06WEZKcLhw0WRD9B4U=@vger.kernel.org X-Gm-Message-State: AOJu0Yz+0EroBMc9kMhNWl3/qvqxFiXYnXr2KtAu4DY6or3LG8/qA2t+ O74FnyizT1WEGMzvz7M5OIUZomGuDVNwYb1RIssOwJ3qQ8myq0m+FDE1Vflb6AoSweg= X-Gm-Gg: ASbGnctbOUXLVWZg0LwX7WSHr2/x6Dp1lb2A/Ra84l6SCHPtlKuxBeqLG1AbS+Bg6V0 n6nfleJsPvWdM8MnbSbksSX53V1OOKIxbhNRqeL0CXUFCEAWgQB00rCsfW6csO5gToAqOPCtkSx nKpFe/Jg8lxAJ93FlFzWfQUi7BtOh0qxHmAdz1vDyptwV7rgBdOWCo1vVZ6uaIQUOQ1xtq5azPY AykPGuuOOBaieXGRXNQcSJ+X2gcFgW5DFHIEPiffE3cVMOYJ/c6Hnipr3dExsdVSKwmcPAh+NqM 2dyQzenNgnyvAyTw+Av/Cdnio225k3FaSNmX0BSf4lQWDSRTxQm0g5GEy9UcDP9TsQvWS8r8U3M op4HguG4kVezZCtoojDeoSxxC7+miQFemR/IZNvuzWolyE38u6acJftQQYEzCBcEIAKNWBE9118 DG3QSqWB6eRHWRZoEo24jmk/4x62Im X-Google-Smtp-Source: AGHT+IHEKESKD14PFByufgCCj+q1xbdbjfqjaSBhECif5xZZLUIBv1xuJQ+qilMP2TC7MJ+Z1A1Lng== X-Received: by 2002:a05:622a:19a3:b0:4df:194:b46d with SMTP id d75a77b69052e-4df0194c12cmr84674601cf.80.1759107886091; Sun, 28 Sep 2025 18:04:46 -0700 (PDT) Received: from soleen.c.googlers.com.com (53.47.86.34.bc.googleusercontent.com. 
[34.86.47.53]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-4db0c0fbe63sm64561521cf.23.2025.09.28.18.04.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 28 Sep 2025 18:04:45 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com Subject: [PATCH v4 23/30] mm: shmem: export some functions to internal.h Date: Mon, 
29 Sep 2025 01:03:14 +0000 Message-ID: <20250929010321.3462457-24-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com> References: <20250929010321.3462457-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Pratyush Yadav shmem_inode_acct_blocks(), shmem_recalc_inode(), and shmem_add_to_page_cache() are used by shmem_alloc_and_add_folio(). This functionality will also be used in the future by Live Update Orchestrator (LUO) to recreate memfd files after a live update. Signed-off-by: Pratyush Yadav Signed-off-by: Pasha Tatashin --- mm/internal.h | 6 ++++++ mm/shmem.c | 10 +++++----- 2 files changed, 11 insertions(+), 5 deletions(-) diff --git a/mm/internal.h b/mm/internal.h index 1561fc2ff5b8..4ba155524f80 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -1562,6 +1562,12 @@ void __meminit __init_page_from_nid(unsigned long pf= n, int nid); unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memc= g, int priority); =20 +int shmem_add_to_page_cache(struct folio *folio, + struct address_space *mapping, + pgoff_t index, void *expected, gfp_t gfp); +int shmem_inode_acct_blocks(struct inode *inode, long pages); +bool shmem_recalc_inode(struct inode *inode, long alloced, long swapped); + #ifdef CONFIG_SHRINKER_DEBUG static inline __printf(2, 0) int shrinker_debugfs_name_alloc( struct shrinker *shrinker, const char *fmt, va_list ap) diff --git a/mm/shmem.c b/mm/shmem.c index bd7d9afe5a27..4647a0b2831c 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -219,7 +219,7 @@ static inline void shmem_unacct_blocks(unsigned long fl= ags, long pages) vm_unacct_memory(pages * VM_ACCT(PAGE_SIZE)); } =20 -static int shmem_inode_acct_blocks(struct inode *inode, long pages) +int 
shmem_inode_acct_blocks(struct inode *inode, long pages) { struct shmem_inode_info *info =3D SHMEM_I(inode); struct shmem_sb_info *sbinfo =3D SHMEM_SB(inode->i_sb); @@ -435,7 +435,7 @@ static void shmem_free_inode(struct super_block *sb, si= ze_t freed_ispace) * * Return: true if swapped was incremented from 0, for shmem_writeout(). */ -static bool shmem_recalc_inode(struct inode *inode, long alloced, long swa= pped) +bool shmem_recalc_inode(struct inode *inode, long alloced, long swapped) { struct shmem_inode_info *info =3D SHMEM_I(inode); bool first_swapped =3D false; @@ -861,9 +861,9 @@ static void shmem_update_stats(struct folio *folio, int= nr_pages) /* * Somewhat like filemap_add_folio, but error if expected item has gone. */ -static int shmem_add_to_page_cache(struct folio *folio, - struct address_space *mapping, - pgoff_t index, void *expected, gfp_t gfp) +int shmem_add_to_page_cache(struct folio *folio, + struct address_space *mapping, + pgoff_t index, void *expected, gfp_t gfp) { XA_STATE_ORDER(xas, &mapping->i_pages, index, folio_order(folio)); unsigned long nr =3D folio_nr_pages(folio); --=20 2.51.0.536.g15c5d4f767-goog From nobody Sun Oct 5 07:24:32 2025 Received: from mail-qt1-f171.google.com (mail-qt1-f171.google.com [209.85.160.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E968F26B0AE for ; Mon, 29 Sep 2025 01:04:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107892; cv=none; b=HOsnb7kuxI9AQO3Vo0Bvg5dDXntRt1y4dfNUKq69A/2LoF+B0pRSjqontnB7Tiv0jvY3+wTlP2+FpHEyQG1moSZIpkseThsETkjrpf92vKXAAv8GkpIqb0b+H9PT3XNshl2qe0pqt6KFUTzokkqGXWta8Bm42kh4/i4AX2qm5U8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107892; c=relaxed/simple; 
bh=atx7+45CbzyMl9240tLjOxzl3me8oSMCbHABSuWi61w=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Lt7zCtxFQDGF4tf8OEi4Z3QrP1LiPc/T2DX/znD9xG3GsBQHgDep44KhPENUfeT/CP28lB1pjLV58imI7qq4D15pem7Z/y+A1ZF6omdTBD0i+Ma8e9r6dA/F/FcZPPoJUOcHgMErxfS1VQmcROsO7GktToNBl5+4za5Mz9HMx94= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=Hm4aFKwA; arc=none smtp.client-ip=209.85.160.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="Hm4aFKwA" Received: by mail-qt1-f171.google.com with SMTP id d75a77b69052e-4dfe74ed2e1so12567121cf.2 for ; Sun, 28 Sep 2025 18:04:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1759107888; x=1759712688; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=pwXh0/kKCu0f+t04ISmoQpHfND/zfxsioOU5iAk15yM=; b=Hm4aFKwA9MRTxuJ67DDDYFEYA+z6mMjMQ68x0toevSQ6MScm5DbnCVNNNvDZNw8grE v9tv/Bn+fgXPWclF4cvo2G4Rgmn0RlDBhOG2yobPGNxzwedzOzpvsgVPzBAbzV7Rybum K/sHp4n8uE7gtXg/exmo+yGzgkIPxvw39sKS4akFMwUowYMo75+9zj6Bt906NumdJ1x3 6oyK+j5cB9x3NeN7MNdnXQgvKI8C0MAveypU1KWXTaApnW59pdg9/+dPXbEeMRZ71niM FqxYOlFpM5XJNMKYBybu5Fz+0WUwFAeScm610nVyuOjB177LK4C4TtntiD54F187oifO 5yhQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759107888; x=1759712688; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; 
bh=pwXh0/kKCu0f+t04ISmoQpHfND/zfxsioOU5iAk15yM=; b=SKcPT23RqAuC6vo5Sk0JO3jqhu0bI16LT55QNWmnjoAiUZ7NaFCFdrIWTslyOQqqUO AbJ9uY/TnWpFqwM/RLVJBrrfK3hbIAUNrXrXR66dy0LpGwobLBIXLFm2wVY+MtSSLIkk 8mPZjur+Cli/BlSm4U0XWvNd0PdxgFBXC1D77KU7ZRz+fUbNEhrF6u5S2tC02btELlW1 xeD94xZsCezOkKifmveutAD+v0xB5MPqV/MJnk6ueEOMFog6eY+j5frko72ZhrDVzPky SrhlCVlYks6mZ7Ue41LXOjRS7kTurDc9LN/YVedlqVB4YEEZPsr1Vaqx4iSrHeNpjll8 ngeA== X-Forwarded-Encrypted: i=1; AJvYcCWAQ8Ukceu6subUXsE1RTrNbCrEufgDs/d0uB1XFTtqLTCOUlvgI2uRuUgkyWR6qAfMKG/AdpuCN/iFiVs=@vger.kernel.org X-Gm-Message-State: AOJu0YzhWyV5wR28fBd/l/+2YHDIEwdFtPvUYYqDlEa3Hdk2OMIOvUeT oh0Gku3orVMEkEvZMbt5AomfBqI7LpG+3SduYBJYfWnq3yKeVq28Wyov2OVLSXxI4lE= X-Gm-Gg: ASbGncsdkq7mfv68YCgESRh+0riuL+4LH30jDKNudDtvyHdCV9Zh7f7t4moeDhs35Cv 6MqyOctikfkNVIDKTavYnIw+mzgQe6j6mh+/B2CGUA+t7umLVMD7v+8JGTwM4fS70Bb0215EvBo cIrkb1+ATAcqo2CmFOwzRZ19AK4QfPs9ktSFlEVhOhFxqGlTnfjulHD/JjqphQg//1mgAhZNftS lE2r0S9OYSky+PeWxm/6s8xPeVoSeSWisrUXY5t4X+QM45Rd95pnpBxXnBlYDXczlR+ICP8seaA 9F4VRd0vTj+pF5vd4nf+osBEIbrgndE7Zk1/mdaNDnEi3HJ0akT3E9h8/JU+4YQiHuYuVdjYuOM cAf+g1/hgSTq88SK4wDh+zQQ0SmxXnKwTxRVydocvHbgLpMDWDCciRL3seoBjwydgzEvfMJrDrO b9gG9v8zSGf5SJztIz0g== X-Google-Smtp-Source: AGHT+IGRP/eMVDf7zXMK0pi9Zti3NauS9lYe/VmLc3i8GoMpiU6Y/BcqSpE7f6fMLGjBIl3ZckzaXg== X-Received: by 2002:a05:622a:1b07:b0:4c0:983:9436 with SMTP id d75a77b69052e-4da48e74a08mr188378531cf.34.1759107887482; Sun, 28 Sep 2025 18:04:47 -0700 (PDT) Received: from soleen.c.googlers.com.com (53.47.86.34.bc.googleusercontent.com. 
[34.86.47.53]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-4db0c0fbe63sm64561521cf.23.2025.09.28.18.04.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 28 Sep 2025 18:04:46 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com Subject: [PATCH v4 24/30] luo: allow preserving memfd Date: Mon, 29 Sep 2025 
01:03:15 +0000 Message-ID: <20250929010321.3462457-25-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com> References: <20250929010321.3462457-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Pratyush Yadav The ability to preserve a memfd allows userspace to use KHO and LUO to transfer its memory contents to the next kernel. This is useful in many ways. For one, it can be used with IOMMUFD as the backing store for IOMMU page tables. Preserving IOMMUFD is essential for performing a hypervisor live update with passthrough devices. memfd support provides the first building block for making that possible. For another, for applications with a large amount of in-memory state that takes time to reconstruct, rebooting to pick up kernel upgrades can be very expensive. memfd with LUO gives those applications reboot-persistent memory that they can use to quickly save and reconstruct that state. While memfd is backed by either hugetlbfs or shmem, support is currently added only for shmem. To be more precise, support for anonymous shmem files is added. The handover to the next kernel is not transparent. Not all properties of the file are preserved; only its memory contents, position, and size are. The recreated file gets the UID and GID of the task doing the restore, and the task's cgroup gets charged with the memory. Once LUO is in the prepared state, the file cannot grow or shrink, and all its pages are pinned to avoid migration and swapping. The file can still be read from or written to.
Co-developed-by: Changyuan Lyu Signed-off-by: Changyuan Lyu Co-developed-by: Pasha Tatashin Signed-off-by: Pasha Tatashin Signed-off-by: Pratyush Yadav --- MAINTAINERS | 2 + mm/Makefile | 1 + mm/memfd_luo.c | 523 +++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 526 insertions(+) create mode 100644 mm/memfd_luo.c diff --git a/MAINTAINERS b/MAINTAINERS index e99af6101d3c..a17e4e077174 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -14433,6 +14433,7 @@ F: tools/testing/selftests/livepatch/ =20 LIVE UPDATE M: Pasha Tatashin +R: Pratyush Yadav L: linux-kernel@vger.kernel.org S: Maintained F: Documentation/ABI/testing/sysfs-kernel-liveupdate @@ -14441,6 +14442,7 @@ F: Documentation/userspace-api/liveupdate.rst F: include/linux/liveupdate.h F: include/uapi/linux/liveupdate.h F: kernel/liveupdate/ +F: mm/memfd_luo.c F: tools/testing/selftests/liveupdate/ =20 LLC (802.2) diff --git a/mm/Makefile b/mm/Makefile index 21abb3353550..7738ec416f00 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -100,6 +100,7 @@ obj-$(CONFIG_NUMA) +=3D memory-tiers.o obj-$(CONFIG_DEVICE_MIGRATION) +=3D migrate_device.o obj-$(CONFIG_TRANSPARENT_HUGEPAGE) +=3D huge_memory.o khugepaged.o obj-$(CONFIG_PAGE_COUNTER) +=3D page_counter.o +obj-$(CONFIG_LIVEUPDATE) +=3D memfd_luo.o obj-$(CONFIG_MEMCG_V1) +=3D memcontrol-v1.o obj-$(CONFIG_MEMCG) +=3D memcontrol.o vmpressure.o ifdef CONFIG_SWAP diff --git a/mm/memfd_luo.c b/mm/memfd_luo.c new file mode 100644 index 000000000000..221e31c1197e --- /dev/null +++ b/mm/memfd_luo.c @@ -0,0 +1,523 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + * Changyuan Lyu + * + * Copyright (C) 2025 Amazon.com Inc. or its affiliates. 
+ * Pratyush Yadav + */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include +#include +#include +#include +#include +#include +#include +#include "internal.h" + +#define PRESERVED_PFN_MASK GENMASK(63, 12) +#define PRESERVED_PFN_SHIFT 12 +#define PRESERVED_FLAG_DIRTY BIT(0) +#define PRESERVED_FLAG_UPTODATE BIT(1) + +#define PRESERVED_FOLIO_PFN(desc) (((desc) & PRESERVED_PFN_MASK) >> PRESER= VED_PFN_SHIFT) +#define PRESERVED_FOLIO_FLAGS(desc) ((desc) & ~PRESERVED_PFN_MASK) +#define PRESERVED_FOLIO_MKDESC(pfn, flags) (((pfn) << PRESERVED_PFN_SHIFT)= | (flags)) + +struct memfd_luo_preserved_folio { + /* + * The folio descriptor is made of 2 parts. The bottom 12 bits are used + * for storing flags, the others for storing the PFN. + */ + u64 foliodesc; + u64 index; +}; + +static int memfd_luo_preserve_folios(struct memfd_luo_preserved_folio *pfo= lios, + struct folio **folios, + unsigned int nr_folios) +{ + int err; + long i; + + for (i =3D 0; i < nr_folios; i++) { + struct memfd_luo_preserved_folio *pfolio =3D &pfolios[i]; + struct folio *folio =3D folios[i]; + unsigned int flags =3D 0; + unsigned long pfn; + + err =3D kho_preserve_folio(folio); + if (err) + goto err_unpreserve; + + pfn =3D folio_pfn(folio); + if (folio_test_dirty(folio)) + flags |=3D PRESERVED_FLAG_DIRTY; + if (folio_test_uptodate(folio)) + flags |=3D PRESERVED_FLAG_UPTODATE; + + pfolio->foliodesc =3D PRESERVED_FOLIO_MKDESC(pfn, flags); + pfolio->index =3D folio->index; + } + + return 0; + +err_unpreserve: + i--; + for (; i >=3D 0; i--) + WARN_ON_ONCE(kho_unpreserve_folio(folios[i])); + return err; +} + +static void memfd_luo_unpreserve_folios(const struct memfd_luo_preserved_f= olio *pfolios, + unsigned int nr_folios) +{ + unsigned int i; + + for (i =3D 0; i < nr_folios; i++) { + const struct memfd_luo_preserved_folio *pfolio =3D &pfolios[i]; + struct folio *folio; + + if (!pfolio->foliodesc) + continue; + + folio =3D pfn_folio(PRESERVED_FOLIO_PFN(pfolio->foliodesc)); + + 
WARN_ON_ONCE(kho_unpreserve_folio(folio)); + unpin_folio(folio); + } +} + +static void *memfd_luo_create_fdt(unsigned long size) +{ + unsigned int order =3D get_order(size); + struct folio *fdt_folio; + int err =3D 0; + void *fdt; + + if (order > MAX_PAGE_ORDER) + return NULL; + + fdt_folio =3D folio_alloc(GFP_KERNEL | __GFP_ZERO, order); + if (!fdt_folio) + return NULL; + + fdt =3D folio_address(fdt_folio); + + err |=3D fdt_create(fdt, (1 << (order + PAGE_SHIFT))); + err |=3D fdt_finish_reservemap(fdt); + err |=3D fdt_begin_node(fdt, ""); + if (err) + goto free; + + return fdt; + +free: + folio_put(fdt_folio); + return NULL; +} + +static int memfd_luo_finish_fdt(void *fdt) +{ + int err; + + err =3D fdt_end_node(fdt); + if (err) + return err; + + return fdt_finish(fdt); +} + +static int memfd_luo_prepare(struct liveupdate_file_handler *handler, + struct file *file, u64 *data) +{ + struct memfd_luo_preserved_folio *preserved_folios; + struct inode *inode =3D file_inode(file); + unsigned int max_folios, nr_folios =3D 0; + int err =3D 0, preserved_size; + struct folio **folios; + long size, nr_pinned; + pgoff_t offset; + void *fdt; + u64 pos; + + inode_lock(inode); + shmem_i_mapping_freeze(inode, true); + + size =3D i_size_read(inode); + if ((PAGE_ALIGN(size) / PAGE_SIZE) > UINT_MAX) { + err =3D -E2BIG; + goto err_unlock; + } + + /* + * Guess the number of folios based on inode size. Real number might end + * up being smaller if there are higher order folios. + */ + max_folios =3D PAGE_ALIGN(size) / PAGE_SIZE; + folios =3D kvmalloc_array(max_folios, sizeof(*folios), GFP_KERNEL); + if (!folios) { + err =3D -ENOMEM; + goto err_unfreeze; + } + + /* + * Pin the folios so they don't move around behind our back. This also + * ensures none of the folios are in CMA -- which ensures they don't + * fall in KHO scratch memory. It also moves swapped out folios back to + * memory. + * + * A side effect of doing this is that it allocates a folio for all + * indices in the file. 
This might waste memory on sparse memfds. If + * that is really a problem in the future, we can have a + * memfd_pin_folios() variant that does not allocate a page on empty + * slots. + */ + nr_pinned =3D memfd_pin_folios(file, 0, size - 1, folios, max_folios, + &offset); + if (nr_pinned < 0) { + err =3D nr_pinned; + pr_err("failed to pin folios: %d\n", err); + goto err_free_folios; + } + /* nr_pinned won't be more than max_folios which is also unsigned int. */ + nr_folios =3D (unsigned int)nr_pinned; + + preserved_size =3D sizeof(struct memfd_luo_preserved_folio) * nr_folios; + if (check_mul_overflow(sizeof(struct memfd_luo_preserved_folio), + nr_folios, &preserved_size)) { + err =3D -E2BIG; + goto err_unpin; + } + + /* + * Most of the space should be taken by preserved folios. So take its + * size, plus a page for other properties. + */ + fdt =3D memfd_luo_create_fdt(PAGE_ALIGN(preserved_size) + PAGE_SIZE); + if (!fdt) { + err =3D -ENOMEM; + goto err_unpin; + } + + pos =3D file->f_pos; + err =3D fdt_property(fdt, "pos", &pos, sizeof(pos)); + if (err) + goto err_free_fdt; + + err =3D fdt_property(fdt, "size", &size, sizeof(size)); + if (err) + goto err_free_fdt; + + err =3D fdt_property_placeholder(fdt, "folios", preserved_size, + (void **)&preserved_folios); + if (err) { + pr_err("Failed to reserve folios property in FDT: %s\n", + fdt_strerror(err)); + err =3D -ENOMEM; + goto err_free_fdt; + } + + err =3D memfd_luo_preserve_folios(preserved_folios, folios, nr_folios); + if (err) + goto err_free_fdt; + + err =3D memfd_luo_finish_fdt(fdt); + if (err) + goto err_unpreserve; + + err =3D kho_preserve_folio(virt_to_folio(fdt)); + if (err) + goto err_unpreserve; + + kvfree(folios); + inode_unlock(inode); + + *data =3D virt_to_phys(fdt); + return 0; + +err_unpreserve: + memfd_luo_unpreserve_folios(preserved_folios, nr_folios); +err_free_fdt: + folio_put(virt_to_folio(fdt)); +err_unpin: + unpin_folios(folios, nr_pinned); +err_free_folios: + kvfree(folios); +err_unfreeze: 
+ shmem_i_mapping_freeze(inode, false); +err_unlock: + inode_unlock(inode); + return err; +} + +static int memfd_luo_freeze(struct liveupdate_file_handler *handler, + struct file *file, u64 *data) +{ + u64 pos =3D file->f_pos; + void *fdt; + int err; + + if (WARN_ON_ONCE(!*data)) + return -EINVAL; + + fdt =3D phys_to_virt(*data); + + /* + * The pos might have changed since prepare. Everything else stays the + * same. + */ + err =3D fdt_setprop(fdt, 0, "pos", &pos, sizeof(pos)); + if (err) + return err; + + return 0; +} + +static void memfd_luo_cancel(struct liveupdate_file_handler *handler, + struct file *file, u64 data) +{ + const struct memfd_luo_preserved_folio *pfolios; + struct inode *inode =3D file_inode(file); + struct folio *fdt_folio; + void *fdt; + int len; + + if (WARN_ON_ONCE(!data)) + return; + + inode_lock(inode); + shmem_i_mapping_freeze(inode, false); + + fdt =3D phys_to_virt(data); + fdt_folio =3D virt_to_folio(fdt); + pfolios =3D fdt_getprop(fdt, 0, "folios", &len); + if (pfolios) + memfd_luo_unpreserve_folios(pfolios, len / sizeof(*pfolios)); + + kho_unpreserve_folio(fdt_folio); + folio_put(fdt_folio); + inode_unlock(inode); +} + +static struct folio *memfd_luo_get_fdt(u64 data) +{ + return kho_restore_folio((phys_addr_t)data); +} + +static void memfd_luo_discard_folios(const struct memfd_luo_preserved_foli= o *pfolios, + unsigned int nr_folios) +{ + unsigned int i; + + for (i =3D 0; i < nr_folios; i++) { + const struct memfd_luo_preserved_folio *pfolio =3D &pfolios[i]; + struct folio *folio; + phys_addr_t phys; + + if (!pfolio->foliodesc) + continue; + + phys =3D PFN_PHYS(PRESERVED_FOLIO_PFN(pfolio->foliodesc)); + folio =3D kho_restore_folio(phys); + if (!folio) { + pr_warn_ratelimited("Unable to restore folio at physical address: %llx\= n", + phys); + continue; + } + + folio_put(folio); + } +} + +static void memfd_luo_finish(struct liveupdate_file_handler *handler, + struct file *file, u64 data, bool reclaimed) +{ + const struct 
memfd_luo_preserved_folio *pfolios; + struct folio *fdt_folio; + int len; + + if (reclaimed) + return; + + fdt_folio =3D memfd_luo_get_fdt(data); + + pfolios =3D fdt_getprop(folio_address(fdt_folio), 0, "folios", &len); + if (pfolios) + memfd_luo_discard_folios(pfolios, len / sizeof(*pfolios)); + + folio_put(fdt_folio); +} + +static int memfd_luo_retrieve(struct liveupdate_file_handler *handler, u64= data, + struct file **file_p) +{ + const struct memfd_luo_preserved_folio *pfolios; + int nr_pfolios, len, ret =3D 0, i =3D 0; + struct address_space *mapping; + struct folio *folio, *fdt_folio; + const u64 *pos, *size; + struct inode *inode; + struct file *file; + const void *fdt; + + fdt_folio =3D memfd_luo_get_fdt(data); + if (!fdt_folio) + return -ENOENT; + + fdt =3D page_to_virt(folio_page(fdt_folio, 0)); + + pfolios =3D fdt_getprop(fdt, 0, "folios", &len); + if (!pfolios || len % sizeof(*pfolios)) { + pr_err("invalid 'folios' property\n"); + ret =3D -EINVAL; + goto put_fdt; + } + nr_pfolios =3D len / sizeof(*pfolios); + + size =3D fdt_getprop(fdt, 0, "size", &len); + if (!size || len !=3D sizeof(u64)) { + pr_err("invalid 'size' property\n"); + ret =3D -EINVAL; + goto put_folios; + } + + pos =3D fdt_getprop(fdt, 0, "pos", &len); + if (!pos || len !=3D sizeof(u64)) { + pr_err("invalid 'pos' property\n"); + ret =3D -EINVAL; + goto put_folios; + } + + file =3D shmem_file_setup("", 0, VM_NORESERVE); + + if (IS_ERR(file)) { + ret =3D PTR_ERR(file); + pr_err("failed to setup file: %d\n", ret); + goto put_folios; + } + + inode =3D file->f_inode; + mapping =3D inode->i_mapping; + vfs_setpos(file, *pos, MAX_LFS_FILESIZE); + + for (; i < nr_pfolios; i++) { + const struct memfd_luo_preserved_folio *pfolio =3D &pfolios[i]; + phys_addr_t phys; + u64 index; + int flags; + + if (!pfolio->foliodesc) + continue; + + phys =3D PFN_PHYS(PRESERVED_FOLIO_PFN(pfolio->foliodesc)); + folio =3D kho_restore_folio(phys); + if (!folio) { + pr_err("Unable to restore folio at physical address: 
%llx\n", + phys); + goto put_file; + } + index =3D pfolio->index; + flags =3D PRESERVED_FOLIO_FLAGS(pfolio->foliodesc); + + /* Set up the folio for insertion. */ + __folio_set_locked(folio); + __folio_set_swapbacked(folio); + + ret =3D mem_cgroup_charge(folio, NULL, mapping_gfp_mask(mapping)); + if (ret) { + pr_err("shmem: failed to charge folio index %d: %d\n", + i, ret); + goto unlock_folio; + } + + ret =3D shmem_add_to_page_cache(folio, mapping, index, NULL, + mapping_gfp_mask(mapping)); + if (ret) { + pr_err("shmem: failed to add to page cache folio index %d: %d\n", + i, ret); + goto unlock_folio; + } + + if (flags & PRESERVED_FLAG_UPTODATE) + folio_mark_uptodate(folio); + if (flags & PRESERVED_FLAG_DIRTY) + folio_mark_dirty(folio); + + ret =3D shmem_inode_acct_blocks(inode, 1); + if (ret) { + pr_err("shmem: failed to account folio index %d: %d\n", + i, ret); + goto unlock_folio; + } + + shmem_recalc_inode(inode, 1, 0); + folio_add_lru(folio); + folio_unlock(folio); + folio_put(folio); + } + + inode->i_size =3D *size; + *file_p =3D file; + folio_put(fdt_folio); + return 0; + +unlock_folio: + folio_unlock(folio); + folio_put(folio); +put_file: + fput(file); + i++; +put_folios: + for (; i < nr_pfolios; i++) { + const struct memfd_luo_preserved_folio *pfolio =3D &pfolios[i]; + + folio =3D kho_restore_folio(PRESERVED_FOLIO_PFN(pfolio->foliodesc)); + if (folio) + folio_put(folio); + } + +put_fdt: + folio_put(fdt_folio); + return ret; +} + +static bool memfd_luo_can_preserve(struct liveupdate_file_handler *handler, + struct file *file) +{ + struct inode *inode =3D file_inode(file); + + return shmem_file(file) && !inode->i_nlink; +} + +static const struct liveupdate_file_ops memfd_luo_file_ops =3D { + .prepare =3D memfd_luo_prepare, + .freeze =3D memfd_luo_freeze, + .cancel =3D memfd_luo_cancel, + .finish =3D memfd_luo_finish, + .retrieve =3D memfd_luo_retrieve, + .can_preserve =3D memfd_luo_can_preserve, + .owner =3D THIS_MODULE, +}; + +static struct 
liveupdate_file_handler memfd_luo_handler =3D { + .ops =3D &memfd_luo_file_ops, + .compatible =3D "memfd-v1", +}; + +static int __init memfd_luo_init(void) +{ + int err; + + err =3D liveupdate_register_file_handler(&memfd_luo_handler); + if (err) + pr_err("Could not register luo filesystem handler: %d\n", err); + + return err; +} +late_initcall(memfd_luo_init); --=20 2.51.0.536.g15c5d4f767-goog From nobody Sun Oct 5 07:24:32 2025 Received: from mail-qt1-f182.google.com (mail-qt1-f182.google.com [209.85.160.182]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 925EC1C54AF for ; Mon, 29 Sep 2025 01:04:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.182 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107893; cv=none; b=rgx04THhHIccvaDC8Mr0T832xDPcZwPi3ACzjn1zCGi+xD/DwPcNuSxj3dxcn2FKQcphMyMcCWw8DfvyuUZy6ubgMOunxGNoTZwXli7p4oN/QsZ+mjggl2doXXLm0EpZ0iHq3bhQndA5bvMvPOOrnKFz34WA5iZucMDqKsZjvAQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107893; c=relaxed/simple; bh=WXzDViG1EBFvkKiOeUeAP9q+pypvSpFhJ7bHYH+HoEo=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=axYX30GdIBoFn+n8hOCXdWnH2NZMoOmH2cLuCW/i5HepkC9ZcNQbqanr9M6drlbCAH5oQ7JzA9+WF6D95MgNiaBxVZwGF+N0IFP0f/+6zNeHKxU1c/ucyTRSQ30bZCjEZ3o3R3Lu48sGMD9m8yNbS/rOaEtuhbYRSHfjRHs+sGk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=IbWcf+xV; arc=none smtp.client-ip=209.85.160.182 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: 
kmbcTPR3cK0D9ClJTRmAo+KHyXodNbC6KmTBVB4udmKGBZST1+gGRUnoOpXNF6m+kVwthgKYue/ IFhCKNc1vwEcLozJVII4JeArhoebXp9xiw8Y51fY2qgC+bYJ0cYvaQ9Bt79XB5b5HNtxrrVcH/n SZ0GVZRJMj6lUPAX0J+KB43QyOJqxTrw2QyP/A1EE3gc9YDtpeRXG88vcsUjAwW+k+bL8MUP0gi dlWos+RE3cafbe0IZFhUR+4HD/5tM2Am2LhWmxsZTD4Ano/myEY2YaKD7QRf9ZScQwFdwIiV7Ua o3x5AGduQlWXhji52eXKfGKs1fPPEgfbaj8d4bVNHF5CONnulTmj/7okN7XY2Kb9w5nHrX7z/da 3LKeXnZhBE7sSigHFIaQ== X-Google-Smtp-Source: AGHT+IHY12mDEvSS4c6BYDbWBSp3AujLQ3K0TRMk1C+DYz/5jFy/PbxR43Ax3YllToE+repLGKruxQ== X-Received: by 2002:a05:622a:5987:b0:4d9:5efc:2dce with SMTP id d75a77b69052e-4da47827704mr211866571cf.11.1759107888907; Sun, 28 Sep 2025 18:04:48 -0700 (PDT) Received: from soleen.c.googlers.com.com (53.47.86.34.bc.googleusercontent.com. [34.86.47.53]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-4db0c0fbe63sm64561521cf.23.2025.09.28.18.04.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 28 Sep 2025 18:04:48 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, 
bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com Subject: [PATCH v4 25/30] docs: add documentation for memfd preservation via LUO Date: Mon, 29 Sep 2025 01:03:16 +0000 Message-ID: <20250929010321.3462457-26-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com> References: <20250929010321.3462457-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Pratyush Yadav Add the documentation under the "Preserving file descriptors" section of LUO's documentation. The doc describes the properties preserved, behaviour of the file under different LUO states, serialization format, and current limitations. 
Signed-off-by: Pratyush Yadav
Signed-off-by: Pasha Tatashin
---
 Documentation/core-api/liveupdate.rst   |   7 ++
 Documentation/mm/index.rst              |   1 +
 Documentation/mm/memfd_preservation.rst | 138 ++++++++++++++++++++++++
 MAINTAINERS                             |   1 +
 4 files changed, 147 insertions(+)
 create mode 100644 Documentation/mm/memfd_preservation.rst

diff --git a/Documentation/core-api/liveupdate.rst b/Documentation/core-api/liveupdate.rst
index 7c1c3af6f960..b44710d75088 100644
--- a/Documentation/core-api/liveupdate.rst
+++ b/Documentation/core-api/liveupdate.rst
@@ -23,6 +23,13 @@ LUO Preserving File Descriptors
 .. kernel-doc:: kernel/liveupdate/luo_file.c
    :doc: LUO file descriptors
 
+The following types of file descriptors can be preserved:
+
+.. toctree::
+   :maxdepth: 1
+
+   ../mm/memfd_preservation
+
 Public API
 ==========
 .. kernel-doc:: include/linux/liveupdate.h
diff --git a/Documentation/mm/index.rst b/Documentation/mm/index.rst
index ba6a8872849b..7aa2a8886908 100644
--- a/Documentation/mm/index.rst
+++ b/Documentation/mm/index.rst
@@ -48,6 +48,7 @@ documentation, or deleted if it has served its purpose.
    hugetlbfs_reserv
    ksm
    memory-model
+   memfd_preservation
    mmu_notifier
    multigen_lru
    numa
diff --git a/Documentation/mm/memfd_preservation.rst b/Documentation/mm/memfd_preservation.rst
new file mode 100644
index 000000000000..3fc612e1288c
--- /dev/null
+++ b/Documentation/mm/memfd_preservation.rst
@@ -0,0 +1,138 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+
+==========================
+Memfd Preservation via LUO
+==========================
+
+Overview
+========
+
+Memory file descriptors (memfd) can be preserved over a kexec using the Live
+Update Orchestrator (LUO) file preservation. This allows userspace to transfer
+its memory contents to the next kernel after a kexec.
+
+The preservation is not intended to be transparent. Only select properties of
+the file are preserved. All others are reset to default. The preserved
+properties are described below.
+
+.. note::
+   The LUO API is not stabilized yet, so the preserved properties of a memfd are
+   also not stable and are subject to backwards incompatible changes.
+
+.. note::
+   Currently a memfd backed by Hugetlb is not supported. Memfds created
+   with ``MFD_HUGETLB`` will be rejected.
+
+Preserved Properties
+====================
+
+The following properties of the memfd are preserved across kexec:
+
+File Contents
+  All data stored in the file is preserved.
+
+File Size
+  The size of the file is preserved. Holes in the file are filled by allocating
+  pages for them during preservation.
+
+File Position
+  The current file position is preserved, allowing applications to continue
+  reading/writing from their last position.
+
+File Status Flags
+  memfds are always opened with ``O_RDWR`` and ``O_LARGEFILE``. This property is
+  maintained.
+
+Non-Preserved Properties
+========================
+
+All properties which are not preserved must be assumed to be reset to default.
+This section describes the more notable of those properties.
+
+``FD_CLOEXEC`` flag
+  A memfd can be created with the ``MFD_CLOEXEC`` flag that sets the
+  ``FD_CLOEXEC`` on the file. This flag is not preserved and must be set again
+  after restore via ``fcntl()``.
+
+Seals
+  File seals are not preserved. The file is unsealed on restore and, if needed,
+  must be sealed again via ``fcntl()``.
+
+Behavior with LUO states
+========================
+
+This section describes the behavior of the memfd in the different LUO states.
+
+Normal Phase
+  During the normal phase, the memfd can be marked for preservation using the
+  ``LIVEUPDATE_SESSION_PRESERVE_FD`` ioctl. The memfd acts as a regular memfd
+  during this phase with no additional restrictions.
+
+Prepared Phase
+  After LUO enters ``LIVEUPDATE_STATE_PREPARED``, the memfd is serialized and
+  prepared for the next kernel. During this phase, the following happens:
+
+  - All the folios are pinned. If some folios reside in ``ZONE_MIGRATE``, they
+    are migrated out. This ensures none of the preserved folios land in the KHO
+    scratch area.
+  - Pages in swap are swapped in. Currently, there is no way to pass pages in
+    swap over KHO, so all swapped out pages are swapped back in and pinned.
+  - The memfd goes into "frozen mapping" mode. The file can no longer grow or
+    shrink, nor can holes be punched. This keeps the serialized mappings in sync.
+    The file can still be read from, written to, or mmap-ed.
+
+Freeze Phase
+  The current file position in the serialized data is updated to capture any
+  changes that occurred between the prepare and freeze phases. After this, the
+  FD is not allowed to be accessed.
+
+Restoration Phase
+  After being restored, the memfd is functional as normal with the properties
+  listed above restored.
+
+Cancellation
+  If the liveupdate is cancelled after going into the prepared phase, the memfd
+  behaves as in the normal phase.
+
+Serialization format
+====================
+
+The state is serialized in an FDT with the following structure::
+
+  /dts-v1/;
+
+  / {
+      compatible = "memfd-v1";
+      pos = ;
+      size = ;
+      folios = ;
+  };
+
+Each folio descriptor contains:
+
+- PFN + flags (8 bytes)
+
+  - Physical frame number (PFN) of the preserved folio (bits 63:12).
+  - Folio flags (bits 11:0):
+
+    - ``PRESERVED_FLAG_DIRTY`` (bit 0)
+    - ``PRESERVED_FLAG_UPTODATE`` (bit 1)
+
+- Folio index within the file (8 bytes).
+
+Limitations
+===========
+
+The current implementation has the following limitations:
+
+Size
+  Currently the size of the file is limited by the size of the FDT. The FDT can
+  be of at most ``MAX_PAGE_ORDER`` order. By default this is 4 MiB with 4K
+  pages. Each page in the file is tracked using 16 bytes. This limits the
+  maximum size of the file to 1 GiB.
+
+See Also
+========
+
+- :doc:`Live Update Orchestrator `
+- :doc:`/core-api/kho/concepts`
diff --git a/MAINTAINERS b/MAINTAINERS
index a17e4e077174..a9941e920ef6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14438,6 +14438,7 @@ L:	linux-kernel@vger.kernel.org
 S:	Maintained
 F:	Documentation/ABI/testing/sysfs-kernel-liveupdate
 F:	Documentation/core-api/liveupdate.rst
+F:	Documentation/mm/memfd_preservation.rst
 F:	Documentation/userspace-api/liveupdate.rst
 F:	include/linux/liveupdate.h
 F:	include/uapi/linux/liveupdate.h
-- 
2.51.0.536.g15c5d4f767-goog

From nobody Sun Oct 5 07:24:32 2025
Received: from mail-qt1-f171.google.com (mail-qt1-f171.google.com [209.85.160.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 25DF42BDC33 for ; Mon, 29 Sep 2025 01:04:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107896; cv=none; b=JLuPk/Kf+9FQN4+HINafWehNzRPGKC0sl+oX+2inZjxm/XfZRPvwKAe+4wjr8R915JDGsyrTeLzE4E8BJJqd5t4Q01sHIUnGqxIBEitKiWPZT3Iy/YavTX+B0l9m7JiMwamR647jIV6bF52+cNf45g9/E8sTHX0t7lWBfMKo6ME= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107896; c=relaxed/simple; bh=IO1VyVAb8djvsCGtCT3kCdTlkiwVfW7taQgdIXRPTtw=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=NTuy9ECBG7U7y9zR2AuXgzOLgsYLhLWB/jfcworWB79W+WeXH4gM9kF+mv6LXWOyyBEulS5ipBo0107Heo1BmFDDWTAMnVDlcsD7mP4cabcW3IRnIPi0xA1mFXAE5CDP/aKGpA66Y6Q4fitXlb0O1/PUCEY731Z/3M2bstyvzuw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass
AJvYcCXDrUtEezljIEF04ZzPxaJjcmc7BOXJWESlgITMY9IQcGP338d0lUJV5L/EZYAO8Y4jEPEPbPVMVUefUZQ=@vger.kernel.org X-Gm-Message-State: AOJu0YzcoxU0ArTx5r8/nFHHi6XkY+iSfgQPuaAzcenJKNt3kP95mEfK QNSklm/G8ADtBvuMa+e/yHhw0/O1PD9Gon/LViSfgwh4ztEpBt6i4vxcHUEmMqRfe7Y= X-Gm-Gg: ASbGncsGAElPUqy/uJ3Eq/uF50mHeQrL6a6TGfGzI1ZXvw8jtnSFZXjGKlg/tIi0jfK gCiR2gJW99IA0M9y7g/C0SmlqZIOfbUGm/7xTAOCZIJK7dmNHB1772jnui2bBMSDkLYYLiM15l9 N25XIHGXm738c7ASW+z16q8daLSDDlfE0nJy9INSZDD7Sv06DfH+SXbh3WnydaAXRkuwB38OMuK Z4Ua3DgSmf1JKjlbKJ09BFVcGmxWcwV2CDXviD5btNVUIVCAvqX/ISiWCFS/gFUurvubV7MJ8HL x8xl5S4ZEHFGh4HINBY03nfDsC42Dt1btbKHkeomJEteqJ0gYi189YU6AvHtWDS6adCkankps2a sb8ELuThv8O7e9MMhh5jUL4qcwqUOPiPPrtSYoFIsh1W9lWLHU/ODKhQ+oj0oHc1xLMs2O1W0ZJ LfObibWAI= X-Google-Smtp-Source: AGHT+IGxz4bz/Ulq2oqYfhk12cyeqtFbLsr8uDXkWRvJE5REk0ihSw3RWoQtagO38Ya8siWyMamt7g== X-Received: by 2002:ac8:7d8e:0:b0:4d0:7fc9:5c6 with SMTP id d75a77b69052e-4da4b42cbfcmr192806211cf.50.1759107890439; Sun, 28 Sep 2025 18:04:50 -0700 (PDT) Received: from soleen.c.googlers.com.com (53.47.86.34.bc.googleusercontent.com. 
[34.86.47.53]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-4db0c0fbe63sm64561521cf.23.2025.09.28.18.04.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 28 Sep 2025 18:04:49 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com Subject: [PATCH v4 26/30] selftests/liveupdate: Add multi-kexec session lifecycle 
 test
Date: Mon, 29 Sep 2025 01:03:17 +0000
Message-ID: <20250929010321.3462457-27-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog
In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com>
References: <20250929010321.3462457-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Introduce a multi-stage selftest, luo_multi_kexec, to validate the
end-to-end lifecycle of Live Update Orchestrator sessions across multiple
kexec reboots.

The test operates in three stages, using a preserved memfd within a
dedicated "state_session" to track its progress across reboots. This
avoids reliance on filesystem flags and tests the core preservation
mechanism itself.

The test validates the following critical LUO functionalities:

1. Initial Preservation (Stage 1 -> 2):
   - Creates multiple sessions (session-A, session-B, session-C) and
     populates them with memfd files containing unique data.
   - Triggers a global LIVEUPDATE_PREPARE event and executes the first
     kexec.

2. Intermediate State Management (Stage 2 -> 3):
   - After the first reboot, it verifies that all sessions were correctly
     preserved.
   - It then tests divergent session lifecycles:
     - Session A: Is retrieved and explicitly finalized with a
       per-session LIVEUPDATE_FINISH event. This validates that a
       finished session is not carried over to the next kexec.
     - Session B: Is retrieved but left open. This validates that an
       active, retrieved session is correctly re-preserved during the
       next global PREPARE.
     - Session C: Is deliberately not retrieved. This validates that the
       global LIVEUPDATE_FINISH event correctly identifies and cleans up
       stale, unreclaimed sessions.
   - The state-tracking memfd is updated by un-preserving and
     re-preserving it, testing in-place modification of a session's
     contents.
   - A global FINISH followed by a global PREPARE is triggered before the
     second kexec.

3. Final Verification (Stage 3):
   - After the second reboot, it confirms the final state:
     - Asserts that session-B (the re-preserved session) and the updated
       state session have survived.
     - Asserts that session-A (explicitly finished) and session-C
       (unreclaimed) were correctly cleaned up and no longer exist.

Example output:

root@debian-vm:~/liveupdate$ ./luo_multi_kexec
LUO state is NORMAL. Starting Stage 1.
[STAGE 1] Creating state file for next stage (2)...
[STAGE 1] Setting up Sessions A, B, C for first kexec...
  - Session 'session-A' created.
  - Session 'session-B' created.
  - Session 'session-C' created.
[STAGE 1] Triggering global PREPARE...
[STAGE 1] Executing kexec...
<---- cut reboot messages ---->
Debian GNU/Linux 12 debian-vm ttyS0
debian-vm login: root (automatic login)
root@debian-vm:~$ cd liveupdate/
root@debian-vm:~/liveupdate$ ./luo_multi_kexec
LUO state is UPDATED. Restoring state to determine stage...
State file indicates we are entering Stage 2.
[STAGE 2] Partially reclaiming and preparing for second kexec...
  - Verifying session 'session-A'...
    Success. All files verified.
  - Verifying session 'session-B'...
    Success. All files verified.
  - Finishing state session to allow modification...
  - Updating state file for next stage (3)...
  - Session A verified. Sending per-session FINISH.
  - Session B verified. Keeping FD open for next kexec.
  - NOT retrieving Session C to test global finish cleanup.
[STAGE 2] Triggering global FINISH...
[STAGE 2] Triggering global PREPARE for next kexec...
[STAGE 2] Executing second kexec...
<---- cut reboot messages ---->
Debian GNU/Linux 12 debian-vm ttyS0
debian-vm login: root (automatic login)
root@debian-vm:~$ cd liveupdate/
root@debian-vm:~/liveupdate$ ./luo_multi_kexec
LUO state is UPDATED. Restoring state to determine stage...
State file indicates we are entering Stage 3.
[STAGE 3] Final verification...
[STAGE 3] Verifying surviving sessions... - Verifying session 'session-B'... Success. All files verified. [STAGE 3] Verifying Session A was cleaned up... Success. Session A not found as expected. [STAGE 3] Verifying Session C was cleaned up... Success. Session C not found as expected. [STAGE 3] Triggering final global FINISH... --- TEST PASSED --- Signed-off-by: Pasha Tatashin --- tools/testing/selftests/liveupdate/.gitignore | 1 + tools/testing/selftests/liveupdate/Makefile | 31 +++ .../testing/selftests/liveupdate/do_kexec.sh | 6 + .../selftests/liveupdate/luo_multi_kexec.c | 182 +++++++++++++ .../selftests/liveupdate/luo_test_utils.c | 241 ++++++++++++++++++ .../selftests/liveupdate/luo_test_utils.h | 51 ++++ 6 files changed, 512 insertions(+) create mode 100755 tools/testing/selftests/liveupdate/do_kexec.sh create mode 100644 tools/testing/selftests/liveupdate/luo_multi_kexec.c create mode 100644 tools/testing/selftests/liveupdate/luo_test_utils.c create mode 100644 tools/testing/selftests/liveupdate/luo_test_utils.h diff --git a/tools/testing/selftests/liveupdate/.gitignore b/tools/testing/= selftests/liveupdate/.gitignore index af6e773cf98f..de7ca45d3892 100644 --- a/tools/testing/selftests/liveupdate/.gitignore +++ b/tools/testing/selftests/liveupdate/.gitignore @@ -1 +1,2 @@ /liveupdate +/luo_multi_kexec diff --git a/tools/testing/selftests/liveupdate/Makefile b/tools/testing/se= lftests/liveupdate/Makefile index 2a573c36016e..1cbc816ed5c5 100644 --- a/tools/testing/selftests/liveupdate/Makefile +++ b/tools/testing/selftests/liveupdate/Makefile @@ -1,7 +1,38 @@ # SPDX-License-Identifier: GPL-2.0-only + +KHDR_INCLUDES ?=3D -I../../../usr/include CFLAGS +=3D -Wall -O2 -Wno-unused-function CFLAGS +=3D $(KHDR_INCLUDES) +LDFLAGS +=3D -static + +# --- Test Configuration (Edit this section when adding new tests) --- +LUO_SHARED_SRCS :=3D luo_test_utils.c +LUO_SHARED_HDRS +=3D luo_test_utils.h + +LUO_MANUAL_TESTS +=3D luo_multi_kexec + +TEST_FILES +=3D do_kexec.sh 
=20 TEST_GEN_PROGS +=3D liveupdate =20 +# --- Automatic Rule Generation (Do not edit below) --- + +TEST_GEN_PROGS_EXTENDED +=3D $(LUO_MANUAL_TESTS) + +# Define the full list of sources for each manual test. +$(foreach test,$(LUO_MANUAL_TESTS), \ + $(eval $(test)_SOURCES :=3D $(test).c $(LUO_SHARED_SRCS))) + +# This loop automatically generates an explicit build rule for each manual= test. +# It includes dependencies on the shared headers and makes the output +# executable. +# Note the use of '$$' to escape automatic variables for the 'eval' comman= d. +$(foreach test,$(LUO_MANUAL_TESTS), \ + $(eval $(OUTPUT)/$(test): $($(test)_SOURCES) $(LUO_SHARED_HDRS) \ + $(call msg,LINK,,$$@) ; \ + $(Q)$(LINK.c) $$^ $(LDLIBS) -o $$@ ; \ + $(Q)chmod +x $$@ \ + ) \ +) + include ../lib.mk diff --git a/tools/testing/selftests/liveupdate/do_kexec.sh b/tools/testing= /selftests/liveupdate/do_kexec.sh new file mode 100755 index 000000000000..bb396a92c3b8 --- /dev/null +++ b/tools/testing/selftests/liveupdate/do_kexec.sh @@ -0,0 +1,6 @@ +#!/bin/sh +# SPDX-License-Identifier: GPL-2.0 +set -e + +kexec -l -s --reuse-cmdline /boot/bzImage +kexec -e diff --git a/tools/testing/selftests/liveupdate/luo_multi_kexec.c b/tools/t= esting/selftests/liveupdate/luo_multi_kexec.c new file mode 100644 index 000000000000..1f350990ee67 --- /dev/null +++ b/tools/testing/selftests/liveupdate/luo_multi_kexec.c @@ -0,0 +1,182 @@ +// SPDX-License-Identifier: GPL-2.0-only + +/* + * Copyright (c) 2025, Google LLC. 
+ * Pasha Tatashin + */ + +#include "luo_test_utils.h" + +#define KEXEC_SCRIPT "./do_kexec.sh" + +#define NUM_SESSIONS 3 + +/* Helper to set up one session and all its files */ +static void setup_session(int luo_fd, struct session_info *s, int session_= idx) +{ + int i; + + snprintf(s->name, sizeof(s->name), "session-%c", 'A' + session_idx); + + s->fd =3D luo_create_session(luo_fd, s->name); + if (s->fd < 0) + fail_exit("luo_create_session for %s", s->name); + + for (i =3D 0; i < 2; i++) { + s->file_tokens[i] =3D (session_idx * 100) + i; + snprintf(s->file_data[i], sizeof(s->file_data[i]), + "Data for %.*s-File%d", + (int)sizeof(s->name), s->name, i); + + if (create_and_preserve_memfd(s->fd, s->file_tokens[i], + s->file_data[i]) < 0) + fail_exit("create_and_preserve_memfd for token %d", + s->file_tokens[i]); + } +} + +/* Run before the first kexec */ +static void run_stage_1(int luo_fd) +{ + struct session_info sessions[NUM_SESSIONS] =3D {0}; + int i; + + ksft_print_msg("[STAGE 1] Creating state file for next stage (2)...\n"); + create_state_file(luo_fd, 2); + + ksft_print_msg("[STAGE 1] Setting up Sessions A, B, C for first kexec...\= n"); + for (i =3D 0; i < NUM_SESSIONS; i++) { + setup_session(luo_fd, &sessions[i], i); + ksft_print_msg(" - Session '%s' created.\n", sessions[i].name); + } + + ksft_print_msg("[STAGE 1] Triggering global PREPARE...\n"); + if (luo_set_global_event(luo_fd, LIVEUPDATE_PREPARE) < 0) + fail_exit("luo_set_global_event(PREPARE)"); + + ksft_print_msg("[STAGE 1] Executing kexec...\n"); + if (system(KEXEC_SCRIPT) !=3D 0) + fail_exit("kexec script failed"); + + /* Should not be reached */ + sleep(10); + exit(EXIT_FAILURE); +} + +/* Run after first kexec, before second kexec */ +static void run_stage_2(int luo_fd, int state_session_fd) +{ + struct session_info sessions[NUM_SESSIONS] =3D {0}; + int session_fd_A; + + ksft_print_msg("[STAGE 2] Partially reclaiming and preparing for second k= exec...\n"); + + reinit_all_sessions(sessions, 
NUM_SESSIONS); + + session_fd_A =3D verify_session_and_get_fd(luo_fd, &sessions[0]); + verify_session_and_get_fd(luo_fd, &sessions[1]); + + ksft_print_msg(" - Finishing state session to allow modification...\n"); + if (luo_set_session_event(state_session_fd, LIVEUPDATE_FINISH) < 0) + fail_exit("luo_set_session_event(FINISH) for state_session"); + + ksft_print_msg(" - Updating state file for next stage (3)...\n"); + update_state_file(state_session_fd, 3); + + ksft_print_msg(" - Session A verified. Sending per-session FINISH.\n"); + if (luo_set_session_event(session_fd_A, LIVEUPDATE_FINISH) < 0) + fail_exit("luo_set_session_event(FINISH) for Session A"); + close(session_fd_A); + + ksft_print_msg(" - Session B verified. Its FD will be auto-closed for ne= xt kexec.\n"); + ksft_print_msg(" - NOT retrieving Session C to test global finish cleanu= p.\n"); + + ksft_print_msg("[STAGE 2] Triggering global FINISH...\n"); + if (luo_set_global_event(luo_fd, LIVEUPDATE_FINISH) < 0) + fail_exit("luo_set_global_event(FINISH)"); + + ksft_print_msg("[STAGE 2] Triggering global PREPARE for next kexec...\n"); + if (luo_set_global_event(luo_fd, LIVEUPDATE_PREPARE) < 0) + fail_exit("luo_set_global_event(PREPARE)"); + + ksft_print_msg("[STAGE 2] Executing second kexec...\n"); + if (system(KEXEC_SCRIPT) !=3D 0) + fail_exit("kexec script failed"); + + sleep(10); + exit(EXIT_FAILURE); +} + +/* Run after second kexec */ +static void run_stage_3(int luo_fd) +{ + struct session_info sessions[NUM_SESSIONS] =3D {0}; + int ret; + + ksft_print_msg("[STAGE 3] Final verification...\n"); + + reinit_all_sessions(sessions, NUM_SESSIONS); + + ksft_print_msg("[STAGE 3] Verifying surviving sessions...\n"); + /* Session B */ + verify_session_and_get_fd(luo_fd, &sessions[1]); + + ksft_print_msg("[STAGE 3] Verifying Session A was cleaned up...\n"); + ret =3D luo_retrieve_session(luo_fd, sessions[0].name); + if (ret !=3D -ENOENT) + fail_exit("Expected ENOENT for Session A, but got %d", ret); + 
ksft_print_msg(" Success. Session A not found as expected.\n"); + + ksft_print_msg("[STAGE 3] Verifying Session C was cleaned up...\n"); + ret =3D luo_retrieve_session(luo_fd, sessions[2].name); + if (ret !=3D -ENOENT) + fail_exit("Expected ENOENT for Session C, but got %d", ret); + ksft_print_msg(" Success. Session C not found as expected.\n"); + + ksft_print_msg("[STAGE 3] Triggering final global FINISH...\n"); + if (luo_set_global_event(luo_fd, LIVEUPDATE_FINISH) < 0) + fail_exit("luo_set_global_event(FINISH)"); + + ksft_print_msg("\n--- MULTI-KEXEC TEST PASSED ---\n"); +} + +int main(int argc, char *argv[]) +{ + enum liveupdate_state state; + int luo_fd, stage =3D 0; + + luo_fd =3D luo_open_device(); + if (luo_fd < 0) { + ksft_exit_skip("Failed to open %s. Is the luo module loaded?\n", + LUO_DEVICE); + } + + if (luo_get_global_state(luo_fd, &state) < 0) + fail_exit("luo_get_global_state"); + + if (state =3D=3D LIVEUPDATE_STATE_NORMAL) { + ksft_print_msg("LUO state is NORMAL. Starting Stage 1.\n"); + run_stage_1(luo_fd); + } else if (state =3D=3D LIVEUPDATE_STATE_UPDATED) { + int state_session_fd; + + ksft_print_msg("LUO state is UPDATED. 
Restoring state to determine stage= ...\n"); + state_session_fd =3D restore_and_read_state(luo_fd, &stage); + if (state_session_fd < 0) + fail_exit("Could not restore test state"); + + if (stage =3D=3D 2) { + ksft_print_msg("State file indicates we are entering Stage 2.\n"); + run_stage_2(luo_fd, state_session_fd); + } else if (stage =3D=3D 3) { + ksft_print_msg("State file indicates we are entering Stage 3.\n"); + run_stage_3(luo_fd); + } else { + fail_exit("Invalid stage found in state file: %d", + stage); + } + } + + close(luo_fd); + ksft_exit_pass(); +} diff --git a/tools/testing/selftests/liveupdate/luo_test_utils.c b/tools/te= sting/selftests/liveupdate/luo_test_utils.c new file mode 100644 index 000000000000..c0840e6e66fd --- /dev/null +++ b/tools/testing/selftests/liveupdate/luo_test_utils.c @@ -0,0 +1,241 @@ +// SPDX-License-Identifier: GPL-2.0-only + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +#define _GNU_SOURCE + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "luo_test_utils.h" +#include "../kselftest.h" + +/* The fail_exit function is now a macro in the header. 
*/ + +int luo_open_device(void) +{ + return open(LUO_DEVICE, O_RDWR); +} + +int luo_create_session(int luo_fd, const char *name) +{ + struct liveupdate_ioctl_create_session arg =3D { .size =3D sizeof(arg) }; + + snprintf((char *)arg.name, LIVEUPDATE_SESSION_NAME_LENGTH, "%.*s", + LIVEUPDATE_SESSION_NAME_LENGTH - 1, name); + if (ioctl(luo_fd, LIVEUPDATE_IOCTL_CREATE_SESSION, &arg) < 0) + return -errno; + return arg.fd; +} + +int luo_retrieve_session(int luo_fd, const char *name) +{ + struct liveupdate_ioctl_retrieve_session arg =3D { .size =3D sizeof(arg) = }; + + snprintf((char *)arg.name, LIVEUPDATE_SESSION_NAME_LENGTH, "%.*s", + LIVEUPDATE_SESSION_NAME_LENGTH - 1, name); + if (ioctl(luo_fd, LIVEUPDATE_IOCTL_RETRIEVE_SESSION, &arg) < 0) + return -errno; + return arg.fd; +} + +int create_and_preserve_memfd(int session_fd, int token, const char *data) +{ + struct liveupdate_session_preserve_fd arg =3D { .size =3D sizeof(arg) }; + long page_size =3D sysconf(_SC_PAGE_SIZE); + void *map =3D MAP_FAILED; + int mfd =3D -1, ret =3D -1; + + mfd =3D memfd_create("test_mfd", 0); + if (mfd < 0) + return -errno; + + if (ftruncate(mfd, page_size) !=3D 0) + goto out; + + map =3D mmap(NULL, page_size, PROT_WRITE, MAP_SHARED, mfd, 0); + if (map =3D=3D MAP_FAILED) + goto out; + + snprintf(map, page_size, "%s", data); + munmap(map, page_size); + + arg.fd =3D mfd; + arg.token =3D token; + if (ioctl(session_fd, LIVEUPDATE_SESSION_PRESERVE_FD, &arg) < 0) + goto out; + + ret =3D 0; /* Success */ +out: + if (ret !=3D 0 && errno !=3D 0) + ret =3D -errno; + if (mfd >=3D 0) + close(mfd); + return ret; +} + +int restore_and_verify_memfd(int session_fd, int token, + const char *expected_data) +{ + struct liveupdate_session_restore_fd arg =3D { .size =3D sizeof(arg) }; + long page_size =3D sysconf(_SC_PAGE_SIZE); + void *map =3D MAP_FAILED; + int mfd =3D -1, ret =3D -1; + + arg.token =3D token; + if (ioctl(session_fd, LIVEUPDATE_SESSION_RESTORE_FD, &arg) < 0) + return -errno; + mfd =3D arg.fd; 
+ + map =3D mmap(NULL, page_size, PROT_READ, MAP_SHARED, mfd, 0); + if (map =3D=3D MAP_FAILED) + goto out; + + if (expected_data && strcmp(expected_data, map) !=3D 0) { + ksft_print_msg("Data mismatch for token %d!\n", token); + ret =3D -EINVAL; + goto out_munmap; + } + + ret =3D mfd; /* Success, return the new fd */ +out_munmap: + munmap(map, page_size); +out: + if (ret < 0 && errno !=3D 0) + ret =3D -errno; + if (ret < 0 && mfd >=3D 0) + close(mfd); + return ret; +} + +int luo_set_session_event(int session_fd, enum liveupdate_event event) +{ + struct liveupdate_session_set_event arg =3D { .size =3D sizeof(arg) }; + + arg.event =3D event; + return ioctl(session_fd, LIVEUPDATE_SESSION_SET_EVENT, &arg); +} + +int luo_set_global_event(int luo_fd, enum liveupdate_event event) +{ + struct liveupdate_ioctl_set_event arg =3D { .size =3D sizeof(arg) }; + + arg.event =3D event; + return ioctl(luo_fd, LIVEUPDATE_IOCTL_SET_EVENT, &arg); +} + +int luo_get_global_state(int luo_fd, enum liveupdate_state *state) +{ + struct liveupdate_ioctl_get_state arg =3D { .size =3D sizeof(arg) }; + + if (ioctl(luo_fd, LIVEUPDATE_IOCTL_GET_STATE, &arg) < 0) + return -errno; + *state =3D arg.state; + return 0; +} + +void create_state_file(int luo_fd, int next_stage) +{ + char buf[32]; + int state_session_fd; + + state_session_fd =3D luo_create_session(luo_fd, STATE_SESSION_NAME); + if (state_session_fd < 0) + fail_exit("luo_create_session failed"); + + snprintf(buf, sizeof(buf), "%d", next_stage); + if (create_and_preserve_memfd(state_session_fd, + STATE_MEMFD_TOKEN, buf) < 0) { + fail_exit("create_and_preserve_memfd failed"); + } +} + +int restore_and_read_state(int luo_fd, int *stage) +{ + char buf[32] =3D {0}; + int state_session_fd, mfd; + + state_session_fd =3D luo_retrieve_session(luo_fd, STATE_SESSION_NAME); + if (state_session_fd < 0) + return state_session_fd; + + mfd =3D restore_and_verify_memfd(state_session_fd, STATE_MEMFD_TOKEN, + NULL); + if (mfd < 0) + fail_exit("failed to 
restore state memfd"); + + if (read(mfd, buf, sizeof(buf) - 1) < 0) + fail_exit("failed to read state mfd"); + + *stage =3D atoi(buf); + + close(mfd); + return state_session_fd; +} + +void update_state_file(int session_fd, int next_stage) +{ + char buf[32]; + struct liveupdate_session_unpreserve_fd arg =3D { .size =3D sizeof(arg) }; + + arg.token =3D STATE_MEMFD_TOKEN; + if (ioctl(session_fd, LIVEUPDATE_SESSION_UNPRESERVE_FD, &arg) < 0) + fail_exit("unpreserve failed"); + + snprintf(buf, sizeof(buf), "%d", next_stage); + if (create_and_preserve_memfd(session_fd, STATE_MEMFD_TOKEN, buf) < 0) + fail_exit("create_and_preserve failed"); +} + +void reinit_all_sessions(struct session_info *sessions, int num) +{ + int i, j; + + for (i =3D 0; i < num; i++) { + snprintf(sessions[i].name, sizeof(sessions[i].name), + "session-%c", 'A' + i); + for (j =3D 0; j < 2; j++) { + sessions[i].file_tokens[j] =3D (i * 100) + j; + snprintf(sessions[i].file_data[j], + sizeof(sessions[i].file_data[j]), + "Data for %.*s-File%d", + LIVEUPDATE_SESSION_NAME_LENGTH, + sessions[i].name, j); + } + } +} + +int verify_session_and_get_fd(int luo_fd, struct session_info *s) +{ + int i, session_fd; + + ksft_print_msg(" - Verifying session '%s'...\n", s->name); + + session_fd =3D luo_retrieve_session(luo_fd, s->name); + if (session_fd < 0) + fail_exit("luo_retrieve_session for %s", s->name); + + for (i =3D 0; i < 2; i++) { + int mfd =3D restore_and_verify_memfd(session_fd, + s->file_tokens[i], + s->file_data[i]); + if (mfd < 0) { + fail_exit("restore_and_verify_memfd for token %d", + s->file_tokens[i]); + } + close(mfd); + } + ksft_print_msg(" Success. 
All files verified.\n"); + return session_fd; +} diff --git a/tools/testing/selftests/liveupdate/luo_test_utils.h b/tools/te= sting/selftests/liveupdate/luo_test_utils.h new file mode 100644 index 000000000000..e30cfcb0a596 --- /dev/null +++ b/tools/testing/selftests/liveupdate/luo_test_utils.h @@ -0,0 +1,51 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +#ifndef LUO_TEST_UTILS_H +#define LUO_TEST_UTILS_H + +#include +#include +#include +#include "../kselftest.h" + +#define LUO_DEVICE "/dev/liveupdate" +#define STATE_SESSION_NAME "state_session" +#define STATE_MEMFD_TOKEN 999 + +#define MAX_FILES_PER_SESSION 5 + +struct session_info { + char name[LIVEUPDATE_SESSION_NAME_LENGTH]; + int fd; + int file_tokens[MAX_FILES_PER_SESSION]; + char file_data[MAX_FILES_PER_SESSION][128]; +}; + +#define fail_exit(fmt, ...) \ + ksft_exit_fail_msg("[%s] " fmt " (errno: %s)\n", \ + __func__, ##__VA_ARGS__, strerror(errno)) + +int luo_open_device(void); + +int luo_create_session(int luo_fd, const char *name); +int luo_retrieve_session(int luo_fd, const char *name); + +int create_and_preserve_memfd(int session_fd, int token, const char *data); +int restore_and_verify_memfd(int session_fd, int token, const char *expect= ed_data); +int verify_session_and_get_fd(int luo_fd, struct session_info *s); + +int luo_set_session_event(int session_fd, enum liveupdate_event event); +int luo_set_global_event(int luo_fd, enum liveupdate_event event); +int luo_get_global_state(int luo_fd, enum liveupdate_state *state); + +void create_state_file(int luo_fd, int next_stage); +int restore_and_read_state(int luo_fd, int *stage); +void update_state_file(int session_fd, int next_stage); +void reinit_all_sessions(struct session_info *sessions, int num); + +#endif /* LUO_TEST_UTILS_H */ --=20 2.51.0.536.g15c5d4f767-goog From nobody Sun Oct 5 07:24:32 2025 Received: from mail-qt1-f171.google.com (mail-qt1-f171.google.com [209.85.160.171]) (using 
TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 65C9C2BE636 for ; Mon, 29 Sep 2025 01:04:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107896; cv=none; b=B5yunIJGFZx89nZHe58byynQcI6+4fUAXtIV/BwNdg+gYQy4CqdsmkDbyfQGlqXqXUhyG2HtULNAyRod4H1JZlYQ2GqagdTIlXlo2coArTVwPklA9Y3l4F+WtBW1E1RTXAIHIhAilsWqUfJ0OpxxfJJ7RYIjWBHhOJvmDMJ0kVU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1759107896; c=relaxed/simple; bh=dTUrXfOK43UjtKGaHxGoUkNGqZ39TsswRbP9ug41gpk=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=eY6Fxew7KO9Seslp04fi/9e/J0d8Cq/FO/16MQ9C9lv0XJIuwVP7qVmBBVUNJYf8EtWqh84BIZCDxJUtbfA6XYioOqP8iIflDCd49Ky7m0vDnO7gSXJrW6dQ2fFrGfD9sh4tU/7I9oGNnFITiqvRKPxDog1WOmllqTw8m1YisaQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=Hc5vqpyT; arc=none smtp.client-ip=209.85.160.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="Hc5vqpyT" Received: by mail-qt1-f171.google.com with SMTP id d75a77b69052e-4ddf60466d4so29543991cf.2 for ; Sun, 28 Sep 2025 18:04:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1759107892; x=1759712692; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; 
bh=3ydKBwOyRTlAOFM+HzD45Tc0PX6eZlXsguWH3WK3DRQ=; b=Hc5vqpyTaVJWJgc0sufkfuF8aRKS9dgNMrFLH2zFRYUn8imtbNjW4C+G00JQursGrE CNVUZTkpy7CincjdSHs8Za6FcXmehoRqWjLCUVpbTPiSbMqM9BEosb9NeNk4ontVTdhM 1jeklbo90i6NSN4cK5v86thmGEOyuPOjGbWpz5fV2bebmVwS8khvHi6tDVlAId4zJqZ7 Q3omhDO5NEXKFOgjDJmj62ZzulJngt7oyfoiZJ2cSIXsIhQzyIA5I1BCnv/4wy1QyJGx X5I3uDjxbIkErTPyEMQOuddhjvFE2k5nBgCrfbHMfXvJAwdNZHPpTgEbPkerl39761UW Jy1g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759107892; x=1759712692; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=3ydKBwOyRTlAOFM+HzD45Tc0PX6eZlXsguWH3WK3DRQ=; b=f8N7J5RCOS35sCMN1Q8qBNd7rq1eXFrlxdDgupoye95w8194iSBQasfJYTQUDHoZRN Uw5GTmd9oorxe+yVrHROKoQvbb4inzHrqxhd4S5aHXRm0kvSFzaZ5VeN9jz/tcN5hokp Vd1KKxgS0ucj1VJdb8pjv6cqA0wjehd1e0Px1ZL19DfYi8hpvCluOpmJYz0O4iYe2nvH AixlHDltlr26pKt331N78x0IWBccuWWxi89150uJu4BuUHl/15X18txZkv+dB0oL7NOg +B2skaukj1hNg/iwwM+mJrchONoNpCaCKIHia1+VFwat0vfk8I+0jirU1S0jvCtX+cgm FWoA== X-Forwarded-Encrypted: i=1; AJvYcCWrUWl+ZCZJuIDTABQj+3DifdaXh149PBNh6tiOziD0z6famWfioUXpbvaBKc5RSwtAlixdvQIxHbix7a4=@vger.kernel.org X-Gm-Message-State: AOJu0YyLxcPYk6CWPnl4eJlxljb4oCp8cOoEKdZPLQlnpy2+Gxp+f12r /VUFJo+zdF++yqy30VjavxELI86E6ZccxT+YhonJlStV1UxO81kEBEcr7EbRGjToE/M= X-Gm-Gg: ASbGnctPDLyzheKOEe5D75BvfPxuJhwcEp2ElAkXIn1bWZxS/PvIzREDw+2mWEGpC37 PN0b1EnQPZM8HdGYVQX1g26nfRXCtkEyRDCM0ObQ148Qf/Q6K8pNVA1JA7W4n4XfYaP2UPm9Txc +dMRdRd1KFRSIApodSzHtasJi9qbYNUub+gG8ks+rCDFKDt0+Gc2Kethm2YaV1CSOWJOKM7jQ/5 LG36MzijusDL21coZW0kLJQVJkHKFrdzmtB+m0TMjptZZOhvyWOTkntaMQz08YBstFO4rhnVaQ8 j0G75JnC7zDr74xQm7xffV3nt7pHOvB3npbz7dsUPTa0lzy/Lt7WtGOdNMRG1pwjQnBjh6Z/SY7 ERerTSPwGJMuJez8Am3yVf7dMNroHoO8BiVOGt3pygnNxM9qqKMP/EvyBUacBN1vVlWYo3Tn+7G 8gb8hVyCQ= X-Google-Smtp-Source: AGHT+IFl5jf3mdJXfLDVCYuyr7wgE9yCTlaWZAFW+PFDbEgMjgXl4J4uX5xR5SoXPcaZ54yBI+aaQw== X-Received: by 2002:a05:622a:1ccb:b0:4df:8368:4ae1 with SMTP id 
d75a77b69052e-4df836851b5mr75011421cf.33.1759107892153; Sun, 28 Sep 2025 18:04:52 -0700 (PDT) Received: from soleen.c.googlers.com.com (53.47.86.34.bc.googleusercontent.com. [34.86.47.53]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-4db0c0fbe63sm64561521cf.23.2025.09.28.18.04.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 28 Sep 2025 18:04:51 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, 
 witu@nvidia.com, hughd@google.com, skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com
Subject: [PATCH v4 27/30] selftests/liveupdate: Add multi-file and unreclaimed file test
Date: Mon, 29 Sep 2025 01:03:18 +0000
Message-ID: <20250929010321.3462457-28-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.51.0.536.g15c5d4f767-goog
In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com>
References: <20250929010321.3462457-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Introduce a new selftest, luo_multi_file, to validate two key aspects of
the Live Update Orchestrator file preservation mechanism: the ability to
handle multiple files within a single session, and the correct cleanup of
unreclaimed files.

The test implements a full kexec cycle with the following flow:

1. Pre-kexec:
   - A single session is created.
   - Three distinct memfd files (A, B, and C) are created, populated with
     unique data, and preserved within this session.
   - The global LIVEUPDATE_PREPARE event is triggered, and the system
     reboots via kexec.

2. Post-kexec:
   - The preserved session is retrieved.
   - Files A and C are restored and their contents are verified to ensure
     that multiple files can be successfully restored from a single
     session.
   - File B is intentionally not restored.
   - The global LIVEUPDATE_FINISH event is triggered.

3. Verification:
   - The test is considered successful if files A and C are verified
     correctly.
   - The user is prompted to check the kernel log (dmesg) for a message
     confirming that the unreclaimed file (B) was identified and cleaned
     up by the LUO core, thus validating the cleanup path.
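
For readers without a LUO-capable kernel, the token bookkeeping in the flow
above can be approximated in plain userspace C. This is only an illustrative
sketch: model_file, model_restore, model_finish and model_post_kexec are
hypothetical names that exist neither in the kernel nor in this patch, and
the counting is a stand-in for what the LUO core does on FINISH, not its
implementation. The tokens and data strings mirror the ones used by
luo_multi_file.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stand-in for one preserved file; not a LUO type. */
struct model_file {
	int token;
	const char *data;
	bool restored;
};

/*
 * Mark the file with the given token as restored if its data matches.
 * Returns 0 on success, -1 for an unknown token, a second restore of the
 * same token, or a data mismatch.
 */
static int model_restore(struct model_file *files, size_t n, int token,
			 const char *expected)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (files[i].token != token)
			continue;
		if (files[i].restored || strcmp(files[i].data, expected) != 0)
			return -1;
		files[i].restored = true;
		return 0;
	}
	return -1;
}

/* Model of the global FINISH step: count the files nobody restored. */
static size_t model_finish(const struct model_file *files, size_t n)
{
	size_t i, unreclaimed = 0;

	for (i = 0; i < n; i++) {
		if (!files[i].restored)
			unreclaimed++;
	}
	return unreclaimed;
}

/*
 * Replay the post-kexec flow: restore A and C, skip B, then finish.
 * Returns the number of unreclaimed files, or -1 if a restore failed.
 */
static int model_post_kexec(void)
{
	struct model_file files[] = {
		{ 101, "Alpha file data", false },
		{ 102, "Bravo file data which will be unreclaimed", false },
		{ 103, "Charlie file data", false },
	};
	size_t n = sizeof(files) / sizeof(files[0]);

	if (model_restore(files, n, 101, "Alpha file data") < 0)
		return -1;
	if (model_restore(files, n, 103, "Charlie file data") < 0)
		return -1;
	return (int)model_finish(files, n);
}
```

In this model, model_post_kexec() leaves exactly one file (token 102)
unreclaimed, which corresponds to the single cleanup message the real test
asks the user to look for in dmesg.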
Signed-off-by: Pasha Tatashin --- tools/testing/selftests/liveupdate/Makefile | 1 + .../selftests/liveupdate/luo_multi_file.c | 119 ++++++++++++++++++ 2 files changed, 120 insertions(+) create mode 100644 tools/testing/selftests/liveupdate/luo_multi_file.c diff --git a/tools/testing/selftests/liveupdate/Makefile b/tools/testing/se= lftests/liveupdate/Makefile index 1cbc816ed5c5..f43b7d03e017 100644 --- a/tools/testing/selftests/liveupdate/Makefile +++ b/tools/testing/selftests/liveupdate/Makefile @@ -9,6 +9,7 @@ LDFLAGS +=3D -static LUO_SHARED_SRCS :=3D luo_test_utils.c LUO_SHARED_HDRS +=3D luo_test_utils.h =20 +LUO_MANUAL_TESTS +=3D luo_multi_file LUO_MANUAL_TESTS +=3D luo_multi_kexec =20 TEST_FILES +=3D do_kexec.sh diff --git a/tools/testing/selftests/liveupdate/luo_multi_file.c b/tools/te= sting/selftests/liveupdate/luo_multi_file.c new file mode 100644 index 000000000000..ae38fe8aba4c --- /dev/null +++ b/tools/testing/selftests/liveupdate/luo_multi_file.c @@ -0,0 +1,119 @@ +// SPDX-License-Identifier: GPL-2.0-only + +/* + * Copyright (c) 2025, Google LLC. 
+ * Pasha Tatashin
+ */
+
+#include "luo_test_utils.h"
+
+#define KEXEC_SCRIPT "./do_kexec.sh"
+
+#define SESSION_NAME "multi_file_session"
+#define TOKEN_A 101
+#define TOKEN_B 102
+#define TOKEN_C 103
+
+#define DATA_A "Alpha file data"
+#define DATA_B "Bravo file data which will be unreclaimed"
+#define DATA_C "Charlie file data"
+
+static void run_pre_kexec(int luo_fd)
+{
+	int session_fd;
+
+	ksft_print_msg("[PRE-KEXEC] Starting workload...\n");
+
+	session_fd = luo_create_session(luo_fd, SESSION_NAME);
+	if (session_fd < 0)
+		fail_exit("Failed to create session '%s'", SESSION_NAME);
+
+	ksft_print_msg("[PRE-KEXEC] Preserving 3 memfds (A, B, C)...\n");
+	if (create_and_preserve_memfd(session_fd, TOKEN_A, DATA_A) < 0)
+		fail_exit("Failed to preserve memfd A");
+	if (create_and_preserve_memfd(session_fd, TOKEN_B, DATA_B) < 0)
+		fail_exit("Failed to preserve memfd B");
+	if (create_and_preserve_memfd(session_fd, TOKEN_C, DATA_C) < 0)
+		fail_exit("Failed to preserve memfd C");
+	ksft_print_msg("[PRE-KEXEC] All memfds preserved.\n");
+
+	if (luo_set_global_event(luo_fd, LIVEUPDATE_PREPARE) < 0)
+		fail_exit("Failed to set global PREPARE event");
+
+	ksft_print_msg("[PRE-KEXEC] System is ready. Executing kexec...\n");
+	if (system(KEXEC_SCRIPT) != 0)
+		fail_exit("kexec script failed");
+
+	sleep(10); /* Should not be reached */
+	exit(EXIT_FAILURE);
+}
+
+static void run_post_kexec(int luo_fd)
+{
+	int session_fd, mfd_a, mfd_c;
+
+	ksft_print_msg("[POST-KEXEC] Starting workload...\n");
+
+	session_fd = luo_retrieve_session(luo_fd, SESSION_NAME);
+	if (session_fd < 0)
+		fail_exit("Failed to retrieve session '%s'", SESSION_NAME);
+
+	/* 1. VERIFY SUCCESS: Restore and verify memfd A. */
+	ksft_print_msg("[POST-KEXEC] Restoring and verifying memfd A (token %d)...\n",
+		       TOKEN_A);
+	mfd_a = restore_and_verify_memfd(session_fd, TOKEN_A, DATA_A);
+	if (mfd_a < 0)
+		fail_exit("Failed to restore or verify memfd A");
+	close(mfd_a);
+	ksft_print_msg(" Success.\n");
+
+	/* 2. VERIFY SUCCESS: Restore and verify memfd C. */
+	ksft_print_msg("[POST-KEXEC] Restoring and verifying memfd C (token %d)...\n",
+		       TOKEN_C);
+	mfd_c = restore_and_verify_memfd(session_fd, TOKEN_C, DATA_C);
+	if (mfd_c < 0)
+		fail_exit("Failed to restore or verify memfd C");
+	close(mfd_c);
+	ksft_print_msg(" Success.\n");
+
+	ksft_print_msg("[POST-KEXEC] NOT restoring memfd B (token %d) to test cleanup.\n",
+		       TOKEN_B);
+
+	if (luo_set_global_event(luo_fd, LIVEUPDATE_FINISH) < 0)
+		fail_exit("Failed to set global FINISH event");
+
+	close(session_fd);
+
+	ksft_print_msg("\n--- TEST PASSED ---\n");
+	ksft_print_msg("Check dmesg for cleanup log of token %d in session '%s'.\n",
+		       TOKEN_B, SESSION_NAME);
+}
+
+int main(int argc, char *argv[])
+{
+	enum liveupdate_state state;
+	int luo_fd;
+
+	luo_fd = luo_open_device();
+	if (luo_fd < 0) {
+		ksft_exit_skip("Failed to open %s. Is the luo module loaded?\n",
+			       LUO_DEVICE);
+	}
+
+	if (luo_get_global_state(luo_fd, &state) < 0)
+		fail_exit("Failed to get LUO state");
+
+	switch (state) {
+	case LIVEUPDATE_STATE_NORMAL:
+		run_pre_kexec(luo_fd);
+		break;
+	case LIVEUPDATE_STATE_UPDATED:
+		run_post_kexec(luo_fd);
+		break;
+	default:
+		fail_exit("Test started in an unexpected state: %d", state);
+	}
+
+	close(luo_fd);
+	ksft_exit_pass();
+}
-- 
2.51.0.536.g15c5d4f767-goog

From nobody Sun Oct 5 07:24:32 2025
From: Pasha Tatashin
To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com,
 changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org,
 dmatlack@google.com, rientjes@google.com, corbet@lwn.net,
 rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com,
 kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com,
 masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org,
 yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev,
 chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com,
 jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org,
 dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org,
 rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org,
 zhangguopeng@kylinos.cn, linux@weissschuh.net,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de,
 mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
 x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org,
 bartosz.golaszewski@linaro.org, cw00.choi@samsung.com,
 myungjoo.ham@samsung.com, yesanishhere@gmail.com,
 Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com,
 aleksander.lobakin@intel.com, ira.weiny@intel.com,
 andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de,
 bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com,
 stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net,
 brauner@kernel.org, linux-api@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, saeedm@nvidia.com,
 ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com,
 leonro@nvidia.com, witu@nvidia.com, hughd@google.com,
 skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com
Subject: [PATCH v4 28/30] selftests/liveupdate: Add multi-session workflow
 and state interaction test
Date: Mon, 29 Sep 2025 01:03:19 +0000
Message-ID: <20250929010321.3462457-29-pasha.tatashin@soleen.com>
In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com>
References: <20250929010321.3462457-1-pasha.tatashin@soleen.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Introduce a new, luo_multi_session, test to validate the orchestration
of multiple LUO sessions with differing lifecycles through a full kexec
reboot.

The test validates interactions between per-session and global state
transitions:

1. Mixed State Preparation: Before the first kexec, sessions are put
   into different states to test the global PREPARE event's behavior:
   - Session A & C: Are individually transitioned to PREPARED via a
     per-session ioctl. The test verifies that the subsequent global
     PREPARE correctly handles these already-prepared sessions.
   - Session B: Is transitioned to PREPARED and then immediately back
     to NORMAL via a per-session CANCEL.
     This validates the rollback mechanism and ensures the session is
     correctly picked up and prepared by the subsequent global PREPARE.
   - Session D: Is left in the NORMAL state, verifying that the global
     PREPARE correctly transitions sessions that have not been
     individually managed.

2. Unreclaimed Session Cleanup:
   - After the kexec reboot, sessions A, B, C, and D are all retrieved
     and verified to ensure they were preserved correctly, regardless
     of their pre-kexec transition path.
   - Session E: Is intentionally not retrieved. This validates that the
     global FINISH event correctly identifies and cleans up an entire
     unreclaimed session and all of its preserved file resources,
     preventing leaks.

Signed-off-by: Pasha Tatashin
---
 tools/testing/selftests/liveupdate/Makefile   |   1 +
 .../selftests/liveupdate/luo_multi_session.c  | 155 ++++++++++++++++++
 2 files changed, 156 insertions(+)
 create mode 100644 tools/testing/selftests/liveupdate/luo_multi_session.c

diff --git a/tools/testing/selftests/liveupdate/Makefile b/tools/testing/selftests/liveupdate/Makefile
index f43b7d03e017..72892942dd61 100644
--- a/tools/testing/selftests/liveupdate/Makefile
+++ b/tools/testing/selftests/liveupdate/Makefile
@@ -11,6 +11,7 @@ LUO_SHARED_HDRS += luo_test_utils.h
 
 LUO_MANUAL_TESTS += luo_multi_file
 LUO_MANUAL_TESTS += luo_multi_kexec
+LUO_MANUAL_TESTS += luo_multi_session
 
 TEST_FILES += do_kexec.sh
 
diff --git a/tools/testing/selftests/liveupdate/luo_multi_session.c b/tools/testing/selftests/liveupdate/luo_multi_session.c
new file mode 100644
index 000000000000..9ea96d7b997f
--- /dev/null
+++ b/tools/testing/selftests/liveupdate/luo_multi_session.c
@@ -0,0 +1,155 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin
+ */
+
+#include "luo_test_utils.h"
+#include "../kselftest.h"
+
+#define KEXEC_SCRIPT "./do_kexec.sh"
+
+#define NUM_SESSIONS 5
+#define FILES_PER_SESSION 5
+
+/* Helper to manage one session and its files */
+static void setup_session(int luo_fd, struct session_info *s, int session_idx)
+{
+	int i;
+
+	snprintf(s->name, sizeof(s->name), "session-%c", 'A' + session_idx);
+
+	s->fd = luo_create_session(luo_fd, s->name);
+	if (s->fd < 0)
+		fail_exit("Failed to create session '%s'", s->name);
+
+	/* Create and preserve all files for this session */
+	for (i = 0; i < FILES_PER_SESSION; i++) {
+		s->file_tokens[i] = (session_idx * 100) + i;
+		snprintf(s->file_data[i], sizeof(s->file_data[i]),
+			 "Data for %.*s-File%d",
+			 LIVEUPDATE_SESSION_NAME_LENGTH,
+			 s->name, i);
+
+		if (create_and_preserve_memfd(s->fd, s->file_tokens[i],
+					      s->file_data[i]) < 0) {
+			fail_exit("Failed to preserve token %d in session '%s'",
+				  s->file_tokens[i], s->name);
+		}
+	}
+}
+
+/* Helper to re-initialize the expected session data post-reboot */
+static void reinit_sessions(struct session_info *sessions)
+{
+	int i, j;
+
+	for (i = 0; i < NUM_SESSIONS; i++) {
+		snprintf(sessions[i].name, sizeof(sessions[i].name),
+			 "session-%c", 'A' + i);
+		for (j = 0; j < FILES_PER_SESSION; j++) {
+			sessions[i].file_tokens[j] = (i * 100) + j;
+			snprintf(sessions[i].file_data[j],
+				 sizeof(sessions[i].file_data[j]),
+				 "Data for %.*s-File%d",
+				 LIVEUPDATE_SESSION_NAME_LENGTH,
+				 sessions[i].name, j);
+		}
+	}
+}
+
+static void run_pre_kexec(int luo_fd)
+{
+	struct session_info sessions[NUM_SESSIONS] = {0};
+	int i;
+
+	ksft_print_msg("[PRE-KEXEC] Starting workload...\n");
+
+	ksft_print_msg("[PRE-KEXEC] Setting up %d sessions with %d files each...\n",
+		       NUM_SESSIONS, FILES_PER_SESSION);
+	for (i = 0; i < NUM_SESSIONS; i++)
+		setup_session(luo_fd, &sessions[i], i);
+	ksft_print_msg("[PRE-KEXEC] Setup complete.\n");
+
+	ksft_print_msg("[PRE-KEXEC] Performing individual session state transitions...\n");
+	ksft_print_msg(" - Preparing Session A...\n");
+	if (luo_set_session_event(sessions[0].fd, LIVEUPDATE_PREPARE) < 0)
+		fail_exit("Failed to prepare Session A");
+
+	ksft_print_msg(" - Preparing and then Canceling Session B...\n");
+	if (luo_set_session_event(sessions[1].fd, LIVEUPDATE_PREPARE) < 0)
+		fail_exit("Failed to prepare Session B");
+	if (luo_set_session_event(sessions[1].fd, LIVEUPDATE_CANCEL) < 0)
+		fail_exit("Failed to cancel Session B");
+
+	ksft_print_msg(" - Preparing Session C...\n");
+	if (luo_set_session_event(sessions[2].fd, LIVEUPDATE_PREPARE) < 0)
+		fail_exit("Failed to prepare Session C");
+
+	ksft_print_msg(" - Sessions D & E remain in NORMAL state.\n");
+
+	ksft_print_msg("[PRE-KEXEC] Triggering global PREPARE event...\n");
+	if (luo_set_global_event(luo_fd, LIVEUPDATE_PREPARE) < 0)
+		fail_exit("Failed to set global PREPARE event");
+
+	ksft_print_msg("[PRE-KEXEC] System is ready. Executing kexec...\n");
+	if (system(KEXEC_SCRIPT) != 0)
+		fail_exit("kexec script failed");
+
+	sleep(10);
+	exit(EXIT_FAILURE);
+}
+
+static void run_post_kexec(int luo_fd)
+{
+	struct session_info sessions[NUM_SESSIONS] = {0};
+
+	ksft_print_msg("[POST-KEXEC] Starting workload...\n");
+
+	reinit_sessions(sessions);
+
+	ksft_print_msg("[POST-KEXEC] Verifying preserved sessions (A, B, C, D)...\n");
+	verify_session_and_get_fd(luo_fd, &sessions[0]);
+	verify_session_and_get_fd(luo_fd, &sessions[1]);
+	verify_session_and_get_fd(luo_fd, &sessions[2]);
+	verify_session_and_get_fd(luo_fd, &sessions[3]);
+
+	ksft_print_msg("[POST-KEXEC] NOT retrieving session E to test cleanup.\n");
+
+	ksft_print_msg("[POST-KEXEC] Driving global state to FINISH...\n");
+	if (luo_set_global_event(luo_fd, LIVEUPDATE_FINISH) < 0)
+		fail_exit("Failed to set global FINISH event");
+
+	ksft_print_msg("\n--- TEST PASSED ---\n");
+	ksft_print_msg("Check dmesg for cleanup log of session E.\n");
+}
+
+int main(int argc, char *argv[])
+{
+	enum liveupdate_state state;
+	int luo_fd;
+
+	luo_fd = luo_open_device();
+	if (luo_fd < 0) {
+		ksft_exit_skip("Failed to open %s. Is the luo module loaded?\n",
+			       LUO_DEVICE);
+	}
+
+	if (luo_get_global_state(luo_fd, &state) < 0)
+		fail_exit("Failed to get LUO state");
+
+	switch (state) {
+	case LIVEUPDATE_STATE_NORMAL:
+		run_pre_kexec(luo_fd);
+		break;
+	case LIVEUPDATE_STATE_UPDATED:
+		run_post_kexec(luo_fd);
+		break;
+	default:
+		fail_exit("Test started in an unexpected state: %d", state);
+	}
+
+	close(luo_fd);
+	ksft_exit_pass();
+}
-- 
2.51.0.536.g15c5d4f767-goog

From nobody Sun Oct 5 07:24:32 2025
From: Pasha Tatashin
To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com,
 changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org,
 dmatlack@google.com, rientjes@google.com, corbet@lwn.net,
 rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com,
 kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com,
 masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org,
 yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev,
 chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com,
 jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org,
 dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org,
 rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org,
 zhangguopeng@kylinos.cn, linux@weissschuh.net,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de,
 mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
 x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org,
 bartosz.golaszewski@linaro.org, cw00.choi@samsung.com,
 myungjoo.ham@samsung.com, yesanishhere@gmail.com,
 Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com,
 aleksander.lobakin@intel.com, ira.weiny@intel.com,
 andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de,
 bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com,
 stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net,
 brauner@kernel.org, linux-api@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, saeedm@nvidia.com,
 ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com,
 leonro@nvidia.com, witu@nvidia.com, hughd@google.com,
 skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com
Subject: [PATCH v4 29/30] selftests/liveupdate: Add test for unreclaimed
 resource cleanup
Date: Mon, 29 Sep 2025 01:03:20 +0000
Message-ID: <20250929010321.3462457-30-pasha.tatashin@soleen.com>
In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com>
References: <20250929010321.3462457-1-pasha.tatashin@soleen.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Introduce a new selftest, luo_unreclaimed, to specifically validate that
the LUO framework correctly identifies and cleans up preserved resources
that are not restored by userspace after a kexec reboot.

Ensuring proper cleanup of unreclaimed (or "abandoned") resources is
critical for preventing resource leaks in the kernel. This test provides
a focused scenario to verify this cleanup path, which is a key aspect of
the LUO's robustness.

The test performs a full kexec cycle with the following simple flow:

1. Pre-kexec:
   - A single session is created.
   - Two memfd files are preserved: File A (which will be restored) and
     File B (which will be abandoned).
   - The global LIVEUPDATE_PREPARE event is triggered, and the system
     reboots.

2. Post-kexec:
   - The preserved session is retrieved.
   - Only File A is restored and its contents are verified to confirm
     the basic preservation mechanism is working.
   - File B is intentionally not restored.
   - The global LIVEUPDATE_FINISH event is triggered.

3. Verification:
   - The test passes if File A is verified successfully.

Signed-off-by: Pasha Tatashin
---
 tools/testing/selftests/liveupdate/Makefile   |   1 +
 .../selftests/liveupdate/luo_unreclaimed.c    | 107 ++++++++++++++++++
 2 files changed, 108 insertions(+)
 create mode 100644 tools/testing/selftests/liveupdate/luo_unreclaimed.c

diff --git a/tools/testing/selftests/liveupdate/Makefile b/tools/testing/selftests/liveupdate/Makefile
index 72892942dd61..ffce73233149 100644
--- a/tools/testing/selftests/liveupdate/Makefile
+++ b/tools/testing/selftests/liveupdate/Makefile
@@ -12,6 +12,7 @@ LUO_SHARED_HDRS += luo_test_utils.h
 LUO_MANUAL_TESTS += luo_multi_file
 LUO_MANUAL_TESTS += luo_multi_kexec
 LUO_MANUAL_TESTS += luo_multi_session
+LUO_MANUAL_TESTS += luo_unreclaimed
 
 TEST_FILES += do_kexec.sh
 
diff --git a/tools/testing/selftests/liveupdate/luo_unreclaimed.c b/tools/testing/selftests/liveupdate/luo_unreclaimed.c
new file mode 100644
index 000000000000..c3921b21b97b
--- /dev/null
+++ b/tools/testing/selftests/liveupdate/luo_unreclaimed.c
@@ -0,0 +1,107 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin
+ */
+
+#include "luo_test_utils.h"
+#include "../kselftest.h"
+
+#define KEXEC_SCRIPT "./do_kexec.sh"
+
+#define SESSION_NAME "unreclaimed_session"
+#define TOKEN_A 100
+#define TOKEN_B 200
+#define DATA_A "This is file A, the one we retrieve."
+#define DATA_B "This is file B, the one we abandon."
+
+static void run_pre_kexec(int luo_fd)
+{
+	int session_fd;
+
+	ksft_print_msg("[PRE-KEXEC] Starting workload...\n");
+
+	session_fd = luo_create_session(luo_fd, SESSION_NAME);
+	if (session_fd < 0)
+		fail_exit("Failed to create session '%s'", SESSION_NAME);
+
+	ksft_print_msg("[PRE-KEXEC] Preserving memfd A (to be restored).\n");
+	if (create_and_preserve_memfd(session_fd, TOKEN_A, DATA_A) < 0)
+		fail_exit("Failed to preserve memfd A");
+
+	ksft_print_msg("[PRE-KEXEC] Preserving memfd B (to be abandoned).\n");
+	if (create_and_preserve_memfd(session_fd, TOKEN_B, DATA_B) < 0)
+		fail_exit("Failed to preserve memfd B");
+
+	if (luo_set_global_event(luo_fd, LIVEUPDATE_PREPARE) < 0)
+		fail_exit("Failed to set global PREPARE event");
+
+	ksft_print_msg("[PRE-KEXEC] System is ready. Executing kexec...\n");
+	if (system(KEXEC_SCRIPT) != 0)
+		fail_exit("kexec script failed");
+
+	sleep(10);
+	exit(EXIT_FAILURE);
+}
+
+static void run_post_kexec(int luo_fd)
+{
+	int session_fd, mfd_a;
+
+	ksft_print_msg("[POST-KEXEC] Starting workload...\n");
+
+	session_fd = luo_retrieve_session(luo_fd, SESSION_NAME);
+	if (session_fd < 0)
+		fail_exit("Failed to retrieve session '%s'", SESSION_NAME);
+
+	ksft_print_msg("[POST-KEXEC] Restoring and verifying memfd A (token %d)...\n",
+		       TOKEN_A);
+	mfd_a = restore_and_verify_memfd(session_fd, TOKEN_A, DATA_A);
+	if (mfd_a < 0)
+		fail_exit("Failed to restore or verify memfd A");
+	close(mfd_a);
+	ksft_print_msg(" Data verification PASSED for memfd A.\n");
+
+	ksft_print_msg("[POST-KEXEC] NOT restoring memfd B (token %d) to test cleanup.\n",
+		       TOKEN_B);
+
+	ksft_print_msg("[POST-KEXEC] Driving global state to FINISH...\n");
+	if (luo_set_global_event(luo_fd, LIVEUPDATE_FINISH) < 0)
+		fail_exit("Failed to set global FINISH event");
+
+	close(session_fd);
+
+	ksft_print_msg("\n--- TEST PASSED ---\n");
+	ksft_print_msg("Check dmesg for cleanup log of token %d in session '%s'.\n",
+		       TOKEN_B, SESSION_NAME);
+}
+
+int main(int argc, char *argv[])
+{
+	enum liveupdate_state state;
+	int luo_fd;
+
+	luo_fd = luo_open_device();
+	if (luo_fd < 0) {
+		ksft_exit_skip("Failed to open %s. Is the luo module loaded?\n",
+			       LUO_DEVICE);
+	}
+
+	if (luo_get_global_state(luo_fd, &state) < 0)
+		fail_exit("Failed to get LUO state");
+
+	switch (state) {
+	case LIVEUPDATE_STATE_NORMAL:
+		run_pre_kexec(luo_fd);
+		break;
+	case LIVEUPDATE_STATE_UPDATED:
+		run_post_kexec(luo_fd);
+		break;
+	default:
+		fail_exit("Test started in an unexpected state: %d", state);
+	}
+
+	close(luo_fd);
+	ksft_exit_pass();
+}
-- 
2.51.0.536.g15c5d4f767-goog

From nobody Sun Oct 5 07:24:32 2025
From: Pasha Tatashin
To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com,
 changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org,
 dmatlack@google.com, rientjes@google.com, corbet@lwn.net,
 rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com,
 kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com,
 masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org,
 yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev,
 chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com,
 jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org,
 dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org,
 rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org,
 zhangguopeng@kylinos.cn, linux@weissschuh.net,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de,
 mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
 x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org,
 bartosz.golaszewski@linaro.org, cw00.choi@samsung.com,
 myungjoo.ham@samsung.com, yesanishhere@gmail.com,
 Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com,
 aleksander.lobakin@intel.com, ira.weiny@intel.com,
 andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de,
 bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com,
 stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net,
 brauner@kernel.org, linux-api@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, saeedm@nvidia.com,
 ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com,
 leonro@nvidia.com, witu@nvidia.com, hughd@google.com,
 skhawaja@google.com, chrisl@kernel.org, steven.sistare@oracle.com
Subject: [PATCH v4 30/30] selftests/liveupdate: Add tests for per-session
 state and cancel cycles
Date: Mon, 29 Sep 2025 01:03:21 +0000
Message-ID: <20250929010321.3462457-31-pasha.tatashin@soleen.com>
In-Reply-To: <20250929010321.3462457-1-pasha.tatashin@soleen.com>
References: <20250929010321.3462457-1-pasha.tatashin@soleen.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Introduce two new, non-kexec selftests to validate the state transition
logic for individual LUO sessions, with a focus on the PREPARE, FREEZE,
and CANCEL events.

While other tests cover the full kexec lifecycle, it is critical to also
test the internal per-session state machine's logic and rollback
capabilities in isolation. These tests provide this focused coverage,
ensuring the core session management ioctls behave as expected.

The new test cases are:

1. session_prepare_cancel_cycle:
   - Verifies the fundamental NORMAL -> PREPARED -> NORMAL state
     transition path.
   - It creates a session, preserves a file, sends a per-session PREPARE
     event, asserts the state is PREPARED, then sends a CANCEL event and
     asserts the state has correctly returned to NORMAL.

2. session_freeze_cancel_cycle:
   - Extends the first test by validating the more critical
     ... -> FROZEN -> NORMAL rollback path.
   - It follows the same steps but adds a FREEZE event after PREPARE,
     asserting the session enters the FROZEN state.
   - It then sends a CANCEL event, verifying that a session can be
     rolled back even from this final pre-kexec state. This is essential
     for robustly handling aborts.

Signed-off-by: Pasha Tatashin
---
 tools/testing/selftests/liveupdate/Makefile   |  9 ++-
 .../testing/selftests/liveupdate/liveupdate.c | 56 +++++++++++++++++++
 2 files changed, 64 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/liveupdate/Makefile b/tools/testing/selftests/liveupdate/Makefile
index ffce73233149..25a6dec790bb 100644
--- a/tools/testing/selftests/liveupdate/Makefile
+++ b/tools/testing/selftests/liveupdate/Makefile
@@ -16,11 +16,18 @@ LUO_MANUAL_TESTS += luo_unreclaimed
 
 TEST_FILES += do_kexec.sh
 
-TEST_GEN_PROGS += liveupdate
+LUO_MAIN_TESTS += liveupdate
 
 # --- Automatic Rule Generation (Do not edit below) ---
 
 TEST_GEN_PROGS_EXTENDED += $(LUO_MANUAL_TESTS)
+TEST_GEN_PROGS := $(LUO_MAIN_TESTS)
+
+liveupdate_SOURCES := liveupdate.c $(LUO_SHARED_SRCS)
+
+$(OUTPUT)/liveupdate: $(liveupdate_SOURCES) $(LUO_SHARED_HDRS)
+	$(call msg,LINK,,$@)
+	$(Q)$(LINK.c) $^ $(LDLIBS) -o $@
 
 # Define the full list of sources for each manual test.
 $(foreach test,$(LUO_MANUAL_TESTS), \
diff --git a/tools/testing/selftests/liveupdate/liveupdate.c b/tools/testing/selftests/liveupdate/liveupdate.c
index 7c0ceaac0283..804aa25ce5ae 100644
--- a/tools/testing/selftests/liveupdate/liveupdate.c
+++ b/tools/testing/selftests/liveupdate/liveupdate.c
@@ -17,6 +17,7 @@
 #include
 
 #include
+#include "luo_test_utils.h"
 
 #include "../kselftest.h"
 #include "../kselftest_harness.h"
@@ -52,6 +53,16 @@ const char *const luo_state_str[] = {
 	[LIVEUPDATE_STATE_UPDATED] = "updated",
 };
 
+static int get_session_state(int session_fd)
+{
+	struct liveupdate_session_get_state arg = { .size = sizeof(arg) };
+
+	if (ioctl(session_fd, LIVEUPDATE_SESSION_GET_STATE, &arg) < 0)
+		return -errno;
+
+	return arg.state;
+}
+
 static int run_luo_selftest_cmd(int fd_dbg, __u64 cmd_code,
 				struct luo_arg_subsystem *subsys_arg)
 {
@@ -345,4 +356,49 @@ TEST_F(subsystem, prepare_fail)
 	ASSERT_EQ(0, unregister_subsystem(self->fd_dbg, &self->si[i]));
 }
 
+TEST_F(state, session_freeze_cancel_cycle)
+{
+	int session_fd;
+	const char *session_name = "freeze_cancel_session";
+	const int memfd_token = 5678;
+
+	session_fd = luo_create_session(self->fd, session_name);
+	ASSERT_GE(session_fd, 0);
+
+	ASSERT_EQ(0, create_and_preserve_memfd(session_fd, memfd_token,
+					       "freeze test data"));
+
+	ASSERT_EQ(0, luo_set_session_event(session_fd, LIVEUPDATE_PREPARE));
+	ASSERT_EQ(get_session_state(session_fd), LIVEUPDATE_STATE_PREPARED);
+
+	ASSERT_EQ(0, luo_set_session_event(session_fd, LIVEUPDATE_FREEZE));
+	ASSERT_EQ(get_session_state(session_fd), LIVEUPDATE_STATE_FROZEN);
+
+	ASSERT_EQ(0, luo_set_session_event(session_fd, LIVEUPDATE_CANCEL));
+	ASSERT_EQ(get_session_state(session_fd), LIVEUPDATE_STATE_NORMAL);
+
+	close(session_fd);
+}
+
+TEST_F(state, session_prepare_cancel_cycle)
+{
+	const char *session_name = "prepare_cancel_session";
+	const int memfd_token = 1234;
+	int session_fd;
+
+	session_fd = luo_create_session(self->fd, session_name);
+	ASSERT_GE(session_fd, 0);
+
+	ASSERT_EQ(0, create_and_preserve_memfd(session_fd, memfd_token,
+					       "prepare test data"));
+
+	ASSERT_EQ(0, luo_set_session_event(session_fd, LIVEUPDATE_PREPARE));
+	ASSERT_EQ(get_session_state(session_fd), LIVEUPDATE_STATE_PREPARED);
+
+	ASSERT_EQ(0, luo_set_session_event(session_fd, LIVEUPDATE_CANCEL));
+	ASSERT_EQ(get_session_state(session_fd), LIVEUPDATE_STATE_NORMAL);
+
+	close(session_fd);
+}
+
 TEST_HARNESS_MAIN
-- 
2.51.0.536.g15c5d4f767-goog
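
For readers without a LUO-enabled kernel, the two transition paths
exercised by this patch can be sketched as a small user-space model.
This is only an illustration under stated assumptions, not the kernel's
implementation: every model_* name is hypothetical, and returning
-EINVAL for out-of-order events is an assumption of the sketch that the
patch itself does not test.

```c
/* Hypothetical user-space model; not the kernel's LUO implementation. */
#include <assert.h>
#include <errno.h>

/* Session states named in the commit message. */
enum model_state { MODEL_NORMAL, MODEL_PREPARED, MODEL_FROZEN };

/* Per-session events named in the commit message. */
enum model_event { MODEL_PREPARE, MODEL_FREEZE, MODEL_CANCEL };

/*
 * Apply one event to *state. Returns 0 on success. Returning -EINVAL
 * for an out-of-order event, and leaving *state unchanged in that
 * case, is an assumption of this sketch.
 */
static int model_set_event(enum model_state *state, enum model_event event)
{
	switch (event) {
	case MODEL_PREPARE:
		if (*state != MODEL_NORMAL)
			return -EINVAL;
		*state = MODEL_PREPARED;
		return 0;
	case MODEL_FREEZE:
		if (*state != MODEL_PREPARED)
			return -EINVAL;
		*state = MODEL_FROZEN;
		return 0;
	case MODEL_CANCEL:
		if (*state == MODEL_NORMAL)
			return -EINVAL;
		*state = MODEL_NORMAL;
		return 0;
	}
	return -EINVAL;
}

/* Mirrors session_freeze_cancel_cycle: NORMAL -> PREPARED -> FROZEN -> NORMAL. */
static void model_freeze_cancel_cycle(void)
{
	enum model_state s = MODEL_NORMAL;

	assert(model_set_event(&s, MODEL_PREPARE) == 0 && s == MODEL_PREPARED);
	assert(model_set_event(&s, MODEL_FREEZE) == 0 && s == MODEL_FROZEN);
	assert(model_set_event(&s, MODEL_CANCEL) == 0 && s == MODEL_NORMAL);
}
```

Calling model_freeze_cancel_cycle() walks the same path as the second
selftest; dropping the FREEZE step gives the shorter prepare/cancel
path of the first one.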