From nobody Mon Oct 6 06:43:24 2025 Received: from mail-yw1-f178.google.com (mail-yw1-f178.google.com [209.85.128.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8ED202EB5A9 for ; Wed, 23 Jul 2025 14:46:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.178 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282020; cv=none; b=hR+OYvAn60dpr0Xk5VAtmAGefeLVU6dLLM1eOqQoam6BkePP7wPEPOz3Rt3O37659EOok/nHnLwY9eO1t+YkIl1zJf82x3G1f9WwvTR2V2fQc45S5TbuD5q/BMpI+euPDeBtjfsmYJqm2nabuAurCoh2PO0Vgu9jMwMwlsHtPlw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282020; c=relaxed/simple; bh=f+GDZMWDZJ+VR9AGii24ODCbYkbiXkpxGsAaLIJVFdg=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=nnEkFoDU0ViVJNa1dymIIRF4EDybqBi6YayStG5nhz6n4joYenhGPTPt/6aoIn0Be+93zDGdFjFaVUOwFNnwpBrCG/bvNFLjIMAKXEi9T6CmtIPFzeUFN2Z8gmwLHeS4ihMgD9p8x/cUJFDOR6kNi2QCk4KEpztml95yR9+JBjk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=IHZNcmFa; arc=none smtp.client-ip=209.85.128.178 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="IHZNcmFa" Received: by mail-yw1-f178.google.com with SMTP id 00721157ae682-7183dae670dso62710477b3.2 for ; Wed, 23 Jul 2025 07:46:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282016; x=1753886816; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=inB1PU3JWa0vbtTol7UPnIAGavYFixpKMP3Mnwcd84A=; b=IHZNcmFatXGtvf/rYbXauEIWrfIzb1QEczTtP46KsO1VxWQL2GSvPvqA7welG7yDMy kPny59P7Fb9o1mxJIh/8DwUndsozrNI2GQkaqD89nndQo05Y5Ll2w0HY4bpBpYswlm68 HSrSc42XyqKGkMyCFWx10JADe3vN+d1ukSd7awPUFWYFJTQfhRaa0UDKPxl4WQZXntfQ FJdLcFy9dDthXh02+pKLHtY84O32mR+m3z9G7ldDcGFnUbqghx6LzcCy3E1i3QtrqGpF OycRt3LzEW/o0+10B6FyrtbH6/VE3j4AoC25P0IQVyc9z8WgShZhbY/H4FxwZ0OITJEb Mq6w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282016; x=1753886816; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=inB1PU3JWa0vbtTol7UPnIAGavYFixpKMP3Mnwcd84A=; b=p0NZH0aADUuOOrmdXxc5n+y8mNKgXvmWnkZKS0/BO56qis0bVvsx0nzlu//0o+BPor zFDpxipyrdDhu1gsdXNVIcsI7k/oXNyNqwrZ/Lyht5/Xazu5/fi3QQphr8jO4VYocrT+ mx50KGduSpcABSMhc/aUpgwCwJ5SMiphpNERQlzRVhKl/R7bYyFDoNgGwWzLE70hG0fX PWxcIYjtn6FsaShxwxF9YK//ZV3PDJPtfWYjrq214tZhXtPUrdDi1DuwjeviLR6kgD2j QOBWq4PfrAxgCui7MhdXwEqCA5C72ovXBU6Ifu1DwhQijzCGeXYjWLkBw4xgcw9S2IzB vjNg== X-Forwarded-Encrypted: i=1; AJvYcCUWEtqPwDZYMeBJrUx+0S6AUxDEk4tWHpMHkTRj75lFd+RG20BaFWQiHTH1rWjMAHzriWUm//x9AUVQYFM=@vger.kernel.org X-Gm-Message-State: AOJu0YwDRv30Ext0x6mYTyZ90v825WaLfiL5bvGGnmt35lOer9ywRvQz oOMIgg3ywnweXg6AezJshEHk3b7IJyAuDmrhEneApqZhE6JsEF88QMwkdL/+IR3RPU8= X-Gm-Gg: ASbGncvJdK5h5IX/MRf2QsUK4Pj+D8x1eHA9Wt9T778J2NJtUCqsWYkBpIquomY5f0A 5G8m6BQhlN+9bvWSaa2jBS7I0wu6kS15IB6Taf4qIMu0vP0lYWY4id3F7mfIrt2doUwX1Y8nJNd 1D0qQEvFNufZsBVTLx0BxzgUHA9zC9l6VFhF3NLPLcS9mRLjHeBF5OlW6iKRmiHa8c/KlwJx7bI aihPBiMOQJMpC4gLmLX9OHiphT6uROIyRkoEd5SeweYMkh+mrNSHfmK0dyH4lZtlEZkwpchvjdz TsHhU6GQPakCeQMmjIplFbklqqlfVOdMjh+RLiCc53e1PvPpy9hcldQ2++0pgwBk3wnXEc4SVcP 
ic0xns74bHkFPWKktnSArEF52m4iRrYgtZpx24Inqi2281G4zrmT50uP/a5tckSPXRIemC/eEtX uDXYwiwf3A1hZ6yA== X-Google-Smtp-Source: AGHT+IHbn/J5Pj5jthdHC0UvAscyWuGSIeItHc3xuSaAqAb6w+jHqSJyHG/OBKZLtTuVuVxeIgZROw== X-Received: by 2002:a05:690c:3693:b0:70d:f3f9:1898 with SMTP id 00721157ae682-719b4258e07mr46642237b3.35.1753282016453; Wed, 23 Jul 2025 07:46:56 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. [34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.46.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:46:55 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, 
djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com
Subject: [PATCH v2 01/32] kho: init new_physxa->phys_bits to fix lockdep
Date: Wed, 23 Jul 2025 14:46:14 +0000
Message-ID: <20250723144649.1696299-2-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog
In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
References: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Lockdep shows the following warning:

INFO: trying to register non-static key.
The code is fine but needs lockdep annotation, or maybe
you didn't initialize this object before use?
turning off the locking correctness validator.

[] dump_stack_lvl+0x66/0xa0
[] assign_lock_key+0x10c/0x120
[] register_lock_class+0xf4/0x2f0
[] __lock_acquire+0x7f/0x2c40
[] ? __pfx_hlock_conflict+0x10/0x10
[] ? native_flush_tlb_global+0x8e/0xa0
[] ? __flush_tlb_all+0x4e/0xa0
[] ? __kernel_map_pages+0x112/0x140
[] ? xa_load_or_alloc+0x67/0xe0
[] lock_acquire+0xe6/0x280
[] ? xa_load_or_alloc+0x67/0xe0
[] _raw_spin_lock+0x30/0x40
[] ? xa_load_or_alloc+0x67/0xe0
[] xa_load_or_alloc+0x67/0xe0
[] kho_preserve_folio+0x90/0x100
[] __kho_finalize+0xcf/0x400
[] kho_finalize+0x34/0x70

This is because xa has its own lock, which is not initialized in
xa_load_or_alloc().
Modify __kho_preserve_order() to properly call
xa_init(&new_physxa->phys_bits).

Fixes: fc33e4b44b27 ("kexec: enable KHO support for memory preservation")
Signed-off-by: Pasha Tatashin
Acked-by: Mike Rapoport (Microsoft)
---
 kernel/kexec_handover.c | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c
index 5a21dbe17950..1ff6b242f98c 100644
--- a/kernel/kexec_handover.c
+++ b/kernel/kexec_handover.c
@@ -144,14 +144,35 @@ static int __kho_preserve_order(struct kho_mem_track *track, unsigned long pfn,
 				unsigned int order)
 {
 	struct kho_mem_phys_bits *bits;
-	struct kho_mem_phys *physxa;
+	struct kho_mem_phys *physxa, *new_physxa;
 	const unsigned long pfn_high = pfn >> order;
 
 	might_sleep();
 
-	physxa = xa_load_or_alloc(&track->orders, order, sizeof(*physxa));
-	if (IS_ERR(physxa))
-		return PTR_ERR(physxa);
+	physxa = xa_load(&track->orders, order);
+	if (!physxa) {
+		new_physxa = kzalloc(sizeof(*physxa), GFP_KERNEL);
+		if (!new_physxa)
+			return -ENOMEM;
+
+		xa_init(&new_physxa->phys_bits);
+		physxa = xa_cmpxchg(&track->orders, order, NULL, new_physxa,
+				    GFP_KERNEL);
+		if (xa_is_err(physxa)) {
+			int err_ret = xa_err(physxa);
+
+			xa_destroy(&new_physxa->phys_bits);
+			kfree(new_physxa);
+
+			return err_ret;
+		}
+		if (physxa) {
+			xa_destroy(&new_physxa->phys_bits);
+			kfree(new_physxa);
+		} else {
+			physxa = new_physxa;
+		}
+	}
 
 	bits = xa_load_or_alloc(&physxa->phys_bits, pfn_high / PRESERVE_BITS,
 				sizeof(*bits));
-- 
2.50.0.727.gbf7dc18ff4-goog

From nobody Mon Oct 6 06:43:24 2025
Received: from mail-yb1-f169.google.com (mail-yb1-f169.google.com [209.85.219.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B27BC2F533D for ; Wed, 23 Jul 2025 14:46:59 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none
smtp.client-ip=209.85.219.169 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282022; cv=none; b=l+T4ba2YbxZwyVzDoaOFL/QR+c3SADEnvT00QkAEZ3Jnz7txAHfuQ0cZIXgFM2sYrpiGISYbSkOAzHCC/zTcA0rS4mujwIHDbC7Js19lsdL8geoVycogxeEVeHRl1v4w3dM64b3I0fN46D0siO+eSdZGAfqhaBFX4bsuTBFZvyU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282022; c=relaxed/simple; bh=sJLDn8ryoW/ov6UjZZGTTudXv5WHl8/v2Z5ZbPyJ1x4=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=CF1x9J5qQ6zUljD+DP4ZF3qNVX5tAiuj26C+rHEK+3CrhEiHmCr8u04HVSNzDXNqOuhIiCpgtehcqm72LmNLIa3PNxYO/XlRmXHQV4tNA+SPrceeYCn3mlAd73yDgLVp7cui6EwPzlAR1YRKZkzNxi1wkEjBlwSuirt4gdo3120= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=lLeh4OIW; arc=none smtp.client-ip=209.85.219.169 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="lLeh4OIW" Received: by mail-yb1-f169.google.com with SMTP id 3f1490d57ef6-e7d9d480e6cso5531762276.2 for ; Wed, 23 Jul 2025 07:46:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282019; x=1753886819; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=n7cl1P/nffEHtPrwqwqBj2n/TmbnAXjnzD/OU22eOOo=; b=lLeh4OIWCBKjqs1N2TriFDuB+ubYYiseni13/mb49rqAEhv2jLPKXb3Q/Mt4rf8D+Z 
7yJwP/iH0iEperDTE1UJelAthrHwmUaZj5cFcEBvnG0IGsmcnsZ4WAvZv7geAbLGv4vE fmMj/N8K1oy3jNEltDSq/BUyPW75NEC+xgiUamE4ezJO2QZFp/l1vZOkUu96jPBefpci x/ghPi6RS0ioS+uuVxBn1m0yOsq9yXXNmipUz9VUSxh1x2ahltuhH9zXtrxah945fb6q 9soOx0N9pwYeBRyFHMoAuRsRWpIIyt7NidN7JD7l97y2oc7p4deeBVm17WMsi+ciOXrt 8etg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282019; x=1753886819; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=n7cl1P/nffEHtPrwqwqBj2n/TmbnAXjnzD/OU22eOOo=; b=WKlnDTXDKY7dvkYHNOE5SJVlAIP4hQUvJkjEvGIuedUNcSro00mJxySh7mxHMQGX6g N2Hbj+FE7wSVaKKuFP6BBodd6aNyyNt+9t2wd2PWGUB2VqER+QqCbdaxahggSInzoC3g /aVVtDkT5wm7bGtqwkjKA8QC5sQvQSs1Lwyk6PeHfYXG1XrBwm6D3HNVxyz7PkOwn9zL ZU8kTRDv+UYJnSnbpPduWvIQ6ax5wbNFYxdwdpxqnvNMoYpMEMB+qeJtgsQEIcqQzxmt gDvpbwEcSLt2ggMTJ9Gl+8XPmK8laBaGcANXvRZXM+Py0Kbn8bw4iVsG8ozaXIRSS2e0 TvaQ== X-Forwarded-Encrypted: i=1; AJvYcCXPrgUwA/mFlSq6Id2ktFsVvPdafH9Cq9XYNP2/U6EoEShSW/msRmeUxr5ViCpePPCHUH+PLV5JVKA22SA=@vger.kernel.org X-Gm-Message-State: AOJu0YynFnxjCyJ2AcwXK4zFNqujq2sRQFjVdRNWJ2+u4VY7qFn7FI2m x/wXmTQszr7rRVKCdBKRe0NiPHh5YbFTj2b52PPBf+ZujZXSrS/LDu5YrIvsFqrGPC0= X-Gm-Gg: ASbGncvBNPK1hq1x8lWsL8BftxlD3D7Uw02VgzizzZ9R9pbeVsTyS17uXTigzl2pW1I WZQ1Qp3qlt5iuBxE6dW+nDkCVnOJMJSf/X9gx84aD0KSRakxXKdrf1x6vDHubs1aJyJLvVc7zCw O8cd6RoeX49cjRFHbnVuo5LPCBCb1xHfMaNaaW8iUDs0JE61bkkAm49FVfAwWcNGM2HOeJopRch tti7kiMjWuRHr7f+xlHbLRkWRghLfBULR1npc+gZrV95zQHu1PwGM5vPFIgf3vWU+KPQumHj0Po ffLEtx/oAQdIEa8qFBD0pLME3o0RsRRb2zI+rSL+niTI1uFw37QT5I3wz4ATGHwexU8C/qAUjwq Kwy11jcLc4ppzVao5CnYJUfYpxnuI2Cll3U6tmGsmaf+n7gWQstbHMUhRQjC1w2y+p3Kzk/wukm CTJJyc3cLe0TOetQ== X-Google-Smtp-Source: AGHT+IE6ehQOayyo1lb2cZBdo6W6rK2fNueePB1JRBmEV6kcBUnmTmWNn0MIzxo8F5fZ2uNGvf14yw== X-Received: by 2002:a05:690c:490b:b0:719:4c68:a6f8 with SMTP id 00721157ae682-719b4258512mr43723377b3.32.1753282018494; Wed, 23 Jul 2025 07:46:58 -0700 (PDT) Received: from 
soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. [34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.46.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:46:57 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v2 02/32] kho: mm: Don't allow deferred struct page with KHO Date: Wed, 23 Jul 2025 
14:46:15 +0000
Message-ID: <20250723144649.1696299-3-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog
In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
References: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

KHO uses struct pages for the preserved memory early in boot. However,
with deferred struct page initialization, only a small portion of memory
has properly initialized struct pages.

The problem shows up when vmemmap is poisoned and illegal flag
combinations are detected.

Do not allow the two to be enabled together. Later, KHO will have to be
taught to work properly with the deferred struct page init kernel
feature.

Fixes: 990a950fe8fd ("kexec: add config option for KHO")
Signed-off-by: Pasha Tatashin
Acked-by: Mike Rapoport (Microsoft)
---
 kernel/Kconfig.kexec | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/Kconfig.kexec b/kernel/Kconfig.kexec
index 2ee603a98813..1224dd937df0 100644
--- a/kernel/Kconfig.kexec
+++ b/kernel/Kconfig.kexec
@@ -97,6 +97,7 @@ config KEXEC_JUMP
 config KEXEC_HANDOVER
 	bool "kexec handover"
 	depends on ARCH_SUPPORTS_KEXEC_HANDOVER && ARCH_SUPPORTS_KEXEC_FILE
+	depends on !DEFERRED_STRUCT_PAGE_INIT
 	select MEMBLOCK_KHO_SCRATCH
 	select KEXEC_FILE
 	select DEBUG_FS
-- 
2.50.0.727.gbf7dc18ff4-goog

From nobody Mon Oct 6 06:43:24 2025
Received: from mail-yw1-f181.google.com (mail-yw1-f181.google.com [209.85.128.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 95DC32F5C2E for ; Wed, 23 Jul 2025 14:47:01 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.181
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org;
s=arc-20240116; t=1753282023; cv=none; b=lD9WjA1OB22jTaMWfDu6SMpQvG01BdIujSQl4Pi3+LiVARBinID9yPcrCPalJwYMTuw7QIrc5fs5+zqo7cRG6lYPtDyS7nYjf7egFwhqcLa/aihVmv/lF6geeYf8ZAsybX2Uw5a5WnH9SexNhBFV3R89mMPynpIV7VoZ+gwA2Lg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282023; c=relaxed/simple; bh=kan5N6v9h+6HDmaXzRrpPnNHjduBvdwtdSKyvQ1qoHI=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=YbQOD0vyYizQetsH3FVp5OpguVSMQ4FyvHt1l3gLOqPhc9RrkDJ6+aIM0EZlXFzi2zusZPfoZvxNryGHcXHfTHhXTabib14rmJZG4UVwpCMokASVbbEcrfWu88LbISGa7FS8nD4TbpTJ3GmdWyT+9bt9hrb1S3FVRGzuenpoktY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=KL4Num2E; arc=none smtp.client-ip=209.85.128.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="KL4Num2E" Received: by mail-yw1-f181.google.com with SMTP id 00721157ae682-70e4043c5b7so57109867b3.1 for ; Wed, 23 Jul 2025 07:47:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282020; x=1753886820; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=gy3J9QeC6eXj3cGR0pgjxNs/6APC+1lztNiCJckBgPA=; b=KL4Num2ErMDjSve2Ap/UeK84q8BVWlgZScO0wXfZKdHj0BoY/3vpaaEdSO67uTDl0N EK+9w26CgCTEG3g4ZZovf6HJlIZvbRroRps3D+HI5DUKgFzXZM9v8Eq4dHlaGBr5iN+q uWUiQdMER9jsseUMjfid2gZ61FAjv3HbuJrTeAXbXe9RqB+4DPKuTl9FOQpdXXDx4W/M 
iIJmCFgAbbCNbxiRDbC8sa1dbiUHM6u9TQTOQZgm94jDXnK9D5YifpVCN2jSO+HJbVfA DdZuLjPg7krOtEVrVPmqLqlPSeSdzNj7RhlMtOIBTI02bBIkS6pcDsfAl+4TgsDl0iHM To5Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282020; x=1753886820; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=gy3J9QeC6eXj3cGR0pgjxNs/6APC+1lztNiCJckBgPA=; b=nashTzbWH/DebicNXY2NvIdCP1pThzgPmvTdVtlm7quyqw/Aj/Wnzl4fU/CLXCiCVV hboOmnvD4CItTh55VSvB8nZ+/yF+CBT5NjntzUjcVAAuCDkJkzxfUR4l1suPc7BGmTf+ wQolU/VCwXtLzu9Z3OCEIhZwxL3cdHP4gzY0EKpoV2jGY1s0LDgHiwaObNDeJ54hY0+N wsS7bRLNFcXIgrFB8Py+aSgwodwXg+PwNKOrHm3av8xguSnmLX6Yqq/fnb/OgTOaaSmP MErn1KNAr8Yyv9db3n02D4sO8fqBON9r7kaMnp7k5pfN/ud+cMRg62ZhZvnAeuZhB6uc xAYg== X-Forwarded-Encrypted: i=1; AJvYcCUsUYuJ/yuExJ55h7d6dwJW2lBEbD3MuSgonKoE7D4mKN57r92oLmiefr+elkOeCnc9JC1MdgX1g6KF2w8=@vger.kernel.org X-Gm-Message-State: AOJu0YwlEjDlJJLDubWd7yft5j7m50OsqkyYg5fg4linCU4vJRm+EBB5 bQfoOfLC6CQA0D3CkSc+e3sBhEpj8jXnTT05r/LM3jehvXFaligBb2uVDN/P7vrXqAU= X-Gm-Gg: ASbGnct9TwQSNPAphiQxyA8hA5MWbGo9dELF/lxBt4m92RCGlrkeNEu5IARtfrC0C01 zLJA/EW1MiASMQ9tXk2qz1VaQcxMXeixpepnRwI7DLZV9UlNW6xAJ0mmOjaldORSc7cO3UFyW5y z1igyswV1SsdYJUmwDOGehDz2eRdVLz6wl7DQgKzWvDALpVcxEMsSb8epzThqkak063q4T1wLTY X0/1lzmwMTsmotX8Mi5MO7MgV6mYudal71RTWF2uzGnsPsqTnQ/D5Y6JnkPJRQ2azfhLWb9ANL7 I0Z+A1jcZ+ucF60LAYlx77kvKLSpfAZW5kun21V1fL56oJ8+/KEAe4EF5XGrDre2+O/drS3Pt2c FJkEUM7m+Skvmmnw8e84Smaekl2ny53PNXHPLxQMWEylft/zuBvJzefEzHrN029rx0SFPkF1VHI jdB39ekh9BhrB1aQ== X-Google-Smtp-Source: AGHT+IE8V9tRE54MK48zflOS7PSA4HAhDqRjtC0UhDc7I1+H3UGz76R3OVDEkrrQ9nCRF167WfSdiw== X-Received: by 2002:a05:690c:74c3:b0:710:f1da:1b5f with SMTP id 00721157ae682-719b424d284mr43181487b3.34.1753282020293; Wed, 23 Jul 2025 07:47:00 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.46.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:46:59 -0700 (PDT)
From: Pasha Tatashin
To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com
Subject: [PATCH v2 03/32] kho: warn if KHO is disabled due to an error
Date: Wed, 23 Jul 2025 14:46:16 +0000
Message-ID: <20250723144649.1696299-4-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog
In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
References: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

During boot, the scratch area is allocated based on command line
parameters or calculated automatically. However, the scratch area may
fail to allocate, and in that case KHO is disabled. Currently, no
warning is printed when this happens, which makes it hard for the end
user to figure out why KHO is not available. Add the missing warning
message.

Signed-off-by: Pasha Tatashin
Acked-by: Mike Rapoport (Microsoft)
---
 kernel/kexec_handover.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c
index 1ff6b242f98c..368e23db0a17 100644
--- a/kernel/kexec_handover.c
+++ b/kernel/kexec_handover.c
@@ -565,6 +565,7 @@ static void __init kho_reserve_scratch(void)
 err_free_scratch_desc:
 	memblock_free(kho_scratch, kho_scratch_cnt * sizeof(*kho_scratch));
 err_disable_kho:
+	pr_warn("Failed to reserve scratch area, disabling kexec handover\n");
 	kho_enable = false;
 }
 
-- 
2.50.0.727.gbf7dc18ff4-goog

From nobody Mon Oct 6 06:43:24 2025
Received: from mail-yw1-f180.google.com (mail-yw1-f180.google.com [209.85.128.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7FB7B2F6F8F for ; Wed, 23 Jul 2025 14:47:03 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.180
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282026; cv=none;
b=CbiccLEJ6UdWUUTO9bezQ0/O6wqlRH8Bkrj6bvd6KRkmKsXmuSMteMmnrzRQ4/ttjPrzIBUw2+mi8t9eijTyWpqnPXsLdwxoRN+IkAuIqLmtrU5GiP050dLznbeeDlM3Z2CYE2DcnTH13MIzQFoZuaVQZxCAEhxSQh2o4YSpAVE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282026; c=relaxed/simple; bh=hEYGgJwEES9dRnVz2N9sTt0Ks0Wwq55TXM1cQALvjO4=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=JUuaW9pB/pGz9J1fSE7wcow03QMmlKHlNPTMRcG3x4S5rp+WsSK0OcopDPVg3p3sqklUlNiXLaxJoCZl7NXLCR/Cy9wEAYEh+D2yQ5mR6MbEc4fXl5HiymKrWtOc08rHbbomSWNyncZ0P270GZ2++ob5NJuNWG5VGfLZ/DUNJLs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=0e0Sz2TA; arc=none smtp.client-ip=209.85.128.180 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="0e0Sz2TA" Received: by mail-yw1-f180.google.com with SMTP id 00721157ae682-7183dae670dso62712167b3.2 for ; Wed, 23 Jul 2025 07:47:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282022; x=1753886822; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=3XYi+jDPzWu41ED2O5U4L0RfTKyUhB75+lmsBNGCN0Q=; b=0e0Sz2TA42Ej2BiOtVHwZX9CtJhxBGK1FTBpRlsuVY9WTAJeo50zM/Anxkw+iOUOce exncFkOaeFlC4UVYO1lXxTkOxe4PshM5MJsNfwYZ8eD78M1iiz651ByqRLVzuMhjSeWd acRAtjAH6YAtxPFbQFLrhKKagehgM86PcDu6nuueDeTTBtM+SOexl3UK8KI0xvf8grea 
KASQqvERVBuqhybq9YzWTPYNR8W9rzCTi9kR8SFlSI3o0/7m6TwI1jsktOv2qe0+CBAr 8VLxKxlNlQKxKZn+4v8RAeP723Stbx1W9UlbsVdEUryzyIfXC2TajOMm1KNT13tRatxJ 2oKw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282022; x=1753886822; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=3XYi+jDPzWu41ED2O5U4L0RfTKyUhB75+lmsBNGCN0Q=; b=gY/LQ5nU3qi8RBM2QVN2v1eeN+eypMBWZljSZTds6koDZBnw/YOh7wGprdwmhK7hXc keFXOV+03ELcD82eE1A8gTAeUJKW98AjWodcvjFnqtEEJLyrEe+Uv1cXtAhAhke0t+dL AoC+kcTsz9YTMyR2WfgJ7pVFgcy50OcA5SvgAsXJSnCy/Grfp7SCmGD0xRWc0ETwFf2U KpiEbWnq5LVjyhF3lQ2k2iPam6f2VguwEbbFDvqZKbHh/J7DU1XhvC1vkLjSCH4/o7pt yPkcX3err1C0LOH9AJGtVWbJ9TZ/TZosM6PqSPtnipPHmoxJQEHhtOd5fuJVgRfBNnMW MdjQ== X-Forwarded-Encrypted: i=1; AJvYcCUW4KdU+OLuAYQSEVDujXUCo+mxUlmpNKotoDC+in7jL8N079bVaXv6k+60KSfZCAsBmoCfK2mjw23Tmzw=@vger.kernel.org X-Gm-Message-State: AOJu0YwMD5Jfl7PvOhuG3BUoqCAY0C+Xs3WrTBb4cHZIoE966TenfZ5l IRZSnx04quLsJQn08c/9WD8tiazh7B7OzwuMqj6J0sZIJckUuRYQpSOpRiVPGF3Vsj0= X-Gm-Gg: ASbGncusGs7WRG7jDKTfyWs0Q/cXXibVyR0802NVJirJ1qB0BuRHVG8BxvJxoim2a1I KOzgOiNCovccq1XIhZLmFKPu22q3d5WAe04124HR8QMzW/akIw7kiukrlaqN5h+hcEyvqaqzyu9 +z9k2H+8W3c2G5ZnR/YtJH/pLj6rmwMNqPlaNIb5O7J+CrzZgD5gSyQ+W/kPGNKxYlJmzSFPYYJ EQB0Q2DHx1YY7XDUE65rd94j72/9SBpYqRJ8/ZKXZa/ZD2F5rJd42Pwdt691mNFobS++kGA6iFt s38tLeXKpR3K16heCM30RDmQwUjoaW+KHTt2QP4kjSPcpRp8NG7bYwQ4x+9EO2XJcwGTgPCdDiy H6PC+hIFqfMnmK1FyJKWqxVMBvA3wsrz9TAkIZL03DgpqLkLlpX08aVKFJg/sW6/BBnNjeGPxOX Pyqf32E/pNxMtFNw== X-Google-Smtp-Source: AGHT+IGIyqUWtaPvswuBOlmOuoZ6Oiu0TyqZMj4g6mxjo7nkGd2imCQgeqT6/0gbrHlbiudRcFkz0w== X-Received: by 2002:a05:690c:4911:b0:718:38bd:5433 with SMTP id 00721157ae682-719b427d2cemr38041947b3.42.1753282022224; Wed, 23 Jul 2025 07:47:02 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:01 -0700 (PDT)
From: Pasha Tatashin
To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com
Subject: [PATCH v2 04/32] kho: allow to drive kho from within kernel
Date: Wed, 23 Jul 2025 14:46:17 +0000
Message-ID: <20250723144649.1696299-5-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog
In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
References: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Allow kernel modules to perform finalize and abort, so that LUO can
drive the KHO sequence via its own state machine.

Signed-off-by: Pasha Tatashin
---
 include/linux/kexec_handover.h | 15 +++++++++
 kernel/kexec_handover.c        | 58 ++++++++++++++++++++++++++++++++--
 2 files changed, 71 insertions(+), 2 deletions(-)

diff --git a/include/linux/kexec_handover.h b/include/linux/kexec_handover.h
index 348844cffb13..f98565def593 100644
--- a/include/linux/kexec_handover.h
+++ b/include/linux/kexec_handover.h
@@ -54,6 +54,10 @@ void kho_memory_init(void);
 
 void kho_populate(phys_addr_t fdt_phys, u64 fdt_len, phys_addr_t scratch_phys,
 		  u64 scratch_len);
+
+int kho_finalize(void);
+int kho_abort(void);
+
 #else
 static inline bool kho_is_enabled(void)
 {
@@ -104,6 +108,17 @@ static inline void kho_populate(phys_addr_t fdt_phys, u64 fdt_len,
 				phys_addr_t scratch_phys, u64 scratch_len)
 {
 }
+
+static inline int kho_finalize(void)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline int kho_abort(void)
+{
+	return -EOPNOTSUPP;
+}
+
 #endif /* CONFIG_KEXEC_HANDOVER */
 
 #endif /* LINUX_KEXEC_HANDOVER_H */
diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c
index 368e23db0a17..c6ccc8e0705d 100644
--- a/kernel/kexec_handover.c
+++ b/kernel/kexec_handover.c
@@ -757,7 +757,7 @@ static int kho_out_update_debugfs_fdt(void)
 	return err;
 }
 
-static int kho_abort(void)
+static int __kho_abort(void)
 {
 	int err;
 	unsigned long order;
@@ -790,7 +790,34 @@ static int kho_abort(void)
 	return err;
 }
 
-static int kho_finalize(void)
+int kho_abort(void)
+{
+	int ret = 0;
+
+	if (!kho_enable)
+		return -EOPNOTSUPP;
+
+	mutex_lock(&kho_out.lock);
+
+	if (!kho_out.finalized) {
+		ret = -ENOENT;
+		goto unlock;
+	}
+
+	ret = __kho_abort();
+	if (ret)
+		goto unlock;
+
+	kho_out.finalized = false;
+	ret = kho_out_update_debugfs_fdt();
+
+unlock:
+	mutex_unlock(&kho_out.lock);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(kho_abort);
+
+static int __kho_finalize(void)
 {
 	int err = 0;
 	u64 *preserved_mem_map;
@@ -839,6 +866,33 @@ static int kho_finalize(void)
 	return err;
 }
 
+int kho_finalize(void)
+{
+	int ret = 0;
+
+	if (!kho_enable)
+		return -EOPNOTSUPP;
+
+	mutex_lock(&kho_out.lock);
+
+	if (kho_out.finalized) {
+		ret = -EEXIST;
+		goto unlock;
+	}
+
+	ret = __kho_finalize();
+	if (ret)
+		goto unlock;
+
+	kho_out.finalized = true;
+	ret = kho_out_update_debugfs_fdt();
+
+unlock:
+	mutex_unlock(&kho_out.lock);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(kho_finalize);
+
 static int kho_out_finalize_get(void *data, u64 *val)
 {
 	mutex_lock(&kho_out.lock);
-- 
2.50.0.727.gbf7dc18ff4-goog

From nobody Mon Oct 6 06:43:24 2025
Received: from mail-yw1-f170.google.com (mail-yw1-f170.google.com [209.85.128.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EE4EF2F85FF for ; Wed, 23 Jul 2025 14:47:05 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.170
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282029; cv=none; b=qm2dF8gZOtg29Q7kOQ/Fp3JMYthNEs7tMaUX7N3dlZ+JA+jhDgx+V5yABLd+XmQ4xhDJGZgXDjnH31VzmP/5kklBO9FlYhtauLa+X0ix6SsvEVloU33MznI3w01mMWIbZKlPzDMi1zc+7UARkjF5mFEGCU+Llbj4qKJFfycaib8=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282029; c=relaxed/simple; bh=QKiIrXzuD4XfVxFEvjmQqJZ6VbxVWuGjRJ+RtdblcSs=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version;
b=D4OXxMmxOfZp/LQB+MbzU5lp+FmhkSEDe2QMlWX2eXB+J280DA+Z2mcs2fnoofmkjHVUhzzz9TIn03D/LU2471vD89jCp5exp2IEX7UAfQXS5bfWljKzZsM+MZF3uKmUSNaUTdyLgE/joVDkQW3svAy9K/Idlow7bWjM3iBn3UI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=cLdUHu1/; arc=none smtp.client-ip=209.85.128.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="cLdUHu1/" Received: by mail-yw1-f170.google.com with SMTP id 00721157ae682-70e5e6ab7b8so58491587b3.1 for ; Wed, 23 Jul 2025 07:47:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282025; x=1753886825; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=wO2wIgP85BCTULTLfa+o1uX+1i01pv/IFr2OKVLnU4k=; b=cLdUHu1/eAuppDYWiBrTeyu/D3n6CkkiXBP5odzGT9NAaJuklHIYJcey7dnTfnTE9B gBrZE2ysZ9574YODASQUXIUkXdIE2g7ukqcMMh4FDsVN4VlWQ4Kkv87sy9bDyDSI3vMJ FGtESQOvRsBWAMpm0t3UrYgzhDWV+YwmWkwdE4v8jgLBzI28nJSyy7FaMItJbDyD1O3r 1F0zLWgoPe4csLFJ61le94aHfRwZaEvqccZIpraRioHBxDsnVpNUfJwPnyHHbv8XnbnG 32zdQ8VO2J675BhoYy8DZBBD40yS+fx1B4o99IGb2uB+sPSBs9ucTtvWYK3yZMkG6PK4 2yrA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282025; x=1753886825; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; 
bh=wO2wIgP85BCTULTLfa+o1uX+1i01pv/IFr2OKVLnU4k=; b=LuLvpeGAeZLBO5GJHTz9ByoXJrdwsIHjLS06li7cAnKuz28n5YXWeUEPUw5fKVejd7 q2btCG1XqjgqVRjxjEQ3G3Bet126/QyMfJ5PHynoafdmtTrflO6E890HonN+QcUmEWke tOcznPzqIc9nTM3f5X59PIsTFL0FackcgUvvI67VUVBII28dBtPdHrU0coJpNdvJuHdp oDsFpNNMZxMQId2j4/zQExwYeV7rc2Un6rleXwb7VRk6KimY3/X9GV+hyOsAbomSYwng gKHVGIjpFcPdlnVOjFyX6t83WIB7nHUvRbLuq4h7oMaX4hxWS8+gpE5/5SCRb+D449Us Z/9g== X-Forwarded-Encrypted: i=1; AJvYcCXE9WiPw4cmouxbF/cy52QWgTu/vQZVAkexPip5jm0zPVVawCBgUV9LuwIHnbi35473u1itxUJiIF3zy8I=@vger.kernel.org X-Gm-Message-State: AOJu0YxDLMv25L401XPVpgDm7hkXD+Dh5+UK9jOpAAb/Lf/GUh/Juow/ eVF7JnPwaomLezD2HkZnwAGbToNIYh6vfRlfdNmxgsFEhNfE9b252CppV68cb+tdp+Q= X-Gm-Gg: ASbGncssFwiNRlsClUkz9lcYZlh9gXrafX87ivw21Hi755gGmQlfxPrFGw9urIu3yZs PK15LcPm9c5cftcat1JgpX4eDTsfAcC663wIgbbljq0N3aPr/R4BXCwVOkvWDA2AmzaQDT9159o nMy3rn6DIWvC9uytuwdPqMt61OrTjCuMRqeIgTjFDoeHL3TZHW3UrNtAN8aLD6EJLaqUDTwBY6O tMF+XpSdfpEkUTuNFtbtfdvh7egvCC1LAeGrsIOQ2jhP5K9eJBPQiP4baGAENNxEyUFwXesPyei be+7onvCQAyMPiJCmosF+0YCKCFmLzlbCDf+VfFf3uVVdo6giOVR+M8iKVxBL8wf0ZlIapgrZ3f 2E6wFpmUlMBriue+SMgN/9ElXrw7658LEPIDy0yevd9MLJDjEpWtg2qJKT0T8+3IL/yPlWTJNgS 1NDI3kZK21KRb/NUEATVLYgkdj X-Google-Smtp-Source: AGHT+IE6rL13P8SmxsZ2hnJI/80dcvjlLdQGko+A8D2Jc2ZZD8DIngXktvGTpmE5ph+kb1Vpkot2Qg== X-Received: by 2002:a05:690c:6803:b0:718:17b1:3eb with SMTP id 00721157ae682-719b41b9f44mr43595217b3.8.1753282024457; Wed, 23 Jul 2025 07:47:04 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:03 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v2 05/32] kho: make debugfs interface optional Date: Wed, 23 Jul 2025 14:46:18 +0000 Message-ID: <20250723144649.1696299-6-pasha.tatashin@soleen.com> 
X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com> References: <20250723144649.1696299-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Currently, KHO is controlled via the debugfs interface. Once LUO is introduced, it can control KHO, and the debugfs interface becomes optional. Add a separate config option, CONFIG_KEXEC_HANDOVER_DEBUG, that enables the debugfs interface and allows inspecting the tree. Move all debugfs-related code to a new file to keep the .c files clear of ifdefs. Signed-off-by: Pasha Tatashin Co-developed-by: Mike Rapoport (Microsoft) Signed-off-by: Mike Rapoport (Microsoft) --- MAINTAINERS | 3 +- kernel/Kconfig.kexec | 10 ++ kernel/Makefile | 1 + kernel/kexec_handover.c | 278 ++++--------------------------- kernel/kexec_handover_debug.c | 218 ++++++++++++++++++++++++ kernel/kexec_handover_internal.h | 44 +++++ 6 files changed, 311 insertions(+), 243 deletions(-) create mode 100644 kernel/kexec_handover_debug.c create mode 100644 kernel/kexec_handover_internal.h diff --git a/MAINTAINERS b/MAINTAINERS index 10850512c118..00de7c78de86 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -13349,13 +13349,14 @@ KEXEC HANDOVER (KHO) M: Alexander Graf M: Mike Rapoport M: Changyuan Lyu +M: Pasha Tatashin L: kexec@lists.infradead.org L: linux-mm@kvack.org S: Maintained F: Documentation/admin-guide/mm/kho.rst F: Documentation/core-api/kho/* F: include/linux/kexec_handover.h -F: kernel/kexec_handover.c +F: kernel/kexec_handover* =20 KEYS-ENCRYPTED M: Mimi Zohar diff --git a/kernel/Kconfig.kexec b/kernel/Kconfig.kexec index 1224dd937df0..9968d3d4dd17 100644 --- a/kernel/Kconfig.kexec +++ b/kernel/Kconfig.kexec @@ -109,6 +109,16 @@ config KEXEC_HANDOVER to keep data or state alive across the kexec. 
For this to work, both source and target kernels need to have this option enabled. =20 +config KEXEC_HANDOVER_DEBUG + bool "kexec handover debug interface" + depends on KEXEC_HANDOVER + depends on DEBUG_FS + help + Allow to control kexec handover device tree via debugfs + interface, i.e. finalize the state or aborting the finalization. + Also, enables inspecting the KHO fdt trees with the debugfs binary + blobs. + config CRASH_DUMP bool "kernel crash dumps" default ARCH_DEFAULT_CRASH_DUMP diff --git a/kernel/Makefile b/kernel/Makefile index 32e80dd626af..e4b4afa86a70 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -82,6 +82,7 @@ obj-$(CONFIG_KEXEC) +=3D kexec.o obj-$(CONFIG_KEXEC_FILE) +=3D kexec_file.o obj-$(CONFIG_KEXEC_ELF) +=3D kexec_elf.o obj-$(CONFIG_KEXEC_HANDOVER) +=3D kexec_handover.o +obj-$(CONFIG_KEXEC_HANDOVER_DEBUG) +=3D kexec_handover_debug.o obj-$(CONFIG_BACKTRACE_SELF_TEST) +=3D backtracetest.o obj-$(CONFIG_COMPAT) +=3D compat.o obj-$(CONFIG_CGROUPS) +=3D cgroup/ diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c index c6ccc8e0705d..32fdc388752b 100644 --- a/kernel/kexec_handover.c +++ b/kernel/kexec_handover.c @@ -10,7 +10,6 @@ =20 #include #include -#include #include #include #include @@ -27,6 +26,7 @@ */ #include "../mm/internal.h" #include "kexec_internal.h" +#include "kexec_handover_internal.h" =20 #define KHO_FDT_COMPATIBLE "kho-v1" #define PROP_PRESERVED_MEMORY_MAP "preserved-memory-map" @@ -84,8 +84,6 @@ struct khoser_mem_chunk; =20 struct kho_serialization { struct page *fdt; - struct list_head fdt_list; - struct dentry *sub_fdt_dir; struct kho_mem_track track; /* First chunk of serialized preserved memory map */ struct khoser_mem_chunk *preserved_mem_map; @@ -381,8 +379,8 @@ static void __init kho_mem_deserialize(const void *fdt) * area for early allocations that happen before page allocator is * initialized. 
*/ -static struct kho_scratch *kho_scratch; -static unsigned int kho_scratch_cnt; +struct kho_scratch *kho_scratch; +unsigned int kho_scratch_cnt; =20 /* * The scratch areas are scaled by default as percent of memory allocated = from @@ -569,36 +567,24 @@ static void __init kho_reserve_scratch(void) kho_enable =3D false; } =20 -struct fdt_debugfs { - struct list_head list; - struct debugfs_blob_wrapper wrapper; - struct dentry *file; +struct kho_out { + struct blocking_notifier_head chain_head; + struct mutex lock; /* protects KHO FDT finalization */ + struct kho_serialization ser; + bool finalized; + struct kho_debugfs dbg; }; =20 -static int kho_debugfs_fdt_add(struct list_head *list, struct dentry *dir, - const char *name, const void *fdt) -{ - struct fdt_debugfs *f; - struct dentry *file; - - f =3D kmalloc(sizeof(*f), GFP_KERNEL); - if (!f) - return -ENOMEM; - - f->wrapper.data =3D (void *)fdt; - f->wrapper.size =3D fdt_totalsize(fdt); - - file =3D debugfs_create_blob(name, 0400, dir, &f->wrapper); - if (IS_ERR(file)) { - kfree(f); - return PTR_ERR(file); - } - - f->file =3D file; - list_add(&f->list, list); - - return 0; -} +static struct kho_out kho_out =3D { + .chain_head =3D BLOCKING_NOTIFIER_INIT(kho_out.chain_head), + .lock =3D __MUTEX_INITIALIZER(kho_out.lock), + .ser =3D { + .track =3D { + .orders =3D XARRAY_INIT(kho_out.ser.track.orders, 0), + }, + }, + .finalized =3D false, +}; =20 /** * kho_add_subtree - record the physical address of a sub FDT in KHO root = tree. @@ -611,7 +597,8 @@ static int kho_debugfs_fdt_add(struct list_head *list, = struct dentry *dir, * by KHO for the new kernel to retrieve it after kexec. * * A debugfs blob entry is also created at - * ``/sys/kernel/debug/kho/out/sub_fdts/@name``. 
+ * ``/sys/kernel/debug/kho/out/sub_fdts/@name`` when kernel is configured = with + * CONFIG_KEXEC_HANDOVER_DEBUG * * Return: 0 on success, error code on failure */ @@ -628,33 +615,10 @@ int kho_add_subtree(struct kho_serialization *ser, co= nst char *name, void *fdt) if (err) return err; =20 - return kho_debugfs_fdt_add(&ser->fdt_list, ser->sub_fdt_dir, name, fdt); + return kho_debugfs_fdt_add(&kho_out.dbg, name, fdt, false); } EXPORT_SYMBOL_GPL(kho_add_subtree); =20 -struct kho_out { - struct blocking_notifier_head chain_head; - - struct dentry *dir; - - struct mutex lock; /* protects KHO FDT finalization */ - - struct kho_serialization ser; - bool finalized; -}; - -static struct kho_out kho_out =3D { - .chain_head =3D BLOCKING_NOTIFIER_INIT(kho_out.chain_head), - .lock =3D __MUTEX_INITIALIZER(kho_out.lock), - .ser =3D { - .fdt_list =3D LIST_HEAD_INIT(kho_out.ser.fdt_list), - .track =3D { - .orders =3D XARRAY_INIT(kho_out.ser.track.orders, 0), - }, - }, - .finalized =3D false, -}; - int register_kho_notifier(struct notifier_block *nb) { return blocking_notifier_chain_register(&kho_out.chain_head, nb); @@ -734,29 +698,6 @@ int kho_preserve_phys(phys_addr_t phys, size_t size) } EXPORT_SYMBOL_GPL(kho_preserve_phys); =20 -/* Handling for debug/kho/out */ - -static struct dentry *debugfs_root; - -static int kho_out_update_debugfs_fdt(void) -{ - int err =3D 0; - struct fdt_debugfs *ff, *tmp; - - if (kho_out.finalized) { - err =3D kho_debugfs_fdt_add(&kho_out.ser.fdt_list, kho_out.dir, - "fdt", page_to_virt(kho_out.ser.fdt)); - } else { - list_for_each_entry_safe(ff, tmp, &kho_out.ser.fdt_list, list) { - debugfs_remove(ff->file); - list_del(&ff->list); - kfree(ff); - } - } - - return err; -} - static int __kho_abort(void) { int err; @@ -809,7 +750,8 @@ int kho_abort(void) goto unlock; =20 kho_out.finalized =3D false; - ret =3D kho_out_update_debugfs_fdt(); + + kho_debugfs_cleanup(&kho_out.dbg); =20 unlock: mutex_unlock(&kho_out.lock); @@ -860,7 +802,7 @@ static int 
__kho_finalize(void) abort: if (err) { pr_err("Failed to convert KHO state tree: %d\n", err); - kho_abort(); + __kho_abort(); } =20 return err; @@ -885,7 +827,8 @@ int kho_finalize(void) goto unlock; =20 kho_out.finalized =3D true; - ret =3D kho_out_update_debugfs_fdt(); + ret =3D kho_debugfs_fdt_add(&kho_out.dbg, "fdt", + page_to_virt(kho_out.ser.fdt), true); =20 unlock: mutex_unlock(&kho_out.lock); @@ -893,112 +836,24 @@ int kho_finalize(void) } EXPORT_SYMBOL_GPL(kho_finalize); =20 -static int kho_out_finalize_get(void *data, u64 *val) +bool kho_finalized(void) { - mutex_lock(&kho_out.lock); - *val =3D kho_out.finalized; - mutex_unlock(&kho_out.lock); - - return 0; -} - -static int kho_out_finalize_set(void *data, u64 _val) -{ - int ret =3D 0; - bool val =3D !!_val; + bool ret; =20 mutex_lock(&kho_out.lock); - - if (val =3D=3D kho_out.finalized) { - if (kho_out.finalized) - ret =3D -EEXIST; - else - ret =3D -ENOENT; - goto unlock; - } - - if (val) - ret =3D kho_finalize(); - else - ret =3D kho_abort(); - - if (ret) - goto unlock; - - kho_out.finalized =3D val; - ret =3D kho_out_update_debugfs_fdt(); - -unlock: + ret =3D kho_out.finalized; mutex_unlock(&kho_out.lock); - return ret; -} - -DEFINE_DEBUGFS_ATTRIBUTE(fops_kho_out_finalize, kho_out_finalize_get, - kho_out_finalize_set, "%llu\n"); - -static int scratch_phys_show(struct seq_file *m, void *v) -{ - for (int i =3D 0; i < kho_scratch_cnt; i++) - seq_printf(m, "0x%llx\n", kho_scratch[i].addr); - - return 0; -} -DEFINE_SHOW_ATTRIBUTE(scratch_phys); - -static int scratch_len_show(struct seq_file *m, void *v) -{ - for (int i =3D 0; i < kho_scratch_cnt; i++) - seq_printf(m, "0x%llx\n", kho_scratch[i].size); - - return 0; -} -DEFINE_SHOW_ATTRIBUTE(scratch_len); - -static __init int kho_out_debugfs_init(void) -{ - struct dentry *dir, *f, *sub_fdt_dir; - - dir =3D debugfs_create_dir("out", debugfs_root); - if (IS_ERR(dir)) - return -ENOMEM; - - sub_fdt_dir =3D debugfs_create_dir("sub_fdts", dir); - if 
(IS_ERR(sub_fdt_dir)) - goto err_rmdir; =20 - f =3D debugfs_create_file("scratch_phys", 0400, dir, NULL, - &scratch_phys_fops); - if (IS_ERR(f)) - goto err_rmdir; - - f =3D debugfs_create_file("scratch_len", 0400, dir, NULL, - &scratch_len_fops); - if (IS_ERR(f)) - goto err_rmdir; - - f =3D debugfs_create_file("finalize", 0600, dir, NULL, - &fops_kho_out_finalize); - if (IS_ERR(f)) - goto err_rmdir; - - kho_out.dir =3D dir; - kho_out.ser.sub_fdt_dir =3D sub_fdt_dir; - return 0; - -err_rmdir: - debugfs_remove_recursive(dir); - return -ENOENT; + return ret; } =20 struct kho_in { - struct dentry *dir; phys_addr_t fdt_phys; phys_addr_t scratch_phys; - struct list_head fdt_list; + struct kho_debugfs dbg; }; =20 static struct kho_in kho_in =3D { - .fdt_list =3D LIST_HEAD_INIT(kho_in.fdt_list), }; =20 static const void *kho_get_fdt(void) @@ -1042,56 +897,6 @@ int kho_retrieve_subtree(const char *name, phys_addr_= t *phys) } EXPORT_SYMBOL_GPL(kho_retrieve_subtree); =20 -/* Handling for debugfs/kho/in */ - -static __init int kho_in_debugfs_init(const void *fdt) -{ - struct dentry *sub_fdt_dir; - int err, child; - - kho_in.dir =3D debugfs_create_dir("in", debugfs_root); - if (IS_ERR(kho_in.dir)) - return PTR_ERR(kho_in.dir); - - sub_fdt_dir =3D debugfs_create_dir("sub_fdts", kho_in.dir); - if (IS_ERR(sub_fdt_dir)) { - err =3D PTR_ERR(sub_fdt_dir); - goto err_rmdir; - } - - err =3D kho_debugfs_fdt_add(&kho_in.fdt_list, kho_in.dir, "fdt", fdt); - if (err) - goto err_rmdir; - - fdt_for_each_subnode(child, fdt, 0) { - int len =3D 0; - const char *name =3D fdt_get_name(fdt, child, NULL); - const u64 *fdt_phys; - - fdt_phys =3D fdt_getprop(fdt, child, "fdt", &len); - if (!fdt_phys) - continue; - if (len !=3D sizeof(*fdt_phys)) { - pr_warn("node `%s`'s prop `fdt` has invalid length: %d\n", - name, len); - continue; - } - err =3D kho_debugfs_fdt_add(&kho_in.fdt_list, sub_fdt_dir, name, - phys_to_virt(*fdt_phys)); - if (err) { - pr_warn("failed to add fdt `%s` to debugfs: %d\n", 
name, - err); - continue; - } - } - - return 0; - -err_rmdir: - debugfs_remove_recursive(kho_in.dir); - return err; -} - static __init int kho_init(void) { int err =3D 0; @@ -1106,27 +911,16 @@ static __init int kho_init(void) goto err_free_scratch; } =20 - debugfs_root =3D debugfs_create_dir("kho", NULL); - if (IS_ERR(debugfs_root)) { - err =3D -ENOENT; + err =3D kho_debugfs_init(); + if (err) goto err_free_fdt; - } =20 - err =3D kho_out_debugfs_init(); + err =3D kho_out_debugfs_init(&kho_out.dbg); if (err) goto err_free_fdt; =20 if (fdt) { - err =3D kho_in_debugfs_init(fdt); - /* - * Failure to create /sys/kernel/debug/kho/in does not prevent - * reviving state from KHO and setting up KHO for the next - * kexec. - */ - if (err) - pr_err("failed exposing handover FDT in debugfs: %d\n", - err); - + kho_in_debugfs_init(&kho_in.dbg, fdt); return 0; } =20 diff --git a/kernel/kexec_handover_debug.c b/kernel/kexec_handover_debug.c new file mode 100644 index 000000000000..b88d138a97be --- /dev/null +++ b/kernel/kexec_handover_debug.c @@ -0,0 +1,218 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * kexec_handover.c - kexec handover metadata processing + * Copyright (C) 2023 Alexander Graf + * Copyright (C) 2025 Microsoft Corporation, Mike Rapoport + * Copyright (C) 2025 Google LLC, Changyuan Lyu + * Copyright (C) 2025 Google LLC, Pasha Tatashin + */ + +#define pr_fmt(fmt) "KHO: " fmt + +#include +#include +#include +#include +#include "kexec_handover_internal.h" + +static struct dentry *debugfs_root; + +struct fdt_debugfs { + struct list_head list; + struct debugfs_blob_wrapper wrapper; + struct dentry *file; +}; + +static int __kho_debugfs_fdt_add(struct list_head *list, struct dentry *di= r, + const char *name, const void *fdt) +{ + struct fdt_debugfs *f; + struct dentry *file; + + f =3D kmalloc(sizeof(*f), GFP_KERNEL); + if (!f) + return -ENOMEM; + + f->wrapper.data =3D (void *)fdt; + f->wrapper.size =3D fdt_totalsize(fdt); + + file =3D debugfs_create_blob(name, 
0400, dir, &f->wrapper); + if (IS_ERR(file)) { + kfree(f); + return PTR_ERR(file); + } + + f->file =3D file; + list_add(&f->list, list); + + return 0; +} + +int kho_debugfs_fdt_add(struct kho_debugfs *dbg, const char *name, + const void *fdt, bool root) +{ + struct dentry *dir; + + if (root) + dir =3D dbg->dir; + else + dir =3D dbg->sub_fdt_dir; + + return __kho_debugfs_fdt_add(&dbg->fdt_list, dir, name, fdt); +} + +void kho_debugfs_cleanup(struct kho_debugfs *dbg) +{ + struct fdt_debugfs *ff, *tmp; + + list_for_each_entry_safe(ff, tmp, &dbg->fdt_list, list) { + debugfs_remove(ff->file); + list_del(&ff->list); + kfree(ff); + } +} + +static int kho_out_finalize_get(void *data, u64 *val) +{ + *val =3D kho_finalized(); + + return 0; +} + +static int kho_out_finalize_set(void *data, u64 _val) +{ + bool val =3D !!_val; + + if (val) + return kho_finalize(); + + return kho_abort(); +} + +DEFINE_DEBUGFS_ATTRIBUTE(kho_out_finalize_fops, kho_out_finalize_get, + kho_out_finalize_set, "%llu\n"); + +static int scratch_phys_show(struct seq_file *m, void *v) +{ + for (int i =3D 0; i < kho_scratch_cnt; i++) + seq_printf(m, "0x%llx\n", kho_scratch[i].addr); + + return 0; +} +DEFINE_SHOW_ATTRIBUTE(scratch_phys); + +static int scratch_len_show(struct seq_file *m, void *v) +{ + for (int i =3D 0; i < kho_scratch_cnt; i++) + seq_printf(m, "0x%llx\n", kho_scratch[i].size); + + return 0; +} +DEFINE_SHOW_ATTRIBUTE(scratch_len); + +__init void kho_in_debugfs_init(struct kho_debugfs *dbg, const void *fdt) +{ + struct dentry *dir, *sub_fdt_dir; + int err, child; + + INIT_LIST_HEAD(&dbg->fdt_list); + + dir =3D debugfs_create_dir("in", debugfs_root); + if (IS_ERR(dir)) { + err =3D PTR_ERR(dir); + goto err_out; + } + + sub_fdt_dir =3D debugfs_create_dir("sub_fdts", dir); + if (IS_ERR(sub_fdt_dir)) { + err =3D PTR_ERR(sub_fdt_dir); + goto err_rmdir; + } + + err =3D __kho_debugfs_fdt_add(&dbg->fdt_list, dir, "fdt", fdt); + if (err) + goto err_rmdir; + + fdt_for_each_subnode(child, fdt, 0) { + int 
len =3D 0; + const char *name =3D fdt_get_name(fdt, child, NULL); + const u64 *fdt_phys; + + fdt_phys =3D fdt_getprop(fdt, child, "fdt", &len); + if (!fdt_phys) + continue; + if (len !=3D sizeof(*fdt_phys)) { + pr_warn("node %s prop fdt has invalid length: %d\n", + name, len); + continue; + } + err =3D __kho_debugfs_fdt_add(&dbg->fdt_list, sub_fdt_dir, name, + phys_to_virt(*fdt_phys)); + if (err) { + pr_warn("failed to add fdt %s to debugfs: %d\n", name, + err); + continue; + } + } + + dbg->dir =3D dir; + dbg->sub_fdt_dir =3D sub_fdt_dir; + + return; +err_rmdir: + debugfs_remove_recursive(dir); +err_out: + /* + * Failure to create /sys/kernel/debug/kho/in does not prevent + * reviving state from KHO and setting up KHO for the next + * kexec. + */ + if (err) + pr_err("failed exposing handover FDT in debugfs: %d\n", err); +} + +__init int kho_out_debugfs_init(struct kho_debugfs *dbg) +{ + struct dentry *dir, *f, *sub_fdt_dir; + + INIT_LIST_HEAD(&dbg->fdt_list); + + dir =3D debugfs_create_dir("out", debugfs_root); + if (IS_ERR(dir)) + return -ENOMEM; + + sub_fdt_dir =3D debugfs_create_dir("sub_fdts", dir); + if (IS_ERR(sub_fdt_dir)) + goto err_rmdir; + + f =3D debugfs_create_file("scratch_phys", 0400, dir, NULL, + &scratch_phys_fops); + if (IS_ERR(f)) + goto err_rmdir; + + f =3D debugfs_create_file("scratch_len", 0400, dir, NULL, + &scratch_len_fops); + if (IS_ERR(f)) + goto err_rmdir; + + f =3D debugfs_create_file("finalize", 0600, dir, NULL, + &kho_out_finalize_fops); + if (IS_ERR(f)) + goto err_rmdir; + + dbg->dir =3D dir; + dbg->sub_fdt_dir =3D sub_fdt_dir; + return 0; + +err_rmdir: + debugfs_remove_recursive(dir); + return -ENOENT; +} + +__init int kho_debugfs_init(void) +{ + debugfs_root =3D debugfs_create_dir("kho", NULL); + if (IS_ERR(debugfs_root)) + return -ENOENT; + return 0; +} diff --git a/kernel/kexec_handover_internal.h b/kernel/kexec_handover_inter= nal.h new file mode 100644 index 000000000000..41e9616fcdd0 --- /dev/null +++ 
b/kernel/kexec_handover_internal.h @@ -0,0 +1,44 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef LINUX_KEXEC_HANDOVER_INTERNAL_H +#define LINUX_KEXEC_HANDOVER_INTERNAL_H + +#include +#include +#include + +#ifdef CONFIG_KEXEC_HANDOVER_DEBUG +#include + +struct kho_debugfs { + struct dentry *dir; + struct dentry *sub_fdt_dir; + struct list_head fdt_list; +}; + +#else +struct kho_debugfs {}; +#endif + +extern struct kho_scratch *kho_scratch; +extern unsigned int kho_scratch_cnt; + +bool kho_finalized(void); + +#ifdef CONFIG_KEXEC_HANDOVER_DEBUG +int kho_debugfs_init(void); +void kho_in_debugfs_init(struct kho_debugfs *dbg, const void *fdt); +int kho_out_debugfs_init(struct kho_debugfs *dbg); +int kho_debugfs_fdt_add(struct kho_debugfs *dbg, const char *name, + const void *fdt, bool root); +void kho_debugfs_cleanup(struct kho_debugfs *dbg); +#else +static inline int kho_debugfs_init(void) { return 0; } +static inline void kho_in_debugfs_init(struct kho_debugfs *dbg, + const void *fdt) { } +static inline int kho_out_debugfs_init(struct kho_debugfs *dbg) { return 0= ; } +static inline int kho_debugfs_fdt_add(struct kho_debugfs *dbg, const char = *name, + const void *fdt, bool root) { return 0; } +static inline void kho_debugfs_cleanup(struct kho_debugfs *dbg) {} +#endif /* CONFIG_KEXEC_HANDOVER_DEBUG */ + +#endif /* LINUX_KEXEC_HANDOVER_INTERNAL_H */ --=20 2.50.0.727.gbf7dc18ff4-goog From nobody Mon Oct 6 06:43:24 2025 Received: from mail-yb1-f170.google.com (mail-yb1-f170.google.com [209.85.219.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C2D982F5301 for ; Wed, 23 Jul 2025 14:47:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.170 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282031; cv=none; 
b=YdUXVvzhTMsRoDxmVgvoTgHFVv3kOua+dRquLGJlLekRo3BIJ1vUtlHRGtKH7h6ZJYurlXGdMWmn7ILzLMFy0poqcsADZlpm6yRgLbUGUGm/3F4btKynloATG19MQyGlfxUfHmRoQaHqvdES+5msy4qq1VNbt41IONgLn1MYzyk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282031; c=relaxed/simple; bh=CFuLiKLJh8N2wBbybcS5k5e7t7gAErTPWCFvcc5WXGU=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=aXLbEB98MyW5GNcPKhxxJB4Qt9fcllKjjDVXD7OyWCK4prR5iVSMVDSdA4t0VF/tkReqb1hyd883TxeZwLk7duyChIHFx6Te0gLPKA6dwfY+/rL1M1p2f0ZBSkcY5meHlAAco2kWALGm534ObEcT0E5HniwqSQx6TRyUqBnJ72U= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=xYKX0kz+; arc=none smtp.client-ip=209.85.219.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="xYKX0kz+" Received: by mail-yb1-f170.google.com with SMTP id 3f1490d57ef6-e8bc1258b2eso5261474276.2 for ; Wed, 23 Jul 2025 07:47:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282026; x=1753886826; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=P16dMjG+eBIby4m6CV13FKaPKlPVREZrX2+cF3fbAz8=; b=xYKX0kz+/axYGEMwZyZ51u1LhgFfdDGGPsY7Jbjq/MHx3E+4jSFgMDzNrpPwsV+GZQ HkIcaOrr6gQ/L8G8e8Wq7zlQ8CxvN9JtNt4BnsSfqtfWowG+eRb0VKQUXcTn/Jz1CfI+ z7oZVVtBsD6BNCudonB9ZlOpaPhqzP4wBBs4gN/8tm701XCI60sxbEc56B1rap1ch8z0 
B99EXG6mDpxmEDbSETI35aCTw88eaJerEFlumYflSMjmgu7UuOKtQQG/bwpXxKF9U2C9 RxrMuI2jySWRBy5MwMShDH0c/TGmhyvC9sgcKJZxJWv0WpDFb0NyW3DoTRs5u0hzKIH7 mBeQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282026; x=1753886826; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=P16dMjG+eBIby4m6CV13FKaPKlPVREZrX2+cF3fbAz8=; b=Yit71SoXj5iewukeUB75c75Df/PHmVE3haF4s9YAKePrH0bz4/wOZn7k0yl2Wm+Uig xnLyJQIeDkCYggo4jCCz4pIKfkq6KKZtYMOrExkxrHNG1V6dfbDbYN0l1Zy9mxzbOXGC 100VE3enjwifWCRV9gIHVwC5rIE8St+nG/sfxXbZ/dXKxQIr59fL5czVdaTm1Hnu959m EhqmM4xqsoKSbvAFFsEJBON7ZrqeIc5NG2KixBIaMzZ5qdMoskax4o2GDodAWgTzlj1T FYD5PcKu/EpeErMshUpHTl9rMF16nJYIiVCrYrBvTJjHtXNLR/0IubeXKV+pW9GvIltY WiQA== X-Forwarded-Encrypted: i=1; AJvYcCVhInl6d8pP41r1FV70tUyUg6yYrXgl4j4ZNKVDLTb1GW39CfXHDrZcJSYKBleEbP0/Xg2Ea7PCBEhgooM=@vger.kernel.org X-Gm-Message-State: AOJu0Yx474HBSjQ3BpTfiqFgFdeuCRPtgpjcCkIE6YLBvlOLV92A0QCN Ml1P1AN+iFOXAK7XO6HoC+o0xTwlIqSQiKVkbtML8j3yUqgMmufaLy+iUyimgWa4ExA= X-Gm-Gg: ASbGnctLKf0AnQ0r5J0BQiEsd87YYuyMp0mh30MuLXo7bf4wKRV++bEiZDOyRBFpokm oWM5Fdp7y+pMjVCTqBnyv63pMmJDI7UxFuykI3SOMj23riEmIJ0Rw5xCkjdd4GzTDceNplzxpHX vwjJikF0OqfPJz2J5lVe4YiLhSuRCK4Nxy4sWa6TYFevXuPzI5WXIC1BaeLuebxR3RRx6Kvuapl ocjltWsHS4Zl9EwV8RGphJ1qYPX6K8pIpuE3JyxQoTWHBBut+XREig9P5OcY2Q0HFELmYCdMv0q dAf11s11EMFd0Zzy/Xl0YF086gy9r6nhqNCWzR2cDhtf4VhenussptptHNRMTDU3XNVJc9Fo4GP /5DmFkiPRiKlsSn3FjV7q4GztLLS144rBKPh6W9JRchcmTeE2PsAO32A5qFvYj7RzbvFNp5P1lr IYlRSTNq2c5+byTQ== X-Google-Smtp-Source: AGHT+IHWatUs7Hs0KKZ14v/cggc4Ed333ZLRxX+YU0gH76DWBOv1E+jPUtUb0U/ZKTGi/fUUCRdxmw== X-Received: by 2002:a05:690c:3809:b0:719:5664:87fd with SMTP id 00721157ae682-719b4256275mr41785577b3.37.1753282026373; Wed, 23 Jul 2025 07:47:06 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:05 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v2 06/32] kho: drop notifiers Date: Wed, 23 Jul 2025 14:46:19 +0000 Message-ID: <20250723144649.1696299-7-pasha.tatashin@soleen.com> X-Mailer: 
git-send-email 2.50.0.727.gbf7dc18ff4-goog In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com> References: <20250723144649.1696299-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: "Mike Rapoport (Microsoft)" The KHO framework uses a notifier chain as the mechanism for clients to participate in the finalization process. While this works for a single, central state machine, it is too restrictive for kernel-internal components like pstore/reserve_mem or IMA. These components need a simpler, direct way to register their state for preservation (e.g., during their initcall) without being part of a complex, shutdown-time notifier sequence. The notifier model forces all participants into a single finalization flow and makes direct preservation from an arbitrary context difficult. This patch refactors the client participation model by removing the notifier chain and introducing a direct API for managing FDT subtrees. The core kho_finalize() and kho_abort() state machine remains, but clients now register their data with KHO beforehand. 
Signed-off-by: Mike Rapoport (Microsoft) Signed-off-by: Pasha Tatashin --- include/linux/kexec_handover.h | 28 +---- kernel/kexec_handover.c | 177 +++++++++++++++++-------------- kernel/kexec_handover_debug.c | 17 +-- kernel/kexec_handover_internal.h | 5 +- mm/memblock.c | 56 ++-------- 5 files changed, 124 insertions(+), 159 deletions(-) diff --git a/include/linux/kexec_handover.h b/include/linux/kexec_handover.h index f98565def593..cabdff5f50a2 100644 --- a/include/linux/kexec_handover.h +++ b/include/linux/kexec_handover.h @@ -10,14 +10,7 @@ struct kho_scratch { phys_addr_t size; }; =20 -/* KHO Notifier index */ -enum kho_event { - KEXEC_KHO_FINALIZE =3D 0, - KEXEC_KHO_ABORT =3D 1, -}; - struct folio; -struct notifier_block; =20 #define DECLARE_KHOSER_PTR(name, type) \ union { \ @@ -36,20 +29,16 @@ struct notifier_block; (typeof((s).ptr))((s).phys ? phys_to_virt((s).phys) : NULL); \ }) =20 -struct kho_serialization; - #ifdef CONFIG_KEXEC_HANDOVER bool kho_is_enabled(void); =20 int kho_preserve_folio(struct folio *folio); int kho_preserve_phys(phys_addr_t phys, size_t size); struct folio *kho_restore_folio(phys_addr_t phys); -int kho_add_subtree(struct kho_serialization *ser, const char *name, void = *fdt); +int kho_add_subtree(const char *name, void *fdt); +void kho_remove_subtree(void *fdt); int kho_retrieve_subtree(const char *name, phys_addr_t *phys); =20 -int register_kho_notifier(struct notifier_block *nb); -int unregister_kho_notifier(struct notifier_block *nb); - void kho_memory_init(void); =20 void kho_populate(phys_addr_t fdt_phys, u64 fdt_len, phys_addr_t scratch_p= hys, @@ -79,23 +68,16 @@ static inline struct folio *kho_restore_folio(phys_addr= _t phys) return NULL; } =20 -static inline int kho_add_subtree(struct kho_serialization *ser, - const char *name, void *fdt) +static inline int kho_add_subtree(const char *name, void *fdt) { return -EOPNOTSUPP; } =20 -static inline int kho_retrieve_subtree(const char *name, phys_addr_t *phys) +static inline 
void kho_remove_subtree(void *fdt) { - return -EOPNOTSUPP; } =20 -static inline int register_kho_notifier(struct notifier_block *nb) -{ - return -EOPNOTSUPP; -} - -static inline int unregister_kho_notifier(struct notifier_block *nb) +static inline int kho_retrieve_subtree(const char *name, phys_addr_t *phys) { return -EOPNOTSUPP; } diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c index 32fdc388752b..30d673f7f68a 100644 --- a/kernel/kexec_handover.c +++ b/kernel/kexec_handover.c @@ -15,7 +15,6 @@ #include #include #include -#include #include =20 #include @@ -82,11 +81,35 @@ struct kho_mem_track { =20 struct khoser_mem_chunk; =20 -struct kho_serialization { - struct page *fdt; +struct kho_sub_fdt { + struct list_head l; + const char *name; + void *fdt; +}; + +struct kho_out { + void *fdt; + bool finalized; + struct mutex lock; /* protects KHO FDT finalization */ + + struct list_head sub_fdts; + struct mutex fdts_lock; + struct kho_mem_track track; /* First chunk of serialized preserved memory map */ struct khoser_mem_chunk *preserved_mem_map; + + struct kho_debugfs dbg; +}; + +static struct kho_out kho_out =3D { + .lock =3D __MUTEX_INITIALIZER(kho_out.lock), + .track =3D { + .orders =3D XARRAY_INIT(kho_out.track.orders, 0), + }, + .sub_fdts =3D LIST_HEAD_INIT(kho_out.sub_fdts), + .fdts_lock =3D __MUTEX_INITIALIZER(kho_out.fdts_lock), + .finalized =3D false, }; =20 static void *xa_load_or_alloc(struct xarray *xa, unsigned long index, size= _t sz) @@ -285,14 +308,14 @@ static void kho_mem_ser_free(struct khoser_mem_chunk = *first_chunk) } } =20 -static int kho_mem_serialize(struct kho_serialization *ser) +static int kho_mem_serialize(struct kho_out *kho_out) { struct khoser_mem_chunk *first_chunk =3D NULL; struct khoser_mem_chunk *chunk =3D NULL; struct kho_mem_phys *physxa; unsigned long order; =20 - xa_for_each(&ser->track.orders, order, physxa) { + xa_for_each(&kho_out->track.orders, order, physxa) { struct kho_mem_phys_bits *bits; unsigned long phys; 
=20 @@ -320,7 +343,7 @@ static int kho_mem_serialize(struct kho_serialization *= ser) } } =20 - ser->preserved_mem_map =3D first_chunk; + kho_out->preserved_mem_map =3D first_chunk; =20 return 0; =20 @@ -567,28 +590,8 @@ static void __init kho_reserve_scratch(void) kho_enable =3D false; } =20 -struct kho_out { - struct blocking_notifier_head chain_head; - struct mutex lock; /* protects KHO FDT finalization */ - struct kho_serialization ser; - bool finalized; - struct kho_debugfs dbg; -}; - -static struct kho_out kho_out =3D { - .chain_head =3D BLOCKING_NOTIFIER_INIT(kho_out.chain_head), - .lock =3D __MUTEX_INITIALIZER(kho_out.lock), - .ser =3D { - .track =3D { - .orders =3D XARRAY_INIT(kho_out.ser.track.orders, 0), - }, - }, - .finalized =3D false, -}; - /** * kho_add_subtree - record the physical address of a sub FDT in KHO root = tree. - * @ser: serialization control object passed by KHO notifiers. * @name: name of the sub tree. * @fdt: the sub tree blob. * @@ -602,34 +605,45 @@ static struct kho_out kho_out =3D { * * Return: 0 on success, error code on failure */ -int kho_add_subtree(struct kho_serialization *ser, const char *name, void = *fdt) +int kho_add_subtree(const char *name, void *fdt) { - int err =3D 0; - u64 phys =3D (u64)virt_to_phys(fdt); - void *root =3D page_to_virt(ser->fdt); + struct kho_sub_fdt *sub_fdt; + int err; =20 - err |=3D fdt_begin_node(root, name); - err |=3D fdt_property(root, PROP_SUB_FDT, &phys, sizeof(phys)); - err |=3D fdt_end_node(root); + sub_fdt =3D kmalloc(sizeof(*sub_fdt), GFP_KERNEL); + if (!sub_fdt) + return -ENOMEM; =20 - if (err) - return err; + INIT_LIST_HEAD(&sub_fdt->l); + sub_fdt->name =3D name; + sub_fdt->fdt =3D fdt; + + mutex_lock(&kho_out.fdts_lock); + list_add_tail(&sub_fdt->l, &kho_out.sub_fdts); + err =3D kho_debugfs_fdt_add(&kho_out.dbg, name, fdt, false); + mutex_unlock(&kho_out.fdts_lock); =20 - return kho_debugfs_fdt_add(&kho_out.dbg, name, fdt, false); + return err; } EXPORT_SYMBOL_GPL(kho_add_subtree); =20 
-int register_kho_notifier(struct notifier_block *nb) +void kho_remove_subtree(void *fdt) { - return blocking_notifier_chain_register(&kho_out.chain_head, nb); -} -EXPORT_SYMBOL_GPL(register_kho_notifier); + struct kho_sub_fdt *sub_fdt; + + mutex_lock(&kho_out.fdts_lock); + list_for_each_entry(sub_fdt, &kho_out.sub_fdts, l) { + if (sub_fdt->fdt =3D=3D fdt) { + list_del(&sub_fdt->l); + kfree(sub_fdt); + kho_debugfs_fdt_remove(&kho_out.dbg, fdt); + break; + } + } + mutex_unlock(&kho_out.fdts_lock); =20 -int unregister_kho_notifier(struct notifier_block *nb) -{ - return blocking_notifier_chain_unregister(&kho_out.chain_head, nb); } -EXPORT_SYMBOL_GPL(unregister_kho_notifier); +EXPORT_SYMBOL_GPL(kho_remove_subtree); =20 /** * kho_preserve_folio - preserve a folio across kexec. @@ -644,7 +658,7 @@ int kho_preserve_folio(struct folio *folio) { const unsigned long pfn =3D folio_pfn(folio); const unsigned int order =3D folio_order(folio); - struct kho_mem_track *track =3D &kho_out.ser.track; + struct kho_mem_track *track =3D &kho_out.track; =20 if (kho_out.finalized) return -EBUSY; @@ -670,7 +684,7 @@ int kho_preserve_phys(phys_addr_t phys, size_t size) const unsigned long start_pfn =3D pfn; const unsigned long end_pfn =3D PHYS_PFN(phys + size); int err =3D 0; - struct kho_mem_track *track =3D &kho_out.ser.track; + struct kho_mem_track *track =3D &kho_out.track; =20 if (kho_out.finalized) return -EBUSY; @@ -700,11 +714,11 @@ EXPORT_SYMBOL_GPL(kho_preserve_phys); =20 static int __kho_abort(void) { - int err; + int err =3D 0; unsigned long order; struct kho_mem_phys *physxa; =20 - xa_for_each(&kho_out.ser.track.orders, order, physxa) { + xa_for_each(&kho_out.track.orders, order, physxa) { struct kho_mem_phys_bits *bits; unsigned long phys; =20 @@ -714,17 +728,13 @@ static int __kho_abort(void) xa_destroy(&physxa->phys_bits); kfree(physxa); } - xa_destroy(&kho_out.ser.track.orders); + xa_destroy(&kho_out.track.orders); =20 - if (kho_out.ser.preserved_mem_map) { - 
kho_mem_ser_free(kho_out.ser.preserved_mem_map); - kho_out.ser.preserved_mem_map =3D NULL; + if (kho_out.preserved_mem_map) { + kho_mem_ser_free(kho_out.preserved_mem_map); + kho_out.preserved_mem_map =3D NULL; } =20 - err =3D blocking_notifier_call_chain(&kho_out.chain_head, KEXEC_KHO_ABORT, - NULL); - err =3D notifier_to_errno(err); - if (err) pr_err("Failed to abort KHO finalization: %d\n", err); =20 @@ -751,7 +761,7 @@ int kho_abort(void) =20 kho_out.finalized =3D false; =20 - kho_debugfs_cleanup(&kho_out.dbg); + kho_debugfs_fdt_remove(&kho_out.dbg, kho_out.fdt); =20 unlock: mutex_unlock(&kho_out.lock); @@ -763,41 +773,46 @@ static int __kho_finalize(void) { int err =3D 0; u64 *preserved_mem_map; - void *fdt =3D page_to_virt(kho_out.ser.fdt); + void *root =3D kho_out.fdt; + struct kho_sub_fdt *fdt; =20 - err |=3D fdt_create(fdt, PAGE_SIZE); - err |=3D fdt_finish_reservemap(fdt); - err |=3D fdt_begin_node(fdt, ""); - err |=3D fdt_property_string(fdt, "compatible", KHO_FDT_COMPATIBLE); + err |=3D fdt_create(root, PAGE_SIZE); + err |=3D fdt_finish_reservemap(root); + err |=3D fdt_begin_node(root, ""); + err |=3D fdt_property_string(root, "compatible", KHO_FDT_COMPATIBLE); /** * Reserve the preserved-memory-map property in the root FDT, so * that all property definitions will precede subnodes created by * KHO callers. 
*/ - err |=3D fdt_property_placeholder(fdt, PROP_PRESERVED_MEMORY_MAP, + err |=3D fdt_property_placeholder(root, PROP_PRESERVED_MEMORY_MAP, sizeof(*preserved_mem_map), (void **)&preserved_mem_map); if (err) goto abort; =20 - err =3D kho_preserve_folio(page_folio(kho_out.ser.fdt)); + err =3D kho_preserve_folio(virt_to_folio(kho_out.fdt)); if (err) goto abort; =20 - err =3D blocking_notifier_call_chain(&kho_out.chain_head, - KEXEC_KHO_FINALIZE, &kho_out.ser); - err =3D notifier_to_errno(err); + err =3D kho_mem_serialize(&kho_out); if (err) goto abort; =20 - err =3D kho_mem_serialize(&kho_out.ser); - if (err) - goto abort; + *preserved_mem_map =3D (u64)virt_to_phys(kho_out.preserved_mem_map); =20 - *preserved_mem_map =3D (u64)virt_to_phys(kho_out.ser.preserved_mem_map); + mutex_lock(&kho_out.fdts_lock); + list_for_each_entry(fdt, &kho_out.sub_fdts, l) { + phys_addr_t phys =3D virt_to_phys(fdt->fdt); =20 - err |=3D fdt_end_node(fdt); - err |=3D fdt_finish(fdt); + err |=3D fdt_begin_node(root, fdt->name); + err |=3D fdt_property(root, PROP_SUB_FDT, &phys, sizeof(phys)); + err |=3D fdt_end_node(root); + }; + mutex_unlock(&kho_out.fdts_lock); + + err |=3D fdt_end_node(root); + err |=3D fdt_finish(root); =20 abort: if (err) { @@ -828,7 +843,7 @@ int kho_finalize(void) =20 kho_out.finalized =3D true; ret =3D kho_debugfs_fdt_add(&kho_out.dbg, "fdt", - page_to_virt(kho_out.ser.fdt), true); + kho_out.fdt, true); =20 unlock: mutex_unlock(&kho_out.lock); @@ -901,15 +916,17 @@ static __init int kho_init(void) { int err =3D 0; const void *fdt =3D kho_get_fdt(); + struct page *fdt_page; =20 if (!kho_enable) return 0; =20 - kho_out.ser.fdt =3D alloc_page(GFP_KERNEL); - if (!kho_out.ser.fdt) { + fdt_page =3D alloc_page(GFP_KERNEL); + if (!fdt_page) { err =3D -ENOMEM; goto err_free_scratch; } + kho_out.fdt =3D page_to_virt(fdt_page); =20 err =3D kho_debugfs_init(); if (err) @@ -937,8 +954,8 @@ static __init int kho_init(void) return 0; =20 err_free_fdt: - put_page(kho_out.ser.fdt); - 
kho_out.ser.fdt =3D NULL; + put_page(fdt_page); + kho_out.fdt =3D NULL; err_free_scratch: for (int i =3D 0; i < kho_scratch_cnt; i++) { void *start =3D __va(kho_scratch[i].addr); @@ -949,7 +966,7 @@ static __init int kho_init(void) kho_enable =3D false; return err; } -late_initcall(kho_init); +fs_initcall(kho_init); =20 static void __init kho_release_scratch(void) { @@ -1085,7 +1102,7 @@ int kho_fill_kimage(struct kimage *image) if (!kho_enable) return 0; =20 - image->kho.fdt =3D page_to_phys(kho_out.ser.fdt); + image->kho.fdt =3D virt_to_phys(kho_out.fdt); =20 scratch_size =3D sizeof(*kho_scratch) * kho_scratch_cnt; scratch =3D (struct kexec_buf){ diff --git a/kernel/kexec_handover_debug.c b/kernel/kexec_handover_debug.c index b88d138a97be..af4bad225630 100644 --- a/kernel/kexec_handover_debug.c +++ b/kernel/kexec_handover_debug.c @@ -61,14 +61,17 @@ int kho_debugfs_fdt_add(struct kho_debugfs *dbg, const = char *name, return __kho_debugfs_fdt_add(&dbg->fdt_list, dir, name, fdt); } =20 -void kho_debugfs_cleanup(struct kho_debugfs *dbg) +void kho_debugfs_fdt_remove(struct kho_debugfs *dbg, void *fdt) { - struct fdt_debugfs *ff, *tmp; - - list_for_each_entry_safe(ff, tmp, &dbg->fdt_list, list) { - debugfs_remove(ff->file); - list_del(&ff->list); - kfree(ff); + struct fdt_debugfs *ff; + + list_for_each_entry(ff, &dbg->fdt_list, list) { + if (ff->wrapper.data =3D=3D fdt) { + debugfs_remove(ff->file); + list_del(&ff->list); + kfree(ff); + break; + } } } =20 diff --git a/kernel/kexec_handover_internal.h b/kernel/kexec_handover_inter= nal.h index 41e9616fcdd0..240517596ea3 100644 --- a/kernel/kexec_handover_internal.h +++ b/kernel/kexec_handover_internal.h @@ -30,7 +30,7 @@ void kho_in_debugfs_init(struct kho_debugfs *dbg, const v= oid *fdt); int kho_out_debugfs_init(struct kho_debugfs *dbg); int kho_debugfs_fdt_add(struct kho_debugfs *dbg, const char *name, const void *fdt, bool root); -void kho_debugfs_cleanup(struct kho_debugfs *dbg); +void 
kho_debugfs_fdt_remove(struct kho_debugfs *dbg, void *fdt); #else static inline int kho_debugfs_init(void) { return 0; } static inline void kho_in_debugfs_init(struct kho_debugfs *dbg, @@ -38,7 +38,8 @@ static inline void kho_in_debugfs_init(struct kho_debugfs= *dbg, static inline int kho_out_debugfs_init(struct kho_debugfs *dbg) { return 0= ; } static inline int kho_debugfs_fdt_add(struct kho_debugfs *dbg, const char = *name, const void *fdt, bool root) { return 0; } -static inline void kho_debugfs_cleanup(struct kho_debugfs *dbg) {} +static inline void kho_debugfs_fdt_remove(struct kho_debugfs *dbg, + void *fdt) { } #endif /* CONFIG_KEXEC_HANDOVER_DEBUG */ =20 #endif /* LINUX_KEXEC_HANDOVER_INTERNAL_H */ diff --git a/mm/memblock.c b/mm/memblock.c index 154f1d73b61f..6af0b51b1bb7 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -2501,51 +2501,18 @@ int reserve_mem_release_by_name(const char *name) #define MEMBLOCK_KHO_FDT "memblock" #define MEMBLOCK_KHO_NODE_COMPATIBLE "memblock-v1" #define RESERVE_MEM_KHO_NODE_COMPATIBLE "reserve-mem-v1" -static struct page *kho_fdt; - -static int reserve_mem_kho_finalize(struct kho_serialization *ser) -{ - int err =3D 0, i; - - for (i =3D 0; i < reserved_mem_count; i++) { - struct reserve_mem_table *map =3D &reserved_mem_table[i]; - - err |=3D kho_preserve_phys(map->start, map->size); - } - - err |=3D kho_preserve_folio(page_folio(kho_fdt)); - err |=3D kho_add_subtree(ser, MEMBLOCK_KHO_FDT, page_to_virt(kho_fdt)); - - return notifier_from_errno(err); -} - -static int reserve_mem_kho_notifier(struct notifier_block *self, - unsigned long cmd, void *v) -{ - switch (cmd) { - case KEXEC_KHO_FINALIZE: - return reserve_mem_kho_finalize((struct kho_serialization *)v); - case KEXEC_KHO_ABORT: - return NOTIFY_DONE; - default: - return NOTIFY_BAD; - } -} - -static struct notifier_block reserve_mem_kho_nb =3D { - .notifier_call =3D reserve_mem_kho_notifier, -}; =20 static int __init prepare_kho_fdt(void) { int err =3D 0, i; + struct page 
*fdt_page; void *fdt; =20 - kho_fdt =3D alloc_page(GFP_KERNEL); - if (!kho_fdt) + fdt_page =3D alloc_page(GFP_KERNEL); + if (!fdt_page) return -ENOMEM; =20 - fdt =3D page_to_virt(kho_fdt); + fdt =3D page_to_virt(fdt_page); =20 err |=3D fdt_create(fdt, PAGE_SIZE); err |=3D fdt_finish_reservemap(fdt); @@ -2555,6 +2522,7 @@ static int __init prepare_kho_fdt(void) for (i =3D 0; i < reserved_mem_count; i++) { struct reserve_mem_table *map =3D &reserved_mem_table[i]; =20 + err |=3D kho_preserve_phys(map->start, map->size); err |=3D fdt_begin_node(fdt, map->name); err |=3D fdt_property_string(fdt, "compatible", RESERVE_MEM_KHO_NODE_COM= PATIBLE); err |=3D fdt_property(fdt, "start", &map->start, sizeof(map->start)); @@ -2562,13 +2530,14 @@ static int __init prepare_kho_fdt(void) err |=3D fdt_end_node(fdt); } err |=3D fdt_end_node(fdt); - err |=3D fdt_finish(fdt); =20 + err |=3D kho_preserve_folio(page_folio(fdt_page)); + err |=3D kho_add_subtree(MEMBLOCK_KHO_FDT, fdt); + if (err) { pr_err("failed to prepare memblock FDT for KHO: %d\n", err); - put_page(kho_fdt); - kho_fdt =3D NULL; + put_page(fdt_page); } =20 return err; @@ -2584,13 +2553,6 @@ static int __init reserve_mem_init(void) err =3D prepare_kho_fdt(); if (err) return err; - - err =3D register_kho_notifier(&reserve_mem_kho_nb); - if (err) { - put_page(kho_fdt); - kho_fdt =3D NULL; - } - return err; } late_initcall(reserve_mem_init); --=20 2.50.0.727.gbf7dc18ff4-goog From nobody Mon Oct 6 06:43:24 2025 Received: from mail-yw1-f181.google.com (mail-yw1-f181.google.com [209.85.128.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8C4DC2FA626 for ; Wed, 23 Jul 2025 14:47:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.181 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282034; cv=none; 
b=LLuKQMrlVtaJfmFaZPuceWsbpMscBDDYF+zrmeSuR3bzznthfo10ZshlhJHWrqt2XiS1OfgILxAjYKoMgVq1NhC8OspX9bWyOOZEuo2eAQ6i2rDLUPO/p8VVW30+jtaNlvSJXKQFWnxWEj4mVWSOL2jsREQGTbmsnNmmJV6FRzA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282034; c=relaxed/simple; bh=vSU0slshkk55HBCY79+hFfFYEDsYssJQsWxVN1trIM0=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=spc407l7T/kKAdHfPhVT56Cs2hzvLH7C5aeuGixhq3JKSFvYj3+PU5K+24jJHKnNGejpOOz0ViVEMl3Dc/mk5urpVAyDlBGZcJWXnjqPhVF+OL5NSQOAEHmMVxCKiXNZOMI6W6HBEjjKrLVrSDXZ+zC7D+3VihWL/EONahGtMeQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=K9o/+lpe; arc=none smtp.client-ip=209.85.128.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="K9o/+lpe" Received: by mail-yw1-f181.google.com with SMTP id 00721157ae682-713fba639f3so56987317b3.1 for ; Wed, 23 Jul 2025 07:47:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282028; x=1753886828; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=yUhylTCjCy3uyYdpO2FGhnRAf1jyGBpF/+3Er3S6ctA=; b=K9o/+lpeE85nh0EYGvui6/NDTaWSTdbYCtCRTAk0CPgCFfEQrYpAqgePC6ig2jrVy7 8Vg9ZD6sD3zcsrnnpDm9xyKJk1AJQeDJ1FWsmmzTdbSjnaxSVzlxEURUEzjxsr5pu/nG GeNNJFC3yAdHfOTO3xEvCx6ZYJQUvEMngcBqHhwVyeClk8WLDc/wW41nPMLUumNOCXx8 
mRQpqVxEjA0mELUduN706Z7UOVlURPagyA+sVyHzzMm6gBVqEnlAjjUUS4UQXzJZX8T8 cQssajnQA5CpDU1NYT9wwX63/psAiChTjsW9kCoy1uDhgqdH+oo9RjrMR2CG4hOXjLP0 XWeg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282028; x=1753886828; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=yUhylTCjCy3uyYdpO2FGhnRAf1jyGBpF/+3Er3S6ctA=; b=GaPbGO1xZwl1lfvaGKusHxEc9aCT3z+1EJVCg7zTg12cqZQ487KxMAjfQrNWZPOA++ WqfGGaQaoPn4LfLJ6r5nUFIb+68xkfMF+mjxpMleJW2eIyfwhA3A5bSFqrIWEzDyYv42 Cj6BVo7+tNSw0+Qki4gEDHywqDy0AvmoNQsVOiXynVvJ66fAxE4EbpsiuSmW5AzV6Mav Mo3aDk08hwYmPmIUnZYRlBXd7L/IWTFsDtQk+Tc0JEtk9Ol4yUOQ8xXqkvl7y98zf/n9 bTccshimqL5dHDEoO/XGX8egb55pjC1Sv/SyfHRKKf68qgrbuMMbpR/7WQOOsyl/csZh DQJw== X-Forwarded-Encrypted: i=1; AJvYcCWiJljLDt6i/tu6nmODcTn2ddgBSS7qV78/B1rhfJEy5d5hoHHPsWgivATT+4ovKpaCT5QZdvw0IbO3sUs=@vger.kernel.org X-Gm-Message-State: AOJu0YwQYgCxoVNSvkYy+U8TTVqdxN+/hGJ3/QlRJ7Q63GWsmsdfRWaJ 74ZN++40OXS50wnq864F2TuaBGtAL2Sl1Vi3ubeRu7JnlycCstebhxnBP/pGz95EysU= X-Gm-Gg: ASbGncvaBjzcwq2YHFBFtC0R4TH/plqjt6ap5tRaxQoEPuG3f0oc3CQq1SW6P+t1K8L lJK/gX+ApHe5rzokN1GvL3vvxGKuXJOOf9NpydcLKpx7WYwOU+xfCcbmnamsR7/+sYRtYPHj+K7 +Opw9f4lma26v+a6JTF/PKRlbTGffHIHvtWCgS1W71V/lsXIii5AT92ZRrJVYrxvvYvRRy8fGlO THC2wAmhf5bGvcSLhoqp1gZpDdoo4s6IXd2lC6GuG6bqF52DsqmjuCAXBdesESjsve2IcDtHUlp h0+8WTf4KMlNMHbCToZyjNjCGQva7mskhqakuv3+CLQXcUUNWGmUJCDGgi9gg4PbAWSdlEPxMJe 7r+Fs9DmOnLNaGxKgPXnFtMlM+R5TZFmOJ3sYcdnccNFPnClIHxO8I+Ax3VXwXYx0SBwbIw+AbD Q6P4kuUJHQEDa3xM1Zk8HlFDKh X-Google-Smtp-Source: AGHT+IGZ9Bv9q7KwM+e5SZ4p5HtmWokaa1yTavfU4ti4TiXgkGO4Jv4+Imx8qhXiY7NKxrpPJpXoKw== X-Received: by 2002:a05:690c:93:b0:710:f46d:cec0 with SMTP id 00721157ae682-719b41459e7mr40440567b3.1.1753282028495; Wed, 23 Jul 2025 07:47:08 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:07 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v2 07/32] kho: add interfaces to unpreserve folios and physical memory ranges Date: Wed, 23 Jul 2025 14:46:20 +0000 Message-ID: 
<20250723144649.1696299-8-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com> References: <20250723144649.1696299-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Changyuan Lyu Allow users of KHO to cancel a previous preservation by adding interfaces to unpreserve folios and physical memory ranges. Signed-off-by: Changyuan Lyu Co-developed-by: Pasha Tatashin Signed-off-by: Pasha Tatashin --- include/linux/kexec_handover.h | 12 +++++ kernel/kexec_handover.c | 90 +++++++++++++++++++++++++++++----- 2 files changed, 89 insertions(+), 13 deletions(-) diff --git a/include/linux/kexec_handover.h b/include/linux/kexec_handover.h index cabdff5f50a2..383e9460edb9 100644 --- a/include/linux/kexec_handover.h +++ b/include/linux/kexec_handover.h @@ -33,7 +33,9 @@ struct folio; bool kho_is_enabled(void); =20 int kho_preserve_folio(struct folio *folio); +int kho_unpreserve_folio(struct folio *folio); int kho_preserve_phys(phys_addr_t phys, size_t size); +int kho_unpreserve_phys(phys_addr_t phys, size_t size); struct folio *kho_restore_folio(phys_addr_t phys); int kho_add_subtree(const char *name, void *fdt); void kho_remove_subtree(void *fdt); @@ -58,11 +60,21 @@ static inline int kho_preserve_folio(struct folio *foli= o) return -EOPNOTSUPP; } =20 +static inline int kho_unpreserve_folio(struct folio *folio) +{ + return -EOPNOTSUPP; +} + static inline int kho_preserve_phys(phys_addr_t phys, size_t size) { return -EOPNOTSUPP; } =20 +static inline int kho_unpreserve_phys(phys_addr_t phys, size_t size) +{ + return -EOPNOTSUPP; +} + static inline struct folio *kho_restore_folio(phys_addr_t phys) { return NULL; diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c index 30d673f7f68a..26ad926912a7 100644
--- a/kernel/kexec_handover.c +++ b/kernel/kexec_handover.c @@ -136,26 +136,33 @@ static void *xa_load_or_alloc(struct xarray *xa, unsi= gned long index, size_t sz) return elm; } =20 -static void __kho_unpreserve(struct kho_mem_track *track, unsigned long pf= n, - unsigned long end_pfn) +static void __kho_unpreserve_order(struct kho_mem_track *track, unsigned l= ong pfn, + unsigned int order) { struct kho_mem_phys_bits *bits; struct kho_mem_phys *physxa; + const unsigned long pfn_high =3D pfn >> order; =20 - while (pfn < end_pfn) { - const unsigned int order =3D - min(count_trailing_zeros(pfn), ilog2(end_pfn - pfn)); - const unsigned long pfn_high =3D pfn >> order; + physxa =3D xa_load(&track->orders, order); + if (!physxa) + return; =20 - physxa =3D xa_load(&track->orders, order); - if (!physxa) - continue; + bits =3D xa_load(&physxa->phys_bits, pfn_high / PRESERVE_BITS); + if (!bits) + return; =20 - bits =3D xa_load(&physxa->phys_bits, pfn_high / PRESERVE_BITS); - if (!bits) - continue; + clear_bit(pfn_high % PRESERVE_BITS, bits->preserve); +} =20 - clear_bit(pfn_high % PRESERVE_BITS, bits->preserve); +static void __kho_unpreserve(struct kho_mem_track *track, unsigned long pf= n, + unsigned long end_pfn) +{ + unsigned int order; + + while (pfn < end_pfn) { + order =3D min(count_trailing_zeros(pfn), ilog2(end_pfn - pfn)); + + __kho_unpreserve_order(track, pfn, order); =20 pfn +=3D 1 << order; } @@ -667,6 +674,30 @@ int kho_preserve_folio(struct folio *folio) } EXPORT_SYMBOL_GPL(kho_preserve_folio); =20 +/** + * kho_unpreserve_folio - unpreserve a folio. + * @folio: folio to unpreserve. + * + * Instructs KHO to unpreserve a folio that was preserved by + * kho_preserve_folio() before. The provided @folio (pfn and order) + * must exactly match a previously preserved folio. 
+ * + * Return: 0 on success, error code on failure + */ +int kho_unpreserve_folio(struct folio *folio) +{ + const unsigned long pfn =3D folio_pfn(folio); + const unsigned int order =3D folio_order(folio); + struct kho_mem_track *track =3D &kho_out.track; + + if (kho_out.finalized) + return -EBUSY; + + __kho_unpreserve_order(track, pfn, order); + return 0; +} +EXPORT_SYMBOL_GPL(kho_unpreserve_folio); + /** * kho_preserve_phys - preserve a physically contiguous range across kexec. * @phys: physical address of the range. @@ -712,6 +743,39 @@ int kho_preserve_phys(phys_addr_t phys, size_t size) } EXPORT_SYMBOL_GPL(kho_preserve_phys); =20 +/** + * kho_unpreserve_phys - unpreserve a physically contiguous range. + * @phys: physical address of the range. + * @size: size of the range. + * + * Instructs KHO to unpreserve the memory range from @phys to @phys + @siz= e. + * The @phys address must be aligned to @size, and @size must be a + * power-of-2 multiple of PAGE_SIZE. + * This call must exactly match a granularity at which memory was original= ly + * preserved (that is, a `kho_preserve_phys` call with the same `phys` and + * `size`). Unpreserving arbitrary sub-ranges of larger preserved blocks i= s not + * supported.
+ *
+ * Return: 0 on success, error code on failure
+ */
+int kho_unpreserve_phys(phys_addr_t phys, size_t size)
+{
+	struct kho_mem_track *track = &kho_out.track;
+	unsigned long pfn = PHYS_PFN(phys);
+	unsigned long end_pfn = PHYS_PFN(phys + size);
+
+	if (kho_out.finalized)
+		return -EBUSY;
+
+	if (!PAGE_ALIGNED(phys) || !PAGE_ALIGNED(size))
+		return -EINVAL;
+
+	__kho_unpreserve(track, pfn, end_pfn);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(kho_unpreserve_phys);
+
 static int __kho_abort(void)
 {
 	int err = 0;
-- 
2.50.0.727.gbf7dc18ff4-goog

From nobody Mon Oct 6 06:43:24 2025
From: Pasha Tatashin
Subject: [PATCH v2 08/32] kho: don't unpreserve memory during abort
Date: Wed, 23 Jul 2025 14:46:21 +0000
Message-ID: <20250723144649.1696299-9-pasha.tatashin@soleen.com>
In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
References: <20250723144649.1696299-1-pasha.tatashin@soleen.com>

KHO allows clients to preserve memory regions at any point before the
KHO state is finalized. The finalization process itself involves KHO
performing its own actions, such as serializing the overall
preserved memory map.

If this finalization process is aborted, the current implementation
destroys KHO's internal memory tracking structures
(`kho_out.ser.track.orders`). This behavior effectively unpreserves
all memory from KHO's perspective, regardless of whether those
preservations were made by clients before the finalization attempt
or by KHO itself during finalization.
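
To make the distinction between client preservations and finalization
state concrete, here is a minimal userspace sketch. It is not kernel code:
`struct model_kho` and every `model_*` function are hypothetical stand-ins
invented for illustration, and the tracker is a flat array of flags rather
than KHO's real data structures. It contrasts an abort that wipes the
tracker with one that only drops what finalization produced, which is the
behavior this patch moves to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_PFNS 64 /* hypothetical size of the toy tracker */

/* Toy stand-in for KHO's tracking state; not the kernel's structures. */
struct model_kho {
	bool preserved[MODEL_MAX_PFNS]; /* one flag per page frame */
	bool mem_map_serialized;        /* produced by finalize only */
	bool finalized;
};

/* Client-side preservation: only allowed before finalization. */
static int model_preserve(struct model_kho *kho, size_t pfn)
{
	if (kho->finalized || pfn >= MODEL_MAX_PFNS)
		return -1;
	kho->preserved[pfn] = true;
	return 0;
}

/* Finalization serializes the preserved map; this is KHO's own action. */
static void model_finalize(struct model_kho *kho)
{
	kho->mem_map_serialized = true;
	kho->finalized = true;
}

/* Abort that also destroys the tracker, forgetting client preservations. */
static void model_abort_wipe(struct model_kho *kho)
{
	for (size_t i = 0; i < MODEL_MAX_PFNS; i++)
		kho->preserved[i] = false;
	kho->mem_map_serialized = false;
	kho->finalized = false;
}

/* Abort that only undoes what finalization itself created. */
static void model_abort_keep(struct model_kho *kho)
{
	kho->mem_map_serialized = false;
	kho->finalized = false;
}

/* Number of page frames still marked as preserved. */
static size_t model_count(const struct model_kho *kho)
{
	size_t n = 0;

	for (size_t i = 0; i < MODEL_MAX_PFNS; i++)
		n += kho->preserved[i];
	return n;
}
```

In this model, a client that preserved two frames before finalization
still sees both after `model_abort_keep()`, and sees none after
`model_abort_wipe()`.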

This premature unpreservation is incorrect. An abort of the
finalization process should only undo actions taken by KHO as part of
that specific finalization attempt. Individual memory regions
preserved by clients prior to finalization should remain preserved,
as their lifecycle is managed by the clients themselves. These
clients might still need to call kho_unpreserve_folio() or
kho_unpreserve_phys() based on their own logic, even after a KHO
finalization attempt is aborted.

Signed-off-by: Pasha Tatashin
---
 kernel/kexec_handover.c | 21 +--------------------
 1 file changed, 1 insertion(+), 20 deletions(-)

diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c
index 26ad926912a7..7908886170f0 100644
--- a/kernel/kexec_handover.c
+++ b/kernel/kexec_handover.c
@@ -778,31 +778,12 @@ EXPORT_SYMBOL_GPL(kho_unpreserve_phys);
 
 static int __kho_abort(void)
 {
-	int err = 0;
-	unsigned long order;
-	struct kho_mem_phys *physxa;
-
-	xa_for_each(&kho_out.track.orders, order, physxa) {
-		struct kho_mem_phys_bits *bits;
-		unsigned long phys;
-
-		xa_for_each(&physxa->phys_bits, phys, bits)
-			kfree(bits);
-
-		xa_destroy(&physxa->phys_bits);
-		kfree(physxa);
-	}
-	xa_destroy(&kho_out.track.orders);
-
 	if (kho_out.preserved_mem_map) {
 		kho_mem_ser_free(kho_out.preserved_mem_map);
 		kho_out.preserved_mem_map = NULL;
 	}
 
-	if (err)
-		pr_err("Failed to abort KHO finalization: %d\n", err);
-
-	return err;
+	return 0;
 }
 
 int kho_abort(void)
-- 
2.50.0.727.gbf7dc18ff4-goog

From nobody Mon Oct 6 06:43:24 2025
From: Pasha Tatashin
Subject: [PATCH v2 09/32] liveupdate: kho: move to kernel/liveupdate
Date: Wed, 23 Jul 2025 14:46:22 +0000
Message-ID: <20250723144649.1696299-10-pasha.tatashin@soleen.com>
In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
References: <20250723144649.1696299-1-pasha.tatashin@soleen.com>

Move KHO to kernel/liveupdate/ in preparation of placing all Live Update
core kernel related files to the same place.

Signed-off-by: Pasha Tatashin
Reviewed-by: Jason Gunthorpe
---
 Documentation/core-api/kho/concepts.rst       |  2 +-
 MAINTAINERS                                   |  2 +-
 init/Kconfig                                  |  2 ++
 kernel/Kconfig.kexec                          | 25 ----------------
 kernel/Makefile                               |  3 +-
 kernel/liveupdate/Kconfig                     | 30 +++++++++++++++++++
 kernel/liveupdate/Makefile                    |  7 +++++
 kernel/{ => liveupdate}/kexec_handover.c      |  6 ++--
 .../{ => liveupdate}/kexec_handover_debug.c   |  0
 .../kexec_handover_internal.h                 |  0
 10 files changed, 45 insertions(+), 32 deletions(-)
 create mode 100644 kernel/liveupdate/Kconfig
 create mode 100644 kernel/liveupdate/Makefile
 rename kernel/{ => liveupdate}/kexec_handover.c (99%)
 rename kernel/{ => liveupdate}/kexec_handover_debug.c (100%)
 rename kernel/{ => liveupdate}/kexec_handover_internal.h (100%)

diff --git a/Documentation/core-api/kho/concepts.rst b/Documentation/core-api/kho/concepts.rst
index 36d5c05cfb30..d626d1dbd678 100644
--- a/Documentation/core-api/kho/concepts.rst
+++ b/Documentation/core-api/kho/concepts.rst
@@ -70,5 +70,5 @@ in the FDT. That state is called the KHO finalization phase.
 
 Public API
 ==========
-.. kernel-doc:: kernel/kexec_handover.c
+.. kernel-doc:: kernel/liveupdate/kexec_handover.c
    :export:
diff --git a/MAINTAINERS b/MAINTAINERS
index 00de7c78de86..3b276cfeb038 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13356,7 +13356,7 @@ S:	Maintained
 F:	Documentation/admin-guide/mm/kho.rst
 F:	Documentation/core-api/kho/*
 F:	include/linux/kexec_handover.h
-F:	kernel/kexec_handover*
+F:	kernel/liveupdate/kexec_handover*
 
 KEYS-ENCRYPTED
 M:	Mimi Zohar
diff --git a/init/Kconfig b/init/Kconfig
index 666783eb50ab..fb7ac0e56a87 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -2083,6 +2083,8 @@ config TRACEPOINTS
 
 source "kernel/Kconfig.kexec"
 
+source "kernel/liveupdate/Kconfig"
+
 endmenu		# General setup
 
 source "arch/Kconfig"
diff --git a/kernel/Kconfig.kexec b/kernel/Kconfig.kexec
index 9968d3d4dd17..b05f5018ed98 100644
--- a/kernel/Kconfig.kexec
+++ b/kernel/Kconfig.kexec
@@ -94,31 +94,6 @@ config KEXEC_JUMP
 	  Jump between original kernel and kexeced kernel and invoke
 	  code in physical address mode via KEXEC
 
-config KEXEC_HANDOVER
-	bool "kexec handover"
-	depends on ARCH_SUPPORTS_KEXEC_HANDOVER && ARCH_SUPPORTS_KEXEC_FILE
-	depends on !DEFERRED_STRUCT_PAGE_INIT
-	select MEMBLOCK_KHO_SCRATCH
-	select KEXEC_FILE
-	select DEBUG_FS
-	select LIBFDT
-	select CMA
-	help
-	  Allow kexec to hand over state across kernels by generating and
-	  passing additional metadata to the target kernel. This is useful
-	  to keep data or state alive across the kexec. For this to work,
-	  both source and target kernels need to have this option enabled.
-
-config KEXEC_HANDOVER_DEBUG
-	bool "kexec handover debug interface"
-	depends on KEXEC_HANDOVER
-	depends on DEBUG_FS
-	help
-	  Allow to control kexec handover device tree via debugfs
-	  interface, i.e. finalize the state or aborting the finalization.
-	  Also, enables inspecting the KHO fdt trees with the debugfs binary
-	  blobs.
-
 config CRASH_DUMP
 	bool "kernel crash dumps"
 	default ARCH_DEFAULT_CRASH_DUMP
diff --git a/kernel/Makefile b/kernel/Makefile
index e4b4afa86a70..632f692512d7 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -53,6 +53,7 @@ obj-y += printk/
 obj-y += irq/
 obj-y += rcu/
 obj-y += livepatch/
+obj-y += liveupdate/
 obj-y += dma/
 obj-y += entry/
 obj-$(CONFIG_MODULES) += module/
@@ -81,8 +82,6 @@ obj-$(CONFIG_CRASH_DM_CRYPT) += crash_dump_dm_crypt.o
 obj-$(CONFIG_KEXEC) += kexec.o
 obj-$(CONFIG_KEXEC_FILE) += kexec_file.o
 obj-$(CONFIG_KEXEC_ELF) += kexec_elf.o
-obj-$(CONFIG_KEXEC_HANDOVER) += kexec_handover.o
-obj-$(CONFIG_KEXEC_HANDOVER_DEBUG) += kexec_handover_debug.o
 obj-$(CONFIG_BACKTRACE_SELF_TEST) += backtracetest.o
 obj-$(CONFIG_COMPAT) += compat.o
 obj-$(CONFIG_CGROUPS) += cgroup/
diff --git a/kernel/liveupdate/Kconfig b/kernel/liveupdate/Kconfig
new file mode 100644
index 000000000000..eebe564b385d
--- /dev/null
+++ b/kernel/liveupdate/Kconfig
@@ -0,0 +1,30 @@
+# SPDX-License-Identifier: GPL-2.0-only
+
+menu "Live Update"
+
+config KEXEC_HANDOVER
+	bool "kexec handover"
+	depends on ARCH_SUPPORTS_KEXEC_HANDOVER && ARCH_SUPPORTS_KEXEC_FILE
+	depends on !DEFERRED_STRUCT_PAGE_INIT
+	select MEMBLOCK_KHO_SCRATCH
+	select KEXEC_FILE
+	select DEBUG_FS
+	select LIBFDT
+	select CMA
+	help
+	  Allow kexec to hand over state across kernels by generating and
+	  passing additional metadata to the target kernel. This is useful
+	  to keep data or state alive across the kexec. For this to work,
+	  both source and target kernels need to have this option enabled.
+
+config KEXEC_HANDOVER_DEBUG
+	bool "kexec handover debug interface"
+	depends on KEXEC_HANDOVER
+	depends on DEBUG_FS
+	help
+	  Allow to control kexec handover device tree via debugfs
+	  interface, i.e. finalize the state or aborting the finalization.
+	  Also, enables inspecting the KHO fdt trees with the debugfs binary
+	  blobs.
+
+endmenu
diff --git a/kernel/liveupdate/Makefile b/kernel/liveupdate/Makefile
new file mode 100644
index 000000000000..72cf7a8e6739
--- /dev/null
+++ b/kernel/liveupdate/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for the linux kernel.
+#
+
+obj-$(CONFIG_KEXEC_HANDOVER)		+= kexec_handover.o
+obj-$(CONFIG_KEXEC_HANDOVER_DEBUG)	+= kexec_handover_debug.o
diff --git a/kernel/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
similarity index 99%
rename from kernel/kexec_handover.c
rename to kernel/liveupdate/kexec_handover.c
index 7908886170f0..72900afbdf04 100644
--- a/kernel/kexec_handover.c
+++ b/kernel/liveupdate/kexec_handover.c
@@ -23,8 +23,8 @@
  * KHO is tightly coupled with mm init and needs access to some of mm
  * internal APIs.
  */
-#include "../mm/internal.h"
-#include "kexec_internal.h"
+#include "../../mm/internal.h"
+#include "../kexec_internal.h"
 #include "kexec_handover_internal.h"
 
 #define KHO_FDT_COMPATIBLE "kho-v1"
@@ -825,7 +825,7 @@ static int __kho_finalize(void)
 	err |= fdt_finish_reservemap(root);
 	err |= fdt_begin_node(root, "");
 	err |= fdt_property_string(root, "compatible", KHO_FDT_COMPATIBLE);
-	/**
+	/*
 	 * Reserve the preserved-memory-map property in the root FDT, so
 	 * that all property definitions will precede subnodes created by
 	 * KHO callers.
diff --git a/kernel/kexec_handover_debug.c b/kernel/liveupdate/kexec_handover_debug.c
similarity index 100%
rename from kernel/kexec_handover_debug.c
rename to kernel/liveupdate/kexec_handover_debug.c
diff --git a/kernel/kexec_handover_internal.h b/kernel/liveupdate/kexec_handover_internal.h
similarity index 100%
rename from kernel/kexec_handover_internal.h
rename to kernel/liveupdate/kexec_handover_internal.h
-- 
2.50.0.727.gbf7dc18ff4-goog

From nobody Mon Oct 6 06:43:24 2025
From: Pasha Tatashin
Subject: [PATCH v2 10/32] liveupdate: luo_core: Live Update Orchestrator
Date: Wed, 23 Jul 2025 14:46:23 +0000
Message-ID: <20250723144649.1696299-11-pasha.tatashin@soleen.com>
In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
References: <20250723144649.1696299-1-pasha.tatashin@soleen.com>

Introduce LUO, a mechanism intended to facilitate kernel updates while
keeping designated devices operational across the transition (e.g., via
kexec). The primary use case is updating hypervisors with minimal
disruption to running virtual machines. For userspace side of hypervisor
update we have copyless migration. LUO is for updating the kernel.

This initial patch lays the groundwork for the LUO subsystem.

Further functionality, including the implementation of state transition
logic, integration with KHO, and hooks for subsystems and file
descriptors, will be added in subsequent patches.
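
As a rough illustration of the state machine this patch declares in
include/linux/liveupdate.h, the following userspace sketch models the
events and states described in that header's kernel-doc comments. It is
not the LUO implementation: the `model_*` names are hypothetical, the
UNDEFINED state is omitted, and which transitions get rejected is an
assumption drawn from the comments rather than from the transition logic
that later patches add. The frozen-to-updated step happens by booting the
next kernel, so it has no event here.

```c
#include <assert.h>

/* Hypothetical mirror of the liveupdate states (UNDEFINED omitted). */
enum model_state {
	MODEL_NORMAL,
	MODEL_PREPARED,
	MODEL_FROZEN,
	MODEL_UPDATED,
};

/* Hypothetical mirror of the liveupdate events. */
enum model_event {
	MODEL_PREPARE,
	MODEL_FREEZE,
	MODEL_FINISH,
	MODEL_CANCEL,
};

/*
 * Apply one event. Returns 0 and updates *state on a transition the
 * kernel-doc comments describe, -1 (state unchanged) otherwise.
 */
static int model_luo_event(enum model_state *state, enum model_event ev)
{
	switch (ev) {
	case MODEL_PREPARE: /* normal -> prepared, before the blackout */
		if (*state != MODEL_NORMAL)
			return -1;
		*state = MODEL_PREPARED;
		return 0;
	case MODEL_FREEZE: /* prepared -> frozen, from reboot() */
		if (*state != MODEL_PREPARED)
			return -1;
		*state = MODEL_FROZEN;
		return 0;
	case MODEL_FINISH: /* updated -> normal, in the new kernel */
		if (*state != MODEL_UPDATED)
			return -1;
		*state = MODEL_NORMAL;
		return 0;
	case MODEL_CANCEL: /* prepared or frozen -> normal */
		if (*state != MODEL_PREPARED && *state != MODEL_FROZEN)
			return -1;
		*state = MODEL_NORMAL;
		return 0;
	}
	return -1;
}
```

Keeping the checks in one function makes the ordering constraint easy to
see: FREEZE is only reachable through PREPARE, and FINISH only applies to
a kernel that booted in the updated state.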

Signed-off-by: Pasha Tatashin
---
 include/linux/liveupdate.h       | 140 ++++++++++++++
 kernel/liveupdate/Kconfig        |  27 +++
 kernel/liveupdate/Makefile       |   1 +
 kernel/liveupdate/luo_core.c     | 301 +++++++++++++++++++++++++++++++
 kernel/liveupdate/luo_internal.h |  21 +++
 5 files changed, 490 insertions(+)
 create mode 100644 include/linux/liveupdate.h
 create mode 100644 kernel/liveupdate/luo_core.c
 create mode 100644 kernel/liveupdate/luo_internal.h

diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h
new file mode 100644
index 000000000000..da8f05c81e51
--- /dev/null
+++ b/include/linux/liveupdate.h
@@ -0,0 +1,140 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin
+ */
+#ifndef _LINUX_LIVEUPDATE_H
+#define _LINUX_LIVEUPDATE_H
+
+#include
+#include
+#include
+
+/**
+ * enum liveupdate_event - Events that trigger live update callbacks.
+ * @LIVEUPDATE_PREPARE: PREPARE should happen *before* the blackout window.
+ *                      Subsystems should prepare for an upcoming reboot by
+ *                      serializing their states. However, it must be considered
+ *                      that user applications, e.g. virtual machines are still
+ *                      running during this phase.
+ * @LIVEUPDATE_FREEZE:  FREEZE sent from the reboot() syscall, when the current
+ *                      kernel is on its way out. This is the final opportunity
+ *                      for subsystems to save any state that must persist
+ *                      across the reboot. Callbacks for this event should be as
+ *                      fast as possible since they are on the critical path of
+ *                      rebooting into the next kernel.
+ * @LIVEUPDATE_FINISH:  FINISH is sent in the newly booted kernel after a
+ *                      successful live update and normally *after* the blackout
+ *                      window. Subsystems should perform any final cleanup
+ *                      during this phase. This phase also provides an
+ *                      opportunity to clean up devices that were preserved but
+ *                      never explicitly reclaimed during the live update
+ *                      process. State restoration should have already occurred
+ *                      before this event. Callbacks for this event must not
+ *                      fail. The completion of this call transitions the
+ *                      machine from ``updated`` to ``normal`` state.
+ * @LIVEUPDATE_CANCEL:  CANCEL the live update and go back to normal state. This
+ *                      event is user initiated, or is done automatically when
+ *                      LIVEUPDATE_PREPARE or LIVEUPDATE_FREEZE stage fails.
+ *                      Subsystems should revert any actions taken during the
+ *                      corresponding prepare event. Callbacks for this event
+ *                      must not fail.
+ *
+ * These events represent the different stages and actions within the live
+ * update process that subsystems (like device drivers and bus drivers)
+ * need to be aware of to correctly serialize and restore their state.
+ *
+ */
+enum liveupdate_event {
+	LIVEUPDATE_PREPARE,
+	LIVEUPDATE_FREEZE,
+	LIVEUPDATE_FINISH,
+	LIVEUPDATE_CANCEL,
+};
+
+/**
+ * enum liveupdate_state - Defines the possible states of the live update
+ * orchestrator.
+ * @LIVEUPDATE_STATE_UNDEFINED:  State has not yet been initialized.
+ * @LIVEUPDATE_STATE_NORMAL:     Default state, no live update in progress.
+ * @LIVEUPDATE_STATE_PREPARED:   Live update is prepared for reboot; the
+ *                               LIVEUPDATE_PREPARE callbacks have completed
+ *                               successfully.
+ *                               Devices might operate in a limited state
+ *                               for example the participating devices might
+ *                               not be allowed to unbind, and also the
+ *                               setting up of new DMA mappings might be
+ *                               disabled in this state.
+ * @LIVEUPDATE_STATE_FROZEN:     The final reboot event
+ *                               (%LIVEUPDATE_FREEZE) has been sent, and the
+ *                               system is performing its final state saving
+ *                               within the "blackout window". User
+ *                               workloads must be suspended. The actual
+ *                               reboot (kexec) into the next kernel is
+ *                               imminent.
+ * @LIVEUPDATE_STATE_UPDATED:    The system has rebooted into the next
+ *                               kernel via live update the system is now
+ *                               running the next kernel, awaiting the
+ *                               finish event.
+ *
+ * These states track the progress and outcome of a live update operation.
+ */
+enum liveupdate_state {
+	LIVEUPDATE_STATE_UNDEFINED = 0,
+	LIVEUPDATE_STATE_NORMAL = 1,
+	LIVEUPDATE_STATE_PREPARED = 2,
+	LIVEUPDATE_STATE_FROZEN = 3,
+	LIVEUPDATE_STATE_UPDATED = 4,
+};
+
+#ifdef CONFIG_LIVEUPDATE
+
+/* Return true if live update orchestrator is enabled */
+bool liveupdate_enabled(void);
+
+/* Called during reboot to tell participants to complete serialization */
+int liveupdate_reboot(void);
+
+/*
+ * Return true if machine is in updated state (i.e. live update boot in
+ * progress)
+ */
+bool liveupdate_state_updated(void);
+
+/*
+ * Return true if machine is in normal state (i.e. no live update in progress).
+ */
+bool liveupdate_state_normal(void);
+
+enum liveupdate_state liveupdate_get_state(void);
+
+#else /* CONFIG_LIVEUPDATE */
+
+static inline int liveupdate_reboot(void)
+{
+	return 0;
+}
+
+static inline bool liveupdate_enabled(void)
+{
+	return false;
+}
+
+static inline bool liveupdate_state_updated(void)
+{
+	return false;
+}
+
+static inline bool liveupdate_state_normal(void)
+{
+	return true;
+}
+
+static inline enum liveupdate_state liveupdate_get_state(void)
+{
+	return LIVEUPDATE_STATE_NORMAL;
+}
+
+#endif /* CONFIG_LIVEUPDATE */
+#endif /* _LINUX_LIVEUPDATE_H */
diff --git a/kernel/liveupdate/Kconfig b/kernel/liveupdate/Kconfig
index eebe564b385d..f6b0bde188d9 100644
--- a/kernel/liveupdate/Kconfig
+++ b/kernel/liveupdate/Kconfig
@@ -1,7 +1,34 @@
 # SPDX-License-Identifier: GPL-2.0-only
+#
+# Copyright (c) 2025, Google LLC.
+# Pasha Tatashin
+#
+# Live Update Orchestrator
+#
 
 menu "Live Update"
 
+config LIVEUPDATE
+	bool "Live Update Orchestrator"
+	depends on KEXEC_HANDOVER
+	help
+	  Enable the Live Update Orchestrator. Live Update is a mechanism,
+	  typically based on kexec, that allows the kernel to be updated
+	  while keeping selected devices operational across the transition.
+ These devices are intended to be reclaimed by the new kernel and + re-attached to their original workload without requiring a device + reset. + + Ability to handover a device from current to the next kernel depends + on specific support within device drivers and related kernel + subsystems. + + This feature primarily targets virtual machine hosts to quickly update + the kernel hypervisor with minimal disruption to the running virtual + machines. + + If unsure, say N. + config KEXEC_HANDOVER bool "kexec handover" depends on ARCH_SUPPORTS_KEXEC_HANDOVER && ARCH_SUPPORTS_KEXEC_FILE diff --git a/kernel/liveupdate/Makefile b/kernel/liveupdate/Makefile index 72cf7a8e6739..b3c72c405780 100644 --- a/kernel/liveupdate/Makefile +++ b/kernel/liveupdate/Makefile @@ -5,3 +5,4 @@ =20 obj-$(CONFIG_KEXEC_HANDOVER) +=3D kexec_handover.o obj-$(CONFIG_KEXEC_HANDOVER_DEBUG) +=3D kexec_handover_debug.o +obj-$(CONFIG_LIVEUPDATE) +=3D luo_core.o diff --git a/kernel/liveupdate/luo_core.c b/kernel/liveupdate/luo_core.c new file mode 100644 index 000000000000..8cee093807ff --- /dev/null +++ b/kernel/liveupdate/luo_core.c @@ -0,0 +1,301 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +/** + * DOC: Live Update Orchestrator (LUO) + * + * Live Update is a specialized, kexec-based reboot process that allows a + * running kernel to be updated from one version to another while preservi= ng + * the state of selected resources and keeping designated hardware devices + * operational. For these devices, DMA activity may continue throughout the + * kernel transition. + * + * While the primary use case driving this work is supporting live updates= of + * the Linux kernel when it is used as a hypervisor in cloud environments,= the + * LUO framework itself is designed to be workload-agnostic. 
Much like Ker= nel + * Live Patching, which applies security fixes regardless of the workload, + * Live Update facilitates a full kernel version upgrade for any type of s= ystem. + * + * For example, a non-hypervisor system running an in-memory cache like + * memcached with many gigabytes of data can use LUO. The userspace service + * can place its cache into a memfd, have its state preserved by LUO, and + * restore it immediately after the kernel kexec. + * + * Whether the system is running virtual machines, containers, a + * high-performance database, or networking services, LUO's primary goal i= s to + * enable a full kernel update by preserving critical userspace state and + * keeping essential devices operational. + * + * The core of LUO is a state machine that tracks the progress of a live u= pdate, + * along with a callback API that allows other kernel subsystems to partic= ipate + * in the process. Example subsystems that can hook into LUO include: kvm, + * iommu, interrupts, vfio, participating filesystems, and memory manageme= nt. + * + * LUO uses Kexec Handover to transfer memory state from the current kerne= l to + * the next kernel. For more details see + * Documentation/core-api/kho/concepts.rst. + * + * The LUO state machine ensures that operations are performed in the corr= ect + * sequence and provides a mechanism to track and recover from potential + * failures. 
+ */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include +#include +#include +#include +#include +#include "luo_internal.h" + +static DECLARE_RWSEM(luo_state_rwsem); + +static enum liveupdate_state luo_state =3D LIVEUPDATE_STATE_UNDEFINED; + +static const char *const luo_state_str[] =3D { + [LIVEUPDATE_STATE_UNDEFINED] =3D "undefined", + [LIVEUPDATE_STATE_NORMAL] =3D "normal", + [LIVEUPDATE_STATE_PREPARED] =3D "prepared", + [LIVEUPDATE_STATE_FROZEN] =3D "frozen", + [LIVEUPDATE_STATE_UPDATED] =3D "updated", +}; + +static bool luo_enabled; + +static int __init early_liveupdate_param(char *buf) +{ + return kstrtobool(buf, &luo_enabled); +} +early_param("liveupdate", early_liveupdate_param); + +/* Return true if the current state is equal to the provided state */ +static inline bool is_current_luo_state(enum liveupdate_state expected_sta= te) +{ + return liveupdate_get_state() =3D=3D expected_state; +} + +static void __luo_set_state(enum liveupdate_state state) +{ + WRITE_ONCE(luo_state, state); +} + +static inline void luo_set_state(enum liveupdate_state state) +{ + pr_info("Switched from [%s] to [%s] state\n", + luo_current_state_str(), luo_state_str[state]); + __luo_set_state(state); +} + +static int luo_do_freeze_calls(void) +{ + return 0; +} + +static void luo_do_finish_calls(void) +{ +} + +/* Get the current state as a string */ +const char *luo_current_state_str(void) +{ + return luo_state_str[liveupdate_get_state()]; +} + +enum liveupdate_state liveupdate_get_state(void) +{ + return READ_ONCE(luo_state); +} + +int luo_prepare(void) +{ + return 0; +} + +/** + * luo_freeze() - Initiate the final freeze notification phase for live up= date. + * + * Attempts to transition the live update orchestrator state from + * %LIVEUPDATE_STATE_PREPARED to %LIVEUPDATE_STATE_FROZEN. 
This function is
+ * typically called just before the actual reboot system call (e.g., kexec)
+ * is invoked, either directly by the orchestration tool or potentially from
+ * within the reboot syscall path itself.
+ *
+ * @return 0: Success. Negative error otherwise. State is reverted to
+ * %LIVEUPDATE_STATE_NORMAL in case of an error during callbacks, and everything
+ * is canceled via cancel notification.
+ */
+int luo_freeze(void)
+{
+	int ret;
+
+	if (down_write_killable(&luo_state_rwsem)) {
+		pr_warn("[freeze] event canceled by user\n");
+		return -EAGAIN;
+	}
+
+	if (!is_current_luo_state(LIVEUPDATE_STATE_PREPARED)) {
+		pr_warn("Can't switch to [%s] from [%s] state\n",
+			luo_state_str[LIVEUPDATE_STATE_FROZEN],
+			luo_current_state_str());
+		up_write(&luo_state_rwsem);
+
+		return -EINVAL;
+	}
+
+	ret = luo_do_freeze_calls();
+	if (!ret)
+		luo_set_state(LIVEUPDATE_STATE_FROZEN);
+	else
+		luo_set_state(LIVEUPDATE_STATE_NORMAL);
+
+	up_write(&luo_state_rwsem);
+
+	return ret;
+}
+
+/**
+ * luo_finish - Finalize the live update process in the new kernel.
+ *
+ * This function is called after a successful live update reboot into a new
+ * kernel, once the new kernel is ready to transition to the normal operational
+ * state. It signals the completion of the live update sequence to subsystems.
+ *
+ * @return 0 on success, ``-EAGAIN`` if the state change was cancelled by the
+ * user while waiting for the lock, or ``-EINVAL`` if the orchestrator is not in
+ * the updated state.
+ */
+int luo_finish(void)
+{
+	if (down_write_killable(&luo_state_rwsem)) {
+		pr_warn("[finish] event canceled by user\n");
+		return -EAGAIN;
+	}
+
+	if (!is_current_luo_state(LIVEUPDATE_STATE_UPDATED)) {
+		pr_warn("Can't switch to [%s] from [%s] state\n",
+			luo_state_str[LIVEUPDATE_STATE_NORMAL],
+			luo_current_state_str());
+		up_write(&luo_state_rwsem);
+
+		return -EINVAL;
+	}
+
+	luo_do_finish_calls();
+	luo_set_state(LIVEUPDATE_STATE_NORMAL);
+
+	up_write(&luo_state_rwsem);
+
+	return 0;
+}
+
+int luo_cancel(void)
+{
+	return 0;
+}
+
+void luo_state_read_enter(void)
+{
+	down_read(&luo_state_rwsem);
+}
+
+void luo_state_read_exit(void)
+{
+	up_read(&luo_state_rwsem);
+}
+
+static int __init luo_startup(void)
+{
+	__luo_set_state(LIVEUPDATE_STATE_NORMAL);
+
+	return 0;
+}
+early_initcall(luo_startup);
+
+/* Public Functions */
+
+/**
+ * liveupdate_reboot() - Kernel reboot notifier for live update final
+ * serialization.
+ *
+ * This function is invoked directly from the reboot() syscall pathway if a
+ * reboot is initiated while the live update state is %LIVEUPDATE_STATE_PREPARED
+ * (i.e., if the user did not explicitly trigger the frozen state). It handles
+ * the implicit transition into the final frozen state.
+ *
+ * It triggers the %LIVEUPDATE_FREEZE event callbacks for participating
+ * subsystems. These callbacks must perform final state saving very quickly as
+ * they execute during the blackout period just before kexec.
+ *
+ * If any %LIVEUPDATE_FREEZE callback fails, this function triggers the
+ * %LIVEUPDATE_CANCEL event for all participants to revert their state, aborts
+ * the live update, and returns an error.
+ */
+int liveupdate_reboot(void)
+{
+	if (!is_current_luo_state(LIVEUPDATE_STATE_PREPARED))
+		return 0;
+
+	return luo_freeze();
+}
+EXPORT_SYMBOL_GPL(liveupdate_reboot);
+
+/**
+ * liveupdate_state_updated - Check if the system is in the live update
+ * 'updated' state.
+ * + * This function checks if the live update orchestrator is in the + * ``LIVEUPDATE_STATE_UPDATED`` state. This state indicates that the syste= m has + * successfully rebooted into a new kernel as part of a live update, and t= he + * preserved devices are expected to be in the process of being reclaimed. + * + * This is typically used by subsystems during early boot of the new kernel + * to determine if they need to attempt to restore state from a previous + * live update. + * + * @return true if the system is in the ``LIVEUPDATE_STATE_UPDATED`` state, + * false otherwise. + */ +bool liveupdate_state_updated(void) +{ + return is_current_luo_state(LIVEUPDATE_STATE_UPDATED); +} +EXPORT_SYMBOL_GPL(liveupdate_state_updated); + +/** + * liveupdate_state_normal - Check if the system is in the live update 'no= rmal' + * state. + * + * This function checks if the live update orchestrator is in the + * ``LIVEUPDATE_STATE_NORMAL`` state. This state indicates that no live up= date + * is in progress. It represents the default operational state of the syst= em. + * + * This can be used to gate actions that should only be performed when no + * live update activity is occurring. + * + * @return true if the system is in the ``LIVEUPDATE_STATE_NORMAL`` state, + * false otherwise. + */ +bool liveupdate_state_normal(void) +{ + return is_current_luo_state(LIVEUPDATE_STATE_NORMAL); +} +EXPORT_SYMBOL_GPL(liveupdate_state_normal); + +/** + * liveupdate_enabled - Check if the live update feature is enabled. + * + * This function returns the state of the live update feature flag, which + * can be controlled via the ``liveupdate`` kernel command-line parameter. + * + * @return true if live update is enabled, false otherwise. 
+ */ +bool liveupdate_enabled(void) +{ + return luo_enabled; +} +EXPORT_SYMBOL_GPL(liveupdate_enabled); diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_inter= nal.h new file mode 100644 index 000000000000..3d10f3eb20a7 --- /dev/null +++ b/kernel/liveupdate/luo_internal.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +#ifndef _LINUX_LUO_INTERNAL_H +#define _LINUX_LUO_INTERNAL_H + +int luo_cancel(void); +int luo_prepare(void); +int luo_freeze(void); +int luo_finish(void); + +void luo_state_read_enter(void); +void luo_state_read_exit(void); + +const char *luo_current_state_str(void); + +#endif /* _LINUX_LUO_INTERNAL_H */ --=20 2.50.0.727.gbf7dc18ff4-goog From nobody Mon Oct 6 06:43:24 2025 Received: from mail-yw1-f176.google.com (mail-yw1-f176.google.com [209.85.128.176]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B7CE42FBFEE for ; Wed, 23 Jul 2025 14:47:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.176 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282051; cv=none; b=nKHh2L2QPQVEwzSM/c8o74RkJmtmGsPNu5gMcCqps044SbL+gVnHhg813ktfjXByR9l150SmPmk+zfw6aYhmcUe47CtV4y2FcPGj6u6ckj8SzRXszegUJj/S7Rh+n7N26UlEuYsILtg53W+obVMRYRSHLrkfg0f5R2vEjPLdibY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282051; c=relaxed/simple; bh=68fuMsTaKD3Wnucw9AAh9zoVaQeYptH5ixtgvH+HmAU=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=PM+qvn/2JKFe5sSBB7IWXHQuu1rseHud/e8XVbxvdeIlRB0C1pWPLr5rJBIOp6UHogXdv9yEF7+iDfsGa8RhJjnTt1AB5qaOm3cie3tlxfqCCazBYhpCoKTLn5MfhqXQL98BKp2/K6Zyx0y0IpQBkc1v2S4xjgzavpfcAANWvog= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:16 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v2 11/32] liveupdate: luo_core: integrate with KHO Date: Wed, 23 Jul 2025 14:46:24 +0000 Message-ID: 
<20250723144649.1696299-12-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog
In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
References: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Integrate LUO with the KHO framework so that LUO state can be passed
across a kexec reboot.

When LUO transitions to the "prepared" state, it tells KHO to finalize,
so that all memory segments that were added to the KHO preservation list
are preserved. Once LUO is in the "prepared" state, no new segments can
be preserved. If LUO is canceled, it also tells KHO to cancel the
serialization, so that LUO can later enter the prepared state again.

This patch introduces the following changes:
- During the KHO finalization phase, allocate an FDT blob.
- Populate this FDT with a LUO compatibility string ("luo-v1").
- Implement a KHO notifier.

LUO now depends on `CONFIG_KEXEC_HANDOVER`. The core state transition
logic (`luo_do_*_calls`) remains unimplemented in this patch.
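
For illustration only (not part of the patch): the transitions described
above can be sketched as a small user-space model. This is a simplified
sketch under stated assumptions, not the kernel implementation. All
model_* names are hypothetical, the FDT handed over through KHO is
reduced to a single flag, and the locking (luo_state_rwsem) and the
panic-on-incompatible-FDT path are left out.

```c
/* User-space model of the LUO state machine; hypothetical, not kernel code. */
#include <assert.h>
#include <errno.h>

enum model_state {
	MODEL_NORMAL,
	MODEL_PREPARED,
	MODEL_FROZEN,
	MODEL_UPDATED,
};

static enum model_state model_state = MODEL_NORMAL;
/* Stands in for the KHO-preserved FDT that carries "luo-v1". */
static int model_fdt_present;

/* normal -> prepared: the FDT is created and KHO is finalized. */
static int model_prepare(void)
{
	if (model_state != MODEL_NORMAL)
		return -EINVAL;
	model_fdt_present = 1;
	model_state = MODEL_PREPARED;
	return 0;
}

/* prepared -> frozen: final state saving just before kexec. */
static int model_freeze(void)
{
	if (model_state != MODEL_PREPARED)
		return -EINVAL;
	model_state = MODEL_FROZEN;
	return 0;
}

/* prepared or frozen -> normal: KHO is aborted and the FDT is dropped. */
static int model_cancel(void)
{
	if (model_state != MODEL_PREPARED && model_state != MODEL_FROZEN)
		return -EINVAL;
	model_fdt_present = 0;
	model_state = MODEL_NORMAL;
	return 0;
}

/* Boot of the next kernel: "updated" only if an FDT was handed over. */
static void model_boot(int fdt_handed_over)
{
	model_fdt_present = fdt_handed_over;
	model_state = fdt_handed_over ? MODEL_UPDATED : MODEL_NORMAL;
}

/* updated -> normal: the live update is complete. */
static int model_finish(void)
{
	if (model_state != MODEL_UPDATED)
		return -EINVAL;
	model_state = MODEL_NORMAL;
	return 0;
}

/* Walk one cancel cycle and one full update cycle; returns 0 on success. */
int luo_model_demo(void)
{
	model_boot(0);			/* cold boot: nothing handed over */
	assert(model_prepare() == 0);
	assert(model_cancel() == 0);	/* cancel makes prepare possible again */
	assert(model_prepare() == 0);
	assert(model_freeze() == 0);
	model_boot(1);			/* kexec: the FDT is found via KHO */
	return model_finish();
}
```

The point of the model is that cancel returns the machine to the normal
state, so a later prepare is valid again, while finish is accepted only
after a boot that actually found the handed-over FDT.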
Signed-off-by: Pasha Tatashin --- kernel/liveupdate/luo_core.c | 214 ++++++++++++++++++++++++++++++++++- 1 file changed, 211 insertions(+), 3 deletions(-) diff --git a/kernel/liveupdate/luo_core.c b/kernel/liveupdate/luo_core.c index 8cee093807ff..c80a1f188359 100644 --- a/kernel/liveupdate/luo_core.c +++ b/kernel/liveupdate/luo_core.c @@ -47,9 +47,12 @@ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt =20 #include +#include #include +#include #include #include +#include #include #include "luo_internal.h" =20 @@ -67,6 +70,21 @@ static const char *const luo_state_str[] =3D { =20 static bool luo_enabled; =20 +static void *luo_fdt_out; +static void *luo_fdt_in; + +/* + * The LUO FDT size depends on the number of participating subsystems, + * + * The current fixed size (4K) is large enough to handle reasonable number= of + * preserved entities. If this size ever becomes insufficient, it can eith= er be + * increased, or a dynamic size calculation mechanism could be implemented= in + * the future. 
+ */ +#define LUO_FDT_SIZE PAGE_SIZE +#define LUO_KHO_ENTRY_NAME "LUO" +#define LUO_COMPATIBLE "luo-v1" + static int __init early_liveupdate_param(char *buf) { return kstrtobool(buf, &luo_enabled); @@ -91,6 +109,60 @@ static inline void luo_set_state(enum liveupdate_state = state) __luo_set_state(state); } =20 +/* Called during the prepare phase, to create LUO fdt tree */ +static int luo_fdt_setup(void) +{ + void *fdt_out; + int ret; + + fdt_out =3D (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, + get_order(LUO_FDT_SIZE)); + if (!fdt_out) { + pr_err("failed to allocate FDT memory\n"); + return -ENOMEM; + } + + ret =3D fdt_create_empty_tree(fdt_out, LUO_FDT_SIZE); + if (ret) + goto exit_free; + + ret =3D fdt_setprop_string(fdt_out, 0, "compatible", LUO_COMPATIBLE); + if (ret) + goto exit_free; + + ret =3D kho_preserve_phys(__pa(fdt_out), LUO_FDT_SIZE); + if (ret) + goto exit_free; + + ret =3D kho_add_subtree(LUO_KHO_ENTRY_NAME, fdt_out); + if (ret) + goto exit_unpreserve; + luo_fdt_out =3D fdt_out; + + return 0; + +exit_unpreserve: + WARN_ON_ONCE(kho_unpreserve_phys(__pa(fdt_out), LUO_FDT_SIZE)); +exit_free: + free_pages((unsigned long)fdt_out, get_order(LUO_FDT_SIZE)); + pr_err("failed to prepare LUO FDT: %d\n", ret); + + return ret; +} + +static void luo_fdt_destroy(void) +{ + WARN_ON_ONCE(kho_unpreserve_phys(__pa(luo_fdt_out), LUO_FDT_SIZE)); + kho_remove_subtree(luo_fdt_out); + free_pages((unsigned long)luo_fdt_out, get_order(LUO_FDT_SIZE)); + luo_fdt_out =3D NULL; +} + +static int luo_do_prepare_calls(void) +{ + return 0; +} + static int luo_do_freeze_calls(void) { return 0; @@ -100,6 +172,71 @@ static void luo_do_finish_calls(void) { } =20 +static void luo_do_cancel_calls(void) +{ +} + +static int __luo_prepare(void) +{ + int ret; + + if (down_write_killable(&luo_state_rwsem)) { + pr_warn("[prepare] event canceled by user\n"); + return -EAGAIN; + } + + if (!is_current_luo_state(LIVEUPDATE_STATE_NORMAL)) { + pr_warn("Can't switch to [%s] from [%s] state\n", 
+ luo_state_str[LIVEUPDATE_STATE_PREPARED], + luo_current_state_str()); + ret =3D -EINVAL; + goto exit_unlock; + } + + ret =3D luo_fdt_setup(); + if (ret) + goto exit_unlock; + + ret =3D luo_do_prepare_calls(); + if (ret) { + luo_fdt_destroy(); + goto exit_unlock; + } + + luo_set_state(LIVEUPDATE_STATE_PREPARED); + +exit_unlock: + up_write(&luo_state_rwsem); + + return ret; +} + +static int __luo_cancel(void) +{ + if (down_write_killable(&luo_state_rwsem)) { + pr_warn("[cancel] event canceled by user\n"); + return -EAGAIN; + } + + if (!is_current_luo_state(LIVEUPDATE_STATE_PREPARED) && + !is_current_luo_state(LIVEUPDATE_STATE_FROZEN)) { + pr_warn("Can't switch to [%s] from [%s] state\n", + luo_state_str[LIVEUPDATE_STATE_NORMAL], + luo_current_state_str()); + up_write(&luo_state_rwsem); + + return -EINVAL; + } + + luo_do_cancel_calls(); + luo_fdt_destroy(); + luo_set_state(LIVEUPDATE_STATE_NORMAL); + + up_write(&luo_state_rwsem); + + return 0; +} + /* Get the current state as a string */ const char *luo_current_state_str(void) { @@ -111,9 +248,28 @@ enum liveupdate_state liveupdate_get_state(void) return READ_ONCE(luo_state); } =20 +/** + * luo_prepare - Initiate the live update preparation phase. + * + * This function is called to begin the live update process. It attempts to + * transition the luo to the ``LIVEUPDATE_STATE_PREPARED`` state. + * + * If the calls complete successfully, the orchestrator state is set + * to ``LIVEUPDATE_STATE_PREPARED``. If any call fails a + * ``LIVEUPDATE_CANCEL`` is sent to roll back any actions. + * + * @return 0 on success, ``-EAGAIN`` if the state change was cancelled by = the + * user while waiting for the lock, ``-EINVAL`` if the orchestrator is not= in + * the normal state, or a negative error code returned by the calls. 
+ */
 int luo_prepare(void)
 {
-	return 0;
+	int err = __luo_prepare();
+
+	if (err)
+		return err;
+
+	return kho_finalize();
 }
 
 /**
@@ -193,9 +349,28 @@ int luo_finish(void)
 	return 0;
 }
 
+/**
+ * luo_cancel - Cancel the ongoing live update from prepared or frozen states.
+ *
+ * This function is called to abort a live update that is currently in the
+ * ``LIVEUPDATE_STATE_PREPARED`` or ``LIVEUPDATE_STATE_FROZEN`` state.
+ *
+ * If the state is correct, it triggers the ``LIVEUPDATE_CANCEL`` notifier chain
+ * to allow subsystems to undo any actions performed during the prepare or
+ * freeze events. Finally, the orchestrator state is transitioned back to
+ * ``LIVEUPDATE_STATE_NORMAL``.
+ *
+ * @return 0 on success, ``-EAGAIN`` if interrupted while waiting for the lock,
+ * or ``-EINVAL`` if the state is neither prepared nor frozen.
+ */
 int luo_cancel(void)
 {
-	return 0;
+	int err = kho_abort();
+
+	if (err)
+		return err;
+
+	return __luo_cancel();
 }
 
 void luo_state_read_enter(void)
@@ -210,7 +385,40 @@ void luo_state_read_exit(void)
 
 static int __init luo_startup(void)
 {
-	__luo_set_state(LIVEUPDATE_STATE_NORMAL);
+	phys_addr_t fdt_phys;
+	int ret;
+
+	if (!kho_is_enabled()) {
+		if (luo_enabled)
+			pr_warn("Disabling liveupdate because KHO is disabled\n");
+		luo_enabled = false;
+		return 0;
+	}
+
+	/*
+	 * Retrieve LUO subtree, and verify its format. Panic in case of
+	 * exceptions, since machine devices and memory are in an unpredictable
+	 * state.
+ */ + ret =3D kho_retrieve_subtree(LUO_KHO_ENTRY_NAME, &fdt_phys); + if (ret) { + if (ret !=3D -ENOENT) { + panic("failed to retrieve FDT '%s' from KHO: %d\n", + LUO_KHO_ENTRY_NAME, ret); + } + __luo_set_state(LIVEUPDATE_STATE_NORMAL); + + return 0; + } + + luo_fdt_in =3D __va(fdt_phys); + ret =3D fdt_node_check_compatible(luo_fdt_in, 0, LUO_COMPATIBLE); + if (ret) { + panic("FDT '%s' is incompatible with '%s' [%d]\n", + LUO_KHO_ENTRY_NAME, LUO_COMPATIBLE, ret); + } + + __luo_set_state(LIVEUPDATE_STATE_UPDATED); =20 return 0; } --=20 2.50.0.727.gbf7dc18ff4-goog From nobody Mon Oct 6 06:43:24 2025 Received: from mail-yw1-f173.google.com (mail-yw1-f173.google.com [209.85.128.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 928692F546C for ; Wed, 23 Jul 2025 14:47:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.173 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282048; cv=none; b=ZsjDmeUR3RGR79A98dE3gkF2CX9NI35hKY7ZKjyZW/KVeWViAV3p2+rt4AFbECXPCIVDB1Hg+mkF90dgjuCbeT5KOBFzz6iOoA5/p2PioHPwufU6aBqC8w6gX3YytCk46x1lTR9s79yRJiXNSQYpkOxo0j1ue+Ol+JQGIaP7n90= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282048; c=relaxed/simple; bh=LCBnkF05BFy67Dj3X9soES2dtoYMqCQz73KyCOTgCrI=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=hyz3xd70kIdOhzmko9YdyiqL7EWMgq8omiZtCYXrafuAjOcn7cV2zfgp2mIMdplv+uE/vAmScPknxWDzajHc/ioEOexWiUjYVBhz8+aWq/TX8hhwCXpv3XBdzM8IUJuOy8Z28ue94NYgz6vmd1KvkJBOVi5MP5Db0E7XTR7Ic/A= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=m5DGXr6N; arc=none smtp.client-ip=209.85.128.173 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:18 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v2 12/32] liveupdate: luo_subsystems: add subsystem registration Date: Wed, 23 Jul 2025 14:46:25 +0000 Message-ID: 
<20250723144649.1696299-13-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog
In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
References: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Introduce the framework for kernel subsystems (e.g., KVM, IOMMU, device
drivers) to register with LUO and participate in the live update process
via callbacks.

Subsystem Registration:
- Defines struct liveupdate_subsystem in linux/liveupdate.h, which
  subsystems use to provide their name and optional callbacks (prepare,
  freeze, cancel, finish). The prepare and freeze callbacks accept a
  u64 *data for passing state/handles; cancel and finish receive the
  stored u64 value.
- Exports liveupdate_register_subsystem() and
  liveupdate_unregister_subsystem() API functions.
- Adds kernel/liveupdate/luo_subsystems.c to manage a list of registered
  subsystems. Registration/unregistration is restricted to specific LUO
  states (NORMAL/UPDATED).

Callback Framework:
- The main luo_core.c state transition functions now delegate to new
  luo_do_subsystems_*_calls() functions defined in luo_subsystems.c.
- These new functions are intended to iterate through the registered
  subsystems and invoke their corresponding callbacks.

FDT Integration:
- Adds a /subsystems subnode within the main LUO FDT created in
  luo_core.c. This node has its own compatibility string
  (subsystems-v1).
- luo_subsystems_fdt_setup() populates this node by adding a property
  for each registered subsystem, using the subsystem's name. Currently,
  these properties are initialized with a placeholder u64 value (0).
- luo_subsystems_startup() is called from luo_core.c on boot to find and
  validate the /subsystems node in the FDT received via KHO. It panics
  if the node is missing or incompatible.
- Adds a stub API function liveupdate_get_subsystem_data() intended for subsystems to retrieve their persisted u64 data from the FDT in the new kernel. Signed-off-by: Pasha Tatashin --- include/linux/liveupdate.h | 61 +++++++ kernel/liveupdate/Makefile | 1 + kernel/liveupdate/luo_core.c | 19 +- kernel/liveupdate/luo_internal.h | 7 + kernel/liveupdate/luo_subsystems.c | 284 +++++++++++++++++++++++++++++ 5 files changed, 370 insertions(+), 2 deletions(-) create mode 100644 kernel/liveupdate/luo_subsystems.c diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h index da8f05c81e51..fed68b9ab32b 100644 --- a/include/linux/liveupdate.h +++ b/include/linux/liveupdate.h @@ -88,6 +88,47 @@ enum liveupdate_state { LIVEUPDATE_STATE_UPDATED =3D 4, }; =20 +/** + * struct liveupdate_subsystem_ops - LUO events callback functions + * @prepare: Optional. Called during LUO prepare phase. Should perform + * preparatory actions and can store a u64 handle/state + * via the 'data' pointer for use in later callbacks. + * Return 0 on success, negative error code on failure. + * @freeze: Optional. Called during LUO freeze event (before actual = jump + * to new kernel). Should perform final state saving action= s and + * can update the u64 handle/state via the 'data' pointer. = Return + * 0 on success, negative error code on failure. + * @cancel: Optional. Called if the live update process is canceled = after + * prepare (or freeze) was called. Receives the u64 data + * set by prepare/freeze. Used for cleanup. + * @finish: Optional. Called after the live update is finished in th= e new + * kernel. + * Receives the u64 data set by prepare/freeze. Used for cl= eanup.
+ */ +struct liveupdate_subsystem_ops { + int (*prepare)(void *arg, u64 *data); + int (*freeze)(void *arg, u64 *data); + void (*cancel)(void *arg, u64 data); + void (*finish)(void *arg, u64 data); +}; + +/** + * struct liveupdate_subsystem - Represents a subsystem participating in L= UO + * @ops: Callback functions + * @name: Unique name identifying the subsystem. + * @arg: Add this argument to callback functions. + * @list: List head used internally by LUO. Should not be modified= by + * caller after registration. + * @private_data: For LUO internal use, cached value of data field. + */ +struct liveupdate_subsystem { + const struct liveupdate_subsystem_ops *ops; + const char *name; + void *arg; + struct list_head list; + u64 private_data; +}; + #ifdef CONFIG_LIVEUPDATE =20 /* Return true if live update orchestrator is enabled */ @@ -109,6 +150,10 @@ bool liveupdate_state_normal(void); =20 enum liveupdate_state liveupdate_get_state(void); =20 +int liveupdate_register_subsystem(struct liveupdate_subsystem *h); +int liveupdate_unregister_subsystem(struct liveupdate_subsystem *h); +int liveupdate_get_subsystem_data(struct liveupdate_subsystem *h, u64 *dat= a); + #else /* CONFIG_LIVEUPDATE */ =20 static inline int liveupdate_reboot(void) @@ -136,5 +181,21 @@ static inline enum liveupdate_state liveupdate_get_sta= te(void) return LIVEUPDATE_STATE_NORMAL; } =20 +static inline int liveupdate_register_subsystem(struct liveupdate_subsyste= m *h) +{ + return 0; +} + +static inline int liveupdate_unregister_subsystem(struct liveupdate_subsys= tem *h) +{ + return 0; +} + +static inline int liveupdate_get_subsystem_data(struct liveupdate_subsyste= m *h, + u64 *data) +{ + return -ENODATA; +} + #endif /* CONFIG_LIVEUPDATE */ #endif /* _LINUX_LIVEUPDATE_H */ diff --git a/kernel/liveupdate/Makefile b/kernel/liveupdate/Makefile index b3c72c405780..999208a1fdbb 100644 --- a/kernel/liveupdate/Makefile +++ b/kernel/liveupdate/Makefile @@ -6,3 +6,4 @@ obj-$(CONFIG_KEXEC_HANDOVER) +=3D 
kexec_handover.o obj-$(CONFIG_KEXEC_HANDOVER_DEBUG) +=3D kexec_handover_debug.o obj-$(CONFIG_LIVEUPDATE) +=3D luo_core.o +obj-$(CONFIG_LIVEUPDATE) +=3D luo_subsystems.o diff --git a/kernel/liveupdate/luo_core.c b/kernel/liveupdate/luo_core.c index c80a1f188359..fff84c51d986 100644 --- a/kernel/liveupdate/luo_core.c +++ b/kernel/liveupdate/luo_core.c @@ -130,6 +130,10 @@ static int luo_fdt_setup(void) if (ret) goto exit_free; =20 + ret =3D luo_subsystems_fdt_setup(fdt_out); + if (ret) + goto exit_free; + ret =3D kho_preserve_phys(__pa(fdt_out), LUO_FDT_SIZE); if (ret) goto exit_free; @@ -160,20 +164,30 @@ static void luo_fdt_destroy(void) =20 static int luo_do_prepare_calls(void) { - return 0; + int ret; + + ret =3D luo_do_subsystems_prepare_calls(); + + return ret; } =20 static int luo_do_freeze_calls(void) { - return 0; + int ret; + + ret =3D luo_do_subsystems_freeze_calls(); + + return ret; } =20 static void luo_do_finish_calls(void) { + luo_do_subsystems_finish_calls(); } =20 static void luo_do_cancel_calls(void) { + luo_do_subsystems_cancel_calls(); } =20 static int __luo_prepare(void) @@ -419,6 +433,7 @@ static int __init luo_startup(void) } =20 __luo_set_state(LIVEUPDATE_STATE_UPDATED); + luo_subsystems_startup(luo_fdt_in); =20 return 0; } diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_inter= nal.h index 3d10f3eb20a7..98bf799adb61 100644 --- a/kernel/liveupdate/luo_internal.h +++ b/kernel/liveupdate/luo_internal.h @@ -18,4 +18,11 @@ void luo_state_read_exit(void); =20 const char *luo_current_state_str(void); =20 +void luo_subsystems_startup(void *fdt); +int luo_subsystems_fdt_setup(void *fdt); +int luo_do_subsystems_prepare_calls(void); +int luo_do_subsystems_freeze_calls(void); +void luo_do_subsystems_finish_calls(void); +void luo_do_subsystems_cancel_calls(void); + #endif /* _LINUX_LUO_INTERNAL_H */ diff --git a/kernel/liveupdate/luo_subsystems.c b/kernel/liveupdate/luo_sub= systems.c new file mode 100644 index 
000000000000..436929a17de0 --- /dev/null +++ b/kernel/liveupdate/luo_subsystems.c @@ -0,0 +1,284 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +/** + * DOC: LUO Subsystems support + * + * Various kernel subsystems register with the Live Update Orchestrator to + * participate in the live update process. These subsystems are notified at + * different stages of the live update sequence, allowing them to serialize + * device state before the reboot and restore it afterwards. Examples incl= ude + * the device layer, interrupt controllers, KVM, IOMMU, and specific device + * drivers. + */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include +#include +#include +#include +#include +#include "luo_internal.h" + +#define LUO_SUBSYSTEMS_NODE_NAME "subsystems" +#define LUO_SUBSYSTEMS_COMPATIBLE "subsystems-v1" + +static DEFINE_MUTEX(luo_subsystem_list_mutex); +static LIST_HEAD(luo_subsystems_list); +static void *luo_fdt_out; +static void *luo_fdt_in; + +/** + * luo_subsystems_fdt_setup - Adds and populates the 'subsystems' node in = the + * FDT. + * @fdt: Pointer to the LUO FDT blob. + * + * Add subsystems node and each subsystem to the LUO FDT blob. + * + * Returns: 0 on success, negative errno on failure. 
+ */ +int luo_subsystems_fdt_setup(void *fdt) +{ + struct liveupdate_subsystem *subsystem; + const u64 zero_data =3D 0; + int ret, node_offset; + + ret =3D fdt_add_subnode(fdt, 0, LUO_SUBSYSTEMS_NODE_NAME); + if (ret < 0) + goto exit_error; + + node_offset =3D ret; + ret =3D fdt_setprop_string(fdt, node_offset, "compatible", + LUO_SUBSYSTEMS_COMPATIBLE); + if (ret < 0) + goto exit_error; + + list_for_each_entry(subsystem, &luo_subsystems_list, list) { + ret =3D fdt_setprop(fdt, node_offset, subsystem->name, + &zero_data, sizeof(zero_data)); + if (ret < 0) + goto exit_error; + } + + luo_fdt_out =3D fdt; + return 0; +exit_error: + pr_err("Failed to setup 'subsystems' node to FDT: %s\n", + fdt_strerror(ret)); + return -ENOSPC; +} + +/** + * luo_subsystems_startup - Validates the LUO subsystems FDT node at start= up. + * @fdt: Pointer to the LUO FDT blob passed from the previous kernel. + * + * This __init function checks the existence and validity of the '/subsyst= ems' + * node in the FDT. This node is considered mandatory. It calls panic() if + * the node is missing, inaccessible, or invalid (e.g., missing compatible, + * wrong compatible string), indicating a critical configuration error for= LUO. + */ +void __init luo_subsystems_startup(void *fdt) +{ + int ret, node_offset; + + node_offset =3D fdt_subnode_offset(fdt, 0, LUO_SUBSYSTEMS_NODE_NAME); + if (node_offset < 0) + panic("Failed to find /subsystems node\n"); + + ret =3D fdt_node_check_compatible(fdt, node_offset, + LUO_SUBSYSTEMS_COMPATIBLE); + if (ret) { + panic("FDT '%s' is incompatible with '%s' [%d]\n", + LUO_SUBSYSTEMS_NODE_NAME, LUO_SUBSYSTEMS_COMPATIBLE, ret); + } + luo_fdt_in =3D fdt; +} + +/** + * luo_do_subsystems_prepare_calls - Calls prepare callbacks and updates F= DT + * if all prepares succeed. Handles cancellation on failure. + * + * Phase 1: Calls 'prepare' for all subsystems and stores results temporar= ily. 
+ * If any 'prepare' fails, calls 'cancel' on previously prepared subsystems + * and returns the error. + * Phase 2: If all 'prepare' calls succeeded, writes the stored data to th= e FDT. + * If any FDT write fails, calls 'cancel' on *all* prepared subsystems and + * returns the FDT error. + * + * Returns: 0 on success. Negative errno on failure. + */ +int luo_do_subsystems_prepare_calls(void) +{ + return 0; +} + +/** + * luo_do_subsystems_freeze_calls - Calls freeze callbacks and updates FDT + * if all freezes succeed. Handles cancellation on failure. + * + * Phase 1: Calls 'freeze' for all subsystems and stores results temporari= ly. + * If any 'freeze' fails, calls 'cancel' on previously called subsystems + * and returns the error. + * Phase 2: If all 'freeze' calls succeeded, writes the stored data to the= FDT. + * If any FDT write fails, calls 'cancel' on *all* subsystems and + * returns the FDT error. + * + * Returns: 0 on success. Negative errno on failure. + */ +int luo_do_subsystems_freeze_calls(void) +{ + return 0; +} + +/** + * luo_do_subsystems_finish_calls - Calls finish callbacks for all subsyst= ems. + * + * This function is called at the end of live update cycle to do the final + * clean-up or housekeeping of the post-live update states. + */ +void luo_do_subsystems_finish_calls(void) +{ +} + +/** + * luo_do_subsystems_cancel_calls - Calls cancel callbacks for all subsyst= ems. + * + * This function is typically called when the live update process needs to= be + * aborted externally, for example, after the prepare phase may have run b= ut + * before actual reboot. It iterates through all registered subsystems and= calls + * the 'cancel' callback for those that implement it and likely completed + * prepare.
+ */ +void luo_do_subsystems_cancel_calls(void) +{ +} + +/** + * liveupdate_register_subsystem - Register a kernel subsystem handler wit= h LUO + * @h: Pointer to the liveupdate_subsystem structure allocated and populat= ed + * by the calling subsystem. + * + * Registers a subsystem handler that provides callbacks for different eve= nts + * of the live update cycle. Registration is typically done during the + * subsystem's module init or core initialization. + * + * Can only be called when LUO is in the NORMAL or UPDATED states. + * The provided name (@h->name) must be unique among registered subsystems. + * + * Return: 0 on success, negative error code otherwise. + */ +int liveupdate_register_subsystem(struct liveupdate_subsystem *h) +{ + struct liveupdate_subsystem *iter; + int ret =3D 0; + + luo_state_read_enter(); + if (!liveupdate_state_normal() && !liveupdate_state_updated()) { + luo_state_read_exit(); + return -EBUSY; + } + + mutex_lock(&luo_subsystem_list_mutex); + list_for_each_entry(iter, &luo_subsystems_list, list) { + if (iter =3D=3D h) { + pr_warn("Subsystem '%s' (%p) already registered.\n", + h->name, h); + ret =3D -EEXIST; + goto out_unlock; + } + + if (!strcmp(iter->name, h->name)) { + pr_err("Subsystem with name '%s' already registered.\n", + h->name); + ret =3D -EEXIST; + goto out_unlock; + } + } + + INIT_LIST_HEAD(&h->list); + list_add_tail(&h->list, &luo_subsystems_list); + +out_unlock: + mutex_unlock(&luo_subsystem_list_mutex); + luo_state_read_exit(); + + return ret; +} +EXPORT_SYMBOL_GPL(liveupdate_register_subsystem); + +/** + * liveupdate_unregister_subsystem - Unregister a kernel subsystem handler= from + * LUO + * @h: Pointer to the same liveupdate_subsystem structure that was used du= ring + * registration. + * + * Unregisters a previously registered subsystem handler. Typically called + * during module exit or subsystem teardown. 
LUO removes the structure fro= m its + * internal list; the caller is responsible for any necessary memory clean= up + * of the structure itself. + * + * Return: 0 on success, negative error code otherwise. + * -EINVAL if h is NULL. + * -ENOENT if the specified handler @h is not found in the registration li= st. + * -EBUSY if LUO is not in the NORMAL or UPDATED state. + */ +int liveupdate_unregister_subsystem(struct liveupdate_subsystem *h) +{ + struct liveupdate_subsystem *iter; + bool found =3D false; + int ret =3D 0; + + luo_state_read_enter(); + if (!liveupdate_state_normal() && !liveupdate_state_updated()) { + luo_state_read_exit(); + return -EBUSY; + } + + mutex_lock(&luo_subsystem_list_mutex); + list_for_each_entry(iter, &luo_subsystems_list, list) { + if (iter =3D=3D h) { + found =3D true; + break; + } + } + + if (found) { + list_del_init(&h->list); + } else { + pr_warn("Subsystem handler '%s' not found for unregistration.\n", + h->name); + ret =3D -ENOENT; + } + + mutex_unlock(&luo_subsystem_list_mutex); + luo_state_read_exit(); + + return ret; +} +EXPORT_SYMBOL_GPL(liveupdate_unregister_subsystem); + +/** + * liveupdate_get_subsystem_data - Retrieve raw private data for a subsyst= em + * from FDT. + * @h: Pointer to the liveupdate_subsystem structure representing the + * subsystem instance. The 'name' field is used to find the property. + * @data: Output pointer where the subsystem's raw private u64 data will= be + * stored via memcpy. + * + * Reads the 8-byte data property associated with the subsystem @h->name + * directly from the '/subsystems' node within the globally accessible + * 'luo_fdt_in' blob. Returns appropriate error codes if inputs are invali= d, or + * nodes/properties are missing or invalid. + * + * Return: 0 on success. -ENOENT on error.
+ */ +int liveupdate_get_subsystem_data(struct liveupdate_subsystem *h, u64 *dat= a) +{ + return 0; +} +EXPORT_SYMBOL_GPL(liveupdate_get_subsystem_data); --=20 2.50.0.727.gbf7dc18ff4-goog From nobody Mon Oct 6 06:43:24 2025 Received: from mail-yw1-f174.google.com (mail-yw1-f174.google.com [209.85.128.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3F8D82F5482 for ; Wed, 23 Jul 2025 14:47:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.174 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282049; cv=none; b=tXfnVz5WhOr2RgvQjXfSEErdq6nTzbvlwoyEHR5gWKIBEGOoWmHdf/RvqbOf/icigwxV9YDvbJvzxkVjFocXDQQDVjuA1ZNwA1BCLCZN7WaaBK+QAhEleJo91BOqOZFYFFJK40d6UtQVZM+/XtUQWtnE8QMc/OoLjoEK+G0pUqU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282049; c=relaxed/simple; bh=h1L0cbeVfJxO9kghsvJ6bWhgk+o6POMPbs4sljzTVS0=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=FM7spni7fQ3K9JXSj0wgV0ecISHkzflIEItKH/gAYGK5DTGbRHO3bREJm3QHFtqMakC3hiO+ryQfVLZrM9HVhuLXVTcCt6oImmZZWNRvcoJp1tRMn/p3MD2noAssxl+hCgMj80nXuDJuEUp8HHICFTG62R8gEv/6WoQz695fzac= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=ksU/galY; arc=none smtp.client-ip=209.85.128.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="ksU/galY" Received: by 
mail-yw1-f174.google.com with SMTP id 00721157ae682-70e75f30452so44735557b3.2 for ; Wed, 23 Jul 2025 07:47:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282040; x=1753886840; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=5UX19BNUcwB6TjSqCz6om7S5yVRpVe4SwKywfTR0ocE=; b=ksU/galYA/CnJq8mbRbvGlg59PEgswL5MSxvTNKED3a9CycFHuIrRmC5UxRuqmDGly iWfsXZayuhJXIgX0oE3b22ErGFEXCi/xhmYA8Cw1EWD7IYr2T11ke5HbYZ/QtAUZ/Ngb 5BnmuBrOyjAxngM0/m3FI8OtwwNBFEAGQGNn/peEU314L6Ts1lamt7GhzatA3+cOfUcS S3EVwR8xobm0jOqUmXDM2pvW5IbL/nHqDu2OEK1/GFhpVuO2X6x7g9ybz0yumLF098gY QDYnabfzwDM/IszMbrfH2lXi5UAhx+hEgnXWPzZo3KVJeNaIb3J4fwizMH4EjZCWOBE0 t0XQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282040; x=1753886840; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=5UX19BNUcwB6TjSqCz6om7S5yVRpVe4SwKywfTR0ocE=; b=hrNOwOFSUb0waHrzNh471zaMzaxJI1EoS2xPwv+RwUoe9hf+y2U0pLnwI2xhS+l0TQ GkmLQju1w4DGpaFQuQKfxSaB/yE5CRbdW+xFq08ht3lEjIsXC3FWIT8jJGFlN0PxI+Fc V1eR0aNe/a9ESj0bF1I13pvONFF2+wF7FTHPt3bORCbtqEdCDCZ2Yr58tOYtFGcQqee2 khaixb9jv7+4MEp6ZGnx/vwq05CEsFbQ8+M4Fw4NU94ORPOoBr75QM71YWgdJLSocGVs vHFbR+aGezDMaAuIq4vJCnjHtU+D7AI2DpBHItEiMX5i5S9xm4rFlVi+m8o+gc9Y8tlw G+Ag== X-Forwarded-Encrypted: i=1; AJvYcCV/3nbTCQxKpp3Gmt3hnWHG7I/M9cZqVnY9KQSXEUlhKGTHg1yTQFGEW7GkDSM77754M/y16wwwkI4TDl4=@vger.kernel.org X-Gm-Message-State: AOJu0YzQLf2pF545PVSpRiCGidruR/r90dWvITUdOGfCbRdrYK6idcUu kXoowfkr3ngPFx01LXhBB+xTOQeruH/b4UzuukFIKdIOXUOuIfD/7Ct5SUzfEMGnXRU= X-Gm-Gg: ASbGncvKXVE0Wcy3/lpCYsMXoiupupmRdrSbivoPB3AJAg3LfpIi5BSt7VAKKGCjaig NcDqsbhomgHSc+7/Ta0Y0iYAZAZ8b3jYBDh40nCqe3XsXANY/5pxt999vXhxhDk5or5uqH1PfOx 
r3w2H1jdTlBEW9MuDiyN55j09gPIHSxj9nvy3xavw6S6M5QfvsRKDlW1rX3c/buohUu5I9OyQvV 1fahcDCGzrkuvW4DLeD+ESzq2EcnhD/QNAvoAvB9AtUh6Wp4g/RUSHa9Nqe5I0IZ1FakOUHwESG e+3Cc+F6hrJZI2BROiWJ705rZIaRvuvl9XW7N9hpXEOSeoO/yAB3tkmivEi4ImqRmUIIgxl0Mdm y5cEymUWpEChtExpkBx8AmAqB/YOVNfK3HlETr+jijOmrQJQD4Y5qQVsBV6viRoJOPLzGc/qzB2 DMUKKSGDzPm29hVQ== X-Google-Smtp-Source: AGHT+IHeWCg5lvgS3ayNuzJMHqUh46fT1SJdtuE/GRTZAvZI/FWwWLnu+spwLbexNESx5UBQY3MOUg== X-Received: by 2002:a05:690c:7202:b0:710:f2a1:fa6 with SMTP id 00721157ae682-719b4335d19mr40934067b3.29.1753282040576; Wed, 23 Jul 2025 07:47:20 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. [34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:19 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, 
yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v2 13/32] liveupdate: luo_subsystems: implement subsystem callbacks Date: Wed, 23 Jul 2025 14:46:26 +0000 Message-ID: <20250723144649.1696299-14-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com> References: <20250723144649.1696299-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Implement the core logic within luo_subsystems.c to handle the invocation of registered subsystem callbacks and manage the persistence of their state via the LUO FDT. This replaces the stub implementations from the previous patch. This completes the core mechanism enabling subsystems to actively participate in the LUO state machine, execute phase-specific logic, and persist/restore a u64 state across the live update transition using the FDT. 
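For illustration only, the prepare/cancel flow described above can be modelled in userspace roughly as follows. This is a sketch under stated assumptions, not the kernel code: struct subsys, cancel_until(), run_prepare() and the demo callbacks are hypothetical stand-ins for struct liveupdate_subsystem, __luo_do_subsystems_cancel_calls() and luo_do_subsystems_prepare_calls(); a plain array replaces the kernel list, and the FDT commit step is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins; not the kernel structures. */
struct subsys_ops {
	int (*prepare)(void *arg, uint64_t *data);
	void (*cancel)(void *arg, uint64_t data);
};

struct subsys {
	const struct subsys_ops *ops;
	const char *name;
	void *arg;
	uint64_t private_data;
};

/* Cancel every entry that precedes 'boundary'; NULL means cancel all. */
static void cancel_until(struct subsys *list, size_t n,
			 const struct subsys *boundary)
{
	for (size_t i = 0; i < n; i++) {
		if (&list[i] == boundary)
			break;
		if (list[i].ops->cancel)
			list[i].ops->cancel(list[i].arg, list[i].private_data);
		list[i].private_data = 0;
	}
}

/* Call prepare in order; on failure, unwind the entries already prepared. */
static int run_prepare(struct subsys *list, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int ret;

		if (!list[i].ops->prepare)
			continue;
		ret = list[i].ops->prepare(list[i].arg, &list[i].private_data);
		if (ret < 0) {
			cancel_until(list, n, &list[i]);
			return ret;
		}
	}
	return 0;
}

static int cancel_count;

static int ok_prepare(void *arg, uint64_t *data)
{
	(void)arg;
	*data = 0x1000;		/* pretend handle to the saved state */
	return 0;
}

static int bad_prepare(void *arg, uint64_t *data)
{
	(void)arg;
	(void)data;
	return -EIO;
}

static void count_cancel(void *arg, uint64_t data)
{
	(void)arg;
	if (data == 0x1000)	/* only count entries that were prepared */
		cancel_count++;
}

/* Returns how many cancel callbacks ran after the third prepare failed. */
int demo_failed_prepare(void)
{
	static const struct subsys_ops ok = { ok_prepare, count_cancel };
	static const struct subsys_ops bad = { bad_prepare, count_cancel };
	struct subsys list[] = {
		{ &ok, "a", NULL, 0 },
		{ &ok, "b", NULL, 0 },
		{ &bad, "c", NULL, 0 },
		{ &ok, "d", NULL, 0 },
	};
	int ret;

	cancel_count = 0;
	ret = run_prepare(list, 4);
	assert(ret == -EIO);
	assert(list[0].private_data == 0 && list[1].private_data == 0);
	return cancel_count;
}
```

In this model the failing entry is used as the boundary, so only the entries before it are cancelled and their cached u64 is reset to 0; the failing entry itself and anything after it are left untouched. Passing NULL as the boundary cancels every entry, which corresponds to the case where all callbacks succeeded but a later step failed.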
Signed-off-by: Pasha Tatashin --- kernel/liveupdate/luo_subsystems.c | 140 ++++++++++++++++++++++++++++- 1 file changed, 138 insertions(+), 2 deletions(-) diff --git a/kernel/liveupdate/luo_subsystems.c b/kernel/liveupdate/luo_sub= systems.c index 436929a17de0..0e0070d01584 100644 --- a/kernel/liveupdate/luo_subsystems.c +++ b/kernel/liveupdate/luo_subsystems.c @@ -99,6 +99,66 @@ void __init luo_subsystems_startup(void *fdt) luo_fdt_in =3D fdt; } =20 +static void __luo_do_subsystems_cancel_calls(struct liveupdate_subsystem *= boundary_subsystem) +{ + struct liveupdate_subsystem *subsystem; + + list_for_each_entry(subsystem, &luo_subsystems_list, list) { + if (subsystem =3D=3D boundary_subsystem) + break; + + if (subsystem->ops->cancel) { + subsystem->ops->cancel(subsystem->arg, + subsystem->private_data); + } + subsystem->private_data =3D 0; + } +} + +static void luo_subsystems_retrieve_data_from_fdt(void) +{ + struct liveupdate_subsystem *subsystem; + int node_offset, prop_len; + const void *prop; + + if (!luo_fdt_in) + return; + + node_offset =3D fdt_subnode_offset(luo_fdt_in, 0, + LUO_SUBSYSTEMS_NODE_NAME); + list_for_each_entry(subsystem, &luo_subsystems_list, list) { + prop =3D fdt_getprop(luo_fdt_in, node_offset, + subsystem->name, &prop_len); + + if (!prop || prop_len !=3D sizeof(u64)) { + panic("In FDT node '/%s' can't find property '%s': %s\n", + LUO_SUBSYSTEMS_NODE_NAME, subsystem->name, + fdt_strerror(node_offset)); + } + memcpy(&subsystem->private_data, prop, sizeof(u64)); + } +} + +static int luo_subsystems_commit_data_to_fdt(void) +{ + struct liveupdate_subsystem *subsystem; + int ret, node_offset; + + node_offset =3D fdt_subnode_offset(luo_fdt_out, 0, + LUO_SUBSYSTEMS_NODE_NAME); + list_for_each_entry(subsystem, &luo_subsystems_list, list) { + ret =3D fdt_setprop(luo_fdt_out, node_offset, subsystem->name, + &subsystem->private_data, sizeof(u64)); + if (ret < 0) { + pr_err("Failed to set FDT property for subsystem '%s' %s\n", + subsystem->name, 
fdt_strerror(ret)); + return -ENOENT; + } + } + + return 0; +} + /** * luo_do_subsystems_prepare_calls - Calls prepare callbacks and updates F= DT * if all prepares succeed. Handles cancellation on failure. @@ -114,7 +174,29 @@ void __init luo_subsystems_startup(void *fdt) */ int luo_do_subsystems_prepare_calls(void) { - return 0; + struct liveupdate_subsystem *subsystem; + int ret; + + list_for_each_entry(subsystem, &luo_subsystems_list, list) { + if (!subsystem->ops->prepare) + continue; + + ret =3D subsystem->ops->prepare(subsystem->arg, + &subsystem->private_data); + if (ret < 0) { + pr_err("Subsystem '%s' prepare callback failed [%d]\n", + subsystem->name, ret); + __luo_do_subsystems_cancel_calls(subsystem); + + return ret; + } + } + + ret =3D luo_subsystems_commit_data_to_fdt(); + if (ret) + __luo_do_subsystems_cancel_calls(NULL); + + return ret; } =20 /** @@ -132,7 +214,29 @@ int luo_do_subsystems_prepare_calls(void) */ int luo_do_subsystems_freeze_calls(void) { - return 0; + struct liveupdate_subsystem *subsystem; + int ret; + + list_for_each_entry(subsystem, &luo_subsystems_list, list) { + if (!subsystem->ops->freeze) + continue; + + ret =3D subsystem->ops->freeze(subsystem->arg, + &subsystem->private_data); + if (ret < 0) { + pr_err("Subsystem '%s' freeze callback failed [%d]\n", + subsystem->name, ret); + __luo_do_subsystems_cancel_calls(subsystem); + + return ret; + } + } + + ret =3D luo_subsystems_commit_data_to_fdt(); + if (ret) + __luo_do_subsystems_cancel_calls(NULL); + + return ret; } =20 /** @@ -143,6 +247,17 @@ int luo_do_subsystems_freeze_calls(void) */ void luo_do_subsystems_finish_calls(void) { + struct liveupdate_subsystem *subsystem; + + luo_subsystems_retrieve_data_from_fdt(); + + list_for_each_entry(subsystem, &luo_subsystems_list, list) { + if (subsystem->ops->finish) { + subsystem->ops->finish(subsystem->arg, + subsystem->private_data); + } + subsystem->private_data =3D 0; + } } =20 /** @@ -156,6 +271,8 @@ void 
luo_do_subsystems_finish_calls(void) */ void luo_do_subsystems_cancel_calls(void) { + __luo_do_subsystems_cancel_calls(NULL); + luo_subsystems_commit_data_to_fdt(); } =20 /** @@ -279,6 +396,25 @@ EXPORT_SYMBOL_GPL(liveupdate_unregister_subsystem); */ int liveupdate_get_subsystem_data(struct liveupdate_subsystem *h, u64 *dat= a) { + int node_offset, prop_len; + const void *prop; + + luo_state_read_enter(); + if (WARN_ON_ONCE(!luo_fdt_in || !liveupdate_state_updated())) { + luo_state_read_exit(); + return -ENOENT; + } + + node_offset =3D fdt_subnode_offset(luo_fdt_in, 0, + LUO_SUBSYSTEMS_NODE_NAME); + prop =3D fdt_getprop(luo_fdt_in, node_offset, h->name, &prop_len); + if (!prop || prop_len !=3D sizeof(u64)) { + luo_state_read_exit(); + return -ENOENT; + } + memcpy(data, prop, sizeof(u64)); + luo_state_read_exit(); + return 0; } EXPORT_SYMBOL_GPL(liveupdate_get_subsystem_data); --=20 2.50.0.727.gbf7dc18ff4-goog From nobody Mon Oct 6 06:43:24 2025 Received: from mail-yw1-f181.google.com (mail-yw1-f181.google.com [209.85.128.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3576F2FA633 for ; Wed, 23 Jul 2025 14:47:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.181 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282051; cv=none; b=ugat3Xr/dMkRgI0oycKGcpbRF98rwFU0uMR5X+rrP7n745765B+dbzS7GVqmpBe7VkcKpYWeJAyD1oMm71TUI2byZgara72jvKW5x1r7yVmRF+3AmlIqnVUEDvu3ioowUSK+UriVVsQEZ/LenOE5pqlUJ6/KSsWIa/mxPekXEvA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282051; c=relaxed/simple; bh=JEBv7ujCg/6ppGBPUrQkC9q5/Jniw7vqQmsSqDtLHdM=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=qcm2dBQ9XgfnU+m8kxWJg579V6IV/XhdhsglepNeDDnDkJLI7PYtrV04xfkg5tXNo9lF1EhOtj9CoB8g249No7aV/NcVpDzq2hKtozggijkJYEQ6aWt9H8QxaVVH+kRsWjxV4kY/wCGrQ/lC3PAEpZ/62QfV+J+aPf6usctpOO8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=sVqtCdoW; arc=none smtp.client-ip=209.85.128.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="sVqtCdoW" Received: by mail-yw1-f181.google.com with SMTP id 00721157ae682-710e344bbf9so64515447b3.2 for ; Wed, 23 Jul 2025 07:47:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282043; x=1753886843; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=LimjlICD6sUHgsbY/Y2oVH6YvmC/VDcU5uGtC2pfHEg=; b=sVqtCdoWTHjjGKkk8r5dwE3Ux+7PpxlQBnAPWzMGxbpZxa3xLQXBjq6UgF9pWLn+QG v9EQo5kshx9OPMjmjlp5q9bqk0qEA3pDBdWnKZfnGIEG3zjsz/WjbsUSEPc9wDStwoja cMTc8cxXHCZ7hJHwYsCCY3Uy0idZsKjh52et9fDeT34w7oGp0ze2i1RPB3m4JxMIBdjC fOW4rvO5CeUH0AMoMYAdjDxX6KcuxoOZ4oOMU+6pjbCPmviSLe/6eoS2XxeumL/EAHir phUwGtYtPrYISYWK1uElY1t6tts4hXDy4U5Kpf2rm8avfLc7+sFCCMHchOQ4CEznK3zF kduw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282043; x=1753886843; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; 
bh=LimjlICD6sUHgsbY/Y2oVH6YvmC/VDcU5uGtC2pfHEg=; b=PE3oXl+P3HyAQub3eeuhoKFHiDyWyRSMV7lwXDf2SrA6if3SFgMdJIrdX72LU9RQBq H1Tce62nFh1wltModMBiFFIKXG5Ds9vy6zItuZijxNIS6WUJMR+EJ/rAzSyemyZeUo+N /JC/E4HYMQrsCw9m2G5q/tt9qSk3B039sqi6LjTsR1b6AQQ2ASytgb/eheVWdcHeAgiF lOY+KarO2gro3OJyKBgyoL0OjWIaQgKlvS5GjtABDD851BD1T36q6yy30QeM1BJM2s9M yEmrsecaVnvPGwA8g8Ivp6avMv6zDKx6ftUab7K9boaeOJd6iznRDEFvgFprIZfy2ZD7 OnTw== X-Forwarded-Encrypted: i=1; AJvYcCUdkQN6CI2E8CoTJFuCoQbyJlDMzc70SFWDHcekFMM6LM4AJdEYdtp+tNadgGu150Sw2ryGpWmNxSgodVE=@vger.kernel.org X-Gm-Message-State: AOJu0Yymz8P0qslk9wXfwCVmGkO8m47l6X910mxJF3IXEVmfNVuI248U +xH4cqJV0Oq+ovT4sQXnp8g/snnWw7TSQIObV1ZjZ/+BC1bvIXjrZL6Dmaz53Zezlco= X-Gm-Gg: ASbGncvaWe42acQgiMfybERf6TnVAUY3smvFojp1GoeyEJJkfDO3cQJfOfCISax+KYr vXB1ejhF7/Amr1FW0GNhL715RbKnUN144T0mbMDJl/hfktaXi98UbekJuTfh5HZcytPOCb99OXZ 8bIf+BCI3z4omv2bdQo8hfyJd3GWg9j9fKSmgoNwHo63MNo87ajeyHchaElvNib4VBlj/5s07YF GbmGgwvXQtME7BNB6RdUdzuKakgnUxl0UuPzztbbw1onZhh+oYm1PAhM/QPHFXb6MS1H8bT0A/g k7njWmGu3O2B64+AHIq/ow9RcNwFNqflwvHrX0iJtlXf0oJ5WrEv/DmQfsrSQILMGRPGdIoIJ39 lHD4NvnkFNxlIwbEGDqDRe/AtExFX/7js3IHB5AqKY5Ceboy8jb5Uuq98BL0ueuW0Nrv7GnSdH9 DG13vAc9LqpToIEw== X-Google-Smtp-Source: AGHT+IE1KB1/uVE4jUf4IJeel0OlV1177O/VVSuJSc+kWDoCjrMljLQyTiL7urB8xdRJhfIvOcxAhw== X-Received: by 2002:a05:690c:6606:b0:712:c14a:a36a with SMTP id 00721157ae682-719b422a7edmr43106877b3.20.1753282042565; Wed, 23 Jul 2025 07:47:22 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:21 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v2 14/32] liveupdate: luo_files: add infrastructure for FDs Date: Wed, 23 Jul 2025 14:46:27 +0000 Message-ID: 
<20250723144649.1696299-15-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com> References: <20250723144649.1696299-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Introduce the framework within LUO to support preserving specific types of file descriptors across a live update transition. This allows stateful FDs (like memfds or vfio FDs used by VMs) to be recreated in the new kernel. Note: The core logic for iterating through the luo_files_list and invoking the handler callbacks (prepare, freeze, cancel, finish) within luo_do_files_*_calls, as well as managing the u64 data persistence via the FDT for individual files, is currently implemented as stubs in this patch. This patch sets up the registration, FDT layout, and retrieval framework. Signed-off-by: Pasha Tatashin --- include/linux/liveupdate.h | 68 ++++ kernel/liveupdate/Makefile | 1 + kernel/liveupdate/luo_files.c | 663 +++++++++++++++++++++++++++++++ kernel/liveupdate/luo_internal.h | 4 + 4 files changed, 736 insertions(+) create mode 100644 kernel/liveupdate/luo_files.c diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h index fed68b9ab32b..28a8aa4cafca 100644 --- a/include/linux/liveupdate.h +++ b/include/linux/liveupdate.h @@ -88,6 +88,61 @@ enum liveupdate_state { LIVEUPDATE_STATE_UPDATED =3D 4, }; =20 +struct file; + +/** + * struct liveupdate_file_ops - Callbacks for live-updatable files. + * @prepare: Optional. Saves state for a specific file instance (@fi= le, + * @arg) before update, potentially returning value via @d= ata. + * Returns 0 on success, negative errno on failure. + * @freeze: Optional. 
Performs final actions just before kernel + * transition, potentially reading/updating the handle via + * @data. + * Returns 0 on success, negative errno on failure. + * @cancel: Optional. Cleans up state/resources if update is aborted + * after prepare/freeze succeeded, using the @data handle = (by + * value) from the successful prepare. Returns void. + * @finish: Optional. Performs final cleanup in the new kernel usin= g the + * preserved @data handle (by value). Returns void. + * @retrieve: Retrieve the preserved file. Must be called before fini= sh. + * @can_preserve: callback to determine if @file with associated context = (@arg) + * can be preserved by this handler. + * Return bool (true if preservable, false otherwise). + */ +struct liveupdate_file_ops { + int (*prepare)(struct file *file, void *arg, u64 *data); + int (*freeze)(struct file *file, void *arg, u64 *data); + void (*cancel)(struct file *file, void *arg, u64 data); + void (*finish)(struct file *file, void *arg, u64 data, bool reclaimed); + int (*retrieve)(void *arg, u64 data, struct file **file); + bool (*can_preserve)(struct file *file, void *arg); +}; + +/** + * struct liveupdate_file_handler - Represents a handler for a live-updata= ble + * file type. + * @ops: Callback functions + * @compatible: The compatibility string (e.g., "memfd-v1", "vfiofd-v1") + * that uniquely identifies the file type this handler sup= ports. + * This is matched against the compatible string associate= d with + * individual &struct liveupdate_file instances. + * @arg: An opaque pointer to implementation-specific context da= ta + * associated with this file handler registration. + * @list: used for linking this handler instance into a global li= st of + * registered file handlers. + * + * Modules that want to support live update for specific file types should + * register an instance of this structure. 
LUO uses this registration to + * determine if a given file can be preserved and to find the appropriate + * operations to manage its state across the update. + */ +struct liveupdate_file_handler { + const struct liveupdate_file_ops *ops; + const char *compatible; + void *arg; + struct list_head list; +}; + /** * struct liveupdate_subsystem_ops - LUO events callback functions * @prepare: Optional. Called during LUO prepare phase. Should perform @@ -154,6 +209,9 @@ int liveupdate_register_subsystem(struct liveupdate_sub= system *h); int liveupdate_unregister_subsystem(struct liveupdate_subsystem *h); int liveupdate_get_subsystem_data(struct liveupdate_subsystem *h, u64 *dat= a); =20 +int liveupdate_register_file_handler(struct liveupdate_file_handler *h); +int liveupdate_unregister_file_handler(struct liveupdate_file_handler *h); + #else /* CONFIG_LIVEUPDATE */ =20 static inline int liveupdate_reboot(void) @@ -197,5 +255,15 @@ static inline int liveupdate_get_subsystem_data(struct= liveupdate_subsystem *h, return -ENODATA; } =20 +static inline int liveupdate_register_file_handler(struct liveupdate_file_= handler *h) +{ + return 0; +} + +static inline int liveupdate_unregister_file_handler(struct liveupdate_fil= e_handler *h) +{ + return 0; +} + #endif /* CONFIG_LIVEUPDATE */ #endif /* _LINUX_LIVEUPDATE_H */ diff --git a/kernel/liveupdate/Makefile b/kernel/liveupdate/Makefile index 999208a1fdbb..b5054140b9a9 100644 --- a/kernel/liveupdate/Makefile +++ b/kernel/liveupdate/Makefile @@ -6,4 +6,5 @@ obj-$(CONFIG_KEXEC_HANDOVER) +=3D kexec_handover.o obj-$(CONFIG_KEXEC_HANDOVER_DEBUG) +=3D kexec_handover_debug.o obj-$(CONFIG_LIVEUPDATE) +=3D luo_core.o +obj-$(CONFIG_LIVEUPDATE) +=3D luo_files.o obj-$(CONFIG_LIVEUPDATE) +=3D luo_subsystems.o diff --git a/kernel/liveupdate/luo_files.c b/kernel/liveupdate/luo_files.c new file mode 100644 index 000000000000..3582f1ec96c4 --- /dev/null +++ b/kernel/liveupdate/luo_files.c @@ -0,0 +1,663 @@ +// SPDX-License-Identifier: GPL-2.0 + 
+/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +/** + * DOC: LUO file descriptors + * + * LUO provides the infrastructure necessary to preserve + * specific types of stateful file descriptors across a kernel live + * update transition. The primary goal is to allow workloads, such as virt= ual + * machines using vfio, memfd, or iommufd to retain access to their essent= ial + * resources without interruption after the underlying kernel is updated. + * + * The framework operates based on handler registration and instance track= ing: + * + * 1. Handler Registration: Kernel modules responsible for specific file + * types (e.g., memfd, vfio) register a &struct liveupdate_file_handler + * handler. This handler contains callbacks + * (&liveupdate_file_handler.ops->prepare, + * &liveupdate_file_handler.ops->freeze, + * &liveupdate_file_handler.ops->finish, etc.) and a unique 'compatible' s= tring + * identifying the file type. Registration occurs via + * liveupdate_register_file_handler(). + * + * 2. File Instance Tracking: When a potentially preservable file needs to= be + * managed for live update, the core LUO logic (luo_register_file()) finds= a + * compatible registered handler using its + * &liveupdate_file_handler.ops->can_preserve callback. If found, an inte= rnal + * &struct luo_file instance is created, assigned a unique u64 'token', and + * added to a list. + * + * 3. State Persistence (FDT): During the LUO prepare/freeze phases, the + * registered handler callbacks are invoked for each tracked file instance. + * These callbacks can generate a u64 data payload representing the minimal + * state needed for restoration. This payload, along with the handler's + * compatible string and the unique token, is stored in a dedicated + * '/file-descriptors' node within the main LUO FDT blob passed via + * Kexec Handover (KHO). + * + * 4. 
Restoration: In the new kernel, the LUO framework parses the incoming + * FDT to reconstruct the list of &struct luo_file instances. When the + * original owner requests the file, luo_retrieve_file() uses the correspo= nding + * handler's &liveupdate_file_handler.ops->retrieve callback, passing the + * persisted u64 data, to recreate or find the appropriate &struct file ob= ject. + */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "luo_internal.h" + +#define LUO_FILES_NODE_NAME "file-descriptors" +#define LUO_FILES_COMPATIBLE "file-descriptors-v1" + +static DEFINE_XARRAY(luo_files_xa_in); +static DEFINE_XARRAY(luo_files_xa_out); +static bool luo_files_xa_in_recreated; + +/* Registered files. */ +static DECLARE_RWSEM(luo_register_file_list_rwsem); +static LIST_HEAD(luo_register_file_list); + +static void *luo_file_fdt_out; +static void *luo_file_fdt_in; + +static size_t luo_file_fdt_out_size; + +static atomic64_t luo_files_count; + +/** + * struct luo_file - Represents a file descriptor instance preserved + * across live update. + * @fh: Pointer to the &struct liveupdate_file_handler containi= ng + * the implementation of prepare, freeze, cancel, and fini= sh + * operations specific to this file's type. + * @file: A pointer to the kernel's &struct file object represent= ing + * the open file descriptor that is being preserved. + * @private_data: Internal storage used by the live update core framework + * between phases. + * @reclaimed: Flag indicating whether this preserved file descriptor = has + * been successfully 'reclaimed' (e.g., requested via an i= octl) + * by user-space or the owning kernel subsystem in the new + * kernel after the live update. + * @state: The current state of file descriptor, it is allowed to + * prepare, freeze, and finish FDs before the global state + * switch. 
+ * @mutex: Lock to protect FD state, and allow independently to c= hange + * the FD state compared to global state. + * + * This structure holds the necessary callbacks and context for managing a + * specific open file descriptor throughout the different phases of a live + * update process. Instances of this structure are typically allocated, + * populated with file-specific details (&file, &arg, callbacks, compatibi= lity + * string, token), and linked into a central list managed by the LUO. The + * private_data field is used internally by the core logic to store state + * between phases. + */ +struct luo_file { + struct liveupdate_file_handler *fh; + struct file *file; + u64 private_data; + bool reclaimed; + enum liveupdate_state state; + struct mutex mutex; +}; + +static void luo_files_recreate_luo_files_xa_in(void) +{ + const char *node_name, *fdt_compat_str; + struct liveupdate_file_handler *fh; + struct luo_file *luo_file; + const void *data_ptr; + int file_node_offset; + int ret =3D 0; + + if (luo_files_xa_in_recreated || !luo_file_fdt_in) + return; + + /* Take write in order to guarantee that we re-create list once */ + down_write(&luo_register_file_list_rwsem); + if (luo_files_xa_in_recreated) + goto exit_unlock; + + fdt_for_each_subnode(file_node_offset, luo_file_fdt_in, 0) { + bool handler_found =3D false; + u64 token; + + node_name =3D fdt_get_name(luo_file_fdt_in, file_node_offset, + NULL); + if (!node_name) { + panic("FDT subnode at offset %d: Cannot get name\n", + file_node_offset); + } + + ret =3D kstrtou64(node_name, 0, &token); + if (ret < 0) { + panic("FDT node '%s': Failed to parse token\n", + node_name); + } + + if (xa_load(&luo_files_xa_in, token)) { + panic("Duplicate token %llu found in incoming FDT for file descriptors.= \n", + token); + } + + fdt_compat_str =3D fdt_getprop(luo_file_fdt_in, file_node_offset, + "compatible", NULL); + if (!fdt_compat_str) { + panic("FDT node '%s': Missing 'compatible' property\n", + node_name); + } + + 
data_ptr =3D fdt_getprop(luo_file_fdt_in, file_node_offset, "data", + NULL); + if (!data_ptr) { + panic("Can't recover property 'data' for FDT node '%s'\n", + node_name); + } + + list_for_each_entry(fh, &luo_register_file_list, list) { + if (!strcmp(fh->compatible, fdt_compat_str)) { + handler_found =3D true; + break; + } + } + + if (!handler_found) { + panic("FDT node '%s': No registered handler for compatible '%s'\n", + node_name, fdt_compat_str); + } + + luo_file =3D kmalloc(sizeof(*luo_file), + GFP_KERNEL | __GFP_NOFAIL); + luo_file->fh =3D fh; + luo_file->file =3D NULL; + memcpy(&luo_file->private_data, data_ptr, sizeof(u64)); + luo_file->reclaimed =3D false; + mutex_init(&luo_file->mutex); + luo_file->state =3D LIVEUPDATE_STATE_UPDATED; + ret =3D xa_err(xa_store(&luo_files_xa_in, token, luo_file, + GFP_KERNEL | __GFP_NOFAIL)); + if (ret < 0) { + panic("Failed to store luo_file for token %llu in XArray: %d\n", + token, ret); + } + } + luo_files_xa_in_recreated =3D true; + +exit_unlock: + up_write(&luo_register_file_list_rwsem); +} + +static size_t luo_files_fdt_size(void) +{ + u64 num_files =3D atomic64_read(&luo_files_count); + + /* Estimate a 1K overhead, + 128 bytes per file entry */ + return PAGE_SIZE << get_order(SZ_1K + (num_files * 128)); +} + +static void luo_files_fdt_cleanup(void) +{ + WARN_ON_ONCE(kho_unpreserve_phys(__pa(luo_file_fdt_out), + luo_file_fdt_out_size)); + + free_pages((unsigned long)luo_file_fdt_out, + get_order(luo_file_fdt_out_size)); + + luo_file_fdt_out_size =3D 0; + luo_file_fdt_out =3D NULL; +} + +static int luo_files_to_fdt(struct xarray *files_xa_out) +{ + const u64 zero_data =3D 0; + unsigned long token; + struct luo_file *h; + char token_str[19]; + int ret =3D 0; + + xa_for_each(files_xa_out, token, h) { + snprintf(token_str, sizeof(token_str), "%#0llx", (u64)token); + + ret =3D fdt_begin_node(luo_file_fdt_out, token_str); + if (ret < 0) + break; + + ret =3D fdt_property_string(luo_file_fdt_out, "compatible", + 
h->fh->compatible); + if (ret < 0) { + fdt_end_node(luo_file_fdt_out); + break; + } + + ret =3D fdt_property_u64(luo_file_fdt_out, "data", zero_data); + if (ret < 0) { + fdt_end_node(luo_file_fdt_out); + break; + } + + ret =3D fdt_end_node(luo_file_fdt_out); + if (ret < 0) + break; + } + + return ret; +} + +static int luo_files_fdt_setup(void) +{ + int ret; + + luo_file_fdt_out_size =3D luo_files_fdt_size(); + luo_file_fdt_out =3D (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, + get_order(luo_file_fdt_out_size)); + if (!luo_file_fdt_out) { + pr_err("Failed to allocate FDT memory (%zu bytes)\n", + luo_file_fdt_out_size); + luo_file_fdt_out_size =3D 0; + return -ENOMEM; + } + + ret =3D kho_preserve_phys(__pa(luo_file_fdt_out), luo_file_fdt_out_size); + if (ret) { + pr_err("Failed to kho preserve FDT memory (%zu bytes)\n", + luo_file_fdt_out_size); + luo_file_fdt_out_size =3D 0; + luo_file_fdt_out =3D NULL; + return ret; + } + + ret =3D fdt_create(luo_file_fdt_out, luo_file_fdt_out_size); + if (ret < 0) + goto exit_cleanup; + + ret =3D fdt_finish_reservemap(luo_file_fdt_out); + if (ret < 0) + goto exit_finish; + + ret =3D fdt_begin_node(luo_file_fdt_out, LUO_FILES_NODE_NAME); + if (ret < 0) + goto exit_finish; + + ret =3D fdt_property_string(luo_file_fdt_out, "compatible", + LUO_FILES_COMPATIBLE); + if (ret < 0) + goto exit_end_node; + + ret =3D luo_files_to_fdt(&luo_files_xa_out); + if (ret < 0) + goto exit_end_node; + + ret =3D fdt_end_node(luo_file_fdt_out); + if (ret < 0) + goto exit_finish; + + ret =3D fdt_finish(luo_file_fdt_out); + if (ret < 0) + goto exit_cleanup; + + return 0; + +exit_end_node: + fdt_end_node(luo_file_fdt_out); +exit_finish: + fdt_finish(luo_file_fdt_out); +exit_cleanup: + pr_err("Failed to setup FDT: %s (ret %d)\n", fdt_strerror(ret), ret); + luo_files_fdt_cleanup(); + + return ret; +} + +static int luo_files_prepare(void *arg, u64 *data) +{ + int ret; + + ret =3D luo_files_fdt_setup(); + if (ret) + return ret; + + *data =3D 
__pa(luo_file_fdt_out); + + return ret; +} + +static int luo_files_freeze(void *arg, u64 *data) +{ + return 0; +} + +static void luo_files_finish(void *arg, u64 data) +{ + luo_files_recreate_luo_files_xa_in(); +} + +static void luo_files_cancel(void *arg, u64 data) +{ +} + +static const struct liveupdate_subsystem_ops luo_file_subsys_ops =3D { + .prepare =3D luo_files_prepare, + .freeze =3D luo_files_freeze, + .cancel =3D luo_files_cancel, + .finish =3D luo_files_finish, +}; + +static struct liveupdate_subsystem luo_file_subsys =3D { + .ops =3D &luo_file_subsys_ops, + .name =3D LUO_FILES_NODE_NAME, +}; + +static int __init luo_files_startup(void) +{ + int ret; + + if (!liveupdate_enabled()) + return 0; + + ret =3D liveupdate_register_subsystem(&luo_file_subsys); + if (ret) { + pr_warn("Failed to register luo_file subsystem [%d]\n", ret); + return ret; + } + + if (liveupdate_state_updated()) { + u64 fdt_pa; + + ret =3D liveupdate_get_subsystem_data(&luo_file_subsys, &fdt_pa); + if (ret) + panic("Failed to retrieve luo_file data [%d]\n", ret); + + ret =3D fdt_node_check_compatible(__va(fdt_pa), 0, + LUO_FILES_COMPATIBLE); + if (ret) { + panic("FDT '%s' is incompatible with '%s' [%d]\n", + LUO_FILES_NODE_NAME, LUO_FILES_COMPATIBLE, ret); + } + luo_file_fdt_in =3D __va(fdt_pa); + } + + return ret; +} +late_initcall(luo_files_startup); + +/** + * luo_register_file - Register a file descriptor for live update manageme= nt. + * @token: Token value for this file descriptor. + * @fd: file descriptor to be preserved. + * + * Context: Must be called when LUO is in 'normal' or 'updated' state. + * + * Return: 0 on success. Negative errno on failure. 
+ */ +int luo_register_file(u64 token, int fd) +{ + struct liveupdate_file_handler *fh; + struct luo_file *luo_file; + bool found =3D false; + int ret =3D -ENOENT; + struct file *file; + + file =3D fget(fd); + if (!file) { + pr_err("Bad file descriptor\n"); + return -EBADF; + } + + luo_state_read_enter(); + if (!liveupdate_state_normal() && !liveupdate_state_updated()) { + pr_warn("File can be registered only in normal or updated state\n"); + luo_state_read_exit(); + fput(file); + return -EBUSY; + } + + down_read(&luo_register_file_list_rwsem); + list_for_each_entry(fh, &luo_register_file_list, list) { + if (fh->ops->can_preserve(file, fh->arg)) { + found =3D true; + break; + } + } + + if (!found) + goto exit_unlock; + + luo_file =3D kmalloc(sizeof(*luo_file), GFP_KERNEL); + if (!luo_file) { + ret =3D -ENOMEM; + goto exit_unlock; + } + + luo_file->private_data =3D 0; + luo_file->reclaimed =3D false; + + luo_file->file =3D file; + luo_file->fh =3D fh; + mutex_init(&luo_file->mutex); + luo_file->state =3D LIVEUPDATE_STATE_NORMAL; + + if (xa_load(&luo_files_xa_out, token)) { + ret =3D -EEXIST; + pr_warn("Token %llu is already taken\n", token); + mutex_destroy(&luo_file->mutex); + kfree(luo_file); + goto exit_unlock; + } + + ret =3D xa_err(xa_store(&luo_files_xa_out, token, luo_file, + GFP_KERNEL)); + if (ret < 0) { + pr_warn("Failed to store file for token %llu in XArray: %d\n", + token, ret); + mutex_destroy(&luo_file->mutex); + kfree(luo_file); + goto exit_unlock; + } + atomic64_inc(&luo_files_count); + +exit_unlock: + up_read(&luo_register_file_list_rwsem); + luo_state_read_exit(); + + if (ret) + fput(file); + + return ret; +} + +/** + * luo_unregister_file - Unregister a file instance using its token. + * @token: The unique token of the file instance to unregister. + * + * Finds the &struct luo_file associated with the @token in the + * global list and removes it. 
This function also drops the file + * reference taken by luo_register_file() and frees the &struct luo_file + * itself, so no further cleanup is required from the caller after this + * function returns successfully. + * + * Context: Can be called when a preserved file descriptor is closed or + * no longer needs live update management. + * + * Return: 0 on success. Negative errno on failure. + */ +int luo_unregister_file(u64 token) +{ + struct luo_file *luo_file; + int ret =3D 0; + + luo_state_read_enter(); + if (!liveupdate_state_normal() && !liveupdate_state_updated()) { + pr_warn("File can be unregistered only in normal or updated state\n"); + luo_state_read_exit(); + return -EBUSY; + } + + luo_file =3D xa_erase(&luo_files_xa_out, token); + if (luo_file) { + fput(luo_file->file); + mutex_destroy(&luo_file->mutex); + kfree(luo_file); + atomic64_dec(&luo_files_count); + } else { + pr_warn("Failed to unregister: token %llu not found.\n", + token); + ret =3D -ENOENT; + } + luo_state_read_exit(); + + return ret; +} + +/** + * luo_retrieve_file - Find a registered file instance by its token. + * @token: The unique token of the file instance to retrieve. + * @filep: Output parameter. On success (return value 0), this will point + * to the retrieved "struct file". + * + * Searches the global list for a &struct luo_file matching the @token. Us= es a + * read lock, allowing concurrent retrievals. + * + * Return: 0 on success. Negative errno on failure. 
+ */ +int luo_retrieve_file(u64 token, struct file **filep) +{ + struct luo_file *luo_file; + int ret =3D 0; + + luo_files_recreate_luo_files_xa_in(); + luo_state_read_enter(); + if (!liveupdate_state_updated()) { + pr_warn("File can be retrieved only in updated state\n"); + luo_state_read_exit(); + return -EBUSY; + } + + luo_file =3D xa_load(&luo_files_xa_in, token); + if (luo_file && !luo_file->reclaimed) { + mutex_lock(&luo_file->mutex); + if (!luo_file->reclaimed) { + luo_file->reclaimed =3D true; + ret =3D luo_file->fh->ops->retrieve(luo_file->fh->arg, + luo_file->private_data, + filep); + if (!ret) + luo_file->file =3D *filep; + } + mutex_unlock(&luo_file->mutex); + } else if (luo_file && luo_file->reclaimed) { + pr_err("The file descriptor for token %llu has already been retrieved\n", + token); + ret =3D -EINVAL; + } else { + ret =3D -ENOENT; + } + + luo_state_read_exit(); + + return ret; +} + +/** + * liveupdate_register_file_handler - Register a file handler with LUO. + * @fh: Pointer to a caller-allocated &struct liveupdate_file_handler. + * The caller must initialize this structure, including a unique + * 'compatible' string and valid 'ops' callbacks. This function adds the + * handler to the global list of supported file handlers. + * + * Context: Typically called during module initialization for file types t= hat + * support live update preservation. + * + * Return: 0 on success. Negative errno on failure. 
+ */ +int liveupdate_register_file_handler(struct liveupdate_file_handler *fh) +{ + struct liveupdate_file_handler *fh_iter; + int ret =3D 0; + + luo_state_read_enter(); + if (!liveupdate_state_normal() && !liveupdate_state_updated()) { + luo_state_read_exit(); + return -EBUSY; + } + + down_write(&luo_register_file_list_rwsem); + list_for_each_entry(fh_iter, &luo_register_file_list, list) { + if (!strcmp(fh_iter->compatible, fh->compatible)) { + pr_err("File handler registration failed: Compatible string '%s' alread= y registered.\n", + fh->compatible); + ret =3D -EEXIST; + goto exit_unlock; + } + } + + INIT_LIST_HEAD(&fh->list); + list_add_tail(&fh->list, &luo_register_file_list); + +exit_unlock: + up_write(&luo_register_file_list_rwsem); + luo_state_read_exit(); + + return ret; +} +EXPORT_SYMBOL_GPL(liveupdate_register_file_handler); + +/** + * liveupdate_unregister_file_handler - Unregister a file handler. + * @fh: Pointer to the specific &struct liveupdate_file_handler instance + * that was previously passed to + * liveupdate_register_file_handler(). + * + * Removes the specified handler instance @fh from the global list of + * registered file handlers. This function only removes the entry from the + * list; it does not free the memory associated with @fh itself. The caller + * is responsible for freeing the structure memory after this function ret= urns + * successfully. + * + * Return: 0 on success. Negative errno on failure. 
+ */ +int liveupdate_unregister_file_handler(struct liveupdate_file_handler *fh) +{ + unsigned long token; + struct luo_file *h; + int ret =3D 0; + + luo_state_read_enter(); + if (!liveupdate_state_normal() && !liveupdate_state_updated()) { + luo_state_read_exit(); + return -EBUSY; + } + + down_write(&luo_register_file_list_rwsem); + + xa_for_each(&luo_files_xa_out, token, h) { + if (h->fh =3D=3D fh) { + up_write(&luo_register_file_list_rwsem); + luo_state_read_exit(); + return -EBUSY; + } + } + + list_del_init(&fh->list); + up_write(&luo_register_file_list_rwsem); + luo_state_read_exit(); + + return ret; +} +EXPORT_SYMBOL_GPL(liveupdate_unregister_file_handler); diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_inter= nal.h index 98bf799adb61..f77e8b3044f9 100644 --- a/kernel/liveupdate/luo_internal.h +++ b/kernel/liveupdate/luo_internal.h @@ -25,4 +25,8 @@ int luo_do_subsystems_freeze_calls(void); void luo_do_subsystems_finish_calls(void); void luo_do_subsystems_cancel_calls(void); =20 +int luo_retrieve_file(u64 token, struct file **filep); +int luo_register_file(u64 token, int fd); +int luo_unregister_file(u64 token); + #endif /* _LINUX_LUO_INTERNAL_H */ --=20 2.50.0.727.gbf7dc18ff4-goog From nobody Mon Oct 6 06:43:24 2025 Received: from mail-yw1-f182.google.com (mail-yw1-f182.google.com [209.85.128.182]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C76482FC005 for ; Wed, 23 Jul 2025 14:47:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.182 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282051; cv=none; b=D9EQBYVggKfRfS9c2Z95eblRz3RbnCD8GjSX7v+u+fdAg2FOH2vXk/YlGvgZndzRN0FU8nJYJ+pvTuFvDFXG0hEhLd0/DBi2Rrq7KScr2dpvVENtF5KGCkvH1E0t06SZVhAXKZ37wdG0xD4Aw/rtXtsmdVsAPz6b8htSWwm/SYs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; 
t=1753282051; c=relaxed/simple; bh=Cfi9j2oHaj3a+1RPOlQ6V9S6n7G0Pg+wsLencW8TXUg=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=tRGkRTZKJu7eyWVrdF1rdYyVbC5GsoDCCNboau47XmTx9MraU14G25SBwBh/hRq/jGnrwQulNWrD6EhnaLQVdLFr32sgBhh1ZlZqDjIsdURg5+zNeN4IQo1Xde9Deq89OAoSAd2PcIcWM0AbkQYrvWsfkQWgw7sGsksIYKyeauo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=3VLGzvMI; arc=none smtp.client-ip=209.85.128.182 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="3VLGzvMI" Received: by mail-yw1-f182.google.com with SMTP id 00721157ae682-70e3e0415a7so262597b3.0 for ; Wed, 23 Jul 2025 07:47:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282045; x=1753886845; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=sk38mGBhqK7qzK3HNwB+80JVX/lHD/SIvkcYryZBkd4=; b=3VLGzvMIMKYavfCYFadg0jiuAu3OUzrL0BaH9/F63R5glO8OibHyupSKD8hSxQMUIR NOEhS+pD/U7P4QAXFrnu8rBAlciyDtkhYKQnUFTHKnudoSvwua97UteWOkaHJkXOhSr6 fJ6CjLCTtUCuU2+pZEqaFTQkSBRb3WCY0dr7MTYURuet8nSaPA90PO7eEL7lbO+0N1Od WvV6EiHXEkgkTGxSDiFvn7BGCIY6UQQrY6s6oNwUJWt9FERzYewmXPnAmTpFR2uBchKm JTvolDWKXnOLW2bdRKNMqSYM7yFIpA91aTfwX8MFJQXyXgM4YVEX/55Lt/fTRdRTX1lR sXRQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282045; x=1753886845; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=sk38mGBhqK7qzK3HNwB+80JVX/lHD/SIvkcYryZBkd4=; b=f5vnhKxEzy7BCK+qA8QbcZL+8O0oOPcmbgPag6JHFj3KoH+LxvfJNKz5RdtzZt5yOl 0e2YIW3mYwg8pKHm5B91W2bm7yMhc4JRij5it3OnCyV8swAA1BHLVCpBqyt6tKo9MoLB BeVChcPWmTuyDKlo7vywPGTJAov2ChDkIW/GaZMcAY+8JqNO07AKyL+yDPLRsGywkenp y3tOMzRfe/DSZa8jryj10R0KTyAvNuGUGwKYCgCz1fW7Q5vEHgIBHRDl5e01KHA6nLS6 W33DyCaAJblBskBh81dZe9fiq9RUKw6PNY/i1a6yRd6SR0hlfzZv6G9KG/9+fSaooKdh f9fg== X-Forwarded-Encrypted: i=1; AJvYcCUedoaW43fA3QVx9uHFHLb/LE7QHZYOKxLXNTPdjq4OxJ1OjtE3sfD3vIqPdcr7qS8fH6X+pSuFnmkdp/o=@vger.kernel.org X-Gm-Message-State: AOJu0Yz/RchY6tIZDN7yWO+6YA2zLVSkASrMAjMcr+b49J6af0t1YifB J/E814jarmEqGwpWbcAVNoR82JhDjwRhwLkxcBzCKybpKJDFgmTjMnvUE2szQngspOM= X-Gm-Gg: ASbGncs6HdM7D+EZqiILMuYcw4rfOblil3Yy4z8aRhHUZhv0I9RC1nlQP7OB4mClqJ9 LCaZhnfnxshXsnDaaYe/idNC0Niok7f5VkgFHrQTRtdWS9eDHTP4zXjxA1zmz6/8KykeYGoC+sI t6LwljdG6UpBGPHMudSQdwn0Yozub0H19Al6JuoP5DxGLPKJTSSvukgQxPBcV5Kr2Bu7C5V0I7I QX4EZg9nJSwFFZxQUkecDtfQHBflvYkSUxr4eyUO4uNdBYOWbM9ug5ZSnVtKjI2/z0RWhIN4iKr 1gY/dkyeqEx6YvFH+kViTQvDZhCLNsp24M8SF7+xrCPB+iDrRYqQ5Dvd6xsfxRLjkzOGGqihEGs TwAl8scmvwIK4kFg2kGZXUlmjTJLhRXm0HZkIC+1htfW3YxyENlH+8rIt/u3g+sUfLdgWObLYB7 DkGXvJWjx2whQ9JQ== X-Google-Smtp-Source: AGHT+IHGYfjoIy8qo73uD860AaG0pKa+DBLv7UTBP+7WtZjAXyEb/laqBGmfDkMsPVthzLrkCs+A9A== X-Received: by 2002:a05:690c:d84:b0:712:c5f7:1f11 with SMTP id 00721157ae682-719a0b52782mr94165947b3.10.1753282044761; Wed, 23 Jul 2025 07:47:24 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:23 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v2 15/32] liveupdate: luo_files: implement file systems callbacks Date: Wed, 23 Jul 2025 14:46:28 +0000 Message-ID: 
<20250723144649.1696299-16-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com> References: <20250723144649.1696299-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Implement the core logic within luo_files.c to invoke the prepare, freeze, finish, and cancel callbacks for preserved file instances, replacing the previous stub implementations. It also handles the persistence and retrieval of the u64 data payload associated with each file via the LUO FDT. This completes the core mechanism enabling registered file handlers to act= ively manage file state across the live update transition using the LUO framework. Signed-off-by: Pasha Tatashin --- kernel/liveupdate/luo_files.c | 166 +++++++++++++++++++++++++++++++++- 1 file changed, 164 insertions(+), 2 deletions(-) diff --git a/kernel/liveupdate/luo_files.c b/kernel/liveupdate/luo_files.c index 3582f1ec96c4..cd956ea69f43 100644 --- a/kernel/liveupdate/luo_files.c +++ b/kernel/liveupdate/luo_files.c @@ -325,31 +325,193 @@ static int luo_files_fdt_setup(void) return ret; } =20 +static int luo_files_prepare_one(struct luo_file *h) +{ + int ret =3D 0; + + mutex_lock(&h->mutex); + if (h->state =3D=3D LIVEUPDATE_STATE_NORMAL) { + if (h->fh->ops->prepare) { + ret =3D h->fh->ops->prepare(h->file, h->fh->arg, + &h->private_data); + } + if (!ret) + h->state =3D LIVEUPDATE_STATE_PREPARED; + } else { + WARN_ON_ONCE(h->state !=3D LIVEUPDATE_STATE_PREPARED && + h->state !=3D LIVEUPDATE_STATE_FROZEN); + } + mutex_unlock(&h->mutex); + + return ret; +} + +static int luo_files_freeze_one(struct luo_file *h) +{ + int ret =3D 0; + + mutex_lock(&h->mutex); + if (h->state =3D=3D LIVEUPDATE_STATE_PREPARED) { + if (h->fh->ops->freeze) { + ret =3D 
h->fh->ops->freeze(h->file, h->fh->arg,
+						 &h->private_data);
+		}
+		if (!ret)
+			h->state = LIVEUPDATE_STATE_FROZEN;
+	} else {
+		WARN_ON_ONCE(h->state != LIVEUPDATE_STATE_FROZEN);
+	}
+	mutex_unlock(&h->mutex);
+
+	return ret;
+}
+
+static void luo_files_finish_one(struct luo_file *h)
+{
+	mutex_lock(&h->mutex);
+	if (h->state == LIVEUPDATE_STATE_UPDATED) {
+		if (h->fh->ops->finish) {
+			h->fh->ops->finish(h->file, h->fh->arg, h->private_data,
+					   h->reclaimed);
+		}
+		h->state = LIVEUPDATE_STATE_NORMAL;
+	} else {
+		WARN_ON_ONCE(h->state != LIVEUPDATE_STATE_NORMAL);
+	}
+	mutex_unlock(&h->mutex);
+}
+
+static void luo_files_cancel_one(struct luo_file *h)
+{
+	int ret;
+
+	mutex_lock(&h->mutex);
+	if (h->state == LIVEUPDATE_STATE_NORMAL)
+		goto exit_unlock;
+
+	ret = WARN_ON_ONCE(h->state != LIVEUPDATE_STATE_PREPARED &&
+			   h->state != LIVEUPDATE_STATE_FROZEN);
+	if (ret)
+		goto exit_unlock;
+
+	if (h->fh->ops->cancel)
+		h->fh->ops->cancel(h->file, h->fh->arg, h->private_data);
+	h->private_data = 0;
+	h->state = LIVEUPDATE_STATE_NORMAL;
+
+exit_unlock:
+	mutex_unlock(&h->mutex);
+}
+
+static void __luo_files_cancel(struct luo_file *boundary_file)
+{
+	unsigned long token;
+	struct luo_file *h;
+
+	xa_for_each(&luo_files_xa_out, token, h) {
+		if (h == boundary_file)
+			break;
+
+		luo_files_cancel_one(h);
+	}
+	luo_files_fdt_cleanup();
+}
+
+static int luo_files_commit_data_to_fdt(void)
+{
+	int node_offset, ret;
+	unsigned long token;
+	char token_str[19];
+	struct luo_file *h;
+
+	xa_for_each(&luo_files_xa_out, token, h) {
+		snprintf(token_str, sizeof(token_str), "%#0llx", (u64)token);
+		node_offset = fdt_subnode_offset(luo_file_fdt_out,
+						 0,
+						 token_str);
+		ret = fdt_setprop(luo_file_fdt_out, node_offset, "data",
+				  &h->private_data, sizeof(h->private_data));
+		if (ret < 0) {
+			pr_err("Failed to set data property for token %s: %s\n",
+			       token_str, fdt_strerror(ret));
+			return -ENOSPC;
+		}
+	}
+
+	return 0;
+}
+
 static int 
luo_files_prepare(void *arg, u64 *data)
 {
+	unsigned long token;
+	struct luo_file *h;
 	int ret;
 
 	ret = luo_files_fdt_setup();
 	if (ret)
 		return ret;
 
-	*data = __pa(luo_file_fdt_out);
+	xa_for_each(&luo_files_xa_out, token, h) {
+		ret = luo_files_prepare_one(h);
+		if (ret < 0) {
+			pr_err("Prepare failed for file token %#0llx handler '%s' [%d]\n",
+			       (u64)token, h->fh->compatible, ret);
+			__luo_files_cancel(h);
+
+			return ret;
+		}
+	}
+
+	ret = luo_files_commit_data_to_fdt();
+	if (ret)
+		__luo_files_cancel(NULL);
+	else
+		*data = __pa(luo_file_fdt_out);
 
 	return ret;
 }
 
 static int luo_files_freeze(void *arg, u64 *data)
 {
-	return 0;
+	unsigned long token;
+	struct luo_file *h;
+	int ret;
+
+	xa_for_each(&luo_files_xa_out, token, h) {
+		ret = luo_files_freeze_one(h);
+		if (ret < 0) {
+			pr_err("Freeze callback failed for file token %#0llx handler '%s' [%d]\n",
+			       (u64)token, h->fh->compatible, ret);
+			__luo_files_cancel(h);
+
+			return ret;
+		}
+	}
+
+	ret = luo_files_commit_data_to_fdt();
+	if (ret)
+		__luo_files_cancel(NULL);
+
+	return ret;
 }
 
 static void luo_files_finish(void *arg, u64 data)
 {
+	unsigned long token;
+	struct luo_file *h;
+
 	luo_files_recreate_luo_files_xa_in();
+	xa_for_each(&luo_files_xa_in, token, h) {
+		luo_files_finish_one(h);
+		mutex_destroy(&h->mutex);
+		kfree(h);
+	}
+	xa_destroy(&luo_files_xa_in);
 }
 
 static void luo_files_cancel(void *arg, u64 data)
 {
+	__luo_files_cancel(NULL);
 }
 
 static const struct liveupdate_subsystem_ops luo_file_subsys_ops = {
-- 
2.50.0.727.gbf7dc18ff4-goog

From nobody Mon Oct 6 06:43:24 2025
Received: from mail-yb1-f181.google.com (mail-yb1-f181.google.com [209.85.219.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7B5262F5499 for ; Wed, 23 Jul 2025 14:47:29 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.181
ARC-Seal: 
i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282056; cv=none; b=LvEDeSB96VRu85Ksivl5n78/cGA/sGhdITG8yDhN6A833tVANI34jC6tMpajeIRjLlGEI5ZbvvZYilevnRR7EDzXRLpG7QnXNyPZd1RR8B7vk0z6j4uyEIHvVYSoDYPdpL7AlkHyYZHZmALhvKRxQ2bFuO2KehDaXt0osaIE95o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282056; c=relaxed/simple; bh=S+USuASgiAWLoGLEg40jmspH2nuy/XVSxxvs0BOFsVY=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=VD0nkgTuIqfAgSdRXZze06erve2XO1RE5710da9nNr3+zCEX9BqwKiMFt4Us9j2/PD56MIETuF/FLQsYy9WtXeinhZKu1bRQc7UQFJR6OxXN+Dor/plgPyg1jaTHcx4CwRAV5/o6fYhVpORgZRtjmmd0C+73L7pkUJb+8gwcJp4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=z7kLsxSx; arc=none smtp.client-ip=209.85.219.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="z7kLsxSx" Received: by mail-yb1-f181.google.com with SMTP id 3f1490d57ef6-e8bd2eaf8ccso6110790276.2 for ; Wed, 23 Jul 2025 07:47:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282048; x=1753886848; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=q9OeQwvAtSpE7KYIsGHzDzwJJRVZThZJEHAd4sog2is=; b=z7kLsxSxz4m6w2WyWBlzth234hZUWBRLrqHjQkvTXD1xT/61UYCkFSBiRGMANXP8Hc aGAYKsZuDu6lzBeiavvC1xrXhTUSC6S5uCYnAmWh/pRroMZhtMkWp/pr0lgxoMBYSJiq 
KdmnInZQM63Z3hHfIth7YgyLfJ7MzEFmC+a7VCNf94oHVsJCrHReuCiybbJGma5Tcy9H APCILb31/1PzSO46ySORPouWI5Iu5PoIjfssjRHJBlHzl1Fh87KA9UQ6+EwjFEJUatrI wZy3obPlpd/K0Ymanl2EAS6NH9or6A8queBazweQFiay72OY343xmVxp2uPwJ9BJHl6P Ur0g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282048; x=1753886848; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=q9OeQwvAtSpE7KYIsGHzDzwJJRVZThZJEHAd4sog2is=; b=npMCcGNYLb8L9TSF0pd6N9Ro5RP+a9c4NSYwoAnUWpB3qqQXLfq5E3IV5WKR7OsmG0 0DVPMMmnJeMAk7Hcn8pM7GnxYJSx/Ppq8C7QcjkumqtrJbKnX+V9aBqOEehBDtXvP44d ovjY+ws9Ytl75cVflmpXNr0wab00a6Q8KKv5TznNnntSsq6yxjTzDM9zKphiEKJxSoxq 4WfB9heJ6QEFmWFLA0Yyg8n2VAjaELHWypr/CigJMMBcuYDLAPg3CkEAz93X6iWwh0vS zUSDsn17FF3kUwEJ7eYMnPfVmemX38hozLanmwhleruEanZ3NjeP3cdTqm1ABL1QGA3H NxVw== X-Forwarded-Encrypted: i=1; AJvYcCVq/Wnm/2oDgq+ZEkbMFOnU60LPaiKIaPvHIarzi/4MbHMT1xo4UVXc5gN6ITfciigtTiJWLqGNVcLB1IA=@vger.kernel.org X-Gm-Message-State: AOJu0YynNM7QWOKPFoVnYMQAxyDofIg3nD/zJXO7hH6jzuU4QkWJ6n8N wG+/iSd3yCQuueAMLVAbIIHtJQXpqYp3gxx+XErDp7bX+xEwieaMWsyh4O2xmqDOh+U= X-Gm-Gg: ASbGncs5CO0v9hfNFa8um7jlMRk1ZgOmdQfYJiEpDsdVSt8BpjCVFPecYnC+6mRWxDP f5mNXr6Kk9bSjXf9DAyo9kC4TdT3NSc01ckSEgmxtw1tUNUGirHtx1brw3+mQqZ/0RT708ptqtC jcjIMYVmR5a4DfbkXTtuDMYb1Gej2454vU5kqm05wmjw8YT1fFVBmhPEa10F6Yc5hcz10tGa9eg gCwQxhrtFPoHTsvxuPAsayf33Fk6ekbnHpyFaFb4tt941ULIVHOC8i3kYz3um/NvRczvP2kq4r5 QQFPnZBJ6P49qa5XG/vCxWnMRYydyNYaXHAd870MrrebFgHlZgcFnCAEW5tSla7YHLMYRowhNxf zBYEeNQDjWjr/8MyYP8DErODQzPAf2NRC9XtWsZ90G9hNoJWETTX3OhVneM1LSDp9jdB/CmhRPI L1P+0nKpnriTsyq9A7al3OjsGw X-Google-Smtp-Source: AGHT+IF4tR956kUR9cpPnn2sFwi5VeVXYICzL+BOy6UFPlMUfshdtWwybksAvrcMrknJvRf/sOFakA== X-Received: by 2002:a05:690c:4d83:b0:70e:7503:1181 with SMTP id 00721157ae682-719b41660f4mr39787257b3.18.1753282047128; Wed, 23 Jul 2025 07:47:27 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:26 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v2 16/32] liveupdate: luo_ioctl: add ioctl interface Date: Wed, 23 Jul 2025 14:46:29 +0000 Message-ID: 
<20250723144649.1696299-17-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog
In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
References: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Introduce the user-space interface for the Live Update Orchestrator via
ioctl commands, enabling external control over the live update process
and management of preserved resources.

Create a character device at /dev/liveupdate. Access to this device
requires the CAP_SYS_ADMIN capability.

A new uAPI header, , defines the necessary structures. The magic number
is registered in Documentation/userspace-api/ioctl/ioctl-number.rst.

Signed-off-by: Pasha Tatashin
---
 .../userspace-api/ioctl/ioctl-number.rst      |   2 +
 include/linux/liveupdate.h                    |  36 +--
 include/uapi/linux/liveupdate.h               | 265 ++++++++++++++++++
 kernel/liveupdate/Makefile                    |   1 +
 kernel/liveupdate/luo_ioctl.c                 | 178 ++++++++++++
 5 files changed, 447 insertions(+), 35 deletions(-)
 create mode 100644 include/uapi/linux/liveupdate.h
 create mode 100644 kernel/liveupdate/luo_ioctl.c

diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index bc91756bde73..8368aa05b4df 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -380,6 +380,8 @@ Code  Seq#    Include File                                           Comments
 0xB8  01-02  uapi/misc/mrvl_cn10k_dpi.h                              Marvell CN10K DPI driver
 0xB8  all    uapi/linux/mshv.h                                       Microsoft Hyper-V /dev/mshv driver
+0xBA  all    uapi/linux/liveupdate.h                                 Pasha Tatashin
+
 0xC0  00-0F  linux/usb/iowarrior.h
 0xCA  00-0F  uapi/misc/cxl.h                                         Dead since 6.15
 0xCA  10-2F  uapi/misc/ocxl.h
diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h
index 
28a8aa4cafca..970447de5d8c 100644
--- a/include/linux/liveupdate.h
+++ b/include/linux/liveupdate.h
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include
 
 /**
  * enum liveupdate_event - Events that trigger live update callbacks.
@@ -53,41 +54,6 @@ enum liveupdate_event {
 	LIVEUPDATE_CANCEL,
 };
 
-/**
- * enum liveupdate_state - Defines the possible states of the live update
- * orchestrator.
- * @LIVEUPDATE_STATE_UNDEFINED:      State has not yet been initialized.
- * @LIVEUPDATE_STATE_NORMAL:         Default state, no live update in progress.
- * @LIVEUPDATE_STATE_PREPARED:       Live update is prepared for reboot; the
- *                                   LIVEUPDATE_PREPARE callbacks have completed
- *                                   successfully.
- *                                   Devices might operate in a limited state
- *                                   for example the participating devices might
- *                                   not be allowed to unbind, and also the
- *                                   setting up of new DMA mappings might be
- *                                   disabled in this state.
- * @LIVEUPDATE_STATE_FROZEN:         The final reboot event
- *                                   (%LIVEUPDATE_FREEZE) has been sent, and the
- *                                   system is performing its final state saving
- *                                   within the "blackout window". User
- *                                   workloads must be suspended. The actual
- *                                   reboot (kexec) into the next kernel is
- *                                   imminent.
- * @LIVEUPDATE_STATE_UPDATED:        The system has rebooted into the next
- *                                   kernel via live update the system is now
- *                                   running the next kernel, awaiting the
- *                                   finish event.
- *
- * These states track the progress and outcome of a live update operation.
- */
-enum liveupdate_state {
-	LIVEUPDATE_STATE_UNDEFINED = 0,
-	LIVEUPDATE_STATE_NORMAL = 1,
-	LIVEUPDATE_STATE_PREPARED = 2,
-	LIVEUPDATE_STATE_FROZEN = 3,
-	LIVEUPDATE_STATE_UPDATED = 4,
-};
-
 struct file;
 
 /**
diff --git a/include/uapi/linux/liveupdate.h b/include/uapi/linux/liveupdate.h
new file mode 100644
index 000000000000..7b12a1073c3c
--- /dev/null
+++ b/include/uapi/linux/liveupdate.h
@@ -0,0 +1,265 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+
+/*
+ * Userspace interface for /dev/liveupdate
+ * Live Update Orchestrator
+ *
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin
+ */
+
+#ifndef _UAPI_LIVEUPDATE_H
+#define _UAPI_LIVEUPDATE_H
+
+#include
+#include
+
+/**
+ * enum liveupdate_state - Defines the possible states of the live update
+ * orchestrator.
+ * @LIVEUPDATE_STATE_UNDEFINED:      State has not yet been initialized.
+ * @LIVEUPDATE_STATE_NORMAL:         Default state, no live update in progress.
+ * @LIVEUPDATE_STATE_PREPARED:       Live update is prepared for reboot; the
+ *                                   LIVEUPDATE_PREPARE callbacks have completed
+ *                                   successfully.
+ *                                   Devices might operate in a limited state
+ *                                   for example the participating devices might
+ *                                   not be allowed to unbind, and also the
+ *                                   setting up of new DMA mappings might be
+ *                                   disabled in this state.
+ * @LIVEUPDATE_STATE_FROZEN:         The final reboot event
+ *                                   (%LIVEUPDATE_FREEZE) has been sent, and the
+ *                                   system is performing its final state saving
+ *                                   within the "blackout window". User
+ *                                   workloads must be suspended. The actual
+ *                                   reboot (kexec) into the next kernel is
+ *                                   imminent.
+ * @LIVEUPDATE_STATE_UPDATED:        The system has rebooted into the next
+ *                                   kernel via live update the system is now
+ *                                   running the next kernel, awaiting the
+ *                                   finish event.
+ *
+ * These states track the progress and outcome of a live update operation.
+ */
+enum liveupdate_state {
+	LIVEUPDATE_STATE_UNDEFINED = 0,
+	LIVEUPDATE_STATE_NORMAL = 1,
+	LIVEUPDATE_STATE_PREPARED = 2,
+	LIVEUPDATE_STATE_FROZEN = 3,
+	LIVEUPDATE_STATE_UPDATED = 4,
+};
+
+/**
+ * struct liveupdate_fd - Holds parameters for preserving and restoring file
+ * descriptors across live update.
+ * @fd:    Input for %LIVEUPDATE_IOCTL_FD_PRESERVE: The user-space file
+ *         descriptor to be preserved.
+ *         Output for %LIVEUPDATE_IOCTL_FD_RESTORE: The new file descriptor
+ *         representing the fully restored kernel resource.
+ * @flags: Unused, reserved for future expansion, must be set to 0.
+ * @token: Input for %LIVEUPDATE_IOCTL_FD_PRESERVE: An opaque, unique token
+ *         identifying the preserved resource.
+ *         Input for %LIVEUPDATE_IOCTL_FD_RESTORE: The token previously
+ *         provided to the preserve ioctl for the resource to be restored.
+ *
+ * This structure is used as the argument for the %LIVEUPDATE_IOCTL_FD_PRESERVE
+ * and %LIVEUPDATE_IOCTL_FD_RESTORE ioctls. These ioctls allow specific types
+ * of file descriptors (for example memfd, kvm, iommufd, and VFIO) to have their
+ * underlying kernel state preserved across a live update cycle.
+ *
+ * To preserve an FD, user space passes this struct to
+ * %LIVEUPDATE_IOCTL_FD_PRESERVE with the @fd field set. On success, the
+ * kernel uses the @token field to uniquely identify the preserved FD.
+ *
+ * After the live update transition, user space passes the struct populated with
+ * the *same* @token to %LIVEUPDATE_IOCTL_FD_RESTORE. The kernel uses the @token
+ * to find the preserved state and, on success, populates the @fd field with a
+ * new file descriptor referring to the restored resource.
+ */
+struct liveupdate_fd {
+	int		fd;
+	__u32		flags;
+	__aligned_u64	token;
+};
+
+/* The ioctl type, documented in ioctl-number.rst */
+#define LIVEUPDATE_IOCTL_TYPE 0xBA
+
+/**
+ * LIVEUPDATE_IOCTL_FD_PRESERVE - Validate and initiate preservation for a file
+ * descriptor.
+ *
+ * Argument: Pointer to &struct liveupdate_fd.
+ *
+ * User sets the @fd field identifying the file descriptor to preserve
+ * (e.g., memfd, kvm, iommufd, VFIO). The kernel validates if this FD type
+ * and its dependencies are supported for preservation. If validation passes,
+ * the kernel marks the FD internally and *initiates the process* of preparing
+ * its state for saving. The actual snapshotting of the state typically occurs
+ * during the subsequent %LIVEUPDATE_IOCTL_PREPARE execution phase, though
+ * some finalization might occur during freeze.
+ * On successful validation and initiation, the kernel uses the @token
+ * field as an opaque identifier representing the resource being preserved.
+ * This token confirms the FD is targeted for preservation and is required for
+ * the subsequent %LIVEUPDATE_IOCTL_FD_RESTORE call after the live update.
+ *
+ * Return: 0 on success (validation passed, preservation initiated), negative
+ * error code on failure (e.g., unsupported FD type, dependency issue,
+ * validation failed).
+ */
+#define LIVEUPDATE_IOCTL_FD_PRESERVE \
+	_IOW(LIVEUPDATE_IOCTL_TYPE, 0x00, struct liveupdate_fd)
+
+/**
+ * LIVEUPDATE_IOCTL_FD_UNPRESERVE - Remove a file descriptor from the
+ * preservation list.
+ *
+ * Argument: Pointer to __u64 token.
+ *
+ * Allows user space to explicitly remove a file descriptor from the set of
+ * items marked as potentially preservable. User space provides a pointer to the
+ * __u64 @token that was previously returned by a successful
+ * %LIVEUPDATE_IOCTL_FD_PRESERVE call (potentially from a prior, possibly
+ * cancelled, live update attempt). The kernel reads the token value from the
+ * provided user-space address.
+ *
+ * On success, the kernel removes the corresponding entry (identified by the
+ * token value read from the user pointer) from its internal preservation list.
+ * The provided @token (representing the now-removed entry) becomes invalid
+ * after this call.
+ *
+ * Return: 0 on success, negative error code on failure (e.g., -EBUSY or -EINVAL
+ * if not in %LIVEUPDATE_STATE_NORMAL, bad address provided, invalid token value
+ * read, token not found).
+ */
+#define LIVEUPDATE_IOCTL_FD_UNPRESERVE \
+	_IOW(LIVEUPDATE_IOCTL_TYPE, 0x01, __u64)
+
+/**
+ * LIVEUPDATE_IOCTL_FD_RESTORE - Restore a previously preserved file descriptor.
+ *
+ * Argument: Pointer to &struct liveupdate_fd.
+ *
+ * User sets the @token field to the value obtained from a successful
+ * %LIVEUPDATE_IOCTL_FD_PRESERVE call before the live update. On success,
+ * the kernel restores the state (saved during the PREPARE/FREEZE phases)
+ * associated with the token and populates the @fd field with a new file
+ * descriptor referencing the restored resource in the current (new) kernel.
+ * This operation must be performed *before* signaling completion via
+ * %LIVEUPDATE_IOCTL_FINISH.
+ *
+ * Return: 0 on success, negative error code on failure (e.g., invalid token).
+ */
+#define LIVEUPDATE_IOCTL_FD_RESTORE \
+	_IOWR(LIVEUPDATE_IOCTL_TYPE, 0x02, struct liveupdate_fd)
+
+/**
+ * LIVEUPDATE_IOCTL_GET_STATE - Query the current state of the live update
+ * orchestrator.
+ *
+ * Argument: Pointer to &enum liveupdate_state.
+ *
+ * The kernel fills the enum value pointed to by the argument with the current
+ * state of the live update subsystem. Possible states are:
+ *
+ * - %LIVEUPDATE_STATE_NORMAL:   Default state; no live update operation is
+ *                               currently in progress.
+ * - %LIVEUPDATE_STATE_PREPARED: The preparation phase (triggered by
+ *                               %LIVEUPDATE_IOCTL_PREPARE) has completed
+ *                               successfully. The system is ready for the
+ *                               reboot transition. Note that some
+ *                               device operations (e.g., unbinding, new DMA
+ *                               mappings) might be restricted in this state.
+ * - %LIVEUPDATE_STATE_UPDATED:  The system has successfully rebooted into the
+ *                               new kernel via live update. 
It is now running
+ *                               the new kernel code and is awaiting the
+ *                               completion signal from user space via
+ *                               %LIVEUPDATE_IOCTL_FINISH after
+ *                               restoration tasks are done.
+ *
+ * See the definition of &enum liveupdate_state for more details on each state.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+#define LIVEUPDATE_IOCTL_GET_STATE \
+	_IOR(LIVEUPDATE_IOCTL_TYPE, 0x03, enum liveupdate_state)
+
+/**
+ * LIVEUPDATE_IOCTL_PREPARE - Initiate preparation phase and trigger state
+ * saving.
+ *
+ * Argument: None.
+ *
+ * Initiates the live update preparation phase. This action corresponds to
+ * the internal %LIVEUPDATE_PREPARE. This typically triggers the saving process
+ * for items marked via the PRESERVE ioctls. This typically occurs *before*
+ * the "blackout window", while user applications (e.g., VMs) may still be
+ * running. Kernel subsystems receiving the %LIVEUPDATE_PREPARE event should
+ * serialize necessary state. This command does not transfer data.
+ *
+ * Return: 0 on success, negative error code on failure. Transitions state
+ * towards %LIVEUPDATE_STATE_PREPARED on success.
+ */
+#define LIVEUPDATE_IOCTL_PREPARE \
+	_IO(LIVEUPDATE_IOCTL_TYPE, 0x04)
+
+/**
+ * LIVEUPDATE_IOCTL_CANCEL - Cancel the live update preparation phase.
+ *
+ * Argument: None.
+ *
+ * Notifies the live update subsystem to abort the preparation sequence
+ * potentially initiated by %LIVEUPDATE_IOCTL_PREPARE. This action
+ * typically corresponds to the internal %LIVEUPDATE_CANCEL kernel event,
+ * which might also be triggered automatically if the PREPARE stage fails
+ * internally.
+ *
+ * When triggered, subsystems receiving the %LIVEUPDATE_CANCEL event should
+ * revert any state changes or actions taken specifically for the aborted
+ * prepare phase (e.g., discard partially serialized state). The kernel
+ * releases resources allocated specifically for this *aborted preparation
+ * attempt*.
+ *
+ * This operation cancels the current *attempt* to prepare for a live update
+ * but does **not** remove previously validated items from the internal list
+ * of potentially preservable resources. Consequently, preservation tokens
+ * previously generated by successful %LIVEUPDATE_IOCTL_FD_PRESERVE calls
+ * generally **remain valid** as identifiers for those potentially preservable
+ * resources. However, since the system state returns towards
+ * %LIVEUPDATE_STATE_NORMAL, user space must initiate a new live update sequence
+ * (starting with %LIVEUPDATE_IOCTL_PREPARE) to proceed with an update
+ * using these (or other) tokens.
+ *
+ * This command does not transfer data. Kernel callbacks for the
+ * %LIVEUPDATE_CANCEL event must not fail.
+ *
+ * Return: 0 on success, negative error code on failure. Transitions state back
+ * towards %LIVEUPDATE_STATE_NORMAL on success.
+ */
+#define LIVEUPDATE_IOCTL_CANCEL \
+	_IO(LIVEUPDATE_IOCTL_TYPE, 0x06)
+
+/**
+ * LIVEUPDATE_IOCTL_FINISH - Signal restoration completion and trigger
+ * cleanup.
+ *
+ * Argument: None.
+ *
+ * Signals that user space has completed all necessary restoration actions in
+ * the new kernel (after a live update reboot). This action corresponds to the
+ * internal %LIVEUPDATE_FINISH kernel event. Calling this ioctl triggers the
+ * cleanup phase: any resources that were successfully preserved but were *not*
+ * subsequently restored (reclaimed) via the RESTORE ioctls will have their
+ * preserved state discarded and associated kernel resources released. Involved
+ * devices may be reset. All desired restorations *must* be completed *before*
+ * this. Kernel callbacks for the %LIVEUPDATE_FINISH event must not fail.
+ * Successfully completing this phase transitions the system state from
+ * %LIVEUPDATE_STATE_UPDATED back to %LIVEUPDATE_STATE_NORMAL. This command does
+ * not transfer data.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+#define LIVEUPDATE_IOCTL_FINISH \
+	_IO(LIVEUPDATE_IOCTL_TYPE, 0x07)
+
+#endif /* _UAPI_LIVEUPDATE_H */
diff --git a/kernel/liveupdate/Makefile b/kernel/liveupdate/Makefile
index b5054140b9a9..cb3ea380f6b9 100644
--- a/kernel/liveupdate/Makefile
+++ b/kernel/liveupdate/Makefile
@@ -7,4 +7,5 @@ obj-$(CONFIG_KEXEC_HANDOVER)		+= kexec_handover.o
 obj-$(CONFIG_KEXEC_HANDOVER_DEBUG)	+= kexec_handover_debug.o
 obj-$(CONFIG_LIVEUPDATE)		+= luo_core.o
 obj-$(CONFIG_LIVEUPDATE)		+= luo_files.o
+obj-$(CONFIG_LIVEUPDATE)		+= luo_ioctl.o
 obj-$(CONFIG_LIVEUPDATE)		+= luo_subsystems.o
diff --git a/kernel/liveupdate/luo_ioctl.c b/kernel/liveupdate/luo_ioctl.c
new file mode 100644
index 000000000000..3de1d243df5a
--- /dev/null
+++ b/kernel/liveupdate/luo_ioctl.c
@@ -0,0 +1,178 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin
+ */
+
+/**
+ * DOC: LUO ioctl Interface
+ *
+ * The IOCTL user-space control interface for the LUO subsystem.
+ * It registers a misc character device, typically found at ``/dev/liveupdate``,
+ * which allows privileged userspace applications (requiring %CAP_SYS_ADMIN) to
+ * manage and monitor the LUO state machine and associated resources like
+ * preservable file descriptors.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "luo_internal.h"
+
+static int luo_ioctl_fd_restore(struct liveupdate_fd *luo_fd)
+{
+	struct file *file;
+	int ret;
+	int fd;
+
+	fd = get_unused_fd_flags(O_CLOEXEC);
+	if (fd < 0) {
+		pr_err("Failed to allocate new fd: %d\n", fd);
+		return fd;
+	}
+
+	ret = luo_retrieve_file(luo_fd->token, &file);
+	if (ret < 0) {
+		put_unused_fd(fd);
+
+		return ret;
+	}
+
+	fd_install(fd, file);
+	luo_fd->fd = fd;
+
+	return 0;
+}
+
+static int luo_open(struct inode *inodep, struct file *filep)
+{
+	if (!capable(CAP_SYS_ADMIN))
+		return -EACCES;
+
+	if (filep->f_flags & O_EXCL)
+		return -EINVAL;
+
+	return 0;
+}
+
+static long luo_ioctl(struct file *filep, unsigned int cmd, unsigned long arg)
+{
+	void __user *argp = (void __user *)arg;
+	struct liveupdate_fd luo_fd;
+	enum liveupdate_state state;
+	int ret = 0;
+	u64 token;
+
+	if (_IOC_TYPE(cmd) != LIVEUPDATE_IOCTL_TYPE)
+		return -ENOTTY;
+
+	switch (cmd) {
+	case LIVEUPDATE_IOCTL_GET_STATE:
+		state = liveupdate_get_state();
+		if (copy_to_user(argp, &state, sizeof(state)))
+			ret = -EFAULT;
+		break;
+
+	case LIVEUPDATE_IOCTL_PREPARE:
+		ret = luo_prepare();
+		break;
+
+	case LIVEUPDATE_IOCTL_FINISH:
+		ret = luo_finish();
+		break;
+
+	case LIVEUPDATE_IOCTL_CANCEL:
+		ret = luo_cancel();
+		break;
+
+	case LIVEUPDATE_IOCTL_FD_PRESERVE:
+		if (copy_from_user(&luo_fd, argp, sizeof(luo_fd))) {
+			ret = -EFAULT;
+			break;
+		}
+
+		ret = luo_register_file(luo_fd.token, luo_fd.fd);
+		if (!ret && copy_to_user(argp, &luo_fd, sizeof(luo_fd))) {
+			WARN_ON_ONCE(luo_unregister_file(luo_fd.token));
+			ret = -EFAULT;
+		}
+		break;
+
+	case LIVEUPDATE_IOCTL_FD_UNPRESERVE:
+		if (copy_from_user(&token, argp, sizeof(u64))) {
+			ret = -EFAULT;
+			break;
+		}
+
+		ret = luo_unregister_file(token);
+		break;
+
+	case LIVEUPDATE_IOCTL_FD_RESTORE:
+		if 
(copy_from_user(&luo_fd, argp, sizeof(luo_fd))) {
+			ret = -EFAULT;
+			break;
+		}
+
+		ret = luo_ioctl_fd_restore(&luo_fd);
+		if (!ret && copy_to_user(argp, &luo_fd, sizeof(luo_fd)))
+			ret = -EFAULT;
+		break;
+
+	default:
+		pr_warn("ioctl: unknown command nr: 0x%x\n", _IOC_NR(cmd));
+		ret = -ENOTTY;
+		break;
+	}
+
+	return ret;
+}
+
+static const struct file_operations fops = {
+	.owner		= THIS_MODULE,
+	.open		= luo_open,
+	.unlocked_ioctl	= luo_ioctl,
+};
+
+static struct miscdevice liveupdate_miscdev = {
+	.minor	= MISC_DYNAMIC_MINOR,
+	.name	= "liveupdate",
+	.fops	= &fops,
+};
+
+static int __init liveupdate_init(void)
+{
+	int err;
+
+	if (!liveupdate_enabled())
+		return 0;
+
+	err = misc_register(&liveupdate_miscdev);
+	if (err < 0) {
+		pr_err("Failed to register misc device '%s': %d\n",
+		       liveupdate_miscdev.name, err);
+	}
+
+	return err;
+}
+module_init(liveupdate_init);
+
+static void __exit liveupdate_exit(void)
+{
+	misc_deregister(&liveupdate_miscdev);
+}
+module_exit(liveupdate_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Pasha Tatashin");
+MODULE_DESCRIPTION("Live Update Orchestrator");
+MODULE_VERSION("0.1");
-- 
2.50.0.727.gbf7dc18ff4-goog

From nobody Mon Oct 6 06:43:24 2025
Received: from mail-yw1-f173.google.com (mail-yw1-f173.google.com [209.85.128.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ED0A32FCE18 for ; Wed, 23 Jul 2025 14:47:31 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.173
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282056; cv=none; b=QxT41W9RQ+HwyEMgz3oPtamgh/I25QnV99uSIw1/sQAbkZycoQjVLsw2s4BRRY9pvN2VvkVuErKNDPF8oJGfE9HgQ0plyomL9mvR83xdp7tK1448xmefzTlJSVMbdKC65/4oXtuz8x2Cl80C9E7XWgbA9gs5tXT29G90S6hWroA=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282056; 
c=relaxed/simple; bh=7oXioOhsPJ9d6KQC4yF3MfCEzgW1M4CcvxGCS2JdQQY=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=lXjzivINsIXa7IohpFNG0fHQtOCWkcpu7hmr69oHKWjBazbQG1M7UsobdW7f74A7LYyQdYjkJbE84/+xicPvFOuQOXiJVOA++gMZkV3UItTpBcRPWxB8OI2bTYgrhf4kv9K18lGSgxfBZA1zyYc77PtAWmghs5VW60dEDbIb80Q= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=WXfJwm3O; arc=none smtp.client-ip=209.85.128.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="WXfJwm3O" Received: by mail-yw1-f173.google.com with SMTP id 00721157ae682-718425f1172so68322667b3.0 for ; Wed, 23 Jul 2025 07:47:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282049; x=1753886849; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=BuaZboxZCQ/wXH7LW7KxgHTaIDnECxtLxwF2+wVPu54=; b=WXfJwm3OaebVlEDJGmQdPTkh9pdJ5wUUs82VdsLwIr+9AbwKSyIdCtbHQgCtpbxaUA FgDihJU97U01ZvPor+JyWmuxCRUW22sjaEHDSNT3P8PQyy+BB8bXguAEcm2VoP9en5h7 Ja1WPI/4euZt+LdclwP0h6GUxcYYAU9AF3L7BxH9imdfEjp+op1pKATX1ugDQEIvnXcE F1y9eoHVYTS49dOzU2I3QorJ99YJ0+7YXraL50L28EP7zGCkNorknP64SSD+PClK58Uq SgQbBjZHlUFIgSv1WVI7qVpaa6a95Wjmdskt292+cUXhrBE8liIx3tVVFOJrbibyfjV7 F1XQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282049; x=1753886849; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=BuaZboxZCQ/wXH7LW7KxgHTaIDnECxtLxwF2+wVPu54=; b=V/XQrTLkfaU/9e59LMZWtdO1/mPAhxX70kqntAYJZ/pZUbLR9HzicZ2KoM/jqdmebh 2x4Zwx/wXD/ZuKIS+oCAYBWdBNtMfyJo8iyAnDEuuG8/1DdBbHKI9uVEbGQXesc6MBIM ydcKz1MZkWapazJy+JF9jTxc6iei2GRTiQrAtM9tGzsCzr4uRRoiGHyBWQGdrOw9qNqJ V0o0WWqSY1sTW/FlvGDKt6Z0ty+c0KDwHgunfYNMMWhHW4N8uc3QneLVdBSgVv2IBsVY HsmiueDxLyxv79rxgsjO5GTNhIR6NwmNUerLUrOFDM0BmdkP44SnCNsUdn1yPWyzOIvM Hj+g== X-Forwarded-Encrypted: i=1; AJvYcCXioQiiqZw3aZ3ni1LA9ZZrGgauxTQohd4r4Lhk+7mBM/wNN8LcoL/NyVPazJIjC+G/0jvU/SnnL6nKdRE=@vger.kernel.org X-Gm-Message-State: AOJu0Yy5yxqZ0WlbiPkJBlF5nK7MIji6paF4o0ippvS/7tJNd2gk3klV nXgFLlvLZTfh3FxP/VamjzG6q50HcWHpmKnZlQW9vPV+VOghHo4sNjcldk9axnn8Ank= X-Gm-Gg: ASbGncvEXikobXeByvMysc0zGK1srF5jkZstRm8F6u0VrUTsF1IXTWootybeffUgYy2 YuB5Rfyv3qLzaiUgrEDZtZwpgCqu0p918+hhT7YTfUGqfOWN4d+bpSOU7oz0gjcfydM5F0r5dot c+Oi7XktBfoYynCYwdOnCuecvg8r+yzScW1Yu65U3R8VkXaSN95EYkrxGSjmozSNjHGIh8suBvs JM97DrvJmmOTi6XeaQfxmJTANcA5FCczvz4Bkv+xg5OpSnEJqos+3MlFvKHIFlsVdlwGgGI2w4k PSjcBIHCXkXh9PjoCRc1NZXTPLk/rFrB7jDZ07jtePi9H10xqgwt/vAnxojkyF2m50W7nY1dWl7 n0eKVNQqQaZvIPqmMHLbBvdZP6icHZpLp0VXcmQYI2m6TErPMHv5J9WFTSX5iux3TlZ3V6P5l5E ezuLQQdMu4LRoFUQ== X-Google-Smtp-Source: AGHT+IHIEOVDLgBUtd06XIvFo8+3Y7nL7WITrWUa0ITVg5fLvibUXHlW/Jf3oLpOE6dYCWJXFxrBHg== X-Received: by 2002:a05:690c:6909:b0:70d:f673:140b with SMTP id 00721157ae682-719b4221fcfmr41663997b3.14.1753282049214; Wed, 23 Jul 2025 07:47:29 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:28 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v2 17/32] liveupdate: luo_sysfs: add sysfs state monitoring Date: Wed, 23 Jul 2025 14:46:30 +0000 Message-ID: 
<20250723144649.1696299-18-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com> References: <20250723144649.1696299-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Introduce a sysfs interface for the Live Update Orchestrator under /sys/kernel/liveupdate/. This interface provides a way for userspace tools and scripts to monitor the current state of the LUO state machine. The main feature is a read-only file, state, which displays the current LUO state as a string ("normal", "prepared", "frozen", "updated"). The interface uses sysfs_notify to allow userspace listeners (e.g., via poll) to be efficiently notified of state changes. ABI documentation for this new sysfs interface is added in Documentation/ABI/testing/sysfs-kernel-liveupdate. This read-only sysfs interface complements the main ioctl interface provided by /dev/liveupdate, which handles LUO control operations and resource management. 
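As an illustration of the poll-based monitoring described above, here is a minimal userspace sketch in C. It assumes the state file lives at /sys/kernel/liveupdate/state and that sysfs_notify() wakes pollers with POLLPRI | POLLERR, which is the usual sysfs convention. The enum and the helper names (luo_parse_state, luo_read_state, luo_wait_state_change) are hypothetical and not part of this patch.

```c
#include <assert.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical userspace mirror of the four strings exposed by the state file. */
enum luo_state {
	LUO_UNKNOWN = -1,
	LUO_NORMAL = 0,
	LUO_PREPARED,
	LUO_FROZEN,
	LUO_UPDATED,
};

/* Map the text read from the state file (trailing newline optional) to an enum. */
enum luo_state luo_parse_state(const char *s)
{
	static const char *const names[] = {
		"normal", "prepared", "frozen", "updated",
	};
	size_t len = strcspn(s, "\n");

	for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		if (strlen(names[i]) == len && strncmp(s, names[i], len) == 0)
			return (enum luo_state)i;
	}
	return LUO_UNKNOWN;
}

/* Rewind and re-read the attribute; sysfs files must be read from offset 0. */
enum luo_state luo_read_state(int fd)
{
	char buf[32];
	ssize_t n;

	if (lseek(fd, 0, SEEK_SET) < 0)
		return LUO_UNKNOWN;
	n = read(fd, buf, sizeof(buf) - 1);
	if (n <= 0)
		return LUO_UNKNOWN;
	buf[n] = '\0';
	return luo_parse_state(buf);
}

/*
 * Block until the kernel signals a change or the timeout expires. The caller
 * is assumed to have read the file once already, which arms the notification.
 * Returns the new state, or LUO_UNKNOWN on timeout or error.
 */
enum luo_state luo_wait_state_change(int fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = fd, .events = POLLPRI | POLLERR };

	if (poll(&pfd, 1, timeout_ms) <= 0)
		return LUO_UNKNOWN;
	return luo_read_state(fd);
}
```

A monitor would open the file with O_RDONLY, call luo_read_state() once to get the initial value, and then loop on luo_wait_state_change(fd, -1), suspending workloads or starting restoration as the returned state dictates.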
Signed-off-by: Pasha Tatashin --- .../ABI/testing/sysfs-kernel-liveupdate | 51 ++++++++++ kernel/liveupdate/Kconfig | 18 ++++ kernel/liveupdate/Makefile | 1 + kernel/liveupdate/luo_core.c | 1 + kernel/liveupdate/luo_internal.h | 6 ++ kernel/liveupdate/luo_sysfs.c | 92 +++++++++++++++++++ 6 files changed, 169 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-kernel-liveupdate create mode 100644 kernel/liveupdate/luo_sysfs.c diff --git a/Documentation/ABI/testing/sysfs-kernel-liveupdate b/Documentat= ion/ABI/testing/sysfs-kernel-liveupdate new file mode 100644 index 000000000000..bb85cbae4943 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-kernel-liveupdate @@ -0,0 +1,51 @@ +What: /sys/kernel/liveupdate/ +Date: May 2025 +KernelVersion: 6.16.0 +Contact: pasha.tatashin@soleen.com +Description: Directory containing interfaces to query the live + update orchestrator. Live update is the ability to reboot the + host kernel (e.g., via kexec, without a full power cycle) while + keeping specifically designated devices operational ("alive") + across the transition. After the new kernel boots, these devices + can be re-attached to their original workloads (e.g., virtual + machines) with their state preserved. This is particularly + useful, for example, for quick hypervisor updates without + terminating running virtual machines. + + +What: /sys/kernel/liveupdate/state +Date: May 2025 +KernelVersion: 6.16.0 +Contact: pasha.tatashin@soleen.com +Description: Read-only file that displays the current state of the live + update orchestrator as a string. Possible values are: + + "normal" No live update operation is in progress. This is + the default operational state. + + "prepared" The live update preparation phase has completed + successfully (e.g., triggered via the + /dev/liveupdate event). Kernel subsystems have + been notified via the %LIVEUPDATE_PREPARE + event/callback and should have initiated state + saving. 
User workloads (e.g., VMs) are generally + still running, but some operations (like device + unbinding or new DMA mappings) might be + restricted. The system is ready for the reboot + trigger. + + "frozen" The final reboot notification has been sent + (e.g., triggered via the 'reboot()' syscall), + corresponding to the %LIVEUPDATE_REBOOT kernel + event. Subsystems have had their final chance to + save state. User workloads must be suspended. + The system is about to execute the reboot into + the new kernel (imminent kexec). This state + corresponds to the "blackout window". + + "updated" The system has successfully rebooted into the + new kernel via live update. Restoration of + preserved resources can now occur (typically via + ioctl commands). The system is awaiting the + final 'finish' signal after user space completes + restoration tasks. diff --git a/kernel/liveupdate/Kconfig b/kernel/liveupdate/Kconfig index f6b0bde188d9..75a17ca8a592 100644 --- a/kernel/liveupdate/Kconfig +++ b/kernel/liveupdate/Kconfig @@ -29,6 +29,24 @@ config LIVEUPDATE =20 If unsure, say N. =20 +config LIVEUPDATE_SYSFS_API + bool "Live Update sysfs monitoring interface" + depends on SYSFS + depends on LIVEUPDATE + help + Enable a sysfs interface for the Live Update Orchestrator + at /sys/kernel/liveupdate/. + + This allows monitoring the LUO state ('normal', 'prepared', + 'frozen', 'updated') via the read-only 'state' file. + + This interface complements the primary /dev/liveupdate ioctl + interface, which handles the full update process. + This sysfs API may be useful for scripting, or userspace monitoring + needed to coordinate application restarts and minimize downtime. + + If unsure, say N. 
+ config KEXEC_HANDOVER bool "kexec handover" depends on ARCH_SUPPORTS_KEXEC_HANDOVER && ARCH_SUPPORTS_KEXEC_FILE diff --git a/kernel/liveupdate/Makefile b/kernel/liveupdate/Makefile index cb3ea380f6b9..e35ddc51ab2b 100644 --- a/kernel/liveupdate/Makefile +++ b/kernel/liveupdate/Makefile @@ -9,3 +9,4 @@ obj-$(CONFIG_LIVEUPDATE) +=3D luo_core.o obj-$(CONFIG_LIVEUPDATE) +=3D luo_files.o obj-$(CONFIG_LIVEUPDATE) +=3D luo_ioctl.o obj-$(CONFIG_LIVEUPDATE) +=3D luo_subsystems.o +obj-$(CONFIG_LIVEUPDATE_SYSFS_API) +=3D luo_sysfs.o diff --git a/kernel/liveupdate/luo_core.c b/kernel/liveupdate/luo_core.c index fff84c51d986..41dbe784445e 100644 --- a/kernel/liveupdate/luo_core.c +++ b/kernel/liveupdate/luo_core.c @@ -100,6 +100,7 @@ static inline bool is_current_luo_state(enum liveupdate= _state expected_state) static void __luo_set_state(enum liveupdate_state state) { WRITE_ONCE(luo_state, state); + luo_sysfs_notify(); } =20 static inline void luo_set_state(enum liveupdate_state state) diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_inter= nal.h index f77e8b3044f9..05cd861ed2a8 100644 --- a/kernel/liveupdate/luo_internal.h +++ b/kernel/liveupdate/luo_internal.h @@ -29,4 +29,10 @@ int luo_retrieve_file(u64 token, struct file **filep); int luo_register_file(u64 token, int fd); int luo_unregister_file(u64 token); =20 +#ifdef CONFIG_LIVEUPDATE_SYSFS_API +void luo_sysfs_notify(void); +#else +static inline void luo_sysfs_notify(void) {} +#endif + #endif /* _LINUX_LUO_INTERNAL_H */ diff --git a/kernel/liveupdate/luo_sysfs.c b/kernel/liveupdate/luo_sysfs.c new file mode 100644 index 000000000000..935946bb741b --- /dev/null +++ b/kernel/liveupdate/luo_sysfs.c @@ -0,0 +1,92 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +/** + * DOC: LUO sysfs interface + * + * Provides a sysfs interface at ``/sys/kernel/liveupdate/`` for monitorin= g LUO + * state. 
Live update allows rebooting the kernel (via kexec) while preserving + * designated device state for attached workloads (e.g., VMs), useful for + * minimizing downtime during hypervisor updates. + * + * /sys/kernel/liveupdate/state + * ---------------------------- + * - Permissions: Read-only + * - Description: Displays the current LUO state string. + * - Valid States: + * @normal + * Idle state. + * @prepared + * Preparation phase complete (triggered via '/dev/liveupdate'). Resources + * checked, state saving initiated via %LIVEUPDATE_PREPARE event. + * Workloads mostly running but may be restricted. Ready for reboot + * trigger. + * @frozen + * Final reboot notification sent (triggered via 'reboot'). Corresponds to + * %LIVEUPDATE_REBOOT event. Final state saving. Workloads must be + * suspended. System about to kexec ("blackout window"). + * @updated + * New kernel booted via live update. Awaiting 'finish' signal. + * + * Userspace Interaction & Blackout Window Reduction + * ------------------------------------------------- + * Userspace monitors the ``state`` file to coordinate actions: + * - Suspend workloads before @frozen state is entered. + * - Initiate resource restoration upon entering @updated state. + * - Resume workloads after restoration, minimizing downtime.
+ */ + +#include +#include +#include +#include "luo_internal.h" + +static bool luo_sysfs_initialized; + +#define LUO_DIR_NAME "liveupdate" + +void luo_sysfs_notify(void) +{ + if (luo_sysfs_initialized) + sysfs_notify(kernel_kobj, LUO_DIR_NAME, "state"); +} + +/* Show the current live update state */ +static ssize_t state_show(struct kobject *kobj, struct kobj_attribute *att= r, + char *buf) +{ + return sysfs_emit(buf, "%s\n", luo_current_state_str()); +} + +static struct kobj_attribute state_attribute =3D __ATTR_RO(state); + +static struct attribute *luo_attrs[] =3D { + &state_attribute.attr, + NULL +}; + +static struct attribute_group luo_attr_group =3D { + .attrs =3D luo_attrs, + .name =3D LUO_DIR_NAME, +}; + +static int __init luo_init(void) +{ + int ret; + + ret =3D sysfs_create_group(kernel_kobj, &luo_attr_group); + if (ret) { + pr_err("Failed to create group\n"); + return ret; + } + + luo_sysfs_initialized =3D true; + pr_info("Initialized\n"); + + return 0; +} +subsys_initcall(luo_init); --=20 2.50.0.727.gbf7dc18ff4-goog From nobody Mon Oct 6 06:43:24 2025 Received: from mail-yw1-f182.google.com (mail-yw1-f182.google.com [209.85.128.182]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4C87E2FCFC3 for ; Wed, 23 Jul 2025 14:47:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.182 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282056; cv=none; b=Cpgj63uw+h4Gjvs05y7a21HUtRvzLzAi5qW/r13HSZBE33rFwDCQwk7C4/PqQaVjYI9D47MfrvqvK9FR3Gl2SJUyeBlB+OP/NCzMA73P1J8Cq6obqyW0zdgfFlP5I9VslTupdtv6dR9ED+QJhEFJ2+/bVMnlO/p2aGvTFOOHTsk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282056; c=relaxed/simple; bh=MeL5rpabCOQ9W64WCDzEoX0aCqKWSq3A09PvP5YRATc=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=sRZLUdAF7tYUy32/xHYgFgRWyFZp6vqW1iJZ8yrm7yCw8xZS9SE8knvT4h2a8bHKs+wkymrZ7ONCmzm1uPj1BujE2bOgAP0P7iWJziFb1aPnH8PGZvPX6aPdjRINDWWStj9HloKj73VgkDuDVmKVYrW9ihDSUszwE2Z/sx3tQMg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=Yyy7zBhk; arc=none smtp.client-ip=209.85.128.182 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="Yyy7zBhk" Received: by mail-yw1-f182.google.com with SMTP id 00721157ae682-70e75f30452so44737367b3.2 for ; Wed, 23 Jul 2025 07:47:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282051; x=1753886851; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=+FWnE5Bow2uoaZcwISvBgq/4l5cKqkAoia4t69mDLc0=; b=Yyy7zBhkzE9t6E8ma3qmVy6Xxm+sC68V/EvMLhqxTwRVXtMOh14j3t1ZR3jFnbZhVL gCkDWN9JRIpqI0pgC4qc/1p1lbggJ45t6IGJjc4xPt4zCxPSiAf+WEr71IZGBO3tpMt9 eEX17hJV+76mpL8wSoMwgWYa/2ogjO5ld5LF5aJ1V+YRv3HP68kBDUx4NY237tGtBCOR PTOYqBRUi/RVpW0msVaUOV57yLYuQc/88nP1HQY0dao5tosixpnQXVvdFwRIwv8AiwE+ DGy6dn5kCoPB5xFMr4f1quxO+OyymmeG7L1abtzK0tX74IoFqMDmbW7ibeukt4TASkub 2HMw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282051; x=1753886851; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; 
bh=+FWnE5Bow2uoaZcwISvBgq/4l5cKqkAoia4t69mDLc0=; b=McSDq5WVVV4fW4tve/mn3YpybJQkMfgyD1wJZ5PMaPDTSCeTvdnLd+uNQtr28BL0Qc eTAzJ/XabDVuMal8Vkx6wh5L/YlBbGX7PW6wvnniaT1n9up5/Rqyn0nzl3Bm252+7cF5 GE3L3JCNSspI0h0AFYjUqTqpsPS+ryzkuAiFWeIB5X0E6u1ArxyR+7ou4pfhdhw9d3gD nqx2SeuP9Uc1fUt4j4+SZm0TXdmjm1uHy0Kwxvp07EWbOYI4hy6/d/sOLks49Ox70fMo HOovBzUXzjgB0+f8DkmWDwIbMm+llLclMJwjjuUKbYQmMr4o5ZcbrhQ2+XM4/7gHi/gm Kiww== X-Forwarded-Encrypted: i=1; AJvYcCUg+isFC5eOWZPzr7CAaOZj57cDF79wWHsPsYzQSMyCF9V4JX2n1byfpqKNwgf2RkVuO5yc0j/mAMDTxzw=@vger.kernel.org X-Gm-Message-State: AOJu0YyvfgOiGnmiJJjKnJGjiPX+8/ljSZKraNY0Y5XQABoVEG9qM4kL fG+GfmlBvAkyuh6TRpvzwtWMT3SbkpyJdKy8Bd556/hF1XqEiA2yk9pDeAzzjR4eTkg= X-Gm-Gg: ASbGncvm0RBW5xTIfZWdydpFNqE4/8OdDNQ+8QLCVUcmZtX8YWk0db/k5vV65xDb3jx m/MMYWvX+/N4DShLmpVQTiRPkvutF7rQLqRKSB4+mbaepqE8rYeLT5tHgO6kVMNGiPDZSmmHRme g0KMEqJ8FnSK99E//yjk34xuITapaPThy7VkkYdzyYKaFDZPnsG4Ubg8ae2M8EdXiR3iA/qndur Tp8NB0fYAMn7ykR2QHbo2pNHqbkUI5vXdh3paunIOr5SFARPwH18Rh6dSsMeQWwhtrNZHj7zNAB Z0/uypVJSPfvlIsuLnhfAazG8DxDkZlfVoqJOzud7kkzQCopju2PuRcYu5GiOCeMzUZx5AETTaP z52kBiuPGmwSArCW6pr1gToBdJGPJkzMnpBHhg8kqG9MbJTlW/NTs3t7uTDi1EiWCN9xNMTzs/U sRKxO4vh9LF9mT4A== X-Google-Smtp-Source: AGHT+IHLyQqcsnOqnRO6HNMaeplbRrfZNdB8+kP2xWX2qbNG9/moHQNfSe1QScqIpYGuy7dJX/Q6FQ== X-Received: by 2002:a05:690c:d1b:b0:719:61b8:ffd7 with SMTP id 00721157ae682-719b422f811mr42827367b3.16.1753282051271; Wed, 23 Jul 2025 07:47:31 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:30 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v2 18/32] reboot: call liveupdate_reboot() before kexec Date: Wed, 23 Jul 2025 14:46:31 +0000 Message-ID: 
<20250723144649.1696299-19-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com> References: <20250723144649.1696299-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Modify the reboot() syscall handler in kernel/reboot.c to call liveupdate_reboot() when processing the LINUX_REBOOT_CMD_KEXEC command. This ensures that the Live Update Orchestrator is notified just before the kernel executes the kexec jump. The liveupdate_reboot() function triggers the final LIVEUPDATE_FREEZE event, allowing participating subsystems to perform last-minute state saving within the blackout window, and transitions the LUO state machine to FROZEN. The call is placed immediately before kernel_kexec() to ensure LUO finalization happens at the latest possible moment before the kernel transition. If liveupdate_reboot() returns an error (indicating a failure during LUO finalization), the kexec operation is aborted to prevent proceeding with an inconsistent state. 
Signed-off-by: Pasha Tatashin --- kernel/reboot.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/kernel/reboot.c b/kernel/reboot.c index ec087827c85c..bdeb04a773db 100644 --- a/kernel/reboot.c +++ b/kernel/reboot.c @@ -13,6 +13,7 @@ #include #include #include +#include #include #include #include @@ -797,6 +798,9 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsig= ned int, cmd, =20 #ifdef CONFIG_KEXEC_CORE case LINUX_REBOOT_CMD_KEXEC: + ret =3D liveupdate_reboot(); + if (ret) + break; ret =3D kernel_kexec(); break; #endif --=20 2.50.0.727.gbf7dc18ff4-goog From nobody Mon Oct 6 06:43:24 2025 Received: from mail-yw1-f182.google.com (mail-yw1-f182.google.com [209.85.128.182]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5B5672FC3B5 for ; Wed, 23 Jul 2025 14:47:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.182 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282060; cv=none; b=HVgBKfbwOohCisk4BfDQHoJBxt8/r4eAkjZe7uNEmsXNSv3MtJ1rSuql0d0UPppvDR1bpw159Ym3Tzf1r0tnQBUpT3oDdQjE7B+Kmbd3wBOkU75iYiRcxXa5rsxU3QBL4559HQA3fq/is2YVsFasEwKgCEAOig9a1nlKUawae18= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282060; c=relaxed/simple; bh=BYE6CFYNp+bC24GEWjI/kpdlZd1bkuK5lqVnOzx7zbo=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=cvuMHvLA/lUhvgHrQZnebgJ5jKXJj2QDvuukdBRQ5+k6i/0JQRw6MWLgmbK6rUtS8P2SDRPl/SO74tYbHwQgwwxuZHwFl0s/MTiSJiOAjN8icyjKHRW4kEsDtKA+BUM6ojdh2yaAgzBcwPKzNt+4P7Wzqk5InqLPon2I3OovCvo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=MfM1vsUL; arc=none smtp.client-ip=209.85.128.182 
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="MfM1vsUL" Received: by mail-yw1-f182.google.com with SMTP id 00721157ae682-7180bb37846so54191737b3.3 for ; Wed, 23 Jul 2025 07:47:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282053; x=1753886853; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=+trs6zLLPmtqa3TLX7M9IRokbE5V0+8Th4y6mj0Kyj4=; b=MfM1vsULL/brUnd0SM/r5d5jGS24kBw36dOk+k1u8tGauy67q+a2kUpzQ4WJYOZNVh vyRKWSai3zvLyqTdnI0CQnvoGUe2IJGRF28YH6qrDzmmYjsMQ1n0YCf6FX/az85vTH0o Dk5VKUC7C2iraFOKHU7kkShfMoFlKDSOvkj0V0sIVPTnrkjHzF2YUkSoRCEhM9wLMYyD foOMi8NyQqiT+CAGnddXqB4hhSTxZJPFs+miQvkazJw6zadmo+SjBwAePiPCSZTG3naj o1sDeLc8O0ehJkrLftHErqBM8OeVYmOLmfIQLKsD0YFHjLFZ0gKUs7ym1mo/VufLY7+V 4PJA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282053; x=1753886853; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=+trs6zLLPmtqa3TLX7M9IRokbE5V0+8Th4y6mj0Kyj4=; b=GL/FcZrpjwRqZGOoc6m6FGU6/XSJS7ICeQJONi0lnUa7dwn310T8M6eOVtifdOcaVP wyFlPQwLRMEeuy0zJnQcoa0frSwyENLK298RNZUfYbkS7wOVRbt/AFGaOUA9zJWjN/yF q2Km0FXYIZBAjjchlDtMOHwYRhFOL7W5RzNlu/vWR85tVTKHX1TkoEwEgvyPXpvBbAcI 5hrG341/fchkTqb9u4GovXxqijHwgjRDqlG1oVNiKwxS5dPxCaWhrx954qcCG36Y8vpY mnv65UW5RkeKn/04nKP+rmQCYwtqvPkDJS4QgUTlJCqJ8kULUHt9exVTWdoukPvBxYr7 qP0w== X-Forwarded-Encrypted: i=1; 
AJvYcCVYDfpzaPjBgPKaFq0CsSdhS2OkGrdcgP+dKYSzH6UtA4gpop5SR9llcEjn83GH0nIXwhNKmjWNw9Q5ecA=@vger.kernel.org X-Gm-Message-State: AOJu0YxT9Ir4IXpOAWfzpWh6t5BEz8kZWuA3WVI6t/gbCHcNzGdPkF5F J3PCw0oU26j92rW/cTNUvCk6PrBWOGThX91uNrw1QWEw89HmPu+WW2eszI0jpfBq+tc= X-Gm-Gg: ASbGncsImqoFqOT5+Q1QXqx7BvvDr3VbImtij0yrFQOMo/C0hUnOgo7tBQSU6pnAWck FZ7T9a7AvUC0/O6JtrFImPJjsWveoIgKvLTGPFRcWy9qRDprEAnJlQ6uwCHvqTS8UBaeSGscmhj lS9Co/Mn/ZdawXY+zLZ32zV1LBT4XabpeRO5MeuSHs8T2mMIxzhWHcMs1X9B+cgoAMuRnfuYUdy LqKQLcuPq8aOq7G2r6qp/taohnBa6qSJIijkyl04jr/ip/4ncGlzAU2Ga2O5EFUijmW5DnW5BK6 D+0b2f7IgNlgPTT9CQAz5tRFcg/uNEI1eNmBuFZzKY2Hq00I9a0wKuYx/wyheeiI2qHQk3eWirP aabZCcxQycYr8tzl1CbKggUfBT/sBZhATPzemw/4ECT36l90KJA6fI6bMGix/O9D8Gzk8UEM+p5 t5lzufbkxdd5DJHA== X-Google-Smtp-Source: AGHT+IFtwvreCL2sXFpP4tWrdElOCBJZZ61/Ssj3Y5oUsXYhD2MOsAfBd7Qfuabw6eVPL1olg8NVIg== X-Received: by 2002:a05:690c:6c86:b0:717:ca51:d781 with SMTP id 00721157ae682-719b4185488mr45868507b3.17.1753282053192; Wed, 23 Jul 2025 07:47:33 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:32 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v2 19/32] liveupdate: luo_files: luo_ioctl: session-based file descriptor tracking Date: Wed, 23 Jul 2025 14:46:32 +0000 Message-ID: 
<20250723144649.1696299-20-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com> References: <20250723144649.1696299-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Currently, file descriptors registered for preservation via /dev/liveupdate remain globally registered with the LUO core until explicitly unregistered. This behavior can lead to unintended consequences: if a userspace process (e.g., a VMM or an agent managing FDs) registers FDs for preservation and then crashes or exits prematurely before LUO transitions to the PREPARED state (and without explicitly unregistering them), these FDs remain marked for preservation. This could carry unnecessary resources over to the next kernel, leave stale state behind, or leak resources. Introduce a session-based approach to FD preservation to address this issue. Each open instance of /dev/liveupdate now corresponds to a "LUO session," which tracks the FDs registered through it. If a LUO session is closed (i.e., the file descriptor for /dev/liveupdate is closed by userspace) while LUO is still in the NORMAL or UPDATED state, all FDs registered during that specific session are automatically unregistered. This ensures that FD preservations are tied to the lifetime of the controlling userspace entity's session, preventing unintentional leakage of preserved FD state into the next kernel if the live update process is not fully initiated and completed for those FDs. FDs are only globally committed for preservation if the LUO state machine progresses beyond NORMAL (i.e., into PREPARED or FROZEN) before the managing session is closed.
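The session rules described above can be sketched as a small userspace model in C. This is not the kernel implementation: the real code keeps a per-session xarray on a list protected by an rwsem, whereas the fixed-size array, the model_* names, and the MODEL_MAX_FILES limit here are assumptions made purely for illustration. The model captures only two behaviours stated in this message and visible in the diff: tokens are unique across all open sessions, and closing a session drops its registrations only while the state is NORMAL or UPDATED.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the LUO state machine states. */
enum model_state { MODEL_NORMAL, MODEL_PREPARED, MODEL_FROZEN, MODEL_UPDATED };

#define MODEL_MAX_FILES 8 /* arbitrary limit for the model only */

/* One entry per open instance of /dev/liveupdate. */
struct model_session {
	uint64_t tokens[MODEL_MAX_FILES];
	size_t count;
};

/*
 * Register a token in session @s. The token must be unused in every open
 * session, not only in the caller's, mirroring the -EEXIST check in the patch.
 */
int model_register(struct model_session *sessions, size_t nr,
		   struct model_session *s, uint64_t token)
{
	for (size_t i = 0; i < nr; i++)
		for (size_t j = 0; j < sessions[i].count; j++)
			if (sessions[i].tokens[j] == token)
				return -EEXIST;

	if (s->count == MODEL_MAX_FILES)
		return -ENOSPC;

	s->tokens[s->count++] = token;
	return 0;
}

/*
 * Close a session. Its files are unregistered only in NORMAL or UPDATED;
 * in PREPARED or FROZEN they stay committed for preservation.
 * Returns the number of registrations dropped.
 */
size_t model_close(struct model_session *s, enum model_state state)
{
	size_t dropped = 0;

	if (state == MODEL_NORMAL || state == MODEL_UPDATED) {
		dropped = s->count;
		s->count = 0;
	}
	return dropped;
}
```

In this model, a VMM that crashes while the state is NORMAL has its session closed and its tokens released, so a later session can reuse them. A session closed after PREPARED keeps its registrations.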
In the future, we can relax this even further, and preserve only when the session is still open while we are already in reboot() system call. Signed-off-by: Pasha Tatashin --- kernel/liveupdate/luo_files.c | 225 ++++++++++++++++++++++++------- kernel/liveupdate/luo_internal.h | 9 +- kernel/liveupdate/luo_ioctl.c | 20 ++- 3 files changed, 197 insertions(+), 57 deletions(-) diff --git a/kernel/liveupdate/luo_files.c b/kernel/liveupdate/luo_files.c index cd956ea69f43..256b5261f81e 100644 --- a/kernel/liveupdate/luo_files.c +++ b/kernel/liveupdate/luo_files.c @@ -67,7 +67,6 @@ #define LUO_FILES_COMPATIBLE "file-descriptors-v1" =20 static DEFINE_XARRAY(luo_files_xa_in); -static DEFINE_XARRAY(luo_files_xa_out); static bool luo_files_xa_in_recreated; =20 /* Registered files. */ @@ -81,6 +80,15 @@ static size_t luo_file_fdt_out_size; =20 static atomic64_t luo_files_count; =20 +/* Opened sessions */ +static DECLARE_RWSEM(luo_sessions_list_rwsem); +static LIST_HEAD(luo_sessions_list); + +struct luo_session { + struct xarray files_xa_out; + struct list_head list; +}; + /** * struct luo_file - Represents a file descriptor instance preserved * across live update. 
@@ -262,6 +270,7 @@ static int luo_files_to_fdt(struct xarray *files_xa_out) =20 static int luo_files_fdt_setup(void) { + struct luo_session *s; int ret; =20 luo_file_fdt_out_size =3D luo_files_fdt_size(); @@ -300,9 +309,15 @@ static int luo_files_fdt_setup(void) if (ret < 0) goto exit_end_node; =20 - ret =3D luo_files_to_fdt(&luo_files_xa_out); - if (ret < 0) - goto exit_end_node; + down_read(&luo_sessions_list_rwsem); + list_for_each_entry(s, &luo_sessions_list, list) { + ret =3D luo_files_to_fdt(&s->files_xa_out); + if (ret < 0) { + up_read(&luo_sessions_list_rwsem); + goto exit_end_node; + } + } + up_read(&luo_sessions_list_rwsem); =20 ret =3D fdt_end_node(luo_file_fdt_out); if (ret < 0) @@ -405,44 +420,58 @@ static void luo_files_cancel_one(struct luo_file *h) =20 static void __luo_files_cancel(struct luo_file *boundary_file) { + struct luo_session *s; unsigned long token; struct luo_file *h; =20 - xa_for_each(&luo_files_xa_out, token, h) { - if (h =3D=3D boundary_file) - break; + down_read(&luo_sessions_list_rwsem); + list_for_each_entry(s, &luo_sessions_list, list) { + xa_for_each(&s->files_xa_out, token, h) { + if (h =3D=3D boundary_file) + goto exit; =20 - luo_files_cancel_one(h); + luo_files_cancel_one(h); + } } +exit: + up_read(&luo_sessions_list_rwsem); luo_files_fdt_cleanup(); } =20 static int luo_files_commit_data_to_fdt(void) { + struct luo_session *s; int node_offset, ret; unsigned long token; char token_str[19]; struct luo_file *h; =20 - xa_for_each(&luo_files_xa_out, token, h) { - snprintf(token_str, sizeof(token_str), "%#0llx", (u64)token); - node_offset =3D fdt_subnode_offset(luo_file_fdt_out, - 0, - token_str); - ret =3D fdt_setprop(luo_file_fdt_out, node_offset, "data", - &h->private_data, sizeof(h->private_data)); - if (ret < 0) { - pr_err("Failed to set data property for token %s: %s\n", - token_str, fdt_strerror(ret)); - return -ENOSPC; + down_read(&luo_sessions_list_rwsem); + list_for_each_entry(s, &luo_sessions_list, list) { +
xa_for_each(&s->files_xa_out, token, h) { + snprintf(token_str, sizeof(token_str), "%#0llx", + (u64)token); + node_offset =3D fdt_subnode_offset(luo_file_fdt_out, + 0, token_str); + ret =3D fdt_setprop(luo_file_fdt_out, node_offset, "data", + &h->private_data, + sizeof(h->private_data)); + if (ret < 0) { + up_read(&luo_sessions_list_rwsem); + pr_err("Failed to set data property for token %s: %s\n", + token_str, fdt_strerror(ret)); + return -ENOSPC; + } } } + up_read(&luo_sessions_list_rwsem); =20 return 0; } =20 static int luo_files_prepare(void *arg, u64 *data) { + struct luo_session *s; unsigned long token; struct luo_file *h; int ret; @@ -451,16 +481,21 @@ static int luo_files_prepare(void *arg, u64 *data) if (ret) return ret; =20 - xa_for_each(&luo_files_xa_out, token, h) { - ret =3D luo_files_prepare_one(h); - if (ret < 0) { - pr_err("Prepare failed for file token %#0llx handler '%s' [%d]\n", - (u64)token, h->fh->compatible, ret); - __luo_files_cancel(h); - - return ret; + down_read(&luo_sessions_list_rwsem); + list_for_each_entry(s, &luo_sessions_list, list) { + xa_for_each(&s->files_xa_out, token, h) { + ret =3D luo_files_prepare_one(h); + if (ret < 0) { + pr_err("Prepare failed for file token %#0llx handler '%s' [%d]\n", + (u64)token, h->fh->compatible, ret); + __luo_files_cancel(h); + up_read(&luo_sessions_list_rwsem); + + return ret; + } } } + up_read(&luo_sessions_list_rwsem); =20 ret =3D luo_files_commit_data_to_fdt(); if (ret) @@ -473,20 +508,26 @@ static int luo_files_prepare(void *arg, u64 *data) =20 static int luo_files_freeze(void *arg, u64 *data) { + struct luo_session *s; unsigned long token; struct luo_file *h; int ret; =20 - xa_for_each(&luo_files_xa_out, token, h) { - ret =3D luo_files_freeze_one(h); - if (ret < 0) { - pr_err("Freeze callback failed for file token %#0llx handler '%s' [%d]\= n", - (u64)token, h->fh->compatible, ret); - __luo_files_cancel(h); - - return ret; + 
down_read(&luo_sessions_list_rwsem); + list_for_each_entry(s, &luo_sessions_list, list) { + xa_for_each(&s->files_xa_out, token, h) { + ret =3D luo_files_freeze_one(h); + if (ret < 0) { + pr_err("Freeze callback failed for file token %#0llx handler '%s' [%d]= \n", + (u64)token, h->fh->compatible, ret); + __luo_files_cancel(h); + up_read(&luo_sessions_list_rwsem); + + return ret; + } } } + up_read(&luo_sessions_list_rwsem); =20 ret =3D luo_files_commit_data_to_fdt(); if (ret) @@ -561,6 +602,7 @@ late_initcall(luo_files_startup); =20 /** * luo_register_file - Register a file descriptor for live update manageme= nt. + * @s: Session for the file that is being registered * @token: Token value for this file descriptor. * @fd: file descriptor to be preserved. * @@ -568,10 +610,11 @@ late_initcall(luo_files_startup); * * Return: 0 on success. Negative errno on failure. */ -int luo_register_file(u64 token, int fd) +int luo_register_file(struct luo_session *s, u64 token, int fd) { struct liveupdate_file_handler *fh; struct luo_file *luo_file; + struct luo_session *_s; bool found =3D false; int ret =3D -ENOENT; struct file *file; @@ -615,15 +658,20 @@ int luo_register_file(u64 token, int fd) mutex_init(&luo_file->mutex); luo_file->state =3D LIVEUPDATE_STATE_NORMAL; =20 - if (xa_load(&luo_files_xa_out, token)) { - ret =3D -EEXIST; - pr_warn("Token %llu is already taken\n", token); - mutex_destroy(&luo_file->mutex); - kfree(luo_file); - goto exit_unlock; + down_read(&luo_sessions_list_rwsem); + list_for_each_entry(_s, &luo_sessions_list, list) { + if (xa_load(&_s->files_xa_out, token)) { + up_read(&luo_sessions_list_rwsem); + ret =3D -EEXIST; + pr_warn("Token %llu is already taken\n", token); + mutex_destroy(&luo_file->mutex); + kfree(luo_file); + goto exit_unlock; + } } + up_read(&luo_sessions_list_rwsem); =20 - ret =3D xa_err(xa_store(&luo_files_xa_out, token, luo_file, + ret =3D xa_err(xa_store(&s->files_xa_out, token, luo_file, GFP_KERNEL)); if (ret < 0) { pr_warn("Failed 
to store file for token %llu in XArray: %d\n", @@ -646,6 +694,7 @@ int luo_register_file(u64 token, int fd) =20 /** * luo_unregister_file - Unregister a file instance using its token. + * @s: Session for the file that is being unregistered. * @token: The unique token of the file instance to unregister. * * Finds the &struct luo_file associated with the @token in the @@ -659,7 +708,7 @@ int luo_register_file(u64 token, int fd) * * Return: 0 on success. Negative errno on failure. */ -int luo_unregister_file(u64 token) +int luo_unregister_file(struct luo_session *s, u64 token) { struct luo_file *luo_file; int ret =3D 0; @@ -671,7 +720,7 @@ int luo_unregister_file(u64 token) return -EBUSY; } =20 - luo_file =3D xa_erase(&luo_files_xa_out, token); + luo_file =3D xa_erase(&s->files_xa_out, token); if (luo_file) { fput(luo_file->file); mutex_destroy(&luo_file->mutex); @@ -736,6 +785,74 @@ int luo_retrieve_file(u64 token, struct file **filep) return ret; } =20 +/** + * luo_create_session - Create and register a new LUO file preservation se= ssion. + * + * This function is called when a userspace process opens the /dev/liveupd= ate + * character device. + * + * Each session allows a specific open instance of /dev/liveupdate to + * independently register file descriptors for preservation. These registr= ations + * are local to the session until LUO's prepare phase aggregates them. + * If the /dev/liveupdate file descriptor is closed while LUO is still in + * the NORMAL or UPDATED states, all file descriptors registered within th= at + * session will be automatically unregistered by luo_destroy_session(). + * + * Return: Pointer to the newly allocated &struct luo_session on success, + * NULL on memory allocation failure.
+ */ +struct luo_session *luo_create_session(void) +{ + struct luo_session *s; + + s =3D kmalloc(sizeof(struct luo_session), GFP_KERNEL); + if (s) { + xa_init(&s->files_xa_out); + INIT_LIST_HEAD(&s->list); + + down_write(&luo_sessions_list_rwsem); + list_add_tail(&s->list, &luo_sessions_list); + up_write(&luo_sessions_list_rwsem); + } + + return s; +} + +/** + * luo_destroy_session - Release a LUO file preservation session. + * @s: Pointer to the &struct luo_session to be destroyed, previously obta= ined + * from luo_create_session(). + * + * This function must be called when a userspace file descriptor for + * /dev/liveupdate is being closed (typically from the .release file + * operation). It is responsible for cleaning up all resources associated + * with the given LUO session @s. + */ +void luo_destroy_session(struct luo_session *s) +{ + unsigned long token; + struct luo_file *h; + + down_write(&luo_sessions_list_rwsem); + list_del(&s->list); + up_write(&luo_sessions_list_rwsem); + + luo_state_read_enter(); + if (!liveupdate_state_normal() && !liveupdate_state_updated()) { + luo_state_read_exit(); + goto skip_unregister; + } + + xa_for_each(&s->files_xa_out, token, h) + luo_unregister_file(s, token); + + luo_state_read_exit(); + +skip_unregister: + xa_destroy(&s->files_xa_out); + kfree(s); +} + /** * liveupdate_register_file_handler - Register a file handler with LUO. * @fh: Pointer to a caller-allocated &struct liveupdate_file_handler. 
@@ -796,6 +913,7 @@ EXPORT_SYMBOL_GPL(liveupdate_register_file_handler); */ int liveupdate_unregister_file_handler(struct liveupdate_file_handler *fh) { + struct luo_session *s; unsigned long token; struct luo_file *h; int ret =3D 0; @@ -807,15 +925,18 @@ int liveupdate_unregister_file_handler(struct liveupd= ate_file_handler *fh) } =20 down_write(&luo_register_file_list_rwsem); - - xa_for_each(&luo_files_xa_out, token, h) { - if (h->fh =3D=3D fh) { - up_write(&luo_register_file_list_rwsem); - luo_state_read_exit(); - return -EBUSY; + down_read(&luo_sessions_list_rwsem); + list_for_each_entry(s, &luo_sessions_list, list) { + xa_for_each(&s->files_xa_out, token, h) { + if (h->fh =3D=3D fh) { + up_read(&luo_sessions_list_rwsem); + up_write(&luo_register_file_list_rwsem); + luo_state_read_exit(); + return -EBUSY; + } } } - + up_read(&luo_sessions_list_rwsem); list_del_init(&fh->list); up_write(&luo_register_file_list_rwsem); luo_state_read_exit(); diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_inter= nal.h index 05cd861ed2a8..8fef414e7e3e 100644 --- a/kernel/liveupdate/luo_internal.h +++ b/kernel/liveupdate/luo_internal.h @@ -25,9 +25,14 @@ int luo_do_subsystems_freeze_calls(void); void luo_do_subsystems_finish_calls(void); void luo_do_subsystems_cancel_calls(void); =20 +struct luo_session; + int luo_retrieve_file(u64 token, struct file **filep); -int luo_register_file(u64 token, int fd); -int luo_unregister_file(u64 token); +int luo_register_file(struct luo_session *s, u64 token, int fd); +int luo_unregister_file(struct luo_session *s, u64 token); + +struct luo_session *luo_create_session(void); +void luo_destroy_session(struct luo_session *s); =20 #ifdef CONFIG_LIVEUPDATE_SYSFS_API void luo_sysfs_notify(void); diff --git a/kernel/liveupdate/luo_ioctl.c b/kernel/liveupdate/luo_ioctl.c index 3de1d243df5a..d2c49cf33dd3 100644 --- a/kernel/liveupdate/luo_ioctl.c +++ b/kernel/liveupdate/luo_ioctl.c @@ -62,6 +62,17 @@ static int luo_open(struct 
inode *inodep, struct file *f= ilep) if (filep->f_flags & O_EXCL) return -EINVAL; =20 + filep->private_data =3D luo_create_session(); + if (!filep->private_data) + return -ENOMEM; + + return 0; +} + +static int luo_release(struct inode *inodep, struct file *filep) +{ + luo_destroy_session(filep->private_data); + return 0; } =20 @@ -101,9 +112,11 @@ static long luo_ioctl(struct file *filep, unsigned int= cmd, unsigned long arg) break; } =20 - ret =3D luo_register_file(luo_fd.token, luo_fd.fd); + ret =3D luo_register_file(filep->private_data, luo_fd.token, + luo_fd.fd); if (!ret && copy_to_user(argp, &luo_fd, sizeof(luo_fd))) { - WARN_ON_ONCE(luo_unregister_file(luo_fd.token)); + WARN_ON_ONCE(luo_unregister_file(filep->private_data, + luo_fd.token)); ret =3D -EFAULT; } break; @@ -114,7 +127,7 @@ static long luo_ioctl(struct file *filep, unsigned int = cmd, unsigned long arg) break; } =20 - ret =3D luo_unregister_file(token); + ret =3D luo_unregister_file(filep->private_data, token); break; =20 case LIVEUPDATE_IOCTL_FD_RESTORE: @@ -140,6 +153,7 @@ static long luo_ioctl(struct file *filep, unsigned int = cmd, unsigned long arg) static const struct file_operations fops =3D { .owner =3D THIS_MODULE, .open =3D luo_open, + .release =3D luo_release, .unlocked_ioctl =3D luo_ioctl, }; =20 --=20 2.50.0.727.gbf7dc18ff4-goog From nobody Mon Oct 6 06:43:24 2025 Received: from mail-yw1-f170.google.com (mail-yw1-f170.google.com [209.85.128.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C91512EF297 for ; Wed, 23 Jul 2025 14:47:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.170 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282061; cv=none; 
b=bs8akdXYjFcA2bxfmocj408PuxEMYG5Junsfj/u8kimwNiUITlzopmahGEDGyURYg6A4gCzLmTfyf+6pvjEfZ6U6JRL0zcqsI1bGc4nOK+C3cHYYrcka359SnT8jj1k4vrRVAy/SxyB7aQ+zpbkA1mMN/q/lQZF4Ye5GOgnXjTs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282061; c=relaxed/simple; bh=0K/tRVmbaZ0VX6fH8M8cdhr3USMmOz4iPh58HsL+dDs=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=MIkybwV4jKEoY+nuHp+V4jQUg/Ijb+uWVg1WeOmnxS3/Jw+lheszRtC73VdDXYeMc9vZ+WcUYE8k0aV8Dn79c7krbxb2KM91aafVllelcCgd2MD+jNPdl5gayg7P5d5S79i39BIqDRYnppGhhmIeCNnPG3Bs1MlrQLaKpHxNiTM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=XB4uIIXV; arc=none smtp.client-ip=209.85.128.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="XB4uIIXV" Received: by mail-yw1-f170.google.com with SMTP id 00721157ae682-717b580ff2aso60414927b3.0 for ; Wed, 23 Jul 2025 07:47:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282055; x=1753886855; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=/ZJDnzlkjFdal2Rjq5d+u9z1RQAbXbOPFvsyuVPVQrw=; b=XB4uIIXVxAgp3MXXqcxtclS+QlBo1mLspvyfzGbSvVFh+GHYyLNz1SwbqG/ytQEcY2 1uI+4xsyIxny4daZZil6Fk2YLl5b2T35jMwudWOY+tlb3MwYk2JJ6J/u/a5GxW38dc6S 9nZrxs9olyr3opgLKdlqpTdGWyoaVsoHNY39fIO70reHcJl3TgND7hL78dZqsTYaDQKl 
y+kRVsAZ9+Jo8Elk9zUs4Xi+m23ZgPB7shObD65vLHkjZfBPs/toeb28MSINYLmfUKX3 PdCM+bpADjJCSQzRde3rfdkTwXDLX4CzuCgBFpzsq5ax+BOT6uKsJFrht7gghAzgO8R3 troQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282055; x=1753886855; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=/ZJDnzlkjFdal2Rjq5d+u9z1RQAbXbOPFvsyuVPVQrw=; b=ikiWGPdH7Za/gwcAsKHoAm8Fiy3yLv7qw15sHJI/ctbvggpyOr8z75YvFnXOktClwi SzltlYHioIPUp9vUmuqIdZlg/DEDUXrolpYyjg1HectLoxVZFauwE4v8wm+t6yMFAAVo Xl/WZ4C7YyE66nvUcGRnfXAOBD03S45Xr1SqX5dkvOOFS452DgdMPXFhahtFpiwEYppp jfv/g/TZc+tKxr28P02gd5LSgknVdobSPGkpGvPrDNXN0qg1zEtVtNK2kD4b2GJAOi3L pno5TQ58/mUDu7Ua3tl8K9Tb4XDw/e3IWXRANcsT4hhy7nZM09vh6mhGUNrfaCh5Njak 4XqQ== X-Forwarded-Encrypted: i=1; AJvYcCWAgwpvxh9qa4xO+DpbAizx+LWEC4a9NRAg9Kic3Nzr4Pr0tNimfbsAPx17xP9Y4mX5i5mOy6socw2xveU=@vger.kernel.org X-Gm-Message-State: AOJu0Yx5IMFU16oXp6hOrrunxBoyxsPvAlRnva8CVHeZXinSKd8tO0A+ 03mDlw73j/Xft50Tj38EiSRvhvaLBVlh7RgmZcIFBLKjklAsppCV6sf3cLakvdarlk8= X-Gm-Gg: ASbGncs+XySN00e3BRyK7V7hCQfQg6zK0atcn3hJ/SuJa+HDZgYUqMQYm/u+CekNPBw l0d0OWI3qIqUHg2pEBgCTUdqzKod9s5dPlt4Rw1LNWNzdEQNdcFR8e/2fUIK1z/Dk3NS0hMjul/ Bq1qerxx88bOqcmmhUB4kte9055E9RIHgXv980J1STEAZlFkhrrTCX9059OMrMyALFvl0Np0QD8 e4nuY3LRMY/f8NUwRlv8D/bEguQXlWIo2SL5xPo4U2dZadoC3wa60JVpcckFBL/sMrTsVIZPSus ivI+/9bjaSJhE5BFHXPpbhuLFM+P8cD4RC71YkLYFtetldf2Tu4lxxiJ8ZlwhziCKlARfU2qW0k yI5GLNQ4TYsTNuxeIMjMnGWvFqyWEXr/1IJ/MWtc1xMCkgV5SvPD4gxImQ1l3gIohxDx7WBMPV8 fE81DIOEVxDhOcrw== X-Google-Smtp-Source: AGHT+IHlIrN5M9MAslsUdLs3lHTAX0ZWx0BjoPEfQHGEOchWYmIsSv7ekgCGt5Hg7CwZYoZ1gLsHVQ== X-Received: by 2002:a05:690c:3804:b0:70e:7638:a3a9 with SMTP id 00721157ae682-719b4200595mr38740617b3.18.1753282055367; Wed, 23 Jul 2025 07:47:35 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:34 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v2 20/32] kho: move kho debugfs directory to liveupdate Date: Wed, 23 Jul 2025 14:46:33 +0000 Message-ID: 
<20250723144649.1696299-21-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com> References: <20250723144649.1696299-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Now that LUO and KHO both live under kernel/liveupdate, it makes sense to also move the kho debugfs files to liveupdate/. The old names: /sys/kernel/debug/kho/out/ /sys/kernel/debug/kho/in/ The new names: /sys/kernel/debug/liveupdate/kho_out/ /sys/kernel/debug/liveupdate/kho_in/ Also, export the liveupdate_debugfs_root, so LUO selftests can use it as well. Signed-off-by: Pasha Tatashin --- kernel/liveupdate/kexec_handover_debug.c | 11 ++++++----- kernel/liveupdate/luo_internal.h | 4 ++++ 2 files changed, 10 insertions(+), 5 deletions(-) diff --git a/kernel/liveupdate/kexec_handover_debug.c b/kernel/liveupdate/k= exec_handover_debug.c index af4bad225630..f06d6cdfeab3 100644 --- a/kernel/liveupdate/kexec_handover_debug.c +++ b/kernel/liveupdate/kexec_handover_debug.c @@ -14,8 +14,9 @@ #include #include #include "kexec_handover_internal.h" +#include "luo_internal.h" =20 -static struct dentry *debugfs_root; +struct dentry *liveupdate_debugfs_root; =20 struct fdt_debugfs { struct list_head list; @@ -120,7 +121,7 @@ __init void kho_in_debugfs_init(struct kho_debugfs *dbg= , const void *fdt) =20 INIT_LIST_HEAD(&dbg->fdt_list); =20 - dir =3D debugfs_create_dir("in", debugfs_root); + dir =3D debugfs_create_dir("in", liveupdate_debugfs_root); if (IS_ERR(dir)) { err =3D PTR_ERR(dir); goto err_out; @@ -180,7 +181,7 @@ __init int kho_out_debugfs_init(struct kho_debugfs *dbg) =20 INIT_LIST_HEAD(&dbg->fdt_list); =20 - dir =3D debugfs_create_dir("out", debugfs_root); + dir =3D debugfs_create_dir("out", liveupdate_debugfs_root); if
(IS_ERR(dir)) return -ENOMEM; =20 @@ -214,8 +215,8 @@ __init int kho_out_debugfs_init(struct kho_debugfs *dbg) =20 __init int kho_debugfs_init(void) { - debugfs_root =3D debugfs_create_dir("kho", NULL); - if (IS_ERR(debugfs_root)) + liveupdate_debugfs_root =3D debugfs_create_dir("liveupdate", NULL); + if (IS_ERR(liveupdate_debugfs_root)) return -ENOENT; return 0; } diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_inter= nal.h index 8fef414e7e3e..fbb9c6642d19 100644 --- a/kernel/liveupdate/luo_internal.h +++ b/kernel/liveupdate/luo_internal.h @@ -40,4 +40,8 @@ void luo_sysfs_notify(void); static inline void luo_sysfs_notify(void) {} #endif =20 +#ifdef CONFIG_KEXEC_HANDOVER_DEBUG +extern struct dentry *liveupdate_debugfs_root; +#endif + #endif /* _LINUX_LUO_INTERNAL_H */ --=20 2.50.0.727.gbf7dc18ff4-goog From nobody Mon Oct 6 06:43:24 2025 Received: from mail-yw1-f179.google.com (mail-yw1-f179.google.com [209.85.128.179]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C42772FD5AA for ; Wed, 23 Jul 2025 14:47:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.179 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282065; cv=none; b=YSPb2AWlCF1jvaFG/585iHaZwYsrpPtueOTqwCNhrXjtC5VXns2QfeyHyulBywaBW1KJIBnFGex/OrDWinX+sREhpSSGIoQ+sRExEqJsCxDwHidhBS9VDXQ5zbchgUcN1ZVVCfZC2Oeop+c71/2BxzYEer1mQQVtboMx0/MdkOk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282065; c=relaxed/simple; bh=g2oc8a0/4jtFiZLrYOjPs43EVupkayV5fkDkA66r10c=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=f/db74wVqW07bqgrqdSeJUh0CgBIBvXIseL/CWjQuU21lLg66BK96x6xeQOENcc35DZ5KdhHJ5A9Hdyd8Jzpu7DX0FXtaqmwjl6LWkkafm3R9CLwyr6HLxA0B1R53ZRuplz9azPDBtFqbt4Pdk4JUrc5r7kY9HIWc07f/HGOamU= ARC-Authentication-Results: i=1; 
smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=2/Qp7cGk; arc=none smtp.client-ip=209.85.128.179 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="2/Qp7cGk" Received: by mail-yw1-f179.google.com with SMTP id 00721157ae682-712be7e034cso67228907b3.0 for ; Wed, 23 Jul 2025 07:47:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282058; x=1753886858; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=anrDwp0eME7u1ALRQxj17PhFcownCkP9aNfbuxryMlU=; b=2/Qp7cGk2/bDNnXciWER2FZG2y1065JDQjMulx5z9Ad/fmkZFuueknV9BO7bAklnoA QKoW+I/S0UEvW9A2lcMjmpDAaWODqqxk4C22VrBD/IjUxDai1f46dDgsxGK6m8SsA3u3 9yrz5Uk5RWdQUcS0c8RrYQjsSl7OggH8YnVaLtFRc/SmWRPm0ZbvGt1QUk34DbevxYGV dkxrItqy6GBA8o3RCNsQ8p0DwBCIANp1t1nGnAEez+Iq2uye+lQ53bTSNHcB5/89cFD3 9EwDBnJ64dwCIn4OVcNWhQtAUFR3ly1AfCJeBQvrOd3c7rJ9vb+HU50Z8heAFNZ6RoK/ 5K1w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282058; x=1753886858; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=anrDwp0eME7u1ALRQxj17PhFcownCkP9aNfbuxryMlU=; b=V5vUKo7p622le1rnmQp1dH7648IWONiQcqEemuemDJE00Qb1z1lUlewF0WI48oesCQ muVyE4X8aIUBwR+Fy2RqaLHYhM3185IlC6C5QOZzqNMQLC8vFXvuOiZLJnl5vWGeKg5H 
ZqMqIoPmzl+TD495hHFlBYBTdqGXz598jIoMnklSawE/CFkmGuYrdJcv4R/sPt1bcOhJ q8VXVb9U31o/dtVFcjL4lDaX1zB6ASL2n97ZgDbfMco6E0xWfdjuJLywZj4wk4FiLfPP 1J3PlM6AM/gdOcFETZLC0/shC+sn5ZKjws0sCqKKr897m8G0iUv8Em8tKX8Rn0gIS2/P RLGw== X-Forwarded-Encrypted: i=1; AJvYcCXf604RiIvizmBkzHLi0en/6gq8X5eE9dgsfBIsi7afaP/ik616iKO2F2/99WlM2VDUP5M1wuyyhVRBsVQ=@vger.kernel.org X-Gm-Message-State: AOJu0YxLVthyUZy7jC2cHOc2JOsxAtopYpYYOwB91M4PPS9bLeDa7cw3 MdZUha9LMf+mKAiCuV6b/fOaYbur4YZCkeug+iTfpreknBeXPEDLAp7Usy/L1xQ8yVI= X-Gm-Gg: ASbGncso6PKNaG1kExybpZq6MNkVo9UjQGFaOzr6FnxMHn2Vax84XK3d41xFb2bQPk+ vacm5HxBMh6Xfm1WoO1g/Sqw4Xk4wMxLsfe2ZQM01TahzWk7caviXVs2QP3ErrZnN7BAp/j27zx 2cZpVJWMWI6efPYEPSkENXUtHxTMFKpjXTRXpUJRiFLvArz4AaDV5kDFZCl2+4KpD+JsYaj3xUY q4+MlkXbKrmhnsBq4Y74EFQ5ppRGQ3O8l4LkDD6Hmr2jJuLZcLxgX+Qu3LMSeBIZCOvRXwqhXZ0 Q79EGLItR1/UQiQUt4MfrGoOqfmsBurx3dLNRNF0Q7fzwLYyMpigTO1VCRwPacKq0jcA3+f93WH DEm4vCAQ+HtTAp2PJobuRSVlK5Gb9Qlql80kixDrAiMn1aae7NiL9O0rsBbI1577OmpPJ884NUK uQ6loUo/1SJquuIQ== X-Google-Smtp-Source: AGHT+IEq0PV8i6iKON0XPIItsPa9CfAALAQ3IWvTSJEL0fMdHI3uPm5IDBAuVJRK3YeOMk6mgAsf0g== X-Received: by 2002:a05:690c:c14:b0:702:d85:5347 with SMTP id 00721157ae682-719b4238961mr41715367b3.36.1753282057544; Wed, 23 Jul 2025 07:47:37 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:36 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v2 21/32] liveupdate: add selftests for subsystems un/registration Date: Wed, 23 Jul 2025 14:46:34 +0000 Message-ID: 
<20250723144649.1696299-22-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com> References: <20250723144649.1696299-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Introduce a self-test mechanism for the LUO to allow verification of core subsystem management functionality. This is primarily intended for developers and system integrators validating the live update feature. The tests are enabled via the new Kconfig option CONFIG_LIVEUPDATE_SELFTESTS (default 'n') and are triggered through a new ioctl command, LIVEUPDATE_IOCTL_SELFTESTS, added to the /dev/liveupdate device node. This ioctl accepts commands defined in luo_selftests.h to: - LUO_CMD_SUBSYSTEM_REGISTER: Creates and registers a dummy LUO subsystem using the liveupdate_register_subsystem() function. It allocates a data page and copies initial data from userspace. - LUO_CMD_SUBSYSTEM_UNREGISTER: Unregisters the specified dummy subsystem using the liveupdate_unregister_subsystem() function and cleans up associated test resources. - LUO_CMD_SUBSYSTEM_GETDATA: Copies the data page associated with a registered test subsystem back to userspace, allowing verification of data potentially modified or preserved by test callbacks. This provides a way to test the fundamental registration and unregistration flows within the LUO framework from userspace without requiring a full live update sequence. 
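For readers who want to reason about the three commands above without a test kernel, the bookkeeping they imply can be approximated in plain userspace C. The following is only a sketch under stated assumptions, not the code from this patch: every sketch_* name and SKETCH_* constant is hypothetical, malloc() stands in for the kernel data page, and the calls to liveupdate_register_subsystem() and liveupdate_unregister_subsystem() are not modelled at all. The error codes follow what the selftest code in this patch returns for a full table (-ENOMEM) and an unknown name (-EINVAL).

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical limits for this model; the real ones live in luo_selftests.h. */
#define SKETCH_MAX_SUBSYSTEMS 4
#define SKETCH_NAME_LENGTH 32
#define SKETCH_PAGE_SIZE 4096

struct sketch_subsystem {
	char name[SKETCH_NAME_LENGTH];
	unsigned char *data;	/* stands in for the kernel data page */
	int in_use;
};

static struct sketch_subsystem sketch_table[SKETCH_MAX_SUBSYSTEMS];

/* Return the slot index of a registered name, or -EINVAL if it is absent. */
static int sketch_find(const char *name)
{
	int i;

	assert(name != NULL);
	for (i = 0; i < SKETCH_MAX_SUBSYSTEMS; i++) {
		if (sketch_table[i].in_use &&
		    !strcmp(sketch_table[i].name, name))
			return i;
	}

	return -EINVAL;
}

/* Model of REGISTER: claim a free slot and copy the caller's initial page. */
int sketch_register(const char *name, const void *page)
{
	int i;

	assert(name != NULL && page != NULL);
	for (i = 0; i < SKETCH_MAX_SUBSYSTEMS; i++) {
		if (!sketch_table[i].in_use)
			break;
	}
	if (i == SKETCH_MAX_SUBSYSTEMS)
		return -ENOMEM;

	sketch_table[i].data = malloc(SKETCH_PAGE_SIZE);
	if (!sketch_table[i].data)
		return -ENOMEM;

	memcpy(sketch_table[i].data, page, SKETCH_PAGE_SIZE);
	strncpy(sketch_table[i].name, name, SKETCH_NAME_LENGTH - 1);
	sketch_table[i].name[SKETCH_NAME_LENGTH - 1] = '\0';
	sketch_table[i].in_use = 1;

	return 0;
}

/* Model of UNREGISTER: release the page and mark the slot free again. */
int sketch_unregister(const char *name)
{
	int i = sketch_find(name);

	if (i < 0)
		return i;

	free(sketch_table[i].data);
	sketch_table[i].data = NULL;
	sketch_table[i].in_use = 0;

	return 0;
}

/* Model of GETDATA: copy the stored page back into the caller's buffer. */
int sketch_getdata(const char *name, void *page)
{
	int i = sketch_find(name);

	if (i < 0)
		return i;

	memcpy(page, sketch_table[i].data, SKETCH_PAGE_SIZE);

	return 0;
}
```

A caller would fill a SKETCH_PAGE_SIZE buffer, pass it to sketch_register(), read it back with sketch_getdata(), and finally call sketch_unregister(). The fixed-size table keeps the model close to the slot array used by the selftest code, at the cost of a hard upper bound on concurrently registered test subsystems.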
Signed-off-by: Pasha Tatashin --- kernel/liveupdate/Kconfig | 15 ++ kernel/liveupdate/Makefile | 1 + kernel/liveupdate/luo_selftests.c | 344 ++++++++++++++++++++++++++++++ kernel/liveupdate/luo_selftests.h | 84 ++++++++ 4 files changed, 444 insertions(+) create mode 100644 kernel/liveupdate/luo_selftests.c create mode 100644 kernel/liveupdate/luo_selftests.h diff --git a/kernel/liveupdate/Kconfig b/kernel/liveupdate/Kconfig index 75a17ca8a592..5be04ede357d 100644 --- a/kernel/liveupdate/Kconfig +++ b/kernel/liveupdate/Kconfig @@ -47,6 +47,21 @@ config LIVEUPDATE_SYSFS_API =20 If unsure, say N. =20 +config LIVEUPDATE_SELFTESTS + bool "Live Update Orchestrator - self-tests" + depends on LIVEUPDATE + help + =C2=A0 Say Y here to build self-tests for the LUO framework. When enabled, + these tests can be initiated via the ioctl interface to help verify + the core live update functionality. + + =C2=A0 This option is primarily intended for developers working on the + =C2=A0 live update feature or for validation purposes during system + =C2=A0 integration. + + =C2=A0 If you are unsure or are building a production kernel where size + =C2=A0 or attack surface is a concern, say N. 
+ config KEXEC_HANDOVER bool "kexec handover" depends on ARCH_SUPPORTS_KEXEC_HANDOVER && ARCH_SUPPORTS_KEXEC_FILE diff --git a/kernel/liveupdate/Makefile b/kernel/liveupdate/Makefile index e35ddc51ab2b..dfb63414cab2 100644 --- a/kernel/liveupdate/Makefile +++ b/kernel/liveupdate/Makefile @@ -7,6 +7,7 @@ obj-$(CONFIG_KEXEC_HANDOVER) +=3D kexec_handover.o obj-$(CONFIG_KEXEC_HANDOVER_DEBUG) +=3D kexec_handover_debug.o obj-$(CONFIG_LIVEUPDATE) +=3D luo_core.o obj-$(CONFIG_LIVEUPDATE) +=3D luo_files.o +obj-$(CONFIG_LIVEUPDATE_SELFTESTS) +=3D luo_selftests.o obj-$(CONFIG_LIVEUPDATE) +=3D luo_ioctl.o obj-$(CONFIG_LIVEUPDATE) +=3D luo_subsystems.o obj-$(CONFIG_LIVEUPDATE_SYSFS_API) +=3D luo_sysfs.o diff --git a/kernel/liveupdate/luo_selftests.c b/kernel/liveupdate/luo_self= tests.c new file mode 100644 index 000000000000..a198195fd1a5 --- /dev/null +++ b/kernel/liveupdate/luo_selftests.c @@ -0,0 +1,344 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +/** + * DOC: LUO Selftests + * + * We provide an ioctl-based selftest interface for the LUO. It provides a + * mechanism to test core LUO functionality, particularly the registration, + * unregistration, and data handling aspects of LUO subsystems, without + * requiring a full live update event sequence. + * + * The tests are intended primarily for developers working on the LUO fram= ework + * or for validation purposes during system integration. This functionalit= y is + * conditionally compiled based on the `CONFIG_LIVEUPDATE_SELFTESTS` Kconf= ig + * option and should typically be disabled in production kernels. + * + * Interface: + * The selftests are accessed via the `/dev/liveupdate` character device u= sing + * the `LIVEUPDATE_IOCTL_SELFTESTS` ioctl command.
The argument to the ioc= tl + * is a pointer to a `struct liveupdate_selftest` structure (defined in + * `uapi/linux/liveupdate.h`), which contains: + * - `cmd`: The specific selftest command to execute (e.g., + * `LUO_CMD_SUBSYSTEM_REGISTER`). + * - `arg`: A pointer to a command-specific argument structure. For subsys= tem + * tests, this points to a `struct luo_arg_subsystem` (defined in + * `luo_selftests.h`). + * + * Commands: + * - `LUO_CMD_SUBSYSTEM_REGISTER`: + * Registers a new dummy LUO subsystem. It allocates kernel memory for test + * data, copies initial data from the user-provided `data_page`, sets up + * simple logging callbacks, and calls the core + * `liveupdate_register_subsystem()` + * function. Requires `arg` pointing to `struct luo_arg_subsystem`. + * - `LUO_CMD_SUBSYSTEM_UNREGISTER`: + * Unregisters a previously registered dummy subsystem identified by `name= `. + * It calls the core `liveupdate_unregister_subsystem()` function and then + * frees the associated kernel memory and internal tracking structures. + * Requires `arg` pointing to `struct luo_arg_subsystem` (only `name` used= ). + * - `LUO_CMD_SUBSYSTEM_GETDATA`: + * Copies the content of the kernel data page associated with the specified + * dummy subsystem (`name`) back to the user-provided `data_page`. This al= lows + * userspace to verify the state of the data after potential test operatio= ns. + * Requires `arg` pointing to `struct luo_arg_subsystem`. 
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "luo_internal.h"
+#include "luo_selftests.h"
+
+static struct luo_subsystems {
+	struct liveupdate_subsystem handle;
+	char name[LUO_NAME_LENGTH];
+	void *data;
+	bool in_use;
+	bool preserved;
+} luo_subsystems[LUO_MAX_SUBSYSTEMS];
+
+/* Only allow one selftest ioctl operation at a time */
+static DEFINE_MUTEX(luo_ioctl_mutex);
+
+static int luo_subsystem_prepare(void *arg, u64 *data)
+{
+	unsigned long i = (unsigned long)arg;
+	unsigned long phys_addr = __pa(luo_subsystems[i].data);
+	int ret;
+
+	ret = kho_preserve_phys(phys_addr, PAGE_SIZE);
+	if (ret)
+		return ret;
+
+	luo_subsystems[i].preserved = true;
+	*data = phys_addr;
+	pr_info("Subsystem '%s' prepare data[%lx]\n",
+		luo_subsystems[i].name, phys_addr);
+
+	if (strstr(luo_subsystems[i].name, NAME_PREPARE_FAIL))
+		return -EAGAIN;
+
+	return 0;
+}
+
+static int luo_subsystem_freeze(void *arg, u64 *data)
+{
+	unsigned long i = (unsigned long)arg;
+
+	pr_info("Subsystem '%s' freeze data[%llx]\n",
+		luo_subsystems[i].name, *data);
+
+	return 0;
+}
+
+static void luo_subsystem_cancel(void *arg, u64 data)
+{
+	unsigned long i = (unsigned long)arg;
+
+	pr_info("Subsystem '%s' cancel data[%llx]\n",
+		luo_subsystems[i].name, data);
+	luo_subsystems[i].preserved = false;
+	WARN_ON(kho_unpreserve_phys(data, PAGE_SIZE));
+}
+
+static void luo_subsystem_finish(void *arg, u64 data)
+{
+	unsigned long i = (unsigned long)arg;
+
+	pr_info("Subsystem '%s' finish data[%llx]\n",
+		luo_subsystems[i].name, data);
+}
+
+static const struct liveupdate_subsystem_ops luo_selftest_subsys_ops = {
+	.prepare = luo_subsystem_prepare,
+	.freeze = luo_subsystem_freeze,
+	.cancel = luo_subsystem_cancel,
+	.finish = luo_subsystem_finish,
+};
+
+static int luo_subsystem_idx(char *name)
+{
+	int i;
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++) {
+		if (luo_subsystems[i].in_use &&
+		    !strcmp(luo_subsystems[i].name, name))
+			break;
+	}
+
+	if (i == LUO_MAX_SUBSYSTEMS) {
+		pr_warn("Subsystem with name '%s' is not registered\n", name);
+
+		return -EINVAL;
+	}
+
+	return i;
+}
+
+static void luo_put_and_free_subsystem(char *name)
+{
+	int i = luo_subsystem_idx(name);
+
+	if (i < 0)
+		return;
+
+	if (luo_subsystems[i].preserved)
+		kho_unpreserve_phys(__pa(luo_subsystems[i].data), PAGE_SIZE);
+	free_page((unsigned long)luo_subsystems[i].data);
+	luo_subsystems[i].in_use = false;
+	luo_subsystems[i].preserved = false;
+}
+
+static int luo_get_and_alloc_subsystem(char *name, void __user *data,
+				       struct liveupdate_subsystem **hp)
+{
+	unsigned long page_addr, i;
+
+	page_addr = get_zeroed_page(GFP_KERNEL);
+	if (!page_addr) {
+		pr_warn("Failed to allocate memory for subsystem data\n");
+		return -ENOMEM;
+	}
+
+	if (copy_from_user((void *)page_addr, data, PAGE_SIZE)) {
+		free_page(page_addr);
+		return -EFAULT;
+	}
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++) {
+		if (!luo_subsystems[i].in_use)
+			break;
+	}
+
+	if (i == LUO_MAX_SUBSYSTEMS) {
+		pr_warn("Maximum number of subsystems registered\n");
+		free_page(page_addr);
+		return -ENOMEM;
+	}
+
+	luo_subsystems[i].in_use = true;
+	luo_subsystems[i].handle.ops = &luo_selftest_subsys_ops;
+	luo_subsystems[i].handle.name = luo_subsystems[i].name;
+	luo_subsystems[i].handle.arg = (void *)i;
+	strscpy(luo_subsystems[i].name, name, LUO_NAME_LENGTH);
+	luo_subsystems[i].data = (void *)page_addr;
+
+	*hp = &luo_subsystems[i].handle;
+
+	return 0;
+}
+
+static int luo_cmd_subsystem_unregister(void __user *argp)
+{
+	struct luo_arg_subsystem arg;
+	int ret, i;
+
+	if (copy_from_user(&arg, argp, sizeof(arg)))
+		return -EFAULT;
+
+	i = luo_subsystem_idx(arg.name);
+	if (i < 0)
+		return i;
+
+	ret = liveupdate_unregister_subsystem(&luo_subsystems[i].handle);
+	if (ret)
+		return ret;
+
+	luo_put_and_free_subsystem(arg.name);
+
+	return 0;
+}
+
+static int luo_cmd_subsystem_register(void __user *argp)
+{
+	struct liveupdate_subsystem *h;
+	struct luo_arg_subsystem arg;
+	int ret;
+
+	if (copy_from_user(&arg, argp, sizeof(arg)))
+		return -EFAULT;
+
+	ret = luo_get_and_alloc_subsystem(arg.name,
+					  (void __user *)arg.data_page, &h);
+	if (ret)
+		return ret;
+
+	ret = liveupdate_register_subsystem(h);
+	if (ret)
+		luo_put_and_free_subsystem(arg.name);
+
+	return ret;
+}
+
+static int luo_cmd_subsystem_getdata(void __user *argp)
+{
+	struct luo_arg_subsystem arg;
+	int i;
+
+	if (copy_from_user(&arg, argp, sizeof(arg)))
+		return -EFAULT;
+
+	i = luo_subsystem_idx(arg.name);
+	if (i < 0)
+		return i;
+
+	if (copy_to_user(arg.data_page, luo_subsystems[i].data,
+			 PAGE_SIZE)) {
+		return -EFAULT;
+	}
+
+	return 0;
+}
+
+static int luo_ioctl_selftests(void __user *argp)
+{
+	struct liveupdate_selftest luo_st;
+	void __user *cmd_argp;
+	int ret = 0;
+
+	if (copy_from_user(&luo_st, argp, sizeof(luo_st)))
+		return -EFAULT;
+
+	cmd_argp = (void __user *)luo_st.arg;
+
+	mutex_lock(&luo_ioctl_mutex);
+	switch (luo_st.cmd) {
+	case LUO_CMD_SUBSYSTEM_REGISTER:
+		ret = luo_cmd_subsystem_register(cmd_argp);
+		break;
+
+	case LUO_CMD_SUBSYSTEM_UNREGISTER:
+		ret = luo_cmd_subsystem_unregister(cmd_argp);
+		break;
+
+	case LUO_CMD_SUBSYSTEM_GETDATA:
+		ret = luo_cmd_subsystem_getdata(cmd_argp);
+		break;
+
+	default:
+		pr_warn("ioctl: unknown self-test command nr: 0x%llx\n",
+			luo_st.cmd);
+		ret = -ENOTTY;
+		break;
+	}
+	mutex_unlock(&luo_ioctl_mutex);
+
+	return ret;
+}
+
+static long luo_selftest_ioctl(struct file *filep, unsigned int cmd,
+			       unsigned long arg)
+{
+	int ret = 0;
+
+	if (_IOC_TYPE(cmd) != LIVEUPDATE_IOCTL_TYPE)
+		return -ENOTTY;
+
+	switch (cmd) {
+	case LIVEUPDATE_IOCTL_FREEZE:
+		ret = luo_freeze();
+		break;
+
+	case LIVEUPDATE_IOCTL_SELFTESTS:
+		ret = luo_ioctl_selftests((void __user *)arg);
+		break;
+
+	default:
+		pr_warn("ioctl: unknown command nr: 0x%x\n", _IOC_NR(cmd));
+		ret = -ENOTTY;
+		break;
+	}
+
+	return ret;
+}
+
+static const struct file_operations luo_selftest_fops = {
+	.open = nonseekable_open,
+	.unlocked_ioctl = luo_selftest_ioctl,
+};
+
+static int __init luo_seltesttest_init(void)
+{
+	if (!liveupdate_debugfs_root) {
+		pr_err("liveupdate root is not set\n");
+		return 0;
+	}
+	debugfs_create_file_unsafe("luo_selftest", 0600,
+				   liveupdate_debugfs_root, NULL,
+				   &luo_selftest_fops);
+	return 0;
+}
+
+late_initcall(luo_seltesttest_init);
diff --git a/kernel/liveupdate/luo_selftests.h b/kernel/liveupdate/luo_selftests.h
new file mode 100644
index 000000000000..098f2e9e6a78
--- /dev/null
+++ b/kernel/liveupdate/luo_selftests.h
@@ -0,0 +1,84 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin
+ */
+
+#ifndef _LINUX_LUO_SELFTESTS_H
+#define _LINUX_LUO_SELFTESTS_H
+
+#include
+#include
+
+/* Maximum number of subsystems the selftests can register */
+#define LUO_MAX_SUBSYSTEMS 16
+#define LUO_NAME_LENGTH 32
+
+#define LUO_CMD_SUBSYSTEM_REGISTER 0
+#define LUO_CMD_SUBSYSTEM_UNREGISTER 1
+#define LUO_CMD_SUBSYSTEM_GETDATA 2
+struct luo_arg_subsystem {
+	char name[LUO_NAME_LENGTH];
+	void *data_page;
+};
+
+/*
+ * Test name prefixes:
+ * normal: prepare and freeze callbacks do not fail
+ * prepare_fail: prepare callback fails for this test.
+ * freeze_fail: freeze callback fails for this test
+ */
+#define NAME_NORMAL "ksft_luo"
+#define NAME_PREPARE_FAIL "ksft_prepare_fail"
+#define NAME_FREEZE_FAIL "ksft_freeze_fail"
+
+/**
+ * struct liveupdate_selftest - Holds directions for the self-test operations.
+ * @cmd: Selftest command defined in luo_selftests.h.
+ * @arg: Argument for the self test command.
+ *
+ * This structure is used only for the selftest purposes.
+ */
+struct liveupdate_selftest {
+	__u64 cmd;
+	__u64 arg;
+};
+
+/**
+ * LIVEUPDATE_IOCTL_FREEZE - Notify subsystems of imminent reboot
+ * transition.
+ *
+ * Argument: None.
+ *
+ * Notifies the live update subsystem and associated components that the kernel
+ * is about to execute the final reboot transition into the new kernel (e.g.,
+ * via kexec). This action triggers the internal %LIVEUPDATE_FREEZE kernel
+ * event. This event provides subsystems a final, brief opportunity (within the
+ * "blackout window") to save critical state or perform last-moment quiescing.
+ * Any remaining or deferred state saving for items marked via the PRESERVE
+ * ioctls typically occurs in response to the %LIVEUPDATE_FREEZE event.
+ *
+ * This ioctl should only be called when the system is in the
+ * %LIVEUPDATE_STATE_PREPARED state. This command does not transfer data.
+ *
+ * Return: 0 if the notification is successfully processed by the kernel (but
+ * reboot follows). Returns a negative error code if the notification fails
+ * or if the system is not in the %LIVEUPDATE_STATE_PREPARED state.
+ */
+#define LIVEUPDATE_IOCTL_FREEZE \
+	_IO(LIVEUPDATE_IOCTL_TYPE, 0x05)
+
+/**
+ * LIVEUPDATE_IOCTL_SELFTESTS - Interface for the LUO selftests
+ *
+ * Argument: Pointer to &struct liveupdate_selftest.
+ *
+ * Used by LUO selftests; commands are declared in luo_selftests.h.
+ *
+ * Return: 0 on success, negative error code on failure (e.g., invalid token).
+ */
+#define LIVEUPDATE_IOCTL_SELFTESTS \
+	_IOWR(LIVEUPDATE_IOCTL_TYPE, 0x08, struct liveupdate_selftest)
+
+#endif /* _LINUX_LUO_SELFTESTS_H */
-- 
2.50.0.727.gbf7dc18ff4-goog

From nobody Mon Oct 6 06:43:24 2025
From: Pasha Tatashin
To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com,
	changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org,
	dmatlack@google.com, rientjes@google.com, corbet@lwn.net,
	rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com,
	kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com,
	masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org,
	yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev,
	chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com,
	jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org,
	dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org,
	rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org,
	zhangguopeng@kylinos.cn, linux@weissschuh.net,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org,
	bartosz.golaszewski@linaro.org, cw00.choi@samsung.com,
	myungjoo.ham@samsung.com,
	yesanishhere@gmail.com, Jonathan.Cameron@huawei.com,
	quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com,
	ira.weiny@intel.com, andriy.shevchenko@linux.intel.com,
	leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org,
	djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de,
	lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, saeedm@nvidia.com,
	ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com,
	leonro@nvidia.com, witu@nvidia.com
Subject: [PATCH v2 22/32] selftests/liveupdate: add subsystem/state tests
Date: Wed, 23 Jul 2025 14:46:35 +0000
Message-ID: <20250723144649.1696299-23-pasha.tatashin@soleen.com>
In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
References: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Introduce a new set of userspace selftests for the LUO. These tests verify
LUO functionality by using the kernel-side selftest ioctls provided by the
LUO module, primarily focusing on subsystem management and basic LUO state
transitions.
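
As a rough illustration of the state checks these tests rely on, the sketch
below maps the text read from a state file to a state index, in the spirit of
the get_sysfs_state() helper in this patch. It is only a sketch under stated
assumptions: the demo_* names are hypothetical stand-ins and not part of the
LUO API, and the state strings are copied from the test's luo_state_str table.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's enum liveupdate_state. */
enum demo_state {
	DEMO_STATE_UNDEFINED,
	DEMO_STATE_NORMAL,
	DEMO_STATE_PREPARED,
	DEMO_STATE_FROZEN,
	DEMO_STATE_UPDATED,
	DEMO_STATE_COUNT,
};

/* State names as they appear in the selftest's luo_state_str table. */
static const char *const demo_state_str[DEMO_STATE_COUNT] = {
	[DEMO_STATE_UNDEFINED] = "undefined",
	[DEMO_STATE_NORMAL] = "normal",
	[DEMO_STATE_PREPARED] = "prepared",
	[DEMO_STATE_FROZEN] = "frozen",
	[DEMO_STATE_UPDATED] = "updated",
};

/*
 * Map the text read from the state file to a state index.
 * A single trailing newline is ignored; unknown text yields -1.
 */
static int demo_parse_state(const char *text)
{
	size_t len = strlen(text);
	int i;

	if (len > 0 && text[len - 1] == '\n')
		len--;

	for (i = 0; i < DEMO_STATE_COUNT; i++) {
		if (strlen(demo_state_str[i]) == len &&
		    !strncmp(text, demo_state_str[i], len))
			return i;
	}

	return -1;
}
```

A caller would read the state file into a NUL-terminated buffer and compare the
result of demo_parse_state() against the expected state after each ioctl.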

Signed-off-by: Pasha Tatashin
---
 tools/testing/selftests/Makefile              |   1 +
 tools/testing/selftests/liveupdate/.gitignore |   1 +
 tools/testing/selftests/liveupdate/Makefile   |   7 +
 tools/testing/selftests/liveupdate/config     |   6 +
 .../testing/selftests/liveupdate/liveupdate.c | 356 ++++++++++++++++++
 5 files changed, 371 insertions(+)
 create mode 100644 tools/testing/selftests/liveupdate/.gitignore
 create mode 100644 tools/testing/selftests/liveupdate/Makefile
 create mode 100644 tools/testing/selftests/liveupdate/config
 create mode 100644 tools/testing/selftests/liveupdate/liveupdate.c

diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 339b31e6a6b5..d8fc84ccac32 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -53,6 +53,7 @@ TARGETS += kvm
 TARGETS += landlock
 TARGETS += lib
 TARGETS += livepatch
+TARGETS += liveupdate
 TARGETS += lkdtm
 TARGETS += lsm
 TARGETS += membarrier
diff --git a/tools/testing/selftests/liveupdate/.gitignore b/tools/testing/selftests/liveupdate/.gitignore
new file mode 100644
index 000000000000..af6e773cf98f
--- /dev/null
+++ b/tools/testing/selftests/liveupdate/.gitignore
@@ -0,0 +1 @@
+/liveupdate
diff --git a/tools/testing/selftests/liveupdate/Makefile b/tools/testing/selftests/liveupdate/Makefile
new file mode 100644
index 000000000000..2a573c36016e
--- /dev/null
+++ b/tools/testing/selftests/liveupdate/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0-only
+CFLAGS += -Wall -O2 -Wno-unused-function
+CFLAGS += $(KHDR_INCLUDES)
+
+TEST_GEN_PROGS += liveupdate
+
+include ../lib.mk
diff --git a/tools/testing/selftests/liveupdate/config b/tools/testing/selftests/liveupdate/config
new file mode 100644
index 000000000000..382c85b89570
--- /dev/null
+++ b/tools/testing/selftests/liveupdate/config
@@ -0,0 +1,6 @@
+CONFIG_KEXEC_FILE=y
+CONFIG_KEXEC_HANDOVER=y
+CONFIG_KEXEC_HANDOVER_DEBUG=y
+CONFIG_LIVEUPDATE=y
+CONFIG_LIVEUPDATE_SYSFS_API=y
+CONFIG_LIVEUPDATE_SELFTESTS=y
diff --git a/tools/testing/selftests/liveupdate/liveupdate.c b/tools/testing/selftests/liveupdate/liveupdate.c
new file mode 100644
index 000000000000..989a9a67d4cf
--- /dev/null
+++ b/tools/testing/selftests/liveupdate/liveupdate.c
@@ -0,0 +1,356 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+
+#include
+
+#include "../kselftest.h"
+#include "../kselftest_harness.h"
+#include "../../../../kernel/liveupdate/luo_selftests.h"
+
+struct subsystem_info {
+	void *data_page;
+	void *verify_page;
+	char test_name[LUO_NAME_LENGTH];
+	bool registered;
+};
+
+FIXTURE(subsystem) {
+	int fd;
+	int fd_dbg;
+	struct subsystem_info si[LUO_MAX_SUBSYSTEMS];
+};
+
+FIXTURE(state) {
+	int fd;
+	int fd_dbg;
+};
+
+#define LUO_DEVICE "/dev/liveupdate"
+#define LUO_DBG_DEVICE "/sys/kernel/debug/liveupdate/luo_selftest"
+#define LUO_SYSFS_STATE "/sys/kernel/liveupdate/state"
+static size_t page_size;
+
+const char *const luo_state_str[] = {
+	[LIVEUPDATE_STATE_UNDEFINED] = "undefined",
+	[LIVEUPDATE_STATE_NORMAL] = "normal",
+	[LIVEUPDATE_STATE_PREPARED] = "prepared",
+	[LIVEUPDATE_STATE_FROZEN] = "frozen",
+	[LIVEUPDATE_STATE_UPDATED] = "updated",
+};
+
+static int run_luo_selftest_cmd(int fd_dbg, __u64 cmd_code,
+				struct luo_arg_subsystem *subsys_arg)
+{
+	struct liveupdate_selftest k_arg;
+
+	k_arg.cmd = cmd_code;
+	k_arg.arg = (__u64)(unsigned long)subsys_arg;
+
+	return ioctl(fd_dbg, LIVEUPDATE_IOCTL_SELFTESTS, &k_arg);
+}
+
+static int register_subsystem(int fd_dbg, struct subsystem_info *si)
+{
+	struct luo_arg_subsystem subsys_arg;
+	int ret;
+
+	memset(&subsys_arg, 0, sizeof(subsys_arg));
+	snprintf(subsys_arg.name, LUO_NAME_LENGTH, "%s", si->test_name);
+	subsys_arg.data_page = si->data_page;
+
+	ret = run_luo_selftest_cmd(fd_dbg, LUO_CMD_SUBSYSTEM_REGISTER,
+				   &subsys_arg);
+	if (!ret)
+		si->registered = true;
+
+	return ret;
+}
+
+static int unregister_subsystem(int fd_dbg, struct subsystem_info *si)
+{
+	struct luo_arg_subsystem subsys_arg;
+	int ret;
+
+	memset(&subsys_arg, 0, sizeof(subsys_arg));
+	snprintf(subsys_arg.name, LUO_NAME_LENGTH, "%s", si->test_name);
+
+	ret = run_luo_selftest_cmd(fd_dbg, LUO_CMD_SUBSYSTEM_UNREGISTER,
+				   &subsys_arg);
+	if (!ret)
+		si->registered = false;
+
+	return ret;
+}
+
+static int get_sysfs_state(void)
+{
+	char buf[64];
+	ssize_t len;
+	int fd, i;
+
+	fd = open(LUO_SYSFS_STATE, O_RDONLY);
+	if (fd < 0) {
+		ksft_print_msg("Failed to open sysfs state file '%s': %s\n",
+			       LUO_SYSFS_STATE, strerror(errno));
+		return -errno;
+	}
+
+	len = read(fd, buf, sizeof(buf) - 1);
+	close(fd);
+
+	if (len <= 0) {
+		ksft_print_msg("Failed to read sysfs state file '%s': %s\n",
+			       LUO_SYSFS_STATE, strerror(errno));
+		return -errno;
+	}
+	if (buf[len - 1] == '\n')
+		buf[len - 1] = '\0';
+	else
+		buf[len] = '\0';
+
+	for (i = 0; i < ARRAY_SIZE(luo_state_str); i++) {
+		if (!strcmp(buf, luo_state_str[i]))
+			return i;
+	}
+
+	return -EIO;
+}
+
+FIXTURE_SETUP(state)
+{
+	int state;
+
+	page_size = sysconf(_SC_PAGE_SIZE);
+	self->fd = open(LUO_DEVICE, O_RDWR);
+	if (self->fd < 0)
+		SKIP(return, "open(%s) failed [%d]", LUO_DEVICE, errno);
+
+	self->fd_dbg = open(LUO_DBG_DEVICE, O_RDWR);
+	ASSERT_GE(self->fd_dbg, 0);
+
+	state = get_sysfs_state();
+	if (state < 0) {
+		if (state == -ENOENT || state == -EACCES)
+			SKIP(return, "sysfs state not accessible (%d)", state);
+	}
+}
+
+FIXTURE_TEARDOWN(state)
+{
+	enum liveupdate_state state = LIVEUPDATE_STATE_NORMAL;
+
+	ioctl(self->fd, LIVEUPDATE_IOCTL_GET_STATE, &state);
+	if (state != LIVEUPDATE_STATE_NORMAL)
+		ioctl(self->fd, LIVEUPDATE_IOCTL_CANCEL, NULL);
+	close(self->fd);
+}
+
+FIXTURE_SETUP(subsystem)
+{
+	int i;
+
+	page_size = sysconf(_SC_PAGE_SIZE);
+	memset(&self->si, 0, sizeof(self->si));
+	self->fd = open(LUO_DEVICE, O_RDWR);
+	if (self->fd < 0)
+		SKIP(return, "open(%s) failed [%d]", LUO_DEVICE, errno);
+
+	self->fd_dbg = open(LUO_DBG_DEVICE, O_RDWR);
+	ASSERT_GE(self->fd_dbg, 0);
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++) {
+		snprintf(self->si[i].test_name, LUO_NAME_LENGTH,
+			 NAME_NORMAL ".%d", i);
+
+		self->si[i].data_page = mmap(NULL, page_size,
+					     PROT_READ | PROT_WRITE,
+					     MAP_PRIVATE | MAP_ANONYMOUS,
+					     -1, 0);
+		ASSERT_NE(MAP_FAILED, self->si[i].data_page);
+		memset(self->si[i].data_page, 'A' + i, page_size);
+
+		self->si[i].verify_page = mmap(NULL, page_size,
+					       PROT_READ | PROT_WRITE,
+					       MAP_PRIVATE | MAP_ANONYMOUS,
+					       -1, 0);
+		ASSERT_NE(MAP_FAILED, self->si[i].verify_page);
+		memset(self->si[i].verify_page, 0, page_size);
+	}
+}
+
+FIXTURE_TEARDOWN(subsystem)
+{
+	enum liveupdate_state state = LIVEUPDATE_STATE_NORMAL;
+	int i;
+
+	ioctl(self->fd, LIVEUPDATE_IOCTL_GET_STATE, &state);
+	if (state != LIVEUPDATE_STATE_NORMAL)
+		ioctl(self->fd, LIVEUPDATE_IOCTL_CANCEL, NULL);
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++) {
+		if (self->si[i].registered)
+			unregister_subsystem(self->fd_dbg, &self->si[i]);
+		munmap(self->si[i].data_page, page_size);
+		munmap(self->si[i].verify_page, page_size);
+	}
+
+	close(self->fd);
+}
+
+TEST_F(state, normal)
+{
+	enum liveupdate_state state;
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_GET_STATE, &state));
+	ASSERT_EQ(state, LIVEUPDATE_STATE_NORMAL);
+}
+
+TEST_F(state, prepared)
+{
+	enum liveupdate_state state;
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_PREPARE, NULL));
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_GET_STATE, &state));
+	ASSERT_EQ(state, LIVEUPDATE_STATE_PREPARED);
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_CANCEL, NULL));
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_GET_STATE, &state));
+	ASSERT_EQ(state, LIVEUPDATE_STATE_NORMAL);
+}
+
+TEST_F(state, sysfs_normal)
+{
+	ASSERT_EQ(LIVEUPDATE_STATE_NORMAL, get_sysfs_state());
+}
+
+TEST_F(state, sysfs_prepared)
+{
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_PREPARE, NULL));
+	ASSERT_EQ(LIVEUPDATE_STATE_PREPARED, get_sysfs_state());
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_CANCEL, NULL));
+	ASSERT_EQ(LIVEUPDATE_STATE_NORMAL, get_sysfs_state());
+}
+
+TEST_F(state, sysfs_frozen)
+{
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_PREPARE, NULL));
+
+	ASSERT_EQ(LIVEUPDATE_STATE_PREPARED, get_sysfs_state());
+
+	ASSERT_EQ(0, ioctl(self->fd_dbg, LIVEUPDATE_IOCTL_FREEZE, NULL));
+	ASSERT_EQ(LIVEUPDATE_STATE_FROZEN, get_sysfs_state());
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_CANCEL, NULL));
+	ASSERT_EQ(LIVEUPDATE_STATE_NORMAL, get_sysfs_state());
+}
+
+TEST_F(subsystem, register_unregister)
+{
+	ASSERT_EQ(0, register_subsystem(self->fd_dbg, &self->si[0]));
+	ASSERT_EQ(0, unregister_subsystem(self->fd_dbg, &self->si[0]));
+}
+
+TEST_F(subsystem, double_unregister)
+{
+	ASSERT_EQ(0, register_subsystem(self->fd_dbg, &self->si[0]));
+	ASSERT_EQ(0, unregister_subsystem(self->fd_dbg, &self->si[0]));
+	EXPECT_NE(0, unregister_subsystem(self->fd_dbg, &self->si[0]));
+	EXPECT_TRUE(errno == EINVAL || errno == ENOENT);
+}
+
+TEST_F(subsystem, register_unregister_many)
+{
+	int i;
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, register_subsystem(self->fd_dbg, &self->si[i]));
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, unregister_subsystem(self->fd_dbg, &self->si[i]));
+}
+
+TEST_F(subsystem, getdata_verify)
+{
+	enum liveupdate_state state;
+	int i;
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, register_subsystem(self->fd_dbg, &self->si[i]));
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_PREPARE, NULL));
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_GET_STATE, &state));
+	ASSERT_EQ(state, LIVEUPDATE_STATE_PREPARED);
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++) {
+		struct luo_arg_subsystem subsys_arg;
+
+		memset(&subsys_arg, 0, sizeof(subsys_arg));
+		snprintf(subsys_arg.name, LUO_NAME_LENGTH, "%s",
+			 self->si[i].test_name);
+		subsys_arg.data_page = self->si[i].verify_page;
+
+		ASSERT_EQ(0, run_luo_selftest_cmd(self->fd_dbg,
+						  LUO_CMD_SUBSYSTEM_GETDATA,
+						  &subsys_arg));
+		ASSERT_EQ(0, memcmp(self->si[i].data_page,
+				    self->si[i].verify_page,
+				    page_size));
+	}
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_CANCEL, NULL));
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_GET_STATE, &state));
+	ASSERT_EQ(state, LIVEUPDATE_STATE_NORMAL);
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, unregister_subsystem(self->fd_dbg, &self->si[i]));
+}
+
+TEST_F(subsystem, prepare_fail)
+{
+	int i;
+
+	snprintf(self->si[LUO_MAX_SUBSYSTEMS - 1].test_name, LUO_NAME_LENGTH,
+		 NAME_PREPARE_FAIL ".%d", LUO_MAX_SUBSYSTEMS - 1);
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, register_subsystem(self->fd_dbg, &self->si[i]));
+
+	ASSERT_EQ(-1, ioctl(self->fd, LIVEUPDATE_IOCTL_PREPARE, NULL));
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, unregister_subsystem(self->fd_dbg, &self->si[i]));
+
+	snprintf(self->si[LUO_MAX_SUBSYSTEMS - 1].test_name, LUO_NAME_LENGTH,
+		 NAME_NORMAL ".%d", LUO_MAX_SUBSYSTEMS - 1);
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, register_subsystem(self->fd_dbg, &self->si[i]));
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_PREPARE, NULL));
+	ASSERT_EQ(0, ioctl(self->fd_dbg, LIVEUPDATE_IOCTL_FREEZE, NULL));
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_CANCEL, NULL));
+	ASSERT_EQ(LIVEUPDATE_STATE_NORMAL, get_sysfs_state());
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, unregister_subsystem(self->fd_dbg, &self->si[i]));
+}
+
+TEST_HARNESS_MAIN
-- 
2.50.0.727.gbf7dc18ff4-goog

From nobody Mon Oct 6 06:43:24 2025
00721157ae682-719b4185000mr42897137b3.12.1753282061377; Wed, 23 Jul 2025 07:47:41 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. [34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:40 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, 
witu@nvidia.com Subject: [PATCH v2 23/32] docs: add luo documentation Date: Wed, 23 Jul 2025 14:46:36 +0000 Message-ID: <20250723144649.1696299-24-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com> References: <20250723144649.1696299-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add the documentation files for the Live Update Orchestrator Signed-off-by: Pasha Tatashin --- Documentation/admin-guide/index.rst | 1 + Documentation/admin-guide/liveupdate.rst | 16 +++++++ Documentation/core-api/index.rst | 1 + Documentation/core-api/liveupdate.rst | 50 ++++++++++++++++++++++ Documentation/userspace-api/index.rst | 1 + Documentation/userspace-api/liveupdate.rst | 25 +++++++++++ 6 files changed, 94 insertions(+) create mode 100644 Documentation/admin-guide/liveupdate.rst create mode 100644 Documentation/core-api/liveupdate.rst create mode 100644 Documentation/userspace-api/liveupdate.rst diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guid= e/index.rst index 259d79fbeb94..3f59ccf32760 100644 --- a/Documentation/admin-guide/index.rst +++ b/Documentation/admin-guide/index.rst @@ -95,6 +95,7 @@ likely to be of interest on almost any system. cgroup-v2 cgroup-v1/index cpu-load + liveupdate mm/index module-signing namespaces/index diff --git a/Documentation/admin-guide/liveupdate.rst b/Documentation/admin= -guide/liveupdate.rst new file mode 100644 index 000000000000..ff05cc1dd784 --- /dev/null +++ b/Documentation/admin-guide/liveupdate.rst @@ -0,0 +1,16 @@ +.. 
SPDX-License-Identifier: GPL-2.0 + +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +Live Update sysfs +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +:Author: Pasha Tatashin + +LUO sysfs interface +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +.. kernel-doc:: kernel/liveupdate/luo_sysfs.c + :doc: LUO sysfs interface + +See Also +=3D=3D=3D=3D=3D=3D=3D=3D + +- :doc:`Live Update Orchestrator ` diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/inde= x.rst index 7a4ca18ca6e2..a79d806f2c8e 100644 --- a/Documentation/core-api/index.rst +++ b/Documentation/core-api/index.rst @@ -136,6 +136,7 @@ Documents that don't fit elsewhere or which have yet to= be categorized. :maxdepth: 1 =20 librs + liveupdate netlink =20 .. only:: subproject and html diff --git a/Documentation/core-api/liveupdate.rst b/Documentation/core-api= /liveupdate.rst new file mode 100644 index 000000000000..41c4b76cd3ec --- /dev/null +++ b/Documentation/core-api/liveupdate.rst @@ -0,0 +1,50 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +Live Update Orchestrator +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +:Author: Pasha Tatashin + +.. kernel-doc:: kernel/liveupdate/luo_core.c + :doc: Live Update Orchestrator (LUO) + +LUO Subsystems Participation +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D +.. kernel-doc:: kernel/liveupdate/luo_subsystems.c + :doc: LUO Subsystems support + +LUO Preserving File Descriptors +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D +.. kernel-doc:: kernel/liveupdate/luo_files.c + :doc: LUO file descriptors + +Public API +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +.. kernel-doc:: include/linux/liveupdate.h + +.. kernel-doc:: kernel/liveupdate/luo_core.c + :export: + +.. kernel-doc:: kernel/liveupdate/luo_subsystems.c + :export: + +.. 
kernel-doc:: kernel/liveupdate/luo_files.c + :export: + +Internal API +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +.. kernel-doc:: kernel/liveupdate/luo_core.c + :internal: + +.. kernel-doc:: kernel/liveupdate/luo_subsystems.c + :internal: + +.. kernel-doc:: kernel/liveupdate/luo_files.c + :internal: + +See Also +=3D=3D=3D=3D=3D=3D=3D=3D + +- :doc:`Live Update uAPI ` +- :doc:`Live Update SysFS ` +- :doc:`/core-api/kho/concepts` diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspac= e-api/index.rst index b8c73be4fb11..ee8326932cb0 100644 --- a/Documentation/userspace-api/index.rst +++ b/Documentation/userspace-api/index.rst @@ -62,6 +62,7 @@ Everything else =20 ELF netlink/index + liveupdate sysfs-platform_profile vduse futex2 diff --git a/Documentation/userspace-api/liveupdate.rst b/Documentation/use= rspace-api/liveupdate.rst new file mode 100644 index 000000000000..70b5017c0e3c --- /dev/null +++ b/Documentation/userspace-api/liveupdate.rst @@ -0,0 +1,25 @@ +.. SPDX-License-Identifier: GPL-2.0 + +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +Live Update uAPI +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +:Author: Pasha Tatashin + +ioctl interface +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +.. kernel-doc:: kernel/liveupdate/luo_ioctl.c + :doc: LUO ioctl Interface + +ioctl uAPI +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +.. kernel-doc:: include/uapi/linux/liveupdate.h + +LUO selftests ioctl +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D +.. 
kernel-doc:: kernel/liveupdate/luo_selftests.c + :doc: LUO Selftests + +See Also +=3D=3D=3D=3D=3D=3D=3D=3D + +- :doc:`Live Update Orchestrator ` --=20 2.50.0.727.gbf7dc18ff4-goog From nobody Mon Oct 6 06:43:24 2025 Received: from mail-yw1-f180.google.com (mail-yw1-f180.google.com [209.85.128.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0E3972FCE17 for ; Wed, 23 Jul 2025 14:47:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.180 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282069; cv=none; b=Nb58JPmV9uY2OADe9Wpt7UClZqMLfXGvHdhVb15p4xEnfUGira1peIYtI+ee0mz0RD1TqGaJD5q790kSSjgFT2NbXW5YtYqBPz9sG9X+6qLFB5QYaFNV+UkULiWazhZtmSgCau9SjI5xp2ZQiivOsYlBQiCFZEpm/MCEIxUiGpw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282069; c=relaxed/simple; bh=p6mnW8nHz5mmsnmxHLqpu00m8RR2v1Mvzx5SXmQ28YA=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=H+3+NrVqHcWddi+lwhkiaWoSejuPFz2KV5a+F6IsckdFqgy3w+o8df0RyktdW16Vfek1xQGSOOvBjwM7N3RnsT7NuWMSeS/YjCTAzcARJ9DzDpjr+yln/ULZ5yO6AKt1EdFTXgNSguhSDRLU7VGHGDyAnCftyAkNGFapGufodgY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=fNJQNMa6; arc=none smtp.client-ip=209.85.128.180 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="fNJQNMa6" Received: by 
mail-yw1-f180.google.com with SMTP id 00721157ae682-70e3e0415a7so267537b3.0 for ; Wed, 23 Jul 2025 07:47:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282063; x=1753886863; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=vA7LYlc82x3WwgY57zmotOihPt8s/pKmtJ2MhRkxfkk=; b=fNJQNMa6f1p8o9RygaADlwqx7OSnnaJFpIl/A6sRVgQasSvGeE3vggpKgKDHD85bwv qeEsuTGhhJFZopgjYe4ZAXcJ8xRjNE2KyT8noqLdunAzs1gAY8jVByk/v2NPbB29WIAB wFvy/Ty9RWKmNNwMDyWI+ytsL0YUvnJ9h7peQU9F8WE8xD+GsmYKXNtzsSqolNlAHhoN cT7/yWvkxRp1wV0b5aVebw3uvBs2OxxPF2gc5yy8wXQUmoG0jdvUTeJsVyrbBXvtCH5j RMRYi8uCufah4jhbvTMMHME+1+VwUtfaSZx6HdznbqEaAyYHq+EMYO0RiBxdepMlZ7FY V74A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282063; x=1753886863; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=vA7LYlc82x3WwgY57zmotOihPt8s/pKmtJ2MhRkxfkk=; b=kJPQs4wgkYaGSI66uiQ7uX/KLErmMuVHpt248Nu9EkOxEPGJgES/8pUJDcrwimvZ9J MDQhoSSrtbhPmPXVxpikMsk/0DpJdLiGR87FGZDCGhX7E1GP4SVmb+r7tJf82mDH2pH7 fzLoHpf1famYrcDWHiLAHaO2tzmFPdfLW/r3gqZd7zwpflIWCl9jihlmbyQxVeXBMjpF UDLITgaasxDFqSVvE+v1BmGZeIzRefrTDyGavZb2rdXsKefFzvbTAtG1Q02vQieePCev C6Mxbyz5y41qE4gMQloBG9DDbt4Ddaa2H4Zd0z2uOPaTl3g1bBuoSC9uU2OdZHjxgh69 gNDw== X-Forwarded-Encrypted: i=1; AJvYcCU4ZjwuK7i0L/XqXehEBMY0fI5Xv4yrPlSfHes+hg95ntk7kJUESzIHhok5jzd7vSPaUWmqVRZ78CI+lzM=@vger.kernel.org X-Gm-Message-State: AOJu0Yzz8W6NvqIfuSri3lffBsKEU8RMkfozIj8qDICB21s8fnysL8KH ucRyVEIkYakuIoL6SlQLP/YQ3qmAt6spn/UuCv4PM11SE95s63F6flYetscoL+Vgcfw= X-Gm-Gg: ASbGncvv6QLLhQ5lzuK/WZDttaQDPUvypizMUi8EfIWVzUicdXPPqPYPcHV+tdhtYOL hs/3jv2NkPS92JAc9ENPl39KfVOZ/oLp+r7VJ5akEu7USi7Jcv2XhBII3d+gX4dqXdb/nya88Gg 
bEDKwvF2tT2oiLSq6mPU4HcP8ApECgIcUY76i+4R2VAoYeaMjq9lntlXDPgBp/LRuT0A6/s59Vx NvLGs7Wag7OClTBhZNDaD6sYYa9VrwZIIlarOUmGMdpryTRzjfZuRCjjNabSMK6aym2h5NioTKr Ga/gT7Q8vp0A4OQk8fT8l0euZTlPykmTjoeVQODp8zPTouVt5mQS2cuyu7NFFwt69Y2uhgvvs0k oM4pOKU7ZAUUVQWgTlokwpZotmDhan4Wwl70iAmYs3bnvh0DJezzQ/7vuOvO9kACKDfNXjY7XRu /9r9H29wOols2krA== X-Google-Smtp-Source: AGHT+IHqS70M03S5he3NvWDhoQuPFBoBKoEYJVBTcn9mZuJ/zrlY+6jXwnz4jOOosBMr5/pOQwUPow== X-Received: by 2002:a05:690c:480b:b0:719:4c68:a713 with SMTP id 00721157ae682-719b4b26117mr35918287b3.2.1753282063460; Wed, 23 Jul 2025 07:47:43 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. [34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:42 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, 
yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v2 24/32] MAINTAINERS: add liveupdate entry Date: Wed, 23 Jul 2025 14:46:37 +0000 Message-ID: <20250723144649.1696299-25-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com> References: <20250723144649.1696299-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add a MAINTAINERS file entry for the new Live Update Orchestrator introduced in previous patches. 
Signed-off-by: Pasha Tatashin --- MAINTAINERS | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/MAINTAINERS b/MAINTAINERS index 3b276cfeb038..711cf25d283d 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -14012,6 +14012,19 @@ F: kernel/module/livepatch.c F: samples/livepatch/ F: tools/testing/selftests/livepatch/ =20 +LIVE UPDATE +M: Pasha Tatashin +L: linux-kernel@vger.kernel.org +S: Maintained +F: Documentation/ABI/testing/sysfs-kernel-liveupdate +F: Documentation/admin-guide/liveupdate.rst +F: Documentation/core-api/liveupdate.rst +F: Documentation/userspace-api/liveupdate.rst +F: include/linux/liveupdate.h +F: include/uapi/linux/liveupdate.h +F: kernel/liveupdate/ +F: tools/testing/selftests/liveupdate/ + LLC (802.2) L: netdev@vger.kernel.org S: Odd fixes --=20 2.50.0.727.gbf7dc18ff4-goog From nobody Mon Oct 6 06:43:24 2025 Received: from mail-yw1-f169.google.com (mail-yw1-f169.google.com [209.85.128.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 98B812FE37A for ; Wed, 23 Jul 2025 14:47:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.169 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282073; cv=none; b=ezt3/T7A8q98mX0J0+e/DF6dm8RNxhBM8sGnDrS3hdXUseSGEq/XHP69OZv6rRCIRNvq4tPhIEfe2FV3p9X0Q3E73KS65MRtrGzda+O9TuSR9j1P2wdN133RuIH4DMpSiLhLgbY8FpctSrp8pmJjnFGrcbfUlV4FkleabzpS9vw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282073; c=relaxed/simple; bh=rSRqE8vgQhxsoLI9oMtT0uX+yqw4oNSPUrkN3zwErJM=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=X2/Gh27k9zR52am437ot0pKks5AOXBeWs7yDnzREJ31bOw9yXqOZpnSNbrsC+b4FrIHWVIXWpCwQuPDaFtinP9JNUn1QpfL/Ydk4cs/+560j3qxo9Ed7H2quF5Oep3FtD7gGvPneP1peFQOeu76RIxkyfiS6DVacgkJ0zsDlJps= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject 
dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=I1LApViz; arc=none smtp.client-ip=209.85.128.169 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="I1LApViz" Received: by mail-yw1-f169.google.com with SMTP id 00721157ae682-717b580ff2aso60416427b3.0 for ; Wed, 23 Jul 2025 07:47:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282066; x=1753886866; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=p4VVvtVDYrPoMJ5697caghWAOTwp7h+e9BDkNCujjMk=; b=I1LApViz9hS+FH/Jp6eP9+T3S1WDdoIOAy5KK+sJI+HeSptcbQCVQImhtENJzYOic2 9K1Ove/gWwckvuOcRBLRpnnb2FBhvVzyMAeHDCnO2eUuwVq0CcMQdFuAMYlCEZpztxCU YTXNVHs7apQdnFq3J0VaZcTOK5O58PXF14MbxOvJIstlsTqc2kkjV1PAarl2cnUxPM2o Ra6aRiXcLlToT198V29kPIiLbPl1uou8Xb8TxlnC4WcjLHdMvO9KS2RLnbA9E4Pbtxaz ZCQeY1gQJKQvWtLBn399ZXi65NoFzYVTguJELpueMghw5QrdcFtvUoHYOytLZr3RgSKq 4J1g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282066; x=1753886866; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=p4VVvtVDYrPoMJ5697caghWAOTwp7h+e9BDkNCujjMk=; b=uL5FCgyVsLAd3fjR50xSgDPHDKEOwNU4kGc5DqJXvisRxUd1Z2vRiN+shblVW8F8st jnnT5/CTMdKrUO3lz0pFBret76lrkG+oPDoGPi0GZswOx06QZtiCKh+rzc8JgrlImHXc kuxcN7s7Ly9nzPhXYkbi1JRnjKkih7VCErjwk4Jn9mLBxpjoPwFZonexA9D5CAJBqljX 
o0Z1ONSMcb1zVyTiei7CPbpzflRjlwlpZ3npnr2zNcxaeSaIgTQqhg5pu1ioJf88XcR5 +p1VKlgGv+b38lvO90j5TthZi9ZnzNvK1kyb7GMWrRZ8PvGKe9H6gK8PVT7m0ko7bdEw EbcQ== X-Forwarded-Encrypted: i=1; AJvYcCWddjYeSCQsg9RyRi+Bg2NvYE0YAdfzCOnh7HzmhGlrL2JLMXGyGe6sPtJ1BrxLyzfz6baVVgU9KN6zcpA=@vger.kernel.org X-Gm-Message-State: AOJu0Yws0KhHTkJrfgqWGyQDQklctpEyoC+6cZejr/tbLE/XXAgfjoFa 8yHI1fVT65o+3KPlEoevHHSBfNdVhZR0kpCk9c92q/jnwzVIGDH8sCaKlZhXJgocZRg= X-Gm-Gg: ASbGncvRYbqF8pOpJh2T4oauDRQK0HUUR1zmMSHaq7ZTh12n5JgyGfS1DxYOpOKpbhf fCWlBj/Bk/Fux7kV+b2uHAVS+AucK7B2+Dex1z/od/Ldi/kaCUIVYoojcwhPfVkpbTq0zBi1VKz 1f5Lk1E6DLbQDEdVRNf9EXCTd7Jwvj0/tQrwb7kBWCnUngtRvCjppQCGJ3+GA+GtC2bO/dsXQok pJaRauLAy5O645ajAm6f/1Fd/MXEdoi6PmHVjWDF9j0K/rD/A64hBX0RKa9CqD1r/cHYwDB5wdt XxZRhueY7dcHRXtn2xJsh810K1yiL6WUrUf3Qb4HLb/zc+PpLEOoTRWX6z4h0xdq1mFV1Bw/lW3 esy/L6ZP5n8TwW05BqnG2+kSM5W8YySred8V0LcjmbUGc3oB3IO9vhlYrUlhIWSEk/1RazQuDaS qn5Xa5s4HQ/xPxjTwiljKRC3UY X-Google-Smtp-Source: AGHT+IElamuRa8AIfKQOvI7xwmAtMtuuPebAeCpy0BgVpmPLxZTObJATbGExKCbAoZbplohPzmoJHw== X-Received: by 2002:a05:690c:25c1:b0:718:85f1:76d with SMTP id 00721157ae682-719b42932damr39674127b3.26.1753282065581; Wed, 23 Jul 2025 07:47:45 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:45 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v2 25/32] mm: shmem: use SHMEM_F_* flags instead of VM_* flags Date: Wed, 23 Jul 2025 14:46:38 +0000 Message-ID: 
<20250723144649.1696299-26-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com> References: <20250723144649.1696299-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Pratyush Yadav shmem_inode_info::flags can have the VM flags VM_NORESERVE and VM_LOCKED. These are used to suppress pre-accounting or to lock the pages in the inode respectively. Using the VM flags directly makes it difficult to add shmem-specific flags that are unrelated to VM behavior since one would need to find a VM flag not used by shmem and re-purpose it. Introduce SHMEM_F_NORESERVE and SHMEM_F_LOCKED which represent the same information, but their bits are independent of the VM flags. Callers can still pass VM_NORESERVE to shmem_get_inode(), but it gets transformed to the shmem-specific flag internally. No functional changes intended. Signed-off-by: Pratyush Yadav Signed-off-by: Pasha Tatashin --- include/linux/shmem_fs.h | 6 ++++++ mm/shmem.c | 30 +++++++++++++++++------------- 2 files changed, 23 insertions(+), 13 deletions(-) diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h index 5f03a39a26f7..578a5f3d1935 100644 --- a/include/linux/shmem_fs.h +++ b/include/linux/shmem_fs.h @@ -10,6 +10,7 @@ #include #include #include +#include =20 /* inode in-kernel data */ =20 @@ -17,6 +18,11 @@ #define SHMEM_MAXQUOTAS 2 #endif =20 +/* Suppress pre-accounting of the entire object size. */ +#define SHMEM_F_NORESERVE BIT(0) +/* Disallow swapping. 
*/ +#define SHMEM_F_LOCKED BIT(1) + struct shmem_inode_info { spinlock_t lock; unsigned int seals; /* shmem seals */ diff --git a/mm/shmem.c b/mm/shmem.c index 3a5a65b1f41a..6eded368d17a 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -175,20 +175,20 @@ static inline struct shmem_sb_info *SHMEM_SB(struct s= uper_block *sb) */ static inline int shmem_acct_size(unsigned long flags, loff_t size) { - return (flags & VM_NORESERVE) ? + return (flags & SHMEM_F_NORESERVE) ? 0 : security_vm_enough_memory_mm(current->mm, VM_ACCT(size)); } =20 static inline void shmem_unacct_size(unsigned long flags, loff_t size) { - if (!(flags & VM_NORESERVE)) + if (!(flags & SHMEM_F_NORESERVE)) vm_unacct_memory(VM_ACCT(size)); } =20 static inline int shmem_reacct_size(unsigned long flags, loff_t oldsize, loff_t newsize) { - if (!(flags & VM_NORESERVE)) { + if (!(flags & SHMEM_F_NORESERVE)) { if (VM_ACCT(newsize) > VM_ACCT(oldsize)) return security_vm_enough_memory_mm(current->mm, VM_ACCT(newsize) - VM_ACCT(oldsize)); @@ -206,7 +206,7 @@ static inline int shmem_reacct_size(unsigned long flags, */ static inline int shmem_acct_blocks(unsigned long flags, long pages) { - if (!(flags & VM_NORESERVE)) + if (!(flags & SHMEM_F_NORESERVE)) return 0; =20 return security_vm_enough_memory_mm(current->mm, @@ -215,7 +215,7 @@ static inline int shmem_acct_blocks(unsigned long flags= , long pages) =20 static inline void shmem_unacct_blocks(unsigned long flags, long pages) { - if (flags & VM_NORESERVE) + if (flags & SHMEM_F_NORESERVE) vm_unacct_memory(pages * VM_ACCT(PAGE_SIZE)); } =20 @@ -1557,7 +1557,7 @@ int shmem_writeout(struct folio *folio, struct writeb= ack_control *wbc) if (WARN_ON_ONCE(!wbc->for_reclaim)) goto redirty; =20 - if ((info->flags & VM_LOCKED) || sbinfo->noswap) + if ((info->flags & SHMEM_F_LOCKED) || sbinfo->noswap) goto redirty; =20 if (!total_swap_pages) @@ -2910,15 +2910,15 @@ int shmem_lock(struct file *file, int lock, struct = ucounts *ucounts) * ipc_lock_object() when called from 
shmctl_do_lock(), * no serialization needed when called from shm_destroy(). */ - if (lock && !(info->flags & VM_LOCKED)) { + if (lock && !(info->flags & SHMEM_F_LOCKED)) { if (!user_shm_lock(inode->i_size, ucounts)) goto out_nomem; - info->flags |=3D VM_LOCKED; + info->flags |=3D SHMEM_F_LOCKED; mapping_set_unevictable(file->f_mapping); } - if (!lock && (info->flags & VM_LOCKED) && ucounts) { + if (!lock && (info->flags & SHMEM_F_LOCKED) && ucounts) { user_shm_unlock(inode->i_size, ucounts); - info->flags &=3D ~VM_LOCKED; + info->flags &=3D ~SHMEM_F_LOCKED; mapping_clear_unevictable(file->f_mapping); } retval =3D 0; @@ -3062,7 +3062,9 @@ static struct inode *__shmem_get_inode(struct mnt_idm= ap *idmap, spin_lock_init(&info->lock); atomic_set(&info->stop_eviction, 0); info->seals =3D F_SEAL_SEAL; - info->flags =3D flags & VM_NORESERVE; + info->flags =3D 0; + if (flags & VM_NORESERVE) + info->flags |=3D SHMEM_F_NORESERVE; info->i_crtime =3D inode_get_mtime(inode); info->fsflags =3D (dir =3D=3D NULL) ? 0 : SHMEM_I(dir)->fsflags & SHMEM_FL_INHERITED; @@ -5801,8 +5803,10 @@ static inline struct inode *shmem_get_inode(struct m= nt_idmap *idmap, /* common code */ =20 static struct file *__shmem_file_setup(struct vfsmount *mnt, const char *n= ame, - loff_t size, unsigned long flags, unsigned int i_flags) + loff_t size, unsigned long vm_flags, + unsigned int i_flags) { + unsigned long flags =3D (vm_flags & VM_NORESERVE) ? 
SHMEM_F_NORESERVE : 0; struct inode *inode; struct file *res; =20 @@ -5819,7 +5823,7 @@ static struct file *__shmem_file_setup(struct vfsmoun= t *mnt, const char *name, return ERR_PTR(-ENOMEM); =20 inode =3D shmem_get_inode(&nop_mnt_idmap, mnt->mnt_sb, NULL, - S_IFREG | S_IRWXUGO, 0, flags); + S_IFREG | S_IRWXUGO, 0, vm_flags); if (IS_ERR(inode)) { shmem_unacct_size(flags, size); return ERR_CAST(inode); --=20 2.50.0.727.gbf7dc18ff4-goog From nobody Mon Oct 6 06:43:24 2025 Received: from mail-yw1-f169.google.com (mail-yw1-f169.google.com [209.85.128.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ABEB82FF474 for ; Wed, 23 Jul 2025 14:47:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.169 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282074; cv=none; b=rjc80KOeKiQ5xqJFtbYmTX+0sqRUkJX8ZUj/g+fOmUhYWJ+CS1XvnG+xaKec4ErBZvd9TJ36hB56TaGXbqCS5MEApg5X0GQ+Q2XxRqaSUg2H0RK+ZFAK/cOIfp2jsHRDC5JanB8zOVm0nyijTXU/aseMb8EQ/NomHqammt1SiMA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282074; c=relaxed/simple; bh=XdAIwDQk9O8+e27TqOcTJB/ZHlzR4zDdUnKHElnpGKk=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=NGCi2UU9vYmbBNUojoPPWVm5w0YZLxXbfIsdpCYhg44GiJyP/tjSxjOejNhn2931cI5T8dAnA7xIF6QBGP7wHXiYUrjrSMIJRqgRezafQBw/kbPV/qzNB8oYZLj3I3h3cai/iopIQ5ZrmcWNFQJEKdACCzy2Bh5w/DwXvkaRwd4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=LVSylLjr; arc=none smtp.client-ip=209.85.128.169 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: 
smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="LVSylLjr" Received: by mail-yw1-f169.google.com with SMTP id 00721157ae682-71840959355so27367b3.1 for ; Wed, 23 Jul 2025 07:47:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282068; x=1753886868; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=po25YEPvXRrTuFLL1nDXhv73SfufZ4T7e/xtjD+elU8=; b=LVSylLjr2ZxFjIaHssw6wLCghfI2bDo8oKymdDfL5NC68rbnDiJqjPGgnTACsmXUYv rKpXyjTbKYvhrR8DMxGZwIeMAWpOyxnf8VVp1IJ7rIqPdm0NI+LT7vt2Kekm/WxZw5qB sxtnOwJXoFVrjHVNHv3L6QPM3vRnoRBUgN1VwzZ4B+LccRuz7ex9ABKBCgcAv5Li/Z9P xuMW1NoPiGvqLN4e09NHGhxNpj2TdRPFXB9scJk5ZQQVMyJOBHoVz4jBJ8Ysc3nHOt1y 9D2cTHFahp6KzgU8QS0kk52apndPb/ytbEd63P4KvoVcm2cvJTJuqXoDYwA9hhLRErq+ OYeQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282068; x=1753886868; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=po25YEPvXRrTuFLL1nDXhv73SfufZ4T7e/xtjD+elU8=; b=kaE9CLpqb9PQqv5ySalNwaGKFnU95KRkeweGlM4zoA1g7JLarGN1uyCbo1ozaZA5IK 2+RLhdmw8v6qUJvVgMt29C1gtPFkCF1WcmeKpGUJuM+Xu86otWdm5N74PMEMQ4449d4V qMzMtRMc0C9Dm7y/spQ7n5JXKnV59nsXAbHtPwFpVUf+ytu9c1Ons+unHWWZLO1lemOU HwmG0huW8bvi34+w/vkruLjheSrCRc7ZWvFn6iCVEhOtVyPrXwRGr+ep83oqUm+eTx5w 1HlXjerknU9BEKo9/neoxHM7FdRpD6seIXvnLSHXd7mIHpExO3je8d81vjnEACLK34Dn HsLA== X-Forwarded-Encrypted: i=1; AJvYcCVf8u0NhPi2QrPuUXgZ05ijd/BeyEoYJ03aP8QqLta8vWUICTgoQkjxElWBUs9/r1zKyWS1cqhON/NEtT0=@vger.kernel.org X-Gm-Message-State: AOJu0YxHBDpR/saLDZ6cvt2i1EipwjFXDGVinHO61U6BITW3B9elyPK5 
nFuGXUPXy1OnPpOnhzFv4at/xMfKv3PbNXfoZovYHYMmwtldeXXCLULNJF33b0yDMJU= X-Gm-Gg: ASbGncub5mwmIQzs0K+i1K9LjSn6NjU2hJhULIwrpbARGcOwtLczMuGcQOB5XBxBO// X/UQQXzRjemHqbgyv9XvICfDAlB7LQiYg5sO5A/WYntxDJKzsPSVfEaCFzlz1TlA7OKaKvBMZ0o KC6w/GUrudxm2rkARJPkSBiEbZHBd13CwkTAKitAJc4J76SdnC1X2YygEIaURFodozVJPUxw15o 9nWW5h6bfA6nnKPSg2emyoIphWzAPT8Wy9WR7x0Dkp1QPjoffoZv3IfRKets3eWhtpU/AJxi0aY MFXDJwBlddbhNkHxiDhTfzmpl5z/QIJFAOIl1IGHs9cghA+hPSEEqN3Bmf+8T3QebqKf54mz/ze Ln8Jyg5qpkziWD7K4mwmq9XC4t+1W4V+MIn8aI0gGTdzFzyVnczIDvd6Z1WrXxw/VmPMF2tMfkH AqfMCtI0iczI3jYg== X-Google-Smtp-Source: AGHT+IHUEvrhwYQb+PkqEHsQ9NVaBJx/HXsqV9Ci/9SKu6SQ4+1qmUQGxK2puIL8Ml0NTVfy05s2Ig== X-Received: by 2002:a05:690c:620c:b0:713:fe84:6f96 with SMTP id 00721157ae682-719a0b8f1fbmr86322747b3.14.1753282067864; Wed, 23 Jul 2025 07:47:47 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. [34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:46 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, 
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v2 26/32] mm: shmem: allow freezing inode mapping Date: Wed, 23 Jul 2025 14:46:39 +0000 Message-ID: <20250723144649.1696299-27-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com> References: <20250723144649.1696299-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Pratyush Yadav To prepare a shmem inode for live update via the Live Update Orchestrator (LUO), its index -> folio mappings must be serialized. Once the mappings are serialized, they cannot change since it would cause the serialized data to become inconsistent. This can be done by pinning the folios to avoid migration, and by making sure no folios can be added to or removed from the inode. While mechanisms to pin folios already exist, the only way to stop folios being added or removed are the grow and shrink file seals. But file seals come with their own semantics, one of which is that they can't be removed. 
This doesn't work with liveupdate since it can be cancelled or error out, which would need the seals to be removed and the file's normal functionality to be restored. Introduce SHMEM_F_MAPPING_FROZEN to indicate this instead. It is internal to shmem and is not directly exposed to userspace. It functions similar to F_SEAL_GROW | F_SEAL_SHRINK, but additionally disallows hole punching, and can be removed. Signed-off-by: Pratyush Yadav Signed-off-by: Pasha Tatashin --- include/linux/shmem_fs.h | 17 +++++++++++++++++ mm/shmem.c | 12 +++++++++++- 2 files changed, 28 insertions(+), 1 deletion(-) diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h index 578a5f3d1935..1dd2aad0986b 100644 --- a/include/linux/shmem_fs.h +++ b/include/linux/shmem_fs.h @@ -22,6 +22,14 @@ #define SHMEM_F_NORESERVE BIT(0) /* Disallow swapping. */ #define SHMEM_F_LOCKED BIT(1) +/* + * Disallow growing, shrinking, or hole punching in the inode. Combined wi= th + * folio pinning, makes sure the inode's mapping stays fixed. + * + * In some ways similar to F_SEAL_GROW | F_SEAL_SHRINK, but can be removed= and + * isn't directly visible to userspace. + */ +#define SHMEM_F_MAPPING_FROZEN BIT(2) =20 struct shmem_inode_info { spinlock_t lock; @@ -183,6 +191,15 @@ static inline bool shmem_file(struct file *file) return shmem_mapping(file->f_mapping); } =20 +/* Must be called with inode lock taken exclusive. 
*/ +static inline void shmem_i_mapping_freeze(struct inode *inode, bool freeze) +{ + if (freeze) + SHMEM_I(inode)->flags |=3D SHMEM_F_MAPPING_FROZEN; + else + SHMEM_I(inode)->flags &=3D ~SHMEM_F_MAPPING_FROZEN; +} + /* * If fallocate(FALLOC_FL_KEEP_SIZE) has been used, there may be pages * beyond i_size's notion of EOF, which fallocate has committed to reservi= ng: diff --git a/mm/shmem.c b/mm/shmem.c index 6eded368d17a..d1e74f59cdba 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -1297,7 +1297,8 @@ static int shmem_setattr(struct mnt_idmap *idmap, loff_t newsize =3D attr->ia_size; =20 /* protected by i_rwsem */ - if ((newsize < oldsize && (info->seals & F_SEAL_SHRINK)) || + if ((info->flags & SHMEM_F_MAPPING_FROZEN) || + (newsize < oldsize && (info->seals & F_SEAL_SHRINK)) || (newsize > oldsize && (info->seals & F_SEAL_GROW))) return -EPERM; =20 @@ -3291,6 +3292,10 @@ shmem_write_begin(struct file *file, struct address_= space *mapping, return -EPERM; } =20 + if (unlikely((info->flags & SHMEM_F_MAPPING_FROZEN) && + pos + len > inode->i_size)) + return -EPERM; + ret =3D shmem_get_folio(inode, index, pos + len, &folio, SGP_WRITE); if (ret) return ret; @@ -3664,6 +3669,11 @@ static long shmem_fallocate(struct file *file, int m= ode, loff_t offset, =20 inode_lock(inode); =20 + if (info->flags & SHMEM_F_MAPPING_FROZEN) { + error =3D -EPERM; + goto out; + } + if (mode & FALLOC_FL_PUNCH_HOLE) { struct address_space *mapping =3D file->f_mapping; loff_t unmap_start =3D round_up(offset, PAGE_SIZE); --=20 2.50.0.727.gbf7dc18ff4-goog From nobody Mon Oct 6 06:43:24 2025 Received: from mail-yw1-f180.google.com (mail-yw1-f180.google.com [209.85.128.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4C7C530112E for ; Wed, 23 Jul 2025 14:47:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.180 ARC-Seal: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1753282075; cv=none; b=FSbi7ZImwBFe/hSqnjuT1cjlJXul5Hs4H1U4xaTCLJyBCr5bUXcxPfqGpwqdj4nm18Nsm2l5++iJgOUgU9LqEC9qg0uliF1aRYvlX8hcv3pAPuHVjcaNXqMxo12Kcb4UXTJz2CzHg1mLFYn/DVhZpmwORx6v3xJXflDACzNXAg8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282075; c=relaxed/simple; bh=JpDjGN5akNkgXqHHoeTKMcxNwFHTNgIwpzw2mOBo7i8=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=bDPMC11uOtMv74A4yMl0AMMmIGjIExJHojwf0EvNNemjQbGUSh1FPxbVGRcNGAVf8iqwQjWP6dd28wPA+zIn2QkSw5C67xUX6+CJ95QDD3ihZ1AE3zwcLJ3Bpv5N79F6gBjscjmnVDvgnnhXyBHWyw1nNYOV2p4g2CymNEmyy9I= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=q7fICFX0; arc=none smtp.client-ip=209.85.128.180 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="q7fICFX0" Received: by mail-yw1-f180.google.com with SMTP id 00721157ae682-70e1d8c2dc2so65672157b3.3 for ; Wed, 23 Jul 2025 07:47:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282070; x=1753886870; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=a7cc+J8nsCZ/iRindbQQ/niSVeH9iE0nAvoyR2/FgGA=; b=q7fICFX01pEWSwL+H88zCuDfRF0cHPDrOOL+wgkjl8GB6aLUHyAioY8txya8u2mShO 5N55i69V2ZISXDHMiOMdnCl3gABIRlPtGQZweu2HOAWs+nAGWodRIkOmHWJfleMS+E0y 
+62SyJzazulBVwHlD/N813PZUV4EYJDoVpP22Y0IM1MxyufosanUgddwGswO7d9vCE9s VrbY9QLpnojVQNTJrOiDcv7T3GUvr85ODe0B2HEFjXdXYM3f+TJtNpTKbKnxiCchm28Y hJFen//G7lX8TxZEVCp0MK/PqcZIPtwBA5GXd+QHewArgJYK/2iQzgea27+0uTOHKn5t 0Sig== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282070; x=1753886870; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=a7cc+J8nsCZ/iRindbQQ/niSVeH9iE0nAvoyR2/FgGA=; b=UcRrYm3zTnGNZKKCJrnleEB4Ap4l3z+YhU8riES8yER2U1zCWFKFWY8mCHA0SvqQ0Z ghSrp8qzor3wfm91EnLdLzl26nUhrhWN1t5rl0wt3boMduEMsUkrf4vDq1ZiMIh9FJb5 kMDeMKBS0Xp+jvVD7sGPHVPh0vOEzVq/g9ly29qrnQojQ5f8npOlMPUkQtrZ+dn0udSX 6ftv7nUeUZPtKjQRnlVtT+PbD+dRsyJY3fIWJmdYtUhzd5l20ImqXLYOWAvUTfqhqghB o8Ofo36Tgnoi/GZPmipF5JAN1OLsyjl92ZViukM5NHVoAacCT7sAis0lXP2NNVD8WZDz o6Bg== X-Forwarded-Encrypted: i=1; AJvYcCVgb946ZSyfUcA3ihDs8jmm4s/ykw1WmyF6ZF7uBvyV/vjAZL9tPhcwsbr7mR1qJRD/rAuF+e7Vt0+Sk4w=@vger.kernel.org X-Gm-Message-State: AOJu0Yx+p9FzygynOgnx91j2BF3+BsO310gXAwiiRIXHw2LGgM9fT84P FYWppilJG8NZj2GaETA6cdSdx6l6S8ZznDVua9h/u2sZ+mrzh9XnCt8dXaotuz2fpt8= X-Gm-Gg: ASbGncv4+w0daw4QqpMftXz1fqcPgg2CoYhvaP4sShS2pjihI8pTTCABZID10Ap11sP IC/tHvOgFeaSVVDaJ8OFTHPQV251MmUv7F8J1C6HlRJGTQzfaRlH9d5vtu9UX28DR8izpuejvm5 gG3B6MD8iq4pm28HlI5jOD3UDanh2KpX8fCgj3GjFy0c+PPK7FNRYLRiqkd6kuj3X/qoHmq+cIu Fz+5QoxqzqofZqgY2S4FOJ2U5nxyPXvJHQqsDgkIDVqkTOCfcztMHzM4SieiPrCLFycKGIN/gjc /kHFsWXQKDZjGcWAfX4heCQTqRudlYv55McBpCs2SImmKniE/A+HzJ3hfcY3wBB+WdOBh7JmEkp LNGaHAGRsDi6kgU9YNhJZ9aqoJrB/fgrse/RswUFYO/PQ1h3DvayFtXfa3iV0bs25B2wIm4V68P sfcUE+bMQ2tlQIvA== X-Google-Smtp-Source: AGHT+IE5VwnfY3W/9KIKWadyQ8YmwEe7nfICAnpKXhV+FZKxq742FfGNPYd0VcNoWXlFXd9fHhIFIQ== X-Received: by 2002:a05:690c:92:b0:719:57a5:fde3 with SMTP id 00721157ae682-719b4149efcmr44710187b3.3.1753282069966; Wed, 23 Jul 2025 07:47:49 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:49 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v2 27/32] mm: shmem: export some functions to internal.h Date: Wed, 23 Jul 2025 14:46:40 +0000 Message-ID: 
<20250723144649.1696299-28-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com> References: <20250723144649.1696299-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Pratyush Yadav shmem_inode_acct_blocks(), shmem_recalc_inode(), and shmem_add_to_page_cache() are used by shmem_alloc_and_add_folio(). This functionality will also be used in the future by Live Update Orchestrator (LUO) to recreate memfd files after a live update. Signed-off-by: Pratyush Yadav Signed-off-by: Pasha Tatashin --- mm/internal.h | 6 ++++++ mm/shmem.c | 10 +++++----- 2 files changed, 11 insertions(+), 5 deletions(-) diff --git a/mm/internal.h b/mm/internal.h index 6b8ed2017743..991917a8ae23 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -1535,6 +1535,12 @@ void __meminit __init_page_from_nid(unsigned long pf= n, int nid); unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memc= g, int priority); =20 +int shmem_add_to_page_cache(struct folio *folio, + struct address_space *mapping, + pgoff_t index, void *expected, gfp_t gfp); +int shmem_inode_acct_blocks(struct inode *inode, long pages); +void shmem_recalc_inode(struct inode *inode, long alloced, long swapped); + #ifdef CONFIG_SHRINKER_DEBUG static inline __printf(2, 0) int shrinker_debugfs_name_alloc( struct shrinker *shrinker, const char *fmt, va_list ap) diff --git a/mm/shmem.c b/mm/shmem.c index d1e74f59cdba..4a616fe595e2 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -219,7 +219,7 @@ static inline void shmem_unacct_blocks(unsigned long fl= ags, long pages) vm_unacct_memory(pages * VM_ACCT(PAGE_SIZE)); } =20 -static int shmem_inode_acct_blocks(struct inode *inode, long pages) +int shmem_inode_acct_blocks(struct inode *inode, long pages) { 
struct shmem_inode_info *info =3D SHMEM_I(inode); struct shmem_sb_info *sbinfo =3D SHMEM_SB(inode->i_sb); @@ -433,7 +433,7 @@ static void shmem_free_inode(struct super_block *sb, si= ze_t freed_ispace) * But normally info->alloced =3D=3D inode->i_mapping->nrpages + info->s= wapped * So mm freed is info->alloced - (inode->i_mapping->nrpages + info->swapp= ed) */ -static void shmem_recalc_inode(struct inode *inode, long alloced, long swa= pped) +void shmem_recalc_inode(struct inode *inode, long alloced, long swapped) { struct shmem_inode_info *info =3D SHMEM_I(inode); long freed; @@ -879,9 +879,9 @@ static void shmem_update_stats(struct folio *folio, int= nr_pages) /* * Somewhat like filemap_add_folio, but error if expected item has gone. */ -static int shmem_add_to_page_cache(struct folio *folio, - struct address_space *mapping, - pgoff_t index, void *expected, gfp_t gfp) +int shmem_add_to_page_cache(struct folio *folio, + struct address_space *mapping, + pgoff_t index, void *expected, gfp_t gfp) { XA_STATE_ORDER(xas, &mapping->i_pages, index, folio_order(folio)); long nr =3D folio_nr_pages(folio); --=20 2.50.0.727.gbf7dc18ff4-goog From nobody Mon Oct 6 06:43:24 2025 Received: from mail-yw1-f177.google.com (mail-yw1-f177.google.com [209.85.128.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6A50830204E for ; Wed, 23 Jul 2025 14:47:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.177 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282081; cv=none; b=u6WHg6cKTpkcSmLNVhBtyQP4iLvyUd9kgf3Vd/VPs3jRAUZdTIzNrpHFxE5GAVZgdoqpepp5OmKrpad64fyWsYpr+pGyFe/+8gIg14ibGm74KQ5P0ZOh+wkrNCNQ0HC7L+wnYRLKs6bQXmFCTfyOMqoPP+RucaXPm09jB++aSGk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282081; c=relaxed/simple; bh=XfJreG7L1/bBWGQ4MFeRvEf9vE2GQ2Cz559cvXIL4D8=; 
h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=eNaf/4waB82wiVg2Fz3f8FxHrR0tk/M+9LfEkArIYKRCVbnGM/Xc69G3lbFIl+dBzCyaWi+lt+f7iF3tDbRpkCuvwsjBsV++dWwQcHLQ0GC+GBs2rTKvoH51KTaNL50RIDAHrlfGPDsBASxCZQBCGzxuydt3nKT6dM6lrxuiHUo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=szc0gi5u; arc=none smtp.client-ip=209.85.128.177 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="szc0gi5u" Received: by mail-yw1-f177.google.com with SMTP id 00721157ae682-7115e32802bso46729707b3.1 for ; Wed, 23 Jul 2025 07:47:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282072; x=1753886872; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=wtJupIf6Ypyd0tCxpTwxbuwQ6WNo77WGA9Hp+LVd5sg=; b=szc0gi5uputT0AsVsODAOl3hS/1d6iPL3IF67Yiugv0Z3dGqhsbcc3qNTQpBbkSAC2 ZA4Ih2JhnwOGta/8yIC/p10W0nAddFgNZr78lIr6ZnrQZv93YGE5FWoe2RgzSe6sgn1r yIXM7uTJs55nZoeo2qJMQQRJRFCdBV2oGLrd4ZMn4L0biIxfNk0AoZlRGI39ChsrKwO6 4g54NJvG0Jz1YIWb4h68+0eODfGV9gAzxSlT9p3kNg5qRs6WHsrXBLbrAsv7czcA7as8 7KGI/3LI2CsL7Y71KOmbaGrg0ZO/v1s4RET/T8nFc4R0KLLbdHK5yDF87E1liTBkFILF EbXw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282072; x=1753886872; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=wtJupIf6Ypyd0tCxpTwxbuwQ6WNo77WGA9Hp+LVd5sg=; b=N/esnGH++i24cMonwdazu0axGTjXMdBk1FgRYO8PIGPXAa8XvjX0gN1aBZOj8KFsg+ Dv/gVpVvM7JbdmrkllFkiknJs+pYUNJNmFu7TaIkb6SD3AhZZmOVRGWaeUB4YwEEZMLk VuWtgGzuyX0GrghboO2AJqS9hNefc+h/bH2CQgBefXGdpDq4Tba31z5aYBePBH7IYjDC g6Wf40CB7Z2ugD/BlfM6uHGfgqqZsL8nqLeyipsN99cIMebeQSDDbwpyMsPnpMvksX5a CBh28Wkh05dEB45PHwoRmqzN7yC7MVOB3pwMswb7olrgk06d5FgUxrTSCUHnqhvHgU8l P4PA== X-Forwarded-Encrypted: i=1; AJvYcCVr5RXQoX7iMrvB7OloMHCYa6DPADBXRIxvgES92Gbh9BJaZbx7vqogl2i/sjIwrBOa36kuXtFtcZvSLoc=@vger.kernel.org X-Gm-Message-State: AOJu0Yx7Io8m4KYXmXuwCn86/mPWTiylytZn5f2xiurDrLHTLOeHR+hw By/dv8fLoYncP9cgdxD7agDTIDAeCNd9Nk2dYJpyFOhiw47Q4dvca+GU+OqqyJnDUnU= X-Gm-Gg: ASbGnctuIlKdeVcgEASiUB43ZX08fXv9sgudZE23mIp6k3a3Xh0tnJbCRnFKILPJR/8 lGLcsj4lLX0FM0GeIndvVAJWErXjTgYE60dmvRyzVPQACMCuj515ZP5T+KiwnGiD/uMFMsVOQQn kVJvOHFNPvnQ9LTp7XwUvYLwua1PLqmjRciU8GKBdwIlncCGU9XlVoKGixipKVEXKAOmtVoyvvT f0hVVBTsyPBIqBXxsBCt2ZNRPyJjmU85mkcKyeLEaGwpOseutlkhNHWdeufkrWw131hHD1T4a8D jKmOwBpggs+6fYJ1UWHBfPjBfeP7HkQJTgVd17CAjo1er3zECmih8FY/jibU6ITNmrFfD0OtTjb cF3O0gUoFtl670l9U6N1wpa8skjrIUyKCROHGY1oHuTbus7tWd3wlElC8D3DAaKIFw0/+ZEs9Yf ViWZMedSnkyRMGGA== X-Google-Smtp-Source: AGHT+IHfzwaC0qTLCy7UrLGppuYqdj5p8yEzbNJCgFXwlPj2olsgcEtPeRaw03hIWhrKTEuA9eaYYQ== X-Received: by 2002:a05:690c:4b13:b0:70d:ed5d:b4cd with SMTP id 00721157ae682-719b414d629mr46782557b3.17.1753282071864; Wed, 23 Jul 2025 07:47:51 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:51 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v2 28/32] luo: allow preserving memfd Date: Wed, 23 Jul 2025 14:46:41 +0000 Message-ID: <20250723144649.1696299-29-pasha.tatashin@soleen.com> 
X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com> References: <20250723144649.1696299-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Pratyush Yadav The ability to preserve a memfd allows userspace to use KHO and LUO to transfer its memory contents to the next kernel. This is useful in many ways. For one, it can be used with IOMMUFD as the backing store for IOMMU page tables. Preserving IOMMUFD is essential for performing a hypervisor live update with passthrough devices. memfd support provides the first building block for making that possible. For another, for applications with a large amount of in-memory state that takes time to reconstruct, reboots to consume kernel upgrades can be very expensive. memfd with LUO gives those applications reboot-persistent memory that they can use to quickly save and reconstruct that state. While memfd is backed by either hugetlbfs or shmem, only shmem support is added for now. To be more precise, only anonymous shmem files are supported. The handover to the next kernel is not transparent. Not all properties of the file are preserved; only its memory contents, position, and size are. The recreated file gets the UID and GID of the task doing the restore, and the task's cgroup gets charged with the memory. After LUO enters the prepared state, the file cannot grow or shrink, and all its pages are pinned to avoid migrations and swapping. The file can still be read from or written to.
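The patch below records each preserved folio as a single 64-bit descriptor: the PFN shifted left by 12, with flag bits kept in the low 12 bits. A minimal userspace sketch of that encoding follows. The names `mkdesc`, `desc_pfn`, `desc_flags` and the `DESC_*` constants are hypothetical stand-ins chosen for illustration; they mirror the layout described in the patch but are not the kernel macros themselves.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative constants mirroring the descriptor layout in the patch. */
#define DESC_PFN_SHIFT     12
#define DESC_FLAGS_MASK    ((UINT64_C(1) << DESC_PFN_SHIFT) - 1)
#define DESC_FLAG_DIRTY    (UINT64_C(1) << 0)
#define DESC_FLAG_UPTODATE (UINT64_C(1) << 1)

/* Pack a PFN and flag bits into one descriptor (hypothetical helper). */
static inline uint64_t mkdesc(uint64_t pfn, uint64_t flags)
{
	/* Flags must fit in the low 12 bits, the PFN in the upper 52. */
	assert((flags & ~DESC_FLAGS_MASK) == 0);
	assert((pfn >> (64 - DESC_PFN_SHIFT)) == 0);
	return (pfn << DESC_PFN_SHIFT) | flags;
}

/* Recover the PFN: the flags fall off the bottom of the shift. */
static inline uint64_t desc_pfn(uint64_t desc)
{
	return desc >> DESC_PFN_SHIFT;
}

/* Recover the flag bits from the low 12 bits. */
static inline uint64_t desc_flags(uint64_t desc)
{
	return desc & DESC_FLAGS_MASK;
}
```

Because the flags occupy only the low 12 bits, the PFN is recovered with a plain shift, and a descriptor of 0 is left free to mark an empty slot, which is how the retrieve path in the patch skips entries.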
Co-developed-by: Changyuan Lyu Signed-off-by: Changyuan Lyu Co-developed-by: Pasha Tatashin Signed-off-by: Pasha Tatashin Signed-off-by: Pratyush Yadav --- MAINTAINERS | 2 + mm/Makefile | 1 + mm/memfd_luo.c | 501 +++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 504 insertions(+) create mode 100644 mm/memfd_luo.c diff --git a/MAINTAINERS b/MAINTAINERS index 711cf25d283d..361032f23876 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -14014,6 +14014,7 @@ F: tools/testing/selftests/livepatch/ =20 LIVE UPDATE M: Pasha Tatashin +R: Pratyush Yadav L: linux-kernel@vger.kernel.org S: Maintained F: Documentation/ABI/testing/sysfs-kernel-liveupdate @@ -14023,6 +14024,7 @@ F: Documentation/userspace-api/liveupdate.rst F: include/linux/liveupdate.h F: include/uapi/linux/liveupdate.h F: kernel/liveupdate/ +F: mm/memfd_luo.c F: tools/testing/selftests/liveupdate/ =20 LLC (802.2) diff --git a/mm/Makefile b/mm/Makefile index 1a7a11d4933d..63cca66c068a 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -100,6 +100,7 @@ obj-$(CONFIG_NUMA) +=3D memory-tiers.o obj-$(CONFIG_DEVICE_MIGRATION) +=3D migrate_device.o obj-$(CONFIG_TRANSPARENT_HUGEPAGE) +=3D huge_memory.o khugepaged.o obj-$(CONFIG_PAGE_COUNTER) +=3D page_counter.o +obj-$(CONFIG_LIVEUPDATE) +=3D memfd_luo.o obj-$(CONFIG_MEMCG_V1) +=3D memcontrol-v1.o obj-$(CONFIG_MEMCG) +=3D memcontrol.o vmpressure.o ifdef CONFIG_SWAP diff --git a/mm/memfd_luo.c b/mm/memfd_luo.c new file mode 100644 index 000000000000..339824ab6729 --- /dev/null +++ b/mm/memfd_luo.c @@ -0,0 +1,501 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + * Changyuan Lyu + * + * Copyright (C) 2025 Amazon.com Inc. or its affiliates. 
+ * Pratyush Yadav + */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include +#include +#include +#include +#include +#include +#include +#include "internal.h" + +static const char memfd_luo_compatible[] =3D "memfd-v1"; + +#define PRESERVED_PFN_MASK GENMASK(63, 12) +#define PRESERVED_PFN_SHIFT 12 +#define PRESERVED_FLAG_DIRTY BIT(0) +#define PRESERVED_FLAG_UPTODATE BIT(1) + +#define PRESERVED_FOLIO_PFN(desc) (((desc) & PRESERVED_PFN_MASK) >> PRESER= VED_PFN_SHIFT) +#define PRESERVED_FOLIO_FLAGS(desc) ((desc) & ~PRESERVED_PFN_MASK) +#define PRESERVED_FOLIO_MKDESC(pfn, flags) (((pfn) << PRESERVED_PFN_SHIFT)= | (flags)) + +struct memfd_luo_preserved_folio { + /* + * The folio descriptor is made of 2 parts. The bottom 12 bits are used + * for storing flags, the others for storing the PFN. + */ + u64 foliodesc; + u64 index; +}; + +static int memfd_luo_preserve_folios(struct memfd_luo_preserved_folio *pfo= lios, + struct folio **folios, + unsigned int nr_folios) +{ + unsigned int i; + int err; + + for (i =3D 0; i < nr_folios; i++) { + struct memfd_luo_preserved_folio *pfolio =3D &pfolios[i]; + struct folio *folio =3D folios[i]; + unsigned int flags =3D 0; + unsigned long pfn; + + err =3D kho_preserve_folio(folio); + if (err) + goto err_unpreserve; + + pfn =3D folio_pfn(folio); + if (folio_test_dirty(folio)) + flags |=3D PRESERVED_FLAG_DIRTY; + if (folio_test_uptodate(folio)) + flags |=3D PRESERVED_FLAG_UPTODATE; + + pfolio->foliodesc =3D PRESERVED_FOLIO_MKDESC(pfn, flags); + pfolio->index =3D folio->index; + } + + return 0; + +err_unpreserve: + /* i is unsigned: count down without letting it wrap below zero. */ + while (i--) + WARN_ON_ONCE(kho_unpreserve_folio(folios[i])); + return err; +} + +static void memfd_luo_unpreserve_folios(const struct memfd_luo_preserved_f= olio *pfolios, + unsigned int nr_folios) +{ + unsigned int i; + + for (i =3D 0; i < nr_folios; i++) { + const struct memfd_luo_preserved_folio *pfolio =3D &pfolios[i]; + struct folio *folio; + + if (!pfolio->foliodesc) + continue; + + folio =3D
pfn_folio(PRESERVED_FOLIO_PFN(pfolio->foliodesc)); + + kho_unpreserve_folio(folio); + unpin_folio(folio); + } +} + +static void *memfd_luo_create_fdt(unsigned long size) +{ + unsigned int order =3D get_order(size); + struct folio *fdt_folio; + int err =3D 0; + void *fdt; + + if (order > MAX_PAGE_ORDER) + return NULL; + + fdt_folio =3D folio_alloc(GFP_KERNEL, order); + if (!fdt_folio) + return NULL; + + fdt =3D folio_address(fdt_folio); + + err |=3D fdt_create(fdt, (1 << (order + PAGE_SHIFT))); + err |=3D fdt_finish_reservemap(fdt); + err |=3D fdt_begin_node(fdt, ""); + if (err) + goto free; + + return fdt; + +free: + folio_put(fdt_folio); + return NULL; +} + +static int memfd_luo_finish_fdt(void *fdt) +{ + int err; + + err =3D fdt_end_node(fdt); + if (err) + return err; + + return fdt_finish(fdt); +} + +static int memfd_luo_prepare(struct file *file, void *arg, u64 *data) +{ + struct memfd_luo_preserved_folio *preserved_folios; + struct inode *inode =3D file_inode(file); + unsigned int max_folios, nr_folios =3D 0; + int err =3D 0, preserved_size; + struct folio **folios; + long size, nr_pinned; + pgoff_t offset; + void *fdt; + u64 pos; + + if (WARN_ON_ONCE(!shmem_file(file))) + return -EINVAL; + + inode_lock(inode); + + size =3D i_size_read(inode); + if ((PAGE_ALIGN(size) / PAGE_SIZE) > UINT_MAX) { + err =3D -E2BIG; + goto err_unlock; + } + shmem_i_mapping_freeze(inode, true); + + /* + * Guess the number of folios based on inode size. Real number might end + * up being smaller if there are higher order folios. + */ + max_folios =3D PAGE_ALIGN(size) / PAGE_SIZE; + folios =3D kvmalloc_array(max_folios, sizeof(*folios), GFP_KERNEL); + if (!folios) { + err =3D -ENOMEM; + goto err_unfreeze; + } + + /* + * Pin the folios so they don't move around behind our back. This also + * ensures none of the folios are in CMA -- which ensures they don't + * fall in KHO scratch memory. It also moves swapped out folios back to + * memory.
+ * + * A side effect of doing this is that it allocates a folio for all + * indices in the file. This might waste memory on sparse memfds. If + * that is really a problem in the future, we can have a + * memfd_pin_folios() variant that does not allocate a page on empty + * slots. + */ + nr_pinned =3D memfd_pin_folios(file, 0, size - 1, folios, max_folios, + &offset); + if (nr_pinned < 0) { + err =3D nr_pinned; + pr_err("failed to pin folios: %d\n", err); + goto err_free_folios; + } + /* nr_pinned won't be more than max_folios which is also unsigned int. */ + nr_folios =3D (unsigned int)nr_pinned; + + preserved_size =3D sizeof(struct memfd_luo_preserved_folio) * nr_folios; + if (check_mul_overflow(sizeof(struct memfd_luo_preserved_folio), + nr_folios, &preserved_size)) { + err =3D -E2BIG; + goto err_unpin; + } + + /* + * Most of the space should be taken by preserved folios. So take its + * size, plus a page for other properties. + */ + fdt =3D memfd_luo_create_fdt(PAGE_ALIGN(preserved_size) + PAGE_SIZE); + if (!fdt) { + err =3D -ENOMEM; + goto err_unpin; + } + + pos =3D file->f_pos; + err =3D fdt_property(fdt, "pos", &pos, sizeof(pos)); + if (err) + goto err_free_fdt; + + err =3D fdt_property(fdt, "size", &size, sizeof(size)); + if (err) + goto err_free_fdt; + + err =3D fdt_property_placeholder(fdt, "folios", preserved_size, + (void **)&preserved_folios); + if (err) { + pr_err("Failed to reserve folios property in FDT: %s\n", + fdt_strerror(err)); + err =3D -ENOMEM; + goto err_free_fdt; + } + + err =3D memfd_luo_preserve_folios(preserved_folios, folios, nr_folios); + if (err) + goto err_free_fdt; + + err =3D memfd_luo_finish_fdt(fdt); + if (err) + goto err_unpreserve; + + err =3D kho_preserve_folio(virt_to_folio(fdt)); + if (err) + goto err_unpreserve; + + kvfree(folios); + inode_unlock(inode); + + *data =3D virt_to_phys(fdt); + return 0; + +err_unpreserve: + memfd_luo_unpreserve_folios(preserved_folios, nr_folios); +err_free_fdt: + folio_put(virt_to_folio(fdt)); 
+err_unpin: + unpin_folios(folios, nr_pinned); +err_free_folios: + kvfree(folios); +err_unfreeze: + shmem_i_mapping_freeze(inode, false); +err_unlock: + inode_unlock(inode); + return err; +} + +static int memfd_luo_freeze(struct file *file, void *arg, u64 *data) +{ + u64 pos =3D file->f_pos; + void *fdt; + int err; + + if (WARN_ON_ONCE(!*data)) + return -EINVAL; + + fdt =3D phys_to_virt(*data); + + /* + * The pos might have changed since prepare. Everything else, including + * the size, stays the same because the mapping is frozen. + */ + err =3D fdt_setprop(fdt, 0, "pos", &pos, sizeof(pos)); + if (err) + return err; + + return 0; +} + +static void memfd_luo_cancel(struct file *file, void *arg, u64 data) +{ + const struct memfd_luo_preserved_folio *pfolios; + struct inode *inode =3D file_inode(file); + struct folio *fdt_folio; + void *fdt; + int len; + + if (WARN_ON_ONCE(!data)) + return; + + inode_lock(inode); + shmem_i_mapping_freeze(inode, false); + + fdt =3D phys_to_virt(data); + fdt_folio =3D virt_to_folio(fdt); + pfolios =3D fdt_getprop(fdt, 0, "folios", &len); + if (pfolios) + memfd_luo_unpreserve_folios(pfolios, len / sizeof(*pfolios)); + + kho_unpreserve_folio(fdt_folio); + folio_put(fdt_folio); + inode_unlock(inode); +} + +static struct folio *memfd_luo_get_fdt(u64 data) +{ + return kho_restore_folio((phys_addr_t)data); +} + +static void memfd_luo_finish(struct file *file, void *arg, u64 data, + bool reclaimed) +{ + const struct memfd_luo_preserved_folio *pfolios; + struct folio *fdt_folio; + int len; + + if (reclaimed) + return; + + fdt_folio =3D memfd_luo_get_fdt(data); + + pfolios =3D fdt_getprop(folio_address(fdt_folio), 0, "folios", &len); + if (pfolios) + memfd_luo_unpreserve_folios(pfolios, len / sizeof(*pfolios)); + + folio_put(fdt_folio); +} + +static int memfd_luo_retrieve(void *arg, u64 data, struct file **file_p) +{ + const struct memfd_luo_preserved_folio *pfolios; + int nr_pfolios, len, ret =3D 0, i =3D 0; + struct address_space *mapping; + struct folio *folio, *fdt_folio; + const
u64 *pos, *size; + struct inode *inode; + struct file *file; + const void *fdt; + + fdt_folio =3D memfd_luo_get_fdt(data); + if (!fdt_folio) + return -ENOENT; + + fdt =3D page_to_virt(folio_page(fdt_folio, 0)); + + pfolios =3D fdt_getprop(fdt, 0, "folios", &len); + if (!pfolios || len % sizeof(*pfolios)) { + pr_err("invalid 'folios' property\n"); + ret =3D -EINVAL; + goto put_fdt; + } + nr_pfolios =3D len / sizeof(*pfolios); + + size =3D fdt_getprop(fdt, 0, "size", &len); + if (!size || len !=3D sizeof(u64)) { + pr_err("invalid 'size' property\n"); + ret =3D -EINVAL; + goto put_folios; + } + + pos =3D fdt_getprop(fdt, 0, "pos", &len); + if (!pos || len !=3D sizeof(u64)) { + pr_err("invalid 'pos' property\n"); + ret =3D -EINVAL; + goto put_folios; + } + + file =3D shmem_file_setup("", 0, VM_NORESERVE); + + if (IS_ERR(file)) { + ret =3D PTR_ERR(file); + pr_err("failed to setup file: %d\n", ret); + goto put_folios; + } + + inode =3D file->f_inode; + mapping =3D inode->i_mapping; + vfs_setpos(file, *pos, MAX_LFS_FILESIZE); + + for (; i < nr_pfolios; i++) { + const struct memfd_luo_preserved_folio *pfolio =3D &pfolios[i]; + phys_addr_t phys; + u64 index; + int flags; + + if (!pfolio->foliodesc) + continue; + + phys =3D PFN_PHYS(PRESERVED_FOLIO_PFN(pfolio->foliodesc)); + folio =3D kho_restore_folio(phys); + if (!folio) { + pr_err("Unable to restore folio at physical address: %llx\n", + phys); + goto put_file; + } + index =3D pfolio->index; + flags =3D PRESERVED_FOLIO_FLAGS(pfolio->foliodesc); + + /* Set up the folio for insertion. */ + /* + * TODO: Should find a way to unify this and + * shmem_alloc_and_add_folio(). 
+		 */
+		__folio_set_locked(folio);
+		__folio_set_swapbacked(folio);
+
+		ret = mem_cgroup_charge(folio, NULL, mapping_gfp_mask(mapping));
+		if (ret) {
+			pr_err("shmem: failed to charge folio index %d: %d\n",
+			       i, ret);
+			goto unlock_folio;
+		}
+
+		ret = shmem_add_to_page_cache(folio, mapping, index, NULL,
+					      mapping_gfp_mask(mapping));
+		if (ret) {
+			pr_err("shmem: failed to add to page cache folio index %d: %d\n",
+			       i, ret);
+			goto unlock_folio;
+		}
+
+		if (flags & PRESERVED_FLAG_UPTODATE)
+			folio_mark_uptodate(folio);
+		if (flags & PRESERVED_FLAG_DIRTY)
+			folio_mark_dirty(folio);
+
+		ret = shmem_inode_acct_blocks(inode, 1);
+		if (ret) {
+			pr_err("shmem: failed to account folio index %d: %d\n",
+			       i, ret);
+			goto unlock_folio;
+		}
+
+		shmem_recalc_inode(inode, 1, 0);
+		folio_add_lru(folio);
+		folio_unlock(folio);
+		folio_put(folio);
+	}
+
+	inode->i_size = *size;
+	*file_p = file;
+	folio_put(fdt_folio);
+	return 0;
+
+unlock_folio:
+	folio_unlock(folio);
+	folio_put(folio);
+put_file:
+	fput(file);
+	i++;
+put_folios:
+	for (; i < nr_pfolios; i++) {
+		const struct memfd_luo_preserved_folio *pfolio = &pfolios[i];
+
+		folio = kho_restore_folio(PRESERVED_FOLIO_PFN(pfolio->foliodesc));
+		if (folio)
+			folio_put(folio);
+	}
+
+put_fdt:
+	folio_put(fdt_folio);
+	return ret;
+}
+
+static bool memfd_luo_can_preserve(struct file *file, void *arg)
+{
+	struct inode *inode = file_inode(file);
+
+	return shmem_file(file) && !inode->i_nlink;
+}
+
+static const struct liveupdate_file_ops memfd_luo_file_ops = {
+	.prepare = memfd_luo_prepare,
+	.freeze = memfd_luo_freeze,
+	.cancel = memfd_luo_cancel,
+	.finish = memfd_luo_finish,
+	.retrieve = memfd_luo_retrieve,
+	.can_preserve = memfd_luo_can_preserve,
+};
+
+static struct liveupdate_file_handler memfd_luo_handler = {
+	.ops = &memfd_luo_file_ops,
+	.compatible = memfd_luo_compatible,
+};
+
+static int __init memfd_luo_init(void)
+{
+	int err;
+
+	err = liveupdate_register_file_handler(&memfd_luo_handler);
+	if (err)
+		pr_err("Could not register luo filesystem handler: %d\n", err);
+
+	return err;
+}
+late_initcall(memfd_luo_init);
-- 
2.50.0.727.gbf7dc18ff4-goog

From nobody Mon Oct 6 06:43:24 2025
From: Pasha Tatashin
Subject: [PATCH v2 29/32] docs: add documentation for memfd preservation via LUO
Date: Wed, 23 Jul 2025 14:46:42 +0000
Message-ID: <20250723144649.1696299-30-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog
In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
References: <20250723144649.1696299-1-pasha.tatashin@soleen.com>

From: Pratyush Yadav

Add the documentation under the "Preserving file descriptors" section of
LUO's documentation. The doc describes the properties preserved,
behaviour of the file under different LUO states, serialization format,
and current limitations.

Signed-off-by: Pratyush Yadav
Signed-off-by: Pasha Tatashin
---
 Documentation/core-api/liveupdate.rst   |   7 ++
 Documentation/mm/index.rst              |   1 +
 Documentation/mm/memfd_preservation.rst | 138 ++++++++++++++++++++++++
 MAINTAINERS                             |   1 +
 4 files changed, 147 insertions(+)
 create mode 100644 Documentation/mm/memfd_preservation.rst

diff --git a/Documentation/core-api/liveupdate.rst b/Documentation/core-api/liveupdate.rst
index 41c4b76cd3ec..232d5f623992 100644
--- a/Documentation/core-api/liveupdate.rst
+++ b/Documentation/core-api/liveupdate.rst
@@ -18,6 +18,13 @@ LUO Preserving File Descriptors
 .. kernel-doc:: kernel/liveupdate/luo_files.c
    :doc: LUO file descriptors
 
+The following types of file descriptors can be preserved
+
+.. toctree::
+   :maxdepth: 1
+
+   ../mm/memfd_preservation
+
 Public API
 ==========
 .. kernel-doc:: include/linux/liveupdate.h
diff --git a/Documentation/mm/index.rst b/Documentation/mm/index.rst
index d3ada3e45e10..97267567ef80 100644
--- a/Documentation/mm/index.rst
+++ b/Documentation/mm/index.rst
@@ -47,6 +47,7 @@ documentation, or deleted if it has served its purpose.
    hugetlbfs_reserv
    ksm
    memory-model
+   memfd_preservation
    mmu_notifier
    multigen_lru
    numa
diff --git a/Documentation/mm/memfd_preservation.rst b/Documentation/mm/memfd_preservation.rst
new file mode 100644
index 000000000000..416cd1dafc97
--- /dev/null
+++ b/Documentation/mm/memfd_preservation.rst
@@ -0,0 +1,138 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+
+==========================
+Memfd Preservation via LUO
+==========================
+
+Overview
+========
+
+Memory file descriptors (memfd) can be preserved over a kexec using the Live
+Update Orchestrator (LUO) file preservation. This allows userspace to transfer
+its memory contents to the next kernel after a kexec.
+
+The preservation is not intended to be transparent. Only select properties of
+the file are preserved. All others are reset to default. The preserved
+properties are described below.
+
+.. note::
+   The LUO API is not stabilized yet, so the preserved properties of a memfd are
+   also not stable and are subject to backwards incompatible changes.
+
+.. note::
+   Currently a memfd backed by Hugetlb is not supported. Memfds created
+   with ``MFD_HUGETLB`` will be rejected.
+
+Preserved Properties
+====================
+
+The following properties of the memfd are preserved across kexec:
+
+File Contents
+  All data stored in the file is preserved.
+
+File Size
+  The size of the file is preserved. Holes in the file are filled by allocating
+  pages for them during preservation.
+
+File Position
+  The current file position is preserved, allowing applications to continue
+  reading/writing from their last position.
+
+File Status Flags
+  memfds are always opened with ``O_RDWR`` and ``O_LARGEFILE``. This property is
+  maintained.
+
+Non-Preserved Properties
+========================
+
+All properties which are not preserved must be assumed to be reset to default.
+This section describes some of the more notable ones.
+
+``FD_CLOEXEC`` flag
+  A memfd can be created with the ``MFD_CLOEXEC`` flag that sets the
+  ``FD_CLOEXEC`` on the file. This flag is not preserved and must be set again
+  after restore via ``fcntl()``.
+
+Seals
+  File seals are not preserved. The file is unsealed on restore and if needed,
+  must be sealed again via ``fcntl()``.
+
+Behavior with LUO states
+========================
+
+This section describes the behavior of the memfd in the different LUO states.
+
+Normal Phase
+  During the normal phase, the memfd can be marked for preservation using the
+  ``LIVEUPDATE_IOCTL_FD_PRESERVE`` ioctl. The memfd acts as a regular memfd
+  during this phase with no additional restrictions.
+
+Prepared Phase
+  After LUO enters ``LIVEUPDATE_STATE_PREPARED``, the memfd is serialized and
+  prepared for the next kernel. During this phase, the following happens:
+
+  - All the folios are pinned. If some folios reside in ``ZONE_MIGRATE``, they
+    are migrated out. This ensures none of the preserved folios lands in the
+    KHO scratch area.
+  - Pages in swap are swapped in. Currently, there is no way to pass pages in
+    swap over KHO, so all swapped out pages are swapped back in and pinned.
+  - The memfd goes into "frozen mapping" mode. The file can no longer grow or
+    shrink, and holes can no longer be punched. This keeps the serialized
+    mappings in sync. The file can still be read, written, or mmap-ed.
+
+Freeze Phase
+  The current file position in the serialized data is updated to capture any
+  changes made between the prepare and freeze phases. After this, the FD must
+  not be accessed.
+
+Restoration Phase
+  After being restored, the memfd is functional as normal with the properties
+  listed above restored.
+
+Cancellation
+  If the live update is canceled after entering the prepared phase, the memfd
+  behaves as it does in the normal phase.
+
+Serialization format
+====================
+
+The state is serialized in an FDT with the following structure::
+
+  /dts-v1/;
+
+  / {
+      compatible = "memfd-v1";
+      pos = ;
+      size = ;
+      folios = ;
+  };
+
+Each folio descriptor contains:
+
+- PFN + flags (8 bytes)
+
+  - Physical frame number (PFN) of the preserved folio (bits 63:12).
+  - Folio flags (bits 11:0):
+
+    - ``PRESERVED_FLAG_DIRTY`` (bit 0)
+    - ``PRESERVED_FLAG_UPTODATE`` (bit 1)
+
+- Folio index within the file (8 bytes).
+
+Limitations
+===========
+
+The current implementation has the following limitations:
+
+Size
+  Currently the size of the file is limited by the size of the FDT. The FDT can
+  be of at most ``MAX_PAGE_ORDER`` order. By default this is 4 MiB with 4K
+  pages. Each page in the file is tracked using 16 bytes. This limits the
+  maximum size of the file to 1 GiB.
+
+See Also
+========
+
+- :doc:`Live Update Orchestrator `
+- :doc:`/core-api/kho/concepts`
diff --git a/MAINTAINERS b/MAINTAINERS
index 361032f23876..b4fde9f62e9b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14020,6 +14020,7 @@ S:	Maintained
 F:	Documentation/ABI/testing/sysfs-kernel-liveupdate
 F:	Documentation/admin-guide/liveupdate.rst
 F:	Documentation/core-api/liveupdate.rst
+F:	Documentation/mm/memfd_preservation.rst
 F:	Documentation/userspace-api/liveupdate.rst
 F:	include/linux/liveupdate.h
 F:	include/uapi/linux/liveupdate.h
-- 
2.50.0.727.gbf7dc18ff4-goog

From nobody Mon Oct 6 06:43:24 2025
From: Pasha Tatashin
Subject: [PATCH v2 30/32] tools: introduce libluo
Date: Wed, 23 Jul 2025 14:46:43 +0000
Message-ID: <20250723144649.1696299-31-pasha.tatashin@soleen.com>
X-Mailer:
 git-send-email 2.50.0.727.gbf7dc18ff4-goog
In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
References: <20250723144649.1696299-1-pasha.tatashin@soleen.com>

From: Pratyush Yadav

LibLUO is a C library for interacting with the Live Update Orchestrator
(LUO) subsystem. It provides a set of APIs for applications to interact
with LUO, avoiding the need to call the LUO ioctls directly. It provides
APIs for controlling the LUO state and for preserving and restoring file
descriptors across live updates.

Signed-off-by: Pratyush Yadav
Signed-off-by: Pasha Tatashin
---
 MAINTAINERS                        |   1 +
 tools/lib/luo/LICENSE              | 165 ++++++++++++++++++
 tools/lib/luo/Makefile             |  37 ++++
 tools/lib/luo/README.md            | 166 ++++++++++++++++++
 tools/lib/luo/include/libluo.h     | 128 ++++++++++++++
 tools/lib/luo/include/liveupdate.h | 265 +++++++++++++++++++++++++++++
 tools/lib/luo/libluo.c             | 203 ++++++++++++++++++++++
 7 files changed, 965 insertions(+)
 create mode 100644 tools/lib/luo/LICENSE
 create mode 100644 tools/lib/luo/Makefile
 create mode 100644 tools/lib/luo/README.md
 create mode 100644 tools/lib/luo/include/libluo.h
 create mode 100644 tools/lib/luo/include/liveupdate.h
 create mode 100644 tools/lib/luo/libluo.c

diff --git a/MAINTAINERS b/MAINTAINERS
index b4fde9f62e9b..f833b340fabd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14026,6 +14026,7 @@ F:	include/linux/liveupdate.h
 F:	include/uapi/linux/liveupdate.h
 F:	kernel/liveupdate/
 F:	mm/memfd_luo.c
+F:	tools/lib/luo/
 F:	tools/testing/selftests/liveupdate/
 
 LLC (802.2)
diff --git a/tools/lib/luo/LICENSE b/tools/lib/luo/LICENSE
new file mode 100644
index 000000000000..0a041280bd00
--- /dev/null
+++ b/tools/lib/luo/LICENSE
@@ -0,0 +1,165 @@
+                   GNU LESSER GENERAL PUBLIC LICENSE
+                       Version 3, 29 June 2007
+
+ Copyright (C) 2007 Free Software Foundation, Inc.
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+
+  This version of the GNU Lesser General Public License incorporates
+the terms and conditions of version 3 of the GNU General Public
+License, supplemented by the additional permissions listed below.
+
+  0. Additional Definitions.
+
+  As used herein, "this License" refers to version 3 of the GNU Lesser
+General Public License, and the "GNU GPL" refers to version 3 of the GNU
+General Public License.
+
+  "The Library" refers to a covered work governed by this License,
+other than an Application or a Combined Work as defined below.
+
+  An "Application" is any work that makes use of an interface provided
+by the Library, but which is not otherwise based on the Library.
+Defining a subclass of a class defined by the Library is deemed a mode
+of using an interface provided by the Library.
+
+  A "Combined Work" is a work produced by combining or linking an
+Application with the Library.  The particular version of the Library
+with which the Combined Work was made is also called the "Linked
+Version".
+
+  The "Minimal Corresponding Source" for a Combined Work means the
+Corresponding Source for the Combined Work, excluding any source code
+for portions of the Combined Work that, considered in isolation, are
+based on the Application, and not on the Linked Version.
+
+  The "Corresponding Application Code" for a Combined Work means the
+object code and/or source code for the Application, including any data
+and utility programs needed for reproducing the Combined Work from the
+Application, but excluding the System Libraries of the Combined Work.
+
+  1. Exception to Section 3 of the GNU GPL.
+
+  You may convey a covered work under sections 3 and 4 of this License
+without being bound by section 3 of the GNU GPL.
+
+  2. Conveying Modified Versions.
+
+  If you modify a copy of the Library, and, in your modifications, a
+facility refers to a function or data to be supplied by an Application
+that uses the facility (other than as an argument passed when the
+facility is invoked), then you may convey a copy of the modified
+version:
+
+   a) under this License, provided that you make a good faith effort to
+   ensure that, in the event an Application does not supply the
+   function or data, the facility still operates, and performs
+   whatever part of its purpose remains meaningful, or
+
+   b) under the GNU GPL, with none of the additional permissions of
+   this License applicable to that copy.
+
+  3. Object Code Incorporating Material from Library Header Files.
+
+  The object code form of an Application may incorporate material from
+a header file that is part of the Library.  You may convey such object
+code under terms of your choice, provided that, if the incorporated
+material is not limited to numerical parameters, data structure
+layouts and accessors, or small macros, inline functions and templates
+(ten or fewer lines in length), you do both of the following:
+
+   a) Give prominent notice with each copy of the object code that the
+   Library is used in it and that the Library and its use are
+   covered by this License.
+
+   b) Accompany the object code with a copy of the GNU GPL and this license
+   document.
+
+  4. Combined Works.
+
+  You may convey a Combined Work under terms of your choice that,
+taken together, effectively do not restrict modification of the
+portions of the Library contained in the Combined Work and reverse
+engineering for debugging such modifications, if you also do each of
+the following:
+
+   a) Give prominent notice with each copy of the Combined Work that
+   the Library is used in it and that the Library and its use are
+   covered by this License.
+
+   b) Accompany the Combined Work with a copy of the GNU GPL and this license
+   document.
+
+   c) For a Combined Work that displays copyright notices during
+   execution, include the copyright notice for the Library among
+   these notices, as well as a reference directing the user to the
+   copies of the GNU GPL and this license document.
+
+   d) Do one of the following:
+
+       0) Convey the Minimal Corresponding Source under the terms of this
+       License, and the Corresponding Application Code in a form
+       suitable for, and under terms that permit, the user to
+       recombine or relink the Application with a modified version of
+       the Linked Version to produce a modified Combined Work, in the
+       manner specified by section 6 of the GNU GPL for conveying
+       Corresponding Source.
+
+       1) Use a suitable shared library mechanism for linking with the
+       Library.  A suitable mechanism is one that (a) uses at run time
+       a copy of the Library already present on the user's computer
+       system, and (b) will operate properly with a modified version
+       of the Library that is interface-compatible with the Linked
+       Version.
+
+   e) Provide Installation Information, but only if you would otherwise
+   be required to provide such information under section 6 of the
+   GNU GPL, and only to the extent that such information is
+   necessary to install and execute a modified version of the
+   Combined Work produced by recombining or relinking the
+   Application with a modified version of the Linked Version. (If
+   you use option 4d0, the Installation Information must accompany
+   the Minimal Corresponding Source and Corresponding Application
+   Code. If you use option 4d1, you must provide the Installation
+   Information in the manner specified by section 6 of the GNU GPL
+   for conveying Corresponding Source.)
+
+  5. Combined Libraries.
+
+  You may place library facilities that are a work based on the
+Library side by side in a single library together with other library
+facilities that are not Applications and are not covered by this
+License, and convey such a combined library under terms of your
+choice, if you do both of the following:
+
+   a) Accompany the combined library with a copy of the same work based
+   on the Library, uncombined with any other library facilities,
+   conveyed under the terms of this License.
+
+   b) Give prominent notice with the combined library that part of it
+   is a work based on the Library, and explaining where to find the
+   accompanying uncombined form of the same work.
+
+  6. Revised Versions of the GNU Lesser General Public License.
+
+  The Free Software Foundation may publish revised and/or new versions
+of the GNU Lesser General Public License from time to time. Such new
+versions will be similar in spirit to the present version, but may
+differ in detail to address new problems or concerns.
+
+  Each version is given a distinguishing version number. If the
+Library as you received it specifies that a certain numbered version
+of the GNU Lesser General Public License "or any later version"
+applies to it, you have the option of following the terms and
+conditions either of that published version or of any later version
+published by the Free Software Foundation. If the Library as you
+received it does not specify a version number of the GNU Lesser
+General Public License, you may choose any version of the GNU Lesser
+General Public License ever published by the Free Software Foundation.
+
+  If the Library as you received it specifies that a proxy can decide
+whether future versions of the GNU Lesser General Public License shall
+apply, that proxy's public statement of acceptance of any version is
+permanent authorization for you to choose that version for the
+Library.
diff --git a/tools/lib/luo/Makefile b/tools/lib/luo/Makefile
new file mode 100644
index 000000000000..e851c37d3d0a
--- /dev/null
+++ b/tools/lib/luo/Makefile
@@ -0,0 +1,37 @@
+# SPDX-License-Identifier: LGPL-3.0-or-later
+SRCS = libluo.c
+OBJS = $(SRCS:.c=.o)
+INCLUDE_DIR = include
+HEADERS = $(wildcard $(INCLUDE_DIR)/*.h)
+
+CC = gcc
+AR = ar
+CFLAGS = -Wall -Wextra -fPIC -O2 -g -I$(INCLUDE_DIR)
+LDFLAGS = -shared
+
+LIB_NAME = libluo
+STATIC_LIB = $(LIB_NAME).a
+SHARED_LIB = $(LIB_NAME).so
+
+.PHONY: all clean install
+
+all: $(STATIC_LIB) $(SHARED_LIB)
+
+$(STATIC_LIB): $(OBJS)
+	$(AR) rcs $@ $^
+
+$(SHARED_LIB): $(OBJS)
+	$(CC) $(LDFLAGS) -o $@ $^
+
+%.o: %.c $(HEADERS)
+	$(CC) $(CFLAGS) -c $< -o $@
+
+clean:
+	rm -f $(OBJS) $(STATIC_LIB) $(SHARED_LIB)
+
+install: all
+	install -d $(DESTDIR)/usr/local/lib
+	install -d $(DESTDIR)/usr/local/include
+	install -m 644 $(STATIC_LIB) $(DESTDIR)/usr/local/lib
+	install -m 755 $(SHARED_LIB) $(DESTDIR)/usr/local/lib
+	install -m 644 $(HEADERS) $(DESTDIR)/usr/local/include
diff --git a/tools/lib/luo/README.md b/tools/lib/luo/README.md
new file mode 100644
index 000000000000..a716ccb2992c
--- /dev/null
+++ b/tools/lib/luo/README.md
@@ -0,0 +1,166 @@
+# LibLUO - Live Update Orchestrator Library
+
+A C library for interacting with the Linux Live Update Orchestrator (LUO) subsystem.
+
+## Overview
+
+LibLUO provides a set of APIs for applications to interact with LUO, avoiding
+the need to call the LUO ioctls directly. It provides APIs for controlling
+the LUO state and for preserving and restoring file descriptors across live updates.
+
+## Features
+
+- Initialize and manage connection to the LUO device.
+- Preserve file descriptors before a live update.
+- Restore file descriptors after a live update.
+- Control the live update state machine (prepare, cancel, finish).
+- Query the current state of the LUO subsystem.
+- The library also includes a test suite for testing both LibLUO and the kernel
+  LUO interface.
+
+## Building
+
+```bash
+make
+```
+
+This will build both static (`libluo.a`) and shared (`libluo.so`) versions of the library.
+
+To build the tests, run:
+
+```bash
+make tests
+```
+
+This will build the `tests/test` binary.
+
+## Installation
+
+```bash
+sudo make install
+```
+
+This will install the library to `/usr/local/lib` and the header files to `/usr/local/include`.
+
+## Usage
+
+### Preserving a file descriptor
+
+```c
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <sys/mman.h>
+#include <unistd.h>
+#include <libluo.h>
+
+int main() {
+	int ret;
+	uint64_t token = 1; // Caller-chosen token; must be unique
+	int fd, new_fd;
+	enum liveupdate_state state;
+
+	// Initialize the library
+	ret = luo_init();
+	if (ret < 0) {
+		fprintf(stderr, "Failed to initialize LibLUO: %d\n", ret);
+		return 1;
+	}
+
+	// Check if LUO is available
+	if (!luo_is_available()) {
+		fprintf(stderr, "LUO is not available on this system\n");
+		luo_cleanup();
+		return 1;
+	}
+
+	// Get the current LUO state
+	ret = luo_get_state(&state);
+	if (ret < 0) {
+		fprintf(stderr, "Failed to get LUO state: %d\n", ret);
+		luo_cleanup();
+		return 1;
+	}
+
+	printf("Current LUO state: %s\n", luo_state_to_string(state));
+
+	// Create a file descriptor to preserve
+	fd = memfd_create("luo_memfd", 0);
+	if (fd < 0) {
+		perror("Failed to create memfd");
+		luo_cleanup();
+		return 1;
+	}
+
+	// Preserve the file descriptor under the chosen token
+	ret = luo_fd_preserve(fd, token);
+	if (ret < 0) {
+		fprintf(stderr, "Failed to preserve FD: %d\n", ret);
+		close(fd);
+		luo_cleanup();
+		return 1;
+	}
+
+	printf("FD %d preserved with token %llu\n", fd, (unsigned long long)token);
+
+	// After a live update, restore the file descriptor
+	if (state == LIVEUPDATE_STATE_UPDATED) {
+		ret = luo_fd_restore(token, &new_fd);
+		if (ret < 0) {
+			fprintf(stderr, "Failed to restore FD: %d\n", ret);
+		} else {
+			printf("FD restored: %d\n", new_fd);
+			close(new_fd);
+		}
+
+		// Signal completion of restoration
+		luo_finish();
+	}
+
+	close(fd);
+	luo_cleanup();
+	return 0;
+}
+```
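[Editorial sketch, not part of the patch.] The example above only restores when the state is "updated". The toy model below illustrates the state transitions as the liveupdate.h comments in this series describe them: prepare from normal, cancel from prepared, finish from updated. It does not touch /dev/liveupdate or libluo. Every `toy_*` name is hypothetical, `TOY_EV_REBOOT` folds the kernel-internal freeze and kexec into a single step, and `-EINVAL` is only a placeholder error code, not necessarily what the kernel returns.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical mirror of enum liveupdate_state; values copied from the patch. */
enum toy_state {
	TOY_UNDEFINED = 0,
	TOY_NORMAL = 1,
	TOY_PREPARED = 2,
	TOY_FROZEN = 3,
	TOY_UPDATED = 4,
};

/* Hypothetical events; TOY_EV_REBOOT stands in for freeze plus kexec. */
enum toy_event { TOY_EV_PREPARE, TOY_EV_CANCEL, TOY_EV_REBOOT, TOY_EV_FINISH };

/*
 * Apply one event to *state. Returns 0 on success, or -EINVAL (placeholder)
 * when the event is not allowed from the current state.
 */
int toy_step(enum toy_state *state, enum toy_event ev)
{
	enum toy_state from, to;

	switch (ev) {
	case TOY_EV_PREPARE: from = TOY_NORMAL;   to = TOY_PREPARED; break;
	case TOY_EV_CANCEL:  from = TOY_PREPARED; to = TOY_NORMAL;   break;
	case TOY_EV_REBOOT:  from = TOY_PREPARED; to = TOY_UPDATED;  break;
	case TOY_EV_FINISH:  from = TOY_UPDATED;  to = TOY_NORMAL;   break;
	default:
		return -EINVAL;
	}

	if (*state != from)
		return -EINVAL;

	*state = to;
	return 0;
}

/* In this model, restoring FDs is only valid after reboot and before finish. */
int toy_can_restore(enum toy_state state)
{
	return state == TOY_UPDATED;
}

/* Walk one cancelled attempt followed by one full update cycle. */
void toy_selfcheck(void)
{
	enum toy_state s = TOY_NORMAL;

	assert(toy_step(&s, TOY_EV_FINISH) == -EINVAL); /* nothing to finish yet */
	assert(toy_step(&s, TOY_EV_PREPARE) == 0);
	assert(toy_step(&s, TOY_EV_CANCEL) == 0 && s == TOY_NORMAL);
	assert(toy_step(&s, TOY_EV_PREPARE) == 0);
	assert(toy_step(&s, TOY_EV_REBOOT) == 0 && toy_can_restore(s));
	assert(toy_step(&s, TOY_EV_FINISH) == 0 && s == TOY_NORMAL);
}
```

Calling `toy_selfcheck()` from a `main()` runs both paths. A real caller would instead query `luo_get_state()` and branch on the result, as the README example does.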
+
+### Controlling the Live Update Process
+
+```c
+#include <stdio.h>
+#include <libluo.h>
+
+int main() {
+	int ret;
+
+	ret = luo_init();
+	if (ret < 0) {
+		return 1;
+	}
+
+	// Initiate the preparation phase
+	ret = luo_prepare();
+	if (ret < 0) {
+		fprintf(stderr, "Failed to prepare for live update: %d\n", ret);
+		luo_cleanup();
+		return 1;
+	}
+
+	// At this point, the system is ready for kexec reboot
+	// The freeze operation is handled internally by the kernel
+	// during kexec.
+
+	// After reboot, in the new kernel
+	// Signal completion of restoration
+	ret = luo_finish();
+	if (ret < 0) {
+		fprintf(stderr, "Failed to finish live update: %d\n", ret);
+		luo_cleanup();
+		return 1;
+	}
+
+	luo_cleanup();
+	return 0;
+}
+```
+
+## License
+
+This library is provided under the terms of the GNU Lesser General Public
+License version 3.0, or (at your option) any later version.
diff --git a/tools/lib/luo/include/libluo.h b/tools/lib/luo/include/libluo.h
new file mode 100644
index 000000000000..86b277e8e4f6
--- /dev/null
+++ b/tools/lib/luo/include/libluo.h
@@ -0,0 +1,128 @@
+// SPDX-License-Identifier: LGPL-3.0-or-later
+/**
+ * @file libluo.h
+ * @brief Library for interacting with the Linux Live Update Orchestrator (LUO)
+ *
+ * This library provides a simple interface for applications to interact with
+ * the Linux Live Update Orchestrator (LUO) subsystem, allowing them to preserve
+ * and restore file descriptors across live kernel updates.
+ *
+ * Copyright (C) 2025 Amazon.com Inc. or its affiliates.
+ * Author: Pratyush Yadav
+ */
+
+#ifndef _LIBLUO_H
+#define _LIBLUO_H
+
+#include <stdint.h>
+#include <stdbool.h>
+#include <liveupdate.h>
+
+/**
+ * @brief Initialize the LUO library
+ *
+ * Opens the LUO device file and prepares the library for use.
+ *
+ * @return 0 on success, negative error code on failure
+ */
+int luo_init(void);
+
+/**
+ * @brief Clean up and release resources used by the LUO library
+ *
+ * Closes the LUO device file and releases any resources allocated by the
+ * library.
+ */
+void luo_cleanup(void);
+
+/**
+ * @brief Get the current state of the LUO subsystem
+ *
+ * @param[out] state Pointer to store the current LUO state
+ * @return 0 on success, negative error code on failure
+ */
+int luo_get_state(enum liveupdate_state *state);
+
+/**
+ * @brief Preserve a file descriptor for restoration after a live update
+ *
+ * Marks the specified file descriptor for preservation across a live update.
+ * The kernel validates whether the FD type is supported for preservation.
+ *
+ * @param[in] fd The file descriptor to preserve
+ * @param[in] token Caller-chosen token to associate with fd. Must be unique.
+ * @return 0 on success, negative error code on failure
+ */
+int luo_fd_preserve(int fd, uint64_t token);
+
+/**
+ * @brief Cancel preservation of a previously preserved file descriptor
+ *
+ * Removes a file descriptor from the preservation list using its token.
+ *
+ * @param[in] token The token previously used to preserve the fd.
+ * @return 0 on success, negative error code on failure
+ */
+int luo_fd_unpreserve(uint64_t token);
+
+/**
+ * @brief Restore a previously preserved file descriptor
+ *
+ * Restores a file descriptor that was preserved before the live update.
+ * This must be called after the system has rebooted into the new kernel.
+ *
+ * @param[in] token The token passed to luo_fd_preserve before the update
+ * @param[out] fd Pointer to store the new file descriptor
+ * @return 0 on success, negative error code on failure
+ */
+int luo_fd_restore(uint64_t token, int *fd);
+
+/**
+ * @brief Initiate the preparation phase for a live update
+ *
+ * Triggers the PREPARE phase in the LUO subsystem, which begins the
+ * state saving process for items marked for preservation.
+ *
+ * @return 0 on success, negative error code on failure
+ */
+int luo_prepare(void);
+
+/**
+ * @brief Cancel the live update preparation phase
+ *
+ * Aborts the preparation sequence and returns the system to normal state.
+ * + * @return 0 on success, negative error code on failure + */ +int luo_cancel(void); + +/** + * @brief Signal completion of restoration after a live update + * + * Notifies the LUO subsystem that all necessary restoration actions + * have been completed in the new kernel. + * + * @return 0 on success, negative error code on failure + */ +int luo_finish(void); + +/** + * @brief Check if the LUO subsystem is available + * + * Tests if the LUO device file exists and can be opened. + * + * @return true if LUO is available, false otherwise + */ +bool luo_is_available(void); + +/** + * @brief Convert a liveupdate_state enum value to a string + * + * Returns a string representation of the given LUO state. + * + * @param[in] state The LUO state to convert + * @return A constant string representing the state + */ +const char *luo_state_to_string(enum liveupdate_state state); + +#endif /* _LIBLUO_H */ diff --git a/tools/lib/luo/include/liveupdate.h b/tools/lib/luo/include/liv= eupdate.h new file mode 100644 index 000000000000..7b12a1073c3c --- /dev/null +++ b/tools/lib/luo/include/liveupdate.h @@ -0,0 +1,265 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ + +/* + * Userspace interface for /dev/liveupdate + * Live Update Orchestrator + * + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +#ifndef _UAPI_LIVEUPDATE_H +#define _UAPI_LIVEUPDATE_H + +#include +#include + +/** + * enum liveupdate_state - Defines the possible states of the live update + * orchestrator. + * @LIVEUPDATE_STATE_UNDEFINED: State has not yet been initialized. + * @LIVEUPDATE_STATE_NORMAL: Default state, no live update in prog= ress. + * @LIVEUPDATE_STATE_PREPARED: Live update is prepared for reboot; t= he + * LIVEUPDATE_PREPARE callbacks have com= pleted + * successfully. 
+ *                               Devices might operate in a limited state;
+ *                               for example, the participating devices might
+ *                               not be allowed to unbind, and the setting up
+ *                               of new DMA mappings might be disabled in
+ *                               this state.
+ * @LIVEUPDATE_STATE_FROZEN:     The final reboot event
+ *                               (%LIVEUPDATE_FREEZE) has been sent, and the
+ *                               system is performing its final state saving
+ *                               within the "blackout window". User
+ *                               workloads must be suspended. The actual
+ *                               reboot (kexec) into the next kernel is
+ *                               imminent.
+ * @LIVEUPDATE_STATE_UPDATED:    The system has rebooted into the next
+ *                               kernel via live update; the system is now
+ *                               running the next kernel, awaiting the
+ *                               finish event.
+ *
+ * These states track the progress and outcome of a live update operation.
+ */
+enum liveupdate_state {
+	LIVEUPDATE_STATE_UNDEFINED = 0,
+	LIVEUPDATE_STATE_NORMAL = 1,
+	LIVEUPDATE_STATE_PREPARED = 2,
+	LIVEUPDATE_STATE_FROZEN = 3,
+	LIVEUPDATE_STATE_UPDATED = 4,
+};
+
+/**
+ * struct liveupdate_fd - Holds parameters for preserving and restoring file
+ * descriptors across live update.
+ * @fd:    Input for %LIVEUPDATE_IOCTL_FD_PRESERVE: The user-space file
+ *         descriptor to be preserved.
+ *         Output for %LIVEUPDATE_IOCTL_FD_RESTORE: The new file descriptor
+ *         representing the fully restored kernel resource.
+ * @flags: Unused, reserved for future expansion, must be set to 0.
+ * @token: Input for %LIVEUPDATE_IOCTL_FD_PRESERVE: An opaque, unique token
+ *         identifying the preserved resource.
+ *         Input for %LIVEUPDATE_IOCTL_FD_RESTORE: The token previously
+ *         provided to the preserve ioctl for the resource to be restored.
+ *
+ * This structure is used as the argument for the %LIVEUPDATE_IOCTL_FD_PRESERVE
+ * and %LIVEUPDATE_IOCTL_FD_RESTORE ioctls. These ioctls allow specific types
+ * of file descriptors (for example memfd, kvm, iommufd, and VFIO) to have their
+ * underlying kernel state preserved across a live update cycle.
+ *
+ * To preserve an FD, user space passes this struct to
+ * %LIVEUPDATE_IOCTL_FD_PRESERVE with the @fd and @token fields set. On
+ * success, the kernel uses the @token field to uniquely identify the
+ * preserved FD.
+ *
+ * After the live update transition, user space passes the struct populated with
+ * the *same* @token to %LIVEUPDATE_IOCTL_FD_RESTORE. The kernel uses the @token
+ * to find the preserved state and, on success, populates the @fd field with a
+ * new file descriptor referring to the restored resource.
+ */
+struct liveupdate_fd {
+	int		fd;
+	__u32		flags;
+	__aligned_u64	token;
+};
+
+/* The ioctl type, documented in ioctl-number.rst */
+#define LIVEUPDATE_IOCTL_TYPE		0xBA
+
+/**
+ * LIVEUPDATE_IOCTL_FD_PRESERVE - Validate and initiate preservation for a file
+ * descriptor.
+ *
+ * Argument: Pointer to &struct liveupdate_fd.
+ *
+ * User sets the @fd field identifying the file descriptor to preserve
+ * (e.g., memfd, kvm, iommufd, VFIO). The kernel validates if this FD type
+ * and its dependencies are supported for preservation. If validation passes,
+ * the kernel marks the FD internally and *initiates the process* of preparing
+ * its state for saving. The actual snapshotting of the state typically occurs
+ * during the subsequent %LIVEUPDATE_IOCTL_PREPARE execution phase, though
+ * some finalization might occur during freeze.
+ * On successful validation and initiation, the kernel associates the @token
+ * field with the resource being preserved.
+ * This token confirms the FD is targeted for preservation and is required for
+ * the subsequent %LIVEUPDATE_IOCTL_FD_RESTORE call after the live update.
+ *
+ * Return: 0 on success (validation passed, preservation initiated), negative
+ * error code on failure (e.g., unsupported FD type, dependency issue,
+ * validation failed).
+ */ +#define LIVEUPDATE_IOCTL_FD_PRESERVE \ + _IOW(LIVEUPDATE_IOCTL_TYPE, 0x00, struct liveupdate_fd) + +/** + * LIVEUPDATE_IOCTL_FD_UNPRESERVE - Remove a file descriptor from the + * preservation list. + * + * Argument: Pointer to __u64 token. + * + * Allows user space to explicitly remove a file descriptor from the set of + * items marked as potentially preservable. User space provides a pointer = to the + * __u64 @token that was previously returned by a successful + * %LIVEUPDATE_IOCTL_FD_PRESERVE call (potentially from a prior, possibly + * cancelled, live update attempt). The kernel reads the token value from = the + * provided user-space address. + * + * On success, the kernel removes the corresponding entry (identified by t= he + * token value read from the user pointer) from its internal preservation = list. + * The provided @token (representing the now-removed entry) becomes invalid + * after this call. + * + * Return: 0 on success, negative error code on failure (e.g., -EBUSY or -= EINVAL + * if not in %LIVEUPDATE_STATE_NORMAL, bad address provided, invalid token= value + * read, token not found). + */ +#define LIVEUPDATE_IOCTL_FD_UNPRESERVE \ + _IOW(LIVEUPDATE_IOCTL_TYPE, 0x01, __u64) + +/** + * LIVEUPDATE_IOCTL_FD_RESTORE - Restore a previously preserved file descr= iptor. + * + * Argument: Pointer to &struct liveupdate_fd. + * + * User sets the @token field to the value obtained from a successful + * %LIVEUPDATE_IOCTL_FD_PRESERVE call before the live update. On success, + * the kernel restores the state (saved during the PREPARE/FREEZE phases) + * associated with the token and populates the @fd field with a new file + * descriptor referencing the restored resource in the current (new) kerne= l. + * This operation must be performed *before* signaling completion via + * %LIVEUPDATE_IOCTL_FINISH. + * + * Return: 0 on success, negative error code on failure (e.g., invalid tok= en). 
+ */ +#define LIVEUPDATE_IOCTL_FD_RESTORE \ + _IOWR(LIVEUPDATE_IOCTL_TYPE, 0x02, struct liveupdate_fd) + +/** + * LIVEUPDATE_IOCTL_GET_STATE - Query the current state of the live update + * orchestrator. + * + * Argument: Pointer to &enum liveupdate_state. + * + * The kernel fills the enum value pointed to by the argument with the cur= rent + * state of the live update subsystem. Possible states are: + * + * - %LIVEUPDATE_STATE_NORMAL: Default state; no live update operation is + * currently in progress. + * - %LIVEUPDATE_STATE_PREPARED: The preparation phase (triggered by + * %LIVEUPDATE_IOCTL_PREPARE) has completed + * successfully. The system is ready for the + * reboot transition. Note that some + * device operations (e.g., unbinding, new D= MA + * mappings) might be restricted in this sta= te. + * - %LIVEUPDATE_STATE_UPDATED: The system has successfully rebooted into= the + * new kernel via live update. It is now run= ning + * the new kernel code and is awaiting the + * completion signal from user space via + * %LIVEUPDATE_IOCTL_FINISH after + * restoration tasks are done. + * + * See the definition of &enum liveupdate_state for more details on each s= tate. + * + * Return: 0 on success, negative error code on failure. + */ +#define LIVEUPDATE_IOCTL_GET_STATE \ + _IOR(LIVEUPDATE_IOCTL_TYPE, 0x03, enum liveupdate_state) + +/** + * LIVEUPDATE_IOCTL_PREPARE - Initiate preparation phase and trigger state + * saving. + * + * Argument: None. + * + * Initiates the live update preparation phase. This action corresponds to + * the internal %LIVEUPDATE_PREPARE. This typically triggers the saving pr= ocess + * for items marked via the PRESERVE ioctls. This typically occurs *before* + * the "blackout window", while user applications (e.g., VMs) may still be + * running. Kernel subsystems receiving the %LIVEUPDATE_PREPARE event shou= ld + * serialize necessary state. This command does not transfer data. + * + * Return: 0 on success, negative error code on failure. 
Transitions state
+ * towards %LIVEUPDATE_STATE_PREPARED on success.
+ */
+#define LIVEUPDATE_IOCTL_PREPARE					\
+	_IO(LIVEUPDATE_IOCTL_TYPE, 0x04)
+
+/**
+ * LIVEUPDATE_IOCTL_CANCEL - Cancel the live update preparation phase.
+ *
+ * Argument: None.
+ *
+ * Notifies the live update subsystem to abort the preparation sequence
+ * potentially initiated by %LIVEUPDATE_IOCTL_PREPARE. This action
+ * typically corresponds to the internal %LIVEUPDATE_CANCEL kernel event,
+ * which might also be triggered automatically if the PREPARE stage fails
+ * internally.
+ *
+ * When triggered, subsystems receiving the %LIVEUPDATE_CANCEL event should
+ * revert any state changes or actions taken specifically for the aborted
+ * prepare phase (e.g., discard partially serialized state). The kernel
+ * releases resources allocated specifically for this *aborted preparation
+ * attempt*.
+ *
+ * This operation cancels the current *attempt* to prepare for a live update
+ * but does **not** remove previously validated items from the internal list
+ * of potentially preservable resources. Consequently, preservation tokens
+ * previously registered by successful %LIVEUPDATE_IOCTL_FD_PRESERVE calls
+ * generally **remain valid** as identifiers for those potentially preservable
+ * resources. However, since the system state returns towards
+ * %LIVEUPDATE_STATE_NORMAL, user space must initiate a new live update sequence
+ * (starting with %LIVEUPDATE_IOCTL_PREPARE) to proceed with an update
+ * using these (or other) tokens.
+ *
+ * This command does not transfer data. Kernel callbacks for the
+ * %LIVEUPDATE_CANCEL event must not fail.
+ *
+ * Return: 0 on success, negative error code on failure. Transitions state back
+ * towards %LIVEUPDATE_STATE_NORMAL on success.
+ */
+#define LIVEUPDATE_IOCTL_CANCEL						\
+	_IO(LIVEUPDATE_IOCTL_TYPE, 0x06)
+
+/**
+ * LIVEUPDATE_IOCTL_FINISH - Signal restoration completion and trigger
+ * cleanup.
+ *
+ * Argument: None.
+ * + * Signals that user space has completed all necessary restoration actions= in + * the new kernel (after a live update reboot). This action corresponds to= the + * internal %LIVEUPDATE_FINISH kernel event. Calling this ioctl triggers t= he + * cleanup phase: any resources that were successfully preserved but were = *not* + * subsequently restored (reclaimed) via the RESTORE ioctls will have their + * preserved state discarded and associated kernel resources released. Inv= olved + * devices may be reset. All desired restorations *must* be completed *bef= ore* + * this. Kernel callbacks for the %LIVEUPDATE_FINISH event must not fail. + * Successfully completing this phase transitions the system state from + * %LIVEUPDATE_STATE_UPDATED back to %LIVEUPDATE_STATE_NORMAL. This comman= d does + * not transfer data. + * + * Return: 0 on success, negative error code on failure. + */ +#define LIVEUPDATE_IOCTL_FINISH \ + _IO(LIVEUPDATE_IOCTL_TYPE, 0x07) + +#endif /* _UAPI_LIVEUPDATE_H */ diff --git a/tools/lib/luo/libluo.c b/tools/lib/luo/libluo.c new file mode 100644 index 000000000000..7de4bf01de16 --- /dev/null +++ b/tools/lib/luo/libluo.c @@ -0,0 +1,203 @@ +// SPDX-License-Identifier: LGPL-3.0-or-later +/* + * Copyright (C) 2025 Amazon.com Inc. or its affiliates. + * Author: Pratyush Yadav + */ +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * The liveupdate header is not mainline right now, so it is not available= on + * the system include path. It is copied from Linux tree and put in includ= e/. + * + * This can be removed when liveupdate hits mainline. 
+ */ +#include + +#define LUO_DEVICE_PATH "/dev/liveupdate" + +/* File descriptor for the LUO device */ +static int luo_fd =3D -1; + +#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0])) + +int luo_init(void) +{ + if (luo_fd >=3D 0) + /* Already initialized */ + return 0; + + luo_fd =3D open(LUO_DEVICE_PATH, O_RDWR); + if (luo_fd < 0) { + int err =3D -errno; + + fprintf(stderr, "Failed to open %s: %s\n", + LUO_DEVICE_PATH, strerror(errno)); + return err; + } + + return 0; +} + +void luo_cleanup(void) +{ + if (luo_fd >=3D 0) { + close(luo_fd); + luo_fd =3D -1; + } +} + +bool luo_is_available(void) +{ + struct stat st; + + /* Use stat() to check if the device file exists and is accessible */ + if (stat(LUO_DEVICE_PATH, &st) < 0) + return false; + + /* Verify it's a character device file. */ + if (!S_ISCHR(st.st_mode)) + return false; + + return true; +} + +int luo_get_state(enum liveupdate_state *state) +{ + int ret; + + if (!state) + return -EINVAL; + + if (luo_fd < 0) + return -EBADF; + + ret =3D ioctl(luo_fd, LIVEUPDATE_IOCTL_GET_STATE, state); + if (ret < 0) + return -errno; + + return 0; +} + +int luo_fd_preserve(int fd, uint64_t token) +{ + struct liveupdate_fd fd_data; + int ret; + + if (fd < 0) + return -EINVAL; + + if (luo_fd < 0) + return -EBADF; + + fd_data.fd =3D fd; + fd_data.flags =3D 0; /* Must be set to 0 as per API documentation */ + fd_data.token =3D token; + + ret =3D ioctl(luo_fd, LIVEUPDATE_IOCTL_FD_PRESERVE, &fd_data); + if (ret < 0) + return -errno; + + return 0; +} + +int luo_fd_unpreserve(uint64_t token) +{ + int ret; + + if (luo_fd < 0) + return -EBADF; + + ret =3D ioctl(luo_fd, LIVEUPDATE_IOCTL_FD_UNPRESERVE, &token); + if (ret < 0) + return -errno; + + return 0; +} + +int luo_fd_restore(uint64_t token, int *fd) +{ + struct liveupdate_fd fd_data; + int ret; + + if (!fd) + return -EINVAL; + + if (luo_fd < 0) + return -EBADF; + + fd_data.fd =3D -1; /* Will be filled by the kernel */ + fd_data.flags =3D 0; /* Must be set to 0 as per API 
documentation */ + fd_data.token =3D token; + + ret =3D ioctl(luo_fd, LIVEUPDATE_IOCTL_FD_RESTORE, &fd_data); + if (ret < 0) + return -errno; + + *fd =3D fd_data.fd; + return 0; +} + +int luo_prepare(void) +{ + int ret; + + if (luo_fd < 0) + return -EBADF; + + ret =3D ioctl(luo_fd, LIVEUPDATE_IOCTL_PREPARE); + if (ret < 0) + return -errno; + + return 0; +} + +int luo_cancel(void) +{ + int ret; + + if (luo_fd < 0) + return -EBADF; + + ret =3D ioctl(luo_fd, LIVEUPDATE_IOCTL_CANCEL); + if (ret < 0) + return -errno; + + return 0; +} + +int luo_finish(void) +{ + int ret; + + if (luo_fd < 0) + return -EBADF; + + ret =3D ioctl(luo_fd, LIVEUPDATE_IOCTL_FINISH); + if (ret < 0) + return -errno; + + return 0; +} + +const char *luo_state_to_string(enum liveupdate_state state) +{ + static const char * const state_strings[] =3D { + [LIVEUPDATE_STATE_UNDEFINED] =3D "undefined", + [LIVEUPDATE_STATE_NORMAL] =3D "normal", + [LIVEUPDATE_STATE_PREPARED] =3D "prepared", + [LIVEUPDATE_STATE_FROZEN] =3D "frozen", + [LIVEUPDATE_STATE_UPDATED] =3D "updated" + }; + + if (state >=3D 0 && state < ARRAY_SIZE(state_strings) && state_strings[st= ate]) + return state_strings[state]; + + return "UNKNOWN"; +} --=20 2.50.0.727.gbf7dc18ff4-goog From nobody Mon Oct 6 06:43:24 2025 Received: from mail-yw1-f172.google.com (mail-yw1-f172.google.com [209.85.128.172]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E15B42FCE18 for ; Wed, 23 Jul 2025 14:47:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282086; cv=none; b=OE8I7QmEjQIYVJtNvuKLEPZPK6zENBLEiIsRYO9SM1DErLcffyGHGVnqtlQAwB53Xt1Kk957QvclHhgTEKPQNvpM/Dr2X0zA9lXyXDVbx65/kviCyen1Ffqj0ZKBUoyZjUwtYuEe0DuVlCJVj2poEL8PMapnBsz7ielW0a8Go5Y= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; 
t=1753282086; c=relaxed/simple; bh=kV/i7WzK++AoKinYUIyGUJOx4XSzNlPUkIXf9Vx78zI=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=f+ScqdPEzByCBabZKdQ9GTSKsb9llM03Tn/wZ/M4rR+7UE8nD01g7EwCzUNYrtABWGjnipY4C6KI/feA7OOeyFDVtOZqvQIKroZZ0eYt08vDMOthQVUltotEszMBggecIkBlgMjMjIXohlkH058jG767bseiDTzHMG2x5cpaqrY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=lrXB55ZD; arc=none smtp.client-ip=209.85.128.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="lrXB55ZD" Received: by mail-yw1-f172.google.com with SMTP id 00721157ae682-718425f1172so68329637b3.0 for ; Wed, 23 Jul 2025 07:47:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282078; x=1753886878; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=hShlI4m6hw+YtxzJd2p960yByHj9Zxa7UX6hlMU7/TY=; b=lrXB55ZDcWdmbgG4MC/uUtOOj23D/nwbN1bgbdaJ0FJM8aYReCA5/70ND1uAr9qL/Q RQeK+ugEIV8FAYYs5pxyQFwuw875rEJSS12BHFk6NIFmfARA5d0zLA0ZH22VmwzHkgiN k5nd0ka6Gj9B+rg1VBRw27AmhzmN6mBm9+9jUMx0rA5GmK5kSeo9dOKVYHamj8DrGtM/ IfgYSkwOxBhj4g6Fl2Yq+FrdbTk35yv4fKAwjrCMkus/3mnaqV4ML33zLQbMGa/92AdI 4Ot4KzPlHoyGLHjDzEozF44xbMtNl98/iZbhN9YApqq9nMY8UFBdp3oVHiPqIdGxE8Ji 395g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282078; x=1753886878; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=hShlI4m6hw+YtxzJd2p960yByHj9Zxa7UX6hlMU7/TY=; b=wdW17WMVLuXHM5h96h9D+vjHu7rMX8GZP5+vaoV7lNXjYCzKp1B23ZTiv3ORdmhSad dithqy8pR7o6KSqIDDSi9s6GH1RDtm3hVIIaKozgC54V2jUz2Gf4+l8yVozqAn9A6bz+ QUto4n6GIWYp4xsDBs9pCmepKv4WU8sVDIU+/Dijzt80fYxShmXvR6RISPZpw+ZlezZ6 W2vZZfsABDRd9joKqlQdfRHJ0fM8Dn+d0H/LwjfcCISMD4zrhuINWz94rGL6alXl/xS0 EIhpyS1R36l3O+cEQPSEd9BKUTOdV3ydYZ/Q81WuInxSBDlS3Oh05/hj4FTSpbJFPgLB MJ1A== X-Forwarded-Encrypted: i=1; AJvYcCXHkZlO7KMvdpTlga2PCjvfPDcBqgUI0YRx5LFPmCxCitokTxAzrsa6Lybc9d523fZs/1lWi8gyniWf/qc=@vger.kernel.org X-Gm-Message-State: AOJu0YxWJn5KBQq7RpbDGbqv47YlH4jbPJKEAhiRHxvjIHmbL1atQ69X CiCyUwjHZmrtN9pF/Kn1+zsEFECBYvjl21RCGFvWdpWNim4/5cjZ3QOqRc2/JYq8JEI= X-Gm-Gg: ASbGncufjbNuGjcngL83X9sDTtDUCqiLh+G07wxW01zTY9XkTCT0UDgSo5ltG8oxVFA 7WwQbgZeuuoy7PpeJGydexj1SKkYOPug9+94ML2zuK/GMQnbbTCHbBhgw/OXz9+dqk7zCxKFLSG yS5ktLzfDKC6Vi/2QhQ/vFCdwU4xG57wj/kjyIMtrpXuVrercULDEp18qeYQasIwo02QUJwU/RX JviIYBR2W31PsrkHVNpnuFfjPaC+kBxy1oLdnO/zhUPFrFwfRpobC+Wh4QUPXPBI9uMtNiUf1xX 3Tcilh1azK/WMND3P/cZcaG21/5CeqME3ozWIWVS3OKXs3dr5S6wRo2S5/DmgWd0q+idYYuoQWG amX3AZwBlAhI49+efFRZGpCQ2zjTjJBzuuH7TA56KjFIjpbSbyDkGXl9kvQazAvxrMF5ZdgCbVp dsu0GaaMKbMeU8mA== X-Google-Smtp-Source: AGHT+IH675RnRNbWShRM8kSWIrUMBTag3AVm1BnglGtYW6J1tHoiU+7BSxfTq41Ub2U0xE0WezZ+cg== X-Received: by 2002:a05:690c:9a0d:b0:719:671d:255 with SMTP id 00721157ae682-719b41e6aeamr42227787b3.3.1753282078060; Wed, 23 Jul 2025 07:47:58 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:57 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v2 31/32] libluo: introduce luoctl Date: Wed, 23 Jul 2025 14:46:44 +0000 Message-ID: <20250723144649.1696299-32-pasha.tatashin@soleen.com> X-Mailer: 
git-send-email 2.50.0.727.gbf7dc18ff4-goog
In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
References: <20250723144649.1696299-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Pratyush Yadav

luoctl is a utility to interact with the LUO state machine. It currently
supports viewing and changing the current state of LUO. Scripts, tools, or
developers can use it to control the LUO state during the live update
process.

Example usage:

    $ luoctl state
    normal
    $ luoctl prepare
    $ luoctl state
    prepared
    $ luoctl cancel
    $ luoctl state
    normal

Signed-off-by: Pratyush Yadav
Signed-off-by: Pasha Tatashin
---
 tools/lib/luo/Makefile       |   6 +-
 tools/lib/luo/cli/.gitignore |   1 +
 tools/lib/luo/cli/Makefile   |  18 ++++
 tools/lib/luo/cli/luoctl.c   | 178 +++++++++++++++++++++++++++++++++++
 4 files changed, 202 insertions(+), 1 deletion(-)
 create mode 100644 tools/lib/luo/cli/.gitignore
 create mode 100644 tools/lib/luo/cli/Makefile
 create mode 100644 tools/lib/luo/cli/luoctl.c

diff --git a/tools/lib/luo/Makefile b/tools/lib/luo/Makefile
index e851c37d3d0a..e8f6bd3b9e85 100644
--- a/tools/lib/luo/Makefile
+++ b/tools/lib/luo/Makefile
@@ -13,7 +13,7 @@ LIB_NAME = libluo
 STATIC_LIB = $(LIB_NAME).a
 SHARED_LIB = $(LIB_NAME).so
 
-.PHONY: all clean install
+.PHONY: all clean install cli
 
 all: $(STATIC_LIB) $(SHARED_LIB)
 
@@ -26,8 +26,12 @@ $(SHARED_LIB): $(OBJS)
 %.o: %.c $(HEADERS)
 	$(CC) $(CFLAGS) -c $< -o $@
 
+cli: $(STATIC_LIB)
+	$(MAKE) -C cli
+
 clean:
 	rm -f $(OBJS) $(STATIC_LIB) $(SHARED_LIB)
+	$(MAKE) -C cli clean
 
 install: all
 	install -d $(DESTDIR)/usr/local/lib
diff --git a/tools/lib/luo/cli/.gitignore b/tools/lib/luo/cli/.gitignore
new file mode 100644
index 000000000000..3a5e2d287f60
--- /dev/null
+++ b/tools/lib/luo/cli/.gitignore
@@ -0,0 +1 @@
+/luoctl
diff
--git a/tools/lib/luo/cli/Makefile b/tools/lib/luo/cli/Makefile new file mode 100644 index 000000000000..6c0cbf92a420 --- /dev/null +++ b/tools/lib/luo/cli/Makefile @@ -0,0 +1,18 @@ +# SPDX-License-Identifier: LGPL-3.0-or-later +LUOCTL =3D luoctl +INCLUDE_DIR =3D ../include +HEADERS =3D $(wildcard $(INCLUDE_DIR)/*.h) + +CC =3D gcc +CFLAGS =3D -Wall -Wextra -O2 -g -I$(INCLUDE_DIR) +LDFLAGS =3D -L.. -l:libluo.a + +.PHONY: all clean + +all: $(LUOCTL) + +luoctl: luoctl.c ../libluo.a $(HEADERS) + $(CC) $(CFLAGS) -o $@ $< $(LDFLAGS) + +clean: + rm -f $(LUOCTL) diff --git a/tools/lib/luo/cli/luoctl.c b/tools/lib/luo/cli/luoctl.c new file mode 100644 index 000000000000..39ba0bdd44f0 --- /dev/null +++ b/tools/lib/luo/cli/luoctl.c @@ -0,0 +1,178 @@ +// SPDX-License-Identifier: LGPL-3.0-or-later +/** + * @file luoctl.c + * @brief Simple utility to interact with LUO + * + * This utility allows viewing and controlling LUO state. + * + * Copyright (C) 2025 Amazon.com Inc. or its affiliates. + * Author: Pratyush Yadav + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +#define fatal(fmt, ...) 
\ + do { \ + fprintf(stderr, "Error: " fmt, ##__VA_ARGS__); \ + exit(1); \ + } while (0) + +struct command { + char *name; + int (*handler)(void); +}; + +static void usage(const char *prog_name) +{ + printf("Usage: %s [command]\n\n", prog_name); + printf("Commands:\n"); + printf(" state - Show current LUO state\n"); + printf(" prepare - Prepare for live update\n"); + printf(" cancel - Cancel live update preparation\n"); + printf(" finish - Signal completion of restoration\n"); +} + +static enum liveupdate_state get_state(void) +{ + enum liveupdate_state state; + int ret; + + ret =3D luo_get_state(&state); + if (ret) + fatal("failed to get LUO state: %s\n", strerror(-ret)); + + return state; +} + +static int show_state(void) +{ + enum liveupdate_state state; + + state =3D get_state(); + printf("%s\n", luo_state_to_string(state)); + return 0; +} + +static int do_prepare(void) +{ + enum liveupdate_state state; + int ret; + + state =3D get_state(); + if (state !=3D LIVEUPDATE_STATE_NORMAL) + fatal("can only switch to prepared state from normal state. Current stat= e: %s\n", + luo_state_to_string(state)); + + ret =3D luo_prepare(); + if (ret) + fatal("failed to prepare for live update: %s\n", strerror(-ret)); + + return 0; +} + +static int do_cancel(void) +{ + enum liveupdate_state state; + int ret; + + state =3D get_state(); + if (state !=3D LIVEUPDATE_STATE_PREPARED) + fatal("can only cancel from prepared state. Current state: %s\n", + luo_state_to_string(state)); + + ret =3D luo_cancel(); + if (ret) + fatal("failed to cancel live update: %s\n", strerror(-ret)); + + return 0; +} + +static int do_finish(void) +{ + enum liveupdate_state state; + int ret; + + state =3D get_state(); + if (state !=3D LIVEUPDATE_STATE_UPDATED) + fatal("can only finish from updated state. 
Current state: %s\n", + luo_state_to_string(state)); + + ret =3D luo_finish(); + if (ret) + fatal("failed to finish live update: %s\n", strerror(-ret)); + + return 0; +} + +static struct command commands[] =3D { + {"state", show_state}, + {"prepare", do_prepare}, + {"cancel", do_cancel}, + {"finish", do_finish}, + {NULL, NULL}, +}; + +int main(int argc, char *argv[]) +{ + struct option long_options[] =3D { + {"help", no_argument, 0, 'h'}, + {0, 0, 0, 0} + }; + struct command *command; + int ret =3D -EINVAL, opt; + char *cmd; + + if (!luo_is_available()) { + fprintf(stderr, "LUO is not available on this system\n"); + return 1; + } + + while ((opt =3D getopt_long(argc, argv, "h", long_options, NULL)) != =3D -1) { + switch (opt) { + case 'h': + usage(argv[0]); + return 0; + default: + fprintf(stderr, "Try '%s --help' for more information.\n", argv[0]); + return 1; + } + } + + if (argc - optind !=3D 1) { + usage(argv[0]); + return 1; + } + + cmd =3D argv[optind]; + + ret =3D luo_init(); + if (ret < 0) { + fprintf(stderr, "Failed to initialize LibLUO: %s\n", strerror(-ret)); + return 1; + } + + command =3D &commands[0]; + while (command->name) { + if (!strcmp(cmd, command->name)) { + ret =3D command->handler(); + break; + } + command++; + } + + if (!command->name) { + fprintf(stderr, "Unknown command %s. Try '%s --help' for more informatio= n\n", + cmd, argv[0]); + ret =3D -EINVAL; + } + + luo_cleanup(); + return (ret < 0) ? 
1 : 0; +} --=20 2.50.0.727.gbf7dc18ff4-goog From nobody Mon Oct 6 06:43:24 2025 Received: from mail-yb1-f182.google.com (mail-yb1-f182.google.com [209.85.219.182]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6E938303DF9 for ; Wed, 23 Jul 2025 14:48:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.182 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282089; cv=none; b=L6nq8YjVf5M+n0cwSufUVhEPHK4mSpDFqZLjGpBntIODzizqe13+Cb+8Ei7ig5tsi3KGeHeCB/MAYxsyTR6g/dxdVdnkXDUMpQkK+Ak3dF8kcYG+nX1gurRyvDgw6Dbw7+rQ8Ar2+4nefPcLq34k9ZjcrawegL6iw8FUaEm3wqo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1753282089; c=relaxed/simple; bh=8l0K6QJ4RkWllZcpyZGwgZlU9m1P2gnk5TToNIUeT/Q=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=q8G5XE9ZYwTNdzliDRVq04GxizoxPKC1Z6xYIxxfWcticQtw9rdvQpl9X2jz/9PMtss9Z7eGhx9FsEpizU8zWEHomKvRQ5nt/ew67TrYnr1sOdxwSPzE73qhr9dokBrHnlPIGW0y/ju868XEUdrgsx0eHHkSxBf+vfrocmKmMR4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b=lk6oHfL3; arc=none smtp.client-ip=209.85.219.182 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen-com.20230601.gappssmtp.com header.i=@soleen-com.20230601.gappssmtp.com header.b="lk6oHfL3" Received: by mail-yb1-f182.google.com with SMTP id 3f1490d57ef6-e8d7ad77e4cso4898809276.1 for ; Wed, 23 Jul 2025 07:48:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=soleen-com.20230601.gappssmtp.com; s=20230601; t=1753282080; x=1753886880; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=60o0+pILn7oYpaQU4GzfpCaIJI/wVLOztDCQbTE/8Fw=; b=lk6oHfL33DRygEmycTqTSPdwXuwH5hy32r3J/F9TQA4/3S2s9v9ceb3ykwDH++yvbv 2TVA0atGroGi3hMZgUdmG4tIa2R871AkUfPtVogcmXzdrKzvE/KWbslIkFJ3HFwM29Y+ O+k9GMtmqMTKp+R4fTPHCBBIPjwPZTgbReCHHU90wobbPo1y34nZHAGZxwbfpkbcj1Nd RwC/9FLHW7PLtnVT1GGh65D+K5kjemtItUntFClQxU6Cviyv/VkUjkF2VIkIzFv9NSb6 zZ+oC4HN7qvvIb2aki9AMGAUx5oDugz/tOQKS3TmdsBKPnG2lfkQXX8R/X4tmFUYi+mW OQew== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1753282080; x=1753886880; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=60o0+pILn7oYpaQU4GzfpCaIJI/wVLOztDCQbTE/8Fw=; b=pjZds1IiNQDr8dcW6k1Xa4alRN6wSsDqX86rgR2vIuYWlBhftjN7Z1WMki7vcy5Ls1 gcWuwt9cbXA5PoNzC3ujFk8ArNmpWrdIRGpYHRUivoKwqMz84iqv9Oi3Tg531IjXqUqP NWJOjUGsp1R4ncDgYLz/1WYS26HNH5ONebYutU0S/oekMt9z4B7SuDgQPWubNjpp/YW1 FVO0aEU5tZ90patrrwtv2GsFkMZODVLdBPv95fGzeAMz6MS11+Ceo1L3vPPPVESd7oie p3Cskv2BLrgZGc1jRlKdULUf3nw5xrBF71haybDjz/xay3au3qZoQpPASKwkx3jt0Bzk ad8A== X-Forwarded-Encrypted: i=1; AJvYcCV2prF2boZIFciS1YoNb+zw2fGAccRUCtf4353n87x58jOcTok7e3hMzRVZpGVFg2m+QP9PBhJDnN/dJ50=@vger.kernel.org X-Gm-Message-State: AOJu0Yz/3EXSfDwgahReQA5Wfs7fCQVpgVAnpL2JEoWCefwYQxHhSwZ6 XS0+73yfEElzolyKaUyUII6yQH8BSOVS551xYKmIBNnwH1xH6oxF684kx0cJ8AdO8HI= X-Gm-Gg: ASbGncvTHCEnkZJOKCdSq1rZBRabVGsUej6L4JwlFvNWzd8/JIB/19vbkZCpiB9HOqO B2XB2w9Q2HRJFfH33f+qBUucTWgB3aaxG69RilfZPJK0gVxqTWWuAgH4SNPl/czqcod/aYLJJSb jcqYj8n8cp+4mxKq9Gy9OIXAeB50JwIvg7V+b4gg4WUZf2vq18iBSLe5XDuEtD5DGUppw3HtNxZ OKYG3bv92++TsqBwMbktS4hKBCg7Ll1eO5UgchyEt56W02ttL4dL1spR8jsCQxgBKa1t6cCYXJF 
ID4PbKJoIlH1+320K7nPNh3o+8BhvG88DU2iKlfVzC27AHuYTG8VBHRsIEfT3HBqDbeK+1KY6rJ eiP8ITkaBpuWkjJFl0Navx14FCvuLaPsAZco4uyWCNl/W7uG0n7ITgYVkKvFerGKOLe68rNEhja m4Rx7mgurPQvRPRQ== X-Google-Smtp-Source: AGHT+IEHR5YhXNXKqQu5cYnYQfYX3NmO3Uw1youtd2STD6HMFKwDB0cE2+RVBh3uuo0TCOuCIAe/Cw== X-Received: by 2002:a05:690c:3391:b0:717:b35b:94d1 with SMTP id 00721157ae682-719b42089f0mr42556337b3.9.1753282080302; Wed, 23 Jul 2025 07:48:00 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. [34.85.247.235]) by smtp.gmail.com with ESMTPSA id 00721157ae682-719532c7e4fsm30482117b3.72.2025.07.23.07.47.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Jul 2025 07:47:59 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, 
leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v2 32/32] libluo: add tests Date: Wed, 23 Jul 2025 14:46:45 +0000 Message-ID: <20250723144649.1696299-33-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.0.727.gbf7dc18ff4-goog In-Reply-To: <20250723144649.1696299-1-pasha.tatashin@soleen.com> References: <20250723144649.1696299-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Pratyush Yadav Add a test suite for libluo itself, and for the kernel LUO interface. The following tests are added: 1. init - Tests the initialization and cleanup functions of libluo. 2. state - Tests the luo_get_state() API, which in turn tests the LIVEUPDATE_IOCTL_GET_STATE ioctl. 3. preserve - Creates a memfd, preserves it, puts LUO in prepared state, cancels liveupdate, and makes sure memfd is functional. 4. prepared - Puts a memfd in LUO and enters the prepared state. Then it makes sure the memfd stays functional but remains in restricted mode. It makes sure the memfd can't grow or shrink, but can be read from or written to. 5. transitions - Tests that the transitions from normal to prepared and back to normal (via cancel) work. 6. error - Tests error handling of the library on invalid inputs. 7. kexec - Tests the main functionality of LUO -- preserving a FD over kexec. It creates a memfd with random data, saves the data to a file on disk, and then preserves the FD and goes into prepared state. Now the test runner must perform a kexec. Once rebooted, running the test again resumes the test. 
It fetches the memfd back, and compares its content with the saved data on disk. A specific test can be selected or excluded using the -t or -e arguments. Sample run: $ ./test LibLUO Test Suite =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D Testing initialization and cleanup... PASSED Testing get_state... PASSED (current state: normal) Testing state transitions... PASSED Testing fd_preserve with freeze and cancel... PASSED Testing operations on prepared memfd... PASSED Testing error handling... PASSED Testing fd preserve for kexec... READY FOR KEXEC (token: 3) Run kexec now and then run this test again to complete. All requested tests completed. Signed-off-by: Pratyush Yadav Signed-off-by: Pasha Tatashin --- tools/lib/luo/Makefile | 4 + tools/lib/luo/tests/.gitignore | 1 + tools/lib/luo/tests/Makefile | 18 + tools/lib/luo/tests/test.c | 848 +++++++++++++++++++++++++++++++++ 4 files changed, 871 insertions(+) create mode 100644 tools/lib/luo/tests/.gitignore create mode 100644 tools/lib/luo/tests/Makefile create mode 100644 tools/lib/luo/tests/test.c diff --git a/tools/lib/luo/Makefile b/tools/lib/luo/Makefile index e8f6bd3b9e85..ef4c489efcc5 100644 --- a/tools/lib/luo/Makefile +++ b/tools/lib/luo/Makefile @@ -29,9 +29,13 @@ $(SHARED_LIB): $(OBJS) cli: $(STATIC_LIB) $(MAKE) -C cli =20 +tests: $(STATIC_LIB) + $(MAKE) -C tests + clean: rm -f $(OBJS) $(STATIC_LIB) $(SHARED_LIB) $(MAKE) -C cli clean + $(MAKE) -C tests clean =20 install: all install -d $(DESTDIR)/usr/local/lib diff --git a/tools/lib/luo/tests/.gitignore b/tools/lib/luo/tests/.gitignore new file mode 100644 index 000000000000..ee4c92682341 --- /dev/null +++ b/tools/lib/luo/tests/.gitignore @@ -0,0 +1 @@ +/test diff --git a/tools/lib/luo/tests/Makefile b/tools/lib/luo/tests/Makefile new file mode 100644 index 000000000000..7f4689722ff6 --- /dev/null +++ b/tools/lib/luo/tests/Makefile @@ -0,0 +1,18 @@ +# SPDX-License-Identifier: LGPL-3.0-or-later +TESTS =3D test +INCLUDE_DIR =3D ../include +HEADERS =3D
$(wildcard $(INCLUDE_DIR)/*.h) + +CC =3D gcc +CFLAGS =3D -Wall -Wextra -O2 -g -I$(INCLUDE_DIR) +LDFLAGS =3D -L.. -l:libluo.a + +.PHONY: all clean + +all: $(TESTS) + +test: test.c ../libluo.a $(HEADERS) + $(CC) $(CFLAGS) -o $@ $< $(LDFLAGS) + +clean: + rm -f $(TESTS) diff --git a/tools/lib/luo/tests/test.c b/tools/lib/luo/tests/test.c new file mode 100644 index 000000000000..7963ae8ebadf --- /dev/null +++ b/tools/lib/luo/tests/test.c @@ -0,0 +1,848 @@ +// SPDX-License-Identifier: LGPL-3.0-or-later +#define _GNU_SOURCE +/** + * @file test.c + * @brief Test program for the LibLUO library + * + * This program tests the basic functionality of the LibLUO library. + * + * Copyright (C) 2025 Amazon.com Inc. or its affiliates. + * Author: Pratyush Yadav + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* Path to store token for kexec test */ +#define TOKEN_FILE "libluo_test_token" +#define TEST_DATA_FILE "libluo_test_data" +#define MEMFD_NAME "libluo_test_memfd" + +/* Size of the random data buffer (1 MiB) */ +#define RANDOM_BUFFER_SIZE (1 << 20) +static char random_buffer[RANDOM_BUFFER_SIZE]; + +/* Test IDs */ +#define TEST_INIT_CLEANUP (1 << 0) +#define TEST_GET_STATE (1 << 1) +#define TEST_FD_PRESERVE (1 << 2) +#define TEST_ERROR_HANDLING (1 << 3) +#define TEST_FD_KEXEC (1 << 4) +#define TEST_FD_PREPARED (1 << 5) +#define TEST_STATE_TRANSITIONS (1 << 6) +#define TEST_ALL (TEST_INIT_CLEANUP | TEST_GET_STATE | \ + TEST_FD_PRESERVE | TEST_ERROR_HANDLING | \ + TEST_FD_KEXEC | TEST_FD_PREPARED | \ + TEST_STATE_TRANSITIONS) + +/* + * luo_fd_preserve() needs a unique token. Generate a monotonically increa= sing + * token. + */ +static uint64_t next_token() +{ + static uint64_t token =3D 0; + + return token++; +} + +/* Read exactly specified size from fd. Any less results in error. 
*/ +static int read_size(int fd, char *buffer, size_t size) +{ + size_t remain =3D size; + ssize_t bytes_read; + + while (remain) { + bytes_read =3D read(fd, buffer + (size - remain), remain); + if (bytes_read =3D=3D 0) + return -ENODATA; + if (bytes_read < 0) + return -errno; + + remain -=3D bytes_read; + } + + return 0; +} + +/* Write exactly specified size to fd. Any less results in error. */ +static int write_size(int fd, const char *buffer, size_t size) +{ + size_t remain =3D size; + ssize_t written; + + while (remain) { + written =3D write(fd, buffer + (size - remain), remain); + if (written =3D=3D 0) + return -EIO; + if (written < 0) + return -errno; + + remain -=3D written; + } + + return 0; +} + +static int generate_random_data(char *buffer, size_t size) +{ + int fd, ret; + + fd =3D open("/dev/urandom", O_RDONLY); + if (fd < 0) + return -errno; + + ret =3D read_size(fd, buffer, size); + close(fd); + return ret; +} + +static int save_test_data(const char *buffer, size_t size) +{ + int fd, ret; + + fd =3D open(TEST_DATA_FILE, O_WRONLY | O_CREAT | O_TRUNC, 0644); + if (fd < 0) + return -errno; + + ret =3D write_size(fd, buffer, size); + close(fd); + return ret; +} + +static int load_test_data(char *buffer, size_t size) +{ + int fd, ret; + + fd =3D open(TEST_DATA_FILE, O_RDONLY); + if (fd < 0) + return -errno; + + ret =3D read_size(fd, buffer, size); + close(fd); + return ret; +} + +/* Create and initialize a memfd with random data. */ +static int create_test_fd(const char *memfd_name, char *buffer, size_t siz= e) +{ + int fd; + int ret; + + fd =3D memfd_create(memfd_name, 0); + if (fd < 0) + return -errno; + + ret =3D generate_random_data(buffer, size); + if (ret < 0) { + close(fd); + return ret; + } + + if (write_size(fd, buffer, size) < 0) { + close(fd); + return -errno; + } + + /* Reset file position to beginning */ + if (lseek(fd, 0, SEEK_SET) < 0) { + close(fd); + return -errno; + } + + return fd; +} + +/* + * Make sure fd contains expected data up to size. 
Returns 0 on success, 1= on + * data mismatch, -errno on error. + */ +static int verify_fd_content(int fd, const char *expected_data, size_t siz= e) +{ + char buffer[size]; + int ret; + + /* Reset file position to beginning */ + if (lseek(fd, 0, SEEK_SET) < 0) + return -errno; + + ret =3D read_size(fd, buffer, size); + if (ret < 0) + return ret; + + if (memcmp(buffer, expected_data, size) !=3D 0) + return 1; + + return 0; +} + +/* Save token to file for kexec test. */ +static int save_token(uint64_t token) +{ + FILE *file =3D fopen(TOKEN_FILE, "w"); + + if (!file) + return -errno; + + if (fprintf(file, "%lu", token) < 0) { + fclose(file); + return -errno; + } + + fclose(file); + return 0; +} + +/* Load token from file for kexec test. */ +static int load_token(uint64_t *token) +{ + FILE *file =3D fopen(TOKEN_FILE, "r"); + + if (!file) + return -errno; + + if (fscanf(file, "%lu", token) !=3D 1) { + fclose(file); + return -EINVAL; + } + + fclose(file); + return 0; +} + +/* Test initialization and cleanup */ +static void test_init_cleanup(void) +{ + int ret; + + printf("Testing initialization and cleanup... "); + + ret =3D luo_init(); + if (ret < 0) { + printf("FAILED (init: %s)\n", strerror(-ret)); + return; + } + + luo_cleanup(); + printf("PASSED\n"); +} + +/* Test getting LUO state */ +static void test_get_state(void) +{ + int ret; + enum liveupdate_state state; + + printf("Testing get_state... 
"); + + ret =3D luo_init(); + if (ret < 0) { + printf("FAILED (init: %s)\n", strerror(-ret)); + return; + } + + ret =3D luo_get_state(&state); + if (ret < 0) { + printf("FAILED (get_state: %s)\n", strerror(-ret)); + luo_cleanup(); + return; + } + + printf("PASSED (current state: %s)\n", luo_state_to_string(state)); + luo_cleanup(); +} + +/* Test preserving and unpreserving a file descriptor with prepare and can= cel */ +static void test_fd_preserve_unpreserve(void) +{ + uint64_t token =3D next_token(); + int ret, fd =3D -1; + + printf("Testing fd_preserve with freeze and cancel... "); + + ret =3D luo_init(); + if (ret < 0) { + printf("FAILED (init: %s)\n", strerror(-ret)); + return; + } + + fd =3D create_test_fd(MEMFD_NAME, random_buffer, sizeof(random_buffer)); + if (fd < 0) { + ret =3D fd; + printf("FAILED (create_test_fd: %s)\n", strerror(-ret)); + goto out_cleanup; + } + + ret =3D luo_fd_preserve(fd, token); + if (ret < 0) { + printf("FAILED (preserve: %s)\n", strerror(-ret)); + goto out_close_fd; + } + + ret =3D luo_prepare(); + if (ret < 0) { + printf("FAILED (prepare: %s)\n", strerror(-ret)); + goto out_unpreserve; + } + + ret =3D luo_cancel(); + if (ret < 0) { + printf("FAILED (cancel: %s)\n", strerror(-ret)); + goto out_unpreserve; + } + + ret =3D luo_fd_unpreserve(token); + if (ret < 0) { + printf("FAILED (unpreserve: %s)\n", strerror(-ret)); + goto out_close_fd; + } + + ret =3D verify_fd_content(fd, random_buffer, sizeof(random_buffer)); + if (ret < 0) { + printf("FAILED (verify_fd_content: %s)\n", + ret =3D=3D 1 ? "data mismatch" : strerror(-ret)); + goto out_close_fd; + } + + printf("PASSED\n"); + goto out_close_fd; + +out_unpreserve: + luo_fd_unpreserve(token); +out_close_fd: + close(fd); +out_cleanup: + luo_cleanup(); +} + +/* Test error handling with invalid inputs. */ +static void test_error_handling(void) +{ + int ret; + + printf("Testing error handling... 
"); + + ret =3D luo_init(); + if (ret < 0) { + printf("FAILED (init: %s)\n", strerror(-ret)); + return; + } + + /* Test with invalid file descriptor */ + ret =3D luo_fd_preserve(-1, next_token()); + if (ret !=3D -EINVAL) { + printf("FAILED (expected EINVAL for invalid fd, got %d)\n", ret); + luo_cleanup(); + return; + } + + /* Test with NULL state pointer */ + ret =3D luo_get_state(NULL); + if (ret !=3D -EINVAL) { + printf("FAILED (expected EINVAL for NULL state, got %d)\n", ret); + luo_cleanup(); + return; + } + + luo_cleanup(); + printf("PASSED\n"); +} + +/* Test preserving a file descriptor for kexec reboot */ +static void test_fd_preserve_for_kexec(void) +{ + enum liveupdate_state state; + int fd =3D -1, ret; + uint64_t token; + + ret =3D luo_init(); + if (ret < 0) { + printf("FAILED (init: %s)\n", strerror(-ret)); + return; + } + + /* Check if we're in post-kexec state */ + ret =3D luo_get_state(&state); + if (ret < 0) { + printf("FAILED (get_state: %s)\n", strerror(-ret)); + goto out_cleanup; + } + + if (state =3D=3D LIVEUPDATE_STATE_UPDATED) { + /* Post-kexec: restore the file descriptor */ + printf("Testing memfd restore after kexec... "); + + ret =3D load_token(&token); + if (ret < 0) { + printf("FAILED (load_token: %s)\n", strerror(-ret)); + goto out_cleanup; + } + + ret =3D load_test_data(random_buffer, RANDOM_BUFFER_SIZE); + if (ret < 0) { + printf("FAILED (load_test_data: %s)\n", strerror(-ret)); + goto out_cleanup; + } + + ret =3D luo_fd_restore(token, &fd); + if (ret < 0) { + printf("FAILED (restore: %s)\n", strerror(-ret)); + goto out_cleanup; + } + + /* Verify the file descriptor content with stored data. */ + ret =3D verify_fd_content(fd, random_buffer, RANDOM_BUFFER_SIZE); + if (ret) { + printf("FAILED (verify_fd_content: %s)\n", + ret =3D=3D 1 ? 
"data mismatch" : strerror(-ret)); + goto out_close_fd; + } + + ret =3D luo_finish(); + if (ret < 0) { + printf("FAILED (finish: %s)\n", strerror(-ret)); + goto out_close_fd; + } + + printf("PASSED\n"); + goto out_close_fd; + } else { + /* Pre-kexec: preserve the file descriptor */ + printf("Testing fd preserve for kexec... "); + + fd =3D create_test_fd(MEMFD_NAME, random_buffer, RANDOM_BUFFER_SIZE); + if (fd < 0) { + ret =3D fd; + printf("FAILED (create_test_fd: %s)\n", strerror(-ret)); + goto out_cleanup; + } + + /* Save random data to file for post-kexec verification */ + ret =3D save_test_data(random_buffer, RANDOM_BUFFER_SIZE); + if (ret < 0) { + printf("FAILED (save_test_data: %s)\n", strerror(-ret)); + goto out_close_fd; + } + + token =3D next_token(); + ret =3D luo_fd_preserve(fd, token); + if (ret < 0) { + printf("FAILED (preserve: %s)\n", strerror(-ret)); + goto out_close_fd; + } + + /* Save token to file for post-kexec restoration */ + ret =3D save_token(token); + if (ret < 0) { + printf("FAILED (save_token: %s)\n", strerror(-ret)); + goto out_unpreserve; + } + + ret =3D luo_prepare(); + if (ret < 0) { + printf("FAILED (prepare: %s)\n", strerror(-ret)); + goto out_unpreserve; + } + + printf("READY FOR KEXEC (token: %lu)\n", token); + printf("Run kexec now and then run this test again to complete.\n"); + + /* Note: At this point, the system should perform kexec reboot. + * The test will continue in the new kernel with the + * LIVEUPDATE_STATE_UPDATED state. + * + * Since the FD is now preserved, we can close it. + */ + goto out_close_fd; + } + +out_unpreserve: + luo_fd_unpreserve(token); +out_close_fd: + close(fd); +out_cleanup: + luo_cleanup(); +} + +/* + * Test that prepared memfd can't grow or shrink, but reads and writes sti= ll + * work. + */ +static void test_fd_prepared_operations(void) +{ + char write_buffer[128] =3D {'A'}; + size_t initial_size, file_size; + int ret, fd =3D -1; + uint64_t token; + + printf("Testing operations on prepared memfd... 
"); + + ret =3D luo_init(); + if (ret < 0) { + printf("FAILED (init: %s)\n", strerror(-ret)); + return; + } + + /* Create and initialize test file descriptor */ + fd =3D create_test_fd(MEMFD_NAME, random_buffer, sizeof(random_buffer)); + if (fd < 0) { + ret =3D fd; + printf("FAILED (create_test_fd: %s)\n", strerror(-ret)); + goto out_cleanup; + } + + /* Get initial file size */ + ret =3D lseek(fd, 0, SEEK_END); + if (ret < 0) { + printf("FAILED (lseek to end: %s)\n", strerror(errno)); + goto out_close_fd; + } + initial_size =3D (size_t)ret; + + token =3D next_token(); + ret =3D luo_fd_preserve(fd, token); + if (ret < 0) { + printf("FAILED (preserve: %s)\n", strerror(-ret)); + goto out_close_fd; + } + + ret =3D luo_prepare(); + if (ret < 0) { + printf("FAILED (prepare: %s)\n", strerror(-ret)); + goto out_unpreserve; + } + + /* Test 1: Write to the prepared file descriptor (within existing size) */ + if (lseek(fd, 0, SEEK_SET) < 0) { + printf("FAILED (lseek before write: %s)\n", strerror(errno)); + goto out_cancel; + } + + /* Write buffer is smaller than total file size. */ + ret =3D write_size(fd, write_buffer, sizeof(write_buffer)); + if (ret < 0) { + printf("FAILED (write to prepared fd: %s)\n", strerror(errno)); + goto out_cancel; + } + + ret =3D verify_fd_content(fd, write_buffer, sizeof(write_buffer)); + if (ret) { + printf("FAILED (verify_fd_content after write: %s)\n", + ret =3D=3D 1 ? "data mismatch" : strerror(-ret)); + goto out_cancel; + } + + /* Test 2: Try to grow the file using write(). */ + + /* First, seek to one byte behind initial size. */ + ret =3D lseek(fd, initial_size - 1, SEEK_SET); + if (ret < 0) { + printf("FAILED: (lseek after write verification: %s)\n", + strerror(errno)); + } + + /* + * Then, write some data that should increase the file size. This should + * fail. 
+ */ + ret =3D write_size(fd, write_buffer, sizeof(write_buffer)); + if (ret =3D=3D 0) { + printf("FAILED: (write beyond initial size succeeded)\n"); + goto out_cancel; + } + + ret =3D lseek(fd, 0, SEEK_END); + if (ret < 0) { + printf("FAILED (lseek after larger write: %s)\n", strerror(errno)); + goto out_cancel; + } + file_size =3D (size_t)ret; + + if (file_size !=3D initial_size) { + printf("FAILED (file grew beyond initial size: %zu !=3D %zu)\n", + (size_t)file_size, initial_size); + goto out_cancel; + } + + /* Test 3: Try to shrink the file using truncate */ + ret =3D ftruncate(fd, initial_size / 2); + if (ret =3D=3D 0) { + printf("FAILED (file was truncated)\n"); + goto out_cancel; + } + + ret =3D lseek(fd, 0, SEEK_END); + if (ret < 0) { + printf("FAILED (lseek after shrink attempt: %s)\n", strerror(errno)); + goto out_cancel; + } + file_size =3D (size_t)ret; + + if (file_size !=3D initial_size) { + printf("FAILED (file shrunk from initial size: %zu !=3D %zu)\n", + (size_t)file_size, initial_size); + goto out_cancel; + } + + ret =3D luo_cancel(); + if (ret < 0) { + printf("FAILED (cancel: %s)\n", strerror(-ret)); + goto out_unpreserve; + } + + ret =3D luo_fd_unpreserve(token); + if (ret < 0) { + printf("FAILED (unpreserve: %s)\n", strerror(-ret)); + goto out_close_fd; + } + + printf("PASSED\n"); + goto out_close_fd; + +out_cancel: + luo_cancel(); +out_unpreserve: + luo_fd_unpreserve(token); +out_close_fd: + close(fd); +out_cleanup: + luo_cleanup(); +} + +static int test_prepare_cancel_sequence(const char *sequence_name) +{ + int ret; + enum liveupdate_state state; + + /* Initial state should be NORMAL */ + ret =3D luo_get_state(&state); + if (ret < 0) { + printf("FAILED (%s get initial state failed: %s)\n", + sequence_name, strerror(-ret)); + return ret; + } + + if (state !=3D LIVEUPDATE_STATE_NORMAL) { + printf("FAILED (%s unexpected initial state: %s)\n", + sequence_name, luo_state_to_string(state)); + return -EINVAL; + } + + /* Test NORMAL -> PREPARED 
transition */ + ret =3D luo_prepare(); + if (ret < 0) { + printf("FAILED (%s prepare failed: %s)\n", + sequence_name, strerror(-ret)); + return ret; + } + + ret =3D luo_get_state(&state); + if (ret < 0) { + printf("FAILED (%s get state after prepare failed: %s)\n", + sequence_name, strerror(-ret)); + goto out_cancel; + } + + if (state !=3D LIVEUPDATE_STATE_PREPARED) { + printf("FAILED (%s expected PREPARED state, got %s)\n", + sequence_name, luo_state_to_string(state)); + ret =3D -EINVAL; + goto out_cancel; + } + + /* Test PREPARED -> NORMAL transition via cancel */ + ret =3D luo_cancel(); + if (ret < 0) { + printf("FAILED (%s cancel failed: %s)\n", + sequence_name, strerror(-ret)); + return ret; + } + + ret =3D luo_get_state(&state); + if (ret < 0) { + printf("FAILED (%s get state after cancel failed: %s)\n", + sequence_name, strerror(-ret)); + return ret; + } + + if (state !=3D LIVEUPDATE_STATE_NORMAL) { + printf("FAILED (%s expected NORMAL state after cancel, got %s)\n", + sequence_name, luo_state_to_string(state)); + return -EINVAL; + } + + return 0; + +out_cancel: + luo_cancel(); + return ret; +} + +/* Test all state transitions */ +static void test_state_transitions(void) +{ + int ret; + + printf("Testing state transitions... "); + + ret =3D luo_init(); + if (ret < 0) { + printf("FAILED (init failed: %s)\n", strerror(-ret)); + return; + } + + /* Test first prepare -> cancel sequence */ + ret =3D test_prepare_cancel_sequence("first"); + if (ret < 0) + goto out; + + /* + * Test second prepare -> freeze -> cancel sequence in case the + * previous cancellation left some side effects. 
+ */ + ret =3D test_prepare_cancel_sequence("second"); + if (ret < 0) + goto out; + + printf("PASSED\n"); + +out: + luo_cleanup(); +} + +/* Test name to flag mapping */ +struct test { + const char *name; + void (*fn)(void); + unsigned int flag; +}; + +/* Array of test names and their corresponding flags */ +static struct test tests[] =3D { + {"init", test_init_cleanup, TEST_INIT_CLEANUP}, + {"state", test_get_state, TEST_GET_STATE}, + {"transitions", test_state_transitions, TEST_STATE_TRANSITIONS}, + {"preserve", test_fd_preserve_unpreserve, TEST_FD_PRESERVE}, + {"prepared", test_fd_prepared_operations, TEST_FD_PREPARED}, + {"error", test_error_handling, TEST_ERROR_HANDLING}, + {"kexec", test_fd_preserve_for_kexec, TEST_FD_KEXEC}, + {NULL, NULL, 0} +}; + +static int parse_test_names(char *arg, unsigned int *flags) +{ + char *name; + struct test *test; + + *flags =3D 0; + name =3D strtok(arg, ","); + + while (name !=3D NULL) { + test =3D tests; + while (test->name) { + if (strcmp(name, test->name) =3D=3D 0) { + *flags |=3D test->flag; + break; + } + test++; + } + + /* Check if we found a match */ + if (!test->name) { + printf("Unknown test: %s\n", name); + return 1; + } + + name =3D strtok(NULL, ","); + } + + return 0; +} + +static void usage(const char *program_name) +{ + printf("Usage: %s [options]\n", program_name); + printf("Options:\n"); + printf(" -h, --help Show this help message\n"); + printf(" -t, --test=3DTEST_ID Run specific test(s)\n"); + printf(" -e, --exclude=3DTEST_ID Exclude specific test(s)\n"); + printf("\n"); + printf("Test IDs:\n"); + printf(" init - Test initialization and cleanup\n"); + printf(" state - Test getting LUO state\n"); + printf(" preserve - Test memfd preserve/unpreserve with prepare/cancel= \n"); + printf(" prepared - Test memfd functions can read/write but not grow = after prepare\n"); + printf(" transitions - Test all state transitions (NORMAL->PREPARED->NORMAL)\n"); + printf(" error - Test error handling\n"); + printf(" 
kexec - Test memfd preserve for kexec\n"); + printf("\n"); + printf("Multiple tests can be specified with comma separation.\n"); + printf("Example: %s --test=3Dinit,state --exclude=3Dkexec\n", program_nam= e); + printf("By default, all tests are run.\n"); +} + +int main(int argc, char *argv[]) +{ + unsigned int tests_to_run =3D TEST_ALL; + unsigned int tests_to_exclude =3D 0; + struct option long_options[] =3D { + {"help", no_argument, 0, 'h'}, + {"test", required_argument, 0, 't'}, + {"exclude", required_argument, 0, 'e'}, + {0, 0, 0, 0} + }; + struct test *test; + int opt; + + printf("LibLUO Test Suite\n"); + printf("=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D\n\n"); + + if (!luo_is_available()) { + printf("LUO is not available on this system. Skipping tests.\n"); + return 0; + } + + while ((opt =3D getopt_long(argc, argv, "ht:e:", long_options, NULL)) != =3D -1) { + switch (opt) { + case 'h': + usage(argv[0]); + return 0; + case 't': + if (parse_test_names(optarg, &tests_to_run)) + return 1; + break; + case 'e': + if (parse_test_names(optarg, &tests_to_exclude)) + return 1; + break; + default: + printf("Try '%s --help' for more information.\n", argv[0]); + return 1; + } + } + + /* Apply exclusions to the tests to run */ + tests_to_run &=3D ~tests_to_exclude; + if (!tests_to_run) { + printf("ERROR: all tests excluded\n"); + return 1; + } + + /* Run selected tests */ + test =3D tests; + while (test->name) { + if (tests_to_run & test->flag) + test->fn(); + test++; + } + + printf("\nAll requested tests completed.\n"); + return 0; +} --=20 2.50.0.727.gbf7dc18ff4-goog
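The read_size() and write_size() helpers in test.c loop until the requested number of bytes has been transferred. A minimal sketch of that pattern follows, offered as an illustration only and not as part of the patch. The names read_full(), write_full() and roundtrip_demo() are hypothetical; the sketch assumes that a short transfer should continue from a running offset into the buffer and that EINTR should be retried, and it uses a pipe in place of the memfd so that it needs nothing from libluo.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): read exactly `size` bytes.
 * Returns 0 on success, -ENODATA on early EOF, -errno on error.
 */
static int read_full(int fd, char *buffer, size_t size)
{
	size_t done = 0;

	while (done < size) {
		/* Continue after the bytes already received. */
		ssize_t n = read(fd, buffer + done, size - done);

		if (n == 0)
			return -ENODATA;
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -errno;
		}
		done += (size_t)n;
	}

	return 0;
}

/*
 * Hypothetical helper (not from the patch): write exactly `size` bytes.
 * Returns 0 on success, -EIO if no progress is made, -errno on error.
 */
static int write_full(int fd, const char *buffer, size_t size)
{
	size_t done = 0;

	while (done < size) {
		/* Continue after the bytes already sent. */
		ssize_t n = write(fd, buffer + done, size - done);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -errno;
		}
		if (n == 0)
			return -EIO;
		done += (size_t)n;
	}

	return 0;
}

/* Round-trip a short message through a pipe. Returns 0 on success. */
static int roundtrip_demo(void)
{
	const char msg[] = "libluo";
	char out[sizeof(msg)];
	int p[2], ret;

	if (pipe(p) < 0)
		return -errno;

	ret = write_full(p[1], msg, sizeof(msg));
	close(p[1]);
	if (!ret)
		ret = read_full(p[0], out, sizeof(out));
	if (!ret && memcmp(msg, out, sizeof(msg)) != 0)
		ret = -EIO;

	/* Writer is closed and the data is drained, so one more read hits EOF. */
	if (!ret)
		assert(read_full(p[0], out, 1) == -ENODATA);

	close(p[0]);
	return ret;
}
```

Calling roundtrip_demo() returns 0 when the bytes read back match the bytes written. Keeping a byte count that grows, rather than a remainder that shrinks, makes the buffer offset for the next call explicit.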