From nobody Sun Oct 5 07:24:30 2025 Received: from mail-qv1-f50.google.com (mail-qv1-f50.google.com [209.85.219.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C79191A23AC for ; Thu, 7 Aug 2025 01:44:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.50 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531092; cv=none; b=A6zYGAIpWMZRXa+IkUcnZ0TYbcoKnnhlT/Y9/UicuJsXEtJKicYXm5f3qI09Z3dEUOeLHTs7SgU+6CoDFT4FB8HS31ZfJv+34jdNe4mhkx/lbN6B2vom1hPzcKI8nOV0iwRG4ZGiU889uo3RmORLNeckFPxIiYAevnR53mXNsWc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531092; c=relaxed/simple; bh=j8U9dBX25POg3ylmKduTkjn9/PsdBpAzRpSmG4BnN0U=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=gKT2xs/kA5hNGE/LLmKyY2TDg0Dxu7Nwk3fW45NJCNR6h/+zoqG3Uu9mPK4gOwsh+zgl1kfdh/CS7pNKaOERrIOurGBrrV4Uj8isILax1wDWOnRWj530Kxq2bsRyjolqX8fb2xgjLQGipR8ijg9z1K3MTFA5o4Q91vt88pEuApg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=hcD3kctN; arc=none smtp.client-ip=209.85.219.50 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="hcD3kctN" Received: by mail-qv1-f50.google.com with SMTP id 6a1803df08f44-7077a1563b5so5301536d6.1 for ; Wed, 06 Aug 2025 18:44:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531089; x=1755135889; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=QL9c1v+BMlgTEn5uRe69e/JcEH0iWS47GlDWai1cxXk=; b=hcD3kctNjZc3kFV5k9f3vcT7REP/8ZW8l/0asf6FdEBVRdWccpGky9M4nf1Rl0hCfA +A/WqueVjZ6askxg8Gx6/6Gg51h0n20X7KILDPQulb2hc3VpRA8grBnutwJn3RDyYNkE hN/3h/KOliYnihzqrZCh1Acfi+opPSbl/ecGjRWYTOCVC2RJCrhvNxKIyb88RPQ81/gD RXPn0nh8DUuh/71T5urQR3wmmN3ci00EBK9FCABq31oSfCdlb4j2Rfo6S45kdTw4wCuu mBQ2klevl1FvzMiZDW1NXdL6i9OkAtjIPhk/w+bwXqevH0bjKPtr6XspSEDbPoKwd/dC T6jw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531089; x=1755135889; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=QL9c1v+BMlgTEn5uRe69e/JcEH0iWS47GlDWai1cxXk=; b=R1Brd6u1C1C4WXYCaevCxmLiXWhD86jCk8SWDwe3H91FfK9YYPAJX21oo2OJx3IRIc 0WOyAl9xg5zyMmn2gBqOPGIcdjjga52Trbp2ddTOJ3T9asm3WigO+L3+2vmbvQ5WC605 QJLWM7/RWFr1H+rbKWcrJd22oOfcMbBOMn8yvX99QYyJq+dtmbuz+RGplUiR2F2923ab GZbUFHnigmLVlIicd931K3z8CAjRP7QXAZp897bpxe2HloXV+wLy8NcZfRUbSokQICmB Z6/OFhy9394/ogbG0rbTrf8C4jtwFYh5xOX0Ayim3RTgCaSeBgeaQf4EQ7FRuDnrvtvS g1nQ== X-Forwarded-Encrypted: i=1; AJvYcCX05LdXd9gXEcvicfTrtTFU/W61L0pk8vsc99T6Zrc5eRgvSvHYkq4dA9byBwnyndPnKAtuVuizUmwYhhA=@vger.kernel.org X-Gm-Message-State: AOJu0YxaqvuFE4uh5y0Gc5ZQrlDKfhOSFRXfR0iboFcIWuBKiIJZxvEf tp33nuDYrPQhlL3RFYoAVwo+GpFqmbPIydmR3vtD+4JDG71GKTK5tvSnYi85c6i90jE= X-Gm-Gg: ASbGncsYtfKIKEMXrg1tHGA6woRwf59t2CQsDeQgZ5DRNj5aWacGRHr+5QYIZQuaH7Y n9ouo6QOMKCaEi/GnhXjQquX+SqaGCLCa+yScKWnnqMGyoVCxlM9CBjlZLu1vV55IUr6J14I9OD k/zxLqU9il62c/kWez/LC1RHRKbJntzgrqbDzCEyIM7nFn07JTqy9Am6N7Q9tayCcZDi1cIeMaD F2/Sr1NBfV2Nxp1Tq+s8rN967lsL2AhiyB9Jv6AtQZqH24SKIVL+8GsbXoNCfGivvwwOCtcodhI gdVHO5cs0DeM5IlK4H2wHOpYaxtxAntZ5PVX1i/ZWDg8qGDN7rb4wTxuanujarNkLBohde6Se7i Pgcf9ein2b7DVkxn9ftWiV2cgZViKsUWWAa54KEqbAdWUEKPioKHQMCL5HmVX7VOuIEwg7jsiCH 67cdpW8cZ1Er3pT5iKwk6wgh0= 
X-Google-Smtp-Source: AGHT+IHEETHVqmTbuWgs08Fwqn+h92i7koaC3q6+hGqOy8xHcPTrkyrnZKvlXSo9rTCs6WBaAbd2Hw== X-Received: by 2002:ad4:4ee4:0:b0:709:76b4:5936 with SMTP id 6a1803df08f44-7098a898d5bmr22807336d6.55.1754531088556; Wed, 06 Aug 2025 18:44:48 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. [34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.44.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:44:47 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, 
brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 01/30] kho: init new_physxa->phys_bits to fix lockdep Date: Thu, 7 Aug 2025 01:44:07 +0000 Message-ID: <20250807014442.3829950-2-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com> References: <20250807014442.3829950-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Lockdep shows the following warning: INFO: trying to register non-static key. The code is fine but needs lockdep annotation, or maybe you didn't initialize this object before use? turning off the locking correctness validator. [] dump_stack_lvl+0x66/0xa0 [] assign_lock_key+0x10c/0x120 [] register_lock_class+0xf4/0x2f0 [] __lock_acquire+0x7f/0x2c40 [] ? __pfx_hlock_conflict+0x10/0x10 [] ? native_flush_tlb_global+0x8e/0xa0 [] ? __flush_tlb_all+0x4e/0xa0 [] ? __kernel_map_pages+0x112/0x140 [] ? xa_load_or_alloc+0x67/0xe0 [] lock_acquire+0xe6/0x280 [] ? xa_load_or_alloc+0x67/0xe0 [] _raw_spin_lock+0x30/0x40 [] ? xa_load_or_alloc+0x67/0xe0 [] xa_load_or_alloc+0x67/0xe0 [] kho_preserve_folio+0x90/0x100 [] __kho_finalize+0xcf/0x400 [] kho_finalize+0x34/0x70 This is because xa has its own lock, which is not initialized in xa_load_or_alloc. 
Modify __kho_preserve_order() to properly call xa_init(&new_physxa->phys_bits); Fixes: fc33e4b44b27 ("kexec: enable KHO support for memory preservation") Signed-off-by: Pasha Tatashin Acked-by: Mike Rapoport (Microsoft) Reviewed-by: Pratyush Yadav --- kernel/kexec_handover.c | 29 +++++++++++++++++++++++++---- 1 file changed, 25 insertions(+), 4 deletions(-) diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c index e49743ae52c5..6240bc38305b 100644 --- a/kernel/kexec_handover.c +++ b/kernel/kexec_handover.c @@ -144,14 +144,35 @@ static int __kho_preserve_order(struct kho_mem_track = *track, unsigned long pfn, unsigned int order) { struct kho_mem_phys_bits *bits; - struct kho_mem_phys *physxa; + struct kho_mem_phys *physxa, *new_physxa; const unsigned long pfn_high =3D pfn >> order; =20 might_sleep(); =20 - physxa =3D xa_load_or_alloc(&track->orders, order, sizeof(*physxa)); - if (IS_ERR(physxa)) - return PTR_ERR(physxa); + physxa =3D xa_load(&track->orders, order); + if (!physxa) { + new_physxa =3D kzalloc(sizeof(*physxa), GFP_KERNEL); + if (!new_physxa) + return -ENOMEM; + + xa_init(&new_physxa->phys_bits); + physxa =3D xa_cmpxchg(&track->orders, order, NULL, new_physxa, + GFP_KERNEL); + if (xa_is_err(physxa)) { + int err =3D xa_err(physxa); + + xa_destroy(&new_physxa->phys_bits); + kfree(new_physxa); + + return err; + } + if (physxa) { + xa_destroy(&new_physxa->phys_bits); + kfree(new_physxa); + } else { + physxa =3D new_physxa; + } + } =20 bits =3D xa_load_or_alloc(&physxa->phys_bits, pfn_high / PRESERVE_BITS, sizeof(*bits)); --=20 2.50.1.565.gc32cd1483b-goog From nobody Sun Oct 5 07:24:30 2025 Received: from mail-qv1-f47.google.com (mail-qv1-f47.google.com [209.85.219.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 81B081AF4EF for ; Thu, 7 Aug 2025 01:44:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none 
smtp.client-ip=209.85.219.47 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531093; cv=none; b=ikUeHbmtofsPePrmzLev+jdcqG6OAJosza8Z5JCH84B+1mvI1MkOzJVs5mtPCXSdkNTZQ/YWXWPD+9xvrmTEvh+JFFCJFBe9HhCQesTKMUgkwmr3JBC2naogyTuEjFtvIBxrvbmkRp+iUd0RYD3+Z8DUxPhR1IsD2yC2MDPOAvM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531093; c=relaxed/simple; bh=pChmRpJN5MN0Cs/ZXsmC3YUcAo70fIcDoe+NxtZQw38=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=du5ypzG7he/CyDbW7Qghu+ymP1Kag3KP0/8HNL7RfjycHj+P24N+6TIKNt97vQBgCYaZmxLRvQ0i4MSh6qTOP11/SRWOvlS7fhXl3SGKnEbpC7Gw7ZJWPO8/ddFmsPs1fFT8hQKFM6/0lZQs7axQIianUA2p8uUKvUThysjJdG0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=QlrF3A+u; arc=none smtp.client-ip=209.85.219.47 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="QlrF3A+u" Received: by mail-qv1-f47.google.com with SMTP id 6a1803df08f44-707389a2fe3so5505246d6.2 for ; Wed, 06 Aug 2025 18:44:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531090; x=1755135890; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=zshHU08ID0A8NXQB//otilnse1ea1iiCCRjsJeEREns=; b=QlrF3A+uEzNnSQi2J7vty3/Y4wlh/0cIoKk/9rMBFZEHy914iBlmF27s4ofmq8T7tt QheOd0sep4xtkcdyX/BEbSBEon8bBb5ZmIIkwQLbaOxoqYOJlUsmS7Gn5+BErFnRDcdJ FGX9oTxOEPyj/NqH+exQJhtCnXo1nzwvuo0CIBHYElttWCEULT+9h92p5eo2w5Gq3jhc 
lkV0JASIgu7B5PPYp7hqV+W50FfGZK3gHl3ZFOEKNZuOvMZbEw2Q5QGDSXE1r0BDbpjP vxI8DHS6+p4T0usbknEWcC6noOtLj7GGd6tHyRhq6gRKkf8Xg/7loXSgfJLX9hju3hgK fByw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531090; x=1755135890; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=zshHU08ID0A8NXQB//otilnse1ea1iiCCRjsJeEREns=; b=cvW5MvbhjJxbdNwOxrdCHxL8TA8plmsR+bu29m1XG8UILraJpWF7jOl+tYbz/ktK7X uowsmFlU70Dv54aLLtpV7vzCKRSgvYIKs6mrYgwiZY+xVyg3JcCMoUkc8Me7xKWA2jo1 kbRq3BsGeDKkBAcCj/zqT+D+ML3U+vz/E8+ZfYsFWUsKHww0TtAbzK1sMgZf3U/3isvj k9RisiR6Pg53u9v6c7cexT8bdqZNF6CqkRzRQ3PXi0HhDODGndtpHT9iYvFhojw8Kkq8 gNoNjDXciXHxObQcIevKbOdFy9k66QBPa+vRf7RaL8EnJ+aloPTbaHlEb0yJN+bABf1p bHyw== X-Forwarded-Encrypted: i=1; AJvYcCXe8UVusobYfWS5Bwo0K7K5XweCJUASrTjAYXJ8KrqqfKkRXOwJoTRyoDgcPu2gpu9TVfArFKLo2dg2dUM=@vger.kernel.org X-Gm-Message-State: AOJu0YxdYWLR9sgtFYavgHcrzFwdw9eV7UmXpmtM/cgI9iklC3fpE7uR LdjCrp1u860gf/qpW4vHgTeUUSNFLE2dF5J5Ru06GYA2XojpcV0XqYglyHDoCQQBCd4= X-Gm-Gg: ASbGnctYpfn8VWzvr6/oUr4j2PKSh47pW3D1La+pXn5HtigvCyzgemvoR4VFXfhuUju zpAnbbymRf2DCuEUVlPbTMrzhXhhe5/LihdeCMwanfSO+ksnQlvYh8K7dgMkzlkU6QTI354/Fyd ybo3dwnzRpZxrBXVAEE58E3ufyeqc06KRW5f4JWNF5oBqnERCaoi7l7vDa3NYdta9Ccskl1tfza dhdSbxBro3RoDf1oF8qhDsVUdp7Ms5JniIT6haCtPZQKJXY2bZT14p0yLQqq+z9N40vjwJ/nhy8 P02EuLBIeQf5BSqkznJ/rmuZ08zAPXw9OvH2+vtGwppxhqxUpJlrI7D7CAXNQHgck6CXRMmEKmj +js7BgSOF7isOXi+7tV2/BRWfEkKYzsIp5EHvzq3FdKi+m7NgKANYs108Fpw7zrYWJohzVNZ5QN eLKw0f1NIUoHS/ X-Google-Smtp-Source: AGHT+IHnmZoVOKyFNNYjLIT5+3p7MrntTaDtd4zF5EXEXaUvK/PtJPkVc+qy8Wy9N2jknjRUu/0o0g== X-Received: by 2002:a05:6214:300f:b0:707:29f9:3bd1 with SMTP id 6a1803df08f44-7097964ce99mr76335266d6.46.1754531090257; Wed, 06 Aug 2025 18:44:50 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.44.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:44:49 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 02/30] kho: mm: Don't allow deferred struct page with KHO Date: Thu, 7 Aug 2025 01:44:08 +0000 Message-ID: 
<20250807014442.3829950-3-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com> References: <20250807014442.3829950-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" KHO uses struct pages for the preserved memory early in boot, however, with deferred struct page initialization, only a small portion of memory has properly initialized struct pages. This problem was detected where vmemmap is poisoned, and illegal flag combinations are detected. Don't allow them to be enabled together, and later we will have to teach KHO to work properly with deferred struct page init kernel feature. Fixes: 990a950fe8fd ("kexec: add config option for KHO") Signed-off-by: Pasha Tatashin Acked-by: Mike Rapoport (Microsoft) Acked-by: Pratyush Yadav --- kernel/Kconfig.kexec | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/Kconfig.kexec b/kernel/Kconfig.kexec index 2ee603a98813..1224dd937df0 100644 --- a/kernel/Kconfig.kexec +++ b/kernel/Kconfig.kexec @@ -97,6 +97,7 @@ config KEXEC_JUMP config KEXEC_HANDOVER bool "kexec handover" depends on ARCH_SUPPORTS_KEXEC_HANDOVER && ARCH_SUPPORTS_KEXEC_FILE + depends on !DEFERRED_STRUCT_PAGE_INIT select MEMBLOCK_KHO_SCRATCH select KEXEC_FILE select DEBUG_FS --=20 2.50.1.565.gc32cd1483b-goog From nobody Sun Oct 5 07:24:30 2025 Received: from mail-qv1-f51.google.com (mail-qv1-f51.google.com [209.85.219.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E3A5F1C5D72 for ; Thu, 7 Aug 2025 01:44:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; 
t=1754531095; cv=none; b=OyJAT53unB8AwP5aDyeEllEY7Y5FI/+PssC3snNiTMmm9/1VDqelPJCxJ7+8qCEf/E6NYuW2S3NUMTBnlXnkotXsdeu1S6Nv5N4yzKF7Grnz3BebaY899g3PXkEmTQQHAASAQ6kAKd79vPEPKdL1SY1wkb83G/2Kp/WUX64Ivno= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531095; c=relaxed/simple; bh=7U4lml8t8oTaVKuCLMpzN1Wuy07LH0AEtgZI7ZnzuAo=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=hOqnZLvnKO40XJgwXGTJ8DRTwz51RLQr/+Zj92vG5BNK4aMjZozWt16ig54BK8fZ/obuqxs6WFMr3Vugm2R4Dnj0q8siwea0vYz9j28/T7dNuhVqUrcJX+IZnli2mMkJLEZyaCz3GBItaGk/6AxenTYh0Ned3c6pDsJA2s7rgfo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=jGhRq5J8; arc=none smtp.client-ip=209.85.219.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="jGhRq5J8" Received: by mail-qv1-f51.google.com with SMTP id 6a1803df08f44-707453b0306so5567776d6.2 for ; Wed, 06 Aug 2025 18:44:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531092; x=1755135892; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=kaESzk2f6ioxFaB+zOVCWa3oEolQrGNRo6YQh6Uj768=; b=jGhRq5J8VB7lIKnNpy5IKMw8o759Wi8rvUO67fzgAAGCCCNNALGWGFCvUy8qIR2/LT DXGxW0APa9sdDaPeZEwdzz0T2lICen0AXKPPnAznTYEJWRHZQTfte+Svh0/PKYE1wN/E Hw9FP6Y83KNEBd3R+497m3ThktAMJAxFcsnbHwhRHmebFfa7+opf19X8RXLo+BdSY3gw vj1YvsosNm64zjFspN5YAkjdnqvhI7zj2WOgFRuA2hBLMN2h7r3WUF31CD2F4GR4hOmR 
qf6OOUYSKXEEXnkxZcjMDk/tFAuea95bcYCzFRjAcDrsVywYcGTLqOzHgv1pcuTLGqrI F8Kg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531092; x=1755135892; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=kaESzk2f6ioxFaB+zOVCWa3oEolQrGNRo6YQh6Uj768=; b=btC/k6QKv1fuKmEGYyVkZia6DD1pbMA1MXUfXMIuItJvtVpFSqWeYm1J0J8CLftNEM Ob7rgZ+g3QwyMaRfRqf479yip3wL+McKoybjVTDHlRYTxdZahxRUh0Ze+UaU6WEL3WiO TZsOMVGMuYJWerisbPXOc3hmp6+8EJzvC82bAM/RV9Cv7hxUaEITTIP8OAib2b32V/WF fQngYsjDE3EGgP5ruFO6vPHzePVnJw09yID2B65vyNXm2ttbwGaZ34I0rU+n6FAp+B6U 6CK5pxoZA2SiKWLYFJTdPQsGGQtyJTksfwhs5vT0DTng+e1wi5mZPacGBC1tH8rbTzX/ jEzw== X-Forwarded-Encrypted: i=1; AJvYcCVAoJs+5HWKuCQNF9UtdylPbYnxjafVGHfBAuYHVmJa5tY8dAt+LUBVJJ6wCssDYQLoQcvT/x4Jf5/KvsE=@vger.kernel.org X-Gm-Message-State: AOJu0YxZpVvLQmYG+8XCw2ZAZaY53cq7Try4yJ2eJJm36ljiNW/cSg1r sxKGh+lwxFs8HsdvhMgCaq5i+RmDc3j5fT0tVSNT8BA/fbDG5lNDM/4xcHUbGSKasqk= X-Gm-Gg: ASbGncvN69cUKDGagk1ezLhKFpWbob0WxzSdxAzRLD1BU+/ME2bvf69HsnxWltMBZJR +YywEAxmuVHC1FOZMNVlXs2g3d4O4+jeb3F2DEJ3WeoaCF8kRRWQfNRvSqIytHckn7iAo3isdou NWaUp9UwiMplTmlXdhRD3HV4jLt8vVYcVzLhfiNqRO7kNlp5xIXp9Itlv/zfOZe3XL+h4UG2ZAr DhEyLDVXm/HPLfOaCDKZlyfRh9U/4O+iabiXX9kgR7eahBEAnY6AbFJALJyNkF/RD2tup9jRzje PzEXzpM5CV19+HPV3269w4DVEQozl9OsH2Y3bXiWadMCjdnJeTdZi81nlHvjfzqtY8IEgojVEW7 VBayvGPRlELSvlZRdkIqjKeJ1S3HwgEPyTPaMi8urDL/Fb3JF5YrunXaQ16kd8ZHhXHL6M2vLBk blhJEsez5T8Isr X-Google-Smtp-Source: AGHT+IEyWMvTBlOM62kNmVMKSSB8JNc55ip6AvkaRw6dwIN2X5ATQTWFGGBA3z7kK5RTd9pgC648bw== X-Received: by 2002:ad4:5f8f:0:b0:707:4daf:637 with SMTP id 6a1803df08f44-7097af1440bmr64537216d6.29.1754531091759; Wed, 06 Aug 2025 18:44:51 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.44.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:44:51 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 03/30] kho: warn if KHO is disabled due to an error Date: Thu, 7 Aug 2025 01:44:09 +0000 Message-ID: 
<20250807014442.3829950-4-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com> References: <20250807014442.3829950-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" During boot scratch area is allocated based on command line parameters or auto calculated. However, scratch area may fail to allocate, and in that case KHO is disabled. Currently, no warning is printed that KHO is disabled, which makes it confusing for the end user to figure out why KHO is not available. Add the missing warning message. Signed-off-by: Pasha Tatashin Acked-by: Mike Rapoport (Microsoft) Acked-by: Pratyush Yadav --- kernel/kexec_handover.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c index 6240bc38305b..c2b7e8b86db0 100644 --- a/kernel/kexec_handover.c +++ b/kernel/kexec_handover.c @@ -565,6 +565,7 @@ static void __init kho_reserve_scratch(void) err_free_scratch_desc: memblock_free(kho_scratch, kho_scratch_cnt * sizeof(*kho_scratch)); err_disable_kho: + pr_warn("Failed to reserve scratch area, disabling kexec handover\n"); kho_enable =3D false; } =20 --=20 2.50.1.565.gc32cd1483b-goog From nobody Sun Oct 5 07:24:30 2025 Received: from mail-qt1-f177.google.com (mail-qt1-f177.google.com [209.85.160.177]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A208C1A76BB for ; Thu, 7 Aug 2025 01:44:54 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.177 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531096; cv=none; 
b=GIoFdKUHgFj/89XYiRzkLJ4muCrOCGbOO4/VNgn5YE8wZuc6xFk0VYHylZxUuONWsMlPEGUolEpo1yTJWqoliFuRewZjBJU7iIWHDj+PFAD6wFQiGHDC2KB8qKtCTM3UrdDTabrHSIndnFeQr+yk+7zoguhxr2h+FAYD4JDXHJs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531096; c=relaxed/simple; bh=9o3FtqjKn/C7E3gCLx9DaCJVqshNMO5CpPQSR//j8AM=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=OomZuRIUaDIT1XyPqY4cj06dUmB7CqKfjIKGQcYBRBBcxSh0F6L4m3H9bIlTXyyuviSKjQGcOx1H7ofELHexTkOCfDTbzao3DKGkw/Mmp6yaAdXpvXk6c+cH18FT/HXaIrVZgmDZkwO9Vya8nwFtVEVIRW4IOLxRh7+6rFLBTa0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=hekIzqZr; arc=none smtp.client-ip=209.85.160.177 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="hekIzqZr" Received: by mail-qt1-f177.google.com with SMTP id d75a77b69052e-4b0784e3153so8851331cf.2 for ; Wed, 06 Aug 2025 18:44:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531093; x=1755135893; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=FY0apzV4/jiYpNtz4zaEBU41SIXIqP4k9Hjv/Z6KQLI=; b=hekIzqZrxOsPoxQ4uSUVxR2tdnfeo3Dz2NaEHGWaTGyIyZh13/jf5lR8H8N1Drg+8h BdFj9ICIq9hC5Vxuy7SjSMFqSexd2WUZNFwPlIy7HyG0jcRUQ4t9TQLzPj+V3I3SRTzM 4l2slXZHQFkVFJ3l49wgqRArt+vKpVk8h2bSZiuZKuyQaU5nFeHiN4Zoe99gpLMOxUMJ vy20ho52rBMS57InEMEX0MDtgraALp5vcvZKNGuE/M9OcZuhs6Gvo3Yq0rTJ3j9zAlq+ zP+cmlRQCE+Uc5zNz7ZfieMDIsCpXvurcCUlwPTCw+BbjlzM/vHTtffoVmS0GCtwJM3w 1xjg== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531093; x=1755135893; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=FY0apzV4/jiYpNtz4zaEBU41SIXIqP4k9Hjv/Z6KQLI=; b=WY6KNoOmayd3fBntDooihNCKjrtFKexcHv/vq/5jiI+rtn8ObBxpkNT6/SxKGP+mX0 9wjjoAd7xjhQ16z83Z35CybUteQtUelykwxU8xlWFVjSUMqFmIKq5XJBgQdgEVEIhQa6 t/Gn35phqxkkHaCc3PjlR91jgKW3VDa70qOTimVW3lRl3VnutN25MQSrhZML3BwXG49F Tgbitmx1FXI4Hrscq6wlfL/oJv/4sQQXEGLa8Jk3/1Q/UrfGbbEKt42iJY53VMkY7TS7 EbkGo4OYAv7Sbm21fA4B2xmfKV0F+msqdgkOgFpANWbeOR8okaNfHA7EWrPBFC1ogU9a rXLQ== X-Forwarded-Encrypted: i=1; AJvYcCWFsp+AX9qf6QTOb8gO0M1mCbLZXcpoFAeBKd+SZQG38ty88znaWcWPpmtZGxNcn4+IPA/dBRJuIGqQ+Pw=@vger.kernel.org X-Gm-Message-State: AOJu0Yw+7Q6r7gU2ONGRy8AjIxM2wfZlab6cOdW715F3f3vV+tQcxCBy 1sTxAuuYO1OO9YzyQxTp1pHAVYBVEuw+PAVKub3YvKqZJjwGw935YIvkRhopJRjbg+Y= X-Gm-Gg: ASbGnctd0SkaJjK58zTQoYYT/nwDeGjQLmBmWRczZ/jvJNfVr5GXh9Bx5CJf3TVknD8 r6bXMnIiEMuCOOSqiz6BakgAu4/JXHqE9bneSQphq+Ft0qh2si4jtzCmGh3ztaMLikZSgmWqkL5 eOYlnypfFA8oZB+GQemIjLi6DAsSum/ZRezx/h2LtcwbQo1CkY3Q/3RU/wB4WRt8Pl+7KPg7Z6v EgroqposqhpBk+Puyp9ygbV//f4HZS2MzZmGzBotRYNfGw4zfzllTJalJNWloq3zWN/sx/6ADqe BTuxStRAUsq7oxqlWapJFmi85i8zpRsz5HkIlZAOijUkd1zJL78N63uWauwzs386BUUk9IGLCX/ x04r0VPM6pdpzTYvXjWwY03tgvrznzkyf36/uyjDf1auxzjNc9bVC1BT1XGtHYnj9ODV1d2Oa6U as/6c1bBtJabN+ X-Google-Smtp-Source: AGHT+IFVnZ5D1Am418Avb76/eeCwz3uLxwOs//KfNDCrFp0xraechqxEC3xc41ENAONdARaIHNn8kA== X-Received: by 2002:a05:622a:1494:b0:4ae:cc4b:11bb with SMTP id d75a77b69052e-4b09163a4cbmr65118341cf.57.1754531093076; Wed, 06 Aug 2025 18:44:53 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.44.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:44:52 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 04/30] kho: allow to drive kho from within kernel Date: Thu, 7 Aug 2025 01:44:10 +0000 Message-ID: 
<20250807014442.3829950-5-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com> References: <20250807014442.3829950-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Allow to do finalize and abort from kernel modules, so LUO could drive the KHO sequence via its own state machine. Signed-off-by: Pasha Tatashin --- include/linux/kexec_handover.h | 15 +++++++++ kernel/kexec_handover.c | 56 ++++++++++++++++++++++++++++++++-- 2 files changed, 69 insertions(+), 2 deletions(-) diff --git a/include/linux/kexec_handover.h b/include/linux/kexec_handover.h index 348844cffb13..f98565def593 100644 --- a/include/linux/kexec_handover.h +++ b/include/linux/kexec_handover.h @@ -54,6 +54,10 @@ void kho_memory_init(void); =20 void kho_populate(phys_addr_t fdt_phys, u64 fdt_len, phys_addr_t scratch_p= hys, u64 scratch_len); + +int kho_finalize(void); +int kho_abort(void); + #else static inline bool kho_is_enabled(void) { @@ -104,6 +108,17 @@ static inline void kho_populate(phys_addr_t fdt_phys, = u64 fdt_len, phys_addr_t scratch_phys, u64 scratch_len) { } + +static inline int kho_finalize(void) +{ + return -EOPNOTSUPP; +} + +static inline int kho_abort(void) +{ + return -EOPNOTSUPP; +} + #endif /* CONFIG_KEXEC_HANDOVER */ =20 #endif /* LINUX_KEXEC_HANDOVER_H */ diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c index c2b7e8b86db0..2c22a9f3b278 100644 --- a/kernel/kexec_handover.c +++ b/kernel/kexec_handover.c @@ -757,7 +757,7 @@ static int kho_out_update_debugfs_fdt(void) return err; } =20 -static int kho_abort(void) +static int __kho_abort(void) { int err; unsigned long order; @@ -790,7 +790,33 @@ static int kho_abort(void) return err; } =20 -static int kho_finalize(void) +int kho_abort(void) +{ + 
int ret =3D 0; + + if (!kho_enable) + return -EOPNOTSUPP; + + mutex_lock(&kho_out.lock); + + if (!kho_out.finalized) { + ret =3D -ENOENT; + goto unlock; + } + + ret =3D __kho_abort(); + if (ret) + goto unlock; + + kho_out.finalized =3D false; + ret =3D kho_out_update_debugfs_fdt(); + +unlock: + mutex_unlock(&kho_out.lock); + return ret; +} + +static int __kho_finalize(void) { int err =3D 0; u64 *preserved_mem_map; @@ -839,6 +865,32 @@ static int kho_finalize(void) return err; } =20 +int kho_finalize(void) +{ + int ret =3D 0; + + if (!kho_enable) + return -EOPNOTSUPP; + + mutex_lock(&kho_out.lock); + + if (kho_out.finalized) { + ret =3D -EEXIST; + goto unlock; + } + + ret =3D __kho_finalize(); + if (ret) + goto unlock; + + kho_out.finalized =3D true; + ret =3D kho_out_update_debugfs_fdt(); + +unlock: + mutex_unlock(&kho_out.lock); + return ret; +} + static int kho_out_finalize_get(void *data, u64 *val) { mutex_lock(&kho_out.lock); --=20 2.50.1.565.gc32cd1483b-goog From nobody Sun Oct 5 07:24:30 2025 Received: from mail-qv1-f48.google.com (mail-qv1-f48.google.com [209.85.219.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C998E1E0E08 for ; Thu, 7 Aug 2025 01:44:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.48 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531098; cv=none; b=DKv7lAxAgOtj7kNRk9qUSiflXufytyQCttMGH5cttdjvBsJIOTNwnwS/kRL8PCej42sl3vujr2GJl+GvGTwLPtXwnla7XMCCqJoXLyC4+tpdh9PdwNSqSi7PtgBT7kF3df6DbFt/f+kmpva8pL0m4Ky+P/KmkWZZ7G8UqeChyNg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531098; c=relaxed/simple; bh=zX1S6sHAlwhhyXb3WKb4eKTk2Ab0++YRiXQY/rUaxLI=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=mPKJQy1iPJ03aZQdqk5LQ7h6ZsZwWjp9zw91gIuRDibfXbBSYj3NPhSpzirczsD/qqSHksoqjjCG8pOiATekBclRndBlU5gu62lZf1i3mRw9WQEiu9gH3gE9KX31/hwLl/WMut00ZOY+VmOVs+kALg4CA4mb05aokhQPX+WdJj8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=ijXPyG4j; arc=none smtp.client-ip=209.85.219.48 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="ijXPyG4j" Received: by mail-qv1-f48.google.com with SMTP id 6a1803df08f44-7072ed7094aso5675966d6.1 for ; Wed, 06 Aug 2025 18:44:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531094; x=1755135894; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=IuJY+DJcpvZ8c9vRXHnvTFVehhdRAUYZ2+dB0gIVboA=; b=ijXPyG4juDbMQaSSDe6yLdggMeLE+n2+jCZEPzCQttH3vZiwjJvp+vpyQHOPNhv0ky hy4WVAbaiS6K/E0SBbvSKGdxJLyvV1ZXn+0CqsBUiKeRNlji7gBoUnFNK3WNyaIE0wY2 OpAToG2TCLKTPhRNJhkAJah862ZmDaidJOwy616H6s/yuGhobCYLaOx7mi7dCSEc0hcQ lQE4YO3cIZfMvAsYTvDCxGMQQeHzHQWEa0fZVEM+xYhC2hBaguRlwveKr93kz7acrZ2g avyo2D9i693cFFFQF60fpUqyLfG/4ziRKPj3k5q+H1s7H5L2mGYuEm3Krj1S25a+7kdg QL6Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531095; x=1755135895; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=IuJY+DJcpvZ8c9vRXHnvTFVehhdRAUYZ2+dB0gIVboA=; b=a98f4dIdgrG/WY2O7/jKpiQcAuyHIO+MgaRWOtNdp9iIEhWko24XvgwPegeUqGu0LR 
gygEC3elqPcZSbVp5/KT40W72sAfmA8xNkP1T93GHfo5KW1gHTvDdcSKpD1xwFhbcNHA xjdk6TIpiRAGlhKKiQ+4yxnbrF2bjPE/LFx7nRZzAyjrp/0PX7dC35pg/BhDJkEyBjXr SdBq3THSRroU3fMZBosaSneUFLSf7hLgqF+oNI96yzK6OzyzgMJjKEiNq8EYcR7UtcET T+TuO6COFqGbv7Cq0+FSpK6byRJYmUKgsBEJWF1RMFQYxkfoM2f4s303KYw6Bxj7e2BE IeWQ== X-Forwarded-Encrypted: i=1; AJvYcCUiumG/Y+SO2fLkbYHhnEdmWXgnDoywoLEy4+1rm4Lj7DMw4TwzibEi+KaB9S4I4/65AEzP1tTVuuJnMOE=@vger.kernel.org X-Gm-Message-State: AOJu0YyNtfle06G4ifoYiYuNdg9ABzj6lD6jNpREP+aWT4ccC8zfiWyS QRROhVvs3rtL40NqXF+XFseUAz5BmkSeL93cIBQHpFIXCzmjH7sOVTvSrPFCJ9+2ymc= X-Gm-Gg: ASbGnctBXnxJouZSnJZDS62KxDigr/B8NfRlYJzzc0j5j4KGZn7+nTGq6veVJTayi+F DR5MltDshYq0WNqB2kwMwG4D7b2PgcRAZJ3qFAW2cF5vKsEUm2tVaipOumQX383GmVJ5BwkwPTW +u0AE9WYPSxBbuT9jKXMhVMDPeBFZHnk+xyESOnf8jOQnWX9inaJvNnghBXeVhAxaZixf8xNMAk 8UYxXuWM+ibgjJZmGaADBfhuQV1htUDe+g2TzJJT6CZjyK/ZglqDFuYBfDc+VAEmykfS1B2c384 Isvu6dvm8YrlMz49ZKbQ/N35HCDFT1e6ESq3/mLljwMPb88QW6sntir7XdktoM3u9azEtFMJ494 kKX1p1adcpuuCs1iv3ke+fQnwStyNngcN9sRtab2Hy43QOVpe7CIUArnPxnm0pnJK5+E6Yw0zXd YN2j349KrUwogf X-Google-Smtp-Source: AGHT+IE1tVM0YUUcEBsMqVCKHVRy1rkAu8tlveZeewH/tKzAaP6qTaZRqK6AkniZb18bLHt5TBKrpQ== X-Received: by 2002:a05:6214:1c4c:b0:707:56bd:906f with SMTP id 6a1803df08f44-7097ae18869mr69979806d6.17.1754531094496; Wed, 06 Aug 2025 18:44:54 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.44.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:44:53 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 05/30] kho: make debugfs interface optional Date: Thu, 7 Aug 2025 01:44:11 +0000 Message-ID: <20250807014442.3829950-6-pasha.tatashin@soleen.com> 
X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com> References: <20250807014442.3829950-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Currently, KHO is controlled via the debugfs interface, but once LUO is introduced, LUO can control KHO and the debugfs interface becomes optional. Add a separate config option, CONFIG_KEXEC_HANDOVER_DEBUG, that enables the debugfs interface and allows inspecting the tree. Move all debugfs-related code to a new file to keep the .c files clear of ifdefs. Co-developed-by: Mike Rapoport (Microsoft) Signed-off-by: Mike Rapoport (Microsoft) Signed-off-by: Pasha Tatashin --- MAINTAINERS | 3 +- kernel/Kconfig.kexec | 10 ++ kernel/Makefile | 1 + kernel/kexec_handover.c | 278 ++++--------------------------- kernel/kexec_handover_debug.c | 218 ++++++++++++++++++++++++ kernel/kexec_handover_internal.h | 44 +++++ 6 files changed, 311 insertions(+), 243 deletions(-) create mode 100644 kernel/kexec_handover_debug.c create mode 100644 kernel/kexec_handover_internal.h diff --git a/MAINTAINERS b/MAINTAINERS index fda151dbf229..ce0314af3bdf 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -13534,13 +13534,14 @@ KEXEC HANDOVER (KHO) M: Alexander Graf M: Mike Rapoport M: Changyuan Lyu +M: Pasha Tatashin L: kexec@lists.infradead.org L: linux-mm@kvack.org S: Maintained F: Documentation/admin-guide/mm/kho.rst F: Documentation/core-api/kho/* F: include/linux/kexec_handover.h -F: kernel/kexec_handover.c +F: kernel/kexec_handover* F: tools/testing/selftests/kho/ =20 KEYS-ENCRYPTED diff --git a/kernel/Kconfig.kexec b/kernel/Kconfig.kexec index 1224dd937df0..9968d3d4dd17 100644 --- a/kernel/Kconfig.kexec +++ b/kernel/Kconfig.kexec @@ -109,6 +109,16 @@ config KEXEC_HANDOVER to keep data or state alive across the kexec.
For this to work, both source and target kernels need to have this option enabled. =20 +config KEXEC_HANDOVER_DEBUG + bool "kexec handover debug interface" + depends on KEXEC_HANDOVER + depends on DEBUG_FS + help + Allow controlling the kexec handover device tree via the debugfs + interface, i.e. finalizing the state or aborting the finalization. + Also enables inspecting the KHO FDT trees via debugfs binary + blobs. + config CRASH_DUMP bool "kernel crash dumps" default ARCH_DEFAULT_CRASH_DUMP diff --git a/kernel/Makefile b/kernel/Makefile index c60623448235..bfca6dfe335a 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -82,6 +82,7 @@ obj-$(CONFIG_KEXEC) +=3D kexec.o obj-$(CONFIG_KEXEC_FILE) +=3D kexec_file.o obj-$(CONFIG_KEXEC_ELF) +=3D kexec_elf.o obj-$(CONFIG_KEXEC_HANDOVER) +=3D kexec_handover.o +obj-$(CONFIG_KEXEC_HANDOVER_DEBUG) +=3D kexec_handover_debug.o obj-$(CONFIG_BACKTRACE_SELF_TEST) +=3D backtracetest.o obj-$(CONFIG_COMPAT) +=3D compat.o obj-$(CONFIG_CGROUPS) +=3D cgroup/ diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c index 2c22a9f3b278..a19d271721f7 100644 --- a/kernel/kexec_handover.c +++ b/kernel/kexec_handover.c @@ -10,7 +10,6 @@ =20 #include #include -#include #include #include #include @@ -27,6 +26,7 @@ */ #include "../mm/internal.h" #include "kexec_internal.h" +#include "kexec_handover_internal.h" =20 #define KHO_FDT_COMPATIBLE "kho-v1" #define PROP_PRESERVED_MEMORY_MAP "preserved-memory-map" @@ -84,8 +84,6 @@ struct khoser_mem_chunk; =20 struct kho_serialization { struct page *fdt; - struct list_head fdt_list; - struct dentry *sub_fdt_dir; struct kho_mem_track track; /* First chunk of serialized preserved memory map */ struct khoser_mem_chunk *preserved_mem_map; @@ -381,8 +379,8 @@ static void __init kho_mem_deserialize(const void *fdt) * area for early allocations that happen before page allocator is * initialized.
*/ -static struct kho_scratch *kho_scratch; -static unsigned int kho_scratch_cnt; +struct kho_scratch *kho_scratch; +unsigned int kho_scratch_cnt; =20 /* * The scratch areas are scaled by default as percent of memory allocated = from @@ -569,36 +567,24 @@ static void __init kho_reserve_scratch(void) kho_enable =3D false; } =20 -struct fdt_debugfs { - struct list_head list; - struct debugfs_blob_wrapper wrapper; - struct dentry *file; +struct kho_out { + struct blocking_notifier_head chain_head; + struct mutex lock; /* protects KHO FDT finalization */ + struct kho_serialization ser; + bool finalized; + struct kho_debugfs dbg; }; =20 -static int kho_debugfs_fdt_add(struct list_head *list, struct dentry *dir, - const char *name, const void *fdt) -{ - struct fdt_debugfs *f; - struct dentry *file; - - f =3D kmalloc(sizeof(*f), GFP_KERNEL); - if (!f) - return -ENOMEM; - - f->wrapper.data =3D (void *)fdt; - f->wrapper.size =3D fdt_totalsize(fdt); - - file =3D debugfs_create_blob(name, 0400, dir, &f->wrapper); - if (IS_ERR(file)) { - kfree(f); - return PTR_ERR(file); - } - - f->file =3D file; - list_add(&f->list, list); - - return 0; -} +static struct kho_out kho_out =3D { + .chain_head =3D BLOCKING_NOTIFIER_INIT(kho_out.chain_head), + .lock =3D __MUTEX_INITIALIZER(kho_out.lock), + .ser =3D { + .track =3D { + .orders =3D XARRAY_INIT(kho_out.ser.track.orders, 0), + }, + }, + .finalized =3D false, +}; =20 /** * kho_add_subtree - record the physical address of a sub FDT in KHO root = tree. @@ -611,7 +597,8 @@ static int kho_debugfs_fdt_add(struct list_head *list, = struct dentry *dir, * by KHO for the new kernel to retrieve it after kexec. * * A debugfs blob entry is also created at - * ``/sys/kernel/debug/kho/out/sub_fdts/@name``. 
+ * ``/sys/kernel/debug/kho/out/sub_fdts/@name`` when kernel is configured = with + * CONFIG_KEXEC_HANDOVER_DEBUG * * Return: 0 on success, error code on failure */ @@ -628,33 +615,10 @@ int kho_add_subtree(struct kho_serialization *ser, co= nst char *name, void *fdt) if (err) return err; =20 - return kho_debugfs_fdt_add(&ser->fdt_list, ser->sub_fdt_dir, name, fdt); + return kho_debugfs_fdt_add(&kho_out.dbg, name, fdt, false); } EXPORT_SYMBOL_GPL(kho_add_subtree); =20 -struct kho_out { - struct blocking_notifier_head chain_head; - - struct dentry *dir; - - struct mutex lock; /* protects KHO FDT finalization */ - - struct kho_serialization ser; - bool finalized; -}; - -static struct kho_out kho_out =3D { - .chain_head =3D BLOCKING_NOTIFIER_INIT(kho_out.chain_head), - .lock =3D __MUTEX_INITIALIZER(kho_out.lock), - .ser =3D { - .fdt_list =3D LIST_HEAD_INIT(kho_out.ser.fdt_list), - .track =3D { - .orders =3D XARRAY_INIT(kho_out.ser.track.orders, 0), - }, - }, - .finalized =3D false, -}; - int register_kho_notifier(struct notifier_block *nb) { return blocking_notifier_chain_register(&kho_out.chain_head, nb); @@ -734,29 +698,6 @@ int kho_preserve_phys(phys_addr_t phys, size_t size) } EXPORT_SYMBOL_GPL(kho_preserve_phys); =20 -/* Handling for debug/kho/out */ - -static struct dentry *debugfs_root; - -static int kho_out_update_debugfs_fdt(void) -{ - int err =3D 0; - struct fdt_debugfs *ff, *tmp; - - if (kho_out.finalized) { - err =3D kho_debugfs_fdt_add(&kho_out.ser.fdt_list, kho_out.dir, - "fdt", page_to_virt(kho_out.ser.fdt)); - } else { - list_for_each_entry_safe(ff, tmp, &kho_out.ser.fdt_list, list) { - debugfs_remove(ff->file); - list_del(&ff->list); - kfree(ff); - } - } - - return err; -} - static int __kho_abort(void) { int err; @@ -809,7 +750,8 @@ int kho_abort(void) goto unlock; =20 kho_out.finalized =3D false; - ret =3D kho_out_update_debugfs_fdt(); + + kho_debugfs_cleanup(&kho_out.dbg); =20 unlock: mutex_unlock(&kho_out.lock); @@ -859,7 +801,7 @@ static int 
__kho_finalize(void) abort: if (err) { pr_err("Failed to convert KHO state tree: %d\n", err); - kho_abort(); + __kho_abort(); } =20 return err; @@ -884,119 +826,32 @@ int kho_finalize(void) goto unlock; =20 kho_out.finalized =3D true; - ret =3D kho_out_update_debugfs_fdt(); + ret =3D kho_debugfs_fdt_add(&kho_out.dbg, "fdt", + page_to_virt(kho_out.ser.fdt), true); =20 unlock: mutex_unlock(&kho_out.lock); return ret; } =20 -static int kho_out_finalize_get(void *data, u64 *val) +bool kho_finalized(void) { - mutex_lock(&kho_out.lock); - *val =3D kho_out.finalized; - mutex_unlock(&kho_out.lock); - - return 0; -} - -static int kho_out_finalize_set(void *data, u64 _val) -{ - int ret =3D 0; - bool val =3D !!_val; + bool ret; =20 mutex_lock(&kho_out.lock); - - if (val =3D=3D kho_out.finalized) { - if (kho_out.finalized) - ret =3D -EEXIST; - else - ret =3D -ENOENT; - goto unlock; - } - - if (val) - ret =3D kho_finalize(); - else - ret =3D kho_abort(); - - if (ret) - goto unlock; - - kho_out.finalized =3D val; - ret =3D kho_out_update_debugfs_fdt(); - -unlock: + ret =3D kho_out.finalized; mutex_unlock(&kho_out.lock); - return ret; -} - -DEFINE_DEBUGFS_ATTRIBUTE(fops_kho_out_finalize, kho_out_finalize_get, - kho_out_finalize_set, "%llu\n"); - -static int scratch_phys_show(struct seq_file *m, void *v) -{ - for (int i =3D 0; i < kho_scratch_cnt; i++) - seq_printf(m, "0x%llx\n", kho_scratch[i].addr); - - return 0; -} -DEFINE_SHOW_ATTRIBUTE(scratch_phys); - -static int scratch_len_show(struct seq_file *m, void *v) -{ - for (int i =3D 0; i < kho_scratch_cnt; i++) - seq_printf(m, "0x%llx\n", kho_scratch[i].size); - - return 0; -} -DEFINE_SHOW_ATTRIBUTE(scratch_len); - -static __init int kho_out_debugfs_init(void) -{ - struct dentry *dir, *f, *sub_fdt_dir; - - dir =3D debugfs_create_dir("out", debugfs_root); - if (IS_ERR(dir)) - return -ENOMEM; - - sub_fdt_dir =3D debugfs_create_dir("sub_fdts", dir); - if (IS_ERR(sub_fdt_dir)) - goto err_rmdir; =20 - f =3D 
debugfs_create_file("scratch_phys", 0400, dir, NULL, - &scratch_phys_fops); - if (IS_ERR(f)) - goto err_rmdir; - - f =3D debugfs_create_file("scratch_len", 0400, dir, NULL, - &scratch_len_fops); - if (IS_ERR(f)) - goto err_rmdir; - - f =3D debugfs_create_file("finalize", 0600, dir, NULL, - &fops_kho_out_finalize); - if (IS_ERR(f)) - goto err_rmdir; - - kho_out.dir =3D dir; - kho_out.ser.sub_fdt_dir =3D sub_fdt_dir; - return 0; - -err_rmdir: - debugfs_remove_recursive(dir); - return -ENOENT; + return ret; } =20 struct kho_in { - struct dentry *dir; phys_addr_t fdt_phys; phys_addr_t scratch_phys; - struct list_head fdt_list; + struct kho_debugfs dbg; }; =20 static struct kho_in kho_in =3D { - .fdt_list =3D LIST_HEAD_INIT(kho_in.fdt_list), }; =20 static const void *kho_get_fdt(void) @@ -1040,56 +895,6 @@ int kho_retrieve_subtree(const char *name, phys_addr_= t *phys) } EXPORT_SYMBOL_GPL(kho_retrieve_subtree); =20 -/* Handling for debugfs/kho/in */ - -static __init int kho_in_debugfs_init(const void *fdt) -{ - struct dentry *sub_fdt_dir; - int err, child; - - kho_in.dir =3D debugfs_create_dir("in", debugfs_root); - if (IS_ERR(kho_in.dir)) - return PTR_ERR(kho_in.dir); - - sub_fdt_dir =3D debugfs_create_dir("sub_fdts", kho_in.dir); - if (IS_ERR(sub_fdt_dir)) { - err =3D PTR_ERR(sub_fdt_dir); - goto err_rmdir; - } - - err =3D kho_debugfs_fdt_add(&kho_in.fdt_list, kho_in.dir, "fdt", fdt); - if (err) - goto err_rmdir; - - fdt_for_each_subnode(child, fdt, 0) { - int len =3D 0; - const char *name =3D fdt_get_name(fdt, child, NULL); - const u64 *fdt_phys; - - fdt_phys =3D fdt_getprop(fdt, child, "fdt", &len); - if (!fdt_phys) - continue; - if (len !=3D sizeof(*fdt_phys)) { - pr_warn("node `%s`'s prop `fdt` has invalid length: %d\n", - name, len); - continue; - } - err =3D kho_debugfs_fdt_add(&kho_in.fdt_list, sub_fdt_dir, name, - phys_to_virt(*fdt_phys)); - if (err) { - pr_warn("failed to add fdt `%s` to debugfs: %d\n", name, - err); - continue; - } - } - - return 0; - 
-err_rmdir: - debugfs_remove_recursive(kho_in.dir); - return err; -} - static __init int kho_init(void) { int err =3D 0; @@ -1104,27 +909,16 @@ static __init int kho_init(void) goto err_free_scratch; } =20 - debugfs_root =3D debugfs_create_dir("kho", NULL); - if (IS_ERR(debugfs_root)) { - err =3D -ENOENT; + err =3D kho_debugfs_init(); + if (err) goto err_free_fdt; - } =20 - err =3D kho_out_debugfs_init(); + err =3D kho_out_debugfs_init(&kho_out.dbg); if (err) goto err_free_fdt; =20 if (fdt) { - err =3D kho_in_debugfs_init(fdt); - /* - * Failure to create /sys/kernel/debug/kho/in does not prevent - * reviving state from KHO and setting up KHO for the next - * kexec. - */ - if (err) - pr_err("failed exposing handover FDT in debugfs: %d\n", - err); - + kho_in_debugfs_init(&kho_in.dbg, fdt); return 0; } =20 diff --git a/kernel/kexec_handover_debug.c b/kernel/kexec_handover_debug.c new file mode 100644 index 000000000000..b88d138a97be --- /dev/null +++ b/kernel/kexec_handover_debug.c @@ -0,0 +1,218 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * kexec_handover_debug.c - kexec handover debugfs interface + * Copyright (C) 2023 Alexander Graf + * Copyright (C) 2025 Microsoft Corporation, Mike Rapoport + * Copyright (C) 2025 Google LLC, Changyuan Lyu + * Copyright (C) 2025 Google LLC, Pasha Tatashin + */ + +#define pr_fmt(fmt) "KHO: " fmt + +#include +#include +#include +#include +#include "kexec_handover_internal.h" + +static struct dentry *debugfs_root; + +struct fdt_debugfs { + struct list_head list; + struct debugfs_blob_wrapper wrapper; + struct dentry *file; +}; + +static int __kho_debugfs_fdt_add(struct list_head *list, struct dentry *di= r, + const char *name, const void *fdt) +{ + struct fdt_debugfs *f; + struct dentry *file; + + f =3D kmalloc(sizeof(*f), GFP_KERNEL); + if (!f) + return -ENOMEM; + + f->wrapper.data =3D (void *)fdt; + f->wrapper.size =3D fdt_totalsize(fdt); + + file =3D debugfs_create_blob(name, 0400, dir, &f->wrapper); + if (IS_ERR(file)) { +
kfree(f); + return PTR_ERR(file); + } + + f->file =3D file; + list_add(&f->list, list); + + return 0; +} + +int kho_debugfs_fdt_add(struct kho_debugfs *dbg, const char *name, + const void *fdt, bool root) +{ + struct dentry *dir; + + if (root) + dir =3D dbg->dir; + else + dir =3D dbg->sub_fdt_dir; + + return __kho_debugfs_fdt_add(&dbg->fdt_list, dir, name, fdt); +} + +void kho_debugfs_cleanup(struct kho_debugfs *dbg) +{ + struct fdt_debugfs *ff, *tmp; + + list_for_each_entry_safe(ff, tmp, &dbg->fdt_list, list) { + debugfs_remove(ff->file); + list_del(&ff->list); + kfree(ff); + } +} + +static int kho_out_finalize_get(void *data, u64 *val) +{ + *val =3D kho_finalized(); + + return 0; +} + +static int kho_out_finalize_set(void *data, u64 _val) +{ + bool val =3D !!_val; + + if (val) + return kho_finalize(); + + return kho_abort(); +} + +DEFINE_DEBUGFS_ATTRIBUTE(kho_out_finalize_fops, kho_out_finalize_get, + kho_out_finalize_set, "%llu\n"); + +static int scratch_phys_show(struct seq_file *m, void *v) +{ + for (int i =3D 0; i < kho_scratch_cnt; i++) + seq_printf(m, "0x%llx\n", kho_scratch[i].addr); + + return 0; +} +DEFINE_SHOW_ATTRIBUTE(scratch_phys); + +static int scratch_len_show(struct seq_file *m, void *v) +{ + for (int i =3D 0; i < kho_scratch_cnt; i++) + seq_printf(m, "0x%llx\n", kho_scratch[i].size); + + return 0; +} +DEFINE_SHOW_ATTRIBUTE(scratch_len); + +__init void kho_in_debugfs_init(struct kho_debugfs *dbg, const void *fdt) +{ + struct dentry *dir, *sub_fdt_dir; + int err, child; + + INIT_LIST_HEAD(&dbg->fdt_list); + + dir =3D debugfs_create_dir("in", debugfs_root); + if (IS_ERR(dir)) { + err =3D PTR_ERR(dir); + goto err_out; + } + + sub_fdt_dir =3D debugfs_create_dir("sub_fdts", dir); + if (IS_ERR(sub_fdt_dir)) { + err =3D PTR_ERR(sub_fdt_dir); + goto err_rmdir; + } + + err =3D __kho_debugfs_fdt_add(&dbg->fdt_list, dir, "fdt", fdt); + if (err) + goto err_rmdir; + + fdt_for_each_subnode(child, fdt, 0) { + int len =3D 0; + const char *name =3D 
fdt_get_name(fdt, child, NULL); + const u64 *fdt_phys; + + fdt_phys =3D fdt_getprop(fdt, child, "fdt", &len); + if (!fdt_phys) + continue; + if (len !=3D sizeof(*fdt_phys)) { + pr_warn("node %s prop fdt has invalid length: %d\n", + name, len); + continue; + } + err =3D __kho_debugfs_fdt_add(&dbg->fdt_list, sub_fdt_dir, name, + phys_to_virt(*fdt_phys)); + if (err) { + pr_warn("failed to add fdt %s to debugfs: %d\n", name, + err); + continue; + } + } + + dbg->dir =3D dir; + dbg->sub_fdt_dir =3D sub_fdt_dir; + + return; +err_rmdir: + debugfs_remove_recursive(dir); +err_out: + /* + * Failure to create /sys/kernel/debug/kho/in does not prevent + * reviving state from KHO and setting up KHO for the next + * kexec. + */ + if (err) + pr_err("failed exposing handover FDT in debugfs: %d\n", err); +} + +__init int kho_out_debugfs_init(struct kho_debugfs *dbg) +{ + struct dentry *dir, *f, *sub_fdt_dir; + + INIT_LIST_HEAD(&dbg->fdt_list); + + dir =3D debugfs_create_dir("out", debugfs_root); + if (IS_ERR(dir)) + return -ENOMEM; + + sub_fdt_dir =3D debugfs_create_dir("sub_fdts", dir); + if (IS_ERR(sub_fdt_dir)) + goto err_rmdir; + + f =3D debugfs_create_file("scratch_phys", 0400, dir, NULL, + &scratch_phys_fops); + if (IS_ERR(f)) + goto err_rmdir; + + f =3D debugfs_create_file("scratch_len", 0400, dir, NULL, + &scratch_len_fops); + if (IS_ERR(f)) + goto err_rmdir; + + f =3D debugfs_create_file("finalize", 0600, dir, NULL, + &kho_out_finalize_fops); + if (IS_ERR(f)) + goto err_rmdir; + + dbg->dir =3D dir; + dbg->sub_fdt_dir =3D sub_fdt_dir; + return 0; + +err_rmdir: + debugfs_remove_recursive(dir); + return -ENOENT; +} + +__init int kho_debugfs_init(void) +{ + debugfs_root =3D debugfs_create_dir("kho", NULL); + if (IS_ERR(debugfs_root)) + return -ENOENT; + return 0; +} diff --git a/kernel/kexec_handover_internal.h b/kernel/kexec_handover_inter= nal.h new file mode 100644 index 000000000000..41e9616fcdd0 --- /dev/null +++ b/kernel/kexec_handover_internal.h @@ -0,0 +1,44 @@ +/* 
SPDX-License-Identifier: GPL-2.0 */ +#ifndef LINUX_KEXEC_HANDOVER_INTERNAL_H +#define LINUX_KEXEC_HANDOVER_INTERNAL_H + +#include +#include +#include + +#ifdef CONFIG_KEXEC_HANDOVER_DEBUG +#include + +struct kho_debugfs { + struct dentry *dir; + struct dentry *sub_fdt_dir; + struct list_head fdt_list; +}; + +#else +struct kho_debugfs {}; +#endif + +extern struct kho_scratch *kho_scratch; +extern unsigned int kho_scratch_cnt; + +bool kho_finalized(void); + +#ifdef CONFIG_KEXEC_HANDOVER_DEBUG +int kho_debugfs_init(void); +void kho_in_debugfs_init(struct kho_debugfs *dbg, const void *fdt); +int kho_out_debugfs_init(struct kho_debugfs *dbg); +int kho_debugfs_fdt_add(struct kho_debugfs *dbg, const char *name, + const void *fdt, bool root); +void kho_debugfs_cleanup(struct kho_debugfs *dbg); +#else +static inline int kho_debugfs_init(void) { return 0; } +static inline void kho_in_debugfs_init(struct kho_debugfs *dbg, + const void *fdt) { } +static inline int kho_out_debugfs_init(struct kho_debugfs *dbg) { return 0= ; } +static inline int kho_debugfs_fdt_add(struct kho_debugfs *dbg, const char = *name, + const void *fdt, bool root) { return 0; } +static inline void kho_debugfs_cleanup(struct kho_debugfs *dbg) {} +#endif /* CONFIG_KEXEC_HANDOVER_DEBUG */ + +#endif /* LINUX_KEXEC_HANDOVER_INTERNAL_H */ --=20 2.50.1.565.gc32cd1483b-goog From nobody Sun Oct 5 07:24:30 2025 Received: from mail-qt1-f174.google.com (mail-qt1-f174.google.com [209.85.160.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 588CF1EA7F4 for ; Thu, 7 Aug 2025 01:44:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.174 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531100; cv=none;
b=jziTF8DAgWcLzauyFFE2WFtmvZ8+pAsqEB7Y0sjRJoyFSS7Hd/kqrMuJ+o9AM1y/TNNMX9ThPwtUcp0jsq43EYaVxKzdnkqnVV2EkT9UxUXFShgOvDoTqEY8j2yE2F8/ZKO7Z+MngQaKSTRf0o/3XwYGAwdzCj9sOp2J/61IZjk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531100; c=relaxed/simple; bh=Tq22inIbaysxSW9ZcZh3aXQvdCgpg9+ezmJJ87SJJqw=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=isZSeiOp7FIPs9CBPuEFehorHXausSKz50z96tY1GmtrybWtiQlGcI60JgCVLvG1MJBQHPrV4aFxcLm2NpVqG2SvandMsyMSq0IlaDk7nPo6h3cThjWjQKBlOVHgsHlTYk5l7JMqd7PPERajh+G3QLGSA9Mj+s7PaYCQrgh7mkI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=kuBqzGbZ; arc=none smtp.client-ip=209.85.160.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="kuBqzGbZ" Received: by mail-qt1-f174.google.com with SMTP id d75a77b69052e-4b088929339so5066411cf.1 for ; Wed, 06 Aug 2025 18:44:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531096; x=1755135896; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=ffw72xt7ZzTFODVs8KzGvNRDPuQerrnNgMWkdlg1ha8=; b=kuBqzGbZEtv55+E4fKGb/nC5hawi49SZPgM6bDXruA28W8rxsV5Sf58nQ4md/8envh tR4nc6ahWxyUzIMxY8xjxoW/cbnVEFltuK2LDjgZWAtIp6Auixx/4PBMHRZdZsod0Omp GiYO2/aUErIhFTk/Kgtn4QcPRkbRu1Mv3piizNamTnYTQbXPQ3BTvz7HinkJCIv4e4dn l0whPBskuyGxvr+yk3HDx1kYmZ5uNZls9I7DzJL+f0UOjxikpuWtfZfOmTpqTxIc+bHx jz1jdIInttm5GpO4/Y1AMeBYqr++QtiZ7XCCWp7HfSspQ1ARCNVLrqVxnlkDxceJSvY9 F51g== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531096; x=1755135896; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ffw72xt7ZzTFODVs8KzGvNRDPuQerrnNgMWkdlg1ha8=; b=nYFnRx1aLGgQRIiYLR5A0X+8avs9COfXLrjdOoaTTBY8uQyJSFxZ05iFZk0hFY4v2H 0XAEx8wwnYOEoGKFODt7b5MsUl/Mwyf4YkQkWBAu0wNB5M0/R9H6VHix7GHMu7tWkl+C oAnEkOXpl4JUntAJiT2E1+Yf2p8cfFt3S5NuomjllMDxjcj2EgAhyAkJw1BdG/R7v6JN GvyjxJfDdOIyqVt+n3ZlWL23DhZRPGjKzYIbHWMmTXHDNOC1PVAA5DbkXLV3U84m3w1p jj0cjoaT+YmVuwZHEd/6xy+Alw8u3Q7nF/t2/MNQwjzN46UueharRHTyrT1mAjA236y1 XHpA== X-Forwarded-Encrypted: i=1; AJvYcCW0IpH8FFvdfbEOWAUETifKiPM3EtXuFcthlGHJ0PjzwOO2tVDqvRvqTgAFdvvcMgzwW3xCTOGT7jupfng=@vger.kernel.org X-Gm-Message-State: AOJu0YyrdDWk/KQi+nbU8A7Id2J9jTBLVQ5X5KAZv11AaEXRjtnBwP/6 ehwPAaeFtsZSRYlloWVkot9zO7Hbg38nQIwIpbUIvZ1kLUYZyP23S6cnUEDab4lsEmw= X-Gm-Gg: ASbGncu940YTiA0OzktxM/wdApK1AzWkPohWOqYnXxWuWfIkTGwqUjgZ4RpawGntG41 VX2f2QxQYeTbWAy6ra5q0iN331QT97df0RC3gOS48PU+D3Qqif51WJyAzVOXGJmpFTh7H/iLNUj 48u4Pgsne079Jl3CEy03/w2yyuxR1w+l6YxA00GnJ+WbCGyxoF0VqSSNJXuSb9LxYhHJcm2/63Z vALO4fHL41cubWLtt1x3QQVFOVC16+DqGJn3n67cyTr7N+ywhwWvsxb7iwrdgrEfsn0EpdzLWYy kovc7iytt96KfyMtrU3FQn7zRlTtnElfB2LAXtTilywEbj4Mbs14Y34I2hIM8QuDjbshYv8AASr DLWMTf4C93q1tSq4PZbhbIdZ0BYCih6CeEY4JyEomWWiXtr/W5j8j1f4eyIMBAG3XN10TcZOiHI 1lTSMlZ74rf7F5VboDc8+OQ0hfZGkS7eSZTw== X-Google-Smtp-Source: AGHT+IGcMMIR2SPPfJvG2NQQfbq95evg3BY5gAEft4jor8DCsZVbXoS61W9VF5v5vMwwYxadOs7XvQ== X-Received: by 2002:ad4:5f8f:0:b0:707:4daf:637 with SMTP id 6a1803df08f44-7097af1440bmr64538336d6.29.1754531096014; Wed, 06 Aug 2025 18:44:56 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.44.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:44:55 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 06/30] kho: drop notifiers Date: Thu, 7 Aug 2025 01:44:12 +0000 Message-ID: <20250807014442.3829950-7-pasha.tatashin@soleen.com> X-Mailer: 
git-send-email 2.50.1.565.gc32cd1483b-goog In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com> References: <20250807014442.3829950-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: "Mike Rapoport (Microsoft)" The KHO framework uses a notifier chain as the mechanism for clients to participate in the finalization process. While this works for a single, central state machine, it is too restrictive for kernel-internal components like pstore/reserve_mem or IMA. These components need a simpler, direct way to register their state for preservation (e.g., during their initcall) without being part of a complex, shutdown-time notifier sequence. The notifier model forces all participants into a single finalization flow and makes direct preservation from an arbitrary context difficult. This patch refactors the client participation model by removing the notifier chain and introducing a direct API for managing FDT subtrees. The core kho_finalize() and kho_abort() state machine remains, but clients now register their data with KHO beforehand. 
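The registration model described above can be sketched as a small userspace program. This is only an illustration under stated assumptions, not the kernel code: every name with the model_ prefix is hypothetical, a plain singly linked list stands in for the kernel list_head, and locking and debugfs are left out. It shows clients recording their blobs ahead of time, and finalization walking whatever is registered at that moment.

```c
/* Userspace model only. All model_* names are hypothetical, not kernel API. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct model_sub_fdt {
	struct model_sub_fdt *next;
	const char *name;	/* borrowed from the caller, not copied */
	void *fdt;		/* borrowed pointer to the caller's blob */
};

/* Stands in for the list of registered sub FDTs. */
static struct model_sub_fdt *model_head;

/* Record a blob ahead of time; nothing is written to a root tree yet. */
static int model_add_subtree(const char *name, void *fdt)
{
	struct model_sub_fdt **pp = &model_head;
	struct model_sub_fdt *n = malloc(sizeof(*n));

	if (!n)
		return -1;
	n->name = name;
	n->fdt = fdt;
	n->next = NULL;
	while (*pp)		/* append at the tail to keep registration order */
		pp = &(*pp)->next;
	*pp = n;
	return 0;
}

/* Drop the first entry whose blob pointer matches, if there is one. */
static void model_remove_subtree(void *fdt)
{
	struct model_sub_fdt **pp;

	for (pp = &model_head; *pp; pp = &(*pp)->next) {
		if ((*pp)->fdt == fdt) {
			struct model_sub_fdt *n = *pp;

			*pp = n->next;
			free(n);
			return;
		}
	}
}

/*
 * Model of finalization: walk the registered entries and copy up to max
 * names into out. Returns the number of names written.
 */
static size_t model_finalize(const char **out, size_t max)
{
	const struct model_sub_fdt *cur;
	size_t n = 0;

	for (cur = model_head; cur && n < max; cur = cur->next)
		out[n++] = cur->name;
	return n;
}
```

In this model a client calls model_add_subtree() once during its own initialization and model_remove_subtree() if it later withdraws. No callback runs at finalization time, which is the property the commit message is after.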
Signed-off-by: Mike Rapoport (Microsoft) Signed-off-by: Pasha Tatashin --- include/linux/kexec_handover.h | 28 +---- kernel/kexec_handover.c | 177 +++++++++++++++++-------------- kernel/kexec_handover_debug.c | 17 +-- kernel/kexec_handover_internal.h | 5 +- mm/memblock.c | 56 ++-------- 5 files changed, 124 insertions(+), 159 deletions(-) diff --git a/include/linux/kexec_handover.h b/include/linux/kexec_handover.h index f98565def593..cabdff5f50a2 100644 --- a/include/linux/kexec_handover.h +++ b/include/linux/kexec_handover.h @@ -10,14 +10,7 @@ struct kho_scratch { phys_addr_t size; }; -/* KHO Notifier index */ -enum kho_event { - KEXEC_KHO_FINALIZE = 0, - KEXEC_KHO_ABORT = 1, -}; - struct folio; -struct notifier_block; #define DECLARE_KHOSER_PTR(name, type) \ union { \ @@ -36,20 +29,16 @@ struct notifier_block; (typeof((s).ptr))((s).phys ? phys_to_virt((s).phys) : NULL); \ }) -struct kho_serialization; - #ifdef CONFIG_KEXEC_HANDOVER bool kho_is_enabled(void); int kho_preserve_folio(struct folio *folio); int kho_preserve_phys(phys_addr_t phys, size_t size); struct folio *kho_restore_folio(phys_addr_t phys); -int kho_add_subtree(struct kho_serialization *ser, const char *name, void *fdt); +int kho_add_subtree(const char *name, void *fdt); +void kho_remove_subtree(void *fdt); int kho_retrieve_subtree(const char *name, phys_addr_t *phys); -int register_kho_notifier(struct notifier_block *nb); -int unregister_kho_notifier(struct notifier_block *nb); - void kho_memory_init(void); void kho_populate(phys_addr_t fdt_phys, u64 fdt_len, phys_addr_t scratch_phys, @@ -79,23 +68,16 @@ static inline struct folio *kho_restore_folio(phys_addr_t phys) return NULL; } -static inline int kho_add_subtree(struct kho_serialization *ser, - const char *name, void *fdt) +static inline int kho_add_subtree(const char *name, void *fdt) { return -EOPNOTSUPP; } -static inline int kho_retrieve_subtree(const char *name, phys_addr_t *phys) +static inline
void kho_remove_subtree(void *fdt) { - return -EOPNOTSUPP; } -static inline int register_kho_notifier(struct notifier_block *nb) -{ - return -EOPNOTSUPP; -} - -static inline int unregister_kho_notifier(struct notifier_block *nb) +static inline int kho_retrieve_subtree(const char *name, phys_addr_t *phys) { return -EOPNOTSUPP; } diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c index a19d271721f7..8a4894e8ac71 100644 --- a/kernel/kexec_handover.c +++ b/kernel/kexec_handover.c @@ -15,7 +15,6 @@ #include #include #include -#include #include #include @@ -82,11 +81,35 @@ struct kho_mem_track { struct khoser_mem_chunk; -struct kho_serialization { - struct page *fdt; +struct kho_sub_fdt { + struct list_head l; + const char *name; + void *fdt; +}; + +struct kho_out { + void *fdt; + bool finalized; + struct mutex lock; /* protects KHO FDT finalization */ + + struct list_head sub_fdts; + struct mutex fdts_lock; + struct kho_mem_track track; /* First chunk of serialized preserved memory map */ struct khoser_mem_chunk *preserved_mem_map; + + struct kho_debugfs dbg; +}; + +static struct kho_out kho_out = { + .lock = __MUTEX_INITIALIZER(kho_out.lock), + .track = { + .orders = XARRAY_INIT(kho_out.track.orders, 0), + }, + .sub_fdts = LIST_HEAD_INIT(kho_out.sub_fdts), + .fdts_lock = __MUTEX_INITIALIZER(kho_out.fdts_lock), + .finalized = false, }; static void *xa_load_or_alloc(struct xarray *xa, unsigned long index, size_t sz) @@ -285,14 +308,14 @@ static void kho_mem_ser_free(struct khoser_mem_chunk *first_chunk) } } -static int kho_mem_serialize(struct kho_serialization *ser) +static int kho_mem_serialize(struct kho_out *kho_out) { struct khoser_mem_chunk *first_chunk = NULL; struct khoser_mem_chunk *chunk = NULL; struct kho_mem_phys *physxa; unsigned long order; - xa_for_each(&ser->track.orders, order, physxa) { + xa_for_each(&kho_out->track.orders, order, physxa) { struct kho_mem_phys_bits *bits; unsigned long phys;
@@ -320,7 +343,7 @@ static int kho_mem_serialize(struct kho_serialization *ser) } } - ser->preserved_mem_map = first_chunk; + kho_out->preserved_mem_map = first_chunk; return 0; @@ -567,28 +590,8 @@ static void __init kho_reserve_scratch(void) kho_enable = false; } -struct kho_out { - struct blocking_notifier_head chain_head; - struct mutex lock; /* protects KHO FDT finalization */ - struct kho_serialization ser; - bool finalized; - struct kho_debugfs dbg; -}; - -static struct kho_out kho_out = { - .chain_head = BLOCKING_NOTIFIER_INIT(kho_out.chain_head), - .lock = __MUTEX_INITIALIZER(kho_out.lock), - .ser = { - .track = { - .orders = XARRAY_INIT(kho_out.ser.track.orders, 0), - }, - }, - .finalized = false, -}; - /** * kho_add_subtree - record the physical address of a sub FDT in KHO root tree. - * @ser: serialization control object passed by KHO notifiers. * @name: name of the sub tree. * @fdt: the sub tree blob. * @@ -602,34 +605,45 @@ static struct kho_out kho_out = { * * Return: 0 on success, error code on failure */ -int kho_add_subtree(struct kho_serialization *ser, const char *name, void *fdt) +int kho_add_subtree(const char *name, void *fdt) { - int err = 0; - u64 phys = (u64)virt_to_phys(fdt); - void *root = page_to_virt(ser->fdt); + struct kho_sub_fdt *sub_fdt; + int err; - err |= fdt_begin_node(root, name); - err |= fdt_property(root, PROP_SUB_FDT, &phys, sizeof(phys)); - err |= fdt_end_node(root); + sub_fdt = kmalloc(sizeof(*sub_fdt), GFP_KERNEL); + if (!sub_fdt) + return -ENOMEM; - if (err) - return err; + INIT_LIST_HEAD(&sub_fdt->l); + sub_fdt->name = name; + sub_fdt->fdt = fdt; + + mutex_lock(&kho_out.fdts_lock); + list_add_tail(&sub_fdt->l, &kho_out.sub_fdts); + err = kho_debugfs_fdt_add(&kho_out.dbg, name, fdt, false); + mutex_unlock(&kho_out.fdts_lock); - return kho_debugfs_fdt_add(&kho_out.dbg, name, fdt, false); + return err; } EXPORT_SYMBOL_GPL(kho_add_subtree);
-int register_kho_notifier(struct notifier_block *nb) +void kho_remove_subtree(void *fdt) { - return blocking_notifier_chain_register(&kho_out.chain_head, nb); -} -EXPORT_SYMBOL_GPL(register_kho_notifier); + struct kho_sub_fdt *sub_fdt; + + mutex_lock(&kho_out.fdts_lock); + list_for_each_entry(sub_fdt, &kho_out.sub_fdts, l) { + if (sub_fdt->fdt == fdt) { + list_del(&sub_fdt->l); + kfree(sub_fdt); + kho_debugfs_fdt_remove(&kho_out.dbg, fdt); + break; + } + } + mutex_unlock(&kho_out.fdts_lock); -int unregister_kho_notifier(struct notifier_block *nb) -{ - return blocking_notifier_chain_unregister(&kho_out.chain_head, nb); } -EXPORT_SYMBOL_GPL(unregister_kho_notifier); +EXPORT_SYMBOL_GPL(kho_remove_subtree); /** * kho_preserve_folio - preserve a folio across kexec. @@ -644,7 +658,7 @@ int kho_preserve_folio(struct folio *folio) { const unsigned long pfn = folio_pfn(folio); const unsigned int order = folio_order(folio); - struct kho_mem_track *track = &kho_out.ser.track; + struct kho_mem_track *track = &kho_out.track; if (kho_out.finalized) return -EBUSY; @@ -670,7 +684,7 @@ int kho_preserve_phys(phys_addr_t phys, size_t size) const unsigned long start_pfn = pfn; const unsigned long end_pfn = PHYS_PFN(phys + size); int err = 0; - struct kho_mem_track *track = &kho_out.ser.track; + struct kho_mem_track *track = &kho_out.track; if (kho_out.finalized) return -EBUSY; @@ -700,11 +714,11 @@ EXPORT_SYMBOL_GPL(kho_preserve_phys); static int __kho_abort(void) { - int err; + int err = 0; unsigned long order; struct kho_mem_phys *physxa; - xa_for_each(&kho_out.ser.track.orders, order, physxa) { + xa_for_each(&kho_out.track.orders, order, physxa) { struct kho_mem_phys_bits *bits; unsigned long phys; @@ -714,17 +728,13 @@ static int __kho_abort(void) xa_destroy(&physxa->phys_bits); kfree(physxa); } - xa_destroy(&kho_out.ser.track.orders); + xa_destroy(&kho_out.track.orders); - if (kho_out.ser.preserved_mem_map) { -
kho_mem_ser_free(kho_out.ser.preserved_mem_map); - kho_out.ser.preserved_mem_map = NULL; + if (kho_out.preserved_mem_map) { + kho_mem_ser_free(kho_out.preserved_mem_map); + kho_out.preserved_mem_map = NULL; } - err = blocking_notifier_call_chain(&kho_out.chain_head, KEXEC_KHO_ABORT, - NULL); - err = notifier_to_errno(err); - if (err) pr_err("Failed to abort KHO finalization: %d\n", err); @@ -751,7 +761,7 @@ int kho_abort(void) kho_out.finalized = false; - kho_debugfs_cleanup(&kho_out.dbg); + kho_debugfs_fdt_remove(&kho_out.dbg, kho_out.fdt); unlock: mutex_unlock(&kho_out.lock); @@ -762,41 +772,46 @@ static int __kho_finalize(void) { int err = 0; u64 *preserved_mem_map; - void *fdt = page_to_virt(kho_out.ser.fdt); + void *root = kho_out.fdt; + struct kho_sub_fdt *fdt; - err |= fdt_create(fdt, PAGE_SIZE); - err |= fdt_finish_reservemap(fdt); - err |= fdt_begin_node(fdt, ""); - err |= fdt_property_string(fdt, "compatible", KHO_FDT_COMPATIBLE); + err |= fdt_create(root, PAGE_SIZE); + err |= fdt_finish_reservemap(root); + err |= fdt_begin_node(root, ""); + err |= fdt_property_string(root, "compatible", KHO_FDT_COMPATIBLE); /** * Reserve the preserved-memory-map property in the root FDT, so * that all property definitions will precede subnodes created by * KHO callers.
*/ - err |= fdt_property_placeholder(fdt, PROP_PRESERVED_MEMORY_MAP, + err |= fdt_property_placeholder(root, PROP_PRESERVED_MEMORY_MAP, sizeof(*preserved_mem_map), (void **)&preserved_mem_map); if (err) goto abort; - err = kho_preserve_folio(page_folio(kho_out.ser.fdt)); + err = kho_preserve_folio(virt_to_folio(kho_out.fdt)); if (err) goto abort; - err = blocking_notifier_call_chain(&kho_out.chain_head, - KEXEC_KHO_FINALIZE, &kho_out.ser); - err = notifier_to_errno(err); + err = kho_mem_serialize(&kho_out); if (err) goto abort; - err = kho_mem_serialize(&kho_out.ser); - if (err) - goto abort; + *preserved_mem_map = (u64)virt_to_phys(kho_out.preserved_mem_map); - *preserved_mem_map = (u64)virt_to_phys(kho_out.ser.preserved_mem_map); + mutex_lock(&kho_out.fdts_lock); + list_for_each_entry(fdt, &kho_out.sub_fdts, l) { + phys_addr_t phys = virt_to_phys(fdt->fdt); - err |= fdt_end_node(fdt); - err |= fdt_finish(fdt); + err |= fdt_begin_node(root, fdt->name); + err |= fdt_property(root, PROP_SUB_FDT, &phys, sizeof(phys)); + err |= fdt_end_node(root); + }; + mutex_unlock(&kho_out.fdts_lock); + + err |= fdt_end_node(root); + err |= fdt_finish(root); abort: if (err) { @@ -827,7 +842,7 @@ int kho_finalize(void) kho_out.finalized = true; ret = kho_debugfs_fdt_add(&kho_out.dbg, "fdt", - page_to_virt(kho_out.ser.fdt), true); + kho_out.fdt, true); unlock: mutex_unlock(&kho_out.lock); @@ -899,15 +914,17 @@ static __init int kho_init(void) { int err = 0; const void *fdt = kho_get_fdt(); + struct page *fdt_page; if (!kho_enable) return 0; - kho_out.ser.fdt = alloc_page(GFP_KERNEL); - if (!kho_out.ser.fdt) { + fdt_page = alloc_page(GFP_KERNEL); + if (!fdt_page) { err = -ENOMEM; goto err_free_scratch; } + kho_out.fdt = page_to_virt(fdt_page); err = kho_debugfs_init(); if (err) @@ -935,8 +952,8 @@ static __init int kho_init(void) return 0; err_free_fdt: - put_page(kho_out.ser.fdt); -
kho_out.ser.fdt = NULL; + put_page(fdt_page); + kho_out.fdt = NULL; err_free_scratch: for (int i = 0; i < kho_scratch_cnt; i++) { void *start = __va(kho_scratch[i].addr); @@ -947,7 +964,7 @@ static __init int kho_init(void) kho_enable = false; return err; } -late_initcall(kho_init); +fs_initcall(kho_init); static void __init kho_release_scratch(void) { @@ -1083,7 +1100,7 @@ int kho_fill_kimage(struct kimage *image) if (!kho_enable) return 0; - image->kho.fdt = page_to_phys(kho_out.ser.fdt); + image->kho.fdt = virt_to_phys(kho_out.fdt); scratch_size = sizeof(*kho_scratch) * kho_scratch_cnt; scratch = (struct kexec_buf){ diff --git a/kernel/kexec_handover_debug.c b/kernel/kexec_handover_debug.c index b88d138a97be..af4bad225630 100644 --- a/kernel/kexec_handover_debug.c +++ b/kernel/kexec_handover_debug.c @@ -61,14 +61,17 @@ int kho_debugfs_fdt_add(struct kho_debugfs *dbg, const char *name, return __kho_debugfs_fdt_add(&dbg->fdt_list, dir, name, fdt); } -void kho_debugfs_cleanup(struct kho_debugfs *dbg) +void kho_debugfs_fdt_remove(struct kho_debugfs *dbg, void *fdt) { - struct fdt_debugfs *ff, *tmp; - - list_for_each_entry_safe(ff, tmp, &dbg->fdt_list, list) { - debugfs_remove(ff->file); - list_del(&ff->list); - kfree(ff); + struct fdt_debugfs *ff; + + list_for_each_entry(ff, &dbg->fdt_list, list) { + if (ff->wrapper.data == fdt) { + debugfs_remove(ff->file); + list_del(&ff->list); + kfree(ff); + break; + } } } diff --git a/kernel/kexec_handover_internal.h b/kernel/kexec_handover_internal.h index 41e9616fcdd0..240517596ea3 100644 --- a/kernel/kexec_handover_internal.h +++ b/kernel/kexec_handover_internal.h @@ -30,7 +30,7 @@ void kho_in_debugfs_init(struct kho_debugfs *dbg, const void *fdt); int kho_out_debugfs_init(struct kho_debugfs *dbg); int kho_debugfs_fdt_add(struct kho_debugfs *dbg, const char *name, const void *fdt, bool root); -void kho_debugfs_cleanup(struct kho_debugfs *dbg); +void
kho_debugfs_fdt_remove(struct kho_debugfs *dbg, void *fdt); #else static inline int kho_debugfs_init(void) { return 0; } static inline void kho_in_debugfs_init(struct kho_debugfs *dbg, @@ -38,7 +38,8 @@ static inline void kho_in_debugfs_init(struct kho_debugfs *dbg, static inline int kho_out_debugfs_init(struct kho_debugfs *dbg) { return 0; } static inline int kho_debugfs_fdt_add(struct kho_debugfs *dbg, const char *name, const void *fdt, bool root) { return 0; } -static inline void kho_debugfs_cleanup(struct kho_debugfs *dbg) {} +static inline void kho_debugfs_fdt_remove(struct kho_debugfs *dbg, + void *fdt) { } #endif /* CONFIG_KEXEC_HANDOVER_DEBUG */ #endif /* LINUX_KEXEC_HANDOVER_INTERNAL_H */ diff --git a/mm/memblock.c b/mm/memblock.c index 154f1d73b61f..6af0b51b1bb7 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -2501,51 +2501,18 @@ int reserve_mem_release_by_name(const char *name) #define MEMBLOCK_KHO_FDT "memblock" #define MEMBLOCK_KHO_NODE_COMPATIBLE "memblock-v1" #define RESERVE_MEM_KHO_NODE_COMPATIBLE "reserve-mem-v1" -static struct page *kho_fdt; - -static int reserve_mem_kho_finalize(struct kho_serialization *ser) -{ - int err = 0, i; - - for (i = 0; i < reserved_mem_count; i++) { - struct reserve_mem_table *map = &reserved_mem_table[i]; - - err |= kho_preserve_phys(map->start, map->size); - } - - err |= kho_preserve_folio(page_folio(kho_fdt)); - err |= kho_add_subtree(ser, MEMBLOCK_KHO_FDT, page_to_virt(kho_fdt)); - - return notifier_from_errno(err); -} - -static int reserve_mem_kho_notifier(struct notifier_block *self, - unsigned long cmd, void *v) -{ - switch (cmd) { - case KEXEC_KHO_FINALIZE: - return reserve_mem_kho_finalize((struct kho_serialization *)v); - case KEXEC_KHO_ABORT: - return NOTIFY_DONE; - default: - return NOTIFY_BAD; - } -} - -static struct notifier_block reserve_mem_kho_nb = { - .notifier_call = reserve_mem_kho_notifier, -}; static int __init prepare_kho_fdt(void) { int err = 0, i; + struct page
*fdt_page; void *fdt; - kho_fdt = alloc_page(GFP_KERNEL); - if (!kho_fdt) + fdt_page = alloc_page(GFP_KERNEL); + if (!fdt_page) return -ENOMEM; - fdt = page_to_virt(kho_fdt); + fdt = page_to_virt(fdt_page); err |= fdt_create(fdt, PAGE_SIZE); err |= fdt_finish_reservemap(fdt); @@ -2555,6 +2522,7 @@ static int __init prepare_kho_fdt(void) for (i = 0; i < reserved_mem_count; i++) { struct reserve_mem_table *map = &reserved_mem_table[i]; + err |= kho_preserve_phys(map->start, map->size); err |= fdt_begin_node(fdt, map->name); err |= fdt_property_string(fdt, "compatible", RESERVE_MEM_KHO_NODE_COMPATIBLE); err |= fdt_property(fdt, "start", &map->start, sizeof(map->start)); @@ -2562,13 +2530,14 @@ static int __init prepare_kho_fdt(void) err |= fdt_end_node(fdt); } err |= fdt_end_node(fdt); - err |= fdt_finish(fdt); + err |= kho_preserve_folio(page_folio(fdt_page)); + err |= kho_add_subtree(MEMBLOCK_KHO_FDT, fdt); + if (err) { pr_err("failed to prepare memblock FDT for KHO: %d\n", err); - put_page(kho_fdt); - kho_fdt = NULL; + put_page(fdt_page); } return err; @@ -2584,13 +2553,6 @@ static int __init reserve_mem_init(void) err = prepare_kho_fdt(); if (err) return err; - - err = register_kho_notifier(&reserve_mem_kho_nb); - if (err) { - put_page(kho_fdt); - kho_fdt = NULL; - } - return err; } late_initcall(reserve_mem_init); -- 2.50.1.565.gc32cd1483b-goog From nobody Sun Oct 5 07:24:30 2025 Received: from mail-qv1-f45.google.com (mail-qv1-f45.google.com [209.85.219.45]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E98691F76A8 for ; Thu, 7 Aug 2025 01:44:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.45 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531101; cv=none;
b=orv41Czu45AM59iAmQQyczxFsW+YxduepOYjaYybvomrRKhiRuorgvnkfUWIF8l+a3VJiLLAEkvq2YG2zBspDyOYOH+MBv99KwwunEtatyFYt79MnAt6+dSFN5zyhdZxeXiqN6PIGfL5KEI8MFdKw+AYmaxYYPbyBthgoInj9uY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531101; c=relaxed/simple; bh=1D7JJh7HzPnlfEMFGjJk71iWrGihfks9VCV0ygwxBoM=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=pz5YIQjShbSbs3agGUst6XIXZpcOuPp9oxVs18HITJm19B6UxzuHo5GZC5gW+O3jWGX6FbWVig9wKnxEMfRWoCYtLXcoDcTYJHNSahgumq37TIOkUEYbrtxppLoeCaM5fO848gm9oHTqDaJh1DENBEOBgtjYkoVuIduQbc7KjZ8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=aY17lsS7; arc=none smtp.client-ip=209.85.219.45 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="aY17lsS7" Received: by mail-qv1-f45.google.com with SMTP id 6a1803df08f44-7073075c767so6815116d6.3 for ; Wed, 06 Aug 2025 18:44:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531098; x=1755135898; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=s+B2pqucc9Fmko4Lv5M93Olp+poBlrEG0AoODeQS2z8=; b=aY17lsS7jsngqvql3ro45AGZyeWrvlEq+Gf7+yl6HpzkERZliHgIWFRrYAOviBhkzb mO02ClxmGh3nqhGuxyPb2fxsxF022xHfFEykIHsXft8Q6eFDQnKKMVXuVXPouipBnLDE 8lity8N437B7XWpZQ7HlEVyn5kZRhuthhUG5PH3Qmae3o45b9aBnX4wCLAQqyacQpELo ke72YlComU75XmNm4Q/dMkNGpGj4La+m1bQhOJfKLIfetfHOEo4cGTMnR29LGXlUiYMS Gzejaovhwgr2PlC3iNB8a6muoluRmXtczwlHkT+gvZdH+vlgUzlOmrLMnHPQgP1A9Phv 7TNw== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531098; x=1755135898; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=s+B2pqucc9Fmko4Lv5M93Olp+poBlrEG0AoODeQS2z8=; b=JNe1MmZuWPBkrpput9lDkMLEBqC8IxmFvVfXfDn0jDXeXHpRF04Ie5x3bui0MdW0d4 bQTeXHPHYhnXy1orycXfVXW1Z1vdtElNUXSRPXm2piMk8/tXM1leTzBTwfZfd10Jn/E1 EJ22IFEIwxIpXogyOqxa3lFg1dbmEjeXShbSA53gHVt5i48CmXIU0HCKqY48/L++1wyi 7kztRPRl/b2Sj/XHovwBqdGTeP4T/JZStX/c3oYzb/Zzs2lrvt+aftSlIPeiW663+U/y 47+ljZLXAk0hdnVUL4e6TFD9j+tJMgwty6vODPqLoEj6ml0zWHKTfbZKshw1MG8J8tJj wQgA== X-Forwarded-Encrypted: i=1; AJvYcCUMHR7iFgPMrBHnS5WyUDu0q4aPaDPIPmXn77mqJnwko6uS/cjBaV5MHEBg04VW2FQLzaeftXiwwoDYLGY=@vger.kernel.org X-Gm-Message-State: AOJu0Ywejc0wmspTzTYC3HyyEy8Uq2PJQaFuRrTMHDdv+bzulZveoPqX 9+Lo/C1rcg9lNNSd03mic1nmp7V+cv6OTCSkdC3vO8gEPUXhPndEr5cbejS0VEItn5s= X-Gm-Gg: ASbGncsAYPetpsLlHfY2tCI9jKMm4vsrq362/CKv0D9oiNs2JwePnrp2ytdUvhdvlMH HHn0eyFeC8tQuxaptVv0GzIiv/6Qizmqhnavd1WQ/JIxk2g2o7rV8JiXEn7XOhFD6XNI4KUZ8HH UyacBpMFdkwRcurGKx9qz0ejlQXpfgFnw0gl9u5QNGgJFbudDrk9GmADmXDfoh8sYpqr0AohOzI aaENA3snPcUIzO2dwU+ijvEWGmizQMx/ap1nprY1rZLfZBOhaHOiARvNZGIoOlpzDuG843gEAaW 4Pb7me4VN5X16pBC2ExAEhc94yZlWTlRbW89ODB78jU/4fIyDmBJDqMDjrsNeIeKugleitzFTCE cKNbV2VYJO1DeF+LTCgGI02AktR2rW2zxlEr10oMEVgGnkEQ48Mb0QN0TO4S/moc2R9+QKOWRQi tysJPubDlbOikKAWE/rCaQEZ4= X-Google-Smtp-Source: AGHT+IGAEOC4X/n5iybE9j8P6hUMRBMLG9Syy/4PfWc1CO2/gkmjV3XW7jEAwKFIXrmXPPP2oTU9ug== X-Received: by 2002:a05:6214:5081:b0:707:415b:c13a with SMTP id 6a1803df08f44-70979539363mr72512306d6.22.1754531097555; Wed, 06 Aug 2025 18:44:57 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.44.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:44:56 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 07/30] kho: add interfaces to unpreserve folios and physical memory ranges Date: Thu, 7 Aug 2025 01:44:13 +0000 Message-ID: 
<20250807014442.3829950-8-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com> References: <20250807014442.3829950-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Changyuan Lyu Allow users of KHO to cancel a previous preservation by adding interfaces to unpreserve folios and physical memory ranges. Signed-off-by: Changyuan Lyu Co-developed-by: Pasha Tatashin Signed-off-by: Pasha Tatashin --- include/linux/kexec_handover.h | 12 +++++ kernel/kexec_handover.c | 90 +++++++++++++++++++++++++++++----- 2 files changed, 89 insertions(+), 13 deletions(-) diff --git a/include/linux/kexec_handover.h b/include/linux/kexec_handover.h index cabdff5f50a2..383e9460edb9 100644 --- a/include/linux/kexec_handover.h +++ b/include/linux/kexec_handover.h @@ -33,7 +33,9 @@ struct folio; bool kho_is_enabled(void); int kho_preserve_folio(struct folio *folio); +int kho_unpreserve_folio(struct folio *folio); int kho_preserve_phys(phys_addr_t phys, size_t size); +int kho_unpreserve_phys(phys_addr_t phys, size_t size); struct folio *kho_restore_folio(phys_addr_t phys); int kho_add_subtree(const char *name, void *fdt); void kho_remove_subtree(void *fdt); @@ -58,11 +60,21 @@ static inline int kho_preserve_folio(struct folio *folio) return -EOPNOTSUPP; } +static inline int kho_unpreserve_folio(struct folio *folio) +{ + return -EOPNOTSUPP; +} + static inline int kho_preserve_phys(phys_addr_t phys, size_t size) { return -EOPNOTSUPP; } +static inline int kho_unpreserve_phys(phys_addr_t phys, size_t size) +{ + return -EOPNOTSUPP; +} + static inline struct folio *kho_restore_folio(phys_addr_t phys) { return NULL; diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c index 8a4894e8ac71..b2e99aefbb32 100644
--- a/kernel/kexec_handover.c +++ b/kernel/kexec_handover.c @@ -136,26 +136,33 @@ static void *xa_load_or_alloc(struct xarray *xa, unsigned long index, size_t sz) return elm; } -static void __kho_unpreserve(struct kho_mem_track *track, unsigned long pfn, - unsigned long end_pfn) +static void __kho_unpreserve_order(struct kho_mem_track *track, unsigned long pfn, + unsigned int order) { struct kho_mem_phys_bits *bits; struct kho_mem_phys *physxa; + const unsigned long pfn_high = pfn >> order; - while (pfn < end_pfn) { - const unsigned int order = - min(count_trailing_zeros(pfn), ilog2(end_pfn - pfn)); - const unsigned long pfn_high = pfn >> order; + physxa = xa_load(&track->orders, order); + if (!physxa) + return; - physxa = xa_load(&track->orders, order); - if (!physxa) - continue; + bits = xa_load(&physxa->phys_bits, pfn_high / PRESERVE_BITS); + if (!bits) + return; - bits = xa_load(&physxa->phys_bits, pfn_high / PRESERVE_BITS); - if (!bits) - continue; + clear_bit(pfn_high % PRESERVE_BITS, bits->preserve); +} - clear_bit(pfn_high % PRESERVE_BITS, bits->preserve); +static void __kho_unpreserve(struct kho_mem_track *track, unsigned long pfn, + unsigned long end_pfn) +{ + unsigned int order; + + while (pfn < end_pfn) { + order = min(count_trailing_zeros(pfn), ilog2(end_pfn - pfn)); + + __kho_unpreserve_order(track, pfn, order); pfn += 1 << order; } @@ -667,6 +674,30 @@ int kho_preserve_folio(struct folio *folio) } EXPORT_SYMBOL_GPL(kho_preserve_folio); +/** + * kho_unpreserve_folio - unpreserve a folio. + * @folio: folio to unpreserve. + * + * Instructs KHO to unpreserve a folio that was preserved by + * kho_preserve_folio() before. The provided @folio (pfn and order) + * must exactly match a previously preserved folio.
+ * + * Return: 0 on success, error code on failure + */ +int kho_unpreserve_folio(struct folio *folio) +{ + const unsigned long pfn = folio_pfn(folio); + const unsigned int order = folio_order(folio); + struct kho_mem_track *track = &kho_out.track; + + if (kho_out.finalized) + return -EBUSY; + + __kho_unpreserve_order(track, pfn, order); + return 0; +} +EXPORT_SYMBOL_GPL(kho_unpreserve_folio); + /** * kho_preserve_phys - preserve a physically contiguous range across kexec. * @phys: physical address of the range. @@ -712,6 +743,39 @@ int kho_preserve_phys(phys_addr_t phys, size_t size) } EXPORT_SYMBOL_GPL(kho_preserve_phys); +/** + * kho_unpreserve_phys - unpreserve a physically contiguous range. + * @phys: physical address of the range. + * @size: size of the range. + * + * Instructs KHO to unpreserve the memory range from @phys to @phys + @size. + * The @phys address must be aligned to @size, and @size must be a + * power-of-2 multiple of PAGE_SIZE. + * This call must exactly match the granularity at which the memory was + * originally preserved (that is, a `kho_preserve_phys` call with the same + * `phys` and `size`). Unpreserving arbitrary sub-ranges of larger preserved + * blocks is not supported.
+ * + * Return: 0 on success, error code on failure + */ +int kho_unpreserve_phys(phys_addr_t phys, size_t size) +{ + struct kho_mem_track *track = &kho_out.track; + unsigned long pfn = PHYS_PFN(phys); + unsigned long end_pfn = PHYS_PFN(phys + size); + + if (kho_out.finalized) + return -EBUSY; + + if (!PAGE_ALIGNED(phys) || !PAGE_ALIGNED(size)) + return -EINVAL; + + __kho_unpreserve(track, pfn, end_pfn); + + return 0; +} +EXPORT_SYMBOL_GPL(kho_unpreserve_phys); + static int __kho_abort(void) { int err = 0; -- 2.50.1.565.gc32cd1483b-goog From nobody Sun Oct 5 07:24:30 2025 Received: from mail-qv1-f54.google.com (mail-qv1-f54.google.com [209.85.219.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1C8D81FC0ED for ; Thu, 7 Aug 2025 01:45:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.54 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531102; cv=none; b=Wq9x6oGbxqVKet2myC0rP57AInAqy+bkA2hGj3wkflge+PMcCUSeQlVUYtr84IjSISwoR4mw4a2V1PivEiQquyCq1jdryr3uOYhhHpPFwPJLnIcheqZVXu4VmQ2cPgFViXsfYk1vr2bb1ldBL6CW/cUkS1/uopFzuK0VE/hrtbQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531102; c=relaxed/simple; bh=vVfONrVyjedFUgoOJUn6GW1APtTwOCiADsyg5cO0faw=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=mxxPmP9w9cKkX+QeRmmQhPav4bYzlvqhMaZr0mjnSwNQvHegKLK8qPEFhXGYpT/++yrSNT3lalkOIZy8VT4yjFeU0el0YBt6eJgaZUM58LqfE0zKnEWPcUF24zgQgN2LG3M6gj/vDmYIJXcoPD11wwzWaNaMNLcs+3UT/3sR6Ec= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=IQdmmI+C; arc=none smtp.client-ip=209.85.219.54 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none)
header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="IQdmmI+C" Received: by mail-qv1-f54.google.com with SMTP id 6a1803df08f44-7074710a90bso5398546d6.0 for ; Wed, 06 Aug 2025 18:45:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531099; x=1755135899; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=t7G+m/ldx5Lte6prj7fLyPcxxJI86crKx906ZhMWQQk=; b=IQdmmI+Cv+zDt/QzpCqEFU+RsJufI9UrlJanooqH41E/pSGh2xBX94CkOdyOKWaTE7 Bmfm0FqP5g2LTu+YVfYqUjdv6aCwkxI20wKMnNTgpJ05adCDLYJ/C9qFtb973WX3Fmhn tJx02/qy0r90B+GLvXJluBS2j+PwRGuaGT+AHB8wqAkuRZEzhbmn+cN0vhz0y4kS+dl7 6m0zEJLm/8tzaLdZN8ZNea5IuCPxRf/aQMqzOltWNW+mkVfprHfbV7Rudb1IJatZGWeQ vIPQO7P77to93JH+NOFLHZZa304TZHzbfoauEnSih7VdITG783ffpWOlvKUQmDazuXzT FJYg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531099; x=1755135899; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=t7G+m/ldx5Lte6prj7fLyPcxxJI86crKx906ZhMWQQk=; b=Zh552kEVCV6dGx3T3q9jo1/GgRNmd/grHAaRmzRA0BXmL9FvfHaH4gLEIwlecFXZOj qGMGLfvcUhlTpwER7DR/cRJG4oBtXlVLwMlbYbxfFzLErAVfO38qquf4kMoBQzBZC8xg r57kUvCeaIrsUcJ3X5r8naWxK4k/7oqUzK4OYU+9ANEEIu5aK+6DZN9+EdauLKTzLiHa pPf8Xeln1f0imH0T26bREcdWPcRi5kwgDItdx6DBi1fNMj2aEyqlOgdrdkVex4MdPOHD VuOsT5cDmbBo5qkx/lDsjXi7FVSYKRqgYW+ZVf5q6YEoma2epeeFwrEmSezgZuwrs4kE dgJw== X-Forwarded-Encrypted: i=1; AJvYcCWk94OFU1xSFHjl/l30LSNRx9bwJ5PZyBqQoBtkrG0rNwpZOMZ3Il4xmN3HiwKNi7DyB/2gFhGO9wyqC/k=@vger.kernel.org X-Gm-Message-State: AOJu0YzzOES3kxQLctDtMxejI7p2mp0cN6qLGr43oYq5nbV7qRR2lUFX 
n2tI22fKp1w+YETJpK0ycF/JlFBJyyDABAUjtQ6jHk/hR2e9xRfS8KUemikO8nAhBW8= X-Gm-Gg: ASbGncsw8rnEgD4ZKl/ASKn6nFZId5K520C3ZxVdUuBYi4kr4IKq7b0QHWzkpE/y5W+ sTfPO3QUqtkgIv8VWPlWqmHPfwxPplazqpm+34Uo6f/83k30yiZN0Dzu18CFj2e1xnJNKjG3fox Zmtxs71jfIB9LPJGAL2nfcX/y1eDzbDAQLbWIVOQ4xwHmEaezRiVha3FB6eX7aRam2yl+BNR2yL 1yDI2W9GhtLqvrXO8ADBS4EQyAzUo18Z1/PPsM4tSkHZKSRc20DTljrNh3od02NlQSXD2PHjidh 7mTO61vah3LbFRmIrlPBAKiFdhF8ax1furVD6CMjKOh+kYw1QGwmE6GtxQpmmfpkhpMLtao3zPw AVA69/wesbY6+N9uv09m4sIyvUdDGqyO+C7VnFS4VUGGXO9H36kqMKVtH2iKyVL1aiMQqqOcdR1 IzwwGNmw49Bmjk X-Google-Smtp-Source: AGHT+IFrm+HnX+DKqjrzyDEzZevUOzkckUB98RzWsO6EpAjPOvGPrTYO1ThdZbbOoF0JmkLR1SS4uA== X-Received: by 2002:a05:6214:c62:b0:707:43a1:5b0d with SMTP id 6a1803df08f44-7097964cf4cmr68796306d6.41.1754531099046; Wed, 06 Aug 2025 18:44:59 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. [34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.44.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:44:58 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, 
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 08/30] kho: don't unpreserve memory during abort Date: Thu, 7 Aug 2025 01:44:14 +0000 Message-ID: <20250807014442.3829950-9-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com> References: <20250807014442.3829950-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" KHO allows clients to preserve memory regions at any point before the KHO state is finalized. The finalization process itself involves KHO performing its own actions, such as serializing the overall preserved memory map. If this finalization process is aborted, the current implementation destroys KHO's internal memory tracking structures (`kho_out.track.orders`). This behavior effectively unpreserves all memory from KHO's perspective, regardless of whether those preservations were made by clients before the finalization attempt or by KHO itself during finalization. This premature unpreservation is incorrect. 
An abort of the finalization process should only undo actions taken by KHO as part of that specific finalization attempt. Individual memory regions preserved by clients prior to finalization should remain preserved, as their lifecycle is managed by the clients themselves. These clients might still need to call kho_unpreserve_folio() or kho_unpreserve_phys() based on their own logic, even after a KHO finalization attempt is aborted. Signed-off-by: Pasha Tatashin --- kernel/kexec_handover.c | 21 +-------------------- 1 file changed, 1 insertion(+), 20 deletions(-) diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c index b2e99aefbb32..07755184f44b 100644 --- a/kernel/kexec_handover.c +++ b/kernel/kexec_handover.c @@ -778,31 +778,12 @@ EXPORT_SYMBOL_GPL(kho_unpreserve_phys); =20 static int __kho_abort(void) { - int err =3D 0; - unsigned long order; - struct kho_mem_phys *physxa; - - xa_for_each(&kho_out.track.orders, order, physxa) { - struct kho_mem_phys_bits *bits; - unsigned long phys; - - xa_for_each(&physxa->phys_bits, phys, bits) - kfree(bits); - - xa_destroy(&physxa->phys_bits); - kfree(physxa); - } - xa_destroy(&kho_out.track.orders); - if (kho_out.preserved_mem_map) { kho_mem_ser_free(kho_out.preserved_mem_map); kho_out.preserved_mem_map =3D NULL; } =20 - if (err) - pr_err("Failed to abort KHO finalization: %d\n", err); - - return err; + return 0; } =20 int kho_abort(void) --=20 2.50.1.565.gc32cd1483b-goog From nobody Sun Oct 5 07:24:30 2025 Received: from mail-qv1-f54.google.com (mail-qv1-f54.google.com [209.85.219.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E378E1A5BAF for ; Thu, 7 Aug 2025 01:45:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.54 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531104; cv=none; 
b=aySyJhD3gEPKHtH3czGaE9pXtPPVtMY/T4PjnxUFgUIaTBWaDO/eTbeToJoG2720IyTeihvN0avHdDmSQmd175o7slhctWw7BWaPNDfjGAuvfB9NTYd+NdKN/4N0VjOFIlE4tesNGBvHLV38SUpfZxK4hA3UCLUyuKcWJN8YZRM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531104; c=relaxed/simple; bh=PKiNhDV/woD9XQWUc1QLLTXDf6qKPRlDYuwRtODBNz0=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=M35tf4mFj0gP/cK0SqBjkQKEMas04QAKp5RYrbDJDKJMe0+LNLWbF0GXnZIXg/Q4//jL+HFmCOpkTIg8wwEzkn4Uo2R6K3QwiUDcxHdWj5PJASD7dCFgPozbmsGgaWeR0NATqhdiV6tFj1C+k5FmO5htvOoIAoiq3YxcgzMHw6w= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=eMJ/1kvi; arc=none smtp.client-ip=209.85.219.54 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="eMJ/1kvi" Received: by mail-qv1-f54.google.com with SMTP id 6a1803df08f44-6f8aa9e6ffdso6083776d6.3 for ; Wed, 06 Aug 2025 18:45:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531101; x=1755135901; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=5YQfTLXaPrwDWqa20JnWXXwtihBOEt4q7FRxEsztbTE=; b=eMJ/1kviecAKHCx1NN9fWza/jo12NvyAvLBw2c37rFrINU8UaJalQExTosnB3QbP0O AVRct7g//CG1g1kXjvUslwRV///v5JZnbryUosKEKHjdEudzWog+lY1TgfQl9G46WNbB EWGfaWpRaOHcWIH0lFocExwM45eVgjRwHmfTcbRaVSGkUHuIBF15grWD68fUNU8EYjOr Rs3x8K1N/7E4klinMLLILvRfXzkKEUw+PNKn9xndXwL7nNKB+wbmjGZOrJex+xnsVYTo 70URJ2i+jU+xm48FbnNf57QriyWlVt5bncQ+XAZmBHmIU/gkE2MBJyePUgBlWGQbXZiL tN3A== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531101; x=1755135901; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=5YQfTLXaPrwDWqa20JnWXXwtihBOEt4q7FRxEsztbTE=; b=kaQTD2eYcMKhYXE+tjfGadi+5KkObk7Lr/qlS0IuNJFZokou5v6Yc2jV7BcbBgR4Lf Deb7ktudwXXSWNtCEG5qoIgw41HQ3cNLHycXTUe6u3OtY5cdvibYZWFRJihFD5w5N29L oTIGtt5VTOYbhdFRwZ0A6XUsTAXOgIsMR9qtpDKYs/1yFfkl+QZUXzEeZulLA916phBI EszJ3qhurC2pMlZKSu4ODhc8ty/FnC3KdLXsf4dR8AFKps8LyF34wdo6MbRS4zU01Jd7 8WrJtxN1hMC6BMVW9dEcHkajhrdL4ajivoZlJjA55ugwrkIyaAqm1rz7YngBLZb5wWaa Ottw== X-Forwarded-Encrypted: i=1; AJvYcCU/AgaTizKP2twYLDrN0YIlphfs7BSVudvQsaTM6jS/XTvnoV1DDsuQlP6y0u/8Mia6LBGoi4TzaOssgFI=@vger.kernel.org X-Gm-Message-State: AOJu0YyCEDUQ3yNbayX6nwOeOPVa2K14QZhZgRb5Ib5Nx8Pi9avBRZvM EpiQN0QasoTjWc8WEXJF9U1jVM7gNf95riDvtkOvoCDedS3ZfqAgywza7/WBh4aREqA= X-Gm-Gg: ASbGncuiuOS58Gzjqx4fbEuwMv+4ub77VxiaITzq6lHbaGkj3UHc9RzjYvhgr8TPx1+ HEkOAOn0Bp2jBrx2YUnxdcu4qvpkI8J5F0mZT1pGGqruIT354AxpFqQORjDl1wMJnYuaa2NIMkh X0FcO1F38ZiZxSvA8+pCMzCNPOhLKHr5VLVOrXRv83zDck8ia/n5XAVgU23P1QeQ0JAmd+LheDu ISFYv8dURyrClgU3ZazaE+REedICAk1iYPy/jc5Bsrvr9WEZICYshTsET5GkxnJ/iJ6UftVdd1Z KenVXgbP8OK3OiREdHMFtRWDrJB5eLIb4RWJ/SmpLcY66Neu5y6TW8M8WhGwyHrN30EWDn9HACF 0BgFpG8jB/RddCMFF0XpAaafgbSYeElDCzuYb46x9yE/Z6rE/zdYFd6EStPvCj+a7QBMlCfJ95k GZU26V/gOJbnCL X-Google-Smtp-Source: AGHT+IHmmcvtRUPUdTTVwMopqE/V5cogsgxyMiffpH9sAyZ4atsq/VQ1CQZ07gsqifDfn1bW6xxifg== X-Received: by 2002:a05:6214:40f:b0:704:f7d8:edfc with SMTP id 6a1803df08f44-7098a8639d6mr19787226d6.49.1754531100577; Wed, 06 Aug 2025 18:45:00 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.44.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:44:59 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 09/30] liveupdate: kho: move to kernel/liveupdate Date: Thu, 7 Aug 2025 01:44:15 +0000 Message-ID: 
<20250807014442.3829950-10-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com> References: <20250807014442.3829950-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Move KHO to kernel/liveupdate/ in preparation for placing all Live Update core kernel files in one place. Signed-off-by: Pasha Tatashin Reviewed-by: Jason Gunthorpe --- Documentation/core-api/kho/concepts.rst | 2 +- MAINTAINERS | 2 +- init/Kconfig | 2 ++ kernel/Kconfig.kexec | 25 ---------------- kernel/Makefile | 3 +- kernel/liveupdate/Kconfig | 30 +++++++++++++++++++ kernel/liveupdate/Makefile | 7 +++++ kernel/{ =3D> liveupdate}/kexec_handover.c | 6 ++-- .../{ =3D> liveupdate}/kexec_handover_debug.c | 0 .../kexec_handover_internal.h | 0 10 files changed, 45 insertions(+), 32 deletions(-) create mode 100644 kernel/liveupdate/Kconfig create mode 100644 kernel/liveupdate/Makefile rename kernel/{ =3D> liveupdate}/kexec_handover.c (99%) rename kernel/{ =3D> liveupdate}/kexec_handover_debug.c (100%) rename kernel/{ =3D> liveupdate}/kexec_handover_internal.h (100%) diff --git a/Documentation/core-api/kho/concepts.rst b/Documentation/core-a= pi/kho/concepts.rst index 36d5c05cfb30..d626d1dbd678 100644 --- a/Documentation/core-api/kho/concepts.rst +++ b/Documentation/core-api/kho/concepts.rst @@ -70,5 +70,5 @@ in the FDT. That state is called the KHO finalization pha= se. =20 Public API =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D -.. kernel-doc:: kernel/kexec_handover.c +.. 
kernel-doc:: kernel/liveupdate/kexec_handover.c :export: diff --git a/MAINTAINERS b/MAINTAINERS index ce0314af3bdf..35cf4f95ed46 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -13541,7 +13541,7 @@ S: Maintained F: Documentation/admin-guide/mm/kho.rst F: Documentation/core-api/kho/* F: include/linux/kexec_handover.h -F: kernel/kexec_handover* +F: kernel/liveupdate/kexec_handover* F: tools/testing/selftests/kho/ =20 KEYS-ENCRYPTED diff --git a/init/Kconfig b/init/Kconfig index 836320251219..1c67a44b8deb 100644 --- a/init/Kconfig +++ b/init/Kconfig @@ -2108,6 +2108,8 @@ config TRACEPOINTS =20 source "kernel/Kconfig.kexec" =20 +source "kernel/liveupdate/Kconfig" + endmenu # General setup =20 source "arch/Kconfig" diff --git a/kernel/Kconfig.kexec b/kernel/Kconfig.kexec index 9968d3d4dd17..b05f5018ed98 100644 --- a/kernel/Kconfig.kexec +++ b/kernel/Kconfig.kexec @@ -94,31 +94,6 @@ config KEXEC_JUMP Jump between original kernel and kexeced kernel and invoke code in physical address mode via KEXEC =20 -config KEXEC_HANDOVER - bool "kexec handover" - depends on ARCH_SUPPORTS_KEXEC_HANDOVER && ARCH_SUPPORTS_KEXEC_FILE - depends on !DEFERRED_STRUCT_PAGE_INIT - select MEMBLOCK_KHO_SCRATCH - select KEXEC_FILE - select DEBUG_FS - select LIBFDT - select CMA - help - Allow kexec to hand over state across kernels by generating and - passing additional metadata to the target kernel. This is useful - to keep data or state alive across the kexec. For this to work, - both source and target kernels need to have this option enabled. - -config KEXEC_HANDOVER_DEBUG - bool "kexec handover debug interface" - depends on KEXEC_HANDOVER - depends on DEBUG_FS - help - Allow to control kexec handover device tree via debugfs - interface, i.e. finalize the state or aborting the finalization. - Also, enables inspecting the KHO fdt trees with the debugfs binary - blobs. 
- config CRASH_DUMP bool "kernel crash dumps" default ARCH_DEFAULT_CRASH_DUMP diff --git a/kernel/Makefile b/kernel/Makefile index bfca6dfe335a..da59db2676fb 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -52,6 +52,7 @@ obj-y +=3D printk/ obj-y +=3D irq/ obj-y +=3D rcu/ obj-y +=3D livepatch/ +obj-y +=3D liveupdate/ obj-y +=3D dma/ obj-y +=3D entry/ obj-y +=3D unwind/ @@ -81,8 +82,6 @@ obj-$(CONFIG_CRASH_DM_CRYPT) +=3D crash_dump_dm_crypt.o obj-$(CONFIG_KEXEC) +=3D kexec.o obj-$(CONFIG_KEXEC_FILE) +=3D kexec_file.o obj-$(CONFIG_KEXEC_ELF) +=3D kexec_elf.o -obj-$(CONFIG_KEXEC_HANDOVER) +=3D kexec_handover.o -obj-$(CONFIG_KEXEC_HANDOVER_DEBUG) +=3D kexec_handover_debug.o obj-$(CONFIG_BACKTRACE_SELF_TEST) +=3D backtracetest.o obj-$(CONFIG_COMPAT) +=3D compat.o obj-$(CONFIG_CGROUPS) +=3D cgroup/ diff --git a/kernel/liveupdate/Kconfig b/kernel/liveupdate/Kconfig new file mode 100644 index 000000000000..eebe564b385d --- /dev/null +++ b/kernel/liveupdate/Kconfig @@ -0,0 +1,30 @@ +# SPDX-License-Identifier: GPL-2.0-only + +menu "Live Update" + +config KEXEC_HANDOVER + bool "kexec handover" + depends on ARCH_SUPPORTS_KEXEC_HANDOVER && ARCH_SUPPORTS_KEXEC_FILE + depends on !DEFERRED_STRUCT_PAGE_INIT + select MEMBLOCK_KHO_SCRATCH + select KEXEC_FILE + select DEBUG_FS + select LIBFDT + select CMA + help + Allow kexec to hand over state across kernels by generating and + passing additional metadata to the target kernel. This is useful + to keep data or state alive across the kexec. For this to work, + both source and target kernels need to have this option enabled. + +config KEXEC_HANDOVER_DEBUG + bool "kexec handover debug interface" + depends on KEXEC_HANDOVER + depends on DEBUG_FS + help + Allow to control kexec handover device tree via debugfs + interface, i.e. finalize the state or aborting the finalization. + Also, enables inspecting the KHO fdt trees with the debugfs binary + blobs. 
+ +endmenu diff --git a/kernel/liveupdate/Makefile b/kernel/liveupdate/Makefile new file mode 100644 index 000000000000..72cf7a8e6739 --- /dev/null +++ b/kernel/liveupdate/Makefile @@ -0,0 +1,7 @@ +# SPDX-License-Identifier: GPL-2.0 +# +# Makefile for the linux kernel. +# + +obj-$(CONFIG_KEXEC_HANDOVER) +=3D kexec_handover.o +obj-$(CONFIG_KEXEC_HANDOVER_DEBUG) +=3D kexec_handover_debug.o diff --git a/kernel/kexec_handover.c b/kernel/liveupdate/kexec_handover.c similarity index 99% rename from kernel/kexec_handover.c rename to kernel/liveupdate/kexec_handover.c index 07755184f44b..05f5694ea057 100644 --- a/kernel/kexec_handover.c +++ b/kernel/liveupdate/kexec_handover.c @@ -23,8 +23,8 @@ * KHO is tightly coupled with mm init and needs access to some of mm * internal APIs. */ -#include "../mm/internal.h" -#include "kexec_internal.h" +#include "../../mm/internal.h" +#include "../kexec_internal.h" #include "kexec_handover_internal.h" =20 #define KHO_FDT_COMPATIBLE "kho-v1" @@ -824,7 +824,7 @@ static int __kho_finalize(void) err |=3D fdt_finish_reservemap(root); err |=3D fdt_begin_node(root, ""); err |=3D fdt_property_string(root, "compatible", KHO_FDT_COMPATIBLE); - /** + /* * Reserve the preserved-memory-map property in the root FDT, so * that all property definitions will precede subnodes created by * KHO callers. 
diff --git a/kernel/kexec_handover_debug.c b/kernel/liveupdate/kexec_handov= er_debug.c similarity index 100% rename from kernel/kexec_handover_debug.c rename to kernel/liveupdate/kexec_handover_debug.c diff --git a/kernel/kexec_handover_internal.h b/kernel/liveupdate/kexec_han= dover_internal.h similarity index 100% rename from kernel/kexec_handover_internal.h rename to kernel/liveupdate/kexec_handover_internal.h --=20 2.50.1.565.gc32cd1483b-goog From nobody Sun Oct 5 07:24:30 2025 Received: from mail-qv1-f52.google.com (mail-qv1-f52.google.com [209.85.219.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ACED32163B2 for ; Thu, 7 Aug 2025 01:45:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.52 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531106; cv=none; b=KgXtjQgiBWvYgnVKKNs4Jik+mW1uNWlN4W0Eovx/zxN1EOjDS34VT/5QG3RGytzS0Cwhmqq6+byB2i2CSAmK/05Uz+rMIc+dP3Wg1Etjk6kqPWDAx0/NFXGgqZydxdhqnBuWDCFkepgnQIHSXGps+e/JupsUcMrJMp8cPLdjWbE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531106; c=relaxed/simple; bh=vmMRJbzbHZvdNZey1RXXgmZo2fS4JqXmqprMc39NEAM=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=MFmp+GmGwYSU0w3n5oLn33KiUjUIMctgtjzdCgqZWOkCscc9m2+QcXo8NPszHIUw0emwcNYuZoVd2lne+fT+NZS8Z1RTXCGwFMo0Oo7jkg/lfrH3DeNr/oMcYmEPQfEXJHrg9XZnQufX1QUPeCPI7HZfitc9lrNyIyqTNs2sOLc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=Z0R2yHOG; arc=none smtp.client-ip=209.85.219.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass 
smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="Z0R2yHOG" Received: by mail-qv1-f52.google.com with SMTP id 6a1803df08f44-7074bad03ceso7153606d6.0 for ; Wed, 06 Aug 2025 18:45:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531103; x=1755135903; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=PGbUc9NIm29yqExnh12PfmLPkmHzH5fwphyzTE2k18E=; b=Z0R2yHOGCFTfKmqwPcgRoG6bkV+cLPIDByajCc2sr7aKApI4mSlyG0JrP2TW848tpj k9e5pn/hxp2poL+TrJZcMpDrsyGAUGv+Kloa1wg8CCkR8LydvHjdL4msAibXeNSyjbja r/cdkl5K6AyHP+wBGvg7DzFK5IeKU53NuHO3vIv+d1ZGmjeGRz/MTnmKiBg50fcc+o4c JoeYgjHUlgdL/9gFtgaeEk+PSXvEToyiXBjr3joBXv/MzoAg6/nrZp/JETp1fJ8GYYtd h7WL2JGFmT1/kXbVGdqGQQhDv3w1h/4ZEn7EuablSYSuiMgx04KoYnGHecbKxUgx46/E A5OA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531103; x=1755135903; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=PGbUc9NIm29yqExnh12PfmLPkmHzH5fwphyzTE2k18E=; b=Jo4Bf4B6VkFhBEfGzFN9HoN/i12gS449hhxfxSU0iiMoyFt7zPaipyhXdG3EqGHTWT 1o+vs+eFhHFO+A8Tak62srDQNUBTaE4QYHfKjiT428GYl4V56X+SygYeelLXwqjjF70w qw/Awbt7G+PaYOt7hJK9Nvd5drwfL42K78/ARYs1l5IOvnRmzNN7c1AQJx4FcsB0MVwg Lzm12GfKo0kDRY27M38rfOyjjoy2F80jssmkAxOpjCAPN1TiXiJnItzFkH3o8jLVDWwA 0LJ1bXsMmf/eromEmFeHXubfPFYx4HFDaPVhoIS6AiKQg/p8JvchEJrmcscKTN5TqYOZ zvAA== X-Forwarded-Encrypted: i=1; AJvYcCUG4nMTFnlQSiMlwm6ZHJl7y3+8I5sKTNwGgx8g9PJSE6/5dW8dEWKrVL5GDDKSdG/7RJvHrF+4StrvzPU=@vger.kernel.org X-Gm-Message-State: AOJu0YyWA91IbcWtRmoGlALITEkLKNCovgKtRVrBBjP35E6rH5pvKnQu aW+lnjuimzWFr7D2BibL4FH34yZo2gtLVYBBFZRKXEsR9BZNpu4XGe/seqVm5bAgCaQ= X-Gm-Gg: 
ASbGncscDy5TfvV9+FgrwXEQZzD+0i9PkLsykX/S2Nl66A5GCUIQ6cWw1o4570eQfHD T58iCz3OCy2eY2G3OfFYHBoNq5oljCYPIPeN1lYguDDr+7c149ADB5en0HkNAptIjickqY0Qgyg cPyFK5yhefBNstYTcrsROwzTaYlgLQku4UOB0jzlEdSU9n+7JWxsdftpKeg2QvbOC3GEB2YmTOb 94nXkctsT9mcDiLpMx8FzCRttkf+LXw1fFl4CNrXjNDRkNovirDSpNQKiVM8bddMLvprFfHdPXU c8UONqeTb/kozf6vj5FlsY4Gzubzmd339PUYXGcyLgiq0WcVmKOGdyVr7JNNIYelia4cShOUCa4 Gl3OQeKLJoWh6R0FSIjNwYl17D2f8xiZiUV1oNWx8Gyj4gxL31VouZ/Q8en6A7ibSzbT3bVA5Hp 6bcRolTStKMXOJ X-Google-Smtp-Source: AGHT+IG2n/NHpVcAaXg5EHw4dc3YYrb+NR1PoqvgF/4Vk6fmqtkHTLLVFsYSEsS7jOvVstvSyDq9qg== X-Received: by 2002:a05:6214:4015:b0:707:71dc:a382 with SMTP id 6a1803df08f44-709795f878cmr72338246d6.25.1754531102077; Wed, 06 Aug 2025 18:45:02 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. [34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.45.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:45:01 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, 
hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 10/30] liveupdate: luo_core: luo_ioctl: Live Update Orchestrator Date: Thu, 7 Aug 2025 01:44:16 +0000 Message-ID: <20250807014442.3829950-11-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com> References: <20250807014442.3829950-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Introduce the Live Update Orchestrator (LUO), a mechanism intended to facilitate kernel updates while keeping designated devices operational across the transition (e.g., via kexec). The primary use case is updating hypervisors with minimal disruption to running virtual machines. The userspace side of a hypervisor update is already covered by copyless migration; LUO covers updating the kernel itself. This initial patch lays the groundwork for the LUO subsystem. Further functionality, including the implementation of state transition logic, integration with KHO, and hooks for subsystems and file descriptors, will be added in subsequent patches. Create a character device at /dev/liveupdate. A new uAPI header, , will define the necessary structures. 
The magic number for IOCTL is registered in Documentation/userspace-api/ioctl/ioctl-number.rst. Signed-off-by: Pasha Tatashin --- .../userspace-api/ioctl/ioctl-number.rst | 2 + include/linux/liveupdate.h | 64 ++++ include/uapi/linux/liveupdate.h | 94 ++++++ kernel/liveupdate/Kconfig | 27 ++ kernel/liveupdate/Makefile | 6 + kernel/liveupdate/luo_core.c | 297 ++++++++++++++++++ kernel/liveupdate/luo_internal.h | 21 ++ kernel/liveupdate/luo_ioctl.c | 48 +++ 8 files changed, 559 insertions(+) create mode 100644 include/linux/liveupdate.h create mode 100644 include/uapi/linux/liveupdate.h create mode 100644 kernel/liveupdate/luo_core.c create mode 100644 kernel/liveupdate/luo_internal.h create mode 100644 kernel/liveupdate/luo_ioctl.c diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documenta= tion/userspace-api/ioctl/ioctl-number.rst index 406a9f4d0869..d569459a2320 100644 --- a/Documentation/userspace-api/ioctl/ioctl-number.rst +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst @@ -383,6 +383,8 @@ Code Seq# Include File = Comments 0xB8 01-02 uapi/misc/mrvl_cn10k_dpi.h Mar= vell CN10K DPI driver 0xB8 all uapi/linux/mshv.h Mic= rosoft Hyper-V /dev/mshv driver +0xBA all uapi/linux/liveupdate.h Pas= ha Tatashin + 0xC0 00-0F linux/usb/iowarrior.h 0xCA 00-0F uapi/misc/cxl.h Dea= d since 6.15 0xCA 10-2F uapi/misc/ocxl.h diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h new file mode 100644 index 000000000000..85a6828c95b0 --- /dev/null +++ b/include/linux/liveupdate.h @@ -0,0 +1,64 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +/* + * Copyright (c) 2025, Google LLC. 
+ * Pasha Tatashin + */ +#ifndef _LINUX_LIVEUPDATE_H +#define _LINUX_LIVEUPDATE_H + +#include +#include +#include +#include + +#ifdef CONFIG_LIVEUPDATE + +/* Return true if live update orchestrator is enabled */ +bool liveupdate_enabled(void); + +/* Called during reboot to tell participants to complete serialization */ +int liveupdate_reboot(void); + +/* + * Return true if machine is in updated state (i.e. live update boot in + * progress) + */ +bool liveupdate_state_updated(void); + +/* + * Return true if machine is in normal state (i.e. no live update in progr= ess). + */ +bool liveupdate_state_normal(void); + +enum liveupdate_state liveupdate_get_state(void); + +#else /* CONFIG_LIVEUPDATE */ + +static inline int liveupdate_reboot(void) +{ + return 0; +} + +static inline bool liveupdate_enabled(void) +{ + return false; +} + +static inline bool liveupdate_state_updated(void) +{ + return false; +} + +static inline bool liveupdate_state_normal(void) +{ + return true; +} + +static inline enum liveupdate_state liveupdate_get_state(void) +{ + return LIVEUPDATE_STATE_NORMAL; +} + +#endif /* CONFIG_LIVEUPDATE */ +#endif /* _LINUX_LIVEUPDATE_H */ diff --git a/include/uapi/linux/liveupdate.h b/include/uapi/linux/liveupdat= e.h new file mode 100644 index 000000000000..3cb09b2c4353 --- /dev/null +++ b/include/uapi/linux/liveupdate.h @@ -0,0 +1,94 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ + +/* + * Userspace interface for /dev/liveupdate + * Live Update Orchestrator + * + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +#ifndef _UAPI_LIVEUPDATE_H +#define _UAPI_LIVEUPDATE_H + +#include +#include + +/** + * enum liveupdate_state - Defines the possible states of the live update + * orchestrator. + * @LIVEUPDATE_STATE_UNDEFINED: State has not yet been initialized. + * @LIVEUPDATE_STATE_NORMAL: Default state, no live update in prog= ress. 
+ * @LIVEUPDATE_STATE_PREPARED: Live update is prepared for reboot; t= he + * LIVEUPDATE_PREPARE callbacks have com= pleted + * successfully. + * Devices might operate in a limited st= ate + * for example the participating devices= might + * not be allowed to unbind, and also the + * setting up of new DMA mappings might = be + * disabled in this state. + * @LIVEUPDATE_STATE_FROZEN: The final reboot event + * (%LIVEUPDATE_FREEZE) has been sent, a= nd the + * system is performing its final state = saving + * within the "blackout window". User + * workloads must be suspended. The actu= al + * reboot (kexec) into the next kernel is + * imminent. + * @LIVEUPDATE_STATE_UPDATED: The system has rebooted into the next + * kernel via live update the system is = now + * running the next kernel, awaiting the + * finish event. + * + * These states track the progress and outcome of a live update operation. + */ +enum liveupdate_state { + LIVEUPDATE_STATE_UNDEFINED =3D 0, + LIVEUPDATE_STATE_NORMAL =3D 1, + LIVEUPDATE_STATE_PREPARED =3D 2, + LIVEUPDATE_STATE_FROZEN =3D 3, + LIVEUPDATE_STATE_UPDATED =3D 4, +}; + +/** + * enum liveupdate_event - Events that trigger live update callbacks. + * @LIVEUPDATE_PREPARE: PREPARE should happen *before* the blackout window. + * Subsystems should prepare for an upcoming reboot by + * serializing their states. However, it must be cons= idered + * that user applications, e.g. virtual machines are = still + * running during this phase. + * @LIVEUPDATE_FREEZE: FREEZE sent from the reboot() syscall, when the cu= rrent + * kernel is on its way out. This is the final opport= unity + * for subsystems to save any state that must persist + * across the reboot. Callbacks for this event should= be as + * fast as possible since they are on the critical pa= th of + * rebooting into the next kernel. + * @LIVEUPDATE_FINISH: FINISH is sent in the newly booted kernel after a + * successful live update and normally *after* the bl= ackout + * window. 
Subsystems should perform any final cleanup + * during this phase. This phase also provides an + * opportunity to clean up devices that were preserve= d but + * never explicitly reclaimed during the live update + * process. State restoration should have already occ= urred + * before this event. Callbacks for this event must n= ot + * fail. The completion of this call transitions the + * machine from ``updated`` to ``normal`` state. + * @LIVEUPDATE_CANCEL: CANCEL the live update and go back to normal state= . This + * event is user initiated, or is done automatically = when + * LIVEUPDATE_PREPARE or LIVEUPDATE_FREEZE stage fail= s. + * Subsystems should revert any actions taken during = the + * corresponding prepare event. Callbacks for this ev= ent + * must not fail. + * + * These events represent the different stages and actions within the live + * update process that subsystems (like device drivers and bus drivers) + * need to be aware of to correctly serialize and restore their state. + * + */ +enum liveupdate_event { + LIVEUPDATE_PREPARE =3D 0, + LIVEUPDATE_FREEZE =3D 1, + LIVEUPDATE_FINISH =3D 2, + LIVEUPDATE_CANCEL =3D 3, +}; + +#endif /* _UAPI_LIVEUPDATE_H */ diff --git a/kernel/liveupdate/Kconfig b/kernel/liveupdate/Kconfig index eebe564b385d..f6b0bde188d9 100644 --- a/kernel/liveupdate/Kconfig +++ b/kernel/liveupdate/Kconfig @@ -1,7 +1,34 @@ # SPDX-License-Identifier: GPL-2.0-only +# +# Copyright (c) 2025, Google LLC. +# Pasha Tatashin +# +# Live Update Orchestrator +# =20 menu "Live Update" =20 +config LIVEUPDATE + bool "Live Update Orchestrator" + depends on KEXEC_HANDOVER + help + Enable the Live Update Orchestrator. Live Update is a mechanism, + typically based on kexec, that allows the kernel to be updated + while keeping selected devices operational across the transition. + These devices are intended to be reclaimed by the new kernel and + re-attached to their original workload without requiring a device + reset. 
+
+	  The ability to hand over a device from the current to the next kernel
+	  depends on specific support within device drivers and related kernel
+	  subsystems.
+
+	  This feature primarily targets virtual machine hosts to quickly update
+	  the kernel hypervisor with minimal disruption to the running virtual
+	  machines.
+
+	  If unsure, say N.
+
 config KEXEC_HANDOVER
 	bool "kexec handover"
 	depends on ARCH_SUPPORTS_KEXEC_HANDOVER && ARCH_SUPPORTS_KEXEC_FILE
diff --git a/kernel/liveupdate/Makefile b/kernel/liveupdate/Makefile
index 72cf7a8e6739..8627b7691943 100644
--- a/kernel/liveupdate/Makefile
+++ b/kernel/liveupdate/Makefile
@@ -3,5 +3,11 @@
 # Makefile for the linux kernel.
 #
 
+luo-y := \
+		luo_core.o \
+		luo_ioctl.o
+
 obj-$(CONFIG_KEXEC_HANDOVER) += kexec_handover.o
 obj-$(CONFIG_KEXEC_HANDOVER_DEBUG) += kexec_handover_debug.o
+
+obj-$(CONFIG_LIVEUPDATE) += luo.o
diff --git a/kernel/liveupdate/luo_core.c b/kernel/liveupdate/luo_core.c
new file mode 100644
index 000000000000..c77e540e26f8
--- /dev/null
+++ b/kernel/liveupdate/luo_core.c
@@ -0,0 +1,297 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin
+ */
+
+/**
+ * DOC: Live Update Orchestrator (LUO)
+ *
+ * Live Update is a specialized, kexec-based reboot process that allows a
+ * running kernel to be updated from one version to another while preserving
+ * the state of selected resources and keeping designated hardware devices
+ * operational. For these devices, DMA activity may continue throughout the
+ * kernel transition.
+ *
+ * While the primary use case driving this work is supporting live updates of
+ * the Linux kernel when it is used as a hypervisor in cloud environments, the
+ * LUO framework itself is designed to be workload-agnostic. Much like Kernel
+ * Live Patching, which applies security fixes regardless of the workload,
+ * Live Update facilitates a full kernel version upgrade for any type of system.
+ *
+ * For example, a non-hypervisor system running an in-memory cache like
+ * memcached with many gigabytes of data can use LUO. The userspace service
+ * can place its cache into a memfd, have its state preserved by LUO, and
+ * restore it immediately after the kernel kexec.
+ *
+ * Whether the system is running virtual machines, containers, a
+ * high-performance database, or networking services, LUO's primary goal is to
+ * enable a full kernel update by preserving critical userspace state and
+ * keeping essential devices operational.
+ *
+ * The core of LUO is a state machine that tracks the progress of a live update,
+ * along with a callback API that allows other kernel subsystems to participate
+ * in the process. Example subsystems that can hook into LUO include: kvm,
+ * iommu, interrupts, vfio, participating filesystems, and memory management.
+ *
+ * LUO uses Kexec Handover to transfer memory state from the current kernel to
+ * the next kernel. For more details see
+ * Documentation/core-api/kho/concepts.rst.
+ *
+ * The LUO state machine ensures that operations are performed in the correct
+ * sequence and provides a mechanism to track and recover from potential
+ * failures.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include
+#include
+#include
+#include
+#include
+#include "luo_internal.h"
+
+static DECLARE_RWSEM(luo_state_rwsem);
+
+static enum liveupdate_state luo_state = LIVEUPDATE_STATE_UNDEFINED;
+
+static const char *const luo_state_str[] = {
+	[LIVEUPDATE_STATE_UNDEFINED] = "undefined",
+	[LIVEUPDATE_STATE_NORMAL] = "normal",
+	[LIVEUPDATE_STATE_PREPARED] = "prepared",
+	[LIVEUPDATE_STATE_FROZEN] = "frozen",
+	[LIVEUPDATE_STATE_UPDATED] = "updated",
+};
+
+static bool luo_enabled;
+
+static int __init early_liveupdate_param(char *buf)
+{
+	return kstrtobool(buf, &luo_enabled);
+}
+early_param("liveupdate", early_liveupdate_param);
+
+/* Return true if the current state is equal to the provided state */
+static inline bool is_current_luo_state(enum liveupdate_state expected_state)
+{
+	return liveupdate_get_state() == expected_state;
+}
+
+static void __luo_set_state(enum liveupdate_state state)
+{
+	WRITE_ONCE(luo_state, state);
+}
+
+static inline void luo_set_state(enum liveupdate_state state)
+{
+	pr_info("Switched from [%s] to [%s] state\n",
+		luo_current_state_str(), luo_state_str[state]);
+	__luo_set_state(state);
+}
+
+static int luo_do_freeze_calls(void)
+{
+	return 0;
+}
+
+static void luo_do_finish_calls(void)
+{
+}
+
+/* Get the current state as a string */
+const char *luo_current_state_str(void)
+{
+	return luo_state_str[liveupdate_get_state()];
+}
+
+enum liveupdate_state liveupdate_get_state(void)
+{
+	return READ_ONCE(luo_state);
+}
+
+int luo_prepare(void)
+{
+	return 0;
+}
+
+/**
+ * luo_freeze() - Initiate the final freeze notification phase for live update.
+ *
+ * Attempts to transition the live update orchestrator state from
+ * %LIVEUPDATE_STATE_PREPARED to %LIVEUPDATE_STATE_FROZEN. This function is
+ * typically called just before the actual reboot system call (e.g., kexec)
+ * is invoked, either directly by the orchestration tool or potentially from
+ * within the reboot syscall path itself.
+ *
+ * @return 0 on success, negative error code otherwise. State is reverted to
+ * %LIVEUPDATE_STATE_NORMAL in case of an error during callbacks, and everything
+ * is canceled via cancel notification.
+ */
+int luo_freeze(void)
+{
+	int ret;
+
+	if (down_write_killable(&luo_state_rwsem)) {
+		pr_warn("[freeze] event canceled by user\n");
+		return -EAGAIN;
+	}
+
+	if (!is_current_luo_state(LIVEUPDATE_STATE_PREPARED)) {
+		pr_warn("Can't switch to [%s] from [%s] state\n",
+			luo_state_str[LIVEUPDATE_STATE_FROZEN],
+			luo_current_state_str());
+		up_write(&luo_state_rwsem);
+
+		return -EINVAL;
+	}
+
+	ret = luo_do_freeze_calls();
+	if (!ret)
+		luo_set_state(LIVEUPDATE_STATE_FROZEN);
+	else
+		luo_set_state(LIVEUPDATE_STATE_NORMAL);
+
+	up_write(&luo_state_rwsem);
+
+	return ret;
+}
+
+/**
+ * luo_finish - Finalize the live update process in the new kernel.
+ *
+ * This function is called after a successful live update reboot into a new
+ * kernel, once the new kernel is ready to transition to the normal operational
+ * state. It signals the completion of the live update sequence to subsystems.
+ *
+ * @return 0 on success, ``-EAGAIN`` if the state change was cancelled by the
+ * user while waiting for the lock, or ``-EINVAL`` if the orchestrator is not in
+ * the updated state.
+ */
+int luo_finish(void)
+{
+	if (down_write_killable(&luo_state_rwsem)) {
+		pr_warn("[finish] event canceled by user\n");
+		return -EAGAIN;
+	}
+
+	if (!is_current_luo_state(LIVEUPDATE_STATE_UPDATED)) {
+		pr_warn("Can't switch to [%s] from [%s] state\n",
+			luo_state_str[LIVEUPDATE_STATE_NORMAL],
+			luo_current_state_str());
+		up_write(&luo_state_rwsem);
+
+		return -EINVAL;
+	}
+
+	luo_do_finish_calls();
+	luo_set_state(LIVEUPDATE_STATE_NORMAL);
+
+	up_write(&luo_state_rwsem);
+
+	return 0;
+}
+
+int luo_cancel(void)
+{
+	return 0;
+}
+
+void luo_state_read_enter(void)
+{
+	down_read(&luo_state_rwsem);
+}
+
+void luo_state_read_exit(void)
+{
+	up_read(&luo_state_rwsem);
+}
+
+static int __init luo_startup(void)
+{
+	__luo_set_state(LIVEUPDATE_STATE_NORMAL);
+
+	return 0;
+}
+early_initcall(luo_startup);
+
+/* Public Functions */
+
+/**
+ * liveupdate_reboot() - Kernel reboot notifier for live update final
+ * serialization.
+ *
+ * This function is invoked directly from the reboot() syscall pathway if a
+ * reboot is initiated while the live update state is %LIVEUPDATE_STATE_PREPARED
+ * (i.e., if the user did not explicitly trigger the frozen state). It handles
+ * the implicit transition into the final frozen state.
+ *
+ * It triggers the %LIVEUPDATE_FREEZE event callbacks for participating
+ * subsystems. These callbacks must perform final state saving very quickly as
+ * they execute during the blackout period just before kexec.
+ *
+ * If any %LIVEUPDATE_FREEZE callback fails, this function triggers the
+ * %LIVEUPDATE_CANCEL event for all participants to revert their state, aborts
+ * the live update, and returns an error.
+ */
+int liveupdate_reboot(void)
+{
+	if (!is_current_luo_state(LIVEUPDATE_STATE_PREPARED))
+		return 0;
+
+	return luo_freeze();
+}
+
+/**
+ * liveupdate_state_updated - Check if the system is in the live update
+ * 'updated' state.
+ *
+ * This function checks if the live update orchestrator is in the
+ * ``LIVEUPDATE_STATE_UPDATED`` state. This state indicates that the system has
+ * successfully rebooted into a new kernel as part of a live update, and the
+ * preserved devices are expected to be in the process of being reclaimed.
+ *
+ * This is typically used by subsystems during early boot of the new kernel
+ * to determine if they need to attempt to restore state from a previous
+ * live update.
+ *
+ * @return true if the system is in the ``LIVEUPDATE_STATE_UPDATED`` state,
+ * false otherwise.
+ */
+bool liveupdate_state_updated(void)
+{
+	return is_current_luo_state(LIVEUPDATE_STATE_UPDATED);
+}
+
+/**
+ * liveupdate_state_normal - Check if the system is in the live update 'normal'
+ * state.
+ *
+ * This function checks if the live update orchestrator is in the
+ * ``LIVEUPDATE_STATE_NORMAL`` state. This state indicates that no live update
+ * is in progress. It represents the default operational state of the system.
+ *
+ * This can be used to gate actions that should only be performed when no
+ * live update activity is occurring.
+ *
+ * @return true if the system is in the ``LIVEUPDATE_STATE_NORMAL`` state,
+ * false otherwise.
+ */
+bool liveupdate_state_normal(void)
+{
+	return is_current_luo_state(LIVEUPDATE_STATE_NORMAL);
+}
+
+/**
+ * liveupdate_enabled - Check if the live update feature is enabled.
+ *
+ * This function returns the state of the live update feature flag, which
+ * can be controlled via the ``liveupdate`` kernel command-line parameter.
+ *
+ * @return true if live update is enabled, false otherwise.
+ */
+bool liveupdate_enabled(void)
+{
+	return luo_enabled;
+}
diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_internal.h
new file mode 100644
index 000000000000..3d10f3eb20a7
--- /dev/null
+++ b/kernel/liveupdate/luo_internal.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+/*
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin
+ */
+
+#ifndef _LINUX_LUO_INTERNAL_H
+#define _LINUX_LUO_INTERNAL_H
+
+int luo_cancel(void);
+int luo_prepare(void);
+int luo_freeze(void);
+int luo_finish(void);
+
+void luo_state_read_enter(void);
+void luo_state_read_exit(void);
+
+const char *luo_current_state_str(void);
+
+#endif /* _LINUX_LUO_INTERNAL_H */
diff --git a/kernel/liveupdate/luo_ioctl.c b/kernel/liveupdate/luo_ioctl.c
new file mode 100644
index 000000000000..3df1ec9fbe57
--- /dev/null
+++ b/kernel/liveupdate/luo_ioctl.c
@@ -0,0 +1,48 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "luo_internal.h"
+
+static const struct file_operations fops = {
+	.owner = THIS_MODULE,
+};
+
+static struct miscdevice liveupdate_miscdev = {
+	.minor = MISC_DYNAMIC_MINOR,
+	.name = "liveupdate",
+	.fops = &fops,
+};
+
+static int __init liveupdate_init(void)
+{
+	if (!liveupdate_enabled())
+		return 0;
+
+	return misc_register(&liveupdate_miscdev);
+}
+module_init(liveupdate_init);
+
+static void __exit liveupdate_exit(void)
+{
+	misc_deregister(&liveupdate_miscdev);
+}
+module_exit(liveupdate_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Pasha Tatashin");
+MODULE_DESCRIPTION("Live Update Orchestrator");
+MODULE_VERSION("0.1");
--
2.50.1.565.gc32cd1483b-goog

From nobody Sun Oct 5 07:24:30 2025
From: Pasha Tatashin
To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com,
 changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org,
 dmatlack@google.com, rientjes@google.com, corbet@lwn.net,
 rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com,
 kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com,
 masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org,
 yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev,
 chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com,
 jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org,
 dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org,
 rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org,
 zhangguopeng@kylinos.cn, linux@weissschuh.net,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de,
 mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
 x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org,
 bartosz.golaszewski@linaro.org, cw00.choi@samsung.com,
 myungjoo.ham@samsung.com, yesanishhere@gmail.com,
 Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com,
 aleksander.lobakin@intel.com, ira.weiny@intel.com,
 andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de,
 bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com,
 stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net,
 brauner@kernel.org, linux-api@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, saeedm@nvidia.com,
 ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com,
 leonro@nvidia.com, witu@nvidia.com
Subject: [PATCH v3 11/30] liveupdate: luo_core: integrate with KHO
Date: Thu, 7 Aug 2025 01:44:17 +0000
Message-ID:
 <20250807014442.3829950-12-pasha.tatashin@soleen.com>
In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com>
References: <20250807014442.3829950-1-pasha.tatashin@soleen.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Integrate LUO with the KHO framework to enable passing LUO state across
a kexec reboot.

When LUO transitions to the "prepared" state, it tells KHO to finalize,
so that all memory segments that were added to the KHO preservation list
are preserved. Once the "prepared" state is reached, no new segments can
be preserved. If LUO is canceled, it also tells KHO to cancel the
serialization, so that LUO can later go back into the prepared state.

This patch introduces the following changes:
- During the KHO finalization phase, allocate an FDT blob.
- Populate this FDT with a LUO compatibility string ("luo-v1").

LUO now depends on `CONFIG_KEXEC_HANDOVER`. The core state transition
logic (`luo_do_*_calls`) remains unimplemented in this patch.
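
The prepare/cancel flow described above can be modeled outside the kernel. The following is a minimal userspace sketch, not code from this series: the `luo_model` struct, the `MODEL_*` states, and the `model_*()` helpers are hypothetical names, and the `kho_finalized` flag merely stands in for the effect of kho_finalize() and kho_abort(). It only illustrates which transitions are accepted and which are rejected with -EINVAL; locking, the freeze failure path, and the subsystem callbacks are omitted.

```c
/* Userspace model of the LUO state transitions; illustrative only. */
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the LIVEUPDATE_STATE_* values. */
enum luo_model_state {
	MODEL_NORMAL,
	MODEL_PREPARED,
	MODEL_FROZEN,
	MODEL_UPDATED,
};

struct luo_model {
	enum luo_model_state state;
	bool kho_finalized;	/* assumed effect of kho_finalize()/kho_abort() */
};

/* normal -> prepared; the preservation list is finalized on success. */
static int model_prepare(struct luo_model *m)
{
	if (m->state != MODEL_NORMAL)
		return -EINVAL;
	m->state = MODEL_PREPARED;
	m->kho_finalized = true;
	return 0;
}

/* prepared -> frozen. */
static int model_freeze(struct luo_model *m)
{
	if (m->state != MODEL_PREPARED)
		return -EINVAL;
	m->state = MODEL_FROZEN;
	return 0;
}

/* prepared or frozen -> normal; finalization is undone. */
static int model_cancel(struct luo_model *m)
{
	if (m->state != MODEL_PREPARED && m->state != MODEL_FROZEN)
		return -EINVAL;
	m->kho_finalized = false;
	m->state = MODEL_NORMAL;
	return 0;
}

/* updated -> normal; only valid in the newly booted kernel. */
static int model_finish(struct luo_model *m)
{
	if (m->state != MODEL_UPDATED)
		return -EINVAL;
	m->state = MODEL_NORMAL;
	return 0;
}
```

Starting from the normal state, model_prepare() followed by model_cancel() returns the model to normal with the flag cleared, so a second model_prepare() succeeds. That mirrors the statement above that LUO can go back into the prepared state after a cancel.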

Signed-off-by: Pasha Tatashin
---
 kernel/liveupdate/luo_core.c     | 210 ++++++++++++++++++++++++++++++-
 kernel/liveupdate/luo_internal.h |   9 ++
 2 files changed, 216 insertions(+), 3 deletions(-)

diff --git a/kernel/liveupdate/luo_core.c b/kernel/liveupdate/luo_core.c
index c77e540e26f8..951422e51dd3 100644
--- a/kernel/liveupdate/luo_core.c
+++ b/kernel/liveupdate/luo_core.c
@@ -47,9 +47,12 @@
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
 #include
+#include
 #include
+#include
 #include
 #include
+#include
 #include
 #include "luo_internal.h"
 
@@ -67,6 +70,21 @@ static const char *const luo_state_str[] = {
 
 static bool luo_enabled;
 
+static void *luo_fdt_out;
+static void *luo_fdt_in;
+
+/*
+ * The LUO FDT size depends on the number of participating subsystems.
+ *
+ * The current fixed size (4K) is large enough to handle a reasonable number of
+ * preserved entities. If this size ever becomes insufficient, it can either be
+ * increased, or a dynamic size calculation mechanism could be implemented in
+ * the future.
+ */
+#define LUO_FDT_SIZE PAGE_SIZE
+#define LUO_KHO_ENTRY_NAME "LUO"
+#define LUO_COMPATIBLE "luo-v1"
+
 static int __init early_liveupdate_param(char *buf)
 {
 	return kstrtobool(buf, &luo_enabled);
@@ -91,6 +109,60 @@ static inline void luo_set_state(enum liveupdate_state state)
 	__luo_set_state(state);
 }
 
+/* Called during the prepare phase, to create LUO fdt tree */
+static int luo_fdt_setup(void)
+{
+	void *fdt_out;
+	int ret;
+
+	fdt_out = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
+					   get_order(LUO_FDT_SIZE));
+	if (!fdt_out) {
+		pr_err("failed to allocate FDT memory\n");
+		return -ENOMEM;
+	}
+
+	ret = fdt_create_empty_tree(fdt_out, LUO_FDT_SIZE);
+	if (ret)
+		goto exit_free;
+
+	ret = fdt_setprop_string(fdt_out, 0, "compatible", LUO_COMPATIBLE);
+	if (ret)
+		goto exit_free;
+
+	ret = kho_preserve_phys(__pa(fdt_out), LUO_FDT_SIZE);
+	if (ret)
+		goto exit_free;
+
+	ret = kho_add_subtree(LUO_KHO_ENTRY_NAME, fdt_out);
+	if (ret)
+		goto exit_unpreserve;
+	luo_fdt_out = fdt_out;
+
+	return 0;
+
+exit_unpreserve:
+	WARN_ON_ONCE(kho_unpreserve_phys(__pa(fdt_out), LUO_FDT_SIZE));
+exit_free:
+	free_pages((unsigned long)fdt_out, get_order(LUO_FDT_SIZE));
+	pr_err("failed to prepare LUO FDT: %d\n", ret);
+
+	return ret;
+}
+
+static void luo_fdt_destroy(void)
+{
+	WARN_ON_ONCE(kho_unpreserve_phys(__pa(luo_fdt_out), LUO_FDT_SIZE));
+	kho_remove_subtree(luo_fdt_out);
+	free_pages((unsigned long)luo_fdt_out, get_order(LUO_FDT_SIZE));
+	luo_fdt_out = NULL;
+}
+
+static int luo_do_prepare_calls(void)
+{
+	return 0;
+}
+
 static int luo_do_freeze_calls(void)
 {
 	return 0;
@@ -100,6 +172,71 @@ static void luo_do_finish_calls(void)
 {
 }
 
+static void luo_do_cancel_calls(void)
+{
+}
+
+static int __luo_prepare(void)
+{
+	int ret;
+
+	if (down_write_killable(&luo_state_rwsem)) {
+		pr_warn("[prepare] event canceled by user\n");
+		return -EAGAIN;
+	}
+
+	if (!is_current_luo_state(LIVEUPDATE_STATE_NORMAL)) {
+		pr_warn("Can't switch to [%s] from [%s] state\n",
+			luo_state_str[LIVEUPDATE_STATE_PREPARED],
+			luo_current_state_str());
+		ret = -EINVAL;
+		goto exit_unlock;
+	}
+
+	ret = luo_fdt_setup();
+	if (ret)
+		goto exit_unlock;
+
+	ret = luo_do_prepare_calls();
+	if (ret) {
+		luo_fdt_destroy();
+		goto exit_unlock;
+	}
+
+	luo_set_state(LIVEUPDATE_STATE_PREPARED);
+
+exit_unlock:
+	up_write(&luo_state_rwsem);
+
+	return ret;
+}
+
+static int __luo_cancel(void)
+{
+	if (down_write_killable(&luo_state_rwsem)) {
+		pr_warn("[cancel] event canceled by user\n");
+		return -EAGAIN;
+	}
+
+	if (!is_current_luo_state(LIVEUPDATE_STATE_PREPARED) &&
+	    !is_current_luo_state(LIVEUPDATE_STATE_FROZEN)) {
+		pr_warn("Can't switch to [%s] from [%s] state\n",
+			luo_state_str[LIVEUPDATE_STATE_NORMAL],
+			luo_current_state_str());
+		up_write(&luo_state_rwsem);
+
+		return -EINVAL;
+	}
+
+	luo_do_cancel_calls();
+	luo_fdt_destroy();
+	luo_set_state(LIVEUPDATE_STATE_NORMAL);
+
+	up_write(&luo_state_rwsem);
+
+	return 0;
+}
+
 /* Get the current state as a string */
 const char *luo_current_state_str(void)
 {
@@ -111,9 +248,28 @@ enum liveupdate_state liveupdate_get_state(void)
 	return READ_ONCE(luo_state);
 }
 
+/**
+ * luo_prepare - Initiate the live update preparation phase.
+ *
+ * This function is called to begin the live update process. It attempts to
+ * transition LUO to the ``LIVEUPDATE_STATE_PREPARED`` state.
+ *
+ * If the calls complete successfully, the orchestrator state is set
+ * to ``LIVEUPDATE_STATE_PREPARED``. If any call fails, a
+ * ``LIVEUPDATE_CANCEL`` is sent to roll back any actions.
+ *
+ * @return 0 on success, ``-EAGAIN`` if the state change was cancelled by the
+ * user while waiting for the lock, ``-EINVAL`` if the orchestrator is not in
+ * the normal state, or a negative error code returned by the calls.
+ */
 int luo_prepare(void)
 {
-	return 0;
+	int err = __luo_prepare();
+
+	if (err)
+		return err;
+
+	return kho_finalize();
 }
 
 /**
@@ -193,9 +349,28 @@ int luo_finish(void)
 	return 0;
 }
 
+/**
+ * luo_cancel - Cancel the ongoing live update from prepared or frozen states.
+ *
+ * This function is called to abort a live update that is currently in the
+ * ``LIVEUPDATE_STATE_PREPARED`` or ``LIVEUPDATE_STATE_FROZEN`` state.
+ *
+ * If the state is correct, it triggers the ``LIVEUPDATE_CANCEL`` notifier chain
+ * to allow subsystems to undo any actions performed during the prepare or
+ * freeze events. Finally, the orchestrator state is transitioned back to
+ * ``LIVEUPDATE_STATE_NORMAL``.
+ *
+ * @return 0 on success, ``-EAGAIN`` if interrupted while waiting for the lock,
+ * ``-EINVAL`` if not in the prepared or frozen state, or a kho_abort() error.
+ */
 int luo_cancel(void)
 {
-	return 0;
+	int err = kho_abort();
+
+	if (err)
+		return err;
+
+	return __luo_cancel();
 }
 
 void luo_state_read_enter(void)
@@ -210,7 +385,36 @@ void luo_state_read_exit(void)
 
 static int __init luo_startup(void)
 {
-	__luo_set_state(LIVEUPDATE_STATE_NORMAL);
+	phys_addr_t fdt_phys;
+	int ret;
+
+	if (!kho_is_enabled()) {
+		if (luo_enabled)
+			pr_warn("Disabling liveupdate because KHO is disabled\n");
+		luo_enabled = false;
+		return 0;
+	}
+
+	/* Retrieve LUO subtree, and verify its format. */
+	ret = kho_retrieve_subtree(LUO_KHO_ENTRY_NAME, &fdt_phys);
+	if (ret) {
+		if (ret != -ENOENT) {
+			luo_restore_fail("failed to retrieve FDT '%s' from KHO: %d\n",
+					 LUO_KHO_ENTRY_NAME, ret);
+		}
+		__luo_set_state(LIVEUPDATE_STATE_NORMAL);
+
+		return 0;
+	}
+
+	luo_fdt_in = __va(fdt_phys);
+	ret = fdt_node_check_compatible(luo_fdt_in, 0, LUO_COMPATIBLE);
+	if (ret) {
+		luo_restore_fail("FDT '%s' is incompatible with '%s' [%d]\n",
+				 LUO_KHO_ENTRY_NAME, LUO_COMPATIBLE, ret);
+	}
+
+	__luo_set_state(LIVEUPDATE_STATE_UPDATED);
 
 	return 0;
 }
diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_internal.h
index 3d10f3eb20a7..b61c17b78830 100644
--- a/kernel/liveupdate/luo_internal.h
+++ b/kernel/liveupdate/luo_internal.h
@@ -8,6 +8,15 @@
 #ifndef _LINUX_LUO_INTERNAL_H
 #define _LINUX_LUO_INTERNAL_H
 
+/*
+ * Handles a deserialization failure: devices and memory are in an unpredictable
+ * state.
+ *
+ * Continuing the boot process after a failure is dangerous because it could
+ * lead to leaks of private data.
+ */
+#define luo_restore_fail(__fmt, ...) panic(__fmt, ##__VA_ARGS__)
+
 int luo_cancel(void);
 int luo_prepare(void);
 int luo_freeze(void);
--
2.50.1.565.gc32cd1483b-goog

From nobody Sun Oct 5 07:24:30 2025
From: Pasha Tatashin
To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com,
 changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org,
 dmatlack@google.com, rientjes@google.com, corbet@lwn.net,
 rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com,
 kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com,
 masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org,
 yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev,
 chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com,
 jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org,
 dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org,
 rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org,
 zhangguopeng@kylinos.cn, linux@weissschuh.net,
 linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de,
 mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
 x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org,
 bartosz.golaszewski@linaro.org, cw00.choi@samsung.com,
 myungjoo.ham@samsung.com, yesanishhere@gmail.com,
 Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com,
 aleksander.lobakin@intel.com, ira.weiny@intel.com,
 andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de,
 bhelgaas@google.com, wagi@kernel.org,
djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 12/30] liveupdate: luo_subsystems: add subsystem registration Date: Thu, 7 Aug 2025 01:44:18 +0000 Message-ID: <20250807014442.3829950-13-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com> References: <20250807014442.3829950-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Introduce the framework for kernel subsystems (e.g., KVM, IOMMU, device drivers) to register with LUO and participate in the live update process via callbacks. Subsystem Registration: - Defines struct liveupdate_subsystem in linux/liveupdate.h, which subsystems use to provide their name and optional callbacks (prepare, freeze, cancel, finish). The callbacks accept a u64 *data intended for passing state/handles. - Exports liveupdate_register_subsystem() and liveupdate_unregister_subsystem() API functions. - Adds drivers/misc/liveupdate/luo_subsystems.c to manage a list of registered subsystems. Registration/unregistration is restricted to specific LUO states (NORMAL/UPDATED). Callback Framework: - The main luo_core.c state transition functions now delegate to new luo_do_subsystems_*_calls() functions defined in luo_subsystems.c. - These new functions are intended to iterate through the registered subsystems and invoke their corresponding callbacks. FDT Integration: - Adds a /subsystems subnode within the main LUO FDT created in luo_core.c. This node has its own compatibility string (subsystems-v1). 
- luo_subsystems_fdt_setup() populates this node by adding a property for each registered subsystem, using the subsystem's name. Currently, these properties are initialized with a placeholder u64 value (0). - luo_subsystems_startup() is called from luo_core.c on boot to find and validate the /subsystems node in the FDT received via KHO. - Adds a stub API function liveupdate_get_subsystem_data() intended for subsystems to retrieve their persisted u64 data from the FDT in the new kernel. Signed-off-by: Pasha Tatashin --- include/linux/liveupdate.h | 66 +++++++ kernel/liveupdate/Makefile | 3 +- kernel/liveupdate/luo_core.c | 19 +- kernel/liveupdate/luo_internal.h | 7 + kernel/liveupdate/luo_subsystems.c | 291 +++++++++++++++++++++++++++++ 5 files changed, 383 insertions(+), 3 deletions(-) create mode 100644 kernel/liveupdate/luo_subsystems.c diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h index 85a6828c95b0..4c378a986cfe 100644 --- a/include/linux/liveupdate.h +++ b/include/linux/liveupdate.h @@ -12,6 +12,52 @@ #include #include =20 +struct liveupdate_subsystem; + +/** + * struct liveupdate_subsystem_ops - LUO events callback functions + * @prepare: Optional. Called during LUO prepare phase. Should perform + * preparatory actions and can store a u64 handle/state + * via the 'data' pointer for use in later callbacks. + * Return 0 on success, negative error code on failure. + * @freeze: Optional. Called during LUO freeze event (before actual = jump + * to new kernel). Should perform final state saving action= s and + * can update the u64 handle/state via the 'data' pointer. = Retur: + * 0 on success, negative error code on failure. + * @cancel: Optional. Called if the live update process is canceled = after + * prepare (or freeze) was called. Receives the u64 data + * set by prepare/freeze. Used for cleanup. + * @boot: Optional. Call durng boot post live update. This callbac= k is + * done when subsystem register during live update. 
+ * @finish: Optional. Called after the live update is finished in th= e new + * kernel. + * Receives the u64 data set by prepare/freeze. Used for cl= eanup. + * @owner: Module reference + */ +struct liveupdate_subsystem_ops { + int (*prepare)(struct liveupdate_subsystem *handle, u64 *data); + int (*freeze)(struct liveupdate_subsystem *handle, u64 *data); + void (*cancel)(struct liveupdate_subsystem *handle, u64 data); + void (*boot)(struct liveupdate_subsystem *handle, u64 data); + void (*finish)(struct liveupdate_subsystem *handle, u64 data); + struct module *owner; +}; + +/** + * struct liveupdate_subsystem - Represents a subsystem participating in L= UO + * @ops: Callback functions + * @name: Unique name identifying the subsystem. + * @list: List head used internally by LUO. Should not be modified= by + * caller after registration. + * @private_data: For LUO internal use, cached value of data field. + */ +struct liveupdate_subsystem { + const struct liveupdate_subsystem_ops *ops; + const char *name; + struct list_head list; + u64 private_data; +}; + #ifdef CONFIG_LIVEUPDATE =20 /* Return true if live update orchestrator is enabled */ @@ -33,6 +79,10 @@ bool liveupdate_state_normal(void); =20 enum liveupdate_state liveupdate_get_state(void); =20 +int liveupdate_register_subsystem(struct liveupdate_subsystem *h); +int liveupdate_unregister_subsystem(struct liveupdate_subsystem *h); +int liveupdate_get_subsystem_data(struct liveupdate_subsystem *h, u64 *dat= a); + #else /* CONFIG_LIVEUPDATE */ =20 static inline int liveupdate_reboot(void) @@ -60,5 +110,21 @@ static inline enum liveupdate_state liveupdate_get_stat= e(void) return LIVEUPDATE_STATE_NORMAL; } =20 +static inline int liveupdate_register_subsystem(struct liveupdate_subsyste= m *h) +{ + return 0; +} + +static inline int liveupdate_unregister_subsystem(struct liveupdate_subsys= tem *h) +{ + return 0; +} + +static inline int liveupdate_get_subsystem_data(struct liveupdate_subsyste= m *h, + u64 *data) +{ + 
return -ENODATA; +} + #endif /* CONFIG_LIVEUPDATE */ #endif /* _LINUX_LIVEUPDATE_H */ diff --git a/kernel/liveupdate/Makefile b/kernel/liveupdate/Makefile index 8627b7691943..47e9ad56675b 100644 --- a/kernel/liveupdate/Makefile +++ b/kernel/liveupdate/Makefile @@ -5,7 +5,8 @@ =20 luo-y :=3D \ luo_core.o \ - luo_ioctl.o + luo_ioctl.o \ + luo_subsystems.o =20 obj-$(CONFIG_KEXEC_HANDOVER) +=3D kexec_handover.o obj-$(CONFIG_KEXEC_HANDOVER_DEBUG) +=3D kexec_handover_debug.o diff --git a/kernel/liveupdate/luo_core.c b/kernel/liveupdate/luo_core.c index 951422e51dd3..64d53b31d6d8 100644 --- a/kernel/liveupdate/luo_core.c +++ b/kernel/liveupdate/luo_core.c @@ -130,6 +130,10 @@ static int luo_fdt_setup(void) if (ret) goto exit_free; =20 + ret =3D luo_subsystems_fdt_setup(fdt_out); + if (ret) + goto exit_free; + ret =3D kho_preserve_phys(__pa(fdt_out), LUO_FDT_SIZE); if (ret) goto exit_free; @@ -160,20 +164,30 @@ static void luo_fdt_destroy(void) =20 static int luo_do_prepare_calls(void) { - return 0; + int ret; + + ret =3D luo_do_subsystems_prepare_calls(); + + return ret; } =20 static int luo_do_freeze_calls(void) { - return 0; + int ret; + + ret =3D luo_do_subsystems_freeze_calls(); + + return ret; } =20 static void luo_do_finish_calls(void) { + luo_do_subsystems_finish_calls(); } =20 static void luo_do_cancel_calls(void) { + luo_do_subsystems_cancel_calls(); } =20 static int __luo_prepare(void) @@ -415,6 +429,7 @@ static int __init luo_startup(void) } =20 __luo_set_state(LIVEUPDATE_STATE_UPDATED); + luo_subsystems_startup(luo_fdt_in); =20 return 0; } diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_inter= nal.h index b61c17b78830..40bfbe279d34 100644 --- a/kernel/liveupdate/luo_internal.h +++ b/kernel/liveupdate/luo_internal.h @@ -27,4 +27,11 @@ void luo_state_read_exit(void); =20 const char *luo_current_state_str(void); =20 +void luo_subsystems_startup(void *fdt); +int luo_subsystems_fdt_setup(void *fdt); +int luo_do_subsystems_prepare_calls(void); 
+int luo_do_subsystems_freeze_calls(void); +void luo_do_subsystems_finish_calls(void); +void luo_do_subsystems_cancel_calls(void); + #endif /* _LINUX_LUO_INTERNAL_H */ diff --git a/kernel/liveupdate/luo_subsystems.c b/kernel/liveupdate/luo_sub= systems.c new file mode 100644 index 000000000000..69f00d5c000e --- /dev/null +++ b/kernel/liveupdate/luo_subsystems.c @@ -0,0 +1,291 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +/** + * DOC: LUO Subsystems support + * + * Various kernel subsystems register with the Live Update Orchestrator to + * participate in the live update process. These subsystems are notified at + * different stages of the live update sequence, allowing them to serialize + * device state before the reboot and restore it afterwards. Examples incl= ude + * the device layer, interrupt controllers, KVM, IOMMU, and specific device + * drivers. + */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include +#include +#include +#include +#include +#include +#include "luo_internal.h" + +#define LUO_SUBSYSTEMS_NODE_NAME "subsystems" +#define LUO_SUBSYSTEMS_COMPATIBLE "subsystems-v1" + +static DEFINE_MUTEX(luo_subsystem_list_mutex); +static LIST_HEAD(luo_subsystems_list); +static void *luo_fdt_out; +static void *luo_fdt_in; + +/** + * luo_subsystems_fdt_setup - Adds and populates the 'subsystems' node in = the + * FDT. + * @fdt: Pointer to the LUO FDT blob. + * + * Add subsystems node and each subsystem to the LUO FDT blob. + * + * Returns: 0 on success, negative errno on failure. 
+ */ +int luo_subsystems_fdt_setup(void *fdt) +{ + struct liveupdate_subsystem *subsystem; + const u64 zero_data =3D 0; + int ret, node_offset; + + guard(mutex)(&luo_subsystem_list_mutex); + ret =3D fdt_add_subnode(fdt, 0, LUO_SUBSYSTEMS_NODE_NAME); + if (ret < 0) + goto exit_error; + + node_offset =3D ret; + ret =3D fdt_setprop_string(fdt, node_offset, "compatible", + LUO_SUBSYSTEMS_COMPATIBLE); + if (ret < 0) + goto exit_error; + + list_for_each_entry(subsystem, &luo_subsystems_list, list) { + ret =3D fdt_setprop(fdt, node_offset, subsystem->name, + &zero_data, sizeof(zero_data)); + if (ret < 0) + goto exit_error; + } + + luo_fdt_out =3D fdt; + return 0; +exit_error: + pr_err("Failed to setup 'subsystems' node to FDT: %s\n", + fdt_strerror(ret)); + return -ENOSPC; +} + +/** + * luo_subsystems_startup - Validates the LUO subsystems FDT node at start= up. + * @fdt: Pointer to the LUO FDT blob passed from the previous kernel. + * + * This __init function checks the existence and validity of the '/subsyst= ems' + * node in the FDT. This node is considered mandatory. + */ +void __init luo_subsystems_startup(void *fdt) +{ + int ret, node_offset; + + guard(mutex)(&luo_subsystem_list_mutex); + node_offset =3D fdt_subnode_offset(fdt, 0, LUO_SUBSYSTEMS_NODE_NAME); + if (node_offset < 0) + luo_restore_fail("Failed to find /subsystems node\n"); + + ret =3D fdt_node_check_compatible(fdt, node_offset, + LUO_SUBSYSTEMS_COMPATIBLE); + if (ret) { + luo_restore_fail("FDT '%s' is incompatible with '%s' [%d]\n", + LUO_SUBSYSTEMS_NODE_NAME, + LUO_SUBSYSTEMS_COMPATIBLE, ret); + } + luo_fdt_in =3D fdt; +} + +static int luo_get_subsystem_data(struct liveupdate_subsystem *h, u64 *dat= a) +{ + return 0; +} + +/** + * luo_do_subsystems_prepare_calls - Calls prepare callbacks and updates F= DT + * if all prepares succeed. Handles cancellation on failure. + * + * Phase 1: Calls 'prepare' for all subsystems and stores results temporar= ily. 
+ * If any 'prepare' fails, calls 'cancel' on previously prepared subsystems + * and returns the error. + * Phase 2: If all 'prepare' calls succeeded, writes the stored data to th= e FDT. + * If any FDT write fails, calls 'cancel' on *all* prepared subsystems and + * returns the FDT error. + * + * Returns: 0 on success. Negative errno on failure. + */ +int luo_do_subsystems_prepare_calls(void) +{ + return 0; +} + +/** + * luo_do_subsystems_freeze_calls - Calls freeze callbacks and updates FDT + * if all freezes succeed. Handles cancellation on failure. + * + * Phase 1: Calls 'freeze' for all subsystems and stores results temporari= ly. + * If any 'freeze' fails, calls 'cancel' on previously called subsystems + * and returns the error. + * Phase 2: If all 'freeze' calls succeeded, writes the stored data to the= FDT. + * If any FDT write fails, calls 'cancel' on *all* subsystems and + * returns the FDT error. + * + * Returns: 0 on success. Negative errno on failure. + */ +int luo_do_subsystems_freeze_calls(void) +{ + return 0; +} + +/** + * luo_do_subsystems_finish_calls- Calls finish callbacks for all subsyste= ms. + * + * This function is called at the end of live update cycle to do the final + * clean-up or housekeeping of the post-live update states. + */ +void luo_do_subsystems_finish_calls(void) +{ +} + +/** + * luo_do_subsystems_cancel_calls - Calls cancel callbacks for all subsyst= ems. + * + * This function is typically called when the live update process needs to= be + * aborted externally, for example, after the prepare phase may have run b= ut + * before actual reboot. It iterates through all registered subsystems and= calls + * the 'cancel' callback for those that implement it and likely completed + * prepare. 
+ */ +void luo_do_subsystems_cancel_calls(void) +{ +} + +/** + * liveupdate_register_subsystem - Register a kernel subsystem handler wit= h LUO + * @h: Pointer to the liveupdate_subsystem structure allocated and populat= ed + * by the calling subsystem. + * + * Registers a subsystem handler that provides callbacks for different eve= nts + * of the live update cycle. Registration is typically done during the + * subsystem's module init or core initialization. + * + * Can only be called when LUO is in the NORMAL or UPDATED states. + * The provided name (@h->name) must be unique among registered subsystems. + * + * Return: 0 on success, negative error code otherwise. + */ +int liveupdate_register_subsystem(struct liveupdate_subsystem *h) +{ + struct liveupdate_subsystem *iter; + int ret =3D 0; + + luo_state_read_enter(); + if (!liveupdate_state_normal() && !liveupdate_state_updated()) { + luo_state_read_exit(); + return -EBUSY; + } + + guard(mutex)(&luo_subsystem_list_mutex); + list_for_each_entry(iter, &luo_subsystems_list, list) { + if (iter =3D=3D h) { + pr_warn("Subsystem '%s' (%p) already registered.\n", + h->name, h); + ret =3D -EEXIST; + goto out_unlock; + } + + if (!strcmp(iter->name, h->name)) { + pr_err("Subsystem with name '%s' already registered.\n", + h->name); + ret =3D -EEXIST; + goto out_unlock; + } + } + + if (!try_module_get(h->ops->owner)) { + pr_warn("Subsystem '%s' unable to get reference.\n", h->name); + ret =3D -EAGAIN; + goto out_unlock; + } + + INIT_LIST_HEAD(&h->list); + list_add_tail(&h->list, &luo_subsystems_list); + +out_unlock: + /* + * If we are booting during live update, and subsystem provided a boot + * callback, do it now, since we know that subsystem has already + * initialized. 
+ */ + if (!ret && liveupdate_state_updated() && h->ops->boot) { + u64 data; + + ret =3D luo_get_subsystem_data(h, &data); + if (!WARN_ON_ONCE(ret)) + h->ops->boot(h, data); + } + + luo_state_read_exit(); + + return ret; +} + +/** + * liveupdate_unregister_subsystem - Unregister a kernel subsystem handler= from + * LUO + * @h: Pointer to the same liveupdate_subsystem structure that was used du= ring + * registration. + * + * Unregisters a previously registered subsystem handler. Typically called + * during module exit or subsystem teardown. LUO removes the structure fro= m its + * internal list; the caller is responsible for any necessary memory clean= up + * of the structure itself. + * + * Return: 0 on success, negative error code otherwise. + * -EINVAL if h is NULL. + * -ENOENT if the specified handler @h is not found in the registration li= st. + * -EBUSY if LUO is not in the NORMAL state. + */ +int liveupdate_unregister_subsystem(struct liveupdate_subsystem *h) +{ + struct liveupdate_subsystem *iter; + bool found =3D false; + int ret =3D 0; + + luo_state_read_enter(); + if (!liveupdate_state_normal() && !liveupdate_state_updated()) { + luo_state_read_exit(); + return -EBUSY; + } + + guard(mutex)(&luo_subsystem_list_mutex); + list_for_each_entry(iter, &luo_subsystems_list, list) { + if (iter =3D=3D h) { + found =3D true; + break; + } + } + + if (found) { + list_del_init(&h->list); + } else { + pr_warn("Subsystem handler '%s' not found for unregistration.\n", + h->name); + ret =3D -ENOENT; + } + + module_put(h->ops->owner); + luo_state_read_exit(); + + return ret; +} + +int liveupdate_get_subsystem_data(struct liveupdate_subsystem *h, u64 *dat= a) +{ + return 0; +} --=20 2.50.1.565.gc32cd1483b-goog From nobody Sun Oct 5 07:24:30 2025 Received: from mail-qv1-f53.google.com (mail-qv1-f53.google.com [209.85.219.53]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with 
From: Pasha Tatashin
To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com
Subject: [PATCH v3 13/30] liveupdate: luo_subsystems: implement subsystem callbacks
Date: Thu, 7 Aug 2025 01:44:19 +0000
Message-ID: <20250807014442.3829950-14-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog
In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com>
References: <20250807014442.3829950-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Implement the core logic within luo_subsystems.c to handle the
invocation of registered subsystem callbacks and manage the persistence
of their state via the LUO FDT. This replaces the stub implementations
from the previous patch.

This completes the core mechanism enabling subsystems to actively
participate in the LUO state machine, execute phase-specific logic, and
persist/restore a u64 state across the live update transition
using the FDT.

Signed-off-by: Pasha Tatashin
---
 kernel/liveupdate/luo_subsystems.c | 167 ++++++++++++++++++++++++++++-
 1 file changed, 164 insertions(+), 3 deletions(-)

diff --git a/kernel/liveupdate/luo_subsystems.c b/kernel/liveupdate/luo_subsystems.c
index 69f00d5c000e..ebb7c0db08f3 100644
--- a/kernel/liveupdate/luo_subsystems.c
+++ b/kernel/liveupdate/luo_subsystems.c
@@ -101,8 +101,81 @@ void __init luo_subsystems_startup(void *fdt)
 	luo_fdt_in = fdt;
 }
 
+static void __luo_do_subsystems_cancel_calls(struct liveupdate_subsystem *boundary_subsystem)
+{
+	struct liveupdate_subsystem *subsystem;
+
+	list_for_each_entry(subsystem, &luo_subsystems_list, list) {
+		if (subsystem == boundary_subsystem)
+			break;
+
+		if (subsystem->ops->cancel) {
+			subsystem->ops->cancel(subsystem,
+					       subsystem->private_data);
+		}
+		subsystem->private_data = 0;
+	}
+}
+
+static void luo_subsystems_retrieve_data_from_fdt(void)
+{
+	struct liveupdate_subsystem *subsystem;
+	int node_offset, prop_len;
+	const void *prop;
+
+	if (!luo_fdt_in)
+		return;
+
+	node_offset = fdt_subnode_offset(luo_fdt_in, 0,
+					 LUO_SUBSYSTEMS_NODE_NAME);
+	list_for_each_entry(subsystem, &luo_subsystems_list, list) {
+		prop = fdt_getprop(luo_fdt_in, node_offset,
+				   subsystem->name, &prop_len);
+
+		if (!prop || prop_len != sizeof(u64)) {
+			luo_restore_fail("In FDT node '/%s' can't find property '%s': %s\n",
+					 LUO_SUBSYSTEMS_NODE_NAME,
+					 subsystem->name,
+					 fdt_strerror(node_offset));
+		}
+		memcpy(&subsystem->private_data, prop, sizeof(u64));
+	}
+}
+
+static int luo_subsystems_commit_data_to_fdt(void)
+{
+	struct liveupdate_subsystem *subsystem;
+	int ret, node_offset;
+
+	node_offset = fdt_subnode_offset(luo_fdt_out, 0,
+					 LUO_SUBSYSTEMS_NODE_NAME);
+	list_for_each_entry(subsystem, &luo_subsystems_list, list) {
+		ret = fdt_setprop(luo_fdt_out, node_offset, subsystem->name,
+				  &subsystem->private_data, sizeof(u64));
+		if (ret < 0) {
+			pr_err("Failed to set FDT property for subsystem '%s' %s\n",
+			       subsystem->name, fdt_strerror(ret));
+			return -ENOENT;
+		}
+	}
+
+	return 0;
+}
+
 static int luo_get_subsystem_data(struct liveupdate_subsystem *h, u64 *data)
 {
+	int node_offset, prop_len;
+	const void *prop;
+
+	node_offset = fdt_subnode_offset(luo_fdt_in, 0,
+					 LUO_SUBSYSTEMS_NODE_NAME);
+	prop = fdt_getprop(luo_fdt_in, node_offset, h->name, &prop_len);
+	if (!prop || prop_len != sizeof(u64)) {
+		luo_state_read_exit();
+		return -ENOENT;
+	}
+	memcpy(data, prop, sizeof(u64));
+
 	return 0;
 }
 
@@ -121,7 +194,30 @@ static int luo_get_subsystem_data(struct liveupdate_subsystem *h, u64 *data)
  */
 int luo_do_subsystems_prepare_calls(void)
 {
-	return 0;
+	struct liveupdate_subsystem *subsystem;
+	int ret;
+
+	guard(mutex)(&luo_subsystem_list_mutex);
+	list_for_each_entry(subsystem, &luo_subsystems_list, list) {
+		if (!subsystem->ops->prepare)
+			continue;
+
+		ret = subsystem->ops->prepare(subsystem,
+					      &subsystem->private_data);
+		if (ret < 0) {
+			pr_err("Subsystem '%s' prepare callback failed [%d]\n",
+			       subsystem->name, ret);
+			__luo_do_subsystems_cancel_calls(subsystem);
+
+			return ret;
+		}
+	}
+
+	ret = luo_subsystems_commit_data_to_fdt();
+	if (ret)
+		__luo_do_subsystems_cancel_calls(NULL);
+
+	return ret;
 }
 
 /**
@@ -139,7 +235,30 @@ int luo_do_subsystems_prepare_calls(void)
  */
 int luo_do_subsystems_freeze_calls(void)
 {
-	return 0;
+	struct liveupdate_subsystem *subsystem;
+	int ret;
+
+	guard(mutex)(&luo_subsystem_list_mutex);
+	list_for_each_entry(subsystem, &luo_subsystems_list, list) {
+		if (!subsystem->ops->freeze)
+			continue;
+
+		ret = subsystem->ops->freeze(subsystem,
+					     &subsystem->private_data);
+		if (ret < 0) {
+			pr_err("Subsystem '%s' freeze callback failed [%d]\n",
+			       subsystem->name, ret);
+			__luo_do_subsystems_cancel_calls(subsystem);
+
+			return ret;
+		}
+	}
+
+	ret = luo_subsystems_commit_data_to_fdt();
+	if (ret)
+		__luo_do_subsystems_cancel_calls(NULL);
+
+	return ret;
 }
 
 /**
@@ -150,6 +269,18 @@ int luo_do_subsystems_freeze_calls(void)
  */
 void luo_do_subsystems_finish_calls(void)
 {
+	struct liveupdate_subsystem *subsystem;
+
+	guard(mutex)(&luo_subsystem_list_mutex);
+	luo_subsystems_retrieve_data_from_fdt();
+
+	list_for_each_entry(subsystem, &luo_subsystems_list, list) {
+		if (subsystem->ops->finish) {
+			subsystem->ops->finish(subsystem,
+					       subsystem->private_data);
+		}
+		subsystem->private_data = 0;
+	}
 }
 
 /**
@@ -163,6 +294,9 @@ void luo_do_subsystems_finish_calls(void)
  */
 void luo_do_subsystems_cancel_calls(void)
 {
+	guard(mutex)(&luo_subsystem_list_mutex);
+	__luo_do_subsystems_cancel_calls(NULL);
+	luo_subsystems_commit_data_to_fdt();
 }
 
 /**
@@ -285,7 +419,34 @@ int liveupdate_unregister_subsystem(struct liveupdate_subsystem *h)
 	return ret;
 }
 
+/**
+ * liveupdate_get_subsystem_data - Retrieve raw private data for a subsystem
+ * from FDT.
+ * @h:      Pointer to the liveupdate_subsystem structure representing the
+ *          subsystem instance. The 'name' field is used to find the property.
+ * @data:   Output pointer where the subsystem's raw private u64 data will be
+ *          stored via memcpy.
+ *
+ * Reads the 8-byte data property associated with the subsystem @h->name
+ * directly from the '/subsystems' node within the globally accessible
+ * 'luo_fdt_in' blob. Returns appropriate error codes if inputs are invalid, or
+ * nodes/properties are missing or invalid.
+ *
+ * Return: 0 on success. -ENOENT on error.
+ */
 int liveupdate_get_subsystem_data(struct liveupdate_subsystem *h, u64 *data)
 {
-	return 0;
+	int ret;
+
+	luo_state_read_enter();
+	if (WARN_ON_ONCE(!luo_fdt_in || !liveupdate_state_updated())) {
+		luo_state_read_exit();
+		return -ENOENT;
+	}
+
+	scoped_guard(mutex, &luo_subsystem_list_mutex)
+		ret = luo_get_subsystem_data(h, data);
+	luo_state_read_exit();
+
+	return ret;
 }
-- 
2.50.1.565.gc32cd1483b-goog

From nobody Sun Oct 5 07:24:30 2025
dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=XNxdTF6i; arc=none smtp.client-ip=209.85.219.44 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="XNxdTF6i" Received: by mail-qv1-f44.google.com with SMTP id 6a1803df08f44-70875cc3423so5976766d6.0 for ; Wed, 06 Aug 2025 18:45:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531108; x=1755135908; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=mp1vmIN6774rh53/eJBOc5kRP469D2TV1rphzED5Sxg=; b=XNxdTF6iksWI/03rM5SZyzdcDwPpKCKmG7L7KBoE/JNeoIf+/s0n8y/tNMHhh8c6HW zT/MzdlBUbv2kp2pqXXEgqdWSowwpx0iITXpaDysdiPsCD1q6vIo37v8VR37lqyYxjmA IrLAOskjzdKXAvNnqDMNlWOr7yH/qFwVzlHV9lKWLDzaG9v4nnrMsTHbNUQrklvz25s6 WG4lad9dnlvRYEVXKGNJ0NQQD9m4+nyQq9HcsajF/R60FStHjuKwW1VJHJFVyjpCBZ/P DuLlBGz3u8+2peBVA/5ObxDtOVeDpbjboQS8yIq+rV/XB/J7k7ma+asVXpWF7IYsy5GT IOFA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531108; x=1755135908; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=mp1vmIN6774rh53/eJBOc5kRP469D2TV1rphzED5Sxg=; b=kfO0pq/mXdJEiopnMb5QyAl+TNWgLiYlB8vEA36TTDN2O3eAxsO5ubNaBvNY0iNLeJ +rbaa3s4h6AZKbwB51JsGn4SWImyz9MSyHsBaOzloBoN6+9yWO9LezlQjCmveihuykH1 wVazCx6SVjnw47Zdon/4nyB9rb29MOOUsfeAV/1uGno2BYTUfHRcyixKjWciT5oq1YH6 +wWa54rWc+Wa98gV12qBwl2kOZeayCW5dSfLdQ/RV1AtnYBQq83LuNgk44/94gxBV9ui 
FhvRd8pnBloBV32Znjl6c7Jn9y1MWHhpc9PL3qjXBwAWBYf425E1HYwyPYk9lQt6qgm5 7O0w== X-Forwarded-Encrypted: i=1; AJvYcCV3FD73gKqBmhpg+8eHoaBfnSPB/ODMbDbj+3gtHtwvkBeLLlMinaxrlwqov9W6+pDnM/1F3KbjKJF83v8=@vger.kernel.org X-Gm-Message-State: AOJu0YwoF0mPO85+hvDtDFJIUS618Rt+suxA44VdSO3u3qXVUcTqHv81 L4OK1MXoMpk42JfNQQO2dv6q7fib0LG0wVdxC0mgEg/Rx4sRkfwckG+xCUqgnVVrCZQ= X-Gm-Gg: ASbGncvJkgH0UXmUTAI+rSG0d5a0YIr6t6EFXsyXJgOzDeG6FJjBrjSZ+rjCuqpjIXg xW44QDSkloqIN27g9PsBgv67A2F6ugHMQCylHjK1A4vIkoo4Z+EmG5lKuwRX/C4zVt6TdmRkD2h Z9wfEdJB9xHHGtNGXQ/apiioeypraMK9Fgs1EArJ1KdTdma/sbPT7tgHiB5UdLpSkCpWOFr0zuJ 1aijWbK3L/s7PkGryO61o1rmqcVhoTrG0pkzxfa6Y07Yaqah5od1QoAqXN5uI5q/iyGAw3/V6sW olpfYFnLp0ncz+xXWVhFZPXZoCDPQNvh4X8+xt/chJU/wT+mVpD0XxE96m8LLjs0UByQkEjz0Q0 Nu2gRtpI/YkNgzx536I1pm7Te1vXsIDdJjoSq4wAJ8d1grEEJFAg1tjTKoXHgp7kxQkdH77Tc+Q Zr3g8RQN9pHTUY1KCudosdBWk= X-Google-Smtp-Source: AGHT+IHW6xOS8icaV4fCZErUgrfoJrCJf1Yssvt+ylUZS0UycAitGY8nEQwBLa80lBpJIc7l5fKEdg== X-Received: by 2002:a05:6214:2422:b0:707:6409:d001 with SMTP id 6a1803df08f44-7098943a0camr29339416d6.21.1754531107741; Wed, 06 Aug 2025 18:45:07 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
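For readers following along, here is a small user-space sketch of the
memcpy-based read that the liveupdate_get_subsystem_data() kernel-doc
describes. It is not code from the series: read_u64_prop() and
read_u64_prop_selfcheck() are hypothetical names, the errno values only
mirror the "Return:" line of the comment, and the note about alignment
is my assumption about why a memcpy is used rather than a direct load.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of reading an 8-byte property into a u64.
 * Returns 0 on success, -EINVAL when the output pointer is NULL, and
 * -ENOENT when the property is missing or is not exactly 8 bytes long.
 */
static int read_u64_prop(const void *prop, int len, uint64_t *out)
{
	if (!out)
		return -EINVAL;
	if (!prop || len != (int)sizeof(*out))
		return -ENOENT;

	/*
	 * Assumption: property data is not guaranteed to be 8-byte
	 * aligned, so copy the bytes instead of dereferencing a u64 *.
	 */
	memcpy(out, prop, sizeof(*out));
	return 0;
}

/* Usage example: exercises the success path and two failure paths. */
static void read_u64_prop_selfcheck(void)
{
	const unsigned char raw[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
	uint64_t value = 0;

	assert(read_u64_prop(raw, sizeof(raw), &value) == 0);
	assert(memcmp(&value, raw, sizeof(raw)) == 0);
	assert(read_u64_prop(raw, 4, &value) == -ENOENT);
	assert(read_u64_prop(NULL, 8, &value) == -ENOENT);
}
```

The sketch copies the bytes verbatim and does no byte-order conversion,
which matches the "raw private u64 data" wording of the comment.
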
From: Pasha Tatashin
To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com,
	changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org,
	dmatlack@google.com, rientjes@google.com, corbet@lwn.net,
	rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com,
	kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com,
	masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org,
	yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev,
	chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com,
	jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org,
	dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org,
	rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org,
	zhangguopeng@kylinos.cn, linux@weissschuh.net,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org,
	bartosz.golaszewski@linaro.org, cw00.choi@samsung.com,
	myungjoo.ham@samsung.com, yesanishhere@gmail.com,
	Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com,
	aleksander.lobakin@intel.com, ira.weiny@intel.com,
	andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de,
	bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com,
	stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net,
	brauner@kernel.org, linux-api@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, saeedm@nvidia.com,
	ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com,
	leonro@nvidia.com, witu@nvidia.com
Subject: [PATCH v3 14/30] liveupdate: luo_files: add infrastructure for FDs
Date: Thu, 7 Aug 2025 01:44:20 +0000
Message-ID:
 <20250807014442.3829950-15-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog
In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com>
References: <20250807014442.3829950-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Introduce the framework within LUO to support preserving specific types
of file descriptors across a live update transition. This allows
stateful FDs (like memfds or vfio FDs used by VMs) to be recreated in
the new kernel.

Note: The core logic for iterating through the luo_files_list and
invoking the handler callbacks (prepare, freeze, cancel, finish) within
luo_do_files_*_calls, as well as managing the u64 data persistence via
the FDT for individual files, is currently implemented as stubs in this
patch. This patch sets up the registration, FDT layout, and retrieval
framework.

Signed-off-by: Pasha Tatashin
---
 include/linux/liveupdate.h       |  73 ++++
 kernel/liveupdate/Makefile       |   1 +
 kernel/liveupdate/luo_files.c    | 677 +++++++++++++++++++++++++++++++
 kernel/liveupdate/luo_internal.h |   4 +
 4 files changed, 755 insertions(+)
 create mode 100644 kernel/liveupdate/luo_files.c

diff --git a/include/linux/liveupdate.h b/include/linux/liveupdate.h
index 4c378a986cfe..72786482ca48 100644
--- a/include/linux/liveupdate.h
+++ b/include/linux/liveupdate.h
@@ -13,6 +13,66 @@
 #include
 
 struct liveupdate_subsystem;
+struct liveupdate_file_handler;
+struct file;
+
+/**
+ * struct liveupdate_file_ops - Callbacks for live-updatable files.
+ * @prepare:       Optional. Saves state for a specific file instance @file,
+ *                 before update, potentially returning value via @data.
+ *                 Returns 0 on success, negative errno on failure.
+ * @freeze:        Optional. Performs final actions just before kernel
+ *                 transition, potentially reading/updating the handle via
+ *                 @data.
+ *                 Returns 0 on success, negative errno on failure.
+ * @cancel:        Optional. Cleans up state/resources if update is aborted
+ *                 after prepare/freeze succeeded, using the @data handle (by
+ *                 value) from the successful prepare. Returns void.
+ * @finish:        Optional. Performs final cleanup in the new kernel using the
+ *                 preserved @data handle (by value). Returns void.
+ * @retrieve:      Retrieve the preserved file. Must be called before finish.
+ * @can_preserve:  callback to determine if @file can be preserved by this
+ *                 handler.
+ *                 Return bool (true if preservable, false otherwise).
+ * @owner:         Module reference
+ */
+struct liveupdate_file_ops {
+	int (*prepare)(struct liveupdate_file_handler *handler,
+		       struct file *file, u64 *data);
+	int (*freeze)(struct liveupdate_file_handler *handler,
+		      struct file *file, u64 *data);
+	void (*cancel)(struct liveupdate_file_handler *handler,
+		       struct file *file, u64 data);
+	void (*finish)(struct liveupdate_file_handler *handler,
+		       struct file *file, u64 data, bool reclaimed);
+	int (*retrieve)(struct liveupdate_file_handler *handler,
+			u64 data, struct file **file);
+	bool (*can_preserve)(struct liveupdate_file_handler *handler,
+			     struct file *file);
+	struct module *owner;
+};
+
+/**
+ * struct liveupdate_file_handler - Represents a handler for a live-updatable
+ * file type.
+ * @ops:           Callback functions
+ * @compatible:    The compatibility string (e.g., "memfd-v1", "vfiofd-v1")
+ *                 that uniquely identifies the file type this handler supports.
+ *                 This is matched against the compatible string associated with
+ *                 individual &struct liveupdate_file instances.
+ * @list:          used for linking this handler instance into a global list of
+ *                 registered file handlers.
+ *
+ * Modules that want to support live update for specific file types should
+ * register an instance of this structure. LUO uses this registration to
+ * determine if a given file can be preserved and to find the appropriate
+ * operations to manage its state across the update.
+ */
+struct liveupdate_file_handler {
+	const struct liveupdate_file_ops *ops;
+	const char *compatible;
+	struct list_head list;
+};
 
 /**
  * struct liveupdate_subsystem_ops - LUO events callback functions
@@ -83,6 +143,9 @@ int liveupdate_register_subsystem(struct liveupdate_subsystem *h);
 int liveupdate_unregister_subsystem(struct liveupdate_subsystem *h);
 int liveupdate_get_subsystem_data(struct liveupdate_subsystem *h, u64 *data);
 
+int liveupdate_register_file_handler(struct liveupdate_file_handler *h);
+int liveupdate_unregister_file_handler(struct liveupdate_file_handler *h);
+
 #else /* CONFIG_LIVEUPDATE */
 
 static inline int liveupdate_reboot(void)
@@ -126,5 +189,15 @@ static inline int liveupdate_get_subsystem_data(struct liveupdate_subsystem *h,
 	return -ENODATA;
 }
 
+static inline int liveupdate_register_file_handler(struct liveupdate_file_handler *h)
+{
+	return 0;
+}
+
+static inline int liveupdate_unregister_file_handler(struct liveupdate_file_handler *h)
+{
+	return 0;
+}
+
 #endif /* CONFIG_LIVEUPDATE */
 #endif /* _LINUX_LIVEUPDATE_H */
diff --git a/kernel/liveupdate/Makefile b/kernel/liveupdate/Makefile
index 47e9ad56675b..c67fa2797796 100644
--- a/kernel/liveupdate/Makefile
+++ b/kernel/liveupdate/Makefile
@@ -5,6 +5,7 @@
 
 luo-y :=						\
 		luo_core.o				\
+		luo_files.o				\
 		luo_ioctl.o				\
 		luo_subsystems.o
 
diff --git a/kernel/liveupdate/luo_files.c b/kernel/liveupdate/luo_files.c
new file mode 100644
index 000000000000..4b7568d0f0f0
--- /dev/null
+++ b/kernel/liveupdate/luo_files.c
@@ -0,0 +1,677 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin
+ */
+
+/**
+ * DOC: LUO file descriptors
+ *
+ * LUO provides the infrastructure necessary to preserve
+ * specific types of stateful file descriptors across a kernel live
+ * update transition. The primary goal is to allow workloads, such as virtual
+ * machines using vfio, memfd, or iommufd to retain access to their essential
+ * resources without interruption after the underlying kernel is updated.
+ *
+ * The framework operates based on handler registration and instance tracking:
+ *
+ * 1. Handler Registration: Kernel modules responsible for specific file
+ * types (e.g., memfd, vfio) register a &struct liveupdate_file_handler
+ * handler. This handler contains callbacks
+ * (&liveupdate_file_handler.ops->prepare,
+ * &liveupdate_file_handler.ops->freeze,
+ * &liveupdate_file_handler.ops->finish, etc.) and a unique 'compatible' string
+ * identifying the file type. Registration occurs via
+ * liveupdate_register_file_handler().
+ *
+ * 2. File Instance Tracking: When a potentially preservable file needs to be
+ * managed for live update, the core LUO logic (luo_register_file()) finds a
+ * compatible registered handler using its
+ * &liveupdate_file_handler.ops->can_preserve callback. If found, an internal
+ * &struct luo_file instance is created, assigned a unique u64 'token', and
+ * added to a list.
+ *
+ * 3. State Persistence (FDT): During the LUO prepare/freeze phases, the
+ * registered handler callbacks are invoked for each tracked file instance.
+ * These callbacks can generate a u64 data payload representing the minimal
+ * state needed for restoration. This payload, along with the handler's
+ * compatible string and the unique token, is stored in a dedicated
+ * '/file-descriptors' node within the main LUO FDT blob passed via
+ * Kexec Handover (KHO).
+ *
+ * 4. Restoration: In the new kernel, the LUO framework parses the incoming
+ * FDT to reconstruct the list of &struct luo_file instances. When the
+ * original owner requests the file, luo_retrieve_file() uses the corresponding
+ * handler's &liveupdate_file_handler.ops->retrieve callback, passing the
+ * persisted u64 data, to recreate or find the appropriate &struct file object.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "luo_internal.h"
+
+#define LUO_FILES_NODE_NAME "file-descriptors"
+#define LUO_FILES_COMPATIBLE "file-descriptors-v1"
+
+static DEFINE_XARRAY(luo_files_xa_in);
+static DEFINE_XARRAY(luo_files_xa_out);
+static bool luo_files_xa_in_recreated;
+
+/* Registered files. */
+static DECLARE_RWSEM(luo_register_file_list_rwsem);
+static LIST_HEAD(luo_register_file_list);
+
+static DECLARE_RWSEM(luo_file_fdt_rwsem);
+static void *luo_file_fdt_out;
+static void *luo_file_fdt_in;
+
+static size_t luo_file_fdt_out_size;
+
+static atomic64_t luo_files_count;
+
+/**
+ * struct luo_file - Represents a file descriptor instance preserved
+ * across live update.
+ * @fh:            Pointer to the &struct liveupdate_file_handler containing
+ *                 the implementation of prepare, freeze, cancel, and finish
+ *                 operations specific to this file's type.
+ * @file:          A pointer to the kernel's &struct file object representing
+ *                 the open file descriptor that is being preserved.
+ * @private_data:  Internal storage used by the live update core framework
+ *                 between phases.
+ * @reclaimed:     Flag indicating whether this preserved file descriptor has
+ *                 been successfully 'reclaimed' (e.g., requested via an ioctl)
+ *                 by user-space or the owning kernel subsystem in the new
+ *                 kernel after the live update.
+ * @state:         The current state of file descriptor, it is allowed to
+ *                 prepare, freeze, and finish FDs before the global state
+ *                 switch.
+ * @mutex:         Lock to protect FD state, and allow independently to change
+ *                 the FD state compared to global state.
+ *
+ * This structure holds the necessary callbacks and context for managing a
+ * specific open file descriptor throughout the different phases of a live
+ * update process. Instances of this structure are typically allocated,
+ * populated with file-specific details (&file, &arg, callbacks, compatibility
+ * string, token), and linked into a central list managed by the LUO. The
+ * private_data field is used internally by the core logic to store state
+ * between phases.
+ */
+struct luo_file {
+	struct liveupdate_file_handler *fh;
+	struct file *file;
+	u64 private_data;
+	bool reclaimed;
+	enum liveupdate_state state;
+	struct mutex mutex;
+};
+
+static void luo_files_recreate_luo_files_xa_in(void)
+{
+	const char *node_name, *fdt_compat_str;
+	struct liveupdate_file_handler *fh;
+	struct luo_file *luo_file;
+	const void *data_ptr;
+	int file_node_offset;
+	int ret = 0;
+
+	guard(rwsem_read)(&luo_file_fdt_rwsem);
+	if (luo_files_xa_in_recreated || !luo_file_fdt_in)
+		return;
+
+	/* Take write in order to guarantee that we re-create list once */
+	guard(rwsem_write)(&luo_register_file_list_rwsem);
+	if (luo_files_xa_in_recreated)
+		return;
+
+	fdt_for_each_subnode(file_node_offset, luo_file_fdt_in, 0) {
+		bool handler_found = false;
+		u64 token;
+
+		node_name = fdt_get_name(luo_file_fdt_in, file_node_offset,
+					 NULL);
+		if (!node_name) {
+			luo_restore_fail("FDT subnode at offset %d: Cannot get name\n",
+					 file_node_offset);
+		}
+
+		ret = kstrtou64(node_name, 0, &token);
+		if (ret < 0) {
+			luo_restore_fail("FDT node '%s': Failed to parse token\n",
+					 node_name);
+		}
+
+		if (xa_load(&luo_files_xa_in, token)) {
+			luo_restore_fail("Duplicate token %llu found in incoming FDT for file descriptors.\n",
+					 token);
+		}
+
+		fdt_compat_str = fdt_getprop(luo_file_fdt_in, file_node_offset,
+					     "compatible", NULL);
+		if (!fdt_compat_str) {
+			luo_restore_fail("FDT node '%s': Missing 'compatible' property\n",
+					 node_name);
+		}
+
+		data_ptr = fdt_getprop(luo_file_fdt_in, file_node_offset, "data",
+				       NULL);
+		if (!data_ptr) {
+			luo_restore_fail("Can't recover property 'data' for FDT node '%s'\n",
+					 node_name);
+		}
+
+		list_for_each_entry(fh, &luo_register_file_list, list) {
+			if (!strcmp(fh->compatible, fdt_compat_str)) {
+				handler_found = true;
+				break;
+			}
+		}
+
+		if (!handler_found) {
+			luo_restore_fail("FDT node '%s': No registered handler for compatible '%s'\n",
+					 node_name, fdt_compat_str);
+		}
+
+		luo_file = kmalloc(sizeof(*luo_file),
+				   GFP_KERNEL | __GFP_NOFAIL);
+		luo_file->fh = fh;
+		luo_file->file = NULL;
+		memcpy(&luo_file->private_data, data_ptr, sizeof(u64));
+		luo_file->reclaimed = false;
+		mutex_init(&luo_file->mutex);
+		luo_file->state = LIVEUPDATE_STATE_UPDATED;
+		ret = xa_err(xa_store(&luo_files_xa_in, token, luo_file,
+				      GFP_KERNEL | __GFP_NOFAIL));
+		if (ret < 0) {
+			luo_restore_fail("Failed to store luo_file for token %llu in XArray: %d\n",
+					 token, ret);
+		}
+	}
+	luo_files_xa_in_recreated = true;
+}
+
+static size_t luo_files_fdt_size(void)
+{
+	u64 num_files = atomic64_read(&luo_files_count);
+
+	/* Estimate a 1K overhead, + 128 bytes per file entry */
+	return PAGE_SIZE << get_order(SZ_1K + (num_files * 128));
+}
+
+static void luo_files_fdt_cleanup(void)
+{
+	WARN_ON_ONCE(kho_unpreserve_phys(__pa(luo_file_fdt_out),
+					 luo_file_fdt_out_size));
+
+	free_pages((unsigned long)luo_file_fdt_out,
+		   get_order(luo_file_fdt_out_size));
+
+	luo_file_fdt_out_size = 0;
+	luo_file_fdt_out = NULL;
+}
+
+static int luo_files_to_fdt(struct xarray *files_xa_out)
+{
+	const u64 zero_data = 0;
+	unsigned long token;
+	struct luo_file *h;
+	char token_str[19];
+	int ret = 0;
+
+	xa_for_each(files_xa_out, token, h) {
+		snprintf(token_str, sizeof(token_str), "%#0llx", (u64)token);
+
+		ret = fdt_begin_node(luo_file_fdt_out, token_str);
+		if (ret < 0)
+			break;
+
+		ret = fdt_property_string(luo_file_fdt_out, "compatible",
+					  h->fh->compatible);
+		if (ret < 0) {
+			fdt_end_node(luo_file_fdt_out);
+			break;
+		}
+
+		ret = fdt_property_u64(luo_file_fdt_out, "data", zero_data);
+		if (ret < 0) {
+			fdt_end_node(luo_file_fdt_out);
+			break;
+		}
+
+		ret = fdt_end_node(luo_file_fdt_out);
+		if (ret < 0)
+			break;
+	}
+
+	return ret;
+}
+
+static int luo_files_fdt_setup(void)
+{
+	int ret;
+
+	guard(rwsem_write)(&luo_file_fdt_rwsem);
+	luo_file_fdt_out_size = luo_files_fdt_size();
+	luo_file_fdt_out = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
+					get_order(luo_file_fdt_out_size));
+	if (!luo_file_fdt_out) {
+		pr_err("Failed to allocate FDT memory (%zu bytes)\n",
+		       luo_file_fdt_out_size);
+		luo_file_fdt_out_size = 0;
+		return -ENOMEM;
+	}
+
+	ret = kho_preserve_phys(__pa(luo_file_fdt_out), luo_file_fdt_out_size);
+	if (ret) {
+		pr_err("Failed to kho preserve FDT memory (%zu bytes)\n",
+		       luo_file_fdt_out_size);
+		luo_file_fdt_out_size = 0;
+		luo_file_fdt_out = NULL;
+		return ret;
+	}
+
+	ret = fdt_create(luo_file_fdt_out, luo_file_fdt_out_size);
+	if (ret < 0)
+		goto exit_cleanup;
+
+	ret = fdt_finish_reservemap(luo_file_fdt_out);
+	if (ret < 0)
+		goto exit_finish;
+
+	ret = fdt_begin_node(luo_file_fdt_out, LUO_FILES_NODE_NAME);
+	if (ret < 0)
+		goto exit_finish;
+
+	ret = fdt_property_string(luo_file_fdt_out, "compatible",
+				  LUO_FILES_COMPATIBLE);
+	if (ret < 0)
+		goto exit_end_node;
+
+	ret = luo_files_to_fdt(&luo_files_xa_out);
+	if (ret < 0)
+		goto exit_end_node;
+
+	ret = fdt_end_node(luo_file_fdt_out);
+	if (ret < 0)
+		goto exit_finish;
+
+	ret = fdt_finish(luo_file_fdt_out);
+	if (ret < 0)
+		goto exit_cleanup;
+
+	return 0;
+
+exit_end_node:
+	fdt_end_node(luo_file_fdt_out);
+exit_finish:
+	fdt_finish(luo_file_fdt_out);
+exit_cleanup:
+	pr_err("Failed to setup FDT: %s (ret %d)\n", fdt_strerror(ret), ret);
+	luo_files_fdt_cleanup();
+
+	return ret;
+}
+
+static int luo_files_prepare(struct liveupdate_subsystem *h, u64 *data)
+{
+	int ret;
+
+	ret = luo_files_fdt_setup();
+	if (ret)
+		return ret;
+
+	scoped_guard(rwsem_read, &luo_file_fdt_rwsem)
+		*data = __pa(luo_file_fdt_out);
+
+	return ret;
+}
+
+static int luo_files_freeze(struct liveupdate_subsystem *h, u64 *data)
+{
+	return 0;
+}
+
+static void luo_files_finish(struct liveupdate_subsystem *h, u64 data)
+{
+	luo_files_recreate_luo_files_xa_in();
+}
+
+static void luo_files_cancel(struct liveupdate_subsystem *h, u64 data)
+{
+}
+
+static void luo_files_boot(struct liveupdate_subsystem *h, u64 fdt_pa)
+{
+	int ret;
+
+	ret = fdt_node_check_compatible(__va(fdt_pa), 0,
+					LUO_FILES_COMPATIBLE);
+	if (ret) {
+		luo_restore_fail("FDT '%s' is incompatible with '%s' [%d]\n",
+				 LUO_FILES_NODE_NAME, LUO_FILES_COMPATIBLE,
+				 ret);
+	}
+	scoped_guard(rwsem_write, &luo_file_fdt_rwsem)
+		luo_file_fdt_in = __va(fdt_pa);
+}
+
+static const struct liveupdate_subsystem_ops luo_file_subsys_ops = {
+	.prepare = luo_files_prepare,
+	.freeze = luo_files_freeze,
+	.cancel = luo_files_cancel,
+	.boot = luo_files_boot,
+	.finish = luo_files_finish,
+	.owner = THIS_MODULE,
+};
+
+static struct liveupdate_subsystem luo_file_subsys = {
+	.ops = &luo_file_subsys_ops,
+	.name = LUO_FILES_NODE_NAME,
+};
+
+static int __init luo_files_startup(void)
+{
+	int ret;
+
+	if (!liveupdate_enabled())
+		return 0;
+
+	ret = liveupdate_register_subsystem(&luo_file_subsys);
+	if (ret) {
+		pr_warn("Failed to register luo_file subsystem [%d]\n", ret);
+		return ret;
+	}
+
+	return ret;
+}
+late_initcall(luo_files_startup);
+
+/**
+ * luo_register_file - Register a file descriptor for live update management.
+ * @token: Token value for this file descriptor.
+ * @fd: file descriptor to be preserved.
+ *
+ * Context: Must be called when LUO is in the 'normal' or 'updated' state.
+ *
+ * Return: 0 on success. Negative errno on failure.
+ */
+int luo_register_file(u64 token, int fd)
+{
+	struct liveupdate_file_handler *fh;
+	struct luo_file *luo_file;
+	bool found = false;
+	int ret = -ENOENT;
+	struct file *file;
+
+	file = fget(fd);
+	if (!file) {
+		pr_err("Bad file descriptor\n");
+		return -EBADF;
+	}
+
+	luo_state_read_enter();
+	if (!liveupdate_state_normal() && !liveupdate_state_updated()) {
+		pr_warn("File can be registered only in normal or updated state\n");
+		luo_state_read_exit();
+		fput(file);
+		return -EBUSY;
+	}
+
+	guard(rwsem_read)(&luo_register_file_list_rwsem);
+	list_for_each_entry(fh, &luo_register_file_list, list) {
+		if (fh->ops->can_preserve(fh, file)) {
+			found = true;
+			break;
+		}
+	}
+
+	if (!found)
+		goto exit_unlock;
+
+	luo_file = kmalloc(sizeof(*luo_file), GFP_KERNEL);
+	if (!luo_file) {
+		ret = -ENOMEM;
+		goto exit_unlock;
+	}
+
+	luo_file->private_data = 0;
+	luo_file->reclaimed = false;
+
+	luo_file->file = file;
+	luo_file->fh = fh;
+	mutex_init(&luo_file->mutex);
+	luo_file->state = LIVEUPDATE_STATE_NORMAL;
+
+	if (xa_load(&luo_files_xa_out, token)) {
+		ret = -EEXIST;
+		pr_warn("Token %llu is already taken\n", token);
+		mutex_destroy(&luo_file->mutex);
+		kfree(luo_file);
+		goto exit_unlock;
+	}
+
+	ret = xa_err(xa_store(&luo_files_xa_out, token, luo_file,
+			      GFP_KERNEL));
+	if (ret < 0) {
+		pr_warn("Failed to store file for token %llu in XArray: %d\n",
+			token, ret);
+		mutex_destroy(&luo_file->mutex);
+		kfree(luo_file);
+		goto exit_unlock;
+	}
+	atomic64_inc(&luo_files_count);
+
+exit_unlock:
+	luo_state_read_exit();
+
+	if (ret)
+		fput(file);
+
+	return ret;
+}
+
+static int __luo_unregister_file(u64 token)
+{
+	struct luo_file *luo_file;
+
+	luo_file = xa_erase(&luo_files_xa_out, token);
+	if (!luo_file)
+		return -ENOENT;
+
+	fput(luo_file->file);
+	mutex_destroy(&luo_file->mutex);
+	kfree(luo_file);
+	atomic64_dec(&luo_files_count);
+
+	return 0;
+}
+
+/**
+ * luo_unregister_file - Unregister a file instance using its token.
+ * @token: The unique token of the file instance to unregister.
+ *
+ * Finds the &struct luo_file associated with the @token in the outgoing
+ * XArray and removes it. On success the reference to the underlying
+ * &struct file taken at registration time is dropped and the memory
+ * allocated for the &struct luo_file is freed, so the caller must not
+ * access the entry after this function returns.
+ *
+ * Context: Can be called when a preserved file descriptor is closed or
+ * no longer needs live update management.
+ *
+ * Return: 0 on success. Negative errno on failure.
+ */
+int luo_unregister_file(u64 token)
+{
+	int ret = 0;
+
+	luo_state_read_enter();
+	if (!liveupdate_state_normal() && !liveupdate_state_updated()) {
+		pr_warn("File can be unregistered only in normal or updated state\n");
+		luo_state_read_exit();
+		return -EBUSY;
+	}
+
+	ret = __luo_unregister_file(token);
+	if (ret) {
+		pr_warn("Failed to unregister: token %llu not found.\n",
+			token);
+	}
+	luo_state_read_exit();
+
+	return ret;
+}
+
+/**
+ * luo_retrieve_file - Find a registered file instance by its token.
+ * @token: The unique token of the file instance to retrieve.
+ * @filep: Output parameter. On success (return value 0), this will point
+ *         to the retrieved "struct file".
+ *
+ * Searches the incoming XArray for a &struct luo_file matching the @token.
+ * Uses a read lock, allowing concurrent retrievals.
+ *
+ * Return: 0 on success. Negative errno on failure.
+ */
+int luo_retrieve_file(u64 token, struct file **filep)
+{
+	struct luo_file *luo_file;
+	int ret = 0;
+
+	luo_files_recreate_luo_files_xa_in();
+	luo_state_read_enter();
+	if (!liveupdate_state_updated()) {
+		pr_warn("File can be retrieved only in updated state\n");
+		luo_state_read_exit();
+		return -EBUSY;
+	}
+
+	luo_file = xa_load(&luo_files_xa_in, token);
+	if (luo_file && !luo_file->reclaimed) {
+		scoped_guard(mutex, &luo_file->mutex) {
+			if (!luo_file->reclaimed) {
+				luo_file->reclaimed = true;
+				ret = luo_file->fh->ops->retrieve(luo_file->fh,
+								  luo_file->private_data,
+								  filep);
+				if (!ret)
+					luo_file->file = *filep;
+			}
+		}
+	} else if (luo_file && luo_file->reclaimed) {
+		pr_err("The file descriptor for token %llu has already been retrieved\n",
+		       token);
+		ret = -EINVAL;
+	} else {
+		ret = -ENOENT;
+	}
+
+	luo_state_read_exit();
+
+	return ret;
+}
+
+/**
+ * liveupdate_register_file_handler - Register a file handler with LUO.
+ * @fh: Pointer to a caller-allocated &struct liveupdate_file_handler.
+ * The caller must initialize this structure, including a unique
+ * 'compatible' string and valid 'ops' callbacks. This function adds the
+ * handler to the global list of supported file handlers.
+ *
+ * Context: Typically called during module initialization for file types that
+ * support live update preservation.
+ *
+ * Return: 0 on success. Negative errno on failure.
+ */
+int liveupdate_register_file_handler(struct liveupdate_file_handler *fh)
+{
+	struct liveupdate_file_handler *fh_iter;
+	int ret = 0;
+
+	luo_state_read_enter();
+	if (!liveupdate_state_normal() && !liveupdate_state_updated()) {
+		luo_state_read_exit();
+		return -EBUSY;
+	}
+
+	guard(rwsem_write)(&luo_register_file_list_rwsem);
+	list_for_each_entry(fh_iter, &luo_register_file_list, list) {
+		if (!strcmp(fh_iter->compatible, fh->compatible)) {
+			pr_err("File handler registration failed: Compatible string '%s' already registered.\n",
+			       fh->compatible);
+			ret = -EEXIST;
+			goto exit_unlock;
+		}
+	}
+
+	if (!try_module_get(fh->ops->owner)) {
+		pr_warn("File handler '%s' unable to get reference.\n",
+			fh->compatible);
+		ret = -EAGAIN;
+		goto exit_unlock;
+	}
+
+	INIT_LIST_HEAD(&fh->list);
+	list_add_tail(&fh->list, &luo_register_file_list);
+
+exit_unlock:
+	luo_state_read_exit();
+
+	return ret;
+}
+
+/**
+ * liveupdate_unregister_file_handler - Unregister a file handler.
+ * @fh: Pointer to the specific &struct liveupdate_file_handler instance
+ * that was previously passed to
+ * liveupdate_register_file_handler().
+ *
+ * Removes the specified handler instance @fh from the global list of
+ * registered file handlers. This function only removes the entry from the
+ * list; it does not free the memory associated with @fh itself. The caller
+ * is responsible for freeing the structure memory after this function returns
+ * successfully.
+ *
+ * Return: 0 on success. Negative errno on failure.
+ */
+int liveupdate_unregister_file_handler(struct liveupdate_file_handler *fh)
+{
+	unsigned long token;
+	struct luo_file *h;
+	int ret = 0;
+
+	luo_state_read_enter();
+	if (!liveupdate_state_normal() && !liveupdate_state_updated()) {
+		luo_state_read_exit();
+		return -EBUSY;
+	}
+
+	guard(rwsem_write)(&luo_register_file_list_rwsem);
+
+	xa_for_each(&luo_files_xa_out, token, h) {
+		if (h->fh == fh) {
+			luo_state_read_exit();
+			return -EBUSY;
+		}
+	}
+
+	list_del_init(&fh->list);
+	luo_state_read_exit();
+	module_put(fh->ops->owner);
+
+	return ret;
+}
diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_internal.h
index 40bfbe279d34..5692196fd425 100644
--- a/kernel/liveupdate/luo_internal.h
+++ b/kernel/liveupdate/luo_internal.h
@@ -34,4 +34,8 @@ int luo_do_subsystems_freeze_calls(void);
 void luo_do_subsystems_finish_calls(void);
 void luo_do_subsystems_cancel_calls(void);
 
+int luo_retrieve_file(u64 token, struct file **filep);
+int luo_register_file(u64 token, int fd);
+int luo_unregister_file(u64 token);
+
 #endif /* _LINUX_LUO_INTERNAL_H */
-- 
2.50.1.565.gc32cd1483b-goog

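Two details of the FDT layout in patch 14/30 may be easier to see in
isolation: how a token becomes an FDT node name and is parsed back, and
how the buffer size estimate rounds up. The following is a user-space
sketch under stated assumptions, not kernel code. token_to_name(),
name_to_token(), fdt_size_estimate() and luo_fdt_model_selfcheck() are
hypothetical names, a 4 KiB page size is assumed, and strtoull() only
approximates kstrtou64() (it is more lenient about leading whitespace
and signs).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096ULL	/* assumption: 4 KiB pages */
#define TOKEN_STR_LEN 19	/* "0x" + 16 hex digits + NUL, as token_str[19] */

/* Format a token with the same "%#0llx" format luo_files_to_fdt() uses. */
static int token_to_name(uint64_t token, char name[TOKEN_STR_LEN])
{
	return snprintf(name, TOKEN_STR_LEN, "%#0llx",
			(unsigned long long)token);
}

/* Parse a node name back; base 0 auto-detects the "0x" prefix. */
static int name_to_token(const char *name, uint64_t *token)
{
	unsigned long long v;
	char *end;

	errno = 0;
	v = strtoull(name, &end, 0);
	if (errno || end == name || *end != '\0')
		return -1;

	*token = v;
	return 0;
}

/*
 * Model of PAGE_SIZE << get_order(SZ_1K + num_files * 128): the smallest
 * power-of-two number of pages that covers the estimate. Meant for small
 * file counts only; no overflow handling.
 */
static uint64_t fdt_size_estimate(uint64_t num_files)
{
	uint64_t need = 1024 + num_files * 128;
	uint64_t size = MODEL_PAGE_SIZE;

	while (size < need)
		size <<= 1;

	return size;
}

/* Usage example: the largest token fits, and 25 files need a second page. */
static void luo_fdt_model_selfcheck(void)
{
	char name[TOKEN_STR_LEN];
	uint64_t token = 0;

	assert(token_to_name(UINT64_MAX, name) == TOKEN_STR_LEN - 1);
	assert(strcmp(name, "0xffffffffffffffff") == 0);
	assert(name_to_token(name, &token) == 0 && token == UINT64_MAX);

	assert(fdt_size_estimate(0) == MODEL_PAGE_SIZE);
	assert(fdt_size_estimate(24) == MODEL_PAGE_SIZE);
	assert(fdt_size_estimate(25) == 2 * MODEL_PAGE_SIZE);
}
```

Under these assumptions the 19-byte buffer is exactly large enough for
the widest token, and the estimate stays at one page up to 24 registered
files before doubling.
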
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.45.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:45:08 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 15/30] liveupdate: luo_files: implement file systems callbacks Date: Thu, 7 Aug 2025 01:44:21 +0000 Message-ID: 
<20250807014442.3829950-16-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com> References: <20250807014442.3829950-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Implements the core logic within luo_files.c to invoke the prepare, freeze, finish, and cancel callbacks for preserved file instances, replacing the previous stub implementations. It also handles the persistence and retrieval of the u64 data payload associated with each file via the LUO FDT. This completes the core mechanism enabling registered file handlers to actively manage file state across the live update transition using the LUO framework. Signed-off-by: Pasha Tatashin --- kernel/liveupdate/luo_files.c | 191 +++++++++++++++++++++++++++++++++- 1 file changed, 188 insertions(+), 3 deletions(-) diff --git a/kernel/liveupdate/luo_files.c b/kernel/liveupdate/luo_files.c index 4b7568d0f0f0..33577c9e9a64 100644 --- a/kernel/liveupdate/luo_files.c +++ b/kernel/liveupdate/luo_files.c @@ -326,32 +326,190 @@ static int luo_files_fdt_setup(void) return ret; } =20 +static int luo_files_prepare_one(struct luo_file *h) +{ + int ret =3D 0; + + guard(mutex)(&h->mutex); + if (h->state =3D=3D LIVEUPDATE_STATE_NORMAL) { + if (h->fh->ops->prepare) { + ret =3D h->fh->ops->prepare(h->fh, h->file, + &h->private_data); + } + if (!ret) + h->state =3D LIVEUPDATE_STATE_PREPARED; + } else { + WARN_ON_ONCE(h->state !=3D LIVEUPDATE_STATE_PREPARED && + h->state !=3D LIVEUPDATE_STATE_FROZEN); + } + + return ret; +} + +static int luo_files_freeze_one(struct luo_file *h) +{ + int ret =3D 0; + + guard(mutex)(&h->mutex); + if (h->state =3D=3D LIVEUPDATE_STATE_PREPARED) { + if (h->fh->ops->freeze) { + ret =3D h->fh->ops->freeze(h->fh, h->file, + 
&h->private_data); + } + if (!ret) + h->state =3D LIVEUPDATE_STATE_FROZEN; + } else { + WARN_ON_ONCE(h->state !=3D LIVEUPDATE_STATE_FROZEN); + } + + return ret; +} + +static void luo_files_finish_one(struct luo_file *h) +{ + guard(mutex)(&h->mutex); + if (h->state =3D=3D LIVEUPDATE_STATE_UPDATED) { + if (h->fh->ops->finish) { + h->fh->ops->finish(h->fh, h->file, h->private_data, + h->reclaimed); + } + h->state =3D LIVEUPDATE_STATE_NORMAL; + } else { + WARN_ON_ONCE(h->state !=3D LIVEUPDATE_STATE_NORMAL); + } +} + +static void luo_files_cancel_one(struct luo_file *h) +{ + int ret; + + guard(mutex)(&h->mutex); + if (h->state =3D=3D LIVEUPDATE_STATE_NORMAL) + return; + + ret =3D WARN_ON_ONCE(h->state !=3D LIVEUPDATE_STATE_PREPARED && + h->state !=3D LIVEUPDATE_STATE_FROZEN); + if (ret) + return; + + if (h->fh->ops->cancel) + h->fh->ops->cancel(h->fh, h->file, h->private_data); + h->private_data =3D 0; + h->state =3D LIVEUPDATE_STATE_NORMAL; +} + +static void __luo_files_cancel(struct luo_file *boundary_file) +{ + unsigned long token; + struct luo_file *h; + + xa_for_each(&luo_files_xa_out, token, h) { + if (h =3D=3D boundary_file) + break; + + luo_files_cancel_one(h); + } + luo_files_fdt_cleanup(); +} + +static int luo_files_commit_data_to_fdt(void) +{ + int node_offset, ret; + unsigned long token; + char token_str[19]; + struct luo_file *h; + + guard(rwsem_read)(&luo_file_fdt_rwsem); + xa_for_each(&luo_files_xa_out, token, h) { + snprintf(token_str, sizeof(token_str), "%#0llx", (u64)token); + node_offset =3D fdt_subnode_offset(luo_file_fdt_out, + 0, + token_str); + ret =3D fdt_setprop(luo_file_fdt_out, node_offset, "data", + &h->private_data, sizeof(h->private_data)); + if (ret < 0) { + pr_err("Failed to set data property for token %s: %s\n", + token_str, fdt_strerror(ret)); + return -ENOSPC; + } + } + + return 0; +} + static int luo_files_prepare(struct liveupdate_subsystem *h, u64 *data) { + unsigned long token; + struct luo_file *luo_file; int ret; =20 ret =3D 
luo_files_fdt_setup(); if (ret) return ret; =20 - scoped_guard(rwsem_read, &luo_file_fdt_rwsem) - *data =3D __pa(luo_file_fdt_out); + xa_for_each(&luo_files_xa_out, token, luo_file) { + ret =3D luo_files_prepare_one(luo_file); + if (ret < 0) { + pr_err("Prepare failed for file token %#0llx handler '%s' [%d]\n", + (u64)token, luo_file->fh->compatible, ret); + __luo_files_cancel(luo_file); + + return ret; + } + } + + ret =3D luo_files_commit_data_to_fdt(); + if (ret) { + __luo_files_cancel(NULL); + } else { + scoped_guard(rwsem_read, &luo_file_fdt_rwsem) + *data =3D __pa(luo_file_fdt_out); + } =20 return ret; } =20 static int luo_files_freeze(struct liveupdate_subsystem *h, u64 *data) { - return 0; + unsigned long token; + struct luo_file *luo_file; + int ret; + + xa_for_each(&luo_files_xa_out, token, luo_file) { + ret =3D luo_files_freeze_one(luo_file); + if (ret < 0) { + pr_err("Freeze callback failed for file token %#0llx handler '%s' [%d]\= n", + (u64)token, luo_file->fh->compatible, ret); + __luo_files_cancel(luo_file); + + return ret; + } + } + + ret =3D luo_files_commit_data_to_fdt(); + if (ret) + __luo_files_cancel(NULL); + + return ret; } =20 static void luo_files_finish(struct liveupdate_subsystem *h, u64 data) { + unsigned long token; + struct luo_file *luo_file; + luo_files_recreate_luo_files_xa_in(); + xa_for_each(&luo_files_xa_in, token, luo_file) { + luo_files_finish_one(luo_file); + mutex_destroy(&luo_file->mutex); + kfree(luo_file); + } + xa_destroy(&luo_files_xa_in); } =20 static void luo_files_cancel(struct liveupdate_subsystem *h, u64 data) { + __luo_files_cancel(NULL); } =20 static void luo_files_boot(struct liveupdate_subsystem *h, u64 fdt_pa) @@ -484,6 +642,27 @@ int luo_register_file(u64 token, int fd) return ret; } =20 +static void luo_files_fdt_remove_node(u64 token) +{ + char token_str[19]; + int offset, ret; + + guard(rwsem_write)(&luo_file_fdt_rwsem); + if (!luo_file_fdt_out) + return; + + snprintf(token_str, sizeof(token_str), "%#0llx", 
token); + offset =3D fdt_subnode_offset(luo_file_fdt_out, 0, token_str); + if (offset < 0) + return; + + ret =3D fdt_del_node(luo_file_fdt_out, offset); + if (ret < 0) { + pr_warn("LUO Files: Failed to delete FDT node for token %s: %s\n", + token_str, fdt_strerror(ret)); + } +} + static int __luo_unregister_file(u64 token) { struct luo_file *luo_file; @@ -492,6 +671,12 @@ static int __luo_unregister_file(u64 token) if (!luo_file) return -ENOENT; =20 + if (luo_file->state =3D=3D LIVEUPDATE_STATE_FROZEN || + luo_file->state =3D=3D LIVEUPDATE_STATE_PREPARED) { + luo_files_cancel_one(luo_file); + luo_files_fdt_remove_node(token); + } + fput(luo_file->file); mutex_destroy(&luo_file->mutex); kfree(luo_file); --=20 2.50.1.565.gc32cd1483b-goog From nobody Sun Oct 5 07:24:30 2025 Received: from mail-qv1-f54.google.com (mail-qv1-f54.google.com [209.85.219.54]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9730D2253F2 for ; Thu, 7 Aug 2025 01:45:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.54 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531115; cv=none; b=CSu8mgPl+iRcngQQeUZCd3Di7z4zjnMqia1WGA859fzHKoi3f9pJQjEgLTVCPyPv15OG88k2PWYC8geGOURsY5SEMvuV8fjXf7u9S08Sp40PyJq8ZfwFYj+qBApQYKgyzJyM5Mg2och2Qo9i1BEw3cm4lsPvzqBy+D2tQvUEZHQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531115; c=relaxed/simple; bh=2bOvVYYu5v/Os2gHERdEl5mhadRivrwZwBWuB57XWlE=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=asx4P71vVsZTsqO5qVQ8bgCisJXAhdLwRuCh+6nKZK/6sd/HacEuxF2SplK098u2YNvmdHhDoWxkCJ4Z1USvK5USIuoryytj4vdJ1E5FRw2U+mJe6X5YS/DozwZNKch63zBihnMo22hXO1dmGtIDsWfqcsMzvLHvUmKWIb9iMAI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass 
(2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=W8Rlbie9; arc=none smtp.client-ip=209.85.219.54 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="W8Rlbie9" Received: by mail-qv1-f54.google.com with SMTP id 6a1803df08f44-7077a1563b5so5303486d6.1 for ; Wed, 06 Aug 2025 18:45:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531111; x=1755135911; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=M8Pc86FlLRtwRUzQGSvmjZcMELF/5Imt71S8yXJrUQE=; b=W8Rlbie9uRug5A5oQgGVho6uqXDlii0pjLF0N9rb3+XmiMWIyb70ZZF7owytjsX2EI z61DAYil7fB7uhM5w/By3Mt6rfeinWWEABmbcTxPGOFQC8igaDKaPlz1QTaMA8lTbCRB P1BSLQFmtHF+qo4OnosmnOZSjSSuwFWQNwvVxDq5Gabt2TfGrbkI9UzZUWKLUPuZtC4k 6hfk6tOUyLoJOrhFKouD5LwzL7Pw/j0qqjnLPTXgDcopCNq9iZ3cGfZ6hgfufz4KLgUp jHjbERpedcNHg/Z5W8J7zGe2L2xFvrwHuQxftjzFuPfX/1IWxPCBBj7DoO2bBASBIGcK 147Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531111; x=1755135911; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=M8Pc86FlLRtwRUzQGSvmjZcMELF/5Imt71S8yXJrUQE=; b=Ho78T6520cNf+qvCfsj/JerjPGvbfdCq7MIgX0MX/hi41Q/cm2GCd/z0cyRjurflWU VjooLOyWcL5OAdAz/towckFsKHSEC3eKiAdxdJzpWuR/vPVHuH6/Ootg+vvK2KbsuysZ x+LSz0eYKebGEHHx106AqDN7ddCSOsFNM/qW1yQTYI/5nsfRXf4rTPMw4DJlvqMqWivv PIRxEny4KxqxZ+At/RHhDuirSZ3vSpgenp7weREKb0qAjodY+piVeMHa6o+yMc9fUbs9 /vNZ2JVPiWG+wihSsanayMBXSC73uZxzMLR6iRG1x8Xvb7HSsq/SJskiDZswdn9EC3qx tjjA== X-Forwarded-Encrypted: i=1; 
AJvYcCVoXMUoz5mevc2ZHpS+wyIhPIMz8GI8APSXXoDrQr8eUDdb6E0BAa2WKLigSQUX1brCRQ6yRVnAfhe4F5U=@vger.kernel.org X-Gm-Message-State: AOJu0YxtBpT1gzujbtGrwJcygAtXUNyxuBoa0Ty4Hyih5gEe4WljGnCI vM/EAI5BWtdnFIazXww1PsQsqLoVve8/RWy8UKHDvfiFttrqoftJbCwwUKo1GIJjhJA= X-Gm-Gg: ASbGncsyCJGjVzpGgdjJf991GCrrw0kpHhFzcE9Z2DLcy58k5TQGjBNjCWe0QydSXVh xQI9J5lJEn4p410A1Z0I9NPA9Ereq06ID6YnRT7+nv9cIhK+cpdERin85IO6JXsNLhFu9qhj9BX upbjMwRW/ooZ6Kg/uk7cVg85PCvB1qR5EGNWQCsBDXSmu1hyNTC/tcBzcBtz5zoMIoQwqIf01L0 BOU2Eh+ZC+fd2v+U6ewdWRjRuhvgPk//Ca+b2s7xrMLq7NLK+vMztIuE74kJ5Pj02/dwQKdicfx icrl+S2oypJ/lh+SAOCPgei7frJWbv9sCMUBpGDCVsC3i0OSeIS28naHwyT6vvAum6/X8NhHA4d rP6GrzSmOATSupd7JPMy2n2lyB2ERx7+bKQmPH8tzDFSKZr/QUuqK+npbA48zPUu4h3jqVougsf 2UsvmoK70NWpkF X-Google-Smtp-Source: AGHT+IEhA0QesCBs9Rro7SuewSiUGU8rDS7yvn1okysx9FgcI1M1Z0UadP+fhEkBUH9Ofl09bYq+0A== X-Received: by 2002:a05:6214:4013:b0:707:5986:8963 with SMTP id 6a1803df08f44-7098a801aeemr17644456d6.33.1754531110843; Wed, 06 Aug 2025 18:45:10 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.45.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:45:10 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 16/30] liveupdate: luo_ioctl: add userpsace interface Date: Thu, 7 Aug 2025 01:44:22 +0000 Message-ID: 
<20250807014442.3829950-17-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com> References: <20250807014442.3829950-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Introduce the user-space interface for the Live Update Orchestrator via ioctl commands, enabling external control over the live update process and management of preserved resources. The idea is that there is going to be a single userspace agent driving the live update, therefore, only a single process can ever hold this device opened at a time. Signed-off-by: Pasha Tatashin --- include/uapi/linux/liveupdate.h | 243 ++++++++++++++++++++++++++++++++ kernel/liveupdate/luo_ioctl.c | 200 ++++++++++++++++++++++++++ 2 files changed, 443 insertions(+) diff --git a/include/uapi/linux/liveupdate.h b/include/uapi/linux/liveupdat= e.h index 3cb09b2c4353..37ec5656443b 100644 --- a/include/uapi/linux/liveupdate.h +++ b/include/uapi/linux/liveupdate.h @@ -14,6 +14,32 @@ #include #include =20 +/** + * DOC: General ioctl format + * + * The ioctl interface follows a general format to allow for extensibility= . Each + * ioctl is passed in a structure pointer as the argument providing the si= ze of + * the structure in the first u32. The kernel checks that any structure sp= ace + * beyond what it understands is 0. This allows userspace to use the backw= ard + * compatible portion while consistently using the newer, larger, structur= es. + * + * ioctls use a standard meaning for common errnos: + * + * - ENOTTY: The IOCTL number itself is not supported at all + * - E2BIG: The IOCTL number is supported, but the provided structure has + * non-zero in a part the kernel does not understand. 
+ * - EOPNOTSUPP: The IOCTL number is supported, and the structure is + * understood, however a known field has a value the kernel does not + * understand or support. + * - EINVAL: Everything about the IOCTL was understood, but a field is not + * correct. + * - ENOENT: An ID or IOVA provided does not exist. + * - ENOMEM: Out of memory. + * - EOVERFLOW: Mathematics overflowed. + * + * As well as additional errnos, within specific ioctls. + */ + /** * enum liveupdate_state - Defines the possible states of the live update * orchestrator. @@ -91,4 +117,221 @@ enum liveupdate_event { LIVEUPDATE_CANCEL =3D 3, }; =20 +/* The ioctl type, documented in ioctl-number.rst */ +#define LIVEUPDATE_IOCTL_TYPE 0xBA + +/* The ioctl commands */ +enum { + LIVEUPDATE_CMD_BASE =3D 0x00, + LIVEUPDATE_CMD_FD_PRESERVE =3D LIVEUPDATE_CMD_BASE, + LIVEUPDATE_CMD_FD_UNPRESERVE =3D 0x01, + LIVEUPDATE_CMD_FD_RESTORE =3D 0x02, + LIVEUPDATE_CMD_GET_STATE =3D 0x03, + LIVEUPDATE_CMD_SET_EVENT =3D 0x04, +}; + +/** + * struct liveupdate_ioctl_fd_preserve - ioctl(LIVEUPDATE_IOCTL_FD_PRESERV= E) + * @size: Input; sizeof(struct liveupdate_ioctl_fd_preserve) + * @fd: Input; The user-space file descriptor to be preserved. + * @token: Input; An opaque, unique token for the preserved resource. + * + * Validate and initiate preservation for a file + * descriptor. + * + * User sets the @fd field identifying the file descriptor to preserve + * (e.g., memfd, kvm, iommufd, VFIO). The kernel validates if this FD type + * and its dependencies are supported for preservation. If validation pass= es, + * the kernel marks the FD internally and *initiates the process* of prepa= ring + * its state for saving. The actual snapshotting of the state typically oc= curs + * during the subsequent %LIVEUPDATE_PREPARE phase, though + * some finalization might occur during freeze. 
+ * On successful validation and initiation, the kernel uses the @token + * field as the opaque identifier representing the resource being preserved. + * This token confirms the FD is targeted for preservation and is required= for + * the subsequent %LIVEUPDATE_IOCTL_FD_RESTORE call after the live update. + * + * Return: 0 on success (validation passed, preservation initiated), negat= ive + * error code on failure (e.g., unsupported FD type, dependency issue, + * validation failed). + */ +struct liveupdate_ioctl_fd_preserve { + __u32 size; + __s32 fd; + __aligned_u64 token; +}; + +#define LIVEUPDATE_IOCTL_FD_PRESERVE \ + _IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_FD_PRESERVE) + +/** + * struct liveupdate_ioctl_fd_unpreserve - ioctl(LIVEUPDATE_IOCTL_FD_UNPRE= SERVE) + * @size: Input; sizeof(struct liveupdate_ioctl_fd_unpreserve) + * @token: Input; A token for the resource to be unpreserved. + * + * Remove a file descriptor from the preservation list. + * + * Allows user space to explicitly remove a file descriptor from the set of + * items marked as potentially preservable. User space provides a @token t= hat + * was previously used by a successful %LIVEUPDATE_IOCTL_FD_PRESERVE call + * (potentially from a prior, possibly cancelled, live update attempt). The + * kernel reads the token value from the provided user-space address. + * + * On success, the kernel removes the corresponding entry (identified by t= he + * token value read from the user pointer) from its internal preservation = list. + * The provided @token (representing the now-removed entry) becomes invalid + * after this call. + * + * Return: 0 on success, negative error code on failure (e.g., -EBUSY or -= EINVAL + * if bad address provided, invalid token value read, token not found). 
+ */ +struct liveupdate_ioctl_fd_unpreserve { + __u32 size; + __aligned_u64 token; +}; + +#define LIVEUPDATE_IOCTL_FD_UNPRESERVE \ + _IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_FD_UNPRESERVE) + +/** + * struct liveupdate_ioctl_fd_restore - ioctl(LIVEUPDATE_IOCTL_FD_RESTORE) + * @size: Input; sizeof(struct liveupdate_ioctl_fd_restore) + * @fd: Output; The new file descriptor representing the fully restored + * kernel resource. + * @token: Input; An opaque token that was used to preserve the resource. + * + * Restore a previously preserved file descriptor. + * + * User sets the @token field to the value obtained from a successful + * %LIVEUPDATE_IOCTL_FD_PRESERVE call before the live update. On success, + * the kernel restores the state (saved during the PREPARE/FREEZE phases) + * associated with the token and populates the @fd field with a new file + * descriptor referencing the restored resource in the current (new) kerne= l. + * This operation must be performed *before* signaling completion via + * %LIVEUPDATE_FINISH. + * + * Return: 0 on success, negative error code on failure (e.g., invalid tok= en). + */ +struct liveupdate_ioctl_fd_restore { + __u32 size; + __s32 fd; + __aligned_u64 token; +}; + +#define LIVEUPDATE_IOCTL_FD_RESTORE \ + _IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_FD_RESTORE) + +/** + * struct liveupdate_ioctl_get_state - ioctl(LIVEUPDATE_IOCTL_GET_STATE) + * @size: Input; sizeof(struct liveupdate_ioctl_get_state) + * @state: Output; The current live update state. + * + * Query the current state of the live update orchestrator. + * + * The kernel fills the @state field with the current + * state of the live update subsystem. Possible states are: + * + * - %LIVEUPDATE_STATE_NORMAL: Default state; no live update operation is + * currently in progress. + * - %LIVEUPDATE_STATE_PREPARED: The preparation phase (triggered by + * %LIVEUPDATE_PREPARE) has completed + * successfully. The system is ready for the + * reboot transition. 
Note that some + * device operations (e.g., unbinding, new D= MA + * mappings) might be restricted in this sta= te. + * - %LIVEUPDATE_STATE_UPDATED: The system has successfully rebooted into= the + * new kernel via live update. It is now run= ning + * the new kernel code and is awaiting the + * completion signal from user space via + * %LIVEUPDATE_FINISH after restoration task= s are + * done. + * + * See the definition of &enum liveupdate_state for more details on each s= tate. + * + * Return: 0 on success, negative error code on failure. + */ +struct liveupdate_ioctl_get_state { + __u32 size; + __u32 state; +}; + +#define LIVEUPDATE_IOCTL_GET_STATE \ + _IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_GET_STATE) + +/** + * struct liveupdate_ioctl_set_event - ioctl(LIVEUPDATE_IOCTL_SET_EVENT) + * @size: Input; sizeof(struct liveupdate_ioctl_set_event) + * @event: Input; The live update event. + * + * Notify the live update orchestrator about a global event that causes a state + * transition. + * + * The event can be one of the following: + * + * - %LIVEUPDATE_PREPARE: Initiates the live update preparation phase. This + * typically triggers the saving process for items = marked + * via the PRESERVE ioctls. This typically occurs + * *before* the "blackout window", while user + * applications (e.g., VMs) may still be running. K= ernel + * subsystems receiving the %LIVEUPDATE_PREPARE eve= nt + * should serialize necessary state. This command d= oes + * not transfer data. + * - %LIVEUPDATE_FINISH: Signal restoration completion and trigger cleanup. + * + * Signals that user space has completed all necess= ary + * restoration actions in the new kernel (after a l= ive + * update reboot). Calling this ioctl triggers the + * cleanup phase: any resources that were successfu= lly + * preserved but were *not* subsequently restored + * (reclaimed) via the RESTORE ioctls will have the= ir + * preserved state discarded and associated kernel + * resources released. Involved devices may be rese= t. 
All + * desired restorations *must* be completed *before* + * this. Kernel callbacks for the %LIVEUPDATE_FINISH + * event must not fail. Successfully completing this + * phase transitions the system state from + * %LIVEUPDATE_STATE_UPDATED back to + * %LIVEUPDATE_STATE_NORMAL. This command does + * not transfer data. + * - %LIVEUPDATE_CANCEL: Cancel the live update preparation phase. + * + * Notifies the live update subsystem to abort the + * preparation sequence potentially initiated by the + * %LIVEUPDATE_PREPARE event. + * + * When triggered, subsystems receiving the + * %LIVEUPDATE_CANCEL event should revert any state + * changes or actions taken specifically for the ab= orted + * prepare phase (e.g., discard partially serialized + * state). The kernel releases resources allocated + * specifically for this *aborted preparation attem= pt*. + * + * This operation cancels the current *attempt* to + * prepare for a live update but does **not** remove + * previously validated items from the internal list + * of potentially preservable resources. Consequent= ly, + * preservation tokens previously used by successful + * %LIVEUPDATE_IOCTL_FD_PRESERVE calls **remain + * valid** as identifiers for those potentially + * preservable resources. However, since the system= state + * returns to %LIVEUPDATE_STATE_NORMAL, user s= pace + * must initiate a new live update sequence (starti= ng + * with %LIVEUPDATE_PREPARE) to proceed with an upd= ate + * using these (or other) tokens. + * + * This command does not transfer data. Kernel call= backs + * for the %LIVEUPDATE_CANCEL event must not fail. + * + * See the definition of &enum liveupdate_event for more details on each event. + * + * Return: 0 on success, negative error code on failure. 
+ */ +struct liveupdate_ioctl_set_event { + __u32 size; + __u32 event; +}; + +#define LIVEUPDATE_IOCTL_SET_EVENT \ + _IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_SET_EVENT) + #endif /* _UAPI_LIVEUPDATE_H */ diff --git a/kernel/liveupdate/luo_ioctl.c b/kernel/liveupdate/luo_ioctl.c index 3df1ec9fbe57..6f61569c94e8 100644 --- a/kernel/liveupdate/luo_ioctl.c +++ b/kernel/liveupdate/luo_ioctl.c @@ -5,6 +5,25 @@ * Pasha Tatashin */ =20 +/** + * DOC: LUO ioctl Interface + * + * The IOCTL user-space control interface for the LUO subsystem. + * It registers a character device, typically found at ``/dev/liveupdate``, + * which allows a userspace agent to manage the LUO state machine and its + * associated resources, such as preservable file descriptors. + * + * To ensure that the state machine is controlled by a single entity, acce= ss + * to this device is exclusive: only one process is permitted to have + * ``/dev/liveupdate`` open at any given time. Subsequent open attempts wi= ll + * fail with -EBUSY until the first process closes its file descriptor. + * This singleton model simplifies state management by preventing conflict= ing + * commands from multiple userspace agents. 
+ */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include #include #include #include @@ -17,8 +36,189 @@ #include #include "luo_internal.h" =20 +static atomic_t luo_device_in_use =3D ATOMIC_INIT(0); + +struct luo_ucmd { + void __user *ubuffer; + u32 user_size; + void *cmd; +}; + +static int luo_ioctl_fd_preserve(struct luo_ucmd *ucmd) +{ + struct liveupdate_ioctl_fd_preserve *argp =3D ucmd->cmd; + int ret; + + ret =3D luo_register_file(argp->token, argp->fd); + if (ret) + return ret; + + if (copy_to_user(ucmd->ubuffer, argp, ucmd->user_size)) + return -EFAULT; + + return 0; +} + +static int luo_ioctl_fd_unpreserve(struct luo_ucmd *ucmd) +{ + struct liveupdate_ioctl_fd_unpreserve *argp =3D ucmd->cmd; + + return luo_unregister_file(argp->token); +} + +static int luo_ioctl_fd_restore(struct luo_ucmd *ucmd) +{ + struct liveupdate_ioctl_fd_restore *argp =3D ucmd->cmd; + struct file *file; + int ret; + + argp->fd =3D get_unused_fd_flags(O_CLOEXEC); + if (argp->fd < 0) { + pr_err("Failed to allocate new fd: %d\n", argp->fd); + return argp->fd; + } + + ret =3D luo_retrieve_file(argp->token, &file); + if (ret < 0) { + put_unused_fd(argp->fd); + + return ret; + } + + fd_install(argp->fd, file); + + if (copy_to_user(ucmd->ubuffer, argp, ucmd->user_size)) + return -EFAULT; + + return 0; +} + +static int luo_ioctl_get_state(struct luo_ucmd *ucmd) +{ + struct liveupdate_ioctl_get_state *argp =3D ucmd->cmd; + + argp->state =3D liveupdate_get_state(); + + if (copy_to_user(ucmd->ubuffer, argp, ucmd->user_size)) + return -EFAULT; + + return 0; +} + +static int luo_ioctl_set_event(struct luo_ucmd *ucmd) +{ + struct liveupdate_ioctl_set_event *argp =3D ucmd->cmd; + int ret; + + switch (argp->event) { + case LIVEUPDATE_PREPARE: + ret =3D luo_prepare(); + break; + case LIVEUPDATE_FINISH: + ret =3D luo_finish(); + break; + case LIVEUPDATE_CANCEL: + ret =3D luo_cancel(); + break; + default: + ret =3D -EINVAL; + } + + return ret; +} + +static int luo_open(struct inode *inodep, 
struct file *filep) +{ + if (atomic_cmpxchg(&luo_device_in_use, 0, 1)) + return -EBUSY; + + return 0; +} + +static int luo_release(struct inode *inodep, struct file *filep) +{ + atomic_set(&luo_device_in_use, 0); + + return 0; +} + +union ucmd_buffer { + struct liveupdate_ioctl_fd_preserve preserve; + struct liveupdate_ioctl_fd_unpreserve unpreserve; + struct liveupdate_ioctl_fd_restore restore; + struct liveupdate_ioctl_get_state state; + struct liveupdate_ioctl_set_event event; +}; + +struct luo_ioctl_op { + unsigned int size; + unsigned int min_size; + unsigned int ioctl_num; + int (*execute)(struct luo_ucmd *ucmd); +}; + +#define IOCTL_OP(_ioctl, _fn, _struct, _last) = \ + [_IOC_NR(_ioctl) - LIVEUPDATE_CMD_BASE] =3D { \ + .size =3D sizeof(_struct) + \ + BUILD_BUG_ON_ZERO(sizeof(union ucmd_buffer) < \ + sizeof(_struct)), \ + .min_size =3D offsetofend(_struct, _last), \ + .ioctl_num =3D _ioctl, \ + .execute =3D _fn, \ + } + +static const struct luo_ioctl_op luo_ioctl_ops[] =3D { + IOCTL_OP(LIVEUPDATE_IOCTL_FD_PRESERVE, luo_ioctl_fd_preserve, + struct liveupdate_ioctl_fd_preserve, token), + IOCTL_OP(LIVEUPDATE_IOCTL_FD_UNPRESERVE, luo_ioctl_fd_unpreserve, + struct liveupdate_ioctl_fd_unpreserve, token), + IOCTL_OP(LIVEUPDATE_IOCTL_FD_RESTORE, luo_ioctl_fd_restore, + struct liveupdate_ioctl_fd_restore, token), + IOCTL_OP(LIVEUPDATE_IOCTL_GET_STATE, luo_ioctl_get_state, + struct liveupdate_ioctl_get_state, state), + IOCTL_OP(LIVEUPDATE_IOCTL_SET_EVENT, luo_ioctl_set_event, + struct liveupdate_ioctl_set_event, event), +}; + +static long luo_ioctl(struct file *filep, unsigned int cmd, unsigned long = arg) +{ + const struct luo_ioctl_op *op; + struct luo_ucmd ucmd =3D {}; + union ucmd_buffer buf; + unsigned int nr; + int ret; + + nr =3D _IOC_NR(cmd); + if (nr < LIVEUPDATE_CMD_BASE || + (nr - LIVEUPDATE_CMD_BASE) >=3D ARRAY_SIZE(luo_ioctl_ops)) { + return -EINVAL; + } + + ucmd.ubuffer =3D (void __user *)arg; + ret =3D get_user(ucmd.user_size, (u32 __user 
*)ucmd.ubuffer); + if (ret) + return ret; + + op =3D &luo_ioctl_ops[nr - LIVEUPDATE_CMD_BASE]; + if (op->ioctl_num !=3D cmd) + return -ENOIOCTLCMD; + if (ucmd.user_size < op->min_size) + return -EINVAL; + + ucmd.cmd =3D &buf; + ret =3D copy_struct_from_user(ucmd.cmd, op->size, ucmd.ubuffer, + ucmd.user_size); + if (ret) + return ret; + + return op->execute(&ucmd); +} + static const struct file_operations fops =3D { .owner =3D THIS_MODULE, + .open =3D luo_open, + .release =3D luo_release, + .unlocked_ioctl =3D luo_ioctl, }; =20 static struct miscdevice liveupdate_miscdev =3D { --=20 2.50.1.565.gc32cd1483b-goog From nobody Sun Oct 5 07:24:30 2025 Received: from mail-qt1-f172.google.com (mail-qt1-f172.google.com [209.85.160.172]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B73A4231858 for ; Thu, 7 Aug 2025 01:45:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531116; cv=none; b=uPewdgAeBFytYEFFSjpwTnSOCaOW/YbgdUPHr54IUXTAcz6mrwVS1ryudCR6CG76yly6XycCs5OsnMVocRTTo2tZPrwgDjYVuO5khykd45Y0oNx8+eYrq6ObgZn1jAczXeRs/T6shLuruKfl8TVXaIv/xP0kcWqV59IIOtVLksA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531116; c=relaxed/simple; bh=6IaJ4oC3HjSKXEwPPmdg9sH4w864pWwsUqoUlC5zX5s=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=grY4owH3lXnMLTP7tS4eJgkWv7wwogmYSArG9ygeIYfAeG3x5sVJ5uFqUkNmxzW9V4BxHELvWhHT7L/9F/AHXXyTRUmnk1duLm/YbaK8GKqDBzlR4c89wGUAyE3JXSbLMVVLll3XvDQExl1xPJ7Rapm5SUuGWTj4v80Cg+l/P5k= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=gF8WDCKl; arc=none smtp.client-ip=209.85.160.172 
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="gF8WDCKl" Received: by mail-qt1-f172.google.com with SMTP id d75a77b69052e-4b0784e3153so8855251cf.2 for ; Wed, 06 Aug 2025 18:45:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531112; x=1755135912; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=5k1U17r+PzRIJkOEWEh69D1nNoyfvJ+I2xTfX4gnP+M=; b=gF8WDCKl7lSirbpvxjc3A32RPgzd74XttEyW4Jsd9Z6qD9CFDGpLG0/QILyXXGq6vT VNiyCx4dGT1VKH/hnutlzijUDovc7kjUOthswMqgzUg1ye+DZ9jMuMRg7DCDA2CTqT36 cqA1XAcrRYZQQP5shvw6qmosaZrxu3hSUJ61vheEB1frWbRQe6P1R1THj6RnxhCYd66A TXgv255p0lm0iYpnnkQFZawB3cQ7QI+W2N5gQdE0oZ0phIA4xiZYLXMr0rvTH9shEj83 HIR5EtNMEaKq4fn5+hGPW0phYNTQd8bTCRTPCrSm4Y26aOQAFJYt0DjVae44Yr0gYQsg AgHQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531112; x=1755135912; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=5k1U17r+PzRIJkOEWEh69D1nNoyfvJ+I2xTfX4gnP+M=; b=Ur/mqDVTk+dAWKqopsmcoItRjic+QT+r5ST8h3u7qlmFrk4mC0gC6mUgrfK1S2BxsR 8WaVjesvz9WRB6fYOIxYwAVY02G4mKNW3VLoeve5KIoo6l9I45wJbIb0Ao9AjGnzWrFm GqtUyrqDumYHCv5DPBXevKneQeglauwFxVdVrfSLdU1iT4RIBbCkAaZ4r1K5NZEvVt2s 7Vy5zsQ3pIfIDf9OG9GXnZELLM3SkV8zusKSs1InQ5mSz2k8Xt7Uk8U0f5p1sfQ9Sy6J yRHx1UaOQWMIXYJgB0UVjlHbTzck31pq+8flAZPHHbiS9N5h0TK9SZvkbJ9wM7MXpIwf tJHw== X-Forwarded-Encrypted: i=1; AJvYcCX4cbW8l4eLQuGUiICoZZ1u+k7GgAvUved3TsKFrpmimAnyz8c/AjuDY8UahLrOJ9AkpcAdqisS1JHmyo0=@vger.kernel.org X-Gm-Message-State: 
AOJu0YwM8Wp1w8Ux+HZb0vzVot/pwmt/LYoY2nyS1ZJ10scV5e3iFnhr PdwCp6YjN3XcAqSB7ijABJloB7GnGdda/aNfzPy9UVQrhWRuOnjMVl5KvhnomwxmhF8= X-Gm-Gg: ASbGncs5w7d3xjQZ55OlkKXNrC/1yG7AJOeex5H4mUXWO22UYsZtkryO6QXdY6AHXA4 LDb7SLNUIUjItoi9jpGGa/bw7cX1wNueo9JeaQ/ILABUQnbi/M2nDgiw9RAAUcEOMo6YuGZkyVG 7q0FNE2qrL5LY42sSGOLsYcUx8HSmvNmKhqE5toaPqZ4c677gaFysLAvjpvRy/Za57tDFOv/CDh nz9krOmRIU/ikd+PHlsTZjiQjwO/2+zJiFqNnVLdED7EU4Q2GNw7FNxXJ4ZT1YwsR4paiFW3m7M J+jdH1vXzakMNY5vQceR5lL/H3rxUR7SAKRxhtjIItl1gyC+C6648/61GILyarrhrQaDQBGuB6M 0ajz58ejvH7yZcMKUhnHjCKDPmwrQgKuuQWcKEnHfbDP/qwMTb+NH+ik/2eeskoMroLj8hYcIuZ A+X6QZ6hDcor6ovfsyGG3GLr8= X-Google-Smtp-Source: AGHT+IFGhloL5+wAzqGLxdMTz2N8jTfQVk0LrudTV42eJiPSBl97ww3SO5T1gQQT1i9NRyGQ/H+ydQ== X-Received: by 2002:a05:622a:10d:b0:4b0:7e72:9f05 with SMTP id d75a77b69052e-4b09157a420mr93793331cf.29.1754531112392; Wed, 06 Aug 2025 18:45:12 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. [34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.45.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:45:11 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, 
linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 17/30] liveupdate: luo_files: luo_ioctl: Unregister all FDs on device close Date: Thu, 7 Aug 2025 01:44:23 +0000 Message-ID: <20250807014442.3829950-18-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com> References: <20250807014442.3829950-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Currently, a file descriptor registered for preservation via the LIVEUPDATE_IOCTL_FD_PRESERVE ioctl remains globally registered with LUO until it is explicitly unregistered. This creates a potential for resource leaks into the next kernel if the userspace agent crashes or exits without proper cleanup before a live update is fully initiated. This patch ties the lifetime of FD preservation requests to the lifetime of the open file descriptor for /dev/liveupdate, creating an implicit "session".
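As a rough illustration of these session semantics, the user-space C sketch below models the idea with C11 atomics. It is not kernel code: session_open(), session_preserve(), session_close(), session_preserved_count() and the fixed-size token table are hypothetical stand-ins for the exclusive open of /dev/liveupdate, the preserve ioctl, and the cleanup performed when the device is closed.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TOKENS 8    /* arbitrary limit for this model only */

static atomic_int device_in_use;        /* 0 = free, 1 = a session is open */
static uint64_t preserved[MAX_TOKENS];  /* tokens registered in this session */
static size_t preserved_count;

/* Model of an exclusive open: only one session may exist at a time. */
int session_open(void)
{
	int expected = 0;

	if (!atomic_compare_exchange_strong(&device_in_use, &expected, 1))
		return -EBUSY;
	return 0;
}

/* Model of registering one token for preservation within the session. */
int session_preserve(uint64_t token)
{
	if (preserved_count == MAX_TOKENS)
		return -ENOSPC;
	preserved[preserved_count++] = token;
	return 0;
}

/* Model of closing the session: drop every registration, then free the device. */
void session_close(void)
{
	assert(atomic_load(&device_in_use) == 1);  /* only valid on an open session */
	preserved_count = 0;
	atomic_store(&device_in_use, 0);
}

size_t session_preserved_count(void)
{
	return preserved_count;
}
```

In this model, a second session_open() fails with -EBUSY until session_close() has run, and nothing registered in one session is visible in the next.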
When the /dev/liveupdate file descriptor is closed (either explicitly via close() or implicitly on process exit/crash), the .release handler, luo_release(), now invokes the new function luo_unregister_all_files(), which iterates through all FDs that were preserved through that session and unregisters them. Signed-off-by: Pasha Tatashin --- kernel/liveupdate/luo_files.c | 19 +++++++++++++++++++ kernel/liveupdate/luo_internal.h | 1 + kernel/liveupdate/luo_ioctl.c | 1 + 3 files changed, 21 insertions(+) diff --git a/kernel/liveupdate/luo_files.c b/kernel/liveupdate/luo_files.c index 33577c9e9a64..63f8b086b785 100644 --- a/kernel/liveupdate/luo_files.c +++ b/kernel/liveupdate/luo_files.c @@ -721,6 +721,25 @@ int luo_unregister_file(u64 token) return ret; } =20 +/** + * luo_unregister_all_files - Unpreserve all currently registered files. + * + * Iterates through all file descriptors currently registered for preserva= tion + * and unregisters them, freeing all associated resources. This is typical= ly + * called when the LUO agent exits. + */ +void luo_unregister_all_files(void) +{ + struct luo_file *luo_file; + unsigned long token; + + luo_state_read_enter(); + xa_for_each(&luo_files_xa_out, token, luo_file) + __luo_unregister_file(token); + luo_state_read_exit(); + WARN_ON_ONCE(atomic64_read(&luo_files_count) !=3D 0); +} + /** * luo_retrieve_file - Find a registered file instance by its token. * @token: The unique token of the file instance to retrieve. 
diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_inter= nal.h index 5692196fd425..189e032d7738 100644 --- a/kernel/liveupdate/luo_internal.h +++ b/kernel/liveupdate/luo_internal.h @@ -37,5 +37,6 @@ void luo_do_subsystems_cancel_calls(void); int luo_retrieve_file(u64 token, struct file **filep); int luo_register_file(u64 token, int fd); int luo_unregister_file(u64 token); +void luo_unregister_all_files(void); =20 #endif /* _LINUX_LUO_INTERNAL_H */ diff --git a/kernel/liveupdate/luo_ioctl.c b/kernel/liveupdate/luo_ioctl.c index 6f61569c94e8..7ca33d1c868f 100644 --- a/kernel/liveupdate/luo_ioctl.c +++ b/kernel/liveupdate/luo_ioctl.c @@ -137,6 +137,7 @@ static int luo_open(struct inode *inodep, struct file *= filep) =20 static int luo_release(struct inode *inodep, struct file *filep) { + luo_unregister_all_files(); atomic_set(&luo_device_in_use, 0); =20 return 0; --=20 2.50.1.565.gc32cd1483b-goog From nobody Sun Oct 5 07:24:31 2025 Received: from mail-qt1-f180.google.com (mail-qt1-f180.google.com [209.85.160.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2C38B23506E for ; Thu, 7 Aug 2025 01:45:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.180 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531118; cv=none; b=KwY5cqvT9rLTULzKbL2wPDNnuo62gDITuXKKB8umHxs9V2VfL1gar4WUe+R92y1MHt5pBYcVtqVfhudnABNAnAaSkoxR8u+8l0crxv7dTtG71I3mnrz/db2kbrTCNmKq3FGqTZSrwfW7jt7Ta/qBXZVrKvm2WLdCPuIklJ77CBo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531118; c=relaxed/simple; bh=Hl2OWzV5OeCMKSdYvoQ+GWIuxcx97TV+rOlqTWLKO9U=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=EZSNNC07a/ajOuBvatU8zADAHvffFcJv1snzmZ5psXDOXQWVSBteHhiARkXTE+UZi1H04z/wEf4zm+C0kDUJowefxRinEIjrikSzT2KLsKeUyfYLYEL2QqmB7iQHL85Xl8UVnTzXMOWMC6moqf59eiXeJgcx2U57/qOfIkeEAKg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=Nw1EKecB; arc=none smtp.client-ip=209.85.160.180 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="Nw1EKecB" Received: by mail-qt1-f180.google.com with SMTP id d75a77b69052e-4af14096b9eso7600281cf.3 for ; Wed, 06 Aug 2025 18:45:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531114; x=1755135914; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=8hwgDrFusb37owy5fF2KCZv/M9EocMOsOhnKlUYr6eI=; b=Nw1EKecB07eeizFO10GuFjzXY5fqX6xehIIoLU+3BP8Igtv9HlnIfQtwawRhgPTpCr qNaUomW/zI0lrmxaqSj1cZgvpZNfypFxXbYaH0kw6PTg/1O3y+bdj6tl99wGaFcJhmEW 77n0eEQ1ZQnNlxrG6KwDCPmN9hAwGNbr+X0eFR3ne89J/zqhf1CMVqAY9gjqY1iCeCNb g9cciSw0GflbS1PVDYKYj2cmllY2cOhUpm35E01xfyxeHNIh7HxZKSFD7uJOElmZCl9t 5CYvAdxiC6nefRNzNbdSa3cbV2S+CU+OqxzwyUo7aJQnfwEKFnk1Mj5+POj74EBRGRME LqUQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531114; x=1755135914; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=8hwgDrFusb37owy5fF2KCZv/M9EocMOsOhnKlUYr6eI=; b=H/uzAraSEXibH5BQ0zNYaXHCxWMB0YMkWEfNGhici9Iz3PwLET4jEG1aE2OtZPrTVh 
wfavHKZ/zYqlMsLpDqqjHGk/USqzPeCFRJGvyklzkmFshdRySQodTkNLEHq+cWOsCHoI 22tPODrUHZQal6fj26rbf4uNahKd+rmnvyRU/TmHJFyN9CBMe3tkup9hGWlNmNL2+OwF 1seSxVECIUkL4y8NYfvOqodbFSGyQykShSwcl1qVwJT0G6ymxhXcWfBeQ9jFJxoFO8nZ 3vavzZhyEkraSlptGgWgaQu9fCLKX7r+Bl5Xa3p+eQwh6cijM4+7glHvTmo2QE+KisRc v+DA== X-Forwarded-Encrypted: i=1; AJvYcCU3twzwIBLcd36goxhljAEZTfMd6K/LfnuixgDwAMM3YkMV3hPOSiW7d0W0maB/F+SBhPgJAs7lzDo7Fb4=@vger.kernel.org X-Gm-Message-State: AOJu0YyFgddxx8FU9p0YlxTGpvZ1nMLMzJbSIV7c7VGptsj+onFAtCZg Eirp7azeGlGBr70m7jJXKxs7L34U5vjX9+F6TsEUGU4zP35K+rHoZUlwGAyk8wIoo9A= X-Gm-Gg: ASbGncvfO6P+OEMre3Nxv2zROBptn8trRiVIPfqmnGc+6v3NTs95p9EUnR2mbjKZcqn pveJPYRIQBJJcLIV0N5vWBM7ufMlT440++SWCICw/sA8Q8Wwe54Ug3fcHSd+oe4JRAwMi9H/5s/ U16e+oQvBpmd81+cNN3cG6dwdEf30OF1Kc3P1cOfSYFdWon5/aGoSPcLQmySofbfUG56M/K+bah YZBlCiiJmAej1RyYcUOOuMowIcJnIe3WXbCVulMuEpk2qdHloZC5aTrWYlUBU9z1nPGsJJwh6YA cOkYjBKBK/L1BjEkicITNKNYne3LBmQvnwU3ABxtSnwx1+YeIdE14ulfZOwSby5H4Vn4e0zjHmS t1vTBivq0B5p4tCQVfL0zK9tLJrP5JIqmd0E9co5AcMjDaqXtxKfVTVCw1GJWwRB59zS37Odqio BG6Xa+i8ZrRtwY X-Google-Smtp-Source: AGHT+IH4YzyGG4XPLy7NZ5fSOZd1AAebDYYOaRO7tsFemVfq331RBpfgGYjTHTH3NPqkKUhiEWNpgA== X-Received: by 2002:a05:622a:1211:b0:4b0:6965:dd97 with SMTP id d75a77b69052e-4b0915b344bmr63492861cf.44.1754531113855; Wed, 06 Aug 2025 18:45:13 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.45.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:45:13 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 18/30] liveupdate: luo_files: luo_ioctl: Add ioctls for per-file state management Date: Thu, 7 Aug 2025 01:44:24 +0000 Message-ID: 
<20250807014442.3829950-19-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com> References: <20250807014442.3829950-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Introduce a set of new ioctls to allow a userspace agent to query and control the live update state of individual file descriptors that have been registered for preservation. Previously, state transitions (prepare, freeze, finish) were handled globally for all registered resources by the main LUO state machine. This patch provides a more granular interface, enabling a controlling agent to manage the lifecycle of specific FDs independently, which is useful for performance reasons. - Adds LIVEUPDATE_IOCTL_GET_FD_STATE to query the current state (e.g., NORMAL, PREPARED, FROZEN) of a file identified by its token. - Adds LIVEUPDATE_IOCTL_SET_FD_EVENT to trigger state transitions (PREPARE, FREEZE, CANCEL, FINISH) for a single file. 
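The size handling these ioctls rely on (a leading size field, checked against a per-command minimum that ends at the last mandatory input field) can be sketched in user space. The struct below mirrors the layout of struct liveupdate_ioctl_get_fd_state from this patch using <stdint.h> types. The names fd_state_mirror, OFFSETOFEND, fd_state_min_size() and user_size_ok() are illustrative only and not part of the UAPI, and the aligned attribute assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* User-space mirror of struct liveupdate_ioctl_get_fd_state (assumed layout). */
struct fd_state_mirror {
	uint32_t size;                               /* in:  sizeof(struct) */
	uint8_t  incoming;                           /* in:  1 = incoming set */
	uint64_t token __attribute__((aligned(8)));  /* in:  stands in for __aligned_u64 */
	uint32_t state;                              /* out: live update state */
};

/* Same idea as the kernel's offsetofend(): offset just past a member. */
#define OFFSETOFEND(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/* Padding after 'incoming' puts 'token' at offset 8; the tail pads to 24. */
static_assert(offsetof(struct fd_state_mirror, token) == 8, "token offset");
static_assert(sizeof(struct fd_state_mirror) == 24, "struct size");

/* Smallest user buffer accepted when 'token' is the last mandatory field. */
size_t fd_state_min_size(void)
{
	return OFFSETOFEND(struct fd_state_mirror, token);
}

/* Model of the dispatcher's "user_size < min_size" rejection. */
int user_size_ok(uint32_t user_size)
{
	return user_size >= fd_state_min_size();
}
```

Under these assumptions the minimum accepted size is 16 bytes, which covers token but not the state output field, so a caller that wants state back would pass the full sizeof of the structure.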
Signed-off-by: Pasha Tatashin --- include/uapi/linux/liveupdate.h | 62 +++++++++++++ kernel/liveupdate/luo_files.c | 152 +++++++++++++++++++++++++++++++ kernel/liveupdate/luo_internal.h | 8 ++ kernel/liveupdate/luo_ioctl.c | 48 ++++++++++ 4 files changed, 270 insertions(+) diff --git a/include/uapi/linux/liveupdate.h b/include/uapi/linux/liveupdat= e.h index 37ec5656443b..833da5a8c064 100644 --- a/include/uapi/linux/liveupdate.h +++ b/include/uapi/linux/liveupdate.h @@ -128,6 +128,8 @@ enum { LIVEUPDATE_CMD_FD_RESTORE =3D 0x02, LIVEUPDATE_CMD_GET_STATE =3D 0x03, LIVEUPDATE_CMD_SET_EVENT =3D 0x04, + LIVEUPDATE_CMD_GET_FD_STATE =3D 0x05, + LIVEUPDATE_CMD_SET_FD_EVENT =3D 0x06, }; =20 /** @@ -334,4 +336,64 @@ struct liveupdate_ioctl_set_event { #define LIVEUPDATE_IOCTL_SET_EVENT \ _IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_SET_EVENT) =20 +/** + * struct liveupdate_ioctl_get_fd_state - ioctl(LIVEUPDATE_IOCTL_GET_FD_ST= ATE) + * @size: Input; sizeof(struct liveupdate_ioctl_get_fd_state) + * @incoming: Input; If 1, query the state of a restored file from the inc= oming + * (previous kernel's) set. If 0, query a file being prepared f= or + * preservation in the current set. + * @token: Input; Token of FD for which to get state. + * @state: Output; The live update state of this FD. + * + * Query the current live update state of a specific preserved file descri= ptor. + * + * - %LIVEUPDATE_STATE_NORMAL: Default state + * - %LIVEUPDATE_STATE_PREPARED: Prepare callback has been performed on th= is FD. + * - %LIVEUPDATE_STATE_FROZEN: Freeze callback has been performed on thi= s FD. + * - %LIVEUPDATE_STATE_UPDATED: The system has successfully rebooted into= the + * new kernel. + * + * See the definition of &enum liveupdate_state for more details on each s= tate. + * + * Return: 0 on success, negative error code on failure. 
+ */ +struct liveupdate_ioctl_get_fd_state { + __u32 size; + __u8 incoming; + __aligned_u64 token; + __u32 state; +}; + +#define LIVEUPDATE_IOCTL_GET_FD_STATE \ + _IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_GET_FD_STATE) + +/** + * struct liveupdate_ioctl_set_fd_event - ioctl(LIVEUPDATE_IOCTL_SET_FD_EV= ENT) + * @size: Input; sizeof(struct liveupdate_ioctl_set_fd_event) + * @event: Input; The live update event. + * @token: Input; Token of FD for which to set the provided event. + * + * Notify a specific preserved file descriptor of an event that causes a = state + * transition for that file descriptor. + * + * The event can be one of the following: + * + * - %LIVEUPDATE_PREPARE: Initiates the FD live update preparation phase. + * - %LIVEUPDATE_FREEZE: Initiates the FD live update freeze phase. + * - %LIVEUPDATE_CANCEL: Cancels the FD preparation or freeze phase. + * - %LIVEUPDATE_FINISH: Completes FD restoration and triggers cleanup. + * + * See the definition of &enum liveupdate_event for more details on each e= vent. + * + * Return: 0 on success, negative error code on failure. + */ +struct liveupdate_ioctl_set_fd_event { + __u32 size; + __u32 event; + __aligned_u64 token; +}; + +#define LIVEUPDATE_IOCTL_SET_FD_EVENT \ + _IO(LIVEUPDATE_IOCTL_TYPE, LIVEUPDATE_CMD_SET_FD_EVENT) + #endif /* _UAPI_LIVEUPDATE_H */ diff --git a/kernel/liveupdate/luo_files.c b/kernel/liveupdate/luo_files.c index 63f8b086b785..0d68d0c8c45e 100644 --- a/kernel/liveupdate/luo_files.c +++ b/kernel/liveupdate/luo_files.c @@ -740,6 +740,158 @@ void luo_unregister_all_files(void) WARN_ON_ONCE(atomic64_read(&luo_files_count) !=3D 0); } =20 +/** + * luo_file_get_state - Get the preservation state of a specific file. + * @token: The token of the file to query. + * @statep: Output pointer to store the file's current live update state. + * @incoming: If true, query the state of a restored file from the incoming + * (previous kernel's) set. 
If false, query a file being prepar= ed + * for preservation in the current set. + * + * Finds the file associated with the given @token in either the incoming + * or outgoing tracking arrays and returns its current LUO state + * (NORMAL, PREPARED, FROZEN, UPDATED). + * + * Return: 0 on success, -ENOENT if the token is not found. + */ +int luo_file_get_state(u64 token, enum liveupdate_state *statep, bool inco= ming) +{ + struct luo_file *luo_file; + struct xarray *target_xa; + int ret =3D 0; + + luo_state_read_enter(); + + target_xa =3D incoming ? &luo_files_xa_in : &luo_files_xa_out; + luo_file =3D xa_load(target_xa, token); + + if (!luo_file) { + ret =3D -ENOENT; + goto out_unlock; + } + + scoped_guard(mutex, &luo_file->mutex) + *statep =3D luo_file->state; + +out_unlock: + luo_state_read_exit(); + return ret; +} + +/** + * luo_file_prepare - Prepare a single registered file for live update. + * @token: The token of the file to prepare. + * + * Finds the file associated with @token and transitions it to the PREPARED + * state by invoking its handler's ->prepare() callback. This allows for + * granular, per-file preparation before the global LUO PREPARE event. + * + * Return: 0 on success, negative error code on failure. + */ +int luo_file_prepare(u64 token) +{ + struct luo_file *luo_file; + int ret; + + luo_state_read_enter(); + luo_file =3D xa_load(&luo_files_xa_out, token); + if (!luo_file) { + ret =3D -ENOENT; + goto out_unlock; + } + + ret =3D luo_files_prepare_one(luo_file); +out_unlock: + luo_state_read_exit(); + return ret; +} + +/** + * luo_file_freeze - Freeze a single prepared file for live update. + * @token: The token of the file to freeze. + * + * Finds the file associated with @token and transitions it from the PREPA= RED + * to the FROZEN state by invoking its handler's ->freeze() callback. This= is + * typically used for final, "blackout window" state saving for a specific + * file. + * + * Return: 0 on success, negative error code on failure. 
+ */ +int luo_file_freeze(u64 token) +{ + struct luo_file *luo_file; + int ret; + + luo_state_read_enter(); + luo_file =3D xa_load(&luo_files_xa_out, token); + if (!luo_file) { + ret =3D -ENOENT; + goto out_unlock; + } + + ret =3D luo_files_freeze_one(luo_file); +out_unlock: + luo_state_read_exit(); + return ret; +} + +int luo_file_cancel(u64 token) +{ + struct luo_file *luo_file; + int ret =3D 0; + + luo_state_read_enter(); + luo_file =3D xa_load(&luo_files_xa_out, token); + if (!luo_file) { + ret =3D -ENOENT; + goto out_unlock; + } + + luo_files_cancel_one(luo_file); +out_unlock: + luo_state_read_exit(); + return ret; +} + +/** + * luo_file_finish - Clean-up a single restored file after live update. + * @token: The token of the file to finalize. + * + * This function is called in the new kernel after a live update, typically + * after a file has been restored via luo_retrieve_file() and is no longer + * needed by the userspace agent in its preserved state. It invokes the + * handler's ->finish() callback, allowing for any final cleanup of the + * preserved state associated with this specific file. + * + * This must be called when LUO is in the UPDATED state. + * + * Return: 0 on success, -ENOENT if the token is not found, -EBUSY if not + * in the UPDATED state. + */ +int luo_file_finish(u64 token) +{ + struct luo_file *luo_file; + int ret =3D 0; + + luo_state_read_enter(); + if (!liveupdate_state_updated()) { + pr_warn("finish can only be done in UPDATED state\n"); + ret =3D -EBUSY; + goto out_unlock; + } + + luo_file =3D xa_load(&luo_files_xa_in, token); + if (!luo_file) { + ret =3D -ENOENT; + goto out_unlock; + } + + luo_files_finish_one(luo_file); +out_unlock: + luo_state_read_exit(); + return ret; +} + /** * luo_retrieve_file - Find a registered file instance by its token. * @token: The unique token of the file instance to retrieve. 
diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_inter= nal.h index 189e032d7738..01bd0d3b023b 100644 --- a/kernel/liveupdate/luo_internal.h +++ b/kernel/liveupdate/luo_internal.h @@ -8,6 +8,8 @@ #ifndef _LINUX_LUO_INTERNAL_H #define _LINUX_LUO_INTERNAL_H =20 +#include + /* * Handles a deserialization failure: devices and memory is in unpredictab= le * state. @@ -39,4 +41,10 @@ int luo_register_file(u64 token, int fd); int luo_unregister_file(u64 token); void luo_unregister_all_files(void); =20 +int luo_file_get_state(u64 token, enum liveupdate_state *statep, bool inco= ming); +int luo_file_prepare(u64 token); +int luo_file_freeze(u64 token); +int luo_file_cancel(u64 token); +int luo_file_finish(u64 token); + #endif /* _LINUX_LUO_INTERNAL_H */ diff --git a/kernel/liveupdate/luo_ioctl.c b/kernel/liveupdate/luo_ioctl.c index 7ca33d1c868f..4c0f6708e411 100644 --- a/kernel/liveupdate/luo_ioctl.c +++ b/kernel/liveupdate/luo_ioctl.c @@ -127,6 +127,48 @@ static int luo_ioctl_set_event(struct luo_ucmd *ucmd) return ret; } =20 +static int luo_ioctl_get_fd_state(struct luo_ucmd *ucmd) +{ + struct liveupdate_ioctl_get_fd_state *argp =3D ucmd->cmd; + enum liveupdate_state state; + int ret; + + ret =3D luo_file_get_state(argp->token, &state, !!argp->incoming); + if (ret) + return ret; + + argp->state =3D state; + if (copy_to_user(ucmd->ubuffer, argp, ucmd->user_size)) + return -EFAULT; + + return 0; +} + +static int luo_ioctl_set_fd_event(struct luo_ucmd *ucmd) +{ + struct liveupdate_ioctl_set_fd_event *argp =3D ucmd->cmd; + int ret; + + switch (argp->event) { + case LIVEUPDATE_PREPARE: + ret =3D luo_file_prepare(argp->token); + break; + case LIVEUPDATE_FREEZE: + ret =3D luo_file_freeze(argp->token); + break; + case LIVEUPDATE_FINISH: + ret =3D luo_file_finish(argp->token); + break; + case LIVEUPDATE_CANCEL: + ret =3D luo_file_cancel(argp->token); + break; + default: + ret =3D -EINVAL; + } + + return ret; +} + static int luo_open(struct inode *inodep, 
struct file *filep) { if (atomic_cmpxchg(&luo_device_in_use, 0, 1)) @@ -149,6 +191,8 @@ union ucmd_buffer { struct liveupdate_ioctl_fd_restore restore; struct liveupdate_ioctl_get_state state; struct liveupdate_ioctl_set_event event; + struct liveupdate_ioctl_get_fd_state fd_state; + struct liveupdate_ioctl_set_fd_event fd_event; }; =20 struct luo_ioctl_op { @@ -179,6 +223,10 @@ static const struct luo_ioctl_op luo_ioctl_ops[] =3D { struct liveupdate_ioctl_get_state, state), IOCTL_OP(LIVEUPDATE_IOCTL_SET_EVENT, luo_ioctl_set_event, struct liveupdate_ioctl_set_event, event), + IOCTL_OP(LIVEUPDATE_IOCTL_GET_FD_STATE, luo_ioctl_get_fd_state, + struct liveupdate_ioctl_get_fd_state, token), + IOCTL_OP(LIVEUPDATE_IOCTL_SET_FD_EVENT, luo_ioctl_set_fd_event, + struct liveupdate_ioctl_set_fd_event, token), }; =20 static long luo_ioctl(struct file *filep, unsigned int cmd, unsigned long = arg) --=20 2.50.1.565.gc32cd1483b-goog From nobody Sun Oct 5 07:24:31 2025 Received: from mail-qt1-f181.google.com (mail-qt1-f181.google.com [209.85.160.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AA01F23814D for ; Thu, 7 Aug 2025 01:45:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.181 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531119; cv=none; b=UGisNli3VeBAkSedrG2Y2FEr711qkGL/kz35UJrGziXCxWto0PRR2AJw27Yvpq+nAcZU2I3De9vRwxW9Nydh2f0IYTjVpdV060q8ypC+eRZ4Cci/hEZU9MFFNPYVeaC8J3wzwED8qKWfgffeNKmiix0ak/QKxOxms1rbjx7uCk8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531119; c=relaxed/simple; bh=Rx/tyv8veGFGG2JeXQPCv++VWJ0RVIdQqE9qIDTRPF8=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=s1ltDKOlPw5N8lS9tc1vZJ0WgI6HiyQfauMIFTGylH/EC/0wClpN2X47a7qy0PruYVba3lmckAYD5lIhz8RvQEyxfgF5JC87iXR3pQdG0ImzKziN7OS8Svzj4UQgORSPEKFP3huYHgT5mHrYMgNnfZwov5W3Mdm2jkQi2w5wKmY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=Uajcy2Nr; arc=none smtp.client-ip=209.85.160.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="Uajcy2Nr" Received: by mail-qt1-f181.google.com with SMTP id d75a77b69052e-4b0a0870791so7513891cf.0 for ; Wed, 06 Aug 2025 18:45:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531115; x=1755135915; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=WF/QU/y7x54PCzdmrSK0bQUFRNe6CNoszBlEiHHR+XM=; b=Uajcy2Nr8ISkpfSUxoYPN7gxJWft3y/BStQ2jcAbnCuX+OEf2eqz9w+02DVZTChAJ7 PyqjgWh+uICCm7U2fotjUUd8gvI1IcrEucWHvWYKUj880zxRYVzpULYnj2SYn6L5dKCZ SPe91YqdlwP0B9stb914coi1OIuYq7c93R9H0VbPD0mthrTz51wJ/wPLPFz9GNn5lIAf TJplvCEqoZt9tMrzXN/yMPTRd+FVgzLFZ/05zNmMqM8XeYJIQm1+S7aj9gaS1l6x1TKD 5Vmu7yM40tskG/zQXoktaX0G/2coRfhnqIUlSi9HsvH4838D3Vg6Q6Ms6AdfPkvRVUPM LXhg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531115; x=1755135915; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=WF/QU/y7x54PCzdmrSK0bQUFRNe6CNoszBlEiHHR+XM=; b=troD64mrG6TeM6TyPtCvALXH7bResLmzGR9w4Nm3MAFqT8+cy6vNlf3Wjm5FQaTf7V 
urVRcyQx/TdsUq5zU9AqciamyNqBXQExs5PMZJwRuURV6cjrUO4AfxHjqueB+byeUwmU xqcTqUm/lqdSo6tNKDaxvEU6Ye1ITqk3ttYHlzZQmtYPx2g07zJ0htRNDX+Mii+vY+7a MgYVg/ixdCUDoIr9kAE8HRCXk7yjxr1JPNv39WzdFUOkJUUXL9cx58T4ynbyi3OnGquS dnop9PjaWoyAhRXSOJOkuhnLlJpgG/qTBvGn7iIv5QMXQglnsGWb29q5RGWqvY9f7VHQ Qf9g== X-Forwarded-Encrypted: i=1; AJvYcCVURzp4r1It2gwhPSKpM6XblhMNQDvsACB3Ou9WaxutiWvOSF0xatPiZfxlvYql8L7XM+cJph9TA99SuZk=@vger.kernel.org X-Gm-Message-State: AOJu0YwlEBlrGZMB3pV7E6JYIXdUbZ86hzWfh26y15iIMyG8idsTK2dq 62Avp01CtAgU8445XCs/tOeynFsbfcuOzxO6KOb/uAG7/1ozGDKHqlLaZAmmo20hosY= X-Gm-Gg: ASbGncuVwwFz4ippbybYqr1NCm5Vi8TJsEEOGJEOMxESBm25bMyHXmwe9nwzUO5JVvd R7YS2S1T7lFR8FQbHFQ+94YIzbXrR+OC7kFndc40wy4thhN0USQ0l3lipUiBCf1qRHnEvyHBzZY 3wBR5KDouTBspDPr3cCOH2CKm5QY9qp43+2qTaavM8jzd3vw3jTx01nUVZ5s/GL4OTRKkk8GR86 kBONvm5Vzls3UJUgNi106WJEm6kVifg/c6xnsX1V1wXBPmCP0UdJodvaheGmPk4PZc86iEIB4JS ybPNR5FWU9WEgVLTIy3S35FJsp/vis3+2xMRZw6lyZDxxRdqwfIkYmoFh2Nxp4MVjTDgRZHBSL+ JYWZVk4ic98nUhMm0GVRDS85g4gcFReCFbQjU+Oo5Wk8rJ0sxmdynJGFdhJPzvXh+Gch+ODv173 LNLqh6xoqUPinqbxgHXkttlhw= X-Google-Smtp-Source: AGHT+IFKHH1YeAK0b2tzEkZQep5rY0nibsccM8bhcvnZ6YiF0cNDdr12cF9wjVLlo+2LNmKUV8uQBg== X-Received: by 2002:ac8:6904:0:b0:4b0:6463:7d0d with SMTP id d75a77b69052e-4b0915c39d2mr77833781cf.42.1754531115362; Wed, 06 Aug 2025 18:45:15 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.45.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:45:14 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 19/30] liveupdate: luo_sysfs: add sysfs state monitoring Date: Thu, 7 Aug 2025 01:44:25 +0000 Message-ID: 
<20250807014442.3829950-20-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com> References: <20250807014442.3829950-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Introduce a sysfs interface for the Live Update Orchestrator under /sys/kernel/liveupdate/. This interface provides a way for userspace tools and scripts to monitor the current state of the LUO state machine. The main feature is a read-only file, state, which displays the current LUO state as a string ("normal", "prepared", "frozen", "updated"). The interface uses sysfs_notify to allow userspace listeners (e.g., via poll) to be efficiently notified of state changes. ABI documentation for this new sysfs interface is added in Documentation/ABI/testing/sysfs-kernel-liveupdate. This read-only sysfs interface complements the main ioctl interface provided by /dev/liveupdate, which handles LUO control operations and resource management. 
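For illustration only (not part of this patch), a userspace monitor might wait for state changes roughly as sketched below. The helper names luo_open_state(), luo_read_state() and luo_wait_state_change() are hypothetical; only the file path and the sysfs_notify()/poll() mechanism come from the description above.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <unistd.h>

/* Path from the ABI description; the helpers below are hypothetical. */
#define LUO_STATE_PATH "/sys/kernel/liveupdate/state"

/* Open the state attribute read-only; returns an fd or -1 on error. */
int luo_open_state(const char *path)
{
	return open(path, O_RDONLY);
}

/*
 * Read the state string into buf and strip the trailing newline.
 * Seek back to offset 0 first so each call returns the whole attribute.
 * Returns 0 on success, -1 on error.
 */
int luo_read_state(int fd, char *buf, size_t len)
{
	ssize_t n;

	assert(buf && len > 1);
	if (lseek(fd, 0, SEEK_SET) < 0)
		return -1;
	n = read(fd, buf, len - 1);
	if (n < 0)
		return -1;
	buf[n] = '\0';
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/*
 * Block until the kernel signals a change via sysfs_notify(), which
 * poll() reports as POLLPRI | POLLERR, then read the new state.
 */
int luo_wait_state_change(int fd, char *buf, size_t len)
{
	struct pollfd pfd = { .fd = fd, .events = POLLPRI | POLLERR };

	if (poll(&pfd, 1, -1) < 0)
		return -1;
	return luo_read_state(fd, buf, len);
}
```

A monitor would typically open LUO_STATE_PATH, read the current state once, and then loop on luo_wait_state_change(), comparing the returned string against "normal", "prepared", "frozen" and "updated".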
Signed-off-by: Pasha Tatashin --- .../ABI/testing/sysfs-kernel-liveupdate | 51 ++++++++++ kernel/liveupdate/Kconfig | 18 ++++ kernel/liveupdate/Makefile | 1 + kernel/liveupdate/luo_core.c | 1 + kernel/liveupdate/luo_internal.h | 6 ++ kernel/liveupdate/luo_sysfs.c | 92 +++++++++++++++++++ 6 files changed, 169 insertions(+) create mode 100644 Documentation/ABI/testing/sysfs-kernel-liveupdate create mode 100644 kernel/liveupdate/luo_sysfs.c diff --git a/Documentation/ABI/testing/sysfs-kernel-liveupdate b/Documentat= ion/ABI/testing/sysfs-kernel-liveupdate new file mode 100644 index 000000000000..bb85cbae4943 --- /dev/null +++ b/Documentation/ABI/testing/sysfs-kernel-liveupdate @@ -0,0 +1,51 @@ +What: /sys/kernel/liveupdate/ +Date: May 2025 +KernelVersion: 6.16.0 +Contact: pasha.tatashin@soleen.com +Description: Directory containing interfaces to query the live + update orchestrator. Live update is the ability to reboot the + host kernel (e.g., via kexec, without a full power cycle) while + keeping specifically designated devices operational ("alive") + across the transition. After the new kernel boots, these devices + can be re-attached to their original workloads (e.g., virtual + machines) with their state preserved. This is particularly + useful, for example, for quick hypervisor updates without + terminating running virtual machines. + + +What: /sys/kernel/liveupdate/state +Date: May 2025 +KernelVersion: 6.16.0 +Contact: pasha.tatashin@soleen.com +Description: Read-only file that displays the current state of the live + update orchestrator as a string. Possible values are: + + "normal" No live update operation is in progress. This is + the default operational state. + + "prepared" The live update preparation phase has completed + successfully (e.g., triggered via the + /dev/liveupdate event). Kernel subsystems have + been notified via the %LIVEUPDATE_PREPARE + event/callback and should have initiated state + saving. 
User workloads (e.g., VMs) are generally + still running, but some operations (like device + unbinding or new DMA mappings) might be + restricted. The system is ready for the reboot + trigger. + + "frozen" The final reboot notification has been sent + (e.g., triggered via the 'reboot()' syscall), + corresponding to the %LIVEUPDATE_REBOOT kernel + event. Subsystems have had their final chance to + save state. User workloads must be suspended. + The system is about to execute the reboot into + the new kernel (imminent kexec). This state + corresponds to the "blackout window". + + "updated" The system has successfully rebooted into the + new kernel via live update. Restoration of + preserved resources can now occur (typically via + ioctl commands). The system is awaiting the + final 'finish' signal after user space completes + restoration tasks. diff --git a/kernel/liveupdate/Kconfig b/kernel/liveupdate/Kconfig index f6b0bde188d9..75a17ca8a592 100644 --- a/kernel/liveupdate/Kconfig +++ b/kernel/liveupdate/Kconfig @@ -29,6 +29,24 @@ config LIVEUPDATE =20 If unsure, say N. =20 +config LIVEUPDATE_SYSFS_API + bool "Live Update sysfs monitoring interface" + depends on SYSFS + depends on LIVEUPDATE + help + Enable a sysfs interface for the Live Update Orchestrator + at /sys/kernel/liveupdate/. + + This allows monitoring the LUO state ('normal', 'prepared', + 'frozen', 'updated') via the read-only 'state' file. + + This interface complements the primary /dev/liveupdate ioctl + interface, which handles the full update process. + This sysfs API may be useful for scripting, or userspace monitoring + needed to coordinate application restarts and minimize downtime. + + If unsure, say N. 
+ config KEXEC_HANDOVER bool "kexec handover" depends on ARCH_SUPPORTS_KEXEC_HANDOVER && ARCH_SUPPORTS_KEXEC_FILE diff --git a/kernel/liveupdate/Makefile b/kernel/liveupdate/Makefile index c67fa2797796..47f5d0378a75 100644 --- a/kernel/liveupdate/Makefile +++ b/kernel/liveupdate/Makefile @@ -13,3 +13,4 @@ obj-$(CONFIG_KEXEC_HANDOVER) +=3D kexec_handover.o obj-$(CONFIG_KEXEC_HANDOVER_DEBUG) +=3D kexec_handover_debug.o =20 obj-$(CONFIG_LIVEUPDATE) +=3D luo.o +obj-$(CONFIG_LIVEUPDATE_SYSFS_API) +=3D luo_sysfs.o diff --git a/kernel/liveupdate/luo_core.c b/kernel/liveupdate/luo_core.c index 64d53b31d6d8..bd07ee859112 100644 --- a/kernel/liveupdate/luo_core.c +++ b/kernel/liveupdate/luo_core.c @@ -100,6 +100,7 @@ static inline bool is_current_luo_state(enum liveupdate= _state expected_state) static void __luo_set_state(enum liveupdate_state state) { WRITE_ONCE(luo_state, state); + luo_sysfs_notify(); } =20 static inline void luo_set_state(enum liveupdate_state state) diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_inter= nal.h index 01bd0d3b023b..9091ed04c606 100644 --- a/kernel/liveupdate/luo_internal.h +++ b/kernel/liveupdate/luo_internal.h @@ -47,4 +47,10 @@ int luo_file_freeze(u64 token); int luo_file_cancel(u64 token); int luo_file_finish(u64 token); =20 +#ifdef CONFIG_LIVEUPDATE_SYSFS_API +void luo_sysfs_notify(void); +#else +static inline void luo_sysfs_notify(void) {} +#endif + #endif /* _LINUX_LUO_INTERNAL_H */ diff --git a/kernel/liveupdate/luo_sysfs.c b/kernel/liveupdate/luo_sysfs.c new file mode 100644 index 000000000000..935946bb741b --- /dev/null +++ b/kernel/liveupdate/luo_sysfs.c @@ -0,0 +1,92 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +/** + * DOC: LUO sysfs interface + * + * Provides a sysfs interface at ``/sys/kernel/liveupdate/`` for monitorin= g LUO + * state. 
Live update allows rebooting the kernel (via kexec) while preserving + * designated device state for attached workloads (e.g., VMs), useful for + * minimizing downtime during hypervisor updates. + * + * /sys/kernel/liveupdate/state + * ---------------------------- + * - Permissions: Read-only + * - Description: Displays the current LUO state string. + * - Valid States: + * @normal + * Idle state. + * @prepared + * Preparation phase complete (triggered via '/dev/liveupdate'). Resources + * checked, state saving initiated via %LIVEUPDATE_PREPARE event. + * Workloads mostly running but may be restricted. Ready for reboot + * trigger. + * @frozen + * Final reboot notification sent (triggered via 'reboot'). Corresponds to + * %LIVEUPDATE_REBOOT event. Final state saving. Workloads must be + * suspended. System about to kexec ("blackout window"). + * @updated + * New kernel booted via live update. Awaiting 'finish' signal. + * + * Userspace Interaction & Blackout Window Reduction + * ------------------------------------------------- + * Userspace monitors the ``state`` file to coordinate actions: + * - Suspend workloads before @frozen state is entered. + * - Initiate resource restoration upon entering @updated state. + * - Resume workloads after restoration, minimizing downtime. 
+ */ + +#include +#include +#include +#include "luo_internal.h" + +static bool luo_sysfs_initialized; + +#define LUO_DIR_NAME "liveupdate" + +void luo_sysfs_notify(void) +{ + if (luo_sysfs_initialized) + sysfs_notify(kernel_kobj, LUO_DIR_NAME, "state"); +} + +/* Show the current live update state */ +static ssize_t state_show(struct kobject *kobj, struct kobj_attribute *att= r, + char *buf) +{ + return sysfs_emit(buf, "%s\n", luo_current_state_str()); +} + +static struct kobj_attribute state_attribute =3D __ATTR_RO(state); + +static struct attribute *luo_attrs[] =3D { + &state_attribute.attr, + NULL +}; + +static struct attribute_group luo_attr_group =3D { + .attrs =3D luo_attrs, + .name =3D LUO_DIR_NAME, +}; + +static int __init luo_init(void) +{ + int ret; + + ret =3D sysfs_create_group(kernel_kobj, &luo_attr_group); + if (ret) { + pr_err("Failed to create group\n"); + return ret; + } + + luo_sysfs_initialized =3D true; + pr_info("Initialized\n"); + + return 0; +} +subsys_initcall(luo_init); --=20 2.50.1.565.gc32cd1483b-goog From nobody Sun Oct 5 07:24:31 2025 Received: from mail-qv1-f53.google.com (mail-qv1-f53.google.com [209.85.219.53]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DF7A0239E6A for ; Thu, 7 Aug 2025 01:45:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.53 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531120; cv=none; b=MHeVeh0L1B61hKbuLLg9X5Atk+Wle1VpBjnJ9eAAgnzuc0n7H9i8j4wsa9F00eH6SCjRdzbA0DBw91HnSrK2edLu94X1sNK8Uvhyr3BstxD0XTYb5TTChFzhB8OVj0Syy11X7gdGtSWoRwToJgxbfOU3q2Tx1yd26/3S5M3Sie0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531120; c=relaxed/simple; bh=RZ++fAXg5VoJ9aULwrUGHh0jUgxwzggc0842qo3hOvo=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=aLIfHdUDLum5rNNaEaGQkAOFqTkiM3FEWNWNldJ0Ek6uGohb53DCa6zUc0i4BHkSQpQeLux+qjtxl26NiYX5Bdvu4VvQYzhCqz3xQ8W308sSemMMMocbHac9EWU/rYodJ/8wt+q91XTKF6yZzU3PCBB+elnjRwjol7CsbCSM6Q8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=h3BWZfhk; arc=none smtp.client-ip=209.85.219.53 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="h3BWZfhk" Received: by mail-qv1-f53.google.com with SMTP id 6a1803df08f44-7074a74248dso5529356d6.3 for ; Wed, 06 Aug 2025 18:45:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531117; x=1755135917; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=WtKfUzOgupFaYotRf3a7eOnx1rqlnB537wzbb8Z3Bb4=; b=h3BWZfhkG+WF4+EcWwY4730YQ8ldiyH3oNiieZ0NLStWuOeP9N/2b+8vNQArNgCoG+ mLaksFDoAYGVeemswgDli5X9eI1aT7TdvLoPtyCuulwXojVEj9CeUH6m+dTeRk6JgNTN vq+PXePRJ6wKDdGH7O/TNoh7U4iTFA3Ys+eTAsObiPubV1iuXtQK61w6Lu5wo/7u6p4U BLdn1K8UBWvtJvY8epYMe0hC3kYn03Erna9X6Q9M3pu8KKuz+1YFA2k4h7LS28SxuAYv vstbh2aEWtln7Jzc1VE+S5WZTOCFCTp4BrwUt/yVTypuDfGumlmVAtSng/a7IwLQCXIZ ChIw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531117; x=1755135917; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=WtKfUzOgupFaYotRf3a7eOnx1rqlnB537wzbb8Z3Bb4=; b=qssHiR0CF6FyP1rIdNW/KSw7COlQkdQRc3Ghrqkpq6V0MLgEatvET+E/0zQhrReuwl 
4gzZWrRyothQtTuSJiaoUKudpZsdi+K2AyJBkBK2CmXXj8OBo6DZaNWPHJasb2jJ4naY YyLp5+1tSS9CqMD5vHgkAwXDL0p8h3ZI7Ae8nbQ5lrgMsqAH8CFLtRq6VUzRSRpyCDmC dS+3v0/jnORnVKo0AHebrpTHEwu84r2v98EKdMx9zli90R+Hwxy7VUIZFicaoyjfFKzq 7DtSkWhpLKkorDFaw5gVJk/p+fn3hCXb9MBilaqtV5lpzJEaVknebWVDRrCnBZTRdiLW Wkng== X-Forwarded-Encrypted: i=1; AJvYcCUi9bIcZegWU+DSMjGccpXlg3wweCFv2Y1dq6nqR7imyj7ywHFgRdc+ETm+lhTzVccujgreL0hzETJtxsc=@vger.kernel.org X-Gm-Message-State: AOJu0YzguTzusnI9i8BTdw2w4IY5gD4urLKtx62oXhmwvb7SP04hKhQ3 GJr1scsq1cHJFmNCaN0y6/8uUorva/UALckXAhAybT5TKFm/TpNjDASBaJlBvDrgvXc= X-Gm-Gg: ASbGncuL759Fx/cXTYnDzPrrw/xeAAGqjScNEZd4TRBajqBm+ewCcsVsfnt43sA1sBr t6JRhhHN8kRuD3qHnR44MV/p905rLeAN9Tt/AhifVZOopSp6bc9SivSPEVuEgSFimbm0ICI7X7D 3v6QHFu0tFf3Cym1k7CUHytFdjBnz3KRIKxAA7nxVmbYTJx66FNZyCa8V0oDJrNQtK7T3V+fzcP KOsyiADIRiRR7GlAEKc8rcVn2Zi/CaPHR96l9SIfEstn9IGJuhnImN9Im2u7dEu5aI5P0wFwe83 XXFmhl3PgZanmygt+p2kjxWa7UsLdZx//7LzCtapwxZg30xMA/8fkMNlkycmjqUZLF/tBq4sgL2 6/AaIN7Eus+mEJiYG6CHCXibDPmwWc5LM5j2HWvr8m2n0PtPEwfcgqPQonq8Ue1ILZyCMnPVdXc qenAGMWkjAMASQ X-Google-Smtp-Source: AGHT+IHDHBBqboi0c0b6rcDIt/ymD1qcIlODHNQyebDLYVOHVLNPjCPlRPjhkhoVVjwmxNLgx8k5ag== X-Received: by 2002:a05:6214:300f:b0:707:29f9:3bd1 with SMTP id 6a1803df08f44-7097964ce99mr76346136d6.46.1754531116814; Wed, 06 Aug 2025 18:45:16 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.45.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:45:16 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 20/30] reboot: call liveupdate_reboot() before kexec Date: Thu, 7 Aug 2025 01:44:26 +0000 Message-ID: 
<20250807014442.3829950-21-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com> References: <20250807014442.3829950-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Modify the reboot() syscall handler in kernel/reboot.c to call liveupdate_reboot() when processing the LINUX_REBOOT_CMD_KEXEC command. This ensures that the Live Update Orchestrator is notified just before the kernel executes the kexec jump. The liveupdate_reboot() function triggers the final LIVEUPDATE_FREEZE event, allowing participating subsystems to perform last-minute state saving within the blackout window, and transitions the LUO state machine to FROZEN. The call is placed immediately before kernel_kexec() to ensure LUO finalization happens at the latest possible moment before the kernel transition. If liveupdate_reboot() returns an error (indicating a failure during LUO finalization), the kexec operation is aborted to prevent proceeding with an inconsistent state. 
Signed-off-by: Pasha Tatashin --- kernel/reboot.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/kernel/reboot.c b/kernel/reboot.c index ec087827c85c..bdeb04a773db 100644 --- a/kernel/reboot.c +++ b/kernel/reboot.c @@ -13,6 +13,7 @@ #include #include #include +#include #include #include #include @@ -797,6 +798,9 @@ SYSCALL_DEFINE4(reboot, int, magic1, int, magic2, unsig= ned int, cmd, =20 #ifdef CONFIG_KEXEC_CORE case LINUX_REBOOT_CMD_KEXEC: + ret =3D liveupdate_reboot(); + if (ret) + break; ret =3D kernel_kexec(); break; #endif --=20 2.50.1.565.gc32cd1483b-goog From nobody Sun Oct 5 07:24:31 2025 Received: from mail-qv1-f44.google.com (mail-qv1-f44.google.com [209.85.219.44]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8043223F431 for ; Thu, 7 Aug 2025 01:45:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.44 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531122; cv=none; b=u0NDYK5KheWBErt5Q21c0SfXzKsjAJEj9p6w4x56ujHN/NcITvkvTUw4yPD6UpaARat4AEwR4vsYDgZRLdysjz1iU/UIJVTbN4M68P4BPSgQMj7CIIBaCDjhK2Y715C0ncqGW/wSE23DBlotZxH1jJQgw9EIiKV7fpX6erEWRsI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531122; c=relaxed/simple; bh=wD6I1RCn0+c99seXGKswy1P0OxmegImcJk+Lb6MyVL4=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=prVWDfo01aj58OAC+pqkcqhfDWHLFVfjFCxOcwQQ5hXrK7r2QWBH+xlP3XVb5txiNKQnmy3mBwlNnLyng2bUO8M6Dgk1TCPg5ffnJSz0ny2rmy2fsBP9IZVmpUA8ZNmJhpXQthQg9Kp25roLVdIeBWi5SVfGhfzPOP5Y+Nj/Y0E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=MFNdTEfK; arc=none smtp.client-ip=209.85.219.44 Authentication-Results: smtp.subspace.kernel.org; 
dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="MFNdTEfK" Received: by mail-qv1-f44.google.com with SMTP id 6a1803df08f44-7075ccb168bso5538836d6.0 for ; Wed, 06 Aug 2025 18:45:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531118; x=1755135918; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=q7wVeHC0BKF0BnkzpBtVk7MqDMjUeUEopTUe1tEjQ6M=; b=MFNdTEfKSm2hUchTLiu6Ij1W/TBQNRDaQN+pCacJgQOflejZHgqP9ol3PV3efPQajz iv5P8wvV6TWKMk12ES2+Em/t5GONl4QEAjwJkvWWwXS6u9OhEF3FmBHghh3g+9VwZtOS 82TqhrIyRw7ODJrtv/D0eWYpr8XQVdN6n6Lu02vFyMRO9Ymo83UfHLp53WgBA+OVfDtc rjIw7AHcgq9hwHxLPGrS+X0fK/q4bCTt3HzNO4x/eVIpJ42qGZX6820eQP57nIQF6xc2 9VRADA3IVm3wO0oDA7ZwKaoG+ld51sQHOQv5SK+/NL/DQuWVG+oMBqwUiXuFsAp/xgCN GvWA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531118; x=1755135918; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=q7wVeHC0BKF0BnkzpBtVk7MqDMjUeUEopTUe1tEjQ6M=; b=MW1qgxBVYZkweXEf8xEJqxy9A1LXlxACQ2QBl/ZZ0uvplqZQYMTowXqN/ZkCiZD6fl dVfgt/8CDDO6iwZFhYRyvHjeUmsFG3ExCOb642c28Cnw/uTU9Wo4xOV5KfvEeTJ/cmjd fbQb0tuYEKY83XZRh4xK8cc8nKDIap/Rk6QRXg5YeKu2yW3NRlFy2ziINnKr8fNSgfFV 3E9pEroF8uGq2/BjEL1iQtYXK3f6/hrujkYSg9wvrsy9+vcCot8B3YHFabl3+gbNg0Ml t3+JLU3hWa69AyoaU1wFbsMylBiOyt0udR4eDTE7qUjhUIIvxTHyIRWn4eY3ecuUIJtD QJWQ== X-Forwarded-Encrypted: i=1; AJvYcCWQV0EqMNohNRSOFe8ydJSTVpq+UnxdVuA6MenYfu7YNjkX099qSj5V/YfGA4CpLOYJEGFTKvR2/YBR240=@vger.kernel.org X-Gm-Message-State: AOJu0YwvnMZxklGPyh6aQaRPvqhKOd5mZKplJ2lnLWBZJ/j4mk8lxuWd 
mNiq6B8ubAOhHI93DtQa/pkqdGIXnyqiOTotCp4viklyk7eKWEVP/iSy/uOAfK0wFyE= X-Gm-Gg: ASbGncvXL4S0InEVvG7fptu8kks8APnN8WNJhzcs38uStQLQsfPwF5YGGKbY6VsCcs0 sjdCf5c3+mVcaseYhJFrpCUL83VqAMMf+S+WvrUedC7eKqDMOjUzjlVoi8Oem39i8tB/NRTm+Ys l4NNDlk9UuOYeAUe6sDGVDu0Ax+0PjxifiAsX+28N8HPq6CCJET6yRz1CQvG/zY1sV4pPw22DUy zl5U6vD5HJCHXU4/7cw83K+nWaIH3J1T9Bv9Ir/8Lx3VNGgjjUHX3S3td6Tyn5s2+OLrMgywUMn rk6k8fEveTGKupMrdSigmwRHKPnNS261tETmWhq6h/Gr5ISRqs1POxKcoMrNRX4bRElpBqedVZn j14jN9D1ctqi19wqUuM3thE+JRbQNHTtW4VWjDgC6BL5n18JDlCHanVUY+uar4SOnL5y+oNmOUe Kp/nMGQpRstNzV X-Google-Smtp-Source: AGHT+IGUMFTnXz7aA5TeYDEGpdy4VXQ/e6kO5So5tD5I7KBD7rTcRc491fmczfxSCGd/3u9GjXp5RA== X-Received: by 2002:a05:6214:2587:b0:709:65cf:be98 with SMTP id 6a1803df08f44-7097ae1016dmr57649496d6.8.1754531118246; Wed, 06 Aug 2025 18:45:18 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. [34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.45.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:45:17 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, 
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 21/30] kho: move kho debugfs directory to liveupdate Date: Thu, 7 Aug 2025 01:44:27 +0000 Message-ID: <20250807014442.3829950-22-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com> References: <20250807014442.3829950-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Now that LUO and KHO both live under kernel/liveupdate, it makes sense to also move the kho debugfs files to liveupdate/. The old names: /sys/kernel/debug/kho/out/ /sys/kernel/debug/kho/in/ The new names: /sys/kernel/debug/liveupdate/kho_out/ /sys/kernel/debug/liveupdate/kho_in/ Also, export liveupdate_debugfs_root so that LUO selftests can use it as well. 
Signed-off-by: Pasha Tatashin --- kernel/liveupdate/kexec_handover_debug.c | 11 ++++++----- kernel/liveupdate/luo_internal.h | 4 ++++ 2 files changed, 10 insertions(+), 5 deletions(-) diff --git a/kernel/liveupdate/kexec_handover_debug.c b/kernel/liveupdate/k= exec_handover_debug.c index af4bad225630..f06d6cdfeab3 100644 --- a/kernel/liveupdate/kexec_handover_debug.c +++ b/kernel/liveupdate/kexec_handover_debug.c @@ -14,8 +14,9 @@ #include #include #include "kexec_handover_internal.h" +#include "luo_internal.h" =20 -static struct dentry *debugfs_root; +struct dentry *liveupdate_debugfs_root; =20 struct fdt_debugfs { struct list_head list; @@ -120,7 +121,7 @@ __init void kho_in_debugfs_init(struct kho_debugfs *dbg= , const void *fdt) =20 INIT_LIST_HEAD(&dbg->fdt_list); =20 - dir =3D debugfs_create_dir("in", debugfs_root); + dir =3D debugfs_create_dir("in", liveupdate_debugfs_root); if (IS_ERR(dir)) { err =3D PTR_ERR(dir); goto err_out; @@ -180,7 +181,7 @@ __init int kho_out_debugfs_init(struct kho_debugfs *dbg) =20 INIT_LIST_HEAD(&dbg->fdt_list); =20 - dir =3D debugfs_create_dir("out", debugfs_root); + dir =3D debugfs_create_dir("out", liveupdate_debugfs_root); if (IS_ERR(dir)) return -ENOMEM; =20 @@ -214,8 +215,8 @@ __init int kho_out_debugfs_init(struct kho_debugfs *dbg) =20 __init int kho_debugfs_init(void) { - debugfs_root =3D debugfs_create_dir("kho", NULL); - if (IS_ERR(debugfs_root)) + liveupdate_debugfs_root =3D debugfs_create_dir("liveupdate", NULL); + if (IS_ERR(liveupdate_debugfs_root)) return -ENOENT; return 0; } diff --git a/kernel/liveupdate/luo_internal.h b/kernel/liveupdate/luo_inter= nal.h index 9091ed04c606..78bea012c383 100644 --- a/kernel/liveupdate/luo_internal.h +++ b/kernel/liveupdate/luo_internal.h @@ -53,4 +53,8 @@ void luo_sysfs_notify(void); static inline void luo_sysfs_notify(void) {} #endif =20 +#ifdef CONFIG_KEXEC_HANDOVER_DEBUG +extern struct dentry *liveupdate_debugfs_root; +#endif + #endif /* _LINUX_LUO_INTERNAL_H */ --=20 
2.50.1.565.gc32cd1483b-goog From nobody Sun Oct 5 07:24:31 2025 Received: from mail-qv1-f41.google.com (mail-qv1-f41.google.com [209.85.219.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2C9292475E3 for ; Thu, 7 Aug 2025 01:45:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531125; cv=none; b=J8t6Wg+BAvqsqBXkHX3Uvb9/Yez3j/zjLKrij2YY1Rw0m+crWQcyOMYlGXM3vfi5BvNTMPyJGrytMZZXdAiXn8rt1JRkxdKnjPi69iRE2gvWvFYEnCuNurAe+/Z3Xc4wfvMflOiO9D5lHnTRoN/H8N77jPQwZL9skRet73cMDqs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531125; c=relaxed/simple; bh=rde7cD/pof9SFGzTndECBbTzNyyTEtJB9y1Sh5Pk7tw=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version:Content-Type; b=AutdkbvlLoBYgkCqAu+vw8OFA7LIpsWE30B8Pivjq69V+xguDNyG7DKAR5KDyJJ7xhjsXearHqC+QEv078tFVaeR+ZfefNZCVarCTVeVBpOvhrvn0+19P4q+YhxiMDrt47ze77vZhttmqao6GzqnmoRUnlWIxxkMdXgSoDLqb7A= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=dyhugy6+; arc=none smtp.client-ip=209.85.219.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="dyhugy6+" Received: by mail-qv1-f41.google.com with SMTP id 6a1803df08f44-6fa980d05a8so5826846d6.2 for ; Wed, 06 Aug 2025 18:45:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531120; x=1755135920; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=VOmTgDGmeBWmPehEZ3L8eOdNbmK4CsWM3+7RuLYUbdA=; b=dyhugy6+TKkDPi2/NYhUGUwcal5JHN1nTZT3UG1+KNdqiRkZ+SRgJkngrFznuR8lhu LtN9mTG8m63xYj86DO91hiPsIy8KvSwsYSxM/T3grmJ1e2bJ0Nj5ap82pRQJCTMJW1iZ kBJsz27kcCKZSjVEsylQA0gnAeuRUH7bltmsXK6Iej+Eoa5UEtWQlBja6XFyzaUduXe0 VR7a4Rz9zSTFl8paEgaKH5uNe+ElXG0fij6J1GTpIYKF0Z8jfNfHEjK0I2ug32El94La Y/Oc8slh2SmDy7Vj7ZNa7cQDbJohdo+tbo3zGVpTXSP6wxHYAbe4dhExi3BepOBrRWyW pK1A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531120; x=1755135920; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=VOmTgDGmeBWmPehEZ3L8eOdNbmK4CsWM3+7RuLYUbdA=; b=N4B65pq2pjANzSRU0fGuhZzCcgWSWpm00DriTomuYpM+TPiq6d/2J8c5CCmvo3/9MD i87UR2Witxg0N7aulsDgkVZs3W6GdnXGLP9hgRRNpJ33AN5RPrgJnMZJSAW5QwHJgSWU 9ml6E+FFyLcoErg05GacP9n01Y1//QgxOnRcDvtzHDq9yFrehO2qnuXsIT0GfTCjQCk+ IbfH7ahoqNKSQxZtaf1YhA1xAy51N32UvyjUO9nHeSiiFNc4f0qW998jYGG2oaEz2Fxk oymUo48ixDeLMVKpou+8aqhWB9JG5cakt+JiR9/KzqeYMrySeMYBLNPMKZtQnVQikeGS NBWw== X-Forwarded-Encrypted: i=1; AJvYcCUhr5V3JmV3QDEVuTbjJ/pggEC0r+OaYy0O6vKXaJ21oTgira4SXZojD7nVaTHZ1tyVNlwX1Gri0SHD1Zg=@vger.kernel.org X-Gm-Message-State: AOJu0YzzuLuolWaeE3yX35K+EVmRbLII9fPahq6W+4Y77mYxnH2sjF5U wXCYqK7OZix9H5LwBZ82/kHAobsOmDnhiZNoQ7YgxCZKA4sPLHJvuR0wwfGxmgqeeUY= X-Gm-Gg: ASbGnct7EhdV8QbuvF0ZnZCjYFDNDtxkKa9A8+kz3zARF2/BfXt6R78NLB7p2ixok7d SjV2VqmF2shoUXy5moH/zLRyBqEUei3YFbAbANOmTnmh1lxTrPuldcVmVIAnkjDRSGcJnKNkAM7 o1FrgLgbHUs4waJoFCA3rjti5B3ozIys25Q8PcCXCLIWPITYjS2Qz5QNwkxyEae/qiT1xIRWSVY FuoMquAUTGvRkpQgdVynkiBqHoQZW87p0Hy1zgYlz87PLdh76xSsa/CEY3kchhaTzZuYzzlI1YS QDTtfV4QnbOtrz8aNq8zfQGQBnFYyX0ZpcaLtUvWKMc+5UYjrb+pASJ5me/e8SEGWTy2iFxQ4pF cwbv/dVrhEhAfsrwbSMWj24yDbx7HVyQRm1NFo2GglIqvUdGJfr9dYLuCFtJ7btlDXsU6eYLOuo y9R0bQhQ2Hhsu6 X-Google-Smtp-Source: 
AGHT+IGa10Q9xXZTxnWqAP3/TJ/pWwwZKDNRJhQRt0NViWNtS/8r0csvFFcTyLahUM71ijfSvrto3Q== X-Received: by 2002:a05:6214:4013:b0:707:43a1:5b0e with SMTP id 6a1803df08f44-7098a6a5940mr22300026d6.10.1754531119648; Wed, 06 Aug 2025 18:45:19 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. [34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.45.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:45:19 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, 
linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 22/30] liveupdate: add selftests for subsystems un/registration Date: Thu, 7 Aug 2025 01:44:28 +0000 Message-ID: <20250807014442.3829950-23-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com> References: <20250807014442.3829950-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Introduce a self-test mechanism for the LUO to allow verification of core subsystem management functionality. This is primarily intended for developers and system integrators validating the live update feature. The tests are enabled via the new Kconfig option CONFIG_LIVEUPDATE_SELFTESTS (default 'n') and are triggered through a new ioctl command, LIVEUPDATE_IOCTL_SELFTESTS, exposed on the luo_selftest file that this patch creates in the liveupdate debugfs directory. This ioctl accepts the following commands, defined in luo_selftests.h: - LUO_CMD_SUBSYSTEM_REGISTER: Creates and registers a dummy LUO subsystem using the liveupdate_register_subsystem() function. It allocates a data page and copies initial data from userspace. - LUO_CMD_SUBSYSTEM_UNREGISTER: Unregisters the specified dummy subsystem using the liveupdate_unregister_subsystem() function and cleans up associated test resources. - LUO_CMD_SUBSYSTEM_GETDATA: Copies the data page associated with a registered test subsystem back to userspace, allowing verification of data potentially modified or preserved by test callbacks. This provides a way to test the fundamental registration and unregistration flows within the LUO framework from userspace without requiring a full live update sequence.
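As a rough illustration of the request layout the three commands above share, the sketch below fills the two structures in userspace. The structure layouts and command values mirror luo_selftests.h from this patch; the helper luo_build_request() is hypothetical and exists only for this example, and the ioctl call itself is left out because it needs a kernel built with CONFIG_LIVEUPDATE_SELFTESTS.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Local mirrors of the definitions in luo_selftests.h from this patch. */
#define LUO_NAME_LENGTH 32
#define LUO_CMD_SUBSYSTEM_REGISTER   0
#define LUO_CMD_SUBSYSTEM_UNREGISTER 1
#define LUO_CMD_SUBSYSTEM_GETDATA    2

struct luo_arg_subsystem {
	char name[LUO_NAME_LENGTH];
	void *data_page;
};

struct liveupdate_selftest {
	uint64_t cmd;
	uint64_t arg;
};

/*
 * Hypothetical helper (not part of the patch): fill both request
 * structures for one selftest command.
 * Returns 0 on success, -1 if the name does not fit in LUO_NAME_LENGTH.
 */
static int luo_build_request(struct liveupdate_selftest *req,
			     struct luo_arg_subsystem *arg,
			     uint64_t cmd, const char *name, void *data_page)
{
	assert(req && arg && name);

	if (strlen(name) >= LUO_NAME_LENGTH)
		return -1;

	memset(arg, 0, sizeof(*arg));
	snprintf(arg->name, sizeof(arg->name), "%s", name);
	arg->data_page = data_page;

	/* The kernel side reads 'arg' as a user pointer stored in a u64. */
	req->cmd = cmd;
	req->arg = (uint64_t)(uintptr_t)arg;
	return 0;
}
```

On a test kernel, the filled request would then be passed as the third argument of ioctl() with LIVEUPDATE_IOCTL_SELFTESTS on the selftest file descriptor.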
Signed-off-by: Pasha Tatashin --- kernel/liveupdate/Kconfig | 15 ++ kernel/liveupdate/Makefile | 1 + kernel/liveupdate/luo_selftests.c | 345 ++++++++++++++++++++++++++++++ kernel/liveupdate/luo_selftests.h | 84 ++++++++ 4 files changed, 445 insertions(+) create mode 100644 kernel/liveupdate/luo_selftests.c create mode 100644 kernel/liveupdate/luo_selftests.h diff --git a/kernel/liveupdate/Kconfig b/kernel/liveupdate/Kconfig index 75a17ca8a592..5be04ede357d 100644 --- a/kernel/liveupdate/Kconfig +++ b/kernel/liveupdate/Kconfig @@ -47,6 +47,21 @@ config LIVEUPDATE_SYSFS_API =20 If unsure, say N. =20 +config LIVEUPDATE_SELFTESTS + bool "Live Update Orchestrator - self-tests" + depends on LIVEUPDATE + help + Say Y here to build self-tests for the LUO framework. When enabled, + these tests can be initiated via the ioctl interface to help verify + the core live update functionality. + + This option is primarily intended for developers working on the + live update feature or for validation purposes during system + integration. + + If you are unsure or are building a production kernel where size + or attack surface is a concern, say N.
+ config KEXEC_HANDOVER bool "kexec handover" depends on ARCH_SUPPORTS_KEXEC_HANDOVER && ARCH_SUPPORTS_KEXEC_FILE diff --git a/kernel/liveupdate/Makefile b/kernel/liveupdate/Makefile index 47f5d0378a75..9b8b69517463 100644 --- a/kernel/liveupdate/Makefile +++ b/kernel/liveupdate/Makefile @@ -13,4 +13,5 @@ obj-$(CONFIG_KEXEC_HANDOVER) +=3D kexec_handover.o obj-$(CONFIG_KEXEC_HANDOVER_DEBUG) +=3D kexec_handover_debug.o =20 obj-$(CONFIG_LIVEUPDATE) +=3D luo.o +obj-$(CONFIG_LIVEUPDATE_SELFTESTS) +=3D luo_selftests.o obj-$(CONFIG_LIVEUPDATE_SYSFS_API) +=3D luo_sysfs.o diff --git a/kernel/liveupdate/luo_selftests.c b/kernel/liveupdate/luo_self= tests.c new file mode 100644 index 000000000000..824d6a99f8fc --- /dev/null +++ b/kernel/liveupdate/luo_selftests.c @@ -0,0 +1,345 @@ +// SPDX-License-Identifier: GPL-2.0 + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +/** + * DOC: LUO Selftests + * + * This file implements an ioctl-based selftest interface for the LUO. It + * provides a mechanism to test core LUO functionality, particularly the + * registration, unregistration, and data handling aspects of LUO subsystems, + * without requiring a full live update event sequence. + * + * The tests are intended primarily for developers working on the LUO fram= ework + * or for validation purposes during system integration. This functionalit= y is + * conditionally compiled based on the `CONFIG_LIVEUPDATE_SELFTESTS` Kconf= ig + * option and should typically be disabled in production kernels. + * + * Interface: + * The selftests are accessed via the `luo_selftest` debugfs file using + * the `LIVEUPDATE_IOCTL_SELFTESTS` ioctl command. The argument to the ioc= tl + * is a pointer to a `struct liveupdate_selftest` structure (defined in + * `luo_selftests.h`), which contains: + * - `cmd`: The specific selftest command to execute (e.g., + * `LUO_CMD_SUBSYSTEM_REGISTER`). + * - `arg`: A pointer to a command-specific argument structure.
For subsys= tem + * tests, this points to a `struct luo_arg_subsystem` (defined in + * `luo_selftests.h`). + * + * Commands: + * - `LUO_CMD_SUBSYSTEM_REGISTER`: + * Registers a new dummy LUO subsystem. It allocates kernel memory for test + * data, copies initial data from the user-provided `data_page`, sets up + * simple logging callbacks, and calls the core + * `liveupdate_register_subsystem()` + * function. Requires `arg` pointing to `struct luo_arg_subsystem`. + * - `LUO_CMD_SUBSYSTEM_UNREGISTER`: + * Unregisters a previously registered dummy subsystem identified by `name= `. + * It calls the core `liveupdate_unregister_subsystem()` function and then + * frees the associated kernel memory and internal tracking structures. + * Requires `arg` pointing to `struct luo_arg_subsystem` (only `name` used= ). + * - `LUO_CMD_SUBSYSTEM_GETDATA`: + * Copies the content of the kernel data page associated with the specified + * dummy subsystem (`name`) back to the user-provided `data_page`. This al= lows + * userspace to verify the state of the data after potential test operatio= ns. + * Requires `arg` pointing to `struct luo_arg_subsystem`. 
+ */ + +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt + +#include +#include +#include +#include +#include +#include +#include +#include +#include "luo_internal.h" +#include "luo_selftests.h" + +static struct luo_subsystems { + struct liveupdate_subsystem handle; + char name[LUO_NAME_LENGTH]; + void *data; + bool in_use; + bool preserved; +} luo_subsystems[LUO_MAX_SUBSYSTEMS]; + +/* Only allow one selftest ioctl operation at a time */ +static DEFINE_MUTEX(luo_ioctl_mutex); + +static int luo_subsystem_prepare(struct liveupdate_subsystem *h, u64 *data) +{ + struct luo_subsystems *s =3D container_of(h, struct luo_subsystems, + handle); + unsigned long phys_addr =3D __pa(s->data); + int ret; + + ret =3D kho_preserve_phys(phys_addr, PAGE_SIZE); + if (ret) + return ret; + + s->preserved =3D true; + *data =3D phys_addr; + pr_info("Subsystem '%s' prepare data[%lx]\n", + s->name, phys_addr); + + if (strstr(s->name, NAME_PREPARE_FAIL)) + return -EAGAIN; + + return 0; +} + +static int luo_subsystem_freeze(struct liveupdate_subsystem *h, u64 *data) +{ + struct luo_subsystems *s =3D container_of(h, struct luo_subsystems, + handle); + + pr_info("Subsystem '%s' freeze data[%llx]\n", s->name, *data); + + return 0; +} + +static void luo_subsystem_cancel(struct liveupdate_subsystem *h, u64 data) +{ + struct luo_subsystems *s =3D container_of(h, struct luo_subsystems, + handle); + + pr_info("Subsystem '%s' cancel data[%llx]\n", s->name, data); + s->preserved =3D false; + WARN_ON(kho_unpreserve_phys(data, PAGE_SIZE)); +} + +static void luo_subsystem_finish(struct liveupdate_subsystem *h, u64 data) +{ + struct luo_subsystems *s =3D container_of(h, struct luo_subsystems, + handle); + + pr_info("Subsystem '%s' finish data[%llx]\n", s->name, data); +} + +static const struct liveupdate_subsystem_ops luo_selftest_subsys_ops =3D { + .prepare =3D luo_subsystem_prepare, + .freeze =3D luo_subsystem_freeze, + .cancel =3D luo_subsystem_cancel, + .finish =3D luo_subsystem_finish, + .owner =3D
THIS_MODULE, +}; + +static int luo_subsystem_idx(char *name) +{ + int i; + + for (i =3D 0; i < LUO_MAX_SUBSYSTEMS; i++) { + if (luo_subsystems[i].in_use && + !strcmp(luo_subsystems[i].name, name)) + break; + } + + if (i =3D=3D LUO_MAX_SUBSYSTEMS) { + pr_warn("Subsystem with name '%s' is not registered\n", name); + + return -EINVAL; + } + + return i; +} + +static void luo_put_and_free_subsystem(char *name) +{ + int i =3D luo_subsystem_idx(name); + + if (i < 0) + return; + + if (luo_subsystems[i].preserved) + kho_unpreserve_phys(__pa(luo_subsystems[i].data), PAGE_SIZE); + free_page((unsigned long)luo_subsystems[i].data); + luo_subsystems[i].in_use =3D false; + luo_subsystems[i].preserved =3D false; +} + +static int luo_get_and_alloc_subsystem(char *name, void __user *data, + struct liveupdate_subsystem **hp) +{ + unsigned long page_addr, i; + + page_addr =3D get_zeroed_page(GFP_KERNEL); + if (!page_addr) { + pr_warn("Failed to allocate memory for subsystem data\n"); + return -ENOMEM; + } + + if (copy_from_user((void *)page_addr, data, PAGE_SIZE)) { + free_page(page_addr); + return -EFAULT; + } + + for (i =3D 0; i < LUO_MAX_SUBSYSTEMS; i++) { + if (!luo_subsystems[i].in_use) + break; + } + + if (i =3D=3D LUO_MAX_SUBSYSTEMS) { + pr_warn("Maximum number of subsystems registered\n"); + free_page(page_addr); + return -ENOMEM; + } + + luo_subsystems[i].in_use =3D true; + luo_subsystems[i].handle.ops =3D &luo_selftest_subsys_ops; + luo_subsystems[i].handle.name =3D luo_subsystems[i].name; + strscpy(luo_subsystems[i].name, name, LUO_NAME_LENGTH); + luo_subsystems[i].data =3D (void *)page_addr; + + *hp =3D &luo_subsystems[i].handle; + + return 0; +} + +static int luo_cmd_subsystem_unregister(void __user *argp) +{ + struct luo_arg_subsystem arg; + int ret, i; + + if (copy_from_user(&arg, argp, sizeof(arg))) + return -EFAULT; + + i =3D luo_subsystem_idx(arg.name); + if (i < 0) + return i; + + ret =3D liveupdate_unregister_subsystem(&luo_subsystems[i].handle); + if (ret) + return
ret; + + luo_put_and_free_subsystem(arg.name); + + return 0; +} + +static int luo_cmd_subsystem_register(void __user *argp) +{ + struct liveupdate_subsystem *h; + struct luo_arg_subsystem arg; + int ret; + + if (copy_from_user(&arg, argp, sizeof(arg))) + return -EFAULT; + + ret =3D luo_get_and_alloc_subsystem(arg.name, + (void __user *)arg.data_page, &h); + if (ret) + return ret; + + ret =3D liveupdate_register_subsystem(h); + if (ret) + luo_put_and_free_subsystem(arg.name); + + return ret; +} + +static int luo_cmd_subsystem_getdata(void __user *argp) +{ + struct luo_arg_subsystem arg; + int i; + + if (copy_from_user(&arg, argp, sizeof(arg))) + return -EFAULT; + + i =3D luo_subsystem_idx(arg.name); + if (i < 0) + return i; + + if (copy_to_user(arg.data_page, luo_subsystems[i].data, + PAGE_SIZE)) { + return -EFAULT; + } + + return 0; +} + +static int luo_ioctl_selftests(void __user *argp) +{ + struct liveupdate_selftest luo_st; + void __user *cmd_argp; + int ret =3D 0; + + if (copy_from_user(&luo_st, argp, sizeof(luo_st))) + return -EFAULT; + + cmd_argp =3D (void __user *)luo_st.arg; + + mutex_lock(&luo_ioctl_mutex); + switch (luo_st.cmd) { + case LUO_CMD_SUBSYSTEM_REGISTER: + ret =3D luo_cmd_subsystem_register(cmd_argp); + break; + + case LUO_CMD_SUBSYSTEM_UNREGISTER: + ret =3D luo_cmd_subsystem_unregister(cmd_argp); + break; + + case LUO_CMD_SUBSYSTEM_GETDATA: + ret =3D luo_cmd_subsystem_getdata(cmd_argp); + break; + + default: + pr_warn("ioctl: unknown self-test command nr: 0x%llx\n", + luo_st.cmd); + ret =3D -ENOTTY; + break; + } + mutex_unlock(&luo_ioctl_mutex); + + return ret; +} + +static long luo_selftest_ioctl(struct file *filep, unsigned int cmd, + unsigned long arg) +{ + int ret =3D 0; + + if (_IOC_TYPE(cmd) !=3D LIVEUPDATE_IOCTL_TYPE) + return -ENOTTY; + + switch (cmd) { + case LIVEUPDATE_IOCTL_FREEZE: + ret =3D luo_freeze(); + break; + + case LIVEUPDATE_IOCTL_SELFTESTS: + ret =3D luo_ioctl_selftests((void __user *)arg); + break; + + default: + 
pr_warn("ioctl: unknown command nr: 0x%x\n", _IOC_NR(cmd)); + ret =3D -ENOTTY; + break; + } + + return ret; +} + +static const struct file_operations luo_selftest_fops =3D { + .open =3D nonseekable_open, + .unlocked_ioctl =3D luo_selftest_ioctl, +}; + +static int __init luo_selftest_init(void) +{ + if (!liveupdate_debugfs_root) { + pr_err("liveupdate root is not set\n"); + return 0; + } + debugfs_create_file_unsafe("luo_selftest", 0600, + liveupdate_debugfs_root, NULL, + &luo_selftest_fops); + return 0; +} + +late_initcall(luo_selftest_init); diff --git a/kernel/liveupdate/luo_selftests.h b/kernel/liveupdate/luo_self= tests.h new file mode 100644 index 000000000000..098f2e9e6a78 --- /dev/null +++ b/kernel/liveupdate/luo_selftests.h @@ -0,0 +1,84 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +#ifndef _LINUX_LUO_SELFTESTS_H +#define _LINUX_LUO_SELFTESTS_H + +#include +#include + +/* Maximum number of subsystems the self-tests can register */ +#define LUO_MAX_SUBSYSTEMS 16 +#define LUO_NAME_LENGTH 32 + +#define LUO_CMD_SUBSYSTEM_REGISTER 0 +#define LUO_CMD_SUBSYSTEM_UNREGISTER 1 +#define LUO_CMD_SUBSYSTEM_GETDATA 2 +struct luo_arg_subsystem { + char name[LUO_NAME_LENGTH]; + void *data_page; +}; + +/* + * Test name prefixes: + * normal: prepare and freeze callbacks do not fail + * prepare_fail: prepare callback fails for this test. + * freeze_fail: freeze callback fails for this test + */ +#define NAME_NORMAL "ksft_luo" +#define NAME_PREPARE_FAIL "ksft_prepare_fail" +#define NAME_FREEZE_FAIL "ksft_freeze_fail" + +/** + * struct liveupdate_selftest - Holds directions for the self-test operati= ons. + * @cmd: Selftest command defined in luo_selftests.h. + * @arg: Argument for the self test command. + * + * This structure is used only for the selftest purposes.
+ */ +struct liveupdate_selftest { + __u64 cmd; + __u64 arg; +}; + +/** + * LIVEUPDATE_IOCTL_FREEZE - Notify subsystems of imminent reboot + * transition. + * + * Argument: None. + * + * Notifies the live update subsystem and associated components that the k= ernel + * is about to execute the final reboot transition into the new kernel (e.= g., + * via kexec). This action triggers the internal %LIVEUPDATE_FREEZE kernel + * event. This event provides subsystems a final, brief opportunity (withi= n the + * "blackout window") to save critical state or perform last-moment quiesc= ing. + * Any remaining or deferred state saving for items marked via the PRESERVE + * ioctls typically occurs in response to the %LIVEUPDATE_FREEZE event. + * + * This ioctl should only be called when the system is in the + * %LIVEUPDATE_STATE_PREPARED state. This command does not transfer data. + * + * Return: 0 if the notification is successfully processed by the kernel (= but + * reboot follows). Returns a negative error code if the notification fails + * or if the system is not in the %LIVEUPDATE_STATE_PREPARED state. + */ +#define LIVEUPDATE_IOCTL_FREEZE \ + _IO(LIVEUPDATE_IOCTL_TYPE, 0x05) + +/** + * LIVEUPDATE_IOCTL_SELFTESTS - Interface for the LUO selftests + * + * Argument: Pointer to &struct liveupdate_selftest. + * + * Used by LUO selftests; commands are declared in luo_selftests.h. + * + * Return: 0 on success, negative error code on failure (e.g., unknown command).
+ */ +#define LIVEUPDATE_IOCTL_SELFTESTS \ + _IOWR(LIVEUPDATE_IOCTL_TYPE, 0x08, struct liveupdate_selftest) + +#endif /* _LINUX_LUO_SELFTESTS_H */ --=20 2.50.1.565.gc32cd1483b-goog From nobody Sun Oct 5 07:24:31 2025 Received: from mail-qv1-f47.google.com (mail-qv1-f47.google.com [209.85.219.47]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AB93D23FC42 for ; Thu, 7 Aug 2025 01:45:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.47 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531126; cv=none; b=b/cLJ86v5saG3os5O+tBNSlpkLnOwAdMaF+IKBbvYVnohWqQACDYsUoGAYHegiwB/bromXOWt9YGIex/HMw1c05v/1vWgdIahd3EcNnWJ1PB9SkonJKxLlFHhUTbtPkw1sKIDmzEf2yI4frpz009O1RV1qUvBZbguD21pNIJJb4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531126; c=relaxed/simple; bh=kTWhEvbuCaicpRqRsa/P4zEbCOqm+ea1bZdGOKvfd+c=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=V+oYravaKXpzPE4IJwVLecB1ns2GWRe9wBsXN718OpamwNHAQ77Ar18TBd/3163eeobH9wr1HYbkxMp9aH4/M0BjmTHdgm0I22OcQTYL6LhJciBlCzC5FqAKjsjY37t9VvDeGGmQlECmUiIsfrfqUaAQ9tE+IHorTbAnbPrNgug= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=Nz1wzbJX; arc=none smtp.client-ip=209.85.219.47 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="Nz1wzbJX" Received: by mail-qv1-f47.google.com with SMTP id 6a1803df08f44-70884da4b55so6137246d6.3 for ; Wed, 06 Aug 2025 18:45:22 -0700 
(PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531121; x=1755135921; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=Z9F1hOcDAjab1iHPlBAL9eSJaLDlWaWMO1GBmtnwL/g=; b=Nz1wzbJXYiCG9uGQCHj+x1J2bP0LHLFmKfSJ9MKkkXZwvLM3P14OFnSQG7qeaysXYu 4g/eArlJ6WQlKiWBMknguOD3O9q0DOuv8XloVg1deSect/mYaY1OYSUcdyt+0yGOoeAg /eW0JOs9Lk8IOx1Do590kLd1aZe8Vpcaarc1PhiULMXcupHYDpUP/JB1S3NeOFQTqXqz P932t26OJJYAGXn/UPe5uzauznzNdgSKBkqO4FmXFLMOngNZCQQm1EZCnKdYVp7XVRee wdwVqlw4Ulkjd4rYcALE2a/1ZWusox3iC9KaVgFVnAJYhX/STNFUpPiEVwZ8lzRRA9wC u0Tg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531122; x=1755135922; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Z9F1hOcDAjab1iHPlBAL9eSJaLDlWaWMO1GBmtnwL/g=; b=SUVyCQVVCDOp3L0+MYy9qemawL1bBx4XlJWCm1tfzqZGaDRPSRHy3PxtmsnPo4z3l6 P9vcky6npwTvPD8dH17YIQ4+nsS7n9UBQ0dF/Ayf40RPYtr4oBELrb5RUIs/6ZG+sSS8 x6u9tVJqz8UL1t9DGgJmjUhjr6ZZ9yOD1p8kOLg4JT0Ry4Kfwhg3gR5KlB1jHTzA5ysd 33BRnNfVu0YOzloURKq3DTk7+vcl4wiHUnlYhuPimqgG786aYjYLKWhVOWGnXDsy+pYE SBmZ5Cj9ht+4Y6Ha/PaVEhDjoUPBPauWdDoFCrkRdxSvk68FKt21IxMLF0teCVJOQFgT xGag== X-Forwarded-Encrypted: i=1; AJvYcCVUw/vvfgiEgyL3ABGmPiYrwW5pMhMd4HpHE1pASrs9FNqexrcjA9TiCHEeYytusJFPZDXdc66VCNZiDYE=@vger.kernel.org X-Gm-Message-State: AOJu0YzdxoSTkmW5t2SljonPy//orTicIUTHoWPNAZ8ecw3x10VDmPex xaDAyRzzvFzcTM/q3LhaeB+pvzz1Fpr17tNX03LcpI5tQ6O9jTHrEVky0umkxy1SM5k= X-Gm-Gg: ASbGncvR6xBpBK5BQS3yD7YSTnZtofy7scJU4w1cgmKDIjjwh8eeyQKJF1JYT30FOvC 4TdsBhgRK4SzI98DJBaZtgQr94btvfL9rVdMZPhURmQdgyXwHInbxbO2dmfVenfGS7U+BRuAGBd q5tnoKQ0EHFA4aVtdDxPEeV2kc/75UwgPyY0Q1yyUnbSh0Gro3AZNnKKkz0lBTP4t3yY6o7GOXf D/K2LRnCkqsChjGg06xzbSvjFwLRdvZBLNr730EusF1cvprU5cAO1FDZe1OzXqE2a0CdDIdT0xu 
7DMfWivDlYj3/7aYT2mynLlXMcPx5mQDfa6IlKkOgkmg6pVZMjZHVvsCB9RnLPvtqXXtWUMSUh2 2KYDxIk/CEiQytGthih2pysgZr6UskTlBx/B+TuloIeyIGweRVDAF9ErtoPr88FkhRWbdLplXcZ cqwK9g4ccP6GTO X-Google-Smtp-Source: AGHT+IHM8rbdi1dYANpxFGjhD02SFYyo5a7VYvH2dHy9m+NmOGj0IODQ2sFOof0qb62BBdLWP7jK6Q== X-Received: by 2002:a05:6214:4006:b0:709:22f1:d657 with SMTP id 6a1803df08f44-7097afb978fmr57768486d6.40.1754531121390; Wed, 06 Aug 2025 18:45:21 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. [34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.45.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:45:20 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, 
leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 23/30] selftests/liveupdate: add subsystem/state tests Date: Thu, 7 Aug 2025 01:44:29 +0000 Message-ID: <20250807014442.3829950-24-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com> References: <20250807014442.3829950-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Introduce a new set of userspace selftests for the LUO. These tests verify the functionality of the LUO by using the kernel-side selftest ioctls provided by the LUO module, primarily focusing on subsystem management and basic LUO state transitions.
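For readers who want to see how a state check of this kind can work from userspace, the sketch below maps the text read from the sysfs state file to a table index. The state names are the ones listed in luo_state_str[] in this patch; the helper luo_parse_state() is hypothetical, and the returned index refers to the local table in this example only, not necessarily to the kernel's enum values.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* State names as listed in luo_state_str[] in this patch's liveupdate.c. */
static const char *const luo_state_names[] = {
	"undefined", "normal", "prepared", "frozen", "updated",
};

/*
 * Hypothetical helper (not part of the patch): map the contents of the
 * sysfs state file to an index into luo_state_names[].
 * A single trailing newline, as sysfs normally emits, is ignored.
 * Returns the index on a match, or -1 for an unknown state string.
 */
static int luo_parse_state(const char *buf)
{
	size_t len, i;

	assert(buf);

	/* Compare only the part before the first newline. */
	len = strcspn(buf, "\n");

	for (i = 0; i < sizeof(luo_state_names) / sizeof(luo_state_names[0]); i++) {
		if (strlen(luo_state_names[i]) == len &&
		    !strncmp(buf, luo_state_names[i], len))
			return (int)i;
	}

	return -1;
}
```

Returning a distinct error for an unknown string keeps an empty or truncated read from being mistaken for a valid state.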
Signed-off-by: Pasha Tatashin --- tools/testing/selftests/Makefile | 1 + tools/testing/selftests/liveupdate/.gitignore | 1 + tools/testing/selftests/liveupdate/Makefile | 7 + tools/testing/selftests/liveupdate/config | 6 + .../testing/selftests/liveupdate/liveupdate.c | 406 ++++++++++++++++++ 5 files changed, 421 insertions(+) create mode 100644 tools/testing/selftests/liveupdate/.gitignore create mode 100644 tools/testing/selftests/liveupdate/Makefile create mode 100644 tools/testing/selftests/liveupdate/config create mode 100644 tools/testing/selftests/liveupdate/liveupdate.c diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Mak= efile index 030da61dbff3..3f76ee8ddda6 100644 --- a/tools/testing/selftests/Makefile +++ b/tools/testing/selftests/Makefile @@ -53,6 +53,7 @@ TARGETS +=3D kvm TARGETS +=3D landlock TARGETS +=3D lib TARGETS +=3D livepatch +TARGETS +=3D liveupdate TARGETS +=3D lkdtm TARGETS +=3D lsm TARGETS +=3D membarrier diff --git a/tools/testing/selftests/liveupdate/.gitignore b/tools/testing/= selftests/liveupdate/.gitignore new file mode 100644 index 000000000000..af6e773cf98f --- /dev/null +++ b/tools/testing/selftests/liveupdate/.gitignore @@ -0,0 +1 @@ +/liveupdate diff --git a/tools/testing/selftests/liveupdate/Makefile b/tools/testing/se= lftests/liveupdate/Makefile new file mode 100644 index 000000000000..2a573c36016e --- /dev/null +++ b/tools/testing/selftests/liveupdate/Makefile @@ -0,0 +1,7 @@ +# SPDX-License-Identifier: GPL-2.0-only +CFLAGS +=3D -Wall -O2 -Wno-unused-function +CFLAGS +=3D $(KHDR_INCLUDES) + +TEST_GEN_PROGS +=3D liveupdate + +include ../lib.mk diff --git a/tools/testing/selftests/liveupdate/config b/tools/testing/self= tests/liveupdate/config new file mode 100644 index 000000000000..382c85b89570 --- /dev/null +++ b/tools/testing/selftests/liveupdate/config @@ -0,0 +1,6 @@ +CONFIG_KEXEC_FILE=3Dy +CONFIG_KEXEC_HANDOVER=3Dy +CONFIG_KEXEC_HANDOVER_DEBUG=3Dy +CONFIG_LIVEUPDATE=3Dy 
+CONFIG_LIVEUPDATE_SYSFS_API=3Dy +CONFIG_LIVEUPDATE_SELFTESTS=3Dy diff --git a/tools/testing/selftests/liveupdate/liveupdate.c b/tools/testin= g/selftests/liveupdate/liveupdate.c new file mode 100644 index 000000000000..b59767a7aaba --- /dev/null +++ b/tools/testing/selftests/liveupdate/liveupdate.c @@ -0,0 +1,406 @@ +// SPDX-License-Identifier: GPL-2.0-only + +/* + * Copyright (c) 2025, Google LLC. + * Pasha Tatashin + */ + +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#include + +#include "../kselftest.h" +#include "../kselftest_harness.h" +#include "../../../../kernel/liveupdate/luo_selftests.h" + +struct subsystem_info { + void *data_page; + void *verify_page; + char test_name[LUO_NAME_LENGTH]; + bool registered; +}; + +FIXTURE(subsystem) { + int fd; + int fd_dbg; + struct subsystem_info si[LUO_MAX_SUBSYSTEMS]; +}; + +FIXTURE(state) { + int fd; + int fd_dbg; +}; + +#define LUO_DEVICE "/dev/liveupdate" +#define LUO_DBG_DEVICE "/sys/kernel/debug/liveupdate/luo_selftest" +#define LUO_SYSFS_STATE "/sys/kernel/liveupdate/state" +static size_t page_size; + +const char *const luo_state_str[] =3D { + [LIVEUPDATE_STATE_UNDEFINED] =3D "undefined", + [LIVEUPDATE_STATE_NORMAL] =3D "normal", + [LIVEUPDATE_STATE_PREPARED] =3D "prepared", + [LIVEUPDATE_STATE_FROZEN] =3D "frozen", + [LIVEUPDATE_STATE_UPDATED] =3D "updated", +}; + +static int run_luo_selftest_cmd(int fd_dbg, __u64 cmd_code, + struct luo_arg_subsystem *subsys_arg) +{ + struct liveupdate_selftest k_arg; + + k_arg.cmd =3D cmd_code; + k_arg.arg =3D (__u64)(unsigned long)subsys_arg; + + return ioctl(fd_dbg, LIVEUPDATE_IOCTL_SELFTESTS, &k_arg); +} + +static int register_subsystem(int fd_dbg, struct subsystem_info *si) +{ + struct luo_arg_subsystem subsys_arg; + int ret; + + memset(&subsys_arg, 0, sizeof(subsys_arg)); + snprintf(subsys_arg.name, LUO_NAME_LENGTH, "%s", si->test_name); + subsys_arg.data_page =3D si->data_page; + + ret =3D run_luo_selftest_cmd(fd_dbg, 
LUO_CMD_SUBSYSTEM_REGISTER, + &subsys_arg); + if (!ret) + si->registered =3D true; + + return ret; +} + +static int unregister_subsystem(int fd_dbg, struct subsystem_info *si) +{ + struct luo_arg_subsystem subsys_arg; + int ret; + + memset(&subsys_arg, 0, sizeof(subsys_arg)); + snprintf(subsys_arg.name, LUO_NAME_LENGTH, "%s", si->test_name); + + ret =3D run_luo_selftest_cmd(fd_dbg, LUO_CMD_SUBSYSTEM_UNREGISTER, + &subsys_arg); + if (!ret) + si->registered =3D false; + + return ret; +} + +static int get_sysfs_state(void) +{ + char buf[64]; + ssize_t len; + int fd, i; + + fd =3D open(LUO_SYSFS_STATE, O_RDONLY); + if (fd < 0) { + ksft_print_msg("Failed to open sysfs state file '%s': %s\n", + LUO_SYSFS_STATE, strerror(errno)); + return -errno; + } + + len =3D read(fd, buf, sizeof(buf) - 1); + close(fd); + + if (len <=3D 0) { + ksft_print_msg("Failed to read sysfs state file '%s': %s\n", + LUO_SYSFS_STATE, strerror(errno)); + return -errno; + } + if (buf[len - 1] =3D=3D '\n') + buf[len - 1] =3D '\0'; + else + buf[len] =3D '\0'; + + for (i =3D 0; i < ARRAY_SIZE(luo_state_str); i++) { + if (!strcmp(buf, luo_state_str[i])) + return i; + } + + return -EIO; +} + +FIXTURE_SETUP(state) +{ + int state; + + page_size =3D sysconf(_SC_PAGE_SIZE); + self->fd =3D open(LUO_DEVICE, O_RDWR); + if (self->fd < 0) + SKIP(return, "open(%s) failed [%d]", LUO_DEVICE, errno); + + self->fd_dbg =3D open(LUO_DBG_DEVICE, O_RDWR); + ASSERT_GE(self->fd_dbg, 0); + + state =3D get_sysfs_state(); + if (state < 0) { + if (state =3D=3D -ENOENT || state =3D=3D -EACCES) + SKIP(return, "sysfs state not accessible (%d)", state); + } +} + +FIXTURE_TEARDOWN(state) +{ + struct liveupdate_ioctl_set_event cancel =3D { + .size =3D sizeof(cancel), + .event =3D LIVEUPDATE_CANCEL, + }; + struct liveupdate_ioctl_get_state ligs =3D {.size =3D sizeof(ligs)}; + + ioctl(self->fd, LIVEUPDATE_IOCTL_GET_STATE, &ligs); + if (ligs.state !=3D LIVEUPDATE_STATE_NORMAL) + ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &cancel); + 
close(self->fd); +} + +FIXTURE_SETUP(subsystem) +{ + int i; + + page_size =3D sysconf(_SC_PAGE_SIZE); + memset(&self->si, 0, sizeof(self->si)); + self->fd =3D open(LUO_DEVICE, O_RDWR); + if (self->fd < 0) + SKIP(return, "open(%s) failed [%d]", LUO_DEVICE, errno); + + self->fd_dbg =3D open(LUO_DBG_DEVICE, O_RDWR); + ASSERT_GE(self->fd_dbg, 0); + + for (i =3D 0; i < LUO_MAX_SUBSYSTEMS; i++) { + snprintf(self->si[i].test_name, LUO_NAME_LENGTH, + NAME_NORMAL ".%d", i); + + self->si[i].data_page =3D mmap(NULL, page_size, + PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS, + -1, 0); + ASSERT_NE(MAP_FAILED, self->si[i].data_page); + memset(self->si[i].data_page, 'A' + i, page_size); + + self->si[i].verify_page =3D mmap(NULL, page_size, + PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS, + -1, 0); + ASSERT_NE(MAP_FAILED, self->si[i].verify_page); + memset(self->si[i].verify_page, 0, page_size); + } +} + +FIXTURE_TEARDOWN(subsystem) +{ + struct liveupdate_ioctl_set_event cancel =3D { + .size =3D sizeof(cancel), + .event =3D LIVEUPDATE_CANCEL, + }; + struct liveupdate_ioctl_get_state ligs =3D {.size =3D sizeof(ligs)}; + int i; + + ioctl(self->fd, LIVEUPDATE_IOCTL_GET_STATE, &ligs); + if (ligs.state !=3D LIVEUPDATE_STATE_NORMAL) + ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &cancel); + + for (i =3D 0; i < LUO_MAX_SUBSYSTEMS; i++) { + if (self->si[i].registered) + unregister_subsystem(self->fd_dbg, &self->si[i]); + munmap(self->si[i].data_page, page_size); + munmap(self->si[i].verify_page, page_size); + } + + close(self->fd); +} + +TEST_F(state, normal) +{ + struct liveupdate_ioctl_get_state ligs =3D {.size =3D sizeof(ligs)}; + + ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_GET_STATE, &ligs)); + ASSERT_EQ(ligs.state, LIVEUPDATE_STATE_NORMAL); +} + +TEST_F(state, prepared) +{ + struct liveupdate_ioctl_get_state ligs =3D {.size =3D sizeof(ligs)}; + struct liveupdate_ioctl_set_event prepare =3D { + .size =3D sizeof(prepare), + .event =3D LIVEUPDATE_PREPARE, + }; + struct
liveupdate_ioctl_set_event cancel = {
+		.size = sizeof(cancel),
+		.event = LIVEUPDATE_CANCEL,
+	};
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &prepare));
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_GET_STATE, &ligs));
+	ASSERT_EQ(ligs.state, LIVEUPDATE_STATE_PREPARED);
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &cancel));
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_GET_STATE, &ligs));
+	ASSERT_EQ(ligs.state, LIVEUPDATE_STATE_NORMAL);
+}
+
+TEST_F(state, sysfs_normal)
+{
+	ASSERT_EQ(LIVEUPDATE_STATE_NORMAL, get_sysfs_state());
+}
+
+TEST_F(state, sysfs_prepared)
+{
+	struct liveupdate_ioctl_set_event prepare = {
+		.size = sizeof(prepare),
+		.event = LIVEUPDATE_PREPARE,
+	};
+	struct liveupdate_ioctl_set_event cancel = {
+		.size = sizeof(cancel),
+		.event = LIVEUPDATE_CANCEL,
+	};
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &prepare));
+	ASSERT_EQ(LIVEUPDATE_STATE_PREPARED, get_sysfs_state());
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &cancel));
+	ASSERT_EQ(LIVEUPDATE_STATE_NORMAL, get_sysfs_state());
+}
+
+TEST_F(state, sysfs_frozen)
+{
+	struct liveupdate_ioctl_set_event prepare = {
+		.size = sizeof(prepare),
+		.event = LIVEUPDATE_PREPARE,
+	};
+	struct liveupdate_ioctl_set_event cancel = {
+		.size = sizeof(cancel),
+		.event = LIVEUPDATE_CANCEL,
+	};
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &prepare));
+
+	ASSERT_EQ(LIVEUPDATE_STATE_PREPARED, get_sysfs_state());
+
+	ASSERT_EQ(0, ioctl(self->fd_dbg, LIVEUPDATE_IOCTL_FREEZE, NULL));
+	ASSERT_EQ(LIVEUPDATE_STATE_FROZEN, get_sysfs_state());
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &cancel));
+	ASSERT_EQ(LIVEUPDATE_STATE_NORMAL, get_sysfs_state());
+}
+
+TEST_F(subsystem, register_unregister)
+{
+	ASSERT_EQ(0, register_subsystem(self->fd_dbg, &self->si[0]));
+	ASSERT_EQ(0, unregister_subsystem(self->fd_dbg, &self->si[0]));
+}
+
+TEST_F(subsystem, double_unregister)
+{
+	ASSERT_EQ(0, register_subsystem(self->fd_dbg, &self->si[0]));
+	ASSERT_EQ(0, unregister_subsystem(self->fd_dbg, &self->si[0]));
+	EXPECT_NE(0, unregister_subsystem(self->fd_dbg, &self->si[0]));
+	EXPECT_TRUE(errno == EINVAL || errno == ENOENT);
+}
+
+TEST_F(subsystem, register_unregister_many)
+{
+	int i;
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, register_subsystem(self->fd_dbg, &self->si[i]));
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, unregister_subsystem(self->fd_dbg, &self->si[i]));
+}
+
+TEST_F(subsystem, getdata_verify)
+{
+	struct liveupdate_ioctl_get_state ligs = {.size = sizeof(ligs), .state = 0};
+	struct liveupdate_ioctl_set_event prepare = {
+		.size = sizeof(prepare),
+		.event = LIVEUPDATE_PREPARE,
+	};
+	struct liveupdate_ioctl_set_event cancel = {
+		.size = sizeof(cancel),
+		.event = LIVEUPDATE_CANCEL,
+	};
+	int i;
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, register_subsystem(self->fd_dbg, &self->si[i]));
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &prepare));
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_GET_STATE, &ligs));
+	ASSERT_EQ(ligs.state, LIVEUPDATE_STATE_PREPARED);
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++) {
+		struct luo_arg_subsystem subsys_arg;
+
+		memset(&subsys_arg, 0, sizeof(subsys_arg));
+		snprintf(subsys_arg.name, LUO_NAME_LENGTH, "%s",
+			 self->si[i].test_name);
+		subsys_arg.data_page = self->si[i].verify_page;
+
+		ASSERT_EQ(0, run_luo_selftest_cmd(self->fd_dbg,
+						  LUO_CMD_SUBSYSTEM_GETDATA,
+						  &subsys_arg));
+		ASSERT_EQ(0, memcmp(self->si[i].data_page,
+				    self->si[i].verify_page,
+				    page_size));
+	}
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &cancel));
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_GET_STATE, &ligs));
+	ASSERT_EQ(ligs.state, LIVEUPDATE_STATE_NORMAL);
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, unregister_subsystem(self->fd_dbg, &self->si[i]));
+}
+
+TEST_F(subsystem, prepare_fail)
+{
+	struct liveupdate_ioctl_set_event prepare = {
+		.size = sizeof(prepare),
+		.event = LIVEUPDATE_PREPARE,
+	};
+	struct liveupdate_ioctl_set_event cancel = {
+		.size = sizeof(cancel),
+		.event = LIVEUPDATE_CANCEL,
+	};
+	int i;
+
+	snprintf(self->si[LUO_MAX_SUBSYSTEMS - 1].test_name, LUO_NAME_LENGTH,
+		 NAME_PREPARE_FAIL ".%d", LUO_MAX_SUBSYSTEMS - 1);
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, register_subsystem(self->fd_dbg, &self->si[i]));
+
+	ASSERT_EQ(-1, ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &prepare));
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, unregister_subsystem(self->fd_dbg, &self->si[i]));
+
+	snprintf(self->si[LUO_MAX_SUBSYSTEMS - 1].test_name, LUO_NAME_LENGTH,
+		 NAME_NORMAL ".%d", LUO_MAX_SUBSYSTEMS - 1);
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, register_subsystem(self->fd_dbg, &self->si[i]));
+
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &prepare));
+	ASSERT_EQ(0, ioctl(self->fd_dbg, LIVEUPDATE_IOCTL_FREEZE, NULL));
+	ASSERT_EQ(0, ioctl(self->fd, LIVEUPDATE_IOCTL_SET_EVENT, &cancel));
+	ASSERT_EQ(LIVEUPDATE_STATE_NORMAL, get_sysfs_state());
+
+	for (i = 0; i < LUO_MAX_SUBSYSTEMS; i++)
+		ASSERT_EQ(0, unregister_subsystem(self->fd_dbg, &self->si[i]));
+}
+
+TEST_HARNESS_MAIN
-- 
2.50.1.565.gc32cd1483b-goog

From nobody Sun Oct 5 07:24:31 2025
Received: from mail-qv1-f49.google.com (mail-qv1-f49.google.com [209.85.219.49]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 178212571B9 for ; Thu, 7 Aug 2025 01:45:24 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.49
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531127; cv=none;
b=u+sTLdIsDbdAsVILLUJQIueNLpMnXA9+5Btz3Ywg9/JUihspiZxYInh3Zpy58XoNuhURbTxmLdsaLTRPDd+Vt0n58NRCZETUizTWpzoUPsQQ9KDL43PqhKpJwPd5138Uv1MuCAq9h6RU8QheuRHlXDgvZUoICiDBZA/eftvmkTk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531127; c=relaxed/simple; bh=UP+tTKO8bTmYY5fR3f5WgKrqzkcSSflb/PbEDUyLyGI=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=HM7RBE2NCspvnvCgS5eWPLblGAFw3O+Od9fZKWfda5l3Z3DdTkf+IIp18bnUoGMfL9IUpp9c8k8KwDudMCC0n/sezx/H2/Q5WuGczmf+DsXjMJMkuN7B7fmxBhBF11ObnUKtoAUp6eU8hJ/I52HNzLJQXXJYf2POxOr3qtdpMaY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=Q31CMqbt; arc=none smtp.client-ip=209.85.219.49 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="Q31CMqbt" Received: by mail-qv1-f49.google.com with SMTP id 6a1803df08f44-70736b2ea12so3377776d6.1 for ; Wed, 06 Aug 2025 18:45:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531123; x=1755135923; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=yGzTQ+v+K/aDaHLB/ImuEc3dGBvilPrPVquoyLgmTGc=; b=Q31CMqbtC/2m7/iYzYo6w3yVUrIPBZ2T4f3K93wetHSduR9Q58N/8m4gax2mIpxC4W OWTBVnzfGngopUnsG4Md3DnYxOV0gmC0yXFYtvirvn37qWZ3Tbe5hKpUhD72OqpHgDr5 MyTVIYob57EMv+igLq5GantSC4GmLxiUKFZs0W+dXKVWg2xMC0aoIo3PXrVIM/2i8jA8 OYkxZUJ18v2ce+Rf/mPcgxygjsmeHXDjA2M23NLTUezFlTf0Vk95pPpArWsp0WrbbKwm LBxtXiv6dazu2CIerewSSocUVZmxxQoU2eIAxkDsc5cVzjpBqKv5uEyToKY72ftatRSs d+dQ== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531123; x=1755135923; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=yGzTQ+v+K/aDaHLB/ImuEc3dGBvilPrPVquoyLgmTGc=; b=jN7Kw1KnJOrxb332LqTNFeMsxsUkjncQDZ5TGH08gjc/lx068MIpspAgu5YPZcNPPk HFbGGxcpCOqPJbEbgvmSO2MnvABnSDDzWzzI9iKInZmOeg/0vuBhxC1Bw6uH1RHs5Z6g oiC9VuAJ4hftN0TsWGXne6GddGiJc83ZC976FsfTfVaM/j/zW/93AiB02ElRyloDX0yX 64fX5JRhhzdsRCqNwfKGrbtqSGET9GAgm0dIfLLrNXtuC6kcU/IpNNoqRqpxyPhFluH7 UYszhH9JD2q7o96X7ZUBKTuT6EOhKOpAQrV5PQNS7efzjUvUtDGHhg2c3OPJbaEYKhxm vkUA== X-Forwarded-Encrypted: i=1; AJvYcCXDQi6IlHddoBQC2l6GVxglyIHiTfeWFTsHtXa46OSwEc+qjKWofuXQashw4pSHITP8FQg7vMttqXg+DQI=@vger.kernel.org X-Gm-Message-State: AOJu0YxzEJo6IFt7y8+dsb55VaEkKGgZeauudyleC3oUIvrXv8gyZovI 8nTgVmQVVEbcB6Qkqs+OxTscbN3GwMbUO/wljM3StV5jDfBTGmsHl6RDAf20sbFLnLY= X-Gm-Gg: ASbGnctlI9BaJY88ZwJMb/8LLGfcUh2U8v1zPUGDe/j2qe83siBBCIRbdHtt6C5FI/H u3tDHtlCaci2lH0ts5x9E1aQK4Fr/UY/aqB2FHbsEewe4uvYGV0GKvtthsp+b6hYcYAX3I7wDOm NUREWRGbFX1prMuoxKZVj2zR8HhNFQc6fBwRB7xYoUMMY3hacDelvKgqsUjD717VOr2XJzinTwT 7T9PU1zNtk02u6u4fZbubj5TKTjvtVQo4jrhhCSL8MRKZDyx3X5Bq1hk2H+edDYbyJpaW6Ybi+T p0s+1O9/yIJe1ZXCtljV6tOOGeUKZdx8Yy/+q/3S6sHNp9DwiS2GPpD3wG99DGAIqYqta01PUxE P3a3yNJ7c3sM4Tu8F5N+4WFZ5m+gu9YUqt/PUzBGWuJMV6olA1ZQuja/7qFpCKAQRuOtg0gXpT+ Z0g4Gu5ONiaBJdXgSCOyc6qDQ= X-Google-Smtp-Source: AGHT+IFHcYm7yT/4Xvuv4TWFZXCA+gfPfNAN3UieGuxKRFyW9+H99+FaVZtYe/cgGd4cOSeq4i1AXA== X-Received: by 2002:a05:6214:2522:b0:707:5dac:be09 with SMTP id 6a1803df08f44-7097add7f30mr59186586d6.9.1754531122925; Wed, 06 Aug 2025 18:45:22 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.45.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:45:22 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 24/30] docs: add luo documentation Date: Thu, 7 Aug 2025 01:44:30 +0000 Message-ID: <20250807014442.3829950-25-pasha.tatashin@soleen.com> X-Mailer: 
git-send-email 2.50.1.565.gc32cd1483b-goog
In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com>
References: <20250807014442.3829950-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Add the documentation files for the Live Update Orchestrator

Signed-off-by: Pasha Tatashin
---
 Documentation/admin-guide/index.rst        |  1 +
 Documentation/admin-guide/liveupdate.rst   | 16 +++++++
 Documentation/core-api/index.rst           |  1 +
 Documentation/core-api/liveupdate.rst      | 50 ++++++++++++++++++++++
 Documentation/userspace-api/index.rst      |  1 +
 Documentation/userspace-api/liveupdate.rst | 25 +++++++++++
 6 files changed, 94 insertions(+)
 create mode 100644 Documentation/admin-guide/liveupdate.rst
 create mode 100644 Documentation/core-api/liveupdate.rst
 create mode 100644 Documentation/userspace-api/liveupdate.rst

diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
index 259d79fbeb94..3f59ccf32760 100644
--- a/Documentation/admin-guide/index.rst
+++ b/Documentation/admin-guide/index.rst
@@ -95,6 +95,7 @@ likely to be of interest on almost any system.
    cgroup-v2
    cgroup-v1/index
    cpu-load
+   liveupdate
    mm/index
    module-signing
    namespaces/index
diff --git a/Documentation/admin-guide/liveupdate.rst b/Documentation/admin-guide/liveupdate.rst
new file mode 100644
index 000000000000..ff05cc1dd784
--- /dev/null
+++ b/Documentation/admin-guide/liveupdate.rst
@@ -0,0 +1,16 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=================
+Live Update sysfs
+=================
+:Author: Pasha Tatashin
+
+LUO sysfs interface
+===================
+.. kernel-doc:: kernel/liveupdate/luo_sysfs.c
+   :doc: LUO sysfs interface
+
+See Also
+========
+
+- :doc:`Live Update Orchestrator `
diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst
index a03a99c2cac5..a8b7d1417f0a 100644
--- a/Documentation/core-api/index.rst
+++ b/Documentation/core-api/index.rst
@@ -137,6 +137,7 @@ Documents that don't fit elsewhere or which have yet to be categorized.
    :maxdepth: 1
 
    librs
+   liveupdate
    netlink
 
 .. only:: subproject and html
diff --git a/Documentation/core-api/liveupdate.rst b/Documentation/core-api/liveupdate.rst
new file mode 100644
index 000000000000..41c4b76cd3ec
--- /dev/null
+++ b/Documentation/core-api/liveupdate.rst
@@ -0,0 +1,50 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+========================
+Live Update Orchestrator
+========================
+:Author: Pasha Tatashin
+
+.. kernel-doc:: kernel/liveupdate/luo_core.c
+   :doc: Live Update Orchestrator (LUO)
+
+LUO Subsystems Participation
+============================
+.. kernel-doc:: kernel/liveupdate/luo_subsystems.c
+   :doc: LUO Subsystems support
+
+LUO Preserving File Descriptors
+===============================
+.. kernel-doc:: kernel/liveupdate/luo_files.c
+   :doc: LUO file descriptors
+
+Public API
+==========
+.. kernel-doc:: include/linux/liveupdate.h
+
+.. kernel-doc:: kernel/liveupdate/luo_core.c
+   :export:
+
+.. kernel-doc:: kernel/liveupdate/luo_subsystems.c
+   :export:
+
+.. kernel-doc:: kernel/liveupdate/luo_files.c
+   :export:
+
+Internal API
+============
+.. kernel-doc:: kernel/liveupdate/luo_core.c
+   :internal:
+
+.. kernel-doc:: kernel/liveupdate/luo_subsystems.c
+   :internal:
+
+.. kernel-doc:: kernel/liveupdate/luo_files.c
+   :internal:
+
+See Also
+========
+
+- :doc:`Live Update uAPI `
+- :doc:`Live Update SysFS `
+- :doc:`/core-api/kho/concepts`
diff --git a/Documentation/userspace-api/index.rst b/Documentation/userspace-api/index.rst
index b8c73be4fb11..ee8326932cb0 100644
--- a/Documentation/userspace-api/index.rst
+++ b/Documentation/userspace-api/index.rst
@@ -62,6 +62,7 @@ Everything else
 
    ELF
    netlink/index
+   liveupdate
    sysfs-platform_profile
    vduse
    futex2
diff --git a/Documentation/userspace-api/liveupdate.rst b/Documentation/userspace-api/liveupdate.rst
new file mode 100644
index 000000000000..70b5017c0e3c
--- /dev/null
+++ b/Documentation/userspace-api/liveupdate.rst
@@ -0,0 +1,25 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+================
+Live Update uAPI
+================
+:Author: Pasha Tatashin
+
+ioctl interface
+===============
+.. kernel-doc:: kernel/liveupdate/luo_ioctl.c
+   :doc: LUO ioctl Interface
+
+ioctl uAPI
+===========
+.. kernel-doc:: include/uapi/linux/liveupdate.h
+
+LUO selftests ioctl
+===================
+.. kernel-doc:: kernel/liveupdate/luo_selftests.c
+   :doc: LUO Selftests
+
+See Also
+========
+
+- :doc:`Live Update Orchestrator `
-- 
2.50.1.565.gc32cd1483b-goog

From nobody Sun Oct 5 07:24:31 2025
Received: from mail-qv1-f41.google.com (mail-qv1-f41.google.com [209.85.219.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B13C225C802 for ; Thu, 7 Aug 2025 01:45:25 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.41
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531128; cv=none; b=qvaveLvfwkRZsZetVUTH2tV7b8w2n59XcqMMk3a5k0+alp6T94MltJgj6espdHL3mYL1YdIhFJeDZjz2L+a7xfs5HGqye/x1qNVuSOcXILbUFy0Q9NGvJAzZGDrm0xsy2OMca5waBkQDfa6mkZHmOPw4lkLwUfZnCs7FbhiN0nI=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531128; c=relaxed/simple; bh=MQ63LnQeT5I2XWVR3/9t21kTUDqY3sDHkmER0gtcW/M=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=iX9OGdtM7929voEy5dOi1bj/KwmqMV9C6BceDzNbc8FczVvDT0sBh9jJaXWkkm5BgHpgdx0xHvaq/mIYO3HXlA9QvndHa0/vGe5n2SBf016zRFRhNfpElcH7ywwWWUchbz4ZkgFqerG1xfjU4Q0xN0P5c2zjO1csYZ+qj3TGd3M=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=Uun+cnp/; arc=none smtp.client-ip=209.85.219.41
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="Uun+cnp/"
Received: by mail-qv1-f41.google.com with SMTP id 6a1803df08f44-7073075c767so6817866d6.3 for ; Wed, 06 Aug 2025 18:45:25 -0700
(PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531124; x=1755135924; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=QA/vUYeCxNkI6poJfbjFNlz5+YRzYcbbQ+LvASsW6yI=; b=Uun+cnp/XZzGsvmDPlAnMsb6mrtGv5dYDxU0XO8gk/1VU77Obpair0DmYTgs11wixy OEvZkYCAJnBva9a70vWXvsavXZ4astcUadDqnQZOlrHFrWk5w/Dgab6t1Yr4bdEofaQJ uVYfrN4ICKA63QtOGO+reRstzuz4o5FnyutjFC1wv5COdWV3rkVxWL0cgs1NsXcKUGKp yyo8h6baecjst6NVS7cbvibmGvPtTJYKh76QOTQ0rfRGdLrRF3bd+XaJ8DtiYw6wTLUu hRlvoi3s4xhmoDs+Coyx8VZC5jj3DpTlml/n/UfO5TUSRPiyrG1fLlah/nfGRkJx0ivL hEAQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531124; x=1755135924; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=QA/vUYeCxNkI6poJfbjFNlz5+YRzYcbbQ+LvASsW6yI=; b=rs699NujsVmF9PKnmIt4JLYnp6/5e/PV4Wvq2HrkjPnhbUid90bM7ByjUrVyNYrlM0 D6nIzhkAo01YHUN9Zq0LzLHeQDzVUbkixlBLXzx0VYLUYBFzldAS8/7Mc4S9Pe+XR6yu 6Zk6HR1luQU0y3qVCBQEjOM1Dzh5f2DKQSVG+esiK0pepjbIWif+YW5D8JGLJHNBUAqz aZ/l8+Jm8pNWdS+ft7CMcgsI9B31w+JG8WzDoX11PfXrwG8nOesg7c9Y2tircRnYTuCp jy/9QwEuzdPX3uj2lUCdy7nxENbVxUitOqLtyxWyAY+ynRoAckCiRBK+q3ei/2RPASHc CKCw== X-Forwarded-Encrypted: i=1; AJvYcCXf9gvInVY2Cu1e/webUMQlQhhdmbGJZuV/Po6dJex/96fcAtFmXEL4FP/NfXDBy7p1JjKvoNXXVpBgUhg=@vger.kernel.org X-Gm-Message-State: AOJu0YwJE5FyvYt7V7b7jFagkEOBcifKReYd6nXQWQ/4eln2DzSK4O6K DBQ2zKUL0D2YEYElVF5ruuMmupS0dOTeYR/dBxGsSl+jAVilteMxKd8qXcFuguqrM8M= X-Gm-Gg: ASbGncsdKpAZWMO6AWA9A6VaxRXkvstInxl3ZXV+Yby1pnJJp2KGSRdX7KxfNZfIRId uMaFrUTazuw7ivDfMb0TzTp1RLPjlvqAdQxp3SjCRHCE/y5bVFCwZGI+EBo+GbMJ/glvLmQqI3v y+D4HzmbwAxt8cqZfmDpYFtWs8g7+r5F9/cEL8uYAEKUF8vbfRsaBHOmQ9oi8xI16C3U3lvzQhX Z40CN/+2ewc59CZ56l9i0yrnj9u1mStP3h0ZG/63WwxvcPKF7nuxO57fet6Jg2z51Wd1R3MjRQP 
uBR0y58+lFTED5LR2VlmyySXdsEAvepzvoE75gAGTLW/YutSy1VNEFvr1ZLGbzoXom93oaLh2ha 9qSwGNuuOAvRMbBB4txzAfBlZXFRGiddmSuJg6QkeFXCTHFEiNcWcJJErcVQ28tKKeQDYzzJ1k7 0wiku+h99sAXJy5hVA/QjZKP0= X-Google-Smtp-Source: AGHT+IEmxyIIEpF/zYXpWh7lst20k+6O+/3teCC6bRoX7mK0ce9gFr80XAlcr6QckhUk2jTkrcE1vA== X-Received: by 2002:a05:6214:27ea:b0:707:3314:793d with SMTP id 6a1803df08f44-7097969ff73mr76803076d6.37.1754531124250; Wed, 06 Aug 2025 18:45:24 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. [34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.45.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:45:23 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, 
andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 25/30] MAINTAINERS: add liveupdate entry Date: Thu, 7 Aug 2025 01:44:31 +0000 Message-ID: <20250807014442.3829950-26-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com> References: <20250807014442.3829950-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add a MAINTAINERS file entry for the new Live Update Orchestrator introduced in previous patches. 
Signed-off-by: Pasha Tatashin
---
 MAINTAINERS | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 35cf4f95ed46..b88b77977649 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14207,6 +14207,19 @@ F:	kernel/module/livepatch.c
 F:	samples/livepatch/
 F:	tools/testing/selftests/livepatch/
 
+LIVE UPDATE
+M:	Pasha Tatashin
+L:	linux-kernel@vger.kernel.org
+S:	Maintained
+F:	Documentation/ABI/testing/sysfs-kernel-liveupdate
+F:	Documentation/admin-guide/liveupdate.rst
+F:	Documentation/core-api/liveupdate.rst
+F:	Documentation/userspace-api/liveupdate.rst
+F:	include/linux/liveupdate.h
+F:	include/uapi/linux/liveupdate.h
+F:	kernel/liveupdate/
+F:	tools/testing/selftests/liveupdate/
+
 LLC (802.2)
 L:	netdev@vger.kernel.org
 S:	Odd fixes
-- 
2.50.1.565.gc32cd1483b-goog

From nobody Sun Oct 5 07:24:31 2025
Received: from mail-qt1-f176.google.com (mail-qt1-f176.google.com [209.85.160.176]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7C3122609D9 for ; Thu, 7 Aug 2025 01:45:27 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.176
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531130; cv=none; b=cnjyhJkWwXwJO2PzWBiWveRY4vrJOo3hDvF6NmHWl1LvkTYvCbWWz5byZdRcD/MxaIlLJYI0QRr414/3pYn0EcjCflO/MmDj3BL7D7POHwMF3Dq1jecZtk0dxhuk/ezRVxy7P4Ih+wiMnLm/LCSW+HNti8OwOajOYU+u+jGFHLc=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531130; c=relaxed/simple; bh=VigCaModk7XbRG7Y45+D24UjlgsEkrae9/KiWmPe5Y8=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ZO+mwnU3sovYUg2xhldHq9eFDOdF0Pnxb4RZGMaQ4wE3eoJD9QcKYx//Vr0FXUCwT0NRqwb38xKRZPlE9teg+oRxb1TKZujieAIBG4we5de5t8N2/4y5AQk5Ir1IHwLPjCe0HNnuTJdnq+ip+82EumvJYMsURFh3jz9mn3/wxY0=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject
dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=LPuEzxR1; arc=none smtp.client-ip=209.85.160.176 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="LPuEzxR1" Received: by mail-qt1-f176.google.com with SMTP id d75a77b69052e-4b070e57254so4880201cf.3 for ; Wed, 06 Aug 2025 18:45:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531126; x=1755135926; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=XeT2iJHyifY4gFOC9716oziuoen1egRU2pBc2ZJ5N/k=; b=LPuEzxR1BC91WMfiEEHE4f3Y2aSQpB+ovSYFpb2lzFt4VN4qOOCvZt9ouc3ERW9Lon a/nME5FxSO8yyR8mWCu1TgSXB3iiaxaQsq2lmOCNobChf8RD0U3CNMC7U/gbi1AKmTOE 3bOlgvaBhXcbwYQDyAzgUZshLnhG+e1zEWQDQb4kX+LPKsmSj4EJPZxTi2Jc/nJvJnYE FyHaiFn96LEM9xQ9qj6vD6ePUJQYx7Q1xKf5anFSjLqqgp8TMZOHI8arbpLOdtVeNPBB SFVXm/bdH3jS59nsEGpl4rDG6SRb6P1pHpjOMwAmlnqcxDFUIQcpfQ399Udag5PMdf3v sNJw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531126; x=1755135926; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=XeT2iJHyifY4gFOC9716oziuoen1egRU2pBc2ZJ5N/k=; b=ZBWO7/s6Ooz1ue+mUe3V+0SvYOuO56IZ2v1a4yuotPOz1+Y4zFbbFp2ULpXbFO8pw/ NpIdYpgrj+KVeWVdIjdOnqj2hS4kgT9mTdsxj6RztYxUmgNnkAkufpL2MkimSu7+7ade /Nh9JErlWiBC5OM50buHFi/GaY7MNtFi46mGpb+q+B5PLMRy/luwOzbQvw/cyeDr/QEY uspHy0aBxJHIUP/9CcqM8YbrYdYvnLkYnJQF3vM2RwGaN1NOcOgM2Hm8h46NLqv2zUrn obRoQUwj8sZrGxXtDaMAO5p51tgQy95Db/erl8CS7cTNEAUrIXrfcjO5zg8Y9kDa63qv 
fKNQ== X-Forwarded-Encrypted: i=1; AJvYcCVgkA9RflTw/S9eFaJe7AJVrTOW7iNMdjsR42Y8CpWhPJRwx50EmhNie+1zMZXAz4U+tLvbwIU/dfAECX8=@vger.kernel.org X-Gm-Message-State: AOJu0Yyg2cQIPdDfckkUdU9++XlCFuhKl/4OODpdb2WnrWDePKYTClF8 PKYHfDVvd0+roes3VE0M1185WqgoRnSkbLp4sK59+igUFh7VsjtFLFx9VPm2GxyUAqI= X-Gm-Gg: ASbGncsOZ+61b5fpv676rr+2XixLzJvFkgRu2IGXUtjYjiVgpNzDwbZYM2K1g2rNMeg ZQr0uVa3XZNnMrfjMl6N8eo/knADZmp8sZKQ7wjD2T+bLd/dfqohqSIj7zaq5VOd5PYJbR49dOm u+UFPOFl4gpH2W6y+OdV4oCyottEiXGhN9UTduIVNFgxP/NrRVisMAb0VBUYbRlGN8ZvnAOBsxU y/5z/XM7opPnqQ3iYTwSn6wSRLWjvNiyYOZLKiBUMFFlp44mdL12A4ngkLyI/U9geqOtMUSCkqe 3cqtZa1JwKS3bgRIWUQwN7tHEuc+QrPfBuROl3jot8ZNK12t0T8eHcNSrNygare8Qi4ZOSB5rjB 09shg3UUDzHLV6XD1LLCKmBkWSz1x+Y/CSnlnOxiIkdvgXYeQJk5CKhry3mIKmKwhszbP9bljwB 7NBtnfk0XOaOFY X-Google-Smtp-Source: AGHT+IHk0tb1uB19BidWcJjqq6Q8qaHf0qnKstZyA8KJ/vdb4dYPPaNrNMTJmHZnrPtb3+FE1z9qUg== X-Received: by 2002:a05:6214:5099:b0:707:61a9:8bdf with SMTP id 6a1803df08f44-709795809c9mr79245396d6.22.1754531125709; Wed, 06 Aug 2025 18:45:25 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.45.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:45:25 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 26/30] mm: shmem: use SHMEM_F_* flags instead of VM_* flags Date: Thu, 7 Aug 2025 01:44:32 +0000 Message-ID: 
<20250807014442.3829950-27-pasha.tatashin@soleen.com> X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com> References: <20250807014442.3829950-1-pasha.tatashin@soleen.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" From: Pratyush Yadav shmem_inode_info::flags can have the VM flags VM_NORESERVE and VM_LOCKED. These are used to suppress pre-accounting or to lock the pages in the inode respectively. Using the VM flags directly makes it difficult to add shmem-specific flags that are unrelated to VM behavior since one would need to find a VM flag not used by shmem and re-purpose it. Introduce SHMEM_F_NORESERVE and SHMEM_F_LOCKED which represent the same information, but their bits are independent of the VM flags. Callers can still pass VM_NORESERVE to shmem_get_inode(), but it gets transformed to the shmem-specific flag internally. No functional changes intended. Signed-off-by: Pratyush Yadav Signed-off-by: Pasha Tatashin --- include/linux/shmem_fs.h | 6 ++++++ mm/shmem.c | 30 +++++++++++++++++------------- 2 files changed, 23 insertions(+), 13 deletions(-) diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h index 6d0f9c599ff7..923f0da5f6c4 100644 --- a/include/linux/shmem_fs.h +++ b/include/linux/shmem_fs.h @@ -10,6 +10,7 @@ #include #include #include +#include =20 struct swap_iocb; =20 @@ -19,6 +20,11 @@ struct swap_iocb; #define SHMEM_MAXQUOTAS 2 #endif =20 +/* Suppress pre-accounting of the entire object size. */ +#define SHMEM_F_NORESERVE BIT(0) +/* Disallow swapping. 
*/ +#define SHMEM_F_LOCKED BIT(1) + struct shmem_inode_info { spinlock_t lock; unsigned int seals; /* shmem seals */ diff --git a/mm/shmem.c b/mm/shmem.c index e2c76a30802b..8e6b3f003da5 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -175,20 +175,20 @@ static inline struct shmem_sb_info *SHMEM_SB(struct s= uper_block *sb) */ static inline int shmem_acct_size(unsigned long flags, loff_t size) { - return (flags & VM_NORESERVE) ? + return (flags & SHMEM_F_NORESERVE) ? 0 : security_vm_enough_memory_mm(current->mm, VM_ACCT(size)); } =20 static inline void shmem_unacct_size(unsigned long flags, loff_t size) { - if (!(flags & VM_NORESERVE)) + if (!(flags & SHMEM_F_NORESERVE)) vm_unacct_memory(VM_ACCT(size)); } =20 static inline int shmem_reacct_size(unsigned long flags, loff_t oldsize, loff_t newsize) { - if (!(flags & VM_NORESERVE)) { + if (!(flags & SHMEM_F_NORESERVE)) { if (VM_ACCT(newsize) > VM_ACCT(oldsize)) return security_vm_enough_memory_mm(current->mm, VM_ACCT(newsize) - VM_ACCT(oldsize)); @@ -206,7 +206,7 @@ static inline int shmem_reacct_size(unsigned long flags, */ static inline int shmem_acct_blocks(unsigned long flags, long pages) { - if (!(flags & VM_NORESERVE)) + if (!(flags & SHMEM_F_NORESERVE)) return 0; =20 return security_vm_enough_memory_mm(current->mm, @@ -215,7 +215,7 @@ static inline int shmem_acct_blocks(unsigned long flags= , long pages) =20 static inline void shmem_unacct_blocks(unsigned long flags, long pages) { - if (flags & VM_NORESERVE) + if (flags & SHMEM_F_NORESERVE) vm_unacct_memory(pages * VM_ACCT(PAGE_SIZE)); } =20 @@ -1588,7 +1588,7 @@ int shmem_writeout(struct folio *folio, struct swap_i= ocb **plug, int nr_pages; bool split =3D false; =20 - if ((info->flags & VM_LOCKED) || sbinfo->noswap) + if ((info->flags & SHMEM_F_LOCKED) || sbinfo->noswap) goto redirty; =20 if (!total_swap_pages) @@ -2971,15 +2971,15 @@ int shmem_lock(struct file *file, int lock, struct = ucounts *ucounts) * ipc_lock_object() when called from shmctl_do_lock(), * no 
serialization needed when called from shm_destroy().
 	 */
-	if (lock && !(info->flags & VM_LOCKED)) {
+	if (lock && !(info->flags & SHMEM_F_LOCKED)) {
 		if (!user_shm_lock(inode->i_size, ucounts))
 			goto out_nomem;
-		info->flags |= VM_LOCKED;
+		info->flags |= SHMEM_F_LOCKED;
 		mapping_set_unevictable(file->f_mapping);
 	}
-	if (!lock && (info->flags & VM_LOCKED) && ucounts) {
+	if (!lock && (info->flags & SHMEM_F_LOCKED) && ucounts) {
 		user_shm_unlock(inode->i_size, ucounts);
-		info->flags &= ~VM_LOCKED;
+		info->flags &= ~SHMEM_F_LOCKED;
 		mapping_clear_unevictable(file->f_mapping);
 	}
 	retval = 0;
@@ -3123,7 +3123,9 @@ static struct inode *__shmem_get_inode(struct mnt_idmap *idmap,
 	spin_lock_init(&info->lock);
 	atomic_set(&info->stop_eviction, 0);
 	info->seals = F_SEAL_SEAL;
-	info->flags = flags & VM_NORESERVE;
+	info->flags = 0;
+	if (flags & VM_NORESERVE)
+		info->flags |= SHMEM_F_NORESERVE;
 	info->i_crtime = inode_get_mtime(inode);
 	info->fsflags = (dir == NULL) ? 0 :
 		SHMEM_I(dir)->fsflags & SHMEM_FL_INHERITED;
@@ -5862,8 +5864,10 @@ static inline struct inode *shmem_get_inode(struct mnt_idmap *idmap,
 /* common code */
 
 static struct file *__shmem_file_setup(struct vfsmount *mnt, const char *name,
-			loff_t size, unsigned long flags, unsigned int i_flags)
+			loff_t size, unsigned long vm_flags,
+			unsigned int i_flags)
 {
+	unsigned long flags = (vm_flags & VM_NORESERVE) ?
SHMEM_F_NORESERVE : 0;
 	struct inode *inode;
 	struct file *res;
 
@@ -5880,7 +5884,7 @@ static struct file *__shmem_file_setup(struct vfsmount *mnt, const char *name,
 		return ERR_PTR(-ENOMEM);
 
 	inode = shmem_get_inode(&nop_mnt_idmap, mnt->mnt_sb, NULL,
-				S_IFREG | S_IRWXUGO, 0, flags);
+				S_IFREG | S_IRWXUGO, 0, vm_flags);
 	if (IS_ERR(inode)) {
 		shmem_unacct_size(flags, size);
 		return ERR_CAST(inode);
-- 
2.50.1.565.gc32cd1483b-goog

From nobody Sun Oct 5 07:24:31 2025
Received: from mail-qv1-f50.google.com (mail-qv1-f50.google.com [209.85.219.50]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9E93A23506E for ; Thu, 7 Aug 2025 01:45:28 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.50
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531131; cv=none; b=bt0TEgo7B6JDdordDyeQ4BZsOX6lMJ/lrMHtHzFFuTNI7SVCQ2jpGjw2oDW3i8eDFY4tXD5VMSrPydKUUF3lMw23SvKCFxIK/q4qGIw8DYl4faYPngg2XBmmMAkjL2up0ieeKVbFfpLyoOoRImCHJfyE7bx+vUY2hQb7W9RHHbw=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531131; c=relaxed/simple; bh=alpPo77l/7pgZG8VakBmdAb8i5nQRPDoWfvaWJcTLlE=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Cz3dhQUJ7z7N61s5rILgQ+oRp673w+tVskbG1JSB7qPTbzB62yYrxZOAIngsd6QTlTSs0JWiUpChEUnWsDkNNzP8/JikXAfKaT7MhcbpNz1+kKl7b4psWN7nf3+3n/YwcorfC39lWGVp0MIy6iT5lqD9t9TTeeITFsIoK+XUnkI=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=Ssc+rCxL; arc=none smtp.client-ip=209.85.219.50
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="Ssc+rCxL" Received: by mail-qv1-f50.google.com with SMTP id 6a1803df08f44-7073075c767so6818236d6.3 for ; Wed, 06 Aug 2025 18:45:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531127; x=1755135927; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=jQny4bvpzR8mZbBN1/IVSyxQ9U+JDItGyGw2v+12Jn8=; b=Ssc+rCxL+/PekNC5vdpImp4eezZQQXUkesj3lXT7/LRqLT5EuydORJWfACh5Cn2cIY YBcBm306eWtSCd6DMYtn8CAx4zwsq1WD/ZSlEakuZyk0n/qkn4VIfvmlDDoKdqDckvxE Lh0QJtZiZl2GNLax0IITljlWdsRn20FcbKc0EBpqeTJldwIl59uNYW8QL9Dl+sFrrGkz j10+/4VMwzmnNppVZw/zQs9Lgx+ocFg7fbqf7ZozcMRQHeIyHquf6szREDwAALh+8kqG /HzRidnV0987oBRaJS5VxBh7mjGKe9LUp/SFKGa27PKa+SyW+BmUghIMRcBnmlx14Hjd YeEw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531127; x=1755135927; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=jQny4bvpzR8mZbBN1/IVSyxQ9U+JDItGyGw2v+12Jn8=; b=ScPuPzWLdKXwPCDlg60uTSoPRJyQDOoGAgg3zo8T2hls8gmia12j5sYlJlWSzXX6BR 9w+41X0byBzQBFjPYgUlUGt6BF9NN63vlP0/gAsqA5zsprdCTPMp9cDXtrDKlkfeP52f nIzfjbxvikGbG0oe8xDCqstVC00SWtrEMfjsKerwxUVPviMBtiBlQ9ciEaZMiZtGuwOw l4Q4radO+PLyyhheBoN/o3LOYLWMBuJSkTOU1c6VP2HTX7CoNQApm57gKj8uug6Eok54 x+ZhVjhAOlGwbxnuXp14iy5qo5jivX208C9SZxIjQzvV2cijszZjQUt/86fUytCW+USa j+9Q== X-Forwarded-Encrypted: i=1; AJvYcCXhCYbXxlc/NjizFvGRFEIkVGyJBsXOJNFaPPup5DkIK2ZybagZu3KHNaBBguG8Fs+XL5Y/PvedTrkl1mQ=@vger.kernel.org X-Gm-Message-State: AOJu0YyEAHeI8sCVHMTfAtlxmGGK07v50awG1fnjZ0c+tAf6JjoeZBMe 4AAXDd6trwZDUzMFhsM/8OXAq2jTOJtR1oPKlGSDLSvg1bVIdEaQAEQcjEv1DmIfncg= X-Gm-Gg: ASbGncuaX8jzyMdR1n8YV/ywhrc2g164iMSLBOahxZlxHoKnAbFilM7hr/BAFFoR527 
G6f2IFI8zeLafpV8fJ7T7K/LP5uPCqv1WbzxDnw1ZQAjOcG2XRqMl6QRrkNXXGjd5epxqhKsM1u Gvx9XrJFOa3mtHtfqT8Mtm7dBVmG5xwRLUkez+n8Nng1R2/3eaG4ZeaGx9Qvpr0OS1cWVm8pcB0 vezfoRtTMHt5WNjKyg379Oj4IQVsEMcNRxZ2YLYAVH36XsNURQPW9eP7J1uWIj7P98/1kmzLNPJ HbzhDgJc2p09cm9RGT4NjOYXGQmh+pH4vAz5nee+IREEK0ndVqok48EXJA96OWJr391gQkhhUJI GySHP8Uif+S1vvz/z9plS8AB004COOOWnwTlN6eRlxAT7NEwK5Vw8l5ngsf8fCDgLgqmvmVJFbs y+jnB7k8YYL1jP X-Google-Smtp-Source: AGHT+IFqMT0HUt9lre9uw9F3rpS7vpTxQLIi22l4S1gL4FMPIdcVTcCnIgm3Am+TvXdeRU0xrWJBwQ== X-Received: by 2002:ad4:5761:0:b0:707:5694:89e4 with SMTP id 6a1803df08f44-709796c75b1mr72234786d6.47.1754531126973; Wed, 06 Aug 2025 18:45:26 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. [34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.45.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:45:26 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, 
	bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com
Subject: [PATCH v3 27/30] mm: shmem: allow freezing inode mapping
Date: Thu, 7 Aug 2025 01:44:33 +0000
Message-ID: <20250807014442.3829950-28-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog
In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com>
References: <20250807014442.3829950-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Pratyush Yadav

To prepare a shmem inode for live update via the Live Update
Orchestrator (LUO), its index -> folio mappings must be serialized. Once
the mappings are serialized, they must not change, since any change
would make the serialized data inconsistent. This can be ensured by
pinning the folios to avoid migration, and by making sure no folios can
be added to or removed from the inode.

While mechanisms to pin folios already exist, the only way to stop
folios from being added or removed is to use the grow and shrink file
seals. But file seals come with their own semantics, one of which is
that they can't be removed. This doesn't work with live update, since a
live update can be cancelled or can fail, in which case the seals would
have to be removed and the file's normal functionality restored.
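
The seal semantics described above can be observed from userspace. The
following is a minimal sketch and not part of this patch. It assumes
Linux with a glibc that provides memfd_create() (2.27 or later), and
make_size_sealed_memfd() and try_resize() are hypothetical helper names
chosen for illustration. Once F_SEAL_GROW | F_SEAL_SHRINK are set,
resizing fails with EPERM, and fcntl() offers F_ADD_SEALS but no
operation that drops a seal.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a sealable memfd of `size` bytes and add
 * the grow and shrink seals. Returns the fd, or -1 on error.
 */
static int make_size_sealed_memfd(off_t size)
{
	int fd = memfd_create("seal-demo", MFD_CLOEXEC | MFD_ALLOW_SEALING);

	if (fd < 0)
		return -1;

	if (ftruncate(fd, size) < 0 ||
	    fcntl(fd, F_ADD_SEALS, F_SEAL_GROW | F_SEAL_SHRINK) < 0) {
		close(fd);
		return -1;
	}

	return fd;
}

/*
 * Hypothetical helper: try to resize the file. Returns 0 on success or
 * the errno value reported by ftruncate() on failure.
 */
static int try_resize(int fd, off_t size)
{
	assert(fd >= 0);
	return ftruncate(fd, size) < 0 ? errno : 0;
}
```

Calling try_resize() on a descriptor returned by
make_size_sealed_memfd() is expected to report EPERM for both a larger
and a smaller size, and that stays true for the lifetime of the file.
This is the property that makes seals unsuitable for an operation that
must be reversible.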

Introduce SHMEM_F_MAPPING_FROZEN to mark the mapping as frozen instead.
It is internal to shmem and is not directly exposed to userspace. It
functions similarly to F_SEAL_GROW | F_SEAL_SHRINK, but additionally
disallows hole punching, and can be removed.

Signed-off-by: Pratyush Yadav
Signed-off-by: Pasha Tatashin
---
 include/linux/shmem_fs.h | 17 +++++++++++++++++
 mm/shmem.c               | 12 +++++++++++-
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/include/linux/shmem_fs.h b/include/linux/shmem_fs.h
index 923f0da5f6c4..f68fc14f7664 100644
--- a/include/linux/shmem_fs.h
+++ b/include/linux/shmem_fs.h
@@ -24,6 +24,14 @@ struct swap_iocb;
 #define SHMEM_F_NORESERVE	BIT(0)
 /* Disallow swapping. */
 #define SHMEM_F_LOCKED		BIT(1)
+/*
+ * Disallow growing, shrinking, or hole punching in the inode. Combined with
+ * folio pinning, makes sure the inode's mapping stays fixed.
+ *
+ * In some ways similar to F_SEAL_GROW | F_SEAL_SHRINK, but can be removed and
+ * isn't directly visible to userspace.
+ */
+#define SHMEM_F_MAPPING_FROZEN	BIT(2)
 
 struct shmem_inode_info {
 	spinlock_t		lock;
@@ -186,6 +194,15 @@ static inline bool shmem_file(struct file *file)
 	return shmem_mapping(file->f_mapping);
 }
 
+/* Must be called with inode lock taken exclusive.
*/
+static inline void shmem_i_mapping_freeze(struct inode *inode, bool freeze)
+{
+	if (freeze)
+		SHMEM_I(inode)->flags |= SHMEM_F_MAPPING_FROZEN;
+	else
+		SHMEM_I(inode)->flags &= ~SHMEM_F_MAPPING_FROZEN;
+}
+
 /*
  * If fallocate(FALLOC_FL_KEEP_SIZE) has been used, there may be pages
  * beyond i_size's notion of EOF, which fallocate has committed to reserving:
diff --git a/mm/shmem.c b/mm/shmem.c
index 8e6b3f003da5..ef57e2649a41 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1329,7 +1329,8 @@ static int shmem_setattr(struct mnt_idmap *idmap,
 		loff_t newsize = attr->ia_size;
 
 		/* protected by i_rwsem */
-		if ((newsize < oldsize && (info->seals & F_SEAL_SHRINK)) ||
+		if ((info->flags & SHMEM_F_MAPPING_FROZEN) ||
+		    (newsize < oldsize && (info->seals & F_SEAL_SHRINK)) ||
 		    (newsize > oldsize && (info->seals & F_SEAL_GROW)))
 			return -EPERM;
 
@@ -3352,6 +3353,10 @@ shmem_write_begin(const struct kiocb *iocb, struct address_space *mapping,
 			return -EPERM;
 	}
 
+	if (unlikely((info->flags & SHMEM_F_MAPPING_FROZEN) &&
+		     pos + len > inode->i_size))
+		return -EPERM;
+
 	ret = shmem_get_folio(inode, index, pos + len, &folio, SGP_WRITE);
 	if (ret)
 		return ret;
@@ -3725,6 +3730,11 @@ static long shmem_fallocate(struct file *file, int mode, loff_t offset,
 
 	inode_lock(inode);
 
+	if (info->flags & SHMEM_F_MAPPING_FROZEN) {
+		error = -EPERM;
+		goto out;
+	}
+
 	if (mode & FALLOC_FL_PUNCH_HOLE) {
 		struct address_space *mapping = file->f_mapping;
 		loff_t unmap_start = round_up(offset, PAGE_SIZE);
-- 
2.50.1.565.gc32cd1483b-goog

From nobody Sun Oct 5 07:24:31 2025
Received: from mail-qv1-f41.google.com (mail-qv1-f41.google.com [209.85.219.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 763B326B2AC for ; Thu, 7 Aug 2025 01:45:30 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.41
ARC-Seal: i=1; a=rsa-sha256;
d=subspace.kernel.org; s=arc-20240116; t=1754531134; cv=none; b=VzL0CYUUKlqdqoZJSaOTqxx8+sQMTtD/4nfAtd9CTYqvC9siSt9ousmx+QzIfBVwfvVn68EUqmENPQ3I1Wpt24qs6mKgTjJq267ZrqehNKdsS8VkvnwkRB3xfcZpHiKEyMH0klNjs+YgiFL6bfxc0q7LMoQLllVGbrBH5jPzhNY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531134; c=relaxed/simple; bh=QBZ+QExF2IQaP6/YZb4gORsOYwd34GE+7HN7nommpfA=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=qEhbU5vG+wQt33UkZF12UXzZxpEiXdyJJqso9hMfARKlRiKm5h75sUi9O6Eiyo4IhqB8UKNq16+8vAMOF7xT2KtvyWeeXNqL152oGv34FAut6kooOqlnKJox81UypjbAaiszCPMr7yw62bYpC0JP6e2JrTgGNFfOK4URrDZkVb0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=QP4t7ZGq; arc=none smtp.client-ip=209.85.219.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="QP4t7ZGq" Received: by mail-qv1-f41.google.com with SMTP id 6a1803df08f44-70739d1f07bso7249806d6.2 for ; Wed, 06 Aug 2025 18:45:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531129; x=1755135929; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=zNpjHXepLLwpraPv0r0ol85rEMD4xKPLfeNr8R2KnmQ=; b=QP4t7ZGqaRM1MHTTNknkdldlXtejuPccqDZgcTi0Oj1uJw57mam6ERDL8a+U+yLRq7 lclxE4IA4RUmg3X57ecRhcfJgjV7UTqs4pxUvaiD4Yk/9wDKzZt1SoNPy/uFYpxpwsRn k78+AAgCN/OYJ1ajfkRqyR3mGBoXovK09H6LY3ev2c+3HIIKrfcLQ+5242CHXHpCrLbA 93FFvOS+z1oimaHkt9lM/FgEQ1CqKF7o7F23ozTFyvvWPZKuJ3t+PQzLT/kFmBxm6f9r 
5CFvRjv1XjFuTiZaTHcc7baTqkrIvTIr40QZsWp0dktZ2N9AJCuKGHdBYRIEK8T8Y5O3 fkDA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531129; x=1755135929; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=zNpjHXepLLwpraPv0r0ol85rEMD4xKPLfeNr8R2KnmQ=; b=YQad1Q624EpJZb7wnt/xlYquWBMEGu2EEcypVZ8Z7ecKApbGz0dfGOAAvcn0uG/oce 5HZgWcmw/AlCJ7Nis4gn2k7uTQbnztXaIZ+aPfkeKbFHNaLBSFYMil5q3npEwtRq06Az pqqRiCW2o54Th7mASE7VM7gxa04L22to3jw/uF05qVeUvrD3BVIFjEt/dSbieCv3/uB9 5T29gbaydVmWewpt5ZSILByhOWprs7gUIifTyVUJXx5B4U2NWAjPcUVhlWdSbHzIm9U1 98j2LOSAMkeVeDUcbISFvFsVvQhXgOt2FVtWXlPuJEXY2HMUnieTFMBW58FGeQ0P+bkz an/w== X-Forwarded-Encrypted: i=1; AJvYcCUaLFufw2yYXpu5waHhdltTrMZ236UZltd9xLj13ROEBMoBVRW9Ikfz8admFhTRtnRnerMe+CDdNx4Dk7s=@vger.kernel.org X-Gm-Message-State: AOJu0YxrpJut0FYJmAbWDL2ZNgVDy6o/9u5Ik4opPifJtFKd1lhImUja WaPS1l1/8nWamk2lu6Ez0Vk3XpwKUgANTgpyojbuK78hAunE5nr8YpKy+7BGodKcDs8= X-Gm-Gg: ASbGncshRfhO1JRWU4C+WSi6WQsHxvSKXTYM+yoUhedBVUVyn7yHOtHilHRv0xNz+bn 54DhMPxaqvLxCiaBesCvzCTmN5mnNS3V2XJUN0gpQnPJQIuYZbp34OQTlzJnYadzbe50uZ8g1kw PW45DBCmffRy1fik7rF8iJR8smmyPY3M3Q6sF2/g7Tr+xjvHWvKeS40TcY2V+z8Jb+G3e6bDISg b0cukqdoatXgT+sH19FTH6GDI1wGtUdi47DX+PATFwA+c8JHPEZxMZ4QdYAYLoPW8sBlssV8kUB qs1j/6mKvbzHFDxK625t5ImRSZtzMqNvOZhg8o/T4dcnDEZ0DufRT/J78F+eEngDjddVn78dWV6 Du5CqlTTpE9c3qgHtpuzYvi1/OiHB/oLigZfoMDEO7bml2igQTV9TKCCgrW1Xq+MNvz11BnwPUJ MmE9LNVcfUz+4V8oHbSVilDVo= X-Google-Smtp-Source: AGHT+IE5gyxbiCqCqrXVqnuWA1mkhz9OocyomyZ6jSXE5mTuz1p3dbFY4IvWv4kGZUKx4DkAznRCTg== X-Received: by 2002:a05:6214:76d:b0:707:6977:aa77 with SMTP id 6a1803df08f44-7097969f103mr61593626d6.33.1754531128252; Wed, 06 Aug 2025 18:45:28 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.45.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:45:27 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 28/30] mm: shmem: export some functions to internal.h Date: Thu, 7 Aug 2025 01:44:34 +0000 Message-ID: 
<20250807014442.3829950-29-pasha.tatashin@soleen.com>
X-Mailer: git-send-email 2.50.1.565.gc32cd1483b-goog
In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com>
References: <20250807014442.3829950-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Pratyush Yadav

shmem_inode_acct_blocks(), shmem_recalc_inode(), and
shmem_add_to_page_cache() are currently static helpers used by
shmem_alloc_and_add_folio(). The Live Update Orchestrator (LUO) will
need the same functionality to recreate memfd files after a live
update, so make them non-static and declare them in mm/internal.h.

Signed-off-by: Pratyush Yadav
Signed-off-by: Pasha Tatashin
---
 mm/internal.h |  6 ++++++
 mm/shmem.c    | 10 +++++-----
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 45b725c3dc03..5cf487ee6f83 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1566,6 +1566,12 @@ void __meminit __init_page_from_nid(unsigned long pfn, int nid);
 unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memcg,
 			  int priority);
 
+int shmem_add_to_page_cache(struct folio *folio,
+			    struct address_space *mapping,
+			    pgoff_t index, void *expected, gfp_t gfp);
+int shmem_inode_acct_blocks(struct inode *inode, long pages);
+bool shmem_recalc_inode(struct inode *inode, long alloced, long swapped);
+
 #ifdef CONFIG_SHRINKER_DEBUG
 static inline __printf(2, 0) int shrinker_debugfs_name_alloc(
 			struct shrinker *shrinker, const char *fmt, va_list ap)
diff --git a/mm/shmem.c b/mm/shmem.c
index ef57e2649a41..eea2e8ca205f 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -219,7 +219,7 @@ static inline void shmem_unacct_blocks(unsigned long flags, long pages)
 		vm_unacct_memory(pages * VM_ACCT(PAGE_SIZE));
 }
 
-static int shmem_inode_acct_blocks(struct inode *inode, long pages)
+int shmem_inode_acct_blocks(struct inode *inode, long pages)
 {
 	struct shmem_inode_info *info = SHMEM_I(inode);
 	struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
@@ -435,7 +435,7 @@ static void shmem_free_inode(struct super_block *sb, size_t freed_ispace)
  *
  * Return: true if swapped was incremented from 0, for shmem_writeout().
  */
-static bool shmem_recalc_inode(struct inode *inode, long alloced, long swapped)
+bool shmem_recalc_inode(struct inode *inode, long alloced, long swapped)
 {
 	struct shmem_inode_info *info = SHMEM_I(inode);
 	bool first_swapped = false;
@@ -898,9 +898,9 @@ static void shmem_update_stats(struct folio *folio, int nr_pages)
 /*
  * Somewhat like filemap_add_folio, but error if expected item has gone.
  */
-static int shmem_add_to_page_cache(struct folio *folio,
-				   struct address_space *mapping,
-				   pgoff_t index, void *expected, gfp_t gfp)
+int shmem_add_to_page_cache(struct folio *folio,
+			    struct address_space *mapping,
+			    pgoff_t index, void *expected, gfp_t gfp)
 {
 	XA_STATE_ORDER(xas, &mapping->i_pages, index, folio_order(folio));
 	unsigned long nr = folio_nr_pages(folio);
-- 
2.50.1.565.gc32cd1483b-goog

From nobody Sun Oct 5 07:24:31 2025
Received: from mail-qt1-f176.google.com (mail-qt1-f176.google.com [209.85.160.176]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9FE3E26F478 for ; Thu, 7 Aug 2025 01:45:31 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.176
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531136; cv=none; b=ip1QTXjunRHABB2Jhbh23hlZ8u8v+XDzO0salhGKXTlGCEHX+VYOUIyZF/9GqQxOMs1q5zH/l09k3bGES7U7Ke/GGJcFiARyKY9cT9TH/bP2ZwtHN2iXAQrkKFruT80LYd+bFpUKM2uedBQogdDR/zlXNkFANYhhaSJWpep2Zh8=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754531136; c=relaxed/simple; bh=F3LzhC++grM3zTUN87MyrI6xOO9Xu3zZ55hHAozBMgg=; h=From:To:Subject:Date:Message-ID:In-Reply-To:References:
MIME-Version; b=Kobnkdx6Fecw/hbf2nagTRqe8YWHK2+8alhpTfdOMi4ax/5NDuX7bbM1yuecxazJQBtcpven8l7AmMQ/2mmZKdNKv3o/vVsxSxrksBctPa3Yodq5YADgR/wO2YC2WMfZ1VSYYRrY0zp4Zt4MOcal9B4te8Gim75OuVN5F8v0KQE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com; spf=pass smtp.mailfrom=soleen.com; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b=Yt78Oyr+; arc=none smtp.client-ip=209.85.160.176 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=soleen.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=soleen.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=soleen.com header.i=@soleen.com header.b="Yt78Oyr+" Received: by mail-qt1-f176.google.com with SMTP id d75a77b69052e-4af123c6fc4so5511471cf.0 for ; Wed, 06 Aug 2025 18:45:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=soleen.com; s=google; t=1754531130; x=1755135930; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=ehkeScdrauztu8T9he2tGLB8hND8bPPz6tbPB+5xOmw=; b=Yt78Oyr+xb02NN3eyX+87MbAVSTApkgT1R6Wztn83jiNpi4TJBiL0Xa2Yqjd4kw2lW Wz0dSAVjxvUpSIIc4cSUBnjH0EOakV4jAIdlmsrSqKHa0cMN191CbGdrXe3OjQEHMDhw Kg7vm7G50M3q8wIecQrqu07H9V4oSmLbqF9ifewk/aSUG8z1Mpe0ReRjbsnuhMX18ZZF YNUi4OVs8XZMzB846HjhNOq0yMzCo1tpENu2K19umeUsk+y3r+Oy25WHOAB5szOGxiJv AodDRqjKTXkJL7eL7YYd1S7xsVwKwHmJysUuHDFatblEI2ehd3Tx/VggnJSK1GTTiIcO gMDQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754531130; x=1755135930; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ehkeScdrauztu8T9he2tGLB8hND8bPPz6tbPB+5xOmw=; b=Viuzv/tAoaZ6BMaMF7MwsDdewsC2BCZO0bo1VayEPgsbfHiDSFbRcn9RB/Z2VOnvg7 
aSplQdfi+t4/0Ng7pDyr74V920hqmiD+AcKDgGW6vsft+X0p5BXrWqOJf5frZ9zybqaV ByW0l4/hCw1VtAsjgtcdcmogq7qHQTLd9Tx+KJQ5krLKnphqjW6+4Xb/q9zmg4jRkQUE 6ffqGNlYO8YG8ZOEiGkINOzUTOxCkOSsXIGa3GfUaRSg1Lq6OI4XNxHp3T6Y8UWwdfxH Ygy6T0UfVndPe3eGSGcp/fhnKI1KPDCGWjJAoBb21Y+LChPuS+SKQ89kvn6vrv/+S0JP KS0g== X-Forwarded-Encrypted: i=1; AJvYcCXTVCsXThPq+yInq21OWKmGf1E7mTBAa4MHgAnN/jz7ilynd3hj0xG6ZNUUNXwktXC6HksowSGexIPZPYg=@vger.kernel.org X-Gm-Message-State: AOJu0Yz8Gl/fRt1RG3LvgXaERQL84k+xsUXW6P6u08X0bF442nEGbdMT VbRejNP0PBHcXuYMVP9VF9TjuqT1HErzYOEwwtXwTcrie4xkz6b8uB6W8JAc0i7NaVk= X-Gm-Gg: ASbGncu7+lG40PpsNzGTF/LaHK1In2uz0uIMetV1+NVoy9S9tVdWObRxz75j24kbo9z +NczqyjJyDJLVLqFPA/SLsgN/Xh/N71eOPXp+sJhfdPXZeKtloZ5m9DOFT7Mopzro4nXwMGkazk 5SJJRFgCw/mJM2RR06T4iovr9NF3GuuzwiFf8YRjRyWlovtLmWgzt3kJG7+LkmTF5ikywDuNcev R0ZTKmoMvaYUCOCcuqAQYHaWif0XuaW9hPo+dlQYXQhFw5f7rtSifarAKJ0x0Dxnhc7U/RbduRC Gyox7Hug/EQkNRoSeZthWovLiPMqboQnAXQxojIqnAv/2w7esLXYJvQDsvA+FqDUHEPK0IqCOB1 /YdUyrKdSt1SJcVtV+K8ERsHNf5yUH2RvCezSj0PMcH+yOYlD5chm8PQV8o30VFmbzK0vw4uyTd COqrvXydUbbLn1 X-Google-Smtp-Source: AGHT+IGqYgaN1YVbRBRIlJ65LL6blzyWcpm1/jVEKqTrwyRimKIgHpEcS08wJGhbX5ipdDSbTol6hA== X-Received: by 2002:a05:622a:1211:b0:4b0:6965:dd97 with SMTP id d75a77b69052e-4b0915b344bmr63499471cf.44.1754531129758; Wed, 06 Aug 2025 18:45:29 -0700 (PDT) Received: from soleen.c.googlers.com.com (235.247.85.34.bc.googleusercontent.com. 
[34.85.247.235]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-7077cde5a01sm92969046d6.70.2025.08.06.18.45.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Aug 2025 18:45:29 -0700 (PDT) From: Pasha Tatashin To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com, changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org, dmatlack@google.com, rientjes@google.com, corbet@lwn.net, rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com, kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com, masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org, yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev, chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com, jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org, dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org, rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org, zhangguopeng@kylinos.cn, linux@weissschuh.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org, bartosz.golaszewski@linaro.org, cw00.choi@samsung.com, myungjoo.ham@samsung.com, yesanishhere@gmail.com, Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com, aleksander.lobakin@intel.com, ira.weiny@intel.com, andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de, bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com, stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net, brauner@kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, saeedm@nvidia.com, ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com, leonro@nvidia.com, witu@nvidia.com Subject: [PATCH v3 29/30] luo: allow preserving memfd Date: Thu, 7 Aug 2025 01:44:35 +0000 Message-ID: <20250807014442.3829950-30-pasha.tatashin@soleen.com> X-Mailer: 
git-send-email 2.50.1.565.gc32cd1483b-goog
In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com>
References: <20250807014442.3829950-1-pasha.tatashin@soleen.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Pratyush Yadav

The ability to preserve a memfd allows userspace to use KHO and LUO to
transfer its memory contents to the next kernel. This is useful in many
ways. For one, it can be used with IOMMUFD as the backing store for
IOMMU page tables. Preserving IOMMUFD is essential for performing a
hypervisor live update with passthrough devices. memfd support provides
the first building block for making that possible.

For another, for applications whose large in-memory state takes time to
reconstruct, rebooting to pick up a kernel upgrade can be very
expensive. memfd with LUO gives those applications reboot-persistent
memory that they can use to quickly save and reconstruct that state.

While memfd is backed by either hugetlbfs or shmem, support is currently
added only for shmem. To be more precise, support for anonymous shmem
files is added.

The handover to the next kernel is not transparent. Not all properties
of the file are preserved; only its memory contents, position, and size
are. The recreated file gets the UID and GID of the task doing the
restore, and the task's cgroup gets charged with the memory.

Once LUO is in the prepared state, the file cannot grow or shrink, and
all its pages are pinned to avoid migration and swapping. The file can
still be read from or written to.
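
The rule in the last paragraph can be modelled in plain userspace C. The
sketch below is not kernel code and is not part of this patch: struct
model_file, model_truncate() and model_write() are hypothetical names
made up for illustration, and the checks are only an approximation of
the frozen-mapping checks added earlier in this series (resizing is
refused, and a write is refused only if it would extend the file).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the inode state that matters here. */
struct model_file {
	size_t size;	/* current file size in bytes */
	bool frozen;	/* models the prepared (frozen mapping) state */
};

/* Resizing is refused outright while the mapping is frozen. */
static int model_truncate(struct model_file *f, size_t new_size)
{
	if (f->frozen)
		return -EPERM;

	f->size = new_size;
	return 0;
}

/* Writes inside the current size are allowed; extending writes are not. */
static int model_write(struct model_file *f, size_t pos, size_t len)
{
	assert(len > 0);

	if (f->frozen && pos + len > f->size)
		return -EPERM;

	/* An unfrozen file simply grows to cover the write. */
	if (pos + len > f->size)
		f->size = pos + len;
	return 0;
}
```

With frozen set, a write that stays within the current size succeeds
while model_truncate() and an extending model_write() both return
-EPERM. Clearing frozen restores normal behaviour, which models a
cancelled live update.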

Co-developed-by: Changyuan Lyu
Signed-off-by: Changyuan Lyu
Co-developed-by: Pasha Tatashin
Signed-off-by: Pasha Tatashin
Signed-off-by: Pratyush Yadav
---
 MAINTAINERS    |   2 +
 mm/Makefile    |   1 +
 mm/memfd_luo.c | 507 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 510 insertions(+)
 create mode 100644 mm/memfd_luo.c

diff --git a/MAINTAINERS b/MAINTAINERS
index b88b77977649..7421d21672f3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14209,6 +14209,7 @@ F:	tools/testing/selftests/livepatch/
 
 LIVE UPDATE
 M:	Pasha Tatashin
+R:	Pratyush Yadav
 L:	linux-kernel@vger.kernel.org
 S:	Maintained
 F:	Documentation/ABI/testing/sysfs-kernel-liveupdate
@@ -14218,6 +14219,7 @@ F:	Documentation/userspace-api/liveupdate.rst
 F:	include/linux/liveupdate.h
 F:	include/uapi/linux/liveupdate.h
 F:	kernel/liveupdate/
+F:	mm/memfd_luo.c
 F:	tools/testing/selftests/liveupdate/
 
 LLC (802.2)
diff --git a/mm/Makefile b/mm/Makefile
index ef54aa615d9d..0a9936ffc172 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -100,6 +100,7 @@ obj-$(CONFIG_NUMA) += memory-tiers.o
 obj-$(CONFIG_DEVICE_MIGRATION) += migrate_device.o
 obj-$(CONFIG_TRANSPARENT_HUGEPAGE) += huge_memory.o khugepaged.o
 obj-$(CONFIG_PAGE_COUNTER) += page_counter.o
+obj-$(CONFIG_LIVEUPDATE) += memfd_luo.o
 obj-$(CONFIG_MEMCG_V1) += memcontrol-v1.o
 obj-$(CONFIG_MEMCG) += memcontrol.o vmpressure.o
 ifdef CONFIG_SWAP
diff --git a/mm/memfd_luo.c b/mm/memfd_luo.c
new file mode 100644
index 000000000000..0c91b40a2080
--- /dev/null
+++ b/mm/memfd_luo.c
@@ -0,0 +1,507 @@
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright (c) 2025, Google LLC.
+ * Pasha Tatashin
+ * Changyuan Lyu
+ *
+ * Copyright (C) 2025 Amazon.com Inc. or its affiliates.
+ * Pratyush Yadav
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "internal.h"
+
+static const char memfd_luo_compatible[] = "memfd-v1";
+
+#define PRESERVED_PFN_MASK		GENMASK(63, 12)
+#define PRESERVED_PFN_SHIFT		12
+#define PRESERVED_FLAG_DIRTY		BIT(0)
+#define PRESERVED_FLAG_UPTODATE		BIT(1)
+
+#define PRESERVED_FOLIO_PFN(desc)	(((desc) & PRESERVED_PFN_MASK) >> PRESERVED_PFN_SHIFT)
+#define PRESERVED_FOLIO_FLAGS(desc)	((desc) & ~PRESERVED_PFN_MASK)
+#define PRESERVED_FOLIO_MKDESC(pfn, flags) (((pfn) << PRESERVED_PFN_SHIFT) | (flags))
+
+struct memfd_luo_preserved_folio {
+	/*
+	 * The folio descriptor is made of 2 parts. The bottom 12 bits are used
+	 * for storing flags, the others for storing the PFN.
+	 */
+	u64 foliodesc;
+	u64 index;
+};
+
+static int memfd_luo_preserve_folios(struct memfd_luo_preserved_folio *pfolios,
+				     struct folio **folios,
+				     unsigned int nr_folios)
+{
+	unsigned int i;
+	int err;
+
+	for (i = 0; i < nr_folios; i++) {
+		struct memfd_luo_preserved_folio *pfolio = &pfolios[i];
+		struct folio *folio = folios[i];
+		unsigned int flags = 0;
+		unsigned long pfn;
+
+		err = kho_preserve_folio(folio);
+		if (err)
+			goto err_unpreserve;
+
+		pfn = folio_pfn(folio);
+		if (folio_test_dirty(folio))
+			flags |= PRESERVED_FLAG_DIRTY;
+		if (folio_test_uptodate(folio))
+			flags |= PRESERVED_FLAG_UPTODATE;
+
+		pfolio->foliodesc = PRESERVED_FOLIO_MKDESC(pfn, flags);
+		pfolio->index = folio->index;
+	}
+
+	return 0;
+
+err_unpreserve:
+	/* i is unsigned, so count down with while (i--) rather than i >= 0. */
+	while (i--)
+		WARN_ON_ONCE(kho_unpreserve_folio(folios[i]));
+	return err;
+}
+
+static void memfd_luo_unpreserve_folios(const struct memfd_luo_preserved_folio *pfolios,
+					unsigned int nr_folios)
+{
+	unsigned int i;
+
+	for (i = 0; i < nr_folios; i++) {
+		const struct memfd_luo_preserved_folio *pfolio = &pfolios[i];
+		struct folio *folio;
+
+		if (!pfolio->foliodesc)
+			continue;
+
+		folio =
pfn_folio(PRESERVED_FOLIO_PFN(pfolio->foliodesc)); + + kho_unpreserve_folio(folio); + unpin_folio(folio); + } +} + +static void *memfd_luo_create_fdt(unsigned long size) +{ + unsigned int order =3D get_order(size); + struct folio *fdt_folio; + int err =3D 0; + void *fdt; + + if (order > MAX_PAGE_ORDER) + return NULL; + + fdt_folio =3D folio_alloc(GFP_KERNEL, order); + if (!fdt_folio) + return NULL; + + fdt =3D folio_address(fdt_folio); + + err |=3D fdt_create(fdt, (1 << (order + PAGE_SHIFT))); + err |=3D fdt_finish_reservemap(fdt); + err |=3D fdt_begin_node(fdt, ""); + if (err) + goto free; + + return fdt; + +free: + folio_put(fdt_folio); + return NULL; +} + +static int memfd_luo_finish_fdt(void *fdt) +{ + int err; + + err =3D fdt_end_node(fdt); + if (err) + return err; + + return fdt_finish(fdt); +} + +static int memfd_luo_prepare(struct liveupdate_file_handler *handler, + struct file *file, u64 *data) +{ + struct memfd_luo_preserved_folio *preserved_folios; + struct inode *inode =3D file_inode(file); + unsigned int max_folios, nr_folios =3D 0; + int err =3D 0, preserved_size; + struct folio **folios; + long size, nr_pinned; + pgoff_t offset; + void *fdt; + u64 pos; + + if (WARN_ON_ONCE(!shmem_file(file))) + return -EINVAL; + + inode_lock(inode); + shmem_i_mapping_freeze(inode, true); + + size =3D i_size_read(inode); + if ((PAGE_ALIGN(size) / PAGE_SIZE) > UINT_MAX) { + err =3D -E2BIG; + goto err_unlock; + } + + /* + * Guess the number of folios based on inode size. Real number might end + * up being smaller if there are higher order folios. + */ + max_folios =3D PAGE_ALIGN(size) / PAGE_SIZE; + folios =3D kvmalloc_array(max_folios, sizeof(*folios), GFP_KERNEL); + if (!folios) { + err =3D -ENOMEM; + goto err_unfreeze; + } + + /* + * Pin the folios so they don't move around behind our back. This also + * ensures none of the folios are in CMA -- which ensures they don't + * fall in KHO scratch memory. It also moves swapped out folios back to + * memory. 
+	 *
+	 * A side effect of doing this is that it allocates a folio for all
+	 * indices in the file. This might waste memory on sparse memfds. If
+	 * that is really a problem in the future, we can have a
+	 * memfd_pin_folios() variant that does not allocate a page on empty
+	 * slots.
+	 */
+	nr_pinned = memfd_pin_folios(file, 0, size - 1, folios, max_folios,
+				     &offset);
+	if (nr_pinned < 0) {
+		err = nr_pinned;
+		pr_err("failed to pin folios: %d\n", err);
+		goto err_free_folios;
+	}
+	/* nr_pinned won't be more than max_folios which is also unsigned int. */
+	nr_folios = (unsigned int)nr_pinned;
+
+	preserved_size = sizeof(struct memfd_luo_preserved_folio) * nr_folios;
+	if (check_mul_overflow(sizeof(struct memfd_luo_preserved_folio),
+			       nr_folios, &preserved_size)) {
+		err = -E2BIG;
+		goto err_unpin;
+	}
+
+	/*
+	 * Most of the space should be taken by preserved folios. So take its
+	 * size, plus a page for other properties.
+	 */
+	fdt = memfd_luo_create_fdt(PAGE_ALIGN(preserved_size) + PAGE_SIZE);
+	if (!fdt) {
+		err = -ENOMEM;
+		goto err_unpin;
+	}
+
+	pos = file->f_pos;
+	err = fdt_property(fdt, "pos", &pos, sizeof(pos));
+	if (err)
+		goto err_free_fdt;
+
+	err = fdt_property(fdt, "size", &size, sizeof(size));
+	if (err)
+		goto err_free_fdt;
+
+	err = fdt_property_placeholder(fdt, "folios", preserved_size,
+				       (void **)&preserved_folios);
+	if (err) {
+		pr_err("Failed to reserve folios property in FDT: %s\n",
+		       fdt_strerror(err));
+		err = -ENOMEM;
+		goto err_free_fdt;
+	}
+
+	err = memfd_luo_preserve_folios(preserved_folios, folios, nr_folios);
+	if (err)
+		goto err_free_fdt;
+
+	err = memfd_luo_finish_fdt(fdt);
+	if (err)
+		goto err_unpreserve;
+
+	err = kho_preserve_folio(virt_to_folio(fdt));
+	if (err)
+		goto err_unpreserve;
+
+	kvfree(folios);
+	inode_unlock(inode);
+
+	*data = virt_to_phys(fdt);
+	return 0;
+
+err_unpreserve:
+	memfd_luo_unpreserve_folios(preserved_folios, nr_folios);
+err_free_fdt:
+	folio_put(virt_to_folio(fdt));
+err_unpin:
+	unpin_folios(folios, nr_pinned);
+err_free_folios:
+	kvfree(folios);
+err_unfreeze:
+	shmem_i_mapping_freeze(inode, false);
+err_unlock:
+	inode_unlock(inode);
+	return err;
+}
+
+static int memfd_luo_freeze(struct liveupdate_file_handler *handler,
+			    struct file *file, u64 *data)
+{
+	u64 pos = file->f_pos;
+	void *fdt;
+	int err;
+
+	if (WARN_ON_ONCE(!*data))
+		return -EINVAL;
+
+	fdt = phys_to_virt(*data);
+
+	/*
+	 * The pos or size might have changed since prepare. Everything else
+	 * stays the same.
+	 */
+	err = fdt_setprop(fdt, 0, "pos", &pos, sizeof(pos));
+	if (err)
+		return err;
+
+	return 0;
+}
+
+static void memfd_luo_cancel(struct liveupdate_file_handler *handler,
+			     struct file *file, u64 data)
+{
+	const struct memfd_luo_preserved_folio *pfolios;
+	struct inode *inode = file_inode(file);
+	struct folio *fdt_folio;
+	void *fdt;
+	int len;
+
+	if (WARN_ON_ONCE(!data))
+		return;
+
+	inode_lock(inode);
+	shmem_i_mapping_freeze(inode, false);
+
+	fdt = phys_to_virt(data);
+	fdt_folio = virt_to_folio(fdt);
+	pfolios = fdt_getprop(fdt, 0, "folios", &len);
+	if (pfolios)
+		memfd_luo_unpreserve_folios(pfolios, len / sizeof(*pfolios));
+
+	kho_unpreserve_folio(fdt_folio);
+	folio_put(fdt_folio);
+	inode_unlock(inode);
+}
+
+static struct folio *memfd_luo_get_fdt(u64 data)
+{
+	return kho_restore_folio((phys_addr_t)data);
+}
+
+static void memfd_luo_finish(struct liveupdate_file_handler *handler,
+			     struct file *file, u64 data, bool reclaimed)
+{
+	const struct memfd_luo_preserved_folio *pfolios;
+	struct folio *fdt_folio;
+	int len;
+
+	if (reclaimed)
+		return;
+
+	fdt_folio = memfd_luo_get_fdt(data);
+
+	pfolios = fdt_getprop(folio_address(fdt_folio), 0, "folios", &len);
+	if (pfolios)
+		memfd_luo_unpreserve_folios(pfolios, len / sizeof(*pfolios));
+
+	folio_put(fdt_folio);
+}
+
+static int memfd_luo_retrieve(struct liveupdate_file_handler *handler, u64 data,
+			      struct file **file_p)
+{
+	const struct memfd_luo_preserved_folio *pfolios;
+	int nr_pfolios, len, ret = 0, i = 0;
+	struct address_space *mapping;
+	struct folio *folio, *fdt_folio;
+	const u64 *pos, *size;
+	struct inode *inode;
+	struct file *file;
+	const void *fdt;
+
+	fdt_folio = memfd_luo_get_fdt(data);
+	if (!fdt_folio)
+		return -ENOENT;
+
+	fdt = page_to_virt(folio_page(fdt_folio, 0));
+
+	pfolios = fdt_getprop(fdt, 0, "folios", &len);
+	if (!pfolios || len % sizeof(*pfolios)) {
+		pr_err("invalid 'folios' property\n");
+		ret = -EINVAL;
+		goto put_fdt;
+	}
+	nr_pfolios = len / sizeof(*pfolios);
+
+	size = fdt_getprop(fdt, 0, "size", &len);
+	if (!size || len != sizeof(u64)) {
+		pr_err("invalid 'size' property\n");
+		ret = -EINVAL;
+		goto put_folios;
+	}
+
+	pos = fdt_getprop(fdt, 0, "pos", &len);
+	if (!pos || len != sizeof(u64)) {
+		pr_err("invalid 'pos' property\n");
+		ret = -EINVAL;
+		goto put_folios;
+	}
+
+	file = shmem_file_setup("", 0, VM_NORESERVE);
+
+	if (IS_ERR(file)) {
+		ret = PTR_ERR(file);
+		pr_err("failed to setup file: %d\n", ret);
+		goto put_folios;
+	}
+
+	inode = file->f_inode;
+	mapping = inode->i_mapping;
+	vfs_setpos(file, *pos, MAX_LFS_FILESIZE);
+
+	for (; i < nr_pfolios; i++) {
+		const struct memfd_luo_preserved_folio *pfolio = &pfolios[i];
+		phys_addr_t phys;
+		u64 index;
+		int flags;
+
+		if (!pfolio->foliodesc)
+			continue;
+
+		phys = PFN_PHYS(PRESERVED_FOLIO_PFN(pfolio->foliodesc));
+		folio = kho_restore_folio(phys);
+		if (!folio) {
+			pr_err("Unable to restore folio at physical address: %llx\n",
+			       phys);
+			goto put_file;
+		}
+		index = pfolio->index;
+		flags = PRESERVED_FOLIO_FLAGS(pfolio->foliodesc);
+
+		/* Set up the folio for insertion. */
+		/*
+		 * TODO: Should find a way to unify this and
+		 * shmem_alloc_and_add_folio().
+		 */
+		__folio_set_locked(folio);
+		__folio_set_swapbacked(folio);
+
+		ret = mem_cgroup_charge(folio, NULL, mapping_gfp_mask(mapping));
+		if (ret) {
+			pr_err("shmem: failed to charge folio index %d: %d\n",
+			       i, ret);
+			goto unlock_folio;
+		}
+
+		ret = shmem_add_to_page_cache(folio, mapping, index, NULL,
+					      mapping_gfp_mask(mapping));
+		if (ret) {
+			pr_err("shmem: failed to add to page cache folio index %d: %d\n",
+			       i, ret);
+			goto unlock_folio;
+		}
+
+		if (flags & PRESERVED_FLAG_UPTODATE)
+			folio_mark_uptodate(folio);
+		if (flags & PRESERVED_FLAG_DIRTY)
+			folio_mark_dirty(folio);
+
+		ret = shmem_inode_acct_blocks(inode, 1);
+		if (ret) {
+			pr_err("shmem: failed to account folio index %d: %d\n",
+			       i, ret);
+			goto unlock_folio;
+		}
+
+		shmem_recalc_inode(inode, 1, 0);
+		folio_add_lru(folio);
+		folio_unlock(folio);
+		folio_put(folio);
+	}
+
+	inode->i_size = *size;
+	*file_p = file;
+	folio_put(fdt_folio);
+	return 0;
+
+unlock_folio:
+	folio_unlock(folio);
+	folio_put(folio);
+put_file:
+	fput(file);
+	i++;
+put_folios:
+	for (; i < nr_pfolios; i++) {
+		const struct memfd_luo_preserved_folio *pfolio = &pfolios[i];
+
+		folio = kho_restore_folio(PFN_PHYS(PRESERVED_FOLIO_PFN(pfolio->foliodesc)));
+		if (folio)
+			folio_put(folio);
+	}
+
+put_fdt:
+	folio_put(fdt_folio);
+	return ret;
+}
+
+static bool memfd_luo_can_preserve(struct liveupdate_file_handler *handler,
+				   struct file *file)
+{
+	struct inode *inode = file_inode(file);
+
+	return shmem_file(file) && !inode->i_nlink;
+}
+
+static const struct liveupdate_file_ops memfd_luo_file_ops = {
+	.prepare = memfd_luo_prepare,
+	.freeze = memfd_luo_freeze,
+	.cancel = memfd_luo_cancel,
+	.finish = memfd_luo_finish,
+	.retrieve = memfd_luo_retrieve,
+	.can_preserve = memfd_luo_can_preserve,
+	.owner = THIS_MODULE,
+};
+
+static struct liveupdate_file_handler memfd_luo_handler = {
+	.ops = &memfd_luo_file_ops,
+	.compatible = memfd_luo_compatible,
+};
+
+static int __init memfd_luo_init(void)
+{
+	int err;
+
+	err = liveupdate_register_file_handler(&memfd_luo_handler);
+	if (err)
+		pr_err("Could not register luo filesystem handler: %d\n", err);
+
+	return err;
+}
+late_initcall(memfd_luo_init);
-- 
2.50.1.565.gc32cd1483b-goog

From nobody Sun Oct 5 07:24:31 2025
From: Pasha Tatashin
To: pratyush@kernel.org, jasonmiu@google.com, graf@amazon.com,
	changyuanl@google.com, pasha.tatashin@soleen.com, rppt@kernel.org,
	dmatlack@google.com, rientjes@google.com, corbet@lwn.net,
	rdunlap@infradead.org, ilpo.jarvinen@linux.intel.com,
	kanie@linux.alibaba.com, ojeda@kernel.org, aliceryhl@google.com,
	masahiroy@kernel.org, akpm@linux-foundation.org, tj@kernel.org,
	yoann.congal@smile.fr, mmaurer@google.com, roman.gushchin@linux.dev,
	chenridong@huawei.com, axboe@kernel.dk, mark.rutland@arm.com,
	jannh@google.com, vincent.guittot@linaro.org, hannes@cmpxchg.org,
	dan.j.williams@intel.com, david@redhat.com, joel.granados@kernel.org,
	rostedt@goodmis.org, anna.schumaker@oracle.com, song@kernel.org,
	zhangguopeng@kylinos.cn, linux@weissschuh.net,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-mm@kvack.org, gregkh@linuxfoundation.org, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	x86@kernel.org, hpa@zytor.com, rafael@kernel.org, dakr@kernel.org,
	bartosz.golaszewski@linaro.org, cw00.choi@samsung.com,
	myungjoo.ham@samsung.com, yesanishhere@gmail.com,
	Jonathan.Cameron@huawei.com, quic_zijuhu@quicinc.com,
	aleksander.lobakin@intel.com, ira.weiny@intel.com,
	andriy.shevchenko@linux.intel.com, leon@kernel.org, lukas@wunner.de,
	bhelgaas@google.com, wagi@kernel.org, djeffery@redhat.com,
	stuart.w.hayes@gmail.com, ptyadav@amazon.de, lennart@poettering.net,
	brauner@kernel.org, linux-api@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, saeedm@nvidia.com,
	ajayachandra@nvidia.com, jgg@nvidia.com, parav@nvidia.com,
	leonro@nvidia.com, witu@nvidia.com
Subject: [PATCH v3 30/30] docs: add documentation for memfd preservation via LUO
Date: Thu, 7 Aug 2025 01:44:36 +0000
Message-ID: <20250807014442.3829950-31-pasha.tatashin@soleen.com>
In-Reply-To: <20250807014442.3829950-1-pasha.tatashin@soleen.com>
References: <20250807014442.3829950-1-pasha.tatashin@soleen.com>

From: Pratyush Yadav

Add the documentation under the "Preserving file descriptors" section of
LUO's documentation. The doc describes the properties preserved,
behaviour of the file under different LUO states, serialization format,
and current limitations.

Signed-off-by: Pratyush Yadav
Signed-off-by: Pasha Tatashin
---
 Documentation/core-api/liveupdate.rst   |   7 ++
 Documentation/mm/index.rst              |   1 +
 Documentation/mm/memfd_preservation.rst | 138 ++++++++++++++++++++++++
 MAINTAINERS                             |   1 +
 4 files changed, 147 insertions(+)
 create mode 100644 Documentation/mm/memfd_preservation.rst

diff --git a/Documentation/core-api/liveupdate.rst b/Documentation/core-api/liveupdate.rst
index 41c4b76cd3ec..232d5f623992 100644
--- a/Documentation/core-api/liveupdate.rst
+++ b/Documentation/core-api/liveupdate.rst
@@ -18,6 +18,13 @@ LUO Preserving File Descriptors
 .. kernel-doc:: kernel/liveupdate/luo_files.c
    :doc: LUO file descriptors
 
+The following types of file descriptors can be preserved
+
+.. toctree::
+   :maxdepth: 1
+
+   ../mm/memfd_preservation
+
 Public API
 ==========
 .. kernel-doc:: include/linux/liveupdate.h
diff --git a/Documentation/mm/index.rst b/Documentation/mm/index.rst
index fb45acba16ac..c504156149a0 100644
--- a/Documentation/mm/index.rst
+++ b/Documentation/mm/index.rst
@@ -47,6 +47,7 @@ documentation, or deleted if it has served its purpose.
    hugetlbfs_reserv
    ksm
    memory-model
+   memfd_preservation
    mmu_notifier
    multigen_lru
    numa
diff --git a/Documentation/mm/memfd_preservation.rst b/Documentation/mm/memfd_preservation.rst
new file mode 100644
index 000000000000..416cd1dafc97
--- /dev/null
+++ b/Documentation/mm/memfd_preservation.rst
@@ -0,0 +1,138 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+
+==========================
+Memfd Preservation via LUO
+==========================
+
+Overview
+========
+
+Memory file descriptors (memfd) can be preserved over a kexec using the Live
+Update Orchestrator (LUO) file preservation. This allows userspace to transfer
+its memory contents to the next kernel after a kexec.
+
+The preservation is not intended to be transparent. Only select properties of
+the file are preserved. All others are reset to default. The preserved
+properties are described below.
+
+.. note::
+   The LUO API is not stabilized yet, so the preserved properties of a memfd are
+   also not stable and are subject to backwards incompatible changes.
+
+.. note::
+   Currently a memfd backed by Hugetlb is not supported. Memfds created
+   with ``MFD_HUGETLB`` will be rejected.
+
+Preserved Properties
+====================
+
+The following properties of the memfd are preserved across kexec:
+
+File Contents
+  All data stored in the file is preserved.
+
+File Size
+  The size of the file is preserved. Holes in the file are filled by allocating
+  pages for them during preservation.
+
+File Position
+  The current file position is preserved, allowing applications to continue
+  reading/writing from their last position.
+
+File Status Flags
+  memfds are always opened with ``O_RDWR`` and ``O_LARGEFILE``. This property is
+  maintained.
+
+Non-Preserved Properties
+========================
+
+All properties which are not preserved must be assumed to be reset to default.
+This section describes some of those properties that are worth noting.
+
+``FD_CLOEXEC`` flag
+  A memfd can be created with the ``MFD_CLOEXEC`` flag that sets the
+  ``FD_CLOEXEC`` on the file. This flag is not preserved and must be set again
+  after restore via ``fcntl()``.
+
+Seals
+  File seals are not preserved. The file is unsealed on restore and if needed,
+  must be sealed again via ``fcntl()``.
+
+Behavior with LUO states
+========================
+
+This section describes the behavior of the memfd in the different LUO states.
+
+Normal Phase
+  During the normal phase, the memfd can be marked for preservation using the
+  ``LIVEUPDATE_IOCTL_FD_PRESERVE`` ioctl. The memfd acts as a regular memfd
+  during this phase with no additional restrictions.
+
+Prepared Phase
+  After LUO enters ``LIVEUPDATE_STATE_PREPARED``, the memfd is serialized and
+  prepared for the next kernel. During this phase, the following happens:
+
+  - All the folios are pinned. If some folios reside in ``ZONE_MIGRATE``, they
+    are migrated out. This ensures none of the preserved folios land in KHO
+    scratch area.
+  - Pages in swap are swapped in. Currently, there is no way to pass pages in
+    swap over KHO, so all swapped out pages are swapped back in and pinned.
+  - The memfd goes into "frozen mapping" mode. The file can no longer grow or
+    shrink, and holes cannot be punched. This keeps the serialized mappings in
+    sync. The file can still be read from, written to, or mmap-ed.
+
+Freeze Phase
+  Updates the current file position in the serialized data to capture any
+  changes that occurred between prepare and freeze phases. After this, the FD is
+  not allowed to be accessed.
+
+Restoration Phase
+  After being restored, the memfd is functional as normal with the properties
+  listed above restored.
+
+Cancellation
+  If the liveupdate is canceled after going into prepared phase, the memfd
+  behaves as it does in the normal phase.
+
+Serialization format
+====================
+
+The state is serialized in an FDT with the following structure::
+
+  /dts-v1/;
+
+  / {
+      compatible = "memfd-v1";
+      pos = ;
+      size = ;
+      folios = ;
+  };
+
+Each folio descriptor contains:
+
+- PFN + flags (8 bytes)
+
+  - Page frame number (PFN) of the preserved folio (bits 63:12).
+  - Folio flags (bits 11:0):
+
+    - ``PRESERVED_FLAG_DIRTY`` (bit 0)
+    - ``PRESERVED_FLAG_UPTODATE`` (bit 1)
+
+- Folio index within the file (8 bytes).
+
+Limitations
+===========
+
+The current implementation has the following limitations:
+
+Size
+  Currently the size of the file is limited by the size of the FDT. The FDT can
+  be of at most ``MAX_PAGE_ORDER`` order. By default this is 4 MiB with 4K
+  pages. Each page in the file is tracked using 16 bytes. This limits the
+  maximum size of the file to 1 GiB.
+
+See Also
+========
+
+- :doc:`Live Update Orchestrator `
+- :doc:`/core-api/kho/concepts`
diff --git a/MAINTAINERS b/MAINTAINERS
index 7421d21672f3..50482363c9d4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -14215,6 +14215,7 @@ S:	Maintained
 F:	Documentation/ABI/testing/sysfs-kernel-liveupdate
 F:	Documentation/admin-guide/liveupdate.rst
 F:	Documentation/core-api/liveupdate.rst
+F:	Documentation/mm/memfd_preservation.rst
 F:	Documentation/userspace-api/liveupdate.rst
 F:	include/linux/liveupdate.h
 F:	include/uapi/linux/liveupdate.h
-- 
2.50.1.565.gc32cd1483b-goog
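
For readers following the series, the folio descriptor layout and the 1 GiB
size limit described in the two patches above can be checked with a small
userspace sketch. This is not kernel code: GENMASK() and BIT() are expanded by
hand, the helper names (mkdesc, desc_pfn, desc_flags, max_file_bytes) are
hypothetical, and the size estimate ignores the extra page the patch reserves
for the other FDT properties, so the real limit is presumably slightly lower.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace mirrors of the macros in mm/memfd_luo.c (expanded by hand). */
#define PRESERVED_PFN_SHIFT	12
#define PRESERVED_PFN_MASK	(~((UINT64_C(1) << PRESERVED_PFN_SHIFT) - 1))
#define PRESERVED_FLAG_DIRTY	(UINT64_C(1) << 0)
#define PRESERVED_FLAG_UPTODATE	(UINT64_C(1) << 1)

/* Hypothetical helper: pack a PFN (bits 63:12) and flags (bits 11:0). */
static inline uint64_t mkdesc(uint64_t pfn, uint64_t flags)
{
	/* Flags must fit in the low 12 bits or they would corrupt the PFN. */
	assert((flags & PRESERVED_PFN_MASK) == 0);
	return (pfn << PRESERVED_PFN_SHIFT) | flags;
}

/* Hypothetical helper: extract the PFN from a descriptor. */
static inline uint64_t desc_pfn(uint64_t desc)
{
	return (desc & PRESERVED_PFN_MASK) >> PRESERVED_PFN_SHIFT;
}

/* Hypothetical helper: extract the flag bits from a descriptor. */
static inline uint64_t desc_flags(uint64_t desc)
{
	return desc & ~PRESERVED_PFN_MASK;
}

/*
 * Hypothetical helper: upper bound on the file size when every page needs
 * one entry of entry_bytes in an FDT of fdt_bytes. Assumes order-0 folios
 * and ignores the space taken by the other properties.
 */
static inline uint64_t max_file_bytes(uint64_t fdt_bytes, uint64_t page_size,
				      uint64_t entry_bytes)
{
	return fdt_bytes / entry_bytes * page_size;
}
```

With a 4 MiB FDT, 4 KiB pages and 16-byte entries, max_file_bytes() gives
262144 entries times 4 KiB, which matches the 1 GiB figure quoted in the
Limitations section.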