From: Pratyush Yadav
To: Alexander Graf, Mike Rapoport, Changyuan Lyu, Andrew Morton,
    Baoquan He, Pratyush Yadav, Pasha Tatashin, Jason Gunthorpe,
    Thomas Weißschuh, Chris Li, Jason Miu, David Matlack, David Rientjes
Cc: linux-kernel@vger.kernel.org, kexec@lists.infradead.org, linux-mm@kvack.org
Subject: [RFC PATCH 1/4] kho: introduce the KHO array
Date: Tue, 9 Sep 2025 16:44:21 +0200
Message-ID: <20250909144426.33274-2-pratyush@kernel.org>
In-Reply-To: <20250909144426.33274-1-pratyush@kernel.org>

The KHO Array is a data structure that behaves like a sparse array of
pointers. It is designed to be preserved and restored over Kexec Handover
(KHO), and targets only 64-bit platforms. It can store 8-byte aligned
pointers. It can also store integers between 0 and LONG_MAX. It supports
sparse indices, though it performs best with densely clustered indices.

The goal of the KHO array is to provide a fundamental data type that can
then be used to build serialization logic for higher layers. Moving the
complexity of tracking this scattered list of pages into the KHO array
layer makes the higher layers simpler.

The data format consists of a descriptor of the array which contains a
magic number, a format version, and a pointer to the first page. Each
page contains the starting position of the entries in the page and a
pointer to the next page, forming a linked list. This linked list allows
the array to be built from non-contiguous pages.
Visually, the data format looks like below:

 kho_array
+----------+
| Magic    |
+----------+                kho_array_page
| Version  |      +----------+----------+-----------
+----------+ +--->| Next     | Startpos | Entries...
| Reserved | |    +----------+----------+-----------
+----------+ |        |          kho_array_page
| First    |-+        |  +----------+----------+-----------
+----------+          +-->| Next     | Startpos | Entries...
                          +----------+----------+-----------
                               |
                               |
                               +--->...

Signed-off-by: Pratyush Yadav
---
 MAINTAINERS               |   2 +
 include/linux/kho_array.h | 300 ++++++++++++++++++++++++++++++++++++++
 kernel/Makefile           |   1 +
 kernel/kho_array.c        | 209 ++++++++++++++++++++++++++
 4 files changed, 512 insertions(+)
 create mode 100644 include/linux/kho_array.h
 create mode 100644 kernel/kho_array.c

diff --git a/MAINTAINERS b/MAINTAINERS
index 6dcfbd11efef8..e66bc05bce0e3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13550,7 +13550,9 @@ S:	Maintained
 F:	Documentation/admin-guide/mm/kho.rst
 F:	Documentation/core-api/kho/*
 F:	include/linux/kexec_handover.h
+F:	include/linux/kho_array.h
 F:	kernel/kexec_handover.c
+F:	kernel/kho_array.c
 F:	tools/testing/selftests/kho/

 KEYS-ENCRYPTED
diff --git a/include/linux/kho_array.h b/include/linux/kho_array.h
new file mode 100644
index 0000000000000..39ab5532ee765
--- /dev/null
+++ b/include/linux/kho_array.h
@@ -0,0 +1,300 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2025 Amazon.com Inc. or its affiliates.
+ * Pratyush Yadav
+ */
+
+/**
+ * DOC: KHO Array
+ *
+ * The KHO Array is a data structure that behaves like a sparse array of
+ * pointers. It is designed to be preserved and restored over Kexec Handover
+ * (KHO), and targets only 64-bit platforms. It can store 8-byte aligned
+ * pointers. It can also store integers between 0 and LONG_MAX. It supports
+ * sparse indices, though it performs best with densely clustered indices. The
+ * data structure does not provide any locking. Callers must ensure they have
+ * exclusive access.
+ *
+ * To keep the data format simple, the data structure is designed to only be
+ * accessed linearly. When reading or writing the data structure, the values
+ * should be accessed from the lowest index to the highest.
+ *
+ * The data format consists of a descriptor of the array which contains a magic
+ * number, format version, and pointer to the first page. Each page contains the
+ * starting position of the entries in the page and a pointer to the next page,
+ * forming a linked list. This linked list allows for the array to be built with
+ * non-contiguous pages.
+ *
+ * The starting position of each page is an offset that is applied to calculate
+ * the index of each entry in the array. For example, if the starting position
+ * is 1000, entry 0 has index 1000, entry 1 has index 1001, and so on. This
+ * facilitates memory-efficient handling of holes in the array.
+ *
+ * The diagram below shows the data format visually:
+ *
+ *  kho_array
+ * +----------+
+ * | Magic    |
+ * +----------+                kho_array_page
+ * | Version  |      +----------+----------+-----------
+ * +----------+ +--->| Next     | Startpos | Entries...
+ * | Reserved | |    +----------+----------+-----------
+ * +----------+ |        |          kho_array_page
+ * | First    |-+        |  +----------+----------+-----------
+ * +----------+          +-->| Next     | Startpos | Entries...
+ *                           +----------+----------+-----------
+ *                                |
+ *                                |
+ *                                +--->...
+ */
+
+#ifndef LINUX_KHO_ARRAY_H
+#define LINUX_KHO_ARRAY_H
+
+#include
+
+#define KHO_ARRAY_MAGIC		0x4b415252 /* ASCII for 'KARR' */
+#define KHO_ARRAY_VERSION	0
+
+/**
+ * struct kho_array - Descriptor for a KHO array.
+ * @magic: Magic number to ensure valid descriptor.
+ * @version: Data format version.
+ * @__reserved: Reserved bytes. Must be set to 0.
+ * @first: Physical address of the first page in the list of pages. If 0, the
+ *         list is empty.
+ */
+struct kho_array {
+	u32 magic;
+	u16 version;
+	u16 __reserved;
+	__aligned_u64 first;
+} __packed;
+
+/**
+ * struct kho_array_page - A page in the KHO array.
+ * @next: Physical address of the next page in the list. If 0, there is no next
+ *        page.
+ * @startpos: Position at which entries in this page start.
+ * @entries: Entries in the array.
+ */
+struct kho_array_page {
+	__aligned_u64 next;
+	__aligned_u64 startpos;
+	__aligned_u64 entries[];
+} __packed;
+
+#define KA_PAGE_NR_ENTRIES ((PAGE_SIZE - sizeof(struct kho_array_page)) / sizeof(u64))
+
+#define KA_ITER_PAGEPOS(iter) ((iter)->pos - (iter)->cur->startpos)
+#define KA_PAGE(phys) ((phys) ? (struct kho_array_page *)__va((phys)) : NULL)
+
+/**
+ * kho_array_valid() - Validate KHO array descriptor.
+ * @ka: KHO array.
+ *
+ * Return: %true if valid, %false otherwise.
+ */
+bool kho_array_valid(struct kho_array *ka);
+
+/**
+ * kho_array_init() - Initialize an empty KHO array.
+ * @ka: KHO array.
+ *
+ * Initializes @ka to an empty KHO array full of NULL entries.
+ */
+void kho_array_init(struct kho_array *ka);
+
+/**
+ * kho_array_destroy() - Free the KHO array.
+ * @ka: KHO array.
+ *
+ * After calling this function, @ka is destroyed and all its pages have been
+ * freed. It must be initialized again before reuse.
+ */
+void kho_array_destroy(struct kho_array *ka);
+
+/**
+ * kho_array_preserve() - KHO-preserve all pages of the array.
+ * @ka: KHO array.
+ *
+ * Mark all pages of the array to be preserved across KHO.
+ *
+ * Note: the memory for the struct @ka itself is not marked as preserved. The
+ * caller must take care of doing that, likely embedding it in a larger
+ * serialized data structure.
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+int kho_array_preserve(struct kho_array *ka);
+
+/**
+ * kho_array_restore() - KHO-restore all pages of the array.
+ * @ka: KHO array.
+ *
+ * Validate the magic and version of @ka, and if they match, restore all pages
+ * of @ka from KHO to set the array up for being accessed.
+ *
+ * Note: the memory for the struct @ka itself is not KHO-restored. The caller
+ * must take care of doing that, likely embedding it in a larger serialized data
+ * structure.
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+int kho_array_restore(struct kho_array *ka);
+
+/**
+ * ka_is_value() - Determine if an entry is a value.
+ * @entry: KHO array entry.
+ *
+ * Return: %true if the entry is a value, %false if it is a pointer.
+ */
+static inline bool ka_is_value(const void *entry)
+{
+	return (unsigned long)entry & 1;
+}
+
+/**
+ * ka_to_value() - Get the value stored in a KHO array entry.
+ * @entry: KHO array entry.
+ *
+ * Return: The value stored in @entry.
+ */
+static inline unsigned long ka_to_value(const void *entry)
+{
+	return (unsigned long)entry >> 1;
+}
+
+/**
+ * ka_mk_value() - Create a KHO array entry from an integer.
+ * @v: Value to store in the KHO array.
+ *
+ * Return: An entry suitable for storing in a KHO array.
+ */
+static inline void *ka_mk_value(unsigned long v)
+{
+	WARN_ON((long)v < 0);
+	return (void *)((v << 1) | 1);
+}
+
+enum ka_iter_mode {
+	KA_ITER_READ,
+	KA_ITER_WRITE,
+};
+
+struct ka_iter {
+	struct kho_array *ka;
+	struct kho_array_page *cur;
+	unsigned long pos;
+	enum ka_iter_mode mode;
+};
+
+/**
+ * ka_iter_init_read() - Initialize iterator for reading.
+ * @iter: KHO array iterator.
+ * @ka: KHO array.
+ *
+ * Initialize @iter in read mode for reading @ka. After the function returns,
+ * @iter points to the first non-empty entry in the array, if any. @ka must be a
+ * valid KHO array. No validation on @ka is performed.
+ */
+void ka_iter_init_read(struct ka_iter *iter, struct kho_array *ka);
+
+/**
+ * ka_iter_init_write() - Initialize iterator for writing.
+ * @iter: KHO array iterator.
+ * @ka: KHO array.
+ *
+ * Initialize @ka to an empty array and then initialize @iter in write mode
+ * for building @ka. All data in @ka is overwritten, so it must be an
+ * uninitialized array. After the function returns, @iter points to the first
+ * entry in the array.
+ */
+void ka_iter_init_write(struct ka_iter *iter, struct kho_array *ka);
+
+/**
+ * ka_iter_init_restore() - Restore KHO array and initialize iterator for reading.
+ * @iter: KHO array iterator.
+ * @ka: KHO array.
+ *
+ * KHO-restore @ka, performing version and format validation, and initialize
+ * @iter in read mode for reading the array. After the function returns, @iter
+ * points to the first non-empty entry in the array, if any.
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+int ka_iter_init_restore(struct ka_iter *iter, struct kho_array *ka);
+
+/**
+ * ka_iter_setentry() - Set entry at current iterator position.
+ * @iter: KHO array iterator in write mode.
+ * @value: Value or pointer to store.
+ *
+ * Store @value at the current position of @iter. @iter must be in write mode.
+ * The iterator position is not advanced.
+ *
+ * Return: 0 on success, -errno on failure.
+ */
+int ka_iter_setentry(struct ka_iter *iter, const void *value);
+
+/**
+ * ka_iter_nextentry() - Advance iterator to the next non-empty entry.
+ * @iter: KHO array iterator.
+ *
+ * Advance @iter to the next non-empty entry in the array, skipping over
+ * empty entries and holes between pages.
+ *
+ * Return: The entry, or %NULL if the end of the array is reached.
+ */
+void *ka_iter_nextentry(struct ka_iter *iter);
+
+/**
+ * ka_iter_setpos() - Set iterator position.
+ * @iter: KHO array iterator.
+ * @pos: New position (must be >= current position).
+ *
+ * Set the iterator position to @pos. The position can only be moved forward.
+ * The iterator will point to the appropriate page for the given position.
+ *
+ * Return: 0 on success, -EINVAL if @pos is less than the current position.
+ */
+int ka_iter_setpos(struct ka_iter *iter, unsigned long pos);
+
+/**
+ * ka_iter_end() - Check if iterator has reached the end of the array.
+ * @iter: KHO array iterator.
+ *
+ * Return: %true if the iterator is at the end of the array, %false otherwise.
+ */
+bool ka_iter_end(struct ka_iter *iter);
+
+/**
+ * ka_iter_getpos() - Get current iterator position.
+ * @iter: KHO array iterator.
+ *
+ * Return: Current position in the array.
+ */
+static inline unsigned long ka_iter_getpos(struct ka_iter *iter)
+{
+	return iter->pos;
+}
+
+/**
+ * ka_iter_getentry() - Get entry at current iterator position.
+ * @iter: KHO array iterator.
+ *
+ * Return: Pointer to the entry at the current position, or %NULL if none.
+ */
+void *ka_iter_getentry(struct ka_iter *iter);
+
+/**
+ * ka_iter_for_each - Iterate over all non-empty entries in the array.
+ * @iter: KHO array iterator.
+ * @entry: Variable to store the current entry.
+ *
+ * Loop over all non-empty entries in the array starting from the current
+ * position.
+ */
+#define ka_iter_for_each(iter, entry) \
+	for ((entry) = ka_iter_getentry(iter); (entry); (entry) = ka_iter_nextentry((iter)))
+
+#endif /* LINUX_KHO_ARRAY_H */
diff --git a/kernel/Makefile b/kernel/Makefile
index c60623448235f..8baef3cb3979f 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -82,6 +82,7 @@ obj-$(CONFIG_KEXEC) += kexec.o
 obj-$(CONFIG_KEXEC_FILE) += kexec_file.o
 obj-$(CONFIG_KEXEC_ELF) += kexec_elf.o
 obj-$(CONFIG_KEXEC_HANDOVER) += kexec_handover.o
+obj-$(CONFIG_KEXEC_HANDOVER) += kho_array.o
 obj-$(CONFIG_BACKTRACE_SELF_TEST) += backtracetest.o
 obj-$(CONFIG_COMPAT) += compat.o
 obj-$(CONFIG_CGROUPS) += cgroup/
diff --git a/kernel/kho_array.c b/kernel/kho_array.c
new file mode 100644
index 0000000000000..bdac471c45c58
--- /dev/null
+++ b/kernel/kho_array.c
@@ -0,0 +1,209 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2025 Amazon.com Inc. or its affiliates.
+ * Pratyush Yadav
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+#define KA_PAGE_NR_ENTRIES ((PAGE_SIZE - sizeof(struct kho_array_page)) / sizeof(u64))
+
+#define KA_ITER_PAGEPOS(iter) ((iter)->pos - (iter)->cur->startpos)
+#define KA_PAGE(phys) ((phys) ? (struct kho_array_page *)__va((phys)) : NULL)
+
+bool ka_iter_end(struct ka_iter *iter)
+{
+	return !iter->cur || (KA_ITER_PAGEPOS(iter) >= KA_PAGE_NR_ENTRIES && !iter->cur->next);
+}
+
+void *ka_iter_getentry(struct ka_iter *iter)
+{
+	if (!iter->cur || KA_ITER_PAGEPOS(iter) >= KA_PAGE_NR_ENTRIES)
+		return NULL;
+
+	return (void *)iter->cur->entries[KA_ITER_PAGEPOS(iter)];
+}
+
+static int ka_iter_extend(struct ka_iter *iter)
+{
+	struct kho_array_page *kap;
+	struct folio *folio;
+	u64 phys;
+
+	if (!ka_iter_end(iter))
+		return 0;
+
+	folio = folio_alloc(GFP_KERNEL | __GFP_ZERO, 0);
+	if (!folio)
+		return -ENOMEM;
+
+	kap = folio_address(folio);
+	kap->startpos = rounddown(iter->pos, KA_PAGE_NR_ENTRIES);
+
+	phys = (u64)PFN_PHYS(folio_pfn(folio));
+	/*
+	 * If the iterator already has a page, insert the page after it.
+	 * Otherwise, set the page as the first in the array.
+	 */
+	if (iter->cur)
+		iter->cur->next = phys;
+	else
+		iter->ka->first = phys;
+
+	iter->cur = kap;
+
+	return 0;
+}
+
+void ka_iter_init_read(struct ka_iter *iter, struct kho_array *ka)
+{
+	memset(iter, 0, sizeof(*iter));
+	iter->ka = ka;
+	iter->mode = KA_ITER_READ;
+	iter->cur = KA_PAGE(ka->first);
+
+	/* Make the iterator point to the first valid entry. */
+	if (!ka_iter_getentry(iter))
+		ka_iter_nextentry(iter);
+}
+
+void ka_iter_init_write(struct ka_iter *iter, struct kho_array *ka)
+{
+	kho_array_init(ka);
+	memset(iter, 0, sizeof(*iter));
+	iter->ka = ka;
+	iter->mode = KA_ITER_WRITE;
+}
+
+int ka_iter_init_restore(struct ka_iter *iter, struct kho_array *ka)
+{
+	int err;
+
+	err = kho_array_restore(ka);
+	if (err)
+		return err;
+
+	ka_iter_init_read(iter, ka);
+	return 0;
+}
+
+int ka_iter_setpos(struct ka_iter *iter, unsigned long pos)
+{
+	if (pos < iter->pos)
+		return -EINVAL;
+
+	iter->pos = pos;
+
+	/*
+	 * The iterator must point to the highest page with startpos <= pos.
+	 * Advance it as far as possible.
+	 */
+	while (iter->cur && KA_PAGE(iter->cur->next) &&
+	       KA_PAGE(iter->cur->next)->startpos <= pos)
+		iter->cur = KA_PAGE(iter->cur->next);
+
+	return 0;
+}
+
+int ka_iter_setentry(struct ka_iter *iter, const void *value)
+{
+	int err = 0;
+
+	if (iter->mode != KA_ITER_WRITE)
+		return -EPERM;
+
+	err = ka_iter_extend(iter);
+	if (err)
+		return err;
+
+	iter->cur->entries[KA_ITER_PAGEPOS(iter)] = (u64)value;
+	return 0;
+}
+
+void *ka_iter_nextentry(struct ka_iter *iter)
+{
+	ka_iter_setpos(iter, iter->pos + 1);
+	while (!ka_iter_end(iter) && !ka_iter_getentry(iter)) {
+		/*
+		 * If we are in the hole between two pages, jump to the next
+		 * page.
+		 */
+		if (KA_ITER_PAGEPOS(iter) >= KA_PAGE_NR_ENTRIES)
+			/*
+			 * The check for ka_iter_end() above makes sure the
+			 * next page exists.
+			 *
+			 * TODO: This is a bit nasty and might attract review
+			 * comments. Can I make it cleaner?
+			 */
+			ka_iter_setpos(iter, KA_PAGE(iter->cur->next)->startpos);
+		else
+			ka_iter_setpos(iter, iter->pos + 1);
+	}
+
+	return ka_iter_getentry(iter);
+}
+
+bool kho_array_valid(struct kho_array *ka)
+{
+	return ka->magic == KHO_ARRAY_MAGIC && ka->version == KHO_ARRAY_VERSION;
+}
+
+void kho_array_init(struct kho_array *ka)
+{
+	memset(ka, 0, sizeof(*ka));
+	ka->magic = KHO_ARRAY_MAGIC;
+	ka->version = KHO_ARRAY_VERSION;
+}
+
+void kho_array_destroy(struct kho_array *ka)
+{
+	u64 cur = ka->first, next;
+
+	while (cur) {
+		next = KA_PAGE(cur)->next;
+		folio_put(pfn_folio(PHYS_PFN(cur)));
+		cur = next;
+	}
+
+	ka->magic = 0;
+}
+
+int kho_array_preserve(struct kho_array *ka)
+{
+	u64 cur = ka->first;
+	int err;
+
+	while (cur) {
+		err = kho_preserve_folio(pfn_folio(PHYS_PFN(cur)));
+		if (err)
+			return err;
+
+		cur = KA_PAGE(cur)->next;
+	}
+
+	return 0;
+}
+
+int kho_array_restore(struct kho_array *ka)
+{
+	u64 cur = ka->first;
+	struct folio *folio;
+
+	if (!kho_array_valid(ka))
+		return -EOPNOTSUPP;
+
+	while (cur) {
+		folio = kho_restore_folio(cur);
+		if (!folio)
+			return -ENOMEM;
+		cur = KA_PAGE(cur)->next;
+	}
+
+	return 0;
+}
-- 
2.47.3
From: Pratyush Yadav
To: Alexander Graf, Mike Rapoport, Changyuan Lyu, Andrew Morton,
    Baoquan He, Pratyush Yadav, Pasha Tatashin, Jason Gunthorpe,
    Thomas Weißschuh, Chris Li, Jason Miu, David Matlack, David Rientjes
Cc: linux-kernel@vger.kernel.org, kexec@lists.infradead.org, linux-mm@kvack.org
Subject: [RFC PATCH 2/4] kho: use KHO array for preserved memory bitmap serialization
Date: Tue, 9 Sep 2025 16:44:22 +0200
Message-ID: <20250909144426.33274-3-pratyush@kernel.org>
In-Reply-To: <20250909144426.33274-1-pratyush@kernel.org>
The preserved memory bitmap preservation creates a linked list of pages to
track the bitmaps for preserved memory. Essentially, it is a scattered
list of pointers grouped by folio order. Use a KHO array to hold the
pointers to the bitmaps instead. This moves the burden of tracking this
metadata to the KHO array layer, and makes the KHO core simpler.

Currently, the bitmaps are held in chunks, each of which is a fixed-size
array of pointers plus some metadata, including the order of the preserved
folios. The KHO array holds only pointers and has no mechanism for
grouping. To make the serialization format simpler, move the folio order
from struct khoser_mem_chunk to struct khoser_mem_bitmap_ptr.

The chunks that hold the bitmaps are not KHO-preserved since they are only
used during the scratch-only phase. The same holds true for the KHO array.
The pages which track the KHO array metadata are not KHO-preserved and
thus are only valid during the scratch phase of the next kernel. After
that, they are discarded and freed to buddy.

Signed-off-by: Pratyush Yadav
---
The diff is a bit hard to read.
The final result can be found at
https://git.kernel.org/pub/scm/linux/kernel/git/pratyush/linux.git/tree/kernel/kexec_handover.c?h=kho-array-rfc-v1#n227

 kernel/kexec_handover.c | 148 +++++++++++++++++++---------------------
 1 file changed, 69 insertions(+), 79 deletions(-)

diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c
index ecd1ac210dbd7..26f9f5295f07d 100644
--- a/kernel/kexec_handover.c
+++ b/kernel/kexec_handover.c
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include

 #include

@@ -80,15 +81,13 @@ struct kho_mem_track {
 	struct xarray orders;
 };

-struct khoser_mem_chunk;
-
 struct kho_serialization {
 	struct page *fdt;
 	struct list_head fdt_list;
 	struct dentry *sub_fdt_dir;
 	struct kho_mem_track track;
-	/* First chunk of serialized preserved memory map */
-	struct khoser_mem_chunk *preserved_mem_map;
+	/* Serialized preserved memory map */
+	struct kho_array *preserved_mem_map;
 };

 static void *xa_load_or_alloc(struct xarray *xa, unsigned long index, size_t sz)
@@ -226,11 +225,11 @@ EXPORT_SYMBOL_GPL(kho_restore_folio);

 /* Serialize and deserialize struct kho_mem_phys across kexec
  *
- * Record all the bitmaps in a linked list of pages for the next kernel to
- * process. Each chunk holds bitmaps of the same order and each block of bitmaps
- * starts at a given physical address. This allows the bitmaps to be sparse. The
- * xarray is used to store them in a tree while building up the data structure,
- * but the KHO successor kernel only needs to process them once in order.
+ * Record all the bitmaps in a KHO array for the next kernel to process. Each
+ * bitmap stores the order of the folios and starts at a given physical address.
+ * This allows the bitmaps to be sparse. The xarray is used to store them in a
+ * tree while building up the data structure, but the KHO successor kernel only
+ * needs to process them once in order.
  *
  * All of this memory is normal kmalloc() memory and is not marked for
  * preservation. The successor kernel will remain isolated to the scratch space
@@ -240,118 +239,107 @@ EXPORT_SYMBOL_GPL(kho_restore_folio);

 struct khoser_mem_bitmap_ptr {
 	phys_addr_t phys_start;
-	DECLARE_KHOSER_PTR(bitmap, struct kho_mem_phys_bits *);
-};
-
-struct khoser_mem_chunk_hdr {
-	DECLARE_KHOSER_PTR(next, struct khoser_mem_chunk *);
 	unsigned int order;
-	unsigned int num_elms;
-};
-
-#define KHOSER_BITMAP_SIZE \
-	((PAGE_SIZE - sizeof(struct khoser_mem_chunk_hdr)) / \
-	 sizeof(struct khoser_mem_bitmap_ptr))
-
-struct khoser_mem_chunk {
-	struct khoser_mem_chunk_hdr hdr;
-	struct khoser_mem_bitmap_ptr bitmaps[KHOSER_BITMAP_SIZE];
+	unsigned int __reserved;
+	DECLARE_KHOSER_PTR(bitmap, struct kho_mem_phys_bits *);
 };

-static_assert(sizeof(struct khoser_mem_chunk) == PAGE_SIZE);
-
-static struct khoser_mem_chunk *new_chunk(struct khoser_mem_chunk *cur_chunk,
-					  unsigned long order)
+static struct khoser_mem_bitmap_ptr *new_bitmap(phys_addr_t start,
+						struct kho_mem_phys_bits *bits,
+						unsigned int order)
 {
-	struct khoser_mem_chunk *chunk;
+	struct khoser_mem_bitmap_ptr *bitmap;

-	chunk = kzalloc(PAGE_SIZE, GFP_KERNEL);
-	if (!chunk)
+	bitmap = kzalloc(sizeof(*bitmap), GFP_KERNEL);
+	if (!bitmap)
 		return NULL;
-	chunk->hdr.order = order;
-	if (cur_chunk)
-		KHOSER_STORE_PTR(cur_chunk->hdr.next, chunk);
-	return chunk;
+
+	bitmap->phys_start = start;
+	bitmap->order = order;
+	KHOSER_STORE_PTR(bitmap->bitmap, bits);
+	return bitmap;
 }

-static void kho_mem_ser_free(struct khoser_mem_chunk *first_chunk)
+static void kho_mem_ser_free(struct kho_array *ka)
 {
-	struct khoser_mem_chunk *chunk = first_chunk;
+	struct khoser_mem_bitmap_ptr *elm;
+	struct ka_iter iter;

-	while (chunk) {
-		struct khoser_mem_chunk *tmp = chunk;
+	if (!ka)
+		return;

-		chunk = KHOSER_LOAD_PTR(chunk->hdr.next);
-		kfree(tmp);
-	}
+	ka_iter_init_read(&iter, ka);
+
+	ka_iter_for_each(&iter, elm)
+		kfree(elm);
+
+	kho_array_destroy(ka);
+	kfree(ka);
 }

 static int kho_mem_serialize(struct kho_serialization *ser)
 {
-	struct khoser_mem_chunk *first_chunk = NULL;
-	struct khoser_mem_chunk *chunk = NULL;
 	struct kho_mem_phys *physxa;
-	unsigned long order;
+	unsigned long order, pos = 0;
+	struct kho_array *ka = NULL;
+	struct ka_iter iter;
+
+	ka = kzalloc(sizeof(*ka), GFP_KERNEL);
+	if (!ka)
+		return -ENOMEM;
+	ka_iter_init_write(&iter, ka);

 	xa_for_each(&ser->track.orders, order, physxa) {
 		struct kho_mem_phys_bits *bits;
 		unsigned long phys;

-		chunk = new_chunk(chunk, order);
-		if (!chunk)
-			goto err_free;
-
-		if (!first_chunk)
-			first_chunk = chunk;
-
 		xa_for_each(&physxa->phys_bits, phys, bits) {
 			struct khoser_mem_bitmap_ptr *elm;
+			phys_addr_t start;
+
+			start = (phys * PRESERVE_BITS) << (order + PAGE_SHIFT);
+			elm = new_bitmap(start, bits, order);
+			if (!elm)
+				goto err_free;

-			if (chunk->hdr.num_elms == ARRAY_SIZE(chunk->bitmaps)) {
-				chunk = new_chunk(chunk, order);
-				if (!chunk)
-					goto err_free;
-			}
-
-			elm = &chunk->bitmaps[chunk->hdr.num_elms];
-			chunk->hdr.num_elms++;
-			elm->phys_start = (phys * PRESERVE_BITS)
-					  << (order + PAGE_SHIFT);
-			KHOSER_STORE_PTR(elm->bitmap, bits);
+			ka_iter_setpos(&iter, pos);
+			if (ka_iter_setentry(&iter, elm))
+				goto err_free;
+			pos++;
 		}
 	}

-	ser->preserved_mem_map = first_chunk;
+	ser->preserved_mem_map = ka;

 	return 0;

 err_free:
-	kho_mem_ser_free(first_chunk);
+	kho_mem_ser_free(ka);
 	return -ENOMEM;
 }

-static void __init deserialize_bitmap(unsigned int order,
-				      struct khoser_mem_bitmap_ptr *elm)
+static void __init deserialize_bitmap(struct khoser_mem_bitmap_ptr *elm)
 {
 	struct kho_mem_phys_bits *bitmap = KHOSER_LOAD_PTR(elm->bitmap);
 	unsigned long bit;

 	for_each_set_bit(bit, bitmap->preserve, PRESERVE_BITS) {
-		int sz = 1 << (order + PAGE_SHIFT);
+		int sz = 1 << (elm->order + PAGE_SHIFT);
 		phys_addr_t phys =
-			elm->phys_start + (bit << (order + PAGE_SHIFT));
+			elm->phys_start + (bit << (elm->order + PAGE_SHIFT));
 		struct page *page = phys_to_page(phys);

 		memblock_reserve(phys, sz);
 		memblock_reserved_mark_noinit(phys, sz);
-		page->private = order;
+		page->private = elm->order;
 	}
 }

 static void __init kho_mem_deserialize(const void *fdt)
 {
-	struct khoser_mem_chunk *chunk;
+	struct khoser_mem_bitmap_ptr *elm;
 	const phys_addr_t *mem;
+	struct kho_array *ka;
+	struct ka_iter iter;
 	int len;

 	mem = fdt_getprop(fdt, 0, PROP_PRESERVED_MEMORY_MAP, &len);
@@ -361,15 +349,17 @@ static void __init kho_mem_deserialize(const void *fdt)
 		return;
 	}

-	chunk = *mem ? phys_to_virt(*mem) : NULL;
-	while (chunk) {
-		unsigned int i;
-
-		for (i = 0; i != chunk->hdr.num_elms; i++)
-			deserialize_bitmap(chunk->hdr.order,
-					   &chunk->bitmaps[i]);
-		chunk = KHOSER_LOAD_PTR(chunk->hdr.next);
+	ka = *mem ? phys_to_virt(*mem) : NULL;
+	if (!ka)
+		return;
+	if (!kho_array_valid(ka)) {
+		pr_err("invalid KHO array for preserved memory bitmaps\n");
+		return;
 	}
+
+	ka_iter_init_read(&iter, ka);
+	ka_iter_for_each(&iter, elm)
+		deserialize_bitmap(elm);
 }

 /*
-- 
2.47.3
From: Pratyush Yadav
To: Alexander Graf, Mike Rapoport, Changyuan Lyu, Andrew Morton,
    Baoquan He, Pratyush Yadav, Pasha Tatashin, Jason Gunthorpe,
    Thomas Weißschuh, Chris Li, Jason Miu, David Matlack, David Rientjes
Cc: linux-kernel@vger.kernel.org, kexec@lists.infradead.org, linux-mm@kvack.org
Subject: [RFC PATCH 3/4] kho: add support for preserving vmalloc allocations
Date: Tue, 9 Sep 2025 16:44:23 +0200
Message-ID: <20250909144426.33274-4-pratyush@kernel.org>
In-Reply-To: <20250909144426.33274-1-pratyush@kernel.org>
From: "Mike Rapoport (Microsoft)"

A vmalloc allocation is preserved using a binary structure similar to
the global KHO memory tracker: a linked list of pages, where each page
is an array of physical addresses of the pages in the vmalloc area.

kho_preserve_vmalloc() hands out the physical address of the head page
to the caller. This address is used as the argument to
kho_restore_vmalloc() to restore the mapping in the vmalloc address
space and populate it with the preserved pages.

Signed-off-by: Mike Rapoport (Microsoft)
[pratyush@kernel.org: use KHO array instead of linked list of pages to
track physical addresses]
Signed-off-by: Pratyush Yadav
---
 include/linux/kexec_handover.h |  21 +++++
 kernel/kexec_handover.c        | 143 +++++++++++++++++++++++++++++++++
 2 files changed, 164 insertions(+)

diff --git a/include/linux/kexec_handover.h b/include/linux/kexec_handover.h
index 348844cffb136..633f94cec1a35 100644
--- a/include/linux/kexec_handover.h
+++ b/include/linux/kexec_handover.h
@@ -4,6 +4,7 @@

 #include
 #include
+#include

 struct kho_scratch {
 	phys_addr_t addr;
@@ -37,13 +38,23 @@ struct notifier_block;
 })

 struct kho_serialization;
+struct kho_vmalloc;

 #ifdef CONFIG_KEXEC_HANDOVER
+struct kho_vmalloc {
+	struct kho_array ka;
+	unsigned int total_pages;
+	unsigned int flags;
+	unsigned short order;
+};
+
 bool kho_is_enabled(void);

 int kho_preserve_folio(struct folio *folio);
+int kho_preserve_vmalloc(void *ptr, struct kho_vmalloc *preservation);
 int kho_preserve_phys(phys_addr_t phys, size_t size);
 struct folio *kho_restore_folio(phys_addr_t phys);
+void *kho_restore_vmalloc(struct kho_vmalloc *preservation);
 int kho_add_subtree(struct kho_serialization *ser, const char *name, void *fdt);
 int kho_retrieve_subtree(const char *name, phys_addr_t *phys);

@@ -70,11 +81,21 @@ static inline int kho_preserve_phys(phys_addr_t phys, size_t size)
 	return -EOPNOTSUPP;
 }

+static inline int kho_preserve_vmalloc(void *ptr, struct kho_vmalloc *preservation)
+{
+	return -EOPNOTSUPP;
+}
+
 static inline struct folio *kho_restore_folio(phys_addr_t phys)
 {
 	return NULL;
 }

+static inline void *kho_restore_vmalloc(struct kho_vmalloc *preservation)
+{
+	return NULL;
+}
+
 static inline int kho_add_subtree(struct kho_serialization *ser,
 				  const char *name, void *fdt)
 {
diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c
index 26f9f5295f07d..5f89134ceeee0 100644
--- a/kernel/kexec_handover.c
+++ b/kernel/kexec_handover.c
@@ -19,6 +19,7 @@
 #include
 #include
 #include
+#include

 #include

@@ -723,6 +724,148 @@ int kho_preserve_phys(phys_addr_t phys, size_t size)
 }
 EXPORT_SYMBOL_GPL(kho_preserve_phys);

+#define KHO_VMALLOC_FLAGS_MASK	(VM_ALLOC | VM_ALLOW_HUGE_VMAP)
+
+/**
+ * kho_preserve_vmalloc - preserve memory allocated with vmalloc() across kexec
+ * @ptr: pointer to the area in vmalloc address space
+ * @preservation: pointer to metadata for preserved data.
+ *
+ * Instructs KHO to preserve the area in vmalloc address space at @ptr. The
+ * physical pages mapped at @ptr will be preserved and on successful return
+ * @preservation will hold the structure that describes the metadata for the
+ * preserved pages. @preservation itself is not KHO-preserved. The caller must
+ * do that.
+ *
+ * NOTE: Memory allocated with the vmalloc_node() variants cannot be reliably
+ * restored on the same node.
+ *
+ * Return: 0 on success, error code on failure
+ */
+int kho_preserve_vmalloc(void *ptr, struct kho_vmalloc *preservation)
+{
+	struct kho_mem_track *track = &kho_out.ser.track;
+	struct vm_struct *vm = find_vm_area(ptr);
+	unsigned int order, flags;
+	struct ka_iter iter;
+	int err;
+
+	if (!vm)
+		return -EINVAL;
+
+	if (vm->flags & ~KHO_VMALLOC_FLAGS_MASK)
+		return -EOPNOTSUPP;
+
+	flags = vm->flags & KHO_VMALLOC_FLAGS_MASK;
+	order = get_vm_area_page_order(vm);
+
+	preservation->total_pages = vm->nr_pages;
+	preservation->flags = flags;
+	preservation->order = order;
+
+	ka_iter_init_write(&iter, &preservation->ka);
+
+	for (int i = 0, pos = 0; i < vm->nr_pages; i += (1 << order), pos++) {
+		phys_addr_t phys = page_to_phys(vm->pages[i]);
+
+		err = __kho_preserve_order(track, PHYS_PFN(phys), order);
+		if (err)
+			goto err_free;
+
+		err = ka_iter_setpos(&iter, pos);
+		if (err)
+			goto err_free;
+
+		err = ka_iter_setentry(&iter, ka_mk_value(phys));
+		if (err)
+			goto err_free;
+	}
+
+	err = kho_array_preserve(&preservation->ka);
+	if (err)
+		goto err_free;
+
+	return 0;
+
+err_free:
+	kho_array_destroy(&preservation->ka);
+	return err;
+}
+EXPORT_SYMBOL_GPL(kho_preserve_vmalloc);
+
+/**
+ * kho_restore_vmalloc - recreates and populates an area in vmalloc address
+ * space from the preserved memory.
+ * @preservation: the preservation metadata.
+ *
+ * Recreates an area in vmalloc address space and populates it with memory that
+ * was preserved using kho_preserve_vmalloc().
+ *
+ * Return: pointer to the area in the vmalloc address space, NULL on failure.
+ */
+void *kho_restore_vmalloc(struct kho_vmalloc *preservation)
+{
+	unsigned int align, order, shift, flags;
+	unsigned int idx = 0, nr;
+	unsigned long addr, size;
+	struct vm_struct *area;
+	struct page **pages;
+	struct ka_iter iter;
+	void *entry;
+	int err;
+
+	flags = preservation->flags;
+	if (flags & ~KHO_VMALLOC_FLAGS_MASK)
+		return NULL;
+
+	err = ka_iter_init_restore(&iter, &preservation->ka);
+	if (err)
+		return NULL;
+
+	nr = preservation->total_pages;
+	pages = kvmalloc_array(nr, sizeof(*pages), GFP_KERNEL);
+	if (!pages)
+		goto err_ka_destroy;
+	order = preservation->order;
+	shift = PAGE_SHIFT + order;
+	align = 1 << shift;
+
+	ka_iter_for_each(&iter, entry) {
+		phys_addr_t phys = ka_to_value(entry);
+		struct page *page;
+
+		page = phys_to_page(phys);
+		kho_restore_page(page, 0);
+		pages[idx++] = page;
+		phys += PAGE_SIZE;
+	}
+
+	area = __get_vm_area_node(nr * PAGE_SIZE, align, shift, flags,
+				  VMALLOC_START, VMALLOC_END, NUMA_NO_NODE,
+				  GFP_KERNEL, __builtin_return_address(0));
+	if (!area)
+		goto err_free_pages_array;
+
+	addr = (unsigned long)area->addr;
+	size = get_vm_area_size(area);
+	err = vmap_pages_range(addr, addr + size, PAGE_KERNEL, pages, shift);
+	if (err)
+		goto err_free_vm_area;
+
+	kho_array_destroy(&preservation->ka);
+
+	return area->addr;
+
+err_free_vm_area:
+	free_vm_area(area);
+err_free_pages_array:
+	kvfree(pages);
+err_ka_destroy:
+	kho_array_destroy(&preservation->ka);
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(kho_restore_vmalloc);
+
 /* Handling for debug/kho/out */

 static struct dentry *debugfs_root;
--
2.47.3

From nobody Thu Oct 2 22:47:39 2025
From: Pratyush Yadav
To: Alexander Graf, Mike Rapoport, Changyuan Lyu, Andrew Morton,
	Baoquan He, Pratyush Yadav, Pasha Tatashin, Jason Gunthorpe,
	Thomas Weißschuh, Chris Li, Jason Miu, David Matlack,
	David Rientjes
Cc: linux-kernel@vger.kernel.org, kexec@lists.infradead.org,
	linux-mm@kvack.org
Subject: [RFC PATCH 4/4] lib/test_kho: use kho_preserve_vmalloc instead of
 storing addresses in fdt
Date: Tue, 9 Sep 2025 16:44:24 +0200
Message-ID: <20250909144426.33274-5-pratyush@kernel.org>
In-Reply-To: <20250909144426.33274-1-pratyush@kernel.org>
References: <20250909144426.33274-1-pratyush@kernel.org>

From: "Mike Rapoport (Microsoft)"

The KHO test stores physical addresses of the preserved folios directly
in the fdt. Use kho_preserve_vmalloc() instead, and use
kho_restore_vmalloc() to retrieve the addresses after kexec.

This makes the test more scalable on the one hand, and adds test
coverage for kho_preserve_vmalloc() on the other.

Signed-off-by: Mike Rapoport (Microsoft)
[pratyush@kernel.org: use the KHO-array version of kho_restore_vmalloc()]
Signed-off-by: Pratyush Yadav
---
 lib/test_kho.c | 30 +++++++++++++++++++++++-------
 1 file changed, 23 insertions(+), 7 deletions(-)

diff --git a/lib/test_kho.c b/lib/test_kho.c
index c2eb899c3b456..3f4cb39cd917e 100644
--- a/lib/test_kho.c
+++ b/lib/test_kho.c
@@ -32,6 +32,7 @@ module_param(max_mem, long, 0644);
 struct kho_test_state {
 	unsigned int nr_folios;
 	struct folio **folios;
+	phys_addr_t *folios_info;
 	struct folio *fdt;
 	__wsum csum;
 };
@@ -67,14 +68,18 @@ static struct notifier_block kho_test_nb = {

 static int kho_test_save_data(struct kho_test_state *state, void *fdt)
 {
+	struct kho_vmalloc folios_info_preservation = {};
 	phys_addr_t *folios_info __free(kvfree) = NULL;
 	int err = 0;

-	folios_info = kvmalloc_array(state->nr_folios, sizeof(*folios_info),
-				     GFP_KERNEL);
+	folios_info = vmalloc_array(state->nr_folios, sizeof(*folios_info));
 	if (!folios_info)
 		return -ENOMEM;

+	err = kho_preserve_vmalloc(folios_info, &folios_info_preservation);
+	if (err)
+		return err;
+
 	for (int i = 0; i < state->nr_folios; i++) {
 		struct folio *folio = state->folios[i];
 		unsigned int order = folio_order(folio);
@@ -89,11 +94,14 @@ static int kho_test_save_data(struct kho_test_state *state, void *fdt)
 	err |= fdt_begin_node(fdt, "data");
 	err |= fdt_property(fdt, "nr_folios", &state->nr_folios,
 			    sizeof(state->nr_folios));
-	err |= fdt_property(fdt, "folios_info", folios_info,
-			    state->nr_folios * sizeof(*folios_info));
+	err |= fdt_property(fdt, "folios_info", &folios_info_preservation,
+			    sizeof(folios_info_preservation));
 	err |= fdt_property(fdt, "csum", &state->csum, sizeof(state->csum));
 	err |= fdt_end_node(fdt);

+	if (!err)
+		state->folios_info = no_free_ptr(folios_info);
+
 	return err;
 }

@@ -197,7 +205,8 @@ static int kho_test_save(void)
 static int kho_test_restore_data(const void *fdt, int node)
 {
 	const unsigned int *nr_folios;
-	const phys_addr_t *folios_info;
+	const struct kho_vmalloc *folios_info_preservation;
+	phys_addr_t *folios_info;
 	const __wsum *old_csum;
 	__wsum csum = 0;
 	int len;
@@ -212,8 +221,12 @@ static int kho_test_restore_data(const void *fdt, int node)
 	if (!old_csum || len != sizeof(*old_csum))
 		return -EINVAL;

-	folios_info = fdt_getprop(fdt, node, "folios_info", &len);
-	if (!folios_info || len != sizeof(*folios_info) * *nr_folios)
+	folios_info_preservation = fdt_getprop(fdt, node, "folios_info", &len);
+	if (!folios_info_preservation || len != sizeof(*folios_info_preservation))
+		return -EINVAL;
+
+	folios_info = kho_restore_vmalloc((struct kho_vmalloc *)folios_info_preservation);
+	if (!folios_info)
 		return -EINVAL;

 	for (int i = 0; i < *nr_folios; i++) {
@@ -233,6 +246,8 @@ static int kho_test_restore_data(const void *fdt, int node)
 		folio_put(folio);
 	}

+	vfree(folios_info);
+
 	if (csum != *old_csum)
 		return -EINVAL;

@@ -291,6 +306,7 @@ static void kho_test_cleanup(void)
 		folio_put(kho_test_state.folios[i]);

 	kvfree(kho_test_state.folios);
+	vfree(kho_test_state.folios_info);
 }

 static void __exit kho_test_exit(void)
--
2.47.3
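Editor's note: the two new calls in patch 3 pair up as preserve-before-kexec / restore-after-kexec, with the caller responsible for handing over the `struct kho_vmalloc` descriptor itself (as lib/test_kho.c does via an fdt property in patch 4). A minimal sketch of a caller, based only on the kernel-doc above — the function names `my_save`/`my_restore`, the buffer size, and the descriptor handover are illustrative assumptions, not part of this series:

```c
/* Sketch only: assumes a KHO-enabled kernel with this series applied. */
#include <linux/vmalloc.h>
#include <linux/kexec_handover.h>

/* The descriptor must itself be handed over (e.g. in an fdt property). */
static struct kho_vmalloc my_preservation;

static int my_save(void)
{
	void *buf = vmalloc(SZ_1M);	/* illustrative size */
	int err;

	if (!buf)
		return -ENOMEM;

	/* Record the physical pages backing buf for the next kernel. */
	err = kho_preserve_vmalloc(buf, &my_preservation);
	if (err) {
		vfree(buf);
		return err;
	}
	return 0;
}

/* After kexec: rebuild the vmalloc mapping from the preserved pages. */
static void *my_restore(void)
{
	return kho_restore_vmalloc(&my_preservation);
}
```

Note that on the restore side the returned pointer is a fresh vmalloc area populated with the preserved pages; freeing it with vfree() is the caller's responsibility, as the updated lib/test_kho.c shows.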