[RFC v1 0/4] Make KHO Stateless

Jason Miu posted 4 patches 2 weeks, 1 day ago
There is a newer version of this series
[RFC v1 0/4] Make KHO Stateless
Posted by Jason Miu 2 weeks, 1 day ago
This series transitions KHO from an xarray-based metadata tracking
system with serialization to using page table like data structures
that can be passed directly to the next kernel.

The key motivations for this change are to:
- Eliminate the need for data serialization before kexec.
- Remove the former KHO state machine by deprecating the finalize
  and abort states.
- Pass preservation metadata more directly to the next kernel via the FDT.

The new approach uses a per-order page table structure (kho_order_table,
kho_page_table, kho_bitmap_table) to mark preserved pages. The physical
address of the root `kho_order_table` is passed in the FDT, allowing the
next kernel to reconstruct the preserved memory map.
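
As a rough illustration of the layout described above, here is a minimal
userspace sketch, not the code from this series. Every name (sketch_root,
sketch_mid, sketch_leaf, sketch_preserve, sketch_is_preserved) and every
size is an assumption made for the example. It keeps one small tree per
order, with a bitmap page as the leaf, and uses ordinary pointers where
the real structure would have to store physical addresses so that the
next kernel can follow them.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* All names and sizes below are hypothetical, chosen only for this sketch. */
#define SKETCH_PAGE_SIZE     4096UL
#define SKETCH_MAX_ORDER     11U
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * 8)
#define SKETCH_LEAF_BITS     (SKETCH_PAGE_SIZE * 8)
#define SKETCH_FANOUT        (SKETCH_PAGE_SIZE / sizeof(void *))

/* Leaf: one page worth of bits, one bit per block of 2^order pages. */
struct sketch_leaf {
	unsigned long bits[SKETCH_PAGE_SIZE / sizeof(unsigned long)];
};

/* Middle level: one page worth of leaf pointers. */
struct sketch_mid {
	struct sketch_leaf *leaf[SKETCH_FANOUT];
};

/* Root: one subtree per allocation order. */
struct sketch_root {
	struct sketch_mid *order[SKETCH_MAX_ORDER];
};

static_assert(sizeof(struct sketch_leaf) == SKETCH_PAGE_SIZE,
	      "a leaf must fill exactly one page");

/* Mark the 2^order block starting at pfn as preserved. Returns 0 or -1. */
static int sketch_preserve(struct sketch_root *root, uint64_t pfn,
			   unsigned int order)
{
	uint64_t idx, slot, bit;
	struct sketch_mid *mid;
	struct sketch_leaf *leaf;

	/* Reject unknown orders and pfns not aligned to the block size. */
	if (order >= SKETCH_MAX_ORDER || (pfn & ((1ULL << order) - 1)))
		return -1;

	idx = pfn >> order;              /* index in units of 2^order pages */
	slot = idx / SKETCH_LEAF_BITS;   /* which leaf under this order */
	bit = idx % SKETCH_LEAF_BITS;    /* which bit inside that leaf */
	if (slot >= SKETCH_FANOUT)
		return -1;               /* beyond what two levels can cover */

	mid = root->order[order];
	if (!mid) {
		mid = calloc(1, sizeof(*mid));
		if (!mid)
			return -1;
		root->order[order] = mid;
	}

	leaf = mid->leaf[slot];
	if (!leaf) {
		leaf = calloc(1, sizeof(*leaf));
		if (!leaf)
			return -1;
		mid->leaf[slot] = leaf;
	}

	leaf->bits[bit / SKETCH_BITS_PER_LONG] |=
		1UL << (bit % SKETCH_BITS_PER_LONG);
	return 0;
}

/* Return 1 if the 2^order block starting at pfn is marked, else 0. */
static int sketch_is_preserved(const struct sketch_root *root, uint64_t pfn,
			       unsigned int order)
{
	uint64_t idx, slot, bit;
	const struct sketch_mid *mid;
	const struct sketch_leaf *leaf;

	if (order >= SKETCH_MAX_ORDER || (pfn & ((1ULL << order) - 1)))
		return 0;

	idx = pfn >> order;
	slot = idx / SKETCH_LEAF_BITS;
	bit = idx % SKETCH_LEAF_BITS;
	if (slot >= SKETCH_FANOUT)
		return 0;

	mid = root->order[order];
	leaf = mid ? mid->leaf[slot] : NULL;
	if (!leaf)
		return 0;

	return (int)((leaf->bits[bit / SKETCH_BITS_PER_LONG] >>
		      (bit % SKETCH_BITS_PER_LONG)) & 1UL);
}
```

Since marking a block only sets a bit in a structure that already has
its final shape, nothing in this sketch needs a separate serialization
pass before handover. Freeing the tree is left out for brevity.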

The series includes the following changes:
1.  Introduce the KHO page table data structures.
2.  Adopt the KHO page tables, remove the xarray-based tracking and
    the serialization/finalization code.
3.  Update memblock to use direct KHO API calls, and adjust KHO FDT
    completion timing.
4.  Remove the KHO notifier system infrastructure.

Jason Miu (4):
  kho: Introduce KHO page table data structures
  kho: Adopt KHO page tables and remove serialization
  memblock: Remove KHO notifier usage
  kho: Remove notifier system infrastructure

 include/linux/kexec_handover.h |  44 +-
 kernel/kexec_core.c            |   4 +
 kernel/kexec_handover.c        | 821 ++++++++++++++++-----------------
 kernel/kexec_internal.h        |   2 +
 mm/memblock.c                  |  46 +-
 5 files changed, 404 insertions(+), 513 deletions(-)

-- 
2.51.0.384.g4c02a37b29-goog
Re: [RFC v1 0/4] Make KHO Stateless
Posted by Jason Gunthorpe 2 weeks ago
On Tue, Sep 16, 2025 at 07:50:15PM -0700, Jason Miu wrote:
> This series transitions KHO from an xarray-based metadata tracking
> system with serialization to using page table like data structures
> that can be passed directly to the next kernel.
> 
> The key motivations for this change are to:
> - Eliminate the need for data serialization before kexec.
> - Remove the former KHO state machine by deprecating the finalize
>   and abort states.
> - Pass preservation metadata more directly to the next kernel via the FDT.
> 
> The new approach uses a per-order page table structure (kho_order_table,
> kho_page_table, kho_bitmap_table) to mark preserved pages. The physical
> address of the root `kho_order_table` is passed in the FDT, allowing the
> next kernel to reconstruct the preserved memory map.

It is not a "page table" structure, it is just a radix tree with bits
as the leaf.

Jason
Re: [RFC v1 0/4] Make KHO Stateless
Posted by Matthew Wilcox 1 week, 3 days ago
On Wed, Sep 17, 2025 at 08:36:09AM -0300, Jason Gunthorpe wrote:
> On Tue, Sep 16, 2025 at 07:50:15PM -0700, Jason Miu wrote:
> > This series transitions KHO from an xarray-based metadata tracking
> > system with serialization to using page table like data structures
> > that can be passed directly to the next kernel.
> > 
> > The key motivations for this change are to:
> > - Eliminate the need for data serialization before kexec.
> > - Remove the former KHO state machine by deprecating the finalize
> >   and abort states.
> > - Pass preservation metadata more directly to the next kernel via the FDT.
> > 
> > The new approach uses a per-order page table structure (kho_order_table,
> > kho_page_table, kho_bitmap_table) to mark preserved pages. The physical
> > address of the root `kho_order_table` is passed in the FDT, allowing the
> > next kernel to reconstruct the preserved memory map.
> 
> It is not a "page table" structure, it is just a radix tree with bits
> as the leaf.

Sounds like the IDA data structure.  Maybe that API needs to be enhanced
for this use case, but surely using the same data structure would be a
good thing?
Re: [RFC v1 0/4] Make KHO Stateless
Posted by Pasha Tatashin 1 week, 3 days ago
On Sun, Sep 21, 2025 at 6:26 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Wed, Sep 17, 2025 at 08:36:09AM -0300, Jason Gunthorpe wrote:
> > On Tue, Sep 16, 2025 at 07:50:15PM -0700, Jason Miu wrote:
> > > This series transitions KHO from an xarray-based metadata tracking
> > > system with serialization to using page table like data structures
> > > that can be passed directly to the next kernel.
> > >
> > > The key motivations for this change are to:
> > > - Eliminate the need for data serialization before kexec.
> > > - Remove the former KHO state machine by deprecating the finalize
> > >   and abort states.
> > > - Pass preservation metadata more directly to the next kernel via the FDT.
> > >
> > > The new approach uses a per-order page table structure (kho_order_table,
> > > kho_page_table, kho_bitmap_table) to mark preserved pages. The physical
> > > address of the root `kho_order_table` is passed in the FDT, allowing the
> > > next kernel to reconstruct the preserved memory map.
> >
> > It is not a "page table" structure, it is just a radix tree with bits
> > as the leaf.
>
> Sounds like the IDA data structure.  Maybe that API needs to be enhanced
> for this use case, but surely using the same data structure would be a
> good thing?

Normally, I would agree, but in this case, this has to be a simple
data structure that, in the long run, is going to be stable between
different kernel versions: the old and the next kernel must understand
it. Therefore, relying on any external data structure would require
the maintainers and other developers to be aware of this rather
unusual kernel requirement. So, I think it is much better to keep this
implementation private to KHO, whose only responsibility is reliably
passing memory pages from the old kernel to the next kernel.
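
To make the stability requirement concrete, here is a small sketch of
what a layout frozen across kernel versions could look like. It is an
assumption made for illustration, not the format used by KHO: the
names (sketch_handover_header, sketch_handover_node, sketch_accept) and
the compatible string are hypothetical. The idea is to use only
fixed-width fields, store physical addresses instead of pointers, pin
sizes and offsets at build time, and have the next kernel refuse a
format it does not recognize.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical format identifier; not a string defined by KHO. */
#define SKETCH_COMPAT "sketch-kho-radix-v1"

/* Interior node: physical addresses of children, never kernel pointers. */
struct sketch_handover_node {
	uint64_t child_phys[512];
};

/* Header the previous kernel would hand to the next one. */
struct sketch_handover_header {
	char compatible[32];   /* NUL-terminated format identifier */
	uint64_t root_phys;    /* physical address of the root node */
};

/* Freeze the layout: any accidental change fails the build. */
static_assert(sizeof(struct sketch_handover_node) == 4096,
	      "node must stay exactly one 4 KiB page");
static_assert(offsetof(struct sketch_handover_header, root_phys) == 32,
	      "root_phys offset is part of the format");
static_assert(sizeof(struct sketch_handover_header) == 40,
	      "header size is part of the format");

/* Next-kernel side: return 1 only for a format this code understands. */
static int sketch_accept(const struct sketch_handover_header *h)
{
	/* Guard against an unterminated string before comparing. */
	if (!memchr(h->compatible, '\0', sizeof(h->compatible)))
		return 0;
	return strcmp(h->compatible, SKETCH_COMPAT) == 0;
}
```

Keeping such definitions private to one subsystem means only its
maintainers need to remember that a change to them is a format change.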
Re: [RFC v1 0/4] Make KHO Stateless
Posted by Pasha Tatashin 2 weeks ago
On Wed, Sep 17, 2025 at 7:36 AM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Tue, Sep 16, 2025 at 07:50:15PM -0700, Jason Miu wrote:
> > This series transitions KHO from an xarray-based metadata tracking
> > system with serialization to using page table like data structures
> > that can be passed directly to the next kernel.
> >
> > The key motivations for this change are to:
> > - Eliminate the need for data serialization before kexec.
> > - Remove the former KHO state machine by deprecating the finalize
> >   and abort states.
> > - Pass preservation metadata more directly to the next kernel via the FDT.
> >
> > The new approach uses a per-order page table structure (kho_order_table,
> > kho_page_table, kho_bitmap_table) to mark preserved pages. The physical
> > address of the root `kho_order_table` is passed in the FDT, allowing the
> > next kernel to reconstruct the preserved memory map.
>
> It is not a "page table" structure, it is just a radix tree with bits
> as the leaf.

To be fair, it is referred to above as a page table *like* data
structure, but I agree that "KHO radix tree" sounds like a good overall
name for this, and it might make sense to rename kho_page_table to
kho_radix_tree in other places.

>
> Jason