[PATCH v6 00/10] liveupdate: Rework KHO for in-kernel users & Fix memory corruption

Pasha Tatashin posted 10 patches 3 months, 3 weeks ago
[PATCH v6 00/10] liveupdate: Rework KHO for in-kernel users & Fix memory corruption
Posted by Pasha Tatashin 3 months, 3 weeks ago
This series addresses review comments, combines the two series [1] and
[2] into one, and adds Reviewed-by tags.

This series refactors the KHO framework to better support in-kernel
users like the upcoming LUO. The current design, which relies on a
notifier chain and debugfs for control, is too restrictive for direct
programmatic use.

The core of this rework is the removal of the notifier chain in favor of
a direct registration API. This decouples clients from the shutdown-time
finalization sequence, allowing them to manage their preserved state
more flexibly and at any time.
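
To make the contrast with a notifier chain concrete, here is a minimal
user-space sketch of a direct registration model. It is not code from this
series: every demo_* name, the fixed-size table, and the return values are
hypothetical and chosen only for illustration. The point it tries to show
is that clients add and remove their state whenever they like, and
finalization just walks what is registered instead of calling back into
clients.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_SUBTREES 8

/* Hypothetical registry entry: a named blob a client wants preserved. */
struct demo_subtree {
	const char *name;
	const void *blob;
};

static struct demo_subtree demo_registry[DEMO_MAX_SUBTREES];
static size_t demo_count;

/* Register a blob directly. Returns 0 on success, -1 on error. */
static int demo_add_subtree(const char *name, const void *blob)
{
	size_t i;

	assert(name && blob);
	if (demo_count == DEMO_MAX_SUBTREES)
		return -1;	/* table is full */
	for (i = 0; i < demo_count; i++)
		if (strcmp(demo_registry[i].name, name) == 0)
			return -1;	/* name already registered */

	demo_registry[demo_count].name = name;
	demo_registry[demo_count].blob = blob;
	demo_count++;
	return 0;
}

/* Unregister by blob pointer. Returns 0 if found, -1 otherwise. */
static int demo_remove_subtree(const void *blob)
{
	size_t i;

	for (i = 0; i < demo_count; i++) {
		if (demo_registry[i].blob == blob) {
			/* Move the last entry into the freed slot. */
			demo_registry[i] = demo_registry[demo_count - 1];
			demo_count--;
			return 0;
		}
	}
	return -1;
}

/*
 * Finalization only inspects the registry; it calls no client code,
 * so clients are not tied to the shutdown-time sequence.
 * Returns the number of entries that would be handed over.
 */
static size_t demo_finalize(void)
{
	return demo_count;
}
```

In this model a client could call demo_add_subtree() long before shutdown
and demo_remove_subtree() if it changes its mind; a real implementation
would also need locking, which the sketch leaves out.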

Also, this series fixes a memory corruption bug in KHO that occurs when
KFENCE is enabled.

The root cause is that KHO metadata, allocated via kzalloc(), can be
randomly serviced by kfence_alloc(). When a kernel boots via KHO, the
early memblock allocator is restricted to a "scratch area". This forces
the KFENCE pool to be allocated within this scratch area, creating a
conflict. If KHO metadata is subsequently placed in this pool, it gets
corrupted during the next kexec operation.
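
The "warn and fail" patch in this series rejects metadata or preserved
memory that lands in the scratch area. As a rough illustration of that
kind of check, the user-space sketch below tests a physical range against
a list of scratch ranges. The struct, the function name, and the
half-open [base, base + size) convention are assumptions made for the
example, not the kernel's actual implementation, and address overflow is
ignored for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one scratch region: [base, base + size). */
struct demo_range {
	uint64_t base;
	uint64_t size;
};

/*
 * Return true if [phys, phys + size) intersects any scratch region.
 * An empty range never overlaps anything.
 */
static bool demo_overlaps_scratch(const struct demo_range *scratch, size_t n,
				  uint64_t phys, uint64_t size)
{
	size_t i;

	assert(scratch || n == 0);
	if (size == 0)
		return false;

	for (i = 0; i < n; i++) {
		uint64_t s_start = scratch[i].base;
		uint64_t s_end = s_start + scratch[i].size;

		/* Two half-open ranges overlap iff each starts before the other ends. */
		if (phys < s_end && s_start < phys + size)
			return true;
	}
	return false;
}
```

A caller would refuse to preserve a range (or to use it for metadata)
whenever this returns true, since memory in the scratch area is what the
next kernel's early allocator is free to reuse.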

[1] https://lore.kernel.org/all/20251007033100.836886-1-pasha.tatashin@soleen.com
[2] https://lore.kernel.org/all/20251015053121.3978358-1-pasha.tatashin@soleen.com

Mike Rapoport (Microsoft) (1):
  kho: drop notifiers

Pasha Tatashin (9):
  kho: allow to drive kho from within kernel
  kho: make debugfs interface optional
  kho: add interfaces to unpreserve folios and page ranges
  kho: don't unpreserve memory during abort
  liveupdate: kho: move to kernel/liveupdate
  kho: move kho debugfs directory to liveupdate
  liveupdate: kho: warn and fail on metadata or preserved memory in
    scratch area
  liveupdate: kho: Increase metadata bitmap size to PAGE_SIZE
  liveupdate: kho: allocate metadata directly from the buddy allocator

 Documentation/core-api/kho/concepts.rst     |   2 +-
 MAINTAINERS                                 |   3 +-
 include/linux/kexec_handover.h              |  53 +-
 init/Kconfig                                |   2 +
 kernel/Kconfig.kexec                        |  15 -
 kernel/Makefile                             |   2 +-
 kernel/liveupdate/Kconfig                   |  38 ++
 kernel/liveupdate/Makefile                  |   5 +
 kernel/{ => liveupdate}/kexec_handover.c    | 588 +++++++++-----------
 kernel/liveupdate/kexec_handover_debug.c    |  25 +
 kernel/liveupdate/kexec_handover_debugfs.c  | 216 +++++++
 kernel/liveupdate/kexec_handover_internal.h |  56 ++
 lib/test_kho.c                              |  30 +-
 mm/memblock.c                               |  62 +--
 tools/testing/selftests/kho/init.c          |   2 +-
 tools/testing/selftests/kho/vmtest.sh       |   1 +
 16 files changed, 645 insertions(+), 455 deletions(-)
 create mode 100644 kernel/liveupdate/Kconfig
 create mode 100644 kernel/liveupdate/Makefile
 rename kernel/{ => liveupdate}/kexec_handover.c (78%)
 create mode 100644 kernel/liveupdate/kexec_handover_debug.c
 create mode 100644 kernel/liveupdate/kexec_handover_debugfs.c
 create mode 100644 kernel/liveupdate/kexec_handover_internal.h


base-commit: f406055cb18c6e299c4a783fc1effeb16be41803
-- 
2.51.0.915.g61a8936c21-goog
Re: [PATCH v6 00/10] liveupdate: Rework KHO for in-kernel users & Fix memory corruption
Posted by Mike Rapoport 3 months, 2 weeks ago
On Sat, Oct 18, 2025 at 01:17:46PM -0400, Pasha Tatashin wrote:
> This series addresses review comments, combines the two series [1] and
> [2] into one, and adds Reviewed-by tags.
> 
> This series refactors the KHO framework to better support in-kernel
> users like the upcoming LUO. The current design, which relies on a
> notifier chain and debugfs for control, is too restrictive for direct
> programmatic use.
> 
> The core of this rework is the removal of the notifier chain in favor of
> a direct registration API. This decouples clients from the shutdown-time
> finalization sequence, allowing them to manage their preserved state
> more flexibly and at any time.
> 
> Also, this series fixes a memory corruption bug in KHO that occurs when
> KFENCE is enabled.
> 
> The root cause is that KHO metadata, allocated via kzalloc(), can be
> randomly serviced by kfence_alloc(). When a kernel boots via KHO, the
> early memblock allocator is restricted to a "scratch area". This forces
> the KFENCE pool to be allocated within this scratch area, creating a
> conflict. If KHO metadata is subsequently placed in this pool, it gets
> corrupted during the next kexec operation.
> 
> [1] https://lore.kernel.org/all/20251007033100.836886-1-pasha.tatashin@soleen.com
> [2] https://lore.kernel.org/all/20251015053121.3978358-1-pasha.tatashin@soleen.com
> 
> Mike Rapoport (Microsoft) (1):
>   kho: drop notifiers
> 
> Pasha Tatashin (9):
>   kho: allow to drive kho from within kernel
>   kho: make debugfs interface optional
>   kho: add interfaces to unpreserve folios and page ranges
>   kho: don't unpreserve memory during abort
>   liveupdate: kho: move to kernel/liveupdate
>   kho: move kho debugfs directory to liveupdate
>   liveupdate: kho: warn and fail on metadata or preserved memory in scratch area
>   liveupdate: kho: Increase metadata bitmap size to PAGE_SIZE
>   liveupdate: kho: allocate metadata directly from the buddy allocator

The fixes should go before the preparation for LUO or even better as a
separate series.

I've reread the LUO preparation patches and I don't think they are useful
on their own. They introduce a couple of unused interfaces and I think it's
better to have them along with the rest of LUO patches.

-- 
Sincerely yours,
Mike.
Re: [PATCH v6 00/10] liveupdate: Rework KHO for in-kernel users & Fix memory corruption
Posted by Pasha Tatashin 3 months, 2 weeks ago
On Mon, Oct 20, 2025 at 4:34 AM Mike Rapoport <rppt@kernel.org> wrote:
>
> On Sat, Oct 18, 2025 at 01:17:46PM -0400, Pasha Tatashin wrote:
> > [...]
>
> The fixes should go before the preparation for LUO or even better as a
> separate series.
>
> I've reread the LUO preparation patches and I don't think they are useful
> on their own. They introduce a couple of unused interfaces and I think it's
> better to have them along with the rest of LUO patches.

Pulling them out to apply fixes separately feels counterproductive,
especially since we agreed to add the new kexec_handover_debug.c file.
The most straightforward path is to build on what's already in -next.
Let's stick with the current approach.

Thanks,
Pasha

Re: [PATCH v6 00/10] liveupdate: Rework KHO for in-kernel users & Fix memory corruption
Posted by Mike Rapoport 3 months, 2 weeks ago
On Mon, Oct 20, 2025 at 09:46:17AM -0400, Pasha Tatashin wrote:
> On Mon, Oct 20, 2025 at 4:34 AM Mike Rapoport <rppt@kernel.org> wrote:
> >
> > On Sat, Oct 18, 2025 at 01:17:46PM -0400, Pasha Tatashin wrote:
> > > [...]
> >
> > The fixes should go before the preparation for LUO or even better as a
> > separate series.
> >
> > I've reread the LUO preparation patches and I don't think they are useful
> > on their own. They introduce a couple of unused interfaces and I think it's
> > better to have them along with the rest of LUO patches.
> 
> Pulling them out to apply fixes separately feels counterproductive,
> especially since we agreed to add the new kexec_handover_debug.c file.
> The most straightforward path is to build on what's already in -next.
> Let's stick with the current approach.

The fixes are 6.18 material, the LUO preparation is 6.19 material.
 

-- 
Sincerely yours,
Mike.
Re: [PATCH v6 00/10] liveupdate: Rework KHO for in-kernel users & Fix memory corruption
Posted by Pasha Tatashin 3 months, 2 weeks ago
On Mon, Oct 20, 2025 at 9:46 AM Pasha Tatashin
<pasha.tatashin@soleen.com> wrote:
>
> On Mon, Oct 20, 2025 at 4:34 AM Mike Rapoport <rppt@kernel.org> wrote:
> >
> > On Sat, Oct 18, 2025 at 01:17:46PM -0400, Pasha Tatashin wrote:
> > > [...]
> >
> > The fixes should go before the preparation for LUO or even better as a
> > separate series.
> >
> > I've reread the LUO preparation patches and I don't think they are useful
> > on their own. They introduce a couple of unused interfaces and I think it's
> > better to have them along with the rest of LUO patches.
>

Forgot to add:
The LUO preparation patches have been soaking in linux-next for some
time now and are mostly reviewed.
...
> Pulling them out to apply fixes separately feels counterproductive,
> especially since we agreed to add the new kexec_handover_debug.c file.
> The most straightforward path is to build on what's already in -next.
> Let's stick with the current approach.
>