[PATCH v2 0/5] liveupdate: validate restored LUO metadata

Cris Jacob Maamor posted 5 patches 1 month, 1 week ago
[PATCH v2 0/5] liveupdate: validate restored LUO metadata
Posted by Cris Jacob Maamor 1 month, 1 week ago
LUO restores metadata from KHO/FDT during liveupdate. The restored
metadata contains physical addresses and count fields used to access and
walk preserved session, file set, and FLB arrays.

This series adds a non-consuming KHO preserved-range check and uses it
before phys_to_virt() on restored metadata addresses. It also rejects
restored counts above LUO_SESSION_MAX, LUO_FILE_MAX, and LUO_FLB_MAX
before traversal.
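
As a rough illustration of the two checks described above, here is a
minimal user-space sketch. It is not the code from this series: the
names demo_range, demo_range_is_preserved(), demo_validate_array() and
DEMO_SESSION_MAX are hypothetical stand-ins, and the preserved-range
table is modelled as a plain array rather than real KHO state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a limit such as LUO_SESSION_MAX. */
#define DEMO_SESSION_MAX 16u

/* Hypothetical model of one preserved physical range: [start, start + size). */
struct demo_range {
	uint64_t start;
	uint64_t size;
};

/*
 * Non-consuming check: return true if [phys, phys + len) lies entirely
 * inside a single preserved range. The table is not modified.
 */
static bool demo_range_is_preserved(const struct demo_range *tbl, size_t n,
				    uint64_t phys, uint64_t len)
{
	if (len == 0 || phys + len < phys)	/* reject empty and wrapping ranges */
		return false;

	for (size_t i = 0; i < n; i++) {
		uint64_t end = tbl[i].start + tbl[i].size;

		assert(end >= tbl[i].start);	/* table entries must not wrap */
		if (phys >= tbl[i].start && phys + len <= end)
			return true;
	}
	return false;
}

/*
 * Validate a restored array header before anything walks the array:
 * first bound the count, then check that the whole array is preserved.
 * Returns 0 on success and -1 on rejection.
 */
static int demo_validate_array(const struct demo_range *tbl, size_t n,
			       uint64_t phys, uint64_t count,
			       uint64_t elem_size)
{
	if (count > DEMO_SESSION_MAX)
		return -1;
	if (count == 0)
		return 0;	/* nothing to walk, nothing to map */
	if (elem_size > UINT64_MAX / DEMO_SESSION_MAX)
		return -1;	/* count * elem_size could overflow */
	if (!demo_range_is_preserved(tbl, n, phys, count * elem_size))
		return -1;
	return 0;
}
```

In this model the count is bounded first, so the multiplication used
for the range check cannot overflow, and a rejected header is never
dereferenced.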

As far as I can tell, this is root/admin-only; I do not have evidence
that a normal unprivileged user can trigger it directly.

Changes since v1:
- Dropped RFC marking.
- Added changelog text to each patch.
- No code changes.

Cris Jacob Maamor (5):
  kexec: handover: add helper to check preserved page ranges
  liveupdate: validate LUO FDT physical address before mapping
  liveupdate: validate restored LUO session metadata
  liveupdate: validate restored LUO file set metadata
  liveupdate: validate restored LUO FLB metadata

 include/linux/kexec_handover.h     |  6 +++++
 kernel/liveupdate/kexec_handover.c | 35 ++++++++++++++++++++++++++++++
 kernel/liveupdate/luo_core.c       | 10 ++++++++-
 kernel/liveupdate/luo_file.c       | 14 ++++++++++--
 kernel/liveupdate/luo_flb.c        | 23 +++++++++++++++++++-
 kernel/liveupdate/luo_session.c    | 22 +++++++++++++++++--
 6 files changed, 104 insertions(+), 6 deletions(-)

-- 
2.53.0
Re: [PATCH v2 0/5] liveupdate: validate restored LUO metadata
Posted by Pasha Tatashin 1 month, 1 week ago
On 05-02 01:30, Cris Jacob Maamor wrote:
> LUO restores metadata from KHO/FDT during liveupdate. The restored
> metadata contains physical addresses and count fields used to access and
> walk preserved session, file set, and FLB arrays.
> 
> This series adds a non-consuming KHO preserved-range check and uses it
> before phys_to_virt() on restored metadata addresses. It also rejects
> restored counts above LUO_SESSION_MAX, LUO_FILE_MAX, and LUO_FLB_MAX
> before traversal.
> 
> As far as I can tell, this is root/admin-only; I do not have evidence
> that a normal unprivileged user can trigger it directly.
> 
> Changes since v1:
> - Dropped RFC marking.
> - Added changelog text to each patch.
> - No code changes.
> 
> Cris Jacob Maamor (5):
>   kexec: handover: add helper to check preserved page ranges
>   liveupdate: validate LUO FDT physical address before mapping
>   liveupdate: validate restored LUO session metadata
>   liveupdate: validate restored LUO file set metadata
>   liveupdate: validate restored LUO FLB metadata

I have replied separately in the security report to clarify that this is 
not a bug. The behavior follows the ABI specification exactly: we use 
the PA addresses and ranges provided by the KHO FDT tree.

NAK

> 
>  include/linux/kexec_handover.h     |  6 +++++
>  kernel/liveupdate/kexec_handover.c | 35 ++++++++++++++++++++++++++++++
>  kernel/liveupdate/luo_core.c       | 10 ++++++++-
>  kernel/liveupdate/luo_file.c       | 14 ++++++++++--
>  kernel/liveupdate/luo_flb.c        | 23 +++++++++++++++++++-
>  kernel/liveupdate/luo_session.c    | 22 +++++++++++++++++--
>  6 files changed, 104 insertions(+), 6 deletions(-)
> 
> -- 
> 2.53.0
>
Re: [PATCH v2 0/5] liveupdate: validate restored LUO metadata
Posted by Pratyush Yadav 1 month, 1 week ago
Hi Pasha,

On Fri, May 01 2026, Pasha Tatashin wrote:

> On 05-02 01:30, Cris Jacob Maamor wrote:
>> LUO restores metadata from KHO/FDT during liveupdate. The restored
>> metadata contains physical addresses and count fields used to access and
>> walk preserved session, file set, and FLB arrays.
>> 
>> This series adds a non-consuming KHO preserved-range check and uses it
>> before phys_to_virt() on restored metadata addresses. It also rejects
>> restored counts above LUO_SESSION_MAX, LUO_FILE_MAX, and LUO_FLB_MAX
>> before traversal.
>> 
>> As far as I can tell, this is root/admin-only; I do not have evidence
>> that a normal unprivileged user can trigger it directly.
>> 
>> Changes since v1:
>> - Dropped RFC marking.
>> - Added changelog text to each patch.
>> - No code changes.
>> 
>> Cris Jacob Maamor (5):
>>   kexec: handover: add helper to check preserved page ranges
>>   liveupdate: validate LUO FDT physical address before mapping
>>   liveupdate: validate restored LUO session metadata
>>   liveupdate: validate restored LUO file set metadata
>>   liveupdate: validate restored LUO FLB metadata
>
> I have replied separately in the security report to clarify that this is 
> not a bug. The behavior follows the ABI specification exactly: we use 
> the PA addresses and ranges provided by the KHO FDT tree.
>
> NAK

I really do think we should do a restore-only variant for the
kho_alloc_preserve() family of allocators and use it everywhere. It
would prevent problems in the future. Not because the previous kernel is
malicious, but because we might have bugs and the KHO page magic sanity
check acts as a defense in depth.

For example, I am currently looking at a LUO bug where LUO does not
track if a session is outgoing or incoming. So you can do a retrieve()
or finish() on an outgoing session. A lot of nastiness is saved because
of the page magic check. Things like kho_restore_vmalloc() or
kho_restore_folio() fail early and loudly.

If we want to squeeze out more performance later down the line we can
move it behind a debug config, but having this usage pattern of always
restoring before using is going to be a lot more sane than just using
physical addresses willy nilly.

The approach this series takes with kho_is_preserved() is the wrong
design. But a kho_restore() or something similar (maybe we can find a
better name?) is really where we should be going.
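
To make the "always restore before using" pattern concrete, here is a
small user-space model. It is only a sketch under assumptions: demo_page,
DEMO_MAGIC, demo_mark_handed_over() and demo_restore() are hypothetical
names, and the idea that the receiving kernel stamps a magic value on
handed-over pages is a simplification, not a description of the actual
KHO implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Arbitrary marker value for this model only. */
#define DEMO_MAGIC 0x4b484f50u

/* Hypothetical per-page bookkeeping. */
struct demo_page {
	uint32_t magic;	/* DEMO_MAGIC only while the page awaits restore */
	void *data;	/* payload carried across the handover */
};

/*
 * Models the receiving side marking a page as handed over (an
 * assumption of this sketch). Pages that never went through a
 * handover are never stamped.
 */
static void demo_mark_handed_over(struct demo_page *pg, void *data)
{
	assert(pg != NULL);
	pg->data = data;
	pg->magic = DEMO_MAGIC;
}

/*
 * Consuming restore: succeeds at most once per handed-over page.
 * Returns the payload, or NULL if the page was never handed over or
 * has already been restored.
 */
static void *demo_restore(struct demo_page *pg)
{
	if (pg == NULL || pg->magic != DEMO_MAGIC)
		return NULL;	/* fail early instead of using a stale address */

	pg->magic = 0;		/* a second restore of the same page now fails */
	return pg->data;
}
```

A caller that goes through demo_restore() gets a NULL on a page that was
never handed over or was already restored, where raw physical address
use would have silently continued.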

-- 
Regards,
Pratyush Yadav
Re: [PATCH v2 0/5] liveupdate: validate restored LUO metadata
Posted by Pasha Tatashin 1 month, 1 week ago
On 05-06 11:02, Pratyush Yadav wrote:
> Hi Pasha,
> 
> On Fri, May 01 2026, Pasha Tatashin wrote:
> 
> > On 05-02 01:30, Cris Jacob Maamor wrote:
> >> LUO restores metadata from KHO/FDT during liveupdate. The restored
> >> metadata contains physical addresses and count fields used to access and
> >> walk preserved session, file set, and FLB arrays.
> >> 
> >> This series adds a non-consuming KHO preserved-range check and uses it
> >> before phys_to_virt() on restored metadata addresses. It also rejects
> >> restored counts above LUO_SESSION_MAX, LUO_FILE_MAX, and LUO_FLB_MAX
> >> before traversal.
> >> 
> >> As far as I can tell, this is root/admin-only; I do not have evidence
> >> that a normal unprivileged user can trigger it directly.
> >> 
> >> Changes since v1:
> >> - Dropped RFC marking.
> >> - Added changelog text to each patch.
> >> - No code changes.
> >> 
> >> Cris Jacob Maamor (5):
> >>   kexec: handover: add helper to check preserved page ranges
> >>   liveupdate: validate LUO FDT physical address before mapping
> >>   liveupdate: validate restored LUO session metadata
> >>   liveupdate: validate restored LUO file set metadata
> >>   liveupdate: validate restored LUO FLB metadata
> >
> > I have replied separately in the security report to clarify that this is 
> > not a bug. The behavior follows the ABI specification exactly: we use 
> > the PA addresses and ranges provided by the KHO FDT tree.
> >
> > NAK
> 
> I really do think we should do a restore-only variant for the
> kho_alloc_preserve() family of allocators and use it everywhere. It

That is unrelated to the provided patch series. The author of this 
series reported this as a security issue to the Linux security ML, and 
submitted this series at their request.

This is not a security issue, and in fact, it is not an issue at all. A 
restore-only variant can be added, but I do not see a reason for LUO to 
use it.

> would prevent problems in the future. Not because the previous kernel is
> malicious, but because we might have bugs and the KHO page magic sanity
> check acts as a defense in depth.
> 
> For example, I am currently looking at a LUO bug where LUO does not
> track if a session is outgoing or incoming. So you can do a retrieve()
> or finish() on an outgoing session. A lot of nastiness is saved because
> of the page magic check. Things like kho_restore_vmalloc() or
> kho_restore_folio() fail early and loudly.

I am not sure what bug you are looking at (please share the details!), 
but the fix absolutely should be to use outgoing/incoming sessions 
properly, and if we mixed them up somewhere, THAT should be fixed. Using 
KHO restore is not going to help much; however, I agree it can add 
some extra scrutiny (i.e., similar to an ASSERT), but it is not really 
something that would help improve correctness in any meaningful way. The 
correctness should lie in the LUO logic using incoming as incoming, and 
outgoing as outgoing.

> 
> If we want to squeeze out more performance later down the line we can
> move it behind a debug config, but having this usage pattern of always
> restoring before using is going to be a lot more sane than just using
> physical addresses willy nilly.
> 
> The approach this series takes with kho_is_preserved() is the wrong
> design. But a kho_restore() or something similar (maybe we can find a
> better name?) is really where we should be going.
> 
> -- 
> Regards,
> Pratyush Yadav
Re: [PATCH v2 0/5] liveupdate: validate restored LUO metadata
Posted by Pratyush Yadav 1 month, 1 week ago
On Wed, May 06 2026, Pasha Tatashin wrote:

> On 05-06 11:02, Pratyush Yadav wrote:
>> Hi Pasha,
>> 
>> On Fri, May 01 2026, Pasha Tatashin wrote:
>> 
>> > On 05-02 01:30, Cris Jacob Maamor wrote:
>> >> LUO restores metadata from KHO/FDT during liveupdate. The restored
>> >> metadata contains physical addresses and count fields used to access and
>> >> walk preserved session, file set, and FLB arrays.
>> >> 
>> >> This series adds a non-consuming KHO preserved-range check and uses it
>> >> before phys_to_virt() on restored metadata addresses. It also rejects
>> >> restored counts above LUO_SESSION_MAX, LUO_FILE_MAX, and LUO_FLB_MAX
>> >> before traversal.
>> >> 
>> >> As far as I can tell, this is root/admin-only; I do not have evidence
>> >> that a normal unprivileged user can trigger it directly.
>> >> 
>> >> Changes since v1:
>> >> - Dropped RFC marking.
>> >> - Added changelog text to each patch.
>> >> - No code changes.
>> >> 
>> >> Cris Jacob Maamor (5):
>> >>   kexec: handover: add helper to check preserved page ranges
>> >>   liveupdate: validate LUO FDT physical address before mapping
>> >>   liveupdate: validate restored LUO session metadata
>> >>   liveupdate: validate restored LUO file set metadata
>> >>   liveupdate: validate restored LUO FLB metadata
>> >
>> > I have replied separately in the security report to clarify that this is 
>> > not a bug. The behavior follows the ABI specification exactly: we use 
>> > the PA addresses and ranges provided by the KHO FDT tree.
>> >
>> > NAK
>> 
>> I really do think we should do a restore-only variant for the
>> kho_alloc_preserve() family of allocators and use it everywhere. It
>
> That is unrelated to the provided patch series. The author of this 
> series reported this as a security issue to the Linux security ML, and 
> submitted this series at their request.

Oh yes, sure. I am not arguing for taking this series. I just figured
this would be a good point to have this discussion.

>
> This is not a security issue, and in fact, it is not an issue at all. A 
> restore-only variant can be added, but I do not see a reason for LUO to 
> use it.
>
>> would prevent problems in the future. Not because the previous kernel is
>> malicious, but because we might have bugs and the KHO page magic sanity
>> check acts as a defense in depth.
>> 
>> For example, I am currently looking at a LUO bug where LUO does not
>> track if a session is outgoing or incoming. So you can do a retrieve()
>> or finish() on an outgoing session. A lot of nastiness is saved because
>> of the page magic check. Things like kho_restore_vmalloc() or
>> kho_restore_folio() fail early and loudly.
>
> I am not sure what bug you are looking at (please share the details!), 

I was looking at LUO code and realized that we do not separate outgoing
and incoming sessions when dealing with preserve/retrieve/finish ioctls.
So you can create a session, preserve a FD, and then immediately call
finish or retrieve without doing a kexec. Of course, LUO file handlers
aren't able to cope with it.

So for example, you can preserve a memfd and then immediately call
finish. This will call memfd_luo_finish(), where it will try to
kho_restore_vmalloc(). That fails with a big WARN splat. And then later
it calls kho_restore_free() which also fails in a similar fashion.

You can do the same thing with retrieve(), but that also fails early and
loudly and does not cause any problems.

I am working on a fix for it. Should have something out shortly.
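
For illustration only, the kind of direction check being discussed could
look like the user-space sketch below. This is not the forthcoming fix:
demo_session, demo_dir, demo_op and demo_check_op() are hypothetical
names, and the rule that preserve belongs to outgoing sessions while
retrieve and finish belong to incoming ones is taken from the
description above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical direction tag recorded when a session is created. */
enum demo_dir { DEMO_OUTGOING, DEMO_INCOMING };

/* The three operations discussed in the thread. */
enum demo_op { DEMO_PRESERVE, DEMO_RETRIEVE, DEMO_FINISH };

struct demo_session {
	enum demo_dir dir;
};

/*
 * Gate an operation on the session direction. Returns 0 if the
 * operation is allowed and -EINVAL otherwise, so a mismatched call is
 * rejected before any file handler runs.
 */
static int demo_check_op(const struct demo_session *s, enum demo_op op)
{
	assert(s != NULL);

	switch (op) {
	case DEMO_PRESERVE:
		/* Only a session created before kexec may preserve. */
		return s->dir == DEMO_OUTGOING ? 0 : -EINVAL;
	case DEMO_RETRIEVE:
	case DEMO_FINISH:
		/* Only a session restored after kexec may retrieve or finish. */
		return s->dir == DEMO_INCOMING ? 0 : -EINVAL;
	}
	return -EINVAL;	/* unknown operation */
}
```

With a check of this shape, preserve followed directly by finish on the
same freshly created session returns -EINVAL instead of reaching the
restore path.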

> but the fix absolutely should be to use outgoing/incoming sessions 
> properly, and if we mixed them up somewhere, THAT should be fixed. Using 
> KHO restore is not going to help much; however, I agree it can add 
> some extra scrutiny (i.e., similar to an ASSERT), but it is not really 
> something that would help improve correctness in any meaningful way. The 
> correctness should lie in the LUO logic using incoming as incoming, and 
> outgoing as outgoing.

I am not arguing that we shouldn't fix the logic bugs. Of course we
should.

My point is that this sanity check acts as another layer of defence.
Bugs happen, but the earlier we catch them the better and this sanity
check helps us do exactly that.

For example, if we did not have these sanity checks, the loud errors I
described above would be replaced by silent use-after-free, double-free,
struct page corruption, or other problems.

So I would like to understand why you _don't_ want to have this line of
defence. What's the problem? If you are worried about performance, we
can go and measure it. If the overhead is too high this can be behind a
debug config.

>
>> 
>> If we want to squeeze out more performance later down the line we can
>> move it behind a debug config, but having this usage pattern of always
>> restoring before using is going to be a lot more sane than just using
>> physical addresses willy nilly.
>> 
>> The approach this series takes with kho_is_preserved() is the wrong
>> design. But a kho_restore() or something similar (maybe we can find a
>> better name?) is really where we should be going.
>> 
>> -- 
>> Regards,
>> Pratyush Yadav

-- 
Regards,
Pratyush Yadav
Re: [PATCH v2 0/5] liveupdate: validate restored LUO metadata
Posted by Pasha Tatashin 1 month, 1 week ago
On 05-06 18:15, Pratyush Yadav wrote:
> On Wed, May 06 2026, Pasha Tatashin wrote:
> 
> > On 05-06 11:02, Pratyush Yadav wrote:
> >> Hi Pasha,
> >> 
> >> On Fri, May 01 2026, Pasha Tatashin wrote:
> >> 
> >> > On 05-02 01:30, Cris Jacob Maamor wrote:
> >> >> LUO restores metadata from KHO/FDT during liveupdate. The restored
> >> >> metadata contains physical addresses and count fields used to access and
> >> >> walk preserved session, file set, and FLB arrays.
> >> >> 
> >> >> This series adds a non-consuming KHO preserved-range check and uses it
> >> >> before phys_to_virt() on restored metadata addresses. It also rejects
> >> >> restored counts above LUO_SESSION_MAX, LUO_FILE_MAX, and LUO_FLB_MAX
> >> >> before traversal.
> >> >> 
> >> >> As far as I can tell, this is root/admin-only; I do not have evidence
> >> >> that a normal unprivileged user can trigger it directly.
> >> >> 
> >> >> Changes since v1:
> >> >> - Dropped RFC marking.
> >> >> - Added changelog text to each patch.
> >> >> - No code changes.
> >> >> 
> >> >> Cris Jacob Maamor (5):
> >> >>   kexec: handover: add helper to check preserved page ranges
> >> >>   liveupdate: validate LUO FDT physical address before mapping
> >> >>   liveupdate: validate restored LUO session metadata
> >> >>   liveupdate: validate restored LUO file set metadata
> >> >>   liveupdate: validate restored LUO FLB metadata
> >> >
> >> > I have replied separately in the security report to clarify that this is 
> >> > not a bug. The behavior follows the ABI specification exactly: we use 
> >> > the PA addresses and ranges provided by the KHO FDT tree.
> >> >
> >> > NAK
> >> 
> >> I really do think we should do a restore-only variant for the
> >> kho_alloc_preserve() family of allocators and use it everywhere. It
> >
> > That is unrelated to the provided patch series. The author of this 
> > series reported this as a security issue to the Linux security ML, and 
> > submitted this series at their request.
> 
> Oh yes, sure. I am not arguing for taking this series. I just figured
> this would be a good point to have this discussion.
> 
> >
> > This is not a security issue, and in fact, it is not an issue at all. A 
> > restore-only variant can be added, but I do not see a reason for LUO to 
> > use it.
> >
> >> would prevent problems in the future. Not because the previous kernel is
> >> malicious, but because we might have bugs and the KHO page magic sanity
> >> check acts as a defense in depth.
> >> 
> >> For example, I am currently looking at a LUO bug where LUO does not
> >> track if a session is outgoing or incoming. So you can do a retrieve()
> >> or finish() on an outgoing session. A lot of nastiness is saved because
> >> of the page magic check. Things like kho_restore_vmalloc() or
> >> kho_restore_folio() fail early and loudly.
> >
> > I am not sure what bug you are looking at (please share the details!), 
> 
> I was looking at LUO code and realized that we do not separate outgoing
> and incoming sessions when dealing with preserve/retrieve/finish ioctls.
> So you can create a session, preserve a FD, and then immediately call
> finish or retrieve without doing a kexec. Of course, LUO file handlers
> aren't able to cope with it.

Oh, this makes sense, please add a self-test for that as well :-)

> 
> So for example, you can preserve a memfd and then immediately call
> finish. This will call memfd_luo_finish(), where it will try to
> kho_restore_vmalloc(). That fails with a big WARN splat. And then later
> it calls kho_restore_free() which also fails in a similar fashion.
> 
> You can do the same thing with retrieve(), but that also fails early and
> loudly and does not cause any problems.
> 
> I am working on a fix for it. Should have something out shortly.
> 
> > but the fix absolutely should be to use outgoing/incoming sessions 
> > properly, and if we mixed them up somewhere, THAT should be fixed. Using 
> > KHO restore is not going to help much; however, I agree it can add 
> > some extra scrutiny (i.e., similar to an ASSERT), but it is not really 
> > something that would help improve correctness in any meaningful way. The 
> > correctness should lie in the LUO logic using incoming as incoming, and 
> > outgoing as outgoing.
> 
> I am not arguing that we shouldn't fix the logic bugs. Of course we
> should.
> 
> My point is that this sanity check acts as another layer of defence.
> Bugs happen, but the earlier we catch them the better and this sanity
> check helps us do exactly that.
> 
> For example, if we did not have these sanity checks, the loud errors I
> described above would be replaced by silent use-after-free, double-free,
> struct page corruption, or other problems.
> 
> So I would like to understand why you _don't_ want to have this line of
> defence. What's the problem? If you are worried about performance, we
> can go and measure it. If the overhead is too high this can be behind a
> debug config.

Most likely, there is no performance cost, because when we free 
preserved memory, we still need to do a KHO restore. The only difference 
is that it may occur after the blackout, not during the blackout. Anyway, if 
you would like to add this sanity check, please send it out, and we can 
review and discuss how it looks.

> 
> >
> >> 
> >> If we want to squeeze out more performance later down the line we can
> >> move it behind a debug config, but having this usage pattern of always
> >> restoring before using is going to be a lot more sane than just using
> >> physical addresses willy nilly.
> >> 
> >> The approach this series takes with kho_is_preserved() is the wrong
> >> design. But a kho_restore() or something similar (maybe we can find a
> >> better name?) is really where we should be going.
> >> 
> >> -- 
> >> Regards,
> >> Pratyush Yadav
> 
> -- 
> Regards,
> Pratyush Yadav