When restoring a folio using kho_restore_folio(), no sanity checks are
done to make sure the folio actually came from a kexec handover. The
caller is trusted to pass in the right address. If the caller has a bug
and passes in a wrong address, an in-use folio might be "restored" and
returned, causing all sorts of memory corruption.
Harden the folio restore logic by stashing a magic number in
page->private along with the folio order. If the magic number does not
match, the folio won't be touched. page->private is an unsigned long.
The union kho_page_info splits it into two parts, with one holding the
order and the other holding the magic number.
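For example, on a little-endian 64-bit machine, an order-3 folio is
recorded as page->private == 0x4b484f5000000003: the magic in the upper
32 bits and the order in the lower 32 bits.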
Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
---
kernel/kexec_handover.c | 29 ++++++++++++++++++++++++-----
1 file changed, 24 insertions(+), 5 deletions(-)
diff --git a/kernel/kexec_handover.c b/kernel/kexec_handover.c
index ecd1ac210dbd7..68eb3c28abe41 100644
--- a/kernel/kexec_handover.c
+++ b/kernel/kexec_handover.c
@@ -32,6 +32,22 @@
#define PROP_PRESERVED_MEMORY_MAP "preserved-memory-map"
#define PROP_SUB_FDT "fdt"
+#define KHO_PAGE_MAGIC 0x4b484f50U /* ASCII for 'KHOP' */
+
+/*
+ * KHO uses page->private, which is an unsigned long, to store page metadata.
+ * Use it to store both the magic and the order.
+ */
+union kho_page_info {
+ unsigned long page_private;
+ struct {
+ unsigned int order;
+ unsigned int magic;
+ };
+};
+
+static_assert(sizeof(union kho_page_info) == sizeof(((struct page *)0)->private));
+
static bool kho_enable __ro_after_init;
bool kho_is_enabled(void)
@@ -210,16 +226,16 @@ static void kho_restore_page(struct page *page, unsigned int order)
struct folio *kho_restore_folio(phys_addr_t phys)
{
struct page *page = pfn_to_online_page(PHYS_PFN(phys));
- unsigned long order;
+ union kho_page_info info;
if (!page)
return NULL;
- order = page->private;
- if (order > MAX_PAGE_ORDER)
+ info.page_private = page->private;
+ if (info.magic != KHO_PAGE_MAGIC || info.order > MAX_PAGE_ORDER)
return NULL;
- kho_restore_page(page, order);
+ kho_restore_page(page, info.order);
return page_folio(page);
}
EXPORT_SYMBOL_GPL(kho_restore_folio);
@@ -341,10 +357,13 @@ static void __init deserialize_bitmap(unsigned int order,
phys_addr_t phys =
elm->phys_start + (bit << (order + PAGE_SHIFT));
struct page *page = phys_to_page(phys);
+ union kho_page_info info;
memblock_reserve(phys, sz);
memblock_reserved_mark_noinit(phys, sz);
- page->private = order;
+ info.magic = KHO_PAGE_MAGIC;
+ info.order = order;
+ page->private = info.page_private;
}
}
base-commit: b320789d6883cc00ac78ce83bccbfe7ed58afcf0
--
2.47.3
On Wed, Sep 10, 2025 at 05:34:40PM +0200, Pratyush Yadav wrote:
> When restoring a folio using kho_restore_folio(), no sanity checks are
> done to make sure the folio actually came from a kexec handover. The
> caller is trusted to pass in the right address. If the caller has a bug
> and passes in a wrong address, an in-use folio might be "restored" and
> returned, causing all sorts of memory corruption.
>
> Harden the folio restore logic by stashing a magic number in
> page->private along with the folio order. If the magic number does not
> match, the folio won't be touched. page->private is an unsigned long.
> The union kho_page_info splits it into two parts, with one holding the
> order and the other holding the magic number.

I think the sanity checks belong to the core kho_restore_page() function
and kho_restore_folio() should be a thin wrapper for that, at least
until we'd need to allocate struct folio there.

> Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
> ---
>  kernel/kexec_handover.c | 29 ++++++++++++++++++++++++-----
>  1 file changed, 24 insertions(+), 5 deletions(-)

--
Sincerely yours,
Mike.
On Tue, Sep 16 2025, Mike Rapoport wrote:

> On Wed, Sep 10, 2025 at 05:34:40PM +0200, Pratyush Yadav wrote:
>> When restoring a folio using kho_restore_folio(), no sanity checks are
>> done to make sure the folio actually came from a kexec handover. The
>> caller is trusted to pass in the right address. If the caller has a bug
>> and passes in a wrong address, an in-use folio might be "restored" and
>> returned, causing all sorts of memory corruption.
>>
>> Harden the folio restore logic by stashing a magic number in
>> page->private along with the folio order. If the magic number does not
>> match, the folio won't be touched. page->private is an unsigned long.
>> The union kho_page_info splits it into two parts, with one holding the
>> order and the other holding the magic number.
>
> I think the sanity checks belong to the core kho_restore_page() function
> and kho_restore_folio() should be a thin wrapper for that, at least
> until we'd need to allocate struct folio there.

Hmm, okay. I can do that for the next version.

--
Regards,
Pratyush Yadav
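For illustration, one way to read Mike's suggestion is the sketch below;
the phys-taking kho_restore_page() signature is an assumption, not the
final v2:

static struct page *kho_restore_page(phys_addr_t phys)
{
	struct page *page = pfn_to_online_page(PHYS_PFN(phys));
	union kho_page_info info;

	if (!page)
		return NULL;

	/* Only touch pages that KHO actually preserved. */
	info.page_private = page->private;
	if (info.magic != KHO_PAGE_MAGIC || info.order > MAX_PAGE_ORDER)
		return NULL;

	/* ... existing restore logic for a page of order info.order ... */
	return page;
}

struct folio *kho_restore_folio(phys_addr_t phys)
{
	struct page *page = kho_restore_page(phys);

	return page ? page_folio(page) : NULL;
}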
On Wed, Sep 10, 2025 at 05:34:40PM +0200, Pratyush Yadav wrote:
> +#define KHO_PAGE_MAGIC 0x4b484f50U /* ASCII for 'KHOP' */
> +
> +/*
> + * KHO uses page->private, which is an unsigned long, to store page metadata.
> + * Use it to store both the magic and the order.
> + */
> +union kho_page_info {
> + unsigned long page_private;
> + struct {
> + unsigned int order;
> + unsigned int magic;
> + };
KHO is only supported on 64-bit?
> @@ -210,16 +226,16 @@ static void kho_restore_page(struct page *page, unsigned int order)
> struct folio *kho_restore_folio(phys_addr_t phys)
> {
> struct page *page = pfn_to_online_page(PHYS_PFN(phys));
> - unsigned long order;
> + union kho_page_info info;
>
> if (!page)
> return NULL;
>
> - order = page->private;
> - if (order > MAX_PAGE_ORDER)
> + info.page_private = page->private;
> + if (info.magic != KHO_PAGE_MAGIC || info.order > MAX_PAGE_ORDER)
> return NULL;
>
> - kho_restore_page(page, order);
> + kho_restore_page(page, info.order);
> return page_folio(page);
This all looks very confused. Before your patch as well as after it.
I don't see anything in the current KHO code that requires the
phys_addr_t to be order-aligned.
On Wed, Sep 10 2025, Matthew Wilcox wrote:
> On Wed, Sep 10, 2025 at 05:34:40PM +0200, Pratyush Yadav wrote:
>> +#define KHO_PAGE_MAGIC 0x4b484f50U /* ASCII for 'KHOP' */
>> +
>> +/*
>> + * KHO uses page->private, which is an unsigned long, to store page metadata.
>> + * Use it to store both the magic and the order.
>> + */
>> +union kho_page_info {
>> + unsigned long page_private;
>> + struct {
>> + unsigned int order;
>> + unsigned int magic;
>> + };
>
> KHO is only supported on 64-bit?
Yes. Currently only x86_64 and ARM64. It is mainly for hypervisor live
update so there isn't much reason to support it on 32-bit platforms.
>
>> @@ -210,16 +226,16 @@ static void kho_restore_page(struct page *page, unsigned int order)
>> struct folio *kho_restore_folio(phys_addr_t phys)
>> {
>> struct page *page = pfn_to_online_page(PHYS_PFN(phys));
>> - unsigned long order;
>> + union kho_page_info info;
>>
>> if (!page)
>> return NULL;
>>
>> - order = page->private;
>> - if (order > MAX_PAGE_ORDER)
>> + info.page_private = page->private;
>> + if (info.magic != KHO_PAGE_MAGIC || info.order > MAX_PAGE_ORDER)
>> return NULL;
>>
>> - kho_restore_page(page, order);
>> + kho_restore_page(page, info.order);
>> return page_folio(page);
>
> This all looks very confused. Before your patch as well as after it.
> I don't see anything in the current KHO code that requires the
> phys_addr_t to be order-aligned.
Right, good point. I can send that as a follow up patch. But I think
this patch stands on its own without that fix too.
--
Regards,
Pratyush Yadav
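For reference, the follow-up alignment check could be as small as this
(sketch, using the info layout from this patch):

	if (!IS_ALIGNED(phys, PAGE_SIZE << info.order))
		return NULL;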
On Wed, Sep 10, 2025 at 05:52:04PM +0200, Pratyush Yadav wrote:
> On Wed, Sep 10 2025, Matthew Wilcox wrote:
>
> > On Wed, Sep 10, 2025 at 05:34:40PM +0200, Pratyush Yadav wrote:
> >> +#define KHO_PAGE_MAGIC 0x4b484f50U /* ASCII for 'KHOP' */
> >> +
> >> +/*
> >> + * KHO uses page->private, which is an unsigned long, to store page metadata.
> >> + * Use it to store both the magic and the order.
> >> + */
> >> +union kho_page_info {
> >> + unsigned long page_private;
> >> + struct {
> >> + unsigned int order;
> >> + unsigned int magic;
> >> + };
> >
> > KHO is only supported on 64-bit?
>
> Yes. Currently only x86_64 and ARM64. It is mainly for hypervisor live
> update so there isn't much reason to support it on 32-bit platforms.
Presumably this will eventually change to use some special coding on the memdesc
pointer?
> >> @@ -210,16 +226,16 @@ static void kho_restore_page(struct page *page, unsigned int order)
> >> struct folio *kho_restore_folio(phys_addr_t phys)
> >> {
> >> struct page *page = pfn_to_online_page(PHYS_PFN(phys));
> >> - unsigned long order;
> >> + union kho_page_info info;
> >>
> >> if (!page)
> >> return NULL;
> >>
> >> - order = page->private;
> >> - if (order > MAX_PAGE_ORDER)
> >> + info.page_private = page->private;
> >> + if (info.magic != KHO_PAGE_MAGIC || info.order > MAX_PAGE_ORDER)
All the impossible checks should be WARN_ON()
> >> return NULL;
> >>
> >> - kho_restore_page(page, order);
> >> + kho_restore_page(page, info.order);
> >> return page_folio(page);
> >
> > This all looks very confused. Before your patch as well as after it.
> > I don't see anything in the current KHO code that requires the
> > phys_addr_t to be order-aligned.
>
> Right, good point. I can send that as a follow up patch. But I think
> this patch stands on its own without that fix too.
Maybe it is worth adding some KHO_DEBUG kconfig to protect some of
these extra checks?
phys should be pfn_valid, phys should be aligned, the page should be
in the right state, order should be valid, etc. All worth checking.
Jason
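For illustration, the gating might look like this (sketch;
CONFIG_KEXEC_HANDOVER_DEBUG and kho_check_phys() are hypothetical names,
not existing kernel symbols):

#ifdef CONFIG_KEXEC_HANDOVER_DEBUG
/* Sanity-check a phys/order pair before it is restored. */
static bool kho_check_phys(phys_addr_t phys, unsigned int order)
{
	return pfn_valid(PHYS_PFN(phys)) &&
	       order <= MAX_PAGE_ORDER &&
	       IS_ALIGNED(phys, PAGE_SIZE << order);
}
#else
static inline bool kho_check_phys(phys_addr_t phys, unsigned int order)
{
	return true;
}
#endif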
On Wed, Sep 10 2025, Jason Gunthorpe wrote:
> On Wed, Sep 10, 2025 at 05:52:04PM +0200, Pratyush Yadav wrote:
>> On Wed, Sep 10 2025, Matthew Wilcox wrote:
>>
>> > On Wed, Sep 10, 2025 at 05:34:40PM +0200, Pratyush Yadav wrote:
>> >> +#define KHO_PAGE_MAGIC 0x4b484f50U /* ASCII for 'KHOP' */
>> >> +
>> >> +/*
>> >> + * KHO uses page->private, which is an unsigned long, to store page metadata.
>> >> + * Use it to store both the magic and the order.
>> >> + */
>> >> +union kho_page_info {
>> >> + unsigned long page_private;
>> >> + struct {
>> >> + unsigned int order;
>> >> + unsigned int magic;
>> >> + };
>> >
>> > KHO is only supported on 64-bit?
>>
>> Yes. Currently only x86_64 and ARM64. It is mainly for hypervisor live
>> update so there isn't much reason to support it on 32-bit platforms.
>
> Presumably this will eventually change to use some special coding on the memdesc
> pointer?
Maybe. I didn't think that through yet.
>
>> >> @@ -210,16 +226,16 @@ static void kho_restore_page(struct page *page, unsigned int order)
>> >> struct folio *kho_restore_folio(phys_addr_t phys)
>> >> {
>> >> struct page *page = pfn_to_online_page(PHYS_PFN(phys));
>> >> - unsigned long order;
>> >> + union kho_page_info info;
>> >>
>> >> if (!page)
>> >> return NULL;
>> >>
>> >> - order = page->private;
>> >> - if (order > MAX_PAGE_ORDER)
>> >> + info.page_private = page->private;
>> >> + if (info.magic != KHO_PAGE_MAGIC || info.order > MAX_PAGE_ORDER)
>
> All the impossible checks should be WARN_ON()
The mental model I have is that the place that introduced the
"impossible" situation should get the WARN(). So for an incorrect phys
address (leading to magic mismatch) or incorrect order (say preserved
big range using kho_preserve_phys() and restoring it using
kho_restore_folio()), the caller is responsible so it should do the
WARN(). Does that make sense?
>
>> >> return NULL;
>> >>
>> >> - kho_restore_page(page, order);
>> >> + kho_restore_page(page, info.order);
>> >> return page_folio(page);
>> >
>> > This all looks very confused. Before your patch as well as after it.
>> > I don't see anything in the current KHO code that requires the
>> > phys_addr_t to be order-aligned.
>>
>> Right, good point. I can send that as a follow up patch. But I think
>> this patch stands on its own without that fix too.
Actually, on another look, this patch implicitly makes sure that
unaligned phys always fails. This is because deserialize_bitmap() only
sets the magic on the head page which is always aligned by the order.
For any unaligned phys, the magic will not match.
Still, would be a good idea to do a KHO_DEBUG check for alignment if
this implicit thing ever changes.
>
> Maybe it is worth adding some KHO_DEBUG kconfig to protect some of
> these extra checks?
>
> phys should be pfn_valid, phys should be aligned, the page should be
> in the right state, order should be valid, etc. All worth checking.
Makes sense. Do you suggest the magic and order checks to also be under
KHO_DEBUG, or should they always be done? I am of the opinion that it
makes sense to do them always, but I can also understand the argument
for disabling them in production for better performance.
--
Regards,
Pratyush Yadav
On Tue, Sep 16, 2025 at 03:20:51PM +0200, Pratyush Yadav wrote:
> >> >> @@ -210,16 +226,16 @@ static void kho_restore_page(struct page *page, unsigned int order)
> >> >> struct folio *kho_restore_folio(phys_addr_t phys)
> >> >> {
> >> >> struct page *page = pfn_to_online_page(PHYS_PFN(phys));
> >> >> - unsigned long order;
> >> >> + union kho_page_info info;
> >> >>
> >> >> if (!page)
> >> >> return NULL;
> >> >>
> >> >> - order = page->private;
> >> >> - if (order > MAX_PAGE_ORDER)
> >> >> + info.page_private = page->private;
> >> >> + if (info.magic != KHO_PAGE_MAGIC || info.order > MAX_PAGE_ORDER)
> >
> > All the impossible checks should be WARN_ON()
>
> The mental model I have is that the place that introduced the
> "impossible" situation should get the WARN(). So for an incorrect phys
> address (leading to magic mismatch) or incorrect order (say preserved
> big range using kho_preserve_phys() and restoring it using
> kho_restore_folio()), the caller is responsible so it should do the
> WARN(). Does that make sense?
Callers should not pass illegal phys here, WARN at this point is
appropriate, and maybe under a debug #ifdef or something like that.
This is to make it clear that "test and fail" is not an acceptable way
to use this API.
> Actually, on another look, this patch implicitly makes sure that
> unaligned phys always fails. This is because deserialize_bitmap() only
> sets the magic on the head page which is always aligned by the order.
> For any unaligned phys, the magic will not match.
Makes sense, maybe a comment?
> Makes sense. Do you suggest the magic and order checks to also be under
> KHO_DEBUG, or should they always be done? I am of the opinion that it
> makes sense to do them always, but I can also understand the argument
> for disabling them in production for better performance.
I'm ambivalent :)
For now I'd do it always and leave some debug optimization to a future
project.
Jason
On Tue, Sep 16 2025, Jason Gunthorpe wrote:
> On Tue, Sep 16, 2025 at 03:20:51PM +0200, Pratyush Yadav wrote:
>> >> >> @@ -210,16 +226,16 @@ static void kho_restore_page(struct page *page, unsigned int order)
>> >> >> struct folio *kho_restore_folio(phys_addr_t phys)
>> >> >> {
>> >> >> struct page *page = pfn_to_online_page(PHYS_PFN(phys));
>> >> >> - unsigned long order;
>> >> >> + union kho_page_info info;
>> >> >>
>> >> >> if (!page)
>> >> >> return NULL;
>> >> >>
>> >> >> - order = page->private;
>> >> >> - if (order > MAX_PAGE_ORDER)
>> >> >> + info.page_private = page->private;
>> >> >> + if (info.magic != KHO_PAGE_MAGIC || info.order > MAX_PAGE_ORDER)
>> >
>> > All the impossible checks should be WARN_ON()
>>
>> The mental model I have is that the place that introduced the
>> "impossible" situation should get the WARN(). So for an incorrect phys
>> address (leading to magic mismatch) or incorrect order (say preserved
>> big range using kho_preserve_phys() and restoring it using
>> kho_restore_folio()), the caller is responsible so it should do the
>> WARN(). Does that make sense?
>
> Callers should not pass illegal phys here, WARN at this point is
> appropriate, and maybe under a debug #ifdef or something like that.
>
> This is to make it clear that "test and fail" is not an acceptable way
> to use this API.
Okay, makes sense. Will add.
>
>> Actually, on another look, this patch implicitly makes sure that
>> unaligned phys always fails. This is because deserialize_bitmap() only
>> sets the magic on the head page which is always aligned by the order.
>> For any unaligned phys, the magic will not match.
>
> Makes sense, maybe a comment?
Will do.
>
>> Makes sense. Do you suggest the magic and order checks to also be under
>> KHO_DEBUG, or should they always be done? I am of the opinion that it
>> makes sense to do them always, but I can also understand the argument
>> for disabling them in production for better performance.
>
> I'm ambivalent :)
>
> For now I'd do it always and leave some debug optimization to a future
> project.
Sounds good. Will add them as always for now.
--
Regards,
Pratyush Yadav
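For reference, the agreed direction might end up looking roughly like
this in kho_restore_folio() (sketch, not the final v2 patch):

	info.page_private = page->private;
	/*
	 * deserialize_bitmap() only writes the magic to the head page of a
	 * preserved folio, which is aligned to its order, so a misaligned
	 * phys also fails the magic check.
	 */
	if (WARN_ON(info.magic != KHO_PAGE_MAGIC || info.order > MAX_PAGE_ORDER))
		return NULL;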