Several subsystems (slub, shmem, ttm, etc.) use page->private but don't
clear it before freeing pages. When these pages are later allocated as
high-order pages and split via split_page(), tail pages retain stale
page->private values.
This causes a use-after-free in the swap subsystem. The swap code uses
page->private to track swap count continuations, assuming freshly
allocated pages have page->private == 0. When stale values are present,
swap_count_continued() incorrectly assumes the continuation list is valid
and iterates over uninitialized page->lru containing LIST_POISON values,
causing a crash:
KASAN: maybe wild-memory-access in range [0xdead000000000100-0xdead000000000107]
RIP: 0010:__do_sys_swapoff+0x1151/0x1860
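For reference, the pattern on the swap side looks roughly like the sketch below (simplified and paraphrased from the continuation handling in mm/swapfile.c, not the exact kernel code): a non-zero page->private on a page backing the vmalloc'ed swap_map is taken to mean a continuation list exists, and page->lru is then walked even though nothing ever initialized it on that page.

/*
 * Simplified sketch of the failure mode, paraphrased from the swap
 * continuation handling in mm/swapfile.c (not the exact kernel code).
 */
static void swap_continuation_sketch(struct swap_info_struct *si,
				     unsigned long offset)
{
	/* Page backing this part of the vmalloc'ed swap_map. */
	struct page *head = vmalloc_to_page(si->swap_map + offset);
	struct page *page, *next;

	/*
	 * Freshly allocated backing pages are assumed to have
	 * page->private == 0; a non-zero value is taken to mean
	 * "this page heads a list of continuation pages".
	 */
	if (!page_private(head))
		return;

	/*
	 * With a stale private value, head->lru was never initialized
	 * here and still holds LIST_POISON1/LIST_POISON2 from the
	 * previous free, which produces the wild-memory-access above.
	 */
	list_for_each_entry_safe(page, next, &head->lru, lru) {
		list_del(&page->lru);
		__free_page(page);
	}
}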
Fix this by clearing page->private in free_pages_prepare(), ensuring all
freed pages have clean state regardless of previous use.
Fixes: 3b8000ae185c ("mm/vmalloc: huge vmalloc backing pages should be split rather than compound")
Cc: stable@vger.kernel.org
Suggested-by: Zi Yan <ziy@nvidia.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
---
mm/page_alloc.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index cbf758e27aa2..24ac34199f95 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1430,6 +1430,7 @@ __always_inline bool free_pages_prepare(struct page *page,
 
 	page_cpupid_reset_last(page);
 	page->flags.f &= ~PAGE_FLAGS_CHECK_AT_PREP;
+	page->private = 0;
 	reset_page_owner(page, order);
 	page_table_check_free(page, order);
 	pgalloc_tag_sub(page, 1 << order);
--
2.53.0
On 2/7/26 18:36, Mikhail Gavrilov wrote:
Thanks!
> Several subsystems (slub, shmem, ttm, etc.) use page->private but don't
> clear it before freeing pages. When these pages are later allocated as
> high-order pages and split via split_page(), tail pages retain stale
> page->private values.
>
> This causes a use-after-free in the swap subsystem. The swap code uses
> page->private to track swap count continuations, assuming freshly
> allocated pages have page->private == 0. When stale values are present,
> swap_count_continued() incorrectly assumes the continuation list is valid
> and iterates over uninitialized page->lru containing LIST_POISON values,
> causing a crash:
>
> KASAN: maybe wild-memory-access in range [0xdead000000000100-0xdead000000000107]
> RIP: 0010:__do_sys_swapoff+0x1151/0x1860
>
> Fix this by clearing page->private in free_pages_prepare(), ensuring all
> freed pages have clean state regardless of previous use.
I could have sworn we discussed something like that already in the past.
I recall that freeing pages with page->private set was allowed, although
I once wondered whether we should actually change that.
>
> Fixes: 3b8000ae185c ("mm/vmalloc: huge vmalloc backing pages should be split rather than compound")
> Cc: stable@vger.kernel.org
> Suggested-by: Zi Yan <ziy@nvidia.com>
> Acked-by: Zi Yan <ziy@nvidia.com>
> Signed-off-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
> ---
Next time, please don't send patches as a reply to another thread; that
way they can easily get lost in a bigger thread.
You want to get people's attention :)
> mm/page_alloc.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index cbf758e27aa2..24ac34199f95 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1430,6 +1430,7 @@ __always_inline bool free_pages_prepare(struct page *page,
>  
>  	page_cpupid_reset_last(page);
>  	page->flags.f &= ~PAGE_FLAGS_CHECK_AT_PREP;
> +	page->private = 0;
Should we be using set_page_private()? It's a bit inconsistent :)
I wonder, if it's really just the split_page() problem, why not
handle it there, where we already iterate over all ("tail") pages?
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index cbf758e27aa2..cbbcfdf3ed26 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3122,8 +3122,10 @@ void split_page(struct page *page, unsigned int order)
 	VM_BUG_ON_PAGE(PageCompound(page), page);
 	VM_BUG_ON_PAGE(!page_count(page), page);
 
-	for (i = 1; i < (1 << order); i++)
+	for (i = 1; i < (1 << order); i++) {
 		set_page_refcounted(page + i);
+		set_page_private(page + i, 0);
+	}
 	split_page_owner(page, order, 0);
 	pgalloc_tag_split(page_folio(page), order, 0);
 	split_page_memcg(page, order);
But then I thought about what actually happens during a folio split.
We had a check in __split_folio_to_order() that got removed in 4265d67e405a, for some
undocumented reason (and the patch got merged with 0 tags :( ). I assume because with zone-device
there is now a way for ->private to be properly set. But we removed the safety check for
all other folios.
-	/*
-	 * page->private should not be set in tail pages. Fix up and warn once
-	 * if private is unexpectedly set.
-	 */
-	if (unlikely(new_folio->private)) {
-		VM_WARN_ON_ONCE_PAGE(true, new_head);
-		new_folio->private = NULL;
-	}
I would have thought that we could have triggered that check easily before. Why didn't we?
Who would have cleared the private field of tail pages?
@Zi Yan, any idea why the folio splitting code wouldn't have revealed a similar problem?
--
Cheers,
David
>
> -	/*
> -	 * page->private should not be set in tail pages. Fix up and warn once
> -	 * if private is unexpectedly set.
> -	 */
> -	if (unlikely(new_folio->private)) {
> -		VM_WARN_ON_ONCE_PAGE(true, new_head);
> -		new_folio->private = NULL;
> -	}
BTW, I wonder whether we should bring that check back for non-device folios.
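If it does come back, one possible shape (just a sketch, assuming folio_is_zone_device()
is the right test to exclude the zone-device case mentioned above) would be:

	/*
	 * Sketch only: the removed check restored, but skipped for
	 * zone-device folios (assuming folio_is_zone_device() is the
	 * right way to exclude the case the removal catered to).
	 */
	if (!folio_is_zone_device(new_folio) && unlikely(new_folio->private)) {
		VM_WARN_ON_ONCE_PAGE(true, new_head);
		new_folio->private = NULL;
	}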
--
Cheers,
David