[PATCH mm-new v3 2/3] mm/khugepaged: use VM_WARN_ON_FOLIO instead of VM_BUG_ON_FOLIO for non-anon folios

Posted by Lance Yang 4 months ago
From: Lance Yang <lance.yang@linux.dev>

As Zi pointed out, we should avoid crashing the kernel for conditions
that can be handled gracefully. Encountering a non-anonymous folio in an
anonymous VMA is a bug, but a warning is sufficient.

This patch changes the VM_BUG_ON_FOLIO(!folio_test_anon(folio)) to a
VM_WARN_ON_FOLIO() in both __collapse_huge_page_isolate() and
hpage_collapse_scan_pmd(), and then aborts the scan with SCAN_PAGE_ANON.

Making more of the scanning logic common between hpage_collapse_scan_pmd()
and __collapse_huge_page_isolate(), as suggested by Dev.

Suggested-by: Dev Jain <dev.jain@arm.com>
Suggested-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Signed-off-by: Lance Yang <lance.yang@linux.dev>
---
 mm/khugepaged.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index e3e27223137a..b5c0295c3414 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -573,7 +573,11 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 		}
 
 		folio = page_folio(page);
-		VM_BUG_ON_FOLIO(!folio_test_anon(folio), folio);
+		if (!folio_test_anon(folio)) {
+			VM_WARN_ON_FOLIO(true, folio);
+			result = SCAN_PAGE_ANON;
+			goto out;
+		}
 
 		/* See hpage_collapse_scan_pmd(). */
 		if (folio_maybe_mapped_shared(folio)) {
@@ -1340,6 +1344,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
 		folio = page_folio(page);
 
 		if (!folio_test_anon(folio)) {
+			VM_WARN_ON_FOLIO(true, folio);
 			result = SCAN_PAGE_ANON;
 			goto out_unmap;
 		}
-- 
2.49.0
Re: [PATCH mm-new v3 2/3] mm/khugepaged: use VM_WARN_ON_FOLIO instead of VM_BUG_ON_FOLIO for non-anon folios
Posted by Lorenzo Stoakes 3 months, 3 weeks ago
On Wed, Oct 08, 2025 at 12:37:47PM +0800, Lance Yang wrote:
> From: Lance Yang <lance.yang@linux.dev>
>
> As Zi pointed out, we should avoid crashing the kernel for conditions
> that can be handled gracefully. Encountering a non-anonymous folio in an
> anonymous VMA is a bug, but a warning is sufficient.
>
> This patch changes the VM_BUG_ON_FOLIO(!folio_test_anon(folio)) to a
> VM_WARN_ON_FOLIO() in both __collapse_huge_page_isolate() and
> hpage_collapse_scan_pmd(), and then aborts the scan with SCAN_PAGE_ANON.

Well no, in hpage_collapse_scan_pmd() there is no warning at all.

>
> Making more of the scanning logic common between hpage_collapse_scan_pmd()
> and __collapse_huge_page_isolate(), as suggested by Dev.

I mean I guess it's fine but I'm not sure it's really necessary to give a
blow-by-blow of who suggested what in the actual commit message :) This
isn't really useful information for somebody looking at this code in the
future.

>
> Suggested-by: Dev Jain <dev.jain@arm.com>
> Suggested-by: Zi Yan <ziy@nvidia.com>
> Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
> Reviewed-by: Dev Jain <dev.jain@arm.com>
> Signed-off-by: Lance Yang <lance.yang@linux.dev>
> ---
>  mm/khugepaged.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index e3e27223137a..b5c0295c3414 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -573,7 +573,11 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  		}
>
>  		folio = page_folio(page);
> -		VM_BUG_ON_FOLIO(!folio_test_anon(folio), folio);
> +		if (!folio_test_anon(folio)) {
> +			VM_WARN_ON_FOLIO(true, folio);
> +			result = SCAN_PAGE_ANON;
> +			goto out;

Hmm, this is iffy. I'm not sure I agree with Zi here - the purpose of
VM_WARN_ON() etc. is to catch things that programmatically _should not
happen_.

Now every single time we run this code we're doing this check.

AND it implies that it is an actual possibility, at run time, for this to be
the case.

I really don't like this.

Also, if it's a runtime check, this should be a WARN_ON_ONCE(), not a
VM_WARN_ON_ONCE(). Of course, you lose the folio output then. So this code
is very confused.

In general I don't think we should be doing this at all, rather we should
just convert the VM_BUG_ON_FOLIO() to a VM_WARN_ON_FOLIO().
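
To make the contrast concrete, here is a minimal userspace sketch of the two
forms. Everything in it is a simplified stand-in rather than the kernel's own
definitions: struct folio, the DEBUG_VM switch, warn_count and the helpers
isolate_checked()/isolate_asserted() are hypothetical names, and the macro
only approximates how VM_WARN_ON_FOLIO() behaves with and without
CONFIG_DEBUG_VM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for CONFIG_DEBUG_VM; set to 0 to model a non-debug build. */
#ifndef DEBUG_VM
#define DEBUG_VM 1
#endif

enum scan_result { SCAN_SUCCEED, SCAN_PAGE_ANON };

/* Simplified stand-in: a real folio carries far more state than this. */
struct folio { bool anon; };

static int warn_count;	/* counts warnings so the behaviour can be observed */

#if DEBUG_VM
#define VM_WARN_ON_FOLIO(cond, folio)					\
	do {								\
		if (cond) {						\
			warn_count++;					\
			fprintf(stderr, "WARN: folio %p\n",		\
				(void *)(folio));			\
		}							\
	} while (0)
#else
/* The condition is only type-checked and never evaluated at run time. */
#define VM_WARN_ON_FOLIO(cond, folio)					\
	do { (void)sizeof(cond); (void)(folio); } while (0)
#endif

/* Form used in the posted patch: the branch runs regardless of DEBUG_VM. */
static enum scan_result isolate_checked(struct folio *folio)
{
	if (!folio->anon) {
		VM_WARN_ON_FOLIO(true, folio);
		return SCAN_PAGE_ANON;
	}
	return SCAN_SUCCEED;
}

/* Plain conversion: a debug-only assertion with no recovery path. */
static enum scan_result isolate_asserted(struct folio *folio)
{
	VM_WARN_ON_FOLIO(!folio->anon, folio);
	return SCAN_SUCCEED;
}
```

With DEBUG_VM set to 0, isolate_checked() still tests the folio and bails out
on every call, while the assertion in isolate_asserted() does no run-time
work at all.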


> +		}
>
>  		/* See hpage_collapse_scan_pmd(). */
>  		if (folio_maybe_mapped_shared(folio)) {
> @@ -1340,6 +1344,7 @@ static int hpage_collapse_scan_pmd(struct mm_struct *mm,
>  		folio = page_folio(page);
>
>  		if (!folio_test_anon(folio)) {
> +			VM_WARN_ON_FOLIO(true, folio);

Err, what? This is a condition that should never, ever happen to the point
that we warn on it?

Surely this is a condition that we expect to happen sometimes, otherwise we
wouldn't do this, no?

Either way the comments above still apply. Also VM_WARN_ON_FOLIO(true, ...)
is kinda gross... if this is an actual pattern that exists, VM_WARN_FOLIO()
would be preferable.
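
As a rough userspace illustration of that idea, an unconditional variant
could be layered on top of the conditional one. VM_WARN_FOLIO() here is
purely hypothetical (the review only floats the name), and struct folio,
warn_count and check_anon() are simplified stand-ins, not kernel code.

```c
#include <assert.h>
#include <stdio.h>

/* Simplified stand-ins; the real definitions live in the kernel's mm headers. */
struct folio { int anon; };

static int warn_count;	/* counts warnings so the behaviour can be observed */

#define VM_WARN_ON_FOLIO(cond, folio)					\
	do {								\
		if (cond) {						\
			warn_count++;					\
			fprintf(stderr, "WARN: folio %p\n",		\
				(void *)(folio));			\
		}							\
	} while (0)

/* Hypothetical helper: assumed for illustration, not an existing kernel API. */
#define VM_WARN_FOLIO(folio) VM_WARN_ON_FOLIO(1, folio)

/* Returns 0 for an anonymous folio, -1 (after warning) otherwise. */
static int check_anon(struct folio *folio)
{
	if (!folio->anon) {
		VM_WARN_FOLIO(folio);
		return -1;
	}
	return 0;
}
```

The call site then says what it means (warn about this folio) instead of
passing a constant true as the condition.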

>  			result = SCAN_PAGE_ANON;
>  			goto out_unmap;
>  		}
> --
> 2.49.0
>