[PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn

There is a newer version of this series
Posted by Jane Chu 1 month, 2 weeks ago
When a hugetlb folio is poisoned again, try_memory_failure_hugetlb()
passes the head pfn to kill_accessing_process(), which is not right.
The precise pfn of the poisoned page should be used in order to
determine the precise vaddr for the SIGBUS payload.

This issue has already been taken care of in the normal path, that is,
hwpoison_user_mappings(); see [1][2].  Furthermore, for [3] to work
correctly in the hugetlb repoisoning case, it is essential to inform
the VM of the precise poisoned page, not the head page.

[1] https://lkml.kernel.org/r/20231218135837.3310403-1-willy@infradead.org
[2] https://lkml.kernel.org/r/20250224211445.2663312-1-jane.chu@oracle.com
[3] https://lore.kernel.org/lkml/20251116013223.1557158-1-jiaqiyan@google.com/

Cc: <stable@vger.kernel.org>
Signed-off-by: Jane Chu <jane.chu@oracle.com>
---
 mm/memory-failure.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 3edebb0cda30..c9d87811b1ea 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -681,9 +681,11 @@ static void set_to_kill(struct to_kill *tk, unsigned long addr, short shift)
 }
 
 static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
-				unsigned long poisoned_pfn, struct to_kill *tk)
+				unsigned long poisoned_pfn, struct to_kill *tk,
+				int pte_nr)
 {
 	unsigned long pfn = 0;
+	unsigned long hwpoison_vaddr;
 
 	if (pte_present(pte)) {
 		pfn = pte_pfn(pte);
@@ -694,10 +696,11 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
 			pfn = swp_offset_pfn(swp);
 	}
 
-	if (!pfn || pfn != poisoned_pfn)
+	if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
 		return 0;
 
-	set_to_kill(tk, addr, shift);
+	hwpoison_vaddr = addr + ((poisoned_pfn - pfn) << PAGE_SHIFT);
+	set_to_kill(tk, hwpoison_vaddr, shift);
 	return 1;
 }
 
@@ -749,7 +752,7 @@ static int hwpoison_pte_range(pmd_t *pmdp, unsigned long addr,
 
 	for (; addr != end; ptep++, addr += PAGE_SIZE) {
 		ret = check_hwpoisoned_entry(ptep_get(ptep), addr, PAGE_SHIFT,
-					     hwp->pfn, &hwp->tk);
+					     hwp->pfn, &hwp->tk, 1);
 		if (ret == 1)
 			break;
 	}
@@ -772,8 +775,8 @@ static int hwpoison_hugetlb_range(pte_t *ptep, unsigned long hmask,
 
 	ptl = huge_pte_lock(h, walk->mm, ptep);
 	pte = huge_ptep_get(walk->mm, addr, ptep);
-	ret = check_hwpoisoned_entry(pte, addr, huge_page_shift(h),
-					hwp->pfn, &hwp->tk);
+	ret = check_hwpoisoned_entry(pte, addr, huge_page_shift(h), hwp->pfn,
+				&hwp->tk, pages_per_huge_page(h));
 	spin_unlock(ptl);
 	return ret;
 }
@@ -2023,10 +2026,8 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
 		*hugetlb = 0;
 		return 0;
 	} else if (res == -EHWPOISON) {
-		if (flags & MF_ACTION_REQUIRED) {
-			folio = page_folio(p);
-			res = kill_accessing_process(current, folio_pfn(folio), flags);
-		}
+		if (flags & MF_ACTION_REQUIRED)
+			res = kill_accessing_process(current, pfn, flags);
 		action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
 		return res;
 	} else if (res == -EBUSY) {
@@ -2037,6 +2038,7 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
 		return action_result(pfn, MF_MSG_GET_HWPOISON, MF_IGNORED);
 	}
 
+
 	folio = page_folio(p);
 	folio_lock(folio);
 
-- 
2.43.5
Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
Posted by Andrew Morton 1 month, 2 weeks ago
On Thu, 18 Dec 2025 23:28:19 -0700 Jane Chu <jane.chu@oracle.com> wrote:

> When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
> passed head pfn to kill_accessing_process(), that is not right.
> The precise pfn of the poisoned page should be used in order to
> determine the precise vaddr as the SIGBUS payload.
> 
> This issue has already been taken care of in the normal path, that is,
> hwpoison_user_mappings(), see [1][2].  Further more, for [3] to work
> correctly in the hugetlb repoisoning case, it's essential to inform
> VM the precise poisoned page, not the head page.

This conflicts with your "mm/memory-failure: fix missing ->mf_stats
count in hugetlb poison".

Also conflicts a bit with "mm: fixup pfnmap memory failure handling to
use pgoff" but that one isn't cc:stable, so this patch (which *is*
cc:stable) takes priority.

Help?
Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
Posted by jane.chu@oracle.com 1 month, 2 weeks ago

On 12/20/2025 3:13 PM, Andrew Morton wrote:
> On Thu, 18 Dec 2025 23:28:19 -0700 Jane Chu <jane.chu@oracle.com> wrote:
> 
>> When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
>> passed head pfn to kill_accessing_process(), that is not right.
>> The precise pfn of the poisoned page should be used in order to
>> determine the precise vaddr as the SIGBUS payload.
>>
>> This issue has already been taken care of in the normal path, that is,
>> hwpoison_user_mappings(), see [1][2].  Further more, for [3] to work
>> correctly in the hugetlb repoisoning case, it's essential to inform
>> VM the precise poisoned page, not the head page.
> 
> This conflicts with your "mm/memory-failure: fix missing ->mf_stats
> count in hugetlb poison".
> 
> Also conflicts a bit with "mm: fixup pfnmap memory failure handling to
> use pgoff" but that one isn't cc:stable, so this patch (which *is*
> cc:stable) takes priority.
> 
> Help?
> 

Sorry, Andrew.  Let me try to rebase v3 on top of my other patch.  Will 
also take a look at the other conflict.

thanks,
-jane
Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
Posted by jane.chu@oracle.com 1 month, 2 weeks ago
Hi, Andrew,

On 12/20/2025 3:13 PM, Andrew Morton wrote:
> On Thu, 18 Dec 2025 23:28:19 -0700 Jane Chu <jane.chu@oracle.com> wrote:
> 
>> When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
>> passed head pfn to kill_accessing_process(), that is not right.
>> The precise pfn of the poisoned page should be used in order to
>> determine the precise vaddr as the SIGBUS payload.
>>
>> This issue has already been taken care of in the normal path, that is,
>> hwpoison_user_mappings(), see [1][2].  Further more, for [3] to work
>> correctly in the hugetlb repoisoning case, it's essential to inform
>> VM the precise poisoned page, not the head page.
> 
> This conflicts with your "mm/memory-failure: fix missing ->mf_stats
> count in hugetlb poison".
> 
> Also conflicts a bit with "mm: fixup pfnmap memory failure handling to
> use pgoff" but that one isn't cc:stable, so this patch (which *is*
> cc:stable) takes priority.

I looked at
   https://lore.kernel.org/lkml/20251213044708.3610-2-ankita@nvidia.com/
and it looks like we're changing different functions, so perhaps the
conflict is peripheral?

thanks,
-jane

> 
> Help?
>
Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
Posted by David Hildenbrand (Red Hat) 1 month, 2 weeks ago
On 12/19/25 07:28, Jane Chu wrote:
> When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
> passed head pfn to kill_accessing_process(), that is not right.
> The precise pfn of the poisoned page should be used in order to
> determine the precise vaddr as the SIGBUS payload.

I don't think so? IIRC, for hugetlb folios we always reported the head 
PFN. And user space must assume that the whole thing is poisoned and 
will go away.

I recall that older QEMU even depended on that behavior, for example.

-- 
Cheers

David
Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
Posted by jane.chu@oracle.com 1 month, 2 weeks ago
On 12/21/2025 12:49 AM, David Hildenbrand (Red Hat) wrote:
> On 12/19/25 07:28, Jane Chu wrote:
>> When a hugetlb folio is being poisoned again, 
>> try_memory_failure_hugetlb()
>> passed head pfn to kill_accessing_process(), that is not right.
>> The precise pfn of the poisoned page should be used in order to
>> determine the precise vaddr as the SIGBUS payload.
> 
> I don't think so? IIRC, for hugetlb folios we always reported the head 
> PFN. And user space must assume that the whole thing is poisoned and 
> will go away.
> 
> I recall that older QEMU even depended on that behavior, for example.
> 

What happens if a non-head PFN of a hugetlb page is indicated in a SIGBUS 
to QEMU?  Because the regular path, the one via hwpoison_user_mappings(), 
already behaves this way.

I'm not familiar with QEMU. AFAIK, the need for this patch came from our 
VM/QEMU team.

thanks,
-jane
Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
Posted by Liam R. Howlett 1 month, 2 weeks ago
* Jane Chu <jane.chu@oracle.com> [251219 01:28]:
> When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
> passed head pfn to kill_accessing_process(), that is not right.
> The precise pfn of the poisoned page should be used in order to
> determine the precise vaddr as the SIGBUS payload.
> 
> This issue has already been taken care of in the normal path, that is,
> hwpoison_user_mappings(), see [1][2].  Further more, for [3] to work
> correctly in the hugetlb repoisoning case, it's essential to inform
> VM the precise poisoned page, not the head page.
> 
> [1] https://lkml.kernel.org/r/20231218135837.3310403-1-willy@infradead.org
> [2] https://lkml.kernel.org/r/20250224211445.2663312-1-jane.chu@oracle.com
> [3] https://lore.kernel.org/lkml/20251116013223.1557158-1-jiaqiyan@google.com/
> 
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Jane Chu <jane.chu@oracle.com>

I don't see stable in the Cc list, did you miss it?

Looks good, small nit below.

Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>

> ---
>  mm/memory-failure.c | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 3edebb0cda30..c9d87811b1ea 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -681,9 +681,11 @@ static void set_to_kill(struct to_kill *tk, unsigned long addr, short shift)
>  }
>  
>  static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
> -				unsigned long poisoned_pfn, struct to_kill *tk)
> +				unsigned long poisoned_pfn, struct to_kill *tk,
> +				int pte_nr)
>  {
>  	unsigned long pfn = 0;
> +	unsigned long hwpoison_vaddr;
>  
>  	if (pte_present(pte)) {
>  		pfn = pte_pfn(pte);
> @@ -694,10 +696,11 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>  			pfn = swp_offset_pfn(swp);
>  	}
>  
> -	if (!pfn || pfn != poisoned_pfn)
> +	if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
>  		return 0;
>  
> -	set_to_kill(tk, addr, shift);
> +	hwpoison_vaddr = addr + ((poisoned_pfn - pfn) << PAGE_SHIFT);
> +	set_to_kill(tk, hwpoison_vaddr, shift);
>  	return 1;
>  }
>  
> @@ -749,7 +752,7 @@ static int hwpoison_pte_range(pmd_t *pmdp, unsigned long addr,
>  
>  	for (; addr != end; ptep++, addr += PAGE_SIZE) {
>  		ret = check_hwpoisoned_entry(ptep_get(ptep), addr, PAGE_SHIFT,
> -					     hwp->pfn, &hwp->tk);
> +					     hwp->pfn, &hwp->tk, 1);
>  		if (ret == 1)
>  			break;
>  	}
> @@ -772,8 +775,8 @@ static int hwpoison_hugetlb_range(pte_t *ptep, unsigned long hmask,
>  
>  	ptl = huge_pte_lock(h, walk->mm, ptep);
>  	pte = huge_ptep_get(walk->mm, addr, ptep);
> -	ret = check_hwpoisoned_entry(pte, addr, huge_page_shift(h),
> -					hwp->pfn, &hwp->tk);
> +	ret = check_hwpoisoned_entry(pte, addr, huge_page_shift(h), hwp->pfn,
> +				&hwp->tk, pages_per_huge_page(h));
>  	spin_unlock(ptl);
>  	return ret;
>  }
> @@ -2023,10 +2026,8 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
>  		*hugetlb = 0;
>  		return 0;
>  	} else if (res == -EHWPOISON) {
> -		if (flags & MF_ACTION_REQUIRED) {
> -			folio = page_folio(p);
> -			res = kill_accessing_process(current, folio_pfn(folio), flags);
> -		}
> +		if (flags & MF_ACTION_REQUIRED)
> +			res = kill_accessing_process(current, pfn, flags);
>  		action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
>  		return res;
>  	} else if (res == -EBUSY) {
> @@ -2037,6 +2038,7 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
>  		return action_result(pfn, MF_MSG_GET_HWPOISON, MF_IGNORED);
>  	}
>  
> +

nit: extra whitespace added.

>  	folio = page_folio(p);
>  	folio_lock(folio);
>  
> -- 
> 2.43.5
>
Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
Posted by jane.chu@oracle.com 1 month, 2 weeks ago

On 12/19/2025 9:27 AM, Liam R. Howlett wrote:
> * Jane Chu <jane.chu@oracle.com> [251219 01:28]:
>> When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
>> passed head pfn to kill_accessing_process(), that is not right.
>> The precise pfn of the poisoned page should be used in order to
>> determine the precise vaddr as the SIGBUS payload.
>>
>> This issue has already been taken care of in the normal path, that is,
>> hwpoison_user_mappings(), see [1][2].  Further more, for [3] to work
>> correctly in the hugetlb repoisoning case, it's essential to inform
>> VM the precise poisoned page, not the head page.
>>
>> [1] https://lkml.kernel.org/r/20231218135837.3310403-1-willy@infradead.org
>> [2] https://lkml.kernel.org/r/20250224211445.2663312-1-jane.chu@oracle.com
>> [3] https://lore.kernel.org/lkml/20251116013223.1557158-1-jiaqiyan@google.com/
>>
>> Cc: <stable@vger.kernel.org>
>> Signed-off-by: Jane Chu <jane.chu@oracle.com>
> 
> I don't see stable in the Cc list, did you miss it?

Good catch, thank you!
> 
> Looks good, small nit below.
> 
> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>

Thanks!
-jane

> 
>> ---
>>   mm/memory-failure.c | 22 ++++++++++++----------
>>   1 file changed, 12 insertions(+), 10 deletions(-)
>>
>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>> index 3edebb0cda30..c9d87811b1ea 100644
>> --- a/mm/memory-failure.c
>> +++ b/mm/memory-failure.c
>> @@ -681,9 +681,11 @@ static void set_to_kill(struct to_kill *tk, unsigned long addr, short shift)
>>   }
>>   
>>   static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>> -				unsigned long poisoned_pfn, struct to_kill *tk)
>> +				unsigned long poisoned_pfn, struct to_kill *tk,
>> +				int pte_nr)
>>   {
>>   	unsigned long pfn = 0;
>> +	unsigned long hwpoison_vaddr;
>>   
>>   	if (pte_present(pte)) {
>>   		pfn = pte_pfn(pte);
>> @@ -694,10 +696,11 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>>   			pfn = swp_offset_pfn(swp);
>>   	}
>>   
>> -	if (!pfn || pfn != poisoned_pfn)
>> +	if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
>>   		return 0;
>>   
>> -	set_to_kill(tk, addr, shift);
>> +	hwpoison_vaddr = addr + ((poisoned_pfn - pfn) << PAGE_SHIFT);
>> +	set_to_kill(tk, hwpoison_vaddr, shift);
>>   	return 1;
>>   }
>>   
>> @@ -749,7 +752,7 @@ static int hwpoison_pte_range(pmd_t *pmdp, unsigned long addr,
>>   
>>   	for (; addr != end; ptep++, addr += PAGE_SIZE) {
>>   		ret = check_hwpoisoned_entry(ptep_get(ptep), addr, PAGE_SHIFT,
>> -					     hwp->pfn, &hwp->tk);
>> +					     hwp->pfn, &hwp->tk, 1);
>>   		if (ret == 1)
>>   			break;
>>   	}
>> @@ -772,8 +775,8 @@ static int hwpoison_hugetlb_range(pte_t *ptep, unsigned long hmask,
>>   
>>   	ptl = huge_pte_lock(h, walk->mm, ptep);
>>   	pte = huge_ptep_get(walk->mm, addr, ptep);
>> -	ret = check_hwpoisoned_entry(pte, addr, huge_page_shift(h),
>> -					hwp->pfn, &hwp->tk);
>> +	ret = check_hwpoisoned_entry(pte, addr, huge_page_shift(h), hwp->pfn,
>> +				&hwp->tk, pages_per_huge_page(h));
>>   	spin_unlock(ptl);
>>   	return ret;
>>   }
>> @@ -2023,10 +2026,8 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
>>   		*hugetlb = 0;
>>   		return 0;
>>   	} else if (res == -EHWPOISON) {
>> -		if (flags & MF_ACTION_REQUIRED) {
>> -			folio = page_folio(p);
>> -			res = kill_accessing_process(current, folio_pfn(folio), flags);
>> -		}
>> +		if (flags & MF_ACTION_REQUIRED)
>> +			res = kill_accessing_process(current, pfn, flags);
>>   		action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
>>   		return res;
>>   	} else if (res == -EBUSY) {
>> @@ -2037,6 +2038,7 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb
>>   		return action_result(pfn, MF_MSG_GET_HWPOISON, MF_IGNORED);
>>   	}
>>   
>> +
> 
> nit: extra whitespace added.
> 
>>   	folio = page_folio(p);
>>   	folio_lock(folio);
>>   
>> -- 
>> 2.43.5
>>
Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
Posted by Miaohe Lin 1 month, 2 weeks ago
On 2025/12/19 14:28, Jane Chu wrote:
> When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
> passed head pfn to kill_accessing_process(), that is not right.
> The precise pfn of the poisoned page should be used in order to
> determine the precise vaddr as the SIGBUS payload.
> 
> This issue has already been taken care of in the normal path, that is,
> hwpoison_user_mappings(), see [1][2].  Further more, for [3] to work
> correctly in the hugetlb repoisoning case, it's essential to inform
> VM the precise poisoned page, not the head page.
> 
> [1] https://lkml.kernel.org/r/20231218135837.3310403-1-willy@infradead.org
> [2] https://lkml.kernel.org/r/20250224211445.2663312-1-jane.chu@oracle.com
> [3] https://lore.kernel.org/lkml/20251116013223.1557158-1-jiaqiyan@google.com/
> 

Thanks for your patch.

> Cc: <stable@vger.kernel.org>
> Signed-off-by: Jane Chu <jane.chu@oracle.com>
> ---
>  mm/memory-failure.c | 22 ++++++++++++----------
>  1 file changed, 12 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 3edebb0cda30..c9d87811b1ea 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -681,9 +681,11 @@ static void set_to_kill(struct to_kill *tk, unsigned long addr, short shift)
>  }
>  
>  static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
> -				unsigned long poisoned_pfn, struct to_kill *tk)
> +				unsigned long poisoned_pfn, struct to_kill *tk,
> +				int pte_nr)
>  {
>  	unsigned long pfn = 0;
> +	unsigned long hwpoison_vaddr;
>  
>  	if (pte_present(pte)) {
>  		pfn = pte_pfn(pte);
> @@ -694,10 +696,11 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>  			pfn = swp_offset_pfn(swp);
>  	}
>  
> -	if (!pfn || pfn != poisoned_pfn)
> +	if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
>  		return 0;

Can we get pte_nr from @shift? I.e. something like "pte_nr = 1UL << (shift - PAGE_SHIFT);"?

Thanks.
.
Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
Posted by jane.chu@oracle.com 1 month, 2 weeks ago

On 12/19/2025 12:01 AM, Miaohe Lin wrote:
> On 2025/12/19 14:28, Jane Chu wrote:
>> When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
>> passed head pfn to kill_accessing_process(), that is not right.
>> The precise pfn of the poisoned page should be used in order to
>> determine the precise vaddr as the SIGBUS payload.
>>
>> This issue has already been taken care of in the normal path, that is,
>> hwpoison_user_mappings(), see [1][2].  Further more, for [3] to work
>> correctly in the hugetlb repoisoning case, it's essential to inform
>> VM the precise poisoned page, not the head page.
>>
>> [1] https://lkml.kernel.org/r/20231218135837.3310403-1-willy@infradead.org
>> [2] https://lkml.kernel.org/r/20250224211445.2663312-1-jane.chu@oracle.com
>> [3] https://lore.kernel.org/lkml/20251116013223.1557158-1-jiaqiyan@google.com/
>>
> 
> Thanks for your patch.
> 
>> Cc: <stable@vger.kernel.org>
>> Signed-off-by: Jane Chu <jane.chu@oracle.com>
>> ---
>>   mm/memory-failure.c | 22 ++++++++++++----------
>>   1 file changed, 12 insertions(+), 10 deletions(-)
>>
>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>> index 3edebb0cda30..c9d87811b1ea 100644
>> --- a/mm/memory-failure.c
>> +++ b/mm/memory-failure.c
>> @@ -681,9 +681,11 @@ static void set_to_kill(struct to_kill *tk, unsigned long addr, short shift)
>>   }
>>   
>>   static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>> -				unsigned long poisoned_pfn, struct to_kill *tk)
>> +				unsigned long poisoned_pfn, struct to_kill *tk,
>> +				int pte_nr)
>>   {
>>   	unsigned long pfn = 0;
>> +	unsigned long hwpoison_vaddr;
>>   
>>   	if (pte_present(pte)) {
>>   		pfn = pte_pfn(pte);
>> @@ -694,10 +696,11 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>>   			pfn = swp_offset_pfn(swp);
>>   	}
>>   
>> -	if (!pfn || pfn != poisoned_pfn)
>> +	if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
>>   		return 0;
> 
> Can we get pte_nr from @shift? I.e. something like "pte_nr = 1UL << (shift - PAGE_SHIFT);"?

Why?  Is there any concern with using the macro pages_per_huge_page(h)?

thanks!
-jane
> 
> Thanks.
> .
Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
Posted by Miaohe Lin 1 month, 2 weeks ago
On 2025/12/19 16:06, jane.chu@oracle.com wrote:
> 
> 
> On 12/19/2025 12:01 AM, Miaohe Lin wrote:
>> On 2025/12/19 14:28, Jane Chu wrote:
>>> When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
>>> passed head pfn to kill_accessing_process(), that is not right.
>>> The precise pfn of the poisoned page should be used in order to
>>> determine the precise vaddr as the SIGBUS payload.
>>>
>>> This issue has already been taken care of in the normal path, that is,
>>> hwpoison_user_mappings(), see [1][2].  Further more, for [3] to work
>>> correctly in the hugetlb repoisoning case, it's essential to inform
>>> VM the precise poisoned page, not the head page.
>>>
>>> [1] https://lkml.kernel.org/r/20231218135837.3310403-1-willy@infradead.org
>>> [2] https://lkml.kernel.org/r/20250224211445.2663312-1-jane.chu@oracle.com
>>> [3] https://lore.kernel.org/lkml/20251116013223.1557158-1-jiaqiyan@google.com/
>>>
>>
>> Thanks for your patch.
>>
>>> Cc: <stable@vger.kernel.org>
>>> Signed-off-by: Jane Chu <jane.chu@oracle.com>
>>> ---
>>>   mm/memory-failure.c | 22 ++++++++++++----------
>>>   1 file changed, 12 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>>> index 3edebb0cda30..c9d87811b1ea 100644
>>> --- a/mm/memory-failure.c
>>> +++ b/mm/memory-failure.c
>>> @@ -681,9 +681,11 @@ static void set_to_kill(struct to_kill *tk, unsigned long addr, short shift)
>>>   }
>>>     static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>>> -                unsigned long poisoned_pfn, struct to_kill *tk)
>>> +                unsigned long poisoned_pfn, struct to_kill *tk,
>>> +                int pte_nr)
>>>   {
>>>       unsigned long pfn = 0;
>>> +    unsigned long hwpoison_vaddr;
>>>         if (pte_present(pte)) {
>>>           pfn = pte_pfn(pte);
>>> @@ -694,10 +696,11 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>>>               pfn = swp_offset_pfn(swp);
>>>       }
>>>   -    if (!pfn || pfn != poisoned_pfn)
>>> +    if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
>>>           return 0;
>>
>> Can we get pte_nr from @shift? I.e. something like "pte_nr = 1UL << (shift - PAGE_SHIFT);"?
> 
> Why?  Is there any concern with using the macro pages_per_huge_page(h) ?

No, I was trying to get rid of the new @pte_nr parameter. Something like below:

 static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
-                               unsigned long poisoned_pfn, struct to_kill *tk,
-                               int pte_nr)
+                               unsigned long poisoned_pfn, struct to_kill *tk)
 {
        unsigned long pfn = 0;
        unsigned long hwpoison_vaddr;
+       int pte_nr;

        if (pte_present(pte)) {
                pfn = pte_pfn(pte);
@@ -701,7 +701,8 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
                        pfn = softleaf_to_pfn(entry);
        }

-       if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
+       pte_nr = 1UL << (shift - PAGE_SHIFT);
+       if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
                return 0;

        hwpoison_vaddr = addr + ((poisoned_pfn - pfn) << PAGE_SHIFT);

So we don't have to pass in pte_nr from all callers. But that's trivial.

Thanks.
.
Re: [PATCH] mm/memory-failure: teach kill_accessing_process to accept hugetlb tail page pfn
Posted by jane.chu@oracle.com 1 month, 2 weeks ago

On 12/21/2025 7:01 PM, Miaohe Lin wrote:
> On 2025/12/19 16:06, jane.chu@oracle.com wrote:
>>
>>
>> On 12/19/2025 12:01 AM, Miaohe Lin wrote:
>>> On 2025/12/19 14:28, Jane Chu wrote:
>>>> When a hugetlb folio is being poisoned again, try_memory_failure_hugetlb()
>>>> passed head pfn to kill_accessing_process(), that is not right.
>>>> The precise pfn of the poisoned page should be used in order to
>>>> determine the precise vaddr as the SIGBUS payload.
>>>>
>>>> This issue has already been taken care of in the normal path, that is,
>>>> hwpoison_user_mappings(), see [1][2].  Further more, for [3] to work
>>>> correctly in the hugetlb repoisoning case, it's essential to inform
>>>> VM the precise poisoned page, not the head page.
>>>>
>>>> [1] https://lkml.kernel.org/r/20231218135837.3310403-1-willy@infradead.org
>>>> [2] https://lkml.kernel.org/r/20250224211445.2663312-1-jane.chu@oracle.com
>>>> [3] https://lore.kernel.org/lkml/20251116013223.1557158-1-jiaqiyan@google.com/
>>>>
>>>
>>> Thanks for your patch.
>>>
>>>> Cc: <stable@vger.kernel.org>
>>>> Signed-off-by: Jane Chu <jane.chu@oracle.com>
>>>> ---
>>>>    mm/memory-failure.c | 22 ++++++++++++----------
>>>>    1 file changed, 12 insertions(+), 10 deletions(-)
>>>>
>>>> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>>>> index 3edebb0cda30..c9d87811b1ea 100644
>>>> --- a/mm/memory-failure.c
>>>> +++ b/mm/memory-failure.c
>>>> @@ -681,9 +681,11 @@ static void set_to_kill(struct to_kill *tk, unsigned long addr, short shift)
>>>>    }
>>>>      static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>>>> -                unsigned long poisoned_pfn, struct to_kill *tk)
>>>> +                unsigned long poisoned_pfn, struct to_kill *tk,
>>>> +                int pte_nr)
>>>>    {
>>>>        unsigned long pfn = 0;
>>>> +    unsigned long hwpoison_vaddr;
>>>>          if (pte_present(pte)) {
>>>>            pfn = pte_pfn(pte);
>>>> @@ -694,10 +696,11 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>>>>                pfn = swp_offset_pfn(swp);
>>>>        }
>>>>    -    if (!pfn || pfn != poisoned_pfn)
>>>> +    if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
>>>>            return 0;
>>>
>>> Can we get pte_nr from @shift? I.e. something like "pte_nr = 1UL << (shift - PAGE_SHIFT);"?
>>
>> Why?  Is there any concern with using the macro pages_per_huge_page(h) ?
> 
> No, I was trying to get rid of new @pte_nr parameter. Something like below:
> 
>   static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
> -                               unsigned long poisoned_pfn, struct to_kill *tk,
> -                               int pte_nr)
> +                               unsigned long poisoned_pfn, struct to_kill *tk)
>   {
>          unsigned long pfn = 0;
>          unsigned long hwpoison_vaddr;
> +       int pte_nr;
> 
>          if (pte_present(pte)) {
>                  pfn = pte_pfn(pte);
> @@ -701,7 +701,8 @@ static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift,
>                          pfn = softleaf_to_pfn(entry);
>          }
> 
> -       if (!pfn || (pfn > poisoned_pfn || (pfn + pte_nr - 1) < poisoned_pfn))
> +       pte_nr = 1UL << (shift - PAGE_SHIFT);
> +       if (!pfn || (pfn > poisoned_pfn || (pfn +  pte_nr - 1) < poisoned_pfn))
>                  return 0;
> 
>          hwpoison_vaddr = addr + ((poisoned_pfn - pfn) << PAGE_SHIFT);
> 
> So we don't have to pass in pte_nr from all callers. But that's trivial.

Got it, that's better. I will combine your and Matthew's suggestions in v3.

Thanks a lot!
-jane

> 
> Thanks.
> .
>