[PATCH 2/2] mm/memory-failure: improve large block size folio handling.

Posted by Zi Yan 2 months, 1 week ago
Large block size (LBS) folios cannot be split to order-0 folios; they can
only be split down to min_order_for_split(). The current code fails the
split outright, which is not optimal. Split the folio to
min_order_for_split() instead, so that after the split only the min-order
folio containing the poisoned page becomes unusable, rather than the whole
large folio.

For soft offline, do not split the large folio if it cannot be split to
order-0, since the folio is still accessible from userspace and a premature
split might cause a performance loss.
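
As a rough worked example (hypothetical numbers, assuming 4KB base pages and
a 64KB block size filesystem, so min_order_for_split() would return 4):

	/* Illustration only, not kernel code. */
	unsigned int folio_order = 9;	/* order-9 pagecache folio, 2MB */
	unsigned int new_order = 4;	/* 64KB, assumed min_order_for_split() */
	unsigned int nr_folios = 1U << (folio_order - new_order);	/* 32 after split */

	/*
	 * Only one of the 32 order-4 folios (64KB) is lost to the poisoned
	 * page; failing the split would have left the whole 2MB unusable.
	 */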

Suggested-by: Jane Chu <jane.chu@oracle.com>
Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 mm/memory-failure.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index f698df156bf8..443df9581c24 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1656,12 +1656,13 @@ static int identify_page_state(unsigned long pfn, struct page *p,
  * there is still more to do, hence the page refcount we took earlier
  * is still needed.
  */
-static int try_to_split_thp_page(struct page *page, bool release)
+static int try_to_split_thp_page(struct page *page, unsigned int new_order,
+		bool release)
 {
 	int ret;
 
 	lock_page(page);
-	ret = split_huge_page(page);
+	ret = split_huge_page_to_list_to_order(page, NULL, new_order);
 	unlock_page(page);
 
 	if (ret && release)
@@ -2280,6 +2281,7 @@ int memory_failure(unsigned long pfn, int flags)
 	folio_unlock(folio);
 
 	if (folio_test_large(folio)) {
+		int new_order = min_order_for_split(folio);
 		/*
 		 * The flag must be set after the refcount is bumped
 		 * otherwise it may race with THP split.
@@ -2294,7 +2296,14 @@ int memory_failure(unsigned long pfn, int flags)
 		 * page is a valid handlable page.
 		 */
 		folio_set_has_hwpoisoned(folio);
-		if (try_to_split_thp_page(p, false) < 0) {
+		/*
+		 * If the folio cannot be split to order-0, kill the process,
+		 * but split the folio anyway to minimize the amount of unusable
+		 * pages.
+		 */
+		if (try_to_split_thp_page(p, new_order, false) || new_order) {
+			/* get folio again in case the original one is split */
+			folio = page_folio(p);
 			res = -EHWPOISON;
 			kill_procs_now(p, pfn, flags, folio);
 			put_page(p);
@@ -2621,7 +2630,15 @@ static int soft_offline_in_use_page(struct page *page)
 	};
 
 	if (!huge && folio_test_large(folio)) {
-		if (try_to_split_thp_page(page, true)) {
+		int new_order = min_order_for_split(folio);
+
+		/*
+		 * If the folio cannot be split to order-0, do not split it at
+		 * all to retain the still accessible large folio.
+		 * NOTE: if getting free memory is preferred, split it like it
+		 * is done in memory_failure().
+		 */
+		if (new_order || try_to_split_thp_page(page, new_order, true)) {
 			pr_info("%#lx: thp split failed\n", pfn);
 			return -EBUSY;
 		}
-- 
2.51.0
Re: [PATCH 2/2] mm/memory-failure: improve large block size folio handling.
Posted by kernel test robot 2 months, 1 week ago
Hi Zi,

kernel test robot noticed the following build errors:

[auto build test ERROR on linus/master]
[also build test ERROR on v6.17 next-20251010]
[cannot apply to akpm-mm/mm-everything]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Zi-Yan/mm-huge_memory-do-not-change-split_huge_page-target-order-silently/20251011-014145
base:   linus/master
patch link:    https://lore.kernel.org/r/20251010173906.3128789-3-ziy%40nvidia.com
patch subject: [PATCH 2/2] mm/memory-failure: improve large block size folio handling.
config: parisc-allmodconfig (https://download.01.org/0day-ci/archive/20251011/202510111805.rg0AewVk-lkp@intel.com/config)
compiler: hppa-linux-gcc (GCC) 15.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251011/202510111805.rg0AewVk-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202510111805.rg0AewVk-lkp@intel.com/

All errors (new ones prefixed by >>):

   mm/memory-failure.c: In function 'memory_failure':
>> mm/memory-failure.c:2278:33: error: implicit declaration of function 'min_order_for_split' [-Wimplicit-function-declaration]
    2278 |                 int new_order = min_order_for_split(folio);
         |                                 ^~~~~~~~~~~~~~~~~~~


vim +/min_order_for_split +2278 mm/memory-failure.c

  2147	
  2148	/**
  2149	 * memory_failure - Handle memory failure of a page.
  2150	 * @pfn: Page Number of the corrupted page
  2151	 * @flags: fine tune action taken
  2152	 *
  2153	 * This function is called by the low level machine check code
  2154	 * of an architecture when it detects hardware memory corruption
  2155	 * of a page. It tries its best to recover, which includes
  2156	 * dropping pages, killing processes etc.
  2157	 *
  2158	 * The function is primarily of use for corruptions that
  2159	 * happen outside the current execution context (e.g. when
  2160	 * detected by a background scrubber)
  2161	 *
  2162	 * Must run in process context (e.g. a work queue) with interrupts
  2163	 * enabled and no spinlocks held.
  2164	 *
  2165	 * Return:
  2166	 *   0             - success,
  2167	 *   -ENXIO        - memory not managed by the kernel
  2168	 *   -EOPNOTSUPP   - hwpoison_filter() filtered the error event,
  2169	 *   -EHWPOISON    - the page was already poisoned, potentially
  2170	 *                   kill process,
  2171	 *   other negative values - failure.
  2172	 */
  2173	int memory_failure(unsigned long pfn, int flags)
  2174	{
  2175		struct page *p;
  2176		struct folio *folio;
  2177		struct dev_pagemap *pgmap;
  2178		int res = 0;
  2179		unsigned long page_flags;
  2180		bool retry = true;
  2181		int hugetlb = 0;
  2182	
  2183		if (!sysctl_memory_failure_recovery)
  2184			panic("Memory failure on page %lx", pfn);
  2185	
  2186		mutex_lock(&mf_mutex);
  2187	
  2188		if (!(flags & MF_SW_SIMULATED))
  2189			hw_memory_failure = true;
  2190	
  2191		p = pfn_to_online_page(pfn);
  2192		if (!p) {
  2193			res = arch_memory_failure(pfn, flags);
  2194			if (res == 0)
  2195				goto unlock_mutex;
  2196	
  2197			if (pfn_valid(pfn)) {
  2198				pgmap = get_dev_pagemap(pfn);
  2199				put_ref_page(pfn, flags);
  2200				if (pgmap) {
  2201					res = memory_failure_dev_pagemap(pfn, flags,
  2202									 pgmap);
  2203					goto unlock_mutex;
  2204				}
  2205			}
  2206			pr_err("%#lx: memory outside kernel control\n", pfn);
  2207			res = -ENXIO;
  2208			goto unlock_mutex;
  2209		}
  2210	
  2211	try_again:
  2212		res = try_memory_failure_hugetlb(pfn, flags, &hugetlb);
  2213		if (hugetlb)
  2214			goto unlock_mutex;
  2215	
  2216		if (TestSetPageHWPoison(p)) {
  2217			res = -EHWPOISON;
  2218			if (flags & MF_ACTION_REQUIRED)
  2219				res = kill_accessing_process(current, pfn, flags);
  2220			if (flags & MF_COUNT_INCREASED)
  2221				put_page(p);
  2222			action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
  2223			goto unlock_mutex;
  2224		}
  2225	
  2226		/*
  2227		 * We need/can do nothing about count=0 pages.
  2228		 * 1) it's a free page, and therefore in safe hand:
  2229		 *    check_new_page() will be the gate keeper.
  2230		 * 2) it's part of a non-compound high order page.
  2231		 *    Implies some kernel user: cannot stop them from
  2232		 *    R/W the page; let's pray that the page has been
  2233		 *    used and will be freed some time later.
  2234		 * In fact it's dangerous to directly bump up page count from 0,
  2235		 * that may make page_ref_freeze()/page_ref_unfreeze() mismatch.
  2236		 */
  2237		if (!(flags & MF_COUNT_INCREASED)) {
  2238			res = get_hwpoison_page(p, flags);
  2239			if (!res) {
  2240				if (is_free_buddy_page(p)) {
  2241					if (take_page_off_buddy(p)) {
  2242						page_ref_inc(p);
  2243						res = MF_RECOVERED;
  2244					} else {
  2245						/* We lost the race, try again */
  2246						if (retry) {
  2247							ClearPageHWPoison(p);
  2248							retry = false;
  2249							goto try_again;
  2250						}
  2251						res = MF_FAILED;
  2252					}
  2253					res = action_result(pfn, MF_MSG_BUDDY, res);
  2254				} else {
  2255					res = action_result(pfn, MF_MSG_KERNEL_HIGH_ORDER, MF_IGNORED);
  2256				}
  2257				goto unlock_mutex;
  2258			} else if (res < 0) {
  2259				res = action_result(pfn, MF_MSG_GET_HWPOISON, MF_IGNORED);
  2260				goto unlock_mutex;
  2261			}
  2262		}
  2263	
  2264		folio = page_folio(p);
  2265	
  2266		/* filter pages that are protected from hwpoison test by users */
  2267		folio_lock(folio);
  2268		if (hwpoison_filter(p)) {
  2269			ClearPageHWPoison(p);
  2270			folio_unlock(folio);
  2271			folio_put(folio);
  2272			res = -EOPNOTSUPP;
  2273			goto unlock_mutex;
  2274		}
  2275		folio_unlock(folio);
  2276	
  2277		if (folio_test_large(folio)) {
> 2278			int new_order = min_order_for_split(folio);
  2279			/*
  2280			 * The flag must be set after the refcount is bumped
  2281			 * otherwise it may race with THP split.
  2282			 * And the flag can't be set in get_hwpoison_page() since
  2283			 * it is called by soft offline too and it is just called
  2284			 * for !MF_COUNT_INCREASED.  So here seems to be the best
  2285			 * place.
  2286			 *
  2287			 * Don't need care about the above error handling paths for
  2288			 * get_hwpoison_page() since they handle either free page
  2289			 * or unhandlable page.  The refcount is bumped iff the
  2290			 * page is a valid handlable page.
  2291			 */
  2292			folio_set_has_hwpoisoned(folio);
  2293			/*
  2294			 * If the folio cannot be split to order-0, kill the process,
  2295			 * but split the folio anyway to minimize the amount of unusable
  2296			 * pages.
  2297			 */
  2298			if (try_to_split_thp_page(p, new_order, false) || new_order) {
  2299				/* get folio again in case the original one is split */
  2300				folio = page_folio(p);
  2301				res = -EHWPOISON;
  2302				kill_procs_now(p, pfn, flags, folio);
  2303				put_page(p);
  2304				action_result(pfn, MF_MSG_UNSPLIT_THP, MF_FAILED);
  2305				goto unlock_mutex;
  2306			}
  2307			VM_BUG_ON_PAGE(!page_count(p), p);
  2308			folio = page_folio(p);
  2309		}
  2310	
  2311		/*
  2312		 * We ignore non-LRU pages for good reasons.
  2313		 * - PG_locked is only well defined for LRU pages and a few others
  2314		 * - to avoid races with __SetPageLocked()
  2315		 * - to avoid races with __SetPageSlab*() (and more non-atomic ops)
  2316		 * The check (unnecessarily) ignores LRU pages being isolated and
  2317		 * walked by the page reclaim code, however that's not a big loss.
  2318		 */
  2319		shake_folio(folio);
  2320	
  2321		folio_lock(folio);
  2322	
  2323		/*
  2324		 * We're only intended to deal with the non-Compound page here.
  2325		 * The page cannot become compound pages again as folio has been
  2326		 * splited and extra refcnt is held.
  2327		 */
  2328		WARN_ON(folio_test_large(folio));
  2329	
  2330		/*
  2331		 * We use page flags to determine what action should be taken, but
  2332		 * the flags can be modified by the error containment action.  One
  2333		 * example is an mlocked page, where PG_mlocked is cleared by
  2334		 * folio_remove_rmap_*() in try_to_unmap_one(). So to determine page
  2335		 * status correctly, we save a copy of the page flags at this time.
  2336		 */
  2337		page_flags = folio->flags.f;
  2338	
  2339		/*
  2340		 * __munlock_folio() may clear a writeback folio's LRU flag without
  2341		 * the folio lock. We need to wait for writeback completion for this
  2342		 * folio or it may trigger a vfs BUG while evicting inode.
  2343		 */
  2344		if (!folio_test_lru(folio) && !folio_test_writeback(folio))
  2345			goto identify_page_state;
  2346	
  2347		/*
  2348		 * It's very difficult to mess with pages currently under IO
  2349		 * and in many cases impossible, so we just avoid it here.
  2350		 */
  2351		folio_wait_writeback(folio);
  2352	
  2353		/*
  2354		 * Now take care of user space mappings.
  2355		 * Abort on fail: __filemap_remove_folio() assumes unmapped page.
  2356		 */
  2357		if (!hwpoison_user_mappings(folio, p, pfn, flags)) {
  2358			res = action_result(pfn, MF_MSG_UNMAP_FAILED, MF_FAILED);
  2359			goto unlock_page;
  2360		}
  2361	
  2362		/*
  2363		 * Torn down by someone else?
  2364		 */
  2365		if (folio_test_lru(folio) && !folio_test_swapcache(folio) &&
  2366		    folio->mapping == NULL) {
  2367			res = action_result(pfn, MF_MSG_TRUNCATED_LRU, MF_IGNORED);
  2368			goto unlock_page;
  2369		}
  2370	
  2371	identify_page_state:
  2372		res = identify_page_state(pfn, p, page_flags);
  2373		mutex_unlock(&mf_mutex);
  2374		return res;
  2375	unlock_page:
  2376		folio_unlock(folio);
  2377	unlock_mutex:
  2378		mutex_unlock(&mf_mutex);
  2379		return res;
  2380	}
  2381	EXPORT_SYMBOL_GPL(memory_failure);
  2382	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Re: [PATCH 2/2] mm/memory-failure: improve large block size folio handling.
Posted by Zi Yan 2 months ago
On 11 Oct 2025, at 6:23, kernel test robot wrote:

> Hi Zi,
>
> kernel test robot noticed the following build errors:
>
> [auto build test ERROR on linus/master]
> [also build test ERROR on v6.17 next-20251010]
> [cannot apply to akpm-mm/mm-everything]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
>
> url:    https://github.com/intel-lab-lkp/linux/commits/Zi-Yan/mm-huge_memory-do-not-change-split_huge_page-target-order-silently/20251011-014145
> base:   linus/master
> patch link:    https://lore.kernel.org/r/20251010173906.3128789-3-ziy%40nvidia.com
> patch subject: [PATCH 2/2] mm/memory-failure: improve large block size folio handling.
> config: parisc-allmodconfig (https://download.01.org/0day-ci/archive/20251011/202510111805.rg0AewVk-lkp@intel.com/config)
> compiler: hppa-linux-gcc (GCC) 15.1.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251011/202510111805.rg0AewVk-lkp@intel.com/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202510111805.rg0AewVk-lkp@intel.com/
>
> All errors (new ones prefixed by >>):
>
>    mm/memory-failure.c: In function 'memory_failure':
>>> mm/memory-failure.c:2278:33: error: implicit declaration of function 'min_order_for_split' [-Wimplicit-function-declaration]
>     2278 |                 int new_order = min_order_for_split(folio);
>          |                                 ^~~~~~~~~~~~~~~~~~~
>

min_order_for_split() is missing in the !CONFIG_TRANSPARENT_HUGEPAGE case. Will add one
to get rid of this error.
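
A minimal sketch of what such a stub could look like (this is an assumption
about the follow-up, not the actual fix; placement and style follow the other
!CONFIG_TRANSPARENT_HUGEPAGE fallbacks in include/linux/huge_mm.h):

	/* Hypothetical !CONFIG_TRANSPARENT_HUGEPAGE fallback, sketch only. */
	static inline int min_order_for_split(struct folio *folio)
	{
		/*
		 * Without THP support this path should never see a splittable
		 * large folio; returning 0 keeps the memory_failure() and
		 * soft offline callers building and preserves the old
		 * "split to order-0" behaviour.
		 */
		return 0;
	}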

Thanks.


--
Best Regards,
Yan, Zi
Re: [PATCH 2/2] mm/memory-failure: improve large block size folio handling.
Posted by Miaohe Lin 2 months, 1 week ago
On 2025/10/11 1:39, Zi Yan wrote:
> Large block size (LBS) folios cannot be split to order-0 folios but
> min_order_for_folio(). Current split fails directly, but that is not
> optimal. Split the folio to min_order_for_folio(), so that, after split,
> only the folio containing the poisoned page becomes unusable instead.
> 
> For soft offline, do not split the large folio if it cannot be split to
> order-0. Since the folio is still accessible from userspace and premature
> split might lead to potential performance loss.

Thanks for your patch.

> 
> Suggested-by: Jane Chu <jane.chu@oracle.com>
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---
>  mm/memory-failure.c | 25 +++++++++++++++++++++----
>  1 file changed, 21 insertions(+), 4 deletions(-)
> 
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index f698df156bf8..443df9581c24 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1656,12 +1656,13 @@ static int identify_page_state(unsigned long pfn, struct page *p,
>   * there is still more to do, hence the page refcount we took earlier
>   * is still needed.
>   */
> -static int try_to_split_thp_page(struct page *page, bool release)
> +static int try_to_split_thp_page(struct page *page, unsigned int new_order,
> +		bool release)
>  {
>  	int ret;
>  
>  	lock_page(page);
> -	ret = split_huge_page(page);
> +	ret = split_huge_page_to_list_to_order(page, NULL, new_order);
>  	unlock_page(page);
>  
>  	if (ret && release)
> @@ -2280,6 +2281,7 @@ int memory_failure(unsigned long pfn, int flags)
>  	folio_unlock(folio);
>  
>  	if (folio_test_large(folio)) {
> +		int new_order = min_order_for_split(folio);
>  		/*
>  		 * The flag must be set after the refcount is bumped
>  		 * otherwise it may race with THP split.
> @@ -2294,7 +2296,14 @@ int memory_failure(unsigned long pfn, int flags)
>  		 * page is a valid handlable page.
>  		 */
>  		folio_set_has_hwpoisoned(folio);
> -		if (try_to_split_thp_page(p, false) < 0) {
> +		/*
> +		 * If the folio cannot be split to order-0, kill the process,
> +		 * but split the folio anyway to minimize the amount of unusable
> +		 * pages.
> +		 */
> +		if (try_to_split_thp_page(p, new_order, false) || new_order) {
> +			/* get folio again in case the original one is split */
> +			folio = page_folio(p);

If the original folio A is split and the after-split folio containing the page is B (A != B),
will the refcount held on folio A above be leaked? I.e. get_hwpoison_page() took the extra
refcount on folio A, but we put the refcount of folio B below. Is this a problem, or am I
missing something?

Thanks.
.
Re: [PATCH 2/2] mm/memory-failure: improve large block size folio handling.
Posted by Matthew Wilcox 2 months, 1 week ago
On Sat, Oct 11, 2025 at 12:12:12PM +0800, Miaohe Lin wrote:
> >  		folio_set_has_hwpoisoned(folio);
> > -		if (try_to_split_thp_page(p, false) < 0) {
> > +		/*
> > +		 * If the folio cannot be split to order-0, kill the process,
> > +		 * but split the folio anyway to minimize the amount of unusable
> > +		 * pages.
> > +		 */
> > +		if (try_to_split_thp_page(p, new_order, false) || new_order) {
> > +			/* get folio again in case the original one is split */
> > +			folio = page_folio(p);
> 
> If original folio A is split and the after-split new folio is B (A != B), will the
> refcnt of folio A held above be missing? I.e. get_hwpoison_page() held the extra refcnt
> of folio A, but we put the refcnt of folio B below. Is this a problem or am I miss
> something?

That's how split works.
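
For reference, a condensed sketch of the refcount flow the hunk above relies on
(illustrative only; it assumes split_huge_page_to_list_to_order() leaves the page
lock and the caller's extra reference with the folio that contains the passed
page, which is why put_page(p) stays balanced):

	res = get_hwpoison_page(p, flags);	/* ref taken on the original large folio */
	lock_page(p);
	ret = split_huge_page_to_list_to_order(p, NULL, new_order);
	unlock_page(p);				/* lock and ref now belong to page_folio(p) */

	folio = page_folio(p);			/* may be a smaller, after-split folio */
	kill_procs_now(p, pfn, flags, folio);
	put_page(p);				/* drops the reference now held on page_folio(p) */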

Zi Yan, the kernel-doc for folio_split() could use some attention.
First, it's not kernel-doc; the comment opens with /* instead of /**.
Second, it says:

 * After split, folio is left locked for caller.

which isn't actually true, right?  The folio which contains
@split_at will be locked.  Also, it will contain the additional
reference which was taken on @folio by the caller.
Re: [PATCH 2/2] mm/memory-failure: improve large block size folio handling.
Posted by Zi Yan 2 months ago
On 11 Oct 2025, at 1:00, Matthew Wilcox wrote:

> On Sat, Oct 11, 2025 at 12:12:12PM +0800, Miaohe Lin wrote:
>>>  		folio_set_has_hwpoisoned(folio);
>>> -		if (try_to_split_thp_page(p, false) < 0) {
>>> +		/*
>>> +		 * If the folio cannot be split to order-0, kill the process,
>>> +		 * but split the folio anyway to minimize the amount of unusable
>>> +		 * pages.
>>> +		 */
>>> +		if (try_to_split_thp_page(p, new_order, false) || new_order) {
>>> +			/* get folio again in case the original one is split */
>>> +			folio = page_folio(p);
>>
>> If original folio A is split and the after-split new folio is B (A != B), will the
>> refcnt of folio A held above be missing? I.e. get_hwpoison_page() held the extra refcnt
>> of folio A, but we put the refcnt of folio B below. Is this a problem or am I miss
>> something?
>
> That's how split works.
>
> Zi Yan, the kernel-doc for folio_split() could use some attention.
> First, it's not kernel-doc; the comment opens with /* instead of /**.

Got it.

> Second, it says:
>
>  * After split, folio is left locked for caller.
>
> which isn't actually true, right?  The folio which contains

No, folio is indeed left locked. Currently folio_split() is
used by truncate_inode_partial_folio() via try_folio_split()
and the folio passed into truncate_inode_partial_folio() is
already locked by the caller and is unlocked by the caller as well.
The caller does not know anything about @split_at, thus
cannot unlock the folio containing @split_at.


> @split_at will be locked.  Also, it will contain the additional
> reference which was taken on @folio by the caller.

The same for the folio reference.

That is the reason we have @split_at and @lock_at for __folio_split().

I can see it is counter-intuitive. To change it, I might need
your help on how to change the truncate_inode_partial_folio() callers,
since all of them use @folio afterwards without taking a reference, and
I am not sure their uses would still be safe.
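
To make that concrete, an illustrative caller-side sketch (hypothetical helper;
the exact folio_split() signature is sketched from memory and may differ):

	/*
	 * Illustrative only: how a truncate_inode_partial_folio()-style caller
	 * relies on @folio, not the folio containing @split_at, staying locked
	 * and referenced after the split.
	 */
	static void example_partial_truncate(struct folio *folio, struct page *split_at,
					     unsigned int new_order)
	{
		/* The caller arrives here with @folio locked and holding a reference. */
		if (folio_split(folio, new_order, split_at, NULL))
			pr_debug("split failed, @folio is left intact\n");
		/*
		 * Either way, @folio is still the folio this caller locked and
		 * referenced, so the cleanup below stays valid; the after-split
		 * folio containing @split_at is neither locked nor referenced
		 * here, so it is left alone.
		 */
		folio_unlock(folio);
		folio_put(folio);
	}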

--
Best Regards,
Yan, Zi
Re: [PATCH 2/2] mm/memory-failure: improve large block size folio handling.
Posted by Miaohe Lin 2 months, 1 week ago
On 2025/10/11 13:00, Matthew Wilcox wrote:
> On Sat, Oct 11, 2025 at 12:12:12PM +0800, Miaohe Lin wrote:
>>>  		folio_set_has_hwpoisoned(folio);
>>> -		if (try_to_split_thp_page(p, false) < 0) {
>>> +		/*
>>> +		 * If the folio cannot be split to order-0, kill the process,
>>> +		 * but split the folio anyway to minimize the amount of unusable
>>> +		 * pages.
>>> +		 */
>>> +		if (try_to_split_thp_page(p, new_order, false) || new_order) {
>>> +			/* get folio again in case the original one is split */
>>> +			folio = page_folio(p);
>>
>> If original folio A is split and the after-split new folio is B (A != B), will the
>> refcnt of folio A held above be missing? I.e. get_hwpoison_page() held the extra refcnt
>> of folio A, but we put the refcnt of folio B below. Is this a problem or am I miss
>> something?
> 
> That's how split works.

I read the code and now see how split works. Thanks for pointing this out.

> 
> Zi Yan, the kernel-doc for folio_split() could use some attention.

That would be really helpful.

Thanks.
.

> First, it's not kernel-doc; the comment opens with /* instead of /**.
> Second, it says:
> 
>  * After split, folio is left locked for caller.
> 
> which isn't actually true, right?  The folio which contains
> @split_at will be locked.  Also, it will contain the additional
> reference which was taken on @folio by the caller.
> 
> .
>
Re: [PATCH 2/2] mm/memory-failure: improve large block size folio handling.
Posted by Luis Chamberlain 2 months, 1 week ago
On Fri, Oct 10, 2025 at 01:39:06PM -0400, Zi Yan wrote:
> Large block size (LBS) folios cannot be split to order-0 folios but
> min_order_for_folio(). Current split fails directly, but that is not
> optimal. Split the folio to min_order_for_folio(), so that, after split,
> only the folio containing the poisoned page becomes unusable instead.
> 
> For soft offline, do not split the large folio if it cannot be split to
> order-0. Since the folio is still accessible from userspace and premature
> split might lead to potential performance loss.
> 
> Suggested-by: Jane Chu <jane.chu@oracle.com>
> Signed-off-by: Zi Yan <ziy@nvidia.com>

Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>

  Luis