[PATCH 1/2] mm/page_alloc: Ensure try_alloc_pages() plays well with unaccepted memory

Kirill A. Shutemov posted 2 patches 7 months, 2 weeks ago
[PATCH 1/2] mm/page_alloc: Ensure try_alloc_pages() plays well with unaccepted memory
Posted by Kirill A. Shutemov 7 months, 2 weeks ago
try_alloc_pages() will not attempt to allocate memory if the system has
*any* unaccepted memory. Memory is accepted as needed and can remain in
the system indefinitely, causing the interface to always fail.

Rather than immediately giving up, attempt to use already accepted
memory on free lists.

Pass 'alloc_flags' to cond_accept_memory() and do not accept new memory
for ALLOC_TRYLOCK requests.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Fixes: 97769a53f117 ("mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation")
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/page_alloc.c | 28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 5669baf2a6fe..5fccf5fce084 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -290,7 +290,8 @@ EXPORT_SYMBOL(nr_online_nodes);
 #endif
 
 static bool page_contains_unaccepted(struct page *page, unsigned int order);
-static bool cond_accept_memory(struct zone *zone, unsigned int order);
+static bool cond_accept_memory(struct zone *zone, unsigned int order,
+			       int alloc_flags);
 static bool __free_unaccepted(struct page *page);
 
 int page_group_by_mobility_disabled __read_mostly;
@@ -3616,7 +3617,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
 			}
 		}
 
-		cond_accept_memory(zone, order);
+		cond_accept_memory(zone, order, alloc_flags);
 
 		/*
 		 * Detect whether the number of free pages is below high
@@ -3643,7 +3644,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
 				       gfp_mask)) {
 			int ret;
 
-			if (cond_accept_memory(zone, order))
+			if (cond_accept_memory(zone, order, alloc_flags))
 				goto try_this_zone;
 
 			/*
@@ -3696,7 +3697,7 @@ get_page_from_freelist(gfp_t gfp_mask, unsigned int order, int alloc_flags,
 
 			return page;
 		} else {
-			if (cond_accept_memory(zone, order))
+			if (cond_accept_memory(zone, order, alloc_flags))
 				goto try_this_zone;
 
 			/* Try again if zone has deferred pages */
@@ -4849,7 +4850,7 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid,
 			goto failed;
 		}
 
-		cond_accept_memory(zone, 0);
+		cond_accept_memory(zone, 0, alloc_flags);
 retry_this_zone:
 		mark = wmark_pages(zone, alloc_flags & ALLOC_WMARK_MASK) + nr_pages;
 		if (zone_watermark_fast(zone, 0,  mark,
@@ -4858,7 +4859,7 @@ unsigned long alloc_pages_bulk_noprof(gfp_t gfp, int preferred_nid,
 			break;
 		}
 
-		if (cond_accept_memory(zone, 0))
+		if (cond_accept_memory(zone, 0, alloc_flags))
 			goto retry_this_zone;
 
 		/* Try again if zone has deferred pages */
@@ -7284,7 +7285,8 @@ static inline bool has_unaccepted_memory(void)
 	return static_branch_unlikely(&zones_with_unaccepted_pages);
 }
 
-static bool cond_accept_memory(struct zone *zone, unsigned int order)
+static bool cond_accept_memory(struct zone *zone, unsigned int order,
+			       int alloc_flags)
 {
 	long to_accept, wmark;
 	bool ret = false;
@@ -7295,6 +7297,10 @@ static bool cond_accept_memory(struct zone *zone, unsigned int order)
 	if (list_empty(&zone->unaccepted_pages))
 		return false;
 
+	/* Bailout, since try_to_accept_memory_one() needs to take a lock */
+	if (alloc_flags & ALLOC_TRYLOCK)
+		return false;
+
 	wmark = promo_wmark_pages(zone);
 
 	/*
@@ -7351,7 +7357,8 @@ static bool page_contains_unaccepted(struct page *page, unsigned int order)
 	return false;
 }
 
-static bool cond_accept_memory(struct zone *zone, unsigned int order)
+static bool cond_accept_memory(struct zone *zone, unsigned int order,
+			       int alloc_flags)
 {
 	return false;
 }
@@ -7422,11 +7429,6 @@ struct page *try_alloc_pages_noprof(int nid, unsigned int order)
 	if (!pcp_allowed_order(order))
 		return NULL;
 
-#ifdef CONFIG_UNACCEPTED_MEMORY
-	/* Bailout, since try_to_accept_memory_one() needs to take a lock */
-	if (has_unaccepted_memory())
-		return NULL;
-#endif
 	/* Bailout, since _deferred_grow_zone() needs to take a lock */
 	if (deferred_pages_enabled())
 		return NULL;
-- 
2.47.2
Re: [PATCH 1/2] mm/page_alloc: Ensure try_alloc_pages() plays well with unaccepted memory
Posted by Andrew Morton 7 months, 1 week ago
On Tue,  6 May 2025 14:25:08 +0300 "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> wrote:

> try_alloc_pages() will not attempt to allocate memory if the system has
> *any* unaccepted memory. Memory is accepted as needed and can remain in
> the system indefinitely, causing the interface to always fail.
> 
> Rather than immediately giving up, attempt to use already accepted
> memory on free lists.
> 
> Pass 'alloc_flags' to cond_accept_memory() and do not accept new memory
> for ALLOC_TRYLOCK requests.

What are the userspace-visible effects, please?

Was the omission of cc:stable intentional?  I cannot locally determine
this without the above info.

If the cc:stable omission was indeed intentional then it would be better
if this series was presented as two standalone patches.
Re: [PATCH 1/2] mm/page_alloc: Ensure try_alloc_pages() plays well with unaccepted memory
Posted by Kirill A. Shutemov 7 months, 1 week ago
On Tue, May 06, 2025 at 05:00:34PM -0700, Andrew Morton wrote:
> On Tue,  6 May 2025 14:25:08 +0300 "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> wrote:
> 
> > try_alloc_pages() will not attempt to allocate memory if the system has
> > *any* unaccepted memory. Memory is accepted as needed and can remain in
> > the system indefinitely, causing the interface to always fail.
> > 
> > Rather than immediately giving up, attempt to use already accepted
> > memory on free lists.
> > 
> > Pass 'alloc_flags' to cond_accept_memory() and do not accept new memory
> > for ALLOC_TRYLOCK requests.
> 
> What are the userspace-visible effects, please?

I cannot say I fully understand the implications.

The interface obviously allows failure, but the caller might expect
eventual success on retry.

So far, only BPF uses the interface. Maybe Alexei can comment on what will
happen if the function always fails.

I noticed the issue by code analysis because the second patch removes
has_unaccepted_memory().

> Was the omission of cc:stable intentional?  I cannot locally determine
> this without the above info.
> 
> If the cc:stable omission was indeed intentional then it would be better
> if this series was presented as two standalone patches.

Given that the second patch cannot be applied to current Linus' tree
without this one, it is better to add stable@.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
Re: [PATCH 1/2] mm/page_alloc: Ensure try_alloc_pages() plays well with unaccepted memory
Posted by Alexei Starovoitov 7 months, 1 week ago
On Tue, May 6, 2025 at 4:25 AM Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
>
> try_alloc_pages() will not attempt to allocate memory if the system has
> *any* unaccepted memory. Memory is accepted as needed and can remain in
> the system indefinitely, causing the interface to always fail.
>
> Rather than immediately giving up, attempt to use already accepted
> memory on free lists.
>
> Pass 'alloc_flags' to cond_accept_memory() and do not accept new memory
> for ALLOC_TRYLOCK requests.
>
> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Fixes: 97769a53f117 ("mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation")

Thanks for working on this, but the fixes tag is overkill.
This limitation is not causing any issues in our setups.
Improving it is certainly better, of course.
Acked-by: Alexei Starovoitov <ast@kernel.org>
Re: [PATCH 1/2] mm/page_alloc: Ensure try_alloc_pages() plays well with unaccepted memory
Posted by Kirill A. Shutemov 7 months, 1 week ago
On Tue, May 06, 2025 at 10:18:21AM -0700, Alexei Starovoitov wrote:
> On Tue, May 6, 2025 at 4:25 AM Kirill A. Shutemov
> <kirill.shutemov@linux.intel.com> wrote:
> >
> > try_alloc_pages() will not attempt to allocate memory if the system has
> > *any* unaccepted memory. Memory is accepted as needed and can remain in
> > the system indefinitely, causing the interface to always fail.
> >
> > Rather than immediately giving up, attempt to use already accepted
> > memory on free lists.
> >
> > Pass 'alloc_flags' to cond_accept_memory() and do not accept new memory
> > for ALLOC_TRYLOCK requests.
> >
> > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > Fixes: 97769a53f117 ("mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation")
> 
> Thanks for working on this, but the fixes tag is overkill.
> This limitation is not causing any issues in our setups.

Have you had chance to test it on any platform with unaccepted memory?
So far it is only Intel TDX and AMD SEV guests.

> Improving it is certainly better, of course.
> Acked-by: Alexei Starovoitov <ast@kernel.org>

Thanks!

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
Re: [PATCH 1/2] mm/page_alloc: Ensure try_alloc_pages() plays well with unaccepted memory
Posted by Alexei Starovoitov 7 months, 1 week ago
On Tue, May 6, 2025 at 12:00 PM Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
>
> On Tue, May 06, 2025 at 10:18:21AM -0700, Alexei Starovoitov wrote:
> > On Tue, May 6, 2025 at 4:25 AM Kirill A. Shutemov
> > <kirill.shutemov@linux.intel.com> wrote:
> > >
> > > try_alloc_pages() will not attempt to allocate memory if the system has
> > > *any* unaccepted memory. Memory is accepted as needed and can remain in
> > > the system indefinitely, causing the interface to always fail.
> > >
> > > Rather than immediately giving up, attempt to use already accepted
> > > memory on free lists.
> > >
> > > Pass 'alloc_flags' to cond_accept_memory() and do not accept new memory
> > > for ALLOC_TRYLOCK requests.
> > >
> > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > Fixes: 97769a53f117 ("mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation")
> >
> > Thanks for working on this, but the fixes tag is overkill.
> > This limitation is not causing any issues in our setups.
>
> Have you had chance to test it on any platform with unaccepted memory?
> So far it is only Intel TDX and AMD SEV guests.

We don't use them, and my understanding is that such
unaccepted memory will be there only during boot time.
Re: [PATCH 1/2] mm/page_alloc: Ensure try_alloc_pages() plays well with unaccepted memory
Posted by Kirill A. Shutemov 7 months, 1 week ago
On Tue, May 06, 2025 at 03:05:31PM -0700, Alexei Starovoitov wrote:
> On Tue, May 6, 2025 at 12:00 PM Kirill A. Shutemov
> <kirill.shutemov@linux.intel.com> wrote:
> >
> > On Tue, May 06, 2025 at 10:18:21AM -0700, Alexei Starovoitov wrote:
> > > On Tue, May 6, 2025 at 4:25 AM Kirill A. Shutemov
> > > <kirill.shutemov@linux.intel.com> wrote:
> > > >
> > > > try_alloc_pages() will not attempt to allocate memory if the system has
> > > > *any* unaccepted memory. Memory is accepted as needed and can remain in
> > > > the system indefinitely, causing the interface to always fail.
> > > >
> > > > Rather than immediately giving up, attempt to use already accepted
> > > > memory on free lists.
> > > >
> > > > Pass 'alloc_flags' to cond_accept_memory() and do not accept new memory
> > > > for ALLOC_TRYLOCK requests.
> > > >
> > > > Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > Fixes: 97769a53f117 ("mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation")
> > >
> > > Thanks for working on this, but the fixes tag is overkill.
> > > This limitation is not causing any issues in our setups.
> >
> > Have you had chance to test it on any platform with unaccepted memory?
> > So far it is only Intel TDX and AMD SEV guests.
> 
> We don't use them, and my understanding is that such
> unaccepted memory will be there only during boot time.

That's false. Unaccepted memory can be there indefinitely after boot. It
only gets accepted on demand.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
Re: [PATCH 1/2] mm/page_alloc: Ensure try_alloc_pages() plays well with unaccepted memory
Posted by Brendan Jackman 7 months, 2 weeks ago
On Tue May 6, 2025 at 11:25 AM UTC, Kirill A. Shutemov wrote:
> +	/* Bailout, since try_to_accept_memory_one() needs to take a lock */
> +	if (alloc_flags & ALLOC_TRYLOCK)
> +		return false;
> +

Quick lazy question: why don't we just trylock it like we do for the zone
lock?
Re: [PATCH 1/2] mm/page_alloc: Ensure try_alloc_pages() plays well with unaccepted memory
Posted by Kirill A. Shutemov 7 months, 2 weeks ago
On Tue, May 06, 2025 at 01:20:25PM +0000, Brendan Jackman wrote:
> On Tue May 6, 2025 at 11:25 AM UTC, Kirill A. Shutemov wrote:
> > +	/* Bailout, since try_to_accept_memory_one() needs to take a lock */
> > +	if (alloc_flags & ALLOC_TRYLOCK)
> > +		return false;
> > +
> 
> Quick lazy question: why don't we just trylock it like we do for the zone
> lock?

It is not only zone lock. There's also unaccepted_memory_lock inside
accept_memory().

-- 
  Kiryl Shutsemau / Kirill A. Shutemov
Re: [PATCH 1/2] mm/page_alloc: Ensure try_alloc_pages() plays well with unaccepted memory
Posted by Brendan Jackman 7 months, 1 week ago
On Tue May 6, 2025 at 1:34 PM UTC, Kirill A. Shutemov wrote:
> On Tue, May 06, 2025 at 01:20:25PM +0000, Brendan Jackman wrote:
>> On Tue May 6, 2025 at 11:25 AM UTC, Kirill A. Shutemov wrote:
>> > +	/* Bailout, since try_to_accept_memory_one() needs to take a lock */
>> > +	if (alloc_flags & ALLOC_TRYLOCK)
>> > +		return false;
>> > +
>> 
>> Quick lazy question: why don't we just trylock it like we do for the zone
>> lock?
>
> It is not only zone lock. There's also unaccepted_memory_lock inside
> accept_memory().

Right, but my lazy question was why can't we "just" trylock that too?

But anyway, that's no use because if we win the trylock we'd still have
to do __free_pages_ok().