[PATCH 1/2] xen/mm: add a NUMA node parameter to scrub_free_pages()

Roger Pau Monne posted 2 patches 1 month ago
There is a newer version of this series
[PATCH 1/2] xen/mm: add a NUMA node parameter to scrub_free_pages()
Posted by Roger Pau Monne 1 month ago
Such parameter allow requesting to scrub memory only from the specified
node.  If there's no memory to scrub from the requested node the function
returns false.  If the node is already being scrubbed from a different CPU
the function returns true so the caller can differentiate whether there's
still pending work to do.

No functional change intended.  Existing callers are switched to use the
new interface, albeit they all pass NUMA_NO_NODE to keep the current
behavior.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/arm/domain.c   |  2 +-
 xen/arch/x86/domain.c   |  2 +-
 xen/common/page_alloc.c | 17 ++++++++++++++---
 xen/include/xen/mm.h    |  3 ++-
 4 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 47973f99d935..dff7554417ea 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -75,7 +75,7 @@ static void noreturn idle_loop(void)
          * and then, after it is done, whether softirqs became pending
          * while we were scrubbing.
          */
-        else if ( !softirq_pending(cpu) && !scrub_free_pages() &&
+        else if ( !softirq_pending(cpu) && !scrub_free_pages(NUMA_NO_NODE) &&
                   !softirq_pending(cpu) )
             do_idle();
 
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index 7632d5e2d62d..276c485a204f 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -166,7 +166,7 @@ static void noreturn cf_check idle_loop(void)
          * and then, after it is done, whether softirqs became pending
          * while we were scrubbing.
          */
-        else if ( !softirq_pending(cpu) && !scrub_free_pages() &&
+        else if ( !softirq_pending(cpu) && !scrub_free_pages(NUMA_NO_NODE) &&
                   !softirq_pending(cpu) )
         {
             if ( guest )
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 2efc11ce095f..248c44df32b3 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -1339,16 +1339,27 @@ static void cf_check scrub_continue(void *data)
     }
 }
 
-bool scrub_free_pages(void)
+bool scrub_free_pages(nodeid_t node)
 {
     struct page_info *pg;
     unsigned int zone;
     unsigned int cpu = smp_processor_id();
     bool preempt = false;
-    nodeid_t node;
     unsigned int cnt = 0;
 
-    node = node_to_scrub(true);
+    if ( node != NUMA_NO_NODE )
+    {
+        if ( !node_need_scrub[node] )
+            /* Nothing to scrub. */
+            return false;
+
+        if ( node_test_and_set(node, node_scrubbing) )
+            /* Another CPU is scrubbing it. */
+            return true;
+    }
+    else
+        node = node_to_scrub(true);
+
     if ( node == NUMA_NO_NODE )
         return false;
 
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index 426362adb2f4..7067c9ec0405 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -65,6 +65,7 @@
 #include <xen/compiler.h>
 #include <xen/mm-frame.h>
 #include <xen/mm-types.h>
+#include <xen/numa.h>
 #include <xen/types.h>
 #include <xen/list.h>
 #include <xen/spinlock.h>
@@ -90,7 +91,7 @@ void init_xenheap_pages(paddr_t ps, paddr_t pe);
 void xenheap_max_mfn(unsigned long mfn);
 void *alloc_xenheap_pages(unsigned int order, unsigned int memflags);
 void free_xenheap_pages(void *v, unsigned int order);
-bool scrub_free_pages(void);
+bool scrub_free_pages(nodeid_t node);
 #define alloc_xenheap_page() (alloc_xenheap_pages(0,0))
 #define free_xenheap_page(v) (free_xenheap_pages(v,0))
 
-- 
2.51.0


Re: [PATCH 1/2] xen/mm: add a NUMA node parameter to scrub_free_pages()
Posted by Jan Beulich 1 month ago
On 08.01.2026 18:55, Roger Pau Monne wrote:
> Such parameter allow requesting to scrub memory only from the specified
> node.  If there's no memory to scrub from the requested node the function
> returns false.  If the node is already being scrubbed from a different CPU
> the function returns true so the caller can differentiate whether there's
> still pending work to do.

I'm really trying to understand both patches together, and peeking ahead I
don't understand the above, which looks to describe ...

> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -1339,16 +1339,27 @@ static void cf_check scrub_continue(void *data)
>      }
>  }
>  
> -bool scrub_free_pages(void)
> +bool scrub_free_pages(nodeid_t node)
>  {
>      struct page_info *pg;
>      unsigned int zone;
>      unsigned int cpu = smp_processor_id();
>      bool preempt = false;
> -    nodeid_t node;
>      unsigned int cnt = 0;
>  
> -    node = node_to_scrub(true);
> +    if ( node != NUMA_NO_NODE )
> +    {
> +        if ( !node_need_scrub[node] )
> +            /* Nothing to scrub. */
> +            return false;
> +
> +        if ( node_test_and_set(node, node_scrubbing) )
> +            /* Another CPU is scrubbing it. */
> +            return true;

... these two return-s. My problem being that patch 2 doesn't use the
return value (while existing callers don't take this path). Is this then
"just in case" for now (and making the meaning of the return values
somewhat inconsistent for the function as a whole)?

Jan
Re: [PATCH 1/2] xen/mm: add a NUMA node parameter to scrub_free_pages()
Posted by Roger Pau Monné 1 month ago
On Fri, Jan 09, 2026 at 11:22:39AM +0100, Jan Beulich wrote:
> On 08.01.2026 18:55, Roger Pau Monne wrote:
> > Such parameter allow requesting to scrub memory only from the specified
> > node.  If there's no memory to scrub from the requested node the function
> > returns false.  If the node is already being scrubbed from a different CPU
> > the function returns true so the caller can differentiate whether there's
> > still pending work to do.
> 
> I'm really trying to understand both patches together, and peeking ahead I
> don't understand the above, which looks to describe ...
> 
> > --- a/xen/common/page_alloc.c
> > +++ b/xen/common/page_alloc.c
> > @@ -1339,16 +1339,27 @@ static void cf_check scrub_continue(void *data)
> >      }
> >  }
> >  
> > -bool scrub_free_pages(void)
> > +bool scrub_free_pages(nodeid_t node)
> >  {
> >      struct page_info *pg;
> >      unsigned int zone;
> >      unsigned int cpu = smp_processor_id();
> >      bool preempt = false;
> > -    nodeid_t node;
> >      unsigned int cnt = 0;
> >  
> > -    node = node_to_scrub(true);
> > +    if ( node != NUMA_NO_NODE )
> > +    {
> > +        if ( !node_need_scrub[node] )
> > +            /* Nothing to scrub. */
> > +            return false;
> > +
> > +        if ( node_test_and_set(node, node_scrubbing) )
> > +            /* Another CPU is scrubbing it. */
> > +            return true;
> 
> ... these two return-s. My problem being that patch 2 doesn't use the
> return value (while existing callers don't take this path). Is this then
> "just in case" for now (and making the meaning of the return values
> somewhat inconsistent for the function as a whole)?

I've added those so that the function return values are consistent,
even if not consumed right now, it would make no sense for the return
values to have different meaning when the node parameter is !=
NUMA_NO_NODE.  Or at least that was my impression.

In fact an earlier version of patch 2 did consume those values.  I've
moved to a different approach, but I think it's good to keep the
return values consistent regardless of the input parameters.

Thanks, Roger.
Re: [PATCH 1/2] xen/mm: add a NUMA node parameter to scrub_free_pages()
Posted by Jan Beulich 1 month ago
On 09.01.2026 15:46, Roger Pau Monné wrote:
> On Fri, Jan 09, 2026 at 11:22:39AM +0100, Jan Beulich wrote:
>> On 08.01.2026 18:55, Roger Pau Monne wrote:
>>> Such parameter allow requesting to scrub memory only from the specified
>>> node.  If there's no memory to scrub from the requested node the function
>>> returns false.  If the node is already being scrubbed from a different CPU
>>> the function returns true so the caller can differentiate whether there's
>>> still pending work to do.
>>
>> I'm really trying to understand both patches together, and peeking ahead I
>> don't understand the above, which looks to describe ...
>>
>>> --- a/xen/common/page_alloc.c
>>> +++ b/xen/common/page_alloc.c
>>> @@ -1339,16 +1339,27 @@ static void cf_check scrub_continue(void *data)
>>>      }
>>>  }
>>>  
>>> -bool scrub_free_pages(void)
>>> +bool scrub_free_pages(nodeid_t node)
>>>  {
>>>      struct page_info *pg;
>>>      unsigned int zone;
>>>      unsigned int cpu = smp_processor_id();
>>>      bool preempt = false;
>>> -    nodeid_t node;
>>>      unsigned int cnt = 0;
>>>  
>>> -    node = node_to_scrub(true);
>>> +    if ( node != NUMA_NO_NODE )
>>> +    {
>>> +        if ( !node_need_scrub[node] )
>>> +            /* Nothing to scrub. */
>>> +            return false;
>>> +
>>> +        if ( node_test_and_set(node, node_scrubbing) )
>>> +            /* Another CPU is scrubbing it. */
>>> +            return true;
>>
>> ... these two return-s. My problem being that patch 2 doesn't use the
>> return value (while existing callers don't take this path). Is this then
>> "just in case" for now (and making the meaning of the return values
>> somewhat inconsistent for the function as a whole)?
> 
> I've added those so that the function return values are consistent,
> even if not consumed right now, it would make no sense for the return
> values to have different meaning when the node parameter is !=
> NUMA_NO_NODE.  Or at least that was my impression.
> 
> In fact an earlier version of patch 2 did consume those values.  I've
> moved to a different approach, but I think it's good to keep the
> return values consistent regardless of the input parameters.

My point was though: The present "true" return doesn't mean "Another CPU
is scrubbing it." Instead it means "More work to do" aiui. That's similar
in a way, but not identical.

Jan