[PATCH v2] mempolicy: Clarify what zone reclaim means

Joshua Hahn posted 1 patch 2 months ago
There is a newer version of this series
[PATCH v2] mempolicy: Clarify what zone reclaim means
Posted by Joshua Hahn 2 months ago
The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
memory. Contrary to its user-facing name, it is internally referred to as
"node_reclaim_mode".

This can be confusing. But because we cannot change the name of the API since
it has been in place since at least 2.6, let's try to be more explicit about
what the behavior of this API is. 

Change the description to clarify what zone reclaim entails, and be explicit
about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
past already [1] [2].

[1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
[2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/

Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
---
 include/uapi/linux/mempolicy.h | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
index 1f9bb10d1a47..6c9c9385ff89 100644
--- a/include/uapi/linux/mempolicy.h
+++ b/include/uapi/linux/mempolicy.h
@@ -66,10 +66,16 @@ enum {
 #define MPOL_F_MORON	(1 << 4) /* Migrate On protnone Reference On Node */
 
 /*
+ * Enabling zone reclaim means the page allocator will attempt to fulfill
+ * the allocation request on the current node by triggering reclaim and
+ * trying to shrink the current node.
+ * Fallback allocations on the next candidates in the zonelist are considered
+ * zone when reclaim fails to free up enough memory in the current node/zone.
+ *
  * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
  * ABI.  New bits are OK, but existing bits can never change.
  */
-#define RECLAIM_ZONE	(1<<0)	/* Run shrink_inactive_list on the zone */
+#define RECLAIM_ZONE	(1<<0)	/* Enable zone reclaim */
 #define RECLAIM_WRITE	(1<<1)	/* Writeout pages during reclaim */
 #define RECLAIM_UNMAP	(1<<2)	/* Unmap pages during reclaim */
 

base-commit: 260f6f4fda93c8485c8037865c941b42b9cba5d2
-- 
2.47.3
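
To make the three bits concrete, here is a minimal userspace sketch in C that decodes a zone_reclaim_mode value. The bit values are copied from the patched header; the helper name describe_reclaim_bits is hypothetical and is not a kernel or libc function. It only reports which bits are set and makes no claim about how the kernel acts on a given combination.

```c
#include <stdio.h>
#include <stddef.h>

/* Bit values copied from the patched include/uapi/linux/mempolicy.h. */
#define RECLAIM_ZONE	(1<<0)	/* Enable zone reclaim */
#define RECLAIM_WRITE	(1<<1)	/* Writeout pages during reclaim */
#define RECLAIM_UNMAP	(1<<2)	/* Unmap pages during reclaim */

/*
 * Hypothetical helper (illustration only): write a summary of which bits
 * are set in @mode into @buf.  Returns the snprintf() result.
 */
static int describe_reclaim_bits(unsigned int mode, char *buf, size_t len)
{
	return snprintf(buf, len,
			"RECLAIM_ZONE=%d RECLAIM_WRITE=%d RECLAIM_UNMAP=%d",
			!!(mode & RECLAIM_ZONE),
			!!(mode & RECLAIM_WRITE),
			!!(mode & RECLAIM_UNMAP));
}
```

Calling describe_reclaim_bits(5, buf, sizeof(buf)) would report RECLAIM_ZONE and RECLAIM_UNMAP as set and RECLAIM_WRITE as clear.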
Re: [PATCH v2] mempolicy: Clarify what zone reclaim means
Posted by Huang, Ying 2 months ago
Joshua Hahn <joshua.hahnjy@gmail.com> writes:

> The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
> memory. Contrary to its user-facing name, it is internally referred to as
> "node_reclaim_mode".
>
> This can be confusing. But because we cannot change the name of the API since
> it has been in place since at least 2.6, let's try to be more explicit about
> what the behavior of this API is. 
>
> Change the description to clarify what zone reclaim entails, and be explicit
> about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
> past already [1] [2].
>
> [1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
> [2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/
>
> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> ---
>  include/uapi/linux/mempolicy.h | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
> index 1f9bb10d1a47..6c9c9385ff89 100644
> --- a/include/uapi/linux/mempolicy.h
> +++ b/include/uapi/linux/mempolicy.h
> @@ -66,10 +66,16 @@ enum {
>  #define MPOL_F_MORON	(1 << 4) /* Migrate On protnone Reference On Node */
>  
>  /*
> + * Enabling zone reclaim means the page allocator will attempt to fulfill
> + * the allocation request on the current node by triggering reclaim and
> + * trying to shrink the current node.
> + * Fallback allocations on the next candidates in the zonelist are considered
> + * zone when reclaim fails to free up enough memory in the current node/zone.
> + *
>   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
>   * ABI.  New bits are OK, but existing bits can never change.

As far as I know, sysctl isn't considered kernel ABI now.  So, change
this line too?

>   */
> -#define RECLAIM_ZONE	(1<<0)	/* Run shrink_inactive_list on the zone */
> +#define RECLAIM_ZONE	(1<<0)	/* Enable zone reclaim */
>  #define RECLAIM_WRITE	(1<<1)	/* Writeout pages during reclaim */
>  #define RECLAIM_UNMAP	(1<<2)	/* Unmap pages during reclaim */
>  
>
> base-commit: 260f6f4fda93c8485c8037865c941b42b9cba5d2

---
Best Regards,
Huang, Ying
Re: [PATCH v2] mempolicy: Clarify what zone reclaim means
Posted by Joshua Hahn 2 months ago
On Fri, 01 Aug 2025 08:59:20 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:

> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
> 
> > The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
> > memory. Contrary to its user-facing name, it is internally referred to as
> > "node_reclaim_mode".
> >
> > This can be confusing. But because we cannot change the name of the API since
> > it has been in place since at least 2.6, let's try to be more explicit about
> > what the behavior of this API is. 
> >
> > Change the description to clarify what zone reclaim entails, and be explicit
> > about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
> > past already [1] [2].
> >
> > [1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
> > [2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/
> >
> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> > ---
> >  include/uapi/linux/mempolicy.h | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
> > index 1f9bb10d1a47..6c9c9385ff89 100644
> > --- a/include/uapi/linux/mempolicy.h
> > +++ b/include/uapi/linux/mempolicy.h
> > @@ -66,10 +66,16 @@ enum {
> >  #define MPOL_F_MORON	(1 << 4) /* Migrate On protnone Reference On Node */
> >  
> >  /*
> > + * Enabling zone reclaim means the page allocator will attempt to fulfill
> > + * the allocation request on the current node by triggering reclaim and
> > + * trying to shrink the current node.
> > + * Fallback allocations on the next candidates in the zonelist are considered
> > + * zone when reclaim fails to free up enough memory in the current node/zone.
> > + *
> >   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
> >   * ABI.  New bits are OK, but existing bits can never change.
> 
> As far as I know, sysctl isn't considered kernel ABI now.  So, change
> this line too?

Hi Ying, 

Thank you for reviewing this patch!

I didn't know that sysctl isn't considered a kernel ABI. If I understand your
suggestion correctly, I can rephrase the comment block above to something like this?

- * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
- * ABI. New bits are OK, but existing bits can never change.
+ * These bit locations are exposed in the vm.zone_reclaim_mode sysctl and
+ * in /proc/sys/vm/zone_reclaim_mode. New bits are OK, but existing bits
+ * can never change.

Thanks again for your review Ying, I hope you have a good day : -)
Joshua

Sent using hkml (https://github.com/sjp38/hackermail)
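
Since the proposed wording names /proc/sys/vm/zone_reclaim_mode, here is a rough sketch of how a userspace tool might read the current value from that path. The function names parse_reclaim_mode and read_reclaim_mode are hypothetical, and the sketch assumes the file holds a single decimal number; the file may not exist on every system, in which case the reader simply returns -1.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a decimal mode value such as "5\n".
 * Returns 0 and stores the value in @mode on success, -1 on bad input.
 */
static int parse_reclaim_mode(const char *text, unsigned int *mode)
{
	char *end;
	unsigned long val;

	/* Reject signs, leading whitespace and empty input up front. */
	if (!isdigit((unsigned char)text[0]))
		return -1;

	errno = 0;
	val = strtoul(text, &end, 10);
	if (errno == ERANGE || val > UINT_MAX)
		return -1;

	/* Allow only trailing whitespace (e.g. the newline) after the number. */
	while (isspace((unsigned char)*end))
		end++;
	if (*end != '\0')
		return -1;

	*mode = (unsigned int)val;
	return 0;
}

/* Hypothetical helper: read the sysctl through its /proc path. */
static int read_reclaim_mode(unsigned int *mode)
{
	char line[32];
	FILE *f = fopen("/proc/sys/vm/zone_reclaim_mode", "r");
	int ret = -1;

	if (!f)
		return -1;	/* file missing or not readable */
	if (fgets(line, sizeof(line), f))
		ret = parse_reclaim_mode(line, mode);
	fclose(f);
	return ret;
}
```

Keeping the parser separate from the file access means the parsing logic can be exercised on plain strings without depending on the running kernel.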
Re: [PATCH v2] mempolicy: Clarify what zone reclaim means
Posted by Huang, Ying 2 months ago
Joshua Hahn <joshua.hahnjy@gmail.com> writes:

> On Fri, 01 Aug 2025 08:59:20 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:
>
>> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
>> 
>> > The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
>> > memory. Contrary to its user-facing name, it is internally referred to as
>> > "node_reclaim_mode".
>> >
>> > This can be confusing. But because we cannot change the name of the API since
>> > it has been in place since at least 2.6, let's try to be more explicit about
>> > what the behavior of this API is. 
>> >
>> > Change the description to clarify what zone reclaim entails, and be explicit
>> > about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
>> > past already [1] [2].
>> >
>> > [1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
>> > [2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/
>> >
>> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
>> > ---
>> >  include/uapi/linux/mempolicy.h | 8 +++++++-
>> >  1 file changed, 7 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
>> > index 1f9bb10d1a47..6c9c9385ff89 100644
>> > --- a/include/uapi/linux/mempolicy.h
>> > +++ b/include/uapi/linux/mempolicy.h
>> > @@ -66,10 +66,16 @@ enum {
>> >  #define MPOL_F_MORON	(1 << 4) /* Migrate On protnone Reference On Node */
>> >  
>> >  /*
>> > + * Enabling zone reclaim means the page allocator will attempt to fulfill
>> > + * the allocation request on the current node by triggering reclaim and
>> > + * trying to shrink the current node.
>> > + * Fallback allocations on the next candidates in the zonelist are considered
>> > + * zone when reclaim fails to free up enough memory in the current node/zone.
>> > + *
>> >   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
>> >   * ABI.  New bits are OK, but existing bits can never change.
>> 
>> As far as I know, sysctl isn't considered kernel ABI now.  So, change
>> this line too?
>
> Hi Ying, 
>
> Thank you for reviewing this patch!
>
> I didn't know that sysctl isn't considered a kernel ABI. If I understand your
> suggestion correctly, I can rephrase the comment block above to something like this?
>
> - * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
> - * ABI. New bits are OK, but existing bits can never change.
> + * These bit locations are exposed in the vm.zone_reclaim_mode sysctl and
> + * in /proc/sys/vm/zone_reclaim_mode. New bits are OK, but existing bits
> + * can never change.

Because it's not an ABI, I think that we could avoid saying "never".

> Thanks again for your review Ying, I hope you have a good day : -)

Welcome!  You too!

With some trivial tweak, please feel free to add my

Reviewed-by: Huang Ying <ying.huang@linux.alibaba.com>

in the future version.

---
Best Regards,
Huang, Ying
Re: [PATCH v2] mempolicy: Clarify what zone reclaim means
Posted by Joshua Hahn 2 months ago
On Mon, 04 Aug 2025 09:24:31 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:

> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
> 
> > On Fri, 01 Aug 2025 08:59:20 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:
> >
> >> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
> >> 
> >> > The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
> >> > memory. Contrary to its user-facing name, it is internally referred to as
> >> > "node_reclaim_mode".
> >> >
> >> > This can be confusing. But because we cannot change the name of the API since
> >> > it has been in place since at least 2.6, let's try to be more explicit about
> >> > what the behavior of this API is. 
> >> >
> >> > Change the description to clarify what zone reclaim entails, and be explicit
> >> > about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
> >> > past already [1] [2].
> >> >
> >> > [1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
> >> > [2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/
> >> >
> >> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> >> > ---
> >> >  include/uapi/linux/mempolicy.h | 8 +++++++-
> >> >  1 file changed, 7 insertions(+), 1 deletion(-)
> >> >
> >> > diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
> >> > index 1f9bb10d1a47..6c9c9385ff89 100644
> >> > --- a/include/uapi/linux/mempolicy.h
> >> > +++ b/include/uapi/linux/mempolicy.h
> >> > @@ -66,10 +66,16 @@ enum {
> >> >  #define MPOL_F_MORON	(1 << 4) /* Migrate On protnone Reference On Node */
> >> >  
> >> >  /*
> >> > + * Enabling zone reclaim means the page allocator will attempt to fulfill
> >> > + * the allocation request on the current node by triggering reclaim and
> >> > + * trying to shrink the current node.
> >> > + * Fallback allocations on the next candidates in the zonelist are considered
> >> > + * zone when reclaim fails to free up enough memory in the current node/zone.
> >> > + *
> >> >   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
> >> >   * ABI.  New bits are OK, but existing bits can never change.
> >> 
> >> As far as I know, sysctl isn't considered kernel ABI now.  So, change
> >> this line too?
> >
> > Hi Ying, 
> >
> > Thank you for reviewing this patch!
> >
> > I didn't know that sysctl isn't considered a kernel ABI. If I understand your
> > suggestion correctly, I can rephrase the comment block above to something like this?
> >
> > - * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
> > - * ABI. New bits are OK, but existing bits can never change.
> > + * These bit locations are exposed in the vm.zone_reclaim_mode sysctl and
> > + * in /proc/sys/vm/zone_reclaim_mode. New bits are OK, but existing bits
> > + * can never change.

Hi Ying,

> Because it's not an ABI, I think that we could avoid saying "never".

My personal opinion is that we should keep this warning, since there has
already been an example before where a developer tried to remove this bit [1],
and this broke some behavior for userspace configurations. However, if I
understand your comment correctly, you are suggesting that we should change
the wording to not include "never", since sysctls are no longer an ABI (and
therefore we should be OK to change what the values mean?)

If that is the case, then I can send in another patch since I think the goals
are a bit different for the two patches. With that said, I think we should
keep the warning just to avoid any breakages in userspace, even if sysctl
might not be considered an ABI anymore (also I must have missed this, I didn't
know this at all!)

> > Thanks again for your review Ying, I hope you have a good day : -)
> 
> Welcome!  You too!
> 
> With some trivial tweak, please feel free to add my
> 
> Reviewed-by: Huang Ying <ying.huang@linux.alibaba.com>
> 
> in the future version.

Thank you for your review Ying! Since there is a question remaining about what
to do with the "never" statement, I will wait to send out a v3 with your
review : -) 

Have a great day!
Joshua

[1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/

Sent using hkml (https://github.com/sjp38/hackermail)
Re: [PATCH v2] mempolicy: Clarify what zone reclaim means
Posted by Huang, Ying 2 months ago
Joshua Hahn <joshua.hahnjy@gmail.com> writes:

> On Mon, 04 Aug 2025 09:24:31 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:
>
>> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
>> 
>> > On Fri, 01 Aug 2025 08:59:20 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:
>> >
>> >> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
>> >> 
>> >> > The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
>> >> > memory. Contrary to its user-facing name, it is internally referred to as
>> >> > "node_reclaim_mode".
>> >> >
>> >> > This can be confusing. But because we cannot change the name of the API since
>> >> > it has been in place since at least 2.6, let's try to be more explicit about
>> >> > what the behavior of this API is. 
>> >> >
>> >> > Change the description to clarify what zone reclaim entails, and be explicit
>> >> > about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
>> >> > past already [1] [2].
>> >> >
>> >> > [1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
>> >> > [2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/
>> >> >
>> >> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
>> >> > ---
>> >> >  include/uapi/linux/mempolicy.h | 8 +++++++-
>> >> >  1 file changed, 7 insertions(+), 1 deletion(-)
>> >> >
>> >> > diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
>> >> > index 1f9bb10d1a47..6c9c9385ff89 100644
>> >> > --- a/include/uapi/linux/mempolicy.h
>> >> > +++ b/include/uapi/linux/mempolicy.h
>> >> > @@ -66,10 +66,16 @@ enum {
>> >> >  #define MPOL_F_MORON	(1 << 4) /* Migrate On protnone Reference On Node */
>> >> >  
>> >> >  /*
>> >> > + * Enabling zone reclaim means the page allocator will attempt to fulfill
>> >> > + * the allocation request on the current node by triggering reclaim and
>> >> > + * trying to shrink the current node.
>> >> > + * Fallback allocations on the next candidates in the zonelist are considered
>> >> > + * zone when reclaim fails to free up enough memory in the current node/zone.
>> >> > + *
>> >> >   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
>> >> >   * ABI.  New bits are OK, but existing bits can never change.
>> >> 
>> >> As far as I know, sysctl isn't considered kernel ABI now.  So, change
>> >> this line too?
>> >
>> > Hi Ying, 
>> >
>> > Thank you for reviewing this patch!
>> >
>> > I didn't know that sysctl isn't considered a kernel ABI. If I understand your
>> > suggestion correctly, I can rephrase the comment block above to something like this?
>> >
>> > - * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
>> > - * ABI. New bits are OK, but existing bits can never change.
>> > + * These bit locations are exposed in the vm.zone_reclaim_mode sysctl and
>> > + * in /proc/sys/vm/zone_reclaim_mode. New bits are OK, but existing bits
>> > + * can never change.
>
> Hi Ying,
>
>> Because it's not an ABI, I think that we could avoid saying "never".
>
> My personal opinion is that we should keep this warning, since there has
> already been an example before where a developer tried to remove this bit [1],
> and this broke some behavior for userspace configurations. However, if I
> understand your comment correctly, you are suggesting that we should change
> the wording to not include "never", since sysctls are no longer an ABI (and
> therefore we should be OK to change what the values mean?)
>
> If that is the case, then I can send in another patch since I think the goals
> are a bit different for the two patches. With that said, I think we should
> keep the warning just to avoid any breakages in userspace, even if sysctl
> might not be considered an ABI anymore (also I must have missed this, I didn't
> know this at all!)

Sorry for the confusion.  I agree that we shouldn't change the sysctl
interface in most cases.  I just thought that we could soften the
wording a little?  For example,

New bits are OK, but existing bits shouldn't be changed.

I think that it's still clear that we don't want to change the existing
bits.

However, my English is poor.  So, my suggestion may not make sense.

>> > Thanks again for your review Ying, I hope you have a good day : -)
>> 
>> Welcome!  You too!
>> 
>> With some trivial tweak, please feel free to add my
>> 
>> Reviewed-by: Huang Ying <ying.huang@linux.alibaba.com>
>> 
>> in the future version.
>
> Thank you for your review Ying! Since there is a question remaining about what
> to do with the "never" statement, I will wait to send out a v3 with your
> review : -) 

---
Best Regards,
Huang, Ying
Re: [PATCH v2] mempolicy: Clarify what zone reclaim means
Posted by Joshua Hahn 2 months ago
On Tue, 05 Aug 2025 09:27:30 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:

> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
> 
> > On Mon, 04 Aug 2025 09:24:31 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:
> >
> >> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
> >> 
> >> > On Fri, 01 Aug 2025 08:59:20 +0800 "Huang, Ying" <ying.huang@linux.alibaba.com> wrote:
> >> >
> >> >> Joshua Hahn <joshua.hahnjy@gmail.com> writes:
> >> >> 
> >> >> > The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
> >> >> > memory. Contrary to its user-facing name, it is internally referred to as
> >> >> > "node_reclaim_mode".
> >> >> >
> >> >> > This can be confusing. But because we cannot change the name of the API since
> >> >> > it has been in place since at least 2.6, let's try to be more explicit about
> >> >> > what the behavior of this API is. 
> >> >> >
> >> >> > Change the description to clarify what zone reclaim entails, and be explicit
> >> >> > about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
> >> >> > past already [1] [2].
> >> >> >
> >> >> > [1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
> >> >> > [2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/
> >> >> >
> >> >> > Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> >> >> > ---
> >> >> >  include/uapi/linux/mempolicy.h | 8 +++++++-
> >> >> >  1 file changed, 7 insertions(+), 1 deletion(-)
> >> >> >
> >> >> > diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
> >> >> > index 1f9bb10d1a47..6c9c9385ff89 100644
> >> >> > --- a/include/uapi/linux/mempolicy.h
> >> >> > +++ b/include/uapi/linux/mempolicy.h
> >> >> > @@ -66,10 +66,16 @@ enum {
> >> >> >  #define MPOL_F_MORON	(1 << 4) /* Migrate On protnone Reference On Node */
> >> >> >  
> >> >> >  /*
> >> >> > + * Enabling zone reclaim means the page allocator will attempt to fulfill
> >> >> > + * the allocation request on the current node by triggering reclaim and
> >> >> > + * trying to shrink the current node.
> >> >> > + * Fallback allocations on the next candidates in the zonelist are considered
> >> >> > + * zone when reclaim fails to free up enough memory in the current node/zone.
> >> >> > + *
> >> >> >   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
> >> >> >   * ABI.  New bits are OK, but existing bits can never change.
> >> >> 
> > >> As far as I know, sysctl isn't considered kernel ABI now.  So, change
> >> >> this line too?
> >> >
> >> > Hi Ying, 
> >> >
> >> > Thank you for reviewing this patch!
> >> >
> >> > I didn't know that sysctl isn't considered a kernel ABI. If I understand your
> >> > suggestion correctly, I can rephrase the comment block above to something like this?
> >> >
> >> > - * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
> >> > - * ABI. New bits are OK, but existing bits can never change.
> >> > + * These bit locations are exposed in the vm.zone_reclaim_mode sysctl and
> >> > + * in /proc/sys/vm/zone_reclaim_mode. New bits are OK, but existing bits
> >> > + * can never change.
> >
> > Hi Ying,
> >
> >> Because it's not an ABI, I think that we could avoid saying "never".
> >
> > My personal opinion is that we should keep this warning, since there has
> > already been an example before where a developer tried to remove this bit [1],
> > and this broke some behavior for userspace configurations. However, if I
> > understand your comment correctly, you are suggesting that we should change
> > the wording to not include "never", since sysctls are no longer an ABI (and
> > therefore we should be OK to change what the values mean?)
> >
> > If that is the case, then I can send in another patch since I think the goals
> > are a bit different for the two patches. With that said, I think we should
> > keep the warning just to avoid any breakages in userspace, even if sysctl
> > might not be considered an ABI anymore (also I must have missed this, I didn't
> > know this at all!)
> 
> Sorry for the confusion.  I agree that we shouldn't change the sysctl
> interface in most cases.  I just thought that we could soften the
> wording a little?  For example,
> 
> New bits are OK, but existing bits shouldn't be changed.
> 
> I think that it's still clear that we don't want to change the existing
> bits.
> 
> However, my English is poor.  So, my suggestion may not make sense.

Hi Ying, thank you again for the response!

No worries at all, it was my misunderstanding : -) This suggestion makes sense,
and I think it's small enough & relevant to the code block, so I'll fold this
change into my patch as well. I'll send out the next version shortly!

Have a great day!
Joshua

Sent using hkml (https://github.com/sjp38/hackermail)
Re: [PATCH v2] mempolicy: Clarify what zone reclaim means
Posted by SeongJae Park 2 months ago
On Thu, 31 Jul 2025 14:07:37 -0700 Joshua Hahn <joshua.hahnjy@gmail.com> wrote:

> The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
> memory. Contrary to its user-facing name, it is internally referred to as
> "node_reclaim_mode".
> 
> This can be confusing. But because we cannot change the name of the API since
> it has been in place since at least 2.6, let's try to be more explicit about
> what the behavior of this API is. 
> 
> Change the description to clarify what zone reclaim entails, and be explicit
> about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
> past already [1] [2].
> 
> [1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
> [2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/
> 
> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> ---
>  include/uapi/linux/mempolicy.h | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
> index 1f9bb10d1a47..6c9c9385ff89 100644
> --- a/include/uapi/linux/mempolicy.h
> +++ b/include/uapi/linux/mempolicy.h
> @@ -66,10 +66,16 @@ enum {
>  #define MPOL_F_MORON	(1 << 4) /* Migrate On protnone Reference On Node */
>  
>  /*
> + * Enabling zone reclaim means the page allocator will attempt to fulfill
> + * the allocation request on the current node by triggering reclaim and
> + * trying to shrink the current node.
> + * Fallback allocations on the next candidates in the zonelist are considered
> + * zone when reclaim fails to free up enough memory in the current node/zone.

s/zone when reclaim fails/when reclaim fails/ ?

> + *
>   * These bit locations are exposed in the vm.zone_reclaim_mode sysctl
>   * ABI.  New bits are OK, but existing bits can never change.
>   */
> -#define RECLAIM_ZONE	(1<<0)	/* Run shrink_inactive_list on the zone */
> +#define RECLAIM_ZONE	(1<<0)	/* Enable zone reclaim */
>  #define RECLAIM_WRITE	(1<<1)	/* Writeout pages during reclaim */
>  #define RECLAIM_UNMAP	(1<<2)	/* Unmap pages during reclaim */
>  
> 
> base-commit: 260f6f4fda93c8485c8037865c941b42b9cba5d2
> -- 
> 2.47.3
> 

Other than the above trivial thing,

Acked-by: SeongJae Park <sj@kernel.org>


Thanks,
SJ
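
The behavior that the new comment describes (reclaim on the current node first, fallback to the next zonelist candidates only when reclaim does not free enough) can be illustrated with a toy model in C. This is a sketch under loud assumptions, not the kernel's page allocator: struct toy_node and toy_alloc are hypothetical names, and the model ignores watermarks, reclaim distance, and everything else the real code considers.

```c
#include <stdbool.h>

/* Toy stand-in for a NUMA node; not a kernel structure. */
struct toy_node {
	long free_pages;
	long reclaimable_pages;
};

/*
 * Hypothetical toy allocator: zonelist[0] is the current node, later
 * entries are fallback candidates.  Returns the index of the node that
 * satisfied the request, or -1 if none could.
 */
static int toy_alloc(struct toy_node *zonelist, int nr_nodes, long want,
		     bool zone_reclaim)
{
	for (int i = 0; i < nr_nodes; i++) {
		struct toy_node *node = &zonelist[i];

		/* With zone reclaim on, try to shrink the current node first. */
		if (i == 0 && zone_reclaim && node->free_pages < want) {
			long need = want - node->free_pages;
			long got = need < node->reclaimable_pages ?
				   need : node->reclaimable_pages;

			node->reclaimable_pages -= got;
			node->free_pages += got;
		}

		if (node->free_pages >= want) {
			node->free_pages -= want;
			return i;
		}
		/* Not enough memory here: consider the next candidate. */
	}
	return -1;
}
```

In this model, a request for 8 pages against a current node with 2 free and 10 reclaimable pages stays local when zone_reclaim is true and falls back to the next node when it is false, or when the current node has too few reclaimable pages to cover the request.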
Re: [PATCH v2] mempolicy: Clarify what zone reclaim means
Posted by David Hildenbrand 2 months ago
On 01.08.25 00:41, SeongJae Park wrote:
> On Thu, 31 Jul 2025 14:07:37 -0700 Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
> 
>> The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
>> memory. Contrary to its user-facing name, it is internally referred to as
>> "node_reclaim_mode".
>>
>> This can be confusing. But because we cannot change the name of the API since
>> it has been in place since at least 2.6, let's try to be more explicit about
>> what the behavior of this API is.
>>
>> Change the description to clarify what zone reclaim entails, and be explicit
>> about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
>> past already [1] [2].
>>
>> [1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
>> [2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/
>>
>> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
>> ---
>>   include/uapi/linux/mempolicy.h | 8 +++++++-
>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
>> index 1f9bb10d1a47..6c9c9385ff89 100644
>> --- a/include/uapi/linux/mempolicy.h
>> +++ b/include/uapi/linux/mempolicy.h
>> @@ -66,10 +66,16 @@ enum {
>>   #define MPOL_F_MORON	(1 << 4) /* Migrate On protnone Reference On Node */
>>   
>>   /*
>> + * Enabling zone reclaim means the page allocator will attempt to fulfill
>> + * the allocation request on the current node by triggering reclaim and
>> + * trying to shrink the current node.
>> + * Fallback allocations on the next candidates in the zonelist are considered
>> + * zone when reclaim fails to free up enough memory in the current node/zone.
> 
> s/zone when reclaim fails/when reclaim fails/ ?

Agreed, that confused me as well.

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers,

David / dhildenb
Re: [PATCH v2] mempolicy: Clarify what zone reclaim means
Posted by Joshua Hahn 2 months ago
On Fri, 1 Aug 2025 11:04:00 +0200 David Hildenbrand <david@redhat.com> wrote:

> On 01.08.25 00:41, SeongJae Park wrote:
> > On Thu, 31 Jul 2025 14:07:37 -0700 Joshua Hahn <joshua.hahnjy@gmail.com> wrote:
> > 
> >> The zone_reclaim_mode API controls the reclaim behavior when a node runs out of
> >> memory. Contrary to its user-facing name, it is internally referred to as
> >> "node_reclaim_mode".
> >>
> >> This can be confusing. But because we cannot change the name of the API since
> >> it has been in place since at least 2.6, let's try to be more explicit about
> >> what the behavior of this API is.
> >>
> >> Change the description to clarify what zone reclaim entails, and be explicit
> >> about the RECLAIM_ZONE bit, whose purpose has led to some confusion in the
> >> past already [1] [2].
> >>
> >> [1] https://lore.kernel.org/linux-mm/1579005573-58923-1-git-send-email-alex.shi@linux.alibaba.com/
> >> [2] https://lore.kernel.org/linux-mm/20200626003459.D8E015CA@viggo.jf.intel.com/
> >>
> >> Signed-off-by: Joshua Hahn <joshua.hahnjy@gmail.com>
> >> ---
> >>   include/uapi/linux/mempolicy.h | 8 +++++++-
> >>   1 file changed, 7 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/include/uapi/linux/mempolicy.h b/include/uapi/linux/mempolicy.h
> >> index 1f9bb10d1a47..6c9c9385ff89 100644
> >> --- a/include/uapi/linux/mempolicy.h
> >> +++ b/include/uapi/linux/mempolicy.h
> >> @@ -66,10 +66,16 @@ enum {
> >>   #define MPOL_F_MORON	(1 << 4) /* Migrate On protnone Reference On Node */
> >>   
> >>   /*
> >> + * Enabling zone reclaim means the page allocator will attempt to fulfill
> >> + * the allocation request on the current node by triggering reclaim and
> >> + * trying to shrink the current node.
> >> + * Fallback allocations on the next candidates in the zonelist are considered
> >> + * zone when reclaim fails to free up enough memory in the current node/zone.
> > 
> > s/zone when reclaim fails/when reclaim fails/ ?
> 
> Agreed, that confused me as well.

Hi David, hi SJ!

Thank you both for catching this, I definitely missed this before sending the
patch out. Will fix in the next version!

> Acked-by: David Hildenbrand <david@redhat.com>

And thank you for your Ack : -) Have a great day!
Joshua

Sent using hkml (https://github.com/sjp38/hackermail)