[RFC PATCH 04/39] mm: mempolicy: Refactor out policy_node_nodemask()

Ackerley Tng posted 39 patches 2 months, 2 weeks ago
[RFC PATCH 04/39] mm: mempolicy: Refactor out policy_node_nodemask()
Posted by Ackerley Tng 2 months, 2 weeks ago
This was refactored out of huge_node().

huge_node()'s interpretation of vma for order assumes the
hugetlb-specific storage of the hstate information in the
inode. policy_node_nodemask() does not assume that, and can be used
more generically.

This refactoring also enforces that nid default to the current node
id, which was not previously enforced.

alloc_pages_mpol_noprof() is the last remaining direct user of
policy_nodemask(). All its callers begin with nid being the current
node id as well. More refactoring is required for to simplify that.

Signed-off-by: Ackerley Tng <ackerleytng@google.com>
---
 include/linux/mempolicy.h |  2 ++
 mm/mempolicy.c            | 36 ++++++++++++++++++++++++++----------
 2 files changed, 28 insertions(+), 10 deletions(-)

diff --git a/include/linux/mempolicy.h b/include/linux/mempolicy.h
index 1add16f21612..a49631e47421 100644
--- a/include/linux/mempolicy.h
+++ b/include/linux/mempolicy.h
@@ -138,6 +138,8 @@ extern void numa_policy_init(void);
 extern void mpol_rebind_task(struct task_struct *tsk, const nodemask_t *new);
 extern void mpol_rebind_mm(struct mm_struct *mm, nodemask_t *new);
 
+extern int policy_node_nodemask(struct mempolicy *mpol, gfp_t gfp_flags,
+				pgoff_t ilx, nodemask_t **nodemask);
 extern int huge_node(struct vm_area_struct *vma,
 				unsigned long addr, gfp_t gfp_flags,
 				struct mempolicy **mpol, nodemask_t **nodemask);
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index b858e22b259d..f3e572e17775 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1212,7 +1212,6 @@ static struct folio *alloc_migration_target_by_mpol(struct folio *src,
 	struct mempolicy *pol = mmpol->pol;
 	pgoff_t ilx = mmpol->ilx;
 	unsigned int order;
-	int nid = numa_node_id();
 	gfp_t gfp;
 
 	order = folio_order(src);
@@ -1221,10 +1220,11 @@ static struct folio *alloc_migration_target_by_mpol(struct folio *src,
 	if (folio_test_hugetlb(src)) {
 		nodemask_t *nodemask;
 		struct hstate *h;
+		int nid;
 
 		h = folio_hstate(src);
 		gfp = htlb_alloc_mask(h);
-		nodemask = policy_nodemask(gfp, pol, ilx, &nid);
+		nid = policy_node_nodemask(pol, gfp, ilx, &nodemask);
 		return alloc_hugetlb_folio_nodemask(h, nid, nodemask, gfp,
 				htlb_allow_alloc_fallback(MR_MEMPOLICY_MBIND));
 	}
@@ -1234,7 +1234,7 @@ static struct folio *alloc_migration_target_by_mpol(struct folio *src,
 	else
 		gfp = GFP_HIGHUSER_MOVABLE | __GFP_RETRY_MAYFAIL | __GFP_COMP;
 
-	return folio_alloc_mpol(gfp, order, pol, ilx, nid);
+	return folio_alloc_mpol(gfp, order, pol, ilx, numa_node_id());
 }
 #else
 
@@ -2084,6 +2084,27 @@ static nodemask_t *policy_nodemask(gfp_t gfp, struct mempolicy *pol,
 	return nodemask;
 }
 
+/**
+ * policy_node_nodemask(@mpol, @gfp_flags, @ilx, @nodemask)
+ * @mpol: the memory policy to interpret. Reference must be taken.
+ * @gfp_flags: for this request
+ * @ilx: interleave index, for use only when MPOL_INTERLEAVE or
+ *       MPOL_WEIGHTED_INTERLEAVE
+ * @nodemask: (output) pointer to nodemask pointer for 'bind' and 'prefer-many'
+ *            policy
+ *
+ * Returns a nid suitable for a page allocation and a pointer. If the effective
+ * policy is 'bind' or 'prefer-many', returns a pointer to the mempolicy's
+ * @nodemask for filtering the zonelist.
+ */
+int policy_node_nodemask(struct mempolicy *mpol, gfp_t gfp_flags,
+			 pgoff_t ilx, nodemask_t **nodemask)
+{
+	int nid = numa_node_id();
+	*nodemask = policy_nodemask(gfp_flags, mpol, ilx, &nid);
+	return nid;
+}
+
 #ifdef CONFIG_HUGETLBFS
 /*
  * huge_node(@vma, @addr, @gfp_flags, @mpol)
@@ -2102,12 +2123,8 @@ int huge_node(struct vm_area_struct *vma, unsigned long addr, gfp_t gfp_flags,
 		struct mempolicy **mpol, nodemask_t **nodemask)
 {
 	pgoff_t ilx;
-	int nid;
-
-	nid = numa_node_id();
 	*mpol = get_vma_policy(vma, addr, hstate_vma(vma)->order, &ilx);
-	*nodemask = policy_nodemask(gfp_flags, *mpol, ilx, &nid);
-	return nid;
+	return policy_node_nodemask(*mpol, gfp_flags, ilx, nodemask);
 }
 
 /*
@@ -2549,8 +2566,7 @@ unsigned long alloc_pages_bulk_array_mempolicy_noprof(gfp_t gfp,
 		return alloc_pages_bulk_array_preferred_many(gfp,
 				numa_node_id(), pol, nr_pages, page_array);
 
-	nid = numa_node_id();
-	nodemask = policy_nodemask(gfp, pol, NO_INTERLEAVE_INDEX, &nid);
+	nid = policy_node_nodemask(pol, gfp, NO_INTERLEAVE_INDEX, &nodemask);
 	return alloc_pages_bulk_noprof(gfp, nid, nodemask,
 				       nr_pages, NULL, page_array);
 }
-- 
2.46.0.598.g6f2099f65c-goog
Re: [RFC PATCH 04/39] mm: mempolicy: Refactor out policy_node_nodemask()
Posted by Gregory Price 2 months, 2 weeks ago
On Tue, Sep 10, 2024 at 11:43:35PM +0000, Ackerley Tng wrote:
> This was refactored out of huge_node().
> 
> huge_node()'s interpretation of vma for order assumes the
> hugetlb-specific storage of the hstate information in the
> inode. policy_node_nodemask() does not assume that, and can be used
> more generically.
> 
> This refactoring also enforces that nid default to the current node
> id, which was not previously enforced.
> 
> alloc_pages_mpol_noprof() is the last remaining direct user of
> policy_nodemask(). All its callers begin with nid being the current
> node id as well. More refactoring is required for to simplify that.
> 
> Signed-off-by: Ackerley Tng <ackerleytng@google.com>

Reviewed-by: Gregory Price <gourry@gourry.net>

> +/**
> + * policy_node_nodemask(@mpol, @gfp_flags, @ilx, @nodemask)
> + * @mpol: the memory policy to interpret. Reference must be taken.
> + * @gfp_flags: for this request
> + * @ilx: interleave index, for use only when MPOL_INTERLEAVE or
> + *       MPOL_WEIGHTED_INTERLEAVE
> + * @nodemask: (output) pointer to nodemask pointer for 'bind' and 'prefer-many'
> + *            policy
> + *
> + * Returns a nid suitable for a page allocation and a pointer. If the effective
> + * policy is 'bind' or 'prefer-many', returns a pointer to the mempolicy's
> + * @nodemask for filtering the zonelist.

Technically it's possible for nid to contain MAX_NUMNODES upon return
if weighted interleave is used and the nodemask is somehow invalid
(contains no nodes, including the local node). I would expect this to
be indicative of a larger problem (i.e. should functionally never happen).

Now that I'm looking at it, it's possible the weighted interleave path
should default to returning numa_node_id() if node == MAX_NUMNODES, which
would not require any changes to this patch.

> + */
> +int policy_node_nodemask(struct mempolicy *mpol, gfp_t gfp_flags,
> +			 pgoff_t ilx, nodemask_t **nodemask)
> +{
> +	int nid = numa_node_id();
> +	*nodemask = policy_nodemask(gfp_flags, mpol, ilx, &nid);
> +	return nid;
> +}
> +
>  #ifdef CONFIG_HUGETLBFS
>  /*
>   * huge_node(@vma, @addr, @gfp_flags, @mpol)