From nobody Sun Oct 5 10:47:20 2025 Received: from mailrelay-egress16.pub.mailoutpod3-cph3.one.com (mailrelay-egress16.pub.mailoutpod3-cph3.one.com [46.30.212.3]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CB6D628A707 for ; Wed, 6 Aug 2025 12:41:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=46.30.212.3 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754484116; cv=none; b=ZlJz8sgcYar1eqbofAjmSW+e6YkOGozi66De4cO5sJjG/PIO4msjMzNZOnO3YiuvSJwNYYvatHASJNtcNmDLgEHau2u7IonV7npQugBCdodPYXyByNT+vXWLDQEj7V9q7ZcJdMw71OcGv33nhk3l/tkoRgkerCcYg5nO/htKUkI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1754484116; c=relaxed/simple; bh=tDnIk1nsk7d2myv/RhIrCyjK/WPTakd3dPeWmKrQbDI=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=G+AVPql047MFUpB/9T8Hwn4F+mQFOcccbjQ7kMHzOkXYUQxj0ZXcatmsuDnUy9MGM3TVcyizZTThumTkbyBSzN21o/K54g5auCja1YvvkGVC/0qiPoGBDSVQbeBNMLpPm7a5/dKkeEHT2jUyWD370EGsXEBQuyUKnG8/tfn83jM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=konsulko.se; spf=none smtp.mailfrom=konsulko.se; dkim=pass (2048-bit key) header.d=konsulko.se header.i=@konsulko.se header.b=QLUSGTXs; dkim=permerror (0-bit key) header.d=konsulko.se header.i=@konsulko.se header.b=iTQxPsYz; arc=none smtp.client-ip=46.30.212.3 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=konsulko.se Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=konsulko.se Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=konsulko.se header.i=@konsulko.se header.b="QLUSGTXs"; dkim=permerror (0-bit key) header.d=konsulko.se header.i=@konsulko.se header.b="iTQxPsYz" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; t=1754484111; x=1755088911; d=konsulko.se; s=rsa1; h=content-transfer-encoding:mime-version:references:in-reply-to:message-id:date: subject:cc:to:from:from; bh=pjAfZVitxErZt1vrAjky7Diqv9MECrtNO3aybwOppWM=; b=QLUSGTXsUL+Tm8j0wA6JYfkbPDriRX2GdqD4uZEnx8yTfZ1xuz++jLyVyXeOXo5HwuEOkz6fDXTEr LSWAr3QbZTCZy/8mhd+shQZi0KIWzYaSA2Qapd5BkSoTPVtQsDDtNrOGU17h5afXxe0lT0DpFvG/yq DH83x7vh8KnJih+te/fMD3uqXWDwk4ZOmZHQcoRnzvz3aGlul8LbmywTDP9dy4dyZ6EqVsiTGVH7Lc kSBhISQci/I6Fzcrwl/IqP9lQ5A7AfoOdpCHVibt7jQFRBHirtvXERQZJpAuoHZ4GrBozp/Czdq6wL FA1oF11j0u8vwfBpXpRyHcC1W2MV1Kg== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; t=1754484111; x=1755088911; d=konsulko.se; s=ed1; h=content-transfer-encoding:mime-version:references:in-reply-to:message-id:date: subject:cc:to:from:from; bh=pjAfZVitxErZt1vrAjky7Diqv9MECrtNO3aybwOppWM=; b=iTQxPsYzR0MXlyDzpqHrU1teSRa1uHSFmwOBDYdRfUKqWbv6vcBDT8xbBlvagvCMbt1iWJ8x2PbEA t9XG+otBQ== X-HalOne-ID: b8ecdef9-72c2-11f0-bda4-e90f2b8e16ca Received: from localhost.localdomain (c188-150-224-8.bredband.tele2.se [188.150.224.8]) by mailrelay2.pub.mailoutpod2-cph3.one.com (Halon) with ESMTPSA id b8ecdef9-72c2-11f0-bda4-e90f2b8e16ca; Wed, 06 Aug 2025 12:41:50 +0000 (UTC) From: Vitaly Wool To: linux-mm@kvack.org Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org, Uladzislau Rezki , Danilo Krummrich , Alice Ryhl , Vlastimil Babka , rust-for-linux@vger.kernel.org, Lorenzo Stoakes , "Liam R . Howlett" , Kent Overstreet , linux-bcachefs@vger.kernel.org, bpf@vger.kernel.org, Herbert Xu , Jann Horn , Pedro Falcato , Vitaly Wool Subject: [PATCH v15 2/4] mm/slub: allow to set node and align in k[v]realloc Date: Wed, 6 Aug 2025 14:41:47 +0200 Message-Id: <20250806124147.1724658-1-vitaly.wool@konsulko.se> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20250806124034.1724515-1-vitaly.wool@konsulko.se> References: <20250806124034.1724515-1-vitaly.wool@konsulko.se> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Reimplement k[v]realloc_node() to be able to set node and alignment should a user need to do so. In order to do that while retaining the maximal backward compatibility, add k[v]realloc_node_align() functions and redefine the rest of API using these new ones. While doing that, we also keep the number of _noprof variants to a minimum, which implies some changes to the existing users of older _noprof functions, that basically being bcachefs. With that change we also provide the ability for the Rust part of the kernel to set node and alignment in its K[v]xxx [re]allocations. Signed-off-by: Vitaly Wool Reviewed-by: Vlastimil Babka --- fs/bcachefs/darray.c | 2 +- fs/bcachefs/util.h | 2 +- include/linux/bpfptr.h | 2 +- include/linux/slab.h | 39 +++++++++++++++++----------- lib/rhashtable.c | 4 +-- mm/slub.c | 59 ++++++++++++++++++++++++++++++++---------- 6 files changed, 74 insertions(+), 34 deletions(-) diff --git a/fs/bcachefs/darray.c b/fs/bcachefs/darray.c index e86d36d23e9e..928e83a1ce42 100644 --- a/fs/bcachefs/darray.c +++ b/fs/bcachefs/darray.c @@ -21,7 +21,7 @@ int __bch2_darray_resize_noprof(darray_char *d, size_t el= ement_size, size_t new_ return -ENOMEM; =20 void *data =3D likely(bytes < INT_MAX) - ? kvmalloc_noprof(bytes, gfp) + ? kvmalloc_node_align_noprof(bytes, 1, gfp, NUMA_NO_NODE) : vmalloc_noprof(bytes); if (!data) return -ENOMEM; diff --git a/fs/bcachefs/util.h b/fs/bcachefs/util.h index 6488f098d140..7112fd40ee21 100644 --- a/fs/bcachefs/util.h +++ b/fs/bcachefs/util.h @@ -61,7 +61,7 @@ static inline void *bch2_kvmalloc_noprof(size_t n, gfp_t = flags) { void *p =3D unlikely(n >=3D INT_MAX) ? vmalloc_noprof(n) - : kvmalloc_noprof(n, flags & ~__GFP_ZERO); + : kvmalloc_node_align_noprof(n, 1, flags & ~__GFP_ZERO, NUMA_NO_NODE); if (p && (flags & __GFP_ZERO)) memset(p, 0, n); return p; diff --git a/include/linux/bpfptr.h b/include/linux/bpfptr.h index 1af241525a17..f6e0795db484 100644 --- a/include/linux/bpfptr.h +++ b/include/linux/bpfptr.h @@ -67,7 +67,7 @@ static inline int copy_to_bpfptr_offset(bpfptr_t dst, siz= e_t offset, =20 static inline void *kvmemdup_bpfptr_noprof(bpfptr_t src, size_t len) { - void *p =3D kvmalloc_noprof(len, GFP_USER | __GFP_NOWARN); + void *p =3D kvmalloc_node_align_noprof(len, 1, GFP_USER | __GFP_NOWARN, N= UMA_NO_NODE); =20 if (!p) return ERR_PTR(-ENOMEM); diff --git a/include/linux/slab.h b/include/linux/slab.h index d5a8ab98035c..6dc300bac2a1 100644 --- a/include/linux/slab.h +++ b/include/linux/slab.h @@ -465,9 +465,13 @@ int kmem_cache_shrink(struct kmem_cache *s); /* * Common kmalloc functions provided by all allocators */ -void * __must_check krealloc_noprof(const void *objp, size_t new_size, - gfp_t flags) __realloc_size(2); -#define krealloc(...) alloc_hooks(krealloc_noprof(__VA_ARGS__)) +void * __must_check krealloc_node_align_noprof(const void *objp, size_t ne= w_size, + unsigned long align, + gfp_t flags, int nid) __realloc_size(2); +#define krealloc_noprof(_o, _s, _f) krealloc_node_align_noprof(_o, _s, 1, = _f, NUMA_NO_NODE) +#define krealloc_node_align(...) alloc_hooks(krealloc_node_align_noprof(__= VA_ARGS__)) +#define krealloc_node(_o, _s, _f, _n) krealloc_node_align(_o, _s, 1, _f, _= n) +#define krealloc(...) krealloc_node(__VA_ARGS__, NUMA_NO_NODE) =20 void kfree(const void *objp); void kfree_sensitive(const void *objp); @@ -1041,18 +1045,20 @@ static inline __alloc_size(1) void *kzalloc_noprof(= size_t size, gfp_t flags) #define kzalloc(...) alloc_hooks(kzalloc_noprof(__VA_ARGS__)) #define kzalloc_node(_size, _flags, _node) kmalloc_node(_size, (_flags)|__= GFP_ZERO, _node) =20 -void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int= node) __alloc_size(1); -#define kvmalloc_node_noprof(size, flags, node) \ - __kvmalloc_node_noprof(PASS_BUCKET_PARAMS(size, NULL), flags, node) -#define kvmalloc_node(...) alloc_hooks(kvmalloc_node_noprof(__VA_ARGS__)) - -#define kvmalloc(_size, _flags) kvmalloc_node(_size, _flags, NUMA_NO_NOD= E) -#define kvmalloc_noprof(_size, _flags) kvmalloc_node_noprof(_size, _flags= , NUMA_NO_NODE) +void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), unsigned long al= ign, + gfp_t flags, int node) __alloc_size(1); +#define kvmalloc_node_align_noprof(_size, _align, _flags, _node) \ + __kvmalloc_node_noprof(PASS_BUCKET_PARAMS(_size, NULL), _align, _flags, _= node) +#define kvmalloc_node_align(...) \ + alloc_hooks(kvmalloc_node_align_noprof(__VA_ARGS__)) +#define kvmalloc_node(_s, _f, _n) kvmalloc_node_align(_s, 1, _f, _n) +#define kvmalloc(...) kvmalloc_node(__VA_ARGS__, NUMA_NO_NODE) #define kvzalloc(_size, _flags) kvmalloc(_size, (_flags)|__GFP_ZERO) =20 #define kvzalloc_node(_size, _flags, _node) kvmalloc_node(_size, (_flags)|= __GFP_ZERO, _node) + #define kmem_buckets_valloc(_b, _size, _flags) \ - alloc_hooks(__kvmalloc_node_noprof(PASS_BUCKET_PARAMS(_size, _b), _flags,= NUMA_NO_NODE)) + alloc_hooks(__kvmalloc_node_noprof(PASS_BUCKET_PARAMS(_size, _b), 1, _fla= gs, NUMA_NO_NODE)) =20 static inline __alloc_size(1, 2) void * kvmalloc_array_node_noprof(size_t n, size_t size, gfp_t flags, int node) @@ -1062,7 +1068,7 @@ kvmalloc_array_node_noprof(size_t n, size_t size, gfp= _t flags, int node) if (unlikely(check_mul_overflow(n, size, &bytes))) return NULL; =20 - return kvmalloc_node_noprof(bytes, flags, node); + return kvmalloc_node_align_noprof(bytes, 1, flags, node); } =20 #define kvmalloc_array_noprof(...) kvmalloc_array_node_noprof(__VA_ARGS__= , NUMA_NO_NODE) @@ -1073,9 +1079,12 @@ kvmalloc_array_node_noprof(size_t n, size_t size, gf= p_t flags, int node) #define kvcalloc_node(...) alloc_hooks(kvcalloc_node_noprof(__VA_ARGS__)) #define kvcalloc(...) alloc_hooks(kvcalloc_noprof(__VA_ARGS__)) =20 -void *kvrealloc_noprof(const void *p, size_t size, gfp_t flags) - __realloc_size(2); -#define kvrealloc(...) alloc_hooks(kvrealloc_noprof(__VA_ARGS__)) +void *kvrealloc_node_align_noprof(const void *p, size_t size, unsigned lon= g align, + gfp_t flags, int nid) __realloc_size(2); +#define kvrealloc_node_align(...) \ + alloc_hooks(kvrealloc_node_align_noprof(__VA_ARGS__)) +#define kvrealloc_node(_p, _s, _f, _n) kvrealloc_node_align(_p, _s, 1, _f= , _n) +#define kvrealloc(...) kvrealloc_node(__VA_ARGS__, NUMA_NO_NODE) =20 extern void kvfree(const void *addr); DEFINE_FREE(kvfree, void *, if (!IS_ERR_OR_NULL(_T)) kvfree(_T)) diff --git a/lib/rhashtable.c b/lib/rhashtable.c index 3e555d012ed6..fde0f0e556f8 100644 --- a/lib/rhashtable.c +++ b/lib/rhashtable.c @@ -184,8 +184,8 @@ static struct bucket_table *bucket_table_alloc(struct r= hashtable *ht, static struct lock_class_key __key; =20 tbl =3D alloc_hooks_tag(ht->alloc_tag, - kvmalloc_node_noprof(struct_size(tbl, buckets, nbuckets), - gfp|__GFP_ZERO, NUMA_NO_NODE)); + kvmalloc_node_align_noprof(struct_size(tbl, buckets, nbuckets), + 1, gfp|__GFP_ZERO, NUMA_NO_NODE)); =20 size =3D nbuckets; =20 diff --git a/mm/slub.c b/mm/slub.c index c4b64821e680..d01441ea0522 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -4845,7 +4845,7 @@ void kfree(const void *object) EXPORT_SYMBOL(kfree); =20 static __always_inline __realloc_size(2) void * -__do_krealloc(const void *p, size_t new_size, gfp_t flags) +__do_krealloc(const void *p, size_t new_size, unsigned long align, gfp_t f= lags, int nid) { void *ret; size_t ks =3D 0; @@ -4859,6 +4859,16 @@ __do_krealloc(const void *p, size_t new_size, gfp_t = flags) if (!kasan_check_byte(p)) return NULL; =20 + /* + * If reallocation is not necessary (e. g. the new size is less + * than the current allocated size), the current allocation will be + * preserved unless __GFP_THISNODE is set. In the latter case a new + * allocation on the requested node will be attempted. + */ + if (unlikely(flags & __GFP_THISNODE) && nid !=3D NUMA_NO_NODE && + nid !=3D page_to_nid(virt_to_page(p))) + goto alloc_new; + if (is_kfence_address(p)) { ks =3D orig_size =3D kfence_ksize(p); } else { @@ -4881,6 +4891,10 @@ __do_krealloc(const void *p, size_t new_size, gfp_t = flags) if (new_size > ks) goto alloc_new; =20 + /* If the old object doesn't satisfy the new alignment, allocate a new on= e */ + if (!IS_ALIGNED((unsigned long)p, align)) + goto alloc_new; + /* Zero out spare memory. */ if (want_init_on_alloc(flags)) { kasan_disable_current(); @@ -4903,7 +4917,7 @@ __do_krealloc(const void *p, size_t new_size, gfp_t f= lags) return (void *)p; =20 alloc_new: - ret =3D kmalloc_node_track_caller_noprof(new_size, flags, NUMA_NO_NODE, _= RET_IP_); + ret =3D kmalloc_node_track_caller_noprof(new_size, flags, nid, _RET_IP_); if (ret && p) { /* Disable KASAN checks as the object's redzone is accessed. */ kasan_disable_current(); @@ -4915,14 +4929,19 @@ __do_krealloc(const void *p, size_t new_size, gfp_t= flags) } =20 /** - * krealloc - reallocate memory. The contents will remain unchanged. + * krealloc_node_align - reallocate memory. The contents will remain uncha= nged. * @p: object to reallocate memory for. * @new_size: how many bytes of memory are required. + * @align: desired alignment. * @flags: the type of memory to allocate. + * @nid: NUMA node or NUMA_NO_NODE * * If @p is %NULL, krealloc() behaves exactly like kmalloc(). If @new_size * is 0 and @p is not a %NULL pointer, the object pointed to is freed. * + * Only alignments up to those guaranteed by kmalloc() will be honored. Pl= ease see + * Documentation/core-api/memory-allocation.rst for more details. + * * If __GFP_ZERO logic is requested, callers must ensure that, starting wi= th the * initial memory allocation, every subsequent call to this API for the sa= me * memory allocation is flagged with __GFP_ZERO. Otherwise, it is possible= that @@ -4947,7 +4966,8 @@ __do_krealloc(const void *p, size_t new_size, gfp_t f= lags) * * Return: pointer to the allocated memory or %NULL in case of error */ -void *krealloc_noprof(const void *p, size_t new_size, gfp_t flags) +void *krealloc_node_align_noprof(const void *p, size_t new_size, unsigned = long align, + gfp_t flags, int nid) { void *ret; =20 @@ -4956,13 +4976,13 @@ void *krealloc_noprof(const void *p, size_t new_siz= e, gfp_t flags) return ZERO_SIZE_PTR; } =20 - ret =3D __do_krealloc(p, new_size, flags); + ret =3D __do_krealloc(p, new_size, align, flags, nid); if (ret && kasan_reset_tag(p) !=3D kasan_reset_tag(ret)) kfree(p); =20 return ret; } -EXPORT_SYMBOL(krealloc_noprof); +EXPORT_SYMBOL(krealloc_node_align_noprof); =20 static gfp_t kmalloc_gfp_adjust(gfp_t flags, size_t size) { @@ -4993,9 +5013,13 @@ static gfp_t kmalloc_gfp_adjust(gfp_t flags, size_t = size) * failure, fall back to non-contiguous (vmalloc) allocation. * @size: size of the request. * @b: which set of kmalloc buckets to allocate from. + * @align: desired alignment. * @flags: gfp mask for the allocation - must be compatible (superset) wit= h GFP_KERNEL. * @node: numa node to allocate from * + * Only alignments up to those guaranteed by kmalloc() will be honored. Pl= ease see + * Documentation/core-api/memory-allocation.rst for more details. + * * Uses kmalloc to get the memory but if the allocation fails then falls b= ack * to the vmalloc allocator. Use kvfree for freeing the memory. * @@ -5005,7 +5029,8 @@ static gfp_t kmalloc_gfp_adjust(gfp_t flags, size_t s= ize) * * Return: pointer to the allocated memory of %NULL in case of failure */ -void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), gfp_t flags, int= node) +void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size, b), unsigned long al= ign, + gfp_t flags, int node) { void *ret; =20 @@ -5035,7 +5060,7 @@ void *__kvmalloc_node_noprof(DECL_BUCKET_PARAMS(size,= b), gfp_t flags, int node) * about the resulting pointer, and cannot play * protection games. */ - return __vmalloc_node_range_noprof(size, 1, VMALLOC_START, VMALLOC_END, + return __vmalloc_node_range_noprof(size, align, VMALLOC_START, VMALLOC_EN= D, flags, PAGE_KERNEL, VM_ALLOW_HUGE_VMAP, node, __builtin_return_address(0)); } @@ -5079,14 +5104,19 @@ void kvfree_sensitive(const void *addr, size_t len) EXPORT_SYMBOL(kvfree_sensitive); =20 /** - * kvrealloc - reallocate memory; contents remain unchanged + * kvrealloc_node_align - reallocate memory; contents remain unchanged * @p: object to reallocate memory for * @size: the size to reallocate + * @align: desired alignment * @flags: the flags for the page level allocator + * @nid: NUMA node id * * If @p is %NULL, kvrealloc() behaves exactly like kvmalloc(). If @size i= s 0 * and @p is not a %NULL pointer, the object pointed to is freed. * + * Only alignments up to those guaranteed by kmalloc() will be honored. Pl= ease see + * Documentation/core-api/memory-allocation.rst for more details. + * * If __GFP_ZERO logic is requested, callers must ensure that, starting wi= th the * initial memory allocation, every subsequent call to this API for the sa= me * memory allocation is flagged with __GFP_ZERO. Otherwise, it is possible= that @@ -5100,17 +5130,18 @@ EXPORT_SYMBOL(kvfree_sensitive); * * Return: pointer to the allocated memory or %NULL in case of error */ -void *kvrealloc_noprof(const void *p, size_t size, gfp_t flags) +void *kvrealloc_node_align_noprof(const void *p, size_t size, unsigned lon= g align, + gfp_t flags, int nid) { void *n; =20 if (is_vmalloc_addr(p)) - return vrealloc_noprof(p, size, flags); + return vrealloc_node_align_noprof(p, size, align, flags, nid); =20 - n =3D krealloc_noprof(p, size, kmalloc_gfp_adjust(flags, size)); + n =3D krealloc_node_align_noprof(p, size, align, kmalloc_gfp_adjust(flags= , size), nid); if (!n) { /* We failed to krealloc(), fall back to kvmalloc(). */ - n =3D kvmalloc_noprof(size, flags); + n =3D kvmalloc_node_align_noprof(size, align, flags, nid); if (!n) return NULL; =20 @@ -5126,7 +5157,7 @@ void *kvrealloc_noprof(const void *p, size_t size, gf= p_t flags) =20 return n; } -EXPORT_SYMBOL(kvrealloc_noprof); +EXPORT_SYMBOL(kvrealloc_node_align_noprof); =20 struct detached_freelist { struct slab *slab; --=20 2.39.2