[PATCH V4 8/8] mm/slab: place slabobj_ext metadata in unused space within s->size

Harry Yoo posted 8 patches 1 month, 2 weeks ago
There is a newer version of this series
[PATCH V4 8/8] mm/slab: place slabobj_ext metadata in unused space within s->size
Posted by Harry Yoo 1 month, 2 weeks ago
When a cache has a high s->align value and s->object_size is not aligned
to it, each object ends up with some unused space due to alignment.
If this wasted space is large enough, we can use it to store the
slabobj_ext metadata there instead of wasting it.

On my system, this happens with caches like kmem_cache, mm_struct, pid,
task_struct, sighand_cache, xfs_inode, and others.

To place the slabobj_ext metadata within each object, the existing
slab_obj_ext() logic can still be used by setting:

  - slab->obj_exts = slab_address(slab) + s->red_left_pad +
                     (slabobj_ext offset)
  - stride = s->size

slab_obj_ext() doesn't need to know where the metadata is stored,
so this method works without adding extra overhead to slab_obj_ext().
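
For illustration, the lookup boils down to something like the sketch
below (hypothetical helper name, not the actual slab_obj_ext() code):

	/*
	 * Resolve the metadata for object `index` from a base address and
	 * a per-slab stride.  The same arithmetic works whether the
	 * slabobj_ext vector is allocated separately (stride ==
	 * sizeof(struct slabobj_ext)) or embedded in each object's unused
	 * space (stride == s->size).
	 */
	static inline struct slabobj_ext *obj_ext_sketch(unsigned long base,
							 unsigned int stride,
							 unsigned int index)
	{
		return (struct slabobj_ext *)(base +
				(unsigned long)index * stride);
	}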

A good example benefiting from this optimization is xfs_inode
(object_size: 992, align: 64). To measure the memory savings, roughly
2.64 million directories were created on XFS.
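
As a rough worked example: ALIGN(992, 64) = 1024, so each xfs_inode
object has up to 1024 - 992 = 32 bytes of alignment padding available
(other per-object metadata may use part of it), which is enough for
struct slabobj_ext (8 bytes with MEMCG=y and MEM_ALLOC_PROFILING=n, as
in the measurement below).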

[ MEMCG=y, MEM_ALLOC_PROFILING=n ]

Before patch (creating ~2.64M directories on xfs):
  Slab:            5175976 kB
  SReclaimable:    3837524 kB
  SUnreclaim:      1338452 kB

After patch (creating ~2.64M directories on xfs):
  Slab:            5152912 kB
  SReclaimable:    3838568 kB
  SUnreclaim:      1314344 kB (-23.54 MiB)

Enjoy the memory savings!

Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Harry Yoo <harry.yoo@oracle.com>
---
 include/linux/slab.h |  9 ++++++
 mm/slab_common.c     |  6 ++--
 mm/slub.c            | 73 ++++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 83 insertions(+), 5 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 4554c04a9bd7..da512d9ab1a0 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -59,6 +59,9 @@ enum _slab_flag_bits {
 	_SLAB_CMPXCHG_DOUBLE,
 #ifdef CONFIG_SLAB_OBJ_EXT
 	_SLAB_NO_OBJ_EXT,
+#endif
+#if defined(CONFIG_SLAB_OBJ_EXT) && defined(CONFIG_64BIT)
+	_SLAB_OBJ_EXT_IN_OBJ,
 #endif
 	_SLAB_FLAGS_LAST_BIT
 };
@@ -244,6 +247,12 @@ enum _slab_flag_bits {
 #define SLAB_NO_OBJ_EXT		__SLAB_FLAG_UNUSED
 #endif
 
+#if defined(CONFIG_SLAB_OBJ_EXT) && defined(CONFIG_64BIT)
+#define SLAB_OBJ_EXT_IN_OBJ	__SLAB_FLAG_BIT(_SLAB_OBJ_EXT_IN_OBJ)
+#else
+#define SLAB_OBJ_EXT_IN_OBJ	__SLAB_FLAG_UNUSED
+#endif
+
 /*
  * ZERO_SIZE_PTR will be returned for zero sized kmalloc requests.
  *
diff --git a/mm/slab_common.c b/mm/slab_common.c
index c4cf9ed2ec92..f0a6db20d7ea 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -43,11 +43,13 @@ DEFINE_MUTEX(slab_mutex);
 struct kmem_cache *kmem_cache;
 
 /*
- * Set of flags that will prevent slab merging
+ * Set of flags that will prevent slab merging.
+ * Any flag that adds per-object metadata should be included,
+ * since slab merging can update s->inuse that affects the metadata layout.
  */
 #define SLAB_NEVER_MERGE (SLAB_RED_ZONE | SLAB_POISON | SLAB_STORE_USER | \
 		SLAB_TRACE | SLAB_TYPESAFE_BY_RCU | SLAB_NOLEAKTRACE | \
-		SLAB_FAILSLAB | SLAB_NO_MERGE)
+		SLAB_FAILSLAB | SLAB_NO_MERGE | SLAB_OBJ_EXT_IN_OBJ)
 
 #define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | \
 			 SLAB_CACHE_DMA32 | SLAB_ACCOUNT)
diff --git a/mm/slub.c b/mm/slub.c
index 3fc3d2ca42e7..78f0087c8e48 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -977,6 +977,39 @@ static inline bool obj_exts_in_slab(struct kmem_cache *s, struct slab *slab)
 {
 	return false;
 }
+
+#endif
+
+#if defined(CONFIG_SLAB_OBJ_EXT) && defined(CONFIG_64BIT)
+static bool obj_exts_in_object(struct kmem_cache *s)
+{
+	return s->flags & SLAB_OBJ_EXT_IN_OBJ;
+}
+
+static unsigned int obj_exts_offset_in_object(struct kmem_cache *s)
+{
+	unsigned int offset = get_info_end(s);
+
+	if (kmem_cache_debug_flags(s, SLAB_STORE_USER))
+		offset += sizeof(struct track) * 2;
+
+	if (slub_debug_orig_size(s))
+		offset += sizeof(unsigned long);
+
+	offset += kasan_metadata_size(s, false);
+
+	return offset;
+}
+#else
+static inline bool obj_exts_in_object(struct kmem_cache *s)
+{
+	return false;
+}
+
+static inline unsigned int obj_exts_offset_in_object(struct kmem_cache *s)
+{
+	return 0;
+}
 #endif
 
 #ifdef CONFIG_SLUB_DEBUG
@@ -1277,6 +1310,9 @@ static void print_trailer(struct kmem_cache *s, struct slab *slab, u8 *p)
 
 	off += kasan_metadata_size(s, false);
 
+	if (obj_exts_in_object(s))
+		off += sizeof(struct slabobj_ext);
+
 	if (off != size_from_object(s))
 		/* Beginning of the filler is the free pointer */
 		print_section(KERN_ERR, "Padding  ", p + off,
@@ -1446,7 +1482,10 @@ check_bytes_and_report(struct kmem_cache *s, struct slab *slab,
  * 	A. Free pointer (if we cannot overwrite object on free)
  * 	B. Tracking data for SLAB_STORE_USER
  *	C. Original request size for kmalloc object (SLAB_STORE_USER enabled)
- *	D. Padding to reach required alignment boundary or at minimum
+ *	D. KASAN alloc metadata (KASAN enabled)
+ *	E. struct slabobj_ext to store accounting metadata
+ *	   (SLAB_OBJ_EXT_IN_OBJ enabled)
+ *	F. Padding to reach required alignment boundary or at minimum
  * 		one word if debugging is on to be able to detect writes
  * 		before the word boundary.
  *
@@ -1474,6 +1513,9 @@ static int check_pad_bytes(struct kmem_cache *s, struct slab *slab, u8 *p)
 
 	off += kasan_metadata_size(s, false);
 
+	if (obj_exts_in_object(s))
+		off += sizeof(struct slabobj_ext);
+
 	if (size_from_object(s) == off)
 		return 1;
 
@@ -2280,7 +2322,8 @@ static inline void free_slab_obj_exts(struct slab *slab)
 		return;
 	}
 
-	if (obj_exts_in_slab(slab->slab_cache, slab)) {
+	if (obj_exts_in_slab(slab->slab_cache, slab) ||
+			obj_exts_in_object(slab->slab_cache)) {
 		slab->obj_exts = 0;
 		return;
 	}
@@ -2326,6 +2369,23 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
 			obj_exts |= MEMCG_DATA_OBJEXTS;
 		slab->obj_exts = obj_exts;
 		slab_set_stride(slab, sizeof(struct slabobj_ext));
+	} else if (obj_exts_in_object(s)) {
+		unsigned int offset = obj_exts_offset_in_object(s);
+
+		obj_exts = (unsigned long)slab_address(slab);
+		obj_exts += s->red_left_pad;
+		obj_exts += obj_exts_offset_in_object(s);
+
+		get_slab_obj_exts(obj_exts);
+		for_each_object(addr, s, slab_address(slab), slab->objects)
+			memset(kasan_reset_tag(addr) + offset, 0,
+			       sizeof(struct slabobj_ext));
+		put_slab_obj_exts(obj_exts);
+
+		if (IS_ENABLED(CONFIG_MEMCG))
+			obj_exts |= MEMCG_DATA_OBJEXTS;
+		slab->obj_exts = obj_exts;
+		slab_set_stride(slab, s->size);
 	}
 }
 
@@ -8023,6 +8083,7 @@ static int calculate_sizes(struct kmem_cache_args *args, struct kmem_cache *s)
 {
 	slab_flags_t flags = s->flags;
 	unsigned int size = s->object_size;
+	unsigned int aligned_size;
 	unsigned int order;
 
 	/*
@@ -8132,7 +8193,13 @@ static int calculate_sizes(struct kmem_cache_args *args, struct kmem_cache *s)
 	 * offset 0. In order to align the objects we have to simply size
 	 * each object to conform to the alignment.
 	 */
-	size = ALIGN(size, s->align);
+	aligned_size = ALIGN(size, s->align);
+#if defined(CONFIG_SLAB_OBJ_EXT) && defined(CONFIG_64BIT)
+	if (aligned_size - size >= sizeof(struct slabobj_ext))
+		s->flags |= SLAB_OBJ_EXT_IN_OBJ;
+#endif
+	size = aligned_size;
+
 	s->size = size;
 	s->reciprocal_size = reciprocal_value(size);
 	order = calculate_order(size);
-- 
2.43.0
Re: [PATCH V4 8/8] mm/slab: place slabobj_ext metadata in unused space within s->size
Posted by Hao Li 1 month, 2 weeks ago
On Mon, Dec 22, 2025 at 08:08:43PM +0900, Harry Yoo wrote:
> @@ -2326,6 +2369,23 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
>  			obj_exts |= MEMCG_DATA_OBJEXTS;
>  		slab->obj_exts = obj_exts;
>  		slab_set_stride(slab, sizeof(struct slabobj_ext));
> +	} else if (obj_exts_in_object(s)) {
> +		unsigned int offset = obj_exts_offset_in_object(s);
> +
> +		obj_exts = (unsigned long)slab_address(slab);
> +		obj_exts += s->red_left_pad;
> +		obj_exts += obj_exts_offset_in_object(s);

Hi, Harry

It looks like this could just be simplified to obj_exts += offset, right?
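
For example (just sketching the suggested simplification, same result):

	obj_exts = (unsigned long)slab_address(slab);
	obj_exts += s->red_left_pad;
	obj_exts += offset;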

> +
> +		get_slab_obj_exts(obj_exts);
> +		for_each_object(addr, s, slab_address(slab), slab->objects)
> +			memset(kasan_reset_tag(addr) + offset, 0,
> +			       sizeof(struct slabobj_ext));
> +		put_slab_obj_exts(obj_exts);
> +
> +		if (IS_ENABLED(CONFIG_MEMCG))
> +			obj_exts |= MEMCG_DATA_OBJEXTS;
> +		slab->obj_exts = obj_exts;
> +		slab_set_stride(slab, s->size);
>  	}
>  }
>  
> @@ -8023,6 +8083,7 @@ static int calculate_sizes(struct kmem_cache_args *args, struct kmem_cache *s)
>  {
>  	slab_flags_t flags = s->flags;
>  	unsigned int size = s->object_size;
> +	unsigned int aligned_size;
>  	unsigned int order;
>  
>  	/*
> @@ -8132,7 +8193,13 @@ static int calculate_sizes(struct kmem_cache_args *args, struct kmem_cache *s)
>  	 * offset 0. In order to align the objects we have to simply size
>  	 * each object to conform to the alignment.
>  	 */
> -	size = ALIGN(size, s->align);
> +	aligned_size = ALIGN(size, s->align);
> +#if defined(CONFIG_SLAB_OBJ_EXT) && defined(CONFIG_64BIT)
> +	if (aligned_size - size >= sizeof(struct slabobj_ext))
> +		s->flags |= SLAB_OBJ_EXT_IN_OBJ;
> +#endif
> +	size = aligned_size;
> +

One more thought: in calculate_sizes() we add some extra padding when
SLAB_RED_ZONE is enabled:

if (flags & SLAB_RED_ZONE) {
	/*
	 * Add some empty padding so that we can catch
	 * overwrites from earlier objects rather than let
	 * tracking information or the free pointer be
	 * corrupted if a user writes before the start
	 * of the object.
	 */
	size += sizeof(void *);
	...
}


From what I understand, this additional padding ends up being placed
after the KASAN allocation metadata.
Since it’s only "extra" padding (i.e., it doesn’t seem strictly required
for the layout), and your patch would reuse this area — together with
the final padding introduced by `size = ALIGN(size, s->align);` — for
objext, it seems like this padding may no longer provide much benefit.

Do you think it would make sense to remove this extra padding
altogether?

-- 
Thanks,
Hao
>  	s->size = size;
>  	s->reciprocal_size = reciprocal_value(size);
>  	order = calculate_order(size);
> -- 
> 2.43.0
> 
Re: [PATCH V4 8/8] mm/slab: place slabobj_ext metadata in unused space within s->size
Posted by Harry Yoo 1 month, 2 weeks ago
On Wed, Dec 24, 2025 at 01:33:59PM +0800, Hao Li wrote:
> On Mon, Dec 22, 2025 at 08:08:43PM +0900, Harry Yoo wrote:
> > @@ -2326,6 +2369,23 @@ static void alloc_slab_obj_exts_early(struct kmem_cache *s, struct slab *slab)
> >  			obj_exts |= MEMCG_DATA_OBJEXTS;
> >  		slab->obj_exts = obj_exts;
> >  		slab_set_stride(slab, sizeof(struct slabobj_ext));
> > +	} else if (obj_exts_in_object(s)) {
> > +		unsigned int offset = obj_exts_offset_in_object(s);
> > +
> > +		obj_exts = (unsigned long)slab_address(slab);
> > +		obj_exts += s->red_left_pad;
> > +		obj_exts += obj_exts_offset_in_object(s);
> 
> Hi, Harry
> 
> It looks like this could just be simplified to obj_exts += offset, right?

Right! Will do in v5.

> > +
> > +		get_slab_obj_exts(obj_exts);
> > +		for_each_object(addr, s, slab_address(slab), slab->objects)
> > +			memset(kasan_reset_tag(addr) + offset, 0,
> > +			       sizeof(struct slabobj_ext));
> > +		put_slab_obj_exts(obj_exts);
> > +
> > +		if (IS_ENABLED(CONFIG_MEMCG))
> > +			obj_exts |= MEMCG_DATA_OBJEXTS;
> > +		slab->obj_exts = obj_exts;
> > +		slab_set_stride(slab, s->size);
> >  	}
> >  }
> >  
> > @@ -8023,6 +8083,7 @@ static int calculate_sizes(struct kmem_cache_args *args, struct kmem_cache *s)
> >  {
> >  	slab_flags_t flags = s->flags;
> >  	unsigned int size = s->object_size;
> > +	unsigned int aligned_size;
> >  	unsigned int order;
> >  
> >  	/*
> > @@ -8132,7 +8193,13 @@ static int calculate_sizes(struct kmem_cache_args *args, struct kmem_cache *s)
> >  	 * offset 0. In order to align the objects we have to simply size
> >  	 * each object to conform to the alignment.
> >  	 */
> > -	size = ALIGN(size, s->align);
> > +	aligned_size = ALIGN(size, s->align);
> > +#if defined(CONFIG_SLAB_OBJ_EXT) && defined(CONFIG_64BIT)
> > +	if (aligned_size - size >= sizeof(struct slabobj_ext))
> > +		s->flags |= SLAB_OBJ_EXT_IN_OBJ;
> > +#endif
> > +	size = aligned_size;
> > +
> 
> One more thought: in calculate_sizes() we add some extra padding when
> SLAB_RED_ZONE is enabled:
> 
> if (flags & SLAB_RED_ZONE) {
> 	/*
> 	 * Add some empty padding so that we can catch
> 	 * overwrites from earlier objects rather than let
> 	 * tracking information or the free pointer be
> 	 * corrupted if a user writes before the start
> 	 * of the object.
> 	 */
> 	size += sizeof(void *);
> 	...
> }
> 
> 
> From what I understand, this additional padding ends up being placed
> after the KASAN allocation metadata.

Right.

> Since it’s only "extra" padding (i.e., it doesn’t seem strictly required
> for the layout), and your patch would reuse this area — together with
> the final padding introduced by `size = ALIGN(size, s->align);`

Very good point!
Nah, it wasn't intentional to reuse the extra padding.

> for objext, it seems like this padding may no longer provide much benefit.
> Do you think it would make sense to remove this extra padding
> altogether?

I think when debugging flags are enabled it'd still be useful to have.
I'll try to keep the padding area after obj_ext (so that overwrites from
the previous object won't overwrite the metadata).

Thanks a lot!

-- 
Cheers,
Harry / Hyeonggon
Re: [PATCH V4 8/8] mm/slab: place slabobj_ext metadata in unused space within s->size
Posted by Hao Li 1 month, 2 weeks ago
On Wed, Dec 24, 2025 at 03:38:57PM +0900, Harry Yoo wrote:
> On Wed, Dec 24, 2025 at 01:33:59PM +0800, Hao Li wrote:
> > On Mon, Dec 22, 2025 at 08:08:43PM +0900, Harry Yoo wrote:
> > One more thought: in calculate_sizes() we add some extra padding when
> > SLAB_RED_ZONE is enabled:
> > 
> > if (flags & SLAB_RED_ZONE) {
> > 	/*
> > 	 * Add some empty padding so that we can catch
> > 	 * overwrites from earlier objects rather than let
> > 	 * tracking information or the free pointer be
> > 	 * corrupted if a user writes before the start
> > 	 * of the object.
> > 	 */
> > 	size += sizeof(void *);
> > 	...
> > }
> > 
> > 
> > From what I understand, this additional padding ends up being placed
> > after the KASAN allocation metadata.
> 
> Right.
> 
> > Since it’s only "extra" padding (i.e., it doesn’t seem strictly required
> > for the layout), and your patch would reuse this area — together with
> > the final padding introduced by `size = ALIGN(size, s->align);`
> 
> Very good point!
> Nah, it wasn't intentional to reuse the extra padding.
> 
> > for objext, it seems like this padding may no longer provide much benefit.
> > Do you think it would make sense to remove this extra padding
> > altogether?
> 
> I think when debugging flags are enabled it'd still be useful to have,

Absolutely — I’m with you on this.

After thinking about it again, I agree it’s better to keep it.

Without that mandatory extra word, we could end up with "no trailing
padding at all" in cases where ALIGN(size, s->align) doesn’t actually
add any bytes.

> I'll try to keep the padding area after obj_ext (so that overwrites from
> the previous object won't overwrite the metadata).

Agree — we should make sure there is at least sizeof(void *) of extra
space after obj_exts when SLAB_RED_ZONE is enabled, so POISON_INUSE has
somewhere to go.

> 
> Thanks a lot!

Happy to help.

-- 
Thanks,
Hao
> 
> -- 
> Cheers,
> Harry / Hyeonggon
Re: [PATCH V4 8/8] mm/slab: place slabobj_ext metadata in unused space within s->size
Posted by Harry Yoo 1 month, 1 week ago
On Wed, Dec 24, 2025 at 08:43:17PM +0800, Hao Li wrote:
> On Wed, Dec 24, 2025 at 03:38:57PM +0900, Harry Yoo wrote:
> > On Wed, Dec 24, 2025 at 01:33:59PM +0800, Hao Li wrote:
> > > One more thought: in calculate_sizes() we add some extra padding when
> > > SLAB_RED_ZONE is enabled:
> > > 
> > > if (flags & SLAB_RED_ZONE) {
> > > 	/*
> > > 	 * Add some empty padding so that we can catch
> > > 	 * overwrites from earlier objects rather than let
> > > 	 * tracking information or the free pointer be
> > > 	 * corrupted if a user writes before the start
> > > 	 * of the object.
> > > 	 */
> > > 	size += sizeof(void *);
> > > 	...
> > > }
> > > 
> > > 
> > > From what I understand, this additional padding ends up being placed
> > > after the KASAN allocation metadata.
> > 
> > Right.
> > 
> > > Since it’s only "extra" padding (i.e., it doesn’t seem strictly required
> > > for the layout), and your patch would reuse this area — together with
> > > the final padding introduced by `size = ALIGN(size, s->align);`
> > 
> > Very good point!
> > Nah, it wasn't intentional to reuse the extra padding.

Waaaait, now I'm looking into it again to write V5...

It may reduce (or remove) the space for the final padding, but not the
mandatory padding, because the mandatory padding is already included
in the size before `aligned_size = ALIGN(size, s->align)`.
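
With illustrative numbers: if size at that point (object, metadata and
the mandatory word included) is 1000 and s->align is 64, then
aligned_size = ALIGN(1000, 64) = 1024 and aligned_size - size = 24.
SLAB_OBJ_EXT_IN_OBJ is only set when struct slabobj_ext fits in those
24 bytes of final padding, so the mandatory word accounted inside the
1000 is never handed over to obj_ext.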

> > > for objext, it seems like this padding may no longer provide much benefit.
> > > Do you think it would make sense to remove this extra padding
> > > altogether?
> > 
> > I think when debugging flags are enabled it'd still be useful to have,
> 
> Absolutely — I’m with you on this.
> 
> After thinking about it again, I agree it’s better to keep it.
> 
> Without that mandatory extra word, we could end up with "no trailing
> padding at all" in cases where ALIGN(size, s->align) doesn’t actually
> add any bytes.
> 
> > I'll try to keep the padding area after obj_ext (so that overwrites from
> > the previous object won't overwrite the metadata).
> 
> Agree — we should make sure there is at least sizeof(void *) of extra
> space after obj_exts when SLAB_RED_ZONE is enabled, so POISON_INUSE has
> somewhere to go.

I think V4 of the patchset is already doing that, no?

The mandatory padding exists after obj_ext if SLAB_RED_ZONE is enabled
and the final padding may or may not exist. check_pad_bytes() already knows
that the padding(s) exist after obj_ext.

By the way, thanks for fixing the comment once again;
it's easier to think about the layout now.

-- 
Cheers,
Harry / Hyeonggon
Re: [PATCH V4 8/8] mm/slab: place slabobj_ext metadata in unused space within s->size
Posted by Hao Li 1 month, 1 week ago
On Tue, Dec 30, 2025 at 01:59:57PM +0900, Harry Yoo wrote:
> On Wed, Dec 24, 2025 at 08:43:17PM +0800, Hao Li wrote:
> > On Wed, Dec 24, 2025 at 03:38:57PM +0900, Harry Yoo wrote:
> > > On Wed, Dec 24, 2025 at 01:33:59PM +0800, Hao Li wrote:
> > > > One more thought: in calculate_sizes() we add some extra padding when
> > > > SLAB_RED_ZONE is enabled:
> > > > 
> > > > if (flags & SLAB_RED_ZONE) {
> > > > 	/*
> > > > 	 * Add some empty padding so that we can catch
> > > > 	 * overwrites from earlier objects rather than let
> > > > 	 * tracking information or the free pointer be
> > > > 	 * corrupted if a user writes before the start
> > > > 	 * of the object.
> > > > 	 */
> > > > 	size += sizeof(void *);
> > > > 	...
> > > > }
> > > > 
> > > > 
> > > > From what I understand, this additional padding ends up being placed
> > > > after the KASAN allocation metadata.
> > > 
> > > Right.
> > > 
> > > > Since it’s only "extra" padding (i.e., it doesn’t seem strictly required
> > > > for the layout), and your patch would reuse this area — together with
> > > > the final padding introduced by `size = ALIGN(size, s->align);`
> > > 
> > > Very good point!
> > > Nah, it wasn't intentional to reuse the extra padding.
> 
> Waaaait, now I'm looking into it again to write V5...
> 
> It may reduce (or remove) the space for the final padding but not the
> mandatory padding because the mandatory padding is already included
> in the size before `aligned_size = ALIGN(size, s->align)`

Ah, right - I double-checked as well. `aligned_size - size` is exactly the
space reserved for the final padding, so slabobj_ext won't eat into the
mandatory padding.

> 
> > > > for objext, it seems like this padding may no longer provide much benefit.
> > > > Do you think it would make sense to remove this extra padding
> > > > altogether?
> > > 
> > > I think when debugging flags are enabled it'd still be useful to have,
> > 
> > Absolutely — I’m with you on this.
> > 
> > After thinking about it again, I agree it’s better to keep it.
> > 
> > Without that mandatory extra word, we could end up with "no trailing
> > padding at all" in cases where ALIGN(size, s->align) doesn’t actually
> > add any bytes.
> > 
> > > I'll try to keep the padding area after obj_ext (so that overwrites from
> > > the previous object won't overwrite the metadata).
> > 
> > Agree — we should make sure there is at least sizeof(void *) of extra
> > space after obj_exts when SLAB_RED_ZONE is enabled, so POISON_INUSE has
> > somewhere to go.
> 
> I think V4 of the patchset is already doing that, no?
> 
> The mandatory padding exists after obj_ext if SLAB_RED_ZONE is enabled
> and the final padding may or may not exist. check_pad_bytes() already knows
> that the padding(s) exist after obj_ext.

Yes, you are right, V4 already does this — I just hadn't noticed it earlier...

> 
> By the way, thanks for fixing the comment once again,
> it's easier to think about the layout now.

Glad it helped. The object layout is really subtle — missing even a
small detail was enough to throw us off, but we finally got it all
straightened out.

-- 
Thanks,
Hao