[v2] Add and use seprintf() instead of less ergonomic APIs

[RFC v2 0/5] Add and use seprintf() instead of less ergonomic APIs

Posted by Alejandro Colomar 3 months ago

Hi Kees,

I've found some more bugs in the same code.  There were three off-by-one
bugs in the code I had replaced, and while I had noticed something
weird, I hadn't stopped to think too much about them.  I've documented
the bugs, and fixed them in the last commit.  I've also added an ENDOF()
macro to prevent these off-by-one bugs when we can avoid them.
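
As a rough illustration of the idea behind ENDOF(), here is a minimal
userspace sketch.  The names ARRAY_SIZE_SKETCH() and ENDOF_SKETCH() are
hypothetical stand-ins, not the definitions from this series, and the
sketch has none of the must-be-an-array checking that the kernel's
ARRAY_SIZE() provides.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not the kernel macros. */
#define ARRAY_SIZE_SKETCH(a)  (sizeof(a) / sizeof((a)[0]))
#define ENDOF_SKETCH(a)       ((a) + ARRAY_SIZE_SKETCH(a))

/* Distance from the start of a 16-byte array to its one-past-the-end pointer. */
static ptrdiff_t endof_distance(void)
{
	char buf[16];
	char *end = ENDOF_SKETCH(buf);	/* points one past buf[15] */

	return end - buf;
}
```

Because the end pointer is derived from the array itself, there is no
hand-written '- 1' or separate size constant to get wrong.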

This time I've built the kernel, which showed I had forgotten some
prototypes, plus also one typo.

See range-diff below.

This still doesn't comply with the coding style, but is otherwise in
working order.  I'll send it as is for discussion.  Once we agree on
the specific questions about the code that I raised in v1, I'll make it
coding-style compliant.


Have a lovely Sun day!
Alex

Alejandro Colomar (5):
  vsprintf: Add [v]seprintf(), [v]stprintf()
  stacktrace, stackdepot: Add seprintf()-like variants of functions
  mm: Use seprintf() instead of less ergonomic APIs
  array_size.h: Add ENDOF()
  mm: Fix benign off-by-one bugs

 include/linux/array_size.h |   6 ++
 include/linux/sprintf.h    |   4 ++
 include/linux/stackdepot.h |  13 +++++
 include/linux/stacktrace.h |   3 +
 kernel/stacktrace.c        |  28 ++++++++++
 lib/stackdepot.c           |  12 ++++
 lib/vsprintf.c             | 109 +++++++++++++++++++++++++++++++++++++
 mm/kfence/kfence_test.c    |  28 +++++-----
 mm/kmsan/kmsan_test.c      |   6 +-
 mm/mempolicy.c             |  18 +++---
 mm/page_owner.c            |  32 ++++++-----
 mm/slub.c                  |   5 +-
 12 files changed, 221 insertions(+), 43 deletions(-)

Range-diff against v1:
1:  2d20eaf1752e ! 1:  64334f0b94d6 vsprintf: Add [v]seprintf(), [v]stprintf()
    @@ Commit message
         Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
         Signed-off-by: Alejandro Colomar <alx@kernel.org>
     
    + ## include/linux/sprintf.h ##
    +@@ include/linux/sprintf.h: __printf(2, 3) int sprintf(char *buf, const char * fmt, ...);
    + __printf(2, 0) int vsprintf(char *buf, const char *, va_list);
    + __printf(3, 4) int snprintf(char *buf, size_t size, const char *fmt, ...);
    + __printf(3, 0) int vsnprintf(char *buf, size_t size, const char *fmt, va_list args);
    ++__printf(3, 4) int stprintf(char *buf, size_t size, const char *fmt, ...);
    ++__printf(3, 0) int vstprintf(char *buf, size_t size, const char *fmt, va_list args);
    + __printf(3, 4) int scnprintf(char *buf, size_t size, const char *fmt, ...);
    + __printf(3, 0) int vscnprintf(char *buf, size_t size, const char *fmt, va_list args);
    ++__printf(3, 4) char *seprintf(char *p, const char end[0], const char *fmt, ...);
    ++__printf(3, 0) char *vseprintf(char *p, const char end[0], const char *fmt, va_list args);
    + __printf(2, 3) __malloc char *kasprintf(gfp_t gfp, const char *fmt, ...);
    + __printf(2, 0) __malloc char *kvasprintf(gfp_t gfp, const char *fmt, va_list args);
    + __printf(2, 0) const char *kvasprintf_const(gfp_t gfp, const char *fmt, va_list args);
    +
      ## lib/vsprintf.c ##
     @@ lib/vsprintf.c: int vsnprintf(char *buf, size_t size, const char *fmt_str, va_list args)
      }
2:  ec2e375c2d1e ! 2:  9c140de9842d stacktrace, stackdepot: Add seprintf()-like variants of functions
    @@ lib/stackdepot.c: int stack_depot_snprint(depot_stack_handle_t handle, char *buf
     +	unsigned int nr_entries;
     +
     +	nr_entries = stack_depot_fetch(handle, &entries);
    -+	return nr_entries ? stack_trace_seprint(p, e, entries, nr_entries,
    ++	return nr_entries ? stack_trace_seprint(p, end, entries, nr_entries,
     +						spaces) : p;
     +}
     +EXPORT_SYMBOL_GPL(stack_depot_seprint);
3:  be193e1856aa ! 3:  e3271b5f2ad9 mm: Use seprintf() instead of less ergonomic APIs
    @@ Commit message
     
         mm/kfence/kfence_test.c:
     
    -            The last call to scnprintf() did increment 'cur', but it's
    -            unused after that, so it was dead code.  I've removed the dead
    -            code in this patch.
    +            -  The last call to scnprintf() did increment 'cur', but it's
    +               unused after that, so it was dead code.  I've removed the dead
    +               code in this patch.
    +
    +            -  'end' is calculated as
    +
    +                    end = &expect[0][sizeof(expect[0]) - 1];
    +
    +               However, the '-1' doesn't seem to be necessary.  When passing
    +               $2 to scnprintf(), the size was specified as 'end - cur'.
    +               And scnprintf() --just like snprintf(3)--, won't write more
    +               than $2 bytes (including the null byte).  That means that
    +               scnprintf() wouldn't write more than
    +
    +                    &expect[0][sizeof(expect[0]) - 1] - expect[0]
    +
    +               which simplifies to
    +
    +                    sizeof(expect[0]) - 1
    +
    +               bytes.  But we have sizeof(expect[0]) bytes available, so
    +               we're wasting one byte entirely.  This is a benign off-by-one
    +               bug.  The two occurrences of this bug will be fixed in a
    +               following patch in this series.
    +
    +    mm/kmsan/kmsan_test.c:
    +
    +            The same benign off-by-one bug calculating the remaining size.
     
         mm/mempolicy.c:
     
-:  ------------ > 4:  5331d286ceca array_size.h: Add ENDOF()
-:  ------------ > 5:  08cfdd2bf779 mm: Fix benign off-by-one bugs
-- 
2.50.0

[RFC v5 0/7] Add and use sprintf_{end,array}() instead of less ergonomic APIs

Posted by Alejandro Colomar 2 months, 4 weeks ago

Hi,

Changes in v5:

-  Minor fix in commit message.
-  Rename [V]SPRINTF_END() => [v]sprintf_array(), keeping the
   implementation.

Remaining questions:

-  There are only 3 remaining calls to snprintf(3) under mm/.  They are
   just fine for now, which is why I didn't replace them.  If anyone
   wants to replace them, to get rid of all snprintf(3), we could do that.
   I think for now we can leave them, to minimize the churn.

        $ grep -rnI snprintf mm/
        mm/hugetlb_cgroup.c:674:                snprintf(buf, size, "%luGB", hsize / SZ_1G);
        mm/hugetlb_cgroup.c:676:                snprintf(buf, size, "%luMB", hsize / SZ_1M);
        mm/hugetlb_cgroup.c:678:                snprintf(buf, size, "%luKB", hsize / SZ_1K);

-  There are only 2 remaining calls to the kernel's scnprintf().  These
   I would really like to get rid of.  Also, those calls look
   suspicious: they might not do what we want.  Please do have a look at
   them and confirm what the appropriate behavior is in the 2 cases when
   the string is truncated or not copied at all.  I find that code too
   scary to guess.

        $ grep -rnI scnprintf mm/
        mm/kfence/report.c:75:          int len = scnprintf(buf, sizeof(buf), "%ps", (void *)stack_entries[skipnr]);
        mm/kfence/kfence_test.mod.c:22: { 0x96848186, "scnprintf" },
        mm/kmsan/report.c:42:           len = scnprintf(buf, sizeof(buf), "%ps",

   Apart from the two calls, I see a string literal with that name.  Please
   let me know if I should do anything about it.  I don't know what that
   is.

-  I think we should remove one error handling check in
   "mm/page_owner.c" (marked with an XXX comment), but I'm not 100%
   sure.  Please confirm.

Other comments:

-  This still doesn't comply with the coding style.  I'll keep it like that
   while questions remain open.
-  I've tested the tests under CONFIG_KFENCE_KUNIT_TEST=y, and this has
   no regressions at all.
-  With the current style of the sprintf_end() prototype, this triggers
   a diagnostic due to a GCC bug:
   <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108036>
   It would be interesting to ask GCC to fix that bug.  (Added relevant
   GCC maintainers and contributors to CC in this cover letter.)

For anyone new to the thread, sprintf_end() will be proposed for
standardization soon as seprintf():
<https://lore.kernel.org/linux-hardening/20250710024745.143955-1-alx@kernel.org/T/#u>


Have a lovely night!
Alex


Alejandro Colomar (7):
  vsprintf: Add [v]sprintf_end()
  stacktrace, stackdepot: Add sprintf_end()-like variants of functions
  mm: Use sprintf_end() instead of less ergonomic APIs
  array_size.h: Add ENDOF()
  mm: Fix benign off-by-one bugs
  sprintf: Add [v]sprintf_array()
  mm: Use [v]sprintf_array() to avoid specifying the array size

 include/linux/array_size.h |  6 ++++
 include/linux/sprintf.h    |  6 ++++
 include/linux/stackdepot.h | 13 +++++++++
 include/linux/stacktrace.h |  3 ++
 kernel/stacktrace.c        | 28 ++++++++++++++++++
 lib/stackdepot.c           | 13 +++++++++
 lib/vsprintf.c             | 59 ++++++++++++++++++++++++++++++++++++++
 mm/backing-dev.c           |  2 +-
 mm/cma.c                   |  4 +--
 mm/cma_debug.c             |  2 +-
 mm/hugetlb.c               |  3 +-
 mm/hugetlb_cgroup.c        |  2 +-
 mm/hugetlb_cma.c           |  2 +-
 mm/kasan/report.c          |  3 +-
 mm/kfence/kfence_test.c    | 28 +++++++++---------
 mm/kmsan/kmsan_test.c      |  6 ++--
 mm/memblock.c              |  4 +--
 mm/mempolicy.c             | 18 ++++++------
 mm/page_owner.c            | 32 +++++++++++----------
 mm/percpu.c                |  2 +-
 mm/shrinker_debug.c        |  2 +-
 mm/slub.c                  |  5 ++--
 mm/zswap.c                 |  2 +-
 23 files changed, 187 insertions(+), 58 deletions(-)

Range-diff against v4:
1:  2c4f793de0b8 = 1:  2c4f793de0b8 vsprintf: Add [v]sprintf_end()
2:  894d02b08056 = 2:  894d02b08056 stacktrace, stackdepot: Add sprintf_end()-like variants of functions
3:  690ed4d22f57 = 3:  690ed4d22f57 mm: Use sprintf_end() instead of less ergonomic APIs
4:  e05c5afabb3c = 4:  e05c5afabb3c array_size.h: Add ENDOF()
5:  44a5cfc82acf ! 5:  515445ae064d mm: Fix benign off-by-one bugs
    @@ Commit message
     
         We were wasting a byte due to an off-by-one bug.  s[c]nprintf()
         doesn't write more than $2 bytes including the null byte, so trying to
    -    pass 'size-1' there is wasting one byte.  Now that we use seprintf(),
    -    the situation isn't different: seprintf() will stop writing *before*
    +    pass 'size-1' there is wasting one byte.  Now that we use sprintf_end(),
    +    the situation isn't different: sprintf_end() will stop writing *before*
         'end' --that is, at most the terminating null byte will be written at
         'end-1'--.
     
6:  0314948eb225 ! 6:  04c1e026a67f sprintf: Add [V]SPRINTF_END()
    @@ Metadata
     Author: Alejandro Colomar <alx@kernel.org>
     
      ## Commit message ##
    -    sprintf: Add [V]SPRINTF_END()
    +    sprintf: Add [v]sprintf_array()
     
         These macros take the end of the array argument implicitly to avoid
         programmer mistakes.  This guarantees that the input is an array, unlike
    @@ include/linux/sprintf.h
      #include <linux/types.h>
     +#include <linux/array_size.h>
     +
    -+#define SPRINTF_END(a, fmt, ...)  sprintf_end(a, ENDOF(a), fmt, ##__VA_ARGS__)
    -+#define VSPRINTF_END(a, fmt, ap)  vsprintf_end(a, ENDOF(a), fmt, ap)
    ++#define sprintf_array(a, fmt, ...)  sprintf_end(a, ENDOF(a), fmt, ##__VA_ARGS__)
    ++#define vsprintf_array(a, fmt, ap)  vsprintf_end(a, ENDOF(a), fmt, ap)
      
      int num_to_str(char *buf, int size, unsigned long long num, unsigned int width);
      
7:  f99632f42eee ! 7:  e53d87e684ef mm: Use [V]SPRINTF_END() to avoid specifying the array size
    @@ Metadata
     Author: Alejandro Colomar <alx@kernel.org>
     
      ## Commit message ##
    -    mm: Use [V]SPRINTF_END() to avoid specifying the array size
    +    mm: Use [v]sprintf_array() to avoid specifying the array size
     
         Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
         Cc: Marco Elver <elver@google.com>
    @@ mm/backing-dev.c: int bdi_register_va(struct backing_dev_info *bdi, const char *
      		return 0;
      
     -	vsnprintf(bdi->dev_name, sizeof(bdi->dev_name), fmt, args);
    -+	VSPRINTF_END(bdi->dev_name, fmt, args);
    ++	vsprintf_array(bdi->dev_name, fmt, args);
      	dev = device_create(&bdi_class, NULL, MKDEV(0, 0), bdi, bdi->dev_name);
      	if (IS_ERR(dev))
      		return PTR_ERR(dev);
    @@ mm/cma.c: static int __init cma_new_area(const char *name, phys_addr_t size,
      
      	if (name)
     -		snprintf(cma->name, CMA_MAX_NAME, "%s", name);
    -+		SPRINTF_END(cma->name, "%s", name);
    ++		sprintf_array(cma->name, "%s", name);
      	else
     -		snprintf(cma->name, CMA_MAX_NAME,  "cma%d\n", cma_area_count);
    -+		SPRINTF_END(cma->name, "cma%d\n", cma_area_count);
    ++		sprintf_array(cma->name, "cma%d\n", cma_area_count);
      
      	cma->available_count = cma->count = size >> PAGE_SHIFT;
      	cma->order_per_bit = order_per_bit;
    @@ mm/cma_debug.c: static void cma_debugfs_add_one(struct cma *cma, struct dentry *
      	for (r = 0; r < cma->nranges; r++) {
      		cmr = &cma->ranges[r];
     -		snprintf(rdirname, sizeof(rdirname), "%d", r);
    -+		SPRINTF_END(rdirname, "%d", r);
    ++		sprintf_array(rdirname, "%d", r);
      		dir = debugfs_create_dir(rdirname, rangedir);
      		debugfs_create_file("base_pfn", 0444, dir,
      			    &cmr->base_pfn, &cma_debugfs_fops);
    @@ mm/hugetlb.c: void __init hugetlb_add_hstate(unsigned int order)
      	INIT_LIST_HEAD(&h->hugepage_activelist);
     -	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
     -					huge_page_size(h)/SZ_1K);
    -+	SPRINTF_END(h->name, "hugepages-%lukB", huge_page_size(h)/SZ_1K);
    ++	sprintf_array(h->name, "hugepages-%lukB", huge_page_size(h)/SZ_1K);
      
      	parsed_hstate = h;
      }
    @@ mm/hugetlb_cgroup.c: hugetlb_cgroup_cfttypes_init(struct hstate *h, struct cftyp
      		*cft = *tmpl;
      		/* rebuild the name */
     -		snprintf(cft->name, MAX_CFTYPE_NAME, "%s.%s", buf, tmpl->name);
    -+		SPRINTF_END(cft->name, "%s.%s", buf, tmpl->name);
    ++		sprintf_array(cft->name, "%s.%s", buf, tmpl->name);
      		/* rebuild the private */
      		cft->private = MEMFILE_PRIVATE(idx, tmpl->private);
      		/* rebuild the file_offset */
    @@ mm/hugetlb_cma.c: void __init hugetlb_cma_reserve(int order)
      		size = round_up(size, PAGE_SIZE << order);
      
     -		snprintf(name, sizeof(name), "hugetlb%d", nid);
    -+		SPRINTF_END(name, "hugetlb%d", nid);
    ++		sprintf_array(name, "hugetlb%d", nid);
      		/*
      		 * Note that 'order per bit' is based on smallest size that
      		 * may be returned to CMA allocator in the case of
    @@ mm/kasan/report.c: static void print_memory_metadata(const void *addr)
      
     -		snprintf(buffer, sizeof(buffer),
     -				(i == 0) ? ">%px: " : " %px: ", row);
    -+		SPRINTF_END(buffer, (i == 0) ? ">%px: " : " %px: ", row);
    ++		sprintf_array(buffer, (i == 0) ? ">%px: " : " %px: ", row);
      
      		/*
      		 * We should not pass a shadow pointer to generic
    @@ mm/memblock.c: static void __init_memblock memblock_dump(struct memblock_type *t
      #ifdef CONFIG_NUMA
      		if (numa_valid_node(memblock_get_region_node(rgn)))
     -			snprintf(nid_buf, sizeof(nid_buf), " on node %d",
    -+			SPRINTF_END(nid_buf, " on node %d",
    ++			sprintf_array(nid_buf, " on node %d",
      				 memblock_get_region_node(rgn));
      #endif
      		pr_info(" %s[%#x]\t[%pa-%pa], %pa bytes%s flags: %#x\n",
    @@ mm/memblock.c: int reserve_mem_release_by_name(const char *name)
      	start = phys_to_virt(map->start);
      	end = start + map->size - 1;
     -	snprintf(buf, sizeof(buf), "reserve_mem:%s", name);
    -+	SPRINTF_END(buf, "reserve_mem:%s", name);
    ++	sprintf_array(buf, "reserve_mem:%s", name);
      	free_reserved_area(start, end, 0, buf);
      	map->size = 0;
      
    @@ mm/percpu.c: int __init pcpu_page_first_chunk(size_t reserved_size, pcpu_fc_cpu_
      	int nr_g0_units;
      
     -	snprintf(psize_str, sizeof(psize_str), "%luK", PAGE_SIZE >> 10);
    -+	SPRINTF_END(psize_str, "%luK", PAGE_SIZE >> 10);
    ++	sprintf_array(psize_str, "%luK", PAGE_SIZE >> 10);
      
      	ai = pcpu_build_alloc_info(reserved_size, 0, PAGE_SIZE, NULL);
      	if (IS_ERR(ai))
    @@ mm/shrinker_debug.c: int shrinker_debugfs_add(struct shrinker *shrinker)
      	shrinker->debugfs_id = id;
      
     -	snprintf(buf, sizeof(buf), "%s-%d", shrinker->name, id);
    -+	SPRINTF_END(buf, "%s-%d", shrinker->name, id);
    ++	sprintf_array(buf, "%s-%d", shrinker->name, id);
      
      	/* create debugfs entry */
      	entry = debugfs_create_dir(buf, shrinker_debugfs_root);
    @@ mm/zswap.c: static struct zswap_pool *zswap_pool_create(char *type, char *compre
      
      	/* unique name for each pool specifically required by zsmalloc */
     -	snprintf(name, 38, "zswap%x", atomic_inc_return(&zswap_pools_count));
    -+	SPRINTF_END(name, "zswap%x", atomic_inc_return(&zswap_pools_count));
    ++	sprintf_array(name, "zswap%x", atomic_inc_return(&zswap_pools_count));
      	pool->zpool = zpool_create_pool(type, name, gfp);
      	if (!pool->zpool) {
      		pr_err("%s zpool not available\n", type);
-- 
2.50.0

[RFC v5 1/7] vsprintf: Add [v]sprintf_end()

Posted by Alejandro Colomar 2 months, 4 weeks ago

sprintf_end() is a function similar to stpcpy(3) in the sense that it
returns a pointer that is suitable for chaining to other copy
operations.

It takes a pointer to the end of the buffer as a sentinel for when to
truncate, which unlike a size, doesn't need to be updated after every
call.  This makes it much more ergonomic, avoiding manually calculating
the size after each copy, which is error prone.

It also makes error handling much easier, by reporting truncation with
a null pointer, which is accepted and transparently passed down by
subsequent sprintf_end() calls.  This results in only needing to report
errors once after a chain of sprintf_end() calls, unlike snprintf(3),
which requires checking after every call.

	p = buf;
	e = buf + countof(buf);
	p = sprintf_end(p, e, foo);
	p = sprintf_end(p, e, bar);
	if (p == NULL)
		goto trunc;

vs

	len = 0;
	size = countof(buf);
	len += snprintf(buf + len, size - len, foo);
	if (len >= size)
		goto trunc;

	len += snprintf(buf + len, size - len, bar);
	if (len >= size)
		goto trunc;

And also better than scnprintf() calls:

	len = 0;
	size = countof(buf);
	len += scnprintf(buf + len, size - len, foo);
	len += scnprintf(buf + len, size - len, bar);
	// No ability to check.

It seems apparent that it's a more elegant approach to string catenation.
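
The semantics described above can be approximated in userspace with a
minimal sketch built on the standard vsnprintf(3).  The names
vsprintf_end_sketch(), sprintf_end_sketch() and chain_fits() are
hypothetical, and the sketch omits the kernel-specific parts (WARN_ON_ONCE(),
unlikely(), format extensions); it only shows how a null pointer
propagates through a chain of calls.

```c
#include <assert.h>
#include <limits.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace approximation: returns a pointer to the trailing '\0',
 * or NULL if p is NULL, if end <= p, or if the output was truncated. */
static char *vsprintf_end_sketch(char *p, const char *end,
				 const char *fmt, va_list args)
{
	int len;
	size_t size;

	if (p == NULL)
		return NULL;
	if (end <= p)
		return NULL;

	size = (size_t)(end - p);
	if (size > INT_MAX)
		return NULL;

	len = vsnprintf(p, size, fmt, args);
	if (len < 0 || (size_t)len >= size)
		return NULL;

	return p + len;
}

static char *sprintf_end_sketch(char *p, const char *end, const char *fmt, ...)
{
	va_list args;

	va_start(args, fmt);
	p = vsprintf_end_sketch(p, end, fmt, args);
	va_end(args);

	return p;
}

/* Chains two calls into the first n bytes (n <= 32) of a buffer and
 * checks for truncation once, after the chain.  Returns 1 if "foobar"
 * fit, 0 otherwise. */
static int chain_fits(size_t n)
{
	char buf[32];
	char *p = buf;
	const char *e = buf + n;

	p = sprintf_end_sketch(p, e, "%s", "foo");
	p = sprintf_end_sketch(p, e, "%s", "bar");	/* NULL passes through */

	return p != NULL;
}
```

"foobar" needs 7 bytes including the null byte, so chain_fits(7)
succeeds and chain_fits(6) reports truncation, with a single check at
the end of the chain in both cases.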

These functions will soon be proposed for standardization as
[v]seprintf() into C2y, and they exist in Plan9 as seprint(2) --but the
Plan9 implementation has important bugs--.

Link: <https://www.alejandro-colomar.es/src/alx/alx/wg14/alx-0049.git/tree/alx-0049.txt>
Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 include/linux/sprintf.h |  2 ++
 lib/vsprintf.c          | 59 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)

diff --git a/include/linux/sprintf.h b/include/linux/sprintf.h
index 51cab2def9ec..a0dc35574521 100644
--- a/include/linux/sprintf.h
+++ b/include/linux/sprintf.h
@@ -13,6 +13,8 @@ __printf(3, 4) int snprintf(char *buf, size_t size, const char *fmt, ...);
 __printf(3, 0) int vsnprintf(char *buf, size_t size, const char *fmt, va_list args);
 __printf(3, 4) int scnprintf(char *buf, size_t size, const char *fmt, ...);
 __printf(3, 0) int vscnprintf(char *buf, size_t size, const char *fmt, va_list args);
+__printf(3, 4) char *sprintf_end(char *p, const char end[0], const char *fmt, ...);
+__printf(3, 0) char *vsprintf_end(char *p, const char end[0], const char *fmt, va_list args);
 __printf(2, 3) __malloc char *kasprintf(gfp_t gfp, const char *fmt, ...);
 __printf(2, 0) __malloc char *kvasprintf(gfp_t gfp, const char *fmt, va_list args);
 __printf(2, 0) const char *kvasprintf_const(gfp_t gfp, const char *fmt, va_list args);
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 01699852f30c..d32df53a713a 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -2923,6 +2923,40 @@ int vscnprintf(char *buf, size_t size, const char *fmt, va_list args)
 }
 EXPORT_SYMBOL(vscnprintf);
 
+/**
+ * vsprintf_end - va_list string end-delimited print formatted
+ * @p: The buffer to place the result into
+ * @end: A pointer to one past the last character in the buffer
+ * @fmt: The format string to use
+ * @args: Arguments for the format string
+ *
+ * The return value is a pointer to the trailing '\0'.
+ * If @p is NULL, the function returns NULL.
+ * If the string is truncated, the function returns NULL.
+ * If @end <= @p, the function returns NULL.
+ *
+ * See the vsnprintf() documentation for format string extensions over C99.
+ */
+char *vsprintf_end(char *p, const char end[0], const char *fmt, va_list args)
+{
+	int len;
+	size_t size;
+
+	if (unlikely(p == NULL))
+		return NULL;
+
+	size = end - p;
+	if (WARN_ON_ONCE(size == 0 || size > INT_MAX))
+		return NULL;
+
+	len = vsnprintf(p, size, fmt, args);
+	if (unlikely(len >= size))
+		return NULL;
+
+	return p + len;
+}
+EXPORT_SYMBOL(vsprintf_end);
+
 /**
  * snprintf - Format a string and place it in a buffer
  * @buf: The buffer to place the result into
@@ -2974,6 +3008,31 @@ int scnprintf(char *buf, size_t size, const char *fmt, ...)
 }
 EXPORT_SYMBOL(scnprintf);
 
+/**
+ * sprintf_end - string end-delimited print formatted
+ * @p: The buffer to place the result into
+ * @end: A pointer to one past the last character in the buffer
+ * @fmt: The format string to use
+ * @...: Arguments for the format string
+ *
+ * The return value is a pointer to the trailing '\0'.
+ * If @p is NULL, the function returns NULL.
+ * If the string is truncated, the function returns NULL.
+ * If @end <= @p, the function returns NULL.
+ */
+
+char *sprintf_end(char *p, const char end[0], const char *fmt, ...)
+{
+	va_list args;
+
+	va_start(args, fmt);
+	p = vsprintf_end(p, end, fmt, args);
+	va_end(args);
+
+	return p;
+}
+EXPORT_SYMBOL(sprintf_end);
+
 /**
  * vsprintf - Format a string and place it in a buffer
  * @buf: The buffer to place the result into
-- 
2.50.0

[RFC v5 2/7] stacktrace, stackdepot: Add sprintf_end()-like variants of functions

Posted by Alejandro Colomar 2 months, 4 weeks ago

Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 include/linux/stackdepot.h | 13 +++++++++++++
 include/linux/stacktrace.h |  3 +++
 kernel/stacktrace.c        | 28 ++++++++++++++++++++++++++++
 lib/stackdepot.c           | 13 +++++++++++++
 4 files changed, 57 insertions(+)

diff --git a/include/linux/stackdepot.h b/include/linux/stackdepot.h
index 2cc21ffcdaf9..76182e874f67 100644
--- a/include/linux/stackdepot.h
+++ b/include/linux/stackdepot.h
@@ -219,6 +219,19 @@ void stack_depot_print(depot_stack_handle_t stack);
 int stack_depot_snprint(depot_stack_handle_t handle, char *buf, size_t size,
 		       int spaces);
 
+/**
+ * stack_depot_sprint_end - Print a stack trace from stack depot into a buffer
+ *
+ * @handle:	Stack depot handle returned from stack_depot_save()
+ * @p:		Pointer to the print buffer
+ * @end:	Pointer to one past the last element in the buffer
+ * @spaces:	Number of leading spaces to print
+ *
+ * Return:	Pointer to trailing '\0'; or NULL on truncation
+ */
+char *stack_depot_sprint_end(depot_stack_handle_t handle, char *p,
+                             const char end[0], int spaces);
+
 /**
  * stack_depot_put - Drop a reference to a stack trace from stack depot
  *
diff --git a/include/linux/stacktrace.h b/include/linux/stacktrace.h
index 97455880ac41..79ada795d479 100644
--- a/include/linux/stacktrace.h
+++ b/include/linux/stacktrace.h
@@ -67,6 +67,9 @@ void stack_trace_print(const unsigned long *trace, unsigned int nr_entries,
 		       int spaces);
 int stack_trace_snprint(char *buf, size_t size, const unsigned long *entries,
 			unsigned int nr_entries, int spaces);
+char *stack_trace_sprint_end(char *p, const char end[0],
+			     const unsigned long *entries,
+			     unsigned int nr_entries, int spaces);
 unsigned int stack_trace_save(unsigned long *store, unsigned int size,
 			      unsigned int skipnr);
 unsigned int stack_trace_save_tsk(struct task_struct *task,
diff --git a/kernel/stacktrace.c b/kernel/stacktrace.c
index afb3c116da91..f389647d8e44 100644
--- a/kernel/stacktrace.c
+++ b/kernel/stacktrace.c
@@ -70,6 +70,34 @@ int stack_trace_snprint(char *buf, size_t size, const unsigned long *entries,
 }
 EXPORT_SYMBOL_GPL(stack_trace_snprint);
 
+/**
+ * stack_trace_sprint_end - Print the entries in the stack trace into a buffer
+ * @p:		Pointer to the print buffer
+ * @end:	Pointer to one past the last element in the buffer
+ * @entries:	Pointer to storage array
+ * @nr_entries:	Number of entries in the storage array
+ * @spaces:	Number of leading spaces to print
+ *
+ * Return: Pointer to the trailing '\0'; or NULL on truncation.
+ */
+char *stack_trace_sprint_end(char *p, const char end[0],
+			  const unsigned long *entries, unsigned int nr_entries,
+			  int spaces)
+{
+	unsigned int i;
+
+	if (WARN_ON(!entries))
+		return NULL;
+
+	for (i = 0; i < nr_entries; i++) {
+		p = sprintf_end(p, end, "%*c%pS\n", 1 + spaces, ' ',
+			     (void *)entries[i]);
+	}
+
+	return p;
+}
+EXPORT_SYMBOL_GPL(stack_trace_sprint_end);
+
 #ifdef CONFIG_ARCH_STACKWALK
 
 struct stacktrace_cookie {
diff --git a/lib/stackdepot.c b/lib/stackdepot.c
index 73d7b50924ef..48e5c0ff37e8 100644
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -771,6 +771,19 @@ int stack_depot_snprint(depot_stack_handle_t handle, char *buf, size_t size,
 }
 EXPORT_SYMBOL_GPL(stack_depot_snprint);
 
+char *stack_depot_sprint_end(depot_stack_handle_t handle, char *p,
+			     const char end[0], int spaces)
+{
+	unsigned long *entries;
+	unsigned int nr_entries;
+
+	nr_entries = stack_depot_fetch(handle, &entries);
+	return nr_entries ?
+		stack_trace_sprint_end(p, end, entries, nr_entries, spaces)
+		: sprintf_end(p, end, "");
+}
+EXPORT_SYMBOL_GPL(stack_depot_sprint_end);
+
 depot_stack_handle_t __must_check stack_depot_set_extra_bits(
 			depot_stack_handle_t handle, unsigned int extra_bits)
 {
-- 
2.50.0

[RFC v5 3/7] mm: Use sprintf_end() instead of less ergonomic APIs

Posted by Alejandro Colomar 2 months, 4 weeks ago

While doing this, I detected some anomalies in the existing code:

mm/kfence/kfence_test.c:

	-  The last call to scnprintf() did increment 'cur', but it's
	   unused after that, so it was dead code.  I've removed the dead
	   code in this patch.

	-  'end' is calculated as

		end = &expect[0][sizeof(expect[0]) - 1];

	   However, the '-1' doesn't seem to be necessary.  When passing
	   $2 to scnprintf(), the size was specified as 'end - cur'.
	   And scnprintf() --just like snprintf(3)--, won't write more
	   than $2 bytes (including the null byte).  That means that
	   scnprintf() wouldn't write more than

		&expect[0][sizeof(expect[0]) - 1] - expect[0]

	   which simplifies to

		sizeof(expect[0]) - 1

	   bytes.  But we have sizeof(expect[0]) bytes available, so
	   we're wasting one byte entirely.  This is a benign off-by-one
	   bug.  The two occurrences of this bug will be fixed in a
	   following patch in this series.
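
The wasted byte can be seen with a small userspace sketch using the
standard snprintf(3), which has the same size semantics as described
above.  The helper stored_length() is hypothetical and exists only for
this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Formats a 7-character string into an 8-byte array, telling snprintf()
 * that only 'size' bytes (1 <= size <= 8) are available, and returns the
 * length of the string that was actually stored. */
static size_t stored_length(size_t size)
{
	char buf[8];

	snprintf(buf, size, "%s", "1234567");

	return strlen(buf);
}
```

With the full size, stored_length(8) keeps all 7 characters plus the
null byte.  With 'size - 1', stored_length(7) keeps only 6 characters,
although the array had room for the seventh.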

mm/kmsan/kmsan_test.c:

	The same benign off-by-one bug calculating the remaining size.

mm/mempolicy.c:

	This file uses the 'p += snprintf()' anti-pattern.  That will
	overflow the pointer on truncation, which has undefined
	behavior.  Using sprintf_end(), this bug is fixed.

	As in the previous file, here there was also dead code in the
	last scnprintf() call, by incrementing a pointer that is not
	used after the call.  I've removed the dead code.
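
The reason the 'p += snprintf()' pattern misbehaves on truncation is
the return value: snprintf() returns the length the output would have
had, not the length written.  A minimal userspace sketch follows; the
helper reported_length() is hypothetical, and it returns the value
instead of adding it to a pointer so that the sketch itself has no
undefined behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Returns what snprintf() reports for a 10-character string when it is
 * told that only 'size' bytes (1 <= size <= 16) are available. */
static int reported_length(size_t size)
{
	char buf[16];

	return snprintf(buf, size, "%s", "0123456789");
}
```

reported_length(4) is 10 even though at most 4 bytes were written, so
'p += snprintf(p, 4, ...)' would leave 'p' 10 bytes past where it
started, beyond the 4-byte region it was given.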

mm/page_owner.c:

	Within print_page_owner(), there are some calls to scnprintf(),
	which don't report truncation.  And then there are other calls to
	snprintf(), where we handle errors (there are two 'goto err').

	I've kept the existing error handling, as I trust it's there for
	a good reason (i.e., we may want to avoid calling
	print_page_owner_memcg() if we truncated before).  Please review
	if this amount of error handling is the right one, or if we want
	to add or remove some.  For sprintf_end(), a single test for
	null after the last call is enough to detect truncation.

mm/slub.c:

	Again, the 'p += snprintf()' anti-pattern.  This is UB, and by
	using sprintf_end() we've fixed the bug.

Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Marco Elver <elver@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Chao Yu <chao.yu@oppo.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 mm/kfence/kfence_test.c | 24 ++++++++++++------------
 mm/kmsan/kmsan_test.c   |  4 ++--
 mm/mempolicy.c          | 18 +++++++++---------
 mm/page_owner.c         | 32 +++++++++++++++++---------------
 mm/slub.c               |  5 +++--
 5 files changed, 43 insertions(+), 40 deletions(-)

diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
index 00034e37bc9f..bae382eca4ab 100644
--- a/mm/kfence/kfence_test.c
+++ b/mm/kfence/kfence_test.c
@@ -113,26 +113,26 @@ static bool report_matches(const struct expect_report *r)
 	end = &expect[0][sizeof(expect[0]) - 1];
 	switch (r->type) {
 	case KFENCE_ERROR_OOB:
-		cur += scnprintf(cur, end - cur, "BUG: KFENCE: out-of-bounds %s",
+		cur = sprintf_end(cur, end, "BUG: KFENCE: out-of-bounds %s",
 				 get_access_type(r));
 		break;
 	case KFENCE_ERROR_UAF:
-		cur += scnprintf(cur, end - cur, "BUG: KFENCE: use-after-free %s",
+		cur = sprintf_end(cur, end, "BUG: KFENCE: use-after-free %s",
 				 get_access_type(r));
 		break;
 	case KFENCE_ERROR_CORRUPTION:
-		cur += scnprintf(cur, end - cur, "BUG: KFENCE: memory corruption");
+		cur = sprintf_end(cur, end, "BUG: KFENCE: memory corruption");
 		break;
 	case KFENCE_ERROR_INVALID:
-		cur += scnprintf(cur, end - cur, "BUG: KFENCE: invalid %s",
+		cur = sprintf_end(cur, end, "BUG: KFENCE: invalid %s",
 				 get_access_type(r));
 		break;
 	case KFENCE_ERROR_INVALID_FREE:
-		cur += scnprintf(cur, end - cur, "BUG: KFENCE: invalid free");
+		cur = sprintf_end(cur, end, "BUG: KFENCE: invalid free");
 		break;
 	}
 
-	scnprintf(cur, end - cur, " in %pS", r->fn);
+	sprintf_end(cur, end, " in %pS", r->fn);
 	/* The exact offset won't match, remove it; also strip module name. */
 	cur = strchr(expect[0], '+');
 	if (cur)
@@ -144,26 +144,26 @@ static bool report_matches(const struct expect_report *r)
 
 	switch (r->type) {
 	case KFENCE_ERROR_OOB:
-		cur += scnprintf(cur, end - cur, "Out-of-bounds %s at", get_access_type(r));
+		cur = sprintf_end(cur, end, "Out-of-bounds %s at", get_access_type(r));
 		addr = arch_kfence_test_address(addr);
 		break;
 	case KFENCE_ERROR_UAF:
-		cur += scnprintf(cur, end - cur, "Use-after-free %s at", get_access_type(r));
+		cur = sprintf_end(cur, end, "Use-after-free %s at", get_access_type(r));
 		addr = arch_kfence_test_address(addr);
 		break;
 	case KFENCE_ERROR_CORRUPTION:
-		cur += scnprintf(cur, end - cur, "Corrupted memory at");
+		cur = sprintf_end(cur, end, "Corrupted memory at");
 		break;
 	case KFENCE_ERROR_INVALID:
-		cur += scnprintf(cur, end - cur, "Invalid %s at", get_access_type(r));
+		cur = sprintf_end(cur, end, "Invalid %s at", get_access_type(r));
 		addr = arch_kfence_test_address(addr);
 		break;
 	case KFENCE_ERROR_INVALID_FREE:
-		cur += scnprintf(cur, end - cur, "Invalid free of");
+		cur = sprintf_end(cur, end, "Invalid free of");
 		break;
 	}
 
-	cur += scnprintf(cur, end - cur, " 0x%p", (void *)addr);
+	sprintf_end(cur, end, " 0x%p", (void *)addr);
 
 	spin_lock_irqsave(&observed.lock, flags);
 	if (!report_available())
diff --git a/mm/kmsan/kmsan_test.c b/mm/kmsan/kmsan_test.c
index 9733a22c46c1..e48ca1972ff3 100644
--- a/mm/kmsan/kmsan_test.c
+++ b/mm/kmsan/kmsan_test.c
@@ -107,9 +107,9 @@ static bool report_matches(const struct expect_report *r)
 	cur = expected_header;
 	end = &expected_header[sizeof(expected_header) - 1];
 
-	cur += scnprintf(cur, end - cur, "BUG: KMSAN: %s", r->error_type);
+	cur = sprintf_end(cur, end, "BUG: KMSAN: %s", r->error_type);
 
-	scnprintf(cur, end - cur, " in %s", r->symbol);
+	sprintf_end(cur, end, " in %s", r->symbol);
 	/* The exact offset won't match, remove it; also strip module name. */
 	cur = strchr(expected_header, '+');
 	if (cur)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index b28a1e6ae096..6beb2710f97c 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -3359,6 +3359,7 @@ int mpol_parse_str(char *str, struct mempolicy **mpol)
 void mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol)
 {
 	char *p = buffer;
+	char *e = buffer + maxlen;
 	nodemask_t nodes = NODE_MASK_NONE;
 	unsigned short mode = MPOL_DEFAULT;
 	unsigned short flags = 0;
@@ -3384,33 +3385,32 @@ void mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol)
 		break;
 	default:
 		WARN_ON_ONCE(1);
-		snprintf(p, maxlen, "unknown");
+		sprintf_end(p, e, "unknown");
 		return;
 	}
 
-	p += snprintf(p, maxlen, "%s", policy_modes[mode]);
+	p = sprintf_end(p, e, "%s", policy_modes[mode]);
 
 	if (flags & MPOL_MODE_FLAGS) {
-		p += snprintf(p, buffer + maxlen - p, "=");
+		p = sprintf_end(p, e, "=");
 
 		/*
 		 * Static and relative are mutually exclusive.
 		 */
 		if (flags & MPOL_F_STATIC_NODES)
-			p += snprintf(p, buffer + maxlen - p, "static");
+			p = sprintf_end(p, e, "static");
 		else if (flags & MPOL_F_RELATIVE_NODES)
-			p += snprintf(p, buffer + maxlen - p, "relative");
+			p = sprintf_end(p, e, "relative");
 
 		if (flags & MPOL_F_NUMA_BALANCING) {
 			if (!is_power_of_2(flags & MPOL_MODE_FLAGS))
-				p += snprintf(p, buffer + maxlen - p, "|");
-			p += snprintf(p, buffer + maxlen - p, "balancing");
+				p = sprintf_end(p, e, "|");
+			p = sprintf_end(p, e, "balancing");
 		}
 	}
 
 	if (!nodes_empty(nodes))
-		p += scnprintf(p, buffer + maxlen - p, ":%*pbl",
-			       nodemask_pr_args(&nodes));
+		sprintf_end(p, e, ":%*pbl", nodemask_pr_args(&nodes));
 }
 
 #ifdef CONFIG_SYSFS
diff --git a/mm/page_owner.c b/mm/page_owner.c
index cc4a6916eec6..c00b3be01540 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -496,7 +496,7 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m,
 /*
  * Looking for memcg information and print it out
  */
-static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
+static inline char *print_page_owner_memcg(char *p, const char end[0],
 					 struct page *page)
 {
 #ifdef CONFIG_MEMCG
@@ -511,8 +511,7 @@ static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
 		goto out_unlock;
 
 	if (memcg_data & MEMCG_DATA_OBJEXTS)
-		ret += scnprintf(kbuf + ret, count - ret,
-				"Slab cache page\n");
+		p = sprintf_end(p, end, "Slab cache page\n");
 
 	memcg = page_memcg_check(page);
 	if (!memcg)
@@ -520,7 +519,7 @@ static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
 
 	online = (memcg->css.flags & CSS_ONLINE);
 	cgroup_name(memcg->css.cgroup, name, sizeof(name));
-	ret += scnprintf(kbuf + ret, count - ret,
+	p = sprintf_end(p, end,
 			"Charged %sto %smemcg %s\n",
 			PageMemcgKmem(page) ? "(via objcg) " : "",
 			online ? "" : "offline ",
@@ -529,7 +528,7 @@ static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
 	rcu_read_unlock();
 #endif /* CONFIG_MEMCG */
 
-	return ret;
+	return p;
 }
 
 static ssize_t
@@ -538,14 +537,16 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
 		depot_stack_handle_t handle)
 {
 	int ret, pageblock_mt, page_mt;
-	char *kbuf;
+	char *kbuf, *p, *e;
 
 	count = min_t(size_t, count, PAGE_SIZE);
 	kbuf = kmalloc(count, GFP_KERNEL);
 	if (!kbuf)
 		return -ENOMEM;
 
-	ret = scnprintf(kbuf, count,
+	p = kbuf;
+	e = kbuf + count;
+	p = sprintf_end(p, e,
 			"Page allocated via order %u, mask %#x(%pGg), pid %d, tgid %d (%s), ts %llu ns\n",
 			page_owner->order, page_owner->gfp_mask,
 			&page_owner->gfp_mask, page_owner->pid,
@@ -555,7 +556,7 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
 	/* Print information relevant to grouping pages by mobility */
 	pageblock_mt = get_pageblock_migratetype(page);
 	page_mt  = gfp_migratetype(page_owner->gfp_mask);
-	ret += scnprintf(kbuf + ret, count - ret,
+	p = sprintf_end(p, e,
 			"PFN 0x%lx type %s Block %lu type %s Flags %pGp\n",
 			pfn,
 			migratetype_names[page_mt],
@@ -563,22 +564,23 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
 			migratetype_names[pageblock_mt],
 			&page->flags);
 
-	ret += stack_depot_snprint(handle, kbuf + ret, count - ret, 0);
-	if (ret >= count)
-		goto err;
+	p = stack_depot_sprint_end(handle, p, e, 0);
+	if (p == NULL)
+		goto err;  // XXX: Should we remove this error handling?
 
 	if (page_owner->last_migrate_reason != -1) {
-		ret += scnprintf(kbuf + ret, count - ret,
+		p = sprintf_end(p, e,
 			"Page has been migrated, last migrate reason: %s\n",
 			migrate_reason_names[page_owner->last_migrate_reason]);
 	}
 
-	ret = print_page_owner_memcg(kbuf, count, ret, page);
+	p = print_page_owner_memcg(p, e, page);
 
-	ret += snprintf(kbuf + ret, count - ret, "\n");
-	if (ret >= count)
+	p = sprintf_end(p, e, "\n");
+	if (p == NULL)
 		goto err;
 
+	ret = p - kbuf;
 	if (copy_to_user(buf, kbuf, ret))
 		ret = -EFAULT;
 
diff --git a/mm/slub.c b/mm/slub.c
index be8b09e09d30..dcc857676857 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -7451,6 +7451,7 @@ static char *create_unique_id(struct kmem_cache *s)
 {
 	char *name = kmalloc(ID_STR_LENGTH, GFP_KERNEL);
 	char *p = name;
+	char *e = name + ID_STR_LENGTH;
 
 	if (!name)
 		return ERR_PTR(-ENOMEM);
@@ -7475,9 +7476,9 @@ static char *create_unique_id(struct kmem_cache *s)
 		*p++ = 'A';
 	if (p != name + 1)
 		*p++ = '-';
-	p += snprintf(p, ID_STR_LENGTH - (p - name), "%07u", s->size);
+	p = sprintf_end(p, e, "%07u", s->size);
 
-	if (WARN_ON(p > name + ID_STR_LENGTH - 1)) {
+	if (WARN_ON(p == NULL)) {
 		kfree(name);
 		return ERR_PTR(-EINVAL);
 	}
-- 
2.50.0
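
Editorial note on the mm/mempolicy.c and mm/slub.c hunks above: the
replaced code advanced 'p' by the return value of snprintf(), which is
the length the output would have had without truncation, not the number
of bytes actually stored.  A minimal userspace sketch of that behaviour
follows; snprintf_advance() is a hypothetical helper written for this
note, not code from the patch.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Returns how far 'p += snprintf(p, room, ...)' would advance 'p' when
 * formatting 'word' into a buffer that has only 'room' bytes left.
 * Hypothetical helper for illustration; 'room' must be at most 16.
 */
static int snprintf_advance(size_t room, const char *word)
{
	char buf[16];

	assert(room <= sizeof(buf));
	/* snprintf() reports the untruncated length, even if it stored less. */
	return snprintf(buf, room, "%s", word);
}
```

With room = 4 and the word "static", the function returns 6, so a
pointer advanced by that amount would land past the end of the 4 bytes
that were available.  scnprintf() and the proposed sprintf_end() avoid
this by never reporting a position beyond the buffer.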

[RFC v5 4/7] array_size.h: Add ENDOF()

Posted by Alejandro Colomar 2 months, 4 weeks ago

This macro is useful to calculate the second argument to sprintf_end(),
avoiding off-by-one bugs.

Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 include/linux/array_size.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/array_size.h b/include/linux/array_size.h
index 06d7d83196ca..781bdb70d939 100644
--- a/include/linux/array_size.h
+++ b/include/linux/array_size.h
@@ -10,4 +10,10 @@
  */
 #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
 
+/**
+ * ENDOF - get a pointer to one past the last element in array @a
+ * @a: array
+ */
+#define ENDOF(a)  (a + ARRAY_SIZE(a))
+
 #endif  /* _LINUX_ARRAY_SIZE_H */
-- 
2.50.0

[RFC v5 5/7] mm: Fix benign off-by-one bugs

Posted by Alejandro Colomar 2 months, 4 weeks ago

We were wasting a byte due to an off-by-one bug.  s[c]nprintf() never
writes more than 'size' bytes (its second argument), including the
terminating null byte, so passing 'size-1' there wastes one byte.  Now
that we use sprintf_end(), the situation is no different: sprintf_end()
stops writing *before* 'end'; that is, the terminating null byte is
written at 'end-1' at the latest.

Acked-by: Marco Elver <elver@google.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 mm/kfence/kfence_test.c | 4 ++--
 mm/kmsan/kmsan_test.c   | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
index bae382eca4ab..c635aa9d478b 100644
--- a/mm/kfence/kfence_test.c
+++ b/mm/kfence/kfence_test.c
@@ -110,7 +110,7 @@ static bool report_matches(const struct expect_report *r)
 
 	/* Title */
 	cur = expect[0];
-	end = &expect[0][sizeof(expect[0]) - 1];
+	end = ENDOF(expect[0]);
 	switch (r->type) {
 	case KFENCE_ERROR_OOB:
 		cur = sprintf_end(cur, end, "BUG: KFENCE: out-of-bounds %s",
@@ -140,7 +140,7 @@ static bool report_matches(const struct expect_report *r)
 
 	/* Access information */
 	cur = expect[1];
-	end = &expect[1][sizeof(expect[1]) - 1];
+	end = ENDOF(expect[1]);
 
 	switch (r->type) {
 	case KFENCE_ERROR_OOB:
diff --git a/mm/kmsan/kmsan_test.c b/mm/kmsan/kmsan_test.c
index e48ca1972ff3..9bda55992e3d 100644
--- a/mm/kmsan/kmsan_test.c
+++ b/mm/kmsan/kmsan_test.c
@@ -105,7 +105,7 @@ static bool report_matches(const struct expect_report *r)
 
 	/* Title */
 	cur = expected_header;
-	end = &expected_header[sizeof(expected_header) - 1];
+	end = ENDOF(expected_header);
 
 	cur = sprintf_end(cur, end, "BUG: KMSAN: %s", r->error_type);
 
-- 
2.50.0
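
Editorial note: the wasted byte described in this commit message can be
reproduced in userspace with plain snprintf().  The sketch below is only
an illustration under stated assumptions: stored_len() is a hypothetical
helper, and ARRAY_SIZE()/ENDOF() are simplified copies without the
kernel's __must_be_array() check.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified userspace copies; the kernel versions also reject pointers. */
#define ARRAY_SIZE(a)  (sizeof(a) / sizeof((a)[0]))
#define ENDOF(a)       ((a) + ARRAY_SIZE(a))

/*
 * Formats 'text' into an 8-byte buffer bounded by 'end' and returns the
 * length of what was stored.  'end' is either the old '&buf[size - 1]'
 * or the one-past-the-end pointer produced by ENDOF().
 */
static size_t stored_len(int use_endof, const char *text)
{
	char buf[8];
	char *end = use_endof ? ENDOF(buf) : &buf[sizeof(buf) - 1];

	assert(end > buf);
	snprintf(buf, (size_t)(end - buf), "%s", text);
	return strlen(buf);
}
```

For the 7-character input "1234567", the old bound stores 6 characters
and the ENDOF() bound stores all 7, which is the one byte the patch
recovers.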

[RFC v5 6/7] sprintf: Add [v]sprintf_array()

Posted by Alejandro Colomar 2 months, 4 weeks ago

These macros take the end of the array argument implicitly to avoid
programmer mistakes.  This guarantees that the input is an array, unlike

	snprintf(buf, sizeof(buf), ...);

which is dangerous if the programmer passes a pointer instead of an
array.

These macros are essentially the same as the 2-argument version of
strscpy(), but with a formatted string, and returning a pointer to the
terminating '\0' (or NULL, on error).

Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 include/linux/sprintf.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/linux/sprintf.h b/include/linux/sprintf.h
index a0dc35574521..8576a543e62c 100644
--- a/include/linux/sprintf.h
+++ b/include/linux/sprintf.h
@@ -4,6 +4,10 @@
 
 #include <linux/compiler_attributes.h>
 #include <linux/types.h>
+#include <linux/array_size.h>
+
+#define sprintf_array(a, fmt, ...)  sprintf_end(a, ENDOF(a), fmt, ##__VA_ARGS__)
+#define vsprintf_array(a, fmt, ap)  vsprintf_end(a, ENDOF(a), fmt, ap)
 
 int num_to_str(char *buf, int size, unsigned long long num, unsigned int width);
 
-- 
2.50.0

Re: [RFC v5 6/7] sprintf: Add [v]sprintf_array()

Posted by Linus Torvalds 2 months, 4 weeks ago

On Thu, 10 Jul 2025 at 14:31, Alejandro Colomar <alx@kernel.org> wrote:
>
> These macros are essentially the same as the 2-argument version of
> strscpy(), but with a formatted string, and returning a pointer to the
> terminating '\0' (or NULL, on error).

No.

Stop this garbage.

You took my suggestion, and then you messed it up.

Your version of sprintf_array() is broken. It evaluates 'a' twice.
Because unlike ARRAY_SIZE(), your broken ENDOF() macro evaluates the
argument.

And you did it for no reason I can see. You said that you wanted to
return the end of the resulting string, but the fact is, not a single
user seems to care, and honestly, I think it would be wrong to care.
The size of the result is likely the more useful thing, or you could
even make these 'void' or something.

But instead you made the macro be dangerous to use.

This kind of churn is WRONG. It _looks_ like a cleanup that doesn't
change anything, but then it has subtle bugs that will come and bite
us later because you did things wrong.

I'm NAK'ing all of this. This is BAD. Cleanup patches had better be
fundamentally correct, not introduce broken "helpers" that will make
for really subtle bugs.

Maybe nobody ever ends up having that first argument with a side
effect. MAYBE. It's still very very wrong.

                Linus
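
Editorial note: the double evaluation objected to above can be shown
with a small userspace sketch.  Everything here is hypothetical
scaffolding: record() stands in for sprintf_end() and only stores its
arguments, next_buf() is an argument with a side effect, and the macros
are simplified copies of the ones in the patch.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a)  (sizeof(a) / sizeof((a)[0]))  /* sizeof: no evaluation */
#define ENDOF(a)       ((a) + ARRAY_SIZE(a))         /* evaluates 'a' once */

static char storage[2][16];
static int calls;

/* Returns a pointer to a 16-byte array and counts how often it runs. */
static char (*next_buf(void))[16]
{
	return &storage[calls++ % 2];
}

static const char *seen_start, *seen_end;

/* Stand-in for sprintf_end(): just remembers what it was given. */
static void record(char *p, const char *end)
{
	assert(p != NULL && end != NULL);
	seen_start = p;
	seen_end = end;
}

/* Same shape as the proposed sprintf_array(): 'a' appears twice. */
#define record_array(a)  record(a, ENDOF(a))

/* Returns how many times the macro argument was evaluated. */
static int demo_double_eval(void)
{
	calls = 0;
	record_array(*next_buf());
	return calls;
}
```

demo_double_eval() returns 2: the argument is evaluated once for the
start pointer and once inside ENDOF(), so the start and end pointers can
end up referring to different buffers.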

Re: [RFC v5 6/7] sprintf: Add [v]sprintf_array()

Posted by Martin Uecker 2 months, 4 weeks ago

Am Donnerstag, dem 10.07.2025 um 14:58 -0700 schrieb Linus Torvalds:
> On Thu, 10 Jul 2025 at 14:31, Alejandro Colomar <alx@kernel.org> wrote:
> > 
> > These macros are essentially the same as the 2-argument version of
> > strscpy(), but with a formatted string, and returning a pointer to the
> > terminating '\0' (or NULL, on error).
> 
> No.
> 
> Stop this garbage.
> 
> You took my suggestion, and then you messed it up.
> 
> Your version of sprintf_array() is broken. It evaluates 'a' twice.
> Because unlike ARRAY_SIZE(), your broken ENDOF() macro evaluates the
> argument.
> 
> And you did it for no reason I can see. You said that you wanted to
> return the end of the resulting string, but the fact is, not a single
> user seems to care, and honestly, I think it would be wrong to care.
> The size of the result is likely the more useful thing, or you could
> even make these 'void' or something.
> 
> But instead you made the macro be dangerous to use.
> 
> This kind of churn is WRONG. It _looks_ like a cleanup that doesn't
> change anything, but then it has subtle bugs that will come and bite
> us later because you did things wrong.
> 
> I'm NAK'ing all of this. This is BAD. Cleanup patches had better be
> fundamentally correct, not introduce broken "helpers" that will make
> for really subtle bugs.
> 
> Maybe nobody ever ends up having that first argument with a side
> effect. MAYBE. It's still very very wrong.
> 
>                 Linus

What I am puzzled about is that - if you revise your string APIs -,
you do not directly go for a safe abstraction that combines length
and pointer and instead keep using these fragile 80s-style string
functions and open-coded pointer and size computations that everybody
gets wrong all the time.

String handling could also look like this:


https://godbolt.org/z/dqGz9b4sM

and be completely bounds safe.

(Note that those functions abort() on allocation failure, but this
is an unfinished demo and also not for kernel use. Also I need to
rewrite this using string views.)


Martin

Re: [RFC v5 6/7] sprintf: Add [v]sprintf_array()

Posted by David Laight 2 months, 4 weeks ago

On Fri, 11 Jul 2025 08:05:38 +0200
Martin Uecker <ma.uecker@gmail.com> wrote:

> Am Donnerstag, dem 10.07.2025 um 14:58 -0700 schrieb Linus Torvalds:
> > On Thu, 10 Jul 2025 at 14:31, Alejandro Colomar <alx@kernel.org> wrote:  
> > > 
> > > These macros are essentially the same as the 2-argument version of
> > > strscpy(), but with a formatted string, and returning a pointer to the
> > > terminating '\0' (or NULL, on error).  
> > 
> > No.
> > 
> > Stop this garbage.
> > 
> > You took my suggestion, and then you messed it up.
> > 
> > Your version of sprintf_array() is broken. It evaluates 'a' twice.
> > Because unlike ARRAY_SIZE(), your broken ENDOF() macro evaluates the
> > argument.
> > 
> > And you did it for no reason I can see. You said that you wanted to
> > return the end of the resulting string, but the fact is, not a single
> > user seems to care, and honestly, I think it would be wrong to care.
> > The size of the result is likely the more useful thing, or you could
> > even make these 'void' or something.
> > 
> > But instead you made the macro be dangerous to use.
> > 
> > This kind of churn is WRONG. It _looks_ like a cleanup that doesn't
> > change anything, but then it has subtle bugs that will come and bite
> > us later because you did things wrong.
> > 
> > I'm NAK'ing all of this. This is BAD. Cleanup patches had better be
> > fundamentally correct, not introduce broken "helpers" that will make
> > for really subtle bugs.
> > 
> > Maybe nobody ever ends up having that first argument with a side
> > effect. MAYBE. It's still very very wrong.
> > 
> >                 Linus  
> 
> What I am puzzled about is that - if you revise your string APIs -,
> you do not directly go for a safe abstraction that combines length
> and pointer and instead keep using these fragile 80s-style string
> functions and open-coded pointer and size computations that everybody
> gets wrong all the time.
> 
> String handling could also look like this:

What does that actually look like behind all the #defines and generics?
If it is continually doing malloc/free, it is pretty much inappropriate
for a lot of system/kernel code.

	David

> 
> 
> https://godbolt.org/z/dqGz9b4sM
> 
> and be completely bounds safe.
> 
> (Note that those functions abort() on allocation failure, but this
> is an unfinished demo and also not for kernel use. Also I need to
> rewrite this using string views.)
> 
> 
> Martin
> 
> 
> 
>

Re: [RFC v5 6/7] sprintf: Add [v]sprintf_array()

Posted by Martin Uecker 2 months, 4 weeks ago

Am Freitag, dem 11.07.2025 um 18:45 +0100 schrieb David Laight:
> On Fri, 11 Jul 2025 08:05:38 +0200
> Martin Uecker <ma.uecker@gmail.com> wrote:
> 
> > Am Donnerstag, dem 10.07.2025 um 14:58 -0700 schrieb Linus Torvalds:
> > > On Thu, 10 Jul 2025 at 14:31, Alejandro Colomar <alx@kernel.org> wrote:  
> > > > 
> > > > These macros are essentially the same as the 2-argument version of
> > > > strscpy(), but with a formatted string, and returning a pointer to the
> > > > terminating '\0' (or NULL, on error).  
> > > 
> > > No.
> > > 
> > > Stop this garbage.
> > > 
> > > You took my suggestion, and then you messed it up.
> > > 
> > > Your version of sprintf_array() is broken. It evaluates 'a' twice.
> > > Because unlike ARRAY_SIZE(), your broken ENDOF() macro evaluates the
> > > argument.
> > > 
> > > And you did it for no reason I can see. You said that you wanted to
> > > return the end of the resulting string, but the fact is, not a single
> > > user seems to care, and honestly, I think it would be wrong to care.
> > > The size of the result is likely the more useful thing, or you could
> > > even make these 'void' or something.
> > > 
> > > But instead you made the macro be dangerous to use.
> > > 
> > > This kind of churn is WRONG. It _looks_ like a cleanup that doesn't
> > > change anything, but then it has subtle bugs that will come and bite
> > > us later because you did things wrong.
> > > 
> > > I'm NAK'ing all of this. This is BAD. Cleanup patches had better be
> > > fundamentally correct, not introduce broken "helpers" that will make
> > > for really subtle bugs.
> > > 
> > > Maybe nobody ever ends up having that first argument with a side
> > > effect. MAYBE. It's still very very wrong.
> > > 
> > >                 Linus  
> > 
> > What I am puzzled about is that - if you revise your string APIs -,
> > you do not directly go for a safe abstraction that combines length
> > and pointer and instead keep using these fragile 80s-style string
> > functions and open-coded pointer and size computations that everybody
> > gets wrong all the time.
> > 
> > String handling could also look like this:
> 
> What does that actually look like behind all the #defines and generics?
> If it is continually doing malloc/free, it is pretty much inappropriate
> for a lot of system/kernel code.

The example I linked would allocate behind your back and would clearly
not be useful for the kernel, not least because it would abort() on
allocation failure (as I pointed out below).

Still, I do not see why similar functions could not work for the
kernel.  The main point is to keep pointer and length together in a
single struct.  But it is certainly more difficult to define APIs
which make sense for the kernel.

I explain a bit how such types work here:

https://uecker.codeberg.page/2025-07-02.html
https://uecker.codeberg.page/2025-07-09.html

Martin
> 

> > 
> > https://godbolt.org/z/dqGz9b4sM
> > 
> > and be completely bounds safe.
> > 
> > (Note that those functions abort() on allocation failure, but this
> > is an unfinished demo and also not for kernel use. Also I need to
> > rewrite this using string views.)
> > 
> > 
> > Martin
> > 
> > 
> > 
> > 
>

Re: [RFC v5 6/7] sprintf: Add [v]sprintf_array()

Posted by Linus Torvalds 2 months, 4 weeks ago

On Fri, 11 Jul 2025 at 10:45, David Laight <david.laight.linux@gmail.com> wrote:
>
> What does that actually look like behind all the #defines and generics?
> If it is continually doing malloc/free, it is pretty much inappropriate
> for a lot of system/kernel code.

Honestly, the kernel approximately *never* has "string handling" in
the traditional sense.

But we do have "buffers with text". The difference typically exactly
being that allocation has to happen separately from any text
operation.

It's why I already suggested people look at our various existing
buffer abstractions: we have several, although they tend to often be
somewhat specialized.

So, for example, we have things like "struct qstr" for path
components: it's specialized not only in having an associated hash
value for the string, but because it's a "initialize once" kind of
buffer that gets initialized at creation time, and the string contents
are constant (it literally contains a "const char *" in addition to
the length/hash).

That kind of "string buffer" obviously isn't useful for things like
the printf family, but we do have others. Like "struct seq_buf", which
already has "seq_buf_printf()" helpers.

That's the one you probably should use for most kernel "print to
buffer", but it has very few users despite not being complicated to
use:

        struct seq_buf s;
        seq_buf_init(&s, buf, size);

and you're off to the races, and can do things like

        seq_buf_printf(&s, ....);

without ever having to worry about overflows etc.

So we already do *have* good interfaces. But they aren't the
traditional ones that everybody knows about.

                   Linus
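
Editorial note: the idea of a buffer object that carries its own bounds
can be sketched in userspace as below.  This is not struct seq_buf and
not the kernel implementation; struct text_buf, text_buf_init(),
text_buf_printf() and demo_len() are hypothetical names chosen for this
illustration, and the truncation policy (keep what fits, set a flag) is
an assumption.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical bounded text buffer: storage, capacity and fill level. */
struct text_buf {
	char *data;
	size_t size;		/* capacity, including the '\0' */
	size_t len;		/* bytes used, excluding the '\0' */
	bool overflowed;
};

static void text_buf_init(struct text_buf *b, char *data, size_t size)
{
	assert(size > 0);
	b->data = data;
	b->size = size;
	b->len = 0;
	b->overflowed = false;
	data[0] = '\0';
}

/* Appends formatted text; on truncation keeps what fit and sets the flag. */
static void text_buf_printf(struct text_buf *b, const char *fmt, ...)
{
	size_t room = b->size - b->len;	/* always >= 1 */
	va_list ap;
	int n;

	if (b->overflowed)
		return;

	va_start(ap, fmt);
	n = vsnprintf(b->data + b->len, room, fmt, ap);
	va_end(ap);

	if (n < 0) {
		b->data[b->len] = '\0';	/* formatting error: drop this piece */
		b->overflowed = true;
	} else if ((size_t)n >= room) {
		b->len = b->size - 1;	/* buffer is now full */
		b->overflowed = true;
	} else {
		b->len += (size_t)n;
	}
}

/* Appends two pieces and returns the resulting length. */
static size_t demo_len(char *storage, size_t size)
{
	struct text_buf b;

	text_buf_init(&b, storage, size);
	text_buf_printf(&b, "pid %d", 42);
	text_buf_printf(&b, ", order %u", 3u);
	return b.len;
}
```

The caller never computes a remaining size or advances a pointer by
hand; the only thing left to check is the overflow flag.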

Re: [RFC v5 6/7] sprintf: Add [v]sprintf_array()

Posted by Kees Cook 2 months, 3 weeks ago

On Fri, Jul 11, 2025 at 10:58:56AM -0700, Linus Torvalds wrote:
>         struct seq_buf s;
>         seq_buf_init(&s, buf, size);

And because some folks didn't like this "declaration that requires a
function call", we even added:

	DECLARE_SEQ_BUF(s, 32);

to do it in 1 line. :P

I would love to see more string handling replaced with seq_buf.

-- 
Kees Cook

Re: [RFC v5 6/7] sprintf: Add [v]sprintf_array()

Posted by Alejandro Colomar 2 months, 3 weeks ago

Hi Kees,

On Mon, Jul 14, 2025 at 10:19:39PM -0700, Kees Cook wrote:
> On Fri, Jul 11, 2025 at 10:58:56AM -0700, Linus Torvalds wrote:
> >         struct seq_buf s;
> >         seq_buf_init(&s, buf, size);
> 
> And because some folks didn't like this "declaration that requires a
> function call", we even added:
> 
> 	DECLARE_SEQ_BUF(s, 32);
> 
> to do it in 1 line. :P
> 
> I would love to see more string handling replaced with seq_buf.

The thing is, it's not as easy as the fixes I'm proposing, and
sprintf_end() solves a lot of UB in a minimal diff that you can dumbly
apply.

And transitioning from sprintf_end() to seq_buf will still be a
possibility --probably even easier, because the code is simpler than
with s[c]nprintf()--.

Another thing, and this is my opinion, is that I'm not fond of APIs that
keep an internal state.  With sprintf_end(), the state is minimal and
external: the state is the 'p' pointer to where you're going to write.
That way, the programmer knows exactly where the writes occur, and can
reason about it without having to read the implementation and keep a
model of the state in their head.  With a struct-based approach, you hide
the state inside the structure, which means it's not so easy to reason
about how an action will affect the string, at first glance; you need an
expert in the API to know how to use it.

With sprintf_end(), either one is stupid/careless enough to get the
parameters wrong, or the function necessarily works well, *and is simple
to fully understand*.  And considering that we have ENDOF(), it's hard
to understand how one could get it wrong:

	p = buf;
	e = ENDOF(buf);
	p = sprintf_end(p, e, ...);
	p = sprintf_end(p, e, ...);
	p = sprintf_end(p, e, ...);
	p = sprintf_end(p, e, ...);

Admittedly, ENDOF() doesn't compile if buf is not an array, so in those
cases, there's a chance of a paranoid programmer slapping a -1 just in
case, but that doesn't hurt:

	p = buf;
	e = buf + size;  // Someone might accidentally -1 that?

I'm working on extending the _Countof() operator so that it can be
applied to array parameters to functions, so that it can be used to
count arrays that are not arrays:

	void
	f(size_t n, char buf[n])
	{
		p = buf;
		e = buf + _Countof(buf);  // _Countof(buf) will evaluate to n.
		...
	}

Which will significantly enhance the usability of sprintf_end().  I want
to implement this for GCC next year (there are a few things that need to
be improved first to be able to do that), and also propose it for
standardization.

For a similar comparison of stateful vs stateless functions, there are
strtok(3) and strsep(3), which apart from minor differences (strtok(3)
collapses adjacent delimiters) are more or less the same.  But I'd use
strsep(3) over strtok(3), even if just because strtok(3) keeps an
internal state, so I always need to be very careful of reading the
documentation to remind myself of what happens to the state after each
call.  strsep(3) is dead simple: you call it, and it updates the pointer
you passed; nothing is kept secretly from the programmer.

Have a lovely day!
Alex

-- 
<https://www.alejandro-colomar.es/>
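
Editorial note: the chained-call pattern described in this message can
be tried in userspace with the sketch below.  my_sprintf_end() is a
hypothetical stand-in, not the function from the patch series.  Its
contract is an assumption taken from the thread: it returns a pointer to
the terminating '\0', returns NULL on truncation, and passes a NULL 'p'
through so that a chain needs only one check at the end.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-in for the proposed sprintf_end(); see assumptions above. */
static char *my_sprintf_end(char *p, const char *end, const char *fmt, ...)
{
	va_list ap;
	size_t size;
	int len;

	assert(fmt != NULL);
	if (p == NULL || p >= end)
		return NULL;		/* propagate an earlier failure */

	size = (size_t)(end - p);
	va_start(ap, fmt);
	len = vsnprintf(p, size, fmt, ap);
	va_end(ap);

	if (len < 0 || (size_t)len >= size)
		return NULL;		/* error or truncated output */
	return p + len;			/* points at the terminating '\0' */
}

/* Builds "a=1 b=2" in buf; returns its length, or -1 if it did not fit. */
static int build(char *buf, size_t size)
{
	char *p = buf;
	const char *e = buf + size;

	p = my_sprintf_end(p, e, "a=%d", 1);
	p = my_sprintf_end(p, e, " b=%d", 2);
	return p ? (int)(p - buf) : -1;
}
```

The only state is the pair (p, e), both visible at the call site, which
is the property the message argues for.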

Re: [RFC v5 6/7] sprintf: Add [v]sprintf_array()

Posted by Kees Cook 2 months, 3 weeks ago

On Tue, Jul 15, 2025 at 09:08:14AM +0200, Alejandro Colomar wrote:
> Hi Kees,
> 
> On Mon, Jul 14, 2025 at 10:19:39PM -0700, Kees Cook wrote:
> > On Fri, Jul 11, 2025 at 10:58:56AM -0700, Linus Torvalds wrote:
> > >         struct seq_buf s;
> > >         seq_buf_init(&s, buf, size);
> > 
> > And because some folks didn't like this "declaration that requires a
> > function call", we even added:
> > 
> > 	DECLARE_SEQ_BUF(s, 32);
> > 
> > to do it in 1 line. :P
> > 
> > I would love to see more string handling replaced with seq_buf.
> 
> The thing is, it's not as easy as the fixes I'm proposing, and
> sprintf_end() solves a lot of UB in a minimal diff that you can dumbly
> apply.

Note that I'm not arguing against your idea -- I just think it's not
going to be likely to end up in Linux soon given Linus's objections. My
perspective is mainly one of pragmatic damage control: what *can* we do
in Linux that would make things better? Currently, seq_buf is better
than raw C strings...

-- 
Kees Cook

Re: [RFC v5 6/7] sprintf: Add [v]sprintf_array()

Posted by Alejandro Colomar 2 months, 3 weeks ago

Hi Kees,

On Thu, Jul 17, 2025 at 04:47:04PM -0700, Kees Cook wrote:
> On Tue, Jul 15, 2025 at 09:08:14AM +0200, Alejandro Colomar wrote:
> > Hi Kees,
> > 
> > On Mon, Jul 14, 2025 at 10:19:39PM -0700, Kees Cook wrote:
> > > On Fri, Jul 11, 2025 at 10:58:56AM -0700, Linus Torvalds wrote:
> > > >         struct seq_buf s;
> > > >         seq_buf_init(&s, buf, size);
> > > 
> > > And because some folks didn't like this "declaration that requires a
> > > function call", we even added:
> > > 
> > > 	DECLARE_SEQ_BUF(s, 32);
> > > 
> > > to do it in 1 line. :P
> > > 
> > > I would love to see more string handling replaced with seq_buf.
> > 
> > The thing is, it's not as easy as the fixes I'm proposing, and
> > sprintf_end() solves a lot of UB in a minimal diff that you can dumbly
> > apply.
> 
> Note that I'm not arguing against your idea -- I just think it's not
> going to be likely to end up in Linux soon given Linus's objections.

It would be interesting to hear if Linus holds his objections on v6.

> My
> perspective is mainly one of pragmatic damage control: what *can* we do
> in Linux that would make things better? Currently, seq_buf is better
> than raw C strings...

TBH, I'm not fully convinced.  While it may look simpler at first
glance, I'm worried that it might bite in the details.  I default to not
trusting APIs that hide the complexity in hidden state.  On the other
hand, I agree that almost anything is safer than snprintf(3).

But one good thing of snprintf(3) is that it's simple, and thus
relatively obvious to see that it's wrong, so it's easy to fix (it's
easy to transition from snprintf(3) to sprintf_end()).  So, maybe
keeping it bogus until it's replaced by sprintf_end() is a better
approach than using seq_buf.  (Unless the current code is found
exploitable, but I assume not.)

Have a lovely night!
Alex

-- 
<https://www.alejandro-colomar.es/>

Re: [RFC v5 6/7] sprintf: Add [v]sprintf_array()

Posted by Martin Uecker 2 months, 3 weeks ago

Am Montag, dem 14.07.2025 um 22:19 -0700 schrieb Kees Cook:
> On Fri, Jul 11, 2025 at 10:58:56AM -0700, Linus Torvalds wrote:
> >         struct seq_buf s;
> >         seq_buf_init(&s, buf, size);
> 
> And because some folks didn't like this "declaration that requires a
> function call", we even added:
> 
> 	DECLARE_SEQ_BUF(s, 32);
> 
> to do it in 1 line. :P
> 
> I would love to see more string handling replaced with seq_buf.

Why not have?

struct seq_buf s = SEQ_BUF(32);

So the kernel has safe abstractions, they are just not used enough.

Do you also have a string view abstraction?  I found this really
useful as basic building block for safe string handling, and
equally important to a string builder type such as seq_buf.

The string builder is for safely constructing new strings, the
string view is for safely accessing parts of existing strings.

Also what I found really convenient and useful in this context
was to have an accessor macro that exposes the buffer as a
regular array cast to the correct size:

 *( (char(*)[(x)->N]) (x)->data )

(put into statement expressions to avoid double evaluation)

instead of simply returning a char*

You can then access the array directly with [] which then can be
bounds checked with UBsan, one can measure its length with sizeof,
and one can also let it decay and get a char* to pass it to legacy
code (and to some degree this can be protected by BDOS).

Martin
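
Editorial note: the accessor described above can be sketched as follows.
struct strview and the helper functions are hypothetical names for this
illustration.  The macro is the plain form without a statement
expression, so it evaluates 'x' twice, and it relies on variably
modified types (C99 or later with VLA support).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical view type: pointer and length kept together. */
struct strview {
	size_t N;
	char *data;
};

/* Exposes the buffer as an lvalue of type char[N]; evaluates 'x' twice. */
#define STRVIEW_ARRAY(x)  (*(char (*)[(x)->N])(x)->data)

/* sizeof on the accessor yields the run-time length N. */
static size_t strview_sizeof(struct strview *v)
{
	assert(v->data != NULL);
	return sizeof(STRVIEW_ARRAY(v));
}

/* The accessor can be indexed like a regular array. */
static char strview_first(struct strview *v)
{
	assert(v->N > 0);
	return STRVIEW_ARRAY(v)[0];
}
```

Because the expression has array type, it works with sizeof and [] and
still decays to char * when handed to legacy code.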

Re: [RFC v5 6/7] sprintf: Add [v]sprintf_array()

Posted by Kees Cook 2 months, 3 weeks ago

On Tue, Jul 15, 2025 at 08:24:29AM +0200, Martin Uecker wrote:
> Am Montag, dem 14.07.2025 um 22:19 -0700 schrieb Kees Cook:
> > On Fri, Jul 11, 2025 at 10:58:56AM -0700, Linus Torvalds wrote:
> > >         struct seq_buf s;
> > >         seq_buf_init(&s, buf, size);
> > 
> > And because some folks didn't like this "declaration that requires a
> > function call", we even added:
> > 
> > 	DECLARE_SEQ_BUF(s, 32);
> > 
> > to do it in 1 line. :P
> > 
> > I would love to see more string handling replaced with seq_buf.
> 
> Why not have?
> 
> struct seq_buf s = SEQ_BUF(32);
> 
> 
> So the kernel has safe abstractions, they are just not used enough.

Yeah, that should be fine. The trouble is encapsulating the actual
buffer itself. But things like spinlocks need initialization too, so
it's not too unusual to need a constructor for things living in a
struct.

If the struct had DECLARE which created 2 variables, then an INIT could
just reuse the special name...

> The string builder is for safely constructing new strings, the
> string view is for safely accessing parts of existing strings.

seq_buf doesn't currently have a "view" API, just a "make sure the
result is NUL terminated, please enjoy this char *"

> Also what I found really convenient and useful in this context
> was to have an accessor macro that exposes the buffer as a
> regular array cast to the correct size:
> 
>  *( (char(*)[(x)->N]) (x)->data )
> 
> (put into statement expressions to avoid double evaluation)
> 
> instead of simply returning a char*

Yeah, I took a look through your proposed C string library routines. I
think it would be pretty nice, but it does feel like it has to jump
through a lot of hoops when C should have something native. Though to
be clear, I'm not saying seq_buf is the answer. :)

-- 
Kees Cook

Re: [RFC v5 6/7] sprintf: Add [v]sprintf_array()

Posted by Matthew Wilcox 2 months, 3 weeks ago

On Fri, Jul 11, 2025 at 10:58:56AM -0700, Linus Torvalds wrote:
> That kind of "string buffer" obviously isn't useful for things like
> the printf family, but we do have others. Like "struct seq_buf", which
> already has "seq_buf_printf()" helpers.
> 
> That's the one you probably should use for most kernel "print to
> buffer", but it has very few users despite not being complicated to
> use:
> 
>         struct seq_buf s;
>         seq_buf_init(&s, buf, size);
> 
> and you're off to the races, and can do things like
> 
>         seq_buf_printf(&s, ....);
> 
> without ever having to worry about overflows etc.

I actually wanted to go one step further with this (that's why I took
readpos out of seq_buf in d0ed46b60396).  If you look at the guts of
vsprintf.c, it'd be much improved by using seq_buf internally instead
of passing around buf and end.

Once we've done that, maybe we can strip these annoying %pXYZ out
of vsprintf.c and use seq_buf routines like it's a StringBuilder (or
whatever other language/library convention you prefer).

Anyway, I ran out of time to work on it, but I still think it's
worthwhile.  And then there'd be a lot more commonality between regular
printing and trace printing, which would be nice.

Re: [RFC v5 6/7] sprintf: Add [v]sprintf_array()

Posted by Martin Uecker 2 months, 4 weeks ago

Am Freitag, dem 11.07.2025 um 08:05 +0200 schrieb Martin Uecker:
> Am Donnerstag, dem 10.07.2025 um 14:58 -0700 schrieb Linus Torvalds:
> > On Thu, 10 Jul 2025 at 14:31, Alejandro Colomar <alx@kernel.org> wrote:
> > > 
> > > These macros are essentially the same as the 2-argument version of
> > > strscpy(), but with a formatted string, and returning a pointer to the
> > > terminating '\0' (or NULL, on error).
> > 
> > No.
> > 
> > Stop this garbage.
> > 
> > You took my suggestion, and then you messed it up.
> > 
> > Your version of sprintf_array() is broken. It evaluates 'a' twice.
> > Because unlike ARRAY_SIZE(), your broken ENDOF() macro evaluates the
> > argument.
> > 
> > And you did it for no reason I can see. You said that you wanted to
> > return the end of the resulting string, but the fact is, not a single
> > user seems to care, and honestly, I think it would be wrong to care.
> > The size of the result is likely the more useful thing, or you could
> > even make these 'void' or something.
> > 
> > But instead you made the macro be dangerous to use.
> > 
> > This kind of churn is WRONG. It _looks_ like a cleanup that doesn't
> > change anything, but then it has subtle bugs that will come and bite
> > us later because you did things wrong.
> > 
> > I'm NAK'ing all of this. This is BAD. Cleanup patches had better be
> > fundamentally correct, not introduce broken "helpers" that will make
> > for really subtle bugs.
> > 
> > Maybe nobody ever ends up having that first argument with a side
> > effect. MAYBE. It's still very very wrong.
> > 
> >                 Linus
> 
> What I am puzzled about is that - if you revise your string APIs -,
> you do not directly go for a safe abstraction that combines length
> and pointer and instead keep using these fragile 80s-style string
> functions and open-coded pointer and size computations that everybody
> gets wrong all the time.
> 
> String handling could also look like this:
> 
> 
> https://godbolt.org/z/dqGz9b4sM
> 
> and be completely bounds safe.
> 
> (Note that those function abort() on allocation failure, but this
> is an unfinished demo and also not for kernel use. Also I need to
> rewrite this using string views.)
> 

And *if* you want functions that manipulate buffers, why not pass
a pointer to the buffer instead of to its first element, so as not to
lose the type information?

int foo(size_t s, char (*p)[s]);

char buf[10];
foo(ARRAY_SIZE(buf), &buf);

may look slightly unusual but is a lot safer than

int foo(char *buf, size_t len);

char buf[10];
foo(buf, ARRAY_SIZE(buf));

and - once you are used to it - also more logical: why would you
pass a pointer to part of an object to a function that is supposed
to work on the complete object?
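
A small runnable sketch of the pointer-to-array style, written for
userspace.  fill() is a hypothetical helper and ARRAY_SIZE() is
redefined locally; the point is only that the callee can recover the
buffer size from the parameter type with sizeof(*p).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(a)  (sizeof(a) / sizeof((a)[0]))

/*
 * Hypothetical helper: fill the whole buffer with 'c' and terminate it.
 * 'p' points to the complete array, so sizeof(*p) equals 's'.
 */
static int fill(size_t s, char (*p)[s], char c)
{
	if (s == 0)
		return -1;

	memset(*p, c, sizeof(*p) - 1);
	(*p)[s - 1] = '\0';

	return (int)(s - 1);	/* length of the resulting string */
}

static void fill_demo(void)
{
	char buf[10];

	/* &buf has type char (*)[10], which carries the size with it. */
	assert(fill(ARRAY_SIZE(buf), &buf, 'x') == 9);
	assert(strlen(buf) == 9);
}
```

With the char * form, nothing in the types ties the pointer to the
length; here a size that disagrees with the array type is at least
something a compiler or sanitizer has a chance to notice.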

Martin

Re: [RFC v5 6/7] sprintf: Add [v]sprintf_array()

Posted by Alejandro Colomar 2 months, 4 weeks ago

Hi Linus,

[I'll reply to both of your emails at once]

On Thu, Jul 10, 2025 at 02:58:24PM -0700, Linus Torvalds wrote:
> You took my suggestion, and then you messed it up.
> 
> Your version of sprintf_array() is broken. It evaluates 'a' twice.
> Because unlike ARRAY_SIZE(), your broken ENDOF() macro evaluates the
> argument.

An array has no issue being evaluated twice (unless it's a VLA).  On the
other hand, I agree it's better to not do that in the first place.
My bad for forgetting about it.  Sorry.

On Thu, Jul 10, 2025 at 03:08:29PM -0700, Linus Torvalds wrote:
> If you want to return an error on truncation, do it right.  Not by
> returning NULL, but by actually returning an error.

Okay.

> For example, in the kernel, we finally fixed 'strcpy()'. After about a
> million different versions of 'copy a string' where every single
> version was complete garbage, we ended up with 'strscpy()'. Yeah, the
> name isn't lovely, but the *use* of it is:

I have implemented the same thing in shadow, called strtcpy() (T for
truncation).  (With the difference that we read the string twice, since
we don't care about threads.)

I also plan to propose standardization of that one in ISO C.

>  - it returns the length of the result for people who want it - which
> is by far the most common thing people want

Agree.

>  - it returns an actual honest-to-goodness error code if something
> overflowed, instead of the absolutely horrible "source length" of the
> string that strlcpy() does and which is fundamentally broken (because
> it requires that you walk *past* the end of the source,
> Christ-on-a-stick what a broken interface)

Agree.

>  - it can take an array as an argument (without the need for another
> name - see my earlier argument about not making up new names by just
> having generics)

We can't do the same thing with sprintf() variants because they're
variadic, so you can't count the number of arguments.  And since the
'end' argument is of the same type as the formatted string, we can't
do it with _Generic reliably either.

> Now, it has nasty naming (exactly the kind of 'add random character'
> naming that I was arguing against), and that comes from so many
> different broken versions until we hit on something that works.
> 
> strncpy is horrible garbage. strlcpy is even worse. strscpy actually
> works and so far hasn't caused issues (there's a 'pad' version for the
> very rare situation where you want 'strncpy-like' padding, but it
> still guarantees NUL-termination, and still has a good return value).

Agree.

> Let's agree to *not* make horrible garbage when making up new versions
> of sprintf.

Agree.  I indeed introduced the mistake accidentally in v4, after you
complained of having too many functions, as I was introducing not one
but two APIs: seprintf() and stprintf(), where seprintf() is what we're
now calling sprintf_end(), and stprintf() is what we could call
sprintf_trunc().  So I made the mistake by trying to reduce the number
of functions to just one, which is wrong.

So, maybe I should go back to those functions, and just give them good
names.

What do you think of the following?

	#define sprintf_array(a, ...)  sprintf_trunc(a, ARRAY_SIZE(a), __VA_ARGS__)
	#define vsprintf_array(a, ap)  vsprintf_trunc(a, ARRAY_SIZE(a), ap)

	char *sprintf_end(char *p, const char end[0], const char *fmt, ...);
	char *vsprintf_end(char *p, const char end[0], const char *fmt, va_list args);
	int sprintf_trunc(char *buf, size_t size, const char *fmt, ...);
	int vsprintf_trunc(char *buf, size_t size, const char *fmt, va_list args);

	char *sprintf_end(char *p, const char end[0], const char *fmt, ...)
	{
		va_list args;

		va_start(args, fmt);
		p = vseprintf(p, end, fmt, args);
		va_end(args);

		return p;
	}

	char *vsprintf_end(char *p, const char end[0], const char *fmt, va_list args)
	{
		int len;

		if (unlikely(p == NULL))
			return NULL;

		len = vsprintf_trunc(p, end - p, fmt, args);
		if (unlikely(len < 0))
			return NULL;

		return p + len;
	}

	int sprintf_trunc(char *buf, size_t size, const char *fmt, ...)
	{
		va_list args;
		int len;

		va_start(args, fmt);
		len = vstprintf(buf, size, fmt, args);
		va_end(args);

		return len;
	}

	int vsprintf_trunc(char *buf, size_t size, const char *fmt, va_list args)
	{
		int len;

		if (WARN_ON_ONCE(size == 0 || size > INT_MAX))
			return -EOVERFLOW;

		len = vsnprintf(buf, size, fmt, args);
		if (unlikely(len >= size))
			return -E2BIG;

		return len;
	}

sprintf_trunc() is like strscpy(), but with a formatted string.  It
could replace uses of s[c]nprintf() where there's a single call (no
chained calls).

sprintf_array() is like the 2-argument version of strscpy().  It could
replace s[c]nprintf() calls where there are no chained calls and the
destination is an array.

sprintf_end() would replace the chained calls.
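
To show how the chained case would read, here is a userspace
approximation of the proposal above.  It is a sketch, not the kernel
code: unlikely() and WARN_ON_ONCE() are dropped, 'end' is a plain
pointer instead of 'const char end[0]', the -EINVAL path for a failing
vsnprintf() is an added assumption, and chain_demo() is a hypothetical
caller.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Like vsnprintf(), but truncation is reported as a negative error code. */
static int vsprintf_trunc(char *buf, size_t size, const char *fmt, va_list args)
{
	int len;

	if (size == 0 || size > INT_MAX)
		return -EOVERFLOW;

	len = vsnprintf(buf, size, fmt, args);
	if (len < 0)
		return -EINVAL;		/* assumption: userspace output error */
	if ((size_t)len >= size)
		return -E2BIG;		/* output was truncated */

	return len;
}

static int sprintf_trunc(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int len;

	va_start(args, fmt);
	len = vsprintf_trunc(buf, size, fmt, args);
	va_end(args);

	return len;
}

/* Returns a pointer to the terminating NUL, or NULL on error/truncation. */
static char *sprintf_end(char *p, const char *end, const char *fmt, ...)
{
	va_list args;
	int len;

	if (p == NULL)
		return NULL;	/* propagate a failure from an earlier call */

	va_start(args, fmt);
	len = vsprintf_trunc(p, (size_t)(end - p), fmt, args);
	va_end(args);

	return (len < 0) ? NULL : p + len;
}

/* Chained calls: a single check of the final pointer covers every step. */
static char *chain_demo(char *buf, size_t size)
{
	const char *end = buf + size;
	char *p = buf;

	assert(buf != NULL);
	p = sprintf_end(p, end, "%s", "foo");
	p = sprintf_end(p, end, "-%d", 42);
	p = sprintf_end(p, end, "-%s", "bar");

	return p;
}
```

The caller never recomputes a remaining size between calls, which is
where the off-by-one bugs tend to come from.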

Does this sound good to you?

Cheers,
Alex

-- 
<https://www.alejandro-colomar.es/>

Re: [RFC v5 6/7] sprintf: Add [v]sprintf_array()

Posted by David Laight 2 months, 4 weeks ago

On Fri, 11 Jul 2025 01:23:49 +0200
Alejandro Colomar <alx@kernel.org> wrote:

> Hi Linus,
> 
> [I'll reply to both of your emails at once]
> 
> On Thu, Jul 10, 2025 at 02:58:24PM -0700, Linus Torvalds wrote:
> > You took my suggestion, and then you messed it up.
> > 
> > Your version of sprintf_array() is broken. It evaluates 'a' twice.
> > Because unlike ARRAY_SIZE(), your broken ENDOF() macro evaluates the
> > argument.  
> 
> An array has no issue being evaluated twice (unless it's a VLA).  On the
> other hand, I agree it's better to not do that in the first place.
> My bad for forgetting about it.  Sorry.

Or a function that returns an array...

	David

> 
> On Thu, Jul 10, 2025 at 03:08:29PM -0700, Linus Torvalds wrote:
> > If you want to return an error on truncation, do it right.  Not by
> > returning NULL, but by actually returning an error.  
> 
> Okay.
> 
> > For example, in the kernel, we finally fixed 'strcpy()'. After about a
> > million different versions of 'copy a string' where every single
> > version was complete garbage, we ended up with 'strscpy()'. Yeah, the
> > name isn't lovely, but the *use* of it is:  
> 
> I have implemented the same thing in shadow, called strtcpy() (T for
> truncation).  (With the difference that we read the string twice, since
> we don't care about threads.)
> 
> I also plan to propose standardization of that one in ISO C.
> 
> >  - it returns the length of the result for people who want it - which
> > is by far the most common thing people want  
> 
> Agree.
> 
> >  - it returns an actual honest-to-goodness error code if something
> > overflowed, instead of the absolutely horrible "source length" of the
> > string that strlcpy() does and which is fundamentally broken (because
> > it requires that you walk *past* the end of the source,
> > Christ-on-a-stick what a broken interface)  
> 
> Agree.
> 
> >  - it can take an array as an argument (without the need for another
> > name - see my earlier argument about not making up new names by just
> > having generics)  
> 
> We can't make the same thing with sprintf() variants because they're
> variadic, so you can't count the number of arguments.  And since the
> 'end' argument is of the same type as the formatted string, we can't
> do it with _Generic reliably either.
> 
> > Now, it has nasty naming (exactly the kind of 'add random character'
> > naming that I was arguing against), and that comes from so many
> > different broken versions until we hit on something that works.
> > 
> > strncpy is horrible garbage. strlcpy is even worse. strscpy actually
> > works and so far hasn't caused issues (there's a 'pad' version for the
> > very rare situation where you want 'strncpy-like' padding, but it
> > still guarantees NUL-termination, and still has a good return value).  
> 
> Agree.
> 
> > Let's agree to *not* make horrible garbage when making up new versions
> > of sprintf.  
> 
> Agree.  I indeed introduced the mistake accidentally in v4, after you
> complained of having too many functions, as I was introducing not one
> but two APIs: seprintf() and stprintf(), where seprintf() is what now
> we're calling sprintf_end(), and stprintf() we could call it
> sprintf_trunc().  So I did the mistake by trying to reduce the number of
> functions to just one, which is wrong.
> 
> So, maybe I should go back to those functions, and just give them good
> names.
> 
> What do you think of the following?
> 
> 	#define sprintf_array(a, ...)  sprintf_trunc(a, ARRAY_SIZE(a), __VA_ARGS__)
> 	#define vsprintf_array(a, ap)  vsprintf_trunc(a, ARRAY_SIZE(a), ap)
> 
> 	char *sprintf_end(char *p, const char end[0], const char *fmt, ...);
> 	char *vsprintf_end(char *p, const char end[0], const char *fmt, va_list args);
> 	int sprintf_trunc(char *buf, size_t size, const char *fmt, ...);
> 	int vsprintf_trunc(char *buf, size_t size, const char *fmt, va_list args);
> 
> 	char *sprintf_end(char *p, const char end[0], const char *fmt, ...)
> 	{
> 		va_list args;
> 
> 		va_start(args, fmt);
> 		p = vseprintf(p, end, fmt, args);
> 		va_end(args);
> 
> 		return p;
> 	}
> 
> 	char *vsprintf_end(char *p, const char end[0], const char *fmt, va_list args)
> 	{
> 		int len;
> 
> 		if (unlikely(p == NULL))
> 			return NULL;
> 
> 		len = vsprintf_trunc(p, end - p, fmt, args);
> 		if (unlikely(len < 0))
> 			return NULL;
> 
> 		return p + len;
> 	}
> 
> 	int sprintf_trunc(char *buf, size_t size, const char *fmt, ...)
> 	{
> 		va_list args;
> 		int len;
> 
> 		va_start(args, fmt);
> 		len = vstprintf(buf, size, fmt, args);
> 		va_end(args);
> 
> 		return len;
> 	}
> 
> 	int vsprintf_trunc(char *buf, size_t size, const char *fmt, va_list args)
> 	{
> 		int len;
> 
> 		if (WARN_ON_ONCE(size == 0 || size > INT_MAX))
> 			return -EOVERFLOW;
> 
> 		len = vsnprintf(buf, size, fmt, args);
> 		if (unlikely(len >= size))
> 			return -E2BIG;
> 
> 		return len;
> 	}
> 
> sprintf_trunc() is like strscpy(), but with a formatted string.  It
> could replace uses of s[c]nprintf() where there's a single call (no
> chained calls).
> 
> sprintf_array() is like the 2-argument version of strscpy().  It could
> replace s[c]nprintf() calls where there's no chained calls, where the
> input is an array.
> 
> sprintf_end() would replace the chained calls.
> 
> Does this sound good to you?
> 
> 
> Cheers,
> Alex
>

Re: [RFC v5 6/7] sprintf: Add [v]sprintf_array()

Posted by Alejandro Colomar 2 months, 4 weeks ago

Hi David,

On Fri, Jul 11, 2025 at 06:43:43PM +0100, David Laight wrote:
> On Fri, 11 Jul 2025 01:23:49 +0200
> Alejandro Colomar <alx@kernel.org> wrote:
> 
> > Hi Linus,
> > 
> > [I'll reply to both of your emails at once]
> > 
> > On Thu, Jul 10, 2025 at 02:58:24PM -0700, Linus Torvalds wrote:
> > > You took my suggestion, and then you messed it up.
> > > 
> > > Your version of sprintf_array() is broken. It evaluates 'a' twice.
> > > Because unlike ARRAY_SIZE(), your broken ENDOF() macro evaluates the
> > > argument.  
> > 
> > An array has no issue being evaluated twice (unless it's a VLA).  On the
> > other hand, I agree it's better to not do that in the first place.
> > My bad for forgetting about it.  Sorry.
> 
> Or a function that returns an array...

Actually, I was forgetting that the array could be gotten from a pointer
to array:

	int (*ap)[42] = ...;

	ENDOF(ap++);  // Evaluates ap++

Anyway, fixed in v6.


Cheers,
Alex

-- 
<https://www.alejandro-colomar.es/>

Re: [RFC v5 6/7] sprintf: Add [v]sprintf_array()

Posted by Alejandro Colomar 2 months, 4 weeks ago

On Fri, Jul 11, 2025 at 09:17:28PM +0200, Alejandro Colomar wrote:
> Hi David,
> 
> On Fri, Jul 11, 2025 at 06:43:43PM +0100, David Laight wrote:
> > On Fri, 11 Jul 2025 01:23:49 +0200
> > Alejandro Colomar <alx@kernel.org> wrote:
> > 
> > > Hi Linus,
> > > 
> > > [I'll reply to both of your emails at once]
> > > 
> > > On Thu, Jul 10, 2025 at 02:58:24PM -0700, Linus Torvalds wrote:
> > > > You took my suggestion, and then you messed it up.
> > > > 
> > > > Your version of sprintf_array() is broken. It evaluates 'a' twice.
> > > > Because unlike ARRAY_SIZE(), your broken ENDOF() macro evaluates the
> > > > argument.  
> > > 
> > > An array has no issue being evaluated twice (unless it's a VLA).  On the
> > > other hand, I agree it's better to not do that in the first place.
> > > My bad for forgetting about it.  Sorry.
> > 
> > Or a function that returns an array...
> 
> Actually, I was forgetting that the array could be gotten from a pointer
> to array:
> 
> 	int (*ap)[42] = ...;
> 
> 	ENDOF(ap++);  // Evaluates ap++

D'oh!  That should have been ENDOF(*ap++).
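
For readers following along, a small userspace sketch that counts the
evaluations.  The ENDOF_V5() definition is an assumption about the
shape of the v5 macro (it is not quoted in this thread), and get_buf(),
span() and span_n() are hypothetical stand-ins.  A function call is
used as the side effect instead of ap++, since using ap++ twice in one
argument list would be undefined behavior.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a)  (sizeof(a) / sizeof((a)[0]))
/* Assumed shape of the v5 macro: 'a' is evaluated outside of sizeof. */
#define ENDOF_V5(a)    ((a) + ARRAY_SIZE(a))

static int calls;
static char storage[16];

/* The side effect: every evaluation bumps 'calls'. */
static char (*get_buf(void))[16]
{
	calls++;
	return &storage;
}

/* Stand-ins for sprintf_end(p, end, ...) and sprintf_trunc(p, n, ...). */
static size_t span(const char *p, const char *end) { return (size_t)(end - p); }
static size_t span_n(const char *p, size_t n) { (void)p; return n; }

#define SPAN_END(a)   span((a), ENDOF_V5(a))	/* expands 'a' twice */
#define SPAN_SIZE(a)  span_n((a), ARRAY_SIZE(a))	/* 'a' evaluated once */

static int evals_array_size(void)
{
	calls = 0;
	(void)ARRAY_SIZE(*get_buf());	/* sizeof operands are not evaluated */
	return calls;
}

static int evals_endof(void)
{
	calls = 0;
	(void)ENDOF_V5(*get_buf());
	return calls;
}

static int evals_span_end(void)
{
	calls = 0;
	assert(SPAN_END(*get_buf()) == 16);
	return calls;
}

static int evals_span_size(void)
{
	calls = 0;
	assert(SPAN_SIZE(*get_buf()) == 16);
	return calls;
}
```

Under these assumptions the counts are 0 for ARRAY_SIZE(), 1 for
ENDOF_V5(), 2 for the end-pointer wrapper, and 1 for the size-based
wrapper.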

> Anyway, fixed in v6.
> 
> 
> Cheers,
> Alex
> 
> -- 
> <https://www.alejandro-colomar.es/>



-- 
<https://www.alejandro-colomar.es/>

Re: [RFC v5 6/7] sprintf: Add [v]sprintf_array()

Posted by Alejandro Colomar 2 months, 4 weeks ago

On Fri, Jul 11, 2025 at 01:23:56AM +0200, Alejandro Colomar wrote:
> Hi Linus,
> 
> [I'll reply to both of your emails at once]
> 
> On Thu, Jul 10, 2025 at 02:58:24PM -0700, Linus Torvalds wrote:
> > You took my suggestion, and then you messed it up.
> > 
> > Your version of sprintf_array() is broken. It evaluates 'a' twice.
> > Because unlike ARRAY_SIZE(), your broken ENDOF() macro evaluates the
> > argument.
> 
> An array has no issue being evaluated twice (unless it's a VLA).  On the
> other hand, I agree it's better to not do that in the first place.
> My bad for forgetting about it.  Sorry.
> 
> On Thu, Jul 10, 2025 at 03:08:29PM -0700, Linus Torvalds wrote:
> > If you want to return an error on truncation, do it right.  Not by
> > returning NULL, but by actually returning an error.
> 
> Okay.
> 
> > For example, in the kernel, we finally fixed 'strcpy()'. After about a
> > million different versions of 'copy a string' where every single
> > version was complete garbage, we ended up with 'strscpy()'. Yeah, the
> > name isn't lovely, but the *use* of it is:
> 
> I have implemented the same thing in shadow, called strtcpy() (T for
> truncation).  (With the difference that we read the string twice, since
> we don't care about threads.)
> 
> I also plan to propose standardization of that one in ISO C.
> 
> >  - it returns the length of the result for people who want it - which
> > is by far the most common thing people want
> 
> Agree.
> 
> >  - it returns an actual honest-to-goodness error code if something
> > overflowed, instead of the absolutely horrible "source length" of the
> > string that strlcpy() does and which is fundamentally broken (because
> > it requires that you walk *past* the end of the source,
> > Christ-on-a-stick what a broken interface)
> 
> Agree.
> 
> >  - it can take an array as an argument (without the need for another
> > name - see my earlier argument about not making up new names by just
> > having generics)
> 
> We can't make the same thing with sprintf() variants because they're
> variadic, so you can't count the number of arguments.  And since the
> 'end' argument is of the same type as the formatted string, we can't
> do it with _Generic reliably either.
> 
> > Now, it has nasty naming (exactly the kind of 'add random character'
> > naming that I was arguing against), and that comes from so many
> > different broken versions until we hit on something that works.
> > 
> > strncpy is horrible garbage. strlcpy is even worse. strscpy actually
> > works and so far hasn't caused issues (there's a 'pad' version for the
> > very rare situation where you want 'strncpy-like' padding, but it
> > still guarantees NUL-termination, and still has a good return value).
> 
> Agree.
> 
> > Let's agree to *not* make horrible garbage when making up new versions
> > of sprintf.
> 
> Agree.  I indeed introduced the mistake accidentally in v4, after you
> complained of having too many functions, as I was introducing not one
> but two APIs: seprintf() and stprintf(), where seprintf() is what now
> we're calling sprintf_end(), and stprintf() we could call it
> sprintf_trunc().  So I did the mistake by trying to reduce the number of
> functions to just one, which is wrong.
> 
> So, maybe I should go back to those functions, and just give them good
> names.
> 
> What do you think of the following?
> 
> 	#define sprintf_array(a, ...)  sprintf_trunc(a, ARRAY_SIZE(a), __VA_ARGS__)
> 	#define vsprintf_array(a, ap)  vsprintf_trunc(a, ARRAY_SIZE(a), ap)

Typo: vsprintf_array() forgot the fmt argument.  It should be
vsprintf_array(a, fmt, ap), passing fmt through to vsprintf_trunc().

> 
> 	char *sprintf_end(char *p, const char end[0], const char *fmt, ...);
> 	char *vsprintf_end(char *p, const char end[0], const char *fmt, va_list args);
> 	int sprintf_trunc(char *buf, size_t size, const char *fmt, ...);
> 	int vsprintf_trunc(char *buf, size_t size, const char *fmt, va_list args);
> 
> 	char *sprintf_end(char *p, const char end[0], const char *fmt, ...)
> 	{
> 		va_list args;
> 
> 		va_start(args, fmt);
> 		p = vseprintf(p, end, fmt, args);
> 		va_end(args);
> 
> 		return p;
> 	}
> 
> 	char *vsprintf_end(char *p, const char end[0], const char *fmt, va_list args)
> 	{
> 		int len;
> 
> 		if (unlikely(p == NULL))
> 			return NULL;
> 
> 		len = vsprintf_trunc(p, end - p, fmt, args);
> 		if (unlikely(len < 0))
> 			return NULL;
> 
> 		return p + len;
> 	}
> 
> 	int sprintf_trunc(char *buf, size_t size, const char *fmt, ...)
> 	{
> 		va_list args;
> 		int len;
> 
> 		va_start(args, fmt);
> 		len = vstprintf(buf, size, fmt, args);
> 		va_end(args);
> 
> 		return len;
> 	}
> 
> 	int vsprintf_trunc(char *buf, size_t size, const char *fmt, va_list args)
> 	{
> 		int len;
> 
> 		if (WARN_ON_ONCE(size == 0 || size > INT_MAX))
> 			return -EOVERFLOW;
> 
> 		len = vsnprintf(buf, size, fmt, args);
> 		if (unlikely(len >= size))
> 			return -E2BIG;
> 
> 		return len;
> 	}
> 
> sprintf_trunc() is like strscpy(), but with a formatted string.  It
> could replace uses of s[c]nprintf() where there's a single call (no
> chained calls).
> 
> sprintf_array() is like the 2-argument version of strscpy().  It could
> replace s[c]nprintf() calls where there's no chained calls, where the
> input is an array.
> 
> sprintf_end() would replace the chained calls.
> 
> Does this sound good to you?
> 
> 
> Cheers,
> Alex
> 
> -- 
> <https://www.alejandro-colomar.es/>



-- 
<https://www.alejandro-colomar.es/>

Re: [RFC v5 6/7] sprintf: Add [v]sprintf_array()

Posted by Alejandro Colomar 2 months, 4 weeks ago

On Fri, Jul 11, 2025 at 01:23:56AM +0200, Alejandro Colomar wrote:
> Hi Linus,
> 
> [I'll reply to both of your emails at once]
> 
> On Thu, Jul 10, 2025 at 02:58:24PM -0700, Linus Torvalds wrote:
> > You took my suggestion, and then you messed it up.
> > 
> > Your version of sprintf_array() is broken. It evaluates 'a' twice.
> > Because unlike ARRAY_SIZE(), your broken ENDOF() macro evaluates the
> > argument.
> 
> An array has no issue being evaluated twice (unless it's a VLA).  On the
> other hand, I agree it's better to not do that in the first place.
> My bad for forgetting about it.  Sorry.
> 
> On Thu, Jul 10, 2025 at 03:08:29PM -0700, Linus Torvalds wrote:
> > If you want to return an error on truncation, do it right.  Not by
> > returning NULL, but by actually returning an error.
> 
> Okay.
> 
> > For example, in the kernel, we finally fixed 'strcpy()'. After about a
> > million different versions of 'copy a string' where every single
> > version was complete garbage, we ended up with 'strscpy()'. Yeah, the
> > name isn't lovely, but the *use* of it is:
> 
> I have implemented the same thing in shadow, called strtcpy() (T for
> truncation).  (With the difference that we read the string twice, since
> we don't care about threads.)
> 
> I also plan to propose standardization of that one in ISO C.
> 
> >  - it returns the length of the result for people who want it - which
> > is by far the most common thing people want
> 
> Agree.
> 
> >  - it returns an actual honest-to-goodness error code if something
> > overflowed, instead of the absolutely horrible "source length" of the
> > string that strlcpy() does and which is fundamentally broken (because
> > it requires that you walk *past* the end of the source,
> > Christ-on-a-stick what a broken interface)
> 
> Agree.
> 
> >  - it can take an array as an argument (without the need for another
> > name - see my earlier argument about not making up new names by just
> > having generics)
> 
> We can't make the same thing with sprintf() variants because they're
> variadic, so you can't count the number of arguments.  And since the
> 'end' argument is of the same type as the formatted string, we can't
> do it with _Generic reliably either.
> 
> > Now, it has nasty naming (exactly the kind of 'add random character'
> > naming that I was arguing against), and that comes from so many
> > different broken versions until we hit on something that works.
> > 
> > strncpy is horrible garbage. strlcpy is even worse. strscpy actually
> > works and so far hasn't caused issues (there's a 'pad' version for the
> > very rare situation where you want 'strncpy-like' padding, but it
> > still guarantees NUL-termination, and still has a good return value).
> 
> Agree.
> 
> > Let's agree to *not* make horrible garbage when making up new versions
> > of sprintf.
> 
> Agree.  I indeed introduced the mistake accidentally in v4, after you
> complained of having too many functions, as I was introducing not one
> but two APIs: seprintf() and stprintf(), where seprintf() is what now
> we're calling sprintf_end(), and stprintf() we could call it
> sprintf_trunc().  So I did the mistake by trying to reduce the number of
> functions to just one, which is wrong.
> 
> So, maybe I should go back to those functions, and just give them good
> names.
> 
> What do you think of the following?
> 
> 	#define sprintf_array(a, ...)  sprintf_trunc(a, ARRAY_SIZE(a), __VA_ARGS__)
> 	#define vsprintf_array(a, ap)  vsprintf_trunc(a, ARRAY_SIZE(a), ap)
> 
> 	char *sprintf_end(char *p, const char end[0], const char *fmt, ...);
> 	char *vsprintf_end(char *p, const char end[0], const char *fmt, va_list args);
> 	int sprintf_trunc(char *buf, size_t size, const char *fmt, ...);
> 	int vsprintf_trunc(char *buf, size_t size, const char *fmt, va_list args);
> 
> 	char *sprintf_end(char *p, const char end[0], const char *fmt, ...)
> 	{
> 		va_list args;
> 
> 		va_start(args, fmt);
> 		p = vseprintf(p, end, fmt, args);

Typo here.  It should be vsprintf_end().

> 		va_end(args);
> 
> 		return p;
> 	}
> 
> 	char *vsprintf_end(char *p, const char end[0], const char *fmt, va_list args)
> 	{
> 		int len;
> 
> 		if (unlikely(p == NULL))
> 			return NULL;
> 
> 		len = vsprintf_trunc(p, end - p, fmt, args);
> 		if (unlikely(len < 0))
> 			return NULL;
> 
> 		return p + len;
> 	}
> 
> 	int sprintf_trunc(char *buf, size_t size, const char *fmt, ...)
> 	{
> 		va_list args;
> 		int len;
> 
> 		va_start(args, fmt);
> 		len = vstprintf(buf, size, fmt, args);

Typo here.  It should be vsprintf_trunc().

> 		va_end(args);
> 
> 		return len;
> 	}
> 
> 	int vsprintf_trunc(char *buf, size_t size, const char *fmt, va_list args)
> 	{
> 		int len;
> 
> 		if (WARN_ON_ONCE(size == 0 || size > INT_MAX))
> 			return -EOVERFLOW;
> 
> 		len = vsnprintf(buf, size, fmt, args);
> 		if (unlikely(len >= size))
> 			return -E2BIG;
> 
> 		return len;
> 	}
> 
> sprintf_trunc() is like strscpy(), but with a formatted string.  It
> could replace uses of s[c]nprintf() where there's a single call (no
> chained calls).
> 
> sprintf_array() is like the 2-argument version of strscpy().  It could
> replace s[c]nprintf() calls where there's no chained calls, where the
> input is an array.
> 
> sprintf_end() would replace the chained calls.
> 
> Does this sound good to you?
> 
> 
> Cheers,
> Alex
> 
> -- 
> <https://www.alejandro-colomar.es/>



-- 
<https://www.alejandro-colomar.es/>

[RFC v5 7/7] mm: Use [v]sprintf_array() to avoid specifying the array size

Posted by Alejandro Colomar 2 months, 4 weeks ago

Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 mm/backing-dev.c    | 2 +-
 mm/cma.c            | 4 ++--
 mm/cma_debug.c      | 2 +-
 mm/hugetlb.c        | 3 +--
 mm/hugetlb_cgroup.c | 2 +-
 mm/hugetlb_cma.c    | 2 +-
 mm/kasan/report.c   | 3 +--
 mm/memblock.c       | 4 ++--
 mm/percpu.c         | 2 +-
 mm/shrinker_debug.c | 2 +-
 mm/zswap.c          | 2 +-
 11 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 783904d8c5ef..c4e588135aea 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -1090,7 +1090,7 @@ int bdi_register_va(struct backing_dev_info *bdi, const char *fmt, va_list args)
 	if (bdi->dev)	/* The driver needs to use separate queues per device */
 		return 0;
 
-	vsnprintf(bdi->dev_name, sizeof(bdi->dev_name), fmt, args);
+	vsprintf_array(bdi->dev_name, fmt, args);
 	dev = device_create(&bdi_class, NULL, MKDEV(0, 0), bdi, bdi->dev_name);
 	if (IS_ERR(dev))
 		return PTR_ERR(dev);
diff --git a/mm/cma.c b/mm/cma.c
index c04be488b099..61d97a387670 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -237,9 +237,9 @@ static int __init cma_new_area(const char *name, phys_addr_t size,
 	cma_area_count++;
 
 	if (name)
-		snprintf(cma->name, CMA_MAX_NAME, "%s", name);
+		sprintf_array(cma->name, "%s", name);
 	else
-		snprintf(cma->name, CMA_MAX_NAME,  "cma%d\n", cma_area_count);
+		sprintf_array(cma->name, "cma%d\n", cma_area_count);
 
 	cma->available_count = cma->count = size >> PAGE_SHIFT;
 	cma->order_per_bit = order_per_bit;
diff --git a/mm/cma_debug.c b/mm/cma_debug.c
index fdf899532ca0..751eae9f6364 100644
--- a/mm/cma_debug.c
+++ b/mm/cma_debug.c
@@ -186,7 +186,7 @@ static void cma_debugfs_add_one(struct cma *cma, struct dentry *root_dentry)
 	rangedir = debugfs_create_dir("ranges", tmp);
 	for (r = 0; r < cma->nranges; r++) {
 		cmr = &cma->ranges[r];
-		snprintf(rdirname, sizeof(rdirname), "%d", r);
+		sprintf_array(rdirname, "%d", r);
 		dir = debugfs_create_dir(rdirname, rangedir);
 		debugfs_create_file("base_pfn", 0444, dir,
 			    &cmr->base_pfn, &cma_debugfs_fops);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6a3cf7935c14..70acc8b3cbb8 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4780,8 +4780,7 @@ void __init hugetlb_add_hstate(unsigned int order)
 	for (i = 0; i < MAX_NUMNODES; ++i)
 		INIT_LIST_HEAD(&h->hugepage_freelists[i]);
 	INIT_LIST_HEAD(&h->hugepage_activelist);
-	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
-					huge_page_size(h)/SZ_1K);
+	sprintf_array(h->name, "hugepages-%lukB", huge_page_size(h)/SZ_1K);
 
 	parsed_hstate = h;
 }
diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
index 58e895f3899a..0953cea93759 100644
--- a/mm/hugetlb_cgroup.c
+++ b/mm/hugetlb_cgroup.c
@@ -822,7 +822,7 @@ hugetlb_cgroup_cfttypes_init(struct hstate *h, struct cftype *cft,
 	for (i = 0; i < tmpl_size; cft++, tmpl++, i++) {
 		*cft = *tmpl;
 		/* rebuild the name */
-		snprintf(cft->name, MAX_CFTYPE_NAME, "%s.%s", buf, tmpl->name);
+		sprintf_array(cft->name, "%s.%s", buf, tmpl->name);
 		/* rebuild the private */
 		cft->private = MEMFILE_PRIVATE(idx, tmpl->private);
 		/* rebuild the file_offset */
diff --git a/mm/hugetlb_cma.c b/mm/hugetlb_cma.c
index e0f2d5c3a84c..bae82a97a43c 100644
--- a/mm/hugetlb_cma.c
+++ b/mm/hugetlb_cma.c
@@ -211,7 +211,7 @@ void __init hugetlb_cma_reserve(int order)
 
 		size = round_up(size, PAGE_SIZE << order);
 
-		snprintf(name, sizeof(name), "hugetlb%d", nid);
+		sprintf_array(name, "hugetlb%d", nid);
 		/*
 		 * Note that 'order per bit' is based on smallest size that
 		 * may be returned to CMA allocator in the case of
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 8357e1a33699..3b40225e7873 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -486,8 +486,7 @@ static void print_memory_metadata(const void *addr)
 		char buffer[4 + (BITS_PER_LONG / 8) * 2];
 		char metadata[META_BYTES_PER_ROW];
 
-		snprintf(buffer, sizeof(buffer),
-				(i == 0) ? ">%px: " : " %px: ", row);
+		sprintf_array(buffer, (i == 0) ? ">%px: " : " %px: ", row);
 
 		/*
 		 * We should not pass a shadow pointer to generic
diff --git a/mm/memblock.c b/mm/memblock.c
index 0e9ebb8aa7fe..3eea7a177330 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -2021,7 +2021,7 @@ static void __init_memblock memblock_dump(struct memblock_type *type)
 		flags = rgn->flags;
 #ifdef CONFIG_NUMA
 		if (numa_valid_node(memblock_get_region_node(rgn)))
-			snprintf(nid_buf, sizeof(nid_buf), " on node %d",
+			sprintf_array(nid_buf, " on node %d",
 				 memblock_get_region_node(rgn));
 #endif
 		pr_info(" %s[%#x]\t[%pa-%pa], %pa bytes%s flags: %#x\n",
@@ -2379,7 +2379,7 @@ int reserve_mem_release_by_name(const char *name)
 
 	start = phys_to_virt(map->start);
 	end = start + map->size - 1;
-	snprintf(buf, sizeof(buf), "reserve_mem:%s", name);
+	sprintf_array(buf, "reserve_mem:%s", name);
 	free_reserved_area(start, end, 0, buf);
 	map->size = 0;
 
diff --git a/mm/percpu.c b/mm/percpu.c
index b35494c8ede2..a467102c2405 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -3186,7 +3186,7 @@ int __init pcpu_page_first_chunk(size_t reserved_size, pcpu_fc_cpu_to_node_fn_t
 	int upa;
 	int nr_g0_units;
 
-	snprintf(psize_str, sizeof(psize_str), "%luK", PAGE_SIZE >> 10);
+	sprintf_array(psize_str, "%luK", PAGE_SIZE >> 10);
 
 	ai = pcpu_build_alloc_info(reserved_size, 0, PAGE_SIZE, NULL);
 	if (IS_ERR(ai))
diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
index 20eaee3e97f7..f529ac29557c 100644
--- a/mm/shrinker_debug.c
+++ b/mm/shrinker_debug.c
@@ -176,7 +176,7 @@ int shrinker_debugfs_add(struct shrinker *shrinker)
 		return id;
 	shrinker->debugfs_id = id;
 
-	snprintf(buf, sizeof(buf), "%s-%d", shrinker->name, id);
+	sprintf_array(buf, "%s-%d", shrinker->name, id);
 
 	/* create debugfs entry */
 	entry = debugfs_create_dir(buf, shrinker_debugfs_root);
diff --git a/mm/zswap.c b/mm/zswap.c
index 204fb59da33c..e66b5c5b1ecf 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -271,7 +271,7 @@ static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
 		return NULL;
 
 	/* unique name for each pool specifically required by zsmalloc */
-	snprintf(name, 38, "zswap%x", atomic_inc_return(&zswap_pools_count));
+	sprintf_array(name, "zswap%x", atomic_inc_return(&zswap_pools_count));
 	pool->zpool = zpool_create_pool(type, name, gfp);
 	if (!pool->zpool) {
 		pr_err("%s zpool not available\n", type);
-- 
2.50.0

[RFC v3 0/7] Add and use seprintf() instead of less ergonomic APIs

Posted by Alejandro Colomar 3 months ago

Hi,

In this v3:

-  I've added Fixes: tags for all commits that introduced issues being
   fixed in this patch set.  I've also added the people who signed or
   reviewed those patches to CC.

-  I've fixed a typo in a comment.

-  I've also added a STPRINTF() macro and used it to remove explicit
   uses of sizeof().

Now, only 5 calls to snprintf(3) remain under mm/:

	$ grep -rnI nprint mm/
	mm/hugetlb_cgroup.c:674:		snprintf(buf, size, "%luGB", hsize / SZ_1G);
	mm/hugetlb_cgroup.c:676:		snprintf(buf, size, "%luMB", hsize / SZ_1M);
	mm/hugetlb_cgroup.c:678:		snprintf(buf, size, "%luKB", hsize / SZ_1K);
	mm/kfence/report.c:75:		int len = scnprintf(buf, sizeof(buf), "%ps", (void *)stack_entries[skipnr]);
	mm/kmsan/report.c:42:		len = scnprintf(buf, sizeof(buf), "%ps",

The first three are fine.  The remaining two, I'd like someone to check
if they should be replaced by one of these wrappers.  I had doubts about
it, and would need someone understanding that code to check them.
Mainly, do we really want to ignore truncation?

The questions from v1 are still up in the air.

I've written an analysis of snprintf(3), why it's dangerous, and how
these APIs address that, and will present it as a proposal for
standardization of these APIs in ISO C2y.  I'll send that as a reply to
this message in a moment, as I believe it will be interesting for
linux-hardening@.


Have a lovely night!
Alex

Alejandro Colomar (7):
  vsprintf: Add [v]seprintf(), [v]stprintf()
  stacktrace, stackdepot: Add seprintf()-like variants of functions
  mm: Use seprintf() instead of less ergonomic APIs
  array_size.h: Add ENDOF()
  mm: Fix benign off-by-one bugs
  sprintf: Add [V]STPRINTF()
  mm: Use [V]STPRINTF() to avoid specifying the array size

 include/linux/array_size.h |   6 ++
 include/linux/sprintf.h    |   8 +++
 include/linux/stackdepot.h |  13 +++++
 include/linux/stacktrace.h |   3 +
 kernel/stacktrace.c        |  28 ++++++++++
 lib/stackdepot.c           |  12 ++++
 lib/vsprintf.c             | 109 +++++++++++++++++++++++++++++++++++++
 mm/backing-dev.c           |   2 +-
 mm/cma.c                   |   4 +-
 mm/cma_debug.c             |   2 +-
 mm/hugetlb.c               |   3 +-
 mm/hugetlb_cgroup.c        |   2 +-
 mm/hugetlb_cma.c           |   2 +-
 mm/kasan/report.c          |   3 +-
 mm/kfence/kfence_test.c    |  28 +++++-----
 mm/kmsan/kmsan_test.c      |   6 +-
 mm/memblock.c              |   4 +-
 mm/mempolicy.c             |  18 +++---
 mm/page_owner.c            |  32 ++++++-----
 mm/percpu.c                |   2 +-
 mm/shrinker_debug.c        |   2 +-
 mm/slub.c                  |   5 +-
 mm/zswap.c                 |   2 +-
 23 files changed, 238 insertions(+), 58 deletions(-)

Range-diff against v2:
1:  64334f0b94d6 = 1:  64334f0b94d6 vsprintf: Add [v]seprintf(), [v]stprintf()
2:  9c140de9842d = 2:  9c140de9842d stacktrace, stackdepot: Add seprintf()-like variants of functions
3:  e3271b5f2ad9 ! 3:  033bf00f1fcf mm: Use seprintf() instead of less ergonomic APIs
    @@ Commit message
                 Again, the 'p += snprintf()' anti-pattern.  This is UB, and by
                 using seprintf() we've fixed the bug.
     
    +    Fixes: f99e12b21b84 (2021-07-30; "kfence: add function to mask address bits")
    +    [alx: that commit introduced dead code]
    +    Fixes: af649773fb25 (2024-07-17; "mm/numa_balancing: teach mpol_to_str about the balancing mode")
    +    [alx: that commit added p+=snprintf() calls, which are UB]
    +    Fixes: 2291990ab36b (2008-04-28; "mempolicy: clean-up mpol-to-str() mempolicy formatting")
    +    [alx: that commit changed p+=sprintf() into p+=snprintf(), which is still UB]
    +    Fixes: 948927ee9e4f (2013-11-13; "mm, mempolicy: make mpol_to_str robust and always succeed")
    +    [alx: that commit changes old code into p+=snprintf(), which is still UB]
    +    [alx: that commit also produced dead code by leaving the last 'p+=...']
    +    Fixes: d65360f22406 (2022-09-26; "mm/slub: clean up create_unique_id()")
    +    [alx: that commit changed p+=sprintf() into p+=snprintf(), which is still UB]
         Cc: Kees Cook <kees@kernel.org>
         Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
    +    Cc: Sven Schnelle <svens@linux.ibm.com>
    +    Cc: Marco Elver <elver@google.com>
    +    Cc: Heiko Carstens <hca@linux.ibm.com>
    +    Cc: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
    +    Cc: "Huang, Ying" <ying.huang@intel.com>
    +    Cc: Andrew Morton <akpm@linux-foundation.org>
    +    Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
    +    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    +    Cc: David Rientjes <rientjes@google.com>
    +    Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
    +    Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
    +    Cc: Chao Yu <chao.yu@oppo.com>
    +    Cc: Vlastimil Babka <vbabka@suse.cz>
         Signed-off-by: Alejandro Colomar <alx@kernel.org>
     
      ## mm/kfence/kfence_test.c ##
4:  5331d286ceca ! 4:  d8bd0e1d308b array_size.h: Add ENDOF()
    @@ include/linux/array_size.h
      #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
      
     +/**
    -+ * ENDOF - get a pointer to one past the last element in array @arr
    -+ * @arr: array
    ++ * ENDOF - get a pointer to one past the last element in array @a
    ++ * @a: array
     + */
     +#define ENDOF(a)  (a + ARRAY_SIZE(a))
     +
5:  08cfdd2bf779 ! 5:  740755c1a888 mm: Fix benign off-by-one bugs
    @@ Commit message
         'end' --that is, at most the terminating null byte will be written at
         'end-1'--.
     
    +    Fixes: bc8fbc5f305a (2021-02-26; "kfence: add test suite")
    +    Fixes: 8ed691b02ade (2022-10-03; "kmsan: add tests for KMSAN")
         Cc: Kees Cook <kees@kernel.org>
         Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
    +    Cc: Alexander Potapenko <glider@google.com>
    +    Cc: Marco Elver <elver@google.com>
    +    Cc: Dmitry Vyukov <dvyukov@google.com>
    +    Cc: Alexander Potapenko <glider@google.com>
    +    Cc: Jann Horn <jannh@google.com>
    +    Cc: Andrew Morton <akpm@linux-foundation.org>
    +    Cc: Linus Torvalds <torvalds@linux-foundation.org>
         Signed-off-by: Alejandro Colomar <alx@kernel.org>
     
      ## mm/kfence/kfence_test.c ##
-:  ------------ > 6:  44d05559398c sprintf: Add [V]STPRINTF()
-:  ------------ > 7:  d0e95db3c80a mm: Use [V]STPRINTF() to avoid specifying the array size
-- 
2.50.0

Re: [RFC v3 0/7] Add and use seprintf() instead of less ergonomic APIs

Posted by Alejandro Colomar 3 months ago

On Mon, Jul 07, 2025 at 07:06:06AM +0200, Alejandro Colomar wrote:
> I've written an analysis of snprintf(3), why it's dangerous, and how
> these APIs address that, and will present it as a proposal for
> standardization of these APIs in ISO C2y.  I'll send that as a reply to
> this message in a moment, as I believe it will be interesting for
> linux-hardening@.

Hi,

Here is the proposal for ISO C2y (see below).  I'll also send it to the
C Committee for discussion.


Have a lovely night!
Alex

---
Name
	alx-0049r1 - add seprintf()

Principles
	-  Codify existing practice to address evident deficiencies.
	-  Enable secure programming

Category
	Standardize existing libc APIs

Author
	Alejandro Colomar <alx@kernel.org>

	Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>

History
	<https://www.alejandro-colomar.es/src/alx/alx/wg14/alx-0049.git/>

	r0 (2025-07-06):
	-  Initial draft.

	r1 (2025-07-06):
	-  wfix.
	-  tfix.
	-  Expand on the off-by-one bugs.
	-  Note that ignoring truncation is not valid most of the time.

Rationale
	snprintf(3) is very difficult to chain for writing parts of a
	string in separate calls, such as in a loop.

	Let's start from the obvious sprintf(3) code (sprintf(3) will
	not prevent overflow, but let's take it as a baseline from which
	programmers start thinking):

		p = buf;
		for (...)
			p += sprintf(p, ...);

	Then, programmers will start thinking about preventing buffer
	overflows.  Programmers sometimes will naively add some buffer
	size information and use snprintf(3):

		p = buf;
		size = countof(buf);
		for (...)
			p += snprintf(p, size - (p - buf), ...);

		if (p >= buf + size)  // or worse, (p > buf + size - 1)
			goto fail;

	(Except for minor differences, this kind of code can be found
	 everywhere.  Here are a couple of examples:
	 <https://elixir.bootlin.com/linux/v6.14/source/mm/slub.c#L7231>
	 <https://elixir.bootlin.com/linux/v6.14/source/mm/mempolicy.c#L3369>.)

	This has several issues, starting with the difficulty of getting
	the second argument right.  Sometimes, programmers will be too
	confused, and slap a -1 there just to be safe.

		p = buf;
		size = countof(buf);
		for (...)
			p += snprintf(p, size - (p - buf) - 1, ...);

		if (p >= buf + size - 1)
			goto fail;

	(Except for minor differences, this kind of code can be found
	 everywhere.  Here are a couple of examples:
	 <https://elixir.bootlin.com/linux/v6.14/source/mm/kfence/kfence_test.c#L113>
	 <https://elixir.bootlin.com/linux/v6.14/source/mm/kmsan/kmsan_test.c#L108>.)
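	As a minimal sketch of why that extra -1 is an off-by-one (the
	8-byte buffer, the test string, and the helper name
	snprintf_reports_truncation() are assumptions made up for
	illustration): a 7-character string plus its terminating null
	byte fits exactly in 8 bytes, yet passing 'size - 1' makes
	snprintf(3) truncate it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the 7-character string "1234567" into
 * an 8-byte buffer, passing 'size' as the second argument to
 * snprintf(3).  Returns 1 if snprintf(3) reports truncation (or
 * fails), and 0 if the whole string was written.
 */
static int snprintf_reports_truncation(size_t size)
{
	char buf[8];
	int len;

	assert(size <= sizeof(buf));  /* never claim more room than exists */

	len = snprintf(buf, size, "%s", "1234567");

	return len < 0 || (size_t) len >= size;
}
```

	Called with 8 (the real size), the helper reports no
	truncation; called with 8 - 1, it reports truncation, so one
	usable byte of the buffer is lost.  The bug is benign, but it
	is still a bug.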

	Programmers will sometimes hold a pointer to one past the last
	element in the array.  This is a wise choice, as that pointer is
	constant throughout the lifetime of the object.  Then,
	programmers might end up with something like this:

		p = buf;
		e = buf + countof(buf);
		for (...)
			p += snprintf(p, e - p, ...);

		if (p >= e)
			goto fail;

	This is certainly much cleaner.  Now a programmer might focus on
	the fact that this can overflow the pointer.  An easy approach
	would be to make sure that the function never returns more than
	the remaining size.  That is, one could implement something like
	this scnprintf() --name chosen to match the Linux kernel API of
	the same name--.  For the sake of simplicity, let's ignore
	multiple evaluation of arguments.

		#define scnprintf(s, size, ...)                 \
		({                                              \
			int len_;                               \
			len_ = snprintf(s, size, __VA_ARGS__);  \
			if (len_ == -1)                         \
				len_ = 0;                       \
			if (len_ >= size)                       \
				len_ = size - 1;                \
		                                                \
			len_;                                   \
		})

		p = buf;
		e = buf + countof(buf);
		for (...)
			p += scnprintf(p, e - p, ...);

	(Except for minor differences, this kind of code can be found
	 everywhere.  Here's an example:
	 <https://elixir.bootlin.com/linux/v6.14/source/mm/kfence/kfence_test.c#L131>.)

	Now the programmer got rid of pointer overflow.  However, they
	now have silent truncation that cannot be detected.  In some
	cases this may seem good enough.  However, often it's not.  And
	anyway, some code remains using snprintf(3) to be able to detect
	truncation.

	Moreover, this kind of code ignores the fact that vsnprintf(3)
	can fail internally, in which case there's not even a truncated
	string.  In the kernel, they're fine, because their internal
	vsnprintf() doesn't seem to ever fail, so they can always rely
	on the truncated string.  This is not reliable in projects that
	rely on the libc vsnprintf(3).
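	The silent truncation can be sketched in user space as follows.
	This is only an illustration under stated assumptions:
	my_scnprintf() is a hypothetical stand-in written on top of the
	libc vsnprintf(3), not the kernel function, and chain_demo()
	and its strings are made up for the example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for a scnprintf()-like function. */
static int my_scnprintf(char *s, size_t size, const char *fmt, ...)
{
	va_list ap;
	int len;

	if (size == 0)
		return 0;

	va_start(ap, fmt);
	len = vsnprintf(s, size, fmt, ap);
	va_end(ap);

	if (len < 0)
		return 0;                 /* hard error: claim nothing was written */
	if ((size_t) len >= size)
		return (int) (size - 1);  /* clamp to what actually fit */
	return len;
}

/* Chain two calls into an 8-byte buffer; return the resulting length. */
static size_t chain_demo(char buf[8])
{
	char *p = buf;
	char *e = buf + 8;

	p += my_scnprintf(p, (size_t) (e - p), "%s", "hello");
	p += my_scnprintf(p, (size_t) (e - p), "%s", ", world");

	assert(p < e);  /* the clamping keeps p inside the buffer ... */

	return (size_t) (p - buf);  /* ... but nothing says data was lost */
}
```

	Here the buffer ends up holding "hello, " and the returned
	length is 7.  The pointer never overflows, but the caller has
	no way to tell that "world" was dropped.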

	For the code that needs to detect truncation, a programmer might
	choose a different path.  They would keep using snprintf(3), but
	would use a temporary length variable instead of the pointer.

		p = buf;
		e = buf + countof(buf);
		for (...) {
			len = snprintf(p, e - p, ...);
			if (len == -1)
				goto fail;
			if (len >= e - p)
				goto fail;
			p += len;
		}

	This is naturally error-prone.  A colleague of mine --who is an
	excellent programmer, to be clear--, had a bug even after
	knowing about it and having tried to fix it.  That shows how
	hard it is to write this correctly:
	<https://github.com/nginx/unit/pull/734#discussion_r1043963527>

	Likewise, the strlcpy(3) manual page from OpenBSD documents a
	similar issue when chaining calls to strlcpy(3) --which was
	designed with semantics equivalent to snprintf(3), except for
	not formatting the string--:

	|	     char *dir, *file, pname[MAXPATHLEN];
	|	     size_t n;
	|
	|	     ...
	|
	|	     n = strlcpy(pname, dir, sizeof(pname));
	|	     if (n >= sizeof(pname))
	|		     goto toolong;
	|	     if (strlcpy(pname + n, file, sizeof(pname) - n) >= sizeof(pname) - n)
	|		     goto toolong;
	|
	|       However, one may question the validity of such optimiza‐
	|       tions, as they defeat the whole purpose of strlcpy() and
	|       strlcat().  As a matter of fact, the  first  version  of
	|       this manual page got it wrong.

	Finally, a programmer might realize that while this is error-
	prone, this is indeed the right thing to do.  There's no way to
	avoid it.  One could then think of encapsulating this into an
	API that at least would make it easy to write.  Then, one might
	wonder what the right parameters are for such an API.  The only
	immutable thing in the loop is 'e'.  And apart from that, one
	needs to know where to write, which is 'p'.  Let's start with
	those, and try to keep all the other information (size, len)
	without escaping the API.  Again, let's ignore multiple-
	evaluation issues in this macro for the sake of simplicity.

		#define foo(p, e, ...)                                \
		({                                                    \
			int  len_ = snprintf(p, e - p, __VA_ARGS__);  \
			if (len_ == -1)                               \
				p = NULL;                             \
			else if (len_ >= e - p)                       \
				p = NULL;                             \
			else                                          \
				p += len_;                            \
			p;                                            \
		})

		p = buf;
		e = buf + countof(buf);
		for (...) {
			p = foo(p, e, ...);
			if (p == NULL)
				goto fail;
		}

	We've advanced a lot.  We got rid of the buffer overflow; we
	also got rid of the error-prone code at call site.  However, one
	might think that checking for truncation after every call is
	cumbersome.  Indeed, it is possible to slightly tweak the
	internals of foo() to propagate errors from previous calls.

		#define seprintf(p, e, ...)                           \
		({                                                    \
			if (p != NULL) {                              \
				int  len_;                            \
		                                                      \
				len_ = snprintf(p, e - p, __VA_ARGS__); \
				if (len_ == -1)                       \
					p = NULL;                     \
				else if (len_ >= e - p)               \
					p = NULL;                     \
				else                                  \
					p += len_;                    \
			}                                             \
			p;                                            \
		})

		p = buf;
		e = buf + countof(buf);
		for (...)
			p = seprintf(p, e, ...);

		if (p == NULL)
			goto fail;

	By propagating an input null pointer directly to the output of
	the API, which I've called seprintf() --the 'e' refers to the
	'end' pointer, which is the key in this API--, we've allowed
	ignoring null pointers until after the very last call.  If we
	compare our resulting code to the sprintf(3)-based baseline, we
	got --perhaps unsurprisingly-- something quite close to it:

		p = buf;
		for (...)
			p += sprintf(p, ...);

	vs

		p = buf;
		e = buf + countof(buf);
		for (...)
			p = seprintf(p, e, ...);

		if (p == NULL)
			goto fail;

	And the seprintf() version is safe against both truncation and
	buffer overflow.

	Some important details of the API are:

	-  When 'p' is NULL, the API must preserve errno.  This is
	   important to be able to determine the cause of the error
	   after all the chained calls, even when the error occurred in
	   some call in the middle of the chain.

	-  When truncation occurs, a distinct errno value must be used,
	   to signal the programmer that at least the string is reliable
	   to be used as a null-terminated string.  The error code
	   chosen is E2BIG, for compatibility with strscpy(), a Linux
	   kernel internal API with which this API shares many features
	   in common.

	-  When a hard error (an internal snprintf(3) error) occurs, an
	   error code different from E2BIG must be used.  It is
	   important to set errno, because if an implementation were to
	   return NULL without setting errno, an old value of E2BIG
	   could lead the programmer to believe the string was
	   successfully written (and truncated), and read it with
	   disastrous consequences.
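
	The three details above could be sketched as a function on top
	of the libc vsnprintf(3).  This is a minimal sketch, not the
	shadow-utils or kernel implementation: the names my_seprintf()
	and my_vseprintf() are hypothetical, and the choice of EIO for
	a hard error that left errno unset (or set to E2BIG) is an
	assumption, since the proposal only requires a non-zero value
	other than E2BIG.

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

static char *my_vseprintf(char *p, const char *end, const char *fmt, va_list ap)
{
	int len;
	size_t size;

	if (p == NULL)
		return NULL;  /* earlier failure: propagate it, errno untouched */

	assert(end > p);  /* precondition from the proposed wording */
	size = (size_t) (end - p);

	len = vsnprintf(p, size, fmt, ap);
	if (len < 0) {
		/* Hard error: make sure errno is non-zero and not E2BIG. */
		if (errno == 0 || errno == E2BIG)
			errno = EIO;  /* assumption: any other code would do */
		return NULL;
	}
	if ((size_t) len >= size) {
		errno = E2BIG;  /* truncated, but still a valid string */
		return NULL;
	}

	return p + len;  /* points at the terminating null byte */
}

static char *my_seprintf(char *p, const char *end, const char *fmt, ...)
{
	va_list ap;

	va_start(ap, fmt);
	p = my_vseprintf(p, end, fmt, ap);
	va_end(ap);

	return p;
}
```

	With an 8-byte buffer, writing "123" and then "-ab" succeeds; a
	third call that needs more room than remains returns NULL with
	errno set to E2BIG, and any later call in the chain returns
	NULL without modifying errno.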

Prior art
	This API is implemented in the shadow-utils project.

	Plan9 designed something quite close, which they call
	seprint(2).  The parameters are the same --the right choice--,
	but they got the semantics for corner cases wrong.  Ironically,
	the existing Plan9 code I've seen seems to expect the semantics
	that I chose, regardless of the actual semantics of the Plan9
	API.  This is --I suspect--, because my semantics are actually
	the intuitive semantics that one would naively guess of an API
	with these parameters and return value.

	I've implemented this API for the Linux kernel, and found and
	fixed an amazing number of bugs and other questionable code in
	just the first handful of files that I inspected.
	<https://lore.kernel.org/linux-hardening/cover.1751747518.git.alx@kernel.org/T/#t>
	<https://lore.kernel.org/linux-hardening/cover.1751823326.git.alx@kernel.org/T/#t>

Future directions
	The 'e = buf + _Countof(buf)' construct is something I've found
	to be quite common.  It would be interesting to have an
	_Endof operator that would return a pointer to one past the last
	element of an array.  It would require an array operand, just
	like _Countof.  If an _Endof operator is deemed too cumbersome
	for implementation, an endof() standard macro that expands to
	the obvious implementation with _Countof could be okay.

	This operator (or operator-like macro) would prevent off-by-one
	bugs when calculating the end sentinel value, such as those
	shown above (with links to Linux kernel real bugs).
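
	As a rough sketch of the macro fallback (my_countof() and
	my_endof() are hypothetical names standing in for _Countof and
	endof(); unlike a real operator, these macros do not reject a
	pointer operand):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macros; _Countof would replace my_countof(). */
#define my_countof(a)  (sizeof(a) / sizeof((a)[0]))
#define my_endof(a)    ((a) + my_countof(a))

/* Returns the distance from the start of a 16-byte array to its end. */
static ptrdiff_t endof_demo(void)
{
	char buf[16];
	const char *e = my_endof(buf);

	assert(e == &buf[16]);  /* one past the last element */

	return e - buf;
}
```

	The end pointer is computed in one place from the array itself,
	so there is no size arithmetic at the call site in which to
	slip a stray -1.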

Proposed wording
	Based on N3550.

    7.24.6  Input/output <stdio.h> :: Formatted input/output functions
	## New section after 7.24.6.6 ("The snprintf function"):

	+7.24.6.6+1  The <b>seprintf</b> function
	+
	+Synopsis
	+1	#include <stdio.h>
	+	char *seprintf(char *restrict p, const char end[0], const char *restrict format, ...);
	+
	+Description
	+2	The <b>$0</b> function
	+	is equivalent to <b>fprintf</b>,
	+	except that the output is written into an array
	+	(specified by argument <tt>p</tt>)
	+	rather than a stream.
	+	If <tt>p</tt> is a null pointer,
	+	nothing is written,
	+	and the function returns a null pointer.
	+	Otherwise,
	+	<tt>end</tt> shall compare greater than <tt>p</tt>;
	+	the function writes at most
	+	<tt>end - p - 1</tt> non-null characters,
	+	the remaining output characters are discarded,
	+	and a null character is written
	+	at the end of the characters
	+	actually written to the array.
	+	If copying takes place between objects that overlap,
	+	the behavior is undefined.
	+
	+Returns
	+3	The <b>$0</b> function returns
	+	a pointer to the terminating null character
	+	if the output was written
	+	without discarding any characters.
	+
	+4
	+	If <tt>p</tt> is a null pointer,
	+	a null pointer is returned,
	+	and <b>errno</b> is not modified.
	+
	+5
	+	If any characters are discarded,
	+	a null pointer is returned,
	+	and the value of the macro <b>E2BIG</b>
	+	is stored in <b>errno</b>.
	+
	+6
	+	If an error occurred,
	+	a null pointer is returned,
	+	and an implementation-defined non-zero value
	+	is stored in <b>errno</b>.

	## New section after 7.24.6.13 ("The vsnprintf function"):

	+7.24.6.13+1  The <b>vseprintf</b> function
	+
	+Synopsis
	+1	#include <stdio.h>
	+	char *vseprintf(char *restrict p, const char end[0], const char *restrict format, va_list arg);
	+
	+Description
	+2	The <b>$0</b> function
	+	is equivalent to
	+	<b>seprintf</b>,
	+	with the varying argument list replaced by <tt>arg</tt>.
	+
	+3
	+	The <tt>va_list</tt> argument to this function
	+	shall have been initialized by the <b>va_start</b> macro
	+	(and possibly subsequent <b>va_arg</b> invocations).
	+	This function does not invoke the <b>va_end</b> macro.343)

    7.33.2  Formatted wide character input/output functions
	## New section after 7.33.2.4 ("The swprintf function"):

	+7.33.2.4+1  The <b>sewprintf</b> function
	+
	+Synopsis
	+1	#include <wchar.h>
	+	wchar_t *sewprintf(wchar_t *restrict p, const wchar_t end[0], const wchar_t *restrict format, ...);
	+
	+Description
	+2	The <b>$0</b> function
	+	is equivalent to
	+	<b>seprintf</b>,
	+	except that it handles wide strings.

	## New section after 7.33.2.8 ("The vswprintf function"):

	+7.33.2.8+1  The <b>vsewprintf</b> function
	+
	+Synopsis
	+1	#include <wchar.h>
	+	wchar_t *vsewprintf(wchar_t *restrict p, const wchar_t end[0], const wchar_t *restrict format, va_list arg);
	+
	+Description
	+2	The <b>$0</b> function
	+	is equivalent to
	+	<b>sewprintf</b>,
	+	with the varying argument list replaced by <tt>arg</tt>.
	+
	+3
	+	The <tt>va_list</tt> argument to this function
	+	shall have been initialized by the <b>va_start</b> macro
	+	(and possibly subsequent <b>va_arg</b> invocations).
	+	This function does not invoke the <b>va_end</b> macro.407)

-- 
<https://www.alejandro-colomar.es/>

[RFC v3 1/7] vsprintf: Add [v]seprintf(), [v]stprintf()

Posted by Alejandro Colomar 3 months ago

seprintf()
==========

seprintf() is a function similar to stpcpy(3) in the sense that it
returns a pointer that is suitable for chaining to other copy
operations.

It takes a pointer to the end of the buffer as a sentinel for when to
truncate, which unlike a size, doesn't need to be updated after every
call.  This makes it much more ergonomic, avoiding manually calculating
the size after each copy, which is error prone.

It also makes error handling much easier, by reporting truncation with
a null pointer, which is accepted and transparently passed down by
subsequent seprintf() calls.  This results in only needing to report
errors once after a chain of seprintf() calls, unlike snprintf(3), which
requires checking after every call.

	p = buf;
	e = buf + countof(buf);
	p = seprintf(p, e, foo);
	p = seprintf(p, e, bar);
	if (p == NULL)
		goto trunc;

vs

	len = 0;
	size = countof(buf);
	len += snprintf(buf + len, size - len, foo);
	if (len >= size)
		goto trunc;

	len += snprintf(buf + len, size - len, bar);
	if (len >= size)
		goto trunc;

And also better than scnprintf() calls:

	len = 0;
	size = countof(buf);
	len += scnprintf(buf + len, size - len, foo);
	len += scnprintf(buf + len, size - len, bar);
	if (len >= size)
		goto trunc;

It seems apparent that it's a more elegant approach to string catenation.

stprintf()
==========

stprintf() is a helper that is needed for implementing seprintf()
--although it could be open-coded within vseprintf(), of course--, but
it's also useful by itself.  It has the same interface properties as
strscpy(): that is, it copies with truncation, and reports truncation
with -E2BIG.  It would be useful to replace some calls to snprintf(3)
and scnprintf() which don't need chaining, and where it's simpler to
pass a size.

It is better than plain snprintf(3), because it results in simpler error
detection (it doesn't need a check >=countof(buf), but rather <0).
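
As a user-space illustration of that error check (a sketch only:
my_stprintf() is a hypothetical name built on the libc vsnprintf(3),
and returning -EIO for a hard libc error is an assumption; the kernel
version in this patch does not handle that case):

```c
#include <assert.h>
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical user-space analogue of stprintf(). */
static int my_stprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list ap;
	int len;

	assert(buf != NULL);

	va_start(ap, fmt);
	len = vsnprintf(buf, size, fmt, ap);
	va_end(ap);

	if (len < 0)
		return -EIO;    /* assumption: libc vsnprintf(3) can fail */
	if ((size_t) len >= size)
		return -E2BIG;  /* truncated; buf still holds a valid string */

	return len;
}
```

With a 4-byte buffer, formatting 123 returns 3, while formatting 1234
returns -E2BIG, so the caller only needs to check for a negative value.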

Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 include/linux/sprintf.h |   4 ++
 lib/vsprintf.c          | 109 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 113 insertions(+)

diff --git a/include/linux/sprintf.h b/include/linux/sprintf.h
index 51cab2def9ec..c3dbfd2efd2b 100644
--- a/include/linux/sprintf.h
+++ b/include/linux/sprintf.h
@@ -11,8 +11,12 @@ __printf(2, 3) int sprintf(char *buf, const char * fmt, ...);
 __printf(2, 0) int vsprintf(char *buf, const char *, va_list);
 __printf(3, 4) int snprintf(char *buf, size_t size, const char *fmt, ...);
 __printf(3, 0) int vsnprintf(char *buf, size_t size, const char *fmt, va_list args);
+__printf(3, 4) int stprintf(char *buf, size_t size, const char *fmt, ...);
+__printf(3, 0) int vstprintf(char *buf, size_t size, const char *fmt, va_list args);
 __printf(3, 4) int scnprintf(char *buf, size_t size, const char *fmt, ...);
 __printf(3, 0) int vscnprintf(char *buf, size_t size, const char *fmt, va_list args);
+__printf(3, 4) char *seprintf(char *p, const char end[0], const char *fmt, ...);
+__printf(3, 0) char *vseprintf(char *p, const char end[0], const char *fmt, va_list args);
 __printf(2, 3) __malloc char *kasprintf(gfp_t gfp, const char *fmt, ...);
 __printf(2, 0) __malloc char *kvasprintf(gfp_t gfp, const char *fmt, va_list args);
 __printf(2, 0) const char *kvasprintf_const(gfp_t gfp, const char *fmt, va_list args);
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 01699852f30c..a3efacadb5e5 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -2892,6 +2892,37 @@ int vsnprintf(char *buf, size_t size, const char *fmt_str, va_list args)
 }
 EXPORT_SYMBOL(vsnprintf);
 
+/**
+ * vstprintf - Format a string and place it in a buffer
+ * @buf: The buffer to place the result into
+ * @size: The size of the buffer, including the trailing null space
+ * @fmt: The format string to use
+ * @args: Arguments for the format string
+ *
+ * The return value is the length of the new string.
+ * If the string is truncated, the function returns -E2BIG.
+ *
+ * If you're not already dealing with a va_list consider using stprintf().
+ *
+ * See the vsnprintf() documentation for format string extensions over C99.
+ */
+int vstprintf(char *buf, size_t size, const char *fmt, va_list args)
+{
+	int len;
+
+	len = vsnprintf(buf, size, fmt, args);
+
+	// It seems the kernel's vsnprintf() doesn't fail?
+	//if (unlikely(len < 0))
+	//	return -E2BIG;
+
+	if (unlikely(len >= size))
+		return -E2BIG;
+
+	return len;
+}
+EXPORT_SYMBOL(vstprintf);
+
 /**
  * vscnprintf - Format a string and place it in a buffer
  * @buf: The buffer to place the result into
@@ -2923,6 +2954,36 @@ int vscnprintf(char *buf, size_t size, const char *fmt, va_list args)
 }
 EXPORT_SYMBOL(vscnprintf);
 
+/**
+ * vseprintf - Format a string and place it in a buffer
+ * @p: The buffer to place the result into
+ * @end: A pointer to one past the last character in the buffer
+ * @fmt: The format string to use
+ * @args: Arguments for the format string
+ *
+ * The return value is a pointer to the trailing '\0'.
+ * If @p is NULL, the function returns NULL.
+ * If the string is truncated, the function returns NULL.
+ *
+ * If you're not already dealing with a va_list consider using seprintf().
+ *
+ * See the vsnprintf() documentation for format string extensions over C99.
+ */
+char *vseprintf(char *p, const char end[0], const char *fmt, va_list args)
+{
+	int len;
+
+	if (unlikely(p == NULL))
+		return NULL;
+
+	len = vstprintf(p, end - p, fmt, args);
+	if (unlikely(len < 0))
+		return NULL;
+
+	return p + len;
+}
+EXPORT_SYMBOL(vseprintf);
+
 /**
  * snprintf - Format a string and place it in a buffer
  * @buf: The buffer to place the result into
@@ -2950,6 +3011,30 @@ int snprintf(char *buf, size_t size, const char *fmt, ...)
 }
 EXPORT_SYMBOL(snprintf);
 
+/**
+ * stprintf - Format a string and place it in a buffer
+ * @buf: The buffer to place the result into
+ * @size: The size of the buffer, including the trailing null space
+ * @fmt: The format string to use
+ * @...: Arguments for the format string
+ *
+ * The return value is the length of the new string.
+ * If the string is truncated, the function returns -E2BIG.
+ */
+
+int stprintf(char *buf, size_t size, const char *fmt, ...)
+{
+	va_list args;
+	int len;
+
+	va_start(args, fmt);
+	len = vstprintf(buf, size, fmt, args);
+	va_end(args);
+
+	return len;
+}
+EXPORT_SYMBOL(stprintf);
+
 /**
  * scnprintf - Format a string and place it in a buffer
  * @buf: The buffer to place the result into
@@ -2974,6 +3059,30 @@ int scnprintf(char *buf, size_t size, const char *fmt, ...)
 }
 EXPORT_SYMBOL(scnprintf);
 
+/**
+ * seprintf - Format a string and place it in a buffer
+ * @p: The buffer to place the result into
+ * @end: A pointer to one past the last character in the buffer
+ * @fmt: The format string to use
+ * @...: Arguments for the format string
+ *
+ * The return value is a pointer to the trailing '\0'.
+ * If @p is NULL, the function returns NULL.
+ * If the string is truncated, the function returns NULL.
+ */
+
+char *seprintf(char *p, const char end[0], const char *fmt, ...)
+{
+	va_list args;
+
+	va_start(args, fmt);
+	p = vseprintf(p, end, fmt, args);
+	va_end(args);
+
+	return p;
+}
+EXPORT_SYMBOL(seprintf);
+
 /**
  * vsprintf - Format a string and place it in a buffer
  * @buf: The buffer to place the result into
-- 
2.50.0
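
The return-value contract documented in the kernel-doc comments above can be tried out in userspace. What follows is a minimal sketch, not the kernel implementation: vstprintf_sketch(), seprintf_sketch() and build_greeting() are hypothetical names, and the sketch assumes that the standard vsnprintf() behaves closely enough to the kernel's for this purpose.

```c
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-in for vstprintf(): length on success, -E2BIG on truncation. */
static int vstprintf_sketch(char *buf, size_t size, const char *fmt, va_list args)
{
	int len = vsnprintf(buf, size, fmt, args);

	if (len < 0)
		return len;
	if ((size_t)len >= size)
		return -E2BIG;
	return len;
}

/* Userspace stand-in for seprintf(): pointer to the trailing '\0', or NULL. */
static char *seprintf_sketch(char *p, const char *end, const char *fmt, ...)
{
	va_list args;
	int len;

	if (p == NULL)
		return NULL;	/* an earlier call in the chain already failed */

	va_start(args, fmt);
	len = vstprintf_sketch(p, (size_t)(end - p), fmt, args);
	va_end(args);

	return len < 0 ? NULL : p + len;
}

/* Chains three calls; returns the string length, or -1 on truncation. */
static int build_greeting(char *buf, size_t size, const char *name)
{
	char *p = buf;
	const char *end = buf + size;

	p = seprintf_sketch(p, end, "Hello");
	p = seprintf_sketch(p, end, ", %s", name);
	p = seprintf_sketch(p, end, "!");

	return p ? (int)(p - buf) : -1;
}
```

Because a NULL input is passed straight through, only the result of the last call in a chain needs to be checked.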

[RFC v3 2/7] stacktrace, stackdepot: Add seprintf()-like variants of functions

Posted by Alejandro Colomar 3 months ago

I think there's an anomaly in stack_depot_s*print().  If we have zero
entries, we don't copy anything, which means the string is still not a
string.  Normally, this function is called surrounded by other calls to
s*printf(), which guarantee that there's a '\0', but maybe we should
make sure to write a '\0' here?

Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 include/linux/stackdepot.h | 13 +++++++++++++
 include/linux/stacktrace.h |  3 +++
 kernel/stacktrace.c        | 28 ++++++++++++++++++++++++++++
 lib/stackdepot.c           | 12 ++++++++++++
 4 files changed, 56 insertions(+)

diff --git a/include/linux/stackdepot.h b/include/linux/stackdepot.h
index 2cc21ffcdaf9..a7749fc3ac7c 100644
--- a/include/linux/stackdepot.h
+++ b/include/linux/stackdepot.h
@@ -219,6 +219,19 @@ void stack_depot_print(depot_stack_handle_t stack);
 int stack_depot_snprint(depot_stack_handle_t handle, char *buf, size_t size,
 		       int spaces);
 
+/**
+ * stack_depot_seprint - Print a stack trace from stack depot into a buffer
+ *
+ * @handle:	Stack depot handle returned from stack_depot_save()
+ * @p:		Pointer to the print buffer
+ * @end:	Pointer to one past the last element in the buffer
+ * @spaces:	Number of leading spaces to print
+ *
+ * Return:	Pointer to trailing '\0'; or NULL on truncation
+ */
+char *stack_depot_seprint(depot_stack_handle_t handle, char *p,
+                          const char end[0], int spaces);
+
 /**
  * stack_depot_put - Drop a reference to a stack trace from stack depot
  *
diff --git a/include/linux/stacktrace.h b/include/linux/stacktrace.h
index 97455880ac41..748936386c89 100644
--- a/include/linux/stacktrace.h
+++ b/include/linux/stacktrace.h
@@ -67,6 +67,9 @@ void stack_trace_print(const unsigned long *trace, unsigned int nr_entries,
 		       int spaces);
 int stack_trace_snprint(char *buf, size_t size, const unsigned long *entries,
 			unsigned int nr_entries, int spaces);
+char *stack_trace_seprint(char *p, const char end[0],
+			  const unsigned long *entries, unsigned int nr_entries,
+			  int spaces);
 unsigned int stack_trace_save(unsigned long *store, unsigned int size,
 			      unsigned int skipnr);
 unsigned int stack_trace_save_tsk(struct task_struct *task,
diff --git a/kernel/stacktrace.c b/kernel/stacktrace.c
index afb3c116da91..65caf9e63673 100644
--- a/kernel/stacktrace.c
+++ b/kernel/stacktrace.c
@@ -70,6 +70,34 @@ int stack_trace_snprint(char *buf, size_t size, const unsigned long *entries,
 }
 EXPORT_SYMBOL_GPL(stack_trace_snprint);
 
+/**
+ * stack_trace_seprint - Print the entries in the stack trace into a buffer
+ * @p:		Pointer to the print buffer
+ * @end:	Pointer to one past the last element in the buffer
+ * @entries:	Pointer to storage array
+ * @nr_entries:	Number of entries in the storage array
+ * @spaces:	Number of leading spaces to print
+ *
+ * Return: Pointer to the trailing '\0'; or NULL on truncation.
+ */
+char *stack_trace_seprint(char *p, const char end[0],
+			  const unsigned long *entries, unsigned int nr_entries,
+			  int spaces)
+{
+	unsigned int i;
+
+	if (WARN_ON(!entries))
+		return NULL;
+
+	for (i = 0; i < nr_entries; i++) {
+		p = seprintf(p, end, "%*c%pS\n", 1 + spaces, ' ',
+			     (void *)entries[i]);
+	}
+
+	return p;
+}
+EXPORT_SYMBOL_GPL(stack_trace_seprint);
+
 #ifdef CONFIG_ARCH_STACKWALK
 
 struct stacktrace_cookie {
diff --git a/lib/stackdepot.c b/lib/stackdepot.c
index 73d7b50924ef..749496e6a6f1 100644
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -771,6 +771,18 @@ int stack_depot_snprint(depot_stack_handle_t handle, char *buf, size_t size,
 }
 EXPORT_SYMBOL_GPL(stack_depot_snprint);
 
+char *stack_depot_seprint(depot_stack_handle_t handle, char *p,
+			  const char end[0], int spaces)
+{
+	unsigned long *entries;
+	unsigned int nr_entries;
+
+	nr_entries = stack_depot_fetch(handle, &entries);
+	return nr_entries ? stack_trace_seprint(p, end, entries, nr_entries,
+						spaces) : p;
+}
+EXPORT_SYMBOL_GPL(stack_depot_seprint);
+
 depot_stack_handle_t __must_check stack_depot_set_extra_bits(
 			depot_stack_handle_t handle, unsigned int extra_bits)
 {
-- 
2.50.0
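
The zero-entries question raised in the commit message can be reproduced in userspace. This is a rough sketch under assumptions: entries_seprint_sketch() and entries_seprint_terminated() are hypothetical stand-ins, entries are printed as plain hex rather than with %pS, and the second function shows only one possible way of always writing a '\0'.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for stack_trace_seprint(): one line per entry. */
static char *entries_seprint_sketch(char *p, const char *end,
				    const unsigned long *entries, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n && p != NULL; i++) {
		size_t size = (size_t)(end - p);
		int len = snprintf(p, size, "%#lx\n", entries[i]);

		p = (len < 0 || (size_t)len >= size) ? NULL : p + len;
	}
	return p;	/* with n == 0, nothing was written at all */
}

/* Variant that terminates the buffer even when there is nothing to print. */
static char *entries_seprint_terminated(char *p, const char *end,
					const unsigned long *entries,
					unsigned int n)
{
	if (p == NULL)
		return NULL;
	if (n == 0) {
		if (p >= end)
			return NULL;	/* no room for the '\0' */
		*p = '\0';
		return p;
	}
	return entries_seprint_sketch(p, end, entries, n);
}
```

With zero entries, the first function leaves whatever bytes were already in the buffer, so the result is a string only if a neighbouring call terminates it.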

[RFC v3 3/7] mm: Use seprintf() instead of less ergonomic APIs

Posted by Alejandro Colomar 3 months ago

While doing this, I detected some anomalies in the existing code:

mm/kfence/kfence_test.c:

	-  The last call to scnprintf() did increment 'cur', but it's
	   unused after that, so it was dead code.  I've removed the dead
	   code in this patch.

	-  'end' is calculated as

		end = &expect[0][sizeof(expect[0]) - 1];

	   However, the '-1' doesn't seem to be necessary.  When passing
	   $2 to scnprintf(), the size was specified as 'end - cur'.
	   And scnprintf() --just like snprintf(3)--, won't write more
	   than $2 bytes (including the null byte).  That means that
	   scnprintf() wouldn't write more than

		&expect[0][sizeof(expect[0]) - 1] - expect[0]

	   which simplifies to

		sizeof(expect[0]) - 1

	   bytes.  But we have sizeof(expect[0]) bytes available, so
	   we're wasting one byte entirely.  This is a benign off-by-one
	   bug.  The two occurrences of this bug will be fixed in a
	   following patch in this series.

mm/kmsan/kmsan_test.c:

	The same benign off-by-one bug calculating the remaining size.
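
The wasted byte can be observed in userspace. This is a minimal sketch that uses the standard snprintf() in place of scnprintf(), which for this purpose has the same limit of 'size' bytes including the null byte; stored_len() and the two callers are hypothetical helpers.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Formats s into buf, bounded by 'end', and returns the stored length. */
static size_t stored_len(char *buf, const char *end, const char *s)
{
	snprintf(buf, (size_t)(end - buf), "%s", s);
	return strlen(buf);
}

/* 'end' computed with the extra '- 1': only 7 of the 8 bytes are usable. */
static size_t len_with_short_end(const char *s)
{
	char buf[8];

	return stored_len(buf, &buf[sizeof(buf) - 1], s);
}

/* 'end' pointing one past the last element: all 8 bytes are usable. */
static size_t len_with_full_end(const char *s)
{
	char buf[8];

	return stored_len(buf, buf + sizeof(buf), s);
}
```

A 7-character string fits in the 8-byte array together with its null byte, yet the first variant truncates it to 6 characters.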

mm/mempolicy.c:

	This file uses the 'p += snprintf()' anti-pattern.  That will
	overflow the pointer on truncation, which has undefined
	behavior.  Using seprintf(), this bug is fixed.

	As in the previous file, here there was also dead code in the
	last scnprintf() call, by incrementing a pointer that is not
	used after the call.  I've removed the dead code.

mm/page_owner.c:

	Within print_page_owner(), there are some calls to scnprintf(),
	which don't report truncation.  And then there are other calls to
	snprintf(), where we handle errors (there are two 'goto err').

	I've kept the existing error handling, as I trust it's there for
	a good reason (i.e., we may want to avoid calling
	print_page_owner_memcg() if we truncated before).  Please review
	if this amount of error handling is the right one, or if we want
	to add or remove some.  For seprintf(), a single test for null
	after the last call is enough to detect truncation.

mm/slub.c:

	Again, the 'p += snprintf()' anti-pattern.  This is UB, and by
	using seprintf() we've fixed the bug.
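
Why 'p += snprintf()' is dangerous can be shown without actually forming an out-of-bounds pointer. This is a small userspace sketch with hypothetical helper names; it only inspects the value that would be added to the pointer.

```c
#include <stddef.h>
#include <stdio.h>

/* Returns what snprintf() reports, which is what 'p +=' would add. */
static int snprintf_return(char *buf, size_t size, const char *s)
{
	return snprintf(buf, size, "%s", s);
}

/* Returns 1 if 'p += snprintf(p, 8, ...)' would move p past the buffer. */
static int advance_would_overrun(const char *s)
{
	char buf[8];
	int len = snprintf_return(buf, sizeof(buf), s);

	/*
	 * On truncation, snprintf() returns the length the full output
	 * would have had, not the number of bytes it stored.
	 */
	return len > 0 && (size_t)len > sizeof(buf);
}
```

Once the pointer has passed the end, a later 'buffer + maxlen - p' is negative and becomes a huge value when converted to size_t.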

Fixes: f99e12b21b84 (2021-07-30; "kfence: add function to mask address bits")
[alx: that commit introduced dead code]
Fixes: af649773fb25 (2024-07-17; "mm/numa_balancing: teach mpol_to_str about the balancing mode")
[alx: that commit added p+=snprintf() calls, which are UB]
Fixes: 2291990ab36b (2008-04-28; "mempolicy: clean-up mpol-to-str() mempolicy formatting")
[alx: that commit changed p+=sprintf() into p+=snprintf(), which is still UB]
Fixes: 948927ee9e4f (2013-11-13; "mm, mempolicy: make mpol_to_str robust and always succeed")
[alx: that commit changes old code into p+=snprintf(), which is still UB]
[alx: that commit also produced dead code by leaving the last 'p+=...']
Fixes: d65360f22406 (2022-09-26; "mm/slub: clean up create_unique_id()")
[alx: that commit changed p+=sprintf() into p+=snprintf(), which is still UB]
Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Marco Elver <elver@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Chao Yu <chao.yu@oppo.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 mm/kfence/kfence_test.c | 24 ++++++++++++------------
 mm/kmsan/kmsan_test.c   |  4 ++--
 mm/mempolicy.c          | 18 +++++++++---------
 mm/page_owner.c         | 32 +++++++++++++++++---------------
 mm/slub.c               |  5 +++--
 5 files changed, 43 insertions(+), 40 deletions(-)

diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
index 00034e37bc9f..ff734c514c03 100644
--- a/mm/kfence/kfence_test.c
+++ b/mm/kfence/kfence_test.c
@@ -113,26 +113,26 @@ static bool report_matches(const struct expect_report *r)
 	end = &expect[0][sizeof(expect[0]) - 1];
 	switch (r->type) {
 	case KFENCE_ERROR_OOB:
-		cur += scnprintf(cur, end - cur, "BUG: KFENCE: out-of-bounds %s",
+		cur = seprintf(cur, end, "BUG: KFENCE: out-of-bounds %s",
 				 get_access_type(r));
 		break;
 	case KFENCE_ERROR_UAF:
-		cur += scnprintf(cur, end - cur, "BUG: KFENCE: use-after-free %s",
+		cur = seprintf(cur, end, "BUG: KFENCE: use-after-free %s",
 				 get_access_type(r));
 		break;
 	case KFENCE_ERROR_CORRUPTION:
-		cur += scnprintf(cur, end - cur, "BUG: KFENCE: memory corruption");
+		cur = seprintf(cur, end, "BUG: KFENCE: memory corruption");
 		break;
 	case KFENCE_ERROR_INVALID:
-		cur += scnprintf(cur, end - cur, "BUG: KFENCE: invalid %s",
+		cur = seprintf(cur, end, "BUG: KFENCE: invalid %s",
 				 get_access_type(r));
 		break;
 	case KFENCE_ERROR_INVALID_FREE:
-		cur += scnprintf(cur, end - cur, "BUG: KFENCE: invalid free");
+		cur = seprintf(cur, end, "BUG: KFENCE: invalid free");
 		break;
 	}
 
-	scnprintf(cur, end - cur, " in %pS", r->fn);
+	seprintf(cur, end, " in %pS", r->fn);
 	/* The exact offset won't match, remove it; also strip module name. */
 	cur = strchr(expect[0], '+');
 	if (cur)
@@ -144,26 +144,26 @@ static bool report_matches(const struct expect_report *r)
 
 	switch (r->type) {
 	case KFENCE_ERROR_OOB:
-		cur += scnprintf(cur, end - cur, "Out-of-bounds %s at", get_access_type(r));
+		cur = seprintf(cur, end, "Out-of-bounds %s at", get_access_type(r));
 		addr = arch_kfence_test_address(addr);
 		break;
 	case KFENCE_ERROR_UAF:
-		cur += scnprintf(cur, end - cur, "Use-after-free %s at", get_access_type(r));
+		cur = seprintf(cur, end, "Use-after-free %s at", get_access_type(r));
 		addr = arch_kfence_test_address(addr);
 		break;
 	case KFENCE_ERROR_CORRUPTION:
-		cur += scnprintf(cur, end - cur, "Corrupted memory at");
+		cur = seprintf(cur, end, "Corrupted memory at");
 		break;
 	case KFENCE_ERROR_INVALID:
-		cur += scnprintf(cur, end - cur, "Invalid %s at", get_access_type(r));
+		cur = seprintf(cur, end, "Invalid %s at", get_access_type(r));
 		addr = arch_kfence_test_address(addr);
 		break;
 	case KFENCE_ERROR_INVALID_FREE:
-		cur += scnprintf(cur, end - cur, "Invalid free of");
+		cur = seprintf(cur, end, "Invalid free of");
 		break;
 	}
 
-	cur += scnprintf(cur, end - cur, " 0x%p", (void *)addr);
+	seprintf(cur, end, " 0x%p", (void *)addr);
 
 	spin_lock_irqsave(&observed.lock, flags);
 	if (!report_available())
diff --git a/mm/kmsan/kmsan_test.c b/mm/kmsan/kmsan_test.c
index 9733a22c46c1..a062a46b2d24 100644
--- a/mm/kmsan/kmsan_test.c
+++ b/mm/kmsan/kmsan_test.c
@@ -107,9 +107,9 @@ static bool report_matches(const struct expect_report *r)
 	cur = expected_header;
 	end = &expected_header[sizeof(expected_header) - 1];
 
-	cur += scnprintf(cur, end - cur, "BUG: KMSAN: %s", r->error_type);
+	cur = seprintf(cur, end, "BUG: KMSAN: %s", r->error_type);
 
-	scnprintf(cur, end - cur, " in %s", r->symbol);
+	seprintf(cur, end, " in %s", r->symbol);
 	/* The exact offset won't match, remove it; also strip module name. */
 	cur = strchr(expected_header, '+');
 	if (cur)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index b28a1e6ae096..c696e4a6f4c2 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -3359,6 +3359,7 @@ int mpol_parse_str(char *str, struct mempolicy **mpol)
 void mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol)
 {
 	char *p = buffer;
+	char *e = buffer + maxlen;
 	nodemask_t nodes = NODE_MASK_NONE;
 	unsigned short mode = MPOL_DEFAULT;
 	unsigned short flags = 0;
@@ -3384,33 +3385,32 @@ void mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol)
 		break;
 	default:
 		WARN_ON_ONCE(1);
-		snprintf(p, maxlen, "unknown");
+		seprintf(p, e, "unknown");
 		return;
 	}
 
-	p += snprintf(p, maxlen, "%s", policy_modes[mode]);
+	p = seprintf(p, e, "%s", policy_modes[mode]);
 
 	if (flags & MPOL_MODE_FLAGS) {
-		p += snprintf(p, buffer + maxlen - p, "=");
+		p = seprintf(p, e, "=");
 
 		/*
 		 * Static and relative are mutually exclusive.
 		 */
 		if (flags & MPOL_F_STATIC_NODES)
-			p += snprintf(p, buffer + maxlen - p, "static");
+			p = seprintf(p, e, "static");
 		else if (flags & MPOL_F_RELATIVE_NODES)
-			p += snprintf(p, buffer + maxlen - p, "relative");
+			p = seprintf(p, e, "relative");
 
 		if (flags & MPOL_F_NUMA_BALANCING) {
 			if (!is_power_of_2(flags & MPOL_MODE_FLAGS))
-				p += snprintf(p, buffer + maxlen - p, "|");
-			p += snprintf(p, buffer + maxlen - p, "balancing");
+				p = seprintf(p, e, "|");
+			p = seprintf(p, e, "balancing");
 		}
 	}
 
 	if (!nodes_empty(nodes))
-		p += scnprintf(p, buffer + maxlen - p, ":%*pbl",
-			       nodemask_pr_args(&nodes));
+		seprintf(p, e, ":%*pbl", nodemask_pr_args(&nodes));
 }
 
 #ifdef CONFIG_SYSFS
diff --git a/mm/page_owner.c b/mm/page_owner.c
index cc4a6916eec6..5811738e3320 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -496,7 +496,7 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m,
 /*
  * Looking for memcg information and print it out
  */
-static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
+static inline char *print_page_owner_memcg(char *p, const char end[0],
 					 struct page *page)
 {
 #ifdef CONFIG_MEMCG
@@ -511,8 +511,7 @@ static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
 		goto out_unlock;
 
 	if (memcg_data & MEMCG_DATA_OBJEXTS)
-		ret += scnprintf(kbuf + ret, count - ret,
-				"Slab cache page\n");
+		p = seprintf(p, end, "Slab cache page\n");
 
 	memcg = page_memcg_check(page);
 	if (!memcg)
@@ -520,7 +519,7 @@ static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
 
 	online = (memcg->css.flags & CSS_ONLINE);
 	cgroup_name(memcg->css.cgroup, name, sizeof(name));
-	ret += scnprintf(kbuf + ret, count - ret,
+	p = seprintf(p, end,
 			"Charged %sto %smemcg %s\n",
 			PageMemcgKmem(page) ? "(via objcg) " : "",
 			online ? "" : "offline ",
@@ -529,7 +528,7 @@ static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
 	rcu_read_unlock();
 #endif /* CONFIG_MEMCG */
 
-	return ret;
+	return p;
 }
 
 static ssize_t
@@ -538,14 +537,16 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
 		depot_stack_handle_t handle)
 {
 	int ret, pageblock_mt, page_mt;
-	char *kbuf;
+	char *kbuf, *p, *e;
 
 	count = min_t(size_t, count, PAGE_SIZE);
 	kbuf = kmalloc(count, GFP_KERNEL);
 	if (!kbuf)
 		return -ENOMEM;
 
-	ret = scnprintf(kbuf, count,
+	p = kbuf;
+	e = kbuf + count;
+	p = seprintf(p, e,
 			"Page allocated via order %u, mask %#x(%pGg), pid %d, tgid %d (%s), ts %llu ns\n",
 			page_owner->order, page_owner->gfp_mask,
 			&page_owner->gfp_mask, page_owner->pid,
@@ -555,7 +556,7 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
 	/* Print information relevant to grouping pages by mobility */
 	pageblock_mt = get_pageblock_migratetype(page);
 	page_mt  = gfp_migratetype(page_owner->gfp_mask);
-	ret += scnprintf(kbuf + ret, count - ret,
+	p = seprintf(p, e,
 			"PFN 0x%lx type %s Block %lu type %s Flags %pGp\n",
 			pfn,
 			migratetype_names[page_mt],
@@ -563,22 +564,23 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
 			migratetype_names[pageblock_mt],
 			&page->flags);
 
-	ret += stack_depot_snprint(handle, kbuf + ret, count - ret, 0);
-	if (ret >= count)
-		goto err;
+	p = stack_depot_seprint(handle, p, e, 0);
+	if (p == NULL)
+		goto err;  // XXX: Should we remove this error handling?
 
 	if (page_owner->last_migrate_reason != -1) {
-		ret += scnprintf(kbuf + ret, count - ret,
+		p = seprintf(p, e,
 			"Page has been migrated, last migrate reason: %s\n",
 			migrate_reason_names[page_owner->last_migrate_reason]);
 	}
 
-	ret = print_page_owner_memcg(kbuf, count, ret, page);
+	p = print_page_owner_memcg(p, e, page);
 
-	ret += snprintf(kbuf + ret, count - ret, "\n");
-	if (ret >= count)
+	p = seprintf(p, e, "\n");
+	if (p == NULL)
 		goto err;
 
+	ret = p - kbuf;
 	if (copy_to_user(buf, kbuf, ret))
 		ret = -EFAULT;
 
diff --git a/mm/slub.c b/mm/slub.c
index be8b09e09d30..b67c6ca0d0f7 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -7451,6 +7451,7 @@ static char *create_unique_id(struct kmem_cache *s)
 {
 	char *name = kmalloc(ID_STR_LENGTH, GFP_KERNEL);
 	char *p = name;
+	char *e = name + ID_STR_LENGTH;
 
 	if (!name)
 		return ERR_PTR(-ENOMEM);
@@ -7475,9 +7476,9 @@ static char *create_unique_id(struct kmem_cache *s)
 		*p++ = 'A';
 	if (p != name + 1)
 		*p++ = '-';
-	p += snprintf(p, ID_STR_LENGTH - (p - name), "%07u", s->size);
+	p = seprintf(p, e, "%07u", s->size);
 
-	if (WARN_ON(p > name + ID_STR_LENGTH - 1)) {
+	if (WARN_ON(p == NULL)) {
 		kfree(name);
 		return ERR_PTR(-EINVAL);
 	}
-- 
2.50.0

Re: [RFC v3 3/7] mm: Use seprintf() instead of less ergonomic APIs

Posted by Linus Torvalds 3 months ago

On Sun, 6 Jul 2025 at 22:06, Alejandro Colomar <alx@kernel.org> wrote:
>
> -       p += snprintf(p, ID_STR_LENGTH - (p - name), "%07u", s->size);
> +       p = seprintf(p, e, "%07u", s->size);

I am *really* not a fan of introducing yet another random non-standard
string function.

This 'seprintf' thing really seems to be a completely made-up thing.
Let's not go there. It just adds more confusion - it may be a simpler
interface, but it's another cognitive load thing, and honestly, that
"beginning and end" interface is not great.

I think we'd be better off with real "character buffer" interfaces,
and they should be *named* that way, not be yet another "random
character added to the printf family".

The whole "add a random character" thing is a disease. But at least
with printf/fprintf/vprintf/vsnprintf/etc, it's a _standard_ disease,
so people hopefully know about it.

So I really *really* don't like things like seprintf(). It just makes me go WTF?

Interfaces that have worked for us are things like "seq_printf()", which

 (a) has sane naming, not "add random characters"

 (b) has real abstractions (in that case 'struct seq_file') rather
than adding random extra arguments to the argument list.

and we do have something like that in 'struct seq_buf'.  I'm not
convinced that's the optimal interface, but I think it's *better*.
Because it does both encapsulate a proper "this is my buffer" type,
and has a proper "this is a buffer operation" function name.

So I'd *much* rather people would try to convert their uses to things
like that, than add random letter combinations.

             Linus
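
To make the "this is my buffer" idea concrete, here is a userspace sketch of a buffer type with a named print operation. Everything in it is hypothetical: struct charbuf, charbuf_init() and charbuf_printf() are invented for illustration and are not the kernel's struct seq_buf or its API.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical buffer type; this is not the kernel's struct seq_buf. */
struct charbuf {
	char *data;
	size_t size;		/* total capacity, including the '\0' */
	size_t len;		/* characters stored so far */
	bool overflowed;	/* sticky: set once output did not fit */
};

static void charbuf_init(struct charbuf *b, char *data, size_t size)
{
	b->data = data;
	b->size = size;
	b->len = 0;
	b->overflowed = (size == 0);
	if (size > 0)
		data[0] = '\0';
}

/* Appends formatted text; after an overflow, further calls do nothing. */
static void charbuf_printf(struct charbuf *b, const char *fmt, ...)
{
	va_list args;
	size_t room;
	int n;

	if (b->overflowed)
		return;

	room = b->size - b->len;
	va_start(args, fmt);
	n = vsnprintf(b->data + b->len, room, fmt, args);
	va_end(args);

	if (n < 0) {
		b->data[b->len] = '\0';
		b->overflowed = true;
	} else if ((size_t)n >= room) {
		b->len = b->size - 1;	/* vsnprintf() stored room - 1 chars */
		b->overflowed = true;
	} else {
		b->len += (size_t)n;
	}
}
```

The caller passes a single object and checks a single flag at the end, instead of carrying a pointer and a bound through every call.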

Re: [RFC v3 3/7] mm: Use seprintf() instead of less ergonomic APIs

Posted by Christopher Bazley 2 months, 3 weeks ago

Hi Linus,

On Mon, Jul 7, 2025 at 8:17 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Sun, 6 Jul 2025 at 22:06, Alejandro Colomar <alx@kernel.org> wrote:
> >
> > -       p += snprintf(p, ID_STR_LENGTH - (p - name), "%07u", s->size);
> > +       p = seprintf(p, e, "%07u", s->size);
>
> I am *really* not a fan of introducing yet another random non-standard
> string function.
>
> This 'seprintf' thing really seems to be a completely made-up thing.
> Let's not go there. It just adds more confusion - it may be a simpler
> interface, but it's another cognitive load thing, and honestly, that
> "beginning and end" interface is not great.
>
> I think we'd be better off with real "character buffer" interfaces,
> and they should be *named* that way, not be yet another "random
> character added to the printf family".

I was really interested to see this comment because I presented a
design for a standard character buffer interface, "strb_t", to WG14 in
summer of 2014. The latest published version of that paper is
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3306.pdf (very long)
and the slides (which cover most of the important points) are
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3276.pdf

I contacted you beforehand, for permission to include kasprintf and
kvasprintf in the 'prior art' section of my paper. At the time, you
gave me useful information about the history of those and related
functions. (As an aside, Alejandro has since written a proposal to
standardise a similar function named aprintf, which I support:
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3630.txt )

Going back to "strb_t", I did not bother you about it again because I
didn't anticipate it being used in kernel space, which has its own
interfaces for most things. I'd be interested to hear what you think
of it though. My intent was to make it impossible to abuse, insofar as
that is possible. That led me to make choices (such as use of an
incomplete struct type) that some might consider strange or
overengineered. I didn't see the point in trying to replace one set of
error-prone functions with another.

Alejandro has put a lot of thought into his proposed seprintf
function, but it still fundamentally relies on the programmer passing
the right arguments and it doesn't seem to extend the functionality of
snprintf in any way that I actually need.

For example, some of my goals for the character buffer interface were:

- A buffer should be specified using a single parameter.
- Impossible to accidentally shallow-copy a buffer instead of copying
a reference to it.
- No aspect of character consumption delegated to character producers, e.g.:
  * whether to insert or overwrite.
  * whether to prepend, insert or append.
  * whether to allocate extra storage, and how to do that.
- Minimize the effect of ignoring return values and not require
ubiquitous error-handling.
- Able to put strings directly into a buffer from any source.
- Allow diverse implementations (mostly to allow tailoring to
different platforms).

This small program demonstrates some of those ideas:
https://godbolt.org/z/66Gnre6dx
It uses my ugly hacked-together prototype.

Chris

Re: [RFC v3 3/7] mm: Use seprintf() instead of less ergonomic APIs

Posted by Christopher Bazley 2 months, 3 weeks ago

On Sat, Jul 12, 2025 at 9:58 PM Christopher Bazley
<chris.bazley.wg14@gmail.com> wrote:
>
> Hi Linus,
>
> On Mon, Jul 7, 2025 at 8:17 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > On Sun, 6 Jul 2025 at 22:06, Alejandro Colomar <alx@kernel.org> wrote:
> > >
> > > -       p += snprintf(p, ID_STR_LENGTH - (p - name), "%07u", s->size);
> > > +       p = seprintf(p, e, "%07u", s->size);
> >
> > I am *really* not a fan of introducing yet another random non-standard
> > string function.
> >
> > This 'seprintf' thing really seems to be a completely made-up thing.
> > Let's not go there. It just adds more confusion - it may be a simpler
> > interface, but it's another cognitive load thing, and honestly, that
> > "beginning and end" interface is not great.
> >
> > I think we'd be better off with real "character buffer" interfaces,
> > and they should be *named* that way, not be yet another "random
> > character added to the printf family".
>
> I was really interested to see this comment because I presented a
> design for a standard character buffer interface, "strb_t", to WG14 in
> summer of 2014.

Ugh, that should have been 2024. I'm getting old!

Chris

Re: [RFC v3 3/7] mm: Use seprintf() instead of less ergonomic APIs

Posted by Alejandro Colomar 3 months ago

Hi Linus,

On Mon, Jul 07, 2025 at 12:17:11PM -0700, Linus Torvalds wrote:
> On Sun, 6 Jul 2025 at 22:06, Alejandro Colomar <alx@kernel.org> wrote:
> >
> > -       p += snprintf(p, ID_STR_LENGTH - (p - name), "%07u", s->size);
> > +       p = seprintf(p, e, "%07u", s->size);
> 
> I am *really* not a fan of introducing yet another random non-standard
> string function.

I am in the C Committee, and have proposed this API for standardization.
I have a feeling that the committee might be open to it.

> This 'seprintf' thing really seems to be a completely made-up thing.
> Let's not go there. It just adds more confusion - it may be a simpler
> interface, but it's another cognitive load thing,

I understand the part of your concern that relates to
<https://xkcd.com/927/>.

However, I've shown how in mm/, I got rid of most snprintf() and
scnprintf() calls.  I could even get rid of the remaining snprintf()
ones; I didn't do it to avoid churn, but they're just 3, so I could do
it, as a way to remove all uses of snprintf(3).

I also got rid of all scnprintf() uses except for 2.  Not because those
two cannot be removed, but because the code was scary enough that I
didn't dare touch it.  I'd like someone to read it and confirm that it
can be replaced.

> and honestly, that
> "beginning and end" interface is not great.

Just look at the diffs.  It is great, in terms of writing less code.

In some cases, it makes sense to pass a size.  Those cases are when you
don't want to chain several calls.  That's the case of stprintf(), and
its wrapper STPRINTF(), which calls ARRAY_SIZE() internally.

But most of the time you want to chain calls, and 'end' beats 'size'
there.
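
For the single-call case, the size-taking form and an array wrapper might look roughly like this in userspace. This is a sketch under assumptions: the _sketch names are hypothetical, the macro is only a guess at the shape of STPRINTF(), and the kernel's ARRAY_SIZE() additionally rejects non-array arguments, which is omitted here.

```c
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define ARRAY_SIZE_SKETCH(a)	(sizeof(a) / sizeof((a)[0]))

/* Length on success, -E2BIG on truncation. */
static int stprintf_sketch(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int len;

	va_start(args, fmt);
	len = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (len < 0)
		return len;
	return (size_t)len >= size ? -E2BIG : len;
}

/* Assumed shape of the wrapper: the array supplies its own size. */
#define STPRINTF_SKETCH(a, ...) \
	stprintf_sketch(a, ARRAY_SIZE_SKETCH(a), __VA_ARGS__)

/* Taking a pointer to an array keeps the element count in the type. */
static int format_id(char (*out)[8], unsigned int id)
{
	return STPRINTF_SKETCH(*out, "id%u", id);
}
```

The caller never writes the size by hand, so it cannot get out of sync with the array declaration.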

> I think we'd be better off with real "character buffer" interfaces,
> and they should be *named* that way, not be yet another "random
> character added to the printf family".

You might want to do that, but I doubt it's an easy change.  On the
other hand, this change is trivial, and can be done incrementally,
without needing to modify the buffer since its inception.

And you can come back later to wrap this in some API that does what you
want.  Nothing stops you from doing that.

But this fixes several cases of UB in a few files that I've looked at,
with minimal diffs.

> The whole "add a random character" thing is a disease. But at least
> with printf/fprintf/vprintf/vsnprintf/etc, it's a _standard_ disease,
> so people hopefully know about it.

seprint(2) was implemented in Plan9 many decades ago.  It's not
standard, because somehow Plan9 has been ignored by history, but the
name has a long history.

<https://plan9.io/magic/man2html/2/print>

Plus, I'm making seprintf() standard (if I can convince the committee).

Yesterday night, I presented the proposal to the committee, informally
(via email).  You can read a copy here:
<https://lore.kernel.org/linux-hardening/cover.1751747518.git.alx@kernel.org/T/#m9311035d60b4595db62273844d16671601e77a50>

I'll present it formally in a month, since I have a batch of proposals
for the committee in the works.

Have a lovely day!
Alex

> So I really *really* don't like things like seprintf(). It just makes me go WTF?
> 
> Interfaces that have worked for us are things like "seq_printf()", which
> 
>  (a) has sane naming, not "add random characters"
> 
>  (b) has real abstractions (in that case 'struct seq_file') rather
> than adding random extra arguments to the argument list.
> 
> and we do have something like that in 'struct seq_buf'.  I'm not
> convinced that's the optimal interface, but I think it's *better*.
> Because it does both encapsulate a proper "this is my buffer" type,
> and has a proper "this is a buffer operation" function name.
> 
> So I'd *much* rather people would try to convert their uses to things
> like that, than add random letter combinations.
> 
>              Linus

-- 
<https://www.alejandro-colomar.es/>

Re: [RFC v3 3/7] mm: Use seprintf() instead of less ergonomic APIs

Posted by Linus Torvalds 3 months ago

On Mon, 7 Jul 2025 at 13:29, Alejandro Colomar <alx@kernel.org> wrote:
>
> I am in the C Committee, and have proposed this API for standardization.
> I have a feeling that the committee might be open to it.

Honestly, how about fixing the serious problems with the language instead?

Get rid of the broken "strict aliasing" garbage.

Get rid of the random "undefined behavior" stuff that is literally
designed to let compilers intentionally mis-compile code.

Because as things are, "I am on the C committee" isn't a
recommendation. It's a "we have decades of bad decisions to show our
credentials".

In the kernel, I have made it very very clear that we do not use
standard C, because standard C is broken.

I stand by my "let's not add random letters to existing functions that
are already too confusing".

              Linus

Re: [RFC v3 3/7] mm: Use seprintf() instead of less ergonomic APIs

Posted by Alejandro Colomar 3 months ago

Hi Linus,

On Mon, Jul 07, 2025 at 01:49:20PM -0700, Linus Torvalds wrote:
> On Mon, 7 Jul 2025 at 13:29, Alejandro Colomar <alx@kernel.org> wrote:
> >
> > I am in the C Committee, and have proposed this API for standardization.
> > I have a feeling that the committee might be open to it.
> 
> Honestly, how about fixing the serious problems with the language instead?

I'm doing some work on that.  See the new _Countof() operator?  That was
my first introduction in the standard, last year.

I'm working on an extension to it that I believe will make array
parameters safer.

> Get rid of the broken "strict aliasing" garbage.

I don't feel qualified to comment on that.

> Get rid of the random "undefined behavior" stuff that is literally
> designed to let compilers intentionally mis-compile code.

We're indeed working on that.  The last committee meeting removed a
large number of undefined behaviors, and turned them into mandatory
diagnostics.  And there's ongoing work on removing more of those.

> Because as things are, "I am on the C committee" isn't a
> recommendation. It's a "we have decades of bad decisions to show our
> credentials".

I joined in 2024 because I was fed up with the shit they were producing
and wanted to influence it.  You don't need to convince me.

> In the kernel, I have made it very very clear that we do not use
> standard C, because standard C is broken.

I agree.  I personally use GNU C and tend to ignore the standard.  But
I'm still working on improving the standard, even if just to avoid
having to learn Rust (and also because GCC and glibc don't accept any
improvements or fixes if they don't go through the standard, these
days).

Have a lovely day!
Alex

> I stand by my "let's not add random letters to existing functions that
> are already too confusing".
> 
>               Linus

-- 
<https://www.alejandro-colomar.es/>

Re: [RFC v3 3/7] mm: Use seprintf() instead of less ergonomic APIs

Posted by Alejandro Colomar 3 months ago

On Mon, Jul 07, 2025 at 11:06:06PM +0200, Alejandro Colomar wrote:
> > I stand by my "let's not add random letters to existing functions that
> > are already too confusing".

If the name is your main concern, we can discuss a more explicit name in
the kernel.

I still plan to propose it as seprintf() for standardization, and for
libc, but if that reads as a letter soup to you, I guess we can call it
sprintf_end() or whatever, for the kernel.

Does that sound reasonable enough?  What do you think about the diff
itself, ignoring the function name?

Cheers,
Alex

-- 
<https://www.alejandro-colomar.es/>

Re: [RFC v3 3/7] mm: Use seprintf() instead of less ergonomic APIs

Posted by Linus Torvalds 3 months ago

On Mon, 7 Jul 2025 at 14:27, Alejandro Colomar <alx@kernel.org> wrote:
>
> If the name is your main concern, we can discuss a more explicit name in
> the kernel.

So as they say: "There are only two hard problems in computer science:
cache invalidation, naming and off-by-one errors".

And the *worst* model for naming is the "add random characters" (ok, I
still remember when people believed the insane "Hungarian Notation"
BS, *that* particular braindamage seems to thankfully have faded away
and was probably even worse, because it was both pointless, unreadable
_and_ caused long identifiers).

Now, we obviously tend to have the usual bike-shedding discussions
that come from naming, but my *personal* preference is to avoid the
myriad of random "does almost the same thing with different
parameters" by using generics.

This is actually something that the kernel has done for decades, with
various odd macro games - things like "get_user()" just automatically
doing the RightThing(tm) based on the size of the argument, rather
than having N different versions for different types.

So we actually have a fair number of "generics" in the kernel, and
while admittedly the header file contortions to implement them can
often be horrendous - the *use* cases tend to be fairly readable.

It's not just get_user() and friends, it's things like our
type-checking min/max macros etc. Lots of small helpers that

And while the traditional C model for this is indeed macro games with
sizeof() and other oddities, these days at least we have _Generic() to
help.
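As an illustration of the _Generic() approach mentioned here, a minimal
userspace sketch follows.  It is not kernel code; the describe*() names
are made up for the example.  One macro name dispatches to a
type-specific helper based on the static type of its argument:

```c
/* Hypothetical type-specific helpers; the names are illustrative only. */
static const char *describe_int(int v)          { (void)v; return "int"; }
static const char *describe_long(long v)        { (void)v; return "long"; }
static const char *describe_str(const char *v)  { (void)v; return "string"; }

/*
 * One name for all callers.  _Generic() selects a helper at compile time
 * from the type of (x); no run-time check is involved.  A string literal
 * decays to 'char *' in the controlling expression, hence that entry.
 */
#define describe(x) _Generic((x),          \
	int:          describe_int,        \
	long:         describe_long,       \
	char *:       describe_str,        \
	const char *: describe_str)(x)
```

With these definitions, describe(42), describe(42L) and describe("hi")
each resolve to a different helper, and an argument of an unlisted type
fails to compile instead of silently picking the wrong variant.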

So my personal preference would actually be to not make up new names
at all, but just have the normal names DoTheRightThing(tm)
automatically.

But honestly, that works best when you have good data structure
abstraction - *not* when you pass just random "char *" pointers
around.  It tends to help those kinds of _Generic() users, but even
without the use of _Generic() and friends, it helps static type
checking and makes things much less ambiguous even in general.

IOW, there's never any question about "is this string the source or
the destination?" or "is this the start or the end of the buffer", if
you just have a struct with clear naming that contains the arguments.

And while C doesn't have named arguments, it *does* have named
structure initializers, and we use them pretty religiously in the
kernel. Exactly because it helps so much both for readability and for
stability (ie it catches things when you intentionally rename members
because the semantics changed).
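A small userspace sketch of that idea, under the assumption of a
made-up 'struct copy_args' and bounded_copy() helper (neither exists in
the kernel under these names):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical argument bundle; the struct and field names are illustrative. */
struct copy_args {
	char       *dst;       /* destination buffer */
	size_t      dst_size;  /* total size of dst, including the NUL */
	const char *src;       /* NUL-terminated source string */
};

/* Copy src into dst, truncating if needed; returns the length stored. */
static size_t bounded_copy(struct copy_args a)
{
	size_t n;

	if (a.dst_size == 0)
		return 0;
	n = strlen(a.src);
	if (n > a.dst_size - 1)
		n = a.dst_size - 1;
	memcpy(a.dst, a.src, n);
	a.dst[n] = '\0';
	return n;
}

/* The call site names every argument, so source and destination can't be swapped silently. */
static size_t copy_greeting(char *buf, size_t size)
{
	return bounded_copy((struct copy_args){
		.dst      = buf,
		.dst_size = size,
		.src      = "hello, world",
	});
}
```

If a member is later renamed because its meaning changed, every
initializer that still uses the old name stops compiling.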

                Linus

Re: [RFC v3 3/7] mm: Use seprintf() instead of less ergonomic APIs

Posted by Alejandro Colomar 3 months ago

Hi Linus,

On Mon, Jul 07, 2025 at 03:17:50PM -0700, Linus Torvalds wrote:
> On Mon, 7 Jul 2025 at 14:27, Alejandro Colomar <alx@kernel.org> wrote:
> >
> > If the name is your main concern, we can discuss a more explicit name in
> > the kernel.
> 
> So as they say: "There are only two hard problems in computer science:
> cache invalidation, naming and off-by-one errors".

Indeed.  And we have two of these classes here.  :)

> And the *worst* model for naming is the "add random characters" (ok, I
> still remember when people believed the insane "Hungarian Notation"
> BS, *that* particular braindamage seems to thankfully have faded away
> and was probably even worse, because it was both pointless, unreadable
> _and_ caused long identifiers).

To be fair, one letter is enough if you're used to the name.  With
anything of the form s*printf(), people know that the differentiating
part is the single letter between 's' and 'p', and a quick look at the
function prototype usually explains the rest.

More than that, and it's unnecessarily noisy to my taste.  But not
everyone does string work all the time, so I get why you'd be less prone
to liking the name.

I won't press for the name.  Unless you say anything, my next revision
of the series will call it sprintf_end().

> Now, we obviously tend to have the usual bike-shedding discussions
> that come from naming, but my *personal* preference is to avoid the
> myriad of random "does almost the same thing with different
> parameters" by using generics.
> 
> This is actually something that the kernel has done for decades, with
> various odd macro games - things like "get_user()" just automatically
> doing the RightThing(tm) based on the size of the argument, rather
> than having N different versions for different types.

In this case, I wouldn't want to go that way and reuse the name
snprintf(3), because the kernel implementation of snprintf(3) is
non-conforming, and both the standard and the kernel snprintf() have
semantics that differ from this API in important ways when it comes to
handling errors.

I think reusing the name with slightly different semantics would be
prone to bugs.
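To make the difference concrete, here is a userspace sketch.  The
function sketch_sprintf_end() is hypothetical and is not the code from
this series; it only assumes the behavior described in the thread (an
end pointer instead of a size, and a null return on truncation).
Standard snprintf(3), by contrast, returns the length it would have
written, which can exceed the buffer size:

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical end-pointer formatter (an assumption, not the real API).
 * Returns a pointer to the terminating NUL on success, or NULL if 'p' is
 * NULL or the output did not fit, so a NULL propagates through a chain
 * of calls and a single check at the end is enough.
 */
static char *sketch_sprintf_end(char *p, const char *end, const char *fmt, ...)
{
	va_list ap;
	int len;

	if (p == NULL || p >= end)
		return NULL;

	va_start(ap, fmt);
	len = vsnprintf(p, (size_t)(end - p), fmt, ap);
	va_end(ap);

	if (len < 0 || (size_t)len >= (size_t)(end - p))
		return NULL;	/* error or truncation */
	return p + len;
}

/* Standard snprintf(3) reports the length it *wanted* to write. */
static int wanted_length(const char *s)
{
	char buf[8];

	return snprintf(buf, sizeof(buf), "%s", s);
}
```

Calling wanted_length() with a 10-character string returns 10 even
though only 7 characters fit, which is why adding that return value to
a pointer can move it past the end of the buffer.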

Anyway, sprintf_end() should be clear enough that I don't expect much
bikeshedding for the name.  Feel free to revisit this in the future and
merge names if you don't like it; I won't complain.  :)

Have a lovely night!
Alex

P.S.:  I'm not able to sign this email.

-- 
<https://www.alejandro-colomar.es/>

Re: [RFC v3 3/7] mm: Use seprintf() instead of less ergonomic APIs

Posted by Al Viro 3 months ago

On Mon, Jul 07, 2025 at 12:17:11PM -0700, Linus Torvalds wrote:

> and we do have something like that in 'struct seq_buf'.  I'm not
> convinced that's the optimal interface, but I think it's *better*.
> Because it does both encapsulate a proper "this is my buffer" type,
> and has a proper "this is a buffer operation" function name.
> 
> So I'd *much* rather people would try to convert their uses to things
> like that, than add random letter combinations.

Lifting struct membuf out of include/linux/regset.h, perhaps, and
adding printf to the family?

Re: [RFC v3 3/7] mm: Use seprintf() instead of less ergonomic APIs

Posted by Linus Torvalds 3 months ago

On Mon, 7 Jul 2025 at 12:35, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Lifting struct membuf out of include/linux/regset.h, perhaps, and
> adding printf to the family?

membuf has its own problems. It doesn't remember the beginning of the
buffer, so while it's good for "fill in this buffer with streaming
data", it's pretty bad for "let's declare a buffer, fill it in, and
then use the buffer for something".

So with membuf, you can do that "fill this buffer" cleanly.

But you can't then do that "ok, it's filled, now flush it" - not
without passing in some other data (namely the original buffer data).

I don't exactly love "struct seq_buf" either - it's big and wasteful
because it has 64-bit sizes - but it at least *retains* the full
state, so you can do things like "print to this buffer" and "flush
this buffer" *without* passing around extra data.
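A rough userspace sketch of a buffer type that retains its full state.
The 'struct text_buf' below is hypothetical; it is neither struct
membuf nor struct seq_buf, and only illustrates keeping the start
pointer next to the write position:

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical type, for illustration only. */
struct text_buf {
	char *start;      /* beginning of the storage, kept for later use */
	char *cur;        /* next byte to write */
	char *end;        /* one past the last byte of storage */
	int   overflowed; /* set once any output was truncated */
};

static void text_buf_init(struct text_buf *b, char *storage, size_t size)
{
	b->start = storage;
	b->cur = storage;
	b->end = storage + size;
	b->overflowed = 0;
	if (size > 0)
		storage[0] = '\0';
}

/* "Print to this buffer": appends at 'cur', never writes past 'end'. */
static void text_buf_printf(struct text_buf *b, const char *fmt, ...)
{
	size_t room = (size_t)(b->end - b->cur);
	va_list ap;
	int len;

	if (b->overflowed || room == 0) {
		b->overflowed = 1;
		return;
	}
	va_start(ap, fmt);
	len = vsnprintf(b->cur, room, fmt, ap);
	va_end(ap);

	if (len < 0) {
		b->overflowed = 1;
	} else if ((size_t)len >= room) {
		b->overflowed = 1;
		b->cur = b->end - 1;	/* vsnprintf() put the NUL here */
	} else {
		b->cur += len;
	}
}

static size_t text_buf_len(const struct text_buf *b)
{
	return (size_t)(b->cur - b->start);
}

/* "Flush this buffer": needs nothing but the struct itself. */
static size_t text_buf_flush(struct text_buf *b, FILE *out)
{
	size_t n = fwrite(b->start, 1, text_buf_len(b), out);

	b->cur = b->start;
	return n;
}
```

Because 'start' travels with the struct, the flush step does not need
the original array passed in separately.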

              Linus

Re: [RFC v3 3/7] mm: Use seprintf() instead of less ergonomic APIs

Posted by Marco Elver 3 months ago

On Mon, 7 Jul 2025 at 07:06, Alejandro Colomar <alx@kernel.org> wrote:
>
> While doing this, I detected some anomalies in the existing code:
>
> mm/kfence/kfence_test.c:
>
>         -  The last call to scnprintf() did increment 'cur', but it's
>            unused after that, so it was dead code.  I've removed the dead
>            code in this patch.

That was done to be consistent with the other code for readability,
and to be clear where the next bytes should be appended (if someone
decides to append more). There is no runtime dead code, the compiler
optimizes away the assignment. But I'm indifferent, so removing the
assignment is fine if you prefer that.

Did you run the tests? Do they pass?


>         -  'end' is calculated as
>
>                 end = &expect[0][sizeof(expect[0]) - 1];
>
>            However, the '-1' doesn't seem to be necessary.  When passing
>            $2 to scnprintf(), the size was specified as 'end - cur'.
>            And scnprintf() --just like snprintf(3)--, won't write more
>            than $2 bytes (including the null byte).  That means that
>            scnprintf() wouldn't write more than
>
>                 &expect[0][sizeof(expect[0]) - 1] - expect[0]
>
>            which simplifies to
>
>                 sizeof(expect[0]) - 1
>
>            bytes.  But we have sizeof(expect[0]) bytes available, so
>            we're wasting one byte entirely.  This is a benign off-by-one
>            bug.  The two occurrences of this bug will be fixed in a
>            following patch in this series.
>
> mm/kmsan/kmsan_test.c:
>
>         The same benign off-by-one bug calculating the remaining size.


Same - does the test pass?
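The arithmetic in the quoted commit message can be checked with a small
userspace sketch.  It uses standard snprintf(3) rather than the
kernel's scnprintf(), and stored_chars() is a made-up helper, but the
size computation is the same: with 'end' pointing at the last element,
'end - cur' is one less than the buffer size, and the last byte is
never used.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Format 'text' into an 8-byte buffer, passing 'end - cur' as the size,
 * and return how many characters were stored.  'subtract_one' selects
 * whether 'end' is computed with the "- 1" discussed above.
 */
static size_t stored_chars(int subtract_one, const char *text)
{
	char buf[8];
	char *cur = buf;
	char *end = subtract_one ? &buf[sizeof(buf) - 1] : buf + sizeof(buf);

	/* Writes at most (end - cur) bytes, including the NUL. */
	snprintf(cur, (size_t)(end - cur), "%s", text);
	return strlen(buf);
}
```

For a 10-character input, the "- 1" variant stores 6 characters and the
other stores 7; both stay within the 8-byte buffer, which is why the
off-by-one is benign.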

> mm/mempolicy.c:
>
>         This file uses the 'p += snprintf()' anti-pattern.  That will
>         overflow the pointer on truncation, which has undefined
>         behavior.  Using seprintf(), this bug is fixed.
>
>         As in the previous file, here there was also dead code in the
>         last scnprintf() call, by incrementing a pointer that is not
>         used after the call.  I've removed the dead code.
>
> mm/page_owner.c:
>
>         Within print_page_owner(), there are some calls to scnprintf(),
>         which do not report truncation.  And then there are other calls to
>         snprintf(), where we handle errors (there are two 'goto err').
>
>         I've kept the existing error handling, as I trust it's there for
>         a good reason (i.e., we may want to avoid calling
>         print_page_owner_memcg() if we truncated before).  Please review
>         if this amount of error handling is the right one, or if we want
>         to add or remove some.  For seprintf(), a single test for null
>         after the last call is enough to detect truncation.
>
> mm/slub.c:
>
>         Again, the 'p += snprintf()' anti-pattern.  This is UB, and by
>         using seprintf() we've fixed the bug.
>
> Fixes: f99e12b21b84 (2021-07-30; "kfence: add function to mask address bits")
> [alx: that commit introduced dead code]
> Fixes: af649773fb25 (2024-07-17; "mm/numa_balancing: teach mpol_to_str about the balancing mode")
> [alx: that commit added p+=snprintf() calls, which are UB]
> Fixes: 2291990ab36b (2008-04-28; "mempolicy: clean-up mpol-to-str() mempolicy formatting")
> [alx: that commit changed p+=sprintf() into p+=snprintf(), which is still UB]
> Fixes: 948927ee9e4f (2013-11-13; "mm, mempolicy: make mpol_to_str robust and always succeed")
> [alx: that commit changes old code into p+=snprintf(), which is still UB]
> [alx: that commit also produced dead code by leaving the last 'p+=...']
> Fixes: d65360f22406 (2022-09-26; "mm/slub: clean up create_unique_id()")
> [alx: that commit changed p+=sprintf() into p+=snprintf(), which is still UB]
> Cc: Kees Cook <kees@kernel.org>
> Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
> Cc: Sven Schnelle <svens@linux.ibm.com>
> Cc: Marco Elver <elver@google.com>
> Cc: Heiko Carstens <hca@linux.ibm.com>
> Cc: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> Cc: "Huang, Ying" <ying.huang@intel.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: David Rientjes <rientjes@google.com>
> Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> Cc: Chao Yu <chao.yu@oppo.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Signed-off-by: Alejandro Colomar <alx@kernel.org>
> ---
>  mm/kfence/kfence_test.c | 24 ++++++++++++------------
>  mm/kmsan/kmsan_test.c   |  4 ++--
>  mm/mempolicy.c          | 18 +++++++++---------
>  mm/page_owner.c         | 32 +++++++++++++++++---------------
>  mm/slub.c               |  5 +++--
>  5 files changed, 43 insertions(+), 40 deletions(-)
>
> diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
> index 00034e37bc9f..ff734c514c03 100644
> --- a/mm/kfence/kfence_test.c
> +++ b/mm/kfence/kfence_test.c
> @@ -113,26 +113,26 @@ static bool report_matches(const struct expect_report *r)
>         end = &expect[0][sizeof(expect[0]) - 1];
>         switch (r->type) {
>         case KFENCE_ERROR_OOB:
> -               cur += scnprintf(cur, end - cur, "BUG: KFENCE: out-of-bounds %s",
> +               cur = seprintf(cur, end, "BUG: KFENCE: out-of-bounds %s",
>                                  get_access_type(r));
>                 break;
>         case KFENCE_ERROR_UAF:
> -               cur += scnprintf(cur, end - cur, "BUG: KFENCE: use-after-free %s",
> +               cur = seprintf(cur, end, "BUG: KFENCE: use-after-free %s",
>                                  get_access_type(r));
>                 break;
>         case KFENCE_ERROR_CORRUPTION:
> -               cur += scnprintf(cur, end - cur, "BUG: KFENCE: memory corruption");
> +               cur = seprintf(cur, end, "BUG: KFENCE: memory corruption");
>                 break;
>         case KFENCE_ERROR_INVALID:
> -               cur += scnprintf(cur, end - cur, "BUG: KFENCE: invalid %s",
> +               cur = seprintf(cur, end, "BUG: KFENCE: invalid %s",
>                                  get_access_type(r));
>                 break;
>         case KFENCE_ERROR_INVALID_FREE:
> -               cur += scnprintf(cur, end - cur, "BUG: KFENCE: invalid free");
> +               cur = seprintf(cur, end, "BUG: KFENCE: invalid free");
>                 break;
>         }
>
> -       scnprintf(cur, end - cur, " in %pS", r->fn);
> +       seprintf(cur, end, " in %pS", r->fn);
>         /* The exact offset won't match, remove it; also strip module name. */
>         cur = strchr(expect[0], '+');
>         if (cur)
> @@ -144,26 +144,26 @@ static bool report_matches(const struct expect_report *r)
>
>         switch (r->type) {
>         case KFENCE_ERROR_OOB:
> -               cur += scnprintf(cur, end - cur, "Out-of-bounds %s at", get_access_type(r));
> +               cur = seprintf(cur, end, "Out-of-bounds %s at", get_access_type(r));
>                 addr = arch_kfence_test_address(addr);
>                 break;
>         case KFENCE_ERROR_UAF:
> -               cur += scnprintf(cur, end - cur, "Use-after-free %s at", get_access_type(r));
> +               cur = seprintf(cur, end, "Use-after-free %s at", get_access_type(r));
>                 addr = arch_kfence_test_address(addr);
>                 break;
>         case KFENCE_ERROR_CORRUPTION:
> -               cur += scnprintf(cur, end - cur, "Corrupted memory at");
> +               cur = seprintf(cur, end, "Corrupted memory at");
>                 break;
>         case KFENCE_ERROR_INVALID:
> -               cur += scnprintf(cur, end - cur, "Invalid %s at", get_access_type(r));
> +               cur = seprintf(cur, end, "Invalid %s at", get_access_type(r));
>                 addr = arch_kfence_test_address(addr);
>                 break;
>         case KFENCE_ERROR_INVALID_FREE:
> -               cur += scnprintf(cur, end - cur, "Invalid free of");
> +               cur = seprintf(cur, end, "Invalid free of");
>                 break;
>         }
>
> -       cur += scnprintf(cur, end - cur, " 0x%p", (void *)addr);
> +       seprintf(cur, end, " 0x%p", (void *)addr);
>
>         spin_lock_irqsave(&observed.lock, flags);
>         if (!report_available())
> diff --git a/mm/kmsan/kmsan_test.c b/mm/kmsan/kmsan_test.c
> index 9733a22c46c1..a062a46b2d24 100644
> --- a/mm/kmsan/kmsan_test.c
> +++ b/mm/kmsan/kmsan_test.c
> @@ -107,9 +107,9 @@ static bool report_matches(const struct expect_report *r)
>         cur = expected_header;
>         end = &expected_header[sizeof(expected_header) - 1];
>
> -       cur += scnprintf(cur, end - cur, "BUG: KMSAN: %s", r->error_type);
> +       cur = seprintf(cur, end, "BUG: KMSAN: %s", r->error_type);
>
> -       scnprintf(cur, end - cur, " in %s", r->symbol);
> +       seprintf(cur, end, " in %s", r->symbol);
>         /* The exact offset won't match, remove it; also strip module name. */
>         cur = strchr(expected_header, '+');
>         if (cur)
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index b28a1e6ae096..c696e4a6f4c2 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -3359,6 +3359,7 @@ int mpol_parse_str(char *str, struct mempolicy **mpol)
>  void mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol)
>  {
>         char *p = buffer;
> +       char *e = buffer + maxlen;
>         nodemask_t nodes = NODE_MASK_NONE;
>         unsigned short mode = MPOL_DEFAULT;
>         unsigned short flags = 0;
> @@ -3384,33 +3385,32 @@ void mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol)
>                 break;
>         default:
>                 WARN_ON_ONCE(1);
> -               snprintf(p, maxlen, "unknown");
> +               seprintf(p, e, "unknown");
>                 return;
>         }
>
> -       p += snprintf(p, maxlen, "%s", policy_modes[mode]);
> +       p = seprintf(p, e, "%s", policy_modes[mode]);
>
>         if (flags & MPOL_MODE_FLAGS) {
> -               p += snprintf(p, buffer + maxlen - p, "=");
> +               p = seprintf(p, e, "=");
>
>                 /*
>                  * Static and relative are mutually exclusive.
>                  */
>                 if (flags & MPOL_F_STATIC_NODES)
> -                       p += snprintf(p, buffer + maxlen - p, "static");
> +                       p = seprintf(p, e, "static");
>                 else if (flags & MPOL_F_RELATIVE_NODES)
> -                       p += snprintf(p, buffer + maxlen - p, "relative");
> +                       p = seprintf(p, e, "relative");
>
>                 if (flags & MPOL_F_NUMA_BALANCING) {
>                         if (!is_power_of_2(flags & MPOL_MODE_FLAGS))
> -                               p += snprintf(p, buffer + maxlen - p, "|");
> -                       p += snprintf(p, buffer + maxlen - p, "balancing");
> +                               p = seprintf(p, e, "|");
> +                       p = seprintf(p, e, "balancing");
>                 }
>         }
>
>         if (!nodes_empty(nodes))
> -               p += scnprintf(p, buffer + maxlen - p, ":%*pbl",
> -                              nodemask_pr_args(&nodes));
> +               seprintf(p, e, ":%*pbl", nodemask_pr_args(&nodes));
>  }
>
>  #ifdef CONFIG_SYSFS
> diff --git a/mm/page_owner.c b/mm/page_owner.c
> index cc4a6916eec6..5811738e3320 100644
> --- a/mm/page_owner.c
> +++ b/mm/page_owner.c
> @@ -496,7 +496,7 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m,
>  /*
>   * Looking for memcg information and print it out
>   */
> -static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
> +static inline char *print_page_owner_memcg(char *p, const char end[0],
>                                          struct page *page)
>  {
>  #ifdef CONFIG_MEMCG
> @@ -511,8 +511,7 @@ static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
>                 goto out_unlock;
>
>         if (memcg_data & MEMCG_DATA_OBJEXTS)
> -               ret += scnprintf(kbuf + ret, count - ret,
> -                               "Slab cache page\n");
> +               p = seprintf(p, end, "Slab cache page\n");
>
>         memcg = page_memcg_check(page);
>         if (!memcg)
> @@ -520,7 +519,7 @@ static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
>
>         online = (memcg->css.flags & CSS_ONLINE);
>         cgroup_name(memcg->css.cgroup, name, sizeof(name));
> -       ret += scnprintf(kbuf + ret, count - ret,
> +       p = seprintf(p, end,
>                         "Charged %sto %smemcg %s\n",
>                         PageMemcgKmem(page) ? "(via objcg) " : "",
>                         online ? "" : "offline ",
> @@ -529,7 +528,7 @@ static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
>         rcu_read_unlock();
>  #endif /* CONFIG_MEMCG */
>
> -       return ret;
> +       return p;
>  }
>
>  static ssize_t
> @@ -538,14 +537,16 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
>                 depot_stack_handle_t handle)
>  {
>         int ret, pageblock_mt, page_mt;
> -       char *kbuf;
> +       char *kbuf, *p, *e;
>
>         count = min_t(size_t, count, PAGE_SIZE);
>         kbuf = kmalloc(count, GFP_KERNEL);
>         if (!kbuf)
>                 return -ENOMEM;
>
> -       ret = scnprintf(kbuf, count,
> +       p = kbuf;
> +       e = kbuf + count;
> +       p = seprintf(p, e,
>                         "Page allocated via order %u, mask %#x(%pGg), pid %d, tgid %d (%s), ts %llu ns\n",
>                         page_owner->order, page_owner->gfp_mask,
>                         &page_owner->gfp_mask, page_owner->pid,
> @@ -555,7 +556,7 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
>         /* Print information relevant to grouping pages by mobility */
>         pageblock_mt = get_pageblock_migratetype(page);
>         page_mt  = gfp_migratetype(page_owner->gfp_mask);
> -       ret += scnprintf(kbuf + ret, count - ret,
> +       p = seprintf(p, e,
>                         "PFN 0x%lx type %s Block %lu type %s Flags %pGp\n",
>                         pfn,
>                         migratetype_names[page_mt],
> @@ -563,22 +564,23 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
>                         migratetype_names[pageblock_mt],
>                         &page->flags);
>
> -       ret += stack_depot_snprint(handle, kbuf + ret, count - ret, 0);
> -       if (ret >= count)
> -               goto err;
> +       p = stack_depot_seprint(handle, p, e, 0);
> +       if (p == NULL)
> +               goto err;  // XXX: Should we remove this error handling?
>
>         if (page_owner->last_migrate_reason != -1) {
> -               ret += scnprintf(kbuf + ret, count - ret,
> +               p = seprintf(p, e,
>                         "Page has been migrated, last migrate reason: %s\n",
>                         migrate_reason_names[page_owner->last_migrate_reason]);
>         }
>
> -       ret = print_page_owner_memcg(kbuf, count, ret, page);
> +       p = print_page_owner_memcg(p, e, page);
>
> -       ret += snprintf(kbuf + ret, count - ret, "\n");
> -       if (ret >= count)
> +       p = seprintf(p, e, "\n");
> +       if (p == NULL)
>                 goto err;
>
> +       ret = p - kbuf;
>         if (copy_to_user(buf, kbuf, ret))
>                 ret = -EFAULT;
>
> diff --git a/mm/slub.c b/mm/slub.c
> index be8b09e09d30..b67c6ca0d0f7 100644
> --- a/mm/slub.c
> +++ b/mm/slub.c
> @@ -7451,6 +7451,7 @@ static char *create_unique_id(struct kmem_cache *s)
>  {
>         char *name = kmalloc(ID_STR_LENGTH, GFP_KERNEL);
>         char *p = name;
> +       char *e = name + ID_STR_LENGTH;
>
>         if (!name)
>                 return ERR_PTR(-ENOMEM);
> @@ -7475,9 +7476,9 @@ static char *create_unique_id(struct kmem_cache *s)
>                 *p++ = 'A';
>         if (p != name + 1)
>                 *p++ = '-';
> -       p += snprintf(p, ID_STR_LENGTH - (p - name), "%07u", s->size);
> +       p = seprintf(p, e, "%07u", s->size);
>
> -       if (WARN_ON(p > name + ID_STR_LENGTH - 1)) {
> +       if (WARN_ON(p == NULL)) {
>                 kfree(name);
>                 return ERR_PTR(-EINVAL);
>         }
> --
> 2.50.0
>

Re: [RFC v3 3/7] mm: Use seprintf() instead of less ergonomic APIs

Posted by Alejandro Colomar 3 months ago

Hi Marco,

On Mon, Jul 07, 2025 at 09:44:09AM +0200, Marco Elver wrote:
> On Mon, 7 Jul 2025 at 07:06, Alejandro Colomar <alx@kernel.org> wrote:
> >
> > While doing this, I detected some anomalies in the existing code:
> >
> > mm/kfence/kfence_test.c:
> >
> >         -  The last call to scnprintf() did increment 'cur', but it's
> >            unused after that, so it was dead code.  I've removed the dead
> >            code in this patch.
> 
> That was done to be consistent with the other code for readability,
> and to be clear where the next bytes should be appended (if someone
> decides to append more). There is no runtime dead code, the compiler
> optimizes away the assignment. But I'm indifferent, so removing the
> assignment is fine if you prefer that.

Yeah, I guessed that might be the reason.  I'm fine restoring it if you
prefer it.  I tend to use -Wunused-but-set-variable, but if it is not
used here and doesn't trigger, I guess it's fine to keep it.

> Did you run the tests? Do they pass?

I don't know how to run them.  I've only built the kernel.  If you point
me to instructions on how to run them, I'll do so.  Thanks!

> >         -  'end' is calculated as
> >
> >                 end = &expect[0][sizeof(expect[0]) - 1];
> >
> >            However, the '-1' doesn't seem to be necessary.  When passing
> >            $2 to scnprintf(), the size was specified as 'end - cur'.
> >            And scnprintf() --just like snprintf(3)--, won't write more
> >            than $2 bytes (including the null byte).  That means that
> >            scnprintf() wouldn't write more than
> >
> >                 &expect[0][sizeof(expect[0]) - 1] - expect[0]
> >
> >            which simplifies to
> >
> >                 sizeof(expect[0]) - 1
> >
> >            bytes.  But we have sizeof(expect[0]) bytes available, so
> >            we're wasting one byte entirely.  This is a benign off-by-one
> >            bug.  The two occurrences of this bug will be fixed in a
> >            following patch in this series.
> >
> > mm/kmsan/kmsan_test.c:
> >
> >         The same benign off-by-one bug calculating the remaining size.
> 
> 
> Same - does the test pass?

Same; built the kernel, but didn't know how to run tests.


Have a lovely day!
Alex

> > mm/mempolicy.c:
> >
> >         This file uses the 'p += snprintf()' anti-pattern.  That will
> >         overflow the pointer on truncation, which has undefined
> >         behavior.  Using seprintf(), this bug is fixed.
> >
> >         As in the previous file, here there was also dead code in the
> >         last scnprintf() call, by incrementing a pointer that is not
> >         used after the call.  I've removed the dead code.
> >
> > mm/page_owner.c:
> >
> >         Within print_page_owner(), there are some calls to scnprintf(),
> >         which do not report truncation.  And then there are other calls to
> >         snprintf(), where we handle errors (there are two 'goto err').
> >
> >         I've kept the existing error handling, as I trust it's there for
> >         a good reason (i.e., we may want to avoid calling
> >         print_page_owner_memcg() if we truncated before).  Please review
> >         if this amount of error handling is the right one, or if we want
> >         to add or remove some.  For seprintf(), a single test for null
> >         after the last call is enough to detect truncation.
> >
> > mm/slub.c:
> >
> >         Again, the 'p += snprintf()' anti-pattern.  This is UB, and by
> >         using seprintf() we've fixed the bug.
> >
> > Fixes: f99e12b21b84 (2021-07-30; "kfence: add function to mask address bits")
> > [alx: that commit introduced dead code]
> > Fixes: af649773fb25 (2024-07-17; "mm/numa_balancing: teach mpol_to_str about the balancing mode")
> > [alx: that commit added p+=snprintf() calls, which are UB]
> > Fixes: 2291990ab36b (2008-04-28; "mempolicy: clean-up mpol-to-str() mempolicy formatting")
> > [alx: that commit changed p+=sprintf() into p+=snprintf(), which is still UB]
> > Fixes: 948927ee9e4f (2013-11-13; "mm, mempolicy: make mpol_to_str robust and always succeed")
> > [alx: that commit changes old code into p+=snprintf(), which is still UB]
> > [alx: that commit also produced dead code by leaving the last 'p+=...']
> > Fixes: d65360f22406 (2022-09-26; "mm/slub: clean up create_unique_id()")
> > [alx: that commit changed p+=sprintf() into p+=snprintf(), which is still UB]
> > Cc: Kees Cook <kees@kernel.org>
> > Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
> > Cc: Sven Schnelle <svens@linux.ibm.com>
> > Cc: Marco Elver <elver@google.com>
> > Cc: Heiko Carstens <hca@linux.ibm.com>
> > Cc: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
> > Cc: "Huang, Ying" <ying.huang@intel.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
> > Cc: Linus Torvalds <torvalds@linux-foundation.org>
> > Cc: David Rientjes <rientjes@google.com>
> > Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
> > Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
> > Cc: Chao Yu <chao.yu@oppo.com>
> > Cc: Vlastimil Babka <vbabka@suse.cz>
> > Signed-off-by: Alejandro Colomar <alx@kernel.org>
> > ---
> >  mm/kfence/kfence_test.c | 24 ++++++++++++------------
> >  mm/kmsan/kmsan_test.c   |  4 ++--
> >  mm/mempolicy.c          | 18 +++++++++---------
> >  mm/page_owner.c         | 32 +++++++++++++++++---------------
> >  mm/slub.c               |  5 +++--
> >  5 files changed, 43 insertions(+), 40 deletions(-)
> >
> > diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
> > index 00034e37bc9f..ff734c514c03 100644
> > --- a/mm/kfence/kfence_test.c
> > +++ b/mm/kfence/kfence_test.c
> > @@ -113,26 +113,26 @@ static bool report_matches(const struct expect_report *r)
> >         end = &expect[0][sizeof(expect[0]) - 1];
> >         switch (r->type) {
> >         case KFENCE_ERROR_OOB:
> > -               cur += scnprintf(cur, end - cur, "BUG: KFENCE: out-of-bounds %s",
> > +               cur = seprintf(cur, end, "BUG: KFENCE: out-of-bounds %s",
> >                                  get_access_type(r));
> >                 break;
> >         case KFENCE_ERROR_UAF:
> > -               cur += scnprintf(cur, end - cur, "BUG: KFENCE: use-after-free %s",
> > +               cur = seprintf(cur, end, "BUG: KFENCE: use-after-free %s",
> >                                  get_access_type(r));
> >                 break;
> >         case KFENCE_ERROR_CORRUPTION:
> > -               cur += scnprintf(cur, end - cur, "BUG: KFENCE: memory corruption");
> > +               cur = seprintf(cur, end, "BUG: KFENCE: memory corruption");
> >                 break;
> >         case KFENCE_ERROR_INVALID:
> > -               cur += scnprintf(cur, end - cur, "BUG: KFENCE: invalid %s",
> > +               cur = seprintf(cur, end, "BUG: KFENCE: invalid %s",
> >                                  get_access_type(r));
> >                 break;
> >         case KFENCE_ERROR_INVALID_FREE:
> > -               cur += scnprintf(cur, end - cur, "BUG: KFENCE: invalid free");
> > +               cur = seprintf(cur, end, "BUG: KFENCE: invalid free");
> >                 break;
> >         }
> >
> > -       scnprintf(cur, end - cur, " in %pS", r->fn);
> > +       seprintf(cur, end, " in %pS", r->fn);
> >         /* The exact offset won't match, remove it; also strip module name. */
> >         cur = strchr(expect[0], '+');
> >         if (cur)
> > @@ -144,26 +144,26 @@ static bool report_matches(const struct expect_report *r)
> >
> >         switch (r->type) {
> >         case KFENCE_ERROR_OOB:
> > -               cur += scnprintf(cur, end - cur, "Out-of-bounds %s at", get_access_type(r));
> > +               cur = seprintf(cur, end, "Out-of-bounds %s at", get_access_type(r));
> >                 addr = arch_kfence_test_address(addr);
> >                 break;
> >         case KFENCE_ERROR_UAF:
> > -               cur += scnprintf(cur, end - cur, "Use-after-free %s at", get_access_type(r));
> > +               cur = seprintf(cur, end, "Use-after-free %s at", get_access_type(r));
> >                 addr = arch_kfence_test_address(addr);
> >                 break;
> >         case KFENCE_ERROR_CORRUPTION:
> > -               cur += scnprintf(cur, end - cur, "Corrupted memory at");
> > +               cur = seprintf(cur, end, "Corrupted memory at");
> >                 break;
> >         case KFENCE_ERROR_INVALID:
> > -               cur += scnprintf(cur, end - cur, "Invalid %s at", get_access_type(r));
> > +               cur = seprintf(cur, end, "Invalid %s at", get_access_type(r));
> >                 addr = arch_kfence_test_address(addr);
> >                 break;
> >         case KFENCE_ERROR_INVALID_FREE:
> > -               cur += scnprintf(cur, end - cur, "Invalid free of");
> > +               cur = seprintf(cur, end, "Invalid free of");
> >                 break;
> >         }
> >
> > -       cur += scnprintf(cur, end - cur, " 0x%p", (void *)addr);
> > +       seprintf(cur, end, " 0x%p", (void *)addr);
> >
> >         spin_lock_irqsave(&observed.lock, flags);
> >         if (!report_available())
> > diff --git a/mm/kmsan/kmsan_test.c b/mm/kmsan/kmsan_test.c
> > index 9733a22c46c1..a062a46b2d24 100644
> > --- a/mm/kmsan/kmsan_test.c
> > +++ b/mm/kmsan/kmsan_test.c
> > @@ -107,9 +107,9 @@ static bool report_matches(const struct expect_report *r)
> >         cur = expected_header;
> >         end = &expected_header[sizeof(expected_header) - 1];
> >
> > -       cur += scnprintf(cur, end - cur, "BUG: KMSAN: %s", r->error_type);
> > +       cur = seprintf(cur, end, "BUG: KMSAN: %s", r->error_type);
> >
> > -       scnprintf(cur, end - cur, " in %s", r->symbol);
> > +       seprintf(cur, end, " in %s", r->symbol);
> >         /* The exact offset won't match, remove it; also strip module name. */
> >         cur = strchr(expected_header, '+');
> >         if (cur)
> > diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> > index b28a1e6ae096..c696e4a6f4c2 100644
> > --- a/mm/mempolicy.c
> > +++ b/mm/mempolicy.c
> > @@ -3359,6 +3359,7 @@ int mpol_parse_str(char *str, struct mempolicy **mpol)
> >  void mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol)
> >  {
> >         char *p = buffer;
> > +       char *e = buffer + maxlen;
> >         nodemask_t nodes = NODE_MASK_NONE;
> >         unsigned short mode = MPOL_DEFAULT;
> >         unsigned short flags = 0;
> > @@ -3384,33 +3385,32 @@ void mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol)
> >                 break;
> >         default:
> >                 WARN_ON_ONCE(1);
> > -               snprintf(p, maxlen, "unknown");
> > +               seprintf(p, e, "unknown");
> >                 return;
> >         }
> >
> > -       p += snprintf(p, maxlen, "%s", policy_modes[mode]);
> > +       p = seprintf(p, e, "%s", policy_modes[mode]);
> >
> >         if (flags & MPOL_MODE_FLAGS) {
> > -               p += snprintf(p, buffer + maxlen - p, "=");
> > +               p = seprintf(p, e, "=");
> >
> >                 /*
> >                  * Static and relative are mutually exclusive.
> >                  */
> >                 if (flags & MPOL_F_STATIC_NODES)
> > -                       p += snprintf(p, buffer + maxlen - p, "static");
> > +                       p = seprintf(p, e, "static");
> >                 else if (flags & MPOL_F_RELATIVE_NODES)
> > -                       p += snprintf(p, buffer + maxlen - p, "relative");
> > +                       p = seprintf(p, e, "relative");
> >
> >                 if (flags & MPOL_F_NUMA_BALANCING) {
> >                         if (!is_power_of_2(flags & MPOL_MODE_FLAGS))
> > -                               p += snprintf(p, buffer + maxlen - p, "|");
> > -                       p += snprintf(p, buffer + maxlen - p, "balancing");
> > +                               p = seprintf(p, e, "|");
> > +                       p = seprintf(p, e, "balancing");
> >                 }
> >         }
> >
> >         if (!nodes_empty(nodes))
> > -               p += scnprintf(p, buffer + maxlen - p, ":%*pbl",
> > -                              nodemask_pr_args(&nodes));
> > +               seprintf(p, e, ":%*pbl", nodemask_pr_args(&nodes));
> >  }
> >
> >  #ifdef CONFIG_SYSFS
> > diff --git a/mm/page_owner.c b/mm/page_owner.c
> > index cc4a6916eec6..5811738e3320 100644
> > --- a/mm/page_owner.c
> > +++ b/mm/page_owner.c
> > @@ -496,7 +496,7 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m,
> >  /*
> >   * Looking for memcg information and print it out
> >   */
> > -static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
> > +static inline char *print_page_owner_memcg(char *p, const char end[0],
> >                                          struct page *page)
> >  {
> >  #ifdef CONFIG_MEMCG
> > @@ -511,8 +511,7 @@ static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
> >                 goto out_unlock;
> >
> >         if (memcg_data & MEMCG_DATA_OBJEXTS)
> > -               ret += scnprintf(kbuf + ret, count - ret,
> > -                               "Slab cache page\n");
> > +               p = seprintf(p, end, "Slab cache page\n");
> >
> >         memcg = page_memcg_check(page);
> >         if (!memcg)
> > @@ -520,7 +519,7 @@ static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
> >
> >         online = (memcg->css.flags & CSS_ONLINE);
> >         cgroup_name(memcg->css.cgroup, name, sizeof(name));
> > -       ret += scnprintf(kbuf + ret, count - ret,
> > +       p = seprintf(p, end,
> >                         "Charged %sto %smemcg %s\n",
> >                         PageMemcgKmem(page) ? "(via objcg) " : "",
> >                         online ? "" : "offline ",
> > @@ -529,7 +528,7 @@ static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
> >         rcu_read_unlock();
> >  #endif /* CONFIG_MEMCG */
> >
> > -       return ret;
> > +       return p;
> >  }
> >
> >  static ssize_t
> > @@ -538,14 +537,16 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
> >                 depot_stack_handle_t handle)
> >  {
> >         int ret, pageblock_mt, page_mt;
> > -       char *kbuf;
> > +       char *kbuf, *p, *e;
> >
> >         count = min_t(size_t, count, PAGE_SIZE);
> >         kbuf = kmalloc(count, GFP_KERNEL);
> >         if (!kbuf)
> >                 return -ENOMEM;
> >
> > -       ret = scnprintf(kbuf, count,
> > +       p = kbuf;
> > +       e = kbuf + count;
> > +       p = seprintf(p, e,
> >                         "Page allocated via order %u, mask %#x(%pGg), pid %d, tgid %d (%s), ts %llu ns\n",
> >                         page_owner->order, page_owner->gfp_mask,
> >                         &page_owner->gfp_mask, page_owner->pid,
> > @@ -555,7 +556,7 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
> >         /* Print information relevant to grouping pages by mobility */
> >         pageblock_mt = get_pageblock_migratetype(page);
> >         page_mt  = gfp_migratetype(page_owner->gfp_mask);
> > -       ret += scnprintf(kbuf + ret, count - ret,
> > +       p = seprintf(p, e,
> >                         "PFN 0x%lx type %s Block %lu type %s Flags %pGp\n",
> >                         pfn,
> >                         migratetype_names[page_mt],
> > @@ -563,22 +564,23 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
> >                         migratetype_names[pageblock_mt],
> >                         &page->flags);
> >
> > -       ret += stack_depot_snprint(handle, kbuf + ret, count - ret, 0);
> > -       if (ret >= count)
> > -               goto err;
> > +       p = stack_depot_seprint(handle, p, e, 0);
> > +       if (p == NULL)
> > +               goto err;  // XXX: Should we remove this error handling?
> >
> >         if (page_owner->last_migrate_reason != -1) {
> > -               ret += scnprintf(kbuf + ret, count - ret,
> > +               p = seprintf(p, e,
> >                         "Page has been migrated, last migrate reason: %s\n",
> >                         migrate_reason_names[page_owner->last_migrate_reason]);
> >         }
> >
> > -       ret = print_page_owner_memcg(kbuf, count, ret, page);
> > +       p = print_page_owner_memcg(p, e, page);
> >
> > -       ret += snprintf(kbuf + ret, count - ret, "\n");
> > -       if (ret >= count)
> > +       p = seprintf(p, e, "\n");
> > +       if (p == NULL)
> >                 goto err;
> >
> > +       ret = p - kbuf;
> >         if (copy_to_user(buf, kbuf, ret))
> >                 ret = -EFAULT;
> >
> > diff --git a/mm/slub.c b/mm/slub.c
> > index be8b09e09d30..b67c6ca0d0f7 100644
> > --- a/mm/slub.c
> > +++ b/mm/slub.c
> > @@ -7451,6 +7451,7 @@ static char *create_unique_id(struct kmem_cache *s)
> >  {
> >         char *name = kmalloc(ID_STR_LENGTH, GFP_KERNEL);
> >         char *p = name;
> > +       char *e = name + ID_STR_LENGTH;
> >
> >         if (!name)
> >                 return ERR_PTR(-ENOMEM);
> > @@ -7475,9 +7476,9 @@ static char *create_unique_id(struct kmem_cache *s)
> >                 *p++ = 'A';
> >         if (p != name + 1)
> >                 *p++ = '-';
> > -       p += snprintf(p, ID_STR_LENGTH - (p - name), "%07u", s->size);
> > +       p = seprintf(p, e, "%07u", s->size);
> >
> > -       if (WARN_ON(p > name + ID_STR_LENGTH - 1)) {
> > +       if (WARN_ON(p == NULL)) {
> >                 kfree(name);
> >                 return ERR_PTR(-EINVAL);
> >         }
> > --
> > 2.50.0
> >

-- 
<https://www.alejandro-colomar.es/>

Re: [RFC v3 3/7] mm: Use seprintf() instead of less ergonomic APIs

Posted by Marco Elver 3 months ago

On Mon, 7 Jul 2025 at 16:39, Alejandro Colomar <alx@kernel.org> wrote:
>
> Hi Marco,
>
> On Mon, Jul 07, 2025 at 09:44:09AM +0200, Marco Elver wrote:
> > On Mon, 7 Jul 2025 at 07:06, Alejandro Colomar <alx@kernel.org> wrote:
> > >
> > > While doing this, I detected some anomalies in the existing code:
> > >
> > > mm/kfence/kfence_test.c:
> > >
> > >         -  The last call to scnprintf() did increment 'cur', but it's
> > >            unused after that, so it was dead code.  I've removed the dead
> > >            code in this patch.
> >
> > That was done to be consistent with the other code for readability,
> > and to be clear where the next bytes should be appended (if someone
> > decides to append more). There is no runtime dead code, the compiler
> > optimizes away the assignment. But I'm indifferent, so removing the
> > assignment is fine if you prefer that.
>
> Yeah, I guessed that might be the reason.  I'm fine restoring it if you
> prefer it.  I tend to use -Wunused-but-set-variable, but if it is not
> used here and doesn't trigger, I guess it's fine to keep it.

Feel free to make it warning-free, I guess that's useful.

> > Did you run the tests? Do they pass?
>
> I don't know how to run them.  I've only built the kernel.  If you point
> me to instructions on how to run them, I'll do so.  Thanks!

Should just be CONFIG_KFENCE_KUNIT_TEST=y -- then boot kernel and
check that the test reports "ok".

Thanks,
-- marco

Re: [RFC v3 3/7] mm: Use seprintf() instead of less ergonomic APIs

Posted by Alejandro Colomar 3 months ago

Hi Marco,

On Mon, Jul 07, 2025 at 04:58:53PM +0200, Marco Elver wrote:
> Feel free to make it warning-free, I guess that's useful.

Thanks!

> > > Did you run the tests? Do they pass?
> >
> > I don't know how to run them.  I've only built the kernel.  If you point
> > me to instructions on how to run them, I'll do so.  Thanks!
> 
> Should just be CONFIG_KFENCE_KUNIT_TEST=y -- then boot kernel and
> check that the test reports "ok".

Hmmm, I can't see the results.  Did I miss anything?

	alx@debian:~$ uname -a
	Linux debian 6.15.0-seprintf-mm+ #5 SMP PREEMPT_DYNAMIC Mon Jul  7 19:16:40 CEST 2025 x86_64 GNU/Linux
	alx@debian:~$ cat /boot/config-6.15.0-seprintf-mm+ | grep KFENCE
	CONFIG_HAVE_ARCH_KFENCE=y
	CONFIG_KFENCE=y
	CONFIG_KFENCE_SAMPLE_INTERVAL=0
	CONFIG_KFENCE_NUM_OBJECTS=255
	# CONFIG_KFENCE_DEFERRABLE is not set
	# CONFIG_KFENCE_STATIC_KEYS is not set
	CONFIG_KFENCE_STRESS_TEST_FAULTS=0
	CONFIG_KFENCE_KUNIT_TEST=y
	alx@debian:~$ sudo dmesg | grep -i kfence
	alx@debian:~$ 

I see a lot of new stuff in dmesg, but nothing with 'kfence' in it.


Cheers,
Alex

-- 
<https://www.alejandro-colomar.es/>

Re: [RFC v3 3/7] mm: Use seprintf() instead of less ergonomic APIs

Posted by Marco Elver 3 months ago

On Mon, 7 Jul 2025 at 20:51, Alejandro Colomar <alx@kernel.org> wrote:
>
> Hi Marco,
>
> On Mon, Jul 07, 2025 at 04:58:53PM +0200, Marco Elver wrote:
> > Feel free to make it warning-free, I guess that's useful.
>
> Thanks!
>
> > > > Did you run the tests? Do they pass?
> > >
> > > I don't know how to run them.  I've only built the kernel.  If you point
> > > me to instructions on how to run them, I'll do so.  Thanks!
> >
> > Should just be CONFIG_KFENCE_KUNIT_TEST=y -- then boot kernel and
> > check that the test reports "ok".
>
> Hmmm, I can't see the results.  Did I miss anything?
>
>         alx@debian:~$ uname -a
>         Linux debian 6.15.0-seprintf-mm+ #5 SMP PREEMPT_DYNAMIC Mon Jul  7 19:16:40 CEST 2025 x86_64 GNU/Linux
>         alx@debian:~$ cat /boot/config-6.15.0-seprintf-mm+ | grep KFENCE
>         CONFIG_HAVE_ARCH_KFENCE=y
>         CONFIG_KFENCE=y
>         CONFIG_KFENCE_SAMPLE_INTERVAL=0

                     ^^ This means KFENCE is off.

Not sure why it's 0 (distro default config?), but if you switch it to
something like:

  CONFIG_KFENCE_SAMPLE_INTERVAL=10

The test should run. Alternatively set 'kfence.sample_interval=10' as
boot param.

>         CONFIG_KFENCE_NUM_OBJECTS=255
>         # CONFIG_KFENCE_DEFERRABLE is not set
>         # CONFIG_KFENCE_STATIC_KEYS is not set
>         CONFIG_KFENCE_STRESS_TEST_FAULTS=0
>         CONFIG_KFENCE_KUNIT_TEST=y
>         alx@debian:~$ sudo dmesg | grep -i kfence
>         alx@debian:~$
>
> I see a lot of new stuff in dmesg, but nothing with 'kfence' in it.
>
>
> Cheers,
> Alex
>
> --
> <https://www.alejandro-colomar.es/>

Re: [RFC v3 3/7] mm: Use seprintf() instead of less ergonomic APIs

Posted by Alejandro Colomar 3 months ago

Hi Marco,

On Mon, Jul 07, 2025 at 09:08:29PM +0200, Marco Elver wrote:
> > > > > Did you run the tests? Do they pass?
> > > >
> > > > I don't know how to run them.  I've only built the kernel.  If you point
> > > > me to instructions on how to run them, I'll do so.  Thanks!
> > >
> > > Should just be CONFIG_KFENCE_KUNIT_TEST=y -- then boot kernel and
> > > check that the test reports "ok".
> >
> > Hmmm, I can't see the results.  Did I miss anything?
> >
> >         alx@debian:~$ uname -a
> >         Linux debian 6.15.0-seprintf-mm+ #5 SMP PREEMPT_DYNAMIC Mon Jul  7 19:16:40 CEST 2025 x86_64 GNU/Linux
> >         alx@debian:~$ cat /boot/config-6.15.0-seprintf-mm+ | grep KFENCE
> >         CONFIG_HAVE_ARCH_KFENCE=y
> >         CONFIG_KFENCE=y
> >         CONFIG_KFENCE_SAMPLE_INTERVAL=0
> 
>                      ^^ This means KFENCE is off.
> 
> Not sure why it's 0 (distro default config?), but if you switch it to
> something like:

Yup, Debian default config plus what you told me.  :)

> 
>   CONFIG_KFENCE_SAMPLE_INTERVAL=10

Thanks!  Now I see the tests.

I see no regressions.  I've tested both v6.15 and my branch, and see no
differences:


This was generated with the kernel built from my branch:

	$ sudo dmesg | grep -inC2 kfence | sed 's/^....//' > tmp/log_after

This was generated with a v6.15 kernel built with the exact same config:

	$ sudo dmesg | grep -inC2 kfence | sed 's/^....//' > tmp/log_before

And here's a diff, ignoring some numbers that were easy to filter out:

	$ diff -U999 \
		<(cat tmp/log_before \
			| sed 's/0x[0-9a-f]*/0x????/g' \
			| sed 's/[[:digit:]]\.[[:digit:]]\+/?.?/g' \
			| sed 's/#[[:digit:]]\+/#???/g') \
		<(cat tmp/log_after \
			| sed 's/0x[0-9a-f]*/0x????/g' \
			| sed 's/[[:digit:]]\.[[:digit:]]\+/?.?/g' \
			| sed 's/#[[:digit:]]\+/#???/g');
	--- /dev/fd/63	2025-07-07 22:47:37.395608776 +0200
	+++ /dev/fd/62	2025-07-07 22:47:37.395608776 +0200
	@@ -1,303 +1,303 @@
	 [    ?.?] NR_IRQS: 524544, nr_irqs: 1096, preallocated irqs: 16
	 [    ?.?] rcu: srcu_init: Setting srcu_struct sizes based on contention.
	 [    ?.?] kfence: initialized - using 2097152 bytes for 255 objects at 0x????(____ptrval____)-0x????(____ptrval____)
	 [    ?.?] Console: colour dummy device 80x????
	 [    ?.?] printk: legacy console [tty0] enabled
	 --
	 [    ?.?] ok 7 sysctl_test
	 [    ?.?]     KTAP version 1
	 [    ?.?]     # Subtest: kfence
	 [    ?.?]     1..27
	 [    ?.?]     # test_out_of_bounds_read: test_alloc: size=32, gfp=cc0, policy=left, cache=0
	 [    ?.?] ==================================================================
	 [    ?.?] BUG: KFENCE: out-of-bounds read in test_out_of_bounds_read+0x????/0x????
	 
	 [    ?.?] Out-of-bounds read at 0x???? (1B left of kfence-#???):
	 [    ?.?]  test_out_of_bounds_read+0x????/0x????
	 [    ?.?]  kunit_try_run_case+0x????/0x????
	 --
	 [    ?.?]  ret_from_fork_asm+0x????/0x????
	 
	 [    ?.?] kfence-#???: 0x????-0x????, size=32, cache=kmalloc-32
	 
	-[    ?.?] allocated by task 281 on cpu 6 at ?.?s (?.?s ago):
	+[    ?.?] allocated by task 286 on cpu 8 at ?.?s (?.?s ago):
	 --
	 [    ?.?]     # test_out_of_bounds_read: test_alloc: size=32, gfp=cc0, policy=right, cache=0
	 [    ?.?] ==================================================================
	 [    ?.?] BUG: KFENCE: out-of-bounds read in test_out_of_bounds_read.cold+0x????/0x????
	 
	 [    ?.?] Out-of-bounds read at 0x???? (32B right of kfence-#???):
	 [    ?.?]  test_out_of_bounds_read.cold+0x????/0x????
	 [    ?.?]  kunit_try_run_case+0x????/0x????
	 --
	 [    ?.?]  ret_from_fork_asm+0x????/0x????
	 
	 [    ?.?] kfence-#???: 0x????-0x????, size=32, cache=kmalloc-32
	 
	-[    ?.?] allocated by task 281 on cpu 6 at ?.?s (?.?s ago):
	+[    ?.?] allocated by task 286 on cpu 11 at ?.?s (?.?s ago):
	 --
	 [    ?.?]     # test_out_of_bounds_read-memcache: test_alloc: size=32, gfp=cc0, policy=left, cache=1
	 -[    ?.?] ==================================================================
	 :[    ?.?] BUG: KFENCE: out-of-bounds read in test_out_of_bounds_read+0x????/0x????
	 -
	 :[    ?.?] Out-of-bounds read at 0x???? (1B left of kfence-#???):
	 -[    ?.?]  test_out_of_bounds_read+0x????/0x????
	 -[    ?.?]  kunit_try_run_case+0x????/0x????
	 --
	 -[    ?.?]  ret_from_fork_asm+0x????/0x????
	 -
	 :[    ?.?] kfence-#???: 0x????-0x????, size=32, cache=test
	 -
	--[    ?.?] allocated by task 284 on cpu 6 at ?.?s (?.?s ago):
	+-[    ?.?] allocated by task 289 on cpu 8 at ?.?s (?.?s ago):
	 --
	 -[    ?.?]     # test_out_of_bounds_read-memcache: test_alloc: size=32, gfp=cc0, policy=right, cache=1
	 -[    ?.?] ==================================================================
	 :[    ?.?] BUG: KFENCE: out-of-bounds read in test_out_of_bounds_read.cold+0x????/0x????
	 -
	 :[    ?.?] Out-of-bounds read at 0x???? (32B right of kfence-#???):
	 -[    ?.?]  test_out_of_bounds_read.cold+0x????/0x????
	 -[    ?.?]  kunit_try_run_case+0x????/0x????
	 --
	 -[    ?.?]  ret_from_fork_asm+0x????/0x????
	 -
	 :[    ?.?] kfence-#???: 0x????-0x????, size=32, cache=test
	 -
	--[    ?.?] allocated by task 284 on cpu 6 at ?.?s (?.?s ago):
	+-[    ?.?] allocated by task 289 on cpu 8 at ?.?s (?.?s ago):
	 --
	 -[    ?.?]     # test_out_of_bounds_write: test_alloc: size=32, gfp=cc0, policy=left, cache=0
	 -[    ?.?] ==================================================================
	 :[    ?.?] BUG: KFENCE: out-of-bounds write in test_out_of_bounds_write+0x????/0x????
	 -
	 :[    ?.?] Out-of-bounds write at 0x???? (1B left of kfence-#???):
	 -[    ?.?]  test_out_of_bounds_write+0x????/0x????
	 -[    ?.?]  kunit_try_run_case+0x????/0x????
	 --
	 -[    ?.?]  ret_from_fork_asm+0x????/0x????
	 -
	 :[    ?.?] kfence-#???: 0x????-0x????, size=32, cache=kmalloc-32
	 -
	--[    ?.?] allocated by task 288 on cpu 6 at ?.?s (?.?s ago):
	+-[    ?.?] allocated by task 291 on cpu 6 at ?.?s (?.?s ago):
	 --
	--[    ?.?]     # test_out_of_bounds_write-memcache: test_alloc: size=32, gfp=cc0, policy=left, cache=1
	 -[    ?.?] ==================================================================
	+-[    ?.?] clocksource: tsc: mask: 0x???? max_cycles: 0x????, max_idle_ns: 881590599626 ns
	 :[    ?.?] BUG: KFENCE: out-of-bounds write in test_out_of_bounds_write+0x????/0x????
	 -
	 :[    ?.?] Out-of-bounds write at 0x???? (1B left of kfence-#???):
	 -[    ?.?]  test_out_of_bounds_write+0x????/0x????
	 -[    ?.?]  kunit_try_run_case+0x????/0x????
	 --
	 -[    ?.?]  ret_from_fork_asm+0x????/0x????
	 -
	 :[    ?.?] kfence-#???: 0x????-0x????, size=32, cache=test
	 -
	--[    ?.?] allocated by task 290 on cpu 6 at ?.?s (?.?s ago):
	+-[    ?.?] allocated by task 293 on cpu 10 at ?.?s (?.?s ago):
	 --
	 -[    ?.?]     # test_use_after_free_read: test_alloc: size=32, gfp=cc0, policy=any, cache=0
	 -[    ?.?] ==================================================================
	 :[    ?.?] BUG: KFENCE: use-after-free read in test_use_after_free_read+0x????/0x????
	 -
	 :[    ?.?] Use-after-free read at 0x???? (in kfence-#???):
	 -[    ?.?]  test_use_after_free_read+0x????/0x????
	 -[    ?.?]  kunit_try_run_case+0x????/0x????
	 --
	 -[    ?.?]  ret_from_fork_asm+0x????/0x????
	 -
	 :[    ?.?] kfence-#???: 0x????-0x????, size=32, cache=kmalloc-32
	 -
	--[    ?.?] allocated by task 292 on cpu 6 at ?.?s (?.?s ago):
	+-[    ?.?] allocated by task 296 on cpu 10 at ?.?s (?.?s ago):
	 --
	 -[    ?.?]     # test_use_after_free_read-memcache: test_alloc: size=32, gfp=cc0, policy=any, cache=1
	 -[    ?.?] ==================================================================
	 :[    ?.?] BUG: KFENCE: use-after-free read in test_use_after_free_read+0x????/0x????
	 -
	 :[    ?.?] Use-after-free read at 0x???? (in kfence-#???):
	 -[    ?.?]  test_use_after_free_read+0x????/0x????
	 -[    ?.?]  kunit_try_run_case+0x????/0x????
	 --
	 -[    ?.?]  ret_from_fork_asm+0x????/0x????
	 -
	 :[    ?.?] kfence-#???: 0x????-0x????, size=32, cache=test
	 -
	--[    ?.?] allocated by task 294 on cpu 6 at ?.?s (?.?s ago):
	+-[    ?.?] allocated by task 298 on cpu 10 at ?.?s (?.?s ago):
	 --
	 -[    ?.?]     # test_double_free: test_alloc: size=32, gfp=cc0, policy=any, cache=0
	 -[    ?.?] ==================================================================
	 :[    ?.?] BUG: KFENCE: invalid free in test_double_free+0x????/0x????
	 -
	 :[    ?.?] Invalid free of 0x???? (in kfence-#???):
	 -[    ?.?]  test_double_free+0x????/0x????
	 -[    ?.?]  kunit_try_run_case+0x????/0x????
	 --
	 -[    ?.?]  ret_from_fork_asm+0x????/0x????
	 -
	 :[    ?.?] kfence-#???: 0x????-0x????, size=32, cache=kmalloc-32
	 -
	--[    ?.?] allocated by task 300 on cpu 6 at ?.?s (?.?s ago):
	+-[    ?.?] allocated by task 304 on cpu 6 at ?.?s (?.?s ago):
	 --
	 -[    ?.?]     # test_double_free-memcache: test_alloc: size=32, gfp=cc0, policy=any, cache=1
	 -[    ?.?] ==================================================================
	 :[    ?.?] BUG: KFENCE: invalid free in test_double_free+0x????/0x????
	 -
	 :[    ?.?] Invalid free of 0x???? (in kfence-#???):
	 -[    ?.?]  test_double_free+0x????/0x????
	 -[    ?.?]  kunit_try_run_case+0x????/0x????
	 --
	 -[    ?.?]  ret_from_fork_asm+0x????/0x????
	 -
	 :[    ?.?] kfence-#???: 0x????-0x????, size=32, cache=test
	 -
	--[    ?.?] allocated by task 302 on cpu 6 at ?.?s (?.?s ago):
	+-[    ?.?] allocated by task 306 on cpu 8 at ?.?s (?.?s ago):
	 --
	 -[    ?.?]     # test_invalid_addr_free: test_alloc: size=32, gfp=cc0, policy=any, cache=0
	 -[    ?.?] ==================================================================
	 :[    ?.?] BUG: KFENCE: invalid free in test_invalid_addr_free+0x????/0x????
	 -
	 :[    ?.?] Invalid free of 0x???? (in kfence-#???):
	 -[    ?.?]  test_invalid_addr_free+0x????/0x????
	 -[    ?.?]  kunit_try_run_case+0x????/0x????
	 --
	 -[    ?.?]  ret_from_fork_asm+0x????/0x????
	 -
	 :[    ?.?] kfence-#???: 0x????-0x????, size=32, cache=kmalloc-32
	 -
	--[    ?.?] allocated by task 304 on cpu 6 at ?.?s (?.?s ago):
	+-[    ?.?] allocated by task 308 on cpu 8 at ?.?s (?.?s ago):
	 --
	 -[    ?.?]     # test_invalid_addr_free-memcache: test_alloc: size=32, gfp=cc0, policy=any, cache=1
	 -[    ?.?] ==================================================================
	 :[    ?.?] BUG: KFENCE: invalid free in test_invalid_addr_free+0x????/0x????
	 -
	 :[    ?.?] Invalid free of 0x???? (in kfence-#???):
	 -[    ?.?]  test_invalid_addr_free+0x????/0x????
	 -[    ?.?]  kunit_try_run_case+0x????/0x????
	 --
	 -[    ?.?]  ret_from_fork_asm+0x????/0x????
	 -
	 :[    ?.?] kfence-#???: 0x????-0x????, size=32, cache=test
	 -
	--[    ?.?] allocated by task 306 on cpu 6 at ?.?s (?.?s ago):
	+-[    ?.?] allocated by task 310 on cpu 8 at ?.?s (?.?s ago):
	 --
	 -[    ?.?]     # test_corruption: test_alloc: size=32, gfp=cc0, policy=left, cache=0
	 -[    ?.?] ==================================================================
	 :[    ?.?] BUG: KFENCE: memory corruption in test_corruption+0x????/0x????
	 -
	 :[    ?.?] Corrupted memory at 0x???? [ ! . . . . . . . . . . . . . . . ] (in kfence-#???):
	 -[    ?.?]  test_corruption+0x????/0x????
	 -[    ?.?]  kunit_try_run_case+0x????/0x????
	 --
	 -[    ?.?]  ret_from_fork_asm+0x????/0x????
	 -
	 :[    ?.?] kfence-#???: 0x????-0x????, size=32, cache=kmalloc-32
	 -
	--[    ?.?] allocated by task 308 on cpu 6 at ?.?s (?.?s ago):
	+-[    ?.?] allocated by task 312 on cpu 6 at ?.?s (?.?s ago):
	 --
	 -[    ?.?]     # test_corruption: test_alloc: size=32, gfp=cc0, policy=right, cache=0
	 -[    ?.?] ==================================================================
	 :[    ?.?] BUG: KFENCE: memory corruption in test_corruption+0x????/0x????
	 -
	 :[    ?.?] Corrupted memory at 0x???? [ ! ] (in kfence-#???):
	 -[    ?.?]  test_corruption+0x????/0x????
	 -[    ?.?]  kunit_try_run_case+0x????/0x????
	 --
	 -[    ?.?]  ret_from_fork_asm+0x????/0x????
	 -
	 :[    ?.?] kfence-#???: 0x????-0x????, size=32, cache=kmalloc-32
	 -
	--[    ?.?] allocated by task 308 on cpu 6 at ?.?s (?.?s ago):
	+-[    ?.?] allocated by task 312 on cpu 6 at ?.?s (?.?s ago):
	 --
	 -[    ?.?]     # test_corruption-memcache: test_alloc: size=32, gfp=cc0, policy=left, cache=1
	 -[    ?.?] ==================================================================
	 :[    ?.?] BUG: KFENCE: memory corruption in test_corruption+0x????/0x????
	 -
	 :[    ?.?] Corrupted memory at 0x???? [ ! . . . . . . . . . . . . . . . ] (in kfence-#???):
	 -[    ?.?]  test_corruption+0x????/0x????
	 -[    ?.?]  kunit_try_run_case+0x????/0x????
	 --
	 -[    ?.?]  ret_from_fork_asm+0x????/0x????
	 -
	 :[    ?.?] kfence-#???: 0x????-0x????, size=32, cache=test
	 -
	--[    ?.?] allocated by task 310 on cpu 6 at ?.?s (?.?s ago):
	+-[    ?.?] allocated by task 314 on cpu 6 at ?.?s (?.?s ago):
	 --
	 -[    ?.?]     # test_corruption-memcache: test_alloc: size=32, gfp=cc0, policy=right, cache=1
	 -[    ?.?] ==================================================================
	 :[    ?.?] BUG: KFENCE: memory corruption in test_corruption+0x????/0x????
	 -
	 :[    ?.?] Corrupted memory at 0x???? [ ! ] (in kfence-#???):
	 -[    ?.?]  test_corruption+0x????/0x????
	 -[    ?.?]  kunit_try_run_case+0x????/0x????
	 --
	 -[    ?.?]  ret_from_fork_asm+0x????/0x????
	 -
	 :[    ?.?] kfence-#???: 0x????-0x????, size=32, cache=test
	 -
	--[    ?.?] allocated by task 310 on cpu 6 at ?.?s (?.?s ago):
	+-[    ?.?] allocated by task 314 on cpu 6 at ?.?s (?.?s ago):
	 --
	 -[    ?.?]     # test_kmalloc_aligned_oob_read: test_alloc: size=73, gfp=cc0, policy=right, cache=0
	 -[    ?.?] ==================================================================
	 :[    ?.?] BUG: KFENCE: out-of-bounds read in test_kmalloc_aligned_oob_read+0x????/0x????
	 -
	 :[    ?.?] Out-of-bounds read at 0x???? (105B right of kfence-#???):
	 -[    ?.?]  test_kmalloc_aligned_oob_read+0x????/0x????
	 -[    ?.?]  kunit_try_run_case+0x????/0x????
	 --
	 -[    ?.?]  ret_from_fork_asm+0x????/0x????
	 -
	 :[    ?.?] kfence-#???: 0x????-0x????, size=73, cache=kmalloc-96
	 -
	--[    ?.?] allocated by task 320 on cpu 10 at ?.?s (?.?s ago):
	+-[    ?.?] allocated by task 326 on cpu 6 at ?.?s (?.?s ago):
	 --
	 -[    ?.?]     # test_kmalloc_aligned_oob_write: test_alloc: size=73, gfp=cc0, policy=right, cache=0
	 -[    ?.?] ==================================================================
	 :[    ?.?] BUG: KFENCE: memory corruption in test_kmalloc_aligned_oob_write+0x????/0x????
	 -
	 :[    ?.?] Corrupted memory at 0x???? [ ! . . . . . . . . . . . . . . . ] (in kfence-#???):
	 -[    ?.?]  test_kmalloc_aligned_oob_write+0x????/0x????
	 -[    ?.?]  kunit_try_run_case+0x????/0x????
	 --
	 -[    ?.?]  ret_from_fork_asm+0x????/0x????
	 -
	 :[    ?.?] kfence-#???: 0x????-0x????, size=73, cache=kmalloc-96
	 -
	--[    ?.?] allocated by task 326 on cpu 8 at ?.?s (?.?s ago):
	+-[    ?.?] allocated by task 328 on cpu 4 at ?.?s (?.?s ago):
	 --
	 -[    ?.?]     ok 22 test_memcache_ctor
	 -[    ?.?] ==================================================================
	 :[    ?.?] BUG: KFENCE: invalid read in test_invalid_access+0x????/0x????
	 -
	 -[    ?.?] Invalid read at 0x????:
	 --
	 -[    ?.?]     # test_memcache_typesafe_by_rcu: test_alloc: size=32, gfp=cc0, policy=any, cache=1
	 -[    ?.?] ==================================================================
	 :[    ?.?] BUG: KFENCE: use-after-free read in test_memcache_typesafe_by_rcu.cold+0x????/0x????
	 -
	 :[    ?.?] Use-after-free read at 0x???? (in kfence-#???):
	 -[    ?.?]  test_memcache_typesafe_by_rcu.cold+0x????/0x????
	 -[    ?.?]  kunit_try_run_case+0x????/0x????
	 --
	 -[    ?.?]  ret_from_fork_asm+0x????/0x????
	 -
	 :[    ?.?] kfence-#???: 0x????-0x????, size=32, cache=test
	 -
	--[    ?.?] allocated by task 336 on cpu 6 at ?.?s (?.?s ago):
	+-[    ?.?] allocated by task 338 on cpu 10 at ?.?s (?.?s ago):
	 --
	 -[    ?.?]     # test_krealloc: test_alloc: size=32, gfp=cc0, policy=any, cache=0
	 -[    ?.?] ==================================================================
	 :[    ?.?] BUG: KFENCE: use-after-free read in test_krealloc+0x????/0x????
	 -
	 :[    ?.?] Use-after-free read at 0x???? (in kfence-#???):
	 -[    ?.?]  test_krealloc+0x????/0x????
	 -[    ?.?]  kunit_try_run_case+0x????/0x????
	 --
	 -[    ?.?]  ret_from_fork_asm+0x????/0x????
	 -
	 :[    ?.?] kfence-#???: 0x????-0x????, size=32, cache=kmalloc-32
	 -
	--[    ?.?] allocated by task 338 on cpu 4 at ?.?s (?.?s ago):
	+-[    ?.?] allocated by task 340 on cpu 6 at ?.?s (?.?s ago):
	 --
	 -[    ?.?]     # test_memcache_alloc_bulk: setup_test_cache: size=32, ctor=0x????
	 -[    ?.?]     ok 27 test_memcache_alloc_bulk
	 :[    ?.?] # kfence: pass:25 fail:0 skip:2 total:27
	 -[    ?.?] # Totals: pass:25 fail:0 skip:2 total:27
	 :[    ?.?] ok 8 kfence
	 -[    ?.?]     KTAP version 1
	 -[    ?.?]     # Subtest: damon

If you'd like me to grep for something more specific, please let me
know.


Cheers,
Alex

-- 
<https://www.alejandro-colomar.es/>

[RFC v3 4/7] array_size.h: Add ENDOF()

Posted by Alejandro Colomar 3 months ago

This macro is useful to calculate the second argument to seprintf(),
avoiding off-by-one bugs.
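
A minimal userspace sketch of how the macro is meant to be used, assuming
a simplified ARRAY_SIZE() (the kernel one also rejects non-arrays through
__must_be_array()).  The helper name row_capacity() is hypothetical, and
the macro argument is parenthesized defensively here, which the patch
itself does not do:

```c
#include <stddef.h>

/* Userspace stand-ins for the kernel macros (no __must_be_array() check). */
#define ARRAY_SIZE(arr)  (sizeof(arr) / sizeof((arr)[0]))
#define ENDOF(a)         ((a) + ARRAY_SIZE(a))

/*
 * Hypothetical helper mirroring the kfence test: 'end' points one past
 * the last element of the first row, so 'end - cur' is the full row size.
 */
static ptrdiff_t row_capacity(void)
{
	char expect[2][512];
	char *cur = expect[0];
	const char *end = ENDOF(expect[0]);

	return end - cur;
}
```

The distance between the start of the row and ENDOF() is the whole row,
with no manual "- 1" adjustment to get wrong.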

Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 include/linux/array_size.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/array_size.h b/include/linux/array_size.h
index 06d7d83196ca..781bdb70d939 100644
--- a/include/linux/array_size.h
+++ b/include/linux/array_size.h
@@ -10,4 +10,10 @@
  */
 #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
 
+/**
+ * ENDOF - get a pointer to one past the last element in array @a
+ * @a: array
+ */
+#define ENDOF(a)  (a + ARRAY_SIZE(a))
+
 #endif  /* _LINUX_ARRAY_SIZE_H */
-- 
2.50.0

[RFC v3 5/7] mm: Fix benign off-by-one bugs

Posted by Alejandro Colomar 3 months ago

We were wasting a byte due to an off-by-one bug.  s[c]nprintf()
doesn't write more than $2 bytes including the null byte, so trying to
pass 'size-1' there is wasting one byte.  Now that we use seprintf(),
the situation isn't different: seprintf() will stop writing *before*
'end' --that is, at most the terminating null byte will be written at
'end-1'--.
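
The wasted byte can be illustrated in userspace with snprintf(3), which
has the same size semantics described above.  This is only a sketch:
BUF_SIZE and stored_len() are hypothetical names, not part of the patch.

```c
#include <stdio.h>
#include <string.h>

#define BUF_SIZE 8

/*
 * Format 's' into a BUF_SIZE-byte buffer using 'limit' as the size
 * argument, and report how many characters were actually stored.
 */
static size_t stored_len(size_t limit, const char *s)
{
	char buf[BUF_SIZE];

	memset(buf, 0, sizeof(buf));
	snprintf(buf, limit, "%s", s);

	return strlen(buf);
}
```

With a 10-character input, passing BUF_SIZE stores 7 characters plus the
null byte, while passing BUF_SIZE - 1 stores only 6 and leaves the last
byte of the buffer unused.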

Fixes: bc8fbc5f305a (2021-02-26; "kfence: add test suite")
Fixes: 8ed691b02ade (2022-10-03; "kmsan: add tests for KMSAN")
Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 mm/kfence/kfence_test.c | 4 ++--
 mm/kmsan/kmsan_test.c   | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
index ff734c514c03..f02c3e23638a 100644
--- a/mm/kfence/kfence_test.c
+++ b/mm/kfence/kfence_test.c
@@ -110,7 +110,7 @@ static bool report_matches(const struct expect_report *r)
 
 	/* Title */
 	cur = expect[0];
-	end = &expect[0][sizeof(expect[0]) - 1];
+	end = ENDOF(expect[0]);
 	switch (r->type) {
 	case KFENCE_ERROR_OOB:
 		cur = seprintf(cur, end, "BUG: KFENCE: out-of-bounds %s",
@@ -140,7 +140,7 @@ static bool report_matches(const struct expect_report *r)
 
 	/* Access information */
 	cur = expect[1];
-	end = &expect[1][sizeof(expect[1]) - 1];
+	end = ENDOF(expect[1]);
 
 	switch (r->type) {
 	case KFENCE_ERROR_OOB:
diff --git a/mm/kmsan/kmsan_test.c b/mm/kmsan/kmsan_test.c
index a062a46b2d24..882500807db8 100644
--- a/mm/kmsan/kmsan_test.c
+++ b/mm/kmsan/kmsan_test.c
@@ -105,7 +105,7 @@ static bool report_matches(const struct expect_report *r)
 
 	/* Title */
 	cur = expected_header;
-	end = &expected_header[sizeof(expected_header) - 1];
+	end = ENDOF(expected_header);
 
 	cur = seprintf(cur, end, "BUG: KMSAN: %s", r->error_type);
 
-- 
2.50.0

Re: [RFC v3 5/7] mm: Fix benign off-by-one bugs

Posted by Marco Elver 3 months ago

On Mon, 7 Jul 2025 at 07:06, Alejandro Colomar <alx@kernel.org> wrote:
>
> We were wasting a byte due to an off-by-one bug.  s[c]nprintf()
> doesn't write more than $2 bytes including the null byte, so trying to
> pass 'size-1' there is wasting one byte.  Now that we use seprintf(),
> the situation isn't different: seprintf() will stop writing *before*
> 'end' --that is, at most the terminating null byte will be written at
> 'end-1'--.
>
> Fixes: bc8fbc5f305a (2021-02-26; "kfence: add test suite")
> Fixes: 8ed691b02ade (2022-10-03; "kmsan: add tests for KMSAN")

Not sure about the Fixes - this means it's likely going to be
backported to stable kernels, which is not appropriate. There's no
functional problem, and these are tests only, so not worth the churn.

Did you run the tests?

Otherwise:

Acked-by: Marco Elver <elver@google.com>

> Cc: Kees Cook <kees@kernel.org>
> Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
> Cc: Alexander Potapenko <glider@google.com>
> Cc: Marco Elver <elver@google.com>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Cc: Alexander Potapenko <glider@google.com>
> Cc: Jann Horn <jannh@google.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Signed-off-by: Alejandro Colomar <alx@kernel.org>
> ---
>  mm/kfence/kfence_test.c | 4 ++--
>  mm/kmsan/kmsan_test.c   | 2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
> index ff734c514c03..f02c3e23638a 100644
> --- a/mm/kfence/kfence_test.c
> +++ b/mm/kfence/kfence_test.c
> @@ -110,7 +110,7 @@ static bool report_matches(const struct expect_report *r)
>
>         /* Title */
>         cur = expect[0];
> -       end = &expect[0][sizeof(expect[0]) - 1];
> +       end = ENDOF(expect[0]);
>         switch (r->type) {
>         case KFENCE_ERROR_OOB:
>                 cur = seprintf(cur, end, "BUG: KFENCE: out-of-bounds %s",
> @@ -140,7 +140,7 @@ static bool report_matches(const struct expect_report *r)
>
>         /* Access information */
>         cur = expect[1];
> -       end = &expect[1][sizeof(expect[1]) - 1];
> +       end = ENDOF(expect[1]);
>
>         switch (r->type) {
>         case KFENCE_ERROR_OOB:
> diff --git a/mm/kmsan/kmsan_test.c b/mm/kmsan/kmsan_test.c
> index a062a46b2d24..882500807db8 100644
> --- a/mm/kmsan/kmsan_test.c
> +++ b/mm/kmsan/kmsan_test.c
> @@ -105,7 +105,7 @@ static bool report_matches(const struct expect_report *r)
>
>         /* Title */
>         cur = expected_header;
> -       end = &expected_header[sizeof(expected_header) - 1];
> +       end = ENDOF(expected_header);
>
>         cur = seprintf(cur, end, "BUG: KMSAN: %s", r->error_type);
>
> --
> 2.50.0
>

Re: [RFC v3 5/7] mm: Fix benign off-by-one bugs

Posted by Michal Hocko 3 months ago

On Mon 07-07-25 09:46:12, Marco Elver wrote:
> On Mon, 7 Jul 2025 at 07:06, Alejandro Colomar <alx@kernel.org> wrote:
> >
> > We were wasting a byte due to an off-by-one bug.  s[c]nprintf()
> > doesn't write more than $2 bytes including the null byte, so trying to
> > pass 'size-1' there is wasting one byte.  Now that we use seprintf(),
> > the situation isn't different: seprintf() will stop writing *before*
> > 'end' --that is, at most the terminating null byte will be written at
> > 'end-1'--.
> >
> > Fixes: bc8fbc5f305a (2021-02-26; "kfence: add test suite")
> > Fixes: 8ed691b02ade (2022-10-03; "kmsan: add tests for KMSAN")
> 
> Not sure about the Fixes - this means it's likely going to be
> backported to stable kernels, which is not appropriate. There's no
> functional problem, and these are tests only, so not worth the churn.

As long as there is no actual bug fixed then I believe those Fixes tags
are more confusing than actually helpful. And that applies to other
patches in this series as well.
-- 
Michal Hocko
SUSE Labs

Re: [RFC v3 5/7] mm: Fix benign off-by-one bugs

Posted by Alejandro Colomar 3 months ago

Hi Michal,

On Mon, Jul 07, 2025 at 09:53:31AM +0200, Michal Hocko wrote:
> On Mon 07-07-25 09:46:12, Marco Elver wrote:
> > On Mon, 7 Jul 2025 at 07:06, Alejandro Colomar <alx@kernel.org> wrote:
> > >
> > > We were wasting a byte due to an off-by-one bug.  s[c]nprintf()
> > > doesn't write more than $2 bytes including the null byte, so trying to
> > > pass 'size-1' there is wasting one byte.  Now that we use seprintf(),
> > > the situation isn't different: seprintf() will stop writing *before*
> > > 'end' --that is, at most the terminating null byte will be written at
> > > 'end-1'--.
> > >
> > > Fixes: bc8fbc5f305a (2021-02-26; "kfence: add test suite")
> > > Fixes: 8ed691b02ade (2022-10-03; "kmsan: add tests for KMSAN")
> > 
> > Not sure about the Fixes - this means it's likely going to be
> > backported to stable kernels, which is not appropriate. There's no
> > functional problem, and these are tests only, so not worth the churn.
> 
> As long as there is no actual bug fixed then I believe those Fixes tags
> are more confusing than actually helpful. And that applies to other
> patches in this series as well.

For the dead code, I can remove the fixes tags, and even the changes
themselves, since there are good reasons to keep the dead code
(consistency, and avoiding a future programmer forgetting to add it back
when adding a subsequent seprintf() call).

For the fixes to UB, do you prefer the Fixes tags to be removed too?


Have a lovely day!
Alex

> -- 
> Michal Hocko
> SUSE Labs

-- 
<https://www.alejandro-colomar.es/>

Re: [RFC v3 5/7] mm: Fix benign off-by-one bugs

Posted by Michal Hocko 3 months ago

On Mon 07-07-25 16:42:43, Alejandro Colomar wrote:
> Hi Michal,
> 
> On Mon, Jul 07, 2025 at 09:53:31AM +0200, Michal Hocko wrote:
> > On Mon 07-07-25 09:46:12, Marco Elver wrote:
> > > On Mon, 7 Jul 2025 at 07:06, Alejandro Colomar <alx@kernel.org> wrote:
> > > >
> > > > We were wasting a byte due to an off-by-one bug.  s[c]nprintf()
> > > > doesn't write more than $2 bytes including the null byte, so trying to
> > > > pass 'size-1' there is wasting one byte.  Now that we use seprintf(),
> > > > the situation isn't different: seprintf() will stop writing *before*
> > > > 'end' --that is, at most the terminating null byte will be written at
> > > > 'end-1'--.
> > > >
> > > > Fixes: bc8fbc5f305a (2021-02-26; "kfence: add test suite")
> > > > Fixes: 8ed691b02ade (2022-10-03; "kmsan: add tests for KMSAN")
> > > 
> > > Not sure about the Fixes - this means it's likely going to be
> > > backported to stable kernels, which is not appropriate. There's no
> > > functional problem, and these are tests only, so not worth the churn.
> > 
> > As long as there is no actual bug fixed then I believe those Fixes tags
> > are more confusing than actually helpful. And that applies to other
> > patches in this series as well.
> 
> For the dead code, I can remove the fixes tags, and even the changes
> themselves, since there are good reasons to keep the dead code
> (consistency, and avoiding a future programmer forgetting to add it back
> when adding a subsequent seprintf() call).
> 
> For the fixes to UB, do you prefer the Fixes tags to be removed too?

Are any of those UBs real problems, or just theoretical ones? To be more
precise, I do not question having them plugged, but is there any
evidence that older kernels would need the fixes as well, other than
just in case?

-- 
Michal Hocko
SUSE Labs

Re: [RFC v3 5/7] mm: Fix benign off-by-one bugs

Posted by Alejandro Colomar 3 months ago

Hi Michal,

On Mon, Jul 07, 2025 at 05:12:00PM +0200, Michal Hocko wrote:
> > For the dead code, I can remove the fixes tags, and even the changes
> > themselves, since there are good reasons to keep the dead code
> > (consistency, and avoiding a future programmer forgetting to add it back
> > when adding a subsequent seprintf() call).
> > 
> > For the fixes to UB, do you prefer the Fixes tags to be removed too?
> 
> Are any of those UBs real problems, or just theoretical ones? To be more
> precise, I do not question having them plugged, but is there any
> evidence that older kernels would need the fixes as well, other than
> just in case?

No, I haven't done any checks to verify that this is exploitable in any
way.  I personally wouldn't backport any of this.

About the Fixes: tags, I guess if they are interpreted as something to
be backported, I'll remove them all, as I don't want to backport this.

I guess having them listed in the mailing list archives would be good
enough for speleology purposes (e.g., for someone interested in what
kinds of issues this API fixes).

I'll remove them all.

Cheers,
Alex

> 
> -- 
> Michal Hocko
> SUSE Labs

-- 
<https://www.alejandro-colomar.es/>

[RFC v3 6/7] sprintf: Add [V]STPRINTF()

Posted by Alejandro Colomar 3 months ago

These macros take the array size argument implicitly to avoid programmer
mistakes.  This guarantees that the input is an array, unlike the common
call

	snprintf(buf, sizeof(buf), ...);

which is dangerous if the programmer passes a pointer.

These macros are essentially the same as the 2-argument version of
strscpy(), but with a formatted string.
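
A userspace sketch of the idea, assuming snprintf(3) as a stand-in for
stprintf() (which only exists in this series) and a simplified
ARRAY_SIZE() without the kernel's __must_be_array() check.  The names
STPRINTF_SKETCH, struct area and name_area() are hypothetical:

```c
#include <stdio.h>

/* Simplified: the kernel version fails to build if 'arr' is a pointer. */
#define ARRAY_SIZE(arr)  (sizeof(arr) / sizeof((arr)[0]))

/* The size is derived from the array itself, never typed by hand. */
#define STPRINTF_SKETCH(a, ...)  snprintf(a, ARRAY_SIZE(a), __VA_ARGS__)

struct area {
	char name[16];
};

/*
 * With 'char *p', sizeof(p) would be the pointer size, not the buffer
 * size; taking the size from the array member avoids that mistake.
 */
static int name_area(struct area *a, int nid)
{
	return STPRINTF_SKETCH(a->name, "hugetlb%d", nid);
}
```

The caller never spells out the size, so the size cannot drift out of
sync with the array declaration.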

Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 include/linux/sprintf.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/linux/sprintf.h b/include/linux/sprintf.h
index c3dbfd2efd2b..6080d3732055 100644
--- a/include/linux/sprintf.h
+++ b/include/linux/sprintf.h
@@ -4,6 +4,10 @@
 
 #include <linux/compiler_attributes.h>
 #include <linux/types.h>
+#include <linux/array_size.h>
+
+#define STPRINTF(a, fmt, ...)  stprintf(a, ARRAY_SIZE(a), fmt, ##__VA_ARGS__)
+#define VSTPRINTF(a, fmt, ap)  vstprintf(a, ARRAY_SIZE(a), fmt, ap)
 
 int num_to_str(char *buf, int size, unsigned long long num, unsigned int width);
 
-- 
2.50.0

[RFC v3 7/7] mm: Use [V]STPRINTF() to avoid specifying the array size

Posted by Alejandro Colomar 3 months ago

Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 mm/backing-dev.c    | 2 +-
 mm/cma.c            | 4 ++--
 mm/cma_debug.c      | 2 +-
 mm/hugetlb.c        | 3 +--
 mm/hugetlb_cgroup.c | 2 +-
 mm/hugetlb_cma.c    | 2 +-
 mm/kasan/report.c   | 3 +--
 mm/memblock.c       | 4 ++--
 mm/percpu.c         | 2 +-
 mm/shrinker_debug.c | 2 +-
 mm/zswap.c          | 2 +-
 11 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 783904d8c5ef..408fdf52ee5d 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -1090,7 +1090,7 @@ int bdi_register_va(struct backing_dev_info *bdi, const char *fmt, va_list args)
 	if (bdi->dev)	/* The driver needs to use separate queues per device */
 		return 0;
 
-	vsnprintf(bdi->dev_name, sizeof(bdi->dev_name), fmt, args);
+	VSTPRINTF(bdi->dev_name, fmt, args);
 	dev = device_create(&bdi_class, NULL, MKDEV(0, 0), bdi, bdi->dev_name);
 	if (IS_ERR(dev))
 		return PTR_ERR(dev);
diff --git a/mm/cma.c b/mm/cma.c
index c04be488b099..49c54a74d6ce 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -237,9 +237,9 @@ static int __init cma_new_area(const char *name, phys_addr_t size,
 	cma_area_count++;
 
 	if (name)
-		snprintf(cma->name, CMA_MAX_NAME, "%s", name);
+		STPRINTF(cma->name, "%s", name);
 	else
-		snprintf(cma->name, CMA_MAX_NAME,  "cma%d\n", cma_area_count);
+		STPRINTF(cma->name, "cma%d\n", cma_area_count);
 
 	cma->available_count = cma->count = size >> PAGE_SHIFT;
 	cma->order_per_bit = order_per_bit;
diff --git a/mm/cma_debug.c b/mm/cma_debug.c
index fdf899532ca0..ae94b7ae6710 100644
--- a/mm/cma_debug.c
+++ b/mm/cma_debug.c
@@ -186,7 +186,7 @@ static void cma_debugfs_add_one(struct cma *cma, struct dentry *root_dentry)
 	rangedir = debugfs_create_dir("ranges", tmp);
 	for (r = 0; r < cma->nranges; r++) {
 		cmr = &cma->ranges[r];
-		snprintf(rdirname, sizeof(rdirname), "%d", r);
+		STPRINTF(rdirname, "%d", r);
 		dir = debugfs_create_dir(rdirname, rangedir);
 		debugfs_create_file("base_pfn", 0444, dir,
 			    &cmr->base_pfn, &cma_debugfs_fops);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6a3cf7935c14..6d0bd88eeba9 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4780,8 +4780,7 @@ void __init hugetlb_add_hstate(unsigned int order)
 	for (i = 0; i < MAX_NUMNODES; ++i)
 		INIT_LIST_HEAD(&h->hugepage_freelists[i]);
 	INIT_LIST_HEAD(&h->hugepage_activelist);
-	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
-					huge_page_size(h)/SZ_1K);
+	STPRINTF(h->name, "hugepages-%lukB", huge_page_size(h)/SZ_1K);
 
 	parsed_hstate = h;
 }
diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
index 58e895f3899a..8f5ffe35d16d 100644
--- a/mm/hugetlb_cgroup.c
+++ b/mm/hugetlb_cgroup.c
@@ -822,7 +822,7 @@ hugetlb_cgroup_cfttypes_init(struct hstate *h, struct cftype *cft,
 	for (i = 0; i < tmpl_size; cft++, tmpl++, i++) {
 		*cft = *tmpl;
 		/* rebuild the name */
-		snprintf(cft->name, MAX_CFTYPE_NAME, "%s.%s", buf, tmpl->name);
+		STPRINTF(cft->name, "%s.%s", buf, tmpl->name);
 		/* rebuild the private */
 		cft->private = MEMFILE_PRIVATE(idx, tmpl->private);
 		/* rebuild the file_offset */
diff --git a/mm/hugetlb_cma.c b/mm/hugetlb_cma.c
index e0f2d5c3a84c..c28d09e0ce68 100644
--- a/mm/hugetlb_cma.c
+++ b/mm/hugetlb_cma.c
@@ -211,7 +211,7 @@ void __init hugetlb_cma_reserve(int order)
 
 		size = round_up(size, PAGE_SIZE << order);
 
-		snprintf(name, sizeof(name), "hugetlb%d", nid);
+		STPRINTF(name, "hugetlb%d", nid);
 		/*
 		 * Note that 'order per bit' is based on smallest size that
 		 * may be returned to CMA allocator in the case of
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 8357e1a33699..62a9bcff236a 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -486,8 +486,7 @@ static void print_memory_metadata(const void *addr)
 		char buffer[4 + (BITS_PER_LONG / 8) * 2];
 		char metadata[META_BYTES_PER_ROW];
 
-		snprintf(buffer, sizeof(buffer),
-				(i == 0) ? ">%px: " : " %px: ", row);
+		STPRINTF(buffer, (i == 0) ? ">%px: " : " %px: ", row);
 
 		/*
 		 * We should not pass a shadow pointer to generic
diff --git a/mm/memblock.c b/mm/memblock.c
index 0e9ebb8aa7fe..20d3928a6b13 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -2021,7 +2021,7 @@ static void __init_memblock memblock_dump(struct memblock_type *type)
 		flags = rgn->flags;
 #ifdef CONFIG_NUMA
 		if (numa_valid_node(memblock_get_region_node(rgn)))
-			snprintf(nid_buf, sizeof(nid_buf), " on node %d",
+			STPRINTF(nid_buf, " on node %d",
 				 memblock_get_region_node(rgn));
 #endif
 		pr_info(" %s[%#x]\t[%pa-%pa], %pa bytes%s flags: %#x\n",
@@ -2379,7 +2379,7 @@ int reserve_mem_release_by_name(const char *name)
 
 	start = phys_to_virt(map->start);
 	end = start + map->size - 1;
-	snprintf(buf, sizeof(buf), "reserve_mem:%s", name);
+	STPRINTF(buf, "reserve_mem:%s", name);
 	free_reserved_area(start, end, 0, buf);
 	map->size = 0;
 
diff --git a/mm/percpu.c b/mm/percpu.c
index b35494c8ede2..8d5b5ac7dbef 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -3186,7 +3186,7 @@ int __init pcpu_page_first_chunk(size_t reserved_size, pcpu_fc_cpu_to_node_fn_t
 	int upa;
 	int nr_g0_units;
 
-	snprintf(psize_str, sizeof(psize_str), "%luK", PAGE_SIZE >> 10);
+	STPRINTF(psize_str, "%luK", PAGE_SIZE >> 10);
 
 	ai = pcpu_build_alloc_info(reserved_size, 0, PAGE_SIZE, NULL);
 	if (IS_ERR(ai))
diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
index 20eaee3e97f7..7194f2de8594 100644
--- a/mm/shrinker_debug.c
+++ b/mm/shrinker_debug.c
@@ -176,7 +176,7 @@ int shrinker_debugfs_add(struct shrinker *shrinker)
 		return id;
 	shrinker->debugfs_id = id;
 
-	snprintf(buf, sizeof(buf), "%s-%d", shrinker->name, id);
+	STPRINTF(buf, "%s-%d", shrinker->name, id);
 
 	/* create debugfs entry */
 	entry = debugfs_create_dir(buf, shrinker_debugfs_root);
diff --git a/mm/zswap.c b/mm/zswap.c
index 204fb59da33c..01c96cb5e84f 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -271,7 +271,7 @@ static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
 		return NULL;
 
 	/* unique name for each pool specifically required by zsmalloc */
-	snprintf(name, 38, "zswap%x", atomic_inc_return(&zswap_pools_count));
+	STPRINTF(name, "zswap%x", atomic_inc_return(&zswap_pools_count));
 	pool->zpool = zpool_create_pool(type, name, gfp);
 	if (!pool->zpool) {
 		pr_err("%s zpool not available\n", type);
-- 
2.50.0

[RFC v6 0/8] Add and use sprintf_{end,trunc,array}() instead of less ergonomic APIs

Posted by Alejandro Colomar 2 months, 4 weeks ago

Hi,

Changes in v6:

[As commented in private to Linus, I assume the NAK from Linus in v5
 applies to the macro that evaluates twice.  This is resolved in v6, so
 I send assuming no NAKs to the overall patch set.]

-  Don't try to have a single function.  Have sprintf_end() for chaining
   calls and sprintf_trunc() --which is the fmt version of strscpy()--
   for single calls.  Then sprintf_array() --which is the fmt version of
   the 2-argument strscpy()-- for single calls with an array as input.
-  Fix implementation of sprintf_array() to not evaluate twice.

These changes are essentially a roll-back to the general idea in v3,
except for the more explicit names.

Remaining questions:

-  There are only 3 remaining calls to snprintf(3) under mm/.  They are
   just fine for now, which is why I didn't replace them.  If anyone
   wants to replace them, to get rid of all snprintf(3), we could do that.
   I think for now we can leave them, to minimize the churn.

        $ grep -rnI snprintf mm/
        mm/hugetlb_cgroup.c:674:                snprintf(buf, size, "%luGB", hsize / SZ_1G);
        mm/hugetlb_cgroup.c:676:                snprintf(buf, size, "%luMB", hsize / SZ_1M);
        mm/hugetlb_cgroup.c:678:                snprintf(buf, size, "%luKB", hsize / SZ_1K);

   They could be replaced by sprintf_trunc().

-  There are only 2 remaining calls to the kernel's scnprintf().  I
   would really like to get rid of this function.  Also, I suspect those
   calls don't do what we want.  Please do have a look at them and
   confirm the appropriate behavior in the 2 cases: when the string is
   truncated, and when it is not copied at all.  That code is too scary
   for me to guess at.

        $ grep -rnI scnprintf mm/
        mm/kfence/report.c:75:          int len = scnprintf(buf, sizeof(buf), "%ps", (void *)stack_entries[skipnr]);
        mm/kfence/kfence_test.mod.c:22: { 0x96848186, "scnprintf" },
        mm/kmsan/report.c:42:           len = scnprintf(buf, sizeof(buf), "%ps",

   Apart from two calls, I see a string literal with that name.  Please
   let me know if I should do anything about it.  I don't know what that
   is.

-  I think we should remove one error handling check in
   "mm/page_owner.c" (marked with an XXX comment), but I'm not 100%
   sure.  Please confirm.

Other comments:

-  This is still not complying with coding style.  I'll keep it like that
   while questions remain open.
-  I've tested the tests under CONFIG_KFENCE_KUNIT_TEST=y, and this has
   no regressions at all.
-  With the current style of the sprintf_end() prototype, this triggers
   a diagnostic due to a GCC bug:
   <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108036>
   It would be interesting to ask GCC to fix that bug.  (Added relevant
   GCC maintainers and contributors to CC in this cover letter.)
-  The call sprintf_end(p, end, "") in lib/stackdepot.c, within
   stack_depot_sprint_end(), produces a warning for having an empty
   string.  This could be replaced by a strcpy_end(p, end, "") if/when
   we add that function.

For anyone new to the thread, sprintf_end() will be proposed for
standardization soon as seprintf():
<https://lore.kernel.org/linux-hardening/20250710024745.143955-1-alx@kernel.org/T/#u>


Have a lovely night!
Alex


Alejandro Colomar (8):
  vsprintf: Add [v]sprintf_trunc()
  vsprintf: Add [v]sprintf_end()
  sprintf: Add [v]sprintf_array()
  stacktrace, stackdepot: Add sprintf_end()-like variants of functions
  mm: Use sprintf_end() instead of less ergonomic APIs
  array_size.h: Add ENDOF()
  mm: Fix benign off-by-one bugs
  mm: Use [v]sprintf_array() to avoid specifying the array size

 include/linux/array_size.h |   6 +++
 include/linux/sprintf.h    |   8 +++
 include/linux/stackdepot.h |  13 +++++
 include/linux/stacktrace.h |   3 ++
 kernel/stacktrace.c        |  28 ++++++++++
 lib/stackdepot.c           |  13 +++++
 lib/vsprintf.c             | 107 +++++++++++++++++++++++++++++++++++++
 mm/backing-dev.c           |   2 +-
 mm/cma.c                   |   4 +-
 mm/cma_debug.c             |   2 +-
 mm/hugetlb.c               |   3 +-
 mm/hugetlb_cgroup.c        |   2 +-
 mm/hugetlb_cma.c           |   2 +-
 mm/kasan/report.c          |   3 +-
 mm/kfence/kfence_test.c    |  28 +++++-----
 mm/kmsan/kmsan_test.c      |   6 +--
 mm/memblock.c              |   4 +-
 mm/mempolicy.c             |  18 +++----
 mm/page_owner.c            |  32 +++++------
 mm/percpu.c                |   2 +-
 mm/shrinker_debug.c        |   2 +-
 mm/slub.c                  |   5 +-
 mm/zswap.c                 |   2 +-
 23 files changed, 237 insertions(+), 58 deletions(-)

Range-diff against v5:
-:  ------------ > 1:  dab6068bef5c vsprintf: Add [v]sprintf_trunc()
1:  2c4f793de0b8 ! 2:  c801c9a1a90d vsprintf: Add [v]sprintf_end()
    @@ Commit message
         Signed-off-by: Alejandro Colomar <alx@kernel.org>
     
      ## include/linux/sprintf.h ##
    -@@ include/linux/sprintf.h: __printf(3, 4) int snprintf(char *buf, size_t size, const char *fmt, ...);
    - __printf(3, 0) int vsnprintf(char *buf, size_t size, const char *fmt, va_list args);
    - __printf(3, 4) int scnprintf(char *buf, size_t size, const char *fmt, ...);
    +@@ include/linux/sprintf.h: __printf(3, 4) int scnprintf(char *buf, size_t size, const char *fmt, ...);
      __printf(3, 0) int vscnprintf(char *buf, size_t size, const char *fmt, va_list args);
    + __printf(3, 4) int sprintf_trunc(char *buf, size_t size, const char *fmt, ...);
    + __printf(3, 0) int vsprintf_trunc(char *buf, size_t size, const char *fmt, va_list args);
     +__printf(3, 4) char *sprintf_end(char *p, const char end[0], const char *fmt, ...);
     +__printf(3, 0) char *vsprintf_end(char *p, const char end[0], const char *fmt, va_list args);
      __printf(2, 3) __malloc char *kasprintf(gfp_t gfp, const char *fmt, ...);
    @@ include/linux/sprintf.h: __printf(3, 4) int snprintf(char *buf, size_t size, con
      __printf(2, 0) const char *kvasprintf_const(gfp_t gfp, const char *fmt, va_list args);
     
      ## lib/vsprintf.c ##
    -@@ lib/vsprintf.c: int vscnprintf(char *buf, size_t size, const char *fmt, va_list args)
    +@@ lib/vsprintf.c: int vsprintf_trunc(char *buf, size_t size, const char *fmt, va_list args)
      }
    - EXPORT_SYMBOL(vscnprintf);
    + EXPORT_SYMBOL(vsprintf_trunc);
      
     +/**
     + * vsprintf_end - va_list string end-delimited print formatted
    @@ lib/vsprintf.c: int vscnprintf(char *buf, size_t size, const char *fmt, va_list
     +char *vsprintf_end(char *p, const char end[0], const char *fmt, va_list args)
     +{
     +  int len;
    -+  size_t size;
     +
     +  if (unlikely(p == NULL))
     +          return NULL;
     +
    -+  size = end - p;
    -+  if (WARN_ON_ONCE(size == 0 || size > INT_MAX))
    -+          return NULL;
    -+
    -+  len = vsnprintf(p, size, fmt, args);
    -+  if (unlikely(len >= size))
    ++  len = vsprintf_trunc(p, end - p, fmt, args);
    ++  if (unlikely(len < 0))
     +          return NULL;
     +
     +  return p + len;
    @@ lib/vsprintf.c: int vscnprintf(char *buf, size_t size, const char *fmt, va_list
      /**
       * snprintf - Format a string and place it in a buffer
       * @buf: The buffer to place the result into
    -@@ lib/vsprintf.c: int scnprintf(char *buf, size_t size, const char *fmt, ...)
    +@@ lib/vsprintf.c: int sprintf_trunc(char *buf, size_t size, const char *fmt, ...)
      }
    - EXPORT_SYMBOL(scnprintf);
    + EXPORT_SYMBOL(sprintf_trunc);
      
     +/**
     + * sprintf_end - string end-delimited print formatted
6:  04c1e026a67f ! 3:  9348d5df2d9f sprintf: Add [v]sprintf_array()
    @@ Commit message
         array.
     
         These macros are essentially the same as the 2-argument version of
    -    strscpy(), but with a formatted string, and returning a pointer to the
    -    terminating '\0' (or NULL, on error).
    +    strscpy(), but with a formatted string.
     
         Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
         Cc: Marco Elver <elver@google.com>
    @@ include/linux/sprintf.h
      #include <linux/types.h>
     +#include <linux/array_size.h>
     +
    -+#define sprintf_array(a, fmt, ...)  sprintf_end(a, ENDOF(a), fmt, ##__VA_ARGS__)
    -+#define vsprintf_array(a, fmt, ap)  vsprintf_end(a, ENDOF(a), fmt, ap)
    ++#define sprintf_array(a, fmt, ...)  sprintf_trunc(a, ARRAY_SIZE(a), fmt, ##__VA_ARGS__)
    ++#define vsprintf_array(a, fmt, ap)  vsprintf_trunc(a, ARRAY_SIZE(a), fmt, ap)
      
      int num_to_str(char *buf, int size, unsigned long long num, unsigned int width);
      
2:  894d02b08056 = 4:  6c5d8e6012f0 stacktrace, stackdepot: Add sprintf_end()-like variants of functions
3:  690ed4d22f57 = 5:  8a0ffc1bf43d mm: Use sprintf_end() instead of less ergonomic APIs
4:  e05c5afabb3c = 6:  37b1088dbd01 array_size.h: Add ENDOF()
5:  515445ae064d = 7:  c88780354e13 mm: Fix benign off-by-one bugs
7:  e53d87e684ef = 8:  aa6323cbea64 mm: Use [v]sprintf_array() to avoid specifying the array size

base-commit: 0ff41df1cb268fc69e703a08a57ee14ae967d0ca
-- 
2.50.0

[RFC v6 1/8] vsprintf: Add [v]sprintf_trunc()

Posted by Alejandro Colomar 2 months, 4 weeks ago

sprintf_trunc() is a function similar to strscpy().  It truncates the
string, and returns an error code on truncation or error.  On success,
it returns the length of the new string.
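
The return-value contract can be sketched in userspace as follows.  This
assumes vsnprintf(3) in place of the kernel's vsnprintf() and drops
WARN_ON_ONCE(); the _sketch names are hypothetical.  Unlike the kernel
function, userspace vsnprintf(3) may return a negative value, which this
sketch folds into -E2BIG.

```c
#include <errno.h>
#include <limits.h>
#include <stdarg.h>
#include <stdio.h>

static int vsprintf_trunc_sketch(char *buf, size_t size, const char *fmt,
				 va_list args)
{
	int len;

	/* An empty or absurdly large buffer is a caller bug. */
	if (size == 0 || size > INT_MAX)
		return -EOVERFLOW;

	len = vsnprintf(buf, size, fmt, args);

	/* vsnprintf() reports the untruncated length; >= size means cut. */
	if (len < 0 || (size_t)len >= size)
		return -E2BIG;

	return len;
}

static int sprintf_trunc_sketch(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int len;

	va_start(args, fmt);
	len = vsprintf_trunc_sketch(buf, size, fmt, args);
	va_end(args);

	return len;
}
```

On truncation the buffer still holds a null-terminated prefix of the
output, but the caller sees a negative error code rather than a length
it might mistake for success.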

Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 include/linux/sprintf.h |  2 ++
 lib/vsprintf.c          | 53 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 55 insertions(+)

diff --git a/include/linux/sprintf.h b/include/linux/sprintf.h
index 51cab2def9ec..5ea6ec9c2e59 100644
--- a/include/linux/sprintf.h
+++ b/include/linux/sprintf.h
@@ -13,6 +13,8 @@ __printf(3, 4) int snprintf(char *buf, size_t size, const char *fmt, ...);
 __printf(3, 0) int vsnprintf(char *buf, size_t size, const char *fmt, va_list args);
 __printf(3, 4) int scnprintf(char *buf, size_t size, const char *fmt, ...);
 __printf(3, 0) int vscnprintf(char *buf, size_t size, const char *fmt, va_list args);
+__printf(3, 4) int sprintf_trunc(char *buf, size_t size, const char *fmt, ...);
+__printf(3, 0) int vsprintf_trunc(char *buf, size_t size, const char *fmt, va_list args);
 __printf(2, 3) __malloc char *kasprintf(gfp_t gfp, const char *fmt, ...);
 __printf(2, 0) __malloc char *kvasprintf(gfp_t gfp, const char *fmt, va_list args);
 __printf(2, 0) const char *kvasprintf_const(gfp_t gfp, const char *fmt, va_list args);
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 01699852f30c..15e780942c56 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -2923,6 +2923,34 @@ int vscnprintf(char *buf, size_t size, const char *fmt, va_list args)
 }
 EXPORT_SYMBOL(vscnprintf);
 
+/**
+ * vsprintf_trunc - va_list string truncate print formatted
+ * @buf: The buffer to place the result into
+ * @size: The size of the buffer, including the trailing null space
+ * @fmt: The format string to use
+ * @args: Arguments for the format string
+ *
+ * The return value is the length of the string.
+ * If the string is truncated, the function returns -E2BIG.
+ * If @size is invalid, the function returns -EOVERFLOW.
+ *
+ * See the vsnprintf() documentation for format string extensions over C99.
+ */
+int vsprintf_trunc(char *buf, size_t size, const char *fmt, va_list args)
+{
+	int len;
+
+	if (WARN_ON_ONCE(size == 0 || size > INT_MAX))
+		return -EOVERFLOW;
+
+	len = vsnprintf(buf, size, fmt, args);
+	if (unlikely(len >= size))
+		return -E2BIG;
+
+	return len;
+}
+EXPORT_SYMBOL(vsprintf_trunc);
+
 /**
  * snprintf - Format a string and place it in a buffer
  * @buf: The buffer to place the result into
@@ -2974,6 +3002,31 @@ int scnprintf(char *buf, size_t size, const char *fmt, ...)
 }
 EXPORT_SYMBOL(scnprintf);
 
+/**
+ * sprintf_trunc - string truncate print formatted
+ * @buf: The buffer to place the result into
+ * @size: The size of the buffer, including the trailing null space
+ * @fmt: The format string to use
+ * @...: Arguments for the format string
+ *
+ * The return value is the length of the string.
+ * If the string is truncated, the function returns -E2BIG.
+ * If @size is invalid, the function returns -EOVERFLOW.
+ */
+
+int sprintf_trunc(char *buf, size_t size, const char *fmt, ...)
+{
+	int len;
+	va_list args;
+
+	va_start(args, fmt);
+	len = vsprintf_trunc(buf, size, fmt, args);
+	va_end(args);
+
+	return len;
+}
+EXPORT_SYMBOL(sprintf_trunc);
+
 /**
  * vsprintf - Format a string and place it in a buffer
  * @buf: The buffer to place the result into
-- 
2.50.0

[RFC v6 2/8] vsprintf: Add [v]sprintf_end()

Posted by Alejandro Colomar 2 months, 4 weeks ago

sprintf_end() is a function similar to stpcpy(3) in the sense that it
returns a pointer that is suitable for chaining to other copy
operations.

It takes a pointer to the end of the buffer as a sentinel for when to
truncate, which unlike a size, doesn't need to be updated after every
call.  This makes it much more ergonomic, avoiding manually calculating
the size after each copy, which is error prone.

It also makes error handling much easier, by reporting truncation with
a null pointer, which is accepted and transparently passed down by
subsequent sprintf_end() calls.  This results in only needing to report
errors once after a chain of sprintf_end() calls, unlike snprintf(3),
which requires checking after every call.

	p = buf;
	e = buf + countof(buf);
	p = sprintf_end(p, e, foo);
	p = sprintf_end(p, e, bar);
	if (p == NULL)
		goto trunc;

vs

	len = 0;
	size = countof(buf);
	len += snprintf(buf + len, size - len, foo);
	if (len >= size)
		goto trunc;

	len += snprintf(buf + len, size - len, bar);
	if (len >= size)
		goto trunc;

And also better than scnprintf() calls:

	len = 0;
	size = countof(buf);
	len += scnprintf(buf + len, size - len, foo);
	len += scnprintf(buf + len, size - len, bar);
	// No ability to check.

It seems apparent that it's a more elegant approach to string catenation.
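
The chaining behavior described above can be tried outside the kernel.
What follows is only a user-space sketch, assuming the semantics stated
in this commit message; model_sprintf_end(), model_vsprintf_end() and
model_join() are hypothetical names, built on the C library vsnprintf(3)
rather than on the kernel implementation.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* User-space model of vsprintf_end(); not the kernel implementation. */
static char *model_vsprintf_end(char *p, const char *end, const char *fmt,
				va_list args)
{
	size_t size;
	int len;

	/* A NULL from an earlier call is passed through untouched. */
	if (p == NULL || end <= p)
		return NULL;

	size = (size_t)(end - p);
	len = vsnprintf(p, size, fmt, args);

	/* Truncation (or a formatting error) is reported as NULL. */
	if (len < 0 || (size_t)len >= size)
		return NULL;

	return p + len;		/* points at the trailing '\0' */
}

static char *model_sprintf_end(char *p, const char *end, const char *fmt, ...)
{
	va_list args;

	va_start(args, fmt);
	p = model_vsprintf_end(p, end, fmt, args);
	va_end(args);

	return p;
}

/* Build "<a>-<b>" in buf; return 0 on success, -1 on truncation. */
static int model_join(char *buf, size_t size, const char *a, const char *b)
{
	char *p = buf;
	const char *e = buf + size;

	p = model_sprintf_end(p, e, "%s", a);
	p = model_sprintf_end(p, e, "-");
	p = model_sprintf_end(p, e, "%s", b);

	/* One check covers the whole chain. */
	return p == NULL ? -1 : 0;
}
```

The model skips the kernel's WARN_ON_ONCE() and the INT_MAX check; it is
only meant to show that 'e' is computed once and that a single NULL test
after the last call is enough.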

These functions will soon be proposed for standardization as
[v]seprintf() into C2y, and they exist in Plan9 as seprint(2) --but the
Plan9 implementation has important bugs--.

Link: <https://www.alejandro-colomar.es/src/alx/alx/wg14/alx-0049.git/tree/alx-0049.txt>
Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 include/linux/sprintf.h |  2 ++
 lib/vsprintf.c          | 54 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 56 insertions(+)

diff --git a/include/linux/sprintf.h b/include/linux/sprintf.h
index 5ea6ec9c2e59..8dfc37713747 100644
--- a/include/linux/sprintf.h
+++ b/include/linux/sprintf.h
@@ -15,6 +15,8 @@ __printf(3, 4) int scnprintf(char *buf, size_t size, const char *fmt, ...);
 __printf(3, 0) int vscnprintf(char *buf, size_t size, const char *fmt, va_list args);
 __printf(3, 4) int sprintf_trunc(char *buf, size_t size, const char *fmt, ...);
 __printf(3, 0) int vsprintf_trunc(char *buf, size_t size, const char *fmt, va_list args);
+__printf(3, 4) char *sprintf_end(char *p, const char end[0], const char *fmt, ...);
+__printf(3, 0) char *vsprintf_end(char *p, const char end[0], const char *fmt, va_list args);
 __printf(2, 3) __malloc char *kasprintf(gfp_t gfp, const char *fmt, ...);
 __printf(2, 0) __malloc char *kvasprintf(gfp_t gfp, const char *fmt, va_list args);
 __printf(2, 0) const char *kvasprintf_const(gfp_t gfp, const char *fmt, va_list args);
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 15e780942c56..5d0c5a0d60fd 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -2951,6 +2951,35 @@ int vsprintf_trunc(char *buf, size_t size, const char *fmt, va_list args)
 }
 EXPORT_SYMBOL(vsprintf_trunc);
 
+/**
+ * vsprintf_end - va_list string end-delimited print formatted
+ * @p: The buffer to place the result into
+ * @end: A pointer to one past the last character in the buffer
+ * @fmt: The format string to use
+ * @args: Arguments for the format string
+ *
+ * The return value is a pointer to the trailing '\0'.
+ * If @p is NULL, the function returns NULL.
+ * If the string is truncated, the function returns NULL.
+ * If @end <= @p, the function returns NULL.
+ *
+ * See the vsnprintf() documentation for format string extensions over C99.
+ */
+char *vsprintf_end(char *p, const char end[0], const char *fmt, va_list args)
+{
+	int len;
+
+	if (unlikely(p == NULL))
+		return NULL;
+
+	len = vsprintf_trunc(p, end - p, fmt, args);
+	if (unlikely(len < 0))
+		return NULL;
+
+	return p + len;
+}
+EXPORT_SYMBOL(vsprintf_end);
+
 /**
  * snprintf - Format a string and place it in a buffer
  * @buf: The buffer to place the result into
@@ -3027,6 +3056,31 @@ int sprintf_trunc(char *buf, size_t size, const char *fmt, ...)
 }
 EXPORT_SYMBOL(sprintf_trunc);
 
+/**
+ * sprintf_end - string end-delimited print formatted
+ * @p: The buffer to place the result into
+ * @end: A pointer to one past the last character in the buffer
+ * @fmt: The format string to use
+ * @...: Arguments for the format string
+ *
+ * The return value is a pointer to the trailing '\0'.
+ * If @p is NULL, the function returns NULL.
+ * If the string is truncated, the function returns NULL.
+ * If @end <= @p, the function returns NULL.
+ */
+
+char *sprintf_end(char *p, const char end[0], const char *fmt, ...)
+{
+	va_list args;
+
+	va_start(args, fmt);
+	p = vsprintf_end(p, end, fmt, args);
+	va_end(args);
+
+	return p;
+}
+EXPORT_SYMBOL(sprintf_end);
+
 /**
  * vsprintf - Format a string and place it in a buffer
  * @buf: The buffer to place the result into
-- 
2.50.0

[RFC v6 3/8] sprintf: Add [v]sprintf_array()

Posted by Alejandro Colomar 2 months, 4 weeks ago

These macros compute the size of the array argument implicitly to avoid
programmer mistakes.  This guarantees that the input is an array, unlike

	snprintf(buf, sizeof(buf), ...);

which is dangerous if the programmer passes a pointer instead of an
array.

These macros are essentially the same as the 2-argument version of
strscpy(), but with a formatted string.
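
As a rough illustration, the macro can be modelled in user space like
this.  It is a sketch under assumptions: model_sprintf_trunc(),
model_sprintf_array() and MODEL_ARRAY_SIZE() are hypothetical names, and
the model lacks the __must_be_array() check that the kernel's
ARRAY_SIZE() provides, so unlike the real macro it would not reject a
pointer argument at build time.

```c
#include <errno.h>
#include <limits.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* No __must_be_array() here: this model does not reject pointers. */
#define MODEL_ARRAY_SIZE(a)  (sizeof(a) / sizeof((a)[0]))

/* User-space model of sprintf_trunc(); not the kernel implementation. */
static int model_sprintf_trunc(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int len;

	if (size == 0 || size > INT_MAX)
		return -EOVERFLOW;

	va_start(args, fmt);
	len = vsnprintf(buf, size, fmt, args);
	va_end(args);

	/* A negative vsnprintf() result is folded into -E2BIG for brevity. */
	if (len < 0 || (size_t)len >= size)
		return -E2BIG;

	return len;
}

/* The size is derived from the array, so the caller cannot get it wrong. */
#define model_sprintf_array(a, ...) \
	model_sprintf_trunc(a, MODEL_ARRAY_SIZE(a), __VA_ARGS__)
```

A caller writes model_sprintf_array(name, "cma%d", n) and never spells
out the size, which is the point of the macros in this patch.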

Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 include/linux/sprintf.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/linux/sprintf.h b/include/linux/sprintf.h
index 8dfc37713747..bd8174224a4a 100644
--- a/include/linux/sprintf.h
+++ b/include/linux/sprintf.h
@@ -4,6 +4,10 @@
 
 #include <linux/compiler_attributes.h>
 #include <linux/types.h>
+#include <linux/array_size.h>
+
+#define sprintf_array(a, fmt, ...)  sprintf_trunc(a, ARRAY_SIZE(a), fmt, ##__VA_ARGS__)
+#define vsprintf_array(a, fmt, ap)  vsprintf_trunc(a, ARRAY_SIZE(a), fmt, ap)
 
 int num_to_str(char *buf, int size, unsigned long long num, unsigned int width);
 
-- 
2.50.0

[RFC v6 4/8] stacktrace, stackdepot: Add sprintf_end()-like variants of functions

Posted by Alejandro Colomar 2 months, 4 weeks ago

Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 include/linux/stackdepot.h | 13 +++++++++++++
 include/linux/stacktrace.h |  3 +++
 kernel/stacktrace.c        | 28 ++++++++++++++++++++++++++++
 lib/stackdepot.c           | 13 +++++++++++++
 4 files changed, 57 insertions(+)

diff --git a/include/linux/stackdepot.h b/include/linux/stackdepot.h
index 2cc21ffcdaf9..76182e874f67 100644
--- a/include/linux/stackdepot.h
+++ b/include/linux/stackdepot.h
@@ -219,6 +219,19 @@ void stack_depot_print(depot_stack_handle_t stack);
 int stack_depot_snprint(depot_stack_handle_t handle, char *buf, size_t size,
 		       int spaces);
 
+/**
+ * stack_depot_sprint_end - Print a stack trace from stack depot into a buffer
+ *
+ * @handle:	Stack depot handle returned from stack_depot_save()
+ * @p:		Pointer to the print buffer
+ * @end:	Pointer to one past the last element in the buffer
+ * @spaces:	Number of leading spaces to print
+ *
+ * Return:	Pointer to trailing '\0'; or NULL on truncation
+ */
+char *stack_depot_sprint_end(depot_stack_handle_t handle, char *p,
+                             const char end[0], int spaces);
+
 /**
  * stack_depot_put - Drop a reference to a stack trace from stack depot
  *
diff --git a/include/linux/stacktrace.h b/include/linux/stacktrace.h
index 97455880ac41..79ada795d479 100644
--- a/include/linux/stacktrace.h
+++ b/include/linux/stacktrace.h
@@ -67,6 +67,9 @@ void stack_trace_print(const unsigned long *trace, unsigned int nr_entries,
 		       int spaces);
 int stack_trace_snprint(char *buf, size_t size, const unsigned long *entries,
 			unsigned int nr_entries, int spaces);
+char *stack_trace_sprint_end(char *p, const char end[0],
+			     const unsigned long *entries,
+			     unsigned int nr_entries, int spaces);
 unsigned int stack_trace_save(unsigned long *store, unsigned int size,
 			      unsigned int skipnr);
 unsigned int stack_trace_save_tsk(struct task_struct *task,
diff --git a/kernel/stacktrace.c b/kernel/stacktrace.c
index afb3c116da91..f389647d8e44 100644
--- a/kernel/stacktrace.c
+++ b/kernel/stacktrace.c
@@ -70,6 +70,34 @@ int stack_trace_snprint(char *buf, size_t size, const unsigned long *entries,
 }
 EXPORT_SYMBOL_GPL(stack_trace_snprint);
 
+/**
+ * stack_trace_sprint_end - Print the entries in the stack trace into a buffer
+ * @p:		Pointer to the print buffer
+ * @end:	Pointer to one past the last element in the buffer
+ * @entries:	Pointer to storage array
+ * @nr_entries:	Number of entries in the storage array
+ * @spaces:	Number of leading spaces to print
+ *
+ * Return: Pointer to the trailing '\0'; or NULL on truncation.
+ */
+char *stack_trace_sprint_end(char *p, const char end[0],
+			  const unsigned long *entries, unsigned int nr_entries,
+			  int spaces)
+{
+	unsigned int i;
+
+	if (WARN_ON(!entries))
+		return NULL;
+
+	for (i = 0; i < nr_entries; i++) {
+		p = sprintf_end(p, end, "%*c%pS\n", 1 + spaces, ' ',
+			     (void *)entries[i]);
+	}
+
+	return p;
+}
+EXPORT_SYMBOL_GPL(stack_trace_sprint_end);
+
 #ifdef CONFIG_ARCH_STACKWALK
 
 struct stacktrace_cookie {
diff --git a/lib/stackdepot.c b/lib/stackdepot.c
index 73d7b50924ef..48e5c0ff37e8 100644
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -771,6 +771,19 @@ int stack_depot_snprint(depot_stack_handle_t handle, char *buf, size_t size,
 }
 EXPORT_SYMBOL_GPL(stack_depot_snprint);
 
+char *stack_depot_sprint_end(depot_stack_handle_t handle, char *p,
+			     const char end[0], int spaces)
+{
+	unsigned long *entries;
+	unsigned int nr_entries;
+
+	nr_entries = stack_depot_fetch(handle, &entries);
+	return nr_entries ?
+		stack_trace_sprint_end(p, end, entries, nr_entries, spaces)
+		: sprintf_end(p, end, "");
+}
+EXPORT_SYMBOL_GPL(stack_depot_sprint_end);
+
 depot_stack_handle_t __must_check stack_depot_set_extra_bits(
 			depot_stack_handle_t handle, unsigned int extra_bits)
 {
-- 
2.50.0

[RFC v6 5/8] mm: Use sprintf_end() instead of less ergonomic APIs

Posted by Alejandro Colomar 2 months, 4 weeks ago

While doing this, I detected some anomalies in the existing code:

mm/kfence/kfence_test.c:

	-  The last call to scnprintf() did increment 'cur', but it's
	   unused after that, so it was dead code.  I've removed the dead
	   code in this patch.

	-  'end' is calculated as

		end = &expect[0][sizeof(expect[0]) - 1];

	   However, the '-1' doesn't seem to be necessary.  When passing
	   $2 to scnprintf(), the size was specified as 'end - cur'.
	   And scnprintf() --just like snprintf(3)--, won't write more
	   than $2 bytes (including the null byte).  That means that
	   scnprintf() wouldn't write more than

		&expect[0][sizeof(expect[0]) - 1] - expect[0]

	   which simplifies to

		sizeof(expect[0]) - 1

	   bytes.  But we have sizeof(expect[0]) bytes available, so
	   we're wasting one byte entirely.  This is a benign off-by-one
	   bug.  The two occurrences of this bug will be fixed in a
	   following patch in this series.

mm/kmsan/kmsan_test.c:

	The same benign off-by-one bug calculating the remaining size.

mm/mempolicy.c:

	This file uses the 'p += snprintf()' anti-pattern.  That will
	overflow the pointer on truncation, which has undefined
	behavior.  Using sprintf_end(), this bug is fixed.

	As in the previous file, here there was also dead code in the
	last scnprintf() call, by incrementing a pointer that is not
	used after the call.  I've removed the dead code.

mm/page_owner.c:

	Within print_page_owner(), there are some calls to scnprintf(),
	which do not report truncation.  And then there are other calls to
	snprintf(), where we handle errors (there are two 'goto err').

	I've kept the existing error handling, as I trust it's there for
	a good reason (i.e., we may want to avoid calling
	print_page_owner_memcg() if we truncated before).  Please review
	if this amount of error handling is the right one, or if we want
	to add or remove some.  For sprintf_end(), a single test for
	null after the last call is enough to detect truncation.

mm/slub.c:

	Again, the 'p += snprintf()' anti-pattern.  This is UB, and by
	using sprintf_end() we've fixed the bug.
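
To make the 'p += snprintf()' problem concrete, here is a small
user-space sketch.  It assumes only the standard snprintf(3) return
value (the length the output would have had without truncation);
model_remaining_after_snprintf() is a hypothetical helper that does the
arithmetic with integers, so the demonstration itself never forms an
out-of-bounds pointer.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Return how many bytes would remain in a buffer of 'size' bytes after
 * 'p += snprintf(p, size, "%s", s)'.  A negative result means that 'p'
 * would have been advanced past the end of the buffer.
 */
static long model_remaining_after_snprintf(size_t size, const char *s)
{
	char buf[32];
	int len;

	if (size > sizeof(buf))
		size = sizeof(buf);

	/* snprintf() returns the untruncated length, not the bytes written. */
	len = snprintf(buf, size, "%s", s);
	if (len < 0)
		return 0;

	return (long)size - len;
}
```

With a 4-byte buffer and the string "relative", the result is -4: the
pointer would sit 4 bytes past the end, and a later 'end - p' passed as
a size would be negative before conversion to size_t.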

Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Marco Elver <elver@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Chao Yu <chao.yu@oppo.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 mm/kfence/kfence_test.c | 24 ++++++++++++------------
 mm/kmsan/kmsan_test.c   |  4 ++--
 mm/mempolicy.c          | 18 +++++++++---------
 mm/page_owner.c         | 32 +++++++++++++++++---------------
 mm/slub.c               |  5 +++--
 5 files changed, 43 insertions(+), 40 deletions(-)

diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
index 00034e37bc9f..bae382eca4ab 100644
--- a/mm/kfence/kfence_test.c
+++ b/mm/kfence/kfence_test.c
@@ -113,26 +113,26 @@ static bool report_matches(const struct expect_report *r)
 	end = &expect[0][sizeof(expect[0]) - 1];
 	switch (r->type) {
 	case KFENCE_ERROR_OOB:
-		cur += scnprintf(cur, end - cur, "BUG: KFENCE: out-of-bounds %s",
+		cur = sprintf_end(cur, end, "BUG: KFENCE: out-of-bounds %s",
 				 get_access_type(r));
 		break;
 	case KFENCE_ERROR_UAF:
-		cur += scnprintf(cur, end - cur, "BUG: KFENCE: use-after-free %s",
+		cur = sprintf_end(cur, end, "BUG: KFENCE: use-after-free %s",
 				 get_access_type(r));
 		break;
 	case KFENCE_ERROR_CORRUPTION:
-		cur += scnprintf(cur, end - cur, "BUG: KFENCE: memory corruption");
+		cur = sprintf_end(cur, end, "BUG: KFENCE: memory corruption");
 		break;
 	case KFENCE_ERROR_INVALID:
-		cur += scnprintf(cur, end - cur, "BUG: KFENCE: invalid %s",
+		cur = sprintf_end(cur, end, "BUG: KFENCE: invalid %s",
 				 get_access_type(r));
 		break;
 	case KFENCE_ERROR_INVALID_FREE:
-		cur += scnprintf(cur, end - cur, "BUG: KFENCE: invalid free");
+		cur = sprintf_end(cur, end, "BUG: KFENCE: invalid free");
 		break;
 	}
 
-	scnprintf(cur, end - cur, " in %pS", r->fn);
+	sprintf_end(cur, end, " in %pS", r->fn);
 	/* The exact offset won't match, remove it; also strip module name. */
 	cur = strchr(expect[0], '+');
 	if (cur)
@@ -144,26 +144,26 @@ static bool report_matches(const struct expect_report *r)
 
 	switch (r->type) {
 	case KFENCE_ERROR_OOB:
-		cur += scnprintf(cur, end - cur, "Out-of-bounds %s at", get_access_type(r));
+		cur = sprintf_end(cur, end, "Out-of-bounds %s at", get_access_type(r));
 		addr = arch_kfence_test_address(addr);
 		break;
 	case KFENCE_ERROR_UAF:
-		cur += scnprintf(cur, end - cur, "Use-after-free %s at", get_access_type(r));
+		cur = sprintf_end(cur, end, "Use-after-free %s at", get_access_type(r));
 		addr = arch_kfence_test_address(addr);
 		break;
 	case KFENCE_ERROR_CORRUPTION:
-		cur += scnprintf(cur, end - cur, "Corrupted memory at");
+		cur = sprintf_end(cur, end, "Corrupted memory at");
 		break;
 	case KFENCE_ERROR_INVALID:
-		cur += scnprintf(cur, end - cur, "Invalid %s at", get_access_type(r));
+		cur = sprintf_end(cur, end, "Invalid %s at", get_access_type(r));
 		addr = arch_kfence_test_address(addr);
 		break;
 	case KFENCE_ERROR_INVALID_FREE:
-		cur += scnprintf(cur, end - cur, "Invalid free of");
+		cur = sprintf_end(cur, end, "Invalid free of");
 		break;
 	}
 
-	cur += scnprintf(cur, end - cur, " 0x%p", (void *)addr);
+	sprintf_end(cur, end, " 0x%p", (void *)addr);
 
 	spin_lock_irqsave(&observed.lock, flags);
 	if (!report_available())
diff --git a/mm/kmsan/kmsan_test.c b/mm/kmsan/kmsan_test.c
index 9733a22c46c1..e48ca1972ff3 100644
--- a/mm/kmsan/kmsan_test.c
+++ b/mm/kmsan/kmsan_test.c
@@ -107,9 +107,9 @@ static bool report_matches(const struct expect_report *r)
 	cur = expected_header;
 	end = &expected_header[sizeof(expected_header) - 1];
 
-	cur += scnprintf(cur, end - cur, "BUG: KMSAN: %s", r->error_type);
+	cur = sprintf_end(cur, end, "BUG: KMSAN: %s", r->error_type);
 
-	scnprintf(cur, end - cur, " in %s", r->symbol);
+	sprintf_end(cur, end, " in %s", r->symbol);
 	/* The exact offset won't match, remove it; also strip module name. */
 	cur = strchr(expected_header, '+');
 	if (cur)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index b28a1e6ae096..6beb2710f97c 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -3359,6 +3359,7 @@ int mpol_parse_str(char *str, struct mempolicy **mpol)
 void mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol)
 {
 	char *p = buffer;
+	char *e = buffer + maxlen;
 	nodemask_t nodes = NODE_MASK_NONE;
 	unsigned short mode = MPOL_DEFAULT;
 	unsigned short flags = 0;
@@ -3384,33 +3385,32 @@ void mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol)
 		break;
 	default:
 		WARN_ON_ONCE(1);
-		snprintf(p, maxlen, "unknown");
+		sprintf_end(p, e, "unknown");
 		return;
 	}
 
-	p += snprintf(p, maxlen, "%s", policy_modes[mode]);
+	p = sprintf_end(p, e, "%s", policy_modes[mode]);
 
 	if (flags & MPOL_MODE_FLAGS) {
-		p += snprintf(p, buffer + maxlen - p, "=");
+		p = sprintf_end(p, e, "=");
 
 		/*
 		 * Static and relative are mutually exclusive.
 		 */
 		if (flags & MPOL_F_STATIC_NODES)
-			p += snprintf(p, buffer + maxlen - p, "static");
+			p = sprintf_end(p, e, "static");
 		else if (flags & MPOL_F_RELATIVE_NODES)
-			p += snprintf(p, buffer + maxlen - p, "relative");
+			p = sprintf_end(p, e, "relative");
 
 		if (flags & MPOL_F_NUMA_BALANCING) {
 			if (!is_power_of_2(flags & MPOL_MODE_FLAGS))
-				p += snprintf(p, buffer + maxlen - p, "|");
-			p += snprintf(p, buffer + maxlen - p, "balancing");
+				p = sprintf_end(p, e, "|");
+			p = sprintf_end(p, e, "balancing");
 		}
 	}
 
 	if (!nodes_empty(nodes))
-		p += scnprintf(p, buffer + maxlen - p, ":%*pbl",
-			       nodemask_pr_args(&nodes));
+		sprintf_end(p, e, ":%*pbl", nodemask_pr_args(&nodes));
 }
 
 #ifdef CONFIG_SYSFS
diff --git a/mm/page_owner.c b/mm/page_owner.c
index cc4a6916eec6..c00b3be01540 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -496,7 +496,7 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m,
 /*
  * Looking for memcg information and print it out
  */
-static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
+static inline char *print_page_owner_memcg(char *p, const char end[0],
 					 struct page *page)
 {
 #ifdef CONFIG_MEMCG
@@ -511,8 +511,7 @@ static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
 		goto out_unlock;
 
 	if (memcg_data & MEMCG_DATA_OBJEXTS)
-		ret += scnprintf(kbuf + ret, count - ret,
-				"Slab cache page\n");
+		p = sprintf_end(p, end, "Slab cache page\n");
 
 	memcg = page_memcg_check(page);
 	if (!memcg)
@@ -520,7 +519,7 @@ static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
 
 	online = (memcg->css.flags & CSS_ONLINE);
 	cgroup_name(memcg->css.cgroup, name, sizeof(name));
-	ret += scnprintf(kbuf + ret, count - ret,
+	p = sprintf_end(p, end,
 			"Charged %sto %smemcg %s\n",
 			PageMemcgKmem(page) ? "(via objcg) " : "",
 			online ? "" : "offline ",
@@ -529,7 +528,7 @@ static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
 	rcu_read_unlock();
 #endif /* CONFIG_MEMCG */
 
-	return ret;
+	return p;
 }
 
 static ssize_t
@@ -538,14 +537,16 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
 		depot_stack_handle_t handle)
 {
 	int ret, pageblock_mt, page_mt;
-	char *kbuf;
+	char *kbuf, *p, *e;
 
 	count = min_t(size_t, count, PAGE_SIZE);
 	kbuf = kmalloc(count, GFP_KERNEL);
 	if (!kbuf)
 		return -ENOMEM;
 
-	ret = scnprintf(kbuf, count,
+	p = kbuf;
+	e = kbuf + count;
+	p = sprintf_end(p, e,
 			"Page allocated via order %u, mask %#x(%pGg), pid %d, tgid %d (%s), ts %llu ns\n",
 			page_owner->order, page_owner->gfp_mask,
 			&page_owner->gfp_mask, page_owner->pid,
@@ -555,7 +556,7 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
 	/* Print information relevant to grouping pages by mobility */
 	pageblock_mt = get_pageblock_migratetype(page);
 	page_mt  = gfp_migratetype(page_owner->gfp_mask);
-	ret += scnprintf(kbuf + ret, count - ret,
+	p = sprintf_end(p, e,
 			"PFN 0x%lx type %s Block %lu type %s Flags %pGp\n",
 			pfn,
 			migratetype_names[page_mt],
@@ -563,22 +564,23 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
 			migratetype_names[pageblock_mt],
 			&page->flags);
 
-	ret += stack_depot_snprint(handle, kbuf + ret, count - ret, 0);
-	if (ret >= count)
-		goto err;
+	p = stack_depot_sprint_end(handle, p, e, 0);
+	if (p == NULL)
+		goto err;  // XXX: Should we remove this error handling?
 
 	if (page_owner->last_migrate_reason != -1) {
-		ret += scnprintf(kbuf + ret, count - ret,
+		p = sprintf_end(p, e,
 			"Page has been migrated, last migrate reason: %s\n",
 			migrate_reason_names[page_owner->last_migrate_reason]);
 	}
 
-	ret = print_page_owner_memcg(kbuf, count, ret, page);
+	p = print_page_owner_memcg(p, e, page);
 
-	ret += snprintf(kbuf + ret, count - ret, "\n");
-	if (ret >= count)
+	p = sprintf_end(p, e, "\n");
+	if (p == NULL)
 		goto err;
 
+	ret = p - kbuf;
 	if (copy_to_user(buf, kbuf, ret))
 		ret = -EFAULT;
 
diff --git a/mm/slub.c b/mm/slub.c
index be8b09e09d30..dcc857676857 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -7451,6 +7451,7 @@ static char *create_unique_id(struct kmem_cache *s)
 {
 	char *name = kmalloc(ID_STR_LENGTH, GFP_KERNEL);
 	char *p = name;
+	char *e = name + ID_STR_LENGTH;
 
 	if (!name)
 		return ERR_PTR(-ENOMEM);
@@ -7475,9 +7476,9 @@ static char *create_unique_id(struct kmem_cache *s)
 		*p++ = 'A';
 	if (p != name + 1)
 		*p++ = '-';
-	p += snprintf(p, ID_STR_LENGTH - (p - name), "%07u", s->size);
+	p = sprintf_end(p, e, "%07u", s->size);
 
-	if (WARN_ON(p > name + ID_STR_LENGTH - 1)) {
+	if (WARN_ON(p == NULL)) {
 		kfree(name);
 		return ERR_PTR(-EINVAL);
 	}
-- 
2.50.0

[RFC v6 6/8] array_size.h: Add ENDOF()

Posted by Alejandro Colomar 2 months, 4 weeks ago

This macro is useful to calculate the second argument to sprintf_end(),
avoiding off-by-one bugs.
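
A user-space sketch of the difference, assuming a 2x512 array purely for
illustration (the real arrays in the tests may have other dimensions).
MODEL_ARRAY_SIZE(), MODEL_ENDOF() and model_expect are hypothetical
names, and the model omits the kernel's __must_be_array() check.

```c
#include <stddef.h>

#define MODEL_ARRAY_SIZE(a)  (sizeof(a) / sizeof((a)[0]))
#define MODEL_ENDOF(a)       ((a) + MODEL_ARRAY_SIZE(a))

/* Hypothetical stand-in for a two-line expectation buffer. */
static char model_expect[2][512];

/* Capacity seen by a sprintf_end()-style call when 'end' is ENDOF(). */
static ptrdiff_t model_capacity_endof(void)
{
	return MODEL_ENDOF(model_expect[0]) - model_expect[0];
}

/* Capacity when 'end' is computed by hand with a stray '- 1'. */
static ptrdiff_t model_capacity_minus_one(void)
{
	return &model_expect[0][sizeof(model_expect[0]) - 1] - model_expect[0];
}
```

The first helper yields the full row size, the second one byte less,
which is the wasted byte that the hand-written expression leaves behind.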

Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 include/linux/array_size.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/array_size.h b/include/linux/array_size.h
index 06d7d83196ca..781bdb70d939 100644
--- a/include/linux/array_size.h
+++ b/include/linux/array_size.h
@@ -10,4 +10,10 @@
  */
 #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
 
+/**
+ * ENDOF - get a pointer to one past the last element in array @a
+ * @a: array
+ */
+#define ENDOF(a)  ((a) + ARRAY_SIZE(a))
+
 #endif  /* _LINUX_ARRAY_SIZE_H */
-- 
2.50.0

[RFC v6 7/8] mm: Fix benign off-by-one bugs

Posted by Alejandro Colomar 2 months, 4 weeks ago

We were wasting a byte due to an off-by-one bug.  s[c]nprintf()
doesn't write more than $2 bytes including the null byte, so trying to
pass 'size-1' there is wasting one byte.  Now that we use sprintf_end(),
the situation isn't different: sprintf_end() will stop writing *before*
'end' --that is, at most the terminating null byte will be written at
'end-1'--.
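
The size semantics can be checked in user space with plain snprintf(3).
This is a sketch, not kernel code; model_fits() is a hypothetical helper
that reports whether a string fits untruncated when the given size is
passed as the second argument.

```c
#include <stddef.h>
#include <stdio.h>

/* Return 1 if 's' fits in 'size' bytes including the null byte. */
static int model_fits(size_t size, const char *s)
{
	char buf[8];
	int len;

	if (size == 0 || size > sizeof(buf))
		return 0;

	/* At most 'size' bytes are written, the last one being '\0'. */
	len = snprintf(buf, size, "%s", s);

	return len >= 0 && (size_t)len < size;
}
```

A 7-character string fits when the full size of 8 is passed, and is
truncated when 'size - 1' is passed instead, even though the buffer has
room for it.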

Acked-by: Marco Elver <elver@google.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 mm/kfence/kfence_test.c | 4 ++--
 mm/kmsan/kmsan_test.c   | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
index bae382eca4ab..c635aa9d478b 100644
--- a/mm/kfence/kfence_test.c
+++ b/mm/kfence/kfence_test.c
@@ -110,7 +110,7 @@ static bool report_matches(const struct expect_report *r)
 
 	/* Title */
 	cur = expect[0];
-	end = &expect[0][sizeof(expect[0]) - 1];
+	end = ENDOF(expect[0]);
 	switch (r->type) {
 	case KFENCE_ERROR_OOB:
 		cur = sprintf_end(cur, end, "BUG: KFENCE: out-of-bounds %s",
@@ -140,7 +140,7 @@ static bool report_matches(const struct expect_report *r)
 
 	/* Access information */
 	cur = expect[1];
-	end = &expect[1][sizeof(expect[1]) - 1];
+	end = ENDOF(expect[1]);
 
 	switch (r->type) {
 	case KFENCE_ERROR_OOB:
diff --git a/mm/kmsan/kmsan_test.c b/mm/kmsan/kmsan_test.c
index e48ca1972ff3..9bda55992e3d 100644
--- a/mm/kmsan/kmsan_test.c
+++ b/mm/kmsan/kmsan_test.c
@@ -105,7 +105,7 @@ static bool report_matches(const struct expect_report *r)
 
 	/* Title */
 	cur = expected_header;
-	end = &expected_header[sizeof(expected_header) - 1];
+	end = ENDOF(expected_header);
 
 	cur = sprintf_end(cur, end, "BUG: KMSAN: %s", r->error_type);
 
-- 
2.50.0

[RFC v6 8/8] mm: Use [v]sprintf_array() to avoid specifying the array size

Posted by Alejandro Colomar 2 months, 4 weeks ago

Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 mm/backing-dev.c    | 2 +-
 mm/cma.c            | 4 ++--
 mm/cma_debug.c      | 2 +-
 mm/hugetlb.c        | 3 +--
 mm/hugetlb_cgroup.c | 2 +-
 mm/hugetlb_cma.c    | 2 +-
 mm/kasan/report.c   | 3 +--
 mm/memblock.c       | 4 ++--
 mm/percpu.c         | 2 +-
 mm/shrinker_debug.c | 2 +-
 mm/zswap.c          | 2 +-
 11 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 783904d8c5ef..c4e588135aea 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -1090,7 +1090,7 @@ int bdi_register_va(struct backing_dev_info *bdi, const char *fmt, va_list args)
 	if (bdi->dev)	/* The driver needs to use separate queues per device */
 		return 0;
 
-	vsnprintf(bdi->dev_name, sizeof(bdi->dev_name), fmt, args);
+	vsprintf_array(bdi->dev_name, fmt, args);
 	dev = device_create(&bdi_class, NULL, MKDEV(0, 0), bdi, bdi->dev_name);
 	if (IS_ERR(dev))
 		return PTR_ERR(dev);
diff --git a/mm/cma.c b/mm/cma.c
index c04be488b099..61d97a387670 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -237,9 +237,9 @@ static int __init cma_new_area(const char *name, phys_addr_t size,
 	cma_area_count++;
 
 	if (name)
-		snprintf(cma->name, CMA_MAX_NAME, "%s", name);
+		sprintf_array(cma->name, "%s", name);
 	else
-		snprintf(cma->name, CMA_MAX_NAME,  "cma%d\n", cma_area_count);
+		sprintf_array(cma->name, "cma%d\n", cma_area_count);
 
 	cma->available_count = cma->count = size >> PAGE_SHIFT;
 	cma->order_per_bit = order_per_bit;
diff --git a/mm/cma_debug.c b/mm/cma_debug.c
index fdf899532ca0..751eae9f6364 100644
--- a/mm/cma_debug.c
+++ b/mm/cma_debug.c
@@ -186,7 +186,7 @@ static void cma_debugfs_add_one(struct cma *cma, struct dentry *root_dentry)
 	rangedir = debugfs_create_dir("ranges", tmp);
 	for (r = 0; r < cma->nranges; r++) {
 		cmr = &cma->ranges[r];
-		snprintf(rdirname, sizeof(rdirname), "%d", r);
+		sprintf_array(rdirname, "%d", r);
 		dir = debugfs_create_dir(rdirname, rangedir);
 		debugfs_create_file("base_pfn", 0444, dir,
 			    &cmr->base_pfn, &cma_debugfs_fops);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6a3cf7935c14..70acc8b3cbb8 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4780,8 +4780,7 @@ void __init hugetlb_add_hstate(unsigned int order)
 	for (i = 0; i < MAX_NUMNODES; ++i)
 		INIT_LIST_HEAD(&h->hugepage_freelists[i]);
 	INIT_LIST_HEAD(&h->hugepage_activelist);
-	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
-					huge_page_size(h)/SZ_1K);
+	sprintf_array(h->name, "hugepages-%lukB", huge_page_size(h)/SZ_1K);
 
 	parsed_hstate = h;
 }
diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
index 58e895f3899a..0953cea93759 100644
--- a/mm/hugetlb_cgroup.c
+++ b/mm/hugetlb_cgroup.c
@@ -822,7 +822,7 @@ hugetlb_cgroup_cfttypes_init(struct hstate *h, struct cftype *cft,
 	for (i = 0; i < tmpl_size; cft++, tmpl++, i++) {
 		*cft = *tmpl;
 		/* rebuild the name */
-		snprintf(cft->name, MAX_CFTYPE_NAME, "%s.%s", buf, tmpl->name);
+		sprintf_array(cft->name, "%s.%s", buf, tmpl->name);
 		/* rebuild the private */
 		cft->private = MEMFILE_PRIVATE(idx, tmpl->private);
 		/* rebuild the file_offset */
diff --git a/mm/hugetlb_cma.c b/mm/hugetlb_cma.c
index e0f2d5c3a84c..bae82a97a43c 100644
--- a/mm/hugetlb_cma.c
+++ b/mm/hugetlb_cma.c
@@ -211,7 +211,7 @@ void __init hugetlb_cma_reserve(int order)
 
 		size = round_up(size, PAGE_SIZE << order);
 
-		snprintf(name, sizeof(name), "hugetlb%d", nid);
+		sprintf_array(name, "hugetlb%d", nid);
 		/*
 		 * Note that 'order per bit' is based on smallest size that
 		 * may be returned to CMA allocator in the case of
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 8357e1a33699..3b40225e7873 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -486,8 +486,7 @@ static void print_memory_metadata(const void *addr)
 		char buffer[4 + (BITS_PER_LONG / 8) * 2];
 		char metadata[META_BYTES_PER_ROW];
 
-		snprintf(buffer, sizeof(buffer),
-				(i == 0) ? ">%px: " : " %px: ", row);
+		sprintf_array(buffer, (i == 0) ? ">%px: " : " %px: ", row);
 
 		/*
 		 * We should not pass a shadow pointer to generic
diff --git a/mm/memblock.c b/mm/memblock.c
index 0e9ebb8aa7fe..3eea7a177330 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -2021,7 +2021,7 @@ static void __init_memblock memblock_dump(struct memblock_type *type)
 		flags = rgn->flags;
 #ifdef CONFIG_NUMA
 		if (numa_valid_node(memblock_get_region_node(rgn)))
-			snprintf(nid_buf, sizeof(nid_buf), " on node %d",
+			sprintf_array(nid_buf, " on node %d",
 				 memblock_get_region_node(rgn));
 #endif
 		pr_info(" %s[%#x]\t[%pa-%pa], %pa bytes%s flags: %#x\n",
@@ -2379,7 +2379,7 @@ int reserve_mem_release_by_name(const char *name)
 
 	start = phys_to_virt(map->start);
 	end = start + map->size - 1;
-	snprintf(buf, sizeof(buf), "reserve_mem:%s", name);
+	sprintf_array(buf, "reserve_mem:%s", name);
 	free_reserved_area(start, end, 0, buf);
 	map->size = 0;
 
diff --git a/mm/percpu.c b/mm/percpu.c
index b35494c8ede2..a467102c2405 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -3186,7 +3186,7 @@ int __init pcpu_page_first_chunk(size_t reserved_size, pcpu_fc_cpu_to_node_fn_t
 	int upa;
 	int nr_g0_units;
 
-	snprintf(psize_str, sizeof(psize_str), "%luK", PAGE_SIZE >> 10);
+	sprintf_array(psize_str, "%luK", PAGE_SIZE >> 10);
 
 	ai = pcpu_build_alloc_info(reserved_size, 0, PAGE_SIZE, NULL);
 	if (IS_ERR(ai))
diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
index 20eaee3e97f7..f529ac29557c 100644
--- a/mm/shrinker_debug.c
+++ b/mm/shrinker_debug.c
@@ -176,7 +176,7 @@ int shrinker_debugfs_add(struct shrinker *shrinker)
 		return id;
 	shrinker->debugfs_id = id;
 
-	snprintf(buf, sizeof(buf), "%s-%d", shrinker->name, id);
+	sprintf_array(buf, "%s-%d", shrinker->name, id);
 
 	/* create debugfs entry */
 	entry = debugfs_create_dir(buf, shrinker_debugfs_root);
diff --git a/mm/zswap.c b/mm/zswap.c
index 204fb59da33c..e66b5c5b1ecf 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -271,7 +271,7 @@ static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
 		return NULL;
 
 	/* unique name for each pool specifically required by zsmalloc */
-	snprintf(name, 38, "zswap%x", atomic_inc_return(&zswap_pools_count));
+	sprintf_array(name, "zswap%x", atomic_inc_return(&zswap_pools_count));
 	pool->zpool = zpool_create_pool(type, name, gfp);
 	if (!pool->zpool) {
 		pr_err("%s zpool not available\n", type);
-- 
2.50.0

[RFC v4 0/7] Add and use sprintf_end() instead of less ergonomic APIs

Posted by Alejandro Colomar 2 months, 4 weeks ago

Hi,

Changes in v4:

-  Added to global CC everyone who participated in the discussion so
   far.
-  Rename seprintf() => sprintf_end().
-  Implement SPRINTF_END().
-  Drop stprintf().  We don't need it as an intermediate helper.
-  Link to the draft of a standards proposal (which I'll paste as a
   reply to this mail again).
-  Minor fixes or updates to commit messages.
-  Added Marco Elver's Acked-by: tag in commit 5/7.
-  In stack_depot_sprint_end(), do sprintf_end(p, end, "") when
   nr_entries is 0, to guarantee that the string is valid if this is the
   first s*printf() call in a row.
-  Document sprintf_end() as 'string end-delimited print formatted'.
   This spells the letters in seprintf() for their meaning, in case
   anyone thinks the letters are randomly chosen.  :)
-  Remove comment about vsnprintf(3) not failing in the kernel, after
   Rasmus commented this is QoI guaranteed by the kernel.

Remaining questions:

-  There are only 3 remaining calls to snprintf(3) under mm/.  They are
   just fine for now, which is why I didn't replace them.  If anyone
   wants to replace them, to get rid of all snprintf(3), we could do that.
   I think for now we can leave them, to minimize the churn.

	$ grep -rnI snprintf mm/
	mm/hugetlb_cgroup.c:674:		snprintf(buf, size, "%luGB", hsize / SZ_1G);
	mm/hugetlb_cgroup.c:676:		snprintf(buf, size, "%luMB", hsize / SZ_1M);
	mm/hugetlb_cgroup.c:678:		snprintf(buf, size, "%luKB", hsize / SZ_1K);

-  There are only 2 remaining calls to the kernel's scnprintf().  I
   would really like to get rid of this function.  Also, I suspect
   those calls don't do what we want.  Please have a look at them and
   confirm what the appropriate behavior is in those 2 cases when the
   string is truncated or not copied at all.  That code is too scary
   for me to guess.

	$ grep -rnI scnprintf mm/
	mm/kfence/report.c:75:		int len = scnprintf(buf, sizeof(buf), "%ps", (void *)stack_entries[skipnr]);
	mm/kfence/kfence_test.mod.c:22:	{ 0x96848186, "scnprintf" },
	mm/kmsan/report.c:42:		len = scnprintf(buf, sizeof(buf), "%ps",

   Apart from two calls, I see a string literal with that name.  Please
   let me know if I should do anything about it.  I don't know what that
   is.

-  I think we should remove one error handling check in
   "mm/page_owner.c" (marked with an XXX comment), but I'm not 100%
   sure.  Please confirm.
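
On the scnprintf() question above: the kernel's scnprintf() returns the
number of characters actually written, so its return value alone cannot
tell a truncated string from one that fit exactly.  The following is a
user-space sketch of that behavior; scnprintf_sketch() is a hypothetical
emulation built on libc vsnprintf(3), not the kernel function.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical emulation of the kernel's scnprintf() return value. */
static int scnprintf_sketch(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int len;

	va_start(args, fmt);
	len = vsnprintf(buf, size, fmt, args);
	va_end(args);

	/* libc-only failure mode; treated here as "nothing written". */
	if (len < 0)
		return 0;

	/* The whole string fit: report its length. */
	if ((size_t)len < size)
		return len;

	/* Truncated: report only what was stored, excluding the '\0'. */
	return size ? (int)(size - 1) : 0;
}

static void demo_scnprintf(void)
{
	char fits[8], cut[8];
	int a, b;

	a = scnprintf_sketch(fits, sizeof(fits), "%s", "hello w");
	b = scnprintf_sketch(cut, sizeof(cut), "%s", "hello world");

	/* Same return value and same bytes: the caller cannot tell them apart. */
	assert(a == 7 && b == 7);
	assert(strcmp(fits, cut) == 0);
}
```

A caller that needs to know about truncation therefore has to use some
other interface, which is the gap the NULL return of sprintf_end() is
meant to fill.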

Other comments:

-  This still doesn't comply with the coding style.  I'll keep it like
   that while questions remain open.
-  I've tested the tests under CONFIG_KFENCE_KUNIT_TEST=y, and this has
   no regressions at all.
-  With the current style of the sprintf_end() prototype, this triggers
   a diagnostic due to a GCC bug:
   <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108036>
   It would be interesting to ask GCC to fix that bug.  (Added relevant
   GCC maintainers and contributors to CC in this cover letter.)


Have a lovely night!
Alex


Alejandro Colomar (7):
  vsprintf: Add [v]sprintf_end()
  stacktrace, stackdepot: Add sprintf_end()-like variants of functions
  mm: Use sprintf_end() instead of less ergonomic APIs
  array_size.h: Add ENDOF()
  mm: Fix benign off-by-one bugs
  sprintf: Add [V]SPRINTF_END()
  mm: Use [V]SPRINTF_END() to avoid specifying the array size

 include/linux/array_size.h |  6 ++++
 include/linux/sprintf.h    |  6 ++++
 include/linux/stackdepot.h | 13 +++++++++
 include/linux/stacktrace.h |  3 ++
 kernel/stacktrace.c        | 28 ++++++++++++++++++
 lib/stackdepot.c           | 13 +++++++++
 lib/vsprintf.c             | 59 ++++++++++++++++++++++++++++++++++++++
 mm/backing-dev.c           |  2 +-
 mm/cma.c                   |  4 +--
 mm/cma_debug.c             |  2 +-
 mm/hugetlb.c               |  3 +-
 mm/hugetlb_cgroup.c        |  2 +-
 mm/hugetlb_cma.c           |  2 +-
 mm/kasan/report.c          |  3 +-
 mm/kfence/kfence_test.c    | 28 +++++++++---------
 mm/kmsan/kmsan_test.c      |  6 ++--
 mm/memblock.c              |  4 +--
 mm/mempolicy.c             | 18 ++++++------
 mm/page_owner.c            | 32 +++++++++++----------
 mm/percpu.c                |  2 +-
 mm/shrinker_debug.c        |  2 +-
 mm/slub.c                  |  5 ++--
 mm/zswap.c                 |  2 +-
 23 files changed, 187 insertions(+), 58 deletions(-)

Range-diff against v3:
1:  64334f0b94d6 ! 1:  2c4f793de0b8 vsprintf: Add [v]seprintf(), [v]stprintf()
    @@ Metadata
     Author: Alejandro Colomar <alx@kernel.org>
     
      ## Commit message ##
    -    vsprintf: Add [v]seprintf(), [v]stprintf()
    +    vsprintf: Add [v]sprintf_end()
     
    -    seprintf()
    -    ==========
    -
    -    seprintf() is a function similar to stpcpy(3) in the sense that it
    +    sprintf_end() is a function similar to stpcpy(3) in the sense that it
         returns a pointer that is suitable for chaining to other copy
         operations.
     
    @@ Commit message
     
         It also makes error handling much easier, by reporting truncation with
         a null pointer, which is accepted and transparently passed down by
    -    subsequent seprintf() calls.  This results in only needing to report
    -    errors once after a chain of seprintf() calls, unlike snprintf(3), which
    -    requires checking after every call.
    +    subsequent sprintf_end() calls.  This results in only needing to report
    +    errors once after a chain of sprintf_end() calls, unlike snprintf(3),
    +    which requires checking after every call.
     
                 p = buf;
                 e = buf + countof(buf);
    -            p = seprintf(p, e, foo);
    -            p = seprintf(p, e, bar);
    +            p = sprintf_end(p, e, foo);
    +            p = sprintf_end(p, e, bar);
                 if (p == NULL)
                         goto trunc;
     
    @@ Commit message
                 size = countof(buf);
                 len += scnprintf(buf + len, size - len, foo);
                 len += scnprintf(buf + len, size - len, bar);
    -            if (len >= size)
    -                    goto trunc;
    +            // No ability to check.
     
         It seems apparent that it's a more elegant approach to string catenation.
     
    -    stprintf()
    -    ==========
    -
    -    stprintf() is a helper that is needed for implementing seprintf()
    -    --although it could be open-coded within vseprintf(), of course--, but
    -    it's also useful by itself.  It has the same interface properties as
    -    strscpy(): that is, it copies with truncation, and reports truncation
    -    with -E2BIG.  It would be useful to replace some calls to snprintf(3)
    -    and scnprintf() which don't need chaining, and where it's simpler to
    -    pass a size.
    -
    -    It is better than plain snprintf(3), because it results in simpler error
    -    detection (it doesn't need a check >=countof(buf), but rather <0).
    +    These functions will soon be proposed for standardization as
    +    [v]seprintf() into C2y, and they exist in Plan9 as seprint(2) --but the
    +    Plan9 implementation has important bugs--.
     
    +    Link: <https://www.alejandro-colomar.es/src/alx/alx/wg14/alx-0049.git/tree/alx-0049.txt>
         Cc: Kees Cook <kees@kernel.org>
         Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
    +    Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
    +    Cc: Marco Elver <elver@google.com>
    +    Cc: Michal Hocko <mhocko@suse.com>
    +    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    +    Cc: Al Viro <viro@zeniv.linux.org.uk>
         Signed-off-by: Alejandro Colomar <alx@kernel.org>
     
      ## include/linux/sprintf.h ##
    -@@ include/linux/sprintf.h: __printf(2, 3) int sprintf(char *buf, const char * fmt, ...);
    - __printf(2, 0) int vsprintf(char *buf, const char *, va_list);
    - __printf(3, 4) int snprintf(char *buf, size_t size, const char *fmt, ...);
    +@@ include/linux/sprintf.h: __printf(3, 4) int snprintf(char *buf, size_t size, const char *fmt, ...);
      __printf(3, 0) int vsnprintf(char *buf, size_t size, const char *fmt, va_list args);
    -+__printf(3, 4) int stprintf(char *buf, size_t size, const char *fmt, ...);
    -+__printf(3, 0) int vstprintf(char *buf, size_t size, const char *fmt, va_list args);
      __printf(3, 4) int scnprintf(char *buf, size_t size, const char *fmt, ...);
      __printf(3, 0) int vscnprintf(char *buf, size_t size, const char *fmt, va_list args);
    -+__printf(3, 4) char *seprintf(char *p, const char end[0], const char *fmt, ...);
    -+__printf(3, 0) char *vseprintf(char *p, const char end[0], const char *fmt, va_list args);
    ++__printf(3, 4) char *sprintf_end(char *p, const char end[0], const char *fmt, ...);
    ++__printf(3, 0) char *vsprintf_end(char *p, const char end[0], const char *fmt, va_list args);
      __printf(2, 3) __malloc char *kasprintf(gfp_t gfp, const char *fmt, ...);
      __printf(2, 0) __malloc char *kvasprintf(gfp_t gfp, const char *fmt, va_list args);
      __printf(2, 0) const char *kvasprintf_const(gfp_t gfp, const char *fmt, va_list args);
     
      ## lib/vsprintf.c ##
    -@@ lib/vsprintf.c: int vsnprintf(char *buf, size_t size, const char *fmt_str, va_list args)
    - }
    - EXPORT_SYMBOL(vsnprintf);
    - 
    -+/**
    -+ * vstprintf - Format a string and place it in a buffer
    -+ * @buf: The buffer to place the result into
    -+ * @size: The size of the buffer, including the trailing null space
    -+ * @fmt: The format string to use
    -+ * @args: Arguments for the format string
    -+ *
    -+ * The return value is the length of the new string.
    -+ * If the string is truncated, the function returns -E2BIG.
    -+ *
    -+ * If you're not already dealing with a va_list consider using stprintf().
    -+ *
    -+ * See the vsnprintf() documentation for format string extensions over C99.
    -+ */
    -+int vstprintf(char *buf, size_t size, const char *fmt, va_list args)
    -+{
    -+	int len;
    -+
    -+	len = vsnprintf(buf, size, fmt, args);
    -+
    -+	// It seems the kernel's vsnprintf() doesn't fail?
    -+	//if (unlikely(len < 0))
    -+	//	return -E2BIG;
    -+
    -+	if (unlikely(len >= size))
    -+		return -E2BIG;
    -+
    -+	return len;
    -+}
    -+EXPORT_SYMBOL(vstprintf);
    -+
    - /**
    -  * vscnprintf - Format a string and place it in a buffer
    -  * @buf: The buffer to place the result into
     @@ lib/vsprintf.c: int vscnprintf(char *buf, size_t size, const char *fmt, va_list args)
      }
      EXPORT_SYMBOL(vscnprintf);
      
     +/**
    -+ * vseprintf - Format a string and place it in a buffer
    ++ * vsprintf_end - va_list string end-delimited print formatted
     + * @p: The buffer to place the result into
     + * @end: A pointer to one past the last character in the buffer
     + * @fmt: The format string to use
    @@ lib/vsprintf.c: int vscnprintf(char *buf, size_t size, const char *fmt, va_list
     + * The return value is a pointer to the trailing '\0'.
     + * If @p is NULL, the function returns NULL.
     + * If the string is truncated, the function returns NULL.
    -+ *
    -+ * If you're not already dealing with a va_list consider using seprintf().
    ++ * If @end <= @p, the function returns NULL.
     + *
     + * See the vsnprintf() documentation for format string extensions over C99.
     + */
    -+char *vseprintf(char *p, const char end[0], const char *fmt, va_list args)
    ++char *vsprintf_end(char *p, const char end[0], const char *fmt, va_list args)
     +{
     +	int len;
    ++	size_t size;
     +
     +	if (unlikely(p == NULL))
     +		return NULL;
     +
    -+	len = vstprintf(p, end - p, fmt, args);
    -+	if (unlikely(len < 0))
    ++	size = end - p;
    ++	if (WARN_ON_ONCE(size == 0 || size > INT_MAX))
    ++		return NULL;
    ++
    ++	len = vsnprintf(p, size, fmt, args);
    ++	if (unlikely(len >= size))
     +		return NULL;
     +
     +	return p + len;
     +}
    -+EXPORT_SYMBOL(vseprintf);
    ++EXPORT_SYMBOL(vsprintf_end);
     +
      /**
       * snprintf - Format a string and place it in a buffer
       * @buf: The buffer to place the result into
    -@@ lib/vsprintf.c: int snprintf(char *buf, size_t size, const char *fmt, ...)
    - }
    - EXPORT_SYMBOL(snprintf);
    - 
    -+/**
    -+ * stprintf - Format a string and place it in a buffer
    -+ * @buf: The buffer to place the result into
    -+ * @size: The size of the buffer, including the trailing null space
    -+ * @fmt: The format string to use
    -+ * @...: Arguments for the format string
    -+ *
    -+ * The return value is the length of the new string.
    -+ * If the string is truncated, the function returns -E2BIG.
    -+ */
    -+
    -+int stprintf(char *buf, size_t size, const char *fmt, ...)
    -+{
    -+	va_list args;
    -+	int len;
    -+
    -+	va_start(args, fmt);
    -+	len = vstprintf(buf, size, fmt, args);
    -+	va_end(args);
    -+
    -+	return len;
    -+}
    -+EXPORT_SYMBOL(stprintf);
    -+
    - /**
    -  * scnprintf - Format a string and place it in a buffer
    -  * @buf: The buffer to place the result into
     @@ lib/vsprintf.c: int scnprintf(char *buf, size_t size, const char *fmt, ...)
      }
      EXPORT_SYMBOL(scnprintf);
      
     +/**
    -+ * seprintf - Format a string and place it in a buffer
    ++ * sprintf_end - string end-delimited print formatted
     + * @p: The buffer to place the result into
     + * @end: A pointer to one past the last character in the buffer
     + * @fmt: The format string to use
    @@ lib/vsprintf.c: int scnprintf(char *buf, size_t size, const char *fmt, ...)
     + * The return value is a pointer to the trailing '\0'.
     + * If @p is NULL, the function returns NULL.
     + * If the string is truncated, the function returns NULL.
    ++ * If @end <= @p, the function returns NULL.
     + */
     +
    -+char *seprintf(char *p, const char end[0], const char *fmt, ...)
    ++char *sprintf_end(char *p, const char end[0], const char *fmt, ...)
     +{
     +	va_list args;
     +
     +	va_start(args, fmt);
    -+	p = vseprintf(p, end, fmt, args);
    ++	p = vsprintf_end(p, end, fmt, args);
     +	va_end(args);
     +
     +	return p;
     +}
    -+EXPORT_SYMBOL(seprintf);
    ++EXPORT_SYMBOL(sprintf_end);
     +
      /**
       * vsprintf - Format a string and place it in a buffer
2:  9c140de9842d ! 2:  894d02b08056 stacktrace, stackdepot: Add seprintf()-like variants of functions
    @@ Metadata
     Author: Alejandro Colomar <alx@kernel.org>
     
      ## Commit message ##
    -    stacktrace, stackdepot: Add seprintf()-like variants of functions
    -
    -    I think there's an anomaly in stack_depot_s*print().  If we have zero
    -    entries, we don't copy anything, which means the string is still not a
    -    string.  Normally, this function is called surrounded by other calls to
    -    s*printf(), which guarantee that there's a '\0', but maybe we should
    -    make sure to write a '\0' here?
    +    stacktrace, stackdepot: Add sprintf_end()-like variants of functions
     
         Cc: Kees Cook <kees@kernel.org>
         Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
    +    Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
    +    Cc: Marco Elver <elver@google.com>
    +    Cc: Michal Hocko <mhocko@suse.com>
    +    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    +    Cc: Al Viro <viro@zeniv.linux.org.uk>
         Signed-off-by: Alejandro Colomar <alx@kernel.org>
     
      ## include/linux/stackdepot.h ##
    @@ include/linux/stackdepot.h: void stack_depot_print(depot_stack_handle_t stack);
      		       int spaces);
      
     +/**
    -+ * stack_depot_seprint - Print a stack trace from stack depot into a buffer
    ++ * stack_depot_sprint_end - Print a stack trace from stack depot into a buffer
     + *
     + * @handle:	Stack depot handle returned from stack_depot_save()
     + * @p:		Pointer to the print buffer
    @@ include/linux/stackdepot.h: void stack_depot_print(depot_stack_handle_t stack);
     + *
     + * Return:	Pointer to trailing '\0'; or NULL on truncation
     + */
    -+char *stack_depot_seprint(depot_stack_handle_t handle, char *p,
    -+                          const char end[0], int spaces);
    ++char *stack_depot_sprint_end(depot_stack_handle_t handle, char *p,
    ++                             const char end[0], int spaces);
     +
      /**
       * stack_depot_put - Drop a reference to a stack trace from stack depot
    @@ include/linux/stacktrace.h: void stack_trace_print(const unsigned long *trace, u
      		       int spaces);
      int stack_trace_snprint(char *buf, size_t size, const unsigned long *entries,
      			unsigned int nr_entries, int spaces);
    -+char *stack_trace_seprint(char *p, const char end[0],
    -+			  const unsigned long *entries, unsigned int nr_entries,
    -+			  int spaces);
    ++char *stack_trace_sprint_end(char *p, const char end[0],
    ++			     const unsigned long *entries,
    ++			     unsigned int nr_entries, int spaces);
      unsigned int stack_trace_save(unsigned long *store, unsigned int size,
      			      unsigned int skipnr);
      unsigned int stack_trace_save_tsk(struct task_struct *task,
    @@ kernel/stacktrace.c: int stack_trace_snprint(char *buf, size_t size, const unsig
      EXPORT_SYMBOL_GPL(stack_trace_snprint);
      
     +/**
    -+ * stack_trace_seprint - Print the entries in the stack trace into a buffer
    ++ * stack_trace_sprint_end - Print the entries in the stack trace into a buffer
     + * @p:		Pointer to the print buffer
     + * @end:	Pointer to one past the last element in the buffer
     + * @entries:	Pointer to storage array
    @@ kernel/stacktrace.c: int stack_trace_snprint(char *buf, size_t size, const unsig
     + *
     + * Return: Pointer to the trailing '\0'; or NULL on truncation.
     + */
    -+char *stack_trace_seprint(char *p, const char end[0],
    ++char *stack_trace_sprint_end(char *p, const char end[0],
     +			  const unsigned long *entries, unsigned int nr_entries,
     +			  int spaces)
     +{
    @@ kernel/stacktrace.c: int stack_trace_snprint(char *buf, size_t size, const unsig
     +		return 0;
     +
     +	for (i = 0; i < nr_entries; i++) {
    -+		p = seprintf(p, end, "%*c%pS\n", 1 + spaces, ' ',
    ++		p = sprintf_end(p, end, "%*c%pS\n", 1 + spaces, ' ',
     +			     (void *)entries[i]);
     +	}
     +
     +	return p;
     +}
    -+EXPORT_SYMBOL_GPL(stack_trace_seprint);
    ++EXPORT_SYMBOL_GPL(stack_trace_sprint_end);
     +
      #ifdef CONFIG_ARCH_STACKWALK
      
    @@ lib/stackdepot.c: int stack_depot_snprint(depot_stack_handle_t handle, char *buf
      }
      EXPORT_SYMBOL_GPL(stack_depot_snprint);
      
    -+char *stack_depot_seprint(depot_stack_handle_t handle, char *p,
    -+			  const char end[0], int spaces)
    ++char *stack_depot_sprint_end(depot_stack_handle_t handle, char *p,
    ++			     const char end[0], int spaces)
     +{
     +	unsigned long *entries;
     +	unsigned int nr_entries;
     +
     +	nr_entries = stack_depot_fetch(handle, &entries);
    -+	return nr_entries ? stack_trace_seprint(p, end, entries, nr_entries,
    -+						spaces) : p;
    ++	return nr_entries ?
    ++		stack_trace_sprint_end(p, end, entries, nr_entries, spaces)
    ++		: sprintf_end(p, end, "");
     +}
    -+EXPORT_SYMBOL_GPL(stack_depot_seprint);
    ++EXPORT_SYMBOL_GPL(stack_depot_sprint_end);
     +
      depot_stack_handle_t __must_check stack_depot_set_extra_bits(
      			depot_stack_handle_t handle, unsigned int extra_bits)
3:  033bf00f1fcf ! 3:  690ed4d22f57 mm: Use seprintf() instead of less ergonomic APIs
    @@ Metadata
     Author: Alejandro Colomar <alx@kernel.org>
     
      ## Commit message ##
    -    mm: Use seprintf() instead of less ergonomic APIs
    +    mm: Use sprintf_end() instead of less ergonomic APIs
     
         While doing this, I detected some anomalies in the existing code:
     
    @@ Commit message
     
                 This file uses the 'p += snprintf()' anti-pattern.  That will
                 overflow the pointer on truncation, which has undefined
    -            behavior.  Using seprintf(), this bug is fixed.
    +            behavior.  Using sprintf_end(), this bug is fixed.
     
                 As in the previous file, here there was also dead code in the
                 last scnprintf() call, by incrementing a pointer that is not
    @@ Commit message
                 a good reason (i.e., we may want to avoid calling
                 print_page_owner_memcg() if we truncated before).  Please review
                 if this amount of error handling is the right one, or if we want
    -            to add or remove some.  For seprintf(), a single test for null
    -            after the last call is enough to detect truncation.
    +            to add or remove some.  For sprintf_end(), a single test for
    +            null after the last call is enough to detect truncation.
     
         mm/slub.c:
     
                 Again, the 'p += snprintf()' anti-pattern.  This is UB, and by
    -            using seprintf() we've fixed the bug.
    +            using sprintf_end() we've fixed the bug.
     
    -    Fixes: f99e12b21b84 (2021-07-30; "kfence: add function to mask address bits")
    -    [alx: that commit introduced dead code]
    -    Fixes: af649773fb25 (2024-07-17; "mm/numa_balancing: teach mpol_to_str about the balancing mode")
    -    [alx: that commit added p+=snprintf() calls, which are UB]
    -    Fixes: 2291990ab36b (2008-04-28; "mempolicy: clean-up mpol-to-str() mempolicy formatting")
    -    [alx: that commit changed p+=sprintf() into p+=snprintf(), which is still UB]
    -    Fixes: 948927ee9e4f (2013-11-13; "mm, mempolicy: make mpol_to_str robust and always succeed")
    -    [alx: that commit changes old code into p+=snprintf(), which is still UB]
    -    [alx: that commit also produced dead code by leaving the last 'p+=...']
    -    Fixes: d65360f22406 (2022-09-26; "mm/slub: clean up create_unique_id()")
    -    [alx: that commit changed p+=sprintf() into p+=snprintf(), which is still UB]
         Cc: Kees Cook <kees@kernel.org>
         Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
         Cc: Sven Schnelle <svens@linux.ibm.com>
         Cc: Marco Elver <elver@google.com>
         Cc: Heiko Carstens <hca@linux.ibm.com>
         Cc: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
    -    Cc: "Huang, Ying" <ying.huang@intel.com>
         Cc: Andrew Morton <akpm@linux-foundation.org>
    -    Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
         Cc: Linus Torvalds <torvalds@linux-foundation.org>
         Cc: David Rientjes <rientjes@google.com>
         Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
         Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
         Cc: Chao Yu <chao.yu@oppo.com>
         Cc: Vlastimil Babka <vbabka@suse.cz>
    +    Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
    +    Cc: Michal Hocko <mhocko@suse.com>
    +    Cc: Al Viro <viro@zeniv.linux.org.uk>
         Signed-off-by: Alejandro Colomar <alx@kernel.org>
     
      ## mm/kfence/kfence_test.c ##
    @@ mm/kfence/kfence_test.c: static bool report_matches(const struct expect_report *
      	switch (r->type) {
      	case KFENCE_ERROR_OOB:
     -		cur += scnprintf(cur, end - cur, "BUG: KFENCE: out-of-bounds %s",
    -+		cur = seprintf(cur, end, "BUG: KFENCE: out-of-bounds %s",
    ++		cur = sprintf_end(cur, end, "BUG: KFENCE: out-of-bounds %s",
      				 get_access_type(r));
      		break;
      	case KFENCE_ERROR_UAF:
     -		cur += scnprintf(cur, end - cur, "BUG: KFENCE: use-after-free %s",
    -+		cur = seprintf(cur, end, "BUG: KFENCE: use-after-free %s",
    ++		cur = sprintf_end(cur, end, "BUG: KFENCE: use-after-free %s",
      				 get_access_type(r));
      		break;
      	case KFENCE_ERROR_CORRUPTION:
     -		cur += scnprintf(cur, end - cur, "BUG: KFENCE: memory corruption");
    -+		cur = seprintf(cur, end, "BUG: KFENCE: memory corruption");
    ++		cur = sprintf_end(cur, end, "BUG: KFENCE: memory corruption");
      		break;
      	case KFENCE_ERROR_INVALID:
     -		cur += scnprintf(cur, end - cur, "BUG: KFENCE: invalid %s",
    -+		cur = seprintf(cur, end, "BUG: KFENCE: invalid %s",
    ++		cur = sprintf_end(cur, end, "BUG: KFENCE: invalid %s",
      				 get_access_type(r));
      		break;
      	case KFENCE_ERROR_INVALID_FREE:
     -		cur += scnprintf(cur, end - cur, "BUG: KFENCE: invalid free");
    -+		cur = seprintf(cur, end, "BUG: KFENCE: invalid free");
    ++		cur = sprintf_end(cur, end, "BUG: KFENCE: invalid free");
      		break;
      	}
      
     -	scnprintf(cur, end - cur, " in %pS", r->fn);
    -+	seprintf(cur, end, " in %pS", r->fn);
    ++	sprintf_end(cur, end, " in %pS", r->fn);
      	/* The exact offset won't match, remove it; also strip module name. */
      	cur = strchr(expect[0], '+');
      	if (cur)
    @@ mm/kfence/kfence_test.c: static bool report_matches(const struct expect_report *
      	switch (r->type) {
      	case KFENCE_ERROR_OOB:
     -		cur += scnprintf(cur, end - cur, "Out-of-bounds %s at", get_access_type(r));
    -+		cur = seprintf(cur, end, "Out-of-bounds %s at", get_access_type(r));
    ++		cur = sprintf_end(cur, end, "Out-of-bounds %s at", get_access_type(r));
      		addr = arch_kfence_test_address(addr);
      		break;
      	case KFENCE_ERROR_UAF:
     -		cur += scnprintf(cur, end - cur, "Use-after-free %s at", get_access_type(r));
    -+		cur = seprintf(cur, end, "Use-after-free %s at", get_access_type(r));
    ++		cur = sprintf_end(cur, end, "Use-after-free %s at", get_access_type(r));
      		addr = arch_kfence_test_address(addr);
      		break;
      	case KFENCE_ERROR_CORRUPTION:
     -		cur += scnprintf(cur, end - cur, "Corrupted memory at");
    -+		cur = seprintf(cur, end, "Corrupted memory at");
    ++		cur = sprintf_end(cur, end, "Corrupted memory at");
      		break;
      	case KFENCE_ERROR_INVALID:
     -		cur += scnprintf(cur, end - cur, "Invalid %s at", get_access_type(r));
    -+		cur = seprintf(cur, end, "Invalid %s at", get_access_type(r));
    ++		cur = sprintf_end(cur, end, "Invalid %s at", get_access_type(r));
      		addr = arch_kfence_test_address(addr);
      		break;
      	case KFENCE_ERROR_INVALID_FREE:
     -		cur += scnprintf(cur, end - cur, "Invalid free of");
    -+		cur = seprintf(cur, end, "Invalid free of");
    ++		cur = sprintf_end(cur, end, "Invalid free of");
      		break;
      	}
      
     -	cur += scnprintf(cur, end - cur, " 0x%p", (void *)addr);
    -+	seprintf(cur, end, " 0x%p", (void *)addr);
    ++	sprintf_end(cur, end, " 0x%p", (void *)addr);
      
      	spin_lock_irqsave(&observed.lock, flags);
      	if (!report_available())
    @@ mm/kmsan/kmsan_test.c: static bool report_matches(const struct expect_report *r)
      	end = &expected_header[sizeof(expected_header) - 1];
      
     -	cur += scnprintf(cur, end - cur, "BUG: KMSAN: %s", r->error_type);
    -+	cur = seprintf(cur, end, "BUG: KMSAN: %s", r->error_type);
    ++	cur = sprintf_end(cur, end, "BUG: KMSAN: %s", r->error_type);
      
     -	scnprintf(cur, end - cur, " in %s", r->symbol);
    -+	seprintf(cur, end, " in %s", r->symbol);
    ++	sprintf_end(cur, end, " in %s", r->symbol);
      	/* The exact offset won't match, remove it; also strip module name. */
      	cur = strchr(expected_header, '+');
      	if (cur)
    @@ mm/mempolicy.c: void mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol
      	default:
      		WARN_ON_ONCE(1);
     -		snprintf(p, maxlen, "unknown");
    -+		seprintf(p, e, "unknown");
    ++		sprintf_end(p, e, "unknown");
      		return;
      	}
      
     -	p += snprintf(p, maxlen, "%s", policy_modes[mode]);
    -+	p = seprintf(p, e, "%s", policy_modes[mode]);
    ++	p = sprintf_end(p, e, "%s", policy_modes[mode]);
      
      	if (flags & MPOL_MODE_FLAGS) {
     -		p += snprintf(p, buffer + maxlen - p, "=");
    -+		p = seprintf(p, e, "=");
    ++		p = sprintf_end(p, e, "=");
      
      		/*
      		 * Static and relative are mutually exclusive.
      		 */
      		if (flags & MPOL_F_STATIC_NODES)
     -			p += snprintf(p, buffer + maxlen - p, "static");
    -+			p = seprintf(p, e, "static");
    ++			p = sprintf_end(p, e, "static");
      		else if (flags & MPOL_F_RELATIVE_NODES)
     -			p += snprintf(p, buffer + maxlen - p, "relative");
    -+			p = seprintf(p, e, "relative");
    ++			p = sprintf_end(p, e, "relative");
      
      		if (flags & MPOL_F_NUMA_BALANCING) {
      			if (!is_power_of_2(flags & MPOL_MODE_FLAGS))
     -				p += snprintf(p, buffer + maxlen - p, "|");
     -			p += snprintf(p, buffer + maxlen - p, "balancing");
    -+				p = seprintf(p, e, "|");
    -+			p = seprintf(p, e, "balancing");
    ++				p = sprintf_end(p, e, "|");
    ++			p = sprintf_end(p, e, "balancing");
      		}
      	}
      
      	if (!nodes_empty(nodes))
     -		p += scnprintf(p, buffer + maxlen - p, ":%*pbl",
     -			       nodemask_pr_args(&nodes));
    -+		seprintf(p, e, ":%*pbl", nodemask_pr_args(&nodes));
    ++		sprintf_end(p, e, ":%*pbl", nodemask_pr_args(&nodes));
      }
      
      #ifdef CONFIG_SYSFS
    @@ mm/page_owner.c: static inline int print_page_owner_memcg(char *kbuf, size_t cou
      	if (memcg_data & MEMCG_DATA_OBJEXTS)
     -		ret += scnprintf(kbuf + ret, count - ret,
     -				"Slab cache page\n");
    -+		p = seprintf(p, end, "Slab cache page\n");
    ++		p = sprintf_end(p, end, "Slab cache page\n");
      
      	memcg = page_memcg_check(page);
      	if (!memcg)
    @@ mm/page_owner.c: static inline int print_page_owner_memcg(char *kbuf, size_t cou
      	online = (memcg->css.flags & CSS_ONLINE);
      	cgroup_name(memcg->css.cgroup, name, sizeof(name));
     -	ret += scnprintf(kbuf + ret, count - ret,
    -+	p = seprintf(p, end,
    ++	p = sprintf_end(p, end,
      			"Charged %sto %smemcg %s\n",
      			PageMemcgKmem(page) ? "(via objcg) " : "",
      			online ? "" : "offline ",
    @@ mm/page_owner.c: print_page_owner(char __user *buf, size_t count, unsigned long
     -	ret = scnprintf(kbuf, count,
     +	p = kbuf;
     +	e = kbuf + count;
    -+	p = seprintf(p, e,
    ++	p = sprintf_end(p, e,
      			"Page allocated via order %u, mask %#x(%pGg), pid %d, tgid %d (%s), ts %llu ns\n",
      			page_owner->order, page_owner->gfp_mask,
      			&page_owner->gfp_mask, page_owner->pid,
    @@ mm/page_owner.c: print_page_owner(char __user *buf, size_t count, unsigned long
      	pageblock_mt = get_pageblock_migratetype(page);
      	page_mt  = gfp_migratetype(page_owner->gfp_mask);
     -	ret += scnprintf(kbuf + ret, count - ret,
    -+	p = seprintf(p, e,
    ++	p = sprintf_end(p, e,
      			"PFN 0x%lx type %s Block %lu type %s Flags %pGp\n",
      			pfn,
      			migratetype_names[page_mt],
    @@ mm/page_owner.c: print_page_owner(char __user *buf, size_t count, unsigned long
     -	ret += stack_depot_snprint(handle, kbuf + ret, count - ret, 0);
     -	if (ret >= count)
     -		goto err;
    -+	p = stack_depot_seprint(handle, p, e, 0);
    ++	p = stack_depot_sprint_end(handle, p, e, 0);
     +	if (p == NULL)
     +		goto err;  // XXX: Should we remove this error handling?
      
      	if (page_owner->last_migrate_reason != -1) {
     -		ret += scnprintf(kbuf + ret, count - ret,
    -+		p = seprintf(p, e,
    ++		p = sprintf_end(p, e,
      			"Page has been migrated, last migrate reason: %s\n",
      			migrate_reason_names[page_owner->last_migrate_reason]);
      	}
    @@ mm/page_owner.c: print_page_owner(char __user *buf, size_t count, unsigned long
      
     -	ret += snprintf(kbuf + ret, count - ret, "\n");
     -	if (ret >= count)
    -+	p = seprintf(p, e, "\n");
    ++	p = sprintf_end(p, e, "\n");
     +	if (p == NULL)
      		goto err;
      
    @@ mm/slub.c: static char *create_unique_id(struct kmem_cache *s)
      	if (p != name + 1)
      		*p++ = '-';
     -	p += snprintf(p, ID_STR_LENGTH - (p - name), "%07u", s->size);
    -+	p = seprintf(p, e, "%07u", s->size);
    ++	p = sprintf_end(p, e, "%07u", s->size);
      
     -	if (WARN_ON(p > name + ID_STR_LENGTH - 1)) {
     +	if (WARN_ON(p == NULL)) {
4:  d8bd0e1d308b ! 4:  e05c5afabb3c array_size.h: Add ENDOF()
    @@ Metadata
      ## Commit message ##
         array_size.h: Add ENDOF()
     
    -    This macro is useful to calculate the second argument to seprintf(),
    +    This macro is useful to calculate the second argument to sprintf_end(),
         avoiding off-by-one bugs.
     
         Cc: Kees Cook <kees@kernel.org>
         Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
    +    Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
    +    Cc: Marco Elver <elver@google.com>
    +    Cc: Michal Hocko <mhocko@suse.com>
    +    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    +    Cc: Al Viro <viro@zeniv.linux.org.uk>
         Signed-off-by: Alejandro Colomar <alx@kernel.org>
     
      ## include/linux/array_size.h ##
5:  740755c1a888 ! 5:  44a5cfc82acf mm: Fix benign off-by-one bugs
    @@ Commit message
         'end' --that is, at most the terminating null byte will be written at
         'end-1'--.
     
    -    Fixes: bc8fbc5f305a (2021-02-26; "kfence: add test suite")
    -    Fixes: 8ed691b02ade (2022-10-03; "kmsan: add tests for KMSAN")
    +    Acked-by: Marco Elver <elver@google.com>
         Cc: Kees Cook <kees@kernel.org>
         Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
         Cc: Alexander Potapenko <glider@google.com>
    -    Cc: Marco Elver <elver@google.com>
         Cc: Dmitry Vyukov <dvyukov@google.com>
         Cc: Alexander Potapenko <glider@google.com>
         Cc: Jann Horn <jannh@google.com>
         Cc: Andrew Morton <akpm@linux-foundation.org>
         Cc: Linus Torvalds <torvalds@linux-foundation.org>
    +    Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
    +    Cc: Marco Elver <elver@google.com>
    +    Cc: Michal Hocko <mhocko@suse.com>
    +    Cc: Al Viro <viro@zeniv.linux.org.uk>
         Signed-off-by: Alejandro Colomar <alx@kernel.org>
     
      ## mm/kfence/kfence_test.c ##
    @@ mm/kfence/kfence_test.c: static bool report_matches(const struct expect_report *
     +	end = ENDOF(expect[0]);
      	switch (r->type) {
      	case KFENCE_ERROR_OOB:
    - 		cur = seprintf(cur, end, "BUG: KFENCE: out-of-bounds %s",
    + 		cur = sprintf_end(cur, end, "BUG: KFENCE: out-of-bounds %s",
     @@ mm/kfence/kfence_test.c: static bool report_matches(const struct expect_report *r)
      
      	/* Access information */
    @@ mm/kmsan/kmsan_test.c: static bool report_matches(const struct expect_report *r)
     -	end = &expected_header[sizeof(expected_header) - 1];
     +	end = ENDOF(expected_header);
      
    - 	cur = seprintf(cur, end, "BUG: KMSAN: %s", r->error_type);
    + 	cur = sprintf_end(cur, end, "BUG: KMSAN: %s", r->error_type);
      
6:  44d05559398c ! 6:  0314948eb225 sprintf: Add [V]STPRINTF()
    @@ Metadata
     Author: Alejandro Colomar <alx@kernel.org>
     
      ## Commit message ##
    -    sprintf: Add [V]STPRINTF()
    +    sprintf: Add [V]SPRINTF_END()
     
    -    These macros take the array size argument implicitly to avoid programmer
    -    mistakes.  This guarantees that the input is an array, unlike the common
    -    call
    +    These macros take the end of the array argument implicitly to avoid
    +    programmer mistakes.  This guarantees that the input is an array, unlike
     
                 snprintf(buf, sizeof(buf), ...);
     
    -    which is dangerous if the programmer passes a pointer.
    +    which is dangerous if the programmer passes a pointer instead of an
    +    array.
     
         These macros are essentially the same as the 2-argument version of
    -    strscpy(), but with a formatted string.
    +    strscpy(), but with a formatted string, and returning a pointer to the
    +    terminating '\0' (or NULL, on error).
     
    +    Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
    +    Cc: Marco Elver <elver@google.com>
    +    Cc: Michal Hocko <mhocko@suse.com>
    +    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    +    Cc: Al Viro <viro@zeniv.linux.org.uk>
         Signed-off-by: Alejandro Colomar <alx@kernel.org>
     
      ## include/linux/sprintf.h ##
    @@ include/linux/sprintf.h
      #include <linux/types.h>
     +#include <linux/array_size.h>
     +
    -+#define STPRINTF(a, fmt, ...)  stprintf(a, ARRAY_SIZE(a), fmt, ##__VA_ARGS__)
    -+#define VSTPRINTF(a, fmt, ap)  vstprintf(a, ARRAY_SIZE(a), fmt, ap)
    ++#define SPRINTF_END(a, fmt, ...)  sprintf_end(a, ENDOF(a), fmt, ##__VA_ARGS__)
    ++#define VSPRINTF_END(a, fmt, ap)  vsprintf_end(a, ENDOF(a), fmt, ap)
      
      int num_to_str(char *buf, int size, unsigned long long num, unsigned int width);
      
7:  d0e95db3c80a ! 7:  f99632f42eee mm: Use [V]STPRINTF() to avoid specifying the array size
    @@ Metadata
     Author: Alejandro Colomar <alx@kernel.org>
     
      ## Commit message ##
    -    mm: Use [V]STPRINTF() to avoid specifying the array size
    +    mm: Use [V]SPRINTF_END() to avoid specifying the array size
     
    +    Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
    +    Cc: Marco Elver <elver@google.com>
    +    Cc: Michal Hocko <mhocko@suse.com>
    +    Cc: Linus Torvalds <torvalds@linux-foundation.org>
    +    Cc: Al Viro <viro@zeniv.linux.org.uk>
         Signed-off-by: Alejandro Colomar <alx@kernel.org>
     
      ## mm/backing-dev.c ##
    @@ mm/backing-dev.c: int bdi_register_va(struct backing_dev_info *bdi, const char *
      		return 0;
      
     -	vsnprintf(bdi->dev_name, sizeof(bdi->dev_name), fmt, args);
    -+	VSTPRINTF(bdi->dev_name, fmt, args);
    ++	VSPRINTF_END(bdi->dev_name, fmt, args);
      	dev = device_create(&bdi_class, NULL, MKDEV(0, 0), bdi, bdi->dev_name);
      	if (IS_ERR(dev))
      		return PTR_ERR(dev);
    @@ mm/cma.c: static int __init cma_new_area(const char *name, phys_addr_t size,
      
      	if (name)
     -		snprintf(cma->name, CMA_MAX_NAME, "%s", name);
    -+		STPRINTF(cma->name, "%s", name);
    ++		SPRINTF_END(cma->name, "%s", name);
      	else
     -		snprintf(cma->name, CMA_MAX_NAME,  "cma%d\n", cma_area_count);
    -+		STPRINTF(cma->name, "cma%d\n", cma_area_count);
    ++		SPRINTF_END(cma->name, "cma%d\n", cma_area_count);
      
      	cma->available_count = cma->count = size >> PAGE_SHIFT;
      	cma->order_per_bit = order_per_bit;
    @@ mm/cma_debug.c: static void cma_debugfs_add_one(struct cma *cma, struct dentry *
      	for (r = 0; r < cma->nranges; r++) {
      		cmr = &cma->ranges[r];
     -		snprintf(rdirname, sizeof(rdirname), "%d", r);
    -+		STPRINTF(rdirname, "%d", r);
    ++		SPRINTF_END(rdirname, "%d", r);
      		dir = debugfs_create_dir(rdirname, rangedir);
      		debugfs_create_file("base_pfn", 0444, dir,
      			    &cmr->base_pfn, &cma_debugfs_fops);
    @@ mm/hugetlb.c: void __init hugetlb_add_hstate(unsigned int order)
      	INIT_LIST_HEAD(&h->hugepage_activelist);
     -	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
     -					huge_page_size(h)/SZ_1K);
    -+	STPRINTF(h->name, "hugepages-%lukB", huge_page_size(h)/SZ_1K);
    ++	SPRINTF_END(h->name, "hugepages-%lukB", huge_page_size(h)/SZ_1K);
      
      	parsed_hstate = h;
      }
    @@ mm/hugetlb_cgroup.c: hugetlb_cgroup_cfttypes_init(struct hstate *h, struct cftyp
      		*cft = *tmpl;
      		/* rebuild the name */
     -		snprintf(cft->name, MAX_CFTYPE_NAME, "%s.%s", buf, tmpl->name);
    -+		STPRINTF(cft->name, "%s.%s", buf, tmpl->name);
    ++		SPRINTF_END(cft->name, "%s.%s", buf, tmpl->name);
      		/* rebuild the private */
      		cft->private = MEMFILE_PRIVATE(idx, tmpl->private);
      		/* rebuild the file_offset */
    @@ mm/hugetlb_cma.c: void __init hugetlb_cma_reserve(int order)
      		size = round_up(size, PAGE_SIZE << order);
      
     -		snprintf(name, sizeof(name), "hugetlb%d", nid);
    -+		STPRINTF(name, "hugetlb%d", nid);
    ++		SPRINTF_END(name, "hugetlb%d", nid);
      		/*
      		 * Note that 'order per bit' is based on smallest size that
      		 * may be returned to CMA allocator in the case of
    @@ mm/kasan/report.c: static void print_memory_metadata(const void *addr)
      
     -		snprintf(buffer, sizeof(buffer),
     -				(i == 0) ? ">%px: " : " %px: ", row);
    -+		STPRINTF(buffer, (i == 0) ? ">%px: " : " %px: ", row);
    ++		SPRINTF_END(buffer, (i == 0) ? ">%px: " : " %px: ", row);
      
      		/*
      		 * We should not pass a shadow pointer to generic
    @@ mm/memblock.c: static void __init_memblock memblock_dump(struct memblock_type *t
      #ifdef CONFIG_NUMA
      		if (numa_valid_node(memblock_get_region_node(rgn)))
     -			snprintf(nid_buf, sizeof(nid_buf), " on node %d",
    -+			STPRINTF(nid_buf, " on node %d",
    ++			SPRINTF_END(nid_buf, " on node %d",
      				 memblock_get_region_node(rgn));
      #endif
      		pr_info(" %s[%#x]\t[%pa-%pa], %pa bytes%s flags: %#x\n",
    @@ mm/memblock.c: int reserve_mem_release_by_name(const char *name)
      	start = phys_to_virt(map->start);
      	end = start + map->size - 1;
     -	snprintf(buf, sizeof(buf), "reserve_mem:%s", name);
    -+	STPRINTF(buf, "reserve_mem:%s", name);
    ++	SPRINTF_END(buf, "reserve_mem:%s", name);
      	free_reserved_area(start, end, 0, buf);
      	map->size = 0;
      
    @@ mm/percpu.c: int __init pcpu_page_first_chunk(size_t reserved_size, pcpu_fc_cpu_
      	int nr_g0_units;
      
     -	snprintf(psize_str, sizeof(psize_str), "%luK", PAGE_SIZE >> 10);
    -+	STPRINTF(psize_str, "%luK", PAGE_SIZE >> 10);
    ++	SPRINTF_END(psize_str, "%luK", PAGE_SIZE >> 10);
      
      	ai = pcpu_build_alloc_info(reserved_size, 0, PAGE_SIZE, NULL);
      	if (IS_ERR(ai))
    @@ mm/shrinker_debug.c: int shrinker_debugfs_add(struct shrinker *shrinker)
      	shrinker->debugfs_id = id;
      
     -	snprintf(buf, sizeof(buf), "%s-%d", shrinker->name, id);
    -+	STPRINTF(buf, "%s-%d", shrinker->name, id);
    ++	SPRINTF_END(buf, "%s-%d", shrinker->name, id);
      
      	/* create debugfs entry */
      	entry = debugfs_create_dir(buf, shrinker_debugfs_root);
    @@ mm/zswap.c: static struct zswap_pool *zswap_pool_create(char *type, char *compre
      
      	/* unique name for each pool specifically required by zsmalloc */
     -	snprintf(name, 38, "zswap%x", atomic_inc_return(&zswap_pools_count));
    -+	STPRINTF(name, "zswap%x", atomic_inc_return(&zswap_pools_count));
    ++	SPRINTF_END(name, "zswap%x", atomic_inc_return(&zswap_pools_count));
      	pool->zpool = zpool_create_pool(type, name, gfp);
      	if (!pool->zpool) {
      		pr_err("%s zpool not available\n", type);
-- 
2.50.0

alx-0049r2 - add seprintf()

Posted by Alejandro Colomar 2 months, 4 weeks ago

Hi,

Below is a draft of the proposal I'll submit in a few weeks to the
C Committee.


Cheers,
Alex

---
Name
	alx-0049r2 - add seprintf()

Principles
	-  Codify existing practice to address evident deficiencies.
	-  Enable secure programming.

Category
	Standardize existing libc APIs

Author
	Alejandro Colomar <alx@kernel.org>

	Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>

History
	<https://www.alejandro-colomar.es/src/alx/alx/wg14/alx-0049.git/>

	r0 (2025-07-06):
	-  Initial draft.

	r1 (2025-07-06):
	-  wfix.
	-  tfix.
	-  Expand on the off-by-one bugs.
	-  Note that ignoring truncation is not valid most of the time.

	r2 (2025-07-10):
	-  tfix.
	-  Mention SEPRINTF().

Rationale
	snprintf(3) is very difficult to chain for writing parts of a
	string in separate calls, such as in a loop.

	Let's start from the obvious sprintf(3) code (sprintf(3) will
	not prevent overflow, but let's take it as a baseline from which
	programmers start thinking):

		p = buf;
		for (...)
			p += sprintf(p, ...);

	Then, programmers will start thinking about preventing buffer
	overflows.  Programmers sometimes will naively add some buffer
	size information and use snprintf(3):

		p = buf;
		size = countof(buf);
		for (...)
			p += snprintf(p, size - (p - buf), ...);

		if (p >= buf + size)  // or worse, (p > buf + size - 1)
			goto fail;

	(Except for minor differences, this kind of code can be found
	 everywhere.  Here are a couple of examples:
	 <https://elixir.bootlin.com/linux/v6.14/source/mm/slub.c#L7231>
	 <https://elixir.bootlin.com/linux/v6.14/source/mm/mempolicy.c#L3369>.)

	This has several issues, starting with the difficulty of getting
	the second argument right.  Sometimes, programmers will be too
	confused, and slap a -1 there just to be safe.

		p = buf;
		size = countof(buf);
		for (...)
			p += snprintf(p, size - (p - buf) - 1, ...);

		if (p >= buf + size - 1)
			goto fail;

	(Except for minor differences, this kind of code can be found
	 everywhere.  Here are a couple of examples:
	 <https://elixir.bootlin.com/linux/v6.14/source/mm/kfence/kfence_test.c#L113>
	 <https://elixir.bootlin.com/linux/v6.14/source/mm/kmsan/kmsan_test.c#L108>.)

	Programmers will sometimes hold a pointer to one past the last
	element in the array.  This is a wise choice, as that pointer is
	constant throughout the lifetime of the object.  Then,
	programmers might end up with something like this:

		p = buf;
		e = buf + countof(buf);
		for (...)
			p += snprintf(p, e - p, ...);

		if (p >= e)
			goto fail;

	This is certainly much cleaner.  Now a programmer might focus on
	the fact that this can overflow the pointer.  An easy approach
	would be to make sure that the function never returns more than
	the remaining size.  That is, one could implement something like
	this scnprintf() --name chosen to match the Linux kernel API of
	the same name--.  For the sake of simplicity, let's ignore
	multiple evaluation of arguments.

		#define scnprintf(s, size, ...)                 \
		({                                              \
			int len_;                               \
			len_ = snprintf(s, size, __VA_ARGS__);  \
			if (len_ == -1)                         \
				len_ = 0;                       \
			if (len_ >= size)                       \
				len_ = size - 1;                \
		                                                \
			len_;                                   \
		})

		p = buf;
		e = buf + countof(buf);
		for (...)
			p += scnprintf(p, e - p, ...);

	(Except for minor differences, this kind of code can be found
	 everywhere.  Here's an example:
	 <https://elixir.bootlin.com/linux/v6.14/source/mm/kfence/kfence_test.c#L131>.)
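
	To make the pointer overflow concrete, here is a minimal sketch
	(not taken from any of the linked code) of what a naive
	snprintf(3) chain computes after a truncated call.  The function
	name naive_remaining() and the buffer size are hypothetical, and
	an offset is used instead of a pointer so that the sketch itself
	stays within defined behavior.

```c
#include <stddef.h>
#include <stdio.h>

/* Remaining-size argument that a naive chain would pass to its next call. */
size_t naive_remaining(void)
{
	char    buf[8];
	size_t  off = 0;

	/* snprintf(3) returns the length it wanted to write (10), not 7. */
	off += (size_t) snprintf(buf, sizeof(buf), "%s", "0123456789");

	/* 8 - 10 wraps around; the next call would be told a huge size. */
	return sizeof(buf) - off;
}
```

	With pointers instead of offsets, the equivalent of 'off' is a
	'p' that already points past 'e', which is why clamping the
	return value looks attractive at first.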

	Now the programmer got rid of pointer overflow.  However, they
	now have silent truncation that cannot be detected.  In some
	cases this may seem good enough.  However, often it's not.  And
	anyway, some code remains using snprintf(3) to be able to detect
	truncation.

	Moreover, this kind of code ignores the fact that vsnprintf(3)
	can fail internally, in which case there's not even a truncated
	string.  In the kernel, they're fine, because their internal
	vsnprintf() doesn't seem to ever fail, so they can always rely
	on the truncated string.  This is not reliable in projects that
	rely on the libc vsnprintf(3).

	For the code that needs to detect truncation, a programmer might
	choose a different path.  It would keep using snprintf(3), but
	would use a temporary length variable instead of the pointer.

		p = buf;
		e = buf + countof(buf);
		for (...) {
			len = snprintf(p, e - p, ...);
			if (len == -1)
				goto fail;
			if (len >= e - p)
				goto fail;
			p += len;
		}

	This is naturally error-prone.  A colleague of mine --who is an
	excellent programmer, to be clear-- had a bug even after
	knowing about it and having tried to fix it.  That shows how
	hard it is to write this correctly:
	<https://github.com/nginx/unit/pull/734#discussion_r1043963527>

	In a similar fashion, the strlcpy(3) manual page from OpenBSD
	documents a similar issue when chaining calls to strlcpy(3)
	--which was designed with semantics equivalent to snprintf(3),
	except for not formatting the string--:

	|	     char *dir, *file, pname[MAXPATHLEN];
	|	     size_t n;
	|
	|	     ...
	|
	|	     n = strlcpy(pname, dir, sizeof(pname));
	|	     if (n >= sizeof(pname))
	|		     goto toolong;
	|	     if (strlcpy(pname + n, file, sizeof(pname) - n) >= sizeof(pname) - n)
	|		     goto toolong;
	|
	|       However, one may question the validity of such optimiza‐
	|       tions, as they defeat the whole purpose of strlcpy() and
	|       strlcat().  As a matter of fact, the  first  version  of
	|       this manual page got it wrong.

	Finally, a programmer might realize that while this is error-
	prone, this is indeed the right thing to do.  There's no way to
	avoid it.  One could then think of encapsulating this into an
	API that at least would make it easy to write.  Then, one might
	wonder what the right parameters are for such an API.  The only
	immutable thing in the loop is 'e'.  And apart from that, one
	needs to know where to write, which is 'p'.  Let's start with
	those, and try to keep all the other information (size, len)
	without escaping the API.  Again, let's ignore multiple-
	evaluation issues in this macro for the sake of simplicity.

		#define foo(p, e, ...)                                \
		({                                                    \
			int  len_ = snprintf(p, e - p, __VA_ARGS__);  \
			if (len_ == -1)                               \
				p = NULL;                             \
			else if (len_ >= e - p)                       \
				p = NULL;                             \
			else                                          \
				p += len_;                            \
			p;                                            \
		})

		p = buf;
		e = buf + countof(buf);
		for (...) {
			p = foo(p, e, ...);
			if (p == NULL)
				goto fail;
		}

	We've advanced a lot.  We got rid of the buffer overflow; we
	also got rid of the error-prone code at call site.  However, one
	might think that checking for truncation after every call is
	cumbersome.  Indeed, it is possible to slightly tweak the
	internals of foo() to propagate errors from previous calls.

		#define seprintf(p, e, ...)                           \
		({                                                    \
			if (p != NULL) {                              \
				int  len_;                            \
		                                                      \
				len_ = snprintf(p, e - p, __VA_ARGS__); \
				if (len_ == -1)                       \
					p = NULL;                     \
				else if (len_ >= e - p)               \
					p = NULL;                     \
				else                                  \
					p += len_;                    \
			}                                             \
			p;                                            \
		})

		p = buf;
		e = buf + countof(buf);
		for (...)
			p = seprintf(p, e, ...);

		if (p == NULL)
			goto fail;

	By propagating an input null pointer directly to the output of
	the API, which I've called seprintf() --the 'e' refers to the
	'end' pointer, which is the key in this API--, we've allowed
	ignoring null pointers until after the very last call.  If we
	compare our resulting code to the sprintf(3)-based baseline, we
	got --perhaps unsurprisingly-- something quite close to it:

		p = buf;
		for (...)
			p += sprintf(p, ...);

	vs

		p = buf;
		e = buf + countof(buf);
		for (...)
			p = seprintf(p, e, ...);

		if (p == NULL)
			goto fail;

	And the seprintf() version is safe against both truncation and
	buffer overflow.

	For the case where there is only one call to this function (so
	not chained), and the buffer is an array, an even more ergonomic
	wrapper can be written, and it is recommended that projects
	define this macro themselves:

		#define SEPRINTF(a, fmt, ...)  \
			seprintf(a, a + countof(a), fmt __VA_OPT__(,) __VA_ARGS__)

	This adds some safety guarantees that $2 is calculated correctly
	when it can be automated.  Correct use would look like

		if (SEPRINTF(buf, "foo") == NULL)
			goto fail;

	Some important details of the seprintf() API are:

	-  When 'p' is NULL, the API must preserve errno.  This is
	   important to be able to determine the cause of the error
	   after all the chained calls, even when the error occurred in
	   some call in the middle of the chain.

	-  When truncation occurs, a distinct errno value must be used,
	   to signal the programmer that at least the string is reliable
	   to be used as a null-terminated string.  The error code
	   chosen is E2BIG, for compatibility with strscpy(), a Linux
	   kernel internal API with which this API shares many features.

	-  When a hard error (an internal snprintf(3) error) occurs, an
	   error code different than E2BIG must be used.  It is
	   important to set errno, because if an implementation
	   chose to return NULL without setting errno, an old value of
	   E2BIG could lead the programmer to believe the string was
	   successfully written (and truncated), and read it with
	   harmful consequences.
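
	The three points above can be sketched as a function rather
	than a macro.  This is only an illustration under stated
	assumptions: the names x_seprintf(), x_vseprintf(), and
	format_pair() are hypothetical, EINVAL is an arbitrary choice of
	"anything other than E2BIG", and vsnprintf(3) is assumed to set
	errno on failure, as POSIX specifies.

```c
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names (x_ prefix); not the proposed standard functions. */
char *x_vseprintf(char *p, const char *end, const char *fmt, va_list ap)
{
	int        len;
	ptrdiff_t  size;

	if (p == NULL)
		return NULL;            /* earlier failure: propagate, keep errno */

	size = end - p;
	if (size <= 0) {
		errno = EINVAL;         /* assumption: a bad 'end' is a hard error */
		return NULL;
	}

	len = vsnprintf(p, (size_t) size, fmt, ap);
	if (len < 0) {
		if (errno == E2BIG)
			errno = EINVAL; /* a hard error must not look like truncation */
		return NULL;
	}
	if (len >= size) {
		errno = E2BIG;          /* truncated, but still null-terminated */
		return NULL;
	}
	return p + len;
}

char *x_seprintf(char *p, const char *end, const char *fmt, ...)
{
	va_list  ap;

	va_start(ap, fmt);
	p = x_vseprintf(p, end, fmt, ap);
	va_end(ap);
	return p;
}

/* Writes "k=v;" into buf; returns 0, or the errno that broke the chain. */
int format_pair(char *buf, size_t size, const char *k, const char *v)
{
	char        *p = buf;
	const char  *e = buf + size;

	p = x_seprintf(p, e, "%s=", k);
	p = x_seprintf(p, e, "%s;", v);
	if (p == NULL)
		return errno;   /* E2BIG: buf holds a truncated string */
	return 0;
}
```

	A caller of format_pair() can thus tell truncation (E2BIG, the
	buffer is still a valid string) from a hard error (any other
	value, the buffer contents must not be read), with a single
	check at the end of the chain.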

Prior art
	This API is implemented in the shadow-utils project.

	Plan9 designed something quite close, which they call
	seprint(2).  The parameters are the same --the right choice--,
	but they got the semantics for corner cases wrong.  Ironically,
	the existing Plan9 code I've seen seems to expect the semantics
	that I chose, regardless of the actual semantics of the Plan9
	API.  This is --I suspect-- because my semantics are actually
	the intuitive semantics that one would naively guess of an API
	with these parameters and return value.

	I've implemented this API for the Linux kernel, and found and
	fixed an amazing number of bugs and other questionable code in
	just the first handful of files that I inspected.
	<https://lore.kernel.org/linux-hardening/cover.1751747518.git.alx@kernel.org/T/#t>
	<https://lore.kernel.org/linux-hardening/cover.1751823326.git.alx@kernel.org/T/#t>

Future directions
	The 'e = buf + _Countof(buf)' construct is something I've found
	to be quite common.  It would be interesting to have an
	_Endof operator that would return a pointer to one past the last
	element of an array.  It would require an array operand, just
	like _Countof.  If an _Endof operator is deemed too cumbersome
	for implementation, an endof() standard macro that expands to
	the obvious implementation with _Countof could be okay.

	This operator (or operator-like macro) would prevent off-by-one
	bugs when calculating the end sentinel value, such as those
	shown above (with links to Linux kernel real bugs).
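
	As a rough sketch of the macro fallback, assuming no _Countof
	support: the countof() and endof() definitions below are
	hypothetical stand-ins built on sizeof, and unlike a real
	operator they do not reject pointer operands.  The function
	wasted_bytes() is likewise only for illustration; it compares
	the real end sentinel with the "- 1" variant seen in the
	off-by-one bugs.

```c
#include <stddef.h>

/* Hypothetical stand-ins; a real _Countof would reject non-arrays. */
#define countof(a)  (sizeof(a) / sizeof((a)[0]))
#define endof(a)    (&(a)[countof(a)])

/* How many bytes the "- 1" sentinel gives up, compared to the real end. */
ptrdiff_t wasted_bytes(void)
{
	char  buf[16];
	char  *cautious = &buf[sizeof(buf) - 1];  /* off-by-one sentinel */
	char  *end      = endof(buf);             /* one past the last element */

	return end - cautious;
}
```

	Since the end sentinel is derived from the array itself, there
	is no place left to slip in a stray "- 1".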

Proposed wording
	Based on N3550.

    7.24.6  Input/output <stdio.h> :: Formatted input/output functions
	## New section after 7.24.6.6 ("The snprintf function"):

	+7.24.6.6+1  The <b>seprintf</b> function
	+
	+Synopsis
	+1	#include <stdio.h>
	+	char *seprintf(char *restrict p, const char end[0], const char *restrict format, ...);
	+
	+Description
	+2	The <b>$0</b> function
	+	is equivalent to <b>fprintf</b>,
	+	except that the output is written into an array
	+	(specified by argument <tt>p</tt>)
	+	rather than a stream.
	+	If <tt>p</tt> is a null pointer,
	+	nothing is written,
	+	and the function returns a null pointer.
	+	Otherwise,
	+	<tt>end</tt> shall compare greater than <tt>p</tt>;
	+	the function writes at most
	+	<tt>end - p - 1</tt> non-null characters,
	+	the remaining output characters are discarded,
	+	and a null character is written
	+	at the end of the characters
	+	actually written to the array.
	+	If copying takes place between objects that overlap,
	+	the behavior is undefined.
	+
	+Returns
	+3	The <b>$0</b> function returns
	+	a pointer to the terminating null character
	+	if the output was written
	+	without discarding any characters.
	+
	+4
	+	If <tt>p</tt> is a null pointer,
	+	a null pointer is returned,
	+	and <b>errno</b> is not modified.
	+
	+5
	+	If any characters are discarded,
	+	a null pointer is returned,
	+	and the value of the macro <b>E2BIG</b>
	+	is stored in <b>errno</b>.
	+
	+6
	+	If an error occurred,
	+	a null pointer is returned,
	+	and an implementation-defined non-zero value
	+	is stored in <b>errno</b>.

	## New section after 7.24.6.13 ("The vsnprintf function"):

	+7.24.6.13+1  The <b>vseprintf</b> function
	+
	+Synopsis
	+1	#include <stdio.h>
	+	char *vseprintf(char *restrict p, const char end[0], const char *restrict format, va_list arg);
	+
	+Description
	+2	The <b>$0</b> function
	+	is equivalent to
	+	<b>seprintf</b>,
	+	with the varying argument list replaced by <tt>arg</tt>.
	+
	+3
	+	The <tt>va_list</tt> argument to this function
	+	shall have been initialized by the <b>va_start</b> macro
	+	(and possibly subsequent <b>va_arg</b> invocations).
	+	This function does not invoke the <b>va_end</b> macro.343)

    7.33.2  Formatted wide character input/output functions
	## New section after 7.33.2.4 ("The swprintf function"):

	+7.33.2.4+1  The <b>sewprintf</b> function
	+
	+Synopsis
	+1	#include <wchar.h>
	+	wchar_t *sewprintf(wchar_t *restrict p, const wchar_t end[0], const wchar_t *restrict format, ...);
	+
	+Description
	+2	The <b>$0</b> function
	+	is equivalent to
	+	<b>seprintf</b>,
	+	except that it handles wide strings.

	## New section after 7.33.2.8 ("The vswprintf function"):

	+7.33.2.8+1  The <b>vsewprintf</b> function
	+
	+Synopsis
	+1	#include <wchar.h>
	+	wchar_t *vsewprintf(wchar_t *restrict p, const wchar_t end[0], const wchar_t *restrict format, va_list arg);
	+
	+Description
	+2	The <b>$0</b> function
	+	is equivalent to
	+	<b>sewprintf</b>,
	+	with the varying argument list replaced by <tt>arg</tt>.
	+
	+3
	+	The <tt>va_list</tt> argument to this function
	+	shall have been initialized by the <b>va_start</b> macro
	+	(and possibly subsequent <b>va_arg</b> invocations).
	+	This function does not invoke the <b>va_end</b> macro.407)

[RFC v4 1/7] vsprintf: Add [v]sprintf_end()

Posted by Alejandro Colomar 2 months, 4 weeks ago

sprintf_end() is a function similar to stpcpy(3) in the sense that it
returns a pointer that is suitable for chaining to other copy
operations.

It takes a pointer to the end of the buffer as a sentinel for when to
truncate, which unlike a size, doesn't need to be updated after every
call.  This makes it much more ergonomic, avoiding manually calculating
the size after each copy, which is error prone.

It also makes error handling much easier, by reporting truncation with
a null pointer, which is accepted and transparently passed down by
subsequent sprintf_end() calls.  This results in only needing to report
errors once after a chain of sprintf_end() calls, unlike snprintf(3),
which requires checking after every call.

	p = buf;
	e = buf + countof(buf);
	p = sprintf_end(p, e, foo);
	p = sprintf_end(p, e, bar);
	if (p == NULL)
		goto trunc;

vs

	len = 0;
	size = countof(buf);
	len += snprintf(buf + len, size - len, foo);
	if (len >= size)
		goto trunc;

	len += snprintf(buf + len, size - len, bar);
	if (len >= size)
		goto trunc;

And also better than scnprintf() calls:

	len = 0;
	size = countof(buf);
	len += scnprintf(buf + len, size - len, foo);
	len += scnprintf(buf + len, size - len, bar);
	// No ability to check.

It seems apparent that it's a more elegant approach to string catenation.

These functions will soon be proposed for standardization as
[v]seprintf() into C2y, and they exist in Plan9 as seprint(2) --but the
Plan9 implementation has important bugs--.

Link: <https://www.alejandro-colomar.es/src/alx/alx/wg14/alx-0049.git/tree/alx-0049.txt>
Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 include/linux/sprintf.h |  2 ++
 lib/vsprintf.c          | 59 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 61 insertions(+)

diff --git a/include/linux/sprintf.h b/include/linux/sprintf.h
index 51cab2def9ec..a0dc35574521 100644
--- a/include/linux/sprintf.h
+++ b/include/linux/sprintf.h
@@ -13,6 +13,8 @@ __printf(3, 4) int snprintf(char *buf, size_t size, const char *fmt, ...);
 __printf(3, 0) int vsnprintf(char *buf, size_t size, const char *fmt, va_list args);
 __printf(3, 4) int scnprintf(char *buf, size_t size, const char *fmt, ...);
 __printf(3, 0) int vscnprintf(char *buf, size_t size, const char *fmt, va_list args);
+__printf(3, 4) char *sprintf_end(char *p, const char end[0], const char *fmt, ...);
+__printf(3, 0) char *vsprintf_end(char *p, const char end[0], const char *fmt, va_list args);
 __printf(2, 3) __malloc char *kasprintf(gfp_t gfp, const char *fmt, ...);
 __printf(2, 0) __malloc char *kvasprintf(gfp_t gfp, const char *fmt, va_list args);
 __printf(2, 0) const char *kvasprintf_const(gfp_t gfp, const char *fmt, va_list args);
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 01699852f30c..d32df53a713a 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -2923,6 +2923,40 @@ int vscnprintf(char *buf, size_t size, const char *fmt, va_list args)
 }
 EXPORT_SYMBOL(vscnprintf);
 
+/**
+ * vsprintf_end - va_list string end-delimited print formatted
+ * @p: The buffer to place the result into
+ * @end: A pointer to one past the last character in the buffer
+ * @fmt: The format string to use
+ * @args: Arguments for the format string
+ *
+ * The return value is a pointer to the trailing '\0'.
+ * If @p is NULL, the function returns NULL.
+ * If the string is truncated, the function returns NULL.
+ * If @end <= @p, the function returns NULL.
+ *
+ * See the vsnprintf() documentation for format string extensions over C99.
+ */
+char *vsprintf_end(char *p, const char end[0], const char *fmt, va_list args)
+{
+	int len;
+	size_t size;
+
+	if (unlikely(p == NULL))
+		return NULL;
+
+	size = end - p;
+	if (WARN_ON_ONCE(size == 0 || size > INT_MAX))
+		return NULL;
+
+	len = vsnprintf(p, size, fmt, args);
+	if (unlikely(len >= size))
+		return NULL;
+
+	return p + len;
+}
+EXPORT_SYMBOL(vsprintf_end);
+
 /**
  * snprintf - Format a string and place it in a buffer
  * @buf: The buffer to place the result into
@@ -2974,6 +3008,31 @@ int scnprintf(char *buf, size_t size, const char *fmt, ...)
 }
 EXPORT_SYMBOL(scnprintf);
 
+/**
+ * sprintf_end - string end-delimited print formatted
+ * @p: The buffer to place the result into
+ * @end: A pointer to one past the last character in the buffer
+ * @fmt: The format string to use
+ * @...: Arguments for the format string
+ *
+ * The return value is a pointer to the trailing '\0'.
+ * If @p is NULL, the function returns NULL.
+ * If the string is truncated, the function returns NULL.
+ * If @end <= @p, the function returns NULL.
+ */
+
+char *sprintf_end(char *p, const char end[0], const char *fmt, ...)
+{
+	va_list args;
+
+	va_start(args, fmt);
+	p = vsprintf_end(p, end, fmt, args);
+	va_end(args);
+
+	return p;
+}
+EXPORT_SYMBOL(sprintf_end);
+
 /**
  * vsprintf - Format a string and place it in a buffer
  * @buf: The buffer to place the result into
-- 
2.50.0

[RFC v4 2/7] stacktrace, stackdepot: Add sprintf_end()-like variants of functions

Posted by Alejandro Colomar 2 months, 4 weeks ago

Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 include/linux/stackdepot.h | 13 +++++++++++++
 include/linux/stacktrace.h |  3 +++
 kernel/stacktrace.c        | 28 ++++++++++++++++++++++++++++
 lib/stackdepot.c           | 13 +++++++++++++
 4 files changed, 57 insertions(+)

diff --git a/include/linux/stackdepot.h b/include/linux/stackdepot.h
index 2cc21ffcdaf9..76182e874f67 100644
--- a/include/linux/stackdepot.h
+++ b/include/linux/stackdepot.h
@@ -219,6 +219,19 @@ void stack_depot_print(depot_stack_handle_t stack);
 int stack_depot_snprint(depot_stack_handle_t handle, char *buf, size_t size,
 		       int spaces);
 
+/**
+ * stack_depot_sprint_end - Print a stack trace from stack depot into a buffer
+ *
+ * @handle:	Stack depot handle returned from stack_depot_save()
+ * @p:		Pointer to the print buffer
+ * @end:	Pointer to one past the last element in the buffer
+ * @spaces:	Number of leading spaces to print
+ *
+ * Return:	Pointer to trailing '\0'; or NULL on truncation
+ */
+char *stack_depot_sprint_end(depot_stack_handle_t handle, char *p,
+                             const char end[0], int spaces);
+
 /**
  * stack_depot_put - Drop a reference to a stack trace from stack depot
  *
diff --git a/include/linux/stacktrace.h b/include/linux/stacktrace.h
index 97455880ac41..79ada795d479 100644
--- a/include/linux/stacktrace.h
+++ b/include/linux/stacktrace.h
@@ -67,6 +67,9 @@ void stack_trace_print(const unsigned long *trace, unsigned int nr_entries,
 		       int spaces);
 int stack_trace_snprint(char *buf, size_t size, const unsigned long *entries,
 			unsigned int nr_entries, int spaces);
+char *stack_trace_sprint_end(char *p, const char end[0],
+			     const unsigned long *entries,
+			     unsigned int nr_entries, int spaces);
 unsigned int stack_trace_save(unsigned long *store, unsigned int size,
 			      unsigned int skipnr);
 unsigned int stack_trace_save_tsk(struct task_struct *task,
diff --git a/kernel/stacktrace.c b/kernel/stacktrace.c
index afb3c116da91..f389647d8e44 100644
--- a/kernel/stacktrace.c
+++ b/kernel/stacktrace.c
@@ -70,6 +70,34 @@ int stack_trace_snprint(char *buf, size_t size, const unsigned long *entries,
 }
 EXPORT_SYMBOL_GPL(stack_trace_snprint);
 
+/**
+ * stack_trace_sprint_end - Print the entries in the stack trace into a buffer
+ * @p:		Pointer to the print buffer
+ * @end:	Pointer to one past the last element in the buffer
+ * @entries:	Pointer to storage array
+ * @nr_entries:	Number of entries in the storage array
+ * @spaces:	Number of leading spaces to print
+ *
+ * Return: Pointer to the trailing '\0'; or NULL on truncation.
+ */
+char *stack_trace_sprint_end(char *p, const char end[0],
+			  const unsigned long *entries, unsigned int nr_entries,
+			  int spaces)
+{
+	unsigned int i;
+
+	if (WARN_ON(!entries))
+		return NULL;
+
+	for (i = 0; i < nr_entries; i++) {
+		p = sprintf_end(p, end, "%*c%pS\n", 1 + spaces, ' ',
+			     (void *)entries[i]);
+	}
+
+	return p;
+}
+EXPORT_SYMBOL_GPL(stack_trace_sprint_end);
+
 #ifdef CONFIG_ARCH_STACKWALK
 
 struct stacktrace_cookie {
diff --git a/lib/stackdepot.c b/lib/stackdepot.c
index 73d7b50924ef..48e5c0ff37e8 100644
--- a/lib/stackdepot.c
+++ b/lib/stackdepot.c
@@ -771,6 +771,19 @@ int stack_depot_snprint(depot_stack_handle_t handle, char *buf, size_t size,
 }
 EXPORT_SYMBOL_GPL(stack_depot_snprint);
 
+char *stack_depot_sprint_end(depot_stack_handle_t handle, char *p,
+			     const char end[0], int spaces)
+{
+	unsigned long *entries;
+	unsigned int nr_entries;
+
+	nr_entries = stack_depot_fetch(handle, &entries);
+	return nr_entries ?
+		stack_trace_sprint_end(p, end, entries, nr_entries, spaces)
+		: sprintf_end(p, end, "");
+}
+EXPORT_SYMBOL_GPL(stack_depot_sprint_end);
+
 depot_stack_handle_t __must_check stack_depot_set_extra_bits(
 			depot_stack_handle_t handle, unsigned int extra_bits)
 {
-- 
2.50.0

[RFC v4 3/7] mm: Use sprintf_end() instead of less ergonomic APIs

Posted by Alejandro Colomar 2 months, 4 weeks ago

While doing this, I detected some anomalies in the existing code:

mm/kfence/kfence_test.c:

	-  The last call to scnprintf() did increment 'cur', but it's
	   unused after that, so it was dead code.  I've removed the dead
	   code in this patch.

	-  'end' is calculated as

		end = &expect[0][sizeof(expect[0]) - 1];

	   However, the '-1' doesn't seem to be necessary.  When calling
	   scnprintf(), the size ($2) was specified as 'end - cur', and
	   scnprintf(), just like snprintf(3), won't write more than $2
	   bytes (including the null byte).  That means that
	   scnprintf() wouldn't write more than

		&expect[0][sizeof(expect[0]) - 1] - expect[0]

	   which simplifies to

		sizeof(expect[0]) - 1

	   bytes.  But we have sizeof(expect[0]) bytes available, so
	   we're wasting one byte entirely.  This is a benign off-by-one
	   bug.  The two occurrences of this bug will be fixed in a
	   following patch in this series.

mm/kmsan/kmsan_test.c:

	The same benign off-by-one bug calculating the remaining size.

mm/mempolicy.c:

	This file uses the 'p += snprintf()' anti-pattern.  That will
	overflow the pointer on truncation, which has undefined
	behavior.  Using sprintf_end(), this bug is fixed.

	As in the previous file, here there was also dead code in the
	last scnprintf() call, by incrementing a pointer that is not
	used after the call.  I've removed the dead code.

mm/page_owner.c:

	Within print_page_owner(), there are some calls to scnprintf(),
	which don't report truncation.  And then there are other calls to
	snprintf(), where we handle errors (there are two 'goto err').

	I've kept the existing error handling, as I trust it's there for
	a good reason (i.e., we may want to avoid calling
	print_page_owner_memcg() if we truncated before).  Please review
	if this amount of error handling is the right one, or if we want
	to add or remove some.  For sprintf_end(), a single test for
	null after the last call is enough to detect truncation.

mm/slub.c:

	Again, the 'p += snprintf()' anti-pattern.  This is UB, and by
	using sprintf_end() we've fixed the bug.
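
To make the contrast with 'p += snprintf()' concrete, here is a minimal
user-space sketch, not the kernel implementation.  It assumes a hosted C
library with vsnprintf(3); the names vsprintf_end_sketch(),
sprintf_end_sketch(), and format_policy() are made up for illustration.
It shows calls chained through a possibly-NULL pointer, with a single
truncation check after the last call.

```c
#include <assert.h>
#include <limits.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* User-space stand-in for the proposed vsprintf_end(); an assumption,
 * not the code from lib/vsprintf.c. */
char *vsprintf_end_sketch(char *p, const char *end, const char *fmt,
			  va_list args)
{
	int len;
	size_t size;

	if (p == NULL)		/* an earlier call already truncated */
		return NULL;
	if (end <= p)		/* no room left at all */
		return NULL;

	size = (size_t)(end - p);
	if (size > (size_t)INT_MAX)
		return NULL;

	len = vsnprintf(p, size, fmt, args);
	if (len < 0 || (size_t)len >= size)
		return NULL;	/* output error or truncation */

	return p + len;		/* points at the trailing '\0' */
}

char *sprintf_end_sketch(char *p, const char *end, const char *fmt, ...)
{
	va_list args;

	assert(fmt != NULL);
	va_start(args, fmt);
	p = vsprintf_end_sketch(p, end, fmt, args);
	va_end(args);

	return p;
}

/* Hypothetical caller: builds "<mode>=<flag>" and checks once.
 * Returns the string length, or -1 if anything was truncated. */
int format_policy(char *buf, size_t size, const char *mode,
		  const char *flag)
{
	char *p = buf;
	char *e = buf + size;

	p = sprintf_end_sketch(p, e, "%s", mode);
	p = sprintf_end_sketch(p, e, "=");
	p = sprintf_end_sketch(p, e, "%s", flag);

	return p == NULL ? -1 : (int)(p - buf);
}
```

With a 32-byte buffer, format_policy(buf, 32, "interleave", "static")
stores "interleave=static" and returns 17.  With an 8-byte buffer the
first call truncates, every later call sees NULL and does nothing, and
the helper returns -1.  The pointer never advances past 'e'.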

Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Marco Elver <elver@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Chao Yu <chao.yu@oppo.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 mm/kfence/kfence_test.c | 24 ++++++++++++------------
 mm/kmsan/kmsan_test.c   |  4 ++--
 mm/mempolicy.c          | 18 +++++++++---------
 mm/page_owner.c         | 32 +++++++++++++++++---------------
 mm/slub.c               |  5 +++--
 5 files changed, 43 insertions(+), 40 deletions(-)

diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
index 00034e37bc9f..bae382eca4ab 100644
--- a/mm/kfence/kfence_test.c
+++ b/mm/kfence/kfence_test.c
@@ -113,26 +113,26 @@ static bool report_matches(const struct expect_report *r)
 	end = &expect[0][sizeof(expect[0]) - 1];
 	switch (r->type) {
 	case KFENCE_ERROR_OOB:
-		cur += scnprintf(cur, end - cur, "BUG: KFENCE: out-of-bounds %s",
+		cur = sprintf_end(cur, end, "BUG: KFENCE: out-of-bounds %s",
 				 get_access_type(r));
 		break;
 	case KFENCE_ERROR_UAF:
-		cur += scnprintf(cur, end - cur, "BUG: KFENCE: use-after-free %s",
+		cur = sprintf_end(cur, end, "BUG: KFENCE: use-after-free %s",
 				 get_access_type(r));
 		break;
 	case KFENCE_ERROR_CORRUPTION:
-		cur += scnprintf(cur, end - cur, "BUG: KFENCE: memory corruption");
+		cur = sprintf_end(cur, end, "BUG: KFENCE: memory corruption");
 		break;
 	case KFENCE_ERROR_INVALID:
-		cur += scnprintf(cur, end - cur, "BUG: KFENCE: invalid %s",
+		cur = sprintf_end(cur, end, "BUG: KFENCE: invalid %s",
 				 get_access_type(r));
 		break;
 	case KFENCE_ERROR_INVALID_FREE:
-		cur += scnprintf(cur, end - cur, "BUG: KFENCE: invalid free");
+		cur = sprintf_end(cur, end, "BUG: KFENCE: invalid free");
 		break;
 	}
 
-	scnprintf(cur, end - cur, " in %pS", r->fn);
+	sprintf_end(cur, end, " in %pS", r->fn);
 	/* The exact offset won't match, remove it; also strip module name. */
 	cur = strchr(expect[0], '+');
 	if (cur)
@@ -144,26 +144,26 @@ static bool report_matches(const struct expect_report *r)
 
 	switch (r->type) {
 	case KFENCE_ERROR_OOB:
-		cur += scnprintf(cur, end - cur, "Out-of-bounds %s at", get_access_type(r));
+		cur = sprintf_end(cur, end, "Out-of-bounds %s at", get_access_type(r));
 		addr = arch_kfence_test_address(addr);
 		break;
 	case KFENCE_ERROR_UAF:
-		cur += scnprintf(cur, end - cur, "Use-after-free %s at", get_access_type(r));
+		cur = sprintf_end(cur, end, "Use-after-free %s at", get_access_type(r));
 		addr = arch_kfence_test_address(addr);
 		break;
 	case KFENCE_ERROR_CORRUPTION:
-		cur += scnprintf(cur, end - cur, "Corrupted memory at");
+		cur = sprintf_end(cur, end, "Corrupted memory at");
 		break;
 	case KFENCE_ERROR_INVALID:
-		cur += scnprintf(cur, end - cur, "Invalid %s at", get_access_type(r));
+		cur = sprintf_end(cur, end, "Invalid %s at", get_access_type(r));
 		addr = arch_kfence_test_address(addr);
 		break;
 	case KFENCE_ERROR_INVALID_FREE:
-		cur += scnprintf(cur, end - cur, "Invalid free of");
+		cur = sprintf_end(cur, end, "Invalid free of");
 		break;
 	}
 
-	cur += scnprintf(cur, end - cur, " 0x%p", (void *)addr);
+	sprintf_end(cur, end, " 0x%p", (void *)addr);
 
 	spin_lock_irqsave(&observed.lock, flags);
 	if (!report_available())
diff --git a/mm/kmsan/kmsan_test.c b/mm/kmsan/kmsan_test.c
index 9733a22c46c1..e48ca1972ff3 100644
--- a/mm/kmsan/kmsan_test.c
+++ b/mm/kmsan/kmsan_test.c
@@ -107,9 +107,9 @@ static bool report_matches(const struct expect_report *r)
 	cur = expected_header;
 	end = &expected_header[sizeof(expected_header) - 1];
 
-	cur += scnprintf(cur, end - cur, "BUG: KMSAN: %s", r->error_type);
+	cur = sprintf_end(cur, end, "BUG: KMSAN: %s", r->error_type);
 
-	scnprintf(cur, end - cur, " in %s", r->symbol);
+	sprintf_end(cur, end, " in %s", r->symbol);
 	/* The exact offset won't match, remove it; also strip module name. */
 	cur = strchr(expected_header, '+');
 	if (cur)
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index b28a1e6ae096..6beb2710f97c 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -3359,6 +3359,7 @@ int mpol_parse_str(char *str, struct mempolicy **mpol)
 void mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol)
 {
 	char *p = buffer;
+	char *e = buffer + maxlen;
 	nodemask_t nodes = NODE_MASK_NONE;
 	unsigned short mode = MPOL_DEFAULT;
 	unsigned short flags = 0;
@@ -3384,33 +3385,32 @@ void mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol)
 		break;
 	default:
 		WARN_ON_ONCE(1);
-		snprintf(p, maxlen, "unknown");
+		sprintf_end(p, e, "unknown");
 		return;
 	}
 
-	p += snprintf(p, maxlen, "%s", policy_modes[mode]);
+	p = sprintf_end(p, e, "%s", policy_modes[mode]);
 
 	if (flags & MPOL_MODE_FLAGS) {
-		p += snprintf(p, buffer + maxlen - p, "=");
+		p = sprintf_end(p, e, "=");
 
 		/*
 		 * Static and relative are mutually exclusive.
 		 */
 		if (flags & MPOL_F_STATIC_NODES)
-			p += snprintf(p, buffer + maxlen - p, "static");
+			p = sprintf_end(p, e, "static");
 		else if (flags & MPOL_F_RELATIVE_NODES)
-			p += snprintf(p, buffer + maxlen - p, "relative");
+			p = sprintf_end(p, e, "relative");
 
 		if (flags & MPOL_F_NUMA_BALANCING) {
 			if (!is_power_of_2(flags & MPOL_MODE_FLAGS))
-				p += snprintf(p, buffer + maxlen - p, "|");
-			p += snprintf(p, buffer + maxlen - p, "balancing");
+				p = sprintf_end(p, e, "|");
+			p = sprintf_end(p, e, "balancing");
 		}
 	}
 
 	if (!nodes_empty(nodes))
-		p += scnprintf(p, buffer + maxlen - p, ":%*pbl",
-			       nodemask_pr_args(&nodes));
+		sprintf_end(p, e, ":%*pbl", nodemask_pr_args(&nodes));
 }
 
 #ifdef CONFIG_SYSFS
diff --git a/mm/page_owner.c b/mm/page_owner.c
index cc4a6916eec6..c00b3be01540 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -496,7 +496,7 @@ void pagetypeinfo_showmixedcount_print(struct seq_file *m,
 /*
  * Looking for memcg information and print it out
  */
-static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
+static inline char *print_page_owner_memcg(char *p, const char end[0],
 					 struct page *page)
 {
 #ifdef CONFIG_MEMCG
@@ -511,8 +511,7 @@ static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
 		goto out_unlock;
 
 	if (memcg_data & MEMCG_DATA_OBJEXTS)
-		ret += scnprintf(kbuf + ret, count - ret,
-				"Slab cache page\n");
+		p = sprintf_end(p, end, "Slab cache page\n");
 
 	memcg = page_memcg_check(page);
 	if (!memcg)
@@ -520,7 +519,7 @@ static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
 
 	online = (memcg->css.flags & CSS_ONLINE);
 	cgroup_name(memcg->css.cgroup, name, sizeof(name));
-	ret += scnprintf(kbuf + ret, count - ret,
+	p = sprintf_end(p, end,
 			"Charged %sto %smemcg %s\n",
 			PageMemcgKmem(page) ? "(via objcg) " : "",
 			online ? "" : "offline ",
@@ -529,7 +528,7 @@ static inline int print_page_owner_memcg(char *kbuf, size_t count, int ret,
 	rcu_read_unlock();
 #endif /* CONFIG_MEMCG */
 
-	return ret;
+	return p;
 }
 
 static ssize_t
@@ -538,14 +537,16 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
 		depot_stack_handle_t handle)
 {
 	int ret, pageblock_mt, page_mt;
-	char *kbuf;
+	char *kbuf, *p, *e;
 
 	count = min_t(size_t, count, PAGE_SIZE);
 	kbuf = kmalloc(count, GFP_KERNEL);
 	if (!kbuf)
 		return -ENOMEM;
 
-	ret = scnprintf(kbuf, count,
+	p = kbuf;
+	e = kbuf + count;
+	p = sprintf_end(p, e,
 			"Page allocated via order %u, mask %#x(%pGg), pid %d, tgid %d (%s), ts %llu ns\n",
 			page_owner->order, page_owner->gfp_mask,
 			&page_owner->gfp_mask, page_owner->pid,
@@ -555,7 +556,7 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
 	/* Print information relevant to grouping pages by mobility */
 	pageblock_mt = get_pageblock_migratetype(page);
 	page_mt  = gfp_migratetype(page_owner->gfp_mask);
-	ret += scnprintf(kbuf + ret, count - ret,
+	p = sprintf_end(p, e,
 			"PFN 0x%lx type %s Block %lu type %s Flags %pGp\n",
 			pfn,
 			migratetype_names[page_mt],
@@ -563,22 +564,23 @@ print_page_owner(char __user *buf, size_t count, unsigned long pfn,
 			migratetype_names[pageblock_mt],
 			&page->flags);
 
-	ret += stack_depot_snprint(handle, kbuf + ret, count - ret, 0);
-	if (ret >= count)
-		goto err;
+	p = stack_depot_sprint_end(handle, p, e, 0);
+	if (p == NULL)
+		goto err;  // XXX: Should we remove this error handling?
 
 	if (page_owner->last_migrate_reason != -1) {
-		ret += scnprintf(kbuf + ret, count - ret,
+		p = sprintf_end(p, e,
 			"Page has been migrated, last migrate reason: %s\n",
 			migrate_reason_names[page_owner->last_migrate_reason]);
 	}
 
-	ret = print_page_owner_memcg(kbuf, count, ret, page);
+	p = print_page_owner_memcg(p, e, page);
 
-	ret += snprintf(kbuf + ret, count - ret, "\n");
-	if (ret >= count)
+	p = sprintf_end(p, e, "\n");
+	if (p == NULL)
 		goto err;
 
+	ret = p - kbuf;
 	if (copy_to_user(buf, kbuf, ret))
 		ret = -EFAULT;
 
diff --git a/mm/slub.c b/mm/slub.c
index be8b09e09d30..dcc857676857 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -7451,6 +7451,7 @@ static char *create_unique_id(struct kmem_cache *s)
 {
 	char *name = kmalloc(ID_STR_LENGTH, GFP_KERNEL);
 	char *p = name;
+	char *e = name + ID_STR_LENGTH;
 
 	if (!name)
 		return ERR_PTR(-ENOMEM);
@@ -7475,9 +7476,9 @@ static char *create_unique_id(struct kmem_cache *s)
 		*p++ = 'A';
 	if (p != name + 1)
 		*p++ = '-';
-	p += snprintf(p, ID_STR_LENGTH - (p - name), "%07u", s->size);
+	p = sprintf_end(p, e, "%07u", s->size);
 
-	if (WARN_ON(p > name + ID_STR_LENGTH - 1)) {
+	if (WARN_ON(p == NULL)) {
 		kfree(name);
 		return ERR_PTR(-EINVAL);
 	}
-- 
2.50.0

[RFC v4 4/7] array_size.h: Add ENDOF()

Posted by Alejandro Colomar 2 months, 4 weeks ago

This macro is useful to calculate the second argument to sprintf_end(),
avoiding off-by-one bugs.
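
As an illustration only, here is a user-space sketch of the idea.  It
assumes a plain ARRAY_SIZE-style macro without the kernel's
__must_be_array() check; ARRAY_SIZE_SKETCH(), ENDOF_SKETCH(), and
room() are hypothetical names.  It compares the room given by the
macro with the room given by the hand-written '- 1' form.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; the kernel's ARRAY_SIZE() additionally
 * rejects non-array arguments at compile time. */
#define ARRAY_SIZE_SKETCH(arr)  (sizeof(arr) / sizeof((arr)[0]))
#define ENDOF_SKETCH(a)         ((a) + ARRAY_SIZE_SKETCH(a))

/* Hypothetical helper: bytes available between cur and end. */
ptrdiff_t room(const char *cur, const char *end)
{
	assert(cur <= end);
	return end - cur;
}
```

For 'char expect[2][16]', room(expect[0], ENDOF_SKETCH(expect[0])) is
16, while the hand-written &expect[0][sizeof(expect[0]) - 1] gives 15.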

Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 include/linux/array_size.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/array_size.h b/include/linux/array_size.h
index 06d7d83196ca..781bdb70d939 100644
--- a/include/linux/array_size.h
+++ b/include/linux/array_size.h
@@ -10,4 +10,10 @@
  */
 #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))
 
+/**
+ * ENDOF - get a pointer to one past the last element in array @a
+ * @a: array
+ */
+#define ENDOF(a)  ((a) + ARRAY_SIZE(a))
+
 #endif  /* _LINUX_ARRAY_SIZE_H */
-- 
2.50.0

[RFC v4 5/7] mm: Fix benign off-by-one bugs

Posted by Alejandro Colomar 2 months, 4 weeks ago

We were wasting a byte due to an off-by-one bug.  s[c]nprintf()
doesn't write more than $2 bytes including the null byte, so trying to
pass 'size-1' there is wasting one byte.  Now that we use sprintf_end(),
the situation isn't different: sprintf_end() will stop writing *before*
'end' --that is, at most the terminating null byte will be written at
'end-1'--.
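
The wasted byte can be observed from user space with snprintf(3),
which has the same size semantics.  This is a small sketch;
stored_chars() is a hypothetical helper, not something from the tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format s into buf with size_arg passed as $2, and return how many
 * characters ended up stored, not counting the terminating '\0'. */
size_t stored_chars(char *buf, size_t size_arg, const char *s)
{
	int len;

	assert(buf != NULL);
	len = snprintf(buf, size_arg, "%s", s);
	if (len < 0 || size_arg == 0)
		return 0;

	/* On truncation, snprintf() keeps size_arg - 1 characters. */
	return (size_t)len < size_arg ? (size_t)len : size_arg - 1;
}
```

With 'char buf[8]' and a 7-character string, passing sizeof(buf)
stores all 7 characters and puts the null byte in buf[7].  Passing
sizeof(buf) - 1 stores only 6, with the null byte in buf[6], and buf[7]
is never used.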

Acked-by: Marco Elver <elver@google.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 mm/kfence/kfence_test.c | 4 ++--
 mm/kmsan/kmsan_test.c   | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/kfence/kfence_test.c b/mm/kfence/kfence_test.c
index bae382eca4ab..c635aa9d478b 100644
--- a/mm/kfence/kfence_test.c
+++ b/mm/kfence/kfence_test.c
@@ -110,7 +110,7 @@ static bool report_matches(const struct expect_report *r)
 
 	/* Title */
 	cur = expect[0];
-	end = &expect[0][sizeof(expect[0]) - 1];
+	end = ENDOF(expect[0]);
 	switch (r->type) {
 	case KFENCE_ERROR_OOB:
 		cur = sprintf_end(cur, end, "BUG: KFENCE: out-of-bounds %s",
@@ -140,7 +140,7 @@ static bool report_matches(const struct expect_report *r)
 
 	/* Access information */
 	cur = expect[1];
-	end = &expect[1][sizeof(expect[1]) - 1];
+	end = ENDOF(expect[1]);
 
 	switch (r->type) {
 	case KFENCE_ERROR_OOB:
diff --git a/mm/kmsan/kmsan_test.c b/mm/kmsan/kmsan_test.c
index e48ca1972ff3..9bda55992e3d 100644
--- a/mm/kmsan/kmsan_test.c
+++ b/mm/kmsan/kmsan_test.c
@@ -105,7 +105,7 @@ static bool report_matches(const struct expect_report *r)
 
 	/* Title */
 	cur = expected_header;
-	end = &expected_header[sizeof(expected_header) - 1];
+	end = ENDOF(expected_header);
 
 	cur = sprintf_end(cur, end, "BUG: KMSAN: %s", r->error_type);
 
-- 
2.50.0

[RFC v4 6/7] sprintf: Add [V]SPRINTF_END()

Posted by Alejandro Colomar 2 months, 4 weeks ago

These macros take the end of the array argument implicitly to avoid
programmer mistakes.  This guarantees that the input is an array, unlike

	snprintf(buf, sizeof(buf), ...);

which is dangerous if the programmer passes a pointer instead of an
array.

These macros are essentially the same as the 2-argument version of
strscpy(), but with a formatted string, and returning a pointer to the
terminating '\0' (or NULL, on error).
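
A sketch of the pitfall, assuming GNU C extensions (__typeof__ and
__builtin_types_compatible_p) and C11 _Static_assert.
MUST_BE_ARRAY_SKETCH() and size_seen_through_pointer() are
hypothetical names, written in the spirit of the kernel's
__must_be_array() but not copied from it.

```c
#include <assert.h>
#include <stddef.h>

/* Compile-time check: fails to build if 'a' is a pointer rather than
 * an array (GNU C builtins; an assumption of this sketch). */
#define MUST_BE_ARRAY_SKETCH(a)						\
	_Static_assert(!__builtin_types_compatible_p(__typeof__(a),	\
						     __typeof__(&(a)[0])), \
		       "argument must be an array")

/* What sizeof() reports once the array has decayed to a pointer. */
size_t size_seen_through_pointer(char *buf)
{
	assert(buf != NULL);
	return sizeof(buf);	/* sizeof(char *), not the array length */
}
```

For 'char name[64]', sizeof(name) is 64 in the scope where the array is
declared, but a callee that receives 'char *buf' and writes
snprintf(buf, sizeof(buf), ...) only gets sizeof(char *).  A macro that
performs a check like MUST_BE_ARRAY_SKETCH(buf) would refuse to compile
in that callee.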

Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 include/linux/sprintf.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/linux/sprintf.h b/include/linux/sprintf.h
index a0dc35574521..33eb03d0b9b8 100644
--- a/include/linux/sprintf.h
+++ b/include/linux/sprintf.h
@@ -4,6 +4,10 @@
 
 #include <linux/compiler_attributes.h>
 #include <linux/types.h>
+#include <linux/array_size.h>
+
+#define SPRINTF_END(a, fmt, ...)  sprintf_end(a, ENDOF(a), fmt, ##__VA_ARGS__)
+#define VSPRINTF_END(a, fmt, ap)  vsprintf_end(a, ENDOF(a), fmt, ap)
 
 int num_to_str(char *buf, int size, unsigned long long num, unsigned int width);
 
-- 
2.50.0

Re: [RFC v4 6/7] sprintf: Add [V]SPRINTF_END()

Posted by Linus Torvalds 2 months, 4 weeks ago

On Wed, 9 Jul 2025 at 19:49, Alejandro Colomar <alx@kernel.org> wrote:
>
> +#define SPRINTF_END(a, fmt, ...)  sprintf_end(a, ENDOF(a), fmt, ##__VA_ARGS__)
> +#define VSPRINTF_END(a, fmt, ap)  vsprintf_end(a, ENDOF(a), fmt, ap)

So I like vsprintf_end() more as a name ("like more" not being "I love
it", but at least it makes me think it's a bit more self-explanatory).

But I don't love screaming macros. They historically scream because
they are unsafe, but they shouldn't be unsafe in the first place.

And I don't think those [V]SPRINTF_END() and ENDOF() macros are unsafe
- they use our ARRAY_SIZE() macro which does not evaluate the
argument, only the type, and is safe to use.

So honestly, this interface looks easy to use, but the screaming must stop.

And none of this has *anything* to do with "end" in this form anyway.

IOW, why isn't this just

  #define sprintf_array(a,...) snprintf(a, ARRAY_SIZE(a), __VA_ARGS__)

which is simpler and more direct, doesn't use the "end" version that
is pointless (it's _literally_ about the size of the array, so
'snprintf' is the right thing to use), doesn't scream, and has a
rather self-explanatory name.

Naming matters.

                Linus

Re: [RFC v4 6/7] sprintf: Add [V]SPRINTF_END()

Posted by Alejandro Colomar 2 months, 4 weeks ago

Hi Linus,

On Thu, Jul 10, 2025 at 08:52:13AM -0700, Linus Torvalds wrote:
> On Wed, 9 Jul 2025 at 19:49, Alejandro Colomar <alx@kernel.org> wrote:
> >
> > +#define SPRINTF_END(a, fmt, ...)  sprintf_end(a, ENDOF(a), fmt, ##__VA_ARGS__)
> > +#define VSPRINTF_END(a, fmt, ap)  vsprintf_end(a, ENDOF(a), fmt, ap)
> 
> So I like vsprintf_end() more as a name ("like more" not being "I love
> it", but at least it makes me think it's a bit more self-explanatory).

:-)

> But I don't love screaming macros. They historically scream because
> they are unsafe, but they shouldn't be unsafe in the first place.
> 
> And I don't think those [V]SPRINTF_END() and ENDOF() macros are unsafe
> - they use our ARRAY_SIZE() macro which does not evaluate the
> argument, only the type, and is safe to use.

Yup, it's safe to use.

> So honestly, this interface looks easy to use, but the screaming must stop.
> 
> And none of this has *anything* to do with "end" in this form anyway.

The same thought went through my head while doing it, but I didn't
think of a better name.

In shadow, many functions have an uppercase macro version that
computes array sizes and adds other safety measures where it can.
(So there, the uppercase versions mean extra safety, instead of the
historical "there be dragons".  I use the uppercase to mean "this does
some magic to be safer".)

> IOW, why isn't this just
> 
>   #define sprintf_array(a,...) snprintf(a, ARRAY_SIZE(a), __VA_ARGS__)

Agree.  This is a better name for the kernel.

> which is simpler and more direct, doesn't use the "end" version that
> is pointless (it's _literally_ about the size of the array, so
> 'snprintf' is the right thing to use),

I disagree with snprintf(3), not because of the input, but because
of the output.  I think an API similar to strscpy() would be
better, so it can return an error code for truncation.  In fact, up to
v2, I had a stprintf() (T for truncation) that did exactly that.
However, I found out I could do the same with sprintf_end(), which would
mean one less function to grok, which is why I dropped that part.

I'll use your suggested name, as I like it.  Expect v5 in a few minutes.

> doesn't scream, and has a
> rather self-explanatory name.
> 
> Naming matters.

+1

Have a lovely day!
Alex

> 
>                 Linus

-- 
<https://www.alejandro-colomar.es/>

Re: [RFC v4 6/7] sprintf: Add [V]SPRINTF_END()

Posted by Alejandro Colomar 2 months, 4 weeks ago

Hi Linus,

On Thu, Jul 10, 2025 at 08:30:59PM +0200, Alejandro Colomar wrote:
> > IOW, why isn't this just
> > 
> >   #define sprintf_array(a,...) snprintf(a, ARRAY_SIZE(a), __VA_ARGS__)
> 
> Agree.  This is a better name for the kernel.

Oops, I misread.  I thought you were implementing it as

	#define sprintf_array(a, ...)  sprintf_end(a, ENDOF(a), __VA_ARGS__)

So, I prefer my implementation because it returns NULL on truncation.
Compare usage:

	if (linus_sprintf_array(a, "foo") >= ARRAY_SIZE(a))
		goto fail;

	if (alex_sprintf_array(a, "foo") == NULL)
		goto fail;

Another approach would be to have

	if (third_sprintf_array(a, "foo") < 0)  // -E2BIG
		goto fail;

Which was my first approach, but since we have sprintf_end(), let's just
reuse it.


Cheers,
Alex

-- 
<https://www.alejandro-colomar.es/>

Re: [RFC v4 6/7] sprintf: Add [V]SPRINTF_END()

Posted by Linus Torvalds 2 months, 4 weeks ago

On Thu, 10 Jul 2025 at 14:21, Alejandro Colomar <alx@kernel.org> wrote:
>
> So, I prefer my implementation because it returns NULL on truncation.

As I pointed out, your implementation is WRONG.

If you want to return an error on truncation, do it right.  Not by
returning NULL, but by actually returning an error.

For example, in the kernel, we finally fixed 'strcpy()'. After about a
million different versions of 'copy a string' where every single
version was complete garbage, we ended up with 'strscpy()'. Yeah, the
name isn't lovely, but the *use* of it is:

 - it returns the length of the result for people who want it - which
is by far the most common thing people want

 - it returns an actual honest-to-goodness error code if something
overflowed, instead of the absolutely horrible "source length" of the
string that strlcpy() does and which is fundamentally broken (because
it requires that you walk *past* the end of the source,
Christ-on-a-stick what a broken interface)

 - it can take an array as an argument (without the need for another
name - see my earlier argument about not making up new names by just
having generics)

Now, it has nasty naming (exactly the kind of 'add random character'
naming that I was arguing against), and that comes from so many
different broken versions until we hit on something that works.

strncpy is horrible garbage. strlcpy is even worse. strscpy actually
works and so far hasn't caused issues (there's a 'pad' version for the
very rare situation where you want 'strncpy-like' padding, but it
still guarantees NUL-termination, and still has a good return value).
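
For readers who haven't used it, the return-value contract described
above can be imitated in user space roughly as follows.  This is a
hedged sketch, not the kernel's strscpy(): strscpy_sketch() is a
hypothetical name, it returns long rather than ssize_t, and it has
neither the array-argument form nor any of the kernel's optimizations.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Copy src into dst (size bytes available).  Returns the length of
 * the result, or -E2BIG if it had to truncate.  Never reads src
 * beyond the bytes it copies, and always NUL-terminates when
 * size > 0. */
long strscpy_sketch(char *dst, const char *src, size_t size)
{
	size_t len;

	assert(dst != NULL && src != NULL);
	if (size == 0)
		return -E2BIG;

	for (len = 0; len < size && src[len] != '\0'; len++)
		dst[len] = src[len];

	if (len == size) {		/* ran out of room */
		dst[size - 1] = '\0';
		return -E2BIG;
	}

	dst[len] = '\0';
	return (long)len;
}
```

With 'char dst[4]', copying "abc" returns 3, while copying "abcdef"
stores "abc" and returns -E2BIG, so a caller can test 'ret < 0' instead
of comparing against the buffer size.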

Let's agree to *not* make horrible garbage when making up new versions
of sprintf.

             Linus

[RFC v4 7/7] mm: Use [V]SPRINTF_END() to avoid specifying the array size

Posted by Alejandro Colomar 2 months, 4 weeks ago

Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Marco Elver <elver@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
---
 mm/backing-dev.c    | 2 +-
 mm/cma.c            | 4 ++--
 mm/cma_debug.c      | 2 +-
 mm/hugetlb.c        | 3 +--
 mm/hugetlb_cgroup.c | 2 +-
 mm/hugetlb_cma.c    | 2 +-
 mm/kasan/report.c   | 3 +--
 mm/memblock.c       | 4 ++--
 mm/percpu.c         | 2 +-
 mm/shrinker_debug.c | 2 +-
 mm/zswap.c          | 2 +-
 11 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 783904d8c5ef..20a75fd9f205 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -1090,7 +1090,7 @@ int bdi_register_va(struct backing_dev_info *bdi, const char *fmt, va_list args)
 	if (bdi->dev)	/* The driver needs to use separate queues per device */
 		return 0;
 
-	vsnprintf(bdi->dev_name, sizeof(bdi->dev_name), fmt, args);
+	VSPRINTF_END(bdi->dev_name, fmt, args);
 	dev = device_create(&bdi_class, NULL, MKDEV(0, 0), bdi, bdi->dev_name);
 	if (IS_ERR(dev))
 		return PTR_ERR(dev);
diff --git a/mm/cma.c b/mm/cma.c
index c04be488b099..05f8f036b811 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -237,9 +237,9 @@ static int __init cma_new_area(const char *name, phys_addr_t size,
 	cma_area_count++;
 
 	if (name)
-		snprintf(cma->name, CMA_MAX_NAME, "%s", name);
+		SPRINTF_END(cma->name, "%s", name);
 	else
-		snprintf(cma->name, CMA_MAX_NAME,  "cma%d\n", cma_area_count);
+		SPRINTF_END(cma->name, "cma%d\n", cma_area_count);
 
 	cma->available_count = cma->count = size >> PAGE_SHIFT;
 	cma->order_per_bit = order_per_bit;
diff --git a/mm/cma_debug.c b/mm/cma_debug.c
index fdf899532ca0..6df439b400c1 100644
--- a/mm/cma_debug.c
+++ b/mm/cma_debug.c
@@ -186,7 +186,7 @@ static void cma_debugfs_add_one(struct cma *cma, struct dentry *root_dentry)
 	rangedir = debugfs_create_dir("ranges", tmp);
 	for (r = 0; r < cma->nranges; r++) {
 		cmr = &cma->ranges[r];
-		snprintf(rdirname, sizeof(rdirname), "%d", r);
+		SPRINTF_END(rdirname, "%d", r);
 		dir = debugfs_create_dir(rdirname, rangedir);
 		debugfs_create_file("base_pfn", 0444, dir,
 			    &cmr->base_pfn, &cma_debugfs_fops);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 6a3cf7935c14..2e6aa3efafb2 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4780,8 +4780,7 @@ void __init hugetlb_add_hstate(unsigned int order)
 	for (i = 0; i < MAX_NUMNODES; ++i)
 		INIT_LIST_HEAD(&h->hugepage_freelists[i]);
 	INIT_LIST_HEAD(&h->hugepage_activelist);
-	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
-					huge_page_size(h)/SZ_1K);
+	SPRINTF_END(h->name, "hugepages-%lukB", huge_page_size(h)/SZ_1K);
 
 	parsed_hstate = h;
 }
diff --git a/mm/hugetlb_cgroup.c b/mm/hugetlb_cgroup.c
index 58e895f3899a..4b5330ff9cef 100644
--- a/mm/hugetlb_cgroup.c
+++ b/mm/hugetlb_cgroup.c
@@ -822,7 +822,7 @@ hugetlb_cgroup_cfttypes_init(struct hstate *h, struct cftype *cft,
 	for (i = 0; i < tmpl_size; cft++, tmpl++, i++) {
 		*cft = *tmpl;
 		/* rebuild the name */
-		snprintf(cft->name, MAX_CFTYPE_NAME, "%s.%s", buf, tmpl->name);
+		SPRINTF_END(cft->name, "%s.%s", buf, tmpl->name);
 		/* rebuild the private */
 		cft->private = MEMFILE_PRIVATE(idx, tmpl->private);
 		/* rebuild the file_offset */
diff --git a/mm/hugetlb_cma.c b/mm/hugetlb_cma.c
index e0f2d5c3a84c..6bccad5b4216 100644
--- a/mm/hugetlb_cma.c
+++ b/mm/hugetlb_cma.c
@@ -211,7 +211,7 @@ void __init hugetlb_cma_reserve(int order)
 
 		size = round_up(size, PAGE_SIZE << order);
 
-		snprintf(name, sizeof(name), "hugetlb%d", nid);
+		SPRINTF_END(name, "hugetlb%d", nid);
 		/*
 		 * Note that 'order per bit' is based on smallest size that
 		 * may be returned to CMA allocator in the case of
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 8357e1a33699..c2c9bef78edf 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -486,8 +486,7 @@ static void print_memory_metadata(const void *addr)
 		char buffer[4 + (BITS_PER_LONG / 8) * 2];
 		char metadata[META_BYTES_PER_ROW];
 
-		snprintf(buffer, sizeof(buffer),
-				(i == 0) ? ">%px: " : " %px: ", row);
+		SPRINTF_END(buffer, (i == 0) ? ">%px: " : " %px: ", row);
 
 		/*
 		 * We should not pass a shadow pointer to generic
diff --git a/mm/memblock.c b/mm/memblock.c
index 0e9ebb8aa7fe..6bb21aacb15d 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -2021,7 +2021,7 @@ static void __init_memblock memblock_dump(struct memblock_type *type)
 		flags = rgn->flags;
 #ifdef CONFIG_NUMA
 		if (numa_valid_node(memblock_get_region_node(rgn)))
-			snprintf(nid_buf, sizeof(nid_buf), " on node %d",
+			SPRINTF_END(nid_buf, " on node %d",
 				 memblock_get_region_node(rgn));
 #endif
 		pr_info(" %s[%#x]\t[%pa-%pa], %pa bytes%s flags: %#x\n",
@@ -2379,7 +2379,7 @@ int reserve_mem_release_by_name(const char *name)
 
 	start = phys_to_virt(map->start);
 	end = start + map->size - 1;
-	snprintf(buf, sizeof(buf), "reserve_mem:%s", name);
+	SPRINTF_END(buf, "reserve_mem:%s", name);
 	free_reserved_area(start, end, 0, buf);
 	map->size = 0;
 
diff --git a/mm/percpu.c b/mm/percpu.c
index b35494c8ede2..efe5d1517a96 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -3186,7 +3186,7 @@ int __init pcpu_page_first_chunk(size_t reserved_size, pcpu_fc_cpu_to_node_fn_t
 	int upa;
 	int nr_g0_units;
 
-	snprintf(psize_str, sizeof(psize_str), "%luK", PAGE_SIZE >> 10);
+	SPRINTF_END(psize_str, "%luK", PAGE_SIZE >> 10);
 
 	ai = pcpu_build_alloc_info(reserved_size, 0, PAGE_SIZE, NULL);
 	if (IS_ERR(ai))
diff --git a/mm/shrinker_debug.c b/mm/shrinker_debug.c
index 20eaee3e97f7..9a6e959882c6 100644
--- a/mm/shrinker_debug.c
+++ b/mm/shrinker_debug.c
@@ -176,7 +176,7 @@ int shrinker_debugfs_add(struct shrinker *shrinker)
 		return id;
 	shrinker->debugfs_id = id;
 
-	snprintf(buf, sizeof(buf), "%s-%d", shrinker->name, id);
+	SPRINTF_END(buf, "%s-%d", shrinker->name, id);
 
 	/* create debugfs entry */
 	entry = debugfs_create_dir(buf, shrinker_debugfs_root);
diff --git a/mm/zswap.c b/mm/zswap.c
index 204fb59da33c..7a8041f84e18 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -271,7 +271,7 @@ static struct zswap_pool *zswap_pool_create(char *type, char *compressor)
 		return NULL;
 
 	/* unique name for each pool specifically required by zsmalloc */
-	snprintf(name, 38, "zswap%x", atomic_inc_return(&zswap_pools_count));
+	SPRINTF_END(name, "zswap%x", atomic_inc_return(&zswap_pools_count));
 	pool->zpool = zpool_create_pool(type, name, gfp);
 	if (!pool->zpool) {
 		pr_err("%s zpool not available\n", type);
-- 
2.50.0
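
For readers following the conversions above: every hunk replaces an snprintf() call that took an explicit size (CMA_MAX_NAME, sizeof(buf), the literal 38 in zswap) with a macro that takes only the array. The definition of SPRINTF_END() is not part of this hunk, so the following is only a userspace sketch of how such a macro could derive the end of the buffer from the array itself. All demo_*/DEMO_* names are hypothetical, and returning NULL on truncation is an assumption, not necessarily the kernel's behaviour.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helpers for illustration; not the kernel's definitions. */
#define DEMO_ARRAY_SIZE(a)  (sizeof(a) / sizeof((a)[0]))
#define DEMO_ENDOF(a)       ((a) + DEMO_ARRAY_SIZE(a))

/*
 * Format into [p, end).  Returns a pointer to the terminating NUL on
 * success, or NULL if the output was truncated or the arguments were
 * invalid (assumed semantics).
 */
static char *demo_sprintf_end(char *p, const char *end, const char *fmt, ...)
{
	va_list ap;
	size_t size;
	int len;

	if (p == NULL || p >= end)
		return NULL;

	size = (size_t)(end - p);

	va_start(ap, fmt);
	len = vsnprintf(p, size, fmt, ap);
	va_end(ap);

	/* vsnprintf() has already NUL-terminated the truncated string. */
	if (len < 0 || (size_t)len >= size)
		return NULL;

	return p + len;
}

/* The argument must be a real array, not a pointer, for sizeof to work. */
#define DEMO_SPRINTF_END(a, fmt, ...) \
	demo_sprintf_end(a, DEMO_ENDOF(a), fmt, __VA_ARGS__)
```

With this shape the caller cannot pass a size that disagrees with the buffer, which is the class of mistake the explicit-size calls in the hunks above leave open. The sketch only works when the destination is an array in scope; a plain char pointer would silently yield the wrong size here unless the real macro adds a compile-time array check.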