include/linux/array_size.h | 6 ++++ include/linux/sprintf.h | 6 ++++ include/linux/stackdepot.h | 13 +++++++++ include/linux/stacktrace.h | 3 ++ kernel/stacktrace.c | 28 ++++++++++++++++++ lib/stackdepot.c | 13 +++++++++ lib/vsprintf.c | 59 ++++++++++++++++++++++++++++++++++++++ mm/backing-dev.c | 2 +- mm/cma.c | 4 +-- mm/cma_debug.c | 2 +- mm/hugetlb.c | 3 +- mm/hugetlb_cgroup.c | 2 +- mm/hugetlb_cma.c | 2 +- mm/kasan/report.c | 3 +- mm/kfence/kfence_test.c | 28 +++++++++--------- mm/kmsan/kmsan_test.c | 6 ++-- mm/memblock.c | 4 +-- mm/mempolicy.c | 18 ++++++------ mm/page_owner.c | 32 +++++++++++---------- mm/percpu.c | 2 +- mm/shrinker_debug.c | 2 +- mm/slub.c | 5 ++-- mm/zswap.c | 2 +- 23 files changed, 187 insertions(+), 58 deletions(-)
Hi,
Changes in v4:
- Added to global CC everyone who participated in the discussion so
far.
- Rename seprintf() => sprintf_end().
- Implement SPRINTF_END().
- Drop stprintf(). We don't need it as an intermediate helper.
- Link to the draft of a standards proposal (which I'll paste as a
reply to this mail again).
- Minor fixes or updates to commit messages.
- Added Marco Elver's Acked-by: tag in commit 5/7.
- In stack_depot_sprint_end(), do sprintf_end(p, end, "") when
nr_entries is 0, to guarantee that the string is valid if this is the
first s*printf() call in a row.
- Document sprintf_end() as 'string end-delimited print formatted'.
This spells the letters in seprintf() for their meaning, in case
anyone thinks the letters are randomly chosen. :)
- Remove comment about vsnprintf(3) not failing in the kernel, after
Rasmus commented this is QoI guaranteed by the kernel.
Remaining questions:
- There are only 3 remaining calls to snprintf(3) under mm/. They are
just fine for now, which is why I didn't replace them. If anyone
wants to replace them, to get rid of all snprintf(3), we could that.
I think for now we can leave them, to minimize the churn.
$ grep -rnI snprintf mm/
mm/hugetlb_cgroup.c:674: snprintf(buf, size, "%luGB", hsize / SZ_1G);
mm/hugetlb_cgroup.c:676: snprintf(buf, size, "%luMB", hsize / SZ_1M);
mm/hugetlb_cgroup.c:678: snprintf(buf, size, "%luKB", hsize / SZ_1K);
- There are only 2 remaining calls to the kernel's scnprintf(). This
one I would really like to get rid of. Also, those calls are quite
suspicious of not being what we want. Please do have a look at them
and confirm what's the appropriate behavior in the 2 cases when the
string is truncated or not copied at all. That code is very scary
for me to try to guess.
$ grep -rnI scnprintf mm/
mm/kfence/report.c:75: int len = scnprintf(buf, sizeof(buf), "%ps", (void *)stack_entries[skipnr]);
mm/kfence/kfence_test.mod.c:22: { 0x96848186, "scnprintf" },
mm/kmsan/report.c:42: len = scnprintf(buf, sizeof(buf), "%ps",
Apart from two calls, I see a string literal with that name. Please
let me know if I should do anything about it. I don't know what that
is.
- I think we should remove one error handling check in
"mm/page_owner.c" (marked with an XXX comment), but I'm not 100%
sure. Please confirm.
Other comments:
- This is still not complying to coding style. I'll keep it like that
while questions remain open.
- I've tested the tests under CONFIG_KFENCE_KUNIT_TEST=y, and this has
no regressions at all.
- With the current style of the sprintf_end() prototyope, this triggers
a diagnostic due to a GCC bug:
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108036>
It would be interesting to ask GCC to fix that bug. (Added relevant
GCC maintainers and contributors to CC in this cover letter.)
Have a lovely night!
Alex
Alejandro Colomar (7):
vsprintf: Add [v]sprintf_end()
stacktrace, stackdepot: Add sprintf_end()-like variants of functions
mm: Use sprintf_end() instead of less ergonomic APIs
array_size.h: Add ENDOF()
mm: Fix benign off-by-one bugs
sprintf: Add [V]SPRINTF_END()
mm: Use [V]SPRINTF_END() to avoid specifying the array size
include/linux/array_size.h | 6 ++++
include/linux/sprintf.h | 6 ++++
include/linux/stackdepot.h | 13 +++++++++
include/linux/stacktrace.h | 3 ++
kernel/stacktrace.c | 28 ++++++++++++++++++
lib/stackdepot.c | 13 +++++++++
lib/vsprintf.c | 59 ++++++++++++++++++++++++++++++++++++++
mm/backing-dev.c | 2 +-
mm/cma.c | 4 +--
mm/cma_debug.c | 2 +-
mm/hugetlb.c | 3 +-
mm/hugetlb_cgroup.c | 2 +-
mm/hugetlb_cma.c | 2 +-
mm/kasan/report.c | 3 +-
mm/kfence/kfence_test.c | 28 +++++++++---------
mm/kmsan/kmsan_test.c | 6 ++--
mm/memblock.c | 4 +--
mm/mempolicy.c | 18 ++++++------
mm/page_owner.c | 32 +++++++++++----------
mm/percpu.c | 2 +-
mm/shrinker_debug.c | 2 +-
mm/slub.c | 5 ++--
mm/zswap.c | 2 +-
23 files changed, 187 insertions(+), 58 deletions(-)
Range-diff against v3:
1: 64334f0b94d6 ! 1: 2c4f793de0b8 vsprintf: Add [v]seprintf(), [v]stprintf()
@@ Metadata
Author: Alejandro Colomar <alx@kernel.org>
## Commit message ##
- vsprintf: Add [v]seprintf(), [v]stprintf()
+ vsprintf: Add [v]sprintf_end()
- seprintf()
- ==========
-
- seprintf() is a function similar to stpcpy(3) in the sense that it
+ sprintf_end() is a function similar to stpcpy(3) in the sense that it
returns a pointer that is suitable for chaining to other copy
operations.
@@ Commit message
It also makes error handling much easier, by reporting truncation with
a null pointer, which is accepted and transparently passed down by
- subsequent seprintf() calls. This results in only needing to report
- errors once after a chain of seprintf() calls, unlike snprintf(3), which
- requires checking after every call.
+ subsequent sprintf_end() calls. This results in only needing to report
+ errors once after a chain of sprintf_end() calls, unlike snprintf(3),
+ which requires checking after every call.
p = buf;
e = buf + countof(buf);
- p = seprintf(p, e, foo);
- p = seprintf(p, e, bar);
+ p = sprintf_end(p, e, foo);
+ p = sprintf_end(p, e, bar);
if (p == NULL)
goto trunc;
@@ Commit message
size = countof(buf);
len += scnprintf(buf + len, size - len, foo);
len += scnprintf(buf + len, size - len, bar);
- if (len >= size)
- goto trunc;
+ // No ability to check.
It seems aparent that it's a more elegant approach to string catenation.
- stprintf()
- ==========
-
- stprintf() is a helper that is needed for implementing seprintf()
- --although it could be open-coded within vseprintf(), of course--, but
- it's also useful by itself. It has the same interface properties as
- strscpy(): that is, it copies with truncation, and reports truncation
- with -E2BIG. It would be useful to replace some calls to snprintf(3)
- and scnprintf() which don't need chaining, and where it's simpler to
- pass a size.
-
- It is better than plain snprintf(3), because it results in simpler error
- detection (it doesn't need a check >=countof(buf), but rather <0).
+ These functions will soon be proposed for standardization as
+ [v]seprintf() into C2y, and they exist in Plan9 as seprint(2) --but the
+ Plan9 implementation has important bugs--.
+ Link: <https://www.alejandro-colomar.es/src/alx/alx/wg14/alx-0049.git/tree/alx-0049.txt>
Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
+ Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
+ Cc: Marco Elver <elver@google.com>
+ Cc: Michal Hocko <mhocko@suse.com>
+ Cc: Linus Torvalds <torvalds@linux-foundation.org>
+ Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
## include/linux/sprintf.h ##
-@@ include/linux/sprintf.h: __printf(2, 3) int sprintf(char *buf, const char * fmt, ...);
- __printf(2, 0) int vsprintf(char *buf, const char *, va_list);
- __printf(3, 4) int snprintf(char *buf, size_t size, const char *fmt, ...);
+@@ include/linux/sprintf.h: __printf(3, 4) int snprintf(char *buf, size_t size, const char *fmt, ...);
__printf(3, 0) int vsnprintf(char *buf, size_t size, const char *fmt, va_list args);
-+__printf(3, 4) int stprintf(char *buf, size_t size, const char *fmt, ...);
-+__printf(3, 0) int vstprintf(char *buf, size_t size, const char *fmt, va_list args);
__printf(3, 4) int scnprintf(char *buf, size_t size, const char *fmt, ...);
__printf(3, 0) int vscnprintf(char *buf, size_t size, const char *fmt, va_list args);
-+__printf(3, 4) char *seprintf(char *p, const char end[0], const char *fmt, ...);
-+__printf(3, 0) char *vseprintf(char *p, const char end[0], const char *fmt, va_list args);
++__printf(3, 4) char *sprintf_end(char *p, const char end[0], const char *fmt, ...);
++__printf(3, 0) char *vsprintf_end(char *p, const char end[0], const char *fmt, va_list args);
__printf(2, 3) __malloc char *kasprintf(gfp_t gfp, const char *fmt, ...);
__printf(2, 0) __malloc char *kvasprintf(gfp_t gfp, const char *fmt, va_list args);
__printf(2, 0) const char *kvasprintf_const(gfp_t gfp, const char *fmt, va_list args);
## lib/vsprintf.c ##
-@@ lib/vsprintf.c: int vsnprintf(char *buf, size_t size, const char *fmt_str, va_list args)
- }
- EXPORT_SYMBOL(vsnprintf);
-
-+/**
-+ * vstprintf - Format a string and place it in a buffer
-+ * @buf: The buffer to place the result into
-+ * @size: The size of the buffer, including the trailing null space
-+ * @fmt: The format string to use
-+ * @args: Arguments for the format string
-+ *
-+ * The return value is the length of the new string.
-+ * If the string is truncated, the function returns -E2BIG.
-+ *
-+ * If you're not already dealing with a va_list consider using stprintf().
-+ *
-+ * See the vsnprintf() documentation for format string extensions over C99.
-+ */
-+int vstprintf(char *buf, size_t size, const char *fmt, va_list args)
-+{
-+ int len;
-+
-+ len = vsnprintf(buf, size, fmt, args);
-+
-+ // It seems the kernel's vsnprintf() doesn't fail?
-+ //if (unlikely(len < 0))
-+ // return -E2BIG;
-+
-+ if (unlikely(len >= size))
-+ return -E2BIG;
-+
-+ return len;
-+}
-+EXPORT_SYMBOL(vstprintf);
-+
- /**
- * vscnprintf - Format a string and place it in a buffer
- * @buf: The buffer to place the result into
@@ lib/vsprintf.c: int vscnprintf(char *buf, size_t size, const char *fmt, va_list args)
}
EXPORT_SYMBOL(vscnprintf);
+/**
-+ * vseprintf - Format a string and place it in a buffer
++ * vsprintf_end - va_list string end-delimited print formatted
+ * @p: The buffer to place the result into
+ * @end: A pointer to one past the last character in the buffer
+ * @fmt: The format string to use
@@ lib/vsprintf.c: int vscnprintf(char *buf, size_t size, const char *fmt, va_list
+ * The return value is a pointer to the trailing '\0'.
+ * If @p is NULL, the function returns NULL.
+ * If the string is truncated, the function returns NULL.
-+ *
-+ * If you're not already dealing with a va_list consider using seprintf().
++ * If @end <= @p, the function returns NULL.
+ *
+ * See the vsnprintf() documentation for format string extensions over C99.
+ */
-+char *vseprintf(char *p, const char end[0], const char *fmt, va_list args)
++char *vsprintf_end(char *p, const char end[0], const char *fmt, va_list args)
+{
+ int len;
++ size_t size;
+
+ if (unlikely(p == NULL))
+ return NULL;
+
-+ len = vstprintf(p, end - p, fmt, args);
-+ if (unlikely(len < 0))
++ size = end - p;
++ if (WARN_ON_ONCE(size == 0 || size > INT_MAX))
++ return NULL;
++
++ len = vsnprintf(p, size, fmt, args);
++ if (unlikely(len >= size))
+ return NULL;
+
+ return p + len;
+}
-+EXPORT_SYMBOL(vseprintf);
++EXPORT_SYMBOL(vsprintf_end);
+
/**
* snprintf - Format a string and place it in a buffer
* @buf: The buffer to place the result into
-@@ lib/vsprintf.c: int snprintf(char *buf, size_t size, const char *fmt, ...)
- }
- EXPORT_SYMBOL(snprintf);
-
-+/**
-+ * stprintf - Format a string and place it in a buffer
-+ * @buf: The buffer to place the result into
-+ * @size: The size of the buffer, including the trailing null space
-+ * @fmt: The format string to use
-+ * @...: Arguments for the format string
-+ *
-+ * The return value is the length of the new string.
-+ * If the string is truncated, the function returns -E2BIG.
-+ */
-+
-+int stprintf(char *buf, size_t size, const char *fmt, ...)
-+{
-+ va_list args;
-+ int len;
-+
-+ va_start(args, fmt);
-+ len = vstprintf(buf, size, fmt, args);
-+ va_end(args);
-+
-+ return len;
-+}
-+EXPORT_SYMBOL(stprintf);
-+
- /**
- * scnprintf - Format a string and place it in a buffer
- * @buf: The buffer to place the result into
@@ lib/vsprintf.c: int scnprintf(char *buf, size_t size, const char *fmt, ...)
}
EXPORT_SYMBOL(scnprintf);
+/**
-+ * seprintf - Format a string and place it in a buffer
++ * sprintf_end - string end-delimited print formatted
+ * @p: The buffer to place the result into
+ * @end: A pointer to one past the last character in the buffer
+ * @fmt: The format string to use
@@ lib/vsprintf.c: int scnprintf(char *buf, size_t size, const char *fmt, ...)
+ * The return value is a pointer to the trailing '\0'.
+ * If @buf is NULL, the function returns NULL.
+ * If the string is truncated, the function returns NULL.
++ * If @end <= @p, the function returns NULL.
+ */
+
-+char *seprintf(char *p, const char end[0], const char *fmt, ...)
++char *sprintf_end(char *p, const char end[0], const char *fmt, ...)
+{
+ va_list args;
+
+ va_start(args, fmt);
-+ p = vseprintf(p, end, fmt, args);
++ p = vsprintf_end(p, end, fmt, args);
+ va_end(args);
+
+ return p;
+}
-+EXPORT_SYMBOL(seprintf);
++EXPORT_SYMBOL(sprintf_end);
+
/**
* vsprintf - Format a string and place it in a buffer
2: 9c140de9842d ! 2: 894d02b08056 stacktrace, stackdepot: Add seprintf()-like variants of functions
@@ Metadata
Author: Alejandro Colomar <alx@kernel.org>
## Commit message ##
- stacktrace, stackdepot: Add seprintf()-like variants of functions
-
- I think there's an anomaly in stack_depot_s*print(). If we have zero
- entries, we don't copy anything, which means the string is still not a
- string. Normally, this function is called surrounded by other calls to
- s*printf(), which guarantee that there's a '\0', but maybe we should
- make sure to write a '\0' here?
+ stacktrace, stackdepot: Add sprintf_end()-like variants of functions
Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
+ Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
+ Cc: Marco Elver <elver@google.com>
+ Cc: Michal Hocko <mhocko@suse.com>
+ Cc: Linus Torvalds <torvalds@linux-foundation.org>
+ Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
## include/linux/stackdepot.h ##
@@ include/linux/stackdepot.h: void stack_depot_print(depot_stack_handle_t stack);
int spaces);
+/**
-+ * stack_depot_seprint - Print a stack trace from stack depot into a buffer
++ * stack_depot_sprint_end - Print a stack trace from stack depot into a buffer
+ *
+ * @handle: Stack depot handle returned from stack_depot_save()
+ * @p: Pointer to the print buffer
@@ include/linux/stackdepot.h: void stack_depot_print(depot_stack_handle_t stack);
+ *
+ * Return: Pointer to trailing '\0'; or NULL on truncation
+ */
-+char *stack_depot_seprint(depot_stack_handle_t handle, char *p,
-+ const char end[0], int spaces);
++char *stack_depot_sprint_end(depot_stack_handle_t handle, char *p,
++ const char end[0], int spaces);
+
/**
* stack_depot_put - Drop a reference to a stack trace from stack depot
@@ include/linux/stacktrace.h: void stack_trace_print(const unsigned long *trace, u
int spaces);
int stack_trace_snprint(char *buf, size_t size, const unsigned long *entries,
unsigned int nr_entries, int spaces);
-+char *stack_trace_seprint(char *p, const char end[0],
-+ const unsigned long *entries, unsigned int nr_entries,
-+ int spaces);
++char *stack_trace_sprint_end(char *p, const char end[0],
++ const unsigned long *entries,
++ unsigned int nr_entries, int spaces);
unsigned int stack_trace_save(unsigned long *store, unsigned int size,
unsigned int skipnr);
unsigned int stack_trace_save_tsk(struct task_struct *task,
@@ kernel/stacktrace.c: int stack_trace_snprint(char *buf, size_t size, const unsig
EXPORT_SYMBOL_GPL(stack_trace_snprint);
+/**
-+ * stack_trace_seprint - Print the entries in the stack trace into a buffer
++ * stack_trace_sprint_end - Print the entries in the stack trace into a buffer
+ * @p: Pointer to the print buffer
+ * @end: Pointer to one past the last element in the buffer
+ * @entries: Pointer to storage array
@@ kernel/stacktrace.c: int stack_trace_snprint(char *buf, size_t size, const unsig
+ *
+ * Return: Pointer to the trailing '\0'; or NULL on truncation.
+ */
-+char *stack_trace_seprint(char *p, const char end[0],
++char *stack_trace_sprint_end(char *p, const char end[0],
+ const unsigned long *entries, unsigned int nr_entries,
+ int spaces)
+{
@@ kernel/stacktrace.c: int stack_trace_snprint(char *buf, size_t size, const unsig
+ return 0;
+
+ for (i = 0; i < nr_entries; i++) {
-+ p = seprintf(p, end, "%*c%pS\n", 1 + spaces, ' ',
++ p = sprintf_end(p, end, "%*c%pS\n", 1 + spaces, ' ',
+ (void *)entries[i]);
+ }
+
+ return p;
+}
-+EXPORT_SYMBOL_GPL(stack_trace_seprint);
++EXPORT_SYMBOL_GPL(stack_trace_sprint_end);
+
#ifdef CONFIG_ARCH_STACKWALK
@@ lib/stackdepot.c: int stack_depot_snprint(depot_stack_handle_t handle, char *buf
}
EXPORT_SYMBOL_GPL(stack_depot_snprint);
-+char *stack_depot_seprint(depot_stack_handle_t handle, char *p,
-+ const char end[0], int spaces)
++char *stack_depot_sprint_end(depot_stack_handle_t handle, char *p,
++ const char end[0], int spaces)
+{
+ unsigned long *entries;
+ unsigned int nr_entries;
+
+ nr_entries = stack_depot_fetch(handle, &entries);
-+ return nr_entries ? stack_trace_seprint(p, end, entries, nr_entries,
-+ spaces) : p;
++ return nr_entries ?
++ stack_trace_sprint_end(p, end, entries, nr_entries, spaces)
++ : sprintf_end(p, end, "");
+}
-+EXPORT_SYMBOL_GPL(stack_depot_seprint);
++EXPORT_SYMBOL_GPL(stack_depot_sprint_end);
+
depot_stack_handle_t __must_check stack_depot_set_extra_bits(
depot_stack_handle_t handle, unsigned int extra_bits)
3: 033bf00f1fcf ! 3: 690ed4d22f57 mm: Use seprintf() instead of less ergonomic APIs
@@ Metadata
Author: Alejandro Colomar <alx@kernel.org>
## Commit message ##
- mm: Use seprintf() instead of less ergonomic APIs
+ mm: Use sprintf_end() instead of less ergonomic APIs
While doing this, I detected some anomalies in the existing code:
@@ Commit message
This file uses the 'p += snprintf()' anti-pattern. That will
overflow the pointer on truncation, which has undefined
- behavior. Using seprintf(), this bug is fixed.
+ behavior. Using sprintf_end(), this bug is fixed.
As in the previous file, here there was also dead code in the
last scnprintf() call, by incrementing a pointer that is not
@@ Commit message
a good reason (i.e., we may want to avoid calling
print_page_owner_memcg() if we truncated before). Please review
if this amount of error handling is the right one, or if we want
- to add or remove some. For seprintf(), a single test for null
- after the last call is enough to detect truncation.
+ to add or remove some. For sprintf_end(), a single test for
+ null after the last call is enough to detect truncation.
mm/slub.c:
Again, the 'p += snprintf()' anti-pattern. This is UB, and by
- using seprintf() we've fixed the bug.
+ using sprintf_end() we've fixed the bug.
- Fixes: f99e12b21b84 (2021-07-30; "kfence: add function to mask address bits")
- [alx: that commit introduced dead code]
- Fixes: af649773fb25 (2024-07-17; "mm/numa_balancing: teach mpol_to_str about the balancing mode")
- [alx: that commit added p+=snprintf() calls, which are UB]
- Fixes: 2291990ab36b (2008-04-28; "mempolicy: clean-up mpol-to-str() mempolicy formatting")
- [alx: that commit changed p+=sprintf() into p+=snprintf(), which is still UB]
- Fixes: 948927ee9e4f (2013-11-13; "mm, mempolicy: make mpol_to_str robust and always succeed")
- [alx: that commit changes old code into p+=snprintf(), which is still UB]
- [alx: that commit also produced dead code by leaving the last 'p+=...']
- Fixes: d65360f22406 (2022-09-26; "mm/slub: clean up create_unique_id()")
- [alx: that commit changed p+=sprintf() into p+=snprintf(), which is still UB]
Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Marco Elver <elver@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
- Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
- Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Chao Yu <chao.yu@oppo.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
+ Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
+ Cc: Michal Hocko <mhocko@suse.com>
+ Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
## mm/kfence/kfence_test.c ##
@@ mm/kfence/kfence_test.c: static bool report_matches(const struct expect_report *
switch (r->type) {
case KFENCE_ERROR_OOB:
- cur += scnprintf(cur, end - cur, "BUG: KFENCE: out-of-bounds %s",
-+ cur = seprintf(cur, end, "BUG: KFENCE: out-of-bounds %s",
++ cur = sprintf_end(cur, end, "BUG: KFENCE: out-of-bounds %s",
get_access_type(r));
break;
case KFENCE_ERROR_UAF:
- cur += scnprintf(cur, end - cur, "BUG: KFENCE: use-after-free %s",
-+ cur = seprintf(cur, end, "BUG: KFENCE: use-after-free %s",
++ cur = sprintf_end(cur, end, "BUG: KFENCE: use-after-free %s",
get_access_type(r));
break;
case KFENCE_ERROR_CORRUPTION:
- cur += scnprintf(cur, end - cur, "BUG: KFENCE: memory corruption");
-+ cur = seprintf(cur, end, "BUG: KFENCE: memory corruption");
++ cur = sprintf_end(cur, end, "BUG: KFENCE: memory corruption");
break;
case KFENCE_ERROR_INVALID:
- cur += scnprintf(cur, end - cur, "BUG: KFENCE: invalid %s",
-+ cur = seprintf(cur, end, "BUG: KFENCE: invalid %s",
++ cur = sprintf_end(cur, end, "BUG: KFENCE: invalid %s",
get_access_type(r));
break;
case KFENCE_ERROR_INVALID_FREE:
- cur += scnprintf(cur, end - cur, "BUG: KFENCE: invalid free");
-+ cur = seprintf(cur, end, "BUG: KFENCE: invalid free");
++ cur = sprintf_end(cur, end, "BUG: KFENCE: invalid free");
break;
}
- scnprintf(cur, end - cur, " in %pS", r->fn);
-+ seprintf(cur, end, " in %pS", r->fn);
++ sprintf_end(cur, end, " in %pS", r->fn);
/* The exact offset won't match, remove it; also strip module name. */
cur = strchr(expect[0], '+');
if (cur)
@@ mm/kfence/kfence_test.c: static bool report_matches(const struct expect_report *
switch (r->type) {
case KFENCE_ERROR_OOB:
- cur += scnprintf(cur, end - cur, "Out-of-bounds %s at", get_access_type(r));
-+ cur = seprintf(cur, end, "Out-of-bounds %s at", get_access_type(r));
++ cur = sprintf_end(cur, end, "Out-of-bounds %s at", get_access_type(r));
addr = arch_kfence_test_address(addr);
break;
case KFENCE_ERROR_UAF:
- cur += scnprintf(cur, end - cur, "Use-after-free %s at", get_access_type(r));
-+ cur = seprintf(cur, end, "Use-after-free %s at", get_access_type(r));
++ cur = sprintf_end(cur, end, "Use-after-free %s at", get_access_type(r));
addr = arch_kfence_test_address(addr);
break;
case KFENCE_ERROR_CORRUPTION:
- cur += scnprintf(cur, end - cur, "Corrupted memory at");
-+ cur = seprintf(cur, end, "Corrupted memory at");
++ cur = sprintf_end(cur, end, "Corrupted memory at");
break;
case KFENCE_ERROR_INVALID:
- cur += scnprintf(cur, end - cur, "Invalid %s at", get_access_type(r));
-+ cur = seprintf(cur, end, "Invalid %s at", get_access_type(r));
++ cur = sprintf_end(cur, end, "Invalid %s at", get_access_type(r));
addr = arch_kfence_test_address(addr);
break;
case KFENCE_ERROR_INVALID_FREE:
- cur += scnprintf(cur, end - cur, "Invalid free of");
-+ cur = seprintf(cur, end, "Invalid free of");
++ cur = sprintf_end(cur, end, "Invalid free of");
break;
}
- cur += scnprintf(cur, end - cur, " 0x%p", (void *)addr);
-+ seprintf(cur, end, " 0x%p", (void *)addr);
++ sprintf_end(cur, end, " 0x%p", (void *)addr);
spin_lock_irqsave(&observed.lock, flags);
if (!report_available())
@@ mm/kmsan/kmsan_test.c: static bool report_matches(const struct expect_report *r)
end = &expected_header[sizeof(expected_header) - 1];
- cur += scnprintf(cur, end - cur, "BUG: KMSAN: %s", r->error_type);
-+ cur = seprintf(cur, end, "BUG: KMSAN: %s", r->error_type);
++ cur = sprintf_end(cur, end, "BUG: KMSAN: %s", r->error_type);
- scnprintf(cur, end - cur, " in %s", r->symbol);
-+ seprintf(cur, end, " in %s", r->symbol);
++ sprintf_end(cur, end, " in %s", r->symbol);
/* The exact offset won't match, remove it; also strip module name. */
cur = strchr(expected_header, '+');
if (cur)
@@ mm/mempolicy.c: void mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol
default:
WARN_ON_ONCE(1);
- snprintf(p, maxlen, "unknown");
-+ seprintf(p, e, "unknown");
++ sprintf_end(p, e, "unknown");
return;
}
- p += snprintf(p, maxlen, "%s", policy_modes[mode]);
-+ p = seprintf(p, e, "%s", policy_modes[mode]);
++ p = sprintf_end(p, e, "%s", policy_modes[mode]);
if (flags & MPOL_MODE_FLAGS) {
- p += snprintf(p, buffer + maxlen - p, "=");
-+ p = seprintf(p, e, "=");
++ p = sprintf_end(p, e, "=");
/*
* Static and relative are mutually exclusive.
*/
if (flags & MPOL_F_STATIC_NODES)
- p += snprintf(p, buffer + maxlen - p, "static");
-+ p = seprintf(p, e, "static");
++ p = sprintf_end(p, e, "static");
else if (flags & MPOL_F_RELATIVE_NODES)
- p += snprintf(p, buffer + maxlen - p, "relative");
-+ p = seprintf(p, e, "relative");
++ p = sprintf_end(p, e, "relative");
if (flags & MPOL_F_NUMA_BALANCING) {
if (!is_power_of_2(flags & MPOL_MODE_FLAGS))
- p += snprintf(p, buffer + maxlen - p, "|");
- p += snprintf(p, buffer + maxlen - p, "balancing");
-+ p = seprintf(p, e, "|");
-+ p = seprintf(p, e, "balancing");
++ p = sprintf_end(p, e, "|");
++ p = sprintf_end(p, e, "balancing");
}
}
if (!nodes_empty(nodes))
- p += scnprintf(p, buffer + maxlen - p, ":%*pbl",
- nodemask_pr_args(&nodes));
-+ seprintf(p, e, ":%*pbl", nodemask_pr_args(&nodes));
++ sprintf_end(p, e, ":%*pbl", nodemask_pr_args(&nodes));
}
#ifdef CONFIG_SYSFS
@@ mm/page_owner.c: static inline int print_page_owner_memcg(char *kbuf, size_t cou
if (memcg_data & MEMCG_DATA_OBJEXTS)
- ret += scnprintf(kbuf + ret, count - ret,
- "Slab cache page\n");
-+ p = seprintf(p, end, "Slab cache page\n");
++ p = sprintf_end(p, end, "Slab cache page\n");
memcg = page_memcg_check(page);
if (!memcg)
@@ mm/page_owner.c: static inline int print_page_owner_memcg(char *kbuf, size_t cou
online = (memcg->css.flags & CSS_ONLINE);
cgroup_name(memcg->css.cgroup, name, sizeof(name));
- ret += scnprintf(kbuf + ret, count - ret,
-+ p = seprintf(p, end,
++ p = sprintf_end(p, end,
"Charged %sto %smemcg %s\n",
PageMemcgKmem(page) ? "(via objcg) " : "",
online ? "" : "offline ",
@@ mm/page_owner.c: print_page_owner(char __user *buf, size_t count, unsigned long
- ret = scnprintf(kbuf, count,
+ p = kbuf;
+ e = kbuf + count;
-+ p = seprintf(p, e,
++ p = sprintf_end(p, e,
"Page allocated via order %u, mask %#x(%pGg), pid %d, tgid %d (%s), ts %llu ns\n",
page_owner->order, page_owner->gfp_mask,
&page_owner->gfp_mask, page_owner->pid,
@@ mm/page_owner.c: print_page_owner(char __user *buf, size_t count, unsigned long
pageblock_mt = get_pageblock_migratetype(page);
page_mt = gfp_migratetype(page_owner->gfp_mask);
- ret += scnprintf(kbuf + ret, count - ret,
-+ p = seprintf(p, e,
++ p = sprintf_end(p, e,
"PFN 0x%lx type %s Block %lu type %s Flags %pGp\n",
pfn,
migratetype_names[page_mt],
@@ mm/page_owner.c: print_page_owner(char __user *buf, size_t count, unsigned long
- ret += stack_depot_snprint(handle, kbuf + ret, count - ret, 0);
- if (ret >= count)
- goto err;
-+ p = stack_depot_seprint(handle, p, e, 0);
++ p = stack_depot_sprint_end(handle, p, e, 0);
+ if (p == NULL)
+ goto err; // XXX: Should we remove this error handling?
if (page_owner->last_migrate_reason != -1) {
- ret += scnprintf(kbuf + ret, count - ret,
-+ p = seprintf(p, e,
++ p = sprintf_end(p, e,
"Page has been migrated, last migrate reason: %s\n",
migrate_reason_names[page_owner->last_migrate_reason]);
}
@@ mm/page_owner.c: print_page_owner(char __user *buf, size_t count, unsigned long
- ret += snprintf(kbuf + ret, count - ret, "\n");
- if (ret >= count)
-+ p = seprintf(p, e, "\n");
++ p = sprintf_end(p, e, "\n");
+ if (p == NULL)
goto err;
@@ mm/slub.c: static char *create_unique_id(struct kmem_cache *s)
if (p != name + 1)
*p++ = '-';
- p += snprintf(p, ID_STR_LENGTH - (p - name), "%07u", s->size);
-+ p = seprintf(p, e, "%07u", s->size);
++ p = sprintf_end(p, e, "%07u", s->size);
- if (WARN_ON(p > name + ID_STR_LENGTH - 1)) {
+ if (WARN_ON(p == NULL)) {
4: d8bd0e1d308b ! 4: e05c5afabb3c array_size.h: Add ENDOF()
@@ Metadata
## Commit message ##
array_size.h: Add ENDOF()
- This macro is useful to calculate the second argument to seprintf(),
+ This macro is useful to calculate the second argument to sprintf_end(),
avoiding off-by-one bugs.
Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
+ Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
+ Cc: Marco Elver <elver@google.com>
+ Cc: Michal Hocko <mhocko@suse.com>
+ Cc: Linus Torvalds <torvalds@linux-foundation.org>
+ Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
## include/linux/array_size.h ##
5: 740755c1a888 ! 5: 44a5cfc82acf mm: Fix benign off-by-one bugs
@@ Commit message
'end' --that is, at most the terminating null byte will be written at
'end-1'--.
- Fixes: bc8fbc5f305a (2021-02-26; "kfence: add test suite")
- Fixes: 8ed691b02ade (2022-10-03; "kmsan: add tests for KMSAN")
+ Acked-by: Marco Elver <elver@google.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
Cc: Alexander Potapenko <glider@google.com>
- Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
+ Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
+ Cc: Marco Elver <elver@google.com>
+ Cc: Michal Hocko <mhocko@suse.com>
+ Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
## mm/kfence/kfence_test.c ##
@@ mm/kfence/kfence_test.c: static bool report_matches(const struct expect_report *
+ end = ENDOF(expect[0]);
switch (r->type) {
case KFENCE_ERROR_OOB:
- cur = seprintf(cur, end, "BUG: KFENCE: out-of-bounds %s",
+ cur = sprintf_end(cur, end, "BUG: KFENCE: out-of-bounds %s",
@@ mm/kfence/kfence_test.c: static bool report_matches(const struct expect_report *r)
/* Access information */
@@ mm/kmsan/kmsan_test.c: static bool report_matches(const struct expect_report *r)
- end = &expected_header[sizeof(expected_header) - 1];
+ end = ENDOF(expected_header);
- cur = seprintf(cur, end, "BUG: KMSAN: %s", r->error_type);
+ cur = sprintf_end(cur, end, "BUG: KMSAN: %s", r->error_type);
6: 44d05559398c ! 6: 0314948eb225 sprintf: Add [V]STPRINTF()
@@ Metadata
Author: Alejandro Colomar <alx@kernel.org>
## Commit message ##
- sprintf: Add [V]STPRINTF()
+ sprintf: Add [V]SPRINTF_END()
- These macros take the array size argument implicitly to avoid programmer
- mistakes. This guarantees that the input is an array, unlike the common
- call
+ These macros take the end of the array argument implicitly to avoid
+ programmer mistakes. This guarantees that the input is an array, unlike
snprintf(buf, sizeof(buf), ...);
- which is dangerous if the programmer passes a pointer.
+ which is dangerous if the programmer passes a pointer instead of an
+ array.
These macros are essentially the same as the 2-argument version of
- strscpy(), but with a formatted string.
+ strscpy(), but with a formatted string, and returning a pointer to the
+ terminating '\0' (or NULL, on error).
+ Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
+ Cc: Marco Elver <elver@google.com>
+ Cc: Michal Hocko <mhocko@suse.com>
+ Cc: Linus Torvalds <torvalds@linux-foundation.org>
+ Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
## include/linux/sprintf.h ##
@@ include/linux/sprintf.h
#include <linux/types.h>
+#include <linux/array_size.h>
+
-+#define STPRINTF(a, fmt, ...) stprintf(a, ARRAY_SIZE(a), fmt, ##__VA_ARGS__)
-+#define VSTPRINTF(a, fmt, ap) vstprintf(a, ARRAY_SIZE(a), fmt, ap)
++#define SPRINTF_END(a, fmt, ...) sprintf_end(a, ENDOF(a), fmt, ##__VA_ARGS__)
++#define VSPRINTF_END(a, fmt, ap) vsprintf_end(a, ENDOF(a), fmt, ap)
int num_to_str(char *buf, int size, unsigned long long num, unsigned int width);
7: d0e95db3c80a ! 7: f99632f42eee mm: Use [V]STPRINTF() to avoid specifying the array size
@@ Metadata
Author: Alejandro Colomar <alx@kernel.org>
## Commit message ##
- mm: Use [V]STPRINTF() to avoid specifying the array size
+ mm: Use [V]SPRINTF_END() to avoid specifying the array size
+ Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
+ Cc: Marco Elver <elver@google.com>
+ Cc: Michal Hocko <mhocko@suse.com>
+ Cc: Linus Torvalds <torvalds@linux-foundation.org>
+ Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
## mm/backing-dev.c ##
@@ mm/backing-dev.c: int bdi_register_va(struct backing_dev_info *bdi, const char *
return 0;
- vsnprintf(bdi->dev_name, sizeof(bdi->dev_name), fmt, args);
-+ VSTPRINTF(bdi->dev_name, fmt, args);
++ VSPRINTF_END(bdi->dev_name, fmt, args);
dev = device_create(&bdi_class, NULL, MKDEV(0, 0), bdi, bdi->dev_name);
if (IS_ERR(dev))
return PTR_ERR(dev);
@@ mm/cma.c: static int __init cma_new_area(const char *name, phys_addr_t size,
if (name)
- snprintf(cma->name, CMA_MAX_NAME, "%s", name);
-+ STPRINTF(cma->name, "%s", name);
++ SPRINTF_END(cma->name, "%s", name);
else
- snprintf(cma->name, CMA_MAX_NAME, "cma%d\n", cma_area_count);
-+ STPRINTF(cma->name, "cma%d\n", cma_area_count);
++ SPRINTF_END(cma->name, "cma%d\n", cma_area_count);
cma->available_count = cma->count = size >> PAGE_SHIFT;
cma->order_per_bit = order_per_bit;
@@ mm/cma_debug.c: static void cma_debugfs_add_one(struct cma *cma, struct dentry *
for (r = 0; r < cma->nranges; r++) {
cmr = &cma->ranges[r];
- snprintf(rdirname, sizeof(rdirname), "%d", r);
-+ STPRINTF(rdirname, "%d", r);
++ SPRINTF_END(rdirname, "%d", r);
dir = debugfs_create_dir(rdirname, rangedir);
debugfs_create_file("base_pfn", 0444, dir,
&cmr->base_pfn, &cma_debugfs_fops);
@@ mm/hugetlb.c: void __init hugetlb_add_hstate(unsigned int order)
INIT_LIST_HEAD(&h->hugepage_activelist);
- snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
- huge_page_size(h)/SZ_1K);
-+ STPRINTF(h->name, "hugepages-%lukB", huge_page_size(h)/SZ_1K);
++ SPRINTF_END(h->name, "hugepages-%lukB", huge_page_size(h)/SZ_1K);
parsed_hstate = h;
}
@@ mm/hugetlb_cgroup.c: hugetlb_cgroup_cfttypes_init(struct hstate *h, struct cftyp
*cft = *tmpl;
/* rebuild the name */
- snprintf(cft->name, MAX_CFTYPE_NAME, "%s.%s", buf, tmpl->name);
-+ STPRINTF(cft->name, "%s.%s", buf, tmpl->name);
++ SPRINTF_END(cft->name, "%s.%s", buf, tmpl->name);
/* rebuild the private */
cft->private = MEMFILE_PRIVATE(idx, tmpl->private);
/* rebuild the file_offset */
@@ mm/hugetlb_cma.c: void __init hugetlb_cma_reserve(int order)
size = round_up(size, PAGE_SIZE << order);
- snprintf(name, sizeof(name), "hugetlb%d", nid);
-+ STPRINTF(name, "hugetlb%d", nid);
++ SPRINTF_END(name, "hugetlb%d", nid);
/*
* Note that 'order per bit' is based on smallest size that
* may be returned to CMA allocator in the case of
@@ mm/kasan/report.c: static void print_memory_metadata(const void *addr)
- snprintf(buffer, sizeof(buffer),
- (i == 0) ? ">%px: " : " %px: ", row);
-+ STPRINTF(buffer, (i == 0) ? ">%px: " : " %px: ", row);
++ SPRINTF_END(buffer, (i == 0) ? ">%px: " : " %px: ", row);
/*
* We should not pass a shadow pointer to generic
@@ mm/memblock.c: static void __init_memblock memblock_dump(struct memblock_type *t
#ifdef CONFIG_NUMA
if (numa_valid_node(memblock_get_region_node(rgn)))
- snprintf(nid_buf, sizeof(nid_buf), " on node %d",
-+ STPRINTF(nid_buf, " on node %d",
++ SPRINTF_END(nid_buf, " on node %d",
memblock_get_region_node(rgn));
#endif
pr_info(" %s[%#x]\t[%pa-%pa], %pa bytes%s flags: %#x\n",
@@ mm/memblock.c: int reserve_mem_release_by_name(const char *name)
start = phys_to_virt(map->start);
end = start + map->size - 1;
- snprintf(buf, sizeof(buf), "reserve_mem:%s", name);
-+ STPRINTF(buf, "reserve_mem:%s", name);
++ SPRINTF_END(buf, "reserve_mem:%s", name);
free_reserved_area(start, end, 0, buf);
map->size = 0;
@@ mm/percpu.c: int __init pcpu_page_first_chunk(size_t reserved_size, pcpu_fc_cpu_
int nr_g0_units;
- snprintf(psize_str, sizeof(psize_str), "%luK", PAGE_SIZE >> 10);
-+ STPRINTF(psize_str, "%luK", PAGE_SIZE >> 10);
++ SPRINTF_END(psize_str, "%luK", PAGE_SIZE >> 10);
ai = pcpu_build_alloc_info(reserved_size, 0, PAGE_SIZE, NULL);
if (IS_ERR(ai))
@@ mm/shrinker_debug.c: int shrinker_debugfs_add(struct shrinker *shrinker)
shrinker->debugfs_id = id;
- snprintf(buf, sizeof(buf), "%s-%d", shrinker->name, id);
-+ STPRINTF(buf, "%s-%d", shrinker->name, id);
++ SPRINTF_END(buf, "%s-%d", shrinker->name, id);
/* create debugfs entry */
entry = debugfs_create_dir(buf, shrinker_debugfs_root);
@@ mm/zswap.c: static struct zswap_pool *zswap_pool_create(char *type, char *compre
/* unique name for each pool specifically required by zsmalloc */
- snprintf(name, 38, "zswap%x", atomic_inc_return(&zswap_pools_count));
-+ STPRINTF(name, "zswap%x", atomic_inc_return(&zswap_pools_count));
++ SPRINTF_END(name, "zswap%x", atomic_inc_return(&zswap_pools_count));
pool->zpool = zpool_create_pool(type, name, gfp);
if (!pool->zpool) {
pr_err("%s zpool not available\n", type);
--
2.50.0
Hi,
Below is a draft of the proposal I'll submit in a few weeks to the
C Committee.
Cheers,
Alex
---
Name
alx-0049r2 - add seprintf()
Principles
- Codify existing practice to address evident deficiencies.
- Enable secure programming
Category
Standardize existing libc APIs
Author
Alejandro Colomar <alx@kernel.org>
Cc: Christopher Bazley <chris.bazley.wg14@gmail.com>
History
<https://www.alejandro-colomar.es/src/alx/alx/wg14/alx-0049.git/>
r0 (2025-07-06):
- Initial draft.
r1 (2025-07-06):
- wfix.
- tfix.
- Expand on the off-by-one bugs.
- Note that ignoring truncation is not valid most of the time.
r2 (2025-07-10):
- tfix.
- Mention SEPRINTF().
Rationale
snprintf(3) is very difficult to chain for writing parts of a
string in separate calls, such as in a loop.
Let's start from the obvious sprintf(3) code (sprintf(3) will
not prevent overflow, but let's take it as a baseline from which
programmers start thinking):
p = buf;
for (...)
p += sprintf(p, ...);
Then, programmers will start thinking about preventing buffer
overflows. Programmers sometimes will naively add some buffer
size information and use snprintf(3):
p = buf;
size = countof(buf);
for (...)
p += snprintf(p, size - (p - buf), ...);
if (p >= buf + size) // or worse, (p > buf + size - 1)
goto fail;
(Except for minor differences, this kind of code can be found
everywhere. Here are a couple of examples:
<https://elixir.bootlin.com/linux/v6.14/source/mm/slub.c#L7231>
<https://elixir.bootlin.com/linux/v6.14/source/mm/mempolicy.c#L3369>.)
This has several issues, starting with the difficulty of getting
the second argument right. Sometimes, programmers will be too
confused, and slap a -1 there just to be safe.
p = buf;
size = countof(buf);
for (...)
p += snprintf(p, size - (p - buf) - 1, ...);
if (p >= buf + size -1)
goto fail;
(Except for minor differences, this kind of code can be found
everywhere. Here are a couple of examples:
<https://elixir.bootlin.com/linux/v6.14/source/mm/kfence/kfence_test.c#L113>
<https://elixir.bootlin.com/linux/v6.14/source/mm/kmsan/kmsan_test.c#L108>.)
Programmers will sometimes hold a pointer to one past the last
element in the array. This is a wise choice, as that pointer is
constant throughout the lifetime of the object. Then,
programmers might end up with something like this:
p = buf;
e = buf + countof(buf);
for (...)
p += snprintf(p, e - p, ...);
if (p >= e)
goto fail;
This is certainly much cleaner. Now a programmer might focus on
the fact that this can overflow the pointer. An easy approach
would be to make sure that the function never returns more than
the remaining size. That is, one could implement something like
this scnprintf() --name chosen to match the Linux kernel API of
the same name--. For the sake of simplicity, let's ignore
multiple evaluation of arguments.
#define scnprintf(s, size, ...) \
({ \
int len_; \
len_ = snprintf(s, size, __VA_ARGS__); \
if (len_ == -1) \
len_ = 0; \
if (len_ >= size) \
len_ = size - 1; \
\
len_; \
})
p = buf;
e = buf + countof(buf);
for (...)
p += scnprintf(p, e - p, ...);
(Except for minor differences, this kind of code can be found
everywhere. Here's an example:
<https://elixir.bootlin.com/linux/v6.14/source/mm/kfence/kfence_test.c#L131>.)
Now the programmer got rid of pointer overflow. However, they
now have silent truncation that cannot be detected. In some
cases this may seem good enough. However, often it's not. And
anyway, some code remains using snprintf(3) to be able to detect
truncation.
Moreover, this kind of code ignores the fact that vsnprintf(3)
can fail internally, in which case there's not even a truncated
string. In the kernel, they're fine, because their internal
vsnprintf() doesn't seem to ever fail, so they can always rely
on the truncated string. This is not reliable in projects that
rely on the libc vsnprintf(3).
For the code that needs to detect truncation, a programmer might
choose a different path. It would keep using snprintf(3), but
would use a temporary length variable instead of the pointer.
p = buf;
e = buf + countof(buf);
for (...) {
len = snprintf(p, e - p, ...);
if (len == -1)
goto fail;
if (len >= e - p)
goto fail;
p += len;
}
This is naturally error-prone. A colleague of mine --which is an
excellent programmer, to be clear--, had a bug even after
knowing about it and having tried to fix it. That shows how
hard it is to write this correctly:
<https://github.com/nginx/unit/pull/734#discussion_r1043963527>
In a similar fashion, the strlcpy(3) manual page from OpenBSD
documents a similar issue when chaining calls to strlcpy(3)
--which was designed with semantics equivalent to snprintf(3),
except for not formatting the string--:
| char *dir, *file, pname[MAXPATHLEN];
| size_t n;
|
| ...
|
| n = strlcpy(pname, dir, sizeof(pname));
| if (n >= sizeof(pname))
| goto toolong;
| if (strlcpy(pname + n, file, sizeof(pname) - n) >= sizeof(pname) - n)
| goto toolong;
|
| However, one may question the validity of such optimiza‐
| tions, as they defeat the whole purpose of strlcpy() and
| strlcat(). As a matter of fact, the first version of
| this manual page got it wrong.
Finally, a programmer might realize that while this is error-
prone, this is indeed the right thing to do. There's no way to
avoid it. One could then think of encapsulating this into an
API that at least would make it easy to write. Then, one might
wonder what the right parameters are for such an API. The only
immutable thing in the loop is 'e'. And apart from that, one
needs to know where to write, which is 'p'. Let's start with
those, and try to keep all the other information (size, len)
without escaping the API. Again, let's ignore multiple-
evaluation issues in this macro for the sake of simplicity.
#define foo(p, e, ...) \
({ \
int len_ = snprintf(p, e - p, __VA_ARGS__); \
if (len_ == -1) \
p = NULL; \
else if (len_ >= e - p) \
p = NULL; \
else \
p += len_; \
p;
})
p = buf;
e = buf + countof(buf);
for (...) {
p = foo(p, e, ...);
if (p == NULL)
goto fail;
}
We've advanced a lot. We got rid of the buffer overflow; we
also got rid of the error-prone code at call site. However, one
might think that checking for truncation after every call is
cumbersome. Indeed, it is possible to slightly tweak the
internals of foo() to propagate errors from previous calls.
#define seprintf(p, e, ...) \
({ \
if (p != NULL) { \
int len_; \
\
len_ = snprintf(p, e - p, __VA_ARGS__); \
if (len_ == -1) \
p = NULL; \
else if (len_ >= e - p) \
p = NULL; \
else \
p += len_; \
} \
p; \
})
p = buf;
e = buf + countof(buf);
for (...)
p = seprintf(p, e, ...);
if (p == NULL)
goto fail;
By propagating an input null pointer directly to the output of
the API, which I've called seprintf() --the 'e' refers to the
'end' pointer, which is the key in this API--, we've allowed
ignoring null pointers until after the very last call. If we
compare our resulting code to the sprintf(3)-based baseline, we
got --perhaps unsurprisingly-- something quite close to it:
p = buf;
for (...)
p += sprintf(p, ...);
vs
p = buf;
e = buf + countof(buf);
for (...)
p = seprintf(p, e, ...);
if (p == NULL)
goto fail;
And the seprintf() version is safe against both truncation and
buffer overflow.
For the case where there is only one call to this function (so
not chained), and the buffer is an array, an even more ergonomic
wrapper can be written, and it is recommended that projects
define this macro themselves:
#define SEPRINTF(a, fmt, ...) \
seprintf(a, a + countof(a), fmt, __VA_ARGS__)
This adds some safety guarantees that $2 is calculated correctly
when it can be automated. Correct use would look like
if (SEPRINTF(buf, "foo") == NULL)
goto fail;
Some important details of the seprintf() API are:
- When 'p' is NULL, the API must preserve errno. This is
important to be able to determine the cause of the error
after all the chained calls, even when the error occurred in
some call in the middle of the chain.
- When truncation occurs, a distinct errno value must be used,
to signal the programmer that at least the string is reliable
to be used as a null-terminated string. The error code
chosen is E2BIG, for compatibility with strscpy(), a Linux
kernel internal API with which this API shares many features
in common.
- When a hard error (an internal snprintf(3) error) occurs, an
error code different than E2BIG must be used. It is
important to set errno, because if an implementation would
chose to return NULL without setting errno, an old value of
E2BIG could lead the programmer to believe the string was
successfully written (and truncated), and read it with
nefast consequences.
Prior art
This API is implemented in the shadow-utils project.
Plan9 designed something quite close, which they call
seprint(2). The parameters are the same --the right choice--,
but they got the semantics for corner cases wrong. Ironically,
the existing Plan9 code I've seen seems to expect the semantics
that I chose, regardless of the actual semantics of the Plan9
API. This is --I suspect--, because my semantics are actually
the intuitive semantics that one would naively guess of an API
with these parameters and return value.
I've implemented this API for the Linux kernel, and found and
fixed an amazing amount of bugs and other questionable code in
just the first handful of files that I inspected.
<https://lore.kernel.org/linux-hardening/cover.1751747518.git.alx@kernel.org/T/#t>
<https://lore.kernel.org/linux-hardening/cover.1751823326.git.alx@kernel.org/T/#t>
Future directions
The 'e = buf + _Countof(buf)' construct is something I've found
to be quite common. It would be interesting to have an
_Endof operator that would return a pointer to one past the last
element of an array. It would require an array operand, just
like _Countof. If an _Endof operator is deemed too cumbersome
for implementation, an endof() standard macro that expands to
the obvious implementation with _Countof could be okay.
This operator (or operator-like macro) would prevent off-by-one
bugs when calculating the end sentinel value, such as those
shown above (with links to Linux kernel real bugs).
Proposed wording
Based on N3550.
7.24.6 Input/output <stdio.h> :: Formatted input/output functions
## New section after 7.24.6.6 ("The snprintf function"):
+7.24.6.6+1 The <b>seprintf</b> function
+
+Synopsis
+1 #include <stdio.h>
+ char *seprintf(char *restrict p, const char end[0], const char *restrict format, ...);
+
+Description
+2 The <b>$0</b> function
+ is equivalent to <b>fprintf</b>,
+ except that the output is written into an array
+ (specified by argument <tt>p</tt>)
+ rather than a stream.
+ If <tt>p</tt> is a null pointer,
+ nothing is written,
+ and the function returns a null pointer.
+ Otherwise,
+ <tt>end</tt> shall compare greater than <tt>p</tt>;
+ the function writes at most
+ <tt>end - p - 1</tt> non-null characters,
+ the remaining output characters are discarded,
+ and a null character is written
+ at the end of the characters
+ actually written to the array.
+ If copying takes place between objects that overlap,
+ the behavior is undefined.
+
+Returns
+3 The <b>$0</b> function returns
+ a pointer to the terminating null character
+ if the output was written
+ without discarding any characters.
+
+4
+ If <tt>p</tt> is a null pointer,
+ a null pointer is returned,
+ and <b>errno</b> is not modified.
+
+5
+ If any characters are discarded,
+ a null pointer is returned,
+ and the value of the macro <b>E2BIG</b>
+ is stored in <b>errno</b>.
+
+6
+ If an error occurred,
+ a null pointer is returned,
+ and an implementation-defined non-zero value
+ is stored in <b>errno</b>.
## New section after 7.24.6.13 ("The vsnprintf function"):
+7.24.6.13+1 The <b>vseprintf</b> function
+
+Synopsis
+1 #include <stdio.h>
+ char *vseprintf(char *restrict p, const char end[0], const char *restrict format, va_list arg);
+
+Description
+2 The <b>$0</b> function
+ is equivalent to
+ <b>seprintf</b>,
+ with the varying argument list replaced by <tt>arg</tt>.
+
+3
+ The <tt>va_list</tt> argument to this function
+ shall have been initialized by the <b>va_start</b> macro
+ (and possibly subsequent <b>va_arg</b> invocations).
+ This function does not invoke the <b>va_end</b> macro.343)
7.33.2 Formatted wide character input/output functions
## New section after 7.33.2.4 ("The swprintf function"):
+7.33.2.4+1 The <b>sewprintf</b> function
+
+Synopsis
+1 #include <wchar.h>
+ wchar_t *sewprintf(wchar_t *restrict p, const wchar_t end[0], const wchar_t *restrict format, ...);
+
+Description
+2 The <b>$0</b> function
+ is equivalent to
+ <b>seprintf</b>,
+ except that it handles wide strings.
## New section after 7.33.2.8 ("The vswprintf function"):
+7.33.2.8+1 The <b>vsewprintf</b> function
+
+Synopsis
+1 #include <wchar.h>
+ wchar_t *vsewprintf(wchar_t *restrict p, const wchar_t end[0], const wchar_t *restrict format, va_list arg);
+
+Description
+2 The <b>$0</b> function
+ is equivalent to
+ <b>sewprintf</b>,
+ with the varying argument list replaced by <tt>arg</tt>.
+
+3
+ The <tt>va_list</tt> argument to this function
+ shall have been initialized by the <b>va_start</b> macro
+ (and possibly subsequent <b>va_arg</b> invocations).
+ This function does not invoke the <b>va_end</b> macro.407)
© 2016 - 2025 Red Hat, Inc.