[PATCH 4/4] tools/nolibc: add missing memchr() to string.h

Willy Tarreau posted 4 patches 3 months, 2 weeks ago
Posted by Willy Tarreau 3 months, 2 weeks ago
Surprisingly, we forgot to add this common function. It is added with a
per-arch guard so that it can later be implemented in arch-specific asm
code, as was done for a few other functions.

The test verifies that we don't search past the indicated length.

Signed-off-by: Willy Tarreau <w@1wt.eu>
---
 tools/include/nolibc/string.h                | 15 +++++++++++++++
 tools/testing/selftests/nolibc/nolibc-test.c |  2 ++
 2 files changed, 17 insertions(+)

diff --git a/tools/include/nolibc/string.h b/tools/include/nolibc/string.h
index 163a17e7dd38b..4000926f44ac4 100644
--- a/tools/include/nolibc/string.h
+++ b/tools/include/nolibc/string.h
@@ -93,6 +93,21 @@ void *memset(void *dst, int b, size_t len)
 }
 #endif /* #ifndef NOLIBC_ARCH_HAS_MEMSET */
 
+#ifndef NOLIBC_ARCH_HAS_MEMCHR
+static __attribute__((unused))
+void *memchr(const void *s, int c, size_t len)
+{
+	char *p = (char *)s;
+
+	while (len--) {
+		if (*p == (char)c)
+			return p;
+		p++;
+	}
+	return NULL;
+}
+#endif /* #ifndef NOLIBC_ARCH_HAS_MEMCHR */
+
 static __attribute__((unused))
 char *strchr(const char *s, int c)
 {
diff --git a/tools/testing/selftests/nolibc/nolibc-test.c b/tools/testing/selftests/nolibc/nolibc-test.c
index dbe13000fb1ac..d832566265296 100644
--- a/tools/testing/selftests/nolibc/nolibc-test.c
+++ b/tools/testing/selftests/nolibc/nolibc-test.c
@@ -1524,6 +1524,8 @@ int run_stdlib(int min, int max)
 		CASE_TEST(abs);                     EXPECT_EQ(1, abs(-10), 10); break;
 		CASE_TEST(abs_noop);                EXPECT_EQ(1, abs(10), 10); break;
 		CASE_TEST(difftime);                EXPECT_ZR(1, test_difftime()); break;
+		CASE_TEST(memchr_foobar6_o);        EXPECT_STREQ(1, memchr("foobar", 'o', 6), "oobar"); break;
+		CASE_TEST(memchr_foobar3_b);        EXPECT_STRZR(1, memchr("foobar", 'b', 3)); break;
 
 		case __LINE__:
 			return ret; /* must be last */
-- 
2.17.5
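
For readers who want to try the two selftest expectations outside of
nolibc, here is a minimal sketch. It copies the loop from the patch under
the hypothetical name demo_memchr() so that it does not clash with the
libc's memchr(); demo_check_bounds() is likewise only an illustration and
not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Same loop as in the patch, renamed to avoid clashing with libc. */
static void *demo_memchr(const void *s, int c, size_t len)
{
	char *p = (char *)s;

	while (len--) {
		if (*p == (char)c)
			return p;
		p++;
	}
	return NULL;
}

/* Mirrors the two new selftest cases. */
static void demo_check_bounds(void)
{
	const char *buf = "foobar";

	/* 'o' lies within the first 6 bytes: pointer to "oobar". */
	assert(demo_memchr(buf, 'o', 6) == buf + 1);

	/* 'b' is at index 3, outside the first 3 bytes: no match. */
	assert(demo_memchr(buf, 'b', 3) == NULL);
}
```

The second assertion is the one that checks the length bound: the byte
exists in the buffer, but only beyond the range that was requested.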
Re: [PATCH 4/4] tools/nolibc: add missing memchr() to string.h
Posted by Thomas Weißschuh 3 months, 2 weeks ago
On 2025-06-20 12:02:51+0200, Willy Tarreau wrote:
> Surprisingly we forgot to add this common one. It was added with a
> per-arch guard allowing to later implement it in arch-specific asm
> code like was done for a few other ones.
> 
> The test verifies that we don't search past the indicated length.
> 
> Signed-off-by: Willy Tarreau <w@1wt.eu>
> ---
>  tools/include/nolibc/string.h                | 15 +++++++++++++++
>  tools/testing/selftests/nolibc/nolibc-test.c |  2 ++
>  2 files changed, 17 insertions(+)
> 
> diff --git a/tools/include/nolibc/string.h b/tools/include/nolibc/string.h
> index 163a17e7dd38b..4000926f44ac4 100644
> --- a/tools/include/nolibc/string.h
> +++ b/tools/include/nolibc/string.h
> @@ -93,6 +93,21 @@ void *memset(void *dst, int b, size_t len)
>  }
>  #endif /* #ifndef NOLIBC_ARCH_HAS_MEMSET */
>  
> +#ifndef NOLIBC_ARCH_HAS_MEMCHR

So far we have only added these guards when necessary,
which isn't the case here. Can we drop them?

> +static __attribute__((unused))
> +void *memchr(const void *s, int c, size_t len)
> +{
> +	char *p = (char *)s;

The docs say that the bytes are interpreted as "unsigned char".
Also, can we keep the const?

> +
> +	while (len--) {
> +		if (*p == (char)c)
> +			return p;
> +		p++;
> +	}
> +	return NULL;
> +}
> +#endif /* #ifndef NOLIBC_ARCH_HAS_MEMCHR */
> +
>  static __attribute__((unused))
>  char *strchr(const char *s, int c)
>  {

<snip>
Re: [PATCH 4/4] tools/nolibc: add missing memchr() to string.h
Posted by Willy Tarreau 3 months, 2 weeks ago
On Sat, Jun 21, 2025 at 10:27:11AM +0200, Thomas Weißschuh wrote:
> On 2025-06-20 12:02:51+0200, Willy Tarreau wrote:
> > Surprisingly we forgot to add this common one. It was added with a
> > per-arch guard allowing to later implement it in arch-specific asm
> > code like was done for a few other ones.
> > 
> > The test verifies that we don't search past the indicated length.
> > 
> > Signed-off-by: Willy Tarreau <w@1wt.eu>
> > ---
> >  tools/include/nolibc/string.h                | 15 +++++++++++++++
> >  tools/testing/selftests/nolibc/nolibc-test.c |  2 ++
> >  2 files changed, 17 insertions(+)
> > 
> > diff --git a/tools/include/nolibc/string.h b/tools/include/nolibc/string.h
> > index 163a17e7dd38b..4000926f44ac4 100644
> > --- a/tools/include/nolibc/string.h
> > +++ b/tools/include/nolibc/string.h
> > @@ -93,6 +93,21 @@ void *memset(void *dst, int b, size_t len)
> >  }
> >  #endif /* #ifndef NOLIBC_ARCH_HAS_MEMSET */
> >  
> > +#ifndef NOLIBC_ARCH_HAS_MEMCHR
> 
> So far we only have added these guards when necessary,
> which they aren't here. Can we drop them?

I intentionally placed them so that we can easily override them,
as we did for the other ones on x86 where string operations are
super short (repnz scasb is two bytes once you have the registers
already loaded).
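
A rough sketch of how such a guard lets an arch take over the function.
All names are hypothetical: DEMO_ARCH_HAS_MEMCHR stands in for
NOLIBC_ARCH_HAS_MEMCHR, and the "arch" version is written in plain C
rather than asm so that the example stays portable.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical arch header part: provide the function and announce it.
 * A real arch would typically place inline asm here.
 */
#define DEMO_ARCH_HAS_MEMCHR
static void *demo_memchr(const void *s, int c, size_t len)
{
	const char *p = s;
	size_t i;

	for (i = 0; i < len; i++)
		if (p[i] == (char)c)
			return (void *)(p + i);
	return NULL;
}

/* Generic header part: only compiled when no arch claimed the function. */
#ifndef DEMO_ARCH_HAS_MEMCHR
static void *demo_memchr(const void *s, int c, size_t len)
{
	char *p = (char *)s;

	while (len--) {
		if (*p == (char)c)
			return p;
		p++;
	}
	return NULL;
}
#endif /* #ifndef DEMO_ARCH_HAS_MEMCHR */

/* Callers use the same name whichever version was selected. */
static void demo_check_override(void)
{
	const char *buf = "foobar";

	assert(demo_memchr(buf, 'b', 6) == buf + 3);
	assert(demo_memchr(buf, 'b', 3) == NULL);
}
```

Since the arch part comes first, the generic definition is skipped by the
preprocessor and there is no duplicate symbol.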

> > +static __attribute__((unused))
> > +void *memchr(const void *s, int c, size_t len)
> > +{
> > +	char *p = (char *)s;
> 
> The docs say that they are interpreted as "unsigned char".

It does not change anything here, except adding an extra
modifier (since we'll then also have to do it in the loop
when comparing against c), thus IMHO it's extra noise.
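
To illustrate the point, here is a small sketch that compares a plain
"char" variant with an "unsigned char" variant over every byte value. The
function names are hypothetical, and it assumes 8-bit chars with the
wrap-around int-to-char conversion that GCC and Clang implement. Only
equality is tested, so signedness should not affect which byte matches.

```c
#include <assert.h>
#include <stddef.h>

/* Variant from the patch: compare through plain char. */
static void *demo_memchr_char(const void *s, int c, size_t len)
{
	char *p = (char *)s;

	while (len--) {
		if (*p == (char)c)
			return p;
		p++;
	}
	return NULL;
}

/* Variant following the documented wording: compare as unsigned char. */
static void *demo_memchr_uchar(const void *s, int c, size_t len)
{
	unsigned char *p = (unsigned char *)s;

	while (len--) {
		if (*p == (unsigned char)c)
			return p;
		p++;
	}
	return NULL;
}

/* Both variants must locate the same byte for any value of c. */
static void demo_compare_variants(void)
{
	unsigned char buf[256];
	int i;

	for (i = 0; i < 256; i++)
		buf[i] = (unsigned char)i;

	/* Try c both as 0..255 and as negative values. */
	for (i = -128; i < 256; i++)
		assert(demo_memchr_char(buf, i, sizeof(buf)) ==
		       demo_memchr_uchar(buf, i, sizeof(buf)));
}
```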

> Also, can we keep the const?

memchr()'s definition requires returning a void *, so the const has to
be dropped somewhere. Here I found it visually cleaner to have a single
cast at the variable assignment rather than a second one on the return
statement. But it's a matter of taste. I tend to hate casts as they
confuse the reader and remove the compiler's ability to produce relevant
warnings, so for me the fewer the better.
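
For comparison, a sketch of the two placements of the cast. Names are
hypothetical. Note that in C a const void * converts to const char *
without a cast, so the const-preserving variant ends up with its single
cast on the return statement.

```c
#include <assert.h>
#include <stddef.h>

/* Patch style: drop const once, at the assignment. */
static void *demo_memchr_cast_early(const void *s, int c, size_t len)
{
	char *p = (char *)s;

	while (len--) {
		if (*p == (char)c)
			return p;
		p++;
	}
	return NULL;
}

/* Alternative: keep const inside the loop, drop it when returning. */
static void *demo_memchr_cast_late(const void *s, int c, size_t len)
{
	const char *p = s;

	while (len--) {
		if (*p == (char)c)
			return (void *)p;
		p++;
	}
	return NULL;
}

/* Both variants behave the same; only the location of the cast differs. */
static void demo_check_const_variants(void)
{
	const char *buf = "foobar";

	assert(demo_memchr_cast_early(buf, 'b', 6) == buf + 3);
	assert(demo_memchr_cast_late(buf, 'b', 6) == buf + 3);
	assert(demo_memchr_cast_early(buf, 'b', 3) == NULL);
	assert(demo_memchr_cast_late(buf, 'b', 3) == NULL);
}
```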

Thanks,
Willy
Re: [PATCH 4/4] tools/nolibc: add missing memchr() to string.h
Posted by Thomas Weißschuh 3 months, 2 weeks ago
On 2025-06-21 10:42:34+0200, Willy Tarreau wrote:
> On Sat, Jun 21, 2025 at 10:27:11AM +0200, Thomas Weißschuh wrote:
> > On 2025-06-20 12:02:51+0200, Willy Tarreau wrote:
> > > Surprisingly we forgot to add this common one. It was added with a
> > > per-arch guard allowing to later implement it in arch-specific asm
> > > code like was done for a few other ones.
> > > 
> > > The test verifies that we don't search past the indicated length.
> > > 
> > > Signed-off-by: Willy Tarreau <w@1wt.eu>
> > > ---
> > >  tools/include/nolibc/string.h                | 15 +++++++++++++++
> > >  tools/testing/selftests/nolibc/nolibc-test.c |  2 ++
> > >  2 files changed, 17 insertions(+)
> > > 
> > > diff --git a/tools/include/nolibc/string.h b/tools/include/nolibc/string.h
> > > index 163a17e7dd38b..4000926f44ac4 100644
> > > --- a/tools/include/nolibc/string.h
> > > +++ b/tools/include/nolibc/string.h
> > > @@ -93,6 +93,21 @@ void *memset(void *dst, int b, size_t len)
> > >  }
> > >  #endif /* #ifndef NOLIBC_ARCH_HAS_MEMSET */
> > >  
> > > +#ifndef NOLIBC_ARCH_HAS_MEMCHR
> > 
> > So far we only have added these guards when necessary,
> > which they aren't here. Can we drop them?
> 
> I intentionally placed them so that we can easily override them,
> as we did for the other ones on x86 where string operations are
> super short (repnz scasb is two bytes once you have the registers
> already loaded).

Okay.

We do have different override mechanisms, though:
NOLIBC_ARCH_HAS_* and, for example, the one used for sys_fork.
Not sure if it is worth aligning them.
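
As a sketch of the second style: the assumption here is that the sys_fork
mechanism works by having the arch define the function's own name as a
macro, which the generic code then tests with #ifndef. The names below
are hypothetical and the return values only serve to show which version
was compiled.

```c
#include <assert.h>

/*
 * Hypothetical arch header part: provide the function, then define its
 * own name as a macro so that later code can detect it.
 */
static int demo_sys_op(void)
{
	return 1; /* arch-specific version */
}
#define demo_sys_op demo_sys_op

/* Generic header part: fallback, skipped because the macro exists. */
#ifndef demo_sys_op
static int demo_sys_op(void)
{
	return 0; /* generic version */
}
#endif

/* The arch-specific version is the one that gets called. */
static void demo_check_selected(void)
{
	assert(demo_sys_op() == 1);
}
```

Compared to a separate NOLIBC_ARCH_HAS_* symbol, this style needs no
extra macro name, at the cost of turning the function name itself into a
macro.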

> > > +static __attribute__((unused))
> > > +void *memchr(const void *s, int c, size_t len)
> > > +{
> > > +	char *p = (char *)s;
> > 
> > The docs say that they are interpreted as "unsigned char".
> 
> It does not change anything here, except adding an extra
> modifier (since we'll then also have to do it in the loop
> when comparing against c), thus IMHO it's extra noise.

Fair enough.

> > Also, can we keep the const?
> 
> It's memchr()'s definition which requires to return a void* so the
> const needs to be dropped somewhere. Here I found visually cleaner to
> have a single cast during the variable assignment rather than have a
> second one on the return statement. But it's a matter of taste. I
> tend to hate casts as they confuse the reader and remove the ability
> of the compiler to produce relevant warnings, so for me the less the
> better.

Ditto.
Re: [PATCH 4/4] tools/nolibc: add missing memchr() to string.h
Posted by Willy Tarreau 3 months, 2 weeks ago
On Sun, Jun 22, 2025 at 09:56:35PM +0200, Thomas Weißschuh wrote:
(...)
> > > > +#ifndef NOLIBC_ARCH_HAS_MEMCHR
> > > 
> > > So far we only have added these guards when necessary,
> > > which they aren't here. Can we drop them?
> > 
> > I intentionally placed them so that we can easily override them,
> > as we did for the other ones on x86 where string operations are
> > super short (repnz scasb is two bytes once you have the registers
> > already loaded).
> 
> Okay.
> 
> We do have different override mechanisms.
> Both NOLIBC_ARCH_HAS_* and for example the mechanism for sys_fork.
> Not sure if it is worth aligning them.

I don't know either, because here we're talking about doing it with
functions that have standard names (e.g. memchr), unlike the sys_* ones
that we're bringing with nolibc. I think it requires a bit more thinking
to be sure we're not going to cause trouble (e.g. with compiler builtins
etc). Whatever the outcome, I agree that aligning all definitions on the
same approach would be desirable, even if it means changing all of them.

> > > > +static __attribute__((unused))
> > > > +void *memchr(const void *s, int c, size_t len)
> > > > +{
> > > > +	char *p = (char *)s;
> > > 
> > > The docs say that they are interpreted as "unsigned char".
> > 
> > It does not change anything here, except adding an extra
> > modifier (since we'll then also have to do it in the loop
> > when comparing against c), thus IMHO it's extra noise.
> 
> Fair enough.
> 
> > > Also, can we keep the const?
> > 
> > It's memchr()'s definition which requires to return a void* so the
> > const needs to be dropped somewhere. Here I found visually cleaner to
> > have a single cast during the variable assignment rather than have a
> > second one on the return statement. But it's a matter of taste. I
> > tend to hate casts as they confuse the reader and remove the ability
> > of the compiler to produce relevant warnings, so for me the less the
> > better.
> 
> Ditto.

OK, then I'll push it.

Thank you!
Willy