unicode: don't write -1 after NULL terminator

[PATCH] unicode: don't write -1 after NULL terminator

Posted by Jason A. Donenfeld 3 years, 5 months ago

If the intention is to overwrite the first NULL with a -1, s[strlen(s)]
is the first NULL, not s[strlen(s)+1].

Cc: Gabriel Krisman Bertazi <krisman@collabora.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 fs/unicode/mkutf8data.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/unicode/mkutf8data.c b/fs/unicode/mkutf8data.c
index bc1a7c8b5c8d..61800e0d3226 100644
--- a/fs/unicode/mkutf8data.c
+++ b/fs/unicode/mkutf8data.c
@@ -3194,7 +3194,7 @@ static int normalize_line(struct tree *tree)
 	/* Second test: length-limited string. */
 	s = buf2;
 	/* Replace NUL with a value that will cause an error if seen. */
-	s[strlen(s) + 1] = -1;
+	s[strlen(s)] = -1;
 	t = buf3;
 	if (utf8cursor(&u8c, tree, s))
 		return -1;
-- 
2.38.1

Re: [PATCH] unicode: don't write -1 after NULL terminator

Posted by Jiri Slaby 3 years, 5 months ago

On 03. 11. 22, 2:24, Jason A. Donenfeld wrote:
> If the intention is to overwrite the first NULL with a -1, s[strlen(s)]
> is the first NULL, not s[strlen(s)+1].

This caught my attention. You mix NULL (void *) with NUL (\0) in the 
changelog & subject. That occurs rather confusing to me.

> Cc: Gabriel Krisman Bertazi <krisman@collabora.com>
> Cc: stable@vger.kernel.org
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> ---
>   fs/unicode/mkutf8data.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/unicode/mkutf8data.c b/fs/unicode/mkutf8data.c
> index bc1a7c8b5c8d..61800e0d3226 100644
> --- a/fs/unicode/mkutf8data.c
> +++ b/fs/unicode/mkutf8data.c
> @@ -3194,7 +3194,7 @@ static int normalize_line(struct tree *tree)
>   	/* Second test: length-limited string. */
>   	s = buf2;
>   	/* Replace NUL with a value that will cause an error if seen. */
> -	s[strlen(s) + 1] = -1;
> +	s[strlen(s)] = -1;
>   	t = buf3;
>   	if (utf8cursor(&u8c, tree, s))
>   		return -1;

-- 
js
suse labs

[PATCH v2] unicode: don't write -1 after NUL terminator

Posted by Jason A. Donenfeld 3 years, 5 months ago

If the intention is to overwrite the first NUL with a -1, s[strlen(s)]
is the first NUL, not s[strlen(s)+1].

Cc: Gabriel Krisman Bertazi <krisman@collabora.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
 fs/unicode/mkutf8data.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/unicode/mkutf8data.c b/fs/unicode/mkutf8data.c
index bc1a7c8b5c8d..61800e0d3226 100644
--- a/fs/unicode/mkutf8data.c
+++ b/fs/unicode/mkutf8data.c
@@ -3194,7 +3194,7 @@ static int normalize_line(struct tree *tree)
 	/* Second test: length-limited string. */
 	s = buf2;
 	/* Replace NUL with a value that will cause an error if seen. */
-	s[strlen(s) + 1] = -1;
+	s[strlen(s)] = -1;
 	t = buf3;
 	if (utf8cursor(&u8c, tree, s))
 		return -1;
-- 
2.38.1

Re: [PATCH v2] unicode: don't write -1 after NUL terminator

Posted by Gabriel Krisman Bertazi 3 years, 5 months ago

"Jason A. Donenfeld" <Jason@zx2c4.com> writes:

> If the intention is to overwrite the first NUL with a -1, s[strlen(s)]
> is the first NUL, not s[strlen(s)+1].

Hi Jason,

This code is part of the verification of the trie that done at the end
of utf8data generation. It is making sure the tree is not corrupted, by
ensuring that utf8byte doesn't see something past the correct end of the
string (the first NULL byte).  Note it is not a bad memory access
either, since we guarantee to have allocated enough space.

So I think the code is correct as is. if you apply your patch and
regenerate utf8data.h_shipped, utf8byte will reach that -1 and fail the
verification.

> Cc: Gabriel Krisman Bertazi <krisman@collabora.com>
> Cc: stable@vger.kernel.org
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> ---
>  fs/unicode/mkutf8data.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/unicode/mkutf8data.c b/fs/unicode/mkutf8data.c
> index bc1a7c8b5c8d..61800e0d3226 100644
> --- a/fs/unicode/mkutf8data.c
> +++ b/fs/unicode/mkutf8data.c
> @@ -3194,7 +3194,7 @@ static int normalize_line(struct tree *tree)
>  	/* Second test: length-limited string. */
>  	s = buf2;
>  	/* Replace NUL with a value that will cause an error if seen. */
> -	s[strlen(s) + 1] = -1;
> +	s[strlen(s)] = -1;
>  	t = buf3;
>  	if (utf8cursor(&u8c, tree, s))
>  		return -1;

-- 
Gabriel Krisman Bertazi

Re: [PATCH v2] unicode: don't write -1 after NUL terminator

Posted by Jason A. Donenfeld 3 years, 5 months ago

Hi Gabriel,

On Mon, Nov 07, 2022 at 09:45:25AM -0500, Gabriel Krisman Bertazi wrote:
> "Jason A. Donenfeld" <Jason@zx2c4.com> writes:
> 
> > If the intention is to overwrite the first NUL with a -1, s[strlen(s)]
> > is the first NUL, not s[strlen(s)+1].
> 
> Hi Jason,
> 
> This code is part of the verification of the trie that done at the end
> of utf8data generation. It is making sure the tree is not corrupted, by
> ensuring that utf8byte doesn't see something past the correct end of the
> string (the first NULL byte).  Note it is not a bad memory access
> either, since we guarantee to have allocated enough space.
> 
> So I think the code is correct as is. if you apply your patch and
> regenerate utf8data.h_shipped, utf8byte will reach that -1 and fail the
> verification.

Ah, okay. "Replace NUL" would seem to be wrong/confusing comment text I
suppose. Thanks for the explanation anyhow, and sorry for the noise.

Jason

> 
> > Cc: Gabriel Krisman Bertazi <krisman@collabora.com>
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> > ---
> >  fs/unicode/mkutf8data.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/fs/unicode/mkutf8data.c b/fs/unicode/mkutf8data.c
> > index bc1a7c8b5c8d..61800e0d3226 100644
> > --- a/fs/unicode/mkutf8data.c
> > +++ b/fs/unicode/mkutf8data.c
> > @@ -3194,7 +3194,7 @@ static int normalize_line(struct tree *tree)
> >  	/* Second test: length-limited string. */
> >  	s = buf2;
> >  	/* Replace NUL with a value that will cause an error if seen. */
> > -	s[strlen(s) + 1] = -1;
> > +	s[strlen(s)] = -1;
> >  	t = buf3;
> >  	if (utf8cursor(&u8c, tree, s))
> >  		return -1;
> 
> -- 
> Gabriel Krisman Bertazi