[PATCH] xen/lib: Fix strcmp() and strncmp()

Jane Malalane posted 1 patch 2 years, 8 months ago
Test gitlab-ci failed
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20210727184715.17699-1-jane.malalane@citrix.com
xen/lib/strcmp.c  | 8 +++++---
xen/lib/strncmp.c | 8 +++++---
2 files changed, 10 insertions(+), 6 deletions(-)
[PATCH] xen/lib: Fix strcmp() and strncmp()
Posted by Jane Malalane 2 years, 8 months ago
The C standard requires that each character be compared as unsigned
char. Xen's current behaviour compares as signed char, which changes
the answer when chars with a value greater than 0x7f are used.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Jane Malalane <jane.malalane@citrix.com>
---
CC: Andrew Cooper <andrew.cooper3@citrix.com>
CC: George Dunlap <george.dunlap@citrix.com>
CC: Ian Jackson <iwj@xenproject.org>
CC: Jan Beulich <jbeulich@suse.com>
CC: Julien Grall <julien@xen.org>
CC: Stefano Stabellini <sstabellini@kernel.org>
CC: Wei Liu <wl@xen.org>
---
 xen/lib/strcmp.c  | 8 +++++---
 xen/lib/strncmp.c | 8 +++++---
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/xen/lib/strcmp.c b/xen/lib/strcmp.c
index 465f1c4191..f85c1e8741 100644
--- a/xen/lib/strcmp.c
+++ b/xen/lib/strcmp.c
@@ -11,14 +11,16 @@
  */
 int (strcmp)(const char *cs, const char *ct)
 {
-	register signed char __res;
+	unsigned char *csu = (unsigned char *)cs;
+	unsigned char *ctu = (unsigned char *)ct;
+	int res;
 
 	while (1) {
-		if ((__res = *cs - *ct++) != 0 || !*cs++)
+		if ((res = *csu - *ctu++) != 0 || !*csu++)
 			break;
 	}
 
-	return __res;
+	return res;
 }
 
 /*
diff --git a/xen/lib/strncmp.c b/xen/lib/strncmp.c
index 9af7fa1c99..1480f58c2e 100644
--- a/xen/lib/strncmp.c
+++ b/xen/lib/strncmp.c
@@ -12,15 +12,17 @@
  */
 int (strncmp)(const char *cs, const char *ct, size_t count)
 {
-	register signed char __res = 0;
+	unsigned char *csu = (unsigned char *)cs;
+	unsigned char *ctu = (unsigned char *)ct;
+	int res = 0;
 
 	while (count) {
-		if ((__res = *cs - *ct++) != 0 || !*cs++)
+		if ((res = *csu - *ctu++) != 0 || !*csu++)
 			break;
 		count--;
 	}
 
-	return __res;
+	return res;
 }
 
 /*
-- 
2.11.0


Re: [PATCH] xen/lib: Fix strcmp() and strncmp()
Posted by Andrew Cooper 2 years, 8 months ago
On 27/07/2021 19:47, Jane Malalane wrote:
> diff --git a/xen/lib/strcmp.c b/xen/lib/strcmp.c
> index 465f1c4191..f85c1e8741 100644
> --- a/xen/lib/strcmp.c
> +++ b/xen/lib/strcmp.c
> @@ -11,14 +11,16 @@
>   */
>  int (strcmp)(const char *cs, const char *ct)
>  {
> -	register signed char __res;
> +	unsigned char *csu = (unsigned char *)cs;
> +	unsigned char *ctu = (unsigned char *)ct;

So there was actually one final thing, but it is holiday season, hence
the lack of replies from others.

We should not be casting away const-ness on the pointers, because that
is undefined behaviour and compilers are starting to warn about it. 
Therefore, we want something like:

const unsigned char *csu = (const unsigned char *)cs;

~Andrew


Re: [PATCH] xen/lib: Fix strcmp() and strncmp()
Posted by Ian Jackson 2 years, 8 months ago
Andrew Cooper writes ("Re: [PATCH] xen/lib: Fix strcmp() and strncmp()"):
> On 27/07/2021 19:47, Jane Malalane wrote:
> > -	register signed char __res;
> > +	unsigned char *csu = (unsigned char *)cs;
> > +	unsigned char *ctu = (unsigned char *)ct;
> 
> So there was actually one final thing, but it is holiday season, hence
> the lack of replies from others.

Oh.

> We should not be casting away const-ness on the pointers, because that
> is undefined behaviour and compilers are starting to warn about it. 

I don't think casting away const is UB.  Perhaps you (and perhaps
others) are seeing this in 6.3.2.3(2):

 | For any qualifier q, a pointer to a non-q-qualified type may be
 | converted to a pointer to the q-qualified version of the type; the
 | values stored in the original and converted pointers shall compare
 | equal.p

This does indeed define the meaning of *adding* qualifiers to a
pointer type but not define the meaning of removing them.  But that
whole paragraph is almost redundant, because in 6.3.2.3(7):

 | A pointer to an object or incomplete type may be converted to a
 | pointer to a different object or incomplete type. If the resulting
 | pointer is not correctly aligned57) for the pointed-to type, the
 | behavior is undefined. Otherwise, when converted back again, the
 | result shall compare equal to the original pointer.

This defines the meaning of conversions of pointers to object types
(like char*) regardless of the qualifiers.

I read that as "a pointer to an object type or to an incomplete type".
But the precise reading doesn't matter because these pointers are
actually to objects.

There's also this in 6.7.3(5):

 | If an attempt is made to modify an object defined with a
 | const-qualified type through use of an lvalue with
 | non-const-qualified type, the behavior is undefined.  made to refer
 | to an object defined with a volatile-qualified type through

But there is no attempt to modify.  (Also this paragraph doesn't apply
because characters in string literals have type char, not type const
char, but 6.4.6(6) directly prohibits modification of characters in
string literals.)

6.2.7(2) says

 | All declarations that refer to the same object or function shall
 | have compatible type; otherwise, the behavior is undefined.

but I don't think these pointers variables are declarations of the
chars pointed to.

> Therefore, we want something like:
> 
> const unsigned char *csu = (const unsigned char *)cs;

Having said all thst, I agree that that not casting away const would
be better (especially if it generates compiler warnings).

I pushed it already.  If thios is UB we should revert it but as I say
I think it isn't, so we can wait for a followup.

Thanks,
Ian.

Re: [PATCH] xen/lib: Fix strcmp() and strncmp()
Posted by Ian Jackson 2 years, 8 months ago
Jane Malalane writes ("[PATCH] xen/lib: Fix strcmp() and strncmp()"):
> The C standard requires that each character be compared as unsigned
> char. Xen's current behaviour compares as signed char, which changes
> the answer when chars with a value greater than 0x7f are used.
> 
> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
> Signed-off-by: Jane Malalane <jane.malalane@citrix.com>

Thanks for this.

What are the practical effects of this bug ?  AFAICT in the hypervisor
code all the call sites simply test for zero/nonzero.

Of course we should fix this because

> -		if ((__res = *cs - *ct++) != 0 || !*cs++)

this substraction is UB if it overflows.  So in theory the compiler
could miscompile it - although in practice I can't see how the
assumption that this doesn't overflow would "help" the compiler.

Ian.

Re: [PATCH] xen/lib: Fix strcmp() and strncmp()
Posted by Jane Malalane 2 years, 8 months ago
On 28/07/2021 11:42, Ian Jackson wrote:
> [CAUTION - EXTERNAL EMAIL] DO NOT reply, click links, or open attachments unless you have verified the sender and know the content is safe.
>
> Jane Malalane writes ("[PATCH] xen/lib: Fix strcmp() and strncmp()"):
>> The C standard requires that each character be compared as unsigned
>> char. Xen's current behaviour compares as signed char, which changes
>> the answer when chars with a value greater than 0x7f are used.
>>
>> Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
>> Signed-off-by: Jane Malalane <jane.malalane@citrix.com>
> Thanks for this.
>
> What are the practical effects of this bug ?  AFAICT in the hypervisor
> code all the call sites simply test for zero/nonzero.
>
> Of course we should fix this because
>
>> -		if ((__res = *cs - *ct++) != 0 || !*cs++)
> this substraction is UB if it overflows.  So in theory the compiler
> could miscompile it - although in practice I can't see how the
> assumption that this doesn't overflow would "help" the compiler.
>
> Ian.
>
>
>
> This fix was just to make the code spec compliant and mainly for 
> practice as I'm currently being introduced to Xen.
>
> Jane
>
Re: [PATCH] xen/lib: Fix strcmp() and strncmp()
Posted by Ian Jackson 2 years, 8 months ago
Jane Malalane writes ("Re: [PATCH] xen/lib: Fix strcmp() and strncmp()"):
> On 28/07/2021 11:42, Ian Jackson wrote:
>     What are the practical effects of this bug ?  AFAICT in the hypervisor
>     code all the call sites simply test for zero/nonzero.
...
>     This fix was just to make the code spec compliant and mainly for practice as I'm currently being introduced to Xen.

OK, great.  As I say it looks correct to me.  I just wanted to make
sure I wasn't missing anything.

So,

Reviewed-by: Ian Jackson <iwj@xenproject.org>

and I will queue this.

Ian.