Recently, some compile-time checking I added to the clamp_t family of
functions triggered a build error when a poorly written driver was
compiled on ARM, because the driver assumed that the naked `char` type
is signed, but ARM treats it as unsigned, and the C standard says it's
architecture-dependent.
I doubt this particular driver is the only instance in which
unsuspecting authors make assumptions about `char` with no `signed` or
`unsigned` specifier. We were lucky enough this time that that driver
used `clamp_t(char, negative_value, positive_value)`, so the new
checking code found it, and I've sent a patch to fix it, but there are
likely other places lurking that won't be so easily unearthed.
So let's just eliminate this particular variety of heisensign bugs
entirely. Set `-funsigned-char` globally, so that gcc makes the type
unsigned on all architectures.
This will break things in some places and fix things in others, so this
will likely cause a bit of churn while reconciling the type misuse.
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/lkml/202210190108.ESC3pc3D-lkp@intel.com/
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
---
Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Makefile b/Makefile
index f41ec8c8426b..bbf376931899 100644
--- a/Makefile
+++ b/Makefile
@@ -562,7 +562,7 @@ KBUILD_AFLAGS := -D__ASSEMBLY__ -fno-PIE
KBUILD_CFLAGS := -Wall -Wundef -Werror=strict-prototypes -Wno-trigraphs \
-fno-strict-aliasing -fno-common -fshort-wchar -fno-PIE \
-Werror=implicit-function-declaration -Werror=implicit-int \
- -Werror=return-type -Wno-format-security \
+ -Werror=return-type -Wno-format-security -funsigned-char \
-std=gnu11
KBUILD_CPPFLAGS := -D__KERNEL__
KBUILD_RUSTFLAGS := $(rust_common_flags) \
--
2.38.1
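To illustrate the `clamp_t(char, negative_value, positive_value)` case from the commit message, here is a minimal standalone sketch. It is not the kernel's clamp_t() and not the driver's code. The helper names are hypothetical, and explicit `signed char` / `unsigned char` stand in for what plain `char` becomes without and with `-funsigned-char`.

```c
#include <limits.h>

/* Hypothetical helper: are the clamp bounds in order once both are
 * converted to signed char (plain char on x86 without -funsigned-char)? */
int bounds_ordered_signed(int lo, int hi)
{
	signed char l = (signed char)lo;
	signed char h = (signed char)hi;

	return l <= h;
}

/* Same check with unsigned char (plain char on ARM, or everywhere with
 * -funsigned-char): a negative lower bound wraps to a large value. */
int bounds_ordered_unsigned(int lo, int hi)
{
	unsigned char l = (unsigned char)lo;	/* e.g. -20 becomes 236 */
	unsigned char h = (unsigned char)hi;

	return l <= h;
}

/* Reports which of the two behaviours plain char has in this build. */
int plain_char_is_signed(void)
{
	return CHAR_MIN < 0;
}
```

With bounds of -20 and 20, the signed variant sees an ordered range, while the unsigned variant sees 236 > 20, i.e. a lower bound above the upper bound.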
On Wed, Oct 19, 2022 at 02:30:34PM -0600, Jason A. Donenfeld wrote:
> Recently, some compile-time checking I added to the clamp_t family of
> functions triggered a build error when a poorly written driver was
> compiled on ARM, because the driver assumed that the naked `char` type
> is signed, but ARM treats it as unsigned, and the C standard says it's
> architecture-dependent.
>
> I doubt this particular driver is the only instance in which
> unsuspecting authors make assumptions about `char` with no `signed` or
> `unsigned` specifier. We were lucky enough this time that that driver
> used `clamp_t(char, negative_value, positive_value)`, so the new
> checking code found it, and I've sent a patch to fix it, but there are
> likely other places lurking that won't be so easily unearthed.
>
> So let's just eliminate this particular variety of heisensign bugs
> entirely. Set `-funsigned-char` globally, so that gcc makes the type
> unsigned on all architectures.
>
> This will break things in some places and fix things in others, so this
> will likely cause a bit of churn while reconciling the type misuse.
>
There is an interesting fallout: When running the m68k:q800 qemu emulation,
there are lots of warning backtraces.
WARNING: CPU: 0 PID: 23 at crypto/testmgr.c:5724 alg_test.part.0+0x7c/0x326
testmgr: alg_test_descs entries in wrong order: 'adiantum(xchacha12,aes)' before 'adiantum(xchacha20,aes)'
------------[ cut here ]------------
WARNING: CPU: 0 PID: 23 at crypto/testmgr.c:5724 alg_test.part.0+0x7c/0x326
testmgr: alg_test_descs entries in wrong order: 'adiantum(xchacha20,aes)' before 'aegis128'
and so on for pretty much every entry in the alg_test_descs[] array.
Bisect points to this patch, and reverting it fixes the problem.
It looks like the problem is that arch/m68k/include/asm/string.h
uses "char res" to store the result of strcmp(), and char is now
unsigned - meaning strcmp() will now never return a value < 0.
Effectively that means that strcmp() is broken on m68k if
CONFIG_COLDFIRE=n.
The fix is probably quite simple.
diff --git a/arch/m68k/include/asm/string.h b/arch/m68k/include/asm/string.h
index f759d944c449..b8f4ae19e8f6 100644
--- a/arch/m68k/include/asm/string.h
+++ b/arch/m68k/include/asm/string.h
@@ -42,7 +42,7 @@ static inline char *strncpy(char *dest, const char *src, size_t n)
#define __HAVE_ARCH_STRCMP
static inline int strcmp(const char *cs, const char *ct)
{
- char res;
+ signed char res;
asm ("\n"
"1: move.b (%0)+,%2\n" /* get *cs */
Does that make sense ? If so I can send a patch.
Guenter
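A small C model of the effect described above, assuming the byte difference lands in an 8-bit result variable that is then widened to the int return value. The function names are hypothetical, and this is only a sketch of the widening step, not the m68k assembly.

```c
/* Model: result byte held in an unsigned 8-bit variable (what "char res"
 * becomes with -funsigned-char), then widened to int on return. */
int result_via_unsigned_char(unsigned char a, unsigned char b)
{
	unsigned char res = (unsigned char)(a - b);

	return res;	/* zero-extended: always in 0..255 */
}

/* Model: the same 8-bit pattern read back as a signed byte. */
int result_via_signed_char(unsigned char a, unsigned char b)
{
	unsigned char raw = (unsigned char)(a - b);

	/* Portable sign extension of the 8-bit pattern. */
	return raw >= 128 ? (int)raw - 256 : (int)raw;
}
```

The first differing bytes of 'adiantum(xchacha12,aes)' and 'adiantum(xchacha20,aes)' are '1' and '2'. The unsigned model yields 255 for that pair instead of a negative value, which matches the "entries in wrong order" warnings.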
Hi Günter,
On Wed, Dec 21, 2022 at 3:54 PM Guenter Roeck <linux@roeck-us.net> wrote:
> On Wed, Oct 19, 2022 at 02:30:34PM -0600, Jason A. Donenfeld wrote:
> > Recently, some compile-time checking I added to the clamp_t family of
> > functions triggered a build error when a poorly written driver was
> > compiled on ARM, because the driver assumed that the naked `char` type
> > is signed, but ARM treats it as unsigned, and the C standard says it's
> > architecture-dependent.
> >
> > I doubt this particular driver is the only instance in which
> > unsuspecting authors make assumptions about `char` with no `signed` or
> > `unsigned` specifier. We were lucky enough this time that that driver
> > used `clamp_t(char, negative_value, positive_value)`, so the new
> > checking code found it, and I've sent a patch to fix it, but there are
> > likely other places lurking that won't be so easily unearthed.
> >
> > So let's just eliminate this particular variety of heisensign bugs
> > entirely. Set `-funsigned-char` globally, so that gcc makes the type
> > unsigned on all architectures.
> >
> > This will break things in some places and fix things in others, so this
> > will likely cause a bit of churn while reconciling the type misuse.
> >
>
> There is an interesting fallout: When running the m68k:q800 qemu emulation,
> there are lots of warning backtraces.
>
> WARNING: CPU: 0 PID: 23 at crypto/testmgr.c:5724 alg_test.part.0+0x7c/0x326
> testmgr: alg_test_descs entries in wrong order: 'adiantum(xchacha12,aes)' before 'adiantum(xchacha20,aes)'
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 23 at crypto/testmgr.c:5724 alg_test.part.0+0x7c/0x326
> testmgr: alg_test_descs entries in wrong order: 'adiantum(xchacha20,aes)' before 'aegis128'
>
> and so on for pretty much every entry in the alg_test_descs[] array.
>
> Bisect points to this patch, and reverting it fixes the problem.
>
> It looks like the problem is that arch/m68k/include/asm/string.h
> uses "char res" to store the result of strcmp(), and char is now
> unsigned - meaning strcmp() will now never return a value < 0.
> Effectively that means that strcmp() is broken on m68k if
> CONFIG_COLDFIRE=n.
>
> The fix is probably quite simple.
>
> diff --git a/arch/m68k/include/asm/string.h b/arch/m68k/include/asm/string.h
> index f759d944c449..b8f4ae19e8f6 100644
> --- a/arch/m68k/include/asm/string.h
> +++ b/arch/m68k/include/asm/string.h
> @@ -42,7 +42,7 @@ static inline char *strncpy(char *dest, const char *src, size_t n)
> #define __HAVE_ARCH_STRCMP
> static inline int strcmp(const char *cs, const char *ct)
> {
> - char res;
> + signed char res;
>
> asm ("\n"
> "1: move.b (%0)+,%2\n" /* get *cs */
>
> Does that make sense ? If so I can send a patch.
Thanks, been there, done that
https://lore.kernel.org/all/bce014e60d7b1a3d1c60009fc3572e2f72591f21.1671110959.git.geert@linux-m68k.org
Note that we detected other issues with the m68k strcmp(), so
probably that patch wouldn't go in as-is.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
On 21/12/2022 16.05, Geert Uytterhoeven wrote:
> Hi Günter,
>
> On Wed, Dec 21, 2022 at 3:54 PM Guenter Roeck <linux@roeck-us.net> wrote:
>> On Wed, Oct 19, 2022 at 02:30:34PM -0600, Jason A. Donenfeld wrote:
>>> Recently, some compile-time checking I added to the clamp_t family of
>>> functions triggered a build error when a poorly written driver was
>>> compiled on ARM, because the driver assumed that the naked `char` type
>>> is signed, but ARM treats it as unsigned, and the C standard says it's
>>> architecture-dependent.
>>>
>>> I doubt this particular driver is the only instance in which
>>> unsuspecting authors make assumptions about `char` with no `signed` or
>>> `unsigned` specifier. We were lucky enough this time that that driver
>>> used `clamp_t(char, negative_value, positive_value)`, so the new
>>> checking code found it, and I've sent a patch to fix it, but there are
>>> likely other places lurking that won't be so easily unearthed.
>>>
>>> So let's just eliminate this particular variety of heisensign bugs
>>> entirely. Set `-funsigned-char` globally, so that gcc makes the type
>>> unsigned on all architectures.
>>>
>>> This will break things in some places and fix things in others, so this
>>> will likely cause a bit of churn while reconciling the type misuse.
>>>
>>
>> There is an interesting fallout: When running the m68k:q800 qemu emulation,
>> there are lots of warning backtraces.
>>
>> WARNING: CPU: 0 PID: 23 at crypto/testmgr.c:5724 alg_test.part.0+0x7c/0x326
>> testmgr: alg_test_descs entries in wrong order: 'adiantum(xchacha12,aes)' before 'adiantum(xchacha20,aes)'
>> ------------[ cut here ]------------
>> WARNING: CPU: 0 PID: 23 at crypto/testmgr.c:5724 alg_test.part.0+0x7c/0x326
>> testmgr: alg_test_descs entries in wrong order: 'adiantum(xchacha20,aes)' before 'aegis128'
>>
>> and so on for pretty much every entry in the alg_test_descs[] array.
>>
>> Bisect points to this patch, and reverting it fixes the problem.
>>
>> It looks like the problem is that arch/m68k/include/asm/string.h
>> uses "char res" to store the result of strcmp(), and char is now
>> unsigned - meaning strcmp() will now never return a value < 0.
>> Effectively that means that strcmp() is broken on m68k if
>> CONFIG_COLDFIRE=n.
>>
>> The fix is probably quite simple.
>>
>> diff --git a/arch/m68k/include/asm/string.h b/arch/m68k/include/asm/string.h
>> index f759d944c449..b8f4ae19e8f6 100644
>> --- a/arch/m68k/include/asm/string.h
>> +++ b/arch/m68k/include/asm/string.h
>> @@ -42,7 +42,7 @@ static inline char *strncpy(char *dest, const char *src, size_t n)
>> #define __HAVE_ARCH_STRCMP
>> static inline int strcmp(const char *cs, const char *ct)
>> {
>> - char res;
>> + signed char res;
>>
>> asm ("\n"
>> "1: move.b (%0)+,%2\n" /* get *cs */
>>
>> Does that make sense ? If so I can send a patch.
>
> Thanks, been there, done that
> https://lore.kernel.org/all/bce014e60d7b1a3d1c60009fc3572e2f72591f21.1671110959.git.geert@linux-m68k.org
Well, looks like that would still leave strcmp() buggy, you can't
represent all possible differences between two char values (signed or
not) in an 8-bit quantity. So any implementation based on returning the
first non-zero value of *a - *b must store that intermediate value in
something wider. Otherwise you'll get -128 from strcmp("\x40", "\xc0"),
but _also_ -128 when you do strcmp("\xc0", "\x40"), which is obviously
bogus.
I recently fixed that long-standing bug in U-Boot's strcmp() and a
similar one in nolibc in the linux tree. I wonder how many more
instances exist.
Rasmus
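The collision can be reproduced with plain integer arithmetic. The sketch below is a hypothetical model, not any of the implementations mentioned: diff_8bit() keeps only the low 8 bits of the difference and sign-extends them, while diff_int() keeps the difference in int.

```c
/* Difference truncated to 8 bits and read back as a signed byte. */
int diff_8bit(unsigned char a, unsigned char b)
{
	unsigned char raw = (unsigned char)(a - b);

	return raw >= 128 ? (int)raw - 256 : (int)raw;
}

/* Difference kept in int: range -255..255, so the sign is reliable. */
int diff_int(unsigned char a, unsigned char b)
{
	return (int)a - (int)b;
}
```

For the byte pair 0x40 and 0xc0, diff_8bit() returns -128 in both argument orders, whereas diff_int() returns -128 and 128.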
Hi Rasmus,
On Wed, Dec 21, 2022 at 4:29 PM Rasmus Villemoes
<rasmus.villemoes@prevas.dk> wrote:
> On 21/12/2022 16.05, Geert Uytterhoeven wrote:
> > On Wed, Dec 21, 2022 at 3:54 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >> On Wed, Oct 19, 2022 at 02:30:34PM -0600, Jason A. Donenfeld wrote:
> >>> Recently, some compile-time checking I added to the clamp_t family of
> >>> functions triggered a build error when a poorly written driver was
> >>> compiled on ARM, because the driver assumed that the naked `char` type
> >>> is signed, but ARM treats it as unsigned, and the C standard says it's
> >>> architecture-dependent.
> >>>
> >>> I doubt this particular driver is the only instance in which
> >>> unsuspecting authors make assumptions about `char` with no `signed` or
> >>> `unsigned` specifier. We were lucky enough this time that that driver
> >>> used `clamp_t(char, negative_value, positive_value)`, so the new
> >>> checking code found it, and I've sent a patch to fix it, but there are
> >>> likely other places lurking that won't be so easily unearthed.
> >>>
> >>> So let's just eliminate this particular variety of heisensign bugs
> >>> entirely. Set `-funsigned-char` globally, so that gcc makes the type
> >>> unsigned on all architectures.
> >>>
> >>> This will break things in some places and fix things in others, so this
> >>> will likely cause a bit of churn while reconciling the type misuse.
> >>>
> >>
> >> There is an interesting fallout: When running the m68k:q800 qemu emulation,
> >> there are lots of warning backtraces.
> >>
> >> WARNING: CPU: 0 PID: 23 at crypto/testmgr.c:5724 alg_test.part.0+0x7c/0x326
> >> testmgr: alg_test_descs entries in wrong order: 'adiantum(xchacha12,aes)' before 'adiantum(xchacha20,aes)'
> >> ------------[ cut here ]------------
> >> WARNING: CPU: 0 PID: 23 at crypto/testmgr.c:5724 alg_test.part.0+0x7c/0x326
> >> testmgr: alg_test_descs entries in wrong order: 'adiantum(xchacha20,aes)' before 'aegis128'
> >>
> >> and so on for pretty much every entry in the alg_test_descs[] array.
> >>
> >> Bisect points to this patch, and reverting it fixes the problem.
> >>
> >> It looks like the problem is that arch/m68k/include/asm/string.h
> >> uses "char res" to store the result of strcmp(), and char is now
> >> unsigned - meaning strcmp() will now never return a value < 0.
> >> Effectively that means that strcmp() is broken on m68k if
> >> CONFIG_COLDFIRE=n.
> >>
> >> The fix is probably quite simple.
> >>
> >> diff --git a/arch/m68k/include/asm/string.h b/arch/m68k/include/asm/string.h
> >> index f759d944c449..b8f4ae19e8f6 100644
> >> --- a/arch/m68k/include/asm/string.h
> >> +++ b/arch/m68k/include/asm/string.h
> >> @@ -42,7 +42,7 @@ static inline char *strncpy(char *dest, const char *src, size_t n)
> >> #define __HAVE_ARCH_STRCMP
> >> static inline int strcmp(const char *cs, const char *ct)
> >> {
> >> - char res;
> >> + signed char res;
> >>
> >> asm ("\n"
> >> "1: move.b (%0)+,%2\n" /* get *cs */
> >>
> >> Does that make sense ? If so I can send a patch.
> >
> > Thanks, been there, done that
> > https://lore.kernel.org/all/bce014e60d7b1a3d1c60009fc3572e2f72591f21.1671110959.git.geert@linux-m68k.org
>
> Well, looks like that would still leave strcmp() buggy, you can't
> represent all possible differences between two char values (signed or
> not) in an 8-bit quantity. So any implementation based on returning the
> first non-zero value of *a - *b must store that intermediate value in
> something wider. Otherwise you'll get -128 from strcmp("\x40", "\xc0"),
> but _also_ -128 when you do strcmp("\xc0", "\x40"), which is obviously
> bogus.
So we have https://lore.kernel.org/all/87bko3ia88.fsf@igel.home ;-)
And the other issue is m68k strcmp() calls being dropped by the
optimizer, cfr. the discussion in
https://lore.kernel.org/all/b673f98db7d14d53a6e1a1957ef81741@AcuMS.aculab.com
> I recently fixed that long-standing bug in U-Boot's strcmp() and a
> similar one in nolibc in the linux tree. I wonder how many more
> instances exist.
Thanks, commit fb63362c63c7aeac ("lib: fix buggy strcmp and strncmp") in
v2023.01-rc1, which is not yet in a released version.
(and in plain C, not in asm ;-)
Gr{oetje,eeting}s,
Geert
On Wed, Dec 21, 2022 at 04:29:11PM +0100, Rasmus Villemoes wrote:
> On 21/12/2022 16.05, Geert Uytterhoeven wrote:
> > Hi Günter,
> >
> > On Wed, Dec 21, 2022 at 3:54 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >> On Wed, Oct 19, 2022 at 02:30:34PM -0600, Jason A. Donenfeld wrote:
> >>> Recently, some compile-time checking I added to the clamp_t family of
> >>> functions triggered a build error when a poorly written driver was
> >>> compiled on ARM, because the driver assumed that the naked `char` type
> >>> is signed, but ARM treats it as unsigned, and the C standard says it's
> >>> architecture-dependent.
> >>>
> >>> I doubt this particular driver is the only instance in which
> >>> unsuspecting authors make assumptions about `char` with no `signed` or
> >>> `unsigned` specifier. We were lucky enough this time that that driver
> >>> used `clamp_t(char, negative_value, positive_value)`, so the new
> >>> checking code found it, and I've sent a patch to fix it, but there are
> >>> likely other places lurking that won't be so easily unearthed.
> >>>
> >>> So let's just eliminate this particular variety of heisensign bugs
> >>> entirely. Set `-funsigned-char` globally, so that gcc makes the type
> >>> unsigned on all architectures.
> >>>
> >>> This will break things in some places and fix things in others, so this
> >>> will likely cause a bit of churn while reconciling the type misuse.
> >>>
> >>
> >> There is an interesting fallout: When running the m68k:q800 qemu emulation,
> >> there are lots of warning backtraces.
> >>
> >> WARNING: CPU: 0 PID: 23 at crypto/testmgr.c:5724 alg_test.part.0+0x7c/0x326
> >> testmgr: alg_test_descs entries in wrong order: 'adiantum(xchacha12,aes)' before 'adiantum(xchacha20,aes)'
> >> ------------[ cut here ]------------
> >> WARNING: CPU: 0 PID: 23 at crypto/testmgr.c:5724 alg_test.part.0+0x7c/0x326
> >> testmgr: alg_test_descs entries in wrong order: 'adiantum(xchacha20,aes)' before 'aegis128'
> >>
> >> and so on for pretty much every entry in the alg_test_descs[] array.
> >>
> >> Bisect points to this patch, and reverting it fixes the problem.
> >>
> >> It looks like the problem is that arch/m68k/include/asm/string.h
> >> uses "char res" to store the result of strcmp(), and char is now
> >> unsigned - meaning strcmp() will now never return a value < 0.
> >> Effectively that means that strcmp() is broken on m68k if
> >> CONFIG_COLDFIRE=n.
> >>
> >> The fix is probably quite simple.
> >>
> >> diff --git a/arch/m68k/include/asm/string.h b/arch/m68k/include/asm/string.h
> >> index f759d944c449..b8f4ae19e8f6 100644
> >> --- a/arch/m68k/include/asm/string.h
> >> +++ b/arch/m68k/include/asm/string.h
> >> @@ -42,7 +42,7 @@ static inline char *strncpy(char *dest, const char *src, size_t n)
> >> #define __HAVE_ARCH_STRCMP
> >> static inline int strcmp(const char *cs, const char *ct)
> >> {
> >> - char res;
> >> + signed char res;
> >>
> >> asm ("\n"
> >> "1: move.b (%0)+,%2\n" /* get *cs */
> >>
> >> Does that make sense ? If so I can send a patch.
> >
> > Thanks, been there, done that
> > https://lore.kernel.org/all/bce014e60d7b1a3d1c60009fc3572e2f72591f21.1671110959.git.geert@linux-m68k.org
>
> Well, looks like that would still leave strcmp() buggy, you can't
> represent all possible differences between two char values (signed or
> not) in an 8-bit quantity. So any implementation based on returning the
> first non-zero value of *a - *b must store that intermediate value in
> something wider. Otherwise you'll get -128 from strcmp("\x40", "\xc0"),
> but _also_ -128 when you do strcmp("\xc0", "\x40"), which is obviously
> bogus.
>
The above assumes an unsigned char as input to strcmp(). I consider that
a hypothetical problem because "comparing" strings with upper bits
set doesn't really make sense in practice (How does one compare Günter
against Gunter ? And how about Gǖnter ?). On the other side, the problem
observed here is real and immediate.
Guenter
On Dez 21 2022, Guenter Roeck wrote:
> The above assumes an unsigned char as input to strcmp().
That's how strcmp is defined. See
<https://lore.kernel.org/all/87bko3ia88.fsf@igel.home>
--
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510 2552 DF73 E780 A9DA AEC1
"And now for something completely different."
On Wed, Dec 21, 2022 at 7:56 AM Guenter Roeck <linux@roeck-us.net> wrote:
>
> The above assumes an unsigned char as input to strcmp(). I consider that
> a hypothetical problem because "comparing" strings with upper bits
> set doesn't really make sense in practice (How does one compare Günter
> against Gunter ? And how about Gǖnter ?). On the other side, the problem
> observed here is real and immediate.
POSIX does actually specify "Günter" vs "Gunter".
The way strcmp is supposed to work is to return the sign of the
difference between the byte values ("unsigned char").
But that sign has to be computed in 'int', not in 'signed char'.
So yes, the m68k implementation is broken regardless, but with a
signed char it just happened to work for the US-ASCII case that the
crypto case tested.
I think the real fix is to just remove that broken implementation
entirely, and rely on the generic one.
I'll commit that, and see what happens.
Linus
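A minimal sketch of a comparison that follows the rule stated above: bytes are read as unsigned char and the sign is derived from a comparison, so no narrow subtraction is involved. The name sketch_strcmp() is hypothetical, and this is an illustration of the approach rather than a copy of the kernel's generic strcmp().

```c
/* Hypothetical name. Returns <0, 0 or >0 based on the first differing
 * byte, with both bytes interpreted as unsigned char. */
int sketch_strcmp(const char *cs, const char *ct)
{
	unsigned char c1, c2;

	for (;;) {
		c1 = (unsigned char)*cs++;
		c2 = (unsigned char)*ct++;
		if (c1 != c2)
			return c1 < c2 ? -1 : 1;
		if (c1 == '\0')
			return 0;
	}
}
```

Because the result comes from `c1 < c2` rather than from a truncated difference, the "\x40" versus "\xc0" pair orders consistently in both directions, and the answer does not depend on whether plain char is signed.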
From: Linus Torvalds
> Sent: 21 December 2022 17:07
>
> On Wed, Dec 21, 2022 at 7:56 AM Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > The above assumes an unsigned char as input to strcmp(). I consider that
> > a hypothetical problem because "comparing" strings with upper bits
> > set doesn't really make sense in practice (How does one compare Günter
> > against Gunter ? And how about Gǖnter ?). On the other side, the problem
> > observed here is real and immediate.
>
> POSIX does actually specify "Günter" vs "Gunter".
>
> The way strcmp is supposed to work is to return the sign of the
> difference between the byte values ("unsigned char").
>
> But that sign has to be computed in 'int', not in 'signed char'.
>
> So yes, the m68k implementation is broken regardless, but with a
> signed char it just happened to work for the US-ASCII case that the
> crypto case tested.
>
> I think the real fix is to just remove that broken implementation
> entirely, and rely on the generic one.
I wonder how much slower it is - m68k is likely to be microcoded
and I don't think instruction timings are actually available.
The fastest version probably uses subx (with carry) to generate
0/-1 and leaves +delta for the other result - but getting the
compares and branches in the right order is hard.
I believe some of the other m68k asm functions are also missing
the "memory" 'clobber' and so could get mis-optimised.
While I can write (or rather have written) m68k asm I don't have
a compiler.
I also suspect that any x86 code that uses 'rep scas' is going
to be slow on anything modern.
David
On Wed, Dec 21, 2022 at 09:06:41AM -0800, Linus Torvalds wrote:
> On Wed, Dec 21, 2022 at 7:56 AM Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > The above assumes an unsigned char as input to strcmp(). I consider that
> > a hypothetical problem because "comparing" strings with upper bits
> > set doesn't really make sense in practice (How does one compare Günter
> > against Gunter ? And how about Gǖnter ?). On the other side, the problem
> > observed here is real and immediate.
>
> POSIX does actually specify "Günter" vs "Gunter".
>
> The way strcmp is supposed to work is to return the sign of the
> difference between the byte values ("unsigned char").
>
> But that sign has to be computed in 'int', not in 'signed char'.
>
> So yes, the m68k implementation is broken regardless, but with a
> signed char it just happened to work for the US-ASCII case that the
> crypto case tested.
>
I understand. I just prefer a known limited breakage to completely
broken code.
> I think the real fix is to just remove that broken implementation
> entirely, and rely on the generic one.
Perfectly fine with me.
Thanks,
Guenter
On Wed, Dec 21, 2022 at 9:19 AM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On Wed, Dec 21, 2022 at 09:06:41AM -0800, Linus Torvalds wrote:
> >
> > I think the real fix is to just remove that broken implementation
> > entirely, and rely on the generic one.
>
> Perfectly fine with me.
That got pushed out as commit 7c0846125358 ("m68k: remove broken
strcmp implementation") but it's obviously entirely untested. I don't
do m68k cross-compiles, much less boot tests.
Just FYI for everybody - I may have screwed something up for some very
non-obvious reason.
But it looked very obvious indeed, and I hate having buggy code that
is architecture-specific when we have generic code that isn't buggy.
Linus
Hi Linus,
On Wed, Dec 21, 2022 at 7:46 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, Dec 21, 2022 at 9:19 AM Guenter Roeck <linux@roeck-us.net> wrote:
> > On Wed, Dec 21, 2022 at 09:06:41AM -0800, Linus Torvalds wrote:
> > > I think the real fix is to just remove that broken implementation
> > > entirely, and rely on the generic one.
> >
> > Perfectly fine with me.
>
> That got pushed out as commit 7c0846125358 ("m68k: remove broken
> strcmp implementation") but it's obviously entirely untested. I don't
> do m68k cross-compiles, much less boot tests.
>
> Just FYI for everybody - I may have screwed something up for some very
> non-obvious reason.
>
> But it looked very obvious indeed, and I hate having buggy code that
> is architecture-specific when we have generic code that isn't buggy.
Thank you for being proactive!
It works fine (and slightly reduced kernel size, too ;-)
Gr{oetje,eeting}s,
Geert
On Wed, Dec 21, 2022 at 10:46:08AM -0800, Linus Torvalds wrote:
> On Wed, Dec 21, 2022 at 9:19 AM Guenter Roeck <linux@roeck-us.net> wrote:
> >
> > On Wed, Dec 21, 2022 at 09:06:41AM -0800, Linus Torvalds wrote:
> > >
> > > I think the real fix is to just remove that broken implementation
> > > entirely, and rely on the generic one.
> >
> > Perfectly fine with me.
>
> That got pushed out as commit 7c0846125358 ("m68k: remove broken
> strcmp implementation") but it's obviously entirely untested. I don't
> do m68k cross-compiles, much less boot tests.
>
> Just FYI for everybody - I may have screwed something up for some very
> non-obvious reason.
>
No worries:
Build reference: msi-fixes-6.2-1-2644-g0a924817d2ed
Building mcf5208evb:m5208:m5208evb_defconfig:initrd ... running ..... passed
Building q800:m68040:mac_defconfig:initrd ... running ..... passed
Building q800:m68040:mac_defconfig:rootfs ... running ..... passed
Guenter
On Wed, Dec 21, 2022 at 10:46 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> But it looked very obvious indeed, and I hate having buggy code that
> is architecture-specific when we have generic code that isn't buggy.
Side note: we have an x86-64 implementation that looks fine (but not
really noticeably better than the generic one) that is based on the
'return subtraction' model. But it seems to get it right.
And we have a 32-bit x86 assembly thing that is based on 'rep scasb',
that then uses the carry bit to also get things right.
That 32-bit asm goes back to Linux 0.01 (with some changes since to
use "sbbl+or" instead of a conditional neg). I was playing around a
lot with the 'rep' instructions back when, since it was all part of
"learn the instruction set" for me.
Both of them should probably be removed as pointless too, but they
don't seem actively buggy.
Linus
On Wed, Dec 21, 2022 at 04:05:45PM +0100, Geert Uytterhoeven wrote:
> Hi Günter,
>
> On Wed, Dec 21, 2022 at 3:54 PM Guenter Roeck <linux@roeck-us.net> wrote:
> > On Wed, Oct 19, 2022 at 02:30:34PM -0600, Jason A. Donenfeld wrote:
> > > Recently, some compile-time checking I added to the clamp_t family of
> > > functions triggered a build error when a poorly written driver was
> > > compiled on ARM, because the driver assumed that the naked `char` type
> > > is signed, but ARM treats it as unsigned, and the C standard says it's
> > > architecture-dependent.
> > >
> > > I doubt this particular driver is the only instance in which
> > > unsuspecting authors make assumptions about `char` with no `signed` or
> > > `unsigned` specifier. We were lucky enough this time that that driver
> > > used `clamp_t(char, negative_value, positive_value)`, so the new
> > > checking code found it, and I've sent a patch to fix it, but there are
> > > likely other places lurking that won't be so easily unearthed.
> > >
> > > So let's just eliminate this particular variety of heisensign bugs
> > > entirely. Set `-funsigned-char` globally, so that gcc makes the type
> > > unsigned on all architectures.
> > >
> > > This will break things in some places and fix things in others, so this
> > > will likely cause a bit of churn while reconciling the type misuse.
> > >
> >
> > There is an interesting fallout: When running the m68k:q800 qemu emulation,
> > there are lots of warning backtraces.
> >
> > WARNING: CPU: 0 PID: 23 at crypto/testmgr.c:5724 alg_test.part.0+0x7c/0x326
> > testmgr: alg_test_descs entries in wrong order: 'adiantum(xchacha12,aes)' before 'adiantum(xchacha20,aes)'
> > ------------[ cut here ]------------
> > WARNING: CPU: 0 PID: 23 at crypto/testmgr.c:5724 alg_test.part.0+0x7c/0x326
> > testmgr: alg_test_descs entries in wrong order: 'adiantum(xchacha20,aes)' before 'aegis128'
> >
> > and so on for pretty much every entry in the alg_test_descs[] array.
> >
> > Bisect points to this patch, and reverting it fixes the problem.
> >
> > It looks like the problem is that arch/m68k/include/asm/string.h
> > uses "char res" to store the result of strcmp(), and char is now
> > unsigned - meaning strcmp() will now never return a value < 0.
> > Effectively that means that strcmp() is broken on m68k if
> > CONFIG_COLDFIRE=n.
> >
> > The fix is probably quite simple.
> >
> > diff --git a/arch/m68k/include/asm/string.h b/arch/m68k/include/asm/string.h
> > index f759d944c449..b8f4ae19e8f6 100644
> > --- a/arch/m68k/include/asm/string.h
> > +++ b/arch/m68k/include/asm/string.h
> > @@ -42,7 +42,7 @@ static inline char *strncpy(char *dest, const char *src, size_t n)
> > #define __HAVE_ARCH_STRCMP
> > static inline int strcmp(const char *cs, const char *ct)
> > {
> > - char res;
> > + signed char res;
> >
> > asm ("\n"
> > "1: move.b (%0)+,%2\n" /* get *cs */
> >
> > Does that make sense ? If so I can send a patch.
>
> Thanks, been there, done that
> https://lore.kernel.org/all/bce014e60d7b1a3d1c60009fc3572e2f72591f21.1671110959.git.geert@linux-m68k.org
>
> Note that we detected other issues with the m68k strcmp(), so
> probably that patch wouldn't go in as-is.
>
So anything non-Coldfire is and will remain broken on m68k for the time
being ? Wouldn't it be better to fix the acute problem now and address
the long-standing problem(s) separately ?
Thanks,
Guenter
On Wed, Oct 19, 2022 at 02:30:34PM -0600, Jason A. Donenfeld wrote:
> Recently, some compile-time checking I added to the clamp_t family of
> functions triggered a build error when a poorly written driver was
> compiled on ARM, because the driver assumed that the naked `char` type
> is signed, but ARM treats it as unsigned, and the C standard says it's
> architecture-dependent.
>
> I doubt this particular driver is the only instance in which
> unsuspecting authors make assumptions about `char` with no `signed` or
> `unsigned` specifier. We were lucky enough this time that that driver
> used `clamp_t(char, negative_value, positive_value)`, so the new
> checking code found it, and I've sent a patch to fix it, but there are
> likely other places lurking that won't be so easily unearthed.
>
> So let's just eliminate this particular variety of heisensign bugs
> entirely. Set `-funsigned-char` globally, so that gcc makes the type
> unsigned on all architectures.
>
> This will break things in some places and fix things in others, so this
> will likely cause a bit of churn while reconciling the type misuse.
>
This is a very daring change and obviously is going to introduce bugs.
It might be better to create a static checker rule that says "char"
without explicit signedness can only be used for strings.
arch/parisc/kernel/drivers.c:337 print_hwpath() warn: impossible condition '(path->bc[i] == -1) => (0-255 == (-1))'
arch/parisc/kernel/drivers.c:410 setup_bus_id() warn: impossible condition '(path.bc[i] == -1) => (0-255 == (-1))'
arch/parisc/kernel/drivers.c:486 create_parisc_device() warn: impossible condition '(modpath->bc[i] == -1) => (0-255 == (-1))'
arch/parisc/kernel/drivers.c:759 hwpath_to_device() warn: impossible condition '(modpath->bc[i] == -1) => (0-255 == (-1))'
drivers/media/dvb-frontends/stv0288.c:471 stv0288_set_frontend() warn: assigning (-9) to unsigned variable 'tm'
drivers/media/dvb-frontends/stv0288.c:471 stv0288_set_frontend() warn: we never enter this loop
drivers/misc/sgi-gru/grumain.c:711 gru_check_chiplet_assignment() warn: 'gts->ts_user_chiplet_id' is unsigned
drivers/net/wireless/cisco/airo.c:5316 proc_wepkey_on_close() warn: assigning (-16) to unsigned variable 'key[i / 3]'
drivers/net/wireless/ralink/rt2x00/rt2800lib.c:9415 rt2800_iq_search() warn: assigning (-32) to unsigned variable 'idx0'
drivers/net/wireless/ralink/rt2x00/rt2800lib.c:9470 rt2800_iq_search() warn: assigning (-32) to unsigned variable 'perr'
drivers/video/fbdev/sis/init301.c:3549 SiS_GetCRT2Data301() warn: 'SiS_Pr->SiS_EModeIDTable[ModeIdIndex]->ROMMODEIDX661' is unsigned
sound/pci/au88x0/au88x0_core.c:2029 vortex_adb_checkinout() warn: signedness bug returning '(-22)'
sound/pci/au88x0/au88x0_core.c:2046 vortex_adb_checkinout() warn: signedness bug returning '(-12)'
sound/pci/au88x0/au88x0_core.c:2125 vortex_adb_allocroute() warn: 'vortex_adb_checkinout(vortex, (0), en, 0)' is unsigned
sound/pci/au88x0/au88x0_core.c:2170 vortex_adb_allocroute() warn: 'vortex_adb_checkinout(vortex, stream->resources, en, 4)' is unsigned
sound/pci/rme9652/hdsp.c:3953 hdsp_channel_buffer_location() warn: 'hdsp->channel_map[channel]' is unsigned
sound/pci/rme9652/rme9652.c:1833 rme9652_channel_buffer_location() warn: 'rme9652->channel_map[channel]' is unsigned
I did not know that ARM had unsigned chars. I only knew about PPC and
on that arch they use char aggressively so that no one forgets that char
is unsigned. Changing char to signed would have made people very
annoyed. :P
regards,
dan carpenter
On Mon, Oct 24, 2022 at 12:24:24PM +0300, Dan Carpenter wrote:
> On Wed, Oct 19, 2022 at 02:30:34PM -0600, Jason A. Donenfeld wrote:
> > Recently, some compile-time checking I added to the clamp_t family of
> > functions triggered a build error when a poorly written driver was
> > compiled on ARM, because the driver assumed that the naked `char` type
> > is signed, but ARM treats it as unsigned, and the C standard says it's
> > architecture-dependent.
> >
> > I doubt this particular driver is the only instance in which
> > unsuspecting authors make assumptions about `char` with no `signed` or
> > `unsigned` specifier. We were lucky enough this time that that driver
> > used `clamp_t(char, negative_value, positive_value)`, so the new
> > checking code found it, and I've sent a patch to fix it, but there are
> > likely other places lurking that won't be so easily unearthed.
> >
> > So let's just eliminate this particular variety of heisensign bugs
> > entirely. Set `-funsigned-char` globally, so that gcc makes the type
> > unsigned on all architectures.
> >
> > This will break things in some places and fix things in others, so this
> > will likely cause a bit of churn while reconciling the type misuse.
> >
>
> This is a very daring change and obviously is going to introduce bugs.
> It might be better to create a static checker rule that says "char"
> without explicit signedness can only be used for strings.
Indeed this would be great.
>
> arch/parisc/kernel/drivers.c:337 print_hwpath() warn: impossible condition '(path->bc[i] == -1) => (0-255 == (-1))'
> arch/parisc/kernel/drivers.c:410 setup_bus_id() warn: impossible condition '(path.bc[i] == -1) => (0-255 == (-1))'
> arch/parisc/kernel/drivers.c:486 create_parisc_device() warn: impossible condition '(modpath->bc[i] == -1) => (0-255 == (-1))'
> arch/parisc/kernel/drivers.c:759 hwpath_to_device() warn: impossible condition '(modpath->bc[i] == -1) => (0-255 == (-1))'
> drivers/media/dvb-frontends/stv0288.c:471 stv0288_set_frontend() warn: assigning (-9) to unsigned variable 'tm'
> drivers/media/dvb-frontends/stv0288.c:471 stv0288_set_frontend() warn: we never enter this loop
> drivers/misc/sgi-gru/grumain.c:711 gru_check_chiplet_assignment() warn: 'gts->ts_user_chiplet_id' is unsigned
> drivers/net/wireless/cisco/airo.c:5316 proc_wepkey_on_close() warn: assigning (-16) to unsigned variable 'key[i / 3]'
> drivers/net/wireless/ralink/rt2x00/rt2800lib.c:9415 rt2800_iq_search() warn: assigning (-32) to unsigned variable 'idx0'
> drivers/net/wireless/ralink/rt2x00/rt2800lib.c:9470 rt2800_iq_search() warn: assigning (-32) to unsigned variable 'perr'
> drivers/video/fbdev/sis/init301.c:3549 SiS_GetCRT2Data301() warn: 'SiS_Pr->SiS_EModeIDTable[ModeIdIndex]->ROMMODEIDX661' is unsigned
> sound/pci/au88x0/au88x0_core.c:2029 vortex_adb_checkinout() warn: signedness bug returning '(-22)'
> sound/pci/au88x0/au88x0_core.c:2046 vortex_adb_checkinout() warn: signedness bug returning '(-12)'
> sound/pci/au88x0/au88x0_core.c:2125 vortex_adb_allocroute() warn: 'vortex_adb_checkinout(vortex, (0), en, 0)' is unsigned
> sound/pci/au88x0/au88x0_core.c:2170 vortex_adb_allocroute() warn: 'vortex_adb_checkinout(vortex, stream->resources, en, 4)' is unsigned
> sound/pci/rme9652/hdsp.c:3953 hdsp_channel_buffer_location() warn: 'hdsp->channel_map[channel]' is unsigned
> sound/pci/rme9652/rme9652.c:1833 rme9652_channel_buffer_location() warn: 'rme9652->channel_map[channel]' is unsigned
Thanks. I'll fix these up.
Jason
On Mon, Oct 24, 2022 at 12:24:24PM +0300, Dan Carpenter wrote:
> On Wed, Oct 19, 2022 at 02:30:34PM -0600, Jason A. Donenfeld wrote:
> > Recently, some compile-time checking I added to the clamp_t family of
> > functions triggered a build error when a poorly written driver was
> > compiled on ARM, because the driver assumed that the naked `char` type
> > is signed, but ARM treats it as unsigned, and the C standard says it's
> > architecture-dependent.
> >
> > I doubt this particular driver is the only instance in which
> > unsuspecting authors make assumptions about `char` with no `signed` or
> > `unsigned` specifier. We were lucky enough this time that that driver
> > used `clamp_t(char, negative_value, positive_value)`, so the new
> > checking code found it, and I've sent a patch to fix it, but there are
> > likely other places lurking that won't be so easily unearthed.
> >
> > So let's just eliminate this particular variety of heisensign bugs
> > entirely. Set `-funsigned-char` globally, so that gcc makes the type
> > unsigned on all architectures.
> >
> > This will break things in some places and fix things in others, so this
> > will likely cause a bit of churn while reconciling the type misuse.
> >
>
> This is a very daring change and obviously is going to introduce bugs.
> It might be better to create a static checker rule that says "char"
> without explicit signedness can only be used for strings.
>
> arch/parisc/kernel/drivers.c:337 print_hwpath() warn: impossible condition '(path->bc[i] == -1) => (0-255 == (-1))'
> arch/parisc/kernel/drivers.c:410 setup_bus_id() warn: impossible condition '(path.bc[i] == -1) => (0-255 == (-1))'
> arch/parisc/kernel/drivers.c:486 create_parisc_device() warn: impossible condition '(modpath->bc[i] == -1) => (0-255 == (-1))'
> arch/parisc/kernel/drivers.c:759 hwpath_to_device() warn: impossible condition '(modpath->bc[i] == -1) => (0-255 == (-1))'
> drivers/media/dvb-frontends/stv0288.c:471 stv0288_set_frontend() warn: assigning (-9) to unsigned variable 'tm'
> drivers/media/dvb-frontends/stv0288.c:471 stv0288_set_frontend() warn: we never enter this loop
> drivers/misc/sgi-gru/grumain.c:711 gru_check_chiplet_assignment() warn: 'gts->ts_user_chiplet_id' is unsigned
> drivers/net/wireless/cisco/airo.c:5316 proc_wepkey_on_close() warn: assigning (-16) to unsigned variable 'key[i / 3]'
> drivers/net/wireless/ralink/rt2x00/rt2800lib.c:9415 rt2800_iq_search() warn: assigning (-32) to unsigned variable 'idx0'
> drivers/net/wireless/ralink/rt2x00/rt2800lib.c:9470 rt2800_iq_search() warn: assigning (-32) to unsigned variable 'perr'
> drivers/video/fbdev/sis/init301.c:3549 SiS_GetCRT2Data301() warn: 'SiS_Pr->SiS_EModeIDTable[ModeIdIndex]->ROMMODEIDX661' is unsigned
> sound/pci/au88x0/au88x0_core.c:2029 vortex_adb_checkinout() warn: signedness bug returning '(-22)'
> sound/pci/au88x0/au88x0_core.c:2046 vortex_adb_checkinout() warn: signedness bug returning '(-12)'
> sound/pci/au88x0/au88x0_core.c:2125 vortex_adb_allocroute() warn: 'vortex_adb_checkinout(vortex, (0), en, 0)' is unsigned
> sound/pci/au88x0/au88x0_core.c:2170 vortex_adb_allocroute() warn: 'vortex_adb_checkinout(vortex, stream->resources, en, 4)' is unsigned
> sound/pci/rme9652/hdsp.c:3953 hdsp_channel_buffer_location() warn: 'hdsp->channel_map[channel]' is unsigned
> sound/pci/rme9652/rme9652.c:1833 rme9652_channel_buffer_location() warn: 'rme9652->channel_map[channel]' is unsigned
Here are some more:
drivers/net/wireless/ralink/rt2x00/rt2800lib.c:9472 rt2800_iq_search() warn: impossible condition '(gerr < -7) => (0-255 < (-7))'
drivers/net/wireless/ralink/rt2x00/rt2800lib.c:9476 rt2800_iq_search() warn: impossible condition '(perr < -31) => (0-255 < (-31))'
drivers/staging/rtl8192e/rtllib_softmac_wx.c:459 rtllib_wx_set_essid() warn: impossible condition '(extra[i] < 0) => (0-255 < 0)'
sound/pci/rme9652/hdsp.c:4153 snd_hdsp_channel_info() warn: impossible condition '(hdsp->channel_map[channel] < 0) => (0-255 < 0)'
This might be interesting for backports if everyone starts to rely on
the fact that char is unsigned as the PPC people currently do.
regards,
dan carpenter
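The "impossible condition" warnings above all share one pattern: a byte that was stored as -1 is later compared against -1 after integer promotion. A minimal sketch of that pattern follows, using explicit `signed char` and `unsigned char` so the behaviour does not depend on compiler flags. The function names and the single-element arrays are hypothetical, not taken from the parisc code.

```c
#include <assert.h>

/* Sentinel test on an explicitly signed byte: -1 survives promotion to int. */
int slot_is_unused_signed(const signed char *bc, int i)
{
    return bc[i] == -1;
}

/* Same test on an unsigned byte: the stored value promotes to 255, never -1. */
int slot_is_unused_unsigned(const unsigned char *bc, int i)
{
    return bc[i] == -1;
}

/* One possible repair: compare against the sentinel converted to the byte type. */
int slot_is_unused_fixed(const unsigned char *bc, int i)
{
    return bc[i] == (unsigned char)-1;
}

/* Self-check with hypothetical one-element tables holding the sentinel. */
void check_sentinel(void)
{
    const signed char s[1] = { -1 };
    const unsigned char u[1] = { (unsigned char)-1 };

    assert(slot_is_unused_signed(s, 0) == 1);
    assert(slot_is_unused_unsigned(u, 0) == 0); /* the "impossible condition" */
    assert(slot_is_unused_fixed(u, 0) == 1);
}
```

Whether a given driver should switch to an explicit `signed char`/`s8` or compare against an unsigned sentinel depends on what the field means, which is why each report needs a look by someone who knows the code.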
On Mon, Oct 24, 2022 at 12:30:11PM +0300, Dan Carpenter wrote:
> On Mon, Oct 24, 2022 at 12:24:24PM +0300, Dan Carpenter wrote:
> > On Wed, Oct 19, 2022 at 02:30:34PM -0600, Jason A. Donenfeld wrote:
> > > Recently, some compile-time checking I added to the clamp_t family of
> > > functions triggered a build error when a poorly written driver was
> > > compiled on ARM, because the driver assumed that the naked `char` type
> > > is signed, but ARM treats it as unsigned, and the C standard says it's
> > > architecture-dependent.
> > >
> > > I doubt this particular driver is the only instance in which
> > > unsuspecting authors make assumptions about `char` with no `signed` or
> > > `unsigned` specifier. We were lucky enough this time that that driver
> > > used `clamp_t(char, negative_value, positive_value)`, so the new
> > > checking code found it, and I've sent a patch to fix it, but there are
> > > likely other places lurking that won't be so easily unearthed.
> > >
> > > So let's just eliminate this particular variety of heisensign bugs
> > > entirely. Set `-funsigned-char` globally, so that gcc makes the type
> > > unsigned on all architectures.
> > >
> > > This will break things in some places and fix things in others, so this
> > > will likely cause a bit of churn while reconciling the type misuse.
> > >
> >
> > This is a very daring change and obviously is going to introduce bugs.
> > It might be better to create a static checker rule that says "char"
> > without explicit signedness can only be used for strings.
> >
> > arch/parisc/kernel/drivers.c:337 print_hwpath() warn: impossible condition '(path->bc[i] == -1) => (0-255 == (-1))'
> > arch/parisc/kernel/drivers.c:410 setup_bus_id() warn: impossible condition '(path.bc[i] == -1) => (0-255 == (-1))'
> > arch/parisc/kernel/drivers.c:486 create_parisc_device() warn: impossible condition '(modpath->bc[i] == -1) => (0-255 == (-1))'
> > arch/parisc/kernel/drivers.c:759 hwpath_to_device() warn: impossible condition '(modpath->bc[i] == -1) => (0-255 == (-1))'
> > drivers/media/dvb-frontends/stv0288.c:471 stv0288_set_frontend() warn: assigning (-9) to unsigned variable 'tm'
> > drivers/media/dvb-frontends/stv0288.c:471 stv0288_set_frontend() warn: we never enter this loop
> > drivers/misc/sgi-gru/grumain.c:711 gru_check_chiplet_assignment() warn: 'gts->ts_user_chiplet_id' is unsigned
> > drivers/net/wireless/cisco/airo.c:5316 proc_wepkey_on_close() warn: assigning (-16) to unsigned variable 'key[i / 3]'
> > drivers/net/wireless/ralink/rt2x00/rt2800lib.c:9415 rt2800_iq_search() warn: assigning (-32) to unsigned variable 'idx0'
> > drivers/net/wireless/ralink/rt2x00/rt2800lib.c:9470 rt2800_iq_search() warn: assigning (-32) to unsigned variable 'perr'
> > drivers/video/fbdev/sis/init301.c:3549 SiS_GetCRT2Data301() warn: 'SiS_Pr->SiS_EModeIDTable[ModeIdIndex]->ROMMODEIDX661' is unsigned
> > sound/pci/au88x0/au88x0_core.c:2029 vortex_adb_checkinout() warn: signedness bug returning '(-22)'
> > sound/pci/au88x0/au88x0_core.c:2046 vortex_adb_checkinout() warn: signedness bug returning '(-12)'
> > sound/pci/au88x0/au88x0_core.c:2125 vortex_adb_allocroute() warn: 'vortex_adb_checkinout(vortex, (0), en, 0)' is unsigned
> > sound/pci/au88x0/au88x0_core.c:2170 vortex_adb_allocroute() warn: 'vortex_adb_checkinout(vortex, stream->resources, en, 4)' is unsigned
> > sound/pci/rme9652/hdsp.c:3953 hdsp_channel_buffer_location() warn: 'hdsp->channel_map[channel]' is unsigned
> > sound/pci/rme9652/rme9652.c:1833 rme9652_channel_buffer_location() warn: 'rme9652->channel_map[channel]' is unsigned
>
> Here are some more:
>
> drivers/net/wireless/ralink/rt2x00/rt2800lib.c:9472 rt2800_iq_search() warn: impossible condition '(gerr < -7) => (0-255 < (-7))'
> drivers/net/wireless/ralink/rt2x00/rt2800lib.c:9476 rt2800_iq_search() warn: impossible condition '(perr < -31) => (0-255 < (-31))'
> drivers/staging/rtl8192e/rtllib_softmac_wx.c:459 rtllib_wx_set_essid() warn: impossible condition '(extra[i] < 0) => (0-255 < 0)'
> sound/pci/rme9652/hdsp.c:4153 snd_hdsp_channel_info() warn: impossible condition '(hdsp->channel_map[channel] < 0) => (0-255 < 0)'
>
> This might be interesting for backports if everyone starts to rely on
> the fact that char is unsigned as the PPC people currently do.
Give these a minute to hit Lore, but patches just submitted to various
maintainers as fixes (for 6.1), since these are already broken on some
architecture.
https://lore.kernel.org/all/20221024163005.536097-1-Jason@zx2c4.com
https://lore.kernel.org/all/20221024162947.536060-1-Jason@zx2c4.com
https://lore.kernel.org/all/20221024162929.536004-1-Jason@zx2c4.com
https://lore.kernel.org/all/20221024162901.535972-1-Jason@zx2c4.com
https://lore.kernel.org/all/20221024162843.535921-1-Jason@zx2c4.com
https://lore.kernel.org/all/20221024162823.535884-1-Jason@zx2c4.com
https://lore.kernel.org/all/20221024162756.535776-1-Jason@zx2c4.com
Jason
On Mon, Oct 24, 2022 at 9:34 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> Give these a minute to hit Lore, but patches just submitted to various
> maintainers as fixes (for 6.1), since these are already broken on some
> architecture.
Hold up a minute.
Some of those may need more thought. For example, that first one:
> https://lore.kernel.org/all/20221024163005.536097-1-Jason@zx2c4.com
looks just *strange*. As far as I can tell, no other wireless drivers
do any sign checks at all.
Now, I didn't really look around a lot, but looking at a few other
SIOCSIWESSID users, most don't even seem to treat it as a string at
all, but as just a byte dump (so memcpy() instead of strncpy())
As far as I know, there are no actual rules for SSID character sets,
and while using utf-8 or something else might cause interoperability
problems, this driver seems to be just confused. If you want to check
for "printable characters", that check is still wrong.
So I don't think this is a "assume char is signed" issue. I think this
is a "driver is confused" issue.
IOW, I don't think these are 6.1 material as some kind of obvious
fixes, at least not without driver author acks.
Linus
From: Linus Torvalds
> Sent: 24 October 2022 18:11
...
>
> As far as I know, there are no actual rules for SSID character sets,
> and while using utf-8 or something else might cause interoperability
> problems, this driver seems to be just confused. If you want to check
> for "printable characters", that check is still wrong.
Are SSIDs even required to be printable at all?
While most systems only let you configure 'strings' I don't remember
that actually being a requirement.
(I'm sure I read up on this years ago.)
The frame format will be using an explicit length.
So even embedded zeros may be valid!
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Hi Linus,
On Mon, Oct 24, 2022 at 7:11 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> IOW, I don't think these are 6.1 material as some kind of obvious
> fixes, at least not without driver author acks.
Right, these are posted to the authors and maintainers to look at.
Maybe they punt them until 6.2 which would be fine too.
> On Mon, Oct 24, 2022 at 9:34 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> Some of those may need more thought. For example, that first one:
>
> > https://lore.kernel.org/all/20221024163005.536097-1-Jason@zx2c4.com
>
> looks just *strange*. As far as I can tell, no other wireless drivers
> do any sign checks at all.
>
> Now, I didn't really look around a lot, but looking at a few other
> SIOCSIWESSID users, most don't even seem to treat it as a string at
> all, but as just a byte dump (so memcpy() instead of strncpy())
>
> As far as I know, there are no actual rules for SSID character sets,
> and while using utf-8 or something else might cause interoperability
> problems, this driver seems to be just confused. If you want to check
> for "printable characters", that check is still wrong.
>
> So I don't think this is a "assume char is signed" issue. I think this
> is a "driver is confused" issue.
Yea I had a few versions of this. In one of them, I changed `char
*extra` throughout the wireless stack into `s8 *extra` and in another
`u8 *extra`, after realizing they're mostly just bags of bits. But
that seemed pretty invasive when, indeed, this staging driver is just
a little screwy.
So perhaps the right fix is to just kill that whole snippet? Kalle - opinions?
Jason
"Jason A. Donenfeld" <Jason@zx2c4.com> writes:
> Hi Linus,
>
> On Mon, Oct 24, 2022 at 7:11 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>> IOW, I don't think these are 6.1 material as some kind of obvious
>> fixes, at least not without driver author acks.
>
> Right, these are posted to the authors and maintainers to look at.
> Maybe they punt them until 6.2 which would be fine too.
>
>> On Mon, Oct 24, 2022 at 9:34 AM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>> Some of those may need more thought. For example, that first one:
>>
>> > https://lore.kernel.org/all/20221024163005.536097-1-Jason@zx2c4.com
>>
>> looks just *strange*. As far as I can tell, no other wireless drivers
>> do any sign checks at all.
>>
>> Now, I didn't really look around a lot, but looking at a few other
>> SIOCSIWESSID users, most don't even seem to treat it as a string at
>> all, but as just a byte dump (so memcpy() instead of strncpy())
Yes, SSID should be handled as a byte array with a specified length.
Back in the day some badly written code treated it as string but luckily
it's rare now.
>> As far as I know, there are no actual rules for SSID character sets,
>> and while using utf-8 or something else might cause interoperability
>> problems, this driver seems to be just confused. If you want to check
>> for "printable characters", that check is still wrong.
>>
>> So I don't think this is a "assume char is signed" issue. I think this
>> is a "driver is confused" issue.
>
> Yea I had a few versions of this. In one of them, I changed `char
> *extra` throughout the wireless stack into `s8 *extra` and in another
> `u8 *extra`, after realizing they're mostly just bags of bits. But
> that seemed pretty invasive when, indeed, this staging driver is just
> a little screwy.
>
> So perhaps the right fix is to just kill that whole snippet? Kalle - opinions?
I would also remove the whole 'extra[i] < 0', seems like a pointless
check to me. And I see that you already submitted v2, good.
--
https://patchwork.kernel.org/project/linux-wireless/list/
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches
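As a rough illustration of "a byte array with a specified length", the sketch below stores an SSID with memcpy() and compares with memcmp(), so embedded zeros and bytes above 0x7f need no special handling and no sign check is involved. `struct ssid`, `ssid_set()` and `ssid_equal()` are hypothetical names for this example, not kernel API; the 32-octet limit is the maximum SSID length in IEEE 802.11.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SSID_MAX_LEN 32 /* maximum SSID length in octets */

/* Hypothetical container: raw octets plus an explicit length, no terminator. */
struct ssid {
    unsigned char bytes[SSID_MAX_LEN];
    size_t len;
};

/* Copy len raw octets; returns 0 on success, -1 if len is too large. */
int ssid_set(struct ssid *dst, const void *src, size_t len)
{
    if (len > SSID_MAX_LEN)
        return -1;
    memcpy(dst->bytes, src, len);
    dst->len = len;
    return 0;
}

/* Two SSIDs match only if length and every octet match. */
int ssid_equal(const struct ssid *a, const struct ssid *b)
{
    return a->len == b->len && memcmp(a->bytes, b->bytes, a->len) == 0;
}

/* Self-check with an SSID that a string API would truncate at the zero. */
void check_ssid(void)
{
    const unsigned char raw[4] = { 'a', 0x00, 0xe9, 'b' };
    struct ssid x, y;

    assert(ssid_set(&x, raw, sizeof(raw)) == 0);
    assert(ssid_set(&y, raw, sizeof(raw)) == 0);
    assert(x.len == 4 && x.bytes[2] == 0xe9);
    assert(ssid_equal(&x, &y));
    assert(strlen((const char *)raw) == 1); /* string view loses 3 octets */
}
```

Treating the buffer this way sidesteps the `extra[i] < 0` question entirely, since no octet value is rejected.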
On Wed, Oct 19, 2022 at 1:30 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> So let's just eliminate this particular variety of heisensign bugs
> entirely. Set `-funsigned-char` globally, so that gcc makes the type
> unsigned on all architectures.
>
> This will break things in some places and fix things in others, so this
> will likely cause a bit of churn while reconciling the type misuse.
Yeah, if we were still in the merge window, I'd probably apply this,
but as things stand, I think it should go into linux-next and cook
there for the next merge window.
Anybody willing to put this in their -next trees?
Any breakage it causes is likely going to be fairly subtle, and in
some random driver that isn't used on architectures that already have
an unsigned 'char' type.
I think the architectures with an unsigned 'char' are arm, powerpc and
s390, in all their variations (ie both 32- and 64-bit).
So all *core* code should be fine with this, but that still leaves a
lot of drivers that have likely never been tested on anything but x86,
and could just stop working.
I don't think breakage is very *likely*, but I suspect it exists.
Linus
On Wed, Oct 19, 2022 at 04:56:03PM -0700, Linus Torvalds wrote:
> I think the architectures with an unsigned 'char' are arm, powerpc and
> s390, in all their variations (ie both 32- and 64-bit).
xtensa and most MIPS configurations as well, fwiw.
Segher
On Wed, Oct 19, 2022 at 04:56:03PM -0700, Linus Torvalds wrote:
> On Wed, Oct 19, 2022 at 1:30 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> >
> > So let's just eliminate this particular variety of heisensign bugs
> > entirely. Set `-funsigned-char` globally, so that gcc makes the type
> > unsigned on all architectures.
> >
> > This will break things in some places and fix things in others, so this
> > will likely cause a bit of churn while reconciling the type misuse.
>
> Yeah, if we were still in the merge window, I'd probably apply this,
> but as things stand, I think it should go into linux-next and cook
> there for the next merge window.
>
> Anybody willing to put this in their -next trees?
Sure, happy to take it.
>
> Any breakage it causes is likely going to be fairly subtle, and in
> some random driver that isn't used on architectures that already have
> an unsigned 'char' type.
>
> I think the architectures with an unsigned 'char' are arm, powerpc and
> s390, in all their variations (ie both 32- and 64-bit).
>
> So all *core* code should be fine with this, but that still leaves a
> lot of drivers that have likely never been tested on anything but x86,
> and could just stop working.
>
> I don't think breakage is very *likely*, but I suspect it exists.
Given I've started with cleaning up one driver already, I'll keep my eye
on further breakage.
Jason
>
> Linus
On Wed, Oct 19, 2022 at 5:02 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> Given I've started with cleaning up one driver already, I'll keep my eye
> on further breakage.
I wonder if we could just check for code generation differences some way.
I tested a couple of files, and was able to find differences, eg
# kernel/sched/core.c:8861: pr_info("task:%-15.15s state:%c",
p->comm, task_state_to_char(p));
- movzbl state_char.149(%rax), %edx # state_char[_60], state_char[_60]
+ movsbl state_char.149(%rax), %edx # state_char[_60], state_char[_60]
call _printk #
because the 'char' for the '%c' is passed as an integer. And the
tracing code has the
.is_signed = is_signed_type(_type)
initializers, which obviously change when the type is 'char'.
But I also checked a number of other files that didn't have that
pattern at all, and there was zero code generation difference, even
when the "readable asm" output itself had some changes in some of the
internal label names.
That was what my old 'sparse' trial thing was actually *hoping* (but
failed) to do, ie notice when the signedness of a char actually
affects code generation. And it does in fact seem fairly rare.
Having some scripting automation that just notices "this changes code
generation in function X" might actually be interesting, and judging
by my quick tests might not be *too* verbose.
Linus
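The two differences described above (the widening of a `char` passed for `%c`, and the `is_signed_type()` initializers) can be reproduced in a small standalone sketch. The macro below is modeled on the kernel's `is_signed_type()` but simplified for userspace, so treat it as an assumption rather than the exact kernel definition; `widen_signed()` and `widen_unsigned()` are hypothetical helpers.

```c
#include <assert.h>

/* Simplified model of is_signed_type(): true if (type)-1 compares below 1. */
#define is_signed_type(type) (((type)(-1)) < (type)1)

/* Widening a signed byte sign-extends (what movsbl does on x86). */
int widen_signed(signed char c)
{
    return c;
}

/* Widening an unsigned byte zero-extends (what movzbl does on x86). */
int widen_unsigned(unsigned char c)
{
    return c;
}

/* Self-check: the two widenings agree for 7-bit values and differ above. */
void check_widening(void)
{
    assert(is_signed_type(signed char));
    assert(!is_signed_type(unsigned char));
    assert(widen_signed('R') == widen_unsigned('R'));
    assert(widen_signed(-16) == -16);
    assert(widen_unsigned(0xf0) == 240);
}
```

For plain `char`, the macro's result simply follows the compiler setting, which is why those initializers show up in any before/after comparison. The widening instruction changes too, but for ASCII state characters the value passed to the callee is the same either way.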
On Wed, Oct 19, 2022 at 05:38:55PM -0700, Linus Torvalds wrote:
> Having some scripting automation that just notices "this changes code
> generation in function X" might actually be interesting, and judging
> by my quick tests might not be *too* verbose.
On the reproducible build comparison system[1] we use for checking a lot
of the KSPP work for .text deltas, an allmodconfig finds a fair bit for
this change. Out of 33900 .o files, 1005 have changes.
Spot checking matches a lot of what you found already...
u64 flags = how->flags;
...
fs/open.c:1123:
int acc_mode = ACC_MODE(flags);
- 1c86: movsbl 0x0(%rdx),%edx
+ 1c86: movzbl 0x0(%rdx),%edx
#define ACC_MODE(x) ("\004\002\006\006"[(x)&O_ACCMODE])
Ignoring those, it goes down to 625, and spot checking those is more
difficult, but looks to be mostly register selection changes dominating
the delta. The resulting vmlinux sizes are identical, though.
-Kees
[1] A fancier version of:
https://outflux.net/blog/archives/2022/06/24/finding-binary-differences/
--
Kees Cook
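The ACC_MODE() hit above is a case where the generated instruction changes but the computed value cannot: every byte in the lookup string is below 128, so sign-extension and zero-extension give the same `int`. A sketch under that reading, with hypothetical names (`ACC_MODE_SKETCH`, `ACCMODE_MASK`) and the access-mode mask written out as 3 rather than taken from `<fcntl.h>`:

```c
#include <assert.h>

#define ACCMODE_MASK 3 /* assumption: stands in for O_ACCMODE */

/* Same shape as the quoted macro: index a string literal of plain char. */
#define ACC_MODE_SKETCH(x) ("\004\002\006\006"[(x) & ACCMODE_MASK])

/* The same table with the signedness spelled out both ways. */
static const signed char table_signed[4] = { 4, 2, 6, 6 };
static const unsigned char table_unsigned[4] = { 4, 2, 6, 6 };

int acc_mode_signed(unsigned long long flags)
{
    return table_signed[flags & ACCMODE_MASK]; /* sign-extending load */
}

int acc_mode_unsigned(unsigned long long flags)
{
    return table_unsigned[flags & ACCMODE_MASK]; /* zero-extending load */
}

/* Self-check: all three forms agree for every possible index. */
void check_acc_mode(void)
{
    unsigned long long f;

    for (f = 0; f < 4; f++) {
        assert(acc_mode_signed(f) == acc_mode_unsigned(f));
        assert(ACC_MODE_SKETCH(f) == acc_mode_signed(f));
    }
}
```

This matches the decision in the message above to set such deltas aside: they show up in a binary diff without changing behaviour.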
On Thu, Oct 20, 2022 at 11:41:29AM -0700, Kees Cook wrote:
> On Wed, Oct 19, 2022 at 05:38:55PM -0700, Linus Torvalds wrote:
> > Having some scripting automation that just notices "this changes code
> > generation in function X" might actually be interesting, and judging
> > by my quick tests might not be *too* verbose.
>
> On the reproducible build comparison system[1] we use for checking a lot
> of the KSPP work for .text deltas, an allmodconfig finds a fair bit for
> this change. Out of 33900 .o files, 1005 have changes.
>
> Spot checking matches a lot of what you found already...
>
> u64 flags = how->flags;
> ...
> fs/open.c:1123:
> int acc_mode = ACC_MODE(flags);
> - 1c86: movsbl 0x0(%rdx),%edx
> + 1c86: movzbl 0x0(%rdx),%edx
>
> #define ACC_MODE(x) ("\004\002\006\006"[(x)&O_ACCMODE])
>
> Ignoring those, it goes down to 625, and spot checking those is more
> difficult, but looks to be mostly register selection changes dominating
> the delta. The resulting vmlinux sizes are identical, though.
>
> -Kees
>
> [1] A fancier version of:
> https://outflux.net/blog/archives/2022/06/24/finding-binary-differences/
Say, don't we have some way of outputting LLVM IL? I saw some
-fno-discard-value-names floating through a few days ago. Apparently you
can do `make LLVM=1 fs/select.ll`? This might have less noise in it.
I'll play on the airplane tomorrow.
Jason
On Wed, Oct 19, 2022 at 05:38:55PM -0700, Linus Torvalds wrote:
> On Wed, Oct 19, 2022 at 5:02 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> >
> > Given I've started with cleaning up one driver already, I'll keep my eye
> > on further breakage.
>
> I wonder if we could just check for code generation differences some way.
> Having some scripting automation that just notices "this changes code
> generation in function X" might actually be interesting, and judging
> by my quick tests might not be *too* verbose.
Or even just some allyesconfig diffing.
> I tested a couple of files, and was able to find differences, eg
>
> # kernel/sched/core.c:8861: pr_info("task:%-15.15s state:%c",
> p->comm, task_state_to_char(p));
> - movzbl state_char.149(%rax), %edx # state_char[_60], state_char[_60]
> + movsbl state_char.149(%rax), %edx # state_char[_60], state_char[_60]
> call _printk #
>
> because the 'char' for the '%c' is passed as an integer. And the
Seems harmless though.
> tracing code has the
>
> .is_signed = is_signed_type(_type)
>
> initializers, which obviously change when the type is 'char'.
And likewise, looking at the types of initializers that's used with.
Actually, for the array one, unsigned is probably more sensible anyway.
The thing is, anyhow, that most code that works without -funsigned-char
*will* work with it, because the core of the kernel obviously works fine
on ARM already. The problematic areas will be x86-specific drivers that
have never been tested on other archs. i915 comes to mind -- as a
general rule, it already does all manner of insane things. But there's
obviously a lot of other hardware that's only ever run on Intel. So I'm
much more concerned about that than I am about code in, say, kernel/sched.
Jason