[v3] RE: [PATCH next v3 0/5] minmax: Relax type checks in min() and max().

RE: [PATCH next v3 0/5] minmax: Relax type checks in min() and max().
Posted by David Laight 2 years, 5 months ago
From: Linus Torvalds
> Sent: Tuesday, August 22, 2023 6:36 PM
> 
> On Mon, 21 Aug 2023 at 11:24, Kees Cook <keescook@chromium.org> wrote:
> >
> > It seems like the foot-gun problems are when a value gets clamped by the
> > imposed type. Can't we just warn about those cases?
> 
> I think that the problem with min_t() is that it is used as a random
> "when min() warns about different types", and that it basically just
> adds a cast without any semantic meaning.
> 
> End result: we currently have 4500+ of those cases (and another 1300
> uses of 'max_t') and I bet that several of them are the narrowing
> kind. And some are probably valid.
> 
> And if we tighten up "min_t()" type rules, we'll just hit the *next*
> problem in the series, namely "what are people going to do now?"
> 
> We don't want to just keep pushing the problem down.
> 
> So I actually mostly liked all the *other* patches in David's series:
> using 'min_unsigned()' and friends adds a nice *semantic* layer, not
> just a cast. And relaxing the checking of min/max doesn't cause the
> same the "push problems down" issue, as long as the relaxing is
> reasonable.
> 
> (Side note: I'm not convinced 'min_unsigned()' is the right syntax.
> While I like the concept, I think 'min()' is often used as a part of
> other expressions, and 'min_unsigned()' ends up making for a very
> illegible long and complex thing. I think we might as well use
> 'umin()/umax()', matching our type system).

Picking a name is like painting the bike-shed :-)

> It's just that I very much don't think it's reasonable to relax "20u"
> (or - more commonly - sizeof) to be any random constant signed
> integer, and it should *not* compare well with signed integers and not
> silently become a signed compare.
> 
> (But I do think that it's very ok to go the other way: compare a
> _unsigned_ value with a "signed" constant integer like 20. The two
> cases are not symmetrical: '20' can be a perfectly fine unsigned
> value, but '20u' cannot be treated signed).

I'll rebase the patch after the next -rc1 is out with that removed. 

> And while I don't like David's patch to silently turn unsigned
> constant signed, I do acknowledge that very often the *source* of the
> unsignedness is a 'sizeof()' expression, and then you have an integer
> that gets compared to a size, and you end up using 'min_t()'. But I do
> *NOT* want to fix those cases by ignoring the signedness.
> 
> Just a quick grep of
> 
>     git grep 'min_t(size_t' | wc
> 
> shows that quite a lot of the 'min_t()' cases are this exact issue.
> But I absolutely do *not* think the solution is to relax 'min()'.

size_t is actually a problem with unsigned constants.
It is 'unsigned int' on 32bit and 'unsigned long' on 64bit.
Making it hard to use a literal.

> I suspect the fix to those cases is to much more eagerly use
> 'clamp()'. Almost every single time you do a "compare to a size", it
> really is "make sure the integer is within the size range". So while
> 
>     int val
>    ...
>     x = min(val,sizeof(xyz));
> 
> is horrendously wrong and *should* warn,

The difficulty is getting people to correct it by changing
the type on 'val' to be unsigned.
Sometimes it is just a local variable.
Often it just can't ever be negative, or there is an immediately
preceding check.

Indeed if it is negative that is probably a bug.
(eg a missing error check from an earlier call.)
No amount of compile-time checking is going to help.
About the only thing that would help would be a run-time
check and a 'goto label' if negative.

There are plenty of places where a (short) buffer length is
passed to a function as 'int' - even though negative values
are completely invalid.
In reality the type change needs 'chasing back' through to
all the callers.

> I think doing
> 
>    x = clamp(val, 0, sizeof(xyz));
> 
> is a *much* nicer model, and should not warn even if "val" and the
> upper bound do not agree. In the above kind of situation, suddenly it
> *is* ok to treat the 'sizeof()' as a signed integer, but only because
> we have that explicit lower bound too.

That would require major rework on clamp() :-)

It would also be nice to get clamp(unsigned_var, 0u, 20u) to
compile without the annoying warning from the compiler.
I think you have to use:
	builtin_choose_expr(x, var, 0) >= builtin_choose_expr(x, low, 0)

> In other words: we should not "try to fix the types". That was what
> caused the problem in the first place, with "min_t()" just trying to
> fix the type mismatch with a cast. Casts are wrong, and we should have
> known that. The end result is horrendous, and I do agree with David on
> that too.

Some of the worst casts are the ones that are only there for sparse.
The compiler sees '(force __be32)var' as '(uint)var' rather than 'var'.
That really ought to always have been 'force(__be32, var)' so that
it can be made transparent to the compiler.

> I think we should strive to fix it with "semantic" fixes instead. Like
> the above "use clamp() instead of min(), and the fundamental
> signedness problem simply goes away because it has enough semantic
> meaning to be well-defined".

Unfortunately it adds an extra compare and branch to get mis-predicted.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)