From: Linus Torvalds
> Sent: 21 July 2022 19:07
...
> (b) since you have that r10 use anyway, why can't you just generate the simpler
>
>	movl $-IMM,%r10d
>	addl -4(%calldest),%r10d
>
> instead? You only need ZF anyway.
>
> Maybe you need to add some "r10 is clobbered" thing, I don't know.
>
> But again: I don't know llvm, so the above is basically me just doing
> the "pattern matching monkey" thing.
>
>	Linus

Since: "If the callee is a variadic function, then the number of floating
point arguments passed to the function in vector registers must be provided
by the caller in the AL register."

And that that never happens in the kernel you can use %eax instead
of %r10d.

Even in userspace %al can be set non-zero after the signature check.

If you are willing to cut the signature down to 26 bits and
then ensure that one of the bytes of -IMM (or ~IMM if you
use xor) is 0xcc and jump back to that on error the check
becomes:

	movl	$-IMM,%eax
1:	addl	-4(%calldest),%eax
	jnz	1b-1			// or -2, -3, -4
	add	$num_fp_args,%eax	// If needed non-zero
	call	%calldest

I think that adds 10 bytes to the call site.
Although with retpoline thunks (and no fp varargs calls)
all but the initial movl can go into the thunk.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
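The arithmetic behind the sequence in the mail above can be modelled in C. This is only a rough sketch and not code from the thread: the helper names `cfi_check_residue` and `int3_jump_back` are hypothetical, the signature word is passed in directly instead of being loaded from -4(%calldest), and the `movl` immediate is assumed to be encoded little-endian immediately before label `1`, so its byte i sits at `1b-(4-i)`.

```c
#include <assert.h>   /* only needed for the usage checks shown below */
#include <stdint.h>

/*
 * Models "movl $-IMM,%eax; addl -4(%calldest),%eax".
 * The result is zero (ZF set) exactly when stored == imm.
 */
static uint32_t cfi_check_residue(uint32_t imm, uint32_t stored)
{
    uint32_t eax = (uint32_t)0 - imm;   /* movl $-IMM,%eax */

    eax += stored;                      /* addl -4(%calldest),%eax */
    return eax;
}

/*
 * Returns N such that "jnz 1b-N" would land on a 0xcc (int3) byte
 * inside the movl immediate, or 0 if -IMM contains no 0xcc byte.
 * Assumes the 4 immediate bytes end right at label 1.
 */
static int int3_jump_back(uint32_t imm)
{
    uint32_t neg = (uint32_t)0 - imm;

    for (int i = 3; i >= 0; i--) {
        if (((neg >> (8 * i)) & 0xff) == 0xcc)
            return 4 - i;               /* byte i sits at 1b-(4-i) */
    }
    return 0;
}
```

With these helpers, `cfi_check_residue(0x12345678, 0x12345678)` is 0 (the call would proceed), a mismatched word gives a non-zero residue, and `int3_jump_back(0x33ffffff)` is 1 because -IMM is then 0xcc000001, whose top byte is the last byte of the immediate.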
On Thu, Jul 21, 2022 at 10:01:12PM +0000, David Laight wrote:

> Since: "If the callee is a variadic function, then the number of floating
> point arguments passed to the function in vector registers must be provided
> by the caller in the AL register."
>
> And that that never happens in the kernel you can use %eax instead
> of %r10d.

Except there's the AMD BTC thing and we should (compiler patch seems
MIA) have an unconditional: 'xor %eax,%eax' in front of every function
call.

(The official mitigation strategy was CALL; LFENCE IIRC, but that's so
horrible nobody is actually considering that)

Yes, the suggested sequence ends with rax being zero, but since we start
the speculation before that result is computed that's not good enough I
suspect.
From: Peter Zijlstra
> Sent: 22 July 2022 12:03
>
> On Thu, Jul 21, 2022 at 10:01:12PM +0000, David Laight wrote:
>
> > Since: "If the callee is a variadic function, then the number of floating
> > point arguments passed to the function in vector registers must be provided
> > by the caller in the AL register."
> >
> > And that that never happens in the kernel you can use %eax instead
> > of %r10d.
>
> Except there's the AMD BTC thing and we should (compiler patch seems
> MIA) have an unconditional: 'xor %eax,%eax' in front of every function
> call.

I've just read https://www.amd.com/system/files/documents/technical-guidance-for-mitigating-branch-type-confusion_v7_20220712.pdf

It doesn't seem to suggest clearing registers except as a vague
'might help' before a function return (to limit what the speculated
code can do).

The only advantage I can think of for 'xor ax,ax' is that it is done as
a register rename - and isn't dependent on older instructions.
So it might reduce some pipeline stalls.

I'm guessing that someone might find a 'gadget' that depends on %eax
and it may be possible to find somewhere that leaves an arbitrary
value in it.
It is also about the only register that isn't live!

> (The official mitigation strategy was CALL; LFENCE IIRC, but that's so
> horrible nobody is actually considering that)
>
> Yes, the suggested sequence ends with rax being zero, but since we start
> the speculation before that result is computed that's not good enough I
> suspect.

The speculated code can't use the 'wrong' %eax value.
The only problem is that reading from -4(%r11) is likely to be a
D$ miss giving plenty of time for the cpu to execute 'crap'.
But I'm not sure a later 'xor ax,ax' helps.
(OTOH this is all horrid and makes my brain hurt.)

AFAICT with BTC you 'just lose'.
I thought it was bad enough that some cpus used the BTB for predicted
conditional jumps - but using it to decide 'this must be a branch
instruction' seems especially broken.

Seems the best thing to do with those cpus is to run an embedded
system with a busybox+buildroot userspace where almost everything
runs as root :-)

	David