When the compiler can reduce the condition to a constant, it can elide the
conditional and one of the basic blocks. However, arch_evaluate_nospec() will
still insert speculation protection, despite there being nothing to protect.
Allow the speculation protection to be skipped when the compiler removes the
conditional entirely.

e.g. for x86, given:

  int foo(void)
  {
      if ( evaluate_nospec(1) )
          return 2;
      else
          return 42;
  }

then before, we get:

  <foo>:
      lfence
      mov    $0x2,%eax
      retq

and afterwards, we get:

  <foo>:
      mov    $0x2,%eax
      retq

which is correct. With no conditional branch to protect, the lfence isn't
providing any relevant safety.
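
For illustration only, the effect can be reproduced outside of Xen with a
minimal sketch along the following lines. The demo_* names and the barrier
counter are hypothetical stand-ins (the real x86 hook emits an lfence rather
than bumping a counter), and the constant case is only expected to fold when
building with optimisation enabled:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counter standing in for "a speculation barrier was emitted". */
static int demo_barriers;

/* Hypothetical stand-in for arch_evaluate_nospec(); x86 would use lfence. */
static inline bool demo_arch_evaluate_nospec(bool cond)
{
    demo_barriers++;
    return cond;
}

static inline __attribute__((__always_inline__))
bool demo_evaluate_nospec(bool cond)
{
    /* A compile-time constant leaves no conditional branch to protect. */
    if ( __builtin_constant_p(cond) )
        return cond;

    return demo_arch_evaluate_nospec(cond);
}

/* Constant condition: with optimisation, expected to reduce to "return 2". */
int demo_foo(void)
{
    if ( demo_evaluate_nospec(1) )
        return 2;
    else
        return 42;
}

/* Runtime condition: the volatile read cannot be folded, so the hook runs. */
int demo_bar(volatile int *p)
{
    if ( demo_evaluate_nospec(*p != 0) )
        return 2;
    else
        return 42;
}

/* Small self-check which holds at any optimisation level. */
void demo_self_check(void)
{
    volatile int v = 1;
    int before;

    assert(demo_foo() == 2);

    before = demo_barriers;
    assert(demo_bar(&v) == 2);
    assert(demo_barriers == before + 1);
}
```

Comparing the disassembly of demo_foo() and demo_bar() at -O2 should show the
counter update only in the latter, mirroring the lfence difference above.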
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
xen/include/xen/nospec.h | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/xen/include/xen/nospec.h b/xen/include/xen/nospec.h
index a4155af08770..56cf67a44176 100644
--- a/xen/include/xen/nospec.h
+++ b/xen/include/xen/nospec.h
@@ -18,6 +18,15 @@ static always_inline bool evaluate_nospec(bool cond)
#ifndef arch_evaluate_nospec
#define arch_evaluate_nospec(cond) cond
#endif
+
+ /*
+ * If the compiler can reduce the condition to a constant, then it won't
+ * be emitting a conditional branch, and there's nothing needing
+ * protecting.
+ */
+ if ( __builtin_constant_p(cond) )
+ return cond;
+
return arch_evaluate_nospec(cond);
}
--
2.30.2
On 04.03.2024 17:10, Andrew Cooper wrote:
> --- a/xen/include/xen/nospec.h
> +++ b/xen/include/xen/nospec.h
> @@ -18,6 +18,15 @@ static always_inline bool evaluate_nospec(bool cond)
> #ifndef arch_evaluate_nospec
> #define arch_evaluate_nospec(cond) cond
> #endif
> +
> + /*
> + * If the compiler can reduce the condition to a constant, then it won't
> + * be emitting a conditional branch, and there's nothing needing
> + * protecting.
> + */
> + if ( __builtin_constant_p(cond) )
> + return cond;
> +
> return arch_evaluate_nospec(cond);
> }
Even after some hours of consideration, I can't point out anything concrete
that could become a problem here, but I still have the gut feeling that this
would better be left to the arch logic. (There's the oddity of what the
function expands to if the #define in context takes effect, but that's
merely cosmetic.)
The one thing I'm firmly unhappy with is "won't" in the comment: We can't
know what the compiler will do. I've certainly known compilers which didn't
behave as you indicate here; nothing remotely recent, only ancient
DOS/Windows ones. Still, unlike with e.g. __{get,put}_user_bad(), the
compiler doing something unexpected would pass entirely silently here.
The other (minor) aspect I'm not entirely happy with is that you insert the
new code between the fallback #define and its use. I think (if we need such
a #define in the first place) the two would better stay close together.
As to the need for the #define: To me
static always_inline bool evaluate_nospec(bool cond)
{
#ifdef arch_evaluate_nospec
    return arch_evaluate_nospec(cond);
#else
    return cond;
#endif
}
or even
static always_inline bool evaluate_nospec(bool cond)
{
#ifdef arch_evaluate_nospec
    return arch_evaluate_nospec(cond);
#endif
    return cond;
}
reads no worse, but perhaps slightly better, and is then consistent with
block_speculation(). At which point the question about "insertion point"
here would hopefully also disappear, as this addition is meaningful only
ahead of the #else.
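
To make the layout concrete, a sketch of the first variant with the constant
check placed ahead of the #else might look as below. This is not the posted
patch: the sketch_* names and the call-counting hook are hypothetical, and
always_inline is spelled out as an attribute since Xen's macro isn't
available in a standalone build.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical hook; counts calls so the two paths can be told apart. */
static int sketch_hook_calls;

static inline bool sketch_arch_hook(bool cond)
{
    sketch_hook_calls++;
    return cond;
}

/* Assumption: pretend the arch provides a hook. Remove for the fallback. */
#define arch_evaluate_nospec(cond) sketch_arch_hook(cond)

static inline __attribute__((__always_inline__))
bool sketch_evaluate_nospec(bool cond)
{
#ifdef arch_evaluate_nospec
    /* Only meaningful when there is an arch hook which could be skipped. */
    if ( __builtin_constant_p(cond) )
        return cond;

    return arch_evaluate_nospec(cond);
#else
    return cond;
#endif
}

/* Small self-check which holds at any optimisation level. */
void sketch_self_check(void)
{
    volatile bool runtime = false;
    int before = sketch_hook_calls;

    assert(sketch_evaluate_nospec(true));      /* value is preserved */
    assert(!sketch_evaluate_nospec(runtime));  /* runtime value: hook is used */
    assert(sketch_hook_calls > before);
}
```

With the check inside the #ifdef, the fallback path stays a plain "return
cond" and no question arises about where the new code sits relative to a
fallback #define.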
Jan