[PATCH] x86/nospec: Fix evaluate_nospec() code generation under Clang

Andrew Cooper posted 1 patch 2 years ago
Test gitlab-ci failed
Patches applied successfully (tree, apply log)
git fetch https://gitlab.com/xen-project/patchew/xen tags/patchew/20220425175603.21086-1-andrew.cooper3@citrix.com
xen/arch/x86/include/asm/nospec.h | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
[PATCH] x86/nospec: Fix evaluate_nospec() code generation under Clang
Posted by Andrew Cooper 2 years ago
It turns out that evaluate_nospec() code generation is not safe under Clang.
Given:

  void eval_nospec_test(int x)
  {
      if ( evaluate_nospec(x) )
          asm volatile ("nop #true" ::: "memory");
      else
          asm volatile ("nop #false" ::: "memory");
  }

Clang emits:

  <eval_nospec_test>:
         0f ae e8                lfence
         85 ff                   test   %edi,%edi
         74 02                   je     <eval_nospec_test+0x9>
         90                      nop
         c3                      ret
         90                      nop
         c3                      ret

which is not safe because the lfence has been hoisted above the conditional
jump.  Clang concludes that both barrier_nospec_true()'s have identical side
effects and can safely be merged.

Clang can be persuaded that the side effects are different if there are
different comments in the asm blocks.  This is fragile, but no more fragile
that other aspects of this construct.

Introduce barrier_nospec_false() with a separate internal comment to prevent
Clang merging it with barrier_nospec_true() despite the otherwise-identical
content.  The generated code now becomes:

  <eval_nospec_test>:
         85 ff                   test   %edi,%edi
         74 05                   je     <eval_nospec_test+0x9>
         0f ae e8                lfence
         90                      nop
         c3                      ret
         0f ae e8                lfence
         90                      nop
         c3                      ret

which has the correct number of lfence's, and in the correct place.

Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
 xen/arch/x86/include/asm/nospec.h | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/include/asm/nospec.h b/xen/arch/x86/include/asm/nospec.h
index 5312ae4c6f31..7150e76b87fb 100644
--- a/xen/arch/x86/include/asm/nospec.h
+++ b/xen/arch/x86/include/asm/nospec.h
@@ -10,15 +10,26 @@
 static always_inline bool barrier_nospec_true(void)
 {
 #ifdef CONFIG_SPECULATIVE_HARDEN_BRANCH
-    alternative("lfence", "", X86_FEATURE_SC_NO_BRANCH_HARDEN);
+    alternative("lfence #nospec-true", "", X86_FEATURE_SC_NO_BRANCH_HARDEN);
 #endif
     return true;
 }
 
+static always_inline bool barrier_nospec_false(void)
+{
+#ifdef CONFIG_SPECULATIVE_HARDEN_BRANCH
+    alternative("lfence #nospec-false", "", X86_FEATURE_SC_NO_BRANCH_HARDEN);
+#endif
+    return false;
+}
+
 /* Allow to protect evaluation of conditionals with respect to speculation */
 static always_inline bool evaluate_nospec(bool condition)
 {
-    return condition ? barrier_nospec_true() : !barrier_nospec_true();
+    if ( condition )
+        return barrier_nospec_true();
+    else
+        return barrier_nospec_false();
 }
 
 /* Allow to block speculative execution in generic code */
-- 
2.11.0


Re: [PATCH] x86/nospec: Fix evaluate_nospec() code generation under Clang
Posted by Roger Pau Monné 2 years ago
On Mon, Apr 25, 2022 at 06:56:03PM +0100, Andrew Cooper wrote:
> It turns out that evaluate_nospec() code generation is not safe under Clang.
> Given:
> 
>   void eval_nospec_test(int x)
>   {
>       if ( evaluate_nospec(x) )
>           asm volatile ("nop #true" ::: "memory");
>       else
>           asm volatile ("nop #false" ::: "memory");
>   }
> 
> Clang emits:
> 
>   <eval_nospec_test>:
>          0f ae e8                lfence
>          85 ff                   test   %edi,%edi
>          74 02                   je     <eval_nospec_test+0x9>
>          90                      nop
>          c3                      ret
>          90                      nop
>          c3                      ret
> 
> which is not safe because the lfence has been hoisted above the conditional
> jump.  Clang concludes that both barrier_nospec_true()'s have identical side
> effects and can safely be merged.
> 
> Clang can be persuaded that the side effects are different if there are
> different comments in the asm blocks.  This is fragile, but no more fragile
> that other aspects of this construct.
> 
> Introduce barrier_nospec_false() with a separate internal comment to prevent
> Clang merging it with barrier_nospec_true() despite the otherwise-identical
> content.  The generated code now becomes:
> 
>   <eval_nospec_test>:
>          85 ff                   test   %edi,%edi
>          74 05                   je     <eval_nospec_test+0x9>
>          0f ae e8                lfence
>          90                      nop
>          c3                      ret
>          0f ae e8                lfence
>          90                      nop
>          c3                      ret
> 
> which has the correct number of lfence's, and in the correct place.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Like Jan I wonder what the clang devs think of this solution.  Is
there any test in clang to assert that comments won't be stripped from
asm blocks before optimization?

> ---
> CC: Jan Beulich <JBeulich@suse.com>
> CC: Roger Pau Monné <roger.pau@citrix.com>
> CC: Wei Liu <wl@xen.org>
> ---
>  xen/arch/x86/include/asm/nospec.h | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/x86/include/asm/nospec.h b/xen/arch/x86/include/asm/nospec.h
> index 5312ae4c6f31..7150e76b87fb 100644
> --- a/xen/arch/x86/include/asm/nospec.h
> +++ b/xen/arch/x86/include/asm/nospec.h
> @@ -10,15 +10,26 @@
>  static always_inline bool barrier_nospec_true(void)
>  {
>  #ifdef CONFIG_SPECULATIVE_HARDEN_BRANCH
> -    alternative("lfence", "", X86_FEATURE_SC_NO_BRANCH_HARDEN);
> +    alternative("lfence #nospec-true", "", X86_FEATURE_SC_NO_BRANCH_HARDEN);
>  #endif
>      return true;
>  }
>  
> +static always_inline bool barrier_nospec_false(void)
> +{
> +#ifdef CONFIG_SPECULATIVE_HARDEN_BRANCH
> +    alternative("lfence #nospec-false", "", X86_FEATURE_SC_NO_BRANCH_HARDEN);
> +#endif
> +    return false;
> +}
> +
>  /* Allow to protect evaluation of conditionals with respect to speculation */
>  static always_inline bool evaluate_nospec(bool condition)
>  {
> -    return condition ? barrier_nospec_true() : !barrier_nospec_true();
> +    if ( condition )
> +        return barrier_nospec_true();
> +    else
> +        return barrier_nospec_false();
>  }

Is the switch from using a ternary operator also a requirement for
clang not optimizing this? (I would assume not, but better ask)

Thanks, Roger.

Re: [PATCH] x86/nospec: Fix evaluate_nospec() code generation under Clang
Posted by Jan Beulich 2 years ago
On 25.04.2022 19:56, Andrew Cooper wrote:
> It turns out that evaluate_nospec() code generation is not safe under Clang.
> Given:
> 
>   void eval_nospec_test(int x)
>   {
>       if ( evaluate_nospec(x) )
>           asm volatile ("nop #true" ::: "memory");
>       else
>           asm volatile ("nop #false" ::: "memory");
>   }
> 
> Clang emits:
> 
>   <eval_nospec_test>:
>          0f ae e8                lfence
>          85 ff                   test   %edi,%edi
>          74 02                   je     <eval_nospec_test+0x9>
>          90                      nop
>          c3                      ret
>          90                      nop
>          c3                      ret
> 
> which is not safe because the lfence has been hoisted above the conditional
> jump.  Clang concludes that both barrier_nospec_true()'s have identical side
> effects and can safely be merged.
> 
> Clang can be persuaded that the side effects are different if there are
> different comments in the asm blocks.  This is fragile, but no more fragile
> that other aspects of this construct.
> 
> Introduce barrier_nospec_false() with a separate internal comment to prevent
> Clang merging it with barrier_nospec_true() despite the otherwise-identical
> content.  The generated code now becomes:
> 
>   <eval_nospec_test>:
>          85 ff                   test   %edi,%edi
>          74 05                   je     <eval_nospec_test+0x9>
>          0f ae e8                lfence
>          90                      nop
>          c3                      ret
>          0f ae e8                lfence
>          90                      nop
>          c3                      ret
> 
> which has the correct number of lfence's, and in the correct place.
> 
> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>

I can live with us going this route, so
Reviewed-by: Jan Beulich <jbeulich@suse.com>

However, I'd like alternatives to be considered: Would two asm()s
perhaps not be candidates for merging when they have different
(perhaps fake) arguments or clobbers? If so, would this be less
fragile than relying on comments, which clearly any layer could be
viewed as free to strip off (when the same isn't true for arguments
and clobbers)?

Also you did say you'd open an issue with Clang to try to get their
view on relying on comments here. Could you please add a reference
to that issue in the description here?

Jan