We need a branch to determine when the instruction can touch the
accumulator, but movcond already provides one, so using both a branch
and a movcond is redundant. Use a single branch to implement the
behaviour described in the Intel manual.

Signed-off-by: Wei Li <lw945lw945@yahoo.com>
---
target/i386/tcg/translate.c | 27 +++++++++++++++++++--------
1 file changed, 19 insertions(+), 8 deletions(-)
diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 2a94d33742..05be8d08e6 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -5339,15 +5339,17 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
     case 0x1b0:
     case 0x1b1: /* cmpxchg Ev, Gv */
         {
-            TCGv oldv, newv, cmpv;
+            TCGLabel *label1, *label2;
+            TCGv oldv, newv, cmpv, a0;
 
             ot = mo_b_d(b, dflag);
             modrm = x86_ldub_code(env, s);
             reg = ((modrm >> 3) & 7) | REX_R(s);
             mod = (modrm >> 6) & 3;
-            oldv = tcg_temp_new();
-            newv = tcg_temp_new();
-            cmpv = tcg_temp_new();
+            oldv = tcg_temp_local_new();
+            newv = tcg_temp_local_new();
+            cmpv = tcg_temp_local_new();
+            a0 = tcg_temp_local_new();
             gen_op_mov_v_reg(s, ot, newv, reg);
             tcg_gen_mov_tl(cmpv, cpu_regs[R_EAX]);
 
@@ -5365,24 +5367,32 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
                     gen_op_mov_v_reg(s, ot, oldv, rm);
                 } else {
                     gen_lea_modrm(env, s, modrm);
-                    gen_op_ld_v(s, ot, oldv, s->A0);
+                    tcg_gen_mov_tl(a0, s->A0);
+                    gen_op_ld_v(s, ot, oldv, a0);
                     rm = 0; /* avoid warning */
                 }
+                label1 = gen_new_label();
                 gen_extu(ot, oldv);
                 gen_extu(ot, cmpv);
-                /* store value = (old == cmp ? new : old); */
-                tcg_gen_movcond_tl(TCG_COND_EQ, newv, oldv, cmpv, newv, oldv);
+                tcg_gen_brcond_tl(TCG_COND_EQ, oldv, cmpv, label1);
+                label2 = gen_new_label();
                 if (mod == 3) {
                     gen_op_mov_reg_v(s, ot, R_EAX, oldv);
+                    tcg_gen_br(label2);
+                    gen_set_label(label1);
                     gen_op_mov_reg_v(s, ot, rm, newv);
                 } else {
                     /* Perform an unconditional store cycle like physical cpu;
                        must be before changing accumulator to ensure
                        idempotency if the store faults and the instruction
                        is restarted */
-                    gen_op_st_v(s, ot, newv, s->A0);
+                    gen_op_st_v(s, ot, oldv, a0);
                     gen_op_mov_reg_v(s, ot, R_EAX, oldv);
+                    tcg_gen_br(label2);
+                    gen_set_label(label1);
+                    gen_op_st_v(s, ot, newv, a0);
                 }
+                gen_set_label(label2);
             }
             tcg_gen_mov_tl(cpu_cc_src, oldv);
             tcg_gen_mov_tl(s->cc_srcT, cmpv);
@@ -5391,6 +5401,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
             tcg_temp_free(oldv);
             tcg_temp_free(newv);
             tcg_temp_free(cmpv);
+            tcg_temp_free(a0);
         }
         break;
     case 0x1c7: /* cmpxchg8b */
--
2.30.2
On 3/19/22 09:06, Wei Li wrote:
> We need a branch to determine when the instruction can touch the
> accumulator. But there is a branch provided by movcond.

There is no branch in movcond -- this expands to cmov.

> -                /* store value = (old == cmp ? new : old); */
> -                tcg_gen_movcond_tl(TCG_COND_EQ, newv, oldv, cmpv, newv, oldv);
> +                tcg_gen_brcond_tl(TCG_COND_EQ, oldv, cmpv, label1);

...

>                     /* Perform an unconditional store cycle like physical cpu;
>                        must be before changing accumulator to ensure
>                        idempotency if the store faults and the instruction
>                        is restarted */

Your branch invalidates the comment -- the store becomes conditional, and we
no longer get a write fault on read-only pages when the comparison fails.
OTOH, we're already getting the incorrect SIGSEGV behaviour, since we get a
read fault on an unmapped page instead of a write fault.

The faulting behaviour could be addressed with a write_probe prior to the
original load.

Alternately, we can use the tcg_gen_atomic_cmpxchg_tl path whenever mod != 3.
While an unlocked cmpxchg need not be atomic, it is not required to be
non-atomic either, and it would reduce code duplication.

r~
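The branch-free select that movcond expands to can be sketched in C. This is an illustrative model only; `movcond_eq` is a made-up helper, not a TCG API, showing why `ret = (c1 == c2) ? v1 : v2` needs no branch and can lower to a host cmov:

```c
#include <stdint.h>

/* Branch-free model of tcg_gen_movcond_tl(TCG_COND_EQ, ret, c1, c2, v1, v2):
 * compute an all-ones or all-zeros mask from the comparison and blend the
 * two values with it, so no control flow is required. */
static uint64_t movcond_eq(uint64_t c1, uint64_t c2, uint64_t v1, uint64_t v2)
{
    uint64_t mask = -(uint64_t)(c1 == c2);   /* ~0 if equal, 0 otherwise */
    return (v1 & mask) | (v2 & ~mask);
}
```

Since both `v1` and `v2` are evaluated unconditionally, the construct has no taken/not-taken path at all, which is the reviewer's point below.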
> There is no branch in movcond -- this expands to cmov.

Thanks, I get it.

> Your branch invalidates the comment -- the store becomes conditional, and we no longer get
> a write fault on read-only pages when the comparison fails. OTOH, we're already getting
> the incorrect SIGSEGV behaviour, since we get a read fault on an unmapped page instead of
> a write fault.

I do not get the point. Here is my understanding: when the comparison fails,
the program executes gen_op_st_v(s, ot, oldv, a0), and we get the incorrect
SIGSEGV behaviour of a read fault instead of a write fault. But why is the
SIGSEGV behaviour a read fault and not a write fault? The memory operation is
a write operation.

> Alternately, we can use the tcg_gen_atomic_cmpxchg_tl path whenever mod != 3. While an
> unlocked cmpxchg need not be atomic, it is not required to be non-atomic either, and it
> would reduce code duplication.

Good advice; I will implement it in patch v2.
Wei Li
On Monday, March 21, 2022, 03:07:50 AM GMT+8, Richard Henderson <richard.henderson@linaro.org> wrote:
On 3/19/22 09:06, Wei Li wrote:
> We need a branch to determine when the instruction can touch the
> accumulator. But there is a branch provided by movcond.
There is no branch in movcond -- this expands to cmov.
> - /* store value = (old == cmp ? new : old); */
> - tcg_gen_movcond_tl(TCG_COND_EQ, newv, oldv, cmpv, newv, oldv);
> + tcg_gen_brcond_tl(TCG_COND_EQ, oldv, cmpv, label1);
...
> /* Perform an unconditional store cycle like physical cpu;
> must be before changing accumulator to ensure
> idempotency if the store faults and the instruction
> is restarted */
Your branch invalidates the comment -- the store becomes conditional, and we no longer get
a write fault on read-only pages when the comparison fails. OTOH, we're already getting
the incorrect SIGSEGV behaviour, since we get a read fault on an unmapped page instead of
a write fault.
The faulting behaviour could be addressed with a write_probe prior to the original load.
Alternately, we can use the tcg_gen_atomic_cmpxchg_tl path whenever mod != 3. While an
unlocked cmpxchg need not be atomic, it is not required to be non-atomic either, and it
would reduce code duplication.
r~
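The unconditional store cycle under discussion can be made concrete with a minimal C model of unlocked CMPXCHG with a memory destination, following the Intel SDM description. This is an editor's sketch; `cmpxchg_model` and its parameter names are illustrative, not QEMU code:

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of unlocked CMPXCHG r/m, r: compare the accumulator
 * with the destination; on match store the source, on mismatch write the
 * old value back and load it into the accumulator.  The destination
 * receives a write cycle on BOTH paths, which is why the TCG translation
 * stores unconditionally before touching EAX. */
static bool cmpxchg_model(uint64_t *dest, uint64_t *rax, uint64_t src)
{
    uint64_t old = *dest;        /* read cycle */
    if (old == *rax) {
        *dest = src;             /* write cycle: success, ZF = 1 */
        return true;
    } else {
        *dest = old;             /* write cycle: old value written back */
        *rax = old;              /* accumulator updated on failure */
        return false;            /* ZF = 0 */
    }
}
```

Because the destination is written even when the comparison fails, a physical CPU raises a write fault on a read-only page in both cases; a translation that makes the store conditional loses that behaviour, which is the reviewer's objection.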