[PATCH v2 05/25] target/i386: cleanup cc_op changes for REP/REPZ/REPNZ

Paolo Bonzini posted 25 patches 6 months, 3 weeks ago
[PATCH v2 05/25] target/i386: cleanup cc_op changes for REP/REPZ/REPNZ
Posted by Paolo Bonzini 6 months, 3 weeks ago
gen_update_cc_op must be called before control flow splits.  Do it
where the jump on ECX!=0 is translated.

On the other hand, remove the call before gen_jcc1, which takes care of
it already, and explain why REPZ/REPNZ need not use CC_OP_DYNAMIC---the
translation block ends before any control-flow-dependent cc_op could
be observed.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 target/i386/tcg/translate.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
index 3f1d2858fc9..6b766f5dd3f 100644
--- a/target/i386/tcg/translate.c
+++ b/target/i386/tcg/translate.c
@@ -1242,11 +1242,15 @@ static inline void gen_jcc1(DisasContext *s, int b, TCGLabel *l1)
 }
 
 /* XXX: does not work with gdbstub "ice" single step - not a
-   serious problem */
+   serious problem.  The caller can jump to the returned label
+   to stop the REP but, if the flags have changed, it has to call
+   gen_update_cc_op before doing so.  */
 static TCGLabel *gen_jz_ecx_string(DisasContext *s)
 {
     TCGLabel *l1 = gen_new_label();
     TCGLabel *l2 = gen_new_label();
+
+    gen_update_cc_op(s);
     gen_op_jnz_ecx(s, l1);
     gen_set_label(l2);
     gen_jmp_rel_csize(s, 0, 1);
@@ -1342,7 +1346,6 @@ static void gen_repz(DisasContext *s, MemOp ot,
                      void (*fn)(DisasContext *s, MemOp ot))
 {
     TCGLabel *l2;
-    gen_update_cc_op(s);
     l2 = gen_jz_ecx_string(s);
     fn(s, ot);
     gen_op_add_reg_im(s, s->aflag, R_ECX, -1);
@@ -1364,11 +1367,13 @@ static void gen_repz2(DisasContext *s, MemOp ot, int nz,
                       void (*fn)(DisasContext *s, MemOp ot))
 {
     TCGLabel *l2;
-    gen_update_cc_op(s);
     l2 = gen_jz_ecx_string(s);
+    /*
+     * Only one iteration is done at a time, so there is
+     * no control flow junction here and cc_op is never dynamic.
+     */
     fn(s, ot);
     gen_op_add_reg_im(s, s->aflag, R_ECX, -1);
-    gen_update_cc_op(s);
     gen_jcc1(s, (JCC_Z << 1) | (nz ^ 1), l2);
     if (s->repz_opt) {
         gen_op_jz_ecx(s, l2);
-- 
2.45.0
Re: [PATCH v2 05/25] target/i386: cleanup cc_op changes for REP/REPZ/REPNZ
Posted by Richard Henderson 6 months, 3 weeks ago
On 5/6/24 01:09, Paolo Bonzini wrote:
> gen_update_cc_op must be called before control flow splits.  Do it
> where the jump on ECX!=0 is translated.
> 
> On the other hand, remove the call before gen_jcc1, which takes care of
> it already, and explain why REPZ/REPNZ need not use CC_OP_DYNAMIC---the
> translation block ends before any control-flow-dependent cc_op could
> be observed.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>   target/i386/tcg/translate.c | 13 +++++++++----
>   1 file changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/target/i386/tcg/translate.c b/target/i386/tcg/translate.c
> index 3f1d2858fc9..6b766f5dd3f 100644
> --- a/target/i386/tcg/translate.c
> +++ b/target/i386/tcg/translate.c
> @@ -1242,11 +1242,15 @@ static inline void gen_jcc1(DisasContext *s, int b, TCGLabel *l1)
>   }
>   
>   /* XXX: does not work with gdbstub "ice" single step - not a
> -   serious problem */
> +   serious problem.  The caller can jump to the returned label
> +   to stop the REP but, if the flags have changed, it has to call
> +   gen_update_cc_op before doing so.  */
>   static TCGLabel *gen_jz_ecx_string(DisasContext *s)
>   {
>       TCGLabel *l1 = gen_new_label();
>       TCGLabel *l2 = gen_new_label();
> +
> +    gen_update_cc_op(s);
>       gen_op_jnz_ecx(s, l1);
>       gen_set_label(l2);
>       gen_jmp_rel_csize(s, 0, 1);
> @@ -1342,7 +1346,6 @@ static void gen_repz(DisasContext *s, MemOp ot,
>                        void (*fn)(DisasContext *s, MemOp ot))
>   {
>       TCGLabel *l2;
> -    gen_update_cc_op(s);
>       l2 = gen_jz_ecx_string(s);
>       fn(s, ot);
>       gen_op_add_reg_im(s, s->aflag, R_ECX, -1);

Ok.


> @@ -1364,11 +1367,13 @@ static void gen_repz2(DisasContext *s, MemOp ot, int nz,
>                         void (*fn)(DisasContext *s, MemOp ot))
>   {
>       TCGLabel *l2;
> -    gen_update_cc_op(s);
>       l2 = gen_jz_ecx_string(s);
> +    /*
> +     * Only one iteration is done at a time, so there is
> +     * no control flow junction here and cc_op is never dynamic.
> +     */
>       fn(s, ot);
>       gen_op_add_reg_im(s, s->aflag, R_ECX, -1);
> -    gen_update_cc_op(s);
>       gen_jcc1(s, (JCC_Z << 1) | (nz ^ 1), l2);
>       if (s->repz_opt) {
>           gen_op_jz_ecx(s, l2);

Ok, but only because gen_jcc1 does the gen_update_cc_op.  The comment is neither correct 
nor necessary.

The reason to write cc_op before branches instead of junctions is to avoid having *two* 
writes of cc_op on either side of the branch.


r~
Re: [PATCH v2 05/25] target/i386: cleanup cc_op changes for REP/REPZ/REPNZ
Posted by Paolo Bonzini 6 months, 3 weeks ago
On Mon, May 6, 2024 at 6:08 PM Richard Henderson
<richard.henderson@linaro.org> wrote:
> > -    gen_update_cc_op(s);
> >       l2 = gen_jz_ecx_string(s);
> > +    /*
> > +     * Only one iteration is done at a time, so there is
> > +     * no control flow junction here and cc_op is never dynamic.
> > +     */
> >       fn(s, ot);
> >       gen_op_add_reg_im(s, s->aflag, R_ECX, -1);
> > -    gen_update_cc_op(s);
> >       gen_jcc1(s, (JCC_Z << 1) | (nz ^ 1), l2);
> >       if (s->repz_opt) {
> >           gen_op_jz_ecx(s, l2);
>
> Ok, but only because gen_jcc1 does the gen_update_cc_op.  The comment is neither correct
> nor necessary.

Yeah, it's true that gen_jcc1 does the update. On the other hand,
there are two different kinds of cc_op updates:

1) at branches, if you know that only one of the sides might write
cc_op - so you ensure it's up-to-date before the branch - and set
CC_OP_DYNAMIC at the junction. Same if you have movcond instead of a
branch.

2) at end of translation block, to spill the value lazily (because in
the middle of the TB we are able to restore it from insn data). With
these patches there is never a need to do this, because gen_jmp_rel()
and gen_jmp_rel_csize() take care of it.

The comment deals with the former, the removal with the latter.

The idea of the comment is that after SCAS/CMPS you have CC_OP_SUB*,
so in principle you may expect that you need to set CC_OP_DYNAMIC
explicitly at the end of a REPZ/REPNZ, which is where the CX != 0 and
CX == 0 paths join. But this is not necessary, because there is
nothing after that instruction - the translation block ends.

So I guess the comment could instead be placed at the end of the function?

    /*
     * Only one iteration is done at a time, so the translation
     * block has ended unconditionally at this point and there
     * is no control flow junction - no need to set CC_OP_DYNAMIC.
     */

What do you think?

Paolo
Re: [PATCH v2 05/25] target/i386: cleanup cc_op changes for REP/REPZ/REPNZ
Posted by Richard Henderson 6 months, 3 weeks ago
On 5/6/24 09:31, Paolo Bonzini wrote:
> The comment deals with the former, the removal with the latter.
> 
> The idea of the comment is that after SCAS/CMPS you have CC_OP_SUB*,
> so in principle you may expect that you need to set CC_OP_DYNAMIC
> explicitly at the end of a REPZ/REPNZ, which is where the CX != 0 and
> CX == 0 paths join. But this is not necessary, because there is
> nothing after that instruction - the translation block ends.
> 
> So I guess the comment could instead be placed at the end of the function?
> 
>      /*
>       * Only one iteration is done at a time, so the translation
>       * block has ended unconditionally at this point and there
>       * is no control flow junction - no need to set CC_OP_DYNAMIC.
>       */
> 
> What do you think?

Just before gen_jmp_rel_csize?  Yes, that seems good.


r~