[PATCH] tcg/optimize: Fix folding of vector bitsel

WANG Rui posted 1 patch 1 week, 2 days ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20250919124901.2756538-1-wangrui@loongson.cn
Maintainers: Richard Henderson <richard.henderson@linaro.org>
tcg/optimize.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
[PATCH] tcg/optimize: Fix folding of vector bitsel
Posted by WANG Rui 1 week, 2 days ago
It looks like a typo. When the false value (C) is the constant -1, the
correct fold should be: R = B | ~A

Reproducer (LoongArch64 assembly):

     .text
     .globl  _start
 _start:
     vldi    $vr1, 3073
     vldi    $vr2, 1023
     vbitsel.v       $vr0, $vr2, $vr1, $vr2
     vpickve2gr.d    $a1, $vr0, 1
     xori    $a0, $a1, 1
     li.w    $a7, 93
     syscall 0

Fixes: e58b977238e3 ("tcg/optimize: Optimize bitsel_vec")
Link: https://github.com/llvm/llvm-project/issues/159610
Signed-off-by: WANG Rui <wangrui@loongson.cn>
---
 tcg/optimize.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index 3638ab9fea..f69702b26e 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -1568,9 +1568,10 @@ static bool fold_bitsel_vec(OptContext *ctx, TCGOp *op)
             return fold_and(ctx, op);
         }
         if (fv == -1 && TCG_TARGET_HAS_orc_vec) {
+            TCGArg ta = op->args[2];
             op->opc = INDEX_op_orc_vec;
             op->args[2] = op->args[1];
-            op->args[1] = op->args[3];
+            op->args[1] = ta;
             return fold_orc(ctx, op);
         }
     }
-- 
2.51.0
Re: [PATCH] tcg/optimize: Fix folding of vector bitsel
Posted by Michael Tokarev 3 days, 19 hours ago
On 19.09.2025 15:49, WANG Rui wrote:
> It looks like a typo. When the false value (C) is the constant -1, the
> correct fold should be: R = B | ~A
> 
> Reproducer (LoongArch64 assembly):
> 
>       .text
>       .globl  _start
>   _start:
>       vldi    $vr1, 3073
>       vldi    $vr2, 1023
>       vbitsel.v       $vr0, $vr2, $vr1, $vr2
>       vpickve2gr.d    $a1, $vr0, 1
>       xori    $a0, $a1, 1
>       li.w    $a7, 93
>       syscall 0
> 
> Fixes: e58b977238e3 ("tcg/optimize: Optimize bitsel_vec")
> Link: https://github.com/llvm/llvm-project/issues/159610
> Signed-off-by: WANG Rui <wangrui@loongson.cn>

It also looks like qemu-stable@ material.

Please let me know if it isn't.

Thanks,

/mjt

>   tcg/optimize.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tcg/optimize.c b/tcg/optimize.c
> index 3638ab9fea..f69702b26e 100644
> --- a/tcg/optimize.c
> +++ b/tcg/optimize.c
> @@ -1568,9 +1568,10 @@ static bool fold_bitsel_vec(OptContext *ctx, TCGOp *op)
>               return fold_and(ctx, op);
>           }
>           if (fv == -1 && TCG_TARGET_HAS_orc_vec) {
> +            TCGArg ta = op->args[2];
>               op->opc = INDEX_op_orc_vec;
>               op->args[2] = op->args[1];
> -            op->args[1] = op->args[3];
> +            op->args[1] = ta;
>               return fold_orc(ctx, op);
>           }
>       }
Re: [PATCH] tcg/optimize: Fix folding of vector bitsel
Posted by Richard Henderson 3 days, 19 hours ago
On 9/24/25 12:31, Michael Tokarev wrote:
> On 19.09.2025 15:49, WANG Rui wrote:
>> It looks like a typo. When the false value (C) is the constant -1, the
>> correct fold should be: R = B | ~A
>>
>> Reproducer (LoongArch64 assembly):
>>
>>       .text
>>       .globl  _start
>>   _start:
>>       vldi    $vr1, 3073
>>       vldi    $vr2, 1023
>>       vbitsel.v       $vr0, $vr2, $vr1, $vr2
>>       vpickve2gr.d    $a1, $vr0, 1
>>       xori    $a0, $a1, 1
>>       li.w    $a7, 93
>>       syscall 0
>>
>> Fixes: e58b977238e3 ("tcg/optimize: Optimize bitsel_vec")
>> Link: https://github.com/llvm/llvm-project/issues/159610
>> Signed-off-by: WANG Rui <wangrui@loongson.cn>
> 
> It also looks like qemu-stable@ material.
> 
> Please let me know if it isn't.

It is, yes.

r~

Re: [PATCH] tcg/optimize: Fix folding of vector bitsel
Posted by Philippe Mathieu-Daudé 4 days, 10 hours ago
On 19/9/25 14:49, WANG Rui wrote:
> It looks like a typo.

Likely from the TCG_TARGET_HAS_andc_vec case.

> When the false value (C) is the constant -1, the
> correct fold should be: R = B | ~A
> 
> Reproducer (LoongArch64 assembly):
> 
>       .text
>       .globl  _start
>   _start:
>       vldi    $vr1, 3073
>       vldi    $vr2, 1023
>       vbitsel.v       $vr0, $vr2, $vr1, $vr2
>       vpickve2gr.d    $a1, $vr0, 1
>       xori    $a0, $a1, 1
>       li.w    $a7, 93
>       syscall 0
> 
> Fixes: e58b977238e3 ("tcg/optimize: Optimize bitsel_vec")
> Link: https://github.com/llvm/llvm-project/issues/159610
> Signed-off-by: WANG Rui <wangrui@loongson.cn>
> ---
>   tcg/optimize.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/tcg/optimize.c b/tcg/optimize.c
> index 3638ab9fea..f69702b26e 100644
> --- a/tcg/optimize.c
> +++ b/tcg/optimize.c
> @@ -1568,9 +1568,10 @@ static bool fold_bitsel_vec(OptContext *ctx, TCGOp *op)
>               return fold_and(ctx, op);
>           }
>           if (fv == -1 && TCG_TARGET_HAS_orc_vec) {
> +            TCGArg ta = op->args[2];
>               op->opc = INDEX_op_orc_vec;
>               op->args[2] = op->args[1];
> -            op->args[1] = op->args[3];
> +            op->args[1] = ta;
>               return fold_orc(ctx, op);
>           }
>       }
Looks correct, but I don't understand the swap. Can't we justkeep the 
same argument order for an ORC opcode? I'd have done:

-- >8 --
@@ -1569,8 +1569,6 @@ static bool fold_bitsel_vec(OptContext *ctx, TCGOp 
*op)
          }
          if (fv == -1 && TCG_TARGET_HAS_orc_vec) {
              op->opc = INDEX_op_orc_vec;
-            op->args[2] = op->args[1];
-            op->args[1] = op->args[3];
              return fold_orc(ctx, op);
          }
      }
---
Re: [PATCH] tcg/optimize: Fix folding of vector bitsel
Posted by WANG Rui 4 days, 10 hours ago
On Wed, Sep 24, 2025 at 12:03 PM Philippe Mathieu-Daudé
<philmd@linaro.org> wrote:
>
> On 19/9/25 14:49, WANG Rui wrote:
> > It looks like a typo.
>
> Likely from the TCG_TARGET_HAS_andc_vec case.
>
> > When the false value (C) is the constant -1, the
> > correct fold should be: R = B | ~A
> >
> > Reproducer (LoongArch64 assembly):
> >
> >       .text
> >       .globl  _start
> >   _start:
> >       vldi    $vr1, 3073
> >       vldi    $vr2, 1023
> >       vbitsel.v       $vr0, $vr2, $vr1, $vr2
> >       vpickve2gr.d    $a1, $vr0, 1
> >       xori    $a0, $a1, 1
> >       li.w    $a7, 93
> >       syscall 0
> >
> > Fixes: e58b977238e3 ("tcg/optimize: Optimize bitsel_vec")
> > Link: https://github.com/llvm/llvm-project/issues/159610
> > Signed-off-by: WANG Rui <wangrui@loongson.cn>
> > ---
> >   tcg/optimize.c | 3 ++-
> >   1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/tcg/optimize.c b/tcg/optimize.c
> > index 3638ab9fea..f69702b26e 100644
> > --- a/tcg/optimize.c
> > +++ b/tcg/optimize.c
> > @@ -1568,9 +1568,10 @@ static bool fold_bitsel_vec(OptContext *ctx, TCGOp *op)
> >               return fold_and(ctx, op);
> >           }
> >           if (fv == -1 && TCG_TARGET_HAS_orc_vec) {
> > +            TCGArg ta = op->args[2];
> >               op->opc = INDEX_op_orc_vec;
> >               op->args[2] = op->args[1];
> > -            op->args[1] = op->args[3];
> > +            op->args[1] = ta;
> >               return fold_orc(ctx, op);
> >           }
> >       }
> Looks correct, but I don't understand the swap. Can't we justkeep the
> same argument order for an ORC opcode? I'd have done:
>
> -- >8 --
> @@ -1569,8 +1569,6 @@ static bool fold_bitsel_vec(OptContext *ctx, TCGOp
> *op)
>           }
>           if (fv == -1 && TCG_TARGET_HAS_orc_vec) {
>               op->opc = INDEX_op_orc_vec;
> -            op->args[2] = op->args[1];
> -            op->args[1] = op->args[3];
>               return fold_orc(ctx, op);
>           }

Bitwise logic can be tricky and easy to get wrong. In general, (a |
~b) != (b | ~a). For example, when a = 0 and b = 1, the results
differ.

>       }
> ---
Re: [PATCH] tcg/optimize: Fix folding of vector bitsel
Posted by Philippe Mathieu-Daudé 3 days, 14 hours ago
On 24/9/25 06:29, WANG Rui wrote:
> On Wed, Sep 24, 2025 at 12:03 PM Philippe Mathieu-Daudé
> <philmd@linaro.org> wrote:
>>
>> On 19/9/25 14:49, WANG Rui wrote:
>>> It looks like a typo.
>>
>> Likely from the TCG_TARGET_HAS_andc_vec case.
>>
>>> When the false value (C) is the constant -1, the
>>> correct fold should be: R = B | ~A
>>>
>>> Reproducer (LoongArch64 assembly):
>>>
>>>        .text
>>>        .globl  _start
>>>    _start:
>>>        vldi    $vr1, 3073
>>>        vldi    $vr2, 1023
>>>        vbitsel.v       $vr0, $vr2, $vr1, $vr2
>>>        vpickve2gr.d    $a1, $vr0, 1
>>>        xori    $a0, $a1, 1
>>>        li.w    $a7, 93
>>>        syscall 0
>>>
>>> Fixes: e58b977238e3 ("tcg/optimize: Optimize bitsel_vec")
>>> Link: https://github.com/llvm/llvm-project/issues/159610
>>> Signed-off-by: WANG Rui <wangrui@loongson.cn>
>>> ---
>>>    tcg/optimize.c | 3 ++-
>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/tcg/optimize.c b/tcg/optimize.c
>>> index 3638ab9fea..f69702b26e 100644
>>> --- a/tcg/optimize.c
>>> +++ b/tcg/optimize.c
>>> @@ -1568,9 +1568,10 @@ static bool fold_bitsel_vec(OptContext *ctx, TCGOp *op)
>>>                return fold_and(ctx, op);
>>>            }
>>>            if (fv == -1 && TCG_TARGET_HAS_orc_vec) {
>>> +            TCGArg ta = op->args[2];
>>>                op->opc = INDEX_op_orc_vec;
>>>                op->args[2] = op->args[1];
>>> -            op->args[1] = op->args[3];
>>> +            op->args[1] = ta;
>>>                return fold_orc(ctx, op);
>>>            }
>>>        }
>> Looks correct, but I don't understand the swap. Can't we justkeep the
>> same argument order for an ORC opcode? I'd have done:
>>
>> -- >8 --
>> @@ -1569,8 +1569,6 @@ static bool fold_bitsel_vec(OptContext *ctx, TCGOp
>> *op)
>>            }
>>            if (fv == -1 && TCG_TARGET_HAS_orc_vec) {
>>                op->opc = INDEX_op_orc_vec;
>> -            op->args[2] = op->args[1];
>> -            op->args[1] = op->args[3];
>>                return fold_orc(ctx, op);
>>            }
> 
> Bitwise logic can be tricky and easy to get wrong. In general, (a |
> ~b) != (b | ~a). For example, when a = 0 and b = 1, the results
> differ.

Oh right.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

> 
>>        }
>> ---
> 

Re: [PATCH] tcg/optimize: Fix folding of vector bitsel
Posted by Richard Henderson 4 days, 15 hours ago
On 9/19/25 05:49, WANG Rui wrote:
> It looks like a typo. When the false value (C) is the constant -1, the
> correct fold should be: R = B | ~A
> 
> Reproducer (LoongArch64 assembly):
> 
>       .text
>       .globl  _start
>   _start:
>       vldi    $vr1, 3073
>       vldi    $vr2, 1023
>       vbitsel.v       $vr0, $vr2, $vr1, $vr2
>       vpickve2gr.d    $a1, $vr0, 1
>       xori    $a0, $a1, 1
>       li.w    $a7, 93
>       syscall 0
> 
> Fixes: e58b977238e3 ("tcg/optimize: Optimize bitsel_vec")
> Link: https://github.com/llvm/llvm-project/issues/159610
> Signed-off-by: WANG Rui <wangrui@loongson.cn>
> ---
>   tcg/optimize.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~

> 
> diff --git a/tcg/optimize.c b/tcg/optimize.c
> index 3638ab9fea..f69702b26e 100644
> --- a/tcg/optimize.c
> +++ b/tcg/optimize.c
> @@ -1568,9 +1568,10 @@ static bool fold_bitsel_vec(OptContext *ctx, TCGOp *op)
>               return fold_and(ctx, op);
>           }
>           if (fv == -1 && TCG_TARGET_HAS_orc_vec) {
> +            TCGArg ta = op->args[2];
>               op->opc = INDEX_op_orc_vec;
>               op->args[2] = op->args[1];
> -            op->args[1] = op->args[3];
> +            op->args[1] = ta;
>               return fold_orc(ctx, op);
>           }
>       }