[PATCH 2/3] tcg: Eliminate one store for in-place 128-bit dup_mem

Richard Henderson posted 3 patches 5 years, 5 months ago
Maintainers: Richard Henderson <rth@twiddle.net>, Paolo Bonzini <pbonzini@redhat.com>
There is a newer version of this series
Posted by Richard Henderson 5 years, 5 months ago
Do not store back to the exact memory from which we just loaded.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/tcg-op-gvec.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index 793d4ba64c..fcc25b04e6 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -1581,7 +1581,7 @@ void tcg_gen_gvec_dup_mem(unsigned vece, uint32_t dofs, uint32_t aofs,
             TCGv_vec in = tcg_temp_new_vec(TCG_TYPE_V128);
 
             tcg_gen_ld_vec(in, cpu_env, aofs);
-            for (i = 0; i < oprsz; i += 16) {
+            for (i = (aofs == dofs) * 16; i < oprsz; i += 16) {
                 tcg_gen_st_vec(in, cpu_env, dofs + i);
             }
             tcg_temp_free_vec(in);
@@ -1591,7 +1591,7 @@ void tcg_gen_gvec_dup_mem(unsigned vece, uint32_t dofs, uint32_t aofs,
 
             tcg_gen_ld_i64(in0, cpu_env, aofs);
             tcg_gen_ld_i64(in1, cpu_env, aofs + 8);
-            for (i = 0; i < oprsz; i += 16) {
+            for (i = (aofs == dofs) * 16; i < oprsz; i += 16) {
                 tcg_gen_st_i64(in0, cpu_env, dofs + i);
                 tcg_gen_st_i64(in1, cpu_env, dofs + i + 8);
             }
-- 
2.25.1
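
Note on the loop start: in C, (aofs == dofs) evaluates to 1 or 0, so the
loop index begins at 16 for an in-place operation and at 0 otherwise. The
first 16 bytes of the destination are skipped only when they are the same
bytes that were just loaded. The standalone sketch below models this with
plain memory copies. It is not QEMU code: env[], stores_issued and
dup_mem_128() are hypothetical names used only for illustration, and it
assumes oprsz is a non-zero multiple of 16.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the CPU state area that cpu_env points at. */
static uint8_t env[256];

/* Number of 16-byte stores issued by the most recent call. */
static unsigned stores_issued;

/*
 * Model of the patched loop: replicate the 16 bytes at aofs across
 * oprsz bytes starting at dofs.  Assumes (for this sketch only) that
 * oprsz is a non-zero multiple of 16 and both ranges fit inside env[].
 */
static void dup_mem_128(uint32_t dofs, uint32_t aofs, uint32_t oprsz)
{
    uint8_t in[16];
    uint32_t i;

    assert(oprsz >= 16 && oprsz % 16 == 0);

    memcpy(in, env + aofs, 16);            /* the single 128-bit load */
    stores_issued = 0;

    /* (aofs == dofs) is 1 or 0, so i starts at 16 or at 0. */
    for (i = (aofs == dofs) * 16; i < oprsz; i += 16) {
        memcpy(env + dofs + i, in, 16);    /* one 128-bit store */
        stores_issued++;
    }
}
```

With a 64-byte operation, the in-place call issues three stores instead of
four, while a call with distinct offsets still issues one store per 16-byte
lane.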


Re: [PATCH 2/3] tcg: Eliminate one store for in-place 128-bit dup_mem
Posted by Philippe Mathieu-Daudé 5 years, 5 months ago
On Fri, 28 Aug 2020 at 20:04, Richard Henderson <richard.henderson@linaro.org>
wrote:

> Do not store back to the exact memory from which we just loaded.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
