[PATCH for-8.1] tcg: Use HAVE_CMPXCHG128 instead of CONFIG_CMPXCHG128

Richard Henderson posted 1 patch 10 months ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20230713202327.12662-1-richard.henderson@linaro.org
Maintainers: Richard Henderson <richard.henderson@linaro.org>, Paolo Bonzini <pbonzini@redhat.com>, Riku Voipio <riku.voipio@iki.fi>
[PATCH for-8.1] tcg: Use HAVE_CMPXCHG128 instead of CONFIG_CMPXCHG128
Posted by Richard Henderson 10 months ago
We adjust CONFIG_ATOMIC128 and CONFIG_CMPXCHG128 with
CONFIG_ATOMIC128_OPT in atomic128.h.  It is difficult
to tell when those changes have been applied with the
ifdef we must use with CONFIG_CMPXCHG128.  So instead
use HAVE_CMPXCHG128, which triggers -Werror=undef when
the proper header has not been included.

Improves tcg_gen_atomic_cmpxchg_i128 for s390x host, which
requires CONFIG_ATOMIC128_OPT.  Without this we fall back
to EXCP_ATOMIC to single-step 128-bit atomics, which is
slow enough to cause some tests to time out.

Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---

Thomas, this issue does not quite match the one you bisected, but
other than the cmpxchg, I don't see any qemu_{ld,st}_i128
being used in BootLinuxS390X.test_s390_ccw_virtio_tcg.

As far as I can see, this wasn't broken by the addition of
CONFIG_ATOMIC128_OPT, rather that fix didn't go far enough.

Anyway, test_s390_ccw_virtio_tcg now passes in 159s on our host.
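
For readers unfamiliar with the `#ifdef` versus `#if` distinction the
commit message relies on, here is a minimal sketch. The DEMO_* names
and both functions are hypothetical and are not part of QEMU; the
sketch only assumes that the build uses -Wundef together with -Werror.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for what a header such as qemu/atomic128.h
 * provides: a macro that is always defined, to either 0 or 1.
 */
#define DEMO_HAVE_CMPXCHG128 1

/*
 * Hypothetical stand-in for a configure-time macro that the header
 * would have adjusted.  It is deliberately left undefined here to model
 * a file that tests it without the adjustment having been applied.
 */

/* #ifdef on an undefined macro silently selects the fallback path. */
static int demo_path_ifdef(void)
{
#ifdef DEMO_CONFIG_CMPXCHG128
    return 1;   /* fast path */
#else
    return 0;   /* slow fallback, chosen without any diagnostic */
#endif
}

/*
 * #if on a macro that is always defined to 0 or 1.  If the defining
 * header were missing, -Wundef would warn that the name "is not
 * defined, evaluates to 0", and -Werror would turn that into a build
 * failure instead of a silent fallback.
 */
static int demo_path_if(void)
{
#if DEMO_HAVE_CMPXCHG128
    return 1;   /* fast path */
#else
    return 0;   /* slow fallback */
#endif
}
```

With the definitions above, demo_path_ifdef() takes the fallback while
demo_path_if() takes the fast path, which is the mismatch this patch is
meant to rule out.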


r~
---
 accel/tcg/tcg-runtime.h            | 2 +-
 include/exec/helper-proto-common.h | 2 ++
 accel/tcg/cputlb.c                 | 2 +-
 accel/tcg/user-exec.c              | 2 +-
 tcg/tcg-op-ldst.c                  | 2 +-
 accel/tcg/atomic_common.c.inc      | 2 +-
 6 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/accel/tcg/tcg-runtime.h b/accel/tcg/tcg-runtime.h
index 39e68007f9..186899a2c7 100644
--- a/accel/tcg/tcg-runtime.h
+++ b/accel/tcg/tcg-runtime.h
@@ -58,7 +58,7 @@ DEF_HELPER_FLAGS_5(atomic_cmpxchgq_be, TCG_CALL_NO_WG,
 DEF_HELPER_FLAGS_5(atomic_cmpxchgq_le, TCG_CALL_NO_WG,
                    i64, env, i64, i64, i64, i32)
 #endif
-#ifdef CONFIG_CMPXCHG128
+#if HAVE_CMPXCHG128
 DEF_HELPER_FLAGS_5(atomic_cmpxchgo_be, TCG_CALL_NO_WG,
                    i128, env, i64, i128, i128, i32)
 DEF_HELPER_FLAGS_5(atomic_cmpxchgo_le, TCG_CALL_NO_WG,
diff --git a/include/exec/helper-proto-common.h b/include/exec/helper-proto-common.h
index 4d4b022668..8b67170a22 100644
--- a/include/exec/helper-proto-common.h
+++ b/include/exec/helper-proto-common.h
@@ -7,6 +7,8 @@
 #ifndef HELPER_PROTO_COMMON_H
 #define HELPER_PROTO_COMMON_H
 
+#include "qemu/atomic128.h"  /* for HAVE_CMPXCHG128 */
+
 #define HELPER_H "accel/tcg/tcg-runtime.h"
 #include "exec/helper-proto.h.inc"
 #undef  HELPER_H
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index c2b81ec569..e0079c9a9d 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -3105,7 +3105,7 @@ void cpu_st16_mmu(CPUArchState *env, target_ulong addr, Int128 val,
 #include "atomic_template.h"
 #endif
 
-#if defined(CONFIG_ATOMIC128) || defined(CONFIG_CMPXCHG128)
+#if defined(CONFIG_ATOMIC128) || HAVE_CMPXCHG128
 #define DATA_SIZE 16
 #include "atomic_template.h"
 #endif
diff --git a/accel/tcg/user-exec.c b/accel/tcg/user-exec.c
index d95b875a6a..e7225e10e9 100644
--- a/accel/tcg/user-exec.c
+++ b/accel/tcg/user-exec.c
@@ -1385,7 +1385,7 @@ static void *atomic_mmu_lookup(CPUArchState *env, vaddr addr, MemOpIdx oi,
 #include "atomic_template.h"
 #endif
 
-#if defined(CONFIG_ATOMIC128) || defined(CONFIG_CMPXCHG128)
+#if defined(CONFIG_ATOMIC128) || HAVE_CMPXCHG128
 #define DATA_SIZE 16
 #include "atomic_template.h"
 #endif
diff --git a/tcg/tcg-op-ldst.c b/tcg/tcg-op-ldst.c
index 0fcc1618e5..d54c305598 100644
--- a/tcg/tcg-op-ldst.c
+++ b/tcg/tcg-op-ldst.c
@@ -778,7 +778,7 @@ typedef void (*gen_atomic_op_i64)(TCGv_i64, TCGv_env, TCGv_i64,
 #else
 # define WITH_ATOMIC64(X)
 #endif
-#ifdef CONFIG_CMPXCHG128
+#if HAVE_CMPXCHG128
 # define WITH_ATOMIC128(X) X,
 #else
 # define WITH_ATOMIC128(X)
diff --git a/accel/tcg/atomic_common.c.inc b/accel/tcg/atomic_common.c.inc
index ee222fd7e7..95a5c5ff12 100644
--- a/accel/tcg/atomic_common.c.inc
+++ b/accel/tcg/atomic_common.c.inc
@@ -41,7 +41,7 @@ CMPXCHG_HELPER(cmpxchgq_be, uint64_t)
 CMPXCHG_HELPER(cmpxchgq_le, uint64_t)
 #endif
 
-#ifdef CONFIG_CMPXCHG128
+#if HAVE_CMPXCHG128
 CMPXCHG_HELPER(cmpxchgo_be, Int128)
 CMPXCHG_HELPER(cmpxchgo_le, Int128)
 #endif
-- 
2.34.1
Re: [PATCH for-8.1] tcg: Use HAVE_CMPXCHG128 instead of CONFIG_CMPXCHG128
Posted by Thomas Huth 10 months ago
On 13/07/2023 22.23, Richard Henderson wrote:
> We adjust CONFIG_ATOMIC128 and CONFIG_CMPXCHG128 with
> CONFIG_ATOMIC128_OPT in atomic128.h.  It is difficult
> to tell when those changes have been applied with the
> ifdef we must use with CONFIG_CMPXCHG128.  So instead
> use HAVE_CMPXCHG128, which triggers -Werror=undef when
> the proper header has not been included.
> 
> Improves tcg_gen_atomic_cmpxchg_i128 for s390x host, which
> requires CONFIG_ATOMIC128_OPT.  Without this we fall back
> to EXCP_ATOMIC to single-step 128-bit atomics, which is
> slow enough to cause some tests to time out.
> 
> Reported-by: Thomas Huth <thuth@redhat.com>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> 
> Thomas, this issue does not quite match the one you bisected, but
> other than the cmpxchg, I don't see any qemu_{ld,st}_i128
> being used in BootLinuxS390X.test_s390_ccw_virtio_tcg.
> 
> As far as I can see, this wasn't broken by the addition of
> CONFIG_ATOMIC128_OPT, rather that fix didn't go far enough.
> 
> Anyway, test_s390_ccw_virtio_tcg now passes in 159s on our host.

Thanks, I can confirm that this fixes the issue for me, too.

Tested-by: Thomas Huth <thuth@redhat.com>
Re: [PATCH for-8.1] tcg: Use HAVE_CMPXCHG128 instead of CONFIG_CMPXCHG128
Posted by Philippe Mathieu-Daudé 10 months ago
Hi Richard,

On 13/7/23 22:23, Richard Henderson wrote:
> We adjust CONFIG_ATOMIC128 and CONFIG_CMPXCHG128 with
> CONFIG_ATOMIC128_OPT in atomic128.h.  It is difficult
> to tell when those changes have been applied with the
> ifdef we must use with CONFIG_CMPXCHG128.  So instead
> use HAVE_CMPXCHG128, which triggers -Werror=undef when
> the proper header has not been included.
> 
> Improves tcg_gen_atomic_cmpxchg_i128 for s390x host, which
> requires CONFIG_ATOMIC128_OPT.  Without this we fall back
> to EXCP_ATOMIC to single-step 128-bit atomics, which is
> slow enough to cause some tests to time out.
> 
> Reported-by: Thomas Huth <thuth@redhat.com>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
> 
> Thomas, this issue does not quite match the one you bisected, but
> other than the cmpxchg, I don't see any qemu_{ld,st}_i128
> being used in BootLinuxS390X.test_s390_ccw_virtio_tcg.
> 
> As far as I can see, this wasn't broken by the addition of
> CONFIG_ATOMIC128_OPT, rather that fix didn't go far enough.
> 
> Anyway, test_s390_ccw_virtio_tcg now passes in 159s on our host.

IIUC:

If we have CONFIG_ATOMIC128, we use qatomic_cmpxchg__nocheck;
else if we have CONFIG_CMPXCHG128 we use __sync_val_compare_and_swap_16;
in both cases we set HAVE_CMPXCHG128;
otherwise we cannot use atomic128 cmpxchg().

(I'm trying to figure out why we need both CONFIGs).
Re: [PATCH for-8.1] tcg: Use HAVE_CMPXCHG128 instead of CONFIG_CMPXCHG128
Posted by Richard Henderson 10 months ago
On 7/13/23 22:36, Philippe Mathieu-Daudé wrote:
> Hi Richard,
> 
> On 13/7/23 22:23, Richard Henderson wrote:
>> We adjust CONFIG_ATOMIC128 and CONFIG_CMPXCHG128 with
>> CONFIG_ATOMIC128_OPT in atomic128.h.  It is difficult
>> to tell when those changes have been applied with the
>> ifdef we must use with CONFIG_CMPXCHG128.  So instead
>> use HAVE_CMPXCHG128, which triggers -Werror=undef when
>> the proper header has not been included.
>>
>> Improves tcg_gen_atomic_cmpxchg_i128 for s390x host, which
>> requires CONFIG_ATOMIC128_OPT.  Without this we fall back
>> to EXCP_ATOMIC to single-step 128-bit atomics, which is
>> slow enough to cause some tests to time out.
>>
>> Reported-by: Thomas Huth <thuth@redhat.com>
>> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
>> ---
>>
>> Thomas, this issue does not quite match the one you bisected, but
>> other than the cmpxchg, I don't see any qemu_{ld,st}_i128
>> being used in BootLinuxS390X.test_s390_ccw_virtio_tcg.
>>
>> As far as I can see, this wasn't broken by the addition of
>> CONFIG_ATOMIC128_OPT, rather that fix didn't go far enough.
>>
>> Anyway, test_s390_ccw_virtio_tcg now passes in 159s on our host.
> 
> IIUC:
> 
> If we have CONFIG_ATOMIC128, we use qatomic_cmpxchg__nocheck;
> else if we have CONFIG_CMPXCHG128 we use __sync_val_compare_and_swap_16;
> in both cases we set HAVE_CMPXCHG128;
> otherwise we cannot use atomic128 cmpxchg().
> 
> (I'm trying to figure out why we need both CONFIGs).

Or sometimes we use inline asm, because there's no compiler support at all.
Please see host/include/*/host/atomic16-*.h.
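
As a rough illustration of the selection order discussed above, the
chain can be sketched as below. This is not the actual content of
atomic128.h or of the host headers; every DEMO_* name is hypothetical,
and the real per-host logic may differ.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical selection chain: whichever implementation is picked,
 * one always-defined macro records whether a 128-bit cmpxchg exists,
 * so that users can test it with "#if" rather than "#ifdef".
 */
#if defined(DEMO_CONFIG_ATOMIC128)
# define DEMO_HAVE_CMPXCHG128 1
# define DEMO_CMPXCHG128_IMPL "atomic builtin"
#elif defined(DEMO_CONFIG_CMPXCHG128)
# define DEMO_HAVE_CMPXCHG128 1
# define DEMO_CMPXCHG128_IMPL "sync builtin"
#elif defined(DEMO_HOST_ASM_CMPXCHG128)
/* No compiler support at all: the host header supplies inline asm. */
# define DEMO_HAVE_CMPXCHG128 1
# define DEMO_CMPXCHG128_IMPL "inline asm"
#else
# define DEMO_HAVE_CMPXCHG128 0
# define DEMO_CMPXCHG128_IMPL "none"
#endif

/* Report whether any 128-bit cmpxchg implementation was selected. */
static bool demo_have_cmpxchg128(void)
{
    return DEMO_HAVE_CMPXCHG128;
}

/* Name the implementation chosen by the chain above. */
static const char *demo_cmpxchg128_impl(void)
{
    return DEMO_CMPXCHG128_IMPL;
}
```

Compiled without any of the DEMO_* inputs defined, the chain falls
through to the last branch and reports that no implementation is
available; defining one of them on the command line (for example with
-DDEMO_CONFIG_CMPXCHG128) selects the corresponding branch.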


r~