[PULL 01/24] configure: move -mcx16 flag out of CPU_CFLAGS

Paolo Bonzini posted 24 patches 6 months ago
Maintainers: Gerd Hoffmann <kraxel@redhat.com>, "Marc-André Lureau" <marcandre.lureau@redhat.com>, Kevin Wolf <kwolf@redhat.com>, Hanna Reitz <hreitz@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>, "Alex Bennée" <alex.bennee@linaro.org>, Thomas Huth <thuth@redhat.com>, "Daniel P. Berrangé" <berrange@redhat.com>, "Philippe Mathieu-Daudé" <philmd@linaro.org>, Hyman Huang <yong.huang@smartx.com>, Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>, Richard Henderson <richard.henderson@linaro.org>, Eduardo Habkost <eduardo@habkost.net>, Laurent Vivier <lvivier@redhat.com>
[PULL 01/24] configure: move -mcx16 flag out of CPU_CFLAGS
Posted by Paolo Bonzini 6 months ago
From: Artyom Kunakovsky <artyomkunakovsky@gmail.com>

The point of CPU_CFLAGS is really just to select the appropriate multilib,
for example for library linking tests, and -mcx16 is not needed for
that purpose.

Furthermore, if -mcx16 is part of QEMU's choice of a basic x86_64
instruction set, it should be applied to cross-compiled x86_64 code too;
it is plausible that tests/tcg would want to cover cmpxchg16b as well,
for example.  In the end this makes just as much sense as a per sub-build
tweak, so move the flag to meson.build and cross_cc_cflags_x86_64.

This leaves out contrib/plugins, which would fail when attempting to use
__sync_val_compare_and_swap_16 (note it does not do yet); while minor,
this *is* a disadvantage of this change.  But building contrib/plugins
with a Makefile instead of meson.build is something self-inflicted just
for the sake of showing that it can be done, and if this kind of papercut
started becoming a problem we could make the directory part of the meson
build.  Until then, we can live with the limitation.

Signed-off-by: Artyom Kunakovsky <artyomkunakovsky@gmail.com>
Message-ID: <20240523051118.29367-1-artyomkunakovsky@gmail.com>
[rewrite commit message, remove from configure. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 configure   | 7 ++-----
 meson.build | 7 +++++++
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/configure b/configure
index 38ee2577013..4d01a42ba65 100755
--- a/configure
+++ b/configure
@@ -512,10 +512,7 @@ case "$cpu" in
     cpu="x86_64"
     host_arch=x86_64
     linux_arch=x86
-    # ??? Only extremely old AMD cpus do not have cmpxchg16b.
-    # If we truly care, we should simply detect this case at
-    # runtime and generate the fallback to serial emulation.
-    CPU_CFLAGS="-m64 -mcx16"
+    CPU_CFLAGS="-m64"
     ;;
 esac
 
@@ -1203,7 +1200,7 @@ fi
 : ${cross_cc_cflags_sparc64="-m64 -mcpu=ultrasparc"}
 : ${cross_cc_sparc="$cross_cc_sparc64"}
 : ${cross_cc_cflags_sparc="-m32 -mcpu=supersparc"}
-: ${cross_cc_cflags_x86_64="-m64"}
+: ${cross_cc_cflags_x86_64="-m64 -mcx16"}
 
 compute_target_variable() {
   eval "$2="
diff --git a/meson.build b/meson.build
index a9de71d4506..7fd82b5f48c 100644
--- a/meson.build
+++ b/meson.build
@@ -336,6 +336,13 @@ if host_arch == 'i386' and not cc.links('''
   qemu_common_flags = ['-march=i486'] + qemu_common_flags
 endif
 
+# ??? Only extremely old AMD cpus do not have cmpxchg16b.
+# If we truly care, we should simply detect this case at
+# runtime and generate the fallback to serial emulation.
+if host_arch == 'x86_64'
+  qemu_common_flags = ['-mcx16'] + qemu_common_flags
+endif
+
 if get_option('prefer_static')
   qemu_ldflags += get_option('b_pie') ? '-static-pie' : '-static'
 endif
-- 
2.45.1
Re: [PULL 01/24] configure: move -mcx16 flag out of CPU_CFLAGS
Posted by Alex Bennée 1 month, 3 weeks ago
Paolo Bonzini <pbonzini@redhat.com> writes:

> From: Artyom Kunakovsky <artyomkunakovsky@gmail.com>
>
> The point of CPU_CFLAGS is really just to select the appropriate multilib,
> for example for library linking tests, and -mcx16 is not needed for
> that purpose.
>
> Furthermore, if -mcx16 is part of QEMU's choice of a basic x86_64
> instruction set, it should be applied to cross-compiled x86_64 code too;
> it is plausible that tests/tcg would want to cover cmpxchg16b as well,
> for example.  In the end this makes just as much sense as a per sub-build
> tweak, so move the flag to meson.build and cross_cc_cflags_x86_64.
>
> This leaves out contrib/plugins, which would fail when attempting to use
> __sync_val_compare_and_swap_16 (note it does not do yet); while minor,
> this *is* a disadvantage of this change.  But building contrib/plugins
> with a Makefile instead of meson.build is something self-inflicted just
> for the sake of showing that it can be done, and if this kind of papercut
> started becoming a problem we could make the directory part of the meson
> build.  Until then, we can live with the limitation.
>
> Signed-off-by: Artyom Kunakovsky <artyomkunakovsky@gmail.com>
> Message-ID: <20240523051118.29367-1-artyomkunakovsky@gmail.com>
> [rewrite commit message, remove from configure. - Paolo]
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  configure   | 7 ++-----
>  meson.build | 7 +++++++
>  2 files changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/configure b/configure
> index 38ee2577013..4d01a42ba65 100755
> --- a/configure
> +++ b/configure
> @@ -512,10 +512,7 @@ case "$cpu" in
>      cpu="x86_64"
>      host_arch=x86_64
>      linux_arch=x86
> -    # ??? Only extremely old AMD cpus do not have cmpxchg16b.
> -    # If we truly care, we should simply detect this case at
> -    # runtime and generate the fallback to serial emulation.
> -    CPU_CFLAGS="-m64 -mcx16"
> +    CPU_CFLAGS="-m64"
>      ;;
>  esac
>  
> @@ -1203,7 +1200,7 @@ fi
>  : ${cross_cc_cflags_sparc64="-m64 -mcpu=ultrasparc"}
>  : ${cross_cc_sparc="$cross_cc_sparc64"}
>  : ${cross_cc_cflags_sparc="-m32 -mcpu=supersparc"}
> -: ${cross_cc_cflags_x86_64="-m64"}
> +: ${cross_cc_cflags_x86_64="-m64 -mcx16"}
>  
>  compute_target_variable() {
>    eval "$2="
> diff --git a/meson.build b/meson.build
> index a9de71d4506..7fd82b5f48c 100644
> --- a/meson.build
> +++ b/meson.build
> @@ -336,6 +336,13 @@ if host_arch == 'i386' and not cc.links('''
>    qemu_common_flags = ['-march=i486'] + qemu_common_flags
>  endif
>  
> +# ??? Only extremely old AMD cpus do not have cmpxchg16b.
> +# If we truly care, we should simply detect this case at
> +# runtime and generate the fallback to serial emulation.
> +if host_arch == 'x86_64'
> +  qemu_common_flags = ['-mcx16'] + qemu_common_flags
> +endif
> +
>  if get_option('prefer_static')
>    qemu_ldflags += get_option('b_pie') ? '-static-pie' : '-static'
>  endif

This breaks atomic detection resulting in:

#undef CONFIG_ATOMIC128
#undef CONFIG_ATOMIC128_OPT
#undef CONFIG_CMPXCHG128

which makes the TCG atomic handling code fallback to cpu_step_atomic,
killing performance.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro
Re: [PULL 01/24] configure: move -mcx16 flag out of CPU_CFLAGS
Posted by Pierrick Bouvier 1 month, 3 weeks ago
On 10/4/24 09:08, Alex Bennée wrote:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
>> From: Artyom Kunakovsky <artyomkunakovsky@gmail.com>
>>
>> The point of CPU_CFLAGS is really just to select the appropriate multilib,
>> for example for library linking tests, and -mcx16 is not needed for
>> that purpose.
>>
>> Furthermore, if -mcx16 is part of QEMU's choice of a basic x86_64
>> instruction set, it should be applied to cross-compiled x86_64 code too;
>> it is plausible that tests/tcg would want to cover cmpxchg16b as well,
>> for example.  In the end this makes just as much sense as a per sub-build
>> tweak, so move the flag to meson.build and cross_cc_cflags_x86_64.
>>
>> This leaves out contrib/plugins, which would fail when attempting to use
>> __sync_val_compare_and_swap_16 (note it does not do yet); while minor,
>> this *is* a disadvantage of this change.  But building contrib/plugins
>> with a Makefile instead of meson.build is something self-inflicted just
>> for the sake of showing that it can be done, and if this kind of papercut
>> started becoming a problem we could make the directory part of the meson
>> build.  Until then, we can live with the limitation.
>>
>> Signed-off-by: Artyom Kunakovsky <artyomkunakovsky@gmail.com>
>> Message-ID: <20240523051118.29367-1-artyomkunakovsky@gmail.com>
>> [rewrite commit message, remove from configure. - Paolo]
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>>   configure   | 7 ++-----
>>   meson.build | 7 +++++++
>>   2 files changed, 9 insertions(+), 5 deletions(-)
>>
>> diff --git a/configure b/configure
>> index 38ee2577013..4d01a42ba65 100755
>> --- a/configure
>> +++ b/configure
>> @@ -512,10 +512,7 @@ case "$cpu" in
>>       cpu="x86_64"
>>       host_arch=x86_64
>>       linux_arch=x86
>> -    # ??? Only extremely old AMD cpus do not have cmpxchg16b.
>> -    # If we truly care, we should simply detect this case at
>> -    # runtime and generate the fallback to serial emulation.
>> -    CPU_CFLAGS="-m64 -mcx16"
>> +    CPU_CFLAGS="-m64"
>>       ;;
>>   esac
>>   
>> @@ -1203,7 +1200,7 @@ fi
>>   : ${cross_cc_cflags_sparc64="-m64 -mcpu=ultrasparc"}
>>   : ${cross_cc_sparc="$cross_cc_sparc64"}
>>   : ${cross_cc_cflags_sparc="-m32 -mcpu=supersparc"}
>> -: ${cross_cc_cflags_x86_64="-m64"}
>> +: ${cross_cc_cflags_x86_64="-m64 -mcx16"}
>>   
>>   compute_target_variable() {
>>     eval "$2="
>> diff --git a/meson.build b/meson.build
>> index a9de71d4506..7fd82b5f48c 100644
>> --- a/meson.build
>> +++ b/meson.build
>> @@ -336,6 +336,13 @@ if host_arch == 'i386' and not cc.links('''
>>     qemu_common_flags = ['-march=i486'] + qemu_common_flags
>>   endif
>>   
>> +# ??? Only extremely old AMD cpus do not have cmpxchg16b.
>> +# If we truly care, we should simply detect this case at
>> +# runtime and generate the fallback to serial emulation.
>> +if host_arch == 'x86_64'
>> +  qemu_common_flags = ['-mcx16'] + qemu_common_flags
>> +endif
>> +
>>   if get_option('prefer_static')
>>     qemu_ldflags += get_option('b_pie') ? '-static-pie' : '-static'
>>   endif
> 
> This breaks atomic detection resulting in:
> 
> #undef CONFIG_ATOMIC128
> #undef CONFIG_ATOMIC128_OPT
> #undef CONFIG_CMPXCHG128
> 
> which makes the TCG atomic handling code fallback to cpu_step_atomic,
> killing performance.
> 

Posted this patch to solve issue:
https://lore.kernel.org/qemu-devel/20241004220123.978938-1-pierrick.bouvier@linaro.org/T/#u