On 10/4/24 09:08, Alex Bennée wrote:
> Paolo Bonzini <pbonzini@redhat.com> writes:
>
>> From: Artyom Kunakovsky <artyomkunakovsky@gmail.com>
>>
>> The point of CPU_CFLAGS is really just to select the appropriate multilib,
>> for example for library linking tests, and -mcx16 is not needed for
>> that purpose.
>>
>> Furthermore, if -mcx16 is part of QEMU's choice of a basic x86_64
>> instruction set, it should be applied to cross-compiled x86_64 code too;
>> it is plausible that tests/tcg would want to cover cmpxchg16b as well,
>> for example. In the end this makes just as much sense as a per sub-build
>> tweak, so move the flag to meson.build and cross_cc_cflags_x86_64.
>>
>> This leaves out contrib/plugins, which would fail when attempting to use
>> __sync_val_compare_and_swap_16 (note it does not do yet); while minor,
>> this *is* a disadvantage of this change. But building contrib/plugins
>> with a Makefile instead of meson.build is something self-inflicted just
>> for the sake of showing that it can be done, and if this kind of papercut
>> started becoming a problem we could make the directory part of the meson
>> build. Until then, we can live with the limitation.
>>
>> Signed-off-by: Artyom Kunakovsky <artyomkunakovsky@gmail.com>
>> Message-ID: <20240523051118.29367-1-artyomkunakovsky@gmail.com>
>> [rewrite commit message, remove from configure. - Paolo]
>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
>> ---
>> configure | 7 ++-----
>> meson.build | 7 +++++++
>> 2 files changed, 9 insertions(+), 5 deletions(-)
>>
>> diff --git a/configure b/configure
>> index 38ee2577013..4d01a42ba65 100755
>> --- a/configure
>> +++ b/configure
>> @@ -512,10 +512,7 @@ case "$cpu" in
>> cpu="x86_64"
>> host_arch=x86_64
>> linux_arch=x86
>> - # ??? Only extremely old AMD cpus do not have cmpxchg16b.
>> - # If we truly care, we should simply detect this case at
>> - # runtime and generate the fallback to serial emulation.
>> - CPU_CFLAGS="-m64 -mcx16"
>> + CPU_CFLAGS="-m64"
>> ;;
>> esac
>>
>> @@ -1203,7 +1200,7 @@ fi
>> : ${cross_cc_cflags_sparc64="-m64 -mcpu=ultrasparc"}
>> : ${cross_cc_sparc="$cross_cc_sparc64"}
>> : ${cross_cc_cflags_sparc="-m32 -mcpu=supersparc"}
>> -: ${cross_cc_cflags_x86_64="-m64"}
>> +: ${cross_cc_cflags_x86_64="-m64 -mcx16"}
>>
>> compute_target_variable() {
>> eval "$2="
>> diff --git a/meson.build b/meson.build
>> index a9de71d4506..7fd82b5f48c 100644
>> --- a/meson.build
>> +++ b/meson.build
>> @@ -336,6 +336,13 @@ if host_arch == 'i386' and not cc.links('''
>> qemu_common_flags = ['-march=i486'] + qemu_common_flags
>> endif
>>
>> +# ??? Only extremely old AMD cpus do not have cmpxchg16b.
>> +# If we truly care, we should simply detect this case at
>> +# runtime and generate the fallback to serial emulation.
>> +if host_arch == 'x86_64'
>> + qemu_common_flags = ['-mcx16'] + qemu_common_flags
>> +endif
>> +
>> if get_option('prefer_static')
>> qemu_ldflags += get_option('b_pie') ? '-static-pie' : '-static'
>> endif
>
> This breaks atomic detection resulting in:
>
> #undef CONFIG_ATOMIC128
> #undef CONFIG_ATOMIC128_OPT
> #undef CONFIG_CMPXCHG128
>
> which makes the TCG atomic handling code fallback to cpu_step_atomic,
> killing performance.
>
Posted this patch to solve issue:
https://lore.kernel.org/qemu-devel/20241004220123.978938-1-pierrick.bouvier@linaro.org/T/#u