If QEMU is compiled with clang-7 it results in the warning:
hw/display/qxl.c:1884:19: error: misaligned or large atomic operation
may incur significant performance penalty [-Werror,-Watomic-alignment]
old_pending = atomic_fetch_or(&d->ram->int_pending, le_events);
^
This is because the Spice headers forgot to define the QXLRam struct
with the '__aligned__(4)' attribute. clang 7 and newer will thus
warn that the access here to int_pending might not be 4-aligned
(because the QXLRam object d->ram points at might start at a
misaligned address). In fact we set up d->ram in init_qxl_ram() so
it always starts at a 4K boundary, so we know the atomic access here
is OK.
Newer Spice versions (with Spice commit
beda5ec7a6848be20c0cac2a9a8ef2a41e8069c1) will fix the bug;
for older Spice versions, work around it by telling the compiler
explicitly that the alignment is OK using __builtin_assume_aligned().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
include/qemu/compiler.h | 9 +++++++++
hw/display/qxl.c | 26 +++++++++++++++++++++++++-
2 files changed, 34 insertions(+), 1 deletion(-)
diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
index 5843812710c..bf47e7bee4e 100644
--- a/include/qemu/compiler.h
+++ b/include/qemu/compiler.h
@@ -122,6 +122,15 @@
#ifndef __has_feature
#define __has_feature(x) 0 /* compatibility with non-clang compilers */
#endif
+
+#ifndef __has_builtin
+#define __has_builtin(x) 0 /* compatibility with non-clang compilers */
+#endif
+
+#if __has_builtin(__builtin_assume_aligned) || QEMU_GNUC_PREREQ(4, 7)
+#define HAS_ASSUME_ALIGNED
+#endif
+
/* Implement C11 _Generic via GCC builtins. Example:
*
* QEMU_GENERIC(x, (float, sinf), (long double, sinl), sin) (x)
diff --git a/hw/display/qxl.c b/hw/display/qxl.c
index 8e9135d9c67..5702e8645d3 100644
--- a/hw/display/qxl.c
+++ b/hw/display/qxl.c
@@ -1881,7 +1881,31 @@ static void qxl_send_events(PCIQXLDevice *d, uint32_t events)
trace_qxl_send_events_vm_stopped(d->id, events);
return;
}
- old_pending = atomic_fetch_or(&d->ram->int_pending, le_events);
+ /*
+ * Older versions of Spice forgot to define the QXLRam struct
+ * with the '__aligned__(4)' attribute. clang 7 and newer will
+ * thus warn that atomic_fetch_or(&d->ram->int_pending, ...)
+ * might be a misaligned atomic access, and will generate an
+ * out-of-line call for it, which results in a link error since
+ * we don't currently link against libatomic.
+ *
+ * In fact we set up d->ram in init_qxl_ram() so it always starts
+ * at a 4K boundary, so we know that &d->ram->int_pending is
+ * naturally aligned for a uint32_t. Newer Spice versions
+ * (with Spice commit beda5ec7a6848be20c0cac2a9a8ef2a41e8069c1)
+ * will fix the bug directly. To deal with older versions,
+ * we tell the compiler to assume the address really is aligned.
+ * Any compiler which cares about the misalignment will have
+ * __builtin_assume_aligned.
+ */
+#ifdef HAS_ASSUME_ALIGNED
+#define ALIGNED_UINT32_PTR(P) ((uint32_t *)__builtin_assume_aligned(P, 4))
+#else
+#define ALIGNED_UINT32_PTR(P) ((uint32_t *)P)
+#endif
+
+ old_pending = atomic_fetch_or(ALIGNED_UINT32_PTR(&d->ram->int_pending),
+ le_events);
if ((old_pending & le_events) == le_events) {
return;
}
--
2.19.0
On 9/27/18 8:55 AM, Peter Maydell wrote: > If QEMU is compiled with clang-7 it results in the warning: > > hw/display/qxl.c:1884:19: error: misaligned or large atomic operation > may incur significant performance penalty [-Werror,-Watomic-alignment] > old_pending = atomic_fetch_or(&d->ram->int_pending, le_events); > ^ > > This is because the Spice headers forgot to define the QXLRam struct > with the '__aligned__(4)' attribute. clang 7 and newer will thus > warn that the access here to int_pending might not be 4-aligned > (because the QXLRam object d->ram points at might start at a > misaligned address). In fact we set up d->ram in init_qxl_ram() so > it always starts at a 4K boundary, so we know the atomic access here > is OK. > > Newer Spice versions (with Spice commit > beda5ec7a6848be20c0cac2a9a8ef2a41e8069c1) will fix the bug; > for older Spice versions, work around it by telling the compiler > explicitly that the alignment is OK using __builtin_assume_aligned(). > > Signed-off-by: Peter Maydell <peter.maydell@linaro.org> > --- > include/qemu/compiler.h | 9 +++++++++ > hw/display/qxl.c | 26 +++++++++++++++++++++++++- > 2 files changed, 34 insertions(+), 1 deletion(-) Reviewed-by: Richard Henderson <richard.henderson@linaro.org> r~
On 9/27/18 5:55 PM, Peter Maydell wrote:
> If QEMU is compiled with clang-7 it results in the warning:
>
> hw/display/qxl.c:1884:19: error: misaligned or large atomic operation
> may incur significant performance penalty [-Werror,-Watomic-alignment]
> old_pending = atomic_fetch_or(&d->ram->int_pending, le_events);
> ^
>
> This is because the Spice headers forgot to define the QXLRam struct
> with the '__aligned__(4)' attribute. clang 7 and newer will thus
> warn that the access here to int_pending might not be 4-aligned
> (because the QXLRam object d->ram points at might start at a
> misaligned address). In fact we set up d->ram in init_qxl_ram() so
> it always starts at a 4K boundary, so we know the atomic access here
> is OK.
>
> Newer Spice versions (with Spice commit
> beda5ec7a6848be20c0cac2a9a8ef2a41e8069c1) will fix the bug;
> for older Spice versions, work around it by telling the compiler
> explicitly that the alignment is OK using __builtin_assume_aligned().
>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Thanks!
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> ---
> include/qemu/compiler.h | 9 +++++++++
> hw/display/qxl.c | 26 +++++++++++++++++++++++++-
> 2 files changed, 34 insertions(+), 1 deletion(-)
>
> diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
> index 5843812710c..bf47e7bee4e 100644
> --- a/include/qemu/compiler.h
> +++ b/include/qemu/compiler.h
> @@ -122,6 +122,15 @@
> #ifndef __has_feature
> #define __has_feature(x) 0 /* compatibility with non-clang compilers */
> #endif
> +
> +#ifndef __has_builtin
> +#define __has_builtin(x) 0 /* compatibility with non-clang compilers */
> +#endif
> +
> +#if __has_builtin(__builtin_assume_aligned) || QEMU_GNUC_PREREQ(4, 7)
> +#define HAS_ASSUME_ALIGNED
> +#endif
> +
> /* Implement C11 _Generic via GCC builtins. Example:
> *
> * QEMU_GENERIC(x, (float, sinf), (long double, sinl), sin) (x)
> diff --git a/hw/display/qxl.c b/hw/display/qxl.c
> index 8e9135d9c67..5702e8645d3 100644
> --- a/hw/display/qxl.c
> +++ b/hw/display/qxl.c
> @@ -1881,7 +1881,31 @@ static void qxl_send_events(PCIQXLDevice *d, uint32_t events)
> trace_qxl_send_events_vm_stopped(d->id, events);
> return;
> }
> - old_pending = atomic_fetch_or(&d->ram->int_pending, le_events);
> + /*
> + * Older versions of Spice forgot to define the QXLRam struct
> + * with the '__aligned__(4)' attribute. clang 7 and newer will
> + * thus warn that atomic_fetch_or(&d->ram->int_pending, ...)
> + * might be a misaligned atomic access, and will generate an
> + * out-of-line call for it, which results in a link error since
> + * we don't currently link against libatomic.
> + *
> + * In fact we set up d->ram in init_qxl_ram() so it always starts
> + * at a 4K boundary, so we know that &d->ram->int_pending is
> + * naturally aligned for a uint32_t. Newer Spice versions
> + * (with Spice commit beda5ec7a6848be20c0cac2a9a8ef2a41e8069c1)
> + * will fix the bug directly. To deal with older versions,
> + * we tell the compiler to assume the address really is aligned.
> + * Any compiler which cares about the misalignment will have
> + * __builtin_assume_aligned.
> + */
> +#ifdef HAS_ASSUME_ALIGNED
> +#define ALIGNED_UINT32_PTR(P) ((uint32_t *)__builtin_assume_aligned(P, 4))
> +#else
> +#define ALIGNED_UINT32_PTR(P) ((uint32_t *)P)
> +#endif
> +
> + old_pending = atomic_fetch_or(ALIGNED_UINT32_PTR(&d->ram->int_pending),
> + le_events);
> if ((old_pending & le_events) == le_events) {
> return;
> }
>
On Thu, Sep 27, 2018 at 04:55:38PM +0100, Peter Maydell wrote: > If QEMU is compiled with clang-7 it results in the warning: > > hw/display/qxl.c:1884:19: error: misaligned or large atomic operation > may incur significant performance penalty [-Werror,-Watomic-alignment] > old_pending = atomic_fetch_or(&d->ram->int_pending, le_events); > ^ > > This is because the Spice headers forgot to define the QXLRam struct > with the '__aligned__(4)' attribute. clang 7 and newer will thus > warn that the access here to int_pending might not be 4-aligned > (because the QXLRam object d->ram points at might start at a > misaligned address). In fact we set up d->ram in init_qxl_ram() so > it always starts at a 4K boundary, so we know the atomic access here > is OK. > > Newer Spice versions (with Spice commit > beda5ec7a6848be20c0cac2a9a8ef2a41e8069c1) will fix the bug; > for older Spice versions, work around it by telling the compiler > explicitly that the alignment is OK using __builtin_assume_aligned(). Added to vga queue. cheers, Gerd
On 9/27/18 10:55 AM, Peter Maydell wrote: > Newer Spice versions (with Spice commit > beda5ec7a6848be20c0cac2a9a8ef2a41e8069c1) will fix the bug; > for older Spice versions, work around it by telling the compiler > explicitly that the alignment is OK using __builtin_assume_aligned(). > > Signed-off-by: Peter Maydell <peter.maydell@linaro.org> > --- > include/qemu/compiler.h | 9 +++++++++ > hw/display/qxl.c | 26 +++++++++++++++++++++++++- > 2 files changed, 34 insertions(+), 1 deletion(-) > > +#ifdef HAS_ASSUME_ALIGNED > +#define ALIGNED_UINT32_PTR(P) ((uint32_t *)__builtin_assume_aligned(P, 4)) > +#else > +#define ALIGNED_UINT32_PTR(P) ((uint32_t *)P) > +#endif > + > + old_pending = atomic_fetch_or(ALIGNED_UINT32_PTR(&d->ram->int_pending), The only client of your macro is not impacted, but if you want to be technically robust against all future uses of the macro, the fallback is underparenthesized, and would be safer as: #define ALIGNED_UINT32_PTR(P) ((uint32_t *)(P)) -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
© 2016 - 2025 Red Hat, Inc.