[v4] Enable build of full Xen for RISC-V

[PATCH v5 13/23] xen/riscv: introduce atomic.h

Posted by Oleksii Kurochko 1 year, 11 months ago

Initially the patch was introduced by Bobby, who takes the header from
Linux kernel.

The following changes were done on top of Linux kernel header:
 - atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were updated
     to use__*xchg_generic()
 - drop casts in write_atomic() as they are unnecessary
 - drop introduction of WRITE_ONCE() and READ_ONCE().
   Xen provides ACCESS_ONCE()
 - remove zero-length array access in read_atomic()
 - drop defines similar to pattern
 - #define atomic_add_return_relaxed   atomic_add_return_relaxed
 - move not RISC-V specific functions to asm-generic/atomics-ops.h

Signed-off-by: Bobby Eshleman <bobbyeshleman@gmail.com>
Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V5:
 - fence.h changes were moved to separate patch as patches related to io.h and cmpxchg.h,
   which are dependecies for this patch, also needed changes in fence.h
 - remove accessing of zero-length array
 - drops cast in write_atomic()
 - drop introduction of WRITE_ONCE() and READ_ONCE().
 - drop defines similar to pattern #define atomic_add_return_relaxed   atomic_add_return_relaxed
 - Xen code style fixes
 - move not RISC-V specific functions to asm-generic/atomics-ops.h
---
Changes in V4:
 - do changes related to the updates of [PATCH v3 13/34] xen/riscv: introduce cmpxchg.h
 - drop casts in read_atomic_size(), write_atomic(), add_sized()
 - tabs -> spaces
 - drop #ifdef CONFIG_SMP ... #endif in fence.ha as it is simpler to handle NR_CPUS=1
   the same as NR_CPUS>1 with accepting less than ideal performance.
---
Changes in V3:
  - update the commit message
  - add SPDX for fence.h
  - code style fixes
  - Remove /* TODO: ... */ for add_sized macros. It looks correct to me.
  - re-order the patch
  - merge to this patch fence.h
---
Changes in V2:
 - Change an author of commit. I got this header from Bobby's old repo.
---
 xen/arch/riscv/include/asm/atomic.h  | 296 +++++++++++++++++++++++++++
 xen/include/asm-generic/atomic-ops.h |  92 +++++++++
 2 files changed, 388 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/atomic.h
 create mode 100644 xen/include/asm-generic/atomic-ops.h

diff --git a/xen/arch/riscv/include/asm/atomic.h b/xen/arch/riscv/include/asm/atomic.h
new file mode 100644
index 0000000000..8007ae4c90
--- /dev/null
+++ b/xen/arch/riscv/include/asm/atomic.h
@@ -0,0 +1,296 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Taken and modified from Linux.
+ *
+ * The following changes were done:
+ * - * atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were updated
+ *     to use__*xchg_generic()
+ * - drop casts in write_atomic() as they are unnecessary
+ * - drop introduction of WRITE_ONCE() and READ_ONCE().
+ *   Xen provides ACCESS_ONCE()
+ * - remove zero-length array access in read_atomic()
+ * - drop defines similar to pattern
+ *   #define atomic_add_return_relaxed   atomic_add_return_relaxed
+ * - move not RISC-V specific functions to asm-generic/atomics-ops.h
+ * 
+ * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
+ * Copyright (C) 2012 Regents of the University of California
+ * Copyright (C) 2017 SiFive
+ * Copyright (C) 2024 Vates SAS
+ */
+
+#ifndef _ASM_RISCV_ATOMIC_H
+#define _ASM_RISCV_ATOMIC_H
+
+#include <xen/atomic.h>
+
+#include <asm/cmpxchg.h>
+#include <asm/fence.h>
+#include <asm/io.h>
+#include <asm/system.h>
+
+#include <asm-generic/atomic-ops.h>
+
+void __bad_atomic_size(void);
+
+/*
+ * Legacy from Linux kernel. For some reason they wanted to have ordered
+ * read/write access. Thereby read* is used instead of read<X>_cpu()
+ */
+static always_inline void read_atomic_size(const volatile void *p,
+                                           void *res,
+                                           unsigned int size)
+{
+    switch ( size )
+    {
+    case 1: *(uint8_t *)res = readb(p); break;
+    case 2: *(uint16_t *)res = readw(p); break;
+    case 4: *(uint32_t *)res = readl(p); break;
+    case 8: *(uint32_t *)res  = readq(p); break;
+    default: __bad_atomic_size(); break;
+    }
+}
+
+#define read_atomic(p) ({                               \
+    union { typeof(*p) val; char c[sizeof(*p)]; } x_;   \
+    read_atomic_size(p, x_.c, sizeof(*p));              \
+    x_.val;                                             \
+})
+
+#define write_atomic(p, x)                              \
+({                                                      \
+    typeof(*p) x__ = (x);                               \
+    switch ( sizeof(*p) )                               \
+    {                                                   \
+    case 1: writeb(x__,  p); break;                     \
+    case 2: writew(x__, p); break;                      \
+    case 4: writel(x__, p); break;                      \
+    case 8: writeq(x__, p); break;                      \
+    default: __bad_atomic_size(); break;                \
+    }                                                   \
+    x__;                                                \
+})
+
+#define add_sized(p, x)                                 \
+({                                                      \
+    typeof(*(p)) x__ = (x);                             \
+    switch ( sizeof(*(p)) )                             \
+    {                                                   \
+    case 1: writeb(read_atomic(p) + x__, p); break;     \
+    case 2: writew(read_atomic(p) + x__, p); break;     \
+    case 4: writel(read_atomic(p) + x__, p); break;     \
+    default: __bad_atomic_size(); break;                \
+    }                                                   \
+})
+
+#define __atomic_acquire_fence() \
+    __asm__ __volatile__ ( RISCV_ACQUIRE_BARRIER "" ::: "memory" )
+
+#define __atomic_release_fence() \
+    __asm__ __volatile__ ( RISCV_RELEASE_BARRIER "" ::: "memory" )
+
+/*
+ * First, the atomic ops that have no ordering constraints and therefor don't
+ * have the AQ or RL bits set.  These don't return anything, so there's only
+ * one version to worry about.
+ */
+#define ATOMIC_OP(op, asm_op, I, asm_type, c_type, prefix)  \
+static inline                                               \
+void atomic##prefix##_##op(c_type i, atomic##prefix##_t *v) \
+{                                                           \
+    __asm__ __volatile__ (                                  \
+        "   amo" #asm_op "." #asm_type " zero, %1, %0"      \
+        : "+A" (v->counter)                                 \
+        : "r" (I)                                           \
+        : "memory" );                                       \
+}                                                           \
+
+#define ATOMIC_OPS(op, asm_op, I)                           \
+        ATOMIC_OP (op, asm_op, I, w, int,   )
+
+ATOMIC_OPS(add, add,  i)
+ATOMIC_OPS(sub, add, -i)
+ATOMIC_OPS(and, and,  i)
+ATOMIC_OPS( or,  or,  i)
+ATOMIC_OPS(xor, xor,  i)
+
+#undef ATOMIC_OP
+#undef ATOMIC_OPS
+
+/*
+ * Atomic ops that have ordered, relaxed, acquire, and release variants.
+ * There's two flavors of these: the arithmatic ops have both fetch and return
+ * versions, while the logical ops only have fetch versions.
+ */
+#define ATOMIC_FETCH_OP(op, asm_op, I, asm_type, c_type, prefix)    \
+static inline                                                       \
+c_type atomic##prefix##_fetch_##op##_relaxed(c_type i,              \
+                         atomic##prefix##_t *v)                     \
+{                                                                   \
+    register c_type ret;                                            \
+    __asm__ __volatile__ (                                          \
+        "   amo" #asm_op "." #asm_type " %1, %2, %0"                \
+        : "+A" (v->counter), "=r" (ret)                             \
+        : "r" (I)                                                   \
+        : "memory" );                                               \
+    return ret;                                                     \
+}                                                                   \
+static inline                                                       \
+c_type atomic##prefix##_fetch_##op(c_type i, atomic##prefix##_t *v) \
+{                                                                   \
+    register c_type ret;                                            \
+    __asm__ __volatile__ (                                          \
+        "   amo" #asm_op "." #asm_type ".aqrl  %1, %2, %0"          \
+        : "+A" (v->counter), "=r" (ret)                             \
+        : "r" (I)                                                   \
+        : "memory" );                                               \
+    return ret;                                                     \
+}
+
+#define ATOMIC_OP_RETURN(op, asm_op, c_op, I, asm_type, c_type, prefix) \
+static inline                                                           \
+c_type atomic##prefix##_##op##_return_relaxed(c_type i,                 \
+                          atomic##prefix##_t *v)                        \
+{                                                                       \
+        return atomic##prefix##_fetch_##op##_relaxed(i, v) c_op I;      \
+}                                                                       \
+static inline                                                           \
+c_type atomic##prefix##_##op##_return(c_type i, atomic##prefix##_t *v)  \
+{                                                                       \
+        return atomic##prefix##_fetch_##op(i, v) c_op I;                \
+}
+
+#define ATOMIC_OPS(op, asm_op, c_op, I)                                 \
+        ATOMIC_FETCH_OP( op, asm_op,       I, w, int,   )               \
+        ATOMIC_OP_RETURN(op, asm_op, c_op, I, w, int,   )
+
+ATOMIC_OPS(add, add, +,  i)
+ATOMIC_OPS(sub, add, +, -i)
+
+#undef ATOMIC_OPS
+
+#define ATOMIC_OPS(op, asm_op, I) \
+        ATOMIC_FETCH_OP(op, asm_op, I, w, int,   )
+
+ATOMIC_OPS(and, and, i)
+ATOMIC_OPS( or,  or, i)
+ATOMIC_OPS(xor, xor, i)
+
+#undef ATOMIC_OPS
+
+#undef ATOMIC_FETCH_OP
+#undef ATOMIC_OP_RETURN
+
+/* This is required to provide a full barrier on success. */
+static inline int atomic_add_unless(atomic_t *v, int a, int u)
+{
+       int prev, rc;
+
+    __asm__ __volatile__ (
+        "0: lr.w     %[p],  %[c]\n"
+        "   beq      %[p],  %[u], 1f\n"
+        "   add      %[rc], %[p], %[a]\n"
+        "   sc.w.rl  %[rc], %[rc], %[c]\n"
+        "   bnez     %[rc], 0b\n"
+        RISCV_FULL_BARRIER
+        "1:\n"
+        : [p] "=&r" (prev), [rc] "=&r" (rc), [c] "+A" (v->counter)
+        : [a] "r" (a), [u] "r" (u)
+        : "memory");
+    return prev;
+}
+
+/*
+ * atomic_{cmp,}xchg is required to have exactly the same ordering semantics as
+ * {cmp,}xchg and the operations that return, so they need a full barrier.
+ */
+#define ATOMIC_OP(c_t, prefix, size)                            \
+static inline                                                   \
+c_t atomic##prefix##_xchg_relaxed(atomic##prefix##_t *v, c_t n) \
+{                                                               \
+    return __xchg_generic(&(v->counter), n, size, "", "", "");  \
+}                                                               \
+static inline                                                   \
+c_t atomic##prefix##_xchg_acquire(atomic##prefix##_t *v, c_t n) \
+{                                                               \
+    return __xchg_generic(&(v->counter), n, size,               \
+                          "", "", RISCV_ACQUIRE_BARRIER);       \
+}                                                               \
+static inline                                                   \
+c_t atomic##prefix##_xchg_release(atomic##prefix##_t *v, c_t n) \
+{                                                               \
+    return __xchg_generic(&(v->counter), n, size,               \
+                          "", RISCV_RELEASE_BARRIER, "");       \
+}                                                               \
+static inline                                                   \
+c_t atomic##prefix##_xchg(atomic##prefix##_t *v, c_t n)         \
+{                                                               \
+    return __xchg_generic(&(v->counter), n, size,               \
+                          ".aqrl", "", "");                     \
+}                                                               \
+static inline                                                   \
+c_t atomic##prefix##_cmpxchg_relaxed(atomic##prefix##_t *v,     \
+                     c_t o, c_t n)                              \
+{                                                               \
+    return __cmpxchg_generic(&(v->counter), o, n, size,         \
+                             "", "", "");                       \
+}                                                               \
+static inline                                                   \
+c_t atomic##prefix##_cmpxchg_acquire(atomic##prefix##_t *v,     \
+                     c_t o, c_t n)                              \
+{                                                               \
+    return __cmpxchg_generic(&(v->counter), o, n, size,         \
+                             "", "", RISCV_ACQUIRE_BARRIER);    \
+}                                                               \
+static inline                                                   \
+c_t atomic##prefix##_cmpxchg_release(atomic##prefix##_t *v,     \
+                     c_t o, c_t n)                              \
+{	                                                            \
+    return __cmpxchg_generic(&(v->counter), o, n, size,         \
+                             "", RISCV_RELEASE_BARRIER, "");    \
+}                                                               \
+static inline                                                   \
+c_t atomic##prefix##_cmpxchg(atomic##prefix##_t *v, c_t o, c_t n) \
+{                                                               \
+    return __cmpxchg_generic(&(v->counter), o, n, size,         \
+                             ".rl", "", " fence rw, rw\n");     \
+}
+
+#define ATOMIC_OPS() \
+    ATOMIC_OP(int,   , 4)
+
+ATOMIC_OPS()
+
+#undef ATOMIC_OPS
+#undef ATOMIC_OP
+
+static inline int atomic_sub_if_positive(atomic_t *v, int offset)
+{
+       int prev, rc;
+
+    __asm__ __volatile__ (
+        "0: lr.w     %[p],  %[c]\n"
+        "   sub      %[rc], %[p], %[o]\n"
+        "   bltz     %[rc], 1f\n"
+        "   sc.w.rl  %[rc], %[rc], %[c]\n"
+        "   bnez     %[rc], 0b\n"
+        "   fence    rw, rw\n"
+        "1:\n"
+        : [p] "=&r" (prev), [rc] "=&r" (rc), [c] "+A" (v->counter)
+        : [o] "r" (offset)
+        : "memory" );
+    return prev - offset;
+}
+
+#define atomic_dec_if_positive(v)	atomic_sub_if_positive(v, 1)
+
+#endif /* _ASM_RISCV_ATOMIC_H */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/asm-generic/atomic-ops.h b/xen/include/asm-generic/atomic-ops.h
new file mode 100644
index 0000000000..fdd5a93ed8
--- /dev/null
+++ b/xen/include/asm-generic/atomic-ops.h
@@ -0,0 +1,92 @@
+#/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_GENERIC_ATOMIC_OPS_H_
+#define _ASM_GENERIC_ATOMIC_OPS_H_
+
+#include <xen/atomic.h>
+#include <xen/lib.h>
+
+#ifndef ATOMIC_READ
+static inline int atomic_read(const atomic_t *v)
+{
+    return ACCESS_ONCE(v->counter);
+}
+#endif
+
+#ifndef _ATOMIC_READ
+static inline int _atomic_read(atomic_t v)
+{
+    return v.counter;
+}
+#endif
+
+#ifndef ATOMIC_SET
+static inline void atomic_set(atomic_t *v, int i)
+{
+    ACCESS_ONCE(v->counter) = i;
+}
+#endif
+
+#ifndef _ATOMIC_SET
+static inline void _atomic_set(atomic_t *v, int i)
+{
+    v->counter = i;
+}
+#endif
+
+#ifndef ATOMIC_SUB_AND_TEST
+static inline int atomic_sub_and_test(int i, atomic_t *v)
+{
+    return atomic_sub_return(i, v) == 0;
+}
+#endif
+
+#ifndef ATOMIC_INC
+static inline void atomic_inc(atomic_t *v)
+{
+    atomic_add(1, v);
+}
+#endif
+
+#ifndef ATOMIC_INC_RETURN
+static inline int atomic_inc_return(atomic_t *v)
+{
+    return atomic_add_return(1, v);
+}
+#endif
+
+#ifndef ATOMIC_DEC
+static inline void atomic_dec(atomic_t *v)
+{
+    atomic_sub(1, v);
+}
+#endif
+
+#ifndef ATOMIC_DEC_RETURN
+static inline int atomic_dec_return(atomic_t *v)
+{
+    return atomic_sub_return(1, v);
+}
+#endif
+
+#ifndef ATOMIC_DEC_AND_TEST
+static inline int atomic_dec_and_test(atomic_t *v)
+{
+    return atomic_sub_return(1, v) == 0;
+}
+#endif
+
+#ifndef ATOMIC_ADD_NEGATIVE
+static inline int atomic_add_negative(int i, atomic_t *v)
+{
+    return atomic_add_return(i, v) < 0;
+}
+#endif
+
+#ifndef ATOMIC_INC_AND_TEST
+static inline int atomic_inc_and_test(atomic_t *v)
+{
+    return atomic_add_return(1, v) == 0;
+}
+#endif
+
+#endif /* _ASM_GENERIC_ATOMIC_OPS_H_ */
-- 
2.43.0

Re: [PATCH v5 13/23] xen/riscv: introduce atomic.h

Posted by Jan Beulich 1 year, 11 months ago

On 26.02.2024 18:38, Oleksii Kurochko wrote:
> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/atomic.h
> @@ -0,0 +1,296 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Taken and modified from Linux.
> + *
> + * The following changes were done:
> + * - * atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were updated
> + *     to use__*xchg_generic()
> + * - drop casts in write_atomic() as they are unnecessary
> + * - drop introduction of WRITE_ONCE() and READ_ONCE().
> + *   Xen provides ACCESS_ONCE()
> + * - remove zero-length array access in read_atomic()
> + * - drop defines similar to pattern
> + *   #define atomic_add_return_relaxed   atomic_add_return_relaxed
> + * - move not RISC-V specific functions to asm-generic/atomics-ops.h
> + * 
> + * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
> + * Copyright (C) 2012 Regents of the University of California
> + * Copyright (C) 2017 SiFive
> + * Copyright (C) 2024 Vates SAS
> + */
> +
> +#ifndef _ASM_RISCV_ATOMIC_H
> +#define _ASM_RISCV_ATOMIC_H
> +
> +#include <xen/atomic.h>
> +
> +#include <asm/cmpxchg.h>
> +#include <asm/fence.h>
> +#include <asm/io.h>
> +#include <asm/system.h>
> +
> +#include <asm-generic/atomic-ops.h>

While, because of the forward decls in xen/atomic.h, having this #include
works, I wonder if it wouldn't better be placed further down. The compiler
will likely have an easier time when it sees the inline definitions ahead
of any uses.

> +void __bad_atomic_size(void);
> +
> +/*
> + * Legacy from Linux kernel. For some reason they wanted to have ordered
> + * read/write access. Thereby read* is used instead of read<X>_cpu()
> + */
> +static always_inline void read_atomic_size(const volatile void *p,
> +                                           void *res,
> +                                           unsigned int size)
> +{
> +    switch ( size )
> +    {
> +    case 1: *(uint8_t *)res = readb(p); break;
> +    case 2: *(uint16_t *)res = readw(p); break;
> +    case 4: *(uint32_t *)res = readl(p); break;
> +    case 8: *(uint32_t *)res  = readq(p); break;

This is the point where the lack of constraints in io.h (see my respective
comment) becomes actually harmful: You're accessing not MMIO, but compiler-
visible variables here. It needs to know which ones are read ...

> +    default: __bad_atomic_size(); break;
> +    }
> +}
> +
> +#define read_atomic(p) ({                               \
> +    union { typeof(*p) val; char c[sizeof(*p)]; } x_;   \
> +    read_atomic_size(p, x_.c, sizeof(*p));              \
> +    x_.val;                                             \
> +})
> +
> +#define write_atomic(p, x)                              \
> +({                                                      \
> +    typeof(*p) x__ = (x);                               \
> +    switch ( sizeof(*p) )                               \
> +    {                                                   \
> +    case 1: writeb(x__,  p); break;                     \
> +    case 2: writew(x__, p); break;                      \
> +    case 4: writel(x__, p); break;                      \
> +    case 8: writeq(x__, p); break;                      \

... or written.

Nit: There's a stray blank in the writeb() invocation.

> +    default: __bad_atomic_size(); break;                \
> +    }                                                   \
> +    x__;                                                \
> +})
> +
> +#define add_sized(p, x)                                 \
> +({                                                      \
> +    typeof(*(p)) x__ = (x);                             \
> +    switch ( sizeof(*(p)) )                             \

Like you have it here, {read,write}_atomic() also need p properly
parenthesized. There look to be more parenthesization issues further
down.

> +    {                                                   \
> +    case 1: writeb(read_atomic(p) + x__, p); break;     \
> +    case 2: writew(read_atomic(p) + x__, p); break;     \
> +    case 4: writel(read_atomic(p) + x__, p); break;     \
> +    default: __bad_atomic_size(); break;                \
> +    }                                                   \
> +})

Any reason this doesn't have an 8-byte case? x86'es at least has one.

> +#define __atomic_acquire_fence() \
> +    __asm__ __volatile__ ( RISCV_ACQUIRE_BARRIER "" ::: "memory" )
> +
> +#define __atomic_release_fence() \
> +    __asm__ __volatile__ ( RISCV_RELEASE_BARRIER "" ::: "memory" )

Elsewhere you use asm volatile() - why __asm__ __volatile__() here?
Or why not there (cmpxchg.h, io.h)?

> +/*
> + * First, the atomic ops that have no ordering constraints and therefor don't
> + * have the AQ or RL bits set.  These don't return anything, so there's only
> + * one version to worry about.
> + */
> +#define ATOMIC_OP(op, asm_op, I, asm_type, c_type, prefix)  \
> +static inline                                               \
> +void atomic##prefix##_##op(c_type i, atomic##prefix##_t *v) \
> +{                                                           \
> +    __asm__ __volatile__ (                                  \
> +        "   amo" #asm_op "." #asm_type " zero, %1, %0"      \
> +        : "+A" (v->counter)                                 \
> +        : "r" (I)                                           \
> +        : "memory" );                                       \
> +}                                                           \
> +
> +#define ATOMIC_OPS(op, asm_op, I)                           \
> +        ATOMIC_OP (op, asm_op, I, w, int,   )
> +
> +ATOMIC_OPS(add, add,  i)
> +ATOMIC_OPS(sub, add, -i)
> +ATOMIC_OPS(and, and,  i)
> +ATOMIC_OPS( or,  or,  i)
> +ATOMIC_OPS(xor, xor,  i)
> +
> +#undef ATOMIC_OP
> +#undef ATOMIC_OPS
> +
> +/*
> + * Atomic ops that have ordered, relaxed, acquire, and release variants.
> + * There's two flavors of these: the arithmatic ops have both fetch and return
> + * versions, while the logical ops only have fetch versions.
> + */
> +#define ATOMIC_FETCH_OP(op, asm_op, I, asm_type, c_type, prefix)    \
> +static inline                                                       \
> +c_type atomic##prefix##_fetch_##op##_relaxed(c_type i,              \
> +                         atomic##prefix##_t *v)                     \
> +{                                                                   \
> +    register c_type ret;                                            \
> +    __asm__ __volatile__ (                                          \
> +        "   amo" #asm_op "." #asm_type " %1, %2, %0"                \
> +        : "+A" (v->counter), "=r" (ret)                             \
> +        : "r" (I)                                                   \
> +        : "memory" );                                               \
> +    return ret;                                                     \
> +}                                                                   \
> +static inline                                                       \
> +c_type atomic##prefix##_fetch_##op(c_type i, atomic##prefix##_t *v) \
> +{                                                                   \
> +    register c_type ret;                                            \
> +    __asm__ __volatile__ (                                          \
> +        "   amo" #asm_op "." #asm_type ".aqrl  %1, %2, %0"          \
> +        : "+A" (v->counter), "=r" (ret)                             \
> +        : "r" (I)                                                   \
> +        : "memory" );                                               \
> +    return ret;                                                     \
> +}
> +
> +#define ATOMIC_OP_RETURN(op, asm_op, c_op, I, asm_type, c_type, prefix) \
> +static inline                                                           \
> +c_type atomic##prefix##_##op##_return_relaxed(c_type i,                 \
> +                          atomic##prefix##_t *v)                        \
> +{                                                                       \
> +        return atomic##prefix##_fetch_##op##_relaxed(i, v) c_op I;      \
> +}                                                                       \
> +static inline                                                           \
> +c_type atomic##prefix##_##op##_return(c_type i, atomic##prefix##_t *v)  \
> +{                                                                       \
> +        return atomic##prefix##_fetch_##op(i, v) c_op I;                \
> +}
> +
> +#define ATOMIC_OPS(op, asm_op, c_op, I)                                 \
> +        ATOMIC_FETCH_OP( op, asm_op,       I, w, int,   )               \
> +        ATOMIC_OP_RETURN(op, asm_op, c_op, I, w, int,   )

What purpose is the last macro argument when you only ever pass nothing
for it (here and ...

> +ATOMIC_OPS(add, add, +,  i)
> +ATOMIC_OPS(sub, add, +, -i)
> +
> +#undef ATOMIC_OPS
> +
> +#define ATOMIC_OPS(op, asm_op, I) \
> +        ATOMIC_FETCH_OP(op, asm_op, I, w, int,   )

... here)?

> +ATOMIC_OPS(and, and, i)
> +ATOMIC_OPS( or,  or, i)
> +ATOMIC_OPS(xor, xor, i)
> +
> +#undef ATOMIC_OPS
> +
> +#undef ATOMIC_FETCH_OP
> +#undef ATOMIC_OP_RETURN
> +
> +/* This is required to provide a full barrier on success. */
> +static inline int atomic_add_unless(atomic_t *v, int a, int u)
> +{
> +       int prev, rc;
> +
> +    __asm__ __volatile__ (
> +        "0: lr.w     %[p],  %[c]\n"
> +        "   beq      %[p],  %[u], 1f\n"
> +        "   add      %[rc], %[p], %[a]\n"
> +        "   sc.w.rl  %[rc], %[rc], %[c]\n"
> +        "   bnez     %[rc], 0b\n"
> +        RISCV_FULL_BARRIER
> +        "1:\n"
> +        : [p] "=&r" (prev), [rc] "=&r" (rc), [c] "+A" (v->counter)
> +        : [a] "r" (a), [u] "r" (u)
> +        : "memory");
> +    return prev;
> +}
> +
> +/*
> + * atomic_{cmp,}xchg is required to have exactly the same ordering semantics as
> + * {cmp,}xchg and the operations that return, so they need a full barrier.
> + */
> +#define ATOMIC_OP(c_t, prefix, size)                            \
> +static inline                                                   \
> +c_t atomic##prefix##_xchg_relaxed(atomic##prefix##_t *v, c_t n) \
> +{                                                               \
> +    return __xchg_generic(&(v->counter), n, size, "", "", "");  \

The inner parentheses aren't really needed here, are they?

> +}                                                               \
> +static inline                                                   \
> +c_t atomic##prefix##_xchg_acquire(atomic##prefix##_t *v, c_t n) \
> +{                                                               \
> +    return __xchg_generic(&(v->counter), n, size,               \
> +                          "", "", RISCV_ACQUIRE_BARRIER);       \
> +}                                                               \
> +static inline                                                   \
> +c_t atomic##prefix##_xchg_release(atomic##prefix##_t *v, c_t n) \
> +{                                                               \
> +    return __xchg_generic(&(v->counter), n, size,               \
> +                          "", RISCV_RELEASE_BARRIER, "");       \
> +}                                                               \
> +static inline                                                   \
> +c_t atomic##prefix##_xchg(atomic##prefix##_t *v, c_t n)         \
> +{                                                               \
> +    return __xchg_generic(&(v->counter), n, size,               \
> +                          ".aqrl", "", "");                     \
> +}                                                               \
> +static inline                                                   \
> +c_t atomic##prefix##_cmpxchg_relaxed(atomic##prefix##_t *v,     \
> +                     c_t o, c_t n)                              \
> +{                                                               \
> +    return __cmpxchg_generic(&(v->counter), o, n, size,         \
> +                             "", "", "");                       \
> +}                                                               \
> +static inline                                                   \
> +c_t atomic##prefix##_cmpxchg_acquire(atomic##prefix##_t *v,     \
> +                     c_t o, c_t n)                              \
> +{                                                               \
> +    return __cmpxchg_generic(&(v->counter), o, n, size,         \
> +                             "", "", RISCV_ACQUIRE_BARRIER);    \
> +}                                                               \
> +static inline                                                   \
> +c_t atomic##prefix##_cmpxchg_release(atomic##prefix##_t *v,     \
> +                     c_t o, c_t n)                              \
> +{	                                                            \

A hard tab looks to have been left here.

> +    return __cmpxchg_generic(&(v->counter), o, n, size,         \
> +                             "", RISCV_RELEASE_BARRIER, "");    \
> +}                                                               \
> +static inline                                                   \
> +c_t atomic##prefix##_cmpxchg(atomic##prefix##_t *v, c_t o, c_t n) \
> +{                                                               \
> +    return __cmpxchg_generic(&(v->counter), o, n, size,         \
> +                             ".rl", "", " fence rw, rw\n");     \
> +}
> +
> +#define ATOMIC_OPS() \
> +    ATOMIC_OP(int,   , 4)
> +
> +ATOMIC_OPS()
> +
> +#undef ATOMIC_OPS
> +#undef ATOMIC_OP
> +
> +static inline int atomic_sub_if_positive(atomic_t *v, int offset)
> +{
> +       int prev, rc;
> +
> +    __asm__ __volatile__ (
> +        "0: lr.w     %[p],  %[c]\n"
> +        "   sub      %[rc], %[p], %[o]\n"
> +        "   bltz     %[rc], 1f\n"
> +        "   sc.w.rl  %[rc], %[rc], %[c]\n"
> +        "   bnez     %[rc], 0b\n"
> +        "   fence    rw, rw\n"
> +        "1:\n"
> +        : [p] "=&r" (prev), [rc] "=&r" (rc), [c] "+A" (v->counter)
> +        : [o] "r" (offset)
> +        : "memory" );
> +    return prev - offset;
> +}
> +
> +#define atomic_dec_if_positive(v)	atomic_sub_if_positive(v, 1)

Hmm, PPC for some reason also has the latter, but for both: Are they indeed
going to be needed in RISC-V code? They certainly look unnecessary for the
purpose of this series (allowing common code to build).

> --- /dev/null
> +++ b/xen/include/asm-generic/atomic-ops.h
> @@ -0,0 +1,92 @@
> +#/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _ASM_GENERIC_ATOMIC_OPS_H_
> +#define _ASM_GENERIC_ATOMIC_OPS_H_
> +
> +#include <xen/atomic.h>
> +#include <xen/lib.h>

If I'm not mistaken this header provides default implementations for every
xen/atomic.h-provided forward inline declaration that can be synthesized
from other atomic functions. I think a comment to this effect would want
adding somewhere here.

Jan

Re: [PATCH v5 13/23] xen/riscv: introduce atomic.h

Posted by Oleksii 1 year, 11 months ago

On Wed, 2024-03-06 at 16:31 +0100, Jan Beulich wrote:
> On 26.02.2024 18:38, Oleksii Kurochko wrote:
> > --- /dev/null
> > +++ b/xen/arch/riscv/include/asm/atomic.h
> > @@ -0,0 +1,296 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Taken and modified from Linux.
> > + *
> > + * The following changes were done:
> > + * - * atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were
> > updated
> > + *     to use__*xchg_generic()
> > + * - drop casts in write_atomic() as they are unnecessary
> > + * - drop introduction of WRITE_ONCE() and READ_ONCE().
> > + *   Xen provides ACCESS_ONCE()
> > + * - remove zero-length array access in read_atomic()
> > + * - drop defines similar to pattern
> > + *   #define atomic_add_return_relaxed   atomic_add_return_relaxed
> > + * - move not RISC-V specific functions to asm-generic/atomics-
> > ops.h
> > + * 
> > + * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
> > + * Copyright (C) 2012 Regents of the University of California
> > + * Copyright (C) 2017 SiFive
> > + * Copyright (C) 2024 Vates SAS
> > + */
> > +
> > +#ifndef _ASM_RISCV_ATOMIC_H
> > +#define _ASM_RISCV_ATOMIC_H
> > +
> > +#include <xen/atomic.h>
> > +
> > +#include <asm/cmpxchg.h>
> > +#include <asm/fence.h>
> > +#include <asm/io.h>
> > +#include <asm/system.h>
> > +
> > +#include <asm-generic/atomic-ops.h>
> 
> While, because of the forward decls in xen/atomic.h, having this
> #include
> works, I wonder if it wouldn't better be placed further down. The
> compiler
> will likely have an easier time when it sees the inline definitions
> ahead
> of any uses.
Do you mean to move it after #define __atomic_release_fence() ?

> 
> > +void __bad_atomic_size(void);
> > +
> > +/*
> > + * Legacy from Linux kernel. For some reason they wanted to have
> > ordered
> > + * read/write access. Thereby read* is used instead of
> > read<X>_cpu()
> > + */
> > +static always_inline void read_atomic_size(const volatile void *p,
> > +                                           void *res,
> > +                                           unsigned int size)
> > +{
> > +    switch ( size )
> > +    {
> > +    case 1: *(uint8_t *)res = readb(p); break;
> > +    case 2: *(uint16_t *)res = readw(p); break;
> > +    case 4: *(uint32_t *)res = readl(p); break;
> > +    case 8: *(uint32_t *)res  = readq(p); break;
> 
> This is the point where the lack of constraints in io.h (see my
> respective
> comment) becomes actually harmful: You're accessing not MMIO, but
> compiler-
> visible variables here. It needs to know which ones are read ...
> 
> > +    default: __bad_atomic_size(); break;
> > +    }
> > +}
> > +
> > +#define read_atomic(p) ({                               \
> > +    union { typeof(*p) val; char c[sizeof(*p)]; } x_;   \
> > +    read_atomic_size(p, x_.c, sizeof(*p));              \
> > +    x_.val;                                             \
> > +})
> > +
> > +#define write_atomic(p, x)                              \
> > +({                                                      \
> > +    typeof(*p) x__ = (x);                               \
> > +    switch ( sizeof(*p) )                               \
> > +    {                                                   \
> > +    case 1: writeb(x__,  p); break;                     \
> > +    case 2: writew(x__, p); break;                      \
> > +    case 4: writel(x__, p); break;                      \
> > +    case 8: writeq(x__, p); break;                      \
> 
> ... or written.
> 
> Nit: There's a stray blank in the writeb() invocation.
> 
> > +    default: __bad_atomic_size(); break;                \
> > +    }                                                   \
> > +    x__;                                                \
> > +})
> > +
> > +#define add_sized(p, x)                                 \
> > +({                                                      \
> > +    typeof(*(p)) x__ = (x);                             \
> > +    switch ( sizeof(*(p)) )                             \
> 
> Like you have it here, {read,write}_atomic() also need p properly
> parenthesized. There look to be more parenthesization issues further
> down.
> 
> > +    {                                                   \
> > +    case 1: writeb(read_atomic(p) + x__, p); break;     \
> > +    case 2: writew(read_atomic(p) + x__, p); break;     \
> > +    case 4: writel(read_atomic(p) + x__, p); break;     \
> > +    default: __bad_atomic_size(); break;                \
> > +    }                                                   \
> > +})
> 
> Any reason this doesn't have an 8-byte case? x86'es at least has one.
Just missed to add and no compiler error I had, but I'll added case 8.

> 
> > +#define __atomic_acquire_fence() \
> > +    __asm__ __volatile__ ( RISCV_ACQUIRE_BARRIER "" ::: "memory" )
> > +
> > +#define __atomic_release_fence() \
> > +    __asm__ __volatile__ ( RISCV_RELEASE_BARRIER "" ::: "memory" )
> 
> Elsewhere you use asm volatile() - why __asm__ __volatile__() here?
> Or why not there (cmpxchg.h, io.h)?
It is how it was defined in Linux kernel, so I decided to use their
code style, but considering this macros likely not to be changed I can
update this lines with asm volatile.

> 
> > +/*
> > + * First, the atomic ops that have no ordering constraints and
> > therefor don't
> > + * have the AQ or RL bits set.  These don't return anything, so
> > there's only
> > + * one version to worry about.
> > + */
> > +#define ATOMIC_OP(op, asm_op, I, asm_type, c_type, prefix)  \
> > +static inline                                               \
> > +void atomic##prefix##_##op(c_type i, atomic##prefix##_t *v) \
> > +{                                                           \
> > +    __asm__ __volatile__ (                                  \
> > +        "   amo" #asm_op "." #asm_type " zero, %1, %0"      \
> > +        : "+A" (v->counter)                                 \
> > +        : "r" (I)                                           \
> > +        : "memory" );                                       \
> > +}                                                           \
> > +
> > +#define ATOMIC_OPS(op, asm_op, I)                           \
> > +        ATOMIC_OP (op, asm_op, I, w, int,   )
> > +
> > +ATOMIC_OPS(add, add,  i)
> > +ATOMIC_OPS(sub, add, -i)
> > +ATOMIC_OPS(and, and,  i)
> > +ATOMIC_OPS( or,  or,  i)
> > +ATOMIC_OPS(xor, xor,  i)
> > +
> > +#undef ATOMIC_OP
> > +#undef ATOMIC_OPS
> > +
> > +/*
> > + * Atomic ops that have ordered, relaxed, acquire, and release
> > variants.
> > + * There's two flavors of these: the arithmatic ops have both
> > fetch and return
> > + * versions, while the logical ops only have fetch versions.
> > + */
> > +#define ATOMIC_FETCH_OP(op, asm_op, I, asm_type, c_type,
> > prefix)    \
> > +static
> > inline                                                       \
> > +c_type atomic##prefix##_fetch_##op##_relaxed(c_type
> > i,              \
> > +                         atomic##prefix##_t
> > *v)                     \
> > +{                                                                 
> >   \
> > +    register c_type
> > ret;                                            \
> > +    __asm__ __volatile__
> > (                                          \
> > +        "   amo" #asm_op "." #asm_type " %1, %2,
> > %0"                \
> > +        : "+A" (v->counter), "=r"
> > (ret)                             \
> > +        : "r"
> > (I)                                                   \
> > +        : "memory"
> > );                                               \
> > +    return
> > ret;                                                     \
> > +}                                                                 
> >   \
> > +static
> > inline                                                       \
> > +c_type atomic##prefix##_fetch_##op(c_type i, atomic##prefix##_t
> > *v) \
> > +{                                                                 
> >   \
> > +    register c_type
> > ret;                                            \
> > +    __asm__ __volatile__
> > (                                          \
> > +        "   amo" #asm_op "." #asm_type ".aqrl  %1, %2,
> > %0"          \
> > +        : "+A" (v->counter), "=r"
> > (ret)                             \
> > +        : "r"
> > (I)                                                   \
> > +        : "memory"
> > );                                               \
> > +    return
> > ret;                                                     \
> > +}
> > +
> > +#define ATOMIC_OP_RETURN(op, asm_op, c_op, I, asm_type, c_type,
> > prefix) \
> > +static
> > inline                                                           \
> > +c_type atomic##prefix##_##op##_return_relaxed(c_type
> > i,                 \
> > +                          atomic##prefix##_t
> > *v)                        \
> > +{                                                                 
> >       \
> > +        return atomic##prefix##_fetch_##op##_relaxed(i, v) c_op
> > I;      \
> > +}                                                                 
> >       \
> > +static
> > inline                                                           \
> > +c_type atomic##prefix##_##op##_return(c_type i, atomic##prefix##_t
> > *v)  \
> > +{                                                                 
> >       \
> > +        return atomic##prefix##_fetch_##op(i, v) c_op
> > I;                \
> > +}
> > +
> > +#define ATOMIC_OPS(op, asm_op, c_op,
> > I)                                 \
> > +        ATOMIC_FETCH_OP( op, asm_op,       I, w, int,  
> > )               \
> > +        ATOMIC_OP_RETURN(op, asm_op, c_op, I, w, int,   )
> 
> What purpose is the last macro argument when you only ever pass
> nothing
> for it (here and ...
> 
> > +ATOMIC_OPS(add, add, +,  i)
> > +ATOMIC_OPS(sub, add, +, -i)
> > +
> > +#undef ATOMIC_OPS
> > +
> > +#define ATOMIC_OPS(op, asm_op, I) \
> > +        ATOMIC_FETCH_OP(op, asm_op, I, w, int,   )
> 
> ... here)?
for generic ATOMIC64 it is not used:

#ifdef CONFIG_GENERIC_ATOMIC64
#define ATOMIC_OPS(op, asm_op, c_op,
I)					\
        ATOMIC_FETCH_OP( op, asm_op,       I, w, int,  
)		\
        ATOMIC_OP_RETURN(op, asm_op, c_op, I, w, int,   )
#else
#define ATOMIC_OPS(op, asm_op, c_op,
I)					\
        ATOMIC_FETCH_OP( op, asm_op,       I, w, int,  
)		\
        ATOMIC_OP_RETURN(op, asm_op, c_op, I, w, int,  
)		\
        ATOMIC_FETCH_OP( op, asm_op,       I, d, s64,
64)		\
        ATOMIC_OP_RETURN(op, asm_op, c_op, I, d, s64, 64)
#endif

( the code is from Linux kernel )
Only CONFIG_GENERIC_ATOMIC64=y was ported to Xen.

> 
> > +ATOMIC_OPS(and, and, i)
> > +ATOMIC_OPS( or,  or, i)
> > +ATOMIC_OPS(xor, xor, i)
> > +
> > +#undef ATOMIC_OPS
> > +
> > +#undef ATOMIC_FETCH_OP
> > +#undef ATOMIC_OP_RETURN
> > +
> > +/* This is required to provide a full barrier on success. */
> > +static inline int atomic_add_unless(atomic_t *v, int a, int u)
> > +{
> > +       int prev, rc;
> > +
> > +    __asm__ __volatile__ (
> > +        "0: lr.w     %[p],  %[c]\n"
> > +        "   beq      %[p],  %[u], 1f\n"
> > +        "   add      %[rc], %[p], %[a]\n"
> > +        "   sc.w.rl  %[rc], %[rc], %[c]\n"
> > +        "   bnez     %[rc], 0b\n"
> > +        RISCV_FULL_BARRIER
> > +        "1:\n"
> > +        : [p] "=&r" (prev), [rc] "=&r" (rc), [c] "+A" (v->counter)
> > +        : [a] "r" (a), [u] "r" (u)
> > +        : "memory");
> > +    return prev;
> > +}
> > +
> > +/*
> > + * atomic_{cmp,}xchg is required to have exactly the same ordering
> > semantics as
> > + * {cmp,}xchg and the operations that return, so they need a full
> > barrier.
> > + */
> > +#define ATOMIC_OP(c_t, prefix, size)                            \
> > +static inline                                                   \
> > +c_t atomic##prefix##_xchg_relaxed(atomic##prefix##_t *v, c_t n) \
> > +{                                                               \
> > +    return __xchg_generic(&(v->counter), n, size, "", "", "");  \
> 
> The inner parentheses aren't really needed here, are they?
> 
> > +}                                                               \
> > +static inline                                                   \
> > +c_t atomic##prefix##_xchg_acquire(atomic##prefix##_t *v, c_t n) \
> > +{                                                               \
> > +    return __xchg_generic(&(v->counter), n, size,               \
> > +                          "", "", RISCV_ACQUIRE_BARRIER);       \
> > +}                                                               \
> > +static inline                                                   \
> > +c_t atomic##prefix##_xchg_release(atomic##prefix##_t *v, c_t n) \
> > +{                                                               \
> > +    return __xchg_generic(&(v->counter), n, size,               \
> > +                          "", RISCV_RELEASE_BARRIER, "");       \
> > +}                                                               \
> > +static inline                                                   \
> > +c_t atomic##prefix##_xchg(atomic##prefix##_t *v, c_t n)         \
> > +{                                                               \
> > +    return __xchg_generic(&(v->counter), n, size,               \
> > +                          ".aqrl", "", "");                     \
> > +}                                                               \
> > +static inline                                                   \
> > +c_t atomic##prefix##_cmpxchg_relaxed(atomic##prefix##_t *v,     \
> > +                     c_t o, c_t n)                              \
> > +{                                                               \
> > +    return __cmpxchg_generic(&(v->counter), o, n, size,         \
> > +                             "", "", "");                       \
> > +}                                                               \
> > +static inline                                                   \
> > +c_t atomic##prefix##_cmpxchg_acquire(atomic##prefix##_t *v,     \
> > +                     c_t o, c_t n)                              \
> > +{                                                               \
> > +    return __cmpxchg_generic(&(v->counter), o, n, size,         \
> > +                             "", "", RISCV_ACQUIRE_BARRIER);    \
> > +}                                                               \
> > +static inline                                                   \
> > +c_t atomic##prefix##_cmpxchg_release(atomic##prefix##_t *v,     \
> > +                     c_t o, c_t n)                              \
> > +{	                                                          
> >   \
> 
> A hard tab looks to have been left here.
> 
> > +    return __cmpxchg_generic(&(v->counter), o, n, size,         \
> > +                             "", RISCV_RELEASE_BARRIER, "");    \
> > +}                                                               \
> > +static inline                                                   \
> > +c_t atomic##prefix##_cmpxchg(atomic##prefix##_t *v, c_t o, c_t n)
> > \
> > +{                                                               \
> > +    return __cmpxchg_generic(&(v->counter), o, n, size,         \
> > +                             ".rl", "", " fence rw, rw\n");     \
> > +}
> > +
> > +#define ATOMIC_OPS() \
> > +    ATOMIC_OP(int,   , 4)
> > +
> > +ATOMIC_OPS()
> > +
> > +#undef ATOMIC_OPS
> > +#undef ATOMIC_OP
> > +
> > +static inline int atomic_sub_if_positive(atomic_t *v, int offset)
> > +{
> > +       int prev, rc;
> > +
> > +    __asm__ __volatile__ (
> > +        "0: lr.w     %[p],  %[c]\n"
> > +        "   sub      %[rc], %[p], %[o]\n"
> > +        "   bltz     %[rc], 1f\n"
> > +        "   sc.w.rl  %[rc], %[rc], %[c]\n"
> > +        "   bnez     %[rc], 0b\n"
> > +        "   fence    rw, rw\n"
> > +        "1:\n"
> > +        : [p] "=&r" (prev), [rc] "=&r" (rc), [c] "+A" (v->counter)
> > +        : [o] "r" (offset)
> > +        : "memory" );
> > +    return prev - offset;
> > +}
> > +
> > +#define atomic_dec_if_positive(v)	atomic_sub_if_positive(v,
> > 1)
> 
> Hmm, PPC for some reason also has the latter, but for both: Are they
> indeed
> going to be needed in RISC-V code? They certainly look unnecessary
> for the
> purpose of this series (allowing common code to build).
I checked my private branched and I don't use it still, so it makes
sense to drop it.

> 
> > --- /dev/null
> > +++ b/xen/include/asm-generic/atomic-ops.h
> > @@ -0,0 +1,92 @@
> > +#/* SPDX-License-Identifier: GPL-2.0 */
> > +#ifndef _ASM_GENERIC_ATOMIC_OPS_H_
> > +#define _ASM_GENERIC_ATOMIC_OPS_H_
> > +
> > +#include <xen/atomic.h>
> > +#include <xen/lib.h>
> 
> If I'm not mistaken this header provides default implementations for
> every
> xen/atomic.h-provided forward inline declaration that can be
> synthesized
> from other atomic functions. I think a comment to this effect would
> want
> adding somewhere here.
I think we can drop this inclusion here as inclusion of asm-
generic/atomic-ops.h will be always go with inclusion of xen/atomic.h.

~ Oleksii

Re: [PATCH v5 13/23] xen/riscv: introduce atomic.h

Posted by Jan Beulich 1 year, 11 months ago

On 07.03.2024 14:30, Oleksii wrote:
> On Wed, 2024-03-06 at 16:31 +0100, Jan Beulich wrote:
>> On 26.02.2024 18:38, Oleksii Kurochko wrote:
>>> --- /dev/null
>>> +++ b/xen/arch/riscv/include/asm/atomic.h
>>> @@ -0,0 +1,296 @@
>>> +/* SPDX-License-Identifier: GPL-2.0-only */
>>> +/*
>>> + * Taken and modified from Linux.
>>> + *
>>> + * The following changes were done:
>>> + * - * atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were
>>> updated
>>> + *     to use__*xchg_generic()
>>> + * - drop casts in write_atomic() as they are unnecessary
>>> + * - drop introduction of WRITE_ONCE() and READ_ONCE().
>>> + *   Xen provides ACCESS_ONCE()
>>> + * - remove zero-length array access in read_atomic()
>>> + * - drop defines similar to pattern
>>> + *   #define atomic_add_return_relaxed   atomic_add_return_relaxed
>>> + * - move not RISC-V specific functions to asm-generic/atomics-
>>> ops.h
>>> + * 
>>> + * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
>>> + * Copyright (C) 2012 Regents of the University of California
>>> + * Copyright (C) 2017 SiFive
>>> + * Copyright (C) 2024 Vates SAS
>>> + */
>>> +
>>> +#ifndef _ASM_RISCV_ATOMIC_H
>>> +#define _ASM_RISCV_ATOMIC_H
>>> +
>>> +#include <xen/atomic.h>
>>> +
>>> +#include <asm/cmpxchg.h>
>>> +#include <asm/fence.h>
>>> +#include <asm/io.h>
>>> +#include <asm/system.h>
>>> +
>>> +#include <asm-generic/atomic-ops.h>
>>
>> While, because of the forward decls in xen/atomic.h, having this
>> #include
>> works, I wonder if it wouldn't better be placed further down. The
>> compiler
>> will likely have an easier time when it sees the inline definitions
>> ahead
>> of any uses.
> Do you mean to move it after #define __atomic_release_fence() ?

Perhaps yet further down, at least after the arithmetic ops were defined.

>>> --- /dev/null
>>> +++ b/xen/include/asm-generic/atomic-ops.h
>>> @@ -0,0 +1,92 @@
>>> +#/* SPDX-License-Identifier: GPL-2.0 */
>>> +#ifndef _ASM_GENERIC_ATOMIC_OPS_H_
>>> +#define _ASM_GENERIC_ATOMIC_OPS_H_
>>> +
>>> +#include <xen/atomic.h>
>>> +#include <xen/lib.h>
>>
>> If I'm not mistaken this header provides default implementations for
>> every
>> xen/atomic.h-provided forward inline declaration that can be
>> synthesized
>> from other atomic functions. I think a comment to this effect would
>> want
>> adding somewhere here.
> I think we can drop this inclusion here as inclusion of asm-
> generic/atomic-ops.h will be always go with inclusion of xen/atomic.h.

I'm okay with dropping that include, but that wasn't the purpose of my
comment. I was rather asking for a comment to be added here stating
what is (not) to be present in this header.

Jan