From: guoren@kernel.org
To: guoren@kernel.org, arnd@arndb.de, mark.rutland@arm.com, boqun.feng@gmail.com, peterz@infradead.org, will@kernel.org
Cc: linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org, linux-csky@vger.kernel.org, Guo Ren <guoren@linux.alibaba.com>
Subject: [PATCH V4 1/3] csky: atomic: Optimize cmpxchg with acquire & release
Date: Sun, 24 Apr 2022 15:29:16 +0800
Message-Id: <20220424072918.2596899-2-guoren@kernel.org>
In-Reply-To: <20220424072918.2596899-1-guoren@kernel.org>
References: <20220424072918.2596899-1-guoren@kernel.org>

From: Guo Ren <guoren@linux.alibaba.com>

Optimize cmpxchg with acquire/release fence instructions in inline
assembly instead of the previous implementation built on the generic
fences. This avoids executing any trailing fence when cmpxchg's first
load != old.

Comments by Rutland:
 8e86f0b409a4 ("arm64: atomics: fix use of acquire + release for full
 barrier semantics")

Comments by Boqun:
 FWIW, you probably need to make sure that a barrier instruction inside
 an lr/sc loop is a good thing. IIUC, the execution time of a barrier
 instruction is determined by the status of store buffers and invalidate
 queues (and probably other stuffs), so it may increase the execution
 time of the lr/sc loop, and make it unlikely to succeed. But this
 really depends on how the arch executes these instructions.
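[Editor's note] To make the fence placement concrete, here is a C-level
sketch of the new fully-ordered arch_cmpxchg() control flow (an
illustration only, not part of the patch; load_exclusive(),
store_exclusive() and the fence macros are hypothetical stand-ins for
ldex.w/stex.w and the bar.* instructions):

	/* Sketch: where the fences sit in the new arch_cmpxchg(). */
	static inline int cmpxchg_sketch(int *ptr, int old, int new)
	{
		int ret;

		RELEASE_FENCE();		/* bar.brwaw, always executed */
	retry:
		ret = load_exclusive(ptr);	/* ldex.w */
		if (ret != old)
			return ret;		/* failure: no trailing fence */
		if (!store_exclusive(ptr, new))	/* stex.w, 0 = lost reservation */
			goto retry;
		FULL_FENCE();			/* bar.brwarw, success path only */
		return ret;
	}

As the arm64 commit cited above explains, the trailing barrier on the
success path must be a full fence rather than a plain acquire for
arch_cmpxchg() to provide full-barrier semantics.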
Link: https://lore.kernel.org/linux-riscv/CAJF2gTSAxpAi=LbAdu7jntZRUa=-dJwL0VfmDfBV5MHB=rcZ-w@mail.gmail.com/T/#m27a0f1342995deae49ce1d0e1f2683f8a181d6c3
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
---
 arch/csky/include/asm/barrier.h | 11 +++---
 arch/csky/include/asm/cmpxchg.h | 64 ++++++++++++++++++++++++++++++---
 2 files changed, 67 insertions(+), 8 deletions(-)

diff --git a/arch/csky/include/asm/barrier.h b/arch/csky/include/asm/barrier.h
index f4045dd53e17..15de58b10aec 100644
--- a/arch/csky/include/asm/barrier.h
+++ b/arch/csky/include/asm/barrier.h
@@ -37,17 +37,21 @@
  * bar.brar
  * bar.bwaw
  */
+#define FULL_FENCE		".long 0x842fc000\n"
+#define ACQUIRE_FENCE		".long 0x8427c000\n"
+#define RELEASE_FENCE		".long 0x842ec000\n"
+
 #define __bar_brw()	asm volatile (".long 0x842cc000\n":::"memory")
 #define __bar_br()	asm volatile (".long 0x8424c000\n":::"memory")
 #define __bar_bw()	asm volatile (".long 0x8428c000\n":::"memory")
 #define __bar_arw()	asm volatile (".long 0x8423c000\n":::"memory")
 #define __bar_ar()	asm volatile (".long 0x8421c000\n":::"memory")
 #define __bar_aw()	asm volatile (".long 0x8422c000\n":::"memory")
-#define __bar_brwarw()	asm volatile (".long 0x842fc000\n":::"memory")
-#define __bar_brarw()	asm volatile (".long 0x8427c000\n":::"memory")
+#define __bar_brwarw()	asm volatile (FULL_FENCE:::"memory")
+#define __bar_brarw()	asm volatile (ACQUIRE_FENCE:::"memory")
 #define __bar_bwarw()	asm volatile (".long 0x842bc000\n":::"memory")
 #define __bar_brwar()	asm volatile (".long 0x842dc000\n":::"memory")
-#define __bar_brwaw()	asm volatile (".long 0x842ec000\n":::"memory")
+#define __bar_brwaw()	asm volatile (RELEASE_FENCE:::"memory")
 #define __bar_brar()	asm volatile (".long 0x8425c000\n":::"memory")
 #define __bar_brar()	asm volatile (".long 0x8425c000\n":::"memory")
 #define __bar_bwaw()	asm volatile (".long 0x842ac000\n":::"memory")
@@ -56,7 +60,6 @@
 #define __smp_rmb()	__bar_brar()
 #define __smp_wmb()	__bar_bwaw()
 
-#define ACQUIRE_FENCE		".long 0x8427c000\n"
 #define __smp_acquire_fence()	__bar_brarw()
 #define __smp_release_fence()	__bar_brwaw()
 
diff --git a/arch/csky/include/asm/cmpxchg.h b/arch/csky/include/asm/cmpxchg.h
index d1bef11f8dc9..5b8faccd65e4 100644
--- a/arch/csky/include/asm/cmpxchg.h
+++ b/arch/csky/include/asm/cmpxchg.h
@@ -64,15 +64,71 @@ extern void __bad_xchg(void);
 #define arch_cmpxchg_relaxed(ptr, o, n) \
 	(__cmpxchg_relaxed((ptr), (o), (n), sizeof(*(ptr))))
 
-#define arch_cmpxchg(ptr, o, n) \
+#define __cmpxchg_acquire(ptr, old, new, size)			\
 ({								\
+	__typeof__(ptr) __ptr = (ptr);				\
+	__typeof__(new) __new = (new);				\
+	__typeof__(new) __tmp;					\
+	__typeof__(old) __old = (old);				\
+	__typeof__(*(ptr)) __ret;				\
+	switch (size) {						\
+	case 4:							\
+		asm volatile (					\
+		"1:	ldex.w		%0, (%3) \n"		\
+		"	cmpne		%0, %4   \n"		\
+		"	bt		2f       \n"		\
+		"	mov		%1, %2   \n"		\
+		"	stex.w		%1, (%3) \n"		\
+		"	bez		%1, 1b   \n"		\
+		ACQUIRE_FENCE					\
+		"2:			 \n"			\
+		: "=&r" (__ret), "=&r" (__tmp)			\
+		: "r" (__new), "r"(__ptr), "r"(__old)		\
+		:);						\
+		break;						\
+	default:						\
+		__bad_xchg();					\
+	}							\
+	__ret;							\
+})
+
+#define arch_cmpxchg_acquire(ptr, o, n) \
+	(__cmpxchg_acquire((ptr), (o), (n), sizeof(*(ptr))))
+
+#define __cmpxchg(ptr, old, new, size)				\
+({								\
+	__typeof__(ptr) __ptr = (ptr);				\
+	__typeof__(new) __new = (new);				\
+	__typeof__(new) __tmp;					\
+	__typeof__(old) __old = (old);				\
 	__typeof__(*(ptr)) __ret;				\
-	__smp_release_fence();					\
-	__ret = arch_cmpxchg_relaxed(ptr, o, n);		\
-	__smp_acquire_fence();					\
+	switch (size) {						\
+	case 4:							\
+		asm volatile (					\
+		RELEASE_FENCE					\
+		"1:	ldex.w		%0, (%3) \n"		\
+		"	cmpne		%0, %4   \n"		\
+		"	bt		2f       \n"		\
+		"	mov		%1, %2   \n"		\
+		"	stex.w		%1, (%3) \n"		\
+		"	bez		%1, 1b   \n"		\
+		FULL_FENCE					\
+		"2:			 \n"			\
+		: "=&r" (__ret), "=&r" (__tmp)			\
+		: "r" (__new), "r"(__ptr), "r"(__old)		\
+		:);						\
+		break;						\
+	default:						\
+		__bad_xchg();					\
+	}							\
 	__ret;							\
 })
 
+#define arch_cmpxchg(ptr, o, n) \
+	(__cmpxchg((ptr), (o), (n), sizeof(*(ptr))))
+
+#define arch_cmpxchg_local(ptr, o, n) \
+	(__cmpxchg_relaxed((ptr), (o), (n), sizeof(*(ptr))))
 #else
 #include <asm-generic/cmpxchg.h>
 #endif
-- 
2.25.1

From: guoren@kernel.org
To: guoren@kernel.org, arnd@arndb.de, mark.rutland@arm.com, boqun.feng@gmail.com, peterz@infradead.org, will@kernel.org
Cc: linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org, linux-csky@vger.kernel.org, Guo Ren <guoren@linux.alibaba.com>
Subject: [PATCH V4 2/3] csky: atomic: Add custom atomic.h implementation
Date: Sun, 24 Apr 2022 15:29:17 +0800
Message-Id: <20220424072918.2596899-3-guoren@kernel.org>
In-Reply-To: <20220424072918.2596899-1-guoren@kernel.org>
References: <20220424072918.2596899-1-guoren@kernel.org>

From: Guo Ren <guoren@linux.alibaba.com>

The generic atomic.h implements the atomic operations on top of
cmpxchg, which leads to a dual (nested) retry loop and weakens the
forward-progress guarantee. This patch implements custom csky atomic
operations directly with ldex/stex instructions, for better
performance.
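[Editor's note] To illustrate the dual loop, here is a rough sketch (not
the exact asm-generic code) of an atomic add built on cmpxchg, using the
interfaces defined in this series:

	/* Sketch: the cmpxchg-based fallback retries at two levels. */
	static __always_inline void sketch_atomic_add(int i, atomic_t *v)
	{
		int old;

		do {
			old = arch_atomic_read(v);	/* outer retry loop */
		} while (arch_atomic_cmpxchg_relaxed(v, old, old + i) != old);
	}

Each cmpxchg in the loop condition is itself an ldex/stex retry loop, so
a contended add can spin at two levels; the ATOMIC_OP() below needs only
a single ldex/stex loop.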
Important comment by Rutland:
 8e86f0b409a4 ("arm64: atomics: fix use of acquire + release for full
 barrier semantics")

Link: https://lore.kernel.org/linux-riscv/CAJF2gTSAxpAi=LbAdu7jntZRUa=-dJwL0VfmDfBV5MHB=rcZ-w@mail.gmail.com/T/#m27a0f1342995deae49ce1d0e1f2683f8a181d6c3
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
---
 arch/csky/include/asm/atomic.h | 142 +++++++++++++++++++++++++++++++++
 1 file changed, 142 insertions(+)
 create mode 100644 arch/csky/include/asm/atomic.h

diff --git a/arch/csky/include/asm/atomic.h b/arch/csky/include/asm/atomic.h
new file mode 100644
index 000000000000..56c9dc8e91b3
--- /dev/null
+++ b/arch/csky/include/asm/atomic.h
@@ -0,0 +1,142 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ASM_CSKY_ATOMIC_H
+#define __ASM_CSKY_ATOMIC_H
+
+#ifdef CONFIG_SMP
+#include <asm-generic/atomic64.h>
+
+#include <asm/cmpxchg.h>
+#include <asm/barrier.h>
+
+#define __atomic_acquire_fence()	__bar_brarw()
+
+#define __atomic_release_fence()	__bar_brwaw()
+
+static __always_inline int arch_atomic_read(const atomic_t *v)
+{
+	return READ_ONCE(v->counter);
+}
+static __always_inline void arch_atomic_set(atomic_t *v, int i)
+{
+	WRITE_ONCE(v->counter, i);
+}
+
+#define ATOMIC_OP(op)						\
+static __always_inline						\
+void arch_atomic_##op(int i, atomic_t *v)			\
+{								\
+	unsigned long tmp;					\
+	__asm__ __volatile__ (					\
+	"1:	ldex.w		%0, (%2)	\n"		\
+	"	" #op "		%0, %1		\n"		\
+	"	stex.w		%0, (%2)	\n"		\
+	"	bez		%0, 1b		\n"		\
+	: "=&r" (tmp)						\
+	: "r" (i), "r" (&v->counter)				\
+	: "memory");						\
+}
+
+ATOMIC_OP(add)
+ATOMIC_OP(sub)
+ATOMIC_OP(and)
+ATOMIC_OP( or)
+ATOMIC_OP(xor)
+
+#undef ATOMIC_OP
+
+#define ATOMIC_FETCH_OP(op)					\
+static __always_inline						\
+int arch_atomic_fetch_##op##_relaxed(int i, atomic_t *v)	\
+{								\
+	register int ret, tmp;					\
+	__asm__ __volatile__ (					\
+	"1:	ldex.w		%0, (%3)	\n"		\
+	"	mov		%1, %0		\n"		\
+	"	" #op "		%0, %2		\n"		\
+	"	stex.w		%0, (%3)	\n"		\
+	"	bez		%0, 1b		\n"		\
+	: "=&r" (tmp), "=&r" (ret)				\
+	: "r" (i), "r"(&v->counter)				\
+	: "memory");						\
+	return ret;						\
+}
+
+#define ATOMIC_OP_RETURN(op, c_op)				\
+static __always_inline						\
+int arch_atomic_##op##_return_relaxed(int i, atomic_t *v)	\
+{								\
+	return arch_atomic_fetch_##op##_relaxed(i, v) c_op i;	\
+}
+
+#define ATOMIC_OPS(op, c_op)					\
+	ATOMIC_FETCH_OP(op)					\
+	ATOMIC_OP_RETURN(op, c_op)
+
+ATOMIC_OPS(add, +)
+ATOMIC_OPS(sub, -)
+
+#define arch_atomic_fetch_add_relaxed	arch_atomic_fetch_add_relaxed
+#define arch_atomic_fetch_sub_relaxed	arch_atomic_fetch_sub_relaxed
+
+#define arch_atomic_add_return_relaxed	arch_atomic_add_return_relaxed
+#define arch_atomic_sub_return_relaxed	arch_atomic_sub_return_relaxed
+
+#undef ATOMIC_OPS
+#undef ATOMIC_OP_RETURN
+
+#define ATOMIC_OPS(op)						\
+	ATOMIC_FETCH_OP(op)
+
+ATOMIC_OPS(and)
+ATOMIC_OPS( or)
+ATOMIC_OPS(xor)
+
+#define arch_atomic_fetch_and_relaxed	arch_atomic_fetch_and_relaxed
+#define arch_atomic_fetch_or_relaxed	arch_atomic_fetch_or_relaxed
+#define arch_atomic_fetch_xor_relaxed	arch_atomic_fetch_xor_relaxed
+
+#undef ATOMIC_OPS
+
+#undef ATOMIC_FETCH_OP
+
+#define ATOMIC_OP()						\
+static __always_inline						\
+int arch_atomic_xchg_relaxed(atomic_t *v, int n)		\
+{								\
+	return __xchg_relaxed(n, &(v->counter), 4);		\
+}								\
+static __always_inline						\
+int arch_atomic_cmpxchg_relaxed(atomic_t *v, int o, int n)	\
+{								\
+	return __cmpxchg_relaxed(&(v->counter), o, n, 4);	\
+}								\
+static __always_inline						\
+int arch_atomic_cmpxchg_acquire(atomic_t *v, int o, int n)	\
+{								\
+	return __cmpxchg_acquire(&(v->counter), o, n, 4);	\
+}								\
+static __always_inline						\
+int arch_atomic_cmpxchg(atomic_t *v, int o, int n)		\
+{								\
+	return __cmpxchg(&(v->counter), o, n, 4);		\
+}
+
+#define ATOMIC_OPS()						\
+	ATOMIC_OP()
+
+ATOMIC_OPS()
+
+#define arch_atomic_xchg_relaxed	arch_atomic_xchg_relaxed
+#define arch_atomic_cmpxchg_relaxed	arch_atomic_cmpxchg_relaxed
+#define arch_atomic_cmpxchg_acquire	arch_atomic_cmpxchg_acquire
+#define arch_atomic_cmpxchg		arch_atomic_cmpxchg
+
+#undef ATOMIC_OPS
+#undef ATOMIC_OP
+
+#else
+#include <asm-generic/atomic.h>
+#endif
+
+#endif /* __ASM_CSKY_ATOMIC_H */
-- 
2.25.1

From: guoren@kernel.org
To: guoren@kernel.org, arnd@arndb.de, mark.rutland@arm.com, boqun.feng@gmail.com, peterz@infradead.org, will@kernel.org
Cc: linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org, linux-csky@vger.kernel.org, Guo Ren <guoren@linux.alibaba.com>
Subject: [PATCH V4 3/3] csky: atomic: Add conditional atomic operations' optimization
Date: Sun, 24 Apr 2022 15:29:18 +0800
Message-Id: <20220424072918.2596899-4-guoren@kernel.org>
In-Reply-To: <20220424072918.2596899-1-guoren@kernel.org>
References: <20220424072918.2596899-1-guoren@kernel.org>

From: Guo Ren <guoren@linux.alibaba.com>

Add conditional atomic operations' optimization:
 - arch_atomic_fetch_add_unless
 - arch_atomic_inc_unless_negative
 - arch_atomic_dec_unless_positive
 - arch_atomic_dec_if_positive

Comments by Boqun:
 FWIW, you probably need to make sure that a barrier instruction inside
 an lr/sc loop is a good thing. IIUC, the execution time of a barrier
 instruction is determined by the status of store buffers and invalidate
 queues (and probably other stuffs), so it may increase the execution
 time of the lr/sc loop, and make it unlikely to succeed.
 But this really depends on how the arch executes these instructions.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
---
 arch/csky/include/asm/atomic.h | 95 ++++++++++++++++++++++++++++++++++
 1 file changed, 95 insertions(+)

diff --git a/arch/csky/include/asm/atomic.h b/arch/csky/include/asm/atomic.h
index 56c9dc8e91b3..60406ef9c2bb 100644
--- a/arch/csky/include/asm/atomic.h
+++ b/arch/csky/include/asm/atomic.h
@@ -100,6 +100,101 @@ ATOMIC_OPS(xor)
 
 #undef ATOMIC_FETCH_OP
 
+static __always_inline int
+arch_atomic_fetch_add_unless(atomic_t *v, int a, int u)
+{
+	int prev, tmp;
+
+	__asm__ __volatile__ (
+		RELEASE_FENCE
+		"1:	ldex.w		%0, (%3)	\n"
+		"	cmpne		%0, %4		\n"
+		"	bf		2f		\n"
+		"	mov		%1, %0		\n"
+		"	add		%1, %2		\n"
+		"	stex.w		%1, (%3)	\n"
+		"	bez		%1, 1b		\n"
+		FULL_FENCE
+		"2:\n"
+		: "=&r" (prev), "=&r" (tmp)
+		: "r" (a), "r" (&v->counter), "r" (u)
+		: "memory");
+
+	return prev;
+}
+#define arch_atomic_fetch_add_unless arch_atomic_fetch_add_unless
+
+static __always_inline bool
+arch_atomic_inc_unless_negative(atomic_t *v)
+{
+	int rc, tmp;
+
+	__asm__ __volatile__ (
+		RELEASE_FENCE
+		"1:	ldex.w		%0, (%2)	\n"
+		"	movi		%1, 0		\n"
+		"	blz		%0, 2f		\n"
+		"	movi		%1, 1		\n"
+		"	addi		%0, 1		\n"
+		"	stex.w		%0, (%2)	\n"
+		"	bez		%0, 1b		\n"
+		FULL_FENCE
+		"2:\n"
+		: "=&r" (tmp), "=&r" (rc)
+		: "r" (&v->counter)
+		: "memory");
+
+	return rc ? true : false;
+}
+#define arch_atomic_inc_unless_negative arch_atomic_inc_unless_negative
+
+static __always_inline bool
+arch_atomic_dec_unless_positive(atomic_t *v)
+{
+	int rc, tmp;
+
+	__asm__ __volatile__ (
+		RELEASE_FENCE
+		"1:	ldex.w		%0, (%2)	\n"
+		"	movi		%1, 0		\n"
+		"	bhz		%0, 2f		\n"
+		"	movi		%1, 1		\n"
+		"	subi		%0, 1		\n"
+		"	stex.w		%0, (%2)	\n"
+		"	bez		%0, 1b		\n"
+		FULL_FENCE
+		"2:\n"
+		: "=&r" (tmp), "=&r" (rc)
+		: "r" (&v->counter)
+		: "memory");
+
+	return rc ? true : false;
+}
+#define arch_atomic_dec_unless_positive arch_atomic_dec_unless_positive
+
+static __always_inline int
+arch_atomic_dec_if_positive(atomic_t *v)
+{
+	int dec, tmp;
+
+	__asm__ __volatile__ (
+		RELEASE_FENCE
+		"1:	ldex.w		%0, (%2)	\n"
+		"	subi		%1, %0, 1	\n"
+		"	blz		%1, 2f		\n"
+		"	stex.w		%1, (%2)	\n"
+		"	bez		%1, 1b		\n"
+		FULL_FENCE
+		"2:\n"
+		: "=&r" (dec), "=&r" (tmp)
+		: "r" (&v->counter)
+		: "memory");
+
+	return dec - 1;
+}
+#define arch_atomic_dec_if_positive arch_atomic_dec_if_positive
+
 #define ATOMIC_OP()						\
 static __always_inline						\
 int arch_atomic_xchg_relaxed(atomic_t *v, int n)		\
-- 
2.25.1
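[Editor's note] For reference, here is an illustrative caller
(hypothetical, not part of the patch) showing the intended semantics of
two of the new primitives:

	/* Sketch: a get/put refcount built on the conditional atomics. */
	static inline bool sketch_try_get(atomic_t *refs)
	{
		/* Increments and returns true only while the counter >= 0. */
		return arch_atomic_inc_unless_negative(refs);
	}

	static inline bool sketch_put(atomic_t *refs)
	{
		/*
		 * arch_atomic_dec_if_positive() returns the new value; a
		 * negative result means the counter was already 0 and was
		 * left untouched.  A result of 0 means this call dropped
		 * the last reference.
		 */
		return arch_atomic_dec_if_positive(refs) == 0;
	}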