From nobody Mon May 11 00:05:13 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EEB25C433FE for ; Wed, 20 Apr 2022 14:45:04 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1379795AbiDTOrt (ORCPT ); Wed, 20 Apr 2022 10:47:49 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46272 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1379752AbiDTOrX (ORCPT ); Wed, 20 Apr 2022 10:47:23 -0400
Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 57E952C663; Wed, 20 Apr 2022 07:44:37 -0700 (PDT)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id E5EC461647; Wed, 20 Apr 2022 14:44:36 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 221DBC385A1; Wed, 20 Apr 2022 14:44:32 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1650465876; bh=b6/BU+/JNraTRFDMaX64ry4dRF+EjjQX1x14puzFLH8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=epm7esix1KLpCRzb1oPQOBHBiO+N5DARLDJVDiJJIvIquDNjLYxlRnBYP08cs7zRJ ljxRu5t3i1ct5Fyx484l/RXquo4+qw4FyOv0jFPIWyPO537Ng4k++hUQa9e19gFuw6 OZlU1ZsNt3i+dExhS2OdNPSuDOHnlbph3TiKCHXcvJllvzATespTqYz7rvOHGJMzC8 DIdXJEmI43HXjtoN/RyfvoyUnY5umHxa25u9lv/7KFNLLiGFUF7m+iyODFh+A7gS14 kngWU6/XGXg5jJrrS9y5ZY1uWttSmxXacJc2UL6qklakB3pzf7fEiUTFumfDLCNkk4 CLDemuRt8uiNg==
From: guoren@kernel.org
To: guoren@kernel.org, arnd@arndb.de, palmer@dabbelt.com, mark.rutland@arm.com, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, dlustig@nvidia.com, parri.andrea@gmail.com
Cc: linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, Guo Ren
Subject: [PATCH V3 1/5] riscv: atomic: Cleanup unnecessary definition
Date: Wed, 20 Apr 2022 22:44:13 +0800
Message-Id: <20220420144417.2453958-2-guoren@kernel.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220420144417.2453958-1-guoren@kernel.org>
References: <20220420144417.2453958-1-guoren@kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Guo Ren

cmpxchg32 and cmpxchg32_local have never been used in Linux, so remove
them from cmpxchg.h.

Signed-off-by: Guo Ren
Signed-off-by: Guo Ren
Cc: Palmer Dabbelt
Cc: Arnd Bergmann
Cc: Dan Lustig
Cc: Andrea Parri
---
 arch/riscv/include/asm/cmpxchg.h | 12 ------------
 1 file changed, 12 deletions(-)

diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
index 36dc962f6343..12debce235e5 100644
--- a/arch/riscv/include/asm/cmpxchg.h
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -348,18 +348,6 @@
 #define arch_cmpxchg_local(ptr, o, n)	\
 	(__cmpxchg_relaxed((ptr), (o), (n), sizeof(*(ptr))))
 
-#define cmpxchg32(ptr, o, n)	\
-({	\
-	BUILD_BUG_ON(sizeof(*(ptr)) != 4);	\
-	arch_cmpxchg((ptr), (o), (n));	\
-})
-
-#define cmpxchg32_local(ptr, o, n)	\
-({	\
-	BUILD_BUG_ON(sizeof(*(ptr)) != 4);	\
-	arch_cmpxchg_relaxed((ptr), (o), (n))	\
-})
-
 #define arch_cmpxchg64(ptr, o, n)	\
 ({	\
 	BUILD_BUG_ON(sizeof(*(ptr)) != 8);	\
-- 
2.25.1

From nobody Mon May 11 00:05:13 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6CF2DC433FE for ; Wed, 20 Apr 2022 14:45:09 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1379783AbiDTOrv (ORCPT ); Wed, 20
 Apr 2022 10:47:51 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46318 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1379767AbiDTOr3 (ORCPT ); Wed, 20 Apr 2022 10:47:29 -0400
Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B3B1A42A30; Wed, 20 Apr 2022 07:44:42 -0700 (PDT)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 5F138B81F94; Wed, 20 Apr 2022 14:44:41 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id DF5FCC385AB; Wed, 20 Apr 2022 14:44:36 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1650465880; bh=jRua+3pFZKtrCEO8fla+OBQVV80FESoUd1XaSPN+HWo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=IGh3FIne/NwlHFQDzjBuB4RphbxDBgOs91t76uO24wKle3gnxllG7v0qpvE3+nJlR lqKUOTdzoxN22Ea1a/4DTY+RSXLAYWIAyZswQ/CQOtGEZogd7Aqv5k6qrAunC20L8e Bgg3XZ1Nmw/E9JHQPcCQlqKeeO4EPn6+RkxiYmoZHoCavSZEHI2Qu3ZyxGs0UyT3gR 1tYGKPdbocbRPD4rOvDO5/1iqaTuStzMifGPu+8wdyHnJ/4a+slaPebPE5C6sdnfZ0 0y4PL9tbaiFl6JPC/yfLKHSg69mqTQOxXpHzntLFkJYDgxe/zhnKC6+LYGpFi+/Ova NnV6R9QDrYnVQ==
From: guoren@kernel.org
To: guoren@kernel.org, arnd@arndb.de, palmer@dabbelt.com, mark.rutland@arm.com, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, dlustig@nvidia.com, parri.andrea@gmail.com
Cc: linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, Guo Ren
Subject: [PATCH V3 2/5] riscv: atomic: Optimize acquire and release for AMO operations
Date: Wed, 20 Apr 2022 22:44:14 +0800
Message-Id: <20220420144417.2453958-3-guoren@kernel.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220420144417.2453958-1-guoren@kernel.org>
References: <20220420144417.2453958-1-guoren@kernel.org>
MIME-Version:
 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Guo Ren

The current acquire and release implementations come from
atomic-arch-fallback.h and use __atomic_acquire_fence() and
__atomic_release_fence(), which add an extra "fence r, rw" after, or
"fence rw, w" before, the AMO instruction. RISC-V AMO instructions can
carry acquire and release semantics in the instruction itself, which
saves the fence instruction.

From the RISC-V ISA manual, section 10.4 "Atomic Memory Operations":

 To help implement multiprocessor synchronization, the AMOs optionally
 provide release consistency semantics.
 - .aq:   If the aq bit is set, then no later memory operations in this
          RISC-V hart can be observed to take place before the AMO.
 - .rl:   If the rl bit is set, then other RISC-V harts will not observe
          the AMO before memory accesses preceding the AMO in this
          RISC-V hart.
 - .aqrl: Setting both the aq and the rl bit on an AMO makes the
          sequence sequentially consistent, meaning that it cannot be
          reordered with earlier or later memory operations from the
          same hart.

Signed-off-by: Guo Ren
Signed-off-by: Guo Ren
Cc: Palmer Dabbelt
Cc: Mark Rutland
Cc: Andrea Parri
Cc: Dan Lustig
---
 arch/riscv/include/asm/atomic.h  | 64 ++++++++++++++++++++++++++++++++
 arch/riscv/include/asm/cmpxchg.h | 12 ++----
 2 files changed, 68 insertions(+), 8 deletions(-)

diff --git a/arch/riscv/include/asm/atomic.h b/arch/riscv/include/asm/atomic.h
index ac9bdf4fc404..20ce8b83bc18 100644
--- a/arch/riscv/include/asm/atomic.h
+++ b/arch/riscv/include/asm/atomic.h
@@ -99,6 +99,30 @@ c_type arch_atomic##prefix##_fetch_##op##_relaxed(c_type i,	\
 	return ret;	\
 }	\
 static __always_inline	\
+c_type arch_atomic##prefix##_fetch_##op##_acquire(c_type i,	\
+					     atomic##prefix##_t *v)	\
+{	\
+	register c_type ret;	\
+	__asm__ __volatile__ (	\
+		"	amo" #asm_op "." #asm_type ".aq %1, %2, %0"	\
+		: "+A" (v->counter), "=r" (ret)	\
+		: "r" (I)	\
+		: "memory");	\
+	return ret;	\
+}	\
+static __always_inline	\
+c_type arch_atomic##prefix##_fetch_##op##_release(c_type i,	\
+					     atomic##prefix##_t *v)	\
+{	\
+	register c_type ret;	\
+	__asm__ __volatile__ (	\
+		"	amo" #asm_op "." #asm_type ".rl %1, %2, %0"	\
+		: "+A" (v->counter), "=r" (ret)	\
+		: "r" (I)	\
+		: "memory");	\
+	return ret;	\
+}	\
+static __always_inline	\
 c_type arch_atomic##prefix##_fetch_##op(c_type i, atomic##prefix##_t *v)	\
 {	\
 	register c_type ret;	\
@@ -118,6 +142,18 @@ c_type arch_atomic##prefix##_##op##_return_relaxed(c_type i,	\
 	return arch_atomic##prefix##_fetch_##op##_relaxed(i, v) c_op I;	\
 }	\
 static __always_inline	\
+c_type arch_atomic##prefix##_##op##_return_acquire(c_type i,	\
+					      atomic##prefix##_t *v)	\
+{	\
+	return arch_atomic##prefix##_fetch_##op##_acquire(i, v) c_op I;	\
+}	\
+static __always_inline	\
+c_type arch_atomic##prefix##_##op##_return_release(c_type i,	\
+					      atomic##prefix##_t *v)	\
+{	\
+	return arch_atomic##prefix##_fetch_##op##_release(i, v) c_op I;	\
+}	\
+static __always_inline	\
 c_type arch_atomic##prefix##_##op##_return(c_type i, atomic##prefix##_t *v)	\
 {	\
 	return arch_atomic##prefix##_fetch_##op(i, v) c_op I;	\
@@ -140,22 +176,38 @@ ATOMIC_OPS(sub, add, +, -i)
 
 #define arch_atomic_add_return_relaxed arch_atomic_add_return_relaxed
 #define arch_atomic_sub_return_relaxed arch_atomic_sub_return_relaxed
+#define arch_atomic_add_return_acquire arch_atomic_add_return_acquire
+#define arch_atomic_sub_return_acquire arch_atomic_sub_return_acquire
+#define arch_atomic_add_return_release arch_atomic_add_return_release
+#define arch_atomic_sub_return_release arch_atomic_sub_return_release
 #define arch_atomic_add_return arch_atomic_add_return
 #define arch_atomic_sub_return arch_atomic_sub_return
 
 #define arch_atomic_fetch_add_relaxed arch_atomic_fetch_add_relaxed
 #define arch_atomic_fetch_sub_relaxed arch_atomic_fetch_sub_relaxed
+#define arch_atomic_fetch_add_acquire arch_atomic_fetch_add_acquire
+#define arch_atomic_fetch_sub_acquire arch_atomic_fetch_sub_acquire
+#define arch_atomic_fetch_add_release arch_atomic_fetch_add_release
+#define arch_atomic_fetch_sub_release arch_atomic_fetch_sub_release
 #define arch_atomic_fetch_add arch_atomic_fetch_add
 #define arch_atomic_fetch_sub arch_atomic_fetch_sub
 
 #ifndef CONFIG_GENERIC_ATOMIC64
 #define arch_atomic64_add_return_relaxed arch_atomic64_add_return_relaxed
 #define arch_atomic64_sub_return_relaxed arch_atomic64_sub_return_relaxed
+#define arch_atomic64_add_return_acquire arch_atomic64_add_return_acquire
+#define arch_atomic64_sub_return_acquire arch_atomic64_sub_return_acquire
+#define arch_atomic64_add_return_release arch_atomic64_add_return_release
+#define arch_atomic64_sub_return_release arch_atomic64_sub_return_release
 #define arch_atomic64_add_return arch_atomic64_add_return
 #define arch_atomic64_sub_return arch_atomic64_sub_return
 
 #define arch_atomic64_fetch_add_relaxed arch_atomic64_fetch_add_relaxed
 #define arch_atomic64_fetch_sub_relaxed arch_atomic64_fetch_sub_relaxed
+#define arch_atomic64_fetch_add_acquire arch_atomic64_fetch_add_acquire
+#define arch_atomic64_fetch_sub_acquire arch_atomic64_fetch_sub_acquire
+#define arch_atomic64_fetch_add_release arch_atomic64_fetch_add_release
+#define arch_atomic64_fetch_sub_release arch_atomic64_fetch_sub_release
 #define arch_atomic64_fetch_add arch_atomic64_fetch_add
 #define arch_atomic64_fetch_sub arch_atomic64_fetch_sub
 #endif
@@ -178,6 +230,12 @@ ATOMIC_OPS(xor, xor, i)
 #define arch_atomic_fetch_and_relaxed arch_atomic_fetch_and_relaxed
 #define arch_atomic_fetch_or_relaxed arch_atomic_fetch_or_relaxed
 #define arch_atomic_fetch_xor_relaxed arch_atomic_fetch_xor_relaxed
+#define arch_atomic_fetch_and_acquire arch_atomic_fetch_and_acquire
+#define arch_atomic_fetch_or_acquire arch_atomic_fetch_or_acquire
+#define arch_atomic_fetch_xor_acquire arch_atomic_fetch_xor_acquire
+#define arch_atomic_fetch_and_release arch_atomic_fetch_and_release
+#define arch_atomic_fetch_or_release arch_atomic_fetch_or_release
+#define arch_atomic_fetch_xor_release arch_atomic_fetch_xor_release
 #define arch_atomic_fetch_and arch_atomic_fetch_and
 #define arch_atomic_fetch_or arch_atomic_fetch_or
 #define arch_atomic_fetch_xor arch_atomic_fetch_xor
@@ -186,6 +244,12 @@ ATOMIC_OPS(xor, xor, i)
 #define arch_atomic64_fetch_and_relaxed arch_atomic64_fetch_and_relaxed
 #define arch_atomic64_fetch_or_relaxed arch_atomic64_fetch_or_relaxed
 #define arch_atomic64_fetch_xor_relaxed arch_atomic64_fetch_xor_relaxed
+#define arch_atomic64_fetch_and_acquire arch_atomic64_fetch_and_acquire
+#define arch_atomic64_fetch_or_acquire arch_atomic64_fetch_or_acquire
+#define arch_atomic64_fetch_xor_acquire arch_atomic64_fetch_xor_acquire
+#define arch_atomic64_fetch_and_release arch_atomic64_fetch_and_release
+#define arch_atomic64_fetch_or_release arch_atomic64_fetch_or_release
+#define arch_atomic64_fetch_xor_release arch_atomic64_fetch_xor_release
 #define arch_atomic64_fetch_and arch_atomic64_fetch_and
 #define arch_atomic64_fetch_or arch_atomic64_fetch_or
 #define arch_atomic64_fetch_xor arch_atomic64_fetch_xor
diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
index 12debce235e5..1af8db92250b 100644
--- a/arch/riscv/include/asm/cmpxchg.h
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -52,16 +52,14 @@
 	switch (size) {	\
 	case 4:	\
 		__asm__ __volatile__ (	\
-			"	amoswap.w %0, %2, %1\n"	\
-			RISCV_ACQUIRE_BARRIER	\
+			"	amoswap.w.aq %0, %2, %1\n"	\
 			: "=r" (__ret), "+A" (*__ptr)	\
 			: "r" (__new)	\
 			: "memory");	\
 		break;	\
 	case 8:	\
 		__asm__ __volatile__ (	\
-			"	amoswap.d %0, %2, %1\n"	\
-			RISCV_ACQUIRE_BARRIER	\
+			"	amoswap.d.aq %0, %2, %1\n"	\
 			: "=r" (__ret), "+A" (*__ptr)	\
 			: "r" (__new)	\
 			: "memory");	\
@@ -87,16 +85,14 @@
 	switch (size) {	\
 	case 4:	\
 		__asm__ __volatile__ (	\
-			RISCV_RELEASE_BARRIER	\
-			"	amoswap.w %0, %2, %1\n"	\
+			"	amoswap.w.rl %0, %2, %1\n"	\
 			: "=r" (__ret), "+A" (*__ptr)	\
 			: "r" (__new)	\
 			: "memory");	\
 		break;	\
 	case 8:	\
 		__asm__ __volatile__ (	\
-			RISCV_RELEASE_BARRIER	\
-			"	amoswap.d %0, %2, %1\n"	\
+			"	amoswap.d.rl %0, %2, %1\n"	\
 			: "=r" (__ret), "+A" (*__ptr)	\
 			: "r" (__new)	\
 			: "memory");	\
-- 
2.25.1

From nobody Mon May 11 00:05:13 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C6969C433F5 for ; Wed, 20 Apr 2022 14:45:15 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345374AbiDTOr6 (ORCPT ); Wed, 20 Apr 2022 10:47:58 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46340 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1379768AbiDTOrb (ORCPT ); Wed, 20 Apr 2022 10:47:31 -0400
Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F357B42A30; Wed, 20 Apr 2022 07:44:44 -0700 (PDT)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 7D94C61647; Wed, 20 Apr 2022 14:44:44 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id A5450C385AC; Wed, 20 Apr 2022 14:44:40 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1650465883; bh=sAulfPUiEvDbBqRnfY9vU25oXxTZrbt8To4Bji3mwyY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=tPXOZu0vrdrkEajEXD7J4RwxL7f29AjfbOT4IBcggA0PvPHAGfMQ2HUqG/l9bNmaU lYhzbIMGzYCYZyrUAfSa2jsWQ3pKmKBZoU2c86KVsBFv1bxaiQMfUMllfUDNQl+xsG O4qH0JQR1ho6ORYR7LbnQVTkRQPGCwSSgiTjotL1aKCdlkZ4gYR+rH8gKX8Y/wN8mP t81UsuPTKux+lMBZU/4NP9QYgY2wdmSUY8OAPZ6Z6lEjYaMKBNCnLMcvGiC/ROX0nk
 uYkJmZVIDlsyrhaTn6YY/bkOt3Ou5lrsgNlrct74cxs8knSuRN7/2vtmiCHxHPrxaS jnyGB6S4/X81g==
From: guoren@kernel.org
To: guoren@kernel.org, arnd@arndb.de, palmer@dabbelt.com, mark.rutland@arm.com, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, dlustig@nvidia.com, parri.andrea@gmail.com
Cc: linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, Guo Ren
Subject: [PATCH V3 3/5] riscv: atomic: Optimize memory barrier semantics of LRSC-pairs
Date: Wed, 20 Apr 2022 22:44:15 +0800
Message-Id: <20220420144417.2453958-4-guoren@kernel.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220420144417.2453958-1-guoren@kernel.org>
References: <20220420144417.2453958-1-guoren@kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Guo Ren

The current implementation follows the same approach as commit
8e86f0b409a4 ("arm64: atomics: fix use of acquire + release for full
barrier semantics"). RISC-V can instead set both the aq and the rl bit
on the SC instruction, which removes the trailing "fence rw, rw" and
reduces the instruction cost.

From the RISC-V ISA manual, section 10.2 "Load-Reserved/Store-Conditional
Instructions":

 - .aq:   The LR/SC sequence can be given acquire semantics by setting
          the aq bit on the LR instruction.
 - .rl:   The LR/SC sequence can be given release semantics by setting
          the rl bit on the SC instruction.
 - .aqrl: Setting the aq bit on the LR instruction, and setting both
          the aq and the rl bit on the SC instruction makes the LR/SC
          sequence sequentially consistent, meaning that it cannot be
          reordered with earlier or later memory operations from the
          same hart.

 Software should not set the rl bit on an LR instruction unless the aq
 bit is also set, nor should software set the aq bit on an SC
 instruction unless the rl bit is also set.
 LR.rl and SC.aq instructions are not guaranteed to provide any stronger
 ordering than those with both bits clear, but may result in lower
 performance.

Signed-off-by: Guo Ren
Signed-off-by: Guo Ren
Cc: Palmer Dabbelt
Cc: Mark Rutland
Cc: Dan Lustig
Cc: Andrea Parri
---
 arch/riscv/include/asm/atomic.h  | 6 ++----
 arch/riscv/include/asm/cmpxchg.h | 6 ++----
 2 files changed, 4 insertions(+), 8 deletions(-)

diff --git a/arch/riscv/include/asm/atomic.h b/arch/riscv/include/asm/atomic.h
index 20ce8b83bc18..4aaf5b01e7c6 100644
--- a/arch/riscv/include/asm/atomic.h
+++ b/arch/riscv/include/asm/atomic.h
@@ -382,9 +382,8 @@ static __always_inline int arch_atomic_sub_if_positive(atomic_t *v, int offset)
 		"0:	lr.w %[p], %[c]\n"
 		"	sub %[rc], %[p], %[o]\n"
 		"	bltz %[rc], 1f\n"
-		"	sc.w.rl %[rc], %[rc], %[c]\n"
+		"	sc.w.aqrl %[rc], %[rc], %[c]\n"
 		"	bnez %[rc], 0b\n"
-		"	fence rw, rw\n"
 		"1:\n"
 		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
 		: [o]"r" (offset)
@@ -404,9 +403,8 @@ static __always_inline s64 arch_atomic64_sub_if_positive(atomic64_t *v, s64 offs
 		"0:	lr.d %[p], %[c]\n"
 		"	sub %[rc], %[p], %[o]\n"
 		"	bltz %[rc], 1f\n"
-		"	sc.d.rl %[rc], %[rc], %[c]\n"
+		"	sc.d.aqrl %[rc], %[rc], %[c]\n"
 		"	bnez %[rc], 0b\n"
-		"	fence rw, rw\n"
 		"1:\n"
 		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
 		: [o]"r" (offset)
diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
index 1af8db92250b..9269fceb86e0 100644
--- a/arch/riscv/include/asm/cmpxchg.h
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -307,9 +307,8 @@
 		__asm__ __volatile__ (	\
 			"0:	lr.w %0, %2\n"	\
 			"	bne %0, %z3, 1f\n"	\
-			"	sc.w.rl %1, %z4, %2\n"	\
+			"	sc.w.aqrl %1, %z4, %2\n"	\
 			"	bnez %1, 0b\n"	\
-			"	fence rw, rw\n"	\
 			"1:\n"	\
 			: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)	\
 			: "rJ" ((long)__old), "rJ" (__new)	\
@@ -319,9 +318,8 @@
 		__asm__ __volatile__ (	\
 			"0:	lr.d %0, %2\n"	\
 			"	bne %0, %z3, 1f\n"	\
-			"	sc.d.rl %1, %z4, %2\n"	\
+			"	sc.d.aqrl %1, %z4, %2\n"	\
 			"	bnez %1, 0b\n"	\
-			"	fence rw, rw\n"	\
 			"1:\n"	\
 			: "=&r" (__ret), "=&r" (__rc), "+A" (*__ptr)	\
 			: "rJ" (__old), "rJ" (__new)	\
-- 
2.25.1

From nobody Mon May 11 00:05:13 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EBFAFC433EF for ; Wed, 20 Apr 2022 14:45:53 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1379786AbiDTOsG (ORCPT ); Wed, 20 Apr 2022 10:48:06 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46628 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1379782AbiDTOrh (ORCPT ); Wed, 20 Apr 2022 10:47:37 -0400
Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9AAA2240; Wed, 20 Apr 2022 07:44:48 -0700 (PDT)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 374F16176C; Wed, 20 Apr 2022 14:44:48 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6AC91C385A8; Wed, 20 Apr 2022 14:44:44 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1650465887; bh=2lohV9FIb13AC3ra+UrYDKbvvuDZmMNdD6/UGs8XgO8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=CuAZfwf3MjkmBTu2TZaBtRNJtpESsgpXqIA6PhB+MQ74rrbUDF/QVusDi/4OWGE+h /2qSsYUrinpsKBtHxf63d6Zd160XftP642RxJW1c9LBT7+IA39E6t3pblSEW/J6itG 9aWCdVU9DWZddtN+JaN1KDX1zP/eFFJQRTXlIrdCYf1hSgWDtZDv3Q8sEdu7UJgagc 7GlMEd0j/MnDyAFlmw9KRaNHlkpVgiX1OyKmNkXCIMulrHgKGDhL43sfiC845Xjbqy PR+DGjaWYx8G7dOW17Mb/yxdtOodHXBinkojBktJL9v1GA88LeHUpgM/7DkMrTsf78 rdAVlGxnb0M9A==
From: guoren@kernel.org
To: guoren@kernel.org, arnd@arndb.de, palmer@dabbelt.com,
 mark.rutland@arm.com, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, dlustig@nvidia.com, parri.andrea@gmail.com
Cc: linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, Guo Ren
Subject: [PATCH V3 4/5] riscv: atomic: Optimize dec_if_positive functions
Date: Wed, 20 Apr 2022 22:44:16 +0800
Message-Id: <20220420144417.2453958-5-guoren@kernel.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220420144417.2453958-1-guoren@kernel.org>
References: <20220420144417.2453958-1-guoren@kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Guo Ren

arch_atomic_sub_if_positive() is not needed by current Linux, and
passing the offset costs an extra register allocation. Implementing
the dec_if_positive functions directly is more efficient.

Signed-off-by: Guo Ren
Signed-off-by: Guo Ren
Cc: Palmer Dabbelt
Cc: Mark Rutland
Cc: Dan Lustig
Cc: Andrea Parri
---
 arch/riscv/include/asm/atomic.h | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/arch/riscv/include/asm/atomic.h b/arch/riscv/include/asm/atomic.h
index 4aaf5b01e7c6..5589e1de2c80 100644
--- a/arch/riscv/include/asm/atomic.h
+++ b/arch/riscv/include/asm/atomic.h
@@ -374,45 +374,45 @@ ATOMIC_OPS()
 #undef ATOMIC_OPS
 #undef ATOMIC_OP
 
-static __always_inline int arch_atomic_sub_if_positive(atomic_t *v, int offset)
+static __always_inline int arch_atomic_dec_if_positive(atomic_t *v)
 {
 	int prev, rc;
 
 	__asm__ __volatile__ (
 		"0:	lr.w %[p], %[c]\n"
-		"	sub %[rc], %[p], %[o]\n"
+		"	addi %[rc], %[p], -1\n"
 		"	bltz %[rc], 1f\n"
 		"	sc.w.aqrl %[rc], %[rc], %[c]\n"
 		"	bnez %[rc], 0b\n"
 		"1:\n"
 		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
-		: [o]"r" (offset)
+		:
 		: "memory");
-	return prev - offset;
+	return prev - 1;
 }
 
-#define arch_atomic_dec_if_positive(v) arch_atomic_sub_if_positive(v, 1)
+#define arch_atomic_dec_if_positive arch_atomic_dec_if_positive
 
 #ifndef CONFIG_GENERIC_ATOMIC64
-static __always_inline s64 arch_atomic64_sub_if_positive(atomic64_t *v, s64 offset)
+static __always_inline s64 arch_atomic64_dec_if_positive(atomic64_t *v)
 {
 	s64 prev;
 	long rc;
 
 	__asm__ __volatile__ (
 		"0:	lr.d %[p], %[c]\n"
-		"	sub %[rc], %[p], %[o]\n"
+		"	addi %[rc], %[p], -1\n"
 		"	bltz %[rc], 1f\n"
 		"	sc.d.aqrl %[rc], %[rc], %[c]\n"
 		"	bnez %[rc], 0b\n"
 		"1:\n"
 		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
-		: [o]"r" (offset)
+		:
 		: "memory");
-	return prev - offset;
+	return prev - 1;
 }
 
-#define arch_atomic64_dec_if_positive(v) arch_atomic64_sub_if_positive(v, 1)
+#define arch_atomic64_dec_if_positive arch_atomic64_dec_if_positive
 #endif
 
 #endif /* _ASM_RISCV_ATOMIC_H */
-- 
2.25.1

From nobody Mon May 11 00:05:13 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 56873C433FE for ; Wed, 20 Apr 2022 14:45:55 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1379852AbiDTOsK (ORCPT ); Wed, 20 Apr 2022 10:48:10 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46748 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1379785AbiDTOrj (ORCPT ); Wed, 20 Apr 2022 10:47:39 -0400
Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 834D9240; Wed, 20 Apr 2022 07:44:52 -0700 (PDT)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id F371E6176C; Wed, 20 Apr 2022 14:44:51 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA
 id 2EEC5C385A0; Wed, 20 Apr 2022 14:44:47 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1650465891; bh=uhUfyEvMq+wqDi6kDMjqVy2PHOt/CvPeNGoHGdg5hr4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ZZIMXDq+kA8YeVTfqhw0qC4fqHVN56HMMHEaEDn/0GQjStW7aOmFByvOIvov2Ol4i iNcGQL//4xEg3r+cQzmkB728AHYHF6CLTT8Dh3mMy38JFkme7CWoW9C8R7pOxj6y7h 6EJtlr0+9A0T2m1A44CeFYm75ovrK2R7uRCAdSKTc0V4tUydNNLJGTiMk9PPvgTfxB 5S/WU6AGyQnA8opiXLWGC/e0uv4COExuwGq5UPtN5QrwBb8H5ipK4vxBo4grNokhRz U1A6NFo5X0liMzHZGeC8KDCml7sKqAnvt4lyofJjiYNgWMwj0gmIJX3NwX1mwhBRQ/ A37xq72Z0v1xA==
From: guoren@kernel.org
To: guoren@kernel.org, arnd@arndb.de, palmer@dabbelt.com, mark.rutland@arm.com, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, dlustig@nvidia.com, parri.andrea@gmail.com
Cc: linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, Guo Ren
Subject: [PATCH V3 5/5] riscv: atomic: Add conditional atomic operations' optimization
Date: Wed, 20 Apr 2022 22:44:17 +0800
Message-Id: <20220420144417.2453958-6-guoren@kernel.org>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220420144417.2453958-1-guoren@kernel.org>
References: <20220420144417.2453958-1-guoren@kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Guo Ren

Add optimized LR/SC implementations of the conditional atomic
operations:
 - arch_atomic_inc_unless_negative
 - arch_atomic_dec_unless_positive
 - arch_atomic64_inc_unless_negative
 - arch_atomic64_dec_unless_positive

Signed-off-by: Guo Ren
Signed-off-by: Guo Ren
Cc: Palmer Dabbelt
Cc: Mark Rutland
Cc: Andrea Parri
Cc: Dan Lustig
---
 arch/riscv/include/asm/atomic.h | 78 +++++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)

diff --git a/arch/riscv/include/asm/atomic.h b/arch/riscv/include/asm/atomic.h
index 5589e1de2c80..a62c5de71033 100644
--- a/arch/riscv/include/asm/atomic.h
+++ b/arch/riscv/include/asm/atomic.h
@@ -374,6 +374,44 @@ ATOMIC_OPS()
 #undef ATOMIC_OPS
 #undef ATOMIC_OP
 
+static __always_inline bool arch_atomic_inc_unless_negative(atomic_t *v)
+{
+	int prev, rc;
+
+	__asm__ __volatile__ (
+		"0:	lr.w %[p], %[c]\n"
+		"	bltz %[p], 1f\n"
+		"	addi %[rc], %[p], 1\n"
+		"	sc.w.aqrl %[rc], %[rc], %[c]\n"
+		"	bnez %[rc], 0b\n"
+		"1:\n"
+		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
+		:
+		: "memory");
+	return !(prev < 0);
+}
+
+#define arch_atomic_inc_unless_negative arch_atomic_inc_unless_negative
+
+static __always_inline bool arch_atomic_dec_unless_positive(atomic_t *v)
+{
+	int prev, rc;
+
+	__asm__ __volatile__ (
+		"0:	lr.w %[p], %[c]\n"
+		"	bgtz %[p], 1f\n"
+		"	addi %[rc], %[p], -1\n"
+		"	sc.w.aqrl %[rc], %[rc], %[c]\n"
+		"	bnez %[rc], 0b\n"
+		"1:\n"
+		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
+		:
+		: "memory");
+	return !(prev > 0);
+}
+
+#define arch_atomic_dec_unless_positive arch_atomic_dec_unless_positive
+
 static __always_inline int arch_atomic_dec_if_positive(atomic_t *v)
 {
 	int prev, rc;
@@ -394,6 +432,46 @@ static __always_inline int arch_atomic_dec_if_positive(atomic_t *v)
 #define arch_atomic_dec_if_positive arch_atomic_dec_if_positive
 
 #ifndef CONFIG_GENERIC_ATOMIC64
+static __always_inline bool arch_atomic64_inc_unless_negative(atomic64_t *v)
+{
+	s64 prev;
+	long rc;
+
+	__asm__ __volatile__ (
+		"0:	lr.d %[p], %[c]\n"
+		"	bltz %[p], 1f\n"
+		"	addi %[rc], %[p], 1\n"
+		"	sc.d.aqrl %[rc], %[rc], %[c]\n"
+		"	bnez %[rc], 0b\n"
+		"1:\n"
+		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
+		:
+		: "memory");
+	return !(prev < 0);
+}
+
+#define arch_atomic64_inc_unless_negative arch_atomic64_inc_unless_negative
+
+static __always_inline bool arch_atomic64_dec_unless_positive(atomic64_t *v)
+{
+	s64 prev;
+	long rc;
+
+	__asm__ __volatile__ (
+		"0:	lr.d %[p], %[c]\n"
+		"	bgtz %[p], 1f\n"
+		"	addi %[rc], %[p], -1\n"
+		"	sc.d.aqrl %[rc], %[rc], %[c]\n"
+		"	bnez %[rc], 0b\n"
+		"1:\n"
+		: [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
+		:
+		: "memory");
+	return !(prev > 0);
+}
+
+#define arch_atomic64_dec_unless_positive arch_atomic64_dec_unless_positive
+
 static __always_inline s64 arch_atomic64_dec_if_positive(atomic64_t *v)
 {
 	s64 prev;
-- 
2.25.1
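
For readers following patches 4/5 and 5/5, the intended semantics of the
conditional operations can be sketched in portable C11. This is only an
illustration and not part of the series: the unprefixed function names
are hypothetical, <stdatomic.h> compare-exchange loops stand in for the
LR/SC sequences, and the sketch assumes the counter is never INT_MIN or
INT_MAX, so the signed arithmetic cannot overflow.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical portable stand-ins for the arch_atomic_* helpers.
 * A successful compare-exchange is sequentially consistent, mirroring
 * the fully ordered sc.w.aqrl; the early-exit path performs no store.
 * Assumption: the counter never holds INT_MIN or INT_MAX.
 */

/* Decrement *v unless the result would be negative; return old - 1. */
static int dec_if_positive(atomic_int *v)
{
	int prev = atomic_load_explicit(v, memory_order_relaxed);

	while (prev > 0 &&
	       !atomic_compare_exchange_weak(v, &prev, prev - 1))
		;	/* a failed exchange reloads prev; retry */
	return prev - 1;
}

/* Increment *v unless it is negative; return true if it was not negative. */
static bool inc_unless_negative(atomic_int *v)
{
	int prev = atomic_load_explicit(v, memory_order_relaxed);

	while (prev >= 0 &&
	       !atomic_compare_exchange_weak(v, &prev, prev + 1))
		;	/* a failed exchange reloads prev; retry */
	return prev >= 0;
}

/* Decrement *v unless it is positive; return true if it was not positive. */
static bool dec_unless_positive(atomic_int *v)
{
	int prev = atomic_load_explicit(v, memory_order_relaxed);

	while (prev <= 0 &&
	       !atomic_compare_exchange_weak(v, &prev, prev - 1))
		;	/* a failed exchange reloads prev; retry */
	return prev <= 0;
}
```

The return conventions match the patches: dec_if_positive reports
"old value - 1" whether or not the store happened, while the two
*_unless_* helpers report whether the condition allowed the update.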