From: "Paul E. McKenney"
To: linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org, kernel-team@fb.com, mingo@kernel.org
Cc: stern@rowland.harvard.edu, parri.andrea@gmail.com, will@kernel.org, peterz@infradead.org, boqun.feng@gmail.com, npiggin@gmail.com, dhowells@redhat.com, j.alglave@ucl.ac.uk, luc.maranget@inria.fr, akiyks@gmail.com, "Paul E. McKenney", Daniel Lustig, Joel Fernandes, "Michael S. Tsirkin", Jonathan Corbet
Subject: [PATCH memory-model 2/3] docs/memory-barriers.txt: Fixup long lines
Date: Wed, 31 Aug 2022 11:23:52 -0700
Message-Id: <20220831182353.2699262-2-paulmck@kernel.org>
In-Reply-To: <20220831182350.GA2698943@paulmck-ThinkPad-P17-Gen-1>
References: <20220831182350.GA2698943@paulmck-ThinkPad-P17-Gen-1>

From: Akira Yokosawa

Substitution of "data dependency barrier" with "address-dependency
barrier" left quite a lot of lines exceeding 80 columns.  Reflow those
lines as well as a few short ones not related to the substitution.
No changes in documentation text.

Signed-off-by: Akira Yokosawa
Cc: "Paul E. McKenney"
Cc: Alan Stern
Cc: Will Deacon
Cc: Peter Zijlstra
Cc: Boqun Feng
Cc: Andrea Parri
Cc: Nicholas Piggin
Cc: David Howells
Cc: Daniel Lustig
Cc: Joel Fernandes
Cc: "Michael S. Tsirkin"
Cc: Jonathan Corbet
Signed-off-by: Paul E. McKenney
---
 Documentation/memory-barriers.txt | 93 ++++++++++++++++---------------
 1 file changed, 47 insertions(+), 46 deletions(-)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index b16767cb6d31d..06f80e3785c5d 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -187,9 +187,9 @@ As a further example, consider this sequence of events:
 	B = 4;		Q = P;
 	P = &B;		D = *Q;
 
-There is an obvious address dependency here, as the value loaded into D depends on
-the address retrieved from P by CPU 2.  At the end of the sequence, any of the
-following results are possible:
+There is an obvious address dependency here, as the value loaded into D depends
+on the address retrieved from P by CPU 2.  At the end of the sequence, any of
+the following results are possible:
 
 	(Q == &A) and (D == 1)
 	(Q == &B) and (D == 2)
@@ -397,25 +397,25 @@ Memory barriers come in four basic varieties:
 
 (2) Address-dependency barriers (historical).
 
-     An address-dependency barrier is a weaker form of read barrier.  In the case
-     where two loads are performed such that the second depends on the result
-     of the first (eg: the first load retrieves the address to which the second
-     load will be directed), an address-dependency barrier would be required to
-     make sure that the target of the second load is updated after the address
-     obtained by the first load is accessed.
+     An address-dependency barrier is a weaker form of read barrier.  In the
+     case where two loads are performed such that the second depends on the
+     result of the first (eg: the first load retrieves the address to which
+     the second load will be directed), an address-dependency barrier would
+     be required to make sure that the target of the second load is updated
+     after the address obtained by the first load is accessed.
 
-     An address-dependency barrier is a partial ordering on interdependent loads
-     only; it is not required to have any effect on stores, independent loads
-     or overlapping loads.
+     An address-dependency barrier is a partial ordering on interdependent
+     loads only; it is not required to have any effect on stores, independent
+     loads or overlapping loads.
 
      As mentioned in (1), the other CPUs in the system can be viewed as
      committing sequences of stores to the memory system that the CPU being
-     considered can then perceive.  An address-dependency barrier issued by the CPU
-     under consideration guarantees that for any load preceding it, if that
-     load touches one of a sequence of stores from another CPU, then by the
-     time the barrier completes, the effects of all the stores prior to that
-     touched by the load will be perceptible to any loads issued after the address-
-     dependency barrier.
+     considered can then perceive.  An address-dependency barrier issued by
+     the CPU under consideration guarantees that for any load preceding it,
+     if that load touches one of a sequence of stores from another CPU, then
+     by the time the barrier completes, the effects of all the stores prior to
+     that touched by the load will be perceptible to any loads issued after
+     the address-dependency barrier.
 
      See the "Examples of memory barrier sequences" subsection for diagrams
      showing the ordering constraints.
@@ -437,16 +437,16 @@ Memory barriers come in four basic varieties:
 
 (3) Read (or load) memory barriers.
 
-     A read barrier is an address-dependency barrier plus a guarantee that all the
-     LOAD operations specified before the barrier will appear to happen before
-     all the LOAD operations specified after the barrier with respect to the
-     other components of the system.
+     A read barrier is an address-dependency barrier plus a guarantee that all
+     the LOAD operations specified before the barrier will appear to happen
+     before all the LOAD operations specified after the barrier with respect to
+     the other components of the system.
 
      A read barrier is a partial ordering on loads only; it is not required to
      have any effect on stores.
 
-     Read memory barriers imply address-dependency barriers, and so can substitute
-     for them.
+     Read memory barriers imply address-dependency barriers, and so can
+     substitute for them.
 
      [!] Note that read barriers should normally be paired with write barriers;
      see the "SMP barrier pairing" subsection.
@@ -584,8 +584,8 @@ following sequence of events:
 [!] READ_ONCE_OLD() corresponds to READ_ONCE() of pre-4.15 kernel, which
 doesn't imply an address-dependency barrier.
 
-There's a clear address dependency here, and it would seem that by the end of the
-sequence, Q must be either &A or &B, and that:
+There's a clear address dependency here, and it would seem that by the end of
+the sequence, Q must be either &A or &B, and that:
 
 	(Q == &A) implies (D == 1)
 	(Q == &B) implies (D == 4)
@@ -599,8 +599,8 @@ While this may seem like a failure of coherency or causality maintenance, it
 isn't, and this behaviour can be observed on certain real CPUs (such as the
 DEC Alpha).
 
-To deal with this, READ_ONCE() provides an implicit address-dependency
-barrier since kernel release v4.15:
+To deal with this, READ_ONCE() provides an implicit address-dependency barrier
+since kernel release v4.15:
 
 	CPU 1		      CPU 2
 	===============	      ===============
@@ -627,12 +627,12 @@ but the old value of the variable B (2).
 
 
 An address-dependency barrier is not required to order dependent writes
-because the CPUs that the Linux kernel supports don't do writes
-until they are certain (1) that the write will actually happen, (2)
-of the location of the write, and (3) of the value to be written.
+because the CPUs that the Linux kernel supports don't do writes until they
+are certain (1) that the write will actually happen, (2) of the location of
+the write, and (3) of the value to be written.
 But please carefully read the "CONTROL DEPENDENCIES" section and the
-Documentation/RCU/rcu_dereference.rst file:  The compiler can and does
-break dependencies in a great many highly creative ways.
+Documentation/RCU/rcu_dereference.rst file:  The compiler can and does break
+dependencies in a great many highly creative ways.
 
 	CPU 1		      CPU 2
 	===============	      ===============
@@ -678,8 +678,8 @@ not understand them.  The purpose of this section is to help you prevent the
 compiler's ignorance from breaking your code.
 
 A load-load control dependency requires a full read memory barrier, not
-simply an (implicit) address-dependency barrier to make it work correctly.  Consider the
-following bit of code:
+simply an (implicit) address-dependency barrier to make it work correctly.
+Consider the following bit of code:
 
 	q = READ_ONCE(a);
@@ -691,8 +691,8 @@ following bit of code:
 This will not have the desired effect because there is no actual address
 dependency, but rather a control dependency that the CPU may short-circuit
 by attempting to predict the outcome in advance, so that other CPUs see
-the load from b as having happened before the load from a.  In such a
-case what's actually required is:
+the load from b as having happened before the load from a.  In such a case
+what's actually required is:
 
 	q = READ_ONCE(a);
 	if (q) {
@@ -980,8 +980,8 @@ Basically, the read barrier always has to be there, even though it can be of
 the "weaker" type.
 
 [!] Note that the stores before the write barrier would normally be expected to
-match the loads after the read barrier or the address-dependency barrier, and vice
-versa:
+match the loads after the read barrier or the address-dependency barrier, and
+vice versa:
 
 	CPU 1                               CPU 2
 	===================                 ===================
@@ -1033,8 +1033,8 @@ STORE B, STORE C } all occurring before the unordered set of { STORE D, STORE E
 	                                        V
 
 
-Secondly, address-dependency barriers act as partial orderings on address-dependent
-loads.  Consider the following sequence of events:
+Secondly, address-dependency barriers act as partial orderings on address-
+dependent loads.  Consider the following sequence of events:
 
 	CPU 1			CPU 2
 	=======================	=======================
@@ -1079,8 +1079,8 @@ effectively random order, despite the write barrier issued by CPU 1:
 In the above example, CPU 2 perceives that B is 7, despite the load of *C
 (which would be B) coming after the LOAD of C.
 
-If, however, an address-dependency barrier were to be placed between the load of C
-and the load of *C (ie: B) on CPU 2:
+If, however, an address-dependency barrier were to be placed between the load
+of C and the load of *C (ie: B) on CPU 2:
 
 	CPU 1			CPU 2
 	=======================	=======================
@@ -2761,7 +2761,8 @@ is discarded from the CPU's cache and reloaded.  To deal with this, the
 appropriate part of the kernel must invalidate the overlapping bits of the
 cache on each CPU.
 
-See Documentation/core-api/cachetlb.rst for more information on cache management.
+See Documentation/core-api/cachetlb.rst for more information on cache
+management.
 
 
 CACHE COHERENCY VS MMIO
@@ -2901,8 +2902,8 @@ AND THEN THERE'S THE ALPHA
 The DEC Alpha CPU is one of the most relaxed CPUs there is.  Not only that,
 some versions of the Alpha CPU have a split data cache, permitting them to have
 two semantically-related cache lines updated at separate times.  This is where
-the address-dependency barrier really becomes necessary as this synchronises both
-caches with the memory coherence system, thus making it seem like pointer
+the address-dependency barrier really becomes necessary as this synchronises
+both caches with the memory coherence system, thus making it seem like pointer
 changes vs new data occur in the right order.
 
 The Alpha defines the Linux kernel's memory model, although as of v4.15
-- 
2.31.1.189.g2e36527f23