From: Shida Zhang <zhangshida@kylinos.cn>
Andreas point out that multiple completions can race setting
bi_status.
The check (parent->bi_status) and the subsequent write are not an
atomic operation. The value of parent->bi_status could have changed
between the time you read it for the if check and the time you write
to it. So we use cmpxchg to fix the race, as suggested by Christoph.
Suggested-by: Andreas Gruenbacher <agruenba@redhat.com>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Suggested-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Shida Zhang <zhangshida@kylinos.cn>
---
block/bio.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/block/bio.c b/block/bio.c
index 1b5e4577f4c..097c1cd2054 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -314,8 +314,9 @@ static struct bio *__bio_chain_endio(struct bio *bio)
{
struct bio *parent = bio->bi_private;
- if (bio->bi_status && !parent->bi_status)
- parent->bi_status = bio->bi_status;
+ if (bio->bi_status)
+ cmpxchg(&parent->bi_status, 0, bio->bi_status);
+
bio_put(bio);
return parent;
}
--
2.34.1
On Mon, Dec 1, 2025 at 10:05 AM zhangshida <starzhangzsd@gmail.com> wrote:
> From: Shida Zhang <zhangshida@kylinos.cn>
>
> Andreas point out that multiple completions can race setting
> bi_status.
>
> The check (parent->bi_status) and the subsequent write are not an
> atomic operation. The value of parent->bi_status could have changed
> between the time you read it for the if check and the time you write
> to it. So we use cmpxchg to fix the race, as suggested by Christoph.
>
> Suggested-by: Andreas Gruenbacher <agruenba@redhat.com>
> Suggested-by: Christoph Hellwig <hch@infradead.org>
> Suggested-by: Caleb Sander Mateos <csander@purestorage.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Shida Zhang <zhangshida@kylinos.cn>
> ---
> block/bio.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/block/bio.c b/block/bio.c
> index 1b5e4577f4c..097c1cd2054 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -314,8 +314,9 @@ static struct bio *__bio_chain_endio(struct bio *bio)
> {
> struct bio *parent = bio->bi_private;
>
> - if (bio->bi_status && !parent->bi_status)
> - parent->bi_status = bio->bi_status;
> + if (bio->bi_status)
> + cmpxchg(&parent->bi_status, 0, bio->bi_status);
Hmm. I don't think cmpxchg() actually is of any value here: for all
the chained bios, bi_status is initialized to 0, and it is only set
again (to a non-0 value) when a failure occurs. When there are
multiple failures, we only need to make sure that one of those
failures is eventually reported, but for that, a simple assignment is
enough here. The cmpxchg() won't guarantee that a specific error value
will survive; it all still depends on the timing. The cmpxchg() only
makes it look like something special is happening here with respect to
ordering.
> +
> bio_put(bio);
> return parent;
> }
> --
> 2.34.1
>
Thanks,
Andreas
On Mon, Dec 01, 2025 at 11:22:32AM +0100, Andreas Gruenbacher wrote: > > - if (bio->bi_status && !parent->bi_status) > > - parent->bi_status = bio->bi_status; > > + if (bio->bi_status) > > + cmpxchg(&parent->bi_status, 0, bio->bi_status); > > Hmm. I don't think cmpxchg() actually is of any value here: for all > the chained bios, bi_status is initialized to 0, and it is only set > again (to a non-0 value) when a failure occurs. When there are > multiple failures, we only need to make sure that one of those > failures is eventually reported, but for that, a simple assignment is > enough here. A simple assignment doesn't guarantee atomicy. It also overrides earlier with later status codes, which might not be desirable.
On Mon, Dec 1, 2025 at 12:25 PM Christoph Hellwig <hch@infradead.org> wrote: > On Mon, Dec 01, 2025 at 11:22:32AM +0100, Andreas Gruenbacher wrote: > > > - if (bio->bi_status && !parent->bi_status) > > > - parent->bi_status = bio->bi_status; > > > + if (bio->bi_status) > > > + cmpxchg(&parent->bi_status, 0, bio->bi_status); > > > > Hmm. I don't think cmpxchg() actually is of any value here: for all > > the chained bios, bi_status is initialized to 0, and it is only set > > again (to a non-0 value) when a failure occurs. When there are > > multiple failures, we only need to make sure that one of those > > failures is eventually reported, but for that, a simple assignment is > > enough here. > > A simple assignment doesn't guarantee atomicy. Well, we've already discussed that bi_status is a single byte and so tearing won't be an issue. Otherwise, WRITE_ONCE() would still be enough here. > It also overrides earlier with later status codes, which might not be desirable. In an A -> B bio chain, we have two competing bi_status writers: (1) when the A fails, B->bi_status will be updated using cmpxchg(), (2) when B fails, bi_status will be assigned a non-0 value. In that scenario, B failures will always win over A failures. In addition, when we have an A -> B -> C bio chain, we still won't get "first failure wins" semantics because a failure of A will only be propagated to C once B completes as well. To "fix" that, we'd have to "chain" all bios to the same parent instead. But I don't think any of that is really needed. Andreas
On Mon, Dec 01, 2025 at 02:07:07PM +0100, Andreas Gruenbacher wrote: > On Mon, Dec 1, 2025 at 12:25 PM Christoph Hellwig <hch@infradead.org> wrote: > > On Mon, Dec 01, 2025 at 11:22:32AM +0100, Andreas Gruenbacher wrote: > > > > - if (bio->bi_status && !parent->bi_status) > > > > - parent->bi_status = bio->bi_status; > > > > + if (bio->bi_status) > > > > + cmpxchg(&parent->bi_status, 0, bio->bi_status); > > > > > > Hmm. I don't think cmpxchg() actually is of any value here: for all > > > the chained bios, bi_status is initialized to 0, and it is only set > > > again (to a non-0 value) when a failure occurs. When there are > > > multiple failures, we only need to make sure that one of those > > > failures is eventually reported, but for that, a simple assignment is > > > enough here. > > > > A simple assignment doesn't guarantee atomicy. > > Well, we've already discussed that bi_status is a single byte and so > tearing won't be an issue. Otherwise, WRITE_ONCE() would still be > enough here. No. At least older alpha can tear byte updates as they need a read-modify-write cycle. But even on normal x86 the check and the update would be racy. The cmpxchg makes the intentions very clear, works everywhere and given it only happens in the error path does not create any fast path overhead.
On Tue, Dec 2, 2025 at 6:48 AM Christoph Hellwig <hch@lst.de> wrote:
> On Mon, Dec 01, 2025 at 02:07:07PM +0100, Andreas Gruenbacher wrote:
> > On Mon, Dec 1, 2025 at 12:25 PM Christoph Hellwig <hch@infradead.org> wrote:
> > > On Mon, Dec 01, 2025 at 11:22:32AM +0100, Andreas Gruenbacher wrote:
> > > > > - if (bio->bi_status && !parent->bi_status)
> > > > > - parent->bi_status = bio->bi_status;
> > > > > + if (bio->bi_status)
> > > > > + cmpxchg(&parent->bi_status, 0, bio->bi_status);
> > > >
> > > > Hmm. I don't think cmpxchg() actually is of any value here: for all
> > > > the chained bios, bi_status is initialized to 0, and it is only set
> > > > again (to a non-0 value) when a failure occurs. When there are
> > > > multiple failures, we only need to make sure that one of those
> > > > failures is eventually reported, but for that, a simple assignment is
> > > > enough here.
> > >
> > > A simple assignment doesn't guarantee atomicy.
> >
> > Well, we've already discussed that bi_status is a single byte and so
> > tearing won't be an issue. Otherwise, WRITE_ONCE() would still be
> > enough here.
>
> No. At least older alpha can tear byte updates as they need a
> read-modify-write cycle.
I know this used to be a thing in the past, but to see that none of
that is relevant anymore today, have a look at where [*] quotes the
C11 standard:
memory location
either an object of scalar type, or a maximal sequence
of adjacent bit-fields all having nonzero width
NOTE 1: Two threads of execution can update and access
separate memory locations without interfering with
each other.
NOTE 2: A bit-field and an adjacent non-bit-field member
are in separate memory locations. The same applies
to two bit-fields, if one is declared inside a nested
structure declaration and the other is not, or if the two
are separated by a zero-length bit-field declaration,
or if they are separated by a non-bit-field member
declaration. It is not safe to concurrently update two
bit-fields in the same structure if all members declared
between them are also bit-fields, no matter what the
sizes of those intervening bit-fields happen to be.
[*] Documentation/memory-barriers.txt
> But even on normal x86 the check and the update would be racy.
There is no check and update (RMW), though. Quoting what I wrote
earlier in this thread:
On Mon, Dec 1, 2025 at 11:22 AM Andreas Gruenbacher <agruenba@redhat.com> wrote:
> Hmm. I don't think cmpxchg() actually is of any value here: for all
> the chained bios, bi_status is initialized to 0, and it is only set
> again (to a non-0 value) when a failure occurs. When there are
> multiple failures, we only need to make sure that one of those
> failures is eventually reported, but for that, a simple assignment is
> enough here. The cmpxchg() won't guarantee that a specific error value
> will survive; it all still depends on the timing. The cmpxchg() only
> makes it look like something special is happening here with respect to
> ordering.
So with or without the cmpxchg(), if there are multiple errors, we
won't know which bi_status code will survive, but we do know that we
will end up with one of those error codes.
Andreas
Andreas Gruenbacher <agruenba@redhat.com> 于2025年12月3日周三 05:15写道:
>
> On Tue, Dec 2, 2025 at 6:48 AM Christoph Hellwig <hch@lst.de> wrote:
> > On Mon, Dec 01, 2025 at 02:07:07PM +0100, Andreas Gruenbacher wrote:
> > > On Mon, Dec 1, 2025 at 12:25 PM Christoph Hellwig <hch@infradead.org> wrote:
> > > > On Mon, Dec 01, 2025 at 11:22:32AM +0100, Andreas Gruenbacher wrote:
> > > > > > - if (bio->bi_status && !parent->bi_status)
> > > > > > - parent->bi_status = bio->bi_status;
> > > > > > + if (bio->bi_status)
> > > > > > + cmpxchg(&parent->bi_status, 0, bio->bi_status);
> > > > >
> > > > > Hmm. I don't think cmpxchg() actually is of any value here: for all
> > > > > the chained bios, bi_status is initialized to 0, and it is only set
> > > > > again (to a non-0 value) when a failure occurs. When there are
> > > > > multiple failures, we only need to make sure that one of those
> > > > > failures is eventually reported, but for that, a simple assignment is
> > > > > enough here.
> > > >
> > > > A simple assignment doesn't guarantee atomicy.
> > >
> > > Well, we've already discussed that bi_status is a single byte and so
> > > tearing won't be an issue. Otherwise, WRITE_ONCE() would still be
> > > enough here.
> >
> > No. At least older alpha can tear byte updates as they need a
> > read-modify-write cycle.
>
> I know this used to be a thing in the past, but to see that none of
> that is relevant anymore today, have a look at where [*] quotes the
> C11 standard:
>
> memory location
> either an object of scalar type, or a maximal sequence
> of adjacent bit-fields all having nonzero width
>
> NOTE 1: Two threads of execution can update and access
> separate memory locations without interfering with
> each other.
>
> NOTE 2: A bit-field and an adjacent non-bit-field member
> are in separate memory locations. The same applies
> to two bit-fields, if one is declared inside a nested
> structure declaration and the other is not, or if the two
> are separated by a zero-length bit-field declaration,
> or if they are separated by a non-bit-field member
> declaration. It is not safe to concurrently update two
> bit-fields in the same structure if all members declared
> between them are also bit-fields, no matter what the
> sizes of those intervening bit-fields happen to be.
>
> [*] Documentation/memory-barriers.txt
>
> > But even on normal x86 the check and the update would be racy.
>
> There is no check and update (RMW), though. Quoting what I wrote
> earlier in this thread:
>
> On Mon, Dec 1, 2025 at 11:22 AM Andreas Gruenbacher <agruenba@redhat.com> wrote:
> > Hmm. I don't think cmpxchg() actually is of any value here: for all
> > the chained bios, bi_status is initialized to 0, and it is only set
> > again (to a non-0 value) when a failure occurs. When there are
> > multiple failures, we only need to make sure that one of those
> > failures is eventually reported, but for that, a simple assignment is
> > enough here. The cmpxchg() won't guarantee that a specific error value
> > will survive; it all still depends on the timing. The cmpxchg() only
> > makes it look like something special is happening here with respect to
> > ordering.
>
> So with or without the cmpxchg(), if there are multiple errors, we
> won't know which bi_status code will survive, but we do know that we
> will end up with one of those error codes.
>
Thank you for sharing your insights—I found the discussion very enlightening.
While I agree with Andreas’s perspective, I also very much appreciate
the clarity
and precision offered by the cmpxchg() approach. That’s why when Christoph
suggested it, I was happy to incorporate it into the code.
But a cmpxchg is a little bit redundant here.
so we will change it to the simple assignment:
- if (bio->bi_status && !parent->bi_status)
parent->bi_status = bio->bi_status;
+ if (bio->bi_status)
parent->bi_status = bio->bi_status;
I will integrate this discussion into the commit message, it is very insightful.
Thanks,
Shida
> Andreas
>
Stephen Zhang <starzhangzsd@gmail.com> 于2025年12月3日周三 09:51写道:
>
> Andreas Gruenbacher <agruenba@redhat.com> 于2025年12月3日周三 05:15写道:
> >
> > On Tue, Dec 2, 2025 at 6:48 AM Christoph Hellwig <hch@lst.de> wrote:
> > > On Mon, Dec 01, 2025 at 02:07:07PM +0100, Andreas Gruenbacher wrote:
> > > > On Mon, Dec 1, 2025 at 12:25 PM Christoph Hellwig <hch@infradead.org> wrote:
> > > > > On Mon, Dec 01, 2025 at 11:22:32AM +0100, Andreas Gruenbacher wrote:
> > > > > > > - if (bio->bi_status && !parent->bi_status)
> > > > > > > - parent->bi_status = bio->bi_status;
> > > > > > > + if (bio->bi_status)
> > > > > > > + cmpxchg(&parent->bi_status, 0, bio->bi_status);
> > > > > >
> > > > > > Hmm. I don't think cmpxchg() actually is of any value here: for all
> > > > > > the chained bios, bi_status is initialized to 0, and it is only set
> > > > > > again (to a non-0 value) when a failure occurs. When there are
> > > > > > multiple failures, we only need to make sure that one of those
> > > > > > failures is eventually reported, but for that, a simple assignment is
> > > > > > enough here.
> > > > >
> > > > > A simple assignment doesn't guarantee atomicy.
> > > >
> > > > Well, we've already discussed that bi_status is a single byte and so
> > > > tearing won't be an issue. Otherwise, WRITE_ONCE() would still be
> > > > enough here.
> > >
> > > No. At least older alpha can tear byte updates as they need a
> > > read-modify-write cycle.
> >
> > I know this used to be a thing in the past, but to see that none of
> > that is relevant anymore today, have a look at where [*] quotes the
> > C11 standard:
> >
> > memory location
> > either an object of scalar type, or a maximal sequence
> > of adjacent bit-fields all having nonzero width
> >
> > NOTE 1: Two threads of execution can update and access
> > separate memory locations without interfering with
> > each other.
> >
> > NOTE 2: A bit-field and an adjacent non-bit-field member
> > are in separate memory locations. The same applies
> > to two bit-fields, if one is declared inside a nested
> > structure declaration and the other is not, or if the two
> > are separated by a zero-length bit-field declaration,
> > or if they are separated by a non-bit-field member
> > declaration. It is not safe to concurrently update two
> > bit-fields in the same structure if all members declared
> > between them are also bit-fields, no matter what the
> > sizes of those intervening bit-fields happen to be.
> >
> > [*] Documentation/memory-barriers.txt
> >
> > > But even on normal x86 the check and the update would be racy.
> >
> > There is no check and update (RMW), though. Quoting what I wrote
> > earlier in this thread:
> >
> > On Mon, Dec 1, 2025 at 11:22 AM Andreas Gruenbacher <agruenba@redhat.com> wrote:
> > > Hmm. I don't think cmpxchg() actually is of any value here: for all
> > > the chained bios, bi_status is initialized to 0, and it is only set
> > > again (to a non-0 value) when a failure occurs. When there are
> > > multiple failures, we only need to make sure that one of those
> > > failures is eventually reported, but for that, a simple assignment is
> > > enough here. The cmpxchg() won't guarantee that a specific error value
> > > will survive; it all still depends on the timing. The cmpxchg() only
> > > makes it look like something special is happening here with respect to
> > > ordering.
> >
> > So with or without the cmpxchg(), if there are multiple errors, we
> > won't know which bi_status code will survive, but we do know that we
> > will end up with one of those error codes.
> >
>
> Thank you for sharing your insights—I found the discussion very enlightening.
>
> While I agree with Andreas’s perspective, I also very much appreciate
> the clarity
> and precision offered by the cmpxchg() approach. That’s why when Christoph
> suggested it, I was happy to incorporate it into the code.
>
> But a cmpxchg is a little bit redundant here.
> so we will change it to the simple assignment:
>
> - if (bio->bi_status && !parent->bi_status)
> parent->bi_status = bio->bi_status;
> + if (bio->bi_status)
> parent->bi_status = bio->bi_status;
>
> I will integrate this discussion into the commit message, it is very insightful.
>
Hi,
I’ve been reconsidering the two approaches for the upcoming patch revision.
Essentially, we’re comparing two methods:
A:
if (bio->bi_status)
parent->bi_status = bio->bi_status;
B:
if (bio->bi_status)
cmpxchg(&parent->bi_status, 0, bio->bi_status);
Both appear correct, but B seems a little bit redundant here.
Upon further reflection, I’ve noticed a subtle difference:
A unconditionally writes to parent->bi_status when bio->bi_status is non-zero,
regardless of the current value of parent->bi_status.
B uses cmpxchg to only update parent->bi_status if it is still zero.
Thus, B could avoid unnecessary writes in cases where parent->bi_status has
already been set to a non-zero value.
Do you think this optimization would be beneficial in practice, or is
the difference
negligible?
Thanks,
Shida
> Thanks,
> Shida
>
> > Andreas
> >
On Wed, Dec 03, 2025 at 11:09:36AM +0800, Stephen Zhang wrote: > > I’ve been reconsidering the two approaches for the upcoming patch revision. > Essentially, we’re comparing two methods: > A: > if (bio->bi_status) > parent->bi_status = bio->bi_status; > B: > if (bio->bi_status) > cmpxchg(&parent->bi_status, 0, bio->bi_status); > > Both appear correct, but B seems a little bit redundant here. A is not correct. You at least needs READ_ONCE/WRITE_ONCE here. B solves all these issues. > Upon further reflection, I’ve noticed a subtle difference: > A unconditionally writes to parent->bi_status when bio->bi_status is non-zero, > regardless of the current value of parent->bi_status. > B uses cmpxchg to only update parent->bi_status if it is still zero. > > Thus, B could avoid unnecessary writes in cases where parent->bi_status has > already been set to a non-zero value. The unnecessary writes don't really matter, we're in an error slow path here.
On Tue, Dec 2, 2025 at 7:10 PM Stephen Zhang <starzhangzsd@gmail.com> wrote: > > Stephen Zhang <starzhangzsd@gmail.com> 于2025年12月3日周三 09:51写道: > > > > Andreas Gruenbacher <agruenba@redhat.com> 于2025年12月3日周三 05:15写道: > > > > > > On Tue, Dec 2, 2025 at 6:48 AM Christoph Hellwig <hch@lst.de> wrote: > > > > On Mon, Dec 01, 2025 at 02:07:07PM +0100, Andreas Gruenbacher wrote: > > > > > On Mon, Dec 1, 2025 at 12:25 PM Christoph Hellwig <hch@infradead.org> wrote: > > > > > > On Mon, Dec 01, 2025 at 11:22:32AM +0100, Andreas Gruenbacher wrote: > > > > > > > > - if (bio->bi_status && !parent->bi_status) > > > > > > > > - parent->bi_status = bio->bi_status; > > > > > > > > + if (bio->bi_status) > > > > > > > > + cmpxchg(&parent->bi_status, 0, bio->bi_status); > > > > > > > > > > > > > > Hmm. I don't think cmpxchg() actually is of any value here: for all > > > > > > > the chained bios, bi_status is initialized to 0, and it is only set > > > > > > > again (to a non-0 value) when a failure occurs. When there are > > > > > > > multiple failures, we only need to make sure that one of those > > > > > > > failures is eventually reported, but for that, a simple assignment is > > > > > > > enough here. > > > > > > > > > > > > A simple assignment doesn't guarantee atomicy. > > > > > > > > > > Well, we've already discussed that bi_status is a single byte and so > > > > > tearing won't be an issue. Otherwise, WRITE_ONCE() would still be > > > > > enough here. > > > > > > > > No. At least older alpha can tear byte updates as they need a > > > > read-modify-write cycle. > > > > > > I know this used to be a thing in the past, but to see that none of > > > that is relevant anymore today, have a look at where [*] quotes the > > > C11 standard: > > > > > > memory location > > > either an object of scalar type, or a maximal sequence > > > of adjacent bit-fields all having nonzero width > > > > > > NOTE 1: Two threads of execution can update and access > > > separate memory locations without interfering with > > > each other. > > > > > > NOTE 2: A bit-field and an adjacent non-bit-field member > > > are in separate memory locations. The same applies > > > to two bit-fields, if one is declared inside a nested > > > structure declaration and the other is not, or if the two > > > are separated by a zero-length bit-field declaration, > > > or if they are separated by a non-bit-field member > > > declaration. It is not safe to concurrently update two > > > bit-fields in the same structure if all members declared > > > between them are also bit-fields, no matter what the > > > sizes of those intervening bit-fields happen to be. > > > > > > [*] Documentation/memory-barriers.txt > > > > > > > But even on normal x86 the check and the update would be racy. > > > > > > There is no check and update (RMW), though. Quoting what I wrote > > > earlier in this thread: > > > > > > On Mon, Dec 1, 2025 at 11:22 AM Andreas Gruenbacher <agruenba@redhat.com> wrote: > > > > Hmm. I don't think cmpxchg() actually is of any value here: for all > > > > the chained bios, bi_status is initialized to 0, and it is only set > > > > again (to a non-0 value) when a failure occurs. When there are > > > > multiple failures, we only need to make sure that one of those > > > > failures is eventually reported, but for that, a simple assignment is > > > > enough here. The cmpxchg() won't guarantee that a specific error value > > > > will survive; it all still depends on the timing. The cmpxchg() only > > > > makes it look like something special is happening here with respect to > > > > ordering. > > > > > > So with or without the cmpxchg(), if there are multiple errors, we > > > won't know which bi_status code will survive, but we do know that we > > > will end up with one of those error codes. > > > > > > > Thank you for sharing your insights—I found the discussion very enlightening. > > > > While I agree with Andreas’s perspective, I also very much appreciate > > the clarity > > and precision offered by the cmpxchg() approach. That’s why when Christoph > > suggested it, I was happy to incorporate it into the code. > > > > But a cmpxchg is a little bit redundant here. > > so we will change it to the simple assignment: > > > > - if (bio->bi_status && !parent->bi_status) > > parent->bi_status = bio->bi_status; > > + if (bio->bi_status) > > parent->bi_status = bio->bi_status; > > > > I will integrate this discussion into the commit message, it is very insightful. > > > > Hi, > > I’ve been reconsidering the two approaches for the upcoming patch revision. > Essentially, we’re comparing two methods: > A: > if (bio->bi_status) > parent->bi_status = bio->bi_status; > B: > if (bio->bi_status) > cmpxchg(&parent->bi_status, 0, bio->bi_status); > > Both appear correct, but B seems a little bit redundant here. > Upon further reflection, I’ve noticed a subtle difference: > A unconditionally writes to parent->bi_status when bio->bi_status is non-zero, > regardless of the current value of parent->bi_status. > B uses cmpxchg to only update parent->bi_status if it is still zero. > > Thus, B could avoid unnecessary writes in cases where parent->bi_status has > already been set to a non-zero value. > > Do you think this optimization would be beneficial in practice, or is > the difference > negligible? cmpxchg() is much more expensive than a plain write, as it compiles into an atomic instruction (or load-linked/store-conditional pair on architectures that don't provide such an instruction). This requires the processor to take exclusive access of the cache line for the duration of the operation (or check whether the cache line has been invalidated during the operation). On x86, cmpxchg() marks the cache line as modified regardless of whether the compare succeeds and the value is updated. So it doesn't actually avoid the cost of a write. However, the original code's check of parent->bi_status before writing to it *does* avoid the write if parent->bi_status is already nonzero. So the optimization you're looking for is already implemented :) If __bio_chain_endio() can be called concurrently from multiple threads accessing the same parent bio, it should be using WRITE_ONCE() (and READ_ONCE(), if applicable) to access parent->bi_status to avoid the data race. On x86 and ARM these macros produce the same instruction as a normal write, but they may be required on other architectures to avoid tearing, as well as to prevent the compiler from adding or removing memory accesses based on the assumption that the values aren't being accessed concurrently by other threads. Best, Caleb
© 2016 - 2026 Red Hat, Inc.