[PATCH v4 3/3] block: prevent race condition on bi_status in __bio_chain_endio

zhangshida posted 3 patches 12 hours ago
[PATCH v4 3/3] block: prevent race condition on bi_status in __bio_chain_endio
Posted by zhangshida 12 hours ago
From: Shida Zhang <zhangshida@kylinos.cn>

Andreas point out that multiple completions can race setting
bi_status.

The check (parent->bi_status) and the subsequent write are not an
atomic operation. The value of parent->bi_status could have changed
between the time you read it for the if check and the time you write
to it. So we use cmpxchg to fix the race, as suggested by Christoph.

Suggested-by: Andreas Gruenbacher <agruenba@redhat.com>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Suggested-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Shida Zhang <zhangshida@kylinos.cn>
---
 block/bio.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/block/bio.c b/block/bio.c
index 1b5e4577f4c..097c1cd2054 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -314,8 +314,9 @@ static struct bio *__bio_chain_endio(struct bio *bio)
 {
 	struct bio *parent = bio->bi_private;
 
-	if (bio->bi_status && !parent->bi_status)
-		parent->bi_status = bio->bi_status;
+	if (bio->bi_status)
+		cmpxchg(&parent->bi_status, 0, bio->bi_status);
+
 	bio_put(bio);
 	return parent;
 }
-- 
2.34.1
Re: [PATCH v4 3/3] block: prevent race condition on bi_status in __bio_chain_endio
Posted by Andreas Gruenbacher 11 hours ago
On Mon, Dec 1, 2025 at 10:05 AM zhangshida <starzhangzsd@gmail.com> wrote:
> From: Shida Zhang <zhangshida@kylinos.cn>
>
> Andreas point out that multiple completions can race setting
> bi_status.
>
> The check (parent->bi_status) and the subsequent write are not an
> atomic operation. The value of parent->bi_status could have changed
> between the time you read it for the if check and the time you write
> to it. So we use cmpxchg to fix the race, as suggested by Christoph.
>
> Suggested-by: Andreas Gruenbacher <agruenba@redhat.com>
> Suggested-by: Christoph Hellwig <hch@infradead.org>
> Suggested-by: Caleb Sander Mateos <csander@purestorage.com>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Shida Zhang <zhangshida@kylinos.cn>
> ---
>  block/bio.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/block/bio.c b/block/bio.c
> index 1b5e4577f4c..097c1cd2054 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -314,8 +314,9 @@ static struct bio *__bio_chain_endio(struct bio *bio)
>  {
>         struct bio *parent = bio->bi_private;
>
> -       if (bio->bi_status && !parent->bi_status)
> -               parent->bi_status = bio->bi_status;
> +       if (bio->bi_status)
> +               cmpxchg(&parent->bi_status, 0, bio->bi_status);

Hmm. I don't think cmpxchg() actually is of any value here: for all
the chained bios, bi_status is initialized to 0, and it is only set
again (to a non-0 value) when a failure occurs. When there are
multiple failures, we only need to make sure that one of those
failures is eventually reported, but for that, a simple assignment is
enough here. The cmpxchg() won't guarantee that a specific error value
will survive; it all still depends on the timing. The cmpxchg() only
makes it look like something special is happening here with respect to
ordering.

> +
>         bio_put(bio);
>         return parent;
>  }
> --
> 2.34.1
>

Thanks,
Andreas
Re: [PATCH v4 3/3] block: prevent race condition on bi_status in __bio_chain_endio
Posted by Christoph Hellwig 10 hours ago
On Mon, Dec 01, 2025 at 11:22:32AM +0100, Andreas Gruenbacher wrote:
> > -       if (bio->bi_status && !parent->bi_status)
> > -               parent->bi_status = bio->bi_status;
> > +       if (bio->bi_status)
> > +               cmpxchg(&parent->bi_status, 0, bio->bi_status);
> 
> Hmm. I don't think cmpxchg() actually is of any value here: for all
> the chained bios, bi_status is initialized to 0, and it is only set
> again (to a non-0 value) when a failure occurs. When there are
> multiple failures, we only need to make sure that one of those
> failures is eventually reported, but for that, a simple assignment is
> enough here.

A simple assignment doesn't guarantee atomicy.  It also overrides
earlier with later status codes, which might not be desirable.
Re: [PATCH v4 3/3] block: prevent race condition on bi_status in __bio_chain_endio
Posted by Andreas Gruenbacher 8 hours ago
On Mon, Dec 1, 2025 at 12:25 PM Christoph Hellwig <hch@infradead.org> wrote:
> On Mon, Dec 01, 2025 at 11:22:32AM +0100, Andreas Gruenbacher wrote:
> > > -       if (bio->bi_status && !parent->bi_status)
> > > -               parent->bi_status = bio->bi_status;
> > > +       if (bio->bi_status)
> > > +               cmpxchg(&parent->bi_status, 0, bio->bi_status);
> >
> > Hmm. I don't think cmpxchg() actually is of any value here: for all
> > the chained bios, bi_status is initialized to 0, and it is only set
> > again (to a non-0 value) when a failure occurs. When there are
> > multiple failures, we only need to make sure that one of those
> > failures is eventually reported, but for that, a simple assignment is
> > enough here.
>
> A simple assignment doesn't guarantee atomicy.

Well, we've already discussed that bi_status is a single byte and so
tearing won't be an issue. Otherwise, WRITE_ONCE() would still be
enough here.

>  It also overrides earlier with later status codes, which might not be desirable.

In an A -> B bio chain, we have two competing bi_status writers:

  (1) when the A fails, B->bi_status will be updated using cmpxchg(),
  (2) when B fails, bi_status will be assigned a non-0 value.

In that scenario, B failures will always win over A failures.

In addition, when we have an A -> B -> C bio chain, we still won't get
"first failure wins" semantics because a failure of A will only be
propagated to C once B completes as well. To "fix" that, we'd have to
"chain" all bios to the same parent instead. But I don't think any of
that is really needed.

Andreas