[PATCH RFC 2/7] md/raid0: convert raid0_handle_discard() to use bio_submit_split()

Yu Kuai posted 7 patches 1 month, 1 week ago
There is a newer version of this series
Posted by Yu Kuai 1 month, 1 week ago
From: Yu Kuai <yukuai3@huawei.com>

On the one hand, unify the bio split code in preparation for fixing
disordered split IO; on the other hand, fix the missing
blkcg_bio_issue_init() and trace_block_split() calls for split IO.

Note that raid0_make_request() already fixes disordered split IO since
319ff40a5427 ("md/raid0: Fix performance regression for large sequential
writes"), by remapping the bio to the underlying disks before
submit_bio_noacct(), given that md_submit_bio() has already split the
bio by sectors and raid0_make_request() will split at most once for
unaligned IO. This is a bit hacky, and we'll convert it to a general
solution later.

Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/raid0.c | 20 ++++++++------------
 1 file changed, 8 insertions(+), 12 deletions(-)

diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index f1d8811a542a..19b5faf238b7 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -463,21 +463,17 @@ static void raid0_handle_discard(struct mddev *mddev, struct bio *bio)
 	zone = find_zone(conf, &start);
 
 	if (bio_end_sector(bio) > zone->zone_end) {
-		struct bio *split = bio_split(bio,
-			zone->zone_end - bio->bi_iter.bi_sector, GFP_NOIO,
-			&mddev->bio_set);
-
-		if (IS_ERR(split)) {
-			bio->bi_status = errno_to_blk_status(PTR_ERR(split));
-			bio_endio(bio);
+		bio = bio_submit_split(bio,
+				zone->zone_end - bio->bi_iter.bi_sector,
+				&mddev->bio_set);
+		if (!bio)
 			return;
-		}
-		bio_chain(split, bio);
-		submit_bio_noacct(bio);
-		bio = split;
+
+		bio->bi_opf &= ~REQ_NOMERGE;
 		end = zone->zone_end;
-	} else
+	} else {
 		end = bio_end_sector(bio);
+	}
 
 	orig_end = end;
 	if (zone != conf->strip_zone)
-- 
2.39.2
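
For readers following the diff, here is a minimal userspace sketch of the boundary arithmetic used in the hunk above. It is not kernel code: `struct range` and `split_at_zone_end()` are hypothetical names introduced only for illustration, and the sketch assumes that the front part (the sectors up to zone_end) is the piece that continues to be processed while the remainder is handed off, as the patch appears to do.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;	/* stand-in for the kernel's sector_t */

/* Hypothetical model of a discard range: [start, start + nr_sectors). */
struct range {
	sector_t start;
	sector_t nr_sectors;
};

/*
 * If the range crosses zone_end, trim *r to the part inside the zone and
 * store the remainder in *rest. Returns 1 if a split happened, else 0.
 */
static int split_at_zone_end(struct range *r, sector_t zone_end,
			     struct range *rest)
{
	sector_t end = r->start + r->nr_sectors;
	sector_t head;

	assert(r->start < zone_end);	/* range must begin inside the zone */

	if (end <= zone_end)
		return 0;		/* fits in the zone, nothing to split */

	head = zone_end - r->start;	/* same subtraction as in the patch */
	rest->start = zone_end;
	rest->nr_sectors = r->nr_sectors - head;
	r->nr_sectors = head;
	return 1;
}
```

With a range starting at sector 100 that is 50 sectors long and a zone ending at sector 128, the front part keeps 28 sectors and the remainder starts at 128 with 22 sectors.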
Re: [PATCH RFC 2/7] md/raid0: convert raid0_handle_discard() to use bio_submit_split()
Posted by Christoph Hellwig 1 month, 1 week ago
On Mon, Aug 25, 2025 at 05:36:55PM +0800, Yu Kuai wrote:
> +		bio = bio_submit_split(bio,
> +				zone->zone_end - bio->bi_iter.bi_sector,
> +				&mddev->bio_set);

Do you know why raid0 and linear use mddev->bio_set for splitting
instead of their own split bio_sets like raid1/10/5?  Is this safe?

Otherwise this looks nice.
Re: [PATCH RFC 2/7] md/raid0: convert raid0_handle_discard() to use bio_submit_split()
Posted by Yu Kuai 1 month, 1 week ago
Hi,

On 2025/08/25 18:57, Christoph Hellwig wrote:
> On Mon, Aug 25, 2025 at 05:36:55PM +0800, Yu Kuai wrote:
>> +		bio = bio_submit_split(bio,
>> +				zone->zone_end - bio->bi_iter.bi_sector,
>> +				&mddev->bio_set);
> 
> Do you know why raid0 and linear use mddev->bio_set for splitting
> instead of their own split bio_sets like raid1/10/5?  Is this safe?
> 

I think it's not safe, as the mddev->bio_split pool size is just 2;
reusing this pool to split multiple times before submitting would need
a greater pool size to make this work.

By the way, do you think it would be better to increase the
disk->bio_split pool size to 4 and convert all mdraid internal splits
to use disk->bio_split directly?

Thanks,
Kuai

> Otherwise this looks nice.

Re: [PATCH RFC 2/7] md/raid0: convert raid0_handle_discard() to use bio_submit_split()
Posted by Christoph Hellwig 1 month, 1 week ago
On Tue, Aug 26, 2025 at 09:08:33AM +0800, Yu Kuai wrote:
> On 2025/08/25 18:57, Christoph Hellwig wrote:
> > On Mon, Aug 25, 2025 at 05:36:55PM +0800, Yu Kuai wrote:
> > > +		bio = bio_submit_split(bio,
> > > +				zone->zone_end - bio->bi_iter.bi_sector,
> > > +				&mddev->bio_set);
> > 
> > Do you know why raid0 and linear use mddev->bio_set for splitting
> > instead of their own split bio_sets like raid1/10/5?  Is this safe?
> > 
> 
> I think it's not safe, as the mddev->bio_split pool size is just 2;
> reusing this pool to split multiple times before submitting would need
> a greater pool size to make this work.
> 
> By the way, do you think it would be better to increase the
> disk->bio_split pool size to 4 and convert all mdraid internal splits
> to use disk->bio_split directly?

I don't really know where that magic number 4 or even the current number
comes from, but I think Jens might be amenable to a small increase with a
good explanation.

Re: [PATCH RFC 2/7] md/raid0: convert raid0_handle_discard() to use bio_submit_split()
Posted by Yu Kuai 1 month, 1 week ago
Hi,

On 2025/08/26 15:54, Christoph Hellwig wrote:
> On Tue, Aug 26, 2025 at 09:08:33AM +0800, Yu Kuai wrote:
>> On 2025/08/25 18:57, Christoph Hellwig wrote:
>>> On Mon, Aug 25, 2025 at 05:36:55PM +0800, Yu Kuai wrote:
>>>> +		bio = bio_submit_split(bio,
>>>> +				zone->zone_end - bio->bi_iter.bi_sector,
>>>> +				&mddev->bio_set);
>>>
>>> Do you know why raid0 and linear use mddev->bio_set for splitting
>>> instead of their own split bio_sets like raid1/10/5?  Is this safe?
>>>
>>
>> I think it's not safe, as the mddev->bio_split pool size is just 2;
>> reusing this pool to split multiple times before submitting would need
>> a greater pool size to make this work.
>>
>> By the way, do you think it would be better to increase the
>> disk->bio_split pool size to 4 and convert all mdraid internal splits
>> to use disk->bio_split directly?
> 
> I don't really know where that magic number 4 or even the current number
> comes from, but I think Jens might be amenable to a small increase with a
> good explanation.

I was thinking that we have to make sure the allocated split bio is
issued before a new bio is allocated, and that number is the safe limit
on how many we can allocate before issuing.

In the case of recursive split, we can hold multiple split bios in
current->bio_list, and with this set handling split bios first, we can
guarantee that we'll hold at most 3 split bios from mdraid:
  - bio_split_to_limits(), for example, by max_sectors
  - bio_split() by internal chunksize
  - bio_split() by badblocks

That's why I said 4 should be safe :) If gendisk->bio_split can be
expanded to 4, all internal bio_split can be removed now.

Thanks,
Kuai
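
The accounting argued above can be sketched in userspace C. This is a worst-case model only, not the kernel's mempool: `struct pool`, `pool_alloc()`, `pool_free()` and `can_hold_splits()` are hypothetical names, and the model assumes that only the reserved elements are available (a real mempool can also satisfy allocations from the underlying allocator when memory is not tight). It shows why holding 3 split bios before submitting any of them does not fit in a reserve of 2 but does fit in a reserve of 4.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a bio_set's guaranteed reserve. */
struct pool {
	int reserve;	/* elements guaranteed to be available */
	int in_use;	/* elements currently held by the caller */
};

/* Take one element; false models the point where the kernel would wait. */
static bool pool_alloc(struct pool *p)
{
	if (p->in_use == p->reserve)
		return false;
	p->in_use++;
	return true;
}

/* Return one element, as happens once a split bio has completed. */
static void pool_free(struct pool *p)
{
	assert(p->in_use > 0);
	p->in_use--;
}

/*
 * Allocate `held` split bios from a pool of `reserve` elements without
 * submitting (and therefore without freeing) any of them in between.
 * Returns true if every allocation succeeded without waiting.
 */
static bool can_hold_splits(int reserve, int held)
{
	struct pool p = { .reserve = reserve, .in_use = 0 };
	bool ok = true;

	for (int i = 0; i < held && ok; i++)
		ok = pool_alloc(&p);

	while (p.in_use > 0)	/* release whatever was taken */
		pool_free(&p);

	return ok;
}
```

In this model, can_hold_splits(2, 3) is false while can_hold_splits(4, 3) is true, which matches the reasoning that three outstanding splits need a reserve larger than the current 2.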
