From: Yu Kuai <yukuai3@huawei.com>
On the one hand, unify the bio split code in preparation for fixing
disordered split IO; on the other hand, fix the missing
blkcg_bio_issue_init() and trace_block_split() calls for split IO.
Note that raid0_make_request() already fixes disordered split IO since
commit 319ff40a5427 ("md/raid0: Fix performance regression for large
sequential writes"), by converting the bio to the underlying disks before
submit_bio_noacct(). This relies on md_submit_bio() having already split
the bio by sectors, so raid0_make_request() splits at most once for
unaligned IO. This is a bit hacky and we'll convert it to a general
solution later.
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
drivers/md/raid0.c | 20 ++++++++------------
1 file changed, 8 insertions(+), 12 deletions(-)
diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index f1d8811a542a..19b5faf238b7 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -463,21 +463,17 @@ static void raid0_handle_discard(struct mddev *mddev, struct bio *bio)
 	zone = find_zone(conf, &start);
 
 	if (bio_end_sector(bio) > zone->zone_end) {
-		struct bio *split = bio_split(bio,
-			zone->zone_end - bio->bi_iter.bi_sector, GFP_NOIO,
-			&mddev->bio_set);
-
-		if (IS_ERR(split)) {
-			bio->bi_status = errno_to_blk_status(PTR_ERR(split));
-			bio_endio(bio);
+		bio = bio_submit_split(bio,
+				zone->zone_end - bio->bi_iter.bi_sector,
+				&mddev->bio_set);
+		if (!bio)
 			return;
-		}
-		bio_chain(split, bio);
-		submit_bio_noacct(bio);
-		bio = split;
+
+		bio->bi_opf &= ~REQ_NOMERGE;
 		end = zone->zone_end;
-	} else
+	} else {
 		end = bio_end_sector(bio);
+	}
 
 	orig_end = end;
 	if (zone != conf->strip_zone)
--
2.39.2
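
The zone-boundary split in the hunk above can be modelled in user space.
The following is only a sketch of the sector arithmetic, not kernel code:
struct toy_bio, toy_end_sector() and toy_split_at_zone_end() are
hypothetical names invented for illustration, and nothing here submits IO
or allocates from a bio_set. The bio that stays in the caller's hands is
the front part inside the zone, matching what the patch keeps working on
after the split.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* Hypothetical stand-in for the few struct bio fields used here. */
struct toy_bio {
	sector_t start;		/* first sector, like bi_iter.bi_sector */
	sector_t nr_sectors;	/* length in sectors */
};

/* First sector past the end of the bio, like bio_end_sector(). */
static sector_t toy_end_sector(const struct toy_bio *bio)
{
	return bio->start + bio->nr_sectors;
}

/*
 * If @bio crosses @zone_end, trim it to the part inside the zone and
 * describe the remainder in @rest. Returns true when a split happened.
 */
static bool toy_split_at_zone_end(struct toy_bio *bio, sector_t zone_end,
				  struct toy_bio *rest)
{
	sector_t front;

	assert(bio->start < zone_end);
	if (toy_end_sector(bio) <= zone_end)
		return false;

	/* Same expression the patch passes as the split size. */
	front = zone_end - bio->start;
	rest->start = zone_end;
	rest->nr_sectors = bio->nr_sectors - front;
	bio->nr_sectors = front;
	return true;
}
```

For a bio covering sectors 100..149 and a zone ending at sector 128, the
front part is 28 sectors long and the remainder starts at sector 128 with
22 sectors.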
On Mon, Aug 25, 2025 at 05:36:55PM +0800, Yu Kuai wrote:
> +		bio = bio_submit_split(bio,
> +				zone->zone_end - bio->bi_iter.bi_sector,
> +				&mddev->bio_set);

Do you know why raid0 and linear use mddev->bio_set for splitting
instead of their own split bio_sets like raid1/10/5?  Is this safe?

Otherwise this looks nice.
Hi,

On 2025/08/25 18:57, Christoph Hellwig wrote:
> On Mon, Aug 25, 2025 at 05:36:55PM +0800, Yu Kuai wrote:
>> +		bio = bio_submit_split(bio,
>> +				zone->zone_end - bio->bi_iter.bi_sector,
>> +				&mddev->bio_set);
>
> Do you know why raid0 and linear use mddev->bio_set for splitting
> instead of their own split bio_sets like raid1/10/5?  Is this safe?
>

I think it's not safe: the mddev->bio_set pool size is just 2, and reusing
this pool to split multiple times before submitting needs a greater pool
size to work.

By the way, do you think it's better to increase the disk->bio_split pool
size to 4 and convert all mdraid internal splits to use disk->bio_split
directly?

Thanks,
Kuai

> Otherwise this looks nice.
On Tue, Aug 26, 2025 at 09:08:33AM +0800, Yu Kuai wrote:
> On 2025/08/25 18:57, Christoph Hellwig wrote:
> > On Mon, Aug 25, 2025 at 05:36:55PM +0800, Yu Kuai wrote:
> > > +		bio = bio_submit_split(bio,
> > > +				zone->zone_end - bio->bi_iter.bi_sector,
> > > +				&mddev->bio_set);
> >
> > Do you know why raid0 and linear use mddev->bio_set for splitting
> > instead of their own split bio_sets like raid1/10/5?  Is this safe?
> >
>
> I think it's not safe: the mddev->bio_set pool size is just 2, and reusing
> this pool to split multiple times before submitting needs a greater pool
> size to work.
>
> By the way, do you think it's better to increase the disk->bio_split pool
> size to 4 and convert all mdraid internal splits to use disk->bio_split
> directly?

I don't really know where that magic number 4 or even the current number
comes from, but I think Jens might be amenable to a small increase with a
good explanation.
Hi,

On 2025/08/26 15:54, Christoph Hellwig wrote:
> On Tue, Aug 26, 2025 at 09:08:33AM +0800, Yu Kuai wrote:
>> On 2025/08/25 18:57, Christoph Hellwig wrote:
>>> On Mon, Aug 25, 2025 at 05:36:55PM +0800, Yu Kuai wrote:
>>>> +		bio = bio_submit_split(bio,
>>>> +				zone->zone_end - bio->bi_iter.bi_sector,
>>>> +				&mddev->bio_set);
>>>
>>> Do you know why raid0 and linear use mddev->bio_set for splitting
>>> instead of their own split bio_sets like raid1/10/5?  Is this safe?
>>>
>>
>> I think it's not safe: the mddev->bio_set pool size is just 2, and reusing
>> this pool to split multiple times before submitting needs a greater pool
>> size to work.
>>
>> By the way, do you think it's better to increase the disk->bio_split pool
>> size to 4 and convert all mdraid internal splits to use disk->bio_split
>> directly?
>
> I don't really know where that magic number 4 or even the current number
> comes from, but I think Jens might be amenable to a small increase with a
> good explanation.

My thinking is that we have to issue an allocated split bio before
allocating a new one, and that number is the safe limit on how many split
bios we can allocate before issuing.

In the case of recursive splits we can hold multiple split bios in
current->bio_list, and with this set handling split bios first, we can
guarantee that we hold at most 3 split bios from mdraid:

 - bio_split_to_limits(), for example, by max_sectors
 - bio_split() by internal chunksize
 - bio_split() by badblocks

That's why I said 4 should be safe :) If gendisk->bio_split can be
expanded to 4, all internal bio_split pools can be removed now.

Thanks,
Kuai
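
The pool-size argument above can be illustrated with a small user-space
model. This is only a sketch under stated assumptions: struct toy_pool,
toy_alloc() and toy_nested_splits_fit() are hypothetical names, not the
kernel mempool or bio_set API, and the model looks only at the worst case
where nothing but the pool's reserve is available, so an allocation beyond
the reserve counts as one that would have to wait.

```c
#include <stdbool.h>

/* Hypothetical model of a reserve-backed pool; not the kernel mempool API. */
struct toy_pool {
	int reserve;	/* guaranteed elements, like a bio_set pool size */
	int in_use;	/* elements allocated and not yet issued */
};

/* Take one element; false models an allocation that would have to wait. */
static bool toy_alloc(struct toy_pool *pool)
{
	if (pool->in_use >= pool->reserve)
		return false;
	pool->in_use++;
	return true;
}

/*
 * Allocate @depth split bios back to back without issuing any of them,
 * as in the three-level case from the thread (queue limits, chunk size,
 * bad blocks). Returns true if the reserve alone covers all of them.
 */
static bool toy_nested_splits_fit(int reserve, int depth)
{
	struct toy_pool pool = { .reserve = reserve, .in_use = 0 };

	for (int i = 0; i < depth; i++) {
		if (!toy_alloc(&pool))
			return false;
	}
	return true;
}
```

In this model a reserve of 2 cannot cover three outstanding split bios,
while a reserve of 4 covers them with one element to spare.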