Set chunk_sectors to the full stripe width (io_opt) so that the block
layer splits I/O at full stripe boundaries. This ensures that large
writes are aligned to full stripes, avoiding the read-modify-write
overhead that occurs with partial stripe writes in RAID-5/6.
When chunk_sectors is set, the block layer's bio splitting logic in
get_max_io_size() uses blk_boundary_sectors_left() to limit I/O size
to the boundary. This naturally aligns split bios to full stripe
boundaries, enabling more efficient full stripe writes.
Test results with 24-disk RAID5 (chunk_size=64k):
dd if=/dev/zero of=/dev/md0 bs=10M oflag=direct
Before: 461 MB/s
After: 520 MB/s (+12.8%)
Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Yu Kuai <yukuai@fnnas.com>
---
drivers/md/raid5.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 8854e024f311..810d936560d1 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -7783,6 +7783,7 @@ static int raid5_set_limits(struct mddev *mddev)
lim.logical_block_size = mddev->logical_block_size;
lim.io_min = mddev->chunk_sectors << 9;
lim.io_opt = lim.io_min * (conf->raid_disks - conf->max_degraded);
+ lim.chunk_sectors = lim.io_opt >> 9;
lim.features |= BLK_FEAT_RAID_PARTIAL_STRIPES_EXPENSIVE;
lim.discard_granularity = stripe;
lim.max_write_zeroes_sectors = 0;
--
2.51.0
On 2026/2/23 11:58, Yu Kuai wrote:
> Set chunk_sectors to the full stripe width (io_opt) so that the block
> layer splits I/O at full stripe boundaries. This ensures that large
> writes are aligned to full stripes, avoiding the read-modify-write
> overhead that occurs with partial stripe writes in RAID-5/6.
> [...]

Applied to md-7.0

--
Thanks,
Kuai
Looks good: Reviewed-by: Christoph Hellwig <hch@lst.de>
Dear Kuai,

Thank you for your patch.

On 23.02.26 04:58, Yu Kuai wrote:
> Set chunk_sectors to the full stripe width (io_opt) so that the block
> layer splits I/O at full stripe boundaries.
> [...]
> Test results with 24-disk RAID5 (chunk_size=64k):
> dd if=/dev/zero of=/dev/md0 bs=10M oflag=direct
> Before: 461 MB/s
> After: 520 MB/s (+12.8%)

Sweet.

> Suggested-by: Christoph Hellwig <hch@infradead.org>
> Signed-off-by: Yu Kuai <yukuai@fnnas.com>

Once committed, should this be backported to the longterm and stable series?

> [...]
> +	lim.chunk_sectors = lim.io_opt >> 9;
> [...]

Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>

Kind regards,

Paul
Hi,

On 2026/2/23 15:26, Paul Menzel wrote:
> [...]
> Once committed, should this be backported to the longterm and stable
> series?

Sorry for the delay. This is just a performance improvement, so I don't
expect it to be backported automatically. The patch would need to be sent
to the LTS trees with some explanation if someone is interested.

> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>

--
Thanks,
Kuai