md/raid1: don't split discard io for write behind

[PATCH -next] md/raid1: don't split discard io for write behind

Posted by Yu Kuai 2 years, 4 months ago

From: Yu Kuai <yukuai3@huawei.com>

Currently, discard io is treated the same as normal write io, and for
the write behind case, io size is limited to:

BIO_MAX_VECS * (PAGE_SIZE >> 9)

For 0.5KB sector size and 4KB PAGE_SIZE, this is just 1MB. As a
consequence, if 'WriteMostly' is set on one of the underlying disks,
then discard io will be split into 1MB pieces and it will take a long
time for the discard to finish.

Fix this problem by disabling write behind for discard io.

Reported-by: Roman Mamedov <rm@romanrm.net>
Closes: https://lore.kernel.org/all/6a1165f7-c792-c054-b8f0-1ad4f7b8ae01@ultracoder.org/
Reported-and-tested-by: Kirill Kirilenko <kirill@ultracoder.org>
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
 drivers/md/raid1.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 3a78f79ee6d5..35d12948e0a9 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1345,6 +1345,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
 	int first_clone;
 	int max_sectors;
 	bool write_behind = false;
+	bool is_discard = (bio_op(bio) == REQ_OP_DISCARD);
 
 	if (mddev_is_clustered(mddev) &&
 	     md_cluster_ops->area_resyncing(mddev, WRITE,
@@ -1405,7 +1406,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
 		 * write-mostly, which means we could allocate write behind
 		 * bio later.
 		 */
-		if (rdev && test_bit(WriteMostly, &rdev->flags))
+		if (!is_discard && rdev && test_bit(WriteMostly, &rdev->flags))
 			write_behind = true;
 
 		if (rdev && unlikely(test_bit(Blocked, &rdev->flags))) {
-- 
2.39.2

Re: [PATCH -next] md/raid1: don't split discard io for write behind

Posted by Song Liu 2 years, 4 months ago

On Fri, Oct 6, 2023 at 8:24 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> From: Yu Kuai <yukuai3@huawei.com>
>
> Currently, discard io is treated the same as normal write io, and for
> the write behind case, io size is limited to:
>
> BIO_MAX_VECS * (PAGE_SIZE >> 9)
>
> For 0.5KB sector size and 4KB PAGE_SIZE, this is just 1MB. As a
> consequence, if 'WriteMostly' is set on one of the underlying disks,
> then discard io will be split into 1MB pieces and it will take a long
> time for the discard to finish.
>
> Fix this problem by disabling write behind for discard io.
>
> Reported-by: Roman Mamedov <rm@romanrm.net>
> Closes: https://lore.kernel.org/all/6a1165f7-c792-c054-b8f0-1ad4f7b8ae01@ultracoder.org/
> Reported-and-tested-by: Kirill Kirilenko <kirill@ultracoder.org>
> Signed-off-by: Yu Kuai <yukuai3@huawei.com>

Applied to md-next. Thanks!

Song

> ---
>  drivers/md/raid1.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index 3a78f79ee6d5..35d12948e0a9 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -1345,6 +1345,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
>         int first_clone;
>         int max_sectors;
>         bool write_behind = false;
> +       bool is_discard = (bio_op(bio) == REQ_OP_DISCARD);
>
>         if (mddev_is_clustered(mddev) &&
>              md_cluster_ops->area_resyncing(mddev, WRITE,
> @@ -1405,7 +1406,7 @@ static void raid1_write_request(struct mddev *mddev, struct bio *bio,
>                  * write-mostly, which means we could allocate write behind
>                  * bio later.
>                  */
> -               if (rdev && test_bit(WriteMostly, &rdev->flags))
> +               if (!is_discard && rdev && test_bit(WriteMostly, &rdev->flags))
>                         write_behind = true;
>
>                 if (rdev && unlikely(test_bit(Blocked, &rdev->flags))) {
> --
> 2.39.2
>