From nobody Thu Dec 18 08:59:17 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C88E6C7EE2E for ; Mon, 29 May 2023 13:15:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229656AbjE2NPZ (ORCPT ); Mon, 29 May 2023 09:15:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42058 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229717AbjE2NPK (ORCPT ); Mon, 29 May 2023 09:15:10 -0400 Received: from dggsgout11.his.huawei.com (unknown [45.249.212.51]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4EB3D107; Mon, 29 May 2023 06:14:50 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.153]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4QVGGD2PSBz4f41Vd; Mon, 29 May 2023 21:14:44 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgAHcLNDpXRknNjnKQ--.28139S5; Mon, 29 May 2023 21:14:45 +0800 (CST) From: Yu Kuai To: song@kernel.org, neilb@suse.de, akpm@osdl.org Cc: xni@redhat.com, linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next v3 1/7] md/raid10: prevent soft lockup while flush writes Date: Mon, 29 May 2023 21:11:00 +0800 Message-Id: <20230529131106.2123367-2-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230529131106.2123367-1-yukuai1@huaweicloud.com> References: <20230529131106.2123367-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgAHcLNDpXRknNjnKQ--.28139S5 X-Coremail-Antispam: 1UD129KBjvJXoW7ZFyxGw43ZryUArW3Jw1rXrb_yoW8ZFyUpa 90gFWYyw4UCw13AwsIyF4IgFyrZa90q3y7CFWvyw13XF13XFyUGa1DJrWjgrWDuryfGrW3 CF4vkrZ7Xw15tFJanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUBK14x267AKxVW5JVWrJwAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jr4l82xGYIkIc2 x26xkF7I0E14v26r1I6r4UM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Cr1j6rxdM2 8EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v26rxl6s0DM2AI xVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6xIIjxv20x vE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_Gr1lF7xv r2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7M4IIrI8v6xkF7I0E8cxan2IY04 v7MxAIw28IcxkI7VAKI48JMxC20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAFwI0_ Jr0_Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17CEb7AF67AKxVWUtVW8ZwCIc40Y0x 0EwIxGrwCI42IY6xIIjxv20xvE14v26r1j6r1xMIIF0xvE2Ix0cI8IcVCY1x0267AKxVW8 JVWxJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIx AIcVC2z280aVCY1x0267AKxVW8JVW8JrUvcSsGvfC2KfnxnUUI43ZEXa7VUbec_DUUUUU= = X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Yu Kuai Currently, there is no limit for raid1/raid10 plugged bio. While flushing writes, raid1 has cond_resched() while raid10 doesn't, and too many writes can cause soft lockup. Follow up soft lockup can be triggered easily with writeback test for raid10 with ramdisks: watchdog: BUG: soft lockup - CPU#10 stuck for 27s! [md0_raid10:1293] Call Trace: call_rcu+0x16/0x20 put_object+0x41/0x80 __delete_object+0x50/0x90 delete_object_full+0x2b/0x40 kmemleak_free+0x46/0xa0 slab_free_freelist_hook.constprop.0+0xed/0x1a0 kmem_cache_free+0xfd/0x300 mempool_free_slab+0x1f/0x30 mempool_free+0x3a/0x100 bio_free+0x59/0x80 bio_put+0xcf/0x2c0 free_r10bio+0xbf/0xf0 raid_end_bio_io+0x78/0xb0 one_write_done+0x8a/0xa0 raid10_end_write_request+0x1b4/0x430 bio_endio+0x175/0x320 brd_submit_bio+0x3b9/0x9b7 [brd] __submit_bio+0x69/0xe0 submit_bio_noacct_nocheck+0x1e6/0x5a0 submit_bio_noacct+0x38c/0x7e0 flush_pending_writes+0xf0/0x240 raid10d+0xac/0x1ed0 Fix the problem by adding cond_resched() to raid10 like what raid1 did. Note that unlimited plugged bio still need to be optimized, for example, in the case of lots of dirty pages writeback, this will take lots of memory and io will spend a long time in plug, hence io latency is bad. Signed-off-by: Yu Kuai --- drivers/md/raid10.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index 32fb4ff0acdb..6b31f848a6d9 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -921,6 +921,7 @@ static void flush_pending_writes(struct r10conf *conf) else submit_bio_noacct(bio); bio =3D next; + cond_resched(); } blk_finish_plug(&plug); } else @@ -1145,6 +1146,7 @@ static void raid10_unplug(struct blk_plug_cb *cb, boo= l from_schedule) else submit_bio_noacct(bio); bio =3D next; + cond_resched(); } kfree(plug); } --=20 2.39.2 From nobody Thu Dec 18 08:59:17 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 39EB6C77B7A for ; Mon, 29 May 2023 13:15:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229755AbjE2NPS (ORCPT ); Mon, 29 May 2023 09:15:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42028 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229564AbjE2NPH (ORCPT ); Mon, 29 May 2023 09:15:07 -0400 Received: from dggsgout11.his.huawei.com (unknown [45.249.212.51]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 400D5F9; Mon, 29 May 2023 06:14:49 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.153]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4QVGGD5WPQz4f454K; Mon, 29 May 2023 21:14:44 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgAHcLNDpXRknNjnKQ--.28139S6; Mon, 29 May 2023 21:14:45 +0800 (CST) From: Yu Kuai To: song@kernel.org, neilb@suse.de, akpm@osdl.org Cc: xni@redhat.com, linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next v3 2/7] md/raid1-10: factor out a helper to add bio to plug Date: Mon, 29 May 2023 21:11:01 +0800 Message-Id: <20230529131106.2123367-3-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230529131106.2123367-1-yukuai1@huaweicloud.com> References: <20230529131106.2123367-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgAHcLNDpXRknNjnKQ--.28139S6 X-Coremail-Antispam: 1UD129KBjvJXoWxXrWUXFWkCr1xtw43ury7Awb_yoW5uw4fpa 15KFyavrWDXrW5Xw1kJF4DuF45K3ZIgFZFkr93C3s3Jr17XFWUWa15JFWrCr98ZFZxury7 Jrn0krsrCF43KFUanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUBE14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jryl82xGYIkIc2 x26xkF7I0E14v26r4j6ryUM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Cr1j6rxdM2 8EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v26rxl6s0DM2AI xVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6xIIjxv20x vE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_Gr1lF7xv r2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7M4IIrI8v6xkF7I0E8cxan2IY04 v7MxAIw28IcxkI7VAKI48JMxC20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAFwI0_ Jr0_Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17CEb7AF67AKxVWUtVW8ZwCIc40Y0x 0EwIxGrwCI42IY6xIIjxv20xvE14v26r1j6r1xMIIF0xvE2Ix0cI8IcVCY1x0267AKxVWx JVW8Jr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r1j6r4UMI IF0xvEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0JUc6pPUUUUU = X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Yu Kuai The code in raid1 and raid10 is identical, prepare to limit the number of plugged bios. Signed-off-by: Yu Kuai --- drivers/md/raid1-10.c | 16 ++++++++++++++++ drivers/md/raid1.c | 12 +----------- drivers/md/raid10.c | 11 +---------- 3 files changed, 18 insertions(+), 21 deletions(-) diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c index e61f6cad4e08..9bf19a3409ce 100644 --- a/drivers/md/raid1-10.c +++ b/drivers/md/raid1-10.c @@ -109,3 +109,19 @@ static void md_bio_reset_resync_pages(struct bio *bio,= struct resync_pages *rp, size -=3D len; } while (idx++ < RESYNC_PAGES && size > 0); } + +static inline bool raid1_add_bio_to_plug(struct mddev *mddev, struct bio *= bio, + blk_plug_cb_fn unplug) +{ + struct raid1_plug_cb *plug =3D NULL; + struct blk_plug_cb *cb =3D blk_check_plugged(unplug, mddev, + sizeof(*plug)); + + if (!cb) + return false; + + plug =3D container_of(cb, struct raid1_plug_cb, cb); + bio_list_add(&plug->pending, bio); + + return true; +} diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 2f1011ffdf09..e86c5e71c604 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -1343,8 +1343,6 @@ static void raid1_write_request(struct mddev *mddev, = struct bio *bio, struct bitmap *bitmap =3D mddev->bitmap; unsigned long flags; struct md_rdev *blocked_rdev; - struct blk_plug_cb *cb; - struct raid1_plug_cb *plug =3D NULL; int first_clone; int max_sectors; bool write_behind =3D false; @@ -1573,15 +1571,7 @@ static void raid1_write_request(struct mddev *mddev,= struct bio *bio, r1_bio->sector); /* flush_pending_writes() needs access to the rdev so...*/ mbio->bi_bdev =3D (void *)rdev; - - cb =3D blk_check_plugged(raid1_unplug, mddev, sizeof(*plug)); - if (cb) - plug =3D container_of(cb, struct raid1_plug_cb, cb); - else - plug =3D NULL; - if (plug) { - bio_list_add(&plug->pending, mbio); - } else { + if (!raid1_add_bio_to_plug(mddev, mbio, raid1_unplug)) { spin_lock_irqsave(&conf->device_lock, flags); bio_list_add(&conf->pending_bio_list, mbio); spin_unlock_irqrestore(&conf->device_lock, flags); diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index 6b31f848a6d9..18702051ebd1 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -1287,8 +1287,6 @@ static void raid10_write_one_disk(struct mddev *mddev= , struct r10bio *r10_bio, const blk_opf_t do_sync =3D bio->bi_opf & REQ_SYNC; const blk_opf_t do_fua =3D bio->bi_opf & REQ_FUA; unsigned long flags; - struct blk_plug_cb *cb; - struct raid1_plug_cb *plug =3D NULL; struct r10conf *conf =3D mddev->private; struct md_rdev *rdev; int devnum =3D r10_bio->devs[n_copy].devnum; @@ -1328,14 +1326,7 @@ static void raid10_write_one_disk(struct mddev *mdde= v, struct r10bio *r10_bio, =20 atomic_inc(&r10_bio->remaining); =20 - cb =3D blk_check_plugged(raid10_unplug, mddev, sizeof(*plug)); - if (cb) - plug =3D container_of(cb, struct raid1_plug_cb, cb); - else - plug =3D NULL; - if (plug) { - bio_list_add(&plug->pending, mbio); - } else { + if (!raid1_add_bio_to_plug(mddev, mbio, raid10_unplug)) { spin_lock_irqsave(&conf->device_lock, flags); bio_list_add(&conf->pending_bio_list, mbio); spin_unlock_irqrestore(&conf->device_lock, flags); --=20 2.39.2 From nobody Thu Dec 18 08:59:17 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 682FDC77B7A for ; Mon, 29 May 2023 13:15:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229764AbjE2NPW (ORCPT ); Mon, 29 May 2023 09:15:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42032 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229662AbjE2NPH (ORCPT ); Mon, 29 May 2023 09:15:07 -0400 Received: from dggsgout11.his.huawei.com (unknown [45.249.212.51]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 45CFE106; Mon, 29 May 2023 06:14:49 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.153]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4QVGGF2nz1z4f44Cn; Mon, 29 May 2023 21:14:45 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgAHcLNDpXRknNjnKQ--.28139S7; Mon, 29 May 2023 21:14:46 +0800 (CST) From: Yu Kuai To: song@kernel.org, neilb@suse.de, akpm@osdl.org Cc: xni@redhat.com, linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next v3 3/7] md/raid1-10: factor out a helper to submit normal write Date: Mon, 29 May 2023 21:11:02 +0800 Message-Id: <20230529131106.2123367-4-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230529131106.2123367-1-yukuai1@huaweicloud.com> References: <20230529131106.2123367-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgAHcLNDpXRknNjnKQ--.28139S7 X-Coremail-Antispam: 1UD129KBjvJXoWxAF1rXFW5ZFyfXFyUWr48JFb_yoW5tr1xp3 9Iqa4fZ3y7JF47Wan8Cay8Ja4Fga1DtrWUuFW7CayfAFW3ZFyDta1kJry0gryDAFyrCry7 ZF18K39rWa13JFDanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUBE14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JrWl82xGYIkIc2 x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Cr1j6rxdM2 8EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v26rxl6s0DM2AI xVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6xIIjxv20x vE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_Gr1lF7xv r2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7M4IIrI8v6xkF7I0E8cxan2IY04 v7MxAIw28IcxkI7VAKI48JMxC20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAFwI0_ Jr0_Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17CEb7AF67AKxVWUtVW8ZwCIc40Y0x 0EwIxGrwCI42IY6xIIjxv20xvE14v26r1I6r4UMIIF0xvE2Ix0cI8IcVCY1x0267AKxVWx JVW8Jr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r1j6r4UMI IF0xvEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0JUd8n5UUUUU = X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Yu Kuai There are multiple places to do the same thing, factor out a helper to prevent redundant code, and the helper will be used in following patch as well. Signed-off-by: Yu Kuai --- drivers/md/raid1-10.c | 17 +++++++++++++++++ drivers/md/raid1.c | 13 ++----------- drivers/md/raid10.c | 26 ++++---------------------- 3 files changed, 23 insertions(+), 33 deletions(-) diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c index 9bf19a3409ce..506299bd55cb 100644 --- a/drivers/md/raid1-10.c +++ b/drivers/md/raid1-10.c @@ -110,6 +110,23 @@ static void md_bio_reset_resync_pages(struct bio *bio,= struct resync_pages *rp, } while (idx++ < RESYNC_PAGES && size > 0); } =20 + +static inline void raid1_submit_write(struct bio *bio) +{ + struct md_rdev *rdev =3D (struct md_rdev *)bio->bi_bdev; + + bio->bi_next =3D NULL; + bio_set_dev(bio, rdev->bdev); + if (test_bit(Faulty, &rdev->flags)) + bio_io_error(bio); + else if (unlikely(bio_op(bio) =3D=3D REQ_OP_DISCARD && + !bdev_max_discard_sectors(bio->bi_bdev))) + /* Just ignore it */ + bio_endio(bio); + else + submit_bio_noacct(bio); +} + static inline bool raid1_add_bio_to_plug(struct mddev *mddev, struct bio *= bio, blk_plug_cb_fn unplug) { diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index e86c5e71c604..0778e398584c 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -799,17 +799,8 @@ static void flush_bio_list(struct r1conf *conf, struct= bio *bio) =20 while (bio) { /* submit pending writes */ struct bio *next =3D bio->bi_next; - struct md_rdev *rdev =3D (void *)bio->bi_bdev; - bio->bi_next =3D NULL; - bio_set_dev(bio, rdev->bdev); - if (test_bit(Faulty, &rdev->flags)) { - bio_io_error(bio); - } else if (unlikely((bio_op(bio) =3D=3D REQ_OP_DISCARD) && - !bdev_max_discard_sectors(bio->bi_bdev))) - /* Just ignore it */ - bio_endio(bio); - else - submit_bio_noacct(bio); + + raid1_submit_write(bio); bio =3D next; cond_resched(); } diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index 18702051ebd1..6640507ecb0d 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -909,17 +909,8 @@ static void flush_pending_writes(struct r10conf *conf) =20 while (bio) { /* submit pending writes */ struct bio *next =3D bio->bi_next; - struct md_rdev *rdev =3D (void*)bio->bi_bdev; - bio->bi_next =3D NULL; - bio_set_dev(bio, rdev->bdev); - if (test_bit(Faulty, &rdev->flags)) { - bio_io_error(bio); - } else if (unlikely((bio_op(bio) =3D=3D REQ_OP_DISCARD) && - !bdev_max_discard_sectors(bio->bi_bdev))) - /* Just ignore it */ - bio_endio(bio); - else - submit_bio_noacct(bio); + + raid1_submit_write(bio); bio =3D next; cond_resched(); } @@ -1134,17 +1125,8 @@ static void raid10_unplug(struct blk_plug_cb *cb, bo= ol from_schedule) =20 while (bio) { /* submit pending writes */ struct bio *next =3D bio->bi_next; - struct md_rdev *rdev =3D (void*)bio->bi_bdev; - bio->bi_next =3D NULL; - bio_set_dev(bio, rdev->bdev); - if (test_bit(Faulty, &rdev->flags)) { - bio_io_error(bio); - } else if (unlikely((bio_op(bio) =3D=3D REQ_OP_DISCARD) && - !bdev_max_discard_sectors(bio->bi_bdev))) - /* Just ignore it */ - bio_endio(bio); - else - submit_bio_noacct(bio); + + raid1_submit_write(bio); bio =3D next; cond_resched(); } --=20 2.39.2 From nobody Thu Dec 18 08:59:17 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CBFC8C77B7E for ; Mon, 29 May 2023 13:15:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229782AbjE2NPa (ORCPT ); Mon, 29 May 2023 09:15:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42076 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229728AbjE2NPL (ORCPT ); Mon, 29 May 2023 09:15:11 -0400 Received: from dggsgout11.his.huawei.com (unknown [45.249.212.51]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AEF201AD; Mon, 29 May 2023 06:14:50 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.153]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4QVGGF55Sgz4f3y3M; Mon, 29 May 2023 21:14:45 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgAHcLNDpXRknNjnKQ--.28139S8; Mon, 29 May 2023 21:14:46 +0800 (CST) From: Yu Kuai To: song@kernel.org, neilb@suse.de, akpm@osdl.org Cc: xni@redhat.com, linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next v3 4/7] md/raid1-10: submit write io directly if bitmap is not enabled Date: Mon, 29 May 2023 21:11:03 +0800 Message-Id: <20230529131106.2123367-5-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230529131106.2123367-1-yukuai1@huaweicloud.com> References: <20230529131106.2123367-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgAHcLNDpXRknNjnKQ--.28139S8 X-Coremail-Antispam: 1UD129KBjvJXoWxuF18KFWfuryrKF4xGFW7Arb_yoW5AFWrpa yDGa4Ykr15JFW3X3ZxAa4DAFyFywn7tr9rKryfC395uFy3XFsxGFWrGay5twn7CrnxGFsx Xr15KryDCr1UXrJanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPF14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Cr1j6r xdM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v26rxl6s0D M2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6xIIjx v20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_Gr1l F7xvr2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7M4IIrI8v6xkF7I0E8cxan2 IY04v7MxAIw28IcxkI7VAKI48JMxC20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAF wI0_Jr0_Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17CEb7AF67AKxVWUtVW8ZwCIc4 0Y0x0EwIxGrwCI42IY6xIIjxv20xvE14v26r1I6r4UMIIF0xvE2Ix0cI8IcVCY1x0267AK xVWxJVW8Jr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r4j6F 4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr1j6F4UJbIYCTnIWIevJa73UjIFyTuYvjfUOBTY UUUUU X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Yu Kuai Commit 6cce3b23f6f8 ("[PATCH] md: write intent bitmap support for raid10") add bitmap support, and it changed that write io is submitted through daemon thread because bitmap need to be updated before write io. And later, plug is used to fix performance regression because all the write io will go to demon thread, which means io can't be issued concurrently. However, if bitmap is not enabled, the write io should not go to daemon thread in the first place, and plug is not needed as well. Fixes: 6cce3b23f6f8 ("[PATCH] md: write intent bitmap support for raid10") Signed-off-by: Yu Kuai --- drivers/md/md-bitmap.c | 4 +--- drivers/md/md-bitmap.h | 7 +++++++ drivers/md/raid1-10.c | 13 +++++++++++-- 3 files changed, 19 insertions(+), 5 deletions(-) diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c index ad5a3456cd8a..3ee590cf12a7 100644 --- a/drivers/md/md-bitmap.c +++ b/drivers/md/md-bitmap.c @@ -1016,7 +1016,6 @@ static int md_bitmap_file_test_bit(struct bitmap *bit= map, sector_t block) return set; } =20 - /* this gets called when the md device is ready to unplug its underlying * (slave) device queues -- before we let any writes go down, we need to * sync the dirty pages of the bitmap file to disk */ @@ -1026,8 +1025,7 @@ void md_bitmap_unplug(struct bitmap *bitmap) int dirty, need_write; int writing =3D 0; =20 - if (!bitmap || !bitmap->storage.filemap || - test_bit(BITMAP_STALE, &bitmap->flags)) + if (!md_bitmap_enabled(bitmap)) return; =20 /* look at each page to see if there are any set bits that need to be diff --git a/drivers/md/md-bitmap.h b/drivers/md/md-bitmap.h index cfd7395de8fd..3a4750952b3a 100644 --- a/drivers/md/md-bitmap.h +++ b/drivers/md/md-bitmap.h @@ -273,6 +273,13 @@ int md_bitmap_copy_from_slot(struct mddev *mddev, int = slot, sector_t *lo, sector_t *hi, bool clear_bits); void md_bitmap_free(struct bitmap *bitmap); void md_bitmap_wait_behind_writes(struct mddev *mddev); + +static inline bool md_bitmap_enabled(struct bitmap *bitmap) +{ + return bitmap && bitmap->storage.filemap && + !test_bit(BITMAP_STALE, &bitmap->flags); +} + #endif =20 #endif diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c index 506299bd55cb..73cc3cb9154d 100644 --- a/drivers/md/raid1-10.c +++ b/drivers/md/raid1-10.c @@ -131,9 +131,18 @@ static inline bool raid1_add_bio_to_plug(struct mddev = *mddev, struct bio *bio, blk_plug_cb_fn unplug) { struct raid1_plug_cb *plug =3D NULL; - struct blk_plug_cb *cb =3D blk_check_plugged(unplug, mddev, - sizeof(*plug)); + struct blk_plug_cb *cb; + + /* + * If bitmap is not enabled, it's safe to submit the io directly, and + * this can get optimal performance. + */ + if (!md_bitmap_enabled(mddev->bitmap)) { + raid1_submit_write(bio); + return true; + } =20 + cb =3D blk_check_plugged(unplug, mddev, sizeof(*plug)); if (!cb) return false; =20 --=20 2.39.2 From nobody Thu Dec 18 08:59:17 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 28FF3C77B7E for ; Mon, 29 May 2023 13:15:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229776AbjE2NP2 (ORCPT ); Mon, 29 May 2023 09:15:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42078 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229730AbjE2NPL (ORCPT ); Mon, 29 May 2023 09:15:11 -0400 Received: from dggsgout11.his.huawei.com (unknown [45.249.212.51]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 640BC110; Mon, 29 May 2023 06:14:50 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.153]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4QVGGG11Hmz4f3pBW; Mon, 29 May 2023 21:14:46 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgAHcLNDpXRknNjnKQ--.28139S9; Mon, 29 May 2023 21:14:47 +0800 (CST) From: Yu Kuai To: song@kernel.org, neilb@suse.de, akpm@osdl.org Cc: xni@redhat.com, linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next v3 5/7] md/md-bitmap: add a new helper to unplug bitmap asynchrously Date: Mon, 29 May 2023 21:11:04 +0800 Message-Id: <20230529131106.2123367-6-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230529131106.2123367-1-yukuai1@huaweicloud.com> References: <20230529131106.2123367-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgAHcLNDpXRknNjnKQ--.28139S9 X-Coremail-Antispam: 1UD129KBjvJXoWxZFWfZF1UZw43WF17KF17ZFb_yoWrCw1rpF W5t345Cr45JF47W345A34UuFySka4vqr9rJryfCw4ruF9xXF9xJF48GFWjywn8WFs8GFnI va1rtF98CF1Fqr7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPF14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Cr1j6r xdM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v26rxl6s0D M2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6xIIjx v20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_Gr1l F7xvr2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7M4IIrI8v6xkF7I0E8cxan2 IY04v7MxAIw28IcxkI7VAKI48JMxC20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAF wI0_Jr0_Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17CEb7AF67AKxVWUtVW8ZwCIc4 0Y0x0EwIxGrwCI42IY6xIIjxv20xvE14v26r1I6r4UMIIF0xvE2Ix0cI8IcVCY1x0267AK xVWxJVW8Jr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r4j6F 4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr1j6F4UJbIYCTnIWIevJa73UjIFyTuYvjfUOBTY UUUUU X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Yu Kuai If bitmap is enabled, bitmap must update before submitting write io, this is why unplug callback must move these io to 'conf->pending_io_list' if 'current->bio_list' is not empty, which will suffer performance degradation. A new helper md_bitmap_unplug_async() is introduced to submit bitmap io in a kworker, so that submit bitmap io in raid10_unplug() doesn't require that 'current->bio_list' is empty. This patch prepare to limit the number of plugged bio. Signed-off-by: Yu Kuai --- drivers/md/md-bitmap.c | 29 +++++++++++++++++++++++++++++ drivers/md/md-bitmap.h | 1 + drivers/md/md.c | 9 +++++++++ drivers/md/md.h | 1 + 4 files changed, 40 insertions(+) diff --git a/drivers/md/md-bitmap.c b/drivers/md/md-bitmap.c index 3ee590cf12a7..25cd72b317f1 100644 --- a/drivers/md/md-bitmap.c +++ b/drivers/md/md-bitmap.c @@ -1054,6 +1054,35 @@ void md_bitmap_unplug(struct bitmap *bitmap) } EXPORT_SYMBOL(md_bitmap_unplug); =20 +struct bitmap_unplug_work { + struct work_struct work; + struct bitmap *bitmap; + struct completion *done; +}; + +static void md_bitmap_unplug_fn(struct work_struct *work) +{ + struct bitmap_unplug_work *unplug_work =3D + container_of(work, struct bitmap_unplug_work, work); + + md_bitmap_unplug(unplug_work->bitmap); + complete(unplug_work->done); +} + +void md_bitmap_unplug_async(struct bitmap *bitmap) +{ + DECLARE_COMPLETION_ONSTACK(done); + struct bitmap_unplug_work unplug_work; + + INIT_WORK(&unplug_work.work, md_bitmap_unplug_fn); + unplug_work.bitmap =3D bitmap; + unplug_work.done =3D &done; + + queue_work(md_bitmap_wq, &unplug_work.work); + wait_for_completion(&done); +} +EXPORT_SYMBOL(md_bitmap_unplug_async); + static void md_bitmap_set_memory_bits(struct bitmap *bitmap, sector_t offs= et, int needed); /* * bitmap_init_from_disk -- called at bitmap_create time to initialize * the in-memory bitmap from the on-disk bitmap -- also, sets up the diff --git a/drivers/md/md-bitmap.h b/drivers/md/md-bitmap.h index 3a4750952b3a..8a3788c9bfef 100644 --- a/drivers/md/md-bitmap.h +++ b/drivers/md/md-bitmap.h @@ -264,6 +264,7 @@ void md_bitmap_sync_with_cluster(struct mddev *mddev, sector_t new_lo, sector_t new_hi); =20 void md_bitmap_unplug(struct bitmap *bitmap); +void md_bitmap_unplug_async(struct bitmap *bitmap); void md_bitmap_daemon_work(struct mddev *mddev); =20 int md_bitmap_resize(struct bitmap *bitmap, sector_t blocks, diff --git a/drivers/md/md.c b/drivers/md/md.c index e592f37a1071..a5a7af2f4e59 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -83,6 +83,7 @@ static struct module *md_cluster_mod; static DECLARE_WAIT_QUEUE_HEAD(resync_wait); static struct workqueue_struct *md_wq; static struct workqueue_struct *md_misc_wq; +struct workqueue_struct *md_bitmap_wq; =20 static int remove_and_add_spares(struct mddev *mddev, struct md_rdev *this); @@ -9636,6 +9637,11 @@ static int __init md_init(void) if (!md_misc_wq) goto err_misc_wq; =20 + md_bitmap_wq =3D alloc_workqueue("md_bitmap", WQ_MEM_RECLAIM | WQ_UNBOUND, + 0); + if (!md_bitmap_wq) + goto err_bitmap_wq; + ret =3D __register_blkdev(MD_MAJOR, "md", md_probe); if (ret < 0) goto err_md; @@ -9654,6 +9660,8 @@ static int __init md_init(void) err_mdp: unregister_blkdev(MD_MAJOR, "md"); err_md: + destroy_workqueue(md_bitmap_wq); +err_bitmap_wq: destroy_workqueue(md_misc_wq); err_misc_wq: destroy_workqueue(md_wq); @@ -9950,6 +9958,7 @@ static __exit void md_exit(void) spin_unlock(&all_mddevs_lock); =20 destroy_workqueue(md_misc_wq); + destroy_workqueue(md_bitmap_wq); destroy_workqueue(md_wq); } =20 diff --git a/drivers/md/md.h b/drivers/md/md.h index a50122165fa1..bfd2306bc750 100644 --- a/drivers/md/md.h +++ b/drivers/md/md.h @@ -852,6 +852,7 @@ struct mdu_array_info_s; struct mdu_disk_info_s; =20 extern int mdp_major; +extern struct workqueue_struct *md_bitmap_wq; void md_autostart_arrays(int part); int md_set_array_info(struct mddev *mddev, struct mdu_array_info_s *info); int md_add_new_disk(struct mddev *mddev, struct mdu_disk_info_s *info); --=20 2.39.2 From nobody Thu Dec 18 08:59:17 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3B534C77B7E for ; Mon, 29 May 2023 13:15:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229461AbjE2NPo (ORCPT ); Mon, 29 May 2023 09:15:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42118 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229747AbjE2NPQ (ORCPT ); Mon, 29 May 2023 09:15:16 -0400 Received: from dggsgout11.his.huawei.com (unknown [45.249.212.51]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A1111115; Mon, 29 May 2023 06:14:51 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.153]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4QVGGG5BYVz4f44Ct; Mon, 29 May 2023 21:14:46 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgAHcLNDpXRknNjnKQ--.28139S10; Mon, 29 May 2023 21:14:47 +0800 (CST) From: Yu Kuai To: song@kernel.org, neilb@suse.de, akpm@osdl.org Cc: xni@redhat.com, linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next v3 6/7] md/raid1-10: don't handle pluged bio by daemon thread Date: Mon, 29 May 2023 21:11:05 +0800 Message-Id: <20230529131106.2123367-7-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230529131106.2123367-1-yukuai1@huaweicloud.com> References: <20230529131106.2123367-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgAHcLNDpXRknNjnKQ--.28139S10 X-Coremail-Antispam: 1UD129KBjvJXoWxZw4DAw1ruF48GFyfCryDKFg_yoWrAFyUp3 yYqa1YgrW8GFW3Zw4DZF4DuFyFqa1vgFZrAFZ5uws5uFy3XF9xWa15GFW8t34DZrsxGFy7 Ary5trWDGa1YvFJanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPF14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Cr1j6r xdM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v26rxl6s0D M2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6xIIjx v20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_Gr1l F7xvr2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7M4IIrI8v6xkF7I0E8cxan2 IY04v7MxAIw28IcxkI7VAKI48JMxC20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAF wI0_Jr0_Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17CEb7AF67AKxVWUtVW8ZwCIc4 0Y0x0EwIxGrwCI42IY6xIIjxv20xvE14v26r1I6r4UMIIF0xvE2Ix0cI8IcVCY1x0267AK xVWxJVW8Jr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r4j6F 4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr1j6F4UJbIYCTnIWIevJa73UjIFyTuYvjfUOBTY UUUUU X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Yu Kuai current->bio_list will be set under submit_bio() context, in this case bitmap io will be added to the list and wait for current io submission to finish, while current io submission must wait for bitmap io to be done. commit 874807a83139 ("md/raid1{,0}: fix deadlock in bitmap_unplug.") fix the deadlock by handling plugged bio by daemon thread. On the one hand, the deadlock won't exist after commit a214b949d8e3 ("blk-mq: only flush requests from the plug in blk_mq_submit_bio"). On the other hand, current solution makes it impossible to flush plugged bio in raid1/10_make_request(), because this will cause that all the writes will goto daemon thread. In order to limit the number of plugged bio, commit 874807a83139 ("md/raid1{,0}: fix deadlock in bitmap_unplug.") is reverted, and the deadlock is fixed by handling bitmap io asynchronously. Signed-off-by: Yu Kuai --- drivers/md/raid1-10.c | 14 ++++++++++++++ drivers/md/raid1.c | 4 ++-- drivers/md/raid10.c | 8 +++----- 3 files changed, 19 insertions(+), 7 deletions(-) diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c index 73cc3cb9154d..17e55c1fd5a1 100644 --- a/drivers/md/raid1-10.c +++ b/drivers/md/raid1-10.c @@ -151,3 +151,17 @@ static inline bool raid1_add_bio_to_plug(struct mddev = *mddev, struct bio *bio, =20 return true; } + +/* + * current->bio_list will be set under submit_bio() context, in this case = bitmap + * io will be added to the list and wait for current io submission to fini= sh, + * while current io submission must wait for bitmap io to be done. In orde= r to + * avoid such deadlock, submit bitmap io asynchronously. + */ +static inline void raid1_prepare_flush_writes(struct bitmap *bitmap) +{ + if (current->bio_list) + md_bitmap_unplug_async(bitmap); + else + md_bitmap_unplug(bitmap); +} diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 0778e398584c..006620fed595 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -794,7 +794,7 @@ static int read_balance(struct r1conf *conf, struct r1b= io *r1_bio, int *max_sect static void flush_bio_list(struct r1conf *conf, struct bio *bio) { /* flush any pending bitmap writes to disk before proceeding w/ I/O */ - md_bitmap_unplug(conf->mddev->bitmap); + raid1_prepare_flush_writes(conf->mddev->bitmap); wake_up(&conf->wait_barrier); =20 while (bio) { /* submit pending writes */ @@ -1166,7 +1166,7 @@ static void raid1_unplug(struct blk_plug_cb *cb, bool= from_schedule) struct r1conf *conf =3D mddev->private; struct bio *bio; =20 - if (from_schedule || current->bio_list) { + if (from_schedule) { spin_lock_irq(&conf->device_lock); bio_list_merge(&conf->pending_bio_list, &plug->pending); spin_unlock_irq(&conf->device_lock); diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index 6640507ecb0d..fb22cfe94d32 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -902,9 +902,7 @@ static void flush_pending_writes(struct r10conf *conf) __set_current_state(TASK_RUNNING); =20 blk_start_plug(&plug); - /* flush any pending bitmap writes to disk - * before proceeding w/ I/O */ - md_bitmap_unplug(conf->mddev->bitmap); + raid1_prepare_flush_writes(conf->mddev->bitmap); wake_up(&conf->wait_barrier); =20 while (bio) { /* submit pending writes */ @@ -1108,7 +1106,7 @@ static void raid10_unplug(struct blk_plug_cb *cb, boo= l from_schedule) struct r10conf *conf =3D mddev->private; struct bio *bio; =20 - if (from_schedule || current->bio_list) { + if (from_schedule) { spin_lock_irq(&conf->device_lock); bio_list_merge(&conf->pending_bio_list, &plug->pending); spin_unlock_irq(&conf->device_lock); @@ -1120,7 +1118,7 @@ static void raid10_unplug(struct blk_plug_cb *cb, boo= l from_schedule) =20 /* we aren't scheduling, so we can do the write-out directly. */ bio =3D bio_list_get(&plug->pending); - md_bitmap_unplug(mddev->bitmap); + raid1_prepare_flush_writes(mddev->bitmap); wake_up(&conf->wait_barrier); =20 while (bio) { /* submit pending writes */ --=20 2.39.2 From nobody Thu Dec 18 08:59:17 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C0C02C77B7A for ; Mon, 29 May 2023 13:15:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229814AbjE2NPt (ORCPT ); Mon, 29 May 2023 09:15:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42504 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229728AbjE2NPl (ORCPT ); Mon, 29 May 2023 09:15:41 -0400 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 85C9811B; Mon, 29 May 2023 06:14:52 -0700 (PDT) Received: from mail02.huawei.com (unknown [172.30.67.153]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4QVGGG2chnz4f3sjt; Mon, 29 May 2023 21:14:46 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP4 (Coremail) with SMTP id gCh0CgAHcLNDpXRknNjnKQ--.28139S11; Mon, 29 May 2023 21:14:48 +0800 (CST) From: Yu Kuai To: song@kernel.org, neilb@suse.de, akpm@osdl.org Cc: xni@redhat.com, linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next v3 7/7] md/raid1-10: limit the number of plugged bio Date: Mon, 29 May 2023 21:11:06 +0800 Message-Id: <20230529131106.2123367-8-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20230529131106.2123367-1-yukuai1@huaweicloud.com> References: <20230529131106.2123367-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgAHcLNDpXRknNjnKQ--.28139S11 X-Coremail-Antispam: 1UD129KBjvJXoWxZr4rKF4Duw45JFyUAw1xGrg_yoW5tryDpa 1Dta4Yv3yUZrW7X3yDJa1UCFyFga1qgFWDCr95C395ZFy7XFWjga15GFWrCr1DZFZxWF9r J3Z8KrW7GF45tF7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUP214x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Cr1j6r xdM28EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v26rxl6s0D M2AIxVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6xIIjx v20xvE14v26r1j6r18McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_Gr1l F7xvr2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7M4IIrI8v6xkF7I0E8cxan2 IY04v7MxAIw28IcxkI7VAKI48JMxC20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAF wI0_Jr0_Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17CEb7AF67AKxVWUtVW8ZwCIc4 0Y0x0EwIxGrwCI42IY6xIIjxv20xvE14v26r1I6r4UMIIF0xvE2Ix0cI8IcVCY1x0267AK xVW8Jr0_Cr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVW8JV WxJwCI42IY6I8E87Iv6xkF7I0E14v26r4UJVWxJrUvcSsGvfC2KfnxnUUI43ZEXa7VUbmZ X7UUUUU== X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ X-CFilter-Loop: Reflected Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" From: Yu Kuai bio can be added to plug infinitely, and following writeback test can trigger huge amount of plugged bio: Test script: modprobe brd rd_nr=3D4 rd_size=3D10485760 mdadm -CR /dev/md0 -l10 -n4 /dev/ram[0123] --assume-clean --bitmap=3Dintern= al echo 0 > /proc/sys/vm/dirty_background_ratio fio -filename=3D/dev/md0 -ioengine=3Dlibaio -rw=3Dwrite -bs=3D4k -numjobs= =3D1 -iodepth=3D128 -name=3Dtest Test result: Monitor /sys/block/md0/inflight will found that inflight keep increasing until fio finish writing, after running for about 2 minutes: [root@fedora ~]# cat /sys/block/md0/inflight 0 4474191 Fix the problem by limiting the number of plugged bio based on the number of copies for original bio. Signed-off-by: Yu Kuai --- drivers/md/raid1-10.c | 9 ++++++++- drivers/md/raid1.c | 2 +- drivers/md/raid10.c | 2 +- 3 files changed, 10 insertions(+), 3 deletions(-) diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c index 17e55c1fd5a1..bb1e23b66c45 100644 --- a/drivers/md/raid1-10.c +++ b/drivers/md/raid1-10.c @@ -21,6 +21,7 @@ #define IO_MADE_GOOD ((struct bio *)2) =20 #define BIO_SPECIAL(bio) ((unsigned long)bio <=3D 2) +#define MAX_PLUG_BIO 32 =20 /* for managing resync I/O pages */ struct resync_pages { @@ -31,6 +32,7 @@ struct resync_pages { struct raid1_plug_cb { struct blk_plug_cb cb; struct bio_list pending; + unsigned int count; }; =20 static void rbio_pool_free(void *rbio, void *data) @@ -128,7 +130,7 @@ static inline void raid1_submit_write(struct bio *bio) } =20 static inline bool raid1_add_bio_to_plug(struct mddev *mddev, struct bio *= bio, - blk_plug_cb_fn unplug) + blk_plug_cb_fn unplug, int copies) { struct raid1_plug_cb *plug =3D NULL; struct blk_plug_cb *cb; @@ -148,6 +150,11 @@ static inline bool raid1_add_bio_to_plug(struct mddev = *mddev, struct bio *bio, =20 plug =3D container_of(cb, struct raid1_plug_cb, cb); bio_list_add(&plug->pending, bio); + if (++plug->count / MAX_PLUG_BIO >=3D copies) { + list_del(&cb->list); + cb->callback(cb, false); + } + =20 return true; } diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 006620fed595..dc89a1c4b1f1 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -1562,7 +1562,7 @@ static void raid1_write_request(struct mddev *mddev, = struct bio *bio, r1_bio->sector); /* flush_pending_writes() needs access to the rdev so...*/ mbio->bi_bdev =3D (void *)rdev; - if (!raid1_add_bio_to_plug(mddev, mbio, raid1_unplug)) { + if (!raid1_add_bio_to_plug(mddev, mbio, raid1_unplug, disks)) { spin_lock_irqsave(&conf->device_lock, flags); bio_list_add(&conf->pending_bio_list, mbio); spin_unlock_irqrestore(&conf->device_lock, flags); diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c index fb22cfe94d32..9237dbeb07ba 100644 --- a/drivers/md/raid10.c +++ b/drivers/md/raid10.c @@ -1306,7 +1306,7 @@ static void raid10_write_one_disk(struct mddev *mddev= , struct r10bio *r10_bio, =20 atomic_inc(&r10_bio->remaining); =20 - if (!raid1_add_bio_to_plug(mddev, mbio, raid10_unplug)) { + if (!raid1_add_bio_to_plug(mddev, mbio, raid10_unplug, conf->copies)) { spin_lock_irqsave(&conf->device_lock, flags); bio_list_add(&conf->pending_bio_list, mbio); spin_unlock_irqrestore(&conf->device_lock, flags); --=20 2.39.2