From nobody Tue Dec 16 06:51:01 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 02990C4332F for ; Fri, 10 Nov 2023 18:29:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345469AbjKJS3c (ORCPT ); Fri, 10 Nov 2023 13:29:32 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52102 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1346558AbjKJS07 (ORCPT ); Fri, 10 Nov 2023 13:26:59 -0500 Received: from dggsgout12.his.huawei.com (unknown [45.249.212.56]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D3B9B244BA; Fri, 10 Nov 2023 01:34:00 -0800 (PST) Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4SRYYJ3m4tz4f3v76; Fri, 10 Nov 2023 17:33:56 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.112]) by mail.maildlp.com (Postfix) with ESMTP id AC62E1A0177; Fri, 10 Nov 2023 17:33:57 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP1 (Coremail) with SMTP id cCh0CgA3iA4E+U1l0pQlAg--.33627S5; Fri, 10 Nov 2023 17:33:57 +0800 (CST) From: Yu Kuai To: song@kernel.org, xni@redhat.com, yukuai3@huawei.com, neilb@suse.de Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next 1/8] md: fix missing flush of sync_work Date: Sat, 11 Nov 2023 01:28:27 +0800 Message-Id: <20231110172834.3939490-2-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20231110172834.3939490-1-yukuai1@huaweicloud.com> References: <20231110172834.3939490-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: cCh0CgA3iA4E+U1l0pQlAg--.33627S5 X-Coremail-Antispam: 
1UD129KBjvJXoW7tr18WF13Cr18Cr4kJw13Jwb_yoW8XFyfp3 ySqa45ArWrAay7t3yUGa4q9a4rWw10qrZrtrW3u345JF1aqF45G3WY9F1jqFykJF93Zwn8 ZF40ya9xZ3W0vr7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUBE14x267AKxVW5JVWrJwAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2jI8I6cxK62vIxIIY0VWUZVW8XwA2048vs2IY02 0E87I2jVAFwI0_Jr4l82xGYIkIc2x26xkF7I0E14v26r1I6r4UM28lY4IEw2IIxxk0rwA2 F7IY1VAKz4vEj48ve4kI8wA2z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjx v20xvEc7CjxVAFwI0_Gr1j6F4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2 z280aVCY1x0267AKxVW0oVCq3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0V AKzVAqx4xG6I80ewAv7VC0I7IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1l Ox8S6xCaFVCjc4AY6r1j6r4UM4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErc IFxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v2 6r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIxkGc2 Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI42IY6xIIjxv20xvEc7CjxVAFwI0_ Gr0_Cr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r1j6r4UMI IF0xvEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0pRBMKJUUUUU = X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8"

From: Yu Kuai

Commit ac619781967b ("md: use separate work_struct for md_start_sync()")
uses a new sync_work to replace del_work. However, stop_sync_thread() and
__md_stop_writes() were trying to wait for sync_thread to be done, hence
they should switch to use sync_work as well.

Note that md_start_sync() from sync_work will grab 'reconfig_mutex', hence
other contexts can't hold the same lock while flushing the work; this will
be fixed in later patches.

Fixes: ac619781967b ("md: use separate work_struct for md_start_sync()")
Signed-off-by: Yu Kuai
Signed-off-by: Yu Kuai
---
 drivers/md/md.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 09686d8db983..1701e2fb219f 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -4865,7 +4865,7 @@ static void stop_sync_thread(struct mddev *mddev)
 		return;
 	}
 
-	if (work_pending(&mddev->del_work))
+	if (work_pending(&mddev->sync_work))
 		flush_workqueue(md_misc_wq);
 
 	set_bit(MD_RECOVERY_INTR, &mddev->recovery);
@@ -6273,7 +6273,7 @@ static void md_clean(struct mddev *mddev)
 static void __md_stop_writes(struct mddev *mddev)
 {
 	set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-	if (work_pending(&mddev->del_work))
+	if (work_pending(&mddev->sync_work))
 		flush_workqueue(md_misc_wq);
 	if (mddev->sync_thread) {
 		set_bit(MD_RECOVERY_INTR, &mddev->recovery);
-- 
2.39.2

From nobody Tue Dec 16 06:51:01 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E4572C4332F for ; Fri, 10 Nov 2023 20:57:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1346160AbjKJU5i (ORCPT ); Fri, 10 Nov 2023 15:57:38 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52054 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235873AbjKJU5X (ORCPT ); Fri, 10 Nov 2023 15:57:23 -0500 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E2E15244BB; Fri, 10 Nov 2023 01:34:00 -0800 (PST) Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4SRYYG1c8Yz4f4PNq; Fri, 10 Nov 2023 17:33:54 +0800 (CST) Received: from mail02.huawei.com (unknown
[10.116.40.112]) by mail.maildlp.com (Postfix) with ESMTP id 17CDD1A0173; Fri, 10 Nov 2023 17:33:58 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP1 (Coremail) with SMTP id cCh0CgA3iA4E+U1l0pQlAg--.33627S6; Fri, 10 Nov 2023 17:33:57 +0800 (CST) From: Yu Kuai To: song@kernel.org, xni@redhat.com, yukuai3@huawei.com, neilb@suse.de Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next 2/8] md: use interruptible apis in idle/frozen_sync_thread() Date: Sat, 11 Nov 2023 01:28:28 +0800 Message-Id: <20231110172834.3939490-3-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20231110172834.3939490-1-yukuai1@huaweicloud.com> References: <20231110172834.3939490-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: cCh0CgA3iA4E+U1l0pQlAg--.33627S6 X-Coremail-Antispam: 1UD129KBjvJXoW7Kw4UWr1xur17ArW8Ar1UKFg_yoW8CF1Up3 yxGF98Ar45ArsxZ347J3WDZa4rZw1j9ayqyrW3Wa1fJwn3tr42gF109FyUZFykWayfCr4U Ja4rtFW3ZFy8Gw7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUBE14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2jI8I6cxK62vIxIIY0VWUZVW8XwA2048vs2IY02 0E87I2jVAFwI0_Jryl82xGYIkIc2x26xkF7I0E14v26r4j6ryUM28lY4IEw2IIxxk0rwA2 F7IY1VAKz4vEj48ve4kI8wA2z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjx v20xvEc7CjxVAFwI0_Gr1j6F4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2 z280aVCY1x0267AKxVW0oVCq3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0V AKzVAqx4xG6I80ewAv7VC0I7IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1l Ox8S6xCaFVCjc4AY6r1j6r4UM4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErc IFxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v2 6r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIxkGc2 Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI42IY6xIIjxv20xvEc7CjxVAFwI0_ 
Gr0_Cr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r1j6r4UMI IF0xvEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0pRY2NJUUUUU = X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8"

From: Yu Kuai

Before idle and frozen were refactored out of action_store(), interruptible
APIs were used so that a hung task warning wouldn't be triggered if it took
too long to finish idle/frozen sync_thread. So change back to the
interruptible APIs.

Fixes: 130443d60b1b ("md: refactor idle/frozen_sync_thread() to fix deadlock")
Signed-off-by: Yu Kuai
Signed-off-by: Yu Kuai
---
 drivers/md/md.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 1701e2fb219f..5c9387369de1 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -4882,11 +4882,14 @@ static void idle_sync_thread(struct mddev *mddev)
 {
 	int sync_seq = atomic_read(&mddev->sync_seq);
 
-	mutex_lock(&mddev->sync_mutex);
+	if (mutex_lock_interruptible(&mddev->sync_mutex))
+		return;
+
 	clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 	stop_sync_thread(mddev);
 
-	wait_event(resync_wait, sync_seq != atomic_read(&mddev->sync_seq) ||
+	wait_event_interruptible(resync_wait,
+			sync_seq != atomic_read(&mddev->sync_seq) ||
 			!test_bit(MD_RECOVERY_RUNNING, &mddev->recovery));
 
 	mutex_unlock(&mddev->sync_mutex);
@@ -4894,11 +4897,13 @@ static void idle_sync_thread(struct mddev *mddev)
 
 static void frozen_sync_thread(struct mddev *mddev)
 {
-	mutex_lock(&mddev->sync_mutex);
+	if (mutex_lock_interruptible(&mddev->sync_mutex))
+		return;
+
 	set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 	stop_sync_thread(mddev);
 
-	wait_event(resync_wait, mddev->sync_thread == NULL &&
+	wait_event_interruptible(resync_wait, mddev->sync_thread == NULL &&
 			!test_bit(MD_RECOVERY_RUNNING, &mddev->recovery));
 
 	mutex_unlock(&mddev->sync_mutex);
-- 
2.39.2

From nobody Tue Dec
16 06:51:01 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3EB93C4332F for ; Fri, 10 Nov 2023 17:54:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229598AbjKJRyh (ORCPT ); Fri, 10 Nov 2023 12:54:37 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60022 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229741AbjKJRx0 (ORCPT ); Fri, 10 Nov 2023 12:53:26 -0500 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1B19F244B3; Fri, 10 Nov 2023 01:34:01 -0800 (PST) Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4SRYYG4TX1z4f4BLR; Fri, 10 Nov 2023 17:33:54 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.112]) by mail.maildlp.com (Postfix) with ESMTP id 7AECE1A019B; Fri, 10 Nov 2023 17:33:58 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP1 (Coremail) with SMTP id cCh0CgA3iA4E+U1l0pQlAg--.33627S7; Fri, 10 Nov 2023 17:33:58 +0800 (CST) From: Yu Kuai To: song@kernel.org, xni@redhat.com, yukuai3@huawei.com, neilb@suse.de Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next 3/8] md: return error to user if idle/frozen_sync_thread() is interrupted Date: Sat, 11 Nov 2023 01:28:29 +0800 Message-Id: <20231110172834.3939490-4-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20231110172834.3939490-1-yukuai1@huaweicloud.com> References: <20231110172834.3939490-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: cCh0CgA3iA4E+U1l0pQlAg--.33627S7 
X-Coremail-Antispam: 1UD129KBjvJXoWxur4UWr1UJFW5Xr4xAFWfuFg_yoWrCr1xp3 yxJF98Ar4YyrZ3Zry7t3WDAay5uw1IqrWqyry3W34fAFn3tr47KF1Y9F1UAFykKayrAa1U XayrtF4fuFyrWr7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUBE14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2jI8I6cxK62vIxIIY0VWUZVW8XwA2048vs2IY02 0E87I2jVAFwI0_JrWl82xGYIkIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2 F7IY1VAKz4vEj48ve4kI8wA2z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjx v20xvEc7CjxVAFwI0_Gr1j6F4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2 z280aVCY1x0267AKxVW0oVCq3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0V AKzVAqx4xG6I80ewAv7VC0I7IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1l Ox8S6xCaFVCjc4AY6r1j6r4UM4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErc IFxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v2 6r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIxkGc2 Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI42IY6xIIjxv20xvEc7CjxVAFwI0_ Gr0_Cr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r1j6r4UMI IF0xvEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0pR_-BtUUUUU = X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8"

From: Yu Kuai

It doesn't make sense that "echo idle/frozen > sync_action" doesn't return
an error to the user if idle/frozen_sync_thread() is interrupted. Also make
sure the array recovery flags are not changed if an error is returned.

Fixes: 8e8e2518fcec ("md: Close race when setting 'action' to 'idle'.")
Signed-off-by: Yu Kuai
Signed-off-by: Yu Kuai
---
 drivers/md/md.c | 70 +++++++++++++++++++++++++++++++++----------------
 1 file changed, 48 insertions(+), 22 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 5c9387369de1..d7b9d597b54d 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -4848,13 +4848,16 @@ action_show(struct mddev *mddev, char *page)
 	return sprintf(page, "%s\n", type);
 }
 
-static void stop_sync_thread(struct mddev *mddev)
+static int stop_sync_thread(struct mddev *mddev)
 {
+	int ret = 0;
+
 	if (!test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
-		return;
+		return 0;
 
-	if (mddev_lock(mddev))
-		return;
+	ret = mddev_lock(mddev);
+	if (ret)
+		return ret;
 
 	/*
 	 * Check again in case MD_RECOVERY_RUNNING is cleared before lock is
@@ -4862,7 +4865,7 @@ static void stop_sync_thread(struct mddev *mddev)
 	 */
 	if (!test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
 		mddev_unlock(mddev);
-		return;
+		return 0;
 	}
 
 	if (work_pending(&mddev->sync_work))
@@ -4876,50 +4879,69 @@ static void stop_sync_thread(struct mddev *mddev)
 	md_wakeup_thread_directly(mddev->sync_thread);
 
 	mddev_unlock(mddev);
+	return 0;
 }
 
-static void idle_sync_thread(struct mddev *mddev)
+static int idle_sync_thread(struct mddev *mddev)
 {
+	int ret;
 	int sync_seq = atomic_read(&mddev->sync_seq);
+	bool flag;
 
-	if (mutex_lock_interruptible(&mddev->sync_mutex))
-		return;
+	ret = mutex_lock_interruptible(&mddev->sync_mutex);
+	if (ret)
+		return ret;
 
-	clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-	stop_sync_thread(mddev);
+	flag = test_and_clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+	ret = stop_sync_thread(mddev);
+	if (ret)
+		goto out;
 
-	wait_event_interruptible(resync_wait,
+	ret = wait_event_interruptible(resync_wait,
 			sync_seq != atomic_read(&mddev->sync_seq) ||
 			!test_bit(MD_RECOVERY_RUNNING, &mddev->recovery));
-
+out:
+	if (ret && flag)
+		set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 	mutex_unlock(&mddev->sync_mutex);
+	return ret;
 }
 
-static void frozen_sync_thread(struct mddev *mddev)
+static int frozen_sync_thread(struct mddev *mddev)
 {
-	if (mutex_lock_interruptible(&mddev->sync_mutex))
-		return;
+	int ret = mutex_lock_interruptible(&mddev->sync_mutex);
+	bool flag;
 
-	set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-	stop_sync_thread(mddev);
+	if (ret)
+		return ret;
 
-	wait_event_interruptible(resync_wait, mddev->sync_thread == NULL &&
-			!test_bit(MD_RECOVERY_RUNNING, &mddev->recovery));
+	flag = test_and_set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+	ret = stop_sync_thread(mddev);
+	if (ret)
+		goto out;
 
+	ret = wait_event_interruptible(resync_wait,
+			mddev->sync_thread == NULL &&
+			!test_bit(MD_RECOVERY_RUNNING, &mddev->recovery));
+out:
+	if (ret && !flag)
+		clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 	mutex_unlock(&mddev->sync_mutex);
+	return ret;
 }
 
 static ssize_t
 action_store(struct mddev *mddev, const char *page, size_t len)
 {
+	int ret = 0;
+
 	if (!mddev->pers || !mddev->pers->sync_request)
 		return -EINVAL;
 
-
 	if (cmd_match(page, "idle"))
-		idle_sync_thread(mddev);
+		ret = idle_sync_thread(mddev);
 	else if (cmd_match(page, "frozen"))
-		frozen_sync_thread(mddev);
+		ret = frozen_sync_thread(mddev);
 	else if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
 		return -EBUSY;
 	else if (cmd_match(page, "resync"))
@@ -4963,6 +4985,10 @@ action_store(struct mddev *mddev, const char *page, size_t len)
 		set_bit(MD_RECOVERY_REQUESTED, &mddev->recovery);
 		set_bit(MD_RECOVERY_SYNC, &mddev->recovery);
 	}
+
+	if (ret)
+		return ret;
+
 	if (mddev->ro == MD_AUTO_READ) {
 		/* A write to sync_action is enough to justify
 		 * canceling read-auto mode
-- 
2.39.2

From nobody Tue Dec 16 06:51:01 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org
[23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 54805C4167D for ; Fri, 10 Nov 2023 18:04:55 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232252AbjKJSE4 (ORCPT ); Fri, 10 Nov 2023 13:04:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33200 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235658AbjKJSEe (ORCPT ); Fri, 10 Nov 2023 13:04:34 -0500 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5E415244BD; Fri, 10 Nov 2023 01:34:01 -0800 (PST) Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4SRYYH05zTz4f4Q49; Fri, 10 Nov 2023 17:33:55 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.112]) by mail.maildlp.com (Postfix) with ESMTP id DAE1D1A016F; Fri, 10 Nov 2023 17:33:58 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP1 (Coremail) with SMTP id cCh0CgA3iA4E+U1l0pQlAg--.33627S8; Fri, 10 Nov 2023 17:33:58 +0800 (CST) From: Yu Kuai To: song@kernel.org, xni@redhat.com, yukuai3@huawei.com, neilb@suse.de Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next 4/8] md: remove redundant md_wakeup_thread() Date: Sat, 11 Nov 2023 01:28:30 +0800 Message-Id: <20231110172834.3939490-5-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20231110172834.3939490-1-yukuai1@huaweicloud.com> References: <20231110172834.3939490-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: cCh0CgA3iA4E+U1l0pQlAg--.33627S8 X-Coremail-Antispam: 1UD129KBjvJXoW7Kw1xuF48GFyrCFy7GrW7Arb_yoW8tFykpa yxJF98urWUZa43ArZrta4DXa45Zr1jqrWqyFW3u3yrJF1fta15uFyF9F17JrWvya92ya1Y 
qw48GrW7Z3WxWw7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPY14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2jI8I6cxK62vIxIIY0VWUZVW8XwA2048vs2IY02 0E87I2jVAFwI0_JF0E3s1l82xGYIkIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0 rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6x IIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xv wVC2z280aVCY1x0267AKxVW0oVCq3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFc xC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_ Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2 IErcIFxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E 14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIx kGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI42IY6xIIjxv20xvEc7CjxVAF wI0_Gr0_Cr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r1j6r 4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0pRKFAPU UUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8"

From: Yu Kuai

In md_set_readonly() and do_md_stop(), md_wakeup_thread() is called while
'reconfig_mutex' is held. However, the follow-up mddev_unlock() will call
md_wakeup_thread() again. Hence remove the redundant md_wakeup_thread().

Signed-off-by: Yu Kuai
Signed-off-by: Yu Kuai
---
 drivers/md/md.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index d7b9d597b54d..a0ec01048ede 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6397,7 +6397,6 @@ static int md_set_readonly(struct mddev *mddev, struct block_device *bdev)
 	if (!test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {
 		did_freeze = 1;
 		set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-		md_wakeup_thread(mddev->thread);
 	}
 	if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
 		set_bit(MD_RECOVERY_INTR, &mddev->recovery);
@@ -6425,7 +6424,6 @@ static int md_set_readonly(struct mddev *mddev, struct block_device *bdev)
 		if (did_freeze) {
 			clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 			set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
-			md_wakeup_thread(mddev->thread);
 		}
 		err = -EBUSY;
 		goto out;
@@ -6440,7 +6438,6 @@ static int md_set_readonly(struct mddev *mddev, struct block_device *bdev)
 		set_disk_ro(mddev->gendisk, 1);
 		clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 		set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
-		md_wakeup_thread(mddev->thread);
 		sysfs_notify_dirent_safe(mddev->sysfs_state);
 		err = 0;
 	}
@@ -6463,7 +6460,6 @@ static int do_md_stop(struct mddev *mddev, int mode,
 	if (!test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {
 		did_freeze = 1;
 		set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-		md_wakeup_thread(mddev->thread);
 	}
 	if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
 		set_bit(MD_RECOVERY_INTR, &mddev->recovery);
@@ -6490,7 +6486,6 @@ static int do_md_stop(struct mddev *mddev, int mode,
 		if (did_freeze) {
 			clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 			set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
-			md_wakeup_thread(mddev->thread);
 		}
 		return -EBUSY;
 	}
-- 
2.39.2

From nobody Tue Dec 16 06:51:01 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by
smtp.lore.kernel.org (Postfix) with ESMTP id 3F526C4167B for ; Fri, 10 Nov 2023 17:59:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229575AbjKJR7B (ORCPT ); Fri, 10 Nov 2023 12:59:01 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50646 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234606AbjKJRzi (ORCPT ); Fri, 10 Nov 2023 12:55:38 -0500 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 22226244BF; Fri, 10 Nov 2023 01:34:02 -0800 (PST) Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4SRYYH31MLz4f4PNh; Fri, 10 Nov 2023 17:33:55 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.112]) by mail.maildlp.com (Postfix) with ESMTP id 4A8781A0181; Fri, 10 Nov 2023 17:33:59 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP1 (Coremail) with SMTP id cCh0CgA3iA4E+U1l0pQlAg--.33627S9; Fri, 10 Nov 2023 17:33:59 +0800 (CST) From: Yu Kuai To: song@kernel.org, xni@redhat.com, yukuai3@huawei.com, neilb@suse.de Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next 5/8] md: don't leave 'MD_RECOVERY_FROZEN' in error path of md_set_readonly() Date: Sat, 11 Nov 2023 01:28:31 +0800 Message-Id: <20231110172834.3939490-6-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20231110172834.3939490-1-yukuai1@huaweicloud.com> References: <20231110172834.3939490-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: cCh0CgA3iA4E+U1l0pQlAg--.33627S9 X-Coremail-Antispam: 1UD129KBjvJXoW7uFWkXryrtw18CrWfXryDtrb_yoW8Kr4fp3 yftF98Cry8Jry3Ar4Dt3WDXa45Zw12q3yqkFy3u34rJF1ftrZxCFyF9348JrWktas2v345 
Xa1rGFW7u3W2gaUanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPF14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2jI8I6cxK62vIxIIY0VWUZVW8XwA2048vs2IY02 0E87I2jVAFwI0_JF0E3s1l82xGYIkIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0 rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6x IIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xv wVC2z280aVCY1x0267AKxVW0oVCq3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFc xC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_ Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2 IErcIFxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E 14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIx kGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAF wI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVWUJV W8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnIWIevJa73UjIFyTuYvjTRi4SO UUUUU X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8"

From: Yu Kuai

If md_set_readonly() fails, the array can still be read-write while
'MD_RECOVERY_FROZEN' remains set, which leaves the array in an abnormal
state where sync or recovery can't continue anymore. Hence make sure the
flag is cleared after md_set_readonly() returns.

Signed-off-by: Yu Kuai
Signed-off-by: Yu Kuai
---
 drivers/md/md.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index a0ec01048ede..c0f2bdafe46a 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6394,6 +6394,9 @@ static int md_set_readonly(struct mddev *mddev, struct block_device *bdev)
 	int err = 0;
 	int did_freeze = 0;
 
+	if (mddev->external && test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags))
+		return -EBUSY;
+
 	if (!test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {
 		did_freeze = 1;
 		set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
@@ -6407,8 +6410,6 @@ static int md_set_readonly(struct mddev *mddev, struct block_device *bdev)
 	 */
 	md_wakeup_thread_directly(mddev->sync_thread);
 
-	if (mddev->external && test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags))
-		return -EBUSY;
 	mddev_unlock(mddev);
 	wait_event(resync_wait, !test_bit(MD_RECOVERY_RUNNING,
 					  &mddev->recovery));
@@ -6421,27 +6422,26 @@ static int md_set_readonly(struct mddev *mddev, struct block_device *bdev)
 	    mddev->sync_thread ||
 	    test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
 		pr_warn("md: %s still in use.\n",mdname(mddev));
-		if (did_freeze) {
-			clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-			set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
-		}
 		err = -EBUSY;
 		goto out;
 	}
 	if (mddev->pers) {
 		__md_stop_writes(mddev);
 
-		err = -ENXIO;
-		if (mddev->ro == MD_RDONLY)
+		if (mddev->ro == MD_RDONLY) {
+			err = -ENXIO;
 			goto out;
+		}
 		mddev->ro = MD_RDONLY;
 		set_disk_ro(mddev->gendisk, 1);
+	}
+out:
+	if ((mddev->pers && !err) || did_freeze) {
 		clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 		set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
 		sysfs_notify_dirent_safe(mddev->sysfs_state);
-		err = 0;
 	}
-out:
+
 	mutex_unlock(&mddev->open_mutex);
 	return err;
 }
-- 
2.39.2

From nobody Tue Dec 16 06:51:01 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8BBBDC4167B for ; Fri, 10 Nov 2023 18:03:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235481AbjKJSDb (ORCPT ); Fri, 10 Nov 2023 13:03:31 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51182 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235729AbjKJSCk (ORCPT ); Fri, 10 Nov 2023 13:02:40 -0500 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 49F632BE24; Fri, 10 Nov 2023 01:34:02 -0800 (PST) Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4SRYYK26h7z4f4HDw; Fri, 10 Nov 2023 17:33:57 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.112]) by mail.maildlp.com (Postfix) with ESMTP id AA82B1A0177; Fri, 10 Nov 2023 17:33:59 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP1 (Coremail) with SMTP id cCh0CgA3iA4E+U1l0pQlAg--.33627S10; Fri, 10 Nov 2023 17:33:59 +0800 (CST) From: Yu Kuai To: song@kernel.org, xni@redhat.com, yukuai3@huawei.com, neilb@suse.de Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH -next 6/8] md: factor out a helper to stop sync_thread Date: Sat, 11 Nov 2023 01:28:32 +0800 Message-Id: <20231110172834.3939490-7-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20231110172834.3939490-1-yukuai1@huaweicloud.com> References: <20231110172834.3939490-1-yukuai1@huaweicloud.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: cCh0CgA3iA4E+U1l0pQlAg--.33627S10 X-Coremail-Antispam: 1UD129KBjvJXoWxtr1kKFyrur15GrWrWF15urg_yoWxKF1fp3 
yfJF98Jr48ArWfZrZrt3WDZayrZr1jqayqyry3Wa4fJr1ftr43KFyF9FyUAFykKay0yr45 XayrtFW3ZFy7Wr7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPF14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2jI8I6cxK62vIxIIY0VWUZVW8XwA2048vs2IY02 0E87I2jVAFwI0_JF0E3s1l82xGYIkIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0 rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6x IIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xv wVC2z280aVCY1x0267AKxVW0oVCq3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFc xC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_ Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2 IErcIFxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E 14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIx kGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAF wI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVWUJV W8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnIWIevJa73UjIFyTuYvjTRi4SO UUUUU X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8"

From: Yu Kuai

stop_sync_thread(), md_set_readonly() and do_md_stop() all try to stop the
sync_thread in the same way. Hence factor out a helper to make the code
cleaner, and to prepare for using the new helper to fix problems later.

Signed-off-by: Yu Kuai Signed-off-by: Yu Kuai --- drivers/md/md.c | 129 ++++++++++++++++++++++++++---------------------- 1 file changed, 69 insertions(+), 60 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index c0f2bdafe46a..7252fae0c989 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -4848,29 +4848,46 @@ action_show(struct mddev *mddev, char *page) return sprintf(page, "%s\n", type); } =20 -static int stop_sync_thread(struct mddev *mddev) +static bool sync_thread_stopped(struct mddev *mddev, int *seq_ptr) { - int ret =3D 0; + if (seq_ptr && *seq_ptr !=3D atomic_read(&mddev->sync_seq)) + return true; =20 - if (!test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) - return 0; + return (!mddev->sync_thread && + !test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)); +} =20 - ret =3D mddev_lock(mddev); - if (ret) - return ret; +/* + * stop_sync_thread() - stop running sync_thread. + * @mddev: the array that sync_thread belongs to. + * @freeze: set true to prevent new sync_thread to start. + * @interruptible: if set true, then user can interrupt while waiting for + * sync_thread to be done. + * + * Noted that this function must be called with 'reconfig_mutex' grabbed, = and + * fter this function return, 'reconfig_mtuex' will be released. + */ +static int stop_sync_thread(struct mddev *mddev, bool freeze, + bool interruptible) + __releases(&mddev->reconfig_mutex) +{ + int *seq_ptr =3D NULL; + int sync_seq; + int ret =3D 0; + + if (freeze) { + set_bit(MD_RECOVERY_FROZEN, &mddev->recovery); + } else { + clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery); + sync_seq =3D atomic_read(&mddev->sync_seq); + seq_ptr =3D &sync_seq; + } =20 - /* - * Check again in case MD_RECOVERY_RUNNING is cleared before lock is - * held. 
-	 */
 	if (!test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
 		mddev_unlock(mddev);
 		return 0;
 	}
 
-	if (work_pending(&mddev->sync_work))
-		flush_workqueue(md_misc_wq);
-
 	set_bit(MD_RECOVERY_INTR, &mddev->recovery);
 	/*
 	 * Thread might be blocked waiting for metadata update which will now
@@ -4879,53 +4896,58 @@ static int stop_sync_thread(struct mddev *mddev)
 	md_wakeup_thread_directly(mddev->sync_thread);
 
 	mddev_unlock(mddev);
-	return 0;
+	if (work_pending(&mddev->sync_work))
+		flush_work(&mddev->sync_work);
+
+	if (interruptible)
+		ret = wait_event_interruptible(resync_wait,
+					sync_thread_stopped(mddev, seq_ptr));
+	else
+		wait_event(resync_wait, sync_thread_stopped(mddev, seq_ptr));
+
+	return ret;
 }
 
 static int idle_sync_thread(struct mddev *mddev)
 {
 	int ret;
-	int sync_seq = atomic_read(&mddev->sync_seq);
 	bool flag;
 
 	ret = mutex_lock_interruptible(&mddev->sync_mutex);
 	if (ret)
 		return ret;
 
-	flag = test_and_clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-	ret = stop_sync_thread(mddev);
+	flag = test_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+	ret = mddev_lock(mddev);
 	if (ret)
-		goto out;
+		goto unlock;
 
-	ret = wait_event_interruptible(resync_wait,
-			sync_seq != atomic_read(&mddev->sync_seq) ||
-			!test_bit(MD_RECOVERY_RUNNING, &mddev->recovery));
-out:
+	ret = stop_sync_thread(mddev, false, true);
 	if (ret && flag)
 		set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+unlock:
 	mutex_unlock(&mddev->sync_mutex);
 	return ret;
 }
 
 static int frozen_sync_thread(struct mddev *mddev)
 {
-	int ret = mutex_lock_interruptible(&mddev->sync_mutex);
+	int ret;
 	bool flag;
 
+	ret = mutex_lock_interruptible(&mddev->sync_mutex);
 	if (ret)
 		return ret;
 
-	flag = test_and_set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-	ret = stop_sync_thread(mddev);
+	flag = test_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+	ret = mddev_lock(mddev);
 	if (ret)
-		goto out;
+		goto unlock;
 
-	ret = wait_event_interruptible(resync_wait,
-			mddev->sync_thread == NULL &&
-			!test_bit(MD_RECOVERY_RUNNING, &mddev->recovery));
-out:
+	ret = stop_sync_thread(mddev, true, true);
 	if (ret && !flag)
 		clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
+unlock:
 	mutex_unlock(&mddev->sync_mutex);
 	return ret;
 }
@@ -6397,22 +6419,10 @@ static int md_set_readonly(struct mddev *mddev, struct block_device *bdev)
 	if (mddev->external && test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags))
 		return -EBUSY;
 
-	if (!test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {
+	if (!test_bit(MD_RECOVERY_FROZEN, &mddev->recovery))
 		did_freeze = 1;
-		set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-	}
-	if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
-		set_bit(MD_RECOVERY_INTR, &mddev->recovery);
 
-	/*
-	 * Thread might be blocked waiting for metadata update which will now
-	 * never happen
-	 */
-	md_wakeup_thread_directly(mddev->sync_thread);
-
-	mddev_unlock(mddev);
-	wait_event(resync_wait, !test_bit(MD_RECOVERY_RUNNING,
-					  &mddev->recovery));
+	stop_sync_thread(mddev, true, false);
 	wait_event(mddev->sb_wait,
 		   !test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags));
 	mddev_lock_nointr(mddev);
@@ -6421,6 +6431,10 @@ static int md_set_readonly(struct mddev *mddev, struct block_device *bdev)
 	if ((mddev->pers && atomic_read(&mddev->openers) > !!bdev) ||
 	    mddev->sync_thread ||
 	    test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
+		/*
+		 * This could happen if the user changes array state through
+		 * ioctl/sysfs while reconfig_mutex is released.
+		 */
 		pr_warn("md: %s still in use.\n",mdname(mddev));
 		err = -EBUSY;
 		goto out;
@@ -6457,30 +6471,25 @@ static int do_md_stop(struct mddev *mddev, int mode,
 	struct md_rdev *rdev;
 	int did_freeze = 0;
 
-	if (!test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) {
+	if (!test_bit(MD_RECOVERY_FROZEN, &mddev->recovery))
 		did_freeze = 1;
+
+	if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
+		stop_sync_thread(mddev, true, false);
+		mddev_lock_nointr(mddev);
+	} else {
 		set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 	}
-	if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
-		set_bit(MD_RECOVERY_INTR, &mddev->recovery);
-
-	/*
-	 * Thread might be blocked waiting for metadata update which will now
-	 * never happen
-	 */
-	md_wakeup_thread_directly(mddev->sync_thread);
-
-	mddev_unlock(mddev);
-	wait_event(resync_wait, (mddev->sync_thread == NULL &&
-				 !test_bit(MD_RECOVERY_RUNNING,
-					   &mddev->recovery)));
-	mddev_lock_nointr(mddev);
 
 	mutex_lock(&mddev->open_mutex);
 	if ((mddev->pers && atomic_read(&mddev->openers) > !!bdev) ||
 	    mddev->sysfs_active ||
 	    mddev->sync_thread ||
 	    test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
+		/*
+		 * This could happen if the user changes array state through
+		 * ioctl/sysfs while reconfig_mutex is released.
+		 */
 		pr_warn("md: %s still in use.\n",mdname(mddev));
 		mutex_unlock(&mddev->open_mutex);
 		if (did_freeze) {
-- 
2.39.2

From nobody Tue Dec 16 06:51:01 2025
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 618D5C4332F
	for ; Fri, 10 Nov 2023 18:07:57 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1344049AbjKJSH5 (ORCPT ); Fri, 10 Nov 2023 13:07:57 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55898 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S234895AbjKJSFI (ORCPT ); Fri, 10 Nov 2023 13:05:08 -0500
Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 58894244B9;
	Fri, 10 Nov 2023 01:34:04 -0800 (PST)
Received: from mail.maildlp.com (unknown [172.19.163.235])
	by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4SRYYJ1YWwz4f4Q4G;
	Fri, 10 Nov 2023 17:33:56 +0800 (CST)
Received: from mail02.huawei.com (unknown [10.116.40.112])
	by mail.maildlp.com (Postfix) with ESMTP id 16BE41A016D;
	Fri, 10 Nov 2023 17:34:00 +0800 (CST)
Received: from huaweicloud.com (unknown [10.175.104.67])
	by APP1 (Coremail) with SMTP id cCh0CgA3iA4E+U1l0pQlAg--.33627S11;
	Fri, 10 Nov 2023 17:33:59 +0800 (CST)
From: Yu Kuai
To: song@kernel.org, xni@redhat.com, yukuai3@huawei.com, neilb@suse.de
Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
	yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com
Subject: [PATCH -next 7/8] md: use new helper to stop sync_thread in __md_stop_writes()
Date: Sat, 11 Nov 2023 01:28:33 +0800
Message-Id: <20231110172834.3939490-8-yukuai1@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20231110172834.3939490-1-yukuai1@huaweicloud.com>
References: <20231110172834.3939490-1-yukuai1@huaweicloud.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-CM-TRANSID: cCh0CgA3iA4E+U1l0pQlAg--.33627S11
X-Coremail-Antispam: 1UD129KBjvJXoW7tw1xWr48GFW8XF1rJFy8uFg_yoW8Jw48p3
	yfKFn8Ar4DZr47A3yUJa4kZa45Z3ZFqrWvyFW3u3yrXFy3JFsrWw4Y9FyDZFWkGa4Sv3Zx
	Xa95tFZ3Za48Kr7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2
	9KBjDU0xBIdaVrnRJUUUPF14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0
	rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2jI8I6cxK62vIxIIY0VWUZVW8XwA2048vs2IY02
	0E87I2jVAFwI0_JF0E3s1l82xGYIkIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0
	rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6x
	IIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xv
	wVC2z280aVCY1x0267AKxVW0oVCq3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFc
	xC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_
	Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2
	IErcIFxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E
	14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIx
	kGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAF
	wI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVWUJV
	W8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnIWIevJa73UjIFyTuYvjTRi4SO
	UUUUU
X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Yu Kuai

md_reap_sync_thread() should only be called after md_do_sync() is done.
For example, holding 'reconfig_mutex' while waiting for md_do_sync() to
finish can deadlock (see commit 130443d60b1b ("md: refactor
idle/frozen_sync_thread() to fix deadlock") for details).

Hence use the new helper to stop sync_thread.

Signed-off-by: Yu Kuai
Signed-off-by: Yu Kuai
---
 drivers/md/md.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 7252fae0c989..35f3dd7db369 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6325,12 +6325,11 @@ static void md_clean(struct mddev *mddev)
 
 static void __md_stop_writes(struct mddev *mddev)
 {
-	set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
-	if (work_pending(&mddev->sync_work))
-		flush_workqueue(md_misc_wq);
-	if (mddev->sync_thread) {
-		set_bit(MD_RECOVERY_INTR, &mddev->recovery);
-		md_reap_sync_thread(mddev);
+	if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) {
+		stop_sync_thread(mddev, true, false);
+		mddev_lock_nointr(mddev);
+	} else {
+		set_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
 	}
 
 	del_timer_sync(&mddev->safemode_timer);
-- 
2.39.2

From nobody Tue Dec 16 06:51:01 2025
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 01B09C4332F
	for ; Fri, 10 Nov 2023 18:11:49 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1344733AbjKJSLu (ORCPT ); Fri, 10 Nov 2023 13:11:50 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47812 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1345348AbjKJSLA (ORCPT ); Fri, 10 Nov 2023 13:11:00 -0500
Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 717452BE2D;
	Fri, 10 Nov 2023 01:34:04 -0800 (PST)
Received: from mail.maildlp.com (unknown [172.19.163.235])
	by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4SRYYJ4VLMz4f4Q4J;
	Fri, 10 Nov 2023 17:33:56 +0800 (CST)
Received: from mail02.huawei.com (unknown [10.116.40.112])
	by mail.maildlp.com (Postfix) with ESMTP id 7C2BB1A0176;
	Fri, 10 Nov 2023 17:34:00 +0800 (CST)
Received: from huaweicloud.com (unknown [10.175.104.67])
	by APP1 (Coremail) with SMTP id cCh0CgA3iA4E+U1l0pQlAg--.33627S12;
	Fri, 10 Nov 2023 17:34:00 +0800 (CST)
From: Yu Kuai
To: song@kernel.org, xni@redhat.com, yukuai3@huawei.com, neilb@suse.de
Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org,
	yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com
Subject: [PATCH -next 8/8] dm-raid: fix a deadlock in md_stop()
Date: Sat, 11 Nov 2023 01:28:34 +0800
Message-Id: <20231110172834.3939490-9-yukuai1@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20231110172834.3939490-1-yukuai1@huaweicloud.com>
References: <20231110172834.3939490-1-yukuai1@huaweicloud.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-CM-TRANSID: cCh0CgA3iA4E+U1l0pQlAg--.33627S12
X-Coremail-Antispam: 1UD129KBjvJXoW7tFWUurWUtw17JrW5Zry7ZFb_yoW8Zw45p3
	yFqrWayr4UX3yUXayDGw1kuFyYq3ZYgrWqyrW3Ca4rZayayryxWw1rKa1vgrZ8JF9Iqan0
	vF4qgas8W34jyFJanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2
	9KBjDU0xBIdaVrnRJUUUPF14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0
	rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2jI8I6cxK62vIxIIY0VWUZVW8XwA2048vs2IY02
	0E87I2jVAFwI0_JF0E3s1l82xGYIkIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0
	rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6x
	IIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xv
	wVC2z280aVCY1x0267AKxVW0oVCq3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFc
	xC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_
	Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2
	IErcIFxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E
	14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIx
	kGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAF
	wI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVWUJV
	W8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnIWIevJa73UjIFyTuYvjTRi4SO
	UUUUU
X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Yu Kuai

After commit db5e653d7c9f ("md: delay choosing sync action to
md_start_sync()"), md_start_sync() takes 'reconfig_mutex'. However, to
make sure event_work is done, __md_stop() flushes the workqueue with
'reconfig_mutex' held, so if sync_work is still pending, a deadlock is
triggered:

md_stop                         md_start_sync
 mddev_lock
                                 mddev_lock
 flush_workqueue
 -> deadlock

Currently, __md_stop() is the only place that flushes the workqueue with
'reconfig_mutex' held, and event_work is only used by dm-raid. Instead
of splitting sync_work out of the workqueue, fix this problem the easy
way by moving the flush to dm-raid, where 'reconfig_mutex' is not held.
This is safe because do_table_event() is unrelated to mdadm and can be
called after md_stop().

Fixes: db5e653d7c9f ("md: delay choosing sync action to md_start_sync()")
Signed-off-by: Yu Kuai
Signed-off-by: Yu Kuai
---
 drivers/md/dm-raid.c | 3 +++
 drivers/md/md.c      | 3 ---
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c
index a4692f8f98ee..51f15c20f621 100644
--- a/drivers/md/dm-raid.c
+++ b/drivers/md/dm-raid.c
@@ -3317,6 +3317,9 @@ static void raid_dtr(struct dm_target *ti)
 	mddev_lock_nointr(&rs->md);
 	md_stop(&rs->md);
 	mddev_unlock(&rs->md);
+
+	if (work_pending(&rs->md.event_work))
+		flush_work(&rs->md.event_work);
 	raid_set_free(rs);
 }
 
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 35f3dd7db369..8f5df249448d 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6378,9 +6378,6 @@ static void __md_stop(struct mddev *mddev)
 	struct md_personality *pers = mddev->pers;
 	md_bitmap_destroy(mddev);
 	mddev_detach(mddev);
-	/* Ensure ->event_work is done */
-	if (mddev->event_work.func)
-		flush_workqueue(md_misc_wq);
 	spin_lock(&mddev->lock);
 	mddev->pers = NULL;
 	spin_unlock(&mddev->lock);
-- 
2.39.2