From nobody Mon Feb 9 13:13:45 2026 Received: from dggsgout11.his.huawei.com (unknown [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D28017E11E; Tue, 5 Mar 2024 07:29:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709623772; cv=none; b=TR4sERAH/0IrKUWBF2GznOT6BBjoZBjcqWCqwTRDWFimC25plZMgHcLRJfv/CYckW2LWru8fkxjBeHCJHf4GKDzVHyIAXFDsfNVv3T7G/1JXYQHBQbojw9bQMRSypmO3kULk7oiM9I3Emsg3IsCfDDFjYuyZjT+xjHFCvhpw3gk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709623772; c=relaxed/simple; bh=LOAAo12TmpeMqfRlxKgrGSwC8uen4zm5weoQ2qlrzpg=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Q6fdH0+dn/rw3q6oSKC0ukbqwu5s4zXRvfVZ8EEFyG65l9+tHJ+sc+SUGUdeA/rf42O0cH8aO3gqlQUfOtNwVfbfUGFsneh3yjQmnnN01B4xlopHtCDNZmZwsmINyp/fwCCoLtrHXnlesaoa/KOaemERkIsxhDDxhcpunIBgS9U= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4TpnJ26hfqz4f3k5b; Tue, 5 Mar 2024 15:29:22 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.112]) by mail.maildlp.com (Postfix) with ESMTP id 6B2081A0C15; Tue, 5 Mar 2024 15:29:26 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP1 (Coremail) with SMTP id cCh0CgBHGBHRyeZlGnf+Fw--.17927S5; Tue, 05 Mar 2024 15:29:26 +0800 (CST) From: Yu Kuai To: xni@redhat.com, zkabelac@redhat.com, agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com, dm-devel@lists.linux.dev, song@kernel.org, yukuai3@huawei.com, heinzm@redhat.com, jbrassow@redhat.com, neilb@suse.de Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH md-6.8 v2 1/9] md: don't clear MD_RECOVERY_FROZEN for new dm-raid until resume Date: Tue, 5 Mar 2024 15:22:58 +0800 Message-Id: <20240305072306.2562024-2-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20240305072306.2562024-1-yukuai1@huaweicloud.com> References: <20240305072306.2562024-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: cCh0CgBHGBHRyeZlGnf+Fw--.17927S5 X-Coremail-Antispam: 1UD129KBjvJXoW7Zw4ktrW3uryDCrW7CryxAFb_yoW8Gw47pa n7Gr1rurWkJrW2kF4DJa4v9a45tasxJrW0yryfu3y5Xr13Xr1kK34YgwnxZryqkF1aga45 AF1UAryrC3WrKw7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUBE14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jr4l82xGYIkIc2 x26xkF7I0E14v26r4j6ryUM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJw A2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq3wAS 0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2 IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0 Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2kIc2 xKxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v2 6r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_GFv_WrylIxkGc2 Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI42IY6xIIjxv20xvEc7CjxVAFwI0_ Gr0_Cr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r1j6r4UMI IF0xvEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0JU4T5dUUUUU = X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Content-Type: text/plain; charset="utf-8" From: Yu Kuai After commit 9dbd1aa3a81c ("dm raid: add reshaping support to the target") raid_ctr() will set MD_RECOVERY_FROZEN before md_run() and expect to keep array frozen until resume. However, md_run() will clear the flag by setting mddev->recovery to 0. Before commit 1baae052cccd ("md: Don't ignore suspended array in md_check_recovery()"), dm-raid actually relied on suspending to prevent starting new sync_thread. Fix this problem by keeping 'MD_RECOVERY_FROZEN' for dm-raid in md_run(). Fixes: 1baae052cccd ("md: Don't ignore suspended array in md_check_recovery= ()") Fixes: 9dbd1aa3a81c ("dm raid: add reshaping support to the target") Signed-off-by: Yu Kuai Signed-off-by: Xiao Ni Acked-by: Mike Snitzer --- drivers/md/md.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index 9e41a9aaba8b..23f31fd1d024 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -6038,7 +6038,10 @@ int md_run(struct mddev *mddev) pr_warn("True protection against single-disk failure might be compromis= ed.\n"); } =20 - mddev->recovery =3D 0; + /* dm-raid expect sync_thread to be frozen until resume */ + if (mddev->gendisk) + mddev->recovery =3D 0; + /* may be over-ridden by personality */ mddev->resync_max_sectors =3D mddev->dev_sectors; =20 --=20 2.39.2 From nobody Mon Feb 9 13:13:45 2026 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3A07E7FBA9; Tue, 5 Mar 2024 07:29:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709623777; cv=none; b=HhbyZC4gk5W1zAlPVkN1rxb0DB9wn8k0fue3xtgQnKVQzNV1ZZPjzau6es8Lpkm0X8PjBQknkqc/1FkiCzNKuJyx2U70q8UDpI3+08/ReCfw6UGdK4Ln6pglPHA51hjaOyR+oNsv0CrCnYlnKn2jur2Vbi/NWlkIFvcOXDTnDW8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709623777; c=relaxed/simple; bh=R2YXfoOPVIp/T74tk6RMErRlgJH831cBMUKOY5WBfuU=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Xmw136v+wg8PeMPoQ5DxUb8rMH//BZql9p1SOmrXTOLAtC9cOFbMRzX9w8lG4q55rGr5Mm064PjkfvxvzkeLbBwf5WuUzrb2zq9+3CqCaJi+o0ePYTPWmWMZZQS2Omw8tyB4MLsivEyG6CizfmjR7k89fZkWJcPX8mqS76RVZ7Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4TpnJ14MxMz4f3jrg; Tue, 5 Mar 2024 15:29:21 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.112]) by mail.maildlp.com (Postfix) with ESMTP id 051071A0172; Tue, 5 Mar 2024 15:29:27 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP1 (Coremail) with SMTP id cCh0CgBHGBHRyeZlGnf+Fw--.17927S6; Tue, 05 Mar 2024 15:29:26 +0800 (CST) From: Yu Kuai To: xni@redhat.com, zkabelac@redhat.com, agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com, dm-devel@lists.linux.dev, song@kernel.org, yukuai3@huawei.com, heinzm@redhat.com, jbrassow@redhat.com, neilb@suse.de Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH md-6.8 v2 2/9] md: export helpers to stop sync_thread Date: Tue, 5 Mar 2024 15:22:59 +0800 Message-Id: <20240305072306.2562024-3-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20240305072306.2562024-1-yukuai1@huaweicloud.com> References: <20240305072306.2562024-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: cCh0CgBHGBHRyeZlGnf+Fw--.17927S6 X-Coremail-Antispam: 1UD129KBjvJXoW7WF43Gr45Xw4kury8Kr1kuFg_yoW5JrWrpr W2qFn5Ar4YyFW3Zry7Ga4Dua4Yvwn2gFyktry3C3yfJ3WfKryDGF15Z3WUAFyDCa4rJr1q q3W5KFW3ury8Kr7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUBE14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jryl82xGYIkIc2 x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJw A2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq3wAS 0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2 IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0 Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2kIc2 xKxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v2 6r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_GFv_WrylIxkGc2 Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI42IY6xIIjxv20xvEc7CjxVAFwI0_ Gr0_Cr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0xvEx4A2jsIE14v26r1j6r4UMI IF0xvEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UYxBIdaVFxhVjvjDU0xZFpf9x0JUHbyAUUUUU = X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Content-Type: text/plain; charset="utf-8" From: Yu Kuai The new heleprs will be used in dm-raid in later patches to fix regressions and prevent calling md_reap_sync_thread() directly. Signed-off-by: Yu Kuai Signed-off-by: Xiao Ni Acked-by: Mike Snitzer --- drivers/md/md.c | 29 +++++++++++++++++++++++++++++ drivers/md/md.h | 3 +++ 2 files changed, 32 insertions(+) diff --git a/drivers/md/md.c b/drivers/md/md.c index 23f31fd1d024..22e7512a5b1e 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -4919,6 +4919,35 @@ static void stop_sync_thread(struct mddev *mddev, bo= ol locked, bool check_seq) mddev_lock_nointr(mddev); } =20 +void md_idle_sync_thread(struct mddev *mddev) +{ + lockdep_assert_held(&mddev->reconfig_mutex); + + clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery); + stop_sync_thread(mddev, true, true); +} +EXPORT_SYMBOL_GPL(md_idle_sync_thread); + +void md_frozen_sync_thread(struct mddev *mddev) +{ + lockdep_assert_held(&mddev->reconfig_mutex); + + set_bit(MD_RECOVERY_FROZEN, &mddev->recovery); + stop_sync_thread(mddev, true, false); +} +EXPORT_SYMBOL_GPL(md_frozen_sync_thread); + +void md_unfrozen_sync_thread(struct mddev *mddev) +{ + lockdep_assert_held(&mddev->reconfig_mutex); + + clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery); + set_bit(MD_RECOVERY_NEEDED, &mddev->recovery); + md_wakeup_thread(mddev->thread); + sysfs_notify_dirent_safe(mddev->sysfs_action); +} +EXPORT_SYMBOL_GPL(md_unfrozen_sync_thread); + static void idle_sync_thread(struct mddev *mddev) { mutex_lock(&mddev->sync_mutex); diff --git a/drivers/md/md.h b/drivers/md/md.h index 8d881cc59799..437ab70ce79b 100644 --- a/drivers/md/md.h +++ b/drivers/md/md.h @@ -781,6 +781,9 @@ extern void md_rdev_clear(struct md_rdev *rdev); extern void md_handle_request(struct mddev *mddev, struct bio *bio); extern int mddev_suspend(struct mddev *mddev, bool interruptible); extern void mddev_resume(struct mddev *mddev); +extern void md_idle_sync_thread(struct mddev *mddev); +extern void md_frozen_sync_thread(struct mddev *mddev); +extern void md_unfrozen_sync_thread(struct mddev *mddev); =20 extern void md_reload_sb(struct mddev *mddev, int raid_disk); extern void md_update_sb(struct mddev *mddev, int force); --=20 2.39.2 From nobody Mon Feb 9 13:13:45 2026 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 389507FBA6; Tue, 5 Mar 2024 07:29:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709623777; cv=none; b=Mpl5wALkYPjWE3kJGdkCs/H6gg8kT2O2B7+CLX439b7kykSUUelf/DfAl2WX2+qJ25vPiCO7M4zregtPzwPQ4CdEU02qZphi2WGsDC82ZvyoSlLkqKT/y2H+Gm2oVmlwC6Cy59ofdUHRNvN3xph9jR4i+Kuz3HRJqCbRyuUpfWk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709623777; c=relaxed/simple; bh=a1W8ETvaPBy5gLulNcrgoIljokgm4KTvnlxO2lbuKEI=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=KkTViXvDv8I0ws8Q96nvt7xfFqyJvPi1niRj4BMVS9V+bJ4wxPilVsriutiisvsmF97/CkNNa7JqKo8+wcLDso0xh6a9P3OHWiszKTseqAEMMhgWY5w3x0vEj6RxPUnsK42BQYTeBtBm0PuX7wx1/UKOl7eZ4aQdA6MokUhjlcs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4TpnJ21Wv2z4f3jsM; Tue, 5 Mar 2024 15:29:22 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.112]) by mail.maildlp.com (Postfix) with ESMTP id 992591A0C19; Tue, 5 Mar 2024 15:29:27 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP1 (Coremail) with SMTP id cCh0CgBHGBHRyeZlGnf+Fw--.17927S7; Tue, 05 Mar 2024 15:29:27 +0800 (CST) From: Yu Kuai To: xni@redhat.com, zkabelac@redhat.com, agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com, dm-devel@lists.linux.dev, song@kernel.org, yukuai3@huawei.com, heinzm@redhat.com, jbrassow@redhat.com, neilb@suse.de Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH md-6.8 v2 3/9] md: export helper md_is_rdwr() Date: Tue, 5 Mar 2024 15:23:00 +0800 Message-Id: <20240305072306.2562024-4-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20240305072306.2562024-1-yukuai1@huaweicloud.com> References: <20240305072306.2562024-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: cCh0CgBHGBHRyeZlGnf+Fw--.17927S7 X-Coremail-Antispam: 1UD129KBjvJXoW7Kr47Kr4xtrykGw13uF4fGrg_yoW8WF1kpa ySgFy3Ar4UZF43Xa4UJa1Dua45A3WfKrWjkry3uws5Xa43Xrs8CF1rGFWDKrWkWFyfAF1a ya15JF1UuF10kwUanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPj14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JrWl82xGYIkIc2 x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJw A2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq3wAS 0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2 IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0 Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2kIc2 xKxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v2 6r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_GFv_WrylIxkGc2 Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI42IY6xIIjxv20xvEc7CjxVAFwI0_ Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVWUJVW8Jw CI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnIWIevJa73UjIFyTuYvjfUojjgUUUU U X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Content-Type: text/plain; charset="utf-8" From: Yu Kuai There are no functional changes for now, prepare to fix a deadlock for dm-raid456. Signed-off-by: Yu Kuai Signed-off-by: Xiao Ni Acked-by: Mike Snitzer --- drivers/md/md.c | 12 ------------ drivers/md/md.h | 12 ++++++++++++ 2 files changed, 12 insertions(+), 12 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index 22e7512a5b1e..306cbbf92fcf 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -99,18 +99,6 @@ static void mddev_detach(struct mddev *mddev); static void export_rdev(struct md_rdev *rdev, struct mddev *mddev); static void md_wakeup_thread_directly(struct md_thread __rcu *thread); =20 -enum md_ro_state { - MD_RDWR, - MD_RDONLY, - MD_AUTO_READ, - MD_MAX_STATE -}; - -static bool md_is_rdwr(struct mddev *mddev) -{ - return (mddev->ro =3D=3D MD_RDWR); -} - /* * Default number of read corrections we'll attempt on an rdev * before ejecting it from the array. We divide the read error diff --git a/drivers/md/md.h b/drivers/md/md.h index 437ab70ce79b..684cc866402a 100644 --- a/drivers/md/md.h +++ b/drivers/md/md.h @@ -558,6 +558,18 @@ enum recovery_flags { MD_RESYNCING_REMOTE, /* remote node is running resync thread */ }; =20 +enum md_ro_state { + MD_RDWR, + MD_RDONLY, + MD_AUTO_READ, + MD_MAX_STATE +}; + +static inline bool md_is_rdwr(struct mddev *mddev) +{ + return (mddev->ro =3D=3D MD_RDWR); +} + static inline int __must_check mddev_lock(struct mddev *mddev) { return mutex_lock_interruptible(&mddev->reconfig_mutex); --=20 2.39.2 From nobody Mon Feb 9 13:13:45 2026 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5290F7E564; Tue, 5 Mar 2024 07:29:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709623773; cv=none; b=qg9nrvCDfCW4s5q3LefPnoTJxHmyv4UZDuXRgrYl2mcqtSWVbUh6CJsafGzvvfnPMnklUzn2lv18bmGxUDJozXInxOdK+bQ/JrcQrf1eWfyy3eNAAnigIWiEzTCjDlqyyNqNNHoo/SXfKf8wQDEnNMNCUeJFMv/1yh05gEmGZ38= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709623773; c=relaxed/simple; bh=xLrCo0TM2lW9vZ7sDJbrPgQEy+ULwHr3T5cie4P5Ah4=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=R/FVw/7fGDqgPFyi+IEh4F3smOyQagOEqWPmLJFzt0D8Vwq4ivPyIUmir8Zh3KpswpiMZBmfE6HiFQd28M7LCwT1NNQGosUu7PEu70zEwFgSvv+e+WEH8m/PMEjDwk1TtQU020P33Rkgiq/vRlCju8+3KX5pruIk21uFyt3ZQVE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4TpnJ45HDwz4f3k68; Tue, 5 Mar 2024 15:29:24 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.112]) by mail.maildlp.com (Postfix) with ESMTP id 3BE5C1A016E; Tue, 5 Mar 2024 15:29:28 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP1 (Coremail) with SMTP id cCh0CgBHGBHRyeZlGnf+Fw--.17927S8; Tue, 05 Mar 2024 15:29:28 +0800 (CST) From: Yu Kuai To: xni@redhat.com, zkabelac@redhat.com, agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com, dm-devel@lists.linux.dev, song@kernel.org, yukuai3@huawei.com, heinzm@redhat.com, jbrassow@redhat.com, neilb@suse.de Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH md-6.8 v2 4/9] md: add a new helper reshape_interrupted() Date: Tue, 5 Mar 2024 15:23:01 +0800 Message-Id: <20240305072306.2562024-5-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20240305072306.2562024-1-yukuai1@huaweicloud.com> References: <20240305072306.2562024-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: cCh0CgBHGBHRyeZlGnf+Fw--.17927S8 X-Coremail-Antispam: 1UD129KBjvJXoWrZw15Zr18uFW8Xr48Ar1fWFg_yoW8JF48pa yxGFZ8Zr4UAF9xGanrtF1kZa45uwn5KryDGFyfuan8AF13Xr129ryrWr1UArykGFySqa1a qa1rtFyUCr1xWw7anT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPF14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E 14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_GFv_WrylIx kGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI42IY6xIIjxv20xvEc7CjxVAF wI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVWUJV W8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnIWIevJa73UjIFyTuYvjfUOBTY UUUUU X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Content-Type: text/plain; charset="utf-8" From: Yu Kuai The helper will be used for dm-raid456 later to detect the case that reshape can't make progress. Signed-off-by: Yu Kuai Signed-off-by: Xiao Ni Acked-by: Mike Snitzer --- drivers/md/md.h | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/drivers/md/md.h b/drivers/md/md.h index 684cc866402a..7f955115d78d 100644 --- a/drivers/md/md.h +++ b/drivers/md/md.h @@ -570,6 +570,25 @@ static inline bool md_is_rdwr(struct mddev *mddev) return (mddev->ro =3D=3D MD_RDWR); } =20 +static inline bool reshape_interrupted(struct mddev *mddev) +{ + /* reshape never start */ + if (mddev->reshape_position =3D=3D MaxSector) + return false; + + /* interrupted */ + if (!test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)) + return true; + + /* running reshape will be interrupted soon. */ + if (test_bit(MD_RECOVERY_WAIT, &mddev->recovery) || + test_bit(MD_RECOVERY_INTR, &mddev->recovery) || + test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)) + return true; + + return false; +} + static inline int __must_check mddev_lock(struct mddev *mddev) { return mutex_lock_interruptible(&mddev->reconfig_mutex); --=20 2.39.2 From nobody Mon Feb 9 13:13:45 2026 Received: from dggsgout11.his.huawei.com (unknown [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 251567E56C; Tue, 5 Mar 2024 07:29:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709623774; cv=none; b=aVOMDYJYtZUWNDjvFlFxnHbR/Qo4qqPBY0+/nibBNHdZdkSXrYoTR8bAiBtB+VPfXkkT6cblrMImbHOK1zYpZ/n0R+O+BlW70x/Drybpxcu5CPauLheJhqi84SmnhduTMTC2gcqwaTLbm5C90BiJ5e4cKHl1IBtqKc99kriSlYE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709623774; c=relaxed/simple; bh=XvvAbpEePZjEC0cSUPQ2s2dg9t4q+qLsBBKfup6+WvE=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=cRaZuzzPXBarmtKmnuwfmESJfnoGBOcKtYq4dcMum3dUjotVNm8NFrg7OIwvDIS/GrdUGftHOatNguRr97dvS1BvFDl/felBAygsf6gIFyYjMQY0Rt4mz5aLGCOsIE5m1MHI65g8j8em3+Xt8pQl0tay9rqqs6x02hlqhCUFu5Y= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4TpnJ12XT8z4f3lgL; Tue, 5 Mar 2024 15:29:21 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.112]) by mail.maildlp.com (Postfix) with ESMTP id CFEA01A0175; Tue, 5 Mar 2024 15:29:28 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP1 (Coremail) with SMTP id cCh0CgBHGBHRyeZlGnf+Fw--.17927S9; Tue, 05 Mar 2024 15:29:28 +0800 (CST) From: Yu Kuai To: xni@redhat.com, zkabelac@redhat.com, agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com, dm-devel@lists.linux.dev, song@kernel.org, yukuai3@huawei.com, heinzm@redhat.com, jbrassow@redhat.com, neilb@suse.de Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH md-6.8 v2 5/9] dm-raid: really frozen sync_thread during suspend Date: Tue, 5 Mar 2024 15:23:02 +0800 Message-Id: <20240305072306.2562024-6-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20240305072306.2562024-1-yukuai1@huaweicloud.com> References: <20240305072306.2562024-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: cCh0CgBHGBHRyeZlGnf+Fw--.17927S9 X-Coremail-Antispam: 1UD129KBjvJXoWxtry3CryrWw1rXw13tw13Arb_yoW7Zr1rpr WrtFs8Ar4UJrW7ZFZFy3WvqFW5ZwnIqrZrtry3W3yrA3WfKwsakFy09a47ZFWDGas7Ja4Y yan8tFW3u3WDKrJanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPF14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E 14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_GFv_WrylIx kGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAF wI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVWUJV W8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnIWIevJa73UjIFyTuYvjfUOBTY UUUUU X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Content-Type: text/plain; charset="utf-8" From: Yu Kuai 1) commit f52f5c71f3d4 ("md: fix stopping sync thread") remove MD_RECOVERY_FROZEN from __md_stop_writes() and doesn't realize that dm-raid relies on __md_stop_writes() to frozen sync_thread indirectly. Fix this problem by adding MD_RECOVERY_FROZEN in md_stop_writes(), and since stop_sync_thread() is only used for dm-raid in this case, also move stop_sync_thread() to md_stop_writes(). 2) The flag MD_RECOVERY_FROZEN doesn't mean that sync thread is frozen, it only prevent new sync_thread to start, and it can't stop the running sync thread; In order to frozen sync_thread, after seting the flag, stop_sync_thread() should be used. 3) The flag MD_RECOVERY_FROZEN doesn't mean that writes are stopped, use it as condition for md_stop_writes() in raid_postsuspend() doesn't look correct. Consider that reentrant stop_sync_thread() do nothing, always call md_stop_writes() in raid_postsuspend(). 4) raid_message can set/clear the flag MD_RECOVERY_FROZEN at anytime, and if MD_RECOVERY_FROZEN is cleared while the array is suspended, new sync_thread can start unexpected. Fix this by disallow raid_message() to change sync_thread status during suspend. Note that after commit f52f5c71f3d4 ("md: fix stopping sync thread"), the test shell/lvconvert-raid-reshape.sh start to hang in stop_sync_thread(), and with previous fixes, the test won't hang there anymore, however, the test will still fail and complain that ext4 is corrupted. And with this patch, the test won't hang due to stop_sync_thread() or fail due to ext4 is corrupted anymore. However, there is still a deadlock related to dm-raid456 that will be fixed in following patches. Reported-by: Mikulas Patocka Closes: https://lore.kernel.org/all/e5e8afe2-e9a8-49a2-5ab0-958d4065c55e@re= dhat.com/ Fixes: 1af2048a3e87 ("dm raid: fix deadlock caused by premature md_stop_wri= tes()") Fixes: 9dbd1aa3a81c ("dm raid: add reshaping support to the target") Fixes: f52f5c71f3d4 ("md: fix stopping sync thread") Signed-off-by: Yu Kuai Signed-off-by: Xiao Ni Acked-by: Mike Snitzer --- drivers/md/dm-raid.c | 25 +++++++++++++++---------- drivers/md/md.c | 3 ++- 2 files changed, 17 insertions(+), 11 deletions(-) diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c index eb009d6bb03a..e2d7a73c0f87 100644 --- a/drivers/md/dm-raid.c +++ b/drivers/md/dm-raid.c @@ -3240,11 +3240,12 @@ static int raid_ctr(struct dm_target *ti, unsigned = int argc, char **argv) rs->md.ro =3D 1; rs->md.in_sync =3D 1; =20 - /* Keep array frozen until resume. */ - set_bit(MD_RECOVERY_FROZEN, &rs->md.recovery); - /* Has to be held on running the array */ mddev_suspend_and_lock_nointr(&rs->md); + + /* Keep array frozen until resume. */ + md_frozen_sync_thread(&rs->md); + r =3D md_run(&rs->md); rs->md.in_sync =3D 0; /* Assume already marked dirty */ if (r) { @@ -3722,6 +3723,9 @@ static int raid_message(struct dm_target *ti, unsigne= d int argc, char **argv, if (!mddev->pers || !mddev->pers->sync_request) return -EINVAL; =20 + if (test_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags)) + return -EBUSY; + if (!strcasecmp(argv[0], "frozen")) set_bit(MD_RECOVERY_FROZEN, &mddev->recovery); else @@ -3796,10 +3800,11 @@ static void raid_postsuspend(struct dm_target *ti) struct raid_set *rs =3D ti->private; =20 if (!test_and_set_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags)) { - /* Writes have to be stopped before suspending to avoid deadlocks. */ - if (!test_bit(MD_RECOVERY_FROZEN, &rs->md.recovery)) - md_stop_writes(&rs->md); - + /* + * sync_thread must be stopped during suspend, and writes have + * to be stopped before suspending to avoid deadlocks. + */ + md_stop_writes(&rs->md); mddev_suspend(&rs->md, false); } } @@ -4012,8 +4017,6 @@ static int raid_preresume(struct dm_target *ti) } =20 /* Check for any resize/reshape on @rs and adjust/initiate */ - /* Be prepared for mddev_resume() in raid_resume() */ - set_bit(MD_RECOVERY_FROZEN, &mddev->recovery); if (mddev->recovery_cp && mddev->recovery_cp < MaxSector) { set_bit(MD_RECOVERY_REQUESTED, &mddev->recovery); mddev->resync_min =3D mddev->recovery_cp; @@ -4055,10 +4058,12 @@ static void raid_resume(struct dm_target *ti) if (mddev->delta_disks < 0) rs_set_capacity(rs); =20 + WARN_ON_ONCE(!test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)); + WARN_ON_ONCE(test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)); mddev_lock_nointr(mddev); - clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery); mddev->ro =3D 0; mddev->in_sync =3D 0; + md_unfrozen_sync_thread(mddev); mddev_unlock_and_resume(mddev); } } diff --git a/drivers/md/md.c b/drivers/md/md.c index 306cbbf92fcf..e9e666240e9f 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -6335,7 +6335,6 @@ static void md_clean(struct mddev *mddev) =20 static void __md_stop_writes(struct mddev *mddev) { - stop_sync_thread(mddev, true, false); del_timer_sync(&mddev->safemode_timer); =20 if (mddev->pers && mddev->pers->quiesce) { @@ -6360,6 +6359,8 @@ static void __md_stop_writes(struct mddev *mddev) void md_stop_writes(struct mddev *mddev) { mddev_lock_nointr(mddev); + set_bit(MD_RECOVERY_FROZEN, &mddev->recovery); + stop_sync_thread(mddev, true, false); __md_stop_writes(mddev); mddev_unlock(mddev); } --=20 2.39.2 From nobody Mon Feb 9 13:13:45 2026 Received: from dggsgout11.his.huawei.com (unknown [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 541BA7E59D; Tue, 5 Mar 2024 07:29:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709623774; cv=none; b=s6D1rejJecjiposQNfvouLvr3Q1MLCzmeDTGZe6HrU/T4T9NLSuNXSeuQPlRislFXsNrsXnjAqI6YKK9q9pyJ31aQI7F28X+NScsZKCvUnBs+oBW8YK3WfuwzMfH9ZIvK3iqHjPrdB0TLTo7RhmQRQB+jIq1BOPIgZi2ErGsT7I= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709623774; c=relaxed/simple; bh=wdrKCSawGx8VEqyGWfPkhEjDBA0pBVGNi+vB2GWEzzE=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=LAjbNKq+ZT02FO58LVO3VdqhQjIL/uJNnvcUpnp3h3Yk96mlGE1Ioh+HfYXSZ68j/yHGTZaUxYJw86f6WHT2L8R8RCpvXzOsvHDUGkOlsu+lL+3Ce4U0x+X5TeEbJzlGkjZWbOUzuRUymZEO2JGtBtqfu0FxBO9DsjH5WmjEPeY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4TpnJ16xlpz4f3m6n; Tue, 5 Mar 2024 15:29:21 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.112]) by mail.maildlp.com (Postfix) with ESMTP id 71BF31A0C2A; Tue, 5 Mar 2024 15:29:29 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP1 (Coremail) with SMTP id cCh0CgBHGBHRyeZlGnf+Fw--.17927S10; Tue, 05 Mar 2024 15:29:29 +0800 (CST) From: Yu Kuai To: xni@redhat.com, zkabelac@redhat.com, agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com, dm-devel@lists.linux.dev, song@kernel.org, yukuai3@huawei.com, heinzm@redhat.com, jbrassow@redhat.com, neilb@suse.de Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH md-6.8 v2 6/9] md/dm-raid: don't call md_reap_sync_thread() directly Date: Tue, 5 Mar 2024 15:23:03 +0800 Message-Id: <20240305072306.2562024-7-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20240305072306.2562024-1-yukuai1@huaweicloud.com> References: <20240305072306.2562024-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: cCh0CgBHGBHRyeZlGnf+Fw--.17927S10 X-Coremail-Antispam: 1UD129KBjvJXoW7AF1xAF18KFyDZF15tw4UXFb_yoW5JFWrpa yakas8Ar48JrW3ZFsrt3Wq9ayFv3ZFgrWqyrWfG343JF1fKr13Wryj9a17ZFWDXFWrta45 XFZ8tF45uF4YqFJanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPI14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E 14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_GFv_WrylIx kGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAF wI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Jr 0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBIdaVFxhVjvjDU0xZFpf9x0JUQ SdkUUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Content-Type: text/plain; charset="utf-8" From: Yu Kuai Currently md_reap_sync_thread() is called from raid_message() directly without holding 'reconfig_mutex', this is definitely unsafe because md_reap_sync_thread() can change many fields that is protected by 'reconfig_mutex'. However, hold 'reconfig_mutex' here is still problematic because this will cause deadlock, for example, commit 130443d60b1b ("md: refactor idle/frozen_sync_thread() to fix deadlock"). Fix this problem by using stop_sync_thread() to unregister sync_thread, like md/raid did. Fixes: be83651f0050 ("DM RAID: Add message/status support for changing sync= action") Signed-off-by: Yu Kuai Signed-off-by: Xiao Ni Acked-by: Mike Snitzer --- drivers/md/dm-raid.c | 28 ++++++++++++++++++---------- 1 file changed, 18 insertions(+), 10 deletions(-) diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c index e2d7a73c0f87..47c4b1b6e532 100644 --- a/drivers/md/dm-raid.c +++ b/drivers/md/dm-raid.c @@ -3719,6 +3719,7 @@ static int raid_message(struct dm_target *ti, unsigne= d int argc, char **argv, { struct raid_set *rs =3D ti->private; struct mddev *mddev =3D &rs->md; + int ret =3D 0; =20 if (!mddev->pers || !mddev->pers->sync_request) return -EINVAL; @@ -3726,17 +3727,24 @@ static int raid_message(struct dm_target *ti, unsig= ned int argc, char **argv, if (test_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags)) return -EBUSY; =20 - if (!strcasecmp(argv[0], "frozen")) - set_bit(MD_RECOVERY_FROZEN, &mddev->recovery); - else - clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery); + if (!strcasecmp(argv[0], "frozen")) { + ret =3D mddev_lock(mddev); + if (ret) + return ret; =20 - if (!strcasecmp(argv[0], "idle") || !strcasecmp(argv[0], "frozen")) { - if (mddev->sync_thread) { - set_bit(MD_RECOVERY_INTR, &mddev->recovery); - md_reap_sync_thread(mddev); - } - } else if (decipher_sync_action(mddev, mddev->recovery) !=3D st_idle) + md_frozen_sync_thread(mddev); + mddev_unlock(mddev); + } else if (!strcasecmp(argv[0], "idle")) { + ret =3D mddev_lock(mddev); + if (ret) + return ret; + + md_idle_sync_thread(mddev); + mddev_unlock(mddev); + } + + clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery); + if (decipher_sync_action(mddev, mddev->recovery) !=3D st_idle) return -EBUSY; else if (!strcasecmp(argv[0], "resync")) ; /* MD_RECOVERY_NEEDED set below */ --=20 2.39.2 From nobody Mon Feb 9 13:13:45 2026 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 32AC77E56C; Tue, 5 Mar 2024 07:29:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709623780; cv=none; b=XNw0ICJkI6nFnS6b1Fc3FSIB6wZTyjWciv7zCjR11s7jVKprNp2nK+xHT4qh/cprXq/6OgQUePyGP8X0ScQiULHVYkpTW90kuVwewnvFK34RfIE8yQM/3LlP1nmHxQUcV1BGiZxDdWpseS9fvDhO67/mMUbQyXmZnLt/fcPBWtY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709623780; c=relaxed/simple; bh=c7vtsLwByVaJ/kOv3+/pS6rhbOr70p9cvZ23xJScbOc=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=S3dBt86v1qNr2F0zU97GaUiiTRXuNsfArfJDfTmWXiumNvV3S8CYbaucA/Hw9PHWPCQ0ee0e3xIEcAmNMg4aS5FTr7eE7QFCnEpU9AmtEs1HYMLmlnslGzUqtVGW3Qyy1pH0K1ljBuU1igdLsxZdJ40DHPwuAuYFLLW0x08+oxs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTP id 4TpnJ44p7kz4f3jsr; Tue, 5 Mar 2024 15:29:24 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.112]) by mail.maildlp.com (Postfix) with ESMTP id 1399D1A0232; Tue, 5 Mar 2024 15:29:30 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP1 (Coremail) with SMTP id cCh0CgBHGBHRyeZlGnf+Fw--.17927S11; Tue, 05 Mar 2024 15:29:29 +0800 (CST) From: Yu Kuai To: xni@redhat.com, zkabelac@redhat.com, agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com, dm-devel@lists.linux.dev, song@kernel.org, yukuai3@huawei.com, heinzm@redhat.com, jbrassow@redhat.com, neilb@suse.de Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH md-6.8 v2 7/9] dm-raid: add a new helper prepare_suspend() in md_personality Date: Tue, 5 Mar 2024 15:23:04 +0800 Message-Id: <20240305072306.2562024-8-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20240305072306.2562024-1-yukuai1@huaweicloud.com> References: <20240305072306.2562024-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: cCh0CgBHGBHRyeZlGnf+Fw--.17927S11 X-Coremail-Antispam: 1UD129KBjvJXoW7Kr47Kr4xtrykGw13uF4fGrg_yoW8uFW7pa yIq34Yvr4UJFZI93WDJF4kua4aq3Z3KrWjyr9xJayfZFy2gryrGa4ftayYqrZ8CF15K3W7 AF4Ut34UWr109FDanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPI14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E 14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_GFv_WrylIx kGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAF wI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Jr 0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBIdaVFxhVjvjDU0xZFpf9x0JUQ SdkUUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Content-Type: text/plain; charset="utf-8" From: Yu Kuai There are no functional changes for now, prepare to fix a deadlock for dm-raid456. Signed-off-by: Yu Kuai Signed-off-by: Xiao Ni Acked-by: Mike Snitzer --- drivers/md/dm-raid.c | 18 ++++++++++++++++++ drivers/md/md.h | 1 + 2 files changed, 19 insertions(+) diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c index 47c4b1b6e532..7d48943acd57 100644 --- a/drivers/md/dm-raid.c +++ b/drivers/md/dm-raid.c @@ -3803,6 +3803,23 @@ static void raid_io_hints(struct dm_target *ti, stru= ct queue_limits *limits) blk_limits_io_opt(limits, chunk_size_bytes * mddev_data_stripes(rs)); } =20 +static void raid_presuspend(struct dm_target *ti) +{ + struct raid_set *rs =3D ti->private; + struct mddev *mddev =3D &rs->md; + + if (!reshape_interrupted(mddev)) + return; + + /* + * For raid456, if reshape is interrupted, IO across reshape position + * will never make progress, while caller will wait for IO to be done. + * Inform raid456 to handle those IO to prevent deadlock. + */ + if (mddev->pers && mddev->pers->prepare_suspend) + mddev->pers->prepare_suspend(mddev); +} + static void raid_postsuspend(struct dm_target *ti) { struct raid_set *rs =3D ti->private; @@ -4087,6 +4104,7 @@ static struct target_type raid_target =3D { .message =3D raid_message, .iterate_devices =3D raid_iterate_devices, .io_hints =3D raid_io_hints, + .presuspend =3D raid_presuspend, .postsuspend =3D raid_postsuspend, .preresume =3D raid_preresume, .resume =3D raid_resume, diff --git a/drivers/md/md.h b/drivers/md/md.h index 7f955115d78d..2db0a75febf9 100644 --- a/drivers/md/md.h +++ b/drivers/md/md.h @@ -648,6 +648,7 @@ struct md_personality int (*start_reshape) (struct mddev *mddev); void (*finish_reshape) (struct mddev *mddev); void (*update_reshape_pos) (struct mddev *mddev); + void (*prepare_suspend) (struct mddev *mddev); /* quiesce suspends or resumes internal processing. * 1 - stop new actions and wait for action io to complete * 0 - return to normal behaviour --=20 2.39.2 From nobody Mon Feb 9 13:13:45 2026 Received: from dggsgout11.his.huawei.com (unknown [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9811B7EF09; Tue, 5 Mar 2024 07:29:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709623776; cv=none; b=KJ45pPdNi6XKFEQaIY8Vs3uChQxXAYn5rSTbqt2791GiTnEIGYTZ038ggZhO8HLSca0LFfBv0uLL+W3qnd4Dmq9bjwCczwuccksaSXI2Tfnb7OmZOl2ZT/3B31SvOYanccRTNvK4yELUNYiYnRjHthTbZsB4WTLxAARkc/SBZUI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709623776; c=relaxed/simple; bh=DdK9NSLtymzytVxiXQOrSWhKL0dKh4wr7ZOBCaMyZaA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=AyN9v38b9gx6G19OHV9/VNSsSQqefcf9G/wtOzzldzz45PFZ8kZ04v07CJVsH71hL8jwUFS+MGM11Fdb1sAF+sZ7nOWjn+rJQr6SOo0LeeSxOYtYio5rQXV8huBeWp5iUzhL7ddy0ajZkwe7ZA8JjG2XowQ/fbG1QZtXWzHvRxM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4TpnJ71Nhdz4f3k6M; Tue, 5 Mar 2024 15:29:27 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.112]) by mail.maildlp.com (Postfix) with ESMTP id A99951A0175; Tue, 5 Mar 2024 15:29:30 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP1 (Coremail) with SMTP id cCh0CgBHGBHRyeZlGnf+Fw--.17927S12; Tue, 05 Mar 2024 15:29:30 +0800 (CST) From: Yu Kuai To: xni@redhat.com, zkabelac@redhat.com, agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com, dm-devel@lists.linux.dev, song@kernel.org, yukuai3@huawei.com, heinzm@redhat.com, jbrassow@redhat.com, neilb@suse.de Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH md-6.8 v2 8/9] dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape Date: Tue, 5 Mar 2024 15:23:05 +0800 Message-Id: <20240305072306.2562024-9-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20240305072306.2562024-1-yukuai1@huaweicloud.com> References: <20240305072306.2562024-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: cCh0CgBHGBHRyeZlGnf+Fw--.17927S12 X-Coremail-Antispam: 1UD129KBjvJXoW3CFy3GFy5uFW5XF1UGF43GFg_yoWkCr1rpa y7tas0vr4jq3yY9w4DAFWkWa4SgFn7KrZFyFZ3J34fA3W3trn5C3W5Xa1Yvrn8CFyrury5 Aa15tayDurWSgFDanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPI14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E 14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_GFv_WrylIx kGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAF wI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Jr 0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBIdaVFxhVjvjDU0xZFpf9x0JUQ SdkUUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Content-Type: text/plain; charset="utf-8" From: Yu Kuai For raid456, if reshape is still in progress, then IO across reshape position will wait for reshape to make progress. However, for dm-raid, in following cases reshape will never make progress hence IO will hang: 1) the array is read-only; 2) MD_RECOVERY_WAIT is set; 3) MD_RECOVERY_FROZEN is set; After commit c467e97f079f ("md/raid6: use valid sector values to determine if an I/O should wait on the reshape") fix the problem that IO across reshape position doesn't wait for reshape, the dm-raid test shell/lvconvert-raid-reshape.sh start to hang: [root@fedora ~]# cat /proc/979/stack [<0>] wait_woken+0x7d/0x90 [<0>] raid5_make_request+0x929/0x1d70 [raid456] [<0>] md_handle_request+0xc2/0x3b0 [md_mod] [<0>] raid_map+0x2c/0x50 [dm_raid] [<0>] __map_bio+0x251/0x380 [dm_mod] [<0>] dm_submit_bio+0x1f0/0x760 [dm_mod] [<0>] __submit_bio+0xc2/0x1c0 [<0>] submit_bio_noacct_nocheck+0x17f/0x450 [<0>] submit_bio_noacct+0x2bc/0x780 [<0>] submit_bio+0x70/0xc0 [<0>] mpage_readahead+0x169/0x1f0 [<0>] blkdev_readahead+0x18/0x30 [<0>] read_pages+0x7c/0x3b0 [<0>] page_cache_ra_unbounded+0x1ab/0x280 [<0>] force_page_cache_ra+0x9e/0x130 [<0>] page_cache_sync_ra+0x3b/0x110 [<0>] filemap_get_pages+0x143/0xa30 [<0>] filemap_read+0xdc/0x4b0 [<0>] blkdev_read_iter+0x75/0x200 [<0>] vfs_read+0x272/0x460 [<0>] ksys_read+0x7a/0x170 [<0>] __x64_sys_read+0x1c/0x30 [<0>] do_syscall_64+0xc6/0x230 [<0>] entry_SYSCALL_64_after_hwframe+0x6c/0x74 This is because reshape can't make progress. For md/raid, the problem doesn't exist because register new sync_thread doesn't rely on the IO to be done any more: 1) If array is read-only, it can switch to read-write by ioctl/sysfs; 2) md/raid never set MD_RECOVERY_WAIT; 3) If MD_RECOVERY_FROZEN is set, mddev_suspend() doesn't hold 'reconfig_mutex', hence it can be cleared and reshape can continue by sysfs api 'sync_action'. However, I'm not sure yet how to avoid the problem in dm-raid yet. This patch on the one hand make sure raid_message() can't change sync_thread() through raid_message() after presuspend(), on the other hand detect the above 3 cases before wait for IO do be done in dm_suspend(), and let dm-raid requeue those IO. Signed-off-by: Yu Kuai Signed-off-by: Xiao Ni Acked-by: Mike Snitzer --- drivers/md/dm-raid.c | 22 ++++++++++++++++++++-- drivers/md/md.c | 24 ++++++++++++++++++++++-- drivers/md/md.h | 3 ++- drivers/md/raid5.c | 32 ++++++++++++++++++++++++++++++-- 4 files changed, 74 insertions(+), 7 deletions(-) diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c index 7d48943acd57..ea45f777691c 100644 --- a/drivers/md/dm-raid.c +++ b/drivers/md/dm-raid.c @@ -213,6 +213,7 @@ struct raid_dev { #define RT_FLAG_RS_IN_SYNC 6 #define RT_FLAG_RS_RESYNCING 7 #define RT_FLAG_RS_GROW 8 +#define RT_FLAG_RS_FROZEN 9 =20 /* Array elements of 64 bit needed for rebuild/failed disk bits */ #define DISKS_ARRAY_ELEMS ((MAX_RAID_DEVICES + (sizeof(uint64_t) * 8 - 1))= / sizeof(uint64_t) / 8) @@ -3340,7 +3341,8 @@ static int raid_map(struct dm_target *ti, struct bio = *bio) if (unlikely(bio_end_sector(bio) > mddev->array_sectors)) return DM_MAPIO_REQUEUE; =20 - md_handle_request(mddev, bio); + if (unlikely(!md_handle_request(mddev, bio))) + return DM_MAPIO_REQUEUE; =20 return DM_MAPIO_SUBMITTED; } @@ -3724,7 +3726,8 @@ static int raid_message(struct dm_target *ti, unsigne= d int argc, char **argv, if (!mddev->pers || !mddev->pers->sync_request) return -EINVAL; =20 - if (test_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags)) + if (test_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags) || + test_bit(RT_FLAG_RS_FROZEN, &rs->runtime_flags)) return -EBUSY; =20 if (!strcasecmp(argv[0], "frozen")) { @@ -3808,6 +3811,12 @@ static void raid_presuspend(struct dm_target *ti) struct raid_set *rs =3D ti->private; struct mddev *mddev =3D &rs->md; =20 + /* + * From now on, disallow raid_message() to change sync_thread until + * resume, raid_postsuspend() is too late. + */ + set_bit(RT_FLAG_RS_FROZEN, &rs->runtime_flags); + if (!reshape_interrupted(mddev)) return; =20 @@ -3820,6 +3829,13 @@ static void raid_presuspend(struct dm_target *ti) mddev->pers->prepare_suspend(mddev); } =20 +static void raid_presuspend_undo(struct dm_target *ti) +{ + struct raid_set *rs =3D ti->private; + + clear_bit(RT_FLAG_RS_FROZEN, &rs->runtime_flags); +} + static void raid_postsuspend(struct dm_target *ti) { struct raid_set *rs =3D ti->private; @@ -4085,6 +4101,7 @@ static void raid_resume(struct dm_target *ti) =20 WARN_ON_ONCE(!test_bit(MD_RECOVERY_FROZEN, &mddev->recovery)); WARN_ON_ONCE(test_bit(MD_RECOVERY_RUNNING, &mddev->recovery)); + clear_bit(RT_FLAG_RS_FROZEN, &rs->runtime_flags); mddev_lock_nointr(mddev); mddev->ro =3D 0; mddev->in_sync =3D 0; @@ -4105,6 +4122,7 @@ static struct target_type raid_target =3D { .iterate_devices =3D raid_iterate_devices, .io_hints =3D raid_io_hints, .presuspend =3D raid_presuspend, + .presuspend_undo =3D raid_presuspend_undo, .postsuspend =3D raid_postsuspend, .preresume =3D raid_preresume, .resume =3D raid_resume, diff --git a/drivers/md/md.c b/drivers/md/md.c index e9e666240e9f..50d7bfa3c6f9 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -366,7 +366,7 @@ static bool is_suspended(struct mddev *mddev, struct bi= o *bio) return true; } =20 -void md_handle_request(struct mddev *mddev, struct bio *bio) +bool md_handle_request(struct mddev *mddev, struct bio *bio) { check_suspended: if (is_suspended(mddev, bio)) { @@ -374,7 +374,7 @@ void md_handle_request(struct mddev *mddev, struct bio = *bio) /* Bail out if REQ_NOWAIT is set for the bio */ if (bio->bi_opf & REQ_NOWAIT) { bio_wouldblock_error(bio); - return; + return true; } for (;;) { prepare_to_wait(&mddev->sb_wait, &__wait, @@ -390,10 +390,13 @@ void md_handle_request(struct mddev *mddev, struct bi= o *bio) =20 if (!mddev->pers->make_request(mddev, bio)) { percpu_ref_put(&mddev->active_io); + if (!mddev->gendisk && mddev->pers->prepare_suspend) + return false; goto check_suspended; } =20 percpu_ref_put(&mddev->active_io); + return true; } EXPORT_SYMBOL(md_handle_request); =20 @@ -8758,6 +8761,23 @@ void md_account_bio(struct mddev *mddev, struct bio = **bio) } EXPORT_SYMBOL_GPL(md_account_bio); =20 +void md_free_cloned_bio(struct bio *bio) +{ + struct md_io_clone *md_io_clone =3D bio->bi_private; + struct bio *orig_bio =3D md_io_clone->orig_bio; + struct mddev *mddev =3D md_io_clone->mddev; + + if (bio->bi_status && !orig_bio->bi_status) + orig_bio->bi_status =3D bio->bi_status; + + if (md_io_clone->start_time) + bio_end_io_acct(orig_bio, md_io_clone->start_time); + + bio_put(bio); + percpu_ref_put(&mddev->active_io); +} +EXPORT_SYMBOL_GPL(md_free_cloned_bio); + /* md_allow_write(mddev) * Calling this ensures that the array is marked 'active' so that writes * may proceed without blocking. It is important to call this before diff --git a/drivers/md/md.h b/drivers/md/md.h index 2db0a75febf9..3133650b57d4 100644 --- a/drivers/md/md.h +++ b/drivers/md/md.h @@ -782,6 +782,7 @@ extern void md_finish_reshape(struct mddev *mddev); void md_submit_discard_bio(struct mddev *mddev, struct md_rdev *rdev, struct bio *bio, sector_t start, sector_t size); void md_account_bio(struct mddev *mddev, struct bio **bio); +void md_free_cloned_bio(struct bio *bio); =20 extern bool __must_check md_flush_request(struct mddev *mddev, struct bio = *bio); extern void md_super_write(struct mddev *mddev, struct md_rdev *rdev, @@ -810,7 +811,7 @@ extern void md_stop_writes(struct mddev *mddev); extern int md_rdev_init(struct md_rdev *rdev); extern void md_rdev_clear(struct md_rdev *rdev); =20 -extern void md_handle_request(struct mddev *mddev, struct bio *bio); +extern bool md_handle_request(struct mddev *mddev, struct bio *bio); extern int mddev_suspend(struct mddev *mddev, bool interruptible); extern void mddev_resume(struct mddev *mddev); extern void md_idle_sync_thread(struct mddev *mddev); diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 6a7a32f7fb91..38a49f6eb54b 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -760,6 +760,7 @@ enum stripe_result { STRIPE_RETRY, STRIPE_SCHEDULE_AND_RETRY, STRIPE_FAIL, + STRIPE_WAIT_RESHAPE, }; =20 struct stripe_request_ctx { @@ -5946,7 +5947,8 @@ static enum stripe_result make_stripe_request(struct = mddev *mddev, if (ahead_of_reshape(mddev, logical_sector, conf->reshape_safe)) { spin_unlock_irq(&conf->device_lock); - return STRIPE_SCHEDULE_AND_RETRY; + ret =3D STRIPE_SCHEDULE_AND_RETRY; + goto out; } } spin_unlock_irq(&conf->device_lock); @@ -6025,6 +6027,12 @@ static enum stripe_result make_stripe_request(struct= mddev *mddev, =20 out_release: raid5_release_stripe(sh); +out: + if (ret =3D=3D STRIPE_SCHEDULE_AND_RETRY && reshape_interrupted(mddev)) { + bi->bi_status =3D BLK_STS_RESOURCE; + ret =3D STRIPE_WAIT_RESHAPE; + pr_err_ratelimited("dm-raid456: io across reshape position while reshape= can't make progress"); + } return ret; } =20 @@ -6146,7 +6154,7 @@ static bool raid5_make_request(struct mddev *mddev, s= truct bio * bi) while (1) { res =3D make_stripe_request(mddev, conf, &ctx, logical_sector, bi); - if (res =3D=3D STRIPE_FAIL) + if (res =3D=3D STRIPE_FAIL || res =3D=3D STRIPE_WAIT_RESHAPE) break; =20 if (res =3D=3D STRIPE_RETRY) @@ -6184,6 +6192,11 @@ static bool raid5_make_request(struct mddev *mddev, = struct bio * bi) =20 if (rw =3D=3D WRITE) md_write_end(mddev); + if (res =3D=3D STRIPE_WAIT_RESHAPE) { + md_free_cloned_bio(bi); + return false; + } + bio_endio(bi); return true; } @@ -8909,6 +8922,18 @@ static int raid5_start(struct mddev *mddev) return r5l_start(conf->log); } =20 +/* + * This is only used for dm-raid456, caller already frozen sync_thread, he= nce + * if rehsape is still in progress, io that is waiting for reshape can nev= er be + * done now, hence wake up and handle those IO. + */ +static void raid5_prepare_suspend(struct mddev *mddev) +{ + struct r5conf *conf =3D mddev->private; + + wake_up(&conf->wait_for_overlap); +} + static struct md_personality raid6_personality =3D { .name =3D "raid6", @@ -8932,6 +8957,7 @@ static struct md_personality raid6_personality =3D .quiesce =3D raid5_quiesce, .takeover =3D raid6_takeover, .change_consistency_policy =3D raid5_change_consistency_policy, + .prepare_suspend =3D raid5_prepare_suspend, }; static struct md_personality raid5_personality =3D { @@ -8956,6 +8982,7 @@ static struct md_personality raid5_personality =3D .quiesce =3D raid5_quiesce, .takeover =3D raid5_takeover, .change_consistency_policy =3D raid5_change_consistency_policy, + .prepare_suspend =3D raid5_prepare_suspend, }; =20 static struct md_personality raid4_personality =3D @@ -8981,6 +9008,7 @@ static struct md_personality raid4_personality =3D .quiesce =3D raid5_quiesce, .takeover =3D raid4_takeover, .change_consistency_policy =3D raid5_change_consistency_policy, + .prepare_suspend =3D raid5_prepare_suspend, }; =20 static int __init raid5_init(void) --=20 2.39.2 From nobody Mon Feb 9 13:13:45 2026 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 274727F474; Tue, 5 Mar 2024 07:29:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709623777; cv=none; b=JGIBNaHBIqT5oxWTKDGIdz4oQsAKQtLHuZSHK010k9uTTHXCte00Ui5rC9xRPs61hELhzYYp/U1HBDTA1kliIj/MnpQfJXySEYElcod41Zp5e+D379gsAGJeQjdbZ6jF1FuRG0VK2ohbmlK0MgQuUzBIkJJ+3GuZHWVDh5E5ccQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709623777; c=relaxed/simple; bh=KEW/XJoqhjDOxQSQ1eSle9j1tLlHhzVv3KI/rIzFYbI=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=HUuEQ5PifwROOHd5B8pLiSCz8Xt47KO0+snPX6u+3qf9VOCyCfImBRLW4FxL7hpMknWdSBj3pwQVs5meR/CPU87h1Eo8PQ33hHX06PIYEbD+7AxzuwwbjPEgStIbvythJFAvHHOWnqH+y3qvmocnDSYptSloWBDjSX8zrUj8iyA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTP id 4TpnJ75gYJz4f3kJs; Tue, 5 Mar 2024 15:29:27 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.112]) by mail.maildlp.com (Postfix) with ESMTP id 49C9C1A0C3A; Tue, 5 Mar 2024 15:29:31 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.104.67]) by APP1 (Coremail) with SMTP id cCh0CgBHGBHRyeZlGnf+Fw--.17927S13; Tue, 05 Mar 2024 15:29:31 +0800 (CST) From: Yu Kuai To: xni@redhat.com, zkabelac@redhat.com, agk@redhat.com, snitzer@kernel.org, mpatocka@redhat.com, dm-devel@lists.linux.dev, song@kernel.org, yukuai3@huawei.com, heinzm@redhat.com, jbrassow@redhat.com, neilb@suse.de Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com Subject: [PATCH md-6.8 v2 9/9] dm-raid: fix lockdep waring in "pers->hot_add_disk" Date: Tue, 5 Mar 2024 15:23:06 +0800 Message-Id: <20240305072306.2562024-10-yukuai1@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20240305072306.2562024-1-yukuai1@huaweicloud.com> References: <20240305072306.2562024-1-yukuai1@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: cCh0CgBHGBHRyeZlGnf+Fw--.17927S13 X-Coremail-Antispam: 1UD129KBjvJXoW7Cr45Jw1UZF45JF13GFW5GFg_yoW8GFy3p3 ZrK343Kw4DJr48Z3Wqvw1q9a45tan8K3ySyrZxG3s5ZFy2vrZI9wn8Ka1agFWDZFyftFW5 AF43J3yrW3Z8t3DanT9S1TB71UUUUUUqnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUPI14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_Ar0_tr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUJVWUGwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E 14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67kF1VAFwI0_GFv_WrylIx kGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAF wI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Jr 0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBIdaVFxhVjvjDU0xZFpf9x0JUQ SdkUUUUU= X-CM-SenderInfo: 51xn3trlr6x35dzhxuhorxvhhfrp/ Content-Type: text/plain; charset="utf-8" From: Yu Kuai The lockdep assert is added by commit a448af25becf ("md/raid10: remove rcu protection to access rdev from conf") in print_conf(). And I didn't notice that dm-raid is calling "pers->hot_add_disk" without holding 'reconfig_mutex'. "pers->hot_add_disk" read and write many fields that is protected by 'reconfig_mutex', and raid_resume() already grab the lock in other contex. Hence fix this problem by protecting "pers->host_add_disk" with the lock. Fixes: 9092c02d9435 ("DM RAID: Add ability to restore transiently failed de= vices on resume") Fixes: a448af25becf ("md/raid10: remove rcu protection to access rdev from = conf") Signed-off-by: Yu Kuai Signed-off-by: Xiao Ni Acked-by: Mike Snitzer --- drivers/md/dm-raid.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/drivers/md/dm-raid.c b/drivers/md/dm-raid.c index ea45f777691c..17e9af60bbf7 100644 --- a/drivers/md/dm-raid.c +++ b/drivers/md/dm-raid.c @@ -4091,7 +4091,9 @@ static void raid_resume(struct dm_target *ti) * Take this opportunity to check whether any failed * devices are reachable again. */ + mddev_lock_nointr(mddev); attempt_restore_of_faulty_devices(rs); + mddev_unlock(mddev); } =20 if (test_and_clear_bit(RT_FLAG_RS_SUSPENDED, &rs->runtime_flags)) { --=20 2.39.2