From nobody Mon Feb 9 04:08:20 2026
From: linan666@huaweicloud.com
To: song@kernel.org, yukuai@fnnas.com, neil@brown.name, namhyung@gmail.com
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org,
	linan666@huaweicloud.com, yangerkun@huawei.com, yi.zhang@huawei.com
Subject: [PATCH v4 01/12] md/raid1: simplify uptodate handling in end_sync_write
Date: Mon, 5 Jan 2026 19:02:49 +0800
Message-Id: <20260105110300.1442509-2-linan666@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20260105110300.1442509-1-linan666@huaweicloud.com>
References: <20260105110300.1442509-1-linan666@huaweicloud.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Li Nan

In end_sync_write(), the r1bio state is always set to either
R1BIO_WriteError or R1BIO_MadeGood. Consequently, put_sync_write_buf()
never takes the 'else' branch that calls md_done_sync(), so the
'uptodate' parameter has no practical effect. Pass 1 to
put_sync_write_buf(). A more complete cleanup will be done in a
follow-up patch.

Signed-off-by: Li Nan
Reviewed-by: Yu Kuai
---
 drivers/md/raid1.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 57d50465eed1..6af75b82bc64 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2080,13 +2080,12 @@ static void put_sync_write_buf(struct r1bio *r1_bio, int uptodate)
 
 static void end_sync_write(struct bio *bio)
 {
-	int uptodate = !bio->bi_status;
 	struct r1bio *r1_bio = get_resync_r1bio(bio);
 	struct mddev *mddev = r1_bio->mddev;
 	struct r1conf *conf = mddev->private;
 	struct md_rdev *rdev = conf->mirrors[find_bio_disk(r1_bio, bio)].rdev;
 
-	if (!uptodate) {
+	if (bio->bi_status) {
 		abort_sync_write(mddev, r1_bio);
 		set_bit(WriteErrorSeen, &rdev->flags);
 		if (!test_and_set_bit(WantReplacement, &rdev->flags))
@@ -2099,7 +2098,7 @@ static void end_sync_write(struct bio *bio)
 		set_bit(R1BIO_MadeGood, &r1_bio->state);
 	}
 
-	put_sync_write_buf(r1_bio, uptodate);
+	put_sync_write_buf(r1_bio, 1);
 }
 
 static int r1_sync_page_io(struct md_rdev *rdev, sector_t sector,
-- 
2.39.2

From nobody Mon Feb 9 04:08:20 2026
From: linan666@huaweicloud.com
To: song@kernel.org, yukuai@fnnas.com, neil@brown.name, namhyung@gmail.com
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org,
	linan666@huaweicloud.com, yangerkun@huawei.com, yi.zhang@huawei.com
Subject: [PATCH v4 02/12] md: factor error handling out of md_done_sync into helper
Date: Mon, 5 Jan 2026 19:02:50 +0800
Message-Id: <20260105110300.1442509-3-linan666@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20260105110300.1442509-1-linan666@huaweicloud.com>
References: <20260105110300.1442509-1-linan666@huaweicloud.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Li Nan

The 'ok' parameter of md_done_sync() is redundant: most callers always
pass 'true'. Factor the error handling logic out into a separate
helper, md_sync_error(), to eliminate the unnecessary parameter and
improve code clarity. No functional change.
Signed-off-by: Li Nan
Reviewed-by: Yu Kuai
---
 drivers/md/md.h     |  3 ++-
 drivers/md/md.c     | 17 ++++++++++-------
 drivers/md/raid1.c  | 14 +++++++-------
 drivers/md/raid10.c | 11 ++++++-----
 drivers/md/raid5.c  | 14 ++++++++------
 5 files changed, 33 insertions(+), 26 deletions(-)

diff --git a/drivers/md/md.h b/drivers/md/md.h
index 6985f2829bbd..8871c88ceef1 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -912,7 +912,8 @@ extern const char *md_sync_action_name(enum sync_action action);
 extern void md_write_start(struct mddev *mddev, struct bio *bi);
 extern void md_write_inc(struct mddev *mddev, struct bio *bi);
 extern void md_write_end(struct mddev *mddev);
-extern void md_done_sync(struct mddev *mddev, int blocks, int ok);
+extern void md_done_sync(struct mddev *mddev, int blocks);
+extern void md_sync_error(struct mddev *mddev);
 extern void md_error(struct mddev *mddev, struct md_rdev *rdev);
 extern void md_finish_reshape(struct mddev *mddev);
 void md_submit_discard_bio(struct mddev *mddev, struct md_rdev *rdev,
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 6d73f6e196a9..55254483ec6b 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9068,20 +9068,23 @@ static bool is_mddev_idle(struct mddev *mddev, int init)
 	return idle;
 }
 
-void md_done_sync(struct mddev *mddev, int blocks, int ok)
+void md_done_sync(struct mddev *mddev, int blocks)
 {
 	/* another "blocks" (512byte) blocks have been synced */
 	atomic_sub(blocks, &mddev->recovery_active);
 	wake_up(&mddev->recovery_wait);
-	if (!ok) {
-		set_bit(MD_RECOVERY_INTR, &mddev->recovery);
-		set_bit(MD_RECOVERY_ERROR, &mddev->recovery);
-		md_wakeup_thread(mddev->thread);
-		// stop recovery, signal do_sync ....
-	}
 }
 EXPORT_SYMBOL(md_done_sync);
 
+void md_sync_error(struct mddev *mddev)
+{
+	// stop recovery, signal do_sync ....
+	set_bit(MD_RECOVERY_INTR, &mddev->recovery);
+	set_bit(MD_RECOVERY_ERROR, &mddev->recovery);
+	md_wakeup_thread(mddev->thread);
+}
+EXPORT_SYMBOL(md_sync_error);
+
 /* md_write_start(mddev, bi)
  * If we need to update some array metadata (e.g. 'active' flag
  * in superblock) before writing, schedule a superblock update
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 6af75b82bc64..90ad9455f74a 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2062,7 +2062,7 @@ static void abort_sync_write(struct mddev *mddev, struct r1bio *r1_bio)
 	} while (sectors_to_go > 0);
 }
 
-static void put_sync_write_buf(struct r1bio *r1_bio, int uptodate)
+static void put_sync_write_buf(struct r1bio *r1_bio)
 {
 	if (atomic_dec_and_test(&r1_bio->remaining)) {
 		struct mddev *mddev = r1_bio->mddev;
@@ -2073,7 +2073,7 @@ static void put_sync_write_buf(struct r1bio *r1_bio, int uptodate)
 			reschedule_retry(r1_bio);
 		else {
 			put_buf(r1_bio);
-			md_done_sync(mddev, s, uptodate);
+			md_done_sync(mddev, s);
 		}
 	}
 }
@@ -2098,7 +2098,7 @@ static void end_sync_write(struct bio *bio)
 		set_bit(R1BIO_MadeGood, &r1_bio->state);
 	}
 
-	put_sync_write_buf(r1_bio, 1);
+	put_sync_write_buf(r1_bio);
 }
 
 static int r1_sync_page_io(struct md_rdev *rdev, sector_t sector,
@@ -2348,8 +2348,8 @@ static void sync_request_write(struct mddev *mddev, struct r1bio *r1_bio)
 	if (test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery) ||
 	    !fix_sync_read_error(r1_bio)) {
 		conf->recovery_disabled = mddev->recovery_disabled;
-		set_bit(MD_RECOVERY_INTR, &mddev->recovery);
-		md_done_sync(mddev, r1_bio->sectors, 0);
+		md_done_sync(mddev, r1_bio->sectors);
+		md_sync_error(mddev);
 		put_buf(r1_bio);
 		return;
 	}
@@ -2384,7 +2384,7 @@ static void sync_request_write(struct mddev *mddev, struct r1bio *r1_bio)
 		submit_bio_noacct(wbio);
 	}
 
-	put_sync_write_buf(r1_bio, 1);
+	put_sync_write_buf(r1_bio);
 }
 
 /*
@@ -2575,7 +2575,7 @@ static void handle_sync_write_finished(struct r1conf *conf, struct r1bio *r1_bio)
 		}
 	}
 	put_buf(r1_bio);
-	md_done_sync(conf->mddev, s, 1);
+	md_done_sync(conf->mddev, s);
 }
 
 static void handle_write_finished(struct r1conf *conf, struct r1bio *r1_bio)
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 84be4cc7e873..40c31c00dc60 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -2276,7 +2276,7 @@ static void end_sync_request(struct r10bio *r10_bio)
 				reschedule_retry(r10_bio);
 			else
 				put_buf(r10_bio);
-			md_done_sync(mddev, s, 1);
+			md_done_sync(mddev, s);
 			break;
 		} else {
 			struct r10bio *r10_bio2 = (struct r10bio *)r10_bio->master_bio;
@@ -2452,7 +2452,7 @@ static void sync_request_write(struct mddev *mddev, struct r10bio *r10_bio)
 
 done:
 	if (atomic_dec_and_test(&r10_bio->remaining)) {
-		md_done_sync(mddev, r10_bio->sectors, 1);
+		md_done_sync(mddev, r10_bio->sectors);
 		put_buf(r10_bio);
 	}
 }
@@ -3757,7 +3757,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 		/* pretend they weren't skipped, it makes
 		 * no important difference in this case
 		 */
-		md_done_sync(mddev, sectors_skipped, 1);
+		md_done_sync(mddev, sectors_skipped);
 
 		return sectors_skipped + nr_sectors;
 giveup:
@@ -4913,7 +4913,8 @@ static void reshape_request_write(struct mddev *mddev, struct r10bio *r10_bio)
 	if (!test_bit(R10BIO_Uptodate, &r10_bio->state))
 		if (handle_reshape_read_error(mddev, r10_bio) < 0) {
 			/* Reshape has been aborted */
-			md_done_sync(mddev, r10_bio->sectors, 0);
+			md_done_sync(mddev, r10_bio->sectors);
+			md_sync_error(mddev);
 			return;
 		}
 
@@ -5071,7 +5072,7 @@ static void end_reshape_request(struct r10bio *r10_bio)
 {
 	if (!atomic_dec_and_test(&r10_bio->remaining))
 		return;
-	md_done_sync(r10_bio->mddev, r10_bio->sectors, 1);
+	md_done_sync(r10_bio->mddev, r10_bio->sectors);
 	bio_put(r10_bio->master_bio);
 	put_buf(r10_bio);
 }
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 8dc98f545969..d6cd75c51573 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3723,11 +3723,13 @@ handle_failed_sync(struct r5conf *conf, struct stripe_head *sh,
 					RAID5_STRIPE_SECTORS(conf), 0))
 				abort = 1;
 		}
-		if (abort)
-			conf->recovery_disabled =
-				conf->mddev->recovery_disabled;
 	}
-	md_done_sync(conf->mddev, RAID5_STRIPE_SECTORS(conf), !abort);
+	md_done_sync(conf->mddev, RAID5_STRIPE_SECTORS(conf));
+
+	if (abort) {
+		conf->recovery_disabled = conf->mddev->recovery_disabled;
+		md_sync_error(conf->mddev);
+	}
 }
 
 static int want_replace(struct stripe_head *sh, int disk_idx)
@@ -5157,7 +5159,7 @@ static void handle_stripe(struct stripe_head *sh)
 	if ((s.syncing || s.replacing) && s.locked == 0 &&
 	    !test_bit(STRIPE_COMPUTE_RUN, &sh->state) &&
 	    test_bit(STRIPE_INSYNC, &sh->state)) {
-		md_done_sync(conf->mddev, RAID5_STRIPE_SECTORS(conf), 1);
+		md_done_sync(conf->mddev, RAID5_STRIPE_SECTORS(conf));
 		clear_bit(STRIPE_SYNCING, &sh->state);
 		if (test_and_clear_bit(R5_Overlap, &sh->dev[sh->pd_idx].flags))
 			wake_up_bit(&sh->dev[sh->pd_idx].flags, R5_Overlap);
@@ -5224,7 +5226,7 @@ static void handle_stripe(struct stripe_head *sh)
 		clear_bit(STRIPE_EXPAND_READY, &sh->state);
 		atomic_dec(&conf->reshape_stripes);
 		wake_up(&conf->wait_for_reshape);
-		md_done_sync(conf->mddev, RAID5_STRIPE_SECTORS(conf), 1);
+		md_done_sync(conf->mddev, RAID5_STRIPE_SECTORS(conf));
 	}
 
 	if (s.expanding && s.locked == 0 &&
-- 
2.39.2

From nobody Mon Feb 9 04:08:20 2026
From: linan666@huaweicloud.com
To: song@kernel.org, yukuai@fnnas.com, neil@brown.name, namhyung@gmail.com
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org,
	linan666@huaweicloud.com, yangerkun@huawei.com, yi.zhang@huawei.com
Subject: [PATCH v4 03/12] md/raid1,raid10: support narrow_write_error when badblocks is disabled
Date: Mon, 5 Jan 2026 19:02:51 +0800
Message-Id: <20260105110300.1442509-4-linan666@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20260105110300.1442509-1-linan666@huaweicloud.com>
References: <20260105110300.1442509-1-linan666@huaweicloud.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Li Nan

When badblocks.shift < 0 (badblocks disabled), narrow_write_error()
returns false, preventing write error handling. Since
narrow_write_error() only splits the IO into smaller sizes and
re-submits it, it can work even with badblocks disabled. Use the
logical block size for block_sectors when badblocks is disabled,
allowing narrow_write_error() to function in this case.
Signed-off-by: Li Nan
Reviewed-by: Yu Kuai
---
 drivers/md/raid1.c  | 8 ++++----
 drivers/md/raid10.c | 8 ++++----
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 90ad9455f74a..ef66ff4cab37 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2503,17 +2503,17 @@ static bool narrow_write_error(struct r1bio *r1_bio, int i)
 	 * We currently own a reference on the rdev.
 	 */
 
-	int block_sectors;
+	int block_sectors, lbs = bdev_logical_block_size(rdev->bdev) >> 9;
 	sector_t sector;
 	int sectors;
 	int sect_to_write = r1_bio->sectors;
 	bool ok = true;
 
 	if (rdev->badblocks.shift < 0)
-		return false;
+		block_sectors = lbs;
+	else
+		block_sectors = roundup(1 << rdev->badblocks.shift, lbs);
 
-	block_sectors = roundup(1 << rdev->badblocks.shift,
-				bdev_logical_block_size(rdev->bdev) >> 9);
 	sector = r1_bio->sector;
 	sectors = ((sector + block_sectors)
 		   & ~(sector_t)(block_sectors - 1))
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 40c31c00dc60..0700ed1dac60 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -2790,17 +2790,17 @@ static bool narrow_write_error(struct r10bio *r10_bio, int i)
 	 * We currently own a reference to the rdev.
 	 */
 
-	int block_sectors;
+	int block_sectors, lbs = bdev_logical_block_size(rdev->bdev) >> 9;
 	sector_t sector;
 	int sectors;
 	int sect_to_write = r10_bio->sectors;
 	bool ok = true;
 
 	if (rdev->badblocks.shift < 0)
-		return false;
+		block_sectors = lbs;
+	else
+		block_sectors = roundup(1 << rdev->badblocks.shift, lbs);
 
-	block_sectors = roundup(1 << rdev->badblocks.shift,
-				bdev_logical_block_size(rdev->bdev) >> 9);
 	sector = r10_bio->sector;
 	sectors = ((r10_bio->sector + block_sectors)
 		   & ~(sector_t)(block_sectors - 1))
-- 
2.39.2

From nobody Mon Feb 9 04:08:20 2026
From: linan666@huaweicloud.com
To: song@kernel.org, yukuai@fnnas.com, neil@brown.name, namhyung@gmail.com
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org,
	linan666@huaweicloud.com, yangerkun@huawei.com, yi.zhang@huawei.com
Subject: [PATCH v4 04/12] md: break remaining operations on badblocks set failure in narrow_write_error
Date: Mon, 5 Jan 2026 19:02:52 +0800
Message-Id: <20260105110300.1442509-5-linan666@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20260105110300.1442509-1-linan666@huaweicloud.com>
References: <20260105110300.1442509-1-linan666@huaweicloud.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Li Nan

Mark the device Faulty and exit at once when setting badblocks fails in
narrow_write_error(). There is no need to continue processing the
remaining sections. With this change, narrow_write_error() no longer
needs to return a value, so change its return type to void.

Signed-off-by: Li Nan
---
 drivers/md/raid1.c  | 24 ++++++++++++------------
 drivers/md/raid10.c | 22 ++++++++++++----------
 2 files changed, 24 insertions(+), 22 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index ef66ff4cab37..a665e2f61ceb 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2486,7 +2486,7 @@ static void fix_read_error(struct r1conf *conf, struct r1bio *r1_bio)
 	}
 }
 
-static bool narrow_write_error(struct r1bio *r1_bio, int i)
+static void narrow_write_error(struct r1bio *r1_bio, int i)
 {
 	struct mddev *mddev = r1_bio->mddev;
 	struct r1conf *conf = mddev->private;
@@ -2507,7 +2507,6 @@ static void narrow_write_error(struct r1bio *r1_bio, int i)
 	sector_t sector;
 	int sectors;
 	int sect_to_write = r1_bio->sectors;
-	bool ok = true;
 
 	if (rdev->badblocks.shift < 0)
 		block_sectors = lbs;
@@ -2541,18 +2540,22 @@ static bool narrow_write_error(struct r1bio *r1_bio, int i)
 		bio_trim(wbio, sector - r1_bio->sector, sectors);
 		wbio->bi_iter.bi_sector +=
			rdev->data_offset;
 
-		if (submit_bio_wait(wbio) < 0)
-			/* failure! */
-			ok = rdev_set_badblocks(rdev, sector,
-						sectors, 0)
-				&& ok;
+		if (submit_bio_wait(wbio) &&
+		    !rdev_set_badblocks(rdev, sector, sectors, 0)) {
+			/*
+			 * Badblocks set failed, disk marked Faulty.
+			 * No further operations needed.
+			 */
+			md_error(mddev, rdev);
+			bio_put(wbio);
+			break;
+		}
 
 		bio_put(wbio);
 		sect_to_write -= sectors;
 		sector += sectors;
 		sectors = block_sectors;
 	}
-	return ok;
 }
 
 static void handle_sync_write_finished(struct r1conf *conf, struct r1bio *r1_bio)
@@ -2596,10 +2599,7 @@ static void handle_write_finished(struct r1conf *conf, struct r1bio *r1_bio)
 				 * errors.
 				 */
 				fail = true;
-				if (!narrow_write_error(r1_bio, m))
-					md_error(conf->mddev,
-						 conf->mirrors[m].rdev);
-					/* an I/O failed, we can't clear the bitmap */
+				narrow_write_error(r1_bio, m);
 				rdev_dec_pending(conf->mirrors[m].rdev,
						 conf->mddev);
 			}
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 0700ed1dac60..62e0b501f74e 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -2773,7 +2773,7 @@ static void fix_read_error(struct r10conf *conf, struct mddev *mddev, struct r10
 	}
 }
 
-static bool narrow_write_error(struct r10bio *r10_bio, int i)
+static void narrow_write_error(struct r10bio *r10_bio, int i)
 {
 	struct bio *bio = r10_bio->master_bio;
 	struct mddev *mddev = r10_bio->mddev;
@@ -2794,7 +2794,6 @@ static void narrow_write_error(struct r10bio *r10_bio, int i)
 	sector_t sector;
 	int sectors;
 	int sect_to_write = r10_bio->sectors;
-	bool ok = true;
 
 	if (rdev->badblocks.shift < 0)
 		block_sectors = lbs;
@@ -2820,18 +2819,22 @@ static bool narrow_write_error(struct r10bio *r10_bio, int i)
 			choose_data_offset(r10_bio, rdev);
 		wbio->bi_opf = REQ_OP_WRITE;
 
-		if (submit_bio_wait(wbio) < 0)
-			/* Failure! */
-			ok = rdev_set_badblocks(rdev, wsector,
-						sectors, 0)
-				&& ok;
+		if (submit_bio_wait(wbio) &&
+		    !rdev_set_badblocks(rdev, wsector, sectors, 0)) {
+			/*
+			 * Badblocks set failed, disk marked Faulty.
+			 * No further operations needed.
+			 */
+			md_error(mddev, rdev);
+			bio_put(wbio);
+			break;
+		}
 
 		bio_put(wbio);
 		sect_to_write -= sectors;
 		sector += sectors;
 		sectors = block_sectors;
 	}
-	return ok;
 }
 
 static void handle_read_error(struct mddev *mddev, struct r10bio *r10_bio)
@@ -2936,8 +2939,7 @@ static void handle_write_completed(struct r10conf *conf, struct r10bio *r10_bio)
 			rdev_dec_pending(rdev, conf->mddev);
 		} else if (bio != NULL && bio->bi_status) {
 			fail = true;
-			if (!narrow_write_error(r10_bio, m))
-				md_error(conf->mddev, rdev);
+			narrow_write_error(r10_bio, m);
 			rdev_dec_pending(rdev, conf->mddev);
 		}
 		bio = r10_bio->devs[m].repl_bio;
-- 
2.39.2

From nobody Mon Feb 9 04:08:20 2026
From: linan666@huaweicloud.com
To: song@kernel.org, yukuai@fnnas.com, neil@brown.name, namhyung@gmail.com
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, linan666@huaweicloud.com, yangerkun@huawei.com, yi.zhang@huawei.com
Subject: [PATCH v4 05/12] md: mark rdev Faulty when badblocks setting fails
Date: Mon, 5 Jan 2026 19:02:53 +0800
Message-Id: <20260105110300.1442509-6-linan666@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20260105110300.1442509-1-linan666@huaweicloud.com>
References: <20260105110300.1442509-1-linan666@huaweicloud.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Li Nan

Currently, when a sync read fails and setting badblocks also fails
(exceeding the 512-entry limit), the rdev is not immediately marked
Faulty. Instead, 'recovery_disabled' is set and non-In_sync rdevs are
removed later. This preserves array availability as long as the bad
regions are never read, but users may still read bad sectors before the
rdev is removed, because incorrect resync/recovery_offset updates can
include these bad sectors.

When badblocks exceed the 512-entry limit, keeping the disk provides
little benefit while adding complexity; prompt disk replacement matters
more. Therefore, when setting badblocks fails, call md_error() directly
to mark the rdev Faulty immediately, preventing potential access to bad
data.

After this change, cleanup of the offset update logic and the
'recovery_disabled' handling will follow.
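
The failure path this patch adopts can be sketched with toy types (the names `toy_rdev`, `toy_badblocks_set`, and the 512-entry cap are illustrative stand-ins, not the kernel's `struct badblocks` API):

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_BADBLOCKS 512	/* mirrors the kernel's table limit */

struct toy_rdev {
	int nr_badblocks;
	bool faulty;
};

/* Toy stand-in for badblocks_set(): fails once the table is full. */
static bool toy_badblocks_set(struct toy_rdev *rdev)
{
	if (rdev->nr_badblocks >= TOY_MAX_BADBLOCKS)
		return false;
	rdev->nr_badblocks++;
	return true;
}

/*
 * Pattern from this patch: when recording a bad block fails,
 * immediately mark the whole device Faulty (stands in for
 * md_error()) instead of deferring removal via 'recovery_disabled'.
 */
static bool toy_rdev_set_badblocks(struct toy_rdev *rdev)
{
	if (!toy_badblocks_set(rdev)) {
		rdev->faulty = true;
		return false;
	}
	return true;
}
```

The point of the pattern is that once the table overflows, no caller needs a fallback `md_error()` of its own: the failing setter already took the device out of service.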
Fixes: 5e5702898e93 ("md/raid10: Handle read errors during recovery better.")
Fixes: 3a9f28a5117e ("md/raid1: improve handling of read failure during recovery.")
Signed-off-by: Li Nan
---
 drivers/md/md.c     |  8 +++++++-
 drivers/md/raid1.c  | 16 +++++-----------
 drivers/md/raid10.c | 31 +++++++++++--------------------
 drivers/md/raid5.c  | 22 +++++++++------------
 4 files changed, 32 insertions(+), 45 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 55254483ec6b..90e128fc1397 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -10416,8 +10416,14 @@ bool rdev_set_badblocks(struct md_rdev *rdev, sector_t s, int sectors,
 	else
 		s += rdev->data_offset;

-	if (!badblocks_set(&rdev->badblocks, s, sectors, 0))
+	if (!badblocks_set(&rdev->badblocks, s, sectors, 0)) {
+		/*
+		 * Mark the disk as Faulty when setting badblocks fails,
+		 * otherwise, bad sectors may be read.
+		 */
+		md_error(mddev, rdev);
 		return false;
+	}

 	/* Make sure they get written out promptly */
 	if (test_bit(ExternalBbl, &rdev->flags))
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index a665e2f61ceb..89d22204ad85 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2115,8 +2115,7 @@ static int r1_sync_page_io(struct md_rdev *rdev, sector_t sector,
 			  rdev->mddev->recovery);
 	}
 	/* need to record an error - either for the block or the device */
-	if (!rdev_set_badblocks(rdev, sector, sectors, 0))
-		md_error(rdev->mddev, rdev);
+	rdev_set_badblocks(rdev, sector, sectors, 0);
 	return 0;
 }

@@ -2441,8 +2440,7 @@ static void fix_read_error(struct r1conf *conf, struct r1bio *r1_bio)
 		if (!success) {
 			/* Cannot read from anywhere - mark it bad */
 			struct md_rdev *rdev = conf->mirrors[read_disk].rdev;
-			if (!rdev_set_badblocks(rdev, sect, s, 0))
-				md_error(mddev, rdev);
+			rdev_set_badblocks(rdev, sect, s, 0);
 			break;
 		}
 		/* write it back and re-read */
@@ -2546,7 +2544,6 @@ static void narrow_write_error(struct r1bio *r1_bio, int i)
 			 * Badblocks set failed, disk marked Faulty.
 			 * No further operations needed.
 			 */
-			md_error(mddev, rdev);
 			bio_put(wbio);
 			break;
 		}
@@ -2568,14 +2565,11 @@ static void handle_sync_write_finished(struct r1conf *conf, struct r1bio *r1_bio
 		if (bio->bi_end_io == NULL)
 			continue;
 		if (!bio->bi_status &&
-		    test_bit(R1BIO_MadeGood, &r1_bio->state)) {
+		    test_bit(R1BIO_MadeGood, &r1_bio->state))
 			rdev_clear_badblocks(rdev, r1_bio->sector, s, 0);
-		}
 		if (bio->bi_status &&
-		    test_bit(R1BIO_WriteError, &r1_bio->state)) {
-			if (!rdev_set_badblocks(rdev, r1_bio->sector, s, 0))
-				md_error(conf->mddev, rdev);
-		}
+		    test_bit(R1BIO_WriteError, &r1_bio->state))
+			rdev_set_badblocks(rdev, r1_bio->sector, s, 0);
 	}
 	put_buf(r1_bio);
 	md_done_sync(conf->mddev, s);
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 62e0b501f74e..147d4bbdf123 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -2604,8 +2604,7 @@ static int r10_sync_page_io(struct md_rdev *rdev, sector_t sector,
 			  &rdev->mddev->recovery);
 	}
 	/* need to record an error - either for the block or the device */
-	if (!rdev_set_badblocks(rdev, sector, sectors, 0))
-		md_error(rdev->mddev, rdev);
+	rdev_set_badblocks(rdev, sector, sectors, 0);
 	return 0;
 }

@@ -2686,7 +2685,6 @@ static void fix_read_error(struct r10conf *conf, struct mddev *mddev, struct r10
 					     r10_bio->devs[slot].addr +
 					     sect, s, 0)) {
-				md_error(mddev, rdev);
 				r10_bio->devs[slot].bio = IO_BLOCKED;
 			}
@@ -2825,7 +2823,6 @@ static void narrow_write_error(struct r10bio *r10_bio, int i)
 			 * Badblocks set failed, disk marked Faulty.
 			 * No further operations needed.
 			 */
-			md_error(mddev, rdev);
 			bio_put(wbio);
 			break;
 		}
@@ -2894,35 +2891,29 @@ static void handle_write_completed(struct r10conf *conf, struct r10bio *r10_bio)
 			if (r10_bio->devs[m].bio == NULL ||
 			    r10_bio->devs[m].bio->bi_end_io == NULL)
 				continue;
-			if (!r10_bio->devs[m].bio->bi_status) {
+			if (!r10_bio->devs[m].bio->bi_status)
 				rdev_clear_badblocks(
 					rdev,
 					r10_bio->devs[m].addr,
 					r10_bio->sectors, 0);
-			} else {
-				if (!rdev_set_badblocks(
-					    rdev,
-					    r10_bio->devs[m].addr,
-					    r10_bio->sectors, 0))
-					md_error(conf->mddev, rdev);
-			}
+			else
+				rdev_set_badblocks(rdev,
+					r10_bio->devs[m].addr,
+					r10_bio->sectors, 0);
 			rdev = conf->mirrors[dev].replacement;
 			if (r10_bio->devs[m].repl_bio == NULL ||
 			    r10_bio->devs[m].repl_bio->bi_end_io == NULL)
 				continue;

-			if (!r10_bio->devs[m].repl_bio->bi_status) {
+			if (!r10_bio->devs[m].repl_bio->bi_status)
 				rdev_clear_badblocks(
 					rdev,
 					r10_bio->devs[m].addr,
 					r10_bio->sectors, 0);
-			} else {
-				if (!rdev_set_badblocks(
-					    rdev,
-					    r10_bio->devs[m].addr,
-					    r10_bio->sectors, 0))
-					md_error(conf->mddev, rdev);
-			}
+			else
+				rdev_set_badblocks(rdev,
+					r10_bio->devs[m].addr,
+					r10_bio->sectors, 0);
 		}
 		put_buf(r10_bio);
 	} else {
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index d6cd75c51573..885cadb87cda 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2817,11 +2817,9 @@ static void raid5_end_read_request(struct bio * bi)
 		else {
 			clear_bit(R5_ReadError, &sh->dev[i].flags);
 			clear_bit(R5_ReWrite, &sh->dev[i].flags);
-			if (!(set_bad
-			      && test_bit(In_sync, &rdev->flags)
-			      && rdev_set_badblocks(
-				      rdev, sh->sector, RAID5_STRIPE_SECTORS(conf), 0)))
-				md_error(conf->mddev, rdev);
+			if (!(set_bad && test_bit(In_sync, &rdev->flags)))
+				rdev_set_badblocks(rdev, sh->sector,
+						   RAID5_STRIPE_SECTORS(conf), 0);
 		}
 	}
 	rdev_dec_pending(rdev, conf->mddev);
@@ -3599,11 +3597,10 @@ handle_failed_stripe(struct r5conf *conf, struct stripe_head *sh,
 				else
 					rdev = NULL;
 				if (rdev) {
-					if (!rdev_set_badblocks(
-						    rdev,
-						    sh->sector,
-						    RAID5_STRIPE_SECTORS(conf), 0))
-						md_error(conf->mddev, rdev);
+					rdev_set_badblocks(rdev,
+							   sh->sector,
+							   RAID5_STRIPE_SECTORS(conf),
+							   0);
 					rdev_dec_pending(rdev, conf->mddev);
 				}
 			}
@@ -5255,9 +5252,8 @@ static void handle_stripe(struct stripe_head *sh)
 		if (test_and_clear_bit(R5_WriteError, &dev->flags)) {
 			/* We own a safe reference to the rdev */
 			rdev = conf->disks[i].rdev;
-			if (!rdev_set_badblocks(rdev, sh->sector,
-						RAID5_STRIPE_SECTORS(conf), 0))
-				md_error(conf->mddev, rdev);
+			rdev_set_badblocks(rdev, sh->sector,
+					   RAID5_STRIPE_SECTORS(conf), 0);
 			rdev_dec_pending(rdev, conf->mddev);
 		}
 		if (test_and_clear_bit(R5_MadeGood, &dev->flags)) {
-- 
2.39.2

From nobody Mon Feb 9 04:08:20 2026
From: linan666@huaweicloud.com
To: song@kernel.org, yukuai@fnnas.com, neil@brown.name, namhyung@gmail.com
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, linan666@huaweicloud.com, yangerkun@huawei.com, yi.zhang@huawei.com
Subject: [PATCH v4 06/12] md: update curr_resync_completed even when MD_RECOVERY_INTR is set
Date: Mon, 5 Jan 2026 19:02:54 +0800
Message-Id: <20260105110300.1442509-7-linan666@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20260105110300.1442509-1-linan666@huaweicloud.com>
References: <20260105110300.1442509-1-linan666@huaweicloud.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Li Nan

An errored sync IO may complete and decrement 'recovery_active' while
its error-handling work is still pending. That work sets
'recovery_disabled' and MD_RECOVERY_INTR, then later removes the bad
disk without the Faulty flag. If 'curr_resync_completed' is updated
before the disk is removed, reads may be served from sync-failed
regions.

With the previous patch, an errored IO either sets badblocks or marks
the rdev Faulty, so sync-failed regions are no longer readable. After
waiting for 'recovery_active' to reach 0 (on the previous line), all
sync IO has *completed*, regardless of whether MD_RECOVERY_INTR is set.
Thus, the MD_RECOVERY_INTR check can be removed.
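
The ordering argument above can be sketched outside the kernel with toy names (the real code uses wait_event() on 'recovery_active'; `toy_sync` and its fields are illustrative): once the in-flight counter reaches zero, every submitted sync IO has recorded its result, so the completed offset can be published even when an interrupt flag was raised.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_sync {
	int recovery_active;		/* in-flight sync IOs */
	long curr_resync;
	long curr_resync_completed;
	bool intr;			/* stands in for MD_RECOVERY_INTR */
};

/* An IO records its result (possibly an error) before dropping the count. */
static void toy_sync_io_done(struct toy_sync *s, bool error)
{
	if (error)
		s->intr = true;		/* error handling interrupts the sync */
	s->recovery_active--;
}

/* Once recovery_active hits 0, publishing progress is safe regardless of intr. */
static void toy_update_completed(struct toy_sync *s)
{
	if (s->recovery_active == 0)
		s->curr_resync_completed = s->curr_resync;
}
```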
Signed-off-by: Li Nan
Reviewed-by: Yu Kuai
---
 drivers/md/md.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 90e128fc1397..7c3271808e69 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9734,8 +9734,8 @@ void md_do_sync(struct md_thread *thread)
 	wait_event(mddev->recovery_wait, !atomic_read(&mddev->recovery_active));

 	if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
-	    !test_bit(MD_RECOVERY_INTR, &mddev->recovery) &&
 	    mddev->curr_resync >= MD_RESYNC_ACTIVE) {
+		/* All sync IO completes after recovery_active becomes 0 */
 		mddev->curr_resync_completed = mddev->curr_resync;
 		sysfs_notify_dirent_safe(mddev->sysfs_completed);
 	}
-- 
2.39.2

From nobody Mon Feb 9 04:08:20 2026
From: linan666@huaweicloud.com
To: song@kernel.org, yukuai@fnnas.com, neil@brown.name, namhyung@gmail.com
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, linan666@huaweicloud.com, yangerkun@huawei.com, yi.zhang@huawei.com
Subject: [PATCH v4 07/12] md: remove MD_RECOVERY_ERROR handling and simplify resync_offset update
Date: Mon, 5 Jan 2026 19:02:55 +0800
Message-Id: <20260105110300.1442509-8-linan666@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20260105110300.1442509-1-linan666@huaweicloud.com>
References: <20260105110300.1442509-1-linan666@huaweicloud.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Li Nan

Following the previous patch "md: update curr_resync_completed even when
MD_RECOVERY_INTR is set", 'curr_resync_completed' always equals
'curr_resync' for resync, so MD_RECOVERY_ERROR can be removed. Also
simplify the resync_offset update logic.

Signed-off-by: Li Nan
Reviewed-by: Yu Kuai
---
 drivers/md/md.h |  2 --
 drivers/md/md.c | 21 ++++-----------------
 2 files changed, 4 insertions(+), 19 deletions(-)

diff --git a/drivers/md/md.h b/drivers/md/md.h
index 8871c88ceef1..698897f20385 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -646,8 +646,6 @@ enum recovery_flags {
 	MD_RECOVERY_FROZEN,
 	/* waiting for pers->start() to finish */
 	MD_RECOVERY_WAIT,
-	/* interrupted because io-error */
-	MD_RECOVERY_ERROR,

 	/* flags determines sync action, see details in enum sync_action */

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 7c3271808e69..d452a1128da8 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9080,7 +9080,6 @@ void md_sync_error(struct mddev *mddev)
 {
 	// stop recovery, signal do_sync ....
 	set_bit(MD_RECOVERY_INTR, &mddev->recovery);
-	set_bit(MD_RECOVERY_ERROR, &mddev->recovery);
 	md_wakeup_thread(mddev->thread);
 }
 EXPORT_SYMBOL(md_sync_error);
@@ -9743,24 +9742,12 @@ void md_do_sync(struct md_thread *thread)

 	if (!test_bit(MD_RECOVERY_CHECK, &mddev->recovery) &&
 	    mddev->curr_resync > MD_RESYNC_ACTIVE) {
+		if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery))
+			mddev->curr_resync = MaxSector;
+
 		if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
-			if (test_bit(MD_RECOVERY_INTR, &mddev->recovery)) {
-				if (mddev->curr_resync >= mddev->resync_offset) {
-					pr_debug("md: checkpointing %s of %s.\n",
-						 desc, mdname(mddev));
-					if (test_bit(MD_RECOVERY_ERROR,
-						&mddev->recovery))
-						mddev->resync_offset =
-							mddev->curr_resync_completed;
-					else
-						mddev->resync_offset =
-							mddev->curr_resync;
-				}
-			} else
-				mddev->resync_offset = MaxSector;
+			mddev->resync_offset = mddev->curr_resync;
 		} else {
-			if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery))
-				mddev->curr_resync = MaxSector;
 			if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
 			    test_bit(MD_RECOVERY_RECOVER, &mddev->recovery)) {
 				rcu_read_lock();
-- 
2.39.2

From nobody Mon Feb 9 04:08:20 2026
From: linan666@huaweicloud.com
To: song@kernel.org, yukuai@fnnas.com, neil@brown.name, namhyung@gmail.com
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, linan666@huaweicloud.com, yangerkun@huawei.com, yi.zhang@huawei.com
Subject: [PATCH v4 08/12] md: factor out sync completion update into helper
Date: Mon, 5 Jan 2026 19:02:56 +0800
Message-Id: <20260105110300.1442509-9-linan666@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20260105110300.1442509-1-linan666@huaweicloud.com>
References: <20260105110300.1442509-1-linan666@huaweicloud.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Li Nan

Repeatedly reading the 'mddev->recovery' flags in md_do_sync() is risky:
if the flags are modified during sync, the offsets may be updated
incorrectly. Therefore, replace the direct 'mddev->recovery' checks with
the cached 'action'.

Move the sync completion update logic into the helper md_finish_sync(),
which improves readability and maintainability. The reshape completion
update remains safe, as it is only performed after a successful reshape,
when MD_RECOVERY_INTR is not set and 'curr_resync' equals 'max_sectors'.
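
The refactoring can be sketched with a toy enum (illustrative names, not the kernel's enum sync_action): the action is latched once when the sync starts, so the completion logic switches on that stable value instead of re-reading flags that may change concurrently.

```c
#include <assert.h>

enum toy_action { TOY_RESYNC, TOY_RECOVER, TOY_RESHAPE, TOY_CHECK };

struct toy_mddev {
	unsigned long recovery_flags;	/* may change concurrently */
	long resync_offset;
	long recovery_offset;
	long curr_resync;
};

/* One switch on the latched action, instead of re-reading recovery_flags. */
static void toy_finish_sync(struct toy_mddev *m, enum toy_action action)
{
	switch (action) {
	case TOY_RESYNC:
		m->resync_offset = m->curr_resync;
		break;
	case TOY_RECOVER:
		m->recovery_offset = m->curr_resync;
		break;
	case TOY_RESHAPE:	/* size update elided in this sketch */
	case TOY_CHECK:		/* check updates no offsets */
		break;
	}
}
```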
Signed-off-by: Li Nan
Reviewed-by: Yu Kuai
---
 drivers/md/md.c | 82 ++++++++++++++++++++++++++++---------------------
 1 file changed, 47 insertions(+), 35 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index d452a1128da8..7090a514b02b 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9432,6 +9432,51 @@ static bool sync_io_within_limit(struct mddev *mddev)
 	       (raid_is_456(mddev) ? 8 : 128) * sync_io_depth(mddev);
 }

+/*
+ * Update sync offset and mddev status when sync completes
+ */
+static void md_finish_sync(struct mddev *mddev, enum sync_action action)
+{
+	struct md_rdev *rdev;
+
+	switch (action) {
+	case ACTION_RESYNC:
+	case ACTION_REPAIR:
+		if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery))
+			mddev->curr_resync = MaxSector;
+		mddev->resync_offset = mddev->curr_resync;
+		break;
+	case ACTION_RECOVER:
+		if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery))
+			mddev->curr_resync = MaxSector;
+		rcu_read_lock();
+		rdev_for_each_rcu(rdev, mddev)
+			if (mddev->delta_disks >= 0 &&
+			    rdev_needs_recovery(rdev, mddev->curr_resync))
+				rdev->recovery_offset = mddev->curr_resync;
+		rcu_read_unlock();
+		break;
+	case ACTION_RESHAPE:
+		if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery) &&
+		    mddev->delta_disks > 0 &&
+		    mddev->pers->finish_reshape &&
+		    mddev->pers->size &&
+		    !mddev_is_dm(mddev)) {
+			mddev_lock_nointr(mddev);
+			md_set_array_sectors(mddev, mddev->pers->size(mddev, 0, 0));
+			mddev_unlock(mddev);
+			if (!mddev_is_clustered(mddev))
+				set_capacity_and_notify(mddev->gendisk,
+							mddev->array_sectors);
+		}
+		break;
+	/* */
+	case ACTION_CHECK:
+	default:
+		break;
+	}
+}
+
 #define SYNC_MARKS	10
 #define SYNC_MARK_STEP	(3*HZ)
 #define UPDATE_FREQUENCY (5*60*HZ)
@@ -9447,7 +9492,6 @@ void md_do_sync(struct md_thread *thread)
 	int last_mark,m;
 	sector_t last_check;
 	int skipped = 0;
-	struct md_rdev *rdev;
 	enum sync_action action;
 	const char *desc;
 	struct blk_plug plug;
@@ -9740,46 +9784,14 @@ void md_do_sync(struct md_thread *thread)
 	}
 	mddev->pers->sync_request(mddev, max_sectors, max_sectors, &skipped);

-	if (!test_bit(MD_RECOVERY_CHECK, &mddev->recovery) &&
-	    mddev->curr_resync > MD_RESYNC_ACTIVE) {
-		if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery))
-			mddev->curr_resync = MaxSector;
-
-		if (test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
-			mddev->resync_offset = mddev->curr_resync;
-		} else {
-			if (!test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
-			    test_bit(MD_RECOVERY_RECOVER, &mddev->recovery)) {
-				rcu_read_lock();
-				rdev_for_each_rcu(rdev, mddev)
-					if (mddev->delta_disks >= 0 &&
-					    rdev_needs_recovery(rdev, mddev->curr_resync))
-						rdev->recovery_offset = mddev->curr_resync;
-				rcu_read_unlock();
-			}
-		}
-	}
+	if (mddev->curr_resync > MD_RESYNC_ACTIVE)
+		md_finish_sync(mddev, action);
 skip:
 	/* set CHANGE_PENDING here since maybe another update is needed,
 	 * so other nodes are informed. It should be harmless for normal
 	 * raid */
 	set_mask_bits(&mddev->sb_flags, 0,
 		      BIT(MD_SB_CHANGE_PENDING) | BIT(MD_SB_CHANGE_DEVS));
-
-	if (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) &&
-	    !test_bit(MD_RECOVERY_INTR, &mddev->recovery) &&
-	    mddev->delta_disks > 0 &&
-	    mddev->pers->finish_reshape &&
-	    mddev->pers->size &&
-	    !mddev_is_dm(mddev)) {
-		mddev_lock_nointr(mddev);
-		md_set_array_sectors(mddev, mddev->pers->size(mddev, 0, 0));
-		mddev_unlock(mddev);
-		if (!mddev_is_clustered(mddev))
-			set_capacity_and_notify(mddev->gendisk,
-						mddev->array_sectors);
-	}
-
 	spin_lock(&mddev->lock);
 	if (!test_bit(MD_RECOVERY_INTR, &mddev->recovery)) {
 		/* We completed so min/max setting can be forgotten if used.
 		 */
-- 
2.39.2

From nobody Mon Feb 9 04:08:20 2026
From: linan666@huaweicloud.com
linan666@huaweicloud.com To: song@kernel.org, yukuai@fnnas.com, neil@brown.name, namhyung@gmail.com Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, linan666@huaweicloud.com, yangerkun@huawei.com, yi.zhang@huawei.com Subject: [PATCH v4 09/12] md: move finish_reshape to md_finish_sync() Date: Mon, 5 Jan 2026 19:02:57 +0800 Message-Id: <20260105110300.1442509-10-linan666@huaweicloud.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20260105110300.1442509-1-linan666@huaweicloud.com> References: <20260105110300.1442509-1-linan666@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgBXuPhmnFtp6EHbCg--.50545S13 X-Coremail-Antispam: 1UD129KBjvJXoW7Ww4Duw4rCF47Cw47Kr1rJFb_yoW8tF45p3 yIyF98GryUJrZxXa1UXa4qka4F934xKrWDtFW3C34fJw1agr4rJF1Y9a4UXFWvy34FyrW5 Xw45JFW8uF1I9aUanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUQ014x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAac4AC62xK8xCEY4vEwIxC4wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0V AKzVAqx4xG6I80ewAv7VC0I7IYx2IY67AKxVWUXVWUAwAv7VC2z280aVAFwI0_Gr1j6F4U JwAm72CE4IkC6x0Yz7v_Jr0_Gr1lF7xvr2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20V AGYxC7M4kE6xkIj40Ew7xC0wCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCF x2IqxVCFs4IE7xkEbVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14 v26r106r1rMI8E67AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY 67AKxVWUCVW8JwCI42IY6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI 8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v2 6r4j6r4UJbIYCTnIWIevJa73UjIFyTuYvjfUOPfHDUUUU X-CM-SenderInfo: 
polqt0awwwqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Li Nan finish_reshape implementations of raid10 and raid5 only update mddev and rdev configurations. Move these operations to md_finish_sync() as it is more appropriate. No functional changes. Signed-off-by: Li Nan Reviewed-by: Yu Kuai --- drivers/md/md.c | 15 ++++++--------- 1 file changed, 6 insertions(+), 9 deletions(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index 7090a514b02b..29a931404dbf 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -9469,6 +9469,8 @@ static void md_finish_sync(struct mddev *mddev, enum = sync_action action) set_capacity_and_notify(mddev->gendisk, mddev->array_sectors); } + if (mddev->pers->finish_reshape) + mddev->pers->finish_reshape(mddev); break; /* */ case ACTION_CHECK: @@ -10306,7 +10308,7 @@ void md_reap_sync_thread(struct mddev *mddev) { struct md_rdev *rdev; sector_t old_dev_sectors =3D mddev->dev_sectors; - bool is_reshaped =3D false; + bool is_reshaped =3D test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery); =20 /* resync has finished, collect result */ md_unregister_thread(mddev, &mddev->sync_thread); @@ -10322,12 +10324,6 @@ void md_reap_sync_thread(struct mddev *mddev) set_bit(MD_SB_CHANGE_DEVS, &mddev->sb_flags); } } - if (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery) && - mddev->pers->finish_reshape) { - mddev->pers->finish_reshape(mddev); - if (mddev_is_clustered(mddev)) - is_reshaped =3D true; - } =20 /* If array is no-longer degraded, then any saved_raid_disk * information must be scrapped. @@ -10354,8 +10350,9 @@ void md_reap_sync_thread(struct mddev *mddev) * be changed by md_update_sb, and MD_RECOVERY_RESHAPE is cleared, * so it is time to update size across cluster. 
 	 */
-	if (mddev_is_clustered(mddev) && is_reshaped
-	    && !test_bit(MD_CLOSING, &mddev->flags))
+	if (mddev_is_clustered(mddev) && is_reshaped &&
+	    mddev->pers->finish_reshape &&
+	    !test_bit(MD_CLOSING, &mddev->flags))
 		mddev->cluster_ops->update_size(mddev, old_dev_sectors);
 	/* flag recovery needed just to double check */
 	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
-- 
2.39.2

From nobody Mon Feb 9 04:08:20 2026
From: linan666@huaweicloud.com
To: song@kernel.org, yukuai@fnnas.com, neil@brown.name, namhyung@gmail.com
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, linan666@huaweicloud.com, yangerkun@huawei.com, yi.zhang@huawei.com
Subject: [PATCH v4 10/12] md/raid10: fix any_working flag handling in raid10_sync_request
Date: Mon, 5 Jan 2026 19:02:58 +0800
Message-Id: <20260105110300.1442509-11-linan666@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20260105110300.1442509-1-linan666@huaweicloud.com>
References: <20260105110300.1442509-1-linan666@huaweicloud.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Li Nan

In raid10_sync_request(), 'any_working' indicates whether any IO will be
submitted. When the only In_sync disk has badblocks, 'any_working' may be
set to 1 even though no IO is submitted. Fix this by setting
'any_working' only after the badblock checks.

Fixes: e875ecea266a ("md/raid10 record bad blocks as needed during recovery.")
Signed-off-by: Li Nan
Reviewed-by: Yu Kuai
---
 drivers/md/raid10.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 147d4bbdf123..01e53a43d663 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -3395,7 +3395,6 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 				    !test_bit(In_sync, &rdev->flags))
 					continue;
 				/* This is where we read from */
-				any_working = 1;
 				sector = r10_bio->devs[j].addr;
 
 				if (is_badblock(rdev, sector, max_sync,
@@ -3410,6 +3409,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 						continue;
 					}
 				}
+				any_working = 1;
 				bio = r10_bio->devs[0].bio;
 				bio->bi_next = biolist;
 				biolist = bio;
-- 
2.39.2

From nobody Mon Feb 9 04:08:20 2026
From: linan666@huaweicloud.com
To: song@kernel.org, yukuai@fnnas.com, neil@brown.name, namhyung@gmail.com
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, linan666@huaweicloud.com, yangerkun@huawei.com, yi.zhang@huawei.com
Subject: [PATCH v4 11/12] md/raid10: cleanup skip handling in raid10_sync_request
Date: Mon, 5 Jan 2026 19:02:59 +0800
Message-Id: <20260105110300.1442509-12-linan666@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20260105110300.1442509-1-linan666@huaweicloud.com>
References: <20260105110300.1442509-1-linan666@huaweicloud.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Li Nan

raid10_sync_request() skips a sector when it needs no syncing or when no
readable device exists. The current skip handling is unnecessarily
complex:

- A 'skip' label is used to reissue the next sector instead of returning
  directly.
- Sync is completed and 'max_sectors' is returned when multiple sectors
  are skipped due to badblocks.

The first is error-prone.
For example, commit bc49694a9e8f ("md: pass in max_sectors for
pers->sync_request()") removed redundant max_sector assignments. Since
the skip path modifies max_sectors, `goto skip` leaves max_sectors equal
to sector_nr after the jump, which is incorrect.

The second causes sync to complete erroneously when no actual sync
occurs. For recovery, recording badblocks and continuing to sync
subsequent sectors is more suitable. For resync, just skip the bad
sectors and sync the subsequent ones.

Clean up the complex and unnecessary skip code: return immediately when
a sector should be skipped. This reduces code paths and lowers
regression risk.

Fixes: bc49694a9e8f ("md: pass in max_sectors for pers->sync_request()")
Signed-off-by: Li Nan
Reviewed-by: Yu Kuai
---
 drivers/md/raid10.c | 96 +++++++++++----------------------------------
 1 file changed, 22 insertions(+), 74 deletions(-)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 01e53a43d663..e6b0d9e5e124 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -3161,11 +3161,8 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 	int i;
 	int max_sync;
 	sector_t sync_blocks;
-	sector_t sectors_skipped = 0;
-	int chunks_skipped = 0;
 	sector_t chunk_mask = conf->geo.chunk_mask;
 	int page_idx = 0;
-	int error_disk = -1;
 
 	/*
 	 * Allow skipping a full rebuild for incremental assembly
@@ -3186,7 +3183,6 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 	if (init_resync(conf))
 		return 0;
 
- skipped:
 	if (sector_nr >= max_sector) {
 		conf->cluster_sync_low = 0;
 		conf->cluster_sync_high = 0;
@@ -3238,33 +3234,12 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 		mddev->bitmap_ops->close_sync(mddev);
 		close_sync(conf);
 		*skipped = 1;
-		return sectors_skipped;
+		return 0;
 	}
 
 	if (test_bit(MD_RECOVERY_RESHAPE, &mddev->recovery))
 		return reshape_request(mddev, sector_nr, skipped);
 
-	if (chunks_skipped >= conf->geo.raid_disks) {
-		pr_err("md/raid10:%s: %s fails\n", mdname(mddev),
-		       test_bit(MD_RECOVERY_SYNC, &mddev->recovery) ? "resync" : "recovery");
-		if (error_disk >= 0 &&
-		    !test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
-			/*
-			 * recovery fails, set mirrors.recovery_disabled,
-			 * device shouldn't be added to there.
-			 */
-			conf->mirrors[error_disk].recovery_disabled =
-				mddev->recovery_disabled;
-			return 0;
-		}
-		/*
-		 * if there has been nothing to do on any drive,
-		 * then there is nothing to do at all.
-		 */
-		*skipped = 1;
-		return (max_sector - sector_nr) + sectors_skipped;
-	}
-
 	if (max_sector > mddev->resync_max)
 		max_sector = mddev->resync_max; /* Don't do IO beyond here */
 
@@ -3347,7 +3322,6 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 			/* yep, skip the sync_blocks here, but don't assume
 			 * that there will never be anything to do here
 			 */
-			chunks_skipped = -1;
 			continue;
 		}
 		if (mrdev)
@@ -3478,29 +3452,19 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 				for (k = 0; k < conf->copies; k++)
 					if (r10_bio->devs[k].devnum == i)
 						break;
-				if (mrdev && !test_bit(In_sync,
-				      &mrdev->flags)
-				    && !rdev_set_badblocks(
-					    mrdev,
-					    r10_bio->devs[k].addr,
-					    max_sync, 0))
-					any_working = 0;
-				if (mreplace &&
-				    !rdev_set_badblocks(
-					    mreplace,
-					    r10_bio->devs[k].addr,
-					    max_sync, 0))
-					any_working = 0;
-			}
-			if (!any_working) {
-				if (!test_and_set_bit(MD_RECOVERY_INTR,
-						      &mddev->recovery))
-					pr_warn("md/raid10:%s: insufficient working devices for recovery.\n",
-						mdname(mddev));
-				mirror->recovery_disabled
-					= mddev->recovery_disabled;
-			} else {
-				error_disk = i;
+				if (mrdev &&
+				    !test_bit(In_sync, &mrdev->flags))
+					rdev_set_badblocks(
+						mrdev,
+						r10_bio->devs[k].addr,
+						max_sync, 0);
+				if (mreplace)
+					rdev_set_badblocks(
+						mreplace,
+						r10_bio->devs[k].addr,
+						max_sync, 0);
+				pr_warn("md/raid10:%s: cannot recovery sector %llu + %d.\n",
+					mdname(mddev), r10_bio->devs[k].addr, max_sync);
 			}
 			put_buf(r10_bio);
 			if (rb2)
@@ -3541,7 +3505,8 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 				rb2->master_bio = NULL;
 				put_buf(rb2);
 			}
-			goto giveup;
+			*skipped = 1;
+			return max_sync;
 		}
 	} else {
 		/* resync. Schedule a read for every block at this virt offset */
@@ -3565,7 +3530,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 		    &mddev->recovery)) {
 			/* We can skip this block */
 			*skipped = 1;
-			return sync_blocks + sectors_skipped;
+			return sync_blocks;
 		}
 		if (sync_blocks < max_sync)
 			max_sync = sync_blocks;
@@ -3657,8 +3622,8 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 				 mddev);
 		}
 		put_buf(r10_bio);
-		biolist = NULL;
-		goto giveup;
+		*skipped = 1;
+		return max_sync;
 	}
 }
 
@@ -3678,7 +3643,8 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 		if (WARN_ON(!bio_add_page(bio, page, len, 0))) {
 			bio->bi_status = BLK_STS_RESOURCE;
 			bio_endio(bio);
-			goto giveup;
+			*skipped = 1;
+			return max_sync;
 		}
 	}
 	nr_sectors += len>>9;
@@ -3746,25 +3712,7 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr,
 		}
 	}
 
-	if (sectors_skipped)
-		/* pretend they weren't skipped, it makes
-		 * no important difference in this case
-		 */
-		md_done_sync(mddev, sectors_skipped);
-
-	return sectors_skipped + nr_sectors;
- giveup:
-	/* There is nowhere to write, so all non-sync
-	 * drives must be failed or in resync, all drives
-	 * have a bad block, so try the next chunk...
-	 */
-	if (sector_nr + max_sync < max_sector)
-		max_sector = sector_nr + max_sync;
-
-	sectors_skipped += (max_sector - sector_nr);
-	chunks_skipped ++;
-	sector_nr = max_sector;
-	goto skipped;
+	return nr_sectors;
 }
 
 static sector_t
-- 
2.39.2

From nobody Mon Feb 9 04:08:20 2026
From: linan666@huaweicloud.com
To: song@kernel.org, yukuai@fnnas.com, neil@brown.name, namhyung@gmail.com
Cc: linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org, linan666@huaweicloud.com, yangerkun@huawei.com, yi.zhang@huawei.com
Subject: [PATCH v4 12/12] md: remove recovery_disabled
Date: Mon, 5 Jan 2026 19:03:00 +0800
Message-Id: <20260105110300.1442509-13-linan666@huaweicloud.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20260105110300.1442509-1-linan666@huaweicloud.com>
References: <20260105110300.1442509-1-linan666@huaweicloud.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Li Nan

The 'recovery_disabled' logic is complex and confusing. It was
originally intended to preserve the array in extreme scenarios, and was
used in the following cases:

- When sync fails and setting badblocks also fails, kick out the
  non-In_sync rdev and block spare rdevs from joining, to preserve the
  array [1].
- When the last backup is unavailable, prevent repeated add/remove of
  spares from triggering recovery [2].

The original issues are now resolved:

- The error handlers of all raid types prevent the last rdev from being
  kicked out.
- Disks whose recovery failed are marked Faulty and cannot re-join.

Therefore, remove 'recovery_disabled' as it is no longer needed.

[1] 5389042ffa36 ("md: change managed of recovery_disabled.")
[2] 4044ba58dd15 ("md: don't retry recovery of raid1 that fails due to error on source drive.")

Signed-off-by: Li Nan
---
 drivers/md/md.h     |  6 ------
 drivers/md/raid1.h  |  5 -----
 drivers/md/raid10.h |  5 -----
 drivers/md/raid5.h  |  1 -
 drivers/md/md.c     |  3 ---
 drivers/md/raid1.c  | 17 +++--------------
 drivers/md/raid10.c |  8 --------
 drivers/md/raid5.c  | 10 +---------
 8 files changed, 4 insertions(+), 51 deletions(-)

diff --git a/drivers/md/md.h b/drivers/md/md.h
index 698897f20385..a083f37374d0 100644
--- a/drivers/md/md.h
+++ b/drivers/md/md.h
@@ -495,12 +495,6 @@ struct mddev {
 	int				ok_start_degraded;
 
 	unsigned long			recovery;
-	/* If a RAID personality determines that recovery (of a particular
-	 * device) will fail due to a read error on the source device, it
-	 * takes a copy of this number and does not attempt recovery again
-	 * until this number changes.
-	 */
-	int				recovery_disabled;
 
 	int				in_sync;	/* know to not need resync */
 	/* 'open_mutex' avoids races between 'md_open' and 'do_md_stop', so
diff --git a/drivers/md/raid1.h b/drivers/md/raid1.h
index 2ebe35aaa534..c98d43a7ae99 100644
--- a/drivers/md/raid1.h
+++ b/drivers/md/raid1.h
@@ -93,11 +93,6 @@ struct r1conf {
 	 */
 	int			fullsync;
 
-	/* When the same as mddev->recovery_disabled we don't allow
-	 * recovery to be attempted as we expect a read error.
-	 */
-	int			recovery_disabled;
-
 	mempool_t		*r1bio_pool;
 	mempool_t		r1buf_pool;
 
diff --git a/drivers/md/raid10.h b/drivers/md/raid10.h
index da00a55f7a55..ec79d87fb92f 100644
--- a/drivers/md/raid10.h
+++ b/drivers/md/raid10.h
@@ -18,11 +18,6 @@ struct raid10_info {
 	struct md_rdev	*rdev, *replacement;
 	sector_t	head_position;
-	int		recovery_disabled;	/* matches
-						 * mddev->recovery_disabled
-						 * when we shouldn't try
-						 * recovering this device.
-						 */
 };
 
 struct r10conf {
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index eafc6e9ed6ee..eff2bba9d76f 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -640,7 +640,6 @@ struct r5conf {
 					 * (fresh device added).
 					 * Cleared when a sync completes.
 					 */
-	int			recovery_disabled;
 	/* per cpu variables */
 	struct raid5_percpu __percpu	*percpu;
 	int			scribble_disks;
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 29a931404dbf..5df2220b1bd1 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -2617,9 +2617,6 @@ static int bind_rdev_to_array(struct md_rdev *rdev, struct mddev *mddev)
 	list_add_rcu(&rdev->same_set, &mddev->disks);
 	bd_link_disk_holder(rdev->bdev, mddev->gendisk);
 
-	/* May as well allow recovery to be retried once */
-	mddev->recovery_disabled++;
-
 	return 0;
 
 fail:
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 89d22204ad85..0781161ab4c1 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1760,7 +1760,6 @@ static void raid1_error(struct mddev *mddev, struct md_rdev *rdev)
 		set_bit(MD_BROKEN, &mddev->flags);
 
 		if (!mddev->fail_last_dev) {
-			conf->recovery_disabled = mddev->recovery_disabled;
 			spin_unlock_irqrestore(&conf->device_lock, flags);
 			return;
 		}
@@ -1904,7 +1903,6 @@ static bool raid1_remove_conf(struct r1conf *conf, int disk)
 
 	/* Only remove non-faulty devices if recovery is not possible.
 	 */
 	if (!test_bit(Faulty, &rdev->flags) &&
-	    rdev->mddev->recovery_disabled != conf->recovery_disabled &&
 	    rdev->mddev->degraded < conf->raid_disks)
 		return false;
 
@@ -1924,9 +1922,6 @@ static int raid1_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 	int first = 0;
 	int last = conf->raid_disks - 1;
 
-	if (mddev->recovery_disabled == conf->recovery_disabled)
-		return -EBUSY;
-
 	if (rdev->raid_disk >= 0)
 		first = last = rdev->raid_disk;
 
@@ -2346,7 +2341,6 @@ static void sync_request_write(struct mddev *mddev, struct r1bio *r1_bio)
 	 */
 	if (test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery) ||
 	    !fix_sync_read_error(r1_bio)) {
-		conf->recovery_disabled = mddev->recovery_disabled;
 		md_done_sync(mddev, r1_bio->sectors);
 		md_sync_error(mddev);
 		put_buf(r1_bio);
@@ -2948,16 +2942,12 @@ static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
 		*skipped = 1;
 		put_buf(r1_bio);
 
-		if (!ok) {
-			/* Cannot record the badblocks, so need to
+		if (!ok)
+			/* Cannot record the badblocks, md_error has set INTR,
 			 * abort the resync.
-			 * If there are multiple read targets, could just
-			 * fail the really bad ones ???
 			 */
-			conf->recovery_disabled = mddev->recovery_disabled;
-			set_bit(MD_RECOVERY_INTR, &mddev->recovery);
 			return 0;
-		} else
+		else
 			return min_bad;
 
 	}
@@ -3144,7 +3134,6 @@ static struct r1conf *setup_conf(struct mddev *mddev)
 	init_waitqueue_head(&conf->wait_barrier);
 
 	bio_list_init(&conf->pending_bio_list);
-	conf->recovery_disabled = mddev->recovery_disabled - 1;
 
 	err = -EIO;
 	for (i = 0; i < conf->raid_disks * 2; i++) {
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index e6b0d9e5e124..e6f879d0eae3 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -2130,8 +2130,6 @@ static int raid10_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 		mirror = first;
 	for ( ; mirror <= last ; mirror++) {
 		p = &conf->mirrors[mirror];
-		if (p->recovery_disabled == mddev->recovery_disabled)
-			continue;
 		if (p->rdev) {
 			if (test_bit(WantReplacement, &p->rdev->flags) &&
 			    p->replacement == NULL && repl_slot < 0)
@@ -2143,7 +2141,6 @@ static int raid10_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 		if (err)
 			return err;
 		p->head_position = 0;
-		p->recovery_disabled = mddev->recovery_disabled - 1;
 		rdev->raid_disk = mirror;
 		err = 0;
 		if (rdev->saved_raid_disk != mirror)
@@ -2196,7 +2193,6 @@ static int raid10_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
 	 * is not possible.
 	 */
 	if (!test_bit(Faulty, &rdev->flags) &&
-	    mddev->recovery_disabled != p->recovery_disabled &&
 	    (!p->replacement || p->replacement == rdev) &&
 	    number < conf->geo.raid_disks &&
 	    enough(conf, -1)) {
@@ -2535,8 +2531,6 @@ static void fix_recovery_read_error(struct r10bio *r10_bio)
 			pr_notice("md/raid10:%s: recovery aborted due to read error\n",
 				  mdname(mddev));
 
-			conf->mirrors[dw].recovery_disabled
-				= mddev->recovery_disabled;
 			set_bit(MD_RECOVERY_INTR, &mddev->recovery);
 			break;
@@ -4075,8 +4069,6 @@ static int raid10_run(struct mddev *mddev)
 		    disk->replacement->saved_raid_disk < 0) {
 			conf->fullsync = 1;
 		}
-
-		disk->recovery_disabled = mddev->recovery_disabled - 1;
 	}
 
 	if (mddev->resync_offset != MaxSector)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 885cadb87cda..c9aa6f98b617 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2918,7 +2918,6 @@ static void raid5_error(struct mddev *mddev, struct md_rdev *rdev)
 
 	if (has_failed(conf)) {
 		set_bit(MD_BROKEN, &conf->mddev->flags);
-		conf->recovery_disabled = mddev->recovery_disabled;
 
 		pr_crit("md/raid:%s: Cannot continue operation (%d/%d failed).\n",
 			mdname(mddev), mddev->degraded, conf->raid_disks);
@@ -3723,10 +3722,8 @@ handle_failed_sync(struct r5conf *conf, struct stripe_head *sh,
 	}
 	md_done_sync(conf->mddev, RAID5_STRIPE_SECTORS(conf));
 
-	if (abort) {
-		conf->recovery_disabled = conf->mddev->recovery_disabled;
+	if (abort)
 		md_sync_error(conf->mddev);
-	}
 }
 
 static int want_replace(struct stripe_head *sh, int disk_idx)
@@ -7534,8 +7531,6 @@ static struct r5conf *setup_conf(struct mddev *mddev)
 	}
 
 	conf->bypass_threshold = BYPASS_THRESHOLD;
-	conf->recovery_disabled = mddev->recovery_disabled - 1;
-
 	conf->raid_disks = mddev->raid_disks;
 	if (mddev->reshape_position == MaxSector)
 		conf->previous_raid_disks = mddev->raid_disks;
@@ -8209,7 +8204,6 @@ static int raid5_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
 	 * isn't possible.
 	 */
 	if (!test_bit(Faulty, &rdev->flags) &&
-	    mddev->recovery_disabled != conf->recovery_disabled &&
 	    !has_failed(conf) &&
 	    (!p->replacement || p->replacement == rdev) &&
 	    number < conf->raid_disks) {
@@ -8270,8 +8264,6 @@ static int raid5_add_disk(struct mddev *mddev, struct md_rdev *rdev)
 
 		return 0;
 	}
-	if (mddev->recovery_disabled == conf->recovery_disabled)
-		return -EBUSY;
 
 	if (rdev->saved_raid_disk < 0 && has_failed(conf))
 		/* no point adding a device */
-- 
2.39.2