From nobody Thu Apr 2 22:24:26 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6BC662556E; Sat, 14 Feb 2026 06:10:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1771049419; cv=none; b=d4P8dUOyeWy/H/ZIMZKmcix+tSGphjfTVfNhWkuDmOunXunM3JcjTuaJS0FaAFINd3YyTR3q5+ygp66rO+GQLNT2y4WCOXiVpQZocXrJ5dlxjg/Eu6daVS/hlE59F9bxznqZg0qzW9kAq/FUjvXjd+atg8wGSdW95mFYc+Rmsns= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1771049419; c=relaxed/simple; bh=7Tm19XjqDZryvGVvRo6y2vQQHtmWELjPcCgnqGUNitc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Txhrj/4gWGQnVLOgQc81AmX/LSkXulK0OnHU7NDkpK28bxMJxwVlhXf4ly1kzuWoRPCe0WYX587tRsY9T8a1L6iO2kV8XzQuu+8OIV1wQXRRaBODxbjkoflaUDqFqVs172HJY5Vh+MktIG09h++lzwCoNJgsYNnrGJPvaoloMs0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id B6E36C19423; Sat, 14 Feb 2026 06:10:17 +0000 (UTC) From: Yu Kuai To: song@kernel.org Cc: linan122@huawei.com, xni@redhat.com, colyli@fnnas.com, linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 1/5] md/md-llbitmap: skip reading rdevs that are not in_sync Date: Sat, 14 Feb 2026 14:10:09 +0800 Message-ID: <20260214061013.2335604-2-yukuai@fnnas.com> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20260214061013.2335604-1-yukuai@fnnas.com> References: <20260214061013.2335604-1-yukuai@fnnas.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" When reading 
bitmap pages from member disks, the code iterates through all rdevs and attempts to read from the first available one. However, it only checks for raid_disk assignment and Faulty flag, missing the In_sync flag check. This can cause bitmap data to be read from spare disks that are still being rebuilt and don't have valid bitmap information yet. Reading stale or uninitialized bitmap data from such disks can lead to incorrect dirty bit tracking, potentially causing data corruption during recovery or normal operation. Add the In_sync flag check to ensure bitmap pages are only read from fully synchronized member disks that have valid bitmap data. Cc: stable@vger.kernel.org Fixes: 5ab829f1971d ("md/md-llbitmap: introduce new lockless bitmap") Signed-off-by: Yu Kuai --- drivers/md/md-llbitmap.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/md/md-llbitmap.c b/drivers/md/md-llbitmap.c index cd713a7dc270..30d7e36b22c4 100644 --- a/drivers/md/md-llbitmap.c +++ b/drivers/md/md-llbitmap.c @@ -459,7 +459,8 @@ static struct page *llbitmap_read_page(struct llbitmap *llbitmap, int idx) rdev_for_each(rdev, mddev) { sector_t sector; - if (rdev->raid_disk < 0 || test_bit(Faulty, &rdev->flags)) + if (rdev->raid_disk < 0 || test_bit(Faulty, &rdev->flags) || + !test_bit(In_sync, &rdev->flags)) continue; sector = mddev->bitmap_info.offset + -- 2.51.0 From nobody Thu Apr 2 22:24:26 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BA8342556E; Sat, 14 Feb 2026 06:10:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1771049421; cv=none; 
b=HkOBMqyGgaJM33wwaWIKGQ89P0T7i30EHiAahYxONHRmXzn7YZByBMBpDAImtFXOZwRPsr7Je6VW8fHyGSBMIf63AICQY3jABgkMIlKdw3zAlvog8g9RoPUuYI/u2CEQzg9qX4e4/OTu9HIwk1t8X4dHS7tU69rQpXYvcK9AOXQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1771049421; c=relaxed/simple; bh=HybwxZPJFePf5IrcSW9RiixkMxuqA8cWUXf7Sk2guos=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=r9VsC4kvZQLT9StP+m2DD2s6GOz8eCN1yX9Xk/3Y6uxF7UauMSmzmNWdxjXeuQiw/Motk7lVRrNqAOOFdHUCnWC1q44nkntUNB0fnESlzm5EhgNXmCSPO7NCqfdvW3mSO0O66ZxN2iKnT7UYkuaYXlgGl//mxiYBViFq4cm2Z1c= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id BEFF4C19421; Sat, 14 Feb 2026 06:10:19 +0000 (UTC) From: Yu Kuai To: song@kernel.org Cc: linan122@huawei.com, xni@redhat.com, colyli@fnnas.com, linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 2/5] md/md-llbitmap: raise barrier before state machine transition Date: Sat, 14 Feb 2026 14:10:10 +0800 Message-ID: <20260214061013.2335604-3-yukuai@fnnas.com> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20260214061013.2335604-1-yukuai@fnnas.com> References: <20260214061013.2335604-1-yukuai@fnnas.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Move the barrier raise operation before calling llbitmap_state_machine() in both llbitmap_start_write() and llbitmap_start_discard(). This ensures the barrier is in place before any state transitions occur, preventing potential race conditions where the state machine could complete before the barrier is properly raised. 
Cc: stable@vger.kernel.org Fixes: 5ab829f1971d ("md/md-llbitmap: introduce new lockless bitmap") Signed-off-by: Yu Kuai --- drivers/md/md-llbitmap.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/md/md-llbitmap.c b/drivers/md/md-llbitmap.c index 30d7e36b22c4..5f9e7004e3e3 100644 --- a/drivers/md/md-llbitmap.c +++ b/drivers/md/md-llbitmap.c @@ -1070,12 +1070,12 @@ static void llbitmap_start_write(struct mddev *mddev, sector_t offset, int page_start = (start + BITMAP_DATA_OFFSET) >> PAGE_SHIFT; int page_end = (end + BITMAP_DATA_OFFSET) >> PAGE_SHIFT; - llbitmap_state_machine(llbitmap, start, end, BitmapActionStartwrite); - while (page_start <= page_end) { llbitmap_raise_barrier(llbitmap, page_start); page_start++; } + + llbitmap_state_machine(llbitmap, start, end, BitmapActionStartwrite); } static void llbitmap_end_write(struct mddev *mddev, sector_t offset, @@ -1102,12 +1102,12 @@ static void llbitmap_start_discard(struct mddev *mddev, sector_t offset, int page_start = (start + BITMAP_DATA_OFFSET) >> PAGE_SHIFT; int page_end = (end + BITMAP_DATA_OFFSET) >> PAGE_SHIFT; - llbitmap_state_machine(llbitmap, start, end, BitmapActionDiscard); - while (page_start <= page_end) { llbitmap_raise_barrier(llbitmap, page_start); page_start++; } + + llbitmap_state_machine(llbitmap, start, end, BitmapActionDiscard); } static void llbitmap_end_discard(struct mddev *mddev, sector_t offset, -- 2.51.0 From nobody Thu Apr 2 22:24:26 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C13192556E; Sat, 14 Feb 2026 06:10:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1771049423; cv=none; 
b=F8MlgB2ccizt868qB1tMe2qxnBNCndkjuU/gFxvgZsQXUr2zcdc9cBtQWGwKdkwUrsg8bq6gZJjQLgtlLBGF5yZbDux/ucnyaMbcs4l1WapmIay1lrGJbv8VEE5kGy3c94ZAJiCaPhY0MLRcrBQuh+e7ftkStpEDAF/dA355l70= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1771049423; c=relaxed/simple; bh=TWkCZTMhj/SsTk4NR0HUNeixi+fZzYx7zEdXdlaZV/Y=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=SzzAwOHHo+5uMNdgQgX3o4fvBSfNEmIluleuK6/e1bVbNLsPRrozjP24QXrsTr/JfKvo0Hz2FM5Q08NoW8TijnZojaN2x08xjEGQqVqj7+dVQEYEgwivuZ53eg6w2u/KzrVhiS/NVwEy+H8k5OoKtHSmdpBURaeoMwOuTYsVoPY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id C9403C19423; Sat, 14 Feb 2026 06:10:21 +0000 (UTC) From: Yu Kuai To: song@kernel.org Cc: linan122@huawei.com, xni@redhat.com, colyli@fnnas.com, linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 3/5] md: add fallback to correct bitmap_ops on version mismatch Date: Sat, 14 Feb 2026 14:10:11 +0800 Message-ID: <20260214061013.2335604-4-yukuai@fnnas.com> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20260214061013.2335604-1-yukuai@fnnas.com> References: <20260214061013.2335604-1-yukuai@fnnas.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" If the default bitmap version and the on-disk version do not match, and mdadm is not recent enough to set bitmap_type, set bitmap_ops based on the on-disk version. 
Signed-off-by: Yu Kuai --- drivers/md/md.c | 103 +++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 102 insertions(+), 1 deletion(-) diff --git a/drivers/md/md.c b/drivers/md/md.c index 59cd303548de..d2607ed5c2e9 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -6447,15 +6447,116 @@ static void md_safemode_timeout(struct timer_list *t) static int start_dirty_degraded; +/* + * Read bitmap superblock and return the bitmap_id based on disk version. + * This is used as fallback when default bitmap version and on-disk version + * doesn't match, and mdadm is not the latest version to set bitmap_type. + */ +static enum md_submodule_id md_bitmap_get_id_from_sb(struct mddev *mddev) +{ + struct md_rdev *rdev; + struct page *sb_page; + bitmap_super_t *sb; + enum md_submodule_id id = ID_BITMAP_NONE; + sector_t sector; + u32 version; + + if (!mddev->bitmap_info.offset) + return ID_BITMAP_NONE; + + sb_page = alloc_page(GFP_KERNEL); + if (!sb_page) + return ID_BITMAP_NONE; + + sector = mddev->bitmap_info.offset; + + rdev_for_each(rdev, mddev) { + u32 iosize; + + if (!test_bit(In_sync, &rdev->flags) || + test_bit(Faulty, &rdev->flags) || + test_bit(Bitmap_sync, &rdev->flags)) + continue; + + iosize = roundup(sizeof(bitmap_super_t), + bdev_logical_block_size(rdev->bdev)); + if (sync_page_io(rdev, sector, iosize, sb_page, REQ_OP_READ, + true)) + goto read_ok; + } + goto out; + +read_ok: + sb = kmap_local_page(sb_page); + if (sb->magic != cpu_to_le32(BITMAP_MAGIC)) + goto out_unmap; + + version = le32_to_cpu(sb->version); + switch (version) { + case BITMAP_MAJOR_LO: + case BITMAP_MAJOR_HI: + case BITMAP_MAJOR_CLUSTERED: + id = ID_BITMAP; + break; + case BITMAP_MAJOR_LOCKLESS: + id = ID_LLBITMAP; + break; + default: + pr_warn("md: %s: unknown bitmap version %u\n", + mdname(mddev), version); + break; + } + +out_unmap: + kunmap_local(sb); +out: + __free_page(sb_page); + return id; +} + static int md_bitmap_create(struct mddev *mddev) { + 
enum md_submodule_id orig_id = mddev->bitmap_id; + enum md_submodule_id sb_id; + int err; + if (mddev->bitmap_id == ID_BITMAP_NONE) return -EINVAL; if (!mddev_set_bitmap_ops(mddev)) return -ENOENT; - return mddev->bitmap_ops->create(mddev); + err = mddev->bitmap_ops->create(mddev); + if (!err) + return 0; + + /* + * Create failed, if default bitmap version and on-disk version + * doesn't match, and mdadm is not the latest version to set + * bitmap_type, set bitmap_ops based on the disk version. + */ + mddev_clear_bitmap_ops(mddev); + + sb_id = md_bitmap_get_id_from_sb(mddev); + if (sb_id == ID_BITMAP_NONE || sb_id == orig_id) + return err; + + pr_info("md: %s: bitmap version mismatch, switching from %d to %d\n", + mdname(mddev), orig_id, sb_id); + + mddev->bitmap_id = sb_id; + if (!mddev_set_bitmap_ops(mddev)) { + mddev->bitmap_id = orig_id; + return -ENOENT; + } + + err = mddev->bitmap_ops->create(mddev); + if (err) { + mddev_clear_bitmap_ops(mddev); + mddev->bitmap_id = orig_id; + } + + return err; } static void md_bitmap_destroy(struct mddev *mddev) -- 2.51.0 From nobody Thu Apr 2 22:24:26 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 870902556E; Sat, 14 Feb 2026 06:10:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1771049425; cv=none; b=Ky6+aXmCn+ggVkjH+jYxH5K1kHo5wcXEtRgG0xIYqCg6GBTddp8rtu77sS7whhlyGsVXtH0NI0vaJ46SXx43fyM/1ClYoScmUASVJol7fy8rElhQ1I87bNWtmNroDxm8oUWEPIxT5k0VVxK+r5eqDJZaXRJg8fOFuJSF54ZYVJs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1771049425; c=relaxed/simple; bh=AtYbLLtqTovQtwcXZh+bSJgPxg9ybGDy8IxEFogiDiQ=; 
h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=OYTQ9We0rAvdwMBtl3zr17LxHZ0N/LiJ9nsP8cwYxFM/3/jdx1rW7ewzqjRrIbB08c0SNOe68rWVGIFZ1XQnUEhAbn4iHsTKLCT3vQGyULcqP6RJleHs48cxv/pcGxgQ93N+XqEn4r3h/dQB5SJJSyiI4WTE0XVzO75pC1YbnUg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id D212FC19421; Sat, 14 Feb 2026 06:10:23 +0000 (UTC) From: Yu Kuai To: song@kernel.org Cc: linan122@huawei.com, xni@redhat.com, colyli@fnnas.com, linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 4/5] md/md-llbitmap: add CleanUnwritten state for RAID-5 proactive parity building Date: Sat, 14 Feb 2026 14:10:12 +0800 Message-ID: <20260214061013.2335604-5-yukuai@fnnas.com> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20260214061013.2335604-1-yukuai@fnnas.com> References: <20260214061013.2335604-1-yukuai@fnnas.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Add new states to the llbitmap state machine to support proactive XOR parity building for RAID-5 arrays. This allows users to pre-build parity data for unwritten regions before any user data is written. New states added: - BitNeedSyncUnwritten: Transitional state when proactive sync is triggered via sysfs on Unwritten regions. - BitSyncingUnwritten: Proactive sync in progress for unwritten region. - BitCleanUnwritten: XOR parity has been pre-built, but no user data written yet. When user writes to this region, it transitions to BitDirty. New actions added: - BitmapActionProactiveSync: Trigger for proactive XOR parity building. - BitmapActionClearUnwritten: Convert CleanUnwritten/NeedSyncUnwritten/ SyncingUnwritten states back to Unwritten before recovery starts. 
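The states and actions listed above can be walked through with a small userspace sketch. This is a simplified model, not the kernel code: the names `bit_state`, `bit_action` and `next_state` are illustrative, only the transitions named in this commit message are modeled, and keeping the current state for every other combination is an assumption of the sketch.

```c
#include <assert.h>

/* Illustrative subset of the llbitmap states; the names are not the kernel's. */
enum bit_state {
	BIT_UNWRITTEN,
	BIT_DIRTY,
	BIT_NEED_SYNC_UNWRITTEN,
	BIT_SYNCING_UNWRITTEN,
	BIT_CLEAN_UNWRITTEN,
	BIT_STATE_COUNT,
};

/* Illustrative subset of the actions that drive the transitions. */
enum bit_action {
	ACT_START_WRITE,
	ACT_START_SYNC,
	ACT_END_SYNC,
	ACT_PROACTIVE_SYNC,
	ACT_CLEAR_UNWRITTEN,
};

/*
 * Return the state that follows @s when @a is applied. Combinations that
 * the commit message does not describe leave the state unchanged here
 * (an assumption of this sketch).
 */
static enum bit_state next_state(enum bit_state s, enum bit_action a)
{
	assert(s < BIT_STATE_COUNT);

	switch (a) {
	case ACT_PROACTIVE_SYNC:
		/* sysfs trigger only affects regions never written */
		if (s == BIT_UNWRITTEN)
			return BIT_NEED_SYNC_UNWRITTEN;
		break;
	case ACT_START_SYNC:
		if (s == BIT_NEED_SYNC_UNWRITTEN)
			return BIT_SYNCING_UNWRITTEN;
		break;
	case ACT_END_SYNC:
		/* parity is now pre-built, still no user data */
		if (s == BIT_SYNCING_UNWRITTEN)
			return BIT_CLEAN_UNWRITTEN;
		break;
	case ACT_START_WRITE:
		/* first user write to a pre-built region */
		if (s == BIT_CLEAN_UNWRITTEN)
			return BIT_DIRTY;
		break;
	case ACT_CLEAR_UNWRITTEN:
		/* before recovery: forget all proactive-sync progress */
		if (s == BIT_NEED_SYNC_UNWRITTEN ||
		    s == BIT_SYNCING_UNWRITTEN ||
		    s == BIT_CLEAN_UNWRITTEN)
			return BIT_UNWRITTEN;
		break;
	}

	return s;
}
```

Applying `ACT_PROACTIVE_SYNC`, `ACT_START_SYNC` and `ACT_END_SYNC` in turn to `BIT_UNWRITTEN` ends in `BIT_CLEAN_UNWRITTEN`; a following `ACT_CLEAR_UNWRITTEN` returns the region to `BIT_UNWRITTEN`, while `ACT_START_WRITE` would move it to `BIT_DIRTY` instead.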
State flows: - Current (lazy): Unwritten -> (write) -> NeedSync -> (sync) -> Dirty -> Clean - New (proactive): Unwritten -> (sysfs) -> NeedSyncUnwritten -> (sync) -> CleanUnwritten - On write to CleanUnwritten: CleanUnwritten -> (write) -> Dirty -> Clean - On disk replacement: CleanUnwritten regions are converted to Unwritten before recovery starts, so recovery only rebuilds regions with user data A new sysfs interface is added at /sys/block/mdX/md/llbitmap/proactive_sync (write-only) to trigger proactive sync. This only works for RAID-456 arrays. Signed-off-by: Yu Kuai --- drivers/md/md-llbitmap.c | 140 +++++++++++++++++++++++++++++++++++---- drivers/md/md.c | 6 +- 2 files changed, 132 insertions(+), 14 deletions(-) diff --git a/drivers/md/md-llbitmap.c b/drivers/md/md-llbitmap.c index 5f9e7004e3e3..461050b2771b 100644 --- a/drivers/md/md-llbitmap.c +++ b/drivers/md/md-llbitmap.c @@ -208,6 +208,20 @@ enum llbitmap_state { BitNeedSync, /* data is synchronizing */ BitSyncing, + /* + * Proactive sync requested for unwritten region (raid456 only). + * Triggered via sysfs when user wants to pre-build XOR parity + * for regions that have never been written. + */ + BitNeedSyncUnwritten, + /* Proactive sync in progress for unwritten region */ + BitSyncingUnwritten, + /* + * XOR parity has been pre-built for a region that has never had + * user data written. When user writes to this region, it transitions + * to BitDirty. + */ + BitCleanUnwritten, BitStateCount, BitNone = 0xff, }; @@ -232,6 +246,12 @@ enum llbitmap_action { * BitNeedSync. */ BitmapActionStale, + /* + * Proactive sync trigger for raid456 - builds XOR parity for + * Unwritten regions without requiring user data write first. 
+ */ + BitmapActionProactiveSync, + BitmapActionClearUnwritten, BitmapActionCount, /* Init state is BitUnwritten */ BitmapActionInit, @@ -304,6 +324,8 @@ static char state_machine[BitStateCount][BitmapActionCount] = { [BitmapActionDaemon] = BitNone, [BitmapActionDiscard] = BitNone, [BitmapActionStale] = BitNone, + [BitmapActionProactiveSync] = BitNeedSyncUnwritten, + [BitmapActionClearUnwritten] = BitNone, }, [BitClean] = { [BitmapActionStartwrite] = BitDirty, @@ -314,6 +336,8 @@ static char state_machine[BitStateCount][BitmapActionCount] = { [BitmapActionDaemon] = BitNone, [BitmapActionDiscard] = BitUnwritten, [BitmapActionStale] = BitNeedSync, + [BitmapActionProactiveSync] = BitNone, + [BitmapActionClearUnwritten] = BitNone, }, [BitDirty] = { [BitmapActionStartwrite] = BitNone, @@ -324,6 +348,8 @@ static char state_machine[BitStateCount][BitmapActionCount] = { [BitmapActionDaemon] = BitClean, [BitmapActionDiscard] = BitUnwritten, [BitmapActionStale] = BitNeedSync, + [BitmapActionProactiveSync] = BitNone, + [BitmapActionClearUnwritten] = BitNone, }, [BitNeedSync] = { [BitmapActionStartwrite] = BitNone, @@ -334,6 +360,8 @@ static char state_machine[BitStateCount][BitmapActionCount] = { [BitmapActionDaemon] = BitNone, [BitmapActionDiscard] = BitUnwritten, [BitmapActionStale] = BitNone, + [BitmapActionProactiveSync] = BitNone, + [BitmapActionClearUnwritten] = BitNone, }, [BitSyncing] = { [BitmapActionStartwrite] = BitNone, @@ -344,6 +372,44 @@ static char state_machine[BitStateCount][BitmapActionCount] = { [BitmapActionDaemon] = BitNone, [BitmapActionDiscard] = BitUnwritten, [BitmapActionStale] = BitNeedSync, + [BitmapActionProactiveSync] = BitNone, + [BitmapActionClearUnwritten] = BitNone, + }, + [BitNeedSyncUnwritten] = { + [BitmapActionStartwrite] = BitNeedSync, + [BitmapActionStartsync] = BitSyncingUnwritten, + [BitmapActionEndsync] = BitNone, + [BitmapActionAbortsync] 
= BitUnwritten, + [BitmapActionReload] = BitUnwritten, + [BitmapActionDaemon] = BitNone, + [BitmapActionDiscard] = BitUnwritten, + [BitmapActionStale] = BitUnwritten, + [BitmapActionProactiveSync] = BitNone, + [BitmapActionClearUnwritten] = BitUnwritten, + }, + [BitSyncingUnwritten] = { + [BitmapActionStartwrite] = BitSyncing, + [BitmapActionStartsync] = BitSyncingUnwritten, + [BitmapActionEndsync] = BitCleanUnwritten, + [BitmapActionAbortsync] = BitUnwritten, + [BitmapActionReload] = BitUnwritten, + [BitmapActionDaemon] = BitNone, + [BitmapActionDiscard] = BitUnwritten, + [BitmapActionStale] = BitUnwritten, + [BitmapActionProactiveSync] = BitNone, + [BitmapActionClearUnwritten] = BitUnwritten, + }, + [BitCleanUnwritten] = { + [BitmapActionStartwrite] = BitDirty, + [BitmapActionStartsync] = BitNone, + [BitmapActionEndsync] = BitNone, + [BitmapActionAbortsync] = BitNone, + [BitmapActionReload] = BitNone, + [BitmapActionDaemon] = BitNone, + [BitmapActionDiscard] = BitUnwritten, + [BitmapActionStale] = BitUnwritten, + [BitmapActionProactiveSync] = BitNone, + [BitmapActionClearUnwritten] = BitUnwritten, }, }; @@ -376,6 +442,7 @@ static void llbitmap_infect_dirty_bits(struct llbitmap *llbitmap, pctl->state[pos] = level_456 ? BitNeedSync : BitDirty; break; case BitClean: + case BitCleanUnwritten: pctl->state[pos] = BitDirty; break; } @@ -383,7 +450,7 @@ static void llbitmap_infect_dirty_bits(struct llbitmap *llbitmap, } static void llbitmap_set_page_dirty(struct llbitmap *llbitmap, int idx, - int offset) + int offset, bool infect) { struct llbitmap_page_ctl *pctl = llbitmap->pctl[idx]; unsigned int io_size = llbitmap->io_size; @@ -398,7 +465,7 @@ static void llbitmap_set_page_dirty(struct llbitmap *llbitmap, int idx, * resync all the dirty bits, hence skip infect new dirty bits to * prevent resync unnecessary data. 
*/ - if (llbitmap->mddev->degraded) { + if (llbitmap->mddev->degraded || !infect) { set_bit(block, pctl->dirty); return; } @@ -438,7 +505,9 @@ static void llbitmap_write(struct llbitmap *llbitmap, enum llbitmap_state state, llbitmap->pctl[idx]->state[bit] = state; if (state == BitDirty || state == BitNeedSync) - llbitmap_set_page_dirty(llbitmap, idx, bit); + llbitmap_set_page_dirty(llbitmap, idx, bit, true); + else if (state == BitNeedSyncUnwritten) + llbitmap_set_page_dirty(llbitmap, idx, bit, false); } static struct page *llbitmap_read_page(struct llbitmap *llbitmap, int idx) @@ -627,11 +696,10 @@ static enum llbitmap_state llbitmap_state_machine(struct llbitmap *llbitmap, goto write_bitmap; } - if (c == BitNeedSync) + if (c == BitNeedSync || c == BitNeedSyncUnwritten) need_resync = !mddev->degraded; state = state_machine[c][action]; - write_bitmap: if (unlikely(mddev->degraded)) { /* For degraded array, mark new data as need sync. */ @@ -658,8 +726,7 @@ static enum llbitmap_state llbitmap_state_machine(struct llbitmap *llbitmap, } llbitmap_write(llbitmap, state, start); - - if (state == BitNeedSync) + if (state == BitNeedSync || state == BitNeedSyncUnwritten) need_resync = !mddev->degraded; else if (state == BitDirty && !timer_pending(&llbitmap->pending_timer)) @@ -1229,7 +1296,7 @@ static bool llbitmap_blocks_synced(struct mddev *mddev, sector_t offset) unsigned long p = offset >> llbitmap->chunkshift; enum llbitmap_state c = llbitmap_read(llbitmap, p); - return c == BitClean || c == BitDirty; + return c == BitClean || c == BitDirty || c == BitCleanUnwritten; } static sector_t llbitmap_skip_sync_blocks(struct mddev *mddev, sector_t offset) @@ -1243,6 +1310,10 @@ static sector_t llbitmap_skip_sync_blocks(struct mddev *mddev, sector_t offset) if (c == BitUnwritten) return blocks; + /* Skip CleanUnwritten - no user data, will be reset after 
recovery */ + if (c == BitCleanUnwritten) + return blocks; + /* For degraded array, don't skip */ if (mddev->degraded) return 0; @@ -1261,14 +1332,25 @@ static bool llbitmap_start_sync(struct mddev *mddev, sector_t offset, { struct llbitmap *llbitmap = mddev->bitmap; unsigned long p = offset >> llbitmap->chunkshift; + enum llbitmap_state state; + + /* + * Before recovery starts, convert CleanUnwritten to Unwritten. + * This ensures the new disk won't have stale parity data. + */ + if (offset == 0 && test_bit(MD_RECOVERY_RECOVER, &mddev->recovery) && + !test_bit(MD_RECOVERY_LAZY_RECOVER, &mddev->recovery)) + llbitmap_state_machine(llbitmap, 0, llbitmap->chunks - 1, + BitmapActionClearUnwritten); + /* * Handle one bit at a time, this is much simpler. And it doesn't matter * if md_do_sync() loop more times. */ *blocks = llbitmap->chunksize - (offset & (llbitmap->chunksize - 1)); - return llbitmap_state_machine(llbitmap, p, p, - BitmapActionStartsync) == BitSyncing; + state = llbitmap_state_machine(llbitmap, p, p, BitmapActionStartsync); + return state == BitSyncing || state == BitSyncingUnwritten; } /* Something is wrong, sync_thread stop at @offset */ @@ -1474,9 +1556,15 @@ static ssize_t bits_show(struct mddev *mddev, char *page) } mutex_unlock(&mddev->bitmap_info.mutex); - return sprintf(page, "unwritten %d\nclean %d\ndirty %d\nneed sync %d\nsyncing %d\n", + return sprintf(page, + "unwritten %d\nclean %d\ndirty %d\n" + "need sync %d\nsyncing %d\n" + "need sync unwritten %d\nsyncing unwritten %d\n" + "clean unwritten %d\n", bits[BitUnwritten], bits[BitClean], bits[BitDirty], - bits[BitNeedSync], bits[BitSyncing]); + bits[BitNeedSync], bits[BitSyncing], + bits[BitNeedSyncUnwritten], bits[BitSyncingUnwritten], + bits[BitCleanUnwritten]); } static struct md_sysfs_entry llbitmap_bits = __ATTR_RO(bits); @@ -1549,11 +1637,39 @@ barrier_idle_store(struct mddev *mddev, const char *buf, size_t len) static struct 
md_sysfs_entry llbitmap_barrier_idle = __ATTR_RW(barrier_idle); +static ssize_t +proactive_sync_store(struct mddev *mddev, const char *buf, size_t len) +{ + struct llbitmap *llbitmap; + + /* Only for RAID-456 */ + if (!raid_is_456(mddev)) + return -EINVAL; + + mutex_lock(&mddev->bitmap_info.mutex); + llbitmap = mddev->bitmap; + if (!llbitmap || !llbitmap->pctl) { + mutex_unlock(&mddev->bitmap_info.mutex); + return -ENODEV; + } + + /* Trigger proactive sync on all Unwritten regions */ + llbitmap_state_machine(llbitmap, 0, llbitmap->chunks - 1, + BitmapActionProactiveSync); + + mutex_unlock(&mddev->bitmap_info.mutex); + return len; +} + +static struct md_sysfs_entry llbitmap_proactive_sync = + __ATTR(proactive_sync, 0200, NULL, proactive_sync_store); + static struct attribute *md_llbitmap_attrs[] = { &llbitmap_bits.attr, &llbitmap_metadata.attr, &llbitmap_daemon_sleep.attr, &llbitmap_barrier_idle.attr, + &llbitmap_proactive_sync.attr, NULL }; diff --git a/drivers/md/md.c b/drivers/md/md.c index d2607ed5c2e9..270802b8a4fc 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -9870,8 +9870,10 @@ void md_do_sync(struct md_thread *thread) * Give other IO more of a chance. * The faster the devices, the less we wait. 
*/ - wait_event(mddev->recovery_wait, - !atomic_read(&mddev->recovery_active)); + wait_event_timeout( + mddev->recovery_wait, + !atomic_read(&mddev->recovery_active), + HZ); } } } -- 2.51.0 From nobody Thu Apr 2 22:24:26 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B67B63009E1; Sat, 14 Feb 2026 06:10:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1771049427; cv=none; b=Cp4+/hj+8UP43ItX/1obG9gxIMzw+SS+wGWOWaYYmyClCOLcrq0nL+EGdT0YBtxeEcCXUXcZ45BzXnTxvMFFhcJ4one88V9//f3HWWw8ICsa9uzk9IE/T7qJ5TxHtQDICEs0pkH+nEeQSMlt/joQgdD8JVo7H/P6TFgr/UqEe3c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1771049427; c=relaxed/simple; bh=jVBk1b/wivXQrHrNwPugL4S2SWbQ6unlwk6a0c7bpYs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=k1V/5P6MwfPGiad6yOhBqODcfkiMdHewI1KBs82gEMRLwfI0nhPXoxwpS3GIgq2vgVcuU0obDALhn1BbGD3y7La1b2Q5WGn3uc7iTKqrxpvoX6eWab/4OYSBOUxhALCVpL7Tp2AxECdNFX4ruzRrAoyQ0OjyrCA3VliLWgpSjsk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id DA9C9C19421; Sat, 14 Feb 2026 06:10:25 +0000 (UTC) From: Yu Kuai To: song@kernel.org Cc: linan122@huawei.com, xni@redhat.com, colyli@fnnas.com, linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 5/5] md/md-llbitmap: optimize initial sync with write_zeroes_unmap support Date: Sat, 14 Feb 2026 14:10:13 +0800 Message-ID: <20260214061013.2335604-6-yukuai@fnnas.com> X-Mailer: git-send-email 2.51.0 In-Reply-To: <20260214061013.2335604-1-yukuai@fnnas.com> References: <20260214061013.2335604-1-yukuai@fnnas.com> Precedence: bulk 
X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" For RAID-456 arrays with llbitmap, if all underlying disks support write_zeroes with unmap, issue write_zeroes to zero all disk data regions and initialize the bitmap to BitCleanUnwritten instead of BitUnwritten. This optimization skips the initial XOR parity building because: 1. write_zeroes with unmap guarantees zeroed reads after the operation 2. For RAID-456, when all data is zero, parity is automatically consistent (0 XOR 0 XOR ... = 0) 3. BitCleanUnwritten indicates parity is valid but no user data has been written The implementation adds two helper functions: - llbitmap_all_disks_support_wzeroes_unmap(): Checks if all active disks support write_zeroes with unmap - llbitmap_zero_all_disks(): Issues blkdev_issue_zeroout() to each rdev's data region to zero all disks The zeroing and bitmap state setting happen in llbitmap_init_state() during bitmap initialization. If any disk fails to zero, we fall back to BitUnwritten and normal lazy recovery. This significantly reduces array initialization time for RAID-456 arrays built on modern NVMe SSDs or other devices that support write_zeroes with unmap. Signed-off-by: Yu Kuai --- drivers/md/md-llbitmap.c | 62 +++++++++++++++++++++++++++++++++++++++- 1 file changed, 61 insertions(+), 1 deletion(-) diff --git a/drivers/md/md-llbitmap.c b/drivers/md/md-llbitmap.c index 461050b2771b..48bc6a639edd 100644 --- a/drivers/md/md-llbitmap.c +++ b/drivers/md/md-llbitmap.c @@ -654,13 +654,73 @@ static int llbitmap_cache_pages(struct llbitmap *llbitmap) return 0; } +/* + * Check if all underlying disks support write_zeroes with unmap. 
+ */ +static bool llbitmap_all_disks_support_wzeroes_unmap(struct llbitmap *llbitmap) +{ + struct mddev *mddev = llbitmap->mddev; + struct md_rdev *rdev; + + rdev_for_each(rdev, mddev) { + if (rdev->raid_disk < 0 || test_bit(Faulty, &rdev->flags)) + continue; + + if (bdev_write_zeroes_unmap_sectors(rdev->bdev) == 0) + return false; + } + + return true; +} + +/* + * Issue write_zeroes to all underlying disks to zero their data regions. + * This ensures parity consistency for RAID-456 (0 XOR 0 = 0). + * Returns true if all disks were successfully zeroed. + */ +static bool llbitmap_zero_all_disks(struct llbitmap *llbitmap) +{ + struct mddev *mddev = llbitmap->mddev; + struct md_rdev *rdev; + sector_t dev_sectors = mddev->dev_sectors; + int ret; + + rdev_for_each(rdev, mddev) { + if (rdev->raid_disk < 0 || test_bit(Faulty, &rdev->flags)) + continue; + + ret = blkdev_issue_zeroout(rdev->bdev, + rdev->data_offset, + dev_sectors, + GFP_KERNEL, 0); + if (ret) { + pr_warn("md/llbitmap: failed to zero disk %pg: %d\n", + rdev->bdev, ret); + return false; + } + } + + return true; +} + static void llbitmap_init_state(struct llbitmap *llbitmap) { + struct mddev *mddev = llbitmap->mddev; enum llbitmap_state state = BitUnwritten; unsigned long i; - if (test_and_clear_bit(BITMAP_CLEAN, &llbitmap->flags)) + if (test_and_clear_bit(BITMAP_CLEAN, &llbitmap->flags)) { state = BitClean; + } else if (raid_is_456(mddev) && + llbitmap_all_disks_support_wzeroes_unmap(llbitmap)) { + /* + * All disks support write_zeroes with unmap. Zero all disks + * to ensure parity consistency, then set BitCleanUnwritten + * to skip initial sync. + */ + if (llbitmap_zero_all_disks(llbitmap)) + state = BitCleanUnwritten; + } for (i = 0; i < llbitmap->chunks; i++) llbitmap_write(llbitmap, state, i); -- 2.51.0
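The parity argument in the commit message of patch 5/5 (all-zero data implies consistent XOR parity) can be checked with a small userspace sketch. The helpers `xor_parity` and `is_all_zero` and the sizes `NR_DATA_DISKS` and `CHUNK_BYTES` are hypothetical; the sketch models only RAID-5 style XOR (P) parity and is not the kernel's raid5 code, nor does it cover RAID-6 Q parity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NR_DATA_DISKS 3   /* hypothetical number of data disks in a stripe */
#define CHUNK_BYTES   16  /* hypothetical chunk size, kept tiny for the demo */

/* Compute XOR parity: parity[i] is the XOR of byte i of every data chunk. */
static void xor_parity(unsigned char data[NR_DATA_DISKS][CHUNK_BYTES],
		       unsigned char parity[CHUNK_BYTES])
{
	memset(parity, 0, CHUNK_BYTES);
	for (size_t d = 0; d < NR_DATA_DISKS; d++)
		for (size_t i = 0; i < CHUNK_BYTES; i++)
			parity[i] ^= data[d][i];
}

/* Return 1 if every byte in buf is zero, 0 otherwise. */
static int is_all_zero(const unsigned char *buf, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (buf[i])
			return 0;
	return 1;
}
```

With every data chunk zeroed, `xor_parity()` produces an all-zero parity chunk, so a zeroed parity disk is already consistent with the data. As soon as one data byte changes, the matching parity byte must change too, which is why a write to a CleanUnwritten region still has to go through the normal dirty tracking.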