From nobody Thu Apr 2 23:55:46 2026
From: Yu Kuai
To: song@kernel.org
Cc: linan122@huawei.com, xni@redhat.com, colyli@fnnas.com,
	linux-raid@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 4/5] md/md-llbitmap: add CleanUnwritten state for RAID-5
 proactive parity building
Date: Mon, 23 Feb 2026 10:40:37 +0800
Message-ID: <20260223024038.3084853-5-yukuai@fnnas.com>
X-Mailer: git-send-email 2.51.0
In-Reply-To: <20260223024038.3084853-1-yukuai@fnnas.com>
References: <20260223024038.3084853-1-yukuai@fnnas.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="utf-8"

Add new states to the llbitmap state machine to support proactive XOR
parity building for RAID-5 arrays. This allows users to pre-build parity
data for unwritten regions before any user data is written.

New states added:
- BitNeedSyncUnwritten: transitional state entered when proactive sync is
  triggered via sysfs on Unwritten regions.
- BitSyncingUnwritten: proactive sync in progress for an unwritten region.
- BitCleanUnwritten: XOR parity has been pre-built, but no user data has
  been written yet. When the user writes to this region, it transitions
  to BitDirty.

New actions added:
- BitmapActionProactiveSync: trigger for proactive XOR parity building.
- BitmapActionClearUnwritten: convert CleanUnwritten/NeedSyncUnwritten/
  SyncingUnwritten states back to Unwritten before recovery starts.

State flows:
- Current (lazy): Unwritten -> (write) -> NeedSync -> (sync) -> Dirty -> Clean
- New (proactive): Unwritten -> (sysfs) -> NeedSyncUnwritten -> (sync) ->
  CleanUnwritten
- On write to CleanUnwritten: CleanUnwritten -> (write) -> Dirty -> Clean
- On disk replacement: CleanUnwritten regions are converted to Unwritten
  before recovery starts, so recovery only rebuilds regions that hold user
  data.

A new write-only sysfs interface is added at
/sys/block/mdX/md/llbitmap/proactive_sync to trigger proactive sync. It is
only supported for RAID-456 arrays.

Signed-off-by: Yu Kuai
---
 drivers/md/md-llbitmap.c | 140 +++++++++++++++++++++++++++++++++++----
 drivers/md/md.c          |   6 +-
 2 files changed, 132 insertions(+), 14 deletions(-)

diff --git a/drivers/md/md-llbitmap.c b/drivers/md/md-llbitmap.c
index 5f9e7004e3e3..461050b2771b 100644
--- a/drivers/md/md-llbitmap.c
+++ b/drivers/md/md-llbitmap.c
@@ -208,6 +208,20 @@ enum llbitmap_state {
 	BitNeedSync,
 	/* data is synchronizing */
 	BitSyncing,
+	/*
+	 * Proactive sync requested for unwritten region (raid456 only).
+	 * Triggered via sysfs when user wants to pre-build XOR parity
+	 * for regions that have never been written.
+	 */
+	BitNeedSyncUnwritten,
+	/* Proactive sync in progress for unwritten region */
+	BitSyncingUnwritten,
+	/*
+	 * XOR parity has been pre-built for a region that has never had
+	 * user data written. When user writes to this region, it transitions
+	 * to BitDirty.
+	 */
+	BitCleanUnwritten,
 	BitStateCount,
 	BitNone = 0xff,
 };
@@ -232,6 +246,12 @@ enum llbitmap_action {
 	 * BitNeedSync.
 	 */
 	BitmapActionStale,
+	/*
+	 * Proactive sync trigger for raid456 - builds XOR parity for
+	 * Unwritten regions without requiring user data write first.
+	 */
+	BitmapActionProactiveSync,
+	BitmapActionClearUnwritten,
 	BitmapActionCount,
 	/* Init state is BitUnwritten */
 	BitmapActionInit,
@@ -304,6 +324,8 @@ static char state_machine[BitStateCount][BitmapActionCount] = {
 		[BitmapActionDaemon] = BitNone,
 		[BitmapActionDiscard] = BitNone,
 		[BitmapActionStale] = BitNone,
+		[BitmapActionProactiveSync] = BitNeedSyncUnwritten,
+		[BitmapActionClearUnwritten] = BitNone,
 	},
 	[BitClean] = {
 		[BitmapActionStartwrite] = BitDirty,
@@ -314,6 +336,8 @@ static char state_machine[BitStateCount][BitmapActionCount] = {
 		[BitmapActionDaemon] = BitNone,
 		[BitmapActionDiscard] = BitUnwritten,
 		[BitmapActionStale] = BitNeedSync,
+		[BitmapActionProactiveSync] = BitNone,
+		[BitmapActionClearUnwritten] = BitNone,
 	},
 	[BitDirty] = {
 		[BitmapActionStartwrite] = BitNone,
@@ -324,6 +348,8 @@ static char state_machine[BitStateCount][BitmapActionCount] = {
 		[BitmapActionDaemon] = BitClean,
 		[BitmapActionDiscard] = BitUnwritten,
 		[BitmapActionStale] = BitNeedSync,
+		[BitmapActionProactiveSync] = BitNone,
+		[BitmapActionClearUnwritten] = BitNone,
 	},
 	[BitNeedSync] = {
 		[BitmapActionStartwrite] = BitNone,
@@ -334,6 +360,8 @@ static char state_machine[BitStateCount][BitmapActionCount] = {
 		[BitmapActionDaemon] = BitNone,
 		[BitmapActionDiscard] = BitUnwritten,
 		[BitmapActionStale] = BitNone,
+		[BitmapActionProactiveSync] = BitNone,
+		[BitmapActionClearUnwritten] = BitNone,
 	},
 	[BitSyncing] = {
 		[BitmapActionStartwrite] = BitNone,
@@ -344,6 +372,44 @@ static char state_machine[BitStateCount][BitmapActionCount] = {
 		[BitmapActionDaemon] = BitNone,
 		[BitmapActionDiscard] = BitUnwritten,
 		[BitmapActionStale] = BitNeedSync,
+		[BitmapActionProactiveSync] = BitNone,
+		[BitmapActionClearUnwritten] = BitNone,
+	},
+	[BitNeedSyncUnwritten] = {
+		[BitmapActionStartwrite] = BitNeedSync,
+		[BitmapActionStartsync] = BitSyncingUnwritten,
+		[BitmapActionEndsync] = BitNone,
+		[BitmapActionAbortsync] = BitUnwritten,
+		[BitmapActionReload] = BitUnwritten,
+		[BitmapActionDaemon] = BitNone,
+		[BitmapActionDiscard] = BitUnwritten,
+		[BitmapActionStale] = BitUnwritten,
+		[BitmapActionProactiveSync] = BitNone,
+		[BitmapActionClearUnwritten] = BitUnwritten,
+	},
+	[BitSyncingUnwritten] = {
+		[BitmapActionStartwrite] = BitSyncing,
+		[BitmapActionStartsync] = BitSyncingUnwritten,
+		[BitmapActionEndsync] = BitCleanUnwritten,
+		[BitmapActionAbortsync] = BitUnwritten,
+		[BitmapActionReload] = BitUnwritten,
+		[BitmapActionDaemon] = BitNone,
+		[BitmapActionDiscard] = BitUnwritten,
+		[BitmapActionStale] = BitUnwritten,
+		[BitmapActionProactiveSync] = BitNone,
+		[BitmapActionClearUnwritten] = BitUnwritten,
+	},
+	[BitCleanUnwritten] = {
+		[BitmapActionStartwrite] = BitDirty,
+		[BitmapActionStartsync] = BitNone,
+		[BitmapActionEndsync] = BitNone,
+		[BitmapActionAbortsync] = BitNone,
+		[BitmapActionReload] = BitNone,
+		[BitmapActionDaemon] = BitNone,
+		[BitmapActionDiscard] = BitUnwritten,
+		[BitmapActionStale] = BitUnwritten,
+		[BitmapActionProactiveSync] = BitNone,
+		[BitmapActionClearUnwritten] = BitUnwritten,
 	},
 };
 
@@ -376,6 +442,7 @@ static void llbitmap_infect_dirty_bits(struct llbitmap *llbitmap,
 			pctl->state[pos] = level_456 ?
 					   BitNeedSync : BitDirty;
 			break;
 		case BitClean:
+		case BitCleanUnwritten:
 			pctl->state[pos] = BitDirty;
 			break;
 		}
@@ -383,7 +450,7 @@ static void llbitmap_infect_dirty_bits(struct llbitmap *llbitmap,
 }
 
 static void llbitmap_set_page_dirty(struct llbitmap *llbitmap, int idx,
-				    int offset)
+				    int offset, bool infect)
 {
 	struct llbitmap_page_ctl *pctl = llbitmap->pctl[idx];
 	unsigned int io_size = llbitmap->io_size;
@@ -398,7 +465,7 @@ static void llbitmap_set_page_dirty(struct llbitmap *llbitmap, int idx,
 	 * resync all the dirty bits, hence skip infect new dirty bits to
 	 * prevent resync unnecessary data.
 	 */
-	if (llbitmap->mddev->degraded) {
+	if (llbitmap->mddev->degraded || !infect) {
 		set_bit(block, pctl->dirty);
 		return;
 	}
@@ -438,7 +505,9 @@ static void llbitmap_write(struct llbitmap *llbitmap, enum llbitmap_state state,
 
 	llbitmap->pctl[idx]->state[bit] = state;
 	if (state == BitDirty || state == BitNeedSync)
-		llbitmap_set_page_dirty(llbitmap, idx, bit);
+		llbitmap_set_page_dirty(llbitmap, idx, bit, true);
+	else if (state == BitNeedSyncUnwritten)
+		llbitmap_set_page_dirty(llbitmap, idx, bit, false);
 }
 
 static struct page *llbitmap_read_page(struct llbitmap *llbitmap, int idx)
@@ -627,11 +696,10 @@ static enum llbitmap_state llbitmap_state_machine(struct llbitmap *llbitmap,
 		goto write_bitmap;
 	}
 
-	if (c == BitNeedSync)
+	if (c == BitNeedSync || c == BitNeedSyncUnwritten)
 		need_resync = !mddev->degraded;
 
 	state = state_machine[c][action];
-
 write_bitmap:
 	if (unlikely(mddev->degraded)) {
 		/* For degraded array, mark new data as need sync.
 		 */
@@ -658,8 +726,7 @@ static enum llbitmap_state llbitmap_state_machine(struct llbitmap *llbitmap,
 	}
 
 	llbitmap_write(llbitmap, state, start);
-
-	if (state == BitNeedSync)
+	if (state == BitNeedSync || state == BitNeedSyncUnwritten)
 		need_resync = !mddev->degraded;
 	else if (state == BitDirty &&
 		 !timer_pending(&llbitmap->pending_timer))
@@ -1229,7 +1296,7 @@ static bool llbitmap_blocks_synced(struct mddev *mddev, sector_t offset)
 	unsigned long p = offset >> llbitmap->chunkshift;
 	enum llbitmap_state c = llbitmap_read(llbitmap, p);
 
-	return c == BitClean || c == BitDirty;
+	return c == BitClean || c == BitDirty || c == BitCleanUnwritten;
 }
 
 static sector_t llbitmap_skip_sync_blocks(struct mddev *mddev, sector_t offset)
@@ -1243,6 +1310,10 @@ static sector_t llbitmap_skip_sync_blocks(struct mddev *mddev, sector_t offset)
 	if (c == BitUnwritten)
 		return blocks;
 
+	/* Skip CleanUnwritten - no user data, will be reset after recovery */
+	if (c == BitCleanUnwritten)
+		return blocks;
+
 	/* For degraded array, don't skip */
 	if (mddev->degraded)
 		return 0;
@@ -1261,14 +1332,25 @@ static bool llbitmap_start_sync(struct mddev *mddev, sector_t offset,
 {
 	struct llbitmap *llbitmap = mddev->bitmap;
 	unsigned long p = offset >> llbitmap->chunkshift;
+	enum llbitmap_state state;
+
+	/*
+	 * Before recovery starts, convert CleanUnwritten to Unwritten.
+	 * This ensures the new disk won't have stale parity data.
+	 */
+	if (offset == 0 && test_bit(MD_RECOVERY_RECOVER, &mddev->recovery) &&
+	    !test_bit(MD_RECOVERY_LAZY_RECOVER, &mddev->recovery))
+		llbitmap_state_machine(llbitmap, 0, llbitmap->chunks - 1,
+				       BitmapActionClearUnwritten);
+
 
 	/*
 	 * Handle one bit at a time, this is much simpler. And it doesn't matter
 	 * if md_do_sync() loop more times.
 	 */
 	*blocks = llbitmap->chunksize - (offset & (llbitmap->chunksize - 1));
-	return llbitmap_state_machine(llbitmap, p, p,
-				      BitmapActionStartsync) == BitSyncing;
+	state = llbitmap_state_machine(llbitmap, p, p, BitmapActionStartsync);
+	return state == BitSyncing || state == BitSyncingUnwritten;
 }
 
 /* Something is wrong, sync_thread stop at @offset */
@@ -1474,9 +1556,15 @@ static ssize_t bits_show(struct mddev *mddev, char *page)
 	}
 
 	mutex_unlock(&mddev->bitmap_info.mutex);
-	return sprintf(page, "unwritten %d\nclean %d\ndirty %d\nneed sync %d\nsyncing %d\n",
+	return sprintf(page,
+		       "unwritten %d\nclean %d\ndirty %d\n"
+		       "need sync %d\nsyncing %d\n"
+		       "need sync unwritten %d\nsyncing unwritten %d\n"
+		       "clean unwritten %d\n",
 		       bits[BitUnwritten], bits[BitClean], bits[BitDirty],
-		       bits[BitNeedSync], bits[BitSyncing]);
+		       bits[BitNeedSync], bits[BitSyncing],
+		       bits[BitNeedSyncUnwritten], bits[BitSyncingUnwritten],
+		       bits[BitCleanUnwritten]);
 }
 
 static struct md_sysfs_entry llbitmap_bits = __ATTR_RO(bits);
@@ -1549,11 +1637,39 @@ barrier_idle_store(struct mddev *mddev, const char *buf, size_t len)
 
 static struct md_sysfs_entry llbitmap_barrier_idle = __ATTR_RW(barrier_idle);
 
+static ssize_t
+proactive_sync_store(struct mddev *mddev, const char *buf, size_t len)
+{
+	struct llbitmap *llbitmap;
+
+	/* Only for RAID-456 */
+	if (!raid_is_456(mddev))
+		return -EINVAL;
+
+	mutex_lock(&mddev->bitmap_info.mutex);
+	llbitmap = mddev->bitmap;
+	if (!llbitmap || !llbitmap->pctl) {
+		mutex_unlock(&mddev->bitmap_info.mutex);
+		return -ENODEV;
+	}
+
+	/* Trigger proactive sync on all Unwritten regions */
+	llbitmap_state_machine(llbitmap, 0, llbitmap->chunks - 1,
+			       BitmapActionProactiveSync);
+
+	mutex_unlock(&mddev->bitmap_info.mutex);
+	return len;
+}
+
+static struct md_sysfs_entry llbitmap_proactive_sync =
+	__ATTR(proactive_sync, 0200, NULL, proactive_sync_store);
+
 static struct attribute *md_llbitmap_attrs[] = {
 	&llbitmap_bits.attr,
 	&llbitmap_metadata.attr,
 	&llbitmap_daemon_sleep.attr,
 	&llbitmap_barrier_idle.attr,
+	&llbitmap_proactive_sync.attr,
 	NULL
 };
 
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 245785ad0ffd..b6543d81ac96 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -9878,8 +9878,10 @@ void md_do_sync(struct md_thread *thread)
 			 * Give other IO more of a chance.
 			 * The faster the devices, the less we wait.
 			 */
-			wait_event(mddev->recovery_wait,
-				   !atomic_read(&mddev->recovery_active));
+			wait_event_timeout(
+				mddev->recovery_wait,
+				!atomic_read(&mddev->recovery_active),
+				HZ);
 		}
 	}
 }
-- 
2.51.0