From: Zhang Yi <yi.zhang@huawei.com>
When zeroing out a written partial block, it is necessary to order the
data to prevent exposing stale data on disk. However, if the buffer is
unwritten or delayed, it is not allocated as written, so ordering the
data is not required. Skipping it avoids strange and unnecessary ordered
writes when appending data across a region within a block.

Assume we have a 2K unwritten file on a filesystem with 4K blocksize,
and a buffered write from 3K to 4K. Before this patch,
__ext4_block_zero_page_range() would add the range [2K,3K) to the
ordered range, and then the JBD2 commit process would write back this
block. However, that writeback does nothing since the block is not
mapped as written, so the folio is redirtied and written back again
through the normal writeback process.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
fs/ext4/inode.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 2e79b09fe2f0..f2d70c9af446 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4109,9 +4109,13 @@ static int __ext4_block_zero_page_range(handle_t *handle,
 	if (ext4_should_journal_data(inode)) {
 		err = ext4_dirty_journalled_data(handle, bh);
 	} else {
-		err = 0;
 		mark_buffer_dirty(bh);
-		if (ext4_should_order_data(inode))
+		/*
+		 * Only the written block requires ordered data to prevent
+		 * exposing stale data.
+		 */
+		if (!buffer_unwritten(bh) && !buffer_delay(bh) &&
+		    ext4_should_order_data(inode))
 			err = ext4_jbd2_inode_add_write(handle, inode, from,
 					length);
 	}
--
2.52.0
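
For readers following the thread, the decision made by the patch can be
sketched as a small userspace model. This is only an illustration under
stated assumptions: struct model_bh, model_needs_ordering(),
model_zero_length() and model_selfcheck() are hypothetical names that do
not exist in the kernel, and the length calculation is a simplified guess
at how the [2K,3K) range in the example arises, not the ext4 code path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the buffer_head state bits; not a kernel type. */
struct model_bh {
	bool unwritten;	/* models buffer_unwritten(bh) */
	bool delay;	/* models buffer_delay(bh) */
};

/*
 * Mirrors the condition added by the patch: only a block that is already
 * allocated as written can expose stale on-disk data, so only such a
 * block needs to be added to the ordered range.
 */
bool model_needs_ordering(const struct model_bh *bh, bool order_data)
{
	return !bh->unwritten && !bh->delay && order_data;
}

/*
 * Simplified assumption: the zeroed range runs from the old end of file
 * to the start of the appending write, clamped to the block that holds
 * the old EOF. A block-aligned EOF has no partial block to zero.
 */
size_t model_zero_length(size_t old_size, size_t write_pos, size_t blocksize)
{
	size_t block_end, end;

	if (old_size % blocksize == 0)
		return 0;

	block_end = (old_size / blocksize + 1) * blocksize;
	end = write_pos < block_end ? write_pos : block_end;

	return end > old_size ? end - old_size : 0;
}

/* The commit message's example: 2K unwritten file, 4K blocks, write at 3K. */
void model_selfcheck(void)
{
	struct model_bh bh = { .unwritten = true, .delay = false };

	assert(model_zero_length(2048, 3072, 4096) == 1024);	/* [2K,3K) */
	assert(!model_needs_ordering(&bh, true));	/* nothing to order */
}
```

In this model the 1K gap is still zeroed and the buffer is still dirtied;
the only difference for an unwritten or delayed buffer is that no ordered
write is queued for the commit.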
On Tue, 23 Dec 2025 09:19:27 +0800, Zhang Yi wrote:
> When zeroing out a written partial block, it is necessary to order the
> data to prevent exposing stale data on disk. However, if the buffer is
> unwritten or delayed, it is not allocated as written, so ordering the
> data is not required. This can prevent strange and unnecessary ordered
> writes when appending data across a region within a block.
>
> Assume we have a 2K unwritten file on a filesystem with 4K blocksize,
> and buffered write from 3K to 4K. Before this patch,
> __ext4_block_zero_page_range() would add the range [2k,3k) to the
> ordered range, and then the JBD2 commit process would write back this
> block. However, it does nothing since the block is not mapped as
> written, this folio will be redirtied and written back again through the
> normal write back process.
>
> [...]
Applied, thanks!
[1/1] ext4: don't order data when zeroing unwritten or delayed block
commit: 154922b34da9770223d9883ac6976635a786b5ba
Best regards,
--
Theodore Ts'o <tytso@mit.edu>
On 2025-12-23 09:19, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
>
> When zeroing out a written partial block, it is necessary to order the
> data to prevent exposing stale data on disk. However, if the buffer is
> unwritten or delayed, it is not allocated as written, so ordering the
> data is not required. This can prevent strange and unnecessary ordered
> writes when appending data across a region within a block.
>
> Assume we have a 2K unwritten file on a filesystem with 4K blocksize,
> and buffered write from 3K to 4K. Before this patch,
> __ext4_block_zero_page_range() would add the range [2k,3k) to the
> ordered range, and then the JBD2 commit process would write back this
> block. However, it does nothing since the block is not mapped as
> written, this folio will be redirtied and written back again through the
> normal write back process.
>
> Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
> Reviewed-by: Jan Kara <jack@suse.cz>
Makes sense. Feel free to add:
Reviewed-by: Baokun Li <libaokun1@huawei.com>
> ---
> fs/ext4/inode.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 2e79b09fe2f0..f2d70c9af446 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -4109,9 +4109,13 @@ static int __ext4_block_zero_page_range(handle_t *handle,
>  	if (ext4_should_journal_data(inode)) {
>  		err = ext4_dirty_journalled_data(handle, bh);
>  	} else {
> -		err = 0;
>  		mark_buffer_dirty(bh);
> -		if (ext4_should_order_data(inode))
> +		/*
> +		 * Only the written block requires ordered data to prevent
> +		 * exposing stale data.
> +		 */
> +		if (!buffer_unwritten(bh) && !buffer_delay(bh) &&
> +		    ext4_should_order_data(inode))
>  			err = ext4_jbd2_inode_add_write(handle, inode, from,
>  					length);
>  	}