From: Zhang Yi <yi.zhang@huawei.com>

Hello!

This series fixes a data corruption issue reported by Gao Xiang in
nojournal mode. The problem occurs after a metadata block is freed: the
block can be immediately reallocated as a data block. However, the
metadata on this block may still be in the process of being written
back, which means the new data in this block could be overwritten by the
stale metadata, causing data corruption. Please see the discussion with
Jan below for more details:

https://lore.kernel.org/linux-ext4/a9417096-9549-4441-9878-b1955b899b4e@huaweicloud.com/

Patch 1 strengthens the same case in ordered journal mode, theoretically
preventing the occurrence of stale data issues.
Patch 2 fixes this issue in nojournal mode.

Regards,
Yi.

Zhang Yi (2):
  jbd2: ensure that all ongoing I/O complete before freeing blocks
  ext4: wait for ongoing I/O to complete before freeing blocks

 fs/ext4/ext4_jbd2.c   | 11 +++++++++--
 fs/jbd2/transaction.c | 13 +++++++++----
 2 files changed, 18 insertions(+), 6 deletions(-)

-- 
2.46.1
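The race described in the cover letter can be illustrated with a small userspace toy model. This is only a sketch under stated assumptions: it is not kernel code and not the code from either patch, and every name in it (`toy_block`, `toy_submit_write`, `toy_free_block`, `toy_data_survives`, and so on) is hypothetical. It models one block whose metadata write has been submitted but not yet completed, and compares freeing the block with and without first waiting for that I/O.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_BLOCK_SIZE 16

/* Hypothetical toy model of one disk block; not a kernel structure. */
struct toy_block {
    char on_disk[TOY_BLOCK_SIZE];    /* what a later read from disk returns */
    char pending_io[TOY_BLOCK_SIZE]; /* payload of a write already submitted */
    bool io_in_flight;               /* submitted but not yet completed */
};

/* Submit a write: the payload is captured now but reaches the disk later. */
static void toy_submit_write(struct toy_block *b, const char *payload)
{
    strncpy(b->pending_io, payload, TOY_BLOCK_SIZE - 1);
    b->pending_io[TOY_BLOCK_SIZE - 1] = '\0';
    b->io_in_flight = true;
}

/* Complete the in-flight write, if any: it overwrites the block contents. */
static void toy_complete_io(struct toy_block *b)
{
    if (!b->io_in_flight)
        return;
    memcpy(b->on_disk, b->pending_io, TOY_BLOCK_SIZE);
    b->io_in_flight = false;
}

/* Free the block. With wait_for_io set, drain pending writeback first. */
static void toy_free_block(struct toy_block *b, bool wait_for_io)
{
    if (wait_for_io) {
        toy_complete_io(b);
        assert(!b->io_in_flight);
    }
}

/* The reallocated block receives new file data (modeled as landing at once). */
static void toy_write_data(struct toy_block *b, const char *data)
{
    strncpy(b->on_disk, data, TOY_BLOCK_SIZE - 1);
    b->on_disk[TOY_BLOCK_SIZE - 1] = '\0';
}

/* Returns true if the new data is still on disk after all I/O has finished. */
bool toy_data_survives(bool wait_for_io)
{
    struct toy_block b;

    memset(&b, 0, sizeof(b));
    toy_submit_write(&b, "old-metadata"); /* metadata writeback starts */
    toy_free_block(&b, wait_for_io);      /* metadata block is freed */
    toy_write_data(&b, "new-data");       /* block reused as a data block */
    toy_complete_io(&b);                  /* a still-pending stale write lands */
    return strcmp(b.on_disk, "new-data") == 0;
}
```

In this model, `toy_data_survives(false)` returns false because the stale metadata write completes after the new data is written, while `toy_data_survives(true)` returns true because the pending write is drained before the block can be reused.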
Hi Ted,

On 2025/9/16 17:33, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
>
> Hello!
>
> This series fixes an data corruption issue reported by Gao Xiang in
> nojournal mode. The problem is happened after a metadata block is freed,
> it can be immediately reallocated as a data block. However, the metadata
> on this block may still be in the process of being written back, which
> means the new data in this block could potentially be overwritten by the
> stale metadata and trigger a data corruption issue. Please see below
> discussion with Jan for more details:
>
> https://lore.kernel.org/linux-ext4/a9417096-9549-4441-9878-b1955b899b4e@huaweicloud.com/
>
> Patch 1 strengthens the same case in ordered journal mode, theoretically
> preventing the occurrence of stale data issues.
> Patch 2 fix this issue in nojournal mode.

It seems this series has not been applied. Was it missed?

When ext4 nojournal mode is used, this is a very serious bug, since
data corruption can happen very easily under specific conditions (we
have an environment that reproduces the issue very quickly).

It also seems that AWS folks reported this issue years ago (2021). The
symptoms were almost the same, but the issue still exists today:
https://lore.kernel.org/linux-ext4/20211108173520.xp6xphodfhcen2sy@u87e72aa3c6c25c.ant.amazon.com/

Some of our internal businesses rely on EXT4 no_journal mode. When they
upgraded the kernel from 4.19 to 5.10, they read corrupted data after
page cache memory was reclaimed (the on-disk data had actually been
corrupted even earlier).

So personally I wonder what the current status of EXT4 no_journal mode
is, since this issue has existed for more than 5 years, yet some people
need an extent-enabled ext2 and therefore selected this mode.

We have already released an announcement advising customers not to use
no_journal mode, because it seems to lack sufficient maintenance (yet
many end users are interested in this mode):
https://www.alibabacloud.com/help/en/alinux/support/data-corruption-risk-and-solution-in-ext4-nojounral-mode

Thanks,
Gao Xiang

>
> Regards,
> Yi.
>
> Zhang Yi (2):
>   jbd2: ensure that all ongoing I/O complete before freeing blocks
>   ext4: wait for ongoing I/O to complete before freeing blocks
>
>  fs/ext4/ext4_jbd2.c   | 11 +++++++++--
>  fs/jbd2/transaction.c | 13 +++++++++----
>  2 files changed, 18 insertions(+), 6 deletions(-)
>