fs/fat/fatent.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
KCSAN reported a data-race between fat12_ent_put() and fat_mirror_bhs():
one CPU was updating a FAT12 entry while another CPU was copying the
whole sector to mirror FATs.Fix this by protecting memcpy() in
fat_mirror_bhs() with the same fat12_entry_lock that guards
fat12_ent_put(),so that the writer and the mirror operation
are mutually exclusive.
FAT-fs (loop5): error, clusters badly computed (404 != 401)
==================================================================
BUG: KCSAN: data-race in fat12_ent_put / fat_mirror_bhs
write to 0xffff888106c953e9 of 1 bytes by task 7452 on cpu 1:
fat12_ent_put+0x74/0x170 fs/fat/fatent.c:168
fat_ent_write+0x69/0xe0 fs/fat/fatent.c:417
fat_chain_add+0x15d/0x440 fs/fat/misc.c:136
fat_add_cluster fs/fat/inode.c:112 [inline]
__fat_get_block fs/fat/inode.c:154 [inline]
fat_get_block+0x46c/0x5e0 fs/fat/inode.c:189
__block_write_begin_int+0x3fd/0xf90 fs/buffer.c:2145
block_write_begin fs/buffer.c:2256 [inline]
cont_write_begin+0x5fc/0x970 fs/buffer.c:2594
fat_write_begin+0x4f/0xe0 fs/fat/inode.c:229
cont_expand_zero fs/buffer.c:2522 [inline]
cont_write_begin+0x1ad/0x970 fs/buffer.c:2584
fat_write_begin+0x4f/0xe0 fs/fat/inode.c:229
generic_perform_write+0x184/0x490 mm/filemap.c:4175
__generic_file_write_iter+0x9e/0x120 mm/filemap.c:4292
generic_file_write_iter+0x8d/0x2f0 mm/filemap.c:4318
new_sync_write fs/read_write.c:593 [inline]
vfs_write+0x52a/0x960 fs/read_write.c:686
ksys_pwrite64 fs/read_write.c:793 [inline]
__do_sys_pwrite64 fs/read_write.c:801 [inline]
__se_sys_pwrite64 fs/read_write.c:798 [inline]
__x64_sys_pwrite64+0xfd/0x150 fs/read_write.c:798
x64_sys_call+0xc4d/0x2ff0 arch/x86/include/generated/asm/syscalls_64.h:19
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xd2/0x200 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
read to 0xffff888106c95200 of 512 bytes by task 7433 on cpu 0:
fat_mirror_bhs+0x1df/0x320 fs/fat/fatent.c:395
fat_ent_write+0xd0/0xe0 fs/fat/fatent.c:423
fat_free fs/fat/file.c:363 [inline]
fat_truncate_blocks+0x353/0x550 fs/fat/file.c:394
fat_write_failed fs/fat/inode.c:218 [inline]
fat_write_end+0xba/0x160 fs/fat/inode.c:246
generic_perform_write+0x312/0x490 mm/filemap.c:4196
__generic_file_write_iter+0x9e/0x120 mm/filemap.c:4292
generic_file_write_iter+0x8d/0x2f0 mm/filemap.c:4318
new_sync_write fs/read_write.c:593 [inline]
vfs_write+0x52a/0x960 fs/read_write.c:686
ksys_write+0xda/0x1a0 fs/read_write.c:738
__do_sys_write fs/read_write.c:749 [inline]
__se_sys_write fs/read_write.c:746 [inline]
__x64_sys_write+0x40/0x50 fs/read_write.c:746
x64_sys_call+0x27fe/0x2ff0 arch/x86/include/generated/asm/syscalls_64.h:2
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xd2/0x200 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Signed-off-by: YangWen <anmuxixixi@gmail.com>
---
fs/fat/fatent.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/fs/fat/fatent.c b/fs/fat/fatent.c
index a7061c2ad8e4..e25775642489 100644
--- a/fs/fat/fatent.c
+++ b/fs/fat/fatent.c
@@ -379,6 +379,7 @@ static int fat_mirror_bhs(struct super_block *sb, struct buffer_head **bhs,
struct msdos_sb_info *sbi = MSDOS_SB(sb);
struct buffer_head *c_bh;
int err, n, copy;
+ bool is_fat12 = is_fat12(sbi);
err = 0;
for (copy = 1; copy < sbi->fats; copy++) {
@@ -392,7 +393,17 @@ static int fat_mirror_bhs(struct super_block *sb, struct buffer_head **bhs,
}
/* Avoid race with userspace read via bdev */
lock_buffer(c_bh);
+ /*
+ * For FAT12, protect memcpy() of the source sector
+ * against concurrent 12-bit entry updates in
+ * fat12_ent_put(), otherwise we may copy a torn
+ * pair of bytes into the mirror FAT.
+ */
+ if (is_fat12)
+ spin_lock(&fat12_entry_lock);
memcpy(c_bh->b_data, bhs[n]->b_data, sb->s_blocksize);
+ if (is_fat12)
+ spin_unlock(&fat12_entry_lock);
set_buffer_uptodate(c_bh);
unlock_buffer(c_bh);
mark_buffer_dirty_inode(c_bh, sbi->fat_inode);
--
2.43.0
Hi YangWen, kernel test robot noticed the following build errors: [auto build test ERROR on brauner-vfs/vfs.all] [also build test ERROR on linus/master v6.17-rc4 next-20250902] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/YangWen/fat-fix-data-race-between-fat12_ent_put-and-fat_mirror_bhs/20250902-162253 base: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git vfs.all patch link: https://lore.kernel.org/r/20250902081727.7146-1-anmuxixixi%40gmail.com patch subject: [PATCH] fat: fix data-race between fat12_ent_put() and fat_mirror_bhs() config: csky-randconfig-002-20250903 (https://download.01.org/0day-ci/archive/20250903/202509030347.6cUT1zxe-lkp@intel.com/config) compiler: csky-linux-gcc (GCC) 15.1.0 reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250903/202509030347.6cUT1zxe-lkp@intel.com/reproduce) If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202509030347.6cUT1zxe-lkp@intel.com/ All errors (new ones prefixed by >>): fs/fat/fatent.c: In function 'fat_mirror_bhs': >> fs/fat/fatent.c:382:25: error: called object 'is_fat12' is not a function or function pointer 382 | bool is_fat12 = is_fat12(sbi); | ^~~~~~~~ fs/fat/fatent.c:382:14: note: declared here 382 | bool is_fat12 = is_fat12(sbi); | ^~~~~~~~ vim +/is_fat12 +382 fs/fat/fatent.c 374 375 /* FIXME: We can write the blocks as more big chunk. */ 376 static int fat_mirror_bhs(struct super_block *sb, struct buffer_head **bhs, 377 int nr_bhs) 378 { 379 struct msdos_sb_info *sbi = MSDOS_SB(sb); 380 struct buffer_head *c_bh; 381 int err, n, copy; > 382 bool is_fat12 = is_fat12(sbi); 383 384 err = 0; 385 for (copy = 1; copy < sbi->fats; copy++) { 386 sector_t backup_fat = sbi->fat_length * copy; 387 388 for (n = 0; n < nr_bhs; n++) { 389 c_bh = sb_getblk(sb, backup_fat + bhs[n]->b_blocknr); 390 if (!c_bh) { 391 err = -ENOMEM; 392 goto error; 393 } 394 /* Avoid race with userspace read via bdev */ 395 lock_buffer(c_bh); 396 /* 397 * For FAT12, protect memcpy() of the source sector 398 * against concurrent 12-bit entry updates in 399 * fat12_ent_put(), otherwise we may copy a torn 400 * pair of bytes into the mirror FAT. 401 */ 402 if (is_fat12) 403 spin_lock(&fat12_entry_lock); 404 memcpy(c_bh->b_data, bhs[n]->b_data, sb->s_blocksize); 405 if (is_fat12) 406 spin_unlock(&fat12_entry_lock); 407 set_buffer_uptodate(c_bh); 408 unlock_buffer(c_bh); 409 mark_buffer_dirty_inode(c_bh, sbi->fat_inode); 410 if (sb->s_flags & SB_SYNCHRONOUS) 411 err = sync_dirty_buffer(c_bh); 412 brelse(c_bh); 413 if (err) 414 goto error; 415 } 416 } 417 error: 418 return err; 419 } 420 -- 0-DAY CI Kernel Test Service https://github.com/intel/lkp-tests/wiki
YangWen <anmuxixixi@gmail.com> writes: > KCSAN reported a data-race between fat12_ent_put() and fat_mirror_bhs(): > one CPU was updating a FAT12 entry while another CPU was copying the > whole sector to mirror FATs.Fix this by protecting memcpy() in > fat_mirror_bhs() with the same fat12_entry_lock that guards > fat12_ent_put(),so that the writer and the mirror operation > are mutually exclusive. Hm, what is wrong with temporary inconsistent? If it had the race with future modification, it can be temporary inconsistent. However, future modification will fix it by updating with latest blocks, right? Or did you actually get the inconsistent state after clean unmount? Thanks. > FAT-fs (loop5): error, clusters badly computed (404 != 401) > ================================================================== > BUG: KCSAN: data-race in fat12_ent_put / fat_mirror_bhs > > write to 0xffff888106c953e9 of 1 bytes by task 7452 on cpu 1: > fat12_ent_put+0x74/0x170 fs/fat/fatent.c:168 > fat_ent_write+0x69/0xe0 fs/fat/fatent.c:417 > fat_chain_add+0x15d/0x440 fs/fat/misc.c:136 > fat_add_cluster fs/fat/inode.c:112 [inline] > __fat_get_block fs/fat/inode.c:154 [inline] > fat_get_block+0x46c/0x5e0 fs/fat/inode.c:189 > __block_write_begin_int+0x3fd/0xf90 fs/buffer.c:2145 > block_write_begin fs/buffer.c:2256 [inline] > cont_write_begin+0x5fc/0x970 fs/buffer.c:2594 > fat_write_begin+0x4f/0xe0 fs/fat/inode.c:229 > cont_expand_zero fs/buffer.c:2522 [inline] > cont_write_begin+0x1ad/0x970 fs/buffer.c:2584 > fat_write_begin+0x4f/0xe0 fs/fat/inode.c:229 > generic_perform_write+0x184/0x490 mm/filemap.c:4175 > __generic_file_write_iter+0x9e/0x120 mm/filemap.c:4292 > generic_file_write_iter+0x8d/0x2f0 mm/filemap.c:4318 > new_sync_write fs/read_write.c:593 [inline] > vfs_write+0x52a/0x960 fs/read_write.c:686 > ksys_pwrite64 fs/read_write.c:793 [inline] > __do_sys_pwrite64 fs/read_write.c:801 [inline] > __se_sys_pwrite64 fs/read_write.c:798 [inline] > __x64_sys_pwrite64+0xfd/0x150 fs/read_write.c:798 > x64_sys_call+0xc4d/0x2ff0 arch/x86/include/generated/asm/syscalls_64.h:19 > do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] > do_syscall_64+0xd2/0x200 arch/x86/entry/syscall_64.c:94 > entry_SYSCALL_64_after_hwframe+0x77/0x7f > > read to 0xffff888106c95200 of 512 bytes by task 7433 on cpu 0: > fat_mirror_bhs+0x1df/0x320 fs/fat/fatent.c:395 > fat_ent_write+0xd0/0xe0 fs/fat/fatent.c:423 > fat_free fs/fat/file.c:363 [inline] > fat_truncate_blocks+0x353/0x550 fs/fat/file.c:394 > fat_write_failed fs/fat/inode.c:218 [inline] > fat_write_end+0xba/0x160 fs/fat/inode.c:246 > generic_perform_write+0x312/0x490 mm/filemap.c:4196 > __generic_file_write_iter+0x9e/0x120 mm/filemap.c:4292 > generic_file_write_iter+0x8d/0x2f0 mm/filemap.c:4318 > new_sync_write fs/read_write.c:593 [inline] > vfs_write+0x52a/0x960 fs/read_write.c:686 > ksys_write+0xda/0x1a0 fs/read_write.c:738 > __do_sys_write fs/read_write.c:749 [inline] > __se_sys_write fs/read_write.c:746 [inline] > __x64_sys_write+0x40/0x50 fs/read_write.c:746 > x64_sys_call+0x27fe/0x2ff0 arch/x86/include/generated/asm/syscalls_64.h:2 > do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] > do_syscall_64+0xd2/0x200 arch/x86/entry/syscall_64.c:94 > entry_SYSCALL_64_after_hwframe+0x77/0x7f > > Signed-off-by: YangWen <anmuxixixi@gmail.com> > --- > fs/fat/fatent.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/fs/fat/fatent.c b/fs/fat/fatent.c > index a7061c2ad8e4..e25775642489 100644 > --- a/fs/fat/fatent.c > +++ b/fs/fat/fatent.c > @@ -379,6 +379,7 @@ static int fat_mirror_bhs(struct super_block *sb, struct buffer_head **bhs, > struct msdos_sb_info *sbi = MSDOS_SB(sb); > struct buffer_head *c_bh; > int err, n, copy; > + bool is_fat12 = is_fat12(sbi); > > err = 0; > for (copy = 1; copy < sbi->fats; copy++) { > @@ -392,7 +393,17 @@ static int fat_mirror_bhs(struct super_block *sb, struct buffer_head **bhs, > } > /* Avoid race with userspace read via bdev */ > lock_buffer(c_bh); > + /* > + * For FAT12, protect memcpy() of the source sector > + * against concurrent 12-bit entry updates in > + * fat12_ent_put(), otherwise we may copy a torn > + * pair of bytes into the mirror FAT. > + */ > + if (is_fat12) > + spin_lock(&fat12_entry_lock); > memcpy(c_bh->b_data, bhs[n]->b_data, sb->s_blocksize); > + if (is_fat12) > + spin_unlock(&fat12_entry_lock); > set_buffer_uptodate(c_bh); > unlock_buffer(c_bh); > mark_buffer_dirty_inode(c_bh, sbi->fat_inode); -- OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Hi, On Tue, 02 Sep 2025 23:13:42 +0900, OGAWA Hirofumi wrote: > Hm, what is wrong with temporary inconsistent? > > If it had the race with future modification, it can be temporary > inconsistent. However, future modification will fix it by updating with > latest blocks, right? > > Or did you actually get the inconsistent state after clean unmount? Thanks for your comment. This is not only a temporary in-memory inconsistency. KCSAN detected a race where fat12_ent_put() updates two bytes of a 12-bit FAT entry while fat_mirror_bhs() concurrently memcpy()’s the entire sector. The mirror FAT may therefore receive a torn entry. Since fat_mirror_bhs() marks those buffers dirty, the corrupted mirror content can be flushed to disk. In our syzkaller testing, this already resulted in runtime errors such as: FAT-fs (loop4): error, clusters badly computed (421 != 418) FAT-fs (loop4): error, fat_bmap_cluster: request beyond EOF (i_pos 2075) These errors occurred even after a clean unmount, which suggests that the inconsistent FAT entries were actually written to disk and not corrected later by "future modification". FAT16/32 do not suffer from this problem because their entries are naturally aligned 16/32-bit accesses, which are atomic on supported architectures. FAT12 is special because of the 12-bit packing across two bytes. So I think it is necessary to protect memcpy() in fat_mirror_bhs() with fat12_entry_lock to avoid copying a torn FAT12 entry. Thanks.
Hi, On Tue, 02 Sep 2025 23:13:42 +0900, OGAWA Hirofumi wrote: > Hm, what is wrong with temporary inconsistent? > > If it had the race with future modification, it can be temporary > inconsistent. However, future modification will fix it by updating with > latest blocks, right? > > Or did you actually get the inconsistent state after clean unmount? Thanks for your comment. This is not only a temporary in-memory inconsistency. KCSAN detected a race where fat12_ent_put() updates two bytes of a 12-bit FAT entry while fat_mirror_bhs() concurrently memcpy()’s the entire sector. The mirror FAT may therefore receive a torn entry. Since fat_mirror_bhs() marks those buffers dirty, the corrupted mirror content can be flushed to disk. In our syzkaller testing, this already resulted in runtime errors such as: FAT-fs (loop4): error, clusters badly computed (421 != 418) FAT-fs (loop4): error, fat_bmap_cluster: request beyond EOF (i_pos 2075) These errors occurred even after a clean unmount, which suggests that the inconsistent FAT entries were actually written to disk and not corrected later by "future modification". FAT16/32 do not suffer from this problem because their entries are naturally aligned 16/32-bit accesses, which are atomic on supported architectures. FAT12 is special because of the 12-bit packing across two bytes. So I think it is necessary to protect memcpy() in fat_mirror_bhs() with fat12_entry_lock to avoid copying a torn FAT12 entry. Thanks.
YangWen <anmuxixixi@gmail.com> writes: > On Tue, 02 Sep 2025 23:13:42 +0900, OGAWA Hirofumi wrote: >> Hm, what is wrong with temporary inconsistent? >> >> If it had the race with future modification, it can be temporary >> inconsistent. However, future modification will fix it by updating with >> latest blocks, right? >> >> Or did you actually get the inconsistent state after clean unmount? > > Thanks for your comment. > > This is not only a temporary in-memory inconsistency. KCSAN detected a > race where fat12_ent_put() updates two bytes of a 12-bit FAT entry while > fat_mirror_bhs() concurrently memcpy()’s the entire sector. The mirror > FAT may therefore receive a torn entry. > > Since fat_mirror_bhs() marks those buffers dirty, the corrupted mirror > content can be flushed to disk. In our syzkaller testing, this already > resulted in runtime errors such as: > > FAT-fs (loop4): error, clusters badly computed (421 != 418) > FAT-fs (loop4): error, fat_bmap_cluster: request beyond EOF (i_pos 2075) > > These errors occurred even after a clean unmount, which suggests that the > inconsistent FAT entries were actually written to disk and not corrected > later by "future modification". > > FAT16/32 do not suffer from this problem because their entries are > naturally aligned 16/32-bit accesses, which are atomic on supported > architectures. FAT12 is special because of the 12-bit packing across > two bytes. > > So I think it is necessary to protect memcpy() in fat_mirror_bhs() with > fat12_entry_lock to avoid copying a torn FAT12 entry. Sounds like strange. FAT driver never read the mirror FAT area in runtime. Why did you think the mirror FAT affect to it? Thanks. -- OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Hi, OGAWA Hirofumi wrote: > Sounds like strange. FAT driver never read the mirror FAT area in > runtime. Why did you think the mirror FAT affect to it? Thanks for raising this point. FAT driver itself never reads the mirror FAT at runtime, so this race does not directly cause runtime corruption. However, if the primary FAT on disk becomes damaged, user-space tools such as fsck_msdos will consult the backup FAT copies in order to repair it. In that scenario, keeping the primary and backup FAT copies consistent is important. If they diverge due to a race between fat12_ent_put() and fat_mirror_bhs(), recovery by fsck_msdos may become unreliable. So my intention was not to fix a runtime problem, but rather to prevent unnecessary inconsistencies between the primary and backup FAT copies, which can help later recovery tools work as expected. Thanks.
YangWen <anmuxixixi@gmail.com> writes: > OGAWA Hirofumi wrote: >> Sounds like strange. FAT driver never read the mirror FAT area in >> runtime. Why did you think the mirror FAT affect to it? > > Thanks for raising this point. > > FAT driver itself never reads the mirror FAT at > runtime, so this race does not directly cause runtime corruption. > > However, if the primary FAT on disk becomes damaged, user-space tools > such as fsck_msdos will consult the backup FAT copies in order to > repair it. In that scenario, keeping the primary and backup FAT copies > consistent is important. If they diverge due to a race between > fat12_ent_put() and fat_mirror_bhs(), recovery by fsck_msdos > may become unreliable. > > So my intention was not to fix a runtime problem, but rather to prevent > unnecessary inconsistencies between the primary and backup FAT copies, > which can help later recovery tools work as expected. You are forgetting what I said first. I said, this should be temporary inconsistent. When unmount, temporary inconsistent should be fixed by later write out. IOW, I can't see why you claim this race can be the cause of permanent inconsistent. Thanks. -- OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Hi, OGAWA Hirofumi wrote: > You are forgetting what I said first. I said, this should be temporary > inconsistent. When unmount, temporary inconsistent should be fixed by > later write out. > > IOW, I can't see why you claim this race can be the cause of permanent > inconsistent. I don’t have a reproducer showing a permanent corruption after a clean unmount. My concern came only from KCSAN reports under syzkaller, and then I tried to reason from the code. In particular, in fat_mirror_bhs() there is a path: if (sb->s_flags & SB_SYNCHRONOUS) err = sync_dirty_buffer(c_bh); So with -o sync mount, if memcpy() observes a half-updated FAT12 entry, the torn value in the backup FAT buffer could be immediately written to disk. In that case, even though the primary FAT is later corrected, the backup FAT might persist inconsistent content. Thanks.
YangWen <anmuxixixi@gmail.com> writes: > Hi, > > OGAWA Hirofumi wrote: >> You are forgetting what I said first. I said, this should be temporary >> inconsistent. When unmount, temporary inconsistent should be fixed by >> later write out. >> >> IOW, I can't see why you claim this race can be the cause of permanent >> inconsistent. > > I don’t have a reproducer showing a permanent corruption > after a clean unmount. My concern came only from KCSAN reports under > syzkaller, and then I tried to reason from the code. > > In particular, in fat_mirror_bhs() there is a path: > > if (sb->s_flags & SB_SYNCHRONOUS) > err = sync_dirty_buffer(c_bh); > > So with -o sync mount, if memcpy() observes a half-updated FAT12 entry, > the torn value in the backup FAT buffer could be immediately written to > disk. In that case, even though the primary FAT is later corrected, the > backup FAT might persist inconsistent content. Sync mount doesn't try to keep all of consistency. It is trying to keep sync the minimum blocks for consistency. The primary should be always consistent, however this doesn't care much about mirror FAT. Thanks. -- OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Hi, OGAWA Hirofumi wrote: > Sync mount doesn't try to keep all of consistency. It is trying to keep > sync the minimum blocks for consistency. The primary should be always > consistent, however this doesn't care much about mirror FAT. Thanks a lot for your explanation. I understand your point and you are right.I just noticed that in fat_mirror_bhs(), the buffer head "c_bh" actually represents the mirror FAT blocks, and the code does: if (sb->s_flags & SB_SYNCHRONOUS) err = sync_dirty_buffer(c_bh); So on -o sync mounts the mirror FAT blocks are also forced to disk immediately. Thanks.
© 2016 - 2025 Red Hat, Inc.