syzbot reports a data-race in fat32_ent_get/fat32_ent_put.

CPU0 (Task A)               CPU1 (Task B)
====                        ====
vfs_write
new_sync_write
generic_file_write_iter
fat_write_begin
block_write_begin           vfs_statfs
fat_get_block               statfs_by_dentry
fat_add_cluster             fat_statfs
fat_ent_write               fat_count_free_clusters
fat32_ent_put               fat32_ent_get

Task A's write operation on CPU0 and Task B's read operation on CPU1 occur
simultaneously, generating a race condition.

Add READ_ONCE()/WRITE_ONCE() to address the race that occurs when accessing
a FAT32 entry.
Reported-by: syzbot+d3c29ed63db6ddf8406e@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d3c29ed63db6ddf8406e
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
---
V1 -> V2: use READ_ONCE()/WRITE_ONCE() to fix the race
fs/fat/fatent.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/fat/fatent.c b/fs/fat/fatent.c
index 1db348f8f887..a9eecd090d91 100644
--- a/fs/fat/fatent.c
+++ b/fs/fat/fatent.c
@@ -146,8 +146,8 @@ static int fat16_ent_get(struct fat_entry *fatent)
static int fat32_ent_get(struct fat_entry *fatent)
{
- int next = le32_to_cpu(*fatent->u.ent32_p) & 0x0fffffff;
- WARN_ON((unsigned long)fatent->u.ent32_p & (4 - 1));
+ int next = le32_to_cpu(READ_ONCE(*fatent->u.ent32_p)) & 0x0fffffff;
+ WARN_ON((unsigned long)READ_ONCE(fatent->u.ent32_p) & (4 - 1));
if (next >= BAD_FAT32)
next = FAT_ENT_EOF;
return next;
@@ -187,8 +187,8 @@ static void fat16_ent_put(struct fat_entry *fatent, int new)
static void fat32_ent_put(struct fat_entry *fatent, int new)
{
WARN_ON(new & 0xf0000000);
- new |= le32_to_cpu(*fatent->u.ent32_p) & ~0x0fffffff;
- *fatent->u.ent32_p = cpu_to_le32(new);
+ new |= le32_to_cpu(READ_ONCE(*fatent->u.ent32_p)) & ~0x0fffffff;
+ WRITE_ONCE(*fatent->u.ent32_p, cpu_to_le32(new));
mark_buffer_dirty_inode(fatent->bhs[0], fatent->fat_inode);
}
--
2.43.0
On Tue, Jul 29, 2025 at 02:17:10PM +0800, Edward Adam Davis wrote:
> syzbot reports data-race in fat32_ent_get/fat32_ent_put.
>
> CPU0(Task A)               CPU1(Task B)
> ====                       ====
> vfs_write
> new_sync_write
> generic_file_write_iter
> fat_write_begin
> block_write_begin          vfs_statfs
> fat_get_block              statfs_by_dentry
> fat_add_cluster            fat_statfs
> fat_ent_write              fat_count_free_clusters
> fat32_ent_put              fat32_ent_get
>
> Task A's write operation on CPU0 and Task B's read operation on CPU1 occur
> simultaneously, generating an race condition.
>
> Add READ/WRITE_ONCE to solve the race condition that occurs when accessing
> FAT32 entry.

Solve it in which sense? fat32_ent_get() and fat32_ent_put() are already
atomic wrt each other; neither this nor your previous variant change
anything whatsoever.

And if you are talking about the results of *multiple* fat32_ent_get(),
with some assumptions made by fat_count_free_clusters() that somehow get
screwed by the modifications from fat_add_cluster(), your patch does not
prevent any of that (not that you explained what kind of assumptions
those would be).

Long story short - accesses to individual entries are already atomic wrt
each other; the fact that they happen simultaneously _might_ be a symptom
of insufficient serialization, but neither version of your patch resolves
that in any way - it just prevents the tool from reporting its suspicions.
It does not give fat_count_free_clusters() a stable state of the entire
table, assuming it needs one. It might, at that - I hadn't looked into
that code since way back. But unless I'm missing something, the only thing
your patch does is making your (rather blunt) tool STFU.

If there is a race, explain what sequence of events leads to incorrect
behaviour and explain why your proposed change prevents that incorrect
behaviour.
Note that if that behaviour is "amount of free space reported by statfs(2) depends upon how far did the ongoing write(2) get", it is *not* incorrect - that's exactly what the userland has asked for. If it's "statfs(2) gets confused into reporting an amount of free space that wouldn't have been accurate for any moment of time (or, worse yet, crashes, etc.)" - yes, that would be a problem, but it could not be solved by preventing simultaneous access to *single* entries, if it happens at all.
On Tue, Jul 29, 2025 at 10:35:58AM +0100, Al Viro wrote:
> On Tue, Jul 29, 2025 at 02:17:10PM +0800, Edward Adam Davis wrote:
> > syzbot reports data-race in fat32_ent_get/fat32_ent_put.
> >
> > CPU0(Task A)               CPU1(Task B)
> > ====                       ====
> > vfs_write
> > new_sync_write
> > generic_file_write_iter
> > fat_write_begin
> > block_write_begin          vfs_statfs
> > fat_get_block              statfs_by_dentry
> > fat_add_cluster            fat_statfs

Sorry, no - you've missed an intermediate fat_chain_add() in the call
chain here. And that makes your race a non-issue.

> > fat_ent_write              fat_count_free_clusters
> > fat32_ent_put              fat32_ent_get

fat_count_free_clusters() doesn't care about the exact value of an entry;
the only thing that matters is whether it's equal to FAT_ENT_FREE. Actual
changes of that predicate (i.e. allocating and freeing of clusters) still
happen under fat_lock() - nothing has changed in that area. *That* is not
happening simultaneously with reads in fat_count_free_clusters().

It's attaching the new element to the end of the chain that is outside of
fat_lock(). And that operation does not affect that predicate at all; it
changes the value of the entry for the last cluster of the file
(FAT_ENT_EOF) to the number of the cluster being added to the file.
Neither is equal to FAT_ENT_FREE, so there's no problem - the value you
get from ->ent_get() is affected by that store, the value of
ops->ent_get(&fatent) == FAT_ENT_FREE isn't.

Probably worth a comment in fat_chain_add() re "this store is safe outside
of fat_lock() because of <list of reasons>" - changes to this chain (both
extending and truncating it) are excluded by <lock> and, as far as the
allocator is concerned, we are not changing the state of any cluster.

Basically, FAT encodes both the free block map (entry[block] ==
FAT_ENT_FREE <=> block is not in use) and linked lists of blocks for
individual files - the number of a file's first block is stored in its
directory entry and FAT holds the forward pointers of the list.
FAT_ENT_EOF is used for "no more blocks, it's the end of file". The
allocator and fat_count_free_clusters care only about the free block map
part of that thing; access to file contents uses the linked list for that
file.
On Tue, 29 Jul 2025 11:06:35 +0100, Al Viro wrote:

syzbot reports data-race in fat32_ent_get/fat32_ent_put.

CPU0(Task A)               CPU1(Task B)
====                       ====
vfs_write
new_sync_write
generic_file_write_iter
fat_write_begin
block_write_begin
fat_get_block              vfs_statfs
fat_add_cluster            statfs_by_dentry
fat_chain_add              fat_statfs
fat_ent_write              fat_count_free_clusters
fat32_ent_put              fat32_ent_get

In fat_add_cluster(), fat_alloc_clusters() retrieves a free cluster and
marks its entry with the value FAT_ENT_EOF, protected by lock_fat().
There is no lock protection in fat_chain_add(). When fat_ent_write()
writes the last entry to new_dclus, this has no effect on
fat_count_free_clusters() in the statfs operation: the last entry is not
FAT_ENT_FREE either before or after the write.

Therefore, the race condition reported here is invalid.

BR,
Edward
Edward Adam Davis <eadavis@qq.com> writes:

> syzbot reports data-race in fat32_ent_get/fat32_ent_put.
>
> CPU0(Task A)               CPU1(Task B)
> ====                       ====
> vfs_write
> new_sync_write
> generic_file_write_iter
> fat_write_begin
> block_write_begin          vfs_statfs
> fat_get_block              statfs_by_dentry
> fat_add_cluster            fat_statfs
> fat_ent_write              fat_count_free_clusters
> fat32_ent_put              fat32_ent_get
>
> Task A's write operation on CPU0 and Task B's read operation on CPU1 occur
> simultaneously, generating an race condition.
>
> Add READ/WRITE_ONCE to solve the race condition that occurs when accessing
> FAT32 entry.

I mentioned READ/WRITE_ONCE though, it was about possibility. Do we
really need to add those? Can you clarify more?

> diff --git a/fs/fat/fatent.c b/fs/fat/fatent.c
> index 1db348f8f887..a9eecd090d91 100644
> --- a/fs/fat/fatent.c
> +++ b/fs/fat/fatent.c
> @@ -146,8 +146,8 @@ static int fat16_ent_get(struct fat_entry *fatent)
>
>  static int fat32_ent_get(struct fat_entry *fatent)
>  {
> -	int next = le32_to_cpu(*fatent->u.ent32_p) & 0x0fffffff;
> -	WARN_ON((unsigned long)fatent->u.ent32_p & (4 - 1));
> +	int next = le32_to_cpu(READ_ONCE(*fatent->u.ent32_p)) & 0x0fffffff;
> +	WARN_ON((unsigned long)READ_ONCE(fatent->u.ent32_p) & (4 - 1));

Adding READ_ONCE() for pointer is broken.

>  	if (next >= BAD_FAT32)
>  		next = FAT_ENT_EOF;
>  	return next;
> @@ -187,8 +187,8 @@ static void fat16_ent_put(struct fat_entry *fatent, int new)
>  static void fat32_ent_put(struct fat_entry *fatent, int new)
>  {
>  	WARN_ON(new & 0xf0000000);
> -	new |= le32_to_cpu(*fatent->u.ent32_p) & ~0x0fffffff;
> -	*fatent->u.ent32_p = cpu_to_le32(new);
> +	new |= le32_to_cpu(READ_ONCE(*fatent->u.ent32_p)) & ~0x0fffffff;
> +	WRITE_ONCE(*fatent->u.ent32_p, cpu_to_le32(new));
>  	mark_buffer_dirty_inode(fatent->bhs[0], fatent->fat_inode);
>  }

Also, is READ_ONCE() really required here?
--
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>