[PATCH] f2fs: avoid cp_wait use-after-free in f2fs_write_end_io()

Wenjie Qi posted 1 patch 2 weeks ago
There is a newer version of this series
fs/f2fs/data.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
[PATCH] f2fs: avoid cp_wait use-after-free in f2fs_write_end_io()
Posted by Wenjie Qi 2 weeks ago
f2fs_write_end_io() currently decrements the writeback page counter before
waking sbi->cp_wait for the last F2FS_WB_CP_DATA completion.

That decrement can drop the F2FS_WB_CP_DATA count to zero. It can unblock
a concurrent unmount path waiting in f2fs_wait_on_all_pages(). Unmount can
continue through f2fs_put_super() and eventually free sbi while the end_io
callback is still about to evaluate wq_has_sleeper() and wake_up() on
sbi->cp_wait.

Commit 2d9c4a4ed4ee ("f2fs: fix UAF caused by decrementing sbi->nr_pages[]
in f2fs_write_end_io()") fixed one post-decrement sbi access by moving the
warm-node-list handling before dec_page_count(). The compressed writeback
path follows the same rule and documents that dec_page_count() must be the
last access to sbi when it can drop F2FS_WB_CP_DATA to zero.

Apply the same ordering rule to the cp_wait wakeup. Check whether this is
the last F2FS_WB_CP_DATA completion and wake the waiter before the counter
decrement. Then the callback no longer dereferences sbi->cp_wait after the
lifetime boundary. A waiter that runs before the decrement may observe old
count and sleep until the one-jiffy timeout, but correctness no longer
depends on touching sbi after the counter reaches zero.

Fixes: ce2739e482bc ("f2fs: fix to avoid UAF in f2fs_write_end_io()")
Cc: stable@kernel.org
Signed-off-by: Wenjie Qi <qiwenjie@xiaomi.com>
---
 fs/f2fs/data.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index d83a21998ec2..b1e9fb5ca159 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -392,16 +392,16 @@ static void f2fs_write_end_io(struct bio *bio)
 		if (f2fs_in_warm_node_list(folio))
 			f2fs_del_fsync_node_entry(sbi, folio);
 
-		dec_page_count(sbi, type);
-
 		/*
-		 * we should access sbi before folio_end_writeback() to
-		 * avoid racing w/ kill_f2fs_super()
+		 * Access sbi before dec_page_count() and folio_end_writeback()
+		 * to avoid racing w/ kill_f2fs_super().
 		 */
-		if (type == F2FS_WB_CP_DATA && !get_pages(sbi, type) &&
-				wq_has_sleeper(&sbi->cp_wait))
+		if (type == F2FS_WB_CP_DATA && get_pages(sbi, type) == 1 &&
+		    wq_has_sleeper(&sbi->cp_wait))
 			wake_up(&sbi->cp_wait);
 
+		dec_page_count(sbi, type);
+
 		folio_clear_f2fs_gcing(folio);
 		folio_end_writeback(folio);
 	}
-- 
2.43.0
Re: [PATCH] f2fs: avoid cp_wait use-after-free in f2fs_write_end_io()
Posted by Chao Yu 2 weeks ago
On 5/25/26 13:30, Wenjie Qi wrote:
> f2fs_write_end_io() currently decrements the writeback page counter before
> waking sbi->cp_wait for the last F2FS_WB_CP_DATA completion.
> 
> That decrement can drop the F2FS_WB_CP_DATA count to zero. It can unblock
> a concurrent unmount path waiting in f2fs_wait_on_all_pages(). Unmount can
> continue through f2fs_put_super() and eventually free sbi while the end_io
> callback is still about to evaluate wq_has_sleeper() and wake_up() on
> sbi->cp_wait.
> 
> Commit 2d9c4a4ed4ee ("f2fs: fix UAF caused by decrementing sbi->nr_pages[]
> in f2fs_write_end_io()") fixed one post-decrement sbi access by moving the
> warm-node-list handling before dec_page_count(). The compressed writeback
> path follows the same rule and documents that dec_page_count() must be the
> last access to sbi when it can drop F2FS_WB_CP_DATA to zero.
> 
> Apply the same ordering rule to the cp_wait wakeup. Check whether this is
> the last F2FS_WB_CP_DATA completion and wake the waiter before the counter
> decrement. Then the callback no longer dereferences sbi->cp_wait after the
> lifetime boundary. A waiter that runs before the decrement may observe old
> count and sleep until the one-jiffy timeout, but correctness no longer
> depends on touching sbi after the counter reaches zero.
> 
> Fixes: ce2739e482bc ("f2fs: fix to avoid UAF in f2fs_write_end_io()")
> Cc: stable@kernel.org
> Signed-off-by: Wenjie Qi <qiwenjie@xiaomi.com>
> ---
>   fs/f2fs/data.c | 12 ++++++------
>   1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
> index d83a21998ec2..b1e9fb5ca159 100644
> --- a/fs/f2fs/data.c
> +++ b/fs/f2fs/data.c
> @@ -392,16 +392,16 @@ static void f2fs_write_end_io(struct bio *bio)
>   		if (f2fs_in_warm_node_list(folio))
>   			f2fs_del_fsync_node_entry(sbi, folio);
>   
> -		dec_page_count(sbi, type);
> -
>   		/*
> -		 * we should access sbi before folio_end_writeback() to
> -		 * avoid racing w/ kill_f2fs_super()
> +		 * Access sbi before dec_page_count() and folio_end_writeback()
> +		 * to avoid racing w/ kill_f2fs_super().
>   		 */
> -		if (type == F2FS_WB_CP_DATA && !get_pages(sbi, type) &&
> -				wq_has_sleeper(&sbi->cp_wait))
> +		if (type == F2FS_WB_CP_DATA && get_pages(sbi, type) == 1 &&
> +		    wq_has_sleeper(&sbi->cp_wait))
>   			wake_up(&sbi->cp_wait);

If we call dec_page_count() after wake_up(), get_pages() in below function
may return true, and then ckpt thread may need to wait on cp_wait for another
DEFAULT_SCHEDULE_TIMEOUT?

void f2fs_wait_on_all_pages(struct f2fs_sb_info *sbi, int type)
{
	DEFINE_WAIT(wait);

	for (;;) {
		if (!get_pages(sbi, type))
			break;

		if (unlikely(f2fs_cp_error(sbi) &&
			!is_sbi_flag_set(sbi, SBI_IS_CLOSE)))
			break;

		if (type == F2FS_DIRTY_META)
			f2fs_sync_meta_pages(sbi, LONG_MAX, FS_CP_META_IO);
		else if (type == F2FS_WB_CP_DATA)
			f2fs_submit_merged_write(sbi, DATA);

		prepare_to_wait(&sbi->cp_wait, &wait, TASK_UNINTERRUPTIBLE);
		io_schedule_timeout(DEFAULT_SCHEDULE_TIMEOUT);
	}
	finish_wait(&sbi->cp_wait, &wait);
}

How about:

static inline int dec_return_page_count(struct f2fs_sb_info *sbi, int count_type)
{
	return atomic_dec_return(&sbi->nr_pages[count_type]);
}

f2fs_write_end_io()
{
...
	bool need_wakeup = false;

...

	if (type == F2FS_WB_CP_DATA)
		need_wakeup = !dec_return_page_count(sbi, type);
	else
		dec_page_count(sbi, type);


	if (need_wakeup && wq_has_sleeper(&sbi->cp_wait))
		wake_up(&sbi->cp_wait);
...
}

Thanks,

>   
> +		dec_page_count(sbi, type);
> +
>   		folio_clear_f2fs_gcing(folio);
>   		folio_end_writeback(folio);
>   	}
Re: [PATCH] f2fs: avoid cp_wait use-after-free in f2fs_write_end_io()
Posted by Wenjie Qi 1 week, 6 days ago
  Agreed, v1 can wake cp_wait before nr_pages[F2FS_WB_CP_DATA] reaches
zero, so
  the waiter may recheck the counter, still see a non-zero value, and
sleep until
  DEFAULT_SCHEDULE_TIMEOUT.

  I sent v2 to address this:
  https://lore.kernel.org/r/20260526034439.1017521-1-qiwenjie@xiaomi.com

  In v2, I used atomic_dec_and_lock_irqsave() for the F2FS_WB_CP_DATA zero
  transition and wake waiters while holding cp_wait.lock.
f2fs_wait_on_all_pages()
  also prepares the wait entry and rechecks get_pages() under the same
lock before
  sleeping.

  This should avoid the missed wakeup you pointed out, while also avoiding an
  unprotected post-zero access to sbi->cp_wait in f2fs_write_end_io().