From nobody Mon Jun 8 16:28:21 2026 Received: from mail-pg1-f195.google.com (mail-pg1-f195.google.com [209.85.215.195]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7848312CD8B for ; Thu, 28 May 2026 06:56:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.215.195 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779951371; cv=none; b=j4Vmc7Kmbee5cNC+F1eDMC4QusolQNf8vkB7EBxCFc2w5R9i47aKr/lScD5LY0OOjHm5nuwpi6oj1OO5/a7ovSpJa35ZkZvwVOMeabVHx9wQcDDaYXILgOvcZ6J6QGfPf+G84W+/PzB42S5kuRL4HWUirlA9xviYmqN1mOpuAnc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1779951371; c=relaxed/simple; bh=NxGaUXRJCtK/y8dO7HP/Cuj5X46MQL/v/3x3X8LsbTU=; h=From:To:Cc:Subject:Date:Message-Id:MIME-Version; b=ebKVgtxyv9O2pyCgI3JY33D2LqDAhKvxP3MK8oXBmcYhYazKG9n1oyr/GZcNeIT4ilrA+BR1jGu3ThvfKI6hY9Sw4B8QihoDOZM1qSnnfCuW9264PKyDKiwyBEmWfvjLJ7ruByP4n3k0FK11BPFeWNGg+Nh5kco6RNSUpp7JHfM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com; spf=pass smtp.mailfrom=gmail.com; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b=qNJAYBCt; arc=none smtp.client-ip=209.85.215.195 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=gmail.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="qNJAYBCt" Received: by mail-pg1-f195.google.com with SMTP id 41be03b00d2f7-c70c112cb61so9667206a12.0 for ; Wed, 27 May 2026 23:56:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1779951369; x=1780556169; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:message-id:date:subject:cc 
:to:from:from:to:cc:subject:date:message-id:reply-to; bh=hv5bxsNiGg323Ed+whtFPX08umiutM48iIq29EIeM28=; b=qNJAYBCtD/2X3f70eJocnzJuxjTgEcdHLvIuu0F0mBnFcU1uCey/PUqffk//w68ugI d6d9icS+cIQnF4O6V6dYnM2vdinUGry+Dzxc1lWvIEq1uzymOqhv+RkuwSckkOgbJW1a g2X9hwhi2jz/fvbKdJPe/JPGoD5oKBLwtEajDx5jSA3t/7xdU/7xMgKhikIIKNbWDHnN ZnqJBjbsmpocV765zNAy7fEnyCPG09dt7pAWDd4ZvJ50fBwN4EOVk17M4LLcpR7u+mCV ccPRi9+b/1znPWL78SCYiiXMpDAwUVfYnOHjs2cRPKC5odcgw/eLCP5BBXEhcUnRze53 KlxQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1779951369; x=1780556169; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=hv5bxsNiGg323Ed+whtFPX08umiutM48iIq29EIeM28=; b=MjnrrvZ+gx2syBT0bn94XQC0XiFKlKxejJjIkXcMDbFiDPc0QvL0WJ0F5nM9f74Xp1 Zph913NPBusUD9O1SFISfE+bxtgJLu5Np4IvIL1SHVIZefnxVq4BD/lglnSzUahXW3TL nGGQW8KPL4/fSC++qpJuxGqd54PLm6YYeM9otmiVR4BmmY0DDfHEILBFtKrG0W40NDiP /TIpawXl2uo5GugGJv3nGwKErmzviwx+JEoPN8z2pEk+YnvdP6MNqEDTVeuonSRyDG+z JVBqW4M37QFS/QUST1UMfXpKIwFDkG3KjOEnmcZRsSxo0gwx5bAnI93fVLtsxBdwuo7N xf4w== X-Forwarded-Encrypted: i=1; AFNElJ/7U8ngW+BHFn8oHuLznwyKpiOntPnNWVnjRruA0rvmLF+ui78q1qr9uRL18xW9KtHuud2vW3Ib7Nqz0k0=@vger.kernel.org X-Gm-Message-State: AOJu0Yzv1+oZ+nkDbIPnFVi26bvGRQrUUM0kI/lClY99u+ZBv3vpZtSL wGD6rg65xABgXIOUw6nZCVTTdgyx9QwbNS2Nmg1DsUqZuPnYGaVXcUV7 X-Gm-Gg: Acq92OE5/Ts33hON5oNtjNKVYcin/wrgPodyQhDnIXAJI4SYTVCMtzisXlQDCcazLtv 4uEreTqyly/7uRooXYMRs4XFPZueUlb7RRj7OC5ThxlqeXO6f6CW7HajYI9d2ywi8TQv1pJpxuP Ltn6F1+hurwQBCB+Q+utm897BE8GxMxFM4twq/aLNIv2PrnLSLsEsAEsUKKC8To/t7klkTPHKYP b3Snk/qydJ4WQOyYqpIgvrCPtaIQoMMEhaT18/s6nZ6C2B3qaCIOZ++hpMh5ELp2BPtiU8hJmoj gsKsJFf51kFMhusFwPIgTy0y0ovbzpKDYZFJrdWGv5oRTH3D1YWExYvLtO4UIU4ReBAg2C3sG55 h9UaOu0/7PRdVqoZbo1mnvi2zlF4fok+ZrhNVazOAWXDYdxcXs8iwAKdUbwyJHVRX3FsIzHBuwW jGwPnYd1vL1t1KTvOrTPkImLlnFKr1zBbR3B6hJlBiOw5wrc3DLX8V X-Received: by 2002:a05:6a21:9f07:b0:3a2:f2b1:1ac with SMTP id 
        adf61e73a8af0-3b328ec89dcmr28275315637.44.1779951368533;
        Wed, 27 May 2026 23:56:08 -0700 (PDT)
Received: from localhost ([111.228.63.84])
        by smtp.gmail.com with ESMTPSA id 41be03b00d2f7-c852054d845sm13811680a12.18.2026.05.27.23.56.05
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Wed, 27 May 2026 23:56:08 -0700 (PDT)
From: Zhang Cen
To: Jaegeuk Kim, Chao Yu
Cc: Gao Xiang, linux-f2fs-devel@lists.sourceforge.net,
        linux-kernel@vger.kernel.org, zerocling0077@gmail.com,
        2045gemini@gmail.com, Zhang Cen
Subject: [PATCH v3] f2fs: protect published gc_thread during teardown
Date: Thu, 28 May 2026 14:56:01 +0800
Message-Id: <20260528065601.3257303-1-rollkingzzc@gmail.com>
X-Mailer: git-send-email 2.34.1
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

f2fs stores the background GC worker in a heap-allocated sbi->gc_thread.
Foreground GC merge waiters, GC-thread sysfs callbacks, and
victim-selection readers can still touch waitqueues, policy knobs, or
f2fs_gc_task through that published object, but the dead-device shutdown
path can reach f2fs_stop_gc_thread() under shared s_umount coverage and
free it.

The buggy scenario involves two paths, with each column showing the
order within that path:

f2fs_balance_fs() reader:             dead-device shutdown:
  1. sees low free space and            1. enters f2fs_shutdown()
     enters GC_MERGE                    2. calls f2fs_do_shutdown(..., false)
  2. borrows sbi->gc_thread             3. calls f2fs_stop_gc_thread()
  3. queues on gc_th->fggc_wq           4. stops gc_th->f2fs_gc_task
  4. wakes gc_th->gc_wait_queue_head    5. frees gc_th
  5. sleeps in io_schedule()            6. clears sbi->gc_thread
  6. resumes in finish_wait()

Fix this by making gc_thread a properly published object.
Initialize it fully before publishing it, remove it from sbi->gc_thread
with xchg() before teardown, keep f2fs_gc_task alive until an SRCU grace
period has drained the borrowed readers, and route the waitqueue, scalar,
and sysfs users through shared get/put helpers. Recheck f2fs_gc_task
after prepare_to_wait() in f2fs_balance_fs() so teardown cannot miss an
already-queued foreground waiter.

Validation reproduced this kernel report:

KASAN slab-use-after-free in _raw_spin_lock_irqsave()
Read of size 1
Call trace:
  dump_stack_lvl() (?:?)
  print_address_description() (mm/kasan/report.c:373)
  _raw_spin_lock_irqsave() (?:?)
  print_report() (?:?)
  __virt_addr_valid() (?:?)
  srso_alias_return_thunk() (arch/x86/include/asm/nospec-branch.h:375)
  kasan_addr_to_slab() (mm/kasan/common.c:45)
  kasan_report() (?:?)
  __kasan_check_byte() (mm/kasan/common.c:571)
  lock_acquire() (?:?)
  rcu_is_watching() (?:?)
  prepare_to_wait() (kernel/sched/wait.c:248)
  f2fs_balance_fs() (fs/f2fs/segment.c:448)
  f2fs_write_single_data_page() (fs/f2fs/f2fs.h:4184)
  do_raw_spin_lock() (?:?)
  find_held_lock() (kernel/locking/lockdep.c:5340)
  folio_mkclean() (?:?)
  f2fs_write_cache_pages() (fs/f2fs/data.c:3238)
  __lock_acquire() (kernel/locking/lockdep.c:5077)
  unwind_next_frame() (?:?)
  arch_stack_walk() (?:?)
  __f2fs_write_data_pages() (fs/f2fs/data.c:3565)
  validate_chain() (kernel/locking/lockdep.c:3861)
  insert_work() (kernel/workqueue.c:2220)
  do_writepages() (mm/page-writeback.c:2560)
  filemap_writeback() (mm/filemap.c:371)
  __queue_work() (kernel/workqueue.c:2275)
  file_write_and_wait_range() (?:?)
  f2fs_do_sync_file() (fs/f2fs/file.c:285)
  __lock_release() (kernel/locking/lockdep.c:5511)
  f2fs_sync_file() (fs/f2fs/f2fs.h:3781)
  do_fsync() (fs/sync.c:204)
  __x64_sys_fsync() (?:?)
  do_syscall_64() (arch/x86/entry/syscall_64.c:87)
  entry_SYSCALL_64_after_hwframe() (?:?)
  kasan_save_stack() (mm/kasan/common.c:52)
  kasan_save_track() (mm/kasan/common.c:74)
  __kasan_kmalloc() (?:?)
  f2fs_start_gc_thread() (fs/f2fs/f2fs.h:4203)
  f2fs_fill_super() (fs/f2fs/super.c:4993)

Fixes: 5911d2d1d1a3 ("f2fs: introduce gc_merge mount option")
Signed-off-by: Zhang Cen
---
v3:
- Add the Fixes tag for the GC_MERGE foreground wait path.
- Add comments for the gc_thread release/acquire publication pair.
- Fix checkpatch alignment issues and avoid misleading indentation in the
  critical_task_priority sysfs update path.

v2:
- Sashiko.dev pointed out that sysfs critical_task_priority could use a
  stale f2fs_gc_task.
- Sashiko.dev pointed out that GC_MERGE foreground waiters could miss
  teardown wakeups.
- Sashiko.dev pointed out that holding gc_thread_lock across
  kthread_run() could deadlock with reclaim.
- Sashiko.dev pointed out that gc_urgent and gc_idle still touched GC
  state without safe lifetime protection.
- Reset gc_thread_srcu_inited per f2fs_fill_super() attempt so cleanup
  only covers the current SRCU state.

 fs/f2fs/f2fs.h    |  30 +++++++++-
 fs/f2fs/gc.c      |  78 ++++++++++++++++--------
 fs/f2fs/gc.h      |   8 ++-
 fs/f2fs/segment.c |  34 +++++++----
 fs/f2fs/super.c   |  11 ++++
 fs/f2fs/sysfs.c   | 148 ++++++++++++++++++++++++++++++++++------------
 6 files changed, 233 insertions(+), 76 deletions(-)

diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 91f506e7c9cfb..719c94119b71c 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -25,6 +25,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -156,6 +157,8 @@ typedef u32 block_t;	/*
 			 */
 typedef u32 nid_t;
 
+struct f2fs_gc_kthread;
+
 #define COMPRESS_EXT_NUM		16
 
 enum blkzone_allocation_policy {
@@ -1873,7 +1876,8 @@ struct f2fs_sb_info {
 						 * semaphore for GC, avoid
 						 * race between GC and GC or CP
 						 */
-	struct f2fs_gc_kthread	*gc_thread;	/* GC thread */
+	struct f2fs_gc_kthread	*gc_thread;	/* published GC thread */
+	struct srcu_struct gc_thread_srcu;	/* protects gc_thread readers */
 	struct atgc_management am;		/* atgc management */
 	unsigned int cur_victim_sec;		/* current victim section num */
 	unsigned int gc_mode;			/* current GC state */
@@ -4200,6 +4204,30 @@ extern const struct iomap_ops f2fs_iomap_ops;
 /*
  * gc.c
  */
+static inline struct f2fs_gc_kthread *
+f2fs_get_gc_thread(struct f2fs_sb_info *sbi, int *srcu_idx)
+{
+	struct f2fs_gc_kthread *gc_th;
+
+	*srcu_idx = srcu_read_lock(&sbi->gc_thread_srcu);
+	/*
+	 * Pairs with the release store in f2fs_start_gc_thread() so readers
+	 * see a fully initialized worker after observing sbi->gc_thread.
+	 */
+	gc_th = smp_load_acquire(&sbi->gc_thread);
+	if (!gc_th) {
+		srcu_read_unlock(&sbi->gc_thread_srcu, *srcu_idx);
+		*srcu_idx = -1;
+	}
+
+	return gc_th;
+}
+
+static inline void f2fs_put_gc_thread(struct f2fs_sb_info *sbi, int srcu_idx)
+{
+	srcu_read_unlock(&sbi->gc_thread_srcu, srcu_idx);
+}
+
 int f2fs_start_gc_thread(struct f2fs_sb_info *sbi);
 void f2fs_stop_gc_thread(struct f2fs_sb_info *sbi);
 block_t f2fs_start_bidx_of_node(unsigned int node_ofs, struct inode *inode);
diff --git a/fs/f2fs/gc.c b/fs/f2fs/gc.c
index ba93010924c06..c38dc5e4f66de 100644
--- a/fs/f2fs/gc.c
+++ b/fs/f2fs/gc.c
@@ -30,10 +30,10 @@ static unsigned int count_bits(const unsigned long *addr,
 
 static int gc_thread_func(void *data)
 {
-	struct f2fs_sb_info *sbi = data;
-	struct f2fs_gc_kthread *gc_th = sbi->gc_thread;
-	wait_queue_head_t *wq = &sbi->gc_thread->gc_wait_queue_head;
-	wait_queue_head_t *fggc_wq = &sbi->gc_thread->fggc_wq;
+	struct f2fs_gc_kthread *gc_th = data;
+	struct f2fs_sb_info *sbi = gc_th->sbi;
+	wait_queue_head_t *wq = &gc_th->gc_wait_queue_head;
+	wait_queue_head_t *fggc_wq = &gc_th->fggc_wq;
 	unsigned int wait_ms;
 	struct f2fs_gc_control gc_control = {
 		.victim_segno = NULL_SEGNO,
@@ -134,7 +134,7 @@ static int gc_thread_func(void *data)
 				wait_ms = gc_th->max_sleep_time;
 		}
 
-		if (need_to_boost_gc(sbi)) {
+		if (need_to_boost_gc(sbi, gc_th)) {
 			decrease_sleep_time(gc_th, &wait_ms);
 			if (f2fs_sb_has_blkzoned(sbi))
 				gc_boost = true;
@@ -194,6 +194,7 @@ static int gc_thread_func(void *data)
 int f2fs_start_gc_thread(struct f2fs_sb_info *sbi)
 {
 	struct f2fs_gc_kthread *gc_th;
+	struct task_struct *task;
 	dev_t dev = sbi->sb->s_bdev->bd_dev;
 
 	gc_th = f2fs_kmalloc(sbi, sizeof(struct f2fs_gc_kthread), GFP_KERNEL);
@@ -219,36 +220,45 @@ int f2fs_start_gc_thread(struct f2fs_sb_info *sbi)
 		gc_th->boost_zoned_gc_percent = 0;
 	}
 
+	gc_th->sbi = sbi;
 	gc_th->gc_wake = false;
+	init_waitqueue_head(&gc_th->gc_wait_queue_head);
+	init_waitqueue_head(&gc_th->fggc_wq);
 
-	sbi->gc_thread = gc_th;
-	init_waitqueue_head(&sbi->gc_thread->gc_wait_queue_head);
-	init_waitqueue_head(&sbi->gc_thread->fggc_wq);
-	sbi->gc_thread->f2fs_gc_task = kthread_run(gc_thread_func, sbi,
-			"f2fs_gc-%u:%u", MAJOR(dev), MINOR(dev));
-	if (IS_ERR(gc_th->f2fs_gc_task)) {
-		int err = PTR_ERR(gc_th->f2fs_gc_task);
+	task = kthread_run(gc_thread_func, gc_th, "f2fs_gc-%u:%u",
+			   MAJOR(dev), MINOR(dev));
+	if (IS_ERR(task)) {
+		int err = PTR_ERR(task);
 
 		kfree(gc_th);
-		sbi->gc_thread = NULL;
 		return err;
 	}
 
-	set_user_nice(gc_th->f2fs_gc_task,
-			PRIO_TO_NICE(sbi->critical_task_priority));
+	get_task_struct(task);
+	WRITE_ONCE(gc_th->f2fs_gc_task, task);
+	set_user_nice(task, PRIO_TO_NICE(sbi->critical_task_priority));
+	/* Publish only after gc_th and its task pointer are fully initialized. */
+	smp_store_release(&sbi->gc_thread, gc_th);
 	return 0;
 }
 
 void f2fs_stop_gc_thread(struct f2fs_sb_info *sbi)
 {
-	struct f2fs_gc_kthread *gc_th = sbi->gc_thread;
+	struct f2fs_gc_kthread *gc_th;
+	struct task_struct *task;
 
+	gc_th = xchg(&sbi->gc_thread, NULL);
 	if (!gc_th)
 		return;
-	kthread_stop(gc_th->f2fs_gc_task);
+
+	task = xchg(&gc_th->f2fs_gc_task, NULL);
+	if (task)
+		kthread_stop(task);
 	wake_up_all(&gc_th->fggc_wq);
+	synchronize_srcu(&sbi->gc_thread_srcu);
+	if (task)
+		put_task_struct(task);
 	kfree(gc_th);
-	sbi->gc_thread = NULL;
 }
 
 static int select_gc_type(struct f2fs_sb_info *sbi, int gc_type)
@@ -795,8 +805,16 @@ int f2fs_get_victim(struct f2fs_sb_info *sbi, unsigned int *result,
 	p.age_threshold = sbi->am.age_threshold;
 	if (one_time) {
 		p.one_time_gc = one_time;
-		if (has_enough_free_secs(sbi, 0, NR_PERSISTENT_LOG))
-			valid_thresh_ratio = sbi->gc_thread->valid_thresh_ratio;
+		if (has_enough_free_secs(sbi, 0, NR_PERSISTENT_LOG)) {
+			struct f2fs_gc_kthread *gc_th;
+			int srcu_idx;
+
+			gc_th = f2fs_get_gc_thread(sbi, &srcu_idx);
+			if (gc_th) {
+				valid_thresh_ratio = gc_th->valid_thresh_ratio;
+				f2fs_put_gc_thread(sbi, srcu_idx);
+			}
+		}
 	}
 
 retry:
@@ -1776,11 +1794,21 @@ static int do_garbage_collect(struct f2fs_sb_info *sbi,
 		unsigned int window_granularity =
 			sbi->migration_window_granularity;
 
-		if (f2fs_sb_has_blkzoned(sbi) &&
-			!has_enough_free_blocks(sbi,
-			sbi->gc_thread->boost_zoned_gc_percent))
-			window_granularity *=
-				sbi->gc_thread->boost_gc_multiple;
+		if (f2fs_sb_has_blkzoned(sbi)) {
+			struct f2fs_gc_kthread *gc_th;
+			int srcu_idx;
+
+			gc_th = f2fs_get_gc_thread(sbi, &srcu_idx);
+			if (gc_th) {
+				unsigned int boost_percent =
+					gc_th->boost_zoned_gc_percent;
+
+				if (!has_enough_free_blocks(sbi, boost_percent))
+					window_granularity *=
+						gc_th->boost_gc_multiple;
+				f2fs_put_gc_thread(sbi, srcu_idx);
+			}
+		}
 
 		end_segno = start_segno + window_granularity;
 	}
diff --git a/fs/f2fs/gc.h b/fs/f2fs/gc.h
index 6c4d4567571eb..b32d57a70f760 100644
--- a/fs/f2fs/gc.h
+++ b/fs/f2fs/gc.h
@@ -45,7 +45,10 @@
 
 #define NR_GC_CHECKPOINT_SECS (3) /* data/node/dentry sections */
 
+struct f2fs_sb_info;
+
 struct f2fs_gc_kthread {
+	struct f2fs_sb_info *sbi;
 	struct task_struct *f2fs_gc_task;
 	wait_queue_head_t gc_wait_queue_head;
 
@@ -193,10 +196,11 @@ static inline bool has_enough_invalid_blocks(struct f2fs_sb_info *sbi)
 			limit_free_user_blocks(invalid_user_blocks));
 }
 
-static inline bool need_to_boost_gc(struct f2fs_sb_info *sbi)
+static inline bool need_to_boost_gc(struct f2fs_sb_info *sbi,
+				    struct f2fs_gc_kthread *gc_th)
 {
 	if (f2fs_sb_has_blkzoned(sbi))
 		return !has_enough_free_blocks(sbi,
-				sbi->gc_thread->boost_zoned_gc_percent);
+				gc_th->boost_zoned_gc_percent);
 	return has_enough_invalid_blocks(sbi);
 }
diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index 788f8b0502492..cd8d5084c3f31 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -444,16 +444,30 @@ void f2fs_balance_fs(struct f2fs_sb_info *sbi, bool need)
 	if (has_enough_free_secs(sbi, 0, 0))
 		return;
 
-	if (test_opt(sbi, GC_MERGE) && sbi->gc_thread &&
-				sbi->gc_thread->f2fs_gc_task) {
-		DEFINE_WAIT(wait);
-
-		prepare_to_wait(&sbi->gc_thread->fggc_wq, &wait,
-					TASK_UNINTERRUPTIBLE);
-		wake_up(&sbi->gc_thread->gc_wait_queue_head);
-		io_schedule();
-		finish_wait(&sbi->gc_thread->fggc_wq, &wait);
-	} else {
+	if (test_opt(sbi, GC_MERGE)) {
+		struct f2fs_gc_kthread *gc_th;
+		int srcu_idx;
+
+		gc_th = f2fs_get_gc_thread(sbi, &srcu_idx);
+		if (gc_th) {
+			if (READ_ONCE(gc_th->f2fs_gc_task)) {
+				DEFINE_WAIT(wait);
+
+				prepare_to_wait(&gc_th->fggc_wq, &wait,
+						TASK_UNINTERRUPTIBLE);
+				if (READ_ONCE(gc_th->f2fs_gc_task)) {
+					wake_up(&gc_th->gc_wait_queue_head);
+					io_schedule();
+				}
+				finish_wait(&gc_th->fggc_wq, &wait);
+				f2fs_put_gc_thread(sbi, srcu_idx);
+				return;
+			}
+			f2fs_put_gc_thread(sbi, srcu_idx);
+		}
+	}
+
+	{
 		struct f2fs_gc_control gc_control = {
 			.victim_segno = NULL_SEGNO,
 			.init_gc_type = f2fs_sb_has_blkzoned(sbi) ?
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index ccf806b676f53..a1c3784656fb2 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -4954,11 +4954,13 @@ static int f2fs_fill_super(struct super_block *sb, struct fs_context *fc)
 	int recovery, i, valid_super_block;
 	struct curseg_info *seg_i;
 	int retry_cnt = 1;
+	bool gc_thread_srcu_inited = false;
 #ifdef CONFIG_QUOTA
 	bool quota_enabled = false;
 #endif
 
 try_onemore:
+	gc_thread_srcu_inited = false;
 	err = -EINVAL;
 	raw_super = NULL;
 	valid_super_block = -1;
@@ -4993,6 +4995,10 @@ static int f2fs_fill_super(struct super_block *sb, struct fs_context *fc)
 		spin_lock_init(&sbi->inode_lock[i]);
 	}
 	mutex_init(&sbi->flush_lock);
+	err = init_srcu_struct(&sbi->gc_thread_srcu);
+	if (err)
+		goto free_sbi;
+	gc_thread_srcu_inited = true;
 
 	/* set a block size */
 	if (unlikely(!sb_set_blocksize(sb, F2FS_BLKSIZE))) {
@@ -5454,6 +5460,10 @@ static int f2fs_fill_super(struct super_block *sb, struct fs_context *fc)
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	lockdep_unregister_key(&sbi->cp_global_sem_key);
 #endif
+	if (gc_thread_srcu_inited) {
+		cleanup_srcu_struct(&sbi->gc_thread_srcu);
+		gc_thread_srcu_inited = false;
+	}
 	kfree(sbi);
 	sb->s_fs_info = NULL;
 
@@ -5538,6 +5548,7 @@ static void kill_f2fs_super(struct super_block *sb)
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 	lockdep_unregister_key(&sbi->cp_global_sem_key);
 #endif
+	cleanup_srcu_struct(&sbi->gc_thread_srcu);
 	kfree(sbi);
 	sb->s_fs_info = NULL;
 }
diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
index 352e96ad5c3a5..14f0d3c179789 100644
--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -375,12 +375,10 @@ static ssize_t __sbi_show_value(struct f2fs_attr *a,
 	}
 }
 
-static ssize_t f2fs_sbi_show(struct f2fs_attr *a,
-			struct f2fs_sb_info *sbi, char *buf)
+static ssize_t __f2fs_sbi_show(struct f2fs_attr *a,
+			       struct f2fs_sb_info *sbi, char *buf,
+			       unsigned char *ptr)
 {
-	unsigned char *ptr = NULL;
-
-	ptr = __struct_ptr(sbi, a->struct_type);
 	if (!ptr)
 		return -EINVAL;
 
@@ -464,6 +462,29 @@ static ssize_t f2fs_sbi_show(struct f2fs_attr *a,
 	return __sbi_show_value(a, sbi, buf, ptr + a->offset);
 }
 
+static ssize_t f2fs_sbi_show(struct f2fs_attr *a,
+			     struct f2fs_sb_info *sbi, char *buf)
+{
+	return __f2fs_sbi_show(a, sbi, buf,
+			       __struct_ptr(sbi, a->struct_type));
+}
+
+static ssize_t f2fs_gc_thread_show(struct f2fs_attr *a,
+				   struct f2fs_sb_info *sbi, char *buf)
+{
+	struct f2fs_gc_kthread *gc_th;
+	int srcu_idx;
+	ssize_t ret;
+
+	gc_th = f2fs_get_gc_thread(sbi, &srcu_idx);
+	if (!gc_th)
+		return -EINVAL;
+
+	ret = __f2fs_sbi_show(a, sbi, buf, (unsigned char *)gc_th);
+	f2fs_put_gc_thread(sbi, srcu_idx);
+	return ret;
+}
+
 static void __sbi_store_value(struct f2fs_attr *a,
 			struct f2fs_sb_info *sbi,
 			unsigned char *ui, unsigned long value)
@@ -487,16 +508,47 @@ static void __sbi_store_value(struct f2fs_attr *a,
 	}
 }
 
-static ssize_t __sbi_store(struct f2fs_attr *a,
-			struct f2fs_sb_info *sbi,
-			const char *buf, size_t count)
+static bool f2fs_wake_gc_thread(struct f2fs_sb_info *sbi)
+{
+	struct f2fs_gc_kthread *gc_th;
+	int srcu_idx;
+
+	gc_th = f2fs_get_gc_thread(sbi, &srcu_idx);
+	if (!gc_th)
+		return false;
+
+	gc_th->gc_wake = true;
+	wake_up_interruptible_all(&gc_th->gc_wait_queue_head);
+	f2fs_put_gc_thread(sbi, srcu_idx);
+	return true;
+}
+
+static void f2fs_set_gc_thread_nice(struct f2fs_sb_info *sbi, long priority)
+{
+	struct f2fs_gc_kthread *gc_th;
+	struct task_struct *gc_task;
+	int srcu_idx;
+
+	gc_th = f2fs_get_gc_thread(sbi, &srcu_idx);
+	if (!gc_th)
+		return;
+
+	gc_task = READ_ONCE(gc_th->f2fs_gc_task);
+	if (gc_task)
+		set_user_nice(gc_task, PRIO_TO_NICE(priority));
+
+	f2fs_put_gc_thread(sbi, srcu_idx);
+}
+
+static ssize_t __f2fs_sbi_store(struct f2fs_attr *a,
+				struct f2fs_sb_info *sbi,
+				unsigned char *ptr,
+				const char *buf, size_t count)
 {
-	unsigned char *ptr;
 	unsigned long t;
 	unsigned int *ui;
 	ssize_t ret;
 
-	ptr = __struct_ptr(sbi, a->struct_type);
 	if (!ptr)
 		return -EINVAL;
 
@@ -664,21 +716,13 @@ static ssize_t __sbi_store(struct f2fs_attr *a,
 			sbi->gc_mode = GC_NORMAL;
 		} else if (t == 1) {
 			sbi->gc_mode = GC_URGENT_HIGH;
-			if (sbi->gc_thread) {
-				sbi->gc_thread->gc_wake = true;
-				wake_up_interruptible_all(
-					&sbi->gc_thread->gc_wait_queue_head);
+			if (f2fs_wake_gc_thread(sbi))
 				wake_up_discard_thread(sbi, true);
-			}
 		} else if (t == 2) {
 			sbi->gc_mode = GC_URGENT_LOW;
 		} else if (t == 3) {
 			sbi->gc_mode = GC_URGENT_MID;
-			if (sbi->gc_thread) {
-				sbi->gc_thread->gc_wake = true;
-				wake_up_interruptible_all(
-					&sbi->gc_thread->gc_wait_queue_head);
-			}
+			f2fs_wake_gc_thread(sbi);
 		} else {
 			return -EINVAL;
 		}
@@ -934,14 +978,14 @@ static ssize_t __sbi_store(struct f2fs_attr *a,
 	if (!strcmp(a->attr.name, "gc_boost_gc_multiple")) {
 		if (t < 1 || t > SEGS_PER_SEC(sbi))
 			return -EINVAL;
-		sbi->gc_thread->boost_gc_multiple = (unsigned int)t;
+		*ui = (unsigned int)t;
 		return count;
 	}
 
 	if (!strcmp(a->attr.name, "gc_boost_gc_greedy")) {
 		if (t > GC_GREEDY)
 			return -EINVAL;
-		sbi->gc_thread->boost_gc_greedy = (unsigned int)t;
+		*ui = (unsigned int)t;
 		return count;
 	}
 
@@ -980,39 +1024,64 @@ static ssize_t __sbi_store(struct f2fs_attr *a,
 		return count;
 	}
 
-	if (!strcmp(a->attr.name, "critical_task_priority")) {
-		if (t < NICE_TO_PRIO(MIN_NICE) || t > NICE_TO_PRIO(MAX_NICE))
-			return -EINVAL;
-		if (!capable(CAP_SYS_NICE))
-			return -EPERM;
-		sbi->critical_task_priority = t;
-		if (sbi->cprc_info.f2fs_issue_ckpt)
-			set_user_nice(sbi->cprc_info.f2fs_issue_ckpt,
-				PRIO_TO_NICE(sbi->critical_task_priority));
-		if (sbi->gc_thread && sbi->gc_thread->f2fs_gc_task)
-			set_user_nice(sbi->gc_thread->f2fs_gc_task,
-				PRIO_TO_NICE(sbi->critical_task_priority));
-		return count;
-	}
+	if (!strcmp(a->attr.name, "critical_task_priority")) {
+		int nice;
+
+		if (t < NICE_TO_PRIO(MIN_NICE) || t > NICE_TO_PRIO(MAX_NICE))
+			return -EINVAL;
+		if (!capable(CAP_SYS_NICE))
+			return -EPERM;
+		sbi->critical_task_priority = t;
+		nice = PRIO_TO_NICE(sbi->critical_task_priority);
+		if (sbi->cprc_info.f2fs_issue_ckpt)
+			set_user_nice(sbi->cprc_info.f2fs_issue_ckpt, nice);
+		f2fs_set_gc_thread_nice(sbi, sbi->critical_task_priority);
+		return count;
+	}
 
 	__sbi_store_value(a, sbi, ptr + a->offset, t);
 
 	return count;
 }
 
+static ssize_t f2fs_gc_thread_store(struct f2fs_attr *a,
+				    struct f2fs_sb_info *sbi,
+				    const char *buf, size_t count)
+{
+	struct f2fs_gc_kthread *gc_th;
+	int srcu_idx;
+	ssize_t ret;
+
+	if (!down_read_trylock(&sbi->sb->s_umount))
+		return -EAGAIN;
+
+	gc_th = f2fs_get_gc_thread(sbi, &srcu_idx);
+	if (!gc_th) {
+		up_read(&sbi->sb->s_umount);
+		return -EINVAL;
+	}
+
+	ret = __f2fs_sbi_store(a, sbi, (unsigned char *)gc_th, buf, count);
+	f2fs_put_gc_thread(sbi, srcu_idx);
+	up_read(&sbi->sb->s_umount);
+	return ret;
+}
+
 static ssize_t f2fs_sbi_store(struct f2fs_attr *a,
 			struct f2fs_sb_info *sbi,
 			const char *buf, size_t count)
 {
 	ssize_t ret;
 	bool gc_entry = (!strcmp(a->attr.name, "gc_urgent") ||
+					!strcmp(a->attr.name, "gc_idle") ||
 					a->struct_type == GC_THREAD);
 
 	if (gc_entry) {
 		if (!down_read_trylock(&sbi->sb->s_umount))
 			return -EAGAIN;
 	}
-	ret = __sbi_store(a, sbi, buf, count);
+	ret = __f2fs_sbi_store(a, sbi,
+			       __struct_ptr(sbi, a->struct_type), buf, count);
 	if (gc_entry)
 		up_read(&sbi->sb->s_umount);
 
@@ -1173,7 +1242,10 @@ static struct f2fs_attr f2fs_attr_##name = __ATTR(name, 0444, name##_show, NULL)
 #endif
 
 #define GC_THREAD_RW_ATTR(name, elname) \
-	F2FS_RW_ATTR(GC_THREAD, f2fs_gc_kthread, name, elname)
+	F2FS_ATTR_OFFSET(GC_THREAD, name, 0644,			\
+		f2fs_gc_thread_show, f2fs_gc_thread_store,	\
+		offsetof(struct f2fs_gc_kthread, elname),	\
+		sizeof_field(struct f2fs_gc_kthread, elname))
 
 #define SM_INFO_RW_ATTR(name, elname) \
 	F2FS_RW_ATTR(SM_INFO, f2fs_sm_info, name, elname)
-- 
2.43.0
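
The lifetime ordering the commit message relies on (initialize fully, publish with a release store, unpublish with an exchange, wait for borrowed readers, then free) can be sketched in userspace C11. This is only an illustration under stated assumptions and is not code from the patch: struct worker, struct owner, and every function name below are hypothetical, and a plain reader counter with a busy-wait stands in for SRCU, which works differently and sleeps instead of spinning.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct f2fs_gc_kthread. */
struct worker {
	int valid_thresh_ratio;
};

/* Hypothetical stand-in for the owner of the published pointer. */
struct owner {
	_Atomic(struct worker *) worker;	/* published pointer */
	atomic_int readers;			/* crude substitute for SRCU */
};

static void owner_init(struct owner *o)
{
	atomic_init(&o->worker, NULL);
	atomic_init(&o->readers, 0);
}

/* Initialize the object fully, then publish it with a release store. */
static int worker_start(struct owner *o, int ratio)
{
	struct worker *w = malloc(sizeof(*w));

	if (!w)
		return -1;
	w->valid_thresh_ratio = ratio;
	atomic_store_explicit(&o->worker, w, memory_order_release);
	return 0;
}

/* Enter the read side; leave it again and return NULL if unpublished. */
static struct worker *worker_get(struct owner *o)
{
	struct worker *w;

	atomic_fetch_add(&o->readers, 1);
	w = atomic_load(&o->worker);	/* seq_cst, which includes acquire */
	if (!w)
		atomic_fetch_sub(&o->readers, 1);
	return w;
}

static void worker_put(struct owner *o)
{
	atomic_fetch_sub(&o->readers, 1);
}

/* Unpublish first, wait for borrowed readers to leave, then free. */
static void worker_stop(struct owner *o)
{
	struct worker *w = atomic_exchange(&o->worker, (struct worker *)NULL);

	if (!w)
		return;
	while (atomic_load(&o->readers) > 0)
		;	/* busy-wait only for the sketch */
	free(w);
}

/* Returns the ratio seen while published, or -1 if nothing is published. */
static int read_ratio(struct owner *o)
{
	struct worker *w = worker_get(o);
	int ratio;

	if (!w)
		return -1;
	ratio = w->valid_thresh_ratio;
	worker_put(o);
	return ratio;
}
```

A caller would run owner_init(), worker_start(), any number of read_ratio() calls, and finally worker_stop(); after the stop, read_ratio() reports -1 instead of touching freed memory. The counter makes the reader's increment and pointer load sequentially consistent against the exchange in worker_stop(), so a reader that observed the pointer is always counted before the free.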