fs/erofs/zdata.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
Current check for atomic context is not sufficient as
z_erofs_decompressqueue_endio can be called under rcu lock
from blk_mq_flush_plug_list(). See the stacktrace [1]
In such case we should hand off the decompression work for async
processing rather than trying to do sync decompression in current
context. Patch fixes the detection by checking for
rcu_read_lock_any_held() and while at it use more appropriate
!in_task() check than in_atomic().
Background: Historically erofs would always schedule a kworker for
decompression which would incur the scheduling cost regardless of
the context. But z_erofs_decompressqueue_endio() may not always
be in atomic context and we could actually benefit from doing the
decompression in z_erofs_decompressqueue_endio() if we are in
thread context, for example when running with dm-verity.
This optimization was later added in patch [2] which has shown
improvement in performance benchmarks.
==============================================
[1] Problem stacktrace
[name:core&]BUG: sleeping function called from invalid context at kernel/locking/mutex.c:291
[name:core&]in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 1615, name: CpuMonitorServi
[name:core&]preempt_count: 0, expected: 0
[name:core&]RCU nest depth: 1, expected: 0
CPU: 7 PID: 1615 Comm: CpuMonitorServi Tainted: G S W OE 6.1.25-android14-5-maybe-dirty-mainline #1
Hardware name: MT6897 (DT)
Call trace:
dump_backtrace+0x108/0x15c
show_stack+0x20/0x30
dump_stack_lvl+0x6c/0x8c
dump_stack+0x20/0x48
__might_resched+0x1fc/0x308
__might_sleep+0x50/0x88
mutex_lock+0x2c/0x110
z_erofs_decompress_queue+0x11c/0xc10
z_erofs_decompress_kickoff+0x110/0x1a4
z_erofs_decompressqueue_endio+0x154/0x180
bio_endio+0x1b0/0x1d8
__dm_io_complete+0x22c/0x280
clone_endio+0xe4/0x280
bio_endio+0x1b0/0x1d8
blk_update_request+0x138/0x3a4
blk_mq_plug_issue_direct+0xd4/0x19c
blk_mq_flush_plug_list+0x2b0/0x354
__blk_flush_plug+0x110/0x160
blk_finish_plug+0x30/0x4c
read_pages+0x2fc/0x370
page_cache_ra_unbounded+0xa4/0x23c
page_cache_ra_order+0x290/0x320
do_sync_mmap_readahead+0x108/0x2c0
filemap_fault+0x19c/0x52c
__do_fault+0xc4/0x114
handle_mm_fault+0x5b4/0x1168
do_page_fault+0x338/0x4b4
do_translation_fault+0x40/0x60
do_mem_abort+0x60/0xc8
el0_da+0x4c/0xe0
el0t_64_sync_handler+0xd4/0xfc
el0t_64_sync+0x1a0/0x1a4
[2] Link: https://lore.kernel.org/all/20210317035448.13921-1-huangjianan@oppo.com/
Reported-by: Will Shiu <Will.Shiu@mediatek.com>
Suggested-by: Gao Xiang <xiang@kernel.org>
Signed-off-by: Sandeep Dhavale <dhavale@google.com>
---
fs/erofs/zdata.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/erofs/zdata.c b/fs/erofs/zdata.c
index 264bf553c287..5f1890e309c6 100644
--- a/fs/erofs/zdata.c
+++ b/fs/erofs/zdata.c
@@ -1447,7 +1447,7 @@ static void z_erofs_decompress_kickoff(struct z_erofs_decompressqueue *io,
if (atomic_add_return(bios, &io->pending_bios))
return;
/* Use (kthread_)work and sync decompression for atomic contexts only */
- if (in_atomic() || irqs_disabled()) {
+ if (!in_task() || irqs_disabled() || rcu_read_lock_any_held()) {
#ifdef CONFIG_EROFS_FS_PCPU_KTHREAD
struct kthread_worker *worker;
--
2.41.0.162.gfafddb0af9-goog
On 22/06/2023 00:08, Sandeep Dhavale wrote: > Current check for atomic context is not sufficient as > z_erofs_decompressqueue_endio can be called under rcu lock > from blk_mq_flush_plug_list(). See the stacktrace [1] > > In such case we should hand off the decompression work for async > processing rather than trying to do sync decompression in current > context. Patch fixes the detection by checking for > rcu_read_lock_any_held() and while at it use more appropriate > !in_task() check than in_atomic(). > > Background: Historically erofs would always schedule a kworker for > decompression which would incur the scheduling cost regardless of > the context. But z_erofs_decompressqueue_endio() may not always > be in atomic context and we could actually benefit from doing the > decompression in z_erofs_decompressqueue_endio() if we are in > thread context, for example when running with dm-verity. > This optimization was later added in patch [2] which has shown > improvement in performance benchmarks. Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com> -- Regards, Alexandre
On 2023/6/22 06:08, Sandeep Dhavale wrote: > Current check for atomic context is not sufficient as > z_erofs_decompressqueue_endio can be called under rcu lock > from blk_mq_flush_plug_list(). See the stacktrace [1] > > In such case we should hand off the decompression work for async > processing rather than trying to do sync decompression in current > context. Patch fixes the detection by checking for > rcu_read_lock_any_held() and while at it use more appropriate > !in_task() check than in_atomic(). > > Background: Historically erofs would always schedule a kworker for > decompression which would incur the scheduling cost regardless of > the context. But z_erofs_decompressqueue_endio() may not always > be in atomic context and we could actually benefit from doing the > decompression in z_erofs_decompressqueue_endio() if we are in > thread context, for example when running with dm-verity. > This optimization was later added in patch [2] which has shown > improvement in performance benchmarks. > > ============================================== > [1] Problem stacktrace > [name:core&]BUG: sleeping function called from invalid context at kernel/locking/mutex.c:291 > [name:core&]in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 1615, name: CpuMonitorServi > [name:core&]preempt_count: 0, expected: 0 > [name:core&]RCU nest depth: 1, expected: 0 > CPU: 7 PID: 1615 Comm: CpuMonitorServi Tainted: G S W OE 6.1.25-android14-5-maybe-dirty-mainline #1 > Hardware name: MT6897 (DT) > Call trace: > dump_backtrace+0x108/0x15c > show_stack+0x20/0x30 > dump_stack_lvl+0x6c/0x8c > dump_stack+0x20/0x48 > __might_resched+0x1fc/0x308 > __might_sleep+0x50/0x88 > mutex_lock+0x2c/0x110 > z_erofs_decompress_queue+0x11c/0xc10 > z_erofs_decompress_kickoff+0x110/0x1a4 > z_erofs_decompressqueue_endio+0x154/0x180 > bio_endio+0x1b0/0x1d8 > __dm_io_complete+0x22c/0x280 > clone_endio+0xe4/0x280 > bio_endio+0x1b0/0x1d8 > blk_update_request+0x138/0x3a4 > blk_mq_plug_issue_direct+0xd4/0x19c > blk_mq_flush_plug_list+0x2b0/0x354 > __blk_flush_plug+0x110/0x160 > blk_finish_plug+0x30/0x4c > read_pages+0x2fc/0x370 > page_cache_ra_unbounded+0xa4/0x23c > page_cache_ra_order+0x290/0x320 > do_sync_mmap_readahead+0x108/0x2c0 > filemap_fault+0x19c/0x52c > __do_fault+0xc4/0x114 > handle_mm_fault+0x5b4/0x1168 > do_page_fault+0x338/0x4b4 > do_translation_fault+0x40/0x60 > do_mem_abort+0x60/0xc8 > el0_da+0x4c/0xe0 > el0t_64_sync_handler+0xd4/0xfc > el0t_64_sync+0x1a0/0x1a4 > > [2] Link: https://lore.kernel.org/all/20210317035448.13921-1-huangjianan@oppo.com/ > > Reported-by: Will Shiu <Will.Shiu@mediatek.com> > Suggested-by: Gao Xiang <xiang@kernel.org> > Signed-off-by: Sandeep Dhavale <dhavale@google.com> It looks good to me, Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Thanks, Gao Xiang
© 2016 - 2024 Red Hat, Inc.