[PATCH] dm-bufio: fix sched in atomic context

Sheng Yong posted 1 patch 2 months, 4 weeks ago
Posted by Sheng Yong 2 months, 4 weeks ago
From: Sheng Yong <shengyong1@xiaomi.com>

If "try_verify_in_tasklet" is set for dm-verity, DM_BUFIO_CLIENT_NO_SLEEP
is enabled for dm-bufio, so the client lock is taken with spin_lock_bh().
However, when bufio evicts buffers, it calls cond_resched() while holding
that spinlock. This may schedule in atomic context and triggers the
following warning:

BUG: sleeping function called from invalid context at drivers/md/dm-bufio.c:2745
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 123, name: kworker/2:2
preempt_count: 201, expected: 0
RCU nest depth: 0, expected: 0
4 locks held by kworker/2:2/123:
 #0: ffff88800a2d1548 ((wq_completion)dm_bufio_cache){....}-{0:0}, at: process_one_work+0xe46/0x1970
 #1: ffffc90000d97d20 ((work_completion)(&dm_bufio_replacement_work)){....}-{0:0}, at: process_one_work+0x763/0x1970
 #2: ffffffff8555b528 (dm_bufio_clients_lock){....}-{3:3}, at: do_global_cleanup+0x1ce/0x710
 #3: ffff88801d5820b8 (&c->spinlock){....}-{2:2}, at: do_global_cleanup+0x2a5/0x710
Preemption disabled at:
[<0000000000000000>] 0x0
CPU: 2 UID: 0 PID: 123 Comm: kworker/2:2 Not tainted 6.16.0-rc3-g90548c634bd0 #305 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
Workqueue: dm_bufio_cache do_global_cleanup
Call Trace:
 <TASK>
 dump_stack_lvl+0x53/0x70
 __might_resched+0x360/0x4e0
 do_global_cleanup+0x2f5/0x710
 process_one_work+0x7db/0x1970
 worker_thread+0x518/0xea0
 kthread+0x359/0x690
 ret_from_fork+0xf3/0x1b0
 ret_from_fork_asm+0x1a/0x30
 </TASK>

This can be reproduced with:

  veritysetup format --data-block-size=4096 --hash-block-size=4096 /dev/vda /dev/vdb
  SIZE=$(blockdev --getsz /dev/vda)
  dmsetup create myverity -r --table "0 $SIZE verity 1 /dev/vda /dev/vdb 4096 4096 <data_blocks> 1 sha256 <root_hash> <salt> 1 try_verify_in_tasklet"
  mount /dev/dm-0 /mnt -o ro
  echo 102400 > /sys/module/dm_bufio/parameters/max_cache_size_bytes
  [read files in /mnt]

Fixes: 5721d4e5a9cd ("dm verity: Add optional "try_verify_in_tasklet" feature")
Signed-off-by: Wang Shuai <wangshuai12@xiaomi.com>
Signed-off-by: Sheng Yong <shengyong1@xiaomi.com>
---
 drivers/md/dm-bufio.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
index ec84ba5e93e5..caf6ae9a8b52 100644
--- a/drivers/md/dm-bufio.c
+++ b/drivers/md/dm-bufio.c
@@ -2742,7 +2742,9 @@ static unsigned long __evict_a_few(unsigned long nr_buffers)
 		__make_buffer_clean(b);
 		__free_buffer_wake(b);
 
+		dm_bufio_unlock(c);
 		cond_resched();
+		dm_bufio_lock(c);
 	}
 
 	dm_bufio_unlock(c);
-- 
2.43.0
Re: [PATCH] dm-bufio: fix sched in atomic context
Posted by Mikulas Patocka 2 months, 3 weeks ago

On Thu, 10 Jul 2025, Sheng Yong wrote:

> From: Sheng Yong <shengyong1@xiaomi.com>
> 
> If "try_verify_in_tasklet" is set for dm-verity, DM_BUFIO_CLIENT_NO_SLEEP
> is enabled for dm-bufio, so the client lock is taken with spin_lock_bh().
> However, when bufio evicts buffers, it calls cond_resched() while holding
> that spinlock. This may schedule in atomic context and triggers the
> following warning:
> 
> BUG: sleeping function called from invalid context at drivers/md/dm-bufio.c:2745
> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 123, name: kworker/2:2
> preempt_count: 201, expected: 0
> RCU nest depth: 0, expected: 0
> 4 locks held by kworker/2:2/123:
>  #0: ffff88800a2d1548 ((wq_completion)dm_bufio_cache){....}-{0:0}, at: process_one_work+0xe46/0x1970
>  #1: ffffc90000d97d20 ((work_completion)(&dm_bufio_replacement_work)){....}-{0:0}, at: process_one_work+0x763/0x1970
>  #2: ffffffff8555b528 (dm_bufio_clients_lock){....}-{3:3}, at: do_global_cleanup+0x1ce/0x710
>  #3: ffff88801d5820b8 (&c->spinlock){....}-{2:2}, at: do_global_cleanup+0x2a5/0x710
> Preemption disabled at:
> [<0000000000000000>] 0x0
> CPU: 2 UID: 0 PID: 123 Comm: kworker/2:2 Not tainted 6.16.0-rc3-g90548c634bd0 #305 PREEMPT(voluntary)
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
> Workqueue: dm_bufio_cache do_global_cleanup
> Call Trace:
>  <TASK>
>  dump_stack_lvl+0x53/0x70
>  __might_resched+0x360/0x4e0
>  do_global_cleanup+0x2f5/0x710
>  process_one_work+0x7db/0x1970
>  worker_thread+0x518/0xea0
>  kthread+0x359/0x690
>  ret_from_fork+0xf3/0x1b0
>  ret_from_fork_asm+0x1a/0x30
>  </TASK>
> 
> This can be reproduced with:
> 
>   veritysetup format --data-block-size=4096 --hash-block-size=4096 /dev/vda /dev/vdb
>   SIZE=$(blockdev --getsz /dev/vda)
>   dmsetup create myverity -r --table "0 $SIZE verity 1 /dev/vda /dev/vdb 4096 4096 <data_blocks> 1 sha256 <root_hash> <salt> 1 try_verify_in_tasklet"
>   mount /dev/dm-0 /mnt -o ro
>   echo 102400 > /sys/module/dm_bufio/parameters/max_cache_size_bytes
>   [read files in /mnt]
> 
> Fixes: 5721d4e5a9cd ("dm verity: Add optional "try_verify_in_tasklet" feature")
> Signed-off-by: Wang Shuai <wangshuai12@xiaomi.com>
> Signed-off-by: Sheng Yong <shengyong1@xiaomi.com>
> ---
>  drivers/md/dm-bufio.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
> index ec84ba5e93e5..caf6ae9a8b52 100644
> --- a/drivers/md/dm-bufio.c
> +++ b/drivers/md/dm-bufio.c
> @@ -2742,7 +2742,9 @@ static unsigned long __evict_a_few(unsigned long nr_buffers)
>  		__make_buffer_clean(b);
>  		__free_buffer_wake(b);
>  
> +		dm_bufio_unlock(c);
>  		cond_resched();
> +		dm_bufio_lock(c);
>  	}
>  
>  	dm_bufio_unlock(c);
> -- 
> 2.43.0

Hi

I accepted this patch. I changed it to:

-               cond_resched();
+               if (need_resched()) {
+                       dm_bufio_unlock(c);
+                       cond_resched();
+                       dm_bufio_lock(c);
+               }
        }

        dm_bufio_unlock(c);

so that we are not hammering on the dm bufio lock when scheduling is not 
needed.

Mikulas
Re: [PATCH] dm-bufio: fix sched in atomic context
Posted by Sheng Yong 2 months, 3 weeks ago
On 7/15/25 01:17, Mikulas Patocka wrote:
> 
> 
> On Thu, 10 Jul 2025, Sheng Yong wrote:
> 
>> From: Sheng Yong <shengyong1@xiaomi.com>
>>
[..]
>>
>> diff --git a/drivers/md/dm-bufio.c b/drivers/md/dm-bufio.c
>> index ec84ba5e93e5..caf6ae9a8b52 100644
>> --- a/drivers/md/dm-bufio.c
>> +++ b/drivers/md/dm-bufio.c
>> @@ -2742,7 +2742,9 @@ static unsigned long __evict_a_few(unsigned long nr_buffers)
>>   		__make_buffer_clean(b);
>>   		__free_buffer_wake(b);
>>   
>> +		dm_bufio_unlock(c);
>>   		cond_resched();
>> +		dm_bufio_lock(c);
>>   	}
>>   
>>   	dm_bufio_unlock(c);
>> -- 
>> 2.43.0
> 
> Hi
> 
> I accepted this patch. I changed it to:
> 
> -               cond_resched();
> +               if (need_resched()) {
> +                       dm_bufio_unlock(c);
> +                       cond_resched();
> +                       dm_bufio_lock(c);
> +               }
>          }
> 
>          dm_bufio_unlock(c);
> 
> so that we are not hammering on the dm bufio lock when scheduling is not
> needed.

Hi, Mikulas,

Thank you for the update. It looks good to me.

thanks,
Yong
> 
> Mikulas
>