block/bfq-iosched.c | 8 ++++++++ 1 file changed, 8 insertions(+)
From: Yu Kuai <yukuai3@huawei.com>
Our test report a uaf for 'bfqq->bic' in 5.10:
==================================================================
BUG: KASAN: use-after-free in bfq_select_queue+0x378/0xa30
Read of size 8 at addr ffff88810efb42d8 by task fsstress/2318352
CPU: 6 PID: 2318352 Comm: fsstress Kdump: loaded Not tainted 5.10.0-60.18.0.50.h602.kasan.eulerosv2r11.x86_64 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-20220320_160524-szxrtosci10000 04/01/2014
Call Trace:
dump_stack+0x9c/0xd3
print_address_description.constprop.0+0x19/0x170
? bfq_select_queue+0x378/0xa30
__kasan_report.cold+0x6c/0x84
? bfq_select_queue+0x378/0xa30
kasan_report+0x3a/0x50
bfq_select_queue+0x378/0xa30
? bfq_bfqq_expire+0x6c0/0x6c0
? bfq_mark_bfqq_busy+0x1f/0x30
? _raw_spin_lock_irq+0x7b/0xd0
__bfq_dispatch_request+0x1c4/0x220
bfq_dispatch_request+0xe8/0x130
__blk_mq_do_dispatch_sched+0x3f4/0x560
? blk_mq_sched_mark_restart_hctx+0x50/0x50
? bfq_init_rq+0x128/0x940
? pvclock_clocksource_read+0xf6/0x1d0
blk_mq_do_dispatch_sched+0x62/0xb0
__blk_mq_sched_dispatch_requests+0x215/0x2a0
? blk_mq_do_dispatch_ctx+0x3a0/0x3a0
? bfq_insert_request+0x193/0x3f0
blk_mq_sched_dispatch_requests+0x8f/0xd0
__blk_mq_run_hw_queue+0x98/0x180
__blk_mq_delay_run_hw_queue+0x22b/0x240
? bfq_asymmetric_scenario+0x160/0x160
blk_mq_run_hw_queue+0xe3/0x190
? bfq_insert_request+0x3f0/0x3f0
blk_mq_sched_insert_requests+0x107/0x200
blk_mq_flush_plug_list+0x26e/0x3c0
? blk_mq_insert_requests+0x250/0x250
? blk_check_plugged+0x190/0x190
blk_finish_plug+0x63/0x90
__iomap_dio_rw+0x7b5/0x910
? iomap_dio_actor+0x150/0x150
? userns_put+0x70/0x70
? userns_put+0x70/0x70
? avc_has_perm_noaudit+0x1d0/0x1d0
? down_read+0xd5/0x1a0
? down_read_killable+0x1b0/0x1b0
? from_kgid+0xa0/0xa0
iomap_dio_rw+0x36/0x80
ext4_dio_read_iter+0x146/0x190 [ext4]
ext4_file_read_iter+0x1e2/0x230 [ext4]
new_sync_read+0x29f/0x400
? default_llseek+0x160/0x160
? find_isec_ns+0x8d/0x2e0
? avc_policy_seqno+0x27/0x40
? selinux_file_permission+0x34/0x180
? security_file_permission+0x135/0x2b0
vfs_read+0x24e/0x2d0
ksys_read+0xd5/0x1b0
? __ia32_sys_pread64+0x160/0x160
? __audit_syscall_entry+0x1cc/0x220
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x61/0xc6
RIP: 0033:0x7ff05f96fe62
Code: c0 e9 b2 fe ff ff 50 48 8d 3d 12 04 0c 00 e8 b5 fe 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89 54 24
RSP: 002b:00007fffd30c0ff8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 00000000000000a5 RCX: 00007ff05f96fe62
RDX: 000000000001d000 RSI: 0000000001ffc000 RDI: 0000000000000003
RBP: 0000000000000003 R08: 0000000002019000 R09: 0000000000000000
R10: 00007ff05fa65290 R11: 0000000000000246 R12: 0000000000131800
R13: 000000000001d000 R14: 0000000000000000 R15: 0000000001ffc000
Allocated by task 2318348:
kasan_save_stack+0x1b/0x40
__kasan_kmalloc.constprop.0+0xb5/0xe0
kmem_cache_alloc_node+0x15d/0x480
ioc_create_icq+0x68/0x2e0
blk_mq_sched_assign_ioc+0xbc/0xd0
blk_mq_rq_ctx_init+0x4b0/0x600
__blk_mq_alloc_request+0x21f/0x2e0
blk_mq_submit_bio+0x27a/0xd60
__submit_bio_noacct_mq+0x10b/0x270
submit_bio_noacct+0x13d/0x150
submit_bio+0xbf/0x280
iomap_dio_submit_bio+0x155/0x180
iomap_dio_bio_actor+0x2f0/0x770
iomap_dio_actor+0xd9/0x150
iomap_apply+0x1d2/0x4f0
__iomap_dio_rw+0x43a/0x910
iomap_dio_rw+0x36/0x80
ext4_dio_write_iter+0x46f/0x730 [ext4]
ext4_file_write_iter+0xd8/0x100 [ext4]
new_sync_write+0x2ac/0x3a0
vfs_write+0x365/0x430
ksys_write+0xd5/0x1b0
do_syscall_64+0x33/0x40
entry_SYSCALL_64_after_hwframe+0x61/0xc6
Freed by task 2320929:
kasan_save_stack+0x1b/0x40
kasan_set_track+0x1c/0x30
kasan_set_free_info+0x20/0x40
__kasan_slab_free+0x151/0x180
kmem_cache_free+0x9e/0x540
rcu_do_batch+0x292/0x700
rcu_core+0x270/0x2d0
__do_softirq+0xfd/0x402
Last call_rcu():
kasan_save_stack+0x1b/0x40
kasan_record_aux_stack+0xa8/0xf0
__call_rcu+0xa4/0x3a0
ioc_release_fn+0x45/0x120
process_one_work+0x3c5/0x730
worker_thread+0x93/0x650
kthread+0x1ba/0x210
ret_from_fork+0x22/0x30
Second to last call_rcu():
kasan_save_stack+0x1b/0x40
kasan_record_aux_stack+0xa8/0xf0
__call_rcu+0xa4/0x3a0
ioc_release_fn+0x45/0x120
process_one_work+0x3c5/0x730
worker_thread+0x93/0x650
kthread+0x1ba/0x210
ret_from_fork+0x22/0x30
The buggy address belongs to the object at ffff88810efb42a0
which belongs to the cache bfq_io_cq of size 160
The buggy address is located 56 bytes inside of
160-byte region [ffff88810efb42a0, ffff88810efb4340)
The buggy address belongs to the page:
page:00000000a519c14c refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88810efb4000 pfn:0x10efb4
head:00000000a519c14c order:1 compound_mapcount:0
flags: 0x17ffffc0010200(slab|head|node=0|zone=2|lastcpupid=0x1fffff)
raw: 0017ffffc0010200 0000000000000000 dead000000000122 ffff8881407c8600
raw: ffff88810efb4000 000000008024001a 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff88810efb4180: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb
ffff88810efb4200: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
>ffff88810efb4280: fc fc fc fc fa fb fb fb fb fb fb fb fb fb fb fb
^
ffff88810efb4300: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
ffff88810efb4380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
Commit 3bc5e683c67d ("bfq: Split shared queues on move between cgroups")
changes that move process to a new cgroup will allocate a new bfqq to
use, however, the old bfqq and new bfqq can point to the same bic:
1) Initial state, two process with io in the same cgroup.
Process 1 Process 2
(BIC1) (BIC2)
| Λ | Λ
| | | |
V | V |
bfqq1 bfqq2
2) bfqq1 is merged to bfqq2.
Process 1 Process 2(cg1)
(BIC1) (BIC2)
| |
\-------------\|
V
bfqq1 bfqq2(coop)
3) Process 1 exit, then issue new io(denoce IOA) from Process 2.
(BIC2)
| Λ
| |
V |
bfqq2(coop)
4) Before IOA is completed, move Process 2 to another cgroup and issue io.
Process 2
(BIC2)
Λ
|\--------------\
| V
bfqq2 bfqq3
Now that BIC2 points to bfqq3, while bfqq2 and bfqq3 both point to BIC2.
If all the requests are completed, and Process 2 exit, BIC2 will be
freed while there is no guarantee that bfqq2 will be freed before BIC2.
Fix the problem by clearing bfqq->bic if process references is decreased
to zero, since that they are not related anymore.
Fixes: 3bc5e683c67d ("bfq: Split shared queues on move between cgroups")
Signed-off-by: Yu Kuai <yukuai3@huawei.com>
---
block/bfq-iosched.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index a72304c728fc..6eada99d1b34 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -3036,6 +3036,14 @@ void bfq_release_process_ref(struct bfq_data *bfqd, struct bfq_queue *bfqq)
bfq_reassign_last_bfqq(bfqq, NULL);
+ /*
+ * __bfq_bic_change_cgroup() just reset bic->bfqq so that a new bfqq
+ * will be created to handle new io, while old bfqq will stay around
+ * until all the requests are completed. It's unsafe to keep bfqq->bic
+ * since they are not related anymore.
+ */
+ if (bfqq_process_refs(bfqq) == 1)
+ bfqq->bic = NULL;
bfq_put_queue(bfqq);
}
--
2.31.1
On Sat 10-12-22 18:25:37, Yu Kuai wrote: > From: Yu Kuai <yukuai3@huawei.com> > > Our test report a uaf for 'bfqq->bic' in 5.10: > > ================================================================== > BUG: KASAN: use-after-free in bfq_select_queue+0x378/0xa30 > Read of size 8 at addr ffff88810efb42d8 by task fsstress/2318352 > > CPU: 6 PID: 2318352 Comm: fsstress Kdump: loaded Not tainted 5.10.0-60.18.0.50.h602.kasan.eulerosv2r11.x86_64 #1 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-20220320_160524-szxrtosci10000 04/01/2014 > Call Trace: ... > bfq_select_queue+0x378/0xa30 > __bfq_dispatch_request+0x1c4/0x220 > bfq_dispatch_request+0xe8/0x130 > __blk_mq_do_dispatch_sched+0x3f4/0x560 > blk_mq_do_dispatch_sched+0x62/0xb0 > __blk_mq_sched_dispatch_requests+0x215/0x2a0 > blk_mq_sched_dispatch_requests+0x8f/0xd0 > __blk_mq_run_hw_queue+0x98/0x180 > __blk_mq_delay_run_hw_queue+0x22b/0x240 > blk_mq_run_hw_queue+0xe3/0x190 > blk_mq_sched_insert_requests+0x107/0x200 > blk_mq_flush_plug_list+0x26e/0x3c0 > blk_finish_plug+0x63/0x90 > __iomap_dio_rw+0x7b5/0x910 > iomap_dio_rw+0x36/0x80 > ext4_dio_read_iter+0x146/0x190 [ext4] > ext4_file_read_iter+0x1e2/0x230 [ext4] > new_sync_read+0x29f/0x400 > vfs_read+0x24e/0x2d0 > ksys_read+0xd5/0x1b0 Perhaps we can trim this UAF report a bit to what I've left above? That should be enough to give idea about the problem. > Commit 3bc5e683c67d ("bfq: Split shared queues on move between cgroups") > changes that move process to a new cgroup will allocate a new bfqq to > use, however, the old bfqq and new bfqq can point to the same bic: > > 1) Initial state, two process with io in the same cgroup. > > Process 1 Process 2 > (BIC1) (BIC2) > | Λ | Λ > | | | | > V | V | > bfqq1 bfqq2 > > 2) bfqq1 is merged to bfqq2. > > Process 1 Process 2(cg1) > (BIC1) (BIC2) > | | > \-------------\| > V > bfqq1 bfqq2(coop) > > 3) Process 1 exit, then issue new io(denoce IOA) from Process 2. > > (BIC2) > | Λ > | | > V | > bfqq2(coop) > > 4) Before IOA is completed, move Process 2 to another cgroup and issue io. > > Process 2 > (BIC2) > Λ > |\--------------\ > | V > bfqq2 bfqq3 > > Now that BIC2 points to bfqq3, while bfqq2 and bfqq3 both point to BIC2. > If all the requests are completed, and Process 2 exit, BIC2 will be > freed while there is no guarantee that bfqq2 will be freed before BIC2. > > Fix the problem by clearing bfqq->bic if process references is decreased > to zero, since that they are not related anymore. > > Fixes: 3bc5e683c67d ("bfq: Split shared queues on move between cgroups") > Signed-off-by: Yu Kuai <yukuai3@huawei.com> Thanks for the analysis and the patch! I agree this is a problem. > diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c > index a72304c728fc..6eada99d1b34 100644 > --- a/block/bfq-iosched.c > +++ b/block/bfq-iosched.c > @@ -3036,6 +3036,14 @@ void bfq_release_process_ref(struct bfq_data *bfqd, struct bfq_queue *bfqq) > > bfq_reassign_last_bfqq(bfqq, NULL); > > + /* > + * __bfq_bic_change_cgroup() just reset bic->bfqq so that a new bfqq > + * will be created to handle new io, while old bfqq will stay around > + * until all the requests are completed. It's unsafe to keep bfqq->bic > + * since they are not related anymore. > + */ > + if (bfqq_process_refs(bfqq) == 1) > + bfqq->bic = NULL; > bfq_put_queue(bfqq); Rather than changing bfq_release_process_ref() I think it would be more logical to change bic_set_bfqq() like: struct bfq_queue *old_bfqq = bic->bfqq[is_sync]; /* Clear bic pointer if we are detaching bfqq from its bic */ if (old_bfqq && old_bfqq->bic == bic) old_bfqq->bic = NULL; And then we can also remove several explicit bfqq->bic = NULL statements from bfq code. Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR
Hi, Jan! 在 2022/12/12 21:35, Jan Kara 写道: > On Sat 10-12-22 18:25:37, Yu Kuai wrote: >> From: Yu Kuai <yukuai3@huawei.com> >> >> Our test report a uaf for 'bfqq->bic' in 5.10: >> >> ================================================================== >> BUG: KASAN: use-after-free in bfq_select_queue+0x378/0xa30 >> Read of size 8 at addr ffff88810efb42d8 by task fsstress/2318352 >> >> CPU: 6 PID: 2318352 Comm: fsstress Kdump: loaded Not tainted 5.10.0-60.18.0.50.h602.kasan.eulerosv2r11.x86_64 #1 >> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58-20220320_160524-szxrtosci10000 04/01/2014 >> Call Trace: > ... >> bfq_select_queue+0x378/0xa30 >> __bfq_dispatch_request+0x1c4/0x220 >> bfq_dispatch_request+0xe8/0x130 >> __blk_mq_do_dispatch_sched+0x3f4/0x560 >> blk_mq_do_dispatch_sched+0x62/0xb0 >> __blk_mq_sched_dispatch_requests+0x215/0x2a0 >> blk_mq_sched_dispatch_requests+0x8f/0xd0 >> __blk_mq_run_hw_queue+0x98/0x180 >> __blk_mq_delay_run_hw_queue+0x22b/0x240 >> blk_mq_run_hw_queue+0xe3/0x190 >> blk_mq_sched_insert_requests+0x107/0x200 >> blk_mq_flush_plug_list+0x26e/0x3c0 >> blk_finish_plug+0x63/0x90 >> __iomap_dio_rw+0x7b5/0x910 >> iomap_dio_rw+0x36/0x80 >> ext4_dio_read_iter+0x146/0x190 [ext4] >> ext4_file_read_iter+0x1e2/0x230 [ext4] >> new_sync_read+0x29f/0x400 >> vfs_read+0x24e/0x2d0 >> ksys_read+0xd5/0x1b0 > > Perhaps we can trim this UAF report a bit to what I've left above? That > should be enough to give idea about the problem. Yes, of course. > >> Commit 3bc5e683c67d ("bfq: Split shared queues on move between cgroups") >> changes that move process to a new cgroup will allocate a new bfqq to >> use, however, the old bfqq and new bfqq can point to the same bic: >> >> 1) Initial state, two process with io in the same cgroup. >> >> Process 1 Process 2 >> (BIC1) (BIC2) >> | Λ | Λ >> | | | | >> V | V | >> bfqq1 bfqq2 >> >> 2) bfqq1 is merged to bfqq2. >> >> Process 1 Process 2(cg1) >> (BIC1) (BIC2) >> | | >> \-------------\| >> V >> bfqq1 bfqq2(coop) >> >> 3) Process 1 exit, then issue new io(denoce IOA) from Process 2. >> >> (BIC2) >> | Λ >> | | >> V | >> bfqq2(coop) >> >> 4) Before IOA is completed, move Process 2 to another cgroup and issue io. >> >> Process 2 >> (BIC2) >> Λ >> |\--------------\ >> | V >> bfqq2 bfqq3 >> >> Now that BIC2 points to bfqq3, while bfqq2 and bfqq3 both point to BIC2. >> If all the requests are completed, and Process 2 exit, BIC2 will be >> freed while there is no guarantee that bfqq2 will be freed before BIC2. >> >> Fix the problem by clearing bfqq->bic if process references is decreased >> to zero, since that they are not related anymore. >> >> Fixes: 3bc5e683c67d ("bfq: Split shared queues on move between cgroups") >> Signed-off-by: Yu Kuai <yukuai3@huawei.com> > > Thanks for the analysis and the patch! I agree this is a problem. > >> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c >> index a72304c728fc..6eada99d1b34 100644 >> --- a/block/bfq-iosched.c >> +++ b/block/bfq-iosched.c >> @@ -3036,6 +3036,14 @@ void bfq_release_process_ref(struct bfq_data *bfqd, struct bfq_queue *bfqq) >> >> bfq_reassign_last_bfqq(bfqq, NULL); >> >> + /* >> + * __bfq_bic_change_cgroup() just reset bic->bfqq so that a new bfqq >> + * will be created to handle new io, while old bfqq will stay around >> + * until all the requests are completed. It's unsafe to keep bfqq->bic >> + * since they are not related anymore. >> + */ >> + if (bfqq_process_refs(bfqq) == 1) >> + bfqq->bic = NULL; >> bfq_put_queue(bfqq); > > Rather than changing bfq_release_process_ref() I think it would be more > logical to change bic_set_bfqq() like: > > struct bfq_queue *old_bfqq = bic->bfqq[is_sync]; > > /* Clear bic pointer if we are detaching bfqq from its bic */ > if (old_bfqq && old_bfqq->bic == bic) > old_bfqq->bic = NULL; > > And then we can also remove several explicit bfqq->bic = NULL statements > from bfq code. Yes, I agree. I'll send a new patch soon. Thanks, Kuai > > Honza >
© 2016 - 2025 Red Hat, Inc.