RE: [syzbot] KASAN: use-after-free Read in dev_uevent

Zhang, Qiang1 posted 1 patch 4 years, 4 months ago
There is a newer version of this series
drivers/base/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
RE: [syzbot] KASAN: use-after-free Read in dev_uevent
Posted by Zhang, Qiang1 4 years, 4 months ago

syzbot has found a reproducer for the following issue on:

HEAD commit:    4f12b742eb2b Merge tag 'nfs-for-5.17-3' of git://git.linux..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=110a6df2700000
kernel config:  https://syzkaller.appspot.com/x/.config?x=f6a069ed94a1ed1d
dashboard link: https://syzkaller.appspot.com/bug?extid=348b571beb5eeb70a582
compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12377296700000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+348b571beb5eeb70a582@syzkaller.appspotmail.com

==================================================================
BUG: KASAN: use-after-free in dev_uevent+0x712/0x780 drivers/base/core.c:2320 Read of size 8 at addr ffff88802b934098 by task udevd/3689

CPU: 2 PID: 3689 Comm: udevd Not tainted 5.17.0-rc4-syzkaller-00229-g4f12b742eb2b #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014 Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
 print_address_description.constprop.0.cold+0x8d/0x303 mm/kasan/report.c:255  __kasan_report mm/kasan/report.c:442 [inline]  kasan_report.cold+0x83/0xdf mm/kasan/report.c:459
 dev_uevent+0x712/0x780 drivers/base/core.c:2320
 uevent_show+0x1b8/0x380 drivers/base/core.c:2391
 dev_attr_show+0x4b/0x90 drivers/base/core.c:2094
 sysfs_kf_seq_show+0x219/0x3d0 fs/sysfs/file.c:59
 seq_read_iter+0x4f5/0x1280 fs/seq_file.c:230
 kernfs_fop_read_iter+0x514/0x6f0 fs/kernfs/file.c:241  call_read_iter include/linux/fs.h:2068 [inline]
 new_sync_read+0x429/0x6e0 fs/read_write.c:400
 vfs_read+0x35c/0x600 fs/read_write.c:481
 ksys_read+0x12d/0x250 fs/read_write.c:619
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80  entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f964cc558fe
Code: c0 e9 e6 fe ff ff 50 48 8d 3d 0e c7 09 00 e8 c9 cf 01 00 66 0f 1f 84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
RSP: 002b:00007ffc0133d258 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 000056497b21a140 RCX: 00007f964cc558fe
RDX: 0000000000001000 RSI: 000056497b218650 RDI: 0000000000000008
RBP: 00007f964cd22380 R08: 0000000000000008 R09: 00007f964cd25a60
R10: 0000000000000008 R11: 0000000000000246 R12: 000056497b21a140
R13: 0000000000000d68 R14: 00007f964cd21780 R15: 0000000000000d68  </TASK>

Cc: Alan Stern 
       Felipe Balbi

Hello syzbot, Please try it:

From 574d45ff924e2d2f9b9f5cc3e846f8004498a811 Mon Sep 17 00:00:00 2001
From: Zqiang <qiang1.zhang@intel.com>
Date: Wed, 23 Feb 2022 18:18:22 +0800
Subject: [PATCH] driver core: Fix use-after-free in dev_uevent()

In dev_uevent(), if the "dev->driver" is valid, the "dev->driver->name"
be accessed, there may be a window period between these two operations.
in this window period if the "dev->driver" is set to null
(in usb_gadget_unregister_driver function), when the "dev->driver->name"
is accessed again, invalid address will be accessed. fix it by checking
"dev->driver" again.

Signed-off-by: Zqiang <qiang1.zhang@intel.com>
---
 drivers/base/core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 3d6430eb0c6a..a45b927ee76e 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2317,7 +2317,7 @@ static int dev_uevent(struct kobject *kobj, struct kobj_uevent_env *env)
                add_uevent_var(env, "DEVTYPE=%s", dev->type->name);

        if (dev->driver)
-               add_uevent_var(env, "DRIVER=%s", dev->driver->name);
+               add_uevent_var(env, "DRIVER=%s", dev_driver_string(dev));

        /* Add common DT information about the device */
        of_device_uevent(dev, env);
--
2.25.1

Thanks,
Zqiang

Allocated by task 4316:
 kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38  kasan_set_track mm/kasan/common.c:45 [inline]  set_alloc_info mm/kasan/common.c:436 [inline]  ____kasan_kmalloc mm/kasan/common.c:515 [inline]  ____kasan_kmalloc mm/kasan/common.c:474 [inline]
 __kasan_kmalloc+0xa6/0xd0 mm/kasan/common.c:524  kasan_kmalloc include/linux/kasan.h:270 [inline]
 kmem_cache_alloc_trace+0x1ea/0x4a0 mm/slab.c:3567  kmalloc include/linux/slab.h:581 [inline]  kzalloc include/linux/slab.h:715 [inline]  dev_new drivers/usb/gadget/legacy/raw_gadget.c:183 [inline]
 raw_open+0x8d/0x4c0 drivers/usb/gadget/legacy/raw_gadget.c:373
 misc_open+0x372/0x4a0 drivers/char/misc.c:141
 chrdev_open+0x266/0x770 fs/char_dev.c:414
 do_dentry_open+0x4b9/0x1250 fs/open.c:824  do_open fs/namei.c:3476 [inline]
 path_openat+0x1c9e/0x2940 fs/namei.c:3609
 do_filp_open+0x1aa/0x400 fs/namei.c:3636
 do_sys_openat2+0x16d/0x4d0 fs/open.c:1214  do_sys_open fs/open.c:1230 [inline]  __do_sys_openat fs/open.c:1246 [inline]  __se_sys_openat fs/open.c:1241 [inline]
 __x64_sys_openat+0x13f/0x1f0 fs/open.c:1241
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80  entry_SYSCALL_64_after_hwframe+0x44/0xae

Freed by task 4315:
 kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38
 kasan_set_track+0x21/0x30 mm/kasan/common.c:45
 kasan_set_free_info+0x20/0x30 mm/kasan/generic.c:370  ____kasan_slab_free mm/kasan/common.c:366 [inline]
 ____kasan_slab_free+0xff/0x140 mm/kasan/common.c:328  kasan_slab_free include/linux/kasan.h:236 [inline]  __cache_free mm/slab.c:3437 [inline]
 kfree+0xf8/0x2b0 mm/slab.c:3794
 kref_put include/linux/kref.h:65 [inline]
 raw_release+0x218/0x290 drivers/usb/gadget/legacy/raw_gadget.c:412
 __fput+0x286/0x9f0 fs/file_table.c:317
 task_work_run+0xdd/0x1a0 kernel/task_work.c:164  tracehook_notify_resume include/linux/tracehook.h:188 [inline]  exit_to_user_mode_loop kernel/entry/common.c:175 [inline]
 exit_to_user_mode_prepare+0x27e/0x290 kernel/entry/common.c:207  __syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
 syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:300
 do_syscall_64+0x42/0xb0 arch/x86/entry/common.c:86  entry_SYSCALL_64_after_hwframe+0x44/0xae

The buggy address belongs to the object at ffff88802b934000  which belongs to the cache kmalloc-4k of size 4096 The buggy address is located 152 bytes inside of  4096-byte region [ffff88802b934000, ffff88802b935000) The buggy address belongs to the page:
page:ffffea0000ae4d00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x2b934
head:ffffea0000ae4d00 order:1 compound_mapcount:0
flags: 0xfff00000010200(slab|head|node=0|zone=1|lastcpupid=0x7ff)
raw: 00fff00000010200 ffffea00008be908 ffffea0000612d08 ffff888010c40900
raw: 0000000000000000 ffff88802b934000 0000000100000001 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 1, migratetype Unmovable, gfp_mask 0x2420c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_COMP|__GFP_THISNODE), pid 4316, ts 254636955499, free_ts 240714313612  prep_new_page mm/page_alloc.c:2434 [inline]
 get_page_from_freelist+0xa72/0x2f50 mm/page_alloc.c:4165
 __alloc_pages_slowpath.constprop.0+0x2eb/0x20d0 mm/page_alloc.c:4934
 __alloc_pages+0x412/0x500 mm/page_alloc.c:5402  __alloc_pages_node include/linux/gfp.h:572 [inline]  kmem_getpages mm/slab.c:1378 [inline]
 cache_grow_begin+0x75/0x390 mm/slab.c:2584
 cache_alloc_refill+0x27f/0x380 mm/slab.c:2957  ____cache_alloc mm/slab.c:3040 [inline]  ____cache_alloc mm/slab.c:3023 [inline]  __do_cache_alloc mm/slab.c:3267 [inline]  slab_alloc mm/slab.c:3308 [inline]
 kmem_cache_alloc_trace+0x380/0x4a0 mm/slab.c:3565  kmalloc include/linux/slab.h:581 [inline]  kzalloc include/linux/slab.h:715 [inline]  dev_new drivers/usb/gadget/legacy/raw_gadget.c:183 [inline]
 raw_open+0x8d/0x4c0 drivers/usb/gadget/legacy/raw_gadget.c:373
 misc_open+0x372/0x4a0 drivers/char/misc.c:141
 chrdev_open+0x266/0x770 fs/char_dev.c:414
 do_dentry_open+0x4b9/0x1250 fs/open.c:824  do_open fs/namei.c:3476 [inline]
 path_openat+0x1c9e/0x2940 fs/namei.c:3609
 do_filp_open+0x1aa/0x400 fs/namei.c:3636
 do_sys_openat2+0x16d/0x4d0 fs/open.c:1214  do_sys_open fs/open.c:1230 [inline]  __do_sys_openat fs/open.c:1246 [inline]  __se_sys_openat fs/open.c:1241 [inline]
 __x64_sys_openat+0x13f/0x1f0 fs/open.c:1241
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80  entry_SYSCALL_64_after_hwframe+0x44/0xae
page last free stack trace:
 reset_page_owner include/linux/page_owner.h:24 [inline]  free_pages_prepare mm/page_alloc.c:1352 [inline]
 free_pcp_prepare+0x374/0x870 mm/page_alloc.c:1404  free_unref_page_prepare mm/page_alloc.c:3325 [inline]
 free_unref_page+0x19/0x690 mm/page_alloc.c:3404  slab_destroy mm/slab.c:1630 [inline]
 slabs_destroy+0x89/0xc0 mm/slab.c:1650
 cache_flusharray mm/slab.c:3410 [inline]
 ___cache_free+0x303/0x600 mm/slab.c:3472  qlink_free mm/kasan/quarantine.c:157 [inline]
 qlist_free_all+0x50/0x1a0 mm/kasan/quarantine.c:176
 kasan_quarantine_reduce+0x180/0x200 mm/kasan/quarantine.c:283
 __kasan_slab_alloc+0x97/0xb0 mm/kasan/common.c:446  kasan_slab_alloc include/linux/kasan.h:260 [inline]  slab_post_alloc_hook mm/slab.h:732 [inline]  slab_alloc_node mm/slab.c:3253 [inline]
 kmem_cache_alloc_node+0x2ea/0x590 mm/slab.c:3591
 __alloc_skb+0x215/0x340 net/core/skbuff.c:414  alloc_skb include/linux/skbuff.h:1158 [inline]
 alloc_skb_with_frags+0x93/0x620 net/core/skbuff.c:5956
 sock_alloc_send_pskb+0x793/0x920 net/core/sock.c:2586
 unix_dgram_sendmsg+0x414/0x1a10 net/unix/af_unix.c:1896  sock_sendmsg_nosec net/socket.c:705 [inline]
 sock_sendmsg+0xcf/0x120 net/socket.c:725
 __sys_sendto+0x21c/0x320 net/socket.c:2040  __do_sys_sendto net/socket.c:2052 [inline]  __se_sys_sendto net/socket.c:2048 [inline]
 __x64_sys_sendto+0xdd/0x1b0 net/socket.c:2048
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80

Memory state around the buggy address:
 ffff88802b933f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88802b934000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff88802b934080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                            ^
 ffff88802b934100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff88802b934180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ==================================================================

Re: [syzbot] KASAN: use-after-free Read in dev_uevent
Posted by Pavel Skripkin 4 years, 4 months ago
Hi Qiang1,

On 2/23/22 14:17, Zhang, Qiang1 wrote:
> 
> Cc: Alan Stern
>         Felipe Balbi
> 
> Hello syzbot, Please try it:
> 
>  From 574d45ff924e2d2f9b9f5cc3e846f8004498a811 Mon Sep 17 00:00:00 2001
> From: Zqiang <qiang1.zhang@intel.com>
> Date: Wed, 23 Feb 2022 18:18:22 +0800
> Subject: [PATCH] driver core: Fix use-after-free in dev_uevent()
> 
> In dev_uevent(), if the "dev->driver" is valid, the "dev->driver->name"
> be accessed, there may be a window period between these two operations.
> in this window period if the "dev->driver" is set to null
> (in usb_gadget_unregister_driver function), when the "dev->driver->name"
> is accessed again, invalid address will be accessed. fix it by checking
> "dev->driver" again.
> 
> Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> ---
>   drivers/base/core.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 3d6430eb0c6a..a45b927ee76e 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -2317,7 +2317,7 @@ static int dev_uevent(struct kobject *kobj, struct kobj_uevent_env *env)
>                  add_uevent_var(env, "DEVTYPE=%s", dev->type->name);
> 
>          if (dev->driver)
> -               add_uevent_var(env, "DRIVER=%s", dev->driver->name);
> +               add_uevent_var(env, "DRIVER=%s", dev_driver_string(dev));
> 
>          /* Add common DT information about the device */
>          of_device_uevent(dev, env);
> --
> 2.25.1
> 

you should use '#syz test' command to ask syzbot to test the patch. 
Basic syntax is '#syz test: <git tree> <branch or sha>' and syzbot will 
apply attached patch (if you have attached it)


More about syzbot interactions here [1].

[1] 
https://github.com/google/syzkaller/blob/15439f1624735bde5ae3f3b66c1b964a98




With regards,
Pavel Skripkin
Re: [syzbot] KASAN: use-after-free Read in dev_uevent
Posted by Pavel Skripkin 4 years, 4 months ago
On 2/23/22 14:23, Pavel Skripkin wrote:
> you should use '#syz test' command to ask syzbot to test the patch.
> Basic syntax is '#syz test: <git tree> <branch or sha>' and syzbot will
> apply attached patch (if you have attached it)
> 
> 
> More about syzbot interactions here [1].
> 
> [1]
> https://github.com/google/syzkaller/blob/15439f1624735bde5ae3f3b66c1b964a98

^^^^^^

I've copy-pasted something weird... Sorry about that. Here is actual link


https://github.com/google/syzkaller/blob/master/docs/syzbot.md




With regards,
Pavel Skripkin
Re: [syzbot] KASAN: use-after-free Read in dev_uevent
Posted by gregkh@linuxfoundation.org 4 years, 4 months ago
On Wed, Feb 23, 2022 at 11:17:07AM +0000, Zhang, Qiang1 wrote:
> 
> syzbot has found a reproducer for the following issue on:
> 
> HEAD commit:    4f12b742eb2b Merge tag 'nfs-for-5.17-3' of git://git.linux..
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=110a6df2700000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=f6a069ed94a1ed1d
> dashboard link: https://syzkaller.appspot.com/bug?extid=348b571beb5eeb70a582
> compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12377296700000
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+348b571beb5eeb70a582@syzkaller.appspotmail.com
> 
> ==================================================================
> BUG: KASAN: use-after-free in dev_uevent+0x712/0x780 drivers/base/core.c:2320 Read of size 8 at addr ffff88802b934098 by task udevd/3689
> 
> CPU: 2 PID: 3689 Comm: udevd Not tainted 5.17.0-rc4-syzkaller-00229-g4f12b742eb2b #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014 Call Trace:
>  <TASK>
>  __dump_stack lib/dump_stack.c:88 [inline]
>  dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
>  print_address_description.constprop.0.cold+0x8d/0x303 mm/kasan/report.c:255  __kasan_report mm/kasan/report.c:442 [inline]  kasan_report.cold+0x83/0xdf mm/kasan/report.c:459
>  dev_uevent+0x712/0x780 drivers/base/core.c:2320
>  uevent_show+0x1b8/0x380 drivers/base/core.c:2391
>  dev_attr_show+0x4b/0x90 drivers/base/core.c:2094
>  sysfs_kf_seq_show+0x219/0x3d0 fs/sysfs/file.c:59
>  seq_read_iter+0x4f5/0x1280 fs/seq_file.c:230
>  kernfs_fop_read_iter+0x514/0x6f0 fs/kernfs/file.c:241  call_read_iter include/linux/fs.h:2068 [inline]
>  new_sync_read+0x429/0x6e0 fs/read_write.c:400
>  vfs_read+0x35c/0x600 fs/read_write.c:481
>  ksys_read+0x12d/0x250 fs/read_write.c:619
>  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
>  do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80  entry_SYSCALL_64_after_hwframe+0x44/0xae
> RIP: 0033:0x7f964cc558fe
> Code: c0 e9 e6 fe ff ff 50 48 8d 3d 0e c7 09 00 e8 c9 cf 01 00 66 0f 1f 84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
> RSP: 002b:00007ffc0133d258 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> RAX: ffffffffffffffda RBX: 000056497b21a140 RCX: 00007f964cc558fe
> RDX: 0000000000001000 RSI: 000056497b218650 RDI: 0000000000000008
> RBP: 00007f964cd22380 R08: 0000000000000008 R09: 00007f964cd25a60
> R10: 0000000000000008 R11: 0000000000000246 R12: 000056497b21a140
> R13: 0000000000000d68 R14: 00007f964cd21780 R15: 0000000000000d68  </TASK>
> 
> Cc: Alan Stern 
>        Felipe Balbi
> 
> Hello syzbot, Please try it:
> 
> From 574d45ff924e2d2f9b9f5cc3e846f8004498a811 Mon Sep 17 00:00:00 2001
> From: Zqiang <qiang1.zhang@intel.com>
> Date: Wed, 23 Feb 2022 18:18:22 +0800
> Subject: [PATCH] driver core: Fix use-after-free in dev_uevent()
> 
> In dev_uevent(), if the "dev->driver" is valid, the "dev->driver->name"
> be accessed, there may be a window period between these two operations.

There should not be any such window period.  The bus locks should
prevent this, unless some driver is doing odd things with the pointers
that it should not be doing.

> in this window period if the "dev->driver" is set to null
> (in usb_gadget_unregister_driver function), when the "dev->driver->name"
> is accessed again, invalid address will be accessed. fix it by checking
> "dev->driver" again.
> 
> Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> ---
>  drivers/base/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 3d6430eb0c6a..a45b927ee76e 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -2317,7 +2317,7 @@ static int dev_uevent(struct kobject *kobj, struct kobj_uevent_env *env)
>                 add_uevent_var(env, "DEVTYPE=%s", dev->type->name);
> 
>         if (dev->driver)
> -               add_uevent_var(env, "DRIVER=%s", dev->driver->name);
> +               add_uevent_var(env, "DRIVER=%s", dev_driver_string(dev));

What's to prevent the "window" from happening in the middle of the
dev_driver_string() call?

thanks,

greg k-h
Re: [syzbot] KASAN: use-after-free Read in dev_uevent
Posted by stern@rowland.harvard.edu 4 years, 4 months ago
On Wed, Feb 23, 2022 at 12:29:03PM +0100, gregkh@linuxfoundation.org wrote:
> On Wed, Feb 23, 2022 at 11:17:07AM +0000, Zhang, Qiang1 wrote:
> > 
> > syzbot has found a reproducer for the following issue on:
> > 
> > HEAD commit:    4f12b742eb2b Merge tag 'nfs-for-5.17-3' of git://git.linux..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=110a6df2700000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=f6a069ed94a1ed1d
> > dashboard link: https://syzkaller.appspot.com/bug?extid=348b571beb5eeb70a582
> > compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12377296700000
> > 
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+348b571beb5eeb70a582@syzkaller.appspotmail.com
> > 
> > ==================================================================
> > BUG: KASAN: use-after-free in dev_uevent+0x712/0x780 drivers/base/core.c:2320 Read of size 8 at addr ffff88802b934098 by task udevd/3689
> > 
> > CPU: 2 PID: 3689 Comm: udevd Not tainted 5.17.0-rc4-syzkaller-00229-g4f12b742eb2b #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014 Call Trace:
> >  <TASK>
> >  __dump_stack lib/dump_stack.c:88 [inline]
> >  dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
> >  print_address_description.constprop.0.cold+0x8d/0x303 mm/kasan/report.c:255  __kasan_report mm/kasan/report.c:442 [inline]  kasan_report.cold+0x83/0xdf mm/kasan/report.c:459
> >  dev_uevent+0x712/0x780 drivers/base/core.c:2320
> >  uevent_show+0x1b8/0x380 drivers/base/core.c:2391
> >  dev_attr_show+0x4b/0x90 drivers/base/core.c:2094
> >  sysfs_kf_seq_show+0x219/0x3d0 fs/sysfs/file.c:59
> >  seq_read_iter+0x4f5/0x1280 fs/seq_file.c:230
> >  kernfs_fop_read_iter+0x514/0x6f0 fs/kernfs/file.c:241  call_read_iter include/linux/fs.h:2068 [inline]
> >  new_sync_read+0x429/0x6e0 fs/read_write.c:400
> >  vfs_read+0x35c/0x600 fs/read_write.c:481
> >  ksys_read+0x12d/0x250 fs/read_write.c:619
> >  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> >  do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > RIP: 0033:0x7f964cc558fe
> > Code: c0 e9 e6 fe ff ff 50 48 8d 3d 0e c7 09 00 e8 c9 cf 01 00 66 0f 1f 84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
> > RSP: 002b:00007ffc0133d258 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> > RAX: ffffffffffffffda RBX: 000056497b21a140 RCX: 00007f964cc558fe
> > RDX: 0000000000001000 RSI: 000056497b218650 RDI: 0000000000000008
> > RBP: 00007f964cd22380 R08: 0000000000000008 R09: 00007f964cd25a60
> > R10: 0000000000000008 R11: 0000000000000246 R12: 000056497b21a140
> > R13: 0000000000000d68 R14: 00007f964cd21780 R15: 0000000000000d68  </TASK>
> > 
> > Cc: Alan Stern 
> >        Felipe Balbi
> > 
> > Hello syzbot, Please try it:
> > 
> > From 574d45ff924e2d2f9b9f5cc3e846f8004498a811 Mon Sep 17 00:00:00 2001
> > From: Zqiang <qiang1.zhang@intel.com>
> > Date: Wed, 23 Feb 2022 18:18:22 +0800
> > Subject: [PATCH] driver core: Fix use-after-free in dev_uevent()
> > 
> > In dev_uevent(), if the "dev->driver" is valid, the "dev->driver->name"
> > be accessed, there may be a window period between these two operations.
> 
> There should not be any such window period.  The bus locks should
> prevent this, unless some driver is doing odd things with the pointers
> that it should not be doing.

Which bus locks are you referring to?  I'm not aware of any locks that 
synchronize dev_uevent() with anything (in particular, with driver 
unbinding).

And as far as I know, usb_gadget_remove_driver() doesn't play any odd 
tricks with pointers.

> > in this window period if the "dev->driver" is set to null
> > (in usb_gadget_unregister_driver function), when the "dev->driver->name"
> > is accessed again, invalid address will be accessed. fix it by checking
> > "dev->driver" again.
> > 
> > Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> > ---
> >  drivers/base/core.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > index 3d6430eb0c6a..a45b927ee76e 100644
> > --- a/drivers/base/core.c
> > +++ b/drivers/base/core.c
> > @@ -2317,7 +2317,7 @@ static int dev_uevent(struct kobject *kobj, struct kobj_uevent_env *env)
> >                 add_uevent_var(env, "DEVTYPE=%s", dev->type->name);
> > 
> >         if (dev->driver)
> > -               add_uevent_var(env, "DRIVER=%s", dev->driver->name);
> > +               add_uevent_var(env, "DRIVER=%s", dev_driver_string(dev));
> 
> What's to prevent the "window" from happening in the middle of the
> dev_driver_string() call?

Nothing prevents it.  But if you read the code for dev_driver_string(), 
you will see that it doesn't matter if dev->driver gets set to NULL 
while the routine is running.

(Of course, there is still the possibility that the driver structure 
itself might get deallocated while dev_driver_string() is running.  
This whole area could perhaps use a little careful thought.  Driver 
unbinding might be a good application for SRCU.)

Alan Stern
Re: [syzbot] KASAN: use-after-free Read in dev_uevent
Posted by gregkh@linuxfoundation.org 4 years, 4 months ago
On Wed, Feb 23, 2022 at 09:38:20AM -0500, stern@rowland.harvard.edu wrote:
> On Wed, Feb 23, 2022 at 12:29:03PM +0100, gregkh@linuxfoundation.org wrote:
> > On Wed, Feb 23, 2022 at 11:17:07AM +0000, Zhang, Qiang1 wrote:
> > > 
> > > syzbot has found a reproducer for the following issue on:
> > > 
> > > HEAD commit:    4f12b742eb2b Merge tag 'nfs-for-5.17-3' of git://git.linux..
> > > git tree:       upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=110a6df2700000
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=f6a069ed94a1ed1d
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=348b571beb5eeb70a582
> > > compiler:       gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> > > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=12377296700000
> > > 
> > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > Reported-by: syzbot+348b571beb5eeb70a582@syzkaller.appspotmail.com
> > > 
> > > ==================================================================
> > > BUG: KASAN: use-after-free in dev_uevent+0x712/0x780 drivers/base/core.c:2320 Read of size 8 at addr ffff88802b934098 by task udevd/3689
> > > 
> > > CPU: 2 PID: 3689 Comm: udevd Not tainted 5.17.0-rc4-syzkaller-00229-g4f12b742eb2b #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014 Call Trace:
> > >  <TASK>
> > >  __dump_stack lib/dump_stack.c:88 [inline]
> > >  dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
> > >  print_address_description.constprop.0.cold+0x8d/0x303 mm/kasan/report.c:255  __kasan_report mm/kasan/report.c:442 [inline]  kasan_report.cold+0x83/0xdf mm/kasan/report.c:459
> > >  dev_uevent+0x712/0x780 drivers/base/core.c:2320
> > >  uevent_show+0x1b8/0x380 drivers/base/core.c:2391
> > >  dev_attr_show+0x4b/0x90 drivers/base/core.c:2094
> > >  sysfs_kf_seq_show+0x219/0x3d0 fs/sysfs/file.c:59
> > >  seq_read_iter+0x4f5/0x1280 fs/seq_file.c:230
> > >  kernfs_fop_read_iter+0x514/0x6f0 fs/kernfs/file.c:241  call_read_iter include/linux/fs.h:2068 [inline]
> > >  new_sync_read+0x429/0x6e0 fs/read_write.c:400
> > >  vfs_read+0x35c/0x600 fs/read_write.c:481
> > >  ksys_read+0x12d/0x250 fs/read_write.c:619
> > >  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> > >  do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > RIP: 0033:0x7f964cc558fe
> > > Code: c0 e9 e6 fe ff ff 50 48 8d 3d 0e c7 09 00 e8 c9 cf 01 00 66 0f 1f 84 00 00 00 00 00 64 8b 04 25 18 00 00 00 85 c0 75 14 0f 05 <48> 3d 00 f0 ff ff 77 5a c3 66 0f 1f 84 00 00 00 00 00 48 83 ec 28
> > > RSP: 002b:00007ffc0133d258 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
> > > RAX: ffffffffffffffda RBX: 000056497b21a140 RCX: 00007f964cc558fe
> > > RDX: 0000000000001000 RSI: 000056497b218650 RDI: 0000000000000008
> > > RBP: 00007f964cd22380 R08: 0000000000000008 R09: 00007f964cd25a60
> > > R10: 0000000000000008 R11: 0000000000000246 R12: 000056497b21a140
> > > R13: 0000000000000d68 R14: 00007f964cd21780 R15: 0000000000000d68  </TASK>
> > > 
> > > Cc: Alan Stern 
> > >        Felipe Balbi
> > > 
> > > Hello syzbot, Please try it:
> > > 
> > > From 574d45ff924e2d2f9b9f5cc3e846f8004498a811 Mon Sep 17 00:00:00 2001
> > > From: Zqiang <qiang1.zhang@intel.com>
> > > Date: Wed, 23 Feb 2022 18:18:22 +0800
> > > Subject: [PATCH] driver core: Fix use-after-free in dev_uevent()
> > > 
> > > In dev_uevent(), if the "dev->driver" is valid, the "dev->driver->name"
> > > be accessed, there may be a window period between these two operations.
> > 
> > There should not be any such window period.  The bus locks should
> > prevent this, unless some driver is doing odd things with the pointers
> > that it should not be doing.
> 
> Which bus locks are you referring to?  I'm not aware of any locks that 
> synchronize dev_uevent() with anything (in particular, with driver 
> unbinding).

The locks in the driver core that handle the binding and unbinding of
drivers to devices.

> And as far as I know, usb_gadget_remove_driver() doesn't play any odd 
> tricks with pointers.

Ah, I never noticed that this is doing a "fake" bus and does the
bind/unbind itself outside of the driver core.  It should just be a
normal bus type and have the core do the work for it, but oh well.

And there is a lock that should serialize all of this already, so it's
odd that this is able to be triggered at all.

Unless the device is being removed at the same time it was manually
unbound from the driver?  If so, then this really should be fixed up to
use the driver core logic instead...

thanks,

greg k-h
Re: [syzbot] KASAN: use-after-free Read in dev_uevent
Posted by stern@rowland.harvard.edu 4 years, 4 months ago
On Wed, Feb 23, 2022 at 05:00:12PM +0100, gregkh@linuxfoundation.org wrote:
> On Wed, Feb 23, 2022 at 09:38:20AM -0500, stern@rowland.harvard.edu wrote:
> > Which bus locks are you referring to?  I'm not aware of any locks that 
> > synchronize dev_uevent() with anything (in particular, with driver 
> > unbinding).
> 
> The locks in the driver core that handle the binding and unbinding of
> drivers to devices.
> 
> > And as far as I know, usb_gadget_remove_driver() doesn't play any odd 
> > tricks with pointers.
> 
> Ah, I never noticed that this is doing a "fake" bus and does the
> bind/unbind itself outside of the driver core.  It should just be a
> normal bus type and have the core do the work for it, but oh well.
> 
> And there is a lock that should serialize all of this already, so it's
> odd that this is able to be triggered at all.

I guess at a minimum the UDC core should hold the device lock when it 
registers, unregisters, binds, or unbinds UDC and gadget devices.  
Would that be enough to fix the problem?  I really don't understand how 
sysfs file access gets synchronized with device removal.

> Unless the device is being removed at the same time it was manually
> unbound from the driver?  If so, then this really should be fixed up to
> use the driver core logic instead...

Device removal does of course trigger unbinding, but they always take 
place in the same thread so it isn't an issue.

Probably part of the reason people don't want to use the driver core 
here is so that they can specify which UDC a gadget driver should bind 
to.  The driver core would always bind each new gadget to the first 
registered gadget driver.

When Dave Brownell originally wrote the gadget subsystem, I believe he 
didn't bother to integrate it with the driver core because it was a 
"bus" with only a single device and a single driver.  The ability to 
have multiple UDCs in the system was added later.

Alan Stern
RE: [syzbot] KASAN: use-after-free Read in dev_uevent
Posted by Zhang, Qiang1 4 years, 4 months ago
On Wed, Feb 23, 2022 at 05:00:12PM +0100, gregkh@linuxfoundation.org wrote:
> On Wed, Feb 23, 2022 at 09:38:20AM -0500, stern@rowland.harvard.edu wrote:
> > Which bus locks are you referring to?  I'm not aware of any locks 
> > that synchronize dev_uevent() with anything (in particular, with 
> > driver unbinding).
> 
> The locks in the driver core that handle the binding and unbinding of 
> drivers to devices.
> 
> > And as far as I know, usb_gadget_remove_driver() doesn't play any 
> > odd tricks with pointers.
> 
> Ah, I never noticed that this is doing a "fake" bus and does the 
> bind/unbind itself outside of the driver core.  It should just be a 
> normal bus type and have the core do the work for it, but oh well.
> 
> And there is a lock that should serialize all of this already, so it's 
> odd that this is able to be triggered at all.

>>I guess at a minimum the UDC core should hold the device lock when it registers, unregisters, binds, or unbinds UDC and gadget devices.  
>>Would that be enough to fix the problem?  I really don't understand how sysfs file access gets synchronized with device removal.


Agree with you, in usb_gadget_remove_driver() function, the udc->dev.driver and udc->gadget->dev.driver be set to null
without any protection, so when the udevd accessed the dev->driver, this address may be invalid at this time.
maybe the operation of dev->driver can be protected by device_lock(). 

Thanks,
Zqiang


> Unless the device is being removed at the same time it was manually 
> unbound from the driver?  If so, then this really should be fixed up 
> to use the driver core logic instead...
>>
>>Device removal does of course trigger unbinding, but they always take place in the same thread so it isn't an issue.
>>
>>Probably part of the reason people don't want to use the driver core here is so that they can specify which UDC a gadget driver should bind to.  The driver core would always bind each new gadget to the first registered gadget driver.
>>
>>When Dave Brownell originally wrote the gadget subsystem, I believe he didn't bother to integrate it with the driver core because it was a "bus" with only a single device and a single driver.  The ability to have multiple UDCs in the system was added later.
>>
>>Alan Stern
RE: [syzbot] KASAN: use-after-free Read in dev_uevent
Posted by Zhang, Qiang1 4 years, 4 months ago

On Wed, Feb 23, 2022 at 05:00:12PM +0100, gregkh@linuxfoundation.org wrote:
> On Wed, Feb 23, 2022 at 09:38:20AM -0500, stern@rowland.harvard.edu wrote:
> > Which bus locks are you referring to?  I'm not aware of any locks 
> > that synchronize dev_uevent() with anything (in particular, with 
> > driver unbinding).
> 
> The locks in the driver core that handle the binding and unbinding of 
> drivers to devices.
> 
> > And as far as I know, usb_gadget_remove_driver() doesn't play any 
> > odd tricks with pointers.
> 
> Ah, I never noticed that this is doing a "fake" bus and does the 
> bind/unbind itself outside of the driver core.  It should just be a 
> normal bus type and have the core do the work for it, but oh well.
> 
> And there is a lock that should serialize all of this already, so it's 
> odd that this is able to be triggered at all.

>>I guess at a minimum the UDC core should hold the device lock when it registers, unregisters, binds, or unbinds UDC and gadget devices.  
>>Would that be enough to fix the problem?  I really don't understand how sysfs file access gets synchronized with device removal.

>>>
>>>Agree with you, in usb_gadget_remove_driver() function, the udc->dev.driver and udc->gadget->dev.driver be set to null without any protection, so when the udevd accessed the dev->driver, this address may be invalid at this time.
>>>maybe the operation of dev->driver can be protected by device_lock(). 
>>>

Is it enough that we just need to protect "dev.driver" ?

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 3d6430eb0c6a..9bd2624973d7 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2316,8 +2316,10 @@ static int dev_uevent(struct kobject *kobj, struct kobj_uevent_env *env)
        if (dev->type && dev->type->name)
                add_uevent_var(env, "DEVTYPE=%s", dev->type->name);

+       device_lock(dev);
        if (dev->driver)
                add_uevent_var(env, "DRIVER=%s", dev->driver->name);
+       device_unlock(dev);

        /* Add common DT information about the device */
        of_device_uevent(dev, env);
diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c
index 568534a0d17c..7877142397d3 100644
--- a/drivers/usb/gadget/udc/core.c
+++ b/drivers/usb/gadget/udc/core.c
@@ -1436,8 +1436,14 @@ static void usb_gadget_remove_driver(struct usb_udc *udc)
        usb_gadget_udc_stop(udc);

        udc->driver = NULL;
+
+       device_lock(&udc->dev);
        udc->dev.driver = NULL;
+       device_unlock(&udc->dev);
+
+       device_lock(&udc->gadget->dev);
        udc->gadget->dev.driver = NULL;
+       device_unlock(&udc->gadget->dev);
 }

 /**
@@ -1498,8 +1504,14 @@ static int udc_bind_to_driver(struct usb_udc *udc, struct usb_gadget_driver *dri
                        driver->function);

        udc->driver = driver;
+
+       device_lock(&udc->dev);
        udc->dev.driver = &driver->driver;
+       device_unlock(&udc->dev);
+
+       device_lock(&udc->gadget->dev);
        udc->gadget->dev.driver = &driver->driver;
+       device_unlock(&udc->gadget->dev);

        usb_gadget_udc_set_speed(udc, driver->max_speed);

@@ -1521,8 +1533,14 @@ static int udc_bind_to_driver(struct usb_udc *udc, struct usb_gadget_driver *dri
                dev_err(&udc->dev, "failed to start %s: %d\n",
                        udc->driver->function, ret);
        udc->driver = NULL;
+
+       device_lock(&udc->dev);
        udc->dev.driver = NULL;
+       device_unlock(&udc->dev);
+
+       device_lock(&udc->gadget->dev);
        udc->gadget->dev.driver = NULL;
+       device_unlock(&udc->gadget->dev);
        return ret;
 }

Thanks,
Zqiang


>>>Thanks,
>>>Zqiang


> Unless the device is being removed at the same time it was manually 
> unbound from the driver?  If so, then this really should be fixed up 
> to use the driver core logic instead...
>>
>>Device removal does of course trigger unbinding, but they always take place in the same thread so it isn't an issue.
>>
>>Probably part of the reason people don't want to use the driver core here is so that they can specify which UDC a gadget driver should bind to.  The driver core would always bind each new gadget to the first registered gadget driver.
>>
>>When Dave Brownell originally wrote the gadget subsystem, I believe he didn't bother to integrate it with the driver core because it was a "bus" with only a single device and a single driver.  The ability to have multiple UDCs in the system was added later.
>>
>>Alan Stern
Re: [syzbot] KASAN: use-after-free Read in dev_uevent
Posted by stern@rowland.harvard.edu 4 years, 4 months ago
On Thu, Feb 24, 2022 at 03:14:54AM +0000, Zhang, Qiang1 wrote:
> 
> On Wed, Feb 23, 2022 at 05:00:12PM +0100, gregkh@linuxfoundation.org wrote:
> > On Wed, Feb 23, 2022 at 09:38:20AM -0500, stern@rowland.harvard.edu wrote:
> > > Which bus locks are you referring to?  I'm not aware of any locks 
> > > that synchronize dev_uevent() with anything (in particular, with 
> > > driver unbinding).
> > 
> > The locks in the driver core that handle the binding and unbinding of 
> > drivers to devices.
> > 
> > > And as far as I know, usb_gadget_remove_driver() doesn't play any 
> > > odd tricks with pointers.
> > 
> > Ah, I never noticed that this is doing a "fake" bus and does the 
> > bind/unbind itself outside of the driver core.  It should just be a 
> > normal bus type and have the core do the work for it, but oh well.
> > 
> > And there is a lock that should serialize all of this already, so it's 
> > odd that this is able to be triggered at all.
> 
> >>I guess at a minimum the UDC core should hold the device lock when it registers, unregisters, binds, or unbinds UDC and gadget devices.  
> >>Would that be enough to fix the problem?  I really don't understand how sysfs file access gets synchronized with device removal.
> 
> >>>
> >>>Agree with you, in usb_gadget_remove_driver() function, the udc->dev.driver and udc->gadget->dev.driver be set to null without any protection, so when the udevd accessed the dev->driver, this address may be invalid at this time.
> >>>maybe the operation of dev->driver can be protected by device_lock(). 
> >>>
> 
> Is it enough that we just need to protect "dev.driver" ?

I don't know, although I doubt it.  The right way to fix it is to make 
sure that the existing protections, which apply to drivers that are 
registered in the driver core, can also work properly with gadgets.  But 
I don't know what those protections are or how they work.

> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 3d6430eb0c6a..9bd2624973d7 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -2316,8 +2316,10 @@ static int dev_uevent(struct kobject *kobj, struct kobj_uevent_env *env)
>         if (dev->type && dev->type->name)
>                 add_uevent_var(env, "DEVTYPE=%s", dev->type->name);
> 
> +       device_lock(dev);
>         if (dev->driver)
>                 add_uevent_var(env, "DRIVER=%s", dev->driver->name);
> +       device_unlock(dev);

You probably should not do this.  Unless there's a serious bug, the 
driver core already takes all the locks it needs.  Doing this might 
cause a deadlock (because the caller may already hold the device lock).

> 
>         /* Add common DT information about the device */
>         of_device_uevent(dev, env);
> diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c
> index 568534a0d17c..7877142397d3 100644
> --- a/drivers/usb/gadget/udc/core.c
> +++ b/drivers/usb/gadget/udc/core.c
> @@ -1436,8 +1436,14 @@ static void usb_gadget_remove_driver(struct usb_udc *udc)
>         usb_gadget_udc_stop(udc);
> 
>         udc->driver = NULL;
> +
> +       device_lock(&udc->dev);
>         udc->dev.driver = NULL;
> +       device_unlock(&udc->dev);
> +
> +       device_lock(&udc->gadget->dev);
>         udc->gadget->dev.driver = NULL;
> +       device_unlock(&udc->gadget->dev);
>  }

These are reasonable things to do, but I don't know if they will fix the 
problem.

We really should ask advice from somebody who understands how this stuff 
is supposed to work.  I'm not sure who to ask, though.  Tejun Heo, 
perhaps (CC'ed).

Tejun: The USB Gadget core binds and unbinds drivers without using the 
normal driver core facilities (see the code in 
usb_gadget_remove_driver() above).  As a result, unbinding races with 
uevent generation, which can lead to a NULL pointer dereference as found 
by syzbot testing.  In particular, dev->driver can become NULL between 
the times when dev_uevent() tests it and uses it (see above).

Can you tell us how this should be fixed?

Alan Stern
Re: [syzbot] KASAN: use-after-free Read in dev_uevent
Posted by gregkh@linuxfoundation.org 4 years, 4 months ago
On Thu, Feb 24, 2022 at 04:23:26PM -0500, stern@rowland.harvard.edu wrote:
> On Thu, Feb 24, 2022 at 03:14:54AM +0000, Zhang, Qiang1 wrote:
> > 
> > On Wed, Feb 23, 2022 at 05:00:12PM +0100, gregkh@linuxfoundation.org wrote:
> > > On Wed, Feb 23, 2022 at 09:38:20AM -0500, stern@rowland.harvard.edu wrote:
> > > > Which bus locks are you referring to?  I'm not aware of any locks 
> > > > that synchronize dev_uevent() with anything (in particular, with 
> > > > driver unbinding).
> > > 
> > > The locks in the driver core that handle the binding and unbinding of 
> > > drivers to devices.
> > > 
> > > > And as far as I know, usb_gadget_remove_driver() doesn't play any 
> > > > odd tricks with pointers.
> > > 
> > > Ah, I never noticed that this is doing a "fake" bus and does the 
> > > bind/unbind itself outside of the driver core.  It should just be a 
> > > normal bus type and have the core do the work for it, but oh well.
> > > 
> > > And there is a lock that should serialize all of this already, so it's 
> > > odd that this is able to be triggered at all.
> > 
> > >>I guess at a minimum the UDC core should hold the device lock when it registers, unregisters, binds, or unbinds UDC and gadget devices.  
> > >>Would that be enough to fix the problem?  I really don't understand how sysfs file access gets synchronized with device removal.
> > 
> > >>>
> > >>>Agree with you, in usb_gadget_remove_driver() function, the udc->dev.driver and udc->gadget->dev.driver be set to null without any protection, so when the udevd accessed the dev->driver, this address may be invalid at this time.
> > >>>maybe the operation of dev->driver can be protected by device_lock(). 
> > >>>
> > 
> > Is it enough that we just need to protect "dev.driver" ?
> 
> I don't know, although I doubt it.  The right way to fix it is to make 
> sure that the existing protections, which apply to drivers that are 
> registered in the driver core, can also work properly with gadgets.  But 
> I don't know what those protections are or how they work.
> 
> > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > index 3d6430eb0c6a..9bd2624973d7 100644
> > --- a/drivers/base/core.c
> > +++ b/drivers/base/core.c
> > @@ -2316,8 +2316,10 @@ static int dev_uevent(struct kobject *kobj, struct kobj_uevent_env *env)
> >         if (dev->type && dev->type->name)
> >                 add_uevent_var(env, "DEVTYPE=%s", dev->type->name);
> > 
> > +       device_lock(dev);
> >         if (dev->driver)
> >                 add_uevent_var(env, "DRIVER=%s", dev->driver->name);
> > +       device_unlock(dev);
> 
> You probably should not do this.  Unless there's a serious bug, the 
> driver core already takes all the locks it needs.  Doing this might 
> cause a deadlock (because the caller may already hold the device lock).
> 
> > 
> >         /* Add common DT information about the device */
> >         of_device_uevent(dev, env);
> > diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c
> > index 568534a0d17c..7877142397d3 100644
> > --- a/drivers/usb/gadget/udc/core.c
> > +++ b/drivers/usb/gadget/udc/core.c
> > @@ -1436,8 +1436,14 @@ static void usb_gadget_remove_driver(struct usb_udc *udc)
> >         usb_gadget_udc_stop(udc);
> > 
> >         udc->driver = NULL;
> > +
> > +       device_lock(&udc->dev);
> >         udc->dev.driver = NULL;
> > +       device_unlock(&udc->dev);
> > +
> > +       device_lock(&udc->gadget->dev);
> >         udc->gadget->dev.driver = NULL;
> > +       device_unlock(&udc->gadget->dev);
> >  }
> 
> These are reasonable things to do, but I don't know if they will fix the 
> problem.
> 
> We really should ask advice from somebody who understands how this stuff 
> is supposed to work.  I'm not sure who to ask, though.  Tejun Heo, 
> perhaps (CC'ed).
> 
> Tejun: The USB Gadget core binds and unbinds drivers without using the 
> normal driver core facilities (see the code in 
> usb_gadget_remove_driver() above).  As a result, unbinding races with 
> uevent generation, which can lead to a NULL pointer dereference as found 
> by syzbot testing.  In particular, dev->driver can become NULL between 
> the times when dev_uevent() tests it and uses it (see above).
> 
> Can you tell us how this should be fixed?

It should be fixed by properly using the driver core to bind/unbind the
driver to devices like I mentioned previously :)

That will be more work, but it's the correct fix here.  Otherwise it
needs to take the same bus locks that the device lives on to keep things
in sync, like the driver core would do if it were managing these things.
That could be the "short term" fix if no one wants to do the real work
needed here.  Nothing should be needed to change in the driver core
itself, it is rightfully thinking it owns the device and can free it
when needed.

thanks,

greg k-h
Re: [syzbot] KASAN: use-after-free Read in dev_uevent
Posted by stern@rowland.harvard.edu 4 years, 4 months ago
On Thu, Feb 24, 2022 at 11:37:39PM +0100, gregkh@linuxfoundation.org wrote:
> On Thu, Feb 24, 2022 at 04:23:26PM -0500, stern@rowland.harvard.edu wrote:
> > Can you tell us how this should be fixed?
> 
> It should be fixed by properly using the driver core to bind/unbind the
> driver to devices like I mentioned previously :)

This would involve creating a "gadget" bus_type (or should it be a 
device_type under the platform bus?) and registering the gadgets 
on it, right?.  Similarly, the gadget drivers would be registered on 
this bus.  I suppose we can control which drivers get bound to which 
gadgets with careful matching code.

> That will be more work, but it's the correct fix here.  Otherwise it
> needs to take the same bus locks that the device lives on to keep things
> in sync, like the driver core would do if it were managing these things.
> That could be the "short term" fix if no one wants to do the real work
> needed here.  Nothing should be needed to change in the driver core
> itself, it is rightfully thinking it owns the device and can free it
> when needed.

All right, thanks.  I'll think about implementing it.

Alan Stern
RE: [syzbot] KASAN: use-after-free Read in dev_uevent
Posted by Zhang, Qiang1 4 years, 4 months ago
On Thu, Feb 24, 2022 at 03:14:54AM +0000, Zhang, Qiang1 wrote:
> 
> On Wed, Feb 23, 2022 at 05:00:12PM +0100, gregkh@linuxfoundation.org wrote:
> > On Wed, Feb 23, 2022 at 09:38:20AM -0500, stern@rowland.harvard.edu wrote:
> > > Which bus locks are you referring to?  I'm not aware of any locks 
> > > that synchronize dev_uevent() with anything (in particular, with 
> > > driver unbinding).
> > 
> > The locks in the driver core that handle the binding and unbinding 
> > of drivers to devices.
> > 
> > > And as far as I know, usb_gadget_remove_driver() doesn't play any 
> > > odd tricks with pointers.
> > 
> > Ah, I never noticed that this is doing a "fake" bus and does the 
> > bind/unbind itself outside of the driver core.  It should just be a 
> > normal bus type and have the core do the work for it, but oh well.
> > 
> > And there is a lock that should serialize all of this already, so 
> > it's odd that this is able to be triggered at all.
> 
> >>I guess at a minimum the UDC core should hold the device lock when it registers, unregisters, binds, or unbinds UDC and gadget devices.  
> >>Would that be enough to fix the problem?  I really don't understand how sysfs file access gets synchronized with device removal.
> 
> >>>
> >>>Agree with you, in usb_gadget_remove_driver() function, the udc->dev.driver and udc->gadget->dev.driver be set to null without any protection, so when the udevd accessed the dev->driver, this address may be invalid at this time.
> >>>maybe the operation of dev->driver can be protected by device_lock(). 
> >>>
> 
> Is it enough that we just need to protect "dev.driver" ?

I don't know, although I doubt it.  The right way to fix it is to make sure that the existing protections, which apply to drivers that are registered in the driver core, can also work properly with gadgets.  But I don't know what those protections are or how they work.

> diff --git a/drivers/base/core.c b/drivers/base/core.c index 
> 3d6430eb0c6a..9bd2624973d7 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -2316,8 +2316,10 @@ static int dev_uevent(struct kobject *kobj, struct kobj_uevent_env *env)
>         if (dev->type && dev->type->name)
>                 add_uevent_var(env, "DEVTYPE=%s", dev->type->name);
> 
> +       device_lock(dev);
>         if (dev->driver)
>                 add_uevent_var(env, "DRIVER=%s", dev->driver->name);
> +       device_unlock(dev);

>>You probably should not do this.  Unless there's a serious bug, the driver core already takes all the locks it needs.  Doing this might cause a deadlock (because the caller may already hold the device lock).

Sorry, yes it causes recursive locks.

> 
>         /* Add common DT information about the device */
>         of_device_uevent(dev, env);
> diff --git a/drivers/usb/gadget/udc/core.c 
> b/drivers/usb/gadget/udc/core.c index 568534a0d17c..7877142397d3 
> 100644
> --- a/drivers/usb/gadget/udc/core.c
> +++ b/drivers/usb/gadget/udc/core.c
> @@ -1436,8 +1436,14 @@ static void usb_gadget_remove_driver(struct usb_udc *udc)
>         usb_gadget_udc_stop(udc);
> 
>         udc->driver = NULL;
> +
> +       device_lock(&udc->dev);
>         udc->dev.driver = NULL;
> +       device_unlock(&udc->dev);
> +
> +       device_lock(&udc->gadget->dev);
>         udc->gadget->dev.driver = NULL;
> +       device_unlock(&udc->gadget->dev);
>  }

>>These are reasonable things to do, but I don't know if they will fix the problem.
>>
>>We really should ask advice from somebody who understands how this stuff is supposed to work.  I'm not sure who to ask, though.  Tejun Heo, perhaps (CC'ed).
>>
>>Tejun: The USB Gadget core binds and unbinds drivers without using the normal driver core facilities (see the code in
>>usb_gadget_remove_driver() above).  As a result, unbinding races with uevent generation, which can lead to a NULL pointer dereference as found by syzbot testing.  In particular, dev->driver can become NULL between the times when dev_uevent() tests it and uses it (see above).

                CPU0 (task 4316)                                                                   CPU1 (udevd task 3689)
usb_gadget_remove_driver()                                                dev_uevent()
   	...........		                                                               if (dev->driver)
    udc->dev.driver = NULL;                                                                  add_uevent_var(env, "DRIVER=%s", dev->driver->name)
    udc->gadget->dev.driver = NULL; 


Thanks,
Zqiang

>>
>>Can you tell us how this should be fixed?
>>
>>Alan Stern