[PATCH] fuse: add a null-ptr check

Nihar Chaithanya posted 1 patch 6 days, 18 hours ago
fs/fuse/dev.c | 3 +++
1 file changed, 3 insertions(+)
[PATCH] fuse: add a null-ptr check
Posted by Nihar Chaithanya 6 days, 18 hours ago
The bug KASAN: null-ptr-deref is triggered due to *val being
dereferenced when it is null in fuse_copy_do() when performing
memcpy().
Add a check in fuse_copy_one() to prevent this.

Reported-by: syzbot+87b8e6ed25dbc41759f7@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=87b8e6ed25dbc41759f7
Fixes: 3b97c3652d91 ("fuse: convert direct io to use folios")
Tested-by: syzbot+87b8e6ed25dbc41759f7@syzkaller.appspotmail.com
Signed-off-by: Nihar Chaithanya <niharchaithanya@gmail.com>
---
 fs/fuse/dev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 563a0bfa0e95..9c93759ac14b 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -1070,6 +1070,9 @@ static int fuse_copy_pages(struct fuse_copy_state *cs, unsigned nbytes,
 /* Copy a single argument in the request to/from userspace buffer */
 static int fuse_copy_one(struct fuse_copy_state *cs, void *val, unsigned size)
 {
+	if (!val)
+		return -EINVAL;
+
 	while (size) {
 		if (!cs->len) {
 			int err = fuse_copy_fill(cs);
-- 
2.34.1
Re: [PATCH] fuse: add a null-ptr check
Posted by Bernd Schubert 6 days, 17 hours ago

On 11/30/24 07:51, Nihar Chaithanya wrote:
> The bug KASAN: null-ptr-deref is triggered due to *val being
> dereferenced when it is null in fuse_copy_do() when performing
> memcpy().
> Add a check in fuse_copy_one() to prevent this.
> 
> Reported-by: syzbot+87b8e6ed25dbc41759f7@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=87b8e6ed25dbc41759f7
> Fixes: 3b97c3652d91 ("fuse: convert direct io to use folios")
> Tested-by: syzbot+87b8e6ed25dbc41759f7@syzkaller.appspotmail.com
> Signed-off-by: Nihar Chaithanya <niharchaithanya@gmail.com>
> ---
>  fs/fuse/dev.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
> index 563a0bfa0e95..9c93759ac14b 100644
> --- a/fs/fuse/dev.c
> +++ b/fs/fuse/dev.c
> @@ -1070,6 +1070,9 @@ static int fuse_copy_pages(struct fuse_copy_state *cs, unsigned nbytes,
>  /* Copy a single argument in the request to/from userspace buffer */
>  static int fuse_copy_one(struct fuse_copy_state *cs, void *val, unsigned size)
>  {
> +	if (!val)
> +		return -EINVAL;
> +
>  	while (size) {
>  		if (!cs->len) {
>  			int err = fuse_copy_fill(cs);

I'm going to read through Joannes patches in the evening. Without
further explanation I find it unusual to have size, but no value.


Thanks,
Bernd
Re: [PATCH] fuse: add a null-ptr check
Posted by Joanne Koong 4 days, 5 hours ago
On Sat, Nov 30, 2024 at 12:22 AM Bernd Schubert
<bernd.schubert@fastmail.fm> wrote:
>
> On 11/30/24 07:51, Nihar Chaithanya wrote:

Hi Nihar and Bernd,

> > The bug KASAN: null-ptr-deref is triggered due to *val being
> > dereferenced when it is null in fuse_copy_do() when performing
> > memcpy().

It's not clear to me that syzbot's "null-ptr-deref" complaint is about
*val being dereferenced when val is NULL.

The stack trace [1] points to the 2nd memcpy in fuse_copy_do():

/* Do as much copy to/from userspace buffer as we can */
static int fuse_copy_do(struct fuse_copy_state *cs, void **val, unsigned *size)
{
        unsigned ncpy = min(*size, cs->len);
        if (val) {
                void *pgaddr = kmap_local_page(cs->pg);
                void *buf = pgaddr + cs->offset;

                if (cs->write)
                        memcpy(buf, *val, ncpy);
                else
                        memcpy(*val, buf, ncpy);

                kunmap_local(pgaddr);
                *val += ncpy;
        }
...
}

but AFAICT, if val is NULL then we never try to deref val since it's
guarded by the "if (val)" check.

It seems like syzbot is either complaining about buf being NULL / *val
being NULL and then trying to deference those inside the memcpy call,
or maybe it actually is (mistakenly) complaining about val being NULL.

It's not clear to me either how the "fuse: convert direct io to use
folios" patch (on the fuse tree, it's commit 3b97c36) [2] directly
causes this.

If I'm remembering correctly, it's possible to add debug printks to a
patch and syzbot will print out the debug messages as it triggers the
issue? It'd be interesting to see which request opcode triggers this,
and what exactly is being deref-ed here that is NULL. I need to look
at this more deeply but so far, nothing stands out as to what could be
the culprit.


Thanks,
Joanne

[1] https://lore.kernel.org/linux-fsdevel/67475f25.050a0220.253251.005b.GAE@google.com/
[2] https://lore.kernel.org/linux-fsdevel/20241024171809.3142801-13-joannelkoong@gmail.com/

> > Add a check in fuse_copy_one() to prevent this.
> >
> > Reported-by: syzbot+87b8e6ed25dbc41759f7@syzkaller.appspotmail.com
> > Closes: https://syzkaller.appspot.com/bug?extid=87b8e6ed25dbc41759f7
> > Fixes: 3b97c3652d91 ("fuse: convert direct io to use folios")
> > Tested-by: syzbot+87b8e6ed25dbc41759f7@syzkaller.appspotmail.com
> > Signed-off-by: Nihar Chaithanya <niharchaithanya@gmail.com>
> > ---
> >  fs/fuse/dev.c | 3 +++
> >  1 file changed, 3 insertions(+)
> >
> > diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
> > index 563a0bfa0e95..9c93759ac14b 100644
> > --- a/fs/fuse/dev.c
> > +++ b/fs/fuse/dev.c
> > @@ -1070,6 +1070,9 @@ static int fuse_copy_pages(struct fuse_copy_state *cs, unsigned nbytes,
> >  /* Copy a single argument in the request to/from userspace buffer */
> >  static int fuse_copy_one(struct fuse_copy_state *cs, void *val, unsigned size)
> >  {
> > +     if (!val)
> > +             return -EINVAL;
> > +
> >       while (size) {
> >               if (!cs->len) {
> >                       int err = fuse_copy_fill(cs);
>
> I'm going to read through Joannes patches in the evening. Without
> further explanation I find it unusual to have size, but no value.
>
>
> Thanks,
> Bernd
Re: [PATCH] fuse: add a null-ptr check
Posted by Bernd Schubert 4 days, 4 hours ago
Btw, totally unrelated to the report, but related to what the C 
reproducer does, killing it sometimes results in

  12563 pts/1    Zl     0:00 [syzkaller] <defunct>


[   46.018014] mount.nfs (1163) used greatest stack depth: 23944 bytes left
[ 9929.865478] syzkaller (12313) used greatest stack depth: 23216 bytes left
[10159.658915] INFO: task syzkaller:12312 blocked for more than 120 seconds.
[10159.663075]       Not tainted 6.13.0-rc1+ #92
[10159.665618] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[10159.673650] task:syzkaller       state:D stack:28944 pid:12312 tgid:12307 ppid:1      flags:0x00004006
[10159.681276] Call Trace:
[10159.683004]  <TASK>
[10159.685636]  __schedule+0x1b42/0x25b0
[10159.688521]  schedule+0xb5/0x260
[10159.690415]  __fuse_simple_request+0xc49/0x1350 [fuse]
[10159.694677]  ? wake_bit_function+0x210/0x210
[10159.697145]  fuse_do_getattr+0x2cb/0x600 [fuse]



Aborting the connection(s) 'fixes' that, but looks like it triggers
another issue. Timeouts would certainly help, but it still should
work automatically.


Thanks,
Bernd
Re: [PATCH] fuse: add a null-ptr check
Posted by Bernd Schubert 4 days, 3 hours ago
I think I found it, look into fuse_get_user_pages() in your patch - it
returns nbytesp as coming in, without having added any pages.
Re: [PATCH] fuse: add a null-ptr check
Posted by Joanne Koong 4 days, 3 hours ago
On Mon, Dec 2, 2024 at 1:50 PM Bernd Schubert
<bernd.schubert@fastmail.fm> wrote:
>
> I think I found it, look into fuse_get_user_pages() in your patch - it
> returns nbytesp as coming in, without having added any pages.

Nice!! Just saw your submission here:
https://lore.kernel.org/linux-fsdevel/20241202-fix-fuse_get_user_pages-v1-1-8b5cccaf5bbe@ddn.com/T/#u
Will comment on that thread.

Thanks,
Joanne
Re: [PATCH] fuse: add a null-ptr check
Posted by Bernd Schubert 4 days, 4 hours ago
Hi Joanne,

On 12/2/24 21:40, Joanne Koong wrote:
> On Sat, Nov 30, 2024 at 12:22 AM Bernd Schubert
> <bernd.schubert@fastmail.fm> wrote:
>>
>> On 11/30/24 07:51, Nihar Chaithanya wrote:
> 
> Hi Nihar and Bernd,
> 
>>> The bug KASAN: null-ptr-deref is triggered due to *val being
>>> dereferenced when it is null in fuse_copy_do() when performing
>>> memcpy().
> 
> It's not clear to me that syzbot's "null-ptr-deref" complaint is about
> *val being dereferenced when val is NULL.
> 
> The stack trace [1] points to the 2nd memcpy in fuse_copy_do():
> 
> /* Do as much copy to/from userspace buffer as we can */
> static int fuse_copy_do(struct fuse_copy_state *cs, void **val, unsigned *size)
> {
>         unsigned ncpy = min(*size, cs->len);
>         if (val) {
>                 void *pgaddr = kmap_local_page(cs->pg);
>                 void *buf = pgaddr + cs->offset;
> 
>                 if (cs->write)
>                         memcpy(buf, *val, ncpy);
>                 else
>                         memcpy(*val, buf, ncpy);
> 
>                 kunmap_local(pgaddr);
>                 *val += ncpy;
>         }
> ...
> }
> 
> but AFAICT, if val is NULL then we never try to deref val since it's
> guarded by the "if (val)" check.

The function takes &val in fuse_copy_one(). The NULL check is more for
passing NULL from fuse_copy_page().


> 
> It seems like syzbot is either complaining about buf being NULL / *val
> being NULL and then trying to deference those inside the memcpy call,
> or maybe it actually is (mistakenly) complaining about val being NULL.

I don't think it is 'buf', because of 

==> Write of size 5 at addr 0000000000000000 

If it would be buf, it would be a read. With the knowledge that the line
number is correct, as it goes through fuse_dev_write(). Although I have
to admit that cs->write is really confusing - just the other way
around of fuse_dev_do_write  / fuse_dev_do_read.



> 
> It's not clear to me either how the "fuse: convert direct io to use
> folios" patch (on the fuse tree, it's commit 3b97c36) [2] directly
> causes this.
> 
> If I'm remembering correctly, it's possible to add debug printks to a
> patch and syzbot will print out the debug messages as it triggers the
> issue? It'd be interesting to see which request opcode triggers this,
> and what exactly is being deref-ed here that is NULL. I need to look
> at this more deeply but so far, nothing stands out as to what could be
> the culprit.

Yeah, I was just thinking the same and just reading through syzbot doku. 
I had tried to reproduce in my lokal VM on master/6.13 - no luck.


Thanks,
Bernd