kernel/watch_queue.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
If not done, a reference to a freed pipe remains in the watch_queue,
as this function is called before freeing a pipe in free_pipe_info()
(see line 834 of fs/pipe.c).
This causes a UAF when post_one_notification tries to access the pipe on
a key update, which is reported by syzbot.
Bug report: https://syzkaller.appspot.com/bug?id=1870dd7791ba05f2ea7f47f7cbdde701173973fc
Reported-and-tested-by: syzbot+c70d87ac1d001f29a058@syzkaller.appspotmail.com
Signed-off-by: Siddh Raman Pant <code@siddh.me>
---
kernel/watch_queue.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/kernel/watch_queue.c b/kernel/watch_queue.c
index bb9962b33f95..bab9e09c74cf 100644
--- a/kernel/watch_queue.c
+++ b/kernel/watch_queue.c
@@ -637,8 +637,17 @@ void watch_queue_clear(struct watch_queue *wqueue)
spin_lock_bh(&wqueue->lock);
}
- spin_unlock_bh(&wqueue->lock);
rcu_read_unlock();
+
+ /* Clearing the watch queue, so we should clean the associated pipe. */
+#ifdef CONFIG_WATCH_QUEUE
+ if (wqueue->pipe) {
+ wqueue->pipe->watch_queue = NULL;
+ wqueue->pipe = NULL;
+ }
+#endif
+
+ spin_unlock_bh(&wqueue->lock);
}
/**
--
2.35.1
Siddh Raman Pant <code@siddh.me> wrote: > +++ b/kernel/watch_queue.c > ... >+#ifdef CONFIG_WATCH_QUEUE But it says: obj-$(CONFIG_WATCH_QUEUE) += watch_queue.o in the Makefile. David
On Wed, 27 Jul 2022 19:45:42 +0530 David Howells <dhowells@redhat.com> wrote: > Siddh Raman Pant <code@siddh.me> wrote: > > > +++ b/kernel/watch_queue.c > > ... > >+#ifdef CONFIG_WATCH_QUEUE > > But it says: > > obj-$(CONFIG_WATCH_QUEUE) += watch_queue.o > > in the Makefile. > > David > > Yes, that's what I realised and meant in reply to Khalid. I had sent a v2, which you can find here: https://lore.kernel.org/linux-kernel/20220724040240.7842-1-code@siddh.me/ Thanks, Siddh
On Sat, Jul 23, 2022 at 07:24:47PM +0530, Siddh Raman Pant via Linux-kernel-mentees wrote: > If not done, a reference to a freed pipe remains in the watch_queue, > as this function is called before freeing a pipe in free_pipe_info() > (see line 834 of fs/pipe.c). > > This causes a UAF when post_one_notification tries to access the pipe on > a key update, which is reported by syzbot. > > Bug report: https://syzkaller.appspot.com/bug?id=1870dd7791ba05f2ea7f47f7cbdde701173973fc > Reported-and-tested-by: syzbot+c70d87ac1d001f29a058@syzkaller.appspotmail.com > > Signed-off-by: Siddh Raman Pant <code@siddh.me> > --- > kernel/watch_queue.c | 11 ++++++++++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/kernel/watch_queue.c b/kernel/watch_queue.c > index bb9962b33f95..bab9e09c74cf 100644 > --- a/kernel/watch_queue.c > +++ b/kernel/watch_queue.c > @@ -637,8 +637,17 @@ void watch_queue_clear(struct watch_queue *wqueue) > spin_lock_bh(&wqueue->lock); > } > > - spin_unlock_bh(&wqueue->lock); > rcu_read_unlock(); Also you now have a spinlock held when calling rcu_read_unlock(), are you sure that's ok?
On Sat, 23 Jul 2022 19:34:17 +0530 Greg KH <gregkh@linuxfoundation.org> wrote: > Also you now have a spinlock held when calling rcu_read_unlock(), are > you sure that's ok? > > We logically should not do write operations in a read critical section, so the nulling of `wqueue->pipe->watch_queue` should happen after rcu_read_unlock(). Also, since we already have a spinlock, we can use it to ensure the nulling. So I think it is okay. Though, it is my first time encountering a spinlock and an rcu lock together, so if I am wrong, please do correct me. Thanks, Siddh
Siddh Raman Pant <code@siddh.me> wrote:
> Greg KH <gregkh@linuxfoundation.org> wrote:
> > > - spin_unlock_bh(&wqueue->lock);
> > > rcu_read_unlock();
> >
> > Also you now have a spinlock held when calling rcu_read_unlock(), are
> > you sure that's ok?
Worse, we have softirqs disabled still, which might cause problems for
rcu_read_unlock()?
> We logically should not do write operations in a read critical section, so the
> nulling of `wqueue->pipe->watch_queue` should happen after rcu_read_unlock().
> Also, since we already have a spinlock, we can use it to ensure the nulling.
> So I think it is okay.
Read/write locks are perhaps misnamed in this sense; they perhaps should be
shared/exclusive. But, yes, we *can* do certain write operations with the
lock held - if we're careful. Locks are required if we need to pairs of
related memory accesses; if we're only making a single non-dependent write,
then we don't necessarily need a write lock.
However, you're referring to RCU read lock. That's a very special lock that
has to do with maintenance of persistence of objects without taking any other
lock. The moment you drop that lock, anything you accessed under RCU protocol
rules should be considered to have evaporated.
Think of it more as a way to have a deferred destructor/deallocator.
So I would do:
+
+ /* Clearing the watch queue, so we should clean the associated pipe. */
+ if (wqueue->pipe) {
+ wqueue->pipe->watch_queue = NULL;
+ wqueue->pipe = NULL;
+ }
+
spin_unlock_bh(&wqueue->lock);
rcu_read_unlock();
}
However, since you're now changing wqueue->pipe whilst a notification is being
posted, you need a barrier in post_one_notification() to prevent the compiler
from reloading the value:
struct pipe_inode_info *pipe = READ_ONCE(wqueue->pipe);
David
On Wed, 27 Jul 2022 20:16:40 +0530 David Howells <dhowells@redhat.com> wrote:
> Siddh Raman Pant <code@siddh.me> wrote:
>
> > Greg KH <gregkh@linuxfoundation.org> wrote:
>
> > > > - spin_unlock_bh(&wqueue->lock);
> > > > rcu_read_unlock();
> > >
> > > Also you now have a spinlock held when calling rcu_read_unlock(), are
> > > you sure that's ok?
>
> Worse, we have softirqs disabled still, which might cause problems for
> rcu_read_unlock()?
>
> > We logically should not do write operations in a read critical section, so the
> > nulling of `wqueue->pipe->watch_queue` should happen after rcu_read_unlock().
> > Also, since we already have a spinlock, we can use it to ensure the nulling.
> > So I think it is okay.
>
> Read/write locks are perhaps misnamed in this sense; they perhaps should be
> shared/exclusive. But, yes, we *can* do certain write operations with the
> lock held - if we're careful. Locks are required if we need to pairs of
> related memory accesses; if we're only making a single non-dependent write,
> then we don't necessarily need a write lock.
>
> However, you're referring to RCU read lock. That's a very special lock that
> has to do with maintenance of persistence of objects without taking any other
> lock. The moment you drop that lock, anything you accessed under RCU protocol
> rules should be considered to have evaporated.
>
> Think of it more as a way to have a deferred destructor/deallocator.
>
> So I would do:
>
> +
> + /* Clearing the watch queue, so we should clean the associated pipe. */
> + if (wqueue->pipe) {
> + wqueue->pipe->watch_queue = NULL;
> + wqueue->pipe = NULL;
> + }
> +
> spin_unlock_bh(&wqueue->lock);
> rcu_read_unlock();
> }
>
> However, since you're now changing wqueue->pipe whilst a notification is being
> posted, you need a barrier in post_one_notification() to prevent the compiler
> from reloading the value:
>
> struct pipe_inode_info *pipe = READ_ONCE(wqueue->pipe);
>
> David
>
Thank you for explaining it!
I will send a v3. Should I add a Suggested-by tag mentioning you?
Thanks,
Siddh
On Wed, Jul 27, 2022 at 09:50:52PM +0530, Siddh Raman Pant wrote: > Thank you for explaining it! > > I will send a v3. Should I add a Suggested-by tag mentioning you? Sorry for jumping in. We have reported the same bug in kernel v5.10.131 [https://lore.kernel.org/all/CANX2M5bHye2ZEEhEV6PUj1kYL2KdWYeJtgXw8KZRzwrNpLYz+A@mail.gmail.com]. We have been suggested to join this discussion so that we can have appropriate meta-information injected in this patch’s commit message to make sure that it gets backported to v5.10.y. Therefore, we would like to be in the loop so that we can offer help in the process, if needed.
On Sun, 31 Jul 2022 23:41:31 +0530 Dipanjan Das <mail.dipanjan.das@gmail.com> wrote:
> On Wed, Jul 27, 2022 at 09:50:52PM +0530, Siddh Raman Pant wrote:
> > Thank you for explaining it!
> >
> > I will send a v3. Should I add a Suggested-by tag mentioning you?
>
> Sorry for jumping in.
>
> We have reported the same bug in kernel v5.10.131 [https://lore.kernel.org/all/CANX2M5bHye2ZEEhEV6PUj1kYL2KdWYeJtgXw8KZRzwrNpLYz+A@mail.gmail.com]. We have been suggested to join this discussion so that we can have appropriate meta-information injected in this patch’s commit message to make sure that it gets backported to v5.10.y. Therefore, we would like to be in the loop so that we can offer help in the process, if needed.
>
As you are suggesting for backporting, I should CC the stable list, or mail
after it gets merged. You have reproduced it on v5.10, but the change seems to
be introduced by c73be61cede5 ("pipe: Add general notification queue support"),
which got in at v5.8. So should it be backported till v5.8 instead?
I actually looked this up on the internet / lore now for any other reports, and
it seems this fixes a CVE (CVE-2022-1882).
The reporter of CVE seems to have linked his patch as a part of CVE report, of
which he sent v2, but he seems to do it in a roundabout way, and also in a way
similar to what Hillf Danton had replied to my v2 patch, wherein he missed
353f7988dd84 ("watchqueue: make sure to serialize 'wqueue->defunct' properly"),
so I guess I can propose my patch as a fix for the CVE.
Note: I have already sent the v3, so please suggest any new improvements etc.
(except replying to the conversation here) to the v3, which can be found here:
https://lore.kernel.org/linux-kernel/20220728155121.12145-1-code@siddh.me/
Also, you may want to break text into multiples lines instead of one huge line.
Thanks,
Siddh
On Mon, Aug 01, 2022 at 12:16:43AM +0530, Siddh Raman Pant wrote:
> On Sun, 31 Jul 2022 23:41:31 +0530 Dipanjan Das <mail.dipanjan.das@gmail.com> wrote:
> > On Wed, Jul 27, 2022 at 09:50:52PM +0530, Siddh Raman Pant wrote:
> > > Thank you for explaining it!
> > >
> > > I will send a v3. Should I add a Suggested-by tag mentioning you?
> >
> > Sorry for jumping in.
> >
> > We have reported the same bug in kernel v5.10.131 [https://lore.kernel.org/all/CANX2M5bHye2ZEEhEV6PUj1kYL2KdWYeJtgXw8KZRzwrNpLYz+A@mail.gmail.com]. We have been suggested to join this discussion so that we can have appropriate meta-information injected in this patch’s commit message to make sure that it gets backported to v5.10.y. Therefore, we would like to be in the loop so that we can offer help in the process, if needed.
> >
>
> As you are suggesting for backporting, I should CC the stable list, or mail
> after it gets merged. You have reproduced it on v5.10, but the change seems to
> be introduced by c73be61cede5 ("pipe: Add general notification queue support"),
> which got in at v5.8. So should it be backported till v5.8 instead?
There are no active supported kernels other than the ones listed on
kernel.org, so 5.8 doesn't make much sense here, only 5.10 and 5.15 and
5.18 at the moment.
thanks,
greg k-h
On Mon, 01 Aug 2022 14:17:23 +0530 Greg KH <gregkh@linuxfoundation.org> wrote: > There are no active supported kernels other than the ones listed on > kernel.org, so 5.8 doesn't make much sense here, only 5.10 and 5.15 and > 5.18 at the moment. > > thanks, > > greg k-h Okay, thanks for correcting me. -- Siddh
On Sat, Jul 23, 2022 at 07:24:47PM +0530, Siddh Raman Pant via Linux-kernel-mentees wrote: > If not done, a reference to a freed pipe remains in the watch_queue, > as this function is called before freeing a pipe in free_pipe_info() > (see line 834 of fs/pipe.c). > > This causes a UAF when post_one_notification tries to access the pipe on > a key update, which is reported by syzbot. > > Bug report: https://syzkaller.appspot.com/bug?id=1870dd7791ba05f2ea7f47f7cbdde701173973fc > Reported-and-tested-by: syzbot+c70d87ac1d001f29a058@syzkaller.appspotmail.com > > Signed-off-by: Siddh Raman Pant <code@siddh.me> > --- > kernel/watch_queue.c | 11 ++++++++++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/kernel/watch_queue.c b/kernel/watch_queue.c > index bb9962b33f95..bab9e09c74cf 100644 > --- a/kernel/watch_queue.c > +++ b/kernel/watch_queue.c > @@ -637,8 +637,17 @@ void watch_queue_clear(struct watch_queue *wqueue) > spin_lock_bh(&wqueue->lock); > } > > - spin_unlock_bh(&wqueue->lock); > rcu_read_unlock(); > + > + /* Clearing the watch queue, so we should clean the associated pipe. */ > +#ifdef CONFIG_WATCH_QUEUE You should not use #ifdef in .c files, it's unmaintainable over time. thanks, greg k-h
On Sat, 23 Jul 2022 19:33:33 +0530 Greg KH <gregkh@linuxfoundation.org> wrote: > You should not use #ifdef in .c files, it's unmaintainable over time. > > thanks, > > greg k-h > I used it because it is used in the same way in fs/pipe.c too (please check the stated line number). That, in turn, is because `watch_queue` member in the `pipe_inode_info` struct is defined that way (see line 80 of include/linux/pipe_fs_i.h), so I am forced to use the ifdef guard. Thanks, Siddh
On Sat, Jul 23, 2022 at 8:29 PM Siddh Raman Pant via
Linux-kernel-mentees <linux-kernel-mentees@lists.linuxfoundation.org>
wrote:
>
> On Sat, 23 Jul 2022 19:33:33 +0530 Greg KH <gregkh@linuxfoundation.org> wrote:
> > You should not use #ifdef in .c files, it's unmaintainable over time.
> >
> > thanks,
> >
> > greg k-h
> >
>
> I used it because it is used in the same way in fs/pipe.c too (please check the
> stated line number).
>
> That, in turn, is because `watch_queue` member in the `pipe_inode_info` struct
> is defined that way (see line 80 of include/linux/pipe_fs_i.h), so I am forced
> to use the ifdef guard.
>
Maybe, we can use the IS_ENABLED macro here to avoid ifdef in the .c file as
suggested here:
https://www.kernel.org/doc/html/v4.10/process/coding-style.html#conditional-compilation
if(IS_ENABLED(CONFIG_WATCH_QUEUE)){
...
}
> Thanks,
> Siddh
Thanks,
-- Khalid Masum
On Sun, 24 Jul 2022 09:15:27 +0530 Khalid Masum <khalid.masum.92@gmail.com> wrote:
> On Sat, Jul 23, 2022 at 8:29 PM Siddh Raman Pant via
> Linux-kernel-mentees <linux-kernel-mentees@lists.linuxfoundation.org>
> wrote:
> >
> > On Sat, 23 Jul 2022 19:33:33 +0530 Greg KH <gregkh@linuxfoundation.org> wrote:
> > > You should not use #ifdef in .c files, it's unmaintainable over time.
> > >
> > > thanks,
> > >
> > > greg k-h
> > >
> >
> > I used it because it is used in the same way in fs/pipe.c too (please check the
> > stated line number).
> >
> > That, in turn, is because `watch_queue` member in the `pipe_inode_info` struct
> > is defined that way (see line 80 of include/linux/pipe_fs_i.h), so I am forced
> > to use the ifdef guard.
> >
> Maybe, we can use the IS_ENABLED macro here to avoid ifdef in the .c file as
> suggested here:
> https://www.kernel.org/doc/html/v4.10/process/coding-style.html#conditional-compilation
>
> if(IS_ENABLED(CONFIG_WATCH_QUEUE)){
> ...
> }
>
> > Thanks,
> > Siddh
>
> Thanks,
> -- Khalid Masum
I have looked at it again. The guard is superfluous in watch_queue.c (don't
need it since we are already in watch queue), hence I am sending v2 with it
removed.
Thanks,
Siddh
© 2016 - 2026 Red Hat, Inc.