[PATCH v2 03/28] binder: fix race between mmput() and do_exit()

Carlos Llamas posted 28 patches 2 years ago
Posted by Carlos Llamas 2 years ago
Task A calls binder_update_page_range() to allocate and insert pages
into the remote address space of Task B. To do so, Task A first pins the
remote mm via mmget_not_zero(). This can race with do_exit() in Task B,
in which case the final mmput() refcount decrement comes from Task A.

  Task A            | Task B
  ------------------+------------------
  mmget_not_zero()  |
                    |  do_exit()
                    |    exit_mm()
                    |      mmput()
  mmput()           |
    exit_mmap()     |
      remove_vma()  |
        fput()      |

In this case, the work of ____fput() from Task B is queued up in Task A
as TWA_RESUME. So in theory, Task A returns to userspace and the cleanup
work gets executed. However, Task A instead sleeps, waiting for a reply
from Task B that never comes (it's dead).

This means binder_deferred_release() is blocked until an unrelated
binder event forces Task A to go back to userspace. All the associated
death notifications will also be delayed until then.

To fix this, use mmput_async(), which schedules the work on the
corresponding mm->async_put_work workqueue instead of in Task A.

Fixes: 457b9a6f09f0 ("Staging: android: add binder driver")
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
---
 drivers/android/binder_alloc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index 9d2eff70c3ba..adcec5ec0959 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -271,7 +271,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
 	}
 	if (mm) {
 		mmap_write_unlock(mm);
-		mmput(mm);
+		mmput_async(mm);
 	}
 	return 0;
 
@@ -304,7 +304,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
 err_no_vma:
 	if (mm) {
 		mmap_write_unlock(mm);
-		mmput(mm);
+		mmput_async(mm);
 	}
 	return vma ? -ENOMEM : -ESRCH;
 }
-- 
2.43.0.rc2.451.g8631bc7472-goog
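
The race described in the commit message can be modelled in userspace. The
following is a minimal sketch in plain C, not kernel code: toy_mm, toy_task
and all toy_*() functions are hypothetical names invented for illustration.
The pending_cleanup flag stands in for the TWA_RESUME task work queued by
fput(), and the workqueue behind mmput_async() is assumed to run right away.
It only illustrates why the final reference drop should not park cleanup on
a task that may not return to userspace.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these types or functions exist in the kernel. */
struct toy_task {
	bool blocked_in_kernel;	/* sleeping, e.g. waiting for a binder reply */
	bool pending_cleanup;	/* stands in for a queued TWA_RESUME task work */
};

struct toy_mm {
	int users;		/* stands in for mm->mm_users */
	bool released;		/* true once the deferred cleanup has run */
};

/* Task work runs only when the task actually heads back to userspace. */
static void toy_return_to_user(struct toy_task *t, struct toy_mm *mm)
{
	if (t->blocked_in_kernel)
		return;
	if (t->pending_cleanup) {
		t->pending_cleanup = false;
		mm->released = true;
	}
}

/* Synchronous put: the final drop parks the cleanup on the calling task. */
static void toy_mmput(struct toy_mm *mm, struct toy_task *caller)
{
	assert(mm->users > 0);
	if (--mm->users == 0)
		caller->pending_cleanup = true;
}

/*
 * Async put: the final drop hands the teardown to a worker. Assumption:
 * the worker is modelled as running immediately.
 */
static void toy_mmput_async(struct toy_mm *mm)
{
	assert(mm->users > 0);
	if (--mm->users == 0)
		mm->released = true;
}

/* Replays the interleaving from the diagram; returns whether cleanup ran. */
static bool toy_scenario(bool use_async)
{
	struct toy_mm mm = { .users = 1, .released = false }; /* Task B's ref */
	struct toy_task a = { false, false };
	struct toy_task b = { false, false };

	mm.users++;			/* Task A: mmget_not_zero() */
	toy_mmput(&mm, &b);		/* Task B: do_exit() path, not the final put */
	if (use_async)
		toy_mmput_async(&mm);	/* Task A: final put, with the fix */
	else
		toy_mmput(&mm, &a);	/* Task A: final put, before the fix */
	a.blocked_in_kernel = true;	/* Task A sleeps waiting for Task B */
	toy_return_to_user(&a, &mm);	/* no-op while Task A is blocked */
	return mm.released;
}
```

Under these assumptions, toy_scenario(false) returns false because the
cleanup stays parked on the sleeping Task A, while toy_scenario(true)
returns true because the teardown no longer depends on Task A.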
Re: [PATCH v2 03/28] binder: fix race between mmput() and do_exit()
Posted by Carlos Llamas 1 year, 11 months ago
On Fri, Dec 01, 2023 at 05:21:32PM +0000, Carlos Llamas wrote:
> [...]

Sorry, I forgot to Cc: stable@vger.kernel.org.

--
Carlos Llamas
Re: [PATCH v2 03/28] binder: fix race between mmput() and do_exit()
Posted by Greg Kroah-Hartman 1 year, 11 months ago
On Thu, Jan 18, 2024 at 07:29:07PM +0000, Carlos Llamas wrote:
> Sorry, I forgot to Cc: stable@vger.kernel.org.

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree.  Please read:
    https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>
Re: [PATCH v2 03/28] binder: fix race between mmput() and do_exit()
Posted by Carlos Llamas 1 year, 11 months ago
On Fri, Jan 19, 2024 at 06:48:43AM +0100, Greg Kroah-Hartman wrote:
> On Thu, Jan 18, 2024 at 07:29:07PM +0000, Carlos Llamas wrote:
> > Sorry, I forgot to Cc: stable@vger.kernel.org.
> 
> <formletter>
> 
> This is not the correct way to submit patches for inclusion in the
> stable kernel tree.  Please read:
>     https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
> for how to do this properly.
> 
> </formletter>

Oops, here is the complete info:

Commit ID: 9a9ab0d963621d9d12199df9817e66982582d5a5
Subject:   "binder: fix race between mmput() and do_exit()"
Reason:    Fixes a race condition in binder.
Versions:  v4.19+

Note this will have a trivial conflict in v4.19 and v5.10 kernels as
commit d8ed45c5dcd4 is not there. Please let me know if I should send
those patches separately.

Thanks,
--
Carlos Llamas
Re: [PATCH v2 03/28] binder: fix race between mmput() and do_exit()
Posted by Carlos Llamas 1 year, 11 months ago
On Fri, Jan 19, 2024 at 05:06:13PM +0000, Carlos Llamas wrote:
> 
> Oops, here is the complete info:
> 
> Commit ID: 9a9ab0d963621d9d12199df9817e66982582d5a5
> Subject:   "binder: fix race between mmput() and do_exit()"
> Reason:    Fixes a race condition in binder.
> Versions:  v4.19+
> 
> Note this will have a trivial conflict in v4.19 and v5.10 kernels as
> commit d8ed45c5dcd4 is not there. Please let me know if I should send
> those patches separately.
> 
> Thanks,
> --
> Carlos Llamas

Sigh, I meant to type "conflict in v4.19 and v5.4". The patch applies
cleanly in v5.10+.
Re: [PATCH v2 03/28] binder: fix race between mmput() and do_exit()
Posted by Greg Kroah-Hartman 1 year, 11 months ago
On Fri, Jan 19, 2024 at 05:37:22PM +0000, Carlos Llamas wrote:
> Sigh, I meant to type "conflict in v4.19 and v5.4". The patch applies
> cleanly in v5.10+.

Yes, I need backported patches please.

thanks,

greg k-h
Re: [PATCH v2 03/28] binder: fix race between mmput() and do_exit()
Posted by Carlos Llamas 1 year, 11 months ago
On Sat, Jan 20, 2024 at 07:37:25AM +0100, Greg Kroah-Hartman wrote:
> On Fri, Jan 19, 2024 at 05:37:22PM +0000, Carlos Llamas wrote:
> > Sigh, I meant to type "conflict in v4.19 and v5.4". The patch applies
> > cleanly in v5.10+.
> 
> Yes, I need backported patches please.
> 
> thanks,
> 
> greg k-h

Backports have been sent.

linux-4.19.y:
https://lore.kernel.org/all/20240122174250.2123854-1-cmllamas@google.com/

linux-5.4.y:
https://lore.kernel.org/all/20240122175751.2214176-1-cmllamas@google.com/

The patch should apply cleanly to the remaining stable branches.

Thanks,
--
Carlos Llamas