mm/madvise.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
When unwinding race on -ERESTARTNOINTR handling of process_madvise(),
madvise_lock() failure is ignored. Check the failure and abort
remaining works in the case.
Fixes: 4000e3d0a367 ("mm/madvise: remove redundant mmap_lock operations from process_madvise()")
Cc: stable@kernel.org
Reported-by: Barry Song <21cnbao@gmail.com>
Closes: https://lore.kernel.org/CAGsJ_4xJXXO0G+4BizhohSZ4yDteziPw43_uF8nPXPWxUVChzw@mail.gmail.com
Signed-off-by: SeongJae Park <sj@kernel.org>
---
mm/madvise.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/mm/madvise.c b/mm/madvise.c
index 8433ac9b27e0..5f7a66a1617e 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1881,7 +1881,9 @@ static ssize_t vector_madvise(struct mm_struct *mm, struct iov_iter *iter,
/* Drop and reacquire lock to unwind race. */
madvise_finish_tlb(&madv_behavior);
madvise_unlock(mm, behavior);
- madvise_lock(mm, behavior);
+ ret = madvise_lock(mm, behavior);
+ if (ret)
+ goto out;
madvise_init_tlb(&madv_behavior, mm);
continue;
}
@@ -1892,6 +1894,7 @@ static ssize_t vector_madvise(struct mm_struct *mm, struct iov_iter *iter,
madvise_finish_tlb(&madv_behavior);
madvise_unlock(mm, behavior);
+out:
ret = (total_len - iov_iter_count(iter)) ? : ret;
return ret;
base-commit: d85ea9175e4147e15ff6e3c0e02c6c447ef473c8
--
2.39.5
On Tue, Jun 3, 2025 at 5:49 AM SeongJae Park <sj@kernel.org> wrote:
>
> When unwinding race on -ERESTARTNOINTR handling of process_madvise(),
> madvise_lock() failure is ignored. Check the failure and abort
> remaining works in the case.
>
> Fixes: 4000e3d0a367 ("mm/madvise: remove redundant mmap_lock operations from process_madvise()")
> Cc: stable@kernel.org
> Reported-by: Barry Song <21cnbao@gmail.com>
> Closes: https://lore.kernel.org/CAGsJ_4xJXXO0G+4BizhohSZ4yDteziPw43_uF8nPXPWxUVChzw@mail.gmail.com
> Signed-off-by: SeongJae Park <sj@kernel.org>
LGTM, thanks!
Reviewed-by: Barry Song <baohua@kernel.org>
> ---
> mm/madvise.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 8433ac9b27e0..5f7a66a1617e 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -1881,7 +1881,9 @@ static ssize_t vector_madvise(struct mm_struct *mm, struct iov_iter *iter,
> /* Drop and reacquire lock to unwind race. */
> madvise_finish_tlb(&madv_behavior);
> madvise_unlock(mm, behavior);
> - madvise_lock(mm, behavior);
> + ret = madvise_lock(mm, behavior);
> + if (ret)
> + goto out;
> madvise_init_tlb(&madv_behavior, mm);
> continue;
> }
> @@ -1892,6 +1894,7 @@ static ssize_t vector_madvise(struct mm_struct *mm, struct iov_iter *iter,
> madvise_finish_tlb(&madv_behavior);
> madvise_unlock(mm, behavior);
>
> +out:
> ret = (total_len - iov_iter_count(iter)) ? : ret;
>
> return ret;
>
> base-commit: d85ea9175e4147e15ff6e3c0e02c6c447ef473c8
> --
> 2.39.5
On Mon, Jun 02, 2025 at 10:49:26AM -0700, SeongJae Park wrote:
> When unwinding race on -ERESTARTNOINTR handling of process_madvise(),
> madvise_lock() failure is ignored. Check the failure and abort
> remaining works in the case.
>
> Fixes: 4000e3d0a367 ("mm/madvise: remove redundant mmap_lock operations from process_madvise()")
> Cc: stable@kernel.org
> Reported-by: Barry Song <21cnbao@gmail.com>
> Closes: https://lore.kernel.org/CAGsJ_4xJXXO0G+4BizhohSZ4yDteziPw43_uF8nPXPWxUVChzw@mail.gmail.com
> Signed-off-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
On 02.06.25 19:49, SeongJae Park wrote:
> When unwinding race on -ERESTARTNOINTR handling of process_madvise(),
> madvise_lock() failure is ignored. Check the failure and abort
> remaining works in the case.
>
> Fixes: 4000e3d0a367 ("mm/madvise: remove redundant mmap_lock operations from process_madvise()")
> Cc: stable@kernel.org
> Reported-by: Barry Song <21cnbao@gmail.com>
> Closes: https://lore.kernel.org/CAGsJ_4xJXXO0G+4BizhohSZ4yDteziPw43_uF8nPXPWxUVChzw@mail.gmail.com
> Signed-off-by: SeongJae Park <sj@kernel.org>
> ---
> mm/madvise.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 8433ac9b27e0..5f7a66a1617e 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -1881,7 +1881,9 @@ static ssize_t vector_madvise(struct mm_struct *mm, struct iov_iter *iter,
> /* Drop and reacquire lock to unwind race. */
> madvise_finish_tlb(&madv_behavior);
> madvise_unlock(mm, behavior);
> - madvise_lock(mm, behavior);
> + ret = madvise_lock(mm, behavior);
> + if (ret)
> + goto out;
> madvise_init_tlb(&madv_behavior, mm);
> continue;
> }
> @@ -1892,6 +1894,7 @@ static ssize_t vector_madvise(struct mm_struct *mm, struct iov_iter *iter,
> madvise_finish_tlb(&madv_behavior);
> madvise_unlock(mm, behavior);
>
> +out:
> ret = (total_len - iov_iter_count(iter)) ? : ret;
Acked-by: David Hildenbrand <david@redhat.com>
--
Cheers,
David / dhildenb
On Mon, Jun 02, 2025 at 10:49:26AM -0700, SeongJae Park wrote:
> When unwinding race on -ERESTARTNOINTR handling of process_madvise(),
> madvise_lock() failure is ignored. Check the failure and abort
> remaining works in the case.
>
> Fixes: 4000e3d0a367 ("mm/madvise: remove redundant mmap_lock operations from process_madvise()")
> Cc: stable@kernel.org
> Reported-by: Barry Song <21cnbao@gmail.com>
> Closes: https://lore.kernel.org/CAGsJ_4xJXXO0G+4BizhohSZ4yDteziPw43_uF8nPXPWxUVChzw@mail.gmail.com
> Signed-off-by: SeongJae Park <sj@kernel.org>
I said it in reply to Jann's ping to Andrew, but to repeat here - I don't think
this actually has any impact in reality for MADV_GUARD_INSTALL / REMOVE, since
those only require read locks, which causes madvise_lock() to always succeed.
However we shouldn't be ignore return values and so it is right we ostensibly
handle this (who knows what might change in future, etc.)
So:
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> ---
> mm/madvise.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/mm/madvise.c b/mm/madvise.c
> index 8433ac9b27e0..5f7a66a1617e 100644
> --- a/mm/madvise.c
> +++ b/mm/madvise.c
> @@ -1881,7 +1881,9 @@ static ssize_t vector_madvise(struct mm_struct *mm, struct iov_iter *iter,
> /* Drop and reacquire lock to unwind race. */
> madvise_finish_tlb(&madv_behavior);
> madvise_unlock(mm, behavior);
> - madvise_lock(mm, behavior);
> + ret = madvise_lock(mm, behavior);
> + if (ret)
> + goto out;
> madvise_init_tlb(&madv_behavior, mm);
> continue;
> }
> @@ -1892,6 +1894,7 @@ static ssize_t vector_madvise(struct mm_struct *mm, struct iov_iter *iter,
> madvise_finish_tlb(&madv_behavior);
> madvise_unlock(mm, behavior);
>
> +out:
> ret = (total_len - iov_iter_count(iter)) ? : ret;
>
> return ret;
>
> base-commit: d85ea9175e4147e15ff6e3c0e02c6c447ef473c8
> --
> 2.39.5
@akpm FYI, this looks like it fixes a security bug in 6.15 (probably
leads to UAF of VMA structs and page tables by racing madvise(...,
MADV_GUARD_INSTALL) with concurrent faults)
On Mon, Jun 2, 2025 at 7:49 PM SeongJae Park <sj@kernel.org> wrote:
> When unwinding race on -ERESTARTNOINTR handling of process_madvise(),
> madvise_lock() failure is ignored. Check the failure and abort
> remaining works in the case.
>
> Fixes: 4000e3d0a367 ("mm/madvise: remove redundant mmap_lock operations from process_madvise()")
> Cc: stable@kernel.org
> Reported-by: Barry Song <21cnbao@gmail.com>
> Closes: https://lore.kernel.org/CAGsJ_4xJXXO0G+4BizhohSZ4yDteziPw43_uF8nPXPWxUVChzw@mail.gmail.com
> Signed-off-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Jann Horn <jannh@google.com>
On Mon, Jun 02, 2025 at 09:20:14PM +0200, Jann Horn wrote:
> @akpm FYI, this looks like it fixes a security bug in 6.15 (probably
> leads to UAF of VMA structs and page tables by racing madvise(...,
> MADV_GUARD_INSTALL) with concurrent faults)
Hmm MADV_GUARD_INSTALL / MADV_GUARD_REMOVE require only a read lock, so
madvise_lock() will be:
if (madvise_need_mmap_write(behavior)) { <--- nope
if (mmap_write_lock_killable(mm))
return -EINTR;
} else {
mmap_read_lock(mm); <---- this branch
}
return 0;
So for guard install, which is the only thing that can return -ERESTARTNOINTR
madvise_lock() ignoring the return value is essentially a no-op no?
Am I missing something?
>
> On Mon, Jun 2, 2025 at 7:49 PM SeongJae Park <sj@kernel.org> wrote:
> > When unwinding race on -ERESTARTNOINTR handling of process_madvise(),
> > madvise_lock() failure is ignored. Check the failure and abort
> > remaining works in the case.
> >
> > Fixes: 4000e3d0a367 ("mm/madvise: remove redundant mmap_lock operations from process_madvise()")
> > Cc: stable@kernel.org
> > Reported-by: Barry Song <21cnbao@gmail.com>
> > Closes: https://lore.kernel.org/CAGsJ_4xJXXO0G+4BizhohSZ4yDteziPw43_uF8nPXPWxUVChzw@mail.gmail.com
> > Signed-off-by: SeongJae Park <sj@kernel.org>
>
> Reviewed-by: Jann Horn <jannh@google.com>
On Mon, Jun 2, 2025 at 9:28 PM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
> On Mon, Jun 02, 2025 at 09:20:14PM +0200, Jann Horn wrote:
> > @akpm FYI, this looks like it fixes a security bug in 6.15 (probably
> > leads to UAF of VMA structs and page tables by racing madvise(...,
> > MADV_GUARD_INSTALL) with concurrent faults)
>
> Hmm MADV_GUARD_INSTALL / MADV_GUARD_REMOVE require only a read lock, so
> madvise_lock() will be:
>
>
> if (madvise_need_mmap_write(behavior)) { <--- nope
> if (mmap_write_lock_killable(mm))
> return -EINTR;
> } else {
> mmap_read_lock(mm); <---- this branch
> }
> return 0;
>
> So for guard install, which is the only thing that can return -ERESTARTNOINTR
> madvise_lock() ignoring the return value is essentially a no-op no?
>
> Am I missing something?
... you're right, of course. please ignore my needlessly alarmist comment.
(I think it is surprising that the write lock is killable while the
read lock isn't but that's another story)
On Mon, Jun 02, 2025 at 09:34:55PM +0200, Jann Horn wrote:
> On Mon, Jun 2, 2025 at 9:28 PM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> > On Mon, Jun 02, 2025 at 09:20:14PM +0200, Jann Horn wrote:
> > > @akpm FYI, this looks like it fixes a security bug in 6.15 (probably
> > > leads to UAF of VMA structs and page tables by racing madvise(...,
> > > MADV_GUARD_INSTALL) with concurrent faults)
> >
> > Hmm MADV_GUARD_INSTALL / MADV_GUARD_REMOVE require only a read lock, so
> > madvise_lock() will be:
> >
> >
> > if (madvise_need_mmap_write(behavior)) { <--- nope
> > if (mmap_write_lock_killable(mm))
> > return -EINTR;
> > } else {
> > mmap_read_lock(mm); <---- this branch
> > }
> > return 0;
> >
> > So for guard install, which is the only thing that can return -ERESTARTNOINTR
> > madvise_lock() ignoring the return value is essentially a no-op no?
> >
> > Am I missing something?
>
> ... you're right, of course. please ignore my needlessly alarmist comment.
>
> (I think it is surprising that the write lock is killable while the
> read lock isn't but that's another story)
>
Blood pressure drops :P it's still a good spot, we should handle this because in
future we may change this behaviour and we mustn't ignore this kind of code
path.
What was that you were saying on fedi about killable locks ;) a source of pain
indeed :)
© 2016 - 2025 Red Hat, Inc.