arch/x86/include/asm/mmu_context.h | 8 +++++++- arch/x86/include/asm/tlbflush.h | 9 ++++----- arch/x86/kernel/process_64.c | 25 ++++++++++++++++--------- arch/x86/mm/tlb.c | 15 ++++++++------- 4 files changed, 35 insertions(+), 22 deletions(-)
This series has fixups and cleanups for LAM. Most importantly, patch 1 fixes a sycnhronization issue that may cause crashes of userspace applications. This is a resend of v3, rebased on top of v6.10-rc6. v3: https://lore.kernel.org/lkml/20240418012835.3360429-1-yosryahmed@google.com/ v2: https://lore.kernel.org/lkml/20240312155641.4003683-1-yosryahmed@google.com/ v1: https://lore.kernel.org/lkml/20240312035951.3535980-1-yosryahmed@google.com/ RFC: https://lore.kernel.org/lkml/20240307133916.3782068-1-yosryahmed@google.com/ Yosry Ahmed (3): x86/mm: Use IPIs to synchronize LAM enablement x86/mm: Fix LAM inconsistency during context switch x86/mm: Cleanup prctl_enable_tagged_addr() nr_bits error checking arch/x86/include/asm/mmu_context.h | 8 +++++++- arch/x86/include/asm/tlbflush.h | 9 ++++----- arch/x86/kernel/process_64.c | 25 ++++++++++++++++--------- arch/x86/mm/tlb.c | 15 ++++++++------- 4 files changed, 35 insertions(+), 22 deletions(-) -- 2.45.2.803.g4e1b14247a-goog
On Tue, 2 Jul 2024 13:21:36 +0000 Yosry Ahmed <yosryahmed@google.com> wrote: > This series has fixups and cleanups for LAM. Most importantly, patch 1 > fixes a sycnhronization issue that may cause crashes of userspace > applications. This is a resend of v3, rebased on top of v6.10-rc6. "Crashes of userspace applications" is bad. Yet the patchset has been floating about for four months. It's unclear (to me) how serious this is. Can you please explain how common this is, what the userspace application needs to do to trigger this, etc?
On Tue, Jul 2, 2024 at 10:36 AM Andrew Morton <akpm@linux-foundation.org> wrote: > > On Tue, 2 Jul 2024 13:21:36 +0000 Yosry Ahmed <yosryahmed@google.com> wrote: > > > This series has fixups and cleanups for LAM. Most importantly, patch 1 > > fixes a sycnhronization issue that may cause crashes of userspace > > applications. This is a resend of v3, rebased on top of v6.10-rc6. > > "Crashes of userspace applications" is bad. Yet the patchset has been > floating about for four months. > > It's unclear (to me) how serious this is. Can you please explain how > common this is, what the userspace application needs to do to trigger > this, etc? I don't think it would be common. The bug only happens on new hardware supporting LAM, and it happens in a specific scenario where a userspace task enables LAM while a kthread is using (borrowing) its mm_struct on another CPU. So it is possible but I certainly wouldn't call it common or easily triggerable.
On Tue, 2 Jul 2024 10:39:03 -0700 Yosry Ahmed <yosryahmed@google.com> wrote:
> On Tue, Jul 2, 2024 at 10:36 AM Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > On Tue, 2 Jul 2024 13:21:36 +0000 Yosry Ahmed <yosryahmed@google.com> wrote:
> >
> > > This series has fixups and cleanups for LAM. Most importantly, patch 1
> > > fixes a sycnhronization issue that may cause crashes of userspace
> > > applications. This is a resend of v3, rebased on top of v6.10-rc6.
> >
> > "Crashes of userspace applications" is bad. Yet the patchset has been
> > floating about for four months.
> >
> > It's unclear (to me) how serious this is. Can you please explain how
> > common this is, what the userspace application needs to do to trigger
> > this, etc?
>
> I don't think it would be common. The bug only happens on new hardware
> supporting LAM, and it happens in a specific scenario where a
> userspace task enables LAM while a kthread is using (borrowing) its
> mm_struct on another CPU.
>
> So it is possible but I certainly wouldn't call it common or easily triggerable.
But when people run older (or current) kernels on newer hardware, they
will hit this. So a backport to cover 82721d8b25d7 ("x86/mm: Handle
LAM on context switch") is needed.
The series doesn't seem to be getting much traction so I can add it to
mm.git's mm-unstable branch for wider testing, but it's clearly an x86
tree thing.
On 7/2/24 11:35, Andrew Morton wrote:
> But when people run older (or current) kernels on newer hardware, they
> will hit this. So a backport to cover 82721d8b25d7 ("x86/mm: Handle
> LAM on context switch") is needed.
>
> The series doesn't seem to be getting much traction so I can add it to
> mm.git's mm-unstable branch for wider testing, but it's clearly an x86
> tree thing.
I was really hoping Andy L would look at this since he suggested this
whole thing really.
I completely agree that this needs some wider testing. How about I pull
it into x86/mm so it gets some linux-next testing instead of having it
in mm-unstable? Maybe it'll also attract Andy's attention once it's in
there.
On Tue, Jul 2, 2024 at 11:38 AM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 7/2/24 11:35, Andrew Morton wrote:
> > But when people run older (or current) kernels on newer hardware, they
> > will hit this. So a backport to cover 82721d8b25d7 ("x86/mm: Handle
> > LAM on context switch") is needed.
> >
> > The series doesn't seem to be getting much traction so I can add it to
> > mm.git's mm-unstable branch for wider testing, but it's clearly an x86
> > tree thing.
>
> I was really hoping Andy L would look at this since he suggested this
> whole thing really.
>
> I completely agree that this needs some wider testing. How about I pull
> it into x86/mm so it gets some linux-next testing instead of having it
> in mm-unstable? Maybe it'll also attract Andy's attention once it's in
> there.
That would be great. Thanks Dave!
© 2016 - 2025 Red Hat, Inc.