This patch fixes long-standing user-space crashes on Alpha systems
when memory compaction is enabled.

Observed symptoms include:
  - sporadic SIGSEGV in unrelated user programs
  - glibc allocator failures (e.g. "unaligned tcache chunk detected")
  - gcc "internal compiler error"
  - heap corruption detected by malloc consistency checks

The failures occur only when page migration / compaction is active
and disappear when compaction is disabled. They affect both UP and
SMP kernels and are not specific to a particular Alpha CPU model.

Root cause
==========

Alpha relies on Address Space Numbers (ASNs) for user-space
instruction cache coherency. Existing TLB shootdown paths during
page migration primarily depend on MM context rollover, with lazy
invalidation of translations on CPUs not actively running the MM.

This approach is insufficient during page migration. Migration
creates a window where stale data or instruction translations can
survive long enough for a CPU to perform loads or stores using the
wrong physical page. This leads to silent user-space memory
corruption that later manifests as crashes.

Testing shows that the corruption is triggered during the unmap
phase of migration. Installing the fix in ptep_clear_flush() is
sufficient. No additional handling is required when installing the
new mapping.

Instruction barriers were evaluated during debugging but were found
not to be required. Immediate TLB invalidation combined with ASN
rollover is sufficient to prevent stale instruction and data access.

Solution
========

This patch introduces a migration-specific TLB flush helper that
combines:
  - MM context invalidation (ASN rollover),
  - immediate per-CPU TLB invalidation,
  - synchronous cross-CPU shootdown.

The helper is used only by the page migration / compaction unmap
path, leaving normal TLB semantics unchanged for other VM operations.
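For readers following the thread, the difference between deferred and
immediate invalidation can be illustrated with a toy user-space model.
This is only a sketch under stated assumptions and is not the code in
arch/alpha/mm/tlbflush.c: every name below (struct machine, translate,
lazy_flush, migration_flush, pfn_seen_by_remote_cpu) is hypothetical,
each TLB is reduced to a small direct-mapped array, and the model makes
no claim about the exact failure mechanism on real hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy model only: all types and functions here are hypothetical. */
enum { NCPUS = 2, TLB_SLOTS = 4, NPAGES = 4, NO_PAGE = -1 };

/* One cached translation: (ASN, virtual page) -> physical frame. */
struct tlb_entry { bool valid; unsigned asn; int vpn; int pfn; };

struct machine {
    struct tlb_entry tlb[NCPUS][TLB_SLOTS]; /* one tiny TLB per CPU */
    unsigned next_asn[NCPUS];               /* per-CPU ASN allocator */
};

struct mm {
    unsigned asn[NCPUS]; /* 0 means "no ASN allocated on this CPU yet" */
    int pte[NPAGES];     /* page table: vpn -> pfn, NO_PAGE if unmapped */
};

static void mm_init(struct mm *mm)
{
    memset(mm, 0, sizeof(*mm));
    for (int i = 0; i < NPAGES; i++)
        mm->pte[i] = NO_PAGE;
}

/* Give the mm a fresh ASN on one CPU; entries tagged with the old ASN
 * no longer match any lookup made on behalf of this mm. */
static void asn_rollover(struct machine *m, struct mm *mm, int cpu)
{
    mm->asn[cpu] = ++m->next_asn[cpu];
}

/* Translate vpn on the given CPU. A TLB hit never consults the page table. */
static int translate(struct machine *m, struct mm *mm, int cpu, int vpn)
{
    assert(cpu >= 0 && cpu < NCPUS && vpn >= 0 && vpn < NPAGES);
    if (mm->asn[cpu] == 0)
        asn_rollover(m, mm, cpu);
    struct tlb_entry *e = &m->tlb[cpu][vpn % TLB_SLOTS];
    if (e->valid && e->asn == mm->asn[cpu] && e->vpn == vpn)
        return e->pfn;                 /* hit, possibly stale */
    if (mm->pte[vpn] == NO_PAGE)
        return NO_PAGE;                /* miss on an unmapped page: would fault */
    *e = (struct tlb_entry){ true, mm->asn[cpu], vpn, mm->pte[vpn] };
    return e->pfn;
}

/* Deferred scheme: only the calling CPU rolls its ASN over now. Other CPUs
 * are assumed to catch up later (not modeled), so their entries survive. */
static void lazy_flush(struct machine *m, struct mm *mm, int self)
{
    asn_rollover(m, mm, self);
}

/* Migration-style flush: on every CPU, drop the mm's entries immediately
 * and roll the ASN over, before the caller continues. */
static void migration_flush(struct machine *m, struct mm *mm)
{
    for (int cpu = 0; cpu < NCPUS; cpu++) {
        for (int i = 0; i < TLB_SLOTS; i++) {
            struct tlb_entry *e = &m->tlb[cpu][i];
            if (e->valid && e->asn == mm->asn[cpu])
                e->valid = false;
        }
        asn_rollover(m, mm, cpu);
    }
}

/* Scenario: CPU 1 caches vpn 0 -> pfn 10, CPU 0 unmaps the page and
 * flushes. Returns what CPU 1 sees on its next access to vpn 0. */
static int pfn_seen_by_remote_cpu(bool synchronous)
{
    struct machine m;
    struct mm mm;

    memset(&m, 0, sizeof(m));
    mm_init(&mm);
    mm.pte[0] = 10;
    (void)translate(&m, &mm, 1, 0);  /* CPU 1 fills its TLB */
    mm.pte[0] = NO_PAGE;             /* unmap phase, performed on CPU 0 */
    if (synchronous)
        migration_flush(&m, &mm);
    else
        lazy_flush(&m, &mm, 0);
    return translate(&m, &mm, 1, 0);
}
```

In this model, pfn_seen_by_remote_cpu(false) still returns the old
frame 10 after the unmap, while pfn_seen_by_remote_cpu(true) returns
NO_PAGE, meaning the access would fault and be retried against the
new mapping. That is the property the three steps listed above are
meant to guarantee.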
Summary
=======

This patch fixes real user-visible corruption bugs during page
migration on Alpha by making TLB shootdowns migration-safe, without
impacting non-migration code paths.

Thanks for taking a look.

Magnus Lindholm (1):
  alpha: fix user-space corruption during memory compaction

 arch/alpha/include/asm/pgtable.h  |  33 ++++++++-
 arch/alpha/include/asm/tlbflush.h |   4 +-
 arch/alpha/mm/Makefile            |   2 +-
 arch/alpha/mm/tlbflush.c          | 112 ++++++++++++++++++++++++++++++
 4 files changed, 148 insertions(+), 3 deletions(-)
 create mode 100644 arch/alpha/mm/tlbflush.c

-- 
2.51.0
Hi Magnus,

On Fri, 2026-01-02 at 18:30 +0100, Magnus Lindholm wrote:
> This patch fixes long-standing user-space crashes on Alpha systems
> when memory compaction is enabled.
>
> Observed symptoms include:
>   - sporadic SIGSEGV in unrelated user programs
>   - glibc allocator failures (e.g. "unaligned tcache chunk detected")
>   - gcc "internal compiler error"
>   - heap corruption detected by malloc consistency checks
>
> The failures occur only when page migration / compaction is active
> and disappear when compaction is disabled. They affect both UP and
> SMP kernels and are not specific to a particular Alpha CPU model.

Wow, thanks for fixing this! This has been indeed a longstanding issue and
seeing it fixed would be great.

I'm CC'ing Michael Cree who has been observing the issue as well and could
help testing your series.

I'll try to test your series as well.

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer
`. `'   Physicist
  `-    GPG: 62FF 8A75 84E0 2956 9546  0006 7426 3B37 F5B5 F913
On Fri, Jan 02, 2026 at 06:57:28PM +0100, John Paul Adrian Glaubitz wrote:
> Hi Magnus,
>
> On Fri, 2026-01-02 at 18:30 +0100, Magnus Lindholm wrote:
> > This patch fixes long-standing user-space crashes on Alpha systems
> > when memory compaction is enabled.
> >
> > Observed symptoms include:
> >   - sporadic SIGSEGV in unrelated user programs
> >   - glibc allocator failures (e.g. "unaligned tcache chunk detected")
> >   - gcc "internal compiler error"
> >   - heap corruption detected by malloc consistency checks
> >
> > The failures occur only when page migration / compaction is active
> > and disappear when compaction is disabled. They affect both UP and
> > SMP kernels and are not specific to a particular Alpha CPU model.
>
> Wow, thanks for fixing this! This has been indeed a longstanding issue and
> seeing it fixed would be great.
>
> I'm CC'ing Michael Cree who has been observing the issue as well and could
> help testing your series.

I've successfully run a 6.18.3 kernel with this patch and
CONFIG_COMPACTION on for almost four days on an XP1000 without any
problem. I would normally see one of the observed problems described
above within a day, and certainly within two days, when running with
a bad kernel, so this is looking good.

Feel free to add:

Tested-by: Michael Cree <mcree@orcon.net.nz>

Cheers,
Michael.