[PATCH v10 00/15] x86: Enable Linear Address Space Separation support

Sohil Mehta posted 15 patches 11 hours ago
.../admin-guide/kernel-parameters.txt         |   4 +-
arch/x86/Kconfig.cpufeatures                  |   4 +
arch/x86/entry/vsyscall/vsyscall_64.c         |  83 +++++++-----
arch/x86/include/asm/cpufeatures.h            |   1 +
arch/x86/include/asm/smap.h                   |  35 ++++-
arch/x86/include/asm/string.h                 |  26 ++++
arch/x86/include/asm/vsyscall.h               |  13 +-
arch/x86/include/uapi/asm/processor-flags.h   |   2 +
arch/x86/kernel/alternative.c                 |  18 ++-
arch/x86/kernel/cpu/common.c                  |  30 ++--
arch/x86/kernel/cpu/cpuid-deps.c              |   1 +
arch/x86/kernel/dumpstack.c                   |   6 +-
arch/x86/kernel/relocate_kernel_64.S          |   7 +-
arch/x86/kernel/traps.c                       | 128 +++++++++++++-----
arch/x86/kernel/umip.c                        |   3 +
arch/x86/mm/fault.c                           |   2 +-
arch/x86/platform/efi/efi.c                   |  14 +-
17 files changed, 284 insertions(+), 93 deletions(-)
Posted by Sohil Mehta 11 hours ago
Linear Address Space Separation (LASS) is a security feature [1] that
works pre-paging to prevent a class of side-channel attacks that rely on
speculative access across the user/kernel boundary.

Change of personnel
-------------------
I am picking up this series from Kiryl. The patches have switched hands
multiple times over the last couple of years. I have refreshed the
commit tags since most of the patches have gone around a full circle.

Many thanks to Kiryl and Alex for taking these patches forward. Would
highly appreciate your review tags on the updated series.

Changes in v10
--------------
- Use the simplified versions of inline memcpy/memset (patch 2)
- New patch to fix an issue during kexec relocate kernel (patch 7)
- Dropped the LAM re-enabling patch (will post separately)
- Reworded some of the commit messages
- Minor updates to code formatting and code comments

v9: https://lore.kernel.org/lkml/20250707080317.3791624-1-kirill.shutemov@linux.intel.com/

Patch structure
---------------
Patch     1: Enumerate LASS
Patch   2-3: Update text poking
Patch   4-5: CR pinning changes
Patch   6-7: Update EFI and kexec flows
Patch  8-11: Vsyscall impact
Patch 12-14: LASS hints during #GP and #SS
Patch    15: Enable LASS

The series is maturing, as reflected by the limited incremental changes.
Please consider providing review tags/acks for patches that seem ready.

Background
----------
Privilege mode based access protection already exists today with paging
and features such as SMEP and SMAP. However, to enforce these
protections, the processor must traverse the paging structures in
memory.  An attacker can use timing information resulting from this
traversal to determine details about the paging structures, and to
determine the layout of the kernel memory.

The LASS mechanism provides the same mode-based protections as paging,
but without traversing the paging structures. Because the protections
enforced by LASS are applied before paging, an attacker will not be able
to derive timing information from the various caching structures such as
the TLBs, mid-level caches, page walkers, data caches, etc. LASS can
prevent probing using double page faults, TLB flush and reload, and
software prefetch instructions. See [2], [3], and [4] for research
on the related attack vectors.

Though LASS was developed in response to Meltdown, in hindsight, it
alone could have mitigated Meltdown had it been available. In addition,
LASS prevents an attack vector targeting Linear Address Masking (LAM)
described in the Spectre LAM (SLAM) whitepaper [5].

LASS enforcement relies on the typical kernel implementation dividing
the 64-bit virtual address space into two halves:
  Addr[63]=0 -> User address space
  Addr[63]=1 -> Kernel address space
Any data access or code execution across address spaces typically
results in a #GP, with an #SS generated in some rare cases.
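
The split above can be sketched in plain C. This is only a userspace model of
the basic rule, not code from the series: the helper names
addr_in_kernel_half() and lass_crosses_boundary() are hypothetical, and the
RFLAGS.AC and CR4.LASS exceptions are deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when Addr[63] is set (kernel half). */
static bool addr_in_kernel_half(uint64_t addr)
{
	return (addr >> 63) & 1;
}

/*
 * Hypothetical model of the basic LASS rule: an access is flagged when the
 * privilege mode and the address half disagree, i.e. user mode touching
 * the kernel half or kernel mode touching the user half.
 */
static bool lass_crosses_boundary(bool user_mode, uint64_t addr)
{
	return user_mode == addr_in_kernel_half(addr);
}
```

For instance, lass_crosses_boundary(true, 0xffffffffff600000) is true because
user mode is touching the upper half, while the same call with 0x100000 is
false.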

Kernel accesses
---------------
When there are valid reasons for the kernel to access memory in the user
half, it can temporarily suspend LASS enforcement by toggling the
RFLAGS.AC bit. Most of these cases are already covered today through the
stac()/clac() pairs, which avoid SMAP violations. However, there are
kernel usages, such as text poking, that access mappings (!_PAGE_USER)
in the lower half of the address space. LASS-specific AC bit toggling is
added for these cases.

There are a couple of cases where instruction fetches are done from a
lower address. Toggling the AC bit is not sufficient here because it
only manages data accesses. Therefore, CR4.LASS is modified in the case
of EFI set_virtual_address_map() and kexec relocate_kernel() to avoid
LASS violations. To let EFI modify CR4 during boot, CR pinning
enforcement is deferred until late_initcall().
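
As a rough illustration of why the AC bit covers data accesses but not
instruction fetches, the following userspace model may help. It is a sketch
under stated assumptions, not kernel code: struct cpu_model, enum access_kind
and kernel_access_faults() are made-up names, and SMAP is assumed to be
enabled.

```c
#include <stdbool.h>
#include <stdint.h>

enum access_kind { ACCESS_DATA, ACCESS_FETCH };

/* Hypothetical snapshot of the two control bits discussed above. */
struct cpu_model {
	bool cr4_lass;	/* CR4.LASS */
	bool rflags_ac;	/* RFLAGS.AC, as toggled by stac()/clac() */
};

/*
 * Model of a kernel-mode access. Returns true if the access would be
 * treated as a LASS violation under the simplified rules in this sketch.
 */
static bool kernel_access_faults(const struct cpu_model *cpu,
				 enum access_kind kind, uint64_t addr)
{
	bool user_half = !(addr >> 63);

	/* Nothing to enforce if LASS is off or the target is the kernel half. */
	if (!cpu->cr4_lass || !user_half)
		return false;

	/* Instruction fetches ignore AC; only clearing CR4.LASS helps. */
	if (kind == ACCESS_FETCH)
		return true;

	/* Data accesses are allowed while AC is set. */
	return !cpu->rflags_ac;
}
```

In this model, setting rflags_ac lets a data access to 0x100000 through, but a
fetch from the same address still faults until cr4_lass is cleared, which
matches the reasoning given for the EFI and kexec paths.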

Exception handling
------------------
With LASS enabled, NULL pointer dereferences generate a #GP instead of a
#PF. Due to the limited error information available during #GP, some of
the helpful hints would no longer be printed. The patches enhance the
#GP address decoding logic to identify LASS violations and NULL pointer
exceptions.

For example, two invalid userspace accesses would generate:
#PF (without LASS):
  BUG: kernel NULL pointer dereference, address: 0000000000000000
  BUG: unable to handle page fault for address: 0000000000100000

#GP (with LASS):
  Oops: general protection fault, kernel NULL pointer dereference 0x0: 0000
  Oops: general protection fault, probably LASS violation for address 0x100000: 0000

Similar debug hints are added for the #SS handling as well.
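
The two hints in the example could be derived from a decoded address roughly
as follows. This is a simplified userspace sketch, not the logic from the
patches: enum gp_hint, MODEL_PAGE_SIZE and classify_gp_address() are
hypothetical names, a 4 KiB page is assumed, and the canonical-address checks
a real decoder would need are omitted.

```c
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096ULL	/* assumption: 4 KiB pages */

enum gp_hint {
	HINT_NONE,
	HINT_NULL_POINTER,
	HINT_LASS_VIOLATION,
};

/*
 * Hypothetical classifier for an address decoded from the faulting
 * instruction during a kernel-mode #GP with LASS enabled.
 */
static enum gp_hint classify_gp_address(uint64_t addr)
{
	/* Addresses in the first page are reported as NULL dereferences. */
	if (addr < MODEL_PAGE_SIZE)
		return HINT_NULL_POINTER;

	/* Any other user-half address is probably a LASS violation. */
	if (!(addr >> 63))
		return HINT_LASS_VIOLATION;

	return HINT_NONE;
}
```

With the addresses from the example, 0x0 maps to HINT_NULL_POINTER and
0x100000 maps to HINT_LASS_VIOLATION.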

Userspace accesses
------------------
Userspace attempts to access any kernel address generate a #GP when LASS
is enabled. Unfortunately, legacy vsyscall functions are located in the
address range 0xffffffffff600000 - 0xffffffffff601000. Prior to LASS,
default access (XONLY) to the vsyscall page would generate a page fault
and the access would be emulated in the kernel. To avoid breaking user
applications when LASS is enabled, the patches extend vsyscall emulation
in XONLY mode to the #GP handler.
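
To make the address range concrete, here is a small userspace sketch of
mapping an address in the vsyscall page to a vsyscall slot. It is modeled on
the slot layout the existing emulation uses (three entries spaced 0x400
apart), but the names VSYSCALL_BASE and vsyscall_nr_from_addr() are made up
for this example. Since a #GP carries no faulting address, a handler would
presumably feed it the instruction pointer; that is an assumption, not
something stated above.

```c
#include <stdint.h>

#define VSYSCALL_BASE 0xffffffffff600000ULL

/*
 * Hypothetical helper: returns the vsyscall slot (0, 1 or 2) for an exact
 * entry-point address, or -1 if the address is not a valid entry point.
 */
static int vsyscall_nr_from_addr(uint64_t addr)
{
	int nr;

	/* Only bits 10-11 may differ from the base address. */
	if ((addr & ~0xC00ULL) != VSYSCALL_BASE)
		return -1;

	nr = (int)((addr & 0xC00ULL) >> 10);

	/* Only three slots are defined in the legacy page. */
	return nr < 3 ? nr : -1;
}
```

Any address that is not exactly one of the three entry points, including
other offsets inside the page, is rejected by this check.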

In contrast, the vsyscall EMULATE mode is deprecated and not expected to
be used by anyone. Supporting EMULATE mode with LASS would require
complex instruction decoding in the #GP fault handler, which is probably
not worth the effort. For now, LASS is disabled in the rare case when
someone absolutely needs to enable vsyscall=emulate via the command
line.

Links
-----
[1]: "Linear-Address Pre-Processing", Intel SDM (June 2025), Vol 3, Chapter 4.
[2]: "Practical Timing Side Channel Attacks against Kernel Space ASLR", https://www.ieee-security.org/TC/SP2013/papers/4977a191.pdf
[3]: "Prefetch Side-Channel Attacks: Bypassing SMAP and Kernel ASLR", http://doi.acm.org/10.1145/2976749.2978356
[4]: "Harmful prefetch on Intel", https://ioactive.com/harmful-prefetch-on-intel/ (H/T Anders)
[5]: "Spectre LAM", https://download.vusec.net/papers/slam_sp24.pdf


Alexander Shishkin (2):
  x86/efi: Disable LASS while mapping the EFI runtime services
  x86/traps: Communicate a LASS violation in #GP message

Kirill A. Shutemov (2):
  x86/traps: Generalize #GP address decode and hint code
  x86/traps: Provide additional hints for a kernel stack segment fault

Peter Zijlstra (Intel) (1):
  x86/asm: Introduce inline memcpy and memset

Sohil Mehta (9):
  x86/cpu: Enumerate the LASS feature bits
  x86/alternatives: Disable LASS when patching kernel alternatives
  x86/cpu: Defer CR pinning enforcement until late_initcall()
  x86/kexec: Disable LASS during relocate kernel
  x86/vsyscall: Reorganize the page fault emulation code
  x86/traps: Consolidate user fixups in exc_general_protection()
  x86/vsyscall: Add vsyscall emulation for #GP
  x86/vsyscall: Disable LASS if vsyscall mode is set to EMULATE
  x86/cpu: Enable LASS by default during CPU initialization

Yian Chen (1):
  x86/cpu: Set LASS CR4 bit as pinning sensitive


base-commit: fd94619c43360eb44d28bd3ef326a4f85c600a07
-- 
2.43.0
Re: [PATCH v10 00/15] x86: Enable Linear Address Space Separation support
Posted by Edgecombe, Rick P 2 hours ago
On Mon, 2025-10-06 at 23:51 -0700, Sohil Mehta wrote:
> > Userspace accesses
> > ------------------
> > Userspace attempts to access any kernel address generate a #GP when LASS
> > is enabled. Unfortunately, legacy vsyscall functions are located in the
> > address range 0xffffffffff600000 - 0xffffffffff601000. Prior to LASS,
> > default access (XONLY) to the vsyscall page would generate a page fault
> > and the access would be emulated in the kernel. To avoid breaking user
> > applications when LASS is enabled, the patches extend vsyscall emulation
> > in XONLY mode to the #GP handler.
> > 
> > In contrast, the vsyscall EMULATE mode is deprecated and not expected to
> > be used by anyone. Supporting EMULATE mode with LASS would require
> > complex instruction decoding in the #GP fault handler, which is probably
> > not worth the effort. For now, LASS is disabled in the rare case when
> > someone absolutely needs to enable vsyscall=emulate via the command
> > line.

There is also an expected harmless UABI change around SIGSEGV. For a user mode
access to the kernel address range, the kernel can deliver a signal that provides
the exception type and the address. Before, it was a #PF; now it is a #GP with no
address.
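
One way to observe what userspace sees is to catch the SIGSEGV and record the
siginfo fields. The following is a minimal sketch assuming Linux and POSIX
signals; struct fault_report and probe_read() are hypothetical names, and the
probe is not async-signal-safe beyond what siglongjmp() allows. Which si_code
and si_addr values show up depends on the kernel and CPU; my understanding is
that a #GP-originated SIGSEGV typically carries SI_KERNEL with a null si_addr,
but treat that as an assumption to verify.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record of what the signal handler observed. */
struct fault_report {
	int signo;
	int code;	/* copy of siginfo_t si_code */
	void *addr;	/* copy of siginfo_t si_addr */
};

static sigjmp_buf probe_env;
static volatile struct fault_report last_fault;

static void on_segv(int signo, siginfo_t *info, void *ucontext)
{
	(void)ucontext;
	last_fault.signo = signo;
	last_fault.code = info->si_code;
	last_fault.addr = info->si_addr;
	siglongjmp(probe_env, 1);
}

/*
 * Reads one byte at addr. Returns 1 if the read raised SIGSEGV (filling
 * *out), 0 if it succeeded, or -1 if the handler could not be installed.
 */
static int probe_read(uintptr_t addr, struct fault_report *out)
{
	struct sigaction sa, old;
	int faulted;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_segv;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGSEGV, &sa, &old) != 0)
		return -1;

	if (sigsetjmp(probe_env, 1) == 0) {
		(void)*(volatile unsigned char *)addr;
		faulted = 0;
	} else {
		faulted = 1;
		out->signo = last_fault.signo;
		out->code = last_fault.code;
		out->addr = last_fault.addr;
	}

	/* Restore the previous SIGSEGV disposition. */
	sigaction(SIGSEGV, &old, NULL);
	return faulted;
}
```

Calling probe_read() on an upper-half address such as 0xffff800000000000 and
printing the report on kernels with and without LASS would show whether the
address is still delivered to the handler.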