[PATCH v3 00/12] Enable Linear Address Space Separation support

Alexander Shishkin posted 12 patches 2 years, 8 months ago
There is a newer version of this series
Posted by Alexander Shishkin 2 years, 8 months ago
Changes from v2[5]:
- Added myself to the SoB chain

Changes from v1[1]:
- Emulate vsyscall violations in execute mode in the #GP fault handler
- Use inline memcpy and memset while patching alternatives
- Remove CONFIG_X86_LASS
- Make LASS depend on SMAP
- Dropped the minimal KVM enabling patch

Linear Address Space Separation (LASS) is a security feature intended to
prevent malicious virtual address space accesses across user/kernel mode.

Such mode based access protection already exists today with paging and features
such as SMEP and SMAP. However, to enforce these protections, the processor
must traverse the paging structures in memory.  Malicious software can use
timing information resulting from this traversal to determine details about the
paging structures, and these details may also be used to determine the layout
of the kernel memory.

The LASS mechanism provides the same mode-based protections as paging but
without traversing the paging structures. Because the protections enforced by
LASS are applied before paging, software will not be able to derive
paging-based timing information from the various caching structures such as the
TLBs, mid-level caches, page walker, data caches, etc. LASS can thus defeat
probing that relies on double page faults, TLB flush and reload, and software
prefetch instructions.
See [2], [3] and [4] for some research on the related attack vectors.

LASS enforcement relies on the typical kernel implementation to divide the
64-bit virtual address space into two halves:
  Addr[63]=0 -> User address space
  Addr[63]=1 -> Kernel address space
Any data access or code execution across address spaces typically results in a
#GP fault.
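
The split above can be sketched in plain C. This is only an illustration of
the bit-63 rule as described here, not kernel code; the helper names
addr_in_kernel_half() and lass_crosses_halves() are made up for the example,
and the cases where enforcement is suspended are not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 63 of the linear address selects the half (assumption: 64-bit mode). */
#define LASS_HALF_BIT (UINT64_C(1) << 63)

static_assert(LASS_HALF_BIT == UINT64_C(0x8000000000000000),
	      "bit 63 selects the address space half");

/* True if the address lies in the kernel half (Addr[63] = 1). */
static bool addr_in_kernel_half(uint64_t addr)
{
	return (addr & LASS_HALF_BIT) != 0;
}

/*
 * True if an access made in the given mode targets the other mode's half,
 * which is the case that would typically raise #GP. Hypothetical helper.
 */
static bool lass_crosses_halves(bool user_mode, uint64_t addr)
{
	return user_mode == addr_in_kernel_half(addr);
}
```

For instance, lass_crosses_halves(true, 0xffffffffff600000) is true: user
mode touching an address with bit 63 set.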

Kernel accesses usually only happen to the kernel address space. However, there
are valid reasons for the kernel to access memory in the user half. For these
cases (such as text poking and EFI runtime accesses), the kernel can temporarily
suspend the enforcement of LASS by toggling SMAP (Supervisor Mode Access
Prevention) using the stac()/clac() helpers, which execute the STAC/CLAC
instructions to set and clear RFLAGS.AC.
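
A toy model of that suspension, for illustration only. It assumes that
enforcement on kernel data accesses to the user half is gated by CR4.SMAP and
RFLAGS.AC, as the description above suggests; instruction fetches and implicit
supervisor accesses are not modeled. struct cpu_model and the model_*() helpers
are invented for this sketch and only mimic what the STAC/CLAC instructions do
to the AC flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified CPU state; only the bits relevant to this sketch. */
struct cpu_model {
	bool cr4_lass;  /* LASS enabled */
	bool cr4_smap;  /* SMAP enabled (LASS depends on it in this series) */
	bool rflags_ac; /* set by STAC, cleared by CLAC */
};

/* Model of STAC: open a window for user-half accesses. */
static void model_stac(struct cpu_model *cpu)
{
	cpu->rflags_ac = true;
}

/* Model of CLAC: close the window again. */
static void model_clac(struct cpu_model *cpu)
{
	cpu->rflags_ac = false;
}

/* Would a kernel-mode data access to addr fault under this model? */
static bool kernel_data_access_faults(const struct cpu_model *cpu,
				      uint64_t addr)
{
	bool user_half = (addr >> 63) == 0;

	if (!cpu->cr4_lass || !user_half)
		return false;
	return cpu->cr4_smap && !cpu->rflags_ac;
}
```

In this model, an access to the user half faults until model_stac() is
called and faults again after model_clac(), which mirrors the bracket the
kernel would place around text poking or EFI runtime calls.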

User space cannot access any kernel address while LASS is enabled.
Unfortunately, legacy vsyscall functions are located in the address range
0xffffffffff600000 - 0xffffffffff601000 and are emulated in the kernel.  To avoid
breaking user applications when LASS is enabled, extend the vsyscall emulation
in execute (XONLY) mode to the #GP fault handler.
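
As a rough sketch of the check this implies: #GP does not report a faulting
address, so the example assumes the handler can look at the saved instruction
pointer of the faulting user-mode fetch. The constants come from the range
quoted above; in_vsyscall_page() and gp_is_vsyscall_candidate() are
hypothetical names, not the functions used in the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VSYSCALL_ADDR UINT64_C(0xffffffffff600000)
#define VSYSCALL_SIZE UINT64_C(0x1000)

static_assert((VSYSCALL_ADDR >> 63) == 1,
	      "the vsyscall page sits in the kernel half");

/* True if addr falls inside the legacy vsyscall page. */
static bool in_vsyscall_page(uint64_t addr)
{
	return addr >= VSYSCALL_ADDR && addr - VSYSCALL_ADDR < VSYSCALL_SIZE;
}

/*
 * Hypothetical decision helper: a #GP taken in user mode whose saved
 * instruction pointer is inside the vsyscall page is a candidate for
 * XONLY emulation; anything else goes down the normal #GP path.
 */
static bool gp_is_vsyscall_candidate(bool user_mode, uint64_t fault_ip)
{
	return user_mode && in_vsyscall_page(fault_ip);
}
```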

In contrast, the vsyscall EMULATE mode is deprecated and not expected to be
used by anyone.  Supporting EMULATE mode with LASS would need complex
instruction decoding in the #GP fault handler and is probably not worth the
hassle. Disable LASS in the rare case where someone absolutely needs
vsyscall=emulate and enables it via the command line.

As of now there is no publicly available CPU supporting LASS.  The first one to
support LASS would be the Sierra Forest line. The Intel Simics® Simulator was
used as the software development and testing vehicle for this patch set.

[1] https://lore.kernel.org/lkml/20230110055204.3227669-1-yian.chen@intel.com/
[2] “Practical Timing Side Channel Attacks against Kernel Space ASLR”,
https://www.ieee-security.org/TC/SP2013/papers/4977a191.pdf
[3] “Prefetch Side-Channel Attacks: Bypassing SMAP and Kernel ASLR”, http://doi.acm.org/10.1145/2976749.2978356
[4] “Harmful prefetch on Intel”, https://ioactive.com/harmful-prefetch-on-intel/ (H/T Anders)
[5] https://lore.kernel.org/all/20230530114247.21821-1-alexander.shishkin@linux.intel.com/

Alexander Shishkin (1):
  x86/vsyscall: Document the fact that vsyscall=emulate disables LASS

Peter Zijlstra (1):
  x86/asm: Introduce inline memcpy and memset

Sohil Mehta (9):
  x86/cpu: Enumerate the LASS feature bits
  x86/alternatives: Disable LASS when patching kernel alternatives
  x86/cpu: Enable LASS during CPU initialization
  x86/cpu: Remove redundant comment during feature setup
  x86/vsyscall: Reorganize the #PF emulation code
  x86/traps: Consolidate user fixups in exc_general_protection()
  x86/vsyscall: Add vsyscall emulation for #GP
  x86/vsyscall: Disable LASS if vsyscall mode is set to EMULATE
  [RFC] x86/efi: Disable LASS enforcement when switching to EFI MM

Yian Chen (1):
  x86/cpu: Set LASS CR4 bit as pinning sensitive

 .../admin-guide/kernel-parameters.txt         |  4 +-
 arch/x86/entry/vsyscall/vsyscall_64.c         | 70 ++++++++++++++-----
 arch/x86/include/asm/cpufeatures.h            |  1 +
 arch/x86/include/asm/disabled-features.h      |  4 +-
 arch/x86/include/asm/smap.h                   |  4 ++
 arch/x86/include/asm/string_32.h              | 21 ++++++
 arch/x86/include/asm/string_64.h              | 21 ++++++
 arch/x86/include/asm/vsyscall.h               | 16 +++--
 arch/x86/include/uapi/asm/processor-flags.h   |  2 +
 arch/x86/kernel/alternative.c                 | 12 +++-
 arch/x86/kernel/cpu/common.c                  | 10 ++-
 arch/x86/kernel/cpu/cpuid-deps.c              |  1 +
 arch/x86/kernel/traps.c                       | 12 ++--
 arch/x86/mm/fault.c                           | 13 +---
 arch/x86/platform/efi/efi_64.c                |  6 ++
 tools/arch/x86/include/asm/cpufeatures.h      |  1 +
 16 files changed, 153 insertions(+), 45 deletions(-)

-- 
2.39.2

Re: [PATCH v3 00/12] Enable Linear Address Space Separation support
Posted by Alexander Shishkin 2 years, 6 months ago
Alexander Shishkin <alexander.shishkin@linux.intel.com> writes:

> Changes from v2[5]:
> - Added myself to the SoB chain

Gentle ping.

Regards,
--
Alex
Re: [PATCH v3 00/12] Enable Linear Address Space Separation support
Posted by Edgecombe, Rick P 2 years, 6 months ago
On Fri, 2023-06-09 at 21:36 +0300, Alexander Shishkin wrote:


What do NULL pointer de-references look like with LASS enabled? They
will be a #GP instead of a #PF, right? Currently the kernel prints out
several types of helpful messages:
 - "BUG: kernel NULL pointer dereference, address: %lx"
 - "BUG: unable to handle page fault for address: %px"
 - "unable to execute userspace code (SMEP?) (uid: %d)"
 - etc

These will go away I guess, and you will get a more opaque "general
protection fault" message?

Assuming that is all right, it might be worth tweaking that #GP message so
people aren't confused when debugging: something that suggests turning off
LASS to get more debugging info.

> Kernel accesses usually only happen to the kernel address space.
> However, there
> are valid reasons for kernel to access memory in the user half. For
> these cases
> (such as text poking and EFI runtime accesses), the kernel can
> temporarily
> suspend the enforcement of LASS by toggling SMAP (Supervisor Mode
> Access
> Prevention) using the stac()/clac() instructions.

CET introduces this unusual instruction called WRUSS. It allows you to
make user memory accesses while executing in the kernel. Because of
this special property, the CET shadow stack patches don't toggle
stac/clac while executing this instruction. So I think LASS will need
it to behave more like a normal userspace access from the kernel.
Shadow stack is not upstream yet, so just something to keep in mind for
the future.

Also, what is this series based on? I wasn't able to apply it.
Re: [PATCH v3 00/12] Enable Linear Address Space Separation support
Posted by Edgecombe, Rick P 2 years, 6 months ago
On Mon, 2023-07-31 at 15:36 -0700, Rick Edgecombe wrote:
> On Fri, 2023-06-09 at 21:36 +0300, Alexander Shishkin wrote:
> 
> 
> What do NULL pointer de-references look like with LASS enabled? They
> will be a #GP instead of a #PF, right? Currently the kernel prints
> out
> several types of helpful messages:
>  - "BUG: kernel NULL pointer dereference, address: %lx"
>  - "BUG: unable to handle page fault for address: %px"
>  - "unable to execute userspace code (SMEP?) (uid: %d)"
>  - etc
> 
> These will go away I guess, and you will get a more opaque "general
> protection fault" message?
> 
> Assuming that is all right, I don't know if it might be worth
> tweaking
> that #GP message, so people aren't confused when debugging. Something
> that explains to turn off LASS to get more debugging info.

Maybe get_kernel_gp_address() could be enhanced to give hints for some
of those cases like it does for non-canonical addresses?
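
To make the idea concrete, here is a sketch of the kind of classification such
hints could be based on, once an address has been decoded from the faulting
instruction. It is not how get_kernel_gp_address() is implemented; the enum,
the helper names, the 4 KiB NULL window and the 48-bit canonical check
(4-level paging) are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hint categories for a #GP taken in kernel mode. */
enum gp_hint_sketch {
	GP_HINT_NONE,
	GP_HINT_NON_CANONICAL,
	GP_HINT_NULL_DEREF,
	GP_HINT_USER_HALF,
};

/* Canonical with 4-level paging: bits 63..47 are all zeros or all ones. */
static bool is_canonical48(uint64_t addr)
{
	uint64_t top = addr >> 47;

	return top == 0 || top == 0x1ffff;
}

/* Pick the most specific hint for the decoded address. */
static enum gp_hint_sketch classify_gp_address(uint64_t addr)
{
	if (!is_canonical48(addr))
		return GP_HINT_NON_CANONICAL;
	if (addr < 4096)
		return GP_HINT_NULL_DEREF; /* likely NULL plus a small offset */
	if ((addr >> 63) == 0)
		return GP_HINT_USER_HALF;  /* possible LASS violation */
	return GP_HINT_NONE;
}
```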


Separately, I think there is a tiny userspace-visible change with this.
If userspace tries to access the kernel half of the canonical address
space, it will get a segfault. It seems previously the signal would
have REG_TRAPNO as 14 (X86_TRAP_PF) in this case, but with LASS it will
be 13 (X86_TRAP_GP).
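
For anyone who wants to check what an application would observe, the trap
number is visible to a SIGSEGV handler through the ucontext. A minimal sketch,
assuming x86-64 Linux with a libc that exposes REG_TRAPNO; probe_trapno() and
trap_name() are hypothetical helper names, and on other platforms the probe
simply reports -1.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes REG_TRAPNO via <ucontext.h> */
#endif
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <ucontext.h>

static sigjmp_buf probe_env;
static volatile sig_atomic_t probe_result;

/* Record the trap number (when available) and jump back to the prober. */
static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
	(void)sig;
	(void)info;
#if defined(__x86_64__) && defined(REG_TRAPNO)
	probe_result = (int)((ucontext_t *)ctx)->uc_mcontext.gregs[REG_TRAPNO];
#else
	(void)ctx;
#endif
	siglongjmp(probe_env, 1);
}

/* Read one byte at addr; return the trap number of the fault, or -1. */
static int probe_trapno(const volatile char *addr)
{
	struct sigaction sa, old;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = segv_handler;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGSEGV, &sa, &old) != 0)
		return -1;

	probe_result = -1;
	if (sigsetjmp(probe_env, 1) == 0) {
		char c = *addr; /* may fault and land in segv_handler */
		(void)c;
	}
	sigaction(SIGSEGV, &old, NULL);
	return probe_result;
}

/* Map the two trap numbers discussed here to readable names. */
static const char *trap_name(int trapno)
{
	if (trapno == 13)
		return "#GP";
	if (trapno == 14)
		return "#PF";
	return "other";
}
```

Going by the description above, probing a kernel-half address such as
0xffff800000000000 would be expected to report 14 without LASS and 13 with it.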

I did a quick search and couldn't find any applications that seemed to
be relying on this behavior (not surprising). Some are looking for
REG_TRAPNO as 14, but none appeared to be relying on accesses to kernel
memory so I guess this should be ok. Still, it is probably appropriate
to call out the change and CC linux-api.

It makes me wonder whether the behavior should be the same with and without
LASS going forward, though. Like maybe always report X86_TRAP_GP for
user->kernel accesses instead of having it vary by whether LASS is used?
With LASS there isn't enough information to report REG_TRAPNO as
X86_TRAP_PF.
Re: [PATCH v3 00/12] Enable Linear Address Space Separation support
Posted by Sohil Mehta 2 years, 6 months ago
On 7/31/2023 3:36 PM, Edgecombe, Rick P wrote:
> CET introduces this unusual instruction called WRUSS. It allows you to
> make user memory accesses while executing in the kernel. Because of
> this special property, the CET shadow stack patches don't toggle
> stac/clac while executing this instruction. So I think LASS will need
> it to behave more like a normal userspace access from the kernel.
> Shadow stack is not upstream yet, so just something to keep in mind for
> the future.
> 

This is a good point. We should definitely test this out to confirm.

But, isn't WRUSS already defined as a user-mode access? So, in theory, a
user-mode access to a user address space *should* not be blocked by LASS
(even with CPL=0).

Are you suggesting that we might need to do something special for WRUSS
with LASS enabled?

Sohil
Re: [PATCH v3 00/12] Enable Linear Address Space Separation support
Posted by Edgecombe, Rick P 2 years, 6 months ago
On Tue, 2023-08-01 at 12:50 -0700, Sohil Mehta wrote:
> On 7/31/2023 3:36 PM, Edgecombe, Rick P wrote:
> > CET introduces this unusual instruction called WRUSS. It allows you
> > to
> > make user memory accesses while executing in the kernel. Because of
> > this special property, the CET shadow stack patches don't toggle
> > stac/clac while executing this instruction. So I think LASS will
> > need
> > it to behave more like a normal userspace access from the kernel.
> > Shadow stack is not upstream yet, so just something to keep in mind
> > for
> > the future.
> > 
> 
> This is a good point. We should definitely test this out to confirm.
> 
> But, isn't WRUSS already defined as a user-mode access? So, in
> theory, a
> user-mode access to a user address space *should* not be blocked by
> LASS
> (even with CPL=0).
> 
> Are you suggesting that we might need to do something special for
> WRUSS
> with LASS enabled?

I was, but reading the docs, I think you are right. It looks like it
will be treated like a user access as far as LASS is concerned. Thanks.