[PATCHv3 0/8] Linear Address Masking enabling

Kirill A. Shutemov posted 8 patches 3 years, 10 months ago
There is a newer version of this series
[PATCHv3 0/8] Linear Address Masking enabling
Posted by Kirill A. Shutemov 3 years, 10 months ago
Linear Address Masking[1] (LAM) modifies the checking that is applied to
64-bit linear addresses, allowing software to use the untranslated
address bits for metadata.
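
As a rough illustration of what "using the untranslated address bits for
metadata" means for userspace, here is a minimal sketch. It assumes the
LAM_U57 layout (6 metadata bits in bits 62:57); the helper names are
hypothetical and are not part of this patchset or of any kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: LAM_U57 layout, metadata lives in bits 62:57 (6 bits). */
#define LAM_U57_TAG_SHIFT 57
#define LAM_U57_TAG_BITS  6
#define LAM_U57_TAG_MASK \
	(((UINT64_C(1) << LAM_U57_TAG_BITS) - 1) << LAM_U57_TAG_SHIFT)

/* Hypothetical helper: store a tag in the metadata bits of a user address. */
static inline uint64_t lam_u57_set_tag(uint64_t addr, unsigned int tag)
{
	assert(tag < (1u << LAM_U57_TAG_BITS));
	return (addr & ~LAM_U57_TAG_MASK) | ((uint64_t)tag << LAM_U57_TAG_SHIFT);
}

/* Hypothetical helper: read the tag back out of a tagged address. */
static inline unsigned int lam_u57_get_tag(uint64_t addr)
{
	return (unsigned int)((addr & LAM_U57_TAG_MASK) >> LAM_U57_TAG_SHIFT);
}

/* Hypothetical helper: clear the metadata bits, leaving the plain address. */
static inline uint64_t lam_u57_untag(uint64_t addr)
{
	return addr & ~LAM_U57_TAG_MASK;
}
```

With LAM enabled, the hardware would ignore the tag bits on dereference, so
only code that compares or hands addresses to something tag-unaware needs
the explicit untag step.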

The patchset brings support for LAM for userspace addresses.

LAM_U48 enabling is controversial since it competes for bits with
5-level paging. Its enabling is isolated into an optional last patch that
can be applied at maintainer's discretion.

Please review and consider applying.

v3:
  - Rebased onto v5.19-rc1
  - Per-process enabling;
  - API overhaul (again);
  - Avoid branches and costly computations in the fast path;
  - LAM_U48 is in optional patch.
v2:
  - Rebased onto v5.18-rc1
  - New arch_prctl(2)-based API
  - Expose status of LAM (or other thread features) in
    /proc/$PID/arch_status

[1] ISE, Chapter 14.
https://software.intel.com/content/dam/develop/external/us/en/documents-tps/architecture-instruction-set-extensions-programming-reference.pdf

Kirill A. Shutemov (8):
  x86/mm: Fix CR3_ADDR_MASK
  x86: CPUID and CR3/CR4 flags for Linear Address Masking
  mm: Pass down mm_struct to untagged_addr()
  x86/mm: Handle LAM on context switch
  x86/uaccess: Provide untagged_addr() and remove tags before address check
  x86/mm: Provide ARCH_GET_UNTAG_MASK and ARCH_ENABLE_TAGGED_ADDR
  x86: Expose untagging mask in /proc/$PID/arch_status
  x86/mm: Extend LAM to support to LAM_U48

 arch/arm64/include/asm/memory.h               |  4 +-
 arch/arm64/include/asm/signal.h               |  2 +-
 arch/arm64/include/asm/uaccess.h              |  4 +-
 arch/arm64/kernel/hw_breakpoint.c             |  2 +-
 arch/arm64/kernel/traps.c                     |  4 +-
 arch/arm64/mm/fault.c                         | 10 +--
 arch/sparc/include/asm/pgtable_64.h           |  2 +-
 arch/sparc/include/asm/uaccess_64.h           |  2 +
 arch/x86/include/asm/cpufeatures.h            |  1 +
 arch/x86/include/asm/elf.h                    |  3 +-
 arch/x86/include/asm/mmu.h                    |  2 +
 arch/x86/include/asm/mmu_context.h            | 58 +++++++++++++++++
 arch/x86/include/asm/processor-flags.h        |  2 +-
 arch/x86/include/asm/tlbflush.h               |  3 +
 arch/x86/include/asm/uaccess.h                | 44 ++++++++++++-
 arch/x86/include/uapi/asm/prctl.h             |  3 +
 arch/x86/include/uapi/asm/processor-flags.h   |  6 ++
 arch/x86/kernel/Makefile                      |  2 +
 arch/x86/kernel/fpu/xstate.c                  | 47 --------------
 arch/x86/kernel/proc.c                        | 50 +++++++++++++++
 arch/x86/kernel/process.c                     |  3 +
 arch/x86/kernel/process_64.c                  | 54 +++++++++++++++-
 arch/x86/kernel/sys_x86_64.c                  |  5 +-
 arch/x86/mm/hugetlbpage.c                     |  6 +-
 arch/x86/mm/mmap.c                            |  9 ++-
 arch/x86/mm/tlb.c                             | 62 ++++++++++++++-----
 .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c       |  2 +-
 drivers/gpu/drm/radeon/radeon_gem.c           |  2 +-
 drivers/infiniband/hw/mlx4/mr.c               |  2 +-
 drivers/media/common/videobuf2/frame_vector.c |  2 +-
 drivers/media/v4l2-core/videobuf-dma-contig.c |  2 +-
 .../staging/media/atomisp/pci/hmm/hmm_bo.c    |  2 +-
 drivers/tee/tee_shm.c                         |  2 +-
 drivers/vfio/vfio_iommu_type1.c               |  2 +-
 fs/proc/task_mmu.c                            |  2 +-
 include/linux/mm.h                            | 11 ----
 include/linux/uaccess.h                       | 11 ++++
 lib/strncpy_from_user.c                       |  2 +-
 lib/strnlen_user.c                            |  2 +-
 mm/gup.c                                      |  6 +-
 mm/madvise.c                                  |  2 +-
 mm/mempolicy.c                                |  6 +-
 mm/migrate.c                                  |  2 +-
 mm/mincore.c                                  |  2 +-
 mm/mlock.c                                    |  4 +-
 mm/mmap.c                                     |  2 +-
 mm/mprotect.c                                 |  2 +-
 mm/mremap.c                                   |  2 +-
 mm/msync.c                                    |  2 +-
 virt/kvm/kvm_main.c                           |  2 +-
 51 files changed, 342 insertions(+), 126 deletions(-)
 create mode 100644 arch/x86/kernel/proc.c

-- 
2.35.1
Re: [PATCHv3 0/8] Linear Address Masking enabling
Posted by Edgecombe, Rick P 3 years, 10 months ago
On Fri, 2022-06-10 at 17:35 +0300, Kirill A. Shutemov wrote:
> Linear Address Masking[1] (LAM) modifies the checking that is applied to
> 64-bit linear addresses, allowing software to use the untranslated
> address bits for metadata.
> 
> The patchset brings support for LAM for userspace addresses.

Arm has this documentation about which memory operations support being
passed tagged pointers, and which do not:
Documentation/arm64/tagged-address-abi.rst

Is the idea that LAM would have something similar, or exactly mirror
the arm ABI? It seems like it is the same right now. Should the docs be
generalized?

Re: [PATCHv3 0/8] Linear Address Masking enabling
Posted by Kirill A. Shutemov 3 years, 10 months ago
On Thu, Jun 16, 2022 at 10:52:14PM +0000, Edgecombe, Rick P wrote:
> On Fri, 2022-06-10 at 17:35 +0300, Kirill A. Shutemov wrote:
> > Linear Address Masking[1] (LAM) modifies the checking that is applied to
> > 64-bit linear addresses, allowing software to use the untranslated
> > address bits for metadata.
> > 
> > The patchset brings support for LAM for userspace addresses.
> 
> Arm has this documentation about which memory operations support being
> passed tagged pointers, and which do not:
> Documentation/arm64/tagged-address-abi.rst
> 
> Is the idea that LAM would have something similar, or exactly mirror
> the arm ABI? It seems like it is the same right now. Should the docs be
> generalized?

It is somewhat similar, but not exact. ARM TBI interface implies tag size
and placement. ARM TBI is per-thread and LAM is per-process.
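
To make the "tag size and placement" difference concrete, here is a small
sketch. It assumes the placements from the architecture documentation (TBI:
bits 63:56, LAM_U57: bits 62:57, LAM_U48: bits 62:48). The helpers are
hypothetical; the point is only that code written against a queried untag
mask (such as the one ARCH_GET_UNTAG_MASK is meant to return) does not need
to hard-code any of these layouts.

```c
#include <stdint.h>

/* Build a mask covering bits hi..lo inclusive (0 <= lo <= hi <= 63). */
#define BITS64(hi, lo) \
	((~UINT64_C(0) >> (63 - (hi))) & (~UINT64_C(0) << (lo)))

/* Assumed tag placements; an untag mask has the tag bits cleared. */
#define TBI_UNTAG_MASK     (~BITS64(63, 56))	/* arm64 TBI: 8-bit tag  */
#define LAM_U57_UNTAG_MASK (~BITS64(62, 57))	/* x86 LAM_U57: 6-bit tag */
#define LAM_U48_UNTAG_MASK (~BITS64(62, 48))	/* x86 LAM_U48: 15-bit tag */

/* Hypothetical helper: strip the tag using whatever mask is in effect. */
static inline uint64_t untag_with_mask(uint64_t addr, uint64_t untag_mask)
{
	return addr & untag_mask;
}

/* Hypothetical helper: number of tag bits implied by an untag mask. */
static inline unsigned int tag_width(uint64_t untag_mask)
{
	uint64_t tag_bits = ~untag_mask;
	unsigned int n = 0;

	/* Clear the lowest set bit until none remain. */
	while (tag_bits) {
		tag_bits &= tag_bits - 1;
		n++;
	}
	return n;
}
```

An all-ones mask (no tagging enabled) yields a width of zero, so the same
code path works whether or not tagging is active.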

-- 
 Kirill A. Shutemov
Re: [PATCHv3 0/8] Linear Address Masking enabling
Posted by Edgecombe, Rick P 3 years, 10 months ago
On Fri, 2022-06-17 at 02:43 +0300, Kirill A. Shutemov wrote:
> On Thu, Jun 16, 2022 at 10:52:14PM +0000, Edgecombe, Rick P wrote:
> > On Fri, 2022-06-10 at 17:35 +0300, Kirill A. Shutemov wrote:
> > > Linear Address Masking[1] (LAM) modifies the checking that is applied
> > > to 64-bit linear addresses, allowing software to use the untranslated
> > > address bits for metadata.
> > > 
> > > The patchset brings support for LAM for userspace addresses.
> > 
> > Arm has this documentation about which memory operations support being
> > passed tagged pointers, and which do not:
> > Documentation/arm64/tagged-address-abi.rst
> > 
> > Is the idea that LAM would have something similar, or exactly mirror
> > the arm ABI? It seems like it is the same right now. Should the docs be
> > generalized?
> 
> It is somewhat similar, but not exact. ARM TBI interface implies tag size
> and placement. ARM TBI is per-thread and LAM is per-process.

Ah right. I was thinking more about the part on which syscalls support
tagged addresses:

https://www.kernel.org/doc/html/latest/arm64/tagged-address-abi.html#id1

Some mention kernel versions where they changed. Just thinking it could
get complex to track which HW features support which syscalls for which
kernel versions.
Re: [PATCHv3 0/8] Linear Address Masking enabling
Posted by Kostya Serebryany 3 years, 10 months ago
Thanks for working on this, please make LAM happen.
It enables efficient memory safety testing that is already available on AArch64.

Memory error detectors, such as ASAN and Valgrind (or KASAN for the kernel)
have limited applicability, primarily because of their run-time overheads
(CPU, RAM, and code size). In many cases, the major obstacle to a wider
deployment is the RAM overhead, which is typically 2x-3x. There is another tool,
HWASAN [1], which solves the same problem and has < 10% RAM overhead.
This tool is available only on AArch64 because it relies on the
top-byte-ignore (TBI) feature. Full support for that feature [2] has been
added to the kernel in order to enable HWASAN. Adding support for LAM will
enable HWASAN on x86_64.
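
For readers unfamiliar with why pointer tagging keeps the RAM overhead low,
here is a toy model of the check such a tool performs. It is a sketch under
stated assumptions, not HWASAN's actual implementation: one shadow byte per
16-byte granule (1/16 of the tagged memory), a LAM_U57-style tag placement,
and small offsets into an arena standing in for real addresses. All names
are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define GRANULE    16u	/* bytes covered by one shadow tag (assumption) */
#define ARENA_SIZE 256u	/* toy "address space", in bytes */
#define TAG_SHIFT  57	/* assumption: LAM_U57-style tag placement */
#define TAG_MASK   (UINT64_C(0x3F) << TAG_SHIFT)

/* One tag byte per granule: 1/16 of the arena size. */
static uint8_t shadow[ARENA_SIZE / GRANULE];

/* Tag every granule in [offset, offset + size); return a tagged "pointer". */
static uint64_t toy_tag_region(uint64_t offset, size_t size, uint8_t tag)
{
	uint64_t first = offset / GRANULE;
	uint64_t end = (offset + size + GRANULE - 1) / GRANULE;

	tag &= 0x3F;	/* keep only the bits that fit in the pointer */
	for (uint64_t g = first; g < end && g < ARENA_SIZE / GRANULE; g++)
		shadow[g] = tag;
	return offset | ((uint64_t)tag << TAG_SHIFT);
}

/* Return 1 if the pointer's tag matches the shadow tag of its granule. */
static int toy_access_ok(uint64_t tagged)
{
	uint64_t offset = tagged & ~TAG_MASK;
	uint8_t ptr_tag = (uint8_t)((tagged & TAG_MASK) >> TAG_SHIFT);

	if (offset >= ARENA_SIZE)
		return 0;
	return shadow[offset / GRANULE] == ptr_tag;
}
```

An access one granule past the tagged region, or through a pointer that has
lost its tag, fails the comparison. Without hardware that ignores the tag
bits, every dereference would also need an explicit untag step.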

HWASAN is already the main memory safety tool for Android [3] - the reduced RAM
overhead allowed us to utilize this testing tool where ASAN’s RAM overhead was
prohibitive. We have also prototyped the x86_64 variant of HWASAN, and we can
observe that it is a major improvement over ASAN. The kernel support and
hardware availability are the only missing parts.

Making HWASAN available on x86_64 will enable developers of server and client
software to scale up their memory safety testing, and thus improve the quality
and security of their products.


[1] https://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html
[2] https://www.kernel.org/doc/html/latest/arm64/tagged-address-abi.html
[3] https://source.android.com/devices/tech/debug/hwasan

--kcc

