[PATCH RFT v11 0/8] fork: Support shadow stacks in clone3()

Mark Brown posted 8 patches 1 month, 3 weeks ago
There is a newer version of this series
Documentation/userspace-api/index.rst             |   1 +
Documentation/userspace-api/shadow_stack.rst      |  41 ++++
arch/arm64/include/asm/gcs.h                      |   8 +-
arch/arm64/kernel/process.c                       |   8 +-
arch/arm64/mm/gcs.c                               |  62 +++++-
arch/x86/include/asm/shstk.h                      |  11 +-
arch/x86/kernel/process.c                         |   2 +-
arch/x86/kernel/shstk.c                           |  57 +++++-
include/asm-generic/cacheflush.h                  |  11 ++
include/linux/sched/task.h                        |  17 ++
include/uapi/linux/sched.h                        |  10 +-
kernel/fork.c                                     |  96 +++++++--
tools/testing/selftests/clone3/clone3.c           | 226 ++++++++++++++++++----
tools/testing/selftests/clone3/clone3_selftests.h |  65 ++++++-
tools/testing/selftests/ksft_shstk.h              |  98 ++++++++++
15 files changed, 632 insertions(+), 81 deletions(-)
[PATCH RFT v11 0/8] fork: Support shadow stacks in clone3()
Posted by Mark Brown 1 month, 3 weeks ago
The kernel has recently added support for shadow stacks, currently
x86 only using their CET feature but both arm64 and RISC-V have
equivalent features (GCS and Zicfiss respectively), I am actively
working on GCS[1].  With shadow stacks the hardware maintains an
additional stack containing only the return addresses for branch
instructions which is not generally writeable by userspace and ensures
that any returns are to the recorded addresses.  This provides some
protection against ROP attacks and making it easier to collect call
stacks.  These shadow stacks are allocated in the address space of the
userspace process.

Our API for shadow stacks does not currently offer userspace any
flexiblity for managing the allocation of shadow stacks for newly
created threads, instead the kernel allocates a new shadow stack with
the same size as the normal stack whenever a thread is created with the
feature enabled.  The stacks allocated in this way are freed by the
kernel when the thread exits or shadow stacks are disabled for the
thread.  This lack of flexibility and control isn't ideal, in the vast
majority of cases the shadow stack will be over allocated and the
implicit allocation and deallocation is not consistent with other
interfaces.  As far as I can tell the interface is done in this manner
mainly because the shadow stack patches were in development since before
clone3() was implemented.

Since clone3() is readily extensible let's add support for specifying a
shadow stack when creating a new thread or process, keeping the current
implicit allocation behaviour if one is not specified either with
clone3() or through the use of clone().  The user must provide a shadow
stack pointer, this must point to memory mapped for use as a shadow
stackby map_shadow_stack() with an architecture specified shadow stack
token at the top of the stack.

Please note that the x86 portions of this code are build tested only, I
don't appear to have a system that can run CET available to me.

[1] https://lore.kernel.org/linux-arm-kernel/20241001-arm64-gcs-v13-0-222b78d87eee@kernel.org/T/#mc58f97f27461749ccf400ebabf6f9f937116a86b

Signed-off-by: Mark Brown <broonie@kernel.org>
---
Changes in v11:
- Rebase onto arm64 for-next/gcs, which is based on v6.12-rc1, and
  integrate arm64 support.
- Rework the interface to specify a shadow stack pointer rather than a
  base and size like we do for the regular stack.
- Link to v10: https://lore.kernel.org/r/20240821-clone3-shadow-stack-v10-0-06e8797b9445@kernel.org

Changes in v10:
- Integrate fixes & improvements for the x86 implementation from Rick
  Edgecombe.
- Require that the shadow stack be VM_WRITE.
- Require that the shadow stack base and size be sizeof(void *) aligned.
- Clean up trailing newline.
- Link to v9: https://lore.kernel.org/r/20240819-clone3-shadow-stack-v9-0-962d74f99464@kernel.org

Changes in v9:
- Pull token validation earlier and report problems with an error return
  to parent rather than signal delivery to the child.
- Verify that the top of the supplied shadow stack is VM_SHADOW_STACK.
- Rework token validation to only do the page mapping once.
- Drop no longer needed support for testing for signals in selftest.
- Fix typo in comments.
- Link to v8: https://lore.kernel.org/r/20240808-clone3-shadow-stack-v8-0-0acf37caf14c@kernel.org

Changes in v8:
- Fix token verification with user specified shadow stack.
- Don't track user managed shadow stacks for child processes.
- Link to v7: https://lore.kernel.org/r/20240731-clone3-shadow-stack-v7-0-a9532eebfb1d@kernel.org

Changes in v7:
- Rebase onto v6.11-rc1.
- Typo fixes.
- Link to v6: https://lore.kernel.org/r/20240623-clone3-shadow-stack-v6-0-9ee7783b1fb9@kernel.org

Changes in v6:
- Rebase onto v6.10-rc3.
- Ensure we don't try to free the parent shadow stack in error paths of
  x86 arch code.
- Spelling fixes in userspace API document.
- Additional cleanups and improvements to the clone3() tests to support
  the shadow stack tests.
- Link to v5: https://lore.kernel.org/r/20240203-clone3-shadow-stack-v5-0-322c69598e4b@kernel.org

Changes in v5:
- Rebase onto v6.8-rc2.
- Rework ABI to have the user allocate the shadow stack memory with
  map_shadow_stack() and a token.
- Force inlining of the x86 shadow stack enablement.
- Move shadow stack enablement out into a shared header for reuse by
  other tests.
- Link to v4: https://lore.kernel.org/r/20231128-clone3-shadow-stack-v4-0-8b28ffe4f676@kernel.org

Changes in v4:
- Formatting changes.
- Use a define for minimum shadow stack size and move some basic
  validation to fork.c.
- Link to v3: https://lore.kernel.org/r/20231120-clone3-shadow-stack-v3-0-a7b8ed3e2acc@kernel.org

Changes in v3:
- Rebase onto v6.7-rc2.
- Remove stale shadow_stack in internal kargs.
- If a shadow stack is specified unconditionally use it regardless of
  CLONE_ parameters.
- Force enable shadow stacks in the selftest.
- Update changelogs for RISC-V feature rename.
- Link to v2: https://lore.kernel.org/r/20231114-clone3-shadow-stack-v2-0-b613f8681155@kernel.org

Changes in v2:
- Rebase onto v6.7-rc1.
- Remove ability to provide preallocated shadow stack, just specify the
  desired size.
- Link to v1: https://lore.kernel.org/r/20231023-clone3-shadow-stack-v1-0-d867d0b5d4d0@kernel.org

---
Mark Brown (8):
      arm64/gcs: Return a success value from gcs_alloc_thread_stack()
      Documentation: userspace-api: Add shadow stack API documentation
      selftests: Provide helper header for shadow stack testing
      fork: Add shadow stack support to clone3()
      selftests/clone3: Remove redundant flushes of output streams
      selftests/clone3: Factor more of main loop into test_clone3()
      selftests/clone3: Allow tests to flag if -E2BIG is a valid error code
      selftests/clone3: Test shadow stack support

 Documentation/userspace-api/index.rst             |   1 +
 Documentation/userspace-api/shadow_stack.rst      |  41 ++++
 arch/arm64/include/asm/gcs.h                      |   8 +-
 arch/arm64/kernel/process.c                       |   8 +-
 arch/arm64/mm/gcs.c                               |  62 +++++-
 arch/x86/include/asm/shstk.h                      |  11 +-
 arch/x86/kernel/process.c                         |   2 +-
 arch/x86/kernel/shstk.c                           |  57 +++++-
 include/asm-generic/cacheflush.h                  |  11 ++
 include/linux/sched/task.h                        |  17 ++
 include/uapi/linux/sched.h                        |  10 +-
 kernel/fork.c                                     |  96 +++++++--
 tools/testing/selftests/clone3/clone3.c           | 226 ++++++++++++++++++----
 tools/testing/selftests/clone3/clone3_selftests.h |  65 ++++++-
 tools/testing/selftests/ksft_shstk.h              |  98 ++++++++++
 15 files changed, 632 insertions(+), 81 deletions(-)
---
base-commit: d17cd7b7cc92d37ee8b2df8f975fc859a261f4dc
change-id: 20231019-clone3-shadow-stack-15d40d2bf536

Best regards,
-- 
Mark Brown <broonie@kernel.org>
Re: [PATCH RFT v11 0/8] fork: Support shadow stacks in clone3()
Posted by Mark Brown 4 weeks ago
On Sat, Oct 05, 2024 at 11:31:27AM +0100, Mark Brown wrote:
> The kernel has recently added support for shadow stacks, currently
> x86 only using their CET feature but both arm64 and RISC-V have
> equivalent features (GCS and Zicfiss respectively), I am actively
> working on GCS[1].  With shadow stacks the hardware maintains an
> additional stack containing only the return addresses for branch
> instructions which is not generally writeable by userspace and ensures
> that any returns are to the recorded addresses.  This provides some
> protection against ROP attacks and making it easier to collect call
> stacks.  These shadow stacks are allocated in the address space of the
> userspace process.

Does anyone have any thoughts on this?  I reworked things to specify the
address for the shadow stack pointer rather than the extent of the stack
as Rick and Yuri suggested, otherwise the only change from the prior
version was rebasing onto the arm64 GCS support since that's queued in
-next.  I think the only substantial question is picking the ABI for
specifying the shadow stack.
Re: [PATCH RFT v11 0/8] fork: Support shadow stacks in clone3()
Posted by Yury Khrustalev 4 weeks ago
On Wed, Oct 30, 2024 at 02:08:59PM +0000, Mark Brown wrote:
> On Sat, Oct 05, 2024 at 11:31:27AM +0100, Mark Brown wrote:
> > The kernel has recently added support for shadow stacks, currently
> > x86 only using their CET feature but both arm64 and RISC-V have
> > equivalent features (GCS and Zicfiss respectively), I am actively
> > working on GCS[1].  With shadow stacks the hardware maintains an
> > additional stack containing only the return addresses for branch
> > instructions which is not generally writeable by userspace and ensures
> > that any returns are to the recorded addresses.  This provides some
> > protection against ROP attacks and making it easier to collect call
> > stacks.  These shadow stacks are allocated in the address space of the
> > userspace process.
> 
> Does anyone have any thoughts on this?  I reworked things to specify the
> address for the shadow stack pointer rather than the extent of the stack
> as Rick and Yuri suggested, otherwise the only change from the prior
> version was rebasing onto the arm64 GCS support since that's queued in
> -next.  I think the only substantial question is picking the ABI for
> specifying the shadow stack.

I will need more time to review this as both my primary and shadow stacks
are full with other work. At a glance, I cannot offer any informed opinion
for choosing ABI atm. Apologies for the delay.

Kind regards,
Yury