.../admin-guide/kernel-parameters.txt | 10 +++++ arch/x86/entry/vsyscall/vsyscall_64.c | 9 ++++- include/linux/mm.h | 12 ++++++ mm/mmap.c | 10 +++++ mm/mseal.c | 39 +++++++++++++++++++ security/Kconfig | 11 ++++++ 6 files changed, 89 insertions(+), 2 deletions(-)
From: Jeff Xu <jeffxu@chromium.org> Seal vdso, vvar, sigpage, uprobes and vsyscall. Those mappings are readonly or executable only, sealing can protect them from ever changing or unmapped during the life time of the process. For complete descriptions of memory sealing, please see mseal.rst [1]. System mappings such as vdso, vvar, and sigpage (for arm) are generated by the kernel during program initialization, and are sealed after creation. Unlike the aforementioned mappings, the uprobe mapping is not established during program startup. However, its lifetime is the same as the process's lifetime [1]. It is sealed from creation. The vdso, vvar, sigpage, and uprobe mappings all invoke the _install_special_mapping() function. As no other mappings utilize this function, it is logical to incorporate sealing logic within _install_special_mapping(). This approach avoids the necessity of modifying code across various architecture-specific implementations. The vsyscall mapping, which has its own initialization function, is sealed in the XONLY case, it seems to be the most common and secure case of using vsyscall. It is important to note that the CHECKPOINT_RESTORE feature (CRIU) may alter the mapping of vdso, vvar, and sigpage during restore operations. Consequently, this feature cannot be universally enabled across all systems. To address this, a kernel configuration option has been introduced to enable or disable this functionality. [1] Documentation/userspace-api/mseal.rst [2] https://lore.kernel.org/all/CABi2SkU9BRUnqf70-nksuMCQ+yyiWjo3fM4XkRkL-NrCZxYAyg@mail.gmail.com/ History: V3: Revert uprobe to v1 logic (Oleg Nesterov) use CONFIG_SEAL_SYSTEM_MAPPINGS instead of _ALWAYS/_NEVER (Kees Cook) Move kernel cmd line from fs/exec.c to mm/mseal.c and misc. refactor (Liam R. Howlett) V2: https://lore.kernel.org/all/20241014215022.68530-1-jeffxu@google.com/ Seal uprobe always (Oleg Nesterov) Update comments and description (Randy Dunlap, Liam R.Howlett, Oleg Nesterov) Rebase to linux_main V1: https://lore.kernel.org/all/20241004163155.3493183-1-jeffxu@google.com/ Jeff Xu (1): exec: seal system mappings .../admin-guide/kernel-parameters.txt | 10 +++++ arch/x86/entry/vsyscall/vsyscall_64.c | 9 ++++- include/linux/mm.h | 12 ++++++ mm/mmap.c | 10 +++++ mm/mseal.c | 39 +++++++++++++++++++ security/Kconfig | 11 ++++++ 6 files changed, 89 insertions(+), 2 deletions(-) -- 2.47.0.277.g8800431eea-goog
You left kernel test bots and review unanswered on v2, which makes it difficult to know whether you actually addressed everything. Please respond to all outstanding comments in the v2 thread so we know, thanks, even if it is to say 'this is no longer an issue as v3 addresses'. On Wed, Nov 13, 2024 at 07:16:01PM +0000, jeffxu@chromium.org wrote: > From: Jeff Xu <jeffxu@chromium.org> > > Seal vdso, vvar, sigpage, uprobes and vsyscall. > > Those mappings are readonly or executable only, sealing can protect > them from ever changing or unmapped during the life time of the process. > For complete descriptions of memory sealing, please see mseal.rst [1]. > > System mappings such as vdso, vvar, and sigpage (for arm) are > generated by the kernel during program initialization, and are > sealed after creation. > > Unlike the aforementioned mappings, the uprobe mapping is not > established during program startup. However, its lifetime is the same > as the process's lifetime [1]. It is sealed from creation. > > The vdso, vvar, sigpage, and uprobe mappings all invoke the > _install_special_mapping() function. As no other mappings utilize this > function, it is logical to incorporate sealing logic within > _install_special_mapping(). This approach avoids the necessity of > modifying code across various architecture-specific implementations. Some arches unmap VDSO's which mseal prevents, so are those broken now? Did you test this? > > The vsyscall mapping, which has its own initialization function, is > sealed in the XONLY case, it seems to be the most common and secure > case of using vsyscall. > > It is important to note that the CHECKPOINT_RESTORE feature (CRIU) may > alter the mapping of vdso, vvar, and sigpage during restore > operations. Consequently, this feature cannot be universally enabled > across all systems. To address this, a kernel configuration option has > been introduced to enable or disable this functionality. > > [1] Documentation/userspace-api/mseal.rst > [2] https://lore.kernel.org/all/CABi2SkU9BRUnqf70-nksuMCQ+yyiWjo3fM4XkRkL-NrCZxYAyg@mail.gmail.com/ I don't see any mention to testing, and this affects multiple different architectures. Please describe what testing you performed and on what architectures. I suggest we allow this only for architectures you have explicitly tested, especially as this is 'hidden' from arch maintainers who might find this change surprising. > > History: > V3: > Revert uprobe to v1 logic (Oleg Nesterov) > use CONFIG_SEAL_SYSTEM_MAPPINGS instead of _ALWAYS/_NEVER (Kees Cook) > Move kernel cmd line from fs/exec.c to mm/mseal.c and misc. refactor (Liam R. Howlett) > > V2: > https://lore.kernel.org/all/20241014215022.68530-1-jeffxu@google.com/ > Seal uprobe always (Oleg Nesterov) > Update comments and description (Randy Dunlap, Liam R.Howlett, Oleg Nesterov) > Rebase to linux_main > > V1: > https://lore.kernel.org/all/20241004163155.3493183-1-jeffxu@google.com/ > > Jeff Xu (1): > exec: seal system mappings > > .../admin-guide/kernel-parameters.txt | 10 +++++ > arch/x86/entry/vsyscall/vsyscall_64.c | 9 ++++- > include/linux/mm.h | 12 ++++++ > mm/mmap.c | 10 +++++ > mm/mseal.c | 39 +++++++++++++++++++ > security/Kconfig | 11 ++++++ > 6 files changed, 89 insertions(+), 2 deletions(-) > > -- > 2.47.0.277.g8800431eea-goog >
Hi Lorenzo On Wed, Nov 13, 2024 at 12:36 PM Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote: > > You left kernel test bots and review unanswered on v2, which makes it > difficult to know whether you actually addressed everything. > Thanks for reminding me, I got distracted previously. I responded to the test bots. > Please respond to all outstanding comments in the v2 thread so we know, > thanks, even if it is to say 'this is no longer an issue as v3 addresses'. > All comments of v2 were addressed in V3. > On Wed, Nov 13, 2024 at 07:16:01PM +0000, jeffxu@chromium.org wrote: > > From: Jeff Xu <jeffxu@chromium.org> > > > > Seal vdso, vvar, sigpage, uprobes and vsyscall. > > > > Those mappings are readonly or executable only, sealing can protect > > them from ever changing or unmapped during the life time of the process. > > For complete descriptions of memory sealing, please see mseal.rst [1]. > > > > System mappings such as vdso, vvar, and sigpage (for arm) are > > generated by the kernel during program initialization, and are > > sealed after creation. > > > > Unlike the aforementioned mappings, the uprobe mapping is not > > established during program startup. However, its lifetime is the same > > as the process's lifetime [1]. It is sealed from creation. > > > > The vdso, vvar, sigpage, and uprobe mappings all invoke the > > _install_special_mapping() function. As no other mappings utilize this > > function, it is logical to incorporate sealing logic within > > _install_special_mapping(). This approach avoids the necessity of > > modifying code across various architecture-specific implementations. > > Some arches unmap VDSO's which mseal prevents, so are those broken now? Did > you test this? > Do you happen to know which arch might unmap vdso ? The information I collected so far is only CHECKPOINT_RESTORE would remap/unmap vdso. And if CHECKPOINT_RESTORE is enabled, Kconfig will prevent this from being enabled. > > > > The vsyscall mapping, which has its own initialization function, is > > sealed in the XONLY case, it seems to be the most common and secure > > case of using vsyscall. > > > > It is important to note that the CHECKPOINT_RESTORE feature (CRIU) may > > alter the mapping of vdso, vvar, and sigpage during restore > > operations. Consequently, this feature cannot be universally enabled > > across all systems. To address this, a kernel configuration option has > > been introduced to enable or disable this functionality. > > > > [1] Documentation/userspace-api/mseal.rst > > [2] https://lore.kernel.org/all/CABi2SkU9BRUnqf70-nksuMCQ+yyiWjo3fM4XkRkL-NrCZxYAyg@mail.gmail.com/ > > I don't see any mention to testing, and this affects multiple different > architectures. > > Please describe what testing you performed and on what architectures. > The tests are done in ChromeOS and Android on ARM and INTEL. > I suggest we allow this only for architectures you have explicitly tested, > especially as this is 'hidden' from arch maintainers who might find this > change surprising. > I thought the current approach aligns with Linus's suggestion of unifying vdso/vvar code [1]. I honestly think this is not architecture dependent, i.e. this doesn't require any specific CPU feature. I could add ARCH_HAS_SEAL_SYSTEM_MAPPINGS in KCONFIG and enable this for x86_64 and arm64 for now, this would allow other architecture maintainers to have opportunities to test this . [1] https://lore.kernel.org/all/CAHk-=wgTXVMBRuya5J0peujSrtunehRtzk=WVrm6njPhHrpTJw@mail.gmail.com/ Thanks for reviewing. -Jeff
© 2016 - 2024 Red Hat, Inc.