Enable the WXN system control bit straight out of reset when running in
EL1 with the initial ID map from flash. This setting will be inherited
by the page table code after it sets up the permanent boot time page
tables, resulting in all memory mappings that are not explicitly mapped
as read-only to be non-executable.

Note that this requires runtime drivers to be built with position
independent codegen, to ensure that all absolute symbol references are
moved into a separate section in the binary. Otherwise, unmapping the
pages that are subject to relocation fixups at runtime (during the
invocation of SetVirtualAddressMap()) could result in code mappings
losing their executable permissions.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
ArmVirtPkg/ArmVirt.dsc.inc | 1 +
ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S | 2 +-
2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/ArmVirtPkg/ArmVirt.dsc.inc b/ArmVirtPkg/ArmVirt.dsc.inc
index 5b18184be263..928dd6330edb 100644
--- a/ArmVirtPkg/ArmVirt.dsc.inc
+++ b/ArmVirtPkg/ArmVirt.dsc.inc
@@ -31,6 +31,7 @@ [BuildOptions.common.EDKII.DXE_CORE,BuildOptions.common.EDKII.DXE_DRIVER,BuildOp
[BuildOptions.common.EDKII.DXE_RUNTIME_DRIVER]
GCC:*_*_ARM_DLINK_FLAGS = -z common-page-size=0x1000
+ GCC:*_*_AARCH64_CC_FLAGS = -fpie
GCC:*_*_AARCH64_DLINK_FLAGS = -z common-page-size=0x10000
[LibraryClasses.common]
diff --git a/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S b/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S
index 5ac7c732f6ec..51c089a45ffc 100644
--- a/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S
+++ b/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S
@@ -38,7 +38,7 @@
.set SCTLR_EL1_ITD, 0x1 << 7
.set SCTLR_EL1_RES1, (0x1 << 11) | (0x1 << 20) | (0x1 << 22) | (0x1 << 28) | (0x1 << 29)
.set sctlrval, SCTLR_ELx_M | SCTLR_ELx_C | SCTLR_ELx_SA | SCTLR_EL1_ITD | SCTLR_EL1_SED
- .set sctlrval, sctlrval | SCTLR_ELx_I | SCTLR_EL1_SPAN | SCTLR_EL1_RES1
+ .set sctlrval, sctlrval | SCTLR_ELx_I | SCTLR_EL1_SPAN | SCTLR_EL1_RES1 | SCTLR_EL1_WXN
ASM_FUNC(ArmPlatformPeiBootAction)
--
2.39.1
-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#100100): https://edk2.groups.io/g/devel/message/100100
Mute This Topic: https://groups.io/mt/96937498/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-
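As an illustration of the one-line change to sctlrval in the patch above, here is a minimal C sketch of how the value is composed and of what WXN is meant to achieve. The ITD and RES1 bit positions are taken from the hunk; the remaining bit positions (M, C, SA, SED, I, WXN, SPAN) are not shown in the hunk and are assumed here from the architectural SCTLR_EL1 layout. The function names and the permission model are hypothetical and heavily simplified (a single "writable" flag and a single "XN" attribute per mapping).

```c
#include <stdint.h>

/* ITD and RES1 as in the quoted hunk; the other positions are assumptions
 * based on the architectural SCTLR_EL1 layout, not on the patch itself. */
#define SCTLR_ELx_M     (UINT32_C(1) << 0)   /* MMU enable */
#define SCTLR_ELx_C     (UINT32_C(1) << 2)   /* data cache enable */
#define SCTLR_ELx_SA    (UINT32_C(1) << 3)   /* SP alignment check */
#define SCTLR_EL1_ITD   (UINT32_C(1) << 7)
#define SCTLR_EL1_SED   (UINT32_C(1) << 8)
#define SCTLR_ELx_I     (UINT32_C(1) << 12)  /* instruction cache enable */
#define SCTLR_EL1_WXN   (UINT32_C(1) << 19)  /* write implies execute-never */
#define SCTLR_EL1_SPAN  (UINT32_C(1) << 23)
#define SCTLR_EL1_RES1  ((UINT32_C(1) << 11) | (UINT32_C(1) << 20) | \
                         (UINT32_C(1) << 22) | (UINT32_C(1) << 28) | \
                         (UINT32_C(1) << 29))

/* Value programmed before the patch. */
uint32_t SctlrBefore(void)
{
  uint32_t Val = SCTLR_ELx_M | SCTLR_ELx_C | SCTLR_ELx_SA |
                 SCTLR_EL1_ITD | SCTLR_EL1_SED;
  return Val | SCTLR_ELx_I | SCTLR_EL1_SPAN | SCTLR_EL1_RES1;
}

/* Value programmed after the patch: identical except for WXN. */
uint32_t SctlrAfter(void)
{
  return SctlrBefore() | SCTLR_EL1_WXN;
}

/* Simplified model: a mapping is executable unless it is marked XN, or it is
 * writable while WXN is set. */
int MappingIsExecutable(uint32_t Sctlr, int Writable, int Xn)
{
  if (Xn) {
    return 0;
  }
  if ((Sctlr & SCTLR_EL1_WXN) != 0 && Writable) {
    return 0;
  }
  return 1;
}
```

In this model, a writable mapping without an explicit XN attribute is executable under SctlrBefore() and non-executable under SctlrAfter(), which matches the commit message's statement that mappings not explicitly mapped read-only become non-executable.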
Hey Ard,

*Praise* to you for this series. Comments inline.

On Mon, Feb 13, 2023 at 07:19 AM, Ard Biesheuvel wrote:
> Enable the WXN system control bit straight out of reset when running in
> EL1 with the initial ID map from flash. This setting will be inherited
> by the page table code after it sets up the permanent boot time page
> tables, resulting in all memory mappings that are not explicitly mapped
> as read-only to be non-executable.
>
> Note that this requires runtime drivers to be built with position
> independent codegen, to ensure that all absolute symbol references are
> moved into a separate section in the binary. Otherwise, unmapping the
> pages that are subject to relocation fixups at runtime (during the
> invocation of SetVirtualAddressMap()) could result in code mappings
> losing their executable permissions.

I never actually thought about this. SetVirtualAddressMap() will have to relocate its own parent binary, causing issues for software W^X when .text relocs are present (like with MSVC builds). :(

> Signed-off-by: Ard Biesheuvel <ardb@...>
> ---
> ArmVirtPkg/ArmVirt.dsc.inc | 1 +
> ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S | 2 +-
> 2 files changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/ArmVirtPkg/ArmVirt.dsc.inc b/ArmVirtPkg/ArmVirt.dsc.inc
> index 5b18184be263..928dd6330edb 100644
> --- a/ArmVirtPkg/ArmVirt.dsc.inc
> +++ b/ArmVirtPkg/ArmVirt.dsc.inc
> @@ -31,6 +31,7 @@ [BuildOptions.common.EDKII.DXE_CORE,BuildOptions.common.EDKII.DXE_DRIVER,BuildOp
>
> [BuildOptions.common.EDKII.DXE_RUNTIME_DRIVER]
> GCC:*_*_ARM_DLINK_FLAGS = -z common-page-size=0x1000
> + GCC:*_*_AARCH64_CC_FLAGS = -fpie

Doesn't this mean -pie must be passed to the linker? I saw in the previous patch that .plt was added to the linker script; was there a particular reason -fno-plt wasn't used here? I just read it may have some unexpected side-effects, but I thought it would be safe for our statically-linked UEFI environment.

On another (related) matter, I've been spending my last two days looking into the whole ELF-to-PE process, because GenFw has been becoming unbearable to us downstream. I went through a bunch of old commits which deal with PIE and saw it was usually disabled but for X64. The funny thing with X64 (even currently) is that -fpie is combined with -q (a.k.a. --emit-relocs), yielding both object file relocs (.rela.sectname) and PIE-related relative relocs (.rela) in the same binary (as documented in GenFw, they may overlap!). It's my understanding that GenFw currently processes exclusively the -q relocs and not the -fpie relocs (which should be safe as done for X64, I have no experience with ARM whatsoever). However, when PIE is involved anyway, it makes most sense to me to use its related relocs for the translation over a dance with the object file relocs. This change will cause the same behaviour for AARCH64 RT drivers now, right?

In an ideal world, I suppose all architectures but IA32 (due to lacking efficient pcrel addressing) should be using PIE, as most (often all with X64) GOT references can be relaxed, as we strictly deal with local symbols. Though I have to wonder how unideal the world really is. :)

Best regards,
Marvin

> GCC:*_*_AARCH64_DLINK_FLAGS = -z common-page-size=0x10000
>
> [LibraryClasses.common]
> diff --git a/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S b/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S
> index 5ac7c732f6ec..51c089a45ffc 100644
> --- a/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S
> +++ b/ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S
> @@ -38,7 +38,7 @@
> .set SCTLR_EL1_ITD, 0x1 << 7
> .set SCTLR_EL1_RES1, (0x1 << 11) | (0x1 << 20) | (0x1 << 22) | (0x1 << 28) | (0x1 << 29)
> .set sctlrval, SCTLR_ELx_M | SCTLR_ELx_C | SCTLR_ELx_SA | SCTLR_EL1_ITD | SCTLR_EL1_SED
> - .set sctlrval, sctlrval | SCTLR_ELx_I | SCTLR_EL1_SPAN | SCTLR_EL1_RES1
> + .set sctlrval, sctlrval | SCTLR_ELx_I | SCTLR_EL1_SPAN | SCTLR_EL1_RES1 | SCTLR_EL1_WXN
>
>
> ASM_FUNC(ArmPlatformPeiBootAction)
> --
> 2.39.1
On Mon, 13 Feb 2023 at 22:16, Marvin Häuser <mhaeuser@posteo.de> wrote:
>
> Hey Ard,
>
> *Praise* to you for this series. Comments inline.
>

Thanks :-)

> On Mon, Feb 13, 2023 at 07:19 AM, Ard Biesheuvel wrote:
>
> > Enable the WXN system control bit straight out of reset when running in
> > EL1 with the initial ID map from flash. This setting will be inherited
> > by the page table code after it sets up the permanent boot time page
> > tables, resulting in all memory mappings that are not explicitly mapped
> > as read-only to be non-executable.
> >
> > Note that this requires runtime drivers to be built with position
> > independent codegen, to ensure that all absolute symbol references are
> > moved into a separate section in the binary. Otherwise, unmapping the
> > pages that are subject to relocation fixups at runtime (during the
> > invocation of SetVirtualAddressMap()) could result in code mappings
> > losing their executable permissions.
>
> I never actually thought about this. SetVirtualAddressMap() will have to relocate its own parent binary, causing issues for software W^X when .text relocs are present (like with MSVC builds). :(
>
> > Signed-off-by: Ard Biesheuvel <ardb@...>
> > ---
> > ArmVirtPkg/ArmVirt.dsc.inc | 1 +
> > ArmVirtPkg/Library/ArmPlatformLibQemu/AArch64/ArmPlatformHelper.S | 2 +-
> > 2 files changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/ArmVirtPkg/ArmVirt.dsc.inc b/ArmVirtPkg/ArmVirt.dsc.inc
> > index 5b18184be263..928dd6330edb 100644
> > --- a/ArmVirtPkg/ArmVirt.dsc.inc
> > +++ b/ArmVirtPkg/ArmVirt.dsc.inc
> > @@ -31,6 +31,7 @@ [BuildOptions.common.EDKII.DXE_CORE,BuildOptions.common.EDKII.DXE_DRIVER,BuildOp
> >
> > [BuildOptions.common.EDKII.DXE_RUNTIME_DRIVER]
> > GCC:*_*_ARM_DLINK_FLAGS = -z common-page-size=0x1000
> > + GCC:*_*_AARCH64_CC_FLAGS = -fpie
>
> Doesn't this mean -pie must be passed to the linker? I saw in the previous patch that .plt was added to the linker script; was there a particular reason -fno-plt wasn't used here? I just read it may have some unexpected side-effects, but I thought it would be safe for our statically-linked UEFI environment.
>

No, the only reason for adding -fpie here is to ensure that statically
initialized CONST pointers are emitted into .data.rel.ro and not into
.rodata, as this is under the control of the compiler. Although,
thinking about this, I wonder if we need to pass this to the linker
for codegen under LTO as well. But the PIE link itself should be
unnecessary here.

> On another (related) matter, I've been spending my last two days looking into the whole ELF-to-PE process, because GenFw has been becoming unbearable to us downstream. I went through a bunch of old commits which deal with PIE and saw it was usually disabled but for X64. The funny thing with X64 (even currently) is that -fpie is combined with -q (a.k.a. --emit-relocs), yielding both object file relocs (.rela.sectname) and PIE-related relative relocs (.rela) in the same binary (as documented in GenFw, they may overlap!). It's my understanding that GenFw currently processes exclusively the -q relocs and not the -fpie relocs (which should be safe as done for X64, I have no experience with ARM whatsoever). However, when PIE is involved anyway, it makes most sense to me to use its related relocs for the translation over a dance with the object file relocs. This change will cause the same behaviour for AARCH64 RT drivers now, right?
>

It will if you pass -pie to the linker, which is why I would prefer to
avoid that. The main issue IIRC is that the emit-relocs section does
not cover the entries in the GOT that also require relocation,
and are only covered by the PIE .rela section. For AArch64, I added
relaxation logic to GenFw to actually patch the instructions instead,
which is always possible given the absence of dynamic linking.
(d2687f23c909475d80cef32cdf9a5d121f0a9ae6,
7b8f69d7e10628d473dd225224d8c2122d25a38d)

This means that we don't have to care about compiler generated symbol
references, and so the relocs emitted by emit-relocs are sufficient,
and the additional ones emitted into .rela are unused anyway. The only
remaining absolute references are the ones resulting from statically
initialized globals, and those will either be in .data or in
.data.rel.ro (if -fpie is being used).

But I agree that not using --emit-relocs and only relying on the .rela
section to populate the PE/COFF reloc section would be far cleaner.

> In an ideal world, I suppose all architectures but IA32 (due to lacking efficient pcrel addressing) should be using PIE, as most (often all with X64) GOT references can be relaxed, as we strictly deal with local symbols. Though I have to wonder how unideal the world really is. :)
>
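The GOT relaxation mentioned in this message (patching the instructions so that no GOT entry needs relocating) can be sketched roughly as follows. This is not the code from the cited GenFw commits; it is a minimal illustration that assumes the common AArch64 sequence of an ADRP followed by a 64-bit LDR with an unsigned immediate offset that loads the symbol address from the GOT. The function names are hypothetical. The sketch covers only the second instruction: the LDR is turned into an ADD that computes the symbol address directly. Retargeting the preceding ADRP from the page of the GOT entry to the page of the symbol is a required second step that is not shown.

```c
#include <stdint.h>

/* Top ten bits of "LDR Xt, [Xn, #imm12 * 8]" (64-bit, unsigned offset). */
#define A64_LDR64_UIMM_MASK  UINT32_C(0xFFC00000)
#define A64_LDR64_UIMM_BITS  UINT32_C(0xF9400000)
/* Opcode bits of "ADD Xd, Xn, #imm12" (64-bit, no shift). */
#define A64_ADD64_IMM_BITS   UINT32_C(0x91000000)

/* Returns nonzero if Insn is a 64-bit LDR with unsigned immediate offset. */
int IsLdr64UnsignedImm(uint32_t Insn)
{
  return (Insn & A64_LDR64_UIMM_MASK) == A64_LDR64_UIMM_BITS;
}

/* Rewrites "LDR Xt, [Xn, #got_offset]" into "ADD Xt, Xn, #(SymAddr & 0xfff)".
 * Rn (bits 9:5) and Rt/Rd (bits 4:0) are preserved. Any other instruction is
 * returned unchanged. */
uint32_t RelaxGotLoadToAdd(uint32_t Insn, uint64_t SymAddr)
{
  uint32_t Regs;
  uint32_t Lo12;

  if (!IsLdr64UnsignedImm(Insn)) {
    return Insn;
  }
  Regs = Insn & UINT32_C(0x3FF);
  Lo12 = (uint32_t)(SymAddr & 0xFFF);
  return A64_ADD64_IMM_BITS | (Lo12 << 10) | Regs;
}
```

After such a rewrite the code no longer reads the GOT slot, so the absolute address stored there does not need a PE/COFF relocation. That is consistent with the statement above that the relocs from --emit-relocs are then sufficient.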
> On 13. Feb 2023, at 22:59, Ard Biesheuvel <ardb@kernel.org> wrote:
>
> No, the only reason for adding -fpie here is to ensure that statically
> initialized CONST pointers are emitted into .data.rel.ro and not into
> .rodata, as this is under the control of the compiler. Although,
> thinking about this, I wonder if we need to pass this to the linker
> for codegen under LTO as well. But the PIE link itself should be
> unnecessary here.

Oh, what fun. For some reason I thought it would be unsafe to specify -fpie but not -pie, but considering PIE relocs are ignored either way, this actually makes perfect sense. Sorry!

About that last part, the docs say: "It is recommended that you compile all the files participating in the same link with the same options and also specify those options at link time." [1], so good catch!

[1] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#Optimize-Options

But what about -fno-plt?

> It will if you pass -pie to the linker, which is why I would prefer to
> avoid that. The main issue IIRC is that the emit-relocs section does
> not cover the entries in the GOT that also require relocation,
> and are only covered by the PIE .rela section. For AArch64, I added
> relaxation logic to GenFw to actually patch the instructions instead,
> which is always possible given the absence of dynamic linking.
> (d2687f23c909475d80cef32cdf9a5d121f0a9ae6,
> 7b8f69d7e10628d473dd225224d8c2122d25a38d)

Yes, seen, very nice. I do wonder though why GOT entries are generated in the first place when symbols are all local and data is within the PC-addressable range. Just today, for an X64 build, I actually saw Clang relax a GOT reference to __stack_chk_guard itself.

> This means that we don't have to care about compiler generated symbol
> references, and so the relocs emitted by emit-relocs are sufficient,
> and the additional ones emitted into .rela are unused anyway. The only
> remaining absolute references are the ones resulting from statically
> initialized globals, and those will either be in .data or in
> .data.rel.ro (if -fpie is being used).

Right, thank you.

Best regards,
Marvin
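The point settled in this exchange, that -fpie is used only to steer statically initialized const pointers away from .rodata, can be illustrated with a small C translation unit. All identifiers below are hypothetical. The section names in the comments describe what GCC is expected to do for an ELF target; this is an assumption worth verifying on the toolchain in use, for example by compiling with and without -fpie and comparing the section headers reported by objdump -h.

```c
/* File-local string data: contains no addresses, so it needs no relocation
 * and can stay in .rodata with or without -fpie. */
static const char mBanner[] = "runtime driver";

/* A const pointer with a static initializer stores an absolute address.
 * That slot must be fixed up when the image is loaded and again during
 * SetVirtualAddressMap(). Without -fpie the compiler may place it in
 * .rodata next to genuinely constant data; with -fpie it is expected to go
 * into the .data.rel.ro family of sections instead. */
const char *const gBannerPtr = mBanner;

/* Plain constant data: no address involved, no relocation needed. */
const unsigned int gMagic = 0x4e585857u;

/* Accessor, so that the pointer is actually read through its stored value. */
const char *GetBanner(void)
{
  return gBannerPtr;
}
```

Keeping every relocated slot in its own section means that pages holding code and pure constants never need to be written at runtime, which is the property the commit message relies on once WXN is enabled.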
On Mon, 13 Feb 2023 at 23:23, Marvin Häuser <mhaeuser@posteo.de> wrote:
>
> > On 13. Feb 2023, at 22:59, Ard Biesheuvel <ardb@kernel.org> wrote:
> >
> > No, the only reason for adding -fpie here is to ensure that statically
> > initialized CONST pointers are emitted into .data.rel.ro and not into
> > .rodata, as this is under the control of the compiler. Although,
> > thinking about this, I wonder if we need to pass this to the linker
> > for codegen under LTO as well. But the PIE link itself should be
> > unnecessary here.
>
> Oh, what fun. For some reason I thought it would be unsafe to specify -fpie but not -pie, but considering PIE relocs are ignored either way, this actually makes perfect sense. Sorry!
>
> About that last part, the docs say: "It is recommended that you compile all the files participating in the same link with the same options and also specify those options at link time." [1], so good catch!
>
> [1] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#Optimize-Options
>

Yeah, but come to think of it, we pass the CC flags to the linker, so
this should be covered.

> But what about -fno-plt?
>

That doesn't make a difference in codegen on AArch64 - the same BL
instruction is emitted along with the same type of relocation, and it
is left up to the linker whether or not a PLT entry is emitted, and I
don't think it will ever do that when generating a fully linked binary
without -pie.

> > It will if you pass -pie to the linker, which is why I would prefer to
> > avoid that. The main issue IIRC is that the emit-relocs section does
> > not cover the entries in the GOT that also require relocation,
> > and are only covered by the PIE .rela section. For AArch64, I added
> > relaxation logic to GenFw to actually patch the instructions instead,
> > which is always possible given the absence of dynamic linking.
> > (d2687f23c909475d80cef32cdf9a5d121f0a9ae6,
> > 7b8f69d7e10628d473dd225224d8c2122d25a38d)
>
> Yes, seen, very nice. I do wonder though why GOT entries are generated in the first place when symbols are all local and data is within the PC-addressable range. Just today, for an X64 build, I actually saw Clang relax a GOT reference to __stack_chk_guard itself.
>

Yeah, there are some strange corner cases, but in general, all symbol
references are relative - it is precisely the 'special' symbol
references that the compiler generates internally that sometimes get
this wrong.

> > This means that we don't have to care about compiler generated symbol
> > references, and so the relocs emitted by emit-relocs are sufficient,
> > and the additional ones emitted into .rela are unused anyway. The only
> > remaining absolute references are the ones resulting from statically
> > initialized globals, and those will either be in .data or in
> > .data.rel.ro (if -fpie is being used).
>
> Right, thank you.
>
> Best regards,
> Marvin
>