BaseTools/Source/C/GenFw/Elf64Convert.c | 1 + 1 file changed, 1 insertion(+)
In the R_AARCH64_ADR_GOT_PAGE case on AARCH64, be sure to change the
opcode to ADRP. Prior to this change, we updated the address, but not
the opcode.
This resolves an issue experienced when building a StandaloneMm image
with stack protection enabled on GCC 10.5. This scenario generates an
ADR where an ADRP is more common in other versions of GCC tested. That
explains the obscurity of the issue. However, an ADR is valid and
should be handled by GenFw.
Using the stack protection scenario as an example, the following code is
being generated by the toolchain:
# Load to set the stack canary
2ffc: 10028020 adr x0, 8000 <mErrorString+0x1bc>
3008: f940d400 ldr x0, [x0, #424]
# Load to check the stack canary
30cc: b0000020 adrp x0, 8000 <mErrorString+0x1bc>
30d0: f940d400 ldr x0, [x0, #424]
GenFw rewrote that to:
# Load to set the stack canary
2ffc: 10000480 adr x0, 0x308c
3008: 912ec000 add x0, x0, #0xbb0
# Load to check the stack canary
30cc: f0000460 adrp x0, 0x92000
30d0: 912ec000 add x0, x0, #0xbb0
Note that we're now setting the stack canary from the wrong address,
resulting in an erroneous stack fault.
After this fix, the opcode is also updated, so GenFw rewrites it to:
2ffc: 90000480 adrp x0, 0x92000
3008: 912ec000 add x0, x0, #0xbb0
And the stack canary is set correctly.
Signed-off-by: Jake Garver <jake@nvidia.com>
---
BaseTools/Source/C/GenFw/Elf64Convert.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/BaseTools/Source/C/GenFw/Elf64Convert.c b/BaseTools/Source/C/GenFw/Elf64Convert.c
index 9911db65af..4669ac3a2d 100644
--- a/BaseTools/Source/C/GenFw/Elf64Convert.c
+++ b/BaseTools/Source/C/GenFw/Elf64Convert.c
@@ -1565,6 +1565,7 @@ WriteSections64 (
Offset = (Sym->st_value - (Rel->r_offset & ~0xfff)) >> 12;
*(UINT32 *)Targ &= 0x9000001f;
+ *(UINT32 *)Targ |= 0x90000000;
*(UINT32 *)Targ |= ((Offset & 0x1ffffc) << (5 - 2)) | ((Offset & 0x3) << 29);
/* fall through */
--
2.34.1
-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#110120): https://edk2.groups.io/g/devel/message/110120
Mute This Topic: https://groups.io/mt/102202314/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-
On Thu, Oct 26, 2023 at 4:32 PM Jake Garver via groups.io <jake=nvidia.com@groups.io> wrote: > > In the R_AARCH64_ADR_GOT_PAGE case on AARCH64, be sure to change the > opcode to ADRP. Prior to this change, we updated the address, but not > the opcode. > > This resolves an issue experienced when building a StandaloneMm image > with stack protection enabled on GCC 10.5. This scenario generates an > ADR where an ADRP is more common in other versions of GCC tested. That > explains the obscurity of the issue. However, an ADR is valid and > should be handled by GenFw. Is this not a toolchain bug? The aarch64 ELF ABI (https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst) says: "Set the immediate value of an ADRP to bits [32:12] of X; check that –232 <= X < 232" So it mentions this relocation pointing *to an ADRP* specifically. And AFAIK there's no way Page(G(GDAT(S+A)))-Page(P) would ever make sense if we're looking at an adr and not an adrp. Can you post the full disassembly for the function, plus relevant relocations? -- Pedro -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#110140): https://edk2.groups.io/g/devel/message/110140 Mute This Topic: https://groups.io/mt/102202314/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
Thanks for your response, Pedro. I chased this as a toolchain bug originally, but concluded that the ADR indeed works before GenFw rewrites it. But I see your point regarding the relocation statement. As requested, below is the disassembled function along with relocations. This was generated from the dll using "aarch64-linux-gnu-objdump -r -D". The ADR in question is at 2ffc. This code was generated by the current version of the GCC 10.x toolchain on Ubuntu20. So, if we're concluding this is a toolchain issue, then it's with a fairly "stock" toolchain. 0000000000002fec <fdt_path_offset>: 2fec: a9b97bfd stp x29, x30, [sp, #-112]! 2ff0: 910003fd mov x29, sp 2ff4: a90363f7 stp x23, x24, [sp, #48] 2ff8: aa0003f8 mov x24, x0 2ffc: 10020020 adr x0, 7000 <_cont+0xe98> 2ffc: R_AARCH64_ADR_GOT_PAGE __stack_chk_guard 3000: a90153f3 stp x19, x20, [sp, #16] 3004: aa0103f3 mov x19, x1 3008: f946cc00 ldr x0, [x0, #3480] 3008: R_AARCH64_LD64_GOT_LO12_NC __stack_chk_guard 300c: a9025bf5 stp x21, x22, [sp, #32] 3010: a9046bf9 stp x25, x26, [sp, #64] 3014: f9002bfb str x27, [sp, #80] 3018: f9400001 ldr x1, [x0] 301c: f90037e1 str x1, [sp, #104] 3020: d2800001 mov x1, #0x0 // #0 3024: aa1303e0 mov x0, x19 3028: 97fffd04 bl 2438 <AsciiStrLen> 3028: R_AARCH64_CALL26 .text+0x1438 302c: aa0003f5 mov x21, x0 3030: aa1803e0 mov x0, x24 3034: 97fff807 bl 1050 <fdt_check_header> 3034: R_AARCH64_CALL26 .text+0x50 3038: 2a0003f4 mov w20, w0 303c: 35000260 cbnz w0, 3088 <fdt_path_offset+0x9c> 3040: 39400260 ldrb w0, [x19] 3044: 93407ea1 sxtw x1, w21 3048: 8b35c275 add x21, x19, w21, sxtw 304c: 7100bc1f cmp w0, #0x2f 3050: 54000420 b.eq 30d4 <fdt_path_offset+0xe8> // b.none 3054: 528005e2 mov w2, #0x2f // #47 3058: aa1303e0 mov x0, x19 305c: 97ffff2f bl 2d18 <ScanMem8> 305c: R_AARCH64_CALL26 .text+0x1d18 3060: f100001f cmp x0, #0x0 3064: 9a951016 csel x22, x0, x21, ne // ne = any 3068: f0000001 adrp x1, 6000 <__stack_chk_fail+0x7c> 3068: R_AARCH64_ADR_PREL_PG_HI21 .text+0x58b6 306c: 9122d821 add x1, x1, #0x8b6 306c: R_AARCH64_ADD_ABS_LO12_NC .text+0x58b6 3070: aa1803e0 mov x0, x24 3074: cb1302d4 sub x20, x22, x19 3078: 97ffffdd bl 2fec <fdt_path_offset> 307c: 2a0003e1 mov w1, w0 3080: 36f80140 tbz w0, #31, 30a8 <fdt_path_offset+0xbc> 3084: 12800094 mov w20, #0xfffffffb // #-5 3088: 90000020 adrp x0, 7000 <_cont+0xe98> 3088: R_AARCH64_ADR_GOT_PAGE __stack_chk_guard 308c: f946cc00 ldr x0, [x0, #3480] 308c: R_AARCH64_LD64_GOT_LO12_NC __stack_chk_guard 3090: f94037e1 ldr x1, [sp, #104] 3094: f9400002 ldr x2, [x0] 3098: eb020021 subs x1, x1, x2 309c: d2800002 mov x2, #0x0 // #0 30a0: 54000a00 b.eq 31e0 <fdt_path_offset+0x1f4> // b.none 30a4: 94000bb8 bl 5f84 <__stack_chk_fail> 30a4: R_AARCH64_CALL26 __stack_chk_fail 30a8: 2a1403e3 mov w3, w20 30ac: aa1303e2 mov x2, x19 30b0: aa1803e0 mov x0, x24 30b4: d2800004 mov x4, #0x0 // #0 30b8: 97ffff72 bl 2e80 <fdt_get_property_namelen> 30b8: R_AARCH64_CALL26 .text+0x1e80 30bc: b4fffe40 cbz x0, 3084 <fdt_path_offset+0x98> 30c0: 91003001 add x1, x0, #0xc 30c4: aa1603f3 mov x19, x22 30c8: aa1803e0 mov x0, x24 30cc: 97ffffc8 bl 2fec <fdt_path_offset> 30d0: 2a0003f4 mov w20, w0 30d4: 910193fa add x26, sp, #0x64 30d8: 14000037 b 31b4 <fdt_path_offset+0x1c8> 30dc: 910006d6 add x22, x22, #0x1 30e0: eb1602bf cmp x21, x22 30e4: 54fffd20 b.eq 3088 <fdt_path_offset+0x9c> // b.none 30e8: 394002c0 ldrb w0, [x22] 30ec: 7100bc1f cmp w0, #0x2f 30f0: 54ffff60 b.eq 30dc <fdt_path_offset+0xf0> // b.none 30f4: cb1602a1 sub x1, x21, x22 30f8: aa1603e0 mov x0, x22 30fc: 528005e2 mov w2, #0x2f // #47 3100: 97ffff06 bl 2d18 <ScanMem8> 3100: R_AARCH64_CALL26 .text+0x1d18 3104: f100001f cmp x0, #0x0 3108: 9a951013 csel x19, x0, x21, ne // ne = any 310c: aa1803e0 mov x0, x24 3110: 97fff7d0 bl 1050 <fdt_check_header> 3110: R_AARCH64_CALL26 .text+0x50 3114: 35000620 cbnz w0, 31d8 <fdt_path_offset+0x1ec> 3118: 4b160277 sub w23, w19, w22 311c: b90067ff str wzr, [sp, #100] 3120: 110006fb add w27, w23, #0x1 3124: 93407ef7 sxtw x23, w23 3128: b94067e0 ldr w0, [sp, #100] 312c: 7100029f cmp w20, #0x0 3130: 7a40a801 ccmp w0, #0x0, #0x1, ge // ge = tcont 3134: 5400008a b.ge 3144 <fdt_path_offset+0x158> // b.tcont 3138: 36f803c0 tbz w0, #31, 31b0 <fdt_path_offset+0x1c4> 313c: 12800014 mov w20, #0xffffffff // #-1 3140: 17ffffd2 b 3088 <fdt_path_offset+0x9c> 3144: 7100041f cmp w0, #0x1 3148: 540000e0 b.eq 3164 <fdt_path_offset+0x178> // b.none 314c: 2a1403e1 mov w1, w20 3150: aa1a03e2 mov x2, x26 3154: aa1803e0 mov x0, x24 3158: 97fff87c bl 1348 <fdt_next_node> 3158: R_AARCH64_CALL26 .text+0x348 315c: 2a0003f4 mov w20, w0 3160: 17fffff2 b 3128 <fdt_path_offset+0x13c> 3164: 2a1b03e2 mov w2, w27 3168: 11001281 add w1, w20, #0x4 316c: aa1803e0 mov x0, x24 3170: 97fff7da bl 10d8 <fdt_offset_ptr> 3170: R_AARCH64_CALL26 .text+0xd8 3174: aa0003f9 mov x25, x0 3178: b4fffea0 cbz x0, 314c <fdt_path_offset+0x160> 317c: f10002ff cmp x23, #0x0 3180: fa4012c4 ccmp x22, x0, #0x4, ne // ne = any 3184: 54000201 b.ne 31c4 <fdt_path_offset+0x1d8> // b.any 3188: 38776b20 ldrb w0, [x25, x23] 318c: 34000120 cbz w0, 31b0 <fdt_path_offset+0x1c4> 3190: aa1703e1 mov x1, x23 3194: aa1603e0 mov x0, x22 3198: 52800802 mov w2, #0x40 // #64 319c: 97fffedf bl 2d18 <ScanMem8> 319c: R_AARCH64_CALL26 .text+0x1d18 31a0: b5fffd60 cbnz x0, 314c <fdt_path_offset+0x160> 31a4: 38776b20 ldrb w0, [x25, x23] 31a8: 7101001f cmp w0, #0x40 31ac: 54fffd01 b.ne 314c <fdt_path_offset+0x160> // b.any 31b0: 37fff6d4 tbnz w20, #31, 3088 <fdt_path_offset+0x9c> 31b4: eb1302bf cmp x21, x19 31b8: 54fff689 b.ls 3088 <fdt_path_offset+0x9c> // b.plast 31bc: aa1303f6 mov x22, x19 31c0: 17ffffca b 30e8 <fdt_path_offset+0xfc> 31c4: aa1703e2 mov x2, x23 31c8: aa1603e1 mov x1, x22 31cc: 97fffef7 bl 2da8 <CompareMem.part.0> 31cc: R_AARCH64_CALL26 .text+0x1da8 31d0: 35fffbe0 cbnz w0, 314c <fdt_path_offset+0x160> 31d4: 17ffffed b 3188 <fdt_path_offset+0x19c> 31d8: 2a0003f4 mov w20, w0 31dc: 17fffff5 b 31b0 <fdt_path_offset+0x1c4> 31e0: 2a1403e0 mov w0, w20 31e4: a94153f3 ldp x19, x20, [sp, #16] 31e8: a9425bf5 ldp x21, x22, [sp, #32] 31ec: a94363f7 ldp x23, x24, [sp, #48] 31f0: a9446bf9 ldp x25, x26, [sp, #64] 31f4: f9402bfb ldr x27, [sp, #80] 31f8: a8c77bfd ldp x29, x30, [sp], #112 31fc: d65f03c0 ret 3200: 14000400 b 4200 <MmioWrite32.isra.0> 3204: d503201f nop 3208: f946cc00 ldr x0, [x0, #3480] 320c: 17ffff80 b 300c <fdt_path_offset+0x20> Jake ________________________________ From: Pedro Falcato <pedro.falcato@gmail.com> Sent: Thursday, October 26, 2023 2:46 PM To: devel@edk2.groups.io <devel@edk2.groups.io>; Jake Garver <jake@nvidia.com> Cc: rebecca@bsdio.com <rebecca@bsdio.com>; gaoliming@byosoft.com.cn <gaoliming@byosoft.com.cn>; bob.c.feng@intel.com <bob.c.feng@intel.com>; yuwei.chen@intel.com <yuwei.chen@intel.com>; ardb+tianocore@kernel.org <ardb+tianocore@kernel.org> Subject: Re: [edk2-devel] [PATCH] BaseTools/GenFw: Change opcode when converting ADR to ADRP External email: Use caution opening links or attachments On Thu, Oct 26, 2023 at 4:32 PM Jake Garver via groups.io <jake=nvidia.com@groups.io> wrote: > > In the R_AARCH64_ADR_GOT_PAGE case on AARCH64, be sure to change the > opcode to ADRP. Prior to this change, we updated the address, but not > the opcode. > > This resolves an issue experienced when building a StandaloneMm image > with stack protection enabled on GCC 10.5. This scenario generates an > ADR where an ADRP is more common in other versions of GCC tested. That > explains the obscurity of the issue. However, an ADR is valid and > should be handled by GenFw. Is this not a toolchain bug? The aarch64 ELF ABI (https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FARM-software%2Fabi-aa%2Fblob%2Fmain%2Faaelf64%2Faaelf64.rst&data=05%7C01%7Cjake%40nvidia.com%7C607b433f7ddc40266e3c08dbd653f1c1%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C638339428273243439%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=6DcbbXjY4UfjTwAUpd525B%2BvqPEWkZ1RStdc62%2F2er4%3D&reserved=0<https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst>) says: "Set the immediate value of an ADRP to bits [32:12] of X; check that –232 <= X < 232" So it mentions this relocation pointing *to an ADRP* specifically. And AFAIK there's no way Page(G(GDAT(S+A)))-Page(P) would ever make sense if we're looking at an adr and not an adrp. Can you post the full disassembly for the function, plus relevant relocations? -- Pedro -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#110205): https://edk2.groups.io/g/devel/message/110205 Mute This Topic: https://groups.io/mt/102202314/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
Hello all, Apologies for the late response. On Fri, 27 Oct 2023 at 14:44, Jake Garver via groups.io <jake=nvidia.com@groups.io> wrote: > > Thanks for your response, Pedro. > > I chased this as a toolchain bug originally, but concluded that the ADR indeed works before GenFw rewrites it. But I see your point regarding the relocation statement. > > As requested, below is the disassembled function along with relocations. This was generated from the dll using "aarch64-linux-gnu-objdump -r -D". The ADR in question is at 2ffc. > Can you double check the object file? I suspect this is a linker relaxation not a compiler issue. > This code was generated by the current version of the GCC 10.x toolchain on Ubuntu20. So, if we're concluding this is a toolchain issue, then it's with a fairly "stock" toolchain. > > 0000000000002fec <fdt_path_offset>: > 2fec: a9b97bfd stp x29, x30, [sp, #-112]! > 2ff0: 910003fd mov x29, sp > 2ff4: a90363f7 stp x23, x24, [sp, #48] > 2ff8: aa0003f8 mov x24, x0 > 2ffc: 10020020 adr x0, 7000 <_cont+0xe98> > 2ffc: R_AARCH64_ADR_GOT_PAGE __stack_chk_guard The nasty thing with relying on --emit-relocs is that they get out of sync. R_AARCH64_ADR_GOT_PAGE is documented as applying to ADRP only, so this code is non-compliant one way or the other. There are other relaxations that also confuse GenFw when the static relocs don't get updated accordingly. > 3000: a90153f3 stp x19, x20, [sp, #16] > 3004: aa0103f3 mov x19, x1 > 3008: f946cc00 ldr x0, [x0, #3480] > 3008: R_AARCH64_LD64_GOT_LO12_NC __stack_chk_guard > 300c: a9025bf5 stp x21, x22, [sp, #32] > 3010: a9046bf9 stp x25, x26, [sp, #64] > 3014: f9002bfb str x27, [sp, #80] > 3018: f9400001 ldr x1, [x0] > 301c: f90037e1 str x1, [sp, #104] > 3020: d2800001 mov x1, #0x0 // #0 > 3024: aa1303e0 mov x0, x19 > 3028: 97fffd04 bl 2438 <AsciiStrLen> > 3028: R_AARCH64_CALL26 .text+0x1438 > 302c: aa0003f5 mov x21, x0 > 3030: aa1803e0 mov x0, x24 > 3034: 97fff807 bl 1050 <fdt_check_header> > 3034: R_AARCH64_CALL26 .text+0x50 > 3038: 2a0003f4 mov w20, w0 > 303c: 35000260 cbnz w0, 3088 <fdt_path_offset+0x9c> > 3040: 39400260 ldrb w0, [x19] > 3044: 93407ea1 sxtw x1, w21 > 3048: 8b35c275 add x21, x19, w21, sxtw > 304c: 7100bc1f cmp w0, #0x2f > 3050: 54000420 b.eq 30d4 <fdt_path_offset+0xe8> // b.none > 3054: 528005e2 mov w2, #0x2f // #47 > 3058: aa1303e0 mov x0, x19 > 305c: 97ffff2f bl 2d18 <ScanMem8> > 305c: R_AARCH64_CALL26 .text+0x1d18 > 3060: f100001f cmp x0, #0x0 > 3064: 9a951016 csel x22, x0, x21, ne // ne = any > 3068: f0000001 adrp x1, 6000 <__stack_chk_fail+0x7c> > 3068: R_AARCH64_ADR_PREL_PG_HI21 .text+0x58b6 > 306c: 9122d821 add x1, x1, #0x8b6 > 306c: R_AARCH64_ADD_ABS_LO12_NC .text+0x58b6 > 3070: aa1803e0 mov x0, x24 > 3074: cb1302d4 sub x20, x22, x19 > 3078: 97ffffdd bl 2fec <fdt_path_offset> > 307c: 2a0003e1 mov w1, w0 > 3080: 36f80140 tbz w0, #31, 30a8 <fdt_path_offset+0xbc> > 3084: 12800094 mov w20, #0xfffffffb // #-5 > 3088: 90000020 adrp x0, 7000 <_cont+0xe98> > 3088: R_AARCH64_ADR_GOT_PAGE __stack_chk_guard > 308c: f946cc00 ldr x0, [x0, #3480] > 308c: R_AARCH64_LD64_GOT_LO12_NC __stack_chk_guard > 3090: f94037e1 ldr x1, [sp, #104] > 3094: f9400002 ldr x2, [x0] > 3098: eb020021 subs x1, x1, x2 > 309c: d2800002 mov x2, #0x0 // #0 > 30a0: 54000a00 b.eq 31e0 <fdt_path_offset+0x1f4> // b.none > 30a4: 94000bb8 bl 5f84 <__stack_chk_fail> > 30a4: R_AARCH64_CALL26 __stack_chk_fail > 30a8: 2a1403e3 mov w3, w20 > 30ac: aa1303e2 mov x2, x19 > 30b0: aa1803e0 mov x0, x24 > 30b4: d2800004 mov x4, #0x0 // #0 > 30b8: 97ffff72 bl 2e80 <fdt_get_property_namelen> > 30b8: R_AARCH64_CALL26 .text+0x1e80 > 30bc: b4fffe40 cbz x0, 3084 <fdt_path_offset+0x98> > 30c0: 91003001 add x1, x0, #0xc > 30c4: aa1603f3 mov x19, x22 > 30c8: aa1803e0 mov x0, x24 > 30cc: 97ffffc8 bl 2fec <fdt_path_offset> > 30d0: 2a0003f4 mov w20, w0 > 30d4: 910193fa add x26, sp, #0x64 > 30d8: 14000037 b 31b4 <fdt_path_offset+0x1c8> > 30dc: 910006d6 add x22, x22, #0x1 > 30e0: eb1602bf cmp x21, x22 > 30e4: 54fffd20 b.eq 3088 <fdt_path_offset+0x9c> // b.none > 30e8: 394002c0 ldrb w0, [x22] > 30ec: 7100bc1f cmp w0, #0x2f > 30f0: 54ffff60 b.eq 30dc <fdt_path_offset+0xf0> // b.none > 30f4: cb1602a1 sub x1, x21, x22 > 30f8: aa1603e0 mov x0, x22 > 30fc: 528005e2 mov w2, #0x2f // #47 > 3100: 97ffff06 bl 2d18 <ScanMem8> > 3100: R_AARCH64_CALL26 .text+0x1d18 > 3104: f100001f cmp x0, #0x0 > 3108: 9a951013 csel x19, x0, x21, ne // ne = any > 310c: aa1803e0 mov x0, x24 > 3110: 97fff7d0 bl 1050 <fdt_check_header> > 3110: R_AARCH64_CALL26 .text+0x50 > 3114: 35000620 cbnz w0, 31d8 <fdt_path_offset+0x1ec> > 3118: 4b160277 sub w23, w19, w22 > 311c: b90067ff str wzr, [sp, #100] > 3120: 110006fb add w27, w23, #0x1 > 3124: 93407ef7 sxtw x23, w23 > 3128: b94067e0 ldr w0, [sp, #100] > 312c: 7100029f cmp w20, #0x0 > 3130: 7a40a801 ccmp w0, #0x0, #0x1, ge // ge = tcont > 3134: 5400008a b.ge 3144 <fdt_path_offset+0x158> // b.tcont > 3138: 36f803c0 tbz w0, #31, 31b0 <fdt_path_offset+0x1c4> > 313c: 12800014 mov w20, #0xffffffff // #-1 > 3140: 17ffffd2 b 3088 <fdt_path_offset+0x9c> > 3144: 7100041f cmp w0, #0x1 > 3148: 540000e0 b.eq 3164 <fdt_path_offset+0x178> // b.none > 314c: 2a1403e1 mov w1, w20 > 3150: aa1a03e2 mov x2, x26 > 3154: aa1803e0 mov x0, x24 > 3158: 97fff87c bl 1348 <fdt_next_node> > 3158: R_AARCH64_CALL26 .text+0x348 > 315c: 2a0003f4 mov w20, w0 > 3160: 17fffff2 b 3128 <fdt_path_offset+0x13c> > 3164: 2a1b03e2 mov w2, w27 > 3168: 11001281 add w1, w20, #0x4 > 316c: aa1803e0 mov x0, x24 > 3170: 97fff7da bl 10d8 <fdt_offset_ptr> > 3170: R_AARCH64_CALL26 .text+0xd8 > 3174: aa0003f9 mov x25, x0 > 3178: b4fffea0 cbz x0, 314c <fdt_path_offset+0x160> > 317c: f10002ff cmp x23, #0x0 > 3180: fa4012c4 ccmp x22, x0, #0x4, ne // ne = any > 3184: 54000201 b.ne 31c4 <fdt_path_offset+0x1d8> // b.any > 3188: 38776b20 ldrb w0, [x25, x23] > 318c: 34000120 cbz w0, 31b0 <fdt_path_offset+0x1c4> > 3190: aa1703e1 mov x1, x23 > 3194: aa1603e0 mov x0, x22 > 3198: 52800802 mov w2, #0x40 // #64 > 319c: 97fffedf bl 2d18 <ScanMem8> > 319c: R_AARCH64_CALL26 .text+0x1d18 > 31a0: b5fffd60 cbnz x0, 314c <fdt_path_offset+0x160> > 31a4: 38776b20 ldrb w0, [x25, x23] > 31a8: 7101001f cmp w0, #0x40 > 31ac: 54fffd01 b.ne 314c <fdt_path_offset+0x160> // b.any > 31b0: 37fff6d4 tbnz w20, #31, 3088 <fdt_path_offset+0x9c> > 31b4: eb1302bf cmp x21, x19 > 31b8: 54fff689 b.ls 3088 <fdt_path_offset+0x9c> // b.plast > 31bc: aa1303f6 mov x22, x19 > 31c0: 17ffffca b 30e8 <fdt_path_offset+0xfc> > 31c4: aa1703e2 mov x2, x23 > 31c8: aa1603e1 mov x1, x22 > 31cc: 97fffef7 bl 2da8 <CompareMem.part.0> > 31cc: R_AARCH64_CALL26 .text+0x1da8 > 31d0: 35fffbe0 cbnz w0, 314c <fdt_path_offset+0x160> > 31d4: 17ffffed b 3188 <fdt_path_offset+0x19c> > 31d8: 2a0003f4 mov w20, w0 > 31dc: 17fffff5 b 31b0 <fdt_path_offset+0x1c4> > 31e0: 2a1403e0 mov w0, w20 > 31e4: a94153f3 ldp x19, x20, [sp, #16] > 31e8: a9425bf5 ldp x21, x22, [sp, #32] > 31ec: a94363f7 ldp x23, x24, [sp, #48] > 31f0: a9446bf9 ldp x25, x26, [sp, #64] > 31f4: f9402bfb ldr x27, [sp, #80] > 31f8: a8c77bfd ldp x29, x30, [sp], #112 > 31fc: d65f03c0 ret > 3200: 14000400 b 4200 <MmioWrite32.isra.0> > 3204: d503201f nop > 3208: f946cc00 ldr x0, [x0, #3480] > 320c: 17ffff80 b 300c <fdt_path_offset+0x20> > > Jake > ________________________________ > From: Pedro Falcato <pedro.falcato@gmail.com> > Sent: Thursday, October 26, 2023 2:46 PM > To: devel@edk2.groups.io <devel@edk2.groups.io>; Jake Garver <jake@nvidia.com> > Cc: rebecca@bsdio.com <rebecca@bsdio.com>; gaoliming@byosoft.com.cn <gaoliming@byosoft.com.cn>; bob.c.feng@intel.com <bob.c.feng@intel.com>; yuwei.chen@intel.com <yuwei.chen@intel.com>; ardb+tianocore@kernel.org <ardb+tianocore@kernel.org> > Subject: Re: [edk2-devel] [PATCH] BaseTools/GenFw: Change opcode when converting ADR to ADRP > > External email: Use caution opening links or attachments > > > On Thu, Oct 26, 2023 at 4:32 PM Jake Garver via groups.io > <jake=nvidia.com@groups.io> wrote: > > > > In the R_AARCH64_ADR_GOT_PAGE case on AARCH64, be sure to change the > > opcode to ADRP. Prior to this change, we updated the address, but not > > the opcode. > > > > This resolves an issue experienced when building a StandaloneMm image > > with stack protection enabled on GCC 10.5. This scenario generates an > > ADR where an ADRP is more common in other versions of GCC tested. That > > explains the obscurity of the issue. However, an ADR is valid and > > should be handled by GenFw. > > Is this not a toolchain bug? > The aarch64 ELF ABI > (https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FARM-software%2Fabi-aa%2Fblob%2Fmain%2Faaelf64%2Faaelf64.rst&data=05%7C01%7Cjake%40nvidia.com%7C607b433f7ddc40266e3c08dbd653f1c1%7C43083d15727340c1b7db39efd9ccc17a%7C0%7C0%7C638339428273243439%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=6DcbbXjY4UfjTwAUpd525B%2BvqPEWkZ1RStdc62%2F2er4%3D&reserved=0) > says: > "Set the immediate value of an ADRP to bits [32:12] of X; check that > –232 <= X < 232" > > So it mentions this relocation pointing *to an ADRP* specifically. And > AFAIK there's no way > Page(G(GDAT(S+A)))-Page(P) > > would ever make sense if we're looking at an adr and not an adrp. > > Can you post the full disassembly for the function, plus relevant relocations? > > -- > Pedro > -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#110207): https://edk2.groups.io/g/devel/message/110207 Mute This Topic: https://groups.io/mt/102202314/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
On Fri, Oct 27, 2023 at 2:47 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > Hello all, > > Apologies for the late response. > > On Fri, 27 Oct 2023 at 14:44, Jake Garver via groups.io > <jake=nvidia.com@groups.io> wrote: > > > > Thanks for your response, Pedro. > > > > I chased this as a toolchain bug originally, but concluded that the ADR indeed works before GenFw rewrites it. But I see your point regarding the relocation statement. > > > > As requested, below is the disassembled function along with relocations. This was generated from the dll using "aarch64-linux-gnu-objdump -r -D". The ADR in question is at 2ffc. > > > > Can you double check the object file? I suspect this is a linker > relaxation not a compiler issue. > > > This code was generated by the current version of the GCC 10.x toolchain on Ubuntu20. So, if we're concluding this is a toolchain issue, then it's with a fairly "stock" toolchain. > > > > 0000000000002fec <fdt_path_offset>: > > 2fec: a9b97bfd stp x29, x30, [sp, #-112]! > > 2ff0: 910003fd mov x29, sp > > 2ff4: a90363f7 stp x23, x24, [sp, #48] > > 2ff8: aa0003f8 mov x24, x0 > > 2ffc: 10020020 adr x0, 7000 <_cont+0xe98> > > 2ffc: R_AARCH64_ADR_GOT_PAGE __stack_chk_guard > > The nasty thing with relying on --emit-relocs is that they get out of > sync. R_AARCH64_ADR_GOT_PAGE is documented as applying to ADRP only, > so this code is non-compliant one way or the other. I was narrowing it down to this, but _ADR_GOT_PAGE does not seem to be relaxable as ADRP -> ADR, per the ABI spec (see "Large GOT Indirection". "PC-relative addressing" only applies to _ADR_PREL_PG_HI21...) Maybe it's just not a documented relaxation? Or does it relax as _ADR_GOT_PAGE -> _ADR_PREL_PG_HI21 -> _ADR_PREL_LO21 without updating internal relocation info? > > There are other relaxations that also confuse GenFw when the static > relocs don't get updated accordingly. Wonderful that you can confirm this is probably a linker --emit-relocs issue. So, if this proves to be the case: Acked-by: Pedro Falcato <pedro.falcato@gmail.com> but please rewrite the commit message so that it's clear that this is an --emit-relocs toolchain bug. -- Pedro -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#110209): https://edk2.groups.io/g/devel/message/110209 Mute This Topic: https://groups.io/mt/102202314/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
Hi Ard: > Can you double check the object file? I suspect this is a linker relaxation not a compiler issue. With LTO in play, is there a way to check the object file? It's not in aarch64 assembly. Thanks, Jake ________________________________ From: Ard Biesheuvel <ardb@kernel.org> Sent: Friday, October 27, 2023 9:46 AM To: devel@edk2.groups.io <devel@edk2.groups.io>; Jake Garver <jake@nvidia.com> Cc: Pedro Falcato <pedro.falcato@gmail.com>; rebecca@bsdio.com <rebecca@bsdio.com>; gaoliming@byosoft.com.cn <gaoliming@byosoft.com.cn>; bob.c.feng@intel.com <bob.c.feng@intel.com>; yuwei.chen@intel.com <yuwei.chen@intel.com>; ardb+tianocore@kernel.org <ardb+tianocore@kernel.org> Subject: Re: [edk2-devel] [PATCH] BaseTools/GenFw: Change opcode when converting ADR to ADRP External email: Use caution opening links or attachments Hello all, Apologies for the late response. On Fri, 27 Oct 2023 at 14:44, Jake Garver via groups.io <jake=nvidia.com@groups.io> wrote: > > Thanks for your response, Pedro. > > I chased this as a toolchain bug originally, but concluded that the ADR indeed works before GenFw rewrites it. But I see your point regarding the relocation statement. > > As requested, below is the disassembled function along with relocations. This was generated from the dll using "aarch64-linux-gnu-objdump -r -D". The ADR in question is at 2ffc. > Can you double check the object file? I suspect this is a linker relaxation not a compiler issue. > This code was generated by the current version of the GCC 10.x toolchain on Ubuntu20. So, if we're concluding this is a toolchain issue, then it's with a fairly "stock" toolchain. > > 0000000000002fec <fdt_path_offset>: > 2fec: a9b97bfd stp x29, x30, [sp, #-112]! > 2ff0: 910003fd mov x29, sp > 2ff4: a90363f7 stp x23, x24, [sp, #48] > 2ff8: aa0003f8 mov x24, x0 > 2ffc: 10020020 adr x0, 7000 <_cont+0xe98> > 2ffc: R_AARCH64_ADR_GOT_PAGE __stack_chk_guard The nasty thing with relying on --emit-relocs is that they get out of sync. R_AARCH64_ADR_GOT_PAGE is documented as applying to ADRP only, so this code is non-compliant one way or the other. There are other relaxations that also confuse GenFw when the static relocs don't get updated accordingly. > 3000: a90153f3 stp x19, x20, [sp, #16] > 3004: aa0103f3 mov x19, x1 > 3008: f946cc00 ldr x0, [x0, #3480] > 3008: R_AARCH64_LD64_GOT_LO12_NC __stack_chk_guard > 300c: a9025bf5 stp x21, x22, [sp, #32] > 3010: a9046bf9 stp x25, x26, [sp, #64] > 3014: f9002bfb str x27, [sp, #80] > 3018: f9400001 ldr x1, [x0] > 301c: f90037e1 str x1, [sp, #104] > 3020: d2800001 mov x1, #0x0 // #0 > 3024: aa1303e0 mov x0, x19 > 3028: 97fffd04 bl 2438 <AsciiStrLen> > 3028: R_AARCH64_CALL26 .text+0x1438 > 302c: aa0003f5 mov x21, x0 > 3030: aa1803e0 mov x0, x24 > 3034: 97fff807 bl 1050 <fdt_check_header> > 3034: R_AARCH64_CALL26 .text+0x50 > 3038: 2a0003f4 mov w20, w0 > 303c: 35000260 cbnz w0, 3088 <fdt_path_offset+0x9c> > 3040: 39400260 ldrb w0, [x19] > 3044: 93407ea1 sxtw x1, w21 > 3048: 8b35c275 add x21, x19, w21, sxtw > 304c: 7100bc1f cmp w0, #0x2f > 3050: 54000420 b.eq 30d4 <fdt_path_offset+0xe8> // b.none > 3054: 528005e2 mov w2, #0x2f // #47 > 3058: aa1303e0 mov x0, x19 > 305c: 97ffff2f bl 2d18 <ScanMem8> > 305c: R_AARCH64_CALL26 .text+0x1d18 > 3060: f100001f cmp x0, #0x0 > 3064: 9a951016 csel x22, x0, x21, ne // ne = any > 3068: f0000001 adrp x1, 6000 <__stack_chk_fail+0x7c> > 3068: R_AARCH64_ADR_PREL_PG_HI21 .text+0x58b6 > 306c: 9122d821 add x1, x1, #0x8b6 > 306c: R_AARCH64_ADD_ABS_LO12_NC .text+0x58b6 > 3070: aa1803e0 mov x0, x24 > 3074: cb1302d4 sub x20, x22, x19 > 3078: 97ffffdd bl 2fec <fdt_path_offset> > 307c: 2a0003e1 mov w1, w0 > 3080: 36f80140 tbz w0, #31, 30a8 <fdt_path_offset+0xbc> > 3084: 12800094 mov w20, #0xfffffffb // #-5 > 3088: 90000020 adrp x0, 7000 <_cont+0xe98> > 3088: R_AARCH64_ADR_GOT_PAGE __stack_chk_guard > 308c: f946cc00 ldr x0, [x0, #3480] > 308c: R_AARCH64_LD64_GOT_LO12_NC __stack_chk_guard > 3090: f94037e1 ldr x1, [sp, #104] > 3094: f9400002 ldr x2, [x0] > 3098: eb020021 subs x1, x1, x2 > 309c: d2800002 mov x2, #0x0 // #0 > 30a0: 54000a00 b.eq 31e0 <fdt_path_offset+0x1f4> // b.none > 30a4: 94000bb8 bl 5f84 <__stack_chk_fail> > 30a4: R_AARCH64_CALL26 __stack_chk_fail > 30a8: 2a1403e3 mov w3, w20 > 30ac: aa1303e2 mov x2, x19 > 30b0: aa1803e0 mov x0, x24 > 30b4: d2800004 mov x4, #0x0 // #0 > 30b8: 97ffff72 bl 2e80 <fdt_get_property_namelen> > 30b8: R_AARCH64_CALL26 .text+0x1e80 > 30bc: b4fffe40 cbz x0, 3084 <fdt_path_offset+0x98> > 30c0: 91003001 add x1, x0, #0xc > 30c4: aa1603f3 mov x19, x22 > 30c8: aa1803e0 mov x0, x24 > 30cc: 97ffffc8 bl 2fec <fdt_path_offset> > 30d0: 2a0003f4 mov w20, w0 > 30d4: 910193fa add x26, sp, #0x64 > 30d8: 14000037 b 31b4 <fdt_path_offset+0x1c8> > 30dc: 910006d6 add x22, x22, #0x1 > 30e0: eb1602bf cmp x21, x22 > 30e4: 54fffd20 b.eq 3088 <fdt_path_offset+0x9c> // b.none > 30e8: 394002c0 ldrb w0, [x22] > 30ec: 7100bc1f cmp w0, #0x2f > 30f0: 54ffff60 b.eq 30dc <fdt_path_offset+0xf0> // b.none > 30f4: cb1602a1 sub x1, x21, x22 > 30f8: aa1603e0 mov x0, x22 > 30fc: 528005e2 mov w2, #0x2f // #47 > 3100: 97ffff06 bl 2d18 <ScanMem8> > 3100: R_AARCH64_CALL26 .text+0x1d18 > 3104: f100001f cmp x0, #0x0 > 3108: 9a951013 csel x19, x0, x21, ne // ne = any > 310c: aa1803e0 mov x0, x24 > 3110: 97fff7d0 bl 1050 <fdt_check_header> > 3110: R_AARCH64_CALL26 .text+0x50 > 3114: 35000620 cbnz w0, 31d8 <fdt_path_offset+0x1ec> > 3118: 4b160277 sub w23, w19, w22 > 311c: b90067ff str wzr, [sp, #100] > 3120: 110006fb add w27, w23, #0x1 > 3124: 93407ef7 sxtw x23, w23 > 3128: b94067e0 ldr w0, [sp, #100] > 312c: 7100029f cmp w20, #0x0 > 3130: 7a40a801 ccmp w0, #0x0, #0x1, ge // ge = tcont > 3134: 5400008a b.ge 3144 <fdt_path_offset+0x158> // b.tcont > 3138: 36f803c0 tbz w0, #31, 31b0 <fdt_path_offset+0x1c4> > 313c: 12800014 mov w20, #0xffffffff // #-1 > 3140: 17ffffd2 b 3088 <fdt_path_offset+0x9c> > 3144: 7100041f cmp w0, #0x1 > 3148: 540000e0 b.eq 3164 <fdt_path_offset+0x178> // b.none > 314c: 2a1403e1 mov w1, w20 > 3150: aa1a03e2 mov x2, x26 > 3154: aa1803e0 mov x0, x24 > 3158: 97fff87c bl 1348 <fdt_next_node> > 3158: R_AARCH64_CALL26 .text+0x348 > 315c: 2a0003f4 mov w20, w0 > 3160: 17fffff2 b 3128 <fdt_path_offset+0x13c> > 3164: 2a1b03e2 mov w2, w27 > 3168: 11001281 add w1, w20, #0x4 > 316c: aa1803e0 mov x0, x24 > 3170: 97fff7da bl 10d8 <fdt_offset_ptr> > 3170: R_AARCH64_CALL26 .text+0xd8 > 3174: aa0003f9 mov x25, x0 > 3178: b4fffea0 cbz x0, 314c <fdt_path_offset+0x160> > 317c: f10002ff cmp x23, #0x0 > 3180: fa4012c4 ccmp x22, x0, #0x4, ne // ne = any > 3184: 54000201 b.ne 31c4 <fdt_path_offset+0x1d8> // b.any > 3188: 38776b20 ldrb w0, [x25, x23] > 318c: 34000120 cbz w0, 31b0 <fdt_path_offset+0x1c4> > 3190: aa1703e1 mov x1, x23 > 3194: aa1603e0 mov x0, x22 > 3198: 52800802 mov w2, #0x40 // #64 > 319c: 97fffedf bl 2d18 <ScanMem8> > 319c: R_AARCH64_CALL26 .text+0x1d18 > 31a0: b5fffd60 cbnz x0, 314c <fdt_path_offset+0x160> > 31a4: 38776b20 ldrb w0, [x25, x23] > 31a8: 7101001f cmp w0, #0x40 > 31ac: 54fffd01 b.ne 314c <fdt_path_offset+0x160> // b.any > 31b0: 37fff6d4 tbnz w20, #31, 3088 <fdt_path_offset+0x9c> > 31b4: eb1302bf cmp x21, x19 > 31b8: 54fff689 b.ls 3088 <fdt_path_offset+0x9c> // b.plast > 31bc: aa1303f6 mov x22, x19 > 31c0: 17ffffca b 30e8 <fdt_path_offset+0xfc> > 31c4: aa1703e2 mov x2, x23 > 31c8: aa1603e1 mov x1, x22 > 31cc: 97fffef7 bl 2da8 <CompareMem.part.0> > 31cc: R_AARCH64_CALL26 .text+0x1da8 > 31d0: 35fffbe0 cbnz w0, 314c <fdt_path_offset+0x160> > 31d4: 17ffffed b 3188 <fdt_path_offset+0x19c> > 31d8: 2a0003f4 mov w20, w0 > 31dc: 17fffff5 b 31b0 <fdt_path_offset+0x1c4> > 31e0: 2a1403e0 mov w0, w20 > 31e4: a94153f3 ldp x19, x20, [sp, #16] > 31e8: a9425bf5 ldp x21, x22, [sp, #32] > 31ec: a94363f7 ldp x23, x24, [sp, #48] > 31f0: a9446bf9 ldp x25, x26, [sp, #64] > 31f4: f9402bfb ldr x27, [sp, #80] > 31f8: a8c77bfd ldp x29, x30, [sp], #112 > 31fc: d65f03c0 ret > 3200: 14000400 b 4200 <MmioWrite32.isra.0> > 3204: d503201f nop > 3208: f946cc00 ldr x0, [x0, #3480] > 320c: 17ffff80 b 300c <fdt_path_offset+0x20> > > Jake > ________________________________ > From: Pedro Falcato <pedro.falcato@gmail.com> > Sent: Thursday, October 26, 2023 2:46 PM > To: devel@edk2.groups.io <devel@edk2.groups.io>; Jake Garver <jake@nvidia.com> > Cc: rebecca@bsdio.com <rebecca@bsdio.com>; gaoliming@byosoft.com.cn <gaoliming@byosoft.com.cn>; bob.c.feng@intel.com <bob.c.feng@intel.com>; yuwei.chen@intel.com <yuwei.chen@intel.com>; ardb+tianocore@kernel.org <ardb+tianocore@kernel.org> > Subject: Re: [edk2-devel] [PATCH] BaseTools/GenFw: Change opcode when converting ADR to ADRP > > External email: Use caution opening links or attachments > > > On Thu, Oct 26, 2023 at 4:32 PM Jake Garver via groups.io > <jake=nvidia.com@groups.io> wrote: > > > > In the R_AARCH64_ADR_GOT_PAGE case on AARCH64, be sure to change the > > opcode to ADRP. Prior to this change, we updated the address, but not > > the opcode. > > > > This resolves an issue experienced when building a StandaloneMm image > > with stack protection enabled on GCC 10.5. This scenario generates an > > ADR where an ADRP is more common in other versions of GCC tested. That > > explains the obscurity of the issue. However, an ADR is valid and > > should be handled by GenFw. > > Is this not a toolchain bug? > The aarch64 ELF ABI > (https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst) > says: > "Set the immediate value of an ADRP to bits [32:12] of X; check that > –232 <= X < 232" > > So it mentions this relocation pointing *to an ADRP* specifically. And > AFAIK there's no way > Page(G(GDAT(S+A)))-Page(P) > > would ever make sense if we're looking at an adr and not an adrp. > > Can you post the full disassembly for the function, plus relevant relocations? > > -- > Pedro > -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#110208): https://edk2.groups.io/g/devel/message/110208 Mute This Topic: https://groups.io/mt/102202314/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
On Fri, 27 Oct 2023 at 16:09, Jake Garver via groups.io <jake=nvidia.com@groups.io> wrote: > > Hi Ard: > > > Can you double check the object file? I suspect this is a linker relaxation not a compiler issue. > > With LTO in play, is there a way to check the object file? It's not in aarch64 assembly. > Perhaps not. Could you try adding -Wl,--no-relax to the DLINK_FLAGS for your build and see if it makes a difference? We should be adding that in any case, but it would be good to know if it helps here too. -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#110211): https://edk2.groups.io/g/devel/message/110211 Mute This Topic: https://groups.io/mt/102202314/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
On Fri, Oct 27, 2023 at 3:13 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > On Fri, 27 Oct 2023 at 16:09, Jake Garver via groups.io > <jake=nvidia.com@groups.io> wrote: > > > > Hi Ard: > > > > > Can you double check the object file? I suspect this is a linker relaxation not a compiler issue. > > > > With LTO in play, is there a way to check the object file? It's not in aarch64 assembly. > > > > Perhaps not. > > Could you try adding -Wl,--no-relax to the DLINK_FLAGS for your build > and see if it makes a difference? > > We should be adding that in any case, but it would be good to know if > it helps here too. I've never checked on aarch64, but don't you get solid space savings with linker relaxation? Or is it mostly meaningless? -- Pedro -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#110212): https://edk2.groups.io/g/devel/message/110212 Mute This Topic: https://groups.io/mt/102202314/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
On Fri, 27 Oct 2023 at 16:26, Pedro Falcato <pedro.falcato@gmail.com> wrote: > > On Fri, Oct 27, 2023 at 3:13 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > On Fri, 27 Oct 2023 at 16:09, Jake Garver via groups.io > > <jake=nvidia.com@groups.io> wrote: > > > > > > Hi Ard: > > > > > > > Can you double check the object file? I suspect this is a linker relaxation not a compiler issue. > > > > > > With LTO in play, is there a way to check the object file? It's not in aarch64 assembly. > > > > > > > Perhaps not. > > > > Could you try adding -Wl,--no-relax to the DLINK_FLAGS for your build > > and see if it makes a difference? > > > > We should be adding that in any case, but it would be good to know if > > it helps here too. > > I've never checked on aarch64, but don't you get solid space savings > with linker relaxation? Or is it mostly meaningless? > The most important use case for linker relaxation is turning GOT lookups into relative references, given that PIC object files are typically constructed without prior knowledge of whether a given external reference will be satisfied by the same dynamic object or a different one. So each relaxation removes a load from the program and may reduce the size of the GOT (if no other non-relaxable references to it exist) and this might matter on a hot path. None of this makes sense for UEFI, though, given that all executables are fully linked and performance doesn't usually matter to the same extent. On AArch64, there is a specific advantage to being able to relax ADRP to ADR (and this is why GenFw implements this relaxation itself too): ADRP instructions need to appear at the same offset modulo 4k as they were linked at, which means 4k alignment for the entire code section. EDK2 builds are made up of lots of tiny SEC and PEI drivers and rounding up each of those to 4k would result in a substantial increase in footprint of the firmware image. -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#110214): https://edk2.groups.io/g/devel/message/110214 Mute This Topic: https://groups.io/mt/102202314/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
Ard, Pedro, > Could you try adding -Wl,--no-relax to the DLINK_FLAGS for your build and see if it makes a difference? With "-Wl,--no-relax", I still see an ADR instruction. Clean build. Verified the "-Wl,--no-relax" is part of the gcc all to build this dll. 0000000000002fec <fdt_path_offset>: 2fec: a9b97bfd stp x29, x30, [sp, #-112]! 2ff0: 910003fd mov x29, sp 2ff4: a90363f7 stp x23, x24, [sp, #48] 2ff8: aa0003f8 mov x24, x0 2ffc: 10020020 adr x0, 7000 <_cont+0xe98> 2ffc: R_AARCH64_ADR_GOT_PAGE __stack_chk_guard 3000: a90153f3 stp x19, x20, [sp, #16] 3004: aa0103f3 mov x19, x1 3008: f946cc00 ldr x0, [x0, #3480] 3008: R_AARCH64_LD64_GOT_LO12_NC __stack_chk_guard How do you think I should proceed? Thanks for all your help! Jake ________________________________ From: Ard Biesheuvel <ardb@kernel.org> Sent: Friday, October 27, 2023 10:43 AM To: Pedro Falcato <pedro.falcato@gmail.com> Cc: devel@edk2.groups.io <devel@edk2.groups.io>; Jake Garver <jake@nvidia.com>; rebecca@bsdio.com <rebecca@bsdio.com>; gaoliming@byosoft.com.cn <gaoliming@byosoft.com.cn>; bob.c.feng@intel.com <bob.c.feng@intel.com>; yuwei.chen@intel.com <yuwei.chen@intel.com> Subject: Re: [edk2-devel] [PATCH] BaseTools/GenFw: Change opcode when converting ADR to ADRP External email: Use caution opening links or attachments On Fri, 27 Oct 2023 at 16:26, Pedro Falcato <pedro.falcato@gmail.com> wrote: > > On Fri, Oct 27, 2023 at 3:13 PM Ard Biesheuvel <ardb@kernel.org> wrote: > > > > On Fri, 27 Oct 2023 at 16:09, Jake Garver via groups.io > > <jake=nvidia.com@groups.io> wrote: > > > > > > Hi Ard: > > > > > > > Can you double check the object file? I suspect this is a linker relaxation not a compiler issue. > > > > > > With LTO in play, is there a way to check the object file? It's not in aarch64 assembly. > > > > > > > Perhaps not. > > > > Could you try adding -Wl,--no-relax to the DLINK_FLAGS for your build > > and see if it makes a difference? > > > > We should be adding that in any case, but it would be good to know if > > it helps here too. > > I've never checked on aarch64, but don't you get solid space savings > with linker relaxation? Or is it mostly meaningless? > The most important use case for linker relaxation is turning GOT lookups into relative references, given that PIC object files are typically constructed without prior knowledge of whether a given external reference will be satisfied by the same dynamic object or a different one. So each relaxation removes a load from the program and may reduce the size of the GOT (if no other non-relaxable references to it exist) and this might matter on a hot path. None of this makes sense for UEFI, though, given that all executables are fully linked and performance doesn't usually matter to the same extent. On AArch64, there is a specific advantage to being able to relax ADRP to ADR (and this is why GenFw implements this relaxation itself too): ADRP instructions need to appear at the same offset modulo 4k as they were linked at, which means 4k alignment for the entire code section. EDK2 builds are made up of lots of tiny SEC and PEI drivers and rounding up each of those to 4k would result in a substantial increase in footprint of the firmware image. -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#110219): https://edk2.groups.io/g/devel/message/110219 Mute This Topic: https://groups.io/mt/102202314/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
On Fri, Oct 27, 2023 at 3:09 PM Jake Garver via groups.io <jake=nvidia.com@groups.io> wrote: > > Hi Ard: > > > Can you double check the object file? I suspect this is a linker relaxation not a compiler issue. > > With LTO in play, is there a way to check the object file? It's not in aarch64 assembly. Unless you pass -ffat-lto-objects, no. But, in the case of LTO, I doubt we'll get many answers. LTO will significantly distort things before doing the actual "link". -- Pedro -=-=-=-=-=-=-=-=-=-=-=- Groups.io Links: You receive all messages sent to this group. View/Reply Online (#110210): https://edk2.groups.io/g/devel/message/110210 Mute This Topic: https://groups.io/mt/102202314/1787277 Group Owner: devel+owner@edk2.groups.io Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org] -=-=-=-=-=-=-=-=-=-=-=-
© 2016 - 2024 Red Hat, Inc.