[edk2-devel] [RFC PATCH 00/11] Permit DXE drivers to execute in place

Ard Biesheuvel posted 11 patches 11 months, 1 week ago
Failed in applying to current master (apply log)
ArmVirtPkg/ArmVirtQemu.dsc                                 |   1 +
ArmVirtPkg/ArmVirtQemuFvMain.fdf.inc                       |  17 +-
ArmVirtPkg/ArmVirtRules.fdf.inc                            |   9 +
ArmVirtPkg/Library/QemuVirtMemInfoLib/QemuVirtMemInfoLib.c |   4 +-
ArmVirtPkg/MemoryInitPei/MemoryInitPeim.inf                |   2 +-
MdeModulePkg/Core/Dxe/DxeMain.h                            |   1 +
MdeModulePkg/Core/Dxe/DxeMain.inf                          |   3 +
MdeModulePkg/Core/Dxe/FwVol/FwVol.c                        | 113 ++++++-
MdeModulePkg/Core/Dxe/FwVol/FwVolDriver.h                  |  31 ++
MdeModulePkg/Core/Dxe/FwVol/FwVolRead.c                    |  22 --
MdeModulePkg/Core/Dxe/Image/Image.c                        | 322 ++++++++++----------
MdeModulePkg/Core/Dxe/Misc/MemoryProtection.c              |   7 +
MdeModulePkg/Core/DxeIplPeim/DxeIpl.inf                    |   1 +
MdeModulePkg/Core/DxeIplPeim/DxeLoad.c                     | 196 ++++++++++++
MdeModulePkg/Include/Protocol/MemoryMappedFv.h             |  59 ++++
MdeModulePkg/MdeModulePkg.dec                              |   6 +
16 files changed, 607 insertions(+), 187 deletions(-)
create mode 100644 MdeModulePkg/Include/Protocol/MemoryMappedFv.h
[edk2-devel] [RFC PATCH 00/11] Permit DXE drivers to execute in place
Posted by Ard Biesheuvel 11 months, 1 week ago
TL;DR - allow DXE drivers to execute in place from the decompressed FV
loaded into memory by DxeIpl so we can apply strict permissions before
dispatching DXE core.

Currently, executable images loaded from firmware volumes are copied at
least three times: once in the firmware volume driver, once in the DXE
core load image code, and finally when the PE sections are populated in
memory based on the section descriptions in the file.

At least two of these copies serve little purpose, given that most
drivers are typically dispatched from a memory-mapped firmware volume
that is loaded into DRAM by DxeIpl from a compressed image in the boot
FV, and so we can take a short-cut in the DXE image loader so that the
PE/COFF library that performs the load uses the image in the memory
mapped FV as its source directly. This is implemented by the first 6
patches (where the first 3 are just cleanups)

With this logic in place, we can go one step further, and actually
dispatch the image in place (similar to how XIP PEIMs are dispatched),
without over moving it out of the decompressed firmware volume. This
requires the image to be aligned sufficiently inside the FV, but this is
also the same logic that applies to XIP PEIMs, and this can be achieved
trivially by tweaking some FDF image generation rules. (Note that this
adds padding to the FV, but this generally compresses well, and we
ultimately uses less memory at runtime by not making a copy of the
image).

This requires the DXE IPL (which is the component that decompresses the
firmware volumes to memory) to iterate over the contents and relocate
these drivers in place. Given that DXE IPL is already in charge of
applying NX permissions to the stack and to other memory regions, we can
trivially extend it to apply restricted permissions to the XIP DXE
drivers after relocation.

This means we enter DXE core with those DXE drivers ready to be
dispatched, removing the need to perform manipulation of memory
attributes before the CPU arch protocol is dispatched, which is a bit of
a catch-22 otherwise.

With these changes in place, the platform no longer needs to map memory
writable and executable by default, and all DRAM can be mapped
non-executable right out of reset.

Cc: Ray Ni <ray.ni@intel.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Taylor Beebe <t@taylorbeebe.com>
Cc: Oliver Smith-Denny <osd@smith-denny.com>
Cc: Dandan Bi <dandan.bi@intel.com>
Cc: Liming Gao <gaoliming@byosoft.com.cn>
Cc: "Kinney, Michael D" <michael.d.kinney@intel.com>
Cc: Leif Lindholm <quic_llindhol@quicinc.com>
Cc: Michael Kubacki <mikuback@linux.microsoft.com>

Ard Biesheuvel (11):
  MdeModulePkg/DxeCore: Remove unused 'EntryPoint' argument to LoadImage
  MdeModulePkg/DxeCore: Remove unused DstBuffer arg from LoadImage
  MdeModulePkg/DxeCore: Remove FreePage argument from CoreUnloadImage
  MdeModulePkg/DxeCore: Avoid caching memory mapped FFS files
  MdeModulePkg/DxeCore: Use memory mapped FV protocol to avoid image
    copy
  MdeModulePkg/DxeCore: Expose memory mapped FV protocol when possible
  MdeModulePkg/DxeCore: Execute loaded images in place if possible
  MdeModulePkg/DxeIpl: Relocate and remap XIP capable DXE drivers
  MdeModulePkg/DxeCore: Add PCD NX policy bit for default NX state
  ArmVirtPkg/ArmVirtQemu: Allow CPU arch protocol DXE to execute in
    place
  ArmVirtPkg/ArmVirtQemu: Map all DRAM non-execute by default

 ArmVirtPkg/ArmVirtQemu.dsc                                 |   1 +
 ArmVirtPkg/ArmVirtQemuFvMain.fdf.inc                       |  17 +-
 ArmVirtPkg/ArmVirtRules.fdf.inc                            |   9 +
 ArmVirtPkg/Library/QemuVirtMemInfoLib/QemuVirtMemInfoLib.c |   4 +-
 ArmVirtPkg/MemoryInitPei/MemoryInitPeim.inf                |   2 +-
 MdeModulePkg/Core/Dxe/DxeMain.h                            |   1 +
 MdeModulePkg/Core/Dxe/DxeMain.inf                          |   3 +
 MdeModulePkg/Core/Dxe/FwVol/FwVol.c                        | 113 ++++++-
 MdeModulePkg/Core/Dxe/FwVol/FwVolDriver.h                  |  31 ++
 MdeModulePkg/Core/Dxe/FwVol/FwVolRead.c                    |  22 --
 MdeModulePkg/Core/Dxe/Image/Image.c                        | 322 ++++++++++----------
 MdeModulePkg/Core/Dxe/Misc/MemoryProtection.c              |   7 +
 MdeModulePkg/Core/DxeIplPeim/DxeIpl.inf                    |   1 +
 MdeModulePkg/Core/DxeIplPeim/DxeLoad.c                     | 196 ++++++++++++
 MdeModulePkg/Include/Protocol/MemoryMappedFv.h             |  59 ++++
 MdeModulePkg/MdeModulePkg.dec                              |   6 +
 16 files changed, 607 insertions(+), 187 deletions(-)
 create mode 100644 MdeModulePkg/Include/Protocol/MemoryMappedFv.h

-- 
2.39.2



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#105362): https://edk2.groups.io/g/devel/message/105362
Mute This Topic: https://groups.io/mt/99197132/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-
Re: [edk2-devel] [RFC PATCH 00/11] Permit DXE drivers to execute in place
Posted by Oliver Smith-Denny 11 months, 1 week ago
On 5/29/2023 3:16 AM, Ard Biesheuvel wrote:
> TL;DR - allow DXE drivers to execute in place from the decompressed FV
> 
> loaded into memory by DxeIpl so we can apply strict permissions before
> 
> dispatching DXE core.
> 
> 
> 
> Currently, executable images loaded from firmware volumes are copied at
> 
> least three times: once in the firmware volume driver, once in the DXE
> 
> core load image code, and finally when the PE sections are populated in
> 
> memory based on the section descriptions in the file.
> 
> 
> 
> At least two of these copies serve little purpose, given that most
> 
> drivers are typically dispatched from a memory-mapped firmware volume
> 
> that is loaded into DRAM by DxeIpl from a compressed image in the boot
> 
> FV, and so we can take a short-cut in the DXE image loader so that the
> 
> PE/COFF library that performs the load uses the image in the memory
> 
> mapped FV as its source directly. This is implemented by the first 6
> 
> patches (where the first 3 are just cleanups)
> 
> 
> 
> With this logic in place, we can go one step further, and actually
> 
> dispatch the image in place (similar to how XIP PEIMs are dispatched),
> 
> without over moving it out of the decompressed firmware volume. This
> 
> requires the image to be aligned sufficiently inside the FV, but this is
> 
> also the same logic that applies to XIP PEIMs, and this can be achieved
> 
> trivially by tweaking some FDF image generation rules. (Note that this
> 
> adds padding to the FV, but this generally compresses well, and we
> 
> ultimately uses less memory at runtime by not making a copy of the
> 
> image).
> 
> 
> 
> This requires the DXE IPL (which is the component that decompresses the
> 
> firmware volumes to memory) to iterate over the contents and relocate
> 
> these drivers in place. Given that DXE IPL is already in charge of
> 
> applying NX permissions to the stack and to other memory regions, we can
> 
> trivially extend it to apply restricted permissions to the XIP DXE
> 
> drivers after relocation.
> 
> 
> 
> This means we enter DXE core with those DXE drivers ready to be
> 
> dispatched, removing the need to perform manipulation of memory
> 
> attributes before the CPU arch protocol is dispatched, which is a bit of
> 
> a catch-22 otherwise.
> 
> 
> 
> With these changes in place, the platform no longer needs to map memory
> 
> writable and executable by default, and all DRAM can be mapped
> 
> non-executable right out of reset.
> 
> 

Hi Ard,

Thanks for sending out this RFC, great to see more work on the memory
protections front. A few questions and thoughts:

This seems a good effort (in conjunction with your last RFC) to close
the protection gap between DxeCore launch and CpuDxe launch for marking
non-code regions NX. Do you see other protections (guard pages for
example) fitting into this method? I believe for any dynamic protections
during this timeframe we would need the ability to manipulate the page
tables directly from DxeCore.

Similarly, in order to lessen the complexity of the DXE driver usage of
memory resources and avoiding sync issues (e.g. a driver allocates pages
that are mapped NX by default, then it sets a cacheability attribute
and accidentally clears NX), I think further work would be valuable to
reduce that complexity. I think your new PPI that allows setting and
preserving bits independently of what is passed in is a very good step
towards reducing this complexity.

This patchset would move all properly aligned DXE drivers to be XIP,
correct? Because we are XIP in DRAM, this should not have any
performance implications (other than a benefit from reducing the extra
copies in your first few patches), aside from potential space
differences, which as you note compression will likely do away with,
right?

Thanks,
Oliver



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#105567): https://edk2.groups.io/g/devel/message/105567
Mute This Topic: https://groups.io/mt/99197132/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-
Re: [edk2-devel] [RFC PATCH 00/11] Permit DXE drivers to execute in place
Posted by Ard Biesheuvel 11 months, 1 week ago
On Thu, 1 Jun 2023 at 16:53, Oliver Smith-Denny
<osde@linux.microsoft.com> wrote:
>
> Thanks for sending out this RFC, great to see more work on the memory
> protections front. A few questions and thoughts:
>
> This seems a good effort (in conjunction with your last RFC) to close
> the protection gap between DxeCore launch and CpuDxe launch for marking
> non-code regions NX. Do you see other protections (guard pages for
> example) fitting into this method? I believe for any dynamic protections
> during this timeframe we would need the ability to manipulate the page
> tables directly from DxeCore.
>

The use case of guard pages did not really occur to me, to be honest,
and this is obviously something that doesn't work either before the
CPU arch protocol is dispatched

I still think it would be preferable to add the ability to manage
memory mapping permissions to the DXE core itself, and separate it
from the CPU arch protocol.

Note that clumping everything together does not really help in this
respect either: if the memory permission manipulation logically
remains a part of the CPU arch protocol, which cannot be installed
until its dependencies are satisfied, we are still in a situation
where dispatching those dependencies may result in page allocations
being created before we can unmap the guard pages.

> Similarly, in order to lessen the complexity of the DXE driver usage of
> memory resources and avoiding sync issues (e.g. a driver allocates pages
> that are mapped NX by default, then it sets a cacheability attribute
> and accidentally clears NX), I think further work would be valuable to
> reduce that complexity. I think your new PPI that allows setting and
> preserving bits independently of what is passed in is a very good step
> towards reducing this complexity.
>

Hopefully, we'll be able to do something at the library/driver level
here (AllocatePages in the DMA or PCI layer).

Another thing we might entertain (which maps really well onto the WXN
thing we have on ARM) is to add a GCD memory region capability that
makes memory XP unless it is RO. But I haven't really experimented
with that yet - I'll keep you posted on that.

> This patchset would move all properly aligned DXE drivers to be XIP,
> correct?

Yes.

> Because we are XIP in DRAM, this should not have any
> performance implications (other than a benefit from reducing the extra
> copies in your first few patches), aside from potential space
> differences, which as you note compression will likely do away with,
> right?
>

Exactly.


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#105598): https://edk2.groups.io/g/devel/message/105598
Mute This Topic: https://groups.io/mt/99197132/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-
Re: [edk2-devel] [RFC PATCH 00/11] Permit DXE drivers to execute in place
Posted by Oliver Smith-Denny 11 months, 1 week ago
On 6/1/2023 11:11 AM, Ard Biesheuvel wrote:
> On Thu, 1 Jun 2023 at 16:53, Oliver Smith-Denny
> <osde@linux.microsoft.com> wrote:
>>
>> Thanks for sending out this RFC, great to see more work on the memory
>> protections front. A few questions and thoughts:
>>
>> This seems a good effort (in conjunction with your last RFC) to close
>> the protection gap between DxeCore launch and CpuDxe launch for marking
>> non-code regions NX. Do you see other protections (guard pages for
>> example) fitting into this method? I believe for any dynamic protections
>> during this timeframe we would need the ability to manipulate the page
>> tables directly from DxeCore.
>>
> 
> The use case of guard pages did not really occur to me, to be honest,
> and this is obviously something that doesn't work either before the
> CPU arch protocol is dispatched
> 
> I still think it would be preferable to add the ability to manage
> memory mapping permissions to the DXE core itself, and separate it
> from the CPU arch protocol.
> 
> Note that clumping everything together does not really help in this
> respect either: if the memory permission manipulation logically
> remains a part of the CPU arch protocol, which cannot be installed
> until its dependencies are satisfied, we are still in a situation
> where dispatching those dependencies may result in page allocations
> being created before we can unmap the guard pages.
> 

I agree. I think that the ability for DXE Core to manage the memory
attributes itself is the central simplification that could go a long
ways towards cleaning up the existing interfaces and making the code
more maintainable with less potential pitfalls (that we currently see
hit frequently).

I think the memory attribute management can be split apart from CpuDxe
without bringing all of CpuDxe into Dxe Core and certainly if that is
the better path towards tackling the problem we face, I support it.

>> Similarly, in order to lessen the complexity of the DXE driver usage of
>> memory resources and avoiding sync issues (e.g. a driver allocates pages
>> that are mapped NX by default, then it sets a cacheability attribute
>> and accidentally clears NX), I think further work would be valuable to
>> reduce that complexity. I think your new PPI that allows setting and
>> preserving bits independently of what is passed in is a very good step
>> towards reducing this complexity.
>>
> 
> Hopefully, we'll be able to do something at the library/driver level
> here (AllocatePages in the DMA or PCI layer).
> 
> Another thing we might entertain (which maps really well onto the WXN
> thing we have on ARM) is to add a GCD memory region capability that
> makes memory XP unless it is RO. But I haven't really experimented
> with that yet - I'll keep you posted on that.
> 

This is definitely interesting to me as well. I would love this
capability :). I've been doing some investigations in this area,
but not specifically towards ARM's WXN, that seems a great place
to intersect.

Thanks,
Oliver

>> This patchset would move all properly aligned DXE drivers to be XIP,
>> correct?
> 
> Yes.
> 
>> Because we are XIP in DRAM, this should not have any
>> performance implications (other than a benefit from reducing the extra
>> copies in your first few patches), aside from potential space
>> differences, which as you note compression will likely do away with,
>> right?
>>
> 
> Exactly.
> 
> 
> 
> 


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#105603): https://edk2.groups.io/g/devel/message/105603
Mute This Topic: https://groups.io/mt/99197132/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-