[edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable

Ashish Singhal via groups.io posted 1 patch 2 years, 2 months ago
Failed in applying to current master (apply log)
There is a newer version of this series
ArmPkg/Library/ArmLib/AArch64/AArch64Support.S           | 4 +++-
ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S | 2 ++
2 files changed, 5 insertions(+), 1 deletion(-)
[edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable
Posted by Ashish Singhal via groups.io 2 years, 2 months ago
Even with MMU turned off, instruction cache can speculate
and fetch instructions. This can cause a crash if region
being executed has been modified recently. With this patch,
we ensure that instruction cache is invalidated right after
MMU has been enabled and any potentially stale instruction
fetched earlier has been discarded.

This is specially helpful when the memory attributes of a
region in MMU are being changed and some instructions
operating on the region are prefetched in the instruction
cache.

Signed-off-by: Ashish Singhal <ashishsingha@nvidia.com>
---
 ArmPkg/Library/ArmLib/AArch64/AArch64Support.S           | 4 +++-
 ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S | 2 ++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S b/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S
index d3cc1e8671..9648245182 100644
--- a/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S
+++ b/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S
@@ -89,7 +89,9 @@ ASM_FUNC(ArmEnableMmu)
    dsb     nsh
    isb
    msr     sctlr_el3, x0       // Write back
-4: isb
+4: ic      iallu
+   dsb     sy
+   isb
    ret
 
 
diff --git a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
index 66ebca571e..56cc2dd73f 100644
--- a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
+++ b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
@@ -37,6 +37,8 @@
 
   // re-enable the MMU
   msr   sctlr_el\el, x8
+  ic    iallu
+  dsb   sy
   isb
   .endm
 
-- 
2.17.1



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#86841): https://edk2.groups.io/g/devel/message/86841
Mute This Topic: https://groups.io/mt/89309504/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable
Posted by Ashish Singhal via groups.io 2 years, 2 months ago
+ @Samer El-Haj-Mahmoud<mailto:Samer.El-Haj-Mahmoud@arm.com>

Hello Leif/Ard/Sami/Samer,

Can you please look at this patch and provide some feedback?

Thanks
Ashish
________________________________
From: Ashish Singhal <ashishsingha@nvidia.com>
Sent: Monday, February 21, 2022 7:42 PM
To: devel@edk2.groups.io <devel@edk2.groups.io>; sami.mujawar@arm.com <sami.mujawar@arm.com>; ardb+tianocore@kernel.org <ardb+tianocore@kernel.org>; quic_llindhol@quicinc.com <quic_llindhol@quicinc.com>
Cc: Ashish Singhal <ashishsingha@nvidia.com>
Subject: [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable

Even with MMU turned off, instruction cache can speculate
and fetch instructions. This can cause a crash if region
being executed has been modified recently. With this patch,
we ensure that instruction cache is invalidated right after
MMU has been enabled and any potentially stale instruction
fetched earlier has been discarded.

This is specially helpful when the memory attributes of a
region in MMU are being changed and some instructions
operating on the region are prefetched in the instruction
cache.

Signed-off-by: Ashish Singhal <ashishsingha@nvidia.com>
---
 ArmPkg/Library/ArmLib/AArch64/AArch64Support.S           | 4 +++-
 ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S | 2 ++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S b/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S
index d3cc1e8671..9648245182 100644
--- a/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S
+++ b/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S
@@ -89,7 +89,9 @@ ASM_FUNC(ArmEnableMmu)
    dsb     nsh
    isb
    msr     sctlr_el3, x0       // Write back
-4: isb
+4: ic      iallu
+   dsb     sy
+   isb
    ret


diff --git a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
index 66ebca571e..56cc2dd73f 100644
--- a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
+++ b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
@@ -37,6 +37,8 @@

   // re-enable the MMU
   msr   sctlr_el\el, x8
+  ic    iallu
+  dsb   sy
   isb
   .endm

--
2.17.1



-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#86881): https://edk2.groups.io/g/devel/message/86881
Mute This Topic: https://groups.io/mt/89309504/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable
Posted by Ard Biesheuvel 2 years, 2 months ago
(+ Marc)

On Tue, 22 Feb 2022 at 03:42, Ashish Singhal <ashishsingha@nvidia.com> wrote:
>
> Even with MMU turned off, instruction cache can speculate
> and fetch instructions. This can cause a crash if region
> being executed has been modified recently. With this patch,
> we ensure that instruction cache is invalidated right after
> MMU has been enabled and any potentially stale instruction
> fetched earlier has been discarded.
>

I don't follow this reasoning. Every time the UEFI image loader loads
an image into memory, it does so with MMU and caches enabled, and the
code regions are cleaned to the PoU before being invalidated in the
I-cache. So how could these regions be stale? The only code that needs
special care here is the little trampoline that executes with the MMU
off, but this is being taken care of explicitly, by cleaning the code
to the PoC before invoking it.

> This is specially helpful when the memory attributes of a
> region in MMU are being changed and some instructions
> operating on the region are prefetched in the instruction
> cache.
>

This sounds like a TLB problem not a cache problem. For the sake of
posterity, could you include a more detailed description of the issue
in the commit log? IIRC, this is about speculative instruction fetches
hitting device memory locations?

As I mentioned before, the architecture does not permit speculative
instruction fetches for regions mapped with the XN attribute.

Is the issue occurring under virtualization? Is there a stage 2
mapping that lacks the XN attribute for the device memory region in
question?

> Signed-off-by: Ashish Singhal <ashishsingha@nvidia.com>
> ---
>  ArmPkg/Library/ArmLib/AArch64/AArch64Support.S           | 4 +++-
>  ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S | 2 ++
>  2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S b/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S
> index d3cc1e8671..9648245182 100644
> --- a/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S
> +++ b/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S
> @@ -89,7 +89,9 @@ ASM_FUNC(ArmEnableMmu)
>     dsb     nsh
>     isb
>     msr     sctlr_el3, x0       // Write back
> -4: isb
> +4: ic      iallu
> +   dsb     sy
> +   isb
>     ret
>
>
> diff --git a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
> index 66ebca571e..56cc2dd73f 100644
> --- a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
> +++ b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
> @@ -37,6 +37,8 @@
>
>    // re-enable the MMU
>    msr   sctlr_el\el, x8
> +  ic    iallu
> +  dsb   sy
>    isb
>    .endm
>
> --
> 2.17.1
>


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#86883): https://edk2.groups.io/g/devel/message/86883
Mute This Topic: https://groups.io/mt/89309504/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable
Posted by Marc Zyngier 2 years, 2 months ago
On Wed, 23 Feb 2022 07:02:28 +0000,
Ard Biesheuvel <ardb@kernel.org> wrote:
> 
> (+ Marc)
> 
> On Tue, 22 Feb 2022 at 03:42, Ashish Singhal <ashishsingha@nvidia.com> wrote:
> >
> > Even with MMU turned off, instruction cache can speculate
> > and fetch instructions. This can cause a crash if region
> > being executed has been modified recently. With this patch,
> > we ensure that instruction cache is invalidated right after
> > MMU has been enabled and any potentially stale instruction
> > fetched earlier has been discarded.

Are you doing code patching while the MMU is off?

> >
> 
> I don't follow this reasoning. Every time the UEFI image loader loads
> an image into memory, it does so with MMU and caches enabled, and the
> code regions are cleaned to the PoU before being invalidated in the
> I-cache. So how could these regions be stale? The only code that needs
> special care here is the little trampoline that executes with the MMU
> off, but this is being taken care of explicitly, by cleaning the code
> to the PoC before invoking it.
>
> > This is specially helpful when the memory attributes of a
> > region in MMU are being changed and some instructions
> > operating on the region are prefetched in the instruction
> > cache.
> >

Huh, this is way too vague. How do you expect anyone to understand
your problem based on this?

> 
> This sounds like a TLB problem not a cache problem. For the sake of
> posterity, could you include a more detailed description of the issue
> in the commit log? IIRC, this is about speculative instruction fetches
> hitting device memory locations?
> 
> As I mentioned before, the architecture does not permit speculative
> instruction fetches for regions mapped with the XN attribute.
> 
> Is the issue occurring under virtualization? Is there a stage 2
> mapping that lacks the XN attribute for the device memory region in
> question?
> 
> > Signed-off-by: Ashish Singhal <ashishsingha@nvidia.com>
> > ---
> >  ArmPkg/Library/ArmLib/AArch64/AArch64Support.S           | 4 +++-
> >  ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S | 2 ++
> >  2 files changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S b/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S
> > index d3cc1e8671..9648245182 100644
> > --- a/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S
> > +++ b/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S
> > @@ -89,7 +89,9 @@ ASM_FUNC(ArmEnableMmu)
> >     dsb     nsh
> >     isb
> >     msr     sctlr_el3, x0       // Write back
> > -4: isb
> > +4: ic      iallu
> > +   dsb     sy

Why DSB SY? Given that you are only invalidating on a single CPU,
DSB NSH should be enough.

> > +   isb
> >     ret
> >
> >
> > diff --git a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
> > index 66ebca571e..56cc2dd73f 100644
> > --- a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
> > +++ b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
> > @@ -37,6 +37,8 @@
> >
> >    // re-enable the MMU
> >    msr   sctlr_el\el, x8
> > +  ic    iallu
> > +  dsb   sy

Same thing here.

	M.

-- 
Without deviation from the norm, progress is not possible.


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#86946): https://edk2.groups.io/g/devel/message/86946
Mute This Topic: https://groups.io/mt/89309504/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable
Posted by Ashish Singhal via groups.io 2 years, 2 months ago
Hello Ard and Marc,

I apologize for not providing the background on this in the commit message and I understand the commit message is not very clear as well. Let me try to summarize the problem.

In our UEFI implementation, we are doing the following as part of the initial MMU setup:

  1.  Set the applicable device memory as nGnRnE memory.
  2.  Set the whole DRAM as normal memory that translates to RW and executable memory.
  3.  Enable caches and MMU.
  4.  At this time, the memory map looks correct when I check that from DS-5.

When we start dispatching drivers, DxeCore dispatches a driver and marks its code area as RO and executable and its data region as RW and non-executable. What we are seeing randomly is that some of the page tables (using DS-5) have invalid output address that leads to the correct input address from UEFI being translated to an unavailable memory location causing a crash sometime in EL2 or sometimes as a RAS error in EL3.

When I reached out to the CPU team here, they said Arm® Cortex®-A78AE is a highly speculative core and we need to have appropriate barriers in place so that there is consistency in the way an address is accessed especially if it is done right after there is a change in translation tables. Based on this, I started some experimentation wrt caches whenever MMUs are enabled and I found that invalidating the instruction cache after enabling MMUs solves this problem.

Please note that I could be wrong with my hypothesis here and I may just be masking the issue. If that is the case, please let me know what I should be trying as I am out of ideas at this point. Also, the same UEFI works on NVIDIA's Xavier Silicon that has Carmel cores but shows this issue on Orin Silicon that has Arm® Cortex®-A78AE v8.2 64-bit CPU.

@Marc Zyngier<mailto:maz@kernel.org>,

I agree with your concern about using "dsb sy" and we should replace that with "dsb nsh" as we are running this only on a single core during boot.

Thanks
Ashish
________________________________
From: Marc Zyngier <maz@kernel.org>
Sent: Wednesday, February 23, 2022 1:58 AM
To: Ard Biesheuvel <ardb@kernel.org>; Ashish Singhal <ashishsingha@nvidia.com>
Cc: edk2-devel-groups-io <devel@edk2.groups.io>; Sami Mujawar <sami.mujawar@arm.com>; Ard Biesheuvel <ardb+tianocore@kernel.org>; Leif Lindholm <quic_llindhol@quicinc.com>
Subject: Re: [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable

External email: Use caution opening links or attachments


On Wed, 23 Feb 2022 07:02:28 +0000,
Ard Biesheuvel <ardb@kernel.org> wrote:
>
> (+ Marc)
>
> On Tue, 22 Feb 2022 at 03:42, Ashish Singhal <ashishsingha@nvidia.com> wrote:
> >
> > Even with MMU turned off, instruction cache can speculate
> > and fetch instructions. This can cause a crash if region
> > being executed has been modified recently. With this patch,
> > we ensure that instruction cache is invalidated right after
> > MMU has been enabled and any potentially stale instruction
> > fetched earlier has been discarded.

Are you doing code patching while the MMU is off?

> >
>
> I don't follow this reasoning. Every time the UEFI image loader loads
> an image into memory, it does so with MMU and caches enabled, and the
> code regions are cleaned to the PoU before being invalidated in the
> I-cache. So how could these regions be stale? The only code that needs
> special care here is the little trampoline that executes with the MMU
> off, but this is being taken care of explicitly, by cleaning the code
> to the PoC before invoking it.
>
> > This is specially helpful when the memory attributes of a
> > region in MMU are being changed and some instructions
> > operating on the region are prefetched in the instruction
> > cache.
> >

Huh, this is way too vague. How do you expect anyone to understand
your problem based on this?

>
> This sounds like a TLB problem not a cache problem. For the sake of
> posterity, could you include a more detailed description of the issue
> in the commit log? IIRC, this is about speculative instruction fetches
> hitting device memory locations?
>
> As I mentioned before, the architecture does not permit speculative
> instruction fetches for regions mapped with the XN attribute.
>
> Is the issue occurring under virtualization? Is there a stage 2
> mapping that lacks the XN attribute for the device memory region in
> question?
>
> > Signed-off-by: Ashish Singhal <ashishsingha@nvidia.com>
> > ---
> >  ArmPkg/Library/ArmLib/AArch64/AArch64Support.S           | 4 +++-
> >  ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S | 2 ++
> >  2 files changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S b/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S
> > index d3cc1e8671..9648245182 100644
> > --- a/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S
> > +++ b/ArmPkg/Library/ArmLib/AArch64/AArch64Support.S
> > @@ -89,7 +89,9 @@ ASM_FUNC(ArmEnableMmu)
> >     dsb     nsh
> >     isb
> >     msr     sctlr_el3, x0       // Write back
> > -4: isb
> > +4: ic      iallu
> > +   dsb     sy

Why DSB SY? Given that you are only invalidating on a single CPU,
DSB NSH should be enough.

> > +   isb
> >     ret
> >
> >
> > diff --git a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
> > index 66ebca571e..56cc2dd73f 100644
> > --- a/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
> > +++ b/ArmPkg/Library/ArmMmuLib/AArch64/ArmMmuLibReplaceEntry.S
> > @@ -37,6 +37,8 @@
> >
> >    // re-enable the MMU
> >    msr   sctlr_el\el, x8
> > +  ic    iallu
> > +  dsb   sy

Same thing here.

        M.

--
Without deviation from the norm, progress is not possible.


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#86917): https://edk2.groups.io/g/devel/message/86917
Mute This Topic: https://groups.io/mt/89309504/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable
Posted by Ard Biesheuvel 2 years, 2 months ago
On Wed, 23 Feb 2022 at 18:36, Ashish Singhal via groups.io
<ashishsingha=nvidia.com@groups.io> wrote:
>
> Hello Ard and Marc,
>
> I apologize for not providing the background on this in the commit message and I understand the commit message is not very clear as well. Let me try to summarize the problem.
>
> In our UEFI implementation, we are doing the following as part of the initial MMU setup:
>
> Set the applicable device memory as nGnRnE memory.
> Set the whole DRAM as normal memory that translates to RW and executable memory.
> Enable caches and MMU.
> At this time, the memory map looks correct when I check that from DS-5.
>

I have asked you a number of times about the XN attribute. How and
where are you setting it for the device regions?

> When we start dispatching drivers, DxeCore dispatches a driver and marks its code area as RO and executable and its data region as RW and non-executable. What we are seeing randomly is that some of the page tables (using DS-5) have invalid output address that leads to the correct input address from UEFI being translated to an unavailable memory location causing a crash sometime in EL2 or sometimes as a RAS error in EL3.
>

At which EL does UEFI run? If at EL1, is it running under a stage2
mapping? If so, does the stage 2 mapping set the XN attribute
appropriately for the device mappings?


> When I reached out to the CPU team here, they said Arm® Cortex®-A78AE is a highly speculative core and we need to have appropriate barriers in place so that there is consistency in the way an address is accessed especially if it is done right after there is a change in translation tables. Based on this, I started some experimentation wrt caches whenever MMUs are enabled and I found that invalidating the instruction cache after enabling MMUs solves this problem.
>

It may hide it, but I don't think it is a proper fix.

> Please note that I could be wrong with my hypothesis here and I may just be masking the issue. If that is the case, please let me know what I should be trying as I am out of ideas at this point. Also, the same UEFI works on NVIDIA's Xavier Silicon that has Carmel cores but shows this issue on Orin Silicon that has Arm® Cortex®-A78AE v8.2 64-bit CPU.
>

Thanks,
Ard.


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#86919): https://edk2.groups.io/g/devel/message/86919
Mute This Topic: https://groups.io/mt/89309504/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable
Posted by Ashish Singhal via groups.io 2 years, 2 months ago
Ard,

During PrePi, I setup the initial memory map by calling into ArmConfigureMmu function with my memory table where device memory regions have attribute of ARM_MEMORY_REGION_ATTRIBUTE_DEVICE and DRAM regions have attribute of ARM_MEMORY_REGION_ATTRIBUTE_WRITE_BACK.

For device memory, XN bit is set by ArmMemoryAttributeToPageAttribute function. After PrePi, when I add a region of memory to device memory from a DXE driver, I call gDS->AddMemorySpace with EfiGcdMemoryTypeMemoryMappedIo and EFI_MEMORY_UC | EFI_MEMORY_RUNTIME followed by gDS->SetMemorySpaceAttributes with EFI_MEMORY_UC.

Please let me know in case I have still not understood your question.

In our case, UEFI is running in NS-EL2. I understand what I am proposing may not be a proper fix. I am looking for help on what could be a fix for this as we are using all upstream code for memory management.

Thanks
Ashish
________________________________
From: Ard Biesheuvel <ardb@kernel.org>
Sent: Wednesday, February 23, 2022 10:40 AM
To: edk2-devel-groups-io <devel@edk2.groups.io>; Ashish Singhal <ashishsingha@nvidia.com>
Cc: Marc Zyngier <maz@kernel.org>; Sami Mujawar <sami.mujawar@arm.com>; Ard Biesheuvel <ardb+tianocore@kernel.org>; Leif Lindholm <quic_llindhol@quicinc.com>
Subject: Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable

External email: Use caution opening links or attachments


On Wed, 23 Feb 2022 at 18:36, Ashish Singhal via groups.io
<ashishsingha=nvidia.com@groups.io> wrote:
>
> Hello Ard and Marc,
>
> I apologize for not providing the background on this in the commit message and I understand the commit message is not very clear as well. Let me try to summarize the problem.
>
> In our UEFI implementation, we are doing the following as part of the initial MMU setup:
>
> Set the applicable device memory as nGnRnE memory.
> Set the whole DRAM as normal memory that translates to RW and executable memory.
> Enable caches and MMU.
> At this time, the memory map looks correct when I check that from DS-5.
>

I have asked you a number of times about the XN attribute. How and
where are you setting it for the device regions?

> When we start dispatching drivers, DxeCore dispatches a driver and marks its code area as RO and executable and its data region as RW and non-executable. What we are seeing randomly is that some of the page tables (using DS-5) have invalid output address that leads to the correct input address from UEFI being translated to an unavailable memory location causing a crash sometime in EL2 or sometimes as a RAS error in EL3.
>

At which EL does UEFI run? If at EL1, is it running under a stage2
mapping? If so, does the stage 2 mapping set the XN attribute
appropriately for the device mappings?


> When I reached out to the CPU team here, they said Arm® Cortex®-A78AE is a highly speculative core and we need to have appropriate barriers in place so that there is consistency in the way an address is accessed especially if it is done right after there is a change in translation tables. Based on this, I started some experimentation wrt caches whenever MMUs are enabled and I found that invalidating the instruction cache after enabling MMUs solves this problem.
>

It may hide it, but I don't think it is a proper fix.

> Please note that I could be wrong with my hypothesis here and I may just be masking the issue. If that is the case, please let me know what I should be trying as I am out of ideas at this point. Also, the same UEFI works on NVIDIA's Xavier Silicon that has Carmel cores but shows this issue on Orin Silicon that has Arm® Cortex®-A78AE v8.2 64-bit CPU.
>

Thanks,
Ard.


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#86921): https://edk2.groups.io/g/devel/message/86921
Mute This Topic: https://groups.io/mt/89309504/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable
Posted by Ard Biesheuvel 2 years, 2 months ago
On Wed, 23 Feb 2022 at 19:14, Ashish Singhal <ashishsingha@nvidia.com> wrote:
>
> Ard,
>
> During PrePi, I setup the initial memory map by calling into ArmConfigureMmu function with my memory table where device memory regions have attribute of ARM_MEMORY_REGION_ATTRIBUTE_DEVICE and DRAM regions have attribute of ARM_MEMORY_REGION_ATTRIBUTE_WRITE_BACK.
>
> For device memory, XN bit is set by ArmMemoryAttributeToPageAttribute function. After PrePi, when I add a region of memory to device memory from a DXE driver, I call gDS->AddMemorySpace with EfiGcdMemoryTypeMemoryMappedIo and EFI_MEMORY_UC | EFI_MEMORY_RUNTIME followed by gDS->SetMemorySpaceAttributes with EFI_MEMORY_UC.
>
> Please let me know in case I have still not understood your question.
>

This all looks ok. But the real question is whether the address that
the speculative access targets is mapped using the XN attribute or
not, so you will need to find a way to check that.

So there are a couple of options:
- The XN attribute is set correctly, but the CPU is speculatively
fetching instructions anyway. This would imply a severe hardware bug,
and flushing the I-cache is unlikely to make a difference.
- The speculative access is not the result of an instruction fetch, in
which case I-cache maintenance is unlikely to help either.
- The XN bit is not being set correctly, and so the MMU code needs to be fixed.

Papering over this by adding I-cache maintenance doesn't seem the best
course of action tbh.

-- 
Ard.


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#86927): https://edk2.groups.io/g/devel/message/86927
Mute This Topic: https://groups.io/mt/89309504/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable
Posted by Ashish Singhal via groups.io 2 years, 2 months ago
Hello Ard,

When we had a discussion on this topic earlier, maybe a few weeks back, we thought device memory is being accessed in a speculative manner. Our latest debug where we focussed on MMU page tables at the time of error tells that the issue is actually DRAM mapping in page tables getting corrupt (as per DS-5) where descriptor for a page seems to be something garbage. What this causes is that a valid input address in DRAM may get translated to an address in a different region of DRAM or some address in device memory.

When I invalidate the instruction cache after enabling MMUs, this issue seems to be getting resolved. Again, I am not saying this is a fix but this is something that solves the issue while we are looking for suggestions from you for a proper fix.

I have tried to summarize the problem based on the latest findings a few emails down the trail if you want to have a look at that again.

Thanks
Ashish
________________________________
From: Ard Biesheuvel <ardb@kernel.org>
Sent: Wednesday, February 23, 2022 3:54 PM
To: Ashish Singhal <ashishsingha@nvidia.com>
Cc: edk2-devel-groups-io <devel@edk2.groups.io>; Marc Zyngier <maz@kernel.org>; Sami Mujawar <sami.mujawar@arm.com>; Ard Biesheuvel <ardb+tianocore@kernel.org>; Leif Lindholm <quic_llindhol@quicinc.com>
Subject: Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable

External email: Use caution opening links or attachments


On Wed, 23 Feb 2022 at 19:14, Ashish Singhal <ashishsingha@nvidia.com> wrote:
>
> Ard,
>
> During PrePi, I setup the initial memory map by calling into ArmConfigureMmu function with my memory table where device memory regions have attribute of ARM_MEMORY_REGION_ATTRIBUTE_DEVICE and DRAM regions have attribute of ARM_MEMORY_REGION_ATTRIBUTE_WRITE_BACK.
>
> For device memory, XN bit is set by ArmMemoryAttributeToPageAttribute function. After PrePi, when I add a region of memory to device memory from a DXE driver, I call gDS->AddMemorySpace with EfiGcdMemoryTypeMemoryMappedIo and EFI_MEMORY_UC | EFI_MEMORY_RUNTIME followed by gDS->SetMemorySpaceAttributes with EFI_MEMORY_UC.
>
> Please let me know in case I have still not understood your question.
>

This all looks ok. But the real question is whether the address that
the speculative access targets is mapped using the XN attribute or
not, so you will need to find a way to check that.

So there are a couple of options:
- The XN attribute is set correctly, but the CPU is speculatively
fetching instructions anyway. This would imply a severe hardware bug,
and flushing the I-cache is unlikely to make a difference.
- The speculative access is not the result of an instruction fetch, in
which case I-cache maintenance is unlikely to help either.
- The XN bit is not being set correctly, and so the MMU code needs to be fixed.

Papering over this by adding I-cache maintenance doesn't seem the best
course of action tbh.

--
Ard.


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#86950): https://edk2.groups.io/g/devel/message/86950
Mute This Topic: https://groups.io/mt/89309504/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-


Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable
Posted by Ashish Singhal via groups.io 2 years, 2 months ago
@Marc Zyngier<mailto:maz@kernel.org>

I have addressed your comment regarding the type of data synchronization barrier and have posted v2 of the patch set<https://edk2.groups.io/g/devel/message/87030>.

@Ard Biesheuvel<mailto:ardb@kernel.org>

I am still looking for a better solution to the problem and I am open to working with you or anyone else from ARM on that. Please let me know if you want me to try something else.

Thanks
Ashish
________________________________
From: Ashish Singhal <ashishsingha@nvidia.com>
Sent: Wednesday, February 23, 2022 11:01 PM
To: Ard Biesheuvel <ardb@kernel.org>
Cc: edk2-devel-groups-io <devel@edk2.groups.io>; Marc Zyngier <maz@kernel.org>; Sami Mujawar <sami.mujawar@arm.com>; Ard Biesheuvel <ardb+tianocore@kernel.org>; Leif Lindholm <quic_llindhol@quicinc.com>
Subject: Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable

Hello Ard,

When we had a discussion on this topic earlier, maybe a few weeks back, we thought device memory is being accessed in a speculative manner. Our latest debug where we focussed on MMU page tables at the time of error tells that the issue is actually DRAM mapping in page tables getting corrupt (as per DS-5) where descriptor for a page seems to be something garbage. What this causes is that a valid input address in DRAM may get translated to an address in a different region of DRAM or some address in device memory.

When I invalidate the instruction cache after enabling MMUs, this issue seems to be getting resolved. Again, I am not saying this is a fix but this is something that solves the issue while we are looking for suggestions from you for a proper fix.

I have tried to summarize the problem based on the latest findings a few emails down the trail if you want to have a look at that again.

Thanks
Ashish
________________________________
From: Ard Biesheuvel <ardb@kernel.org>
Sent: Wednesday, February 23, 2022 3:54 PM
To: Ashish Singhal <ashishsingha@nvidia.com>
Cc: edk2-devel-groups-io <devel@edk2.groups.io>; Marc Zyngier <maz@kernel.org>; Sami Mujawar <sami.mujawar@arm.com>; Ard Biesheuvel <ardb+tianocore@kernel.org>; Leif Lindholm <quic_llindhol@quicinc.com>
Subject: Re: [edk2-devel] [PATCH] ArmPkg: Invalidate Instruction Cache On MMU Enable

External email: Use caution opening links or attachments


On Wed, 23 Feb 2022 at 19:14, Ashish Singhal <ashishsingha@nvidia.com> wrote:
>
> Ard,
>
> During PrePi, I setup the initial memory map by calling into ArmConfigureMmu function with my memory table where device memory regions have attribute of ARM_MEMORY_REGION_ATTRIBUTE_DEVICE and DRAM regions have attribute of ARM_MEMORY_REGION_ATTRIBUTE_WRITE_BACK.
>
> For device memory, XN bit is set by ArmMemoryAttributeToPageAttribute function. After PrePi, when I add a region of memory to device memory from a DXE driver, I call gDS->AddMemorySpace with EfiGcdMemoryTypeMemoryMappedIo and EFI_MEMORY_UC | EFI_MEMORY_RUNTIME followed by gDS->SetMemorySpaceAttributes with EFI_MEMORY_UC.
>
> Please let me know in case I have still not understood your question.
>

This all looks ok. But the real question is whether the address that
the speculative access targets is mapped using the XN attribute or
not, so you will need to find a way to check that.

So there are a couple of options:
- The XN attribute is set correctly, but the CPU is speculatively
fetching instructions anyway. This would imply a severe hardware bug,
and flushing the I-cache is unlikely to make a difference.
- The speculative access is not the result of an instruction fetch, in
which case I-cache maintenance is unlikely to help either.
- The XN bit is not being set correctly, and so the MMU code needs to be fixed.

Papering over this by adding I-cache maintenance doesn't seem the best
course of action tbh.

--
Ard.


-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#87031): https://edk2.groups.io/g/devel/message/87031
Mute This Topic: https://groups.io/mt/89309504/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-