[edk2] [PATCH v5] UefiCpuPkg/CpuDxe: Fix multiple entries of RT_CODE in memory map

Jian J Wang posted 1 patch 6 years, 5 months ago
Failed in applying to current master (apply log)
UefiCpuPkg/CpuDxe/CpuPageTable.c | 69 +++++++++++++++++++++++++++++-----------
1 file changed, 50 insertions(+), 19 deletions(-)
[edk2] [PATCH v5] UefiCpuPkg/CpuDxe: Fix multiple entries of RT_CODE in memory map
Posted by Jian J Wang 6 years, 5 months ago
> v5:
>    Coding style clean-up

> v4:
> a. Remove DoUpdate and check attributes mismatch all the time to avoid
>    a logic hole
> b. Add warning message if failed to update capability
> c. Add local variable to hold new attributes to make code cleaner

> v3:
> a. Add comment to explain more on updating memory capabilities
> b. Fix logic hole in updating attributes
> c. Instead of checking illegal memory space address and size, use return
>    status of gDS->SetMemorySpaceCapabilities() to skip memory block which
>    cannot be updated with new capabilities.

> v2
> a. Fix an issue which will cause setting capability failure if size is smaller
>    than a page.

More than one entry of RT_CODE memory might cause boot problem for some
old OSs. This patch will fix this issue to keep OS compatibility as much
as possible.

More detailed information, please refer to
    https://bugzilla.tianocore.org/show_bug.cgi?id=753

Cc: Eric Dong <eric.dong@intel.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Contributed-under: TianoCore Contribution Agreement 1.1
Signed-off-by: Jian J Wang <jian.j.wang@intel.com>
---
 UefiCpuPkg/CpuDxe/CpuPageTable.c | 69 +++++++++++++++++++++++++++++-----------
 1 file changed, 50 insertions(+), 19 deletions(-)

diff --git a/UefiCpuPkg/CpuDxe/CpuPageTable.c b/UefiCpuPkg/CpuDxe/CpuPageTable.c
index d312eb66f8..61537838b7 100644
--- a/UefiCpuPkg/CpuDxe/CpuPageTable.c
+++ b/UefiCpuPkg/CpuDxe/CpuPageTable.c
@@ -789,8 +789,7 @@ RefreshGcdMemoryAttributesFromPaging (
   UINT64                              BaseAddress;
   UINT64                              PageStartAddress;
   UINT64                              Attributes;
-  UINT64                              Capabilities;
-  BOOLEAN                             DoUpdate;
+  UINT64                              NewAttributes;
   UINTN                               Index;
 
   //
@@ -802,9 +801,8 @@ RefreshGcdMemoryAttributesFromPaging (
 
   GetCurrentPagingContext (&PagingContext);
 
-  DoUpdate      = FALSE;
-  Capabilities  = 0;
   Attributes    = 0;
+  NewAttributes = 0;
   BaseAddress   = 0;
   PageLength    = 0;
 
@@ -813,6 +811,34 @@ RefreshGcdMemoryAttributesFromPaging (
       continue;
     }
 
+    //
+    // Sync the actual paging related capabilities back to GCD service first.
+    // As a side effect (good one), this can also help to avoid unnecessary
+    // memory map entries due to the different capabilities of the same type
+    // memory, such as multiple RT_CODE and RT_DATA entries in memory map,
+    // which could cause boot failure of some old Linux distro (before v4.3).
+    //
+    Status = gDS->SetMemorySpaceCapabilities (
+                    MemorySpaceMap[Index].BaseAddress,
+                    MemorySpaceMap[Index].Length,
+                    MemorySpaceMap[Index].Capabilities |
+                    EFI_MEMORY_PAGETYPE_MASK
+                    );
+    if (EFI_ERROR (Status)) {
+      //
+      // If we cannot udpate the capabilities, we cannot update its
+      // attributes either. So just simply skip current block of memory.
+      //
+      DEBUG ((
+        DEBUG_WARN,
+        "Failed to update capability: [%lu] %016lx - %016lx (%016lx -> %016lx)\r\n",
+        (UINT64)Index, BaseAddress, BaseAddress + Length - 1,
+        MemorySpaceMap[Index].Capabilities,
+        MemorySpaceMap[Index].Capabilities | EFI_MEMORY_PAGETYPE_MASK
+        ));
+      continue;
+    }
+
     if (MemorySpaceMap[Index].BaseAddress >= (BaseAddress + PageLength)) {
       //
       // Current memory space starts at a new page. Resetting PageLength will
@@ -826,7 +852,9 @@ RefreshGcdMemoryAttributesFromPaging (
       PageLength -= (MemorySpaceMap[Index].BaseAddress - BaseAddress);
     }
 
-    // Sync real page attributes to GCD
+    //
+    // Sync actual page attributes to GCD
+    //
     BaseAddress       = MemorySpaceMap[Index].BaseAddress;
     MemorySpaceLength = MemorySpaceMap[Index].Length;
     while (MemorySpaceLength > 0) {
@@ -842,23 +870,26 @@ RefreshGcdMemoryAttributesFromPaging (
         PageStartAddress  = (*PageEntry) & (UINT64)PageAttributeToMask(PageAttribute);
         PageLength        = PageAttributeToLength (PageAttribute) - (BaseAddress - PageStartAddress);
         Attributes        = GetAttributesFromPageEntry (PageEntry);
-
-        if (Attributes != (MemorySpaceMap[Index].Attributes & EFI_MEMORY_PAGETYPE_MASK)) {
-          DoUpdate = TRUE;
-          Attributes |= (MemorySpaceMap[Index].Attributes & ~EFI_MEMORY_PAGETYPE_MASK);
-          Capabilities = Attributes | MemorySpaceMap[Index].Capabilities;
-        } else {
-          DoUpdate = FALSE;
-        }
       }
 
       Length = MIN (PageLength, MemorySpaceLength);
-      if (DoUpdate) {
-        gDS->SetMemorySpaceCapabilities (BaseAddress, Length, Capabilities);
-        gDS->SetMemorySpaceAttributes (BaseAddress, Length, Attributes);
-        DEBUG ((DEBUG_INFO, "Update memory space attribute: [%02d] %016lx - %016lx (%08lx -> %08lx)\r\n",
-                             Index, BaseAddress, BaseAddress + Length - 1,
-                             MemorySpaceMap[Index].Attributes, Attributes));
+      if (Attributes != (MemorySpaceMap[Index].Attributes &
+                         EFI_MEMORY_PAGETYPE_MASK)) {
+        NewAttributes = (MemorySpaceMap[Index].Attributes &
+                         ~EFI_MEMORY_PAGETYPE_MASK) | Attributes;
+        Status = gDS->SetMemorySpaceAttributes (
+                        BaseAddress,
+                        Length,
+                        NewAttributes
+                        );
+        ASSERT_EFI_ERROR (Status);
+        DEBUG ((
+          DEBUG_INFO,
+          "Updated memory space attribute: [%lu] %016lx - %016lx (%016lx -> %016lx)\r\n",
+          (UINT64)Index, BaseAddress, BaseAddress + Length - 1,
+          MemorySpaceMap[Index].Attributes,
+          NewAttributes
+          ));
       }
 
       PageLength        -= Length;
-- 
2.14.1.windows.1

_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
Re: [edk2] [PATCH v5] UefiCpuPkg/CpuDxe: Fix multiple entries of RT_CODE in memory map
Posted by Laszlo Ersek 6 years, 5 months ago
Hi Jian,

I'm CC'ing Ard and Matt, and commenting at the bottom.

On 11/10/17 02:02, Jian J Wang wrote:
>> v5:
>>    Coding style clean-up
> 
>> v4:
>> a. Remove DoUpdate and check attributes mismatch all the time to avoid
>>    a logic hole
>> b. Add warning message if failed to update capability
>> c. Add local variable to hold new attributes to make code cleaner
> 
>> v3:
>> a. Add comment to explain more on updating memory capabilities
>> b. Fix logic hole in updating attributes
>> c. Instead of checking illegal memory space address and size, use return
>>    status of gDS->SetMemorySpaceCapabilities() to skip memory block which
>>    cannot be updated with new capabilities.
> 
>> v2
>> a. Fix an issue which will cause setting capability failure if size is smaller
>>    than a page.
> 
> More than one entry of RT_CODE memory might cause boot problem for some
> old OSs. This patch will fix this issue to keep OS compatibility as much
> as possible.
> 
> More detailed information, please refer to
>     https://bugzilla.tianocore.org/show_bug.cgi?id=753
> 
> Cc: Eric Dong <eric.dong@intel.com>
> Cc: Jiewen Yao <jiewen.yao@intel.com>
> Cc: Laszlo Ersek <lersek@redhat.com>
> Contributed-under: TianoCore Contribution Agreement 1.1
> Signed-off-by: Jian J Wang <jian.j.wang@intel.com>
> ---
>  UefiCpuPkg/CpuDxe/CpuPageTable.c | 69 +++++++++++++++++++++++++++++-----------
>  1 file changed, 50 insertions(+), 19 deletions(-)
> 
> diff --git a/UefiCpuPkg/CpuDxe/CpuPageTable.c b/UefiCpuPkg/CpuDxe/CpuPageTable.c
> index d312eb66f8..61537838b7 100644
> --- a/UefiCpuPkg/CpuDxe/CpuPageTable.c
> +++ b/UefiCpuPkg/CpuDxe/CpuPageTable.c
> @@ -789,8 +789,7 @@ RefreshGcdMemoryAttributesFromPaging (
>    UINT64                              BaseAddress;
>    UINT64                              PageStartAddress;
>    UINT64                              Attributes;
> -  UINT64                              Capabilities;
> -  BOOLEAN                             DoUpdate;
> +  UINT64                              NewAttributes;
>    UINTN                               Index;
>  
>    //
> @@ -802,9 +801,8 @@ RefreshGcdMemoryAttributesFromPaging (
>  
>    GetCurrentPagingContext (&PagingContext);
>  
> -  DoUpdate      = FALSE;
> -  Capabilities  = 0;
>    Attributes    = 0;
> +  NewAttributes = 0;
>    BaseAddress   = 0;
>    PageLength    = 0;
>  
> @@ -813,6 +811,34 @@ RefreshGcdMemoryAttributesFromPaging (
>        continue;
>      }
>  
> +    //
> +    // Sync the actual paging related capabilities back to GCD service first.
> +    // As a side effect (good one), this can also help to avoid unnecessary
> +    // memory map entries due to the different capabilities of the same type
> +    // memory, such as multiple RT_CODE and RT_DATA entries in memory map,
> +    // which could cause boot failure of some old Linux distro (before v4.3).
> +    //
> +    Status = gDS->SetMemorySpaceCapabilities (
> +                    MemorySpaceMap[Index].BaseAddress,
> +                    MemorySpaceMap[Index].Length,
> +                    MemorySpaceMap[Index].Capabilities |
> +                    EFI_MEMORY_PAGETYPE_MASK
> +                    );
> +    if (EFI_ERROR (Status)) {
> +      //
> +      // If we cannot udpate the capabilities, we cannot update its
> +      // attributes either. So just simply skip current block of memory.
> +      //
> +      DEBUG ((
> +        DEBUG_WARN,
> +        "Failed to update capability: [%lu] %016lx - %016lx (%016lx -> %016lx)\r\n",
> +        (UINT64)Index, BaseAddress, BaseAddress + Length - 1,
> +        MemorySpaceMap[Index].Capabilities,
> +        MemorySpaceMap[Index].Capabilities | EFI_MEMORY_PAGETYPE_MASK
> +        ));
> +      continue;
> +    }
> +
>      if (MemorySpaceMap[Index].BaseAddress >= (BaseAddress + PageLength)) {
>        //
>        // Current memory space starts at a new page. Resetting PageLength will
> @@ -826,7 +852,9 @@ RefreshGcdMemoryAttributesFromPaging (
>        PageLength -= (MemorySpaceMap[Index].BaseAddress - BaseAddress);
>      }
>  
> -    // Sync real page attributes to GCD
> +    //
> +    // Sync actual page attributes to GCD
> +    //
>      BaseAddress       = MemorySpaceMap[Index].BaseAddress;
>      MemorySpaceLength = MemorySpaceMap[Index].Length;
>      while (MemorySpaceLength > 0) {
> @@ -842,23 +870,26 @@ RefreshGcdMemoryAttributesFromPaging (
>          PageStartAddress  = (*PageEntry) & (UINT64)PageAttributeToMask(PageAttribute);
>          PageLength        = PageAttributeToLength (PageAttribute) - (BaseAddress - PageStartAddress);
>          Attributes        = GetAttributesFromPageEntry (PageEntry);
> -
> -        if (Attributes != (MemorySpaceMap[Index].Attributes & EFI_MEMORY_PAGETYPE_MASK)) {
> -          DoUpdate = TRUE;
> -          Attributes |= (MemorySpaceMap[Index].Attributes & ~EFI_MEMORY_PAGETYPE_MASK);
> -          Capabilities = Attributes | MemorySpaceMap[Index].Capabilities;
> -        } else {
> -          DoUpdate = FALSE;
> -        }
>        }
>  
>        Length = MIN (PageLength, MemorySpaceLength);
> -      if (DoUpdate) {
> -        gDS->SetMemorySpaceCapabilities (BaseAddress, Length, Capabilities);
> -        gDS->SetMemorySpaceAttributes (BaseAddress, Length, Attributes);
> -        DEBUG ((DEBUG_INFO, "Update memory space attribute: [%02d] %016lx - %016lx (%08lx -> %08lx)\r\n",
> -                             Index, BaseAddress, BaseAddress + Length - 1,
> -                             MemorySpaceMap[Index].Attributes, Attributes));
> +      if (Attributes != (MemorySpaceMap[Index].Attributes &
> +                         EFI_MEMORY_PAGETYPE_MASK)) {
> +        NewAttributes = (MemorySpaceMap[Index].Attributes &
> +                         ~EFI_MEMORY_PAGETYPE_MASK) | Attributes;
> +        Status = gDS->SetMemorySpaceAttributes (
> +                        BaseAddress,
> +                        Length,
> +                        NewAttributes
> +                        );
> +        ASSERT_EFI_ERROR (Status);
> +        DEBUG ((
> +          DEBUG_INFO,
> +          "Updated memory space attribute: [%lu] %016lx - %016lx (%016lx -> %016lx)\r\n",
> +          (UINT64)Index, BaseAddress, BaseAddress + Length - 1,
> +          MemorySpaceMap[Index].Attributes,
> +          NewAttributes
> +          ));
>        }
>  
>        PageLength        -= Length;
> 

So, I was ready to give my R-b for this patch, but then I also wanted to
test it. I applied the patch on current edk2 master (7e2a8dfe8a9a,
"ArmPlatformPkg/PrePeiCore: seed temporary stack before entering PEI
core", 2017-10-20), and built OVMF like this:

$ build \
  -a IA32 \
  -a X64 \
  -p OvmfPkg/OvmfPkgIa32X64.dsc \
  -t GCC48 \
  -b NOOPT \
  -D SMM_REQUIRE \
  -D SECURE_BOOT_ENABLE \
  -D E1000_ENABLE \
  -D HTTP_BOOT_ENABLE

For testing I used a recent-ish upstream QEMU development build
(ae49fbbcd8e4, "Merge remote-tracking branch
'remotes/rth/tags/pull-tcg-20171025' into staging", 2017-10-25), with
the Q35 machine type (which is required by SMM anyway).

The results vary across guest OSes:

(1) Up-to-date Fedora 26 guest crashes during boot, with the following
call stack:

BUG: unable to handle kernel paging request at fffffffefe893018
Call Trace:
 ? __change_page_attr_set_clr+0xaa6/0xd70
 ? kernel_map_pages_in_pgd+0xbc/0xd0
 ? efi_call+0x58/0x90
 ? virt_efi_set_variable.part.7+0x66/0x120
 ? virt_efi_set_variable+0x4f/0x60
 ? efi_delete_dummy_variable+0x62/0x90
 ? efi_enter_virtual_mode+0x4d4/0x4e8
 ? efi_enter_virtual_mode+0x4d4/0x4e8
 ? start_kernel+0x442/0x4e6
 ? early_idt_handler_array+0x120/0x120
 ? x86_64_start_reservations+0x24/0x26
 ? x86_64_start_kernel+0x13e/0x161
 ? secondary_startup_64+0x9f/0x9f

(2) The following Windows OSes all boot successfully:

- Windows 7
- Windows Server 2008 R2
- Windows 8.1
- Windows Server 2012 R2
- Windows 10

(3) Windows Server 2016 crashes with a BSOD; reporting "ATTEMPTED WRITE
TO READONLY MEMORY".

(Without the patch, all OSes boot OK.)


I'm attaching a ZIP file with the following contents (note that I'll
attach the same file to TianoCore BZ#753 as well, because the mailing
list archive(s) don't seem to preserve attachments):

- "ovmf.pre.txt", "shell.memmap.pre.txt", "kernel.pre.txt": OVMF log,
MEMMAP command output in the UEFI shell, and Fedora 26 kernel boot log
(successful) *before* applying your patch. The kernel log is detailed
(the cmdline had "ignore_loglevel" and "efi=debug").

- "ovmf.post.txt", "shell.memmap.post.txt", "kernel.post.txt": same
files as above, but saved *after* applying your patch. This is when the
F26 kernel crashes.

- "win2016.post.png": screenshot of the Windows Server 2016 boot failure
(after the patch was applied).

Thanks,
Laszlo
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
Re: [edk2] [PATCH v5] UefiCpuPkg/CpuDxe: Fix multiple entries of RT_CODE in memory map
Posted by Wang, Jian J 6 years, 5 months ago
This is really a surprise. Anyway, thanks for validating so many OSs. I guess we 
have to turn to your suggestion before, which is adding capability to affected 
memory block only, not all memory space.

> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek@redhat.com]
> Sent: Friday, November 10, 2017 8:24 PM
> To: Wang, Jian J <jian.j.wang@intel.com>
> Cc: edk2-devel@lists.01.org; Dong, Eric <eric.dong@intel.com>; Yao, Jiewen
> <jiewen.yao@intel.com>; Ard Biesheuvel <ard.biesheuvel@linaro.org>; Matt
> Fleming <matt@codeblueprint.co.uk>
> Subject: Re: [PATCH v5] UefiCpuPkg/CpuDxe: Fix multiple entries of RT_CODE in
> memory map
> 
> Hi Jian,
> 
> I'm CC'ing Ard and Matt, and commenting at the bottom.
> 
> On 11/10/17 02:02, Jian J Wang wrote:
> >> v5:
> >>    Coding style clean-up
> >
> >> v4:
> >> a. Remove DoUpdate and check attributes mismatch all the time to avoid
> >>    a logic hole
> >> b. Add warning message if failed to update capability
> >> c. Add local variable to hold new attributes to make code cleaner
> >
> >> v3:
> >> a. Add comment to explain more on updating memory capabilities
> >> b. Fix logic hole in updating attributes
> >> c. Instead of checking illegal memory space address and size, use return
> >>    status of gDS->SetMemorySpaceCapabilities() to skip memory block which
> >>    cannot be updated with new capabilities.
> >
> >> v2
> >> a. Fix an issue which will cause setting capability failure if size is smaller
> >>    than a page.
> >
> > More than one entry of RT_CODE memory might cause boot problem for
> some
> > old OSs. This patch will fix this issue to keep OS compatibility as much
> > as possible.
> >
> > More detailed information, please refer to
> >     https://bugzilla.tianocore.org/show_bug.cgi?id=753
> >
> > Cc: Eric Dong <eric.dong@intel.com>
> > Cc: Jiewen Yao <jiewen.yao@intel.com>
> > Cc: Laszlo Ersek <lersek@redhat.com>
> > Contributed-under: TianoCore Contribution Agreement 1.1
> > Signed-off-by: Jian J Wang <jian.j.wang@intel.com>
> > ---
> >  UefiCpuPkg/CpuDxe/CpuPageTable.c | 69
> +++++++++++++++++++++++++++++-----------
> >  1 file changed, 50 insertions(+), 19 deletions(-)
> >
> > diff --git a/UefiCpuPkg/CpuDxe/CpuPageTable.c
> b/UefiCpuPkg/CpuDxe/CpuPageTable.c
> > index d312eb66f8..61537838b7 100644
> > --- a/UefiCpuPkg/CpuDxe/CpuPageTable.c
> > +++ b/UefiCpuPkg/CpuDxe/CpuPageTable.c
> > @@ -789,8 +789,7 @@ RefreshGcdMemoryAttributesFromPaging (
> >    UINT64                              BaseAddress;
> >    UINT64                              PageStartAddress;
> >    UINT64                              Attributes;
> > -  UINT64                              Capabilities;
> > -  BOOLEAN                             DoUpdate;
> > +  UINT64                              NewAttributes;
> >    UINTN                               Index;
> >
> >    //
> > @@ -802,9 +801,8 @@ RefreshGcdMemoryAttributesFromPaging (
> >
> >    GetCurrentPagingContext (&PagingContext);
> >
> > -  DoUpdate      = FALSE;
> > -  Capabilities  = 0;
> >    Attributes    = 0;
> > +  NewAttributes = 0;
> >    BaseAddress   = 0;
> >    PageLength    = 0;
> >
> > @@ -813,6 +811,34 @@ RefreshGcdMemoryAttributesFromPaging (
> >        continue;
> >      }
> >
> > +    //
> > +    // Sync the actual paging related capabilities back to GCD service first.
> > +    // As a side effect (good one), this can also help to avoid unnecessary
> > +    // memory map entries due to the different capabilities of the same type
> > +    // memory, such as multiple RT_CODE and RT_DATA entries in memory
> map,
> > +    // which could cause boot failure of some old Linux distro (before v4.3).
> > +    //
> > +    Status = gDS->SetMemorySpaceCapabilities (
> > +                    MemorySpaceMap[Index].BaseAddress,
> > +                    MemorySpaceMap[Index].Length,
> > +                    MemorySpaceMap[Index].Capabilities |
> > +                    EFI_MEMORY_PAGETYPE_MASK
> > +                    );
> > +    if (EFI_ERROR (Status)) {
> > +      //
> > +      // If we cannot udpate the capabilities, we cannot update its
> > +      // attributes either. So just simply skip current block of memory.
> > +      //
> > +      DEBUG ((
> > +        DEBUG_WARN,
> > +        "Failed to update capability: [%lu] %016lx - %016lx (%016lx -
> > %016lx)\r\n",
> > +        (UINT64)Index, BaseAddress, BaseAddress + Length - 1,
> > +        MemorySpaceMap[Index].Capabilities,
> > +        MemorySpaceMap[Index].Capabilities | EFI_MEMORY_PAGETYPE_MASK
> > +        ));
> > +      continue;
> > +    }
> > +
> >      if (MemorySpaceMap[Index].BaseAddress >= (BaseAddress + PageLength)) {
> >        //
> >        // Current memory space starts at a new page. Resetting PageLength will
> > @@ -826,7 +852,9 @@ RefreshGcdMemoryAttributesFromPaging (
> >        PageLength -= (MemorySpaceMap[Index].BaseAddress - BaseAddress);
> >      }
> >
> > -    // Sync real page attributes to GCD
> > +    //
> > +    // Sync actual page attributes to GCD
> > +    //
> >      BaseAddress       = MemorySpaceMap[Index].BaseAddress;
> >      MemorySpaceLength = MemorySpaceMap[Index].Length;
> >      while (MemorySpaceLength > 0) {
> > @@ -842,23 +870,26 @@ RefreshGcdMemoryAttributesFromPaging (
> >          PageStartAddress  = (*PageEntry) &
> (UINT64)PageAttributeToMask(PageAttribute);
> >          PageLength        = PageAttributeToLength (PageAttribute) - (BaseAddress -
> PageStartAddress);
> >          Attributes        = GetAttributesFromPageEntry (PageEntry);
> > -
> > -        if (Attributes != (MemorySpaceMap[Index].Attributes &
> EFI_MEMORY_PAGETYPE_MASK)) {
> > -          DoUpdate = TRUE;
> > -          Attributes |= (MemorySpaceMap[Index].Attributes &
> ~EFI_MEMORY_PAGETYPE_MASK);
> > -          Capabilities = Attributes | MemorySpaceMap[Index].Capabilities;
> > -        } else {
> > -          DoUpdate = FALSE;
> > -        }
> >        }
> >
> >        Length = MIN (PageLength, MemorySpaceLength);
> > -      if (DoUpdate) {
> > -        gDS->SetMemorySpaceCapabilities (BaseAddress, Length, Capabilities);
> > -        gDS->SetMemorySpaceAttributes (BaseAddress, Length, Attributes);
> > -        DEBUG ((DEBUG_INFO, "Update memory space attribute: [%02d] %016lx
> - %016lx (%08lx -> %08lx)\r\n",
> > -                             Index, BaseAddress, BaseAddress + Length - 1,
> > -                             MemorySpaceMap[Index].Attributes, Attributes));
> > +      if (Attributes != (MemorySpaceMap[Index].Attributes &
> > +                         EFI_MEMORY_PAGETYPE_MASK)) {
> > +        NewAttributes = (MemorySpaceMap[Index].Attributes &
> > +                         ~EFI_MEMORY_PAGETYPE_MASK) | Attributes;
> > +        Status = gDS->SetMemorySpaceAttributes (
> > +                        BaseAddress,
> > +                        Length,
> > +                        NewAttributes
> > +                        );
> > +        ASSERT_EFI_ERROR (Status);
> > +        DEBUG ((
> > +          DEBUG_INFO,
> > +          "Updated memory space attribute: [%lu] %016lx - %016lx (%016lx -
> > %016lx)\r\n",
> > +          (UINT64)Index, BaseAddress, BaseAddress + Length - 1,
> > +          MemorySpaceMap[Index].Attributes,
> > +          NewAttributes
> > +          ));
> >        }
> >
> >        PageLength        -= Length;
> >
> 
> So, I was ready to give my R-b for this patch, but then I also wanted to
> test it. I applied the patch on current edk2 master (7e2a8dfe8a9a,
> "ArmPlatformPkg/PrePeiCore: seed temporary stack before entering PEI
> core", 2017-10-20), and built OVMF like this:
> 
> $ build \
>   -a IA32 \
>   -a X64 \
>   -p OvmfPkg/OvmfPkgIa32X64.dsc \
>   -t GCC48 \
>   -b NOOPT \
>   -D SMM_REQUIRE \
>   -D SECURE_BOOT_ENABLE \
>   -D E1000_ENABLE \
>   -D HTTP_BOOT_ENABLE
> 
> For testing I used a recent-ish upstream QEMU development build
> (ae49fbbcd8e4, "Merge remote-tracking branch
> 'remotes/rth/tags/pull-tcg-20171025' into staging", 2017-10-25), with
> the Q35 machine type (which is required by SMM anyway).
> 
> The results vary across guest OSes:
> 
> (1) Up-to-date Fedora 26 guest crashes during boot, with the following
> call stack:
> 
> BUG: unable to handle kernel paging request at fffffffefe893018
> Call Trace:
>  ? __change_page_attr_set_clr+0xaa6/0xd70
>  ? kernel_map_pages_in_pgd+0xbc/0xd0
>  ? efi_call+0x58/0x90
>  ? virt_efi_set_variable.part.7+0x66/0x120
>  ? virt_efi_set_variable+0x4f/0x60
>  ? efi_delete_dummy_variable+0x62/0x90
>  ? efi_enter_virtual_mode+0x4d4/0x4e8
>  ? efi_enter_virtual_mode+0x4d4/0x4e8
>  ? start_kernel+0x442/0x4e6
>  ? early_idt_handler_array+0x120/0x120
>  ? x86_64_start_reservations+0x24/0x26
>  ? x86_64_start_kernel+0x13e/0x161
>  ? secondary_startup_64+0x9f/0x9f
> 
> (2) The following Windows OSes all boot successfully:
> 
> - Windows 7
> - Windows Server 2008 R2
> - Windows 8.1
> - Windows Server 2012 R2
> - Windows 10
> 
> (3) Windows Server 2016 crashes with a BSOD; reporting "ATTEMPTED WRITE
> TO READONLY MEMORY".
> 
> (Without the patch, all OSes boot OK.)
> 
> 
> I'm attaching a ZIP file with the following contents (note that I'll
> attach the same file to TianoCore BZ#753 as well, because the mailing
> list archive(s) don't seem to preserve attachments):
> 
> - "ovmf.pre.txt", "shell.memmap.pre.txt", "kernel.pre.txt": OVMF log,
> MEMMAP command output in the UEFI shell, and Fedora 26 kernel boot log
> (successful) *before* applying your patch. The kernel log is detailed
> (the cmdline had "ignore_loglevel" and "efi=debug").
> 
> - "ovmf.post.txt", "shell.memmap.post.txt", "kernel.post.txt": same
> files as above, but saved *after* applying your patch. This is when the
> F26 kernel crashes.
> 
> - "win2016.post.png": screenshot of the Windows Server 2016 boot failure
> (after the patch was applied).
> 
> Thanks,
> Laszlo
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
Re: [edk2] [PATCH v5] UefiCpuPkg/CpuDxe: Fix multiple entries of RT_CODE in memory map
Posted by Wang, Jian J 6 years, 5 months ago
Hi Laszlo,

I did some investigation works on this issue. I think I found the root cause and
tend to believe this is a Fedora kernel issue. Here're proves:

memmap output patterns in which Fedora 26 failed to boot:
1) BS_Data    00000000AE200000-00000000AE20EFFF 000000000000000F 000000000002600F
    RT_Data    00000000AE20F000-00000000AE38EFFF 0000000000000180 800000000002600F
    RT_Code    00000000AE38F000-00000000AE48EFFF 0000000000000100 800000000002600F
    Reserved   00000000AE48F000-00000000AE58EFFF 0000000000000100 000000000002600F

2) BS_Data    00000000AE200000-00000000AE20EFFF 000000000000000F 000000000000000F
    RT_Data    00000000AE20F000-00000000AE38EFFF 0000000000000180 800000000000000F
    RT_Code    00000000AE38F000-00000000AE467FFF 00000000000000D9 800000000000000F
    RT_Code    00000000AE468000-00000000AE469FFF 0000000000000002 800000000002600F
    RT_Code    00000000AE46A000-00000000AE46EFFF 0000000000000005 800000000000000F
    RT_Code    00000000AE46F000-00000000AE470FFF 0000000000000002 800000000002600F
    RT_Code    00000000AE471000-00000000AE475FFF 0000000000000005 800000000000000F
    RT_Code    00000000AE476000-00000000AE479FFF 0000000000000004 800000000002600F
    RT_Code    00000000AE47A000-00000000AE47FFFF 0000000000000006 800000000000000F
    RT_Code    00000000AE480000-00000000AE481FFF 0000000000000002 800000000002600F
    RT_Code    00000000AE482000-00000000AE487FFF 0000000000000006 800000000000000F
    RT_Code    00000000AE488000-00000000AE489FFF 0000000000000002 800000000002600F
    RT_Code    00000000AE48A000-00000000AE48EFFF 0000000000000005 800000000000000F
    Reserved   00000000AE48F000-00000000AE58EFFF 0000000000000100 000000000000000F

3) RT_Data    00000000AE20F000-00000000AE38EFFF 0000000000000180 800000000000000F
    RT_Code    00000000AE38F000-00000000AE418FFF 000000000000008A 800000000000000F
    RT_Code    00000000AE419000-00000000AE48EFFF 0000000000000076 800000000002600F
    Reserved   00000000AE48F000-00000000AE58EFFF 0000000000000100 000000000000000F

4) BS_Data    00000000AE200000-00000000AE20EFFF 000000000000000F 000000000002400F
    RT_Data    00000000AE20F000-00000000AE38EFFF 0000000000000180 800000000002400F
    RT_Code    00000000AE38F000-00000000AE48EFFF 0000000000000100 800000000002400F
    Reserved   00000000AE48F000-00000000AE58EFFF 0000000000000100 000000000002400F


memmap output pattern in which Fedora 26 booted successfully
5) BS_Data    00000000AE200000-00000000AE20EFFF 000000000000000F 000000000000000F
    RT_Data    00000000AE20F000-00000000AE38EFFF 0000000000000180 800000000000000F
    RT_Code    00000000AE38F000-00000000AE48EFFF 0000000000000100 800000000000000F
    Reserved   00000000AE48F000-00000000AE58EFFF 0000000000000100 000000000000000F

6) BS_Data    00000000AE200000-00000000AE20EFFF 000000000000000F 000000000000000F
    RT_Data    00000000AE20F000-00000000AE38EFFF 0000000000000180 800000000000000F
    RT_Code    00000000AE38F000-00000000AE418FFF 000000000000008A 800000000000000F
    RT_Code    00000000AE419000-00000000AE419FFF 0000000000000001 800000000000400F
    RT_Code    00000000AE41A000-00000000AE41BFFF 0000000000000002 800000000002000F
    RT_Code    00000000AE41C000-00000000AE420FFF 0000000000000005 800000000000400F
    RT_Code    00000000AE421000-00000000AE422FFF 0000000000000002 800000000002000F
    RT_Code    00000000AE423000-00000000AE427FFF 0000000000000005 800000000000400F
    RT_Code    00000000AE428000-00000000AE42AFFF 0000000000000003 800000000002000F
    RT_Code    00000000AE42B000-00000000AE42FFFF 0000000000000005 800000000000400F
    RT_Code    00000000AE430000-00000000AE432FFF 0000000000000003 800000000002000F
    RT_Code    00000000AE433000-00000000AE439FFF 0000000000000007 800000000000400F
    RT_Code    00000000AE43A000-00000000AE43DFFF 0000000000000004 800000000002000F
    RT_Code    00000000AE43E000-00000000AE444FFF 0000000000000007 800000000000400F
    RT_Code    00000000AE445000-00000000AE448FFF 0000000000000004 800000000002000F
    RT_Code    00000000AE449000-00000000AE44EFFF 0000000000000006 800000000000400F
    RT_Code    00000000AE44F000-00000000AE450FFF 0000000000000002 800000000002000F
    RT_Code    00000000AE451000-00000000AE455FFF 0000000000000005 800000000000400F
    RT_Code    00000000AE456000-00000000AE458FFF 0000000000000003 800000000002000F
    RT_Code    00000000AE459000-00000000AE45FFFF 0000000000000007 800000000000400F
    RT_Code    00000000AE460000-00000000AE461FFF 0000000000000002 800000000002000F
    RT_Code    00000000AE462000-00000000AE467FFF 0000000000000006 800000000000400F
    RT_Code    00000000AE468000-00000000AE469FFF 0000000000000002 800000000002000F
    RT_Code    00000000AE46A000-00000000AE46EFFF 0000000000000005 800000000000400F
    RT_Code    00000000AE46F000-00000000AE470FFF 0000000000000002 800000000002000F
    RT_Code    00000000AE471000-00000000AE475FFF 0000000000000005 800000000000400F
    RT_Code    00000000AE476000-00000000AE479FFF 0000000000000004 800000000002000F
    RT_Code    00000000AE47A000-00000000AE47FFFF 0000000000000006 800000000000400F
    RT_Code    00000000AE480000-00000000AE481FFF 0000000000000002 800000000002000F
    RT_Code    00000000AE482000-00000000AE487FFF 0000000000000006 800000000000400F
    RT_Code    00000000AE488000-00000000AE489FFF 0000000000000002 800000000002000F
    RT_Code    00000000AE48A000-00000000AE48EFFF 0000000000000005 800000000000400F
    Reserved   00000000AE48F000-00000000AE58EFFF 0000000000000100 000000000000000F

You may notice that 4) will fail but 6) will succeed. Taking into account the fact that failure
always happened in the runtime service api (set_variable), I think it must be that the kernel will
mark the memory block to be RO/XP/RP memory according to its capabilities but not its current
attributes. 

This can explain why 4) will fail but 6) will succeed. Although memmap shows all runtime 
driver image memory as RT_Code, but they're not all code segment. Instead some of them are 
actually data segment.

It's normal to mark code segment to be RO and data segment to be XP. But mark data segment
to be RO will cause runtime services failure. 4) shows all RT_Code to be XXX24XXX, which means
Fedora kernel will mark all code segment and data segment to be RO and XP mistakenly, based 
on my previous hypothesis. This can also explain why 1), 2), 3) will fail the boot because XXX26XXX
will let Fedora kernel to mark RT_Code memory block to be not-present.

I haven't got time to look into the Linux kernel source so I can't confirm above analysis yet.
I think you're more familiar with kernel source than us. Maybe you could help to take a look.

Thanks,
Jian

> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek@redhat.com]
> Sent: Friday, November 10, 2017 8:24 PM
> To: Wang, Jian J <jian.j.wang@intel.com>
> Cc: edk2-devel@lists.01.org; Dong, Eric <eric.dong@intel.com>; Yao, Jiewen
> <jiewen.yao@intel.com>; Ard Biesheuvel <ard.biesheuvel@linaro.org>; Matt
> Fleming <matt@codeblueprint.co.uk>
> Subject: Re: [PATCH v5] UefiCpuPkg/CpuDxe: Fix multiple entries of RT_CODE in
> memory map
> 
> Hi Jian,
> 
> I'm CC'ing Ard and Matt, and commenting at the bottom.
> 
> On 11/10/17 02:02, Jian J Wang wrote:
> >> v5:
> >>    Coding style clean-up
> >
> >> v4:
> >> a. Remove DoUpdate and check attributes mismatch all the time to avoid
> >>    a logic hole
> >> b. Add warning message if failed to update capability
> >> c. Add local variable to hold new attributes to make code cleaner
> >
> >> v3:
> >> a. Add comment to explain more on updating memory capabilities
> >> b. Fix logic hole in updating attributes
> >> c. Instead of checking illegal memory space address and size, use return
> >>    status of gDS->SetMemorySpaceCapabilities() to skip memory block which
> >>    cannot be updated with new capabilities.
> >
> >> v2
> >> a. Fix an issue which will cause setting capability failure if size is smaller
> >>    than a page.
> >
> > More than one entry of RT_CODE memory might cause boot problem for
> some
> > old OSs. This patch will fix this issue to keep OS compatibility as much
> > as possible.
> >
> > More detailed information, please refer to
> >     https://bugzilla.tianocore.org/show_bug.cgi?id=753
> >
> > Cc: Eric Dong <eric.dong@intel.com>
> > Cc: Jiewen Yao <jiewen.yao@intel.com>
> > Cc: Laszlo Ersek <lersek@redhat.com>
> > Contributed-under: TianoCore Contribution Agreement 1.1
> > Signed-off-by: Jian J Wang <jian.j.wang@intel.com>
> > ---
> >  UefiCpuPkg/CpuDxe/CpuPageTable.c | 69
> +++++++++++++++++++++++++++++-----------
> >  1 file changed, 50 insertions(+), 19 deletions(-)
> >
> > diff --git a/UefiCpuPkg/CpuDxe/CpuPageTable.c
> b/UefiCpuPkg/CpuDxe/CpuPageTable.c
> > index d312eb66f8..61537838b7 100644
> > --- a/UefiCpuPkg/CpuDxe/CpuPageTable.c
> > +++ b/UefiCpuPkg/CpuDxe/CpuPageTable.c
> > @@ -789,8 +789,7 @@ RefreshGcdMemoryAttributesFromPaging (
> >    UINT64                              BaseAddress;
> >    UINT64                              PageStartAddress;
> >    UINT64                              Attributes;
> > -  UINT64                              Capabilities;
> > -  BOOLEAN                             DoUpdate;
> > +  UINT64                              NewAttributes;
> >    UINTN                               Index;
> >
> >    //
> > @@ -802,9 +801,8 @@ RefreshGcdMemoryAttributesFromPaging (
> >
> >    GetCurrentPagingContext (&PagingContext);
> >
> > -  DoUpdate      = FALSE;
> > -  Capabilities  = 0;
> >    Attributes    = 0;
> > +  NewAttributes = 0;
> >    BaseAddress   = 0;
> >    PageLength    = 0;
> >
> > @@ -813,6 +811,34 @@ RefreshGcdMemoryAttributesFromPaging (
> >        continue;
> >      }
> >
> > +    //
> > +    // Sync the actual paging related capabilities back to GCD service first.
> > +    // As a side effect (good one), this can also help to avoid unnecessary
> > +    // memory map entries due to the different capabilities of the same type
> > +    // memory, such as multiple RT_CODE and RT_DATA entries in memory
> map,
> > +    // which could cause boot failure of some old Linux distro (before v4.3).
> > +    //
> > +    Status = gDS->SetMemorySpaceCapabilities (
> > +                    MemorySpaceMap[Index].BaseAddress,
> > +                    MemorySpaceMap[Index].Length,
> > +                    MemorySpaceMap[Index].Capabilities |
> > +                    EFI_MEMORY_PAGETYPE_MASK
> > +                    );
> > +    if (EFI_ERROR (Status)) {
> > +      //
> > +      // If we cannot udpate the capabilities, we cannot update its
> > +      // attributes either. So just simply skip current block of memory.
> > +      //
> > +      DEBUG ((
> > +        DEBUG_WARN,
> > +        "Failed to update capability: [%lu] %016lx - %016lx (%016lx -
> > %016lx)\r\n",
> > +        (UINT64)Index, BaseAddress, BaseAddress + Length - 1,
> > +        MemorySpaceMap[Index].Capabilities,
> > +        MemorySpaceMap[Index].Capabilities | EFI_MEMORY_PAGETYPE_MASK
> > +        ));
> > +      continue;
> > +    }
> > +
> >      if (MemorySpaceMap[Index].BaseAddress >= (BaseAddress + PageLength)) {
> >        //
> >        // Current memory space starts at a new page. Resetting PageLength will
> > @@ -826,7 +852,9 @@ RefreshGcdMemoryAttributesFromPaging (
> >        PageLength -= (MemorySpaceMap[Index].BaseAddress - BaseAddress);
> >      }
> >
> > -    // Sync real page attributes to GCD
> > +    //
> > +    // Sync actual page attributes to GCD
> > +    //
> >      BaseAddress       = MemorySpaceMap[Index].BaseAddress;
> >      MemorySpaceLength = MemorySpaceMap[Index].Length;
> >      while (MemorySpaceLength > 0) {
> > @@ -842,23 +870,26 @@ RefreshGcdMemoryAttributesFromPaging (
> >          PageStartAddress  = (*PageEntry) &
> (UINT64)PageAttributeToMask(PageAttribute);
> >          PageLength        = PageAttributeToLength (PageAttribute) - (BaseAddress -
> PageStartAddress);
> >          Attributes        = GetAttributesFromPageEntry (PageEntry);
> > -
> > -        if (Attributes != (MemorySpaceMap[Index].Attributes &
> EFI_MEMORY_PAGETYPE_MASK)) {
> > -          DoUpdate = TRUE;
> > -          Attributes |= (MemorySpaceMap[Index].Attributes &
> ~EFI_MEMORY_PAGETYPE_MASK);
> > -          Capabilities = Attributes | MemorySpaceMap[Index].Capabilities;
> > -        } else {
> > -          DoUpdate = FALSE;
> > -        }
> >        }
> >
> >        Length = MIN (PageLength, MemorySpaceLength);
> > -      if (DoUpdate) {
> > -        gDS->SetMemorySpaceCapabilities (BaseAddress, Length, Capabilities);
> > -        gDS->SetMemorySpaceAttributes (BaseAddress, Length, Attributes);
> > -        DEBUG ((DEBUG_INFO, "Update memory space attribute: [%02d] %016lx
> - %016lx (%08lx -> %08lx)\r\n",
> > -                             Index, BaseAddress, BaseAddress + Length - 1,
> > -                             MemorySpaceMap[Index].Attributes, Attributes));
> > +      if (Attributes != (MemorySpaceMap[Index].Attributes &
> > +                         EFI_MEMORY_PAGETYPE_MASK)) {
> > +        NewAttributes = (MemorySpaceMap[Index].Attributes &
> > +                         ~EFI_MEMORY_PAGETYPE_MASK) | Attributes;
> > +        Status = gDS->SetMemorySpaceAttributes (
> > +                        BaseAddress,
> > +                        Length,
> > +                        NewAttributes
> > +                        );
> > +        ASSERT_EFI_ERROR (Status);
> > +        DEBUG ((
> > +          DEBUG_INFO,
> > +          "Updated memory space attribute: [%lu] %016lx - %016lx (%016lx -
> > %016lx)\r\n",
> > +          (UINT64)Index, BaseAddress, BaseAddress + Length - 1,
> > +          MemorySpaceMap[Index].Attributes,
> > +          NewAttributes
> > +          ));
> >        }
> >
> >        PageLength        -= Length;
> >
> 
> So, I was ready to give my R-b for this patch, but then I also wanted to
> test it. I applied the patch on current edk2 master (7e2a8dfe8a9a,
> "ArmPlatformPkg/PrePeiCore: seed temporary stack before entering PEI
> core", 2017-10-20), and built OVMF like this:
> 
> $ build \
>   -a IA32 \
>   -a X64 \
>   -p OvmfPkg/OvmfPkgIa32X64.dsc \
>   -t GCC48 \
>   -b NOOPT \
>   -D SMM_REQUIRE \
>   -D SECURE_BOOT_ENABLE \
>   -D E1000_ENABLE \
>   -D HTTP_BOOT_ENABLE
> 
> For testing I used a recent-ish upstream QEMU development build
> (ae49fbbcd8e4, "Merge remote-tracking branch
> 'remotes/rth/tags/pull-tcg-20171025' into staging", 2017-10-25), with
> the Q35 machine type (which is required by SMM anyway).
> 
> The results vary across guest OSes:
> 
> (1) Up-to-date Fedora 26 guest crashes during boot, with the following
> call stack:
> 
> BUG: unable to handle kernel paging request at fffffffefe893018
> Call Trace:
>  ? __change_page_attr_set_clr+0xaa6/0xd70
>  ? kernel_map_pages_in_pgd+0xbc/0xd0
>  ? efi_call+0x58/0x90
>  ? virt_efi_set_variable.part.7+0x66/0x120
>  ? virt_efi_set_variable+0x4f/0x60
>  ? efi_delete_dummy_variable+0x62/0x90
>  ? efi_enter_virtual_mode+0x4d4/0x4e8
>  ? efi_enter_virtual_mode+0x4d4/0x4e8
>  ? start_kernel+0x442/0x4e6
>  ? early_idt_handler_array+0x120/0x120
>  ? x86_64_start_reservations+0x24/0x26
>  ? x86_64_start_kernel+0x13e/0x161
>  ? secondary_startup_64+0x9f/0x9f
> 
> (2) The following Windows OSes all boot successfully:
> 
> - Windows 7
> - Windows Server 2008 R2
> - Windows 8.1
> - Windows Server 2012 R2
> - Windows 10
> 
> (3) Windows Server 2016 crashes with a BSOD; reporting "ATTEMPTED WRITE
> TO READONLY MEMORY".
> 
> (Without the patch, all OSes boot OK.)
> 
> 
> I'm attaching a ZIP file with the following contents (note that I'll
> attach the same file to TianoCore BZ#753 as well, because the mailing
> list archive(s) don't seem to preserve attachments):
> 
> - "ovmf.pre.txt", "shell.memmap.pre.txt", "kernel.pre.txt": OVMF log,
> MEMMAP command output in the UEFI shell, and Fedora 26 kernel boot log
> (successful) *before* applying your patch. The kernel log is detailed
> (the cmdline had "ignore_loglevel" and "efi=debug").
> 
> - "ovmf.post.txt", "shell.memmap.post.txt", "kernel.post.txt": same
> files as above, but saved *after* applying your patch. This is when the
> F26 kernel crashes.
> 
> - "win2016.post.png": screenshot of the Windows Server 2016 boot failure
> (after the patch was applied).
> 
> Thanks,
> Laszlo
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
Re: [edk2] [PATCH v5] UefiCpuPkg/CpuDxe: Fix multiple entries of RT_CODE in memory map
Posted by Zeng, Star 6 years, 5 months ago
How about the code to filter out paging capabilities from UEFI memory map in Page.c CoreGetMemoryMap(), then with maximum compatibility, the UEFI memory map could be same with before 14dde9e903bb9a719ebb8f3381da72b19509bc36 [MdeModulePkg/Core: Fix out-of-sync issue in GCD].

Thanks,
Star
-----Original Message-----
From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Wang, Jian J
Sent: Tuesday, November 14, 2017 10:36 PM
To: Laszlo Ersek <lersek@redhat.com>
Cc: Matt Fleming <matt@codeblueprint.co.uk>; edk2-devel@lists.01.org; Yao, Jiewen <jiewen.yao@intel.com>; Dong, Eric <eric.dong@intel.com>; Ard Biesheuvel <ard.biesheuvel@linaro.org>
Subject: Re: [edk2] [PATCH v5] UefiCpuPkg/CpuDxe: Fix multiple entries of RT_CODE in memory map

Hi Laszlo,

I did some investigation works on this issue. I think I found the root cause and tend to believe this is a Fedora kernel issue. Here're proves:

memmap output patterns in which Fedora 26 failed to boot:
1) BS_Data    00000000AE200000-00000000AE20EFFF 000000000000000F 000000000002600F
    RT_Data    00000000AE20F000-00000000AE38EFFF 0000000000000180 800000000002600F
    RT_Code    00000000AE38F000-00000000AE48EFFF 0000000000000100 800000000002600F
    Reserved   00000000AE48F000-00000000AE58EFFF 0000000000000100 000000000002600F

2) BS_Data    00000000AE200000-00000000AE20EFFF 000000000000000F 000000000000000F
    RT_Data    00000000AE20F000-00000000AE38EFFF 0000000000000180 800000000000000F
    RT_Code    00000000AE38F000-00000000AE467FFF 00000000000000D9 800000000000000F
    RT_Code    00000000AE468000-00000000AE469FFF 0000000000000002 800000000002600F
    RT_Code    00000000AE46A000-00000000AE46EFFF 0000000000000005 800000000000000F
    RT_Code    00000000AE46F000-00000000AE470FFF 0000000000000002 800000000002600F
    RT_Code    00000000AE471000-00000000AE475FFF 0000000000000005 800000000000000F
    RT_Code    00000000AE476000-00000000AE479FFF 0000000000000004 800000000002600F
    RT_Code    00000000AE47A000-00000000AE47FFFF 0000000000000006 800000000000000F
    RT_Code    00000000AE480000-00000000AE481FFF 0000000000000002 800000000002600F
    RT_Code    00000000AE482000-00000000AE487FFF 0000000000000006 800000000000000F
    RT_Code    00000000AE488000-00000000AE489FFF 0000000000000002 800000000002600F
    RT_Code    00000000AE48A000-00000000AE48EFFF 0000000000000005 800000000000000F
    Reserved   00000000AE48F000-00000000AE58EFFF 0000000000000100 000000000000000F

3) RT_Data    00000000AE20F000-00000000AE38EFFF 0000000000000180 800000000000000F
    RT_Code    00000000AE38F000-00000000AE418FFF 000000000000008A 800000000000000F
    RT_Code    00000000AE419000-00000000AE48EFFF 0000000000000076 800000000002600F
    Reserved   00000000AE48F000-00000000AE58EFFF 0000000000000100 000000000000000F

4) BS_Data    00000000AE200000-00000000AE20EFFF 000000000000000F 000000000002400F
    RT_Data    00000000AE20F000-00000000AE38EFFF 0000000000000180 800000000002400F
    RT_Code    00000000AE38F000-00000000AE48EFFF 0000000000000100 800000000002400F
    Reserved   00000000AE48F000-00000000AE58EFFF 0000000000000100 000000000002400F


memmap output pattern in which Fedora 26 booted successfully
5) BS_Data    00000000AE200000-00000000AE20EFFF 000000000000000F 000000000000000F
    RT_Data    00000000AE20F000-00000000AE38EFFF 0000000000000180 800000000000000F
    RT_Code    00000000AE38F000-00000000AE48EFFF 0000000000000100 800000000000000F
    Reserved   00000000AE48F000-00000000AE58EFFF 0000000000000100 000000000000000F

6) BS_Data    00000000AE200000-00000000AE20EFFF 000000000000000F 000000000000000F
    RT_Data    00000000AE20F000-00000000AE38EFFF 0000000000000180 800000000000000F
    RT_Code    00000000AE38F000-00000000AE418FFF 000000000000008A 800000000000000F
    RT_Code    00000000AE419000-00000000AE419FFF 0000000000000001 800000000000400F
    RT_Code    00000000AE41A000-00000000AE41BFFF 0000000000000002 800000000002000F
    RT_Code    00000000AE41C000-00000000AE420FFF 0000000000000005 800000000000400F
    RT_Code    00000000AE421000-00000000AE422FFF 0000000000000002 800000000002000F
    RT_Code    00000000AE423000-00000000AE427FFF 0000000000000005 800000000000400F
    RT_Code    00000000AE428000-00000000AE42AFFF 0000000000000003 800000000002000F
    RT_Code    00000000AE42B000-00000000AE42FFFF 0000000000000005 800000000000400F
    RT_Code    00000000AE430000-00000000AE432FFF 0000000000000003 800000000002000F
    RT_Code    00000000AE433000-00000000AE439FFF 0000000000000007 800000000000400F
    RT_Code    00000000AE43A000-00000000AE43DFFF 0000000000000004 800000000002000F
    RT_Code    00000000AE43E000-00000000AE444FFF 0000000000000007 800000000000400F
    RT_Code    00000000AE445000-00000000AE448FFF 0000000000000004 800000000002000F
    RT_Code    00000000AE449000-00000000AE44EFFF 0000000000000006 800000000000400F
    RT_Code    00000000AE44F000-00000000AE450FFF 0000000000000002 800000000002000F
    RT_Code    00000000AE451000-00000000AE455FFF 0000000000000005 800000000000400F
    RT_Code    00000000AE456000-00000000AE458FFF 0000000000000003 800000000002000F
    RT_Code    00000000AE459000-00000000AE45FFFF 0000000000000007 800000000000400F
    RT_Code    00000000AE460000-00000000AE461FFF 0000000000000002 800000000002000F
    RT_Code    00000000AE462000-00000000AE467FFF 0000000000000006 800000000000400F
    RT_Code    00000000AE468000-00000000AE469FFF 0000000000000002 800000000002000F
    RT_Code    00000000AE46A000-00000000AE46EFFF 0000000000000005 800000000000400F
    RT_Code    00000000AE46F000-00000000AE470FFF 0000000000000002 800000000002000F
    RT_Code    00000000AE471000-00000000AE475FFF 0000000000000005 800000000000400F
    RT_Code    00000000AE476000-00000000AE479FFF 0000000000000004 800000000002000F
    RT_Code    00000000AE47A000-00000000AE47FFFF 0000000000000006 800000000000400F
    RT_Code    00000000AE480000-00000000AE481FFF 0000000000000002 800000000002000F
    RT_Code    00000000AE482000-00000000AE487FFF 0000000000000006 800000000000400F
    RT_Code    00000000AE488000-00000000AE489FFF 0000000000000002 800000000002000F
    RT_Code    00000000AE48A000-00000000AE48EFFF 0000000000000005 800000000000400F
    Reserved   00000000AE48F000-00000000AE58EFFF 0000000000000100 000000000000000F

You may notice that 4) will fail but 6) will succeed. Taking into account the fact that failure always happened in the runtime service api (set_variable), I think it must be that the kernel will mark the memory block to be RO/XP/RP memory according to its capabilities but not its current attributes. 

This can explain why 4) will fail but 6) will succeed. Although memmap shows all runtime driver image memory as RT_Code, but they're not all code segment. Instead some of them are actually data segment.

It's normal to mark code segment to be RO and data segment to be XP. But mark data segment to be RO will cause runtime services failure. 4) shows all RT_Code to be XXX24XXX, which means Fedora kernel will mark all code segment and data segment to be RO and XP mistakenly, based on my previous hypothesis. This can also explain why 1), 2), 3) will fail the boot because XXX26XXX will let Fedora kernel to mark RT_Code memory block to be not-present.

I haven't got time to look into the Linux kernel source so I can't confirm above analysis yet.
I think you're more familiar with kernel source than us. Maybe you could help to take a look.

Thanks,
Jian

> -----Original Message-----
> From: Laszlo Ersek [mailto:lersek@redhat.com]
> Sent: Friday, November 10, 2017 8:24 PM
> To: Wang, Jian J <jian.j.wang@intel.com>
> Cc: edk2-devel@lists.01.org; Dong, Eric <eric.dong@intel.com>; Yao, 
> Jiewen <jiewen.yao@intel.com>; Ard Biesheuvel 
> <ard.biesheuvel@linaro.org>; Matt Fleming <matt@codeblueprint.co.uk>
> Subject: Re: [PATCH v5] UefiCpuPkg/CpuDxe: Fix multiple entries of 
> RT_CODE in memory map
> 
> Hi Jian,
> 
> I'm CC'ing Ard and Matt, and commenting at the bottom.
> 
> On 11/10/17 02:02, Jian J Wang wrote:
> >> v5:
> >>    Coding style clean-up
> >
> >> v4:
> >> a. Remove DoUpdate and check attributes mismatch all the time to avoid
> >>    a logic hole
> >> b. Add warning message if failed to update capability c. Add local 
> >> variable to hold new attributes to make code cleaner
> >
> >> v3:
> >> a. Add comment to explain more on updating memory capabilities b. 
> >> Fix logic hole in updating attributes c. Instead of checking 
> >> illegal memory space address and size, use return
> >>    status of gDS->SetMemorySpaceCapabilities() to skip memory block which
> >>    cannot be updated with new capabilities.
> >
> >> v2
> >> a. Fix an issue which will cause setting capability failure if size is smaller
> >>    than a page.
> >
> > More than one entry of RT_CODE memory might cause boot problem for
> some
> > old OSs. This patch will fix this issue to keep OS compatibility as 
> > much as possible.
> >
> > More detailed information, please refer to
> >     https://bugzilla.tianocore.org/show_bug.cgi?id=753
> >
> > Cc: Eric Dong <eric.dong@intel.com>
> > Cc: Jiewen Yao <jiewen.yao@intel.com>
> > Cc: Laszlo Ersek <lersek@redhat.com>
> > Contributed-under: TianoCore Contribution Agreement 1.1
> > Signed-off-by: Jian J Wang <jian.j.wang@intel.com>
> > ---
> >  UefiCpuPkg/CpuDxe/CpuPageTable.c | 69
> +++++++++++++++++++++++++++++-----------
> >  1 file changed, 50 insertions(+), 19 deletions(-)
> >
> > diff --git a/UefiCpuPkg/CpuDxe/CpuPageTable.c
> b/UefiCpuPkg/CpuDxe/CpuPageTable.c
> > index d312eb66f8..61537838b7 100644
> > --- a/UefiCpuPkg/CpuDxe/CpuPageTable.c
> > +++ b/UefiCpuPkg/CpuDxe/CpuPageTable.c
> > @@ -789,8 +789,7 @@ RefreshGcdMemoryAttributesFromPaging (
> >    UINT64                              BaseAddress;
> >    UINT64                              PageStartAddress;
> >    UINT64                              Attributes;
> > -  UINT64                              Capabilities;
> > -  BOOLEAN                             DoUpdate;
> > +  UINT64                              NewAttributes;
> >    UINTN                               Index;
> >
> >    //
> > @@ -802,9 +801,8 @@ RefreshGcdMemoryAttributesFromPaging (
> >
> >    GetCurrentPagingContext (&PagingContext);
> >
> > -  DoUpdate      = FALSE;
> > -  Capabilities  = 0;
> >    Attributes    = 0;
> > +  NewAttributes = 0;
> >    BaseAddress   = 0;
> >    PageLength    = 0;
> >
> > @@ -813,6 +811,34 @@ RefreshGcdMemoryAttributesFromPaging (
> >        continue;
> >      }
> >
> > +    //
> > +    // Sync the actual paging related capabilities back to GCD service first.
> > +    // As a side effect (good one), this can also help to avoid unnecessary
> > +    // memory map entries due to the different capabilities of the same type
> > +    // memory, such as multiple RT_CODE and RT_DATA entries in 
> > + memory
> map,
> > +    // which could cause boot failure of some old Linux distro (before v4.3).
> > +    //
> > +    Status = gDS->SetMemorySpaceCapabilities (
> > +                    MemorySpaceMap[Index].BaseAddress,
> > +                    MemorySpaceMap[Index].Length,
> > +                    MemorySpaceMap[Index].Capabilities |
> > +                    EFI_MEMORY_PAGETYPE_MASK
> > +                    );
> > +    if (EFI_ERROR (Status)) {
> > +      //
> > +      // If we cannot udpate the capabilities, we cannot update its
> > +      // attributes either. So just simply skip current block of memory.
> > +      //
> > +      DEBUG ((
> > +        DEBUG_WARN,
> > +        "Failed to update capability: [%lu] %016lx - %016lx (%016lx 
> > + -
> > %016lx)\r\n",
> > +        (UINT64)Index, BaseAddress, BaseAddress + Length - 1,
> > +        MemorySpaceMap[Index].Capabilities,
> > +        MemorySpaceMap[Index].Capabilities | EFI_MEMORY_PAGETYPE_MASK
> > +        ));
> > +      continue;
> > +    }
> > +
> >      if (MemorySpaceMap[Index].BaseAddress >= (BaseAddress + PageLength)) {
> >        //
> >        // Current memory space starts at a new page. Resetting 
> > PageLength will @@ -826,7 +852,9 @@ RefreshGcdMemoryAttributesFromPaging (
> >        PageLength -= (MemorySpaceMap[Index].BaseAddress - BaseAddress);
> >      }
> >
> > -    // Sync real page attributes to GCD
> > +    //
> > +    // Sync actual page attributes to GCD
> > +    //
> >      BaseAddress       = MemorySpaceMap[Index].BaseAddress;
> >      MemorySpaceLength = MemorySpaceMap[Index].Length;
> >      while (MemorySpaceLength > 0) { @@ -842,23 +870,26 @@ 
> > RefreshGcdMemoryAttributesFromPaging (
> >          PageStartAddress  = (*PageEntry) &
> (UINT64)PageAttributeToMask(PageAttribute);
> >          PageLength        = PageAttributeToLength (PageAttribute) - (BaseAddress -
> PageStartAddress);
> >          Attributes        = GetAttributesFromPageEntry (PageEntry);
> > -
> > -        if (Attributes != (MemorySpaceMap[Index].Attributes &
> EFI_MEMORY_PAGETYPE_MASK)) {
> > -          DoUpdate = TRUE;
> > -          Attributes |= (MemorySpaceMap[Index].Attributes &
> ~EFI_MEMORY_PAGETYPE_MASK);
> > -          Capabilities = Attributes | MemorySpaceMap[Index].Capabilities;
> > -        } else {
> > -          DoUpdate = FALSE;
> > -        }
> >        }
> >
> >        Length = MIN (PageLength, MemorySpaceLength);
> > -      if (DoUpdate) {
> > -        gDS->SetMemorySpaceCapabilities (BaseAddress, Length, Capabilities);
> > -        gDS->SetMemorySpaceAttributes (BaseAddress, Length, Attributes);
> > -        DEBUG ((DEBUG_INFO, "Update memory space attribute: [%02d] %016lx
> - %016lx (%08lx -> %08lx)\r\n",
> > -                             Index, BaseAddress, BaseAddress + Length - 1,
> > -                             MemorySpaceMap[Index].Attributes, Attributes));
> > +      if (Attributes != (MemorySpaceMap[Index].Attributes &
> > +                         EFI_MEMORY_PAGETYPE_MASK)) {
> > +        NewAttributes = (MemorySpaceMap[Index].Attributes &
> > +                         ~EFI_MEMORY_PAGETYPE_MASK) | Attributes;
> > +        Status = gDS->SetMemorySpaceAttributes (
> > +                        BaseAddress,
> > +                        Length,
> > +                        NewAttributes
> > +                        );
> > +        ASSERT_EFI_ERROR (Status);
> > +        DEBUG ((
> > +          DEBUG_INFO,
> > +          "Updated memory space attribute: [%lu] %016lx - %016lx 
> > + (%016lx -
> > %016lx)\r\n",
> > +          (UINT64)Index, BaseAddress, BaseAddress + Length - 1,
> > +          MemorySpaceMap[Index].Attributes,
> > +          NewAttributes
> > +          ));
> >        }
> >
> >        PageLength        -= Length;
> >
> 
> So, I was ready to give my R-b for this patch, but then I also wanted 
> to test it. I applied the patch on current edk2 master (7e2a8dfe8a9a,
> "ArmPlatformPkg/PrePeiCore: seed temporary stack before entering PEI 
> core", 2017-10-20), and built OVMF like this:
> 
> $ build \
>   -a IA32 \
>   -a X64 \
>   -p OvmfPkg/OvmfPkgIa32X64.dsc \
>   -t GCC48 \
>   -b NOOPT \
>   -D SMM_REQUIRE \
>   -D SECURE_BOOT_ENABLE \
>   -D E1000_ENABLE \
>   -D HTTP_BOOT_ENABLE
> 
> For testing I used a recent-ish upstream QEMU development build 
> (ae49fbbcd8e4, "Merge remote-tracking branch 
> 'remotes/rth/tags/pull-tcg-20171025' into staging", 2017-10-25), with 
> the Q35 machine type (which is required by SMM anyway).
> 
> The results vary across guest OSes:
> 
> (1) Up-to-date Fedora 26 guest crashes during boot, with the following 
> call stack:
> 
> BUG: unable to handle kernel paging request at fffffffefe893018 Call 
> Trace:
>  ? __change_page_attr_set_clr+0xaa6/0xd70
>  ? kernel_map_pages_in_pgd+0xbc/0xd0
>  ? efi_call+0x58/0x90
>  ? virt_efi_set_variable.part.7+0x66/0x120
>  ? virt_efi_set_variable+0x4f/0x60
>  ? efi_delete_dummy_variable+0x62/0x90
>  ? efi_enter_virtual_mode+0x4d4/0x4e8
>  ? efi_enter_virtual_mode+0x4d4/0x4e8
>  ? start_kernel+0x442/0x4e6
>  ? early_idt_handler_array+0x120/0x120
>  ? x86_64_start_reservations+0x24/0x26
>  ? x86_64_start_kernel+0x13e/0x161
>  ? secondary_startup_64+0x9f/0x9f
> 
> (2) The following Windows OSes all boot successfully:
> 
> - Windows 7
> - Windows Server 2008 R2
> - Windows 8.1
> - Windows Server 2012 R2
> - Windows 10
> 
> (3) Windows Server 2016 crashes with a BSOD; reporting "ATTEMPTED 
> WRITE TO READONLY MEMORY".
> 
> (Without the patch, all OSes boot OK.)
> 
> 
> I'm attaching a ZIP file with the following contents (note that I'll 
> attach the same file to TianoCore BZ#753 as well, because the mailing 
> list archive(s) don't seem to preserve attachments):
> 
> - "ovmf.pre.txt", "shell.memmap.pre.txt", "kernel.pre.txt": OVMF log, 
> MEMMAP command output in the UEFI shell, and Fedora 26 kernel boot log
> (successful) *before* applying your patch. The kernel log is detailed 
> (the cmdline had "ignore_loglevel" and "efi=debug").
> 
> - "ovmf.post.txt", "shell.memmap.post.txt", "kernel.post.txt": same 
> files as above, but saved *after* applying your patch. This is when 
> the
> F26 kernel crashes.
> 
> - "win2016.post.png": screenshot of the Windows Server 2016 boot 
> failure (after the patch was applied).
> 
> Thanks,
> Laszlo
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel
Re: [edk2] [PATCH v5] UefiCpuPkg/CpuDxe: Fix multiple entries of RT_CODE in memory map
Posted by Wang, Jian J 6 years, 5 months ago
Since OS never had chance to use the those capabilities before, I think it's feasible.
But it's just a workaround not solution because there's real gap between UEFI and
OS on how to interpret the memory capabilities.

> -----Original Message-----
> From: Zeng, Star
> Sent: Wednesday, November 15, 2017 2:53 PM
> To: Wang, Jian J <jian.j.wang@intel.com>; Laszlo Ersek <lersek@redhat.com>
> Cc: Matt Fleming <matt@codeblueprint.co.uk>; edk2-devel@lists.01.org; Yao,
> Jiewen <jiewen.yao@intel.com>; Dong, Eric <eric.dong@intel.com>; Ard
> Biesheuvel <ard.biesheuvel@linaro.org>; Zeng, Star <star.zeng@intel.com>
> Subject: RE: [PATCH v5] UefiCpuPkg/CpuDxe: Fix multiple entries of RT_CODE in
> memory map
> 
> How about the code to filter out paging capabilities from UEFI memory map in
> Page.c CoreGetMemoryMap(), then with maximum compatibility, the UEFI
> memory map could be same with before
> 14dde9e903bb9a719ebb8f3381da72b19509bc36 [MdeModulePkg/Core: Fix out-
> of-sync issue in GCD].
> 
> Thanks,
> Star
> -----Original Message-----
> From: edk2-devel [mailto:edk2-devel-bounces@lists.01.org] On Behalf Of Wang,
> Jian J
> Sent: Tuesday, November 14, 2017 10:36 PM
> To: Laszlo Ersek <lersek@redhat.com>
> Cc: Matt Fleming <matt@codeblueprint.co.uk>; edk2-devel@lists.01.org; Yao,
> Jiewen <jiewen.yao@intel.com>; Dong, Eric <eric.dong@intel.com>; Ard
> Biesheuvel <ard.biesheuvel@linaro.org>
> Subject: Re: [edk2] [PATCH v5] UefiCpuPkg/CpuDxe: Fix multiple entries of
> RT_CODE in memory map
> 
> Hi Laszlo,
> 
> I did some investigation works on this issue. I think I found the root cause and
> tend to believe this is a Fedora kernel issue. Here're proves:
> 
> memmap output patterns in which Fedora 26 failed to boot:
> 1) BS_Data    00000000AE200000-00000000AE20EFFF 000000000000000F
> 000000000002600F
>     RT_Data    00000000AE20F000-00000000AE38EFFF 0000000000000180
> 800000000002600F
>     RT_Code    00000000AE38F000-00000000AE48EFFF 0000000000000100
> 800000000002600F
>     Reserved   00000000AE48F000-00000000AE58EFFF 0000000000000100
> 000000000002600F
> 
> 2) BS_Data    00000000AE200000-00000000AE20EFFF 000000000000000F
> 000000000000000F
>     RT_Data    00000000AE20F000-00000000AE38EFFF 0000000000000180
> 800000000000000F
>     RT_Code    00000000AE38F000-00000000AE467FFF 00000000000000D9
> 800000000000000F
>     RT_Code    00000000AE468000-00000000AE469FFF 0000000000000002
> 800000000002600F
>     RT_Code    00000000AE46A000-00000000AE46EFFF 0000000000000005
> 800000000000000F
>     RT_Code    00000000AE46F000-00000000AE470FFF 0000000000000002
> 800000000002600F
>     RT_Code    00000000AE471000-00000000AE475FFF 0000000000000005
> 800000000000000F
>     RT_Code    00000000AE476000-00000000AE479FFF 0000000000000004
> 800000000002600F
>     RT_Code    00000000AE47A000-00000000AE47FFFF 0000000000000006
> 800000000000000F
>     RT_Code    00000000AE480000-00000000AE481FFF 0000000000000002
> 800000000002600F
>     RT_Code    00000000AE482000-00000000AE487FFF 0000000000000006
> 800000000000000F
>     RT_Code    00000000AE488000-00000000AE489FFF 0000000000000002
> 800000000002600F
>     RT_Code    00000000AE48A000-00000000AE48EFFF 0000000000000005
> 800000000000000F
>     Reserved   00000000AE48F000-00000000AE58EFFF 0000000000000100
> 000000000000000F
> 
> 3) RT_Data    00000000AE20F000-00000000AE38EFFF 0000000000000180
> 800000000000000F
>     RT_Code    00000000AE38F000-00000000AE418FFF 000000000000008A
> 800000000000000F
>     RT_Code    00000000AE419000-00000000AE48EFFF 0000000000000076
> 800000000002600F
>     Reserved   00000000AE48F000-00000000AE58EFFF 0000000000000100
> 000000000000000F
> 
> 4) BS_Data    00000000AE200000-00000000AE20EFFF 000000000000000F
> 000000000002400F
>     RT_Data    00000000AE20F000-00000000AE38EFFF 0000000000000180
> 800000000002400F
>     RT_Code    00000000AE38F000-00000000AE48EFFF 0000000000000100
> 800000000002400F
>     Reserved   00000000AE48F000-00000000AE58EFFF 0000000000000100
> 000000000002400F
> 
> 
> memmap output pattern in which Fedora 26 booted successfully
> 5) BS_Data    00000000AE200000-00000000AE20EFFF 000000000000000F
> 000000000000000F
>     RT_Data    00000000AE20F000-00000000AE38EFFF 0000000000000180
> 800000000000000F
>     RT_Code    00000000AE38F000-00000000AE48EFFF 0000000000000100
> 800000000000000F
>     Reserved   00000000AE48F000-00000000AE58EFFF 0000000000000100
> 000000000000000F
> 
> 6) BS_Data    00000000AE200000-00000000AE20EFFF 000000000000000F
> 000000000000000F
>     RT_Data    00000000AE20F000-00000000AE38EFFF 0000000000000180
> 800000000000000F
>     RT_Code    00000000AE38F000-00000000AE418FFF 000000000000008A
> 800000000000000F
>     RT_Code    00000000AE419000-00000000AE419FFF 0000000000000001
> 800000000000400F
>     RT_Code    00000000AE41A000-00000000AE41BFFF 0000000000000002
> 800000000002000F
>     RT_Code    00000000AE41C000-00000000AE420FFF 0000000000000005
> 800000000000400F
>     RT_Code    00000000AE421000-00000000AE422FFF 0000000000000002
> 800000000002000F
>     RT_Code    00000000AE423000-00000000AE427FFF 0000000000000005
> 800000000000400F
>     RT_Code    00000000AE428000-00000000AE42AFFF 0000000000000003
> 800000000002000F
>     RT_Code    00000000AE42B000-00000000AE42FFFF 0000000000000005
> 800000000000400F
>     RT_Code    00000000AE430000-00000000AE432FFF 0000000000000003
> 800000000002000F
>     RT_Code    00000000AE433000-00000000AE439FFF 0000000000000007
> 800000000000400F
>     RT_Code    00000000AE43A000-00000000AE43DFFF 0000000000000004
> 800000000002000F
>     RT_Code    00000000AE43E000-00000000AE444FFF 0000000000000007
> 800000000000400F
>     RT_Code    00000000AE445000-00000000AE448FFF 0000000000000004
> 800000000002000F
>     RT_Code    00000000AE449000-00000000AE44EFFF 0000000000000006
> 800000000000400F
>     RT_Code    00000000AE44F000-00000000AE450FFF 0000000000000002
> 800000000002000F
>     RT_Code    00000000AE451000-00000000AE455FFF 0000000000000005
> 800000000000400F
>     RT_Code    00000000AE456000-00000000AE458FFF 0000000000000003
> 800000000002000F
>     RT_Code    00000000AE459000-00000000AE45FFFF 0000000000000007
> 800000000000400F
>     RT_Code    00000000AE460000-00000000AE461FFF 0000000000000002
> 800000000002000F
>     RT_Code    00000000AE462000-00000000AE467FFF 0000000000000006
> 800000000000400F
>     RT_Code    00000000AE468000-00000000AE469FFF 0000000000000002
> 800000000002000F
>     RT_Code    00000000AE46A000-00000000AE46EFFF 0000000000000005
> 800000000000400F
>     RT_Code    00000000AE46F000-00000000AE470FFF 0000000000000002
> 800000000002000F
>     RT_Code    00000000AE471000-00000000AE475FFF 0000000000000005
> 800000000000400F
>     RT_Code    00000000AE476000-00000000AE479FFF 0000000000000004
> 800000000002000F
>     RT_Code    00000000AE47A000-00000000AE47FFFF 0000000000000006
> 800000000000400F
>     RT_Code    00000000AE480000-00000000AE481FFF 0000000000000002
> 800000000002000F
>     RT_Code    00000000AE482000-00000000AE487FFF 0000000000000006
> 800000000000400F
>     RT_Code    00000000AE488000-00000000AE489FFF 0000000000000002
> 800000000002000F
>     RT_Code    00000000AE48A000-00000000AE48EFFF 0000000000000005
> 800000000000400F
>     Reserved   00000000AE48F000-00000000AE58EFFF 0000000000000100
> 000000000000000F
> 
> You may notice that 4) will fail but 6) will succeed. Taking into account the fact
> that failure always happened in the runtime service api (set_variable), I think it
> must be that the kernel will mark the memory block to be RO/XP/RP memory
> according to its capabilities but not its current attributes.
> 
> This can explain why 4) will fail but 6) will succeed. Although memmap shows all
> runtime driver image memory as RT_Code, but they're not all code segment.
> Instead some of them are actually data segment.
> 
> It's normal to mark code segment to be RO and data segment to be XP. But mark
> data segment to be RO will cause runtime services failure. 4) shows all RT_Code
> to be XXX24XXX, which means Fedora kernel will mark all code segment and data
> segment to be RO and XP mistakenly, based on my previous hypothesis. This can
> also explain why 1), 2), 3) will fail the boot because XXX26XXX will let Fedora
> kernel to mark RT_Code memory block to be not-present.
> 
> I haven't got time to look into the Linux kernel source so I can't confirm above
> analysis yet.
> I think you're more familiar with kernel source than us. Maybe you could help to
> take a look.
> 
> Thanks,
> Jian
> 
> > -----Original Message-----
> > From: Laszlo Ersek [mailto:lersek@redhat.com]
> > Sent: Friday, November 10, 2017 8:24 PM
> > To: Wang, Jian J <jian.j.wang@intel.com>
> > Cc: edk2-devel@lists.01.org; Dong, Eric <eric.dong@intel.com>; Yao,
> > Jiewen <jiewen.yao@intel.com>; Ard Biesheuvel
> > <ard.biesheuvel@linaro.org>; Matt Fleming <matt@codeblueprint.co.uk>
> > Subject: Re: [PATCH v5] UefiCpuPkg/CpuDxe: Fix multiple entries of
> > RT_CODE in memory map
> >
> > Hi Jian,
> >
> > I'm CC'ing Ard and Matt, and commenting at the bottom.
> >
> > On 11/10/17 02:02, Jian J Wang wrote:
> > >> v5:
> > >>    Coding style clean-up
> > >
> > >> v4:
> > >> a. Remove DoUpdate and check attributes mismatch all the time to avoid
> > >>    a logic hole
> > >> b. Add warning message if failed to update capability c. Add local
> > >> variable to hold new attributes to make code cleaner
> > >
> > >> v3:
> > >> a. Add comment to explain more on updating memory capabilities b.
> > >> Fix logic hole in updating attributes c. Instead of checking
> > >> illegal memory space address and size, use return
> > >>    status of gDS->SetMemorySpaceCapabilities() to skip memory block which
> > >>    cannot be updated with new capabilities.
> > >
> > >> v2
> > >> a. Fix an issue which will cause setting capability failure if size is smaller
> > >>    than a page.
> > >
> > > More than one entry of RT_CODE memory might cause boot problem for
> > some
> > > old OSs. This patch will fix this issue to keep OS compatibility as
> > > much as possible.
> > >
> > > More detailed information, please refer to
> > >     https://bugzilla.tianocore.org/show_bug.cgi?id=753
> > >
> > > Cc: Eric Dong <eric.dong@intel.com>
> > > Cc: Jiewen Yao <jiewen.yao@intel.com>
> > > Cc: Laszlo Ersek <lersek@redhat.com>
> > > Contributed-under: TianoCore Contribution Agreement 1.1
> > > Signed-off-by: Jian J Wang <jian.j.wang@intel.com>
> > > ---
> > >  UefiCpuPkg/CpuDxe/CpuPageTable.c | 69
> > +++++++++++++++++++++++++++++-----------
> > >  1 file changed, 50 insertions(+), 19 deletions(-)
> > >
> > > diff --git a/UefiCpuPkg/CpuDxe/CpuPageTable.c
> > b/UefiCpuPkg/CpuDxe/CpuPageTable.c
> > > index d312eb66f8..61537838b7 100644
> > > --- a/UefiCpuPkg/CpuDxe/CpuPageTable.c
> > > +++ b/UefiCpuPkg/CpuDxe/CpuPageTable.c
> > > @@ -789,8 +789,7 @@ RefreshGcdMemoryAttributesFromPaging (
> > >    UINT64                              BaseAddress;
> > >    UINT64                              PageStartAddress;
> > >    UINT64                              Attributes;
> > > -  UINT64                              Capabilities;
> > > -  BOOLEAN                             DoUpdate;
> > > +  UINT64                              NewAttributes;
> > >    UINTN                               Index;
> > >
> > >    //
> > > @@ -802,9 +801,8 @@ RefreshGcdMemoryAttributesFromPaging (
> > >
> > >    GetCurrentPagingContext (&PagingContext);
> > >
> > > -  DoUpdate      = FALSE;
> > > -  Capabilities  = 0;
> > >    Attributes    = 0;
> > > +  NewAttributes = 0;
> > >    BaseAddress   = 0;
> > >    PageLength    = 0;
> > >
> > > @@ -813,6 +811,34 @@ RefreshGcdMemoryAttributesFromPaging (
> > >        continue;
> > >      }
> > >
> > > +    //
> > > +    // Sync the actual paging related capabilities back to GCD service first.
> > > +    // As a side effect (good one), this can also help to avoid unnecessary
> > > +    // memory map entries due to the different capabilities of the same type
> > > +    // memory, such as multiple RT_CODE and RT_DATA entries in
> > > + memory
> > map,
> > > +    // which could cause boot failure of some old Linux distro (before v4.3).
> > > +    //
> > > +    Status = gDS->SetMemorySpaceCapabilities (
> > > +                    MemorySpaceMap[Index].BaseAddress,
> > > +                    MemorySpaceMap[Index].Length,
> > > +                    MemorySpaceMap[Index].Capabilities |
> > > +                    EFI_MEMORY_PAGETYPE_MASK
> > > +                    );
> > > +    if (EFI_ERROR (Status)) {
> > > +      //
> > > +      // If we cannot udpate the capabilities, we cannot update its
> > > +      // attributes either. So just simply skip current block of memory.
> > > +      //
> > > +      DEBUG ((
> > > +        DEBUG_WARN,
> > > +        "Failed to update capability: [%lu] %016lx - %016lx (%016lx
> > > + -
> > > %016lx)\r\n",
> > > +        (UINT64)Index, BaseAddress, BaseAddress + Length - 1,
> > > +        MemorySpaceMap[Index].Capabilities,
> > > +        MemorySpaceMap[Index].Capabilities |
> EFI_MEMORY_PAGETYPE_MASK
> > > +        ));
> > > +      continue;
> > > +    }
> > > +
> > >      if (MemorySpaceMap[Index].BaseAddress >= (BaseAddress + PageLength))
> {
> > >        //
> > >        // Current memory space starts at a new page. Resetting
> > > PageLength will @@ -826,7 +852,9 @@
> RefreshGcdMemoryAttributesFromPaging (
> > >        PageLength -= (MemorySpaceMap[Index].BaseAddress - BaseAddress);
> > >      }
> > >
> > > -    // Sync real page attributes to GCD
> > > +    //
> > > +    // Sync actual page attributes to GCD
> > > +    //
> > >      BaseAddress       = MemorySpaceMap[Index].BaseAddress;
> > >      MemorySpaceLength = MemorySpaceMap[Index].Length;
> > >      while (MemorySpaceLength > 0) { @@ -842,23 +870,26 @@
> > > RefreshGcdMemoryAttributesFromPaging (
> > >          PageStartAddress  = (*PageEntry) &
> > (UINT64)PageAttributeToMask(PageAttribute);
> > >          PageLength        = PageAttributeToLength (PageAttribute) -
> (BaseAddress -
> > PageStartAddress);
> > >          Attributes        = GetAttributesFromPageEntry (PageEntry);
> > > -
> > > -        if (Attributes != (MemorySpaceMap[Index].Attributes &
> > EFI_MEMORY_PAGETYPE_MASK)) {
> > > -          DoUpdate = TRUE;
> > > -          Attributes |= (MemorySpaceMap[Index].Attributes &
> > ~EFI_MEMORY_PAGETYPE_MASK);
> > > -          Capabilities = Attributes | MemorySpaceMap[Index].Capabilities;
> > > -        } else {
> > > -          DoUpdate = FALSE;
> > > -        }
> > >        }
> > >
> > >        Length = MIN (PageLength, MemorySpaceLength);
> > > -      if (DoUpdate) {
> > > -        gDS->SetMemorySpaceCapabilities (BaseAddress, Length, Capabilities);
> > > -        gDS->SetMemorySpaceAttributes (BaseAddress, Length, Attributes);
> > > -        DEBUG ((DEBUG_INFO, "Update memory space attribute:
> [%02d] %016lx
> > - %016lx (%08lx -> %08lx)\r\n",
> > > -                             Index, BaseAddress, BaseAddress + Length - 1,
> > > -                             MemorySpaceMap[Index].Attributes, Attributes));
> > > +      if (Attributes != (MemorySpaceMap[Index].Attributes &
> > > +                         EFI_MEMORY_PAGETYPE_MASK)) {
> > > +        NewAttributes = (MemorySpaceMap[Index].Attributes &
> > > +                         ~EFI_MEMORY_PAGETYPE_MASK) | Attributes;
> > > +        Status = gDS->SetMemorySpaceAttributes (
> > > +                        BaseAddress,
> > > +                        Length,
> > > +                        NewAttributes
> > > +                        );
> > > +        ASSERT_EFI_ERROR (Status);
> > > +        DEBUG ((
> > > +          DEBUG_INFO,
> > > +          "Updated memory space attribute: [%lu] %016lx - %016lx
> > > + (%016lx -
> > > %016lx)\r\n",
> > > +          (UINT64)Index, BaseAddress, BaseAddress + Length - 1,
> > > +          MemorySpaceMap[Index].Attributes,
> > > +          NewAttributes
> > > +          ));
> > >        }
> > >
> > >        PageLength        -= Length;
> > >
> >
> > So, I was ready to give my R-b for this patch, but then I also wanted
> > to test it. I applied the patch on current edk2 master (7e2a8dfe8a9a,
> > "ArmPlatformPkg/PrePeiCore: seed temporary stack before entering PEI
> > core", 2017-10-20), and built OVMF like this:
> >
> > $ build \
> >   -a IA32 \
> >   -a X64 \
> >   -p OvmfPkg/OvmfPkgIa32X64.dsc \
> >   -t GCC48 \
> >   -b NOOPT \
> >   -D SMM_REQUIRE \
> >   -D SECURE_BOOT_ENABLE \
> >   -D E1000_ENABLE \
> >   -D HTTP_BOOT_ENABLE
> >
> > For testing I used a recent-ish upstream QEMU development build
> > (ae49fbbcd8e4, "Merge remote-tracking branch
> > 'remotes/rth/tags/pull-tcg-20171025' into staging", 2017-10-25), with
> > the Q35 machine type (which is required by SMM anyway).
> >
> > The results vary across guest OSes:
> >
> > (1) Up-to-date Fedora 26 guest crashes during boot, with the following
> > call stack:
> >
> > BUG: unable to handle kernel paging request at fffffffefe893018 Call
> > Trace:
> >  ? __change_page_attr_set_clr+0xaa6/0xd70
> >  ? kernel_map_pages_in_pgd+0xbc/0xd0
> >  ? efi_call+0x58/0x90
> >  ? virt_efi_set_variable.part.7+0x66/0x120
> >  ? virt_efi_set_variable+0x4f/0x60
> >  ? efi_delete_dummy_variable+0x62/0x90
> >  ? efi_enter_virtual_mode+0x4d4/0x4e8
> >  ? efi_enter_virtual_mode+0x4d4/0x4e8
> >  ? start_kernel+0x442/0x4e6
> >  ? early_idt_handler_array+0x120/0x120
> >  ? x86_64_start_reservations+0x24/0x26
> >  ? x86_64_start_kernel+0x13e/0x161
> >  ? secondary_startup_64+0x9f/0x9f
> >
> > (2) The following Windows OSes all boot successfully:
> >
> > - Windows 7
> > - Windows Server 2008 R2
> > - Windows 8.1
> > - Windows Server 2012 R2
> > - Windows 10
> >
> > (3) Windows Server 2016 crashes with a BSOD; reporting "ATTEMPTED
> > WRITE TO READONLY MEMORY".
> >
> > (Without the patch, all OSes boot OK.)
> >
> >
> > I'm attaching a ZIP file with the following contents (note that I'll
> > attach the same file to TianoCore BZ#753 as well, because the mailing
> > list archive(s) don't seem to preserve attachments):
> >
> > - "ovmf.pre.txt", "shell.memmap.pre.txt", "kernel.pre.txt": OVMF log,
> > MEMMAP command output in the UEFI shell, and Fedora 26 kernel boot log
> > (successful) *before* applying your patch. The kernel log is detailed
> > (the cmdline had "ignore_loglevel" and "efi=debug").
> >
> > - "ovmf.post.txt", "shell.memmap.post.txt", "kernel.post.txt": same
> > files as above, but saved *after* applying your patch. This is when
> > the
> > F26 kernel crashes.
> >
> > - "win2016.post.png": screenshot of the Windows Server 2016 boot
> > failure (after the patch was applied).
> >
> > Thanks,
> > Laszlo
> _______________________________________________
> edk2-devel mailing list
> edk2-devel@lists.01.org
> https://lists.01.org/mailman/listinfo/edk2-devel
_______________________________________________
edk2-devel mailing list
edk2-devel@lists.01.org
https://lists.01.org/mailman/listinfo/edk2-devel