[Patch v3] x86/head/64: remove redundant check on level2_kernel_pgt's _PAGE_PRESENT bit

Wei Yang posted 1 patch 1 year, 8 months ago
arch/x86/kernel/head64.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
[Patch v3] x86/head/64: remove redundant check on level2_kernel_pgt's _PAGE_PRESENT bit
Posted by Wei Yang 1 year, 8 months ago
Remove a redundant check on kernel code's PMD _PAGE_PRESENT attribute
before fix up.

Current process looks like this:

    pmd in [0, _text)
        unset _PAGE_PRESENT
    pmd in [_text, _end]
        if (_PAGE_PRESENT)
            fix up delta
    pmd in (_end, 512)
        unset _PAGE_PRESENT

level2_kernel_pgt is compiled with _PAGE_PRESENT set, so the check is
redundant.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
CC: Ingo Molnar <mingo@kernel.org>
CC: Steve Wahl <steve.wahl@hpe.com>

---
v3: refine the change log per kirill's comment
v2: adjust the change log to emphasize the redundant check
---
 arch/x86/kernel/head64.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index a817ed0724d1..bac33ec19aa2 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -260,8 +260,7 @@ unsigned long __head __startup_64(unsigned long physaddr,
 
 	/* fixup pages that are part of the kernel image */
 	for (; i <= pmd_index((unsigned long)_end); i++)
-		if (pmd[i] & _PAGE_PRESENT)
-			pmd[i] += load_delta;
+		pmd[i] += load_delta;
 
 	/* invalidate pages after the kernel image */
 	for (; i < PTRS_PER_PMD; i++)
-- 
2.34.1
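
A minimal user-space sketch of the three-phase loop from the changelog, to show why the guarded and unguarded middle loops give the same result when every entry starts out present. This is not the kernel code: `PAGE_PRESENT`, `fill_table()`, `fixup()` and `fixup_variants_agree()` are hypothetical names, the flag value 0x80 and the index range are illustrative, and the `hole` parameter exists only to model a table that something has modified beforehand.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PTRS_PER_PMD 512
#define PMD_SHIFT    21       /* assumes 2 MiB per PMD entry */
#define PAGE_PRESENT 0x1ULL   /* stands in for _PAGE_PRESENT */

/* Model a table whose entries are all populated and present. */
static void fill_table(uint64_t *pmd)
{
	for (int i = 0; i < PTRS_PER_PMD; i++)
		pmd[i] = ((uint64_t)i << PMD_SHIFT) | 0x80ULL | PAGE_PRESENT;
}

/* Three-phase fixup; 'check' selects the guarded (old) middle loop. */
static void fixup(uint64_t *pmd, int first, int last,
		  uint64_t load_delta, int check)
{
	int i;

	assert(first >= 0 && first <= last && last < PTRS_PER_PMD);

	/* entries before the image: clear the present bit */
	for (i = 0; i < first; i++)
		pmd[i] &= ~PAGE_PRESENT;

	/* entries covering the image: apply the load offset */
	for (; i <= last; i++)
		if (!check || (pmd[i] & PAGE_PRESENT))
			pmd[i] += load_delta;

	/* entries after the image: clear the present bit */
	for (; i < PTRS_PER_PMD; i++)
		pmd[i] &= ~PAGE_PRESENT;
}

/*
 * Run both variants on identical tables and compare the results.
 * A non-negative 'hole' zeroes that entry first, to model a
 * non-present entry; pass -1 for a fully populated table.
 */
static int fixup_variants_agree(int hole, int first, int last,
				uint64_t load_delta)
{
	uint64_t a[PTRS_PER_PMD], b[PTRS_PER_PMD];

	assert(hole < PTRS_PER_PMD);
	fill_table(a);
	if (hole >= 0)
		a[hole] = 0;
	memcpy(b, a, sizeof(a));

	fixup(a, first, last, load_delta, 1);
	fixup(b, first, last, load_delta, 0);
	return memcmp(a, b, sizeof(a)) == 0;
}
```

With a fully populated table, fixup_variants_agree(-1, 8, 40, 0x200000) returns 1. With a hole inside the image range, for example fixup_variants_agree(20, 8, 40, 0x200000), it returns 0, so the equivalence rests on the table being untouched before the loop runs.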
Re: [Patch v3] x86/head/64: remove redundant check on level2_kernel_pgt's _PAGE_PRESENT bit
Posted by Wei Yang 1 year, 7 months ago
May I ask what else I should do?

On Thu, May 23, 2024 at 12:35:39PM +0000, Wei Yang wrote:
>Remove a redundant check on kernel code's PMD _PAGE_PRESENT attribute
>before fix up.
>
>Current process looks like this:
>
>    pmd in [0, _text)
>        unset _PAGE_PRESENT
>    pmd in [_text, _end]
>        if (_PAGE_PRESENT)
>            fix up delta
>    pmd in (_end, 512)
>        unset _PAGE_PRESENT
>
>level2_kernel_pgt compiled with _PAGE_PRESENT set. The check is
>redundant
>
>Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>CC: Thomas Gleixner <tglx@linutronix.de>
>CC: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>CC: Ingo Molnar <mingo@kernel.org>
>CC: Steve Wahl <steve.wahl@hpe.com>
>
>---
>v3: refine the change log per kirill's comment
>v2: adjust the change log to emphasize the redundant check
>---
> arch/x86/kernel/head64.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
>diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
>index a817ed0724d1..bac33ec19aa2 100644
>--- a/arch/x86/kernel/head64.c
>+++ b/arch/x86/kernel/head64.c
>@@ -260,8 +260,7 @@ unsigned long __head __startup_64(unsigned long physaddr,
> 
> 	/* fixup pages that are part of the kernel image */
> 	for (; i <= pmd_index((unsigned long)_end); i++)
>-		if (pmd[i] & _PAGE_PRESENT)
>-			pmd[i] += load_delta;
>+		pmd[i] += load_delta;
> 
> 	/* invalidate pages after the kernel image */
> 	for (; i < PTRS_PER_PMD; i++)
>-- 
>2.34.1

-- 
Wei Yang
Help you, Help me
Re: [Patch v3] x86/head/64: remove redundant check on level2_kernel_pgt's _PAGE_PRESENT bit
Posted by Dave Hansen 1 year, 8 months ago
On 5/23/24 05:35, Wei Yang wrote:
> --- a/arch/x86/kernel/head64.c
> +++ b/arch/x86/kernel/head64.c
> @@ -260,8 +260,7 @@ unsigned long __head __startup_64(unsigned long physaddr,
>  
>  	/* fixup pages that are part of the kernel image */
>  	for (; i <= pmd_index((unsigned long)_end); i++)
> -		if (pmd[i] & _PAGE_PRESENT)
> -			pmd[i] += load_delta;
> +		pmd[i] += load_delta;

So, I think this is correct.  But, man, I wish folks would go through
the git history and make it clear that they understand _how_ the code
got the way it is.

I suspect that the original _PAGE_PRESENT check wasn't even necessary if
cleanup_highmap() really did fix things up.  But this commit:

	2aa85f246c18 ("x86/boot/64: Make level2_kernel_pgt pages invalid
		       outside kernel area")

tweaked things to actively clear out PMDs that weren't populated in
Kirill's original loop.  It didn't touch the _PAGE_PRESENT check.  But
it certainly did imply that the PMD doesn't have any holes in it and
there's nothing in the middle that needs _PAGE_PRESENT cleared.

> level2_kernel_pgt compiled with _PAGE_PRESENT set. The check is
> redundant

This isn't super reassuring.  It also depends on nothing having munged
the page tables up to this point.  The code is also a bit cruel in that
it manipulates two different sets of PMDs with the same 'pmd' variable.

Also, is this comment still accurate after '2aa85f246c18'?

>          * Fixup the kernel text+data virtual addresses. Note that
>          * we might write invalid pmds, when the kernel is relocated
>          * cleanup_highmap() fixes this up along with the mappings
>          * beyond _end.
Re: [Patch v3] x86/head/64: remove redundant check on level2_kernel_pgt's _PAGE_PRESENT bit
Posted by Wei Yang 1 year, 8 months ago
On Mon, Jun 03, 2024 at 11:50:06AM -0700, Dave Hansen wrote:
>On 5/23/24 05:35, Wei Yang wrote:
>> --- a/arch/x86/kernel/head64.c
>> +++ b/arch/x86/kernel/head64.c
>> @@ -260,8 +260,7 @@ unsigned long __head __startup_64(unsigned long physaddr,
>>  
>>  	/* fixup pages that are part of the kernel image */
>>  	for (; i <= pmd_index((unsigned long)_end); i++)
>> -		if (pmd[i] & _PAGE_PRESENT)
>> -			pmd[i] += load_delta;
>> +		pmd[i] += load_delta;
>
>So, I think this is correct.  But, man, I wish folks would go through
>the git history and make it clear that they understand _how_ the code
>got the way it is.
>
>I suspect that the original _PAGE_PRESENT check wasn't even necessary if
>cleanup_highmap() really did fix things up.  But this commit:
>
>	2aa85f246c18 ("x86/boot/64: Make level2_kernel_pgt pages invalid
>		       outside kernel area")
>
>tweaked things to actively clear out PMDs that weren't populated in
>Kirill's original loop.  It didn't touch the _PAGE_PRESENT check.  But
>it certainly did imply that the PMD doesn't have any holes in it and
>there's nothing in the middle that needs _PAGE_PRESENT cleared.
>
>> level2_kernel_pgt compiled with _PAGE_PRESENT set. The check is
>> redundant
>
>This isn't super reassuring.  It also depends on nothing having munged
>the page tables up to this point.  The code is also a bit cruel in that
>it manipulates two different sets of PMDs with the same 'pmd' variable.
>
>Also, is this comment still accurate after '2aa85f246c18'?
>
>>          * Fixup the kernel text+data virtual addresses. Note that
>>          * we might write invalid pmds, when the kernel is relocated
>>          * cleanup_highmap() fixes this up along with the mappings
>>          * beyond _end.

Hi, Dave

Do you have any other suggestions? What should I do next?

-- 
Wei Yang
Help you, Help me
Re: [Patch v3] x86/head/64: remove redundant check on level2_kernel_pgt's _PAGE_PRESENT bit
Posted by Wei Yang 1 year, 8 months ago
On Mon, Jun 03, 2024 at 11:50:06AM -0700, Dave Hansen wrote:
>On 5/23/24 05:35, Wei Yang wrote:
>> --- a/arch/x86/kernel/head64.c
>> +++ b/arch/x86/kernel/head64.c
>> @@ -260,8 +260,7 @@ unsigned long __head __startup_64(unsigned long physaddr,
>>  
>>  	/* fixup pages that are part of the kernel image */
>>  	for (; i <= pmd_index((unsigned long)_end); i++)
>> -		if (pmd[i] & _PAGE_PRESENT)
>> -			pmd[i] += load_delta;
>> +		pmd[i] += load_delta;
>
>So, I think this is correct.  But, man, I wish folks would go through
>the git history and make it clear that they understand _how_ the code
>got the way it is.
>

Dave

Thanks for your comment.

My first version listed the historical changes, but Thomas thought they were
not relevant, so I removed those descriptions.

https://lkml.org/lkml/2024/3/23/350

>I suspect that the original _PAGE_PRESENT check wasn't even necessary if
>cleanup_highmap() really did fix things up.  But this commit:
>
>	2aa85f246c18 ("x86/boot/64: Make level2_kernel_pgt pages invalid
>		       outside kernel area")
>
>tweaked things to actively clear out PMDs that weren't populated in
>Kirill's original loop.  It didn't touch the _PAGE_PRESENT check.  But
>it certainly did imply that the PMD doesn't have any holes in it and
>there's nothing in the middle that needs _PAGE_PRESENT cleared.
>

As I mentioned in my first version, the original code was introduced by

	commit 1ab60e0f72f7 ("[PATCH] x86-64: Relocatable Kernel Support")

The reason for the _PAGE_PRESENT check is that, at that time,
level2_kernel_pgt was defined as:

NEXT_PAGE(level2_kernel_pgt)
	/* 40MB kernel mapping. The kernel code cannot be bigger than that.
	   When you change this change KERNEL_TEXT_SIZE in page.h too. */
	/* (2^48-(2*1024*1024*1024)-((2^39)*511)-((2^30)*510)) = 0 */
	PMDS(0x0000000000000000, __PAGE_KERNEL_LARGE_EXEC|_PAGE_GLOBAL,
		KERNEL_TEXT_SIZE/PMD_SIZE)
	/* Module mapping starts here */
	.fill	(PTRS_PER_PMD - (KERNEL_TEXT_SIZE/PMD_SIZE)),8,0

Now it looks like this:

SYM_DATA_START_PAGE_ALIGNED(level2_kernel_pgt)
	/*
	 * Kernel high mapping.
	 *
	 * The kernel code+data+bss must be located below KERNEL_IMAGE_SIZE in
	 * virtual address space, which is 1 GiB if RANDOMIZE_BASE is enabled,
	 * 512 MiB otherwise.
	 *
	 * (NOTE: after that starts the module area, see MODULES_VADDR.)
	 *
	 * This table is eventually used by the kernel during normal runtime.
	 * Care must be taken to clear out undesired bits later, like _PAGE_RW
	 * or _PAGE_GLOBAL in some cases.
	 */
	PMDS(0, __PAGE_KERNEL_LARGE_EXEC, KERNEL_IMAGE_SIZE/PMD_SIZE)
SYM_DATA_END(level2_kernel_pgt)

The difference is that in the original version, not every entry of
level2_kernel_pgt was defined with _PAGE_PRESENT set. I did not dig into which
commit expanded level2_kernel_pgt to a full table, but I think the check has
been redundant since that point.
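
A rough sketch of the difference between the two definitions, as a user-space model rather than the real assembly. It assumes 2 MiB per PMD entry and uses the 40 MiB and 1 GiB sizes from the two comments above; `PAGE_PRESENT`, `build_table()`, `count_not_present()` and `holes_for_mapping()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PTRS_PER_PMD 512
#define PMD_SIZE     (2ULL << 20)   /* assumes 2 MiB per PMD entry */
#define PAGE_PRESENT 0x1ULL         /* stands in for _PAGE_PRESENT */

/*
 * Model PMDS(...) followed by an optional zero fill: the first
 * 'populated' entries are present, the rest of the table is zero.
 */
static void build_table(uint64_t *pmd, size_t populated)
{
	assert(populated <= PTRS_PER_PMD);
	for (size_t i = 0; i < PTRS_PER_PMD; i++)
		pmd[i] = (i < populated) ? (i * PMD_SIZE) | PAGE_PRESENT : 0;
}

/* Count entries that a _PAGE_PRESENT guard would skip. */
static size_t count_not_present(const uint64_t *pmd)
{
	size_t n = 0;

	for (size_t i = 0; i < PTRS_PER_PMD; i++)
		if (!(pmd[i] & PAGE_PRESENT))
			n++;
	return n;
}

/* Non-present entries in a table that maps 'mapping_bytes' from index 0. */
static size_t holes_for_mapping(uint64_t mapping_bytes)
{
	uint64_t pmd[PTRS_PER_PMD];

	build_table(pmd, (size_t)(mapping_bytes / PMD_SIZE));
	return count_not_present(pmd);
}
```

Under these assumptions, holes_for_mapping(40ULL << 20) returns 492 for the old 40 MiB layout, while holes_for_mapping(1ULL << 30) returns 0 for a table that PMDS() fills completely, which is the case where the guard has nothing left to skip.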

>> level2_kernel_pgt compiled with _PAGE_PRESENT set. The check is
>> redundant
>
>This isn't super reassuring.  It also depends on nothing having munged
>the page tables up to this point.  The code is also a bit cruel in that
>it manipulates two different sets of PMDs with the same 'pmd' variable.
>
>Also, is this comment still accurate after '2aa85f246c18'?
>
>>          * Fixup the kernel text+data virtual addresses. Note that
>>          * we might write invalid pmds, when the kernel is relocated
>>          * cleanup_highmap() fixes this up along with the mappings
>>          * beyond _end.

It sounds like this comment is no longer necessary. Would you prefer that I
remove it in the next version of this patch?

-- 
Wei Yang
Help you, Help me
Re: [Patch v3] x86/head/64: remove redundant check on level2_kernel_pgt's _PAGE_PRESENT bit
Posted by Kirill A . Shutemov 1 year, 8 months ago
On Thu, May 23, 2024 at 12:35:39PM +0000, Wei Yang wrote:
> Remove a redundant check on kernel code's PMD _PAGE_PRESENT attribute
> before fix up.
> 
> Current process looks like this:
> 
>     pmd in [0, _text)
>         unset _PAGE_PRESENT
>     pmd in [_text, _end]
>         if (_PAGE_PRESENT)
>             fix up delta
>     pmd in (_end, 512)
>         unset _PAGE_PRESENT
> 
> level2_kernel_pgt compiled with _PAGE_PRESENT set. The check is
> redundant
> 
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> CC: Thomas Gleixner <tglx@linutronix.de>
> CC: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> CC: Ingo Molnar <mingo@kernel.org>
> CC: Steve Wahl <steve.wahl@hpe.com>

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

-- 
  Kiryl Shutsemau / Kirill A. Shutemov