[PATCH v3 2/3] arm64/mm: Move PTE_PRESENT_INVALID to overlay PTE_NG

Ryan Roberts posted 3 patches 1 year, 7 months ago
There is a newer version of this series
[PATCH v3 2/3] arm64/mm: Move PTE_PRESENT_INVALID to overlay PTE_NG
Posted by Ryan Roberts 1 year, 7 months ago
PTE_PRESENT_INVALID was previously occupying bit 59, which when a PTE is
valid can either be IGNORED, PBHA[0] or AttrIndex[3], depending on the
HW configuration. In practice this is currently not a problem because
PTE_PRESENT_INVALID can only be 1 when PTE_VALID=0 and upstream Linux
always requires the bit set to 0 for a valid pte.

However, if in future Linux wants to use the field (e.g. AttrIndex[3])
then we could end up with confusion when PTE_PRESENT_INVALID comes along
and corrupts the field - we would ideally want to preserve it even for
an invalid (but present) pte.

The other problem with bit 59 is that it prevents the offset field of a
swap entry within a swap pte from growing beyond 51 bits. By moving
PTE_PRESENT_INVALID to a low bit we can lay the swap pte out so that the
offset field could grow to 52 bits in future.

So let's move PTE_PRESENT_INVALID to overlay PTE_NG (bit 11).

There is no need to persist NG for a present-invalid entry; it is always
set for user mappings and is not used by SW to derive any state from the
pte. PTE_NS was considered instead of PTE_NG, but it is RES0 for
non-secure SW, so there is a chance that future architecture may
allocate the bit and we may therefore need to persist that bit for
present-invalid ptes.

These are both marginal benefits, but make things a bit tidier in my
opinion.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 arch/arm64/include/asm/pgtable-prot.h |  2 +-
 arch/arm64/include/asm/pgtable.h      | 12 ++++++------
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
index cd8c06f5fb02..3047d10987fd 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -18,7 +18,7 @@
 #define PTE_DIRTY		(_AT(pteval_t, 1) << 55)
 #define PTE_SPECIAL		(_AT(pteval_t, 1) << 56)
 #define PTE_DEVMAP		(_AT(pteval_t, 1) << 57)
-#define PTE_PRESENT_INVALID	(_AT(pteval_t, 1) << 59) /* only when !PTE_VALID */
+#define PTE_PRESENT_INVALID	(PTE_NG)		 /* only when !PTE_VALID */
 
 #define _PROT_DEFAULT		(PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
 #define _PROT_SECT_DEFAULT	(PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index c0f4471423db..7f1ff59c43ed 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -1254,15 +1254,15 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
  * Encode and decode a swap entry:
  *	bits 0-1:	present (must be zero)
  *	bits 2:		remember PG_anon_exclusive
- *	bits 3-7:	swap type
- *	bits 8-57:	swap offset
- *	bit  59:	PTE_PRESENT_INVALID (must be zero)
+ *	bits 6-10:	swap type
+ *	bit  11:	PTE_PRESENT_INVALID (must be zero)
+ *	bits 12-61:	swap offset
  */
-#define __SWP_TYPE_SHIFT	3
+#define __SWP_TYPE_SHIFT	6
 #define __SWP_TYPE_BITS		5
-#define __SWP_OFFSET_BITS	50
 #define __SWP_TYPE_MASK		((1 << __SWP_TYPE_BITS) - 1)
-#define __SWP_OFFSET_SHIFT	(__SWP_TYPE_BITS + __SWP_TYPE_SHIFT)
+#define __SWP_OFFSET_SHIFT	12
+#define __SWP_OFFSET_BITS	50
 #define __SWP_OFFSET_MASK	((1UL << __SWP_OFFSET_BITS) - 1)
 
 #define __swp_type(x)		(((x).val >> __SWP_TYPE_SHIFT) & __SWP_TYPE_MASK)
-- 
2.25.1
Re: [PATCH v3 2/3] arm64/mm: Move PTE_PRESENT_INVALID to overlay PTE_NG
Posted by Anshuman Khandual 1 year, 7 months ago

On 5/1/24 20:24, Ryan Roberts wrote:
> PTE_PRESENT_INVALID was previously occupying bit 59, which when a PTE is
> valid can either be IGNORED, PBHA[0] or AttrIndex[3], depending on the
> HW configuration. In practice this is currently not a problem because
> PTE_PRESENT_INVALID can only be 1 when PTE_VALID=0 and upstream Linux
> always requires the bit set to 0 for a valid pte.
> 
> However, if in future Linux wants to use the field (e.g. AttrIndex[3])
> then we could end up with confusion when PTE_PRESENT_INVALID comes along
> and corrupts the field - we would ideally want to preserve it even for
> an invalid (but present) pte.
> 
> The other problem with bit 59 is that it prevents the offset field of a
> swap entry within a swap pte from growing beyond 51 bits. By moving
> PTE_PRESENT_INVALID to a low bit we can lay the swap pte out so that the
> offset field could grow to 52 bits in future.
> 
> So let's move PTE_PRESENT_INVALID to overlay PTE_NG (bit 11).
> 
> There is no need to persist NG for a present-invalid entry; it is always
> set for user mappings and is not used by SW to derive any state from the

But the idea of present and invalid state is that all the HW used information
should be fetched successfully even though the the entry is not valid and not
being walked by the MMU. Setting and clearing PTE SW bits in such state, does
not change that, but tampering with HW bits would break the assumption around
present and invalid entry ?

> pte. PTE_NS was considered instead of PTE_NG, but it is RES0 for
> non-secure SW, so there is a chance that future architecture may
> allocate the bit and we may therefore need to persist that bit for
> present-invalid ptes.

If we are being careful around PTE_NS and even for AttrIndex[3] as mentioned
earlier to be useful during an invalid state, how can PTE_NG be used without
any such consideration.

> 
> These are both marginal benefits, but make things a bit tidier in my
> opinion.

Apart from swap offset field expansion does this change achieve anything else ?

> 
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
>  arch/arm64/include/asm/pgtable-prot.h |  2 +-
>  arch/arm64/include/asm/pgtable.h      | 12 ++++++------
>  2 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
> index cd8c06f5fb02..3047d10987fd 100644
> --- a/arch/arm64/include/asm/pgtable-prot.h
> +++ b/arch/arm64/include/asm/pgtable-prot.h
> @@ -18,7 +18,7 @@
>  #define PTE_DIRTY		(_AT(pteval_t, 1) << 55)
>  #define PTE_SPECIAL		(_AT(pteval_t, 1) << 56)
>  #define PTE_DEVMAP		(_AT(pteval_t, 1) << 57)
> -#define PTE_PRESENT_INVALID	(_AT(pteval_t, 1) << 59) /* only when !PTE_VALID */
> +#define PTE_PRESENT_INVALID	(PTE_NG)		 /* only when !PTE_VALID */
>  
>  #define _PROT_DEFAULT		(PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
>  #define _PROT_SECT_DEFAULT	(PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index c0f4471423db..7f1ff59c43ed 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -1254,15 +1254,15 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
>   * Encode and decode a swap entry:
>   *	bits 0-1:	present (must be zero)
>   *	bits 2:		remember PG_anon_exclusive
> - *	bits 3-7:	swap type
> - *	bits 8-57:	swap offset
> - *	bit  59:	PTE_PRESENT_INVALID (must be zero)
> + *	bits 6-10:	swap type
> + *	bit  11:	PTE_PRESENT_INVALID (must be zero)
> + *	bits 12-61:	swap offset
>   */
> -#define __SWP_TYPE_SHIFT	3
> +#define __SWP_TYPE_SHIFT	6
>  #define __SWP_TYPE_BITS		5
> -#define __SWP_OFFSET_BITS	50
>  #define __SWP_TYPE_MASK		((1 << __SWP_TYPE_BITS) - 1)
> -#define __SWP_OFFSET_SHIFT	(__SWP_TYPE_BITS + __SWP_TYPE_SHIFT)
> +#define __SWP_OFFSET_SHIFT	12
> +#define __SWP_OFFSET_BITS	50
>  #define __SWP_OFFSET_MASK	((1UL << __SWP_OFFSET_BITS) - 1)
>  
>  #define __swp_type(x)		(((x).val >> __SWP_TYPE_SHIFT) & __SWP_TYPE_MASK)
Re: [PATCH v3 2/3] arm64/mm: Move PTE_PRESENT_INVALID to overlay PTE_NG
Posted by Ryan Roberts 1 year, 7 months ago
On 02/05/2024 07:12, Anshuman Khandual wrote:
> 
> 
> On 5/1/24 20:24, Ryan Roberts wrote:
>> PTE_PRESENT_INVALID was previously occupying bit 59, which when a PTE is
>> valid can either be IGNORED, PBHA[0] or AttrIndex[3], depending on the
>> HW configuration. In practice this is currently not a problem because
>> PTE_PRESENT_INVALID can only be 1 when PTE_VALID=0 and upstream Linux
>> always requires the bit set to 0 for a valid pte.
>>
>> However, if in future Linux wants to use the field (e.g. AttrIndex[3])
>> then we could end up with confusion when PTE_PRESENT_INVALID comes along
>> and corrupts the field - we would ideally want to preserve it even for
>> an invalid (but present) pte.
>>
>> The other problem with bit 59 is that it prevents the offset field of a
>> swap entry within a swap pte from growing beyond 51 bits. By moving
>> PTE_PRESENT_INVALID to a low bit we can lay the swap pte out so that the
>> offset field could grow to 52 bits in future.
>>
>> So let's move PTE_PRESENT_INVALID to overlay PTE_NG (bit 11).
>>
>> There is no need to persist NG for a present-invalid entry; it is always
>> set for user mappings and is not used by SW to derive any state from the
> 
> But the idea of present and invalid state is that all the HW used information
> should be fetched successfully even though the the entry is not valid and not
> being walked by the MMU. Setting and clearing PTE SW bits in such state, does
> not change that, but tampering with HW bits would break the assumption around
> present and invalid entry ?

But we are not changing anything; NG is *always* set in valid-present user
entries. And it continues to be *always* set in invalid-present user entries,
because PTE_PRESENT_INVALID is the same bit.

Additionally there are no helpers (that I could find) that read or write the NG
bit so core-mm can't possibly be accidentally corrupting anything or reading
back incorrect state.

> 
>> pte. PTE_NS was considered instead of PTE_NG, but it is RES0 for
>> non-secure SW, so there is a chance that future architecture may
>> allocate the bit and we may therefore need to persist that bit for
>> present-invalid ptes.
> 
> If we are being careful around PTE_NS and even for AttrIndex[3] as mentioned
> earlier to be useful during an invalid state, how can PTE_NG be used without
> any such consideration.

Are you suggesting that in future we may want to make global mappings available
to user space? I think that's unlikely?

The point about NG is that it is already architecturally defined in normal world
and we are using it - kernel mappings are (usually) global, user mappings are
not. So a new meaning for that bit can't come along unless we are willing to say
that *all* mappings (inc kernel mappings) will become implicitly non-global.

So my assertion that this bit can be reused like this (now and in future) is
predicated on 1) all user mappings are and will continue to be non-global, 2)
the bit will never be repurposed by the architecture (or if it is, Linux will
never opt-in because we need per-pte (non-)global control).

> 
>>
>> These are both marginal benefits, but make things a bit tidier in my
>> opinion.
> 
> Apart from swap offset field expansion does this change achieve anything else ?

It frees up bit 59 to be used for its architectural purpose in future.

> 
>>
>> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>> ---
>>  arch/arm64/include/asm/pgtable-prot.h |  2 +-
>>  arch/arm64/include/asm/pgtable.h      | 12 ++++++------
>>  2 files changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
>> index cd8c06f5fb02..3047d10987fd 100644
>> --- a/arch/arm64/include/asm/pgtable-prot.h
>> +++ b/arch/arm64/include/asm/pgtable-prot.h
>> @@ -18,7 +18,7 @@
>>  #define PTE_DIRTY		(_AT(pteval_t, 1) << 55)
>>  #define PTE_SPECIAL		(_AT(pteval_t, 1) << 56)
>>  #define PTE_DEVMAP		(_AT(pteval_t, 1) << 57)
>> -#define PTE_PRESENT_INVALID	(_AT(pteval_t, 1) << 59) /* only when !PTE_VALID */
>> +#define PTE_PRESENT_INVALID	(PTE_NG)		 /* only when !PTE_VALID */
>>  
>>  #define _PROT_DEFAULT		(PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
>>  #define _PROT_SECT_DEFAULT	(PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
>> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
>> index c0f4471423db..7f1ff59c43ed 100644
>> --- a/arch/arm64/include/asm/pgtable.h
>> +++ b/arch/arm64/include/asm/pgtable.h
>> @@ -1254,15 +1254,15 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
>>   * Encode and decode a swap entry:
>>   *	bits 0-1:	present (must be zero)
>>   *	bits 2:		remember PG_anon_exclusive
>> - *	bits 3-7:	swap type
>> - *	bits 8-57:	swap offset
>> - *	bit  59:	PTE_PRESENT_INVALID (must be zero)
>> + *	bits 6-10:	swap type
>> + *	bit  11:	PTE_PRESENT_INVALID (must be zero)
>> + *	bits 12-61:	swap offset
>>   */
>> -#define __SWP_TYPE_SHIFT	3
>> +#define __SWP_TYPE_SHIFT	6
>>  #define __SWP_TYPE_BITS		5
>> -#define __SWP_OFFSET_BITS	50
>>  #define __SWP_TYPE_MASK		((1 << __SWP_TYPE_BITS) - 1)
>> -#define __SWP_OFFSET_SHIFT	(__SWP_TYPE_BITS + __SWP_TYPE_SHIFT)
>> +#define __SWP_OFFSET_SHIFT	12
>> +#define __SWP_OFFSET_BITS	50
>>  #define __SWP_OFFSET_MASK	((1UL << __SWP_OFFSET_BITS) - 1)
>>  
>>  #define __swp_type(x)		(((x).val >> __SWP_TYPE_SHIFT) & __SWP_TYPE_MASK)