[PATCH V2 3/7] mm: Use ptep_get() for accessing PTE entries

Anshuman Khandual posted 7 patches 2 months, 1 week ago
[PATCH V2 3/7] mm: Use ptep_get() for accessing PTE entries
Posted by Anshuman Khandual 2 months, 1 week ago
Convert PTE accesses via ptep_get() helper that defaults as READ_ONCE() but
also provides the platform an opportunity to override when required. This
stores read page table entry value in a local variable which can be used in
multiple instances there after. This helps in avoiding multiple memory load
operations as well possible race conditions.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: "Mike Rapoport (IBM)" <rppt@kernel.org>
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 include/linux/pgtable.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 2a6a3cccfc36..547eeae8c43f 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -1060,7 +1060,8 @@ static inline int pgd_same(pgd_t pgd_a, pgd_t pgd_b)
  */
 #define set_pte_safe(ptep, pte) \
 ({ \
-	WARN_ON_ONCE(pte_present(*ptep) && !pte_same(*ptep, pte)); \
+	pte_t __old = ptep_get(ptep); \
+	WARN_ON_ONCE(pte_present(__old) && !pte_same(__old, pte)); \
 	set_pte(ptep, pte); \
 })
 
-- 
2.25.1
Re: [PATCH V2 3/7] mm: Use ptep_get() for accessing PTE entries
Posted by David Hildenbrand 2 months, 1 week ago
On 17.09.24 09:31, Anshuman Khandual wrote:
> Convert PTE accesses via ptep_get() helper that defaults as READ_ONCE() but
> also provides the platform an opportunity to override when required. This
> stores read page table entry value in a local variable which can be used in
> multiple instances there after. This helps in avoiding multiple memory load
> operations as well possible race conditions.
> 

Please make it clearer in the subject+description that this really only 
involves set_pte_safe().


> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Ryan Roberts <ryan.roberts@arm.com>
> Cc: "Mike Rapoport (IBM)" <rppt@kernel.org>
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>   include/linux/pgtable.h | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index 2a6a3cccfc36..547eeae8c43f 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -1060,7 +1060,8 @@ static inline int pgd_same(pgd_t pgd_a, pgd_t pgd_b)
>    */
>   #define set_pte_safe(ptep, pte) \
>   ({ \
> -	WARN_ON_ONCE(pte_present(*ptep) && !pte_same(*ptep, pte)); \
> +	pte_t __old = ptep_get(ptep); \
> +	WARN_ON_ONCE(pte_present(__old) && !pte_same(__old, pte)); \
>   	set_pte(ptep, pte); \
>   })
>   

I don't think this is necessary. PTE present cannot flip concurrently, 
that's the whole reason of the "safe" part after all.

Can we just move these weird set_pte/pmd_safe() stuff to x86 init code 
and be done with it? Then it's also clear *where* it is getting used and 
for which reason.

-- 
Cheers,

David / dhildenb
Re: [PATCH V2 3/7] mm: Use ptep_get() for accessing PTE entries
Posted by Anshuman Khandual 2 months, 1 week ago

On 9/17/24 15:58, David Hildenbrand wrote:
> On 17.09.24 09:31, Anshuman Khandual wrote:
>> Convert PTE accesses via ptep_get() helper that defaults as READ_ONCE() but
>> also provides the platform an opportunity to override when required. This
>> stores read page table entry value in a local variable which can be used in
>> multiple instances there after. This helps in avoiding multiple memory load
>> operations as well possible race conditions.
>>
> 
> Please make it clearer in the subject+description that this really only involves set_pte_safe().

I will update the commit message with some thing like this.

mm: Use ptep_get() in set_pte_safe()

This converts PTE accesses in set_pte_safe() via ptep_get() helper which
defaults as READ_ONCE() but also provides the platform an opportunity to
override when required. This stores read page table entry value in a local
variable which can be used in multiple instances there after. This helps
in avoiding multiple memory load operations as well as some possible race
conditions.

> 
> 
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: David Hildenbrand <david@redhat.com>
>> Cc: Ryan Roberts <ryan.roberts@arm.com>
>> Cc: "Mike Rapoport (IBM)" <rppt@kernel.org>
>> Cc: linux-mm@kvack.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>   include/linux/pgtable.h | 3 ++-
>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
>> index 2a6a3cccfc36..547eeae8c43f 100644
>> --- a/include/linux/pgtable.h
>> +++ b/include/linux/pgtable.h
>> @@ -1060,7 +1060,8 @@ static inline int pgd_same(pgd_t pgd_a, pgd_t pgd_b)
>>    */
>>   #define set_pte_safe(ptep, pte) \
>>   ({ \
>> -    WARN_ON_ONCE(pte_present(*ptep) && !pte_same(*ptep, pte)); \
>> +    pte_t __old = ptep_get(ptep); \
>> +    WARN_ON_ONCE(pte_present(__old) && !pte_same(__old, pte)); \
>>       set_pte(ptep, pte); \
>>   })
>>   
> 
> I don't think this is necessary. PTE present cannot flip concurrently, that's the whole reason of the "safe" part after all.

Which is not necessary ? Converting de-references to ptep_get() OR caching
the page table read value in a local variable ? ptep_get() conversion also
serves the purpose providing an opportunity for platform to override.

> 
> Can we just move these weird set_pte/pmd_safe() stuff to x86 init code and be done with it? Then it's also clear *where* it is getting used and for which reason.
> 
set_pte/pmd_safe() can be moved to x86 platform - as that is currently the
sole user for these helpers. But because set_pgd_safe() gets used in riscv
platform, just wondering would it be worth moving only the pte/pmd helpers
but not the pgd one ?
Re: [PATCH V2 3/7] mm: Use ptep_get() for accessing PTE entries
Posted by David Hildenbrand 2 months, 1 week ago
On 18.09.24 08:32, Anshuman Khandual wrote:
> 
> 
> On 9/17/24 15:58, David Hildenbrand wrote:
>> On 17.09.24 09:31, Anshuman Khandual wrote:
>>> Convert PTE accesses via ptep_get() helper that defaults as READ_ONCE() but
>>> also provides the platform an opportunity to override when required. This
>>> stores read page table entry value in a local variable which can be used in
>>> multiple instances there after. This helps in avoiding multiple memory load
>>> operations as well possible race conditions.
>>>
>>
>> Please make it clearer in the subject+description that this really only involves set_pte_safe().
> 
> I will update the commit message with some thing like this.
> 
> mm: Use ptep_get() in set_pte_safe()
> 
> This converts PTE accesses in set_pte_safe() via ptep_get() helper which
> defaults as READ_ONCE() but also provides the platform an opportunity to
> override when required. This stores read page table entry value in a local
> variable which can be used in multiple instances there after. This helps
> in avoiding multiple memory load operations as well as some possible race
> conditions.
> 
>>
>>
>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>> Cc: David Hildenbrand <david@redhat.com>
>>> Cc: Ryan Roberts <ryan.roberts@arm.com>
>>> Cc: "Mike Rapoport (IBM)" <rppt@kernel.org>
>>> Cc: linux-mm@kvack.org
>>> Cc: linux-kernel@vger.kernel.org
>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>> ---
>>>    include/linux/pgtable.h | 3 ++-
>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
>>> index 2a6a3cccfc36..547eeae8c43f 100644
>>> --- a/include/linux/pgtable.h
>>> +++ b/include/linux/pgtable.h
>>> @@ -1060,7 +1060,8 @@ static inline int pgd_same(pgd_t pgd_a, pgd_t pgd_b)
>>>     */
>>>    #define set_pte_safe(ptep, pte) \
>>>    ({ \
>>> -    WARN_ON_ONCE(pte_present(*ptep) && !pte_same(*ptep, pte)); \
>>> +    pte_t __old = ptep_get(ptep); \
>>> +    WARN_ON_ONCE(pte_present(__old) && !pte_same(__old, pte)); \
>>>        set_pte(ptep, pte); \
>>>    })
>>>    
>>
>> I don't think this is necessary. PTE present cannot flip concurrently, that's the whole reason of the "safe" part after all.
> 
> Which is not necessary ? Converting de-references to ptep_get() OR caching
> the page table read value in a local variable ? ptep_get() conversion also
> serves the purpose providing an opportunity for platform to override.

Which arch override are you thinking of where this change here would 
make a real difference? Would it even make a difference with cont-pte on 
arm?

> 
>>
>> Can we just move these weird set_pte/pmd_safe() stuff to x86 init code and be done with it? Then it's also clear *where* it is getting used and for which reason.
>>
> set_pte/pmd_safe() can be moved to x86 platform - as that is currently the
> sole user for these helpers. But because set_pgd_safe() gets used in riscv
> platform, just wondering would it be worth moving only the pte/pmd helpers
> but not the pgd one ?

My take would be just to move them where they are used, and possibly 
even inlining them.

The point is that it's absolutely underdocumented what "_safe" is 
supposed to be here, and I don't really see the reason to have this in 
common code (making the common API more complicated).

-- 
Cheers,

David / dhildenb

Re: [PATCH V2 3/7] mm: Use ptep_get() for accessing PTE entries
Posted by Anshuman Khandual 2 months, 1 week ago
On 9/19/24 13:34, David Hildenbrand wrote:
> On 18.09.24 08:32, Anshuman Khandual wrote:
>>
>>
>> On 9/17/24 15:58, David Hildenbrand wrote:
>>> On 17.09.24 09:31, Anshuman Khandual wrote:
>>>> Convert PTE accesses via ptep_get() helper that defaults as READ_ONCE() but
>>>> also provides the platform an opportunity to override when required. This
>>>> stores read page table entry value in a local variable which can be used in
>>>> multiple instances there after. This helps in avoiding multiple memory load
>>>> operations as well possible race conditions.
>>>>
>>>
>>> Please make it clearer in the subject+description that this really only involves set_pte_safe().
>>
>> I will update the commit message with some thing like this.
>>
>> mm: Use ptep_get() in set_pte_safe()
>>
>> This converts PTE accesses in set_pte_safe() via ptep_get() helper which
>> defaults as READ_ONCE() but also provides the platform an opportunity to
>> override when required. This stores read page table entry value in a local
>> variable which can be used in multiple instances there after. This helps
>> in avoiding multiple memory load operations as well as some possible race
>> conditions.
>>
>>>
>>>
>>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>>> Cc: David Hildenbrand <david@redhat.com>
>>>> Cc: Ryan Roberts <ryan.roberts@arm.com>
>>>> Cc: "Mike Rapoport (IBM)" <rppt@kernel.org>
>>>> Cc: linux-mm@kvack.org
>>>> Cc: linux-kernel@vger.kernel.org
>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>>> ---
>>>>    include/linux/pgtable.h | 3 ++-
>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
>>>> index 2a6a3cccfc36..547eeae8c43f 100644
>>>> --- a/include/linux/pgtable.h
>>>> +++ b/include/linux/pgtable.h
>>>> @@ -1060,7 +1060,8 @@ static inline int pgd_same(pgd_t pgd_a, pgd_t pgd_b)
>>>>     */
>>>>    #define set_pte_safe(ptep, pte) \
>>>>    ({ \
>>>> -    WARN_ON_ONCE(pte_present(*ptep) && !pte_same(*ptep, pte)); \
>>>> +    pte_t __old = ptep_get(ptep); \
>>>> +    WARN_ON_ONCE(pte_present(__old) && !pte_same(__old, pte)); \
>>>>        set_pte(ptep, pte); \
>>>>    })
>>>>    
>>>
>>> I don't think this is necessary. PTE present cannot flip concurrently, that's the whole reason of the "safe" part after all.
>>
>> Which is not necessary ? Converting de-references to ptep_get() OR caching
>> the page table read value in a local variable ? ptep_get() conversion also
>> serves the purpose providing an opportunity for platform to override.
> 
> Which arch override are you thinking of where this change here would make a real difference? Would it even make a difference with cont-pte on arm?

As we figured out already this code is not used any where other than x86 platform.
So changing this, won't make a difference for arm64 unless I am missing something.
The idea behind the series is to ensure that, there are no direct de-referencing
of page table entries in generic MM code and all accesses should go via available
helpers instead. But if we move these set_pxd_safe() helpers into platform code as
you have suggested earlier, those changes will not be necessary anymore.

> 
>>
>>>
>>> Can we just move these weird set_pte/pmd_safe() stuff to x86 init code and be done with it? Then it's also clear *where* it is getting used and for which reason.
>>>
>> set_pte/pmd_safe() can be moved to x86 platform - as that is currently the
>> sole user for these helpers. But because set_pgd_safe() gets used in riscv
>> platform, just wondering would it be worth moving only the pte/pmd helpers
>> but not the pgd one ?
> 
> My take would be just to move them where they are used, and possibly even inlining them.
> 
> The point is that it's absolutely underdocumented what "_safe" is supposed to be here, and I don't really see the reason to have this in common code (making the common API more complicated).

Agreed, it makes sense for these helpers to be in the platform code instead where
they get used (x86, riscv). Will move them as required.
Re: [PATCH V2 3/7] mm: Use ptep_get() for accessing PTE entries
Posted by Ryan Roberts 2 months, 1 week ago
On 17/09/2024 08:31, Anshuman Khandual wrote:
> Convert PTE accesses via ptep_get() helper that defaults as READ_ONCE() but
> also provides the platform an opportunity to override when required. This
> stores read page table entry value in a local variable which can be used in
> multiple instances there after. This helps in avoiding multiple memory load
> operations as well possible race conditions.
> 
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Ryan Roberts <ryan.roberts@arm.com>
> Cc: "Mike Rapoport (IBM)" <rppt@kernel.org>
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>

Reviewed-by: Ryan Roberts <ryan.roberts@arm.com>

> ---
>  include/linux/pgtable.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index 2a6a3cccfc36..547eeae8c43f 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -1060,7 +1060,8 @@ static inline int pgd_same(pgd_t pgd_a, pgd_t pgd_b)
>   */
>  #define set_pte_safe(ptep, pte) \
>  ({ \
> -	WARN_ON_ONCE(pte_present(*ptep) && !pte_same(*ptep, pte)); \
> +	pte_t __old = ptep_get(ptep); \
> +	WARN_ON_ONCE(pte_present(__old) && !pte_same(__old, pte)); \
>  	set_pte(ptep, pte); \
>  })
>