[PATCH v2 2/3] LoongArch: Add barrier between set_pte and memory access
Posted by Bibo Mao 1 month, 2 weeks ago
It is possible to take a spurious fault if memory is accessed right
after the PTE is set. For the user address space the PTE is set in
kernel mode while the memory is accessed in user mode, so there is a
long window for synchronization and no barrier is needed. For the
kernel address space, however, memory may be accessed right after the
PTE is set.

Here flush_cache_vmap()/flush_cache_vmap_early() are used for that
synchronization.
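
For illustration, the kind of vmap-then-access sequence this guards
against looks roughly like the sketch below (an illustrative sketch
only; map_and_touch() is a made-up helper, not actual kernel code):

#include <linux/mm.h>
#include <linux/vmalloc.h>
#include <linux/string.h>

static void *map_and_touch(struct page **pages, unsigned int nr_pages)
{
        void *buf = vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);

        /*
         * vmap() installs the PTEs and then calls flush_cache_vmap(),
         * which with this patch is an smp_mb(); the store below can
         * therefore not be issued before the hardware page table
         * walker observes the new PTEs.
         */
        if (buf)
                memset(buf, 0, nr_pages * PAGE_SIZE);

        return buf;
}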

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
---
 arch/loongarch/include/asm/cacheflush.h | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/loongarch/include/asm/cacheflush.h b/arch/loongarch/include/asm/cacheflush.h
index f8754d08a31a..53be231319ef 100644
--- a/arch/loongarch/include/asm/cacheflush.h
+++ b/arch/loongarch/include/asm/cacheflush.h
@@ -42,12 +42,24 @@ void local_flush_icache_range(unsigned long start, unsigned long end);
 #define flush_cache_dup_mm(mm)				do { } while (0)
 #define flush_cache_range(vma, start, end)		do { } while (0)
 #define flush_cache_page(vma, vmaddr, pfn)		do { } while (0)
-#define flush_cache_vmap(start, end)			do { } while (0)
 #define flush_cache_vunmap(start, end)			do { } while (0)
 #define flush_icache_user_page(vma, page, addr, len)	do { } while (0)
 #define flush_dcache_mmap_lock(mapping)			do { } while (0)
 #define flush_dcache_mmap_unlock(mapping)		do { } while (0)
 
+/*
+ * It is possible for a kernel virtual mapping access to return a spurious
+ * fault if it's accessed right after the pte is set. The page fault handler
+ * does not expect this type of fault. flush_cache_vmap is not exactly the
+ * right place to put this, but it seems to work well enough.
+ */
+static inline void flush_cache_vmap(unsigned long start, unsigned long end)
+{
+	smp_mb();
+}
+#define flush_cache_vmap flush_cache_vmap
+#define flush_cache_vmap_early	flush_cache_vmap
+
 #define cache_op(op, addr)						\
 	__asm__ __volatile__(						\
 	"	cacop	%0, %1					\n"	\
-- 
2.39.3
Re: [PATCH v2 2/3] LoongArch: Add barrier between set_pte and memory access
Posted by Huacai Chen 1 month, 1 week ago
Hi, Bibo,

On Mon, Oct 14, 2024 at 11:59 AM Bibo Mao <maobibo@loongson.cn> wrote:
>
> It is possible to return a spurious fault if memory is accessed
> right after the pte is set. For user address space, pte is set
> in kernel space and memory is accessed in user space, there is
> long time for synchronization, no barrier needed. However for
> kernel address space, it is possible that memory is accessed
> right after the pte is set.
>
> Here flush_cache_vmap/flush_cache_vmap_early is used for
> synchronization.
>
> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
> ---
>  arch/loongarch/include/asm/cacheflush.h | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/arch/loongarch/include/asm/cacheflush.h b/arch/loongarch/include/asm/cacheflush.h
> index f8754d08a31a..53be231319ef 100644
> --- a/arch/loongarch/include/asm/cacheflush.h
> +++ b/arch/loongarch/include/asm/cacheflush.h
> @@ -42,12 +42,24 @@ void local_flush_icache_range(unsigned long start, unsigned long end);
>  #define flush_cache_dup_mm(mm)                         do { } while (0)
>  #define flush_cache_range(vma, start, end)             do { } while (0)
>  #define flush_cache_page(vma, vmaddr, pfn)             do { } while (0)
> -#define flush_cache_vmap(start, end)                   do { } while (0)
>  #define flush_cache_vunmap(start, end)                 do { } while (0)
>  #define flush_icache_user_page(vma, page, addr, len)   do { } while (0)
>  #define flush_dcache_mmap_lock(mapping)                        do { } while (0)
>  #define flush_dcache_mmap_unlock(mapping)              do { } while (0)
>
> +/*
> + * It is possible for a kernel virtual mapping access to return a spurious
> + * fault if it's accessed right after the pte is set. The page fault handler
> + * does not expect this type of fault. flush_cache_vmap is not exactly the
> + * right place to put this, but it seems to work well enough.
> + */
> +static inline void flush_cache_vmap(unsigned long start, unsigned long end)
> +{
> +       smp_mb();
> +}
> +#define flush_cache_vmap flush_cache_vmap
> +#define flush_cache_vmap_early flush_cache_vmap
From the history of flush_cache_vmap_early(), it seems that only archs
with a "virtual cache" (VIVT or VIPT) need this API, so LoongArch can
be a no-op here.

And I still think flush_cache_vunmap() should be an smp_mb(). An
smp_mb() in flush_cache_vmap() prevents subsequent accesses from being
reordered before set_pte(), and an smp_mb() in flush_cache_vunmap()
prevents preceding accesses from being reordered after pte_clear().
This potential problem may not show up in experiments, but it is
needed in theory.
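
For concreteness, a minimal sketch of what that would look like in
cacheflush.h (illustrative only, not an actual patch; it would replace
the current no-op define):

static inline void flush_cache_vunmap(unsigned long start, unsigned long end)
{
        /*
         * Order preceding accesses to the mapping before the following
         * pte_clear(), mirroring the smp_mb() in flush_cache_vmap().
         */
        smp_mb();
}
#define flush_cache_vunmap flush_cache_vunmap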

Huacai

> +
>  #define cache_op(op, addr)                                             \
>         __asm__ __volatile__(                                           \
>         "       cacop   %0, %1                                  \n"     \
> --
> 2.39.3
>
>
Re: [PATCH v2 2/3] LoongArch: Add barrier between set_pte and memory access
Posted by maobibo 1 month, 1 week ago

On 2024/10/14 2:31 PM, Huacai Chen wrote:
> Hi, Bibo,
> 
> On Mon, Oct 14, 2024 at 11:59 AM Bibo Mao <maobibo@loongson.cn> wrote:
>>
>> It is possible to return a spurious fault if memory is accessed
>> right after the pte is set. For user address space, pte is set
>> in kernel space and memory is accessed in user space, there is
>> long time for synchronization, no barrier needed. However for
>> kernel address space, it is possible that memory is accessed
>> right after the pte is set.
>>
>> Here flush_cache_vmap/flush_cache_vmap_early is used for
>> synchronization.
>>
>> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
>> ---
>>   arch/loongarch/include/asm/cacheflush.h | 14 +++++++++++++-
>>   1 file changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/loongarch/include/asm/cacheflush.h b/arch/loongarch/include/asm/cacheflush.h
>> index f8754d08a31a..53be231319ef 100644
>> --- a/arch/loongarch/include/asm/cacheflush.h
>> +++ b/arch/loongarch/include/asm/cacheflush.h
>> @@ -42,12 +42,24 @@ void local_flush_icache_range(unsigned long start, unsigned long end);
>>   #define flush_cache_dup_mm(mm)                         do { } while (0)
>>   #define flush_cache_range(vma, start, end)             do { } while (0)
>>   #define flush_cache_page(vma, vmaddr, pfn)             do { } while (0)
>> -#define flush_cache_vmap(start, end)                   do { } while (0)
>>   #define flush_cache_vunmap(start, end)                 do { } while (0)
>>   #define flush_icache_user_page(vma, page, addr, len)   do { } while (0)
>>   #define flush_dcache_mmap_lock(mapping)                        do { } while (0)
>>   #define flush_dcache_mmap_unlock(mapping)              do { } while (0)
>>
>> +/*
>> + * It is possible for a kernel virtual mapping access to return a spurious
>> + * fault if it's accessed right after the pte is set. The page fault handler
>> + * does not expect this type of fault. flush_cache_vmap is not exactly the
>> + * right place to put this, but it seems to work well enough.
>> + */
>> +static inline void flush_cache_vmap(unsigned long start, unsigned long end)
>> +{
>> +       smp_mb();
>> +}
>> +#define flush_cache_vmap flush_cache_vmap
>> +#define flush_cache_vmap_early flush_cache_vmap
>  From the history of flush_cache_vmap_early(), It seems only archs with
> "virtual cache" (VIVT or VIPT) need this API, so LoongArch can be a
> no-op here.

Here is the usage of flush_cache_vmap_early() in mm/percpu.c: it maps
the pages and then accesses them immediately. Do you think it should
be a no-op on LoongArch?

rc = __pcpu_map_pages(unit_addr, &pages[unit * unit_pages],
                      unit_pages);
if (rc < 0)
        panic("failed to map percpu area, err=%d\n", rc);

flush_cache_vmap_early(unit_addr, unit_addr + ai->unit_size);

/* copy static data */
memcpy((void *)unit_addr, __per_cpu_load, ai->static_size);
}


> 
> And I still think flush_cache_vunmap() should be a smp_mb(). A
> smp_mb() in flush_cache_vmap() prevents subsequent accesses be
> reordered before pte_set(), and a smp_mb() in flush_cache_vunmap()
The smp_mb() in flush_cache_vmap() does not prevent reordering. It is
there to flush the pipeline and let the hardware page table walker
sync with the data cache.

Take the following example:
   rb = vmap(pages, nr_meta_pages + 2 * nr_data_pages,
             VM_MAP | VM_USERMAP, PAGE_KERNEL);
   if (rb) {
<<<<<<<<<<< * The "if (rb)" check already prevents reordering here.
Otherwise, with any API such as kmalloc/vmap/vmalloc followed by a
memory access, there would be a reordering issue. *
       kmemleak_not_leak(pages);
       rb->pages = pages;
       rb->nr_pages = nr_pages;
       return rb;
   }

> prevents preceding accesses be reordered after pte_clear(). This
Can you give an example of such usage of flush_cache_vunmap()? Then we
can continue to discuss it; otherwise it is just guessing.

Regards
Bibo Mao
> potential problem may not be seen from experiment, but it is needed in
> theory.
> 
> Huacai
> 
>> +
>>   #define cache_op(op, addr)                                             \
>>          __asm__ __volatile__(                                           \
>>          "       cacop   %0, %1                                  \n"     \
>> --
>> 2.39.3
>>
>>

Re: [PATCH v2 2/3] LoongArch: Add barrier between set_pte and memory access
Posted by Huacai Chen 1 month, 1 week ago
On Tue, Oct 15, 2024 at 10:54 AM maobibo <maobibo@loongson.cn> wrote:
>
>
>
> On 2024/10/14 2:31 PM, Huacai Chen wrote:
> > Hi, Bibo,
> >
> > On Mon, Oct 14, 2024 at 11:59 AM Bibo Mao <maobibo@loongson.cn> wrote:
> >>
> >> It is possible to return a spurious fault if memory is accessed
> >> right after the pte is set. For user address space, pte is set
> >> in kernel space and memory is accessed in user space, there is
> >> long time for synchronization, no barrier needed. However for
> >> kernel address space, it is possible that memory is accessed
> >> right after the pte is set.
> >>
> >> Here flush_cache_vmap/flush_cache_vmap_early is used for
> >> synchronization.
> >>
> >> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
> >> ---
> >>   arch/loongarch/include/asm/cacheflush.h | 14 +++++++++++++-
> >>   1 file changed, 13 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/loongarch/include/asm/cacheflush.h b/arch/loongarch/include/asm/cacheflush.h
> >> index f8754d08a31a..53be231319ef 100644
> >> --- a/arch/loongarch/include/asm/cacheflush.h
> >> +++ b/arch/loongarch/include/asm/cacheflush.h
> >> @@ -42,12 +42,24 @@ void local_flush_icache_range(unsigned long start, unsigned long end);
> >>   #define flush_cache_dup_mm(mm)                         do { } while (0)
> >>   #define flush_cache_range(vma, start, end)             do { } while (0)
> >>   #define flush_cache_page(vma, vmaddr, pfn)             do { } while (0)
> >> -#define flush_cache_vmap(start, end)                   do { } while (0)
> >>   #define flush_cache_vunmap(start, end)                 do { } while (0)
> >>   #define flush_icache_user_page(vma, page, addr, len)   do { } while (0)
> >>   #define flush_dcache_mmap_lock(mapping)                        do { } while (0)
> >>   #define flush_dcache_mmap_unlock(mapping)              do { } while (0)
> >>
> >> +/*
> >> + * It is possible for a kernel virtual mapping access to return a spurious
> >> + * fault if it's accessed right after the pte is set. The page fault handler
> >> + * does not expect this type of fault. flush_cache_vmap is not exactly the
> >> + * right place to put this, but it seems to work well enough.
> >> + */
> >> +static inline void flush_cache_vmap(unsigned long start, unsigned long end)
> >> +{
> >> +       smp_mb();
> >> +}
> >> +#define flush_cache_vmap flush_cache_vmap
> >> +#define flush_cache_vmap_early flush_cache_vmap
> >  From the history of flush_cache_vmap_early(), It seems only archs with
> > "virtual cache" (VIVT or VIPT) need this API, so LoongArch can be a
> > no-op here.
OK, flush_cache_vmap_early() also needs smp_mb().

>
> Here is usage about flush_cache_vmap_early in file linux/mm/percpu.c,
> map the page and access it immediately. Do you think it should be noop
> on LoongArch.
>
> rc = __pcpu_map_pages(unit_addr, &pages[unit * unit_pages],
>                                       unit_pages);
> if (rc < 0)
>      panic("failed to map percpu area, err=%d\n", rc);
>      flush_cache_vmap_early(unit_addr, unit_addr + ai->unit_size);
>      /* copy static data */
>      memcpy((void *)unit_addr, __per_cpu_load, ai->static_size);
> }
>
>
> >
> > And I still think flush_cache_vunmap() should be a smp_mb(). A
> > smp_mb() in flush_cache_vmap() prevents subsequent accesses be
> > reordered before pte_set(), and a smp_mb() in flush_cache_vunmap()
> smp_mb() in flush_cache_vmap() does not prevent reorder. It is to flush
> pipeline and let page table walker HW sync with data cache.
>
> For the following example.
>    rb = vmap(pages, nr_meta_pages + 2 * nr_data_pages,
>                    VM_MAP | VM_USERMAP, PAGE_KERNEL);
>    if (rb) {
> <<<<<<<<<<< * the sentence if (rb) can prevent reorder. Otherwise with
> any API kmalloc/vmap/vmalloc and subsequent memory access, there will be
> reorder issu. *
>        kmemleak_not_leak(pages);
>        rb->pages = pages;
>        rb->nr_pages = nr_pages;
>        return rb;
>    }
>
> > prevents preceding accesses be reordered after pte_clear(). This
> Can you give an example about such usage about flush_cache_vunmap()? and
> we can continue to talk about it, else it is just guessing.
Since we cannot reach a consensus, and the flush_cache_* APIs look
very strange for this purpose (yes, I know PowerPC does it like this,
but ARM64 doesn't), I prefer to still use the ARM64 method, which
means adding a dbar in set_pte(). Of course the performance will be a
little worse, but it is still better than the old version, and it is
more robust.
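
Roughly like the sketch below (illustrative only; the real set_pte()
in asm/pgtable.h has more logic, and the actual change belongs in the
3rd patch):

static inline void set_pte(pte_t *ptep, pte_t pteval)
{
        WRITE_ONCE(*ptep, pteval);
        /*
         * Make the PTE store observable to the hardware page table
         * walker before any subsequent access to the new mapping,
         * similar to ARM64's dsb(ishst) after a PTE write.
         */
        __asm__ __volatile__("dbar 0" : : : "memory");
}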

I know you are very busy, so if you have no time you don't need to
send v3; I can just make a small modification to the 3rd patch.


Huacai

>
> Regards
> Bibo Mao
> > potential problem may not be seen from experiment, but it is needed in
> > theory.
> >
> > Huacai
> >
> >> +
> >>   #define cache_op(op, addr)                                             \
> >>          __asm__ __volatile__(                                           \
> >>          "       cacop   %0, %1                                  \n"     \
> >> --
> >> 2.39.3
> >>
> >>
>
>
Re: [PATCH v2 2/3] LoongArch: Add barrier between set_pte and memory access
Posted by maobibo 1 month, 1 week ago

On 2024/10/15 8:27 PM, Huacai Chen wrote:
> On Tue, Oct 15, 2024 at 10:54 AM maobibo <maobibo@loongson.cn> wrote:
>>
>>
>>
>> On 2024/10/14 2:31 PM, Huacai Chen wrote:
>>> Hi, Bibo,
>>>
>>> On Mon, Oct 14, 2024 at 11:59 AM Bibo Mao <maobibo@loongson.cn> wrote:
>>>>
>>>> It is possible to return a spurious fault if memory is accessed
>>>> right after the pte is set. For user address space, pte is set
>>>> in kernel space and memory is accessed in user space, there is
>>>> long time for synchronization, no barrier needed. However for
>>>> kernel address space, it is possible that memory is accessed
>>>> right after the pte is set.
>>>>
>>>> Here flush_cache_vmap/flush_cache_vmap_early is used for
>>>> synchronization.
>>>>
>>>> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
>>>> ---
>>>>    arch/loongarch/include/asm/cacheflush.h | 14 +++++++++++++-
>>>>    1 file changed, 13 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/loongarch/include/asm/cacheflush.h b/arch/loongarch/include/asm/cacheflush.h
>>>> index f8754d08a31a..53be231319ef 100644
>>>> --- a/arch/loongarch/include/asm/cacheflush.h
>>>> +++ b/arch/loongarch/include/asm/cacheflush.h
>>>> @@ -42,12 +42,24 @@ void local_flush_icache_range(unsigned long start, unsigned long end);
>>>>    #define flush_cache_dup_mm(mm)                         do { } while (0)
>>>>    #define flush_cache_range(vma, start, end)             do { } while (0)
>>>>    #define flush_cache_page(vma, vmaddr, pfn)             do { } while (0)
>>>> -#define flush_cache_vmap(start, end)                   do { } while (0)
>>>>    #define flush_cache_vunmap(start, end)                 do { } while (0)
>>>>    #define flush_icache_user_page(vma, page, addr, len)   do { } while (0)
>>>>    #define flush_dcache_mmap_lock(mapping)                        do { } while (0)
>>>>    #define flush_dcache_mmap_unlock(mapping)              do { } while (0)
>>>>
>>>> +/*
>>>> + * It is possible for a kernel virtual mapping access to return a spurious
>>>> + * fault if it's accessed right after the pte is set. The page fault handler
>>>> + * does not expect this type of fault. flush_cache_vmap is not exactly the
>>>> + * right place to put this, but it seems to work well enough.
>>>> + */
>>>> +static inline void flush_cache_vmap(unsigned long start, unsigned long end)
>>>> +{
>>>> +       smp_mb();
>>>> +}
>>>> +#define flush_cache_vmap flush_cache_vmap
>>>> +#define flush_cache_vmap_early flush_cache_vmap
>>>   From the history of flush_cache_vmap_early(), It seems only archs with
>>> "virtual cache" (VIVT or VIPT) need this API, so LoongArch can be a
>>> no-op here.
> OK,  flush_cache_vmap_early() also needs smp_mb().
> 
>>
>> Here is usage about flush_cache_vmap_early in file linux/mm/percpu.c,
>> map the page and access it immediately. Do you think it should be noop
>> on LoongArch.
>>
>> rc = __pcpu_map_pages(unit_addr, &pages[unit * unit_pages],
>>                                        unit_pages);
>> if (rc < 0)
>>       panic("failed to map percpu area, err=%d\n", rc);
>>       flush_cache_vmap_early(unit_addr, unit_addr + ai->unit_size);
>>       /* copy static data */
>>       memcpy((void *)unit_addr, __per_cpu_load, ai->static_size);
>> }
>>
>>
>>>
>>> And I still think flush_cache_vunmap() should be a smp_mb(). A
>>> smp_mb() in flush_cache_vmap() prevents subsequent accesses be
>>> reordered before pte_set(), and a smp_mb() in flush_cache_vunmap()
>> smp_mb() in flush_cache_vmap() does not prevent reorder. It is to flush
>> pipeline and let page table walker HW sync with data cache.
>>
>> For the following example.
>>     rb = vmap(pages, nr_meta_pages + 2 * nr_data_pages,
>>                     VM_MAP | VM_USERMAP, PAGE_KERNEL);
>>     if (rb) {
>> <<<<<<<<<<< * the sentence if (rb) can prevent reorder. Otherwise with
>> any API kmalloc/vmap/vmalloc and subsequent memory access, there will be
>> reorder issu. *
>>         kmemleak_not_leak(pages);
>>         rb->pages = pages;
>>         rb->nr_pages = nr_pages;
>>         return rb;
>>     }
>>
>>> prevents preceding accesses be reordered after pte_clear(). This
>> Can you give an example about such usage about flush_cache_vunmap()? and
>> we can continue to talk about it, else it is just guessing.
> Since we cannot reach a consensus, and the flush_cache_* API look very
> strange for this purpose (Yes, I know PowerPC does it like this, but
> ARM64 doesn't). I prefer to still use the ARM64 method which means add
> a dbar in set_pte(). Of course the performance will be a little worse,
> but still better than the old version, and it is more robust.
> 
> I know you are very busy, so if you have no time you don't need to
> send V3, I can just do a small modification on the 3rd patch.
No, I will send v3 myself. And I will drop this patch from the
patchset, since in actual testing vmalloc_test works well even without
it on a dual-way 3C5000. Also, the weak function kernel_pte_init()
will be replaced with an inline function, rebased on

https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-define-general-function-pxd_init.patch

I dislike the copy-paste method without further understanding :(.
Although I also copy and paste code, at least I try my best to
understand it.

Regards
Bibo Mao
> 
> 
> Huacai
> 
>>
>> Regards
>> Bibo Mao
>>> potential problem may not be seen from experiment, but it is needed in
>>> theory.
>>>
>>> Huacai
>>>
>>>> +
>>>>    #define cache_op(op, addr)                                             \
>>>>           __asm__ __volatile__(                                           \
>>>>           "       cacop   %0, %1                                  \n"     \
>>>> --
>>>> 2.39.3
>>>>
>>>>
>>
>>

Re: [PATCH v2 2/3] LoongArch: Add barrier between set_pte and memory access
Posted by Huacai Chen 1 month, 1 week ago
On Wed, Oct 16, 2024 at 2:09 PM maobibo <maobibo@loongson.cn> wrote:
>
>
>
> On 2024/10/15 8:27 PM, Huacai Chen wrote:
> > On Tue, Oct 15, 2024 at 10:54 AM maobibo <maobibo@loongson.cn> wrote:
> >>
> >>
> >>
> >> On 2024/10/14 2:31 PM, Huacai Chen wrote:
> >>> Hi, Bibo,
> >>>
> >>> On Mon, Oct 14, 2024 at 11:59 AM Bibo Mao <maobibo@loongson.cn> wrote:
> >>>>
> >>>> It is possible to return a spurious fault if memory is accessed
> >>>> right after the pte is set. For user address space, pte is set
> >>>> in kernel space and memory is accessed in user space, there is
> >>>> long time for synchronization, no barrier needed. However for
> >>>> kernel address space, it is possible that memory is accessed
> >>>> right after the pte is set.
> >>>>
> >>>> Here flush_cache_vmap/flush_cache_vmap_early is used for
> >>>> synchronization.
> >>>>
> >>>> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
> >>>> ---
> >>>>    arch/loongarch/include/asm/cacheflush.h | 14 +++++++++++++-
> >>>>    1 file changed, 13 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/arch/loongarch/include/asm/cacheflush.h b/arch/loongarch/include/asm/cacheflush.h
> >>>> index f8754d08a31a..53be231319ef 100644
> >>>> --- a/arch/loongarch/include/asm/cacheflush.h
> >>>> +++ b/arch/loongarch/include/asm/cacheflush.h
> >>>> @@ -42,12 +42,24 @@ void local_flush_icache_range(unsigned long start, unsigned long end);
> >>>>    #define flush_cache_dup_mm(mm)                         do { } while (0)
> >>>>    #define flush_cache_range(vma, start, end)             do { } while (0)
> >>>>    #define flush_cache_page(vma, vmaddr, pfn)             do { } while (0)
> >>>> -#define flush_cache_vmap(start, end)                   do { } while (0)
> >>>>    #define flush_cache_vunmap(start, end)                 do { } while (0)
> >>>>    #define flush_icache_user_page(vma, page, addr, len)   do { } while (0)
> >>>>    #define flush_dcache_mmap_lock(mapping)                        do { } while (0)
> >>>>    #define flush_dcache_mmap_unlock(mapping)              do { } while (0)
> >>>>
> >>>> +/*
> >>>> + * It is possible for a kernel virtual mapping access to return a spurious
> >>>> + * fault if it's accessed right after the pte is set. The page fault handler
> >>>> + * does not expect this type of fault. flush_cache_vmap is not exactly the
> >>>> + * right place to put this, but it seems to work well enough.
> >>>> + */
> >>>> +static inline void flush_cache_vmap(unsigned long start, unsigned long end)
> >>>> +{
> >>>> +       smp_mb();
> >>>> +}
> >>>> +#define flush_cache_vmap flush_cache_vmap
> >>>> +#define flush_cache_vmap_early flush_cache_vmap
> >>>   From the history of flush_cache_vmap_early(), It seems only archs with
> >>> "virtual cache" (VIVT or VIPT) need this API, so LoongArch can be a
> >>> no-op here.
> > OK,  flush_cache_vmap_early() also needs smp_mb().
> >
> >>
> >> Here is usage about flush_cache_vmap_early in file linux/mm/percpu.c,
> >> map the page and access it immediately. Do you think it should be noop
> >> on LoongArch.
> >>
> >> rc = __pcpu_map_pages(unit_addr, &pages[unit * unit_pages],
> >>                                        unit_pages);
> >> if (rc < 0)
> >>       panic("failed to map percpu area, err=%d\n", rc);
> >>       flush_cache_vmap_early(unit_addr, unit_addr + ai->unit_size);
> >>       /* copy static data */
> >>       memcpy((void *)unit_addr, __per_cpu_load, ai->static_size);
> >> }
> >>
> >>
> >>>
> >>> And I still think flush_cache_vunmap() should be a smp_mb(). A
> >>> smp_mb() in flush_cache_vmap() prevents subsequent accesses be
> >>> reordered before pte_set(), and a smp_mb() in flush_cache_vunmap()
> >> smp_mb() in flush_cache_vmap() does not prevent reorder. It is to flush
> >> pipeline and let page table walker HW sync with data cache.
> >>
> >> For the following example.
> >>     rb = vmap(pages, nr_meta_pages + 2 * nr_data_pages,
> >>                     VM_MAP | VM_USERMAP, PAGE_KERNEL);
> >>     if (rb) {
> >> <<<<<<<<<<< * the sentence if (rb) can prevent reorder. Otherwise with
> >> any API kmalloc/vmap/vmalloc and subsequent memory access, there will be
> >> reorder issu. *
> >>         kmemleak_not_leak(pages);
> >>         rb->pages = pages;
> >>         rb->nr_pages = nr_pages;
> >>         return rb;
> >>     }
> >>
> >>> prevents preceding accesses be reordered after pte_clear(). This
> >> Can you give an example about such usage about flush_cache_vunmap()? and
> >> we can continue to talk about it, else it is just guessing.
> > Since we cannot reach a consensus, and the flush_cache_* API look very
> > strange for this purpose (Yes, I know PowerPC does it like this, but
> > ARM64 doesn't). I prefer to still use the ARM64 method which means add
> > a dbar in set_pte(). Of course the performance will be a little worse,
> > but still better than the old version, and it is more robust.
> >
> > I know you are very busy, so if you have no time you don't need to
> > send V3, I can just do a small modification on the 3rd patch.
> No, I will send V3 by myself. And I will drop the this patch in this
> patchset since by actual test vmalloc_test works well even without this
> patch on 3C5000 Dual-way, also weak function kernel_pte_init will be
> replaced with inline function rebased on
>
> https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patches/mm-define-general-function-pxd_init.patch
This patch is in Andrew's mm-unstable branch. As far as I know,
mm-unstable is for -next (6.13) and mm-stable is for current (6.12).

But this series is a bugfix, so it is for current (6.12).

>
> I dislike the copy-paste method without further understanding :(,
> although I also copy and paste code, but as least I try best to
> understand it.

I dislike it too. But in order to get this series into 6.12, it is
better to keep the copy-paste version, and then update the refactoring
patch to v2 for Andrew (rebasing and dropping is normal for
mm-unstable).


Huacai

>
> Regards
> Bibo Mao
> >
> >
> > Huacai
> >
> >>
> >> Regards
> >> Bibo Mao
> >>> potential problem may not be seen from experiment, but it is needed in
> >>> theory.
> >>>
> >>> Huacai
> >>>
> >>>> +
> >>>>    #define cache_op(op, addr)                                             \
> >>>>           __asm__ __volatile__(                                           \
> >>>>           "       cacop   %0, %1                                  \n"     \
> >>>> --
> >>>> 2.39.3
> >>>>
> >>>>
> >>
> >>
>