The preventive sfence.vma were emitted because new mappings must be made
visible to the page table walker but Svvptc guarantees that it will
happen within a bounded timeframe, so no need to sfence.vma for the uarchs
that implement this extension, we will then take gratuitous (but very
unlikely) page faults, similarly to x86 and arm64.
This allows to drastically reduce the number of sfence.vma emitted:
* Ubuntu boot to login:
Before: ~630k sfence.vma
After: ~200k sfence.vma
* ltp - mmapstress01
Before: ~45k
After: ~6.3k
* lmbench - lat_pagefault
Before: ~665k
After: 832 (!)
* lmbench - lat_mmap
Before: ~546k
After: 718 (!)
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
---
arch/riscv/include/asm/pgtable.h | 16 +++++++++++++++-
arch/riscv/mm/pgtable.c | 13 +++++++++++++
2 files changed, 28 insertions(+), 1 deletion(-)
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index aad8b8ca51f1..816147e25ca9 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -476,6 +476,9 @@ static inline void update_mmu_cache_range(struct vm_fault *vmf,
struct vm_area_struct *vma, unsigned long address,
pte_t *ptep, unsigned int nr)
{
+ asm goto(ALTERNATIVE("nop", "j %l[svvptc]", 0, RISCV_ISA_EXT_SVVPTC, 1)
+ : : : : svvptc);
+
/*
* The kernel assumes that TLBs don't cache invalid entries, but
* in RISC-V, SFENCE.VMA specifies an ordering constraint, not a
@@ -485,12 +488,23 @@ static inline void update_mmu_cache_range(struct vm_fault *vmf,
*/
while (nr--)
local_flush_tlb_page(address + nr * PAGE_SIZE);
+
+svvptc:
+ /*
+ * Svvptc guarantees that the new valid pte will be visible within
+ * a bounded timeframe, so when the uarch does not cache invalid
+ * entries, we don't have to do anything.
+ */
}
#define update_mmu_cache(vma, addr, ptep) \
update_mmu_cache_range(NULL, vma, addr, ptep, 1)
#define __HAVE_ARCH_UPDATE_MMU_TLB
-#define update_mmu_tlb update_mmu_cache
+static inline void update_mmu_tlb(struct vm_area_struct *vma,
+ unsigned long address, pte_t *ptep)
+{
+ flush_tlb_range(vma, address, address + PAGE_SIZE);
+}
static inline void update_mmu_cache_pmd(struct vm_area_struct *vma,
unsigned long address, pmd_t *pmdp)
diff --git a/arch/riscv/mm/pgtable.c b/arch/riscv/mm/pgtable.c
index 533ec9055fa0..4ae67324f992 100644
--- a/arch/riscv/mm/pgtable.c
+++ b/arch/riscv/mm/pgtable.c
@@ -9,6 +9,9 @@ int ptep_set_access_flags(struct vm_area_struct *vma,
unsigned long address, pte_t *ptep,
pte_t entry, int dirty)
{
+ asm goto(ALTERNATIVE("nop", "j %l[svvptc]", 0, RISCV_ISA_EXT_SVVPTC, 1)
+ : : : : svvptc);
+
if (!pte_same(ptep_get(ptep), entry))
__set_pte_at(vma->vm_mm, ptep, entry);
/*
@@ -16,6 +19,16 @@ int ptep_set_access_flags(struct vm_area_struct *vma,
* the case that the PTE changed and the spurious fault case.
*/
return true;
+
+svvptc:
+ if (!pte_same(ptep_get(ptep), entry)) {
+ __set_pte_at(vma->vm_mm, ptep, entry);
+ /* Here only not svadu is impacted */
+ flush_tlb_page(vma, address);
+ return true;
+ }
+
+ return false;
}
int ptep_test_and_clear_young(struct vm_area_struct *vma,
--
2.39.2
Hi Alexandre,
kernel test robot noticed the following build errors:
[auto build test ERROR on robh/for-next]
[also build test ERROR on linus/master v6.10-rc7]
[cannot apply to next-20240710]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Alexandre-Ghiti/riscv-Add-ISA-extension-parsing-for-Svvptc/20240702-171920
base: https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git for-next
patch link: https://lore.kernel.org/r/20240702085034.48395-5-alexghiti%40rivosinc.com
patch subject: [PATCH v3 4/4] riscv: Stop emitting preventive sfence.vma for new userspace mappings with Svvptc
config: riscv-allmodconfig (https://download.01.org/0day-ci/archive/20240711/202407111151.m5cK0E6R-lkp@intel.com/config)
compiler: clang version 19.0.0git (https://github.com/llvm/llvm-project a0c6b8aef853eedaa0980f07c0a502a5a8a9740e)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240711/202407111151.m5cK0E6R-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202407111151.m5cK0E6R-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from lib/test_bitops.c:10:
In file included from include/linux/module.h:19:
In file included from include/linux/elf.h:6:
In file included from arch/riscv/include/asm/elf.h:12:
In file included from include/linux/compat.h:17:
In file included from include/linux/fs.h:33:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:9:
In file included from include/linux/sched/task.h:13:
In file included from include/linux/uaccess.h:11:
In file included from arch/riscv/include/asm/uaccess.h:12:
>> arch/riscv/include/asm/pgtable.h:498:1: error: label at end of compound statement is a C23 extension [-Werror,-Wc23-extensions]
498 | }
| ^
1 error generated.
vim +498 arch/riscv/include/asm/pgtable.h
07037db5d479f9 Palmer Dabbelt 2017-07-10 469
07037db5d479f9 Palmer Dabbelt 2017-07-10 470 #define pgd_ERROR(e) \
07037db5d479f9 Palmer Dabbelt 2017-07-10 471 pr_err("%s:%d: bad pgd " PTE_FMT ".\n", __FILE__, __LINE__, pgd_val(e))
07037db5d479f9 Palmer Dabbelt 2017-07-10 472
07037db5d479f9 Palmer Dabbelt 2017-07-10 473
07037db5d479f9 Palmer Dabbelt 2017-07-10 474 /* Commit new configuration to MMU hardware */
864609c6a0b5f0 Matthew Wilcox (Oracle 2023-08-02 475) static inline void update_mmu_cache_range(struct vm_fault *vmf,
864609c6a0b5f0 Matthew Wilcox (Oracle 2023-08-02 476) struct vm_area_struct *vma, unsigned long address,
864609c6a0b5f0 Matthew Wilcox (Oracle 2023-08-02 477) pte_t *ptep, unsigned int nr)
07037db5d479f9 Palmer Dabbelt 2017-07-10 478 {
b5bdff9ee1fdca Alexandre Ghiti 2024-07-02 479 asm goto(ALTERNATIVE("nop", "j %l[svvptc]", 0, RISCV_ISA_EXT_SVVPTC, 1)
b5bdff9ee1fdca Alexandre Ghiti 2024-07-02 480 : : : : svvptc);
b5bdff9ee1fdca Alexandre Ghiti 2024-07-02 481
07037db5d479f9 Palmer Dabbelt 2017-07-10 482 /*
07037db5d479f9 Palmer Dabbelt 2017-07-10 483 * The kernel assumes that TLBs don't cache invalid entries, but
07037db5d479f9 Palmer Dabbelt 2017-07-10 484 * in RISC-V, SFENCE.VMA specifies an ordering constraint, not a
07037db5d479f9 Palmer Dabbelt 2017-07-10 485 * cache flush; it is necessary even after writing invalid entries.
07037db5d479f9 Palmer Dabbelt 2017-07-10 486 * Relying on flush_tlb_fix_spurious_fault would suffice, but
07037db5d479f9 Palmer Dabbelt 2017-07-10 487 * the extra traps reduce performance. So, eagerly SFENCE.VMA.
07037db5d479f9 Palmer Dabbelt 2017-07-10 488 */
864609c6a0b5f0 Matthew Wilcox (Oracle 2023-08-02 489) while (nr--)
864609c6a0b5f0 Matthew Wilcox (Oracle 2023-08-02 490) local_flush_tlb_page(address + nr * PAGE_SIZE);
b5bdff9ee1fdca Alexandre Ghiti 2024-07-02 491
b5bdff9ee1fdca Alexandre Ghiti 2024-07-02 492 svvptc:
b5bdff9ee1fdca Alexandre Ghiti 2024-07-02 493 /*
b5bdff9ee1fdca Alexandre Ghiti 2024-07-02 494 * Svvptc guarantees that the new valid pte will be visible within
b5bdff9ee1fdca Alexandre Ghiti 2024-07-02 495 * a bounded timeframe, so when the uarch does not cache invalid
b5bdff9ee1fdca Alexandre Ghiti 2024-07-02 496 * entries, we don't have to do anything.
b5bdff9ee1fdca Alexandre Ghiti 2024-07-02 497 */
07037db5d479f9 Palmer Dabbelt 2017-07-10 @498 }
864609c6a0b5f0 Matthew Wilcox (Oracle 2023-08-02 499) #define update_mmu_cache(vma, addr, ptep) \
864609c6a0b5f0 Matthew Wilcox (Oracle 2023-08-02 500) update_mmu_cache_range(NULL, vma, addr, ptep, 1)
07037db5d479f9 Palmer Dabbelt 2017-07-10 501
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Hi Alexandre,
kernel test robot noticed the following build errors:
[auto build test ERROR on robh/for-next]
[also build test ERROR on linus/master v6.10-rc7]
[cannot apply to next-20240710]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Alexandre-Ghiti/riscv-Add-ISA-extension-parsing-for-Svvptc/20240702-171920
base: https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git for-next
patch link: https://lore.kernel.org/r/20240702085034.48395-5-alexghiti%40rivosinc.com
patch subject: [PATCH v3 4/4] riscv: Stop emitting preventive sfence.vma for new userspace mappings with Svvptc
config: riscv-randconfig-001-20240711 (https://download.01.org/0day-ci/archive/20240711/202407110946.e0VNIrJP-lkp@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240711/202407110946.e0VNIrJP-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202407110946.e0VNIrJP-lkp@intel.com/
All errors (new ones prefixed by >>):
In file included from arch/riscv/kernel/asm-offsets.c:10:
In file included from include/linux/mm.h:30:
In file included from include/linux/pgtable.h:6:
>> arch/riscv/include/asm/pgtable.h:498:1: error: expected statement
}
^
In file included from arch/riscv/kernel/asm-offsets.c:10:
In file included from include/linux/mm.h:1115:
In file included from include/linux/huge_mm.h:8:
In file included from include/linux/fs.h:33:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:98:11: warning: array index 3 is past the end of the array (which contains 1 element) [-Warray-bounds]
return (set->sig[3] | set->sig[2] |
^ ~
include/uapi/asm-generic/signal.h:62:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
In file included from arch/riscv/kernel/asm-offsets.c:10:
In file included from include/linux/mm.h:1115:
In file included from include/linux/huge_mm.h:8:
In file included from include/linux/fs.h:33:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:98:25: warning: array index 2 is past the end of the array (which contains 1 element) [-Warray-bounds]
return (set->sig[3] | set->sig[2] |
^ ~
include/uapi/asm-generic/signal.h:62:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
In file included from arch/riscv/kernel/asm-offsets.c:10:
In file included from include/linux/mm.h:1115:
In file included from include/linux/huge_mm.h:8:
In file included from include/linux/fs.h:33:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:99:4: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
set->sig[1] | set->sig[0]) == 0;
^ ~
include/uapi/asm-generic/signal.h:62:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
In file included from arch/riscv/kernel/asm-offsets.c:10:
In file included from include/linux/mm.h:1115:
In file included from include/linux/huge_mm.h:8:
In file included from include/linux/fs.h:33:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:101:11: warning: array index 1 is past the end of the array (which contains 1 element) [-Warray-bounds]
return (set->sig[1] | set->sig[0]) == 0;
^ ~
include/uapi/asm-generic/signal.h:62:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
In file included from arch/riscv/kernel/asm-offsets.c:10:
In file included from include/linux/mm.h:1115:
In file included from include/linux/huge_mm.h:8:
In file included from include/linux/fs.h:33:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:114:11: warning: array index 3 is past the end of the array (which contains 1 element) [-Warray-bounds]
return (set1->sig[3] == set2->sig[3]) &&
^ ~
include/uapi/asm-generic/signal.h:62:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
In file included from arch/riscv/kernel/asm-offsets.c:10:
In file included from include/linux/mm.h:1115:
In file included from include/linux/huge_mm.h:8:
In file included from include/linux/fs.h:33:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:114:27: warning: array index 3 is past the end of the array (which contains 1 element) [-Warray-bounds]
return (set1->sig[3] == set2->sig[3]) &&
^ ~
include/uapi/asm-generic/signal.h:62:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
In file included from arch/riscv/kernel/asm-offsets.c:10:
In file included from include/linux/mm.h:1115:
In file included from include/linux/huge_mm.h:8:
In file included from include/linux/fs.h:33:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
include/linux/signal.h:115:5: warning: array index 2 is past the end of the array (which contains 1 element) [-Warray-bounds]
(set1->sig[2] == set2->sig[2]) &&
^ ~
include/uapi/asm-generic/signal.h:62:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
In file included from arch/riscv/kernel/asm-offsets.c:10:
In file included from include/linux/mm.h:1115:
In file included from include/linux/huge_mm.h:8:
In file included from include/linux/fs.h:33:
In file included from include/linux/percpu-rwsem.h:7:
In file included from include/linux/rcuwait.h:6:
In file included from include/linux/sched/signal.h:6:
vim +498 arch/riscv/include/asm/pgtable.h
07037db5d479f9 Palmer Dabbelt 2017-07-10 469
07037db5d479f9 Palmer Dabbelt 2017-07-10 470 #define pgd_ERROR(e) \
07037db5d479f9 Palmer Dabbelt 2017-07-10 471 pr_err("%s:%d: bad pgd " PTE_FMT ".\n", __FILE__, __LINE__, pgd_val(e))
07037db5d479f9 Palmer Dabbelt 2017-07-10 472
07037db5d479f9 Palmer Dabbelt 2017-07-10 473
07037db5d479f9 Palmer Dabbelt 2017-07-10 474 /* Commit new configuration to MMU hardware */
864609c6a0b5f0 Matthew Wilcox (Oracle 2023-08-02 475) static inline void update_mmu_cache_range(struct vm_fault *vmf,
864609c6a0b5f0 Matthew Wilcox (Oracle 2023-08-02 476) struct vm_area_struct *vma, unsigned long address,
864609c6a0b5f0 Matthew Wilcox (Oracle 2023-08-02 477) pte_t *ptep, unsigned int nr)
07037db5d479f9 Palmer Dabbelt 2017-07-10 478 {
b5bdff9ee1fdca Alexandre Ghiti 2024-07-02 479 asm goto(ALTERNATIVE("nop", "j %l[svvptc]", 0, RISCV_ISA_EXT_SVVPTC, 1)
b5bdff9ee1fdca Alexandre Ghiti 2024-07-02 480 : : : : svvptc);
b5bdff9ee1fdca Alexandre Ghiti 2024-07-02 481
07037db5d479f9 Palmer Dabbelt 2017-07-10 482 /*
07037db5d479f9 Palmer Dabbelt 2017-07-10 483 * The kernel assumes that TLBs don't cache invalid entries, but
07037db5d479f9 Palmer Dabbelt 2017-07-10 484 * in RISC-V, SFENCE.VMA specifies an ordering constraint, not a
07037db5d479f9 Palmer Dabbelt 2017-07-10 485 * cache flush; it is necessary even after writing invalid entries.
07037db5d479f9 Palmer Dabbelt 2017-07-10 486 * Relying on flush_tlb_fix_spurious_fault would suffice, but
07037db5d479f9 Palmer Dabbelt 2017-07-10 487 * the extra traps reduce performance. So, eagerly SFENCE.VMA.
07037db5d479f9 Palmer Dabbelt 2017-07-10 488 */
864609c6a0b5f0 Matthew Wilcox (Oracle 2023-08-02 489) while (nr--)
864609c6a0b5f0 Matthew Wilcox (Oracle 2023-08-02 490) local_flush_tlb_page(address + nr * PAGE_SIZE);
b5bdff9ee1fdca Alexandre Ghiti 2024-07-02 491
b5bdff9ee1fdca Alexandre Ghiti 2024-07-02 492 svvptc:
b5bdff9ee1fdca Alexandre Ghiti 2024-07-02 493 /*
b5bdff9ee1fdca Alexandre Ghiti 2024-07-02 494 * Svvptc guarantees that the new valid pte will be visible within
b5bdff9ee1fdca Alexandre Ghiti 2024-07-02 495 * a bounded timeframe, so when the uarch does not cache invalid
b5bdff9ee1fdca Alexandre Ghiti 2024-07-02 496 * entries, we don't have to do anything.
b5bdff9ee1fdca Alexandre Ghiti 2024-07-02 497 */
07037db5d479f9 Palmer Dabbelt 2017-07-10 @498 }
864609c6a0b5f0 Matthew Wilcox (Oracle 2023-08-02 499) #define update_mmu_cache(vma, addr, ptep) \
864609c6a0b5f0 Matthew Wilcox (Oracle 2023-08-02 500) update_mmu_cache_range(NULL, vma, addr, ptep, 1)
07037db5d479f9 Palmer Dabbelt 2017-07-10 501
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
© 2016 - 2025 Red Hat, Inc.