From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>

The execmem_update_copy() that used text poking was required when memory
allocated from ROX cache was always read-only. Since now its permissions
can be switched to read-write there is no need for a function that updates
memory with text poking.

Remove it.

Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
 include/linux/execmem.h | 13 -------------
 mm/execmem.c            |  5 -----
 2 files changed, 18 deletions(-)

diff --git a/include/linux/execmem.h b/include/linux/execmem.h
index 3be35680a54f..734fbe83d98e 100644
--- a/include/linux/execmem.h
+++ b/include/linux/execmem.h
@@ -185,19 +185,6 @@ DEFINE_FREE(execmem, void *, if (_T) execmem_free(_T));
 struct vm_struct *execmem_vmap(size_t size);
 #endif
 
-/**
- * execmem_update_copy - copy an update to executable memory
- * @dst: destination address to update
- * @src: source address containing the data
- * @size: how many bytes of memory shold be copied
- *
- * Copy @size bytes from @src to @dst using text poking if the memory at
- * @dst is read-only.
- *
- * Return: a pointer to @dst or NULL on error
- */
-void *execmem_update_copy(void *dst, const void *src, size_t size);
-
 /**
  * execmem_is_rox - check if execmem is read-only
  * @type - the execmem type to check
diff --git a/mm/execmem.c b/mm/execmem.c
index 2b683e7d864d..0712ebb4eb77 100644
--- a/mm/execmem.c
+++ b/mm/execmem.c
@@ -399,11 +399,6 @@ void execmem_free(void *ptr)
 	vfree(ptr);
 }
 
-void *execmem_update_copy(void *dst, const void *src, size_t size)
-{
-	return text_poke_copy(dst, src, size);
-}
-
 bool execmem_is_rox(enum execmem_type type)
 {
 	return !!(execmem_info->ranges[type].flags & EXECMEM_ROX_CACHE);
-- 
2.47.2
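
The commit message above contrasts two ways of updating protected memory: poking the new bytes in through a separate helper, or temporarily switching the region to read-write and copying normally. As a rough userspace analogy of the second approach (not the kernel code), the sketch below uses mprotect() on an anonymous mapping. The helper names update_via_temp_rw() and demo_temp_rw_update() are made up for illustration, and PROT_EXEC is deliberately left out so the example also runs on systems that forbid executable anonymous memory.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical userspace stand-in for "switch to read-write, copy, restore".
 * @region and @region_size must be page aligned and cover [dst, dst + size).
 * The region is modelled as read-only rather than read-only + executable.
 */
int update_via_temp_rw(void *region, size_t region_size,
		       void *dst, const void *src, size_t size)
{
	/* Make the whole region writable for the duration of the update. */
	if (mprotect(region, region_size, PROT_READ | PROT_WRITE) != 0)
		return -1;

	/* With write permission in place a plain memcpy() is enough. */
	memcpy(dst, src, size);

	/* Restore the read-only protection once the update is done. */
	return mprotect(region, region_size, PROT_READ);
}

/* Maps one page, makes it read-only, patches 4 bytes, checks the result. */
int demo_temp_rw_update(void)
{
	static const char patch[] = { 'e', 'x', 'e', 'c' };
	long page_size = sysconf(_SC_PAGESIZE);
	char *page;
	int ok;

	if (page_size <= 0)
		return -1;

	page = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (page == MAP_FAILED)
		return -1;

	/* Start from the "always read-only" state the commit message mentions. */
	if (mprotect(page, (size_t)page_size, PROT_READ) != 0) {
		munmap(page, (size_t)page_size);
		return -1;
	}

	ok = update_via_temp_rw(page, (size_t)page_size,
				page + 16, patch, sizeof(patch)) == 0 &&
	     memcmp(page + 16, patch, sizeof(patch)) == 0;

	munmap(page, (size_t)page_size);
	return ok ? 0 : -1;
}
```

demo_temp_rw_update() returns 0 when the bytes were written and the page was put back to read-only. The point of the analogy is only that once permissions can be flipped, no special copy routine is needed for the update itself.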
On 04/07/2025 at 15:49, Mike Rapoport wrote:
> From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
>
> The execmem_update_copy() that used text poking was required when memory
> allocated from ROX cache was always read-only. Since now its permissions
> can be switched to read-write there is no need for a function that updates
> memory with text poking.

Erm. Looks like I missed the patch that introduced this change.

On some variants of powerpc, namely book3s/32, this is not feasible.

The granularity for setting the NX (non exec) bit is 256 Mbyte segments.
So the area dedicated to execmem, [MODULES_VADDR; MODULES_END[, always has
the NX bit unset.

You can change any page within this area from ROX to RWX, but you can't make
it RW without X. If you want RW without X you must map it in the VMALLOC
area, as the VMALLOC area always has the NX bit set.

Christophe

>
> Remove it.
>
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> ---
>  include/linux/execmem.h | 13 -------------
>  mm/execmem.c            |  5 -----
>  2 files changed, 18 deletions(-)
>
> diff --git a/include/linux/execmem.h b/include/linux/execmem.h
> index 3be35680a54f..734fbe83d98e 100644
> --- a/include/linux/execmem.h
> +++ b/include/linux/execmem.h
> @@ -185,19 +185,6 @@ DEFINE_FREE(execmem, void *, if (_T) execmem_free(_T));
>  struct vm_struct *execmem_vmap(size_t size);
>  #endif
>
> -/**
> - * execmem_update_copy - copy an update to executable memory
> - * @dst: destination address to update
> - * @src: source address containing the data
> - * @size: how many bytes of memory shold be copied
> - *
> - * Copy @size bytes from @src to @dst using text poking if the memory at
> - * @dst is read-only.
> - *
> - * Return: a pointer to @dst or NULL on error
> - */
> -void *execmem_update_copy(void *dst, const void *src, size_t size);
> -
>  /**
>   * execmem_is_rox - check if execmem is read-only
>   * @type - the execmem type to check
> diff --git a/mm/execmem.c b/mm/execmem.c
> index 2b683e7d864d..0712ebb4eb77 100644
> --- a/mm/execmem.c
> +++ b/mm/execmem.c
> @@ -399,11 +399,6 @@ void execmem_free(void *ptr)
>  	vfree(ptr);
>  }
>
> -void *execmem_update_copy(void *dst, const void *src, size_t size)
> -{
> -	return text_poke_copy(dst, src, size);
> -}
> -
>  bool execmem_is_rox(enum execmem_type type)
>  {
>  	return !!(execmem_info->ranges[type].flags & EXECMEM_ROX_CACHE);
On Mon, Jul 07, 2025 at 12:10:43PM +0200, Christophe Leroy wrote:
>
> On 04/07/2025 at 15:49, Mike Rapoport wrote:
> > From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
> >
> > The execmem_update_copy() that used text poking was required when memory
> > allocated from ROX cache was always read-only. Since now its permissions
> > can be switched to read-write there is no need for a function that updates
> > memory with text poking.
>
> Erm. Looks like I missed the patch that introduced this change.
>
> On some variants of powerpc, namely book3s/32, this is not feasible.

The only user of EXECMEM_ROX_CACHE for now is x86-64; we can always revisit
this when powerpc book3s/32 wants to opt in to cache usage.

And it seems that [MODULES_VADDR, MODULES_END] is already mapped with
"large pages", isn't it?

> The granularity for setting the NX (non exec) bit is 256 Mbyte segments.
> So the area dedicated to execmem, [MODULES_VADDR; MODULES_END[, always has
> the NX bit unset.
>
> You can change any page within this area from ROX to RWX, but you can't make
> it RW without X. If you want RW without X you must map it in the VMALLOC
> area, as the VMALLOC area always has the NX bit set.

So what will happen when one calls

	set_memory_nx()
	set_memory_rw()

in such areas?

> Christophe

-- 
Sincerely yours,
Mike.
On 07/07/2025 at 13:49, Mike Rapoport wrote:
> On Mon, Jul 07, 2025 at 12:10:43PM +0200, Christophe Leroy wrote:
>>
>> On 04/07/2025 at 15:49, Mike Rapoport wrote:
>>> From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
>>>
>>> The execmem_update_copy() that used text poking was required when memory
>>> allocated from ROX cache was always read-only. Since now its permissions
>>> can be switched to read-write there is no need for a function that updates
>>> memory with text poking.
>>
>> Erm. Looks like I missed the patch that introduced this change.
>>
>> On some variants of powerpc, namely book3s/32, this is not feasible.
>
> The only user of EXECMEM_ROX_CACHE for now is x86-64; we can always revisit
> this when powerpc book3s/32 wants to opt in to cache usage.
>
> And it seems that [MODULES_VADDR, MODULES_END] is already mapped with
> "large pages", isn't it?

I don't think so. It uses execmem_alloc(), which sets VM_ALLOW_HUGE_VMAP only
when using EXECMEM_ROX_CACHE. And book3s/32 doesn't have large pages.

Only 8xx has large pages, but they are not PMD aligned (PMD_SIZE is 4M while
large pages are 512k and 8M), so it wouldn't work well with the existing
execmem_vmalloc().

>
>> The granularity for setting the NX (non exec) bit is 256 Mbyte segments.
>> So the area dedicated to execmem, [MODULES_VADDR; MODULES_END[, always has
>> the NX bit unset.
>>
>> You can change any page within this area from ROX to RWX, but you can't make
>> it RW without X. If you want RW without X you must map it in the VMALLOC
>> area, as the VMALLOC area always has the NX bit set.
>
> So what will happen when one calls
>
> 	set_memory_nx()
> 	set_memory_rw()
>
> in such areas?

Nothing will happen. It will successfully unset the X bit on the PTE, but
that will be ignored by the HW, which relies only on the segment's NX bit.
That bit is set for the entire VMALLOC area and unset for the entire MODULE
area.

That's one of the reasons why it has CONFIG_ARCH_WANTS_MODULES_DATA_IN_VMALLOC:
to make sure text is allocated in the exec area and data in the no-exec area.

Christophe
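
The behaviour described in this message can be captured in a small model. The sketch below is not kernel code: the struct, the model_* helpers and the two example addresses are invented for illustration and do not reflect the real book3s/32 memory layout. It only encodes what the message states: executability is decided per 256 Mbyte segment, and the X bit stored in the PTE is ignored by the hardware.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEGMENT_SHIFT 28	/* 256 Mbyte segments */
#define NR_SEGMENTS   16	/* 4 Gbyte address space / 256 Mbyte */

#define PTE_WRITE 0x1u
#define PTE_EXEC  0x2u

/* Illustrative addresses only, not the real book3s/32 layout. */
#define EXAMPLE_MODULES_ADDR 0xb0001000u
#define EXAMPLE_VMALLOC_ADDR 0xf1000000u

struct segment_model {
	bool nx[NR_SEGMENTS];	/* one NX bit per 256 Mbyte segment */
};

/* NX is set for the VMALLOC segment and left unset for the MODULE segment. */
struct segment_model example_layout(void)
{
	struct segment_model m = { { false } };

	m.nx[EXAMPLE_VMALLOC_ADDR >> SEGMENT_SHIFT] = true;
	return m;
}

/* Model of set_memory_nx(): clears the X bit in the PTE flags. */
unsigned int model_set_memory_nx(unsigned int pte)
{
	return pte & ~PTE_EXEC;
}

/* Model of set_memory_rw(): sets the write bit in the PTE flags. */
unsigned int model_set_memory_rw(unsigned int pte)
{
	return pte | PTE_WRITE;
}

/* The modelled hardware consults only the segment NX bit, never the PTE. */
bool effectively_executable(const struct segment_model *m,
			    uint32_t addr, unsigned int pte)
{
	(void)pte;	/* the PTE X bit is ignored, as described above */
	return !m->nx[addr >> SEGMENT_SHIFT];
}
```

In this model, applying model_set_memory_nx() and then model_set_memory_rw() to a page at EXAMPLE_MODULES_ADDR leaves the PTE flags at PTE_WRITE only, yet effectively_executable() still returns true. The same flags at EXAMPLE_VMALLOC_ADDR give false, which is the text-versus-data split the message attributes to CONFIG_ARCH_WANTS_MODULES_DATA_IN_VMALLOC.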
On Mon, Jul 07, 2025 at 03:02:15PM +0200, Christophe Leroy wrote:
>
>
> On 07/07/2025 at 13:49, Mike Rapoport wrote:
> > On Mon, Jul 07, 2025 at 12:10:43PM +0200, Christophe Leroy wrote:
> > >
> > > On 04/07/2025 at 15:49, Mike Rapoport wrote:
> > > > From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
> > > >
> > > > The execmem_update_copy() that used text poking was required when memory
> > > > allocated from ROX cache was always read-only. Since now its permissions
> > > > can be switched to read-write there is no need for a function that updates
> > > > memory with text poking.
> > >
> > > Erm. Looks like I missed the patch that introduced this change.
> > >
> > > On some variants of powerpc, namely book3s/32, this is not feasible.
> >
> > The only user of EXECMEM_ROX_CACHE for now is x86-64; we can always revisit
> > this when powerpc book3s/32 wants to opt in to cache usage.
> >
> > And it seems that [MODULES_VADDR, MODULES_END] is already mapped with
> > "large pages", isn't it?
>
> I don't think so. It uses execmem_alloc(), which sets VM_ALLOW_HUGE_VMAP only
> when using EXECMEM_ROX_CACHE. And book3s/32 doesn't have large pages.
>
> Only 8xx has large pages, but they are not PMD aligned (PMD_SIZE is 4M while
> large pages are 512k and 8M), so it wouldn't work well with the existing
> execmem_vmalloc().

PMD_SIZE can be replaced with one of the arch_vmap size helpers if needed,
or even parametrized in execmem_info.

> > > The granularity for setting the NX (non exec) bit is 256 Mbyte segments.
> > > So the area dedicated to execmem, [MODULES_VADDR; MODULES_END[, always has
> > > the NX bit unset.
> > >
> > > You can change any page within this area from ROX to RWX, but you can't make
> > > it RW without X. If you want RW without X you must map it in the VMALLOC
> > > area, as the VMALLOC area always has the NX bit set.
> >
> > So what will happen when one calls
> >
> > 	set_memory_nx()
> > 	set_memory_rw()
> >
> > in such areas?
>
> Nothing will happen. It will successfully unset the X bit on the PTE, but
> that will be ignored by the HW, which relies only on the segment's NX bit.
> That bit is set for the entire VMALLOC area and unset for the entire MODULE
> area.

And set_memory_rw() will essentially make the mapping RWX if it's in the
MODULE area?

> Christophe
>

-- 
Sincerely yours,
Mike.
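
The suggestion in this message to parametrize the cache block size rather than hard-code PMD_SIZE could look roughly like the sketch below. This is only an illustration of the arithmetic: struct cache_params, its block_size field and cache_alloc_size() are hypothetical names, and nothing in the thread confirms that execmem_info has or will get such a field. The sizes are the ones quoted for 8xx (4M PMD_SIZE, 512k and 8M large pages).

```c
#include <stddef.h>

#define SZ_512K ((size_t)512 << 10)
#define SZ_4M   ((size_t)4 << 20)
#define SZ_8M   ((size_t)8 << 20)

/* Hypothetical per-architecture parameter, not an existing execmem_info field. */
struct cache_params {
	size_t block_size;	/* must be a power of two */
};

/* Round a request up to a whole number of cache blocks. */
size_t cache_alloc_size(const struct cache_params *p, size_t size)
{
	return (size + p->block_size - 1) & ~(p->block_size - 1);
}
```

With block_size set to SZ_4M, a 100 Kbyte request is rounded up to 4M, which matches neither large page size mentioned for 8xx. With block_size set to SZ_512K or SZ_8M, the same request is rounded to a size that is a whole number of large pages.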