From nobody Sat Feb 7 18:28:58 2026 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.21]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 07BE3148319 for ; Thu, 15 Feb 2024 23:14:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.21 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708038883; cv=none; b=s3uEyJeRhpCPhhWwXWG+XWOH00MOhou0/9YLeKRcPIjA1R4wUWWvSAXHIHsdmuFNebppFzWnqbi5I1ZDxKFp0LxcPtpK+y3q0ol8L5moxtDDNcqx5GLVOK5lgNA1+MHvKWgdUlFXIlu36e1C0N9mImrob814HRjkvzEp5F7CiL0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708038883; c=relaxed/simple; bh=9ZHYd2M5Kk8D0EsHHIG1lih+Sz46XE6ZpNULUyzgA8g=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=grKqUhYab++IPNX94xEy46e8rqbdyrQ+CTG1nAHMqtz9SVKVZxnsEvkrecIYsK50DWAsZEc2M9gSI0I9laVadaQCQ8Moq7BeVjaXfHryghRP1i8BVm9Qw6TqZaKJCEcfUp5+mU4p1OP9vZUPQCZAOONH1XqFh9snlDzMXg3uAlo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=G6erGEjb; arc=none smtp.client-ip=198.175.65.21 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="G6erGEjb" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708038880; x=1739574880; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=9ZHYd2M5Kk8D0EsHHIG1lih+Sz46XE6ZpNULUyzgA8g=; b=G6erGEjbjIeTqyilQsW4/pLL/VKTQES9xNjK+6HrOVQtyJEymyNIgUfc GfP/0GKFzolNF8RVnUTPFZv2O+hgLma6wpGKJIhyhq1qEUAqEJTSB6guD cZ4zp6zQiuRvOmLDPuayHZWWi3rhDWDPmBRFeFPr0TsvrkxJWKQ3yiWJv IHvBHNl/xbBv9dRGhZ8MyLPTP9U+WxMyOJL89AXdFgTScYICPXVQbQud3 v5+ID04Ktby/FvjPBo3LV4vdkEhRAQsKpXRQq6SXw/G1S92h9CrKeI7/A UHSt7Hw0IU31vFJtJsuYxWeqOSixIw2GXd1UbR3Eunc1KKhfTUFm1Flu7 A==; X-IronPort-AV: E=McAfee;i="6600,9927,10985"; a="2066314" X-IronPort-AV: E=Sophos;i="6.06,162,1705392000"; d="scan'208";a="2066314" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orvoesa113.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Feb 2024 15:14:38 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10985"; a="912250188" X-IronPort-AV: E=Sophos;i="6.06,162,1705392000"; d="scan'208";a="912250188" Received: from yshin-mobl1.amr.corp.intel.com (HELO rpedgeco-desk4.intel.com) ([10.209.95.133]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Feb 2024 15:14:37 -0800 From: Rick Edgecombe To: Liam.Howlett@oracle.com, akpm@linux-foundation.org, debug@rivosinc.com, broonie@kernel.org, kirill.shutemov@linux.intel.com, keescook@chromium.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, luto@kernel.org, peterz@infradead.org, hpa@zytor.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: rick.p.edgecombe@intel.com Subject: [RFC PATCH 1/8] mm: Switch mm->get_unmapped_area() to a flag Date: Thu, 15 Feb 2024 15:13:25 -0800 Message-Id: <20240215231332.1556787-2-rick.p.edgecombe@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240215231332.1556787-1-rick.p.edgecombe@intel.com> References: <20240215231332.1556787-1-rick.p.edgecombe@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The mm_struct contains a function pointer *get_unmapped_area(), which is set to either arch_get_unmapped_area() or arch_get_unmapped_area_topdown() during the initialization of the mm. Since the function pointer only ever points to two functions that are named the same across all arch's, a function pointer is not really required. In addition future changes will want to add versions of the functions that take additional arguments. So to save a pointers worth of bytes in mm_struct, and prevent adding additional function pointers to mm_struct in future changes, remove it and keep the information about which get_unmapped_area() to use in a flag. Introduce a helper, mm_get_unmapped_area(), to easily convert code that refers to the old function pointer to instead select and call either arch_get_unmapped_area() or arch_get_unmapped_area_topdown() based on the flag. Then drop the mm->get_unmapped_area() function pointer. Leave the get_unmapped_area() pointer in struct file_operations alone. The main purpose of this change is to reorganize in preparation for future changes, but it also converts the calls of mm->get_unmapped_area() from indirect branches into a direct ones. The stress-ng bigheap benchmark calls realloc a lot, which calls through get_unmapped_area() in the kernel. On x86, the change yielded a ~4% improvement there. (bogo ops/s (usr+sys time)) In testing a few x86 configs, removing the pointer unfortunately didn't result in any actual size reductions in the compiled layout of mm_struct. But depending on compiler or arch alignment requirements, the change could possibly shrink the size of mm_struct. Signed-off-by: Rick Edgecombe Acked-by: Dave Hansen Acked-by: Liam R. Howlett Reviewed-by: Kirill A. Shutemov --- arch/s390/mm/hugetlbpage.c | 2 +- arch/s390/mm/mmap.c | 4 ++-- arch/sparc/kernel/sys_sparc_64.c | 15 ++++++--------- arch/sparc/mm/hugetlbpage.c | 2 +- arch/x86/kernel/cpu/sgx/driver.c | 2 +- arch/x86/mm/hugetlbpage.c | 2 +- arch/x86/mm/mmap.c | 4 ++-- drivers/char/mem.c | 2 +- drivers/dax/device.c | 6 +++--- fs/hugetlbfs/inode.c | 2 +- fs/proc/inode.c | 15 ++++++++------- fs/ramfs/file-mmu.c | 2 +- include/linux/mm_types.h | 6 +----- include/linux/sched/coredump.h | 1 + include/linux/sched/mm.h | 5 +++++ io_uring/io_uring.c | 2 +- mm/debug.c | 6 ------ mm/huge_memory.c | 6 +++--- mm/mmap.c | 21 ++++++++++++++++++--- mm/shmem.c | 11 +++++------ mm/util.c | 6 +++--- 21 files changed, 65 insertions(+), 57 deletions(-) diff --git a/arch/s390/mm/hugetlbpage.c b/arch/s390/mm/hugetlbpage.c index 297a6d897d5a..c2d2850ec8d5 100644 --- a/arch/s390/mm/hugetlbpage.c +++ b/arch/s390/mm/hugetlbpage.c @@ -328,7 +328,7 @@ unsigned long hugetlb_get_unmapped_area(struct file *fi= le, unsigned long addr, goto check_asce_limit; } =20 - if (mm->get_unmapped_area =3D=3D arch_get_unmapped_area) + if (!test_bit(MMF_TOPDOWN, &mm->flags)) addr =3D hugetlb_get_unmapped_area_bottomup(file, addr, len, pgoff, flags); else diff --git a/arch/s390/mm/mmap.c b/arch/s390/mm/mmap.c index fc9a7dc26c5e..cd52d72b59cf 100644 --- a/arch/s390/mm/mmap.c +++ b/arch/s390/mm/mmap.c @@ -182,10 +182,10 @@ void arch_pick_mmap_layout(struct mm_struct *mm, stru= ct rlimit *rlim_stack) */ if (mmap_is_legacy(rlim_stack)) { mm->mmap_base =3D mmap_base_legacy(random_factor); - mm->get_unmapped_area =3D arch_get_unmapped_area; + clear_bit(MMF_TOPDOWN, &mm->flags); } else { mm->mmap_base =3D mmap_base(random_factor, rlim_stack); - mm->get_unmapped_area =3D arch_get_unmapped_area_topdown; + set_bit(MMF_TOPDOWN, &mm->flags); } } =20 diff --git a/arch/sparc/kernel/sys_sparc_64.c b/arch/sparc/kernel/sys_sparc= _64.c index 1e9a9e016237..1dbf7211666e 100644 --- a/arch/sparc/kernel/sys_sparc_64.c +++ b/arch/sparc/kernel/sys_sparc_64.c @@ -218,14 +218,10 @@ arch_get_unmapped_area_topdown(struct file *filp, con= st unsigned long addr0, unsigned long get_fb_unmapped_area(struct file *filp, unsigned long orig_a= ddr, unsigned long len, unsigned long pgoff, unsigned long flags) { unsigned long align_goal, addr =3D -ENOMEM; - unsigned long (*get_area)(struct file *, unsigned long, - unsigned long, unsigned long, unsigned long); - - get_area =3D current->mm->get_unmapped_area; =20 if (flags & MAP_FIXED) { /* Ok, don't mess with it. */ - return get_area(NULL, orig_addr, len, pgoff, flags); + return mm_get_unmapped_area(current->mm, NULL, orig_addr, len, pgoff, fl= ags); } flags &=3D ~MAP_SHARED; =20 @@ -238,7 +234,8 @@ unsigned long get_fb_unmapped_area(struct file *filp, u= nsigned long orig_addr, u align_goal =3D (64UL * 1024); =20 do { - addr =3D get_area(NULL, orig_addr, len + (align_goal - PAGE_SIZE), pgoff= , flags); + addr =3D mm_get_unmapped_area(current->mm, NULL, orig_addr, + len + (align_goal - PAGE_SIZE), pgoff, flags); if (!(addr & ~PAGE_MASK)) { addr =3D (addr + (align_goal - 1UL)) & ~(align_goal - 1UL); break; @@ -256,7 +253,7 @@ unsigned long get_fb_unmapped_area(struct file *filp, u= nsigned long orig_addr, u * be obtained. */ if (addr & ~PAGE_MASK) - addr =3D get_area(NULL, orig_addr, len, pgoff, flags); + addr =3D mm_get_unmapped_area(current->mm, NULL, orig_addr, len, pgoff, = flags); =20 return addr; } @@ -292,7 +289,7 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct= rlimit *rlim_stack) gap =3D=3D RLIM_INFINITY || sysctl_legacy_va_layout) { mm->mmap_base =3D TASK_UNMAPPED_BASE + random_factor; - mm->get_unmapped_area =3D arch_get_unmapped_area; + clear_bit(MMF_TOPDOWN, &mm->flags); } else { /* We know it's 32-bit */ unsigned long task_size =3D STACK_TOP32; @@ -303,7 +300,7 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct= rlimit *rlim_stack) gap =3D (task_size / 6 * 5); =20 mm->mmap_base =3D PAGE_ALIGN(task_size - gap - random_factor); - mm->get_unmapped_area =3D arch_get_unmapped_area_topdown; + set_bit(MMF_TOPDOWN, &mm->flags); } } =20 diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c index b432500c13a5..38a1bef47efb 100644 --- a/arch/sparc/mm/hugetlbpage.c +++ b/arch/sparc/mm/hugetlbpage.c @@ -123,7 +123,7 @@ hugetlb_get_unmapped_area(struct file *file, unsigned l= ong addr, (!vma || addr + len <=3D vm_start_gap(vma))) return addr; } - if (mm->get_unmapped_area =3D=3D arch_get_unmapped_area) + if (!test_bit(MMF_TOPDOWN, &mm->flags)) return hugetlb_get_unmapped_area_bottomup(file, addr, len, pgoff, flags); else diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/dri= ver.c index 262f5fb18d74..22b65a5f5ec6 100644 --- a/arch/x86/kernel/cpu/sgx/driver.c +++ b/arch/x86/kernel/cpu/sgx/driver.c @@ -113,7 +113,7 @@ static unsigned long sgx_get_unmapped_area(struct file = *file, if (flags & MAP_FIXED) return addr; =20 - return current->mm->get_unmapped_area(file, addr, len, pgoff, flags); + return mm_get_unmapped_area(current->mm, file, addr, len, pgoff, flags); } =20 #ifdef CONFIG_COMPAT diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c index 5804bbae4f01..6d77c0039617 100644 --- a/arch/x86/mm/hugetlbpage.c +++ b/arch/x86/mm/hugetlbpage.c @@ -141,7 +141,7 @@ hugetlb_get_unmapped_area(struct file *file, unsigned l= ong addr, } =20 get_unmapped_area: - if (mm->get_unmapped_area =3D=3D arch_get_unmapped_area) + if (!test_bit(MMF_TOPDOWN, &mm->flags)) return hugetlb_get_unmapped_area_bottomup(file, addr, len, pgoff, flags); else diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c index c90c20904a60..a2cabb1c81e1 100644 --- a/arch/x86/mm/mmap.c +++ b/arch/x86/mm/mmap.c @@ -129,9 +129,9 @@ static void arch_pick_mmap_base(unsigned long *base, un= signed long *legacy_base, void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack) { if (mmap_is_legacy()) - mm->get_unmapped_area =3D arch_get_unmapped_area; + clear_bit(MMF_TOPDOWN, &mm->flags); else - mm->get_unmapped_area =3D arch_get_unmapped_area_topdown; + set_bit(MMF_TOPDOWN, &mm->flags); =20 arch_pick_mmap_base(&mm->mmap_base, &mm->mmap_legacy_base, arch_rnd(mmap64_rnd_bits), task_size_64bit(0), diff --git a/drivers/char/mem.c b/drivers/char/mem.c index 3c6670cf905f..9b80e622ae80 100644 --- a/drivers/char/mem.c +++ b/drivers/char/mem.c @@ -544,7 +544,7 @@ static unsigned long get_unmapped_area_zero(struct file= *file, } =20 /* Otherwise flags & MAP_PRIVATE: with no shmem object beneath it */ - return current->mm->get_unmapped_area(file, addr, len, pgoff, flags); + return mm_get_unmapped_area(current->mm, file, addr, len, pgoff, flags); #else return -ENOSYS; #endif diff --git a/drivers/dax/device.c b/drivers/dax/device.c index 93ebedc5ec8c..47c126d37b59 100644 --- a/drivers/dax/device.c +++ b/drivers/dax/device.c @@ -329,14 +329,14 @@ static unsigned long dax_get_unmapped_area(struct fil= e *filp, if ((off + len_align) < off) goto out; =20 - addr_align =3D current->mm->get_unmapped_area(filp, addr, len_align, - pgoff, flags); + addr_align =3D mm_get_unmapped_area(current->mm, filp, addr, len_align, + pgoff, flags); if (!IS_ERR_VALUE(addr_align)) { addr_align +=3D (off - addr_align) & (align - 1); return addr_align; } out: - return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags); + return mm_get_unmapped_area(current->mm, filp, addr, len, pgoff, flags); } =20 static const struct address_space_operations dev_dax_aops =3D { diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index f757d4f7ad98..a63d2eee086f 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -242,7 +242,7 @@ generic_hugetlb_get_unmapped_area(struct file *file, un= signed long addr, * If architectures have special needs, they should define their own * version of hugetlb_get_unmapped_area. */ - if (mm->get_unmapped_area =3D=3D arch_get_unmapped_area_topdown) + if (test_bit(MMF_TOPDOWN, &mm->flags)) return hugetlb_get_unmapped_area_topdown(file, addr, len, pgoff, flags); return hugetlb_get_unmapped_area_bottomup(file, addr, len, diff --git a/fs/proc/inode.c b/fs/proc/inode.c index b33e490e3fd9..6f4c2e21e68f 100644 --- a/fs/proc/inode.c +++ b/fs/proc/inode.c @@ -454,15 +454,16 @@ pde_get_unmapped_area(struct proc_dir_entry *pde, str= uct file *file, unsigned lo unsigned long len, unsigned long pgoff, unsigned long flags) { - typeof_member(struct proc_ops, proc_get_unmapped_area) get_area; - - get_area =3D pde->proc_ops->proc_get_unmapped_area; + if (pde->proc_ops->proc_get_unmapped_area) + return pde->proc_ops->proc_get_unmapped_area(file, orig_addr, + len, pgoff, + flags); #ifdef CONFIG_MMU - if (!get_area) - get_area =3D current->mm->get_unmapped_area; + else + return mm_get_unmapped_area(current->mm, file, orig_addr, + len, pgoff, flags); #endif - if (get_area) - return get_area(file, orig_addr, len, pgoff, flags); + return orig_addr; } =20 diff --git a/fs/ramfs/file-mmu.c b/fs/ramfs/file-mmu.c index c7a1aa3c882b..b45c7edc3225 100644 --- a/fs/ramfs/file-mmu.c +++ b/fs/ramfs/file-mmu.c @@ -35,7 +35,7 @@ static unsigned long ramfs_mmu_get_unmapped_area(struct f= ile *file, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags) { - return current->mm->get_unmapped_area(file, addr, len, pgoff, flags); + return mm_get_unmapped_area(current->mm, file, addr, len, pgoff, flags); } =20 const struct file_operations ramfs_file_operations =3D { diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 957ce38768b2..f01c01b4a4fc 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -745,11 +745,7 @@ struct mm_struct { } ____cacheline_aligned_in_smp; =20 struct maple_tree mm_mt; -#ifdef CONFIG_MMU - unsigned long (*get_unmapped_area) (struct file *filp, - unsigned long addr, unsigned long len, - unsigned long pgoff, unsigned long flags); -#endif + unsigned long mmap_base; /* base of mmap area */ unsigned long mmap_legacy_base; /* base of mmap area in bottom-up alloca= tions */ #ifdef CONFIG_HAVE_ARCH_COMPAT_MMAP_BASES diff --git a/include/linux/sched/coredump.h b/include/linux/sched/coredump.h index 02f5090ffea2..428e440424c5 100644 --- a/include/linux/sched/coredump.h +++ b/include/linux/sched/coredump.h @@ -74,6 +74,7 @@ static inline int get_dumpable(struct mm_struct *mm) #define MMF_DISABLE_THP_MASK (1 << MMF_DISABLE_THP) #define MMF_OOM_REAP_QUEUED 25 /* mm was queued for oom_reaper */ #define MMF_MULTIPROCESS 26 /* mm is shared between processes */ +#define MMF_TOPDOWN 27 /* mm is shared between processes */ /* * MMF_HAS_PINNED: Whether this mm has pinned any pages. This can be eith= er * replaced in the future by mm.pinned_vm when it becomes stable, or grow = into diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h index 9a19f1b42f64..cde946e926d8 100644 --- a/include/linux/sched/mm.h +++ b/include/linux/sched/mm.h @@ -8,6 +8,7 @@ #include #include #include +#include =20 /* * Routines for handling mm_structs @@ -186,6 +187,10 @@ arch_get_unmapped_area_topdown(struct file *filp, unsi= gned long addr, unsigned long len, unsigned long pgoff, unsigned long flags); =20 +unsigned long mm_get_unmapped_area(struct mm_struct *mm, struct file *filp, + unsigned long addr, unsigned long len, + unsigned long pgoff, unsigned long flags); + unsigned long generic_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index 9626a363f121..b3c4ec115989 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -3569,7 +3569,7 @@ static unsigned long io_uring_mmu_get_unmapped_area(s= truct file *filp, #else addr =3D 0UL; #endif - return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags); + return mm_get_unmapped_area(current->mm, filp, addr, len, pgoff, flags); } =20 #else /* !CONFIG_MMU */ diff --git a/mm/debug.c b/mm/debug.c index ee533a5ceb79..32db5de8e1e7 100644 --- a/mm/debug.c +++ b/mm/debug.c @@ -162,9 +162,6 @@ EXPORT_SYMBOL(dump_vma); void dump_mm(const struct mm_struct *mm) { pr_emerg("mm %px task_size %lu\n" -#ifdef CONFIG_MMU - "get_unmapped_area %px\n" -#endif "mmap_base %lu mmap_legacy_base %lu\n" "pgd %px mm_users %d mm_count %d pgtables_bytes %lu map_count %d\n" "hiwater_rss %lx hiwater_vm %lx total_vm %lx locked_vm %lx\n" @@ -190,9 +187,6 @@ void dump_mm(const struct mm_struct *mm) "def_flags: %#lx(%pGv)\n", =20 mm, mm->task_size, -#ifdef CONFIG_MMU - mm->get_unmapped_area, -#endif mm->mmap_base, mm->mmap_legacy_base, mm->pgd, atomic_read(&mm->mm_users), atomic_read(&mm->mm_count), diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 86ee29b5c39c..e9ef43a719a5 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -641,8 +641,8 @@ static unsigned long __thp_get_unmapped_area(struct fil= e *filp, if (len_pad < len || (off + len_pad) < off) return 0; =20 - ret =3D current->mm->get_unmapped_area(filp, addr, len_pad, - off >> PAGE_SHIFT, flags); + ret =3D mm_get_unmapped_area(current->mm, filp, addr, len_pad, + off >> PAGE_SHIFT, flags); =20 /* * The failure might be due to length padding. The caller will retry @@ -672,7 +672,7 @@ unsigned long thp_get_unmapped_area(struct file *filp, = unsigned long addr, if (ret) return ret; =20 - return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags); + return mm_get_unmapped_area(current->mm, filp, addr, len, pgoff, flags); } EXPORT_SYMBOL_GPL(thp_get_unmapped_area); =20 diff --git a/mm/mmap.c b/mm/mmap.c index aa82eec17489..b61bc515c729 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1807,7 +1807,8 @@ get_unmapped_area(struct file *file, unsigned long ad= dr, unsigned long len, unsigned long pgoff, unsigned long flags) { unsigned long (*get_area)(struct file *, unsigned long, - unsigned long, unsigned long, unsigned long); + unsigned long, unsigned long, unsigned long) + =3D NULL; =20 unsigned long error =3D arch_mmap_check(addr, len, flags); if (error) @@ -1817,7 +1818,6 @@ get_unmapped_area(struct file *file, unsigned long ad= dr, unsigned long len, if (len > TASK_SIZE) return -ENOMEM; =20 - get_area =3D current->mm->get_unmapped_area; if (file) { if (file->f_op->get_unmapped_area) get_area =3D file->f_op->get_unmapped_area; @@ -1834,7 +1834,11 @@ get_unmapped_area(struct file *file, unsigned long a= ddr, unsigned long len, get_area =3D thp_get_unmapped_area; } =20 - addr =3D get_area(file, addr, len, pgoff, flags); + if (get_area) + addr =3D get_area(file, addr, len, pgoff, flags); + else + addr =3D mm_get_unmapped_area(current->mm, file, addr, len, + pgoff, flags); if (IS_ERR_VALUE(addr)) return addr; =20 @@ -1849,6 +1853,17 @@ get_unmapped_area(struct file *file, unsigned long a= ddr, unsigned long len, =20 EXPORT_SYMBOL(get_unmapped_area); =20 +unsigned long +mm_get_unmapped_area(struct mm_struct *mm, struct file *file, + unsigned long addr, unsigned long len, + unsigned long pgoff, unsigned long flags) +{ + if (test_bit(MMF_TOPDOWN, &mm->flags)) + return arch_get_unmapped_area_topdown(file, addr, len, pgoff, flags); + return arch_get_unmapped_area(file, addr, len, pgoff, flags); +} +EXPORT_SYMBOL(mm_get_unmapped_area); + /** * find_vma_intersection() - Look up the first VMA which intersects the in= terval * @mm: The process address space. diff --git a/mm/shmem.c b/mm/shmem.c index 0d1ce70bce38..a8cac92dff88 100644 --- a/mm/shmem.c +++ b/mm/shmem.c @@ -2242,8 +2242,6 @@ unsigned long shmem_get_unmapped_area(struct file *fi= le, unsigned long uaddr, unsigned long len, unsigned long pgoff, unsigned long flags) { - unsigned long (*get_area)(struct file *, - unsigned long, unsigned long, unsigned long, unsigned long); unsigned long addr; unsigned long offset; unsigned long inflated_len; @@ -2253,8 +2251,8 @@ unsigned long shmem_get_unmapped_area(struct file *fi= le, if (len > TASK_SIZE) return -ENOMEM; =20 - get_area =3D current->mm->get_unmapped_area; - addr =3D get_area(file, uaddr, len, pgoff, flags); + addr =3D mm_get_unmapped_area(current->mm, file, uaddr, len, pgoff, + flags); =20 if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) return addr; @@ -2311,7 +2309,8 @@ unsigned long shmem_get_unmapped_area(struct file *fi= le, if (inflated_len < len) return addr; =20 - inflated_addr =3D get_area(NULL, uaddr, inflated_len, 0, flags); + inflated_addr =3D mm_get_unmapped_area(current->mm, NULL, uaddr, + inflated_len, 0, flags); if (IS_ERR_VALUE(inflated_addr)) return addr; if (inflated_addr & ~PAGE_MASK) @@ -4757,7 +4756,7 @@ unsigned long shmem_get_unmapped_area(struct file *fi= le, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags) { - return current->mm->get_unmapped_area(file, addr, len, pgoff, flags); + return mm_get_unmapped_area(current->mm, file, addr, len, pgoff, flags); } #endif =20 diff --git a/mm/util.c b/mm/util.c index 744b4d7e3fae..10c836a75e66 100644 --- a/mm/util.c +++ b/mm/util.c @@ -452,17 +452,17 @@ void arch_pick_mmap_layout(struct mm_struct *mm, stru= ct rlimit *rlim_stack) =20 if (mmap_is_legacy(rlim_stack)) { mm->mmap_base =3D TASK_UNMAPPED_BASE + random_factor; - mm->get_unmapped_area =3D arch_get_unmapped_area; + clear_bit(MMF_TOPDOWN, &mm->flags); } else { mm->mmap_base =3D mmap_base(random_factor, rlim_stack); - mm->get_unmapped_area =3D arch_get_unmapped_area_topdown; + set_bit(MMF_TOPDOWN, &mm->flags); } } #elif defined(CONFIG_MMU) && !defined(HAVE_ARCH_PICK_MMAP_LAYOUT) void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack) { mm->mmap_base =3D TASK_UNMAPPED_BASE; - mm->get_unmapped_area =3D arch_get_unmapped_area; + clear_bit(MMF_TOPDOWN, &mm->flags); } #endif =20 --=20 2.34.1 From nobody Sat Feb 7 18:28:58 2026 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.21]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 70096145346 for ; Thu, 15 Feb 2024 23:14:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.21 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708038883; cv=none; b=jNx2rAxrlh9B29fpxUxcNldgT1F5Fl2g6eJMvVVLg3u4LKb5g9XuPkXHCkyo46cPs0iw0WOjZayAf+FhSGZGQ9WPEmfIKhtTDq/5OgksizcrazQDULpztXmlyR4nVrgz/4f4ViLo80CAcN/hNLm6J4e7UaQVk9VP6SCNj6/vBM4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708038883; c=relaxed/simple; bh=NmgdWGUKdUuPH56dpQmxlcj6RB+edyhe9JvhgeXosrc=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version:Content-Type; b=QIx8ks7WFcZbpnSKUGypeRd+AIwVErxY01fXPgNbgyCu8TnWKLqe3zQknkxZYrg14mCqQ9woWk3YaVD034Wz3so8x2zlL3jRaqRg+c76ICyXux+05Yrtt6/lVFzlScZGVhuS4jIKMSXNNGYXR9H5YJh/DIHdAFiqvIN/0B3hzJY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=bjSdW0Fi; arc=none smtp.client-ip=198.175.65.21 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="bjSdW0Fi" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708038881; x=1739574881; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=NmgdWGUKdUuPH56dpQmxlcj6RB+edyhe9JvhgeXosrc=; b=bjSdW0FiZPOIPYqgZjTJtj/8HojCKHYM1z3hDgxnfScB+SH2YcZkXIwF aBnp5shlB6XSIaZ+OAVMGgmZcw3wgbWkyzjzjdIMBwDAn/x2PVp4+nAwL vE0rZUvMCjGVftzZEr9y30opbdXlqhpcOur0vmrGuZgD+JX3S4tls3tNc 2jQn8NJI1dHyMAE2q/+91lX6QgIz2TOScFWfW8NC531ewJMhn64rZxJ4M Fb5cQNMOBeeXCrLYU6KNgLvc9Gxw8HWU5Mz0ArK+NZpS0ibu+3OuJ7iVt jAPgaHa0PRoK0uUIQw9uCt/OGRBTg5t85eghuNYt8MJKQyZwWfAEJ0go6 Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10985"; a="2066326" X-IronPort-AV: E=Sophos;i="6.06,162,1705392000"; d="scan'208";a="2066326" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orvoesa113.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Feb 2024 15:14:39 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10985"; a="912250191" X-IronPort-AV: E=Sophos;i="6.06,162,1705392000"; d="scan'208";a="912250191" Received: from yshin-mobl1.amr.corp.intel.com (HELO rpedgeco-desk4.intel.com) ([10.209.95.133]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Feb 2024 15:14:38 -0800 From: Rick Edgecombe To: Liam.Howlett@oracle.com, akpm@linux-foundation.org, debug@rivosinc.com, broonie@kernel.org, kirill.shutemov@linux.intel.com, keescook@chromium.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, luto@kernel.org, peterz@infradead.org, hpa@zytor.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: rick.p.edgecombe@intel.com Subject: [RFC PATCH 2/8] mm: Introduce arch_get_unmapped_area_vmflags() Date: Thu, 15 Feb 2024 15:13:26 -0800 Message-Id: <20240215231332.1556787-3-rick.p.edgecombe@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240215231332.1556787-1-rick.p.edgecombe@intel.com> References: <20240215231332.1556787-1-rick.p.edgecombe@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable When memory is being placed, mmap() will take care to respect the guard gaps of certain types of memory (VM_SHADOWSTACK, VM_GROWSUP and VM_GROWSDOWN). In order to ensure guard gaps between mappings, mmap() needs to consider two things: 1. That the new mapping isn=E2=80=99t placed in an any existing mappings g= uard gaps. 2. That the new mapping isn=E2=80=99t placed such that any existing mappin= gs are not in *its* guard gaps. The long standing behavior of mmap() is to ensure 1, but not take any care around 2. So for example, if there is a PAGE_SIZE free area, and a mmap() with a PAGE_SIZE size, and a type that has a guard gap is being placed, mmap() may place the shadow stack in the PAGE_SIZE free area. Then the mapping that is supposed to have a guard gap will not have a gap to the adjacent VMA. In order to take the start gap into account, the maple tree search needs to know the size of start gap the new mapping will need. The call chain from do_mmap() to the actual maple tree search looks like this: do_mmap(size, vm_flags, map_flags, ..) mm/mmap.c:get_unmapped_area(size, map_flags, ...) arch_get_unmapped_area(size, map_flags, ...) vm_unmapped_area(struct vm_unmapped_area_info) One option would be to add another MAP_ flag to mean a one page start gap (as is for shadow stack), but this consumes a flag unnecessarily. Another option could be to simply increase the size passed in do_mmap() by the start gap size, and adjust after the fact, but this will interfere with the alignment requirements passed in struct vm_unmapped_area_info, and unknown to mmap.c. Instead, introduce variants of arch_get_unmapped_area/_topdown() that take vm_flags. In future changes, these variants can be used in mmap.c:get_unmapped_area() to allow the vm_flags to be passed through to vm_unmapped_area(), while preserving the normal arch_get_unmapped_area/_topdown() for the existing callers. Signed-off-by: Rick Edgecombe --- include/linux/sched/mm.h | 17 +++++++++++++++++ mm/mmap.c | 28 ++++++++++++++++++++++++++++ 2 files changed, 45 insertions(+) diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h index cde946e926d8..7b44441865c5 100644 --- a/include/linux/sched/mm.h +++ b/include/linux/sched/mm.h @@ -191,6 +191,23 @@ unsigned long mm_get_unmapped_area(struct mm_struct *m= m, struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags); =20 +extern unsigned long +arch_get_unmapped_area_vmflags(struct file *filp, unsigned long addr, + unsigned long len, unsigned long pgoff, + unsigned long flags, vm_flags_t vm_flags); +extern unsigned long +arch_get_unmapped_area_topdown_vmflags(struct file *filp, unsigned long ad= dr, + unsigned long len, unsigned long pgoff, + unsigned long flags, vm_flags_t); + +unsigned long mm_get_unmapped_area_vmflags(struct mm_struct *mm, + struct file *filp, + unsigned long addr, + unsigned long len, + unsigned long pgoff, + unsigned long flags, + vm_flags_t vm_flags); + unsigned long generic_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, diff --git a/mm/mmap.c b/mm/mmap.c index b61bc515c729..2021bc040e81 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1802,6 +1802,34 @@ arch_get_unmapped_area_topdown(struct file *filp, un= signed long addr, } #endif =20 +#ifndef HAVE_ARCH_UNMAPPED_AREA_VMFLAGS +extern unsigned long +arch_get_unmapped_area_vmflags(struct file *filp, unsigned long addr, unsi= gned long len, + unsigned long pgoff, unsigned long flags, vm_flags_t vm_flags) +{ + return arch_get_unmapped_area(filp, addr, len, pgoff, flags); +} + +extern unsigned long +arch_get_unmapped_area_topdown_vmflags(struct file *filp, unsigned long ad= dr, + unsigned long len, unsigned long pgoff, + unsigned long flags, vm_flags_t vm_flags) +{ + return arch_get_unmapped_area_topdown(filp, addr, len, pgoff, flags); +} +#endif + +unsigned long mm_get_unmapped_area_vmflags(struct mm_struct *mm, struct fi= le *filp, + unsigned long addr, unsigned long len, + unsigned long pgoff, unsigned long flags, + vm_flags_t vm_flags) +{ + if (test_bit(MMF_TOPDOWN, &mm->flags)) + return arch_get_unmapped_area_topdown_vmflags(filp, addr, len, pgoff, + flags, vm_flags); + return arch_get_unmapped_area_vmflags(filp, addr, len, pgoff, flags, vm_f= lags); +} + unsigned long get_unmapped_area(struct file *file, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags) --=20 2.34.1 From nobody Sat Feb 7 18:28:58 2026 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.21]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9AF36148FF3 for ; Thu, 15 Feb 2024 23:14:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.21 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708038883; cv=none; b=VZT/hCbxpbHjYsnGoaPRpuVzb51wqrDGqovl3wP0Ik1HddDZgowwxo+BSU/TzfxXgCo5aJJi0R3E8NFXbrzOYr4SHV1419n5BUHgXS5NOgRgfaGlP2chZNDmYct7bdDYxakab9S6Ie9WN5HQ4lzVaiKh5KRUP6Df2wF1PrYADfc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708038883; c=relaxed/simple; bh=unKAy/CGTalz/NFNrhf3RgWUr6Aw4RHxK63+GyNg8+4=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version:Content-Type; b=Y2Hj9zuPVxwpqcplg2o9BNvbx/EMUzorMlGT1ne+LX724DtfReTPlZi/YL+1iNat2SPbCLeyjR2kzZDGEidiJ+HpvQ1IABvq6SPPIdWfuLhumnfsymVbWmePxCLswyNC05ZAVFfWFc7boF/f6JTe385mG8TB3YP7dgJygvCx6z0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=YM/GTh8e; arc=none smtp.client-ip=198.175.65.21 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="YM/GTh8e" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708038882; x=1739574882; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=unKAy/CGTalz/NFNrhf3RgWUr6Aw4RHxK63+GyNg8+4=; b=YM/GTh8eK6mmi7U9ztodoNavv2KsLhrv/Sw40BG7ed/5Y/ArNFlMiQdt AhW1OAnRzD+XSqg6dVq8RnXL+IipxMwuQwAYURyYBG3zf3jgjuhj+wvz0 MuUNearUX/GMA0y0QSBUp5mi4/0IreoGqW5jWJo8zQKvGcPCF3FqI0dNT kvCkrS8Kc/CJNaqxDwqNCMpB9+W69FXpZHVlmGLY9gigJThblFJTyIta3 ZHd4h9FL3YPa2Z1P+9ktN5/M+8KUwLGNv98cGw5JgsXby+lBzJDubLQne 5xoie9YIb97VNby7QW8BuQtfaNuQLrq0Y2aHHCu3dqiZKtGWLzYYfPz/+ Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10985"; a="2066341" X-IronPort-AV: E=Sophos;i="6.06,162,1705392000"; d="scan'208";a="2066341" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orvoesa113.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Feb 2024 15:14:40 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10985"; a="912250194" X-IronPort-AV: E=Sophos;i="6.06,162,1705392000"; d="scan'208";a="912250194" Received: from yshin-mobl1.amr.corp.intel.com (HELO rpedgeco-desk4.intel.com) ([10.209.95.133]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Feb 2024 15:14:39 -0800 From: Rick Edgecombe To: Liam.Howlett@oracle.com, akpm@linux-foundation.org, debug@rivosinc.com, broonie@kernel.org, kirill.shutemov@linux.intel.com, keescook@chromium.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, luto@kernel.org, peterz@infradead.org, hpa@zytor.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: rick.p.edgecombe@intel.com Subject: [RFC PATCH 3/8] mm: Use get_unmapped_area_vmflags() Date: Thu, 15 Feb 2024 15:13:27 -0800 Message-Id: <20240215231332.1556787-4-rick.p.edgecombe@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240215231332.1556787-1-rick.p.edgecombe@intel.com> References: <20240215231332.1556787-1-rick.p.edgecombe@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable When memory is being placed, mmap() will take care to respect the guard gaps of certain types of memory (VM_SHADOWSTACK, VM_GROWSUP and VM_GROWSDOWN). In order to ensure guard gaps between mappings, mmap() needs to consider two things: 1. That the new mapping isn=E2=80=99t placed in an any existing mappings g= uard gaps. 2. That the new mapping isn=E2=80=99t placed such that any existing mappin= gs are not in *its* guard gaps. The long standing behavior of mmap() is to ensure 1, but not take any care around 2. So for example, if there is a PAGE_SIZE free area, and a mmap() with a PAGE_SIZE size, and a type that has a guard gap is being placed, mmap() may place the shadow stack in the PAGE_SIZE free area. Then the mapping that is supposed to have a guard gap will not have a gap to the adjacent VMA. Use mm_get_unmapped_area_vmflags() in the do_mmap() so future changes can cause shadow stack mappings to be placed with a guard gap. Also use the THP variant that takes vm_flags, such that THP shadow stack can get the same treatment. Adjust the vm_flags calculation to happen earlier so that the vm_flags can be passed into __get_unmapped_area(). Signed-off-by: Rick Edgecombe --- include/linux/mm.h | 2 ++ mm/mmap.c | 38 ++++++++++++++++++++++---------------- 2 files changed, 24 insertions(+), 16 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index da5219b48d52..9addf16dbf18 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -3352,6 +3352,8 @@ unsigned long randomize_stack_top(unsigned long stack= _top); unsigned long randomize_page(unsigned long start, unsigned long range); =20 extern unsigned long get_unmapped_area(struct file *, unsigned long, unsig= ned long, unsigned long, unsigned long); +extern unsigned long __get_unmapped_area(struct file *, unsigned long, uns= igned long, + unsigned long, unsigned long, vm_flags_t); =20 extern unsigned long mmap_region(struct file *file, unsigned long addr, unsigned long len, vm_flags_t vm_flags, unsigned long pgoff, diff --git a/mm/mmap.c b/mm/mmap.c index 2021bc040e81..2723f26f7c62 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1249,18 +1249,6 @@ unsigned long do_mmap(struct file *file, unsigned lo= ng addr, if (mm->map_count > sysctl_max_map_count) return -ENOMEM; =20 - /* Obtain the address to map to. we verify (or select) it and ensure - * that it represents a valid section of the address space. - */ - addr =3D get_unmapped_area(file, addr, len, pgoff, flags); - if (IS_ERR_VALUE(addr)) - return addr; - - if (flags & MAP_FIXED_NOREPLACE) { - if (find_vma_intersection(mm, addr, addr + len)) - return -EEXIST; - } - if (prot =3D=3D PROT_EXEC) { pkey =3D execute_only_pkey(mm); if (pkey < 0) @@ -1274,6 +1262,18 @@ unsigned long do_mmap(struct file *file, unsigned lo= ng addr, vm_flags |=3D calc_vm_prot_bits(prot, pkey) | calc_vm_flag_bits(flags) | mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC; =20 + /* Obtain the address to map to. we verify (or select) it and ensure + * that it represents a valid section of the address space. + */ + addr =3D __get_unmapped_area(file, addr, len, pgoff, flags, vm_flags); + if (IS_ERR_VALUE(addr)) + return addr; + + if (flags & MAP_FIXED_NOREPLACE) { + if (find_vma_intersection(mm, addr, addr + len)) + return -EEXIST; + } + if (flags & MAP_LOCKED) if (!can_do_mlock()) return -EPERM; @@ -1831,8 +1831,8 @@ unsigned long mm_get_unmapped_area_vmflags(struct mm_= struct *mm, struct file *fi } =20 unsigned long -get_unmapped_area(struct file *file, unsigned long addr, unsigned long len, - unsigned long pgoff, unsigned long flags) +__get_unmapped_area(struct file *file, unsigned long addr, unsigned long l= en, + unsigned long pgoff, unsigned long flags, vm_flags_t vm_flags) { unsigned long (*get_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long) @@ -1865,8 +1865,8 @@ get_unmapped_area(struct file *file, unsigned long ad= dr, unsigned long len, if (get_area) addr =3D get_area(file, addr, len, pgoff, flags); else - addr =3D mm_get_unmapped_area(current->mm, file, addr, len, - pgoff, flags); + addr =3D mm_get_unmapped_area_vmflags(current->mm, file, addr, len, + pgoff, flags, vm_flags); if (IS_ERR_VALUE(addr)) return addr; =20 @@ -1879,6 +1879,12 @@ get_unmapped_area(struct file *file, unsigned long a= ddr, unsigned long len, return error ? error : addr; } =20 +unsigned long +get_unmapped_area(struct file *file, unsigned long addr, unsigned long len, + unsigned long pgoff, unsigned long flags) +{ + return __get_unmapped_area(file, addr, len, pgoff, flags, 0); +} EXPORT_SYMBOL(get_unmapped_area); =20 unsigned long --=20 2.34.1 From nobody Sat Feb 7 18:28:58 2026 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.21]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5C8E914A0B4 for ; Thu, 15 Feb 2024 23:14:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.21 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708038885; cv=none; b=NuD3kT1bXHQLwSujhO0FVDKrq1gyq/+5Nt38KGwyNftj/ZlBFRI80lBMUpiMDOdnmiwKDmbDIEvmWI6zLRWFt7FxrUEFKjJw3BqfyAvxN/3/FNW5T+meh/HB7+IN5kacpvqRnesY1ux7YsbzdWjiTx3VKE0/UzU3x1cKZRwQdco= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708038885; c=relaxed/simple; bh=XHhdhuxOKWz8EspMQDfPASDrP/vRFk3wVhxiIKVJyZE=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version:Content-Type; b=XCM6cTGbz+wgQVg31jWnb6meQPVXYAfRR6VV+0b3kMpHHENWBHdYKv2B660DMoLgkV7bBt7GiAa9XuhufQ0e3OCoQkiNGP8JSbLhZICQCnTwZB5Y+qg3rLGZ4CTjFr/aZM8Y8OkHemkw7y1uolMqniT+oEpPV3Vx6r7bYfjoMvs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=NfpFrd8Q; arc=none smtp.client-ip=198.175.65.21 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="NfpFrd8Q" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708038883; x=1739574883; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=XHhdhuxOKWz8EspMQDfPASDrP/vRFk3wVhxiIKVJyZE=; b=NfpFrd8QKeI9hUSsEoGQdlTQLTlezIbtpPp8Y4K6b2oPY9jS6+Wvohsm 2vvbo9RRbf9w7K80AvrYN+7sm+rvOYqOKCPuyhyzXKQBAynW5cstBsKNF OSIlrxc06ExNllCwHwuym+USb6RXSx8RBsjlgdvIqvYsA31ULgejFJ0SY XsL5ww3hdqlK+i27Qh1s9X4vkADpL81xt7EPzKOFTDYAI8+WHpKt0hh54 TxEWKe4CAu/IwDzTl3BFj1l8ufffzWSOd9k61b89Qd1YTW8d/3+R44VkM Zsb2vKgtuE4L+eeoNW1ckkVHMhYLvekTND0Fq9hE5rFaFNPY9W+ALI0TH w==; X-IronPort-AV: E=McAfee;i="6600,9927,10985"; a="2066354" X-IronPort-AV: E=Sophos;i="6.06,162,1705392000"; d="scan'208";a="2066354" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orvoesa113.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Feb 2024 15:14:41 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10985"; a="912250197" X-IronPort-AV: E=Sophos;i="6.06,162,1705392000"; d="scan'208";a="912250197" Received: from yshin-mobl1.amr.corp.intel.com (HELO rpedgeco-desk4.intel.com) ([10.209.95.133]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Feb 2024 15:14:39 -0800 From: Rick Edgecombe To: Liam.Howlett@oracle.com, akpm@linux-foundation.org, debug@rivosinc.com, broonie@kernel.org, kirill.shutemov@linux.intel.com, keescook@chromium.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, luto@kernel.org, peterz@infradead.org, hpa@zytor.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: rick.p.edgecombe@intel.com Subject: [RFC PATCH 4/8] thp: Add thp_get_unmapped_area_vmflags() Date: Thu, 15 Feb 2024 15:13:28 -0800 Message-Id: <20240215231332.1556787-5-rick.p.edgecombe@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240215231332.1556787-1-rick.p.edgecombe@intel.com> References: <20240215231332.1556787-1-rick.p.edgecombe@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable When memory is being placed, mmap() will take care to respect the guard gaps of certain types of memory (VM_SHADOWSTACK, VM_GROWSUP and VM_GROWSDOWN). In order to ensure guard gaps between mappings, mmap() needs to consider two things: 1. That the new mapping isn=E2=80=99t placed in an any existing mappings g= uard gaps. 2. That the new mapping isn=E2=80=99t placed such that any existing mappin= gs are not in *its* guard gaps. The long standing behavior of mmap() is to ensure 1, but not take any care around 2. So for example, if there is a PAGE_SIZE free area, and a mmap() with a PAGE_SIZE size, and a type that has a guard gap is being placed, mmap() may place the shadow stack in the PAGE_SIZE free area. Then the mapping that is supposed to have a guard gap will not have a gap to the adjacent VMA. Add a THP implementations of the vm_flags variant of get_unmapped_area(). Future changes will call this from mmap.c in the do_mmap() path to allow shadow stacks to be placed with consideration taken for the start guard gap. Shadow stack memory is always private and anonymous and so special guard gap logic is not needed in a lot of caseis, but it can be mapped by THP, so needs to be handled. Signed-off-by: Rick Edgecombe --- include/linux/huge_mm.h | 11 +++++++++++ mm/huge_memory.c | 23 ++++++++++++++++------- mm/mmap.c | 12 +++++++----- 3 files changed, 34 insertions(+), 12 deletions(-) diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index fa0350b0812a..ef7251dfd9f9 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -139,6 +139,9 @@ bool hugepage_vma_check(struct vm_area_struct *vma, uns= igned long vm_flags, =20 unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags); +unsigned long thp_get_unmapped_area_vmflags(struct file *filp, unsigned lo= ng addr, + unsigned long len, unsigned long pgoff, unsigned long flags, + vm_flags_t vm_flags); =20 void folio_prep_large_rmappable(struct folio *folio); bool can_split_folio(struct folio *folio, int *pextra_pins); @@ -286,6 +289,14 @@ static inline void folio_prep_large_rmappable(struct f= olio *folio) {} =20 #define thp_get_unmapped_area NULL =20 +static inline unsigned long +thp_get_unmapped_area_vmflags(struct file *filp, unsigned long addr, + unsigned long len, unsigned long pgoff, + unsigned long flags, vm_flags_t vm_flags) +{ + return 0; +} + static inline bool can_split_folio(struct folio *folio, int *pextra_pins) { diff --git a/mm/huge_memory.c b/mm/huge_memory.c index e9ef43a719a5..f235f6d3ff62 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -628,7 +628,8 @@ static inline bool is_transparent_hugepage(struct folio= *folio) =20 static unsigned long __thp_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, - loff_t off, unsigned long flags, unsigned long size) + loff_t off, unsigned long flags, unsigned long size, + vm_flags_t vm_flags) { loff_t off_end =3D off + len; loff_t off_align =3D round_up(off, size); @@ -641,8 +642,8 @@ static unsigned long __thp_get_unmapped_area(struct fil= e *filp, if (len_pad < len || (off + len_pad) < off) return 0; =20 - ret =3D mm_get_unmapped_area(current->mm, filp, addr, len_pad, - off >> PAGE_SHIFT, flags); + ret =3D mm_get_unmapped_area_vmflags(current->mm, filp, addr, len_pad, + off >> PAGE_SHIFT, flags, vm_flags); =20 /* * The failure might be due to length padding. The caller will retry @@ -662,17 +663,25 @@ static unsigned long __thp_get_unmapped_area(struct f= ile *filp, return ret; } =20 -unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr, - unsigned long len, unsigned long pgoff, unsigned long flags) +unsigned long thp_get_unmapped_area_vmflags(struct file *filp, unsigned lo= ng addr, + unsigned long len, unsigned long pgoff, unsigned long flags, + vm_flags_t vm_flags) { unsigned long ret; loff_t off =3D (loff_t)pgoff << PAGE_SHIFT; =20 - ret =3D __thp_get_unmapped_area(filp, addr, len, off, flags, PMD_SIZE); + ret =3D __thp_get_unmapped_area(filp, addr, len, off, flags, PMD_SIZE, vm= _flags); if (ret) return ret; =20 - return mm_get_unmapped_area(current->mm, filp, addr, len, pgoff, flags); + return mm_get_unmapped_area_vmflags(current->mm, filp, addr, len, pgoff, = flags, + vm_flags); +} + +unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr, + unsigned long len, unsigned long pgoff, unsigned long flags) +{ + return thp_get_unmapped_area_vmflags(filp, addr, len, pgoff, flags, 0); } EXPORT_SYMBOL_GPL(thp_get_unmapped_area); =20 diff --git a/mm/mmap.c b/mm/mmap.c index 2723f26f7c62..936d728ba1ca 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1857,16 +1857,18 @@ __get_unmapped_area(struct file *file, unsigned lon= g addr, unsigned long len, */ pgoff =3D 0; get_area =3D shmem_get_unmapped_area; - } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) { - /* Ensures that larger anonymous mappings are THP aligned. */ - get_area =3D thp_get_unmapped_area; } =20 - if (get_area) + if (get_area) { addr =3D get_area(file, addr, len, pgoff, flags); - else + } else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) { + /* Ensures that larger anonymous mappings are THP aligned. */ + addr =3D thp_get_unmapped_area_vmflags(file, addr, len, + pgoff, flags, vm_flags); + } else { addr =3D mm_get_unmapped_area_vmflags(current->mm, file, addr, len, pgoff, flags, vm_flags); + } if (IS_ERR_VALUE(addr)) return addr; =20 --=20 2.34.1 From nobody Sat Feb 7 18:28:58 2026 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.21]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8E29114A0BE for ; Thu, 15 Feb 2024 23:14:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.21 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708038885; cv=none; b=cb045/qJ6zu+wrCr5cVaqHLljkROJ3BpUoiQVVERDX0FdDWTGBgGvgpr3UFyHpG6sLTsAjDXbOmx5Zq3pKAEKfL7qAc6zGyUAToruQ1WvBfDs7mCcrI4A0vAT/jg+pUwhXa59j2fyht1DS8Dez6nFH3aFP2e8ataRZKgulPAw1Q= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708038885; c=relaxed/simple; bh=Ib9DRZGDBCSUD4uXPmR+ijcvzy4h1J3AkaOgyP6jQrA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version:Content-Type; b=QlhR5j7cZdmVHU4nxOXRX4SH6QHfiYuRku/AmVqCvQXhUMzFA7LVUp4t/UrziUvFp+0xmLCPns7BIabNk5Hqa75l0ml6cwK6ekmJEP7EjofApmj1DW+rdHTgOayN+JcawQfQPwbssya6+MymRPiYdIXi7FHcTArrgsHNbJrzjNg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=aZFz9HTh; arc=none smtp.client-ip=198.175.65.21 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="aZFz9HTh" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708038884; x=1739574884; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Ib9DRZGDBCSUD4uXPmR+ijcvzy4h1J3AkaOgyP6jQrA=; b=aZFz9HThrAXYgiC93qZAMnKdqWEZXqXnbLivzJoGt6MORRK3hWMLXBBb mdYwToFGS83l+1BJsVIb1vz7QYsW3lqm8vbPZ6MbvPlG2AsNv347Kh4fz 86l86S03256YyZBJU7Ae7P4sgVea75b2YLRXQe/oZ3XhNurn2N6sn1qIu nYruggcY+Gz5748T/jpXg259Ve5H1kY1mSws/MWV3CNNTuWf/epzFI1FD /U5FB73nXTBvk6/FHRovQ1R71jyXJywfnbGLhoCwSWplc/LhXmO42VE+w Eb/f43bAYS1zhnoPOp95vhZtUiZMBpmP1ioK73ngqHKh2EpfOYBl05jHx A==; X-IronPort-AV: E=McAfee;i="6600,9927,10985"; a="2066367" X-IronPort-AV: E=Sophos;i="6.06,162,1705392000"; d="scan'208";a="2066367" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orvoesa113.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Feb 2024 15:14:41 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10985"; a="912250200" X-IronPort-AV: E=Sophos;i="6.06,162,1705392000"; d="scan'208";a="912250200" Received: from yshin-mobl1.amr.corp.intel.com (HELO rpedgeco-desk4.intel.com) ([10.209.95.133]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Feb 2024 15:14:40 -0800 From: Rick Edgecombe To: Liam.Howlett@oracle.com, akpm@linux-foundation.org, debug@rivosinc.com, broonie@kernel.org, kirill.shutemov@linux.intel.com, keescook@chromium.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, luto@kernel.org, peterz@infradead.org, hpa@zytor.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: rick.p.edgecombe@intel.com Subject: [RFC PATCH 5/8] mm: Take placement mappings gap into account Date: Thu, 15 Feb 2024 15:13:29 -0800 Message-Id: <20240215231332.1556787-6-rick.p.edgecombe@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240215231332.1556787-1-rick.p.edgecombe@intel.com> References: <20240215231332.1556787-1-rick.p.edgecombe@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable When memory is being placed, mmap() will take care to respect the guard gaps of certain types of memory (VM_SHADOWSTACK, VM_GROWSUP and VM_GROWSDOWN). In order to ensure guard gaps between mappings, mmap() needs to consider two things: 1. That the new mapping isn=E2=80=99t placed in an any existing mappings g= uard gaps. 2. That the new mapping isn=E2=80=99t placed such that any existing mappin= gs are not in *its* guard gaps. The long standing behavior of mmap() is to ensure 1, but not take any care around 2. So for example, if there is a PAGE_SIZE free area, and a mmap() with a PAGE_SIZE size, and a type that has a guard gap is being placed, mmap() may place the shadow stack in the PAGE_SIZE free area. Then the mapping that is supposed to have a guard gap will not have a gap to the adjacent VMA. For MAP_GROWSDOWN/VM_GROWSDOWN and MAP_GROWSUP/VM_GROWSUP this has not been a problem in practice because applications place these kinds of mappings very early, when there is not many mappings to find a space between. But for shadow stacks, they may be placed throughout the lifetime of the application. So define a VM_UNMAPPED_START_GAP_SET flag to specify that a start_gap field has been set, as most vm_unmapped_area_info structs are not zeroed, so the added field will often contain garbage. Use VM_UNMAPPED_START_GAP_SET in unmapped_area/_topdown() to find a space that includes the guard gap for the new mapping. Take care to not interfere with the alignment. Signed-off-by: Rick Edgecombe --- include/linux/mm.h | 2 ++ mm/mmap.c | 21 ++++++++++++++------- 2 files changed, 16 insertions(+), 7 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 9addf16dbf18..160bb6db7a16 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -3393,12 +3393,14 @@ extern unsigned long __must_check vm_mmap(struct fi= le *, unsigned long, =20 struct vm_unmapped_area_info { #define VM_UNMAPPED_AREA_TOPDOWN 1 +#define VM_UNMAPPED_START_GAP_SET 2 unsigned long flags; unsigned long length; unsigned long low_limit; unsigned long high_limit; unsigned long align_mask; unsigned long align_offset; + unsigned long start_gap; }; =20 extern unsigned long vm_unmapped_area(struct vm_unmapped_area_info *info); diff --git a/mm/mmap.c b/mm/mmap.c index 936d728ba1ca..1b6c333656f9 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1567,14 +1567,17 @@ static inline int accountable_mapping(struct file *= file, vm_flags_t vm_flags) */ static unsigned long unmapped_area(struct vm_unmapped_area_info *info) { - unsigned long length, gap; + unsigned long length, gap, start_gap =3D 0; unsigned long low_limit, high_limit; struct vm_area_struct *tmp; =20 MA_STATE(mas, ¤t->mm->mm_mt, 0, 0); =20 + if (info->flags & VM_UNMAPPED_START_GAP_SET) + start_gap =3D info->start_gap; + /* Adjust search length to account for worst case alignment overhead */ - length =3D info->length + info->align_mask; + length =3D info->length + info->align_mask + start_gap; if (length < info->length) return -ENOMEM; =20 @@ -1586,7 +1589,7 @@ static unsigned long unmapped_area(struct vm_unmapped= _area_info *info) if (mas_empty_area(&mas, low_limit, high_limit - 1, length)) return -ENOMEM; =20 - gap =3D mas.index; + gap =3D mas.index + start_gap; gap +=3D (info->align_offset - gap) & info->align_mask; tmp =3D mas_next(&mas, ULONG_MAX); if (tmp && (tmp->vm_flags & VM_STARTGAP_FLAGS)) { /* Avoid prev check if = possible */ @@ -1619,13 +1622,17 @@ static unsigned long unmapped_area(struct vm_unmapp= ed_area_info *info) */ static unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *i= nfo) { - unsigned long length, gap, gap_end; + unsigned long length, gap, gap_end, start_gap =3D 0; unsigned long low_limit, high_limit; struct vm_area_struct *tmp; =20 MA_STATE(mas, ¤t->mm->mm_mt, 0, 0); + + if (info->flags & VM_UNMAPPED_START_GAP_SET) + start_gap =3D info->start_gap; + /* Adjust search length to account for worst case alignment overhead */ - length =3D info->length + info->align_mask; + length =3D info->length + info->align_mask + start_gap; if (length < info->length) return -ENOMEM; =20 @@ -1832,7 +1839,7 @@ unsigned long mm_get_unmapped_area_vmflags(struct mm_= struct *mm, struct file *fi =20 unsigned long __get_unmapped_area(struct file *file, unsigned long addr, unsigned long l= en, - unsigned long pgoff, unsigned long flags, vm_flags_t vm_flags) + unsigned long pgoff, unsigned long flags, vm_flags_t vm_flags) { unsigned long (*get_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long) @@ -1883,7 +1890,7 @@ __get_unmapped_area(struct file *file, unsigned long = addr, unsigned long len, =20 unsigned long get_unmapped_area(struct file *file, unsigned long addr, unsigned long len, - unsigned long pgoff, unsigned long flags) + unsigned long pgoff, unsigned long flags) { return __get_unmapped_area(file, addr, len, pgoff, flags, 0); } --=20 2.34.1 From nobody Sat Feb 7 18:28:58 2026 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.21]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C51D214A4C9 for ; Thu, 15 Feb 2024 23:14:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.21 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708038885; cv=none; b=kphL5IjI8qtSvNR3bQQLe73WFLlx1vmTgKG1iywqnyPrrGvAxQH4Fr83gBjS4nPfOpJYK1dskOvZK61GC1M/+FXNsX+qt4igigV3/CZnaoV3fS2jFXDz43s0csBVdz0wmlN0UniZgI5OjhWUv+ZwAVMuiWd8Fc+B4AT+5QxZFYU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708038885; c=relaxed/simple; bh=VO9DQY6lGtiG1eC5gi6w3K4FE6tMMi6+Q5ti33XNprc=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version:Content-Type; b=JqbkO8CciD7zEOoEuPQ2wt61aJ5HfhK+Agm4cJjOgc4Auwtdm6tKVrWiqqz1dpW4AfjMU+2fgbR06CZSi0HYj7oqDODJ9WcbJYlToXmJ0aTH1dYlcfgL8f3m1Jyj6e6Rdeag6K6vQrr5J/D/hdY7oPcmD8JxUX+HxNJbEe140T8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=J67+WqLJ; arc=none smtp.client-ip=198.175.65.21 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="J67+WqLJ" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708038884; x=1739574884; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=VO9DQY6lGtiG1eC5gi6w3K4FE6tMMi6+Q5ti33XNprc=; b=J67+WqLJBRvfQDvEc8w0q2MZpIfq4EbTbPasgs/MIL1ucB8F0t1Izdph AnoHk+OtNrZDygV6IAuvYp8yCy8sWoy/baVZcWtvChTcHLCrq3d+4y9zn fp4WczYl9+n75Oxx7eiC/yHofsFcD/FRMEqA8Zh9mics4TpIQ2UaHM8NP GlPhKHi4fW7tTrsQfUdjda6UA+PdasrkVhfYTlaw49GODv1fHT0Kd5g+n 6W0fn4gsGIDvGJbj7tPdKSXSGwIowtmkR/xrfqx6pNfhWDsoIGM7T+qBq /e6EfL9jU3eLeccw/BK8klCptxvVd8cXu7dCH1BcoCo1EX5YgfR5Csqbb w==; X-IronPort-AV: E=McAfee;i="6600,9927,10985"; a="2066382" X-IronPort-AV: E=Sophos;i="6.06,162,1705392000"; d="scan'208";a="2066382" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orvoesa113.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Feb 2024 15:14:42 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10985"; a="912250203" X-IronPort-AV: E=Sophos;i="6.06,162,1705392000"; d="scan'208";a="912250203" Received: from yshin-mobl1.amr.corp.intel.com (HELO rpedgeco-desk4.intel.com) ([10.209.95.133]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Feb 2024 15:14:41 -0800 From: Rick Edgecombe To: Liam.Howlett@oracle.com, akpm@linux-foundation.org, debug@rivosinc.com, broonie@kernel.org, kirill.shutemov@linux.intel.com, keescook@chromium.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, luto@kernel.org, peterz@infradead.org, hpa@zytor.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: rick.p.edgecombe@intel.com Subject: [RFC PATCH 6/8] x86/mm: Implement HAVE_ARCH_UNMAPPED_AREA_VMFLAGS Date: Thu, 15 Feb 2024 15:13:30 -0800 Message-Id: <20240215231332.1556787-7-rick.p.edgecombe@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240215231332.1556787-1-rick.p.edgecombe@intel.com> References: <20240215231332.1556787-1-rick.p.edgecombe@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable When memory is being placed, mmap() will take care to respect the guard gaps of certain types of memory (VM_SHADOWSTACK, VM_GROWSUP and VM_GROWSDOWN). In order to ensure guard gaps between mappings, mmap() needs to consider two things: 1. That the new mapping isn=E2=80=99t placed in an any existing mappings g= uard gaps. 2. That the new mapping isn=E2=80=99t placed such that any existing mappin= gs are not in *its* guard gaps. The long standing behavior of mmap() is to ensure 1, but not take any care around 2. So for example, if there is a PAGE_SIZE free area, and a mmap() with a PAGE_SIZE size, and a type that has a guard gap is being placed, mmap() may place the shadow stack in the PAGE_SIZE free area. Then the mapping that is supposed to have a guard gap will not have a gap to the adjacent VMA. Add x86 arch implementations of arch_get_unmapped_area_vmflags/_topdown() so future changes can allow the the guard gap of type of vma being placed to be taken into account. This will be used for shadow stack memory. Signed-off-by: Rick Edgecombe --- arch/x86/include/asm/pgtable_64.h | 1 + arch/x86/kernel/sys_x86_64.c | 29 ++++++++++++++++++++++------- 2 files changed, 23 insertions(+), 7 deletions(-) diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtab= le_64.h index a629b1b9f65a..eb09a11621ad 100644 --- a/arch/x86/include/asm/pgtable_64.h +++ b/arch/x86/include/asm/pgtable_64.h @@ -244,6 +244,7 @@ extern void cleanup_highmap(void); =20 #define HAVE_ARCH_UNMAPPED_AREA #define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN +#define HAVE_ARCH_UNMAPPED_AREA_VMFLAGS =20 #define PAGE_AGP PAGE_KERNEL_NOCACHE #define HAVE_PAGE_AGP 1 diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c index c783aeb37dce..f92780cf9662 100644 --- a/arch/x86/kernel/sys_x86_64.c +++ b/arch/x86/kernel/sys_x86_64.c @@ -119,9 +119,9 @@ static void find_start_end(unsigned long addr, unsigned= long flags, *end =3D task_size_64bit(addr > DEFAULT_MAP_WINDOW); } =20 -unsigned long -arch_get_unmapped_area(struct file *filp, unsigned long addr, - unsigned long len, unsigned long pgoff, unsigned long flags) +extern unsigned long +arch_get_unmapped_area_vmflags(struct file *filp, unsigned long addr, unsi= gned long len, + unsigned long pgoff, unsigned long flags, vm_flags_t vm_flags) { struct mm_struct *mm =3D current->mm; struct vm_area_struct *vma; @@ -157,10 +157,10 @@ arch_get_unmapped_area(struct file *filp, unsigned lo= ng addr, return vm_unmapped_area(&info); } =20 -unsigned long -arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr= 0, - const unsigned long len, const unsigned long pgoff, - const unsigned long flags) +extern unsigned long +arch_get_unmapped_area_topdown_vmflags(struct file *filp, unsigned long ad= dr0, + unsigned long len, unsigned long pgoff, + unsigned long flags, vm_flags_t vm_flags) { struct vm_area_struct *vma; struct mm_struct *mm =3D current->mm; @@ -230,3 +230,18 @@ arch_get_unmapped_area_topdown(struct file *filp, cons= t unsigned long addr0, */ return arch_get_unmapped_area(filp, addr0, len, pgoff, flags); } + +unsigned long +arch_get_unmapped_area(struct file *filp, unsigned long addr, + unsigned long len, unsigned long pgoff, unsigned long flags) +{ + return arch_get_unmapped_area_vmflags(filp, addr, len, pgoff, flags, 0); +} + +unsigned long +arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr, + const unsigned long len, const unsigned long pgoff, + const unsigned long flags) +{ + return arch_get_unmapped_area_topdown_vmflags(filp, addr, len, pgoff, fla= gs, 0); +} --=20 2.34.1 From nobody Sat Feb 7 18:28:58 2026 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.21]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6108014AD24 for ; Thu, 15 Feb 2024 23:14:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.21 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708038887; cv=none; b=SO827y/gPA8hQVS/H/XetiTszVnvIa17ITCCoOSkHjo9dOT4zEViHsj/O4UQIDxpUExUcLlJceia502JT8as+7aRx3YoHah3IPNCGraAXG/LVAFtrB5yOUAdAZKbnRlVH885DIUuKaTs3re2H5135sKd3iKY9tT57IFofLMBo+U= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708038887; c=relaxed/simple; bh=fQLzGTXW5Y/EBT0eWnQYJ+GT+r/+mVoQa4wR7CvVXl4=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version:Content-Type; b=VqK7ZWSOf56zozTbSD4G3ID/AAGKr/+LVfoefbuJwgPdvcYyhrUKAtuhGritUXQ/cQXskQSJMReG7utUqc5Sv/P8WhejOnz2u7uIo4c/nrz3KudXwhgYFrpbE72O36muAS6NlzIPKoVoX8oVPL5Kqohe/qWhnNJjfu7xJYaW/9g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=PVh08EVe; arc=none smtp.client-ip=198.175.65.21 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="PVh08EVe" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708038885; x=1739574885; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=fQLzGTXW5Y/EBT0eWnQYJ+GT+r/+mVoQa4wR7CvVXl4=; b=PVh08EVezN8JnrhzLkbq0lM+KeuoNBnEybXTIw8rqwcf5mwgy9AVdhxE vyRjbcM8wg1LqUif4GKWl6o3aQyK25+ji/te3cLD/K8kbvVe+eA4l25yf g4gVQQZeHdXVY5FINTevT6rbZAXB/Gi7sIPhoVQZGJUnVG/xKRpTsGDGq +GP5+HtTCHH0mzmEped4jGp3lcpD2Ye+Icd2mbsf+A45ed+UzZ9HbOhrY CxEzJFfOfzM3bVzrcYVUkGHGxcQDJ0b3Nzrpgh1YQpFWklZdOuKfH7156 7MHa+aHKiioT6vPPN+n+TQaKy9m57yl/FQqbPxD0rX149sCtWFpXiy34u w==; X-IronPort-AV: E=McAfee;i="6600,9927,10985"; a="2066396" X-IronPort-AV: E=Sophos;i="6.06,162,1705392000"; d="scan'208";a="2066396" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orvoesa113.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Feb 2024 15:14:43 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10985"; a="912250207" X-IronPort-AV: E=Sophos;i="6.06,162,1705392000"; d="scan'208";a="912250207" Received: from yshin-mobl1.amr.corp.intel.com (HELO rpedgeco-desk4.intel.com) ([10.209.95.133]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Feb 2024 15:14:41 -0800 From: Rick Edgecombe To: Liam.Howlett@oracle.com, akpm@linux-foundation.org, debug@rivosinc.com, broonie@kernel.org, kirill.shutemov@linux.intel.com, keescook@chromium.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, luto@kernel.org, peterz@infradead.org, hpa@zytor.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: rick.p.edgecombe@intel.com Subject: [RFC PATCH 7/8] x86/mm: Care about shadow stack guard gap during placement Date: Thu, 15 Feb 2024 15:13:31 -0800 Message-Id: <20240215231332.1556787-8-rick.p.edgecombe@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240215231332.1556787-1-rick.p.edgecombe@intel.com> References: <20240215231332.1556787-1-rick.p.edgecombe@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable When memory is being placed, mmap() will take care to respect the guard gaps of certain types of memory (VM_SHADOWSTACK, VM_GROWSUP and VM_GROWSDOWN). In order to ensure guard gaps between mappings, mmap() needs to consider two things: 1. That the new mapping isn=E2=80=99t placed in an any existing mappings g= uard gaps. 2. That the new mapping isn=E2=80=99t placed such that any existing mappin= gs are not in *its* guard gaps. The long standing behavior of mmap() is to ensure 1, but not take any care around 2. So for example, if there is a PAGE_SIZE free area, and a mmap() with a PAGE_SIZE size, and a type that has a guard gap is being placed, mmap() may place the shadow stack in the PAGE_SIZE free area. Then the mapping that is supposed to have a guard gap will not have a gap to the adjacent VMA. Now that the vm_flags is passed into the arch get_unmapped_area()'s, and vm_unmapped_area() is ready to consider it, have VM_SHADOW_STACK's get guard gap consideration for scenario 2. Signed-off-by: Rick Edgecombe --- arch/x86/kernel/sys_x86_64.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c index f92780cf9662..3b78fdc235fc 100644 --- a/arch/x86/kernel/sys_x86_64.c +++ b/arch/x86/kernel/sys_x86_64.c @@ -119,6 +119,14 @@ static void find_start_end(unsigned long addr, unsigne= d long flags, *end =3D task_size_64bit(addr > DEFAULT_MAP_WINDOW); } =20 +static inline unsigned long stack_guard_placement(vm_flags_t vm_flags) +{ + if (vm_flags & VM_SHADOW_STACK) + return PAGE_SIZE; + + return 0; +} + extern unsigned long arch_get_unmapped_area_vmflags(struct file *filp, unsigned long addr, unsi= gned long len, unsigned long pgoff, unsigned long flags, vm_flags_t vm_flags) @@ -144,12 +152,13 @@ arch_get_unmapped_area_vmflags(struct file *filp, uns= igned long addr, unsigned l return addr; } =20 - info.flags =3D 0; + info.flags =3D VM_UNMAPPED_START_GAP_SET; info.length =3D len; info.low_limit =3D begin; info.high_limit =3D end; info.align_mask =3D 0; info.align_offset =3D pgoff << PAGE_SHIFT; + info.start_gap =3D stack_guard_placement(vm_flags); if (filp) { info.align_mask =3D get_align_mask(); info.align_offset +=3D get_align_bits(); @@ -191,7 +200,7 @@ arch_get_unmapped_area_topdown_vmflags(struct file *fil= p, unsigned long addr0, } get_unmapped_area: =20 - info.flags =3D VM_UNMAPPED_AREA_TOPDOWN; + info.flags =3D VM_UNMAPPED_AREA_TOPDOWN | VM_UNMAPPED_START_GAP_SET; info.length =3D len; if (!in_32bit_syscall() && (flags & MAP_ABOVE4G)) info.low_limit =3D SZ_4G; @@ -199,6 +208,7 @@ arch_get_unmapped_area_topdown_vmflags(struct file *fil= p, unsigned long addr0, info.low_limit =3D PAGE_SIZE; =20 info.high_limit =3D get_mmap_base(0); + info.start_gap =3D stack_guard_placement(vm_flags); =20 /* * If hint address is above DEFAULT_MAP_WINDOW, look for unmapped area --=20 2.34.1 From nobody Sat Feb 7 18:28:58 2026 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.21]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B066414AD33 for ; Thu, 15 Feb 2024 23:14:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.21 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708038887; cv=none; b=AslqiY01oZZZiGFUlinZpzgtDAPiFtoj6JrycAVWJLBSbIFeVarZOA+4reuALaJQ90SuM+FuJdmEHykNK0VO61t+iTHHNdcpBpfa4uNT3T5lPfS8a7P58vXdbIzT/DmiTBLugZRfms0AKk0Y44netZwy3WFPro/ySkKuKBO5O3I= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708038887; c=relaxed/simple; bh=9Kz1B1g9YBWVQVBA7u5n8sBfRhBddgPHEWEy/INdrdY=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=msJYnINfhEOeZrhRClKwDQIrm/m4FvFX9r18C1KodfJq/FDQvx+GB9u4zHX/v8HSargdVcQlzpzICWZ0Yc1zYQ4OJYdmYlqWfCSjp3OEsUxkbPCSRpwJ1+aPRVkMcaw4Rp2NYLgLSfT0ExKH8E7o7vIC/M+MWKyhN5BDKq6RtHw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=H6vzGf+G; arc=none smtp.client-ip=198.175.65.21 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="H6vzGf+G" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1708038886; x=1739574886; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=9Kz1B1g9YBWVQVBA7u5n8sBfRhBddgPHEWEy/INdrdY=; b=H6vzGf+Gbtr4uatpgey901EkkN8dPcjUaVdbO6ZsJGj5Ur+Xc5qO9Opw OnFSvkA7WtptVm/9vxwk0aCVCUukpqtxR7o4j2pWxbxkHaclbjelHyHmg cc4d1WB7+3S5ED2JmKYuPwG9ETKkiTvzl+cryHxPJGyeI8AmDJ7t59k3Z Mj9zaPtrbP1c2fnap0C5jU9tGEG4pxm3A0/tzT7RqneVuJwKF5VAB6qHh d+87yCZHSOP6kfPSRDxDkSab9Hm79hlZAk3GVizaJZHPtnLzwXI74Le/b Uir8ooXOQat0wlL5JvslqDgKk9/E4Y+caJCrsbtxLS1jfDnmZi+OO6wZb A==; X-IronPort-AV: E=McAfee;i="6600,9927,10985"; a="2066410" X-IronPort-AV: E=Sophos;i="6.06,162,1705392000"; d="scan'208";a="2066410" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orvoesa113.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Feb 2024 15:14:43 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10985"; a="912250210" X-IronPort-AV: E=Sophos;i="6.06,162,1705392000"; d="scan'208";a="912250210" Received: from yshin-mobl1.amr.corp.intel.com (HELO rpedgeco-desk4.intel.com) ([10.209.95.133]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Feb 2024 15:14:42 -0800 From: Rick Edgecombe To: Liam.Howlett@oracle.com, akpm@linux-foundation.org, debug@rivosinc.com, broonie@kernel.org, kirill.shutemov@linux.intel.com, keescook@chromium.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, luto@kernel.org, peterz@infradead.org, hpa@zytor.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: rick.p.edgecombe@intel.com Subject: [RFC PATCH 8/8] selftests/x86: Add placement guard gap test for shstk Date: Thu, 15 Feb 2024 15:13:32 -0800 Message-Id: <20240215231332.1556787-9-rick.p.edgecombe@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240215231332.1556787-1-rick.p.edgecombe@intel.com> References: <20240215231332.1556787-1-rick.p.edgecombe@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" The existing shadow stack test for guard gaps just checks that new mappings are not placed in an existing mapping's guard gap. Add one that checks that new mappings are not placed such that preexisting mappings are in the new mappings guard gap. Signed-off-by: Rick Edgecombe --- .../testing/selftests/x86/test_shadow_stack.c | 67 +++++++++++++++++-- 1 file changed, 63 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/x86/test_shadow_stack.c b/tools/testin= g/selftests/x86/test_shadow_stack.c index 757e6527f67e..ee909a7927f9 100644 --- a/tools/testing/selftests/x86/test_shadow_stack.c +++ b/tools/testing/selftests/x86/test_shadow_stack.c @@ -556,7 +556,7 @@ struct node { * looked at the shadow stack gaps. * 5. See if it landed in the gap. */ -int test_guard_gap(void) +int test_guard_gap_other_gaps(void) { void *free_area, *shstk, *test_map =3D (void *)0xFFFFFFFFFFFFFFFF; struct node *head =3D NULL, *cur; @@ -593,11 +593,64 @@ int test_guard_gap(void) if (shstk - test_map - PAGE_SIZE !=3D PAGE_SIZE) return 1; =20 - printf("[OK]\tGuard gap test\n"); + printf("[OK]\tGuard gap test, other mapping's gaps\n"); =20 return 0; } =20 +/* Tests respecting the guard gap of the mapping getting placed */ +int test_guard_gap_new_mappings_gaps(void) +{ + void *free_area, *shstk_start, *test_map =3D (void *)0xFFFFFFFFFFFFFFFF; + struct node *head =3D NULL, *cur; + int ret =3D 0; + + free_area =3D mmap(0, PAGE_SIZE * 4, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + munmap(free_area, PAGE_SIZE * 4); + + /* Test letting map_shadow_stack find a free space */ + shstk_start =3D mmap(free_area, PAGE_SIZE, PROT_READ | PROT_WRITE, + MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); + if (shstk_start =3D=3D MAP_FAILED || shstk_start !=3D free_area) + return 1; + + while (test_map > shstk_start) { + test_map =3D (void *)syscall(__NR_map_shadow_stack, 0, PAGE_SIZE, 0); + if (test_map =3D=3D MAP_FAILED) { + printf("[INFO]\tmap_shadow_stack MAP_FAILED\n"); + ret =3D 1; + break; + } + + cur =3D malloc(sizeof(*cur)); + cur->mapping =3D test_map; + + cur->next =3D head; + head =3D cur; + + if (test_map =3D=3D free_area + PAGE_SIZE) { + printf("[INFO]\tNew mapping has other mapping in guard gap!\n"); + ret =3D 1; + break; + } + } + + while (head) { + cur =3D head; + head =3D cur->next; + munmap(cur->mapping, PAGE_SIZE); + free(cur); + } + + munmap(shstk_start, PAGE_SIZE); + + if (!ret) + printf("[OK]\tGuard gap test, placement mapping's gaps\n"); + + return ret; +} + /* * Too complicated to pull it out of the 32 bit header, but also get the * 64 bit one needed above. Just define a copy here. @@ -850,9 +903,15 @@ int main(int argc, char *argv[]) goto out; } =20 - if (test_guard_gap()) { + if (test_guard_gap_other_gaps()) { ret =3D 1; - printf("[FAIL]\tGuard gap test\n"); + printf("[FAIL]\tGuard gap test, other mappings' gaps\n"); + goto out; + } + + if (test_guard_gap_new_mappings_gaps()) { + ret =3D 1; + printf("[FAIL]\tGuard gap test, placement mapping's gaps\n"); goto out; } =20 --=20 2.34.1