From: Rick Edgecombe
To: Liam.Howlett@oracle.com, akpm@linux-foundation.org, bp@alien8.de,
	broonie@kernel.org, dave.hansen@linux.intel.com, debug@rivosinc.com,
	hpa@zytor.com, keescook@chromium.org, kirill.shutemov@linux.intel.com,
	luto@kernel.org, mingo@redhat.com, peterz@infradead.org,
	tglx@linutronix.de, x86@kernel.org, christophe.leroy@csgroup.eu
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	rick.p.edgecombe@intel.com
Subject: [PATCH v3 01/12] mm: Switch mm->get_unmapped_area() to a flag
Date: Tue, 12 Mar 2024 15:28:32 -0700
Message-Id: <20240312222843.2505560-2-rick.p.edgecombe@intel.com>
In-Reply-To: <20240312222843.2505560-1-rick.p.edgecombe@intel.com>
References: <20240312222843.2505560-1-rick.p.edgecombe@intel.com>

The mm_struct contains a function pointer *get_unmapped_area(), which is set to
either arch_get_unmapped_area() or arch_get_unmapped_area_topdown() during
the initialization of the mm.

Since the function pointer only ever points to two functions that are named
the same across all architectures, a function pointer is not really
required. In addition, future changes will want to add versions of the
functions that take additional arguments. So to save a pointer's worth of
bytes in mm_struct, and prevent adding additional function pointers to
mm_struct in future changes, remove it and keep the information about which
get_unmapped_area() to use in a flag.

Add the new flag to MMF_INIT_MASK so it doesn't get clobbered on fork by
mmf_init_flags(). Most MM flags get clobbered on fork. In the pre-existing
behavior mm->get_unmapped_area() would get copied to the new mm in
dup_mm(), so not clobbering the flag preserves the existing behavior
around inheriting the topdown-ness.

Introduce a helper, mm_get_unmapped_area(), to easily convert code that
refers to the old function pointer to instead select and call either
arch_get_unmapped_area() or arch_get_unmapped_area_topdown() based on the
flag. Then drop the mm->get_unmapped_area() function pointer. Leave the
get_unmapped_area() pointer in struct file_operations alone. The main
purpose of this change is to reorganize in preparation for future changes,
but it also converts the calls of mm->get_unmapped_area() from indirect
branches into direct ones.

The stress-ng bigheap benchmark calls realloc a lot, which calls through
get_unmapped_area() in the kernel. On x86, the change yielded a ~1%
improvement there on a retpoline config.

In testing a few x86 configs, removing the pointer unfortunately didn't
result in any actual size reductions in the compiled layout of mm_struct.
But depending on compiler or arch alignment requirements, the change could
shrink the size of mm_struct.

Signed-off-by: Rick Edgecombe
Acked-by: Dave Hansen
Acked-by: Liam R. Howlett
Reviewed-by: Kirill A. Shutemov
---
v3:
 - Fix comment that still referred to mm->get_unmapped_area()
 - Resolve trivial rebase conflicts with "mm: thp_get_unmapped_area must
   honour topdown preference"
 - Spelling fix in log

v2:
 - Fix comment on MMF_TOPDOWN (Kirill, rppt)
 - Move MMF_TOPDOWN to actually unused bit
 - Add MMF_TOPDOWN to MMF_INIT_MASK so it doesn't get clobbered on fork,
   and result in the children using the search up path.
 - New lower performance results after above bug fix
 - Add Reviews and Acks
---
 arch/s390/mm/hugetlbpage.c       |  2 +-
 arch/s390/mm/mmap.c              |  4 ++--
 arch/sparc/kernel/sys_sparc_64.c | 15 ++++++---------
 arch/sparc/mm/hugetlbpage.c      |  2 +-
 arch/x86/kernel/cpu/sgx/driver.c |  2 +-
 arch/x86/mm/hugetlbpage.c        |  2 +-
 arch/x86/mm/mmap.c               |  4 ++--
 drivers/char/mem.c               |  2 +-
 drivers/dax/device.c             |  6 +++---
 fs/hugetlbfs/inode.c             |  4 ++--
 fs/proc/inode.c                  | 15 ++++++++-------
 fs/ramfs/file-mmu.c              |  2 +-
 include/linux/mm_types.h         |  6 +-----
 include/linux/sched/coredump.h   |  5 ++++-
 include/linux/sched/mm.h         |  5 +++++
 io_uring/io_uring.c              |  2 +-
 mm/debug.c                       |  6 ------
 mm/huge_memory.c                 |  9 ++++-----
 mm/mmap.c                        | 21 ++++++++++++++++++---
 mm/shmem.c                       | 11 +++++------
 mm/util.c                        |  6 +++---
 21 files changed, 70 insertions(+), 61 deletions(-)

diff --git a/arch/s390/mm/hugetlbpage.c b/arch/s390/mm/hugetlbpage.c
index 297a6d897d5a..c2d2850ec8d5 100644
--- a/arch/s390/mm/hugetlbpage.c
+++ b/arch/s390/mm/hugetlbpage.c
@@ -328,7 +328,7 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 			goto check_asce_limit;
 	}
 
-	if (mm->get_unmapped_area == arch_get_unmapped_area)
+	if (!test_bit(MMF_TOPDOWN, &mm->flags))
 		addr = hugetlb_get_unmapped_area_bottomup(file, addr, len,
 				pgoff, flags);
 	else
diff --git a/arch/s390/mm/mmap.c b/arch/s390/mm/mmap.c
index fc9a7dc26c5e..cd52d72b59cf 100644
--- a/arch/s390/mm/mmap.c
+++ b/arch/s390/mm/mmap.c
@@ -182,10 +182,10 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
 	 */
 	if (mmap_is_legacy(rlim_stack)) {
 		mm->mmap_base = mmap_base_legacy(random_factor);
-		mm->get_unmapped_area = arch_get_unmapped_area;
+		clear_bit(MMF_TOPDOWN, &mm->flags);
 	} else {
 		mm->mmap_base = mmap_base(random_factor, rlim_stack);
-		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
+		set_bit(MMF_TOPDOWN, &mm->flags);
 	}
 }
 
diff --git a/arch/sparc/kernel/sys_sparc_64.c b/arch/sparc/kernel/sys_sparc_64.c
index 1e9a9e016237..1dbf7211666e 100644
--- a/arch/sparc/kernel/sys_sparc_64.c
+++ b/arch/sparc/kernel/sys_sparc_64.c
@@ -218,14 +218,10 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 unsigned long get_fb_unmapped_area(struct file *filp, unsigned long orig_addr, unsigned long len, unsigned long pgoff, unsigned long flags)
 {
 	unsigned long align_goal, addr = -ENOMEM;
-	unsigned long (*get_area)(struct file *, unsigned long,
-				  unsigned long, unsigned long, unsigned long);
-
-	get_area = current->mm->get_unmapped_area;
 
 	if (flags & MAP_FIXED) {
 		/* Ok, don't mess with it. */
-		return get_area(NULL, orig_addr, len, pgoff, flags);
+		return mm_get_unmapped_area(current->mm, NULL, orig_addr, len, pgoff, flags);
 	}
 	flags &= ~MAP_SHARED;
 
@@ -238,7 +234,8 @@ unsigned long get_fb_unmapped_area(struct file *filp, unsigned long orig_addr, u
 		align_goal = (64UL * 1024);
 
 	do {
-		addr = get_area(NULL, orig_addr, len + (align_goal - PAGE_SIZE), pgoff, flags);
+		addr = mm_get_unmapped_area(current->mm, NULL, orig_addr,
+					    len + (align_goal - PAGE_SIZE), pgoff, flags);
 		if (!(addr & ~PAGE_MASK)) {
 			addr = (addr + (align_goal - 1UL)) & ~(align_goal - 1UL);
 			break;
@@ -256,7 +253,7 @@ unsigned long get_fb_unmapped_area(struct file *filp, unsigned long orig_addr, u
 	 * be obtained.
 	 */
 	if (addr & ~PAGE_MASK)
-		addr = get_area(NULL, orig_addr, len, pgoff, flags);
+		addr = mm_get_unmapped_area(current->mm, NULL, orig_addr, len, pgoff, flags);
 
 	return addr;
 }
@@ -292,7 +289,7 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
 	    gap == RLIM_INFINITY ||
 	    sysctl_legacy_va_layout) {
 		mm->mmap_base = TASK_UNMAPPED_BASE + random_factor;
-		mm->get_unmapped_area = arch_get_unmapped_area;
+		clear_bit(MMF_TOPDOWN, &mm->flags);
 	} else {
 		/* We know it's 32-bit */
 		unsigned long task_size = STACK_TOP32;
@@ -303,7 +300,7 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
 			gap = (task_size / 6 * 5);
 
 		mm->mmap_base = PAGE_ALIGN(task_size - gap - random_factor);
-		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
+		set_bit(MMF_TOPDOWN, &mm->flags);
 	}
 }
 
diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c
index b432500c13a5..38a1bef47efb 100644
--- a/arch/sparc/mm/hugetlbpage.c
+++ b/arch/sparc/mm/hugetlbpage.c
@@ -123,7 +123,7 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 		    (!vma || addr + len <= vm_start_gap(vma)))
 			return addr;
 	}
-	if (mm->get_unmapped_area == arch_get_unmapped_area)
+	if (!test_bit(MMF_TOPDOWN, &mm->flags))
 		return hugetlb_get_unmapped_area_bottomup(file, addr, len,
 				pgoff, flags);
 	else
diff --git a/arch/x86/kernel/cpu/sgx/driver.c b/arch/x86/kernel/cpu/sgx/driver.c
index 262f5fb18d74..22b65a5f5ec6 100644
--- a/arch/x86/kernel/cpu/sgx/driver.c
+++ b/arch/x86/kernel/cpu/sgx/driver.c
@@ -113,7 +113,7 @@ static unsigned long sgx_get_unmapped_area(struct file *file,
 	if (flags & MAP_FIXED)
 		return addr;
 
-	return current->mm->get_unmapped_area(file, addr, len, pgoff, flags);
+	return mm_get_unmapped_area(current->mm, file, addr, len, pgoff, flags);
 }
 
 #ifdef CONFIG_COMPAT
diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
index 5804bbae4f01..6d77c0039617 100644
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -141,7 +141,7 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 	}
 
 get_unmapped_area:
-	if (mm->get_unmapped_area == arch_get_unmapped_area)
+	if (!test_bit(MMF_TOPDOWN, &mm->flags))
 		return hugetlb_get_unmapped_area_bottomup(file, addr, len,
 				pgoff, flags);
 	else
diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c
index c90c20904a60..a2cabb1c81e1 100644
--- a/arch/x86/mm/mmap.c
+++ b/arch/x86/mm/mmap.c
@@ -129,9 +129,9 @@ static void arch_pick_mmap_base(unsigned long *base, unsigned long *legacy_base,
 void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
 {
 	if (mmap_is_legacy())
-		mm->get_unmapped_area = arch_get_unmapped_area;
+		clear_bit(MMF_TOPDOWN, &mm->flags);
 	else
-		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
+		set_bit(MMF_TOPDOWN, &mm->flags);
 
 	arch_pick_mmap_base(&mm->mmap_base, &mm->mmap_legacy_base,
 			arch_rnd(mmap64_rnd_bits), task_size_64bit(0),
diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index 3c6670cf905f..9b80e622ae80 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -544,7 +544,7 @@ static unsigned long get_unmapped_area_zero(struct file *file,
 	}
 
 	/* Otherwise flags & MAP_PRIVATE: with no shmem object beneath it */
-	return current->mm->get_unmapped_area(file, addr, len, pgoff, flags);
+	return mm_get_unmapped_area(current->mm, file, addr, len, pgoff, flags);
 #else
 	return -ENOSYS;
 #endif
diff --git a/drivers/dax/device.c b/drivers/dax/device.c
index 93ebedc5ec8c..47c126d37b59 100644
--- a/drivers/dax/device.c
+++ b/drivers/dax/device.c
@@ -329,14 +329,14 @@ static unsigned long dax_get_unmapped_area(struct file *filp,
 	if ((off + len_align) < off)
 		goto out;
 
-	addr_align = current->mm->get_unmapped_area(filp, addr, len_align,
-			pgoff, flags);
+	addr_align = mm_get_unmapped_area(current->mm, filp, addr, len_align,
+					  pgoff, flags);
 	if (!IS_ERR_VALUE(addr_align)) {
 		addr_align += (off - addr_align) & (align - 1);
 		return addr_align;
 	}
 out:
-	return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
+	return mm_get_unmapped_area(current->mm, filp, addr, len, pgoff, flags);
 }
 
 static const struct address_space_operations dev_dax_aops = {
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index d746866ae3b6..cd87ea5944a1 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -249,11 +249,11 @@ generic_hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 	}
 
 	/*
-	 * Use mm->get_unmapped_area value as a hint to use topdown routine.
+	 * Use MMF_TOPDOWN flag as a hint to use topdown routine.
 	 * If architectures have special needs, they should define their own
 	 * version of hugetlb_get_unmapped_area.
 	 */
-	if (mm->get_unmapped_area == arch_get_unmapped_area_topdown)
+	if (test_bit(MMF_TOPDOWN, &mm->flags))
 		return hugetlb_get_unmapped_area_topdown(file, addr, len,
 				pgoff, flags);
 	return hugetlb_get_unmapped_area_bottomup(file, addr, len,
diff --git a/fs/proc/inode.c b/fs/proc/inode.c
index 05350f3c2812..017144a8516c 100644
--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -451,15 +451,16 @@ pde_get_unmapped_area(struct proc_dir_entry *pde, struct file *file, unsigned lo
 			   unsigned long len, unsigned long pgoff,
 			   unsigned long flags)
 {
-	typeof_member(struct proc_ops, proc_get_unmapped_area) get_area;
-
-	get_area = pde->proc_ops->proc_get_unmapped_area;
+	if (pde->proc_ops->proc_get_unmapped_area)
+		return pde->proc_ops->proc_get_unmapped_area(file, orig_addr,
+							     len, pgoff,
+							     flags);
 #ifdef CONFIG_MMU
-	if (!get_area)
-		get_area = current->mm->get_unmapped_area;
+	else
+		return mm_get_unmapped_area(current->mm, file, orig_addr,
+					    len, pgoff, flags);
 #endif
-	if (get_area)
-		return get_area(file, orig_addr, len, pgoff, flags);
+
 	return orig_addr;
 }
 
diff --git a/fs/ramfs/file-mmu.c b/fs/ramfs/file-mmu.c
index c7a1aa3c882b..b45c7edc3225 100644
--- a/fs/ramfs/file-mmu.c
+++ b/fs/ramfs/file-mmu.c
@@ -35,7 +35,7 @@ static unsigned long ramfs_mmu_get_unmapped_area(struct file *file,
 		unsigned long addr, unsigned long len, unsigned long pgoff,
 		unsigned long flags)
 {
-	return current->mm->get_unmapped_area(file, addr, len, pgoff, flags);
+	return mm_get_unmapped_area(current->mm, file, addr, len, pgoff, flags);
 }
 
 const struct file_operations ramfs_file_operations = {
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 8b611e13153e..d20869881214 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -749,11 +749,7 @@ struct mm_struct {
 		} ____cacheline_aligned_in_smp;
 
 		struct maple_tree mm_mt;
-#ifdef CONFIG_MMU
-		unsigned long (*get_unmapped_area) (struct file *filp,
-				unsigned long addr, unsigned long len,
-				unsigned long pgoff, unsigned long flags);
-#endif
+
 		unsigned long mmap_base;	/* base of mmap area */
 		unsigned long mmap_legacy_base;	/* base of mmap area in bottom-up allocations */
 #ifdef CONFIG_HAVE_ARCH_COMPAT_MMAP_BASES
diff --git a/include/linux/sched/coredump.h b/include/linux/sched/coredump.h
index 02f5090ffea2..e62ff805cfc9 100644
--- a/include/linux/sched/coredump.h
+++ b/include/linux/sched/coredump.h
@@ -92,9 +92,12 @@ static inline int get_dumpable(struct mm_struct *mm)
 #define MMF_VM_MERGE_ANY	30
 #define MMF_VM_MERGE_ANY_MASK	(1 << MMF_VM_MERGE_ANY)
 
+#define MMF_TOPDOWN		31	/* mm searches top down by default */
+#define MMF_TOPDOWN_MASK	(1 << MMF_TOPDOWN)
+
 #define MMF_INIT_MASK		(MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK |\
 				 MMF_DISABLE_THP_MASK | MMF_HAS_MDWE_MASK |\
-				 MMF_VM_MERGE_ANY_MASK)
+				 MMF_VM_MERGE_ANY_MASK | MMF_TOPDOWN_MASK)
 
 static inline unsigned long mmf_init_flags(unsigned long flags)
 {
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index 9a19f1b42f64..cde946e926d8 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -8,6 +8,7 @@
 #include
 #include
 #include
+#include
 
 /*
  * Routines for handling mm_structs
@@ -186,6 +187,10 @@ arch_get_unmapped_area_topdown(struct file *filp, unsigned long addr,
 			  unsigned long len, unsigned long pgoff,
 			  unsigned long flags);
 
+unsigned long mm_get_unmapped_area(struct mm_struct *mm, struct file *filp,
+				   unsigned long addr, unsigned long len,
+				   unsigned long pgoff, unsigned long flags);
+
 unsigned long
 generic_get_unmapped_area(struct file *filp, unsigned long addr,
 			  unsigned long len, unsigned long pgoff,
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index cd9a137ad6ce..9eb3b2587031 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3513,7 +3513,7 @@ static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
 #else
 	addr = 0UL;
 #endif
-	return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
+	return mm_get_unmapped_area(current->mm, filp, addr, len, pgoff, flags);
 }
 
 #else /* !CONFIG_MMU */
diff --git a/mm/debug.c b/mm/debug.c
index ee533a5ceb79..32db5de8e1e7 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -162,9 +162,6 @@ EXPORT_SYMBOL(dump_vma);
 void dump_mm(const struct mm_struct *mm)
 {
 	pr_emerg("mm %px task_size %lu\n"
-#ifdef CONFIG_MMU
-		"get_unmapped_area %px\n"
-#endif
 		"mmap_base %lu mmap_legacy_base %lu\n"
 		"pgd %px mm_users %d mm_count %d pgtables_bytes %lu map_count %d\n"
 		"hiwater_rss %lx hiwater_vm %lx total_vm %lx locked_vm %lx\n"
@@ -190,9 +187,6 @@ void dump_mm(const struct mm_struct *mm)
 		"def_flags: %#lx(%pGv)\n",
 
 		mm, mm->task_size,
-#ifdef CONFIG_MMU
-		mm->get_unmapped_area,
-#endif
 		mm->mmap_base, mm->mmap_legacy_base,
 		mm->pgd, atomic_read(&mm->mm_users),
 		atomic_read(&mm->mm_count),
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 94c958f7ebb5..bc3bf441e768 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -822,8 +822,8 @@ static unsigned long __thp_get_unmapped_area(struct file *filp,
 	if (len_pad < len || (off + len_pad) < off)
 		return 0;
 
-	ret = current->mm->get_unmapped_area(filp, addr, len_pad,
-					      off >> PAGE_SHIFT, flags);
+	ret = mm_get_unmapped_area(current->mm, filp, addr, len_pad,
+				   off >> PAGE_SHIFT, flags);
 
 	/*
 	 * The failure might be due to length padding. The caller will retry
@@ -841,8 +841,7 @@ static unsigned long __thp_get_unmapped_area(struct file *filp,
 
 	off_sub = (off - ret) & (size - 1);
 
-	if (current->mm->get_unmapped_area == arch_get_unmapped_area_topdown &&
-	    !off_sub)
+	if (test_bit(MMF_TOPDOWN, &current->mm->flags) && !off_sub)
 		return ret + size;
 
 	ret += off_sub;
@@ -859,7 +858,7 @@ unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
 	if (ret)
 		return ret;
 
-	return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
+	return mm_get_unmapped_area(current->mm, filp, addr, len, pgoff, flags);
 }
 EXPORT_SYMBOL_GPL(thp_get_unmapped_area);
 
diff --git a/mm/mmap.c b/mm/mmap.c
index 3281287771c9..39e9a3ae3ca5 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1815,7 +1815,8 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 		unsigned long pgoff, unsigned long flags)
 {
 	unsigned long (*get_area)(struct file *, unsigned long,
-				  unsigned long, unsigned long, unsigned long);
+				  unsigned long, unsigned long, unsigned long)
+				  = NULL;
 
 	unsigned long error = arch_mmap_check(addr, len, flags);
 	if (error)
@@ -1825,7 +1826,6 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 	if (len > TASK_SIZE)
 		return -ENOMEM;
 
-	get_area = current->mm->get_unmapped_area;
 	if (file) {
 		if (file->f_op->get_unmapped_area)
 			get_area = file->f_op->get_unmapped_area;
@@ -1844,7 +1844,11 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 	if (!file)
 		pgoff = 0;
 
-	addr = get_area(file, addr, len, pgoff, flags);
+	if (get_area)
+		addr = get_area(file, addr, len, pgoff, flags);
+	else
+		addr = mm_get_unmapped_area(current->mm, file, addr, len,
+					    pgoff, flags);
 	if (IS_ERR_VALUE(addr))
 		return addr;
 
@@ -1859,6 +1863,17 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 
 EXPORT_SYMBOL(get_unmapped_area);
 
+unsigned long
+mm_get_unmapped_area(struct mm_struct *mm, struct file *file,
+		     unsigned long addr, unsigned long len,
+		     unsigned long pgoff, unsigned long flags)
+{
+	if (test_bit(MMF_TOPDOWN, &mm->flags))
+		return arch_get_unmapped_area_topdown(file, addr, len, pgoff, flags);
+	return arch_get_unmapped_area(file, addr, len, pgoff, flags);
+}
+EXPORT_SYMBOL(mm_get_unmapped_area);
+
 /**
  * find_vma_intersection() - Look up the first VMA which intersects the interval
  * @mm: The process address space.
diff --git a/mm/shmem.c b/mm/shmem.c
index d7c84ff62186..5452065faa46 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2240,8 +2240,6 @@ unsigned long shmem_get_unmapped_area(struct file *file,
 				      unsigned long uaddr, unsigned long len,
 				      unsigned long pgoff, unsigned long flags)
 {
-	unsigned long (*get_area)(struct file *,
-		unsigned long, unsigned long, unsigned long, unsigned long);
 	unsigned long addr;
 	unsigned long offset;
 	unsigned long inflated_len;
@@ -2251,8 +2249,8 @@ unsigned long shmem_get_unmapped_area(struct file *file,
 	if (len > TASK_SIZE)
 		return -ENOMEM;
 
-	get_area = current->mm->get_unmapped_area;
-	addr = get_area(file, uaddr, len, pgoff, flags);
+	addr = mm_get_unmapped_area(current->mm, file, uaddr, len, pgoff,
+				    flags);
 
 	if (!IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE))
 		return addr;
@@ -2309,7 +2307,8 @@ unsigned long shmem_get_unmapped_area(struct file *file,
 	if (inflated_len < len)
 		return addr;
 
-	inflated_addr = get_area(NULL, uaddr, inflated_len, 0, flags);
+	inflated_addr = mm_get_unmapped_area(current->mm, NULL, uaddr,
+					     inflated_len, 0, flags);
 	if (IS_ERR_VALUE(inflated_addr))
 		return addr;
 	if (inflated_addr & ~PAGE_MASK)
@@ -4755,7 +4754,7 @@ unsigned long shmem_get_unmapped_area(struct file *file,
 				      unsigned long addr, unsigned long len,
 				      unsigned long pgoff, unsigned long flags)
 {
-	return current->mm->get_unmapped_area(file, addr, len, pgoff, flags);
+	return mm_get_unmapped_area(current->mm, file, addr, len, pgoff, flags);
 }
 #endif
 
diff --git a/mm/util.c b/mm/util.c
index 5a6a9802583b..2b959553f9ce 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -452,17 +452,17 @@ void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
 
 	if (mmap_is_legacy(rlim_stack)) {
 		mm->mmap_base = TASK_UNMAPPED_BASE + random_factor;
-		mm->get_unmapped_area = arch_get_unmapped_area;
+		clear_bit(MMF_TOPDOWN, &mm->flags);
 	} else {
 		mm->mmap_base = mmap_base(random_factor, rlim_stack);
-		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
+		set_bit(MMF_TOPDOWN, &mm->flags);
 	}
 }
 #elif defined(CONFIG_MMU) && !defined(HAVE_ARCH_PICK_MMAP_LAYOUT)
 void arch_pick_mmap_layout(struct mm_struct *mm, struct rlimit *rlim_stack)
 {
 	mm->mmap_base = TASK_UNMAPPED_BASE;
-	mm->get_unmapped_area = arch_get_unmapped_area;
+	clear_bit(MMF_TOPDOWN, &mm->flags);
 }
 #endif
 
-- 
2.34.1

From: Rick Edgecombe
To: Liam.Howlett@oracle.com, akpm@linux-foundation.org, bp@alien8.de,
	broonie@kernel.org, dave.hansen@linux.intel.com, debug@rivosinc.com,
	hpa@zytor.com, keescook@chromium.org, kirill.shutemov@linux.intel.com,
	luto@kernel.org, mingo@redhat.com, peterz@infradead.org,
	tglx@linutronix.de, x86@kernel.org, christophe.leroy@csgroup.eu
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	rick.p.edgecombe@intel.com
Subject: [PATCH v3 02/12] mm: Introduce arch_get_unmapped_area_vmflags()
Date: Tue, 12 Mar 2024 15:28:33 -0700
Message-Id: <20240312222843.2505560-3-rick.p.edgecombe@intel.com>
In-Reply-To: <20240312222843.2505560-1-rick.p.edgecombe@intel.com>
References: <20240312222843.2505560-1-rick.p.edgecombe@intel.com>

When memory is being placed, mmap() will take care to respect the guard
gaps of certain types of memory (VM_SHADOWSTACK, VM_GROWSUP and
VM_GROWSDOWN). In order to ensure guard gaps between mappings, mmap()
needs to consider two things:
 1. That the new mapping isn’t placed in any existing mapping’s guard
    gaps.
 2. That the new mapping isn’t placed such that any existing mappings
    are in *its* guard gaps.

The long-standing behavior of mmap() is to ensure 1, but not take any care
around 2. So for example, if there is a PAGE_SIZE free area, and a
mmap() with a PAGE_SIZE size, and a type that has a guard gap is being
placed, mmap() may place the shadow stack in the PAGE_SIZE free area. Then
the mapping that is supposed to have a guard gap will not have a gap to
the adjacent VMA.

In order to take the start gap into account, the maple tree search needs
to know the size of start gap the new mapping will need.
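As a rough illustration of conditions 1 and 2 above, the fit check can be
modeled in a few lines of standalone C. This is only a userspace sketch, not
kernel code: struct hole, fits_legacy(), fits_with_start_gap() and
SKETCH_PAGE_SIZE are hypothetical names introduced here, and the hole is
assumed to already exclude the guard gaps of the neighboring mappings.

```c
#include <stdbool.h>

/* Hypothetical page size for the sketch; the real value is arch dependent. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Hypothetical model of a free area between two existing mappings.
 * Assumption: [start, end) already excludes the neighbors' own guard
 * gaps, which is what condition 1 provides.
 */
struct hole {
	unsigned long start;
	unsigned long end;
};

/* Condition 1 only: the new mapping fits if its length fits in the hole. */
static bool fits_legacy(const struct hole *h, unsigned long len)
{
	return h->end - h->start >= len;
}

/*
 * Conditions 1 and 2: additionally reserve the new mapping's own start
 * gap inside the hole, so no existing mapping ends up in that gap.
 */
static bool fits_with_start_gap(const struct hole *h, unsigned long len,
				unsigned long start_gap)
{
	unsigned long size = h->end - h->start;

	return size >= start_gap && size - start_gap >= len;
}
```

With a one page hole and a one page request that needs a one page start gap,
fits_legacy() accepts the hole while fits_with_start_gap() rejects it, which
is the PAGE_SIZE case described above.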
The call chain from do_mmap() to the actual maple tree search looks like this: do_mmap(size, vm_flags, map_flags, ..) mm/mmap.c:get_unmapped_area(size, map_flags, ...) arch_get_unmapped_area(size, map_flags, ...) vm_unmapped_area(struct vm_unmapped_area_info) One option would be to add another MAP_ flag to mean a one page start gap (as is for shadow stack), but this consumes a flag unnecessarily. Another option could be to simply increase the size passed in do_mmap() by the start gap size, and adjust after the fact, but this will interfere with the alignment requirements passed in struct vm_unmapped_area_info, and unknown to mmap.c. Instead, introduce variants of arch_get_unmapped_area/_topdown() that take vm_flags. In future changes, these variants can be used in mmap.c:get_unmapped_area() to allow the vm_flags to be passed through to vm_unmapped_area(), while preserving the normal arch_get_unmapped_area/_topdown() for the existing callers. Signed-off-by: Rick Edgecombe --- include/linux/sched/mm.h | 17 +++++++++++++++++ mm/mmap.c | 28 ++++++++++++++++++++++++++++ 2 files changed, 45 insertions(+) diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h index cde946e926d8..7b44441865c5 100644 --- a/include/linux/sched/mm.h +++ b/include/linux/sched/mm.h @@ -191,6 +191,23 @@ unsigned long mm_get_unmapped_area(struct mm_struct *m= m, struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags); =20 +extern unsigned long +arch_get_unmapped_area_vmflags(struct file *filp, unsigned long addr, + unsigned long len, unsigned long pgoff, + unsigned long flags, vm_flags_t vm_flags); +extern unsigned long +arch_get_unmapped_area_topdown_vmflags(struct file *filp, unsigned long ad= dr, + unsigned long len, unsigned long pgoff, + unsigned long flags, vm_flags_t); + +unsigned long mm_get_unmapped_area_vmflags(struct mm_struct *mm, + struct file *filp, + unsigned long addr, + unsigned long len, + unsigned long pgoff, + unsigned long 
flags, + vm_flags_t vm_flags); + unsigned long generic_get_unmapped_area(struct file *filp, unsigned long addr, unsigned long len, unsigned long pgoff, diff --git a/mm/mmap.c b/mm/mmap.c index 39e9a3ae3ca5..e23ce8ca24c9 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -1810,6 +1810,34 @@ arch_get_unmapped_area_topdown(struct file *filp, un= signed long addr, } #endif =20 +#ifndef HAVE_ARCH_UNMAPPED_AREA_VMFLAGS +extern unsigned long +arch_get_unmapped_area_vmflags(struct file *filp, unsigned long addr, unsi= gned long len, + unsigned long pgoff, unsigned long flags, vm_flags_t vm_flags) +{ + return arch_get_unmapped_area(filp, addr, len, pgoff, flags); +} + +extern unsigned long +arch_get_unmapped_area_topdown_vmflags(struct file *filp, unsigned long ad= dr, + unsigned long len, unsigned long pgoff, + unsigned long flags, vm_flags_t vm_flags) +{ + return arch_get_unmapped_area_topdown(filp, addr, len, pgoff, flags); +} +#endif + +unsigned long mm_get_unmapped_area_vmflags(struct mm_struct *mm, struct fi= le *filp, + unsigned long addr, unsigned long len, + unsigned long pgoff, unsigned long flags, + vm_flags_t vm_flags) +{ + if (test_bit(MMF_TOPDOWN, &mm->flags)) + return arch_get_unmapped_area_topdown_vmflags(filp, addr, len, pgoff, + flags, vm_flags); + return arch_get_unmapped_area_vmflags(filp, addr, len, pgoff, flags, vm_f= lags); +} + unsigned long get_unmapped_area(struct file *file, unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags) --=20 2.34.1 From nobody Sun Feb 8 12:38:38 2026 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.15]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C362D14402C for ; Tue, 12 Mar 2024 22:29:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.15 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1710282548; 
From: Rick Edgecombe
To: Liam.Howlett@oracle.com, akpm@linux-foundation.org, bp@alien8.de,
	broonie@kernel.org, dave.hansen@linux.intel.com, debug@rivosinc.com,
	hpa@zytor.com, keescook@chromium.org, kirill.shutemov@linux.intel.com,
	luto@kernel.org, mingo@redhat.com, peterz@infradead.org,
	tglx@linutronix.de, x86@kernel.org, christophe.leroy@csgroup.eu
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, rick.p.edgecombe@intel.com
Subject: [PATCH v3 03/12] mm: Use get_unmapped_area_vmflags()
Date: Tue, 12 Mar 2024 15:28:34 -0700
Message-Id: <20240312222843.2505560-4-rick.p.edgecombe@intel.com>
In-Reply-To: <20240312222843.2505560-1-rick.p.edgecombe@intel.com>
References: <20240312222843.2505560-1-rick.p.edgecombe@intel.com>

When memory is being placed, mmap() will take care to respect the guard
gaps of certain types of memory (VM_SHADOWSTACK, VM_GROWSUP and
VM_GROWSDOWN). In order to ensure guard gaps between mappings, mmap()
needs to consider two things:
 1. That the new mapping isn’t placed in any existing mapping’s guard
    gaps.
 2. That the new mapping isn’t placed such that any existing mappings
    are in *its* guard gaps.

The long standing behavior of mmap() is to ensure 1, but not take any care
around 2.
So for example, if there is a PAGE_SIZE free area, and an mmap() of
PAGE_SIZE size with a type that has a guard gap is being placed, mmap()
may place the shadow stack in the PAGE_SIZE free area. Then the mapping
that is supposed to have a guard gap will not have a gap to the adjacent
VMA.

Use mm_get_unmapped_area_vmflags() in do_mmap() so future changes can
cause shadow stack mappings to be placed with a guard gap. Also use the
THP variant that takes vm_flags, such that THP shadow stack can get the
same treatment. Adjust the vm_flags calculation to happen earlier so that
the vm_flags can be passed into __get_unmapped_area().

Signed-off-by: Rick Edgecombe
Reviewed-by: Christophe Leroy
---
v2:
 - Make get_unmapped_area() a static inline (Kirill)
---
 include/linux/mm.h | 11 ++++++++++-
 mm/mmap.c          | 34 ++++++++++++++++------------------
 2 files changed, 26 insertions(+), 19 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index f5a97dec5169..d91cde79aaee 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3363,7 +3363,16 @@ extern int install_special_mapping(struct mm_struct *mm,
 unsigned long randomize_stack_top(unsigned long stack_top);
 unsigned long randomize_page(unsigned long start, unsigned long range);
 
-extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
+unsigned long
+__get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
+		    unsigned long pgoff, unsigned long flags, vm_flags_t vm_flags);
+
+static inline unsigned long
+get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
+		  unsigned long pgoff, unsigned long flags)
+{
+	return __get_unmapped_area(file, addr, len, pgoff, flags, 0);
+}
 
 extern unsigned long mmap_region(struct file *file, unsigned long addr,
 	unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
diff --git a/mm/mmap.c b/mm/mmap.c
index e23ce8ca24c9..a3128ed26676 100644
--- a/mm/mmap.c
+++
b/mm/mmap.c
@@ -1257,18 +1257,6 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 	if (mm->map_count > sysctl_max_map_count)
 		return -ENOMEM;
 
-	/* Obtain the address to map to. we verify (or select) it and ensure
-	 * that it represents a valid section of the address space.
-	 */
-	addr = get_unmapped_area(file, addr, len, pgoff, flags);
-	if (IS_ERR_VALUE(addr))
-		return addr;
-
-	if (flags & MAP_FIXED_NOREPLACE) {
-		if (find_vma_intersection(mm, addr, addr + len))
-			return -EEXIST;
-	}
-
 	if (prot == PROT_EXEC) {
 		pkey = execute_only_pkey(mm);
 		if (pkey < 0)
@@ -1282,6 +1270,18 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 	vm_flags |= calc_vm_prot_bits(prot, pkey) | calc_vm_flag_bits(flags) |
 			mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
 
+	/* Obtain the address to map to. we verify (or select) it and ensure
+	 * that it represents a valid section of the address space.
+	 */
+	addr = __get_unmapped_area(file, addr, len, pgoff, flags, vm_flags);
+	if (IS_ERR_VALUE(addr))
+		return addr;
+
+	if (flags & MAP_FIXED_NOREPLACE) {
+		if (find_vma_intersection(mm, addr, addr + len))
+			return -EEXIST;
+	}
+
 	if (flags & MAP_LOCKED)
 		if (!can_do_mlock())
 			return -EPERM;
@@ -1839,8 +1839,8 @@ unsigned long mm_get_unmapped_area_vmflags(struct mm_struct *mm, struct file *fi
 }
 
 unsigned long
-get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
-		unsigned long pgoff, unsigned long flags)
+__get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
+		unsigned long pgoff, unsigned long flags, vm_flags_t vm_flags)
 {
 	unsigned long (*get_area)(struct file *, unsigned long,
 				  unsigned long, unsigned long, unsigned long)
@@ -1875,8 +1875,8 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 	if (get_area)
 		addr = get_area(file, addr, len, pgoff, flags);
 	else
-		addr = mm_get_unmapped_area(current->mm, file, addr, len,
-				pgoff, flags);
+		addr =
mm_get_unmapped_area_vmflags(current->mm, file, addr, len,
+				pgoff, flags, vm_flags);
 	if (IS_ERR_VALUE(addr))
 		return addr;
 
@@ -1889,8 +1889,6 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 	return error ? error : addr;
 }
 
-EXPORT_SYMBOL(get_unmapped_area);
-
 unsigned long
 mm_get_unmapped_area(struct mm_struct *mm, struct file *file,
 		     unsigned long addr, unsigned long len,
-- 
2.34.1

From nobody Sun Feb 8 12:38:38 2026
From: Rick Edgecombe
To: Liam.Howlett@oracle.com, akpm@linux-foundation.org, bp@alien8.de,
	broonie@kernel.org, dave.hansen@linux.intel.com, debug@rivosinc.com,
	hpa@zytor.com, keescook@chromium.org, kirill.shutemov@linux.intel.com,
	luto@kernel.org, mingo@redhat.com, peterz@infradead.org,
	tglx@linutronix.de, x86@kernel.org, christophe.leroy@csgroup.eu
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, rick.p.edgecombe@intel.com
Subject: [PATCH v3 04/12] thp: Add thp_get_unmapped_area_vmflags()
Date: Tue, 12 Mar 2024 15:28:35 -0700
Message-Id: <20240312222843.2505560-5-rick.p.edgecombe@intel.com>
In-Reply-To:
<20240312222843.2505560-1-rick.p.edgecombe@intel.com>
References: <20240312222843.2505560-1-rick.p.edgecombe@intel.com>

When memory is being placed, mmap() will take care to respect the guard
gaps of certain types of memory (VM_SHADOWSTACK, VM_GROWSUP and
VM_GROWSDOWN). In order to ensure guard gaps between mappings, mmap()
needs to consider two things:
 1. That the new mapping isn’t placed in any existing mapping’s guard
    gaps.
 2. That the new mapping isn’t placed such that any existing mappings
    are in *its* guard gaps.

The long standing behavior of mmap() is to ensure 1, but not take any care
around 2. So for example, if there is a PAGE_SIZE free area, and an mmap()
of PAGE_SIZE size with a type that has a guard gap is being placed,
mmap() may place the shadow stack in the PAGE_SIZE free area. Then the
mapping that is supposed to have a guard gap will not have a gap to the
adjacent VMA.

Add a THP implementation of the vm_flags variant of get_unmapped_area().
Future changes will call this from mmap.c in the do_mmap() path to allow
shadow stacks to be placed with consideration taken for the start guard
gap. Shadow stack memory is always private and anonymous and so special
guard gap logic is not needed in a lot of cases, but it can be mapped by
THP, so needs to be handled.
Signed-off-by: Rick Edgecombe
Reviewed-by: Christophe Leroy
---
 include/linux/huge_mm.h | 11 +++++++++++
 mm/huge_memory.c        | 23 ++++++++++++++++-------
 mm/mmap.c               | 12 +++++++-----
 3 files changed, 34 insertions(+), 12 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 5adb86af35fc..8744c808d380 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -262,6 +262,9 @@ unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma,
 
 unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
 		unsigned long len, unsigned long pgoff, unsigned long flags);
+unsigned long thp_get_unmapped_area_vmflags(struct file *filp, unsigned long addr,
+		unsigned long len, unsigned long pgoff, unsigned long flags,
+		vm_flags_t vm_flags);
 
 void folio_prep_large_rmappable(struct folio *folio);
 bool can_split_folio(struct folio *folio, int *pextra_pins);
@@ -416,6 +419,14 @@ static inline void folio_prep_large_rmappable(struct folio *folio) {}
 
 #define thp_get_unmapped_area	NULL
 
+static inline unsigned long
+thp_get_unmapped_area_vmflags(struct file *filp, unsigned long addr,
+			      unsigned long len, unsigned long pgoff,
+			      unsigned long flags, vm_flags_t vm_flags)
+{
+	return 0;
+}
+
 static inline bool
 can_split_folio(struct folio *folio, int *pextra_pins)
 {
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index bc3bf441e768..349c93a1a7c3 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -806,7 +806,8 @@ static inline bool is_transparent_hugepage(struct folio *folio)
 
 static unsigned long __thp_get_unmapped_area(struct file *filp,
 		unsigned long addr, unsigned long len,
-		loff_t off, unsigned long flags, unsigned long size)
+		loff_t off, unsigned long flags, unsigned long size,
+		vm_flags_t vm_flags)
 {
 	loff_t off_end = off + len;
 	loff_t off_align = round_up(off, size);
@@ -822,8 +823,8 @@ static unsigned long __thp_get_unmapped_area(struct file *filp,
 	if (len_pad < len || (off + len_pad) < off)
 		return 0;
 
-	ret = mm_get_unmapped_area(current->mm, filp, addr, len_pad,
-				   off >> PAGE_SHIFT, flags);
+	ret = mm_get_unmapped_area_vmflags(current->mm, filp, addr, len_pad,
+					   off >> PAGE_SHIFT, flags, vm_flags);
 
 	/*
 	 * The failure might be due to length padding. The caller will retry
@@ -848,17 +849,25 @@ static unsigned long __thp_get_unmapped_area(struct file *filp,
 	return ret;
 }
 
-unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
-		unsigned long len, unsigned long pgoff, unsigned long flags)
+unsigned long thp_get_unmapped_area_vmflags(struct file *filp, unsigned long addr,
+		unsigned long len, unsigned long pgoff, unsigned long flags,
+		vm_flags_t vm_flags)
 {
 	unsigned long ret;
 	loff_t off = (loff_t)pgoff << PAGE_SHIFT;
 
-	ret = __thp_get_unmapped_area(filp, addr, len, off, flags, PMD_SIZE);
+	ret = __thp_get_unmapped_area(filp, addr, len, off, flags, PMD_SIZE, vm_flags);
 	if (ret)
 		return ret;
 
-	return mm_get_unmapped_area(current->mm, filp, addr, len, pgoff, flags);
+	return mm_get_unmapped_area_vmflags(current->mm, filp, addr, len, pgoff, flags,
+					    vm_flags);
+}
+
+unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
+		unsigned long len, unsigned long pgoff, unsigned long flags)
+{
+	return thp_get_unmapped_area_vmflags(filp, addr, len, pgoff, flags, 0);
 }
 EXPORT_SYMBOL_GPL(thp_get_unmapped_area);
 
diff --git a/mm/mmap.c b/mm/mmap.c
index a3128ed26676..68381b90f906 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1863,20 +1863,22 @@ __get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 		 * so use shmem's get_unmapped_area in case it can be huge.
 		 */
 		get_area = shmem_get_unmapped_area;
-	} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
-		/* Ensures that larger anonymous mappings are THP aligned. */
-		get_area = thp_get_unmapped_area;
 	}
 
 	/* Always treat pgoff as zero for anonymous memory.
*/
 	if (!file)
 		pgoff = 0;
 
-	if (get_area)
+	if (get_area) {
 		addr = get_area(file, addr, len, pgoff, flags);
-	else
+	} else if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
+		/* Ensures that larger anonymous mappings are THP aligned. */
+		addr = thp_get_unmapped_area_vmflags(file, addr, len,
+						     pgoff, flags, vm_flags);
+	} else {
 		addr = mm_get_unmapped_area_vmflags(current->mm, file, addr, len,
 				pgoff, flags, vm_flags);
+	}
 	if (IS_ERR_VALUE(addr))
 		return addr;
 
-- 
2.34.1

From nobody Sun Feb 8 12:38:38 2026
From: Rick Edgecombe
To: Liam.Howlett@oracle.com, akpm@linux-foundation.org, bp@alien8.de,
	broonie@kernel.org, dave.hansen@linux.intel.com, debug@rivosinc.com,
	hpa@zytor.com, keescook@chromium.org, kirill.shutemov@linux.intel.com,
	luto@kernel.org, mingo@redhat.com, peterz@infradead.org,
	tglx@linutronix.de, x86@kernel.org, christophe.leroy@csgroup.eu
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, rick.p.edgecombe@intel.com,
	Guo Ren, linux-csky@vger.kernel.org
Subject: [PATCH v3 05/12] csky: Use initializer for struct vm_unmapped_area_info
Date: Tue, 12 Mar 2024 15:28:36 -0700
Message-Id:
<20240312222843.2505560-6-rick.p.edgecombe@intel.com>
In-Reply-To: <20240312222843.2505560-1-rick.p.edgecombe@intel.com>
References: <20240312222843.2505560-1-rick.p.edgecombe@intel.com>

Future changes will need to add a new member to struct
vm_unmapped_area_info. This would cause trouble for any call site that
doesn't initialize the struct. Currently every caller sets each member
manually, so if new members are added they will be uninitialized and the
core code parsing the struct will see garbage in the new member.

It could be possible to initialize the new member manually to 0 at each
call site. This and a couple other options were discussed, and a working
consensus (see links) was that in general the best way to accomplish this
would be via static initialization with designated member initializers.
Having some struct vm_unmapped_area_info instances not zero initialized
will put those sites at risk of feeding garbage into vm_unmapped_area() if
the convention is to zero initialize the struct and any new member addition
misses a call site that initializes each member manually.

It could be possible to leave the code mostly untouched, and just change
the line:
struct vm_unmapped_area_info info
to:
struct vm_unmapped_area_info info = {};

However, that would leave cleanup for the members that are manually set to
zero, as it would no longer be required.

So to reduce the chance of bugs via uninitialized members, instead
simply continue the process to initialize the struct this way tree wide.
This will zero any unspecified members. Move the member initializers to the
struct declaration when they are known at that time.
Leave the members out that were manually initialized to zero, as this would
be redundant for designated initializers.

Signed-off-by: Rick Edgecombe
Reviewed-by: Guo Ren
Cc: Guo Ren
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/lkml/202402280912.33AEE7A9CF@keescook/#t
Link: https://lore.kernel.org/lkml/j7bfvig3gew3qruouxrh7z7ehjjafrgkbcmg6tcghhfh3rhmzi@wzlcoecgy5rs/
Reviewed-by: Christophe Leroy
---
v3:
 - Fixed spelling errors in log
 - Be consistent about field vs member in log

Hi,

This patch was split and refactored out of a tree-wide change [0] to just
zero-init each struct vm_unmapped_area_info. The overall goal of the
series is to help shadow stack guard gaps. Currently, there is only one
arch with shadow stacks, but two more are in progress. It is compile tested
only.

There was further discussion that this method of initializing the structs,
while nice in some ways, has a greater risk of introducing bugs in some of
the more complicated callers. Since this version was reviewed by arch
maintainers already, leave it as was already acknowledged.

Thanks,

Rick

[0] https://lore.kernel.org/lkml/20240226190951.3240433-6-rick.p.edgecombe@intel.com/
---
 arch/csky/abiv1/mmap.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/csky/abiv1/mmap.c b/arch/csky/abiv1/mmap.c
index 6792aca49999..7f826331d409 100644
--- a/arch/csky/abiv1/mmap.c
+++ b/arch/csky/abiv1/mmap.c
@@ -28,7 +28,12 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma;
 	int do_align = 0;
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {
+		.length = len,
+		.low_limit = mm->mmap_base,
+		.high_limit = TASK_SIZE,
+		.align_offset = pgoff << PAGE_SHIFT
+	};
 
 	/*
 	 * We only need to do colour alignment if either the I or D
@@ -61,11 +66,6 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 			return addr;
 	}
 
-	info.flags = 0;
-	info.length = len;
-	info.low_limit = mm->mmap_base;
-	info.high_limit = TASK_SIZE;
 	info.align_mask = do_align ?
(PAGE_MASK & (SHMLBA - 1)) : 0;
-	info.align_offset = pgoff << PAGE_SHIFT;
 	return vm_unmapped_area(&info);
 }
-- 
2.34.1

From nobody Sun Feb 8 12:38:38 2026
From: Rick Edgecombe
To: Liam.Howlett@oracle.com, akpm@linux-foundation.org, bp@alien8.de,
	broonie@kernel.org, dave.hansen@linux.intel.com, debug@rivosinc.com,
	hpa@zytor.com, keescook@chromium.org, kirill.shutemov@linux.intel.com,
	luto@kernel.org, mingo@redhat.com, peterz@infradead.org,
	tglx@linutronix.de, x86@kernel.org, christophe.leroy@csgroup.eu
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, rick.p.edgecombe@intel.com,
	Helge Deller, "James E.J.
Bottomley", linux-parisc@vger.kernel.org
Subject: [PATCH v3 06/12] parisc: Use initializer for struct vm_unmapped_area_info
Date: Tue, 12 Mar 2024 15:28:37 -0700
Message-Id: <20240312222843.2505560-7-rick.p.edgecombe@intel.com>
In-Reply-To: <20240312222843.2505560-1-rick.p.edgecombe@intel.com>
References: <20240312222843.2505560-1-rick.p.edgecombe@intel.com>

Future changes will need to add a new member to struct
vm_unmapped_area_info. This would cause trouble for any call site that
doesn't initialize the struct. Currently every caller sets each member
manually, so if new members are added they will be uninitialized and the
core code parsing the struct will see garbage in the new member.

It could be possible to initialize the new member manually to 0 at each
call site. This and a couple other options were discussed, and a working
consensus (see links) was that in general the best way to accomplish this
would be via static initialization with designated member initializers.
Having some struct vm_unmapped_area_info instances not zero initialized
will put those sites at risk of feeding garbage into vm_unmapped_area() if
the convention is to zero initialize the struct and any new member addition
misses a call site that initializes each member manually.

It could be possible to leave the code mostly untouched, and just change
the line:
struct vm_unmapped_area_info info
to:
struct vm_unmapped_area_info info = {};

However, that would leave cleanup for the members that are manually set to
zero, as it would no longer be required.

So to reduce the chance of bugs via uninitialized members, instead
simply continue the process to initialize the struct this way tree wide.
This will zero any unspecified members.
Move the member initializers to the struct declaration when they are known
at that time. Leave the members out that were manually initialized to
zero, as this would be redundant for designated initializers.

Signed-off-by: Rick Edgecombe
Acked-by: Helge Deller
Cc: "James E.J. Bottomley"
Cc: Helge Deller
Cc: linux-parisc@vger.kernel.org
Link: https://lore.kernel.org/lkml/202402280912.33AEE7A9CF@keescook/#t
Link: https://lore.kernel.org/lkml/j7bfvig3gew3qruouxrh7z7ehjjafrgkbcmg6tcghhfh3rhmzi@wzlcoecgy5rs/
Reviewed-by: Christophe Leroy
---
v3:
 - Fixed spelling errors in log
 - Be consistent about field vs member in log

Hi,

This patch was split and refactored out of a tree-wide change [0] to just
zero-init each struct vm_unmapped_area_info. The overall goal of the
series is to help shadow stack guard gaps. Currently, there is only one
arch with shadow stacks, but two more are in progress. It is compile tested
only.

There was further discussion that this method of initializing the structs,
while nice in some ways, has a greater risk of introducing bugs in some of
the more complicated callers. Since this version was reviewed by arch
maintainers already, leave it as was already acknowledged.

Thanks,

Rick

[0] https://lore.kernel.org/lkml/20240226190951.3240433-6-rick.p.edgecombe@intel.com/
---
 arch/parisc/kernel/sys_parisc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/parisc/kernel/sys_parisc.c b/arch/parisc/kernel/sys_parisc.c
index 98af719d5f85..f7722451276e 100644
--- a/arch/parisc/kernel/sys_parisc.c
+++ b/arch/parisc/kernel/sys_parisc.c
@@ -104,7 +104,9 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
 	struct vm_area_struct *vma, *prev;
 	unsigned long filp_pgoff;
 	int do_color_align;
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {
+		.length = len
+	};
 
 	if (unlikely(len > TASK_SIZE))
 		return -ENOMEM;
@@ -139,7 +141,6 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
 			return addr;
 	}
 
-	info.length = len;
 	info.align_mask = do_color_align ? (PAGE_MASK & (SHM_COLOUR - 1)) : 0;
 	info.align_offset = shared_align_offset(filp_pgoff, pgoff);
 
@@ -160,7 +161,6 @@ static unsigned long arch_get_unmapped_area_common(struct file *filp,
 		 */
 	}
 
-	info.flags = 0;
 	info.low_limit = mm->mmap_base;
 	info.high_limit = mmap_upper_limit(NULL);
 	return vm_unmapped_area(&info);
-- 
2.34.1

From nobody Sun Feb 8 12:38:38 2026
t=1710282550; c=relaxed/simple; bh=SOxmdONZG+ooJ46ryKwSr/UQJN1BRbIVpZR9p2Kx8ZQ=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Vf5YJ06agcMTvPzS/KtWG9xiXOotpvIWwV6hGeyCs0bbdgopT63aeXnASOkPCn6WsmxjXZep161ObErVHtv5x2CQCjOCfLpIWzVOyTc9RsGPI/RvjglkH2FIeEGpQgzA0XdyHoDnMw68HlN/1c466Dx4ztNeOCa44IW/80Tp4S8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=RITabw2D; arc=none smtp.client-ip=192.198.163.15 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="RITabw2D" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1710282549; x=1741818549; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=SOxmdONZG+ooJ46ryKwSr/UQJN1BRbIVpZR9p2Kx8ZQ=; b=RITabw2DjNdHts4LOOzYBA7YTt5/3f8/fMVgayp5PF17FEOURjdpRMc2 dP4D+/JMMlyShm3x3XrpOpEuvKU+7Wc7f2T3EmUHnWfEJFSejBCXf5Xrb n7mXGE7kg59XQE7R7HejawVdbj0yeu/GRYAeHhcb/owp+ypAqjHQPB0aF M3nC3/QUfoItaHx9rKQbOab21HFn0Jzmvpkb0tUSAsf0ENsUtBnMuRaxg KNZQyLiHGQL6sSfJPkXEPnMYeqvxcna4GF9T78gln5tTjG6+q4PkIplrV fOZUAgxnMS2US23YPdMnM1D0R9NQAFLHzxAkU4w+qosKWeoNX5hXzYIQd g==; X-IronPort-AV: E=McAfee;i="6600,9927,11011"; a="5192005" X-IronPort-AV: E=Sophos;i="6.07,119,1708416000"; d="scan'208";a="5192005" Received: from orviesa004.jf.intel.com ([10.64.159.144]) by fmvoesa109.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Mar 2024 15:29:03 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.07,119,1708416000"; d="scan'208";a="16356860" Received: from gargayus-mobl1.amr.corp.intel.com (HELO 
rpedgeco-desk4.intel.com) ([10.255.231.196]) by orviesa004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Mar 2024 15:29:03 -0700 From: Rick Edgecombe To: Liam.Howlett@oracle.com, akpm@linux-foundation.org, bp@alien8.de, broonie@kernel.org, dave.hansen@linux.intel.com, debug@rivosinc.com, hpa@zytor.com, keescook@chromium.org, kirill.shutemov@linux.intel.com, luto@kernel.org, mingo@redhat.com, peterz@infradead.org, tglx@linutronix.de, x86@kernel.org, christophe.leroy@csgroup.eu Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, rick.p.edgecombe@intel.com, Michael Ellerman , Nicholas Piggin , "Aneesh Kumar K . V" , "Naveen N . Rao" , linuxppc-dev@lists.ozlabs.org Subject: [PATCH v3 07/12] powerpc: Use initializer for struct vm_unmapped_area_info Date: Tue, 12 Mar 2024 15:28:38 -0700 Message-Id: <20240312222843.2505560-8-rick.p.edgecombe@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240312222843.2505560-1-rick.p.edgecombe@intel.com> References: <20240312222843.2505560-1-rick.p.edgecombe@intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Future changes will need to add a new member to struct vm_unmapped_area_info. This would cause trouble for any call site that doesn't initialize the struct. Currently every caller sets each member manually, so if new members are added they will be uninitialized and the core code parsing the struct will see garbage in the new member. It could be possible to initialize the new member manually to 0 at each call site. This and a couple other options were discussed, and a working consensus (see links) was that in general the best way to accomplish this would be via static initialization with designated member initiators. 
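
As a rough userspace illustration of the point above (a sketch, not code
from this series): with designated initializers, any member that is not
named is zeroed, so a member added later is covered without touching the
call site. The struct below is only a stand-in for struct
vm_unmapped_area_info, and new_member and make_info() are hypothetical
names invented for the example.

```c
#include <assert.h>

/*
 * Userspace stand-in for the kernel's struct vm_unmapped_area_info.
 * The first six members mirror the ones touched in this series;
 * new_member is hypothetical and stands for a later addition.
 */
struct vm_unmapped_area_info {
	unsigned long flags;
	unsigned long length;
	unsigned long low_limit;
	unsigned long high_limit;
	unsigned long align_mask;
	unsigned long align_offset;
	unsigned long new_member;	/* hypothetical future member */
};

/* Hypothetical helper that builds a request like a converted call site. */
static struct vm_unmapped_area_info make_info(unsigned long len,
					      unsigned long mask)
{
	/*
	 * Only the members known at declaration time are named. Every
	 * member that is not named here, including new_member, is
	 * initialized to zero.
	 */
	struct vm_unmapped_area_info info = {
		.length = len,
		.align_mask = mask,
	};

	return info;
}
```

Calling make_info(4096, 0xfff) gives a struct whose length and align_mask
are set and whose remaining members, including the hypothetical
new_member, read as zero.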

Having some struct vm_unmapped_area_info instances not zero initialized
will put those sites at risk of feeding garbage into vm_unmapped_area() if
the convention is to zero initialize the struct and any new member
addition misses a call site that initializes each member manually.

It could be possible to leave the code mostly untouched, and just change
the line:
	struct vm_unmapped_area_info info
to:
	struct vm_unmapped_area_info info = {};

However, that would leave cleanup for the members that are manually set
to zero, as it would no longer be required.

So to reduce the chance of bugs via uninitialized members, instead simply
continue the process to initialize the struct this way tree wide. This
will zero any unspecified members. Move the member initializers to the
struct declaration when they are known at that time. Leave out the
members that were manually initialized to zero, as this would be
redundant for designated initializers.

Signed-off-by: Rick Edgecombe
Acked-by: Michael Ellerman
Cc: Michael Ellerman
Cc: Nicholas Piggin
Cc: Christophe Leroy
Cc: Aneesh Kumar K.V
Cc: Naveen N. Rao
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/lkml/202402280912.33AEE7A9CF@keescook/#t
Link: https://lore.kernel.org/lkml/j7bfvig3gew3qruouxrh7z7ehjjafrgkbcmg6tcghhfh3rhmzi@wzlcoecgy5rs/
---
v3:
 - Fixed spelling errors in log
 - Be consistent about field vs member in log

Hi,

This patch was split and refactored out of a tree-wide change [0] to just
zero-init each struct vm_unmapped_area_info. The overall goal of the
series is to help shadow stack guard gaps. Currently, there is only one
arch with shadow stacks, but two more are in progress. It is compile
tested only.

There was further discussion that this method of initializing the
structs, while nice in some ways, has a greater risk of introducing bugs
in some of the more complicated callers. Since this version was already
reviewed by arch maintainers, leave it as it was acknowledged.

Thanks,

Rick

[0] https://lore.kernel.org/lkml/20240226190951.3240433-6-rick.p.edgecombe@intel.com/
---
 arch/powerpc/mm/book3s64/slice.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/slice.c b/arch/powerpc/mm/book3s64/slice.c
index c0b58afb9a47..6c7ac8c73a6c 100644
--- a/arch/powerpc/mm/book3s64/slice.c
+++ b/arch/powerpc/mm/book3s64/slice.c
@@ -282,12 +282,12 @@ static unsigned long slice_find_area_bottomup(struct mm_struct *mm,
 {
 	int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
 	unsigned long found, next_end;
-	struct vm_unmapped_area_info info;
-
-	info.flags = 0;
-	info.length = len;
-	info.align_mask = PAGE_MASK & ((1ul << pshift) - 1);
-	info.align_offset = 0;
+	struct vm_unmapped_area_info info = {
+		.flags = 0,
+		.length = len,
+		.align_mask = PAGE_MASK & ((1ul << pshift) - 1),
+		.align_offset = 0
+	};
 	/*
 	 * Check till the allow max value for this mmap request
 	 */
@@ -326,13 +326,14 @@ static unsigned long slice_find_area_topdown(struct mm_struct *mm,
 {
 	int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
 	unsigned long found, prev;
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {
+		.flags = VM_UNMAPPED_AREA_TOPDOWN,
+		.length = len,
+		.align_mask = PAGE_MASK & ((1ul << pshift) - 1),
+		.align_offset = 0
+	};
 	unsigned long min_addr = max(PAGE_SIZE, mmap_min_addr);
 
-	info.flags = VM_UNMAPPED_AREA_TOPDOWN;
-	info.length = len;
-	info.align_mask = PAGE_MASK & ((1ul << pshift) - 1);
-	info.align_offset = 0;
 	/*
 	 * If we are trying to allocate above DEFAULT_MAP_WINDOW
 	 * Add the different to the mmap_base.
-- 
2.34.1

From nobody Sun Feb 8 12:38:38 2026
From: Rick Edgecombe
To: Liam.Howlett@oracle.com, akpm@linux-foundation.org, bp@alien8.de,
	broonie@kernel.org, dave.hansen@linux.intel.com, debug@rivosinc.com,
	hpa@zytor.com, keescook@chromium.org, kirill.shutemov@linux.intel.com,
	luto@kernel.org, mingo@redhat.com, peterz@infradead.org,
	tglx@linutronix.de, x86@kernel.org, christophe.leroy@csgroup.eu
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	rick.p.edgecombe@intel.com, linux-alpha@vger.kernel.org,
	linux-snps-arc@lists.infradead.org,
	linux-arm-kernel@lists.infradead.org, linux-csky@vger.kernel.org,
	loongarch@lists.linux.dev, linux-mips@vger.kernel.org,
	linux-s390@vger.kernel.org, linux-sh@vger.kernel.org,
	sparclinux@vger.kernel.org
Subject: [PATCH v3 08/12] treewide: Use initializer for struct vm_unmapped_area_info
Date: Tue, 12 Mar 2024 15:28:39 -0700
Message-Id: <20240312222843.2505560-9-rick.p.edgecombe@intel.com>
In-Reply-To: <20240312222843.2505560-1-rick.p.edgecombe@intel.com>
References: <20240312222843.2505560-1-rick.p.edgecombe@intel.com>

Future changes will need to add a new member to struct
vm_unmapped_area_info. This would cause trouble for any call site that
doesn't initialize the struct. Currently every caller sets each member
manually, so if new ones are added they will be uninitialized and the
core code parsing the struct will see garbage in the new member.

It could be possible to initialize the new member manually to 0 at each
call site. This and a couple other options were discussed. Having some
struct vm_unmapped_area_info instances not zero initialized will put
those sites at risk of feeding garbage into vm_unmapped_area(), if the
convention is to zero initialize the struct and any new field addition
misses a call site that initializes each field manually. So it is useful
to do things similarly across the kernel.

The consensus (see links) was that in general the best way to accomplish
this, taking into account both code cleanliness and minimizing the chance
of introducing bugs, was to do C99 static initialization. As in:

	struct vm_unmapped_area_info info = {};

With this method of initialization, the whole struct will be zero
initialized, and any statements setting fields to zero will be unneeded.
The change should not leave cleanup at the call sites.

While iterating through the possible solutions, a few archs kindly acked
other variations that still zero initialized the struct. These sites have
been modified in previous changes using the pattern acked by the
respective arch.

So to reduce the chance of bugs via uninitialized fields, perform a tree
wide change using the consensus for the best general way to do this
change. Use C99 static initialization to zero the struct and remove any
statements that simply set members to zero.
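
To make the "statements setting fields to zero will be unneeded" point
concrete, here is a small userspace sketch. It is not taken from the
patch: struct area_info, DEMO_ALIGN_MASK and fill_info() are made-up
names, and the struct is only a cut-down model of struct
vm_unmapped_area_info. It shows a conditional assignment whose
"else ... = 0" branch can be dropped once the struct starts out zeroed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, cut-down model of struct vm_unmapped_area_info. */
struct area_info {
	unsigned long flags;
	unsigned long length;
	unsigned long align_mask;
	unsigned long align_offset;
};

#define DEMO_ALIGN_MASK 0xff000UL	/* made-up value for this sketch */

/* Hypothetical call site: fill in a request after zeroing the struct. */
static struct area_info fill_info(unsigned long len, bool want_align)
{
	/*
	 * Zero the whole struct up front. The patch spells this "= {}",
	 * which GCC and Clang accept; "{ 0 }" is used here as the
	 * strictly portable spelling with the same effect.
	 */
	struct area_info info = { 0 };

	info.length = len;
	if (want_align)
		info.align_mask = DEMO_ALIGN_MASK;
	/* No "else info.align_mask = 0;" branch: it is already zero. */

	return info;
}
```

With want_align false, align_mask simply keeps its initial zero, which is
the same result the removed else branch used to produce.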
Signed-off-by: Rick Edgecombe Cc: linux-mm@kvack.org Cc: linux-alpha@vger.kernel.org Cc: linux-snps-arc@lists.infradead.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-csky@vger.kernel.org Cc: loongarch@lists.linux.dev Cc: linux-mips@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: linux-sh@vger.kernel.org Cc: sparclinux@vger.kernel.org Link: https://lore.kernel.org/lkml/202402280912.33AEE7A9CF@keescook/#t Link: https://lore.kernel.org/lkml/j7bfvig3gew3qruouxrh7z7ehjjafrgkbcmg6tcg= hhfh3rhmzi@wzlcoecgy5rs/ Link: https://lore.kernel.org/lkml/ec3e377a-c0a0-4dd3-9cb9-96517e54d17e@csg= roup.eu/ Reviewed-by: Kees Cook --- Hi archs, For some context, this is part of a larger series to improve shadow stack guard gaps. It involves plumbing a new field via struct vm_unmapped_area_info. The first user is x86, but arm and riscv may likely use it as well. The change is compile tested only for non-x86. Thanks, Rick --- arch/alpha/kernel/osf_sys.c | 5 +---- arch/arc/mm/mmap.c | 4 +--- arch/arm/mm/mmap.c | 5 ++--- arch/loongarch/mm/mmap.c | 3 +-- arch/mips/mm/mmap.c | 3 +-- arch/s390/mm/hugetlbpage.c | 7 ++----- arch/s390/mm/mmap.c | 11 ++++------- arch/sh/mm/mmap.c | 5 ++--- arch/sparc/kernel/sys_sparc_32.c | 3 +-- arch/sparc/kernel/sys_sparc_64.c | 5 ++--- arch/sparc/mm/hugetlbpage.c | 7 ++----- arch/x86/kernel/sys_x86_64.c | 7 ++----- arch/x86/mm/hugetlbpage.c | 7 ++----- fs/hugetlbfs/inode.c | 7 ++----- mm/mmap.c | 9 ++------- 15 files changed, 27 insertions(+), 61 deletions(-) diff --git a/arch/alpha/kernel/osf_sys.c b/arch/alpha/kernel/osf_sys.c index 5db88b627439..e5f881bc8288 100644 --- a/arch/alpha/kernel/osf_sys.c +++ b/arch/alpha/kernel/osf_sys.c @@ -1218,14 +1218,11 @@ static unsigned long arch_get_unmapped_area_1(unsigned long addr, unsigned long len, unsigned long limit) { - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info =3D {}; =20 - info.flags =3D 0; info.length =3D len; info.low_limit =3D addr; info.high_limit =3D limit; - 
info.align_mask =3D 0; - info.align_offset =3D 0; return vm_unmapped_area(&info); } =20 diff --git a/arch/arc/mm/mmap.c b/arch/arc/mm/mmap.c index 3c1c7ae73292..69a915297155 100644 --- a/arch/arc/mm/mmap.c +++ b/arch/arc/mm/mmap.c @@ -27,7 +27,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long a= ddr, { struct mm_struct *mm =3D current->mm; struct vm_area_struct *vma; - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info =3D {}; =20 /* * We enforce the MAP_FIXED case. @@ -51,11 +51,9 @@ arch_get_unmapped_area(struct file *filp, unsigned long = addr, return addr; } =20 - info.flags =3D 0; info.length =3D len; info.low_limit =3D mm->mmap_base; info.high_limit =3D TASK_SIZE; - info.align_mask =3D 0; info.align_offset =3D pgoff << PAGE_SHIFT; return vm_unmapped_area(&info); } diff --git a/arch/arm/mm/mmap.c b/arch/arm/mm/mmap.c index a0f8a0ca0788..d65d0e6ed10a 100644 --- a/arch/arm/mm/mmap.c +++ b/arch/arm/mm/mmap.c @@ -34,7 +34,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long a= ddr, struct vm_area_struct *vma; int do_align =3D 0; int aliasing =3D cache_is_vipt_aliasing(); - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info =3D {}; =20 /* * We only need to do colour alignment if either the I or D @@ -68,7 +68,6 @@ arch_get_unmapped_area(struct file *filp, unsigned long a= ddr, return addr; } =20 - info.flags =3D 0; info.length =3D len; info.low_limit =3D mm->mmap_base; info.high_limit =3D TASK_SIZE; @@ -87,7 +86,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const u= nsigned long addr0, unsigned long addr =3D addr0; int do_align =3D 0; int aliasing =3D cache_is_vipt_aliasing(); - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info =3D {}; =20 /* * We only need to do colour alignment if either the I or D diff --git a/arch/loongarch/mm/mmap.c b/arch/loongarch/mm/mmap.c index a9630a81b38a..4bbd449b4a47 100644 --- a/arch/loongarch/mm/mmap.c +++ b/arch/loongarch/mm/mmap.c @@ -24,7 
+24,7 @@ static unsigned long arch_get_unmapped_area_common(struct= file *filp, struct vm_area_struct *vma; unsigned long addr =3D addr0; int do_color_align; - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info =3D {}; =20 if (unlikely(len > TASK_SIZE)) return -ENOMEM; @@ -82,7 +82,6 @@ static unsigned long arch_get_unmapped_area_common(struct= file *filp, */ } =20 - info.flags =3D 0; info.low_limit =3D mm->mmap_base; info.high_limit =3D TASK_SIZE; return vm_unmapped_area(&info); diff --git a/arch/mips/mm/mmap.c b/arch/mips/mm/mmap.c index 00fe90c6db3e..7e11d7b58761 100644 --- a/arch/mips/mm/mmap.c +++ b/arch/mips/mm/mmap.c @@ -34,7 +34,7 @@ static unsigned long arch_get_unmapped_area_common(struct= file *filp, struct vm_area_struct *vma; unsigned long addr =3D addr0; int do_color_align; - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info =3D {}; =20 if (unlikely(len > TASK_SIZE)) return -ENOMEM; @@ -92,7 +92,6 @@ static unsigned long arch_get_unmapped_area_common(struct= file *filp, */ } =20 - info.flags =3D 0; info.low_limit =3D mm->mmap_base; info.high_limit =3D TASK_SIZE; return vm_unmapped_area(&info); diff --git a/arch/s390/mm/hugetlbpage.c b/arch/s390/mm/hugetlbpage.c index c2d2850ec8d5..51fb3806395b 100644 --- a/arch/s390/mm/hugetlbpage.c +++ b/arch/s390/mm/hugetlbpage.c @@ -258,14 +258,12 @@ static unsigned long hugetlb_get_unmapped_area_bottom= up(struct file *file, unsigned long pgoff, unsigned long flags) { struct hstate *h =3D hstate_file(file); - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info =3D {}; =20 - info.flags =3D 0; info.length =3D len; info.low_limit =3D current->mm->mmap_base; info.high_limit =3D TASK_SIZE; info.align_mask =3D PAGE_MASK & ~huge_page_mask(h); - info.align_offset =3D 0; return vm_unmapped_area(&info); } =20 @@ -274,7 +272,7 @@ static unsigned long hugetlb_get_unmapped_area_topdown(= struct file *file, unsigned long pgoff, unsigned long flags) { struct hstate *h 
=3D hstate_file(file); - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info =3D {}; unsigned long addr; =20 info.flags =3D VM_UNMAPPED_AREA_TOPDOWN; @@ -282,7 +280,6 @@ static unsigned long hugetlb_get_unmapped_area_topdown(= struct file *file, info.low_limit =3D PAGE_SIZE; info.high_limit =3D current->mm->mmap_base; info.align_mask =3D PAGE_MASK & ~huge_page_mask(h); - info.align_offset =3D 0; addr =3D vm_unmapped_area(&info); =20 /* diff --git a/arch/s390/mm/mmap.c b/arch/s390/mm/mmap.c index cd52d72b59cf..5c9d9f18a55f 100644 --- a/arch/s390/mm/mmap.c +++ b/arch/s390/mm/mmap.c @@ -77,7 +77,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, u= nsigned long addr, { struct mm_struct *mm =3D current->mm; struct vm_area_struct *vma; - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info =3D {}; =20 if (len > TASK_SIZE - mmap_min_addr) return -ENOMEM; @@ -93,14 +93,12 @@ unsigned long arch_get_unmapped_area(struct file *filp,= unsigned long addr, goto check_asce_limit; } =20 - info.flags =3D 0; info.length =3D len; info.low_limit =3D mm->mmap_base; info.high_limit =3D TASK_SIZE; if (filp || (flags & MAP_SHARED)) info.align_mask =3D MMAP_ALIGN_MASK << PAGE_SHIFT; - else - info.align_mask =3D 0; + info.align_offset =3D pgoff << PAGE_SHIFT; addr =3D vm_unmapped_area(&info); if (offset_in_page(addr)) @@ -116,7 +114,7 @@ unsigned long arch_get_unmapped_area_topdown(struct fil= e *filp, unsigned long ad { struct vm_area_struct *vma; struct mm_struct *mm =3D current->mm; - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info =3D {}; =20 /* requested length too big for entire address space */ if (len > TASK_SIZE - mmap_min_addr) @@ -140,8 +138,7 @@ unsigned long arch_get_unmapped_area_topdown(struct fil= e *filp, unsigned long ad info.high_limit =3D mm->mmap_base; if (filp || (flags & MAP_SHARED)) info.align_mask =3D MMAP_ALIGN_MASK << PAGE_SHIFT; - else - info.align_mask =3D 0; + info.align_offset =3D pgoff << 
PAGE_SHIFT; addr =3D vm_unmapped_area(&info); =20 diff --git a/arch/sh/mm/mmap.c b/arch/sh/mm/mmap.c index b82199878b45..bee329d4149a 100644 --- a/arch/sh/mm/mmap.c +++ b/arch/sh/mm/mmap.c @@ -57,7 +57,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, u= nsigned long addr, struct mm_struct *mm =3D current->mm; struct vm_area_struct *vma; int do_colour_align; - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info =3D {}; =20 if (flags & MAP_FIXED) { /* We do not accept a shared mapping if it would violate @@ -88,7 +88,6 @@ unsigned long arch_get_unmapped_area(struct file *filp, u= nsigned long addr, return addr; } =20 - info.flags =3D 0; info.length =3D len; info.low_limit =3D TASK_UNMAPPED_BASE; info.high_limit =3D TASK_SIZE; @@ -106,7 +105,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const= unsigned long addr0, struct mm_struct *mm =3D current->mm; unsigned long addr =3D addr0; int do_colour_align; - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info =3D {}; =20 if (flags & MAP_FIXED) { /* We do not accept a shared mapping if it would violate diff --git a/arch/sparc/kernel/sys_sparc_32.c b/arch/sparc/kernel/sys_sparc= _32.c index 082a551897ed..08a19727795c 100644 --- a/arch/sparc/kernel/sys_sparc_32.c +++ b/arch/sparc/kernel/sys_sparc_32.c @@ -41,7 +41,7 @@ SYSCALL_DEFINE0(getpagesize) =20 unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr= , unsigned long len, unsigned long pgoff, unsigned long flags) { - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info =3D {}; =20 if (flags & MAP_FIXED) { /* We do not accept a shared mapping if it would violate @@ -59,7 +59,6 @@ unsigned long arch_get_unmapped_area(struct file *filp, u= nsigned long addr, unsi if (!addr) addr =3D TASK_UNMAPPED_BASE; =20 - info.flags =3D 0; info.length =3D len; info.low_limit =3D addr; info.high_limit =3D TASK_SIZE; diff --git a/arch/sparc/kernel/sys_sparc_64.c b/arch/sparc/kernel/sys_sparc= 
_64.c index 1dbf7211666e..d9c3b34ca744 100644 --- a/arch/sparc/kernel/sys_sparc_64.c +++ b/arch/sparc/kernel/sys_sparc_64.c @@ -93,7 +93,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, u= nsigned long addr, unsi struct vm_area_struct * vma; unsigned long task_size =3D TASK_SIZE; int do_color_align; - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info =3D {}; =20 if (flags & MAP_FIXED) { /* We do not accept a shared mapping if it would violate @@ -126,7 +126,6 @@ unsigned long arch_get_unmapped_area(struct file *filp,= unsigned long addr, unsi return addr; } =20 - info.flags =3D 0; info.length =3D len; info.low_limit =3D TASK_UNMAPPED_BASE; info.high_limit =3D min(task_size, VA_EXCLUDE_START); @@ -154,7 +153,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const= unsigned long addr0, unsigned long task_size =3D STACK_TOP32; unsigned long addr =3D addr0; int do_color_align; - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info =3D {}; =20 /* This should only ever run for 32-bit processes. 
*/ BUG_ON(!test_thread_flag(TIF_32BIT)); diff --git a/arch/sparc/mm/hugetlbpage.c b/arch/sparc/mm/hugetlbpage.c index 38a1bef47efb..4caf56b32e26 100644 --- a/arch/sparc/mm/hugetlbpage.c +++ b/arch/sparc/mm/hugetlbpage.c @@ -31,17 +31,15 @@ static unsigned long hugetlb_get_unmapped_area_bottomup= (struct file *filp, { struct hstate *h =3D hstate_file(filp); unsigned long task_size =3D TASK_SIZE; - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info =3D {}; =20 if (test_thread_flag(TIF_32BIT)) task_size =3D STACK_TOP32; =20 - info.flags =3D 0; info.length =3D len; info.low_limit =3D TASK_UNMAPPED_BASE; info.high_limit =3D min(task_size, VA_EXCLUDE_START); info.align_mask =3D PAGE_MASK & ~huge_page_mask(h); - info.align_offset =3D 0; addr =3D vm_unmapped_area(&info); =20 if ((addr & ~PAGE_MASK) && task_size > VA_EXCLUDE_END) { @@ -63,7 +61,7 @@ hugetlb_get_unmapped_area_topdown(struct file *filp, cons= t unsigned long addr0, struct hstate *h =3D hstate_file(filp); struct mm_struct *mm =3D current->mm; unsigned long addr =3D addr0; - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info =3D {}; =20 /* This should only ever run for 32-bit processes. 
*/ BUG_ON(!test_thread_flag(TIF_32BIT)); @@ -73,7 +71,6 @@ hugetlb_get_unmapped_area_topdown(struct file *filp, cons= t unsigned long addr0, info.low_limit =3D PAGE_SIZE; info.high_limit =3D mm->mmap_base; info.align_mask =3D PAGE_MASK & ~huge_page_mask(h); - info.align_offset =3D 0; addr =3D vm_unmapped_area(&info); =20 /* diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c index c783aeb37dce..b3278e4f7e59 100644 --- a/arch/x86/kernel/sys_x86_64.c +++ b/arch/x86/kernel/sys_x86_64.c @@ -125,7 +125,7 @@ arch_get_unmapped_area(struct file *filp, unsigned long= addr, { struct mm_struct *mm =3D current->mm; struct vm_area_struct *vma; - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info =3D {}; unsigned long begin, end; =20 if (flags & MAP_FIXED) @@ -144,11 +144,9 @@ arch_get_unmapped_area(struct file *filp, unsigned lon= g addr, return addr; } =20 - info.flags =3D 0; info.length =3D len; info.low_limit =3D begin; info.high_limit =3D end; - info.align_mask =3D 0; info.align_offset =3D pgoff << PAGE_SHIFT; if (filp) { info.align_mask =3D get_align_mask(); @@ -165,7 +163,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const= unsigned long addr0, struct vm_area_struct *vma; struct mm_struct *mm =3D current->mm; unsigned long addr =3D addr0; - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info =3D {}; =20 /* requested length too big for entire address space */ if (len > TASK_SIZE) @@ -210,7 +208,6 @@ arch_get_unmapped_area_topdown(struct file *filp, const= unsigned long addr0, if (addr > DEFAULT_MAP_WINDOW && !in_32bit_syscall()) info.high_limit +=3D TASK_SIZE_MAX - DEFAULT_MAP_WINDOW; =20 - info.align_mask =3D 0; info.align_offset =3D pgoff << PAGE_SHIFT; if (filp) { info.align_mask =3D get_align_mask(); diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c index 6d77c0039617..fb600949a355 100644 --- a/arch/x86/mm/hugetlbpage.c +++ b/arch/x86/mm/hugetlbpage.c @@ -51,9 +51,8 @@ static unsigned 
long hugetlb_get_unmapped_area_bottomup(s= truct file *file, unsigned long pgoff, unsigned long flags) { struct hstate *h =3D hstate_file(file); - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info =3D {}; =20 - info.flags =3D 0; info.length =3D len; info.low_limit =3D get_mmap_base(1); =20 @@ -65,7 +64,6 @@ static unsigned long hugetlb_get_unmapped_area_bottomup(s= truct file *file, task_size_32bit() : task_size_64bit(addr > DEFAULT_MAP_WINDOW); =20 info.align_mask =3D PAGE_MASK & ~huge_page_mask(h); - info.align_offset =3D 0; return vm_unmapped_area(&info); } =20 @@ -74,7 +72,7 @@ static unsigned long hugetlb_get_unmapped_area_topdown(st= ruct file *file, unsigned long pgoff, unsigned long flags) { struct hstate *h =3D hstate_file(file); - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info =3D {}; =20 info.flags =3D VM_UNMAPPED_AREA_TOPDOWN; info.length =3D len; @@ -89,7 +87,6 @@ static unsigned long hugetlb_get_unmapped_area_topdown(st= ruct file *file, info.high_limit +=3D TASK_SIZE_MAX - DEFAULT_MAP_WINDOW; =20 info.align_mask =3D PAGE_MASK & ~huge_page_mask(h); - info.align_offset =3D 0; addr =3D vm_unmapped_area(&info); =20 /* diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index cd87ea5944a1..ae833080a146 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -176,14 +176,12 @@ hugetlb_get_unmapped_area_bottomup(struct file *file,= unsigned long addr, unsigned long len, unsigned long pgoff, unsigned long flags) { struct hstate *h =3D hstate_file(file); - struct vm_unmapped_area_info info; + struct vm_unmapped_area_info info =3D {}; =20 - info.flags =3D 0; info.length =3D len; info.low_limit =3D current->mm->mmap_base; info.high_limit =3D arch_get_mmap_end(addr, len, flags); info.align_mask =3D PAGE_MASK & ~huge_page_mask(h); - info.align_offset =3D 0; return vm_unmapped_area(&info); } =20 @@ -192,14 +190,13 @@ hugetlb_get_unmapped_area_topdown(struct file *file, = unsigned long addr, unsigned long 
len, unsigned long pgoff, unsigned long flags)
 {
 	struct hstate *h = hstate_file(file);
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 
 	info.flags = VM_UNMAPPED_AREA_TOPDOWN;
 	info.length = len;
 	info.low_limit = PAGE_SIZE;
 	info.high_limit = arch_get_mmap_base(addr, current->mm->mmap_base);
 	info.align_mask = PAGE_MASK & ~huge_page_mask(h);
-	info.align_offset = 0;
 	addr = vm_unmapped_area(&info);
 
 	/*
diff --git a/mm/mmap.c b/mm/mmap.c
index 68381b90f906..b889c79d11bd 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1707,7 +1707,7 @@ generic_get_unmapped_area(struct file *filp, unsigned long addr,
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma, *prev;
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 	const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);
 
 	if (len > mmap_end - mmap_min_addr)
@@ -1725,12 +1725,9 @@ generic_get_unmapped_area(struct file *filp, unsigned long addr,
 			return addr;
 	}
 
-	info.flags = 0;
 	info.length = len;
 	info.low_limit = mm->mmap_base;
 	info.high_limit = mmap_end;
-	info.align_mask = 0;
-	info.align_offset = 0;
 	return vm_unmapped_area(&info);
 }
 
@@ -1755,7 +1752,7 @@ generic_get_unmapped_area_topdown(struct file *filp, unsigned long addr,
 {
 	struct vm_area_struct *vma, *prev;
 	struct mm_struct *mm = current->mm;
-	struct vm_unmapped_area_info info;
+	struct vm_unmapped_area_info info = {};
 	const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);
 
 	/* requested length too big for entire address space */
@@ -1779,8 +1776,6 @@ generic_get_unmapped_area_topdown(struct file *filp, unsigned long addr,
 	info.length = len;
 	info.low_limit = PAGE_SIZE;
 	info.high_limit = arch_get_mmap_base(addr, mm->mmap_base);
-	info.align_mask = 0;
-	info.align_offset = 0;
 	addr = vm_unmapped_area(&info);
 
 	/*
-- 
2.34.1

From nobody Sun Feb 8 12:38:38 2026
From: Rick Edgecombe
To: Liam.Howlett@oracle.com, akpm@linux-foundation.org, bp@alien8.de,
 broonie@kernel.org, dave.hansen@linux.intel.com, debug@rivosinc.com,
 hpa@zytor.com, keescook@chromium.org, kirill.shutemov@linux.intel.com,
 luto@kernel.org, mingo@redhat.com, peterz@infradead.org,
 tglx@linutronix.de, x86@kernel.org, christophe.leroy@csgroup.eu
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 rick.p.edgecombe@intel.com
Subject: [PATCH v3 09/12] mm: Take placement mappings gap into account
Date: Tue, 12 Mar 2024 15:28:40 -0700
Message-Id: <20240312222843.2505560-10-rick.p.edgecombe@intel.com>
In-Reply-To: <20240312222843.2505560-1-rick.p.edgecombe@intel.com>
References: <20240312222843.2505560-1-rick.p.edgecombe@intel.com>

When memory is being placed, mmap() will take care to respect the guard
gaps of certain types of memory (VM_SHADOW_STACK, VM_GROWSUP and
VM_GROWSDOWN). In order to ensure guard gaps between mappings, mmap()
needs to consider two things:
 1. That the new mapping isn’t placed in any existing mapping’s guard
    gap.
 2. That the new mapping isn’t placed such that any existing mappings
    are in *its* guard gap.

The long-standing behavior of mmap() is to ensure 1, but not take any
care around 2. So for example, if there is a PAGE_SIZE free area, and an
mmap() of PAGE_SIZE size with a type that has a guard gap (such as a
shadow stack) is being placed, mmap() may place the new mapping in the
PAGE_SIZE free area. Then the mapping that is supposed to have a guard
gap will not have a gap to the adjacent VMA.

For MAP_GROWSDOWN/VM_GROWSDOWN and MAP_GROWSUP/VM_GROWSUP this has not
been a problem in practice because applications place these kinds of
mappings very early, when there are not many mappings to find a space
between. Shadow stacks, however, may be placed throughout the lifetime
of the application.

Use the start_gap field to find a space that includes the guard gap for
the new mapping. Take care to not interfere with the alignment.

Signed-off-by: Rick Edgecombe
Reviewed-by: Christophe Leroy
---
v3:
 - Spelling fix in comment

v2:
 - Remove VM_UNMAPPED_START_GAP_SET and have struct vm_unmapped_area_info
   initialized with zeros (in another patch).
   (Kirill)
 - Drop unrelated space change (Kirill)
 - Add comment around interactions of alignment and start gap step
   (Kirill)
---
 include/linux/mm.h |  1 +
 mm/mmap.c          | 12 +++++++++---
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index d91cde79aaee..deade7be00d0 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3418,6 +3418,7 @@ struct vm_unmapped_area_info {
 	unsigned long high_limit;
 	unsigned long align_mask;
 	unsigned long align_offset;
+	unsigned long start_gap;
 };
 
 extern unsigned long vm_unmapped_area(struct vm_unmapped_area_info *info);
diff --git a/mm/mmap.c b/mm/mmap.c
index b889c79d11bd..634e706fd97e 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1582,7 +1582,7 @@ static unsigned long unmapped_area(struct vm_unmapped_area_info *info)
 	MA_STATE(mas, &current->mm->mm_mt, 0, 0);
 
 	/* Adjust search length to account for worst case alignment overhead */
-	length = info->length + info->align_mask;
+	length = info->length + info->align_mask + info->start_gap;
 	if (length < info->length)
 		return -ENOMEM;
 
@@ -1594,7 +1594,13 @@ static unsigned long unmapped_area(struct vm_unmapped_area_info *info)
 	if (mas_empty_area(&mas, low_limit, high_limit - 1, length))
 		return -ENOMEM;
 
-	gap = mas.index;
+	/*
+	 * Adjust for the gap first so it doesn't interfere with the
+	 * later alignment. The first step is the minimum needed to
+	 * fulfill the start gap, the next step is the minimum to align
+	 * that. It is the minimum needed to fulfill both.
+	 */
+	gap = mas.index + info->start_gap;
 	gap += (info->align_offset - gap) & info->align_mask;
 	tmp = mas_next(&mas, ULONG_MAX);
 	if (tmp && (tmp->vm_flags & VM_STARTGAP_FLAGS)) { /* Avoid prev check if possible */
@@ -1633,7 +1639,7 @@ static unsigned long unmapped_area_topdown(struct vm_unmapped_area_info *info)
 
 	MA_STATE(mas, &current->mm->mm_mt, 0, 0);
 	/* Adjust search length to account for worst case alignment overhead */
-	length = info->length + info->align_mask;
+	length = info->length + info->align_mask + info->start_gap;
 	if (length < info->length)
 		return -ENOMEM;
 
-- 
2.34.1

From nobody Sun Feb 8 12:38:38 2026
From: Rick Edgecombe
To: Liam.Howlett@oracle.com, akpm@linux-foundation.org, bp@alien8.de,
 broonie@kernel.org, dave.hansen@linux.intel.com, debug@rivosinc.com,
 hpa@zytor.com, keescook@chromium.org, kirill.shutemov@linux.intel.com,
 luto@kernel.org, mingo@redhat.com, peterz@infradead.org,
 tglx@linutronix.de, x86@kernel.org, christophe.leroy@csgroup.eu
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 rick.p.edgecombe@intel.com
Subject: [PATCH v3 10/12] x86/mm: Implement
 HAVE_ARCH_UNMAPPED_AREA_VMFLAGS
Date: Tue, 12 Mar 2024 15:28:41 -0700
Message-Id: <20240312222843.2505560-11-rick.p.edgecombe@intel.com>
In-Reply-To: <20240312222843.2505560-1-rick.p.edgecombe@intel.com>
References: <20240312222843.2505560-1-rick.p.edgecombe@intel.com>

When memory is being placed, mmap() will take care to respect the guard
gaps of certain types of memory (VM_SHADOW_STACK, VM_GROWSUP and
VM_GROWSDOWN). In order to ensure guard gaps between mappings, mmap()
needs to consider two things:
 1. That the new mapping isn’t placed in any existing mapping’s guard
    gap.
 2. That the new mapping isn’t placed such that any existing mappings
    are in *its* guard gap.

The long-standing behavior of mmap() is to ensure 1, but not take any
care around 2. So for example, if there is a PAGE_SIZE free area, and an
mmap() of PAGE_SIZE size with a type that has a guard gap (such as a
shadow stack) is being placed, mmap() may place the new mapping in the
PAGE_SIZE free area. Then the mapping that is supposed to have a guard
gap will not have a gap to the adjacent VMA.

Add x86 arch implementations of arch_get_unmapped_area_vmflags/_topdown()
so future changes can allow the guard gap of the type of VMA being placed
to be taken into account. This will be used for shadow stack memory.
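
For illustration only (not part of this patch), scenario 2 can be modeled
in userspace C. This is a rough sketch under stated assumptions: the
first-fit search, struct free_range, first_fit() and the 4 KiB PAGE_SIZE
are hypothetical names and values made up for the example, not kernel
interfaces, and the real search in vm_unmapped_area() also handles limits
and alignment.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096UL	/* assumed page size for this example */

/* Hypothetical free range [start, end); an existing mapping ends at start. */
struct free_range {
	unsigned long start;
	unsigned long end;
};

/*
 * Hypothetical first-fit search. Returns the start address for a new
 * mapping of len bytes that keeps start_gap free bytes below it, or 0 if
 * no range is large enough.
 */
static unsigned long first_fit(const struct free_range *ranges, size_t n,
			       unsigned long len, unsigned long start_gap)
{
	assert(len > 0);

	for (size_t i = 0; i < n; i++) {
		unsigned long size = ranges[i].end - ranges[i].start;

		/* The range must hold the guard gap plus the mapping itself. */
		if (size >= len + start_gap)
			return ranges[i].start + start_gap;
	}

	return 0;
}
```

With free ranges [0x10000, 0x11000) and [0x20000, 0x24000), a one-page
request with a start_gap of 0 lands at 0x10000, directly against the
mapping below it. With a start_gap of PAGE_SIZE the one-page hole is
skipped and the mapping is placed at 0x21000, leaving its guard gap free.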

Signed-off-by: Rick Edgecombe
---
v3:
 - Commit log grammar

v2:
 - Remove unnecessary added extern
---
 arch/x86/include/asm/pgtable_64.h |  1 +
 arch/x86/kernel/sys_x86_64.c      | 25 ++++++++++++++++++++-----
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_64.h b/arch/x86/include/asm/pgtable_64.h
index 24af25b1551a..13dcaf436efd 100644
--- a/arch/x86/include/asm/pgtable_64.h
+++ b/arch/x86/include/asm/pgtable_64.h
@@ -244,6 +244,7 @@ extern void cleanup_highmap(void);
 
 #define HAVE_ARCH_UNMAPPED_AREA
 #define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
+#define HAVE_ARCH_UNMAPPED_AREA_VMFLAGS
 
 #define PAGE_AGP    PAGE_KERNEL_NOCACHE
 #define HAVE_PAGE_AGP 1
diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
index b3278e4f7e59..d6fbc4dd08ef 100644
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -120,8 +120,8 @@ static void find_start_end(unsigned long addr, unsigned long flags,
 }
 
 unsigned long
-arch_get_unmapped_area(struct file *filp, unsigned long addr,
-		unsigned long len, unsigned long pgoff, unsigned long flags)
+arch_get_unmapped_area_vmflags(struct file *filp, unsigned long addr, unsigned long len,
+		unsigned long pgoff, unsigned long flags, vm_flags_t vm_flags)
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma;
@@ -156,9 +156,9 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 }
 
 unsigned long
-arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
-		const unsigned long len, const unsigned long pgoff,
-		const unsigned long flags)
+arch_get_unmapped_area_topdown_vmflags(struct file *filp, unsigned long addr0,
+		unsigned long len, unsigned long pgoff,
+		unsigned long flags, vm_flags_t vm_flags)
 {
 	struct vm_area_struct *vma;
 	struct mm_struct *mm = current->mm;
@@ -227,3 +227,18 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	 */
 	return arch_get_unmapped_area(filp, addr0, len, pgoff, flags);
 }
+
+unsigned long
+arch_get_unmapped_area(struct file *filp, unsigned long addr,
+		unsigned long len, unsigned long pgoff, unsigned long flags)
+{
+	return arch_get_unmapped_area_vmflags(filp, addr, len, pgoff, flags, 0);
+}
+
+unsigned long
+arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr,
+		const unsigned long len, const unsigned long pgoff,
+		const unsigned long flags)
+{
+	return arch_get_unmapped_area_topdown_vmflags(filp, addr, len, pgoff, flags, 0);
+}
-- 
2.34.1

From nobody Sun Feb 8 12:38:38 2026
From: Rick Edgecombe
To: Liam.Howlett@oracle.com, akpm@linux-foundation.org, bp@alien8.de,
 broonie@kernel.org, dave.hansen@linux.intel.com, debug@rivosinc.com,
 hpa@zytor.com, keescook@chromium.org, kirill.shutemov@linux.intel.com,
 luto@kernel.org, mingo@redhat.com, peterz@infradead.org,
 tglx@linutronix.de, x86@kernel.org, christophe.leroy@csgroup.eu
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 rick.p.edgecombe@intel.com
Subject: [PATCH v3 11/12] x86/mm: Care about shadow stack guard gap during
 placement
Date: Tue, 12 Mar 2024 15:28:42 -0700
Message-Id:
 <20240312222843.2505560-12-rick.p.edgecombe@intel.com>
In-Reply-To: <20240312222843.2505560-1-rick.p.edgecombe@intel.com>
References: <20240312222843.2505560-1-rick.p.edgecombe@intel.com>

When memory is being placed, mmap() will take care to respect the guard
gaps of certain types of memory (VM_SHADOW_STACK, VM_GROWSUP and
VM_GROWSDOWN). In order to ensure guard gaps between mappings, mmap()
needs to consider two things:
 1. That the new mapping isn’t placed in any existing mapping’s guard
    gap.
 2. That the new mapping isn’t placed such that any existing mappings
    are in *its* guard gap.

The long-standing behavior of mmap() is to ensure 1, but not take any
care around 2. So for example, if there is a PAGE_SIZE free area, and an
mmap() of PAGE_SIZE size with a type that has a guard gap (such as a
shadow stack) is being placed, mmap() may place the new mapping in the
PAGE_SIZE free area. Then the mapping that is supposed to have a guard
gap will not have a gap to the adjacent VMA.

Now that vm_flags is passed into the arch get_unmapped_area()
implementations, and vm_unmapped_area() is ready to consider it, have
VM_SHADOW_STACK mappings get guard gap consideration for scenario 2.
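
For illustration only, the way the start gap feeds into the bottom-up
address computation can be sketched in userspace C. This is a simplified
model under stated assumptions: place_bottom_up() is a hypothetical name,
vm_flags_t is replaced by unsigned long, the VM_SHADOW_STACK bit value is
a placeholder, and PAGE_SIZE is assumed to be 4 KiB. The two arithmetic
steps follow the unmapped_area() change earlier in this series (skip the
start gap, then align).

```c
#include <assert.h>

#define PAGE_SIZE	4096UL		/* assumed page size for this example */
#define VM_SHADOW_STACK	(1UL << 0)	/* placeholder bit, not the kernel's value */

/* Userspace copy of the helper; unsigned long stands in for vm_flags_t. */
static unsigned long stack_guard_placement(unsigned long vm_flags)
{
	if (vm_flags & VM_SHADOW_STACK)
		return PAGE_SIZE;

	return 0;
}

/*
 * Hypothetical model of the bottom-up placement: skip the start gap
 * first, then move up by the minimum needed to satisfy the alignment.
 */
static unsigned long place_bottom_up(unsigned long free_start,
				     unsigned long vm_flags,
				     unsigned long align_mask,
				     unsigned long align_offset)
{
	unsigned long addr;

	/* The model expects a page-aligned start of the free area. */
	assert(free_start % PAGE_SIZE == 0);

	addr = free_start + stack_guard_placement(vm_flags);
	addr += (align_offset - addr) & align_mask;

	return addr;
}
```

With a free area starting at 0x10000 and no alignment, a shadow stack
mapping is placed at 0x11000 rather than 0x10000. With an align_mask of
0xf000 the result moves up to 0x20000, which still leaves at least
PAGE_SIZE free below the new mapping.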

Signed-off-by: Rick Edgecombe
---
 arch/x86/kernel/sys_x86_64.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
index d6fbc4dd08ef..964cb435710e 100644
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -119,6 +119,14 @@ static void find_start_end(unsigned long addr, unsigned long flags,
 		*end = task_size_64bit(addr > DEFAULT_MAP_WINDOW);
 }
 
+static inline unsigned long stack_guard_placement(vm_flags_t vm_flags)
+{
+	if (vm_flags & VM_SHADOW_STACK)
+		return PAGE_SIZE;
+
+	return 0;
+}
+
 unsigned long
 arch_get_unmapped_area_vmflags(struct file *filp, unsigned long addr, unsigned long len,
 		unsigned long pgoff, unsigned long flags, vm_flags_t vm_flags)
@@ -148,6 +156,7 @@ arch_get_unmapped_area_vmflags(struct file *filp, unsigned long addr, unsigned l
 	info.low_limit = begin;
 	info.high_limit = end;
 	info.align_offset = pgoff << PAGE_SHIFT;
+	info.start_gap = stack_guard_placement(vm_flags);
 	if (filp) {
 		info.align_mask = get_align_mask();
 		info.align_offset += get_align_bits();
@@ -197,6 +206,7 @@ arch_get_unmapped_area_topdown_vmflags(struct file *filp, unsigned long addr0,
 	info.low_limit = PAGE_SIZE;
 
 	info.high_limit = get_mmap_base(0);
+	info.start_gap = stack_guard_placement(vm_flags);
 
 	/*
 	 * If hint address is above DEFAULT_MAP_WINDOW, look for unmapped area
-- 
2.34.1

From nobody Sun Feb 8 12:38:38 2026
From: Rick Edgecombe
To: Liam.Howlett@oracle.com, akpm@linux-foundation.org, bp@alien8.de,
 broonie@kernel.org, dave.hansen@linux.intel.com, debug@rivosinc.com,
 hpa@zytor.com, keescook@chromium.org, kirill.shutemov@linux.intel.com,
 luto@kernel.org, mingo@redhat.com, peterz@infradead.org,
 tglx@linutronix.de, x86@kernel.org, christophe.leroy@csgroup.eu
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 rick.p.edgecombe@intel.com
Subject: [PATCH v3 12/12] selftests/x86: Add placement guard gap test for
 shstk
Date: Tue, 12 Mar 2024 15:28:43 -0700
Message-Id: <20240312222843.2505560-13-rick.p.edgecombe@intel.com>
In-Reply-To: <20240312222843.2505560-1-rick.p.edgecombe@intel.com>
References: <20240312222843.2505560-1-rick.p.edgecombe@intel.com>

The existing shadow stack test for guard gaps just checks that new
mappings are not placed in an existing mapping's guard gap. Add one that
checks that new mappings are not placed such that preexisting mappings are
in the new mapping's guard gap.

Signed-off-by: Rick Edgecombe
---
 .../testing/selftests/x86/test_shadow_stack.c | 67 +++++++++++++++++--
 1 file changed, 63 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/x86/test_shadow_stack.c b/tools/testing/selftests/x86/test_shadow_stack.c
index 757e6527f67e..ee909a7927f9 100644
--- a/tools/testing/selftests/x86/test_shadow_stack.c
+++ b/tools/testing/selftests/x86/test_shadow_stack.c
@@ -556,7 +556,7 @@ struct node {
  *      looked at the shadow stack gaps.
  *   5. See if it landed in the gap.
  */
-int test_guard_gap(void)
+int test_guard_gap_other_gaps(void)
 {
 	void *free_area, *shstk, *test_map = (void *)0xFFFFFFFFFFFFFFFF;
 	struct node *head = NULL, *cur;
@@ -593,11 +593,64 @@ int test_guard_gap(void)
 	if (shstk - test_map - PAGE_SIZE != PAGE_SIZE)
 		return 1;
 
-	printf("[OK]\tGuard gap test\n");
+	printf("[OK]\tGuard gap test, other mappings' gaps\n");
 
 	return 0;
 }
 
+/* Tests respecting the guard gap of the mapping getting placed */
+int test_guard_gap_new_mappings_gaps(void)
+{
+	void *free_area, *shstk_start, *test_map = (void *)0xFFFFFFFFFFFFFFFF;
+	struct node *head = NULL, *cur;
+	int ret = 0;
+
+	free_area = mmap(0, PAGE_SIZE * 4, PROT_READ | PROT_WRITE,
+			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	munmap(free_area, PAGE_SIZE * 4);
+
+	/* Place a regular mapping at the start of the free area */
+	shstk_start = mmap(free_area, PAGE_SIZE, PROT_READ | PROT_WRITE,
+			   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	if (shstk_start == MAP_FAILED || shstk_start != free_area)
+		return 1;
+
+	while (test_map > shstk_start) {
+		test_map = (void *)syscall(__NR_map_shadow_stack, 0, PAGE_SIZE, 0);
+		if (test_map == MAP_FAILED) {
+			printf("[INFO]\tmap_shadow_stack MAP_FAILED\n");
+			ret = 1;
+			break;
+		}
+
+		cur = malloc(sizeof(*cur));
+		cur->mapping = test_map;
+
+		cur->next = head;
+		head = cur;
+
+		if (test_map == free_area + PAGE_SIZE) {
+			printf("[INFO]\tNew mapping has other mapping in guard gap!\n");
+			ret = 1;
+			break;
+		}
+	}
+
+	while (head) {
+		cur = head;
+		head = cur->next;
+		munmap(cur->mapping, PAGE_SIZE);
+		free(cur);
+	}
+
+	munmap(shstk_start, PAGE_SIZE);
+
+	if (!ret)
+		printf("[OK]\tGuard gap test, placement mapping's gaps\n");
+
+	return ret;
+}
+
 /*
  * Too complicated to pull it out of the 32 bit header, but also get the
  * 64 bit one needed above. Just define a copy here.
@@ -850,9 +903,15 @@ int main(int argc, char *argv[])
 		goto out;
 	}
 
-	if (test_guard_gap()) {
+	if (test_guard_gap_other_gaps()) {
 		ret = 1;
-		printf("[FAIL]\tGuard gap test\n");
+		printf("[FAIL]\tGuard gap test, other mappings' gaps\n");
+		goto out;
+	}
+
+	if (test_guard_gap_new_mappings_gaps()) {
+		ret = 1;
+		printf("[FAIL]\tGuard gap test, placement mapping's gaps\n");
 		goto out;
 	}
 
-- 
2.34.1