From nobody Sat Oct 4 05:02:34 2025 From: Anthony Yznaga To: linux-mm@kvack.org Cc: akpm@linux-foundation.org, andreyknvl@gmail.com, arnd@arndb.de, bp@alien8.de, brauner@kernel.org, bsegall@google.com, corbet@lwn.net, dave.hansen@linux.intel.com, david@redhat.com, dietmar.eggemann@arm.com, ebiederm@xmission.com, hpa@zytor.com, jakub.wartak@mailbox.org, jannh@google.com, 
juri.lelli@redhat.com, khalid@kernel.org, liam.howlett@oracle.com, linyongting@bytedance.com, lorenzo.stoakes@oracle.com, luto@kernel.org, markhemm@googlemail.com, maz@kernel.org, mhiramat@kernel.org, mgorman@suse.de, mhocko@suse.com, mingo@redhat.com, muchun.song@linux.dev, neilb@suse.de, osalvador@suse.de, pcc@google.com, peterz@infradead.org, pfalcato@suse.de, rostedt@goodmis.org, rppt@kernel.org, shakeel.butt@linux.dev, surenb@google.com, tglx@linutronix.de, vasily.averin@linux.dev, vbabka@suse.cz, vincent.guittot@linaro.org, viro@zeniv.linux.org.uk, vschneid@redhat.com, willy@infradead.org, x86@kernel.org, xhao@linux.alibaba.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org Subject: [PATCH v3 01/22] mm: Add msharefs filesystem Date: Tue, 19 Aug 2025 18:03:54 -0700 Message-ID: <20250820010415.699353-2-anthony.yznaga@oracle.com> X-Mailer: git-send-email 2.47.3 In-Reply-To: <20250820010415.699353-1-anthony.yznaga@oracle.com> References: <20250820010415.699353-1-anthony.yznaga@oracle.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" 
From: Khalid Aziz Add a pseudo filesystem that contains files and page table sharing information that enables processes to share page table entries. This patch adds the basic filesystem that can be mounted, a CONFIG_MSHARE option to enable the feature, and documentation. Signed-off-by: Khalid Aziz Signed-off-by: Anthony Yznaga --- Documentation/filesystems/index.rst | 1 + Documentation/filesystems/msharefs.rst | 96 +++++++++++++++++++++++++ include/uapi/linux/magic.h | 1 + mm/Kconfig | 11 +++ mm/Makefile | 4 ++ mm/mshare.c | 97 ++++++++++++++++++++++++++ 6 files changed, 210 insertions(+) create mode 100644 Documentation/filesystems/msharefs.rst create mode 100644 mm/mshare.c diff --git a/Documentation/filesystems/index.rst b/Documentation/filesystem= s/index.rst index 11a599387266..dcd6605eb228 100644 --- a/Documentation/filesystems/index.rst +++ b/Documentation/filesystems/index.rst @@ -102,6 +102,7 @@ Documentation for filesystem implementations. fuse-passthrough inotify isofs + msharefs nilfs2 nfs/index ntfs3 diff --git a/Documentation/filesystems/msharefs.rst b/Documentation/filesys= tems/msharefs.rst new file mode 100644 index 000000000000..3e5b7d531821 --- /dev/null +++ b/Documentation/filesystems/msharefs.rst @@ -0,0 +1,96 @@ +.. 
SPDX-License-Identifier: GPL-2.0 + +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D +Msharefs - A filesystem to support shared page tables +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D + +What is msharefs? +----------------- + +msharefs is a pseudo filesystem that allows multiple processes to +share page table entries for shared pages. To enable support for +msharefs the kernel must be compiled with CONFIG_MSHARE set. + +msharefs is typically mounted like this:: + + mount -t msharefs none /sys/fs/mshare + +A file created on msharefs creates a new shared region where all +processes mapping that region will map it using shared page table +entries. Once the size of the region has been established via +ftruncate() or fallocate(), the region can be mapped into processes +and ioctls used to map and unmap objects within it. Note that an +msharefs file is a control file and accessing mapped objects within +a shared region through read or write of the file is not permitted. + +How to use mshare +----------------- + +Here are the basic steps for using mshare: + + 1. Mount msharefs on /sys/fs/mshare:: + + mount -t msharefs msharefs /sys/fs/mshare + + 2. mshare regions have alignment and size requirements. The start + address of a region must be aligned to a fixed boundary, and the + size of the region must be a multiple of that same value. The + value can be obtained by reading ``/sys/fs/mshare/mshare_info``, + which returns a single number in text format. Both the alignment + and the size multiple of every mshare region use this number. + + 3. For the process creating an mshare region: + + a. Create a file on /sys/fs/mshare, for example:: + + fd =3D open("/sys/fs/mshare/shareme", + O_RDWR|O_CREAT|O_EXCL, 0600); + + b. 
Establish the size of the region:: + + fallocate(fd, 0, 0, BUF_SIZE); + + or:: + + ftruncate(fd, BUF_SIZE); + + c. Map some memory in the region:: + + struct mshare_create mcreate; + + mcreate.region_offset =3D 0; + mcreate.size =3D BUF_SIZE; + mcreate.offset =3D 0; + mcreate.prot =3D PROT_READ | PROT_WRITE; + mcreate.flags =3D MAP_ANONYMOUS | MAP_SHARED | MAP_FIXED; + mcreate.fd =3D -1; + + ioctl(fd, MSHAREFS_CREATE_MAPPING, &mcreate); + + d. Map the mshare region into the process:: + + mmap(NULL, BUF_SIZE, + PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + + e. Read from and write to the mshare region normally. + + + 4. For processes attaching to an mshare region: + + a. Open the msharefs file, for example:: + + fd =3D open("/sys/fs/mshare/shareme", O_RDWR); + + b. Get the size of the mshare region from the file:: + + fstat(fd, &sb); + mshare_size =3D sb.st_size; + + c. Map the mshare region into the process:: + + mmap(NULL, mshare_size, + PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + + 5. To delete the mshare region:: + + unlink("/sys/fs/mshare/shareme"); diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h index bb575f3ab45e..e53dd6063cba 100644 --- a/include/uapi/linux/magic.h +++ b/include/uapi/linux/magic.h @@ -103,5 +103,6 @@ #define DEVMEM_MAGIC 0x454d444d /* "DMEM" */ #define SECRETMEM_MAGIC 0x5345434d /* "SECM" */ #define PID_FS_MAGIC 0x50494446 /* "PIDF" */ +#define MSHARE_MAGIC 0x4d534852 /* "MSHR" */ =20 #endif /* __LINUX_MAGIC_H__ */ diff --git a/mm/Kconfig b/mm/Kconfig index 4108bcd96784..8b50e9785729 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -1400,6 +1400,17 @@ config PT_RECLAIM config FIND_NORMAL_PAGE def_bool n =20 +config MSHARE + bool "Mshare" + depends on MMU + help + Enable msharefs: A pseudo filesystem that allows multiple processes + to share kernel resources for mapping shared pages. 
A file created on + msharefs represents a shared region where all processes mapping that + region will map objects within it with shared page table entries and + VMAs. Ioctls are used to configure and map objects into the shared + region. + source "mm/damon/Kconfig" =20 endmenu diff --git a/mm/Makefile b/mm/Makefile index ef54aa615d9d..4af111b29c68 100644 --- a/mm/Makefile +++ b/mm/Makefile @@ -48,6 +48,10 @@ ifdef CONFIG_64BIT mmu-$(CONFIG_MMU) +=3D mseal.o endif =20 +ifdef CONFIG_MSHARE +mmu-$(CONFIG_MMU) +=3D mshare.o +endif + obj-y :=3D filemap.o mempool.o oom_kill.o fadvise.o \ maccess.o page-writeback.o folio-compat.o \ readahead.o swap.o truncate.o vmscan.o shrinker.o \ diff --git a/mm/mshare.c b/mm/mshare.c new file mode 100644 index 000000000000..f703af49ec81 --- /dev/null +++ b/mm/mshare.c @@ -0,0 +1,97 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Enable cooperating processes to share page tables between + * them to reduce the extra memory consumed by multiple copies + * of page tables. + * + * This code adds an in-memory filesystem - msharefs. + * msharefs is used to manage page table sharing. + * + * + * Copyright (C) 2024 Oracle Corp. All rights reserved. 
+ * Author: Khalid Aziz + * + */ + +#include +#include +#include + +static const struct file_operations msharefs_file_operations =3D { + .open =3D simple_open, +}; + +static const struct super_operations mshare_s_ops =3D { + .statfs =3D simple_statfs, +}; + +static int +msharefs_fill_super(struct super_block *sb, struct fs_context *fc) +{ + struct inode *inode; + + sb->s_blocksize =3D PAGE_SIZE; + sb->s_blocksize_bits =3D PAGE_SHIFT; + sb->s_maxbytes =3D MAX_LFS_FILESIZE; + sb->s_magic =3D MSHARE_MAGIC; + sb->s_op =3D &mshare_s_ops; + sb->s_time_gran =3D 1; + + inode =3D new_inode(sb); + if (!inode) + return -ENOMEM; + + inode->i_ino =3D 1; + inode->i_mode =3D S_IFDIR | 0777; + simple_inode_init_ts(inode); + inode->i_op =3D &simple_dir_inode_operations; + inode->i_fop =3D &simple_dir_operations; + set_nlink(inode, 2); + + sb->s_root =3D d_make_root(inode); + if (!sb->s_root) + return -ENOMEM; + + return 0; +} + +static int +msharefs_get_tree(struct fs_context *fc) +{ + return get_tree_nodev(fc, msharefs_fill_super); +} + +static const struct fs_context_operations msharefs_context_ops =3D { + .get_tree =3D msharefs_get_tree, +}; + +static int +mshare_init_fs_context(struct fs_context *fc) +{ + fc->ops =3D &msharefs_context_ops; + return 0; +} + +static struct file_system_type mshare_fs =3D { + .name =3D "msharefs", + .init_fs_context =3D mshare_init_fs_context, + .kill_sb =3D kill_litter_super, +}; + +static int __init +mshare_init(void) +{ + int ret; + + ret =3D sysfs_create_mount_point(fs_kobj, "mshare"); + if (ret) + return ret; + + ret =3D register_filesystem(&mshare_fs); + if (ret) + sysfs_remove_mount_point(fs_kobj, "mshare"); + + return ret; +} + +core_initcall(mshare_init); --=20 2.47.1 
From nobody Sat Oct 4 05:02:34 2025 From: Anthony Yznaga To: linux-mm@kvack.org Cc: akpm@linux-foundation.org, andreyknvl@gmail.com, arnd@arndb.de, bp@alien8.de, brauner@kernel.org, bsegall@google.com, corbet@lwn.net, dave.hansen@linux.intel.com, david@redhat.com, dietmar.eggemann@arm.com, ebiederm@xmission.com, hpa@zytor.com, jakub.wartak@mailbox.org, jannh@google.com, juri.lelli@redhat.com, khalid@kernel.org, liam.howlett@oracle.com, linyongting@bytedance.com, lorenzo.stoakes@oracle.com, luto@kernel.org, markhemm@googlemail.com, maz@kernel.org, mhiramat@kernel.org, mgorman@suse.de, mhocko@suse.com, mingo@redhat.com, 
muchun.song@linux.dev, neilb@suse.de, osalvador@suse.de, pcc@google.com, peterz@infradead.org, pfalcato@suse.de, rostedt@goodmis.org, rppt@kernel.org, shakeel.butt@linux.dev, surenb@google.com, tglx@linutronix.de, vasily.averin@linux.dev, vbabka@suse.cz, vincent.guittot@linaro.org, viro@zeniv.linux.org.uk, vschneid@redhat.com, willy@infradead.org, x86@kernel.org, xhao@linux.alibaba.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org Subject: [PATCH v3 02/22] mm/mshare: pre-populate msharefs with information file Date: Tue, 19 Aug 2025 18:03:55 -0700 Message-ID: <20250820010415.699353-3-anthony.yznaga@oracle.com> X-Mailer: git-send-email 2.47.3 In-Reply-To: <20250820010415.699353-1-anthony.yznaga@oracle.com> References: <20250820010415.699353-1-anthony.yznaga@oracle.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" 
From: Khalid Aziz Users of mshare need to know the size and alignment requirement for shared regions. Pre-populate msharefs with a file, mshare_info, that provides this information. For now, pagetable sharing is hardcoded to be at the PUD level. Signed-off-by: Khalid Aziz Signed-off-by: Anthony Yznaga --- mm/mshare.c | 77 +++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 75 insertions(+), 2 deletions(-) diff --git a/mm/mshare.c b/mm/mshare.c index f703af49ec81..d666471bc94b 100644 --- a/mm/mshare.c +++ b/mm/mshare.c @@ -17,18 +17,74 @@ #include #include =20 +const unsigned long mshare_align =3D P4D_SIZE; + static const struct file_operations msharefs_file_operations =3D { .open =3D simple_open, }; =20 +struct msharefs_info { + struct dentry *info_dentry; +}; + +static ssize_t +mshare_info_read(struct file *file, char __user *buf, size_t nbytes, + loff_t *ppos) +{ + char s[80]; + + sprintf(s, "%ld\n", mshare_align); + return simple_read_from_buffer(buf, nbytes, ppos, s, strlen(s)); +} + +static const struct file_operations mshare_info_ops =3D { + .read =3D mshare_info_read, + .llseek =3D noop_llseek, +}; + static const struct super_operations mshare_s_ops =3D { .statfs =3D simple_statfs, }; =20 +static int +msharefs_create_mshare_info(struct super_block *sb) +{ + struct msharefs_info *info =3D sb->s_fs_info; + struct dentry *root =3D sb->s_root; + struct dentry *dentry; + struct inode *inode; + int ret; + + ret =3D -ENOMEM; + inode =3D new_inode(sb); + if (!inode) + goto out; + + inode->i_ino =3D 2; + simple_inode_init_ts(inode); + inode_init_owner(&nop_mnt_idmap, inode, NULL, S_IFREG | 
0444); + inode->i_fop =3D &mshare_info_ops; + + dentry =3D d_alloc_name(root, "mshare_info"); + if (!dentry) + goto out; + + info->info_dentry =3D dentry; + d_add(dentry, inode); + + return 0; +out: + iput(inode); + + return ret; +} + static int msharefs_fill_super(struct super_block *sb, struct fs_context *fc) { + struct msharefs_info *info; struct inode *inode; + int ret; =20 sb->s_blocksize =3D PAGE_SIZE; sb->s_blocksize_bits =3D PAGE_SHIFT; @@ -37,6 +93,12 @@ msharefs_fill_super(struct super_block *sb, struct fs_co= ntext *fc) sb->s_op =3D &mshare_s_ops; sb->s_time_gran =3D 1; =20 + info =3D kzalloc(sizeof(*info), GFP_KERNEL); + if (!info) + return -ENOMEM; + + sb->s_fs_info =3D info; + inode =3D new_inode(sb); if (!inode) return -ENOMEM; @@ -52,7 +114,9 @@ msharefs_fill_super(struct super_block *sb, struct fs_co= ntext *fc) if (!sb->s_root) return -ENOMEM; =20 - return 0; + ret =3D msharefs_create_mshare_info(sb); + + return ret; } =20 static int @@ -72,10 +136,19 @@ mshare_init_fs_context(struct fs_context *fc) return 0; } =20 +static void +msharefs_kill_super(struct super_block *sb) +{ + struct msharefs_info *info =3D sb->s_fs_info; + + kfree(info); + kill_litter_super(sb); +} + static struct file_system_type mshare_fs =3D { .name =3D "msharefs", .init_fs_context =3D mshare_init_fs_context, - .kill_sb =3D kill_litter_super, + .kill_sb =3D msharefs_kill_super, }; =20 static int __init --=20 2.47.1 
From nobody Sat Oct 4 05:02:34 2025 From: Anthony Yznaga To: linux-mm@kvack.org Cc: akpm@linux-foundation.org, andreyknvl@gmail.com, arnd@arndb.de, bp@alien8.de, brauner@kernel.org, bsegall@google.com, corbet@lwn.net, dave.hansen@linux.intel.com, david@redhat.com, dietmar.eggemann@arm.com, ebiederm@xmission.com, hpa@zytor.com, jakub.wartak@mailbox.org, jannh@google.com, juri.lelli@redhat.com, khalid@kernel.org, liam.howlett@oracle.com, linyongting@bytedance.com, lorenzo.stoakes@oracle.com, luto@kernel.org, markhemm@googlemail.com, maz@kernel.org, mhiramat@kernel.org, mgorman@suse.de, mhocko@suse.com, mingo@redhat.com, muchun.song@linux.dev, neilb@suse.de, osalvador@suse.de, pcc@google.com, peterz@infradead.org, pfalcato@suse.de, rostedt@goodmis.org, rppt@kernel.org, shakeel.butt@linux.dev, surenb@google.com, tglx@linutronix.de, 
vasily.averin@linux.dev, vbabka@suse.cz, vincent.guittot@linaro.org, viro@zeniv.linux.org.uk, vschneid@redhat.com, willy@infradead.org, x86@kernel.org, xhao@linux.alibaba.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org Subject: [PATCH v3 03/22] mm/mshare: make msharefs writable and support directories Date: Tue, 19 Aug 2025 18:03:56 -0700 Message-ID: <20250820010415.699353-4-anthony.yznaga@oracle.com> X-Mailer: git-send-email 2.47.3 In-Reply-To: <20250820010415.699353-1-anthony.yznaga@oracle.com> References: <20250820010415.699353-1-anthony.yznaga@oracle.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" 
From: Khalid Aziz Make msharefs filesystem writable and allow creating directories to support better access control to mshare'd regions defined in msharefs. Signed-off-by: Khalid Aziz Signed-off-by: Anthony Yznaga --- mm/mshare.c | 116 +++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 115 insertions(+), 1 deletion(-) diff --git a/mm/mshare.c b/mm/mshare.c index d666471bc94b..c43b53a7323a 100644 --- a/mm/mshare.c +++ b/mm/mshare.c @@ -19,14 +19,128 @@ =20 const unsigned long mshare_align =3D P4D_SIZE; =20 +static const struct inode_operations msharefs_dir_inode_ops; +static const struct inode_operations msharefs_file_inode_ops; + static const struct file_operations msharefs_file_operations =3D { .open =3D simple_open, }; =20 +static struct inode +*msharefs_get_inode(struct mnt_idmap *idmap, struct super_block *sb, + const struct inode *dir, umode_t mode) +{ + struct inode *inode =3D new_inode(sb); + + if (!inode) + return ERR_PTR(-ENOMEM); + + inode->i_ino =3D get_next_ino(); + inode_init_owner(&nop_mnt_idmap, inode, dir, mode); + simple_inode_init_ts(inode); + + switch (mode & S_IFMT) { + case S_IFREG: + inode->i_op =3D &msharefs_file_inode_ops; + inode->i_fop =3D &msharefs_file_operations; + break; + case S_IFDIR: + inode->i_op =3D &msharefs_dir_inode_ops; + inode->i_fop =3D &simple_dir_operations; + inc_nlink(inode); + break; + default: + iput(inode); + return ERR_PTR(-EINVAL); + } + + return inode; +} + +static int +msharefs_mknod(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) +{ + struct inode *inode; + + inode =3D msharefs_get_inode(idmap, dir->i_sb, dir, mode); + if (IS_ERR(inode)) + return PTR_ERR(inode); + + d_instantiate(dentry, inode); + dget(dentry); + inode_set_mtime_to_ts(dir, inode_set_ctime_current(dir)); + + return 0; +} + +static int 
+msharefs_create(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode, bool excl) +{ + return msharefs_mknod(idmap, dir, dentry, mode | S_IFREG); +} + +static struct dentry * +msharefs_mkdir(struct mnt_idmap *idmap, struct inode *dir, + struct dentry *dentry, umode_t mode) +{ + int ret =3D msharefs_mknod(idmap, dir, dentry, mode | S_IFDIR); + + if (!ret) + inc_nlink(dir); + return ERR_PTR(ret); +} + struct msharefs_info { struct dentry *info_dentry; }; =20 +static inline bool +is_msharefs_info_file(const struct dentry *dentry) +{ + struct msharefs_info *info =3D dentry->d_sb->s_fs_info; + + return info->info_dentry =3D=3D dentry; +} + +static int +msharefs_rename(struct mnt_idmap *idmap, + struct inode *old_dir, struct dentry *old_dentry, + struct inode *new_dir, struct dentry *new_dentry, + unsigned int flags) +{ + if (is_msharefs_info_file(old_dentry) || + is_msharefs_info_file(new_dentry)) + return -EPERM; + + return simple_rename(idmap, old_dir, old_dentry, new_dir, + new_dentry, flags); +} + +static int +msharefs_unlink(struct inode *dir, struct dentry *dentry) +{ + if (is_msharefs_info_file(dentry)) + return -EPERM; + + return simple_unlink(dir, dentry); +} + +static const struct inode_operations msharefs_file_inode_ops =3D { + .setattr =3D simple_setattr, +}; + +static const struct inode_operations msharefs_dir_inode_ops =3D { + .create =3D msharefs_create, + .lookup =3D simple_lookup, + .link =3D simple_link, + .unlink =3D msharefs_unlink, + .mkdir =3D msharefs_mkdir, + .rmdir =3D simple_rmdir, + .rename =3D msharefs_rename, +}; + static ssize_t mshare_info_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos) @@ -106,7 +220,7 @@ msharefs_fill_super(struct super_block *sb, struct fs_c= ontext *fc) inode->i_ino =3D 1; inode->i_mode =3D S_IFDIR | 0777; simple_inode_init_ts(inode); - inode->i_op =3D &simple_dir_inode_operations; + inode->i_op =3D &msharefs_dir_inode_ops; inode->i_fop =3D &simple_dir_operations; 
set_nlink(inode, 2); =20 --=20 2.47.1 From nobody Sat Oct 4 05:02:34 2025 Received: from mx0a-00069f02.pphosted.com (mx0a-00069f02.pphosted.com [205.220.165.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 37A7F1F4161; Wed, 20 Aug 2025 01:05:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=205.220.165.32 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755651959; cv=none; b=itdBuKd2y0L6cid3H2kjHE3Pi0UEPb+qIGqFJIX8U8CBmErLMZ8t/0kOaUdyMnnIpAbOKKIo6sdYJDbMMGT1K3+79xTjXIt0FADKS/V7/ZZJnv23aLOQqxECWoBOsdT9qkIDf6hFcKIH2ml4vNm9DMtCCYjgoHlcQjkUq5bo5fc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755651959; c=relaxed/simple; bh=O0ZKrrHX4tTjCQplZtJ1DpbnmSC1BfCawA8dp8HslY8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=aQbwe+Dhvo5O88ay840z2Kly++o5hdXOSbC/8E9ZaQIGeLrC8gIlxEJ5o/UONlghsyk78xcELnVrQHWA5VLUNDvt3wfAYYA7LH4MZqQ/8MdSh2ODeYBS4HJv8/XAr45xgDXj1nsaF66QTgOXGlV68iCe75HxePiTcrnMgOHyJ84= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com; spf=pass smtp.mailfrom=oracle.com; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b=Qx3zTwpE; arc=none smtp.client-ip=205.220.165.32 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=oracle.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="Qx3zTwpE" Received: from pps.filterd (m0246617.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 57JLCHN2012275; Wed, 20 Aug 2025 01:04:36 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc 
:content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=corp-2025-04-25; bh=fQFnq 88K9K4MmLcnD5W2me3DfjuW88ktErhJymD1WnQ=; b=Qx3zTwpEPQiJbr5opLwnj wmakVtwduCmaE3tTazo7NJUMAQCEHfLwh7/zfEi1FrWDp2Vyff1swsmjQJBz87zj OAfNy32xhgUU9kOHk8s7RpobXBdEdY+yOvJaWo/BNTS9HGwocgV6O8pXxTCChtrG rZwz2r4wnyN571Uo8xFztiEtXDAyr+UK4OPRZBPzJzOBdObLKE8k4aVKjoXZVNQK D8+MD8pxuZcp7k4rSuQSMC4a/t8pYWx5JOMI/iGx4wxkli24AaMDti/E0VD8F/a5 DYVCSKIqyRn8bJqVxiK9LodcEIykI2AC9sxpYuuS8CuEynLfXThTFnMotro2nkd1 A== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 48n0trr8dq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Aug 2025 01:04:36 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 57JN314f007380; Wed, 20 Aug 2025 01:04:36 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 48my3q29rq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Aug 2025 01:04:35 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 57K14NdC011685; Wed, 20 Aug 2025 01:04:34 GMT Received: from localhost.localdomain (ca-dev60.us.oracle.com [10.129.136.27]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 48my3q29gw-5; Wed, 20 Aug 2025 01:04:34 +0000 From: Anthony Yznaga To: linux-mm@kvack.org Cc: akpm@linux-foundation.org, andreyknvl@gmail.com, arnd@arndb.de, bp@alien8.de, brauner@kernel.org, bsegall@google.com, corbet@lwn.net, dave.hansen@linux.intel.com, david@redhat.com, dietmar.eggemann@arm.com, ebiederm@xmission.com, 
hpa@zytor.com, jakub.wartak@mailbox.org, jannh@google.com, juri.lelli@redhat.com, khalid@kernel.org, liam.howlett@oracle.com, linyongting@bytedance.com, lorenzo.stoakes@oracle.com, luto@kernel.org, markhemm@googlemail.com, maz@kernel.org, mhiramat@kernel.org, mgorman@suse.de, mhocko@suse.com, mingo@redhat.com, muchun.song@linux.dev, neilb@suse.de, osalvador@suse.de, pcc@google.com, peterz@infradead.org, pfalcato@suse.de, rostedt@goodmis.org, rppt@kernel.org, shakeel.butt@linux.dev, surenb@google.com, tglx@linutronix.de, vasily.averin@linux.dev, vbabka@suse.cz, vincent.guittot@linaro.org, viro@zeniv.linux.org.uk, vschneid@redhat.com, willy@infradead.org, x86@kernel.org, xhao@linux.alibaba.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org Subject: [PATCH v3 04/22] mm/mshare: allocate an mm_struct for msharefs files Date: Tue, 19 Aug 2025 18:03:57 -0700 Message-ID: <20250820010415.699353-5-anthony.yznaga@oracle.com> X-Mailer: git-send-email 2.47.3 In-Reply-To: <20250820010415.699353-1-anthony.yznaga@oracle.com> References: <20250820010415.699353-1-anthony.yznaga@oracle.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1099,Hydra:6.1.9,FMLib:17.12.80.40 definitions=2025-08-19_04,2025-08-14_01,2025-03-28_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 adultscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 suspectscore=0 mlxlogscore=999 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2508110000 definitions=main-2508200007 X-Proofpoint-ORIG-GUID: cHfUWM3KGiFs7FJoiOuwHErwqfvdF0rU X-Proofpoint-GUID: cHfUWM3KGiFs7FJoiOuwHErwqfvdF0rU X-Proofpoint-Spam-Details-Enc: AW1haW4tMjUwODE5MDE5NyBTYWx0ZWRfX+V+sG7is/uC0 FR8t0L4s8S37T0jKbVUSzYW6Vn+KN5uC9PmxNS7i0HKw54PL2u8Qp/FFBHiTpwOsLkWb+9Mx9Q2 
hJ6NgARxnInuZ+mJmpbVfRngMlyviPDYAmmiQTCNKzP0kHb38TJSe90pNUE77CtzcJGanBgACoD SSWuuT6/F2qkhR2P3HcNz5zSvKYFy9KFBoT+Zxg+/1wQdGgb6K4Ckaz9o/owAOOcsUVZpOGBK6l Kkz00FvgZWY2B9zMQSZOzcK5auVBWnCmmqu8Nq9YoO6P6xo5c3uJlYSlx+9a5iFb4Fm0o0UOCik KqIHJhthilO7tzm7Ja7Co3S24Yy/AgarEGSvpi3RD3pTF5GHiqCZFJYn4JK6Ot/+VAVkCEOheFK rAvygBTEBSS88CrUSVjGknuTdiV1/w== X-Authority-Analysis: v=2.4 cv=Qp4HHVyd c=1 sm=1 tr=0 ts=68a51f24 cx=c_pps a=XiAAW1AwiKB2Y8Wsi+sD2Q==:117 a=XiAAW1AwiKB2Y8Wsi+sD2Q==:17 a=2OwXVqhp2XgA:10 a=VwQbUJbxAAAA:8 a=yPCof4ZbAAAA:8 a=eZrgD7rVaXJ09-4h1GUA:9 a=UhEZJTgQB8St2RibIkdl:22 a=Z5ABNNGmrOfJ6cZ5bIyy:22 a=QOGEsqRv6VhmHaoFNykA:22 Content-Type: text/plain; charset="utf-8" When a new file is created under msharefs, allocate a new mm_struct to be associated with it for the lifetime of the file. The mm_struct will hold the VMAs and pagetables for the mshare region the file represents. Signed-off-by: Khalid Aziz Signed-off-by: Anthony Yznaga --- mm/mshare.c | 68 +++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 68 insertions(+) diff --git a/mm/mshare.c b/mm/mshare.c index c43b53a7323a..400f198c0791 100644 --- a/mm/mshare.c +++ b/mm/mshare.c @@ -19,6 +19,11 @@ =20 const unsigned long mshare_align =3D P4D_SIZE; =20 +struct mshare_data { + struct mm_struct *mm; + refcount_t ref; +}; + static const struct inode_operations msharefs_dir_inode_ops; static const struct inode_operations msharefs_file_inode_ops; =20 @@ -26,11 +31,55 @@ static const struct file_operations msharefs_file_opera= tions =3D { .open =3D simple_open, }; =20 +static int +msharefs_fill_mm(struct inode *inode) +{ + struct mm_struct *mm; + struct mshare_data *m_data =3D NULL; + int ret =3D -ENOMEM; + + mm =3D mm_alloc(); + if (!mm) + return -ENOMEM; + + mm->mmap_base =3D mm->task_size =3D 0; + + m_data =3D kzalloc(sizeof(*m_data), GFP_KERNEL); + if (!m_data) + goto err_free; + m_data->mm =3D mm; + + refcount_set(&m_data->ref, 1); + inode->i_private =3D m_data; + return 0; + +err_free: + mmput(mm); 
+ kfree(m_data); + return ret; +} + +static void +msharefs_delmm(struct mshare_data *m_data) +{ + mmput(m_data->mm); + kfree(m_data); +} + +static void mshare_data_putref(struct mshare_data *m_data) +{ + if (!refcount_dec_and_test(&m_data->ref)) + return; + + msharefs_delmm(m_data); +} + static struct inode *msharefs_get_inode(struct mnt_idmap *idmap, struct super_block *sb, const struct inode *dir, umode_t mode) { struct inode *inode =3D new_inode(sb); + int ret; =20 if (!inode) return ERR_PTR(-ENOMEM); @@ -43,6 +92,11 @@ static struct inode case S_IFREG: inode->i_op =3D &msharefs_file_inode_ops; inode->i_fop =3D &msharefs_file_operations; + ret =3D msharefs_fill_mm(inode); + if (ret) { + iput(inode); + inode =3D ERR_PTR(ret); + } break; case S_IFDIR: inode->i_op =3D &msharefs_dir_inode_ops; @@ -141,6 +195,19 @@ static const struct inode_operations msharefs_dir_inod= e_ops =3D { .rename =3D msharefs_rename, }; =20 +static void +msharefs_evict_inode(struct inode *inode) +{ + struct mshare_data *m_data =3D inode->i_private; + + if (!m_data) + goto out; + + mshare_data_putref(m_data); +out: + clear_inode(inode); +} + static ssize_t mshare_info_read(struct file *file, char __user *buf, size_t nbytes, loff_t *ppos) @@ -158,6 +225,7 @@ static const struct file_operations mshare_info_ops =3D= { =20 static const struct super_operations mshare_s_ops =3D { .statfs =3D simple_statfs, + .evict_inode =3D msharefs_evict_inode, }; =20 static int --=20 2.47.1 From nobody Sat Oct 4 05:02:34 2025 Received: from mx0a-00069f02.pphosted.com (mx0a-00069f02.pphosted.com [205.220.165.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EB11E19CD1B; Wed, 20 Aug 2025 01:05:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=205.220.165.32 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755651952; cv=none; 
b=GBINba7hpUh81Tw5fmbAwy9X166GNxiveYyWglsGUeXTgORpC8oZKMuOAg60Eh9M1FCG5u3fba+0yAe7TvaLgNi7+mnvGSP/6oUHioESm/oHE7u6SJ/8cVr4ED8q3/LIgETFrmseMbaxxGek7zugZ5RlpKwJQcjK1oFRxfo/f/M= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755651952; c=relaxed/simple; bh=VOTlOBxtM59V7Zzj573LgcG59E0PSEmDSLHYYRxscvg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=J/hOTa0ai2qbkpjXJrVvnm645Nb4G7z5jVIOi3Od1YpQi8kPC0EeH2qKsRQZmKw4CIia1N7Tx+1AFpgRqiG6zO7SNMxjG2BC6Byok2fCIGuMXNwxj9l2xYSaIEwIUT2b3kDR3n8XyosxFr6DRjx9C5l+n7EojWvkF6T+5HAjqfI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com; spf=pass smtp.mailfrom=oracle.com; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b=R6Ykf47M; arc=none smtp.client-ip=205.220.165.32 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=oracle.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="R6Ykf47M" Received: from pps.filterd (m0246617.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 57JLCEq1012242; Wed, 20 Aug 2025 01:04:39 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=corp-2025-04-25; bh=x+H+s A2wbapk0bAi60LM8rU+Rifb31fILDaW1a9nREQ=; b=R6Ykf47MjXhvjC985/OeR XVQkGZBP3q1CQ9WiaQXraupQua32LyA7cZULB8GPAWTmTSwLS/qCk3YMf3nP/45M pjVR4W74u+wlnNz2O0D0Z7sbGmtLyw6gh0ZFvQRoWFLs3SmKGGmy5vHHCGbAiPww SetDygntZgUFHaqH9mFwG6ukU915Us+9e8ro+95jFBh3IQKcwB8Q4Iujl276LowT xXfZpbZ7bC6pPv6J1uNa1QKFuzxNCrN+6gH9eRdQpetPz5dBGB+DmlaVl9bh3e1n KE/RDgiMpk/TM839b0GUCVQDr3PecQjKbl7pG9Ftk8lMyqx+GR+ps5gDClnaJGQF w== Received: from 
phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 48n0trr8dt-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Aug 2025 01:04:39 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 57JNXLM8007358; Wed, 20 Aug 2025 01:04:38 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 48my3q29sf-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Aug 2025 01:04:38 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 57K14NdE011685; Wed, 20 Aug 2025 01:04:37 GMT Received: from localhost.localdomain (ca-dev60.us.oracle.com [10.129.136.27]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 48my3q29gw-6; Wed, 20 Aug 2025 01:04:37 +0000 From: Anthony Yznaga To: linux-mm@kvack.org Cc: akpm@linux-foundation.org, andreyknvl@gmail.com, arnd@arndb.de, bp@alien8.de, brauner@kernel.org, bsegall@google.com, corbet@lwn.net, dave.hansen@linux.intel.com, david@redhat.com, dietmar.eggemann@arm.com, ebiederm@xmission.com, hpa@zytor.com, jakub.wartak@mailbox.org, jannh@google.com, juri.lelli@redhat.com, khalid@kernel.org, liam.howlett@oracle.com, linyongting@bytedance.com, lorenzo.stoakes@oracle.com, luto@kernel.org, markhemm@googlemail.com, maz@kernel.org, mhiramat@kernel.org, mgorman@suse.de, mhocko@suse.com, mingo@redhat.com, muchun.song@linux.dev, neilb@suse.de, osalvador@suse.de, pcc@google.com, peterz@infradead.org, pfalcato@suse.de, rostedt@goodmis.org, rppt@kernel.org, shakeel.butt@linux.dev, surenb@google.com, tglx@linutronix.de, 
vasily.averin@linux.dev, vbabka@suse.cz, vincent.guittot@linaro.org, viro@zeniv.linux.org.uk, vschneid@redhat.com, willy@infradead.org, x86@kernel.org, xhao@linux.alibaba.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org Subject: [PATCH v3 05/22] mm/mshare: add ways to set the size of an mshare region Date: Tue, 19 Aug 2025 18:03:58 -0700 Message-ID: <20250820010415.699353-6-anthony.yznaga@oracle.com> X-Mailer: git-send-email 2.47.3 In-Reply-To: <20250820010415.699353-1-anthony.yznaga@oracle.com> References: <20250820010415.699353-1-anthony.yznaga@oracle.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1099,Hydra:6.1.9,FMLib:17.12.80.40 definitions=2025-08-19_04,2025-08-14_01,2025-03-28_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 adultscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 suspectscore=0 mlxlogscore=920 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2508110000 definitions=main-2508200007 X-Proofpoint-ORIG-GUID: LHUwwBAlFDOKUZ1C5Oa7Uw7vETmGP-xi X-Proofpoint-GUID: LHUwwBAlFDOKUZ1C5Oa7Uw7vETmGP-xi X-Proofpoint-Spam-Details-Enc: AW1haW4tMjUwODE5MDE5NyBTYWx0ZWRfXx3Dliz+VYorm 6jh3wb507jjHEcFE0clu3pt9XherMgo3M8okFWpuLKVV3v18+WVZ9MrDLbAbNSh/rBYC+oeCTEi mUV7mFEwJlu7igr87w1/qX5BoID8gLdojVVlJQTiCuEQHuXK9NokfAKVqP6lQGfewsWqZLgDOZ2 rAd9lsgKVUgfAoXSXYHf8xUSyADucGmPpyt1Y5Kacq+MuA6gZ7ajiyxcJX2n9r2UYE6StbhnxeQ bkLJTtYhBQ33OksTNqeY5QvhgeI4HBqrDcSuRdnPIeiAIEtcoek17PrwPPFoG+lYv8qK6suKaru MYgTMkAoC+5KcYgWUMAG3AlNEhb0EO4Sud42kXQuLcCNXsfk9xXsmwS2kXMAhqzbOvcwSxvNVr/ 7WtJTJSbNs4sxi3ttoA9MzxGJtmOlA== X-Authority-Analysis: v=2.4 cv=Qp4HHVyd c=1 sm=1 tr=0 ts=68a51f27 cx=c_pps a=XiAAW1AwiKB2Y8Wsi+sD2Q==:117 a=XiAAW1AwiKB2Y8Wsi+sD2Q==:17 a=RpjSixA7iQhGN-bV:21 a=2OwXVqhp2XgA:10 
a=yPCof4ZbAAAA:8 a=HTk2KgwdMEbdErbPyz8A:9 a=UhEZJTgQB8St2RibIkdl:22 a=Z5ABNNGmrOfJ6cZ5bIyy:22 a=QOGEsqRv6VhmHaoFNykA:22 Content-Type: text/plain; charset="utf-8" Add file and inode operations to allow the size of an mshare region to be set with fallocate() or ftruncate(). Signed-off-by: Anthony Yznaga --- mm/mshare.c | 87 ++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 86 insertions(+), 1 deletion(-) diff --git a/mm/mshare.c b/mm/mshare.c index 400f198c0791..bf859b176e09 100644 --- a/mm/mshare.c +++ b/mm/mshare.c @@ -16,19 +16,78 @@ #include #include #include +#include =20 const unsigned long mshare_align =3D P4D_SIZE; =20 +#define MSHARE_INITIALIZED 0x1 + struct mshare_data { struct mm_struct *mm; refcount_t ref; + unsigned long size; + unsigned long flags; }; =20 +static inline bool mshare_is_initialized(struct mshare_data *m_data) +{ + return test_bit(MSHARE_INITIALIZED, &m_data->flags); +} + +static int msharefs_set_size(struct mshare_data *m_data, unsigned long siz= e) +{ + int error =3D -EINVAL; + + if (mshare_is_initialized(m_data)) + goto out; + + if (m_data->size || (size & (mshare_align - 1))) + goto out; + + m_data->mm->task_size =3D m_data->size =3D size; + + set_bit(MSHARE_INITIALIZED, &m_data->flags); + error =3D 0; +out: + return error; +} + +static long msharefs_fallocate(struct file *file, int mode, loff_t offset, + loff_t len) +{ + struct inode *inode =3D file_inode(file); + struct mshare_data *m_data =3D inode->i_private; + int error; + + if (mode !=3D FALLOC_FL_ALLOCATE_RANGE) + return -EOPNOTSUPP; + + if (offset) + return -EINVAL; + + inode_lock(inode); + + error =3D inode_newsize_ok(inode, len); + if (error) + goto out; + + error =3D msharefs_set_size(m_data, len); + if (error) + goto out; + + i_size_write(inode, len); +out: + inode_unlock(inode); + + return error; +} + static const struct inode_operations msharefs_dir_inode_ops; static const struct inode_operations msharefs_file_inode_ops; =20 static const struct
file_operations msharefs_file_operations =3D { .open =3D simple_open, + .fallocate =3D msharefs_fallocate, }; =20 static int @@ -128,6 +187,32 @@ msharefs_mknod(struct mnt_idmap *idmap, struct inode *= dir, return 0; } =20 +static int msharefs_setattr(struct mnt_idmap *idmap, + struct dentry *dentry, struct iattr *attr) +{ + struct inode *inode =3D d_inode(dentry); + struct mshare_data *m_data =3D inode->i_private; + unsigned int ia_valid =3D attr->ia_valid; + int error; + + error =3D setattr_prepare(idmap, dentry, attr); + if (error) + return error; + + if (ia_valid & ATTR_SIZE) { + loff_t newsize =3D attr->ia_size; + + error =3D msharefs_set_size(m_data, newsize); + if (error) + return error; + + i_size_write(inode, newsize); + } + + setattr_copy(idmap, inode, attr); + return 0; +} + static int msharefs_create(struct mnt_idmap *idmap, struct inode *dir, struct dentry *dentry, umode_t mode, bool excl) @@ -182,7 +267,7 @@ msharefs_unlink(struct inode *dir, struct dentry *dentr= y) } =20 static const struct inode_operations msharefs_file_inode_ops =3D { - .setattr =3D simple_setattr, + .setattr =3D msharefs_setattr, }; =20 static const struct inode_operations msharefs_dir_inode_ops =3D { --=20 2.47.1 From nobody Sat Oct 4 05:02:34 2025 Received: from mx0a-00069f02.pphosted.com (mx0a-00069f02.pphosted.com [205.220.165.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A13501C3C04; Wed, 20 Aug 2025 01:05:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=205.220.165.32 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755651954; cv=none; b=Ij87fQduiP2q3AQoLLZLdNor5YNlPzI0TDNyCeSVidNu8QHcxLXfKy9N3xLjkkiKh0+gyaZPjeKEywCb1ohp3eHRDQCey/lSsAk9a6LDilm3IC6QoiZnQAJgzqtP1vTZySFfHG+53xzYSO4vH+4QYQs1QUmK96mgjee6F2TYXOA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; 
t=1755651954; c=relaxed/simple; bh=l8WrPTqDpXHT+zAulBOg7G5601YoHk0/n71OpcMmoIE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=eQRj6lhq06kXXia9xnolN3IDHPME7j7mOXURKYAZ1MaUxN/yHtU8sJbIKjjnZ0GFE2ultwPTb+m9tUEpgprzXY2lCGPcet2jgfeZs7ZjrdK6jUglv7XhwphZwwrQiKgHfQCHso9tZnh88OX5Iez1LdVKP2Lgxi1qfE725Ta3WI0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com; spf=pass smtp.mailfrom=oracle.com; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b=TsJWzzK4; arc=none smtp.client-ip=205.220.165.32 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=oracle.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="TsJWzzK4" Received: from pps.filterd (m0333521.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 57JLE2AZ008324; Wed, 20 Aug 2025 01:04:42 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=corp-2025-04-25; bh=joAKD yFfJF1Jn1BV6nqdFRNo25EhXa7oaKvhAKhAMM0=; b=TsJWzzK4uQdbD4trS9KNN X16/3jlvecQFlDTBLFpdvb5rsHhyjEBfgmBtqFSntlhbHzv1ujIlgxOGHSgfRX3i creZoVStTuzJ3HziYH6vx/BVhtO9NKH1DbpczR9BWElg5q6Ae9Sk6d2SOuTZO6HJ gIVbhOjPM7h+EnmcL+tf2Hvr9UCuclAkBaBYMLIoymlYMdiPt7jKeuLQp7wcKvQK 4+wVwuqW0lCF8uhtocdZqRsyriOfe7m6GeqFvZfKmhw4lnUi4fL3lmG3BAAAgik6 UjudOQxFfryrBGIoaD5SsbLBZeBIKREoedUtnC8WomhzK8DzMAllaEqsRw2Wi8rm A== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 48n0ts88gd-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Aug 2025 01:04:42 +0000 (GMT) Received: from 
pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 57K04sUS007394; Wed, 20 Aug 2025 01:04:41 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 48my3q29tf-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Aug 2025 01:04:41 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 57K14NdG011685; Wed, 20 Aug 2025 01:04:40 GMT Received: from localhost.localdomain (ca-dev60.us.oracle.com [10.129.136.27]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 48my3q29gw-7; Wed, 20 Aug 2025 01:04:39 +0000 From: Anthony Yznaga To: linux-mm@kvack.org Cc: akpm@linux-foundation.org, andreyknvl@gmail.com, arnd@arndb.de, bp@alien8.de, brauner@kernel.org, bsegall@google.com, corbet@lwn.net, dave.hansen@linux.intel.com, david@redhat.com, dietmar.eggemann@arm.com, ebiederm@xmission.com, hpa@zytor.com, jakub.wartak@mailbox.org, jannh@google.com, juri.lelli@redhat.com, khalid@kernel.org, liam.howlett@oracle.com, linyongting@bytedance.com, lorenzo.stoakes@oracle.com, luto@kernel.org, markhemm@googlemail.com, maz@kernel.org, mhiramat@kernel.org, mgorman@suse.de, mhocko@suse.com, mingo@redhat.com, muchun.song@linux.dev, neilb@suse.de, osalvador@suse.de, pcc@google.com, peterz@infradead.org, pfalcato@suse.de, rostedt@goodmis.org, rppt@kernel.org, shakeel.butt@linux.dev, surenb@google.com, tglx@linutronix.de, vasily.averin@linux.dev, vbabka@suse.cz, vincent.guittot@linaro.org, viro@zeniv.linux.org.uk, vschneid@redhat.com, willy@infradead.org, x86@kernel.org, xhao@linux.alibaba.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org Subject: [PATCH v3 06/22] mm/mshare: Add a vma 
flag to indicate an mshare region Date: Tue, 19 Aug 2025 18:03:59 -0700 Message-ID: <20250820010415.699353-7-anthony.yznaga@oracle.com> X-Mailer: git-send-email 2.47.3 In-Reply-To: <20250820010415.699353-1-anthony.yznaga@oracle.com> References: <20250820010415.699353-1-anthony.yznaga@oracle.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1099,Hydra:6.1.9,FMLib:17.12.80.40 definitions=2025-08-19_04,2025-08-14_01,2025-03-28_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 adultscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 suspectscore=0 mlxlogscore=999 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2508110000 definitions=main-2508200007 X-Proofpoint-GUID: dwMicK_aXrOx3OW1u1XRTV9-Cn-ar4zW X-Proofpoint-ORIG-GUID: dwMicK_aXrOx3OW1u1XRTV9-Cn-ar4zW X-Proofpoint-Spam-Details-Enc: AW1haW4tMjUwODE5MDE5NyBTYWx0ZWRfX7Z/OUNiXSfdU sA6bstIT3iMsz8KRti2mg1l475gtdbbJ2FZvfqVhon8SqobWqbpI70WnKKusDCOzlTP+zksE0Im 1GtwLD5XlOd8WU3IivD71koutCzR4dpDFAymxfQpdpN+XK84G+zSIbCDP3XqTw0edCbcwdsIyo7 XHPOyoznkVMf/2cNQdtNyYC6KBgmt26xdKHBvgLOtlTOufy0shalNVxZ+E7mn2l9TQ8gNwG++0K aTobhHT8iI3wYywQOQfG50/xH66YiNV0FZQc80kLfDdQAIgt30n10SH7wdhWgb5cars1PV12n9F r8/TrvjKas0wGsX5ix7cFZji371+VVBtst2UjCa30/ha0QPs0rsLPjwbvEpP9ilQq58b5BEYOBP 1iUldN1J0zsi5WsdxY3gUGWdT/1z5A== X-Authority-Analysis: v=2.4 cv=HKOa1otv c=1 sm=1 tr=0 ts=68a51f2a cx=c_pps a=XiAAW1AwiKB2Y8Wsi+sD2Q==:117 a=XiAAW1AwiKB2Y8Wsi+sD2Q==:17 a=2OwXVqhp2XgA:10 a=VwQbUJbxAAAA:8 a=JfrnYn6hAAAA:8 a=yPCof4ZbAAAA:8 a=sWrvqzRc1c8b1S7DXhEA:9 a=1CNFftbPRP8L7MoqJWF3:22 a=UhEZJTgQB8St2RibIkdl:22 a=Z5ABNNGmrOfJ6cZ5bIyy:22 a=QOGEsqRv6VhmHaoFNykA:22 Content-Type: text/plain; charset="utf-8" From: Khalid Aziz An mshare region contains zero or more actual vmas that map objects in the mshare range with shared 
page tables. Signed-off-by: Khalid Aziz Signed-off-by: Matthew Wilcox (Oracle) Signed-off-by: Anthony Yznaga --- include/linux/mm.h | 19 +++++++++++++++++++ include/trace/events/mmflags.h | 7 +++++++ 2 files changed, 26 insertions(+) diff --git a/include/linux/mm.h b/include/linux/mm.h index 76ee2bfaa8bd..aca853b4c5dc 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -431,6 +431,13 @@ extern unsigned int kobjsize(const void *objp); #define VM_SEALED VM_NONE #endif =20 +#ifdef CONFIG_MSHARE +#define VM_MSHARE_BIT 43 +#define VM_MSHARE BIT(VM_MSHARE_BIT) +#else +#define VM_MSHARE VM_NONE +#endif + /* Bits set in the VMA until the stack is in its final location */ #define VM_STACK_INCOMPLETE_SETUP (VM_RAND_READ | VM_SEQ_READ | VM_STACK_E= ARLY) =20 @@ -991,6 +998,18 @@ static inline bool vma_is_anon_shmem(struct vm_area_st= ruct *vma) { return false; =20 int vma_is_stack_for_current(struct vm_area_struct *vma); =20 +#ifdef CONFIG_MSHARE +static inline bool vma_is_mshare(const struct vm_area_struct *vma) +{ + return vma->vm_flags & VM_MSHARE; +} +#else +static inline bool vma_is_mshare(const struct vm_area_struct *vma) +{ + return false; +} +#endif + /* flush_tlb_range() takes a vma, not a mm, and can care about flags */ #define TLB_FLUSH_VMA(mm,flags) { .vm_mm =3D (mm), .vm_flags =3D (flags) } =20 diff --git a/include/trace/events/mmflags.h b/include/trace/events/mmflags.h index aa441f593e9a..a9b13a8513d0 100644 --- a/include/trace/events/mmflags.h +++ b/include/trace/events/mmflags.h @@ -200,6 +200,12 @@ IF_HAVE_PG_ARCH_3(arch_3) # define IF_HAVE_VM_DROPPABLE(flag, name) #endif =20 +#ifdef CONFIG_MSHARE +# define IF_HAVE_VM_MSHARE(flag, name) {flag, name}, +#else +# define IF_HAVE_VM_MSHARE(flag, name) +#endif + #define __def_vmaflag_names \ {VM_READ, "read" }, \ {VM_WRITE, "write" }, \ @@ -233,6 +239,7 @@ IF_HAVE_VM_SOFTDIRTY(VM_SOFTDIRTY, "softdirty" ) \ {VM_HUGEPAGE, "hugepage" }, \ {VM_NOHUGEPAGE, "nohugepage" }, \ IF_HAVE_VM_DROPPABLE(VM_DROPPABLE, 
"droppable" ) \ +IF_HAVE_VM_MSHARE(VM_MSHARE, "mshare" ) \ {VM_MERGEABLE, "mergeable" } \ =20 #define show_vma_flags(flags) \ --=20 2.47.1 From nobody Sat Oct 4 05:02:34 2025 Received: from mx0a-00069f02.pphosted.com (mx0a-00069f02.pphosted.com [205.220.165.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A80CC1EB5C2; Wed, 20 Aug 2025 01:05:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=205.220.165.32 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755651957; cv=none; b=jsfgNB/LrbHqRDQ2DXQ/prIsC+pLlThpwzS12+PtFTAUhvZrcOblbSS7V8t4RGTOdqPQMCJZnHbppMWZFpegYxkKQisio1WmMWMUTLjNg73KSLr6pUyuZmr6YHycJM/A8XFO8ycQBDLaUUr9au9A3uzsMPccdoUDP+WNJ2QiJMg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755651957; c=relaxed/simple; bh=vt/bnNThzd6lIXdsG+ux5P+TJ6dCttwxkS20QlIPpeQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=B2Bk0SGMXRDaXLjNI9vA62DPu3mlO0gzpccSE5F7ZzBAZIXdicBJ5Z3Ok13pwr20saTPjpI7RWIdeIENAEAf3y2P/kC/SinABe2Kqzkb7Dq5eSnBFQZaPTo+SqObzrU61HTXNkM59SMFGWO0buNEgEV5emiBL3NtCit26lzHCp8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com; spf=pass smtp.mailfrom=oracle.com; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b=FtfXpejm; arc=none smtp.client-ip=205.220.165.32 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=oracle.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="FtfXpejm" Received: from pps.filterd (m0246617.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 57JLBqcb011356; Wed, 20 Aug 2025 
From: Anthony Yznaga
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, andreyknvl@gmail.com, arnd@arndb.de,
 bp@alien8.de, brauner@kernel.org, bsegall@google.com, corbet@lwn.net,
 dave.hansen@linux.intel.com, david@redhat.com, dietmar.eggemann@arm.com,
 ebiederm@xmission.com, hpa@zytor.com, jakub.wartak@mailbox.org,
 jannh@google.com, juri.lelli@redhat.com, khalid@kernel.org,
 liam.howlett@oracle.com, linyongting@bytedance.com,
 lorenzo.stoakes@oracle.com, luto@kernel.org, markhemm@googlemail.com,
 maz@kernel.org, mhiramat@kernel.org, mgorman@suse.de, mhocko@suse.com,
 mingo@redhat.com, muchun.song@linux.dev, neilb@suse.de, osalvador@suse.de,
 pcc@google.com, peterz@infradead.org, pfalcato@suse.de,
 rostedt@goodmis.org, rppt@kernel.org, shakeel.butt@linux.dev,
 surenb@google.com, tglx@linutronix.de, vasily.averin@linux.dev,
 vbabka@suse.cz, vincent.guittot@linaro.org, viro@zeniv.linux.org.uk,
 vschneid@redhat.com, willy@infradead.org, x86@kernel.org,
 xhao@linux.alibaba.com, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org
Subject: [PATCH v3 07/22] mm/mshare: Add mmap support
Date: Tue, 19 Aug 2025 18:04:00 -0700
Message-ID: <20250820010415.699353-8-anthony.yznaga@oracle.com>
In-Reply-To: <20250820010415.699353-1-anthony.yznaga@oracle.com>
References: <20250820010415.699353-1-anthony.yznaga@oracle.com>

From: Khalid Aziz

Add support for mapping an mshare region into a process after the
region has been established in msharefs. Disallow operations that
could split the resulting msharefs vma such as partial unmaps and
protection changes. Fault handling, mapping, unmapping, and
protection changes for objects mapped into an mshare region will
be done using the shared vmas created for them in the host mm. This
functionality will be added in later patches.

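The early checks that msharefs_get_unmapped_area() in this patch applies
before it looks for a free range can be modelled in user space. This is
only a minimal sketch, not the kernel code. Every sketch_* name is
hypothetical, the value used for the alignment (1 << 39) is an assumed
stand-in for mshare_align (P4D_SIZE), and the initialization check, the
address-space size check and the vm_unmapped_area() search are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed stand-in for mshare_align (P4D_SIZE); the value is illustrative. */
#define SKETCH_MSHARE_ALIGN (UINT64_C(1) << 39)

/* Round x up to the next multiple of a power-of-two alignment. */
static uint64_t sketch_align_up(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* True if x is a multiple of a power-of-two alignment. */
static bool sketch_is_aligned(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x & (align - 1)) == 0;
}

/*
 * Hypothetical model of the request validation only.
 * Returns 0 and stores a candidate address in *out, or a negative errno.
 * *out == 0 means "no usable hint, a free-range search would follow".
 */
static int sketch_check_request(uint64_t addr, uint64_t len,
				uint64_t region_size, bool map_private,
				bool map_fixed, uint64_t *out)
{
	/* Private mappings of an mshare region are rejected. */
	if (map_private)
		return -EINVAL;

	/* The whole region must be mapped; partial mappings are rejected. */
	if (len != region_size)
		return -EINVAL;

	/* A fixed address is used as given, but only if suitably aligned. */
	if (map_fixed) {
		if (!sketch_is_aligned(addr, SKETCH_MSHARE_ALIGN))
			return -EINVAL;
		*out = addr;
		return 0;
	}

	/* A non-fixed hint is rounded up to the alignment. */
	*out = addr ? sketch_align_up(addr, SKETCH_MSHARE_ALIGN) : 0;
	return 0;
}
```

In this model a MAP_FIXED request at 0x1000 fails with -EINVAL, while the
same address passed as a plain hint is rounded up to the next aligned
boundary. In the patch itself the rounded hint is additionally checked
against neighbouring vmas before it is accepted.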
Signed-off-by: Khalid Aziz
Signed-off-by: Anthony Yznaga
---
 mm/mshare.c | 133 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 132 insertions(+), 1 deletion(-)

diff --git a/mm/mshare.c b/mm/mshare.c
index bf859b176e09..e0dc42602f7f 100644
--- a/mm/mshare.c
+++ b/mm/mshare.c
@@ -15,16 +15,19 @@
 
 #include
 #include
+#include
 #include
 #include
 
 const unsigned long mshare_align = P4D_SIZE;
+const unsigned long mshare_base = mshare_align;
 
 #define MSHARE_INITIALIZED 0x1
 
 struct mshare_data {
 	struct mm_struct *mm;
 	refcount_t ref;
+	unsigned long start;
 	unsigned long size;
 	unsigned long flags;
 };
@@ -34,6 +37,130 @@ static inline bool mshare_is_initialized(struct mshare_data *m_data)
 	return test_bit(MSHARE_INITIALIZED, &m_data->flags);
 }
 
+static int mshare_vm_op_split(struct vm_area_struct *vma, unsigned long addr)
+{
+	return -EINVAL;
+}
+
+static int mshare_vm_op_mprotect(struct vm_area_struct *vma, unsigned long start,
+				 unsigned long end, unsigned long newflags)
+{
+	return -EINVAL;
+}
+
+static const struct vm_operations_struct msharefs_vm_ops = {
+	.may_split = mshare_vm_op_split,
+	.mprotect = mshare_vm_op_mprotect,
+};
+
+/*
+ * msharefs_mmap() - mmap an mshare region
+ */
+static int
+msharefs_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct mshare_data *m_data = file->private_data;
+
+	vma->vm_private_data = m_data;
+	vm_flags_set(vma, VM_MSHARE | VM_DONTEXPAND);
+	vma->vm_ops = &msharefs_vm_ops;
+
+	return 0;
+}
+
+static unsigned long
+msharefs_get_unmapped_area_bottomup(struct file *file, unsigned long addr,
+		unsigned long len, unsigned long pgoff, unsigned long flags)
+{
+	struct vm_unmapped_area_info info = {};
+
+	info.length = len;
+	info.low_limit = current->mm->mmap_base;
+	info.high_limit = arch_get_mmap_end(addr, len, flags);
+	info.align_mask = PAGE_MASK & (mshare_align - 1);
+	return vm_unmapped_area(&info);
+}
+
+static unsigned long
+msharefs_get_unmapped_area_topdown(struct file *file, unsigned long addr,
+		unsigned long len, unsigned long pgoff, unsigned long flags)
+{
+	struct vm_unmapped_area_info info = {};
+
+	info.flags = VM_UNMAPPED_AREA_TOPDOWN;
+	info.length = len;
+	info.low_limit = PAGE_SIZE;
+	info.high_limit = arch_get_mmap_base(addr, current->mm->mmap_base);
+	info.align_mask = PAGE_MASK & (mshare_align - 1);
+	addr = vm_unmapped_area(&info);
+
+	/*
+	 * A failed mmap() very likely causes application failure,
+	 * so fall back to the bottom-up function here. This scenario
+	 * can happen with large stack limits and large mmap()
+	 * allocations.
+	 */
+	if (unlikely(offset_in_page(addr))) {
+		VM_BUG_ON(addr != -ENOMEM);
+		info.flags = 0;
+		info.low_limit = current->mm->mmap_base;
+		info.high_limit = arch_get_mmap_end(addr, len, flags);
+		addr = vm_unmapped_area(&info);
+	}
+
+	return addr;
+}
+
+static unsigned long
+msharefs_get_unmapped_area(struct file *file, unsigned long addr,
+		unsigned long len, unsigned long pgoff, unsigned long flags)
+{
+	struct mshare_data *m_data = file->private_data;
+	struct mm_struct *mm = current->mm;
+	struct vm_area_struct *vma, *prev;
+	unsigned long mshare_start, mshare_size;
+	const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);
+
+	mmap_assert_write_locked(mm);
+
+	if ((flags & MAP_TYPE) == MAP_PRIVATE)
+		return -EINVAL;
+
+	if (!mshare_is_initialized(m_data))
+		return -EINVAL;
+
+	mshare_start = m_data->start;
+	mshare_size = m_data->size;
+
+	if (len != mshare_size)
+		return -EINVAL;
+
+	if (len > mmap_end - mmap_min_addr)
+		return -ENOMEM;
+
+	if (flags & MAP_FIXED) {
+		if (!IS_ALIGNED(addr, mshare_align))
+			return -EINVAL;
+		return addr;
+	}
+
+	if (addr) {
+		addr = ALIGN(addr, mshare_align);
+		vma = find_vma_prev(mm, addr, &prev);
+		if (mmap_end - len >= addr && addr >= mmap_min_addr &&
+		    (!vma || addr + len <= vm_start_gap(vma)) &&
+		    (!prev || addr >= vm_end_gap(prev)))
+			return addr;
+	}
+
+	if (!mm_flags_test(MMF_TOPDOWN, mm))
+		return msharefs_get_unmapped_area_bottomup(file, addr, len,
+				pgoff, flags);
+	else
+		return msharefs_get_unmapped_area_topdown(file, addr, len,
+				pgoff, flags);
+}
+
 static int msharefs_set_size(struct mshare_data *m_data, unsigned long size)
 {
 	int error = -EINVAL;
@@ -87,6 +214,8 @@ static const struct inode_operations msharefs_file_inode_ops;
 
 static const struct file_operations msharefs_file_operations = {
 	.open = simple_open,
+	.mmap = msharefs_mmap,
+	.get_unmapped_area = msharefs_get_unmapped_area,
 	.fallocate = msharefs_fallocate,
 };
 
@@ -101,12 +230,14 @@ msharefs_fill_mm(struct inode *inode)
 	if (!mm)
 		return -ENOMEM;
 
-	mm->mmap_base = mm->task_size = 0;
+	mm->mmap_base = mshare_base;
+	mm->task_size = 0;
 
 	m_data = kzalloc(sizeof(*m_data), GFP_KERNEL);
 	if (!m_data)
 		goto err_free;
 	m_data->mm = mm;
+	m_data->start = mshare_base;
 
 	refcount_set(&m_data->ref, 1);
 	inode->i_private = m_data;
-- 
2.47.1

From nobody Sat Oct 4 05:02:34 2025
From: Anthony Yznaga
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, andreyknvl@gmail.com, arnd@arndb.de,
 bp@alien8.de, brauner@kernel.org, bsegall@google.com, corbet@lwn.net,
 dave.hansen@linux.intel.com, david@redhat.com, dietmar.eggemann@arm.com,
 ebiederm@xmission.com, hpa@zytor.com, jakub.wartak@mailbox.org,
 jannh@google.com, juri.lelli@redhat.com, khalid@kernel.org,
 liam.howlett@oracle.com, linyongting@bytedance.com,
 lorenzo.stoakes@oracle.com, luto@kernel.org, markhemm@googlemail.com,
 maz@kernel.org, mhiramat@kernel.org, mgorman@suse.de, mhocko@suse.com,
 mingo@redhat.com, muchun.song@linux.dev, neilb@suse.de, osalvador@suse.de,
 pcc@google.com, peterz@infradead.org, pfalcato@suse.de,
 rostedt@goodmis.org, rppt@kernel.org, shakeel.butt@linux.dev,
 surenb@google.com, tglx@linutronix.de, vasily.averin@linux.dev,
 vbabka@suse.cz, vincent.guittot@linaro.org, viro@zeniv.linux.org.uk,
 vschneid@redhat.com, willy@infradead.org, x86@kernel.org,
 xhao@linux.alibaba.com, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org
Subject: [PATCH v3 08/22] mm/mshare: flush all TLBs when updating PTEs in an mshare range
Date: Tue, 19 Aug 2025 18:04:01 -0700
Message-ID: <20250820010415.699353-9-anthony.yznaga@oracle.com>
In-Reply-To: <20250820010415.699353-1-anthony.yznaga@oracle.com>
References: <20250820010415.699353-1-anthony.yznaga@oracle.com>

Unlike the mm of a task, an mshare host mm is not updated on context
switch. In particular this means that mm_cpumask is never updated
which results in TLB flushes for updates to mshare PTEs only being
done on the local CPU.
To ensure entries are flushed for non-local TLBs, set up an mmu
notifier on the mshare mm and use the .arch_invalidate_secondary_tlbs
callback to flush all TLBs. arch_invalidate_secondary_tlbs guarantees
that TLB entries will be flushed before pages are freed when unmapping
pages in an mshare region.

Signed-off-by: Anthony Yznaga
---
 mm/mshare.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/mm/mshare.c b/mm/mshare.c
index e0dc42602f7f..be7cae739225 100644
--- a/mm/mshare.c
+++ b/mm/mshare.c
@@ -16,8 +16,10 @@
 #include
 #include
 #include
+#include
 #include
 #include
+#include
 
 const unsigned long mshare_align = P4D_SIZE;
 const unsigned long mshare_base = mshare_align;
@@ -30,6 +32,7 @@ struct mshare_data {
 	unsigned long start;
 	unsigned long size;
 	unsigned long flags;
+	struct mmu_notifier mn;
 };
 
 static inline bool mshare_is_initialized(struct mshare_data *m_data)
@@ -37,6 +40,16 @@ static inline bool mshare_is_initialized(struct mshare_data *m_data)
 	return test_bit(MSHARE_INITIALIZED, &m_data->flags);
 }
 
+static void mshare_invalidate_tlbs(struct mmu_notifier *mn, struct mm_struct *mm,
+				   unsigned long start, unsigned long end)
+{
+	flush_tlb_all();
+}
+
+static const struct mmu_notifier_ops mshare_mmu_ops = {
+	.arch_invalidate_secondary_tlbs = mshare_invalidate_tlbs,
+};
+
 static int mshare_vm_op_split(struct vm_area_struct *vma, unsigned long addr)
 {
 	return -EINVAL;
@@ -238,6 +251,10 @@ msharefs_fill_mm(struct inode *inode)
 		goto err_free;
 	m_data->mm = mm;
 	m_data->start = mshare_base;
+	m_data->mn.ops = &mshare_mmu_ops;
+	ret = mmu_notifier_register(&m_data->mn, mm);
+	if (ret)
+		goto err_free;
 
 	refcount_set(&m_data->ref, 1);
 	inode->i_private = m_data;
-- 
2.47.1

From nobody Sat Oct 4 05:02:34 2025
From: Anthony Yznaga
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, andreyknvl@gmail.com, arnd@arndb.de,
 bp@alien8.de, brauner@kernel.org, bsegall@google.com, corbet@lwn.net,
 dave.hansen@linux.intel.com, david@redhat.com, dietmar.eggemann@arm.com,
 ebiederm@xmission.com, hpa@zytor.com, jakub.wartak@mailbox.org,
 jannh@google.com, juri.lelli@redhat.com, khalid@kernel.org,
 liam.howlett@oracle.com, linyongting@bytedance.com,
 lorenzo.stoakes@oracle.com, luto@kernel.org, markhemm@googlemail.com,
 maz@kernel.org, mhiramat@kernel.org, mgorman@suse.de, mhocko@suse.com,
 mingo@redhat.com, muchun.song@linux.dev, neilb@suse.de, osalvador@suse.de,
 pcc@google.com, peterz@infradead.org, pfalcato@suse.de,
 rostedt@goodmis.org, rppt@kernel.org, shakeel.butt@linux.dev,
 surenb@google.com, tglx@linutronix.de, vasily.averin@linux.dev,
 vbabka@suse.cz, vincent.guittot@linaro.org, viro@zeniv.linux.org.uk,
 vschneid@redhat.com, willy@infradead.org, x86@kernel.org,
 xhao@linux.alibaba.com, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org
Subject: [PATCH v3 09/22] sched/numa: do not scan msharefs vmas
Date: Tue, 19 Aug 2025 18:04:02 -0700
Message-ID: <20250820010415.699353-10-anthony.yznaga@oracle.com>
In-Reply-To: <20250820010415.699353-1-anthony.yznaga@oracle.com>
References: <20250820010415.699353-1-anthony.yznaga@oracle.com>

Scanning an msharefs vma results in changes to the shared page
table but with TLB flushes incorrectly only going to the process
with the vma.

Signed-off-by: Anthony Yznaga
---
 kernel/sched/fair.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index e256793b9a08..6f28395991cd 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3379,7 +3379,8 @@ static void task_numa_work(struct callback_head *work)
 
 		for (; vma; vma = vma_next(&vmi)) {
 			if (!vma_migratable(vma) || !vma_policy_mof(vma) ||
-				is_vm_hugetlb_page(vma) || (vma->vm_flags & VM_MIXEDMAP)) {
+				is_vm_hugetlb_page(vma) || (vma->vm_flags & VM_MIXEDMAP) ||
+				vma_is_mshare(vma)) {
 				trace_sched_skip_vma_numa(mm, vma, NUMAB_SKIP_UNSUITABLE);
 				continue;
 			}
-- 
2.47.1

From nobody Sat Oct 4 05:02:34 2025
From: Anthony Yznaga
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, andreyknvl@gmail.com, arnd@arndb.de,
 bp@alien8.de, brauner@kernel.org, bsegall@google.com, corbet@lwn.net,
 dave.hansen@linux.intel.com, david@redhat.com, dietmar.eggemann@arm.com,
 ebiederm@xmission.com, hpa@zytor.com, jakub.wartak@mailbox.org,
 jannh@google.com, juri.lelli@redhat.com, khalid@kernel.org,
 liam.howlett@oracle.com, linyongting@bytedance.com,
 lorenzo.stoakes@oracle.com, luto@kernel.org, markhemm@googlemail.com,
 maz@kernel.org, mhiramat@kernel.org, mgorman@suse.de, mhocko@suse.com,
 mingo@redhat.com, muchun.song@linux.dev, neilb@suse.de, osalvador@suse.de,
 pcc@google.com, peterz@infradead.org, pfalcato@suse.de,
 rostedt@goodmis.org, rppt@kernel.org, shakeel.butt@linux.dev,
 surenb@google.com, tglx@linutronix.de, vasily.averin@linux.dev,
 vbabka@suse.cz, vincent.guittot@linaro.org, viro@zeniv.linux.org.uk,
 vschneid@redhat.com, willy@infradead.org, x86@kernel.org,
 xhao@linux.alibaba.com, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org
Subject: [PATCH v3 10/22] mm: add mmap_read_lock_killable_nested()
Date: Tue, 19 Aug 2025 18:04:03 -0700
Message-ID: <20250820010415.699353-11-anthony.yznaga@oracle.com>
In-Reply-To: <20250820010415.699353-1-anthony.yznaga@oracle.com>
References: <20250820010415.699353-1-anthony.yznaga@oracle.com>

This will be used to support mshare functionality where the read
lock on an mshare host mm is taken while holding the lock on a
process mm.

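Some background on why a nested variant is wanted. As far as I
understand lockdep, the mmap_lock of every mm belongs to one lock class,
so taking the host mm's lock while a process mm's lock is held looks like
recursive locking unless the second acquisition is annotated with a
different subclass. The toy model below illustrates only that rule. It is
a sketch under that assumption, not lockdep's implementation, and all
toy_* names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_HELD 8

/* One held lock, identified by its class and nesting subclass. */
struct toy_held_lock {
	int lock_class;
	int subclass;
};

/* The locks a task currently holds, in acquisition order. */
struct toy_task {
	struct toy_held_lock held[TOY_MAX_HELD];
	size_t depth;
};

/*
 * Record an acquisition. Returns false, without recording anything,
 * if a lock with the same class and the same subclass is already
 * held, which is the case the toy validator reports.
 */
static bool toy_acquire(struct toy_task *t, int lock_class, int subclass)
{
	assert(t->depth < TOY_MAX_HELD);

	for (size_t i = 0; i < t->depth; i++) {
		if (t->held[i].lock_class == lock_class &&
		    t->held[i].subclass == subclass)
			return false;
	}

	t->held[t->depth].lock_class = lock_class;
	t->held[t->depth].subclass = subclass;
	t->depth++;
	return true;
}

/* Drop the most recently acquired lock. */
static void toy_release(struct toy_task *t)
{
	assert(t->depth > 0);
	t->depth--;
}
```

With a process mm lock recorded as (class 1, subclass 0), a second
acquisition of (class 1, subclass 0) is rejected by the model, while
(class 1, subclass 1) is accepted. That mirrors the role the subclass
argument plays in the helper added by this patch.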
Signed-off-by: Anthony Yznaga --- include/linux/mmap_lock.h | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h index 2c9fffa58714..3cf7219306a1 100644 --- a/include/linux/mmap_lock.h +++ b/include/linux/mmap_lock.h @@ -369,6 +369,13 @@ static inline void mmap_read_lock(struct mm_struct *mm) __mmap_lock_trace_acquire_returned(mm, false, true); } =20 +static inline void mmap_read_lock_nested(struct mm_struct *mm, int subclas= s) +{ + __mmap_lock_trace_start_locking(mm, false); + down_read_nested(&mm->mmap_lock, subclass); + __mmap_lock_trace_acquire_returned(mm, false, true); +} + static inline int mmap_read_lock_killable(struct mm_struct *mm) { int ret; --=20 2.47.1 From nobody Sat Oct 4 05:02:34 2025 Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E6A5719539F; Wed, 20 Aug 2025 01:05:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=205.220.177.32 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755651951; cv=none; b=SMR5bxXHXDOVdqFtRzIRDH6UTFOznQUPsu+GBdWnWoSOmtoe4aPWpayaDvNq80AtRb9C0xH6DsSKFbtZou/TDeD9ddKVWUx/mJZQIBfrTKNuY9T+sUPkIN03WD8YVksIyt0ROHOViGob6FBPNITQencjwIzl+Ptz1N4CKIMsrz8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755651951; c=relaxed/simple; bh=UepIJF8vj6kp0esFf5B3rlPzWcfWDIuvCnpMtvOB4n8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=rsbmnbtYKcxRYWOjFDS1c4ed6hjqcVbiM5sGk68sbcKr+FcYfoR0YYE5Gom5iZ8mmMiFZ8Jl6SRUSozHiTz4aMQxJ075Z9meFbrr8unboLld0G1Sm43jJH6Mmn+k4MyWNe1iyBvSu6xuIcjBEwZkU1dmDwjO+FzOTEuAiRVdqTc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com; spf=pass smtp.mailfrom=oracle.com; dkim=pass 
From: Anthony Yznaga
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, andreyknvl@gmail.com, arnd@arndb.de,
 bp@alien8.de, brauner@kernel.org, bsegall@google.com, corbet@lwn.net,
 dave.hansen@linux.intel.com, david@redhat.com, dietmar.eggemann@arm.com,
 ebiederm@xmission.com, hpa@zytor.com, jakub.wartak@mailbox.org,
 jannh@google.com, juri.lelli@redhat.com, khalid@kernel.org,
 liam.howlett@oracle.com, linyongting@bytedance.com,
 lorenzo.stoakes@oracle.com, luto@kernel.org, markhemm@googlemail.com,
 maz@kernel.org, mhiramat@kernel.org, mgorman@suse.de, mhocko@suse.com,
 mingo@redhat.com, muchun.song@linux.dev, neilb@suse.de, osalvador@suse.de,
 pcc@google.com, peterz@infradead.org, pfalcato@suse.de,
 rostedt@goodmis.org, rppt@kernel.org, shakeel.butt@linux.dev,
 surenb@google.com, tglx@linutronix.de, vasily.averin@linux.dev,
 vbabka@suse.cz, vincent.guittot@linaro.org, viro@zeniv.linux.org.uk,
 vschneid@redhat.com, willy@infradead.org, x86@kernel.org,
 xhao@linux.alibaba.com, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org
Subject: [PATCH v3 11/22] mm: add and use unmap_page_range vm_ops hook
Date: Tue, 19 Aug 2025 18:04:04 -0700
Message-ID: <20250820010415.699353-12-anthony.yznaga@oracle.com>
In-Reply-To: <20250820010415.699353-1-anthony.yznaga@oracle.com>
References: <20250820010415.699353-1-anthony.yznaga@oracle.com>

Special handling is needed when unmapping a hugetlb vma and will be
needed when unmapping an msharefs vma once support is added for
handling faults in an mshare region. Add an unmap_page_range hook to
vm_operations_struct so that a vma type can supply its own unmap
routine, and move the existing hugetlb special case into a hugetlb
implementation of the hook.

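For readers less familiar with the vm_ops pattern, the dispatch this
patch introduces can be modelled in user space roughly as follows. This
is only a sketch: struct region, struct region_ops and every function
name below are hypothetical stand-ins, not kernel structures or APIs.
It shows an optional per-type hook with a fallback to a generic path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel's structures. */
struct region;

struct region_ops {
	/* Optional hook: NULL means "use the generic unmap path". */
	void (*unmap_range)(struct region *r, unsigned long start,
			    unsigned long end);
};

struct region {
	const struct region_ops *ops;
	int generic_calls;	/* times the generic path ran */
	int special_calls;	/* times a type-specific hook ran */
};

/* Generic path, playing the role of __unmap_page_range(). */
void generic_unmap_range(struct region *r, unsigned long start,
			 unsigned long end)
{
	(void)start;
	(void)end;
	r->generic_calls++;
}

/* Type-specific hook, playing the role of a hugetlb or mshare hook. */
void special_unmap_range(struct region *r, unsigned long start,
			 unsigned long end)
{
	(void)start;
	(void)end;
	r->special_calls++;
}

/* Dispatcher, playing the role of unmap_page_range(). */
void unmap_range(struct region *r, unsigned long start, unsigned long end)
{
	if (r->ops && r->ops->unmap_range)
		r->ops->unmap_range(r, start, end);
	else
		generic_unmap_range(r, start, end);
}

const struct region_ops special_ops = {
	.unmap_range = special_unmap_range,
};

/* Returns 1 if each region was routed to the expected path. */
int demo_dispatch(void)
{
	struct region plain = { .ops = NULL };
	struct region special = { .ops = &special_ops };

	unmap_range(&plain, 0x1000, 0x2000);
	unmap_range(&special, 0x1000, 0x2000);

	return plain.generic_calls == 1 && plain.special_calls == 0 &&
	       special.generic_calls == 0 && special.special_calls == 1;
}
```

The caller no longer needs to know which vma types are special; a new
type only has to fill in the hook.
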
Signed-off-by: Anthony Yznaga
---
 include/linux/mm.h | 10 ++++++++++
 ipc/shm.c          | 17 +++++++++++++++++
 mm/hugetlb.c       | 25 +++++++++++++++++++++++++
 mm/memory.c        | 36 +++++++++++++-----------------------
 4 files changed, 65 insertions(+), 23 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index aca853b4c5dc..96440082a633 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -43,6 +43,7 @@ struct anon_vma_chain;
 struct user_struct;
 struct pt_regs;
 struct folio_batch;
+struct zap_details;
 
 void arch_mm_preinit(void);
 void mm_core_init(void);
@@ -681,8 +682,17 @@ struct vm_operations_struct {
 	struct page *(*find_normal_page)(struct vm_area_struct *vma,
 					 unsigned long addr);
 #endif /* CONFIG_FIND_NORMAL_PAGE */
+	void (*unmap_page_range)(struct mmu_gather *tlb,
+				 struct vm_area_struct *vma,
+				 unsigned long addr, unsigned long end,
+				 struct zap_details *details);
 };
 
+void __unmap_page_range(struct mmu_gather *tlb,
+			struct vm_area_struct *vma,
+			unsigned long addr, unsigned long end,
+			struct zap_details *details);
+
 #ifdef CONFIG_NUMA_BALANCING
 static inline void vma_numab_state_init(struct vm_area_struct *vma)
 {
diff --git a/ipc/shm.c b/ipc/shm.c
index a9310b6dbbc3..14376b63d46a 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -588,6 +588,22 @@ static struct mempolicy *shm_get_policy(struct vm_area_struct *vma,
 }
 #endif
 
+static void shm_unmap_page_range(struct mmu_gather *tlb,
+				 struct vm_area_struct *vma,
+				 unsigned long addr, unsigned long end,
+				 struct zap_details *details)
+{
+	struct file *file = vma->vm_file;
+	struct shm_file_data *sfd = shm_file_data(file);
+
+	if (sfd->vm_ops->unmap_page_range) {
+		sfd->vm_ops->unmap_page_range(tlb, vma, addr, end, details);
+		return;
+	}
+
+	__unmap_page_range(tlb, vma, addr, end, details);
+}
+
 static int shm_mmap(struct file *file, struct vm_area_struct *vma)
 {
 	struct shm_file_data *sfd = shm_file_data(file);
@@ -688,6 +704,7 @@ static const struct vm_operations_struct shm_vm_ops = {
 	.set_policy = shm_set_policy,
 	.get_policy = shm_get_policy,
 #endif
+	.unmap_page_range = shm_unmap_page_range,
 };
 
 /**
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 514fab5a20ef..3fc6eb8a5858 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5461,6 +5461,30 @@ static vm_fault_t hugetlb_vm_op_fault(struct vm_fault *vmf)
 	return 0;
 }
 
+static void hugetlb_vm_op_unmap_page_range(struct mmu_gather *tlb,
+				struct vm_area_struct *vma,
+				unsigned long addr, unsigned long end,
+				struct zap_details *details)
+{
+	zap_flags_t zap_flags = details ? details->zap_flags : 0;
+
+	/*
+	 * It is undesirable to test vma->vm_file as it
+	 * should be non-null for valid hugetlb area.
+	 * However, vm_file will be NULL in the error
+	 * cleanup path of mmap_region. When
+	 * hugetlbfs ->mmap method fails,
+	 * mmap_region() nullifies vma->vm_file
+	 * before calling this function to clean up.
+	 * Since no pte has actually been setup, it is
+	 * safe to do nothing in this case.
+	 */
+	if (!vma->vm_file)
+		return;
+
+	__unmap_hugepage_range(tlb, vma, addr, end, NULL, zap_flags);
+}
+
 /*
  * When a new function is introduced to vm_operations_struct and added
  * to hugetlb_vm_ops, please consider adding the function to shm_vm_ops.
@@ -5474,6 +5498,7 @@ const struct vm_operations_struct hugetlb_vm_ops = {
 	.close = hugetlb_vm_op_close,
 	.may_split = hugetlb_vm_op_split,
 	.pagesize = hugetlb_vm_op_pagesize,
+	.unmap_page_range = hugetlb_vm_op_unmap_page_range,
 };
 
 static pte_t make_huge_pte(struct vm_area_struct *vma, struct folio *folio,
diff --git a/mm/memory.c b/mm/memory.c
index 002c28795d8b..dbc299aa82c2 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1999,7 +1999,7 @@ static inline unsigned long zap_p4d_range(struct mmu_gather *tlb,
 	return addr;
 }
 
-void unmap_page_range(struct mmu_gather *tlb,
+void __unmap_page_range(struct mmu_gather *tlb,
 			     struct vm_area_struct *vma,
 			     unsigned long addr, unsigned long end,
 			     struct zap_details *details)
@@ -2019,6 +2019,16 @@ void unmap_page_range(struct mmu_gather *tlb,
 	tlb_end_vma(tlb, vma);
 }
 
+void unmap_page_range(struct mmu_gather *tlb,
+			     struct vm_area_struct *vma,
+			     unsigned long addr, unsigned long end,
+			     struct zap_details *details)
+{
+	if (vma->vm_ops && vma->vm_ops->unmap_page_range)
+		vma->vm_ops->unmap_page_range(tlb, vma, addr, end, details);
+	else
+		__unmap_page_range(tlb, vma, addr, end, details);
+}
 
 static void unmap_single_vma(struct mmu_gather *tlb,
 		struct vm_area_struct *vma, unsigned long start_addr,
@@ -2037,28 +2047,8 @@ static void unmap_single_vma(struct mmu_gather *tlb,
 	if (vma->vm_file)
 		uprobe_munmap(vma, start, end);
 
-	if (start != end) {
-		if (unlikely(is_vm_hugetlb_page(vma))) {
-			/*
-			 * It is undesirable to test vma->vm_file as it
-			 * should be non-null for valid hugetlb area.
-			 * However, vm_file will be NULL in the error
-			 * cleanup path of mmap_region. When
-			 * hugetlbfs ->mmap method fails,
-			 * mmap_region() nullifies vma->vm_file
-			 * before calling this function to clean up.
-			 * Since no pte has actually been setup, it is
-			 * safe to do nothing in this case.
-			 */
-			if (vma->vm_file) {
-				zap_flags_t zap_flags = details ?
-				    details->zap_flags : 0;
-				__unmap_hugepage_range(tlb, vma, start, end,
-							     NULL, zap_flags);
-			}
-		} else
-			unmap_page_range(tlb, vma, start, end, details);
-	}
+	if (start != end)
+		unmap_page_range(tlb, vma, start, end, details);
 }
 
 /**
-- 
2.47.1

From nobody Sat Oct 4 05:02:34 2025
From: Anthony Yznaga
To: linux-mm@kvack.org
Cc:
 akpm@linux-foundation.org, andreyknvl@gmail.com, arnd@arndb.de,
 bp@alien8.de, brauner@kernel.org, bsegall@google.com, corbet@lwn.net,
 dave.hansen@linux.intel.com, david@redhat.com, dietmar.eggemann@arm.com,
 ebiederm@xmission.com, hpa@zytor.com, jakub.wartak@mailbox.org,
 jannh@google.com, juri.lelli@redhat.com, khalid@kernel.org,
 liam.howlett@oracle.com, linyongting@bytedance.com,
 lorenzo.stoakes@oracle.com, luto@kernel.org, markhemm@googlemail.com,
 maz@kernel.org, mhiramat@kernel.org, mgorman@suse.de, mhocko@suse.com,
 mingo@redhat.com, muchun.song@linux.dev, neilb@suse.de, osalvador@suse.de,
 pcc@google.com, peterz@infradead.org, pfalcato@suse.de,
 rostedt@goodmis.org, rppt@kernel.org, shakeel.butt@linux.dev,
 surenb@google.com, tglx@linutronix.de, vasily.averin@linux.dev,
 vbabka@suse.cz, vincent.guittot@linaro.org, viro@zeniv.linux.org.uk,
 vschneid@redhat.com, willy@infradead.org, x86@kernel.org,
 xhao@linux.alibaba.com, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org
Subject: [PATCH v3 12/22] mm: introduce PUD page table shared count
Date: Tue, 19 Aug 2025 18:04:05 -0700
Message-ID: <20250820010415.699353-13-anthony.yznaga@oracle.com>
In-Reply-To: <20250820010415.699353-1-anthony.yznaga@oracle.com>
References: <20250820010415.699353-1-anthony.yznaga@oracle.com>

Once an mshare shared page table has been linked with one or more
process page tables, it becomes necessary to ensure that the shared
page table is not completely freed when objects in it are unmapped, in
order to avoid a potential use-after-free (UAF) bug. To do this,
introduce and use a reference count for PUD pages.

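The idea can be illustrated with a small user-space model. This is a
sketch under stated assumptions, not the kernel code: struct table and
the table_*() helpers are hypothetical names, and the model is
single-threaded, whereas real code needs locking around the check and
the free. It shows a teardown path that declines to free a table while
its share count is non-zero.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a page table page; not the kernel's ptdesc. */
struct table {
	atomic_int share_count;	/* number of users linked to this table */
	bool freed;		/* set once the teardown path frees it */
};

void table_init(struct table *t)
{
	atomic_init(&t->share_count, 0);
	t->freed = false;
}

/* A process links the shared table into its own page table. */
void table_link(struct table *t)
{
	atomic_fetch_add(&t->share_count, 1);
}

/* A process unlinks the shared table again. */
void table_unlink(struct table *t)
{
	atomic_fetch_sub(&t->share_count, 1);
}

/*
 * Teardown path: skip the free while the table is still shared.
 * Returns true if the table was freed.
 */
bool table_try_free(struct table *t)
{
	if (atomic_load(&t->share_count) > 0)
		return false;
	t->freed = true;
	return true;
}

/* Returns 1 if the free is refused while linked and allowed afterwards. */
int demo_share_count(void)
{
	struct table t;

	table_init(&t);
	table_link(&t);
	if (table_try_free(&t))
		return 0;	/* must not free while linked */
	table_unlink(&t);
	return table_try_free(&t) && t.freed;
}
```
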
Signed-off-by: Anthony Yznaga
---
 include/linux/mm.h       |  1 +
 include/linux/mm_types.h | 36 ++++++++++++++++++++++++++++++++++--
 mm/memory.c              | 21 +++++++++++++++++++--
 3 files changed, 54 insertions(+), 4 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 96440082a633..c8dfa5c6e7d4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3217,6 +3217,7 @@ static inline spinlock_t *pud_lock(struct mm_struct *mm, pud_t *pud)
 
 static inline void pagetable_pud_ctor(struct ptdesc *ptdesc)
 {
+	ptdesc_pud_pts_init(ptdesc);
 	__pagetable_ctor(ptdesc);
 }
 
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index c8f4d2a2c60b..da5a7a31a81d 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -537,7 +537,7 @@ FOLIO_MATCH(compound_head, _head_3);
  * @pt_index:         Used for s390 gmap.
  * @pt_mm:            Used for x86 pgds.
  * @pt_frag_refcount: For fragmented page table tracking. Powerpc only.
- * @pt_share_count:   Used for HugeTLB PMD page table share count.
+ * @pt_share_count:   Used for HugeTLB PMD or Mshare PUD page table share count.
  * @_pt_pad_2:        Padding to ensure proper alignment.
  * @ptl:              Lock for the page table.
  * @__page_type:      Same as page->page_type. Unused for page tables.
@@ -564,7 +564,7 @@ struct ptdesc {
 		pgoff_t pt_index;
 		struct mm_struct *pt_mm;
 		atomic_t pt_frag_refcount;
-#ifdef CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING
+#if defined(CONFIG_HUGETLB_PMD_PAGE_TABLE_SHARING) || defined(CONFIG_MSHARE)
 		atomic_t pt_share_count;
 #endif
 	};
@@ -638,6 +638,38 @@ static inline void ptdesc_pmd_pts_init(struct ptdesc *ptdesc)
 }
 #endif
 
+#ifdef CONFIG_MSHARE
+static inline void ptdesc_pud_pts_init(struct ptdesc *ptdesc)
+{
+	atomic_set(&ptdesc->pt_share_count, 0);
+}
+
+static inline void ptdesc_pud_pts_inc(struct ptdesc *ptdesc)
+{
+	atomic_inc(&ptdesc->pt_share_count);
+}
+
+static inline void ptdesc_pud_pts_dec(struct ptdesc *ptdesc)
+{
+	atomic_dec(&ptdesc->pt_share_count);
+}
+
+static inline int ptdesc_pud_pts_count(struct ptdesc *ptdesc)
+{
+	return atomic_read(&ptdesc->pt_share_count);
+}
+#else
+static inline void ptdesc_pud_pts_init(struct ptdesc *ptdesc)
+{
+}
+
+static inline int ptdesc_pud_pts_count(struct ptdesc *ptdesc)
+{
+	return 0;
+}
+#endif
+
+
 /*
  * Used for sizing the vmemmap region on some architectures
  */
diff --git a/mm/memory.c b/mm/memory.c
index dbc299aa82c2..4e3bb49b95e2 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -228,9 +228,18 @@ static inline void free_pmd_range(struct mmu_gather *tlb, pud_t *pud,
 	mm_dec_nr_pmds(tlb->mm);
 }
 
+static inline bool pud_range_is_shared(pud_t *pud)
+{
+	if (ptdesc_pud_pts_count(virt_to_ptdesc(pud)))
+		return true;
+
+	return false;
+}
+
 static inline void free_pud_range(struct mmu_gather *tlb, p4d_t *p4d,
 				unsigned long addr, unsigned long end,
-				unsigned long floor, unsigned long ceiling)
+				unsigned long floor, unsigned long ceiling,
+				bool *pud_is_shared)
 {
 	pud_t *pud;
 	unsigned long next;
@@ -257,6 +266,10 @@ static inline void free_pud_range(struct mmu_gather *tlb, p4d_t *p4d,
 		return;
 
 	pud = pud_offset(p4d, start);
+	if (unlikely(pud_range_is_shared(pud))) {
+		*pud_is_shared = true;
+		return;
+	}
 	p4d_clear(p4d);
 	pud_free_tlb(tlb, pud, start);
 	mm_dec_nr_puds(tlb->mm);
@@ -269,6 +282,7 @@ static inline void free_p4d_range(struct mmu_gather *tlb, pgd_t *pgd,
 	p4d_t *p4d;
 	unsigned long next;
 	unsigned long start;
+	bool pud_is_shared = false;
 
 	start = addr;
 	p4d = p4d_offset(pgd, addr);
@@ -276,7 +290,8 @@ static inline void free_p4d_range(struct mmu_gather *tlb, pgd_t *pgd,
 		next = p4d_addr_end(addr, end);
 		if (p4d_none_or_clear_bad(p4d))
 			continue;
-		free_pud_range(tlb, p4d, addr, next, floor, ceiling);
+		free_pud_range(tlb, p4d, addr, next, floor, ceiling,
+			       &pud_is_shared);
 	} while (p4d++, addr = next, addr != end);
 
 	start &= PGDIR_MASK;
@@ -290,6 +305,8 @@ static inline void free_p4d_range(struct mmu_gather *tlb, pgd_t *pgd,
 	if (end - 1 > ceiling - 1)
 		return;
 
+	if (unlikely(pud_is_shared))
+		return;
 	p4d = p4d_offset(pgd, start);
 	pgd_clear(pgd);
 	p4d_free_tlb(tlb, p4d, start);
-- 
2.47.1

From nobody Sat Oct 4 05:02:34 2025
From: Anthony Yznaga
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, andreyknvl@gmail.com, arnd@arndb.de,
 bp@alien8.de, brauner@kernel.org, bsegall@google.com, corbet@lwn.net,
 dave.hansen@linux.intel.com, david@redhat.com, dietmar.eggemann@arm.com,
 ebiederm@xmission.com, hpa@zytor.com, jakub.wartak@mailbox.org,
 jannh@google.com, juri.lelli@redhat.com, khalid@kernel.org,
 liam.howlett@oracle.com, linyongting@bytedance.com,
 lorenzo.stoakes@oracle.com, luto@kernel.org, markhemm@googlemail.com,
 maz@kernel.org, mhiramat@kernel.org, mgorman@suse.de, mhocko@suse.com,
 mingo@redhat.com, muchun.song@linux.dev, neilb@suse.de, osalvador@suse.de,
 pcc@google.com, peterz@infradead.org, pfalcato@suse.de,
 rostedt@goodmis.org, rppt@kernel.org, shakeel.butt@linux.dev,
 surenb@google.com, tglx@linutronix.de, vasily.averin@linux.dev,
 vbabka@suse.cz, vincent.guittot@linaro.org, viro@zeniv.linux.org.uk,
 vschneid@redhat.com, willy@infradead.org, x86@kernel.org,
 xhao@linux.alibaba.com, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org
Subject: [PATCH v3 13/22] mm/mshare: prepare for page table sharing support
Date: Tue, 19 Aug 2025 18:04:06 -0700
Message-ID: <20250820010415.699353-14-anthony.yznaga@oracle.com>
In-Reply-To: <20250820010415.699353-1-anthony.yznaga@oracle.com>
References: <20250820010415.699353-1-anthony.yznaga@oracle.com>

From: Khalid Aziz

In preparation for enabling the handling of page faults in an mshare
region, provide a way to link an mshare shared page table to a process
page table and otherwise find the actual vma in order to handle a page
fault. Implement an unmap_page_range vm_ops function for msharefs VMAs
to unlink shared page tables when a process exits or an mshare region
is explicitly unmapped.

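As a worked example of the address translation that find_shared_vma()
in this patch performs before looking up the vma in the host mm, the
arithmetic can be sketched in user space as below. The function and
parameter names are illustrative assumptions; only the expression
itself (addr - vm_start + mmap_base) is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Translate an address inside a process's mshare mapping to the
 * matching address in the host mm. Mirrors the expression used in
 * find_shared_vma():
 *   host_addr = addr - guest->vm_start + host_mm->mmap_base
 * The names below are hypothetical, chosen for illustration only.
 */
uint64_t mshare_host_addr(uint64_t addr, uint64_t guest_vm_start,
			  uint64_t host_mmap_base)
{
	return addr - guest_vm_start + host_mmap_base;
}

/* Returns 1 if an offset into the mapping is preserved by translation. */
int demo_translate(void)
{
	uint64_t guest_vm_start = 0x7000000000ULL;
	uint64_t host_mmap_base = 0x1000000000ULL;
	uint64_t fault_addr = guest_vm_start + 0x1234;

	return mshare_host_addr(fault_addr, guest_vm_start,
				host_mmap_base) == host_mmap_base + 0x1234;
}
```

A fault at offset 0x1234 into the process's mapping is therefore
resolved against offset 0x1234 from the host mm's base.
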
Signed-off-by: Khalid Aziz
Signed-off-by: Matthew Wilcox (Oracle)
Signed-off-by: Anthony Yznaga
---
 include/linux/mm.h |   6 +++
 mm/memory.c        |   6 +++
 mm/mshare.c        | 107 +++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 119 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index c8dfa5c6e7d4..3a8dddb5925a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1009,11 +1009,17 @@ static inline bool vma_is_anon_shmem(struct vm_area_struct *vma) { return false;
 int vma_is_stack_for_current(struct vm_area_struct *vma);
 
 #ifdef CONFIG_MSHARE
+vm_fault_t find_shared_vma(struct vm_area_struct **vma, unsigned long *addrp);
 static inline bool vma_is_mshare(const struct vm_area_struct *vma)
 {
 	return vma->vm_flags & VM_MSHARE;
 }
 #else
+static inline vm_fault_t find_shared_vma(struct vm_area_struct **vma, unsigned long *addrp)
+{
+	WARN_ON_ONCE(1);
+	return VM_FAULT_SIGBUS;
+}
 static inline bool vma_is_mshare(const struct vm_area_struct *vma)
 {
 	return false;
diff --git a/mm/memory.c b/mm/memory.c
index 4e3bb49b95e2..177eb53475cb 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -6475,6 +6475,12 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 	if (ret)
 		goto out;
 
+	if (unlikely(vma_is_mshare(vma))) {
+		WARN_ON_ONCE(1);
+		ret = VM_FAULT_SIGBUS;
+		goto out;
+	}
+
 	if (!arch_vma_access_permitted(vma, flags & FAULT_FLAG_WRITE,
 					    flags & FAULT_FLAG_INSTRUCTION,
 					    flags & FAULT_FLAG_REMOTE)) {
diff --git a/mm/mshare.c b/mm/mshare.c
index be7cae739225..f7b7904f0405 100644
--- a/mm/mshare.c
+++ b/mm/mshare.c
@@ -21,6 +21,8 @@
 #include
 #include
 
+#include
+
 const unsigned long mshare_align = P4D_SIZE;
 const unsigned long mshare_base = mshare_align;
 
@@ -50,6 +52,66 @@ static const struct mmu_notifier_ops mshare_mmu_ops = {
 	.arch_invalidate_secondary_tlbs = mshare_invalidate_tlbs,
 };
 
+static p4d_t *walk_to_p4d(struct mm_struct *mm, unsigned long addr)
+{
+	pgd_t *pgd;
+	p4d_t *p4d;
+
+	pgd = pgd_offset(mm, addr);
+	p4d = p4d_alloc(mm, pgd, addr);
+	if (!p4d)
+		return NULL;
+
+	return p4d;
+}
+
+/* Returns holding the host mm's lock for read. Caller must release. */
+vm_fault_t
+find_shared_vma(struct vm_area_struct **vmap, unsigned long *addrp)
+{
+	struct vm_area_struct *vma, *guest = *vmap;
+	struct mshare_data *m_data = guest->vm_private_data;
+	struct mm_struct *host_mm = m_data->mm;
+	unsigned long host_addr;
+	p4d_t *p4d, *guest_p4d;
+
+	mmap_read_lock_nested(host_mm, SINGLE_DEPTH_NESTING);
+	host_addr = *addrp - guest->vm_start + host_mm->mmap_base;
+	p4d = walk_to_p4d(host_mm, host_addr);
+	guest_p4d = walk_to_p4d(guest->vm_mm, *addrp);
+	if (!p4d_same(*guest_p4d, *p4d)) {
+		spinlock_t *guest_ptl = &guest->vm_mm->page_table_lock;
+
+		spin_lock(guest_ptl);
+		if (!p4d_same(*guest_p4d, *p4d)) {
+			pud_t *pud = p4d_pgtable(*p4d);
+
+			ptdesc_pud_pts_inc(virt_to_ptdesc(pud));
+			set_p4d(guest_p4d, *p4d);
+			spin_unlock(guest_ptl);
+			mmap_read_unlock(host_mm);
+			return VM_FAULT_NOPAGE;
+		}
+		spin_unlock(guest_ptl);
+	}
+
+	*addrp = host_addr;
+	vma = find_vma(host_mm, host_addr);
+
+	/* XXX: expand stack? */
+	if (vma && vma->vm_start > host_addr)
+		vma = NULL;
+
+	*vmap = vma;
+
+	/*
+	 * release host mm lock unless a matching vma is found
+	 */
+	if (!vma)
+		mmap_read_unlock(host_mm);
+	return 0;
+}
+
 static int mshare_vm_op_split(struct vm_area_struct *vma, unsigned long addr)
 {
 	return -EINVAL;
@@ -61,9 +123,54 @@ static int mshare_vm_op_mprotect(struct vm_area_struct *vma, unsigned long start
 	return -EINVAL;
 }
 
+/*
+ * Unlink any shared page tables in the range and ensure TLBs are flushed.
+ * Pages in the mshare region itself are not unmapped.
+ */
+static void mshare_vm_op_unshare_page_range(struct mmu_gather *tlb,
+				struct vm_area_struct *vma,
+				unsigned long addr, unsigned long end,
+				struct zap_details *details)
+{
+	struct mm_struct *mm = vma->vm_mm;
+	spinlock_t *ptl = &mm->page_table_lock;
+	unsigned long sz = mshare_align;
+	pgd_t *pgd;
+	p4d_t *p4d;
+	pud_t *pud;
+
+	WARN_ON(!vma_is_mshare(vma));
+
+	tlb_start_vma(tlb, vma);
+
+	for (; addr < end ; addr += sz) {
+		spin_lock(ptl);
+
+		pgd = pgd_offset(mm, addr);
+		if (!pgd_present(*pgd)) {
+			spin_unlock(ptl);
+			continue;
+		}
+		p4d = p4d_offset(pgd, addr);
+		if (!p4d_present(*p4d)) {
+			spin_unlock(ptl);
+			continue;
+		}
+		pud = p4d_pgtable(*p4d);
+		ptdesc_pud_pts_dec(virt_to_ptdesc(pud));
+
+		p4d_clear(p4d);
+		spin_unlock(ptl);
+		tlb_flush_p4d_range(tlb, addr, sz);
+	}
+
+	tlb_end_vma(tlb, vma);
+}
+
 static const struct vm_operations_struct msharefs_vm_ops = {
 	.may_split = mshare_vm_op_split,
 	.mprotect = mshare_vm_op_mprotect,
+	.unmap_page_range = mshare_vm_op_unshare_page_range,
 };
 
 /*
-- 
2.47.1

From nobody Sat Oct 4 05:02:34 2025
From: Anthony Yznaga
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, andreyknvl@gmail.com, arnd@arndb.de,
 bp@alien8.de, brauner@kernel.org, bsegall@google.com, corbet@lwn.net,
 dave.hansen@linux.intel.com, david@redhat.com, dietmar.eggemann@arm.com,
 ebiederm@xmission.com, hpa@zytor.com, jakub.wartak@mailbox.org,
 jannh@google.com, juri.lelli@redhat.com, khalid@kernel.org,
 liam.howlett@oracle.com, linyongting@bytedance.com,
 lorenzo.stoakes@oracle.com, luto@kernel.org, markhemm@googlemail.com,
 maz@kernel.org, mhiramat@kernel.org, mgorman@suse.de, mhocko@suse.com,
 mingo@redhat.com, muchun.song@linux.dev, neilb@suse.de, osalvador@suse.de,
 pcc@google.com, peterz@infradead.org, pfalcato@suse.de,
 rostedt@goodmis.org, rppt@kernel.org, shakeel.butt@linux.dev,
 surenb@google.com, tglx@linutronix.de, vasily.averin@linux.dev,
 vbabka@suse.cz, vincent.guittot@linaro.org, viro@zeniv.linux.org.uk,
 vschneid@redhat.com, willy@infradead.org, x86@kernel.org,
 xhao@linux.alibaba.com, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org
Subject: [PATCH v3 14/22] x86/mm: enable page table sharing
Date: Tue, 19 Aug 2025 18:04:07 -0700
Message-ID: <20250820010415.699353-15-anthony.yznaga@oracle.com>
X-Mailer: git-send-email 2.47.3
In-Reply-To: <20250820010415.699353-1-anthony.yznaga@oracle.com>
References: <20250820010415.699353-1-anthony.yznaga@oracle.com>

Enable x86 support for handling page faults in an mshare region by
redirecting page faults to operate on the mshare mm_struct and the vmas
contained in it. Some permission checks are done using vma flags in
architecture-specific fault handling code, so the actual vma needed to
complete the handling is acquired before calling handle_mm_fault().
Because of this, an ARCH_SUPPORTS_MSHARE config option is added.

Signed-off-by: Anthony Yznaga
---
 arch/Kconfig        |  3 +++
 arch/x86/Kconfig    |  1 +
 arch/x86/mm/fault.c | 40 +++++++++++++++++++++++++++++++++++++++-
 mm/Kconfig          |  2 +-
 4 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index d1b4ffd6e085..2e10a11fc442 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -1676,6 +1676,9 @@ config HAVE_ARCH_PFN_VALID
 config ARCH_SUPPORTS_DEBUG_PAGEALLOC
 	bool
 
+config ARCH_SUPPORTS_MSHARE
+	bool
+
 config ARCH_SUPPORTS_PAGE_TABLE_CHECK
 	bool
 
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 58d890fe2100..1ad252eec417 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -124,6 +124,7 @@ config X86
 	select ARCH_SUPPORTS_ATOMIC_RMW
 	select ARCH_SUPPORTS_DEBUG_PAGEALLOC
 	select ARCH_SUPPORTS_HUGETLBFS
+	select ARCH_SUPPORTS_MSHARE	if X86_64
 	select ARCH_SUPPORTS_PAGE_TABLE_CHECK	if X86_64
 	select ARCH_SUPPORTS_NUMA_BALANCING	if X86_64
 	select ARCH_SUPPORTS_KMAP_LOCAL_FORCE_MAP	if NR_CPUS <= 4096
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 998bd807fc7b..2a7df3aa13b4 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -1215,6 +1215,8 @@ void do_user_addr_fault(struct pt_regs *regs,
 	struct mm_struct *mm;
 	vm_fault_t fault;
 	unsigned int flags = FAULT_FLAG_DEFAULT;
+	bool is_shared_vma;
+	unsigned long addr;
 
 	tsk = current;
 	mm = tsk->mm;
@@ -1328,6 +1330,12 @@ void do_user_addr_fault(struct pt_regs *regs,
 	if (!vma)
 		goto lock_mmap;
 
+	/* mshare does not support per-VMA locks yet */
+	if (vma_is_mshare(vma)) {
+		vma_end_read(vma);
+		goto lock_mmap;
+	}
+
 	if (unlikely(access_error(error_code, vma))) {
 		bad_area_access_error(regs, error_code, address, NULL, vma);
 		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
@@ -1356,17 +1364,38 @@ void do_user_addr_fault(struct pt_regs *regs,
 lock_mmap:
 
 retry:
+	addr = address;
+	is_shared_vma = false;
 	vma = lock_mm_and_find_vma(mm, address, regs);
 	if (unlikely(!vma)) {
 		bad_area_nosemaphore(regs, error_code, address);
 		return;
 	}
 
+	if (unlikely(vma_is_mshare(vma))) {
+		fault = find_shared_vma(&vma, &addr);
+
+		if (fault) {
+			mmap_read_unlock(mm);
+			goto done;
+		}
+
+		if (!vma) {
+			mmap_read_unlock(mm);
+			bad_area_nosemaphore(regs, error_code, address);
+			return;
+		}
+
+		is_shared_vma = true;
+	}
+
 	/*
 	 * Ok, we have a good vm_area for this memory access, so
 	 * we can handle it..
 	 */
 	if (unlikely(access_error(error_code, vma))) {
+		if (unlikely(is_shared_vma))
+			mmap_read_unlock(vma->vm_mm);
 		bad_area_access_error(regs, error_code, address, mm, vma);
 		return;
 	}
@@ -1384,7 +1413,14 @@ void do_user_addr_fault(struct pt_regs *regs,
 	 * userland). The return to userland is identified whenever
 	 * FAULT_FLAG_USER|FAULT_FLAG_KILLABLE are both set in flags.
 	 */
-	fault = handle_mm_fault(vma, address, flags, regs);
+	fault = handle_mm_fault(vma, addr, flags, regs);
+
+	/*
+	 * If the lock on the shared mm has been released, release the lock
+	 * on the task's mm now.
+	 */
+	if (unlikely(is_shared_vma) && (fault & (VM_FAULT_COMPLETED | VM_FAULT_RETRY)))
+		mmap_read_unlock(mm);
 
 	if (fault_signal_pending(fault, regs)) {
 		/*
@@ -1412,6 +1448,8 @@ void do_user_addr_fault(struct pt_regs *regs,
 		goto retry;
 	}
 
+	if (unlikely(is_shared_vma))
+		mmap_read_unlock(vma->vm_mm);
 	mmap_read_unlock(mm);
 done:
 	if (likely(!(fault & VM_FAULT_ERROR)))
diff --git a/mm/Kconfig b/mm/Kconfig
index 8b50e9785729..824da2a481f9 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1402,7 +1402,7 @@ config FIND_NORMAL_PAGE
 
 config MSHARE
 	bool "Mshare"
-	depends on MMU
+	depends on MMU && ARCH_SUPPORTS_MSHARE
 	help
 	  Enable msharefs: A pseudo filesystem that allows multiple processes
 	  to share kernel resources for mapping shared pages. A file created on
A file created on --=20 2.47.1 From nobody Sat Oct 4 05:02:34 2025 Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 00F851F3FE9; Wed, 20 Aug 2025 01:05:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=205.220.177.32 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755651958; cv=none; b=IeObaxgo1USu3iHNX/rW2ZOC4B+OjPyAh0NcPAgwrfd05UQUXWzn/yhSuTv/lM62b1QeCTtfDaRskmJpDWwhGh1ndjA0CNIcrXBkodAg9/D80jsn2wfzVnO4WajZ744Jpa2sqVmDkcPcebOTHaM/snP8UXS51qilRKAYvN41470= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755651958; c=relaxed/simple; bh=auDFaT8uGobdi7ItBtUrOnjvKnEkCg1oarWevwcrdH4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=XNMQ0XrE8YzazC5f6sbwNOAoc5zKzvZmSjV8zZMKZ6oqZU6s+P6GB101AtDIt0JRCNPHX5EJI24uD/JLpBTt/EduQkG6alggMTbiBfpT8aRsEQ+lR3XrkqCf8j1z3W2FYt1igSP731Kb+RTql5Bd4t0ALR+sGktXWFlMlze1UJE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com; spf=pass smtp.mailfrom=oracle.com; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b=E1jkn6uc; arc=none smtp.client-ip=205.220.177.32 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=oracle.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="E1jkn6uc" Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 57JLBqrS005563; Wed, 20 Aug 2025 01:05:05 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc 
From: Anthony Yznaga
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, andreyknvl@gmail.com, arnd@arndb.de,
 bp@alien8.de, brauner@kernel.org, bsegall@google.com, corbet@lwn.net,
 dave.hansen@linux.intel.com, david@redhat.com, dietmar.eggemann@arm.com,
 ebiederm@xmission.com, hpa@zytor.com, jakub.wartak@mailbox.org,
 jannh@google.com, juri.lelli@redhat.com, khalid@kernel.org,
 liam.howlett@oracle.com, linyongting@bytedance.com,
 lorenzo.stoakes@oracle.com, luto@kernel.org, markhemm@googlemail.com,
 maz@kernel.org, mhiramat@kernel.org, mgorman@suse.de, mhocko@suse.com,
 mingo@redhat.com, muchun.song@linux.dev, neilb@suse.de, osalvador@suse.de,
 pcc@google.com, peterz@infradead.org, pfalcato@suse.de,
 rostedt@goodmis.org, rppt@kernel.org, shakeel.butt@linux.dev,
 surenb@google.com, tglx@linutronix.de, vasily.averin@linux.dev,
 vbabka@suse.cz, vincent.guittot@linaro.org, viro@zeniv.linux.org.uk,
 vschneid@redhat.com, willy@infradead.org, x86@kernel.org,
 xhao@linux.alibaba.com, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org
Subject: [PATCH v3 15/22] mm: create __do_mmap() to take an mm_struct * arg
Date: Tue, 19 Aug 2025 18:04:08 -0700
Message-ID: <20250820010415.699353-16-anthony.yznaga@oracle.com>
X-Mailer: git-send-email 2.47.3
In-Reply-To: <20250820010415.699353-1-anthony.yznaga@oracle.com>
References: <20250820010415.699353-1-anthony.yznaga@oracle.com>

In preparation for mapping objects into an mshare region, create
__do_mmap() to allow mapping into a specified mm. There are no
functional changes otherwise.

Signed-off-by: Anthony Yznaga
---
 include/linux/mm.h | 16 ++++++++++++++++
 mm/mmap.c          | 10 +++++-----
 mm/vma.c           | 12 ++++++------
 mm/vma.h           |  2 +-
 4 files changed, 28 insertions(+), 12 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3a8dddb5925a..07e0a15a4618 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3434,10 +3434,26 @@ get_unmapped_area(struct file *file, unsigned long addr, unsigned long len,
 	return __get_unmapped_area(file, addr, len, pgoff, flags, 0);
 }
 
+#ifdef CONFIG_MMU
+unsigned long __do_mmap(struct file *file, unsigned long addr,
+	unsigned long len, unsigned long prot, unsigned long flags,
+	vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate,
+	struct list_head *uf, struct mm_struct *mm);
+static inline unsigned long do_mmap(struct file *file, unsigned long addr,
+	unsigned long len, unsigned long prot, unsigned long flags,
+	vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate,
+	struct list_head *uf)
+{
+	return __do_mmap(file, addr, len, prot, flags, vm_flags, pgoff,
+			 populate, uf, current->mm);
+}
+#else
 extern unsigned long do_mmap(struct file *file, unsigned long addr,
 	unsigned long len, unsigned long prot, unsigned long flags,
 	vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate,
 	struct list_head *uf);
+#endif
+
 extern int do_vmi_munmap(struct vma_iterator *vmi, struct mm_struct *mm,
 			 unsigned long start, size_t len, struct list_head *uf,
 			 bool unlock);
diff --git a/mm/mmap.c b/mm/mmap.c
index 7a057e0e8da9..18f266a511e2 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -275,7 +275,7 @@ static inline bool file_mmap_ok(struct file *file, struct inode *inode,
 }
 
 /**
- * do_mmap() - Perform a userland memory mapping into the current process
+ * __do_mmap() - Perform a userland memory mapping into the current process
  * address space of length @len with protection bits @prot, mmap flags @flags
  * (from which VMA flags will be inferred), and any additional VMA flags to
  * apply @vm_flags. If this is a file-backed mapping then the file is specified
@@ -327,17 +327,17 @@ static inline bool file_mmap_ok(struct file *file, struct inode *inode,
  * @uf: An optional pointer to a list head to track userfaultfd unmap events
  * should unmapping events arise. If provided, it is up to the caller to manage
  * this.
+ * @mm: The mm_struct
  *
  * Returns: Either an error, or the address at which the requested mapping has
  * been performed.
  */
-unsigned long do_mmap(struct file *file, unsigned long addr,
+unsigned long __do_mmap(struct file *file, unsigned long addr,
 			unsigned long len, unsigned long prot,
 			unsigned long flags, vm_flags_t vm_flags,
 			unsigned long pgoff, unsigned long *populate,
-			struct list_head *uf)
+			struct list_head *uf, struct mm_struct *mm)
 {
-	struct mm_struct *mm = current->mm;
 	int pkey = 0;
 
 	*populate = 0;
@@ -555,7 +555,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 			vm_flags |= VM_NORESERVE;
 	}
 
-	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf);
+	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf, mm);
 	if (!IS_ERR_VALUE(addr) &&
 	    ((vm_flags & VM_LOCKED) ||
 	     (flags & (MAP_POPULATE | MAP_NONBLOCK)) == MAP_POPULATE))
diff --git a/mm/vma.c b/mm/vma.c
index 3b12c7579831..a7fbd339d259 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -2637,9 +2637,8 @@ static bool can_set_ksm_flags_early(struct mmap_state *map)
 
 static unsigned long __mmap_region(struct file *file, unsigned long addr,
 		unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-		struct list_head *uf)
+		struct list_head *uf, struct mm_struct *mm)
 {
-	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma = NULL;
 	int error;
 	bool have_mmap_prepare = file && file->f_op->mmap_prepare;
@@ -2706,18 +2705,19 @@ static unsigned long __mmap_region(struct file *file, unsigned long addr,
  * the virtual page offset in memory of the anonymous mapping.
  * @uf: Optionally, a pointer to a list head used for tracking userfaultfd unmap
  * events.
+ * @mm: The mm struct
  *
  * Returns: Either an error, or the address at which the requested mapping has
  * been performed.
  */
 unsigned long mmap_region(struct file *file, unsigned long addr,
 			  unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-			  struct list_head *uf)
+			  struct list_head *uf, struct mm_struct *mm)
 {
 	unsigned long ret;
 	bool writable_file_mapping = false;
 
-	mmap_assert_write_locked(current->mm);
+	mmap_assert_write_locked(mm);
 
 	/* Check to see if MDWE is applicable. */
 	if (map_deny_write_exec(vm_flags, vm_flags))
@@ -2736,13 +2736,13 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 		writable_file_mapping = true;
 	}
 
-	ret = __mmap_region(file, addr, len, vm_flags, pgoff, uf);
+	ret = __mmap_region(file, addr, len, vm_flags, pgoff, uf, mm);
 
 	/* Clear our write mapping regardless of error. */
 	if (writable_file_mapping)
 		mapping_unmap_writable(file->f_mapping);
 
-	validate_mm(current->mm);
+	validate_mm(mm);
 	return ret;
 }
 
diff --git a/mm/vma.h b/mm/vma.h
index bcdc261c5b15..20fc1c2a32fd 100644
--- a/mm/vma.h
+++ b/mm/vma.h
@@ -352,7 +352,7 @@ void mm_drop_all_locks(struct mm_struct *mm);
 
 unsigned long mmap_region(struct file *file, unsigned long addr,
 		unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-		struct list_head *uf);
+		struct list_head *uf, struct mm_struct *mm);
 
 int do_brk_flags(struct vma_iterator *vmi, struct vm_area_struct *brkvma,
 		 unsigned long addr, unsigned long request, unsigned long flags);
-- 
2.47.1

From nobody Sat Oct 4 05:02:34 2025
From: Anthony Yznaga
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, andreyknvl@gmail.com, arnd@arndb.de,
 bp@alien8.de, brauner@kernel.org, bsegall@google.com, corbet@lwn.net,
 dave.hansen@linux.intel.com, david@redhat.com, dietmar.eggemann@arm.com,
 ebiederm@xmission.com, hpa@zytor.com, jakub.wartak@mailbox.org,
 jannh@google.com, juri.lelli@redhat.com, khalid@kernel.org,
 liam.howlett@oracle.com, linyongting@bytedance.com,
 lorenzo.stoakes@oracle.com, luto@kernel.org, markhemm@googlemail.com,
 maz@kernel.org, mhiramat@kernel.org, mgorman@suse.de, mhocko@suse.com,
 mingo@redhat.com, muchun.song@linux.dev, neilb@suse.de, osalvador@suse.de,
 pcc@google.com, peterz@infradead.org, pfalcato@suse.de,
 rostedt@goodmis.org, rppt@kernel.org, shakeel.butt@linux.dev,
 surenb@google.com, tglx@linutronix.de, vasily.averin@linux.dev,
 vbabka@suse.cz, vincent.guittot@linaro.org, viro@zeniv.linux.org.uk,
 vschneid@redhat.com, willy@infradead.org, x86@kernel.org,
 xhao@linux.alibaba.com, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org
Subject: [PATCH v3 16/22] mm: pass the mm in vma_munmap_struct
Date: Tue, 19 Aug 2025 18:04:09 -0700
Message-ID: <20250820010415.699353-17-anthony.yznaga@oracle.com>
X-Mailer: git-send-email 2.47.3
In-Reply-To: <20250820010415.699353-1-anthony.yznaga@oracle.com>
References: <20250820010415.699353-1-anthony.yznaga@oracle.com>

Allow unmap to work with an mshare host mm.

Signed-off-by: Anthony Yznaga
---
 mm/vma.c | 10 ++++++----
 mm/vma.h |  1 +
 2 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/mm/vma.c b/mm/vma.c
index a7fbd339d259..c09b2e1a08e6 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -1265,7 +1265,7 @@ static void vms_complete_munmap_vmas(struct vma_munmap_struct *vms,
 	struct vm_area_struct *vma;
 	struct mm_struct *mm;
 
-	mm = current->mm;
+	mm = vms->mm;
 	mm->map_count -= vms->vma_count;
 	mm->locked_vm -= vms->locked_vm;
 	if (vms->unlock)
@@ -1473,13 +1473,15 @@ static int vms_gather_munmap_vmas(struct vma_munmap_struct *vms,
  * @start: The aligned start address to munmap
  * @end: The aligned end address to munmap
  * @uf: The userfaultfd list_head
+ * @mm: The mm struct
  * @unlock: Unlock after the operation. Only unlocked on success
  */
 static void init_vma_munmap(struct vma_munmap_struct *vms,
 		struct vma_iterator *vmi, struct vm_area_struct *vma,
 		unsigned long start, unsigned long end, struct list_head *uf,
-		bool unlock)
+		struct mm_struct *mm, bool unlock)
 {
+	vms->mm = mm;
 	vms->vmi = vmi;
 	vms->vma = vma;
 	if (vma) {
@@ -1523,7 +1525,7 @@ int do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
 	struct vma_munmap_struct vms;
 	int error;
 
-	init_vma_munmap(&vms, vmi, vma, start, end, uf, unlock);
+	init_vma_munmap(&vms, vmi, vma, start, end, uf, mm, unlock);
 	error = vms_gather_munmap_vmas(&vms, &mas_detach);
 	if (error)
 		goto gather_failed;
@@ -2346,7 +2348,7 @@ static int __mmap_prepare(struct mmap_state *map, struct list_head *uf)
 
 	/* Find the first overlapping VMA and initialise unmap state. */
 	vms->vma = vma_find(vmi, map->end);
-	init_vma_munmap(vms, vmi, vms->vma, map->addr, map->end, uf,
+	init_vma_munmap(vms, vmi, vms->vma, map->addr, map->end, uf, map->mm,
 			/* unlock = */ false);
 
 	/* OK, we have overlapping VMAs - prepare to unmap them. */
diff --git a/mm/vma.h b/mm/vma.h
index 20fc1c2a32fd..4946d7dc13fd 100644
--- a/mm/vma.h
+++ b/mm/vma.h
@@ -51,6 +51,7 @@ struct vma_munmap_struct {
 	unsigned long exec_vm;
 	unsigned long stack_vm;
 	unsigned long data_vm;
+	struct mm_struct *mm;
 };
 
 enum vma_merge_state {
-- 
2.47.1

From nobody Sat Oct 4 05:02:34 2025
dis=none) header.from=oracle.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=oracle.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="htbZnQZQ" Received: from pps.filterd (m0333520.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 57JLBvtH005009; Wed, 20 Aug 2025 01:05:10 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=corp-2025-04-25; bh=M5Hpt mkD2k7saiCVXTbFxXoicAZ47I2SdR8LsHkZ+Dw=; b=htbZnQZQl7VEqAQ+KB3Hv KZlrT2qVjkF2zAFwcI83HExFEvqPyxIMIzFYBAS63iG62wNi25UQfr8FqqVLJZbV u6260g4yKvAOZ40bEnYANDjU6EW+RYmxE1YNZhYT4a8cIyw7SxKlEFE/Rwr5NuW4 z5Ub12w7wwk+S0WwnHUkwxGE2z88E+orfaKQKI3o/dcWJJD/uOl/B6vohrWGZwny vntpnof1cihDbFeNnA2uGp3OeuVxh5c+d+1+iBjxTNXxC4L3Ibmk03kkN1sc7Klo DIkMzF1tnUgLCZJDeGBV+zGWmkJyVQ1iWiZMsL1/3XKD2OyVtnnUEj1VGZbAGd+a w== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 48n0tqr8b7-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Aug 2025 01:05:10 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 57K00ocR007355; Wed, 20 Aug 2025 01:05:09 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 48my3q2a5t-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Aug 2025 01:05:09 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 57K14Ndc011685; Wed, 20 Aug 2025 
01:05:08 GMT Received: from localhost.localdomain (ca-dev60.us.oracle.com [10.129.136.27]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 48my3q29gw-18; Wed, 20 Aug 2025 01:05:08 +0000 From: Anthony Yznaga To: linux-mm@kvack.org Cc: akpm@linux-foundation.org, andreyknvl@gmail.com, arnd@arndb.de, bp@alien8.de, brauner@kernel.org, bsegall@google.com, corbet@lwn.net, dave.hansen@linux.intel.com, david@redhat.com, dietmar.eggemann@arm.com, ebiederm@xmission.com, hpa@zytor.com, jakub.wartak@mailbox.org, jannh@google.com, juri.lelli@redhat.com, khalid@kernel.org, liam.howlett@oracle.com, linyongting@bytedance.com, lorenzo.stoakes@oracle.com, luto@kernel.org, markhemm@googlemail.com, maz@kernel.org, mhiramat@kernel.org, mgorman@suse.de, mhocko@suse.com, mingo@redhat.com, muchun.song@linux.dev, neilb@suse.de, osalvador@suse.de, pcc@google.com, peterz@infradead.org, pfalcato@suse.de, rostedt@goodmis.org, rppt@kernel.org, shakeel.butt@linux.dev, surenb@google.com, tglx@linutronix.de, vasily.averin@linux.dev, vbabka@suse.cz, vincent.guittot@linaro.org, viro@zeniv.linux.org.uk, vschneid@redhat.com, willy@infradead.org, x86@kernel.org, xhao@linux.alibaba.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org Subject: [PATCH v3 17/22] sched/mshare: mshare ownership Date: Tue, 19 Aug 2025 18:04:10 -0700 Message-ID: <20250820010415.699353-18-anthony.yznaga@oracle.com> X-Mailer: git-send-email 2.47.3 In-Reply-To: <20250820010415.699353-1-anthony.yznaga@oracle.com> References: <20250820010415.699353-1-anthony.yznaga@oracle.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1099,Hydra:6.1.9,FMLib:17.12.80.40 definitions=2025-08-19_04,2025-08-14_01,2025-03-28_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 
adultscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 suspectscore=0 mlxlogscore=999 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2508110000 definitions=main-2508200007 X-Proofpoint-Spam-Details-Enc: AW1haW4tMjUwODE5MDE5NyBTYWx0ZWRfXwPsFLjftIuCD 3eipW0vP5sRTRRqnr1pXf3t1dD/HahI2T8F5xStQvPfgK8SlsmENTBRjcEd2pecwXuMKbWbwjFg 8KuGQYXoafaWoFQcOEIS9zXp42nlitXN8pWYuVu7Zi83RRPeD04OZ8tIXj7o/mPMT8zh4u5PRU1 CLFjuA3BEIMA3YO/fSKrcBHqasShWM4pm0pLSuzz1YLt/QhjEqaXQujf4hzZlQxLsw5MdunqK1w reAWzftsPzmtOiweSy70ySYx4DClnx7d2Lpy8HFEjcFU+38wxGVW4lltcqPxpefycCAUxBJ9xXm 3USWuFRc9iuUZV/DzGi2AmfQF5w1viV6sPIdq5RLR70Y5e/aGTyMZR0tzVmJAwZ8c9nhPWmBdsB C2/8IlVs0Fu1R+PMorsdASVpRszaMA== X-Proofpoint-ORIG-GUID: 7gnbcA1QiiEQgIIbSFbyw9kCgcSxWa8J X-Proofpoint-GUID: 7gnbcA1QiiEQgIIbSFbyw9kCgcSxWa8J X-Authority-Analysis: v=2.4 cv=K/p73yWI c=1 sm=1 tr=0 ts=68a51f46 cx=c_pps a=XiAAW1AwiKB2Y8Wsi+sD2Q==:117 a=XiAAW1AwiKB2Y8Wsi+sD2Q==:17 a=2OwXVqhp2XgA:10 a=yPCof4ZbAAAA:8 a=ZiWb_19Z13N3sy88laYA:9 a=UhEZJTgQB8St2RibIkdl:22 a=Z5ABNNGmrOfJ6cZ5bIyy:22 a=QOGEsqRv6VhmHaoFNykA:22 Content-Type: text/plain; charset="utf-8" Ownership of an mshare region is assigned to the process that creates it. Establishing ownership ensures that accounting the memory in an mshare region is applied to the owner and not spread among the processes sharing the memory. It also provides a means for freeing mshare memory in an OOM situation. Once an mshare owner exits, access to the memory by a non-owner process results in a SIGSEGV. For this initial implementation ownership is not shared or transferred through forking or other means. 
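For illustration only, the ownership rules above can be modelled in user space. This is a toy sketch, not code from this patch: struct region, struct owner and the three helpers are hypothetical names, the list is a plain singly linked list, and no locking is shown (the kernel code takes task_lock() around the list manipulation). It shows memory being charged to the creating owner alone and released, with ownership cleared, when the owner exits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for one mshare region (hypothetical, not struct mshare_data). */
struct region {
	struct region *next;	/* link in the owner's list */
	bool has_owner;		/* models the MSHARE_HAS_OWNER flag */
	size_t mapped_bytes;	/* memory charged to the owner */
};

/* Toy stand-in for the per-task list of owned regions. */
struct owner {
	struct region *regions;
};

/* The creating process becomes the owner of the region. */
static void owner_add_region(struct owner *o, struct region *r, size_t bytes)
{
	assert(o != NULL && r != NULL);
	r->has_owner = true;
	r->mapped_bytes = bytes;
	r->next = o->regions;
	o->regions = r;
}

/* All region memory is accounted to the owner, not to the sharers. */
static size_t owner_charged_bytes(const struct owner *o)
{
	size_t total = 0;
	const struct region *r;

	for (r = o->regions; r != NULL; r = r->next)
		total += r->mapped_bytes;
	return total;
}

/*
 * Model of owner exit: detach each region from the list, clear its
 * ownership and release its memory. Returns the number of regions
 * that were released.
 */
static int owner_exit(struct owner *o)
{
	int released = 0;

	while (o->regions != NULL) {
		struct region *r = o->regions;

		o->regions = r->next;	/* done under task_lock() in the kernel */
		r->next = NULL;
		r->has_owner = false;
		r->mapped_bytes = 0;	/* stands in for unmapping the region */
		released++;
	}
	return released;
}
```

In this model a sharer that checks has_owner after owner_exit() sees false, which corresponds to the point at which a real non-owner access would fault.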
Signed-off-by: Anthony Yznaga --- include/linux/mshare.h | 25 +++++++++++++ include/linux/sched.h | 5 +++ kernel/exit.c | 1 + kernel/fork.c | 1 + mm/mshare.c | 83 ++++++++++++++++++++++++++++++++++++++++++ 5 files changed, 115 insertions(+) create mode 100644 include/linux/mshare.h diff --git a/include/linux/mshare.h b/include/linux/mshare.h new file mode 100644 index 000000000000..b62f0e54cf84 --- /dev/null +++ b/include/linux/mshare.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_MSHARE_H_ +#define _LINUX_MSHARE_H_ + +#include + +struct task_struct; + +#ifdef CONFIG_MSHARE + +void exit_mshare(struct task_struct *task); +#define mshare_init_task(task) INIT_LIST_HEAD(&(task)->mshare_mem) + +#else + +static inline void exit_mshare(struct task_struct *task) +{ +} +static inline void mshare_init_task(struct task_struct *task) +{ +} + +#endif + +#endif /* _LINUX_MSHARE_H_ */ diff --git a/include/linux/sched.h b/include/linux/sched.h index 2b272382673d..17f2f3c0b465 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -48,6 +48,7 @@ #include #include #include +#include #include =20 /* task_struct member predeclarations (sorted alphabetically): */ @@ -1654,6 +1655,10 @@ struct task_struct { /* CPU-specific state of this task: */ struct thread_struct thread; =20 +#ifdef CONFIG_MSHARE + struct list_head mshare_mem; +#endif + /* * New fields for task_struct should be added above here, so that * they are included in the randomized portion of task_struct. 
diff --git a/kernel/exit.c b/kernel/exit.c index 343eb97543d5..24445109865d 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -951,6 +951,7 @@ void __noreturn do_exit(long code) if (group_dead) acct_process(); =20 + exit_mshare(tsk); exit_sem(tsk); exit_shm(tsk); exit_files(tsk); diff --git a/kernel/fork.c b/kernel/fork.c index 5115be549234..eba6bd709c6e 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -2143,6 +2143,7 @@ __latent_entropy struct task_struct *copy_process( #endif =20 unwind_task_init(p); + mshare_init_task(p); =20 /* Perform scheduler related setup. Assign this task to a CPU. */ retval =3D sched_fork(clone_flags, p); diff --git a/mm/mshare.c b/mm/mshare.c index f7b7904f0405..8a23b391fa11 100644 --- a/mm/mshare.c +++ b/mm/mshare.c @@ -17,6 +17,7 @@ #include #include #include +#include #include #include #include @@ -27,6 +28,7 @@ const unsigned long mshare_align =3D P4D_SIZE; const unsigned long mshare_base =3D mshare_align; =20 #define MSHARE_INITIALIZED 0x1 +#define MSHARE_HAS_OWNER 0x2 =20 struct mshare_data { struct mm_struct *mm; @@ -35,6 +37,7 @@ struct mshare_data { unsigned long size; unsigned long flags; struct mmu_notifier mn; + struct list_head list; }; =20 static inline bool mshare_is_initialized(struct mshare_data *m_data) @@ -42,6 +45,65 @@ static inline bool mshare_is_initialized(struct mshare_d= ata *m_data) return test_bit(MSHARE_INITIALIZED, &m_data->flags); } =20 +static inline bool mshare_has_owner(struct mshare_data *m_data) +{ + return test_bit(MSHARE_HAS_OWNER, &m_data->flags); +} + +static bool mshare_data_getref(struct mshare_data *m_data); +static void mshare_data_putref(struct mshare_data *m_data); + +void exit_mshare(struct task_struct *task) +{ + for (;;) { + struct mshare_data *m_data; + int error; + + task_lock(task); + + if (list_empty(&task->mshare_mem)) { + task_unlock(task); + break; + } + + m_data =3D list_first_entry(&task->mshare_mem, struct mshare_data, + list); + + WARN_ON_ONCE(!mshare_data_getref(m_data)); + + 
list_del_init(&m_data->list); + task_unlock(task); + + /* + * The owner of an mshare region is going away. Unmap + * everything in the region and prevent more mappings from + * being created. + * + * XXX + * The fact that the unmap can possibly fail is problematic. + * One alternative is doing a subset of what exit_mmap() does. + * If it's preferable to preserve the mappings then another + * approach is to fail any further faults on the mshare region + * and unlink the shared page tables from the page tables of + * each sharing process by walking the rmap via the msharefs + * inode. + * Unmapping everything means mshare memory is freed up when + * the owner exits which may be preferable for OOM situations. + */ + + clear_bit(MSHARE_HAS_OWNER, &m_data->flags); + + mmap_write_lock(m_data->mm); + error =3D do_munmap(m_data->mm, m_data->start, m_data->size, NULL); + mmap_write_unlock(m_data->mm); + + if (error) + pr_warn("%s: do_munmap returned %d\n", __func__, error); + + mshare_data_putref(m_data); + } +} + static void mshare_invalidate_tlbs(struct mmu_notifier *mn, struct mm_stru= ct *mm, unsigned long start, unsigned long end) { @@ -362,6 +424,11 @@ msharefs_fill_mm(struct inode *inode) ret =3D mmu_notifier_register(&m_data->mn, mm); if (ret) goto err_free; + INIT_LIST_HEAD(&m_data->list); + task_lock(current); + list_add(&m_data->list, &current->mshare_mem); + task_unlock(current); + set_bit(MSHARE_HAS_OWNER, &m_data->flags); =20 refcount_set(&m_data->ref, 1); inode->i_private =3D m_data; @@ -380,6 +447,11 @@ msharefs_delmm(struct mshare_data *m_data) kfree(m_data); } =20 +static bool mshare_data_getref(struct mshare_data *m_data) +{ + return refcount_inc_not_zero(&m_data->ref); +} + static void mshare_data_putref(struct mshare_data *m_data) { if (!refcount_dec_and_test(&m_data->ref)) @@ -543,6 +615,17 @@ msharefs_evict_inode(struct inode *inode) if (!m_data) goto out; =20 + rcu_read_lock(); + + if (!list_empty(&m_data->list)) { + struct task_struct *owner =3D
m_data->mm->owner; + + task_lock(owner); + list_del_init(&m_data->list); + task_unlock(owner); + } + rcu_read_unlock(); + mshare_data_putref(m_data); out: clear_inode(inode); --=20 2.47.1 From nobody Sat Oct 4 05:02:34 2025 Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1C4E52222B7; Wed, 20 Aug 2025 01:06:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=205.220.177.32 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755651966; cv=none; b=BgIukgdSy65s0WLiMEB4Y//ipp0B81lwocpQRNUp/+19GRjqUCPCCc//iR6lEUVJWeZuWt1ux2Dql6cRuzfYXx+Pd633+EqCfEKrW508db5tQRbDnP27anK0uvHXWJJRD0TNB6GFDvgiQQrOWcoPRL+YGFYaDvPxmUJV1GJI3BY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755651966; c=relaxed/simple; bh=/RhrwRbkVHnFBF07M6aULHNkv0YMhhGpWK6+QpX7VLs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=tFP6CLsTFCifl9Z66JWTlJFpfZU75p8+XDJM7HbFOylCSCnk3MGkxplKC1pIBJXMG1+87s8V1cwmByMLUTRXUCwFeyxSF29EzyHtXastK3Sz2ZdC93F5OwraxwuzjvhXvkxNwpcBUCjk/Knq1plHZhs8pfW2kFgK37ujzG0VJ+s= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com; spf=pass smtp.mailfrom=oracle.com; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b=kqZ1Qy60; arc=none smtp.client-ip=205.220.177.32 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=oracle.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="kqZ1Qy60" Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with 
ESMTP id 57JLC7Jg006874; Wed, 20 Aug 2025 01:05:13 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=corp-2025-04-25; bh=GNBho c+YcWSTKbxc9VAOMiT6S5zpt8f2lVNd6yhPsGs=; b=kqZ1Qy60fgqJijj34CYnH mnn8DCBFkvJ8aDaasVhFhShGR4dHPh1Mvx0KesTuKOA2zROjrhn71WJedww8WxtX kbAMmdurLhcCwKbBYCoNL9FCaQHEtlDSWT8kKmpKo1GgdJTnfRJr1qOQbo34pLpn /n6XScPHYwTjF+qgLWoJRJuvZcga1Fl921ChfcT84r6WYs95cyRB8QO3vPgYlL1k s7yTn7DvGY8GvFVh1hvj68p23sOdR3JBMhzlV9GYw066Qcp3T3ko0d6mBIZELu6h 9e3rG1m1+hxRxPe9W0jMNKE0g/R539B3BzxjhFEorfGEWrBGvz9Qvq1z7ZASPo0E w== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 48n0tsr854-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Aug 2025 01:05:13 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 57JNwUoL007361; Wed, 20 Aug 2025 01:05:12 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 48my3q2a6x-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Aug 2025 01:05:12 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 57K14Nde011685; Wed, 20 Aug 2025 01:05:11 GMT Received: from localhost.localdomain (ca-dev60.us.oracle.com [10.129.136.27]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 48my3q29gw-19; Wed, 20 Aug 2025 01:05:10 +0000 From: Anthony Yznaga To: linux-mm@kvack.org Cc: akpm@linux-foundation.org, andreyknvl@gmail.com, arnd@arndb.de, bp@alien8.de, brauner@kernel.org, 
bsegall@google.com, corbet@lwn.net, dave.hansen@linux.intel.com, david@redhat.com, dietmar.eggemann@arm.com, ebiederm@xmission.com, hpa@zytor.com, jakub.wartak@mailbox.org, jannh@google.com, juri.lelli@redhat.com, khalid@kernel.org, liam.howlett@oracle.com, linyongting@bytedance.com, lorenzo.stoakes@oracle.com, luto@kernel.org, markhemm@googlemail.com, maz@kernel.org, mhiramat@kernel.org, mgorman@suse.de, mhocko@suse.com, mingo@redhat.com, muchun.song@linux.dev, neilb@suse.de, osalvador@suse.de, pcc@google.com, peterz@infradead.org, pfalcato@suse.de, rostedt@goodmis.org, rppt@kernel.org, shakeel.butt@linux.dev, surenb@google.com, tglx@linutronix.de, vasily.averin@linux.dev, vbabka@suse.cz, vincent.guittot@linaro.org, viro@zeniv.linux.org.uk, vschneid@redhat.com, willy@infradead.org, x86@kernel.org, xhao@linux.alibaba.com, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org Subject: [PATCH v3 18/22] mm/mshare: Add an ioctl for mapping objects in an mshare region Date: Tue, 19 Aug 2025 18:04:11 -0700 Message-ID: <20250820010415.699353-19-anthony.yznaga@oracle.com> X-Mailer: git-send-email 2.47.3 In-Reply-To: <20250820010415.699353-1-anthony.yznaga@oracle.com> References: <20250820010415.699353-1-anthony.yznaga@oracle.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1099,Hydra:6.1.9,FMLib:17.12.80.40 definitions=2025-08-19_04,2025-08-14_01,2025-03-28_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 adultscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 suspectscore=0 mlxlogscore=999 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2508110000 definitions=main-2508200007 X-Proofpoint-GUID: 6KYLXE3TpaFgkOXiCEvekb2ffu_D24DL X-Authority-Analysis: v=2.4 cv=S6eAAIsP c=1 sm=1 tr=0 ts=68a51f49 
cx=c_pps a=XiAAW1AwiKB2Y8Wsi+sD2Q==:117 a=XiAAW1AwiKB2Y8Wsi+sD2Q==:17 a=2OwXVqhp2XgA:10 a=EF7ItLl7AAAA:8 a=VwQbUJbxAAAA:8 a=yPCof4ZbAAAA:8 a=JfrnYn6hAAAA:8 a=gkfpIXjfDrXi2JZSh4AA:9 a=lqcHg5cX4UMA:10 a=KzV_IjdM9kfMg8rc9Rf7:22 a=1CNFftbPRP8L7MoqJWF3:22 a=UhEZJTgQB8St2RibIkdl:22 a=Z5ABNNGmrOfJ6cZ5bIyy:22 a=QOGEsqRv6VhmHaoFNykA:22 X-Proofpoint-ORIG-GUID: 6KYLXE3TpaFgkOXiCEvekb2ffu_D24DL X-Proofpoint-Spam-Details-Enc: AW1haW4tMjUwODE5MDE5NyBTYWx0ZWRfX+xKyLrwWzz+S DkBTmG7J0OyaAKxYPFlj+zmqZqkYUQsJBWCr0l7NpA6HStjKdLnelBOuauFwZHNl1KMK+FU2aag IM7UTYQSxuaqF3XoTGeLQWlhJGBQgQz+ig1fvMNqUKl/J7JUpydagUZP53hbjErCPC/UKuUiF6r 6xnSMQXe6fknnuIH8H6FGDxT5cQbuEc0iyQDzDpJyyezoy9e9QbZ6QBz5YsqjKmoppIYUgk1cmS SjobAvchPfXIfHnRH8+RlCJNOo4S1vP2WwLhTFufqvycrf78FhGwXTA0jemhAtJq79rNimtwz/I ntyHr4mrdVf0gPiNjA+gTolI5/s5/ZJ4fHYQ0WINLUy30aueKHUIPqg8fzIXTekxzHzgjRzTfwW CmUog9/8KoeuRkB/tzd7fpYBWjP66A== Content-Type: text/plain; charset="utf-8" From: Khalid Aziz Reserve a range of ioctls for msharefs and add an ioctl for mapping objects within an mshare region. The arguments are the same as mmap() except that the start of the mapping is specified as an offset into the mshare region instead of as an address. System-selected addresses are not supported so MAP_FIXED must be specified. Only shared anonymous memory is supported initially. 
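As a rough user-space illustration of the interface described above (not part of this patch), the sketch below fills in a request for shared anonymous memory and issues the ioctl. Assumptions: struct mshare_create and MSHAREFS_CREATE_MAPPING are copied locally from the uapi header added by this patch so the sketch builds without it; the helper names mshare_fill_anon() and mshare_map_anon() are hypothetical; the caller is assumed to hold a file descriptor for an msharefs file whose region has already been initialized.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for MAP_ANONYMOUS on strict toolchains */
#endif
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

/* Local copy of the layout in this patch's include/uapi/linux/msharefs.h. */
struct mshare_create {
	uint64_t region_offset;	/* offset into the mshare region, not an address */
	uint64_t size;
	uint64_t offset;
	uint32_t prot;
	uint32_t flags;
	uint32_t fd;
};

#define MSHAREFS_CREATE_MAPPING _IOW('x', 0, struct mshare_create)

/* Hypothetical helper: describe a shared anonymous mapping in the region. */
static void mshare_fill_anon(struct mshare_create *mc, uint64_t region_offset,
			     uint64_t size)
{
	assert(mc != NULL);
	memset(mc, 0, sizeof(*mc));
	mc->region_offset = region_offset;
	mc->size = size;
	mc->prot = PROT_READ | PROT_WRITE;
	/* MAP_FIXED is mandatory: system-selected addresses are not supported. */
	mc->flags = MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED;
	mc->fd = (uint32_t)-1;	/* only anonymous memory is supported for now */
}

/*
 * Hypothetical helper: issue the ioctl on an open msharefs file.
 * Returns 0 on success, or -1 with errno set, like ioctl() itself.
 */
static int mshare_map_anon(int mshare_fd, uint64_t region_offset, uint64_t size)
{
	struct mshare_create mc;

	mshare_fill_anon(&mc, region_offset, size);
	return ioctl(mshare_fd, MSHAREFS_CREATE_MAPPING, &mc);
}
```

A caller would open an msharefs file and call mshare_map_anon(fd, 0, len). On a kernel without this series the ioctl is expected to fail, so the return value should always be checked.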
Signed-off-by: Khalid Aziz Signed-off-by: Anthony Yznaga --- .../userspace-api/ioctl/ioctl-number.rst | 1 + include/uapi/linux/msharefs.h | 31 ++++++++ mm/mshare.c | 76 ++++++++++++++++++- 3 files changed, 107 insertions(+), 1 deletion(-) create mode 100644 include/uapi/linux/msharefs.h diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documenta= tion/userspace-api/ioctl/ioctl-number.rst index 406a9f4d0869..cb7377f40696 100644 --- a/Documentation/userspace-api/ioctl/ioctl-number.rst +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst @@ -308,6 +308,7 @@ Code Seq# Include File = Comments 'v' 20-27 arch/powerpc/include/uapi/asm/vas-api.h VAS= API 'v' C0-FF linux/meye.h con= flict! 'w' all CER= N SCI driver +'x' 00-1F linux/msharefs.h msh= arefs filesystem 'y' 00-1F pac= ket based user level communications 'z' 00-3F CAN= bus card conflict! diff --git a/include/uapi/linux/msharefs.h b/include/uapi/linux/msharefs.h new file mode 100644 index 000000000000..ad129beeef62 --- /dev/null +++ b/include/uapi/linux/msharefs.h @@ -0,0 +1,31 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +/* + * msharefs defines a memory region that is shared across processes. + * ioctl is used on files created under msharefs to set various + * attributes on these shared memory regions + * + * + * Copyright (C) 2024 Oracle Corp. All rights reserved. + * Author: Khalid Aziz + */ + +#ifndef _UAPI_LINUX_MSHAREFS_H +#define _UAPI_LINUX_MSHAREFS_H + +#include +#include + +/* + * msharefs specific ioctl commands + */ +#define MSHAREFS_CREATE_MAPPING _IOW('x', 0, struct mshare_create) + +struct mshare_create { + __u64 region_offset; + __u64 size; + __u64 offset; + __u32 prot; + __u32 flags; + __u32 fd; +}; +#endif diff --git a/mm/mshare.c b/mm/mshare.c index 8a23b391fa11..ebec51e655e4 100644 --- a/mm/mshare.c +++ b/mm/mshare.c @@ -10,6 +10,7 @@ * * Copyright (C) 2024 Oracle Corp. All rights reserved. 
* Author: Khalid Aziz + * Author: Matthew Wilcox + * */ =20 @@ -19,6 +20,7 @@ #include #include #include +#include #include #include =20 @@ -308,7 +310,7 @@ msharefs_get_unmapped_area(struct file *file, unsigned = long addr, if ((flags & MAP_TYPE) =3D=3D MAP_PRIVATE) return -EINVAL; =20 - if (!mshare_is_initialized(m_data)) + if (!mshare_is_initialized(m_data) || !mshare_has_owner(m_data)) return -EINVAL; =20 mshare_start =3D m_data->start; @@ -343,6 +345,77 @@ msharefs_get_unmapped_area(struct file *file, unsigned= long addr, pgoff, flags); } =20 +static long +msharefs_create_mapping(struct mshare_data *m_data, struct mshare_create *= mcreate) +{ + struct mm_struct *host_mm =3D m_data->mm; + unsigned long mshare_start, mshare_end; + unsigned long region_offset =3D mcreate->region_offset; + unsigned long size =3D mcreate->size; + unsigned int fd =3D mcreate->fd; + int flags =3D mcreate->flags; + int prot =3D mcreate->prot; + unsigned long populate =3D 0; + unsigned long mapped_addr; + unsigned long addr; + vm_flags_t vm_flags =3D 0; + int error =3D -EINVAL; + + mshare_start =3D m_data->start; + mshare_end =3D mshare_start + m_data->size; + addr =3D mshare_start + region_offset; + + if ((addr < mshare_start) || (addr >=3D mshare_end) || + (addr + size > mshare_end)) + goto out; + + /* + * Only anonymous shared memory at fixed addresses is allowed for now.
+ */ + if ((flags & (MAP_SHARED | MAP_FIXED)) !=3D (MAP_SHARED | MAP_FIXED)) + goto out; + if (fd !=3D -1) + goto out; + + if (mmap_write_lock_killable(host_mm)) { + error =3D -EINTR; + goto out; + } + + error =3D 0; + mapped_addr =3D __do_mmap(NULL, addr, size, prot, flags, vm_flags, + 0, &populate, NULL, host_mm); + + if (IS_ERR_VALUE(mapped_addr)) + error =3D (long)mapped_addr; + + mmap_write_unlock(host_mm); +out: + return error; +} + +static long +msharefs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) +{ + struct mshare_data *m_data =3D filp->private_data; + struct mshare_create mcreate; + + if (!mshare_is_initialized(m_data)) + return -EINVAL; + + switch (cmd) { + case MSHAREFS_CREATE_MAPPING: + if (copy_from_user(&mcreate, (struct mshare_create __user *)arg, + sizeof(mcreate))) + return -EFAULT; + + return msharefs_create_mapping(m_data, &mcreate); + + default: + return -ENOTTY; + } +} + static int msharefs_set_size(struct mshare_data *m_data, unsigned long siz= e) { int error =3D -EINVAL; @@ -398,6 +471,7 @@ static const struct file_operations msharefs_file_opera= tions =3D { .open =3D simple_open, .mmap =3D msharefs_mmap, .get_unmapped_area =3D msharefs_get_unmapped_area, + .unlocked_ioctl =3D msharefs_ioctl, .fallocate =3D msharefs_fallocate, }; =20 --=20 2.47.1 From nobody Sat Oct 4 05:02:34 2025 Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DBBB5227E80; Wed, 20 Aug 2025 01:06:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=205.220.177.32 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755651972; cv=none; b=RWOxhAogH6eCJk2Tcs+RjGqTey15TSovy6msPYTDrS1oMRFNK6jA+4jtec0vmLYM85iZMRwQJZADHi/sGXiVr6bc452vIP8rYcbLhbG7HWL+t6Dz7njcpXl4UsFaE5yftK98pX9h/fcIddV4QudosTDsyvCPIJTvrvk6+Q9xwlU= 
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1755651972; c=relaxed/simple; bh=cZvyvJf9F6u/jKKs7PEWyR5kNPKqZD+jYnQg0JJWqt4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=pnggqZNZFMFCfgeaElz8pEy0T8G+7s7wrfoV2hOfJmYPHkEAEKaK5WTldRn/LqDinxaBKTbOY32yRrKDDuIMNy9pLlLcemeSYIMpnVXwcWpZvLUJB4zpJhgsczHAr+RrHfIDZtVJGD3tFb6KLOt+v/fV55xjQw27gVCYiqaSQbE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com; spf=pass smtp.mailfrom=oracle.com; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b=eqBI7DFR; arc=none smtp.client-ip=205.220.177.32 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=oracle.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=oracle.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=oracle.com header.i=@oracle.com header.b="eqBI7DFR" Received: from pps.filterd (m0246630.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.18.1.2/8.18.1.2) with ESMTP id 57JLBm8q005364; Wed, 20 Aug 2025 01:05:17 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=cc :content-transfer-encoding:date:from:in-reply-to:message-id :mime-version:references:subject:to; s=corp-2025-04-25; bh=wjoGm Xa06NlpaxdsM8t7M7zxrx9vuEjCZ5dRGwsl5fo=; b=eqBI7DFRKH03KpiWOKlQf pzdHTAwO0jt+gAj0aiyhXZ7ydUPU9y8RrH3DJrC63erHzODgdCM2RrnPmJM97X+N LPOQUfn5t06eVLinkRf2Vg5N1oTgZOjDnWktianvw1G2rBucX63YcwwJ68McS438 TZGY+iCIOpv1cPS4v9pslIQ5fxwtXKFEfa8tJXmg+gOvy4jyx3MIiQ9M08KtkZyH /3Av8GeliSmEpbU0g93GGsM1D4awpgJkRda72kbvOsZ0J1bapVSn0bjfhK8VXGtJ 3wUfhG1HhKI4KCH3awJAhph3eF4nPjmigTG2UBKVSXYvS+XLEGwwLUXr9ZJfuVNq w== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 48n0tsr855-1 (version=TLSv1.2 
cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Aug 2025 01:05:17 +0000 (GMT) Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.18.1.2/8.18.1.2) with ESMTP id 57JNU7G0007374; Wed, 20 Aug 2025 01:05:16 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 48my3q2a8q-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 20 Aug 2025 01:05:15 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 57K14Ndg011685; Wed, 20 Aug 2025 01:05:14 GMT Received: from localhost.localdomain (ca-dev60.us.oracle.com [10.129.136.27]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 48my3q29gw-20; Wed, 20 Aug 2025 01:05:14 +0000 From: Anthony Yznaga To: linux-mm@kvack.org Cc: akpm@linux-foundation.org, andreyknvl@gmail.com, arnd@arndb.de, bp@alien8.de, brauner@kernel.org, bsegall@google.com, corbet@lwn.net, dave.hansen@linux.intel.com, david@redhat.com, dietmar.eggemann@arm.com, ebiederm@xmission.com, hpa@zytor.com, jakub.wartak@mailbox.org, jannh@google.com, juri.lelli@redhat.com, khalid@kernel.org, liam.howlett@oracle.com, linyongting@bytedance.com, lorenzo.stoakes@oracle.com, luto@kernel.org, markhemm@googlemail.com, maz@kernel.org, mhiramat@kernel.org, mgorman@suse.de, mhocko@suse.com, mingo@redhat.com, muchun.song@linux.dev, neilb@suse.de, osalvador@suse.de, pcc@google.com, peterz@infradead.org, pfalcato@suse.de, rostedt@goodmis.org, rppt@kernel.org, shakeel.butt@linux.dev, surenb@google.com, tglx@linutronix.de, vasily.averin@linux.dev, vbabka@suse.cz, vincent.guittot@linaro.org, viro@zeniv.linux.org.uk, vschneid@redhat.com, willy@infradead.org, x86@kernel.org, xhao@linux.alibaba.com, 
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arch@vger.kernel.org Subject: [PATCH v3 19/22] mm/mshare: Add an ioctl for unmapping objects in an mshare region Date: Tue, 19 Aug 2025 18:04:12 -0700 Message-ID: <20250820010415.699353-20-anthony.yznaga@oracle.com> X-Mailer: git-send-email 2.47.3 In-Reply-To: <20250820010415.699353-1-anthony.yznaga@oracle.com> References: <20250820010415.699353-1-anthony.yznaga@oracle.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.293,Aquarius:18.0.1099,Hydra:6.1.9,FMLib:17.12.80.40 definitions=2025-08-19_04,2025-08-14_01,2025-03-28_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 adultscore=0 malwarescore=0 phishscore=0 bulkscore=0 spamscore=0 suspectscore=0 mlxlogscore=999 mlxscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2508110000 definitions=main-2508200007 X-Proofpoint-GUID: FZMDyNrGz-kBXzU0LBJu2U9tFf2PbkWF X-Authority-Analysis: v=2.4 cv=S6eAAIsP c=1 sm=1 tr=0 ts=68a51f4d cx=c_pps a=XiAAW1AwiKB2Y8Wsi+sD2Q==:117 a=XiAAW1AwiKB2Y8Wsi+sD2Q==:17 a=2OwXVqhp2XgA:10 a=yPCof4ZbAAAA:8 a=ZK8ikAtwNUxspKypWBsA:9 a=UhEZJTgQB8St2RibIkdl:22 a=Z5ABNNGmrOfJ6cZ5bIyy:22 a=QOGEsqRv6VhmHaoFNykA:22 X-Proofpoint-ORIG-GUID: FZMDyNrGz-kBXzU0LBJu2U9tFf2PbkWF X-Proofpoint-Spam-Details-Enc: AW1haW4tMjUwODE5MDE5NyBTYWx0ZWRfX35ivsWoahLZu OeP1bjg8Z0tMRWMw9IQpQf+HzRLKANG5OcR+e6vIIvyehO7/EfOP7EKF7mYHaeLs1zOUbk42VPv pcnrCHHnsnC7whM/YZccUTxGwclUxSBVR4ahC6i9lAu+hmAPB9pVnGprW3dm5bPqy3e8J0QVT3n sHJPY5xDwK75i7WjxBSv5E9SuKtcS8BAdUpE2RjrB1lH0Le3QtTkfEvPr0+nOMf0jY6WTX49NRu pLXVE/vdWj/bOYm+scKriNbB7FBk8vb0snFbp0cIhm+aeM5T/ROmxBiulqhlyfEI7PodM0crGjk 0hKMNeQz27agz45/NeAebGDpPhrOhlxUw1KvxNK3TZI2Ii9K1010hFWVTpEGviy3hrj4Gxx9UjO +KlU8JgsG8w5D/pIsDiYYP/qBs/n3g== Content-Type: text/plain; charset="utf-8" The arguments are the same as munmap() 
except that the start of the mapping is specified as an offset
into the mshare region instead of as an address.

Signed-off-by: Anthony Yznaga
---
 include/uapi/linux/msharefs.h |  7 +++++++
 mm/mshare.c                   | 37 +++++++++++++++++++++++++++++++++++
 2 files changed, 44 insertions(+)

diff --git a/include/uapi/linux/msharefs.h b/include/uapi/linux/msharefs.h
index ad129beeef62..fb0235d1e384 100644
--- a/include/uapi/linux/msharefs.h
+++ b/include/uapi/linux/msharefs.h
@@ -19,6 +19,7 @@
  * msharefs specific ioctl commands
  */
 #define MSHAREFS_CREATE_MAPPING	_IOW('x', 0, struct mshare_create)
+#define MSHAREFS_UNMAP		_IOW('x', 1, struct mshare_unmap)
 
 struct mshare_create {
 	__u64 region_offset;
@@ -28,4 +29,10 @@ struct mshare_create {
 	__u32 flags;
 	__u32 fd;
 };
+
+struct mshare_unmap {
+	__u64 region_offset;
+	__u64 size;
+};
+
 #endif
diff --git a/mm/mshare.c b/mm/mshare.c
index ebec51e655e4..b1e02f5e1f60 100644
--- a/mm/mshare.c
+++ b/mm/mshare.c
@@ -394,11 +394,41 @@ msharefs_create_mapping(struct mshare_data *m_data, struct mshare_create *mcreat
 	return error;
 }
 
+static long
+msharefs_unmap(struct mshare_data *m_data, struct mshare_unmap *munmap)
+{
+	struct mm_struct *host_mm = m_data->mm;
+	unsigned long mshare_start, mshare_end, mshare_size;
+	unsigned long region_offset = munmap->region_offset;
+	unsigned long size = munmap->size;
+	unsigned long addr;
+	int error;
+
+	mshare_start = m_data->start;
+	mshare_size = m_data->size;
+	mshare_end = mshare_start + mshare_size;
+	addr = mshare_start + region_offset;
+
+	if ((size > mshare_size) || (region_offset >= mshare_size) ||
+	    (addr + size > mshare_end))
+		return -EINVAL;
+
+	if (mmap_write_lock_killable(host_mm))
+		return -EINTR;
+
+	error = do_munmap(host_mm, addr, size, NULL);
+
+	mmap_write_unlock(host_mm);
+
+	return error;
+}
+
 static long
 msharefs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 {
 	struct mshare_data *m_data = filp->private_data;
 	struct mshare_create mcreate;
+	struct mshare_unmap munmap;
 
 	if (!mshare_is_initialized(m_data))
 		return -EINVAL;
@@ -411,6 +441,13 @@ msharefs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
 
 		return msharefs_create_mapping(m_data, &mcreate);
 
+	case MSHAREFS_UNMAP:
+		if (copy_from_user(&munmap, (struct mshare_unmap __user *)arg,
+				   sizeof(munmap)))
+			return -EFAULT;
+
+		return msharefs_unmap(m_data, &munmap);
+
 	default:
 		return -ENOTTY;
 	}
-- 
2.47.1

From nobody Sat Oct 4 05:02:34 2025
From: Anthony Yznaga
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, andreyknvl@gmail.com, arnd@arndb.de,
 bp@alien8.de, brauner@kernel.org, bsegall@google.com, corbet@lwn.net,
 dave.hansen@linux.intel.com, david@redhat.com, dietmar.eggemann@arm.com,
 ebiederm@xmission.com, hpa@zytor.com, jakub.wartak@mailbox.org,
 jannh@google.com, juri.lelli@redhat.com, khalid@kernel.org,
 liam.howlett@oracle.com, linyongting@bytedance.com,
 lorenzo.stoakes@oracle.com, luto@kernel.org, markhemm@googlemail.com,
 maz@kernel.org, mhiramat@kernel.org, mgorman@suse.de, mhocko@suse.com,
 mingo@redhat.com, muchun.song@linux.dev, neilb@suse.de, osalvador@suse.de,
 pcc@google.com, peterz@infradead.org, pfalcato@suse.de, rostedt@goodmis.org,
 rppt@kernel.org, shakeel.butt@linux.dev, surenb@google.com,
 tglx@linutronix.de, vasily.averin@linux.dev, vbabka@suse.cz,
 vincent.guittot@linaro.org, viro@zeniv.linux.org.uk, vschneid@redhat.com,
 willy@infradead.org, x86@kernel.org, xhao@linux.alibaba.com,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-arch@vger.kernel.org
Subject: [PATCH v3 20/22] mm/mshare: support mapping files and anon hugetlb in an mshare region
Date: Tue, 19 Aug 2025 18:04:13 -0700
Message-ID: <20250820010415.699353-21-anthony.yznaga@oracle.com>
X-Mailer: git-send-email 2.47.3
In-Reply-To: <20250820010415.699353-1-anthony.yznaga@oracle.com>
References: <20250820010415.699353-1-anthony.yznaga@oracle.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The code is largely copied from ksys_mmap_pgoff(). The key differences
are that mapping an mshare region within an mshare region is disallowed,
and that the size, which may have been rounded up to a huge page
boundary, is checked to ensure the new mapping does not exceed the
bounds of the mshare region.
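
Illustration only, not part of the patch: a minimal userspace sketch of
the ordering described above, assuming a power-of-two page size. The
size is rounded up first and only then checked against the region
bounds. align_up() and mapping_fits() are hypothetical names; the patch
itself uses ALIGN() and compares addr + size with mshare_end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round size up to a multiple of align; align must be a power of two. */
static uint64_t align_up(uint64_t size, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper: return true if a mapping of `size` bytes placed at
 * `region_offset` still fits inside a region of `region_size` bytes once
 * `size` has been rounded up to `page_size`.
 */
static bool mapping_fits(uint64_t region_size, uint64_t region_offset,
			 uint64_t size, uint64_t page_size)
{
	uint64_t aligned;

	if (region_offset >= region_size)
		return false;

	aligned = align_up(size, page_size);
	if (aligned < size)	/* rounding up wrapped around */
		return false;

	/* Compare with the space left so offset + size is never computed. */
	return aligned <= region_size - region_offset;
}
```

Comparing against the remaining space (region_size - region_offset)
rather than computing offset + size keeps this sketch free of integer
overflow.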

Signed-off-by: Anthony Yznaga
---
 mm/mshare.c | 71 +++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 63 insertions(+), 8 deletions(-)

diff --git a/mm/mshare.c b/mm/mshare.c
index b1e02f5e1f60..ddcf7bb2e956 100644
--- a/mm/mshare.c
+++ b/mm/mshare.c
@@ -14,8 +14,10 @@
  *
  */
 
+#include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -345,12 +347,15 @@ msharefs_get_unmapped_area(struct file *file, unsigned long addr,
 				pgoff, flags);
 }
 
+static const struct file_operations msharefs_file_operations;
+
 static long
 msharefs_create_mapping(struct mshare_data *m_data, struct mshare_create *mcreate)
 {
 	struct mm_struct *host_mm = m_data->mm;
 	unsigned long mshare_start, mshare_end;
 	unsigned long region_offset = mcreate->region_offset;
+	unsigned long pgoff = mcreate->offset >> PAGE_SHIFT;
 	unsigned long size = mcreate->size;
 	unsigned int fd = mcreate->fd;
 	int flags = mcreate->flags;
@@ -359,37 +364,87 @@ msharefs_create_mapping(struct mshare_data *m_data, struct mshare_create *mcreat
 	unsigned long mapped_addr;
 	unsigned long addr;
 	vm_flags_t vm_flags;
+	struct file *file = NULL;
 	int error = -EINVAL;
 
 	mshare_start = m_data->start;
 	mshare_end = mshare_start + m_data->size;
 	addr = mshare_start + region_offset;
 
-	if ((addr < mshare_start) || (addr >= mshare_end) ||
-	    (addr + size > mshare_end))
+	/*
+	 * Check the size later after size has possibly been
+	 * adjusted.
+	 */
+	if ((addr < mshare_start) || (addr >= mshare_end))
 		goto out;
 
 	/*
-	 * Only anonymous shared memory at fixed addresses is allowed for now.
+	 * Only shared memory at fixed addresses is allowed for now.
 	 */
 	if ((flags & (MAP_SHARED | MAP_FIXED)) != (MAP_SHARED | MAP_FIXED))
 		goto out;
-	if (fd != -1)
-		goto out;
+
+	if (!(flags & MAP_ANONYMOUS)) {
+		file = fget(fd);
+		if (!file) {
+			error = -EBADF;
+			goto out;
+		}
+		if (is_file_hugepages(file)) {
+			size = ALIGN(size, huge_page_size(hstate_file(file)));
+		} else if (unlikely(flags & MAP_HUGETLB)) {
+			error = -EINVAL;
+			goto out_fput;
+		} else if (file->f_op == &msharefs_file_operations) {
+			error = -EINVAL;
+			goto out_fput;
+		}
+	} else if (flags & MAP_HUGETLB) {
+		struct hstate *hs;
+
+		hs = hstate_sizelog((flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK);
+		if (!hs)
+			return -EINVAL;
+
+		size = ALIGN(size, huge_page_size(hs));
+		/*
+		 * VM_NORESERVE is used because the reservations will be
+		 * taken when vm_ops->mmap() is called
+		 */
+		file = hugetlb_file_setup(HUGETLB_ANON_FILE, size,
+				VM_NORESERVE,
+				HUGETLB_ANONHUGE_INODE,
+				(flags >> MAP_HUGE_SHIFT) & MAP_HUGE_MASK);
+		if (IS_ERR(file)) {
+			error = PTR_ERR(file);
+			goto out;
+		}
+	}
+
+	if (addr + size > mshare_end)
+		goto out_fput;
+
+	error = security_mmap_file(file, prot, flags);
+	if (error)
+		goto out_fput;
 
 	if (mmap_write_lock_killable(host_mm)) {
 		error = -EINTR;
-		goto out;
+		goto out_fput;
 	}
 
 	error = 0;
-	mapped_addr = __do_mmap(NULL, addr, size, prot, flags, vm_flags,
-				0, &populate, NULL, host_mm);
+	mapped_addr = __do_mmap(file, addr, size, prot, flags, vm_flags,
+				pgoff, &populate, NULL, host_mm);
 
 	if (IS_ERR_VALUE(mapped_addr))
 		error = (long)mapped_addr;
 
 	mmap_write_unlock(host_mm);
+
+out_fput:
+	if (file)
+		fput(file);
 out:
 	return error;
 }
-- 
2.47.1

From nobody Sat Oct 4 05:02:34 2025
From: Anthony Yznaga
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, andreyknvl@gmail.com, arnd@arndb.de,
 bp@alien8.de, brauner@kernel.org, bsegall@google.com, corbet@lwn.net,
 dave.hansen@linux.intel.com, david@redhat.com, dietmar.eggemann@arm.com,
 ebiederm@xmission.com, hpa@zytor.com, jakub.wartak@mailbox.org,
 jannh@google.com, juri.lelli@redhat.com, khalid@kernel.org,
 liam.howlett@oracle.com, linyongting@bytedance.com,
 lorenzo.stoakes@oracle.com, luto@kernel.org, markhemm@googlemail.com,
 maz@kernel.org, mhiramat@kernel.org, mgorman@suse.de, mhocko@suse.com,
 mingo@redhat.com, muchun.song@linux.dev, neilb@suse.de, osalvador@suse.de,
 pcc@google.com, peterz@infradead.org, pfalcato@suse.de, rostedt@goodmis.org,
 rppt@kernel.org, shakeel.butt@linux.dev, surenb@google.com,
 tglx@linutronix.de, vasily.averin@linux.dev, vbabka@suse.cz,
 vincent.guittot@linaro.org, viro@zeniv.linux.org.uk, vschneid@redhat.com,
 willy@infradead.org, x86@kernel.org, xhao@linux.alibaba.com,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-arch@vger.kernel.org
Subject: [PATCH v3 21/22] mm/mshare: provide a way to identify an mm as an mshare host mm
Date: Tue, 19 Aug 2025 18:04:14 -0700
Message-ID: <20250820010415.699353-22-anthony.yznaga@oracle.com>
X-Mailer: git-send-email 2.47.3
In-Reply-To: <20250820010415.699353-1-anthony.yznaga@oracle.com>
References: <20250820010415.699353-1-anthony.yznaga@oracle.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Add a new mm flag, MMF_MSHARE, that marks an mm as an mshare host mm.

Signed-off-by: Anthony Yznaga
---
 include/linux/mm_types.h | 2 ++
 mm/mshare.c              | 1 +
 2 files changed, 3 insertions(+)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index da5a7a31a81d..4586a3f384f1 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1847,6 +1847,8 @@ enum {
 #define MMF_TOPDOWN		31	/* mm searches top down by default */
 #define MMF_TOPDOWN_MASK	_BITUL(MMF_TOPDOWN)
 
+#define MMF_MSHARE		32	/* mm is an mshare host mm */
+
 #define MMF_INIT_LEGACY_MASK	(MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK |\
 				 MMF_DISABLE_THP_MASK | MMF_HAS_MDWE_MASK |\
 				 MMF_VM_MERGE_ANY_MASK | MMF_TOPDOWN_MASK)
diff --git a/mm/mshare.c b/mm/mshare.c
index ddcf7bb2e956..22e2aedb74d3 100644
--- a/mm/mshare.c
+++ b/mm/mshare.c
@@ -578,6 +578,7 @@ msharefs_fill_mm(struct inode *inode)
 	if (!mm)
 		return -ENOMEM;
 
+	mm_flags_set(MMF_MSHARE, mm);
 	mm->mmap_base = mshare_base;
 	mm->task_size = 0;
 
-- 
2.47.1

From nobody Sat Oct 4 05:02:34 2025
From: Anthony Yznaga
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, andreyknvl@gmail.com, arnd@arndb.de,
 bp@alien8.de, brauner@kernel.org, bsegall@google.com, corbet@lwn.net,
 dave.hansen@linux.intel.com, david@redhat.com, dietmar.eggemann@arm.com,
 ebiederm@xmission.com, hpa@zytor.com, jakub.wartak@mailbox.org,
 jannh@google.com, juri.lelli@redhat.com, khalid@kernel.org,
 liam.howlett@oracle.com, linyongting@bytedance.com,
 lorenzo.stoakes@oracle.com, luto@kernel.org, markhemm@googlemail.com,
 maz@kernel.org, mhiramat@kernel.org, mgorman@suse.de, mhocko@suse.com,
 mingo@redhat.com, muchun.song@linux.dev, neilb@suse.de, osalvador@suse.de,
 pcc@google.com, peterz@infradead.org, pfalcato@suse.de, rostedt@goodmis.org,
 rppt@kernel.org, shakeel.butt@linux.dev, surenb@google.com,
 tglx@linutronix.de, vasily.averin@linux.dev, vbabka@suse.cz,
 vincent.guittot@linaro.org, viro@zeniv.linux.org.uk, vschneid@redhat.com,
 willy@infradead.org, x86@kernel.org, xhao@linux.alibaba.com,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-arch@vger.kernel.org
Subject: [PATCH v3 22/22] mm/mshare: charge fault handling allocations to the mshare owner
Date: Tue, 19 Aug 2025 18:04:15 -0700
Message-ID: <20250820010415.699353-23-anthony.yznaga@oracle.com>
X-Mailer: git-send-email 2.47.3
In-Reply-To: <20250820010415.699353-1-anthony.yznaga@oracle.com>
References: <20250820010415.699353-1-anthony.yznaga@oracle.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

When handling a fault in an mshare range, redirect charges for page
tables and other allocations to the mshare owner rather
than the current task.

Signed-off-by: Anthony Yznaga
---
 mm/memory.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index 177eb53475cb..127db0b9932c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -6468,9 +6468,17 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 	struct mm_struct *mm = vma->vm_mm;
 	vm_fault_t ret;
 	bool is_droppable;
+	bool is_mshare = mm_flags_test(MMF_MSHARE, mm);
+	struct mem_cgroup *mshare_memcg;
+	struct mem_cgroup *memcg;
 
 	__set_current_state(TASK_RUNNING);
 
+	if (unlikely(is_mshare)) {
+		mshare_memcg = get_mem_cgroup_from_mm(vma->vm_mm);
+		memcg = set_active_memcg(mshare_memcg);
+	}
+
 	ret = sanitize_fault_flags(vma, &flags);
 	if (ret)
 		goto out;
@@ -6530,6 +6538,11 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
 out:
 	mm_account_fault(mm, regs, address, flags, ret);
 
+	if (unlikely(is_mshare)) {
+		set_active_memcg(memcg);
+		mem_cgroup_put(mshare_memcg);
+	}
+
 	return ret;
 }
 EXPORT_SYMBOL_GPL(handle_mm_fault);
-- 
2.47.1
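
Illustration only, not part of the patch: a toy userspace model of the
save/restore pattern used around the fault path above. In the kernel,
set_active_memcg() returns the previously active memcg, which is what
lets the caller restore it on exit. All toy_* names are hypothetical,
and the model ignores reference counting.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a memory cgroup: it only counts charged bytes. */
struct toy_memcg {
	unsigned long charged;
};

/* Models the active override; NULL means "charge the current task". */
static struct toy_memcg *toy_active;

/* Install a new override and return the previous one for later restore. */
static struct toy_memcg *toy_set_active(struct toy_memcg *memcg)
{
	struct toy_memcg *old = toy_active;

	toy_active = memcg;
	return old;
}

/* Charge to the override if one is installed, else to the caller's group. */
static void toy_charge(struct toy_memcg *current_memcg, unsigned long bytes)
{
	struct toy_memcg *target = toy_active ? toy_active : current_memcg;

	assert(target != NULL);
	target->charged += bytes;
}

/* Models a fault: allocations made inside are charged to `owner`. */
static void toy_handle_fault(struct toy_memcg *current_memcg,
			     struct toy_memcg *owner, unsigned long bytes)
{
	struct toy_memcg *saved = toy_set_active(owner);

	toy_charge(current_memcg, bytes);	/* e.g. a page table page */
	toy_set_active(saved);			/* restore on the way out */
}
```

Saving the previous value instead of resetting to NULL keeps the model
correct when an override was already active before the fault.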