From: Alban Crequy
To: Andrew Morton, David Hildenbrand, Christian Brauner
Cc: Lorenzo Stoakes, "Liam R . Howlett", Vlastimil Babka, Mike Rapoport,
    Suren Baghdasaryan, Michal Hocko, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, Alban Crequy, Alban Crequy, Peter Xu, Willy Tarreau,
    mfriese@microsoft.com
Subject: [PATCH] process_vm_readv/writev: add flags for pidfd and nowait
Date: Tue, 18 Nov 2025 14:23:48 +0100
Message-ID: <20251118132348.2415603-1-alban.crequy@gmail.com>
X-Mailer: git-send-email 2.45.0

From: Alban Crequy

Add two flags to process_vm_readv(2) and process_vm_writev(2):

- PROCESS_VM_PIDFD: refer to the remote process via a PID file
  descriptor instead of a PID. Such a file descriptor can be obtained
  with pidfd_open(2).

- PROCESS_VM_NOWAIT: do not block on IO if the memory access causes a
  page fault.

If a given flag is unsupported, the syscall returns the error EINVAL
without checking the buffers.
This gives userspace a way to detect whether the running kernel
supports a specific flag:

  process_vm_readv(pid, NULL, 1, NULL, 1, PROCESS_VM_PIDFD)
  -> EINVAL if the kernel does not support the flag PROCESS_VM_PIDFD
     (before this patch)
  -> EFAULT if the kernel supports the flag (after this patch)

Signed-off-by: Alban Crequy
Reviewed-by: Christian Brauner
---
 MAINTAINERS                     |  1 +
 include/uapi/linux/process_vm.h |  9 +++++++++
 mm/process_vm_access.c          | 20 +++++++++++++++-----
 3 files changed, 25 insertions(+), 5 deletions(-)
 create mode 100644 include/uapi/linux/process_vm.h

diff --git a/MAINTAINERS b/MAINTAINERS
index e64b94e6b5a9..91b4647cf761 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -16272,6 +16272,7 @@ F:	include/linux/pgtable.h
 F:	include/linux/ptdump.h
 F:	include/linux/vmpressure.h
 F:	include/linux/vmstat.h
+F:	include/uapi/linux/process_vm.h
 F:	kernel/fork.c
 F:	mm/Kconfig
 F:	mm/debug.c
diff --git a/include/uapi/linux/process_vm.h b/include/uapi/linux/process_vm.h
new file mode 100644
index 000000000000..4168e09f3f4e
--- /dev/null
+++ b/include/uapi/linux/process_vm.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _UAPI_LINUX_PROCESS_VM_H
+#define _UAPI_LINUX_PROCESS_VM_H
+
+/* Flags for process_vm_readv/process_vm_writev */
+#define PROCESS_VM_PIDFD (1UL << 0)
+#define PROCESS_VM_NOWAIT (1UL << 1)
+
+#endif /* _UAPI_LINUX_PROCESS_VM_H */
diff --git a/mm/process_vm_access.c b/mm/process_vm_access.c
index 656d3e88755b..b5eac870ef24 100644
--- a/mm/process_vm_access.c
+++ b/mm/process_vm_access.c
@@ -14,6 +14,7 @@
 #include
 #include
 #include
+#include
 
 /**
  * process_vm_rw_pages - read/write pages from task specified
@@ -68,6 +69,7 @@ static int process_vm_rw_pages(struct page **pages,
  * @mm: mm for task
  * @task: task to read/write from
  * @vm_write: 0 means copy from, 1 means copy to
+ * @pvm_flags: PROCESS_VM_* flags
  * Returns 0 on success or on failure error code
  */
 static int process_vm_rw_single_vec(unsigned long addr,
@@ -76,7 +78,8 @@ static int process_vm_rw_single_vec(unsigned long addr,
 				    struct page **process_pages,
 				    struct mm_struct *mm,
 				    struct task_struct *task,
-				    int vm_write)
+				    int vm_write,
+				    unsigned int pvm_flags)
 {
 	unsigned long pa = addr & PAGE_MASK;
 	unsigned long start_offset = addr - pa;
@@ -91,6 +94,8 @@ static int process_vm_rw_single_vec(unsigned long addr,
 
 	if (vm_write)
 		flags |= FOLL_WRITE;
+	if (pvm_flags & PROCESS_VM_NOWAIT)
+		flags |= FOLL_NOWAIT;
 
 	while (!rc && nr_pages && iov_iter_count(iter)) {
 		int pinned_pages = min_t(unsigned long, nr_pages, PVM_MAX_USER_PAGES);
@@ -141,7 +146,7 @@ static int process_vm_rw_single_vec(unsigned long addr,
  * @iter: where to copy to/from locally
  * @rvec: iovec array specifying where to copy to/from in the other process
  * @riovcnt: size of rvec array
- * @flags: currently unused
+ * @flags: process_vm_readv/writev flags
  * @vm_write: 0 if reading from other process, 1 if writing to other process
  *
  * Returns the number of bytes read/written or error code. May
@@ -163,6 +168,7 @@ static ssize_t process_vm_rw_core(pid_t pid, struct iov_iter *iter,
 	unsigned long nr_pages_iov;
 	ssize_t iov_len;
 	size_t total_len = iov_iter_count(iter);
+	unsigned int f_flags;
 
 	/*
 	 * Work out how many pages of struct pages we're going to need
@@ -194,7 +200,11 @@ static ssize_t process_vm_rw_core(pid_t pid, struct iov_iter *iter,
 	}
 
 	/* Get process information */
-	task = find_get_task_by_vpid(pid);
+	if (flags & PROCESS_VM_PIDFD)
+		task = pidfd_get_task(pid, &f_flags);
+	else
+		task = find_get_task_by_vpid(pid);
+
 	if (!task) {
 		rc = -ESRCH;
 		goto free_proc_pages;
@@ -215,7 +225,7 @@ static ssize_t process_vm_rw_core(pid_t pid, struct iov_iter *iter,
 	for (i = 0; i < riovcnt && iov_iter_count(iter) && !rc; i++)
 		rc = process_vm_rw_single_vec(
 			(unsigned long)rvec[i].iov_base, rvec[i].iov_len,
-			iter, process_pages, mm, task, vm_write);
+			iter, process_pages, mm, task, vm_write, flags);
 
 	/* copied = space before - space after */
 	total_len -= iov_iter_count(iter);
@@ -266,7 +276,7 @@ static ssize_t process_vm_rw(pid_t pid,
 	ssize_t rc;
 	int dir = vm_write ? ITER_SOURCE : ITER_DEST;
 
-	if (flags != 0)
+	if (flags & ~(PROCESS_VM_NOWAIT | PROCESS_VM_PIDFD))
 		return -EINVAL;
 
 	/* Check iovecs */
-- 
2.45.0
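
For illustration, the flag probe described in the commit message could be
wrapped in a small userspace helper along the following lines. This is only
a sketch under stated assumptions: the names pvm_classify(), pvm_probe_flag()
and enum pvm_support are hypothetical and not part of the patch, and the flag
values are copied from the new include/uapi/linux/process_vm.h because
installed headers may not provide them yet.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Values copied from the patch; guarded in case real headers define them. */
#ifndef PROCESS_VM_PIDFD
#define PROCESS_VM_PIDFD	(1UL << 0)
#endif
#ifndef PROCESS_VM_NOWAIT
#define PROCESS_VM_NOWAIT	(1UL << 1)
#endif

/* Hypothetical result type for this sketch. */
enum pvm_support {
	PVM_UNKNOWN = -1,	/* e.g. ENOSYS, EPERM, or an unexpected result */
	PVM_UNSUPPORTED = 0,	/* kernel rejected the flag with EINVAL */
	PVM_SUPPORTED = 1,	/* kernel accepted the flag, then hit EFAULT */
};

/* Map the probe's return value and errno to a support status. */
static enum pvm_support pvm_classify(long ret, int err)
{
	if (ret != -1)
		return PVM_UNKNOWN;	/* NULL iovecs should never succeed */
	if (err == EINVAL)
		return PVM_UNSUPPORTED;
	if (err == EFAULT)
		return PVM_SUPPORTED;
	return PVM_UNKNOWN;
}

/*
 * Probe a single flag by passing NULL iovec arrays with a count of 1.
 * Flags are validated before the buffers, so an unknown flag yields
 * EINVAL and a known flag falls through to EFAULT.
 */
static enum pvm_support pvm_probe_flag(unsigned long flag)
{
	long ret;

	/* Probe one bit at a time so EINVAL is unambiguous. */
	assert(flag != 0 && (flag & (flag - 1)) == 0);

	errno = 0;
	ret = syscall(SYS_process_vm_readv, (long)getpid(),
		      NULL, 1UL, NULL, 1UL, flag);
	return pvm_classify(ret, errno);
}
```

A caller might check pvm_probe_flag(PROCESS_VM_PIDFD) == PVM_SUPPORTED once
at startup and fall back to the PID-based call otherwise. PVM_UNKNOWN is kept
separate so that a seccomp filter or a missing syscall is not mistaken for a
missing flag.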