From: Gregory Price
To: linux-mm@kvack.org
Cc: linux-doc@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-api@vger.kernel.org,
    x86@kernel.org, akpm@linux-foundation.org, arnd@arndb.de,
    tglx@linutronix.de, luto@kernel.org, mingo@redhat.com, bp@alien8.de,
    dave.hansen@linux.intel.com, hpa@zytor.com, mhocko@kernel.org,
    tj@kernel.org, ying.huang@intel.com, gregory.price@memverge.com,
    corbet@lwn.net, rakie.kim@sk.com, hyeongtak.ji@sk.com,
    honggyu.kim@sk.com, vtavarespetr@micron.com, peterz@infradead.org,
    jgroves@micron.com, ravis.opensrc@micron.com, sthanneeru@micron.com,
    emirakhur@micron.com, Hasan.Maruf@amd.com, seungjun.ha@samsung.com,
    Michal Hocko
Subject: [PATCH v3 09/11] mm/mempolicy: add get_mempolicy2 syscall
Date: Wed, 13 Dec 2023 17:41:16 -0500
Message-Id: <20231213224118.1949-10-gregory.price@memverge.com>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20231213224118.1949-1-gregory.price@memverge.com>
References: <20231213224118.1949-1-gregory.price@memverge.com>

get_mempolicy2 is an extensible get_mempolicy interface which allows
a user to retrieve the memory policy for a task or address.

Defined as:

get_mempolicy2(struct mpol_args *args, size_t size,
               unsigned long addr, unsigned long flags)

Top level input values:

    args:  pointer to the mpol_args structure, which carries the inputs
           listed below and collects the mempolicy information returned
           to userspace.
    size:  size of the mpol_args structure passed in by userspace.
    addr:  if MPOL_F_ADDR is passed in `flags`, this address is used to
           look up the mempolicy of the vma the address belongs to.
    flags: if MPOL_F_ADDR is set, return the mempolicy of the vma
           containing addr; otherwise return the task mempolicy.

Input values include the following fields of mpol_args:

    pol_nodes:    if set, the nodemask of the policy is returned here.
    pol_maxnodes: if pol_nodes is set, must describe the maximum number
                  of nodes to be copied to pol_nodes.

Output values include the following fields of mpol_args:

    mode:        mempolicy mode.
    mode_flags:  mempolicy mode flags.
    home_node:   the policy home node is returned here, or -1 if no
                 home node is set.
    pol_nodes:   if set, the nodemask for the mempolicy.
    policy_node: if the policy has extended node information, it is
                 placed here. For example, MPOL_INTERLEAVE returns the
                 next node which will be used for allocation.

MPOL_F_NODE is not supported by get_mempolicy2; passing it in `flags`
returns -EINVAL.
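As a rough illustration of the flag handling described above, the sketch
below models in userspace the check this patch applies to `flags`: only
MPOL_F_ADDR is accepted, so MPOL_F_NODE (or any other bit) yields
-EINVAL. The flag values are those found in
include/uapi/linux/mempolicy.h. The helper name
get_mempolicy2_check_flags() is hypothetical; it is not part of the
patch and does not invoke the syscall.

```c
#include <errno.h>

/* Flag values as defined in include/uapi/linux/mempolicy.h. */
#define MPOL_F_NODE		(1 << 0)
#define MPOL_F_ADDR		(1 << 1)
#define MPOL_F_MEMS_ALLOWED	(1 << 2)

/*
 * Hypothetical userspace model of the flag check in get_mempolicy2():
 * any bit other than MPOL_F_ADDR is rejected with -EINVAL.
 */
int get_mempolicy2_check_flags(unsigned long flags)
{
	if (flags & ~(unsigned long)MPOL_F_ADDR)
		return -EINVAL;
	return 0;
}
```

With this model, flags of 0 select the task policy and MPOL_F_ADDR
selects the vma policy, while a caller that still passes MPOL_F_NODE
gets -EINVAL rather than the legacy get_mempolicy() behaviour.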

Suggested-by: Michal Hocko
Signed-off-by: Gregory Price
---
 .../admin-guide/mm/numa_memory_policy.rst   | 10 ++++-
 arch/alpha/kernel/syscalls/syscall.tbl      |  1 +
 arch/arm/tools/syscall.tbl                  |  1 +
 arch/m68k/kernel/syscalls/syscall.tbl       |  1 +
 arch/microblaze/kernel/syscalls/syscall.tbl |  1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   |  1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   |  1 +
 arch/parisc/kernel/syscalls/syscall.tbl     |  1 +
 arch/powerpc/kernel/syscalls/syscall.tbl    |  1 +
 arch/s390/kernel/syscalls/syscall.tbl       |  1 +
 arch/sh/kernel/syscalls/syscall.tbl         |  1 +
 arch/sparc/kernel/syscalls/syscall.tbl      |  1 +
 arch/x86/entry/syscalls/syscall_32.tbl      |  1 +
 arch/x86/entry/syscalls/syscall_64.tbl      |  1 +
 arch/xtensa/kernel/syscalls/syscall.tbl     |  1 +
 include/linux/syscalls.h                    |  2 +
 include/uapi/asm-generic/unistd.h           |  4 +-
 mm/mempolicy.c                              | 43 +++++++++++++++++++
 18 files changed, 71 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/mm/numa_memory_policy.rst b/Documentation/admin-guide/mm/numa_memory_policy.rst
index e57d400d0281..8c1fcdb30602 100644
--- a/Documentation/admin-guide/mm/numa_memory_policy.rst
+++ b/Documentation/admin-guide/mm/numa_memory_policy.rst
@@ -456,11 +456,19 @@ Get [Task] Memory Policy or Related Information::
 	long get_mempolicy(int *mode,
 			   const unsigned long *nmask, unsigned long maxnode,
 			   void *addr, int flags);
+	long get_mempolicy2(struct mpol_args *args, size_t size,
+			    unsigned long addr, unsigned long flags);
 
 Queries the "task/process memory policy" of the calling task, or the
 policy or location of a specified virtual address, depending on the
 'flags' argument.
 
+get_mempolicy2() is an extended version of get_mempolicy() capable of
+acquiring extended information about a mempolicy, including those
+that can only be set via set_mempolicy2() or mbind2().
+
+MPOL_F_NODE functionality has been removed from get_mempolicy2().
+
 See the get_mempolicy(2) man page for more details
 
 
@@ -504,7 +512,7 @@ Extended Mempolicy Arguments::
 The extended mempolicy argument structure is defined to allow the mempolicy
 interfaces future extensibility without the need for additional system calls.
 
-Extended interfaces (set_mempolicy2) use this argument structure.
+Extended interfaces (set_mempolicy2 and get_mempolicy2) use this structure.
 
 The core arguments (mode, mode_flags, pol_nodes, and pol_maxnodes) apply to
 all interfaces relative to their non-extended counterparts. Each additional
diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index 0dc288a1118a..0301a8b0a262 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -497,3 +497,4 @@
 565	common	futex_wait			sys_futex_wait
 566	common	futex_requeue			sys_futex_requeue
 567	common	set_mempolicy2			sys_set_mempolicy2
+568	common	get_mempolicy2			sys_get_mempolicy2
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 50172ec0e1f5..771a33446e8e 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -471,3 +471,4 @@
 455	common	futex_wait			sys_futex_wait
 456	common	futex_requeue			sys_futex_requeue
 457	common	set_mempolicy2			sys_set_mempolicy2
+458	common	get_mempolicy2			sys_get_mempolicy2
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index 839d90c535f2..048a409e684c 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -457,3 +457,4 @@
 455	common	futex_wait			sys_futex_wait
 456	common	futex_requeue			sys_futex_requeue
 457	common	set_mempolicy2			sys_set_mempolicy2
+458	common	get_mempolicy2			sys_get_mempolicy2
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index 567c8b883735..327b01bd6793 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -463,3 +463,4 @@
 455	common	futex_wait			sys_futex_wait
 456	common	futex_requeue			sys_futex_requeue
 457	common	set_mempolicy2			sys_set_mempolicy2
+458	common	get_mempolicy2			sys_get_mempolicy2
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index cc0640e16f2f..921d58e1da23 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -396,3 +396,4 @@
 455	n32	futex_wait			sys_futex_wait
 456	n32	futex_requeue			sys_futex_requeue
 457	n32	set_mempolicy2			sys_set_mempolicy2
+458	n32	get_mempolicy2			sys_get_mempolicy2
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index f7262fde98d9..9271c83c9993 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -445,3 +445,4 @@
 455	o32	futex_wait			sys_futex_wait
 456	o32	futex_requeue			sys_futex_requeue
 457	o32	set_mempolicy2			sys_set_mempolicy2
+458	o32	get_mempolicy2			sys_get_mempolicy2
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index e10f0e8bd064..0654f3f89fc7 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -456,3 +456,4 @@
 455	common	futex_wait			sys_futex_wait
 456	common	futex_requeue			sys_futex_requeue
 457	common	set_mempolicy2			sys_set_mempolicy2
+458	common	get_mempolicy2			sys_get_mempolicy2
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index 4f03f5f42b78..ac11d2064e7a 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -544,3 +544,4 @@
 455	common	futex_wait			sys_futex_wait
 456	common	futex_requeue			sys_futex_requeue
 457	common	set_mempolicy2			sys_set_mempolicy2
+458	common	get_mempolicy2			sys_get_mempolicy2
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index f98dadc2e9df..1cdcafe1ccca 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -460,3 +460,4 @@
 455	common	futex_wait		sys_futex_wait			sys_futex_wait
 456	common	futex_requeue		sys_futex_requeue		sys_futex_requeue
 457	common	set_mempolicy2		sys_set_mempolicy2		sys_set_mempolicy2
+458	common	get_mempolicy2		sys_get_mempolicy2		sys_get_mempolicy2
diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
index f47ba9f2d05d..f71742024c29 100644
--- a/arch/sh/kernel/syscalls/syscall.tbl
+++ b/arch/sh/kernel/syscalls/syscall.tbl
@@ -460,3 +460,4 @@
 455	common	futex_wait			sys_futex_wait
 456	common	futex_requeue			sys_futex_requeue
 457	common	set_mempolicy2			sys_set_mempolicy2
+458	common	get_mempolicy2			sys_get_mempolicy2
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index 53fb16616728..2fbf5dbe0620 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -503,3 +503,4 @@
 455	common	futex_wait			sys_futex_wait
 456	common	futex_requeue			sys_futex_requeue
 457	common	set_mempolicy2			sys_set_mempolicy2
+458	common	get_mempolicy2			sys_get_mempolicy2
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 4b4dc41b24ee..0af813b9a118 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -462,3 +462,4 @@
 455	i386	futex_wait		sys_futex_wait
 456	i386	futex_requeue		sys_futex_requeue
 457	i386	set_mempolicy2		sys_set_mempolicy2
+458	i386	get_mempolicy2		sys_get_mempolicy2
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 1bc2190bec27..0b777876fc15 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -379,6 +379,7 @@
 455	common	futex_wait		sys_futex_wait
 456	common	futex_requeue		sys_futex_requeue
 457	common	set_mempolicy2		sys_set_mempolicy2
+458	common	get_mempolicy2		sys_get_mempolicy2
 
 #
 # Due to a historical design error, certain syscalls are numbered differently
diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
index e26dc89399eb..4536c9a4227d 100644
--- a/arch/xtensa/kernel/syscalls/syscall.tbl
+++ b/arch/xtensa/kernel/syscalls/syscall.tbl
@@ -428,3 +428,4 @@
 455	common	futex_wait			sys_futex_wait
 456	common	futex_requeue			sys_futex_requeue
 457	common	set_mempolicy2			sys_set_mempolicy2
+458	common	get_mempolicy2			sys_get_mempolicy2
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 451f0089601f..f696855cbe8c 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -821,6 +821,8 @@ asmlinkage long sys_get_mempolicy(int __user *policy,
 				unsigned long __user *nmask,
 				unsigned long maxnode,
 				unsigned long addr, unsigned long flags);
+asmlinkage long sys_get_mempolicy2(struct mpol_args __user *args, size_t size,
+				   unsigned long addr, unsigned long flags);
 asmlinkage long sys_set_mempolicy(int mode, const unsigned long __user *nmask,
 				unsigned long maxnode);
 asmlinkage long sys_set_mempolicy2(struct mpol_args __user *args, size_t size,
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 55486aba099f..719accc731db 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -830,9 +830,11 @@ __SYSCALL(__NR_futex_wait, sys_futex_wait)
 __SYSCALL(__NR_futex_requeue, sys_futex_requeue)
 #define __NR_set_mempolicy2 457
 __SYSCALL(__NR_set_mempolicy2, sys_set_mempolicy2)
+#define __NR_get_mempolicy2 458
+__SYSCALL(__NR_get_mempolicy2, sys_get_mempolicy2)
 
 #undef __NR_syscalls
-#define __NR_syscalls 458
+#define __NR_syscalls 459
 
 /*
  * 32 bit systems traditionally used different
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 4bf563f3732b..09c613113c7c 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1863,6 +1863,49 @@ SYSCALL_DEFINE5(get_mempolicy, int __user *, policy,
 	return kernel_get_mempolicy(policy, nmask, maxnode, addr, flags);
 }
 
+SYSCALL_DEFINE4(get_mempolicy2, struct mpol_args __user *, uargs, size_t, usize,
+		unsigned long, addr, unsigned long, flags)
+{
+	struct mpol_args kargs;
+	struct mempolicy_args margs;
+	int err;
+	nodemask_t policy_nodemask;
+	unsigned long __user *nodes_ptr;
+
+	if (flags & ~(MPOL_F_ADDR))
+		return -EINVAL;
+
+	/* initialize any memory liable to be copied to userland */
+	memset(&margs, 0, sizeof(margs));
+
+	err = copy_struct_from_user(&kargs, sizeof(kargs), uargs, usize);
+	if (err)
+		return -EINVAL;
+
+	margs.policy_nodes = kargs.pol_nodes ? &policy_nodemask : NULL;
+	if (flags & MPOL_F_ADDR)
+		err = do_get_vma_mempolicy(untagged_addr(addr), NULL, &margs);
+	else
+		err = do_get_task_mempolicy(&margs);
+
+	if (err)
+		return err;
+
+	kargs.mode = margs.mode;
+	kargs.mode_flags = margs.mode_flags;
+	kargs.policy_node = margs.policy_node;
+	kargs.home_node = margs.home_node;
+	if (kargs.pol_nodes) {
+		nodes_ptr = u64_to_user_ptr(kargs.pol_nodes);
+		err = copy_nodes_to_user(nodes_ptr, kargs.pol_maxnodes,
+					 margs.policy_nodes);
+		if (err)
+			return err;
+	}
+
+	return copy_to_user(uargs, &kargs, usize) ? -EFAULT : 0;
+}
+
 bool vma_migratable(struct vm_area_struct *vma)
 {
 	if (vma->vm_flags & (VM_IO | VM_PFNMAP))
-- 
2.39.1