From nobody Wed Feb 11 06:32:21 2026
From: Bo Li
To: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
 dave.hansen@linux.intel.com, x86@kernel.org, luto@kernel.org,
 kees@kernel.org, akpm@linux-foundation.org, david@redhat.com,
 juri.lelli@redhat.com, vincent.guittot@linaro.org, peterz@infradead.org
Cc: dietmar.eggemann@arm.com, hpa@zytor.com, acme@kernel.org,
 namhyung@kernel.org, mark.rutland@arm.com,
 alexander.shishkin@linux.intel.com, jolsa@kernel.org, irogers@google.com,
 adrian.hunter@intel.com, kan.liang@linux.intel.com,
 viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz,
 lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, vbabka@suse.cz,
 rppt@kernel.org, surenb@google.com, mhocko@suse.com, rostedt@goodmis.org,
 bsegall@google.com, mgorman@suse.de, vschneid@redhat.com, jannh@google.com,
 pfalcato@suse.de, riel@surriel.com, harry.yoo@oracle.com,
 linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
 duanxiongchun@bytedance.com, yinhongbo@bytedance.com,
 dengliang.1214@bytedance.com, xieyongji@bytedance.com,
 chaiwen.cc@bytedance.com, songmuchun@bytedance.com, yuanzhu@bytedance.com,
 chengguozhu@bytedance.com, sunjiadong.lff@bytedance.com, Bo Li
Subject: [RFC v2 25/35] RPAL: add MPK initialization and interface
Date: Fri, 30 May 2025 17:27:53 +0800
Message-Id: <569387db40571a03a71506cbec12813c1e5dde62.1748594841.git.libo.gcs85@bytedance.com>
X-Mailer: git-send-email 2.39.5 (Apple Git-154)
In-Reply-To:
References:
Precedence: bulk
X-Mailing-List:
 linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain; charset="utf-8"

RPAL uses MPK (Memory Protection Keys) to protect memory. RPAL therefore
has to initialize MPK, allocate pkeys, and handle related tasks, while
providing the corresponding user-mode interfaces.

This patch adds the MPK initialization: hardware feature detection,
user-mode interfaces for setting and retrieving pkeys, and utility
functions.

For pkey allocation, RPAL prefers the pkey supplied by user mode, and
user mode is responsible for preventing pkey collisions between
different services. If user mode does not supply a valid pkey, RPAL
falls back to id % arch_max_pkey() to reduce the chance of collisions.

RPAL does not permit services to manipulate pkeys on their own: all
pkeys are marked as allocated, and services are prohibited from
releasing them.

Signed-off-by: Bo Li
---
 arch/x86/rpal/Kconfig    | 12 +++++++-
 arch/x86/rpal/Makefile   |  1 +
 arch/x86/rpal/core.c     | 13 ++++++++
 arch/x86/rpal/internal.h |  5 +++
 arch/x86/rpal/pku.c      | 47 ++++++++++++++++++++++++++++
 arch/x86/rpal/proc.c     |  5 +++
 arch/x86/rpal/service.c  | 24 +++++++++++++++
 include/linux/rpal.h     | 66 ++++++++++++++++++++++++++++++++++++++++
 mm/mprotect.c            |  9 ++++++
 9 files changed, 181 insertions(+), 1 deletion(-)
 create mode 100644 arch/x86/rpal/pku.c

diff --git a/arch/x86/rpal/Kconfig b/arch/x86/rpal/Kconfig
index e5e6996553ea..5434fdb2940d 100644
--- a/arch/x86/rpal/Kconfig
+++ b/arch/x86/rpal/Kconfig
@@ -8,4 +8,14 @@ config RPAL
 	depends on X86_64
 	help
 	  This option enables system support for Run Process As
-	  library (RPAL).
\ No newline at end of file
+	  library (RPAL).
+
+config RPAL_PKU
+	bool "MPK protection for RPAL"
+	default y
+	depends on RPAL
+	help
+	  Memory protection keys (MPK) provide intra-process memory
+	  separation, which RPAL would otherwise break. Always keep
+	  this on when using RPAL. The CPU feature is detected at
+	  boot time, since some CPUs do not support it.
\ No newline at end of file
diff --git a/arch/x86/rpal/Makefile b/arch/x86/rpal/Makefile
index 89f745382c51..42a42b0393be 100644
--- a/arch/x86/rpal/Makefile
+++ b/arch/x86/rpal/Makefile
@@ -3,3 +3,4 @@
 obj-$(CONFIG_RPAL) += rpal.o
 
 rpal-y := service.o core.o mm.o proc.o thread.o
+rpal-$(CONFIG_RPAL_PKU) += pku.o
\ No newline at end of file
diff --git a/arch/x86/rpal/core.c b/arch/x86/rpal/core.c
index 406d54788bac..41111d693994 100644
--- a/arch/x86/rpal/core.c
+++ b/arch/x86/rpal/core.c
@@ -8,6 +8,7 @@
 
 #include
 #include
+#include
 #include
 
 #include "internal.h"
@@ -374,6 +375,14 @@ static bool check_hardware_features(void)
 		rpal_err("no fsgsbase feature\n");
 		return false;
 	}
+
+#ifdef CONFIG_RPAL_PKU
+	if (!arch_pkeys_enabled()) {
+		rpal_err("MPK is not enabled\n");
+		return false;
+	}
+#endif
+
 	return true;
 }
 
@@ -390,6 +399,10 @@ int __init rpal_init(void)
 	if (ret)
 		goto fail;
 
+#ifdef CONFIG_RPAL_PKU
+	rpal_set_cap(RPAL_CAP_PKU);
+#endif
+
 	rpal_inited = true;
 	return 0;
 
diff --git a/arch/x86/rpal/internal.h b/arch/x86/rpal/internal.h
index 6256172bb79e..71afa8225450 100644
--- a/arch/x86/rpal/internal.h
+++ b/arch/x86/rpal/internal.h
@@ -54,3 +54,8 @@ rpal_build_call_state(const struct rpal_sender_data *rsd)
 	return ((rsd->rcd.service_id << RPAL_SID_SHIFT) |
 		(rsd->scc->sender_id << RPAL_ID_SHIFT) | RPAL_RECEIVER_STATE_CALL);
 }
+
+/* pku.c */
+int rpal_alloc_pkey(struct rpal_service *rs, int pkey);
+int rpal_pkey_setup(struct rpal_service *rs, int pkey);
+void rpal_service_pku_init(void);
diff --git a/arch/x86/rpal/pku.c b/arch/x86/rpal/pku.c
new file mode 100644
index 000000000000..4c5151ca5b8b
--- /dev/null
+++ b/arch/x86/rpal/pku.c
@@ -0,0 +1,47 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * RPAL service level operations
+ * Copyright (c) 2025, ByteDance. All rights reserved.
+ *
+ *     Author: Jiadong Sun
+ */
+
+#include
+#include
+
+#include "internal.h"
+
+void rpal_service_pku_init(void)
+{
+	u16 all_pkeys_mask = ((1U << arch_max_pkey()) - 1);
+	struct mm_struct *mm = current->mm;
+
+	/* We consume all pkeys so that no pkeys will be allocated by others */
+	mmap_write_lock(mm);
+	if (mm->context.pkey_allocation_map != 0x1)
+		rpal_err("pkey has been allocated: %u\n",
+			 mm->context.pkey_allocation_map);
+	mm->context.pkey_allocation_map = all_pkeys_mask;
+	mmap_write_unlock(mm);
+}
+
+int rpal_pkey_setup(struct rpal_service *rs, int pkey)
+{
+	int val;
+
+	val = rpal_pkey_to_pkru(pkey);
+	rs->pkey = pkey;
+	return 0;
+}
+
+int rpal_alloc_pkey(struct rpal_service *rs, int pkey)
+{
+	int ret;
+
+	if (pkey >= 0 && pkey < arch_max_pkey())
+		return pkey;
+
+	ret = rs->id % arch_max_pkey();
+
+	return ret;
+}
diff --git a/arch/x86/rpal/proc.c b/arch/x86/rpal/proc.c
index 16ac9612bfc5..2f9cceec4992 100644
--- a/arch/x86/rpal/proc.c
+++ b/arch/x86/rpal/proc.c
@@ -76,6 +76,11 @@ static long rpal_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 	case RPAL_IOCTL_RELEASE_SERVICE:
 		ret = rpal_release_service(arg);
 		break;
+#ifdef CONFIG_RPAL_PKU
+	case RPAL_IOCTL_GET_SERVICE_PKEY:
+		ret = put_user(cur->pkey, (int __user *)arg);
+		break;
+#endif
 	default:
 		return -EINVAL;
 	}
diff --git a/arch/x86/rpal/service.c b/arch/x86/rpal/service.c
index 16e94d710445..ca795dacc90d 100644
--- a/arch/x86/rpal/service.c
+++ b/arch/x86/rpal/service.c
@@ -208,6 +208,10 @@ struct rpal_service *rpal_register_service(void)
 	spin_lock_init(&rs->rpd.poll_lock);
 	bitmap_zero(rs->rpd.dead_key_bitmap, RPAL_NR_ID);
 	init_waitqueue_head(&rs->rpd.rpal_waitqueue);
+#ifdef CONFIG_RPAL_PKU
+	rs->pkey = -1;
+	rpal_service_pku_init();
+#endif
 
 	rs->bad_service = false;
 	rs->base = calculate_base_address(rs->id);
@@ -288,6 +292,9 @@ static int add_mapped_service(struct rpal_service *rs, struct rpal_service *tgt,
 	if (node->rs == NULL) {
 		node->rs = rpal_get_service(tgt);
 		set_bit(type_bit, &node->type);
+#ifdef CONFIG_RPAL_PKU
+		node->pkey = tgt->pkey;
+#endif
 	} else {
 		if (node->rs != tgt) {
 			ret = -EINVAL;
@@ -397,6 +404,19 @@ int rpal_request_service(unsigned long arg)
 		goto put_service;
 	}
 
+#ifdef CONFIG_RPAL_PKU
+	if (cur->pkey == tgt->pkey) {
+		ret = -EINVAL;
+		goto put_service;
+	}
+
+	ret = put_user(tgt->pkey, rra.pkey);
+	if (ret) {
+		ret = -EFAULT;
+		goto put_service;
+	}
+#endif
+
 	ret = put_user((unsigned long)(tgt->rsm.user_meta), rra.user_metap);
 	if (ret) {
 		ret = -EFAULT;
@@ -577,6 +597,10 @@ int rpal_enable_service(unsigned long arg)
 	mutex_lock(&cur->mutex);
 	if (!cur->enabled) {
 		cur->rsm = rsm;
+#ifdef CONFIG_RPAL_PKU
+		rsm.pkey = rpal_alloc_pkey(cur, rsm.pkey);
+		rpal_pkey_setup(cur, rsm.pkey);
+#endif
 		cur->enabled = true;
 	}
 	mutex_unlock(&cur->mutex);
diff --git a/include/linux/rpal.h b/include/linux/rpal.h
index 4f1d92053818..2f2982d281cc 100644
--- a/include/linux/rpal.h
+++ b/include/linux/rpal.h
@@ -97,6 +97,12 @@ enum {
 #define RPAL_ID_MASK (~(0 | RPAL_RECEIVER_STATE_MASK | RPAL_SID_MASK))
 #define RPAL_MAX_ID ((1 << (RPAL_SID_SHIFT - RPAL_ID_SHIFT)) - 1)
 
+#define RPAL_PKRU_BASE_CODE_READ 0xAAAAAAAA
+#define RPAL_PKRU_BASE_CODE 0xFFFFFFFF
+#define RPAL_PKRU_SET 0
+#define RPAL_PKRU_UNION 1
+#define RPAL_PKRU_INTERSECT 2
+
 extern unsigned long rpal_cap;
 
 enum rpal_task_flag_bits {
@@ -122,6 +128,10 @@ enum rpal_sender_state {
 	RPAL_SENDER_STATE_KERNEL_RET,
 };
 
+enum rpal_capability {
+	RPAL_CAP_PKU
+};
+
 struct rpal_critical_section {
 	unsigned long ret_begin;
 	unsigned long ret_end;
@@ -134,6 +144,7 @@ struct rpal_service_metadata {
 	unsigned long version;
 	void __user *user_meta;
 	struct rpal_critical_section rcs;
+	int pkey;
 };
 
 struct rpal_request_arg {
@@ -141,11 +152,17 @@ struct rpal_request_arg {
 	u64 key;
 	unsigned long __user *user_metap;
 	int __user *id;
+#ifdef CONFIG_RPAL_PKU
+	int __user *pkey;
+#endif
 };
 
 struct rpal_mapped_service {
 	unsigned long type;
 	struct rpal_service *rs;
+#ifdef CONFIG_RPAL_PKU
+	int pkey;
+#endif
 };
 
 struct rpal_poll_data {
@@ -220,6 +237,11 @@ struct rpal_service {
 	/* fsbase / pid map */
 	struct rpal_fsbase_tsk_map fs_tsk_map[RPAL_MAX_RECEIVER_NUM];
 
+#ifdef CONFIG_RPAL_PKU
+	/* pkey */
+	int pkey;
+#endif
+
 	/* delayed service put work */
 	struct delayed_work delayed_put_work;
 
@@ -323,6 +345,7 @@ enum rpal_command_type {
 	RPAL_CMD_DISABLE_SERVICE,
 	RPAL_CMD_REQUEST_SERVICE,
 	RPAL_CMD_RELEASE_SERVICE,
+	RPAL_CMD_GET_SERVICE_PKEY,
 	RPAL_NR_CMD,
 };
 
@@ -351,6 +374,8 @@ enum rpal_command_type {
 	_IOWR(RPAL_IOCTL_MAGIC, RPAL_CMD_REQUEST_SERVICE, unsigned long)
 #define RPAL_IOCTL_RELEASE_SERVICE \
 	_IOWR(RPAL_IOCTL_MAGIC, RPAL_CMD_RELEASE_SERVICE, unsigned long)
+#define RPAL_IOCTL_GET_SERVICE_PKEY \
+	_IOWR(RPAL_IOCTL_MAGIC, RPAL_CMD_GET_SERVICE_PKEY, int *)
 
 #define rpal_for_each_requested_service(rs, idx)                             \
 	for (idx = find_first_bit(rs->requested_service_bitmap, RPAL_NR_ID); \
@@ -420,6 +445,47 @@ static inline bool rpal_is_correct_address(struct rpal_service *rs, unsigned lon
 	return true;
 }
 
+static inline void rpal_set_cap(unsigned long cap)
+{
+	set_bit(cap, &rpal_cap);
+}
+
+static inline void rpal_clear_cap(unsigned long cap)
+{
+	clear_bit(cap, &rpal_cap);
+}
+
+static inline bool rpal_has_cap(unsigned long cap)
+{
+	return test_bit(cap, &rpal_cap);
+}
+
+static inline u32 rpal_pkey_to_pkru(int pkey)
+{
+	int offset = pkey * 2;
+	u32 mask = 0x3 << offset;
+
+	return RPAL_PKRU_BASE_CODE & ~mask;
+}
+
+static inline u32 rpal_pkey_to_pkru_read(int pkey)
+{
+	int offset = pkey * 2;
+	u32 mask = 0x3 << offset;
+
+	return RPAL_PKRU_BASE_CODE_READ & ~mask;
+}
+
+static inline u32 rpal_pkru_union(u32 pkru0, u32 pkru1)
+{
+	return pkru0 & pkru1;
+}
+
+static inline u32 rpal_pkru_intersect(u32 pkru0, u32 pkru1)
+{
+	return pkru0 | pkru1;
+}
+
 #ifdef CONFIG_RPAL
 static inline struct rpal_service *rpal_current_service(void)
 {
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 62c1f7945741..982f911ffaba 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -33,6 +33,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -895,6 +896,14 @@ SYSCALL_DEFINE1(pkey_free, int, pkey)
 {
 	int ret;
 
+#ifdef CONFIG_RPAL_PKU
+	if (rpal_current_service()) {
+		rpal_err("try_to_free pkey: %d %s\n", current->pid,
+			 current->comm);
+		return -EINVAL;
+	}
+#endif
+
 	mmap_write_lock(current->mm);
 	ret = mm_pkey_free(current->mm, pkey);
 	mmap_write_unlock(current->mm);
-- 
2.20.1
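
Note for readers checking the PKRU arithmetic (not part of the patch): the user-space sketch below mirrors the rpal_pkey_to_pkru() helper family and the fallback in rpal_alloc_pkey(). Every demo_ and DEMO_ name is hypothetical. It assumes 16 pkeys, which is what arch_max_pkey() is expected to return on x86 with pkeys enabled, and the x86 PKRU layout of two bits per key (access-disable in the low bit, write-disable in the high bit). It also uses an unsigned 0x3u for the shift, which differs slightly from the patch.

```c
#include <stdint.h>

#define DEMO_MAX_PKEY            16          /* assumption: 16 keys on x86 */
#define DEMO_PKRU_BASE_CODE      0xFFFFFFFFu /* every key: no access */
#define DEMO_PKRU_BASE_CODE_READ 0xAAAAAAAAu /* every key: read-only (WD set) */

/* Deny all keys, then clear both bits of `pkey` so only that key is usable. */
static inline uint32_t demo_pkey_to_pkru(int pkey)
{
	uint32_t mask = 0x3u << (pkey * 2);

	return DEMO_PKRU_BASE_CODE & ~mask;
}

/* All keys read-only, except `pkey`, which becomes readable and writable. */
static inline uint32_t demo_pkey_to_pkru_read(int pkey)
{
	uint32_t mask = 0x3u << (pkey * 2);

	return DEMO_PKRU_BASE_CODE_READ & ~mask;
}

/*
 * PKRU bits are *deny* bits, so the union of two permission sets is the
 * bitwise AND of the registers, and the intersection is the bitwise OR.
 */
static inline uint32_t demo_pkru_union(uint32_t a, uint32_t b)
{
	return a & b;
}

static inline uint32_t demo_pkru_intersect(uint32_t a, uint32_t b)
{
	return a | b;
}

/* Prefer a valid user-supplied pkey; otherwise derive one from the id. */
static inline int demo_alloc_pkey(int service_id, int requested)
{
	if (requested >= 0 && requested < DEMO_MAX_PKEY)
		return requested;

	return service_id % DEMO_MAX_PKEY;
}
```

Under these assumptions, demo_pkey_to_pkru(3) yields 0xFFFFFF3F, the union of the values for keys 3 and 5 yields 0xFFFFF33F, and demo_alloc_pkey(21, -1) falls back to 5. The fallback only spreads services across keys: two services whose ids differ by a multiple of 16 still get the same pkey, which rpal_request_service() rejects with -EINVAL.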