Date: Tue, 27 Dec 2022 12:13:52 -0000
From: "tip-bot2 for Mathieu Desnoyers"
Sender: tip-bot2@linutronix.de
Reply-to: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Subject: [tip: sched/core] rseq: Introduce extensible rseq ABI
Cc: Mathieu Desnoyers, "Peter Zijlstra (Intel)", x86@kernel.org,
    linux-kernel@vger.kernel.org
In-Reply-To: <20221122203932.231377-4-mathieu.desnoyers@efficios.com>
References: <20221122203932.231377-4-mathieu.desnoyers@efficios.com>
Message-ID: <167214323246.4906.2743738647026767672.tip-bot2@tip-bot2>
X-Mailing-List: linux-kernel@vger.kernel.org

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     ee3e3ac05c2631ce1f12d88c9cc9a092f8fe947a
Gitweb:        https://git.kernel.org/tip/ee3e3ac05c2631ce1f12d88c9cc9a092f8fe947a
Author:        Mathieu Desnoyers
AuthorDate:    Tue, 22 Nov 2022 15:39:05 -05:00
Committer:     Peter Zijlstra
CommitterDate: Tue, 27 Dec 2022 12:52:10 +01:00

rseq: Introduce extensible rseq ABI

Introduce the extensible rseq ABI, where the feature size supported by
the kernel and the required alignment are communicated to user-space
through ELF auxiliary vectors.

This allows user-space to call rseq registration with a rseq_len of
either 32 bytes for the original struct rseq size (which includes
padding), or larger.

If rseq_len is larger than 32 bytes, then it must be large enough to
contain the feature size communicated to user-space through ELF
auxiliary vectors.

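For illustration only (not part of the commit): a minimal user-space
sketch of how a caller might choose rseq_len and the alignment under the
rules described above. The helper names (rseq_pick_len, rseq_pick_align,
rseq_args_valid) are hypothetical. The numeric fallbacks for
AT_RSEQ_FEATURE_SIZE and AT_RSEQ_ALIGN are assumed to match
include/uapi/linux/auxvec.h, and the validity check assumes the auxv
values equal the offsetof/__alignof__ values the kernel tests against.

```c
#include <stdint.h>
#include <sys/auxv.h>	/* getauxval(), Linux with glibc */

/* Assumed values, used only if the installed headers lack them. */
#ifndef AT_RSEQ_FEATURE_SIZE
#define AT_RSEQ_FEATURE_SIZE	27
#endif
#ifndef AT_RSEQ_ALIGN
#define AT_RSEQ_ALIGN		28
#endif

/* Original struct rseq size, padding included (from the commit). */
#define ORIG_RSEQ_SIZE		32u

/*
 * Hypothetical helper: choose the rseq_len to register with.
 * feature_size == 0 means the kernel published no auxv entry,
 * so fall back to the original 32-byte layout.
 */
static inline uint32_t rseq_pick_len(unsigned long feature_size)
{
	if (feature_size <= ORIG_RSEQ_SIZE)
		return ORIG_RSEQ_SIZE;
	return (uint32_t)feature_size;
}

/*
 * Hypothetical helper: choose the alignment for the rseq area.
 * The original size always uses the original 32-byte alignment.
 */
static inline unsigned long rseq_pick_align(uint32_t len,
					    unsigned long auxv_align)
{
	if (len == ORIG_RSEQ_SIZE || auxv_align == 0)
		return ORIG_RSEQ_SIZE;
	return auxv_align;
}

/*
 * Hypothetical user-space mirror of the registration check in the
 * patch: returns 1 if (addr, len) would pass, 0 if it would be
 * rejected with -EINVAL.
 */
static inline int rseq_args_valid(uintptr_t addr, uint32_t len,
				  unsigned long feature_size,
				  unsigned long align)
{
	if (len < ORIG_RSEQ_SIZE)
		return 0;
	if (len == ORIG_RSEQ_SIZE)
		return addr % ORIG_RSEQ_SIZE == 0;
	return align != 0 && addr % align == 0 && len >= feature_size;
}

/* Query the running kernel; getauxval() returns 0 for unknown types. */
static inline uint32_t rseq_len_from_auxv(void)
{
	return rseq_pick_len(getauxval(AT_RSEQ_FEATURE_SIZE));
}
```

A caller would typically allocate rseq_len_from_auxv() bytes aligned to
rseq_pick_align(len, getauxval(AT_RSEQ_ALIGN)) and pass that length to
the rseq system call. On kernels without these auxv entries both
queries return 0 and the sketch falls back to the original 32-byte
registration.
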
Signed-off-by: Mathieu Desnoyers
Signed-off-by: Peter Zijlstra (Intel)
Link: https://lore.kernel.org/r/20221122203932.231377-4-mathieu.desnoyers@efficios.com
---
 include/linux/sched.h |  4 ++++
 kernel/ptrace.c       |  2 +-
 kernel/rseq.c         | 37 ++++++++++++++++++++++++++++++-------
 3 files changed, 35 insertions(+), 8 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 853d08f..e0bc020 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1302,6 +1302,7 @@ struct task_struct {
 
 #ifdef CONFIG_RSEQ
 	struct rseq __user *rseq;
+	u32 rseq_len;
 	u32 rseq_sig;
 	/*
 	 * RmW on rseq_event_mask must be performed atomically
@@ -2352,10 +2353,12 @@ static inline void rseq_fork(struct task_struct *t, unsigned long clone_flags)
 {
 	if (clone_flags & CLONE_VM) {
 		t->rseq = NULL;
+		t->rseq_len = 0;
 		t->rseq_sig = 0;
 		t->rseq_event_mask = 0;
 	} else {
 		t->rseq = current->rseq;
+		t->rseq_len = current->rseq_len;
 		t->rseq_sig = current->rseq_sig;
 		t->rseq_event_mask = current->rseq_event_mask;
 	}
@@ -2364,6 +2367,7 @@ static inline void rseq_fork(struct task_struct *t, unsigned long clone_flags)
 static inline void rseq_execve(struct task_struct *t)
 {
 	t->rseq = NULL;
+	t->rseq_len = 0;
 	t->rseq_sig = 0;
 	t->rseq_event_mask = 0;
 }
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 5448219..0786450 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -813,7 +813,7 @@ static long ptrace_get_rseq_configuration(struct task_struct *task,
 {
 	struct ptrace_rseq_configuration conf = {
 		.rseq_abi_pointer = (u64)(uintptr_t)task->rseq,
-		.rseq_abi_size = sizeof(*task->rseq),
+		.rseq_abi_size = task->rseq_len,
 		.signature = task->rseq_sig,
 		.flags = 0,
 	};
diff --git a/kernel/rseq.c b/kernel/rseq.c
index d38ab94..7962738 100644
--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -18,6 +18,9 @@
 #define CREATE_TRACE_POINTS
 #include
 
+/* The original rseq structure size (including padding) is 32 bytes. */
+#define ORIG_RSEQ_SIZE		32
+
 #define RSEQ_CS_NO_RESTART_FLAGS (RSEQ_CS_FLAG_NO_RESTART_ON_PREEMPT | \
 				  RSEQ_CS_FLAG_NO_RESTART_ON_SIGNAL | \
 				  RSEQ_CS_FLAG_NO_RESTART_ON_MIGRATE)
@@ -87,10 +90,15 @@ static int rseq_update_cpu_id(struct task_struct *t)
 	u32 cpu_id = raw_smp_processor_id();
 	struct rseq __user *rseq = t->rseq;
 
-	if (!user_write_access_begin(rseq, sizeof(*rseq)))
+	if (!user_write_access_begin(rseq, t->rseq_len))
 		goto efault;
 	unsafe_put_user(cpu_id, &rseq->cpu_id_start, efault_end);
 	unsafe_put_user(cpu_id, &rseq->cpu_id, efault_end);
+	/*
+	 * Additional feature fields added after ORIG_RSEQ_SIZE
+	 * need to be conditionally updated only if
+	 * t->rseq_len != ORIG_RSEQ_SIZE.
+	 */
 	user_write_access_end();
 	trace_rseq_update(t);
 	return 0;
@@ -117,6 +125,11 @@ static int rseq_reset_rseq_cpu_id(struct task_struct *t)
 	 */
 	if (put_user(cpu_id, &t->rseq->cpu_id))
 		return -EFAULT;
+	/*
+	 * Additional feature fields added after ORIG_RSEQ_SIZE
+	 * need to be conditionally reset only if
+	 * t->rseq_len != ORIG_RSEQ_SIZE.
+	 */
 	return 0;
 }
 
@@ -344,7 +357,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 		/* Unregister rseq for current thread. */
 		if (current->rseq != rseq || !current->rseq)
 			return -EINVAL;
-		if (rseq_len != sizeof(*rseq))
+		if (rseq_len != current->rseq_len)
 			return -EINVAL;
 		if (current->rseq_sig != sig)
 			return -EPERM;
@@ -353,6 +366,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 			return ret;
 		current->rseq = NULL;
 		current->rseq_sig = 0;
+		current->rseq_len = 0;
 		return 0;
 	}
 
@@ -365,7 +379,7 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 		 * the provided address differs from the prior
 		 * one.
 		 */
-		if (current->rseq != rseq || rseq_len != sizeof(*rseq))
+		if (current->rseq != rseq || rseq_len != current->rseq_len)
 			return -EINVAL;
 		if (current->rseq_sig != sig)
 			return -EPERM;
@@ -374,15 +388,24 @@ SYSCALL_DEFINE4(rseq, struct rseq __user *, rseq, u32, rseq_len,
 	}
 
 	/*
-	 * If there was no rseq previously registered,
-	 * ensure the provided rseq is properly aligned and valid.
+	 * If there was no rseq previously registered, ensure the provided rseq
+	 * is properly aligned, as communcated to user-space through the ELF
+	 * auxiliary vector AT_RSEQ_ALIGN. If rseq_len is the original rseq
+	 * size, the required alignment is the original struct rseq alignment.
+	 *
+	 * In order to be valid, rseq_len is either the original rseq size, or
+	 * large enough to contain all supported fields, as communicated to
+	 * user-space through the ELF auxiliary vector AT_RSEQ_FEATURE_SIZE.
 	 */
-	if (!IS_ALIGNED((unsigned long)rseq, __alignof__(*rseq)) ||
-	    rseq_len != sizeof(*rseq))
+	if (rseq_len < ORIG_RSEQ_SIZE ||
+	    (rseq_len == ORIG_RSEQ_SIZE && !IS_ALIGNED((unsigned long)rseq, ORIG_RSEQ_SIZE)) ||
+	    (rseq_len != ORIG_RSEQ_SIZE && (!IS_ALIGNED((unsigned long)rseq, __alignof__(*rseq)) ||
+					    rseq_len < offsetof(struct rseq, end))))
 		return -EINVAL;
 	if (!access_ok(rseq, rseq_len))
 		return -EFAULT;
 	current->rseq = rseq;
+	current->rseq_len = rseq_len;
 	current->rseq_sig = sig;
 	/*
 	 * If rseq was previously inactive, and has just been