From nobody Tue Dec 16 16:39:28 2025
From: "Xin Li (Intel)"
To: linux-kernel@vger.kernel.org
Cc: luto@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, peterz@infradead.org, brgerst@gmail.com
Subject: [PATCH v1 1/3] x86/fred: Allow variable-sized event frame
Date: Mon, 17 Jun 2024 01:45:13 -0700
Message-ID: <20240617084516.1484390-2-xin@zytor.com>
X-Mailer: git-send-email 2.45.1
In-Reply-To: <20240617084516.1484390-1-xin@zytor.com>
References: <20240617084516.1484390-1-xin@zytor.com>

A FRED event frame could contain different amounts of information for
different event types, or perhaps even for different instances of the
same event type. Thus the size of an event frame pushed by a FRED CPU
is not fixed, and the address of the pt_regs structure that is used to
save the user level context of the current task is not at a fixed
offset from the top of the current task's kernel stack.

Add a new field named 'user_pt_regs' to the thread_info structure to
save the address of the user level context pt_regs structure. This
eliminates the need for any advance knowledge of the event frame size
and allows a FRED CPU to push a variable-sized event frame.
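
To make the difference concrete, here is a minimal user-space sketch, not
kernel code, of the two ways of locating pt_regs described above. Every
model_* name, the 4 KiB stack size, and the 21-slot register block are
assumptions made for illustration only; the sketch compares addresses and
never dereferences them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_THREAD_SIZE 4096u	/* assumed stack size, for the model only */

/* Stand-in for struct pt_regs; only its size matters here. */
struct model_pt_regs {
	uint64_t slot[21];
};

struct model_task {
	uint64_t stack[MODEL_THREAD_SIZE / sizeof(uint64_t)];
	struct model_pt_regs *user_pt_regs;	/* mirrors thread_info.user_pt_regs */
};

static inline uintptr_t model_stack_top(struct model_task *t)
{
	return (uintptr_t)t->stack + MODEL_THREAD_SIZE;
}

/* Old scheme: assume pt_regs sits at a fixed offset below the stack top. */
static inline struct model_pt_regs *
model_fixed_offset_pt_regs(struct model_task *t, size_t padding)
{
	return (struct model_pt_regs *)(model_stack_top(t) - padding) - 1;
}

/* New scheme: return whatever address the entry code recorded. */
static inline struct model_pt_regs *model_task_pt_regs(struct model_task *t)
{
	return t->user_pt_regs;
}

/*
 * Model an event delivery that places 'extra' bytes of event data above
 * pt_regs and then records the pt_regs address, in the way the patch
 * records 'regs' in fred_entry_from_user().
 */
static inline struct model_pt_regs *
model_deliver_event(struct model_task *t, size_t extra)
{
	struct model_pt_regs *regs;

	assert(extra % sizeof(uint64_t) == 0);
	regs = (struct model_pt_regs *)(model_stack_top(t) - extra) - 1;
	t->user_pt_regs = regs;
	return regs;
}
```

In this model, a delivery with 16 extra bytes is found by both schemes when
the fixed padding is also 16, but a delivery with 32 extra bytes is found
only through the recorded pointer.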

For IDT user level event delivery, a pt_regs structure is _always_
pushed by hardware and software at a fixed offset from the top of the
current task's kernel stack, so simply initialize user_pt_regs to point
to the pt_regs structure no matter whether one is pushed or not.

For FRED user level event delivery, user_pt_regs is updated with the
pt_regs structure pointer generated in asm_fred_entrypoint_user().

Suggested-by: H. Peter Anvin (Intel)
Signed-off-by: Xin Li (Intel)
---
 arch/x86/entry/entry_fred.c        | 22 ++++++++++++++++++++++
 arch/x86/include/asm/processor.h   | 18 ++++++++++++------
 arch/x86/include/asm/thread_info.h |  9 ++++++---
 arch/x86/kernel/process.c          | 22 ++++++++++++++++++++++
 include/linux/thread_info.h        |  1 +
 kernel/fork.c                      |  6 ++++++
 6 files changed, 69 insertions(+), 9 deletions(-)

diff --git a/arch/x86/entry/entry_fred.c b/arch/x86/entry/entry_fred.c
index f004a4dc74c2..1d54d451acb6 100644
--- a/arch/x86/entry/entry_fred.c
+++ b/arch/x86/entry/entry_fred.c
@@ -228,6 +228,28 @@ __visible noinstr void fred_entry_from_user(struct pt_regs *regs)
 	/* Invalidate orig_ax so that syscall_get_nr() works correctly */
 	regs->orig_ax = -1;
 
+	/*
+	 * A FRED event frame could contain different amount of information
+	 * for different event types, or perhaps even for different instances
+	 * of the same event type. Thus the size of an event frame pushed by
+	 * a FRED CPU is not fixed and the address of the pt_regs structure
+	 * that is used to save a user level context of current task is not
+	 * at a fixed offset from top of current task stack.
+	 *
+	 * Save the address of the pt_regs structure passed from and generated
+	 * in the caller function asm_fred_entrypoint_user() in thread_info so
+	 * that task_pt_regs() can be used to access the pt_regs structure
+	 * containing user level context after this point.
+	 *
+	 * What if another event happens before this point?
+	 *
+	 * Actually, another kernel event could happen earlier, even before the
+	 * pt_regs structure for saving user level context is completely saved.
+	 * It is guaranteed that the handler of the new event will NOT access
+	 * the pt_regs structure of the previous user level event.
+	 */
+	current->thread_info.user_pt_regs = regs;
+
 	switch (regs->fred_ss.type) {
 	case EVENT_TYPE_EXTINT:
 		return fred_extint(regs);
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index bd0621210f63..ea7733e7bf1d 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -634,12 +634,18 @@ static __always_inline void prefetchw(const void *x)
 
 #define task_top_of_stack(task) ((unsigned long)(task_pt_regs(task) + 1))
 
-#define task_pt_regs(task) \
-({	\
-	unsigned long __ptr = (unsigned long)task_stack_page(task);	\
-	__ptr += THREAD_SIZE - TOP_OF_KERNEL_STACK_PADDING;	\
-	((struct pt_regs *)__ptr) - 1;	\
-})
+/*
+ * task_pt_regs() no longer converts a fixed offset from top of a task
+ * kernel stack to a pt_regs structure pointer, but rather returns
+ * whatever in the thread_info.user_pt_regs field, which contains the
+ * address of a pt_regs structure used to save a user level context of
+ * current task.
+ *
+ * Note, this can't be converted to an inline function as this header
+ * file defines 'struct thread_struct' which is used in the task_struct
+ * structure definition.
+ */
+#define task_pt_regs(task) ((task)->thread_info.user_pt_regs)
 
 #ifdef CONFIG_X86_32
 #define INIT_THREAD  {	\
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 12da7dfd5ef1..326268d440cf 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -56,6 +56,7 @@
  */
 #ifndef __ASSEMBLY__
 struct task_struct;
+struct pt_regs;
 #include
 #include
 
@@ -66,11 +67,13 @@ struct thread_info {
 #ifdef CONFIG_SMP
 	u32			cpu;		/* current CPU */
 #endif
+	struct pt_regs		*user_pt_regs;
 };
 
-#define INIT_THREAD_INFO(tsk)	\
-{	\
-	.flags		= 0,	\
+#define INIT_THREAD_INFO(tsk)	\
+{	\
+	.flags		= 0,	\
+	.user_pt_regs	= (struct pt_regs *)TOP_OF_INIT_STACK - 1,	\
 }
 
 #else /* !__ASSEMBLY__ */
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 0c63035d8164..787a402e4ead 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -100,6 +100,28 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 	return 0;
 }
 
+/*
+ * Initialize thread_info.user_pt_regs for IDT event delivery.
+ *
+ * For IDT user level event delivery, a pt_regs structure is pushed by both
+ * hardware and software and always resides at a fixed offset from top of
+ * current task kernel stack, thus thread_info.user_pt_regs is a per-task
+ * constant and NEVER changes after initialization.
+ *
+ * While for FRED user level event delivery, user_pt_regs is updated in
+ * fred_entry_from_user() immediately after user level event delivery.
+ *
+ * Note: thread_info.user_pt_regs of the init task is initialized at build
+ * time.
+ */
+void arch_init_user_pt_regs(struct task_struct *tsk)
+{
+	unsigned long top_of_stack = (unsigned long)task_stack_page(tsk) + THREAD_SIZE;
+
+	top_of_stack -= TOP_OF_KERNEL_STACK_PADDING;
+	tsk->thread_info.user_pt_regs = (struct pt_regs *)top_of_stack - 1;
+}
+
 #ifdef CONFIG_X86_64
 void arch_release_task_struct(struct task_struct *tsk)
 {
diff --git a/include/linux/thread_info.h b/include/linux/thread_info.h
index 9ea0b28068f4..5b2a75a19a07 100644
--- a/include/linux/thread_info.h
+++ b/include/linux/thread_info.h
@@ -260,6 +260,7 @@ void arch_task_cache_init(void); /* for CONFIG_SH */
 void arch_release_task_struct(struct task_struct *tsk);
 int arch_dup_task_struct(struct task_struct *dst,
 				struct task_struct *src);
+void arch_init_user_pt_regs(struct task_struct *tsk);
 
 #endif	/* __KERNEL__ */
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 99076dbe27d8..c4198599a7d4 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1089,6 +1089,10 @@ int __weak arch_dup_task_struct(struct task_struct *dst,
 	return 0;
 }
 
+void __weak arch_init_user_pt_regs(struct task_struct *tsk)
+{
+}
+
 void set_task_stack_end_magic(struct task_struct *tsk)
 {
 	unsigned long *stackend;
@@ -1116,6 +1120,8 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
 	if (err)
 		goto free_tsk;
 
+	arch_init_user_pt_regs(tsk);
+
 #ifdef CONFIG_THREAD_INFO_IN_TASK
 	refcount_set(&tsk->stack_refcount, 1);
 #endif
-- 
2.45.1

From nobody Tue Dec 16 16:39:28 2025
From: "Xin Li (Intel)"
To: linux-kernel@vger.kernel.org
Cc: luto@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, peterz@infradead.org, brgerst@gmail.com
Subject: [PATCH v1 2/3] x86: Remove the padding space at top of the init stack
Date: Mon, 17 Jun 2024 01:45:14 -0700
Message-ID: <20240617084516.1484390-3-xin@zytor.com>
X-Mailer: git-send-email 2.45.1
In-Reply-To: <20240617084516.1484390-1-xin@zytor.com>
References: <20240617084516.1484390-1-xin@zytor.com>

Because the owner of the init stack, the init task, doesn't have any
user level context, there will NEVER be an actual pt_regs structure
pushed at the top of the init stack.

However, a zeroed pt_regs structure is created at build time and kept
at the top of the init stack so that task_pt_regs() works with the init
task in the same manner as with a normal task that has user level
context.

Besides, task_pt_regs() no longer converts a fixed offset from the top
of a task's kernel stack into a pt_regs structure pointer, but rather
returns whatever is in the thread_info.user_pt_regs field, which is
initialized at build time to '(struct pt_regs *)TOP_OF_INIT_STACK - 1'
for the init task.

As a result, there is no point in reserving any padding space at the
top of the init stack, so remove it.
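
As a rough check of the reasoning above, the user-space sketch below models
the init stack as a plain array and compares the C-side expression
'(struct pt_regs *)TOP_OF_INIT_STACK - 1' with the linker-script-side
expression '__end_init_stack - PTREGS_SIZE'. The model_* names, the 16 KiB
stack size, and the 21-slot stand-in for struct pt_regs are assumptions for
illustration, not values taken from the kernel build.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for struct pt_regs; 21 eight-byte slots is an assumption. */
struct model_pt_regs {
	uint64_t slot[21];
};

#define MODEL_THREAD_SIZE 16384u	/* assumed init stack size in bytes */

/* Models the init stack that INIT_TASK_DATA(THREAD_SIZE) reserves. */
static uint64_t model_init_stack[MODEL_THREAD_SIZE / sizeof(uint64_t)];

/* Models the linker symbol __end_init_stack. */
static inline uintptr_t model_end_init_stack(void)
{
	return (uintptr_t)model_init_stack + sizeof(model_init_stack);
}

/* C side: '(struct pt_regs *)TOP_OF_INIT_STACK - 1'. */
static inline struct model_pt_regs *model_init_task_pt_regs(void)
{
	return (struct model_pt_regs *)model_end_init_stack() - 1;
}

/* Linker script side: '__end_init_stack - PTREGS_SIZE'. */
static inline uintptr_t model_top_init_kernel_stack(void)
{
	return model_end_init_stack() - sizeof(struct model_pt_regs);
}
```

With no padding left, both expressions name the same address, and the
zeroed pt_regs occupies exactly the last sizeof(struct model_pt_regs) bytes
of the modeled stack.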

Signed-off-by: Xin Li (Intel)
---
 arch/x86/include/asm/processor.h | 16 ++++++++++++++--
 arch/x86/kernel/vmlinux.lds.S    | 18 ++++++++++++++++--
 2 files changed, 30 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index ea7733e7bf1d..91803844c4d7 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -629,8 +629,20 @@ static __always_inline void prefetchw(const void *x)
 			  "m" (*(const char *)x));
 }
 
-#define TOP_OF_INIT_STACK ((unsigned long)&init_stack + sizeof(init_stack) - \
-			   TOP_OF_KERNEL_STACK_PADDING)
+extern unsigned long __end_init_stack[];
+
+/*
+ * No need to reserve extra padding space above the pt_regs structure
+ * at top of the init stack, because its owner init task doesn't have
+ * any user level context, thus there will NEVER be an actual pt_regs
+ * structure pushed at top of the init stack.
+ *
+ * However a zeroed pt_regs structure is created at build time and kept
+ * at top of the init stack for task_pt_regs() to function properly with
+ * the init task in the same manner as a normal task with user level
+ * context.
+ */
+#define TOP_OF_INIT_STACK ((unsigned long)&__end_init_stack)
 
 #define task_top_of_stack(task) ((unsigned long)(task_pt_regs(task) + 1))
 
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 3509afc6a672..b440928191c4 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -167,8 +167,22 @@ SECTIONS
 		/* init_task */
 		INIT_TASK_DATA(THREAD_SIZE)
 
-		/* equivalent to task_pt_regs(&init_task) */
-		__top_init_kernel_stack = __end_init_stack - TOP_OF_KERNEL_STACK_PADDING - PTREGS_SIZE;
+		/*
+		 * No need to reserve extra padding space above the pt_regs
+		 * structure at top of the init stack, because its owner
+		 * init task doesn't have any user level context, thus there
+		 * will NEVER be an actual pt_regs structure pushed at top
+		 * of the init stack.
+		 *
+		 * However a zeroed pt_regs structure is created at build
+		 * time and kept at top of the init stack for task_pt_regs()
+		 * to function properly with the init task in the same manner
+		 * as a normal task with user level context.
+		 *
+		 * task_pt_regs(&init_task) is now:
+		 *	'(struct pt_regs *)&__end_init_stack - 1'
+		 */
+		__top_init_kernel_stack = __end_init_stack - PTREGS_SIZE;
 
 #ifdef CONFIG_X86_32
 		/* 32 bit has nosave before _edata */
-- 
2.45.1

From nobody Tue Dec 16 16:39:28 2025
From: "Xin Li (Intel)"
To: linux-kernel@vger.kernel.org
Cc: luto@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, peterz@infradead.org, brgerst@gmail.com
Subject: [PATCH v1 3/3] x86: Get rid of TOP_OF_KERNEL_STACK_PADDING on x86_64
Date: Mon, 17 Jun 2024 01:45:15 -0700
Message-ID: <20240617084516.1484390-4-xin@zytor.com>
X-Mailer: git-send-email 2.45.1
In-Reply-To: <20240617084516.1484390-1-xin@zytor.com>
References: <20240617084516.1484390-1-xin@zytor.com>

task_pt_regs() is now just an alias of thread_info.user_pt_regs, and,
whether or not FRED is enabled, a user level event frame on x86_64 is
always pushed from the top of the current task's kernel stack, i.e.,
'(unsigned long)task_stack_page(task) + THREAD_SIZE'. There is
therefore no point in keeping TOP_OF_KERNEL_STACK_PADDING on x86_64,
so remove it.

Signed-off-by: Xin Li (Intel)
---
 arch/x86/include/asm/processor.h   |  6 ++++--
 arch/x86/include/asm/switch_to.h   |  2 +-
 arch/x86/include/asm/thread_info.h | 10 ----------
 arch/x86/kernel/process.c          |  3 +--
 4 files changed, 6 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 91803844c4d7..9c5294f6d923 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -644,8 +644,6 @@ extern unsigned long __end_init_stack[];
  */
 #define TOP_OF_INIT_STACK ((unsigned long)&__end_init_stack)
 
-#define task_top_of_stack(task) ((unsigned long)(task_pt_regs(task) + 1))
-
 /*
  * task_pt_regs() no longer converts a fixed offset from top of a task
  * kernel stack to a pt_regs structure pointer, but rather returns
@@ -660,6 +658,9 @@ extern unsigned long __end_init_stack[];
 #define task_pt_regs(task) ((task)->thread_info.user_pt_regs)
 
 #ifdef CONFIG_X86_32
+#define task_top_of_stack(task) ((unsigned long)task_stack_page(task) + THREAD_SIZE \
+				 - TOP_OF_KERNEL_STACK_PADDING)
+
 #define INIT_THREAD  {	\
 	.sp0			= TOP_OF_INIT_STACK,	\
 	.sysenter_cs		= __KERNEL_CS,	\
@@ -669,6 +670,7 @@ extern unsigned long __end_init_stack[];
 
 #else
 extern unsigned long __top_init_kernel_stack[];
+#define task_top_of_stack(task) ((unsigned long)task_stack_page(task) + THREAD_SIZE)
 
 #define INIT_THREAD {	\
 	.sp	= (unsigned long)&__top_init_kernel_stack,	\
diff --git a/arch/x86/include/asm/switch_to.h b/arch/x86/include/asm/switch_to.h
index c3bd0c0758c9..902f1612ef3f 100644
--- a/arch/x86/include/asm/switch_to.h
+++ b/arch/x86/include/asm/switch_to.h
@@ -72,7 +72,7 @@ static inline void update_task_stack(struct task_struct *task)
 #else
 	if (cpu_feature_enabled(X86_FEATURE_FRED)) {
 		/* WRMSRNS is a baseline feature for FRED. */
-		wrmsrns(MSR_IA32_FRED_RSP0, (unsigned long)task_stack_page(task) + THREAD_SIZE);
+		wrmsrns(MSR_IA32_FRED_RSP0, task_top_of_stack(task));
 	} else if (cpu_feature_enabled(X86_FEATURE_XENPV)) {
 		/* Xen PV enters the kernel on the thread stack. */
 		load_sp0(task_top_of_stack(task));
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 326268d440cf..331a6f32a0be 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -30,10 +30,6 @@
  *
  * In vm86 mode, the hardware frame is much longer still, so add 16
  * bytes to make room for the real-mode segments.
- *
- * x86-64 has a fixed-length stack frame, but it depends on whether
- * or not FRED is enabled. Future versions of FRED might make this
- * dynamic, but for now it is always 2 words longer.
  */
 #ifdef CONFIG_X86_32
 # ifdef CONFIG_VM86
@@ -41,12 +37,6 @@
 # else
 #  define TOP_OF_KERNEL_STACK_PADDING 8
 # endif
-#else /* x86-64 */
-# ifdef CONFIG_X86_FRED
-#  define TOP_OF_KERNEL_STACK_PADDING (2 * 8)
-# else
-#  define TOP_OF_KERNEL_STACK_PADDING 0
-# endif
 #endif
 
 /*
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 787a402e4ead..99f9887f710e 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -116,9 +116,8 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
  */
 void arch_init_user_pt_regs(struct task_struct *tsk)
 {
-	unsigned long top_of_stack = (unsigned long)task_stack_page(tsk) + THREAD_SIZE;
+	unsigned long top_of_stack = task_top_of_stack(tsk);
 
-	top_of_stack -= TOP_OF_KERNEL_STACK_PADDING;
 	tsk->thread_info.user_pt_regs = (struct pt_regs *)top_of_stack - 1;
 }
 
-- 
2.45.1
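
For readers who want to check the arithmetic behind patch 3, the following
user-space sketch models task_top_of_stack() and the initial user_pt_regs
value as plain integer computations. The model_* helpers and their
parameters are hypothetical; the stack size, padding, and pt_regs size are
passed in by the caller rather than taken from any kernel configuration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Models task_top_of_stack(): stack base plus stack size, minus the
 * padding that remains only on 32-bit after this series (0 on x86_64).
 */
static inline uintptr_t model_task_top_of_stack(uintptr_t stack_page,
						 size_t thread_size,
						 size_t padding)
{
	assert(padding <= thread_size);
	return stack_page + thread_size - padding;
}

/*
 * Models arch_init_user_pt_regs() after patch 3: one pt_regs structure
 * below the top of the stack, i.e. '(struct pt_regs *)top_of_stack - 1'.
 */
static inline uintptr_t model_init_user_pt_regs(uintptr_t top_of_stack,
						 size_t ptregs_size)
{
	return top_of_stack - ptregs_size;
}
```

Calling model_task_top_of_stack() with a padding of 0 gives the same value
the x86_64 path now writes to MSR_IA32_FRED_RSP0, while a padding of 16
reproduces the old CONFIG_X86_FRED layout, 16 bytes lower.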