From nobody Sun Jun 28 00:10:28 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5F682C433EF for ; Thu, 17 Feb 2022 10:24:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239144AbiBQKYf (ORCPT ); Thu, 17 Feb 2022 05:24:35 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:60680 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233219AbiBQKY0 (ORCPT ); Thu, 17 Feb 2022 05:24:26 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 013E9279085; Thu, 17 Feb 2022 02:24:12 -0800 (PST) From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1645093450; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OqTj/MWZ/KJ2RN1mgDHy03k6dnz1y2VoL1uj4/vvUAQ=; b=M2lB99miPqxH2UH/lhrIfA/YEHH0IVoBZN1SLGauILXE8Lrb3NbRJrCW1A1sUNnw6Ygu9M t5zOIphs9JUeJvNalDD3DWNloYDBGKd7wcG6kdvfzEHGpfROuCoZTmV85Vb2EiPAjCTiOp 5WDPQTX/L7Un56QduCdMy6vr65QHBgtEtRFTu5vEPh5S/4aYhreu8JwUjKyjX9ZI6Sh1bm 1CKWkLu399qUDN6C/Te3ZQLt9HsVeaic5hJtdIBF8O7qI9peOuRzBdLZUuGVhqBT2z6Dn5 symUrzAbo6lD2fC/cm5GhMxtjWH8cjj8Q9Z+bEbCxxcMeAg3gRz9ngiTE6JI5g== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1645093450; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=OqTj/MWZ/KJ2RN1mgDHy03k6dnz1y2VoL1uj4/vvUAQ=; 
b=HkyDWvAfLgoilVNOWERANnOXJy0P3321md/wXHEvPcTKv8pWCpZ9T+BhxLVAi2vyEjifYE krFzgMYBh2nuBGAA==
To: linux-kernel@vger.kernel.org, linux-ia64@vger.kernel.org
Cc: Andy Lutomirski, Ben Segall, Daniel Bristot de Oliveira, Dietmar Eggemann, Ingo Molnar, Juri Lelli, Peter Zijlstra, Steven Rostedt, Thomas Gleixner, Vincent Guittot, Sebastian Andrzej Siewior
Subject: [PATCH v2 1/8] kernel/fork: Redo ifdefs around task's handling.
Date: Thu, 17 Feb 2022 11:23:59 +0100
Message-Id: <20220217102406.3697941-2-bigeasy@linutronix.de>
In-Reply-To: <20220217102406.3697941-1-bigeasy@linutronix.de>
References: <20220217102406.3697941-1-bigeasy@linutronix.de>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

The use of ifdef CONFIG_VMAP_STACK is confusing in terms of what is
actually happening and what can happen.
For instance, from reading free_thread_stack() it appears that in the
CONFIG_VMAP_STACK case we may receive a non-NULL vm pointer but it may
also be NULL, in which case __free_pages() is used to free the stack.
This is however not the case because in the VMAP case a non-NULL pointer
is always returned here.
Since it looks like this might happen, the compiler creates the correct
dead code with the invocation to __free_pages() and everything around
it. Twice.

Add a space between the '#' and the nested ifdef to make the ifdef level
that we are currently in recognizable.
Add the current identifier as a comment behind #else and #endif.
Move the code within free_thread_stack() and alloc_thread_stack_node()
into the relevant ifdef block.

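The layout described above can be sketched in user space. This is only an
illustration, assuming made-up names: the DEMO_* macros, struct demo_task
and the demo_*() helpers do not exist in the kernel. It shows the three
conventions together: an indented nested directive, the controlling
identifier repeated as a comment behind #else and #endif, and a complete
alloc/free pair inside each branch instead of ifdefs inside the function
bodies.

```c
#include <assert.h>
#include <stdlib.h>

/* All DEMO_* macros and demo_* names are hypothetical, for illustration. */
#define DEMO_STACK_SIZE 4096

struct demo_task {
	void *stack;
};

#ifndef DEMO_ARCH_STACK_ALLOCATOR
# ifdef DEMO_VMAP_STACK

/* Variant A: stands in for the vmalloc-backed stack, zeroed on allocation. */
void *demo_alloc_stack(struct demo_task *t)
{
	t->stack = calloc(1, DEMO_STACK_SIZE);
	return t->stack;
}

void demo_free_stack(struct demo_task *t)
{
	assert(t->stack != NULL);	/* no NULL fallback path in this variant */
	free(t->stack);
	t->stack = NULL;
}

# else /* !DEMO_VMAP_STACK */

/* Variant B: stands in for the page-allocator-backed stack. */
void *demo_alloc_stack(struct demo_task *t)
{
	t->stack = malloc(DEMO_STACK_SIZE);
	return t->stack;
}

void demo_free_stack(struct demo_task *t)
{
	assert(t->stack != NULL);
	free(t->stack);
	t->stack = NULL;
}

# endif /* DEMO_VMAP_STACK */
#endif /* !DEMO_ARCH_STACK_ALLOCATOR */
```

Because each branch carries its own demo_free_stack(), neither body needs a
runtime check for "the other" configuration, which is the dead code the
patch description complains about.
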
Signed-off-by: Sebastian Andrzej Siewior
Acked-by: Andy Lutomirski
---
 kernel/fork.c | 74 +++++++++++++++++++++++++++------------------------
 1 file changed, 39 insertions(+), 35 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index d75a528f7b219..f63c0af6002da 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -185,7 +185,7 @@ static inline void free_task_struct(struct task_struct *tsk)
  */
 # if THREAD_SIZE >= PAGE_SIZE || defined(CONFIG_VMAP_STACK)
 
-#ifdef CONFIG_VMAP_STACK
+# ifdef CONFIG_VMAP_STACK
 /*
  * vmalloc() is a bit slow, and calling vfree() enough times will force a TLB
  * flush. Try to minimize the number of calls by caching stacks.
@@ -210,11 +210,9 @@ static int free_vm_stack_cache(unsigned int cpu)
 
 	return 0;
 }
-#endif
 
 static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
 {
-#ifdef CONFIG_VMAP_STACK
 	void *stack;
 	int i;
 
@@ -258,7 +256,34 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
 		tsk->stack = stack;
 	}
 	return stack;
-#else
+}
+
+static void free_thread_stack(struct task_struct *tsk)
+{
+	struct vm_struct *vm = task_stack_vm_area(tsk);
+	int i;
+
+	for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++)
+		memcg_kmem_uncharge_page(vm->pages[i], 0);
+
+	for (i = 0; i < NR_CACHED_STACKS; i++) {
+		if (this_cpu_cmpxchg(cached_stacks[i], NULL,
+				     tsk->stack_vm_area) != NULL)
+			continue;
+
+		tsk->stack = NULL;
+		tsk->stack_vm_area = NULL;
+		return;
+	}
+	vfree_atomic(tsk->stack);
+	tsk->stack = NULL;
+	tsk->stack_vm_area = NULL;
+}
+
+# else /* !CONFIG_VMAP_STACK */
+
+static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
+{
 	struct page *page = alloc_pages_node(node, THREADINFO_GFP,
 					     THREAD_SIZE_ORDER);
 
@@ -267,36 +292,17 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
 		return tsk->stack;
 	}
 	return NULL;
-#endif
 }
 
-static inline void free_thread_stack(struct task_struct *tsk)
+static void free_thread_stack(struct task_struct *tsk)
 {
-#ifdef CONFIG_VMAP_STACK
-	struct vm_struct *vm = task_stack_vm_area(tsk);
-
-	if (vm) {
-		int i;
-
-		for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++)
-			memcg_kmem_uncharge_page(vm->pages[i], 0);
-
-		for (i = 0; i < NR_CACHED_STACKS; i++) {
-			if (this_cpu_cmpxchg(cached_stacks[i],
-					NULL, tsk->stack_vm_area) != NULL)
-				continue;
-
-			return;
-		}
-
-		vfree_atomic(tsk->stack);
-		return;
-	}
-#endif
-
 	__free_pages(virt_to_page(tsk->stack), THREAD_SIZE_ORDER);
+	tsk->stack = NULL;
 }
-# else
+
+# endif /* CONFIG_VMAP_STACK */
+# else /* !(THREAD_SIZE >= PAGE_SIZE || defined(CONFIG_VMAP_STACK)) */
+
 static struct kmem_cache *thread_stack_cache;
 
 static unsigned long *alloc_thread_stack_node(struct task_struct *tsk,
@@ -312,6 +318,7 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk,
 static void free_thread_stack(struct task_struct *tsk)
 {
 	kmem_cache_free(thread_stack_cache, tsk->stack);
+	tsk->stack = NULL;
 }
 
 void thread_stack_cache_init(void)
@@ -321,8 +328,9 @@ void thread_stack_cache_init(void)
 					THREAD_SIZE, NULL);
 	BUG_ON(thread_stack_cache == NULL);
 }
-# endif
-#endif
+
+# endif /* THREAD_SIZE >= PAGE_SIZE || defined(CONFIG_VMAP_STACK) */
+#endif /* !CONFIG_ARCH_THREAD_STACK_ALLOCATOR */
 
 /* SLAB cache for signal_struct structures (tsk->signal) */
 static struct kmem_cache *signal_cachep;
@@ -432,10 +440,6 @@ static void release_task_stack(struct task_struct *tsk)
 
 	account_kernel_stack(tsk, -1);
 	free_thread_stack(tsk);
-	tsk->stack = NULL;
-#ifdef CONFIG_VMAP_STACK
-	tsk->stack_vm_area = NULL;
-#endif
 }
 
 #ifdef CONFIG_THREAD_INFO_IN_TASK
-- 
2.34.1

From nobody Sun Jun 28 00:10:28 2026
From: Sebastian Andrzej Siewior
Subject: [PATCH v2 2/8] kernel/fork: Duplicate task_struct before stack allocation.
Date: Thu, 17 Feb 2022 11:24:00 +0100
Message-Id: <20220217102406.3697941-3-bigeasy@linutronix.de>
In-Reply-To: <20220217102406.3697941-1-bigeasy@linutronix.de>
References: <20220217102406.3697941-1-bigeasy@linutronix.de>

alloc_thread_stack_node() already populates the task_struct::stack
member except on IA64. The stack pointer is saved and populated again
because IA64 needs it and arch_dup_task_struct() overwrites it.

Allocate the thread's stack after task_struct has been duplicated as a
preparation.

Signed-off-by: Sebastian Andrzej Siewior
Acked-by: Andy Lutomirski
---
 kernel/fork.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index f63c0af6002da..c47dcba5d66d2 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -888,6 +888,10 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
 	if (!tsk)
 		return NULL;
 
+	err = arch_dup_task_struct(tsk, orig);
+	if (err)
+		goto free_tsk;
+
 	stack = alloc_thread_stack_node(tsk, node);
 	if (!stack)
 		goto free_tsk;
@@ -897,8 +901,6 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
 
 	stack_vm_area = task_stack_vm_area(tsk);
 
-	err = arch_dup_task_struct(tsk, orig);
-
 	/*
 	 * arch_dup_task_struct() clobbers the stack-related fields. Make
 	 * sure they're properly initialized before using any stack-related
@@ -912,9 +914,6 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
 	refcount_set(&tsk->stack_refcount, 1);
 #endif
 
-	if (err)
-		goto free_stack;
-
 	err = scs_prepare(tsk, node);
 	if (err)
 		goto free_stack;
-- 
2.34.1

From nobody Sun Jun 28 00:10:28 2026
From: Sebastian Andrzej Siewior
Subject: [PATCH v2 3/8] kernel/fork, IA64: Provide a alloc_thread_stack_node() for IA64.
Date: Thu, 17 Feb 2022 11:24:01 +0100
Message-Id: <20220217102406.3697941-4-bigeasy@linutronix.de>
In-Reply-To: <20220217102406.3697941-1-bigeasy@linutronix.de>
References: <20220217102406.3697941-1-bigeasy@linutronix.de>

Provide a generic alloc_thread_stack_node() for IA64/
CONFIG_ARCH_THREAD_STACK_ALLOCATOR which returns the stack pointer and
sets task_struct::stack so it behaves exactly like the other
implementations.

Rename IA64's alloc_thread_stack_node() and add the generic version to
the fork code so it is in one place _and_ to drastically lower the
chances of fat fingering the IA64 code.
Do the same for free_thread_stack().

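The wrapper pattern described above can be sketched outside the kernel. The
following is a minimal sketch under stated assumptions: struct demo_task,
the DEMO_* sizes and all demo_*() names are hypothetical, and the "arch"
hook simply points behind the task structure, which only imitates the idea
of the IA64 macro. The generic function calls the arch hook and records
the result in the task, so callers see the same behaviour as with any
other allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; only the wrapper pattern mirrors the patch. */
struct demo_task {
	unsigned long *stack;
	int pid;
};

#define DEMO_TASK_SIZE	sizeof(struct demo_task)
#define DEMO_STACK_SIZE	1024

/* "Arch" hooks: the stack lives directly behind the task structure. */
#define demo_arch_alloc_stack(tsk) \
	((unsigned long *) ((char *) (tsk) + DEMO_TASK_SIZE))
#define demo_arch_free_stack(tsk)	/* nothing */

/* Task and stack come from a single zeroed allocation. */
struct demo_task *demo_task_new(void)
{
	return calloc(1, DEMO_TASK_SIZE + DEMO_STACK_SIZE);
}

/* Generic wrapper: returns the stack and records it in the task. */
unsigned long *demo_alloc_stack(struct demo_task *tsk)
{
	unsigned long *stack;

	assert(tsk != NULL);
	stack = demo_arch_alloc_stack(tsk);
	tsk->stack = stack;
	return stack;
}

/* Generic wrapper: lets the arch hook run, then clears the pointer. */
void demo_free_stack(struct demo_task *tsk)
{
	demo_arch_free_stack(tsk);
	tsk->stack = NULL;
}
```

Keeping the pointer bookkeeping in the generic wrappers means the arch
macros stay trivial and do not need to know about the stack member at all.
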
Signed-off-by: Sebastian Andrzej Siewior
Acked-by: Andy Lutomirski
---
 arch/ia64/include/asm/thread_info.h |  6 +++---
 kernel/fork.c                       | 17 +++++++++++++++++
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/arch/ia64/include/asm/thread_info.h b/arch/ia64/include/asm/thread_info.h
index 51d20cb377062..1684716f08201 100644
--- a/arch/ia64/include/asm/thread_info.h
+++ b/arch/ia64/include/asm/thread_info.h
@@ -55,15 +55,15 @@ struct thread_info {
 #ifndef ASM_OFFSETS_C
 /* how to get the thread information struct from C */
 #define current_thread_info()	((struct thread_info *) ((char *) current + IA64_TASK_SIZE))
-#define alloc_thread_stack_node(tsk, node)	\
+#define arch_alloc_thread_stack_node(tsk, node)	\
 		((unsigned long *) ((char *) (tsk) + IA64_TASK_SIZE))
 #define task_thread_info(tsk)	((struct thread_info *) ((char *) (tsk) + IA64_TASK_SIZE))
 #else
 #define current_thread_info()	((struct thread_info *) 0)
-#define alloc_thread_stack_node(tsk, node)	((unsigned long *) 0)
+#define arch_alloc_thread_stack_node(tsk, node)	((unsigned long *) 0)
 #define task_thread_info(tsk)	((struct thread_info *) 0)
 #endif
-#define free_thread_stack(tsk)	/* nothing */
+#define arch_free_thread_stack(tsk)	/* nothing */
 #define task_stack_page(tsk)	((void *)(tsk))
 
 #define __HAVE_THREAD_FUNCTIONS
diff --git a/kernel/fork.c b/kernel/fork.c
index c47dcba5d66d2..a6697215fe663 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -330,6 +330,23 @@ void thread_stack_cache_init(void)
 }
 
 # endif /* THREAD_SIZE >= PAGE_SIZE || defined(CONFIG_VMAP_STACK) */
+#else /* CONFIG_ARCH_THREAD_STACK_ALLOCATOR */
+
+static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
+{
+	unsigned long *stack;
+
+	stack = arch_alloc_thread_stack_node(tsk, node);
+	tsk->stack = stack;
+	return stack;
+}
+
+static void free_thread_stack(struct task_struct *tsk)
+{
+	arch_free_thread_stack(tsk);
+	tsk->stack = NULL;
+}
+
 #endif /* !CONFIG_ARCH_THREAD_STACK_ALLOCATOR */
 
 /* SLAB cache for signal_struct structures (tsk->signal) */
-- 
2.34.1

From nobody Sun Jun 28 00:10:28 2026
From: Sebastian Andrzej Siewior
Subject: [PATCH v2 4/8] kernel/fork: Don't assign the stack pointer in dup_task_struct().
Date: Thu, 17 Feb 2022 11:24:02 +0100
Message-Id: <20220217102406.3697941-5-bigeasy@linutronix.de>
In-Reply-To: <20220217102406.3697941-1-bigeasy@linutronix.de>
References: <20220217102406.3697941-1-bigeasy@linutronix.de>

All four versions of alloc_thread_stack_node() now assign
task_struct::stack in case the allocation was successful.

Let alloc_thread_stack_node() return an error code instead of the stack
pointer and remove the stack assignment in dup_task_struct().

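The calling convention described above can be sketched in user space. This
is a sketch, not the kernel implementation: struct demo_task, the
demo_*() functions and the demo_fail_stack_alloc knob are hypothetical and
exist only to make the failure path testable. The allocator stores the
pointer in the task itself and reports success or -ENOMEM, so the caller
needs nothing but the error code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

#define DEMO_STACK_SIZE 4096

struct demo_task {
	void *stack;
};

/* Hypothetical test knob: force the next stack allocation to fail. */
bool demo_fail_stack_alloc;

/* Returns 0 and sets t->stack on success, -ENOMEM on failure. */
int demo_alloc_stack(struct demo_task *t)
{
	void *stack = demo_fail_stack_alloc ? NULL : malloc(DEMO_STACK_SIZE);

	if (!stack)
		return -ENOMEM;
	t->stack = stack;
	return 0;
}

/* The caller only checks the error code; it never handles the pointer. */
struct demo_task *demo_dup_task(void)
{
	struct demo_task *t = calloc(1, sizeof(*t));
	int err;

	if (!t)
		return NULL;

	err = demo_alloc_stack(t);
	if (err)
		goto free_task;

	assert(t->stack != NULL);
	return t;

free_task:
	free(t);
	return NULL;
}

void demo_free_task(struct demo_task *t)
{
	free(t->stack);
	free(t);
}
```

With the pointer owned by the allocator, the caller has no local copy that
could go stale or be assigned twice.
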
Signed-off-by: Sebastian Andrzej Siewior
Acked-by: Andy Lutomirski
---
 kernel/fork.c | 47 ++++++++++++++++-------------------------------
 1 file changed, 16 insertions(+), 31 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index a6697215fe663..546bea2e3b28a 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -211,7 +211,7 @@ static int free_vm_stack_cache(unsigned int cpu)
 	return 0;
 }
 
-static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
+static int alloc_thread_stack_node(struct task_struct *tsk, int node)
 {
 	void *stack;
 	int i;
@@ -232,7 +232,7 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
 
 		tsk->stack_vm_area = s;
 		tsk->stack = s->addr;
-		return s->addr;
+		return 0;
 	}
 
 	/*
@@ -245,17 +245,16 @@ static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
 				     THREADINFO_GFP & ~__GFP_ACCOUNT,
 				     PAGE_KERNEL,
 				     0, node, __builtin_return_address(0));
-
+	if (!stack)
+		return -ENOMEM;
 	/*
 	 * We can't call find_vm_area() in interrupt context, and
 	 * free_thread_stack() can be called in interrupt context,
 	 * so cache the vm_struct.
 	 */
-	if (stack) {
-		tsk->stack_vm_area = find_vm_area(stack);
-		tsk->stack = stack;
-	}
-	return stack;
+	tsk->stack_vm_area = find_vm_area(stack);
+	tsk->stack = stack;
+	return 0;
 }
 
 static void free_thread_stack(struct task_struct *tsk)
@@ -282,16 +281,16 @@ static void free_thread_stack(struct task_struct *tsk)
 
 # else /* !CONFIG_VMAP_STACK */
 
-static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
+static int alloc_thread_stack_node(struct task_struct *tsk, int node)
 {
 	struct page *page = alloc_pages_node(node, THREADINFO_GFP,
 					     THREAD_SIZE_ORDER);
 
 	if (likely(page)) {
 		tsk->stack = kasan_reset_tag(page_address(page));
-		return tsk->stack;
+		return 0;
 	}
-	return NULL;
+	return -ENOMEM;
 }
 
 static void free_thread_stack(struct task_struct *tsk)
@@ -305,14 +304,13 @@ static void free_thread_stack(struct task_struct *tsk)
 
 static struct kmem_cache *thread_stack_cache;
 
-static unsigned long *alloc_thread_stack_node(struct task_struct *tsk,
-						  int node)
+static int alloc_thread_stack_node(struct task_struct *tsk, int node)
 {
 	unsigned long *stack;
 	stack = kmem_cache_alloc_node(thread_stack_cache, THREADINFO_GFP, node);
 	stack = kasan_reset_tag(stack);
 	tsk->stack = stack;
-	return stack;
+	return stack ? 0 : -ENOMEM;
 }
 
 static void free_thread_stack(struct task_struct *tsk)
@@ -332,13 +330,13 @@ void thread_stack_cache_init(void)
 # endif /* THREAD_SIZE >= PAGE_SIZE || defined(CONFIG_VMAP_STACK) */
 #else /* CONFIG_ARCH_THREAD_STACK_ALLOCATOR */
 
-static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int node)
+static int alloc_thread_stack_node(struct task_struct *tsk, int node)
 {
 	unsigned long *stack;
 
 	stack = arch_alloc_thread_stack_node(tsk, node);
 	tsk->stack = stack;
-	return stack;
+	return stack ? 0 : -ENOMEM;
 }
 
 static void free_thread_stack(struct task_struct *tsk)
@@ -895,8 +893,6 @@ void set_task_stack_end_magic(struct task_struct *tsk)
 static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
 {
 	struct task_struct *tsk;
-	unsigned long *stack;
-	struct vm_struct *stack_vm_area __maybe_unused;
 	int err;
 
 	if (node == NUMA_NO_NODE)
@@ -909,24 +905,13 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
 	if (err)
 		goto free_tsk;
 
-	stack = alloc_thread_stack_node(tsk, node);
-	if (!stack)
+	err = alloc_thread_stack_node(tsk, node);
+	if (err)
 		goto free_tsk;
 
 	if (memcg_charge_kernel_stack(tsk))
 		goto free_stack;
 
-	stack_vm_area = task_stack_vm_area(tsk);
-
-	/*
-	 * arch_dup_task_struct() clobbers the stack-related fields. Make
-	 * sure they're properly initialized before using any stack-related
-	 * functions again.
-	 */
-	tsk->stack = stack;
-#ifdef CONFIG_VMAP_STACK
-	tsk->stack_vm_area = stack_vm_area;
-#endif
 #ifdef CONFIG_THREAD_INFO_IN_TASK
 	refcount_set(&tsk->stack_refcount, 1);
 #endif
-- 
2.34.1

From nobody Sun Jun 28 00:10:28 2026
From: Sebastian Andrzej Siewior
Subject: [PATCH v2 5/8] kernel/fork: Move memcg_charge_kernel_stack() into CONFIG_VMAP_STACK.
Date: Thu, 17 Feb 2022 11:24:03 +0100
Message-Id: <20220217102406.3697941-6-bigeasy@linutronix.de>
In-Reply-To: <20220217102406.3697941-1-bigeasy@linutronix.de>
References: <20220217102406.3697941-1-bigeasy@linutronix.de>

memcg_charge_kernel_stack() is only used in the CONFIG_VMAP_STACK case.

Move memcg_charge_kernel_stack() into the CONFIG_VMAP_STACK block and
invoke it from within alloc_thread_stack_node().

Signed-off-by: Sebastian Andrzej Siewior
Acked-by: Andy Lutomirski
---
 kernel/fork.c | 69 +++++++++++++++++++++++++++------------------------
 1 file changed, 36 insertions(+), 33 deletions(-)

diff --git a/kernel/fork.c b/kernel/fork.c
index 546bea2e3b28a..919bdcf21b8e5 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -211,6 +211,32 @@ static int free_vm_stack_cache(unsigned int cpu)
 	return 0;
 }
 
+static int memcg_charge_kernel_stack(struct task_struct *tsk)
+{
+	struct vm_struct *vm = task_stack_vm_area(tsk);
+	int i;
+	int ret;
+
+	BUILD_BUG_ON(IS_ENABLED(CONFIG_VMAP_STACK) && PAGE_SIZE % 1024 != 0);
+	BUG_ON(vm->nr_pages != THREAD_SIZE / PAGE_SIZE);
+
+	for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++) {
+		ret = memcg_kmem_charge_page(vm->pages[i], GFP_KERNEL, 0);
+		if (ret)
+			goto err;
+	}
+	return 0;
+err:
+	/*
+	 * If memcg_kmem_charge_page() fails, page's memory cgroup pointer is
+	 * NULL, and memcg_kmem_uncharge_page() in free_thread_stack() will
+	 * ignore this page.
+	 */
+	for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++)
+		memcg_kmem_uncharge_page(vm->pages[i], 0);
+	return ret;
+}
+
 static int alloc_thread_stack_node(struct task_struct *tsk, int node)
 {
 	void *stack;
@@ -230,6 +256,11 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
 		/* Clear stale pointers from reused stack. */
 		memset(s->addr, 0, THREAD_SIZE);
 
+		if (memcg_charge_kernel_stack(tsk)) {
+			vfree(s->addr);
+			return -ENOMEM;
+		}
+
 		tsk->stack_vm_area = s;
 		tsk->stack = s->addr;
 		return 0;
@@ -247,6 +278,11 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
 				     0, node, __builtin_return_address(0));
 	if (!stack)
 		return -ENOMEM;
+
+	if (memcg_charge_kernel_stack(tsk)) {
+		vfree(stack);
+		return -ENOMEM;
+	}
 	/*
 	 * We can't call find_vm_area() in interrupt context, and
 	 * free_thread_stack() can be called in interrupt context,
@@ -418,36 +454,6 @@ static void account_kernel_stack(struct task_struct *tsk, int account)
 	}
 }
 
-static int memcg_charge_kernel_stack(struct task_struct *tsk)
-{
-#ifdef CONFIG_VMAP_STACK
-	struct vm_struct *vm = task_stack_vm_area(tsk);
-	int ret;
-
-	BUILD_BUG_ON(IS_ENABLED(CONFIG_VMAP_STACK) && PAGE_SIZE % 1024 != 0);
-
-	if (vm) {
-		int i;
-
-		BUG_ON(vm->nr_pages != THREAD_SIZE / PAGE_SIZE);
-
-		for (i = 0; i < THREAD_SIZE / PAGE_SIZE; i++) {
-			/*
-			 * If memcg_kmem_charge_page() fails, page's
-			 * memory cgroup pointer is NULL, and
-			 * memcg_kmem_uncharge_page() in free_thread_stack()
-			 * will ignore this page.
-			 */
-			ret = memcg_kmem_charge_page(vm->pages[i], GFP_KERNEL,
-						     0);
-			if (ret)
-				return ret;
-		}
-	}
-#endif
-	return 0;
-}
-
 static void release_task_stack(struct task_struct *tsk)
 {
 	if (WARN_ON(READ_ONCE(tsk->__state) != TASK_DEAD))
@@ -909,9 +915,6 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
 	if (err)
 		goto free_tsk;
 
-	if (memcg_charge_kernel_stack(tsk))
-		goto free_stack;
-
 #ifdef CONFIG_THREAD_INFO_IN_TASK
 	refcount_set(&tsk->stack_refcount, 1);
 #endif
-- 
2.34.1

From nobody Sun Jun 28 00:10:28 2026
From: Sebastian Andrzej Siewior
Subject: [PATCH v2 6/8] kernel/fork: Move task stack account to do_exit().
Date: Thu, 17 Feb 2022 11:24:04 +0100
Message-Id: <20220217102406.3697941-7-bigeasy@linutronix.de>
In-Reply-To: <20220217102406.3697941-1-bigeasy@linutronix.de>
References: <20220217102406.3697941-1-bigeasy@linutronix.de>

There is no need to perform the stack accounting of the outgoing task in
its final schedule() invocation, which happens with preemption disabled.
The task is leaving, the resources will be freed and the accounting can
happen in do_exit() before the actual schedule() invocation which frees
the stack memory.

Move the accounting of the stack memory from release_task_stack() to
exit_task_stack_account() which then can be invoked from do_exit().

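The split between accounting and freeing described above can be sketched
in user space. This is an illustration only, assuming hypothetical names:
demo_accounted_bytes stands in for the kernel's stack statistics and the
demo_*() functions for the create, exit and release steps. The accounting
is undone in the exit step, while the memory itself is freed later in a
release step that does no bookkeeping.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_STACK_SIZE 4096

struct demo_task {
	void *stack;
};

/* Hypothetical counter standing in for the kernel's stack statistics. */
long demo_accounted_bytes;

/* account is +1 to charge the stack and -1 to uncharge it. */
static void demo_account_stack(struct demo_task *t, int account)
{
	assert(t->stack != NULL);
	demo_accounted_bytes += (long)account * DEMO_STACK_SIZE;
}

/* Creation: allocate the stack and charge it right away. */
int demo_task_create(struct demo_task *t)
{
	t->stack = malloc(DEMO_STACK_SIZE);
	if (!t->stack)
		return -1;
	demo_account_stack(t, 1);
	return 0;
}

/* Exit path: undo the accounting while the task can still run normally. */
void demo_exit_stack_account(struct demo_task *t)
{
	demo_account_stack(t, -1);
}

/* Final release: only frees the memory, no bookkeeping left to do. */
void demo_release_stack(struct demo_task *t)
{
	free(t->stack);
	t->stack = NULL;
}
```

A caller would invoke demo_exit_stack_account() once on the exit path and
demo_release_stack() later, once the stack is no longer in use.
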
Signed-off-by: Sebastian Andrzej Siewior Acked-by: Andy Lutomirski --- include/linux/sched/task_stack.h | 2 ++ kernel/exit.c | 1 + kernel/fork.c | 35 +++++++++++++++++++++----------- 3 files changed, 26 insertions(+), 12 deletions(-) diff --git a/include/linux/sched/task_stack.h b/include/linux/sched/task_st= ack.h index d10150587d819..892562ebbd3aa 100644 --- a/include/linux/sched/task_stack.h +++ b/include/linux/sched/task_stack.h @@ -79,6 +79,8 @@ static inline void *try_get_task_stack(struct task_struct= *tsk) static inline void put_task_stack(struct task_struct *tsk) {} #endif =20 +void exit_task_stack_account(struct task_struct *tsk); + #define task_stack_end_corrupted(task) \ (*(end_of_stack(task)) !=3D STACK_END_MAGIC) =20 diff --git a/kernel/exit.c b/kernel/exit.c index b00a25bb4ab93..c303cffe7fdb4 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -845,6 +845,7 @@ void __noreturn do_exit(long code) put_page(tsk->task_frag.page); =20 validate_creds_for_do_exit(tsk); + exit_task_stack_account(tsk); =20 check_stack_usage(); preempt_disable(); diff --git a/kernel/fork.c b/kernel/fork.c index 919bdcf21b8e5..984f69d6f211f 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -211,9 +211,8 @@ static int free_vm_stack_cache(unsigned int cpu) return 0; } =20 -static int memcg_charge_kernel_stack(struct task_struct *tsk) +static int memcg_charge_kernel_stack(struct vm_struct *vm) { - struct vm_struct *vm =3D task_stack_vm_area(tsk); int i; int ret; =20 @@ -239,6 +238,7 @@ static int memcg_charge_kernel_stack(struct task_struct= *tsk) =20 static int alloc_thread_stack_node(struct task_struct *tsk, int node) { + struct vm_struct *vm; void *stack; int i; =20 @@ -256,7 +256,7 @@ static int alloc_thread_stack_node(struct task_struct *= tsk, int node) /* Clear stale pointers from reused stack. 
*/ memset(s->addr, 0, THREAD_SIZE); =20 - if (memcg_charge_kernel_stack(tsk)) { + if (memcg_charge_kernel_stack(s)) { vfree(s->addr); return -ENOMEM; } @@ -279,7 +279,8 @@ static int alloc_thread_stack_node(struct task_struct *= tsk, int node) if (!stack) return -ENOMEM; =20 - if (memcg_charge_kernel_stack(tsk)) { + vm =3D find_vm_area(stack); + if (memcg_charge_kernel_stack(vm)) { vfree(stack); return -ENOMEM; } @@ -288,19 +289,15 @@ static int alloc_thread_stack_node(struct task_struct= *tsk, int node) * free_thread_stack() can be called in interrupt context, * so cache the vm_struct. */ - tsk->stack_vm_area =3D find_vm_area(stack); + tsk->stack_vm_area =3D vm; tsk->stack =3D stack; return 0; } =20 static void free_thread_stack(struct task_struct *tsk) { - struct vm_struct *vm =3D task_stack_vm_area(tsk); int i; =20 - for (i =3D 0; i < THREAD_SIZE / PAGE_SIZE; i++) - memcg_kmem_uncharge_page(vm->pages[i], 0); - for (i =3D 0; i < NR_CACHED_STACKS; i++) { if (this_cpu_cmpxchg(cached_stacks[i], NULL, tsk->stack_vm_area) !=3D NULL) @@ -454,12 +451,25 @@ static void account_kernel_stack(struct task_struct *= tsk, int account) } } =20 +void exit_task_stack_account(struct task_struct *tsk) +{ + account_kernel_stack(tsk, -1); + + if (IS_ENABLED(CONFIG_VMAP_STACK)) { + struct vm_struct *vm; + int i; + + vm =3D task_stack_vm_area(tsk); + for (i =3D 0; i < THREAD_SIZE / PAGE_SIZE; i++) + memcg_kmem_uncharge_page(vm->pages[i], 0); + } +} + static void release_task_stack(struct task_struct *tsk) { if (WARN_ON(READ_ONCE(tsk->__state) !=3D TASK_DEAD)) return; /* Better to leak the stack than to free prematurely */ =20 - account_kernel_stack(tsk, -1); free_thread_stack(tsk); } =20 @@ -918,6 +928,7 @@ static struct task_struct *dup_task_struct(struct task_= struct *orig, int node) #ifdef CONFIG_THREAD_INFO_IN_TASK refcount_set(&tsk->stack_refcount, 1); #endif + account_kernel_stack(tsk, 1); =20 err =3D scs_prepare(tsk, node); if (err) @@ -961,8 +972,6 @@ static struct task_struct 
*dup_task_struct(struct task_= struct *orig, int node) tsk->wake_q.next =3D NULL; tsk->worker_private =3D NULL; =20 - account_kernel_stack(tsk, 1); - kcov_task_init(tsk); kmap_local_fork(tsk); =20 @@ -981,6 +990,7 @@ static struct task_struct *dup_task_struct(struct task_= struct *orig, int node) return tsk; =20 free_stack: + exit_task_stack_account(tsk); free_thread_stack(tsk); free_tsk: free_task_struct(tsk); @@ -2449,6 +2459,7 @@ static __latent_entropy struct task_struct *copy_proc= ess( exit_creds(p); bad_fork_free: WRITE_ONCE(p->__state, TASK_DEAD); + exit_task_stack_account(p); put_task_stack(p); delayed_free_task(p); fork_out: --=20 2.34.1 From nobody Sun Jun 28 00:10:28 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 53225C4332F for ; Thu, 17 Feb 2022 10:24:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239156AbiBQKZD (ORCPT ); Thu, 17 Feb 2022 05:25:03 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:60906 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239128AbiBQKY2 (ORCPT ); Thu, 17 Feb 2022 05:24:28 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7C9B8279085; Thu, 17 Feb 2022 02:24:14 -0800 (PST) From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1645093452; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=SEsTUOSaWCSEZQKtwmhfwzPA5QEczAckklkj6p/jMls=; b=ykFHdezJ3RFsbfhL4QVCanbtjZSidvlMttr1xwuY/oHDFg+XteKL8d/s7tSVym8qEcwkoT 
hXxeMHupJcJF3sbp7YJUIZpSM8pyPCMNEMpv4IRexx7fqbTV2PXsjCtGagwLOAHJEWAE+3 nmQEmbIn5gdmJOYONGvpHrn4X7II+2NA1agbBfHb/5QOvj/Sr8/cQyoNzk0HB9D6ga+1fV qxXJcJxnTGkT3FkD314AkFkOv1QUjk2ww++oQLi9kTnkJQAr44O0tvpi/Qy7S0a2vAcHME 4/ZP8sZiHhDGi3buT5hCOYpVqDtFZ9H1bshnwGsX/YJepfO0w/bU/qPMEzQY1g== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1645093452; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=SEsTUOSaWCSEZQKtwmhfwzPA5QEczAckklkj6p/jMls=; b=mXjFPopJFMNHPL94dGaalBh9Ao2w2Jvw5Fz3OleuR78Ucnuev9W17RZaSEnaoPUn81xdFM filAtZAZgBrsmCCQ== To: linux-kernel@vger.kernel.org, linux-ia64@vger.kernel.org Cc: Andy Lutomirski , Ben Segall , Daniel Bristot de Oliveira , Dietmar Eggemann , Ingo Molnar , Juri Lelli , Peter Zijlstra , Steven Rostedt , Thomas Gleixner , Vincent Guittot , Sebastian Andrzej Siewior Subject: [PATCH v2 7/8] kernel/fork: Only cache the VMAP stack in finish_task_switch(). Date: Thu, 17 Feb 2022 11:24:05 +0100 Message-Id: <20220217102406.3697941-8-bigeasy@linutronix.de> In-Reply-To: <20220217102406.3697941-1-bigeasy@linutronix.de> References: <20220217102406.3697941-1-bigeasy@linutronix.de> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" The task stack could be deallocated later. For fork()/exec() kind of workloads (say a shell script executing several commands) it is important that the stack is released in finish_task_switch() so that in VMAP_STACK case it can be cached and reused in the new task. For PREEMPT_RT it would be good if the wake-up in vfree_atomic() could be avoided in the scheduling path. Far worse are the other free_thread_stack() implementations which invoke __free_pages()/ kmem_cache_free() with disabled preemption. 
Cache the stack in free_thread_stack() in the VMAP_STACK case and RCU-delay the free path otherwise. Free the stack in the RCU callback. In the VMAP_STACK case this is another opportunity to fill the cache. Signed-off-by: Sebastian Andrzej Siewior Acked-by: Andy Lutomirski --- kernel/fork.c | 76 ++++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 63 insertions(+), 13 deletions(-) diff --git a/kernel/fork.c b/kernel/fork.c index 984f69d6f211f..aa17ed2a2afc7 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -193,6 +193,41 @@ static inline void free_task_struct(struct task_struct= *tsk) #define NR_CACHED_STACKS 2 static DEFINE_PER_CPU(struct vm_struct *, cached_stacks[NR_CACHED_STACKS]); =20 +struct vm_stack { + struct rcu_head rcu; + struct vm_struct *stack_vm_area; +}; + +static bool try_release_thread_stack_to_cache(struct vm_struct *vm) +{ + unsigned int i; + + for (i =3D 0; i < NR_CACHED_STACKS; i++) { + if (this_cpu_cmpxchg(cached_stacks[i], NULL, vm) !=3D NULL) + continue; + return true; + } + return false; +} + +static void thread_stack_free_rcu(struct rcu_head *rh) +{ + struct vm_stack *vm_stack =3D container_of(rh, struct vm_stack, rcu); + + if (try_release_thread_stack_to_cache(vm_stack->stack_vm_area)) + return; + + vfree(vm_stack); +} + +static void thread_stack_delayed_free(struct task_struct *tsk) +{ + struct vm_stack *vm_stack =3D tsk->stack; + + vm_stack->stack_vm_area =3D tsk->stack_vm_area; + call_rcu(&vm_stack->rcu, thread_stack_free_rcu); +} + static int free_vm_stack_cache(unsigned int cpu) { struct vm_struct **cached_vm_stacks =3D per_cpu_ptr(cached_stacks, cpu); @@ -296,24 +331,27 @@ static int alloc_thread_stack_node(struct task_struct= *tsk, int node) =20 static void free_thread_stack(struct task_struct *tsk) { - int i; + if (!try_release_thread_stack_to_cache(tsk->stack_vm_area)) + thread_stack_delayed_free(tsk); =20 - for (i =3D 0; i < NR_CACHED_STACKS; i++) { - if (this_cpu_cmpxchg(cached_stacks[i], NULL, - 
tsk->stack_vm_area) !=3D NULL) - continue; - - tsk->stack =3D NULL; - tsk->stack_vm_area =3D NULL; - return; - } - vfree_atomic(tsk->stack); tsk->stack =3D NULL; tsk->stack_vm_area =3D NULL; } =20 # else /* !CONFIG_VMAP_STACK */ =20 +static void thread_stack_free_rcu(struct rcu_head *rh) +{ + __free_pages(virt_to_page(rh), THREAD_SIZE_ORDER); +} + +static void thread_stack_delayed_free(struct task_struct *tsk) +{ + struct rcu_head *rh =3D tsk->stack; + + call_rcu(rh, thread_stack_free_rcu); +} + static int alloc_thread_stack_node(struct task_struct *tsk, int node) { struct page *page =3D alloc_pages_node(node, THREADINFO_GFP, @@ -328,7 +366,7 @@ static int alloc_thread_stack_node(struct task_struct *= tsk, int node) =20 static void free_thread_stack(struct task_struct *tsk) { - __free_pages(virt_to_page(tsk->stack), THREAD_SIZE_ORDER); + thread_stack_delayed_free(tsk); tsk->stack =3D NULL; } =20 @@ -337,6 +375,18 @@ static void free_thread_stack(struct task_struct *tsk) =20 static struct kmem_cache *thread_stack_cache; =20 +static void thread_stack_free_rcu(struct rcu_head *rh) +{ + kmem_cache_free(thread_stack_cache, rh); +} + +static void thread_stack_delayed_free(struct task_struct *tsk) +{ + struct rcu_head *rh =3D tsk->stack; + + call_rcu(rh, thread_stack_free_rcu); +} + static int alloc_thread_stack_node(struct task_struct *tsk, int node) { unsigned long *stack; @@ -348,7 +398,7 @@ static int alloc_thread_stack_node(struct task_struct *= tsk, int node) =20 static void free_thread_stack(struct task_struct *tsk) { - kmem_cache_free(thread_stack_cache, tsk->stack); + thread_stack_delayed_free(tsk); tsk->stack =3D NULL; } =20 --=20 2.34.1 From nobody Sun Jun 28 00:10:28 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C822DC433EF for ; Thu, 17 Feb 2022 10:24:41 +0000 (UTC) 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239206AbiBQKYy (ORCPT ); Thu, 17 Feb 2022 05:24:54 -0500 Received: from mxb-00190b01.gslb.pphosted.com ([23.128.96.19]:60908 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239130AbiBQKY2 (ORCPT ); Thu, 17 Feb 2022 05:24:28 -0500 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B7E6B27908C; Thu, 17 Feb 2022 02:24:14 -0800 (PST) From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1645093453; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fWbPOcLNbd+78CrHSdKKmwuiwmtQSc74Iuad3beVla8=; b=fHPPBLjF9+ZJ48rBstHcHLvXZjVophxsE6QjuwpIIfRtQGlrK+GWyvmbGeDhofvpqhp8mX 7BhQ0dwvGE3ilLt0IT5HsTJishv3qIqEoxHgPbeNp2E+f5Dq1xp3rIoWBy6YCwUw68l2c3 bhuLUvZbsSUMlEXeYejuDJBLrI84iZ5x8kRKcHo7GsgKVnjJGAeyec8l3yxi3PrwW2z2tC zS+I0XsTmT09fUnOofJFd+1YP6tFV2ooRO9VAkg5a70bF02GDkB5P4EeNg9SkYIg23VaVB vsih6ug4akfMlyV9mplT/bnDVKnI8Vxo8T+NZnkRsfNeuo7iBRbUN6zv1LRVCw== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1645093453; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fWbPOcLNbd+78CrHSdKKmwuiwmtQSc74Iuad3beVla8=; b=gyBFGLoe6dzqIDz/Zv/9rNogBmQdaqQftY6FSSOmcrPYKlrIV1jDtFPOmxx/9f3gNZsfig GPziWMahoxOIeICA== To: linux-kernel@vger.kernel.org, linux-ia64@vger.kernel.org Cc: Andy Lutomirski , Ben Segall , Daniel Bristot de Oliveira , Dietmar Eggemann , Ingo Molnar , Juri Lelli , Peter Zijlstra , Steven Rostedt , Thomas Gleixner , Vincent Guittot , Sebastian Andrzej Siewior 
Subject: [PATCH v2 8/8] kernel/fork: Use IS_ENABLED() in account_kernel_stack(). Date: Thu, 17 Feb 2022 11:24:06 +0100 Message-Id: <20220217102406.3697941-9-bigeasy@linutronix.de> In-Reply-To: <20220217102406.3697941-1-bigeasy@linutronix.de> References: <20220217102406.3697941-1-bigeasy@linutronix.de> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Not strictly needed, but checking CONFIG_VMAP_STACK instead of task_stack_vm_area()'s result allows the compiler to remove the else path in the CONFIG_VMAP_STACK case, where the pointer can't be NULL. Check for CONFIG_VMAP_STACK in order to use the proper path. Signed-off-by: Sebastian Andrzej Siewior Acked-by: Andy Lutomirski --- kernel/fork.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/kernel/fork.c b/kernel/fork.c index aa17ed2a2afc7..4c97e0b5fb62a 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -485,16 +485,16 @@ void vm_area_free(struct vm_area_struct *vma) =20 static void account_kernel_stack(struct task_struct *tsk, int account) { - void *stack =3D task_stack_page(tsk); - struct vm_struct *vm =3D task_stack_vm_area(tsk); - - if (vm) { + if (IS_ENABLED(CONFIG_VMAP_STACK)) { + struct vm_struct *vm =3D task_stack_vm_area(tsk); int i; =20 for (i =3D 0; i < THREAD_SIZE / PAGE_SIZE; i++) mod_lruvec_page_state(vm->pages[i], NR_KERNEL_STACK_KB, account * (PAGE_SIZE / 1024)); } else { + void *stack =3D task_stack_page(tsk); + /* All stack pages are in the same node. */ mod_lruvec_kmem_state(stack, NR_KERNEL_STACK_KB, account * (THREAD_SIZE / 1024)); --=20 2.34.1