From nobody Wed Apr 1 11:25:17 2026
Date: Tue, 31 Mar 2026 19:30:55 +0800
From: "Chuyi Zhou"
Subject: [PATCH v4 04/12] smp: Use task-local IPI cpumask in smp_call_function_many_cond()
Message-Id: <20260331113103.2197007-5-zhouchuyi@bytedance.com>
In-Reply-To: <20260331113103.2197007-1-zhouchuyi@bytedance.com>
References: <20260331113103.2197007-1-zhouchuyi@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org
X-Mailer: git-send-email 2.20.1
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Prepare the task-local IPI cpumask during thread creation, and use it in
place of the per-CPU cfd cpumask in smp_call_function_many_cond(). A
later patch will enable preemption during csd_lock_wait(); switching to
a task-local mask prevents concurrent access to cfd->cpumask by other
tasks that may then run on the same CPU.

When cpumask_size() is smaller than or equal to the pointer size, the
cpumask is stashed in the pointer itself to avoid an extra memory
allocation.
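For illustration, the pointer-stashing trick described above can be sketched in standalone userspace C. All names here (struct task_mask, mask_bytes, the helpers) are hypothetical stand-ins for the patch's task_struct union and cpumask_size(); this is not kernel code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Sketch of the inline-vs-allocated mask scheme: when the mask fits
 * in one pointer-sized word, store the bits directly in the union
 * slot and skip the allocation; otherwise fall back to the heap.
 */
struct task_mask {
	union {
		unsigned long *mask_ptr; /* out-of-line storage */
		unsigned long mask_val;  /* mask stashed in the pointer slot */
	};
};

static size_t mask_bytes; /* stand-in for cpumask_size() */

static int mask_inlined(void)
{
	return mask_bytes <= sizeof(unsigned long);
}

static int task_mask_alloc(struct task_mask *t)
{
	if (mask_inlined())
		return 0; /* no allocation: bits live in the union itself */
	t->mask_ptr = calloc(1, mask_bytes);
	return t->mask_ptr ? 0 : -1;
}

static unsigned long *task_mask(struct task_mask *t)
{
	return mask_inlined() ? &t->mask_val : t->mask_ptr;
}

static void task_mask_free(struct task_mask *t)
{
	if (!mask_inlined())
		free(t->mask_ptr);
}
```

Both callers go through task_mask(), so the inline and heap cases look identical at the use site, which mirrors how smp_task_ipi_mask() hides the representation from smp_call_function_many_cond().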
Signed-off-by: Chuyi Zhou
---
 include/linux/sched.h |  6 +++++
 include/linux/smp.h   | 20 +++++++++++++++
 kernel/fork.c         |  9 ++++++-
 kernel/smp.c          | 59 ++++++++++++++++++++++++++++++++++++++-----
 4 files changed, 87 insertions(+), 7 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 5a5d3dbc9cdf..6daab67caacc 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1346,6 +1346,12 @@ struct task_struct {
 	struct list_head perf_event_list;
 	struct perf_ctx_data __rcu *perf_ctx_data;
 #endif
+#if defined(CONFIG_SMP) && defined(CONFIG_PREEMPTION)
+	union {
+		cpumask_t *ipi_mask_ptr;
+		unsigned long ipi_mask_val;
+	};
+#endif
 #ifdef CONFIG_DEBUG_PREEMPT
 	unsigned long preempt_disable_ip;
 #endif
diff --git a/include/linux/smp.h b/include/linux/smp.h
index 1ebd88026119..c7b8cc82ad3c 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -167,6 +167,12 @@ void smp_call_function_many(const struct cpumask *mask,
 int smp_call_function_any(const struct cpumask *mask,
 			  smp_call_func_t func, void *info, int wait);
 
+#ifdef CONFIG_PREEMPTION
+int smp_task_ipi_mask_alloc(struct task_struct *task);
+void smp_task_ipi_mask_free(struct task_struct *task);
+cpumask_t *smp_task_ipi_mask(struct task_struct *cur);
+#endif
+
 void kick_all_cpus_sync(void);
 void wake_up_all_idle_cpus(void);
 bool cpus_peek_for_pending_ipi(const struct cpumask *mask);
@@ -306,4 +312,18 @@ bool csd_lock_is_stuck(void);
 static inline bool csd_lock_is_stuck(void) { return false; }
 #endif
 
+#if !defined(CONFIG_SMP) || !defined(CONFIG_PREEMPTION)
+static inline int smp_task_ipi_mask_alloc(struct task_struct *task)
+{
+	return 0;
+}
+static inline void smp_task_ipi_mask_free(struct task_struct *task)
+{
+}
+static inline cpumask_t *smp_task_ipi_mask(struct task_struct *cur)
+{
+	return NULL;
+}
+#endif
+
 #endif /* __LINUX_SMP_H */
diff --git a/kernel/fork.c b/kernel/fork.c
index bc2bf58b93b6..7082eb1c02c1 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -533,6 +533,7 @@ void free_task(struct task_struct *tsk)
 #endif
 	release_user_cpus_ptr(tsk);
 	scs_release(tsk);
+	smp_task_ipi_mask_free(tsk);
 
 #ifndef CONFIG_THREAD_INFO_IN_TASK
 	/*
@@ -930,10 +931,14 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
 #endif
 	account_kernel_stack(tsk, 1);
 
-	err = scs_prepare(tsk, node);
+	err = smp_task_ipi_mask_alloc(tsk);
 	if (err)
 		goto free_stack;
 
+	err = scs_prepare(tsk, node);
+	if (err)
+		goto free_ipi_mask;
+
 #ifdef CONFIG_SECCOMP
 	/*
 	 * We must handle setting up seccomp filters once we're under
@@ -1004,6 +1009,8 @@ static struct task_struct *dup_task_struct(struct task_struct *orig, int node)
 #endif
 	return tsk;
 
+free_ipi_mask:
+	smp_task_ipi_mask_free(tsk);
 free_stack:
 	exit_task_stack_account(tsk);
 	free_thread_stack(tsk);
diff --git a/kernel/smp.c b/kernel/smp.c
index 80daf9dd4a25..446e3f80007e 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -785,6 +785,44 @@ int smp_call_function_any(const struct cpumask *mask,
 }
 EXPORT_SYMBOL_GPL(smp_call_function_any);
 
+static DEFINE_STATIC_KEY_FALSE(ipi_mask_inlined);
+
+#ifdef CONFIG_PREEMPTION
+
+int smp_task_ipi_mask_alloc(struct task_struct *task)
+{
+	if (static_branch_unlikely(&ipi_mask_inlined))
+		return 0;
+
+	task->ipi_mask_ptr = kmalloc(cpumask_size(), GFP_KERNEL);
+	if (!task->ipi_mask_ptr)
+		return -ENOMEM;
+
+	return 0;
+}
+
+void smp_task_ipi_mask_free(struct task_struct *task)
+{
+	if (static_branch_unlikely(&ipi_mask_inlined))
+		return;
+
+	kfree(task->ipi_mask_ptr);
+}
+
+cpumask_t *smp_task_ipi_mask(struct task_struct *cur)
+{
+	/*
+	 * If cpumask_size() is smaller than or equal to the pointer
+	 * size, it stashes the cpumask in the pointer itself to
+	 * avoid extra memory allocations.
+	 */
+	if (static_branch_unlikely(&ipi_mask_inlined))
+		return (cpumask_t *)&cur->ipi_mask_val;
+
+	return cur->ipi_mask_ptr;
+}
+#endif
+
 /*
  * Flags to be used as scf_flags argument of smp_call_function_many_cond().
  *
@@ -802,11 +840,18 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 	int cpu, last_cpu, this_cpu = smp_processor_id();
 	struct call_function_data *cfd;
 	bool wait = scf_flags & SCF_WAIT;
+	struct cpumask *cpumask, *task_mask;
+	bool preemptible_wait;
 	int nr_cpus = 0;
 	bool run_remote = false;
 
 	lockdep_assert_preemption_disabled();
 
+	task_mask = smp_task_ipi_mask(current);
+	preemptible_wait = task_mask && preemptible();
+	cfd = this_cpu_ptr(&cfd_data);
+	cpumask = preemptible_wait ? task_mask : cfd->cpumask;
+
	/*
	 * Can deadlock when called with interrupts disabled.
	 * We allow cpu's that are not yet online though, as no one else can
@@ -827,16 +872,15 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 
 	/* Check if we need remote execution, i.e., any CPU excluding this one. */
 	if (cpumask_any_and_but(mask, cpu_online_mask, this_cpu) < nr_cpu_ids) {
-		cfd = this_cpu_ptr(&cfd_data);
-		cpumask_and(cfd->cpumask, mask, cpu_online_mask);
-		__cpumask_clear_cpu(this_cpu, cfd->cpumask);
+		cpumask_and(cpumask, mask, cpu_online_mask);
+		__cpumask_clear_cpu(this_cpu, cpumask);
 
 		cpumask_clear(cfd->cpumask_ipi);
-		for_each_cpu(cpu, cfd->cpumask) {
+		for_each_cpu(cpu, cpumask) {
 			call_single_data_t *csd = per_cpu_ptr(cfd->csd, cpu);
 
 			if (cond_func && !cond_func(cpu, info)) {
-				__cpumask_clear_cpu(cpu, cfd->cpumask);
+				__cpumask_clear_cpu(cpu, cpumask);
 				continue;
 			}
 
@@ -887,7 +931,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 	}
 
 	if (run_remote && wait) {
-		for_each_cpu(cpu, cfd->cpumask) {
+		for_each_cpu(cpu, cpumask) {
 			call_single_data_t *csd;
 
 			csd = per_cpu_ptr(cfd->csd, cpu);
@@ -1003,6 +1047,9 @@ EXPORT_SYMBOL(nr_cpu_ids);
 void __init setup_nr_cpu_ids(void)
 {
 	set_nr_cpu_ids(find_last_bit(cpumask_bits(cpu_possible_mask), NR_CPUS) + 1);
+
+	if (IS_ENABLED(CONFIG_PREEMPTION) && cpumask_size() <= sizeof(unsigned long))
+		static_branch_enable(&ipi_mask_inlined);
 }
 
 /* Called by boot processor to activate the rest. */
-- 
2.20.1
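Note on the boot-time condition in setup_nr_cpu_ids(): cpumask_size() <= sizeof(unsigned long) holds exactly when the configured bit count fits in one machine word, because the mask size is rounded up to whole unsigned longs. A rough userspace model of that rounding, where mask_size() and ipi_mask_can_be_inlined() are hypothetical stand-ins for cpumask_size() and the static-key check, not kernel APIs:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Model of cpumask_size(): round a bit count up to a whole number of
 * unsigned longs, the same granularity the kernel's bitmaps use.
 */
static size_t mask_size(unsigned int nbits)
{
	size_t bits_per_long = 8 * sizeof(unsigned long);

	return (nbits + bits_per_long - 1) / bits_per_long
		* sizeof(unsigned long);
}

/*
 * The inline (pointer-stashed) representation is possible precisely
 * when the rounded size fits in a single pointer-sized word, i.e.
 * when nbits <= BITS_PER_LONG.
 */
static int ipi_mask_can_be_inlined(unsigned int nbits)
{
	return mask_size(nbits) <= sizeof(unsigned long);
}
```

So on a typical 64-bit build the static key is enabled only for configurations with at most 64 possible CPUs; one CPU more and every task falls back to the kmalloc'd mask.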