From nobody Mon Apr 6 21:27:07 2026
Date: Wed, 18 Mar 2026 12:56:32 +0800
From: "Chuyi Zhou"
Subject: [PATCH v3 06/12] smp: Enable preemption early in smp_call_function_many_cond
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20260318045638.1572777-7-zhouchuyi@bytedance.com>
In-Reply-To: <20260318045638.1572777-1-zhouchuyi@bytedance.com>
References: <20260318045638.1572777-1-zhouchuyi@bytedance.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Currently, smp_call_function_many_cond() disables preemption mainly for
the following reasons:

- To prevent the remote online CPU from going offline. Specifically, we
  want to ensure that no new csds are queued after smpcfd_dying_cpu()
  has finished. Therefore, preemption must be disabled until all
  necessary IPIs are sent.

- To prevent migration to another CPU, which also implicitly prevents
  the current CPU from going offline (since stop_machine requires
  preempting the current task to execute offline callbacks).

- To protect the per-cpu cfd_data from concurrent modification by other
  smp_call_*() calls on the current CPU. cfd_data contains cpumasks and
  per-cpu csds. Before enqueueing a csd, we block on csd_lock() to
  ensure the previous async csd->func() has completed, and then
  initialize csd->func and csd->info. After sending the IPI, we
  spin-wait for the remote CPU to call csd_unlock().
Actually, the csd_lock mechanism already guarantees csd serialization.
If preemption occurs during csd_lock_wait(), other concurrent
smp_call_function_many_cond() calls will simply block until the
previous csd->func() completes:

  task A                         task B

  csd->func = func_a
  send ipis
  preempted by B  --------------->
                                 csd_lock(csd); // block until last
                                                // func_a finished
                                 csd->func = func_b;
                                 csd->info = info;
                                 ...
                                 send ipis
  switch back to A <---------------
  csd_lock_wait(csd); // block until remote finishes func_*

This patch enables preemption before csd_lock_wait(), which makes the
potentially unpredictable csd_lock_wait() preemptible and migratable.

Note that being migrated to another CPU and calling csd_lock_wait()
may cause a UAF due to smpcfd_dead_cpu() running during the current
CPU's offline process. The previous patch used the RCU mechanism to
synchronize csd_lock_wait() with smpcfd_dead_cpu() to prevent the
above UAF issue.

Signed-off-by: Chuyi Zhou
---
 kernel/smp.c | 25 ++++++++++++++++++++-----
 1 file changed, 20 insertions(+), 5 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 32c293d8be0e..18e7e4a8f1b6 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -801,7 +801,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 				  smp_cond_func_t cond_func)
 {
 	bool preemptible_wait = !IS_ENABLED(CONFIG_CPUMASK_OFFSTACK);
-	int cpu, last_cpu, this_cpu = smp_processor_id();
+	int cpu, last_cpu, this_cpu;
 	struct call_function_data *cfd;
 	bool wait = scf_flags & SCF_WAIT;
 	cpumask_var_t cpumask_stack;
@@ -809,9 +809,9 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 	int nr_cpus = 0;
 	bool run_remote = false;
 
-	lockdep_assert_preemption_disabled();
-
 	rcu_read_lock();
+	this_cpu = get_cpu();
+
 	cfd = this_cpu_ptr(&cfd_data);
 	cpumask = cfd->cpumask;
 
@@ -898,6 +898,19 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 		local_irq_restore(flags);
 	}
 
+	/*
+	 * We may block in csd_lock_wait() for a significant amount of time,
+	 * especially when interrupts are disabled or with a large number of
+	 * remote CPUs. Try to enable preemption before csd_lock_wait().
+	 *
+	 * Use the cpumask_stack instead of cfd->cpumask to avoid concurrent
+	 * modification from tasks on the same cpu. If preemption occurs during
+	 * csd_lock_wait, other concurrent smp_call_function_many_cond() calls
+	 * will simply block until the previous csd->func() completes.
+	 */
+	if (preemptible_wait)
+		put_cpu();
+
 	if (run_remote && wait) {
 		for_each_cpu(cpu, cpumask) {
 			call_single_data_t *csd;
@@ -907,9 +920,11 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 		}
 	}
 
-	rcu_read_unlock();
-	if (preemptible_wait)
+	if (!preemptible_wait)
+		put_cpu();
+	else
 		free_cpumask_var(cpumask_stack);
+	rcu_read_unlock();
 }
 
 /**
-- 
2.20.1