From nobody Thu Apr 16 06:57:21 2026
From: "Chuyi Zhou"
Date: Mon, 2 Mar 2026 15:52:05 +0800
Subject: [PATCH v2 01/12] smp: Disable preemption explicitly in __csd_lock_wait
Message-Id: <20260302075216.2170675-2-zhouchuyi@bytedance.com>
In-Reply-To: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
References: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Later patches in this series will enable preemption before
csd_lock_wait(), which could break csdlock_debug: if the waiting task is
preempted, the time slices of other tasks on the CPU may be accounted
between the ktime_get_mono_fast_ns() calls. Disable preemption explicitly
in __csd_lock_wait().

This is a preparation for the next patches.
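
For readers unfamiliar with guard(preempt)(): it is built on the compiler's
cleanup attribute (see include/linux/cleanup.h), so preemption is re-enabled
on every exit path of the enclosing scope. The following is only a userspace
sketch of that idea, not kernel code. All mock_* names and MOCK_PREEMPT_GUARD
are hypothetical stand-ins, and it assumes GCC or Clang.

```c
#include <assert.h>

/* Hypothetical stand-in for the kernel's preemption counter. */
static int mock_preempt_count;

static void mock_preempt_disable(void) { mock_preempt_count++; }
static void mock_preempt_enable(void)  { mock_preempt_count--; }

/* Cleanup handler: runs automatically when the guard variable leaves scope. */
static void mock_preempt_guard_exit(int *guard)
{
	(void)guard;
	mock_preempt_enable();
}

/*
 * Hypothetical userspace analogue of guard(preempt)(): "disable preemption"
 * now and re-enable it on every exit path of the enclosing scope.
 */
#define MOCK_PREEMPT_GUARD() \
	int mock_guard __attribute__((cleanup(mock_preempt_guard_exit), unused)) = \
		(mock_preempt_disable(), 0)

/* Returns the counter value observed while the guard is held. */
static int mock_csd_lock_wait(int spins)
{
	MOCK_PREEMPT_GUARD();

	while (spins-- > 0) {
		if (spins == 1)
			return mock_preempt_count;	/* early return still releases */
	}
	return mock_preempt_count;
}
```

Calling mock_csd_lock_wait() should observe a non-zero counter inside the
function and a balanced counter after it returns, whichever return statement
is taken. That is the property the one-line guard(preempt)() relies on.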

Signed-off-by: Chuyi Zhou
Acked-by: Muchun Song
---
 kernel/smp.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/smp.c b/kernel/smp.c
index f349960f79ca..fc1f7a964616 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -323,6 +323,8 @@ static void __csd_lock_wait(call_single_data_t *csd)
 	int bug_id = 0;
 	u64 ts0, ts1;
 
+	guard(preempt)();
+
 	ts1 = ts0 = ktime_get_mono_fast_ns();
 	for (;;) {
 		if (csd_lock_wait_toolong(csd, ts0, &ts1, &bug_id, &nmessages))
-- 
2.20.1

From nobody Thu Apr 16 06:57:21 2026
From: "Chuyi Zhou"
Date: Mon, 2 Mar 2026 15:52:06 +0800
Subject: [PATCH v2 02/12] smp: Enable preemption early in smp_call_function_single
Message-Id: <20260302075216.2170675-3-zhouchuyi@bytedance.com>
In-Reply-To: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
References: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

smp_call_function_single() currently disables preemption mainly for the
following reasons:

- To protect the per-cpu csd_data from concurrent modification by other
  tasks on the current CPU in the !wait case. In the wait case,
  synchronization is not a concern because an on-stack csd is used.

- To prevent the remote online CPU from being offlined. Specifically, we
  want to ensure that no new IPIs are queued after smpcfd_dying_cpu() has
  finished.

Disabling preemption for the entire execution is unnecessary; in
particular, the csd_lock_wait() part does not require preemption
protection. This patch enables preemption before csd_lock_wait() to
shorten the preemption-disabled critical section.

Signed-off-by: Chuyi Zhou
Reviewed-by: Muchun Song
---
 kernel/smp.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index fc1f7a964616..b603d4229f95 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -685,11 +685,16 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
 
 	err = generic_exec_single(cpu, csd);
 
+	/*
+	 * @csd is stack-allocated when @wait is true. No concurrent access
+	 * except from the IPI completion path, so we can re-enable preemption
+	 * early to reduce latency.
+	 */
+	put_cpu();
+
 	if (wait)
 		csd_lock_wait(csd);
 
-	put_cpu();
-
 	return err;
 }
 EXPORT_SYMBOL(smp_call_function_single);
-- 
2.20.1

From nobody Thu Apr 16 06:57:21 2026
From: "Chuyi Zhou"
Date: Mon, 2 Mar 2026 15:52:07 +0800
Subject: [PATCH v2 03/12] smp: Remove get_cpu from smp_call_function_any
Message-Id: <20260302075216.2170675-4-zhouchuyi@bytedance.com>
In-Reply-To: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
References: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

smp_call_function_single() now enables preemption before csd_lock_wait()
to shorten the critical section. To let callers of smp_call_function_any()
benefit from this optimization as well, stop holding get_cpu()/put_cpu()
across the call when a remote CPU is selected. The pair is kept only when
the current CPU is the target, to prevent migration after it has been
chosen.

Signed-off-by: Chuyi Zhou
Reviewed-by: Muchun Song
---
 kernel/smp.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index b603d4229f95..80daf9dd4a25 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -761,16 +761,26 @@ EXPORT_SYMBOL_GPL(smp_call_function_single_async);
 int smp_call_function_any(const struct cpumask *mask,
 			  smp_call_func_t func, void *info, int wait)
 {
+	bool local = true;
 	unsigned int cpu;
 	int ret;
 
-	/* Try for same CPU (cheapest) */
+	/*
+	 * Prevent migration to another CPU after selecting the current CPU
+	 * as the target.
+	 */
 	cpu = get_cpu();
-	if (!cpumask_test_cpu(cpu, mask))
+
+	/* Try for same CPU (cheapest) */
+	if (!cpumask_test_cpu(cpu, mask)) {
 		cpu = sched_numa_find_nth_cpu(mask, 0, cpu_to_node(cpu));
+		local = false;
+		put_cpu();
+	}
 
 	ret = smp_call_function_single(cpu, func, info, wait);
-	put_cpu();
+	if (local)
+		put_cpu();
 	return ret;
 }
 EXPORT_SYMBOL_GPL(smp_call_function_any);
-- 
2.20.1

From nobody Thu Apr 16 06:57:21 2026
From: "Chuyi Zhou"
Date: Mon, 2 Mar 2026 15:52:08 +0800
Subject: [PATCH v2 04/12] smp: Use on-stack cpumask in smp_call_function_many_cond
Message-Id: <20260302075216.2170675-5-zhouchuyi@bytedance.com>
In-Reply-To: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
References: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Use an on-stack cpumask instead of the percpu cfd cpumask in
smp_call_function_many_cond(). Note that when both CONFIG_CPUMASK_OFFSTACK
and PREEMPT_RT are enabled, allocating in a preempt-disabled section would
break RT. Therefore, only do this when CONFIG_CPUMASK_OFFSTACK=n.

This is a preparation for enabling preemption during csd_lock_wait() in
smp_call_function_many_cond().

Signed-off-by: Chuyi Zhou
Reviewed-by: Muchun Song
---
 kernel/smp.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 80daf9dd4a25..9728ba55944d 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -799,14 +799,25 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 					unsigned int scf_flags,
 					smp_cond_func_t cond_func)
 {
+	bool preemptible_wait = !IS_ENABLED(CONFIG_CPUMASK_OFFSTACK);
 	int cpu, last_cpu, this_cpu = smp_processor_id();
 	struct call_function_data *cfd;
 	bool wait = scf_flags & SCF_WAIT;
+	cpumask_var_t cpumask_stack;
+	struct cpumask *cpumask;
 	int nr_cpus = 0;
 	bool run_remote = false;
 
 	lockdep_assert_preemption_disabled();
 
+	cfd = this_cpu_ptr(&cfd_data);
+	cpumask = cfd->cpumask;
+
+	if (preemptible_wait) {
+		BUILD_BUG_ON(!alloc_cpumask_var(&cpumask_stack, GFP_ATOMIC));
+		cpumask = cpumask_stack;
+	}
+
 	/*
 	 * Can deadlock when called with interrupts disabled.
 	 * We allow cpu's that are not yet online though, as no one else can
@@ -827,16 +838,15 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 
 	/* Check if we need remote execution, i.e., any CPU excluding this one. */
 	if (cpumask_any_and_but(mask, cpu_online_mask, this_cpu) < nr_cpu_ids) {
-		cfd = this_cpu_ptr(&cfd_data);
-		cpumask_and(cfd->cpumask, mask, cpu_online_mask);
-		__cpumask_clear_cpu(this_cpu, cfd->cpumask);
+		cpumask_and(cpumask, mask, cpu_online_mask);
+		__cpumask_clear_cpu(this_cpu, cpumask);
 
 		cpumask_clear(cfd->cpumask_ipi);
-		for_each_cpu(cpu, cfd->cpumask) {
+		for_each_cpu(cpu, cpumask) {
 			call_single_data_t *csd = per_cpu_ptr(cfd->csd, cpu);
 
 			if (cond_func && !cond_func(cpu, info)) {
-				__cpumask_clear_cpu(cpu, cfd->cpumask);
+				__cpumask_clear_cpu(cpu, cpumask);
 				continue;
 			}
 
@@ -887,13 +897,16 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 	}
 
 	if (run_remote && wait) {
-		for_each_cpu(cpu, cfd->cpumask) {
+		for_each_cpu(cpu, cpumask) {
 			call_single_data_t *csd;
 
 			csd = per_cpu_ptr(cfd->csd, cpu);
 			csd_lock_wait(csd);
 		}
 	}
+
+	if (preemptible_wait)
+		free_cpumask_var(cpumask_stack);
 }
 
 /**
-- 
2.20.1

From nobody Thu Apr 16 06:57:21 2026
From: "Chuyi Zhou"
Date: Mon, 2 Mar 2026 15:52:09 +0800
Subject: [PATCH v2 05/12] smp: Free call_function_data via RCU in smpcfd_dead_cpu
Message-Id: <20260302075216.2170675-6-zhouchuyi@bytedance.com>
In-Reply-To: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
References: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Use rcu_read_lock() to protect the whole scope of
smp_call_function_many_cond(), and wait for all read-side critical
sections to exit before releasing the percpu csd data.

This is a preparation for enabling preemption during csd_lock_wait().

Signed-off-by: Chuyi Zhou
---
 kernel/smp.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/smp.c b/kernel/smp.c
index 9728ba55944d..ad6073b71bbd 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -77,6 +77,7 @@ int smpcfd_dead_cpu(unsigned int cpu)
 {
 	struct call_function_data *cfd = &per_cpu(cfd_data, cpu);
 
+	synchronize_rcu();
 	free_cpumask_var(cfd->cpumask);
 	free_cpumask_var(cfd->cpumask_ipi);
 	free_percpu(cfd->csd);
@@ -810,6 +811,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 
 	lockdep_assert_preemption_disabled();
 
+	rcu_read_lock();
 	cfd = this_cpu_ptr(&cfd_data);
 	cpumask = cfd->cpumask;
 
@@ -907,6 +909,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 
 	if (preemptible_wait)
 		free_cpumask_var(cpumask_stack);
+	rcu_read_unlock();
 }
 
 /**
-- 
2.20.1

From nobody Thu Apr 16 06:57:21 2026
From: "Chuyi Zhou"
Date: Mon, 2 Mar 2026 15:52:10 +0800
Subject: [PATCH v2 06/12] smp: Enable preemption early in smp_call_function_many_cond
Message-Id: <20260302075216.2170675-7-zhouchuyi@bytedance.com>
In-Reply-To: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
References: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

smp_call_function_many_cond() currently disables preemption mainly for the
following reasons:

- To prevent the remote online CPU from going offline. Specifically, we
  want to ensure that no new csds are queued after smpcfd_dying_cpu() has
  finished. Therefore, preemption must be disabled until all necessary
  IPIs are sent.

- To prevent migration to another CPU, which also implicitly prevents the
  current CPU from going offline (since stop_machine requires preempting
  the current task to execute offline callbacks).

- To protect the per-cpu cfd_data from concurrent modification by other
  smp_call_*() on the current CPU. cfd_data contains cpumasks and per-cpu
  csds. Before enqueueing a csd, we block on csd_lock() to ensure the
  previous async csd->func() has completed, and then initialize csd->func
  and csd->info. After sending the IPI, we spin-wait for the remote CPU to
  call csd_unlock(). In fact, the csd_lock mechanism already guarantees
  csd serialization. If preemption occurs during csd_lock_wait(), other
  concurrent smp_call_function_many_cond() calls will simply block until
  the previous csd->func() completes:

    task A                          task B

    csd->func = func_a
    send ipis

                preempted by B
               --------------->
                                    csd_lock(csd); // block until last
                                                   // func_a finished

                                    csd->func = func_b;
                                    csd->info = info;
                                        ...
                                    send ipis

                switch back to A
               <---------------

    csd_lock_wait(csd); // block until remote finish func_*

This patch enables preemption before csd_lock_wait(), which makes the
potentially unpredictable csd_lock_wait() preemptible and migratable.
Note that being migrated to another CPU and then calling csd_lock_wait()
may cause a use-after-free (UAF), because smpcfd_dead_cpu() frees the
percpu csd data while the original CPU goes offline. The previous patch
uses RCU to synchronize csd_lock_wait() with smpcfd_dead_cpu() to prevent
this UAF.

Signed-off-by: Chuyi Zhou
---
 kernel/smp.c | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index ad6073b71bbd..18e7e4a8f1b6 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -801,7 +801,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 					smp_cond_func_t cond_func)
 {
 	bool preemptible_wait = !IS_ENABLED(CONFIG_CPUMASK_OFFSTACK);
-	int cpu, last_cpu, this_cpu = smp_processor_id();
+	int cpu, last_cpu, this_cpu;
 	struct call_function_data *cfd;
 	bool wait = scf_flags & SCF_WAIT;
 	cpumask_var_t cpumask_stack;
@@ -809,9 +809,9 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 	int nr_cpus = 0;
 	bool run_remote = false;
 
-	lockdep_assert_preemption_disabled();
-
 	rcu_read_lock();
+	this_cpu = get_cpu();
+
 	cfd = this_cpu_ptr(&cfd_data);
 	cpumask = cfd->cpumask;
 
@@ -898,6 +898,19 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 		local_irq_restore(flags);
 	}
 
+	/*
+	 * We may block in csd_lock_wait() for a significant amount of time,
+	 * especially when interrupts are disabled or with a large number of
+	 * remote CPUs. Try to enable preemption before csd_lock_wait().
+	 *
+	 * Use the cpumask_stack instead of cfd->cpumask to avoid concurrent
+	 * modification from tasks on the same cpu. If preemption occurs during
+	 * csd_lock_wait, other concurrent smp_call_function_many_cond() calls
+	 * will simply block until the previous csd->func() completes.
+	 */
+	if (preemptible_wait)
+		put_cpu();
+
 	if (run_remote && wait) {
 		for_each_cpu(cpu, cpumask) {
 			call_single_data_t *csd;
@@ -907,7 +920,9 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 		}
 	}
 
-	if (preemptible_wait)
+	if (!preemptible_wait)
+		put_cpu();
+	else
 		free_cpumask_var(cpumask_stack);
 	rcu_read_unlock();
 }
-- 
2.20.1

From nobody Thu Apr 16 06:57:21 2026
From: "Chuyi Zhou"
Date: Mon, 2 Mar 2026 15:52:11 +0800
Subject: [PATCH v2 07/12] smp: Remove preempt_disable from smp_call_function
Message-Id: <20260302075216.2170675-8-zhouchuyi@bytedance.com>
In-Reply-To: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
References: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

smp_call_function_many_cond() now handles the preemption logic internally,
so smp_call_function() no longer needs to disable preemption explicitly.
Remove preempt_{enable, disable} from smp_call_function().
Signed-off-by: Chuyi Zhou
Reviewed-by: Muchun Song
---
 kernel/smp.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 18e7e4a8f1b6..f9c0028968ef 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -966,9 +966,8 @@ EXPORT_SYMBOL(smp_call_function_many);
  */
 void smp_call_function(smp_call_func_t func, void *info, int wait)
 {
-	preempt_disable();
-	smp_call_function_many(cpu_online_mask, func, info, wait);
-	preempt_enable();
+	smp_call_function_many_cond(cpu_online_mask, func, info,
+				    wait ? SCF_WAIT : 0, NULL);
 }
 EXPORT_SYMBOL(smp_call_function);
 
-- 
2.20.1

From nobody Thu Apr 16 06:57:21 2026
From: "Chuyi Zhou"
Date: Mon, 2 Mar 2026 15:52:12 +0800
Subject: [PATCH v2 08/12] smp: Remove preempt_disable from on_each_cpu_cond_mask
Message-Id: <20260302075216.2170675-9-zhouchuyi@bytedance.com>
In-Reply-To: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
References: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Now that smp_call_function_many_cond() handles the preemption logic
internally, on_each_cpu_cond_mask() no longer needs to disable preemption
explicitly.

Remove preempt_{enable, disable} from on_each_cpu_cond_mask().

Signed-off-by: Chuyi Zhou
Reviewed-by: Muchun Song
---
 kernel/smp.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index f9c0028968ef..47c3b057f57f 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -1086,9 +1086,7 @@ void on_each_cpu_cond_mask(smp_cond_func_t cond_func, smp_call_func_t func,
 	if (wait)
 		scf_flags |= SCF_WAIT;
 
-	preempt_disable();
 	smp_call_function_many_cond(mask, func, info, scf_flags, cond_func);
-	preempt_enable();
 }
 EXPORT_SYMBOL(on_each_cpu_cond_mask);
 
-- 
2.20.1

From nobody Thu Apr 16 06:57:21 2026
From: "Chuyi Zhou"
Date: Mon, 2 Mar 2026 15:52:13 +0800
Subject: [PATCH v2 09/12] scftorture: Remove preempt_disable in scftorture_invoke_one
Message-Id: <20260302075216.2170675-10-zhouchuyi@bytedance.com>
In-Reply-To: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
References: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Previous patches made smp_call*() handle the preemption logic internally,
so the preempt_disable() in most callers is now unnecessary and can be
removed.

Remove preempt_{enable, disable} from scftorture_invoke_one().

Signed-off-by: Chuyi Zhou
---
 kernel/scftorture.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/kernel/scftorture.c b/kernel/scftorture.c
index 327c315f411c..b87215e40be5 100644
--- a/kernel/scftorture.c
+++ b/kernel/scftorture.c
@@ -364,8 +364,6 @@ static void scftorture_invoke_one(struct scf_statistics *scfp, struct torture_ra
 	}
 	if (use_cpus_read_lock)
 		cpus_read_lock();
-	else
-		preempt_disable();
 	switch (scfsp->scfs_prim) {
 	case SCF_PRIM_RESCHED:
 		if (IS_BUILTIN(CONFIG_SCF_TORTURE_TEST)) {
@@ -411,13 +409,10 @@ static void scftorture_invoke_one(struct scf_statistics *scfp, struct torture_ra
 		if (!ret) {
 			if (use_cpus_read_lock)
 				cpus_read_unlock();
-			else
-				preempt_enable();
+
 			wait_for_completion(&scfcp->scfc_completion);
 			if (use_cpus_read_lock)
 				cpus_read_lock();
-			else
-				preempt_disable();
 		} else {
 			scfp->n_single_rpc_ofl++;
 			scf_add_to_free_list(scfcp);
@@ -463,8 +458,6 @@ static void scftorture_invoke_one(struct scf_statistics *scfp, struct torture_ra
 	}
 	if (use_cpus_read_lock)
 		cpus_read_unlock();
-	else
-		preempt_enable();
 	if (allocfail)
 		schedule_timeout_idle((1 + longwait) * HZ); // Let no-wait handlers complete.
 	else if (!(torture_random(trsp) & 0xfff))
-- 
2.20.1

From nobody Thu Apr 16 06:57:21 2026
From: "Chuyi Zhou"
Date: Mon, 2 Mar 2026 15:52:14 +0800
Subject: [PATCH v2 10/12] x86/mm: Move flush_tlb_info back to the stack
Message-Id: <20260302075216.2170675-11-zhouchuyi@bytedance.com>
In-Reply-To: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
References: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Commit 3db6d5a5ecaf ("x86/mm/tlb: Remove 'struct flush_tlb_info' from the
stack") converted flush_tlb_info from a stack variable to a per-CPU
variable. This brought a performance improvement of around 3% in an
extreme test. However, it also required all flush_tlb* operations to keep
preemption disabled for their entire duration, to prevent concurrent
modification of flush_tlb_info.

flush_tlb* needs to send IPIs to remote CPUs and synchronously wait for
all of them to complete their local TLB flushes. This can take tens of
milliseconds when interrupts are disabled or when there is a large number
of remote CPUs.

To improve the kernel's real-time behavior, move flush_tlb_info back onto
the stack. This prepares for enabling preemption during TLB flushes in
the next patch.

Signed-off-by: Chuyi Zhou
---
 arch/x86/mm/tlb.c | 124 ++++++++++++++++++----------------------------
 1 file changed, 49 insertions(+), 75 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 621e09d049cb..91a0fb389303 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1394,71 +1394,30 @@ void flush_tlb_multi(const struct cpumask *cpumask,
  */
 unsigned long tlb_single_page_flush_ceiling __read_mostly = 33;
 
-static DEFINE_PER_CPU_SHARED_ALIGNED(struct flush_tlb_info, flush_tlb_info);
-
-#ifdef CONFIG_DEBUG_VM
-static DEFINE_PER_CPU(unsigned int, flush_tlb_info_idx);
-#endif
-
-static struct flush_tlb_info *get_flush_tlb_info(struct mm_struct *mm,
-			unsigned long start, unsigned long end,
-			unsigned int stride_shift, bool freed_tables,
-			u64 new_tlb_gen)
+void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
+				unsigned long end, unsigned int stride_shift,
+				bool freed_tables)
 {
-	struct flush_tlb_info *info = this_cpu_ptr(&flush_tlb_info);
+	int cpu = get_cpu();
 
-#ifdef CONFIG_DEBUG_VM
-	/*
-	 * Ensure that the following code is non-reentrant and flush_tlb_info
-	 * is not overwritten. This means no TLB flushing is initiated by
-	 * interrupt handlers and machine-check exception handlers.
-	 */
-	BUG_ON(this_cpu_inc_return(flush_tlb_info_idx) != 1);
-#endif
+	struct flush_tlb_info info = {
+		.mm = mm,
+		.stride_shift = stride_shift,
+		.freed_tables = freed_tables,
+		.trim_cpumask = 0,
+		.initiating_cpu = cpu,
+	};
 
-	/*
-	 * If the number of flushes is so large that a full flush
-	 * would be faster, do a full flush.
-	 */
 	if ((end - start) >> stride_shift > tlb_single_page_flush_ceiling) {
 		start = 0;
 		end = TLB_FLUSH_ALL;
 	}
 
-	info->start = start;
-	info->end = end;
-	info->mm = mm;
-	info->stride_shift = stride_shift;
-	info->freed_tables = freed_tables;
-	info->new_tlb_gen = new_tlb_gen;
-	info->initiating_cpu = smp_processor_id();
-	info->trim_cpumask = 0;
-
-	return info;
-}
-
-static void put_flush_tlb_info(void)
-{
-#ifdef CONFIG_DEBUG_VM
-	/* Complete reentrancy prevention checks */
-	barrier();
-	this_cpu_dec(flush_tlb_info_idx);
-#endif
-}
-
-void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
-				unsigned long end, unsigned int stride_shift,
-				bool freed_tables)
-{
-	struct flush_tlb_info *info;
-	int cpu = get_cpu();
-	u64 new_tlb_gen;
-
 	/* This is also a barrier that synchronizes with switch_mm(). */
-	new_tlb_gen = inc_mm_tlb_gen(mm);
+	info.new_tlb_gen = inc_mm_tlb_gen(mm);
 
-	info = get_flush_tlb_info(mm, start, end, stride_shift, freed_tables,
-				  new_tlb_gen);
+	info.start = start;
+	info.end = end;
 
 	/*
 	 * flush_tlb_multi() is not optimized for the common case in which only
@@ -1466,19 +1425,18 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 	 * flush_tlb_func_local() directly in this case.
 	 */
 	if (mm_global_asid(mm)) {
-		broadcast_tlb_flush(info);
+		broadcast_tlb_flush(&info);
 	} else if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids) {
-		info->trim_cpumask = should_trim_cpumask(mm);
-		flush_tlb_multi(mm_cpumask(mm), info);
+		info.trim_cpumask = should_trim_cpumask(mm);
+		flush_tlb_multi(mm_cpumask(mm), &info);
 		consider_global_asid(mm);
 	} else if (mm == this_cpu_read(cpu_tlbstate.loaded_mm)) {
 		lockdep_assert_irqs_enabled();
 		local_irq_disable();
-		flush_tlb_func(info);
+		flush_tlb_func(&info);
 		local_irq_enable();
 	}
 
-	put_flush_tlb_info();
 	put_cpu();
 	mmu_notifier_arch_invalidate_secondary_tlbs(mm, start, end);
 }
@@ -1548,19 +1506,29 @@ static void kernel_tlb_flush_range(struct flush_tlb_info *info)
 
 void flush_tlb_kernel_range(unsigned long start, unsigned long end)
 {
-	struct flush_tlb_info *info;
+	struct flush_tlb_info info = {
+		.mm = NULL,
+		.stride_shift = PAGE_SHIFT,
+		.freed_tables = false,
+		.trim_cpumask = 0,
+		.new_tlb_gen = TLB_GENERATION_INVALID
+	};
 
 	guard(preempt)();
 
-	info = get_flush_tlb_info(NULL, start, end, PAGE_SHIFT, false,
-				  TLB_GENERATION_INVALID);
+	if ((end - start) >> PAGE_SHIFT > tlb_single_page_flush_ceiling) {
+		start = 0;
+		end = TLB_FLUSH_ALL;
+	}
 
-	if (info->end == TLB_FLUSH_ALL)
-		kernel_tlb_flush_all(info);
-	else
-		kernel_tlb_flush_range(info);
+	info.initiating_cpu = smp_processor_id();
+	info.start = start;
+	info.end = end;
 
-	put_flush_tlb_info();
+	if (info.end == TLB_FLUSH_ALL)
+		kernel_tlb_flush_all(&info);
+	else
+		kernel_tlb_flush_range(&info);
 }
 
 /*
@@ -1728,12 +1696,19 @@ EXPORT_SYMBOL_FOR_KVM(__flush_tlb_all);
 
 void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 {
-	struct flush_tlb_info *info;
-
 	int cpu = get_cpu();
 
-	info = get_flush_tlb_info(NULL, 0, TLB_FLUSH_ALL, 0, false,
-				  TLB_GENERATION_INVALID);
+	struct flush_tlb_info info = {
+		.start = 0,
+		.end = TLB_FLUSH_ALL,
+		.mm = NULL,
+		.stride_shift = 0,
+		.freed_tables = false,
+		.new_tlb_gen = TLB_GENERATION_INVALID,
+		.initiating_cpu = cpu,
+		.trim_cpumask = 0,
+	};
+
 	/*
 	 * flush_tlb_multi() is not optimized for the common case in which only
 	 * a local TLB flush is needed. Optimize this use-case by calling
@@ -1743,17 +1718,16 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 		invlpgb_flush_all_nonglobals();
 		batch->unmapped_pages = false;
 	} else if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) {
-		flush_tlb_multi(&batch->cpumask, info);
+		flush_tlb_multi(&batch->cpumask, &info);
 	} else if (cpumask_test_cpu(cpu, &batch->cpumask)) {
 		lockdep_assert_irqs_enabled();
 		local_irq_disable();
-		flush_tlb_func(info);
+		flush_tlb_func(&info);
 		local_irq_enable();
 	}
 
 	cpumask_clear(&batch->cpumask);
 
-	put_flush_tlb_info();
 	put_cpu();
 }
 
-- 
2.20.1

From nobody Thu Apr 16 06:57:21 2026
From: "Chuyi Zhou"
Date: Mon, 2 Mar 2026 15:52:15 +0800
Subject: [PATCH v2 11/12] x86/mm: Enable preemption during native_flush_tlb_multi
Message-Id: <20260302075216.2170675-12-zhouchuyi@bytedance.com>
In-Reply-To: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
References: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

flush_tlb_mm_range()/arch_tlbbatch_flush() -> native_flush_tlb_multi() is
a common path in real production
environments. When pages are reclaimed or a process exits,
native_flush_tlb_multi() sends IPIs to remote CPUs and waits for all of
them to complete their local TLB flushes. The overall latency may reach
tens of milliseconds due to a large number of remote CPUs and other
factors (such as interrupts being disabled). Because
flush_tlb_mm_range()/arch_tlbbatch_flush() always disable preemption, this
may increase scheduling latency for other threads on the current CPU.

The previous patch converted flush_tlb_info from a per-CPU variable to an
on-stack variable. Additionally, it is no longer necessary to explicitly
disable preemption before calling smp_call*(), since they handle the
preemption logic internally. It is now safe to enable preemption during
native_flush_tlb_multi().

Signed-off-by: Chuyi Zhou
---
 arch/x86/kernel/kvm.c | 4 +++-
 arch/x86/mm/tlb.c     | 9 +++++++--
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 3bc062363814..4f7f4c1149b9 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -668,8 +668,10 @@ static void kvm_flush_tlb_multi(const struct cpumask *cpumask,
 	u8 state;
 	int cpu;
 	struct kvm_steal_time *src;
-	struct cpumask *flushmask = this_cpu_cpumask_var_ptr(__pv_cpu_mask);
+	struct cpumask *flushmask;
 
+	guard(preempt)();
+	flushmask = this_cpu_cpumask_var_ptr(__pv_cpu_mask);
 	cpumask_copy(flushmask, cpumask);
 	/*
 	 * We have to call flush only on online vCPUs. And
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 91a0fb389303..86d9c208e424 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1427,9 +1427,11 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 	if (mm_global_asid(mm)) {
 		broadcast_tlb_flush(&info);
 	} else if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids) {
+		put_cpu();
 		info.trim_cpumask = should_trim_cpumask(mm);
 		flush_tlb_multi(mm_cpumask(mm), &info);
 		consider_global_asid(mm);
+		goto invalidate;
 	} else if (mm == this_cpu_read(cpu_tlbstate.loaded_mm)) {
 		lockdep_assert_irqs_enabled();
 		local_irq_disable();
@@ -1438,6 +1440,7 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 	}
 
 	put_cpu();
+invalidate:
 	mmu_notifier_arch_invalidate_secondary_tlbs(mm, start, end);
 }
 
@@ -1718,7 +1721,9 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 		invlpgb_flush_all_nonglobals();
 		batch->unmapped_pages = false;
 	} else if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) {
+		put_cpu();
 		flush_tlb_multi(&batch->cpumask, &info);
+		goto clear;
 	} else if (cpumask_test_cpu(cpu, &batch->cpumask)) {
 		lockdep_assert_irqs_enabled();
 		local_irq_disable();
@@ -1726,9 +1731,9 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 		local_irq_enable();
 	}
 
-	cpumask_clear(&batch->cpumask);
-
 	put_cpu();
+clear:
+	cpumask_clear(&batch->cpumask);
 }
 
 /*
-- 
2.20.1

From nobody Thu Apr 16 06:57:21 2026
From: "Chuyi Zhou"
Date: Mon, 2 Mar 2026 15:52:16 +0800
Subject: [PATCH v2 12/12] x86/mm: Enable preemption during flush_tlb_kernel_range
Message-Id: <20260302075216.2170675-13-zhouchuyi@bytedance.com>
In-Reply-To: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
References: <20260302075216.2170675-1-zhouchuyi@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org

flush_tlb_kernel_range() is invoked when a kernel memory mapping changes.
On x86 platforms without the INVLPGB feature enabled, we need to send IPIs
to every online CPU and synchronously wait for them to complete
do_kernel_range_flush(). This can be time-consuming when there is a large
number of CPUs or when other factors (such as interrupts being disabled)
delay the remote flushes. Because flush_tlb_kernel_range() always disables
preemption, this may affect the scheduling latency of other tasks on the
current CPU.

An earlier patch in this series converted flush_tlb_info from a per-CPU
variable to an on-stack variable. Additionally, it is no longer necessary
to explicitly disable preemption before calling smp_call*(), since they
handle the preemption logic internally. It is now safe to enable
preemption during flush_tlb_kernel_range().

Signed-off-by: Chuyi Zhou
---
 arch/x86/mm/tlb.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 86d9c208e424..48371eb36773 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1467,6 +1467,8 @@ static void invlpgb_kernel_range_flush(struct flush_tlb_info *info)
 {
 	unsigned long addr, nr;
 
+	guard(preempt)();
+
 	for (addr = info->start; addr < info->end; addr += nr << PAGE_SHIFT) {
 		nr = (info->end - addr) >> PAGE_SHIFT;
 
@@ -1517,8 +1519,6 @@ void flush_tlb_kernel_range(unsigned long start, unsigned long end)
 		.new_tlb_gen = TLB_GENERATION_INVALID
 	};
 
-	guard(preempt)();
-
 	if ((end - start) >> PAGE_SHIFT > tlb_single_page_flush_ceiling) {
 		start = 0;
 		end = TLB_FLUSH_ALL;
-- 
2.20.1