From nobody Sun Feb 8 23:32:14 2026
From: "Chuyi Zhou"
Date: Tue, 3 Feb 2026 19:23:51 +0800
Subject: [PATCH 01/11] smp: Disable preemption explicitly in __csd_lock_wait
Message-Id: <20260203112401.3889029-2-zhouchuyi@bytedance.com>
In-Reply-To: <20260203112401.3889029-1-zhouchuyi@bytedance.com>
References: <20260203112401.3889029-1-zhouchuyi@bytedance.com>

The following patches will enable preemption before csd_lock_wait(), which could break csdlock_debug: the time slices of other tasks running on the CPU may be accounted between the ktime_get_mono_fast_ns() calls. Disable preemption explicitly in __csd_lock_wait().

This is a preparation for the following patches.
Signed-off-by: Chuyi Zhou
Acked-by: Muchun Song
---
 kernel/smp.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/smp.c b/kernel/smp.c
index f349960f79ca..fc1f7a964616 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -323,6 +323,8 @@ static void __csd_lock_wait(call_single_data_t *csd)
 	int bug_id = 0;
 	u64 ts0, ts1;
 
+	guard(preempt)();
+
 	ts1 = ts0 = ktime_get_mono_fast_ns();
 	for (;;) {
 		if (csd_lock_wait_toolong(csd, ts0, &ts1, &bug_id, &nmessages))
-- 
2.20.1

From nobody Sun Feb 8 23:32:14 2026
From: "Chuyi Zhou"
Date: Tue, 3 Feb 2026 19:23:52 +0800
Subject: [PATCH 02/11] smp: Enable preemption early in smp_call_function_single
Message-Id: <20260203112401.3889029-3-zhouchuyi@bytedance.com>
In-Reply-To: <20260203112401.3889029-1-zhouchuyi@bytedance.com>
References: <20260203112401.3889029-1-zhouchuyi@bytedance.com>

Currently, smp_call_function_single() disables preemption mainly for the following reasons:

- To protect the per-cpu csd_data from concurrent modification by other
  tasks on the current CPU in the !wait case. For the wait case,
  synchronization is not a concern because an on-stack csd is used.

- To prevent the remote online CPU from being offlined. Specifically, we
  want to ensure that no new IPIs are queued after smpcfd_dying_cpu() has
  finished.
Disabling preemption for the entire execution is unnecessary; in particular, the csd_lock_wait() part does not require preemption protection. This patch enables preemption before csd_lock_wait() to shrink the preemption-disabled critical section.

Signed-off-by: Chuyi Zhou
Reviewed-by: Muchun Song
---
 kernel/smp.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index fc1f7a964616..0858553f3666 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -685,11 +685,24 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
 
 	err = generic_exec_single(cpu, csd);
 
+	/*
+	 * We may block in csd_lock_wait() for a significant amount of time (e.g., if the
+	 * remote CPU has interrupts disabled). Disabling preemption throughout the entire
+	 * smp_call_function_single() impacts the scheduling latency and is unnecessary.
+	 *
+	 * - Preemption must be disabled before sending the IPI to ensure no new IPIs are
+	 *   queued after smpcfd_dying_cpu() finishes.
+	 *
+	 * - @csd is stack-allocated when @wait is true. No concurrent access except
+	 *   from the IPI completion path, so we can re-enable preemption early
+	 *   to reduce latency.
+	 */
+	put_cpu();
+
 	if (wait)
 		csd_lock_wait(csd);
 
-	put_cpu();
-
 	return err;
 }
 EXPORT_SYMBOL(smp_call_function_single);
-- 
2.20.1

From nobody Sun Feb 8 23:32:14 2026
From: "Chuyi Zhou"
Date: Tue, 3 Feb 2026 19:23:53 +0800
Subject: [PATCH 03/11] smp: Remove get_cpu from smp_call_function_any
Message-Id: <20260203112401.3889029-4-zhouchuyi@bytedance.com>
In-Reply-To: <20260203112401.3889029-1-zhouchuyi@bytedance.com>
References: <20260203112401.3889029-1-zhouchuyi@bytedance.com>

smp_call_function_single() now enables preemption before csd_lock_wait() to reduce the critical section. To let callers of smp_call_function_any() benefit from this optimization as well, remove the get_cpu()/put_cpu() pair from smp_call_function_any().

Signed-off-by: Chuyi Zhou
Reviewed-by: Muchun Song
---
 kernel/smp.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 0858553f3666..f572716c3c7d 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -772,13 +772,18 @@ int smp_call_function_any(const struct cpumask *mask,
 	unsigned int cpu;
 	int ret;
 
+	/*
+	 * Prevent migration to another CPU after selecting the current CPU
+	 * as the target.
+	 */
+	guard(migrate)();
+
 	/* Try for same CPU (cheapest) */
-	cpu = get_cpu();
+	cpu = smp_processor_id();
 	if (!cpumask_test_cpu(cpu, mask))
 		cpu = sched_numa_find_nth_cpu(mask, 0, cpu_to_node(cpu));
 
 	ret = smp_call_function_single(cpu, func, info, wait);
-	put_cpu();
 	return ret;
 }
 EXPORT_SYMBOL_GPL(smp_call_function_any);
-- 
2.20.1

From nobody Sun Feb 8 23:32:14 2026
From: "Chuyi Zhou"
Date: Tue, 3 Feb 2026 19:23:54 +0800
Subject: [PATCH 04/11] smp: Use on-stack cpumask in smp_call_function_many_cond
Message-Id: <20260203112401.3889029-5-zhouchuyi@bytedance.com>
In-Reply-To: <20260203112401.3889029-1-zhouchuyi@bytedance.com>
References: <20260203112401.3889029-1-zhouchuyi@bytedance.com>

This patch uses an on-stack cpumask to replace the per-cpu cfd cpumask in smp_call_function_many_cond(). alloc_cpumask_var() may fail when CONFIG_CPUMASK_OFFSTACK is enabled; in that extreme case, fall back to cfd->cpumask.

This is a preparation for the next patch.
Signed-off-by: Chuyi Zhou
---
 kernel/smp.c | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index f572716c3c7d..35948afced2e 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -805,11 +805,17 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 	int cpu, last_cpu, this_cpu = smp_processor_id();
 	struct call_function_data *cfd;
 	bool wait = scf_flags & SCF_WAIT;
+	bool preemptible_wait = true;
+	cpumask_var_t cpumask_stack;
+	struct cpumask *cpumask;
 	int nr_cpus = 0;
 	bool run_remote = false;
 
 	lockdep_assert_preemption_disabled();
 
+	if (!alloc_cpumask_var(&cpumask_stack, GFP_ATOMIC))
+		preemptible_wait = false;
+
 	/*
 	 * Can deadlock when called with interrupts disabled.
 	 * We allow cpu's that are not yet online though, as no one else can
@@ -831,15 +837,18 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 	/* Check if we need remote execution, i.e., any CPU excluding this one. */
 	if (cpumask_any_and_but(mask, cpu_online_mask, this_cpu) < nr_cpu_ids) {
 		cfd = this_cpu_ptr(&cfd_data);
-		cpumask_and(cfd->cpumask, mask, cpu_online_mask);
-		__cpumask_clear_cpu(this_cpu, cfd->cpumask);
+
+		cpumask = preemptible_wait ? cpumask_stack : cfd->cpumask;
+
+		cpumask_and(cpumask, mask, cpu_online_mask);
+		__cpumask_clear_cpu(this_cpu, cpumask);
 
 		cpumask_clear(cfd->cpumask_ipi);
-		for_each_cpu(cpu, cfd->cpumask) {
+		for_each_cpu(cpu, cpumask) {
 			call_single_data_t *csd = per_cpu_ptr(cfd->csd, cpu);
 
 			if (cond_func && !cond_func(cpu, info)) {
-				__cpumask_clear_cpu(cpu, cfd->cpumask);
+				__cpumask_clear_cpu(cpu, cpumask);
 				continue;
 			}
 
@@ -890,13 +899,16 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 	}
 
 	if (run_remote && wait) {
-		for_each_cpu(cpu, cfd->cpumask) {
+		for_each_cpu(cpu, cpumask) {
 			call_single_data_t *csd;
 
 			csd = per_cpu_ptr(cfd->csd, cpu);
 			csd_lock_wait(csd);
 		}
 	}
+
+	if (preemptible_wait)
+		free_cpumask_var(cpumask_stack);
 }
 
 /**
-- 
2.20.1

From nobody Sun Feb 8 23:32:14 2026
From: "Chuyi Zhou"
Date: Tue, 3 Feb 2026 19:23:55 +0800
Subject: [PATCH 05/11] smp: Enable preemption early in smp_call_function_many_cond
Message-Id: <20260203112401.3889029-6-zhouchuyi@bytedance.com>
In-Reply-To: <20260203112401.3889029-1-zhouchuyi@bytedance.com>
References: <20260203112401.3889029-1-zhouchuyi@bytedance.com>

Currently, smp_call_function_many_cond() disables preemption mainly for the following reasons:

- To prevent the remote online CPU from going offline. Specifically, we
  want to ensure that no new csds are queued after smpcfd_dying_cpu() has
  finished. Therefore, preemption must be disabled until all necessary
  IPIs are sent.

- To prevent migration to another CPU, which also implicitly prevents the
  current CPU from going offline (since stop_machine requires preempting
  the current task to execute the offline callbacks). The same can be
  achieved with migrate_disable(), as tasks must be migrated to other
  CPUs before takedown_cpu().

- To protect the per-cpu cfd_data from concurrent modification by other
  smp_call_*() calls on the current CPU. cfd_data contains cpumasks and
  per-cpu csds. Before enqueueing a csd, we block on csd_lock() to ensure
  the previous async csd->func() has completed, and then initialize
  csd->func and csd->info. After sending the IPI, we spin-wait for the
  remote CPU to call csd_unlock().

Actually, the csd_lock mechanism already guarantees csd serialization. If preemption occurs during csd_lock_wait(), other concurrent smp_call_function_many_cond() calls will simply block until the previous csd->func() completes:

  task A                               task B

  csd->func = func_a
  send IPIs
  preempted by B    --------------->   csd_lock(csd);
                                       // blocks until the last
                                       // func_a has finished
                                       csd->func = func_b;
                                       csd->info = info;
                                       ...
                                       send IPIs
  switch back to A  <---------------
  csd_lock_wait(csd);
  // blocks until the remote CPUs finish func_*

This patch uses migrate_disable() to protect the scope of smp_call_function_many_cond() and enables preemption before csd_lock_wait(). This makes the potentially unpredictable csd_lock_wait() preemptible. Using cpumask_stack avoids the concurrent-modification issue, and we fall back to the default logic if alloc_cpumask_var() fails.
Signed-off-by: Chuyi Zhou
---
 kernel/smp.c | 37 ++++++++++++++++++++++++++++++-----
 1 file changed, 32 insertions(+), 5 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 35948afced2e..af9cee7d4939 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -802,7 +802,7 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 				       unsigned int scf_flags,
 				       smp_cond_func_t cond_func)
 {
-	int cpu, last_cpu, this_cpu = smp_processor_id();
+	int cpu, last_cpu, this_cpu;
 	struct call_function_data *cfd;
 	bool wait = scf_flags & SCF_WAIT;
 	bool preemptible_wait = true;
@@ -811,11 +811,18 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 	int nr_cpus = 0;
 	bool run_remote = false;
 
-	lockdep_assert_preemption_disabled();
-
-	if (!alloc_cpumask_var(&cpumask_stack, GFP_ATOMIC))
+	if (!wait || !alloc_cpumask_var(&cpumask_stack, GFP_ATOMIC))
 		preemptible_wait = false;
 
+	/*
+	 * Prevent the current CPU from going offline.
+	 * Being migrated to another CPU and calling csd_lock_wait() may cause
+	 * UAF due to smpcfd_dead_cpu() during the current CPU offline process.
+	 */
+	migrate_disable();
+
+	this_cpu = get_cpu();
+
 	/*
 	 * Can deadlock when called with interrupts disabled.
 	 * We allow cpu's that are not yet online though, as no one else can
@@ -898,6 +905,22 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 		local_irq_restore(flags);
 	}
 
+	/*
+	 * We may block in csd_lock_wait() for a significant amount of time, especially
+	 * when interrupts are disabled or with a large number of remote CPUs.
+	 * Try to enable preemption before csd_lock_wait().
+	 *
+	 * - If @wait is true, we try to use the cpumask_stack instead of cfd->cpumask to
+	 *   avoid concurrent modification from tasks on the same CPU. If alloc_cpumask_var()
+	 *   returns false, fall back to the default logic.
+	 *
+	 * - If preemption occurs during csd_lock_wait(), other concurrent
+	 *   smp_call_function_many_cond() calls will simply block until the previous
+	 *   csd->func() completes.
+	 */
+	if (preemptible_wait)
+		put_cpu();
+
 	if (run_remote && wait) {
 		for_each_cpu(cpu, cpumask) {
 			call_single_data_t *csd;
@@ -907,8 +930,12 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 		}
 	}
 
-	if (preemptible_wait)
+	if (!preemptible_wait)
+		put_cpu();
+	else
 		free_cpumask_var(cpumask_stack);
+
+	migrate_enable();
 }
 
 /**
-- 
2.20.1

From nobody Sun Feb 8 23:32:14 2026
From: "Chuyi Zhou"
Date: Tue, 3 Feb 2026 19:23:56 +0800
Subject: [PATCH 06/11] smp: Remove preempt_disable from smp_call_function
Message-Id: <20260203112401.3889029-7-zhouchuyi@bytedance.com>
In-Reply-To: <20260203112401.3889029-1-zhouchuyi@bytedance.com>
References: <20260203112401.3889029-1-zhouchuyi@bytedance.com>

smp_call_function_many_cond() now handles the preemption logic internally, so smp_call_function() does not need to disable preemption explicitly. Remove the preempt_{disable,enable} pair from smp_call_function().
Signed-off-by: Chuyi Zhou
---
 kernel/smp.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index af9cee7d4939..088b581003fb 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -977,9 +977,8 @@ EXPORT_SYMBOL(smp_call_function_many);
  */
 void smp_call_function(smp_call_func_t func, void *info, int wait)
 {
-	preempt_disable();
-	smp_call_function_many(cpu_online_mask, func, info, wait);
-	preempt_enable();
+	smp_call_function_many_cond(cpu_online_mask, func, info,
+				    wait ? SCF_WAIT : 0, NULL);
 }
 EXPORT_SYMBOL(smp_call_function);
 
-- 
2.20.1

From nobody Sun Feb 8 23:32:14 2026
From: "Chuyi Zhou"
Date: Tue, 3 Feb 2026 19:23:57 +0800
Subject: [PATCH 07/11] smp: Remove preempt_disable from on_each_cpu_cond_mask
Message-Id: <20260203112401.3889029-8-zhouchuyi@bytedance.com>
In-Reply-To: <20260203112401.3889029-1-zhouchuyi@bytedance.com>
References: <20260203112401.3889029-1-zhouchuyi@bytedance.com>

smp_call_function_many_cond() now handles the preemption logic internally, so on_each_cpu_cond_mask() does not need to disable preemption explicitly. Remove the preempt_{disable,enable} pair from on_each_cpu_cond_mask().
Signed-off-by: Chuyi Zhou
---
 kernel/smp.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 088b581003fb..c859076239c4 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -1097,9 +1097,7 @@ void on_each_cpu_cond_mask(smp_cond_func_t cond_func, smp_call_func_t func,
 	if (wait)
 		scf_flags |= SCF_WAIT;

-	preempt_disable();
 	smp_call_function_many_cond(mask, func, info, scf_flags, cond_func);
-	preempt_enable();
 }
 EXPORT_SYMBOL(on_each_cpu_cond_mask);

-- 
2.20.1
From: "Chuyi Zhou"
Date: Tue, 3 Feb 2026 19:23:58 +0800
Subject: [PATCH 08/11] scftorture: Remove preempt_disable in scftorture_invoke_one
Message-Id: <20260203112401.3889029-9-zhouchuyi@bytedance.com>
In-Reply-To: <20260203112401.3889029-1-zhouchuyi@bytedance.com>
References: <20260203112401.3889029-1-zhouchuyi@bytedance.com>

Explicit preempt_disable() calls before smp_call_*() are no longer
needed, because the smp_call_*() helpers now handle the preemption
logic internally. Remove the preempt_disable()/preempt_enable() pairs
from scftorture_invoke_one().
Signed-off-by: Chuyi Zhou
---
 kernel/scftorture.c | 9 +--------
 1 file changed, 1 insertion(+), 8 deletions(-)

diff --git a/kernel/scftorture.c b/kernel/scftorture.c
index d86d2d9c4624..3fb1742f3129 100644
--- a/kernel/scftorture.c
+++ b/kernel/scftorture.c
@@ -364,8 +364,6 @@ static void scftorture_invoke_one(struct scf_statistics *scfp, struct torture_ra
 	}
 	if (use_cpus_read_lock)
 		cpus_read_lock();
-	else
-		preempt_disable();
 	switch (scfsp->scfs_prim) {
 	case SCF_PRIM_RESCHED:
 		if (IS_BUILTIN(CONFIG_SCF_TORTURE_TEST)) {
@@ -411,13 +409,10 @@ static void scftorture_invoke_one(struct scf_statistics *scfp, struct torture_ra
 		if (!ret) {
 			if (use_cpus_read_lock)
 				cpus_read_unlock();
-			else
-				preempt_enable();
+
 			wait_for_completion(&scfcp->scfc_completion);
 			if (use_cpus_read_lock)
 				cpus_read_lock();
-			else
-				preempt_disable();
 		} else {
 			scfp->n_single_rpc_ofl++;
 			scf_add_to_free_list(scfcp);
@@ -463,8 +458,6 @@ static void scftorture_invoke_one(struct scf_statistics *scfp, struct torture_ra
 	}
 	if (use_cpus_read_lock)
 		cpus_read_unlock();
-	else
-		preempt_enable();
 	if (allocfail)
 		schedule_timeout_idle((1 + longwait) * HZ); // Let no-wait handlers complete.
 	else if (!(torture_random(trsp) & 0xfff))

-- 
2.20.1
From: "Chuyi Zhou"
Date: Tue, 3 Feb 2026 19:23:59 +0800
Subject: [PATCH 09/11] x86/mm: Move flush_tlb_info back to the stack
Message-Id: <20260203112401.3889029-10-zhouchuyi@bytedance.com>
In-Reply-To: <20260203112401.3889029-1-zhouchuyi@bytedance.com>
References: <20260203112401.3889029-1-zhouchuyi@bytedance.com>

Commit 3db6d5a5ecaf ("x86/mm/tlb: Remove 'struct flush_tlb_info' from
the stack") changed flush_tlb_info from stack variables to per-CPU
variables, which brought a performance improvement of around 3% in
extreme tests. However, it also required that all flush_tlb* operations
keep preemption disabled throughout, to prevent concurrent modification
of the per-CPU flush_tlb_info.

flush_tlb* needs to send IPIs to remote CPUs and synchronously wait for
all remote CPUs to complete their local TLB flushes. This can take tens
of milliseconds when interrupts are disabled or when the number of
remote CPUs is large. To improve kernel real-time behavior, this patch
reverts flush_tlb_info back to stack variables. This is a preparation
for enabling preemption during TLB flush in the next patch.
Signed-off-by: Chuyi Zhou
---
 arch/x86/mm/tlb.c | 124 ++++++++++++++++++----------------------------
 1 file changed, 49 insertions(+), 75 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index f5b93e01e347..2d68297ed35b 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1394,71 +1394,30 @@ void flush_tlb_multi(const struct cpumask *cpumask,
  */
 unsigned long tlb_single_page_flush_ceiling __read_mostly = 33;

-static DEFINE_PER_CPU_SHARED_ALIGNED(struct flush_tlb_info, flush_tlb_info);
-
-#ifdef CONFIG_DEBUG_VM
-static DEFINE_PER_CPU(unsigned int, flush_tlb_info_idx);
-#endif
-
-static struct flush_tlb_info *get_flush_tlb_info(struct mm_struct *mm,
-			unsigned long start, unsigned long end,
-			unsigned int stride_shift, bool freed_tables,
-			u64 new_tlb_gen)
+void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
+				unsigned long end, unsigned int stride_shift,
+				bool freed_tables)
 {
-	struct flush_tlb_info *info = this_cpu_ptr(&flush_tlb_info);
+	int cpu = get_cpu();

-#ifdef CONFIG_DEBUG_VM
-	/*
-	 * Ensure that the following code is non-reentrant and flush_tlb_info
-	 * is not overwritten. This means no TLB flushing is initiated by
-	 * interrupt handlers and machine-check exception handlers.
-	 */
-	BUG_ON(this_cpu_inc_return(flush_tlb_info_idx) != 1);
-#endif
+	struct flush_tlb_info info = {
+		.mm = mm,
+		.stride_shift = stride_shift,
+		.freed_tables = freed_tables,
+		.trim_cpumask = 0,
+		.initiating_cpu = cpu
+	};

-	/*
-	 * If the number of flushes is so large that a full flush
-	 * would be faster, do a full flush.
-	 */
 	if ((end - start) >> stride_shift > tlb_single_page_flush_ceiling) {
 		start = 0;
 		end = TLB_FLUSH_ALL;
 	}

-	info->start = start;
-	info->end = end;
-	info->mm = mm;
-	info->stride_shift = stride_shift;
-	info->freed_tables = freed_tables;
-	info->new_tlb_gen = new_tlb_gen;
-	info->initiating_cpu = smp_processor_id();
-	info->trim_cpumask = 0;
-
-	return info;
-}
-
-static void put_flush_tlb_info(void)
-{
-#ifdef CONFIG_DEBUG_VM
-	/* Complete reentrancy prevention checks */
-	barrier();
-	this_cpu_dec(flush_tlb_info_idx);
-#endif
-}
-
-void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
-				unsigned long end, unsigned int stride_shift,
-				bool freed_tables)
-{
-	struct flush_tlb_info *info;
-	int cpu = get_cpu();
-	u64 new_tlb_gen;
-
 	/* This is also a barrier that synchronizes with switch_mm(). */
-	new_tlb_gen = inc_mm_tlb_gen(mm);
+	info.new_tlb_gen = inc_mm_tlb_gen(mm);

-	info = get_flush_tlb_info(mm, start, end, stride_shift, freed_tables,
-				  new_tlb_gen);
+	info.start = start;
+	info.end = end;

 	/*
 	 * flush_tlb_multi() is not optimized for the common case in which only
@@ -1466,19 +1425,18 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 	 * flush_tlb_func_local() directly in this case.
 	 */
 	if (mm_global_asid(mm)) {
-		broadcast_tlb_flush(info);
+		broadcast_tlb_flush(&info);
 	} else if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids) {
-		info->trim_cpumask = should_trim_cpumask(mm);
-		flush_tlb_multi(mm_cpumask(mm), info);
+		info.trim_cpumask = should_trim_cpumask(mm);
+		flush_tlb_multi(mm_cpumask(mm), &info);
 		consider_global_asid(mm);
 	} else if (mm == this_cpu_read(cpu_tlbstate.loaded_mm)) {
 		lockdep_assert_irqs_enabled();
 		local_irq_disable();
-		flush_tlb_func(info);
+		flush_tlb_func(&info);
 		local_irq_enable();
 	}

-	put_flush_tlb_info();
 	put_cpu();
 	mmu_notifier_arch_invalidate_secondary_tlbs(mm, start, end);
 }
@@ -1548,19 +1506,29 @@ static void kernel_tlb_flush_range(struct flush_tlb_info *info)

 void flush_tlb_kernel_range(unsigned long start, unsigned long end)
 {
-	struct flush_tlb_info *info;
+	struct flush_tlb_info info = {
+		.mm = NULL,
+		.stride_shift = PAGE_SHIFT,
+		.freed_tables = false,
+		.trim_cpumask = 0,
+		.new_tlb_gen = TLB_GENERATION_INVALID
+	};

 	guard(preempt)();

-	info = get_flush_tlb_info(NULL, start, end, PAGE_SHIFT, false,
-				  TLB_GENERATION_INVALID);
+	if ((end - start) >> PAGE_SHIFT > tlb_single_page_flush_ceiling) {
+		start = 0;
+		end = TLB_FLUSH_ALL;
+	}

-	if (info->end == TLB_FLUSH_ALL)
-		kernel_tlb_flush_all(info);
-	else
-		kernel_tlb_flush_range(info);
+	info.initiating_cpu = smp_processor_id();
+	info.start = start;
+	info.end = end;

-	put_flush_tlb_info();
+	if (info.end == TLB_FLUSH_ALL)
+		kernel_tlb_flush_all(&info);
+	else
+		kernel_tlb_flush_range(&info);
 }

 /*
@@ -1728,12 +1696,19 @@ EXPORT_SYMBOL_FOR_KVM(__flush_tlb_all);

 void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 {
-	struct flush_tlb_info *info;
-
 	int cpu = get_cpu();

-	info = get_flush_tlb_info(NULL, 0, TLB_FLUSH_ALL, 0, false,
-				  TLB_GENERATION_INVALID);
+	struct flush_tlb_info info = {
+		.start = 0,
+		.end = TLB_FLUSH_ALL,
+		.mm = NULL,
+		.stride_shift = 0,
+		.freed_tables = false,
+		.new_tlb_gen = TLB_GENERATION_INVALID,
+		.initiating_cpu = cpu,
+		.trim_cpumask = 0,
+	};
+
 	/*
 	 * flush_tlb_multi() is not optimized for the common case in which only
 	 * a local TLB flush is needed. Optimize this use-case by calling
@@ -1743,17 +1718,16 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 		invlpgb_flush_all_nonglobals();
 		batch->unmapped_pages = false;
 	} else if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) {
-		flush_tlb_multi(&batch->cpumask, info);
+		flush_tlb_multi(&batch->cpumask, &info);
 	} else if (cpumask_test_cpu(cpu, &batch->cpumask)) {
 		lockdep_assert_irqs_enabled();
 		local_irq_disable();
-		flush_tlb_func(info);
+		flush_tlb_func(&info);
 		local_irq_enable();
 	}

 	cpumask_clear(&batch->cpumask);

-	put_flush_tlb_info();
 	put_cpu();
 }

-- 
2.20.1
From: "Chuyi Zhou"
Date: Tue, 3 Feb 2026 19:24:00 +0800
Subject: [PATCH 10/11] x86/mm: Enable preemption during native_flush_tlb_multi
Message-Id: <20260203112401.3889029-11-zhouchuyi@bytedance.com>
In-Reply-To: <20260203112401.3889029-1-zhouchuyi@bytedance.com>
References: <20260203112401.3889029-1-zhouchuyi@bytedance.com>

flush_tlb_mm_range()/arch_tlbbatch_flush() -> native_flush_tlb_multi()
is a common triggering path in real production
environments. When pages are reclaimed or a process exits,
native_flush_tlb_multi() sends IPIs to remote CPUs and waits for all
remote CPUs to complete their local TLB flushes. The overall latency
may reach tens of milliseconds when there are many remote CPUs or when
other factors (such as interrupts being disabled) come into play.
Because flush_tlb_mm_range()/arch_tlbbatch_flush() always disable
preemption, this can increase scheduling latency for other threads on
the current CPU.

The previous patch converted flush_tlb_info from a per-CPU variable to
an on-stack variable. Additionally, it is no longer necessary to
explicitly disable preemption before calling smp_call_*(), since those
helpers handle the preemption logic internally. It is now safe to
enable preemption during native_flush_tlb_multi().

Signed-off-by: Chuyi Zhou
---
 arch/x86/hyperv/mmu.c |  2 ++
 arch/x86/kernel/kvm.c |  4 +++-
 arch/x86/mm/tlb.c     | 23 +++++++++++++----------
 arch/x86/xen/mmu_pv.c |  1 +
 4 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c
index cfcb60468b01..394f849af10a 100644
--- a/arch/x86/hyperv/mmu.c
+++ b/arch/x86/hyperv/mmu.c
@@ -65,6 +65,8 @@ static void hyperv_flush_tlb_multi(const struct cpumask *cpus,
 	unsigned long flags;
 	bool do_lazy = !info->freed_tables;

+	guard(preempt)();
+
 	trace_hyperv_mmu_flush_tlb_multi(cpus, info);

 	if (!hv_hypercall_pg)
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index df78ddee0abb..6b56dab28e66 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -654,8 +654,10 @@ static void kvm_flush_tlb_multi(const struct cpumask *cpumask,
 	u8 state;
 	int cpu;
 	struct kvm_steal_time *src;
-	struct cpumask *flushmask = this_cpu_cpumask_var_ptr(__pv_cpu_mask);
+	struct cpumask *flushmask;

+	guard(preempt)();
+	flushmask = this_cpu_cpumask_var_ptr(__pv_cpu_mask);
 	cpumask_copy(flushmask, cpumask);
 	/*
 	 * We have to call flush only on online vCPUs. And

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 2d68297ed35b..4162d7ff024f 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1398,21 +1398,23 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 				unsigned long end, unsigned int stride_shift,
 				bool freed_tables)
 {
-	int cpu = get_cpu();
-
 	struct flush_tlb_info info = {
 		.mm = mm,
 		.stride_shift = stride_shift,
 		.freed_tables = freed_tables,
-		.trim_cpumask = 0,
-		.initiating_cpu = cpu
+		.trim_cpumask = 0
 	};
+	int cpu;

 	if ((end - start) >> stride_shift > tlb_single_page_flush_ceiling) {
 		start = 0;
 		end = TLB_FLUSH_ALL;
 	}

+	migrate_disable();
+
+	cpu = info.initiating_cpu = smp_processor_id();
+
 	/* This is also a barrier that synchronizes with switch_mm(). */
 	info.new_tlb_gen = inc_mm_tlb_gen(mm);

@@ -1425,6 +1427,7 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 	 * flush_tlb_func_local() directly in this case.
 	 */
 	if (mm_global_asid(mm)) {
+		guard(preempt)();
 		broadcast_tlb_flush(&info);
 	} else if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids) {
 		info.trim_cpumask = should_trim_cpumask(mm);
@@ -1437,7 +1440,7 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 		local_irq_enable();
 	}

-	put_cpu();
+	migrate_enable();
 	mmu_notifier_arch_invalidate_secondary_tlbs(mm, start, end);
 }

@@ -1696,8 +1699,6 @@ EXPORT_SYMBOL_FOR_KVM(__flush_tlb_all);

 void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 {
-	int cpu = get_cpu();
-
 	struct flush_tlb_info info = {
 		.start = 0,
 		.end = TLB_FLUSH_ALL,
@@ -1705,9 +1706,13 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 		.stride_shift = 0,
 		.freed_tables = false,
 		.new_tlb_gen = TLB_GENERATION_INVALID,
-		.initiating_cpu = cpu,
 		.trim_cpumask = 0,
 	};
+	int cpu;
+
+	guard(migrate)();
+
+	info.initiating_cpu = cpu = smp_processor_id();

 	/*
 	 * flush_tlb_multi() is not optimized for the common case in which only
@@ -1727,8 +1732,6 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 	}

 	cpumask_clear(&batch->cpumask);
-
-	put_cpu();
 }

 /*
diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
index 2a4a8deaf612..b801721050f7 100644
--- a/arch/x86/xen/mmu_pv.c
+++ b/arch/x86/xen/mmu_pv.c
@@ -1330,6 +1330,7 @@ static void xen_flush_tlb_multi(const struct cpumask *cpus,
 	const size_t mc_entry_size = sizeof(args->op) +
 		sizeof(args->mask[0]) * BITS_TO_LONGS(num_possible_cpus());

+	guard(preempt)();
 	trace_xen_mmu_flush_tlb_multi(cpus, info->mm, info->start, info->end);

 	if (cpumask_empty(cpus))
-- 
2.20.1
From: "Chuyi Zhou"
Date: Tue, 3 Feb 2026 19:24:01 +0800
Subject: [PATCH 11/11] x86/mm: Enable preemption during flush_tlb_kernel_range
Message-Id: <20260203112401.3889029-12-zhouchuyi@bytedance.com>
In-Reply-To: <20260203112401.3889029-1-zhouchuyi@bytedance.com>
References: <20260203112401.3889029-1-zhouchuyi@bytedance.com>

flush_tlb_kernel_range() is invoked when kernel memory mappings change.
On x86 platforms without the INVLPGB feature, we need to send IPIs to
every online CPU and synchronously wait for them to complete
do_kernel_range_flush().
This process can be time-consuming when there are many CPUs or when
other issues (such as interrupts being disabled) come into play. Since
flush_tlb_kernel_range() always disables preemption, this may hurt the
scheduling latency of other tasks on the current CPU.

The previous patch converted flush_tlb_info from a per-CPU variable to
an on-stack variable. Additionally, it is no longer necessary to
explicitly disable preemption before calling smp_call_*(), since those
helpers handle the preemption logic internally. It is now safe to
enable preemption during flush_tlb_kernel_range().

Signed-off-by: Chuyi Zhou
---
 arch/x86/mm/tlb.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 4162d7ff024f..f0de6c1e387f 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1467,6 +1467,8 @@ static void invlpgb_kernel_range_flush(struct flush_tlb_info *info)
 {
 	unsigned long addr, nr;

+	guard(preempt)();
+
 	for (addr = info->start; addr < info->end; addr += nr << PAGE_SHIFT) {
 		nr = (info->end - addr) >> PAGE_SHIFT;

@@ -1517,7 +1519,7 @@ void flush_tlb_kernel_range(unsigned long start, unsigned long end)
 		.new_tlb_gen = TLB_GENERATION_INVALID
 	};

-	guard(preempt)();
+	guard(migrate)();

 	if ((end - start) >> PAGE_SHIFT > tlb_single_page_flush_ceiling) {
 		start = 0;
-- 
2.20.1