Subject: [PATCH v4 11/12] x86/mm: Enable preemption during native_flush_tlb_multi
From: "Chuyi Zhou"
Cc: "Chuyi Zhou"
Date: Tue, 31 Mar 2026 19:31:02 +0800
Message-Id: <20260331113103.2197007-12-zhouchuyi@bytedance.com>
In-Reply-To: <20260331113103.2197007-1-zhouchuyi@bytedance.com>
References: <20260331113103.2197007-1-zhouchuyi@bytedance.com>
X-Mailing-List: linux-kernel@vger.kernel.org
X-Mailer: git-send-email 2.20.1
Mime-Version: 1.0
Content-Type: text/plain; charset="utf-8"

native_flush_tlb_multi() may be called frequently by flush_tlb_mm_range()
and arch_tlbbatch_flush() in production environments. When pages are
reclaimed or a process exits, native_flush_tlb_multi() sends IPIs to remote
CPUs and waits for all of them to complete their local TLB flushes. The
overall latency can reach tens of milliseconds due to a large number of
remote CPUs and other factors (such as interrupts being disabled). Since
flush_tlb_mm_range() and arch_tlbbatch_flush() always disable preemption
across this wait, it can cause increased scheduling latency for other
threads on the current CPU.

The previous patch converted flush_tlb_info from a per-CPU variable to an
on-stack variable. In addition, it is no longer necessary to explicitly
disable preemption before calling smp_call*(), since those functions handle
preemption internally. It is therefore now safe to enable preemption during
native_flush_tlb_multi().
Signed-off-by: Chuyi Zhou
---
 arch/x86/kernel/kvm.c | 4 +++-
 arch/x86/mm/tlb.c     | 9 +++++++--
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 3bc062363814..4f7f4c1149b9 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -668,8 +668,10 @@ static void kvm_flush_tlb_multi(const struct cpumask *cpumask,
 	u8 state;
 	int cpu;
 	struct kvm_steal_time *src;
-	struct cpumask *flushmask = this_cpu_cpumask_var_ptr(__pv_cpu_mask);
+	struct cpumask *flushmask;
 
+	guard(preempt)();
+	flushmask = this_cpu_cpumask_var_ptr(__pv_cpu_mask);
 	cpumask_copy(flushmask, cpumask);
 	/*
 	 * We have to call flush only on online vCPUs. And
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index cfc3a72477f5..58c6f3d2f993 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -1421,9 +1421,11 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 	if (mm_global_asid(mm)) {
 		broadcast_tlb_flush(info);
 	} else if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids) {
+		put_cpu();
 		info->trim_cpumask = should_trim_cpumask(mm);
 		flush_tlb_multi(mm_cpumask(mm), info);
 		consider_global_asid(mm);
+		goto invalidate;
 	} else if (mm == this_cpu_read(cpu_tlbstate.loaded_mm)) {
 		lockdep_assert_irqs_enabled();
 		local_irq_disable();
@@ -1432,6 +1434,7 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 	}
 
 	put_cpu();
+invalidate:
 	mmu_notifier_arch_invalidate_secondary_tlbs(mm, start, end);
 }
 
@@ -1691,7 +1694,9 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 		invlpgb_flush_all_nonglobals();
 		batch->unmapped_pages = false;
 	} else if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) {
+		put_cpu();
 		flush_tlb_multi(&batch->cpumask, &info);
+		goto clear;
 	} else if (cpumask_test_cpu(cpu, &batch->cpumask)) {
 		lockdep_assert_irqs_enabled();
 		local_irq_disable();
@@ -1699,9 +1704,9 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 		local_irq_enable();
 	}
 
-	cpumask_clear(&batch->cpumask);
-	put_cpu();
+clear:
+	cpumask_clear(&batch->cpumask);
 }
 
 /*
-- 
2.20.1