From: Thomas Gleixner
To: Linus Torvalds
Cc: linux-kernel@vger.kernel.org, x86@kernel.org
Subject: [GIT pull] smp/core for v6.17-rc1
References: <175365567943.581055.6806236199247802443.tglx@xen13>
Message-ID: <175365568477.581055.6002001481718640290.tglx@xen13>
X-Mailing-List: linux-kernel@vger.kernel.org
Date: Mon, 28 Jul 2025 00:37:56 +0200 (CEST)

Linus,

please pull the latest smp/core branch from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git smp-core-2025-07-27

up to:  946a72819825: smp: Wait only if work was enqueued


A set of updates for SMP function calls:

  - Improve locality of smp_call_function_any() by utilizing
    sched_numa_find_nth_cpu() instead of picking a random CPU

  - Wait for work completion in smp_call_function_many_cond() only when there was
    actually work enqueued

  - Simplify functions by utilizing the appropriate cpumask_*() interfaces

  - Trivial cleanups

Thanks,

	tglx

------------------>
Rik van Riel (1):
      smp: Wait only if work was enqueued

Viresh Kumar (1):
      smp: Fix typo in comment for raw_smp_processor_id()

Yury Norov [NVIDIA] (3):
      smp: Improve locality in smp_call_function_any()
      smp: Use cpumask_any_but() in smp_call_function_many_cond()
      smp: Defer check for local execution in smp_call_function_many_cond()


 include/linux/smp.h |  2 +-
 kernel/smp.c        | 44 ++++++++++++++------------------------------
 2 files changed, 15 insertions(+), 31 deletions(-)

diff --git a/include/linux/smp.h b/include/linux/smp.h
index f1aa0952e8c3..bea8d2826e09 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -234,7 +234,7 @@ static inline int get_boot_cpu_id(void)
 #endif /* !SMP */
 
 /**
- * raw_processor_id() - get the current (unstable) CPU id
+ * raw_smp_processor_id() - get the current (unstable) CPU id
  *
  * For then you know what you are doing and need an unstable
  * CPU id.
diff --git a/kernel/smp.c b/kernel/smp.c
index 974f3a3962e8..c5e1da7a88da 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -741,32 +741,19 @@ EXPORT_SYMBOL_GPL(smp_call_function_single_async);
  *
  * Selection preference:
  *	1) current cpu if in @mask
- *	2) any cpu of current node if in @mask
- *	3) any other online cpu in @mask
+ *	2) nearest cpu in @mask, based on NUMA topology
  */
 int smp_call_function_any(const struct cpumask *mask,
 			  smp_call_func_t func, void *info, int wait)
 {
 	unsigned int cpu;
-	const struct cpumask *nodemask;
 	int ret;
 
 	/* Try for same CPU (cheapest) */
 	cpu = get_cpu();
-	if (cpumask_test_cpu(cpu, mask))
-		goto call;
-
-	/* Try for same node. */
-	nodemask = cpumask_of_node(cpu_to_node(cpu));
-	for (cpu = cpumask_first_and(nodemask, mask); cpu < nr_cpu_ids;
-	     cpu = cpumask_next_and(cpu, nodemask, mask)) {
-		if (cpu_online(cpu))
-			goto call;
-	}
+	if (!cpumask_test_cpu(cpu, mask))
+		cpu = sched_numa_find_nth_cpu(mask, 0, cpu_to_node(cpu));
 
-	/* Any online will do: smp_call_function_single handles nr_cpu_ids. */
-	cpu = cpumask_any_and(mask, cpu_online_mask);
-call:
 	ret = smp_call_function_single(cpu, func, info, wait);
 	put_cpu();
 	return ret;
@@ -792,7 +779,6 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 	bool wait = scf_flags & SCF_WAIT;
 	int nr_cpus = 0;
 	bool run_remote = false;
-	bool run_local = false;
 
 	lockdep_assert_preemption_disabled();
 
@@ -814,19 +800,8 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 	 */
 	WARN_ON_ONCE(!in_task());
 
-	/* Check if we need local execution. */
-	if ((scf_flags & SCF_RUN_LOCAL) && cpumask_test_cpu(this_cpu, mask) &&
-	    (!cond_func || cond_func(this_cpu, info)))
-		run_local = true;
-
 	/* Check if we need remote execution, i.e., any CPU excluding this one. */
-	cpu = cpumask_first_and(mask, cpu_online_mask);
-	if (cpu == this_cpu)
-		cpu = cpumask_next_and(cpu, mask, cpu_online_mask);
-	if (cpu < nr_cpu_ids)
-		run_remote = true;
-
-	if (run_remote) {
+	if (cpumask_any_and_but(mask, cpu_online_mask, this_cpu) < nr_cpu_ids) {
 		cfd = this_cpu_ptr(&cfd_data);
 		cpumask_and(cfd->cpumask, mask, cpu_online_mask);
 		__cpumask_clear_cpu(this_cpu, cfd->cpumask);
@@ -840,6 +815,9 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 				continue;
 			}
 
+			/* Work is enqueued on a remote CPU. */
+			run_remote = true;
+
 			csd_lock(csd);
 			if (wait)
 				csd->node.u_flags |= CSD_TYPE_SYNC;
@@ -851,6 +829,10 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 #endif
 			trace_csd_queue_cpu(cpu, _RET_IP_, func, csd);
 
+			/*
+			 * Kick the remote CPU if this is the first work
+			 * item enqueued.
+			 */
 			if (llist_add(&csd->node.llist, &per_cpu(call_single_queue, cpu))) {
 				__cpumask_set_cpu(cpu, cfd->cpumask_ipi);
 				nr_cpus++;
@@ -869,7 +851,9 @@ static void smp_call_function_many_cond(const struct cpumask *mask,
 			send_call_function_ipi_mask(cfd->cpumask_ipi);
 	}
 
-	if (run_local) {
+	/* Check if we need local execution. */
+	if ((scf_flags & SCF_RUN_LOCAL) && cpumask_test_cpu(this_cpu, mask) &&
+	    (!cond_func || cond_func(this_cpu, info))) {
 		unsigned long flags;
 
 		local_irq_save(flags);