From: "Chang S. Bae"
To: linux-kernel@vger.kernel.org
Cc: x86@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
 dave.hansen@linux.intel.com, peterz@infradead.org, david.kaplan@amd.com,
 chang.seok.bae@intel.com
Subject: [PATCH 1/7] stop_machine: Introduce stop_machine_nmi()
Date: Sun, 25 Jan 2026 01:42:16 +0000
Message-ID: <20260125014224.249901-2-chang.seok.bae@intel.com>
In-Reply-To: <20260125014224.249901-1-chang.seok.bae@intel.com>
References: <20260125014224.249901-1-chang.seok.bae@intel.com>

From: David Kaplan

stop_machine_nmi() is a variant of stop_machine() that runs the specified
function in NMI context.
This is useful for flows that cannot tolerate any risk of interruption
even due to an NMI. Arch-specific code must handle sending the actual NMI
and running the stop_machine_nmi_handler().

Signed-off-by: David Kaplan
Signed-off-by: Chang S. Bae
---
Update from the original version:
* Move static key handling into stop_machine_cpuslocked_nmi() to support
  core-code users that already hold cpu hotplug locks
* Tweak the subject to better reflect the new interface and changelog a
  bit as well
---
 include/linux/stop_machine.h | 50 +++++++++++++++++++++
 kernel/stop_machine.c        | 84 ++++++++++++++++++++++++++++++++++--
 2 files changed, 130 insertions(+), 4 deletions(-)

diff --git a/include/linux/stop_machine.h b/include/linux/stop_machine.h
index 72820503514c..86113084456a 100644
--- a/include/linux/stop_machine.h
+++ b/include/linux/stop_machine.h
@@ -141,6 +141,29 @@ int stop_machine(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus);
  */
 int stop_machine_cpuslocked(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus);
 
+/**
+ * stop_machine_nmi: freeze the machine and run this function in NMI context
+ * @fn: the function to run
+ * @data: the data ptr for the @fn()
+ * @cpus: the cpus to run the @fn() on (NULL = any online cpu)
+ *
+ * Like stop_machine() but runs the function in NMI context to avoid any risk of
+ * interruption due to NMIs.
+ *
+ * Protects against CPU hotplug.
+ */
+int stop_machine_nmi(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus);
+
+/**
+ * stop_machine_cpuslocked_nmi: freeze and run this function in NMI context
+ * @fn: the function to run
+ * @data: the data ptr for the @fn()
+ * @cpus: the cpus to run the @fn() on (NULL = any online cpu)
+ *
+ * Same as above. Must be called from within a cpus_read_lock() protected
+ * region. Avoids nested calls to cpus_read_lock().
+ */
+int stop_machine_cpuslocked_nmi(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus);
 /**
  * stop_core_cpuslocked: - stop all threads on just one core
  * @cpu: any cpu in the targeted core
@@ -160,6 +183,14 @@ int stop_core_cpuslocked(unsigned int cpu, cpu_stop_fn_t fn, void *data);
 
 int stop_machine_from_inactive_cpu(cpu_stop_fn_t fn, void *data,
 				   const struct cpumask *cpus);
+
+bool noinstr stop_machine_nmi_handler(void);
+DECLARE_STATIC_KEY_FALSE(stop_machine_nmi_handler_enable);
+static __always_inline bool stop_machine_nmi_handler_enabled(void)
+{
+	return static_branch_unlikely(&stop_machine_nmi_handler_enable);
+}
+
 #else	/* CONFIG_SMP || CONFIG_HOTPLUG_CPU */
 
 static __always_inline int stop_machine_cpuslocked(cpu_stop_fn_t fn, void *data,
@@ -186,5 +217,24 @@ stop_machine_from_inactive_cpu(cpu_stop_fn_t fn, void *data,
 	return stop_machine(fn, data, cpus);
 }
 
+/* stop_machine_nmi() is only supported in SMP systems. */
+static __always_inline int stop_machine_nmi(cpu_stop_fn_t fn, void *data,
+					    const struct cpumask *cpus)
+{
+	return -EINVAL;
+}
+
+static __always_inline bool stop_machine_nmi_handler_enabled(void)
+{
+	return false;
+}
+
+static __always_inline bool stop_machine_nmi_handler(void)
+{
+	return false;
+}
+
 #endif	/* CONFIG_SMP || CONFIG_HOTPLUG_CPU */
+
+void arch_send_self_nmi(void);
 #endif	/* _LINUX_STOP_MACHINE */
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 3fe6b0c99f3d..189b4b108d13 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -174,6 +174,8 @@ struct multi_stop_data {
 
 	enum multi_stop_state	state;
 	atomic_t		thread_ack;
+
+	bool			use_nmi;
 };
 
 static void set_state(struct multi_stop_data *msdata,
@@ -197,6 +199,42 @@ notrace void __weak stop_machine_yield(const struct cpumask *cpumask)
 	cpu_relax();
 }
 
+struct stop_machine_nmi_ctrl {
+	bool nmi_enabled;
+	struct multi_stop_data *msdata;
+	int err;
+};
+
+DEFINE_STATIC_KEY_FALSE(stop_machine_nmi_handler_enable);
+static DEFINE_PER_CPU(struct stop_machine_nmi_ctrl, stop_machine_nmi_ctrl);
+
+static void enable_nmi_handler(struct multi_stop_data *msdata)
+{
+	this_cpu_write(stop_machine_nmi_ctrl.msdata, msdata);
+	this_cpu_write(stop_machine_nmi_ctrl.nmi_enabled, true);
+}
+
+void __weak arch_send_self_nmi(void)
+{
+	/* Arch code must implement this to support stop_machine_nmi() */
+}
+
+bool noinstr stop_machine_nmi_handler(void)
+{
+	struct multi_stop_data *msdata;
+	int err;
+
+	if (!raw_cpu_read(stop_machine_nmi_ctrl.nmi_enabled))
+		return false;
+
+	raw_cpu_write(stop_machine_nmi_ctrl.nmi_enabled, false);
+
+	msdata = raw_cpu_read(stop_machine_nmi_ctrl.msdata);
+	err = msdata->fn(msdata->data);
+	raw_cpu_write(stop_machine_nmi_ctrl.err, err);
+	return true;
+}
+
 /* This is the cpu_stop function which stops the CPU. */
 static int multi_cpu_stop(void *data)
 {
@@ -234,8 +272,15 @@ static int multi_cpu_stop(void *data)
 				hard_irq_disable();
 				break;
 			case MULTI_STOP_RUN:
-				if (is_active)
-					err = msdata->fn(msdata->data);
+				if (is_active) {
+					if (msdata->use_nmi) {
+						enable_nmi_handler(msdata);
+						arch_send_self_nmi();
+						err = raw_cpu_read(stop_machine_nmi_ctrl.err);
+					} else {
+						err = msdata->fn(msdata->data);
+					}
+				}
 				break;
 			default:
 				break;
@@ -584,14 +629,15 @@ static int __init cpu_stop_init(void)
 }
 early_initcall(cpu_stop_init);
 
-int stop_machine_cpuslocked(cpu_stop_fn_t fn, void *data,
-			    const struct cpumask *cpus)
+static int __stop_machine_cpuslocked(cpu_stop_fn_t fn, void *data,
+				     const struct cpumask *cpus, bool use_nmi)
 {
 	struct multi_stop_data msdata = {
 		.fn = fn,
 		.data = data,
 		.num_threads = num_online_cpus(),
 		.active_cpus = cpus,
+		.use_nmi = use_nmi,
 	};
 
 	lockdep_assert_cpus_held();
@@ -620,6 +666,24 @@ int stop_machine_cpuslocked(cpu_stop_fn_t fn, void *data,
 	return stop_cpus(cpu_online_mask, multi_cpu_stop, &msdata);
 }
 
+int stop_machine_cpuslocked(cpu_stop_fn_t fn, void *data,
+			    const struct cpumask *cpus)
+{
+	return __stop_machine_cpuslocked(fn, data, cpus, false);
+}
+
+int stop_machine_cpuslocked_nmi(cpu_stop_fn_t fn, void *data,
+				const struct cpumask *cpus)
+{
+	int ret;
+
+	static_branch_enable_cpuslocked(&stop_machine_nmi_handler_enable);
+	ret = __stop_machine_cpuslocked(fn, data, cpus, true);
+	static_branch_disable_cpuslocked(&stop_machine_nmi_handler_enable);
+
+	return ret;
+}
+
 int stop_machine(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus)
 {
 	int ret;
@@ -632,6 +696,18 @@ int stop_machine(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus)
 }
 EXPORT_SYMBOL_GPL(stop_machine);
 
+int stop_machine_nmi(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus)
+{
+	int ret;
+
+	cpus_read_lock();
+	ret = stop_machine_cpuslocked_nmi(fn, data, cpus);
+	cpus_read_unlock();
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(stop_machine_nmi);
+
 #ifdef CONFIG_SCHED_SMT
 int stop_core_cpuslocked(unsigned int cpu, cpu_stop_fn_t fn, void *data)
 {
-- 
2.51.0
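
For readers following the handshake between multi_cpu_stop() and
stop_machine_nmi_handler(), here is a rough userspace model of it. It is
only a sketch, not kernel code: a single global stands in for the per-CPU
stop_machine_nmi_ctrl, the "NMI" is a plain function call, and every
model_* name and the bump() callback are hypothetical and do not appear in
the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*cpu_stop_fn_t)(void *arg);

/* Trimmed-down stand-in for the kernel's struct multi_stop_data. */
struct multi_stop_data {
	cpu_stop_fn_t fn;
	void *data;
};

/* Single-CPU stand-in for the per-CPU stop_machine_nmi_ctrl. */
static struct {
	bool nmi_enabled;
	struct multi_stop_data *msdata;
	int err;
} ctrl;

/* Models stop_machine_nmi_handler(): returns true only if it ran fn. */
static bool model_nmi_handler(void)
{
	if (!ctrl.nmi_enabled)
		return false;		/* an unrelated NMI, not ours */

	ctrl.nmi_enabled = false;	/* one-shot: disarm before running fn */
	assert(ctrl.msdata != NULL);
	ctrl.err = ctrl.msdata->fn(ctrl.msdata->data);
	return true;
}

/* Hypothetical arch hook: real arch code would raise an NMI on this CPU. */
static void model_send_self_nmi(void)
{
	(void)model_nmi_handler();
}

/* Models the use_nmi branch of MULTI_STOP_RUN in multi_cpu_stop(). */
static int model_run_in_nmi(struct multi_stop_data *msdata)
{
	ctrl.msdata = msdata;		/* as in enable_nmi_handler() */
	ctrl.nmi_enabled = true;
	model_send_self_nmi();
	return ctrl.err;		/* result handed back by the handler */
}

/* Example callback (hypothetical): bump a counter and report success. */
static int bump(void *arg)
{
	++*(int *)arg;
	return 0;
}
```

Calling model_run_in_nmi() with a multi_stop_data whose fn is bump runs
the callback exactly once; a second "NMI" afterwards is ignored because
the handler disarmed itself first. If model_send_self_nmi() were given an
empty body, like the weak arch_send_self_nmi() default, the callback would
never run and the caller would read back whatever err held before.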