[RFC][PATCH v2 02/11] stop_machine: Accumulate error code rather than overwrite

Chang S. Bae posted 11 patches 1 day, 23 hours ago
Posted by Chang S. Bae 1 day, 23 hours ago
cpu_stopper_thread() invokes a stop function and collects its error code
in struct cpu_stop_done. In the multi-CPU stop-machine case, that
structure is shared, so each non-zero return currently overwrites the
previous one and the recorded error is arbitrary.

When invocations return different errors, accumulating the error codes
instead makes a multi-error condition distinguishable, because the bits
of each return value are cumulatively set.

Convert the error recording to accumulate return values.

Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Link: https://lore.kernel.org/lkml/20260304163335.GDaahe3wdnqxSC2yfw@fat_crate.local
---
V1 -> V2: New patch

While I have tried to explain the benefit here, I think this change
deserves more discussion to assess its impact, hence the RFC.
---
 include/linux/stop_machine.h | 12 ++++++------
 kernel/stop_machine.c        | 10 +++++-----
 2 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/include/linux/stop_machine.h b/include/linux/stop_machine.h
index c753dd53e79d..2f986555113a 100644
--- a/include/linux/stop_machine.h
+++ b/include/linux/stop_machine.h
@@ -124,9 +124,9 @@ static inline void print_stop_info(const char *log_lvl, struct task_struct *task
  * the possibility of blocking in cpus_read_lock() means that the caller
  * cannot usefully rely on this serialization.
  *
- * Return: 0 if all invocations of @fn return zero.  Otherwise, the
- * value returned by an arbitrarily chosen member of the set of calls to
- * @fn that returned non-zero.
+ * Return: 0 if all invocations of @fn return zero.  Otherwise, an
+ * accumulated return value from all invocations of @fn that returned
+ * non-zero.
  */
 int stop_machine(cpu_stop_fn_t fn, void *data, const struct cpumask *cpus);
 
@@ -154,9 +154,9 @@ int stop_machine_cpuslocked(cpu_stop_fn_t fn, void *data, const struct cpumask *
  *
  * Context: Must be called from within a cpus_read_lock() protected region.
  *
- * Return: 0 if all invocations of @fn return zero.  Otherwise, the
- * value returned by an arbitrarily chosen member of the set of calls to
- * @fn that returned non-zero.
+ * Return: 0 if all invocations of @fn return zero.  Otherwise, an
+ * accumulated return value from all invocations of @fn that returned
+ * non-zero.
  */
 int stop_core_cpuslocked(unsigned int cpu, cpu_stop_fn_t fn, void *data);
 
diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
index 822cf56fdc81..15268f1207e9 100644
--- a/kernel/stop_machine.c
+++ b/kernel/stop_machine.c
@@ -459,7 +459,7 @@ static int __stop_cpus(const struct cpumask *cpumask,
  * RETURNS:
  * -ENOENT if @fn(@arg) was not executed at all because all cpus in
  * @cpumask were offline; otherwise, 0 if all executions of @fn
- * returned 0, any non zero return value if any returned non zero.
+ * returned 0, or the accumulated value of all non-zero @fn returns.
  */
 static int stop_cpus(const struct cpumask *cpumask, cpu_stop_fn_t fn, void *arg)
 {
@@ -512,7 +512,7 @@ static void cpu_stopper_thread(unsigned int cpu)
 		ret = fn(arg);
 		if (done) {
 			if (ret)
-				done->ret = ret;
+				done->ret |= ret;
 			cpu_stop_signal_done(done);
 		}
 		preempt_count_dec();
@@ -674,8 +674,8 @@ EXPORT_SYMBOL_GPL(stop_core_cpuslocked);
  * Local CPU is inactive.  Temporarily stops all active CPUs.
  *
  * RETURNS:
- * 0 if all executions of @fn returned 0, any non zero return value if any
- * returned non zero.
+ * 0 if all executions of @fn returned 0, otherwise the accumulated value
+ * of all non-zero @fn returns.
  */
 int stop_machine_from_inactive_cpu(cpu_stop_fn_t fn, void *data,
 				  const struct cpumask *cpus)
@@ -705,5 +705,5 @@ int stop_machine_from_inactive_cpu(cpu_stop_fn_t fn, void *data,
 		cpu_relax();
 
 	mutex_unlock(&stop_cpus_mutex);
-	return ret ?: done.ret;
+	return ret | done.ret;
 }
-- 
2.51.0