[PATCH 01/12] task_work: Fix NMI race condition

Peter Zijlstra posted 12 patches 1 week ago
Posted by Peter Zijlstra 1 week ago
  __schedule()
  // disable irqs
      <NMI>
	  task_work_add(current, work, TWA_NMI_CURRENT);
      </NMI>
  // current = next;
  // enable irqs
      <IRQ>
	  task_work_set_notify_irq()
	  test_and_set_tsk_thread_flag(current,
                                       TIF_NOTIFY_RESUME); // wrong task!
      </IRQ>
  // original task skips task work on its next return to user (or exit!)

Fixes: 466e4d801cd4 ("task_work: Add TWA_NMI_CURRENT as an additional notify mode.")
Reported-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 kernel/task_work.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -9,7 +9,12 @@ static struct callback_head work_exited;
 #ifdef CONFIG_IRQ_WORK
 static void task_work_set_notify_irq(struct irq_work *entry)
 {
-	test_and_set_tsk_thread_flag(current, TIF_NOTIFY_RESUME);
+	/*
+	 * no-op IPI
+	 *
+	 * TWA_NMI_CURRENT will already have set the TIF flag, all
+	 * this interrupt does is tickle the return-to-user path.
+	 */
 }
 static DEFINE_PER_CPU(struct irq_work, irq_work_NMI_resume) =
 	IRQ_WORK_INIT_HARD(task_work_set_notify_irq);
@@ -86,6 +91,7 @@ int task_work_add(struct task_struct *ta
 		break;
 #ifdef CONFIG_IRQ_WORK
 	case TWA_NMI_CURRENT:
+		set_tsk_thread_flag(current, TIF_NOTIFY_RESUME);
 		irq_work_queue(this_cpu_ptr(&irq_work_NMI_resume));
 		break;
 #endif
Re: [PATCH 01/12] task_work: Fix NMI race condition
Posted by Steven Rostedt 10 hours ago
On Wed, 24 Sep 2025 09:59:49 +0200
Peter Zijlstra <peterz@infradead.org> wrote:

>   __schedule()
>   // disable irqs
>       <NMI>
> 	  task_work_add(current, work, TWA_NMI_CURRENT);
>       </NMI>
>   // current = next;
>   // enable irqs
>       <IRQ>
> 	  task_work_set_notify_irq()
> 	  test_and_set_tsk_thread_flag(current,
>                                        TIF_NOTIFY_RESUME); // wrong task!
>       </IRQ>
>   // original task skips task work on its next return to user (or exit!)
> 
> Fixes: 466e4d801cd4 ("task_work: Add TWA_NMI_CURRENT as an additional notify mode.")
> Reported-by: Josh Poimboeuf <jpoimboe@kernel.org>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>

-- Steve