[PATCH -next 6/6] [please squash] fixup! rcu: Fix rcu_read_unlock() deadloop due to IRQ work

Please squash a few comment-related changes, courtesy of review from
Frederic.

Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
---
 kernel/rcu/tree.h        | 10 ++++++----
 kernel/rcu/tree_plugin.h |  7 ++++++-
 2 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index f8f612269e6e..de6ca13a7b5f 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -175,10 +175,12 @@ struct rcu_snap_record {
 };
 
 /*
- * The IRQ work (deferred_qs_iw) is used by RCU to get scheduler's attention.
- * It can be in one of the following states:
- * - DEFER_QS_IDLE: An IRQ work was never scheduled.
- * - DEFER_QS_PENDING: An IRQ work was scheduler but never run.
+ * An IRQ work (deferred_qs_iw) is used by RCU to get the scheduler's
+ * attention to report quiescent states at the soonest possible time.
+ * The request can be in one of the following states:
+ * - DEFER_QS_IDLE: An IRQ work is yet to be scheduled.
+ * - DEFER_QS_PENDING: An IRQ work was scheduled but either not yet run, or it
+ *                     ran and we still haven't reported a quiescent state.
  */
 #define DEFER_QS_IDLE		0
 #define DEFER_QS_PENDING	1
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index ffe6eb5d8e34..1b9403505c42 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -633,7 +633,12 @@ static void rcu_preempt_deferred_qs_handler(struct irq_work *iwp)
 	local_irq_save(flags);
 
 	/*
-	 * Requeue the IRQ work on next unlock in following situation:
+	 * If the IRQ work handler happens to run in the middle of an RCU
+	 * read-side critical section, it could be ineffective in getting the
+	 * scheduler's attention to report a deferred quiescent state (the
+	 * whole point of the IRQ work). For this reason, requeue the IRQ work.
+	 *
+	 * Basically, we want to avoid the following situation:
 	 * 1. rcu_read_unlock() queues IRQ work (state -> DEFER_QS_PENDING)
 	 * 2. CPU enters new rcu_read_lock()
 	 * 3. IRQ work runs but cannot report QS due to rcu_preempt_depth() > 0
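
For reference, below is a small, self-contained userspace sketch of the
DEFER_QS_IDLE/DEFER_QS_PENDING handshake that the comments above describe.
It is only an illustration, not the kernel implementation: rcu_depth,
queue_irq_work(), rcu_read_unlock_slowpath(), deferred_qs_handler() and
report_deferred_qs() are simplified stand-ins used to walk through the
requeue-on-next-unlock idea; the real code lives in kernel/rcu/tree_plugin.h.

#include <stdio.h>

#define DEFER_QS_IDLE		0
#define DEFER_QS_PENDING	1

static int defer_qs_iw_state = DEFER_QS_IDLE;
static int rcu_depth;		/* stand-in for rcu_preempt_depth() */

static void queue_irq_work(void)
{
	printf("  IRQ work queued\n");
}

/* rcu_read_unlock() slow path: arm the IRQ work only when none is pending,
 * which is what keeps the IRQ work's own unlock from requeueing forever. */
static void rcu_read_unlock_slowpath(void)
{
	if (defer_qs_iw_state == DEFER_QS_IDLE) {
		defer_qs_iw_state = DEFER_QS_PENDING;
		queue_irq_work();
	} else {
		printf("  IRQ work already pending, not requeued\n");
	}
}

/* IRQ work handler: if it fires inside a read-side critical section it
 * cannot get the quiescent state reported, so it falls back to
 * DEFER_QS_IDLE and the next rcu_read_unlock() queues a fresh IRQ work
 * instead of losing the request. */
static void deferred_qs_handler(void)
{
	if (rcu_depth > 0) {
		defer_qs_iw_state = DEFER_QS_IDLE;
		printf("  handler ran inside a read-side section, back to IDLE\n");
	} else {
		printf("  handler got the scheduler's attention\n");
	}
}

/* Deferred QS finally reported (outside any read-side section). */
static void report_deferred_qs(void)
{
	defer_qs_iw_state = DEFER_QS_IDLE;
	printf("  deferred QS reported, back to IDLE\n");
}

int main(void)
{
	printf("rcu_read_unlock() queues the IRQ work:\n");
	rcu_read_unlock_slowpath();

	printf("new rcu_read_lock(); the IRQ work fires too early:\n");
	rcu_depth = 1;
	deferred_qs_handler();

	printf("next rcu_read_unlock() requeues the IRQ work:\n");
	rcu_depth = 0;
	rcu_read_unlock_slowpath();

	printf("the IRQ work fires outside any read-side section:\n");
	deferred_qs_handler();
	report_deferred_qs();
	return 0;
}

Building the sketch with any C compiler and running it prints the requeue
sequence in order, which may help when reading the updated comments.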
-- 
2.34.1