[PATCH] workqueue: Check for null pointer return from get_work_pwq()

John Moon posted 1 patch 2 years, 9 months ago
kernel/workqueue.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
[PATCH] workqueue: Check for null pointer return from get_work_pwq()
Posted by John Moon 2 years, 9 months ago
We've encountered a kernel panic with the following stack trace:

-> ret_from_fork
 -> kthread
  -> worker_thread
   -> process_one_work
    -> pwq_dec_nr_in_flight
     -> pwq_activate_inactive_work

The issue was narrowed down to a null pointer dereference within
pwq_activate_inactive_work() stemming from the return value of
get_work_pwq() which may return NULL, but was not checked for
null return prior to use.

While fixing the issue, other dereferences of get_work_pwq()'s
return value were found without a null check.

Add null pointer checks to the calling functions that need them.

Signed-off-by: John Moon <quic_johmoo@quicinc.com>
---
 kernel/workqueue.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 7cd5f5e7e0a1..5de0a2e1aeaa 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -1162,6 +1162,9 @@ static void pwq_activate_inactive_work(struct work_struct *work)
 {
 	struct pool_workqueue *pwq = get_work_pwq(work);

+	if (!pwq)
+		return;
+
 	trace_workqueue_activate_work(work);
 	if (list_empty(&pwq->pool->worklist))
 		pwq->pool->watchdog_ts = jiffies;
@@ -2030,8 +2033,12 @@ static void idle_worker_timeout(struct timer_list *t)
 static void send_mayday(struct work_struct *work)
 {
 	struct pool_workqueue *pwq = get_work_pwq(work);
-	struct workqueue_struct *wq = pwq->wq;
+	struct workqueue_struct *wq;
+
+	if (!pwq)
+		return;

+	wq = pwq->wq;
 	lockdep_assert_held(&wq_mayday_lock);

 	if (!wq->rescuer)
@@ -2184,9 +2191,10 @@ __acquires(&pool->lock)
 {
 	struct pool_workqueue *pwq = get_work_pwq(work);
 	struct worker_pool *pool = worker->pool;
-	bool cpu_intensive = pwq->wq->flags & WQ_CPU_INTENSIVE;
+	bool cpu_intensive;
 	unsigned long work_data;
 	struct worker *collision;
+
 #ifdef CONFIG_LOCKDEP
 	/*
 	 * It is permissible to free the struct work_struct from
@@ -2199,6 +2207,11 @@ __acquires(&pool->lock)

 	lockdep_copy_map(&lockdep_map, &work->lockdep_map);
 #endif
+	if (!pwq)
+		return;
+
+	cpu_intensive = pwq->wq->flags & WQ_CPU_INTENSIVE;
+
 	/* ensure we're on the correct CPU */
 	WARN_ON_ONCE(!(pool->flags & POOL_DISASSOCIATED) &&
 		     raw_smp_processor_id() != pool->cpu);
--
2.17.1
Re: [PATCH] workqueue: Check for null pointer return from get_work_pwq()
Posted by Tejun Heo 2 years, 9 months ago
On Wed, Dec 07, 2022 at 04:53:44PM -0800, John Moon wrote:
> We've encountered a kernel panic with the following stack trace:
> 
> -> ret_from_fork
>  -> kthread
>   -> worker_thread
>    -> process_one_work
>     -> pwq_dec_nr_in_flight
>      -> pwq_activate_inactive_work
> 
> The issue was narrowed down to a null pointer dereference within
> pwq_activate_inactive_work() stemming from the return value of
> get_work_pwq() which may return NULL, but was not checked for
> null return prior to use.
> 
> While fixing the issue, other dereferences of get_work_pwq()'s
> return value were found without a null check.
> 
> Add null pointer checks to the calling functions that need them.

At that point the work item must have pwq assigned - see insert_work(), so
this can't be the root cause. It's just papering over a bug somewhere else
(e.g. the work item got freed or written over somehow).

Thanks.

-- 
tejun