[RFC PATCH] workqueue: Fix rescuer task's name truncated

Wenchao Hao posted 1 patch 1 year, 9 months ago
There is a newer version of this series
kernel/workqueue.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
[RFC PATCH] workqueue: Fix rescuer task's name truncated
Posted by Wenchao Hao 1 year, 9 months ago
Task comm of task is limitted to 16, prefix "kworker/R-" is added for
rescuer worker's task, which cause most task name is truncated as
following:

root   81  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-xprti]
root   82  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-cfg80]
root   85  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-nfsio]
root   86  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-xfsal]
root   87  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-xfs_m]
root   88  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-acpi_]
root   93  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-iscsi]
root   95  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-scsi_]
root   97  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-scsi_]
root   99  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-scsi_]

I want to fix this issue by split rescuer name to 2 part like other
kworker, the normal part is "kworker/R" which is set to task_struct's comm,
another part is wq->name which is added to kworker's desc. These 2 parts
would be merged in wq_worker_comm().

Fixes: b6a46f7263bd ("workqueue: Rename rescuer kworker")

Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
---
 kernel/workqueue.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 0066c8f6c154..0ce9e8597a4d 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5430,7 +5430,7 @@ static int init_rescuer(struct workqueue_struct *wq)
 	}
 
 	rescuer->rescue_wq = wq;
-	rescuer->task = kthread_create(rescuer_thread, rescuer, "kworker/R-%s", wq->name);
+	rescuer->task = kthread_create(rescuer_thread, rescuer, "kworker/R");
 	if (IS_ERR(rescuer->task)) {
 		ret = PTR_ERR(rescuer->task);
 		pr_err("workqueue: Failed to create a rescuer kthread for wq \"%s\": %pe",
@@ -5439,6 +5439,8 @@ static int init_rescuer(struct workqueue_struct *wq)
 		return ret;
 	}
 
+	snprintf(rescuer->desc, sizeof(rescuer->desc), "%s", wq->name);
+
 	wq->rescuer = rescuer;
 	if (wq->flags & WQ_UNBOUND)
 		kthread_bind_mask(rescuer->task, wq_unbound_cpumask);
@@ -6289,6 +6291,8 @@ void wq_worker_comm(char *buf, size_t size, struct task_struct *task)
 						  worker->desc);
 			}
 			raw_spin_unlock_irq(&pool->lock);
+		} else if (worker->desc[0] != '\0') {
+			scnprintf(buf + off, size - off, "-%s", worker->desc);
 		}
 	}
 
-- 
2.32.0
Re: [RFC PATCH] workqueue: Fix rescuer task's name truncated
Posted by Aaron Tomlin 1 year, 9 months ago
On Wed, Apr 24, 2024 at 02:21:04AM GMT, Wenchao Hao wrote:
> Task comm of task is limitted to 16, prefix "kworker/R-" is added for
> rescuer worker's task, which cause most task name is truncated as
> following:
> 
> root   81  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-xprti]
> root   82  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-cfg80]
> root   85  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-nfsio]
> root   86  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-xfsal]
> root   87  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-xfs_m]
> root   88  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-acpi_]
> root   93  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-iscsi]
> root   95  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-scsi_]
> root   97  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-scsi_]
> root   99  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-scsi_]
> 
> I want to fix this issue by split rescuer name to 2 part like other
> kworker, the normal part is "kworker/R" which is set to task_struct's comm,
> another part is wq->name which is added to kworker's desc. These 2 parts
> would be merged in wq_worker_comm().
> 
> Fixes: b6a46f7263bd ("workqueue: Rename rescuer kworker")
> 
> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
> ---
>  kernel/workqueue.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 0066c8f6c154..0ce9e8597a4d 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -5430,7 +5430,7 @@ static int init_rescuer(struct workqueue_struct *wq)
>  	}
>  
>  	rescuer->rescue_wq = wq;
> -	rescuer->task = kthread_create(rescuer_thread, rescuer, "kworker/R-%s", wq->name);
> +	rescuer->task = kthread_create(rescuer_thread, rescuer, "kworker/R");
>  	if (IS_ERR(rescuer->task)) {
>  		ret = PTR_ERR(rescuer->task);
>  		pr_err("workqueue: Failed to create a rescuer kthread for wq \"%s\": %pe",
> @@ -5439,6 +5439,8 @@ static int init_rescuer(struct workqueue_struct *wq)
>  		return ret;
>  	}
>  
> +	snprintf(rescuer->desc, sizeof(rescuer->desc), "%s", wq->name);
> +
>  	wq->rescuer = rescuer;
>  	if (wq->flags & WQ_UNBOUND)
>  		kthread_bind_mask(rescuer->task, wq_unbound_cpumask);
> @@ -6289,6 +6291,8 @@ void wq_worker_comm(char *buf, size_t size, struct task_struct *task)
>  						  worker->desc);
>  			}
>  			raw_spin_unlock_irq(&pool->lock);
> +		} else if (worker->desc[0] != '\0') {
> +			scnprintf(buf + off, size - off, "-%s", worker->desc);
>  		}
>  	}
>  
> -- 

Reviewed-by: Aaron Tomlin <atomlin@atomlin.com>

-- 
Aaron Tomlin
Re: [RFC PATCH] workqueue: Fix rescuer task's name truncated
Posted by Tejun Heo 1 year, 9 months ago
Hello

On Sat, May 11, 2024 at 03:15:11PM +0100, Aaron Tomlin wrote:
> > @@ -5439,6 +5439,8 @@ static int init_rescuer(struct workqueue_struct *wq)
> >  		return ret;
> >  	}
> >  
> > +	snprintf(rescuer->desc, sizeof(rescuer->desc), "%s", wq->name);

Can you please address the testbot reported warning?

> >  	wq->rescuer = rescuer;
> >  	if (wq->flags & WQ_UNBOUND)
> >  		kthread_bind_mask(rescuer->task, wq_unbound_cpumask);
> > @@ -6289,6 +6291,8 @@ void wq_worker_comm(char *buf, size_t size, struct task_struct *task)
> >  						  worker->desc);
> >  			}
> >  			raw_spin_unlock_irq(&pool->lock);
> > +		} else if (worker->desc[0] != '\0') {
> > +			scnprintf(buf + off, size - off, "-%s", worker->desc);
> >  		}
> >  	}
> >  
> > -- 
> 
> Reviewed-by: Aaron Tomlin <atomlin@atomlin.com>

The patch looks fine to me otherwise.

Thanks.

-- 
tejun
Re: [RFC PATCH] workqueue: Fix rescuer task's name truncated
Posted by Wenchao Hao 1 year, 9 months ago
On 2024/5/14 1:17, Tejun Heo wrote:
> Hello
> 
> On Sat, May 11, 2024 at 03:15:11PM +0100, Aaron Tomlin wrote:
>>> @@ -5439,6 +5439,8 @@ static int init_rescuer(struct workqueue_struct *wq)
>>>   		return ret;
>>>   	}
>>>   
>>> +	snprintf(rescuer->desc, sizeof(rescuer->desc), "%s", wq->name);
> 
> Can you please address the testbot reported warning?
> 

Of course.

The warning is also reported here: https://lore.kernel.org/all/20240513030639.3772468-1-haowenchao2@huawei.com/

It's caused because commit 31c89007285d ("workqueue.c: Increase workqueue name length") increase
WQ_NAME_LEN from 24 to 32, but forget to increase WORKER_DESC_LEN which should equalto WQ_NAME_LEN.
Same usage can be found in process_one_work(), it called strscpy() which would not WARNING.

I sent a V2 patch with following changes appended for this warning, maybe you missed them.

diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h
index 158784dd189a..72031fa80414 100644
--- a/include/linux/workqueue.h
+++ b/include/linux/workqueue.h
@@ -92,7 +92,7 @@ enum wq_misc_consts {
         WORK_BUSY_RUNNING       = 1 << 1,
  
         /* maximum string length for set_worker_desc() */
-       WORKER_DESC_LEN         = 24,
+       WORKER_DESC_LEN         = 32,
  };
  
  /* Convenience constants - of type 'unsigned long', not 'enum'! */

>>>   	wq->rescuer = rescuer;
>>>   	if (wq->flags & WQ_UNBOUND)
>>>   		kthread_bind_mask(rescuer->task, wq_unbound_cpumask);
>>> @@ -6289,6 +6291,8 @@ void wq_worker_comm(char *buf, size_t size, struct task_struct *task)
>>>   						  worker->desc);
>>>   			}
>>>   			raw_spin_unlock_irq(&pool->lock);
>>> +		} else if (worker->desc[0] != '\0') {
>>> +			scnprintf(buf + off, size - off, "-%s", worker->desc);
>>>   		}
>>>   	}
>>>   
>>> -- 
>>
>> Reviewed-by: Aaron Tomlin <atomlin@atomlin.com>
> 
> The patch looks fine to me otherwise.
> 
> Thanks.
>
Re: [RFC PATCH] workqueue: Fix rescuer task's name truncated
Posted by Wenchao Hao 1 year, 9 months ago
On 2024/5/11 22:15, Aaron Tomlin wrote:
> On Wed, Apr 24, 2024 at 02:21:04AM GMT, Wenchao Hao wrote:
>> Task comm of task is limitted to 16, prefix "kworker/R-" is added for
>> rescuer worker's task, which cause most task name is truncated as
>> following:
>>
>> root   81  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-xprti]
>> root   82  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-cfg80]
>> root   85  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-nfsio]
>> root   86  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-xfsal]
>> root   87  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-xfs_m]
>> root   88  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-acpi_]
>> root   93  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-iscsi]
>> root   95  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-scsi_]
>> root   97  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-scsi_]
>> root   99  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-scsi_]
>>
>> I want to fix this issue by split rescuer name to 2 part like other
>> kworker, the normal part is "kworker/R" which is set to task_struct's comm,
>> another part is wq->name which is added to kworker's desc. These 2 parts
>> would be merged in wq_worker_comm().
>>
>> Fixes: b6a46f7263bd ("workqueue: Rename rescuer kworker")
>>
>> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
>> ---
>>   kernel/workqueue.c | 6 +++++-
>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
>> index 0066c8f6c154..0ce9e8597a4d 100644
>> --- a/kernel/workqueue.c
>> +++ b/kernel/workqueue.c
>> @@ -5430,7 +5430,7 @@ static int init_rescuer(struct workqueue_struct *wq)
>>   	}
>>   
>>   	rescuer->rescue_wq = wq;
>> -	rescuer->task = kthread_create(rescuer_thread, rescuer, "kworker/R-%s", wq->name);
>> +	rescuer->task = kthread_create(rescuer_thread, rescuer, "kworker/R");
>>   	if (IS_ERR(rescuer->task)) {
>>   		ret = PTR_ERR(rescuer->task);
>>   		pr_err("workqueue: Failed to create a rescuer kthread for wq \"%s\": %pe",
>> @@ -5439,6 +5439,8 @@ static int init_rescuer(struct workqueue_struct *wq)
>>   		return ret;
>>   	}
>>   
>> +	snprintf(rescuer->desc, sizeof(rescuer->desc), "%s", wq->name);
>> +
>>   	wq->rescuer = rescuer;
>>   	if (wq->flags & WQ_UNBOUND)
>>   		kthread_bind_mask(rescuer->task, wq_unbound_cpumask);
>> @@ -6289,6 +6291,8 @@ void wq_worker_comm(char *buf, size_t size, struct task_struct *task)
>>   						  worker->desc);
>>   			}
>>   			raw_spin_unlock_irq(&pool->lock);
>> +		} else if (worker->desc[0] != '\0') {
>> +			scnprintf(buf + off, size - off, "-%s", worker->desc);
>>   		}
>>   	}
>>   
>> -- 
> 
> Reviewed-by: Aaron Tomlin <atomlin@atomlin.com>
> 
Thanks for your review, I would add your Reviewed-by and remove the RFC 
tags, then post again.
Re: [RFC PATCH] workqueue: Fix rescuer task's name truncated
Posted by Wenchao Hao 1 year, 9 months ago
On 2024/4/24 2:21, Wenchao Hao wrote:
> Task comm of task is limitted to 16, prefix "kworker/R-" is added for
> rescuer worker's task, which cause most task name is truncated as
> following:
> 
> root   81  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-xprti]
> root   82  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-cfg80]
> root   85  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-nfsio]
> root   86  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-xfsal]
> root   87  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-xfs_m]
> root   88  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-acpi_]
> root   93  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-iscsi]
> root   95  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-scsi_]
> root   97  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-scsi_]
> root   99  0.0  0.0  0  0 ?   I<   11:18   0:00 [kworker/R-scsi_]
> 
> I want to fix this issue by split rescuer name to 2 part like other
> kworker, the normal part is "kworker/R" which is set to task_struct's comm,
> another part is wq->name which is added to kworker's desc. These 2 parts
> would be merged in wq_worker_comm().
> 
> Fixes: b6a46f7263bd ("workqueue: Rename rescuer kworker")
> 
> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
> ---
>   kernel/workqueue.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
> index 0066c8f6c154..0ce9e8597a4d 100644
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -5430,7 +5430,7 @@ static int init_rescuer(struct workqueue_struct *wq)
>   	}
>   
>   	rescuer->rescue_wq = wq;
> -	rescuer->task = kthread_create(rescuer_thread, rescuer, "kworker/R-%s", wq->name);
> +	rescuer->task = kthread_create(rescuer_thread, rescuer, "kworker/R");
>   	if (IS_ERR(rescuer->task)) {
>   		ret = PTR_ERR(rescuer->task);
>   		pr_err("workqueue: Failed to create a rescuer kthread for wq \"%s\": %pe",
> @@ -5439,6 +5439,8 @@ static int init_rescuer(struct workqueue_struct *wq)
>   		return ret;
>   	}
>   
> +	snprintf(rescuer->desc, sizeof(rescuer->desc), "%s", wq->name);
> +
>   	wq->rescuer = rescuer;
>   	if (wq->flags & WQ_UNBOUND)
>   		kthread_bind_mask(rescuer->task, wq_unbound_cpumask);
> @@ -6289,6 +6291,8 @@ void wq_worker_comm(char *buf, size_t size, struct task_struct *task)
>   						  worker->desc);
>   			}
>   			raw_spin_unlock_irq(&pool->lock);
> +		} else if (worker->desc[0] != '\0') {
> +			scnprintf(buf + off, size - off, "-%s", worker->desc);
>   		}


Hi Tejun and all:

I added another logic in wq_worker_comm() to append worker's desc when
worker is not attached to a work pool. I don't know why the origin
logic only append worker's desc when worker is attached to a work pool,
so I am not sure if it's safe to using worker here directly.

>   	}
>
Re: [RFC PATCH] workqueue: Fix rescuer task's name truncated
Posted by Aaron Tomlin 1 year, 9 months ago
On Tue, Apr 23, 2024 at 12:55:22PM +0800, Wenchao Hao wrote:
> Hi Tejun and all:
> 
> I added another logic in wq_worker_comm() to append worker's desc when
> worker is not attached to a work pool. I don't know why the origin
> logic only append worker's desc when worker is attached to a work pool,
> so I am not sure if it's safe to using worker here directly.

Hi Wenchao,

A worker description is always updated under the per-pool workqueue lock.
You can indeed use the rescuer's own description to store its associated
workqueue name - good idea. We know the caller of set_worker_desc() will
never touch a rescuer. For this unique rescuer case, if I understand
correctly, you can read a rescuer's description outside the per-pool
workqueue lock; albeit, you need to prevent a race with destroy_workqueue()
to avoid a use-after-free.


Kind regards,

-- 
Aaron Tomlin
Re: [RFC PATCH] workqueue: Fix rescuer task's name truncated
Posted by Wenchao Hao 1 year, 9 months ago
On 4/24/24 3:12 PM, Aaron Tomlin wrote:
> On Tue, Apr 23, 2024 at 12:55:22PM +0800, Wenchao Hao wrote:
>> Hi Tejun and all:
>>
>> I added another logic in wq_worker_comm() to append worker's desc when
>> worker is not attached to a work pool. I don't know why the origin
>> logic only append worker's desc when worker is attached to a work pool,
>> so I am not sure if it's safe to using worker here directly.
> 
> Hi Wenchao,
> 
> A worker description is always updated under the per-pool workqueue lock.
> You can indeed use the rescuer's own description to store its associated
> workqueue name - good idea. We know the caller of set_worker_desc() will
> never touch a rescuer. For this unique rescuer case, if I understand
> correctly, you can read a rescuer's description outside the per-pool
> workqueue lock; albeit, you need to prevent a race with destroy_workqueue()
> to avoid a use-after-free.
> 

Hi Aaron, thanks a lot for your reply.

I think destroy_workqueue() may not race with wq_worker_comm(),
wq_pool_attach_mutex is used to avoid race, below is my analysis.
(Welcome to point out if my understand is incorrect)

t1 which call destroy_workqueue()     rescuer->task

destroy_workqueue()
  kthread_stop(rescuer->task)
                                      rescuer_thread()
	                                if (should_stop) {
	                                  __set_current_state(TASK_RUNNING);
	                                  set_pf_worker(false);
                                            mutex_lock(&wq_pool_attach_mutex);
				            current->flags &= ~PF_WQ_WORKER;
				            mutex_unlock(&wq_pool_attach_mutex);
	                                  return 0;
	                                }

  kfree(rescuer)

wq_worker_comm() would acquire wq_pool_attach_mutex then check if task->flags
is set PF_WQ_WORKER.
If PF_WQ_WORKER is not set, wq_worker_comm() would not access this task's worker
any more;
If PF_WQ_WORKER is set, the wq_pool_attach_mutex is held durning access of task's
worker.

What confuse me mostly is why the origin logic only append worker's desc when
worker is attached to a work pool.

Thinks.

> 
> Kind regards,
>
Re: [RFC PATCH] workqueue: Fix rescuer task's name truncated
Posted by Aaron Tomlin 1 year, 9 months ago
On Wed, Apr 24, 2024 at 11:12:24PM GMT, Wenchao Hao wrote:
> Hi Aaron, thanks a lot for your reply.

No problem.

> I think destroy_workqueue() may not race with wq_worker_comm(),
> wq_pool_attach_mutex is used to avoid race, below is my analysis.
> (Welcome to point out if my understand is incorrect)
> 
> t1 which call destroy_workqueue()     rescuer->task
> 
> destroy_workqueue()
>   kthread_stop(rescuer->task)
>                                       rescuer_thread()
> 	                                if (should_stop) {
> 	                                  __set_current_state(TASK_RUNNING);
> 	                                  set_pf_worker(false);
>                                             mutex_lock(&wq_pool_attach_mutex);
> 				            current->flags &= ~PF_WQ_WORKER;
> 				            mutex_unlock(&wq_pool_attach_mutex);
> 	                                  return 0;
> 	                                }
> 
>   kfree(rescuer)
> 
> wq_worker_comm() would acquire wq_pool_attach_mutex then check if task->flags
> is set PF_WQ_WORKER.
> If PF_WQ_WORKER is not set, wq_worker_comm() would not access this task's worker
> any more;
> If PF_WQ_WORKER is set, the wq_pool_attach_mutex is held durning access of task's
> worker.

Indeed. If I understand correctly then a use-after-free is theoretically
impossible due to the use of the 'wq_pool_attach_mutex' in the context of
rescuer_thread() and wq_pool_attach_mutex(), respectively.

> What confuse me mostly is why the origin logic only append worker's desc when
> worker is attached to a work pool.

I can only assume there was no intention to use the rescuer kworker's
description information for this purpose. Your patch looks fine to me now.




-- 
Aaron Tomlin