[PATCH] posix-cpu-timers: clear TICK_DEP_BIT_POSIX_TIMER on clone

Posted by Benjamin Segall 1 month, 1 week ago
When we clone a new thread, the child does not inherit the parent's
posix_cputimers; we clear them with posix_cputimers_init(). However, this
does not clear the tick dependency those timers created in
tsk->tick_dep_mask, and the handler does not reach the code that clears
the dependency if there were no timers to begin with.

Thus if a thread has a cputimer running before cloning/forking, the
tasks in the resulting hierarchy keep the dependency, which prevents
nohz_full for them unless they create a cputimer of their own.

Process-wide timers do not have this problem because fork does not copy
signal_struct as a baseline; it creates one from scratch.

Fixes: b78783000d5c ("posix-cpu-timers: Migrate to use new tick dependency mask model")
Signed-off-by: Ben Segall <bsegall@google.com>
Cc: stable@vger.kernel.org
---
 kernel/fork.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/fork.c b/kernel/fork.c
index df8e4575ff01..b57cd63cfcd1 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2290,10 +2290,11 @@ __latent_entropy struct task_struct *copy_process(
 
 	task_io_accounting_init(&p->ioac);
 	acct_clear_integrals(p);
 
 	posix_cputimers_init(&p->posix_cputimers);
+	tick_dep_clear_task(p, TICK_DEP_BIT_POSIX_TIMER);
 
 	p->io_context = NULL;
 	audit_set_context(p, NULL);
 	cgroup_fork(p);
 	if (args->kthread) {
-- 
2.47.0.rc1.288.g06298d1525-goog
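
The inheritance problem described in the commit message can be sketched in
userspace. This is a minimal model, not kernel code: struct model_task,
model_arm_timer(), model_clone() and model_can_stop_tick() are hypothetical
names, and the bit value is illustrative. It assumes only what the message
states: the child starts as a copy of the parent, its timers are reset, and
the dependency mask is left alone unless the fix is applied.

```c
/* Userspace model of the bug; all names are hypothetical, not kernel APIs. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_DEP_POSIX_TIMER (1u << 0)  /* illustrative bit value */

struct model_task {
	int armed_cpu_timers;        /* stands in for posix_cputimers */
	unsigned int tick_dep_mask;  /* stands in for tsk->tick_dep_mask */
};

/* Arming a per-thread CPU timer also sets the tick dependency bit. */
static void model_arm_timer(struct model_task *t)
{
	t->armed_cpu_timers++;
	t->tick_dep_mask |= MODEL_DEP_POSIX_TIMER;
}

/* Clone: the child begins as a copy of the parent, then its timers are reset. */
static struct model_task model_clone(const struct model_task *parent,
				     bool apply_fix)
{
	assert(parent != NULL);
	struct model_task child = *parent;

	child.armed_cpu_timers = 0;  /* the effect of posix_cputimers_init() */
	if (apply_fix)               /* models the one-line fix in the patch */
		child.tick_dep_mask &= ~MODEL_DEP_POSIX_TIMER;
	return child;
}

/* In this model the tick can be stopped only when no dependency bit is set. */
static bool model_can_stop_tick(const struct model_task *t)
{
	return t->tick_dep_mask == 0;
}
```

Cloning a parent that has armed a timer with apply_fix set to false yields a
child that has no timers but still carries the bit, so model_can_stop_tick()
returns false for it. With apply_fix set to true the child starts clean.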
Re: [PATCH] posix-cpu-timers: clear TICK_DEP_BIT_POSIX_TIMER on clone
Posted by Frederic Weisbecker 1 month, 1 week ago
On Wed, Oct 16, 2024 at 04:59:08PM -0700, Benjamin Segall wrote:
> diff --git a/kernel/fork.c b/kernel/fork.c
> index df8e4575ff01..b57cd63cfcd1 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -2290,10 +2290,11 @@ __latent_entropy struct task_struct *copy_process(
>  
>  	task_io_accounting_init(&p->ioac);
>  	acct_clear_integrals(p);
>  
>  	posix_cputimers_init(&p->posix_cputimers);
> +	tick_dep_clear_task(p, TICK_DEP_BIT_POSIX_TIMER);

Yes, but we don't need the expensive atomic_fetch_andnot() here. More
generally, the task tick dependency should be 0 upon creation.

So something like this?

diff --git a/include/linux/tick.h b/include/linux/tick.h
index 72744638c5b0..99c9c5a7252a 100644
--- a/include/linux/tick.h
+++ b/include/linux/tick.h
@@ -251,12 +251,19 @@ static inline void tick_dep_set_task(struct task_struct *tsk,
 	if (tick_nohz_full_enabled())
 		tick_nohz_dep_set_task(tsk, bit);
 }
+
 static inline void tick_dep_clear_task(struct task_struct *tsk,
 				       enum tick_dep_bits bit)
 {
 	if (tick_nohz_full_enabled())
 		tick_nohz_dep_clear_task(tsk, bit);
 }
+
+static inline void tick_dep_init_task(struct task_struct *tsk)
+{
+	atomic_set(&tsk->tick_dep_mask, 0);
+}
+
 static inline void tick_dep_set_signal(struct task_struct *tsk,
 				       enum tick_dep_bits bit)
 {
@@ -290,6 +297,7 @@ static inline void tick_dep_set_task(struct task_struct *tsk,
 				     enum tick_dep_bits bit) { }
 static inline void tick_dep_clear_task(struct task_struct *tsk,
 				       enum tick_dep_bits bit) { }
+static inline void tick_dep_init_task(struct task_struct *tsk) { }
 static inline void tick_dep_set_signal(struct task_struct *tsk,
 				       enum tick_dep_bits bit) { }
 static inline void tick_dep_clear_signal(struct signal_struct *signal,
diff --git a/kernel/fork.c b/kernel/fork.c
index 89ceb4a68af2..6fa9fe62e01e 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -105,6 +105,7 @@
 #include <linux/rseq.h>
 #include <uapi/linux/pidfd.h>
 #include <linux/pidfs.h>
+#include <linux/tick.h>
 
 #include <asm/pgalloc.h>
 #include <linux/uaccess.h>
@@ -2292,6 +2293,7 @@ __latent_entropy struct task_struct *copy_process(
 	acct_clear_integrals(p);
 
 	posix_cputimers_init(&p->posix_cputimers);
+	tick_dep_init_task(p);
 
 	p->io_context = NULL;
 	audit_set_context(p, NULL);
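
The difference between the two approaches can be sketched with C11 atomics in
userspace. This is an analogy under stated assumptions, not the kernel
implementation: struct dep_task, dep_clear_bit(), dep_init() and dep_read()
are hypothetical names, the bit values are illustrative, and the C11 calls
stand in for the kernel's atomic_fetch_andnot() and atomic_set(). Clearing
one bit needs an atomic read-modify-write, while a task that is still being
set up can simply have its whole mask stored as 0.

```c
/* Userspace analogy using C11 atomics; names are hypothetical. */
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define DEP_POSIX_TIMER (1u << 0)  /* illustrative bit values */
#define DEP_OTHER       (1u << 4)

struct dep_task {
	atomic_uint tick_dep_mask;
};

/*
 * Clear a single bit with an atomic read-modify-write and return the
 * previous mask. Other bits are preserved.
 */
static unsigned int dep_clear_bit(struct dep_task *t, unsigned int bit)
{
	assert(t != NULL);
	return atomic_fetch_and(&t->tick_dep_mask, ~bit);
}

/*
 * Reset the whole mask with a plain (relaxed) store. This is only
 * appropriate while no other thread can observe or modify the task,
 * as is the case for a child that is still being created.
 */
static void dep_init(struct dep_task *t)
{
	assert(t != NULL);
	atomic_store_explicit(&t->tick_dep_mask, 0, memory_order_relaxed);
}

/* Read the current mask. */
static unsigned int dep_read(struct dep_task *t)
{
	return atomic_load(&t->tick_dep_mask);
}
```

Note that dep_init() drops every bit, not only the POSIX timer one. That
matches the point made above that a new task's dependency mask should be 0
upon creation.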
Re: [PATCH] posix-cpu-timers: clear TICK_DEP_BIT_POSIX_TIMER on clone
Posted by Benjamin Segall 1 month, 1 week ago
Frederic Weisbecker <frederic@kernel.org> writes:

> Yes, but we don't need the expensive atomic_fetch_andnot() here. More
> generally, the task tick dependency should be 0 upon creation.
>
> So something like this?

Yeah, the only other uses are contained in rcu_do_batch and rcutorture
tests, which won't end up here anyways.

Up to you if you want to send this or I can send out a v2.

Re: [PATCH] posix-cpu-timers: clear TICK_DEP_BIT_POSIX_TIMER on clone
Posted by Frederic Weisbecker 1 month, 1 week ago
On Thu, Oct 17, 2024 at 01:09:08PM -0700, Benjamin Segall wrote:
> Yeah, the only other uses are contained in rcu_do_batch and rcutorture
> tests, which won't end up here anyways.
> 
> Up to you if you want to send this or I can send out a v2.

Sounds good, please send a v2.

Thanks!