[PATCH V2 3/4] posix-timers: Initialise timer->it_signal in posix_timer_add()

Eric Dumazet posted 4 patches 10 months ago
Posted by Eric Dumazet 10 months ago
Instead of leaving a NULL value in timer->it_signal,
set it to the current sig pointer, but with the low order bit set.

This fixes a potential race, in the unlikely case a thread
was preempted long enough that other threads created more than
2^31 itimers.

Rename __posix_timers_find() to posix_timers_find()

Mask the low order bit in posix_timers_find().

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 kernel/time/posix-timers.c | 28 +++++++++++++++++++++-------
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index 1f73ea955756..ed27c7eab456 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -72,15 +72,22 @@ static int hash(struct signal_struct *sig, unsigned int nr)
 	return hash_32(hash32_ptr(sig) ^ nr, HASH_BITS(posix_timers_hashtable));
 }
 
-static struct k_itimer *__posix_timers_find(struct hlist_head *head,
+static struct signal_struct *posix_sig_owner(const struct k_itimer *timer)
+{
+	/* timer->it_signal can be set concurrently */
+	unsigned long val = (unsigned long)READ_ONCE(timer->it_signal);
+
+	return (struct signal_struct *)(val & ~1UL);
+}
+
+static struct k_itimer *posix_timers_find(struct hlist_head *head,
 					    struct signal_struct *sig,
 					    timer_t id)
 {
 	struct k_itimer *timer;
 
 	hlist_for_each_entry_rcu(timer, head, t_hash, lockdep_is_held(&hash_lock)) {
-		/* timer->it_signal can be set concurrently */
-		if ((READ_ONCE(timer->it_signal) == sig) && (timer->it_id == id))
+		if ((posix_sig_owner(timer) == sig) && (timer->it_id == id))
 			return timer;
 	}
 	return NULL;
@@ -90,8 +97,14 @@ static struct k_itimer *posix_timer_by_id(timer_t id)
 {
 	struct signal_struct *sig = current->signal;
 	struct hlist_head *head = &posix_timers_hashtable[hash(sig, id)];
+	struct k_itimer *timer;
 
-	return __posix_timers_find(head, sig, id);
+	hlist_for_each_entry_rcu(timer, head, t_hash) {
+		/* timer->it_signal can be set concurrently */
+		if ((READ_ONCE(timer->it_signal) == sig) && (timer->it_id == id))
+			return timer;
+	}
+	return NULL;
 }
 
 static int posix_timer_add(struct k_itimer *timer)
@@ -113,8 +126,9 @@ static int posix_timer_add(struct k_itimer *timer)
 		head = &posix_timers_hashtable[hash(sig, id)];
 
 		spin_lock(&hash_lock);
-		if (!__posix_timers_find(head, sig, id)) {
+		if (!posix_timers_find(head, sig, id)) {
 			timer->it_id = (timer_t)id;
+			timer->it_signal = (struct signal_struct *)((unsigned long)sig | 1UL);
 			hlist_add_head_rcu(&timer->t_hash, head);
 			spin_unlock(&hash_lock);
 			return id;
@@ -453,7 +467,7 @@ static int do_timer_create(clockid_t which_clock, struct sigevent *event,
 	}
 	/*
 	 * After succesful copy out, the timer ID is visible to user space
-	 * now but not yet valid because new_timer::signal is still NULL.
+	 * now but not yet valid because new_timer::signal low order bit is 1.
 	 *
 	 * Complete the initialization with the clock specific create
 	 * callback.
@@ -463,7 +477,7 @@ static int do_timer_create(clockid_t which_clock, struct sigevent *event,
 		goto out;
 
 	spin_lock_irq(&current->sighand->siglock);
-	/* This makes the timer valid in the hash table */
+	/* This makes the timer valid in the hash table, clearing low order bit. */
 	WRITE_ONCE(new_timer->it_signal, current->signal);
 	hlist_add_head(&new_timer->list, &current->signal->posix_timers);
 	spin_unlock_irq(&current->sighand->siglock);
-- 
2.48.1.601.g30ceb7b040-goog
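
The low order bit trick in the patch above can be tried out in user space. The following is only a minimal sketch, not the kernel code: the struct definitions are cut-down stand-ins, and sig_mark_pending(), sig_owner(), id_in_use() and timer_is_valid_for() are hypothetical names chosen for illustration. It assumes struct signal_struct is at least 2-byte aligned, so bit 0 of a valid pointer is always zero and is free to carry the "reserved, not yet valid" flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down stand-ins for the kernel structures (assumption, not the real layout). */
struct signal_struct { int dummy; };
struct k_itimer {
	struct signal_struct *it_signal;
	int it_id;
};

/* Tag the owner pointer: bit 0 set means "ID reserved, timer not yet valid". */
static struct signal_struct *sig_mark_pending(struct signal_struct *sig)
{
	return (struct signal_struct *)((uintptr_t)sig | (uintptr_t)1);
}

/* Recover the owner regardless of the tag, like posix_sig_owner() in the patch. */
static struct signal_struct *sig_owner(const struct k_itimer *timer)
{
	return (struct signal_struct *)((uintptr_t)timer->it_signal & ~(uintptr_t)1);
}

/* Collision check at ID allocation time: pending timers count as taken. */
static int id_in_use(const struct k_itimer *timer,
		     struct signal_struct *sig, int id)
{
	return sig_owner(timer) == sig && timer->it_id == id;
}

/* Lookup by ID: the unmasked compare only matches fully published timers. */
static int timer_is_valid_for(const struct k_itimer *timer,
			      struct signal_struct *sig, int id)
{
	return timer->it_signal == sig && timer->it_id == id;
}
```

With these helpers, a timer whose it_signal is (sig | 1) blocks reuse of its ID but is still invisible to lookups. Storing the plain sig pointer afterwards, which is the full write at the end of do_timer_create(), is what makes it visible.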
Re: [PATCH V2 3/4] posix-timers: Initialise timer->it_signal in posix_timer_add()
Posted by Thomas Gleixner 10 months ago
On Wed, Feb 19 2025 at 12:55, Eric Dumazet wrote:
> Instead of leaving a NULL value in timer->it_signal,
> set it to the current sig pointer, but with the low order bit set.

And that low order bit set does what?

> This fixes a potential race, in the unlikely case a thread
> was preempted long enough that other threads created more than
> 2^31 itimers.

and then what happens?

> Rename __posix_timers_find() to posix_timers_find()

That's not what the patch does. It renames to posix_sig_owner(). Aside
from that, the rename is not relevant to the problem itself.

> Mask the low order bit in posix_timers_find().

What for?

I pointed you before to the changelog documentation, which clearly says:

  A good structure is to explain the context, the problem and the
  solution in separate paragraphs and this order.

Writing proper change logs is not too much to ask.

> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  kernel/time/posix-timers.c | 28 +++++++++++++++++++++-------
>  1 file changed, 21 insertions(+), 7 deletions(-)
>
> diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
> index 1f73ea955756..ed27c7eab456 100644
> --- a/kernel/time/posix-timers.c
> +++ b/kernel/time/posix-timers.c
> @@ -72,15 +72,22 @@ static int hash(struct signal_struct *sig, unsigned int nr)
>  	return hash_32(hash32_ptr(sig) ^ nr, HASH_BITS(posix_timers_hashtable));
>  }
>  
> -static struct k_itimer *__posix_timers_find(struct hlist_head *head,
> +static struct signal_struct *posix_sig_owner(const struct k_itimer *timer)
> +{
> +	/* timer->it_signal can be set concurrently */
> +	unsigned long val = (unsigned long)READ_ONCE(timer->it_signal);
> +
> +	return (struct signal_struct *)(val & ~1UL);
> +}
> +
> +static struct k_itimer *posix_timers_find(struct hlist_head *head,
>  					    struct signal_struct *sig,
>  					    timer_t id)
>  {
>  	struct k_itimer *timer;
>  
>  	hlist_for_each_entry_rcu(timer, head, t_hash, lockdep_is_held(&hash_lock)) {
> -		/* timer->it_signal can be set concurrently */
> -		if ((READ_ONCE(timer->it_signal) == sig) && (timer->it_id == id))
> +		if ((posix_sig_owner(timer) == sig) && (timer->it_id == id))
>  			return timer;
>  	}
>  	return NULL;
> @@ -90,8 +97,14 @@ static struct k_itimer *posix_timer_by_id(timer_t id)
>  {
>  	struct signal_struct *sig = current->signal;
>  	struct hlist_head *head = &posix_timers_hashtable[hash(sig, id)];
> +	struct k_itimer *timer;
>  
> -	return __posix_timers_find(head, sig, id);
> +	hlist_for_each_entry_rcu(timer, head, t_hash) {
> +		/* timer->it_signal can be set concurrently */
> +		if ((READ_ONCE(timer->it_signal) == sig) && (timer->it_id == id))
> +			return timer;
> +	}
> +	return NULL;
>  }
>  
>  static int posix_timer_add(struct k_itimer *timer)
> @@ -113,8 +126,9 @@ static int posix_timer_add(struct k_itimer *timer)
>  		head = &posix_timers_hashtable[hash(sig, id)];
>  
>  		spin_lock(&hash_lock);
> -		if (!__posix_timers_find(head, sig, id)) {
> +		if (!posix_timers_find(head, sig, id)) {
>  			timer->it_id = (timer_t)id;
> +			timer->it_signal = (struct signal_struct *)((unsigned long)sig | 1UL);
>  			hlist_add_head_rcu(&timer->t_hash, head);
>  			spin_unlock(&hash_lock);
>  			return id;
> @@ -453,7 +467,7 @@ static int do_timer_create(clockid_t which_clock, struct sigevent *event,
>  	}
>  	/*
>  	 * After succesful copy out, the timer ID is visible to user space
> -	 * now but not yet valid because new_timer::signal is still NULL.
> +	 * now but not yet valid because new_timer::signal low order bit is 1.
>  	 *
>  	 * Complete the initialization with the clock specific create
>  	 * callback.
> @@ -463,7 +477,7 @@ static int do_timer_create(clockid_t which_clock, struct sigevent *event,
>  		goto out;
>  
>  	spin_lock_irq(&current->sighand->siglock);
> -	/* This makes the timer valid in the hash table */
> +	/* This makes the timer valid in the hash table, clearing low order bit. */

Clearing the low order bit of what? This is a full write and not a clear
low order bit operation.

>  	WRITE_ONCE(new_timer->it_signal, current->signal);
>  	hlist_add_head(&new_timer->list, &current->signal->posix_timers);
>  	spin_unlock_irq(&current->sighand->siglock);

Thanks,

        tglx
Re: [PATCH V2 3/4] posix-timers: Initialise timer->it_signal in posix_timer_add()
Posted by Eric Dumazet 10 months ago
On Thu, Feb 20, 2025 at 9:19 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Wed, Feb 19 2025 at 12:55, Eric Dumazet wrote:
> > Instead of leaving a NULL value in timer->it_signal,
> > set it to the current sig pointer, but with the low order bit set.
>
> And that low order bit set does what?
>
> > This fixes a potential race, in the unlikely case a thread
> > was preempted long enough that other threads created more than
> > 2^31 itimers.
>
> and then what happens?

Two threads might get the same timer_id given back.

>
> > Rename __posix_timers_find() to posix_timers_find()
>
> That's not what the patch does. It renames to posix_sig_owner(). Aside
> from that, the rename is not relevant to the problem itself.

posix_sig_owner() is a new helper that masks off the low order bit.
__posix_timers_find() is the function that becomes posix_timers_find().

>
> > Mask the low order bit in posix_timers_find().
>
> What for?



>
> I pointed you before to the changelog documentation, which clearly says:
>
>   A good structure is to explain the context, the problem and the
>   solution in separate paragraphs and this order.
>
> Writing proper change logs is not too much to ask.

Ok.

>
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > ---
> >  kernel/time/posix-timers.c | 28 +++++++++++++++++++++-------
> >  1 file changed, 21 insertions(+), 7 deletions(-)
> >
> > diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
> > index 1f73ea955756..ed27c7eab456 100644
> > --- a/kernel/time/posix-timers.c
> > +++ b/kernel/time/posix-timers.c
> > @@ -72,15 +72,22 @@ static int hash(struct signal_struct *sig, unsigned int nr)
> >       return hash_32(hash32_ptr(sig) ^ nr, HASH_BITS(posix_timers_hashtable));
> >  }
> >
> > -static struct k_itimer *__posix_timers_find(struct hlist_head *head,
> > +static struct signal_struct *posix_sig_owner(const struct k_itimer *timer)
> > +{
> > +     /* timer->it_signal can be set concurrently */
> > +     unsigned long val = (unsigned long)READ_ONCE(timer->it_signal);
> > +
> > +     return (struct signal_struct *)(val & ~1UL);
> > +}
> > +
> > +static struct k_itimer *posix_timers_find(struct hlist_head *head,
> >                                           struct signal_struct *sig,
> >                                           timer_t id)
> >  {
> >       struct k_itimer *timer;
> >
> >       hlist_for_each_entry_rcu(timer, head, t_hash, lockdep_is_held(&hash_lock)) {
> > -             /* timer->it_signal can be set concurrently */
> > -             if ((READ_ONCE(timer->it_signal) == sig) && (timer->it_id == id))
> > +             if ((posix_sig_owner(timer) == sig) && (timer->it_id == id))
> >                       return timer;
> >       }
> >       return NULL;
> > @@ -90,8 +97,14 @@ static struct k_itimer *posix_timer_by_id(timer_t id)
> >  {
> >       struct signal_struct *sig = current->signal;
> >       struct hlist_head *head = &posix_timers_hashtable[hash(sig, id)];
> > +     struct k_itimer *timer;
> >
> > -     return __posix_timers_find(head, sig, id);
> > +     hlist_for_each_entry_rcu(timer, head, t_hash) {
> > +             /* timer->it_signal can be set concurrently */
> > +             if ((READ_ONCE(timer->it_signal) == sig) && (timer->it_id == id))
> > +                     return timer;
> > +     }
> > +     return NULL;
> >  }
> >
> >  static int posix_timer_add(struct k_itimer *timer)
> > @@ -113,8 +126,9 @@ static int posix_timer_add(struct k_itimer *timer)
> >               head = &posix_timers_hashtable[hash(sig, id)];
> >
> >               spin_lock(&hash_lock);
> > -             if (!__posix_timers_find(head, sig, id)) {
> > +             if (!posix_timers_find(head, sig, id)) {
> >                       timer->it_id = (timer_t)id;
> > +                     timer->it_signal = (struct signal_struct *)((unsigned long)sig | 1UL);
> >                       hlist_add_head_rcu(&timer->t_hash, head);
> >                       spin_unlock(&hash_lock);
> >                       return id;
> > @@ -453,7 +467,7 @@ static int do_timer_create(clockid_t which_clock, struct sigevent *event,
> >       }
> >       /*
> >        * After succesful copy out, the timer ID is visible to user space
> > -      * now but not yet valid because new_timer::signal is still NULL.
> > +      * now but not yet valid because new_timer::signal low order bit is 1.
> >        *
> >        * Complete the initialization with the clock specific create
> >        * callback.
> > @@ -463,7 +477,7 @@ static int do_timer_create(clockid_t which_clock, struct sigevent *event,
> >               goto out;
> >
> >       spin_lock_irq(&current->sighand->siglock);
> > -     /* This makes the timer valid in the hash table */
> > +     /* This makes the timer valid in the hash table, clearing low order bit. */
>
> Clearing the low order bit of what? This is a full write and not a clear
> low order bit operation.
>

Prior value was (sig | 1UL)

New value is (sig)

-> low order bit is cleared.
Re: [PATCH V2 3/4] posix-timers: Initialise timer->it_signal in posix_timer_add()
Posted by Thomas Gleixner 10 months ago
On Thu, Feb 20 2025 at 09:44, Eric Dumazet wrote:
> On Thu, Feb 20, 2025 at 9:19 AM Thomas Gleixner <tglx@linutronix.de> wrote:
>> > This fixes a potential race, in the unlikely case a thread
>> > was preempted long enough that other threads created more than
>> > 2^31 itimers.
>>
>> and then what happens?
>
> Two threads might get the same timer_id given back.

I know that, but how is someone who reads the changelog without that
knowledge and background information supposed to know?

The whole point of a change log is to explain the change to the
uninformed reader, no?

>> >
>> >       spin_lock_irq(&current->sighand->siglock);
>> > -     /* This makes the timer valid in the hash table */
>> > +     /* This makes the timer valid in the hash table, clearing low order bit. */
>>
>> Clearing the low order bit of what? This is a full write and not a clear
>> low order bit operation.
>>
>
> Prior value was (sig | 1UL)
>
> New value is (sig)
>
> -> low order bit is cleared.

Right, I know, but again the logic behind this is not obvious unless you
work it out from some other place in the code.

Thanks,

        tglx
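
The duplicate timer ID discussed in this thread can be reproduced in a small user-space model. This is a hedged sketch under simplifying assumptions: a flat array replaces the hash table, there is no locking or concurrency, and reserve_id(), owner() and the tag_pending switch are hypothetical names used only for illustration. It models the state in which one thread has reserved an ID but not yet published it_signal, and a second allocation reaches the same ID after the ID counter has wrapped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TIMERS 8

/* Cut-down stand-ins for the kernel structures (assumption, not the real layout). */
struct signal_struct { int dummy; };
struct k_itimer {
	struct signal_struct *it_signal;
	int it_id;
	int used;
};

/* Owner with the low order bit masked off. NULL stays NULL. */
static struct signal_struct *owner(const struct k_itimer *t)
{
	return (struct signal_struct *)((uintptr_t)t->it_signal & ~(uintptr_t)1);
}

/*
 * Try to reserve @id for @sig. Returns 0 on success and -1 if the ID is
 * already taken or the table is full.
 * tag_pending == 0 models the old behaviour (it_signal left NULL),
 * tag_pending == 1 models the patch (it_signal = sig | 1).
 */
static int reserve_id(struct k_itimer *tbl, struct signal_struct *sig,
		      int id, int tag_pending)
{
	struct k_itimer *slot = NULL;
	size_t i;

	for (i = 0; i < MAX_TIMERS; i++) {
		if (!tbl[i].used) {
			if (!slot)
				slot = &tbl[i];
			continue;
		}
		/* A NULL it_signal never equals @sig, so it is not seen here. */
		if (owner(&tbl[i]) == sig && tbl[i].it_id == id)
			return -1;
	}
	if (!slot)
		return -1;

	slot->used = 1;
	slot->it_id = id;
	slot->it_signal = tag_pending ?
		(struct signal_struct *)((uintptr_t)sig | (uintptr_t)1) : NULL;
	return 0;
}
```

In this model, calling reserve_id() twice with the same ID and tag_pending == 0 succeeds both times, which corresponds to two threads being handed the same timer ID. With tag_pending == 1 the second call fails, so the caller would move on to the next candidate ID.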