[PATCH v5 0/5] barrier: Add smp_cond_load_*_timeout()

Ankur Arora posted 5 patches 3 weeks ago
arch/arm64/include/asm/barrier.h    | 23 ++++++++
arch/arm64/include/asm/rqspinlock.h | 85 -----------------------------
include/asm-generic/barrier.h       | 57 +++++++++++++++++++
kernel/bpf/rqspinlock.c             | 23 +++-----
4 files changed, 87 insertions(+), 101 deletions(-)
[PATCH v5 0/5] barrier: Add smp_cond_load_*_timeout()
Posted by Ankur Arora 3 weeks ago
This series adds waited variants of the smp_cond_load() primitives:
smp_cond_load_relaxed_timeout(), and smp_cond_load_acquire_timeout().

As the name suggests, the new interfaces are meant for contexts where
you want to wait on a condition variable for a finite duration. This
is easy enough to do with a loop around cpu_relax() and a periodic
timeout check (pretty much what we do in poll_idle(). However, some
architectures (ex. arm64) also allow waiting on a cacheline. So, 

  smp_cond_load_relaxed_timeout(ptr, cond_expr, time_check_expr)
  smp_cond_load_acquire_timeout(ptr, cond_expr, time_check_expr)

do a mixture of spin/wait with a smp_cond_load() thrown in.

The added parameter, time_check_expr, determines the bail out condition.

There are two current users for these interfaces. poll_idle() with
the change:

  poll_idle() {
      ...
      time_end = local_clock_noinstr() + cpuidle_poll_time(drv, dev);
      
      raw_local_irq_enable();
      if (!current_set_polling_and_test())
      	 flags = smp_cond_load_relaxed_timeout(&current_thread_info()->flags,
      					(VAL & _TIF_NEED_RESCHED),
      					((local_clock_noinstr() >= time_end)));
      dev->poll_time_limit = !(flags & _TIF_NEED_RESCHED);
      raw_local_irq_disable();
      ...
  }

where smp_cond_load_relaxed_timeout() replaces the inner loop in
poll_idle() (on x86 the generated code for both is similar):

  poll_idle() {
      ...
      raw_local_irq_enable();
      if (!current_set_polling_and_test()) {
      	unsigned int loop_count = 0;
      	u64 limit;
      
      	limit = cpuidle_poll_time(drv, dev);
      
      	while (!need_resched()) {
      		cpu_relax();
      		if (loop_count++ < POLL_IDLE_RELAX_COUNT)
      			continue;
      
      		loop_count = 0;
      		if (local_clock_noinstr() - time_start > limit) {
      			dev->poll_time_limit = true;
      			break;
      		}
      	}
      }
      raw_local_irq_disable();
      ...
  }

And resilient queued spinlocks:

  resilient_queued_spin_lock_slowpath() {
      ...
      if (val & _Q_LOCKED_MASK) {
      	RES_RESET_TIMEOUT(ts, RES_DEF_TIMEOUT);
      	smp_cond_load_acquire_timeout(&lock->locked, !VAL,
      				      (ret = check_timeout(lock, _Q_LOCKED_MASK, &ts)));
      }
      ...
  }

Changelog:
  v4 [1]:
    - naming change 's/timewait/timeout/'
    - resilient spinlocks: get rid of res_smp_cond_load_acquire_waiting()
      and fixup use of RES_CHECK_TIMEOUT().
    (Both suggested by Catalin Marinas)

  v3 [2]:
    - further interface simplifications (suggested by Catalin Marinas)

  v2 [3]:
    - simplified the interface (suggested by Catalin Marinas)
       - get rid of wait_policy, and a multitude of constants
       - adds a slack parameter
      This helped remove a fair amount of duplicated code duplication and in
      hindsight unnecessary constants.

  v1 [4]:
     - add wait_policy (coarse and fine)
     - derive spin-count etc at runtime instead of using arbitrary
       constants.

Haris Okanovic tested v4 of this series with poll_idle()/haltpoll patches. [5]

Any comments appreciated!

Thanks
Ankur

 [1] https://lore.kernel.org/lkml/20250829080735.3598416-1-ankur.a.arora@oracle.com/
 [2] https://lore.kernel.org/lkml/20250627044805.945491-1-ankur.a.arora@oracle.com/
 [3] https://lore.kernel.org/lkml/20250502085223.1316925-1-ankur.a.arora@oracle.com/
 [4] https://lore.kernel.org/lkml/20250203214911.898276-1-ankur.a.arora@oracle.com/
 [5] https://lore.kernel.org/lkml/2cecbf7fb23ee83a4ce027e1be3f46f97efd585c.camel@amazon.com/
 
 Cc: Arnd Bergmann <arnd@arndb.de>
 Cc: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: linux-arch@vger.kernel.org

Ankur Arora (5):
  asm-generic: barrier: Add smp_cond_load_relaxed_timeout()
  arm64: barrier: Add smp_cond_load_relaxed_timeout()
  arm64: rqspinlock: Remove private copy of
    smp_cond_load_acquire_timewait
  asm-generic: barrier: Add smp_cond_load_acquire_timeout()
  rqspinlock: use smp_cond_load_acquire_timeout()

 arch/arm64/include/asm/barrier.h    | 23 ++++++++
 arch/arm64/include/asm/rqspinlock.h | 85 -----------------------------
 include/asm-generic/barrier.h       | 57 +++++++++++++++++++
 kernel/bpf/rqspinlock.c             | 23 +++-----
 4 files changed, 87 insertions(+), 101 deletions(-)

-- 
2.43.5
Re: [PATCH v5 0/5] barrier: Add smp_cond_load_*_timeout()
Posted by Catalin Marinas 3 weeks ago
On Wed, Sep 10, 2025 at 08:46:50PM -0700, Ankur Arora wrote:
> This series adds waited variants of the smp_cond_load() primitives:
> smp_cond_load_relaxed_timeout(), and smp_cond_load_acquire_timeout().
> 
> As the name suggests, the new interfaces are meant for contexts where
> you want to wait on a condition variable for a finite duration. This
> is easy enough to do with a loop around cpu_relax() and a periodic
> timeout check (pretty much what we do in poll_idle(). However, some
> architectures (ex. arm64) also allow waiting on a cacheline. So, 
> 
>   smp_cond_load_relaxed_timeout(ptr, cond_expr, time_check_expr)
>   smp_cond_load_acquire_timeout(ptr, cond_expr, time_check_expr)
> 
> do a mixture of spin/wait with a smp_cond_load() thrown in.
> 
> The added parameter, time_check_expr, determines the bail out condition.
> 
> There are two current users for these interfaces. poll_idle() with
> the change:
> 
>   poll_idle() {
>       ...
>       time_end = local_clock_noinstr() + cpuidle_poll_time(drv, dev);
>       
>       raw_local_irq_enable();
>       if (!current_set_polling_and_test())
>       	 flags = smp_cond_load_relaxed_timeout(&current_thread_info()->flags,
>       					(VAL & _TIF_NEED_RESCHED),
>       					((local_clock_noinstr() >= time_end)));
>       dev->poll_time_limit = !(flags & _TIF_NEED_RESCHED);
>       raw_local_irq_disable();
>       ...
>   }

You should have added this as a patch in the series than include the
implementation in the cover letter.

-- 
Catalin
Re: [PATCH v5 0/5] barrier: Add smp_cond_load_*_timeout()
Posted by Ankur Arora 2 weeks, 6 days ago
Catalin Marinas <catalin.marinas@arm.com> writes:

> On Wed, Sep 10, 2025 at 08:46:50PM -0700, Ankur Arora wrote:
>> This series adds waited variants of the smp_cond_load() primitives:
>> smp_cond_load_relaxed_timeout(), and smp_cond_load_acquire_timeout().
>>
>> As the name suggests, the new interfaces are meant for contexts where
>> you want to wait on a condition variable for a finite duration. This
>> is easy enough to do with a loop around cpu_relax() and a periodic
>> timeout check (pretty much what we do in poll_idle(). However, some
>> architectures (ex. arm64) also allow waiting on a cacheline. So,
>>
>>   smp_cond_load_relaxed_timeout(ptr, cond_expr, time_check_expr)
>>   smp_cond_load_acquire_timeout(ptr, cond_expr, time_check_expr)
>>
>> do a mixture of spin/wait with a smp_cond_load() thrown in.
>>
>> The added parameter, time_check_expr, determines the bail out condition.
>>
>> There are two current users for these interfaces. poll_idle() with
>> the change:
>>
>>   poll_idle() {
>>       ...
>>       time_end = local_clock_noinstr() + cpuidle_poll_time(drv, dev);
>>
>>       raw_local_irq_enable();
>>       if (!current_set_polling_and_test())
>>       	 flags = smp_cond_load_relaxed_timeout(&current_thread_info()->flags,
>>       					(VAL & _TIF_NEED_RESCHED),
>>       					((local_clock_noinstr() >= time_end)));
>>       dev->poll_time_limit = !(flags & _TIF_NEED_RESCHED);
>>       raw_local_irq_disable();
>>       ...
>>   }
>
> You should have added this as a patch in the series than include the
> implementation in the cover letter.

This was probably an overkill but I wanted to not add another subsystem
to this series.
Will take care of the cpuidle changes in the arm64 polling in idle series.

--
ankur
Re: [PATCH v5 0/5] barrier: Add smp_cond_load_*_timeout()
Posted by Catalin Marinas 2 weeks, 3 days ago
On Thu, Sep 11, 2025 at 02:57:52PM -0700, Ankur Arora wrote:
> Catalin Marinas <catalin.marinas@arm.com> writes:
> > On Wed, Sep 10, 2025 at 08:46:50PM -0700, Ankur Arora wrote:
> >> This series adds waited variants of the smp_cond_load() primitives:
> >> smp_cond_load_relaxed_timeout(), and smp_cond_load_acquire_timeout().
> >>
> >> As the name suggests, the new interfaces are meant for contexts where
> >> you want to wait on a condition variable for a finite duration. This
> >> is easy enough to do with a loop around cpu_relax() and a periodic
> >> timeout check (pretty much what we do in poll_idle(). However, some
> >> architectures (ex. arm64) also allow waiting on a cacheline. So,
> >>
> >>   smp_cond_load_relaxed_timeout(ptr, cond_expr, time_check_expr)
> >>   smp_cond_load_acquire_timeout(ptr, cond_expr, time_check_expr)
> >>
> >> do a mixture of spin/wait with a smp_cond_load() thrown in.
> >>
> >> The added parameter, time_check_expr, determines the bail out condition.
> >>
> >> There are two current users for these interfaces. poll_idle() with
> >> the change:
> >>
> >>   poll_idle() {
> >>       ...
> >>       time_end = local_clock_noinstr() + cpuidle_poll_time(drv, dev);
> >>
> >>       raw_local_irq_enable();
> >>       if (!current_set_polling_and_test())
> >>       	 flags = smp_cond_load_relaxed_timeout(&current_thread_info()->flags,
> >>       					(VAL & _TIF_NEED_RESCHED),
> >>       					((local_clock_noinstr() >= time_end)));
> >>       dev->poll_time_limit = !(flags & _TIF_NEED_RESCHED);
> >>       raw_local_irq_disable();
> >>       ...
> >>   }
> >
> > You should have added this as a patch in the series than include the
> > implementation in the cover letter.
> 
> This was probably an overkill but I wanted to not add another subsystem
> to this series.

If you include it, it's easier to poke the cpuidle maintainers and ask
if they are ok with the proposed API as I want to avoid changing it
afterwards. It doesn't mean they'll have to be merged together, they can
go upstream via separate routes.

> Will take care of the cpuidle changes in the arm64 polling in idle series.

Thanks. We also need Will, Peter Z and Arnd to ack the API and the
generic changes (probably once you added the linux/atomic.h changes).

-- 
Catalin
Re: [PATCH v5 0/5] barrier: Add smp_cond_load_*_timeout()
Posted by Ankur Arora 2 weeks, 2 days ago
Catalin Marinas <catalin.marinas@arm.com> writes:

> On Thu, Sep 11, 2025 at 02:57:52PM -0700, Ankur Arora wrote:
>> Catalin Marinas <catalin.marinas@arm.com> writes:
>> > On Wed, Sep 10, 2025 at 08:46:50PM -0700, Ankur Arora wrote:
>> >> This series adds waited variants of the smp_cond_load() primitives:
>> >> smp_cond_load_relaxed_timeout(), and smp_cond_load_acquire_timeout().
>> >>
>> >> As the name suggests, the new interfaces are meant for contexts where
>> >> you want to wait on a condition variable for a finite duration. This
>> >> is easy enough to do with a loop around cpu_relax() and a periodic
>> >> timeout check (pretty much what we do in poll_idle(). However, some
>> >> architectures (ex. arm64) also allow waiting on a cacheline. So,
>> >>
>> >>   smp_cond_load_relaxed_timeout(ptr, cond_expr, time_check_expr)
>> >>   smp_cond_load_acquire_timeout(ptr, cond_expr, time_check_expr)
>> >>
>> >> do a mixture of spin/wait with a smp_cond_load() thrown in.
>> >>
>> >> The added parameter, time_check_expr, determines the bail out condition.
>> >>
>> >> There are two current users for these interfaces. poll_idle() with
>> >> the change:
>> >>
>> >>   poll_idle() {
>> >>       ...
>> >>       time_end = local_clock_noinstr() + cpuidle_poll_time(drv, dev);
>> >>
>> >>       raw_local_irq_enable();
>> >>       if (!current_set_polling_and_test())
>> >>       	 flags = smp_cond_load_relaxed_timeout(&current_thread_info()->flags,
>> >>       					(VAL & _TIF_NEED_RESCHED),
>> >>       					((local_clock_noinstr() >= time_end)));
>> >>       dev->poll_time_limit = !(flags & _TIF_NEED_RESCHED);
>> >>       raw_local_irq_disable();
>> >>       ...
>> >>   }
>> >
>> > You should have added this as a patch in the series than include the
>> > implementation in the cover letter.
>>
>> This was probably an overkill but I wanted to not add another subsystem
>> to this series.
>
> If you include it, it's easier to poke the cpuidle maintainers and ask
> if they are ok with the proposed API as I want to avoid changing it
> afterwards. It doesn't mean they'll have to be merged together, they can
> go upstream via separate routes.

That makes a lot of sense. Will include this patch as well.

>> Will take care of the cpuidle changes in the arm64 polling in idle series.
>
> Thanks. We also need Will, Peter Z and Arnd to ack the API and the
> generic changes (probably once you added the linux/atomic.h changes).

Makes sense.


Thanks

--
ankur