[RESEND PATCH v7 2/7] arm64: barrier: Support smp_cond_load_relaxed_timeout()

Posted by Ankur Arora 1 month, 3 weeks ago
Support waiting in smp_cond_load_relaxed_timeout() via
__cmpwait_relaxed(). Limit this to when the event-stream is enabled,
to ensure that we wake from WFE periodically and don't block forever
if there are no stores to the cacheline.

In the unlikely event that the event-stream is unavailable, fall back
to spin-waiting.

Also set SMP_TIMEOUT_POLL_COUNT to 1 so we do the time-check for each
iteration in smp_cond_load_relaxed_timeout().

Cc: linux-arm-kernel@lists.infradead.org
Cc: Catalin Marinas <catalin.marinas@arm.com>
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
 arch/arm64/include/asm/barrier.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
index f5801b0ba9e9..92c16dfb8ca6 100644
--- a/arch/arm64/include/asm/barrier.h
+++ b/arch/arm64/include/asm/barrier.h
@@ -219,6 +219,19 @@ do {									\
 	(typeof(*ptr))VAL;						\
 })
 
+#define SMP_TIMEOUT_POLL_COUNT	1
+
+/* Re-declared here to avoid include dependency. */
+extern bool arch_timer_evtstrm_available(void);
+
+#define cpu_poll_relax(ptr, val)					\
+do {									\
+	if (arch_timer_evtstrm_available())				\
+		__cmpwait_relaxed(ptr, val);				\
+	else								\
+		cpu_relax();						\
+} while (0)
+
 #include <asm-generic/barrier.h>
 
 #endif	/* __ASSEMBLY__ */
-- 
2.43.5
Re: [RESEND PATCH v7 2/7] arm64: barrier: Support smp_cond_load_relaxed_timeout()
Posted by Arnd Bergmann 1 month, 2 weeks ago
On Tue, Oct 28, 2025, at 06:31, Ankur Arora wrote:
> Support waiting in smp_cond_load_relaxed_timeout() via
> __cmpwait_relaxed(). Limit this to when the event-stream is enabled,
> to ensure that we wake from WFE periodically and don't block forever
> if there are no stores to the cacheline.
>
> In the unlikely event that the event-stream is unavailable, fall back
> to spin-waiting.
>
> Also set SMP_TIMEOUT_POLL_COUNT to 1 so we do the time-check for each
> iteration in smp_cond_load_relaxed_timeout().

After I looked at the entire series again, this one feels like
a missed opportunity. Especially on low-power systems but possibly
on any ARMv9.2+ implementation including Cortex-A320, it would
be nice to be able to both turn off the event stream and also
make this function take fewer wakeups:

> +/* Re-declared here to avoid include dependency. */
> +extern bool arch_timer_evtstrm_available(void);
> +
> +#define cpu_poll_relax(ptr, val)					\
> +do {									\
> +	if (arch_timer_evtstrm_available())				\
> +		__cmpwait_relaxed(ptr, val);				\
> +	else								\
> +		cpu_relax();						\
> +} while (0)
> +

Since the caller knows exactly how long it wants to wait for,
we should be able to fit a 'wfet' based primitive in here and
pass the timeout as another argument.

    Arnd
Re: [RESEND PATCH v7 2/7] arm64: barrier: Support smp_cond_load_relaxed_timeout()
Posted by Ankur Arora 1 month, 2 weeks ago
Arnd Bergmann <arnd@arndb.de> writes:

> On Tue, Oct 28, 2025, at 06:31, Ankur Arora wrote:
>> Support waiting in smp_cond_load_relaxed_timeout() via
>> __cmpwait_relaxed(). Limit this to when the event-stream is enabled,
>> to ensure that we wake from WFE periodically and don't block forever
>> if there are no stores to the cacheline.
>>
>> In the unlikely event that the event-stream is unavailable, fall back
>> to spin-waiting.
>>
>> Also set SMP_TIMEOUT_POLL_COUNT to 1 so we do the time-check for each
>> iteration in smp_cond_load_relaxed_timeout().
>
> After I looked at the entire series again, this one feels like
> a missed opportunity. Especially on low-power systems but possibly
> on any ARMv9.2+ implementation including Cortex-A320, it would
> be nice to be able to both turn off the event stream and also
> make this function take fewer wakeups:
>
>> +/* Re-declared here to avoid include dependency. */
>> +extern bool arch_timer_evtstrm_available(void);
>> +
>> +#define cpu_poll_relax(ptr, val)					\
>> +do {									\
>> +	if (arch_timer_evtstrm_available())				\
>> +		__cmpwait_relaxed(ptr, val);				\
>> +	else								\
>> +		cpu_relax();						\
>> +} while (0)
>> +
>
> Since the caller knows exactly how long it wants to wait for,
> we should be able to fit a 'wfet' based primitive in here and
> pass the timeout as another argument.

Per se, I don't disagree with this when it comes to WFET.

Handling a timeout, however, is messier when we use other mechanisms.

Some problems that came up in my earlier discussions with Catalin:

  - when using WFE, we also need some notion of slack
    - and if a caller specifies only a small or no slack, then we need
      to combine WFE+cpu_relax()

  - for platforms that only use a polling primitive, we want to check
    the clock only intermittently for power reasons.
    Now, this could be done with an architecture-specific spin-count.
    However, if the caller specifies a small slack, then we might need
    to check the clock more often as we get closer to the deadline, etc.

A smaller problem was that different users want different clocks, so
folding the timeout into a 'timeout_cond_expr' lets us do away with the
interface having to handle any of that.

I had earlier versions [v2] [v3] which had rather elaborate policies for
handling timeout, slack etc. But, given that the current users of the
interface don't actually care about precision, all of that seemed
a little overengineered.

[v2] https://lore.kernel.org/lkml/20250502085223.1316925-1-ankur.a.arora@oracle.com/#r
[v3] https://lore.kernel.org/lkml/20250627044805.945491-1-ankur.a.arora@oracle.com/

--
ankur
Re: [RESEND PATCH v7 2/7] arm64: barrier: Support smp_cond_load_relaxed_timeout()
Posted by Catalin Marinas 1 month, 2 weeks ago
On Tue, Oct 28, 2025 at 11:01:22AM -0700, Ankur Arora wrote:
> Arnd Bergmann <arnd@arndb.de> writes:
> > On Tue, Oct 28, 2025, at 06:31, Ankur Arora wrote:
> >> Support waiting in smp_cond_load_relaxed_timeout() via
> >> __cmpwait_relaxed(). Limit this to when the event-stream is enabled,
> >> to ensure that we wake from WFE periodically and don't block forever
> >> if there are no stores to the cacheline.
> >>
> >> In the unlikely event that the event-stream is unavailable, fall back
> >> to spin-waiting.
> >>
> >> Also set SMP_TIMEOUT_POLL_COUNT to 1 so we do the time-check for each
> >> iteration in smp_cond_load_relaxed_timeout().
> >
> > After I looked at the entire series again, this one feels like
> > a missed opportunity. Especially on low-power systems but possibly
> > on any ARMv9.2+ implementation including Cortex-A320, it would
> > be nice to be able to both turn off the event stream and also
> > make this function take fewer wakeups:
> >
> >> +/* Re-declared here to avoid include dependency. */
> >> +extern bool arch_timer_evtstrm_available(void);
> >> +
> >> +#define cpu_poll_relax(ptr, val)					\
> >> +do {									\
> >> +	if (arch_timer_evtstrm_available())				\
> >> +		__cmpwait_relaxed(ptr, val);				\
> >> +	else								\
> >> +		cpu_relax();						\
> >> +} while (0)
> >> +
> >
> > Since the caller knows exactly how long it wants to wait for,
> > we should be able to fit a 'wfet' based primitive in here and
> > pass the timeout as another argument.
> 
> Per se, I don't disagree with this when it comes to WFET.
> 
> Handling a timeout, however, is messier when we use other mechanisms.
> 
> Some problems that came up in my earlier discussions with Catalin:
> 
>   - when using WFE, we also need some notion of slack
>     - and if a caller specifies only a small or no slack, then we need
>       to combine WFE+cpu_relax()
> 
>   - for platforms that only use a polling primitive, we want to check
>     the clock only intermittently for power reasons.
>     Now, this could be done with an architecture-specific spin-count.
>     However, if the caller specifies a small slack, then we might need
>     to check the clock more often as we get closer to the deadline, etc.
> 
> A smaller problem was that different users want different clocks, so
> folding the timeout into a 'timeout_cond_expr' lets us do away with the
> interface having to handle any of that.
> 
> I had earlier versions [v2] [v3] which had rather elaborate policies for
> handling timeout, slack etc. But, given that the current users of the
> interface don't actually care about precision, all of that seemed
> a little overengineered.

Indeed, we've been through all these options and without a concrete user
that needs a more precise timeout, we decided it's not worth it. It can,
however, be improved later if such users appear.

> [v2] https://lore.kernel.org/lkml/20250502085223.1316925-1-ankur.a.arora@oracle.com/#r
> [v3] https://lore.kernel.org/lkml/20250627044805.945491-1-ankur.a.arora@oracle.com/

-- 
Catalin
Re: [RESEND PATCH v7 2/7] arm64: barrier: Support smp_cond_load_relaxed_timeout()
Posted by Arnd Bergmann 1 month, 2 weeks ago
On Tue, Oct 28, 2025, at 22:17, Catalin Marinas wrote:
> On Tue, Oct 28, 2025 at 11:01:22AM -0700, Ankur Arora wrote:
>> Arnd Bergmann <arnd@arndb.de> writes:
>> > On Tue, Oct 28, 2025, at 06:31, Ankur Arora wrote:
>> >> +
>> >
>> > Since the caller knows exactly how long it wants to wait for,
>> > we should be able to fit a 'wfet' based primitive in here and
>> > pass the timeout as another argument.
>> 
>> Per se, I don't disagree with this when it comes to WFET.
>> 
>> Handling a timeout, however, is messier when we use other mechanisms.
>> 
>> Some problems that came up in my earlier discussions with Catalin:
>> 
>>   - when using WFE, we also need some notion of slack
>>     - and if a caller specifies only a small or no slack, then we need
>>       to combine WFE+cpu_relax()

I don't see the difference from what you have: with the event stream,
you implicitly define the slack to be the programmed event-stream
period of ~100µs.


I'm not asking for anything better in this case, only for machines
with WFET but no event stream to also avoid the spin loop.

>>   - for platforms that only use a polling primitive, we want to check
>>     the clock only intermittently for power reasons.

Right, I missed that bit.

>>     Now, this could be done with an architecture-specific spin-count.
>>     However, if the caller specifies a small slack, then we might need
>>     to check the clock more often as we get closer to the deadline, etc.

Again, I think this is solved by defining the slack as
architecture-specific as well, rather than as an explicit argument,
which is essentially what we already have.
 
>> A smaller problem was that different users want different clocks, so
>> folding the timeout into a 'timeout_cond_expr' lets us do away with the
>> interface having to handle any of that.
>>
>> I had earlier versions [v2] [v3] which had rather elaborate policies for
>> handling timeout, slack etc. But, given that the current users of the
>> interface don't actually care about precision, all of that seemed
>> a little overengineered.
>
> Indeed, we've been through all these options and without a concrete user
> that needs a more precise timeout, we decided it's not worth it. It can,
> however, be improved later if such users appear.

The main worry I have is that if we get too many users of cpu_poll_relax()
hardcoding the use of the event stream without a timeout argument, it
becomes too hard to change later without introducing regressions
from the behavior change.

As far as I can tell, the only place that currently uses the
event stream on a functional level is the delay() loop, and that
has a working wfet based version.
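
(For reference, the WFET path there is roughly the following; a
simplified paraphrase of arch/arm64/lib/delay.c, not the verbatim code:)

    void __delay(unsigned long cycles)
    {
            cycles_t start = get_cycles();
            u64 end = (u64)start + cycles;

            if (alternative_has_cap_unlikely(ARM64_HAS_WFXT)) {
                    /* wfet() wakes on an event or at the deadline. */
                    while ((get_cycles() - start) < cycles)
                            wfet(end);
            } else {
                    while ((get_cycles() - start) < cycles)
                            cpu_relax();
            }
    }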

     Arnd
Re: [RESEND PATCH v7 2/7] arm64: barrier: Support smp_cond_load_relaxed_timeout()
Posted by Ankur Arora 1 month, 2 weeks ago
Arnd Bergmann <arnd@arndb.de> writes:

> On Tue, Oct 28, 2025, at 22:17, Catalin Marinas wrote:
>> On Tue, Oct 28, 2025 at 11:01:22AM -0700, Ankur Arora wrote:
>>> Arnd Bergmann <arnd@arndb.de> writes:
>>> > On Tue, Oct 28, 2025, at 06:31, Ankur Arora wrote:
>>> >> +
>>> >
>>> > Since the caller knows exactly how long it wants to wait for,
>>> > we should be able to fit a 'wfet' based primitive in here and
>>> > pass the timeout as another argument.
>>>
>>> Per se, I don't disagree with this when it comes to WFET.
>>>
>>> Handling a timeout, however, is messier when we use other mechanisms.
>>>
>>> Some problems that came up in my earlier discussions with Catalin:
>>>
>>>   - when using WFE, we also need some notion of slack
>>>     - and if a caller specifies only a small or no slack, then we need
>>>       to combine WFE+cpu_relax()
>
> I don't see the difference from what you have: with the event stream,
> you implicitly define the slack to be the programmed event-stream
> period of ~100µs.

True. The thinking was that adding an explicit timeout just raises the
question of how closely the interface adheres to the timeout, and I guess
the final interface tried to sidestep all of that.

> I'm not asking for anything better in this case, only for machines
> with WFET but no event stream to also avoid the spin loop.

That makes sense. It's a good point that the WFET+event-stream-off case
would just end up using the spin loop, which is quite suboptimal.

>>>   - for platforms that only use a polling primitive, we want to check
>>>     the clock only intermittently for power reasons.
>
> Right, I missed that bit.
>
>>>     Now, this could be done with an architecture-specific spin-count.
>>>     However, if the caller specifies a small slack, then we might need
>>>     to check the clock more often as we get closer to the deadline, etc.
>
> Again, I think this is solved by defining the slack as architecture
> specific as well rather than an explicit argument, which is essentially
> what we already have.

Great. I think that means I can keep more or less the same interface
with an explicit time_end, which allows WFET to do the right thing,
while WFE can have an architecture-specific slack (the event-stream period).

>>> A smaller problem was that different users want different clocks, so
>>> folding the timeout into a 'timeout_cond_expr' lets us do away with the
>>> interface having to handle any of that.
>>>
>>> I had earlier versions [v2] [v3] which had rather elaborate policies for
>>> handling timeout, slack etc. But, given that the current users of the
>>> interface don't actually care about precision, all of that seemed
>>> a little overengineered.
>>
>> Indeed, we've been through all these options and without a concrete user
>> that needs a more precise timeout, we decided it's not worth it. It can,
>> however, be improved later if such users appear.
>
> The main worry I have is that if we get too many users of cpu_poll_relax()
> hardcoding the use of the event stream without a timeout argument, it
> becomes too hard to change later without introducing regressions
> from the behavior change.

True.

> As far as I can tell, the only place that currently uses the
> event stream on a functional level is the delay() loop, and that
> has a working wfet based version.

Will send out the next version with an interface along the following lines:

    /**
    * smp_cond_load_relaxed_timeout() - (Spin) wait for cond with no ordering
    * guarantees until a timeout expires.
    * @ptr: pointer to the variable to wait on
    * @cond_expr: boolean expression to wait for
    * @time_expr: time expression in the caller's preferred clock
    * @time_end_ns: end time in nanoseconds (compared against time_expr;
    * might also be used for setting up a future event.)
    *
    * Equivalent to using READ_ONCE() on the condition variable.
    *
    * Note that the expiration of the timeout might have an architecture-specific
    * delay.
    */
    #ifndef smp_cond_load_relaxed_timeout
    #define smp_cond_load_relaxed_timeout(ptr, cond_expr, time_expr, time_end_ns)	\
    ({									\
            typeof(ptr) __PTR = (ptr);					\
            __unqual_scalar_typeof(*ptr) VAL;				\
            u32 __n = 0, __spin = SMP_TIMEOUT_POLL_COUNT;		\
            u64 __time_end_ns = (time_end_ns);				\
                                                                        \
            for (;;) {							\
                    VAL = READ_ONCE(*__PTR);				\
                    if (cond_expr)					\
                            break;					\
                    cpu_poll_relax(__PTR, VAL, __time_end_ns);		\
                    if (++__n < __spin)				\
                            continue;					\
                    if ((time_expr) >= __time_end_ns) {		\
                            VAL = READ_ONCE(*__PTR);			\
                            break;					\
                    }							\
                    __n = 0;						\
            }								\
            (typeof(*ptr))VAL;						\
    })
    #endif

That allows for a __cmpwait_timeout() as you had outlined, similar to
these two patches:

 https://lore.kernel.org/lkml/20241107190818.522639-15-ankur.a.arora@oracle.com/
 https://lore.kernel.org/lkml/20241107190818.522639-16-ankur.a.arora@oracle.com/
 (this one incorporating some changes that Catalin had suggested:
  https://lore.kernel.org/lkml/aKRTRyQAaWFtRvDv@arm.com/)
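
For reference, a minimal sketch of the shape such a primitive could take
(illustrative only: the name is made up, it skips the event-register
draining that __cmpwait() does, and it assumes the deadline was already
converted to a CNTVCT value):

    static inline void __cmpwait_u64_timeout(volatile u64 *ptr, u64 val,
                                             u64 cntvct_end)
    {
            u64 tmp;

            asm volatile(
            "       ldxr    %[tmp], %[v]\n"         /* arm the exclusive monitor */
            "       eor     %[tmp], %[tmp], %[val]\n"
            "       cbnz    %[tmp], 1f\n"           /* value already changed */
            "       msr     s0_3_c1_c0_0, %[end]\n" /* wfet: store or deadline */
            "1:"
            : [tmp] "=&r" (tmp), [v] "+Q" (*(u64 *)ptr)
            : [val] "r" (val), [end] "r" (cntvct_end)
            : "memory");
    }

Spurious wakeups are fine: smp_cond_load_relaxed_timeout() rechecks both
the condition and the clock in its loop.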

--
ankur
Re: [RESEND PATCH v7 2/7] arm64: barrier: Support smp_cond_load_relaxed_timeout()
Posted by Catalin Marinas 1 month, 1 week ago
On Mon, Nov 03, 2025 at 01:00:33PM -0800, Ankur Arora wrote:
>     /**
>     * smp_cond_load_relaxed_timeout() - (Spin) wait for cond with no ordering
>     * guarantees until a timeout expires.
>     * @ptr: pointer to the variable to wait on
>     * @cond_expr: boolean expression to wait for
>     * @time_expr: time expression in the caller's preferred clock
>     * @time_end_ns: end time in nanoseconds (compared against time_expr;
>     * might also be used for setting up a future event.)
>     *
>     * Equivalent to using READ_ONCE() on the condition variable.
>     *
>     * Note that the expiration of the timeout might have an architecture-specific
>     * delay.
>     */
>     #ifndef smp_cond_load_relaxed_timeout
>     #define smp_cond_load_relaxed_timeout(ptr, cond_expr, time_expr, time_end_ns)	\
>     ({									\
>             typeof(ptr) __PTR = (ptr);					\
>             __unqual_scalar_typeof(*ptr) VAL;				\
>             u32 __n = 0, __spin = SMP_TIMEOUT_POLL_COUNT;		\
>             u64 __time_end_ns = (time_end_ns);				\
>                                                                         \
>             for (;;) {							\
>                     VAL = READ_ONCE(*__PTR);				\
>                     if (cond_expr)					\
>                             break;					\
>                     cpu_poll_relax(__PTR, VAL, __time_end_ns);		\

With time_end_ns being passed to cpu_poll_relax(), we assume that this
is always the absolute time. Do we still need time_expr in this case?
It works for WFET as long as we can map this time_end_ns onto the
hardware CNTVCT.
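
(E.g. via something like this hypothetical helper; only valid if the
deadline clock and CNTVCT are driven by the same counter and don't drift:)

    /* Hypothetical: absolute ns deadline -> CNTVCT value for WFET. */
    static inline u64 cntvct_deadline(u64 time_end_ns)
    {
            return mul_u64_u32_div(time_end_ns, arch_timer_get_cntfrq(),
                                   NSEC_PER_SEC);
    }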

Alternatively, we could pass something like remaining_ns, though not
sure how smp_cond_load_relaxed_timeout() can decide to spin before
checking time_expr again (we probably went over this in the past two
years ;)).

>                     if (++__n < __spin)				\
>                             continue;					\
>                     if ((time_expr) >= __time_end_ns) {		\
>                             VAL = READ_ONCE(*__PTR);			\
>                             break;					\
>                     }							\
>                     __n = 0;						\
>             }								\
>             (typeof(*ptr))VAL;						\
>     })
>     #endif

-- 
Catalin
Re: [RESEND PATCH v7 2/7] arm64: barrier: Support smp_cond_load_relaxed_timeout()
Posted by Ankur Arora 1 month, 1 week ago
Catalin Marinas <catalin.marinas@arm.com> writes:

> On Mon, Nov 03, 2025 at 01:00:33PM -0800, Ankur Arora wrote:
>>     /**
>>     * smp_cond_load_relaxed_timeout() - (Spin) wait for cond with no ordering
>>     * guarantees until a timeout expires.
>>     * @ptr: pointer to the variable to wait on
>>     * @cond_expr: boolean expression to wait for
>>     * @time_expr: time expression in the caller's preferred clock
>>     * @time_end_ns: end time in nanoseconds (compared against time_expr;
>>     * might also be used for setting up a future event.)
>>     *
>>     * Equivalent to using READ_ONCE() on the condition variable.
>>     *
>>     * Note that the expiration of the timeout might have an architecture-specific
>>     * delay.
>>     */
>>     #ifndef smp_cond_load_relaxed_timeout
>>     #define smp_cond_load_relaxed_timeout(ptr, cond_expr, time_expr, time_end_ns)	\
>>     ({									\
>>             typeof(ptr) __PTR = (ptr);					\
>>             __unqual_scalar_typeof(*ptr) VAL;				\
>>             u32 __n = 0, __spin = SMP_TIMEOUT_POLL_COUNT;		\
>>             u64 __time_end_ns = (time_end_ns);				\
>>                                                                         \
>>             for (;;) {							\
>>                     VAL = READ_ONCE(*__PTR);				\
>>                     if (cond_expr)					\
>>                             break;					\
>>                     cpu_poll_relax(__PTR, VAL, __time_end_ns);		\
>
> With time_end_ns being passed to cpu_poll_relax(), we assume that this
> is always the absolute time. Do we still need time_expr in this case?
> It works for WFET as long as we can map this time_end_ns onto the
> hardware CNTVCT.

So I like this idea. Given that we only promise a coarse granularity, we
should be able to get by with a coarse clock of our choosing.

However, maybe some callers need a globally consistent clock just in
case they could migrate and do something stateful in the cond_expr?
(For instance rqspinlock wants ktime_mono. Though I don't think these
callers can migrate.)

> Alternatively, we could pass something like remaining_ns, though not
> sure how smp_cond_load_relaxed_timeout() can decide to spin before
> checking time_expr again (we probably went over this in the past two
> years ;)).

I'm sure it is in there somewhere :).
This one?: https://lore.kernel.org/lkml/aJy414YufthzC1nv@arm.com/.
Though the whole wait_policy thing confused the issue somewhat there.

Though that problem exists for both remaining_ns and for time_end_ns
with WFE. I think we are fine so long as SMP_TIMEOUT_POLL_COUNT is
defined to be 1.

For now, I think it makes sense to always pass the absolute deadline
even if the caller uses remaining_ns. So:

#define smp_cond_load_relaxed_timeout(ptr, cond_expr, time_expr, remaining_ns)	\
({									\
	typeof(ptr) __PTR = (ptr);					\
	__unqual_scalar_typeof(*ptr) VAL;				\
	u32 __n = 0, __spin = SMP_TIMEOUT_POLL_COUNT;			\
	u64 __time_start_ns = (time_expr);				\
	s64 __time_end_ns = __time_start_ns + (remaining_ns);		\
									\
	for (;;) {							\
		VAL = READ_ONCE(*__PTR);				\
		if (cond_expr)						\
			break;						\
		cpu_poll_relax(__PTR, VAL, __time_end_ns);		\
		if (++__n < __spin)					\
			continue;					\
		if ((time_expr) >= __time_end_ns) {			\
			VAL = READ_ONCE(*__PTR);			\
			break;						\
		}							\
		__n = 0;						\
	}								\
	(typeof(*ptr))VAL;						\
})

--
ankur
Re: [RESEND PATCH v7 2/7] arm64: barrier: Support smp_cond_load_relaxed_timeout()
Posted by Arnd Bergmann 1 month, 1 week ago
On Wed, Nov 5, 2025, at 09:27, Ankur Arora wrote:
> Catalin Marinas <catalin.marinas@arm.com> writes:
>> On Mon, Nov 03, 2025 at 01:00:33PM -0800, Ankur Arora wrote:
>> With time_end_ns being passed to cpu_poll_relax(), we assume that this
>> is always the absolute time. Do we still need time_expr in this case?
>> It works for WFET as long as we can map this time_end_ns onto the
>> hardware CNTVCT.
>>
>> Alternatively, we could pass something like remaining_ns, though not
>> sure how smp_cond_load_relaxed_timeout() can decide to spin before
>> checking time_expr again (we probably went over this in the past two
>> years ;)).
>
> I'm sure it is in there somewhere :).
> This one?: https://lore.kernel.org/lkml/aJy414YufthzC1nv@arm.com/.
> Though the whole wait_policy thing confused the issue somewhat there.
>
> Though that problem exists for both remaining_ns and for time_end_ns
> with WFE. I think we are fine so long as SMP_TIMEOUT_POLL_COUNT is
> defined to be 1.

We need to be careful with the definition of the time_expr() if
cpu_poll_relax requires the absolute time in CNTVCT domain.
We're probably fine here, but it feels like a bit of a layering
violation. I think passing relative time into it would be cleaner
because it avoids the ambiguity, but it probably requires an extra
access to the timer register, which is hopefully fast on arm64.

I'm ok with either way.

> For now, I think it makes sense to always pass the absolute deadline
> even if the caller uses remaining_ns. So:
>
> #define smp_cond_load_relaxed_timeout(ptr, cond_expr, time_expr, remaining_ns)	\
> ({									\
> 	typeof(ptr) __PTR = (ptr);					\
> 	__unqual_scalar_typeof(*ptr) VAL;				\
> 	u32 __n = 0, __spin = SMP_TIMEOUT_POLL_COUNT;			\
> 	u64 __time_start_ns = (time_expr);				\
> 	s64 __time_end_ns = __time_start_ns + (remaining_ns);		\
> 									\
> 	for (;;) {							\
> 		VAL = READ_ONCE(*__PTR);				\
> 		if (cond_expr)						\
> 			break;						\
> 		cpu_poll_relax(__PTR, VAL, __time_end_ns);		\

This looks perfectly fine to me, thanks for the update!

    Arnd
Re: [RESEND PATCH v7 2/7] arm64: barrier: Support smp_cond_load_relaxed_timeout()
Posted by Ankur Arora 1 month, 1 week ago
Arnd Bergmann <arnd@arndb.de> writes:

> On Wed, Nov 5, 2025, at 09:27, Ankur Arora wrote:
>> Catalin Marinas <catalin.marinas@arm.com> writes:
>>> On Mon, Nov 03, 2025 at 01:00:33PM -0800, Ankur Arora wrote:
>>> With time_end_ns being passed to cpu_poll_relax(), we assume that this
>>> is always the absolute time. Do we still need time_expr in this case?
>>> It works for WFET as long as we can map this time_end_ns onto the
>>> hardware CNTVCT.
>>>
>>> Alternatively, we could pass something like remaining_ns, though not
>>> sure how smp_cond_load_relaxed_timeout() can decide to spin before
>>> checking time_expr again (we probably went over this in the past two
>>> years ;)).
>>
>> I'm sure it is in there somewhere :).
>> This one?: https://lore.kernel.org/lkml/aJy414YufthzC1nv@arm.com/.
>> Though the whole wait_policy thing confused the issue somewhat there.
>>
>> Though that problem exists for both remaining_ns and for time_end_ns
>> with WFE. I think we are fine so long as SMP_TIMEOUT_POLL_COUNT is
>> defined to be 1.
>
> We need to be careful with the definition of the time_expr() if
> cpu_poll_relax requires the absolute time in CNTVCT domain.

True. The absolute time assumes that CPU time and CNTVCT domain
times are freely translatable, and won't drift.

> We're probably fine here, but it feels like a bit of a layering
> violation. I think passing relative time into it would be cleaner
> because it avoids the ambiguity,

I'll play around a little; see if we can pass the relative time and
yet not depend on the conjoined value of SMP_TIMEOUT_POLL_COUNT.

> but it probably requires an extra
> access to the timer register, which is hopefully fast on arm64.
>
> I'm ok with either way.

I'll keep the caller-facing parameter as remaining_ns. This way we
can internally switch over to relative/absolute time if needed.

>> For now, I think it makes sense to always pass the absolute deadline
>> even if the caller uses remaining_ns. So:
>>
>> #define smp_cond_load_relaxed_timeout(ptr, cond_expr, time_expr, remaining_ns)	\
>> ({									\
>> 	typeof(ptr) __PTR = (ptr);					\
>> 	__unqual_scalar_typeof(*ptr) VAL;				\
>> 	u32 __n = 0, __spin = SMP_TIMEOUT_POLL_COUNT;			\
>> 	u64 __time_start_ns = (time_expr);				\
>> 	s64 __time_end_ns = __time_start_ns + (remaining_ns);		\
>> 									\
>> 	for (;;) {							\
>> 		VAL = READ_ONCE(*__PTR);				\
>> 		if (cond_expr)						\
>> 			break;						\
>> 		cpu_poll_relax(__PTR, VAL, __time_end_ns);		\
>
> This looks perfectly fine to me, thanks for the update!

Thanks for the review comments!

--
ankur
Re: [RESEND PATCH v7 2/7] arm64: barrier: Support smp_cond_load_relaxed_timeout()
Posted by Christoph Lameter (Ampere) 1 month, 2 weeks ago
On Tue, 28 Oct 2025, Arnd Bergmann wrote:

> After I looked at the entire series again, this one feels like
> a missed opportunity. Especially on low-power systems but possibly
> on any ARMv9.2+ implementation including Cortex-A320, it would
> be nice to be able to both turn off the event stream and also
> make this function take fewer wakeups:

That is certainly something that should be done at some point, but for now
this patchset does the best it can with the available systems.

Switching off the event stream is another set of complexities that should
be handled separately later.

Let's just get this one in, finally. We have been working on this issue
for more than two years now.