[PATCH v8 11/11] arm64: support cpuidle-haltpoll

Ankur Arora posted 11 patches 3 weeks, 1 day ago
[PATCH v8 11/11] arm64: support cpuidle-haltpoll
Posted by Ankur Arora 3 weeks, 1 day ago
Add architectural support for the cpuidle-haltpoll driver by defining
arch_haltpoll_*(). Also define ARCH_CPUIDLE_HALTPOLL to allow
cpuidle-haltpoll to be selected.

Haltpoll uses poll_idle() to do the actual polling. This in turn
uses smp_cond_load*() to wait until there's a specific store to
a cacheline. 
In the edge case -- no stores to the cacheline and no interrupt --
the event-stream provides the terminating condition ensuring we
don't wait forever. But because the event-stream runs at a fixed
frequency (configured at 10kHz) haltpoll might spend more time in
the polling stage than specified by cpuidle_poll_time().

This would only happen in the last iteration, since overshooting the
poll_limit means the governor will move out of the polling stage.

Tested-by: Haris Okanovic <harisokn@amazon.com>
Tested-by: Misono Tomohiro <misono.tomohiro@fujitsu.com>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
 arch/arm64/Kconfig                        |  6 ++++++
 arch/arm64/include/asm/cpuidle_haltpoll.h | 24 +++++++++++++++++++++++
 2 files changed, 30 insertions(+)
 create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index ef9c22c3cff2..5fc99eba22b2 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -2415,6 +2415,12 @@ config ARCH_HIBERNATION_HEADER
 config ARCH_SUSPEND_POSSIBLE
 	def_bool y
 
+config ARCH_CPUIDLE_HALTPOLL
+	bool "Enable selection of the cpuidle-haltpoll driver"
+	help
+	  cpuidle-haltpoll allows for adaptive polling based on
+	  current load before entering the idle state.
+
 endmenu # "Power management options"
 
 menu "CPU Power Management"
diff --git a/arch/arm64/include/asm/cpuidle_haltpoll.h b/arch/arm64/include/asm/cpuidle_haltpoll.h
new file mode 100644
index 000000000000..91f0be707629
--- /dev/null
+++ b/arch/arm64/include/asm/cpuidle_haltpoll.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _ARCH_HALTPOLL_H
+#define _ARCH_HALTPOLL_H
+
+static inline void arch_haltpoll_enable(unsigned int cpu) { }
+static inline void arch_haltpoll_disable(unsigned int cpu) { }
+
+static inline bool arch_haltpoll_want(bool force)
+{
+	/*
+	 * Enabling haltpoll requires two things:
+	 *
+	 * - Event stream support to provide a terminating condition to the
+	 *   WFE in the poll loop.
+	 *
+	 * - KVM support for arch_haltpoll_enable(), arch_haltpoll_disable().
+	 *
+	 * Given that the second is missing, only allow force loading for
+	 * haltpoll.
+	 */
+	return force;
+}
+#endif
-- 
2.43.5
Re: [PATCH v8 11/11] arm64: support cpuidle-haltpoll
Posted by Okanovic, Haris 1 day, 17 hours ago
On Wed, 2024-09-25 at 16:24 -0700, Ankur Arora wrote:
> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
> 
> 
> 
> Add architectural support for the cpuidle-haltpoll driver by defining
> arch_haltpoll_*(). Also define ARCH_CPUIDLE_HALTPOLL to allow
> cpuidle-haltpoll to be selected.
> 
> Haltpoll uses poll_idle() to do the actual polling. This in turn
> uses smp_cond_load*() to wait until there's a specific store to
> a cacheline.
> In the edge case -- no stores to the cacheline and no interrupt --
> the event-stream provides the terminating condition ensuring we
> don't wait forever. But because the event-stream runs at a fixed
> frequency (configured at 10kHz) haltpoll might spend more time in
> the polling stage than specified by cpuidle_poll_time().
> 
> This would only happen in the last iteration, since overshooting the
> poll_limit means the governor will move out of the polling stage.
> 
> Tested-by: Haris Okanovic <harisokn@amazon.com>
> Tested-by: Misono Tomohiro <misono.tomohiro@fujitsu.com>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
>  arch/arm64/Kconfig                        |  6 ++++++
>  arch/arm64/include/asm/cpuidle_haltpoll.h | 24 +++++++++++++++++++++++
>  2 files changed, 30 insertions(+)
>  create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index ef9c22c3cff2..5fc99eba22b2 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -2415,6 +2415,12 @@ config ARCH_HIBERNATION_HEADER
>  config ARCH_SUSPEND_POSSIBLE
>         def_bool y
> 
> +config ARCH_CPUIDLE_HALTPOLL
> +       bool "Enable selection of the cpuidle-haltpoll driver"
> +       help
> +         cpuidle-haltpoll allows for adaptive polling based on
> +         current load before entering the idle state.
> +
>  endmenu # "Power management options"
> 
>  menu "CPU Power Management"
> diff --git a/arch/arm64/include/asm/cpuidle_haltpoll.h b/arch/arm64/include/asm/cpuidle_haltpoll.h
> new file mode 100644
> index 000000000000..91f0be707629
> --- /dev/null
> +++ b/arch/arm64/include/asm/cpuidle_haltpoll.h
> @@ -0,0 +1,24 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef _ARCH_HALTPOLL_H
> +#define _ARCH_HALTPOLL_H
> +
> +static inline void arch_haltpoll_enable(unsigned int cpu) { }
> +static inline void arch_haltpoll_disable(unsigned int cpu) { }
> +
> +static inline bool arch_haltpoll_want(bool force)
> +{
> +       /*
> +        * Enabling haltpoll requires two things:
> +        *
> +        * - Event stream support to provide a terminating condition to the
> +        *   WFE in the poll loop.

I missed this earlier: Why did you drop arch_timer_evtstrm_available()?

> +        *
> +        * - KVM support for arch_haltpoll_enable(), arch_haltpoll_disable().
> +        *
> +        * Given that the second is missing, only allow force loading for
> +        * haltpoll.
> +        */
> +       return force;
> +}
> +#endif
> --
> 2.43.5
> 

Re: [PATCH v8 11/11] arm64: support cpuidle-haltpoll
Posted by Okanovic, Haris 2 weeks, 1 day ago
On Wed, 2024-09-25 at 16:24 -0700, Ankur Arora wrote:
> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
> 
> 
> 
> Add architectural support for the cpuidle-haltpoll driver by defining
> arch_haltpoll_*(). Also define ARCH_CPUIDLE_HALTPOLL to allow
> cpuidle-haltpoll to be selected.
> 
> Haltpoll uses poll_idle() to do the actual polling. This in turn
> uses smp_cond_load*() to wait until there's a specific store to
> a cacheline.
> In the edge case -- no stores to the cacheline and no interrupt --
> the event-stream provides the terminating condition ensuring we
> don't wait forever. But because the event-stream runs at a fixed
> frequency (configured at 10kHz) haltpoll might spend more time in
> the polling stage than specified by cpuidle_poll_time().
> 
> This would only happen in the last iteration, since overshooting the
> poll_limit means the governor will move out of the polling stage.
> 
> Tested-by: Haris Okanovic <harisokn@amazon.com>
> Tested-by: Misono Tomohiro <misono.tomohiro@fujitsu.com>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
>  arch/arm64/Kconfig                        |  6 ++++++
>  arch/arm64/include/asm/cpuidle_haltpoll.h | 24 +++++++++++++++++++++++
>  2 files changed, 30 insertions(+)
>  create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index ef9c22c3cff2..5fc99eba22b2 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -2415,6 +2415,12 @@ config ARCH_HIBERNATION_HEADER
>  config ARCH_SUSPEND_POSSIBLE
>         def_bool y
> 
> +config ARCH_CPUIDLE_HALTPOLL
> +       bool "Enable selection of the cpuidle-haltpoll driver"
> +       help
> +         cpuidle-haltpoll allows for adaptive polling based on
> +         current load before entering the idle state.
> +
>  endmenu # "Power management options"
> 
>  menu "CPU Power Management"
> diff --git a/arch/arm64/include/asm/cpuidle_haltpoll.h b/arch/arm64/include/asm/cpuidle_haltpoll.h
> new file mode 100644
> index 000000000000..91f0be707629
> --- /dev/null
> +++ b/arch/arm64/include/asm/cpuidle_haltpoll.h
> @@ -0,0 +1,24 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef _ARCH_HALTPOLL_H
> +#define _ARCH_HALTPOLL_H
> +
> +static inline void arch_haltpoll_enable(unsigned int cpu) { }
> +static inline void arch_haltpoll_disable(unsigned int cpu) { }
> +
> +static inline bool arch_haltpoll_want(bool force)
> +{
> +       /*
> +        * Enabling haltpoll requires two things:
> +        *
> +        * - Event stream support to provide a terminating condition to the
> +        *   WFE in the poll loop.
> +        *
> +        * - KVM support for arch_haltpoll_enable(), arch_haltpoll_disable().
> +        *
> +        * Given that the second is missing, only allow force loading for
> +        * haltpoll.
> +        */
> +       return force;
> +}
> +#endif
> --
> 2.43.5
> 

I applied your patches to master e32cde8d2bd7 and verified same
performance gains on AWS Graviton.

Reviewed-by: Haris Okanovic <harisokn@amazon.com>
Tested-by: Haris Okanovic <harisokn@amazon.com>


Re: [PATCH v8 11/11] arm64: support cpuidle-haltpoll
Posted by Ankur Arora 2 weeks, 1 day ago
Okanovic, Haris <harisokn@amazon.com> writes:

> On Wed, 2024-09-25 at 16:24 -0700, Ankur Arora wrote:
>> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
>>
>>
>>
>> Add architectural support for the cpuidle-haltpoll driver by defining
>> arch_haltpoll_*(). Also define ARCH_CPUIDLE_HALTPOLL to allow
>> cpuidle-haltpoll to be selected.
>>
>> Haltpoll uses poll_idle() to do the actual polling. This in turn
>> uses smp_cond_load*() to wait until there's a specific store to
>> a cacheline.
>> In the edge case -- no stores to the cacheline and no interrupt --
>> the event-stream provides the terminating condition ensuring we
>> don't wait forever. But because the event-stream runs at a fixed
>> frequency (configured at 10kHz) haltpoll might spend more time in
>> the polling stage than specified by cpuidle_poll_time().
>>
>> This would only happen in the last iteration, since overshooting the
>> poll_limit means the governor will move out of the polling stage.
>>
>> Tested-by: Haris Okanovic <harisokn@amazon.com>
>> Tested-by: Misono Tomohiro <misono.tomohiro@fujitsu.com>
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>>  arch/arm64/Kconfig                        |  6 ++++++
>>  arch/arm64/include/asm/cpuidle_haltpoll.h | 24 +++++++++++++++++++++++
>>  2 files changed, 30 insertions(+)
>>  create mode 100644 arch/arm64/include/asm/cpuidle_haltpoll.h
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index ef9c22c3cff2..5fc99eba22b2 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -2415,6 +2415,12 @@ config ARCH_HIBERNATION_HEADER
>>  config ARCH_SUSPEND_POSSIBLE
>>         def_bool y
>>
>> +config ARCH_CPUIDLE_HALTPOLL
>> +       bool "Enable selection of the cpuidle-haltpoll driver"
>> +       help
>> +         cpuidle-haltpoll allows for adaptive polling based on
>> +         current load before entering the idle state.
>> +
>>  endmenu # "Power management options"
>>
>>  menu "CPU Power Management"
>> diff --git a/arch/arm64/include/asm/cpuidle_haltpoll.h b/arch/arm64/include/asm/cpuidle_haltpoll.h
>> new file mode 100644
>> index 000000000000..91f0be707629
>> --- /dev/null
>> +++ b/arch/arm64/include/asm/cpuidle_haltpoll.h
>> @@ -0,0 +1,24 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +
>> +#ifndef _ARCH_HALTPOLL_H
>> +#define _ARCH_HALTPOLL_H
>> +
>> +static inline void arch_haltpoll_enable(unsigned int cpu) { }
>> +static inline void arch_haltpoll_disable(unsigned int cpu) { }
>> +
>> +static inline bool arch_haltpoll_want(bool force)
>> +{
>> +       /*
>> +        * Enabling haltpoll requires two things:
>> +        *
>> +        * - Event stream support to provide a terminating condition to the
>> +        *   WFE in the poll loop.
>> +        *
>> +        * - KVM support for arch_haltpoll_enable(), arch_haltpoll_disable().
>> +        *
>> +        * Given that the second is missing, only allow force loading for
>> +        * haltpoll.
>> +        */
>> +       return force;
>> +}
>> +#endif
>> --
>> 2.43.5
>>
>
> I applied your patches to master e32cde8d2bd7 and verified same
> performance gains on AWS Graviton.

Great.

> Reviewed-by: Haris Okanovic <harisokn@amazon.com>
> Tested-by: Haris Okanovic <harisokn@amazon.com>

Thanks!

--
ankur