[PATCH v2 0/5] PM QoS: Add CPU affinity latency QoS support and resctrl integration

Zhongqiu Han posted 5 patches 2 months, 2 weeks ago
[PATCH v2 0/5] PM QoS: Add CPU affinity latency QoS support and resctrl integration
Posted by Zhongqiu Han 2 months, 2 weeks ago
Hi all,

This patch series introduces support for CPU affinity-based latency
constraints in the PM QoS framework. The motivation is to allow
finer-grained power management by enabling latency QoS requests to target
specific CPUs, rather than applying system-wide constraints.

The current PM QoS framework supports global and per-device CPU latency
constraints. However, in many real-world scenarios, such as IRQ affinity
or CPU-bound kernel threads, only a subset of CPUs are
performance-critical. Applying global constraints in such cases
unnecessarily prevents other CPUs from entering deeper C-states, leading
to increased power consumption.

This series addresses that limitation by introducing a new interface that
allows latency constraints to be applied to a CPU mask. This is
particularly useful on heterogeneous platforms (e.g., big.LITTLE) and
embedded systems where power efficiency is critical. For example:

                        driver A       rt kthread B      module C
  CPU IDs (mask):         0-3              2-5              6-7
  target latency(us):     20               30               100
                          |                |                |
                          v                v                v
                          +---------------------------------+
                          |        PM  QoS  Framework       |
                          +---------------------------------+
                          |                |                |
                          v                v                v
  CPU IDs (mask):        0-3            2-3,4-5            6-7
  runtime latency(us):   20             20, 30             100
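
To make the diagram concrete, here is a minimal userspace sketch of the
aggregation it implies: each CPU ends up with the smallest limit among the
requests whose mask covers it. The struct affinity_req type and both helper
names are hypothetical and are not taken from the series, and a plain
integer bitmask stands in for a kernel CPU mask.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Value reported for a CPU that no request covers (assumption of this model). */
#define NO_CONSTRAINT INT_MAX

/* Hypothetical request: bit n of cpu_mask selects CPU n; limit is in us. */
struct affinity_req {
	uint32_t cpu_mask;
	int latency_us;
};

/* Build a mask covering CPUs first..last inclusive (last must be < 32). */
static uint32_t cpu_range(unsigned int first, unsigned int last)
{
	uint32_t mask = 0;

	for (unsigned int cpu = first; cpu <= last; cpu++)
		mask |= UINT32_C(1) << cpu;
	return mask;
}

/* Effective limit for one CPU: the minimum over all requests covering it. */
static int effective_latency(const struct affinity_req *reqs, size_t n,
			     unsigned int cpu)
{
	int best = NO_CONSTRAINT;

	for (size_t i = 0; i < n; i++) {
		if ((reqs[i].cpu_mask & (UINT32_C(1) << cpu)) &&
		    reqs[i].latency_us < best)
			best = reqs[i].latency_us;
	}
	return best;
}
```

With the three requests from the diagram (0-3 at 20 us, 2-5 at 30 us, 6-7 at
100 us), this model resolves CPUs 0-3 to 20 us, CPUs 4-5 to 30 us and CPUs
6-7 to 100 us.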

The current implementation includes only the cpu_affinity_latency_qos_add(),
cpu_affinity_latency_qos_remove(), and cpu_affinity_latency_qos_active()
interfaces. An update interface is planned for future submission, along with
PM QoS optimizations in the UFS subsystem.

Patch1 introduces the core support for CPU affinity latency QoS in the PM
QoS framework.

Patch2 removes redundant KERN_ERR prefixes in WARN() calls in the global
CPU PM QoS interface. This change addresses issues in existing code and is
not related to the new interface introduced in this patch series.

Patch3 adds documentation for the new interface.

Patch4 fixes a minor documentation issue related to the return type of
cpu_latency_qos_request_active(). This change addresses an issue in the
existing documentation and is not related to the new interface introduced
in this patch series.

Patch5 updates the resctrl pseudo-locking logic to use the new CPU
affinity latency QoS helpers, improving clarity and consistency. The only
functional and beneficial change is that the new interface actively wakes
up CPUs whose latency QoS values have changed, ensuring the latency limit
takes effect immediately.
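
As an illustration of the wake-up behaviour described for Patch5, the sketch
below compares per-CPU effective limits before and after a request is added
and returns the CPUs whose value changed. It is a userspace model only:
changed_cpus() is a hypothetical helper, and how the series actually selects
and wakes CPUs is not shown in this cover letter. The presumed motivation is
that an idle CPU chose its idle state under the old limit, so it needs a
wake-up to re-evaluate under the new one.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return a bitmask of CPUs whose effective limit differs between the two
 * snapshots. In this model, these are the CPUs that would be woken.
 * Only the first 32 CPUs are considered because the result is a uint32_t.
 */
static uint32_t changed_cpus(const int *before, const int *after,
			     size_t nr_cpus)
{
	uint32_t mask = 0;

	for (size_t cpu = 0; cpu < nr_cpus && cpu < 32; cpu++) {
		if (before[cpu] != after[cpu])
			mask |= UINT32_C(1) << cpu;
	}
	return mask;
}
```

For example, if every CPU starts at 100 us and a 30 us request is added for
CPUs 2-5, only bits 2-5 are set in the result, so CPUs 0-1 and 6-7 are left
undisturbed.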

Changes since v1:
- Rebased on top of current next.
- Resolve the compilation warning due to a missing static function
  declaration.
- Remove the conditional compilation based on CONFIG_CPU_IDLE and make it
  depend solely on CONFIG_PM.
- Add support for cpu_affinity_latency_qos_active.
- Remove cpu_affinity_latency_qos_update; will reintroduce it when needed
  in the future.
- Optimize the code, for example by using cpu_affinity_latency_qos_active
  inside the add/remove functions to enhance robustness.
- Refine the commit message and fix a few minor issues unrelated to this
  series.
- Refactor the CPU latency PM QoS logic of resctrl pseudo_lock using the
  interfaces provided by this series.
- Link to v1: https://lore.kernel.org/all/20250424095228.1112558-1-quic_zhonhan@quicinc.com/

Zhongqiu Han (5):
  PM: QoS: Add support for CPU affinity latency PM QoS
  PM: QOS: Remove unnecessary KERN_ERR on WARN() calls
  Documentation: PM: QoS: Add CPU affinity latency PM QoS Interface
    documentation
  Documentation: PM: QoS: Fix return type and return value description
  resctrl: Replace PM QoS logic with cpu_affinity_latency_qos_* helpers

 Documentation/power/pm_qos_interface.rst |  63 ++++++++-
 fs/resctrl/pseudo_lock.c                 |  51 +------
 include/linux/pm_qos.h                   |  40 ++++++
 include/linux/resctrl.h                  |   3 +-
 kernel/power/qos.c                       | 166 ++++++++++++++++++++++-
 5 files changed, 268 insertions(+), 55 deletions(-)


base-commit: 024e09e444bd2b06aee9d1f3fe7b313c7a2df1bb
-- 
2.43.0
Re: [PATCH v2 0/5] PM QoS: Add CPU affinity latency QoS support and resctrl integration
Posted by Christian Loehle 2 months, 1 week ago
On 7/21/25 13:40, Zhongqiu Han wrote:
> The current implementation includes only cpu_affinity_latency_qos_add()
> and cpu_affinity_latency_qos_remove() interfaces. An update interface is
> planned for future submission, along with PM QoS optimizations in the UFS
> subsystem.

So what's needed for the UFS use-case additionally?
Would adding that here be too much?
Re: [PATCH v2 0/5] PM QoS: Add CPU affinity latency QoS support and resctrl integration
Posted by Zhongqiu Han 2 months, 1 week ago
On 7/28/2025 6:09 PM, Christian Loehle wrote:
> On 7/21/25 13:40, Zhongqiu Han wrote:
>> The current implementation includes only cpu_affinity_latency_qos_add()
>> and cpu_affinity_latency_qos_remove() interfaces. An update interface is
>> planned for future submission, along with PM QoS optimizations in the UFS
>> subsystem.
> 
> So what's needed for the UFS use-case additionally?
> Would adding that here be too much?
> 

Hi Christian,
Thanks for your review and discussion~

Currently, my plan is to move forward with only the current patch series,
which includes just the following interfaces:

cpu_affinity_latency_qos_add()
cpu_affinity_latency_qos_remove()
cpu_affinity_latency_qos_active()
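
For readers who have not looked at the patches, the intended relationship
between these three calls can be modelled roughly as below. This is a
userspace sketch under assumptions: the model_* names, the struct layout,
the integer bitmask and the -1 error value are all hypothetical, and the
real signatures are defined in patch 1/5, not here. The only behaviour taken
from the series is that add/remove check the active state internally.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request object; the real one lives in the kernel patches. */
struct model_affinity_req {
	uint32_t cpu_mask;	/* bit n selects CPU n */
	int latency_us;		/* requested limit in microseconds */
	bool active;		/* set while the request is registered */
};

/* Models the "active" query: true only for a registered request. */
static bool model_qos_active(const struct model_affinity_req *req)
{
	return req != NULL && req->active;
}

/* Models "add": refuses bad arguments and double registration. */
static int model_qos_add(struct model_affinity_req *req, uint32_t cpu_mask,
			 int latency_us)
{
	if (req == NULL || cpu_mask == 0 || latency_us < 0 ||
	    model_qos_active(req))
		return -1;

	req->cpu_mask = cpu_mask;
	req->latency_us = latency_us;
	req->active = true;
	return 0;
}

/* Models "remove": refuses a request that was never added. */
static int model_qos_remove(struct model_affinity_req *req)
{
	if (!model_qos_active(req))
		return -1;

	req->cpu_mask = 0;
	req->active = false;
	return 0;
}
```

In this model a second add on the same request, or a remove without a prior
add, fails instead of corrupting state, which is the kind of robustness the
v2 changelog refers to.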


For most use cases, these three interfaces seem to be sufficient already.


The reason I mentioned UFS is to explain why the update interface,
cpu_affinity_latency_qos_update(), is not included at this stage. The UFS
use case is planned to use cpu_affinity_latency_qos_update() in the future;
it is similar to the global CPU PM QoS interface
cpu_latency_qos_update_request().



-- 
Thx and BRs,
Zhongqiu Han
Re: [PATCH v2 0/5] PM QoS: Add CPU affinity latency QoS support and resctrl integration
Posted by Christian Loehle 2 months ago
On 7/28/25 11:40, Zhongqiu Han wrote:
> On 7/28/2025 6:09 PM, Christian Loehle wrote:
>> On 7/21/25 13:40, Zhongqiu Han wrote:
>>> The current implementation includes only cpu_affinity_latency_qos_add()
>>> and cpu_affinity_latency_qos_remove() interfaces. An update interface is
>>> planned for future submission, along with PM QoS optimizations in the UFS
>>> subsystem.
>>
>> So what's needed for the UFS use-case additionally?
>> Would adding that here be too much?
>>
> 
> Hi Christian,
> Thanks for your review and discussion~
> 
> Currently my plan is only to move forward with the current patch series,
> which includes only the below interfaces:
> 
> cpu_affinity_latency_qos_add()
> cpu_affinity_latency_qos_remove()
> cpu_affinity_latency_qos_active()
> 
> 
> For most use-cases, seems these three interfaces already sufficient.

Probably, but IMO there's no real user of the new extended interface yet,
which makes review harder and leaves the series lacking justification.

FWIW in 2014 Lina also pushed for something like $SUBJECT
https://lore.kernel.org/all/1407945689-18494-5-git-send-email-lina.iyer@linaro.org/
Lina made an interface to tie the PM QoS to the relevant irq, which I think
was a great idea. Maybe that series is interesting for you, too?


Re: [PATCH v2 0/5] PM QoS: Add CPU affinity latency QoS support and resctrl integration
Posted by Zhongqiu Han 2 months ago
On 8/2/2025 10:38 PM, Christian Loehle wrote:
> On 7/28/25 11:40, Zhongqiu Han wrote:
>> On 7/28/2025 6:09 PM, Christian Loehle wrote:
>>> On 7/21/25 13:40, Zhongqiu Han wrote:
>>>> The current implementation includes only cpu_affinity_latency_qos_add()
>>>> and cpu_affinity_latency_qos_remove() interfaces. An update interface is
>>>> planned for future submission, along with PM QoS optimizations in the UFS
>>>> subsystem.
>>>
>>> So what's needed for the UFS use-case additionally?
>>> Would adding that here be too much?
>>>
>>
>> Hi Christian,
>> Thanks for your review and discussion~
>>
>> Currently my plan is only to move forward with the current patch series,
>> which includes only the below interfaces:
>>
>> cpu_affinity_latency_qos_add()
>> cpu_affinity_latency_qos_remove()
>> cpu_affinity_latency_qos_active()
>>
>>
>> For most use-cases, seems these three interfaces already sufficient.
> 
> Probably, but IMO there's no real user of the new extended interface yet,
> making review harder and lacking justification.
> 
> FWIW in 2014 Lina also pushed for something like $SUBJECT
> https://lore.kernel.org/all/1407945689-18494-5-git-send-email-lina.iyer@linaro.org/
> Lina made an interface to tie the PM QoS to the relevant irq, which I think
> was a great idea. Maybe that series is interesting for you, too?
> 

Hi Christian,
Thanks for the review~

Just to clarify: patch 5/5 of the current series already includes a user of
the new extended interface, namely the resctrl pseudo-locking code. In
addition, cpu_affinity_latency_qos_active() is used internally by the
add/remove interfaces.


I've briefly reviewed the patchset you mentioned. Please correct me if
my understanding is inaccurate.

https://lore.kernel.org/all/1407945689-18494-5-git-send-email-lina.iyer@linaro.org/

It seems that Lina's patch series takes an alternative approach: it adds a
new type of PM QoS request that targets IRQs instead of CPUs. Specifically,
when the IRQ affinity changes, the corresponding CPU latency constraint is
updated to reflect the new CPUs that the IRQ is now affine to.
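
My understanding of that behaviour, expressed as a small userspace model, is
below. It is not Lina's code: struct irq_qos and irq_qos_retarget() are
hypothetical names, and an integer bitmask stands in for a CPU mask. The
helper moves the constraint to the new affinity and reports which CPUs
gained or lost coverage, since only those would need their limit
recomputed.

```c
#include <stdint.h>

/* Hypothetical IRQ-bound constraint: follows the IRQ's current affinity. */
struct irq_qos {
	uint32_t cpu_mask;	/* CPUs the IRQ is currently affine to */
	int latency_us;		/* limit applied to those CPUs */
};

/*
 * Move the constraint to a new affinity mask. Returns the CPUs whose
 * coverage changed: set bits are CPUs that either lost or gained the
 * constraint. CPUs present in both masks are unaffected.
 */
static uint32_t irq_qos_retarget(struct irq_qos *q, uint32_t new_affinity)
{
	uint32_t touched = q->cpu_mask ^ new_affinity;

	q->cpu_mask = new_affinity;
	return touched;
}
```

For instance, moving an IRQ from CPUs 0-1 to CPUs 1-2 touches CPU 0 (loses
the constraint) and CPU 2 (gains it), while CPU 1 keeps it throughout.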


It also appears that Kevin suggested implementing this feature using the
per-device API:


https://lore.kernel.org/all/7h4mx9wdxe.fsf@paris.lan/

---->>> From Kevin:
I agree this is a needed feature.  I didn't study it in detail yet, but
after a quick glance, it looks like a good approach.
---->>>

https://lore.kernel.org/all/7hd2blerqz.fsf@paris.lan/

---->>> From Kevin:
I'm curious if you looked at using the per-device QoS API for this
instead of extending the system-wide API.  IOW, from a per-device QoS
POV, a CPU is no different than any other device, and since we already
have the per-device QoS API, I wondered if that might be a better choice
to implement this per-CPU feature.
---->>>


Are you suggesting that I evaluate whether IRQ affinity scenarios could
also be valid use cases for the CPU affinity latency PM QoS interface?
If so, there is a more fundamental assumption behind this: that a
particular IRQ actually requires a CPU latency PM QoS constraint, right?






-- 
Thx and BRs,
Zhongqiu Han
Re: [PATCH v2 0/5] PM QoS: Add CPU affinity latency QoS support and resctrl integration
Posted by Zhongqiu Han 1 month, 3 weeks ago
On 7/21/2025 8:40 PM, Zhongqiu Han wrote:
> Patch1 introduces the core support for CPU affinity latency QoS in the PM
> QoS framework.
> 
> Patch2 removes redundant KERN_ERR prefixes in WARN() calls in the global
> CPU PM QoS interface. This change addresses issues in existing code and is
> not related to the new interface introduced in this patch series.
> 
> Patch3 adds documentation for the new interface.
> 
> Patch4 fixes a minor documentation issue related to the return type of
> cpu_latency_qos_request_active(). This change addresses issues in existing
> doc and is not related to the new interface introduced in this patch
> series.
> 
> Patch5 updates the resctrl pseudo-locking logic to use the new CPU
> affinity latency QoS helpers, improving clarity and consistency. The only
> functional and beneficial change is that the new interface actively wakes
> up CPUs whose latency QoS values have changed, ensuring the latency limit
> takes effect immediately.

Hi Rafael,
I hope you're doing well. I just wanted to kindly check in regarding the
patch series I submitted a while ago.

I understand things can get busy, and there's absolutely no rush — just
wanted to make sure it hasn't been missed.

Thank you~




-- 
Thx and BRs,
Zhongqiu Han