From: Shivnandan Kumar <quic_kshivnan@quicinc.com>
Reduce zone1_thres_count from 16 to 3 so the driver can lower the bus
vote after 3 sample windows instead of waiting for 16. The previous
16‑window delay (~64 ms) is too long at higher FPS workloads,
causing delayed decision making and measurable power regression.
Empirical tuning showed that lower values (e.g., 2) made bwmon behavior
jittery, while higher values (4–6) were stable but less responsive and
reduced power savings. A value of 3 provided the best balance: responsive
enough for timely power reduction while maintaining stable bwmon
operation.
Significant power savings were observed across multiple use cases when
reducing the threshold from 16 to 3:
USECASE                   zone1_thres_count=16   zone1_thres_count=3
4K video playback         236.15 mA              203.15 mA
Sleep                     7 mA                   6.9 mA
Display (idle display)    71.95 mA               67.11 mA
Signed-off-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
Signed-off-by: Pushpendra Singh <pussin@qti.qualcomm.com>
---
Changes in v3:
- Update commit message
- Link to v2: https://lore.kernel.org/lkml/d72182bc-f8d4-4314-b2f1-c9242618eb67@quicinc.com/
Changes in v2:
- Update commit message
- Link to v1: https://lore.kernel.org/lkml/463eb7c8-00fc-4441-91d1-6e48f6b052c8@quicinc.com
---
drivers/soc/qcom/icc-bwmon.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/soc/qcom/icc-bwmon.c b/drivers/soc/qcom/icc-bwmon.c
index 597f9025e422..e46975da7dba 100644
--- a/drivers/soc/qcom/icc-bwmon.c
+++ b/drivers/soc/qcom/icc-bwmon.c
@@ -830,7 +830,7 @@ static const struct icc_bwmon_data msm8998_bwmon_data = {
static const struct icc_bwmon_data sdm845_cpu_bwmon_data = {
.sample_ms = 4,
.count_unit_kb = 64,
- .zone1_thres_count = 16,
+ .zone1_thres_count = 3,
.zone3_thres_count = 1,
.quirks = BWMON_HAS_GLOBAL_IRQ,
.regmap_fields = sdm845_cpu_bwmon_reg_fields,
@@ -849,7 +849,7 @@ static const struct icc_bwmon_data sdm845_llcc_bwmon_data = {
static const struct icc_bwmon_data sc7280_llcc_bwmon_data = {
.sample_ms = 4,
.count_unit_kb = 64,
- .zone1_thres_count = 16,
+ .zone1_thres_count = 3,
.zone3_thres_count = 1,
.quirks = BWMON_NEEDS_FORCE_CLEAR,
.regmap_fields = sdm845_llcc_bwmon_reg_fields,
--
2.34.1
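
For readers checking the figures in the commit message, the arithmetic
can be sketched as below. This is an illustration only and not part of
the patch: the helper names bwmon_down_delay_ms() and saving_pct() are
hypothetical, and the sketch assumes the decision delay is simply
sample_ms multiplied by zone1_thres_count, which is what the "~64 ms"
quoted for 16 windows implies.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not in icc-bwmon.c): delay in ms before the vote
 * may be lowered, ASSUMING delay = sample_ms * zone1_thres_count.
 */
unsigned int bwmon_down_delay_ms(unsigned int sample_ms,
				 unsigned int zone1_thres_count)
{
	return sample_ms * zone1_thres_count;
}

/* Relative saving, in percent, between two current readings in mA. */
double saving_pct(double before_ma, double after_ma)
{
	assert(before_ma > 0.0);
	return (before_ma - after_ma) / before_ma * 100.0;
}

/* Print the delay and savings for the values quoted in the commit message. */
void print_summary(void)
{
	printf("delay @16 windows: %u ms\n", bwmon_down_delay_ms(4, 16));
	printf("delay @3 windows:  %u ms\n", bwmon_down_delay_ms(4, 3));
	printf("4K video playback: %.1f%% lower\n", saving_pct(236.15, 203.15));
	printf("Sleep:             %.1f%% lower\n", saving_pct(7.0, 6.9));
	printf("Idle display:      %.1f%% lower\n", saving_pct(71.95, 67.11));
}
```

Under that assumption the delay drops from 64 ms to 12 ms, and the
reported currents work out to roughly 14% (4K video playback), 1.4%
(Sleep) and 6.7% (idle display) lower draw. The sketch says nothing
about which hardware or kernel produced the readings.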
On 27/02/2026 12:10, Pushpendra Singh wrote:
> From: Shivnandan Kumar <quic_kshivnan@quicinc.com>
>
> Reduce zone1_thres_count from 16 to 3 so the driver can lower the bus
> vote after 3 sample windows instead of waiting for 16. The previous
> 16‑window delay (~64 ms) is too long at higher FPS workloads,
> causing delayed decision making and measurable power regression.
>
> Empirical tuning showed that lower values (e.g., 2) made bwmon behavior
> jittery, while higher values (4–6) were stable but less responsive and
> reduced power savings. A value of 3 provided the best balance: responsive
> enough for timely power reduction while maintaining stable bwmon
> operation.
>
> Significant power savings were observed across multiple use cases when
> reducing the threshold from 16 to 3:
>
> USECASE zone1_thres_count=16 zone1_thres_count=3
> 4K video playback 236.15 mA 203.15 mA
> Sleep 7mA 6.9mA
> Display (idle display) 71.95mA 67.11mA

Which hardware exactly? Which kernel?

You keep running this on downstream, like a lot of code from Qualcomm
and speeches on conferences, so I just don't trust the numbers.

Best regards,
Krzysztof

On 2/28/2026 3:19 PM, Krzysztof Kozlowski wrote:
> On 27/02/2026 12:10, Pushpendra Singh wrote:
>> From: Shivnandan Kumar <quic_kshivnan@quicinc.com>
>>
>> Reduce zone1_thres_count from 16 to 3 so the driver can lower the bus
>> vote after 3 sample windows instead of waiting for 16. The previous
>> 16‑window delay (~64 ms) is too long at higher FPS workloads,
>> causing delayed decision making and measurable power regression.
>>
>> Empirical tuning showed that lower values (e.g., 2) made bwmon behavior
>> jittery, while higher values (4–6) were stable but less responsive and
>> reduced power savings. A value of 3 provided the best balance: responsive
>> enough for timely power reduction while maintaining stable bwmon
>> operation.
>>
>> Significant power savings were observed across multiple use cases when
>> reducing the threshold from 16 to 3:
>>
>> USECASE zone1_thres_count=16 zone1_thres_count=3
>> 4K video playback 236.15 mA 203.15 mA
>> Sleep 7mA 6.9mA
>> Display (idle display) 71.95mA 67.11mA
> Which hardware exactly? Which kernel?
>
> You keep running this on downstream, like a lot of code from Qualcomm
> and speeches on conferences, so I just don't trust the numbers.

The numbers presented were obtained on then upstream 6.6 based kernels
across multiple SoCs (sc7280/sa8775p).
Also, please suggest benchmarks and other perf level measurements done
when the number 16 was chosen initially, that way we can ensure there is
no regression.

Regards,
Pushpendra Singh

>
> Best regards,
> Krzysztof

On Tue, Mar 17, 2026 at 11:15:51AM +0530, Pushpendra Singh wrote:
>
> On 2/28/2026 3:19 PM, Krzysztof Kozlowski wrote:
> > On 27/02/2026 12:10, Pushpendra Singh wrote:
> >> From: Shivnandan Kumar <quic_kshivnan@quicinc.com>
> >>
> >> Reduce zone1_thres_count from 16 to 3 so the driver can lower the bus
> >> vote after 3 sample windows instead of waiting for 16. The previous
> >> 16‑window delay (~64 ms) is too long at higher FPS workloads,
> >> causing delayed decision making and measurable power regression.
> >>
> >> Empirical tuning showed that lower values (e.g., 2) made bwmon behavior
> >> jittery, while higher values (4–6) were stable but less responsive and
> >> reduced power savings. A value of 3 provided the best balance: responsive
> >> enough for timely power reduction while maintaining stable bwmon
> >> operation.
> >>
> >> Significant power savings were observed across multiple use cases when
> >> reducing the threshold from 16 to 3:
> >>
> >> USECASE zone1_thres_count=16 zone1_thres_count=3
> >> 4K video playback 236.15 mA 203.15 mA
> >> Sleep 7mA 6.9mA
> >> Display (idle display) 71.95mA 67.11mA
> > Which hardware exactly? Which kernel?
> >
> > You keep running this on downstream, like a lot of code from Qualcomm
> > and speeches on conferences, so I just don't trust the numbers.
>
> The numbers presented were obtained on then upstream 6.6 based kernels across multiple SoCs (sc7280/sa8775p).

Please don't use old kernels for upstream work.

> Also, please suggest benchmarks and other perf level measurements done when the number 16 was chosen initially,
> that way we can ensure there is no regression.
>

--
With best wishes
Dmitry

On 17/03/2026 06:45, Pushpendra Singh wrote:
>
> On 2/28/2026 3:19 PM, Krzysztof Kozlowski wrote:
>> On 27/02/2026 12:10, Pushpendra Singh wrote:
>>> From: Shivnandan Kumar <quic_kshivnan@quicinc.com>
>>>
>>> Reduce zone1_thres_count from 16 to 3 so the driver can lower the bus
>>> vote after 3 sample windows instead of waiting for 16. The previous
>>> 16‑window delay (~64 ms) is too long at higher FPS workloads,
>>> causing delayed decision making and measurable power regression.
>>>
>>> Empirical tuning showed that lower values (e.g., 2) made bwmon behavior
>>> jittery, while higher values (4–6) were stable but less responsive and
>>> reduced power savings. A value of 3 provided the best balance: responsive
>>> enough for timely power reduction while maintaining stable bwmon
>>> operation.
>>>
>>> Significant power savings were observed across multiple use cases when
>>> reducing the threshold from 16 to 3:
>>>
>>> USECASE zone1_thres_count=16 zone1_thres_count=3
>>> 4K video playback 236.15 mA 203.15 mA
>>> Sleep 7mA 6.9mA
>>> Display (idle display) 71.95mA 67.11mA
>> Which hardware exactly? Which kernel?
>>
>> You keep running this on downstream, like a lot of code from Qualcomm
>> and speeches on conferences, so I just don't trust the numbers.
>
> The numbers presented were obtained on then upstream 6.6 based kernels across multiple SoCs (sc7280/sa8775p).

So sorry, that is 2.5 years old kernel.

> Also, please suggest benchmarks and other perf level measurements done when the number 16 was chosen initially,
> that way we can ensure there is no regression.

You must run pure upstream code. As I said, I don't care about any
solutions, measurements or improvements based on your downstream fork.

Best regards,
Krzysztof

On Fri, Feb 27, 2026 at 04:40:51PM +0530, Pushpendra Singh wrote:
> From: Shivnandan Kumar <quic_kshivnan@quicinc.com>
>
> Reduce zone1_thres_count from 16 to 3 so the driver can lower the bus
> vote after 3 sample windows instead of waiting for 16. The previous
> 16‑window delay (~64 ms) is too long at higher FPS workloads,
> causing delayed decision making and measurable power regression.
>
> Empirical tuning showed that lower values (e.g., 2) made bwmon behavior
> jittery, while higher values (4–6) were stable but less responsive and
> reduced power savings. A value of 3 provided the best balance: responsive
> enough for timely power reduction while maintaining stable bwmon
> operation.
Can you please justify this somehow? Numbers? Just the intuition?
>
> Significant power savings were observed across multiple use cases when
> reducing the threshold from 16 to 3:
>
> USECASE zone1_thres_count=16 zone1_thres_count=3
> 4K video playback 236.15 mA 203.15 mA
> Sleep 7mA 6.9mA
> Display (idle display) 71.95mA 67.11mA
>
> Signed-off-by: Shivnandan Kumar <quic_kshivnan@quicinc.com>
> Signed-off-by: Pushpendra Singh <pussin@qti.qualcomm.com>
> ---
> Chages in v3:
> - Update commit message
> - Link to v2: https://lore.kernel.org/lkml/d72182bc-f8d4-4314-b2f1-c9242618eb67@quicinc.com/
>
> Changes in v2:
> - Update commit message
> - Link to v1: https://lore.kernel.org/lkml/463eb7c8-00fc-4441-91d1-6e48f6b052c8@quicinc.com
> ---
> drivers/soc/qcom/icc-bwmon.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/soc/qcom/icc-bwmon.c b/drivers/soc/qcom/icc-bwmon.c
> index 597f9025e422..e46975da7dba 100644
> --- a/drivers/soc/qcom/icc-bwmon.c
> +++ b/drivers/soc/qcom/icc-bwmon.c
> @@ -830,7 +830,7 @@ static const struct icc_bwmon_data msm8998_bwmon_data = {
> static const struct icc_bwmon_data sdm845_cpu_bwmon_data = {
> .sample_ms = 4,
> .count_unit_kb = 64,
> - .zone1_thres_count = 16,
> + .zone1_thres_count = 3,
> .zone3_thres_count = 1,
> .quirks = BWMON_HAS_GLOBAL_IRQ,
> .regmap_fields = sdm845_cpu_bwmon_reg_fields,
> @@ -849,7 +849,7 @@ static const struct icc_bwmon_data sdm845_llcc_bwmon_data = {
> static const struct icc_bwmon_data sc7280_llcc_bwmon_data = {
> .sample_ms = 4,
> .count_unit_kb = 64,
> - .zone1_thres_count = 16,
> + .zone1_thres_count = 3,
> .zone3_thres_count = 1,
> .quirks = BWMON_NEEDS_FORCE_CLEAR,
> .regmap_fields = sdm845_llcc_bwmon_reg_fields,
> --
> 2.34.1
>
--
With best wishes
Dmitry