[PATCH 2/2] perf/core: Fix broken throttling when max_samples_per_tick=1

Qing Wong posted 2 patches 10 months, 1 week ago
Posted by Qing Wong 10 months, 1 week ago
From: Qing Wang <wangqing7171@gmail.com>

The throttling mechanism is meant to keep the number of PMU interrupts
in one tick from exceeding max_samples_per_tick. This does not work when
max_samples_per_tick=1, because the throttling check is skipped on the
first interrupt of a tick and only performed when the second interrupt
arrives.

The effect within a single tick is small, but over a longer time scale
the extra interrupts add up.

When max_samples_per_tick = 1:
Allowed-interrupts-per-second max-samples-per-second  default-HZ  ARCH
200                           100                     100         X86
500                           250                     250         ARM64
...
In these cases the number of PMU interrupts is twice what the user
expects.

Fixes: e050e3f0a71b ("perf: Fix broken interrupt rate throttling")
Signed-off-by: Qing Wang <wangqing7171@gmail.com>
---
 kernel/events/core.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 29cdb240e104..4ac2ac988ddc 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10047,16 +10047,15 @@ __perf_event_account_interrupt(struct perf_event *event, int throttle)
 	if (seq != hwc->interrupts_seq) {
 		hwc->interrupts_seq = seq;
 		hwc->interrupts = 1;
-	} else {
+	} else
 		hwc->interrupts++;
-		if (unlikely(throttle
-			     && hwc->interrupts >= max_samples_per_tick)) {
-			__this_cpu_inc(perf_throttled_count);
-			tick_dep_set_cpu(smp_processor_id(), TICK_DEP_BIT_PERF_EVENTS);
-			hwc->interrupts = MAX_INTERRUPTS;
-			perf_log_throttle(event, 0);
-			ret = 1;
-		}
+
+	if (unlikely(throttle && hwc->interrupts >= max_samples_per_tick)) {
+		__this_cpu_inc(perf_throttled_count);
+		tick_dep_set_cpu(smp_processor_id(), TICK_DEP_BIT_PERF_EVENTS);
+		hwc->interrupts = MAX_INTERRUPTS;
+		perf_log_throttle(event, 0);
+		ret = 1;
 	}
 
 	if (event->attr.freq) {
-- 
2.43.0
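
The effect of moving the check out of the else branch can be sketched
with a small user-space model. This is only an illustration, not the
kernel code: struct hw_state, account_old(), account_new() and
interrupts_in_tick() are hypothetical names, the throttle argument is
assumed to be nonzero, and the side effects (perf_log_throttle(),
MAX_INTERRUPTS and so on) are left out.

```c
#include <stdbool.h>

/* Simplified stand-in for the two hwc fields used in the diff. */
struct hw_state {
	unsigned long interrupts_seq;
	unsigned long interrupts;
};

typedef bool (*account_fn)(struct hw_state *hwc, unsigned long seq,
			   unsigned long max);

/* Removed lines: the check only runs from the second interrupt on. */
static bool account_old(struct hw_state *hwc, unsigned long seq,
			unsigned long max)
{
	if (seq != hwc->interrupts_seq) {
		hwc->interrupts_seq = seq;
		hwc->interrupts = 1;
		return false;	/* first interrupt of a tick: no check */
	}
	hwc->interrupts++;
	return hwc->interrupts >= max;
}

/* Added lines: the check runs on every interrupt, including the first. */
static bool account_new(struct hw_state *hwc, unsigned long seq,
			unsigned long max)
{
	if (seq != hwc->interrupts_seq) {
		hwc->interrupts_seq = seq;
		hwc->interrupts = 1;
	} else {
		hwc->interrupts++;
	}
	return hwc->interrupts >= max;
}

/* Interrupts seen in one tick, up to and including the throttling one. */
static unsigned long interrupts_in_tick(account_fn account, unsigned long max)
{
	struct hw_state hwc = { .interrupts_seq = 0, .interrupts = 0 };
	unsigned long seq = 1;	/* differs from interrupts_seq: new tick */
	unsigned long n = 0;
	bool throttled;

	do {
		throttled = account(&hwc, seq, max);
		n++;
	} while (!throttled);

	return n;
}
```

In this model, interrupts_in_tick(account_old, 1) yields 2 while
interrupts_in_tick(account_new, 1) yields 1. For a limit of 2 or more
both variants throttle on the same interrupt, which matches the
description that only max_samples_per_tick=1 is affected.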
Re: [PATCH 2/2] perf/core: Fix broken throttling when max_samples_per_tick=1
Posted by Peter Zijlstra 9 months, 3 weeks ago
On Sat, Apr 05, 2025 at 10:16:35PM +0800, Qing Wong wrote:
> From: Qing Wang <wangqing7171@gmail.com>
> 
> The throttling mechanism is meant to keep the number of PMU interrupts
> in one tick from exceeding max_samples_per_tick. This does not work when
> max_samples_per_tick=1, because the throttling check is skipped on the
> first interrupt of a tick and only performed when the second interrupt
> arrives.
> 
> The effect within a single tick is small, but over a longer time scale
> the extra interrupts add up.
> 
> When max_samples_per_tick = 1:
> Allowed-interrupts-per-second max-samples-per-second  default-HZ  ARCH
> 200                           100                     100         X86
> 500                           250                     250         ARM64
> ...
> In these cases the number of PMU interrupts is twice what the user
> expects.
> 
> Fixes: e050e3f0a71b ("perf: Fix broken interrupt rate throttling")
> Signed-off-by: Qing Wang <wangqing7171@gmail.com>
> ---
>  kernel/events/core.c | 17 ++++++++---------
>  1 file changed, 8 insertions(+), 9 deletions(-)
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 29cdb240e104..4ac2ac988ddc 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -10047,16 +10047,15 @@ __perf_event_account_interrupt(struct perf_event *event, int throttle)
>  	if (seq != hwc->interrupts_seq) {
>  		hwc->interrupts_seq = seq;
>  		hwc->interrupts = 1;
> -	} else {
> +	} else
>  		hwc->interrupts++;
> -		if (unlikely(throttle
> -			     && hwc->interrupts >= max_samples_per_tick)) {
> -			__this_cpu_inc(perf_throttled_count);
> -			tick_dep_set_cpu(smp_processor_id(), TICK_DEP_BIT_PERF_EVENTS);
> -			hwc->interrupts = MAX_INTERRUPTS;
> -			perf_log_throttle(event, 0);
> -			ret = 1;
> -		}
> +
> +	if (unlikely(throttle && hwc->interrupts >= max_samples_per_tick)) {
> +		__this_cpu_inc(perf_throttled_count);
> +		tick_dep_set_cpu(smp_processor_id(), TICK_DEP_BIT_PERF_EVENTS);
> +		hwc->interrupts = MAX_INTERRUPTS;
> +		perf_log_throttle(event, 0);
> +		ret = 1;
>  	}

Fair enough I suppose. I'll make this apply without that revert -- it
seems pointless to have that in between.
Re: [PATCH 2/2] perf/core: Fix broken throttling when max_samples_per_tick=1
Posted by Qing Wang 9 months, 3 weeks ago
Thank you very much for your review. Do you need me to reorganize the 
patch and send it out? Because if only the second patch is accepted, its 
context won't match the current mainline code.

On 4/18/2025 5:03 PM, Peter Zijlstra wrote:
> Fair enough I suppose. I'll make this apply without that revert -- it
> seems pointless to have that in between.
Re: [PATCH 2/2] perf/core: Fix broken throttling when max_samples_per_tick=1
Posted by Peter Zijlstra 9 months, 3 weeks ago
On Fri, Apr 18, 2025 at 09:08:30PM +0800, Qing Wang wrote:
> Thank you very much for your review. Do you need me to reorganize the patch
> and send it out? Because if only the second patch is accepted, its context
> won't match the current mainline code.

I've stomped on it a bit and pushed out to queue/perf/core.

If all looks well, and the robots don't have a fit because I failed to
compile test the thing, it should eventually make its way into tip.
Re: [PATCH 2/2] perf/core: Fix broken throttling when max_samples_per_tick=1
Posted by Qing Wang 9 months, 3 weeks ago
Thank you again. This has kick-started my journey of contributing to the 
Linux Kernel community, and it's really awesome.

On 4/18/2025 9:10 PM, Peter Zijlstra wrote:
> I've stomped on it a bit and pushed out to queue/perf/core.
>
> If all looks well, and the robots don't have a fit because I failed to
> compile test the thing, it should eventually make its way into tip.
[tip: perf/core] perf/core: Fix broken throttling when max_samples_per_tick=1
Posted by tip-bot2 for Qing Wang 9 months, 2 weeks ago
The following commit has been merged into the perf/core branch of tip:

Commit-ID:     f51972e6f8b9a737b2b3eb588069acb538fa72de
Gitweb:        https://git.kernel.org/tip/f51972e6f8b9a737b2b3eb588069acb538fa72de
Author:        Qing Wang <wangqing7171@gmail.com>
AuthorDate:    Sat, 05 Apr 2025 22:16:35 +08:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 25 Apr 2025 14:55:22 +02:00

perf/core: Fix broken throttling when max_samples_per_tick=1

The throttling mechanism is meant to keep the number of PMU interrupts
in one tick from exceeding max_samples_per_tick. This does not work when
max_samples_per_tick=1, because the throttling check is skipped on the
first interrupt of a tick and only performed when the second interrupt
arrives.

The effect within a single tick is small, but over a longer time scale
the extra interrupts add up.

When max_samples_per_tick = 1:
Allowed-interrupts-per-second max-samples-per-second  default-HZ  ARCH
200                           100                     100         X86
500                           250                     250         ARM64
...
In these cases the number of PMU interrupts is twice what the user
expects.

Fixes: e050e3f0a71b ("perf: Fix broken interrupt rate throttling")
Signed-off-by: Qing Wang <wangqing7171@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20250405141635.243786-3-wangqing7171@gmail.com
---
 kernel/events/core.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index 3c69a1a..05136e8 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10065,14 +10065,14 @@ __perf_event_account_interrupt(struct perf_event *event, int throttle)
 		hwc->interrupts = 1;
 	} else {
 		hwc->interrupts++;
-		if (unlikely(throttle &&
-			     hwc->interrupts > max_samples_per_tick)) {
-			__this_cpu_inc(perf_throttled_count);
-			tick_dep_set_cpu(smp_processor_id(), TICK_DEP_BIT_PERF_EVENTS);
-			hwc->interrupts = MAX_INTERRUPTS;
-			perf_log_throttle(event, 0);
-			ret = 1;
-		}
+	}
+
+	if (unlikely(throttle && hwc->interrupts >= max_samples_per_tick)) {
+		__this_cpu_inc(perf_throttled_count);
+		tick_dep_set_cpu(smp_processor_id(), TICK_DEP_BIT_PERF_EVENTS);
+		hwc->interrupts = MAX_INTERRUPTS;
+		perf_log_throttle(event, 0);
+		ret = 1;
 	}
 
 	if (event->attr.freq) {
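
The per-second figures in the table of the commit message follow from
multiplying the interrupts let through per tick by the tick rate. A
minimal sketch of that arithmetic, assuming every tick is saturated and
using hypothetical helper names that do not exist in the kernel:

```c
/*
 * Interrupts handled per second if `per_tick` interrupts get through in
 * every tick and there are `hz` ticks per second.
 */
static unsigned int interrupts_per_second(unsigned int per_tick,
					  unsigned int hz)
{
	return per_tick * hz;
}

/*
 * With max_samples_per_tick=1 the old check let a second interrupt
 * through before throttling, so two interrupts per tick.
 */
static unsigned int allowed_before_fix(unsigned int hz)
{
	return interrupts_per_second(2, hz);
}

/* With the check applied to the first interrupt, one per tick. */
static unsigned int allowed_after_fix(unsigned int hz)
{
	return interrupts_per_second(1, hz);
}
```

For the HZ values listed in the table, allowed_before_fix(100) gives
200 and allowed_before_fix(250) gives 500, against 100 and 250 once the
limit is enforced on the first interrupt.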