From nobody Mon Oct 6 19:09:27 2025 Received: from outboundhk.mxmail.xiaomi.com (outboundhk.mxmail.xiaomi.com [118.143.206.90]) by smtp.subspace.kernel.org (Postfix) with ESMTP id E7A081D5CE8; Thu, 17 Jul 2025 10:12:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=118.143.206.90 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752747169; cv=none; b=QFTHH4RVabslPbi3TO1825yA0ZGgTitC9JYOk0DEcBYAadTlCBEkfIArmbeXpbSQBnfyWYul+g9h6X45nAY0ek4gGEUCX5WWfU5J8mY4ser1MfEjMac6cosf43muiGDS9QSa1E8rnvaWXe/Dnj8lkIIG7CVcjS5JIDh3Hmwe8xM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1752747169; c=relaxed/simple; bh=uEgTeSOrVumG9zN+PvcCy4nZs+ahw73Jgn2Q3UeBxYU=; h=From:To:CC:Subject:Date:Message-ID:Content-Type:MIME-Version; b=aC/PpBtSnoGxnPs6fHpuGm5mifQ91a61YpwwceRrJc80pDsPHR20O+LhLKxqJKa/I1zXInLb8EcXWtC6XuuVgnVLruxweYm1JHfYkcWwxdenJbgl66Et6Izec9nQk4RSESD/7g3AOx0GdWY+FAE+Gc4hI+jTuJeAo6h36qi+Y9k= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=xiaomi.com; spf=pass smtp.mailfrom=xiaomi.com; arc=none smtp.client-ip=118.143.206.90 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=xiaomi.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=xiaomi.com X-CSE-ConnectionGUID: rrR6SmeVQHem/wIpXjI+pA== X-CSE-MsgGUID: cyioYL1iTzmTefqXy2HvNw== X-IronPort-AV: E=Sophos;i="6.16,318,1744041600"; d="scan'208";a="120777319" From: =?gb2312?B?1uzi/ces?= To: "rafael@kernel.org" CC: Daniel Lezcano , "christian.loehle@arm.com" , "quic_zhonhan@quicinc.com" , "linux-pm@vger.kernel.org" , "linux-kernel@vger.kernel.org" Subject: [PATCH] cpuidle: menu: find the typical interval by a heuristic classification method Thread-Topic: [PATCH] cpuidle: menu: find the typical interval by a heuristic classification method Thread-Index: Adv2/kCxxcwMPvuoQ+OgADfpnC9xOw== Date: Thu, 17 Jul 2025 10:11:32 +0000 Message-ID: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: Content-Transfer-Encoding: quoted-printable Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" The iterations of deviation calculation gives too less predictions on the idle interval by trying to find a single repeating pattern from the whole history. This is not always the case when the workload is flowing. This algorithm assumes there're multiple repeating patterns heuristically, and tries to determine which is the most promising one from the averages of different idle states. It also takes the occurrence sequence into consideration, and gives the prediction close to the recent idle. This increased the shallow idle states detected, but the difference in deep sleep time didn't change a lot. The performance on my platform, as expected, has improved. Before: Multi-Core Score 7279 Overall above under 34107 0.00 2.75 8200 59.90 7.02 29881 57.06 0.00 After: Multi-Core Score 7365 Overall above under 49913 0.00 6.43 7881 44.51 18.08 23108 52.38 0.00 There's another re-classification method, which, instead of looking for the repeating-interval, tends to find the repeating state. It gives a better re= sult on performance gain, but may hurt the power consumption. if (best_state =3D=3D drv->state_count - 1 || state_avg[best_state] =3D=3D = 0) { adj_weight[best_state] +=3D weights[i]; adj_avg[best_state] +=3D value; adj_hit[best_state]++; } else if (best_diff < state_avg[best_state] * 2) { adj_weight[best_state] +=3D weights[i]; adj_avg[best_state] +=3D value; adj_hit[best_state]++; } else { adj_weight[best_state + 1] +=3D weights[i]; adj_avg[best_state + 1] +=3D value; adj_hit[best_state + 1]++; } Repeating State: Multi-Core Score 7421 Overall above under 60857 0.00 8.30 3838 29.88 18.42 15318 39.05 0.00 Signed-off-by: Kaiqian Zhu --- drivers/cpuidle/governors/menu.c | 174 ++++++++++++++++--------------- 1 file changed, 89 insertions(+), 85 deletions(-) diff --git a/drivers/cpuidle/governors/menu.c b/drivers/cpuidle/governors/m= enu.c index 52d5d26fc7c6..52723ec1a0a6 100644 --- a/drivers/cpuidle/governors/menu.c +++ b/drivers/cpuidle/governors/menu.c @@ -99,109 +99,113 @@ static DEFINE_PER_CPU(struct menu_device, menu_device= s); static void menu_update(struct cpuidle_driver *drv, struct cpuidle_device = *dev); -/* - * Try detecting repeating patterns by keeping track of the last 8 - * intervals, and checking if the standard deviation of that set - * of points is below a threshold. If it is... then use the - * average of these 8 points as the estimated value. - */ -static unsigned int get_typical_interval(struct menu_device *data) +static int get_actual_state(struct cpuidle_driver *drv, + struct cpuidle_device *dev, + int duration_us) { -s64 value, min_thresh =3D -1, max_thresh =3D UINT_MAX; -unsigned int max, min, divisor; -u64 avg, variance, avg_sq; -int i; +int actual; -again: -/* Compute the average and variance of past intervals. */ -max =3D 0; -min =3D UINT_MAX; -avg =3D 0; -variance =3D 0; -divisor =3D 0; -for (i =3D 0; i < INTERVALS; i++) { -value =3D data->intervals[i]; -/* - * Discard the samples outside the interval between the min and - * max thresholds. - */ -if (value <=3D min_thresh || value >=3D max_thresh) -continue; +for (int i =3D 0; i < drv->state_count; i++) { +if (duration_us < drv->states[i].target_residency) +break; + +actual =3D i; +} + +return actual; +} + +static unsigned int get_typical_interval(struct cpuidle_driver *drv, + struct cpuidle_device *dev, + struct menu_device *data) +{ +int cnt =3D 0; + +int state_hit[CPUIDLE_STATE_MAX]; +int state_avg[CPUIDLE_STATE_MAX]; +int adj_weight[CPUIDLE_STATE_MAX]; +int adj_avg[CPUIDLE_STATE_MAX]; +int adj_hit[CPUIDLE_STATE_MAX]; +int hit_thres =3D max(2, INTERVALS / drv->state_count); +int weights[INTERVALS] =3D {5, 3, 2, 1}; +int weight =3D 0; +int high_state =3D -1; -divisor++; -avg +=3D value; -variance +=3D value * value; +/* Going through the history, and divide them by the actual state */ +for (int i =3D 0; i < INTERVALS; i++) { +int actual =3D get_actual_state(drv, dev, data->intervals[i]); -if (value > max) -max =3D value; +/* Count the idle states hit in the history */ +state_avg[actual] +=3D data->intervals[i]; +state_hit[actual]++; -if (value < min) -min =3D value; +cnt++; } -if (!max) +if (cnt < hit_thres) return UINT_MAX; -if (divisor =3D=3D INTERVALS) { -avg >>=3D INTERVAL_SHIFT; -variance >>=3D INTERVAL_SHIFT; -} else { -do_div(avg, divisor); -do_div(variance, divisor); +/* calculate the average of each state */ +for (int i =3D 0; i < drv->state_count; i++) { +if (state_hit[i] > 1) +state_avg[i] /=3D state_hit[i]; } -avg_sq =3D avg * avg; -variance -=3D avg_sq; +/* try to re-assign the data points by the closeness */ +for (int i =3D 0; i < INTERVALS; i++) { +/* Starting from the recent history */ +int idx =3D ((data->interval_ptr - i - 1) + INTERVALS) % INTERVALS; +unsigned int diff; +unsigned int best_diff =3D UINT_MAX; +unsigned int best_state, next_state; +unsigned int value =3D data->intervals[idx]; + +for (int state =3D 0; state < drv->state_count; state++) { +diff =3D abs(state_avg[state] - value); +if (diff < best_diff) { +best_diff =3D diff; +best_state =3D state; +} +} -/* - * The typical interval is obtained when standard deviation is - * small (stddev <=3D 20 us, variance <=3D 400 us^2) or standard - * deviation is small compared to the average interval (avg > - * 6*stddev, avg^2 > 36*variance). The average is smaller than - * UINT_MAX aka U32_MAX, so computing its square does not - * overflow a u64. We simply reject this candidate average if - * the standard deviation is greater than 715 s (which is - * rather unlikely). - * - * Use this result only if there is no timer to wake us up sooner. - */ -if (likely(variance <=3D U64_MAX/36)) { -if ((avg_sq > variance * 36 && divisor * 4 >=3D INTERVALS * 3) || - variance <=3D 400) -return avg; +if (best_diff < (state_avg[best_state] >> 2)) { +adj_avg[best_state] +=3D value; +adj_hit[best_state]++; +adj_weight[best_state] +=3D weights[i]; +} else if (best_state < drv->state_count - 1) { +next_state =3D best_state + 1; +diff =3D abs(state_avg[next_state] - value); +if (diff < (state_avg[next_state] >> 2)) { +adj_avg[next_state] +=3D value; +adj_hit[next_state]++; +adj_weight[next_state] +=3D weights[i]; +} +} } -/* - * If there are outliers, discard them by setting thresholds to exclude - * data points at a large enough distance from the average, then - * calculate the average and standard deviation again. Once we get - * down to the last 3/4 of our samples, stop excluding samples. - * - * This can deal with workloads that have long pauses interspersed - * with sporadic activity with a bunch of short pauses. +/* We've adjusted the hit status by the closeness, if one state is still + * hit more often and selected recently, we can assume that state is more + * likely to happen in the future */ -if (divisor * 4 <=3D INTERVALS * 3) { -/* - * If there are sufficiently many data points still under - * consideration after the outliers have been eliminated, - * returning without a prediction would be a mistake because it - * is likely that the next interval will not exceed the current - * maximum, so return the latter in that case. - */ -if (divisor >=3D INTERVALS / 2) -return max; - -return UINT_MAX; +for (int state =3D 0; state < drv->state_count; state++) { +if (adj_weight[state] > 1 && adj_hit[state] >=3D hit_thres) { +adj_avg[state] /=3D adj_hit[state]; + +if (adj_weight[state] > weight) { +weight =3D adj_weight[state]; +high_state =3D state; +} else if (adj_weight[state] =3D=3D weight) { +if (adj_hit[state] > adj_hit[high_state]) +high_state =3D state; +} +} } -/* Update the thresholds for the next round. */ -if (avg - min > max - avg) -min_thresh =3D min; -else -max_thresh =3D max; +if (weight) +return adj_avg[high_state]; -goto again; +return UINT_MAX; } /** @@ -225,7 +229,7 @@ static int menu_select(struct cpuidle_driver *drv, stru= ct cpuidle_device *dev, } /* Find the shortest expected idle interval. */ -predicted_ns =3D get_typical_interval(data) * NSEC_PER_USEC; +predicted_ns =3D get_typical_interval(drv, dev, data) * NSEC_PER_USEC; if (predicted_ns > RESIDENCY_THRESHOLD_NS) { unsigned int timer_us; -- 2.34.1 #/******=E6=9C=AC=E9=82=AE=E4=BB=B6=E5=8F=8A=E5=85=B6=E9=99=84=E4=BB=B6=E5= =90=AB=E6=9C=89=E5=B0=8F=E7=B1=B3=E5=85=AC=E5=8F=B8=E7=9A=84=E4=BF=9D=E5=AF= =86=E4=BF=A1=E6=81=AF=EF=BC=8C=E4=BB=85=E9=99=90=E4=BA=8E=E5=8F=91=E9=80=81= =E7=BB=99=E4=B8=8A=E9=9D=A2=E5=9C=B0=E5=9D=80=E4=B8=AD=E5=88=97=E5=87=BA=E7= =9A=84=E4=B8=AA=E4=BA=BA=E6=88=96=E7=BE=A4=E7=BB=84=E3=80=82=E7=A6=81=E6=AD= =A2=E4=BB=BB=E4=BD=95=E5=85=B6=E4=BB=96=E4=BA=BA=E4=BB=A5=E4=BB=BB=E4=BD=95= =E5=BD=A2=E5=BC=8F=E4=BD=BF=E7=94=A8=EF=BC=88=E5=8C=85=E6=8B=AC=E4=BD=86=E4= =B8=8D=E9=99=90=E4=BA=8E=E5=85=A8=E9=83=A8=E6=88=96=E9=83=A8=E5=88=86=E5=9C= =B0=E6=B3=84=E9=9C=B2=E3=80=81=E5=A4=8D=E5=88=B6=E3=80=81=E6=88=96=E6=95=A3= =E5=8F=91=EF=BC=89=E6=9C=AC=E9=82=AE=E4=BB=B6=E4=B8=AD=E7=9A=84=E4=BF=A1=E6= =81=AF=E3=80=82=E5=A6=82=E6=9E=9C=E6=82=A8=E9=94=99=E6=94=B6=E4=BA=86=E6=9C= =AC=E9=82=AE=E4=BB=B6=EF=BC=8C=E8=AF=B7=E6=82=A8=E7=AB=8B=E5=8D=B3=E7=94=B5= =E8=AF=9D=E6=88=96=E9=82=AE=E4=BB=B6=E9=80=9A=E7=9F=A5=E5=8F=91=E4=BB=B6=E4= =BA=BA=E5=B9=B6=E5=88=A0=E9=99=A4=E6=9C=AC=E9=82=AE=E4=BB=B6=EF=BC=81 This = e-mail and its attachments contain confidential information from XIAOMI, wh= ich is intended only for the person or entity whose address is listed above= . Any use of the information contained herein in any way (including, but no= t limited to, total or partial disclosure, reproduction, or dissemination) = by persons other than the intended recipient(s) is prohibited. If you recei= ve this e-mail in error, please notify the sender by phone or email immedia= tely and delete it!******/#