From nobody Fri Sep 12 23:47:47 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 00793C636CC for ; Tue, 7 Feb 2023 05:02:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230134AbjBGFCC (ORCPT ); Tue, 7 Feb 2023 00:02:02 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60488 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230040AbjBGFBt (ORCPT ); Tue, 7 Feb 2023 00:01:49 -0500 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C78A22B2BC; Mon, 6 Feb 2023 21:01:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675746101; x=1707282101; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=lZUt9bTzhG5+Fn6a2/x++iylr7jp1U1p86qFgAgn3yU=; b=l6lE2P4kbnF0kklERSzoSHvU0x1byi/pE8aw7AtzyiS93pW2d3z/oh6k zg/ryjKJvt6n2Gw5J54AlcUJSO0nNDtHryL7C5GQHO3FPYdHUqKSLGKDQ 3B+EzhDjmh5IRDGVGYfCGcz4jC9akVS0R96Afkys4lJuHx19dhshaui4t UwJMt3oPJ9RkPk5Kg7NbAEwrJc+z3Ja9As0/RC8U83D7PmXPiPMZBQYlT 8AlQJoNt9PSV0rpn4xOJlq0cAkJKBKU/0VZP3QmK6cv2TTGJotUN2xmz2 EaQdOl4u6rP+ZIBfae8+La3ukKKGNDMVKhEV+RWWLHW4MqPBWlHOnRfA7 w==; X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="415625738" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="415625738" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Feb 2023 21:01:40 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="668657688" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="668657688" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by fmsmga007.fm.intel.com with ESMTP; 06 Feb 2023 21:01:39 -0800 From: Ricardo 
Neri
To: "Peter Zijlstra (Intel)", Juri Lelli, Vincent Guittot
Cc: Ricardo Neri, "Ravi V. Shankar", Ben Segall, Daniel Bristot de Oliveira, Dietmar Eggemann, Len Brown, Mel Gorman, "Rafael J. Wysocki", Srinivas Pandruvada, Steven Rostedt, Tim Chen, Valentin Schneider, Lukasz Luba, Ionela Voinescu, x86@kernel.org, "Joel Fernandes (Google)", linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ricardo Neri, "Tim C . Chen"
Subject: [PATCH v3 01/24] sched/task_struct: Introduce IPC classes of tasks
Date: Mon, 6 Feb 2023 21:10:42 -0800
Message-Id: <20230207051105.11575-2-ricardo.neri-calderon@linux.intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>
References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

On hybrid processors, the architectural differences between the types of
CPUs lead to different instructions-per-cycle (IPC) on each type of CPU.
IPCs may differ further by the type of instructions. Instructions can be
grouped into classes of similar IPCs.

Hence, tasks can be classified into groups based on the type of
instructions they execute.

Add a new member task_struct::ipcc to associate a task with an IPC class
that depends on the instructions it executes.

The scheduler may use the IPC class of a task and data about the
performance among CPUs of a given IPC class to improve throughput. It
may, for instance, place certain classes of tasks on CPUs of higher
performance.

The methods to determine the classification of a task and its relative
IPC score are specific to each CPU architecture.

Cc: Ben Segall
Cc: Daniel Bristot de Oliveira
Cc: Dietmar Eggemann
Cc: Ionela Voinescu
Cc: Joel Fernandes (Google)
Cc: Len Brown
Cc: Lukasz Luba
Cc: Mel Gorman
Cc: Rafael J. Wysocki
Cc: Srinivas Pandruvada
Cc: Steven Rostedt
Cc: Tim C. Chen
Cc: Valentin Schneider
Cc: x86@kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ricardo Neri
---
Changes since v2:
 * Changed the type of task_struct::ipcc to unsigned short. A subsequent
   patch uses bit fields to use 9 bits, along with other auxiliary
   members.

Changes since v1:
 * Renamed task_struct::class as task_struct::ipcc. (Joel)
 * Use task_struct::ipcc = 0 for unclassified tasks. (PeterZ)
 * Renamed CONFIG_SCHED_TASK_CLASSES as CONFIG_IPC_CLASSES. (PeterZ, Joel)
---
 include/linux/sched.h | 10 ++++++++++
 init/Kconfig          | 12 ++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 4df2b3e76b30..98f84f90a01d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -127,6 +127,8 @@ struct task_group;
 					 __TASK_TRACED | EXIT_DEAD | EXIT_ZOMBIE | \
 					 TASK_PARKED)
 
+#define IPC_CLASS_UNCLASSIFIED	0
+
 #define task_is_running(task)		(READ_ONCE((task)->__state) == TASK_RUNNING)
 
 #define task_is_traced(task)		((READ_ONCE(task->jobctl) & JOBCTL_TRACED) != 0)
@@ -1528,6 +1530,14 @@ struct task_struct {
 	union rv_task_monitor		rv[RV_PER_TASK_MONITORS];
 #endif
 
+#ifdef CONFIG_IPC_CLASSES
+	/*
+	 * A hardware-defined classification of task that reflects but is
+	 * not identical to the number of instructions per cycle.
+	 */
+	unsigned short			ipcc;
+#endif
+
 	/*
 	 * New fields for task_struct should be added above here, so that
 	 * they are included in the randomized portion of task_struct.
diff --git a/init/Kconfig b/init/Kconfig
index e76dc579cfa2..731a4b652030 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -867,6 +867,18 @@ config UCLAMP_BUCKETS_COUNT
 
 	  If in doubt, use the default value.
 
+config IPC_CLASSES
+	bool "IPC classes of tasks"
+	depends on SMP
+	help
+	  If selected, each task is assigned a classification value that
+	  reflects the type of instructions that the task executes. This
+	  classification reflects but is not equal to the number of
+	  instructions retired per cycle.
+
+	  The scheduler uses the classification value to improve the placement
+	  of tasks.
+
 endmenu
 
 #
-- 
2.25.1

From nobody Fri Sep 12 23:47:47 2025
From: Ricardo Neri
To: "Peter Zijlstra (Intel)", Juri Lelli, Vincent Guittot
Cc: Ricardo Neri, "Ravi V. Shankar", Ben Segall, Daniel Bristot de Oliveira, Dietmar Eggemann, Len Brown, Mel Gorman, "Rafael J. Wysocki", Srinivas Pandruvada, Steven Rostedt, Tim Chen, Valentin Schneider, Lukasz Luba, Ionela Voinescu, x86@kernel.org, "Joel Fernandes (Google)", linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ricardo Neri, "Tim C . Chen"
Subject: [PATCH v3 02/24] sched: Add interfaces for IPC classes
Date: Mon, 6 Feb 2023 21:10:43 -0800
Message-Id: <20230207051105.11575-3-ricardo.neri-calderon@linux.intel.com>
In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>
References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>

Add the interfaces that architectures shall implement to convey the data
to support IPC classes.

arch_update_ipcc() updates the IPC classification of the current task as
given by hardware.

arch_get_ipcc_score() provides a performance score for a given IPC class
when placed on a specific CPU. Higher scores indicate higher performance.

When a driver or equivalent enablement code has configured the necessary
hardware to support IPC classes, it should call sched_enable_ipc_classes()
to notify the scheduler that it can start using IPC classes data.

The number of classes and the score of each class of task are determined
by hardware.
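
For illustration only, and not part of this patch: the kernel-doc this
patch adds for arch_get_ipcc_score() notes that a score may be normalized
against the highest score over all (class, CPU) pairs. A minimal userspace
sketch of that arithmetic follows. The score table, NR_IPC_CLASSES,
NR_CPUS, and the helper names are hypothetical, and SCORE_SCALE assumes
SCHED_FIXEDPOINT_SHIFT is 10.

```c
/* Hypothetical sizes: 3 IPC classes; CPU 0 is a big core, CPU 1 a small one. */
#define NR_IPC_CLASSES	3
#define NR_CPUS		2

/* Stand-in for SCHED_IPCC_SCORE_SCALE, assuming SCHED_FIXEDPOINT_SHIFT == 10. */
#define SCORE_SCALE	(1UL << 10)

/* Made-up raw scores, indexed by [class][cpu]. Real values come from hardware. */
static const unsigned long raw_score[NR_IPC_CLASSES][NR_CPUS] = {
	{ 300, 200 },
	{ 450, 250 },
	{ 600, 150 },
};

/* Highest raw score over every (class, cpu) pair. */
static unsigned long max_raw_score(void)
{
	unsigned long max = 0;
	int i, c;

	for (i = 0; i < NR_IPC_CLASSES; i++)
		for (c = 0; c < NR_CPUS; c++)
			if (raw_score[i][c] > max)
				max = raw_score[i][c];

	return max;
}

/* IPC(i, c) / max(IPC) * SCORE_SCALE, multiplying first to keep precision. */
static unsigned long normalized_score(unsigned short ipcc, int cpu)
{
	return raw_score[ipcc][cpu] * SCORE_SCALE / max_raw_score();
}
```

With this table, the best (class, CPU) pair maps to SCORE_SCALE and every
other pair to a proportionally smaller value, which is what lets the score
be combined with other normalized metrics such as CPU capacity.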

Cc: Ben Segall
Cc: Daniel Bristot de Oliveira
Cc: Dietmar Eggemann
Cc: Ionela Voinescu
Cc: Joel Fernandes (Google)
Cc: Len Brown
Cc: Lukasz Luba
Cc: Mel Gorman
Cc: Rafael J. Wysocki
Cc: Srinivas Pandruvada
Cc: Steven Rostedt
Cc: Tim C. Chen
Cc: Valentin Schneider
Cc: x86@kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ricardo Neri
---
Changes since v2:
 * Clarified the properties of the IPC score: abstract, and linear. It
   can be normalized when needed. (Ionela)
 * Selected a better default IPC score. (Ionela)
 * Removed arch_has_ipc_classes(). It is not suitable for hardware that is
   not ready to support IPC classes after boot. (Lukasz)
 * Added a new sched_enable_ipc_classes() interface that drivers or
   enablement code can call when ready to support IPC classes. (Lukasz)

Changes since v1:
 * Shortened the names of the IPCC interfaces (PeterZ):
     sched_task_classes_enabled >> sched_ipcc_enabled
     arch_has_task_classes >> arch_has_ipc_classes
     arch_update_task_class >> arch_update_ipcc
     arch_get_task_class_score >> arch_get_ipcc_score
 * Removed smt_siblings_idle argument from arch_update_ipcc().
   (PeterZ)
---
 include/linux/sched/topology.h |  6 ++++
 kernel/sched/sched.h           | 66 ++++++++++++++++++++++++++++++++++
 kernel/sched/topology.c        |  9 +++++
 3 files changed, 81 insertions(+)

diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index 816df6cc444e..5b084d3c9ad1 100644
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -280,4 +280,10 @@ static inline int task_node(const struct task_struct *p)
 	return cpu_to_node(task_cpu(p));
 }
 
+#ifdef CONFIG_IPC_CLASSES
+extern void sched_enable_ipc_classes(void);
+#else
+static inline void sched_enable_ipc_classes(void) { }
+#endif
+
 #endif /* _LINUX_SCHED_TOPOLOGY_H */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 1072502976df..0a9c3024326d 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2532,6 +2532,72 @@ void arch_scale_freq_tick(void)
 }
 #endif
 
+#ifdef CONFIG_IPC_CLASSES
+DECLARE_STATIC_KEY_FALSE(sched_ipcc);
+
+static inline bool sched_ipcc_enabled(void)
+{
+	return static_branch_unlikely(&sched_ipcc);
+}
+
+#ifndef arch_update_ipcc
+/**
+ * arch_update_ipcc() - Update the IPC class of the current task
+ * @curr:		The current task
+ *
+ * Request that the IPC classification of @curr is updated.
+ *
+ * Returns: none
+ */
+static __always_inline
+void arch_update_ipcc(struct task_struct *curr)
+{
+}
+#endif
+
+#ifndef arch_get_ipcc_score
+
+#define SCHED_IPCC_SCORE_SCALE (1L << SCHED_FIXEDPOINT_SHIFT)
+/**
+ * arch_get_ipcc_score() - Get the IPC score of a class of task
+ * @ipcc:	The IPC class
+ * @cpu:	A CPU number
+ *
+ * The IPC performance score reflects (but is not identical to) the number
+ * of instructions retired per cycle for a given IPC class. It is a linear and
+ * abstract metric. Higher scores reflect better performance.
+ *
+ * The IPC score can be normalized with respect to the class, i, with the
+ * highest IPC score on the CPU, c, with highest performance:
+ *
+ *            IPC(i, c)
+ *  ------------------------------------ * SCHED_IPCC_SCORE_SCALE
+ *     max(IPC(i, c) : (i, c))
+ *
+ * Scheduling schemes that want to use the IPC score along with other
+ * normalized metrics for scheduling (e.g., CPU capacity) may need to normalize
+ * it.
+ *
+ * Other scheduling schemes (e.g., asym_packing) do not need normalization.
+ *
+ * Returns the performance score of an IPC class, @ipcc, when running on @cpu.
+ * Error when either @ipcc or @cpu are invalid.
+ */
+static __always_inline
+unsigned long arch_get_ipcc_score(unsigned short ipcc, int cpu)
+{
+	return SCHED_IPCC_SCORE_SCALE;
+}
+#endif
+#else /* CONFIG_IPC_CLASSES */
+
+#define arch_get_ipcc_score(ipcc, cpu) (-EINVAL)
+#define arch_update_ipcc(curr)
+
+static inline bool sched_ipcc_enabled(void) { return false; }
+
+#endif /* CONFIG_IPC_CLASSES */
+
 #ifndef arch_scale_freq_capacity
 /**
  * arch_scale_freq_capacity - get the frequency scale factor of a given CPU.
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index d93c3379e901..8380bb7f0cd9 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -670,6 +670,15 @@ DEFINE_PER_CPU(struct sched_domain __rcu *, sd_asym_packing);
 DEFINE_PER_CPU(struct sched_domain __rcu *, sd_asym_cpucapacity);
 DEFINE_STATIC_KEY_FALSE(sched_asym_cpucapacity);
 
+#ifdef CONFIG_IPC_CLASSES
+DEFINE_STATIC_KEY_FALSE(sched_ipcc);
+
+void sched_enable_ipc_classes(void)
+{
+	static_branch_enable_cpuslocked(&sched_ipcc);
+}
+#endif
+
 static void update_top_cache_domain(int cpu)
 {
 	struct sched_domain_shared *sds = NULL;
-- 
2.25.1

From nobody Fri Sep 12 23:47:47 2025
From: Ricardo Neri
To: "Peter Zijlstra (Intel)", Juri Lelli, Vincent Guittot
Cc: Ricardo Neri, "Ravi V. Shankar", Ben Segall, Daniel Bristot de Oliveira, Dietmar Eggemann, Len Brown, Mel Gorman, "Rafael J. Wysocki", Srinivas Pandruvada, Steven Rostedt, Tim Chen, Valentin Schneider, Lukasz Luba, Ionela Voinescu, x86@kernel.org, "Joel Fernandes (Google)", linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ricardo Neri, "Tim C . Chen"
Subject: [PATCH v3 03/24] sched/core: Initialize the IPC class of a new task
Date: Mon, 6 Feb 2023 21:10:44 -0800
Message-Id: <20230207051105.11575-4-ricardo.neri-calderon@linux.intel.com>
In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>
References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>

New tasks shall start life as unclassified. They will be classified by
hardware when they run.

Cc: Ben Segall
Cc: Daniel Bristot de Oliveira
Cc: Dietmar Eggemann
Cc: Ionela Voinescu
Cc: Joel Fernandes (Google)
Cc: Len Brown
Cc: Lukasz Luba
Cc: Mel Gorman
Cc: Rafael J. Wysocki
Cc: Srinivas Pandruvada
Cc: Steven Rostedt
Cc: Tim C. Chen
Cc: Valentin Schneider
Cc: x86@kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ricardo Neri
---
Changes since v2:
 * None

Changes since v1:
 * None
---
 kernel/sched/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 4580fe3e1d0c..ed1549d28090 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4428,6 +4428,9 @@ static void __sched_fork(unsigned long clone_flags, struct task_struct *p)
 	p->se.prev_sum_exec_runtime = 0;
 	p->se.nr_migrations = 0;
 	p->se.vruntime = 0;
+#ifdef CONFIG_IPC_CLASSES
+	p->ipcc = IPC_CLASS_UNCLASSIFIED;
+#endif
 	INIT_LIST_HEAD(&p->se.group_node);
 
 #ifdef CONFIG_FAIR_GROUP_SCHED
-- 
2.25.1

From nobody Fri Sep 12 23:47:47 2025
From: Ricardo Neri
To: "Peter Zijlstra (Intel)", Juri Lelli, Vincent Guittot
Cc: Ricardo Neri, "Ravi V. Shankar", Ben Segall, Daniel Bristot de Oliveira, Dietmar Eggemann, Len Brown, Mel Gorman, "Rafael J. Wysocki", Srinivas Pandruvada, Steven Rostedt, Tim Chen, Valentin Schneider, Lukasz Luba, Ionela Voinescu, x86@kernel.org, "Joel Fernandes (Google)", linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ricardo Neri, "Tim C . Chen"
Subject: [PATCH v3 04/24] sched/core: Add user_tick as argument to scheduler_tick()
Date: Mon, 6 Feb 2023 21:10:45 -0800
Message-Id: <20230207051105.11575-5-ricardo.neri-calderon@linux.intel.com>
In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>
References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>

Differentiate between user and kernel ticks so that the scheduler updates
the IPC class of the current task during the former.

Cc: Ben Segall
Cc: Daniel Bristot de Oliveira
Cc: Dietmar Eggemann
Cc: Ionela Voinescu
Cc: Joel Fernandes (Google)
Cc: Len Brown
Cc: Lukasz Luba
Cc: Mel Gorman
Cc: Rafael J. Wysocki
Cc: Srinivas Pandruvada
Cc: Steven Rostedt
Cc: Tim C. Chen
Cc: Valentin Schneider
Cc: x86@kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ricardo Neri
---
Changes since v2:
 * Corrected error in the changeset description: the IPC class of the
   current task is updated at user tick. (Dietmar)

Changes since v1:
 * None
---
 include/linux/sched.h | 2 +-
 kernel/sched/core.c   | 2 +-
 kernel/time/timer.c   | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 98f84f90a01d..10c6abdc3465 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -293,7 +293,7 @@ enum {
 	TASK_COMM_LEN = 16,
 };
 
-extern void scheduler_tick(void);
+extern void scheduler_tick(bool user_tick);
 
 #define	MAX_SCHEDULE_TIMEOUT		LONG_MAX
 
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index ed1549d28090..39d218a2f243 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5555,7 +5555,7 @@ static inline u64 cpu_resched_latency(struct rq *rq) { return 0; }
  * This function gets called by the timer code, with HZ frequency.
  * We call it with interrupts disabled.
  */
-void scheduler_tick(void)
+void scheduler_tick(bool user_tick)
 {
 	int cpu = smp_processor_id();
 	struct rq *rq = cpu_rq(cpu);
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 63a8ce7177dd..e15e24105891 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -2073,7 +2073,7 @@ void update_process_times(int user_tick)
 	if (in_irq())
 		irq_work_tick();
 #endif
-	scheduler_tick();
+	scheduler_tick(user_tick);
 	if (IS_ENABLED(CONFIG_POSIX_TIMERS))
 		run_posix_cpu_timers();
 }
-- 
2.25.1

From nobody Fri Sep 12 23:47:47 2025
From: Ricardo Neri
To: "Peter Zijlstra (Intel)", Juri Lelli, Vincent Guittot
Cc: Ricardo Neri, "Ravi V. Shankar", Ben Segall, Daniel Bristot de Oliveira, Dietmar Eggemann, Len Brown, Mel Gorman, "Rafael J. Wysocki", Srinivas Pandruvada, Steven Rostedt, Tim Chen, Valentin Schneider, Lukasz Luba, Ionela Voinescu, x86@kernel.org, "Joel Fernandes (Google)", linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ricardo Neri, "Tim C . Chen"
Subject: [PATCH v3 05/24] sched/core: Update the IPC class of the current task
Date: Mon, 6 Feb 2023 21:10:46 -0800
Message-Id: <20230207051105.11575-6-ricardo.neri-calderon@linux.intel.com>
In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>
References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>

When supported, hardware monitors the instruction stream to classify the
current task. Hence, at userspace tick, we are ready to read the most
recent classification result for the current task.

Cc: Ben Segall
Cc: Daniel Bristot de Oliveira
Cc: Dietmar Eggemann
Cc: Ionela Voinescu
Cc: Joel Fernandes (Google)
Cc: Len Brown
Cc: Lukasz Luba
Cc: Mel Gorman
Cc: Rafael J. Wysocki
Cc: Srinivas Pandruvada
Cc: Steven Rostedt
Cc: Tim C. Chen
Cc: Valentin Schneider
Cc: x86@kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ricardo Neri
---
Changes since v2:
 * None

Changes since v1:
 * Removed argument smt_siblings_idle from call to arch_update_ipcc().
 * Used the new IPCC interface names.
---
 kernel/sched/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 39d218a2f243..9f4e9cc16df8 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5567,6 +5567,9 @@ void scheduler_tick(bool user_tick)
 	if (housekeeping_cpu(cpu, HK_TYPE_TICK))
 		arch_scale_freq_tick();
 
+	if (sched_ipcc_enabled() && user_tick)
+		arch_update_ipcc(curr);
+
 	sched_clock_tick();
 
 	rq_lock(rq, &rf);
-- 
2.25.1

From nobody Fri Sep 12 23:47:47 2025
From: Ricardo Neri
To: "Peter Zijlstra (Intel)", Juri Lelli, Vincent Guittot
Cc: Ricardo Neri, "Ravi V. Shankar", Ben Segall, Daniel Bristot de Oliveira, Dietmar Eggemann, Len Brown, Mel Gorman, "Rafael J. Wysocki", Srinivas Pandruvada, Steven Rostedt, Tim Chen, Valentin Schneider, Lukasz Luba, Ionela Voinescu, x86@kernel.org, "Joel Fernandes (Google)", linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ricardo Neri, "Tim C . Chen"
Subject: [PATCH v3 06/24] sched/fair: Collect load-balancing stats for IPC classes
Date: Mon, 6 Feb 2023 21:10:47 -0800
Message-Id: <20230207051105.11575-7-ricardo.neri-calderon@linux.intel.com>
In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>
References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>

When selecting a busiest scheduling group, the IPC class of the current
task can be used to select between two scheduling groups of types
asym_packing or fully_busy that are otherwise identical.

Compute the IPC class performance score for a scheduling group. It is the
sum of the scores of the current tasks of all the runqueues.

Also, keep track of the class of the task with the lowest IPC class score
in the scheduling group.

These two metrics will be used during idle load balancing to compute the
current and the prospective IPC class score of a scheduling group.

Cc: Ben Segall
Cc: Daniel Bristot de Oliveira
Cc: Dietmar Eggemann
Cc: Ionela Voinescu
Cc: Joel Fernandes (Google)
Cc: Len Brown
Cc: Lukasz Luba
Cc: Mel Gorman
Cc: Rafael J. Wysocki
Cc: Srinivas Pandruvada
Cc: Steven Rostedt
Cc: Tim C. Chen
Cc: Valentin Schneider
Cc: x86@kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ricardo Neri
---
Changes since v2:
 * Also excluded deadline and realtime tasks from IPCC stats. (Dietmar)
 * Also excluded tasks that cannot run on the destination CPU from the
   IPCC stats.
 * Folded struct sg_lb_ipcc_stats into struct sg_lb_stats. (Dietmar)
 * Reworded description sg_lb_stats::min_ipcc. (Ionela)
 * Handle errors of arch_get_ipcc_score(). (Ionela)

Changes since v1:
 * Implemented cleanups and reworks from PeterZ. Thanks!
 * Used the new interface names.
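
Illustrative note, not part of the patch: the two per-group metrics
described in the changeset (the sum of the scores of the current tasks,
and the class of the lowest-scoring current task) can be modeled in
userspace as below. This is only a sketch. struct ipcc_stats and both
helpers are hypothetical names that mirror the min_score, min_ipcc, and
sum_score members, and the kernel's filtering of idle, exiting, realtime,
and unmovable tasks is left out.

```c
#include <limits.h>

/* Hypothetical stand-in for the IPCC members added to struct sg_lb_stats. */
struct ipcc_stats {
	unsigned long min_score;	/* lowest score seen so far */
	unsigned short min_ipcc;	/* class of the task with that score */
	unsigned long sum_score;	/* sum of the scores of all current tasks */
};

/* Start the minimum at ULONG_MAX so that the first sample always replaces it. */
static void ipcc_stats_init(struct ipcc_stats *s)
{
	s->min_score = ULONG_MAX;
	s->min_ipcc = 0;
	s->sum_score = 0;
}

/* Account for the current task of one runqueue: its class and its score. */
static void ipcc_stats_add(struct ipcc_stats *s, unsigned short ipcc,
			   unsigned long score)
{
	s->sum_score += score;

	if (score < s->min_score) {
		s->min_score = score;
		s->min_ipcc = ipcc;
	}
}
```

Calling ipcc_stats_add() once per busy runqueue of a group leaves the
group's aggregate score in sum_score and identifies, through min_ipcc, the
class of the task that contributes the least to it.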
--- kernel/sched/fair.c | 61 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 61 insertions(+) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 0ada2d18b934..d773380a95b3 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -8897,6 +8897,11 @@ struct sg_lb_stats { unsigned int nr_numa_running; unsigned int nr_preferred_running; #endif +#ifdef CONFIG_IPC_CLASSES + unsigned long min_score; /* Min(score(rq->curr->ipcc)) */ + unsigned short min_ipcc; /* Class of the task with the minimum IPCC score= in the rq */ + unsigned long sum_score; /* Sum(score(rq->curr->ipcc)) */ +#endif }; =20 /* @@ -9240,6 +9245,59 @@ group_type group_classify(unsigned int imbalance_pct, return group_has_spare; } =20 +#ifdef CONFIG_IPC_CLASSES +static void init_rq_ipcc_stats(struct sg_lb_stats *sgs) +{ + /* All IPCC stats have been set to zero in update_sg_lb_stats(). */ + sgs->min_score =3D ULONG_MAX; +} + +/* Called only if cpu_of(@rq) is not idle and has tasks running. */ +static void update_sg_lb_ipcc_stats(int dst_cpu, struct sg_lb_stats *sgs, + struct rq *rq) +{ + struct task_struct *curr; + unsigned short ipcc; + unsigned long score; + + if (!sched_ipcc_enabled()) + return; + + curr =3D rcu_dereference(rq->curr); + if (!curr || (curr->flags & PF_EXITING) || is_idle_task(curr) || + task_is_realtime(curr) || + !cpumask_test_cpu(dst_cpu, curr->cpus_ptr)) + return; + + ipcc =3D curr->ipcc; + score =3D arch_get_ipcc_score(ipcc, cpu_of(rq)); + + /* + * Ignore tasks with invalid scores. When finding the busiest group, we + * prefer those with higher sum_score. This group will not be selected. 
+ */ + if (IS_ERR_VALUE(score)) + return; + + sgs->sum_score +=3D score; + + if (score < sgs->min_score) { + sgs->min_score =3D score; + sgs->min_ipcc =3D ipcc; + } +} + +#else /* CONFIG_IPC_CLASSES */ +static void update_sg_lb_ipcc_stats(int dst_cpu, struct sg_lb_stats *sgs, + struct rq *rq) +{ +} + +static void init_rq_ipcc_stats(struct sg_lb_stats *sgs) +{ +} +#endif /* CONFIG_IPC_CLASSES */ + /** * asym_smt_can_pull_tasks - Check whether the load balancing CPU can pull= tasks * @dst_cpu: Destination CPU of the load balancing @@ -9332,6 +9390,7 @@ static inline void update_sg_lb_stats(struct lb_env *= env, int i, nr_running, local_group; =20 memset(sgs, 0, sizeof(*sgs)); + init_rq_ipcc_stats(sgs); =20 local_group =3D group =3D=3D sds->local; =20 @@ -9381,6 +9440,8 @@ static inline void update_sg_lb_stats(struct lb_env *= env, if (sgs->group_misfit_task_load < load) sgs->group_misfit_task_load =3D load; } + + update_sg_lb_ipcc_stats(env->dst_cpu, sgs, rq); } =20 sgs->group_capacity =3D group->sgc->capacity; --=20 2.25.1 From nobody Fri Sep 12 23:47:47 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0CC49C636CC for ; Tue, 7 Feb 2023 05:02:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230333AbjBGFCf (ORCPT ); Tue, 7 Feb 2023 00:02:35 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60074 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230075AbjBGFBz (ORCPT ); Tue, 7 Feb 2023 00:01:55 -0500 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6472E6A56; Mon, 6 Feb 2023 21:01:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675746112; x=1707282112; 
h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=s23DDV+a9X4aP0kA4U7st9P3jUXY+WdjiUqKmcmbNZo=; b=Sv+DsJxd21ekmcvCS9JQXGV8feTCQx8c9HGtN/ZIuR/YpRcDacT2EfAs +2YDbamk127Uqfu//18/Gl2XGmohoQ7ofFyyrMfjieT6oaeYNPzCQ8Yn6 cU9RM1/TqLOQvasG4/JTjGOb85/DIBOA+3DJEWR47yW9LOlK0boUv/doM JOGbwP9WuxFggbMK9f5FiqrtWObLgFVtdWnXL7EzWWWWnNnYMAWwDaSUM wVLmDrw3wVGQd8cCNjL36Ch0+031pQb5Cc8uNaR4sstEY6/0uIXk3LE2K 2zMWPmIWc+UU/fhG73oapQTCqfQNu35NM7g70AYpAJiIaZOlDAWXVM1sB A==; X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="415625809" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="415625809" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Feb 2023 21:01:42 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="668657717" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="668657717" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by fmsmga007.fm.intel.com with ESMTP; 06 Feb 2023 21:01:42 -0800 From: Ricardo Neri To: "Peter Zijlstra (Intel)" , Juri Lelli , Vincent Guittot Cc: Ricardo Neri , "Ravi V. Shankar" , Ben Segall , Daniel Bristot de Oliveira , Dietmar Eggemann , Len Brown , Mel Gorman , "Rafael J. Wysocki" , Srinivas Pandruvada , Steven Rostedt , Tim Chen , Valentin Schneider , Lukasz Luba , Ionela Voinescu , x86@kernel.org, "Joel Fernandes (Google)" , linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ricardo Neri , "Tim C . 
Chen" Subject: [PATCH v3 07/24] sched/fair: Compute IPC class scores for load balancing Date: Mon, 6 Feb 2023 21:10:48 -0800 Message-Id: <20230207051105.11575-8-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com> References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Compute the joint total (both current and prospective) IPC class score of a scheduling group and the local scheduling group. These IPCC statistics are used during idle load balancing. The candidate scheduling group will have one fewer busy CPU after load balancing. This observation is important for cores with SMT support. The IPCC score of scheduling groups composed of SMT siblings needs to consider that the siblings share CPU resources. When computing the total IPCC score of the scheduling group, divide the score of each sibling by the number of busy siblings. Collect IPCC statistics for asym_packing and fully_busy scheduling groups. When picking a busiest group, they are used to break ties between otherwise identical groups. Cc: Ben Segall Cc: Daniel Bristot de Oliveira Cc: Dietmar Eggemann Cc: Ionela Voinescu Cc: Joel Fernandes (Google) Cc: Len Brown Cc: Lukasz Luba Cc: Mel Gorman Cc: Rafael J. Wysocki Cc: Srinivas Pandruvada Cc: Steven Rostedt Cc: Tim C. Chen Cc: Valentin Schneider Cc: x86@kernel.org Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Ricardo Neri --- Changes since v2: * Also collect IPCC stats for fully_busy sched groups. * Restrict use of IPCC stats to SD_ASYM_PACKING. (Ionela) * Handle errors of arch_get_ipcc_score(). (Ionela) Changes since v1: * Implemented cleanups and reworks from PeterZ.
I took all his suggestions, except the computation of the IPC score before and after load balancing. We are computing not the average score, but the *total*. * Check for the SD_SHARE_CPUCAPACITY to compute the throughput of the SMT siblings of a physical core. * Used the new interface names. * Reworded commit message for clarity. --- kernel/sched/fair.c | 68 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 68 insertions(+) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index d773380a95b3..b6165aa8a376 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -8901,6 +8901,8 @@ struct sg_lb_stats { unsigned long min_score; /* Min(score(rq->curr->ipcc)) */ unsigned short min_ipcc; /* Class of the task with the minimum IPCC score= in the rq */ unsigned long sum_score; /* Sum(score(rq->curr->ipcc)) */ + long ipcc_score_after; /* Prospective IPCC score after load balancing */ + unsigned long ipcc_score_before; /* IPCC score before load balancing */ #endif }; =20 @@ -9287,6 +9289,62 @@ static void update_sg_lb_ipcc_stats(int dst_cpu, str= uct sg_lb_stats *sgs, } } =20 +static void update_sg_lb_stats_scores(struct sg_lb_stats *sgs, + struct sched_group *sg, + struct lb_env *env) +{ + unsigned long score_on_dst_cpu, before; + int busy_cpus; + long after; + + if (!sched_ipcc_enabled()) + return; + + /* + * IPCC scores are only useful during idle load balancing. For now, + * only asym_packing uses IPCC scores. + */ + if (!(env->sd->flags & SD_ASYM_PACKING) || + env->idle =3D=3D CPU_NOT_IDLE) + return; + + /* + * IPCC scores are used to break ties only between these types of + * groups. + */ + if (sgs->group_type !=3D group_fully_busy && + sgs->group_type !=3D group_asym_packing) + return; + + busy_cpus =3D sgs->group_weight - sgs->idle_cpus; + + /* No busy CPUs in the group. No tasks to move. */ + if (!busy_cpus) + return; + + score_on_dst_cpu =3D arch_get_ipcc_score(sgs->min_ipcc, env->dst_cpu); + + /* + * Do not use IPC scores. 
sgs::ipcc_score_{after, before} will be zero + * and not used. + */ + if (IS_ERR_VALUE(score_on_dst_cpu)) + return; + + before =3D sgs->sum_score; + after =3D before - sgs->min_score; + + /* SMT siblings share throughput. */ + if (busy_cpus > 1 && sg->flags & SD_SHARE_CPUCAPACITY) { + before /=3D busy_cpus; + /* One sibling will become idle after load balance. */ + after /=3D busy_cpus - 1; + } + + sgs->ipcc_score_after =3D after + score_on_dst_cpu; + sgs->ipcc_score_before =3D before; +} + #else /* CONFIG_IPC_CLASSES */ static void update_sg_lb_ipcc_stats(int dst_cpu, struct sg_lb_stats *sgs, struct rq *rq) @@ -9296,6 +9354,13 @@ static void update_sg_lb_ipcc_stats(int dst_cpu, str= uct sg_lb_stats *sgs, static void init_rq_ipcc_stats(struct sg_lb_stats *sgs) { } + +static void update_sg_lb_stats_scores(struct sg_lb_stats *sgs, + struct sched_group *sg, + struct lb_env *env) +{ +} + #endif /* CONFIG_IPC_CLASSES */ =20 /** @@ -9457,6 +9522,9 @@ static inline void update_sg_lb_stats(struct lb_env *= env, =20 sgs->group_type =3D group_classify(env->sd->imbalance_pct, group, sgs); =20 + if (!local_group) + update_sg_lb_stats_scores(sgs, group, env); + /* Computing avg_load makes sense only when group is overloaded */ if (sgs->group_type =3D=3D group_overloaded) sgs->avg_load =3D (sgs->group_load * SCHED_CAPACITY_SCALE) / --=20 2.25.1 From nobody Fri Sep 12 23:47:47 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 96D3EC636CC for ; Tue, 7 Feb 2023 05:02:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230353AbjBGFCm (ORCPT ); Tue, 7 Feb 2023 00:02:42 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59782 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229743AbjBGFB4 (ORCPT ); Tue, 7 Feb 
2023 00:01:56 -0500 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D302A901E; Mon, 6 Feb 2023 21:01:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675746113; x=1707282113; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=aOX48G3IsFHDTBbZEmk9IpTBJOBpnREOfkswBIUTlho=; b=RLeOi2k2COE/zVoVK2CzAPc1wrUepDlkJUycQaQHSSR1n/8qq/0sJlNZ GPYShstWgk8b5Hy9AmRz+aBmQh7NA+pz4GGd8QkkbSKGmfwN+UCzrk526 i/gGkySJivGiS2ha1Gc6hmdZHYqWfhv3/XF9u3+gTwQBtxXxxo0tI2Yff 6JZB2SHgXRlGAzLRUwGAGtLZIaD+iJG8uaQeCc+d5Z+Lrm2hE//h9iFuO NSr7jbawJR/1fYl3d8+eUnHSncWS8BbbHppBztLlLekXu+zeHxCVq/wlX X8NQ33ekK3pVM5e5nGNyIPufIMnbY078qGGFWDIVXVqrur40i825mAhCv A==; X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="415625820" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="415625820" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Feb 2023 21:01:42 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="668657720" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="668657720" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by fmsmga007.fm.intel.com with ESMTP; 06 Feb 2023 21:01:42 -0800 From: Ricardo Neri To: "Peter Zijlstra (Intel)" , Juri Lelli , Vincent Guittot Cc: Ricardo Neri , "Ravi V. Shankar" , Ben Segall , Daniel Bristot de Oliveira , Dietmar Eggemann , Len Brown , Mel Gorman , "Rafael J. Wysocki" , Srinivas Pandruvada , Steven Rostedt , Tim Chen , Valentin Schneider , Lukasz Luba , Ionela Voinescu , x86@kernel.org, "Joel Fernandes (Google)" , linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ricardo Neri , "Tim C . 
Chen" Subject: [PATCH v3 08/24] sched/fair: Use IPCC stats to break ties between asym_packing sched groups Date: Mon, 6 Feb 2023 21:10:49 -0800 Message-Id: <20230207051105.11575-9-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com> References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" As it iterates, update_sd_pick_busiest() keeps selecting as busiest any sched group of identical priority. Since both groups have the same priority, either group is a good choice. The IPCC statistics provide a measure of the throughput before and after load balance. Use them to pick a busiest scheduling group from otherwise identical asym_packing scheduling groups. Pick as busiest the scheduling group that yields a higher IPCC score after load balancing. Cc: Ben Segall Cc: Daniel Bristot de Oliveira Cc: Dietmar Eggemann Cc: Ionela Voinescu Cc: Joel Fernandes (Google) Cc: Len Brown Cc: Lukasz Luba Cc: Mel Gorman Cc: Rafael J. Wysocki Cc: Srinivas Pandruvada Cc: Steven Rostedt Cc: Tim C. Chen Cc: Valentin Schneider Cc: x86@kernel.org Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Ricardo Neri --- Changes since v2: * None Changes since v1: * Added a comment to clarify why sched_asym_prefer() needs a tie breaker only in update_sd_pick_busiest(). (PeterZ) * Renamed functions for accuracy: sched_asym_class_prefer() >> sched_asym_ipcc_prefer() sched_asym_class_pick() >> sched_asym_ipcc_pick() * Reworded commit message for clarity.
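Illustration only, not part of this patch: a minimal userspace sketch of how the before/after scores of the previous patch feed the tie breaker added here, under stated assumptions. struct group_scores, compute_scores() and prefer() are hypothetical names that mirror update_sg_lb_stats_scores() and sched_asym_ipcc_prefer(). The sketch assumes that at least one valid score was collected for the group, so that min_score is part of sum_score.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of the IPCC statistics of one scheduling group. */
struct group_scores {
	unsigned long sum_score;	/* sum of the scores of current tasks */
	unsigned long min_score;	/* score of the task that would move */
	int busy_cpus;			/* non-idle CPUs in the group */
	bool smt;			/* CPUs are SMT siblings of one core */
	long after;			/* prospective score after balancing */
	unsigned long before;		/* score before balancing */
};

/* @score_on_dst: score of the moved task on the destination CPU. */
static void compute_scores(struct group_scores *g, unsigned long score_on_dst)
{
	unsigned long before;
	long after;

	/* No busy CPUs in the group: no task to move, scores stay unset. */
	if (!g->busy_cpus)
		return;

	/* Assumption of this sketch: the minimum is one term of the sum. */
	assert(g->min_score <= g->sum_score);

	before = g->sum_score;
	after = (long)(before - g->min_score);

	/* SMT siblings share throughput; one becomes idle after balancing. */
	if (g->busy_cpus > 1 && g->smt) {
		before /= g->busy_cpus;
		after /= g->busy_cpus - 1;
	}

	g->after = after + (long)score_on_dst;
	g->before = before;
}

/* True if group @a should be preferred over group @b. */
static bool prefer(const struct group_scores *a, const struct group_scores *b)
{
	/* @a yields the higher total score after load balance. */
	if (a->after > b->after)
		return true;

	/* Same prospective score: favor the lower current score. */
	if (a->after == b->after)
		return a->before < b->before;

	return false;
}
```

For example, take two SMT groups with two busy siblings each and sum_score = 400, and a destination score of 250. The group with min_score = 100 ends with after = 300 / 1 + 250 = 550, the group with min_score = 150 ends with after = 250 / 1 + 250 = 500, so the first group is preferred.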
--- kernel/sched/fair.c | 72 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 72 insertions(+) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index b6165aa8a376..841927b9b192 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -9345,6 +9345,60 @@ static void update_sg_lb_stats_scores(struct sg_lb_s= tats *sgs, sgs->ipcc_score_before =3D before; } =20 +/** + * sched_asym_ipcc_prefer - Select a sched group based on its IPCC score + * @a: Load balancing statistics of a sched group + * @b: Load balancing statistics of a second sched group + * + * Returns: true if @a has a higher IPCC score than @b after load balance. + * False otherwise. + */ +static bool sched_asym_ipcc_prefer(struct sg_lb_stats *a, + struct sg_lb_stats *b) +{ + if (!sched_ipcc_enabled()) + return false; + + /* @a increases overall throughput after load balance. */ + if (a->ipcc_score_after > b->ipcc_score_after) + return true; + + /* + * If @a and @b yield the same overall throughput, pick @a if + * its current throughput is lower than that of @b. + */ + if (a->ipcc_score_after =3D=3D b->ipcc_score_after) + return a->ipcc_score_before < b->ipcc_score_before; + + return false; +} + +/** + * sched_asym_ipcc_pick - Select a sched group based on its IPCC score + * @a: A scheduling group + * @b: A second scheduling group + * @a_stats: Load balancing statistics of @a + * @b_stats: Load balancing statistics of @b + * + * Returns: true if @a has the same priority and @a has tasks with IPC cla= sses + * that yield higher overall throughput after load balance. False otherwis= e. + */ +static bool sched_asym_ipcc_pick(struct sched_group *a, + struct sched_group *b, + struct sg_lb_stats *a_stats, + struct sg_lb_stats *b_stats) +{ + /* + * Only use the class-specific preference selection if both sched + * groups have the same priority. 
+ */ + if (arch_asym_cpu_priority(a->asym_prefer_cpu) !=3D + arch_asym_cpu_priority(b->asym_prefer_cpu)) + return false; + + return sched_asym_ipcc_prefer(a_stats, b_stats); +} + #else /* CONFIG_IPC_CLASSES */ static void update_sg_lb_ipcc_stats(int dst_cpu, struct sg_lb_stats *sgs, struct rq *rq) @@ -9361,6 +9415,14 @@ static void update_sg_lb_stats_scores(struct sg_lb_s= tats *sgs, { } =20 +static bool sched_asym_ipcc_pick(struct sched_group *a, + struct sched_group *b, + struct sg_lb_stats *a_stats, + struct sg_lb_stats *b_stats) +{ + return false; +} + #endif /* CONFIG_IPC_CLASSES */ =20 /** @@ -9596,6 +9658,16 @@ static bool update_sd_pick_busiest(struct lb_env *en= v, /* Prefer to move from lowest priority CPU's work */ if (sched_asym_prefer(sg->asym_prefer_cpu, sds->busiest->asym_prefer_cpu= )) return false; + + /* + * Unlike other callers of sched_asym_prefer(), here both @sg + * and @sds::busiest have tasks running. When they have equal + * priority, their IPC class scores can be used to select a + * better busiest. 
+ */ + if (sched_asym_ipcc_pick(sds->busiest, sg, &sds->busiest_stat, sgs)) + return false; + break; =20 case group_misfit_task: --=20 2.25.1 From nobody Fri Sep 12 23:47:47 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E7069C636CC for ; Tue, 7 Feb 2023 05:02:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230349AbjBGFCj (ORCPT ); Tue, 7 Feb 2023 00:02:39 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60068 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230161AbjBGFBz (ORCPT ); Tue, 7 Feb 2023 00:01:55 -0500 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D43E4901F; Mon, 6 Feb 2023 21:01:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675746113; x=1707282113; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=XBeyrfeapZSrbuAay+AhGt8TNoIKKt2H5iHcCn0WtSs=; b=dhD9I/7mLVj2Pggklg/grywvghBoSKaHwulbTuRAAHYuSm415n4FAEfV tYcSr+xDP+sso3ijlWudcK+qfujjfWET/RZHe5mWc7Rcarq7kwPwyaejx 47TbMWIgSkkkWrwRxyQj36YO2mQVrMudO8wQ6VkRhb3nBeHdigjm46e/P UcpJL10fI5HT6suN3NwaKLGJj+pvhcYvFoR/acBhYiGK6LiYaVum9sCkr 0eA0Q3jPQDP1zUSOpT3mVTJ6y95Mj7Ri7Ecd6mFaDJPs4JiLYb/8bPE2h Gacexu2m7mQUSjj3I1fzWPF+F+YLmpc4k+EivlWM2dlB1gDdZCQDiMkc1 A==; X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="415625837" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="415625837" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Feb 2023 21:01:42 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="668657723" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; 
d="scan'208";a="668657723" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by fmsmga007.fm.intel.com with ESMTP; 06 Feb 2023 21:01:42 -0800 From: Ricardo Neri To: "Peter Zijlstra (Intel)" , Juri Lelli , Vincent Guittot Cc: Ricardo Neri , "Ravi V. Shankar" , Ben Segall , Daniel Bristot de Oliveira , Dietmar Eggemann , Len Brown , Mel Gorman , "Rafael J. Wysocki" , Srinivas Pandruvada , Steven Rostedt , Tim Chen , Valentin Schneider , Lukasz Luba , Ionela Voinescu , x86@kernel.org, "Joel Fernandes (Google)" , linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ricardo Neri , "Tim C . Chen" Subject: [PATCH v3 09/24] sched/fair: Use IPCC stats to break ties between fully_busy SMT groups Date: Mon, 6 Feb 2023 21:10:50 -0800 Message-Id: <20230207051105.11575-10-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com> References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" IPCC statistics are used during idle load balancing. After balancing, one of the siblings of an SMT core will become idle. The rest of the busy siblings will enjoy increased throughput. The IPCC statistics provide a measure of the increased throughput. Use them to pick a busiest group from otherwise identical fully_busy scheduling groups (of which the avg_load is equal - and zero). Using IPCC scores to break ties with non-SMT fully_busy sched groups is not necessary. SMT sched groups always need more help. Add a stub sched_asym_ipcc_prefer() for !CONFIG_IPC_CLASSES. Cc: Ben Segall Cc: Daniel Bristot de Oliveira Cc: Dietmar Eggemann Cc: Ionela Voinescu Cc: Joel Fernandes (Google) Cc: Len Brown Cc: Lukasz Luba Cc: Mel Gorman Cc: Rafael J. Wysocki Cc: Srinivas Pandruvada Cc: Steven Rostedt Cc: Tim C.
Chen Cc: Valentin Schneider Cc: x86@kernel.org Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Ricardo Neri --- Changes since v2: * Introduced this patch. Changes since v1: * N/A --- kernel/sched/fair.c | 23 ++++++++++++++++++++--- 1 file changed, 20 insertions(+), 3 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 841927b9b192..72d88270b320 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -9415,6 +9415,12 @@ static void update_sg_lb_stats_scores(struct sg_lb_s= tats *sgs, { } =20 +static bool sched_asym_ipcc_prefer(struct sg_lb_stats *a, + struct sg_lb_stats *b) +{ + return false; +} + static bool sched_asym_ipcc_pick(struct sched_group *a, struct sched_group *b, struct sg_lb_stats *a_stats, @@ -9698,10 +9704,21 @@ static bool update_sd_pick_busiest(struct lb_env *e= nv, if (sgs->avg_load =3D=3D busiest->avg_load) { /* * SMT sched groups need more help than non-SMT groups. - * If @sg happens to also be SMT, either choice is good. */ - if (sds->busiest->flags & SD_SHARE_CPUCAPACITY) - return false; + if (sds->busiest->flags & SD_SHARE_CPUCAPACITY) { + if (!(sg->flags & SD_SHARE_CPUCAPACITY)) + return false; + + /* + * Between two SMT groups, use IPCC scores to pick the + * one that would improve throughput the most (only + * asym_packing uses IPCC scores for now). 
+ */ + if (sched_ipcc_enabled() && + env->sd->flags & SD_ASYM_PACKING && + sched_asym_ipcc_prefer(busiest, sgs)) + return false; + } } =20 break; --=20 2.25.1 From nobody Fri Sep 12 23:47:47 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 16F32C636CD for ; Tue, 7 Feb 2023 05:02:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230359AbjBGFCr (ORCPT ); Tue, 7 Feb 2023 00:02:47 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60084 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229843AbjBGFB4 (ORCPT ); Tue, 7 Feb 2023 00:01:56 -0500 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0E45BC643; Mon, 6 Feb 2023 21:01:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675746114; x=1707282114; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=BrjwTRAQcpfuOeAV7Rg7Zm/X0IgwzgEOiAKjPiNKh1M=; b=ZGnjxVpa8iBCrxYykm4azWnUE82ILNVXXrmb3XL1r1GVuqN/eJz9Fv6+ BG5KtJsW2kvHDhW8u15yamKhtbsHFWIPtI5bZy6b62TNAOHUIfg8BTl6q U+sxFHy32zkOO96x2ls6L01fxLkJfrLFYAAqUmWnvEcC3BtRfXWzNNNnD wP+BBoG13YwvVkI85fwTRqaZp3LSWQKqYSjGYX+zBXadW27ShXC/Q0wv5 6eEhN8XbB0PTIvYMag6ip7fv7IhxmdWx61s09inzXdO9nfMYk/YOUB8lI R8WIEstc3TnFW/izgAh1XJg4cMWsQLMitCr/4HASdp/tYqyQKsmegaFA6 A==; X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="415625849" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="415625849" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Feb 2023 21:01:43 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="668657730" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; 
d="scan'208";a="668657730" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by fmsmga007.fm.intel.com with ESMTP; 06 Feb 2023 21:01:42 -0800 From: Ricardo Neri To: "Peter Zijlstra (Intel)" , Juri Lelli , Vincent Guittot Cc: Ricardo Neri , "Ravi V. Shankar" , Ben Segall , Daniel Bristot de Oliveira , Dietmar Eggemann , Len Brown , Mel Gorman , "Rafael J. Wysocki" , Srinivas Pandruvada , Steven Rostedt , Tim Chen , Valentin Schneider , Lukasz Luba , Ionela Voinescu , x86@kernel.org, "Joel Fernandes (Google)" , linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ricardo Neri , "Tim C . Chen" Subject: [PATCH v3 10/24] sched/fair: Use IPCC scores to select a busiest runqueue Date: Mon, 6 Feb 2023 21:10:51 -0800 Message-Id: <20230207051105.11575-11-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com> References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" For two runqueues of equal priority and an equal number of running tasks, select the one whose current task would have the highest IPC class score if placed on the destination CPU. For now, use IPCC scores only for scheduling domains with the SD_ASYM_PACKING flag. Cc: Ben Segall Cc: Daniel Bristot de Oliveira Cc: Dietmar Eggemann Cc: Ionela Voinescu Cc: Joel Fernandes (Google) Cc: Len Brown Cc: Lukasz Luba Cc: Mel Gorman Cc: Rafael J. Wysocki Cc: Srinivas Pandruvada Cc: Steven Rostedt Cc: Tim C. Chen Cc: Valentin Schneider Cc: x86@kernel.org Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Ricardo Neri --- Changes since v2: * Only use IPCC scores to break ties if the sched domain uses asym_packing. (Ionela) * Handle errors of arch_get_ipcc_score().
(Ionela) Changes since v1: * Fixed a bug when selecting a busiest runqueue: when comparing two runqueues with equal nr_running, we must compute the IPCC score delta of both. * Renamed local variables to improve the layout of the code block. (PeterZ) * Used the new interface names. --- kernel/sched/fair.c | 64 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 64 insertions(+) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 72d88270b320..d3c22dc145f7 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -9399,6 +9399,37 @@ static bool sched_asym_ipcc_pick(struct sched_group = *a, return sched_asym_ipcc_prefer(a_stats, b_stats); } =20 +/** + * ipcc_score_delta - Get the IPCC score delta wrt the load balance's dst_= cpu + * @p: A task + * @env: Load balancing environment + * + * Returns: The IPCC score delta that @p would get if placed in the destin= ation + * CPU of @env. LONG_MIN to indicate that the delta should not be used. + */ +static long ipcc_score_delta(struct task_struct *p, struct lb_env *env) +{ + unsigned long score_src, score_dst; + unsigned short ipcc =3D p->ipcc; + + if (!sched_ipcc_enabled()) + return LONG_MIN; + + /* Only asym_packing uses IPCC scores at the moment. 
*/ + if (!(env->sd->flags & SD_ASYM_PACKING)) + return LONG_MIN; + + score_dst =3D arch_get_ipcc_score(ipcc, env->dst_cpu); + if (IS_ERR_VALUE(score_dst)) + return LONG_MIN; + + score_src =3D arch_get_ipcc_score(ipcc, task_cpu(p)); + if (IS_ERR_VALUE(score_src)) + return LONG_MIN; + + return score_dst - score_src; +} + #else /* CONFIG_IPC_CLASSES */ static void update_sg_lb_ipcc_stats(int dst_cpu, struct sg_lb_stats *sgs, struct rq *rq) @@ -9429,6 +9460,11 @@ static bool sched_asym_ipcc_pick(struct sched_group = *a, return false; } =20 +static long ipcc_score_delta(struct task_struct *p, struct lb_env *env) +{ + return LONG_MIN; +} + #endif /* CONFIG_IPC_CLASSES */ =20 /** @@ -10589,6 +10625,7 @@ static struct rq *find_busiest_queue(struct lb_env = *env, { struct rq *busiest =3D NULL, *rq; unsigned long busiest_util =3D 0, busiest_load =3D 0, busiest_capacity = =3D 1; + long busiest_ipcc_delta =3D LONG_MIN; unsigned int busiest_nr =3D 0; int i; =20 @@ -10705,8 +10742,35 @@ static struct rq *find_busiest_queue(struct lb_env= *env, =20 case migrate_task: if (busiest_nr < nr_running) { + struct task_struct *curr; + busiest_nr =3D nr_running; busiest =3D rq; + + /* + * Remember the IPCC score delta of busiest::curr. + * We may need it to break a tie with other queues + * with equal nr_running. + */ + curr =3D rcu_dereference(busiest->curr); + busiest_ipcc_delta =3D ipcc_score_delta(curr, env); + /* + * If rq and busiest have the same number of running + * tasks and IPC classes are supported, pick rq if doing + * so would give rq::curr a bigger IPC boost on dst_cpu. 
+ */ + } else if (busiest_nr =3D=3D nr_running) { + struct task_struct *curr; + long delta; + + curr =3D rcu_dereference(rq->curr); + delta =3D ipcc_score_delta(curr, env); + + if (busiest_ipcc_delta < delta) { + busiest_ipcc_delta =3D delta; + busiest_nr =3D nr_running; + busiest =3D rq; + } } break; =20 --=20 2.25.1 From nobody Fri Sep 12 23:47:47 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 04832C636CD for ; Tue, 7 Feb 2023 05:02:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230399AbjBGFC5 (ORCPT ); Tue, 7 Feb 2023 00:02:57 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60104 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229878AbjBGFB5 (ORCPT ); Tue, 7 Feb 2023 00:01:57 -0500 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 901C7C678; Mon, 6 Feb 2023 21:01:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675746115; x=1707282115; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=d7YOpMlf1P1ORwPsFp8Wf2z9+WUpcXCD+VCa+NWGzJ0=; b=KgwOpYS9wwI5sG+q542pWz5u/vqrl00gdbipfL2/6e4zuRlSjEa2a1FL EByuNyYUqIIm1c1vhNPO/wgqf+J91afi2g6JOobXDlY4H+cSFYwbpilRE k2W035GmwVnSMmxflsN0H6BOW+8rw4OyOtxAf4PArdcu1YtuXJQjLuDxU UdfT1ID5O/f8u7Lo1tBAdnH3XwyjQoLIOC1CwkU7FvRHl+99J5lfaDCKx KinJoe/DUXiPwN55nUT5Qw8qoJhb7j5HVmHuNplFkU+0dkhqn8Rc+jMg8 msjsuc6nY5h3tVppu8rYD5ZtiImDWUCq9i95V9D/HTtapQu0W+SMGgVrg Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="415625861" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="415625861" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga105.fm.intel.com with 
ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Feb 2023 21:01:43 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="668657733" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="668657733" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by fmsmga007.fm.intel.com with ESMTP; 06 Feb 2023 21:01:43 -0800 From: Ricardo Neri To: "Peter Zijlstra (Intel)" , Juri Lelli , Vincent Guittot Cc: Ricardo Neri , "Ravi V. Shankar" , Ben Segall , Daniel Bristot de Oliveira , Dietmar Eggemann , Len Brown , Mel Gorman , "Rafael J. Wysocki" , Srinivas Pandruvada , Steven Rostedt , Tim Chen , Valentin Schneider , Lukasz Luba , Ionela Voinescu , x86@kernel.org, "Joel Fernandes (Google)" , linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ricardo Neri , "Tim C . Chen" Subject: [PATCH v3 11/24] thermal: intel: hfi: Introduce Intel Thread Director classes Date: Mon, 6 Feb 2023 21:10:52 -0800 Message-Id: <20230207051105.11575-12-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com> References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" On Intel hybrid parts, each type of CPU has specific performance and energy efficiency capabilities. The Intel Thread Director technology extends the Hardware Feedback Interface (HFI) to provide performance and energy efficiency data for advanced classes of instructions. Add support to parse per-class capabilities. Cc: Ben Segall Cc: Daniel Bristot de Oliveira Cc: Dietmar Eggemann Cc: Ionela Voinescu Cc: Joel Fernandes (Google) Cc: Len Brown Cc: Lukasz Luba Cc: Mel Gorman Cc: Rafael J. Wysocki Cc: Srinivas Pandruvada Cc: Steven Rostedt Cc: Tim C. 
Chen Cc: Valentin Schneider Cc: x86@kernel.org Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Ricardo Neri Acked-by: Rafael J. Wysocki --- Changes since v2: * None Changes since v1: * Removed a now obsolete comment. --- drivers/thermal/intel/intel_hfi.c | 30 ++++++++++++++++++++++++------ 1 file changed, 24 insertions(+), 6 deletions(-) diff --git a/drivers/thermal/intel/intel_hfi.c b/drivers/thermal/intel/inte= l_hfi.c index 6e604bda2b93..2527ae3836c7 100644 --- a/drivers/thermal/intel/intel_hfi.c +++ b/drivers/thermal/intel/intel_hfi.c @@ -77,7 +77,7 @@ union cpuid6_edx { * @ee_cap: Energy efficiency capability * * Capabilities of a logical processor in the HFI table. These capabilitie= s are - * unitless. + * unitless and specific to each HFI class. */ struct hfi_cpu_data { u8 perf_cap; @@ -89,7 +89,8 @@ struct hfi_cpu_data { * @perf_updated: Hardware updated performance capabilities * @ee_updated: Hardware updated energy efficiency capabilities * - * Properties of the data in an HFI table. + * Properties of the data in an HFI table. There exists one header per each + * HFI class. 
*/ struct hfi_hdr { u8 perf_updated; @@ -127,16 +128,21 @@ struct hfi_instance { =20 /** * struct hfi_features - Supported HFI features + * @nr_classes: Number of classes supported * @nr_table_pages: Size of the HFI table in 4KB pages * @cpu_stride: Stride size to locate the capability data of a logical * processor within the table (i.e., row stride) + * @class_stride: Stride size to locate a class within the capability + * data of a logical processor or the HFI table header * @hdr_size: Size of the table header * * Parameters and supported features that are common to all HFI instances */ struct hfi_features { + unsigned int nr_classes; size_t nr_table_pages; unsigned int cpu_stride; + unsigned int class_stride; unsigned int hdr_size; }; =20 @@ -333,8 +339,8 @@ static void init_hfi_cpu_index(struct hfi_cpu_info *inf= o) } =20 /* - * The format of the HFI table depends on the number of capabilities that = the - * hardware supports. Keep a data structure to navigate the table. + * The format of the HFI table depends on the number of capabilities and c= lasses + * that the hardware supports. Keep a data structure to navigate the table. */ static void init_hfi_instance(struct hfi_instance *hfi_instance) { @@ -515,18 +521,30 @@ static __init int hfi_parse_features(void) /* The number of 4KB pages required by the table */ hfi_features.nr_table_pages =3D edx.split.table_pages + 1; =20 + /* + * Capability fields of an HFI class are grouped together. Classes are + * contiguous in memory. Hence, use the number of supported features to + * locate a specific class. + */ + hfi_features.class_stride =3D nr_capabilities; + + /* For now, use only one class of the HFI table */ + hfi_features.nr_classes =3D 1; + /* * The header contains change indications for each supported feature. * The size of the table header is rounded up to be a multiple of 8 * bytes. 
*/ - hfi_features.hdr_size =3D DIV_ROUND_UP(nr_capabilities, 8) * 8; + hfi_features.hdr_size =3D DIV_ROUND_UP(nr_capabilities * + hfi_features.nr_classes, 8) * 8; =20 /* * Data of each logical processor is also rounded up to be a multiple * of 8 bytes. */ - hfi_features.cpu_stride =3D DIV_ROUND_UP(nr_capabilities, 8) * 8; + hfi_features.cpu_stride =3D DIV_ROUND_UP(nr_capabilities * + hfi_features.nr_classes, 8) * 8; =20 return 0; } --=20 2.25.1 From nobody Fri Sep 12 23:47:47 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3A4E9C636CC for ; Tue, 7 Feb 2023 05:02:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230376AbjBGFCw (ORCPT ); Tue, 7 Feb 2023 00:02:52 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60020 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229874AbjBGFB5 (ORCPT ); Tue, 7 Feb 2023 00:01:57 -0500 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 90246C67C; Mon, 6 Feb 2023 21:01:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675746115; x=1707282115; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=+1DtJmYoUbMLlAjJEQ1d3h7xcYK1w4mnpef+1fxd8fs=; b=FT6uX/2Z2cJDqxB5UQdPg6yPU4mrWILNCMAbkcAW/F+e/3uy6Qs/0kEF 0XfIcNOwyLA75YRzVxpsWrrf+3UZ3uXHqSvt5KobjKrmL/EsQRSPjSI1h M6BrDUgLkm2WB1VHRiPl2q6HDPof2pE362N5D8yj6dTRGbKjFfAmUHoDk 2vpL+iUZ0xIbh56yd9qHd6WZMi509xW0C8O00Fz6W8XETeQhaDwlzaLKE tIDSGHVgWdEEBwXmBVdeIZdv7uYy+a4H64M9rtpmM2v9NtMPl/4OlZOkW JuNtzcQi/sNh/azzkLMTbA8AP69uc2Flsg1o/GHUXjJ6xRrZEY0YMbB4b A==; X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="415625878" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; 
d="scan'208";a="415625878" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Feb 2023 21:01:44 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="668657740" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="668657740" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by fmsmga007.fm.intel.com with ESMTP; 06 Feb 2023 21:01:43 -0800 From: Ricardo Neri To: "Peter Zijlstra (Intel)" , Juri Lelli , Vincent Guittot Cc: Ricardo Neri , "Ravi V. Shankar" , Ben Segall , Daniel Bristot de Oliveira , Dietmar Eggemann , Len Brown , Mel Gorman , "Rafael J. Wysocki" , Srinivas Pandruvada , Steven Rostedt , Tim Chen , Valentin Schneider , Lukasz Luba , Ionela Voinescu , x86@kernel.org, "Joel Fernandes (Google)" , linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ricardo Neri , "Tim C . Chen" Subject: [PATCH v3 12/24] x86/cpufeatures: Add the Intel Thread Director feature definitions Date: Mon, 6 Feb 2023 21:10:53 -0800 Message-Id: <20230207051105.11575-13-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com> References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Intel Thread Director (ITD) provides hardware resources to classify the current task. The classification reflects the type of instructions that a task currently executes. ITD extends the Hardware Feedback Interface table to provide performance and energy efficiency capabilities for each of the supported classes of tasks. Cc: Ben Segall Cc: Daniel Bristot de Oliveira Cc: Dietmar Eggemann Cc: Ionela Voinescu Cc: Joel Fernandes (Google) Cc: Len Brown Cc: Lukasz Luba Cc: Mel Gorman Cc: Rafael J. 
Wysocki Cc: Srinivas Pandruvada Cc: Steven Rostedt Cc: Tim C. Chen Cc: Valentin Schneider Cc: x86@kernel.org Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Ricardo Neri --- Changes since v2: * None Changes since v1: * Removed dependency on CONFIG_INTEL_THREAD_DIRECTOR. Instead, depend on CONFIG_IPC_CLASSES. * Added DISABLE_ITD to the correct DISABLE_MASK: 14 instead of 13. --- arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/disabled-features.h | 8 +++++++- arch/x86/kernel/cpu/cpuid-deps.c | 1 + 3 files changed, 9 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpuf= eatures.h index fdb8e09234ba..8a6261a5dbbf 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -349,6 +349,7 @@ #define X86_FEATURE_HWP_EPP (14*32+10) /* HWP Energy Perf. Preference */ #define X86_FEATURE_HWP_PKG_REQ (14*32+11) /* HWP Package Level Request */ #define X86_FEATURE_HFI (14*32+19) /* Hardware Feedback Interface */ +#define X86_FEATURE_ITD (14*32+23) /* Intel Thread Director */ =20 /* AMD SVM Feature Identification, CPUID level 0x8000000a (EDX), word 15 */ #define X86_FEATURE_NPT (15*32+ 0) /* Nested Page Table support */ diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/as= m/disabled-features.h index 5dfa4fb76f4b..f8e145a8c5dd 100644 --- a/arch/x86/include/asm/disabled-features.h +++ b/arch/x86/include/asm/disabled-features.h @@ -99,6 +99,12 @@ # define DISABLE_TDX_GUEST (1 << (X86_FEATURE_TDX_GUEST & 31)) #endif =20 +#ifdef CONFIG_IPC_CLASSES +# define DISABLE_ITD 0 +#else +# define DISABLE_ITD (1 << (X86_FEATURE_ITD & 31)) +#endif + /* * Make sure to add features to the correct mask */ @@ -117,7 +123,7 @@ DISABLE_CALL_DEPTH_TRACKING) #define DISABLED_MASK12 0 #define DISABLED_MASK13 0 -#define DISABLED_MASK14 0 +#define DISABLED_MASK14 (DISABLE_ITD) #define DISABLED_MASK15 0 #define DISABLED_MASK16 
(DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UM= IP| \ DISABLE_ENQCMD) diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-d= eps.c index f6748c8bd647..7a87b823eef3 100644 --- a/arch/x86/kernel/cpu/cpuid-deps.c +++ b/arch/x86/kernel/cpu/cpuid-deps.c @@ -81,6 +81,7 @@ static const struct cpuid_dep cpuid_deps[] =3D { { X86_FEATURE_XFD, X86_FEATURE_XSAVES }, { X86_FEATURE_XFD, X86_FEATURE_XGETBV1 }, { X86_FEATURE_AMX_TILE, X86_FEATURE_XFD }, + { X86_FEATURE_ITD, X86_FEATURE_HFI }, {} }; =20 --=20 2.25.1 From nobody Fri Sep 12 23:47:47 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9C02EC636CC for ; Tue, 7 Feb 2023 05:03:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230427AbjBGFDE (ORCPT ); Tue, 7 Feb 2023 00:03:04 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60126 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229949AbjBGFB5 (ORCPT ); Tue, 7 Feb 2023 00:01:57 -0500 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 05AC7CA05; Mon, 6 Feb 2023 21:01:55 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675746115; x=1707282115; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=vDnh1C/cq9wjWDStRuJO1Z3DAdsnd8sqeD4zFb0EIc0=; b=KRH57/e/ImFV0sKKTP80CpIj4dO4n+IZ5L2AQHQEqx+QQjlj1auDvaS6 ImagWXbEZaydTT+vz3o4cHHtECTMXZ9lEnCS6V6Hcj3VwbKk1Npu5lEO2 d8Z8HO7RpvcId73r9XM16gwpA0G7BF3Pq4hXwy12hYctZs1kDnhIEHDZN Hlu2iTzt/oW87/2icGmfrUTU9Frt7SXNYyOCH+qAtLEycVSfbMqeq+aWx ABGC5shFVOvjV+lmwlJO+8IPVkNvYlIalOTBsQSTx7cntDLEXMA4CrNWe bz6J+pEXY6e+tiAvUzLFLXcr4nN/oH75swM+fJrYPAthgRzDAJSSEVxoi A==; X-IronPort-AV: E=McAfee;i="6500,9779,10613"; 
a="415625884" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="415625884" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Feb 2023 21:01:44 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="668657744" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="668657744" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by fmsmga007.fm.intel.com with ESMTP; 06 Feb 2023 21:01:43 -0800 From: Ricardo Neri To: "Peter Zijlstra (Intel)" , Juri Lelli , Vincent Guittot Cc: Ricardo Neri , "Ravi V. Shankar" , Ben Segall , Daniel Bristot de Oliveira , Dietmar Eggemann , Len Brown , Mel Gorman , "Rafael J. Wysocki" , Srinivas Pandruvada , Steven Rostedt , Tim Chen , Valentin Schneider , Lukasz Luba , Ionela Voinescu , x86@kernel.org, "Joel Fernandes (Google)" , linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ricardo Neri , "Tim C . Chen" Subject: [PATCH v3 13/24] thermal: intel: hfi: Store per-CPU IPCC scores Date: Mon, 6 Feb 2023 21:10:54 -0800 Message-Id: <20230207051105.11575-14-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com> References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" The scheduler reads the IPCC scores when balancing load. These reads can be quite frequent. Hardware can also update the HFI table frequently. Concurrent access may cause a lot of lock contention. It gets worse as the number of CPUs increases. Instead, create separate per-CPU IPCC scores that the scheduler can read without the HFI table lock. 
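The lock-free read path described above can be sketched in userspace as follows. This is only an illustration under stated assumptions: C11 relaxed atomics stand in for the kernel's WRITE_ONCE()/READ_ONCE(), a flat heap array stands in for the per-CPU allocation, and the names score_table, score_table_init(), score_publish() and score_read() are hypothetical and not part of this patch. The writer is assumed to already hold whatever lock protects the source table; readers take no lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-CPU score storage. */
struct score_table {
	unsigned int nr_cpus;
	unsigned int nr_classes;
	_Atomic int *scores;	/* nr_cpus * nr_classes entries */
};

/* Allocate and zero the scores. Returns 0 on success, -1 on failure. */
static int score_table_init(struct score_table *t, unsigned int nr_cpus,
			    unsigned int nr_classes)
{
	size_t i, n = (size_t)nr_cpus * nr_classes;

	t->scores = malloc(n * sizeof(*t->scores));
	if (!t->scores)
		return -1;

	for (i = 0; i < n; i++)
		atomic_init(&t->scores[i], 0);

	t->nr_cpus = nr_cpus;
	t->nr_classes = nr_classes;
	return 0;
}

/* Writer side: called while the (assumed) table lock is held. */
static void score_publish(struct score_table *t, unsigned int cpu,
			  unsigned int cls, int score)
{
	assert(cpu < t->nr_cpus && cls < t->nr_classes);
	atomic_store_explicit(&t->scores[cpu * t->nr_classes + cls], score,
			      memory_order_relaxed);
}

/* Reader side: no lock taken, a single untorn load of one score. */
static int score_read(const struct score_table *t, unsigned int cpu,
		      unsigned int cls)
{
	assert(cpu < t->nr_cpus && cls < t->nr_classes);
	return atomic_load_explicit(&t->scores[cpu * t->nr_classes + cls],
				    memory_order_relaxed);
}
```

A reader may observe a score that is one update old, which is the trade-off accepted here in exchange for not contending on the table lock.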
Cc: Ben Segall Cc: Daniel Bristot de Oliveira Cc: Dietmar Eggemann Cc: Ionela Voinescu Cc: Joel Fernandes (Google) Cc: Len Brown Cc: Lukasz Luba Cc: Mel Gorman Cc: Rafael J. Wysocki Cc: Srinivas Pandruvada Cc: Steven Rostedt Cc: Tim C. Chen Cc: Valentin Schneider Cc: x86@kernel.org Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Suggested-by: Peter Zijlstra (Intel) Signed-off-by: Ricardo Neri --- Changes since v2: * Only create these per-CPU variables when Intel Thread Director is supported. Changes since v1: * Added this patch. --- drivers/thermal/intel/intel_hfi.c | 46 +++++++++++++++++++++++++++++++ 1 file changed, 46 insertions(+) diff --git a/drivers/thermal/intel/intel_hfi.c b/drivers/thermal/intel/inte= l_hfi.c index 2527ae3836c7..b06021828892 100644 --- a/drivers/thermal/intel/intel_hfi.c +++ b/drivers/thermal/intel/intel_hfi.c @@ -29,6 +29,7 @@ #include #include #include +#include #include #include #include @@ -170,6 +171,43 @@ static struct workqueue_struct *hfi_updates_wq; #define HFI_UPDATE_INTERVAL HZ #define HFI_MAX_THERM_NOTIFY_COUNT 16 =20 +#ifdef CONFIG_IPC_CLASSES +static int __percpu *hfi_ipcc_scores; + +static int alloc_hfi_ipcc_scores(void) +{ + if (!cpu_feature_enabled(X86_FEATURE_ITD)) + return 0; + + hfi_ipcc_scores =3D __alloc_percpu(sizeof(*hfi_ipcc_scores) * + hfi_features.nr_classes, + sizeof(*hfi_ipcc_scores)); + + return !hfi_ipcc_scores; +} + +static void set_hfi_ipcc_score(void *caps, int cpu) +{ + int i, *hfi_class; + + if (!cpu_feature_enabled(X86_FEATURE_ITD)) + return; + + hfi_class =3D per_cpu_ptr(hfi_ipcc_scores, cpu); + + for (i =3D 0; i < hfi_features.nr_classes; i++) { + struct hfi_cpu_data *class_caps; + + class_caps =3D caps + i * hfi_features.class_stride; + WRITE_ONCE(hfi_class[i], class_caps->perf_cap); + } +} + +#else +static int alloc_hfi_ipcc_scores(void) { return 0; } +static void set_hfi_ipcc_score(void *caps, int cpu) { } +#endif /* CONFIG_IPC_CLASSES */ + static void get_hfi_caps(struct hfi_instance 
*hfi_instance, struct thermal_genl_cpu_caps *cpu_caps) { @@ -192,6 +230,8 @@ static void get_hfi_caps(struct hfi_instance *hfi_insta= nce, cpu_caps[i].efficiency =3D caps->ee_cap << 2; =20 ++i; + + set_hfi_ipcc_score(caps, cpu); } raw_spin_unlock_irq(&hfi_instance->table_lock); } @@ -580,8 +620,14 @@ void __init intel_hfi_init(void) if (!hfi_updates_wq) goto err_nomem; =20 + if (alloc_hfi_ipcc_scores()) + goto err_ipcc; + return; =20 +err_ipcc: + destroy_workqueue(hfi_updates_wq); + err_nomem: for (j =3D 0; j < i; ++j) { hfi_instance =3D &hfi_instances[j]; --=20 2.25.1 From nobody Fri Sep 12 23:47:47 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 00494C636CC for ; Tue, 7 Feb 2023 05:03:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230411AbjBGFDC (ORCPT ); Tue, 7 Feb 2023 00:03:02 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60010 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229960AbjBGFB5 (ORCPT ); Tue, 7 Feb 2023 00:01:57 -0500 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 518F9CDCB; Mon, 6 Feb 2023 21:01:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675746116; x=1707282116; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=3mEF+X1OQu4dffb2wbCkr1lz7Fm3ISrkb07VTK8J8lk=; b=W4bVfXgZ4TDooAdFk0rCdotTO+9FgnsoPB4KQuc1cHFEbK6hlQBY2ZCt LymnpowYpI6ElxLMO2Y4We9FDf9WOlAwChKUXGZVvm1IEShyajI+RjXQc ZryVBGaY5kIRPfmdDsofC+uy4oIHnHvGZ2Km7D/0bR2s9dxjdYZ4Pbiio e7wxwopXaIoB7/Lxff+oNuzh79YrPYLy8OJR7oW5Fp2QendG2bYPDNf6L I3bnIxtMrUiRnTOXGY/9J9bpAy3h7S+Fr05c3RPfPnrGkhLNRMNq7uYnB rWdG6LmTSqDZ5rAHWhfx4Uwbi4P4jNDLQF6mDBdVYk+J6zLS3SpBTCfUq g==; 
X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="415625888" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="415625888" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Feb 2023 21:01:44 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="668657749" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="668657749" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by fmsmga007.fm.intel.com with ESMTP; 06 Feb 2023 21:01:44 -0800 From: Ricardo Neri To: "Peter Zijlstra (Intel)" , Juri Lelli , Vincent Guittot Cc: Ricardo Neri , "Ravi V. Shankar" , Ben Segall , Daniel Bristot de Oliveira , Dietmar Eggemann , Len Brown , Mel Gorman , "Rafael J. Wysocki" , Srinivas Pandruvada , Steven Rostedt , Tim Chen , Valentin Schneider , Lukasz Luba , Ionela Voinescu , x86@kernel.org, "Joel Fernandes (Google)" , linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ricardo Neri , "Tim C . Chen" Subject: [PATCH v3 14/24] thermal: intel: hfi: Update the IPC class of the current task Date: Mon, 6 Feb 2023 21:10:55 -0800 Message-Id: <20230207051105.11575-15-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com> References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Use Intel Thread Director classification to update the IPC class of a task. Implement the arch_update_ipcc() interface of the scheduler. Cc: Ben Segall Cc: Daniel Bristot de Oliveira Cc: Dietmar Eggemann Cc: Ionela Voinescu Cc: Joel Fernandes (Google) Cc: Len Brown Cc: Lukasz Luba Cc: Mel Gorman Cc: Rafael J. Wysocki Cc: Srinivas Pandruvada Cc: Steven Rostedt Cc: Tim C. 
Chen Cc: Valentin Schneider Cc: x86@kernel.org Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Ricardo Neri --- Changes since v2: * Removed the implementation of arch_has_ipc_classes(). Changes since v1: * Adjusted the result the classification of Intel Thread Director to start at class 1. Class 0 for the scheduler means that the task is unclassified. * Redefined union hfi_thread_feedback_char_msr to ensure all bit-fields are packed. (PeterZ) * Removed CONFIG_INTEL_THREAD_DIRECTOR. (PeterZ) * Shortened the names of the functions that implement IPC classes. * Removed argument smt_siblings_idle from intel_hfi_update_ipcc(). (PeterZ) --- arch/x86/include/asm/topology.h | 6 ++++++ drivers/thermal/intel/intel_hfi.c | 32 +++++++++++++++++++++++++++++++ 2 files changed, 38 insertions(+) diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topolog= y.h index 458c891a8273..ffcdac3f398f 100644 --- a/arch/x86/include/asm/topology.h +++ b/arch/x86/include/asm/topology.h @@ -227,4 +227,10 @@ void init_freq_invariance_cppc(void); #define arch_init_invariance_cppc init_freq_invariance_cppc #endif =20 +#if defined(CONFIG_IPC_CLASSES) && defined(CONFIG_INTEL_HFI_THERMAL) +void intel_hfi_update_ipcc(struct task_struct *curr); + +#define arch_update_ipcc intel_hfi_update_ipcc +#endif /* defined(CONFIG_IPC_CLASSES) && defined(CONFIG_INTEL_HFI_THERMAL)= */ + #endif /* _ASM_X86_TOPOLOGY_H */ diff --git a/drivers/thermal/intel/intel_hfi.c b/drivers/thermal/intel/inte= l_hfi.c index b06021828892..530dcf57e06e 100644 --- a/drivers/thermal/intel/intel_hfi.c +++ b/drivers/thermal/intel/intel_hfi.c @@ -72,6 +72,17 @@ union cpuid6_edx { u32 full; }; =20 +#ifdef CONFIG_IPC_CLASSES +union hfi_thread_feedback_char_msr { + struct { + u64 classid : 8; + u64 __reserved : 55; + u64 valid : 1; + } split; + u64 full; +}; +#endif + /** * struct hfi_cpu_data - HFI capabilities per CPU * @perf_cap: Performance capability @@ -174,6 +185,27 @@ static struct 
workqueue_struct *hfi_updates_wq; #ifdef CONFIG_IPC_CLASSES static int __percpu *hfi_ipcc_scores; =20 +void intel_hfi_update_ipcc(struct task_struct *curr) +{ + union hfi_thread_feedback_char_msr msr; + + /* We should not be here if ITD is not supported. */ + if (!cpu_feature_enabled(X86_FEATURE_ITD)) { + pr_warn_once("task classification requested but not supported!"); + return; + } + + rdmsrl(MSR_IA32_HW_FEEDBACK_CHAR, msr.full); + if (!msr.split.valid) + return; + + /* + * 0 is a valid classification for Intel Thread Director. A scheduler + * IPCC class of 0 means that the task is unclassified. Adjust. + */ + curr->ipcc =3D msr.split.classid + 1; +} + static int alloc_hfi_ipcc_scores(void) { if (!cpu_feature_enabled(X86_FEATURE_ITD)) --=20 2.25.1 From nobody Fri Sep 12 23:47:47 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A9C36C636CC for ; Tue, 7 Feb 2023 05:03:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230445AbjBGFDL (ORCPT ); Tue, 7 Feb 2023 00:03:11 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59990 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230092AbjBGFB7 (ORCPT ); Tue, 7 Feb 2023 00:01:59 -0500 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8C74CCDF1; Mon, 6 Feb 2023 21:01:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675746117; x=1707282117; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=A+SMOZzgEsVxAgLjF6sCJ/1gpwfA0tMbTNw8B4DLAIM=; b=fCIWpLnWvVOZblBGvUfz1mfcwQWd0Wj6VTVW0bEwvGTCk23O+ZGxeuld jjtcNSGAX6l4iZ/W9PZlReroJ2vApzJ3USF1g8Lm5qUnqlG7JT1qMWDI0 
H6FNfi+ZqU1aCCQh8BNlveiPN+RIkkMHbCJ10uUqyWtocDETvtzI67hYH hj9Z0//AAgnDnozgPFdehOz5Miy7HNxQ9O9luBHuopvI18PATDNOP04hv 0kmhNvELgkcUgxUUwU9wzBR/rmZt6fH1Go/E3a8qZeLjprt4QZbQHVhjZ uunENef9sTQ0wGONgCJh2cT7ncG+dxubU8Opat1VDUcw4tNLBdVseObBe g==; X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="415625902" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="415625902" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Feb 2023 21:01:44 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="668657752" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="668657752" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by fmsmga007.fm.intel.com with ESMTP; 06 Feb 2023 21:01:44 -0800 From: Ricardo Neri To: "Peter Zijlstra (Intel)" , Juri Lelli , Vincent Guittot Cc: Ricardo Neri , "Ravi V. Shankar" , Ben Segall , Daniel Bristot de Oliveira , Dietmar Eggemann , Len Brown , Mel Gorman , "Rafael J. Wysocki" , Srinivas Pandruvada , Steven Rostedt , Tim Chen , Valentin Schneider , Lukasz Luba , Ionela Voinescu , x86@kernel.org, "Joel Fernandes (Google)" , linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ricardo Neri , "Tim C . Chen" Subject: [PATCH v3 15/24] thermal: intel: hfi: Report the IPC class score of a CPU Date: Mon, 6 Feb 2023 21:10:56 -0800 Message-Id: <20230207051105.11575-16-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com> References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Implement the arch_get_ipcc_score() interface of the scheduler. Use the performance capabilities of the extended Hardware Feedback Interface table as the IPC score. 
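As a rough sketch of how a per-class performance capability can be located in the data of one logical processor, assuming the layout used earlier in this series (capabilities of a class grouped together, classes contiguous, class stride equal to the number of capabilities). The names cpu_caps, round_up8() and row_ipcc_score() are hypothetical; the driver itself reads cached per-CPU scores rather than walking the table on every call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the two-capability layout: performance first, then efficiency. */
struct cpu_caps {
	uint8_t perf_cap;
	uint8_t ee_cap;
};

/* Hypothetical helper: round n up to a multiple of 8 bytes. */
static size_t round_up8(size_t n)
{
	return (n + 7) / 8 * 8;
}

/*
 * Hypothetical helper: return the performance capability of scheduler
 * class ipcc (1-based) from one CPU's row, or -1 if ipcc is out of range.
 * Scheduler class 0 (unclassified) is rejected in this sketch.
 */
static int row_ipcc_score(const uint8_t *row, unsigned int nr_classes,
			  unsigned int class_stride, unsigned int ipcc)
{
	unsigned int hfi_class;

	assert(row != NULL);

	if (ipcc == 0 || ipcc > nr_classes)
		return -1;

	/* Scheduler classes start at 1, HFI classes start at 0. */
	hfi_class = ipcc - 1;

	return row[hfi_class * class_stride +
		   offsetof(struct cpu_caps, perf_cap)];
}
```

With two capabilities and three classes, a row occupies round_up8(2 * 3) = 8 bytes, and class 3 of the scheduler starts at byte offset 4 of that row.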
Cc: Ben Segall Cc: Daniel Bristot de Oliveira Cc: Dietmar Eggemann Cc: Ionela Voinescu Cc: Joel Fernandes (Google) Cc: Len Brown Cc: Lukasz Luba Cc: Mel Gorman Cc: Rafael J. Wysocki Cc: Srinivas Pandruvada Cc: Steven Rostedt Cc: Tim C. Chen Cc: Valentin Schneider Cc: x86@kernel.org Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Ricardo Neri --- Changes since v2: * None Changes since v1: * Adjusted the returned HFI class (which starts at 0) to match the scheduler IPCC class (which starts at 1). (PeterZ) * Used the new interface names. --- arch/x86/include/asm/topology.h | 2 ++ drivers/thermal/intel/intel_hfi.c | 27 +++++++++++++++++++++++++++ 2 files changed, 29 insertions(+) diff --git a/arch/x86/include/asm/topology.h b/arch/x86/include/asm/topolog= y.h index ffcdac3f398f..c4fcd9c3c634 100644 --- a/arch/x86/include/asm/topology.h +++ b/arch/x86/include/asm/topology.h @@ -229,8 +229,10 @@ void init_freq_invariance_cppc(void); =20 #if defined(CONFIG_IPC_CLASSES) && defined(CONFIG_INTEL_HFI_THERMAL) void intel_hfi_update_ipcc(struct task_struct *curr); +unsigned long intel_hfi_get_ipcc_score(unsigned short ipcc, int cpu); =20 #define arch_update_ipcc intel_hfi_update_ipcc +#define arch_get_ipcc_score intel_hfi_get_ipcc_score #endif /* defined(CONFIG_IPC_CLASSES) && defined(CONFIG_INTEL_HFI_THERMAL)= */ =20 #endif /* _ASM_X86_TOPOLOGY_H */ diff --git a/drivers/thermal/intel/intel_hfi.c b/drivers/thermal/intel/inte= l_hfi.c index 530dcf57e06e..fa9b4a678d92 100644 --- a/drivers/thermal/intel/intel_hfi.c +++ b/drivers/thermal/intel/intel_hfi.c @@ -206,6 +206,33 @@ void intel_hfi_update_ipcc(struct task_struct *curr) curr->ipcc =3D msr.split.classid + 1; } =20 +unsigned long intel_hfi_get_ipcc_score(unsigned short ipcc, int cpu) +{ + unsigned short hfi_class; + int *scores; + + if (cpu < 0 || cpu >=3D nr_cpu_ids) + return -EINVAL; + + if (ipcc =3D=3D IPC_CLASS_UNCLASSIFIED) + return -EINVAL; + + /* + * Scheduler IPC classes start at 1. 
HFI classes start at 0. + * See note intel_hfi_update_ipcc(). + */ + hfi_class =3D ipcc - 1; + + if (hfi_class >=3D hfi_features.nr_classes) + return -EINVAL; + + scores =3D per_cpu_ptr(hfi_ipcc_scores, cpu); + if (!scores) + return -ENODEV; + + return READ_ONCE(scores[hfi_class]); +} + static int alloc_hfi_ipcc_scores(void) { if (!cpu_feature_enabled(X86_FEATURE_ITD)) --=20 2.25.1 From nobody Fri Sep 12 23:47:47 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CC5A1C636CC for ; Tue, 7 Feb 2023 05:03:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230433AbjBGFDI (ORCPT ); Tue, 7 Feb 2023 00:03:08 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60486 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230082AbjBGFB7 (ORCPT ); Tue, 7 Feb 2023 00:01:59 -0500 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8C7C7D518; Mon, 6 Feb 2023 21:01:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675746117; x=1707282117; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=L0pxmqCmeAEPVOWhuC7xe5KQw/uWTSvYwYWe6PdvQds=; b=k8lpySZEZuKpnEK6BNh3MqDFNj8noa8cmCiWoDb0FpmCUQuYh2NwIloX Na7gEE8syVSHC4CF6E4fkewSmjcUZLevai8p0s3bodM2YHP2DQxiUqHkS X9kfu8jHMvU0GzwCtoaqkd3YwvzKT/keEfNVk5M+n07Pa+NFwiyz4Qltb kt/XUIaH2tkpyNJd2zHEs7tQZh6oFOUv+jcguXtrEZO4ge9KVVHssA3md EjGRan8X8RDlLc192EpYjS0vRv59HNbF6HJ6gmjJq5DnylQWOYhlBNm2K 0Yws1c81pAawAKeAwqd36Nf5SEHv4K9ewHpLMVw5BA6YUnB8/MwlZOxrN A==; X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="415625907" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="415625907" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) 
by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Feb 2023 21:01:44 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="668657755" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="668657755" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by fmsmga007.fm.intel.com with ESMTP; 06 Feb 2023 21:01:44 -0800 From: Ricardo Neri To: "Peter Zijlstra (Intel)" , Juri Lelli , Vincent Guittot Cc: Ricardo Neri , "Ravi V. Shankar" , Ben Segall , Daniel Bristot de Oliveira , Dietmar Eggemann , Len Brown , Mel Gorman , "Rafael J. Wysocki" , Srinivas Pandruvada , Steven Rostedt , Tim Chen , Valentin Schneider , Lukasz Luba , Ionela Voinescu , x86@kernel.org, "Joel Fernandes (Google)" , linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ricardo Neri , "Tim C . Chen" Subject: [PATCH v3 16/24] thermal: intel: hfi: Define a default class for unclassified tasks Date: Mon, 6 Feb 2023 21:10:57 -0800 Message-Id: <20230207051105.11575-17-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com> References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" A task may be unclassified if it has been recently created, spends most of its lifetime sleeping, or hardware has not provided a classification. Most tasks will eventually be classified as the scheduler's IPC class 1 (HFI class 0). This class corresponds to the capabilities in the legacy, classless, HFI table. IPC class 1 is a reasonable choice until hardware provides an actual classification. Meanwhile, the scheduler will place classes of tasks with higher IPC scores on higher-performance CPUs.
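The fallback described above can be sketched as a small standalone mapping. The values of IPC_CLASS_UNCLASSIFIED (0) and HFI_UNCLASSIFIED_DEFAULT (1) follow this series; the helper name ipcc_to_hfi_class() and its -1 error value are hypothetical and only for illustration.

```c
#include <assert.h>

#define IPC_CLASS_UNCLASSIFIED		0	/* scheduler: no class yet */
#define HFI_UNCLASSIFIED_DEFAULT	1	/* scheduler class used as fallback */

/*
 * Hypothetical helper: map a scheduler IPC class (1-based, 0 means
 * unclassified) to a 0-based HFI class index. Returns -1 when the class
 * is beyond the number of classes the hardware supports.
 */
static int ipcc_to_hfi_class(unsigned short ipcc, unsigned int nr_classes)
{
	/* At least the legacy, classless table (one class) is assumed. */
	assert(nr_classes > 0);

	/* Unclassified tasks are treated as the default class. */
	if (ipcc == IPC_CLASS_UNCLASSIFIED)
		ipcc = HFI_UNCLASSIFIED_DEFAULT;

	if ((unsigned int)(ipcc - 1) >= nr_classes)
		return -1;

	return ipcc - 1;
}
```

With this mapping, an unclassified task and a task in scheduler class 1 both resolve to HFI class 0, so they see the same score until hardware reports something else.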
Cc: Ben Segall Cc: Daniel Bristot de Oliveira Cc: Dietmar Eggemann Cc: Ionela Voinescu Cc: Joel Fernandes (Google) Cc: Len Brown Cc: Lukasz Luba Cc: Mel Gorman Cc: Rafael J. Wysocki Cc: Srinivas Pandruvada Cc: Steven Rostedt Cc: Tim C. Chen Cc: Valentin Schneider Cc: x86@kernel.org Cc: linux-pm@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Ricardo Neri Acked-by: Rafael J. Wysocki --- Changes since v2: * None Changes since v1: * Now the default class is 1. --- drivers/thermal/intel/intel_hfi.c | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/drivers/thermal/intel/intel_hfi.c b/drivers/thermal/intel/inte= l_hfi.c index fa9b4a678d92..7ea6acce7107 100644 --- a/drivers/thermal/intel/intel_hfi.c +++ b/drivers/thermal/intel/intel_hfi.c @@ -185,6 +185,19 @@ static struct workqueue_struct *hfi_updates_wq; #ifdef CONFIG_IPC_CLASSES static int __percpu *hfi_ipcc_scores; =20 +/* + * A task may be unclassified if it has been recently created, spends most = of + * its lifetime sleeping, or hardware has not provided a classification. + * + * Most tasks will be classified as scheduler's IPC class 1 (HFI class 0) + * eventually. Meanwhile, the scheduler will place classes of tasks with h= igher + * IPC scores on higher-performance CPUs. + * + * IPC class 1 is a reasonable choice. It matches the performance capabili= ty + * of the legacy, classless, HFI table. + */ +#define HFI_UNCLASSIFIED_DEFAULT 1 + void intel_hfi_update_ipcc(struct task_struct *curr) { union hfi_thread_feedback_char_msr msr; @@ -215,7 +228,7 @@ unsigned long intel_hfi_get_ipcc_score(unsigned short i= pcc, int cpu) return -EINVAL; =20 if (ipcc =3D=3D IPC_CLASS_UNCLASSIFIED) - return -EINVAL; + ipcc =3D HFI_UNCLASSIFIED_DEFAULT; =20 /* * Scheduler IPC classes start at 1. HFI classes start at 0.
-- 
2.25.1

From nobody Fri Sep 12 23:47:47 2025
From: Ricardo Neri
To: "Peter Zijlstra (Intel)", Juri Lelli, Vincent Guittot
Cc: Ricardo Neri, "Ravi V. Shankar", Ben Segall,
 Daniel Bristot de Oliveira, Dietmar Eggemann, Len Brown, Mel Gorman,
 "Rafael J. Wysocki", Srinivas Pandruvada, Steven Rostedt, Tim Chen,
 Valentin Schneider, Lukasz Luba, Ionela Voinescu, x86@kernel.org,
 "Joel Fernandes (Google)", linux-kernel@vger.kernel.org,
 linux-pm@vger.kernel.org, Ricardo Neri, "Tim C . Chen"
Subject: [PATCH v3 17/24] thermal: intel: hfi: Enable the Intel Thread Director
Date: Mon, 6 Feb 2023 21:10:58 -0800
Message-Id: <20230207051105.11575-18-ricardo.neri-calderon@linux.intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>
References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Enable Intel Thread Director from the CPU hotplug callback: globally
from CPU0 and then enable the thread-classification hardware in each
logical processor individually.

Also, initialize the number of classes supported.

Let the scheduler know that it can start using IPC classes.

Cc: Ben Segall
Cc: Daniel Bristot de Oliveira
Cc: Dietmar Eggemann
Cc: Ionela Voinescu
Cc: Joel Fernandes (Google)
Cc: Len Brown
Cc: Lukasz Luba
Cc: Mel Gorman
Cc: Rafael J. Wysocki
Cc: Srinivas Pandruvada
Cc: Steven Rostedt
Cc: Tim C. Chen
Cc: Valentin Schneider
Cc: x86@kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ricardo Neri
Acked-by: Rafael J. Wysocki
---
Changes since v2:
 * Use the new sched_enable_ipc_classes() interface to enable the use of
   IPC classes in the scheduler.

Changes since v1:
 * None
---
 arch/x86/include/asm/msr-index.h  |  2 ++
 drivers/thermal/intel/intel_hfi.c | 40 +++++++++++++++++++++++++++++--
 2 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index ad35355ee43e..0ea25cc9c621 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -1106,6 +1106,8 @@
 /* Hardware Feedback Interface */
 #define MSR_IA32_HW_FEEDBACK_PTR	0x17d0
 #define MSR_IA32_HW_FEEDBACK_CONFIG	0x17d1
+#define MSR_IA32_HW_FEEDBACK_THREAD_CONFIG 0x17d4
+#define MSR_IA32_HW_FEEDBACK_CHAR	0x17d2
 
 /* x2APIC locked status */
 #define MSR_IA32_XAPIC_DISABLE_STATUS	0xBD
diff --git a/drivers/thermal/intel/intel_hfi.c b/drivers/thermal/intel/intel_hfi.c
index 7ea6acce7107..35d947f47550 100644
--- a/drivers/thermal/intel/intel_hfi.c
+++ b/drivers/thermal/intel/intel_hfi.c
@@ -48,6 +48,8 @@
 /* Hardware Feedback Interface MSR configuration bits */
 #define HW_FEEDBACK_PTR_VALID_BIT		BIT(0)
 #define HW_FEEDBACK_CONFIG_HFI_ENABLE_BIT	BIT(0)
+#define HW_FEEDBACK_CONFIG_ITD_ENABLE_BIT	BIT(1)
+#define HW_FEEDBACK_THREAD_CONFIG_ENABLE_BIT	BIT(0)
 
 /* CPUID detection and enumeration definitions for HFI */
 
@@ -72,6 +74,15 @@ union cpuid6_edx {
 	u32	full;
 };
 
+union cpuid6_ecx {
+	struct {
+		u32	dont_care0:8;
+		u32	nr_classes:8;
+		u32	dont_care1:16;
+	} split;
+	u32	full;
+};
+
 #ifdef CONFIG_IPC_CLASSES
 union hfi_thread_feedback_char_msr {
 	struct {
@@ -506,6 +517,11 @@ void intel_hfi_online(unsigned int cpu)
 
 	init_hfi_cpu_index(info);
 
+	if (cpu_feature_enabled(X86_FEATURE_ITD)) {
+		msr_val = HW_FEEDBACK_THREAD_CONFIG_ENABLE_BIT;
+		wrmsrl(MSR_IA32_HW_FEEDBACK_THREAD_CONFIG, msr_val);
+	}
+
 	/*
 	 * Now check if the HFI instance of the package/die of @cpu has been
 	 * initialized (by checking its header). In such case, all we have to
@@ -561,8 +577,22 @@ void intel_hfi_online(unsigned int cpu)
 	 */
 	rdmsrl(MSR_IA32_HW_FEEDBACK_CONFIG, msr_val);
 	msr_val |= HW_FEEDBACK_CONFIG_HFI_ENABLE_BIT;
+
+	if (cpu_feature_enabled(X86_FEATURE_ITD))
+		msr_val |= HW_FEEDBACK_CONFIG_ITD_ENABLE_BIT;
+
 	wrmsrl(MSR_IA32_HW_FEEDBACK_CONFIG, msr_val);
 
+	/*
+	 * We have all we need to support IPC classes. Task classification is
+	 * now working.
+	 *
+	 * All class scores are zero until after the first HFI update. That is
+	 * OK. The scheduler queries these scores at every load balance.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_ITD))
+		sched_enable_ipc_classes();
+
 unlock:
 	mutex_unlock(&hfi_instance_lock);
 	return;
@@ -640,8 +670,14 @@ static __init int hfi_parse_features(void)
 	 */
 	hfi_features.class_stride = nr_capabilities;
 
-	/* For now, use only one class of the HFI table */
-	hfi_features.nr_classes = 1;
+	if (cpu_feature_enabled(X86_FEATURE_ITD)) {
+		union cpuid6_ecx ecx;
+
+		ecx.full = cpuid_ecx(CPUID_HFI_LEAF);
+		hfi_features.nr_classes = ecx.split.nr_classes;
+	} else {
+		hfi_features.nr_classes = 1;
+	}
 
 	/*
 	 * The header contains change indications for each supported feature.
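
Annotation (not part of the patch): the class-count lookup in
hfi_parse_features() amounts to reading bits 15:8 of the ECX value of
the HFI CPUID leaf, following the union cpuid6_ecx layout in this patch.
A user-space sketch of the same extraction follows.
hfi_nr_classes_from_ecx() and its itd_enabled flag are hypothetical
stand-ins for the kernel's cpuid_ecx() and cpu_feature_enabled() calls,
and shift-and-mask is used here instead of a bit-field union to avoid
relying on compiler-specific bit-field layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the number of HFI classes encoded in a raw
 * ECX value, or 1 when Intel Thread Director is not available (the
 * legacy, classless table).
 */
static unsigned int hfi_nr_classes_from_ecx(uint32_t ecx, bool itd_enabled)
{
	unsigned int nr_classes;

	if (!itd_enabled)
		return 1;

	/* nr_classes occupies bits 15:8, after the 8 dont_care0 bits. */
	nr_classes = (ecx >> 8) & 0xffu;

	/* The field is 8 bits wide, so the result cannot exceed 255. */
	assert(nr_classes <= 0xffu);

	return nr_classes;
}
```

For example, a raw ECX value of 0x00000400 yields 4 classes when ITD is
enabled and 1 class otherwise.
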
-- 
2.25.1

From nobody Fri Sep 12 23:47:47 2025
From: Ricardo Neri
To: "Peter Zijlstra (Intel)", Juri Lelli, Vincent Guittot
Cc: Ricardo Neri, "Ravi V. Shankar", Ben Segall,
 Daniel Bristot de Oliveira, Dietmar Eggemann, Len Brown, Mel Gorman,
 "Rafael J. Wysocki", Srinivas Pandruvada, Steven Rostedt, Tim Chen,
 Valentin Schneider, Lukasz Luba, Ionela Voinescu, x86@kernel.org,
 "Joel Fernandes (Google)", linux-kernel@vger.kernel.org,
 linux-pm@vger.kernel.org, Ricardo Neri, "Tim C . Chen"
Subject: [PATCH v3 18/24] sched/task_struct: Add helpers for IPC classification
Date: Mon, 6 Feb 2023 21:10:59 -0800
Message-Id: <20230207051105.11575-19-ricardo.neri-calderon@linux.intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>
References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The unprocessed classification that hardware provides for a task may not
be usable by the scheduler: the classification may change too frequently
or architectures may want to consider extra factors. For instance, some
processors with Intel Thread Director need to consider the state of the
SMT siblings of a core.

Provide per-task helper variables that architectures can use to
post-process the classification that hardware provides.

Cc: Ben Segall
Cc: Daniel Bristot de Oliveira
Cc: Dietmar Eggemann
Cc: Ionela Voinescu
Cc: Joel Fernandes (Google)
Cc: Len Brown
Cc: Lukasz Luba
Cc: Mel Gorman
Cc: Rafael J. Wysocki
Cc: Srinivas Pandruvada
Cc: Steven Rostedt
Cc: Tim C. Chen
Cc: Valentin Schneider
Cc: x86@kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ricardo Neri
---
Changes since v2:
 * None

Changes since v1:
 * Used bit-fields to fit all the IPC class data in 4 bytes. (PeterZ)
 * Shortened names of the helpers.
 * Renamed helpers with the ipcc_ prefix.
 * Reworded commit message for clarity
---
 include/linux/sched.h | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 10c6abdc3465..45f28a601b3d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1535,7 +1535,17 @@ struct task_struct {
 	 * A hardware-defined classification of task that reflects but is
 	 * not identical to the number of instructions per cycle.
 	 */
-	unsigned short			ipcc;
+	unsigned int			ipcc : 9;
+	/*
+	 * A candidate classification that arch-specific implementations
+	 * qualify for correctness.
+	 */
+	unsigned int			ipcc_tmp : 9;
+	/*
+	 * Counter to filter out transient candidate classifications
+	 * of a task.
+	 */
+	unsigned int			ipcc_cntr : 14;
 #endif
 
 	/*
-- 
2.25.1

From nobody Fri Sep 12 23:47:47 2025
From: Ricardo Neri
To: "Peter Zijlstra (Intel)", Juri Lelli, Vincent Guittot
Cc: Ricardo Neri, "Ravi V. Shankar", Ben Segall,
 Daniel Bristot de Oliveira, Dietmar Eggemann, Len Brown, Mel Gorman,
 "Rafael J. Wysocki", Srinivas Pandruvada, Steven Rostedt, Tim Chen,
 Valentin Schneider, Lukasz Luba, Ionela Voinescu, x86@kernel.org,
 "Joel Fernandes (Google)", linux-kernel@vger.kernel.org,
 linux-pm@vger.kernel.org, Ricardo Neri, "Tim C . Chen"
Subject: [PATCH v3 19/24] sched/core: Initialize helpers of task classification
Date: Mon, 6 Feb 2023 21:11:00 -0800
Message-Id: <20230207051105.11575-20-ricardo.neri-calderon@linux.intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>
References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Just as tasks start life unclassified, initialize the classification
auxiliary variables.

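For illustration, the three per-task fields and their fork-time values
can be modelled in user space as below. This is a sketch, not kernel
code: struct ipcc_state and ipcc_state_init() are hypothetical names
standing in for the task_struct members and the assignments in
__sched_fork(), and IPC_CLASS_UNCLASSIFIED is assumed to be 0.

```c
#include <assert.h>

/* Assumption: scheduler class 0 means "no classification yet". */
#define IPC_CLASS_UNCLASSIFIED	0

/* Hypothetical stand-in for the IPC class members of task_struct. */
struct ipcc_state {
	unsigned int ipcc : 9;		/* classification the scheduler uses */
	unsigned int ipcc_tmp : 9;	/* candidate classification */
	unsigned int ipcc_cntr : 14;	/* counter to filter transient values */
};

/* Put a new task in the unclassified state with a cleared counter. */
static void ipcc_state_init(struct ipcc_state *s)
{
	assert(s);

	s->ipcc = IPC_CLASS_UNCLASSIFIED;
	s->ipcc_tmp = IPC_CLASS_UNCLASSIFIED;
	s->ipcc_cntr = 0;
}
```

The widths add up to 9 + 9 + 14 = 32 bits, so on common ABIs the three
fields share a single unsigned int. A 9-bit class field holds values up
to 511 and the 14-bit counter holds values up to 16383.
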
Cc: Ben Segall
Cc: Daniel Bristot de Oliveira
Cc: Dietmar Eggemann
Cc: Ionela Voinescu
Cc: Joel Fernandes (Google)
Cc: Len Brown
Cc: Lukasz Luba
Cc: Mel Gorman
Cc: Rafael J. Wysocki
Cc: Srinivas Pandruvada
Cc: Steven Rostedt
Cc: Tim C. Chen
Cc: Valentin Schneider
Cc: x86@kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ricardo Neri
---
Changes since v2:
 * None

Changes since v1:
 * None
---
 kernel/sched/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 9f4e9cc16df8..71b4af7ae496 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4430,6 +4430,8 @@ static void __sched_fork(unsigned long clone_flags, struct task_struct *p)
 	p->se.vruntime			= 0;
 #ifdef CONFIG_IPC_CLASSES
 	p->ipcc				= IPC_CLASS_UNCLASSIFIED;
+	p->ipcc_tmp			= IPC_CLASS_UNCLASSIFIED;
+	p->ipcc_cntr			= 0;
 #endif
 	INIT_LIST_HEAD(&p->se.group_node);
 
-- 
2.25.1

From nobody Fri Sep 12 23:47:47 2025
From: Ricardo Neri
To: "Peter Zijlstra (Intel)", Juri Lelli, Vincent Guittot
Cc: Ricardo Neri, "Ravi V. Shankar", Ben Segall,
 Daniel Bristot de Oliveira, Dietmar Eggemann, Len Brown, Mel Gorman,
 "Rafael J. Wysocki", Srinivas Pandruvada, Steven Rostedt, Tim Chen,
 Valentin Schneider, Lukasz Luba, Ionela Voinescu, x86@kernel.org,
 "Joel Fernandes (Google)", linux-kernel@vger.kernel.org,
 linux-pm@vger.kernel.org, Ricardo Neri, "Tim C . Chen"
Subject: [PATCH v3 20/24] sched/fair: Introduce sched_smt_siblings_idle()
Date: Mon, 6 Feb 2023 21:11:01 -0800
Message-Id: <20230207051105.11575-21-ricardo.neri-calderon@linux.intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>
References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

X86 needs to know the idle state of the SMT siblings of a CPU to improve
the accuracy of IPCC classification. X86 implements support for IPC
classes in the thermal HFI driver.

Rename is_core_idle() as sched_smt_siblings_idle() and make it available
outside the scheduler code.

Cc: Ben Segall
Cc: Daniel Bristot de Oliveira
Cc: Dietmar Eggemann
Cc: Len Brown
Cc: Mel Gorman
Cc: Rafael J. Wysocki
Cc: Srinivas Pandruvada
Cc: Steven Rostedt
Cc: Tim C. Chen
Cc: Valentin Schneider
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ricardo Neri
---
is_core_idle() is no longer an inline function after this patch. To rule
out performance degradation, I compared the execution time of the inline
and non-inline versions on a 4-socket Cascade Lake system using the NUMA
stressor of stress-ng:

  $ stress-ng --numa 1500 -t 10m

is_core_idle() was called ~200,000 times. I measured the value of the
TSC counter before and after calling is_core_idle() and computed the
delta value.

I arbitrarily removed outliers (defined as any delta larger than 5000
counts). This required removing ~40 samples.

The table below summarizes the difference in execution time. All
quantities are expressed in TSC counts, except the standard deviation,
expressed as a percentage of the average.

                      Average   Median   Std(%)   Mode
TSCdelta inline        668.76      626    67.24     42
TSCdelta non-inline    677.64      624    67.67     46

All metrics are similar for the inline and non-inline cases.
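
As a rough sketch of the post-processing described above (and not the
code that produced the table), the outlier filter and two of the summary
statistics could be computed as follows. struct tsc_summary and
summarize_deltas() are hypothetical names; the caller would pass 5000 as
max_delta to match the threshold quoted above. The standard deviation
and mode are left out to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical summary of a set of TSC deltas. */
struct tsc_summary {
	size_t kept;		/* samples left after removing outliers */
	double mean;		/* average of the kept samples */
	uint64_t median;	/* median of the kept samples */
};

static int cmp_u64(const void *a, const void *b)
{
	uint64_t x = *(const uint64_t *)a;
	uint64_t y = *(const uint64_t *)b;

	return (x > y) - (x < y);
}

/*
 * Drop every delta larger than @max_delta, then compute the mean and
 * the median of what is left. @deltas is compacted and sorted in place.
 */
static struct tsc_summary summarize_deltas(uint64_t *deltas, size_t n,
					   uint64_t max_delta)
{
	struct tsc_summary s = { 0, 0.0, 0 };
	double sum = 0.0;
	size_t i, kept = 0;

	assert(deltas || n == 0);

	for (i = 0; i < n; i++) {
		if (deltas[i] > max_delta)
			continue;	/* outlier */
		sum += (double)deltas[i];
		deltas[kept++] = deltas[i];
	}

	if (!kept)
		return s;

	qsort(deltas, kept, sizeof(*deltas), cmp_u64);

	s.kept = kept;
	s.mean = sum / (double)kept;
	if (kept % 2)
		s.median = deltas[kept / 2];
	else
		s.median = (deltas[kept / 2 - 1] + deltas[kept / 2]) / 2;

	return s;
}
```

For the samples {600, 700, 9000, 650, 620} and a threshold of 5000, the
9000-count sample is dropped, leaving four samples with a mean of 642.5
and a median of 635.
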
---
Changes since v2:
 * Brought back this previously dropped patch.
 * Profiled inline vs non-inline is_core_idle(). I found no major
   penalty.
 * Merged is_core_idle() and sched_smt_siblings_idle() into a single
   function. (Dietmar)

Changes since v1:
 * Dropped this patch.
---
 include/linux/sched.h |  2 ++
 kernel/sched/fair.c   | 21 +++++++++++++++------
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 45f28a601b3d..7ef9fd84e7ad 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2449,4 +2449,6 @@ static inline void sched_core_fork(struct task_struct *p) { }
 
 extern void sched_set_stop_task(int cpu, struct task_struct *stop);
 
+extern bool sched_smt_siblings_idle(int cpu);
+
 #endif
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d3c22dc145f7..a66d86c5cb5c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1064,7 +1064,14 @@ update_stats_curr_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
  * Scheduling class queueing methods:
  */
 
-static inline bool is_core_idle(int cpu)
+/**
+ * sched_smt_siblings_idle - Check whether SMT siblings of a CPU are idle
+ * @cpu:	The CPU to check
+ *
+ * Returns true if all the SMT siblings of @cpu are idle or @cpu does not have
+ * SMT siblings. The idle state of @cpu is not considered.
+ */
+bool sched_smt_siblings_idle(int cpu)
 {
 #ifdef CONFIG_SCHED_SMT
 	int sibling;
@@ -1767,7 +1774,7 @@ static inline int numa_idle_core(int idle_core, int cpu)
 	 * Prefer cores instead of packing HT siblings
 	 * and triggering future load balancing.
 	 */
-	if (is_core_idle(cpu))
+	if (sched_smt_siblings_idle(cpu))
 		idle_core = cpu;
 
 	return idle_core;
@@ -9518,7 +9525,8 @@ sched_asym(struct lb_env *env, struct sd_lb_stats *sds, struct sg_lb_stats *sgs
 	 * If the destination CPU has SMT siblings, env->idle != CPU_NOT_IDLE
 	 * is not sufficient. We need to make sure the whole core is idle.
 	 */
-	if (sds->local->flags & SD_SHARE_CPUCAPACITY && !is_core_idle(env->dst_cpu))
+	if (sds->local->flags & SD_SHARE_CPUCAPACITY &&
+	    !sched_smt_siblings_idle(env->dst_cpu))
 		return false;
 
 	/* Only do SMT checks if either local or candidate have SMT siblings. */
@@ -10687,7 +10695,8 @@ static struct rq *find_busiest_queue(struct lb_env *env,
 		    sched_asym_prefer(i, env->dst_cpu) &&
 		    nr_running == 1) {
 			if (env->sd->flags & SD_SHARE_CPUCAPACITY ||
-			    (!(env->sd->flags & SD_SHARE_CPUCAPACITY) && is_core_idle(i)))
+			    (!(env->sd->flags & SD_SHARE_CPUCAPACITY) &&
+			     sched_smt_siblings_idle(i)))
 				continue;
 		}
 
@@ -10816,7 +10825,7 @@ asym_active_balance(struct lb_env *env)
 		 * busy sibling.
 		 */
 		return sched_asym_prefer(env->dst_cpu, env->src_cpu) ||
-		       !is_core_idle(env->src_cpu);
+		       !sched_smt_siblings_idle(env->src_cpu);
 	}
 
 	return false;
@@ -11563,7 +11572,7 @@ static void nohz_balancer_kick(struct rq *rq)
 				 */
 				if (sd->flags & SD_SHARE_CPUCAPACITY ||
 				    (!(sd->flags & SD_SHARE_CPUCAPACITY) &&
-				     is_core_idle(i))) {
+				     sched_smt_siblings_idle(i))) {
 					flags = NOHZ_STATS_KICK | NOHZ_BALANCE_KICK;
 					goto unlock;
 				}
-- 
2.25.1

From nobody Fri Sep 12 23:47:47 2025
From: Ricardo Neri
To: "Peter Zijlstra (Intel)", Juri Lelli, Vincent Guittot
Cc: Ricardo Neri, "Ravi V. Shankar", Ben Segall,
 Daniel Bristot de Oliveira, Dietmar Eggemann, Len Brown, Mel Gorman,
 "Rafael J. Wysocki", Srinivas Pandruvada, Steven Rostedt, Tim Chen,
 Valentin Schneider, Lukasz Luba, Ionela Voinescu, x86@kernel.org,
 "Joel Fernandes (Google)", linux-kernel@vger.kernel.org,
 linux-pm@vger.kernel.org, Ricardo Neri, "Tim C . Chen"
Subject: [PATCH v3 21/24] thermal: intel: hfi: Implement model-specific checks for task classification
Date: Mon, 6 Feb 2023 21:11:02 -0800
Message-Id: <20230207051105.11575-22-ricardo.neri-calderon@linux.intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>
References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

In Alder Lake and Raptor Lake, the result of thread classification is
more accurate when only one SMT sibling is busy. Classification results
for class 2 and 3 are always reliable.

To avoid unnecessary migrations, only update the class of a task if it
has been the same during 4 consecutive user ticks.

Cc: Ben Segall
Cc: Daniel Bristot de Oliveira
Cc: Dietmar Eggemann
Cc: Ionela Voinescu
Cc: Joel Fernandes (Google)
Cc: Len Brown
Cc: Lukasz Luba
Cc: Mel Gorman
Cc: Rafael J. Wysocki
Cc: Srinivas Pandruvada
Cc: Steven Rostedt
Cc: Tim C. Chen
Cc: Valentin Schneider
Cc: x86@kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ricardo Neri
---
Changes since v2:
 * None

Changes since v1:
 * Adjusted the result of the classification of Intel Thread Director to
   start at class 1. Class 0 for the scheduler means that the task is
   unclassified.
 * Used the new names of the IPC classes members in task_struct.
 * Reworked helper functions to use sched_smt_siblings_idle() to query
   the idle state of the SMT siblings of a CPU.
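
The debounce behaviour described in the commit message can be exercised
in user space with a small model. This is a sketch under assumptions,
not the driver code: struct task_ipcc is a hypothetical stand-in for the
three IPC class members of task_struct, the goto-based flow is rewritten
as an if/else chain with the same outcome, and the model-specific
accuracy check is left out.

```c
#include <assert.h>

#define CLASS_DEBOUNCER_SKIPS	4

/* Hypothetical stand-in for the IPC class members of task_struct. */
struct task_ipcc {
	unsigned int ipcc : 9;		/* class the scheduler uses */
	unsigned int ipcc_tmp : 9;	/* last candidate class seen */
	unsigned int ipcc_cntr : 14;	/* consecutive ticks with ipcc_tmp */
};

/*
 * Commit @new_ipcc to p->ipcc only after it has been observed during
 * CLASS_DEBOUNCER_SKIPS consecutive calls (one call per user tick).
 */
static void debounce_and_update_class(struct task_ipcc *p,
				      unsigned int new_ipcc)
{
	assert(p);

	if (p->ipcc_tmp != new_ipcc) {
		/* The candidate changed. Only restart the counter. */
		p->ipcc_cntr = 1;
	} else if (p->ipcc_cntr + 1 < CLASS_DEBOUNCER_SKIPS) {
		/* Same candidate, but not seen often enough yet. */
		p->ipcc_cntr++;
	} else {
		/* Same candidate for CLASS_DEBOUNCER_SKIPS ticks. */
		p->ipcc = new_ipcc;
	}

	p->ipcc_tmp = new_ipcc;
}
```

Starting from an unclassified task (all fields zero), three consecutive
ticks reporting class 2 leave ipcc at 0 and the fourth tick commits
class 2. A single tick reporting class 3 afterwards restarts the counter
at 1 and leaves ipcc at 2.
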
---
 drivers/thermal/intel/intel_hfi.c | 60 ++++++++++++++++++++++++++++++-
 1 file changed, 59 insertions(+), 1 deletion(-)

diff --git a/drivers/thermal/intel/intel_hfi.c b/drivers/thermal/intel/intel_hfi.c
index 35d947f47550..fdb53e4cabc1 100644
--- a/drivers/thermal/intel/intel_hfi.c
+++ b/drivers/thermal/intel/intel_hfi.c
@@ -40,6 +40,7 @@
 #include
 
 #include
+#include
 
 #include "../thermal_core.h"
 #include "intel_hfi.h"
@@ -209,9 +210,64 @@ static int __percpu *hfi_ipcc_scores;
  */
 #define HFI_UNCLASSIFIED_DEFAULT 1
 
+#define CLASS_DEBOUNCER_SKIPS 4
+
+/**
+ * debounce_and_update_class() - Process and update a task's classification
+ *
+ * @p:		The task of which the classification will be updated
+ * @new_ipcc:	The new IPC classification
+ *
+ * Update the classification of @p with the new value that hardware provides.
+ * Only update the classification of @p if it has been the same during
+ * CLASS_DEBOUNCER_SKIPS consecutive ticks.
+ */
+static void debounce_and_update_class(struct task_struct *p, u8 new_ipcc)
+{
+	u16 debounce_skip;
+
+	/* The class of @p changed. Only restart the debounce counter. */
+	if (p->ipcc_tmp != new_ipcc) {
+		p->ipcc_cntr = 1;
+		goto out;
+	}
+
+	/*
+	 * The class of @p did not change. Update it if it has been the same
+	 * for CLASS_DEBOUNCER_SKIPS user ticks.
+	 */
+	debounce_skip = p->ipcc_cntr + 1;
+	if (debounce_skip < CLASS_DEBOUNCER_SKIPS)
+		p->ipcc_cntr++;
+	else
+		p->ipcc = new_ipcc;
+
+out:
+	p->ipcc_tmp = new_ipcc;
+}
+
+static bool classification_is_accurate(u8 hfi_class, bool smt_siblings_idle)
+{
+	switch (boot_cpu_data.x86_model) {
+	case INTEL_FAM6_ALDERLAKE:
+	case INTEL_FAM6_ALDERLAKE_L:
+	case INTEL_FAM6_RAPTORLAKE:
+	case INTEL_FAM6_RAPTORLAKE_P:
+	case INTEL_FAM6_RAPTORLAKE_S:
+		if (hfi_class == 3 || hfi_class == 2 || smt_siblings_idle)
+			return true;
+
+		return false;
+
+	default:
+		return true;
+	}
+}
+
 void intel_hfi_update_ipcc(struct task_struct *curr)
 {
 	union hfi_thread_feedback_char_msr msr;
+	bool idle;
 
 	/* We should not be here if ITD is not supported. */
 	if (!cpu_feature_enabled(X86_FEATURE_ITD)) {
@@ -227,7 +283,9 @@ void intel_hfi_update_ipcc(struct task_struct *curr)
 	 * 0 is a valid classification for Intel Thread Director. A scheduler
 	 * IPCC class of 0 means that the task is unclassified. Adjust.
*/ - curr->ipcc =3D msr.split.classid + 1; + idle =3D sched_smt_siblings_idle(task_cpu(curr)); + if (classification_is_accurate(msr.split.classid, idle)) + debounce_and_update_class(curr, msr.split.classid + 1); } =20 unsigned long intel_hfi_get_ipcc_score(unsigned short ipcc, int cpu) --=20 2.25.1 From nobody Fri Sep 12 23:47:47 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8CA2EC636CD for ; Tue, 7 Feb 2023 05:03:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230487AbjBGFDn (ORCPT ); Tue, 7 Feb 2023 00:03:43 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60044 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230176AbjBGFCC (ORCPT ); Tue, 7 Feb 2023 00:02:02 -0500 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1BD801E9F5; Mon, 6 Feb 2023 21:02:00 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675746121; x=1707282121; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=goElNzOu/cMaBcsm8/fX9KbVxksjtxFjfuJIL57mdrg=; b=WpC0sW0pWEipNNeTvEKSOjPIUQuuytPxTSJoQTE1NqlGPAH4vgDuToLZ 5VYDwr2L5bwivqAu7eXIDwl/llPbnTLCFP1O7rcxdsLdDkg+6sgNnNlXx LbTqYXMp7GZ7AFBpj6kpHFd1woQMrqYW5B4pRtWenUHEcpGNCbhgJzslC OvLCDU5Dn9qoeGWuqMGEhG1rqIt+wvj1rlqkBHnfnyQTBJSgHJFJEVe08 Aey3i3l3V11bWCUl7AyGfkqdKmppGHIkjmwjo3l/57BAEGAnQIuh/8/sK L/Vci0tKsojEYaRyr0i+eGjx+jx7K/ay2wheXK4U9k9xRZkwjOS6j9jVl g==; X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="415625971" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="415625971" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Feb 2023 
21:01:46 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="668657783" X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="668657783" Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by fmsmga007.fm.intel.com with ESMTP; 06 Feb 2023 21:01:46 -0800 From: Ricardo Neri To: "Peter Zijlstra (Intel)" , Juri Lelli , Vincent Guittot Cc: Ricardo Neri , "Ravi V. Shankar" , Ben Segall , Daniel Bristot de Oliveira , Dietmar Eggemann , Len Brown , Mel Gorman , "Rafael J. Wysocki" , Srinivas Pandruvada , Steven Rostedt , Tim Chen , Valentin Schneider , Lukasz Luba , Ionela Voinescu , x86@kernel.org, "Joel Fernandes (Google)" , linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ricardo Neri , "Tim C . Chen" Subject: [PATCH v3 22/24] x86/cpufeatures: Add feature bit for HRESET Date: Mon, 6 Feb 2023 21:11:03 -0800 Message-Id: <20230207051105.11575-23-ricardo.neri-calderon@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com> References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" The HRESET instruction prevents the classification of the current task from influencing the classification of the next task when running serially on the same logical processor. Cc: Ben Segall Cc: Daniel Bristot de Oliveira Cc: Dietmar Eggemann Cc: Ionela Voinescu Cc: Joel Fernandes (Google) Cc: Len Brown Cc: Lukasz Luba Cc: Mel Gorman Cc: Rafael J. Wysocki Cc: Srinivas Pandruvada Cc: Steven Rostedt Cc: Tim C. 
Chen
Cc: Valentin Schneider
Cc: x86@kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ricardo Neri
---
Changes since v2:
 * None

Changes since v1:
 * None
---
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/include/asm/msr-index.h   | 4 +++-
 arch/x86/kernel/cpu/scattered.c    | 1 +
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 8a6261a5dbbf..eb859a82b22a 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -309,6 +309,7 @@
 #define X86_FEATURE_MSR_TSX_CTRL (11*32+20) /* "" MSR IA32_TSX_CTRL (Intel) implemented */
 #define X86_FEATURE_SMBA (11*32+21) /* "" Slow Memory Bandwidth Allocation */
 #define X86_FEATURE_BMEC (11*32+22) /* "" Bandwidth Monitoring Event Configuration */
+#define X86_FEATURE_HRESET (11*32+23) /* Hardware history reset instruction */
 
 /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
 #define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 0ea25cc9c621..dc96944d61a6 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -1109,6 +1109,9 @@
 #define MSR_IA32_HW_FEEDBACK_THREAD_CONFIG 0x17d4
 #define MSR_IA32_HW_FEEDBACK_CHAR 0x17d2
 
+/* Hardware History Reset */
+#define MSR_IA32_HW_HRESET_ENABLE 0x17da
+
 /* x2APIC locked status */
 #define MSR_IA32_XAPIC_DISABLE_STATUS 0xBD
 #define LEGACY_XAPIC_DISABLED BIT(0) /*
@@ -1116,5 +1119,4 @@
 	* disabling x2APIC will cause
 	* a #GP
 	*/
-
 #endif /* _ASM_X86_MSR_INDEX_H */
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 0dad49a09b7a..cb8a0e7a4fdb 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -28,6 +28,7 @@ static const struct cpuid_bit cpuid_bits[] = {
 	{ X86_FEATURE_EPB, CPUID_ECX, 3, 0x00000006, 0 },
 	{
X86_FEATURE_INTEL_PPIN, CPUID_EBX, 0, 0x00000007, 1 },
 	{ X86_FEATURE_RRSBA_CTRL, CPUID_EDX, 2, 0x00000007, 2 },
+	{ X86_FEATURE_HRESET, CPUID_EAX, 22, 0x00000007, 1 },
 	{ X86_FEATURE_CQM_LLC, CPUID_EDX, 1, 0x0000000f, 0 },
 	{ X86_FEATURE_CQM_OCCUP_LLC, CPUID_EDX, 0, 0x0000000f, 1 },
 	{ X86_FEATURE_CQM_MBM_TOTAL, CPUID_EDX, 1, 0x0000000f, 1 },
-- 
2.25.1

From nobody Fri Sep 12 23:47:47 2025
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B82FAC636CD for ; Tue, 7 Feb 2023 05:03:33 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230478AbjBGFDa (ORCPT ); Tue, 7 Feb 2023 00:03:30 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60756 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230163AbjBGFCA (ORCPT ); Tue, 7 Feb 2023 00:02:00 -0500
Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 78AED126F6; Mon, 6 Feb 2023 21:01:59 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675746119; x=1707282119; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=4NcODw8HpUR3D/VhBDlDDOdaVqVUmE0eJRth2Ko7xT0=; b=JTCe7QWJrZ43mMmzQe7vBNEHjQFNVR9x/ficCxEVsfVMfpTJmx1GDxWd kgkLGIUWGmLtwiHUAuW5BARF7HcmYgzS3G06+6y+2xwbrdqpBl1s7ujJU m9xsizrNkL0lKmavliD8wQkmzlDm6Pme5DKI2nbcWTksUxGqkfAonv0A+ mvX8F89+N681tfAR245Dp4g9Mj5RdTgZPUHD5K7CaD5f3ecqDl1bR3Cd/ ONitshFa7Yrd9GNCqmGbKmDFTfEhtSCJydcAH3f9hsCXQZdaIYOwRpa23 K0TeVSHFxByECjRDK8fIkppkGs8LVbujhL74xJRtJrGfQ5UBhf/axgC9e Q==;
X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="415625983"
X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="415625983"
Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga105.fm.intel.com with
 ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Feb 2023 21:01:47 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="668657790"
X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="668657790"
Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by fmsmga007.fm.intel.com with ESMTP; 06 Feb 2023 21:01:46 -0800
From: Ricardo Neri
To: "Peter Zijlstra (Intel)", Juri Lelli, Vincent Guittot
Cc: Ricardo Neri, "Ravi V. Shankar", Ben Segall, Daniel Bristot de Oliveira, Dietmar Eggemann, Len Brown, Mel Gorman, "Rafael J. Wysocki", Srinivas Pandruvada, Steven Rostedt, Tim Chen, Valentin Schneider, Lukasz Luba, Ionela Voinescu, x86@kernel.org, "Joel Fernandes (Google)", linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ricardo Neri, "Tim C . Chen"
Subject: [PATCH v3 23/24] x86/hreset: Configure history reset
Date: Mon, 6 Feb 2023 21:11:04 -0800
Message-Id: <20230207051105.11575-24-ricardo.neri-calderon@linux.intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>
References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Configure the MSR that controls the behavior of HRESET on each logical
processor.

Cc: Ben Segall
Cc: Daniel Bristot de Oliveira
Cc: Dietmar Eggemann
Cc: Ionela Voinescu
Cc: Joel Fernandes (Google)
Cc: Len Brown
Cc: Lukasz Luba
Cc: Mel Gorman
Cc: Rafael J. Wysocki
Cc: Srinivas Pandruvada
Cc: Steven Rostedt
Cc: Tim C. Chen
Cc: Valentin Schneider
Cc: x86@kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ricardo Neri
---
Changes since v2:
 * None

Changes since v1:
 * Marked hardware_history_features as __ro_after_init instead of
   __read_mostly.
   (PeterZ)
---
 arch/x86/kernel/cpu/common.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 831a1a07d357..f3f936f7de5f 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -412,6 +412,26 @@ static __always_inline void setup_umip(struct cpuinfo_x86 *c)
 	cr4_clear_bits(X86_CR4_UMIP);
 }
 
+static u32 hardware_history_features __ro_after_init;
+
+static __always_inline void setup_hreset(struct cpuinfo_x86 *c)
+{
+	if (!cpu_feature_enabled(X86_FEATURE_HRESET))
+		return;
+
+	/*
+	 * Use on all CPUs the hardware history features that the boot
+	 * CPU supports.
+	 */
+	if (c == &boot_cpu_data)
+		hardware_history_features = cpuid_ebx(0x20);
+
+	if (!hardware_history_features)
+		return;
+
+	wrmsrl(MSR_IA32_HW_HRESET_ENABLE, hardware_history_features);
+}
+
 /* These bits should not change their value after CPU init is finished. */
 static const unsigned long cr4_pinned_mask =
 	X86_CR4_SMEP | X86_CR4_SMAP | X86_CR4_UMIP |
@@ -1848,10 +1868,11 @@ static void identify_cpu(struct cpuinfo_x86 *c)
 	/* Disable the PN if appropriate */
 	squash_the_stupid_serial_number(c);
 
-	/* Set up SMEP/SMAP/UMIP */
+	/* Set up SMEP/SMAP/UMIP/HRESET */
 	setup_smep(c);
 	setup_smap(c);
 	setup_umip(c);
+	setup_hreset(c);
 
 	/* Enable FSGSBASE instructions if available.
*/
 	if (cpu_has(c, X86_FEATURE_FSGSBASE)) {
-- 
2.25.1

From nobody Fri Sep 12 23:47:47 2025
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A05F8C636D4 for ; Tue, 7 Feb 2023 05:03:47 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230493AbjBGFDq (ORCPT ); Tue, 7 Feb 2023 00:03:46 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60582 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230137AbjBGFCC (ORCPT ); Tue, 7 Feb 2023 00:02:02 -0500
Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 64BAD196BA; Mon, 6 Feb 2023 21:02:00 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1675746120; x=1707282120; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=un8n4Xz9yXThF7EWOBFRrgSk/o9VXPxzUfeccAO2+YA=; b=A1K+TFOmKRVUM96AxuUVd4h3fifqUoc5xeYFhTBA5F+eu+psGs+Z+nHv KTqb2ir9HdMKJ8pZOw75N+KgLixp8pfAxpjXDGqcb6OSRjnLIVUvPhkdt Z2obMioz5pNcPZ/icAu87MBHUTCZuU0jRVrJMapqDpKzSsLNjmy+H44nQ OgRLhcOYgOVewXrILdyMssmxxF3ku+hwmS4ZXbepNjk3c+ae8kahRFg9c 3xNrHq3qDFSk5AvF5DhiizIhlqsJszoxtwFS+ho0WBH2gbNT2vobQp34W gHcNUaqVj5ROf2WOsNByAgKvYwYkS8xtQiMPt8Qn7G1ZPcV8iQDpOOQuH g==;
X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="415625995"
X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="415625995"
Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 06 Feb 2023 21:01:47 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=McAfee;i="6500,9779,10613"; a="668657794"
X-IronPort-AV: E=Sophos;i="5.97,278,1669104000"; d="scan'208";a="668657794"
Received: from ranerica-svr.sc.intel.com ([172.25.110.23]) by
 fmsmga007.fm.intel.com with ESMTP; 06 Feb 2023 21:01:47 -0800
From: Ricardo Neri
To: "Peter Zijlstra (Intel)", Juri Lelli, Vincent Guittot
Cc: Ricardo Neri, "Ravi V. Shankar", Ben Segall, Daniel Bristot de Oliveira, Dietmar Eggemann, Len Brown, Mel Gorman, "Rafael J. Wysocki", Srinivas Pandruvada, Steven Rostedt, Tim Chen, Valentin Schneider, Lukasz Luba, Ionela Voinescu, x86@kernel.org, "Joel Fernandes (Google)", linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org, Ricardo Neri, "Tim C . Chen"
Subject: [PATCH v3 24/24] x86/process: Reset hardware history in context switch
Date: Mon, 6 Feb 2023 21:11:05 -0800
Message-Id: <20230207051105.11575-25-ricardo.neri-calderon@linux.intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>
References: <20230207051105.11575-1-ricardo.neri-calderon@linux.intel.com>
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Reset the classification history of the current task when switching to the
next task. Hardware will start the classification of the next task from
scratch.

Cc: Ben Segall
Cc: Daniel Bristot de Oliveira
Cc: Dietmar Eggemann
Cc: Ionela Voinescu
Cc: Joel Fernandes (Google)
Cc: Len Brown
Cc: Lukasz Luba
Cc: Mel Gorman
Cc: Rafael J. Wysocki
Cc: Srinivas Pandruvada
Cc: Steven Rostedt
Cc: Tim C. Chen
Cc: Valentin Schneider
Cc: x86@kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ricardo Neri
---
Changes since v2:
 * None

Changes since v1:
 * Measurements of the cost of the HRESET instruction

   Methodology:
   I created a tight loop with interrupts and preemption disabled. I
   recorded the value of the TSC counter before and after executing
   HRESET or RDTSC. I repeated the measurement 100,000 times.
   I performed the experiment using an Alder Lake S system.
   I set the frequency of the CPUs at a fixed value.

   The table below compares the cost of HRESET with RDTSC (expressed in
   the elapsed TSC count). The cost of the two instructions is
   comparable.

                          PCore    ECore
   Frequency (GHz)          5.0      3.8
   HRESET (avg)            28.5     44.7
   HRESET (stdev %)         3.6      2.3
   RDTSC (avg)             25.2     35.7
   RDTSC (stdev %)          3.9      2.6

 * Used an ALTERNATIVE macro instead of static_cpu_has() to execute HRESET
   when supported. (PeterZ)
---
 arch/x86/include/asm/hreset.h | 30 ++++++++++++++++++++++++++++++
 arch/x86/kernel/cpu/common.c  |  7 +++++++
 arch/x86/kernel/process_32.c  |  3 +++
 arch/x86/kernel/process_64.c  |  3 +++
 4 files changed, 43 insertions(+)
 create mode 100644 arch/x86/include/asm/hreset.h

diff --git a/arch/x86/include/asm/hreset.h b/arch/x86/include/asm/hreset.h
new file mode 100644
index 000000000000..d68ca2fb8642
--- /dev/null
+++ b/arch/x86/include/asm/hreset.h
@@ -0,0 +1,30 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_HRESET_H
+
+/**
+ * HRESET - History reset. Available since binutils v2.36.
+ *
+ * Request the processor to reset the history of task classification on the
+ * current logical processor. The history components to be
+ * reset are specified in %eax. Only bits specified in CPUID(0x20).EBX
+ * and enabled in the IA32_HRESET_ENABLE MSR can be selected.
+ *
+ * The assembly code looks like:
+ *
+ *	hreset %eax
+ *
+ * The corresponding machine code looks like:
+ *
+ *	F3 0F 3A F0 ModRM Imm
+ *
+ * The value of ModRM is 0xc0 to specify %eax register addressing.
+ * The ignored immediate operand is set to 0.
+ *
+ * The instruction is documented in the Intel SDM.
+ */
+
+#define __ASM_HRESET ".byte 0xf3, 0xf, 0x3a, 0xf0, 0xc0, 0x0"
+
+void reset_hardware_history(void);
+
+#endif /* _ASM_X86_HRESET_H */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index f3f936f7de5f..17e2068530b0 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -53,6 +53,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -414,6 +415,12 @@ static __always_inline void setup_umip(struct cpuinfo_x86 *c)
 
 static u32 hardware_history_features __ro_after_init;
 
+void reset_hardware_history(void)
+{
+	asm_inline volatile (ALTERNATIVE("", __ASM_HRESET, X86_FEATURE_HRESET)
+			     : : "a" (hardware_history_features) : "memory");
+}
+
 static __always_inline void setup_hreset(struct cpuinfo_x86 *c)
 {
 	if (!cpu_feature_enabled(X86_FEATURE_HRESET))
diff --git a/arch/x86/kernel/process_32.c b/arch/x86/kernel/process_32.c
index 470c128759ea..397a6e6f4e61 100644
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -52,6 +52,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #include "process.h"
@@ -214,6 +215,8 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	/* Load the Intel cache allocation PQR MSR. */
 	resctrl_sched_in();
 
+	reset_hardware_history();
+
 	return prev_p;
 }
 
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index 4e34b3b68ebd..6176044ecc16 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -53,6 +53,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #ifdef CONFIG_IA32_EMULATION
@@ -658,6 +659,8 @@ __switch_to(struct task_struct *prev_p, struct task_struct *next_p)
 	/* Load the Intel cache allocation PQR MSR. */
 	resctrl_sched_in();
 
+	reset_hardware_history();
+
 	return prev_p;
 }
 
-- 
2.25.1
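
For readers who want to check the byte sequence in __ASM_HRESET by hand, here is a minimal user-space sketch. It is not part of the series. It rebuilds the six bytes from the encoding given in the hreset.h comment (F3 0F 3A F0 ModRM Imm, with ModRM = 0xc0 and an immediate of 0) and tests the CPUID bit that patch 22 adds to scattered.c (leaf 7, subleaf 1, EAX bit 22). The helper names hreset_supported(), modrm() and hreset_encode() are hypothetical and exist only for this illustration; none of them appear in the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 22 of CPUID.(EAX=7,ECX=1):EAX, as listed in the scattered.c hunk. */
#define HRESET_CPUID_BIT	22

/* Hypothetical helper: true if an already-read EAX value advertises HRESET. */
bool hreset_supported(uint32_t leaf7_subleaf1_eax)
{
	return (leaf7_subleaf1_eax >> HRESET_CPUID_BIT) & 1u;
}

/*
 * Hypothetical helper: pack the three ModRM fields into one byte.
 * mod = 3 selects register-direct addressing and rm = 0 names %eax.
 */
uint8_t modrm(uint8_t mod, uint8_t reg, uint8_t rm)
{
	return (uint8_t)(((mod & 3u) << 6) | ((reg & 7u) << 3) | (rm & 7u));
}

/*
 * Hypothetical helper: fill out[6] with the HRESET encoding described in
 * the hreset.h comment: F3 0F 3A F0 ModRM Imm.
 */
void hreset_encode(uint8_t out[6])
{
	out[0] = 0xf3;
	out[1] = 0x0f;
	out[2] = 0x3a;
	out[3] = 0xf0;
	out[4] = modrm(3, 0, 0);	/* 0xc0: %eax, register-direct */
	out[5] = 0x00;			/* immediate operand, ignored */
}
```

With mod = 3, reg = 0 and rm = 0 the ModRM byte evaluates to 0xc0, which matches the fifth byte of the .byte string in the patch. The sketch only models the encoding; it does not execute HRESET, which needs ring 0 and the enable MSR configured by setup_hreset().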