[PATCH v2 16/23] sched/cache: Introduce sched_cache_present to enable cache aware scheduling for multi LLCs NUMA node

Tim Chen posted 23 patches 2 weeks, 1 day ago
There is a newer version of this series
Posted by Tim Chen 2 weeks, 1 day ago
From: Chen Yu <yu.c.chen@intel.com>

Cache-aware load balancing should only be enabled if there is more
than one LLC within a NUMA node. sched_cache_present is introduced to
indicate whether the platform has this topology.

Suggested-by: Libo Chen <libo.chen@oracle.com>
Suggested-by: Adam Li <adamli@os.amperecomputing.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
---

Notes:
    v1->v2:
    	Use flag sched_cache_present to indicate whether a platform
    	supports cache aware scheduling. Change this flag from a static key.
    	There should be only 1 static key to control the cache aware
    	scheduling. (Peter Zijlstra)

 kernel/sched/topology.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index d583399fc6a1..9799e3a9a609 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -24,6 +24,8 @@ int max_llcs;
 
 #ifdef CONFIG_SCHED_CACHE
 
+static bool sched_cache_present;
+
 static unsigned int *alloc_new_pref_llcs(unsigned int *old, unsigned int **gc)
 {
 	unsigned int *new = NULL;
@@ -54,7 +56,7 @@ static void populate_new_pref_llcs(unsigned int *old, unsigned int *new)
 		new[i] = old[i];
 }
 
-static int resize_llc_pref(void)
+static int resize_llc_pref(bool has_multi_llcs)
 {
 	unsigned int *__percpu *tmp_llc_pref;
 	int i, ret = 0;
@@ -102,6 +104,11 @@ static int resize_llc_pref(void)
 		rq_unlock_irqrestore(rq, &rf);
 	}
 
+	if (has_multi_llcs) {
+		sched_cache_present = true;
+		pr_info_once("Cache aware load balance is enabled on the platform.\n");
+	}
+
 release_old:
 	/*
 	 * Load balance is done under rcu_lock.
@@ -124,7 +131,7 @@ static int resize_llc_pref(void)
 
 #else
 
-static int resize_llc_pref(void)
+static int resize_llc_pref(bool has_multi_llcs)
 {
 	max_llcs = new_max_llcs;
 	return 0;
@@ -2644,6 +2651,7 @@ static int
 build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *attr)
 {
 	enum s_alloc alloc_state = sa_none;
+	bool has_multi_llcs = false;
 	struct sched_domain *sd;
 	struct s_data d;
 	struct rq *rq = NULL;
@@ -2736,10 +2744,12 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 				 * between LLCs and memory channels.
 				 */
 				nr_llcs = sd->span_weight / child->span_weight;
-				if (nr_llcs == 1)
+				if (nr_llcs == 1) {
 					imb = sd->span_weight >> 3;
-				else
+				} else {
 					imb = nr_llcs;
+					has_multi_llcs = true;
+				}
 				imb = max(1U, imb);
 				sd->imb_numa_nr = imb;
 
@@ -2787,7 +2797,7 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 	if (has_cluster)
 		static_branch_inc_cpuslocked(&sched_cluster_active);
 
-	resize_llc_pref();
+	resize_llc_pref(has_multi_llcs);
 
 	if (rq && sched_debug_verbose)
 		pr_info("root domain span: %*pbl\n", cpumask_pr_args(cpu_map));
-- 
2.32.0
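
For readers who want to check the detection arithmetic outside the kernel, here is a minimal userspace sketch of the nr_llcs / imb computation from the build_sched_domains() hunk above. It is only a model: struct llc_topo and model_imb_numa_nr() are hypothetical names that do not exist in the kernel, and it assumes that sd is the first domain above the LLC level and child is the LLC-level domain, as the nr_llcs naming in the hunk suggests.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two span weights the hunk reads. */
struct llc_topo {
	unsigned int parent_span_weight;	/* models sd->span_weight */
	unsigned int llc_span_weight;		/* models child->span_weight */
};

/*
 * Model of the quoted computation. *has_multi_llcs is only ever set to
 * true here, matching the patch, where the variable starts out false in
 * build_sched_domains() and is never cleared inside the loop.
 */
static unsigned int model_imb_numa_nr(const struct llc_topo *t,
				      bool *has_multi_llcs)
{
	unsigned int nr_llcs, imb;

	assert(t->llc_span_weight > 0);

	nr_llcs = t->parent_span_weight / t->llc_span_weight;
	if (nr_llcs == 1) {
		/* Single LLC: allow an imbalance of 1/8 of the span. */
		imb = t->parent_span_weight >> 3;
	} else {
		/* Several LLCs: one task per LLC, and note the topology. */
		imb = nr_llcs;
		*has_multi_llcs = true;
	}

	/* Same effect as max(1U, imb) in the kernel hunk. */
	return imb > 1 ? imb : 1;
}
```

With a 64-CPU parent and a 64-CPU LLC the flag stays false; with a 64-CPU parent and 8-CPU LLCs it becomes true, which is the case the patch treats as eligible for cache aware load balancing.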
Re: [PATCH v2 16/23] sched/cache: Introduce sched_cache_present to enable cache aware scheduling for multi LLCs NUMA node
Posted by Peter Zijlstra 1 week, 2 days ago
On Wed, Dec 03, 2025 at 03:07:35PM -0800, Tim Chen wrote:
> From: Chen Yu <yu.c.chen@intel.com>
> 
> Cache-aware load balancing should only be enabled if there is more
> than one LLC within a NUMA node. sched_cache_present is introduced to
> indicate whether the platform has this topology.
> 
> Suggested-by: Libo Chen <libo.chen@oracle.com>
> Suggested-by: Adam Li <adamli@os.amperecomputing.com>
> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> ---
> 
> Notes:
>     v1->v2:
>     	Use flag sched_cache_present to indicate whether a platform
>     	supports cache aware scheduling. Change this flag from a static key.
>     	There should be only 1 static key to control the cache aware
>     	scheduling. (Peter Zijlstra)
> 
>  kernel/sched/topology.c | 20 +++++++++++++++-----
>  1 file changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index d583399fc6a1..9799e3a9a609 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -24,6 +24,8 @@ int max_llcs;
>  
>  #ifdef CONFIG_SCHED_CACHE
>  
> +static bool sched_cache_present;

sched_energy_present
sched_asym_cpucapacity
sched_cluster_active
sched_smt_present

are all static keys tied to the current topology, why break the streak
and make this a boolean?
Re: [PATCH v2 16/23] sched/cache: Introduce sched_cache_present to enable cache aware scheduling for multi LLCs NUMA node
Posted by Chen, Yu C 3 days, 11 hours ago
On 12/11/2025 12:32 AM, Peter Zijlstra wrote:
> On Wed, Dec 03, 2025 at 03:07:35PM -0800, Tim Chen wrote:
>> From: Chen Yu <yu.c.chen@intel.com>
>>
>> Cache-aware load balancing should only be enabled if there is more
>> than one LLC within a NUMA node. sched_cache_present is introduced to
>> indicate whether the platform has this topology.
>>
>> Suggested-by: Libo Chen <libo.chen@oracle.com>
>> Suggested-by: Adam Li <adamli@os.amperecomputing.com>
>> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
>> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
>> ---
>>
>> Notes:
>>      v1->v2:
>>      	Use flag sched_cache_present to indicate whether a platform
>>      	supports cache aware scheduling. Change this flag from a static key.
>>      	There should be only 1 static key to control the cache aware
>>      	scheduling. (Peter Zijlstra)
>>
>>   kernel/sched/topology.c | 20 +++++++++++++++-----
>>   1 file changed, 15 insertions(+), 5 deletions(-)
>>
>> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
>> index d583399fc6a1..9799e3a9a609 100644
>> --- a/kernel/sched/topology.c
>> +++ b/kernel/sched/topology.c
>> @@ -24,6 +24,8 @@ int max_llcs;
>>   
>>   #ifdef CONFIG_SCHED_CACHE
>>   
>> +static bool sched_cache_present;
> 
> sched_energy_present
> sched_asym_cpucapacity
> sched_cluster_active
> sched_smt_present
> 
> are all static keys tied to the current topology, why break the streak
> and make this a boolean?

OK, will convert it into a key.
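
To make the difference concrete, below is a rough userspace model of a counted key that follows the topology, in the spirit of how sched_cluster_active is incremented in build_sched_domains(). None of this is kernel code: struct model_key and the model_* helpers are hypothetical stand-ins for the static key API, and the decrement on domain teardown is an assumption about how a converted sched_cache_present would be kept balanced, not something taken from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for a counting static key; no jump labels here. */
struct model_key {
	int enabled;	/* number of outstanding increments */
};

static void model_key_inc(struct model_key *key)
{
	key->enabled++;
}

static void model_key_dec(struct model_key *key)
{
	assert(key->enabled > 0);
	key->enabled--;
}

static bool model_key_true(const struct model_key *key)
{
	return key->enabled > 0;
}

/* Hypothetical per-domain bookkeeping so teardown knows whether to dec. */
struct model_root_domain {
	bool has_multi_llcs;
};

/* Models the end of build_sched_domains(): take a reference if eligible. */
static void model_build_domains(struct model_root_domain *rd,
				bool has_multi_llcs,
				struct model_key *sched_cache_present)
{
	rd->has_multi_llcs = has_multi_llcs;
	if (has_multi_llcs)
		model_key_inc(sched_cache_present);
}

/* Assumed teardown path: drop the reference taken at build time. */
static void model_destroy_domains(struct model_root_domain *rd,
				  struct model_key *sched_cache_present)
{
	if (rd->has_multi_llcs)
		model_key_dec(sched_cache_present);
	rd->has_multi_llcs = false;
}
```

The point of the model is that the key reads false again once the last multi-LLC domain is destroyed, whereas the boolean in this version of the patch is set once and never cleared.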

thanks,
Chenyu
Re: [PATCH v2 16/23] sched/cache: Introduce sched_cache_present to enable cache aware scheduling for multi LLCs NUMA node
Posted by Peter Zijlstra 1 week, 2 days ago
On Wed, Dec 10, 2025 at 05:32:35PM +0100, Peter Zijlstra wrote:
> On Wed, Dec 03, 2025 at 03:07:35PM -0800, Tim Chen wrote:
> > From: Chen Yu <yu.c.chen@intel.com>
> > 
> > Cache-aware load balancing should only be enabled if there is more
> > than one LLC within a NUMA node. sched_cache_present is introduced to
> > indicate whether the platform has this topology.
> > 
> > Suggested-by: Libo Chen <libo.chen@oracle.com>
> > Suggested-by: Adam Li <adamli@os.amperecomputing.com>
> > Signed-off-by: Chen Yu <yu.c.chen@intel.com>
> > Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
> > ---
> > 
> > Notes:
> >     v1->v2:
> >     	Use flag sched_cache_present to indicate whether a platform
> >     	supports cache aware scheduling. Change this flag from a static key.
> >     	There should be only 1 static key to control the cache aware
> >     	scheduling. (Peter Zijlstra)
> > 
> >  kernel/sched/topology.c | 20 +++++++++++++++-----
> >  1 file changed, 15 insertions(+), 5 deletions(-)
> > 
> > diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> > index d583399fc6a1..9799e3a9a609 100644
> > --- a/kernel/sched/topology.c
> > +++ b/kernel/sched/topology.c
> > @@ -24,6 +24,8 @@ int max_llcs;
> >  
> >  #ifdef CONFIG_SCHED_CACHE
> >  
> > +static bool sched_cache_present;
> 
> sched_energy_present
> sched_asym_cpucapacity
> sched_cluster_active
> sched_smt_present
> 
> are all static keys tied to the current topology, why break the streak
> and make this a boolean?

Also, patch doesn't use sched_cache_present at all, so perhaps just drop
it on the floor entirely?
Re: [PATCH v2 16/23] sched/cache: Introduce sched_cache_present to enable cache aware scheduling for multi LLCs NUMA node
Posted by Chen, Yu C 3 days, 11 hours ago
On 12/11/2025 12:52 AM, Peter Zijlstra wrote:
> On Wed, Dec 10, 2025 at 05:32:35PM +0100, Peter Zijlstra wrote:
>> On Wed, Dec 03, 2025 at 03:07:35PM -0800, Tim Chen wrote:
>>> From: Chen Yu <yu.c.chen@intel.com>
>>>
>>> Cache-aware load balancing should only be enabled if there is more
>>> than one LLC within a NUMA node. sched_cache_present is introduced to
>>> indicate whether the platform has this topology.
>>>
>>> Suggested-by: Libo Chen <libo.chen@oracle.com>
>>> Suggested-by: Adam Li <adamli@os.amperecomputing.com>
>>> Signed-off-by: Chen Yu <yu.c.chen@intel.com>
>>> Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
>>> ---
>>>
>>> Notes:
>>>      v1->v2:
>>>      	Use flag sched_cache_present to indicate whether a platform
>>>      	supports cache aware scheduling. Change this flag from a static key.
>>>      	There should be only 1 static key to control the cache aware
>>>      	scheduling. (Peter Zijlstra)
>>>
>>>   kernel/sched/topology.c | 20 +++++++++++++++-----
>>>   1 file changed, 15 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
>>> index d583399fc6a1..9799e3a9a609 100644
>>> --- a/kernel/sched/topology.c
>>> +++ b/kernel/sched/topology.c
>>> @@ -24,6 +24,8 @@ int max_llcs;
>>>   
>>>   #ifdef CONFIG_SCHED_CACHE
>>>   
>>> +static bool sched_cache_present;
>>
>> sched_energy_present
>> sched_asym_cpucapacity
>> sched_cluster_active
>> sched_smt_present
>>
>> are all static keys tied to the current topology, why break the streak
>> and make this a boolean?
> 
> Also, patch doesn't use sched_cache_present at all, so perhaps just drop
> it on the floor entirely?

The sched_cache_present flag is used in a subsequent patch:
"[20/23] sched/cache: Add user control to adjust the parameters of
cache-aware scheduling". This flag is used to check whether the user is
eligible to enable cache-aware scheduling. I will try to place the
declaration and usage of this flag together.

thanks,
Chenyu